Well, having a RAID1 setup on this server has saved my a$$ in a huge way. Yesterday I started getting notifications of a SMART failure on one of the drives (I have two configured as RAID1 across a few partitions). So I went down to the colo to swap the drive (after finding a spare here and testing the heck out of it first).
Well, wouldn’t you know it, the drive I pulled, while having some issues (they weren’t overly severe, just being unable to read some blocks), wasn’t the one with the real problems. After replacing the drive and getting the first three partitions (the smallest) sync’d up, I came back home expecting to login, check the status of the final sync, and carry on with my night. Well, I get home and I can ping the machine, but it’s not serving any data. Drive back to the colo.
Turns out, the other drive is really dying. Good thing I brought everything from the first trip back (my tools, etc.) so I could put the drive I had just pulled back in and take the other one out. Fortunately, the data problems weren’t across the first three (system) partitions (/boot, /, and /usr/local), but the partition containing all my web/mail files didn’t finish syncing due to all the problems. So after booting up with the original drive (and my new drive), I didn’t have a /dev/md3 because it had failed utterly the last few times. Had to re-assemble that array from the original drive, then hot-add the new drive to it, to get the data back.
So my 20 minute job ended up taking about 5 hours to figure out.
Now the TODO is to have rsync run every four hours, rather than once a day, so at the absolute most all I can lose is four hours’ worth of stuff (this goes for the dumped and sync’d SQL databases as well).
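A rough sketch of what that crontab could look like (the paths and hostname here are made up, not my actual setup, and mysqldump would need credentials from something like a ~/.my.cnf):

0 */4 * * * mysqldump --all-databases | gzip > /srv/dumps/all-databases.sql.gz
15 */4 * * * rsync -az --delete /srv/www /srv/dumps backuphost:/backups/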
The other TODO (already scheduled for this summer) is to replace the machines I have at the colo with two Sun x2100 servers… these little 1U baddies are awesome (and, BTW, Annvix runs peachy on them). Opteron CPUs, dual gigabit NICs (both fully supported), SATA support for two drives (with front-accessible trays no less!)… I am so getting two of these. Not only will they nicely replace 6U of servers with just 2U, but with front-loading trays I won’t have to take the machines out of the rack for anything less serious than a motherboard failure. Sweet machines and reasonably priced too.
Too bad they don’t have space for a third drive tho… it would be nice to set one drive aside as a spare.
Some quick notes for those interested in the implementation (and how easy it was to do this, aside from the really dying drive not reporting SMART information to smartd (bad drive!)):
Login and set the partitions of the dying drive to faulty:
# mdadm --manage --set-faulty /dev/md0 /dev/hda1
(and repeat for all partitions on /dev/hda with all your /dev/mdX devices; in my case I had four: /dev/md[0123])
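If your md-to-partition mapping is simple, a little loop saves some typing. This is only a sketch, and it assumes /dev/mdN is built from /dev/hdaN+1, which may well not match your layout:

# for n in 0 1 2 3; do mdadm --manage --set-faulty /dev/md$n /dev/hda$((n+1)); done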
Next, hot-remove the faulty devices from the RAID arrays (they need to be marked as faulty first):
# mdadm /dev/md0 -r /dev/hda1
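Same idea as the loop above (and the same caveat about the assumed mdN/hdaN+1 mapping):

# for n in 0 1 2 3; do mdadm /dev/md$n -r /dev/hda$((n+1)); done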
Repeat for all partitions (or loop over them as above). Next, bring the machine down (after doing a “cat /proc/mdstat” to make sure the failed partitions are really out; the arrays will show as running in degraded mode with a status of [U_]). Swap out the drive. Boot the machine again. It’s really a good idea here to make sure you can boot off of either drive. I do this with GRUB, set up on both drives: here /dev/md0 is /boot, so it’s accessible on both drives (as /dev/hda1 and /dev/hdc1) and GRUB can boot from either. This is one reason I much prefer GRUB to LILO: with GRUB I can do this, with LILO I couldn’t (maybe you can now, I don’t know… I haven’t touched LILO in about 2 years).
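For what it’s worth, one way to get GRUB (legacy) onto the second drive is through the grub shell; this is just a sketch that assumes the mirrored /boot lives on /dev/hdc1. The device line maps /dev/hdc to (hd0) so the box can still boot if /dev/hda vanishes:

# grub
grub> device (hd0) /dev/hdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit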
Anyways, once you’re back up and running, re-partition the new drive to match the surviving drive (making sure the partition types are set to fd for the RAIDed partitions; there’s a quick sfdisk trick for this, sketched below). Then hot-add the partitions back to their arrays:
# mdadm /dev/md0 -a /dev/hda1
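For the re-partitioning step, cloning the partition table with sfdisk is the quick way; in this sketch /dev/hdc is assumed to be the surviving drive and /dev/hda the fresh one, so triple-check which is which before writing anything:

# sfdisk -d /dev/hdc | sfdisk /dev/hda

The dump/restore keeps the fd partition types, so the partitions are ready to hot-add as-is.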
Watch /proc/mdstat, or use “mdadm --detail /dev/md0”, to view progress and other neat info. The arrays will reconstruct themselves and, unless you hit an issue like mine, you’re good to go. So, realistically, you’re running three mdadm commands with a little hardware swapping in between. I don’t think a 20 minute estimate for downtime was too unrealistic.
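If you’d rather not keep cat’ing it by hand, something like this refreshes on its own (the interval is arbitrary):

# watch -n 10 cat /proc/mdstat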
Anyways, everything is back up without a problem (knock on wood) so that’s nice. I tell ya tho… front-loading drive trays would have been fantastic. Serial ATA would have been fantastic too. Luckily, those servers will only be there for a few more months, then they can retire back here and replace an aging LAN server (with some new drives, of course), and maintaining the remote servers should be much simpler.