qzjul

Administrator
Game Development
10,263

Jul 30th 2010, 0:42:41

Hi all. Sorry about the major disaster we had there; here's what happened, FYI.

I'm planning on moving the server on the weekend, so in preparation for that I wanted to do a quick test reboot to make sure that the server would, in fact, come back after I moved it.

Unfortunately, as you can see, it did not.

Why not?

I forgot some important little details to do with RAID arrays. A while back we had a disk failure, and I hot-swapped in a new drive; no problem, the game kept on running and hardly anybody noticed anything ;) Then later I added a 3rd drive to the array, to make sure we had an extra mirror just in case.

However, I forgot to rebuild the mdadm.conf file and update the initramfs. Those two together are basically what tell the boot process which drives belong to which array.
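
For anyone who wants to catch this kind of drift before a reboot does, here is a rough Python sketch. It compares the `ARRAY` lines the running system reports (for example, saved output of `mdadm --detail --scan`) against the `ARRAY` lines in mdadm.conf. The function names, the choice to compare only `UUID` and `num-devices`, and the assumption that both sources use the same device name for an array are illustrative and not from the post. It also ignores continuation lines in mdadm.conf. On Debian-style systems the usual fix afterwards is to regenerate the `ARRAY` lines in mdadm.conf and run `update-initramfs -u`, but the post does not say which distro or commands were actually used.

```python
def parse_arrays(text):
    """Map each ARRAY device name to its key=value fields (lowercased)."""
    arrays = {}
    for line in text.splitlines():
        # Drop comments, then split the remainder into words.
        words = line.split("#", 1)[0].split()
        if len(words) < 2 or words[0].upper() != "ARRAY":
            continue
        fields = {}
        for word in words[2:]:
            key, sep, value = word.partition("=")
            if sep:
                fields[key.lower()] = value.lower()
        arrays[words[1]] = fields
    return arrays


def config_drift(scan_output, conf_text):
    """List differences between the live arrays and the saved config."""
    live = parse_arrays(scan_output)
    saved = parse_arrays(conf_text)
    problems = []
    for device, fields in sorted(live.items()):
        if device not in saved:
            problems.append(f"{device}: missing from mdadm.conf")
            continue
        # Only these two fields are compared in this sketch.
        for key in ("uuid", "num-devices"):
            old = saved[device].get(key)
            new = fields.get(key)
            if old is not None and new is not None and old != new:
                problems.append(
                    f"{device}: {key} is {new} but mdadm.conf says {old}"
                )
    return problems
```

An empty list means the saved config still matches what is running; anything else is a hint to update mdadm.conf and the initramfs before the next reboot.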


So when I rebooted, it was expecting different partitions on different drives, and got totally confused. This was at 11pm my time. I suspect I might have figured it out last night, except that each boot attempt took nearly 10 minutes to fail and drop me to a BusyBox prompt, which dragged out every test.

I ended up booting to a LiveCD about 10 times, verifying the RAID was good, and looking stuff up online trying to figure out why the heck the boot sequence couldn't figure out what the drives were =/

Anyway, I went to bed at 4am, got up at 7am to go to work, looked up a few things there, and built a list of commands to try. I got home at 6pm, and as I was booting I thought of the solution, fixed it, and here we are.

Well, and it also forced a check of all the drives in the system, since they had gone 240 days without one; that took 30 or 40 minutes. (The system itself had been online for 192 days.)
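
The post does not say how the RAID was verified from the LiveCD. One common way is to read `/proc/mdstat`, and the sketch below shows how its status lines could be checked. The function name and the sample device names are hypothetical, and the parser only handles the usual `[n/m] [UU_]` status format.

```python
import re

# Matches the status part of a line such as "... blocks [3/3] [UUU]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return names of arrays with fewer active members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            # "_" marks a missing or failed member.
            if active < expected or "_" in flags:
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    with open("/proc/mdstat") as handle:
        print(degraded_arrays(handle.read()))
```

A healthy `/proc/mdstat` only shows that the arrays themselves are fine, which matches the situation described here: the RAID checked out, and the problem was in the boot-time configuration.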



So the lesson of the day:

If you ever change a RAID array (especially by hot-swapping a drive), update your mdadm.conf AND initramfs RIGHT THEN AND THERE, because otherwise the next time you reboot, everything will be totally fubared.

Edited By: Slagpit on Jul 30th 2010, 1:05:42
Finally did the signature thing.