Tuesday, July 10, 2012

CentOS 6.3 mdadm won't start older md arrays.

For some of us, the drive setups we create stay with a system for a long time. Keeping the same data disk array untouched even across major revision changes is common (like an OS rebuild of 5.x -> 6.x). Sometimes that long-term usage bites back. Here is my failure case while upgrading from CentOS 6.2 -> CentOS 6.3.

Symptoms:
A simple md RAID 1 array used as an extra data drive does not come up at boot. The system drops to recovery mode with a missing (md) drive to mount and an fsck request. The extra file system has 2 "linux_raid_member" drives that show under fdisk and blkid, yet "cat /proc/mdstat" shows no arrays. If I run, as root:
mdadm --auto-detect
then /proc/mdstat finally shows the array.
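A quick way to confirm whether anything has been assembled is to look for md entries in /proc/mdstat. This is only a sketch: it assumes the usual mdstat layout, where each array line starts with its md name followed by a colon, and list_md_arrays is a made-up helper name.

```shell
# list_md_arrays [FILE]: print the name of each assembled array, one per line.
# FILE defaults to /proc/mdstat; passing another path is only useful for testing.
list_md_arrays() {
    awk '/^md[^ ]* *:/ { print $1 }' "${1:-/proc/mdstat}"
}

# In the failure case described above this prints nothing until
# "mdadm --auto-detect" has been run:
# list_md_arrays
```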

Solutions:
  1. Make sure that /etc/mdadm.conf contains the array info from:
    mdadm --examine --scan >> /etc/mdadm.conf
  2. There seems to be a kinda "deprecated" mdadm technical note about arrays created with older metadata, with BZ-788022 implicated and a "+0.90" needing to be added to /etc/mdadm.conf, but you need to get rid of the "+1.x -all" options!
How do you tell in advance that you will have an issue *before* an upgrade? If you run, as root,
mdadm -E /dev/sdc1|grep Vers
and get output like: "Version : 0.90.00", you will want to make a change *before* you reboot!
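The same check can be run over several member partitions at once. The sketch below assumes the "Version : 0.90.00" line format quoted above; is_legacy_superblock and check_members are hypothetical helper names, and /dev/sdd1 is only a placeholder for a second member.

```shell
# Read "mdadm -E" output on stdin; exit 0 if it reports a 0.90 superblock.
is_legacy_superblock() {
    # Matches lines such as "        Version : 0.90.00"
    grep -q '^[[:space:]]*Version[[:space:]]*:[[:space:]]*0\.90'
}

# Examine each device given as an argument and report what was found.
check_members() {
    for dev in "$@"; do
        if mdadm -E "$dev" 2>/dev/null | is_legacy_superblock; then
            echo "$dev: 0.90 metadata, fix /etc/mdadm.conf before rebooting"
        else
            echo "$dev: no 0.90 superblock reported"
        fi
    done
}

# Example, as root (device names are placeholders):
# check_members /dev/sdc1 /dev/sdd1
```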

It is interesting to note that I was able to just have this md array work in 6.0, 6.1 and 6.2 because, to quote the technical note:
"In Red Hat Enterprise Linux 6.1 and 6.2, mdadm always assembled version 0.90 RAID arrays automatically due to a bug."
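For anyone who would rather script the config change, here is a rough sketch of adding "+0.90" to the AUTO line. It assumes the stock line reads "AUTO +imsm +1.x -all" and leaves the other entries in place, which is one reading of the tech note rather than a confirmed recipe. add_090_to_auto is a hypothetical helper that only prints the result, so nothing is modified until you review the output and copy it over yourself.

```shell
# add_090_to_auto FILE: print FILE with "+0.90" inserted before "+1.x" on the
# AUTO line. Lines that already contain "+0.90" are left unchanged.
add_090_to_auto() {
    sed '/^AUTO[[:space:]]/{/+0\.90/!s/+1\.x/+0.90 +1.x/;}' "$1"
}

# Example, as root: write the result next to the original and compare first.
# add_090_to_auto /etc/mdadm.conf > /etc/mdadm.conf.new
# diff /etc/mdadm.conf /etc/mdadm.conf.new
```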

1 comment:

  1. I suffered from the same problem.
    If you've got LVM volumes on your raid disks, you need to activate them using

    vgchange -a e

    after

    mdadm --auto-detect

    Now you can mount your volumes.

    If you want to stick with auto-detection you don't need to add the output of mdadm --examine --scan to your /etc/mdadm.conf

    You can simply change the line

    AUTO +imsm +1.x -all

    in /etc/mdadm.conf to

    AUTO +imsm +0.90 +1.x -all

    as described in the tech note.

    But what about recreating the RAID1 using the current 1.2 metadata format? Reading man mdadm, it seems that it could be possible without losing data if you do it in place. Or did they change the size of the metadata after 0.90?
