Over the years I’ve replaced the hardware on quite a number of broken servers. Sometimes swapping the disks into the new machine just works; on other occasions the boot fails because the disks are not detected. The cause is a missing SATA driver in the initrd: the image was built for the old controller and does not contain the driver for the new one. This is easily fixed by booting from a rescue CD and creating a new initrd with the right drivers.

When you boot from a rescue CD you can check the SATA driver that is loaded by doing the following:

root@server1 [~]# lsmod|grep sata
sata_nv 22217 2
libata 105757 1 sata_nv
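
Not every SATA driver has "sata" in its name (ahci is a common example), so the grep above can come up empty. As a rough sketch, the "Used by" column of the libata line lists the drivers that sit on top of it. The libata_users function name is my own invention, and it assumes the usual lsmod column layout shown above.

```shell
# Hypothetical helper: read lsmod-style output on stdin and print the
# modules listed in the "Used by" column of the libata line, one per line.
libata_users() {
    awk '$1 == "libata" && NF >= 4 {
        n = split($4, mods, ",")
        for (i = 1; i <= n; i++)
            if (mods[i] != "") print mods[i]
    }'
}

# Usage from the rescue environment:
#   lsmod | libata_users
```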

In this case sata_nv is used. To check whether it is available in the initrd on the original disks, you have to unpack the initrd that is used for booting. First chroot into the system image from the rescue environment, then unpack a copy of the initrd in a temporary directory:

chroot /mnt/sysimage
mkdir /root/temp-initrd
cp /boot/initrd-xxx.img /root/temp-initrd
cd /root/temp-initrd
gunzip < initrd-xxx.img | cpio -i --make-directories

In the lib directory that is just unpacked you can see the modules that are included:

root@server1 [~/temp-initrd/lib]# ls
./ ../ dm-mod.ko ext3.ko jbd.ko scsi_mod.ko sd_mod.ko
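
If you need to check for several modules at once, a small helper along these lines can do the comparison. It is only a sketch: missing_modules is a hypothetical name, and it assumes the modules sit directly in lib/ as .ko files, as in the listing above.

```shell
# Hypothetical helper: missing_modules DIR MODULE...
# Prints every MODULE that has no DIR/lib/MODULE.ko file.
missing_modules() {
    _dir=$1
    shift
    for _mod in "$@"; do
        if [ ! -f "$_dir/lib/$_mod.ko" ]; then
            echo "$_mod"
        fi
    done
}

# Usage, inside the chroot:
#   missing_modules /root/temp-initrd sata_nv sd_mod scsi_mod
```

Any name it prints is a module the unpacked initrd does not contain.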

This means the sata_nv driver is not included, which is what causes the boot problems. To fix this we need to rebuild the initrd for the correct kernel with the right drivers:

mkinitrd -f --with=sata_nv --with=raid1 /boot/initrd-2.6.x-y.z.1.el5.img 2.6.x-y.z.1.el5

The -f flag lets mkinitrd overwrite the existing image, and --with=raid1 is only relevant when the system uses software RAID 1. Make sure to specify the right kernel version: when you boot from a rescue CD you are probably running a different kernel than the one that is actually installed on the system you are repairing.
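
One way to find the right version is to look at the module directories of the installed system, since each installed kernel has its own directory under /lib/modules. A minimal sketch, with installed_kernels as a made-up helper name:

```shell
# Hypothetical helper: installed_kernels [ROOT]
# Prints the kernel versions that have a module directory under
# ROOT/lib/modules. With no argument it looks at /lib/modules.
installed_kernels() {
    for _d in "${1:-}"/lib/modules/*; do
        if [ -d "$_d" ]; then
            basename "$_d"
        fi
    done
}

# Usage from the rescue environment (before the chroot):
#   installed_kernels /mnt/sysimage
# Usage inside the chroot:
#   installed_kernels
```

Do not rely on uname -r here, because it reports the kernel of the rescue CD.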

In general it is a good idea to configure password aging as part of your password/security policy. In some cases, however, this can cause unexpected problems. I’ve seen an expired password prevent a machine from booting; in that case a service ran as the user whose password had expired. In general you should not run services as a normal user account, but sometimes you just have to deal with things you can’t change.

Most documentation states that to disable password aging you have to edit the /etc/shadow file and remove the fields where the password age is stored. This is quite error prone. If you do it this way, be sure to use vipw -s to prevent errors in this critical file. To disable password aging I recommend simply using chage, the same command that is used to enable it:

# chage -m 0 -M 99999 -E -1 username

You can check the result with chage -l. Before disabling password aging:

# chage -l username
Minimum: 7
Maximum: 90
Warning: 7
Inactive: -1
Last Change: Jun 26, 2009
Password Expires: Sep 24, 2009
Password Inactive: Never
Account Expires: Never

After disabling password aging:

# chage -l username
Minimum: 0
Maximum: 99999
Warning: 7
Inactive: -1
Last Change: Jun 26, 2009
Password Expires: Never
Password Inactive: Never
Account Expires: Never
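
To see which other accounts still have aging enabled, you can read the maximum password age straight from the fifth field of /etc/shadow. This is a sketch rather than a finished tool: aging_accounts is a hypothetical name, and it treats an empty field or a value of 99999 or more as "no aging".

```shell
# Hypothetical helper: aging_accounts [FILE]
# Prints "user max_days" for every entry whose maximum password age
# (field 5 of the shadow format) is set and lower than 99999.
# FILE defaults to /etc/shadow, which needs root to read.
aging_accounts() {
    awk -F: '$5 != "" && $5 + 0 < 99999 { print $1, $5 }' "${1:-/etc/shadow}"
}

# Usage:
#   aging_accounts
```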

Note that you should only disable password aging when there is no other way to fix the problem.