Clone a live partition on a remote server

From Organic Design wiki

Sometimes you need to clone the active partition, for example to move it onto a larger hard drive when you don't have physical access to the machine. Sometimes you need to clone onto a smaller partition instead, for example when moving to an SSD, which raises extra complications.

Cloning a live partition is risky because the data is constantly changing, which means the copy may contain partially written files or other inconsistencies. For this reason the usual partition tools do not allow it, and a low-level tool has to be used instead. We've done this successfully before by shutting down all services first and running a disk check on the copy afterwards. Any problems that show up in the disk check can be fixed, and the affected directories copied manually from the source disk. This procedure was documented while moving a live Ubuntu 14 operating system from a stand-alone drive to a RAID array with slightly smaller disks.

To begin with, check which device you'll be copying from and which you'll be copying to. This can be hard to be sure of when several devices have very similar layouts and specifications, for example a number of similarly sized unformatted disks. The lsscsi command, which shows manufacturer information about each device, can help. In our case it was very hard to tell whether /dev/sdb or /dev/sdc was the RAID device since both were unformatted 1TB devices, but lsscsi shows that /dev/sdc is the one on our 9650SE RAID controller.

# lsscsi
[0:0:0:0]    disk    AMCC     9650SE-2LP DISK  4.10  /dev/sdc 
[1:0:0:0]    disk    ATA      Hitachi HUA72201 JP4O  /dev/sda 
[2:0:0:0]    disk    ATA      TOSHIBA MG03ACA1 FL1A  /dev/sdb
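
When there are many devices it can help to filter the lsscsi listing rather than read it by eye. The following is a minimal sketch only: device_for_model is a hypothetical helper name, and it assumes the device node is the last field on each line, as in the output above.

```shell
# Print the device node(s) whose lsscsi line mentions the given text.
# Reads lsscsi output on stdin; the device path is taken from the last field.
# device_for_model is a hypothetical helper name, not a standard tool.
device_for_model() {
    awk -v pat="$1" 'index($0, pat) { print $NF }'
}
```

For example, lsscsi | device_for_model 9650SE should print only the device attached to the RAID controller. Double check the result against the full listing before using it as a dd target.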

Next, stop all major services such as MySQL and Apache, then clone the device with the dd utility, which is the most common way of doing it on Linux. This can take a long time, probably a couple of hours for a terabyte of data.

dd if=/dev/sda of=/dev/sdc bs=32M
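
To get a rough idea of how long the copy will take, the arithmetic can be sketched as below. dd_eta_minutes is a hypothetical helper name, and the throughput you pass in is purely an assumption; real speeds depend on the disks and controller.

```shell
# Rough copy time in whole minutes for a given byte count and an
# assumed sustained throughput in MB/s (1 MB = 1,000,000 bytes here).
# dd_eta_minutes is a hypothetical helper; the throughput is a guess.
dd_eta_minutes() {
    bytes=$1
    mb_per_s=$2
    echo $(( bytes / (mb_per_s * 1000000) / 60 ))
}
```

For the 1000204886016 byte source disk in this procedure, and assuming 130 MB/s, dd_eta_minutes 1000204886016 130 gives 128 minutes. While GNU dd is running, sending it the USR1 signal from another terminal makes it print how much it has copied so far.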

Note: If the target device is smaller than the source device, the last partition will now have invalid information, because the copied partition table lists it as extending beyond the end of the physical device. It can't be mounted, and partition editors such as parted will refuse to deal with it. In this case you'll need to delete it using fdisk, with the "d" command and then "3" for the partition number. You can also stop the dd transfer after a few minutes in this case, just long enough to have transferred the boot and swap partitions (assuming that your last partition is the largest, with all the data on it).

You can then use a simpler tool such as parted to create the partition again, then run dd again, this time cloning just the individual partition rather than the whole disk:

dd if=/dev/sda3 of=/dev/sdc3 bs=4096

Whether or not you had a smaller target, you now need to run e2fsck to fix the errors on the new filesystem (there will almost certainly be some, since the source was changing during the cloning process). If the target was smaller you will also get a warning that the filesystem size is too large. Tell it not to abort, carry on with the check and fix all errors.

# e2fsck -f /dev/sdc3
e2fsck 1.42.9 (4-Feb-2014)
The filesystem size (according to the superblock) is 243068416 blocks
The physical size of the device is 243015936 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? no
Clearing orphaned inode 7864391 (uid=0, gid=0, mode=0100640, size=89)
Pass 1: Checking inodes, blocks, and sizes
Inode 32899082, end of extent exceeds allowed value
	(logical block 6, physical block 152092839, len 1)
Clear<y>? yes
Pass 2: Checking directory structure
Entry 'config.autogenerated' in /var/lib/exim4 (32511225) has deleted/unused inode 32508135.  Clear<y>? yes
Entry 'shadow' in /etc (46923777) has deleted/unused inode 46925882.  Clear<y>? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 32508134
Connect to /lost+found<y>? yes
Inode 32508134 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 46924127
Connect to /lost+found<y>? yes
Inode 46924127 ref count is 2, should be 1.  Fix<y>? yes
Pass 5: Checking group summary information
Block bitmap differences:  -(130090912--130090917) +(130091346--130091351)
Fix<y>? yes
Free blocks count wrong (112542074, counted=112507183).
Fix<y>? yes
Inode bitmap differences:  +46924127 -46925882
Fix<y>? yes
Free inodes count wrong (60542803, counted=60543874).
Fix<y>? yes

If you had the smaller target issue, now run the resize2fs utility to force the filesystem size down to the physical size of the partition, as reported by e2fsck above, so that it will be mountable.

# resize2fs -p /dev/sdc3 243015936
resize2fs 1.42.9 (4-Feb-2014)
Resizing the filesystem on /dev/sdc3 to 243015936 (4k) blocks.
Begin pass 3 (max = 7418)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/sdc3 is now 243015936 blocks long.
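
The block count passed to resize2fs can be cross-checked against the partition table. As a sketch, assuming 512-byte sectors and 4096-byte filesystem blocks as on this system (partition_blocks is a hypothetical helper name):

```shell
# Number of whole filesystem blocks that fit in a partition, given its
# first and last sector as printed by fdisk -l.
# Assumes 512-byte sectors and 4096-byte filesystem blocks by default.
# partition_blocks is a hypothetical helper name.
partition_blocks() {
    start=$1
    end=$2
    sector_size=${3:-512}
    block_size=${4:-4096}
    echo $(( (end - start + 1) * sector_size / block_size ))
}
```

With the start and end sectors that fdisk -l reports for /dev/sdc3 in this procedure, partition_blocks 8976384 1953103871 gives 243015936, the physical size reported by e2fsck. The same calculation on /dev/sda3 (8976384 to 1953523711) gives 243068416, the size recorded in the copied superblock.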

Before mounting the partition and fixing the errors, we need to make the disk ID and the partitions' UUIDs unique, because they are exact copies of the source device's IDs. You can see this with the blkid command:

# blkid
/dev/sda1: UUID="bc459b13-3aad-4017-a68b-ce8ab36275e1" TYPE="ext3" 
/dev/sda2: UUID="c57f1400-129a-402a-90f1-820a22c6a2fe" TYPE="swap" 
/dev/sda3: UUID="b56d82f7-5817-4a52-9763-0b38aa360e2b" TYPE="ext4" 
/dev/sdc1: UUID="bc459b13-3aad-4017-a68b-ce8ab36275e1" SEC_TYPE="ext2" TYPE="ext3" 
/dev/sdc2: UUID="c57f1400-129a-402a-90f1-820a22c6a2fe" TYPE="swap" 
/dev/sdc3: UUID="b56d82f7-5817-4a52-9763-0b38aa360e2b" TYPE="ext4"

You can generate new UUIDs with the uuidgen command, then use tune2fs to update the UUID of each partition:

# uuidgen
81d40a02-8019-4c1f-afb8-fb41d117c6d1
# tune2fs /dev/sdc1 -U 81d40a02-8019-4c1f-afb8-fb41d117c6d1
tune2fs 1.42.9 (4-Feb-2014)

The UUID of the swap partition has to be set slightly differently: we re-create the swap area on the partition with mkswap, which sets its UUID in the process.

swapoff /dev/sda2
mkswap -U 86ce60b7-ad4a-44aa-88e2-2aefd5c6c396 /dev/sdc2
Setting up swapspace version 1, size = 3999740 KiB
no label, UUID=86ce60b7-ad4a-44aa-88e2-2aefd5c6c396

Then check that they're all unique by doing blkid again:

# blkid
/dev/sda1: UUID="bc459b13-3aad-4017-a68b-ce8ab36275e1" TYPE="ext3" 
/dev/sda2: UUID="c57f1400-129a-402a-90f1-820a22c6a2fe" TYPE="swap" 
/dev/sda3: UUID="b56d82f7-5817-4a52-9763-0b38aa360e2b" TYPE="ext4" 
/dev/sdc1: UUID="81d40a02-8019-4c1f-afb8-fb41d117c6d1" TYPE="ext3" 
/dev/sdc2: UUID="86ce60b7-ad4a-44aa-88e2-2aefd5c6c396" TYPE="swap" 
/dev/sdc3: UUID="c23ab65a-c32f-41e3-bd31-51ed563e0099" TYPE="ext4"
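
With more than a handful of partitions, the uniqueness check can be done mechanically. A minimal sketch, assuming the blkid output format shown above (duplicate_uuids is a hypothetical helper name):

```shell
# Print any filesystem UUID that appears more than once in blkid output
# read from stdin. No output means every UUID is unique.
# duplicate_uuids is a hypothetical helper name.
duplicate_uuids() {
    sed -n 's/.* UUID="\([^"]*\)".*/\1/p' | sort | uniq -d
}
```

Running blkid | duplicate_uuids before the changes above would list all three copied UUIDs, and should print nothing once every partition on the new disk has its own.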

Finally, we need to change the disk identifier, which can be done with the "i" command in fdisk, but only after switching to expert mode with the "x" command. Write the change and quit, then check the result with fdisk -l (my new ID is c0ffeeee):

# fdisk -l
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00055af8

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      976895      487424   83  Linux
/dev/sda2          976896     8976383     3999744   82  Linux swap / Solaris
/dev/sda3         8976384  1953523711   972273664   83  Linux

Disk /dev/sdc: 1000.0 GB, 999989182464 bytes
255 heads, 63 sectors/track, 121575 cylinders, total 1953103872 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc0ffeeee

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *        2048      976895      487424   83  Linux
/dev/sdc2          976896     8976383     3999744   82  Linux swap / Solaris
/dev/sdc3         8976384  1953103871   972063744   83  Linux

Now we can mount the main OS partition and do an rsync to make it match the source partition properly, and then when that's done you can restart all the services you stopped.

mkdir /media/raid
mount -t ext4 /dev/sdc3 /media/raid
rsync -a --exclude=/boot --exclude=/proc --exclude=/dev --exclude=/sys --exclude=/media / /media/raid/

Note: if e2fsck reported any bitmap differences, it's probably better to copy the files over manually with cp -pR, because rsync may skip corrupted files whose size and modification time have not changed.

Finally, change the UUIDs in /etc/fstab to match the /dev/sdc partitions instead of the /dev/sda ones. Just comment the current entries out so they can be reverted easily, and make sure you make the same change in /etc/fstab on the mounted new partition as well. You can use blkid to get all the partitions' UUIDs, then cat /etc/fstab afterwards to double check that they're all correct, e.g.

# cat /etc/fstab
#UUID=b56d82f7-5817-4a52-9763-0b38aa360e2b /      ext4  errors=remount-ro 0   1
UUID=c23ab65a-c32f-41e3-bd31-51ed563e0099  /      ext4  errors=remount-ro 0   1

#UUID=bc459b13-3aad-4017-a68b-ce8ab36275e1 /boot  ext3  defaults          0   2
UUID=81d40a02-8019-4c1f-afb8-fb41d117c6d1  /boot  ext3  defaults          0   2

#UUID=c57f1400-129a-402a-90f1-820a22c6a2fe none   swap  sw                0   0
UUID=86ce60b7-ad4a-44aa-88e2-2aefd5c6c396  none   swap  sw                0   0

# blkid
/dev/sda1: UUID="bc459b13-3aad-4017-a68b-ce8ab36275e1" TYPE="ext3" 
/dev/sda2: UUID="c57f1400-129a-402a-90f1-820a22c6a2fe" TYPE="swap" 
/dev/sda3: UUID="b56d82f7-5817-4a52-9763-0b38aa360e2b" TYPE="ext4" 
/dev/sdc1: UUID="81d40a02-8019-4c1f-afb8-fb41d117c6d1" TYPE="ext3" 
/dev/sdc2: UUID="86ce60b7-ad4a-44aa-88e2-2aefd5c6c396" TYPE="swap" 
/dev/sdc3: UUID="c23ab65a-c32f-41e3-bd31-51ed563e0099" TYPE="ext4"
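
The comment-out-and-replace edit to fstab can also be scripted, which helps when the same change has to be made in two copies of the file. This is a sketch only: switch_fstab_uuid is a hypothetical helper name, and it assumes entries start with UUID= at the beginning of the line, as in the listing above.

```shell
# Comment out the fstab entry for OLD and add an identical entry for NEW
# directly below it. A copy of the original file is kept as FILE.bak.
# Assumes entries begin with "UUID=<uuid>" followed by whitespace.
# switch_fstab_uuid is a hypothetical helper name.
switch_fstab_uuid() {
    old=$1
    new=$2
    file=$3
    sed -i.bak "/^UUID=${old}[[:space:]]/{
h
s/^/#/
p
g
s/${old}/${new}/
}" "$file"
}
```

For example, switch_fstab_uuid b56d82f7-5817-4a52-9763-0b38aa360e2b c23ab65a-c32f-41e3-bd31-51ed563e0099 /media/raid/etc/fstab would switch the root entry in the copy on the new partition. Check the result with cat afterwards as described above.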

You can test whether the OS actually runs in your new partition before rebooting by doing chroot /media/raid/ which will make the new partition into the new Linux root. Stop all services that listen on external ports in the main OS first so you can start them in the chroot OS and then test that they're working properly.
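
Since /proc, /sys and /dev were excluded from the rsync, those directories are empty on the new partition, and services started inside the chroot will generally expect them to be populated. One common approach is to bind-mount them from the running system first. The sketch below only prints the commands so they can be reviewed before anything is run; chroot_bind_plan is a hypothetical helper name, and the list of directories is an assumption that covers the usual cases.

```shell
# Print the bind mounts a chroot at the given directory would typically
# need, one command per line, so they can be reviewed before running.
# chroot_bind_plan is a hypothetical helper; the list of kernel
# filesystems (proc, sys, dev) is an assumption covering common cases.
chroot_bind_plan() {
    root=${1%/}
    for fs in proc sys dev; do
        echo "mount --bind /$fs $root/$fs"
    done
}
```

If the printed commands look right, chroot_bind_plan /media/raid | sh runs them as root, after which chroot /media/raid/ can be used as described. Remember to unmount them again before rebooting or unmounting the partition.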

If it's all running fine, then it's time to switch over the boot configuration to the new partition. We've found that there can be major headaches with UUID for the root partition not updating to the new one, so we recommend manually changing the UUIDs over and rebooting first, then adjusting the old partition so that it's not a candidate for root partition selection any more, then re-installing grub the normal way.

So first, change the UUIDs in grub.cfg in both boot partitions:

perl -pi -w -e 's/b56d82f7-5817-4a52-9763-0b38aa360e2b/c23ab65a-c32f-41e3-bd31-51ed563e0099/g;' /boot/grub/grub.cfg
mkdir /media/raidboot
mount -t ext3 /dev/sdc1 /media/raidboot
perl -pi -w -e 's/b56d82f7-5817-4a52-9763-0b38aa360e2b/c23ab65a-c32f-41e3-bd31-51ed563e0099/g;' /media/raidboot/grub/grub.cfg

Now, cross fingers and reboot!

If it comes back, then lsblk should show that you're fully running on the new partition:

NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931.5G  0 disk 
├─sda1   8:1    0   476M  0 part 
├─sda2   8:2    0   3.8G  0 part 
└─sda3   8:3    0 927.2G  0 part 
sdb      8:16   0 931.5G  0 disk 
sdc      8:32   0 931.3G  0 disk 
├─sdc1   8:33   0   476M  0 part /boot
├─sdc2   8:34   0   3.8G  0 part [SWAP]
└─sdc3   8:35   0   927G  0 part /

If so, we can now clear the boot flag on the old drive's boot partition with your favourite tool, such as cfdisk or parted, then move some files out of the old OS partition so that the grub installer will ignore it as a root partition candidate.

mkdir /media/old
mount -t ext4 /dev/sda3 /media/old
cd /media/old
mkdir backup
mv init* backup
mv vm* backup
mv etc backup
cd /
umount /dev/sda3

Then run the usual procedure to install grub. We need to do this because the grub.cfg file we edited is auto-generated, so the manual change is not persistent. Regenerating it makes sure the generated configuration boots from our new partition.

# grub-install /dev/sdc
Installing for i386-pc platform.
Installation finished. No error reported.

# update-grub
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-3.13.0-52-generic
Found initrd image: /boot/initrd.img-3.13.0-52-generic
Found unknown Linux distribution on /dev/sda3

Notice how it says that it's found an unknown Linux distribution on the old partition? That's good, it means it hasn't used it as a root partition candidate. You might want to check the /boot/grub/grub.cfg that has been generated just to be certain that the only root UUID it's using is that of the new /dev/sdc3 partition.
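
One quick way to do that check is to list the distinct root=UUID= values on the kernel command lines. A minimal sketch (grub_root_uuids is a hypothetical helper name, and it only looks at root=UUID= arguments, not at any other UUIDs in the file):

```shell
# List each distinct root=UUID=... value found in a grub.cfg file.
# Only kernel command line arguments of that form are matched.
# grub_root_uuids is a hypothetical helper name.
grub_root_uuids() {
    grep -o 'root=UUID=[0-9a-fA-F-]*' "$1" | sort -u
}
```

Running grub_root_uuids /boot/grub/grub.cfg should print a single line containing the UUID of /dev/sdc3. If the old /dev/sda3 UUID shows up as well, the old partition is still being picked up as a root candidate.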

If the file looks good, then that's it! You're done! Reboot again and that's the persistent final state of the machine.