Upgrading SUSE with revert based on LVM snapshot

SUSE continues to insist on BTRFS as the file system for the OS and integrates it tightly with the snapper tool, which lets you roll back any OS change. I personally still don't trust BTRFS after suffering a data loss that would not have happened with another FS. Therefore none of my SUSE installations use BTRFS, and they can't benefit from snapper. Let's try to achieve the same result using LVM snapshots.

I've just installed a not-so-fresh SUSE:

linux-5403:~ # cat /etc/os-release
NAME="SLES_SAP"
VERSION="12-SP2"
VERSION_ID="12.2"
PRETTY_NAME="SUSE Linux Enterprise Server for SAP Applications 12 SP2"
ID="sles_sap"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles_sap:12:sp2"

Taking LVM snapshots

Before taking the snapshots, stop as many as possible of the services that can modify the file systems you are about to snapshot.

LVM snapshots take their space from the free capacity of the volume group, so the PVs must have free space:

linux-5403:~ # pvs
  PV         VG     Fmt  Attr PSize   PFree 
  /dev/sda1  rootvg lvm2 a--  100.00g 88.00g
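
A quick way to judge whether the free space is enough for full-size snapshots is to add up the LV sizes and compare the sum with PFree. The following is only a sketch: snapshot_space_ok is a hypothetical helper name, the numbers are typed in by hand from the pvs and lvs output, and the 1% margin for snapshot overhead is my assumption based on the sizes seen in this article.

```shell
# Sketch: decide whether a VG has room for full-size snapshots of the given LVs.
# Usage: snapshot_space_ok FREE_GIB LV_SIZE_GIB...
# The 1% overhead margin is an assumption, not an LVM rule.
snapshot_space_ok() {
    free_gib=$1; shift
    # Sum the origin sizes, add the margin, compare against the free space.
    printf '%s\n' "$@" | awk -v free="$free_gib" '
        { need += $1 }
        END {
            need *= 1.01
            ok = (need <= free)
            printf "need %.2f GiB, free %.2f GiB: %s\n", need, free, (ok ? "ok" : "NOT ENOUGH")
            exit (ok ? 0 : 1)
        }'
}

# Values from this article: 88.00g free, LVs of 8, 2 and 2 GiB.
snapshot_space_ok 88.00 8.00 2.00 2.00
```

The function also returns a non-zero status when the space is insufficient, so it can guard the snapshot step in a script.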

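I have a lot of free space here, so I simply asked for all of it (100%FREE) for every snapshot and let LVM reduce the COW size to the maximum usable for each LV, which is slightly more than the size of the LV itself: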

linux-5403:~ # (cd /dev/rootvg/ ; for lv in * ; do lvcreate -l100%FREE -s -n ${lv}_backup `pwd`/$lv ; done )
  Reducing COW size 88.00 GiB down to maximum usable size 8.04 GiB.
  Logical volume "slash_backup" created.
  Reducing COW size 79.96 GiB down to maximum usable size 2.01 GiB.
  Logical volume "swap_backup" created.
  Reducing COW size 77.95 GiB down to maximum usable size 2.01 GiB.
  Logical volume "var_backup" created.
linux-5403:~ # lvs
  LV           VG     Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  slash        rootvg owi-aos--- 8.00g                                                    
  slash_backup rootvg swi-a-s--- 8.04g      slash  0.00                                   
  swap         rootvg owi-aos--- 2.00g                                                    
  swap_backup  rootvg swi-a-s--- 2.01g      swap   0.00                                   
  var          rootvg owi-aos--- 2.00g                                                    
  var_backup   rootvg swi-a-s--- 2.01g      var    0.00 
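
The loop above snapshots everything, including swap, and relies on LVM trimming 100%FREE. A more explicit variant could skip swap and size each snapshot with 100%ORIGIN, which asks for enough space to cover the whole origin LV. The sketch below only prints the commands so they can be reviewed first; plan_snapshots is a hypothetical helper name, and it assumes that the swap LV is called "swap" and that snapshots use the "_backup" suffix from this article.

```shell
# Sketch: print (not run) one lvcreate command per LV worth snapshotting.
# Usage: plan_snapshots VG LV...
# Assumptions: swap LV is named "swap", snapshots carry the "_backup" suffix.
plan_snapshots() {
    vg=$1; shift
    for lv in "$@"; do
        case $lv in
            swap|*_backup) continue ;;   # nothing useful to roll back here
        esac
        # 100%ORIGIN sizes the snapshot to cover the whole origin LV.
        echo "lvcreate -s -l 100%ORIGIN -n ${lv}_backup $vg/$lv"
    done
}

# LV names from this article; review the output, then pipe it to sh as root.
plan_snapshots rootvg slash swap var
```

Skipping swap is a design choice of this sketch, not something the original procedure requires; merging a swap snapshot back is harmless, just unnecessary.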

Upgrading the System

Now let's attach the CD of a newer SP (SP4 here) and define it as a new repository in zypper:

linux-5403:~ # zypper ar cd:///?devices=/dev/sr0 CDROM
This is a changeable read-only media (CD/DVD), disabling autorefresh.
Adding repository 'CDROM' ...............................................[done]
Repository 'CDROM' successfully added
Enabled     : Yes                    
Autorefresh : No                     
GPG Check   : Yes                    
Priority    : 99                     
URI         : cd:///?devices=/dev/sr0

Reading data from 'CDROM' media
Retrieving repository 'CDROM' metadata ..................................[done]
Building repository 'CDROM' cache .......................................[done]

Run zypper dup and reboot. The result of the upgrade:

linux-5403:~ # cat /etc/os-release 
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles_sap:12:sp4"

Looks good.

Reverting system to previous state

However, let's say we are definitely unhappy with the upgrade and want to return the OS to its previous state.

To roll back the snapshots, the original file systems must be unmounted, which is impossible for the root FS of a running OS. Therefore, boot from a different medium, either a network boot or a CD image. Once it boots, select "Rescue System" (it could be hidden under "More .."). Log in as root and list the logical volumes with the lvs command:

tty1:rescue:~ # lvs
  LV           VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  slash        rootvg owi-a-s--- 8.00g
  slash_backup rootvg swi-a-s--- 8.04g      slash  48.85
  swap         rootvg owi-a-s--- 2.00g
  swap_backup  rootvg swi-a-s--- 2.01g      swap   0.00
  var          rootvg owi-a-s--- 2.00g
  var_backup   rootvg swi-a-s--- 2.01g      var    47.03
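
The Data% column shows how much of each snapshot's COW space the upgrade has consumed. A classic LVM snapshot that fills up completely becomes invalid and can no longer be merged, so it is worth checking this number before relying on the rollback. A minimal sketch of such a check follows; snapshot_fill_report is a hypothetical helper name and the 80% default threshold is an arbitrary assumption. It expects lines of "name origin data%", such as those produced by lvs --noheadings -o lv_name,origin,data_percent.

```shell
# Sketch: report how full each snapshot is, reading "name origin data%" lines.
# Usage: lvs --noheadings -o lv_name,origin,data_percent rootvg | snapshot_fill_report [LIMIT]
# Lines without all three fields (plain LVs) are ignored.
snapshot_fill_report() {
    limit=${1:-80}   # warning threshold in percent, an arbitrary default
    awk -v limit="$limit" '
        NF == 3 {
            state = ($3 + 0 >= limit) ? "DANGER" : "ok"
            printf "%s (origin %s): %s%% used, %s\n", $1, $2, $3, state
        }'
}

# Sample values copied from the lvs output in this article.
printf '%s\n' 'slash_backup slash 48.85' 'swap_backup swap 0.00' 'var_backup var 47.03' |
    snapshot_fill_report
```

With snapshots as large as their origins, as created in this article, the COW space should not run out, but the check costs nothing.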

For some reason, which may depend on the version of the rescue disk, merging the snapshots did not work while the VG was active. So deactivate the VG, schedule the merges, and activate the VG again:

tty1:rescue:~ # vgchange -an rootvg
  6 logical volume(s) in volume group "rootvg" now inactive
tty1:rescue:~ # lvconvert --merge /dev/rootvg/swap_backup
  Merging of snapshot rootvg/swap_backup will occur on next activation of rootvg/swap.
tty1:rescue:~ # lvconvert --merge /dev/rootvg/var_backup
  Merging of snapshot rootvg/var_backup will occur on next activation of rootvg/var.
tty1:rescue:~ # lvconvert --merge /dev/rootvg/slash_backup
  Merging of snapshot rootvg/slash_backup will occur on next activation of rootvg/slash.
tty1:rescue:~ # vgchange -ay rootvg
  3 logical volume(s) in volume group "rootvg" now active
tty1:rescue:~ # lvs
  LV           VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  slash        rootvg Owi-a-s--- 8.00g             48.41
  swap         rootvg -wi-a----- 2.00g
  var          rootvg Owi-a-s--- 2.00g             46.09

Keep an eye on the progress until the picture becomes:

tty1:rescue:~ # lvs
  LV           VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  slash        rootvg -wi-a----- 8.00g
  swap         rootvg -wi-a----- 2.00g
  var          rootvg -wi-a----- 2.00g
tty1:rescue:~ # reboot
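
Instead of rerunning lvs by hand, the wait can be scripted. While a merge is in progress, the first character of the Attr column of the origin is a capital O, as in the output above. The sketch below relies on that; merging_lvs and wait_for_merge are hypothetical helper names, the 5 second polling interval is arbitrary, and I have not verified that the rescue system ships awk and sleep.

```shell
# Sketch: list LVs that still have a merge in progress.
# Reads "name attr" lines, e.g. from: lvs --noheadings -o lv_name,lv_attr rootvg
merging_lvs() {
    # An Attr value starting with capital O marks an origin with a merging snapshot.
    awk '$2 ~ /^O/ { print $1 }'
}

# Poll the given VG until no LV reports a merge in progress (needs root and LVM).
wait_for_merge() {
    vg=$1
    while [ -n "$(lvs --noheadings -o lv_name,lv_attr "$vg" | merging_lvs)" ]; do
        sleep 5
    done
}

# Demonstration on the Attr values shown in this article (no LVM needed).
printf '%s\n' 'slash Owi-a-s---' 'swap -wi-a-----' 'var Owi-a-s---' | merging_lvs
```

In the rescue shell the call would be wait_for_merge rootvg, followed by reboot.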

Reverting system to previous state using existing initrd

This option might work; at least it works on the SUSE versions current at the time of writing. Interrupt the GRUB boot sequence and edit the boot entry by pressing e. Find the line that starts with the linux keyword and add rd.break=pre-mount to the end of that line:

 ..
 echo    'Loading Linux 4.4.21-69-default ...'
 linux   /boot/vmlinuz-4.4.21-69-default root=/dev/mapper/rootvg-slash  resume=/dev/rootvg/swap splash=none showopts rd.break=pre-mount
 echo    'Loading initial ramdisk ...'
 initrd  /boot/initrd-4.4.21-69-default
 ..

Then press Ctrl-x to boot that entry. The added parameter should open a shell just before the root filesystem is mounted. It may ask you for the root password, depending on the version of SUSE. The environment is very limited, but the root VG is untouched and can be manipulated. All LVM commands are provided by a single lvm binary, so every familiar command must be prefixed with lvm. Moreover, the LVM configuration in this environment does not allow any changes, so each “destructive” command needs an extra workaround parameter. For example, if you enter the original merge command, you get:

sh-4.4# lvm lvconvert --merge /dev/rootvg/var_backup
  Read-only locking type set. Write locks are prohibited.
  Can't get lock for rootvg
  Cannot process volume group rootvg

A workaround is to provide the missing configuration values directly on the command line:

sh-4.4# lvm lvconvert --config 'global {locking_type=1}' --merge rootvg/var_backup
  Merging of volume rootvg/var_backup started.
  /run/lvm/lvmpolld.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmpolld, Proceeding with polling without using lvmpolld.
  WARNING: Check global/use_lvmpolld in lvm.conf or the lvmpolld daemon state.
  rootvg/var: Merged: 68.54%
  rootvg/var: Merged: 100.00%
  Merge of snapshot into logical volume rootvg/var has finished.
  Logical volume "var_backup" successfully removed
sh-4.4# lvm lvconvert --config 'global {locking_type=1}' --merge rootvg/swap_backup
  ..
sh-4.4# lvm lvconvert --config 'global {locking_type=1}' --merge rootvg/slash_backup
  ..
sh-4.4# lvm lvs
  LV           VG     Attr       LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  slash        rootvg -wi-a----- 8.00g
  swap         rootvg -wi-a----- 2.00g
  var          rootvg -wi-a----- 2.00g
sh-4.4# reboot

Reboot the system to finish.
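
Typing the --config override for every command is tedious, so it can be wrapped in a small shell function inside the initrd shell. This is just a sketch: lvmrw is a hypothetical name, not part of LVM, and it simply inserts the same override used above right after the subcommand.

```shell
# Sketch: run an lvm subcommand with the write-locking override added.
# "lvmrw" is a hypothetical helper name, not an LVM command.
# Usage: lvmrw SUBCOMMAND [ARGS...]
lvmrw() {
    sub=$1; shift
    lvm "$sub" --config 'global {locking_type=1}' "$@"
}
```

With the function defined, the merges become lvmrw lvconvert --merge rootvg/var_backup and so on, while read-only commands such as lvm lvs can still be called directly.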

Snapshot cleanup

Don't forget to remove the snapshots once everything is good and you are happy with the upgrade.

# (cd /dev/rootvg/ ; for lv in *_backup ; do lvremove  `pwd`/$lv ; done )

Updated on Fri Jul 24 15:22:51 IDT 2020