KVM recipes

KVM, and mostly QEMU underneath it, is a very powerful tool, but it requires many manual actions that are not wrapped up in any GUI tool.

Services and locations

In this example, I'm using Fedora 25, although most things will be the same in other distributions. This simple "find" shows the most important locations:

# find /etc/libvirt /var/lib/libvirt -type d
/etc/libvirt				<-global configuration
/etc/libvirt/nwfilter			<-nwfilter definitions
/etc/libvirt/storage			<-storage pools definitions
/etc/libvirt/storage/autostart		<-what to start on boot
/etc/libvirt/qemu			<-VMs definitions here
/etc/libvirt/qemu/networks		<-networks definitions
/etc/libvirt/qemu/networks/autostart	<-what to start on boot
/var/lib/libvirt			<-runtime tree
/var/lib/libvirt/images			<-location of default storage pool (VM disks)

You do not need any tools at all if you know how to run the QEMU command directly. Here is an example of a running QEMU process (just to frighten you):

# ps -ef | grep qemu
qemu 18919 1 95 10:16 ? 00:00:08 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name guest=fedora25,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-53-fedora25/master-key.aes -machine pc-i440fx-2.7,accel=kvm,usb=off,vmport=off -cpu Broadwell-noTSX -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid f03ac3b1-3268-4947-a6d0-62c3161da9fd -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-53-fedora25/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/fc25.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-0-0,readonly=on -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=27,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:45:1e:ea,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-53-fedora25/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on

Thanks to libvirt, you do not need to know all these parameters to run a VM. The libvirt-daemon package does the basic work. Also install the libvirt-client, virt-manager and virt-install packages to simplify management.

Working with disks

QEMU works best with qcow2 or raw format disks. Although I have successfully attached a vmdk image to a VM, this format is not recommended.

Creating disk

You can create a disk image using the virt-manager graphical interface. It will create a sparse file in the specified format. But I like how "qemu-img" creates disk files:

# qemu-img create -f qcow2 myvm.qcow2 1T
Formatting 'myvm.qcow2', fmt=qcow2 size=1099511627776 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
# ls -lh myvm.qcow2
-rw-r--r-- 1 root root 208K Jul 14 17:32 myvm.qcow2
# qemu-img info myvm.qcow2
image: myvm.qcow2
file format: qcow2
virtual size: 1.0T (1099511627776 bytes)
disk size: 208K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Of course, the disk space does not come from nothing, and if you run out of space on the real storage, the guest virtual machine will stop when it requests new blocks. But this makes it possible to simulate large disks for virtual machines.

The second useful format is "raw". You can make a copy of a physical disk and attach it to a VM for testing:

# dd if=/dev/sdb of=myvm.raw bs=1024k

Or you can create an empty disk:

# dd if=/dev/zero of=myvm.raw bs=1024k count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 13.0131 s, 82.5 MB/s
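"dd" actually writes every block, which takes time. If a sparse image is acceptable, the same empty raw disk can be created almost instantly with coreutils alone. This is a sketch; the file name is just an example:

```shell
# Create a 1 GiB sparse raw image: the apparent size is 1 GiB,
# but no blocks are allocated until something writes to them.
truncate -s 1G myvm.raw

# Compare the apparent size (bytes) with the blocks actually allocated.
stat -c 'size=%s blocks=%b' myvm.raw
```

QEMU treats it like any other raw image; reads of unwritten areas return zeros, and the file fills up on the host as the guest writes.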

Converting disks

"Qemu-img" is an excellent tool for converting disk images to any format. You will often use it to import and export virtual disk images to/from other hypervisors. However, I found myself using it more often to create template images. Imagine that I've created a 10g image and installed everything I wanted, including the latest updates and even configured something. My initially almost empty disk now takes about 3-5G in the VM, and about 8G is used from the host FS. This difference is due to many files have been written and deleted or overwritten. Now I want my template to be as small as possible, so I run convert:

# qemu-img convert -O qcow2 -c myvm.qcow2 template.qcow2

I'm using the -c compress flag here to make it even smaller. This does not permanently compress the image: new data will be written uncompressed, while unchanged data stays compressed as "convert" wrote it. The resulting image becomes about 1.5 GB in size (MS Windows images come out larger, of course).

Working with template

I keep the resulting template images in a "template" subdirectory, and then I create the real VM disk image with:

# qemu-img create -f qcow2 -b template/rhel7.qcow2 rhel7.qcow2
# qemu-img info rhel7.qcow2
image: rhel7.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 196K
cluster_size: 65536
backing file: template/rhel7.qcow2
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

It is important to note that you cannot change the base file without corrupting the child disk image.

Working with snapshots

Working with clones

Working with network

KVM provides a very flexible way of working with the network. First of all, you can define networks using the virt-manager GUI. Right-click on the hypervisor itself and select "Details" to open the configuration window.

Default (NAT) network

The "default" network is already defined for you. A bridge is created (usually called virbr0) and an address is assigned to it (usually 192.168.122.1, if not already used). Then dnsmasq is run and NAT iptables rules are created. All virtual machines connected to the default network will access the outside world through your host using NAT. You can connect to a virtual machine from a host itself or other virtual machine attached to the same network.

# virsh net-dumpxml default
<network>
  <name>default</name>
  <uuid>0f42ea07-08df-42d0-9eeb-4c69c097eee3</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:3e:2f:e3'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

Creating isolated network

Sometimes I need another internal virtual network to simulate a second network segment or to test firewall rules. So I need a network accessible from the host but isolated from the outside world. It is not completely isolated, because there is still a connection between the host and the guests. You can make it completely isolated if you want.
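For comparison, a completely isolated network (no connection even to the host) can be defined by omitting the forward and ip elements entirely. This is a sketch; the name "iso1" and the bridge "virbr2" are examples, not taken from my setup:

```xml
<network>
  <name>iso1</name>
  <bridge name='virbr2' stp='off' delay='0'/>
</network>
```

Without an ip element, libvirt creates the bridge but assigns no address and starts no dnsmasq, so only the guests attached to it can talk to each other.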

I already have this network, so I need to destroy it for the demo:

# virsh net-destroy iso0
Network iso0 destroyed
# virsh net-undefine iso0
Network iso0 has been undefined
# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254003e2fe3       yes             virbr0-nic

As usual with libvirt, "destroy" is not as destructive as it sounds: it just stops the network. "Undefine" then removes the configuration, and that is destructive. The last "brctl" command shows only the virbr0 bridge that belongs to the "default" network.

Using the Network XML format reference, I created an XML configuration file similar to this:

# cat iso0.net.xml 
<network>
  <name>iso0</name>
  <bridge name='virbr1' stp='off' delay='0'/>
  <domain name='iso0'/>
  <ip address='172.17.2.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='172.17.2.128' end='172.17.2.254'/>
    </dhcp>
  </ip>
</network>

Then define it:

# virsh net-define iso0.net.xml 
Network iso0 defined from iso0.net.xml

# virsh net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 iso0                 inactive   no            yes

# virsh net-start iso0
Network iso0 started

# virsh net-list
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 iso0                 active     no            yes

# virsh net-autostart iso0
Network iso0 marked as autostarted

# virsh net-list
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
 iso0                 active     yes           yes

# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254003e2fe3       yes             virbr0-nic
virbr1          8000.525400a90f62       no              virbr1-nic

Bridging to physical interface

In this case, the virt-manager GUI suggests using a "macvtap" connection. It works, but not very well: not all protocols and packets pass through. A much more reliable alternative is to create a bridge using the OS tools and assign the physical host NIC to it. Then you can attach VMs to this new bridge. I did this on the server, while continuing to use "macvtap" on my laptop.
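Once the OS-level bridge exists, you can still manage the connection through libvirt by defining a network in bridge mode. A sketch, assuming the host bridge is named br0:

```xml
<network>
  <name>host-bridge</name>
  <forward mode='bridge'/>
  <bridge name='br0'/>
</network>
```

Define and start it with "virsh net-define" and "virsh net-start" as shown above, then attach VMs to the "host-bridge" network instead of pointing them at the bridge device directly.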

Working with VLANs

There are two ways to work with VLANs: either they are handled by the host, or the guest gets all the VLANs as a trunk.

The secure way to work with VLANs is to let the host take care of them. In this case, you create a virtual interface on the host that belongs to a specific VLAN. Then you create a bridge on the host that includes this interface. You can assign an IP address (belonging to this VLAN) to the bridge if you want the host to participate in the traffic of this VLAN; this is not necessary if you only want to pass traffic to the guest. Then you create a KVM network, as described above, and finally attach a virtual network card of the guest to this network.
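On Fedora/RHEL, the host side of this setup can be described with ifcfg files. A sketch for VLAN 100 on eth0, bridged as br-vlan100; all names and the address are examples:

```
# /etc/sysconfig/network-scripts/ifcfg-eth0.100
DEVICE=eth0.100
VLAN=yes
ONBOOT=yes
BRIDGE=br-vlan100

# /etc/sysconfig/network-scripts/ifcfg-br-vlan100
DEVICE=br-vlan100
TYPE=Bridge
ONBOOT=yes
# Optional: give the host an address in this VLAN
IPADDR=192.168.100.1
NETMASK=255.255.255.0
```

Then point a KVM network at br-vlan100 as described above, or attach the guest NIC to the bridge directly.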

Sometimes it is required to pass all VLANs to the guest as a trunk. Do this just as in "Bridging to physical interface". But there may be a small problem here: if you have already created a virtual interface for some VLAN on the host, that VLAN's traffic will not appear in the trunk.

A proposed solution, which I have not yet tested, is to remove all of the host's VLAN interfaces, create a virtual interface connected to the trunk bridge, and then create the VLAN interfaces on top of it.

Discovering guest's IP address

Often, when starting a virtual machine, I do not want to open the console just to find out which IP it received from DHCP. Fortunately, there is a way to figure this out from information on the host itself.

First, you need to get the MAC address associated with the VM:

# virsh list
 Id    Name                           State
----------------------------------------------------
 14    rhel5.11                       running

# virsh dumpxml rhel5.11 | grep "mac address"
      <mac address='52:54:00:46:ed:1a'/>
# virsh dumpxml rhel5.11 | awk -F"'" '/mac address/{print $2}'
52:54:00:46:ed:1a

Then look at the ARP table to get the information you need (newer libvirt versions can also show this directly with "virsh net-dhcp-leases default"):

# arp -an | grep -w 52:54:00:46:ed:1a
? (192.168.122.223) at 52:54:00:46:ed:1a [ether] on virbr0

I found myself repeating these actions very often, so I created a small script to display the IP addresses of running VMs:

# cat /root/bin/kvm-guests-ip.sh 
#!/bin/bash

VMs=$(virsh list | awk '/running/{print $2}')

for V in $VMs ; do
        MACs=$(virsh dumpxml "$V" | awk -F"'" '/mac address/{print $2}')
        for M in $MACs ; do
                echo -n "$V => " ; arp -an | grep -w "$M"
        done
done

Life becomes easy:

# kvm-guests-ip.sh
rhel5.11 => ? (192.168.122.223) at 52:54:00:46:ed:1a [ether] on virbr0

Working with VMs

List of running VMs:

# virsh list
 Id    Name                           State
----------------------------------------------------

List of all defined VMs:

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     fedora25                       shut off
 -     myvm                           shut off
 -     win7                           shut off

Turn a VM off ("destroy" here refers to the instance or process, not the disk or configuration):

# virsh destroy myvm
error: Failed to destroy domain myvm
error: Requested operation is not valid: domain is not running

The VM "myvm" already off, then destroy failed

Remove a VM, including its disks:

# virsh undefine --remove-all-storage myvm
Domain myvm has been undefined
Volume 'hdb'(/var/lib/libvirt/images/cos.vmdk) removed.

Deploy a VM with virt-install

For example, I have cloned the disk image with the qemu-img command and do not really want to install the VM, so I use the --import option:

# qemu-img create -f qcow2 -b template/rhel7.qcow2 myvm.qcow2
Formatting 'myvm.qcow2', fmt=qcow2 size=21474836480 backing_file=template/rhel7.qcow2 encryption=off
cluster_size=65536 lazy_refcounts=off refcount_bits=16
# virt-install --name myvm --memory 2048 --vcpus 2 \
	--import \
	--disk /var/lib/libvirt/images/myvm.qcow2 \
	--network default \
	--os-type linux --os-variant rhel7

The --os-type and --os-variant hints help "virt-install" configure the virtual hardware with known working values.

You can do a fresh install from a CD image:

# virsh destroy myvm
Domain myvm destroyed

# virsh undefine --remove-all-storage myvm
Domain myvm has been undefined
Volume 'vda'(/var/lib/libvirt/images/myvm.qcow2) removed.

# qemu-img create -f qcow2 /var/lib/libvirt/images/myvm.qcow2 20g
Formatting '/var/lib/libvirt/images/myvm.qcow2', fmt=qcow2 size=21474836480 encryption=off
cluster_size=65536 lazy_refcounts=off refcount_bits=16
# virt-install --name myvm --memory 2048 --vcpus 2 \
	--location /mnt/BACKUP/ISO/rhel-server-7.2-x86_64-dvd.iso \
	--disk /var/lib/libvirt/images/myvm.qcow2 \
	--network default \
	--os-type linux --os-variant rhel7

Starting install...

You can also use a kickstart file. I have done a lot of kickstart debugging this way:

# virt-install --name myvm --memory 2048 --vcpus 2 \
	--location /mnt/BACKUP/ISO/rhel-server-7.2-x86_64-dvd.iso \
	--disk /var/lib/libvirt/images/myvm.qcow2 \
	--initrd-inject=/tmp/rhel7.ks --extra-args "ks=file:/rhel7.ks" \
	--network default \
	--os-type linux --os-variant rhel7

The --initrd-inject option really inserts the mentioned file into the initrd, and then an additional kernel argument with the location of the kickstart is passed. An amazing feature!

Sometimes you need to use a serial console. You can do this by removing the graphical console and telling Linux, through additional kernel arguments, to use the serial console:

# virt-install --name myvm --memory 2048 --vcpus 2 \
	--location /mnt/BACKUP/ISO/rhel-server-7.2-x86_64-dvd.iso \
	--disk /var/lib/libvirt/images/myvm.qcow2 \
	--initrd-inject=/tmp/rhel7.ks --extra-args "console=ttyS0,115200n8 serial ks=file:/rhel7.ks" \
	--graphics none --console pty,target_type=serial \
	--network default \
	--os-type linux --os-variant rhel7

Starting install...
Retrieving file .treeinfo...                                      | 2.1 kB  00:00:00     
Retrieving file vmlinuz...                                        | 4.9 MB  00:00:00     
Retrieving file initrd.img...                                     |  38 MB  00:00:00     
Connected to domain myvm
Escape character is ^]
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 3.10.0-327.el7.x86_64 (mockbuild@x86-034.build.eng.bos.redhat.com)
(gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Thu Oct 29 17:29:29 EDT 2015
[    0.000000] Command line: console=ttyS0,115200n8 serial ks=file:/rhel7.ks
 ..

Create VM via virt-manager GUI

This is a very boring part. The only thing I can recommend is to use the import option, even for an empty disk. That way you can configure many more parameters after the initial definition.

Working with xmldump

Take a backup of a VM configuration for future use:

# virsh dumpxml myvm > myvm.xml

The resulting file contains some unique parameters that apply only to your instance. If you plan to share this file with others, edit it to make it more generic: delete the uuid and mac address entries. They will be regenerated when the definition is imported.
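The cleanup can be scripted. A minimal sketch with sed, assuming the default dumpxml layout where the uuid and mac address elements each occupy their own line; the sample XML below is shortened, in practice it comes from "virsh dumpxml myvm > myvm.xml":

```shell
# Shortened sample of a dumped definition (stands in for a real
# "virsh dumpxml myvm > myvm.xml").
cat > myvm.xml <<'EOF'
<domain type='kvm'>
  <name>myvm</name>
  <uuid>f03ac3b1-3268-4947-a6d0-62c3161da9fd</uuid>
  <interface type='network'>
    <mac address='52:54:00:46:ed:1a'/>
  </interface>
</domain>
EOF

# Strip the instance-specific lines; libvirt will regenerate
# uuid and mac address when the edited file is defined.
sed -e '/<uuid>/d' -e '/<mac address=/d' myvm.xml > myvm-generic.xml
cat myvm-generic.xml
```

The generic file can then be imported on any host with "virsh define myvm-generic.xml".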

Importing configuration:

root@virt:~ # virsh destroy myvm
Domain myvm destroyed

root@virt:~ # virsh undefine myvm
Domain myvm has been undefined

root@virt:~ # virsh define myvm.xml
Domain myvm defined from myvm.xml

You can edit the XML file before importing it. However, there is an even simpler way to edit the configuration:

# virsh edit myvm

You will use it often if you want anything special. Refer to the libvirt Domain XML format.

Cluster simulation

Shared between VMs disk

When testing a cluster configuration, you need to emulate shared SAN disks. There are some limitations: this only works when emulating SCSI disks in RAW format. First, create the disk:

# dd if=/dev/zero of=/var/lib/libvirt/images/shared.raw bs=1024k count=1024

While it is possible to do this via the GUI too, the XML view is much clearer:

..
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/shared.raw'/>
      <target dev='sdh' bus='scsi'/>
      <shareable/>
      <serial>1010101</serial>
      <address type='drive' controller='1' bus='0' target='0' unit='0'/>
    </disk>
..

Put the same block into the second node's definition too.
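Note that the address element above references SCSI controller='1'; that controller must exist in the domain definition. If it does not, add one (the virtio-scsi model is my assumption here; the default emulated SCSI controller also works):

```xml
<controller type='scsi' index='1' model='virtio-scsi'/>
```

Also note cache='none' in the disk definition: host-side caching should be disabled for a disk written by two VMs at once.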

Multipath disk

It is possible to simulate a multipath disk. It also has to be SCSI, and the RAW format is recommended.

..
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/shared.raw'/>
      <target dev='sda' bus='scsi'/>
      <shareable/>
      <serial>1010101</serial>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/shared.raw'/>
      <target dev='sdb' bus='scsi'/>
      <shareable/>
      <serial>1010101</serial>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
..

The serial option helps the multipath driver detect that this is the same disk seen via another connection. Pay attention to use different SCSI addresses.

Fencing for KVM

KVM fencing uses multicast. This may not be useful for a real network, but it works well on a virtual KVM network. The following is copied from here. Install the software (on the host):

# yum install fence-virt fence-virtd fence-virtd-multicast fence-virtd-libvirtd
 ..
# fence_virtd -c

The last command is an interactive configurator. Define the fencing interface connected to the VM cluster network, and set /etc/cluster/fence_xvm.key as the key file. After finishing, create the key file:

# mkdir -p /etc/cluster
# dd if=/dev/urandom bs=512 count=1 of=/etc/cluster/fence_xvm.key
# chmod 0600 /etc/cluster/fence_xvm.key
# systemctl start fence_virtd.service

GUEST PART:

# yum install fence-virt fence-virtd
# mkdir  -p /etc/cluster
# scp KVMHOST:/etc/cluster/fence_xvm.key /etc/cluster/fence_xvm.key
# fence_xvm -o list

The rest of the definitions depend on the cluster software.


Updated on Wed Sep 13 14:24:18 IDT 2017 by Oleg Volkov. More documentation here