Update 2013-02-28: I’ve updated the driver to include the datastore id in the path, so it is now possible to use the driver with multiple datastores. The driver also now correctly downloads images that are imported from URLs, e.g. via the marketplace.

Based on this blog article I created an updated datastore driver that allows you to use a ZFS backend with OpenNebula. This datastore driver implements snapshot functionality to clone images. I created this driver to be able to start a VM with a persistent disk without having to wait until the full image file is copied into the datastore.

The driver implements updated versions of the cp, clone and rm commands. I don’t use the mkfs command myself, so I have not implemented it in the ZFS datastore driver yet.

Due to the nature of the OpenNebula datastore layout and the ZFS snapshot capabilities, I had to use a workaround. For the sake of simplicity I decided to use the filesystem driver as a basis. This means that the images are files in a datastore directory, while ZFS snapshots work at the dataset (directory) level. To make snapshotting possible, the driver creates a symlink in the datastore location that points to the image file in a ZFS-backed NFS directory.
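Conceptually, importing an image (the cp path) thus boils down to something like the following; the dataset names and paths are illustrative placeholders, not the literal driver code:

# on the ZFS host: one dataset per image
zfs create tank/export/home/cloud/<datastore-id>/<image>
# copy the image file into the NFS-exported dataset
cp <source-image> /srv/cloud/<datastore-id>/<image>/<image>
# on the frontend: link it into the OpenNebula datastore directory
ln -s /srv/cloud/<datastore-id>/<image>/<image> /var/lib/one/datastores/<datastore-id>/<image>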

Below is a short outline based on that blog article, plus my own additions. The ZFS datastore driver is available as a tar download: zfs-datastore-v1.1

Set up the ZFS part
Install OpenIndiana, create a ZFS pool, create all the necessary ZFS filesystems, and share them over NFS with the frontend server.

# zfs create tank/export/home/cloud
# zfs set mountpoint=/srv/cloud tank/export/home/cloud
# zfs create tank/export/home/cloud/images
# chown -R oneadmin:cloud /srv/cloud
# zfs set sharenfs='rw=@93.188.251.125/32,root=@93.188.251.125/32' tank/export/home/cloud
# zfs allow oneadmin destroy,clone,create,mount,share,sharenfs tank/export/home/cloud
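If you want to double-check the share options and the delegated permissions:

# zfs get sharenfs tank/export/home/cloud
# zfs allow tank/export/home/cloud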

Set up a ZFS volume per datastore
For each datastore that you want to host on the ZFS server you have to create a volume and allow oneadmin to manage it. Replace datastoreid with the id of the datastore you will create (e.g. 100). Make sure to chown the directory so that oneadmin can access it.

# zfs create tank/export/home/cloud/datastoreid
# zfs allow oneadmin destroy,clone,create,mount,share,sharenfs tank/export/home/cloud/datastoreid
# chown oneadmin:other /srv/cloud/datastoreid

Install ZFS datastore driver

Unpack the tar file with the driver into the datastore remotes directory (/var/lib/one/remotes/datastore/):

$ tar xvf zfs-datastore.tar

Configure the ZFS datastore driver with the correct parameters and make sure passwordless SSH connectivity is possible between the frontend and the ZFS host (see the example below the config).

zfs.conf:

ZFS_HOST=10.10.10.3
ZFS_POOL=tank
ZFS_BASE_PATH=/export/home/cloud ## this is the path that maps to /srv/cloud
ZFS_LOCAL_PATH=/srv/cloud ## local mount point corresponding to ZFS_BASE_PATH
ZFS_CMD=/usr/sbin/zfs
ZFS_SNAPSHOT_NAME=golden
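Passwordless SSH can be set up with a key pair for the oneadmin user on the frontend, roughly along these lines (the host is the ZFS_HOST from zfs.conf):

$ ssh-keygen -t rsa                            # as oneadmin on the frontend, empty passphrase
$ ssh-copy-id oneadmin@10.10.10.3
$ ssh oneadmin@10.10.10.3 /usr/sbin/zfs list   # should not prompt for a password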

Configure OpenNebula to use ZFS datastore driver

Update the datastore driver configuration in oned.conf to include the zfs driver. See the example below:

DATASTORE_MAD = [
    executable = "one_datastore",
    arguments  = "-t 15 -d fs,vmware,iscsi,zfs"
]
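After changing oned.conf, OpenNebula has to be restarted to pick up the new driver list; with the standard scripts that is typically (as oneadmin):

$ one stop
$ one start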

Create new ZFS datastore

When the ZFS side is done, make sure the NFS share is mounted on the frontend. In my example it is mounted on /srv/cloud. Now you can create a datastore with the new driver.
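Mounting the share is a plain NFS mount; with the example addresses from zfs.conf it would look roughly like this:

# mount -t nfs 10.10.10.3:/srv/cloud /srv/cloud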

zfstest.conf:

NAME = zfstest
DS_MAD = zfs
TM_MAD = ssh


$ onedatastore create zfstest.conf

Create new image in ZFS datastore

Create a configuration file for the new image:

# cat /tmp/centos63-5gb.conf
NAME = "Centos63-5GB-zfs"
TYPE = OS
PATH = /home/user/centos63-5gb.img
DESCRIPTION = "CentOS 6.3 5GB image contextualized"

Run the oneimage command against the right datastore:

[oneadmin@cloudcontroller1 ~]$ oneimage create -d 100 /tmp/centos63-5gb.conf
ID: 111

The image path to be used to create a snapshot can be found by checking the image details of the newly created image (id 111 in our example):

$ oneimage show 111
[…]
SOURCE : /var/lib/one/datastores/100/4ce405866cf95a4d77b3a9dd9c54fa73
[…]

To use this image as a golden image, create a snapshot of it on the ZFS server; this snapshot will be the basis for future clones. Instant cloning relies on the ZFS capability of creating a new dataset (clone) from an existing snapshot, which means the snapshot has to be created first. So after you upload the golden image, manually create a snapshot of it. This only needs to be done once, as the snapshot can be reused as often as needed. Note that this command has to be run on the ZFS server, not on the frontend!


# zfs snapshot tank/export/home/cloud/100/4ce405866cf95a4d77b3a9dd9c54fa73@golden

Now this image can be used for instant cloning.
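Under the hood, a clone can then be created from that snapshot almost instantly and linked into the datastore directory. Conceptually it looks like this; the new image id is a placeholder, not literal driver output:

# on the ZFS server: clone the golden snapshot into a dataset for the new image
zfs clone tank/export/home/cloud/100/4ce405866cf95a4d77b3a9dd9c54fa73@golden tank/export/home/cloud/100/<new-image>
# on the frontend: link the cloned file into the datastore directory
ln -s /srv/cloud/100/<new-image>/4ce405866cf95a4d77b3a9dd9c54fa73 /var/lib/one/datastores/100/<new-image>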

Example ZFS datastore content on the frontend

[oneadmin@cloudcontroller1 ~]$ cd /var/lib/one/datastores/100/
[oneadmin@cloudcontroller1 100]$ ls -al
total 28
drwxr-xr-x 2 oneadmin oneadmin 4096 Nov 7 02:43 .
drwxr-xr-x 6 oneadmin oneadmin 4096 Oct 15 17:04 ..
lrwxrwxrwx 1 oneadmin oneadmin 76 Oct 16 03:21 25473f081ba733822f3e9ba1df347753 -> /srv/cloud/25473f081ba733822f3e9ba1df347753/25473f081ba733822f3e9ba1df347753
lrwxrwxrwx 1 oneadmin oneadmin 76 Oct 16 02:53 2bf829fedb6e1728f204be8a19ff8f8c -> /srv/cloud/2bf829fedb6e1728f204be8a19ff8f8c/2bf829fedb6e1728f204be8a19ff8f8c
lrwxrwxrwx 1 oneadmin oneadmin 76 Oct 19 17:41 4ce405866cf95a4d77b3a9dd9c54fa73 -> /srv/cloud/4ce405866cf95a4d77b3a9dd9c54fa73/4ce405866cf95a4d77b3a9dd9c54fa73
lrwxrwxrwx 1 oneadmin oneadmin 76 Oct 16 03:24 a00d08dd9b7447818e110115cbc33056 -> /srv/cloud/a00d08dd9b7447818e110115cbc33056/25473f081ba733822f3e9ba1df347753
lrwxrwxrwx 1 oneadmin oneadmin 76 Oct 16 03:18 cab665db977255c4c76c7aa3d687a6d6 -> /srv/cloud/cab665db977255c4c76c7aa3d687a6d6/2bf829fedb6e1728f204be8a19ff8f8c

Example output of ZFS volumes

root@openindiana:/home/rogierm# zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 5.68G 361G 45.5K /rpool
rpool/ROOT 1.56G 361G 31K legacy
rpool/ROOT/openindiana 1.56G 361G 1.55G /
rpool/dump 2.00G 361G 2.00G -
rpool/export 133K 361G 32K /export
rpool/export/home 101K 361G 33K /export/home
rpool/export/home/oneadmin 34K 361G 34K /export/home/oneadmin
rpool/export/home/rogierm 34K 361G 34K /export/home/rogierm
rpool/swap 2.12G 362G 133M -
tank 35.3G 1.04T 32K /tank
tank/export 35.3G 1.04T 32K /tank/export
tank/export/home 35.3G 1.04T 31K /tank/export/home
tank/export/home/cloud 35.3G 1.04T 15.3G /srv/cloud
tank/export/home/cloud/1872dba973eb2f13ef745fc8619d7c30 1K 1.04T 5.00G /srv/cloud/1872dba973eb2f13ef745fc8619d7c30
tank/export/home/cloud/25473f081ba733822f3e9ba1df347753 5.00G 1.04T 5.00G /srv/cloud/25473f081ba733822f3e9ba1df347753
tank/export/home/cloud/28e04ccbc4e55779964330a2131db466 1K 1.04T 5.00G /srv/cloud/28e04ccbc4e55779964330a2131db466
tank/export/home/cloud/2bf829fedb6e1728f204be8a19ff8f8c 40.0M 1.04T 40.0M /srv/cloud/2bf829fedb6e1728f204be8a19ff8f8c
tank/export/home/cloud/4ce405866cf95a4d77b3a9dd9c54fa73 5.00G 1.04T 5.00G /srv/cloud/4ce405866cf95a4d77b3a9dd9c54fa73
tank/export/home/cloud/8b86712ae314bc80eef1dfc303740a87 1K 1.04T 5.00G /srv/cloud/8b86712ae314bc80eef1dfc303740a87
tank/export/home/cloud/931492cd32cb96aad3b8dce4869412f3 1K 1.04T 5.00G /srv/cloud/931492cd32cb96aad3b8dce4869412f3
tank/export/home/cloud/a00d08dd9b7447818e110115cbc33056 5.00G 1.04T 5.00G /srv/cloud/a00d08dd9b7447818e110115cbc33056
tank/export/home/cloud/cab665db977255c4c76c7aa3d687a6d6 1K 1.04T 40.0M /srv/cloud/cab665db977255c4c76c7aa3d687a6d6
tank/export/home/cloud/images 5.00G 1.04T 32K /srv/cloud/images
tank/export/home/cloud/images/centos6 5.00G 1.04T 5.00G /srv/cloud/images/centos6
tank/export/home/cloud/one 63K 1.04T 32K /srv/cloud/one
tank/export/home/cloud/one/var 31K 1.04T 31K /srv/cloud/one/var

After an upgrade of one of our internal KVM systems to CentOS 6.2, some of the VMs did not start after boot. When I tried to manually start them via virsh, they failed with the following error:


error: internal error unable to reserve PCI address 0:0:2.0

I fixed this by changing the xml that defines this VM:


virsh edit vm-id

Search for the device that uses PCI slot 0x02 (the address from the error message), e.g.:

<interface type='bridge'>
<mac address='52:54:00:bc:ab:96'/>
<source bridge='br205'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</interface>

Edit this section, change slot='0x02' to some other unused value, e.g. slot='0x04', and save.

Now you can start the VM without any problem.
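To see which slots are already in use before picking a new value, something like this helps (vm-id is the domain name or id):

virsh dumpxml vm-id | grep "slot="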

This issue seems to happen to more people after an upgrade to 6.2 (see below). The cause does not seem to be known, but it is easily fixed.

References:
https://www.redhat.com/archives/rhelv6-list/2011-December/msg00043.html

http://wiki.eri.ucsb.edu/sysadm/KVM

Several blogs and manuals with example KVM or Xen setups use NFS as the storage backend. Most of them state that iSCSI is recommended for production use. However, there are architectures where NFS is an integral part, e.g. OpenNebula. I tried to find specific statistics on the performance differences between NFS, iSCSI and local storage. During this search I found some indications that NFS and Xen is not a good combination, but never a direct comparison.

I decided to invest some time, set up a small test environment and run some bonnie++ benchmarks. This is not a scientifically designed experiment, but a test to show the differences between the platforms. Two test platforms were set up: one with a Xen server (DL360 G6, xen1) and a 12-disk SATA storage server (storage1), and another with a KVM server (DL360 G5, kvm1) and a 2-disk SATA storage server (storage2). Both servers are connected over a gigabit network. I also ran a test with a 100 Mbit/s network between kvm1 and storage2. For reference I also ran tests with the images on local disk.

I realize that LVM and iSCSI storage are the most efficient, but storage with image files is very convenient and, in cloud setups, sometimes the only option.
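For reference, a bonnie++ run of the kind used here looks roughly like this; the target directory and size are assumptions, not the literal commands used per test:

bonnie++ -d /mnt/testdir -s 1G -u root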

                              ------Sequential Output------  --Sequential Input--  --Random--
                              -Per Chr- --Block-- -Rewrite-   -Per Chr-  --Block--  --Seeks--
Test                    Size  K/sec %CP K/sec %CP K/sec %CP   K/sec %CP  K/sec %CP   /sec %CP
Xen-guest-via-nfs-tapaio 1G 3570 5 2436 0 1366 0 26474 41 24831 0 6719.0 1
xen-guest-via-iscsi 1G 25242 40 12071 1 15175 0 32071 42 47742 0 7331.3 1
kvm-guest-nfs-1gb-net 1G 8140 16 17308 3 11864 2 40861 81 71711 3 2126.6 54
kvm-guest-nfs-qcow-100mb 1G 1922 3 9874 1 3994 0 10720 22 10441 0 595.4 33
kvm-guest-nfs-qcow-100mb-2nd 1G 9735 21 2039 0 3197 0 10729 22 10463 0 685.3 38
kvm-guest-nfs-qcow-100mb-3rd 1G 5327 10 7378 1 4421 0 10655 18 10512 0 706.3 39
xenserver-nfsmount 1G 41507 60 60921 7 29687 1 33427 48 64147 0 4674.4 11
kvmserver-nfs-1G 20G 31158 52 32044 17 10749 2 19152 28 18987 1 90.3 1
localdisk-on-nfs-server-cloudtest3 4G 41926 65 43805 7 18928 3 52943 72 56616 3 222.6 0

The conclusion of the tests is that local storage is fastest. NFS storage with Xen is not a good combination; Xen runs best with iSCSI-backed storage. KVM with NFS performs significantly better. It is safe to say that if you want to use NFS, use it with KVM, not with Xen; for Xen, iSCSI is always the better option. I have not yet tested KVM with iSCSI, but I expect it to perform better than NFS.

Libvirt is a toolkit to interact with several virtualization platforms from a single interface. Considering that you can stop and start virtual machines through this API, security is quite important. Libvirt offers several options for authenticated access from remote machines. By default most distributions disable remote network access to libvirtd. However, I would like to access libvirtd on some of my KVM servers from a single management host to gather some information. The documentation on how to set this up is not very good, so I decided to write up a short how-to.

Step 1: Enable network access for libvirtd
First enable network access for libvirtd on the KVM server(s). On CentOS/RHEL this is done by uncommenting or adding the following line in /etc/sysconfig/libvirtd:

LIBVIRTD_ARGS="--listen"
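Besides the --listen argument, /etc/libvirt/libvirtd.conf controls which transports libvirtd listens on. For TLS the relevant settings are the defaults, shown here only for reference:

listen_tls = 1        # TLS socket enabled (default)
tls_port = "16514"    # default TLS port used by qemu:// connections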

Step 2: Install a CA on the management server
Install the Perl certificate tools:

yum install openssl-perl

Create Certificate authority:

cd /etc/pki/tls/misc/
./CA.pl -newca

Example output:

./CA.pl -newca
CA certificate filename (or enter to create)

Making CA certificate ...
Generating a 1024 bit RSA private key
..........++++++
.............++++++
writing new private key to '../../CA/private/cakey.pem'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [GB]:XX
State or Province Name (full name) [Berkshire]:XX
Locality Name (eg, city) [Newbury]:XXXXX
Organization Name (eg, company) [My Company Ltd]:XXXXX
Organizational Unit Name (eg, section) []:XXXX
Common Name (eg, your name or your server's hostname) []:CA XXX XXX
Email Address []:XXX

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
Using configuration from /etc/pki/tls/openssl.cnf
Enter pass phrase for ../../CA/private/cakey.pem:
Check that the request matches the signature
Signature ok
Certificate Details:
Serial Number:
d8:95:24:xx:xx:xx:13:9b
Validity
Not Before: Feb 25 23:14:08 2010 GMT
Not After : Feb 24 23:14:08 2013 GMT
Subject:
countryName = XX
stateOrProvinceName = XX
organizationName = XXXX
organizationalUnitName = XXXX
commonName = CA XXX XXX
emailAddress = XXXXX
X509v3 extensions:
X509v3 Subject Key Identifier:
XXX
X509v3 Authority Key Identifier:
keyid:XXXX
DirName:/C=XX/ST=XX/O=XXX/OU=XXXX/CN=CA XXX XXX/emailAddress=XXX
serial:XXX

X509v3 Basic Constraints:
CA:TRUE
Certificate is to be certified until Feb 24 23:14:08 2013 GMT (1095 days)

Write out database with 1 new entries
Data Base Updated

Step 3: Create CSRs
Create a key and a certificate signing request for both the KVM server and the management host. When asked for the Common Name, use the fully qualified hostname of the machine the certificate is for (e.g. kvm-server1.xxxx.nl), because libvirt checks that the server certificate matches the hostname used in the connection URI.

openssl genrsa -des3 -out kvm-server1.tmp
openssl rsa -in kvm-server1.tmp -out kvm-server1.key
openssl genrsa -des3 -out mgmt-host.tmp
openssl rsa -in mgmt-host.tmp -out mgmt-host.key
openssl req -new -key kvm-server1.key -out kvm-server1.csr
openssl req -new -key mgmt-host.key -out mgmt-host.csr

Step 4: Sign the certificates

openssl ca -config /etc/pki/tls/openssl.cnf -policy policy_anything -out /root/mgmt-host.crt -infiles /root/mgmt-host.csr
openssl ca -config /etc/pki/tls/openssl.cnf -policy policy_anything -out /root/kvm-server1.crt -infiles /root/kvm-server1.csr

Example output:

Using configuration from /etc/pki/tls/openssl.cnf
Enter pass phrase for /etc/pki/CA/private/cakey.pem:
Check that the request matches the signature
Signature ok
Certificate Details:
Serial Number:
d8:95:24:4b:4e:b1:13:9c
Validity
Not Before: Feb 25 23:31:40 2010 GMT
Not After : Feb 25 23:31:40 2011 GMT
Subject:
countryName = XX
stateOrProvinceName = XX
localityName = XX
organizationName = XX
organizationalUnitName = XX
commonName = mgmt-host.xxx.nl
emailAddress = xxxxx
X509v3 extensions:
X509v3 Basic Constraints:
CA:FALSE
Netscape Comment:
OpenSSL Generated Certificate
X509v3 Subject Key Identifier:
6C:EA:8B:C1:D6:XX:B6:6B:5B:18:02
X509v3 Authority Key Identifier:
keyid:C9:36:4A:XXXX:6F:FD:2E:86

Certificate is to be certified until Feb 25 23:31:40 2011 GMT (365 days)
Sign the certificate? [y/n]:y

1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

Step 5: Copy over the certificates to the correct location
On the management host (mgmt-host):

mkdir /etc/pki/libvirt
mkdir /etc/pki/libvirt/private
mkdir /etc/pki/libvirt-vnc

cp /root/mgmt-host.key /etc/pki/libvirt/private/clientkey.pem
cp /root/mgmt-host.key /etc/pki/libvirt-vnc/clientkey.pem
cp /root/mgmt-host.crt /etc/pki/libvirt/clientcert.pem
cp /root/mgmt-host.crt /etc/pki/libvirt-vnc/clientcert.pem

Transfer the key and certificate files to the KVM server (kvm-server1). Ideally, you create the key and CSR on the host itself, so you only have to transfer the certificate. Then, copy the certificates and CA to the correct location on the KVM (libvirtd) server:


mkdir /etc/pki/libvirt
mkdir /etc/pki/libvirt/private
mkdir /etc/pki/libvirt-vnc

cp kvm-server1.key /etc/pki/libvirt/private/serverkey.pem
cp kvm-server1.key /etc/pki/libvirt-vnc/server-key.pem

cp kvm-server1.crt /etc/pki/libvirt/servercert.pem
cp kvm-server1.crt /etc/pki/libvirt-vnc/server-cert.pem

Make sure the CA generated on the management server is available on the KVM server in the following file:
/etc/pki/CA/cacert.pem
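Copying the CA certificate over from the management server can be as simple as this (the hostname is an example):

scp /etc/pki/CA/cacert.pem root@kvm-server1:/etc/pki/CA/cacert.pem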

Step 6: Reload libvirtd

/etc/init.d/libvirtd reload

Step 7: Test
With these certificates setup, you should be able to access libvirtd on kvm-server1 from mgmt-host. Use the following command to test:

virsh -c qemu://kvm-server1.xxxx.nl/system
Welcome to virsh, the virtualization interactive terminal.

Type: 'help' for help with commands
'quit' to quit

virsh #

Use the list command to see a list of running guests on the server. This only works if these guests have also been created via libvirtd. Manually started KVM guests will not show up in this list.

I’ve made some quick changes to ONEMC to show the VNC port in the interface. I’ve updated the template that ONEMC creates with a GRAPHICS section. This enables VNC on the guest.
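The GRAPHICS section added to the template looks roughly like this; a sketch, the exact attributes are in the patch linked below:

GRAPHICS = [
  TYPE   = "vnc",
  LISTEN = "0.0.0.0"
]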

As a workaround until ONE can use the VMID in the graphics section, I use a virsh command to get the VNC port. To get this working, the web server user must be allowed to execute the virsh command via sudo. Add the following to sudoers:

apache ALL=(ALL) NOPASSWD: /usr/bin/virsh *
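A virsh command that returns the VNC display for a domain is vncdisplay; for an OpenNebula-managed guest (domains are named one-<vmid>) that looks roughly like the line below, with the VNC port being 5900 plus the reported display number:

sudo virsh vncdisplay one-42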

I also encountered some problems with the model section in the KVM template, so I commented that out as well.

Below are the patch and a screenshot listing the VNC ports in ONEMC:
ONEMC screenshot
onemc_funcs.patch

I encountered an error while experimenting with the OpenNebula (ONE) EC2 interface. I tried to upload an image file to an OpenNebula host running CentOS 5.3 with ONE 1.3.8. After a couple of seconds the command exited with the following error:

[rogierm@cloudtest3 ~]$ econe-upload /home/rogierm/centos5.img
image /home/rogierm/centos5.img
/usr/local/one/lib/ruby/econe/EC2QueryClient.rb:164:in `http_post': server returned nothing (no headers, no data) (Curl::Err::GotNothingError)
from /usr/local/one/lib/ruby/econe/EC2QueryClient.rb:164:in `upload_image'
from /usr/local/one/bin/econe-upload:116

I informed the ONE developers of this issue on their mailing list and Sebastien Goasguen pointed me to the correct solution. There seems to be an error in the curl implementation on CentOS. I installed the multipart-post gem and ran econe-upload with the (as yet undocumented) switch ‘-M’. This fixed the problem.

Install gem:

[root@cloudtest3 ~]# gem install multipart-post

Run the working econe-upload command:

[rogierm@cloudtest3 ~]$ econe-upload -M /home/rogierm/centos5.img

While experimenting with OpenNebula and trying to build a public cloud with the EC2 interface to OpenNebula I encountered the following problem in the code:

[rogierm@cloudtest3 one]$ econe-upload /home/rogierm/test.img
/usr/lib/ruby/1.8/rdoc/ri/ri_options.rb:53: uninitialized constant RI::Paths (NameError)
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /usr/lib/ruby/1.8/rdoc/usage.rb:72
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
from /usr/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from /usr/local/one/bin/econe-upload:61

I fixed this problem by adding the following line (above the other require statements) in econe-upload, or any other command giving the same error:

require 'rdoc/ri/ri_paths'

OpenQRM uses dropbear for the communication and exchange of messages between the server and the appliances. When something goes wrong in this communication, OpenQRM can’t function correctly: it can’t access the appliances for status updates and commands. These communication problems are often caused by a misconfiguration in dropbear, most commonly a mismatch between the public and private dropbear keys.

The keys should be synchronized between the server and the appliance. On the server, print the public key with the following command:

[root@localhost log]# /usr/lib/openqrm/bin/dropbearkey -t rsa -f /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -y
Public key portion is:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgwCBvwSO7vBBL2avDMds...pVn root@localhost.localdomain
Fingerprint: md5 65:ca:5b:3b:05:c3:61:6d:fb:75:2f:c0:d2:7e:02:cf

Copy the ssh-rsa public key into /root/.ssh/authorized_keys on the appliance.
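In practice that means appending the key printed above to the appliance's authorized_keys, for example like this (the appliance address is a placeholder):

/usr/lib/openqrm/bin/dropbearkey -t rsa -f /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -y | grep '^ssh-rsa' | ssh root@<appliance-ip> 'cat >> /root/.ssh/authorized_keys'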

Now communication should be established.

OpenQRM event log with an example of the error message caused by this communication problem:

openqrm-cmd-queue ERROR executing command with token 64d478dcac6670e5fb000e7c4954863b : /usr/lib/openqrm/bin/dbclient


Aug 26 23:19:45 localhost httpd: openQRM resource-monitor: (update_info) Processing statistics from resource 2
Aug 26 23:19:48 localhost logger: openQRM-cmd-queu: Running Command with token 64d478dcac6670e5fb000e7c4954863b 1. retry : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.243 "/usr/lib/openqrm/bin/openqrm-cmd /usr/lib/openqrm/plugins/xen/bin/openqrm-xen post_vm_list -u openqrm -p openqrm"
Aug 26 23:19:52 localhost logger: openQRM-cmd-queu: ERROR executing command with token 64d478dcac6670e5fb000e7c4954863b 2. retry : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.243 "/usr/lib/openqrm/bin/openqrm-cmd /usr/lib/openqrm/plugins/xen/bin/openqrm-xen post_vm_list -u openqrm -p openqrm" -----
Aug 26 23:19:52 localhost logger: Host '192.168.42.243' key accepted unconditionally.
Aug 26 23:19:52 localhost logger: (fingerprint md5 64:d5:c7:8e:7a:11:08:3f:43:bc:3c:2b:bf:4a:c8:ce)
Aug 26 23:19:52 localhost logger: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: root@192.168.42.243's password: /usr/lib/openqrm/bin/dbclient: connection to root@192.168.42.243:1667 exited: remote closed the connection

OpenQRM uses dropbear for communication between the OpenQRM server and the appliances. Dropbear is basically a simple version of SSH, and it uses host keys which are cached in /root/.ssh/known_hosts. Dropbear uses a different host key than sshd, but ssh and dropbear share the known_hosts file, and port numbers are not included in this file.

If you ssh into the appliance from the OpenQRM server once, the ssh host key is cached in the known_hosts file. When OpenQRM then wants to connect to the appliance, dropbear checks the known_hosts file for the cached host key. It finds the ssh host key instead of the dropbear host key, so dropbear aborts the connection, because a host key mismatch could indicate a security compromise.

To solve the problem, remove the host key entry for the host from /root/.ssh/known_hosts.
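This can be done by editing the file or with ssh-keygen; the IP below is the appliance from the log that follows:

ssh-keygen -R 192.168.42.235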


Aug 24 23:24:26 localhost logger: openQRM-cmd-queu: Running command with token 34b3e7ddd93ffa548d34ccea1e4aa7e5 : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.235 "/usr/lib/openqrm/bin/openqrm-cmd openqrm_server_set_boot local 1 00:00:5A:11:21:B7 0.0.0.0"
Aug 24 23:24:26 localhost logger: openQRM-cmd-queu: ERROR while running command with token bc7c6de1b59370dd8019bcae2d7bfa45 : /usr/lib/openqrm/bin/dbclient -I 0 -K 10 -y -i /usr/lib/openqrm/etc/dropbear/dropbear_rsa_host_key -p 1667 root@192.168.42.235 "/usr/lib/openqrm/bin/openqrm-cmd openqrm_server_set_boot local 1 00:00:5A:11:21:B7 0.0.0.0" ----- /usr/lib/openqrm/bin/dbclient: connection to root@192.168.42.235:1667 exited:
Aug 24 23:24:26 localhost logger:
Aug 24 23:24:26 localhost logger: Host key mismatch for 192.168.42.235 !
Aug 24 23:24:26 localhost logger: Fingerprint is md5 65:ca:5b:3b:05:c3:61:6d:fb:75:2f:c0:d2:7e:02:cf
Aug 24 23:24:26 localhost logger: Expected md5 a8:e5:d4:62:36:d2:98:b2:c3:74:a9:0c:d5:d1:56:f9
Aug 24 23:24:26 localhost logger: If you know that the host key is correct you can
Aug 24 23:24:26 localhost logger: remove the bad entry from ~/.ssh/known_hosts

On a new Xen server I encountered the following error while starting a fully virtualized guest:

[root@resource1 xen]# xm create test-vps.cfg
Using config file "./test-vps.cfg".
VNC= 1
Error: Unable to connect to xend: Name or service not known. Is xend running?

This was caused by a name resolution problem. I solved it by adding the hostname and IP address of the server to /etc/hosts.
After this change the guest booted without problems.
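For reference, the /etc/hosts entry is just the usual format; the IP address and domain below are placeholders:

192.168.1.10   resource1.example.com resource1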