xNBD
Contents
- Contents
- About
- Download
- Compile
- Usage
- Step-by-Step Examples
- Documentation
- Links
- Copyright
- Contact
About
xNBD is yet another NBD (Network Block Device) server program, which works with the NBD client driver of the Linux kernel.
xNBD provides the following features:
- Possibly better I/O performance by using `mmap()`.
- Concurrent access from multiple clients.
- Copy-on-Write (basic support).
- Snapshot (basic support).
- Distributed Copy-on-Write.
- Live storage migration for virtual machines.
- IPv6 support.
Watch the demo movies to see how it works.
This is an open source project and contributions are very welcome. Please send me patches; doing things in the Mercurial way is a nice idea!
Download
You might be able to take a shortcut and just install it from your distribution. At least Debian and Ubuntu include xNBD.
```
apt-get install xnbd-client xnbd-server
```
You can also download the source code and follow the latest development:
- Download the latest release: http://bitbucket.org/hirofuchi/xnbd/downloads/
- Download the latest development code: http://bitbucket.org/hirofuchi/xnbd/get/tip.gz (tar.gz format)
- Browse source code via web browser: http://bitbucket.org/hirofuchi/xnbd/src/
xNBD is licensed under GPLv2.
Compile
Before compiling xNBD, set up the following libraries and programs.
- GLib 2. See http://www.gtk.org/ .
- A Linux kernel and glibc with the `ppoll()` system call. Linux 2.6.22 or later is recommended.
- GCC 4. C99 syntax is used.
- make and autoconf.
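On Debian or Ubuntu, for example, the build dependencies can typically be installed with a command like the following (the package names are my assumption, not taken from this page):

```
apt-get install build-essential autoconf libglib2.0-dev
```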
Then, extract the xNBD tarball and build it:

```
tar xvf xnbd-x.y.z.tar.gz
cd xnbd-x.y.z
autoreconf -i
./configure
make
```
To enable debug messages, add the `--enable-debug` option to `./configure`.
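For example, a debug-enabled build could look like this:

```
./configure --enable-debug
make
```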
Usage
The following programs are used on a server node; `xnbd-server` exports a disk image through the NBD protocol.

- `xnbd-server` : the xNBD server program
- `xnbd-bgctl` : the control program for the xNBD proxy mode
- `xnbd-wrapper` : a super daemon for `xnbd-server`, managing multiple disk images and `xnbd-server` instances
- `xnbd-wrapper-ctl` : the control program for `xnbd-wrapper`

The following programs are used on a client node. They are optional; instead of them, you can still use `nbd-client` from the original NBD distribution.

- `xnbd-client` : a userland helper program for the NBD driver. It works with the NBD driver in the mainline Linux kernel.
- `xnbd-watchdog` : a watchdog program which periodically checks that an NBD device is working correctly.
Summary
Command-line options may differ in the latest code. See the `--help` output of the compiled binaries.
xnbd-server
```
% xnbd-server --help
Usage:
  xnbd-server --target [options] disk_image
  xnbd-server --cow-target [options] base_disk_image
  xnbd-server --proxy [options] remote_host remote_port cache_disk_path cache_bitmap_path control_socket_path
  xnbd-server --help
  xnbd-server --version

Options:
  --lport         listen port (default 8520)
  --daemonize     run as a daemon process
  --readonly      export a disk as readonly in target mode
  --logpath PATH  use the given path for logging (default: stderr/syslog)
  --syslog        use syslog for logging
  --inetd         set the inetd mode (use fd 0 for TCP connection)

Options (Proxy mode):
  --clear-bitmap  clear an existing bitmap file (default: re-use previous state)
```
xnbd-client
```
% ./xnbd-client --help
Usage:
  xnbd-client [bs=...] [timeout=...] host port nbd_device
  xnbd-client --connect [options] nbd_device host port [host port] ...
  xnbd-client -C [options] nbd_device host port [host port] ...
  xnbd-client --disconnect nbd_device
  xnbd-client -d nbd_device
  xnbd-client --check nbd_device
  xnbd-client -c nbd_device
  xnbd-client --help

Options:
  --timeout                  set a timeout period (default 0, disabled) (DO NOT USE NOW)
  --blocksize                select blocksize from 512, 1024, 2048, and 4096 (default 1024)
  --retry                    set the maximum count of retries to connect to a server (default 1)
  --recovery-command         invoke a specified command on unexpected disconnection
  --recovery-command-reboot  invoke the reboot system call on unexpected disconnection
  --exportname               specify a target disk image

Example:
  xnbd-client fe80::250:45ff:fe00:ab8f%%eth0 8998 /dev/nbd0
    This command line is compatible with nbd-client. xnbd-client supports IPv6.
  xnbd-client --connect /dev/nbd0 fe80::250:45ff:fe00:ab8f%%eth0 8998 10.1.1.1 8900
    This automatically tries the next server if the first one does not accept connection.
```
xnbd-watchdog
```
% ./xnbd-watchdog --help
Usage:
  xnbd-watchdog [options] nbd_device
  xnbd-watchdog --help

Options:
  --timeout                  set a timeout period (default 10)
  --interval                 (default 10)
  --recovery-command         invoke a specified command if polling failed
  --recovery-command-reboot  invoke the reboot system call if polling failed

Example:
  xnbd-watchdog --recovery-command-reboot /dev/nbd0
```
xnbd-bgctl
```
% ./xnbd-bgctl --help
Usage:
  xnbd-bgctl --query control_unix_socket
  xnbd-bgctl [--force] --switch control_unix_socket
  xnbd-bgctl [--progress] --cache-all control_unix_socket
  xnbd-bgctl --cache-all2 control_unix_socket
  xnbd-bgctl [--exportname NAME] --reconnect control_unix_socket remote_host remote_port

Commands:
  --query       query current status of the proxy mode
  --cache-all   cache all blocks
  --cache-all2  cache all blocks with the background connection
  --switch      stop the proxy mode and start the target mode
  --reconnect   reconnect the forwarding session

Options:
  --exportname NAME  reconnect to a given image
  --progress         show a progress bar on stderr (default: disabled)
  --force            ignore risks (default: disabled)
```
xnbd-wrapper
```
% ./xnbd-wrapper --help
Usage: ./xnbd-wrapper [options]

Options:
  --daemonize     run wrapper as a daemon process
  --cow           run server instances as a cow target
  --readonly      run server instances as a readonly target
  --laddr         listening address
  --lport         listening port (default: 8520)
  (--port)        deprecated, use --lport instead
  --xnbd-bgctl    path to the xnbd-bgctl executable
  --xnbd-server   path to the xnbd-server executable
  --imgfile       path to a disk image file. This options can be used multiple times.
                  Use also xnbd-wrapper-ctl to (de)register disk images dynamically.
  --logpath PATH  use the given path for logging (default: stderr/syslog)
  --socket        unix socket path to listen on (default: /var/run/xnbd-wrapper.ctl)
  --syslog        use syslog for logging

Examples:
  xnbd-wrapper --imgfile /data/disk1
  xnbd-wrapper --imgfile /data/disk1 --imgfile /data/disk2 --xnbd-binary /usr/local/bin/xnbd-server --xnbd-bgctl /usr/local/bin/xnbd-bgctl --laddr 127.0.0.1 --port 18520 --socket /tmp/xnbd-wrapper.ctl
```
Step-by-Step Examples
Install the NBD driver (`nbd.ko`) and a client program (`nbd-client` or `xnbd-client`) on your client node.
The NBD driver is included in the Linux kernel and enabled in most distributions.
Scenario 1 (Simple target server)
In a server node (10.1.1.1), start an xNBD server exporting a local file (disk.img). The server listens on TCP port 8992.
```
dd if=/dev/zero of=disk.img bs=4096 count=1 seek=1000000
xnbd-server --target --lport 8992 disk.img
```
In this case, we have created a new (sparse) image file of about 4 GB.
In a client node, establish an NBD session to the above server.
```
modprobe nbd
echo deadline > /sys/block/nbd0/queue/scheduler
nbd-client bs=4096 10.1.1.1 8992 /dev/nbd0
```
According to the NBD client driver, you may need to explicitly select the deadline I/O scheduler to avoid deadlocks.
In addition, to get better performance, you can optionally set the block size of `nbd0` to 4096, which is the same as the cache block size inside xNBD.
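To verify that the device is attached, you can use the `--check` option shown in the `--help` output above (a minimal sketch; `nbd-client -c` can be used similarly):

```
# Check whether /dev/nbd0 is currently connected
xnbd-client --check /dev/nbd0
```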
If you need concurrent access from multiple clients, repeat the same operations at each client node. Normally, you need a cluster file system (e.g., OCFS2 or GFS) to store data safely on a shared disk.
Scenario 2 (Simple proxy server, distributed Copy-on-Write)
xNBD can also work as a proxy server for another target server. This feature is used for distributed Copy-on-Write NBD disks: one read-only disk image is shared among multiple clients, and updated disk data is saved at each proxy.
In xNBD's proxy mode, all I/O requests are intercepted and redirected to the target server when needed. Updated blocks are saved at the proxy server, read blocks are cached there as well, and writes never reach the target server.
Now, start an xNBD proxy server (10.255.255.254:8992) redirecting to the above target server (10.1.1.1:8992).
```
xnbd-server --proxy --lport 8992 10.1.1.1 8992 cache.img cache.bitmap proxy.ctl
```
Updated and cached blocks are saved in a cache disk file (cache.img). A bitmap file (cache.bitmap) records the block numbers of updated and cached blocks. A UNIX socket file (proxy.ctl) is created to control the proxy server (see the next scenario).
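For example, you can query the current status of the proxy over this socket with `xnbd-bgctl` (the command is taken from the `--help` output above; the exact output format may vary between versions):

```
# Query the current status of the proxy via its control socket
xnbd-bgctl --query proxy.ctl
```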
Then, an NBD client node connects to the proxy server.
```
nbd-client 10.255.255.254 8992 /dev/nbd0
```
If you want to add more clients to the target server, repeat these commands at each proxy server and client node.
A proxy server also accepts NBD connections from other NBD proxies, which means you can cascade multiple NBD proxies, as in the sketch below.
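A minimal sketch of such a cascade, assuming a second proxy node at 10.255.255.253 (the address, port, and file names are hypothetical):

```
# On the second proxy node (10.255.255.253): redirect to the first proxy instead of the target
xnbd-server --proxy --lport 8992 10.255.255.254 8992 cache2.img cache2.bitmap proxy2.ctl

# Clients then connect to the second proxy
nbd-client 10.255.255.253 8992 /dev/nbd0
```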
Scenario 3 (Live VM & disk migration with Xen)
A proxy server can be used to relocate a virtual disk to another xNBD server. This mechanism works transparently with live migration of a VM.
In this example, 4 physical machines are used:
- Source host node where a VM is started
- Destination host node where the VM is migrated
- xNBD target node exporting a virtual disk to the source host node
- xNBD proxy node exporting a virtual disk to the destination host node
1. Set up a VM with an NBD disk
Source Side
First, set up an xNBD target server (10.10.1.1) and connect to it from the source host node (10.10.1.2).
Then, create a VM that uses `/dev/nbd0` as its virtual disk.
In the xNBD target node,
```
dd if=/dev/zero of=disk.img bs=4096 count=1 seek=1000000
xnbd-server --target --lport 8992 disk.img
```
In the source host node,
```
modprobe nbd
echo deadline > /sys/block/nbd0/queue/scheduler
nbd-client bs=4096 10.10.1.1 8992 /dev/nbd0
virt-install -f /dev/nbd0   # or a similar command to install a VM into /dev/nbd0
xm create /etc/xen/mydomain.cfg
```
When using Xen, the domain configuration file of the VM will include an entry like the following.

```
disk = [ "phy:/dev/nbd0,xvda,w" ]
```
xNBD is independent of VMM implementations; it also works with KVM (QEMU) and others.
For instance, since KVM includes NBD client code, you do not need to set up `/dev/nbd0` on the host OS. Just specify the NBD disk directly on the command line.
```
qemu-system-x86_64 -hda nbd:10.10.1.1:8992
```
Destination Side
Next, set up an xNBD proxy server (10.20.1.1) and connect to it from the destination host node (10.20.1.2).
The xNBD proxy server redirects NBD I/O requests to the target server above (10.10.1.1).
```
xnbd-server --proxy --lport 8992 10.10.1.1 8992 cache.img cache.bitmap proxy.ctl
```
This command creates a UNIX socket file (proxy.ctl), through which the proxy server is controlled.
In the destination host, connect to the proxy server.
```
modprobe nbd
echo deadline > /sys/block/nbd0/queue/scheduler
nbd-client bs=4096 10.20.1.1 8992 /dev/nbd0
```
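Optionally, you can run `xnbd-watchdog` on the client node to detect a stalled NBD device. This mirrors the example in the `--help` output above; using the reboot recovery action is just one possible choice:

```
# Periodically poll /dev/nbd0 and invoke the reboot system call if polling fails
xnbd-watchdog --recovery-command-reboot /dev/nbd0
```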
2. Migrate the VM to the destination
Next, start live migration.
In the source host,
```
xm migrate -l 2 10.20.1.2   # Domain ID is 2
```
After memory page relocation is completed, the VM is terminated at the source host, and then restarted at the destination. All disk I/O requests are intercepted at the xNBD proxy, and disk blocks are gradually cached (i.e., relocated) at the cache file.
3. Migrate all the disk blocks
Some blocks have not yet been relocated. Now copy them to the xNBD proxy.
In the xNBD proxy node,
```
xnbd-bgctl --cache-all proxy.ctl
```
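If you want to watch the transfer, the `--progress` option listed in the `--help` output above shows a progress bar on stderr:

```
xnbd-bgctl --progress --cache-all proxy.ctl
```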
After all blocks are cached at the proxy node, the NBD connection to the target server is no longer required. You can now change the xNBD proxy into a normal target server, disconnecting from the target server.
In the xNBD proxy node,
```
xnbd-bgctl --switch proxy.ctl
```
This command shuts down the xNBD proxy server and restarts it as a normal xNBD target server. All client NBD sessions are preserved.
Scenario 4 (OCFS2 with xNBD)
For this example, assume there are three Debian (squeeze) machines:
- host1: the xNBD target server. Its IP address is 172.16.0.101.
- host2: an xNBD client node and an O2CB cluster node. Its IP address is 172.16.0.102.
- host3: an xNBD client node and an O2CB cluster node. Its IP address is 172.16.0.103.
On host1, start xnbd-server:
```
# dd if=/dev/zero of=disk.img bs=4096 count=1 seek=1000000
# xnbd-server --target --lport 8992 disk.img
```
On host2 and host3, establish an NBD session:
```
# modprobe nbd
# echo deadline > /sys/block/nbd0/queue/scheduler
# xnbd-client bs=4096 172.16.0.101 8992 /dev/nbd0
```
On host2 and host3, set up O2CB:
```
# aptitude install ocfs2-tools
# vi /etc/ocfs2/cluster.conf
```

```
cluster:
    node_count = 2
    name = cluster1

node:
    ip_port = 7777
    ip_address = 172.16.0.102
    number = 1
    name = host2
    cluster = cluster1

node:
    ip_port = 7777
    ip_address = 172.16.0.103
    number = 2
    name = host3
    cluster = cluster1
```
```
# vi /etc/default/o2cb
```

```
O2CB_ENABLED=true
O2CB_BOOTCLUSTER=cluster1
O2CB_HEARTBEAT_THRESHOLD=61
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
```

```
# /etc/init.d/o2cb start
```
On host2, make OCFS2 filesystem:
```
# mkfs.ocfs2 /dev/nbd0
```
On host2 and host3, mount OCFS2 volume:
```
# mount -t ocfs2 /dev/nbd0 /mnt/ocfs2
```
Now you can read and write files from both machines.
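As a quick sanity check (the file name is just an example), write a file on one node and read it from the other:

```
# On host2
echo hello > /mnt/ocfs2/test.txt

# On host3
cat /mnt/ocfs2/test.txt
```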
Scenario 5 (Using named exports)
The recent NBD protocol allows an NBD client to request a target disk image by name during the negotiation phase.
NBD clients such as `xnbd-client`, `nbd-client`, and `qemu` support this feature.
On the server side, `xnbd-wrapper` is used to provide it.
First, start xnbd-wrapper on 10.1.1.1:8992:
```
xnbd-wrapper --port 8992 --imgfile /data/disk1 --imgfile /data/disk2
```
This command line registers two image files with xnbd-wrapper.
You can also register more disk images dynamically by using `xnbd-wrapper-ctl`.

```
xnbd-wrapper-ctl --add /data/disk3
xnbd-wrapper-ctl --add /data/disk4
xnbd-wrapper-ctl --list
xnbd-wrapper-ctl --remove <index>
```
For example, the first VM uses /data/disk1 and the second VM uses /data/disk2:

```
qemu-kvm -hda nbd:10.1.1.1:8992:exportname=/data/disk1
qemu-kvm -hda nbd:10.1.1.1:8992:exportname=/data/disk2
```
qemu-kvm 0.14.0 or later is required.
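You can also attach a named export to a kernel NBD device with `xnbd-client`, using the `--exportname` option from the `--help` output above (a sketch; the exact option ordering may differ between versions):

```
# Connect /dev/nbd0 to the export registered as /data/disk1
xnbd-client --connect --exportname /data/disk1 /dev/nbd0 10.1.1.1 8992
```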
Documentation
There are several papers focused on storage migration for virtual machines.
- A Live Storage Migration Mechanism over WAN and its Performance Evaluation, Takahiro Hirofuchi, Hidemoto Nakada, Hirotaka Ogawa, Satoshi Itoh and Satoshi Sekiguchi, The 3rd International Workshop on Virtualization Technologies in Distributed Computing (VTDC2009), Jun 2009 Paper (PDF) Slides (PDF)
- A Live Storage Migration Mechanism over WAN for Relocatable Virtual Machine Services on Clouds, Takahiro Hirofuchi, Hirotaka Ogawa, Hidemoto Nakada, Satoshi Itoh and Satoshi Sekiguchi, International Workshop on Cloud Computing (Cloud 2009), May 2009
- A Relocatable Storage I/O Mechanism for Live-Migration of Virtual Machines over WAN, Takahiro Hirofuchi, 2008 USENIX Annual Technical Conference (Poster), Jun 2008. Poster (PDF)
Other papers written in Japanese are listed at http://grivon.apgrid.org/publications/ .
A figure outlining the design of the proxy mode is available (png | odg).
Links
There are several NBD-family projects.
- NBD (the original NBD client/server) http://nbd.sourceforge.net/ The kernel driver is included in the vanilla Linux kernel.
- nbdkit https://github.com/libguestfs/nbdkit A plugin-based NBD server by Red Hat. Plugins include:
    - regular files
    - compressed files (gzip/xz, readonly)
    - VMDK images
    - guestfs guest disks
    - libvirt guest disks (readonly)
- ENBD (Enhanced NBD) http://www.enbd.org/ ENBD requires compiling a special kernel driver. Linux 2.6.x does not appear to be supported.
- GNBD (Global NBD) http://sourceware.org/cluster/gnbd/ GNBD supports multiple clients. A special kernel driver (`gnbd.ko`) is required.
- DNBD (Distributed NBD) http://lab.openslx.org/projects/dnbd/wiki A read-only and caching network block device.
- python NBD server http://lists.canonical.org/pipermail/kragen-hacks/2004-May/000397.html A pure python implementation of an NBD server. Very tiny.
- Blockfish A now-defunct NBD server written in Java that supported an Amazon S3 backend.
- nbd-http http://patraulea.com/nbd-http/ An NBD-like client using HTTP.
- SC101 NBD Server http://code.google.com/p/sc101-nbd/ A protocol translation server for a Netgear SAN product.
- qemu-nbd http://www.qemu.org/qemu-doc.html#SEC20 An NBD server that exports QEMU disk images.
- BlackHole http://www.vanheusden.com/java/BlackHole/ A de-duplicating NBD server.
- JNBDS http://sourceforge.net/projects/jnbds/ Java Network Block Device Server
- swift-nbd-server https://github.com/reidrac/swift-nbd-server NBD server for OpenStack Object Storage (Swift)
- Let me know more...
An xNBD disk shared among client nodes should be used with a cluster file system.
- GFS (Global File System) http://sources.redhat.com/cluster/gfs/
Non-Linux NBD clients:
- solaris-nbd https://github.com/imp/solaris-nbd Solaris NBD kernel client
- osx-nbd https://github.com/elsteveogrande/osx-nbd NBD client driver for OS X
This project is partially supported by a government-funded research project for datacenter virtualization and Green IT.
Copyright
Copyright (C) 2008-2013 National Institute of Advanced Industrial Science and Technology. All rights reserved.
Note: This program partially includes small pieces of source code written by other open source projects under the terms of the GNU General Public License.
Development of xNBD was partially sponsored by Wavecon GmbH < www.wavecon.de >.
Contact
Takahiro Hirofuchi <t.hirofuchi _at_ aist.go.jp>