Xen and the Art of System Administration
Johnny C. Lam
 
Caveats
- I am not a kernel developer.
 
- My experience is in implementing systems and porting software.
 
- Covers Xen 2.0, which lacks some of the cooler Xen 3.0 features like
    support for SMP domains.
 
- I only have a layman's knowledge of FreeBSD Jails.
 
 
The Problem
How do we isolate processes and users?
- Processes and services shouldn't interfere with each other.
 
- Users shouldn't be able to stomp on each other.
 
- Users have different levels of access to services.
 
 
The Whole "UNIX" Thing
- Processes have their own address spaces.
 
- Processes can run as different users.
 
- Processes can run in chroot "jails".
 
- Filesystem access is managed via ACLs or permission masks.
 
We can run everything on a single machine by taking care with file and
   directory permissions, using chroots, and having good bookkeeping skills.
 
Difficulties
- Bookkeeping overhead in documenting complex machine setups.
 
- Conflicting software installations require manual fix-ups.
 
- Delegation issues
    
    - Allowing junior admins to manage some things but not others.
 
    - Protecting against catastrophic screw-ups by junior admins.
 
    
 
- User/group access controls on resources are a pain in the butt.
    
    - Windows 200x Active Directory anyone?
 
    
 
 
Workaround
We can just run separate processes on separate machines.
- Easy to understand physical security.
 
- Can allocate processes to machines based on resource requirements, e.g.
    more memory, faster disks, faster NICs, etc.
 
- Each machine only has the minimum number of local users needed.
 
 
Virtualization
Virtualization lets you do all this on a single machine.
- Advantages:
    
    - Dense -- save on hardware and power costs
 
    - Simpler to maintain
        
        - Easier to set up and tear down a virtual machine -- it's just
            playing with bits.
 
        - Don't need to learn or keep track of anything new or complex
            admin-wise -- just treat a virtual machine like any other
            machine.
 
        
     
    
 
- Disadvantages:
    
    - Need a beefy machine -- higher initial costs.
 
    - Performance hit
 
    
 
 
Some virtualization technologies
These technologies provide varying degrees of virtualization:
 
Xen vs. Jails Deathmatch
Xen and jails are two completely different technologies, so comparisons
   are unfair.  However, both can be applied to solve a particular domain
   of problems in system administration:
- Process isolation
 
- User isolation
 
- All on one machine
 
Xen provides "machine-level" virtualization, while jails provide "OS-level" virtualization, so the two have different cost trade-offs.
Jails only exist on FreeBSD and DragonFly, and I use NetBSD, so I use Xen.
 
Xen hypervisor
Originally developed by the University of Cambridge Computer Laboratory, and currently developed by XenSource.
- GPL-licensed virtual machine monitor
 
- Xen hypervisor implements virtual x86 machines (with special devices)
 
- Securely execute multiple virtual machines with strict resource partitioning.
 
- Close-to-native performance
 
 
Xen domains
- domain 0
    
    - Privileged domain
 
    - Linux or NetBSD
 
    - access to real hardware
 
    - starts, stops & manages all guest domains
 
    
 
- domain U
    
    - Unprivileged domains
 
    - Linux, NetBSD, FreeBSD-5.x, Plan 9
 
    - Only have access to block and network devices created by domain 0
 
    
 
 
Concrete Example
- Dell PowerEdge 1750
    
    - dual Xeon 3.2GHz processors
 
    - 2GB RAM
 
    - dual embedded GigE NICs
 
    - PERC 4/Di embedded RAID with three 150GB drives (RAID-5)
 
    
 
- domain 0
    
    - NetBSD 3.0
 
    - 64MB RAM
 
    - Provide cgd-on-vnd devices for file-backed domains
 
    
 
- domain U
    
    - NetBSD 3.0
 
    - 128MB RAM
 
    - varying amount of disk space
 
    - file-backed domains
 
    
 
 
Domain 0 Setup
- pkgsrc/sysutils/xentools20
 
- Use IPfilter to block access to ports 8000, 8001, 8002
    
    - control ports for xend, which allow managing guest domains.
 
    
 
- Create bridge(4) devices for each NIC
    
    - Network interfaces for each guest domain are attached to a specific
        bridge.
 
    
 
- Mount USB key partition containing encryption keys for cgd(4) devices
 
- Start xend and all domains
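
The firewall and bridge steps above can be sketched as a small generator of ipf rules and bridge commands. This is only a sketch under assumptions: the NIC names `bge0` and `bge1` are hypothetical, the rules only cover TCP, and the helper names are made up for illustration. It prints text for review and runs nothing.

```python
# Sketch: generate ipf(5) rules and bridge(4) setup commands for domain 0.
# Interface names below are hypothetical; substitute the real NICs.

XEND_PORTS = (8000, 8001, 8002)  # xend control ports named in the slide


def ipf_block_rules(interface, ports=XEND_PORTS):
    """Return one ipf rule per port, blocking inbound TCP on the interface."""
    return [
        "block in quick on %s proto tcp from any to any port = %d" % (interface, port)
        for port in ports
    ]


def bridge_commands(nics):
    """Return commands that create one bridge per NIC (bridge0, bridge1, ...)."""
    commands = []
    for index, nic in enumerate(nics):
        bridge = "bridge%d" % index
        commands.append("ifconfig %s create" % bridge)
        commands.append("brconfig %s add %s up" % (bridge, nic))
    return commands


if __name__ == "__main__":
    nics = ["bge0", "bge1"]  # assumption: the two embedded GigE NICs
    for nic in nics:
        print("\n".join(ipf_block_rules(nic)))
    print("\n".join(bridge_commands(nics)))
```

Generating the text instead of executing it keeps the sketch safe to run anywhere; the output can be pasted into ipf.conf and a startup script after checking it.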
 
 
Domain U Setup
- Each domain uses three partitions
    
    - root.img (/)
        
        - 512 MB, read-only & shared amongst all domU's
 
        - Contains base installation of NetBSD 3.0
 
        - Easy to update all domains to latest netbsd-3 branch
 
        
     
    - pkg.img (/usr/pkg)
        
        - 128+ MB, read-only
 
        - contains pkgsrc-installed software
 
        - Update all packages by swapping with new image
 
        - Downgrade packages by swapping back with old image
 
        
     
    - crypt.img (/crypt)
        
        - 5+ GB, read-write
 
        - encrypted partition
 
        - contains server-specific data
 
        - /crypt/etc and /crypt/var are null-mounted to /etc and /var
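
The three-image layout above can be sketched as a per-domain plan of file paths and sizes, plus a helper that creates an empty sparse backing file. The paths follow the example configuration file in this talk; the helper names are hypothetical, and the sizes are the minimums from the list (pkg.img and crypt.img may be larger). The sketch does not install NetBSD into the images or set up cgd(4).

```python
# Sketch: plan and create empty backing files for one domain U.
import os
import tempfile

MB = 1024 * 1024


def image_plan(domain):
    """Map each image path for a domain to its minimum size in bytes."""
    return {
        "/xen/netbsd-3/root.img": 512 * MB,             # shared, read-only
        "/xen/%s/pkg.img" % domain: 128 * MB,           # read-only, swappable
        "/xen/%s/crypt.img" % domain: 5 * 1024 * MB,    # read-write, encrypted
    }


def create_sparse_image(path, size_bytes):
    """Create a sparse file of the given size and return its apparent size."""
    with open(path, "wb") as image:
        image.truncate(size_bytes)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Demonstrate in a scratch directory instead of touching /xen.
    with tempfile.TemporaryDirectory() as scratch:
        for path, size in sorted(image_plan("jabberwock").items()):
            target = os.path.join(scratch, os.path.basename(path))
            print(path, create_sparse_image(target, size))
```

Keeping root.img outside the per-domain directory is what lets every domain share one read-only base system.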
 
        
     
    
 
 
Example domain U configuration file
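# Xen 2.0 domain configuration files are Python, so strings must be quoted.
import os

kernel = "/xen/netbsd-3/netbsd-XENU"
memory = 128
name = "jabberwock"
cpu = -1
nics = 1
vif = [ 'mac=ee:14:04:d0:ec:af, bridge=bridge0' ]

# Bind crypt.img to a vnd(4) device and read the device name the script prints.
cmd = '/usr/pkg/etc/xen/block-file bind /xen/jabberwock/crypt.img'
out = os.popen(cmd)
# Replace the trailing 'd' partition letter with 'a'.
vnd = out.readline().rstrip().rstrip('d') + 'a'
out.close()

disk = [ 'cgd:' + vnd + ':/xen/cgd/jabberwock,wd2d,w',
         'file:/xen/jabberwock/pkg.img,wd1d,r',
         'file:/xen/netbsd-3/root.img,wd0d,r' ]
root = "/dev/wd0d"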
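
Since the configuration file is plain Python, the per-domain parts can be factored into helpers, which is handy when many guests share one layout. A minimal sketch follows; the function names are hypothetical, and the sample device string `/dev/vnd0d` is an assumption about what block-file prints, inferred from the `rstrip('d')` call in the example.

```python
# Sketch: derive the per-domain pieces of the example configuration file.


def vnd_partition(block_file_output):
    """Turn the line printed by block-file into the 'a' partition name."""
    return block_file_output.rstrip().rstrip('d') + 'a'


def disk_list(domain, vnd):
    """Build the disk list for a domain, using the same layout as the example."""
    return [
        'cgd:' + vnd + ':/xen/cgd/' + domain + ',wd2d,w',
        'file:/xen/' + domain + '/pkg.img,wd1d,r',
        'file:/xen/netbsd-3/root.img,wd0d,r',
    ]


if __name__ == "__main__":
    vnd = vnd_partition("/dev/vnd0d\n")  # assumed sample output
    for entry in disk_list("jabberwock", vnd):
        print(entry)
```

Only the domain name varies between guests here, so a new guest needs a new name, a new MAC address, and its own pkg.img and crypt.img.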
 
Example guest domains
- VPN router (configured across three NICs)
 
- Individual mail servers per DNS domain with a common pkg.img
 
- File server (a lot more disk)
 
- Web & Subversion server
 
- Database server (more RAM)
 
- pkgsrc development server
 
- package-installing server (to install packages into new pkg.img images)
 
 
Closing Thoughts
- It's quick and easy to test new software configurations on a scratch
    machine.
 
- This Xen setup has similarities to setups for embedded devices.
 
- I can't wait till NetBSD can run on Xen 3.0.
    
    - Breaks 2GB limit on memory
 
    - Allows SMP guest domains for better processor utilization
 
    
 
 
Links to more information