Virtualization refers to the creation of virtual machines that act like real computers with an operating system. Software executed on these virtual machines is separated from the underlying hardware resources.
This article discusses LXC, a lightweight virtualization technology built into the Linux kernel. The user-space LXC tool is distributed with a number of templates that allow the creation of different Linux distro filesystems, usually one template for each major Linux distribution. The problem with these templates is that they either don't work, or they stop working with every new release of the LXC tool or of the particular Linux distribution. This is the case with all Linux distributions, and Debian is no exception. Currently, the Debian template is broken under “wheezy”. The relevant Debian bug is here, and history shows that as soon as such a bug gets fixed, the LXC user-space tool changes again and breaks it. It could be worse: in Fedora, LXC was broken in Fedora 15 and it was never fixed.
The simple way to handle the problem is to forget all about the template mechanism and roll your own containers. In Debian you can build the container filesystem using the standard debootstrap tool, or mount the host filesystem read-only, and then use lxc-execute to start a simple bash session inside the container. In this session you can then start all the programs you need to run in the container. The result is an application container, very similar to the containers created using the official ssh template distributed with LXC.
The virtual machine I will describe in this article uses a root filesystem built using debootstrap (apt-get install debootstrap). The procedure is simple and it should work on any Debian machine. It will probably also work on any other distro based on Debian, such as Ubuntu, Mint etc.
Install necessary software
By default, Debian has CGROUPS disabled. LXC depends heavily on CGROUPS, so we need to install the user-space tools and enable them.
# apt-get install cgroup-bin libcgroup1 lxc
Enable CGROUPS by adding the following line to the /etc/fstab file:
cgroup /sys/fs/cgroup cgroup defaults 0 0
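If you want to test the cgroup setup before rebooting, the same hierarchy can usually be mounted by hand. This is a hedged sketch matching the fstab entry above; on some systems the /sys/fs/cgroup mount point has to exist first:

```shell
# Mount the cgroup hierarchy immediately, mirroring the fstab line
mount -t cgroup cgroup /sys/fs/cgroup
```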
Reboot the computer and check LXC and CGROUPS installation:
# lxc-checkconfig
Kernel config /proc/config.gz not found, looking in other places...
Found kernel config file /boot/config-3.2.0-4-amd64
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled
--- Control groups ---
Cgroup: enabled
Cgroup clone_children flag: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled
--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: enabled

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig
Create the VM filesystem
As root, build the virtual machine in a brand new directory, vm1. The only extra package that needs to be installed by debootstrap is LXC itself:
# mkdir ~/vm1 && cd ~/vm1
# debootstrap --include=lxc --arch=amd64 wheezy rootfs
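To verify that debootstrap produced a usable tree, a quick sanity check (run as root, paths matching the layout above) is to chroot into the new filesystem and ask it what it is:

```shell
# Confirm the new root filesystem answers as a Debian system
chroot ~/vm1/rootfs /bin/bash -c "cat /etc/debian_version"
```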
LXC configuration file
In vm1 directory, create an LXC configuration file with the following content:
# cat ~/vm1/lxc.conf
lxc.utsname = vm1
lxc.rootfs = vm1/rootfs
lxc.mount.entry=proc /proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry=tmpfs /dev/shm tmpfs defaults 0 0
lxc.pts=1024
Starting the virtual machine
Start the virtual machine as follows:
# cd ~
# lxc-execute -n vm1 -f vm1/lxc.conf -- /bin/bash
root@vm1:/# reset
root@vm1:/# export PS1="\e[01;31m\h:\W \u\$ \e[00m"
Once the VM is started, reset the terminal. This is the control terminal. It is also a good idea to change the prompt to make it visibly different from your regular host terminals.
Add (apt-get install) and start (/etc/init.d/… start) any software you need in this VM. Any modifications will remain in the ~/vm1/rootfs directory and will not overwrite your real filesystem.
To shut down the VM, close all the programs started manually in VM and close the control terminal.
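While the VM is running, you can also inspect it from the host. A hedged sketch using the standard LXC command-line tools of that era (output shapes vary between LXC versions):

```shell
# On the host: list containers known to LXC and show the state of vm1
lxc-ls
lxc-info -n vm1
```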
The VM created so far uses the same TCP/IP networking stack as the host computer. For added security, the VM can be placed on its own network segment with its own TCP/IP stack. On the host we can go one step further and control the network access using a netfilter firewall.
This is a small script that allows you to create the network setup:
#!/bin/bash
#
# Network configuration script for a desktop vm
#

# bridge setup
brctl addbr br0
ifconfig br0 10.10.20.1/24 up

# enable ipv4 forwarding
echo "1" > /proc/sys/net/ipv4/ip_forward

# netfilter cleanup
iptables --flush
iptables -t nat -F
iptables -X
iptables -Z
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT

# netfilter NAT
iptables -t nat -A POSTROUTING -o eth0 -s 10.10.20.0/24 -j MASQUERADE
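When you are done with the VM, the bridge can be removed again. A minimal teardown sketch mirroring the setup script above (it assumes no container is still attached to br0):

```shell
#!/bin/bash
# Tear down the bridge created by the setup script
ifconfig br0 down
brctl delbr br0

# optionally disable IPv4 forwarding again
echo "0" > /proc/sys/net/ipv4/ip_forward
```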
The script creates a Linux bridge and configures it. The VM will connect to this bridge using a “veth pair”, a sort of transparent tunnel between the br0 bridge device on the host and the eth0 interface inside the container. IPv4 forwarding is also enabled, so IP packets can be routed between the VM and the external network.
For access control the script uses network address translation on the host. If you also have servers running in the VM, you can make them visible on your external network using port forwarding. This is an example that forwards port 80 (HTTP) from eth0 to your virtual machine; you can add it to the script above:
# netfilter port forwarding iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to 10.10.20.10:80
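To check that the rule was installed, you can list the NAT table on the host; the DNAT entry should show up in the PREROUTING chain along with its packet counters:

```shell
# Show NAT PREROUTING rules with counters, without DNS lookups
iptables -t nat -L PREROUTING -n -v
```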
With the network configured, modify lxc.conf to place the container on the br0 bridge. Specify an IP address and a MAC address, and set the network type to “veth”. The filesystem section doesn’t change.
# cat ~/vm1/lxc.conf
lxc.utsname = vm1

# networking
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 00:11:22:33:44:55
lxc.network.ipv4 = 10.10.20.10/24

# filesystem
lxc.rootfs = vm1/rootfs
lxc.mount.entry=proc /proc proc nodev,noexec,nosuid 0 0
lxc.mount.entry=tmpfs /dev/shm tmpfs defaults 0 0
lxc.pts=1024
Start your VM as usual and add a default gateway:
# cd ~
# lxc-execute -n vm1 -f vm1/lxc.conf -- /bin/bash
root@vm1:/# reset
root@vm1:/# export PS1="\e[01;31m\h:\W \u\$ \e[00m"
vm1:/ root$ route add default gw 10.10.20.1
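Out of the box the container has no DNS resolver configured. A hedged sketch of the last networking steps inside the container; the 8.8.8.8 nameserver is just an example, substitute whatever resolver you normally use:

```shell
# Inside the container: configure DNS and test connectivity to the host
echo "nameserver 8.8.8.8" > /etc/resolv.conf
ping -c 1 10.10.20.1    # the br0 bridge/gateway address on the host
```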
LXC virtual machines are very light on runtime resources. You can run in these VMs only the programs you need, without running all the programs your Linux machine runs by default.
The filesystem built by debootstrap is about 230MB. From here, it will grow as you install your applications in the VM. There are other ways to build lighter filesystems: you can mount your existing host filesystem read-only in the VM, and add an OverlayFS read-write layer on top of it. The resulting filesystems are much smaller; however, you would need OverlayFS support in the Linux kernel you are running.
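The overlay approach could look like the sketch below. This is an assumption-heavy example: the mount syntax shown is the one used by the early out-of-tree OverlayFS patches (filesystem type "overlayfs", no work directory); kernels with mainline overlay support use -t overlay and additionally require a workdir= option. The /vm2 paths are hypothetical:

```shell
# Read-only lower layer (the host root) plus a writable upper layer,
# merged at /vm2/rootfs for use as a container root filesystem
mkdir -p /vm2/upper /vm2/rootfs
mount -t overlayfs overlayfs -o lowerdir=/,upperdir=/vm2/upper /vm2/rootfs
```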
As mentioned in my other virtualization articles, LXC is basically a chroot on steroids and it doesn’t promise anything regarding security. On production boxes this is usually addressed by setting each virtual machine on its own isolated network, running mandatory access control (SELinux, AppArmor) on the applications inside the containers, adding grsecurity and PaX support to the Linux kernel, and probably a number of other things.
To be continued…
I didn’t check, but this string looks incorrect:
iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to 10.10.20.10:80
It matches packets coming in from outside (e.g. eth0) as well as packets coming from the container (veth). Packets going out from the container to port 80 probably get routed back to the container.
It is a standard port redirect in iptables; for example, look at this article (first link returned by Google):
You can modify and enhance it as you need it.
They forward port 1111 and don’t consider that outgoing traffic to this port is redirected back. In your case you’ll notice it. I tested your rule in a VM: as I expected, web pages stop opening after this rule is added. Why? Because your rule also matches packets coming from the VM via veth to the bridge, not only packets from outside.
The correct rule is something like:
iptables -t nat -A PREROUTING -i -p tcp --dport 80 -j DNAT --to 10.10.20.10:80
Comments engine erased some code…
iptables -t nat -A PREROUTING -i external_iface -p tcp --dport 80 -j DNAT --to 10.10.20.10:80
iptables -t nat -A PREROUTING -i external_ip -p tcp --dport 80 -j DNAT --to 10.10.20.10:80
The rule is just a start in building a firewall. It is usually paired with a rule that blocks all the traffic initiated from vm1, something like this:
iptables -A FORWARD -i br0 -m state --state NEW,INVALID -j DROP
This effectively isolates vm1 in case somebody from outside manages to take control of it. If vm1 is placed in a DMZ, for example, the attacker remains stuck in vm1 without any possibility to go out.
Some people might choose, however, to allow some traffic from vm1 towards a few very specific external servers. It all depends on what applications they are running in vm1, and the rules get very specific to the application. Anyway, thanks for pointing out the return-traffic issue.
Also, cgroup-bin conflicts with LXC:
root@test:~# apt-get install lxc
The following packages will be REMOVED: