Firejail Seccomp Guide

Firejail is a generic Linux namespaces security sandbox, capable of running graphical programs as well as server programs. The sandbox is lightweight and its overhead is low: there are no open socket connections and no daemons running in the background. All security features are implemented directly in the Linux kernel and are available on any Linux computer.

Seccomp stands for secure computing mode; seccomp-bpf is its filter mode, a simple yet effective sandboxing tool introduced in Linux kernel 3.5. It allows the user to attach a system call filter to a process and all its descendants, thus reducing the attack surface of the kernel. Seccomp filters are expressed in Berkeley Packet Filter (BPF) format.

In this article I’ll show you how to build a whitelist seccomp-bpf filter and how to attach the filter to a user program using the Firejail sandbox. Throughout the article I will use the Transmission BitTorrent client as an example.

I start by extracting a list of the syscalls the program uses, build the filter, and run the program in Firejail. As new syscalls are discovered during testing, I update the filter. When everything looks fine, I integrate the filter into a security profile suitable for Firejail. The sections below follow these steps.

Syscalls

Linux has several tools for listing syscalls. The easiest one to use is probably strace (apt-get install strace or yum install strace). I start transmission-gtk under strace with the -qcf options (quiet, count, follow forks).

$ strace -qcf transmission-gtk

I play with the program for about five minutes: I go through some menus, start and stop a download, etc.

transmission-gtk BitTorrent client


As I close the program, strace prints the syscall list on the terminal:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 42.93    3.095527         247     12512           poll
 19.64    1.416000        2975       476           select
 13.65    0.984000        3046       323           nanosleep
 12.09    0.871552         389      2239       330 futex
 11.47    0.827229          77     10680           epoll_wait
  0.08    0.005779          66        88           fadvise64
  0.06    0.004253           4      1043       193 read
  0.06    0.004000           3      1529         3 lstat
  0.00    0.000344           0      2254      1761 stat
[...]
  0.00    0.000000           0         1           fallocate
  0.00    0.000000           0        24           eventfd2
  0.00    0.000000           0         1           inotify_init1
------ ----------- ----------- --------- --------- ----------------
100.00    7.210150                 95061     23256 total

Firejail

I paste the strace output into a text editor and clean it up, building a comma-separated list without any blanks, something like:

poll,select,nanosleep,futex,epoll_wait,fadvise64,read,lstat,stat,[...]
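The clean-up can also be scripted instead of done by hand. The following is only a minimal sketch: the function name strace_summary_to_list is made up for this example, and it assumes the summary has the column layout shown above, with the syscall name as the last field of every data row.

```shell
# Hypothetical helper: read an `strace -c` summary on stdin and print the
# syscall names as one comma-separated line.
strace_summary_to_list() {
    # Data rows start with a percentage such as 42.93. The syscall name is
    # the last field whether or not the errors column is filled in.
    # The header, the separator lines and the "total" row are skipped.
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ && $NF != "total" { print $NF }' | paste -sd, -
}

# Demo on a few rows copied from the summary above.
strace_summary_to_list <<'EOF'
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 42.93    3.095527         247     12512           poll
 12.09    0.871552         389      2239       330 futex
  0.00    0.000000           0         1           inotify_init1
------ ----------- ----------- --------- --------- ----------------
100.00    7.210150                 95061     23256 total
EOF
# prints: poll,futex,inotify_init1
```

To feed it real data, strace can write the summary to a file with its -o option (for example strace -qcf -o trace.txt transmission-gtk), and the file can then be redirected into the function.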

I start Firejail with the --seccomp.keep option, and add --shell=none to run the program directly, without the extra syscalls a shell would require:

$ firejail --shell=none --seccomp.keep=poll,select,[...] transmission-gtk

[Screenshot: seccomp-xterm]

At this point it looks ugly: a kilometer-long command line that doesn’t even work. For some reason strace missed some syscalls. Time to bring in the system logger.

Syslog

If I get errors in the terminal, I just add the missing syscall to the list and try again. But this is not always the case. Most of the time the Linux kernel will just kill the process and send an audit message to syslog. For this reason, I keep another terminal open to monitor syslog:

$ sudo tail -f /var/log/syslog

[Screenshot: seccomp-syslog]

The log entry tells me exactly which system call number got the program killed, syscall=201 in the example above. To associate the number with a name, I use firejail as follows:

$ firejail --debug-syscalls | grep 201
201	- time

Looks like I need to add the time syscall to the list. I keep adding syscalls to the list as they are reported, and try again each time. To get Transmission working I ended up adding pwrite64,time,exit,exit_group on top of what strace reported – not too bad!
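When several syscalls get reported, the lookup can be scripted. This is a rough sketch under two assumptions: the audit message carries a syscall=<number> field as described above, and the table has the "number, dash, name" layout printed by firejail --debug-syscalls in this article. Both function names and the sample log line are invented for illustration.

```shell
# Hypothetical helper: print the unique syscall numbers found in the
# audit lines read from stdin.
audit_syscall_numbers() {
    sed -n 's/.*syscall=\([0-9][0-9]*\).*/\1/p' | sort -nu
}

# Hypothetical helper: look up number $1 in a "number<TAB>- name" table
# read from stdin. Comparing the whole first field avoids the partial
# matches a plain grep could return.
syscall_name() {
    awk -v nr="$1" '$1 == nr { print $NF }'
}

# Illustrative audit line, not copied from a real log.
sample='audit: type=1326 pid=1234 comm="transmission-gt" sig=31 syscall=201 compat=0'
printf '%s\n' "$sample" | audit_syscall_numbers
# prints: 201

# Two-row stand-in for the real table.
printf '201\t- time\n202\t- futex\n' | syscall_name 201
# prints: time
```

On a real system the table would come from firejail itself, as in firejail --debug-syscalls | syscall_name 201.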

Security profiles

Firejail installs security profiles for several popular programs in the /etc/firejail directory. The profiles define a manicured filesystem with most directories mounted read-only, and with several files and directories in $HOME blanked, mainly those holding passwords and encryption keys.

The Transmission BitTorrent client is supported, and its profile also defines a default seccomp blacklist filter. I want to replace this filter with the whitelist filter I’ve just built. To do so, I go into the ~/.config/firejail directory and copy the default Transmission profile there:

$ cd ~/.config/firejail
$ cp /etc/firejail/transmission-gtk.profile .
$ vim transmission-gtk.profile

I add a “shell none” line, and I replace “seccomp” with “seccomp.keep poll,select,nanosleep,futex,epoll_wait,fadvise64,[…]”. The result looks like this:

$ cd ~/.config/firejail
$ cat transmission-gtk.profile
# transmission-gtk profile
include /etc/firejail/disable-mgmt.inc
include /etc/firejail/disable-secret.inc
blacklist ${HOME}/.adobe
blacklist ${HOME}/.macromedia
blacklist ${HOME}/.mozilla
blacklist ${HOME}/.icedove
blacklist ${HOME}/.thunderbird
caps.drop all
shell none
seccomp.keep poll,select,nanosleep,futex,epoll_wait,fadvise64,read,lstat,stat,epoll_ctl,sendto,readv,recvfrom,ioctl,write,inotify_add_watch,writev,socket,getdents,mprotect,mmap,open,close,fstat,lseek,munmap,brk,rt_sigaction,rt_sigprocmask,access,pipe,madvise,connect,sendmsg,recvmsg,bind,listen,getsockname,getpeername,socketpair,setsockopt,getsockopt,clone,execve,uname,fcntl,ftruncate,rename,mkdir,rmdir,unlink,readlink,umask,getrlimit,getrusage,times,getuid,getgid,geteuid,getegid,getresuid,getresgid,statfs,fstatfs,prctl,arch_prctl,epoll_create,set_tid_address,clock_getres,inotify_rm_watch,set_robust_list,fallocate,eventfd2,inotify_init1,pwrite64,time,exit,exit_group
netfilter
$
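The same edit can be done without opening an editor. The sketch below assumes the default profile contains a line consisting of the single word seccomp, which is the line replaced by hand above; the function name use_whitelist is made up for this example.

```shell
# Hypothetical helper: read a profile on stdin and swap a bare "seccomp"
# line for "shell none" plus a seccomp.keep whitelist given as $1.
# All other lines are passed through unchanged.
use_whitelist() {
    awk -v list="$1" '
        $0 == "seccomp" { print "shell none"; print "seccomp.keep " list; next }
        { print }'
}

# Demo on a cut-down profile, with a shortened list for readability.
printf '%s\n' 'caps.drop all' 'seccomp' 'netfilter' | use_whitelist 'poll,select,read'
# prints:
# caps.drop all
# shell none
# seccomp.keep poll,select,read
# netfilter
```

With the full list in a shell variable, the real profile could be generated by redirecting /etc/firejail/transmission-gtk.profile into the function and its output into ~/.config/firejail/transmission-gtk.profile.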

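The command “caps.drop all” in the security profile above disables all capabilities. Linux capabilities split the privileges of the root user into separate units that can be dropped one by one. The feature is similar in spirit to seccomp, but it is enforced deep inside the kernel rather than at the syscall entry point. Between seccomp and capabilities, more than half of the kernel code is put out of the program’s reach.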

Firejail chooses the profile automatically, based on the name of the executable. To run Transmission with all security features enabled, the command is:

$ firejail transmission-gtk
transmission-gtk started in Firejail using the profile file


Conclusion

Whitelist seccomp filters are easy to build, yet they need lots of testing. The filters are not portable. For example, this filter, built on Debian Wheezy, will not work on Ubuntu 14.04. The exact list of syscalls depends on the kernel running the system, the version of the program, and all the libraries the program links in.
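When a filter is moved to another system, it helps to see which syscalls the new system needs that the existing list lacks. A minimal sketch, assuming both lists are in the comma-separated form used throughout this article; the function name new_syscalls is hypothetical.

```shell
# Hypothetical helper: print, comma-separated, the syscalls that appear
# in list $2 but not in list $1. The order of $2 is preserved.
new_syscalls() {
    printf '%s\n' "$2" | tr ',' '\n' | awk -v have="$1" '
        BEGIN { n = split(have, a, ","); for (i = 1; i <= n; i++) seen[a[i]] = 1 }
        !($0 in seen) { print }' | paste -sd, -
}

# $1: list already in the filter, $2: list collected on the other system.
new_syscalls 'poll,select,read' 'read,time,poll,exit_group'
# prints: time,exit_group
```

The result can be appended to the existing seccomp.keep list before testing again.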

For more information about Firejail, visit the project page.

13 thoughts on “Firejail Seccomp Guide”

  1. Scooby

    great article

    Do you recommend building a whitelist for your favorite browser?

    keep up the good work

    Thanks

    1. netblue30 Post author

      Thanks! The filter could break next time your distro installs an update for firefox, or an update for flash player or any other plugin/extension you have installed. I use firefox with the default seccomp blacklist filter, never tried it with a whitelist filter.

  2. Pingback: Links 11/5/2015: Linux 4.1 RC3, OpenELEC 6.0 Beta | Techrights

  3. Chiraag Nataraj

    Could you consider having an option (maybe --seccomp-explicit) which *only* blacklists the syscalls specified? That would be very useful in the case of firefox, where the Hangouts plugin requires ptrace. seccomp.keep does not accomplish this, and nothing else allows me to whitelist a previously blacklisted syscall.

    1. netblue30 Post author

      Use “--seccomp.drop=syscall,syscall,syscall…” instead of --seccomp.keep or --seccomp. It blacklists only the calls you specify.

      You will also need to allow capabilities, so remove “caps.drop all” line from /etc/firejail/firefox.profile.

      1. Chiraag Nataraj

        Thanks! Although I’m curious…why would I need to remove caps.drop all? I tried just adding seccomp.drop and it seems to work fine, blacklisting all the syscalls I specified.

      2. Chiraag Nataraj

        I’m not sure I follow. Right now, I have the following (relevant) config options:

        caps.drop all
        seccomp.drop

        Both seem to work just fine – that is, all capabilities are dropped and the relevant syscalls are blacklisted.

    2. johnhoff

      Um, you should know that ptrace completely defeats the purpose of seccomp. It’s even displayed in the manpage. Basically, the SETREGS ptrace flag can be used to break out of a seccomp sandbox by messing with registers holding the syscall number and arguments right before it’s called (seccomp filters are checked for permission before ptrace is allowed to mess with things… or in other words, ptrace is allowed to mess with registers right before entering kernelmode, even after a process has been given “permission” by seccomp).

      Seccomp is not typically meant for the end-user. It requires you actually know what you’re doing and what the effects on the program will be. It’s not just a “whitelist all syscalls the program needs and you are automatically secure”.

      tl;dr if you whitelist ptrace, you are not using seccomp

  4. Pingback: NF.sec – Linux Security Blog - Firejail – proste budowanie klatek

  5. Pingback: Firejail: A New Lightweight Browser And Application Sandbox « IgnorantGuru's Blog
