The core technology behind Firejail is Linux namespaces, a virtualization technology available in the Linux kernel. It allows a process and all its descendants to have their own private view of globally shared kernel resources, such as the network stack, process table, mount table and IPC space. The following features are implemented:
Firejail restricts the processes visible in the sandbox by making the sandboxed program PID 1. Only processes started by this program and its descendants will be visible in the sandbox.
The feature is implemented using a PID namespace. Firejail can run any type of process, including servers and GUI applications. It can also be used as a login shell to sandbox users when they log into a server over telnet or SSH.
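One way to observe the effect of the PID namespace is to list the numeric entries under /proc from inside and outside the sandbox. The sketch below is illustrative only: the helper name visible_pids and its proc_root parameter are made up for this example and are not part of Firejail.

```python
import os


def visible_pids(proc_root="/proc"):
    """Return the sorted PIDs visible under proc_root.

    procfs exposes every visible process as a directory named after its PID,
    so the numeric directory names are the processes this program can see.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if entry.isdecimal() and os.path.isdir(os.path.join(proc_root, entry)):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__" and os.path.isdir("/proc"):
    pids = visible_pids()
    print(f"{len(pids)} processes visible")
```

Run on the host, the list normally contains every process on the system. Run inside a sandbox, it should contain only the sandboxed program and its descendants, as described above.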
Three types of filesystems are supported: local, overlay and chroot filesystems. Filesystem trees can be further modified using security profiles. Multiple Firefox sandboxes can be run in parallel on the same filesystem tree.
- Local filesystem with the main directories mounted read-only. Only the /home, /tmp and /var directories are writable. To create a sandbox for a program, pass the program and its arguments to the firejail executable (firejail program_and_arguments). Without any arguments, a regular Bash shell is started in the sandbox.
- Overlay filesystem mounted on top of the local filesystem using OverlayFS. OverlayFS is a patch to the Linux kernel currently applied by default to Ubuntu and OpenSUSE kernels. The overlay holds all filesystem modifications. These modifications are not saved to the local filesystem (firejail --overlay program_and_arguments).
- Classic chroot system. Build a full / directory tree using debootstrap or any other tool provided by your distribution, and start Firejail on it (firejail --chroot=/path/to/root/tree program_and_arguments). You can also use distribution-specific trees extracted from OpenVZ templates.
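The three invocations above differ only in the option placed before the program. As a minimal sketch, the hypothetical helper below (firejail_command is not part of Firejail) assembles the argument list for each filesystem mode without running anything; the resulting list could be passed to subprocess.run on a system where firejail is installed.

```python
import shlex


def firejail_command(program_args, mode="local", chroot_path=None):
    """Build the argv list that runs program_args under one filesystem mode."""
    if mode not in ("local", "overlay", "chroot"):
        raise ValueError(f"unknown filesystem mode: {mode}")
    argv = ["firejail"]
    if mode == "overlay":
        # Modifications go to the overlay and are not saved locally.
        argv.append("--overlay")
    elif mode == "chroot":
        if not chroot_path:
            raise ValueError("chroot mode requires chroot_path")
        argv.append(f"--chroot={chroot_path}")
    # With an empty program_args, firejail starts a shell in the sandbox.
    argv.extend(program_args)
    return argv


if __name__ == "__main__":
    examples = (("local", None), ("overlay", None), ("chroot", "/path/to/root/tree"))
    for mode, path in examples:
        print(shlex.join(firejail_command(["firefox"], mode, path)))
```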
Private Mode and Security Profiles
Private mode can be used on top of any of the filesystem types described above. It isolates the current user's directories from the processes running in the sandbox by mounting empty temporary filesystems on top of the /root, /home and /tmp directories. Any files written to these directories are discarded when the sandbox is closed.
Security profiles are an easy way to configure an existing filesystem tree. They allow the user to specify files and directories that cannot be accessed in the sandbox, that are marked read-only, or that have empty filesystem trees mounted on top of them and discarded when the sandbox is closed.
For example, a security profile can deny access to the ~/.ssh directory, which stores the user's SSH certificates and encryption keys. Except for ssh, no program should have access to this directory. Default security profiles are provided for Firefox, Chromium, Midori and Evince.
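To make the three kinds of profile rules concrete, the sketch below renders a small profile as text. It assumes the directive names blacklist, read-only and tmpfs and the ${HOME} macro from the Firejail profile syntax; check the documentation of your Firejail version before relying on them. The helper render_profile and every path other than ~/.ssh are hypothetical.

```python
def render_profile(blacklist=(), read_only=(), tmpfs=()):
    """Render profile text with one directive per line, grouped by directive."""
    lines = []
    for directive, paths in (
        ("blacklist", blacklist),   # not accessible in the sandbox
        ("read-only", read_only),   # accessible, but not writable
        ("tmpfs", tmpfs),           # empty temporary filesystem mounted on top
    ):
        lines.extend(f"{directive} {path}" for path in paths)
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    profile = render_profile(
        blacklist=["${HOME}/.ssh"],
        read_only=["${HOME}/.bashrc"],  # hypothetical example path
        tmpfs=["/var/tmp"],             # hypothetical example path
    )
    print(profile, end="")
```

The rendered text would be saved to a file and handed to firejail as a profile; how the profile is selected on the command line depends on the Firejail version in use.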
Seccomp (short for "secure computing") is a filtering mechanism that allows a process to install an arbitrary filter, expressed as a Berkeley Packet Filter program, specifying which system calls are forbidden. Berkeley Packet Filter support for seccomp was introduced in Linux kernel 3.5.
Many filesystem features in Firejail, such as mounting the filesystem read-only or disabling the hotplug and uevent_helper features under the /proc and /sys directories, depend on removing mount/unmount support in the sandbox. This is accomplished by blacklisting the mount and umount2 system calls in seccomp.
Another system call disabled in the current version of Firejail is ptrace. This system call allows an attacker to interrogate and modify running processes started by the user. Tools such as strace and gdb are based on the ptrace system call.
This is the list of system calls disabled by this feature:
- mount, umount2 – mounting and unmounting filesystems
- kexec_load – loading a different kernel
- ptrace – debugging tools such as strace, ltrace and gdb
- init_module, finit_module, delete_module – loading/removing kernel modules
- iopl, ioperm – IO port control
- swapon, swapoff – swap memory control
- syslog – privileged syslog operation, system information provided by dmesg
Kernel module loading and unloading, system reset and SUID executables are also disabled. The feature is enabled using the --seccomp option. If the system is running Linux kernel 3.4 or older, a warning is printed on the console. The filter is inherited by all child processes running in the sandbox, and the user privilege level is locked.
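The blacklist behaves like a lookup performed on every system call. The sketch below is a plain Python model of that decision, not a real seccomp filter: the kernel evaluates a BPF program against system call numbers, while this model works on names, and the labels DENY and ALLOW are placeholders chosen for the example.

```python
# System calls named in the list above as disabled by the seccomp filter.
BLACKLISTED_SYSCALLS = frozenset({
    "mount", "umount2",
    "kexec_load",
    "ptrace",
    "init_module", "finit_module", "delete_module",
    "iopl", "ioperm",
    "swapon", "swapoff",
    "syslog",
})


def filter_decision(syscall_name):
    """Model a blacklist filter: deny listed calls, allow everything else."""
    return "DENY" if syscall_name in BLACKLISTED_SYSCALLS else "ALLOW"


if __name__ == "__main__":
    for name in ("read", "mount", "write", "ptrace"):
        print(f"{name}: {filter_decision(name)}")
```

Because the filter is a blacklist, any system call not named in it passes through unchanged, which is why ordinary programs keep working in the sandbox.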
Linux Capabilities Support
When enabled using the --caps option, a security filter based on Linux capabilities (POSIX draft 1003.1e) is applied to all processes inside the sandbox. Currently the following capabilities are dropped:
- CAP_SYS_MODULE – kernel module loading/unloading
- CAP_SYS_RAWIO – IO port operations
- CAP_SYS_BOOT – system reboot and replacing current kernel with a new one
- CAP_SYS_NICE – raising process scheduling priority (lowering the nice value)
- CAP_SYS_TTY_CONFIG – various privileged ioctl operations on virtual terminals
- CAP_SYSLOG – privileged syslog operation
- CAP_SYS_ADMIN – system administration privileges
For more information on Linux capabilities run man 7 capabilities.
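Capabilities are stored by the kernel as a bitmask, so the list above can be checked against a mask read from /proc. The sketch below uses the capability numbers from <linux/capability.h>; the helper names are hypothetical, and reading the CapBnd (bounding set) line is an assumption about where the filter shows up, not a statement of how Firejail applies it.

```python
import os

# Capability numbers as defined in <linux/capability.h>.
DROPPED_CAPS = {
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_RAWIO": 17,
    "CAP_SYS_ADMIN": 21,
    "CAP_SYS_BOOT": 22,
    "CAP_SYS_NICE": 23,
    "CAP_SYS_TTY_CONFIG": 26,
    "CAP_SYSLOG": 34,
}


def drop_mask():
    """Return a bitmask with one bit set for each capability in DROPPED_CAPS."""
    mask = 0
    for number in DROPPED_CAPS.values():
        mask |= 1 << number
    return mask


def missing_caps(cap_mask):
    """Return the names from DROPPED_CAPS whose bit is cleared in cap_mask."""
    return sorted(
        name for name, number in DROPPED_CAPS.items()
        if not cap_mask & (1 << number)
    )


def read_bounding_set(status_path="/proc/self/status"):
    """Parse the hexadecimal CapBnd mask from a procfs status file."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapBnd:"):
                return int(line.split()[1], 16)
    return None


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    mask = read_bounding_set()
    if mask is not None:
        print("missing from bounding set:", missing_caps(mask))
```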
Firejail can attach a new TCP/IP networking stack to the sandbox. This can be used to set up local Demilitarized Zones (DMZ), or to configure temporary networks for developing and testing various client/server programs.
For example, a Firejail sandbox named vm1 can run its own TCP/IP stack and connect to the host through the br0 bridge device (firejail --net=br0 program_and_arguments).
The sandboxes and their associated processes are listed using the firejail --list command. A separate utility, firemon, based on the Process Events Connector feature in the Linux kernel, allows the administrator to trace and log all fork, exec, id change and exit events in the sandbox.
Both firejail and firemon include a --list option that lists the process tree for each sandbox. A --top option, similar to the Linux top command, is also included.