The core technology behind Firejail is Linux namespaces. Namespaces give a process and all its descendants their own private view of globally shared kernel resources, such as the network stack, process table, mount table, and IPC space. Firejail also uses seccomp and Linux capabilities to restrict the kernel interface and prevent privilege escalation.
Unlike many sandboxing tools, Firejail doesn’t need the user to be root to run it. This is accomplished using the SUID feature available on Linux systems. Once started, Firejail enables all the kernel security features requested, drops all privileges, passes control to the user program, and gets out of the way. It is lightweight, and it doesn’t open new communication sockets that attackers could exploit.
Firejail restricts the processes visible in the sandbox by making the sandboxed program PID 1. Only processes started by this program and its descendants will be visible in the sandbox.
Firejail can run any type of process, whether a server or a GUI application. It can also work as a login shell for telnet or SSH.
Firejail supports three types of filesystems: local, overlay and chroot. Filesystem trees can be further modified using security profiles.
- Local filesystem with the main directories mounted read-only. Only the /home, /tmp and /var directories are writable. This is the default Firejail filesystem. The sandbox is started by passing the program name to the firejail executable (firejail program_and_arguments). Without any argument, Firejail runs a regular Bash shell.
- Overlay filesystem mounted on top of the local filesystem using OverlayFS. OverlayFS has been available in the Linux kernel since version 3.18. The overlay holds all filesystem modifications. These modifications are not saved to the local filesystem (firejail --overlay program_and_arguments).
- Chroot system, using a root directory tree built by debootstrap or any other Linux tool (firejail --chroot=/path/to/root/tree program_and_arguments).
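The three filesystem modes above differ only in the option placed before the program name. The Python sketch below builds the matching command line without running it. The helper name `firejail_command` and its `mode` values are hypothetical and exist only for this illustration. The options themselves are the ones quoted in the list above.

```python
import shlex


def firejail_command(program, mode="local", chroot_dir=None):
    """Return the argv list that starts `program` under Firejail.

    `program` is a list such as ["firefox", "--private-window"].
    `mode` is a hypothetical label: "local", "overlay" or "chroot".
    """
    argv = ["firejail"]
    if mode == "overlay":
        # Filesystem changes go to the overlay, not the local filesystem.
        argv.append("--overlay")
    elif mode == "chroot":
        if not chroot_dir:
            raise ValueError("chroot mode needs a root directory tree")
        argv.append(f"--chroot={chroot_dir}")
    elif mode != "local":
        raise ValueError(f"unknown filesystem mode: {mode}")
    argv.extend(program)
    return argv


if __name__ == "__main__":
    examples = (
        ("local", None),
        ("overlay", None),
        ("chroot", "/path/to/root/tree"),
    )
    for mode, root in examples:
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(shlex.join(firejail_command(["program", "argument"], mode, root)))
```

The returned list can be handed to `subprocess.run` on a system where Firejail is installed. Building a list instead of a string avoids shell quoting problems.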
Private Mode and Security Profiles
Private mode can be used on top of any of the filesystem types described above. It isolates the current user's home directory from the processes running in the sandbox. It is implemented by mounting temporary filesystems on top of the /root and /home directories. Any files written in these directories are discarded when the sandbox is closed.
Optionally, an existing directory can be used instead of the temporary one. Files written to this directory are not discarded when the sandbox exits.
Security profiles are an easy way to configure an existing filesystem tree. They allow the user to specify files and directories that are not visible, marked read-only, or temporary.
For example, a security profile can deny access to the ~/.ssh directory. This directory stores the user's SSH certificates and encryption keys. Firejail provides default security profiles for all major browsers: Firefox, Chromium, Midori and Opera.
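As a rough illustration of the three kinds of rules a profile can hold, the Python sketch below renders a small profile as text. It assumes the profile directives `blacklist`, `read-only` and `tmpfs` and the `${HOME}` macro, none of which are spelled out in the text above. Check the profile documentation for the installed Firejail version before relying on them. The function `render_profile` is a hypothetical helper.

```python
def render_profile(hidden=(), read_only=(), temporary=()):
    """Render a profile with one directive per line.

    hidden     -> paths the sandbox cannot see (assumed directive: blacklist)
    read_only  -> paths that cannot be modified (assumed directive: read-only)
    temporary  -> directories replaced by an empty temporary filesystem
                  (assumed directive: tmpfs)
    """
    lines = []
    lines += [f"blacklist {path}" for path in hidden]
    lines += [f"read-only {path}" for path in read_only]
    lines += [f"tmpfs {path}" for path in temporary]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hide the SSH keys, as in the example from the text.
    print(render_profile(hidden=["${HOME}/.ssh"]), end="")
```

The resulting text would be saved to a file and passed to Firejail. The `--profile=filename` option is the assumed way to do that.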
Seccomp filtering allows the sandbox to specify an arbitrary list of system calls that should be forbidden. The support was introduced in Linux kernel 3.5.
Many filesystem features in Firejail depend on the removal of mount/unmount support in the sandbox. This is easily accomplished by blacklisting the mount and umount2 system calls in seccomp.
Another system call disabled by Firejail is ptrace. Mainstream tools such as strace and gdb are based on ptrace. Programmers and system administrators use them every day to debug processes. Attackers can also use them to change the execution of programs, and to extract passwords and other confidential data handled by programs.
This is the default list of system calls disabled:
- mount, umount2 – mounting and unmounting filesystems
- kexec_load, _sysctl – loading a different kernel, overwriting kernel parameters
- ptrace – debugging tools such as strace, ltrace and gdb
- init_module, finit_module, delete_module – loading/removing kernel modules
- iopl, ioperm – IO port control
- swapon, swapoff – swap memory control
- syslog – privileged syslog operation, system information provided by dmesg
- sysfs, mknod – getting filesystem information, creating device files
- adjtimex, clock_adjtime – system time
- lookup_dcookie, perf_event_open, fanotify_init, kcmp, process_vm_readv, process_vm_writev – miscellaneous calls
Firejail also supports user-specified blacklist and whitelist filters. If the system is running a Linux kernel older than version 3.5, a warning is printed on the console. The filter is inherited by all child processes running in the sandbox, and the user privilege level is locked.
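The difference between a blacklist and a whitelist filter can be modelled in a few lines of Python. This is only a decision model, not a real seccomp filter: the kernel evaluates a BPF program, and nothing here installs one. The set `DEFAULT_BLACKLIST` copies the default list given above, and the function `is_allowed` is a hypothetical name.

```python
# Default list of disabled system calls, copied from the list above.
DEFAULT_BLACKLIST = frozenset({
    "mount", "umount2",
    "kexec_load", "_sysctl",
    "ptrace",
    "init_module", "finit_module", "delete_module",
    "iopl", "ioperm",
    "swapon", "swapoff",
    "syslog",
    "sysfs", "mknod",
    "adjtimex", "clock_adjtime",
    "lookup_dcookie", "perf_event_open", "fanotify_init",
    "kcmp", "process_vm_readv", "process_vm_writev",
})


def is_allowed(syscall, blacklist=DEFAULT_BLACKLIST, whitelist=None):
    """Model the filter decision for one system call name.

    With a whitelist, only the listed calls are allowed.
    Otherwise every call is allowed except those on the blacklist.
    """
    if whitelist is not None:
        return syscall in whitelist
    return syscall not in blacklist


if __name__ == "__main__":
    for name in ("read", "ptrace", "mount"):
        verdict = "allowed" if is_allowed(name) else "blocked"
        print(f"{name}: {verdict}")
```

A blacklist leaves the program free to use any call not named, while a whitelist requires knowing in advance every call the program needs.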
Linux Capabilities Support
Capabilities (POSIX 1003.1e) are designed to split up the root privilege into a set of distinct privileges which can be independently enabled or disabled. These are used to restrict what a process running as root can do in the system. For instance, it is possible to deny filesystem mount operations, deny kernel module loading, prevent packet spoofing by denying access to raw sockets, and deny altering attributes in the filesystem.
The default capabilities filter disables:
- CAP_SYS_MODULE – kernel module loading/unloading
- CAP_SYS_RAWIO – IO port operations
- CAP_SYS_BOOT – system reboot and replacing current kernel with a new one
- CAP_SYS_NICE – raise process priority (lower the nice value)
- CAP_SYS_TTY_CONFIG – various privileged ioctl operations on virtual terminals
- CAP_SYSLOG – privileged syslog operation
- CAP_MKNOD – create device files
- CAP_SYS_ADMIN – system administration privileges
Firejail also supports user-specified blacklist and whitelist capabilities filters.
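Inside the kernel, a capability set is a bit mask with one bit per capability, so dropping a capability means clearing its bit. The Python sketch below applies the default list above to a mask. The bit numbers are those defined in the Linux header linux/capability.h. The starting mask 0x3fffffffff (capabilities 0 to 37 set) is an assumption chosen for the example, and the helper names are hypothetical. This models the arithmetic only and does not change the capabilities of any process.

```python
# Bit numbers from linux/capability.h for the capabilities listed above.
CAP_NUMBERS = {
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_RAWIO": 17,
    "CAP_SYS_ADMIN": 21,
    "CAP_SYS_BOOT": 22,
    "CAP_SYS_NICE": 23,
    "CAP_SYS_TTY_CONFIG": 26,
    "CAP_MKNOD": 27,
    "CAP_SYSLOG": 34,
}

# Assumed starting point: a mask with capabilities 0..37 all set.
EXAMPLE_FULL_MASK = 0x3FFFFFFFFF


def drop_caps(mask, names):
    """Return `mask` with the bit of every named capability cleared."""
    for name in names:
        mask &= ~(1 << CAP_NUMBERS[name])
    return mask


def has_cap(mask, name):
    """Return True if the named capability is still set in `mask`."""
    return bool((mask >> CAP_NUMBERS[name]) & 1)


if __name__ == "__main__":
    filtered = drop_caps(EXAMPLE_FULL_MASK, CAP_NUMBERS)
    print(f"before: {EXAMPLE_FULL_MASK:#018x}")
    print(f"after:  {filtered:#018x}")
```

On a running system, the masks of a process can be read from the Cap* lines of /proc/PID/status and compared against a mask computed this way.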
Firejail can attach a new TCP/IP networking stack to the sandbox. The new stack comes with its own routing table, firewall and set of interfaces. It is totally independent of the host network stack. It can be used to set up local Demilitarized Zones (DMZ), configure temporary networks for developing and testing various client/server programs, or isolating programs such as Mozilla Firefox for increased security.
In this example, the firejail sandbox vm1 runs its own TCP/IP stack and connects to the host on the br0 bridge device (firejail --net=br0 program_and_arguments).
For running programs without any network connectivity, use --net=none. This creates a new network stack that is not connected to the real network.
Firejail also supports macvlan networks. Use this setup to create a new network stack with a virtual interface connected to an existing network (firejail --net=eth0 program_and_arguments).
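The three network setups described above all use the same option with a different value. The Python sketch below maps each setup to its option and builds a command line without running it. The function names and the `kind` labels are hypothetical. The device names br0 and eth0 are the ones used in the text and will differ from system to system.

```python
def net_option(kind, device=None):
    """Return the Firejail network option for one of three setups.

    kind is a hypothetical label:
      "none"    -> new network stack with no connection to the real network
      "bridge"  -> new stack attached to an existing bridge device
      "macvlan" -> new stack with a virtual interface on an existing network
    """
    if kind == "none":
        return "--net=none"
    if kind in ("bridge", "macvlan"):
        if not device:
            raise ValueError(f"{kind} setup needs a device name")
        # Both setups use the same option; only the device named differs.
        return f"--net={device}"
    raise ValueError(f"unknown network setup: {kind}")


def sandbox_argv(program, kind, device=None):
    """Build the full argv list for `program` (a list of strings)."""
    return ["firejail", net_option(kind, device), *program]


if __name__ == "__main__":
    print(sandbox_argv(["program"], "bridge", "br0"))
    print(sandbox_argv(["program"], "none"))
    print(sandbox_argv(["program"], "macvlan", "eth0"))
```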
The sandboxes and the associated processes are listed using the firejail --list command. A separate utility, firemon, based on the Process Events Connector feature in the Linux kernel, allows the administrator to trace and log all fork, exec, id change, and exit events in the sandbox.
Both firejail and firemon include a --tree option that lists the process tree for each sandbox. A --top option, similar to the Linux top command, is also included.