Pragmatic Container Security

When I started to play with containers, I thought: gosh, this is super cool. Revolutionary, better in all possible ways. Then I heard from a couple of fellow Operations Engineers that containers are not secure at all! So, are they great or poor? I decided I needed to form my own opinion.

Disclaimer: This post is written from the perspective of a huge Docker fan.

Basic Concepts

cgroups, namespaces & chroot

A container is a regular Linux process running on a Linux kernel. The substantial difference from a regular setup is that it has a restricted view of the system. Cgroups restrict the available resources (memory, CPU). Chroot changes the root of the file system, so the container won't share files with the host. Then, several namespaces restrict the view of the system: processes - so the container process thinks it runs alone on the machine; network - so it has its own interfaces and network configuration; users - so it can have its users configured separately from the host users.
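To make the namespace part more tangible, here is a minimal sketch that reads the namespace identifiers the kernel exposes under /proc/<pid>/ns. The function names are mine, and it assumes a Linux host. Two processes that show the same identifier for, say, "pid" share that namespace; a containerized process will usually show different identifiers than a host shell.

```python
import os


def list_namespaces(ns_dir="/proc/self/ns"):
    """Map each namespace name (pid, net, user, ...) to its kernel identifier."""
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        # Each entry is a symlink whose target looks like "pid:[4026531836]".
        namespaces[name] = os.readlink(os.path.join(ns_dir, name))
    return namespaces


def shared_namespaces(ns_a, ns_b):
    """Return the namespace names that have the same identifier in both maps."""
    return sorted(name for name in ns_a if ns_b.get(name) == ns_a[name])
```

Calling list_namespaces("/proc/1/ns") (as root) and list_namespaces() and passing both results to shared_namespaces shows which parts of the system view the current process shares with PID 1.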

Escaping a Container

A problem occurs when a process escapes a container. Usually, we run well-known processes like Nginx, Apache, MongoDB, etc. They are generally secure, but from time to time new vulnerabilities are found. If somebody takes over the process from outside using such a vulnerability, they are still confined inside the container, with restricted access to the host system. As of now, most official images use the root user - the same root as the host root, but with restricted access. Imagine you are standing next to a rock-solid cage in a zoo. If there is a fox in it (a regular user with restricted access), you feel pretty comfortable. However, if there is a large tiger (root user), you might feel anxious. The probability of escaping the cage is comparable in both cases; it usually happens only when the cage doors are accidentally left open. However, when the tiger escapes, the consequences can be disastrous.

The truth is, the same problem applies to VM hypervisors. There is a long history of exploits (VMware, Xen, etc.) that let code running on one VM compromise the entire hardware cluster. Every VM is exposed to them, whether it runs containers or not. Because of this, they won't influence our comparison.

Case Comparison

Knowing the threat of our "tiger" escaping the container "cage", let's have a look at different deployment scenarios and their pros & cons related to infrastructure security - both with and without containers.

Multiple Processes on a VM

A "classic" case, where our system is a collection of processes, scheduled scripts, etc. running on a single machine. Usually, all processes are run by a single, dedicated, non-root user. It has its rights restricted to the resources needed to fulfill its purpose.

Pros

None of the application components run as root. If somebody takes over one process, they still cannot take over the entire VM.
VMs are well separated at the hypervisor level. When one VM is compromised, others on the same physical machine are secure unless misconfigured.

Cons

If one process gets compromised, the attacker has a clear view of all other processes and a wider attack surface: all the packages, processes, and network configuration available to that user.
It is hard to reproduce an exact copy of a system, even using provisioning tools like Chef or Ansible. Package versions can be slightly different. We have no guarantee that what we verified against security vulnerabilities during development is secure in the production deployment.
Patching security vulnerabilities might lead to unexpected issues with the system. Upgrading one part of the system can harm other parts.
Secrets stored in files are usually readable by all processes, so they are all compromised at once.

Multiple Containers on a VM

When deploying the same system as containers on a VM, we have a collection of bounded environments, each of which runs a single process.

Pros

Each part of the system is separated from the others. Integrations between parts are explicit and restricted to the required minimum.
It is easy to upgrade one part while maintaining system stability.
No problems with clashing packages.
It is possible to scan images for vulnerabilities and upgrade when needed.
It is possible to restrict resources and monitor containers against a defined profile (e.g. Seccomp).
Easy to upgrade the host system distribution:
- To run the containerized application, we usually need only the container runtime, like Docker or Podman.
- The OS is pretty much immutable and has the bare minimum of packages installed, which a) reduces the attack surface, b) simplifies upgrades and minimizes the risk of damaging business-critical services.
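As an illustration of restricting resources and capabilities, the sketch below assembles a docker run command line with a few commonly used hardening flags. The helper name, the default limits, and the image name in the usage note are hypothetical; the flags themselves are regular Docker CLI options. Treat it as a starting point, not a complete profile.

```python
def hardened_run_args(image, memory="256m", cpus="0.5",
                      user="1000:1000", seccomp_profile=None):
    """Build a `docker run` argument list with restrictive defaults."""
    args = [
        "docker", "run", "--rm",
        "--memory", memory,           # cgroup memory limit
        "--cpus", cpus,               # cgroup CPU limit
        "--pids-limit", "100",        # cap the number of processes
        "--cap-drop", "ALL",          # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--read-only",                # read-only root file system
        "--user", user,               # do not run as root
    ]
    if seccomp_profile:
        # Path to a custom seccomp profile (JSON); Docker's default applies otherwise.
        args += ["--security-opt", f"seccomp={seccomp_profile}"]
    return args + [image]
```

For example, hardened_run_args("registry.example.com/app:1.0") returns a list that can be passed to subprocess.run. Real applications often need a writable tmpfs or a specific capability added back, so expect to tune the list per service.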

Cons

Container misconfiguration might enable the possibility of escaping it.
By default, container images run as root unless you take care to avoid it. If an attacker takes over a single container and manages to escape it, they have root access to the server (end of the game).
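A quick way to spot "tigers" is to look at the user configured in the image. The sketch below interprets that value; the function name is mine. The value itself can be read with docker inspect --format '{{.Config.User}}' IMAGE, and an empty value means the image falls back to root.

```python
def runs_as_root(user_field):
    """Tell whether an image's configured user means root.

    user_field is the image's User setting, e.g. "", "0", "root",
    "root:root" or "1000:1000". An empty or missing value defaults to root.
    """
    user = (user_field or "").split(":", 1)[0].strip()
    return user in ("", "0", "root")
```

Note that this only reflects the image default; the user can still be overridden when the container is started.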

Containers Run in Orchestrated Cluster (Kubernetes/Swarm)

One of the important traits of containerization is the better utilization of resources and scalability. This can be achieved using a container orchestrator. How does this influence security?

Pros

The orchestrator provides standard ways to configure containers, restrict network interactions between them, provide secrets securely, and monitor them. You don't have to come up with a custom solution to secure your containers - a lot is already there:
- setup is transparent
- it provides tooling to narrow down security issues
- it is easy to restrict access to the required minimum
It is possible to configure admission control for the entire cluster.
Plus all the container pros still apply.

Cons

If a container running on a manager node gets compromised, and an attacker manages to escape it, they will take over the entire cluster (end of the game).
Multiple containers belonging to different systems can interfere with each other through shared resources (e.g. a DoS-type attack).
The larger the number of containers on a single node, the higher the risk of compromising the node.
Misconfiguration of a single container can lead to exposure of the entire node (see privileged mode).
Complexity of the orchestrator itself. It has quite a few moving parts, which makes it easy to misconfigure and accidentally introduce a security risk.
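Some of these misconfigurations are easy to detect mechanically. Below is a minimal sketch that inspects a Kubernetes pod spec, already loaded into a Python dict (for instance from YAML), and reports a few risky settings. The function name and the selection of checks are my own; the field names (hostPID, hostPath, securityContext.privileged, runAsNonRoot) come from the Kubernetes pod specification.

```python
def audit_pod_spec(pod_spec):
    """Return a list of human-readable findings for a pod spec dict."""
    findings = []
    # Sharing host namespaces weakens the isolation of the whole pod.
    for flag in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(flag):
            findings.append(f"pod uses {flag}")
    for volume in pod_spec.get("volumes") or []:
        if "hostPath" in volume:
            path = volume["hostPath"].get("path")
            findings.append(f"volume {volume.get('name')} mounts host path {path}")
    pod_ctx = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        if ctx.get("privileged"):
            findings.append(f"{name}: runs privileged")
        # The container-level setting overrides the pod-level one.
        if not ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False)):
            findings.append(f"{name}: may run as root (runAsNonRoot not set)")
    return findings
```

An empty result does not prove the pod is safe; it only means none of these particular red flags were found.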

Single Process Run on a VM

Big players like Netflix build their infrastructure with a VM as the smallest unit. Although it might not be the best solution from the perspective of resource utilization, how does it behave from the security standpoint?

Pros

The separation between two VMs is much stronger than the separation between two containers.
- Containers share the same Linux kernel, while VMs share the same hypervisor, which has a much smaller codebase. For comparison, the codebase of Xen, a Type-1 hypervisor, is around forty times smaller than the Linux kernel. The number of possible vulnerabilities is much higher in the kernel since it is immensely complex.

Cons

Scanning of VM images/snapshots is not as common and well adopted as container image scanning (Docker supports it out of the box).
A VM, if not treated explicitly as immutable, can drift and mutate over time.
- The longer the time to provision, the greater the temptation to mutate it.
A VM image is less reproducible than a container image.
- Each time we provision a VM using e.g. Chef or Ansible, we can end up with a different image due to changes in external recipes, versions of OS packages, etc. Taking VM image snapshots usually isn't automated because of their size.
- Container images are more immune to such situations, thanks to lightweight caching of layers during the build process. Changes to an application do not require re-installation of OS packages.

Other Options

I'll skip other VM options, like running multiple applications on one host under the same or different users. In my opinion, they are not that popular anymore, and less secure.

Container Security - things you might not know

The comparison gives a basic overview. Before we jump to conclusions/recommendations, let's have a look at some container security insights that might not be obvious but, in my opinion, are at least interesting.

Docker Runs as root

A user who can run the docker command (is in the docker group) effectively has root rights on the machine. This is because the docker command communicates with the Docker Daemon, which runs as root. What is more, there is no simple way to audit run actions: the user only issues the docker command, while the Docker Daemon is the one that runs the processes. We need to be especially cautious about securing hosts which build Docker images. Consider using Podman as a daemon-less alternative. Podman can build images as a regular user if the image does not require root capabilities.
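Since membership in the docker group is equivalent to root, it is worth knowing who is in it. A minimal sketch using Python's standard grp module (Unix only); the function name is mine:

```python
import grp


def members_of(group_name="docker"):
    """Return the users listed as members of a group, or [] if it does not exist."""
    try:
        # gr_mem holds supplementary members only; users whose primary
        # group is this one are not listed here.
        return sorted(grp.getgrnam(group_name).gr_mem)
    except KeyError:
        return []
```

Every name returned by members_of() should be treated like an account with sudo rights on that host.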

There are Rootless Containers

Thanks to user namespaces, which are available in the Linux kernel, there is a growing family of rootless containers. You can run a container as root inside it, while that root maps to a regular user on the host. If an attacker manages to escape the container, there is an additional boundary to break before taking over the machine. It does come with drawbacks and might require additional configuration effort. Since the container user is not root on the host, it has lower privileges (e.g. it cannot bind ports under 1024).
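The mapping between container users and host users is visible in /proc/<pid>/uid_map, where every line has the form "inside-start outside-start count". The sketch below, with function names of my own choosing, parses such a file and translates a container UID into the host UID it really runs as.

```python
def parse_uid_map(text):
    """Parse uid_map contents into (inside_start, outside_start, count) tuples."""
    mappings = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, count = (int(field) for field in line.split())
            mappings.append((inside, outside, count))
    return mappings


def host_uid(container_uid, mappings):
    """Translate a UID seen inside the container to the UID on the host."""
    for inside, outside, count in mappings:
        if inside <= container_uid < inside + count:
            return outside + (container_uid - inside)
    return None  # this UID is not mapped at all
```

With a typical rootless mapping such as "0 1000 1" followed by "1 100000 65536", root inside the container (UID 0) is just UID 1000 on the host.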

Smaller Container Images are More Secure

There are stripped-down container images with a bare minimum of packages, which reduce the attack surface of a container (e.g. Alpine). What is more, there is an emerging distroless family of container images that reduce the attack surface even further by stripping all of the distribution noise and keeping only the core application dependencies (no package managers, shells, etc.).

There are Super-Small & Fast VM Images

On the other hand, there are super-small and fast microVMs, like those run by AWS Firecracker, which can significantly reduce the drawbacks of using a single VM per process.

Periodic Image Scans Can be Cheap

Periodic image scans can be limited to images that are actively instantiated as containers. This is usually a significantly smaller number than the number of all images in the registry. These scans check your running images against recently found vulnerabilities.
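The selection itself is a simple set intersection. In the sketch below the function names are mine, and I assume the list of running images comes from the output of docker ps --format '{{.Image}}' collected from the hosts, with image references written the same way as in the registry listing.

```python
def parse_running_images(ps_output):
    """Turn `docker ps --format '{{.Image}}'` output into a set of image references."""
    return {line.strip() for line in ps_output.splitlines() if line.strip()}


def images_to_scan(registry_images, running_images):
    """Only images that exist in the registry AND are currently running."""
    return sorted(set(registry_images) & set(running_images))
```

With four images in the registry and two of them running, only those two are handed to the scanner.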

Image Scans Can Happen During Development

We don't have to wait for an image to get scanned at some vague point in time. We can make the verification process a stage of our CI pipeline, which will allow only verified images to be published. What is more, even developers can run such scans when building locally. This way, the responsibility for securing the application shifts to the early stages of delivery.
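A CI stage like this boils down to a gate on the scanner's report. Report formats differ between scanners, so the sketch below assumes a simplified, hypothetical format: a list of findings, each with an "id" and a "severity". The function names and the severity scale are assumptions as well.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def _rank(severity):
    """Position on the severity scale; unknown values rank highest to fail safe."""
    level = severity.upper()
    return SEVERITY_ORDER.index(level) if level in SEVERITY_ORDER else len(SEVERITY_ORDER)


def blocking_findings(findings, fail_on="HIGH"):
    """Findings at or above the threshold that should stop publication."""
    threshold = SEVERITY_ORDER.index(fail_on)
    return [f for f in findings if _rank(f["severity"]) >= threshold]


def ci_exit_code(findings, fail_on="HIGH"):
    """0 lets the pipeline continue, 1 fails the stage."""
    return 1 if blocking_findings(findings, fail_on) else 0
```

The pipeline step would convert the scanner output into this shape and call sys.exit(ci_exit_code(findings)), so an image with a blocking finding never reaches the registry.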

Dockerfile Linter Can Help

Static analysis of Dockerfiles can improve them a lot - not only from the security perspective but also in terms of performance, size, etc. Consider using a linter on a daily basis (Haskell Dockerfile Linter, WhaleLint for IntelliJ).
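To give a feeling for what such a linter looks for, here is a toy sketch with just two rules: base images should be pinned to a version, and the final build stage should switch to a non-root user. It is not how the linters mentioned above are implemented, it ignores line continuations and build arguments, and the function name is mine.

```python
def lint_dockerfile(text):
    """Return warnings for unpinned base images and a root final stage."""
    warnings = []
    stage_aliases = set()
    last_user = None
    for number, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # Skip flags such as --platform=... to find the image reference.
            args = [p for p in parts[1:] if not p.startswith("--")]
            image = args[0] if args else ""
            tag_part = image.rsplit("/", 1)[-1]  # ignore a registry host:port prefix
            pinned = "@" in image or (":" in tag_part and not tag_part.endswith(":latest"))
            if image and image != "scratch" and image not in stage_aliases and not pinned:
                warnings.append(f"line {number}: base image '{image}' is not pinned to a version")
            if len(args) >= 3 and args[1].upper() == "AS":
                stage_aliases.add(args[2])
            last_user = None  # every stage starts with its base image's user
        elif instruction == "USER" and len(parts) > 1:
            last_user = parts[1]
    if last_user is None or last_user.split(":")[0] in ("root", "0"):
        warnings.append("final stage does not switch to a non-root USER")
    return warnings
```

A real linter covers many more rules, but even these two catch the most common problems discussed in this post.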

Admission Control - run only verified images

Orchestrators support admission control - allowing only verified images to be run on a cluster. This way, your system can enforce security rules. Not only are all images scanned, but also only images that went through the verification process with a positive score can be run on a cluster.
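The core decision of such an admission check can be sketched in a few lines. This is only the decision logic under my own assumptions: images must be referenced by digest, and the set of verified digests is maintained by the verification process. In a real cluster, this logic would sit behind the orchestrator's admission mechanism; the function name and the digests used in the usage note are hypothetical.

```python
def admit(image_refs, verified_digests):
    """Allow a workload only if every image is pinned to a verified digest.

    Returns (allowed, rejected), where rejected lists (image, reason) pairs.
    """
    rejected = []
    for ref in image_refs:
        _, separator, digest = ref.partition("@")
        if not separator:
            # A tag can be re-pointed to another image, a digest cannot.
            rejected.append((ref, "not pinned by digest"))
        elif digest not in verified_digests:
            rejected.append((ref, "digest not verified"))
    return (not rejected, rejected)
```

For example, admit(["app:latest"], {"sha256:aaa"}) rejects the workload because a tag alone says nothing about which image content was verified.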

Recommendations?

If I wanted to be super-secure, I'd vote for having one process per VM, using a thin microVM technology like Firecracker. Still, I'd have to work hard not to compromise it by installing vulnerable packages or introducing misconfiguration. When comparing the other options, we need to stay pragmatic.

Although containers are a "younger" technology, they are already well-exercised in large commercial deployments. Additionally, there are many standards and tools in place helping to secure your containerized deployments: image scanners, declarative system description (Docker Swarm Compose or Kubernetes Configuration Files), immutability, etc. Those are very important traits that can secure the deployment. This proactivity of container technology can mitigate some of the possible vulnerabilities and make it more resilient to attacks compared to VMs.

What is more, when using containers there is a so-called "shift left" in the responsibility for deployment security. Image developers have to make sure that not only the code they write but also the images they build are secure (for example with docker scan IMAGE_NAME, where your Docker version provides it)! Vulnerable software gets eliminated quickly from the deployment. Since it is possible to reproduce the entire environment locally, it gets tested earlier.

I don't believe there is a single recommendation. No matter if it is a single VM or a multi-node Kubernetes cluster, it is good to know the very basic attack vectors (of Linux and the Container Runtime) to secure your system.

Common Anti-Patterns

Finally, have a look at common anti-patterns when using containers. These are things you have to avoid like the plague.

Running Privileged Containers on Prod Cluster

By default, Docker containers have restricted capabilities. This is a sensible default. However, Docker provides a way to extend those, called privileged mode. One of the motivations can be running a CI/CD server like Jenkins, which needs access to the Docker engine to perform image builds. This poses a great risk. If a CI/CD server can send instructions to the Docker Daemon, it effectively has root access to the entire VM. A malicious Dockerfile passed to a build can cause a lot of harm. If you decide to use privileged mode, use it on a dedicated cluster, separated from the deployment environment.
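To find containers that were started this way, one can look at the JSON printed by docker inspect. The sketch below assumes that output is passed in as a string; the function name is mine, while HostConfig.Privileged and HostConfig.CapAdd are fields of the inspect output.

```python
import json


def privileged_containers(inspect_json):
    """List containers that run privileged or with added capabilities.

    inspect_json is the JSON array printed by `docker inspect <containers>`.
    """
    flagged = []
    for container in json.loads(inspect_json):
        host_config = container.get("HostConfig") or {}
        reasons = []
        if host_config.get("Privileged"):
            reasons.append("privileged")
        # CapAdd is null when no capabilities were added.
        for capability in host_config.get("CapAdd") or []:
            reasons.append(f"cap-add {capability}")
        if reasons:
            flagged.append((container.get("Name", "<unknown>"), reasons))
    return flagged
```

Anything this reports on a production node deserves a second look and a written justification.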

Mounting Sensitive Volumes from the Host

By default, the container gets its file root from an image that is distinct from the host's file system. However, it is simple to bind-mount virtually any system directory (imagine /etc/passwd!) to a container. Avoid using this feature in deployments. Linux heavily uses files and directories to represent its resources (e.g. /proc), so any misconfiguration can make escaping the container possible.
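A simple guard is to check bind-mount specifications against a deny list before a deployment goes out. In the sketch below, the list of sensitive paths is illustrative rather than exhaustive, the names are mine, and the input is assumed to be in the "source:target[:options]" form used by docker run -v.

```python
import posixpath

# Illustrative deny list; extend it for your own hosts.
SENSITIVE_PATHS = ("/", "/etc", "/proc", "/sys", "/dev",
                   "/var/run/docker.sock", "/run/docker.sock")


def is_sensitive(host_path):
    """True if the path is a sensitive location or lies underneath one."""
    path = posixpath.normpath(host_path)
    for sensitive in SENSITIVE_PATHS:
        if path == sensitive or (sensitive != "/" and path.startswith(sensitive + "/")):
            return True
    return False


def risky_binds(bind_specs):
    """Return the bind-mount specs whose host side is sensitive."""
    risky = []
    for spec in bind_specs:
        source = spec.split(":", 1)[0]
        # Named volumes do not start with "/" and are not host paths.
        if source.startswith("/") and is_sensitive(source):
            risky.append(spec)
    return risky
```

Note that a read-only mount of /etc/passwd is still flagged: even read access to such files gives an attacker useful information about the host.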

Running Multiple Processes in One Container

Aside from well-motivated implementations of the sidecar pattern, avoid deploying more than one process in a single container. Container engines are designed to run and monitor only one process per container (container == process). This gives an advantage in monitoring - any new process spawned within a running container can raise a security alarm. Background processes are hard to test, monitor and trace.

Installing Software During Entry Point

This compromises immutability. If a container fetches additional software when it starts, we cannot rely on image scanners. Every container can end up with different versions of packages, which can lead to malfunctions of the entire system.
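This anti-pattern can often be caught by scanning entry point scripts for package manager calls. The sketch below is a rough heuristic with a hand-picked list of tools; the pattern and the function name are my own and will miss less obvious cases.

```python
import re

# Package manager installs, or a download piped straight into a shell.
INSTALL_PATTERN = re.compile(
    r"\b(?:apt-get|apt|yum|dnf|apk|pip3?|npm|gem)(?:\s+-\S+)*\s+(?:install|add)\b"
    r"|\b(?:curl|wget)\b[^|\n]*\|\s*(?:sh|bash)\b"
)


def runtime_installs(script_text):
    """Return (line_number, line) pairs that appear to install software."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comments and the shebang line
        if INSTALL_PATTERN.search(stripped):
            hits.append((number, stripped))
    return hits
```

Running it over the entry point script as part of the image build gives an early warning that the image is not as immutable as it looks.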

Summary

Container security is a vast topic. I cannot say containers are less or more secure compared to bare Linux. On one hand, they bring new threats; on the other, they solve some existing ones.

Based on the book by Liz Rice: "Container Security: Fundamental Technology Concepts that Protect Containerized Applications".



About the author

Damian Mierzwiński