Container evangelist
Open Source Advocate
Kernel Developer
Hypervisors are based on emulating hardware
Containers are about virtualizing the Operating System subsystems
Containers: Single Kernel; Hypervisors: multiple kernels.
Immediate Benefit: One Kernel, one resource manager
Other container advantages: elasticity
[Diagram: size scale contrasting Gigabytes with Megabytes]
Sharing is a key attribute that enables container agility
containers can be scaled instantly up or down (instant vertical scaling)
resource decisions can be made much more efficiently than hypervisors
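The vertical scaling point above can be illustrated with a minimal sketch, assuming a cgroup v2 host where a container's memory ceiling lives in a `memory.max` file inside its cgroup directory. The helper names and the directory passed in are hypothetical; on a real system the directory would sit under `/sys/fs/cgroup` and writing to it needs suitable privileges.

```python
import os


def set_memory_limit(cgroup_dir, limit_bytes):
    """Write a new memory ceiling into a cgroup v2 directory.

    cgroup_dir is assumed to be the container's cgroup directory.
    Passing None removes the ceiling by writing the literal "max".
    """
    value = "max" if limit_bytes is None else str(int(limit_bytes))
    with open(os.path.join(cgroup_dir, "memory.max"), "w") as f:
        f.write(value)
    return value


def get_memory_limit(cgroup_dir):
    """Read the current ceiling back; None means unlimited."""
    with open(os.path.join(cgroup_dir, "memory.max")) as f:
        text = f.read().strip()
    return None if text == "max" else int(text)
```

Because the limit is a single value handled by the one shared kernel, raising or lowering it takes effect without restarting anything, which is the sense in which the scaling is instant.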
Boxing just the application led to the revolution in DevOps
However, Docker is nothing more than an application packaging and transport system
DevOps today is all about easy deployment of boxed applications with a consistent environment
Unfortunately, sharing also increases the security risk of the containment system
A fact that Hypervisor advocates seek to exploit.
"Fake News"
Problem: Lack of facts around "security" makes it hard to dispute
Kernel API is the same for all containers
Came from an Agreement at the Kernel Summit in 2011
Caused container interests to converge on a unified, upstream API
No repeat of Xen/KVM split
Led directly to the ability of Docker to run on upstream containers
Also rapidly evolving
Block I/O
CPU
Devices
Memory
Network
Freezer
...
Network NS
IPC NS
Mount NS
PID NS
UTS NS
User NS
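The namespaces listed above are visible from user space on Linux as symbolic links under `/proc/<pid>/ns`, each pointing at a target of the form `net:[4026531992]`. The following is a minimal sketch, assuming that layout; the function names are hypothetical, and on a system without `/proc` the listing simply comes back empty.

```python
import os
import re

# Link targets look like "net:[4026531992]": a namespace kind and an inode.
NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError("not a namespace link: %r" % (target,))
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(pid="self", proc_root="/proc"):
    """Map each entry under <proc_root>/<pid>/ns to its namespace inode."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # no /proc, or the process is not readable
    for name in entries:
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or parse
        result[name] = inode
    return result
```

Two processes share a namespace when the inode numbers for the same entry match, so comparing `list_namespaces()` for a process inside a container with one outside shows which subsystems have been virtualized.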
Docker is not the end of containers, it's just the beginning
And the source of quite a few of our security issues
Need to find a measure to define "Security" or "Containment"
Best Candidate is Attack Profile
Vertical Attack Profile (VAP) means the overall chance of my application being hacked.
Horizontal Attack Profile (HAP) means my overall chance of being hacked through the shared code I expose.
Observations
Here is the measured HAP of Docker vs. KVM (run as Kata Containers).
Containers with a good seccomp profile are not much worse than a hypervisor
Can make "Attack Profile" more precise by equating it to the number of lines of code traversed multiplied by the exploitable defect density.
Once we have a measure, we can start to build a container description that minimizes HAP.
HAP advances the state of the art but is by no means the end point.
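The definition above (lines of code traversed multiplied by exploitable defect density) can be sketched as a small calculation. Everything concrete here is an assumption for illustration: the per-syscall line counts and the defect density passed in are hypothetical inputs, not measurements, and summing per-syscall counts ignores code shared between call paths.

```python
def hap(lines_traversed, defects_per_line):
    """HAP as lines of shared code traversed times exploitable defect density."""
    return lines_traversed * defects_per_line


def hap_for_profile(lines_per_syscall, defects_per_line, allowed=None):
    """Estimate HAP for a workload, optionally restricted by an allowlist.

    lines_per_syscall maps a system call name to the (hypothetical) number
    of kernel lines it traverses. allowed plays the role of a seccomp-style
    allowlist: calls outside it never reach the shared kernel code.
    """
    if allowed is None:
        names = list(lines_per_syscall)
    else:
        names = [n for n in lines_per_syscall if n in allowed]
    total_lines = sum(lines_per_syscall[n] for n in names)
    return hap(total_lines, defects_per_line)
```

Calling `hap_for_profile` once without an allowlist and once with a restricted one shows how a tighter container description lowers the estimate, which is the kind of comparison the measure is meant to enable.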
Problems with HAP as a Measure
If no one can exploit your bug, does it exist from a security perspective?
Need to incorporate "exploitability" as the next refinement for HAP
We suspect this means the interface description is the most important factor in exploitability
Some interfaces are inherently more exploitable than others
Alternate Container Descriptions (Sandboxing)
Sandboxing means emulating some system calls for isolation instead of doing namespacing.
Emulation means code isn't shared and therefore the HAP is reduced (in theory)
Very difficult to get sandboxing right for containers without committing the hypervisor fault.
Well-known sandboxes are IBM Nabla and Google gVisor.
gVisor rewrites system calls in Go for security
Nabla extracts unikernel techniques and fits them to a single process
However, we also try to keep sharing by using standard Linux memory management techniques.
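The idea of emulating some system calls instead of sharing the kernel's implementation can be shown with a toy model. This is only a sketch of the interposition idea under stated assumptions, not how gVisor or Nabla are built: `ToySandbox` and `host_kernel` are hypothetical names, and the "system calls" are plain Python callables.

```python
def host_kernel(name, *args):
    """Hypothetical stand-in for code running in the shared host kernel."""
    return "host handled %s" % name


class ToySandbox:
    """Route each call to private emulation or to the shared host kernel."""

    def __init__(self, emulated, passthrough):
        self.emulated = dict(emulated)       # name -> handler inside the sandbox
        self.passthrough = set(passthrough)  # names allowed to reach the host
        self.host_calls = []                 # record of shared code actually used

    def syscall(self, name, *args):
        if name in self.emulated:
            # Emulated: no shared kernel code is traversed for this call.
            return self.emulated[name](*args)
        if name in self.passthrough:
            # Forwarded: this call contributes to the horizontal attack profile.
            self.host_calls.append(name)
            return host_kernel(name, *args)
        raise PermissionError("system call not permitted: %s" % name)
```

In this model the length of `host_calls` stands in for the shared code exposed: the more calls the sandbox emulates, the shorter that list, at the cost of carrying private implementations that are no longer shared.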
What's Next
Is there some useful segmentation within the Linux Kernel?
Separation by address spaces within the kernel?
Run parts of the kernel with user context?
Use supervisors, like LSMs, to correct interface defects?
What about VAP ... HAP protects the operator rather than the application?