In the last couple of years Kubernetes (K8s) has become one of the most popular tools for running containerized applications. Many cloud companies, major ones among them, have adopted it to orchestrate their container-based workloads. Given its popularity, K8s security is becoming an ever more pressing problem. Read our two-part blog post series on how to make a Kubernetes cluster secure. Part one provides a brief history of virtualization, presents admission controllers and how they work, and shows how Pod Security Policies, a powerful admission controller, can help you manage user actions on a Kubernetes cluster.
From physical servers to containers
In the very early days of the computer era, having more computing power meant having more physical servers as well. The term “server farms” gets at just how elaborate such solutions could be. Later, thanks to the introduction of Virtual Machines (VM), the concept of virtualization was born. This approach has been playing an important role over the past 20 years and is still widely used. Yet it has some shortfalls. In order to run apps, each VM must include not only the full footprint of the OS itself, but all libraries and dependencies (Libs/Bins) for the entire stack (operating system, device drivers, apps etc.). Each VM must also emulate virtual versions of the underlying hardware. All these make VM implementations resource-hungry, complicated and difficult to manage. Last but not least, they are also very costly and require complicated IT maintenance services.
Containerization may be a viable solution to these pitfalls. Containers are lightweight, portable and easy to manage. They can be run on the same host OS, thus removing the need to replicate OSs, as must be done for each VM. Furthermore, containers use the host OS’s Libs and Bins through direct API calls. They only encapsulate the few Libs and Bins that are necessary for the apps they run. For these reasons, containers are less resource-hungry than VMs.
The price of performance and ease
Yet the use of containers poses one important problem - security. Given how they work--providing security at the process level, not at the level of the OS--containers are not as secure as VMs. There is only one underlying OS and every security flaw in it automatically becomes a possible threat to the containers themselves. Containers share not only resources and hardware, but also the OS kernel. There are no separate physical or virtual machines. Security must therefore be provided at the kernel level by using default tools or creating new ones.
Kernel namespaces can be used to isolate processes from one another. The namespace types include Mount (MNT), Process ID (PID), Network (NET), Interprocess Communication (IPC), User and Control Group (cgroup). Control groups themselves, for example, limit resource usage (CPU, RAM, HDD, network resources, system buffers like the filesystem cache) for a certain group of processes.
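To make namespace membership concrete: on a Linux host, the namespaces a process belongs to are exposed as symbolic links under /proc/<pid>/ns, with targets such as "pid:[4026531836]". The following is a minimal sketch, assuming a Linux machine; the helper names parse_ns_link, list_namespaces and shared_namespaces are made up for this illustration and are not part of any Kubernetes or kernel API.

```python
import os
import re

# Symlink targets under /proc/<pid>/ns look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(pid="self"):
    """Return {entry name: inode} for one process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


def shared_namespaces(ns_a, ns_b):
    """Names of the namespaces that two processes have in common."""
    return sorted(k for k in ns_a if k in ns_b and ns_a[k] == ns_b[k])


if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    for name, inode in list_namespaces().items():
        print(f"{name:20} {inode}")
```

Two processes are in the same namespace exactly when the inode numbers match, so comparing the output for a process on the host with one inside a container shows which namespaces the container runtime has separated. Reading the entries of another user's process usually requires elevated privileges.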
Beyond separating processes with kernel namespaces, we may want to go a step further and restrict how processes are launched in particular pods (a pod is a set of containers that are logically coherent, but have different functions). Other security aspects that may need to be controlled are the images used for containers and the underlying hardware, access to which should be restricted to particular applications. To do so, we can use admission controllers, which are built into Kubernetes.
What admission controllers can do for your security
Admission controllers are part of the kube-apiserver. They are designed to intercept a request after it reaches the API server, but before the configuration is stored in the cluster's settings storage (etcd). Admission controllers can be mutating, validating or both. After a request has been intercepted, the mutating controllers can modify it. The general Kubernetes request validation step is then performed on the request. Once this general validation has passed, the validating controllers can run their specific checks on the request content. Only after these checks have passed is the new request content stored in the cluster configuration.
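The order of these steps can be sketched in a few lines of Python. This is a toy model only, not how the kube-apiserver is implemented: the pod is a plain dictionary, a dictionary stands in for etcd, and the names admit, force_always_pull, validate_schema and deny_privileged are invented for the example.

```python
import copy


class Rejected(Exception):
    """Raised when any stage refuses the request."""


def force_always_pull(pod):
    """Mutating step: set imagePullPolicy to Always on every container."""
    for container in pod.get("spec", {}).get("containers", []):
        container["imagePullPolicy"] = "Always"
    return pod


def validate_schema(pod):
    """Stand-in for the API server's general object validation."""
    if not pod.get("metadata", {}).get("name"):
        raise Rejected("metadata.name is required")
    containers = pod.get("spec", {}).get("containers")
    if not containers:
        raise Rejected("spec.containers must list at least one container")
    for container in containers:
        if not container.get("name") or not container.get("image"):
            raise Rejected("each container needs a name and an image")


def deny_privileged(pod):
    """Validating step: refuse containers that ask for privileged mode."""
    for container in pod["spec"]["containers"]:
        if container.get("securityContext", {}).get("privileged"):
            raise Rejected(f"container {container['name']} is privileged")


def admit(pod, store, mutators, validators):
    """Run mutation, general validation, then validators; store on success."""
    candidate = copy.deepcopy(pod)      # never touch the caller's object
    for mutate in mutators:             # 1. mutating controllers
        candidate = mutate(candidate)
    validate_schema(candidate)          # 2. general validation
    for validate in validators:         # 3. validating controllers
        validate(candidate)
    store[candidate["metadata"]["name"]] = candidate  # 4. persist
    return candidate


if __name__ == "__main__":
    etcd = {}  # a dictionary standing in for the real storage
    pod = {"metadata": {"name": "web"},
           "spec": {"containers": [{"name": "app", "image": "nginx:1.25"}]}}
    admit(pod, etcd, [force_always_pull], [deny_privileged])
    print(etcd["web"]["spec"]["containers"][0]["imagePullPolicy"])
```

Because validation runs after mutation, the validating controllers always see the object in the form in which it would be stored, and a rejection at any stage leaves the storage untouched.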
Admission controllers offer a wide range of use cases and configuration and restriction capabilities, and bring a new level of possibilities to the containerized world. A quick overview of the plugins available in Kubernetes and the features they provide will help show how much control admission controllers offer.
- AlwaysPullImages - this controller modifies requests forcing imagePullPolicy to always pull images from the registry. It helps to ensure that nobody can use private images previously cached on the Kubernetes node.
- DefaultStorageClass - upon the creation of a persistent volume claim, if no storage class is specified, this controller sets it to the default value.
- DefaultTolerationSeconds - if pods don’t have any tolerations configured for the notready:NoExecute and unreachable:NoExecute taints, this controller sets a default toleration of 5 minutes for them.
- EventRateLimit (alpha) - this plugin allows you to set the rate limit of requests per API server, namespace, user or object to avoid flooding the server with events (preventing denial of service).
- ExtendedResourceToleration - this plugin automatically adds tolerations for taints to pods requesting special resources (like dedicated hardware available on special nodes in the cluster, e.g. GPUs).
- ImagePolicyWebhook - this extension is intended to perform an image review. If a pod uses images that are not welcome on a cluster (due to known vulnerabilities, for example), the pod request can be rejected, or it can be accepted only when the image is specified in a particular way (e.g. images tagged as “latest” are not allowed, only images with the tag “approved” can be used).
- LimitPodHardAntiAffinityTopology - enabling this controller limits a pod's antiAffinity configuration so that it is based only on the hostname.
- LimitRanger - this controller verifies that an incoming request for object creation does not violate the LimitRange constraints defined in the namespace, and applies default values where the request does not specify any.
- MutatingAdmissionWebhook (beta) - this controller calls, one after another, the mutating webhooks that match a request. It allows you to extend the existing controller system with custom mutations, e.g. injecting an additional (sidecar) container into the pod for logs or metrics.
- NamespaceAutoProvision - listens for requests and verifies if the namespace which requests are referring to actually exists. If there is no requested namespace available in the cluster, a new namespace will be created.
- NamespaceLifecycle - all requests for deleting the reserved namespaces (default, kube-system, kube-public) are rejected. Also, once a termination request for any other namespace has been accepted, this controller prevents the creation of any new resources in that namespace.
- NodeRestriction - limits the node and pod objects that a kubelet can modify to the resources managed by that kubelet.
- PodNodeSelector - sets defaults and limits which node selectors can be used within the namespace
- PersistentVolumeClaimResize - prevents PVCs from being resized, unless their storage class allows it.
- PodPreset - this controller injects parameters into a pod specification according to the specified config. It can insert the image registry credentials or set environmental variables for the pod.
- PodSecurityPolicy - one of the most powerful admission controllers, it validates specifications for a created or modified pod to match several security parameters. This is described in greater detail later.
- PodTolerationRestriction - verifies if there’s a contradiction between pod and namespace tolerations and rejects the pod if such a contradiction is found. If there are no contradictions, the tolerations are merged into a single consistent set.
- Priority - uses the priorityClassName given in the pod definition to set the pod's priority value; pods that refer to a priority class that does not exist are rejected.
- ResourceQuota - ensures on pod creation that the ResourceQuota defined in the namespace is not exceeded.
- SecurityContextDeny - rejects pods that try to set certain SecurityContext fields which would allow privilege escalation. It should be enabled if PSP is not used in the cluster. SecurityContextDeny is significantly less powerful than PSP.
- ServiceAccount - automates the usage of serviceAccount objects in the cluster.
- StorageObjectInUseProtection - prevents the deletion of PV and PVC objects while they’re still in use.
- ValidatingAdmissionWebhook (beta) - enables the creation of webhooks for custom validation rules. This is effectively an extension with custom rules to existing validation controllers.
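To give a feel for what a custom validation rule looks like, here is a minimal sketch of the decision logic a validating webhook might run, using the image-tag rule from the image review example in the list above. The API server sends the webhook an AdmissionReview object and expects one back. The sketch assumes the admission.k8s.io/v1 format (clusters where webhooks are still beta use v1beta1), leaves out the HTTPS server around it, and only looks at spec.containers; ALLOWED_TAG, image_tag and review are names chosen for this example.

```python
ALLOWED_TAG = "approved"  # hypothetical policy: only this tag may run


def image_tag(image):
    """Return the tag of an image reference, or "latest" if it has none."""
    # Digest-pinned references ("name@sha256:...") are treated like
    # untagged ones in this sketch.
    name = image.split("@", 1)[0]
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference belongs to a registry port such as "registry.local:5000".
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"


def review(admission_review):
    """Build the AdmissionReview response for an incoming pod request."""
    request = admission_review["request"]
    pod = request["object"]
    containers = pod.get("spec", {}).get("containers", [])
    rejected = [c["image"] for c in containers
                if image_tag(c["image"]) != ALLOWED_TAG]

    # The response must echo the uid of the request it answers.
    response = {"uid": request["uid"], "allowed": not rejected}
    if rejected:
        response["status"] = {
            "code": 403,
            "message": "images without the 'approved' tag: "
                       + ", ".join(rejected),
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

A real webhook would serve this function over TLS and be registered with the cluster through a ValidatingWebhookConfiguration object; a mutating webhook returns the same kind of response with a patch attached to it.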
As you can see, Pod Security Policy is a powerful mechanism that can significantly increase your control over what actions users can perform on a Kubernetes cluster. In the second part of this blog post I will show you how it can be used in practice to enhance the security of a Kubernetes cluster.