What are Pod Security Policies?

Although Pod Security Policies are still a beta feature of Kubernetes, they are an important security capability that should not be overlooked. Pod Security Policies (PSPs) are built-in Kubernetes resources that allow you to enforce security-related properties of every container in your cluster. If a container in a pod does not meet the criteria of an applicable PSP, it will not be scheduled to run.

Best Practices for CloudBees Core v2 on Kubernetes

There are numerous articles on security best practices for Kubernetes (including this one published on the CNCF blog). Many of these articles cover similar best practices, and most, if not all, apply to running Core v2 on Kubernetes. Some of these best practices are inherent in CloudBees’ documented install of Core v2 on Kubernetes, while others are documented as recommended next steps after your initial Core v2 installation.

Before we take a look at the best practices that aren’t necessarily covered by the CloudBees reference architectures and best-practice documentation, I will provide a quick overview of what is already available with an out-of-the-box (OOTB) Core v2 install and highlight some CloudBees documentation that covers other best practices for running Core v2 on Kubernetes more securely.

Enable Role-Based Access Control (RBAC)

Although you can certainly install Core v2 on Kubernetes without RBAC enabled, the CloudBees install for Core v2 comes with RBAC pre-configured. Running Kubernetes with RBAC enabled is typically the default (it is for all of the major cloud providers) and is always a recommended security setting.
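
If you are working with an existing cluster and aren’t sure whether RBAC is enabled, a quick (though not exhaustive) check is to confirm that the RBAC API group is being served:

kubectl api-versions | grep rbac.authorization.k8s.io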

Use Namespaces to Establish Security Boundaries & Separate Sensitive Workloads

CloudBees recommends that you create a namespace specifically for Core v2 as part of the install. CloudBees also recommends establishing boundaries between your CloudBees Jenkins masters and agent workloads by setting up distinct node pools using taints and tolerations and assigning pods to specific node pools with node selectors.
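
As a rough sketch of that pattern, the taint key/value below and the node pool name are hypothetical (not CloudBees-documented names); the idea is to taint the agent node pool so only pods that tolerate the taint, and explicitly select that pool, are scheduled onto it:

# Taint all nodes in the (hypothetical) agent node pool on GKE
kubectl taint nodes -l cloud.google.com/gke-nodepool=agent-pool dedicated=build-agents:NoSchedule

# Fragment of an agent pod spec that targets and tolerates that pool
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: agent-pool
  tolerations:
  - key: dedicated
    operator: Equal
    value: build-agents
    effect: NoSchedule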

Create and Define Cluster Network Policies

Although CloudBees doesn’t provide specific Kubernetes Network Policies, CloudBees does recommend using them and provides documentation for setting up a private and encrypted network for AWS EKS.
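
As a minimal sketch (the namespace name is an assumption - use the namespace of your Core v2 install), a default-deny ingress policy for the Core v2 namespace looks like this, with additional policies then added to explicitly allow the traffic you need (for example, from the ingress controller):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  # the namespace where Core v2 is installed - adjust to match your install
  namespace: cloudbees-core
spec:
  # an empty podSelector matches all pods in the namespace
  podSelector: {}
  policyTypes:
  - Ingress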

Run a Cluster-wide Pod Security Policy

At the time of this post, this is one component that is not documented as part of the CloudBees installation guides for Core v2 on Kubernetes and will be the focus of the rest of this post.

Why should you use Pod Security Policies?

From the Kubernetes documentation on Pod Security Policies (PSPs): “Pod security policy control is implemented as an optional (but recommended) admission controller.” If you read any number of posts on security best practices for Kubernetes, pretty much all of them mention PSPs.

A CD platform like CloudBees Core v2 on Kubernetes is typically a multi-tenant service where security is of the utmost importance. In addition to being multi-tenant, a Kubernetes cluster running CD workloads typically hosts other workloads as well, and any workload without proper security configured can impact every workload running on the cluster.

The combination of PSPs with Kubernetes RBAC, namespaces, and workload-specific node pools allows for the granular security you need to ensure there are adequate safeguards in place and to greatly reduce the risk of unintentional (and intentional) actions that break your cluster. PSPs provide additional safeguards alongside targeted node pools, namespaces, and service accounts. This allows the flexibility CI/CD users need while providing adequate guard rails so that a careless (or malicious) action doesn’t negatively impact CD workloads or other important Kubernetes workloads.

Using Pod Security Policies with CloudBees Core v2

As mentioned above, Pod Security Policies are an optional (and still beta) Kubernetes feature, so they are not enabled by default on most Kubernetes distributions, including GCP GKE and Azure AKS. PSPs can be created and referenced from a ClusterRole or Role resource definition without enabling the PodSecurityPolicy admission controller. This is very important, because once you enable the PodSecurityPolicy admission controller, any pod that does not have a PSP applied to it will not get scheduled.

NOTE: PSPs are enabled by default on AWS EKS 1.13 and above, but with a very permissive PSP that is effectively the same as running EKS without PSPs.
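
Before creating any new policies, you can list the PSPs (if any) that already exist on your cluster; on EKS 1.13 and above this shows the permissive eks.privileged policy mentioned above:

kubectl get psp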

We will define two PSPs for our Core v2 cluster: the restrictive cb-restricted policy below, and a slightly more permissive policy for the NGINX Ingress controller covered in a later section.

The cb-restricted PSP is used for all CloudBees components, for additional Kubernetes services leveraged with Core v2, and for the majority of dynamic, ephemeral Kubernetes-based agents used by our Core v2 cluster:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: cb-restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'docker/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  # prevents containers from manipulating the network stack, accessing devices on the host, and running Docker-in-Docker (DinD)
  privileged: false
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  runAsUser:
    rule: 'MustRunAs'
    ranges:
      # Don't allow containers to run as ROOT
      - min: 1
        max: 65535
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  # Allow core volume types. But more specifically, don't allow mounting host volumes to include the Docker socket - '/var/run/docker.sock'
  volumes:
  - 'emptyDir'
  - 'secret'
  - 'downwardAPI'
  - 'configMap'
  # persistentVolumes are required for CJOC and Managed Master StatefulSets
  - 'persistentVolumeClaim'
  - 'projected'
  hostPID: false
  hostIPC: false
  hostNetwork: false
  # Ensures that no child process of a container can gain more privileges than its parent
  allowPrivilegeEscalation: false
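
Assuming the PSP above is saved to a file such as cb-restricted-psp.yaml (the file name is arbitrary), it is created like any other Kubernetes resource:

kubectl apply -f cb-restricted-psp.yaml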

Once the primary Core v2 PSP (cb-restricted in this case) has been created, you must update the Roles to use it. CloudBees defines two Kubernetes Roles for the Core v2 install on Kubernetes: cjoc-master-management, bound to the cjoc ServiceAccount for provisioning Managed/Team Master StatefulSets from CJOC, and cjoc-agents, bound to the jenkins ServiceAccount for scheduling dynamic ephemeral agent pods from Managed/Team Masters. The following Kubernetes configuration snippets show how this is configured:

---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: cjoc-master-management
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - cb-restricted
...
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: cjoc-agents
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs:     ['use']
  resourceNames:
  - cb-restricted
...
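
The snippets above only show the podsecuritypolicies rule added to each Role; the Roles themselves are bound to the cjoc and jenkins ServiceAccounts by the RoleBindings that the standard CloudBees install already creates. For reference, a binding for the agent Role looks roughly like this (the namespace is an assumption - use the namespace of your Core v2 install):

---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: cjoc-agents
  # the namespace where Core v2 is installed - adjust to match your install
  namespace: cloudbees-core
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cjoc-agents
subjects:
- kind: ServiceAccount
  name: jenkins
...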

Bind Restrictive PSP Role for Ingress Nginx

CloudBees recommends the ingress-nginx Ingress controller to manage external access to Core v2. The NGINX Ingress Controller is maintained as part of the Kubernetes project and provides an example for using Pod Security Policies with the ingress-nginx Deployment. All you have to do is run the following command before installing the NGINX Ingress controller:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/docs/examples/psp/psp.yaml

The above command will create the following PSP, Role, and RoleBinding. The primary differences from the cb-restricted PSP are the addition of NET_BIND_SERVICE to allowedCapabilities, allowing hostPorts from 80 to 65535, and setting allowPrivilegeEscalation to true:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    # Assumes apparmor available
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'docker/default'
  name: ingress-nginx
spec:
  allowedCapabilities:
  - NET_BIND_SERVICE
  allowPrivilegeEscalation: true
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    - min: 1
      max: 65535
  hostIPC: false
  hostNetwork: false
  hostPID: false
  hostPorts:
  - min: 80
    max: 65535
  privileged: false
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
    ranges:
    - min: 33
      max: 65535
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  volumes:
  - 'configMap'
  - 'downwardAPI'
  - 'emptyDir'
  - 'projected'
  - 'secret'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ingress-nginx-psp
  namespace: ingress-nginx
rules:
- apiGroups:
  - policy
  resourceNames:
  - ingress-nginx
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ingress-nginx-psp
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ingress-nginx-psp
subjects:
- kind: ServiceAccount
  name: default
- kind: ServiceAccount
  name: nginx-ingress-serviceaccount

NOTE: You can also run that command after you have already installed the NGINX Ingress controller but the PSP will only be applied after restarting or recreating the ingress-nginx Deployment.
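
For example, with kubectl 1.15 or later, and assuming the deployment name and namespace from the standard ingress-nginx manifests (yours may differ), the restart can be triggered with:

kubectl rollout restart deployment/nginx-ingress-controller -n ingress-nginx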

Pod Security Policies for Other Services

The cluster used as an example for this post relies on the cert-manager Kubernetes add-on for automatically provisioning and managing TLS certificates for the Core v2 install on GKE. If cert-manager or other services are installed before you enable PSPs on your cluster, their pods will fail to start after a restart if the associated Roles/ClusterRoles don’t have PSPs applied to them. cert-manager is deployed to its own namespace, so an easy way to ensure that all ServiceAccounts associated with the cert-manager service have a PSP applied is to create a ClusterRole that uses the PSP and then bind that ClusterRole to all ServiceAccounts in the applicable namespace:

ClusterRole with the cb-restricted PSP applied

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-restricted-clusterrole
rules:
- apiGroups:
  - extensions
  resources:
  - podsecuritypolicies
  resourceNames:
  - cb-restricted
  verbs:
  - use

RoleBindings for cert-manager ServiceAccounts

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cert-manager-psp-restricted
  namespace: cert-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp-restricted-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts

NOTE: You can use the command kubectl get role,clusterrole --all-namespaces to check your cluster for any other Roles or ClusterRoles that need a PSP applied to them. Remember, any pod running under a ServiceAccount that doesn’t have a PSP applied will not be rescheduled or recreated once you enable the PodSecurityPolicy admission controller. For GKE you don’t need to apply PSPs to any Roles in the kube-system namespace or to any gce or system ClusterRoles, as GKE will automatically apply the necessary PSPs.

Enable the Pod Security Policy Admission Controller

Now that PSPs are applied to all the necessary Roles and ClusterRoles you can enable the Pod Security Policy Admission Controller for your GKE cluster:

gcloud beta container clusters update [CLUSTER_NAME] --zone [CLUSTER_ZONE] --enable-pod-security-policy

Next, you should ensure that all pods are still running:

kubectl get pods --all-namespaces

If a pod that you expect to be running is not, you need to find the Role/ClusterRole that is used for the pod/deployment/service and apply a PSP to it.
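
Two commands are helpful here: the first shows which PSP (if any) was applied to a running pod via the kubernetes.io/psp annotation, and the second checks whether a given ServiceAccount is authorized to use a specific PSP. The pod, namespace, and ServiceAccount names below are placeholders:

kubectl get pod [POD_NAME] -n [NAMESPACE] -o jsonpath='{.metadata.annotations.kubernetes\.io/psp}'

kubectl auth can-i use podsecuritypolicy/cb-restricted --as=system:serviceaccount:[NAMESPACE]:[SERVICE_ACCOUNT]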

Default Pod Security Policies created when enabling the pod-security-policy feature on a GKE cluster:

NAME                           PRIV    CAPS   SELINUX    RUNASUSER   FSGROUP    SUPGROUP   READONLYROOTFS   VOLUMES
gce.event-exporter             false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            hostPath,secret
gce.fluentd-gcp                false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            configMap,hostPath,secret
gce.persistent-volume-binder   false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            nfs,secret
gce.privileged                 true    *      RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            *
gce.unprivileged-addon         false          RunAsAny   RunAsAny    RunAsAny   RunAsAny   false            emptyDir,configMap,secret

NOTE: The default Pod Security Policies created automatically cannot be modified - Google will automatically change them back to those above.

AWS EKS and Azure AKS (as a preview feature) also support Pod Security Policies.
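
On AKS, for example, the preview feature is enabled with an update command similar to the GKE one above (the resource group and cluster name are placeholders, and the preview flag may change):

az aks update --resource-group [RESOURCE_GROUP] --name [CLUSTER_NAME] --enable-pod-security-policy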

Oh no, My Jenkins Kubernetes Agents Won’t Start!

The Jenkins Kubernetes plugin (for ephemeral K8s agents) defaults to using a K8s emptyDir volume type for the Jenkins agent workspace. This causes issues when using a restrictive PSP such as the cb-restricted PSP above. Kubernetes defaults to mounting emptyDir volumes as root:root with permissions set to 750, as detailed in this GitHub issue opened way back in 2014. When Jenkins K8s agent pods use a PSP that doesn’t allow containers to run as root, the containers will not be able to access the default K8s plugin workspace directory. One approach for dealing with this is to set the K8s securityContext for containers in the pod spec. You can do this in the K8s plugin UI via the Raw yaml for the Pod field:

Raw yaml for the Pod
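
What goes into that field is just a partial pod definition that the plugin merges into the pod it generates; a minimal sketch that sets the pod-level securityContext (UID 1000 matches the jenkins user in the standard Jenkins agent images) looks like this:

apiVersion: v1
kind: Pod
spec:
  securityContext:
    # run all containers in the pod as this non-root UID
    runAsUser: 1000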

This can also be set in the raw yaml of a pod spec that you load into your Jenkins job from a file:

pod spec with the securityContext

apiVersion: v1
kind: Pod
metadata:
  name: nodejs-app
spec:
  containers:
  - name: nodejs
    image: node:10.10.0-alpine
    command:
    - cat
    tty: true
  - name: testcafe
    image: gcr.io/technologists/testcafe:0.0.2
    command:
    - cat
    tty: true
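  # Run every container in this pod as a non-root user so the pod satisfies
  # the restrictive PSP and can write to the emptyDir-based agent workspace.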
  securityContext:
    runAsUser: 1000