A Policy-Driven Approach to Enhancing Kubernetes Security with OPA and Gatekeeper

MedStack delivers dependable, high-performance services for our customers’ critical healthcare systems.

We use Kubernetes as our foundation. It gives us powerful tooling and the reliability needed to run highly available workloads with minimal downtime.

As our platform expands, keeping it reliable and available presents numerous challenges, from the intricacies of deployment to the complexities of maintaining security and compliance with regulations.

I recently integrated Open Policy Agent (OPA) into our platform to address some of these challenges and maintain high standards of Site Reliability Engineering (SRE).

In this post I explain how the strategic integration of OPA allowed us to translate compliance policies into code, streamlining the implementation process and enhancing our overall system efficiency.

 

Open Policy Agent

Open Policy Agent (OPA) is an open-source, general-purpose policy engine designed to unify policy enforcement across the entire stack.

OPA lets us express policies as code in a declarative language called Rego, and it provides simple APIs for offloading policy decision-making from our software.

In Kubernetes, OPA enforces policies through admission controllers. It was accepted to the Cloud Native Computing Foundation (CNCF) on March 29, 2018, moved to the Incubating maturity level on April 2, 2019, and then moved to the Graduated maturity level on January 29, 2021.

 

OPA Gatekeeper and Kubernetes admission controller

To use OPA with a Kubernetes cluster, we chose OPA Gatekeeper, a project that specializes in integrating OPA with Kubernetes admission control.

OPA Gatekeeper operates as a validating admission controller within Kubernetes, responsible for enforcing policies on objects during their creation or modification. 

Whenever a user or process initiates the creation or modification of a Kubernetes object, OPA Gatekeeper steps in, scrutinizing the request against the predefined policies in OPA. Should the request breach any policy, it is promptly denied.

OPA Gatekeeper adds the following on top of plain OPA:

  • An extensible, parameterized policy library
  • Native Kubernetes CustomResourceDefinitions (CRDs) for instantiating the policy library (aka “constraints”)
  • Native Kubernetes CRDs for extending the policy library (aka “constraint templates”)
  • Audit functionality

 

Source: Kubernetes Blog

 

What is policy?

Policies serve as guidelines governing the behavior of software applications and services within a Kubernetes cluster. 

These policies are implemented at various levels depending on the requirements of different teams and organizations, often based on legal mandates, authorization rules, technical specifications, and constraints.

In our scenario, we have several typical policies in place:

  • User access rules, specifying permissions and actions allowed for each user
  • Regulations on the source of image repositories permitted for deploying services and applications
  • Mandates regarding the required labels and annotations that must accompany application deployment in configuration files
  • Specifications outlining resource limits and request boundaries for workloads operating within specific namespaces
  • And many more: almost any kind of required policy can be expressed as policy-as-code and applied using OPA

 

Policies that restrict which image repositories may be used help us in several ways:

  • In our production cluster, OPA plays a crucial role in restricting image repositories through a policy-as-code approach. By implementing this approach, we ensure that only approved image repositories are utilized within our environment. 
  • Leveraging OPA allows us to enforce strict governance over the image sources used in our production environment. This ensures compliance with organizational policies and industry regulations, reducing the risk of deploying unauthorized or insecure images that could compromise system integrity.
  • Our policy-as-code approach enables seamless integration of image repository restrictions into our existing deployment pipelines. This means that compliance checks are performed automatically during the deployment process, preventing unauthorized images from being deployed without manual intervention.

 

Once all Gatekeeper components are installed in our cluster, the API server calls the Gatekeeper admission webhook to handle admission requests whenever a resource is created, modified, or deleted.

During validation, Gatekeeper acts as a middleman linking the API server with OPA.

Policies are defined through two kinds of custom resources, ConstraintTemplates and Constraints. Together they are essential for applying policies to different Kubernetes resources such as pods, jobs, and deployments.

 

Source: https://dev.to/ashokan/kubernetes-policy-management-ii-opa-gatekeeper-465g

 

ConstraintTemplates and Constraints

ConstraintTemplates serve as blueprints for validating specific sets of Kubernetes objects within Gatekeeper’s Kubernetes admission controller. They consist of two primary elements:

  • Rego code, defining the criteria for policy violations
  • The schema of the corresponding Constraint object, representing an instance derived from the ConstraintTemplate

 

A Constraint serves as a statement outlining the requirements a system must fulfill. In essence, Constraints notify Gatekeeper of the administrator’s intention to enforce a ConstraintTemplate and specify how it should be enforced.

 

Source: https://grumpygrace.dev/posts/intro-to-gatekeeper-policies/

 

Example Walkthrough

In this example we assume that we have two namespaces called medstack and mcs in our production Kubernetes environment, and we want only specific image repositories to be able to provide container images for production applications deployed into these namespaces. No other, untrusted repository will be allowed.

In this case we choose the following two repos:

  • xxx.dkr.ecr.ca-central-1.amazonaws.com/
  • docker.io/bitnami/

 

To achieve this we need to follow these steps.

 

Create a ConstraintTemplate

First, we’ll create a ConstraintTemplate that defines the policy we want to enforce. 

In this case, the policy is to ensure that Kubernetes resources pull the image for any new container only from an approved image repository.

 

 

The ConstraintTemplate defines a policy named “k8sAllowedRepos”. The kinds of Kubernetes resource it applies to, Deployment and Pod, are supplied by the Constraint.

The Rego code checks whether the image of each container in the resource starts with one of the specified repository prefixes. If it does not, the code generates a violation message.
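The description above can be sketched as the manifest below. This is a minimal sketch modelled on the allowed-repos template in the community Gatekeeper library, not necessarily the exact manifest we run. The kind casing (K8sAllowedRepos), the helper rule names, and the violation message format are assumptions, and only regular containers are checked (init and ephemeral containers are left out for brevity).

```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  # Gatekeeper expects this to be the lowercase form of the kind below
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        # Schema of the "parameters" block accepted by each Constraint
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        # Containers of a bare Pod
        workload_containers[c] {
          c := input.review.object.spec.containers[_]
        }

        # Containers in the pod template of a Deployment
        workload_containers[c] {
          c := input.review.object.spec.template.spec.containers[_]
        }

        # Holds when the image starts with at least one allowed prefix
        image_allowed(image) {
          startswith(image, input.parameters.repos[_])
        }

        # One violation per container whose image matches no allowed prefix
        violation[{"msg": msg}] {
          container := workload_containers[_]
          not image_allowed(container.image)
          msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
        }
```

Because the check is a plain prefix match, each allowed repository should end with a trailing slash so that a similarly named registry cannot slip through.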

 

Create a Constraint

After defining the ConstraintTemplate, we create a Constraint to enforce the policy.

 

 

This Constraint is named “enforce-image-repository” and it specifies that it applies to Deployments and Pods. The “repos” parameter lists the repositories that the containers in those Deployments and Pods must use.

With these resources in place, Gatekeeper will enforce the policy specified in the ConstraintTemplate by evaluating all Deployments and Pods in the defined namespaces to ensure they comply with the specified image repository requirement.
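A Constraint matching this description might look like the following sketch. It assumes the ConstraintTemplate exposes a constraint kind called K8sAllowedRepos (the exact casing is an assumption); the constraint name, namespaces, and repositories are the ones from this walkthrough.

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos          # assumed kind generated by the ConstraintTemplate
metadata:
  name: enforce-image-repository
spec:
  match:
    kinds:
      # Pods live in the core API group, Deployments in "apps"
      - apiGroups: [""]
        kinds: ["Pod"]
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
    # Only resources in these namespaces are evaluated
    namespaces:
      - medstack
      - mcs
  parameters:
    # Allowed image prefixes, passed to the Rego code as input.parameters.repos
    repos:
      - "xxx.dkr.ecr.ca-central-1.amazonaws.com/"
      - "docker.io/bitnami/"
```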

 

Testing 

After applying these configurations to the cluster, attempting to create a pod with an image from a different repository (munta/nginx) will result in a violation error.
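A test manifest for this check could be as small as the sketch below. The pod and container names are hypothetical; the image and namespace are the ones used in this walkthrough. With the policy in place, submitting this manifest should be rejected at admission because munta/nginx does not start with either allowed prefix.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-disallowed   # hypothetical name, for illustration only
  namespace: medstack      # one of the namespaces covered by the policy
spec:
  containers:
    - name: nginx
      image: munta/nginx   # not under an approved repository
```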

This is how the applied policy prevents any user from deploying images from unapproved repositories into the production system.

With these policies enforced, no one can mistakenly deploy an application from an untrustworthy repository.

However, it is also crucial to check the resources that already existed in the cluster before the policy was applied. Gatekeeper offers audit functionality that periodically checks cluster resources against the deployed policies.

The audit results are reflected in the status field of the respective constraint.

We can describe or inspect a Constraint to find policy violations among the existing Kubernetes resources.
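To give a sense of what to look for, the fragment below sketches the shape of a Constraint's status block after an audit run, as seen when the Constraint is printed as YAML (for example with kubectl get and the -o yaml flag). Every value here is an illustrative placeholder, not output captured from a real cluster; the pod name and timestamp in particular are hypothetical.

```yaml
status:
  auditTimestamp: "2024-01-01T00:00:00Z"   # placeholder: time of the last audit run
  totalViolations: 1                       # placeholder count
  violations:
    - enforcementAction: deny
      kind: Pod
      name: legacy-nginx                   # hypothetical pre-existing pod
      namespace: medstack
      message: "container <nginx> has an invalid image repo <munta/nginx>"  # format depends on the template's Rego
```

Each entry under violations identifies one existing resource that breaks the policy, so the list can serve as a clean-up checklist.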

 

 

Conclusion

At MedStack, we’ve implemented a total of 13 distinct policies across various namespaces. Each policy is designed to fulfill the compliance requirements set forth by our security team. 

Additionally, there is a community-owned library of policies for the OPA Gatekeeper project.

In the realm of Kubernetes, maintaining robust security and compliance is a paramount concern for safeguarding applications and data integrity. Together, Open Policy Agent (OPA) and Gatekeeper offer a potent way to fortify Kubernetes security through policy enforcement.

OPA’s adaptable policy language and Gatekeeper’s adept admission control capabilities combine forces, empowering us to proactively mitigate risks, prevent misconfigurations, and uphold compliance with security best practices.

As the Kubernetes ecosystem continues its evolution, the adoption of OPA and Gatekeeper emerges as a critical component of any comprehensive security strategy. 

By integrating these tools seamlessly into our infrastructure, we can confidently navigate the complexities of Kubernetes, unlocking its full potential while steadfastly maintaining a security-centric approach.

 

About Muntashir Islam

Muntashir brings over 10 years of experience as a DevOps and Site Reliability Engineer to MedStack and has driven MedStack’s smooth internal migration to Kubernetes as a Certified Kubernetes, Azure, and AWS Administrator. He provides expert guidance and system architecture to achieve fully automated, reliable, secure, and compliant infrastructure for MedStack and all customers. He is proficient with Python, Java, Go, Terraform, Ansible, ArgoCD, Grafana, Prometheus, Kafka, and many more.