πŸš€ Automating Image Vulnerability Patching in Kubernetes with Trivy Operator, Copacetic, and GitHub Actions


1. Installing and Using Copacetic (Copa) CLI

Copacetic is a CLI tool (copa) designed to help automate the patching of vulnerabilities in your container images. Here’s how to install it:

1.1. Clone the Copacetic Repository

Start by cloning the Copacetic repository to your local machine:

git clone https://github.com/project-copacetic/copacetic
cd copacetic

1.2. Build Copacetic

Inside the cloned directory, build the Copacetic CLI:

make

1.3. Install Copa CLI (Optional)

To make the copa command available from anywhere, move the binary to a directory on your system’s PATH:

sudo mv dist/linux_amd64/release/copa /usr/local/bin/


2. Scanning and Patching Images with Trivy and Copa

Once you have installed both Trivy and Copacetic, you can start scanning your images and patching vulnerabilities.

2.1. Scan the Container Image for Patchable OS Vulnerabilities

Use Trivy to scan the container image for operating system vulnerabilities. The results will be output to a JSON file:

trivy image --vuln-type os --ignore-unfixed -f json -o $(basename $IMAGE).json $IMAGE
  • Replace $IMAGE with the name of the container image you want to scan (e.g., nginx:latest).
  • The command will create a JSON file with the same name as the image, which contains details about the vulnerabilities found.
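To see how the report filename is derived, note that basename keeps only the last path segment of the image reference, so even registry-qualified names collapse to a simple `name:tag.json` file:

```shell
# basename strips any registry/repository path from the image reference,
# so the report file is named after the bare image name and tag.
IMAGE=docker.io/library/nginx:latest
REPORT="$(basename "$IMAGE").json"
echo "$REPORT"   # nginx:latest.json
```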

2.2. Patch the Image Using Copa

After generating the Trivy report, use Copa to patch the image based on the vulnerabilities reported:

copa patch -r $(basename $IMAGE).json -i $IMAGE
  • This command reads the vulnerability report from the JSON file and patches the specified image.
  • By default, Copa produces a new image tagged with the original tag plus a -patched suffix (e.g., nginx:latest-patched).
  • The new patched image will be exported to your local Docker daemon.
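Putting the two steps together, a minimal end-to-end sketch (using nginx:latest purely as an example image) might look like this:

```shell
# Scan, patch, and inspect the result. Requires trivy, copa, and a
# running Docker daemon; nginx:latest is just an example image.
IMAGE=nginx:latest
trivy image --vuln-type os --ignore-unfixed -f json -o "$(basename "$IMAGE").json" "$IMAGE"
copa patch -r "$(basename "$IMAGE").json" -i "$IMAGE"
docker images nginx   # the patched image should appear with a latest-patched tag
```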

3. Installing Trivy Operator with Helm

Once you’ve patched your container images, you can use Trivy Operator to continuously monitor your AKS cluster for any vulnerabilities.

3.1. Add the Aqua Security Helm Repository

First, add the Aqua Security Helm repository:

helm repo add aqua https://aquasecurity.github.io/helm-charts/

For more details, visit the Aqua Security Helm Charts documentation.

3.2. Update Your Helm Repositories

Update your Helm repositories to get the latest charts:

helm repo update

3.3. Install Trivy Operator

Finally, install the Trivy Operator:

helm install trivy-operator aqua/trivy-operator --namespace trivy-system --create-namespace

Trivy Operator will now be running in your AKS cluster, ready to scan for vulnerabilities. You can find more detailed instructions in the Trivy Operator documentation.
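To confirm the operator is up and already producing results, you can check its pods and list the reports it generates (resource names below assume the default Helm install):

```shell
# The operator scans workloads and creates namespaced VulnerabilityReport
# resources; listing them confirms the install is working.
kubectl get pods -n trivy-system
kubectl get vulnerabilityreports -A
```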

4. Exporting Trivy Metrics

Once Trivy Operator is up and running, it’s important to monitor the health of your cluster by exporting Trivy metrics. These metrics provide valuable insights into the security posture of your container images.

4.1. Enable Metrics Export

Trivy Operator can export metrics to Prometheus, allowing you to monitor the security status of your AKS cluster over time.

  1. Edit the Trivy Operator ConfigMap:

    Enable the metrics exporter in the Trivy Operator’s ConfigMap:

    kubectl edit configmap trivy-operator-config -n trivy-system
    
  2. Add or Modify the Following Configuration:

    Ensure the following settings are enabled in the ConfigMap:

    data:
      trivy.severity: "HIGH,CRITICAL"
      trivy.resources.requests.cpu: "100m"
      trivy.resources.requests.memory: "100Mi"
      trivy.resources.limits.cpu: "500m"
      trivy.resources.limits.memory: "500Mi"
      trivy.metrics.enabled: "true"
    

    This configuration ensures that Trivy Operator will export security metrics to Prometheus.
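If you prefer not to open an interactive editor, the same change can be applied non-interactively with kubectl patch (using the same key shown above):

```shell
# Merge-patch the operator ConfigMap instead of editing it interactively.
kubectl patch configmap trivy-operator-config -n trivy-system \
  --type merge -p '{"data":{"trivy.metrics.enabled":"true"}}'
```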

4.2. Access Trivy Metrics

With metrics enabled, you can scrape them using Prometheus and visualize them in Grafana or another monitoring tool of your choice. This allows you to track the number of vulnerabilities over time, monitor the effectiveness of your security patches, and respond to new threats promptly.
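As a quick local check, you can port-forward the operator and look for its metric series. The service name and port here are assumptions based on a default Helm install and may differ with custom values:

```shell
# Forward the operator's metrics port locally and grep for Trivy series.
kubectl port-forward -n trivy-system svc/trivy-operator 8080:80 &
sleep 2
curl -s http://localhost:8080/metrics | grep trivy_image_vulnerabilities
```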

5. Setting Up a Custom Webhook to Trigger a GitHub Action for Copa

To fully automate the process of patching images, you can set up a custom webhook that triggers a GitHub Action when a vulnerability is detected by Trivy Operator.

5.1. Deploy a Custom Go Webhook

You can deploy a Go application that watches for new or updated vulnerability reports and triggers a GitHub Actions workflow. Below is a step-by-step guide on how to build, configure, and deploy this custom webhook within your cluster.

5.1.1. Set Up Your Go Application

Create a new Go module (for example, trivy-webhook) and initialize it:

mkdir trivy-webhook
cd trivy-webhook
go mod init github.com/your-username/trivy-webhook

Create a main.go file with the following sample code:

package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"strings"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/dynamic"
	_ "k8s.io/client-go/plugin/pkg/client/auth" // Enable auth plugins if needed (GKE, etc.)
	"k8s.io/client-go/rest"
)

// DispatchRequest represents the payload sent to GitHub Actions
type DispatchRequest struct {
	EventType     string            `json:"event_type"`
	ClientPayload map[string]string `json:"client_payload"`
}

func main() {
	// Pull environment variables
	githubToken := strings.TrimSpace(os.Getenv("GITHUB_TOKEN"))
	githubRepoOwner := strings.TrimSpace(os.Getenv("GITHUB_REPO_OWNER"))
	githubRepoName := strings.TrimSpace(os.Getenv("GITHUB_REPO_NAME"))
	if githubToken == "" || githubRepoOwner == "" || githubRepoName == "" {
		log.Fatal("Missing required environment variables: GITHUB_TOKEN, GITHUB_REPO_OWNER, GITHUB_REPO_NAME")
	}

	// Create in-cluster K8s config
	config, err := rest.InClusterConfig()
	if err != nil {
		log.Fatalf("Failed to create in-cluster config: %v", err)
	}

	// Create a dynamic client
	dynClient, err := dynamic.NewForConfig(config)
	if err != nil {
		log.Fatalf("Failed to create dynamic client: %v", err)
	}

	// Define the GroupVersionResource for Trivy VulnerabilityReports
	gvr := schema.GroupVersionResource{
		Group:    "aquasecurity.github.io",
		Version:  "v1alpha1",
		Resource: "vulnerabilityreports",
	}

	// Watch vulnerabilityreports across all namespaces.
	// NOTE: a plain Watch() can be closed by the API server at any time;
	// a production version should re-establish the watch (or use an informer).
	watchInterface, err := dynClient.Resource(gvr).Namespace("").Watch(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatalf("Failed to watch vulnerability reports: %v", err)
	}

	log.Println("Listening for new or updated Trivy VulnerabilityReports...")

	// Process events
	for event := range watchInterface.ResultChan() {
		if event.Type == watch.Added || event.Type == watch.Modified {
			// We expect *unstructured.Unstructured
			u, ok := event.Object.(*unstructured.Unstructured)
			if !ok {
				log.Println("Could not convert event object to Unstructured")
				continue
			}

			// Parse the top-level object
			namespace := u.GetNamespace()

			// The scan results live under the top-level 'report' field
			report, found, err := unstructured.NestedMap(u.Object, "report")
			if err != nil || !found {
				log.Printf("No 'report' field found in %s/%s\n", namespace, u.GetName())
				continue
			}

			// Extract artifact -> repository + tag
			artifact, found, err := unstructured.NestedMap(report, "artifact")
			if err != nil || !found {
				log.Printf("No 'artifact' info found in %s/%s\n", namespace, u.GetName())
				continue
			}
			repository, _, _ := unstructured.NestedString(artifact, "repository")
			tag, _, _ := unstructured.NestedString(artifact, "tag")
			image := repository
			if tag != "" {
				image = fmt.Sprintf("%s:%s", repository, tag)
			}

			// Extract vulnerabilities array
			vulns, found, err := unstructured.NestedSlice(report, "vulnerabilities")
			if err != nil || !found || len(vulns) == 0 {
				log.Printf("No vulnerabilities found in %s/%s\n", namespace, u.GetName())
				continue
			}

			// For example, just take the first vulnerability
			firstVuln, _ := vulns[0].(map[string]interface{})
			vulnID, _ := firstVuln["vulnerabilityID"].(string)

			log.Printf("Detected vulnerability %q in image %s (namespace: %s)", vulnID, image, namespace)

			// Trigger GitHub Action with the extracted data
			err = triggerGithubAction(githubToken, githubRepoOwner, githubRepoName, vulnID, image, namespace)
			if err != nil {
				log.Printf("Failed to trigger GitHub Actions for %s/%s: %v", namespace, u.GetName(), err)
			} else {
				log.Printf("Triggered GitHub Actions for %s/%s successfully", namespace, u.GetName())
			}
		}
	}
}

// triggerGithubAction sends a repository_dispatch event to the specified GitHub repository
func triggerGithubAction(token, owner, repo, vulnID, image, namespace string) error {
	url := fmt.Sprintf("https://api.github.com/repos/%s/%s/dispatches", owner, repo)

	// Build the JSON payload
	dispatchBody := DispatchRequest{
		EventType: "trivy_vulnerability_detected",
		ClientPayload: map[string]string{
			"vuln_id":   vulnID,
			"image":     image,
			"namespace": namespace,
		},
	}

	var body bytes.Buffer
	if err := json.NewEncoder(&body).Encode(dispatchBody); err != nil {
		return fmt.Errorf("failed to encode JSON payload: %v", err)
	}

	req, err := http.NewRequest("POST", url, &body)
	if err != nil {
		return fmt.Errorf("failed to create request: %v", err)
	}

	// NOTE: "Bearer <token>" or "token <token>" can work.
	// With classic PAT, "token <PAT>" is standard, but "Bearer" often works as well.
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token)

	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("failed to send request: %v", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status code from GitHub: %d", resp.StatusCode)
	}

	return nil
}

Notes on the Code
  • Watching Vulnerability Reports: The above code uses a Watch() call on the vulnerabilityreports.aquasecurity.github.io/v1alpha1 resource. Whenever a resource is added or modified, the code sends a GitHub repository dispatch event.

  • DispatchRequest: The structure conforms to GitHub’s repository_dispatch JSON payload requirements.

  • Environment Variables:

    • GITHUB_TOKEN must be a Personal Access Token (classic) or a GitHub App token with permission to create repository dispatch events.
    • GITHUB_REPO_OWNER is your GitHub username or organization name.
    • GITHUB_REPO_NAME is the target repository to which you want to dispatch events.

5.1.2. Build & Push the Docker Image

Create a Dockerfile:

# Use a minimal base
FROM golang:1.20-alpine as builder

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
RUN go build -o trivy-webhook main.go

# Final stage
FROM alpine:3.18
RUN apk add --no-cache ca-certificates
COPY --from=builder /app/trivy-webhook /usr/local/bin/

ENTRYPOINT ["/usr/local/bin/trivy-webhook"]

Build and push the Docker image:

docker build -t your-dockerhub-username/trivy-webhook:latest .
docker push your-dockerhub-username/trivy-webhook:latest

5.1.3. Deploy the Webhook in Your Kubernetes Cluster

Create a Deployment and Service in the same namespace as Trivy Operator (often trivy-system) or any namespace you prefer. For example, create deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: trivy-custom-webhook
  namespace: trivy-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: trivy-custom-webhook
  template:
    metadata:
      labels:
        app: trivy-custom-webhook
    spec:
      containers:
      - name: trivy-webhook
        image: your-dockerhub-username/trivy-webhook:latest
        env:
        - name: GITHUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: github-token-secret
              key: token
        - name: GITHUB_REPO_OWNER
          value: "your-username"
        - name: GITHUB_REPO_NAME
          value: "your-repo"
        # Add any other environment vars needed
        imagePullPolicy: Always
      # Optionally set serviceAccount if you have RBAC for accessing VulnerabilityReports
      serviceAccountName: trivy-operator

Deploy it:

kubectl apply -f deployment.yaml

Make sure that the github-token-secret exists in the trivy-system namespace and contains your GitHub token. For example:

apiVersion: v1
kind: Secret
metadata:
  name: github-token-secret
  namespace: trivy-system
type: Opaque
data:
  token: <BASE64_ENCODED_GITHUB_TOKEN>
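Alternatively, the same Secret can be created directly from the raw token, letting kubectl handle the base64 encoding for you:

```shell
# kubectl base64-encodes the literal for you; replace <YOUR_PAT> with a
# real token (and never commit it to source control).
kubectl create secret generic github-token-secret \
  -n trivy-system --from-literal=token=<YOUR_PAT>
```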

5.1.4. Configure RBAC for the Webhook (If Needed)

If your cluster enforces RBAC, ensure your trivy-custom-webhook Deployment (the ServiceAccount it uses) has permission to list and watch vulnerabilityreports.aquasecurity.github.io. For example:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: trivy-webhook-role
  namespace: trivy-system
rules:
  - apiGroups: ["aquasecurity.github.io"]
    resources: ["vulnerabilityreports"]
    verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: trivy-webhook-binding
  namespace: trivy-system
subjects:
  - kind: ServiceAccount
    name: trivy-operator # or the name you set in your Deployment
    namespace: trivy-system
roleRef:
  kind: Role
  name: trivy-webhook-role
  apiGroup: rbac.authorization.k8s.io

5.1.5. Verify and Test

  1. Check Logs

    kubectl logs deployment/trivy-custom-webhook -n trivy-system
    

    You should see a message indicating it’s watching vulnerability reports.

  2. Generate a Vulnerability Report
    Install or update a Deployment in your cluster to trigger Trivy Operator to scan for vulnerabilities. When a new vulnerability report is generated, your logs should show that it detected a new report and sent a dispatch event to GitHub.

5.2. Set Up a GitHub Action to Patch the Image Using Copa

Now, create a GitHub Action in your repository that uses the copa-action to patch the image.

  1. Create a GitHub Action Workflow:

    In your GitHub repository, create a new file at .github/workflows/patch-image.yml:

     name: Patch Docker Image
    
     on:
       repository_dispatch:
         types: [trivy_vulnerability_detected]
    
     jobs:
       patch:
         runs-on: ubuntu-latest
    
         steps:
           - name: Checkout repository
             uses: actions/checkout@v4
    
           # Copacetic will run its own Trivy scan automatically if you do NOT provide a 'report'.
           - name: Patch the Docker Image with Copacetic
             id: copa
             uses: project-copacetic/copa-action@main
             with: 
               image: ${{ github.event.client_payload.image }}
    
             # see https://github.com/docker/login-action#usage for other registries
           - name: Login to GHCR
             if: steps.copa.conclusion == 'success'
             id: login
             uses: docker/login-action@343f7c4344506bcbf9b4de18042ae17996df046d # v3.0.0
             with:
               registry: ghcr.io
               username: ${{ github.actor }}
               password: ${{ secrets.GITHUB_TOKEN }}
    
           - name: Push patched image
             if: steps.login.conclusion == 'success'
             run: |
               # copa tags the patched image with a -patched suffix on the tag,
               # so nginx:latest becomes nginx:latest-patched
               docker tag ${{ github.event.client_payload.image }}-patched your-docker-repo/${{ github.event.client_payload.image }}-patched
               docker push your-docker-repo/${{ github.event.client_payload.image }}-patched
    
    • Replace your-docker-repo with your Docker registry’s name.
    • This workflow will automatically run when the webhook from Trivy Operator triggers it, patching the vulnerable image using copa-action and pushing the patched image to your Docker registry.

πŸŽ‰ Wrapping Up

By integrating Trivy Operator with Copacetic and GitHub Actions, you’ve created a fully automated workflow for detecting, patching, and monitoring vulnerabilities in your Kubernetes environment. Trivy Operator scans and exports metrics, while the custom webhook and GitHub Action using copa-action handle the patching process. This setup ensures that your container images are always up-to-date and secure, with minimal manual intervention.

This approach provides an efficient, automated solution for managing security in your Kubernetes environment, allowing you to focus on innovation with peace of mind! πŸš€πŸ”’

For further reading, see the Trivy Operator, Copacetic, and GitHub Actions documentation referenced throughout this guide.

Happy clustering and stay safe!