Kubernetes Security Hardening: A DevSecOps Engineer's Playbook
Securing Kubernetes clusters from day zero to production
Kubernetes security isn't an afterthought—it should be built into every layer of your cluster from day one. After securing dozens of production K8s environments, here's my battle-tested approach to hardening clusters.
The Kubernetes Security Model Reality Check
Most organizations deploy Kubernetes with defaults that prioritize convenience over security. That's a mistake that will bite you later. Let's fix that from the ground up.
Security Layers in Kubernetes
Think of K8s security like an onion:
- Cluster Infrastructure (nodes, network, etcd)
- Kubernetes API (RBAC, admission controllers)
- Workload Security (pods, containers, images)
- Runtime Security (monitoring, incident response)
Cluster Infrastructure Hardening
Node Security Configuration
Start with hardened node images and proper configuration:
# CIS Kubernetes Benchmark automated checks
curl -sSL https://github.com/aquasecurity/kube-bench/releases/latest/download/kube-bench_linux_amd64.tar.gz | tar xz
./kube-bench --config-dir cfg/ --config cfg/config.yaml
# Network security - disable unnecessary services
systemctl disable --now cups
systemctl disable --now bluetooth
systemctl disable --now avahi-daemon
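# Kernel settings: the first two are networking prerequisites for Kubernetes, not hardening
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-iptables = 1' >> /etc/sysctl.conf
# Kernel hardening: hide kernel pointer addresses from all users
echo 'kernel.kptr_restrict = 2' >> /etc/sysctl.conf
sysctl -p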
etcd Security Best Practices
Protect the brain of your cluster:
# etcd TLS configuration
apiVersion: v1
kind: Pod
metadata:
  name: etcd
spec:
  containers:
  - name: etcd
    # registry.k8s.io replaces the frozen k8s.gcr.io registry
    image: registry.k8s.io/etcd:3.5.1-0
    command:
    - etcd
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --client-cert-auth=true
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --peer-client-cert-auth=true
    - --auto-tls=false
    - --peer-auto-tls=false
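To keep these flags from drifting, a small check can run against the etcd command line in CI. The following is a minimal sketch, assuming the command is available as a list of strings (for example, loaded from the static pod manifest). The function names are hypothetical, and the check is deliberately strict: it expects every flag to be set explicitly, as the manifest above does.

```python
from typing import Dict, List

# Flags and values taken from the etcd manifest above.
REQUIRED_VALUES: Dict[str, str] = {
    "--client-cert-auth": "true",
    "--peer-client-cert-auth": "true",
    "--auto-tls": "false",
    "--peer-auto-tls": "false",
}

# Flags that must point at some certificate or key path.
REQUIRED_PATHS: List[str] = [
    "--cert-file",
    "--key-file",
    "--trusted-ca-file",
    "--peer-cert-file",
    "--peer-key-file",
    "--peer-trusted-ca-file",
]


def parse_flags(command: List[str]) -> Dict[str, str]:
    """Turn ['--a=b', ...] into {'--a': 'b', ...}, ignoring non-flag arguments."""
    flags: Dict[str, str] = {}
    for arg in command:
        if arg.startswith("--"):
            name, _, value = arg.partition("=")
            flags[name] = value
    return flags


def etcd_tls_findings(command: List[str]) -> List[str]:
    """Return one human-readable finding per missing or unexpected TLS flag."""
    flags = parse_flags(command)
    findings: List[str] = []
    for name, expected in REQUIRED_VALUES.items():
        if flags.get(name) != expected:
            findings.append(f"{name} should be {expected}")
    for name in REQUIRED_PATHS:
        if not flags.get(name):
            findings.append(f"{name} is not set")
    return findings
```

An empty result means every flag in the manifest is present with the expected value; any other result is a list you can print and fail the pipeline on.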
API Server Hardening
Robust RBAC Configuration
Implement least privilege access from day one:
# Example: Developer role with limited permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: development
  name: developer
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "secrets"]
  verbs: ["get", "list", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: development
subjects:
- kind: User
  name: jane.developer
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
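Least privilege is easier to hold onto if every Role gets a quick review for powerful grants. The sketch below assumes a Role has been loaded into a Python dict (for example, from `kubectl get role -o json`). The `SENSITIVE_RESOURCES` set and the `risky_rules` name are illustrative choices, not part of any Kubernetes API.

```python
from typing import Dict, List

# Resources worth a second look in any namespaced Role (an illustrative list).
SENSITIVE_RESOURCES = {"secrets", "pods/exec", "pods/portforward"}


def risky_rules(role: Dict) -> List[str]:
    """Flag wildcard grants and access to sensitive resources in a Role."""
    findings: List[str] = []
    for rule in role.get("rules", []):
        resources = set(rule.get("resources", []))
        verbs = set(rule.get("verbs", []))
        if "*" in resources or "*" in verbs:
            findings.append("wildcard in resources or verbs")
        for resource in sorted(resources & SENSITIVE_RESOURCES):
            findings.append(f"grants {sorted(verbs)} on {resource}")
    return findings
```

Run against the developer Role above, this flags the `secrets` grant and both pod subresources. That is a prompt to confirm developers really need them, not proof that the Role is wrong.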
Admission Controllers Configuration
Enable security-focused admission controllers:
# API server configuration
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
spec:
  containers:
  - name: kube-apiserver
    image: registry.k8s.io/kube-apiserver:v1.28.0
    command:
    - kube-apiserver
    # PodSecurityPolicy was removed in v1.25 and SecurityContextDeny is deprecated;
    # the built-in PodSecurity plugin enforces Pod Security Standards instead
    - --enable-admission-plugins=NodeRestriction,ResourceQuota,LimitRanger,PodSecurity,AlwaysPullImages
    - --audit-log-path=/var/log/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
Comprehensive Audit Policy
Track everything that matters:
# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log all security-sensitive operations at Metadata level
  - level: Metadata
    namespaces: ["kube-system", "kube-public"]
    verbs: ["create", "update", "patch", "delete"]
  # Log secret operations at Metadata only, so secret values never land in the audit log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Log RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
  # Log pod exec and portforward
  - level: Request
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward"]
  # Log everything else at Metadata level
  - level: Metadata
    omitStages:
      - RequestReceived
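Rule order matters in an audit policy: the API server compares each event against the rules in order, and the first match sets the level. The sketch below is a simplified model of that behavior, not the real matcher. It only understands `namespaces`, `verbs`, and `resources`, and the event dict shape is an assumption made for illustration.

```python
from typing import Dict, List


def rule_matches(rule: Dict, event: Dict) -> bool:
    """A rule matches when every selector it sets matches; unset selectors match anything."""
    if "namespaces" in rule and event["namespace"] not in rule["namespaces"]:
        return False
    if "verbs" in rule and event["verb"] not in rule["verbs"]:
        return False
    if "resources" in rule:
        for selector in rule["resources"]:
            if selector["group"] != event["group"]:
                continue
            names = selector.get("resources", [])
            # An empty resource list means "every resource in this API group".
            if not names or event["resource"] in names:
                return True
        return False
    return True


def audit_level(rules: List[Dict], event: Dict) -> str:
    """Return the level of the first matching rule."""
    for rule in rules:
        if rule_matches(rule, event):
            return rule["level"]
    return "None"  # assumption in this sketch: unmatched events are not logged


RULES = [
    {"level": "Metadata", "namespaces": ["kube-system"],
     "verbs": ["create", "update", "patch", "delete"]},
    {"level": "RequestResponse",
     "resources": [{"group": "rbac.authorization.k8s.io"}]},
    {"level": "Metadata"},
]
```

With these rules, a RoleBinding created in `kube-system` is logged at `Metadata` because the namespace rule comes first, while the same change in `production` reaches the RBAC rule and is logged at `RequestResponse`. Put the most specific rules first if you want them to win.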
Pod Security Standards Implementation
PodSecurityPolicy was removed in Kubernetes v1.25. Use Pod Security Standards instead, enforced per namespace by the built-in Pod Security Admission controller:
# Namespace with restricted security profile
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Secure Pod Configuration Template
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:v1.0.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 10001
      capabilities:
        drop:
        - ALL
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
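The template sets `runAsNonRoot` and `runAsUser` at both pod and container level. Where a field exists at both levels, the container value takes precedence. The following sketch shows that merge for a pod spec loaded as a dict. The function name and the `SHARED_FIELDS` tuple are illustrative, and the tuple covers only the overlapping fields used in this post.

```python
from typing import Any, Dict

# Fields that can be set on both the pod and the container securityContext.
# fsGroup is pod-only, so it is deliberately not inherited here.
SHARED_FIELDS = ("runAsNonRoot", "runAsUser", "runAsGroup", "seccompProfile")


def effective_security_context(pod_spec: Dict[str, Any],
                               container: Dict[str, Any]) -> Dict[str, Any]:
    """Merge pod-level defaults with container-level settings; the container wins."""
    pod_ctx = pod_spec.get("securityContext", {})
    container_ctx = container.get("securityContext", {})
    merged = {key: pod_ctx[key] for key in SHARED_FIELDS if key in pod_ctx}
    merged.update(container_ctx)
    return merged
```

This is why a pod-level `runAsNonRoot: true` is only a default: a container that sets its own `runAsUser: 0` still overrides the pod-level user, which is one reason to enforce the restricted profile at admission time as well.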
Network Security Implementation
NetworkPolicies for Microsegmentation
Implement zero-trust networking:
# Default deny-all policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Allow specific communication patterns
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web-frontend
    ports:
    - protocol: TCP
      port: 8080
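NetworkPolicies are additive: once any policy selects a pod for ingress, traffic is denied unless some rule allows it. The sketch below models that for the two policies above. It is a deliberately small model that assumes `matchLabels`-only selectors, a single namespace, and no protocol matching, and its function names are hypothetical.

```python
from typing import Dict, List


def selects(selector: Dict, labels: Dict[str, str]) -> bool:
    """matchLabels-only selector; an empty selector selects every pod."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def ingress_allowed(policies: List[Dict], dst_labels: Dict[str, str],
                    src_labels: Dict[str, str], port: int) -> bool:
    """Decide whether src may reach dst on port under the given policies."""
    applicable = [
        p for p in policies
        if "Ingress" in p["policyTypes"] and selects(p["podSelector"], dst_labels)
    ]
    if not applicable:
        return True  # no policy selects the pod, so ingress is unrestricted
    for policy in applicable:
        for rule in policy.get("ingress", []):
            peers = rule.get("from", [])
            ports = rule.get("ports", [])
            # An omitted 'from' or 'ports' list means "any source" or "any port".
            from_ok = not peers or any(selects(p["podSelector"], src_labels) for p in peers)
            port_ok = not ports or any(p["port"] == port for p in ports)
            if from_ok and port_ok:
                return True
    return False


POLICIES = [
    {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    {"podSelector": {"matchLabels": {"app": "api-server"}},
     "policyTypes": ["Ingress"],
     "ingress": [{"from": [{"podSelector": {"matchLabels": {"app": "web-frontend"}}}],
                  "ports": [{"protocol": "TCP", "port": 8080}]}]},
]
```

Under this model, `web-frontend` reaches `api-server` on 8080 and everything else in the namespace is denied, which is the behavior the pair of policies is meant to produce.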
Service Mesh Security with Istio
Implement mTLS and fine-grained access control:
# Enable strict mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
# Authorization policy
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: api-access
  namespace: production
spec:
  selector:
    matchLabels:
      app: api-server
  action: ALLOW
  rules:
  # from and to sit in one rule so both must match; separate rules would be ORed
  - from:
    - source:
        principals: ["cluster.local/ns/production/sa/web-frontend"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/v1/*"]
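Istio ORs the entries in `rules` and ANDs the `from` and `to` sections inside a single entry, so whether the source and operation constraints share one list item changes what the policy allows. The sketch below is a simplified model of that evaluation. It flattens the `source` and `operation` wrappers into plain dicts and supports only exact and trailing-`*` path matches. Both are assumptions made to keep the example short.

```python
from typing import Dict, List


def path_matches(pattern: str, path: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return pattern == path


def rule_allows(rule: Dict, principal: str, method: str, path: str) -> bool:
    """Within one rule, 'from' and 'to' must both match; a missing section matches anything."""
    from_ok = True
    if "from" in rule:
        from_ok = any(principal in src["principals"] for src in rule["from"])
    to_ok = True
    if "to" in rule:
        to_ok = any(
            method in op["methods"] and any(path_matches(p, path) for p in op["paths"])
            for op in rule["to"]
        )
    return from_ok and to_ok


def request_allowed(rules: List[Dict], principal: str, method: str, path: str) -> bool:
    """An ALLOW policy admits the request if any single rule matches."""
    return any(rule_allows(rule, principal, method, path) for rule in rules)


FRONTEND = "cluster.local/ns/production/sa/web-frontend"
SOURCE = {"principals": [FRONTEND]}
OPERATION = {"methods": ["GET", "POST"], "paths": ["/api/v1/*"]}

SPLIT_RULES = [{"from": [SOURCE]}, {"to": [OPERATION]}]    # two rules: either may match
COMBINED_RULES = [{"from": [SOURCE], "to": [OPERATION]}]   # one rule: both must match
```

With `SPLIT_RULES`, any caller can `GET /api/v1/users` because the second rule has no source restriction. With `COMBINED_RULES`, only the `web-frontend` service account can, and only with the listed methods and paths.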
Container and Image Security
Image Security Scanning Pipeline
Integrate security scanning into your CI/CD:
#!/bin/bash
# Image security scanning script
IMAGE_NAME="$1"
SEVERITY_THRESHOLD="HIGH"
if [ -z "${IMAGE_NAME}" ]; then
  echo "Usage: $0 <image>" >&2
  exit 2
fi
# Trivy scanning: fail on findings at the threshold severity or CRITICAL
if ! trivy image --severity "${SEVERITY_THRESHOLD},CRITICAL" --exit-code 1 "${IMAGE_NAME}"; then
  echo "Image failed security scan with ${SEVERITY_THRESHOLD} or CRITICAL vulnerabilities"
  exit 1
fi
# Cosign image signing verification
if ! cosign verify --key cosign.pub "${IMAGE_NAME}"; then
  echo "Image signature verification failed"
  exit 1
fi
echo "Image passed security checks"
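The script hardcodes `,CRITICAL` after the threshold, which only produces the right severity list when the threshold is `HIGH`. If you want the threshold to be configurable, the list can be derived instead. This is a small sketch with hypothetical function names, using Trivy's severity names in ascending order.

```python
from typing import List

# Trivy severity names, lowest to highest.
SEVERITY_ORDER = ["UNKNOWN", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def severities_at_or_above(threshold: str) -> List[str]:
    """Every severity from the threshold upward, in ascending order."""
    return SEVERITY_ORDER[SEVERITY_ORDER.index(threshold):]


def severity_flag(threshold: str) -> str:
    """Comma-separated value suitable for a --severity argument."""
    return ",".join(severities_at_or_above(threshold))


def scan_fails(found: List[str], threshold: str) -> bool:
    """True if any finding is at or above the threshold."""
    blocked = set(severities_at_or_above(threshold))
    return any(severity in blocked for severity in found)
```

For a `HIGH` threshold, `severity_flag` yields `HIGH,CRITICAL`, matching the script. For `MEDIUM` it also includes `HIGH`, which the hardcoded form would miss.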
Distroless Container Best Practices
Use minimal base images:
# Multi-stage build with distroless final image
FROM golang:1.19-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -ldflags '-extldflags "-static"' -o main .
FROM gcr.io/distroless/static-debian11
COPY --from=builder /app/main /
EXPOSE 8080
USER 10001
ENTRYPOINT ["/main"]
Runtime Security Monitoring
Falco Rules for Runtime Threat Detection
Deploy Falco for runtime security monitoring:
# Custom Falco rules
- rule: Unexpected K8s NodePort Connection
  desc: Detect attempts to connect to K8s NodePort services
  condition: >
    (inbound_outbound) and
    fd.sport >= 30000 and fd.sport <= 32767 and
    not proc.name in (kube-proxy, kubelet)
  output: >
    Unexpected K8s NodePort connection
    (connection=%fd.name sport=%fd.sport dport=%fd.dport
    proc=%proc.name command=%proc.cmdline)
  priority: WARNING
- rule: Detect crypto mining
  desc: Detect cryptocurrency mining activities
  condition: >
    spawned_process and
    (proc.name in (xmrig, minergate, ccminer, cgminer) or
    proc.cmdline contains "stratum+tcp" or
    proc.cmdline contains "mining.pool")
  output: >
    Crypto mining process detected
    (user=%user.name command=%proc.cmdline)
  priority: CRITICAL
OPA Gatekeeper Policies
Implement policy-as-code with Gatekeeper:
# Require security context
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredsecuritycontext
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredSecurityContext
      validation:
        openAPIV3Schema:
          type: object
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredsecuritycontext

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.runAsNonRoot
          msg := "Container must run as non-root user"
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          # "not ... == false" also fires when the field is unset;
          # "!= false" would be undefined for a missing field and skip the violation
          not container.securityContext.allowPrivilegeEscalation == false
          msg := "Container must not allow privilege escalation"
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredSecurityContext
metadata:
  name: must-have-security-context
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["production", "staging"]
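In Rego, a comparison against a missing field is undefined rather than true, so policies like this need to treat "unset" as a violation explicitly. The Python sketch below mirrors the intended logic of the two rules so it can be unit-tested outside the cluster. It is an illustration of the checks, not a replacement for the Gatekeeper template, and the function name is hypothetical.

```python
from typing import Dict, List


def security_context_violations(pod: Dict) -> List[str]:
    """Mirror of the two policy checks; a missing field counts as a violation."""
    messages: List[str] = []
    for container in pod.get("spec", {}).get("containers", []):
        ctx = container.get("securityContext") or {}
        # runAsNonRoot must be explicitly true.
        if ctx.get("runAsNonRoot") is not True:
            messages.append("Container must run as non-root user")
        # allowPrivilegeEscalation must be explicitly false.
        if ctx.get("allowPrivilegeEscalation") is not False:
            messages.append("Container must not allow privilege escalation")
    return messages
```

A container with no `securityContext` at all produces both messages, which is the case that is easiest to miss when the policy only compares values that happen to be present.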
Secrets Management Strategy
External Secrets Operator Configuration
Keep the source of truth for secrets outside the cluster. External Secrets Operator still syncs the values into a Kubernetes Secret, so keep etcd encryption at rest enabled as well:
# External Secrets Operator with AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-west-2
      auth:
        secretRef:
          accessKeyIDSecretRef:
            name: awssm-secret
            key: access-key
          secretAccessKeySecretRef:
            name: awssm-secret
            key: secret-access-key
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  # Short interval for demonstration; a longer one reduces calls to the provider
  refreshInterval: 15s
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: prod/database
      property: username
  - secretKey: password
    remoteRef:
      key: prod/database
      property: password
Automated Security Testing
Kubernetes Security Testing Script
#!/usr/bin/env python3
"""
Kubernetes Security Assessment Script
"""
import json
import subprocess
import sys
from typing import Dict, List


class K8sSecurityChecker:
    def __init__(self):
        self.results = []

    def run_kube_bench(self) -> Dict:
        """Run CIS Kubernetes Benchmark checks"""
        try:
            result = subprocess.run(
                ['kube-bench', '--json'],
                capture_output=True,
                text=True,
                check=True
            )
            return json.loads(result.stdout)
        except (subprocess.CalledProcessError, FileNotFoundError, json.JSONDecodeError):
            # Covers a failing run, a missing binary, and unparseable output
            return {"error": "kube-bench failed"}

    def check_rbac_permissions(self) -> List[Dict]:
        """Check for overly permissive RBAC"""
        dangerous_permissions = []
        # Check for cluster-admin bindings
        try:
            result = subprocess.run([
                'kubectl', 'get', 'clusterrolebindings',
                '-o', 'json'
            ], capture_output=True, text=True, check=True)
            bindings = json.loads(result.stdout)
            for binding in bindings['items']:
                if binding['roleRef']['name'] == 'cluster-admin':
                    dangerous_permissions.append({
                        'type': 'cluster-admin-binding',
                        'name': binding['metadata']['name'],
                        'subjects': binding.get('subjects', [])
                    })
        except (subprocess.CalledProcessError, FileNotFoundError):
            print("Warning: could not list clusterrolebindings", file=sys.stderr)
        return dangerous_permissions

    def check_pod_security_standards(self) -> List[Dict]:
        """Check Pod Security Standards compliance"""
        violations = []
        try:
            result = subprocess.run([
                'kubectl', 'get', 'pods', '--all-namespaces',
                '-o', 'json'
            ], capture_output=True, text=True, check=True)
            pods = json.loads(result.stdout)
            for pod in pods['items']:
                issues = self._analyze_pod_security(pod)
                if issues:
                    violations.append({
                        'pod': f"{pod['metadata']['namespace']}/{pod['metadata']['name']}",
                        'issues': issues
                    })
        except (subprocess.CalledProcessError, FileNotFoundError):
            print("Warning: could not list pods", file=sys.stderr)
        return violations

    def _analyze_pod_security(self, pod: Dict) -> List[str]:
        """Analyze individual pod for security issues"""
        issues = []
        spec = pod.get('spec', {})
        pod_non_root = spec.get('securityContext', {}).get('runAsNonRoot')
        # Check containers
        for container in spec.get('containers', []):
            sec_ctx = container.get('securityContext', {})
            # Container-level runAsNonRoot overrides the pod-level value
            if not sec_ctx.get('runAsNonRoot', pod_non_root):
                issues.append(f"Container {container['name']} may be running as root")
            if sec_ctx.get('privileged'):
                issues.append(f"Container {container['name']} is privileged")
            if sec_ctx.get('allowPrivilegeEscalation', True):
                issues.append(f"Container {container['name']} allows privilege escalation")
        return issues

    def generate_report(self) -> str:
        """Generate comprehensive security report"""
        print("Running Kubernetes Security Assessment...")
        # Run checks
        cis_results = self.run_kube_bench()
        rbac_issues = self.check_rbac_permissions()
        pod_violations = self.check_pod_security_standards()
        report = f"""
Kubernetes Security Assessment Report
=====================================
CIS Benchmark Results:
{json.dumps(cis_results, indent=2)}
RBAC Issues Found: {len(rbac_issues)}
{json.dumps(rbac_issues, indent=2)}
Pod Security Violations: {len(pod_violations)}
{json.dumps(pod_violations, indent=2)}
Recommendations:
- Review and remediate CIS benchmark failures
- Implement least-privilege RBAC policies
- Enable Pod Security Standards
- Regular security scanning and monitoring
"""
        return report


if __name__ == "__main__":
    checker = K8sSecurityChecker()
    report = checker.generate_report()
    print(report)
    # Exit with error if critical issues found (a coarse string match on the report text)
    if "FAIL" in report or "privileged" in report:
        sys.exit(1)
Production Deployment Checklist
Pre-Deployment Security Validation
#!/bin/bash
# Pre-deployment security checklist
echo "🔒 Running Kubernetes Security Pre-Deployment Checks..."
# 1. List pods whose pod-level securityContext does not set runAsNonRoot: true
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.securityContext.runAsNonRoot}{"\n"}{end}' | grep -v "true"
# 2. Validate network policies exist
kubectl get networkpolicies --all-namespaces
# 3. Check for resource limits
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].resources.limits}{"\n"}{end}' | grep -v "map"
# 4. Verify image signatures
for image in $(kubectl get pods --all-namespaces -o jsonpath='{.items[*].spec.containers[*].image}' | tr ' ' '\n' | sort -u); do
  echo "Checking signature for $image"
  cosign verify --key cosign.pub "$image" || echo "❌ No valid signature for $image"
done
# 5. Review Gatekeeper constraints and the violations reported on them
kubectl get constraints
echo "✅ Security checks completed"
Monitoring and Incident Response
Security Metrics to Track
# Prometheus monitoring rules
# Metric and label names below are illustrative; map them to the metrics your
# audit pipeline and kube-state-metrics actually export before deploying
groups:
- name: kubernetes-security
  rules:
  - alert: UnauthorizedAPIAccess
    expr: increase(apiserver_audit_total{verb="create",objectRef_resource="pods/exec"}[5m]) > 0
    labels:
      severity: critical
    annotations:
      summary: "Unauthorized pod exec detected"
  - alert: PrivilegedPodCreated
    expr: increase(kube_pod_container_info{container_security_context_privileged="true"}[5m]) > 0
    labels:
      severity: high
    annotations:
      summary: "Privileged pod created"
  - alert: FailedRBACCheck
    expr: increase(apiserver_audit_total{verb="create",objectRef_resource="rolebindings",response_code!~"2.."}[5m]) > 3
    labels:
      severity: warning
    annotations:
      summary: "Multiple failed RBAC operations detected"
Wrapping Up
Kubernetes security isn't a one-time setup—it's an ongoing process. Start with these fundamentals:
- Harden the infrastructure before deploying workloads
- Implement defense in depth across all layers
- Automate security testing in your CI/CD pipeline
- Monitor continuously and respond quickly to threats
- Keep learning as the threat landscape evolves
Remember: security is a journey, not a destination. The key is building security into your processes from the beginning rather than bolting it on later.