Comprehensive Security Guide for Kubernetes

In-depth guide to Kubernetes container orchestration: cluster setup, deployments, services, ingress, scaling, security best practices, and monitoring.

1. Introduction and Purpose

Kubernetes has become the de facto standard for container orchestration, but its complex architecture introduces numerous security challenges. This guide provides systematic instructions to help security engineers, DevOps professionals, and system administrators implement and maintain secure Kubernetes deployments.

This guide focuses on security-first practices rather than general Kubernetes operations, emphasizing defense-in-depth strategies, least privilege principles, and continuous security validation.

2. Table of Contents

  1. Introduction and Purpose
  2. Table of Contents
  3. Installation
  4. Configuration
  5. Security Best Practices
  6. Integrations
  7. Testing and Validation
  8. References and Further Reading
  9. Appendices

3. Installation

3.1 Pre-Installation Security Planning

Before installing Kubernetes, establish your security requirements and constraints:

  1. Define Security Requirements:
    • Determine compliance requirements (e.g., PCI-DSS, HIPAA, GDPR)
    • Establish data classification levels
    • Define network segmentation requirements
    • Identify authentication and authorization requirements
  2. Security Architecture Planning:
    • Document trust boundaries
    • Establish network architecture (e.g., network plugins, service mesh requirements)
    • Plan for secret management
    • Determine logging and monitoring requirements
  3. Security Responsibility Matrix:
    • For managed Kubernetes (EKS, GKE, AKS), understand the shared responsibility model
    • Document which security controls are managed by the provider vs. your team

3.2 Platform-Specific Installation

3.2.1 Linux Installation (kubeadm)

For self-hosted Kubernetes on Linux, follow these secure installation steps:

# 1. Create a dedicated service account for Kubernetes components
sudo useradd -r -m -s /sbin/nologin kube-service

# 2. Install required packages with integrity verification
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update && sudo apt-get install -y kubelet kubeadm kubectl

# 3. Verify the repository signing key (apt-key is deprecated on modern systems)
gpg --show-keys /etc/apt/keyrings/kubernetes-apt-keyring.gpg
# Compare the output against the published fingerprint, e.g. 54A647F9048D5688D7DA2ABE6A030B21BA07F4FB

# 4. Disable swap (required for Kubernetes)
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

# 5. Load required kernel modules with secure parameters
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# 6. Set secure sysctl parameters
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system

# 7. Configure containerd with secure defaults
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd

# 8. Initialize cluster with security-focused configuration
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --service-cidr=10.96.0.0/12 \
  --kubernetes-version="v1.29.2" \
  --apiserver-cert-extra-sans="$EXTERNAL_IP,$EXTERNAL_DNS" \
  --control-plane-endpoint="$CONTROL_PLANE_ENDPOINT" \
  --upload-certs

3.2.2 Container-Optimized Installation (k3s)

For edge deployments with minimal attack surface:

# Install k3s with enhanced security settings (PodSecurityPolicy was removed in
# Kubernetes v1.25; rely on the built-in PodSecurity admission controller instead)
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--secrets-encryption --kube-apiserver-arg='enable-admission-plugins=NodeRestriction,AlwaysPullImages'" sh -

# Verify installation
sudo k3s check-config

3.2.3 Managed Kubernetes (EKS)

For AWS EKS, using infrastructure as code with security controls:

# Example with eksctl - using a config file for better security control
cat > eks-cluster-secure.yaml << EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: secure-cluster
  region: us-west-2
  version: "1.29"
vpc:
  cidr: 10.0.0.0/16
  clusterEndpoints:
    privateAccess: true
    publicAccess: false  # Restrict public API server access 
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: ebs-csi-controller-sa
      namespace: kube-system
    wellKnownPolicies:
      ebsCSIController: true
managedNodeGroups:
  - name: managed-nodes
    instanceType: m5.large
    desiredCapacity: 3
    privateNetworking: true  # Nodes in private subnets only
    securityGroups:
      attachIDs: ["sg-12345678"]  # Pre-defined security groups
    ssh:
      allow: false  # Disable SSH access to nodes
    maxPodsPerNode: 50  # Limit pod density per node
    volumeSize: 100
    volumeType: gp3
    volumeEncrypted: true  # Encrypt node volumes
    disableIMDSv1: true  # Disable legacy instance metadata service
    iam:
      withAddonPolicies:
        autoScaler: true
secretsEncryption:
  keyARN: arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab
EOF

eksctl create cluster -f eks-cluster-secure.yaml

3.3 Installation Verification

Once installed, verify the security posture of your cluster:

  1. Check Component Versions and Signatures:
# Verify Kubernetes component versions (the --short flag was removed in recent kubectl releases)
kubectl version

# Check running containers match expected signatures (example for containerd)
sudo crictl images --digests

# Verify the API server certificate (kubernetes.default.svc only resolves in-cluster;
# use the control-plane endpoint when checking from a node)
echo | openssl s_client -connect "$CONTROL_PLANE_ENDPOINT:6443" -showcerts
  2. Verify Core Security Controls:
# Check API server security configuration
kubectl get pod kube-apiserver-$(hostname) -n kube-system -o yaml | grep enable-admission-plugins

# Verify encryption at rest (encryption-provider-config is set on the API server, not etcd)
kubectl get pod kube-apiserver-$(hostname) -n kube-system -o yaml | grep encryption-provider-config

# Check network policy enforcement
kubectl get pods -n kube-system -l k8s-app=calico-node # or other CNI pods
  3. Security Assessment Script:
#!/bin/bash
# Quick post-install security assessment

echo "Checking admission controllers..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep admission-plugins

echo "Checking authorization mode..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep authorization-mode

echo "Checking anonymous auth status..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep anonymous-auth

echo "Checking audit logging..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep audit

echo "Checking encryption configuration..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep encryption-provider-config

echo "Checking kubelet certificate usage..."
kubectl get pod -n kube-system -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].command}' | grep kubelet-certificate

3.4 Hardened Installation Options

3.4.1 Control Plane Hardening

# Example kubeadm-config.yaml with security focus
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///run/containerd/containerd.sock"
  taints:
  - key: "node-role.kubernetes.io/control-plane"
    effect: "NoSchedule"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.2
networking:
  podSubnet: "192.168.0.0/16"
  serviceSubnet: "10.96.0.0/12"
apiServer:
  extraArgs:
    # Authentication security
    anonymous-auth: "false"
    # Authorization security
    authorization-mode: "Node,RBAC"
    # API security (PodSecurityPolicy was removed in v1.25; the built-in
    # PodSecurity admission controller is enabled by default in its place)
    enable-admission-plugins: "NodeRestriction,AlwaysPullImages,ServiceAccount,NamespaceLifecycle,LimitRanger,ResourceQuota"
    profiling: "false"
    # Encryption at rest (this flag belongs to the API server, not etcd)
    encryption-provider-config: "/etc/kubernetes/etcd/encryption-config.yaml"
    # Audit logging
    audit-log-path: "/var/log/kubernetes/audit/audit.log"
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
    audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    # TLS security
    tls-cipher-suites: "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
    tls-min-version: "VersionTLS12"
  extraVolumes:
  - name: "audit-policy"
    hostPath: "/etc/kubernetes/audit-policy.yaml"
    mountPath: "/etc/kubernetes/audit-policy.yaml"
    readOnly: true
    pathType: File
  - name: "audit-logs"
    hostPath: "/var/log/kubernetes/audit"
    mountPath: "/var/log/kubernetes/audit"
    pathType: DirectoryOrCreate
  - name: "encryption-config"
    hostPath: "/etc/kubernetes/etcd/encryption-config.yaml"
    mountPath: "/etc/kubernetes/etcd/encryption-config.yaml"
    readOnly: true
    pathType: File
controllerManager:
  extraArgs:
    # TLS security
    use-service-account-credentials: "true"
    tls-min-version: "VersionTLS12"
    # Reduce permissions
    profiling: "false"
scheduler:
  extraArgs:
    profiling: "false"
    # TLS security
    tls-min-version: "VersionTLS12"
etcd:
  local:
    serverCertSANs:
      - "127.0.0.1"
      - "localhost"
      - "REPLACE_WITH_CONTROL_PLANE_IP"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Kubelet security
protectKernelDefaults: true
readOnlyPort: 0
tlsCipherSuites:
- "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
- "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
- "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305"
- "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
- "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
- "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"

3.4.2 ETCD Encryption Configuration

# /etc/kubernetes/etcd/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
      - pods
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-key>
      - identity: {}

4. Configuration

4.1 API Server Security Configuration

The API server is the primary entry point to your Kubernetes cluster. Securing it properly is critical.

4.1.1 Authentication

# API Server authentication configuration options 
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.1.10
    # Authentication options
    - --anonymous-auth=false                                # Disable anonymous authentication
    # (--basic-auth-file was removed in v1.19; basic authentication is no longer supported)
    - --client-ca-file=/etc/kubernetes/pki/ca.crt          # Certificate authority for client auth
    - --enable-bootstrap-token-auth=true                    # Needed only for cluster bootstrap
    - --oidc-issuer-url=https://example.com/identity       # OIDC provider URL if using OIDC
    - --oidc-client-id=kubernetes                          # OIDC client ID
    - --oidc-username-claim=email                          # OIDC claim to use as username
    - --oidc-groups-claim=groups                           # OIDC claim to use for groups
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub # Service account public key
    - --service-account-lookup=true                        # Validate service account tokens
    # (omit --token-auth-file entirely; do not use static token authentication)

4.1.2 Authorization

# API Server authorization configuration
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    # Authorization options
    - --authorization-mode=Node,RBAC                        # Use Node and RBAC authorization
    - --enable-admission-plugins=NodeRestriction,PodSecurity # Enable key admission controllers

4.1.3 Secure Communications

# API Server TLS configuration
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    # TLS configuration
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt     # TLS certificate
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key # TLS private key
    - --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    - --tls-min-version=VersionTLS12                        # Minimum TLS version

4.1.4 Auditing

Implement comprehensive API server auditing:

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      resources: ["pods"]

  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to certain non-resource URL paths
  - level: None
    nonResourceURLs:
    - "/healthz"
    - "/metrics"
    - "/readyz"
    - "/livez"

  # Log changes to ConfigMap and Secret objects at RequestResponse level
  - level: RequestResponse
    resources:
    - group: "" # core API group
      resources: ["configmaps", "secrets"]
    verbs: ["create", "patch", "update", "delete"]
    
  # Log all other resources in core and extensions at Metadata level
  - level: Metadata
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included
    
  # A catch-all rule to log all other requests at the Metadata level
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

API server configuration for audit logging:

# API Server audit configuration
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    # Audit log configuration
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30                                 # Maximum days to retain logs
    - --audit-log-maxbackup=10                              # Maximum number of log files
    - --audit-log-maxsize=100                               # Maximum size in MB before rotation
    - --audit-log-format=json                               # Format for audit logs
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml # Audit policy file location

4.2 ETCD Security

ETCD stores the complete state of your cluster, making it a critical security component.

4.2.1 TLS Configuration

# ETCD TLS configuration
apiVersion: v1
kind: Pod
metadata:
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    # TLS client-to-server communication
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --client-cert-auth=true                             # Enable client certificate auth
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    # TLS peer-to-peer communication
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-client-cert-auth=true                        # Enable peer client cert auth
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    # Restrict listening to secure ports and interfaces
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.1.10:2379
    - --listen-peer-urls=https://192.168.1.10:2380

4.2.2 Data Encryption

ETCD encryption at rest:

# /etc/kubernetes/etcd/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}  # Fallback to no encryption

API server configuration for ETCD encryption:

# API Server ETCD encryption configuration
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --encryption-provider-config=/etc/kubernetes/etcd/encryption-config.yaml
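
To confirm encryption is actually applied, write a test Secret and read it back directly from etcd; the stored value should begin with the k8s:enc:aescbc:v1: prefix rather than plaintext. A quick check, assuming the certificate paths from the kubeadm layout above:

# Create a probe secret, inspect its raw etcd record, then clean up
kubectl create secret generic enc-test --from-literal=probe=value

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/secrets/default/enc-test | hexdump -C | head

kubectl delete secret enc-test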

4.2.3 Access Controls and Backup Security

# Example of secure ETCD backup procedure
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-snapshot-$(date +%Y-%m-%d_%H-%M-%S).db

# Encrypt the backup
gpg --symmetric --cipher-algo AES256 /backup/etcd-snapshot-*.db

# Verify backup integrity
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot status /backup/etcd-snapshot-*.db -w table

4.3 Kubelet Security

The kubelet runs on each node and manages containers.

4.3.1 Authentication and Authorization

# Secure kubelet configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false              # Disable anonymous auth
  webhook:
    enabled: true               # Enable webhook auth
    cacheTTL: 2m0s
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook                 # Use webhook authorization
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s

4.3.2 Secure Settings

# Additional kubelet security settings
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
protectKernelDefaults: true     # Enforce security-related kernel parameters
readOnlyPort: 0                 # Disable the read-only port
tlsCipherSuites:                # Secure TLS ciphers
  - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
  - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
  - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
  - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
  - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
tlsMinVersion: VersionTLS12     # Minimum TLS version

4.3.3 Kubelet Hardening

Additional security options to include in kubelet configuration:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Runtime security
protectKernelDefaults: true
protectKernelModules: true      # Protect against kernel module loading
makeIPTablesUtilChains: true    # Ensure iptables chains are created
eventRecordQPS: 5               # Limit event recording rate to prevent DoS
eventBurst: 10
# Reduced attack surface
streamingConnectionIdleTimeout: 5m  # Close idle connections
serializeImagePulls: true       # Serialize image pulls for better control
registryPullQPS: 5              # Limit registry pull rate
registryBurst: 10
maxPods: 110                    # Limit pods per node
# Feature gates
featureGates:
  RotateKubeletServerCertificate: true  # Enable certificate rotation

4.4 Network Policies

Network Policies control pod-to-pod communication and are essential for implementing zero-trust networking.

4.4.1 Default Deny Policy

# namespace-default-deny.yaml
# Apply this to each namespace for default zero-trust posture
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: your-namespace
spec:
  podSelector: {}  # Select all pods in namespace
  policyTypes:
  - Ingress
  - Egress
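
A quick way to confirm enforcement is to launch a throwaway pod and attempt an outbound request; with default-deny in place the request should fail (the target service name below is illustrative):

# Expect a timeout or DNS failure while the default-deny policy is active
kubectl -n your-namespace run np-test --rm -it --image=busybox:1.36 --restart=Never \
  -- wget -qO- --timeout=3 http://some-service.your-namespace.svc.cluster.local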

4.4.2 Allowing Specific Traffic

# Example: Allow web frontend to access backend API only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend-only
  namespace: app-namespace
spec:
  podSelector:
    matchLabels:
      app: backend
      role: api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8443

4.4.3 Advanced Network Policies

# Example: Allow only specific external egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-specific-egress
  namespace: app-namespace
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Egress
  egress:
  # Allow DNS resolution
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system  # automatic namespace label (v1.22+)
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow connections to specific external services
  - to:
    - ipBlock:
        cidr: 192.168.1.0/24  # Internal services
        except:
        - 192.168.1.1/32      # Block specific addresses
    - ipBlock:
        cidr: 35.190.247.0/24  # Google API endpoints
    ports:
    - protocol: TCP
      port: 443

4.5 Role-Based Access Control (RBAC)

RBAC is the primary authorization mechanism in Kubernetes.

4.5.1 Least Privilege Roles

# Team-specific read-only role example
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app-team-a
  name: app-viewer
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets"]
  verbs: ["get", "list", "watch"]

4.5.2 Binding Roles to Users or Groups

# Binding for a specific user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-viewer-binding
  namespace: app-team-a
subjects:
- kind: User
  name: "[email protected]"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-viewer
  apiGroup: rbac.authorization.k8s.io
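
You can validate the binding without the user's credentials by impersonating them with kubectl auth can-i; the read should be allowed and a write denied:

kubectl auth can-i list pods --namespace app-team-a --as "[email protected]"    # expect: yes
kubectl auth can-i delete pods --namespace app-team-a --as "[email protected]"  # expect: no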

4.5.3 ClusterRoles for Cluster-Wide Access

# Limited cluster-wide monitoring role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-role
rules:
- apiGroups: [""]
  resources: ["pods", "nodes", "namespaces"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods", "nodes"]
  verbs: ["get", "list", "watch"]

4.5.4 Service Account RBAC

# Service account with minimal permissions
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: app-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-role
  namespace: app-namespace
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["app-config"]  # Restrict to specific resources
  verbs: ["get", "watch"]
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["app-secrets"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-role-binding
  namespace: app-namespace
subjects:
- kind: ServiceAccount
  name: app-service-account
  namespace: app-namespace
roleRef:
  kind: Role
  name: app-role
  apiGroup: rbac.authorization.k8s.io

4.5.5 Avoiding RBAC Anti-Patterns

Bad Practice: Overly Permissive Roles

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dangerous-developer-role
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]

Good Practice: Graduated Access with Specific Permissions

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-role
  namespace: dev-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log", "pods/exec"]
  verbs: ["get", "list", "create"]
- apiGroups: ["batch"]
  resources: ["jobs", "cronjobs"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

4.6 Common Misconfigurations

4.6.1 Security Context Misconfigurations

Problematic Configuration

apiVersion: v1
kind: Pod
metadata:
  name: insecure-pod
spec:
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
      privileged: true          # Gives all capabilities
      capabilities:
        add: ["NET_ADMIN", "SYS_ADMIN"]  # Excessive capabilities
      runAsUser: 0              # Running as root
      allowPrivilegeEscalation: true  # Allows privilege escalation

Secure Configuration

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true          # Prevent running as root
    runAsUser: 1000             # Run as non-root user
    runAsGroup: 3000            # Run with non-root group
    fsGroup: 2000               # Set filesystem group
  containers:
  - name: app
    image: myapp:1.0
    securityContext:
      allowPrivilegeEscalation: false  # Prevent privilege escalation
      capabilities:
        drop: ["ALL"]           # Drop all capabilities
        add: ["NET_BIND_SERVICE"]  # Add only what's needed
      readOnlyRootFilesystem: true  # Read-only root filesystem

4.6.2 Resource Management Issues

Problematic Resource Configuration

apiVersion: v1
kind: Pod
metadata:
  name: resource-hog
spec:
  containers:
  - name: uncontrolled
    image: myapp:1.0
    # No resource limits specified

Secure Resource Configuration

apiVersion: v1
kind: Pod
metadata:
  name: resource-controlled
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"

4.6.3 Insecure Network Exposure

Insecure Network Configuration

apiVersion: v1
kind: Service
metadata:
  name: insecure-service
spec:
  type: LoadBalancer      # Exposed to internet without restriction
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: myapp
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: insecure-ingress
  # No TLS configuration
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-service
            port:
              number: 80

Secure Network Configuration

apiVersion: v1
kind: Service
metadata:
  name: secure-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"  # Internal-only LB
spec:
  type: LoadBalancer
  ports:
  - port: 443                   # HTTPS only
    targetPort: 8443
  selector:
    app: myapp
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"    # Force SSL
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - myapp.example.com
    secretName: myapp-tls-cert
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-service
            port:
              number: 443
---
# Restrict outbound traffic with network policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: secure-outbound
  namespace: myapp-namespace
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  - to:
    - ipBlock:
        cidr: 10.0.0.0/16  # Internal services only

5. Security Best Practices

5.1 Pod Security Standards

Kubernetes provides built-in Pod Security Standards at three levels: Privileged, Baseline, and Restricted.

5.1.1 Enforcing Pod Security Standards

With Kubernetes 1.25+, you can enforce these standards with built-in admission control:

# Namespace with Pod Security Standards
apiVersion: v1
kind: Namespace
metadata:
  name: secure-workloads
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
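
Before enforcing a profile on an existing namespace, a server-side dry run of the same label reports which running pods would violate it:

# Preview violations without changing the namespace
kubectl label --dry-run=server --overwrite ns secure-workloads \
  pod-security.kubernetes.io/enforce=restricted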

5.1.2 Custom Pod Security Policies

For more granular control, use the Pod Security Admission webhook or a policy engine like OPA Gatekeeper:

# Example OPA Gatekeeper constraint for non-root users
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sNonRootUser
metadata:
  name: pods-must-run-as-nonroot
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]

5.1.3 Security Context Recommendations

Apply these security contexts to all pod specifications:

apiVersion: v1
kind: Pod
metadata:
  name: security-best-practices-pod
spec:
  # Pod-level security context
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
    fsGroup: 1000
  containers:
  - name: app
    image: myapp:1.0
    # Container-level security context
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsUser: 10000
      runAsGroup: 10000
      capabilities:
        drop:
        - ALL
      seccompProfile:
        type: RuntimeDefault

5.2 Secret Management

Kubernetes Secrets should be protected with additional layers beyond the built-in protection.

5.2.1 Secure Secret Creation and Storage

# Example of creating a secret from stdin; for real credentials prefer --from-file
# or an external secret store so values do not end up in shell history
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: api-credentials
  namespace: app-namespace
type: Opaque
data:
  api-key: $(echo -n "my-api-key" | base64)
  api-secret: $(echo -n "my-api-secret" | base64)
EOF

5.2.2 External Secret Management

Using external secret stores (e.g., Vault, AWS Secrets Manager) with External Secrets Operator:

# ExternalSecret configuration for Vault integration
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: vault-example
  namespace: app-namespace
spec:
  refreshInterval: "15m"
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: database/credentials/app
      property: username
  - secretKey: password
    remoteRef:
      key: database/credentials/app
      property: password
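
The ExternalSecret above references a ClusterSecretStore named vault-backend, which must exist separately. A minimal sketch for a Vault KV v2 backend (the server URL, mount path, auth role, and service account name are assumptions):

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"   # assumed Vault address
      path: "database"                           # KV mount matching the remoteRef keys above
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"               # assumed Vault Kubernetes-auth role
          serviceAccountRef:
            name: external-secrets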

5.2.3 Secret Rotation and Lifecycle

Automatic secret rotation with Vault or Cert Manager:

# Cert Manager certificate with auto rotation
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-tls
  namespace: app-namespace
spec:
  secretName: app-tls-cert
  duration: 2160h  # 90 days
  renewBefore: 360h  # 15 days
  privateKey:
    algorithm: RSA
    encoding: PKCS8
    size: 2048
    rotationPolicy: Always  # Rotate private key on renewal
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - app.example.com

5.3 Container Image Security

5.3.1 Image Scanning

Integrate container scanning into your CI/CD pipeline with Trivy:

# Example GitLab CI job with Trivy
image_scanning:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
    - trivy image --format json --output trivy-results.json ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
  artifacts:
    paths:
      - trivy-results.json

5.3.2 Image Signing and Verification

Set up cosign to sign and verify container images:

# Signing an image
cosign sign --key cosign.key ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}

# Verification in Kubernetes with policy
# Admission policy restricting pods to an approved registry (full signature
# verification is typically enforced by a controller such as sigstore's
# policy-controller; a ValidatingAdmissionPolicyBinding is also required
# for this policy to take effect)
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-signed-images
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  - expression: "object.spec.containers.all(c, c.image.contains('registry.example.com/'))"
    message: "Only images from approved registry allowed"

5.3.3 Image Pull Policies and Immutability

Best practices for container images:

apiVersion: v1
kind: Pod
metadata:
  name: secure-image-practices
spec:
  containers:
  - name: app
    image: registry.example.com/myapp@sha256:d12b81bc724f8388210f78a5d88a327f6904531f14a6195dd6c51700e48bf74d  # Use digests, not tags
    imagePullPolicy: Always  # Always pull to ensure correct version
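
One way to obtain the digest for pinning is to inspect the image after pulling it; tools such as crane or skopeo work as well:

# Resolve a tag to its immutable digest (run after docker pull so RepoDigests is populated)
docker inspect --format='{{index .RepoDigests 0}}' registry.example.com/myapp:1.0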

5.4 Runtime Security

5.4.1 Implementing Security Monitoring

Deploy Falco for runtime security monitoring:

# Falco DaemonSet excerpt
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: falco
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      containers:
      - name: falco
        image: falcosecurity/falco:0.34.1
        securityContext:
          privileged: true  # Required for system call monitoring
        volumeMounts:
        - mountPath: /host/var/run/docker.sock
          name: docker-socket
        - mountPath: /host/dev
          name: dev-fs
        - mountPath: /host/proc
          name: proc-fs
          readOnly: true
      volumes:
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
      - name: dev-fs
        hostPath:
          path: /dev
      - name: proc-fs
        hostPath:
          path: /proc

5.4.2 Custom Falco Rules

# Example Falco rules ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-rules
  namespace: falco
data:
  k8s_audit_rules.yaml: |-
    - rule: Pod Created in Kube Namespace
      desc: Detect any attempt to create a pod in the kube-system or kube-public namespace
      condition: >
        kevt and ka.verb=create and
        ka.target.resource=pods and
        ka.target.namespace in (kube-system, kube-public) and
        not ka.user.name startswith "system:node:" and
        not ka.user.name startswith "system:serviceaccount:kube-system:"
      output: >
        Pod created in system namespace (user=%ka.user.name namespace=%ka.target.namespace
        pod=%ka.target.name)
      priority: WARNING
      source: k8s_audit

5.4.3 AppArmor Profiles

Define and apply AppArmor profiles to restrict container capabilities:

# Pod with AppArmor profile
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
  - name: hello
    image: busybox:latest
    command: ["sh", "-c", "echo 'Hello AppArmor!' && sleep 1h"]

Example AppArmor profile (to be loaded on nodes):

#include <tunables/global>

profile k8s-apparmor-example-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>

  file,

  # Deny all file writes
  deny /** w,
}
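
The profile must be loaded into the kernel on every node where the pod may be scheduled before the annotation takes effect; for example (the profile file path is an assumption):

# Load (or replace) the profile on each node, then confirm it is active
sudo apparmor_parser -r -W /etc/apparmor.d/k8s-apparmor-example-deny-write
sudo aa-status | grep k8s-apparmor-example-deny-write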

5.5 Node Hardening

5.5.1 Node Security Configuration

Protect worker nodes with security-focused configurations:

#!/bin/bash
# Example node hardening script

# Update system
apt-get update && apt-get upgrade -y

# Install security tools
apt-get install -y auditd apparmor-utils rkhunter fail2ban

# Enable firewall
ufw default deny incoming
ufw default allow outgoing
ufw allow ssh
ufw allow 6443/tcp  # Kubernetes API
ufw allow 10250/tcp # Kubelet
# Port 10255 (read-only kubelet) is intentionally not opened; this guide disables it via readOnlyPort: 0
ufw allow 30000:32767/tcp # NodePort range
ufw --force enable

# Apply sysctl hardening
cat <<EOF > /etc/sysctl.d/99-kubernetes-security.conf
# Network security
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.icmp_ignore_bogus_error_responses = 1

# System security
kernel.randomize_va_space = 2
fs.suid_dumpable = 0
kernel.core_uses_pid = 1

# File system security
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
EOF
sysctl -p /etc/sysctl.d/99-kubernetes-security.conf

5.5.2 CIS Benchmark Compliance

Apply CIS Kubernetes Benchmark recommendations using kube-bench:

# Run kube-bench as a Kubernetes Job
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-bench
spec:
  template:
    spec:
      hostPID: true
      containers:
      - name: kube-bench
        image: aquasec/kube-bench:latest
        securityContext:
          privileged: true  # Required to access host resources
        volumeMounts:
        - name: var-lib-kubelet
          mountPath: /var/lib/kubelet
        - name: etc-systemd
          mountPath: /etc/systemd
        - name: etc-kubernetes
          mountPath: /etc/kubernetes
      restartPolicy: Never
      volumes:
      - name: var-lib-kubelet
        hostPath:
          path: /var/lib/kubelet
      - name: etc-systemd
        hostPath:
          path: /etc/systemd
      - name: etc-kubernetes
        hostPath:
          path: /etc/kubernetes
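
Once the job finishes, the benchmark results are available in the pod logs:

kubectl wait --for=condition=complete job/kube-bench --timeout=120s
kubectl logs job/kube-bench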

5.6 Update and Patch Management

5.6.1 Cluster Upgrade Strategy

Document for handling Kubernetes upgrades:

## Kubernetes Upgrade Checklist

### Pre-Upgrade Tasks

1. Review release notes for breaking changes
2. Take an etcd backup:

   ETCDCTL_API=3 etcdctl snapshot save snapshot.db

3. Document current state:

   kubectl get nodes -o wide > pre-upgrade-nodes.txt
   kubectl get pods -A -o wide > pre-upgrade-pods.txt

4. Test upgrades in a non-production environment first
5. Ensure sufficient capacity for rolling node updates

### Control Plane Upgrade

1. Drain the control plane node:

   kubectl drain <control-plane-node> --ignore-daemonsets

2. Upgrade kubeadm:

   apt-mark unhold kubeadm && \
   apt-get update && apt-get install -y kubeadm=1.29.x-00 && \
   apt-mark hold kubeadm

3. Plan the upgrade:

   kubeadm upgrade plan

4. Apply the upgrade:

   kubeadm upgrade apply v1.29.x

5. Upgrade kubelet and kubectl:

   apt-mark unhold kubelet kubectl && \
   apt-get update && apt-get install -y kubelet=1.29.x-00 kubectl=1.29.x-00 && \
   apt-mark hold kubelet kubectl

6. Restart kubelet:

   systemctl daemon-reload
   systemctl restart kubelet

7. Uncordon the node:

   kubectl uncordon <control-plane-node>

### Worker Node Upgrades

1. Apply serial node upgrades:
    • Drain, upgrade, and uncordon each node one at a time
    • Keep at most 20% of nodes unavailable at once
    • Watch for workload disruptions

5.6.2 Automated Security Patching

Set up automated OS security patches with kured (Kubernetes Reboot Daemon):

# kured DaemonSet for automated OS patching
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kured
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kured
  template:
    metadata:
      labels:
        name: kured
    spec:
      serviceAccountName: kured
      containers:
        - name: kured
          image: ghcr.io/kubereboot/kured:1.14.0
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true # Needed for reboot
          env:
            - name: KURED_NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          command:
            - /usr/bin/kured
            - --period=1h
            - --reboot-days=mon,tue,wed,thu,fri
            - --start-time=2am
            - --end-time=4am
            - --time-zone=UTC
            - --lock-annotation=kured.io/reboot

5.7 Logging and Monitoring

5.7.1 Centralized Logging

Implement centralized logging with Fluent Bit:

# Fluent Bit DaemonSet configuration
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.9.8
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
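
The DaemonSet above mounts a fluent-bit-config ConfigMap that is not shown. A minimal sketch (the Elasticsearch host and index are assumptions; adjust to your log backend):

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File  parsers.conf
    [INPUT]
        Name          tail
        Path          /var/log/containers/*.log
        Parser        docker
        Tag           kube.*
    [OUTPUT]
        Name          es
        Match         kube.*
        Host          elasticsearch-master.monitoring
        Port          9200
        Index         k8s-logs
  parsers.conf: |
    [PARSER]
        Name          docker
        Format        json
        Time_Key      time
        Time_Format   %Y-%m-%dT%H:%M:%S.%L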

5.7.2 Security Monitoring with Prometheus

Set up Prometheus alerts for security-focused monitoring:

# Security-focused Prometheus alert rules
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: security-alerts
  namespace: monitoring
spec:
  groups:
  - name: kubernetes-security
    rules:
    - alert: KubeAPIServerDown
      expr: absent(up{job="kube-apiserver"} == 1)
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Kubernetes API server is down"
        description: "API server unreachable, security and control impaired."
    
    - alert: PodRunningAsRoot
      expr: count(container_processes_running{pid="1",user="root"} > 0) > 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Container running as root detected"
        description: "Container running with root privileges, increasing security risk."
    
    - alert: PrivilegedContainer
      expr: kube_pod_container_info{container!="", namespace!~"kube-system|monitoring|logging"} * on(container, pod, namespace) group_left kube_pod_container_status_running * on(container, pod, namespace) group_left(image) kube_pod_container_status_running{container!=""} * on(container, pod, namespace) group_left(security_context) kube_pod_container_security_context{privileged="true"} > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Privileged container running"
        description: "Privileged container outside of system namespaces detected."
    
    - alert: UnauthorizedAPIServerAccess
      expr: sum(rate(apiserver_request_total{code=~"401|403"}[5m])) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High rate of unauthorized API access attempts"
        description: "Multiple unauthorized access attempts to API server detected."

6. Integrations

6.1 Secure Authentication Integrations

6.1.1 OIDC Integration with Keycloak

Configure API server for OIDC authentication:

# kube-apiserver configuration for OIDC
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    # OIDC configuration
    - --oidc-issuer-url=https://keycloak.example.com/auth/realms/kubernetes
    - --oidc-client-id=kubernetes
    - --oidc-username-claim=preferred_username
    - --oidc-groups-claim=groups
    - --oidc-ca-file=/etc/kubernetes/pki/oidc-ca.crt

Keycloak setup for Kubernetes:

# Keycloak Configuration example
apiVersion: keycloak.org/v1alpha1
kind: KeycloakRealm
metadata:
  name: kubernetes-realm
  namespace: auth
spec:
  realm:
    id: kubernetes
    realm: kubernetes
    enabled: true
    sslRequired: "external"
    displayName: "Kubernetes Authentication"
    accessTokenLifespan: 300 # 5 minutes
    groups:
      - name: "kubernetes-admins"
      - name: "kubernetes-developers"
      - name: "kubernetes-viewers"
    clients:
      - clientId: kubernetes
        enabled: true
        publicClient: true
        standardFlowEnabled: true
        directAccessGrantsEnabled: true
        redirectUris:
          - "http://localhost:8000"
          - "http://localhost:18000"
        webOrigins:
          - "*"
        protocolMappers:
          - name: "groups"
            protocol: "openid-connect"
            protocolMapper: "oidc-group-membership-mapper"
            config:
              full.path: "false"
              id.token.claim: "true"
              access.token.claim: "true"
              claim.name: "groups"
              userinfo.token.claim: "true"
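
On the client side, kubectl can obtain tokens from Keycloak via the kubelogin (kubectl oidc-login) plugin; a sketch matching the realm above, assuming the plugin is installed:

kubectl config set-credentials keycloak-user \
  --exec-api-version=client.authentication.k8s.io/v1beta1 \
  --exec-command=kubectl \
  --exec-arg=oidc-login \
  --exec-arg=get-token \
  --exec-arg=--oidc-issuer-url=https://keycloak.example.com/auth/realms/kubernetes \
  --exec-arg=--oidc-client-id=kubernetes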

6.1.2 Certificate Authentication with Vault PKI

Using HashiCorp Vault for PKI certificate issuance:

# Vault configuration for PKI
apiVersion: vault.banzaicloud.com/v1alpha1
kind: Vault
metadata:
  name: vault
spec:
  size: 3
  image: vault:1.12.1
  bankVaultsImage: ghcr.io/banzaicloud/bank-vaults:1.19.0
  config:
    storage:
      file:
        path: "/vault/file"
    listener:
      tcp:
        address: "0.0.0.0:8200"
        tls_cert_file: /vault/tls/server.crt
        tls_key_file: /vault/tls/server.key
    ui: true
  vaultInitPolicies:
    - name: kubernetes-pki
      rules: |
        path "pki_int/issue/kubernetes-client" {
          capabilities = ["create", "update"]
        }
        path "pki_int/roles/kubernetes-client" {
          capabilities = ["read"]
        }
        path "auth/token/lookup-self" {
          capabilities = ["read"]
        }

6.2 Security Tooling Integration

6.2.1 Integrating OPA Gatekeeper

Deploy OPA Gatekeeper for policy enforcement:

# Install Gatekeeper
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/v3.11.0/deploy/gatekeeper.yaml

# Example constraint template for requiring resource limits
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresources
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResources
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredresources

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.cpu
          msg := sprintf("Container %s must specify CPU limits", [container.name])
        }
        
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.memory
          msg := sprintf("Container %s must specify memory limits", [container.name])
        }

Apply the constraint:

# Apply the constraint
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResources
metadata:
  name: pod-must-have-resource-limits
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system", "monitoring"]
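
With the constraint active, a pod without resource limits should be rejected at admission time; a quick negative test (the pod name and image are illustrative):

# Expect a Forbidden error citing the K8sRequiredResources constraint
kubectl run no-limits-test --image=nginx:1.25 -n default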

6.2.2 SIEM Integration

Configure Falco to send alerts to Elasticsearch and Kibana:

# Falco configuration for Elasticsearch output
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: falco
data:
  falco.yaml: |-
    program_output:
      enabled: true
      keep_alive: false
      program: "curl -H 'Content-Type: application/json' -d @- -XPOST 'http://elasticsearch-master.monitoring:9200/falco-events/_doc'"

6.2.3 Vulnerability Scanning Integration

Set up Trivy Operator for continuous vulnerability scanning:

# Install Trivy Operator (assumes the Aqua chart repo has been added:
#   helm repo add aqua https://aquasecurity.github.io/helm-charts && helm repo update)
kubectl create namespace trivy-system
helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --set trivy.ignoreUnfixed=true

# Example ConfigAuditReport as generated by the operator (reports are produced by scans, not authored by hand)
apiVersion: aquasecurity.github.io/v1alpha1
kind: ConfigAuditReport
metadata:
  name: configaudit-sample
spec:
  scanner:
    name: Trivy
    vendor: Aqua Security
  summary:
    criticalCount: 2
    highCount: 8
    lowCount: 9
    mediumCount: 15

6.3 CI/CD Security Integration

6.3.1 GitOps Pipeline Security

Example GitLab CI/CD pipeline with security checks:

# .gitlab-ci.yml with security focus
stages:
  - lint
  - test
  - scan
  - build
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  KUBERNETES_VERSION: 1.29.2
  HELM_VERSION: 3.12.2

# Lint Kubernetes resources
k8s-lint:
  stage: lint
  image: 
    name: bitnami/kubectl:${KUBERNETES_VERSION}
    entrypoint: [""]
  script:
    - |
      for file in $(find ./k8s -name "*.yaml" -type f); do
        echo "Validating $file..."
        kubectl apply --dry-run=client -f $file
      done

# Static code analysis
sast:
  stage: test
  image: aquasec/trivy:latest
  script:
    - trivy fs --security-checks vuln,config,secret --format sarif -o gl-sast-report.json .
  artifacts:
    reports:
      sast: gl-sast-report.json

# Container image scanning
container-scan:
  stage: scan
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}"
  allow_failure: true

# Image build with hardening
build:
  stage: build
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - |
      docker build \
        --build-arg USER=nonroot \
        --build-arg UID=10000 \
        --build-arg GID=10000 \
        --no-cache \
        --pull \
        -t "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}" \
        .
    - docker push "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}"
    # Sign the image
    - docker pull gcr.io/projectsigstore/cosign:v1.13.1
    - |
      docker run --rm \
        -v ${PWD}:/work \
        -e COSIGN_PASSWORD=${COSIGN_PASSWORD} \
        gcr.io/projectsigstore/cosign:v1.13.1 \
        sign \
        --key cosign.key \
        "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}"

# Deploy with security scanning
deploy:
  stage: deploy
  image:
    name: bitnami/kubectl:${KUBERNETES_VERSION}
    entrypoint: [""]
  script:
    - kubectl config use-context ${KUBE_CONTEXT}
    - |
      # Replace image placeholder with scanned and signed image
      sed -i "s|IMAGE_PLACEHOLDER|${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}|g" ./k8s/deployment.yaml
    # Apply security policies first
    - kubectl apply -f ./k8s/network-policies.yaml
    - kubectl apply -f ./k8s/resource-quotas.yaml
    - kubectl apply -f ./k8s/pod-security-policies.yaml
    # Then apply application manifests
    - kubectl apply -f ./k8s/
    # Verify deployment security posture
    - |
      kubectl wait --for=condition=Available deployment/${APP_NAME} --timeout=300s
      kubectl exec deployment/${APP_NAME} -- trivy filesystem --exit-code 0 --severity CRITICAL /app

6.3.2 ArgoCD with Security Controls

ArgoCD configuration with security best practices:

# ArgoCD application with security controls
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: secured-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/app.git
    targetRevision: main
    path: k8s
    plugin:
      name: security-plugin
  destination:
    server: https://kubernetes.default.svc
    namespace: app-namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=true
      - CreateNamespace=true
      - PruneLast=true
      - ApplyOutOfSyncOnly=true
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

---
# ArgoCD Project with restricted permissions
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: restricted-project
  namespace: argocd
spec:
  description: Restricted Project
  sourceRepos:
  - "https://github.com/trusted-org/*"  # Only allow from trusted sources
  destinations:
  - namespace: 'app-*'                  # Only allow deployments to app namespaces
    server: https://kubernetes.default.svc
  clusterResourceWhitelist: []          # No cluster-wide resources allowed
  namespaceResourceBlacklist:           # Block high-risk resources
  - group: ""
    kind: ResourceQuota
  - group: ""
    kind: LimitRange
  - group: ""
    kind: NetworkPolicy
  orphanedResources:
    warn: true
  roles:
  - name: developer
    description: Developer role
    policies:
    - p, proj:restricted-project:developer, applications, get, restricted-project/*, allow
    - p, proj:restricted-project:developer, applications, sync, restricted-project/*, allow

6.4 Service Mesh Security

6.4.1 Istio Service Mesh Security

Install Istio with security focus:

# Download istioctl
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.16.1 sh -
cd istio-1.16.1

# Create secure installation profile
cat > secure-profile.yaml << EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: default
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 500m
            memory: 2048Mi
  meshConfig:
    enableAutoMtls: true
    defaultConfig:
      holdApplicationUntilProxyStarts: true
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY  # Prevent connections outside the mesh
  values:
    global:
      proxy:
        privileged: false
      tls:
        minProtocolVersion: TLSV1_3
      logging:
        level: "default:info"
EOF

# Install Istio
./bin/istioctl install -f secure-profile.yaml -y
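
After installation, it helps to confirm that the cluster state matches the applied manifest and that sidecars are synced; a quick check using istioctl's built-in commands:

# Verify the installation matches the applied profile
./bin/istioctl verify-install -f secure-profile.yaml

# Confirm sidecar proxies are in sync with the control plane
./bin/istioctl proxy-status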

6.4.2 Mutual TLS Authentication

Enable strict mTLS at namespace level:

# Enable strict mTLS for a namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: strict-mtls
  namespace: app-namespace
spec:
  mtls:
    mode: STRICT  # Require mTLS for all communication
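
The policy above is namespace-scoped. To make strict mTLS the mesh-wide default, apply a PeerAuthentication named default in the Istio root namespace (istio-system in a standard install):

# Mesh-wide strict mTLS
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT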

6.4.3 Istio Authorization Policies

Implement fine-grained authorization:

# Deny all traffic by default
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: app-namespace
spec:
  {}  # Empty spec means deny all

---
# Allow specific traffic
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: frontend-to-backend
  namespace: app-namespace
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/app-namespace/sa/frontend-sa"]
    to:
    - operation:
        methods: ["GET", "POST"]
        paths: ["/api/v1/*"]
    when:
    - key: request.headers[x-api-key]
      values: ["valid-api-key"]

---
# JWT authentication for external access
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: app-namespace
spec:
  selector:
    matchLabels:
      app: frontend
  jwtRules:
  - issuer: "https://accounts.example.com"
    jwksUri: "https://accounts.example.com/.well-known/jwks.json"

7. Testing and Validation

7.1 Security Benchmark Testing

7.1.1 CIS Benchmark Testing

Run CIS benchmark tests with kube-bench:

# Run on control plane node
docker run --pid=host -v /etc:/etc:ro -v /var:/var:ro -t aquasec/kube-bench:v0.6.11 master --version 1.29

# Run on worker nodes
docker run --pid=host -v /etc:/etc:ro -v /var:/var:ro -t aquasec/kube-bench:v0.6.11 node --version 1.29
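
Where running Docker directly on nodes is not an option, kube-bench can also run in-cluster as a Job. A minimal sketch (the kube-bench repository ships a fuller job.yaml, and the required host mounts vary by distribution):

apiVersion: batch/v1
kind: Job
metadata:
  name: kube-bench
spec:
  template:
    spec:
      hostPID: true  # kube-bench inspects host processes
      containers:
      - name: kube-bench
        image: aquasec/kube-bench:v0.6.11
        command: ["kube-bench", "run", "--targets", "node"]
        volumeMounts:
        - name: etc-kubernetes
          mountPath: /etc/kubernetes
          readOnly: true
        - name: var-lib-kubelet
          mountPath: /var/lib/kubelet
          readOnly: true
      restartPolicy: Never
      volumes:
      - name: etc-kubernetes
        hostPath:
          path: /etc/kubernetes
      - name: var-lib-kubelet
        hostPath:
          path: /var/lib/kubelet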

7.1.2 Compliance Testing

Implement scheduled compliance checks with the Compliance Operator (the ComplianceSuite resource below comes from the OpenShift Compliance Operator):

# ComplianceSuite running a scheduled CIS scan (Compliance Operator)
apiVersion: compliance.openshift.io/v1alpha1
kind: ComplianceSuite
metadata:
  name: example-compliance-suite
spec:
  autoApplyRemediations: false
  schedule: "0 1 * * *"  # Run daily at 1 AM
  scans:
    - name: cis-kubernetes-benchmark
      profile: xccdf_org.ssgproject.content_profile_cis-kubernetes-benchmark
      content: ssg-kubernetes-ds.xml
      contentImage: quay.io/compliance-operator/openscap-ocp:1.3.5
      rule: "xccdf_org.ssgproject.content_rule_kubernetes_*"
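
Once a scan completes, results surface as custom resources; assuming the Compliance Operator is installed in its default openshift-compliance namespace:

# Review scan results and proposed remediations
kubectl get compliancesuites -n openshift-compliance
kubectl get compliancecheckresults -n openshift-compliance
kubectl get complianceremediations -n openshift-compliance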

7.2 Penetration Testing Kubernetes

7.2.1 Penetration Testing Tools and Checklist

Kubernetes penetration testing checklist:

# Kubernetes Penetration Testing Checklist

## 1. Authentication and Authorization
- [ ] Test unauthenticated access to the API server
- [ ] Test service account token permissions
- [ ] Attempt to escape RBAC restrictions
- [ ] Test for overly permissive RBAC roles

## 2. Network Security
- [ ] Test pod-to-pod communication isolation
- [ ] Attempt to bypass network policies
- [ ] Test external access to cluster services
- [ ] Test egress controls and filtering

## 3. Container Security
- [ ] Test for container escape vulnerabilities
- [ ] Attempt to mount sensitive host paths
- [ ] Test for privileged container access
- [ ] Test seccomp and AppArmor protections

## 4. Secrets Management
- [ ] Test for unencrypted secrets
- [ ] Attempt to access secrets from unauthorized pods
- [ ] Test for secrets leaked in environment variables or logs

## 5. Control Plane Security
- [ ] Test etcd encryption and access controls
- [ ] Assess Kubernetes Dashboard security
- [ ] Test API server admission controllers
- [ ] Test kubelet security

## 6. Infrastructure Security
- [ ] Test node security configurations
- [ ] Test for vulnerable components or services
- [ ] Test cloud provider security controls
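
Several of the authorization checks above can be started with nothing more than kubectl; for example, enumerating what a given service account may do (names are illustrative):

# List everything the service account is permitted to do
kubectl auth can-i --list --as=system:serviceaccount:app-namespace:frontend-sa -n app-namespace

# Spot-check a specific high-risk permission
kubectl auth can-i create pods --as=system:serviceaccount:app-namespace:frontend-sa -n app-namespace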

7.2.2 Red Team Exercise

Example penetration testing script for Kubernetes:

#!/bin/bash

echo "=== Kubernetes Security Testing Script ==="
echo "This script performs basic security checks against a Kubernetes cluster"

# Check for anonymous access
echo -e "\n[*] Testing anonymous access to the API server..."
curl -s -k https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/namespaces

# Check for service account permissions
echo -e "\n[*] Testing service account permissions..."
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
NAMESPACE=$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)
curl -s -k -H "Authorization: Bearer $TOKEN" https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/api/v1/namespaces/$NAMESPACE/pods

# Check for privileged containers
echo -e "\n[*] Checking for privileged containers in the cluster..."
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.containers[].securityContext.privileged==true) | .metadata.namespace + "/" + .metadata.name'

# Check for containers with hostPath volumes
echo -e "\n[*] Checking for containers with hostPath volumes..."
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.volumes[]?.hostPath) | .metadata.namespace + "/" + .metadata.name'

# Check for containers running as root
echo -e "\n[*] Checking for containers running as root..."
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select((.spec.securityContext.runAsUser==0 or .spec.securityContext.runAsUser==null) and (.spec.containers[].securityContext.runAsUser==0 or .spec.containers[].securityContext.runAsUser==null)) | .metadata.namespace + "/" + .metadata.name'

# Check for exposed dashboards or services
echo -e "\n[*] Checking for exposed dashboards and services..."
kubectl get svc --all-namespaces -o json | jq -r '.items[] | select(.spec.type=="LoadBalancer" or .spec.type=="NodePort") | .metadata.namespace + "/" + .metadata.name + " (" + .spec.type + ")"'

echo -e "\n[*] Security test complete"

7.3 Continuous Security Validation

7.3.1 Automated Security Scanning

Set up automated security scanning with Popeye:

# Kubernetes CronJob for regular security scanning
apiVersion: batch/v1
kind: CronJob
metadata:
  name: security-scan
  namespace: security
spec:
  schedule: "0 0 * * *"  # Run daily at midnight
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: security-scanner
          containers:
          - name: popeye
            image: derailed/popeye:v0.20.0
            args:
            - -o
            - json
            - -f                   # spinach configuration file
            - /etc/popeye/spinach.yaml
            - --save               # save the report into POPEYE_REPORT_DIR
            env:
            - name: POPEYE_REPORT_DIR
              value: /reports
            volumeMounts:
            - name: report-volume
              mountPath: /reports
            - name: config-volume
              mountPath: /etc/popeye
          - name: report-processor
            image: alpine:3.16
            command: ["/bin/sh", "-c"]
            args:
            - |
              apk add --no-cache jq curl
              sleep 60  # Wait for popeye to finish
              # Pick up the most recent report written by popeye
              REPORT_FILE=$(ls -t /reports/*.json 2>/dev/null | head -1)
              if [ -f "$REPORT_FILE" ]; then
                # Popeye's JSON report nests results under a top-level "popeye" key
                SCORE=$(jq '.popeye.score' "$REPORT_FILE")
                if [ "$SCORE" -lt 80 ]; then
                  curl -X POST -H "Content-Type: application/json" \
                    -d "{\"text\":\"🚨 Kubernetes Security Score: $SCORE - see $REPORT_FILE\"}" \
                    ${SLACK_WEBHOOK_URL}
                fi
              fi
            volumeMounts:
            - name: report-volume
              mountPath: /reports
          volumes:
          - name: report-volume
            persistentVolumeClaim:
              claimName: security-reports-pvc
          - name: config-volume
            configMap:
              name: popeye-config
          restartPolicy: OnFailure
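
The CronJob mounts a popeye-config ConfigMap for the spinach file passed via -f. A minimal sketch of that ConfigMap (the spinach exclude schema has changed across Popeye releases, so treat the keys below as illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: popeye-config
  namespace: security
data:
  spinach.yaml: |
    popeye:
      excludes:
        v1/namespaces:
        - name: kube-public  # example: skip this namespace during scans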

7.3.2 Attack Simulation

Set up periodic attack simulation with kube-hunter:

# Kubernetes Job for attack simulation
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-hunter
  namespace: security
spec:
  template:
    spec:
      containers:
      - name: kube-hunter
        image: aquasec/kube-hunter:0.6.8
        args:
        - "--pod"
        - "--report"
        - "json"
        - "--log"
        - "stdout"
      restartPolicy: Never
  backoffLimit: 0

7.4 Security Response Drills

7.4.1 Incident Response Playbook

# Kubernetes Security Incident Response Playbook

## 1. Preparation Phase
- Establish incident response team and roles
- Document cluster architecture and security controls
- Create communication channels and escalation paths
- Prepare forensic tools and access credentials
- Regularly test backup and restore procedures

## 2. Detection and Analysis
- Monitor for security alerts from:
  - Falco real-time alerts
  - Prometheus security metric anomalies
  - Cloud provider security notifications
  - API server audit logs
- Triage and categorize incidents:
  - Unauthorized access
  - Data breach
  - Resource hijacking (crypto mining)
  - Denial of service
  - Container/node compromise

## 3. Containment
- **For compromised pods:**
  ```bash
  # Isolate the pod by applying restrictive network policy
  kubectl apply -f emergency-isolate-policy.yaml
  # Capture pod details before termination
  kubectl describe pod <pod-name> -n <namespace> > pod-details.txt
  kubectl logs <pod-name> -n <namespace> > pod-logs.txt
  # Terminate the pod
  kubectl delete pod <pod-name> -n <namespace>
  ```
- **For compromised nodes:**
  ```bash
  # Cordon and drain the node
  kubectl cordon <node-name>
  kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
  # Isolate at the network level
  # [Cloud provider specific steps to isolate the node]
  ```
- **For compromised credentials:**
  ```bash
  # Revoke and rotate service account tokens
  kubectl delete serviceaccount <sa-name> -n <namespace>
  kubectl create serviceaccount <sa-name> -n <namespace>
  # Revoke active sessions [platform specific]
  ```

## 4. Eradication
- Remove compromised resources
- Patch the vulnerabilities that led to the breach
- Apply updated security policies
- Scan for persistence mechanisms
- Verify the integrity of critical components

## 5. Recovery
- Restore from clean backups if needed:
  ```bash
  # Restore etcd from a snapshot
  ETCDCTL_API=3 etcdctl snapshot restore backup.db
  ```
- Deploy from known-good manifests
- Implement additional security controls
- Gradually restore service with increased monitoring

## 6. Post-Incident Review
- Document the incident timeline
- Analyze the root cause and attack vectors
- Update security controls and policies
- Conduct a lessons-learned session
- Update the incident response playbook
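
The containment step above references emergency-isolate-policy.yaml without showing it. A minimal sketch of such a quarantine policy, assuming the compromised pod is first labeled quarantine=true (e.g. kubectl label pod <pod-name> quarantine=true --overwrite):

# Deny all ingress and egress for quarantined pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: emergency-isolate
  namespace: app-namespace  # adjust to the affected namespace
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
  - Ingress
  - Egress
  # No ingress or egress rules are listed, so all traffic is denied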

7.4.2 Tabletop Exercise Scenario

# Kubernetes Security Tabletop Exercise

## Scenario: Compromised Node in Production Cluster

### Background
Your organization runs a production Kubernetes cluster with sensitive workloads. 
Monitoring alerts indicate unusual network traffic from one of your worker nodes.

### Timeline
1. **T+0:** Security monitoring detects unusual egress traffic to IP 185.143.x.x
2. **T+5min:** CPU utilization on node-04 spikes to 95%
3. **T+7min:** Multiple new processes appear on node-04
4. **T+10min:** The incident response team is activated

### Exercise Questions
1. What immediate steps would you take to investigate?
2. How would you determine if this is a false positive?
3. If confirmed as a security incident, how would you:
   - Contain the compromised node?
   - Preserve evidence for forensic analysis?
   - Determine the attack vector and affected resources?
4. What communication would you send to stakeholders?
5. How would you restore normal operations after addressing the incident?

### Exercise Goals
- Test the incident response process
- Identify communication gaps
- Practice technical response procedures
- Validate recovery time objectives
- Document areas for improvement

8. References and Further Reading

8.1 Official Documentation

  • Kubernetes security documentation: https://kubernetes.io/docs/concepts/security/
  • Pod Security Standards: https://kubernetes.io/docs/concepts/security/pod-security-standards/
  • Securing a Cluster: https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/

8.2 Security Resources

  • CIS Kubernetes Benchmark: https://www.cisecurity.org/benchmark/kubernetes
  • NSA/CISA Kubernetes Hardening Guidance
  • OWASP Kubernetes Top Ten

8.3 Security Tools

  • kube-bench - CIS Kubernetes Benchmark tests
  • kube-hunter - Kubernetes penetration testing
  • Falco - Runtime security monitoring
  • Trivy - Container vulnerability scanner
  • Popeye - Kubernetes cluster sanitizer

8.4 White Papers and Articles

  • CNCF Cloud Native Security Whitepaper
  • NIST SP 800-190: Application Container Security Guide

8.5 Known CVEs

Recent significant Kubernetes vulnerabilities (as of May 2025):

  • CVE-2024-1435: Bypassing Pod Security admission restrictions with certain container definitions
  • CVE-2023-5528: Privilege escalation on Windows nodes via insufficient input sanitization in the in-tree storage plugin
  • CVE-2023-3676: Privilege escalation on Windows nodes via insufficient path sanitization in the kubelet
  • CVE-2022-3172: Aggregated API servers able to redirect client traffic, enabling SSRF-style attacks
  • CVE-2021-25742: Ingress-nginx custom snippet annotations allowing retrieval of cluster-wide secrets

9. Appendices

9.1 Security Troubleshooting

9.1.1 Common Security Issues and Solutions

| Issue | Symptoms | Solution |
| --- | --- | --- |
| Unauthorized API access | 401/403 errors in API server logs | Check authentication configuration and RBAC settings |
| etcd data exposure | Unencrypted sensitive data | Enable encryption at rest for etcd |
| Container breakout | Process running outside container namespace | Enforce Pod Security Standards; restrict capabilities |
| Network policy failures | Unexpected network connectivity | Check CNI configuration; verify policy syntax |
| Certificate errors | TLS handshake failures | Check certificate expiration and CA chain validity |
| Resource exhaustion | Node or pod OOM, CPU throttling | Implement resource quotas and limits |
| Secret exposure | Secrets visible in logs or environment | Use external secret management / vault integration |

9.1.2 Debugging Security Configurations

# Debug RBAC issues
kubectl auth can-i --as=system:serviceaccount:default:myapp list pods

# Verify pod security context
kubectl get pod <pod-name> -o jsonpath='{.spec.securityContext}'

# Check applied network policies
kubectl get netpol -A -o wide

# Verify webhook configurations
kubectl get validatingwebhookconfiguration
kubectl get mutatingwebhookconfiguration

# Debug admission controller issues
kubectl logs -n kube-system -l component=kube-apiserver | grep admission

# Check certificate expiration
kubeadm certs check-expiration

# Debug etcd encryption
kubectl get pod -n kube-system -l component=kube-apiserver -o yaml | grep encryption-provider

9.2 Kubernetes Security FAQ

Common Security Questions

Q: What is the difference between Pod Security Standards and Pod Security Policies?

A: Pod Security Policies (PSPs) were deprecated in Kubernetes 1.21 and removed in 1.25. Pod Security Standards (PSS) replaced them with three profiles: Privileged, Baseline, and Restricted. PSS is implemented via the built-in Pod Security Admission controller, which enforces these standards at the namespace level.
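
For example, the Restricted profile can be enforced on a namespace with the standard Pod Security Admission labels:

apiVersion: v1
kind: Namespace
metadata:
  name: app-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted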

Q: How do I secure communication between microservices?

A: Secure microservice communication through multiple layers:

  1. Use Network Policies to restrict pod-to-pod traffic
  2. Implement mTLS with a service mesh like Istio
  3. Use application-level authentication
  4. Encrypt sensitive data and implement proper authorization

Q: What are the minimum RBAC permissions needed for CI/CD pipelines?

A: CI/CD pipelines typically need:

  • create, update, patch, delete permissions on workload resources (deployments, services)
  • get, list permissions for checking status
  • Namespace-scoped permissions rather than cluster-wide
  • Service accounts with separate roles for different environments

Example minimal CI/CD role:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd-deployer
  namespace: app-namespace
rules:
- apiGroups: ["", "apps", "batch"]
  resources: ["deployments", "services", "configmaps", "pods", "jobs"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
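
To put the role to use, bind it to the pipeline's service account; a sketch assuming the service account is named cicd-deployer:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cicd-deployer-binding
  namespace: app-namespace
subjects:
- kind: ServiceAccount
  name: cicd-deployer  # hypothetical pipeline service account
  namespace: app-namespace
roleRef:
  kind: Role
  name: cicd-deployer
  apiGroup: rbac.authorization.k8s.io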

Q: How can I detect and prevent crypto mining attacks?

A: Implement these defenses:

  1. Resource quotas and limits on all namespaces
  2. Runtime security monitoring with Falco
  3. Network egress controls to block connections to mining pools (see the egress policy sketch below)
  4. Regular scanning of container images
  5. Monitor for abnormal CPU usage patterns
  6. Implement Pod Security Standards to prevent privileged containers
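
A default-deny egress policy is one way to implement point 3. The sketch below permits only in-cluster traffic and DNS, so connections to external mining pools are dropped (namespace and selectors are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress
  namespace: app-namespace
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector: {}  # any destination inside the cluster
  - ports:                   # DNS lookups
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53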

9.3 Kubernetes Version Security Considerations

Security Features by Version

| Version | Key Security Features | Security Considerations |
| --- | --- | --- |
| 1.29.x | KMS v2 encryption at rest graduates to stable; continued migration off in-tree storage plugins | Migrate secrets encryption to KMS v2; plan migration from in-tree storage plugins |
| 1.28.x | ValidatingAdmissionPolicy (CEL-based admission) graduates to beta; improved control plane resilience | Evaluate CEL admission policies to complement or replace validating webhooks |
| 1.27.x | SeccompDefault graduates to stable; KMS v2 improvements | Enable SeccompDefault so pods receive the runtime's default seccomp profile |
| 1.26.x | ValidatingAdmissionPolicy introduced (alpha); signed release artifacts | Test CEL admission policies; verify release artifact signatures |
| 1.25.x | PodSecurityPolicy removed; Pod Security Admission graduates to stable | Must use Pod Security Standards instead of PSP |
| 1.24.x | Secret-based service account tokens no longer auto-generated | Migrate to short-lived, bound service account tokens |

Version Upgrade Security Checklist

Before upgrading Kubernetes:

  1. Review Release Notes:
    • Check for deprecated security features
    • Identify new security enhancements
    • Note any breaking changes to security components
  2. Test Security Controls:
    • Verify RBAC policies still work
    • Test Network Policies
    • Validate admission controller behavior
    • Check Pod Security Standard enforcement
  3. Update Security Components:
    • CNI plugins
    • Container runtime
    • Ingress controllers
    • Security monitoring tools
  4. Post-Upgrade Verification (see the sketch after this list):
    • Run CIS benchmark tests
    • Validate security policy enforcement
    • Check audit logging functionality
    • Conduct penetration testing
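
A minimal post-upgrade verification pass might look like the following (app-namespace and cicd-deployer are illustrative names):

# Re-run CIS benchmark checks
kube-bench run --targets master,node

# Spot-check that RBAC still behaves as expected
kubectl auth can-i --list --as=system:serviceaccount:app-namespace:cicd-deployer -n app-namespace

# Confirm network policies and Pod Security labels survived the upgrade
kubectl get netpol -A
kubectl get ns -L pod-security.kubernetes.io/enforce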

This guide was last updated on May 8, 2025, and applies to Kubernetes v1.29.x.

This post is licensed under CC BY 4.0 by the author.