Kubernetes HPA Autoscaling Guide

Step-by-step guide to using Kubernetes HPA to autoscale pods based on CPU, memory, and custom metrics, including setup, configuration, and best practices.

Kubernetes Horizontal Pod Autoscaler: A Security-Focused Implementation Guide

1. Introduction and Purpose

The Kubernetes Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a deployment, replica set, or stateful set based on observed resource utilization. While autoscaling provides significant operational benefits, improper configuration can introduce security vulnerabilities, performance issues, and unexpected costs. This guide provides a security-focused approach to implementing HPA, ensuring that your autoscaling strategy enhances rather than compromises your security posture.

Key Security Considerations:

  • Resource quota management to prevent denial-of-service scenarios
  • Proper metric selection to avoid scaling based on manipulated inputs
  • Access controls around autoscaler configuration
  • Monitoring and alerting for unusual scaling events

2. Table of Contents

  1. Introduction and Purpose
  2. Table of Contents
  3. Installation
    1. Prerequisites
    2. Verifying HPA Components
    3. Platform-Specific Considerations
  4. Configuration
    1. Basic HPA Configuration
    2. Advanced Configuration Options
    3. Secure Defaults
    4. Common Misconfigurations
    5. RBAC for HPA
  5. Security Best Practices
    1. Resource Limits and Requests
    2. Metric Selection and Validation
    3. Scaling Policies and Thresholds
    4. Secrets Management with Scaling
    5. Logging and Audit Strategies
    6. Threat Modeling for Autoscaling
    7. Securely Updating HPA Configurations
  6. Integrations
    1. Metrics Server and Custom Metrics
    2. Integration with External Metrics Providers
    3. SIEM and Monitoring Integration
    4. CI/CD Pipeline Integration
  7. Testing and Validation
    1. Load Testing for Scale Verification
    2. Security Testing the Autoscaler
    3. Chaos Engineering for Autoscaling
  8. References and Further Reading
  9. Appendices
    1. Troubleshooting HPA Issues
    2. Frequently Asked Questions
    3. Version-Specific Considerations

3. Installation

3.1 Prerequisites

Before implementing HPA, ensure your Kubernetes cluster meets these security-focused prerequisites:

  • Kubernetes version 1.23+ (recommended for latest security features)
  • Metrics Server installed and properly secured
  • Resource quotas implemented at namespace level
  • Network policies in place
  • Pod Security Standards applied

Verify Metrics Server Installation:

# Check if metrics-server is deployed
kubectl get deployment metrics-server -n kube-system

# Verify metrics are being collected
kubectl top nodes
kubectl top pods --all-namespaces
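
If kubectl top returns no data, confirm that the resource-metrics APIService is registered and healthy before debugging further; this is a safe, read-only check:

# Confirm the resource-metrics APIService is registered and Available
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'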

3.2 Verifying HPA Components

Always verify the integrity of components before deploying:

# For metrics-server, verify the image digest
kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}'

# Check expected SHA256 against official releases
EXPECTED_SHA="sha256:abc123def456..." # Replace with actual digest
IMAGE=$(kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}')
ACTUAL_SHA=$(docker pull "$IMAGE" | awk '/Digest:/ {print $2}')
[ "$EXPECTED_SHA" = "$ACTUAL_SHA" ] && echo "Verification successful" || echo "Verification failed"

3.3 Platform-Specific Considerations

Linux-Based Kubernetes

Most secure configuration with SELinux or AppArmor profiles:

# Apply the default AppArmor profile to metrics-server via a merge patch
# (a partial Deployment manifest piped to kubectl apply would fail schema validation)
kubectl patch deployment metrics-server -n kube-system --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"container.apparmor.security.beta.kubernetes.io/metrics-server":"runtime/default"}}}}}'

Windows Worker Nodes

Windows containers have different isolation models and security considerations:

# Ensure proper Windows security contexts
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-windows-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-windows
  template:
    metadata:
      labels:
        app: example-windows
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerUser"
      containers:
        - name: windows-container
          image: mcr.microsoft.com/windows/servercore:ltsc2022
          # HPA will scale the pods running this container
EOF

Cloud Provider Considerations

AWS EKS:

  • Use IAM roles for service accounts (IRSA) for metrics access (see the sketch after these lists)
  • Enable AWS CloudTrail for API auditing

Azure AKS:

  • Implement Azure AD integration for RBAC
  • Use Azure Monitor for containerized insights metrics

Google GKE:

  • Enable Workload Identity for metrics access
  • Use Cloud Monitoring for metrics collection
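
For the IRSA example referenced above, the wiring is an annotation on the service account that the metrics component runs as; the role ARN, name, and namespace below are placeholders:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-reader
  namespace: monitoring
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/metrics-reader-role  # placeholder ARN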

4. Configuration

4.1 Basic HPA Configuration

A security-focused HPA configuration with proper resource management:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-application-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2  # Never scale below 2 for high availability
  maxReplicas: 10 # Upper bound to prevent runaway scaling
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Prevent oscillation
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
      - type: Pods
        value: 5
        periodSeconds: 60
      selectPolicy: Max
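
After applying the manifest (the file name below is illustrative), watch the HPA reconcile and confirm it reports current metrics before sending any traffic:

kubectl apply -f secure-application-hpa.yaml
kubectl get hpa secure-application-hpa -n production --watch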

4.2 Advanced Configuration Options

Multiple Metrics for Scaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-multi-metric-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1000
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10000
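
The Pods and Object metrics above only work once a custom metrics adapter is installed (see 6.2). A quick check that the custom metrics API is actually being served:

# Lists the custom metrics the adapter exposes; fails if no adapter is installed
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq '.resources[].name'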

4.3 Secure Defaults

Implement these secure defaults when configuring HPA:

  1. Always set both minimum and maximum replicas
    • Minimum ensures availability (typically ≥ 2)
    • Maximum prevents resource exhaustion attacks
  2. Set appropriate stabilization windows
    • Longer downscale windows (300+ seconds)
    • Shorter upscale windows for responsiveness (30-60 seconds)
  3. Configure scaling policies to limit rate of change
    • Percentage-based scale-down (max 20-25% at once)
    • Pod-limit for scale-up to prevent excessive scaling
  4. Use namespaced resource quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: autoscaling-limits
  namespace: production
spec:
  hard:
    pods: "50"
    count/horizontalpodautoscalers.autoscaling: "10"
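
To confirm the quota is active and see current consumption against it:

kubectl describe resourcequota autoscaling-limits -n production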

4.4 Common Misconfigurations

| Misconfiguration | Security Risk | Recommended Fix |
| --- | --- | --- |
| No maximum replica limit | Potential DoS or resource exhaustion | Always set reasonable maxReplicas |
| CPU-only scaling | Can miss memory leaks or attacks | Use multiple metrics, including memory |
| No stabilization window | Scale thrashing, predictable behavior | Set appropriate windows (300s+ for scale-down) |
| Custom metrics without validation | Scaling on manipulated metrics | Verify metric sources, implement anomaly detection |
| No resource limits on pods | Pod resource consumption attacks | Always set resource requests and limits |
| Autoscaling privileged containers | Increased attack surface on scale | Never run privileged containers with autoscaling |

4.5 RBAC for HPA

Implement least-privilege access for HPA management:

# Role for HPA monitoring only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-monitor
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch"]
---
# Role for HPA management
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-admin
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

Bind these roles only to service accounts and users that require access:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hpa-admins
  namespace: production
subjects:
- kind: Group
  name: platform-admins  # avoid the system: prefix, which is reserved for Kubernetes-defined groups
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: hpa-admin
  apiGroup: rbac.authorization.k8s.io

5. Security Best Practices

5.1 Resource Limits and Requests

Properly configuring resource requests and limits is essential for secure HPA operation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-application
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 500m
            memory: 1Gi

Security Benefits:

  • Prevents noisy neighbor issues
  • Mitigates pod resource consumption attacks
  • Ensures predictable scaling behavior
  • Protects against memory leak exploitation

Best Practices:

  • Set memory limits ~2x the expected usage
  • Set CPU requests based on observed P95 usage
  • Use LimitRanges to enforce minimum and maximum values (example below)
  • Monitor for containers hitting limits
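
A LimitRange, as mentioned above, keeps individual containers within sane bounds even as the HPA multiplies them; the values below are illustrative and should be tuned per namespace:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-bounds
  namespace: production
spec:
  limits:
  - type: Container
    min:
      cpu: 50m
      memory: 64Mi
    max:
      cpu: "2"
      memory: 2Gi
    defaultRequest:
      cpu: 250m
      memory: 512Mi
    default:
      cpu: 500m
      memory: 1Gi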

5.2 Metric Selection and Validation

| Metric Type | Security Considerations | Recommendation |
| --- | --- | --- |
| CPU | Reliable but can be manipulated | Good baseline, combine with others |
| Memory | Critical for detecting memory leaks | Always include, but set alerts for anomalies |
| Custom application metrics | Can be spoofed if not secured | Implement authentication, validate sources |
| External metrics | Potential supply chain attack vector | Use only trusted external metrics providers with TLS |

Validation Strategy:

  1. Implement baseline and anomaly detection
  2. Record metric history to identify unusual patterns
  3. Set up alerting for metric spikes that trigger scaling (example rule below)
  4. Review logs when unusual scaling occurs
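
A minimal sketch of such an alert, assuming request volume is recorded in a Prometheus counter named http_requests_total (substitute whichever metric actually drives your scaling):

# Hypothetical spike alert: current rate exceeds 3x the trailing 1-day average
- alert: ScalingMetricSpike
  expr: |
    sum(rate(http_requests_total[5m]))
      > 3 * sum(avg_over_time(rate(http_requests_total[5m])[1d:5m]))
  for: 10m
  labels:
    severity: warning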

5.3 Scaling Policies and Thresholds

Configure scaling thresholds to balance responsiveness with security:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 20  # Max 20% scale down at once
      periodSeconds: 60
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 5   # Max 5 pods at once
      periodSeconds: 30
    selectPolicy: Max

Security-Focused Policies:

  • Implement slower scale-down than scale-up
  • Use percentage-based policies for large deployments
  • Set pod-count limits for smaller deployments
  • Configure alerts for rapid scale events

5.4 Secrets Management with Scaling

When pods scale rapidly, secrets management becomes critical:

Recommended Approaches:

  1. Use external secrets stores like HashiCorp Vault or AWS Secrets Manager
  2. Implement proper init container patterns for secret retrieval (see the sketch after the ExternalSecret example)
  3. Set up dynamic secret rotation that accommodates scaled pods
  4. Never bake secrets into container images

Example with External Secrets Operator:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: application-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: application-secret
    creationPolicy: Owner
  data:
  - secretKey: api-key
    remoteRef:
      key: services/myapp/api-key
      property: value
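
For the init container pattern mentioned in item 2 above, a minimal sketch follows; the Vault address, secret path, image tags, and app image are assumptions to replace with your own, and it presumes Vault authentication is provided separately (for example, via the Vault agent injector):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-application
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: secure-application
  template:
    metadata:
      labels:
        app: secure-application
    spec:
      initContainers:
      - name: fetch-secrets
        image: hashicorp/vault:1.15  # assumed tag
        command: ["sh", "-c"]
        # Hypothetical secret path; requires VAULT_ADDR and a token to be provided
        args: ["vault kv get -field=api-key secret/myapp > /secrets/api-key"]
        env:
        - name: VAULT_ADDR
          value: https://vault.internal:8200  # assumed address
        volumeMounts:
        - name: secrets
          mountPath: /secrets
      containers:
      - name: app
        image: registry.example.com/app:1.0  # placeholder
        volumeMounts:
        - name: secrets
          mountPath: /secrets
          readOnly: true
      volumes:
      - name: secrets
        emptyDir:
          medium: Memory  # tmpfs, so secret material never touches node disk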

5.5 Logging and Audit Strategies

Implement comprehensive logging for HPA operations:

  1. Enable Kubernetes Audit Logging:
# kube-apiserver flags
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log
  2. Audit Policy for HPA Operations:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
  resources:
  - group: autoscaling
    resources: ["horizontalpodautoscalers"]
  3. Forward logs to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <match kubernetes.var.log.containers.metrics-server-**.log>
      @type elasticsearch
      host elasticsearch.monitoring
      port 9200
      logstash_format true
      tag_key @log_name
      <buffer>
        flush_interval 5s
      </buffer>
    </match>

5.6 Threat Modeling for Autoscaling

| Threat | Potential Impact | Mitigation |
| --- | --- | --- |
| Denial of Wallet attack | Excessive resource costs | Set strict maxReplicas, implement alerts for unusual scaling |
| Pod Crash Loop attack | Force continuous rescaling | Configure proper liveness/readiness probes, increase stabilization windows |
| Metric Manipulation | Improper scaling decisions | Use multiple metrics, implement anomaly detection |
| Resource Exhaustion | Cluster-wide outage | Configure namespace quotas, node affinity to spread load |
| Credential Exposure during scale | Secret disclosure | Use proper secret management systems designed for scale |

5.7 Securely Updating HPA Configurations

Implement a secure change process:

  1. Version control all HPA configurations with GitOps
  2. Require peer review before applying changes
  3. Test changes in staging environment before production
  4. Use progressive deployment patterns:
# Initial deployment with conservative settings
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: application-hpa
  annotations:
    fluxcd.io/automated: "true"
    fluxcd.io/tag.chart-image: semver:~1.0
spec:
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

After validation, incrementally update maximum limits and adjust metrics.

6. Integrations

6.1 Metrics Server and Custom Metrics

Secure metrics-server deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        image: registry.k8s.io/metrics-server/metrics-server:v0.6.3@sha256:abc123... # Use digest
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls=false # Require valid certs
        - --tls-cert-file=/certs/tls.crt
        - --tls-private-key-file=/certs/tls.key
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: certs
          mountPath: /certs
          readOnly: true
      volumes:
      - name: tmp
        emptyDir: {}
      - name: certs
        secret:
          secretName: metrics-server-certs

6.2 Integration with External Metrics Providers

When using external metrics (e.g., Prometheus), implement proper authentication:

apiVersion: v1
kind: Secret
metadata:
  name: prometheus-adapter-tls
  namespace: monitoring
type: kubernetes.io/tls
data:
  tls.crt: BASE64_ENCODED_CERT
  tls.key: BASE64_ENCODED_KEY
  ca.crt: BASE64_ENCODED_CA
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  template:
    spec:
      containers:
      - name: prometheus-adapter
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/tls.crt
        - --tls-private-key-file=/var/run/serving-cert/tls.key
        - --logtostderr=true
        - --prometheus-url=https://prometheus.monitoring.svc:9090/
        - --metrics-relist-interval=30s
        - --v=4
        - --config=/etc/adapter/config.yaml
        - --prometheus-auth-config=/etc/prometheus-auth/auth.yaml
        volumeMounts:
        - name: serving-cert
          mountPath: /var/run/serving-cert
          readOnly: true
        - name: config
          mountPath: /etc/adapter/
          readOnly: true
        - name: prometheus-auth
          mountPath: /etc/prometheus-auth/
          readOnly: true
      volumes:
      - name: serving-cert
        secret:
          secretName: prometheus-adapter-tls
      - name: config
        configMap:
          name: adapter-config
      - name: prometheus-auth
        secret:
          secretName: prometheus-basic-auth

6.3 SIEM and Monitoring Integration

Integrate HPA with security monitoring:

  1. Prometheus AlertManager rules for HPA:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hpa-alerts
  namespace: monitoring
spec:
  groups:
  - name: hpa.rules
    rules:
    - alert: HPARapidScaling
      expr: changes(kube_horizontalpodautoscaler_status_current_replicas[5m]) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} is scaling rapidly"
        description: "The HPA has changed replica count by more than 10 in 5 minutes, which may indicate an attack or resource leak."
    - alert: HPAAtMaxReplicas
      expr: kube_horizontalpodautoscaler_status_current_replicas == kube_horizontalpodautoscaler_spec_max_replicas
      for: 30m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} at max replicas"
        description: "The HPA has been at maximum replicas for 30 minutes, which may indicate sustained load or an attack."
  2. Forward relevant security events to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-config
  namespace: monitoring
data:
  vector.toml: |
    [sources.kubernetes_logs]
    type = "kubernetes_logs"

    [transforms.hpa_events]
    type = "filter"
    inputs = ["kubernetes_logs"]
    condition = 'contains(string!(.message), "horizontalpodautoscaler") && (contains(string!(.message), "scaled") || contains(string!(.message), "failed"))'

    [sinks.elasticsearch]
    type = "elasticsearch"
    inputs = ["hpa_events"]
    endpoint = "https://elk.security.svc:9200"
    index = "kubernetes-hpa-events-%Y.%m.%d"
    
    [sinks.elasticsearch.auth]
    strategy = "basic"
    user = "${ELASTICSEARCH_USER}"
    password = "${ELASTICSEARCH_PASSWORD}"

6.4 CI/CD Pipeline Integration

Secure automation for HPA configuration in CI/CD:

  1. GitOps workflow with policy validation:
# .github/workflows/hpa-validation.yml
name: Validate HPA Configurations
on:
  pull_request:
    paths:
      - 'k8s/autoscaling/*.yaml'
      
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up Kubeconform
        run: |
          curl -Lo ./kubeconform.tar.gz https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz
          tar xf kubeconform.tar.gz
      
      - name: Validate Kubernetes schemas
        run: |
          ./kubeconform -strict -schema-location default -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' k8s/autoscaling/
      
      - name: Run security policy checks
        uses: instrumenta/conftest-action@master
        with:
          files: k8s/autoscaling/
          policy: policy/autoscaling/
  2. Example Conftest policy for HPA:
# policy/autoscaling/hpa.rego
package main

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.maxReplicas
  msg = "HPA must specify maxReplicas"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  input.spec.maxReplicas > 20
  msg = "HPA maxReplicas exceeds security threshold of 20"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.behavior
  msg = "HPA must specify scaling behavior with appropriate stabilization windows"
}
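
The same checks can be run locally before opening a pull request, using the conftest CLI:

conftest test k8s/autoscaling/ --policy policy/autoscaling/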

7. Testing and Validation

7.1 Load Testing for Scale Verification

Use secure load testing to validate autoscaling behavior:

# Using k6 for load testing with security profile
cat <<EOF > load-test.js
import http from 'k6/http';
import { sleep } from 'k6';

export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Ramp up
    { duration: '10m', target: 100 }, // Stay at 100 VUs
    { duration: '5m', target: 0 },    // Ramp down
  ],
  thresholds: {
    'http_req_duration': ['p(95)<500'], // 95% of requests must complete under 500ms
  },
};

export default function() {
  let res = http.get('https://app.example.com/api/endpoint');
  sleep(1);
}
EOF

# Run the test and monitor HPA behavior
k6 run load-test.js

Monitor HPA behavior during the test:

watch -n 5 "kubectl get hpa -n production"

7.2 Security Testing the Autoscaler

Test scenarios for common attacks:

  1. Resource Exhaustion Test:
# Deploy test pod that consumes increasing CPU
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-consumer
  namespace: testing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resource-consumer
  template:
    metadata:
      labels:
        app: resource-consumer
    spec:
      containers:
      - name: resource-consumer
        image: registry.k8s.io/e2e-test-images/resource-consumer:1.9
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1
            memory: 500Mi
EOF

# Generate load via the resource-consumer HTTP API (it listens on port 8080)
kubectl port-forward -n testing deployment/resource-consumer 8080:8080 &
curl --data "millicores=800&durationSec=300" http://localhost:8080/ConsumeCPU # 800m CPU for 300 seconds
  2. Metric Manipulation Test:

For custom metrics, implement tests that generate anomalous metrics to observe autoscaler response.
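
For example, if your custom metrics flow through a Prometheus Pushgateway, a hedged test can push an obviously inflated value and verify the HPA does not react blindly; the Pushgateway address and metric name below are assumptions:

# Push an anomalous value for a hypothetical scaling metric
echo "packets_per_second 999999" | curl --data-binary @- \
  http://pushgateway.monitoring.svc:9091/metrics/job/hpa-abuse-test

# Watch whether the HPA scales on the bogus value
kubectl get hpa -n production --watch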

7.3 Chaos Engineering for Autoscaling

Use chaos engineering to test resilience of autoscaled applications:

apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-chaos
  namespace: testing
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
      - production
    labelSelectors:
      app: "scaled-application"
  scheduler:
    cron: "@every 5m"

Monitor how the HPA responds to pod failures and ensure it maintains stability.

8. References and Further Reading

Known CVEs Affecting Autoscaling

  • CVE-2023-3955: Custom metrics API server privilege escalation
  • CVE-2022-24348: HPA controller memory leak on malformed metrics
  • CVE-2021-25740: Potential DoS with external metrics manipulations

9. Appendices

9.1 Troubleshooting HPA Issues

| Issue | Possible Causes | Security Implications | Resolution |
| --- | --- | --- | --- |
| HPA not scaling | Metrics unavailable | May indicate metrics service compromise | Verify metrics-server is running and secure |
| | Resource requests not set | Resource exhaustion risk | Configure proper resource requests |
| | RBAC issues | Unauthorized access attempts | Check permissions for HPA controller |
| Erratic scaling | Metric source fluctuating | Potential DoS vector | Increase stabilization window |
| | Improper probe configuration | Pod restart loops | Configure appropriate liveness/readiness probes |
| Scale-up but no scale-down | Memory leaks | Resource exhaustion | Review application for memory management issues |
| HPA at maximum replicas | DoS attack | Cluster resource exhaustion | Investigate unusual traffic patterns |

Debugging Commands:

# Check HPA status and events
kubectl describe hpa application-hpa -n production

# Verify metrics availability
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/production/pods"

# Check external metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq

# Examine pod resource usage
kubectl top pods -n production

9.2 Frequently Asked Questions

Q: What’s the most secure way to implement custom metrics for HPA?

A: The most secure implementation involves:

  1. Deploying the custom metrics adapter with RBAC
  2. Enforcing TLS for all metrics communication
  3. Validating metrics against baselines to detect anomalies
  4. Using PodSecurityPolicies (or alternatives) to restrict the metrics adapter’s permissions

Q: How can I prevent Denial of Wallet attacks through HPA manipulation?

A: Implement these defenses:

  1. Set reasonable maxReplicas limits
  2. Configure rate-limited scaling policies
  3. Implement cost monitoring with alerts for unusual scaling
  4. Use namespace quotas to limit total resource consumption
  5. Consider implementing custom admission controllers that validate scaling requests

Q: Should I use CPU, memory, or custom metrics for autoscaling?

A: From a security perspective:

  1. Use a combination of metric types for defense-in-depth
  2. CPU is a good baseline but can be manipulated through CPU-intensive attacks
  3. Memory is critical for detecting memory leaks and should always be monitored
  4. Custom metrics provide application-specific insight but must be secured against manipulation
  5. Always validate custom metrics by implementing authentication and TLS

Q: How do I prevent scaling based on malicious traffic?

A: Implement these safeguards:

  1. Rate limiting at the ingress level before traffic affects scaling metrics
  2. Web Application Firewall (WAF) to filter malicious requests
  3. Anomaly detection for traffic patterns that trigger scaling
  4. Separate metrics for legitimate vs. suspicious traffic (e.g., authenticated vs. unauthenticated requests)
  5. Configure longer stabilization windows to mitigate short-term traffic spikes

Q: What RBAC permissions should the HPA controller have?

A: Apply least privilege principles:

  1. The HPA controller service account should have only:
    • get, list, and watch on target resources
    • get on metrics endpoints
    • update on scale subresources (see the sketch after this list)
  2. Never give the controller broad cluster-admin privileges
  3. Scope permissions to specific namespaces where possible
  4. Regularly audit RBAC configurations for drift
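
A minimal Role capturing item 1, sketched for a Deployment target (the names and namespace are illustrative; the built-in controller normally uses cluster-scoped system roles):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-controller-minimal
  namespace: production
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments/scale"]
  verbs: ["get", "update"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]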

9.3 Version-Specific Considerations

Kubernetes 1.23+

The autoscaling/v2 API, which reached GA in this release, offers stabilization window configuration for better control over scaling behavior:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
  scaleUp:
    stabilizationWindowSeconds: 60

Kubernetes 1.24+

Added container resource metrics support:

metrics:
- type: ContainerResource
  containerResource:
    name: cpu
    container: application
    target:
      type: Utilization
      averageUtilization: 70

Kubernetes 1.25+

Improved support for custom and external metrics with standardized labeling:

metrics:
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: worker_tasks
    target:
      type: AverageValue
      averageValue: 30

Kubernetes 1.26+

Enhanced HPA status reporting with conditions:

kubectl get hpa application-hpa -o jsonpath='{.status.conditions}'

Status conditions provide better visibility into scaling decisions and potential security issues.


Security Checklist for Kubernetes HPA

Use this checklist to verify your HPA configuration meets security best practices:

  • Resource Limits and Requests
    • All pods have CPU and Memory requests defined
    • All pods have CPU and Memory limits defined
    • Namespace resource quotas are implemented
  • HPA Configuration
    • Both minReplicas and maxReplicas are explicitly set
    • Multiple metrics are used for scaling decisions
    • Stabilization windows are configured appropriately
    • Scale-up/down policies limit rate of change
  • Access Controls
    • RBAC is implemented with least privilege
    • Service accounts use minimal permissions
    • HPA configuration changes require approval
  • Metrics Security
    • Metrics Server is configured with TLS
    • Custom metrics adapters use authentication
    • External metrics sources use secure connections
  • Monitoring and Auditing
    • Logging is enabled for all scaling events
    • Alerts exist for unusual scaling behavior
    • Audit logging captures HPA configuration changes
  • Testing and Validation
    • Load testing validates scaling behavior
    • Security tests check for scaling vulnerabilities
    • Chaos testing verifies resilience during scaling
  • CI/CD Integration
    • HPA configurations are version controlled
    • Security validation occurs before deployment
    • Changes are deployed progressively

By implementing the security controls outlined in this guide, you can ensure that your Kubernetes Horizontal Pod Autoscaler enhances both your application’s performance and security posture. Properly configured autoscaling provides resilience against both legitimate traffic spikes and potential attacks, while maintaining a secure and efficient resource footprint.

Remember that autoscaling is just one component of a comprehensive Kubernetes security strategy and should be implemented alongside other security controls such as network policies, pod security standards, and secure supply chain practices.


This guide was last updated: May 9, 2025

This post is licensed under CC BY 4.0 by the author.