Kubernetes HPA Autoscaling Guide

Step-by-step guide to using Kubernetes HPA to autoscale pods based on CPU, memory, and custom metrics, including setup, configuration, and best practices.

Kubernetes Horizontal Pod Autoscaler: A Security-Focused Implementation Guide

1. Introduction and Purpose

The Kubernetes Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a deployment, replica set, or stateful set based on observed resource utilization. While autoscaling provides significant operational benefits, improper configuration can introduce security vulnerabilities, performance issues, and unexpected costs. This guide provides a security-focused approach to implementing HPA, ensuring that your autoscaling strategy enhances rather than compromises your security posture.

Key Security Considerations:

  • Resource quota management to prevent denial-of-service scenarios
  • Proper metric selection to avoid scaling based on manipulated inputs
  • Access controls around autoscaler configuration
  • Monitoring and alerting for unusual scaling events

2. Table of Contents

  1. Introduction and Purpose
  2. Table of Contents
  3. Installation
    1. Prerequisites
    2. Verifying HPA Components
    3. Platform-Specific Considerations
  4. Configuration
    1. Basic HPA Configuration
    2. Advanced Configuration Options
    3. Secure Defaults
    4. Common Misconfigurations
    5. RBAC for HPA
  5. Security Best Practices
    1. Resource Limits and Requests
    2. Metric Selection and Validation
    3. Scaling Policies and Thresholds
    4. Secrets Management with Scaling
    5. Logging and Audit Strategies
    6. Threat Modeling for Autoscaling
    7. Securely Updating HPA Configurations
  6. Integrations
    1. Metrics Server and Custom Metrics
    2. Integration with External Metrics Providers
    3. SIEM and Monitoring Integration
    4. CI/CD Pipeline Integration
  7. Testing and Validation
    1. Load Testing for Scale Verification
    2. Security Testing the Autoscaler
    3. Chaos Engineering for Autoscaling
  8. References and Further Reading
  9. Appendices
    1. Troubleshooting HPA Issues
    2. Frequently Asked Questions
    3. Version-Specific Considerations

3. Installation

3.1 Prerequisites

Before implementing HPA, ensure your Kubernetes cluster meets these security-focused prerequisites:

  • Kubernetes version 1.23+ (recommended for latest security features)
  • Metrics Server installed and properly secured
  • Resource quotas implemented at namespace level
  • Network policies in place
  • Pod Security Standards applied

Verify Metrics Server Installation:

# Check if metrics-server is deployed
kubectl get deployment metrics-server -n kube-system

# Verify metrics are being collected
kubectl top nodes
kubectl top pods --all-namespaces
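
If kubectl top returns no data, confirm that the resource-metrics APIService is registered and healthy before debugging further; this is a safe, read-only check:

# Confirm the resource-metrics APIService is registered and Available
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'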

3.2 Verifying HPA Components

Always verify the integrity of components before deploying:

# For metrics-server, verify the image digest
kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}'

# Check expected SHA256 against official releases
EXPECTED_SHA="sha256:abc123def456..." # Replace with actual digest
IMAGE=$(kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}')
ACTUAL_SHA=$(docker pull "$IMAGE" | awk '/Digest:/ {print $2}')
[ "$EXPECTED_SHA" = "$ACTUAL_SHA" ] && echo "Verification successful" || echo "Verification failed"

3.3 Platform-Specific Considerations

Linux-Based Kubernetes

Most secure configuration with SELinux or AppArmor profiles:

# Apply the default AppArmor profile to metrics-server via a merge patch
# (a partial Deployment manifest piped to kubectl apply would fail schema validation)
kubectl patch deployment metrics-server -n kube-system --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"container.apparmor.security.beta.kubernetes.io/metrics-server":"runtime/default"}}}}}'

Windows Worker Nodes

Windows containers have different isolation models and security considerations:

# Ensure proper Windows security contexts
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-windows-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-windows
  template:
    metadata:
      labels:
        app: example-windows
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerUser"
      containers:
        - name: windows-container
          image: mcr.microsoft.com/windows/servercore:ltsc2022
          # HPA will scale the pods running this container
EOF

Cloud Provider Considerations

AWS EKS:

  • Use IAM roles for service accounts (IRSA) for metrics access (see the sketch after these lists)
  • Enable AWS CloudTrail for API auditing

Azure AKS:

  • Implement Azure AD integration for RBAC
  • Use Azure Monitor for containerized insights metrics

Google GKE:

  • Enable Workload Identity for metrics access
  • Use Cloud Monitoring for metrics collection
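
For the IRSA example referenced above, the wiring is an annotation on the service account that the metrics component runs as; the role ARN, name, and namespace below are placeholders:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-reader
  namespace: monitoring
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/metrics-reader-role  # placeholder ARN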

4. Configuration

4.1 Basic HPA Configuration

A security-focused HPA configuration with proper resource management:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-application-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2  # Never scale below 2 for high availability
  maxReplicas: 10 # Upper bound to prevent runaway scaling
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Prevent oscillation
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
      - type: Pods
        value: 5
        periodSeconds: 60
      selectPolicy: Max
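
After applying the manifest (the file name below is illustrative), watch the HPA reconcile and confirm it reports current metrics before sending any traffic:

kubectl apply -f secure-application-hpa.yaml
kubectl get hpa secure-application-hpa -n production --watch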

4.2 Advanced Configuration Options

Multiple Metrics for Scaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-multi-metric-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1000
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10000
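
The Pods and Object metrics above only work once a custom metrics adapter is installed (see 6.2). A quick check that the custom metrics API is actually being served:

# Lists the custom metrics the adapter exposes; fails if no adapter is installed
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq '.resources[].name'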

4.3 Secure Defaults

Implement these secure defaults when configuring HPA:

  1. Always set both minimum and maximum replicas
    • Minimum ensures availability (typically ≥ 2)
    • Maximum prevents resource exhaustion attacks
  2. Set appropriate stabilization windows
    • Longer downscale windows (300+ seconds)
    • Shorter upscale windows for responsiveness (30-60 seconds)
  3. Configure scaling policies to limit rate of change
    • Percentage-based scale-down (max 20-25% at once)
    • Pod-limit for scale-up to prevent excessive scaling
  4. Use namespaced resource quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: autoscaling-limits
  namespace: production
spec:
  hard:
    pods: "50"
    count/horizontalpodautoscalers.autoscaling: "10"
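
To confirm the quota is active and see current consumption against it:

kubectl describe resourcequota autoscaling-limits -n production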

4.4 Common Misconfigurations

| Misconfiguration | Security Risk | Recommended Fix |
| --- | --- | --- |
| No maximum replica limit | Potential DoS or resource exhaustion | Always set reasonable maxReplicas |
| CPU-only scaling | Can miss memory leaks or attacks | Use multiple metrics, including memory |
| No stabilization window | Scale thrashing, predictable behavior | Set appropriate windows (300s+ for scale-down) |
| Custom metrics without validation | Scaling on manipulated metrics | Verify metric sources, implement anomaly detection |
| No resource limits on pods | Pod resource consumption attacks | Always set resource requests and limits |
| Autoscaling privileged containers | Increased attack surface on scale | Never run privileged containers with autoscaling |

4.5 RBAC for HPA

Implement least-privilege access for HPA management:

# Role for HPA monitoring only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-monitor
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch"]
---
# Role for HPA management
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-admin
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

Bind these roles only to service accounts and users that require access:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hpa-admins
  namespace: production
subjects:
- kind: Group
  name: platform-admins  # avoid the system: prefix, which is reserved for Kubernetes-defined groups
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: hpa-admin
  apiGroup: rbac.authorization.k8s.io

5. Security Best Practices

5.1 Resource Limits and Requests

Properly configuring resource requests and limits is essential for secure HPA operation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-application
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 500m
            memory: 1Gi

Security Benefits:

  • Prevents noisy neighbor issues
  • Mitigates pod resource consumption attacks
  • Ensures predictable scaling behavior
  • Protects against memory leak exploitation

Best Practices:

  • Set memory limits ~2x the expected usage
  • Set CPU requests based on observed P95 usage
  • Use LimitRanges to enforce minimum and maximum values (example below)
  • Monitor for containers hitting limits
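
A LimitRange, as mentioned above, keeps individual containers within sane bounds even as the HPA multiplies them; the values below are illustrative and should be tuned per namespace:

apiVersion: v1
kind: LimitRange
metadata:
  name: resource-bounds
  namespace: production
spec:
  limits:
  - type: Container
    min:
      cpu: 50m
      memory: 64Mi
    max:
      cpu: "2"
      memory: 2Gi
    defaultRequest:
      cpu: 250m
      memory: 512Mi
    default:
      cpu: 500m
      memory: 1Gi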

5.2 Metric Selection and Validation

| Metric Type | Security Considerations | Recommendation |
| --- | --- | --- |
| CPU | Reliable but can be manipulated | Good baseline, combine with others |
| Memory | Critical for detecting memory leaks | Always include, but set alerts for anomalies |
| Custom application metrics | Can be spoofed if not secured | Implement authentication, validate sources |
| External metrics | Potential supply chain attack vector | Use only trusted external metrics providers with TLS |

Validation Strategy:

  1. Implement baseline and anomaly detection
  2. Record metric history to identify unusual patterns
  3. Set up alerting for metric spikes that trigger scaling (example rule below)
  4. Review logs when unusual scaling occurs
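
A minimal sketch of such an alert, assuming request volume is recorded in a Prometheus counter named http_requests_total (substitute whichever metric actually drives your scaling):

# Hypothetical spike alert: current rate exceeds 3x the trailing 1-day average
- alert: ScalingMetricSpike
  expr: |
    sum(rate(http_requests_total[5m]))
      > 3 * sum(avg_over_time(rate(http_requests_total[5m])[1d:5m]))
  for: 10m
  labels:
    severity: warning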

5.3 Scaling Policies and Thresholds

Configure scaling thresholds to balance responsiveness with security:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 20  # Max 20% scale down at once
      periodSeconds: 60
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 5   # Max 5 pods at once
      periodSeconds: 30
    selectPolicy: Max

Security-Focused Policies:

  • Implement slower scale-down than scale-up
  • Use percentage-based policies for large deployments
  • Set pod-count limits for smaller deployments
  • Configure alerts for rapid scale events

5.4 Secrets Management with Scaling

When pods scale rapidly, secrets management becomes critical:

Recommended Approaches:

  1. Use external secrets stores like HashiCorp Vault or AWS Secrets Manager
  2. Implement proper init container patterns for secret retrieval (see the sketch after the ExternalSecret example)
  3. Set up dynamic secret rotation that accommodates scaled pods
  4. Never bake secrets into container images

Example with External Secrets Operator:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: application-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: application-secret
    creationPolicy: Owner
  data:
  - secretKey: api-key
    remoteRef:
      key: services/myapp/api-key
      property: value
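
For the init container pattern mentioned in item 2 above, a minimal sketch follows; the Vault address, secret path, image tags, and app image are assumptions to replace with your own, and it presumes Vault authentication is provided separately (for example, via the Vault agent injector):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-application
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: secure-application
  template:
    metadata:
      labels:
        app: secure-application
    spec:
      initContainers:
      - name: fetch-secrets
        image: hashicorp/vault:1.15  # assumed tag
        command: ["sh", "-c"]
        # Hypothetical secret path; requires VAULT_ADDR and a token to be provided
        args: ["vault kv get -field=api-key secret/myapp > /secrets/api-key"]
        env:
        - name: VAULT_ADDR
          value: https://vault.internal:8200  # assumed address
        volumeMounts:
        - name: secrets
          mountPath: /secrets
      containers:
      - name: app
        image: registry.example.com/app:1.0  # placeholder
        volumeMounts:
        - name: secrets
          mountPath: /secrets
          readOnly: true
      volumes:
      - name: secrets
        emptyDir:
          medium: Memory  # tmpfs, so secret material never touches node disk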

5.5 Logging and Audit Strategies

Implement comprehensive logging for HPA operations:

  1. Enable Kubernetes Audit Logging:
# kube-apiserver flags
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log
  2. Audit Policy for HPA Operations:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
  resources:
  - group: autoscaling
    resources: ["horizontalpodautoscalers"]
  3. Forward logs to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <match kubernetes.var.log.containers.metrics-server-**.log>
      @type elasticsearch
      host elasticsearch.monitoring
      port 9200
      logstash_format true
      tag_key @log_name
      <buffer>
        flush_interval 5s
      </buffer>
    </match>

5.6 Threat Modeling for Autoscaling

| Threat | Potential Impact | Mitigation |
| --- | --- | --- |
| Denial of Wallet attack | Excessive resource costs | Set strict maxReplicas, implement alerts for unusual scaling |
| Pod Crash Loop attack | Force continuous rescaling | Configure proper liveness/readiness probes, increase stabilization windows |
| Metric Manipulation | Improper scaling decisions | Use multiple metrics, implement anomaly detection |
| Resource Exhaustion | Cluster-wide outage | Configure namespace quotas, node affinity to spread load |
| Credential Exposure during scale | Secret disclosure | Use proper secret management systems designed for scale |

5.7 Securely Updating HPA Configurations

Implement a secure change process:

  1. Version control all HPA configurations with GitOps
  2. Require peer review before applying changes
  3. Test changes in staging environment before production
  4. Use progressive deployment patterns:
# Initial deployment with conservative settings
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: application-hpa
  annotations:
    fluxcd.io/automated: "true"
    fluxcd.io/tag.chart-image: semver:~1.0
spec:
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

After validation, incrementally update maximum limits and adjust metrics.

6. Integrations

6.1 Metrics Server and Custom Metrics

Secure metrics-server deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        image: registry.k8s.io/metrics-server/metrics-server:v0.6.3@sha256:abc123... # Use digest
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls=false # Require valid certs
        - --tls-cert-file=/certs/tls.crt
        - --tls-private-key-file=/certs/tls.key
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: certs
          mountPath: /certs
          readOnly: true
      volumes:
      - name: tmp
        emptyDir: {}
      - name: certs
        secret:
          secretName: metrics-server-certs

6.2 Integration with External Metrics Providers

When using external metrics (e.g., Prometheus), implement proper authentication:

apiVersion: v1
kind: Secret
metadata:
  name: prometheus-adapter-tls
  namespace: monitoring
type: kubernetes.io/tls
data:
  tls.crt: BASE64_ENCODED_CERT
  tls.key: BASE64_ENCODED_KEY
  ca.crt: BASE64_ENCODED_CA
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  template:
    spec:
      containers:
      - name: prometheus-adapter
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/tls.crt
        - --tls-private-key-file=/var/run/serving-cert/tls.key
        - --logtostderr=true
        - --prometheus-url=https://prometheus.monitoring.svc:9090/
        - --metrics-relist-interval=30s
        - --v=4
        - --config=/etc/adapter/config.yaml
        - --prometheus-auth-config=/etc/prometheus-auth/auth.yaml
        volumeMounts:
        - name: serving-cert
          mountPath: /var/run/serving-cert
          readOnly: true
        - name: config
          mountPath: /etc/adapter/
          readOnly: true
        - name: prometheus-auth
          mountPath: /etc/prometheus-auth/
          readOnly: true
      volumes:
      - name: serving-cert
        secret:
          secretName: prometheus-adapter-tls
      - name: config
        configMap:
          name: adapter-config
      - name: prometheus-auth
        secret:
          secretName: prometheus-basic-auth

6.3 SIEM and Monitoring Integration

Integrate HPA with security monitoring:

  1. Prometheus AlertManager rules for HPA:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hpa-alerts
  namespace: monitoring
spec:
  groups:
  - name: hpa.rules
    rules:
    - alert: HPARapidScaling
      expr: changes(kube_horizontalpodautoscaler_status_current_replicas[5m]) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} is scaling rapidly"
        description: "The HPA has changed replica count by more than 10 in 5 minutes, which may indicate an attack or resource leak."
    - alert: HPAAtMaxReplicas
      expr: kube_horizontalpodautoscaler_status_current_replicas == kube_horizontalpodautoscaler_spec_max_replicas
      for: 30m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} at max replicas"
        description: "The HPA has been at maximum replicas for 30 minutes, which may indicate sustained load or an attack."
  2. Forward relevant security events to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-config
  namespace: monitoring
data:
  vector.toml: |
    [sources.kubernetes_logs]
    type = "kubernetes_logs"

    [transforms.hpa_events]
    type = "filter"
    inputs = ["kubernetes_logs"]
    condition = 'contains(string!(.message), "horizontalpodautoscaler") && (contains(string!(.message), "scaled") || contains(string!(.message), "failed"))'

    [sinks.elasticsearch]
    type = "elasticsearch"
    inputs = ["hpa_events"]
    endpoint = "https://elk.security.svc:9200"
    index = "kubernetes-hpa-events-%Y.%m.%d"
    
    [sinks.elasticsearch.auth]
    strategy = "basic"
    user = "${ELASTICSEARCH_USER}"
    password = "${ELASTICSEARCH_PASSWORD}"

6.4 CI/CD Pipeline Integration

Secure automation for HPA configuration in CI/CD:

  1. GitOps workflow with policy validation:
# .github/workflows/hpa-validation.yml
name: Validate HPA Configurations
on:
  pull_request:
    paths:
      - 'k8s/autoscaling/*.yaml'
      
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Set up Kubeconform
        run: |
          curl -Lo ./kubeconform.tar.gz https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz
          tar xf kubeconform.tar.gz
      
      - name: Validate Kubernetes schemas
        run: |
          ./kubeconform -strict -schema-location default -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' k8s/autoscaling/
      
      - name: Run security policy checks
        uses: instrumenta/conftest-action@master
        with:
          files: k8s/autoscaling/
          policy: policy/autoscaling/
  2. Example Conftest policy for HPA:
# policy/autoscaling/hpa.rego
package main

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.maxReplicas
  msg = "HPA must specify maxReplicas"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  input.spec.maxReplicas > 20
  msg = "HPA maxReplicas exceeds security threshold of 20"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.behavior
  msg = "HPA must specify scaling behavior with appropriate stabilization windows"
}
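
The same checks can be run locally before opening a pull request, using the conftest CLI:

conftest test k8s/autoscaling/ --policy policy/autoscaling/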

7. Testing and Validation

7.1 Load Testing for Scale Verification

Use secure load testing to validate autoscaling behavior:

# Using k6 for load testing with security profile
cat <<EOF > load-test.js
import http from 'k6/http';
import { sleep } from 'k6';

export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Ramp up
    { duration: '10m', target: 100 }, // Stay at 100 VUs
    { duration: '5m', target: 0 },    // Ramp down
  ],
  thresholds: {
    'http_req_duration': ['p(95)<500'], // 95% of requests must complete under 500ms
  },
};

export default function() {
  let res = http.get('https://app.example.com/api/endpoint');
  sleep(1);
}
EOF

# Run the test and monitor HPA behavior
k6 run load-test.js

Monitor HPA behavior during the test:

watch -n 5 "kubectl get hpa -n production"

7.2 Security Testing the Autoscaler

Test scenarios for common attacks:

  1. Resource Exhaustion Test:
# Deploy test pod that consumes increasing CPU
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-consumer
  namespace: testing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resource-consumer
  template:
    metadata:
      labels:
        app: resource-consumer
    spec:
      containers:
      - name: resource-consumer
        image: registry.k8s.io/e2e-test-images/resource-consumer:1.9
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1
            memory: 500Mi
EOF

# Generate load via the resource-consumer HTTP API (it listens on port 8080)
kubectl port-forward -n testing deployment/resource-consumer 8080:8080 &
curl --data "millicores=800&durationSec=300" http://localhost:8080/ConsumeCPU # 800m CPU for 300 seconds
  2. Metric Manipulation Test:

For custom metrics, implement tests that generate anomalous metrics to observe autoscaler response.
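
For example, if your custom metrics flow through a Prometheus Pushgateway, a hedged test can push an obviously inflated value and verify the HPA does not react blindly; the Pushgateway address and metric name below are assumptions:

# Push an anomalous value for a hypothetical scaling metric
echo "packets_per_second 999999" | curl --data-binary @- \
  http://pushgateway.monitoring.svc:9091/metrics/job/hpa-abuse-test

# Watch whether the HPA scales on the bogus value
kubectl get hpa -n production --watch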

7.3 Chaos Engineering for Autoscaling

Use chaos engineering to test resilience of autoscaled applications:

apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-chaos
  namespace: testing
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
      - production
    labelSelectors:
      app: "scaled-application"
  scheduler:
    cron: "@every 5m"

Monitor how the HPA responds to pod failures and ensure it maintains stability.

8. References and Further Reading

Known CVEs Affecting Autoscaling

  • CVE-2023-3955: Custom metrics API server privilege escalation
  • CVE-2022-24348: HPA controller memory leak on malformed metrics
  • CVE-2021-25740: Potential DoS with external metrics manipulations

9. Appendices

9.1 Troubleshooting HPA Issues

| Issue | Possible Causes | Security Implications | Resolution |
| --- | --- | --- | --- |
| HPA not scaling | Metrics unavailable | May indicate metrics service compromise | Verify metrics-server is running and secure |
| | Resource requests not set | Resource exhaustion risk | Configure proper resource requests |
| | RBAC issues | Unauthorized access attempts | Check permissions for HPA controller |
| Erratic scaling | Metric source fluctuating | Potential DoS vector | Increase stabilization window |
| | Improper probe configuration | Pod restart loops | Configure appropriate liveness/readiness probes |
| Scale-up but no scale-down | Memory leaks | Resource exhaustion | Review application for memory management issues |
| HPA at maximum replicas | DoS attack | Cluster resource exhaustion | Investigate unusual traffic patterns |

Debugging Commands:

# Check HPA status and events
kubectl describe hpa application-hpa -n production

# Verify metrics availability
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/production/pods"

# Check external metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq

# Examine pod resource usage
kubectl top pods -n production

9.2 Frequently Asked Questions

Q: What’s the most secure way to implement custom metrics for HPA?

A: The most secure implementation involves:

  1. Deploying the custom metrics adapter with RBAC
  2. Enforcing TLS for all metrics communication
  3. Validating metrics against baselines to detect anomalies
  4. Using PodSecurityPolicies (or alternatives) to restrict the metrics adapter’s permissions

Q: How can I prevent Denial of Wallet attacks through HPA manipulation?

A: Implement these defenses:

  1. Set reasonable maxReplicas limits
  2. Configure rate-limited scaling policies
  3. Implement cost monitoring with alerts for unusual scaling
  4. Use namespace quotas to limit total resource consumption
  5. Consider implementing custom admission controllers that validate scaling requests

Q: Should I use CPU, memory, or custom metrics for autoscaling?

A: From a security perspective:

  1. Use a combination of metric types for defense-in-depth
  2. CPU is a good baseline but can be manipulated through CPU-intensive attacks
  3. Memory is critical for detecting memory leaks and should always be monitored
  4. Custom metrics provide application-specific insight but must be secured against manipulation
  5. Always validate custom metrics by implementing authentication and TLS

Q: How do I prevent scaling based on malicious traffic?

A: Implement these safeguards:

  1. Rate limiting at the ingress level before traffic affects scaling metrics
  2. Web Application Firewall (WAF) to filter malicious requests
  3. Anomaly detection for traffic patterns that trigger scaling
  4. Separate metrics for legitimate vs. suspicious traffic (e.g., authenticated vs. unauthenticated requests)
  5. Configure longer stabilization windows to mitigate short-term traffic spikes

Q: What RBAC permissions should the HPA controller have?

A: Apply least privilege principles:

  1. The HPA controller service account should have only:
    • get, list, and watch on target resources
    • get on metrics endpoints
    • update on scale subresources (see the sketch after this list)
  2. Never give the controller broad cluster-admin privileges
  3. Scope permissions to specific namespaces where possible
  4. Regularly audit RBAC configurations for drift
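
A minimal Role capturing item 1, sketched for a Deployment target (the names and namespace are illustrative; the built-in controller normally uses cluster-scoped system roles):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-controller-minimal
  namespace: production
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments/scale"]
  verbs: ["get", "update"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]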

9.3 Version-Specific Considerations

Kubernetes 1.23+

The autoscaling/v2 API, which reached GA in this release, offers stabilization window configuration for better control over scaling behavior:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
  scaleUp:
    stabilizationWindowSeconds: 60

Kubernetes 1.24+

Added container resource metrics support:

metrics:
- type: ContainerResource
  containerResource:
    name: cpu
    container: application
    target:
      type: Utilization
      averageUtilization: 70

Kubernetes 1.25+

Improved support for custom and external metrics with standardized labeling:

metrics:
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: worker_tasks
    target:
      type: AverageValue
      averageValue: 30

Kubernetes 1.26+

Enhanced HPA status reporting with conditions:

kubectl get hpa application-hpa -o jsonpath='{.status.conditions}'

Status conditions provide better visibility into scaling decisions and potential security issues.


Security Checklist for Kubernetes HPA

Use this checklist to verify your HPA configuration meets security best practices:

  • Resource Limits and Requests
    • All pods have CPU and Memory requests defined
    • All pods have CPU and Memory limits defined
    • Namespace resource quotas are implemented
  • HPA Configuration
    • Both minReplicas and maxReplicas are explicitly set
    • Multiple metrics are used for scaling decisions
    • Stabilization windows are configured appropriately
    • Scale-up/down policies limit rate of change
  • Access Controls
    • RBAC is implemented with least privilege
    • Service accounts use minimal permissions
    • HPA configuration changes require approval
  • Metrics Security
    • Metrics Server is configured with TLS
    • Custom metrics adapters use authentication
    • External metrics sources use secure connections
  • Monitoring and Auditing
    • Logging is enabled for all scaling events
    • Alerts exist for unusual scaling behavior
    • Audit logging captures HPA configuration changes
  • Testing and Validation
    • Load testing validates scaling behavior
    • Security tests check for scaling vulnerabilities
    • Chaos testing verifies resilience during scaling
  • CI/CD Integration
    • HPA configurations are version controlled
    • Security validation occurs before deployment
    • Changes are deployed progressively

By implementing the security controls outlined in this guide, you can ensure that your Kubernetes Horizontal Pod Autoscaler enhances both your application’s performance and security posture. Properly configured autoscaling provides resilience against both legitimate traffic spikes and potential attacks, while maintaining a secure and efficient resource footprint.

Remember that autoscaling is just one component of a comprehensive Kubernetes security strategy and should be implemented alongside other security controls such as network policies, pod security standards, and secure supply chain practices.


This guide was last updated: May 9, 2025

This post is licensed under CC BY 4.0 by the author.