Kubernetes HPA Autoscaling Guide
Step-by-step guide to using Kubernetes HPA to autoscale pods based on CPU, memory, and custom metrics, including setup, configuration, and best practices.
Kubernetes Horizontal Pod Autoscaler: A Security-Focused Implementation Guide
1. Introduction and Purpose
The Kubernetes Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a deployment, replica set, or stateful set based on observed resource utilization. While autoscaling provides significant operational benefits, improper configuration can introduce security vulnerabilities, performance issues, and unexpected costs. This guide provides a security-focused approach to implementing HPA, ensuring that your autoscaling strategy enhances rather than compromises your security posture.
Key Security Considerations:
- Resource quota management to prevent denial-of-service scenarios
- Proper metric selection to avoid scaling based on manipulated inputs
- Access controls around autoscaler configuration
- Monitoring and alerting for unusual scaling events
2. Table of Contents
- Introduction and Purpose
- Table of Contents
- Installation
- Configuration
- Security Best Practices
- Integrations
- Testing and Validation
- References and Further Reading
- Appendices
3. Installation
3.1 Prerequisites
Before implementing HPA, ensure your Kubernetes cluster meets these security-focused prerequisites:
- Kubernetes version 1.23+ (recommended for latest security features)
- Metrics Server installed and properly secured
- Resource quotas implemented at namespace level
- Network policies in place
- Pod Security Standards applied
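For the namespace-level prerequisites, a minimal sketch (assuming the workloads run in a namespace named production) is to label the namespace for Pod Security Standards enforcement; a matching ResourceQuota example appears in section 4.3:
# Hypothetical namespace hardening: enforce the restricted Pod Security Standard
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted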
Verify Metrics Server Installation:
# Check if metrics-server is deployed
kubectl get deployment metrics-server -n kube-system
# Verify metrics are being collected
kubectl top nodes
kubectl top pods --all-namespaces
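If kubectl top returns errors, confirm that the resource metrics API itself is registered and reporting as available (this APIService is created by metrics-server):
# The metrics.k8s.io APIService should report Available=True
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'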
3.2 Verifying HPA Components
Always verify the integrity of components before deploying:
# For metrics-server, verify the image reference pinned in the deployment
kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}'
# Compare the expected SHA256 digest against the official release
EXPECTED_SHA="sha256:abc123def456..." # Replace with the published digest
IMAGE=$(kubectl get deployment metrics-server -n kube-system -o jsonpath='{.spec.template.spec.containers[0].image}')
ACTUAL_SHA=$(docker pull "$IMAGE" | awk '/Digest:/ {print $2}')
[ "$EXPECTED_SHA" = "$ACTUAL_SHA" ] && echo "Verification successful" || echo "Verification failed"
3.3 Platform-Specific Considerations
Linux-Based Kubernetes
Most secure configuration with SELinux or AppArmor profiles:
# Example: add an AppArmor annotation (runtime/default profile) to the metrics-server pod template
kubectl patch deployment metrics-server -n kube-system --type merge --patch '
spec:
  template:
    metadata:
      annotations:
        container.apparmor.security.beta.kubernetes.io/metrics-server: runtime/default
'
Windows Worker Nodes
Windows containers have different isolation models and security considerations:
# Ensure proper Windows security contexts
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-windows-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-windows
  template:
    metadata:
      labels:
        app: example-windows
    spec:
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerUser"
      nodeSelector:
        kubernetes.io/os: windows
      containers:
      - name: windows-container
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022 # placeholder image; HPA will scale this workload
EOF
Cloud Provider Considerations
AWS EKS:
- Use IAM roles for service accounts (IRSA) for metrics access
- Enable AWS CloudTrail for API auditing
Azure AKS:
- Implement Azure AD integration for RBAC
- Use Azure Monitor for containerized insights metrics
Google GKE:
- Enable Workload Identity for metrics access
- Use Cloud Monitoring for metrics collection
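As an example of the EKS recommendation above, a metrics adapter's service account can be bound to an IAM role with the IRSA annotation (account ID, role name, and namespace are placeholders):
# Hypothetical service account for a metrics adapter using IRSA
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-adapter
  namespace: monitoring
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/metrics-adapter-role # placeholder ARN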
4. Configuration
4.1 Basic HPA Configuration
A security-focused HPA configuration with proper resource management:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-application-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2   # Never scale below 2 for high availability
  maxReplicas: 10  # Upper bound to prevent runaway scaling
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300 # Prevent oscillation
      policies:
      - type: Percent
        value: 25
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
      - type: Pods
        value: 5
        periodSeconds: 60
      selectPolicy: Max
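A quick way to confirm the autoscaler is healthy after applying the manifest (assuming it was saved as secure-application-hpa.yaml):
# Apply the HPA and watch its observed utilization and replica count
kubectl apply -f secure-application-hpa.yaml
kubectl get hpa secure-application-hpa -n production --watch
kubectl describe hpa secure-application-hpa -n production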
4.2 Advanced Configuration Options
Multiple Metrics for Scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: secure-multi-metric-hpa
spec:
  scaleTargetRef:           # scale target and bounds shown for completeness
    apiVersion: apps/v1
    kind: Deployment
    name: secure-application
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1000
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10000
4.3 Secure Defaults
Implement these secure defaults when configuring HPA:
- Always set both minimum and maximum replicas
  - Minimum ensures availability (typically ≥ 2)
  - Maximum prevents resource exhaustion attacks
- Set appropriate stabilization windows
  - Longer downscale windows (300+ seconds)
  - Shorter upscale windows for responsiveness (30-60 seconds)
- Configure scaling policies to limit the rate of change
  - Percentage-based scale-down (max 20-25% at once)
  - Pod-count limits for scale-up to prevent excessive scaling
- Use namespaced resource quotas:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: autoscaling-limits
  namespace: production
spec:
  hard:
    pods: "50"
    count/horizontalpodautoscalers.autoscaling: "10"
4.4 Common Misconfigurations
| Misconfiguration | Security Risk | Recommended Fix |
|---|---|---|
| No maximum replica limit | Potential DoS or resource exhaustion | Always set reasonable maxReplicas |
| CPU-only scaling | Can miss memory leaks or attacks | Use multiple metrics, including memory |
| No stabilization window | Scale thrashing, predictable behavior | Set appropriate windows (300s+ for scale-down) |
| Custom metrics without validation | Scaling on manipulated metrics | Verify metric sources, implement anomaly detection |
| No resource limits on pods | Pod resource consumption attacks | Always set resource requests and limits |
| Autoscaling privileged containers | Increased attack surface on scale | Never run privileged containers with autoscaling |
4.5 RBAC for HPA
Implement least-privilege access for HPA management:
# Role for HPA monitoring only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-monitor
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch"]
---
# Role for HPA management
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-admin
  namespace: production
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
Bind these roles only to service accounts and users that require access:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hpa-admins
  namespace: production
subjects:
- kind: Group
  name: system:platform-admins
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: hpa-admin
  apiGroup: rbac.authorization.k8s.io
5. Security Best Practices
5.1 Resource Limits and Requests
Properly configuring resource requests and limits is essential for secure HPA operation:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-application
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 500m
            memory: 1Gi
Security Benefits:
- Prevents noisy neighbor issues
- Mitigates pod resource consumption attacks
- Ensures predictable scaling behavior
- Protects against memory leak exploitation
Best Practices:
- Set memory limits ~2x the expected usage
- Set CPU requests based on observed P95 usage
- Use LimitRanges to enforce minimum and maximum values (see the sketch below)
- Monitor for containers hitting limits
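A minimal LimitRange sketch for the production namespace; the specific values are illustrative and should be tuned to observed usage:
apiVersion: v1
kind: LimitRange
metadata:
  name: autoscaled-workload-limits # hypothetical name
  namespace: production
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    default:
      cpu: 500m
      memory: 1Gi
    max:
      cpu: "2"
      memory: 2Gi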
5.2 Metric Selection and Validation
| Metric Type | Security Considerations | Recommendation |
|---|---|---|
| CPU | Reliable but can be manipulated | Good baseline, combine with others |
| Memory | Critical for detecting memory leaks | Always include, but set alerts for anomalies |
| Custom application metrics | Can be spoofed if not secured | Implement authentication, validate sources |
| External metrics | Potential supply chain attack vector | Use only trusted external metrics providers with TLS |
Validation Strategy:
- Implement baseline and anomaly detection
- Record metric history to identify unusual patterns
- Set up alerting for metric spikes that trigger scaling
- Review logs when unusual scaling occurs
5.3 Scaling Policies and Thresholds
Configure scaling thresholds to balance responsiveness with security:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 20         # Max 20% scale down at once
      periodSeconds: 60
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 5          # Max 5 pods at once
      periodSeconds: 30
    selectPolicy: Max
Security-Focused Policies:
- Implement slower scale-down than scale-up
- Use percentage-based policies for large deployments
- Set pod-count limits for smaller deployments
- Configure alerts for rapid scale events
5.4 Secrets Management with Scaling
When pods scale rapidly, secrets management becomes critical:
Recommended Approaches:
- Use external secrets stores like HashiCorp Vault or AWS Secrets Manager
- Implement proper init container patterns for secret retrieval
- Set up dynamic secret rotation that accommodates scaled pods
- Never bake secrets into container images
Example with External Secrets Operator:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: application-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: application-secret
    creationPolicy: Owner
  data:
  - secretKey: api-key
    remoteRef:
      key: services/myapp/api-key
      property: value
5.5 Logging and Audit Strategies
Implement comprehensive logging for HPA operations:
- Enable Kubernetes Audit Logging:
# kube-apiserver flag
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log
- Audit Policy for HPA Operations:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: RequestResponse
  resources:
  - group: autoscaling
    resources: ["horizontalpodautoscalers"]
- Forward logs to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <match kubernetes.var.log.containers.metrics-server-**.log>
      @type elasticsearch
      host elasticsearch.monitoring
      port 9200
      logstash_format true
      tag_key @log_name
      <buffer>
        flush_interval 5s
      </buffer>
    </match>
5.6 Threat Modeling for Autoscaling
| Threat | Potential Impact | Mitigation |
|---|---|---|
| Denial of Wallet attack | Excessive resource costs | Set strict maxReplicas, implement alerts for unusual scaling |
| Pod Crash Loop attack | Force continuous rescaling | Configure proper liveness/readiness probes, increase stabilization windows |
| Metric Manipulation | Improper scaling decisions | Use multiple metrics, implement anomaly detection |
| Resource Exhaustion | Cluster-wide outage | Configure namespace quotas, node affinity to spread load |
| Credential Exposure during scale | Secret disclosure | Use proper secret management systems designed for scale |
5.7 Securely Updating HPA Configurations
Implement a secure change process:
- Version control all HPA configurations with GitOps
- Require peer review before applying changes
- Test changes in staging environment before production
- Use progressive deployment patterns:
# Initial deployment with conservative settings
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: application-hpa
  annotations:
    fluxcd.io/automated: "true"
    fluxcd.io/tag.chart-image: semver:~1.0
spec:
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
After validation, incrementally update maximum limits and adjust metrics.
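For example, once the conservative settings have held through a validation period, the ceiling can be raised in small, reviewable steps; in a GitOps flow this change belongs in version control, but the equivalent imperative command (namespace assumed to be production) is:
# Raise maxReplicas incrementally rather than in one large jump
kubectl patch hpa application-hpa -n production --type merge -p '{"spec":{"maxReplicas":8}}'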
6. Integrations
6.1 Metrics Server and Custom Metrics
Secure metrics-server deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        image: registry.k8s.io/metrics-server/metrics-server:v0.6.3@sha256:abc123... # Use digest
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls=false # Require valid certs
        - --tls-cert-file=/certs/tls.crt
        - --tls-private-key-file=/certs/tls.key
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: certs
          mountPath: /certs
          readOnly: true
      volumes:
      - name: tmp
        emptyDir: {}
      - name: certs
        secret:
          secretName: metrics-server-certs
6.2 Integration with External Metrics Providers
When using external metrics (e.g., Prometheus), implement proper authentication:
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-adapter-tls
  namespace: monitoring
type: kubernetes.io/tls
data:
  tls.crt: BASE64_ENCODED_CERT
  tls.key: BASE64_ENCODED_KEY
  ca.crt: BASE64_ENCODED_CA
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  template:
    spec:
      containers:
      - name: prometheus-adapter
        args:
        - --secure-port=6443
        - --tls-cert-file=/var/run/serving-cert/tls.crt
        - --tls-private-key-file=/var/run/serving-cert/tls.key
        - --logtostderr=true
        - --prometheus-url=https://prometheus.monitoring.svc:9090/
        - --metrics-relist-interval=30s
        - --v=4
        - --config=/etc/adapter/config.yaml
        - --prometheus-auth-config=/etc/prometheus-auth/auth.yaml
        volumeMounts:
        - name: serving-cert
          mountPath: /var/run/serving-cert
          readOnly: true
        - name: config
          mountPath: /etc/adapter/
          readOnly: true
        - name: prometheus-auth
          mountPath: /etc/prometheus-auth/
          readOnly: true
      volumes:
      - name: serving-cert
        secret:
          secretName: prometheus-adapter-tls
      - name: config
        configMap:
          name: adapter-config
      - name: prometheus-auth
        secret:
          secretName: prometheus-basic-auth
6.3 SIEM and Monitoring Integration
Integrate HPA with security monitoring:
- Prometheus AlertManager rules for HPA:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hpa-alerts
  namespace: monitoring
spec:
  groups:
  - name: hpa.rules
    rules:
    - alert: HPARapidScaling
      expr: changes(kube_horizontalpodautoscaler_status_current_replicas[5m]) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} is scaling rapidly"
        description: "The HPA has changed replica count by more than 10 in 5 minutes, which may indicate an attack or resource leak."
    - alert: HPAAtMaxReplicas
      expr: kube_horizontalpodautoscaler_status_current_replicas == kube_horizontalpodautoscaler_spec_max_replicas
      for: 30m
      labels:
        severity: warning
      annotations:
        summary: "HPA {{$labels.horizontalpodautoscaler}} in {{$labels.namespace}} at max replicas"
        description: "The HPA has been at maximum replicas for 30 minutes, which may indicate sustained load or an attack."
- Forward relevant security events to SIEM:
apiVersion: v1
kind: ConfigMap
metadata:
  name: vector-config
  namespace: monitoring
data:
  vector.toml: |
    [sources.kubernetes_logs]
    type = "kubernetes_logs"

    [transforms.hpa_events]
    type = "filter"
    inputs = ["kubernetes_logs"]
    condition = 'contains(string!(.message), "horizontalpodautoscaler") && (contains(string!(.message), "scaled") || contains(string!(.message), "failed"))'

    [sinks.elasticsearch]
    type = "elasticsearch"
    inputs = ["hpa_events"]
    endpoint = "https://elk.security.svc:9200"
    index = "kubernetes-hpa-events-%Y.%m.%d"

    [sinks.elasticsearch.auth]
    strategy = "basic"
    user = "${ELASTICSEARCH_USER}"
    password = "${ELASTICSEARCH_PASSWORD}"
6.4 CI/CD Pipeline Integration
Secure automation for HPA configuration in CI/CD:
- GitOps workflow with policy validation:
# .github/workflows/hpa-validation.yml
name: Validate HPA Configurations
on:
  pull_request:
    paths:
    - 'k8s/autoscaling/*.yaml'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Set up Kubeconform
      run: |
        curl -Lo ./kubeconform.tar.gz https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz
        tar xf kubeconform.tar.gz
    - name: Validate Kubernetes schemas
      run: |
        ./kubeconform -strict -schema-location default -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' k8s/autoscaling/
    - name: Run security policy checks
      uses: instrumenta/conftest-action@master
      with:
        files: k8s/autoscaling/
        policy: policy/autoscaling/
- Example Conftest policy for HPA:
# policy/autoscaling/hpa.rego
package main

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.maxReplicas
  msg = "HPA must specify maxReplicas"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  input.spec.maxReplicas > 20
  msg = "HPA maxReplicas exceeds security threshold of 20"
}

deny[msg] {
  input.kind == "HorizontalPodAutoscaler"
  not input.spec.behavior
  msg = "HPA must specify scaling behavior with appropriate stabilization windows"
}
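The same policies can be exercised locally before a pull request is opened, assuming conftest is installed and the repository layout shown in the workflow above:
# Evaluate HPA manifests against the Rego policies locally
conftest test k8s/autoscaling/ --policy policy/autoscaling/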
7. Testing and Validation
7.1 Load Testing for Scale Verification
Use secure load testing to validate autoscaling behavior:
# Using k6 for load testing with security profile
cat <<EOF > load-test.js
import http from 'k6/http';
import { sleep } from 'k6';

export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Ramp up
    { duration: '10m', target: 100 }, // Stay at 100 VUs
    { duration: '5m', target: 0 },    // Ramp down
  ],
  thresholds: {
    'http_req_duration': ['p(95)<500'], // 95% of requests must complete under 500ms
  },
};

export default function() {
  let res = http.get('https://app.example.com/api/endpoint');
  sleep(1);
}
EOF

# Run the test and monitor HPA behavior
k6 run load-test.js
Monitor HPA behavior during the test:
watch -n 5 "kubectl get hpa -n production"
7.2 Security Testing the Autoscaler
Test scenarios for common attacks:
- Resource Exhaustion Test:
# Deploy test pod that consumes increasing CPU
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-consumer
  namespace: testing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resource-consumer
  template:
    metadata:
      labels:
        app: resource-consumer
    spec:
      containers:
      - name: resource-consumer
        image: k8s.gcr.io/e2e-test-images/resource-consumer:1.9
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1
            memory: 500Mi
EOF

# Generate load
kubectl exec -it $(kubectl get pod -l app=resource-consumer -n testing -o jsonpath='{.items[0].metadata.name}') -n testing -- /bin/sh -c "/consume-cpu/consume-cpu 800 300" # 800m CPU for 300 seconds
- Metric Manipulation Test:
For custom metrics, implement tests that generate anomalous metrics to observe autoscaler response.
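One lightweight sketch, assuming the custom metric is collected through a Prometheus Pushgateway at pushgateway.monitoring.svc:9091 and exposed to the HPA via the Prometheus adapter, is to push an artificially inflated value and observe whether the autoscaler scales and whether the corresponding alerts fire:
# Push an anomalous value for a hypothetical packets_per_second metric
echo "packets_per_second 999999" | curl --data-binary @- \
  http://pushgateway.monitoring.svc:9091/metrics/job/metric-manipulation-test

# Watch how the autoscaler reacts to the spike
kubectl get hpa -n production --watch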
7.3 Chaos Engineering for Autoscaling
Use chaos engineering to test resilience of autoscaled applications:
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-chaos
  namespace: testing
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
    - production
    labelSelectors:
      app: "scaled-application"
  scheduler:
    cron: "@every 5m"
Monitor how the HPA responds to pod failures and ensure it maintains stability.
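The autoscaler's reaction can be followed through its events and replica counts while the experiment runs (namespace as in the examples above):
# Watch scaling events emitted by the autoscaler during the chaos experiment
kubectl get events -n production --field-selector involvedObject.kind=HorizontalPodAutoscaler --watch
kubectl get hpa -n production --watch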
8. References and Further Reading
Official Documentation
- Horizontal Pod Autoscaling (Kubernetes docs): https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
- HorizontalPodAutoscaler Walkthrough (Kubernetes docs): https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
Security Resources
- CNCF Cloud Native Security Whitepaper
- Kubernetes Security Best Practices
- NSA Kubernetes Hardening Guide
Known CVEs Affecting Autoscaling
- CVE-2023-3955: Custom metrics API server privilege escalation
- CVE-2022-24348: HPA controller memory leak on malformed metrics
- CVE-2021-25740: Potential DoS with external metrics manipulations
Whitepapers & Articles
- “Secure Autoscaling Practices for Kubernetes”
- “Defending Against Resource-Based Attacks in Kubernetes”
- “Autoscaling Security: Lessons Learned from the Field”
9. Appendices
9.1 Troubleshooting HPA Issues
| Issue | Possible Causes | Security Implications | Resolution |
|---|---|---|---|
| HPA not scaling | Metrics unavailable | May indicate metrics service compromise | Verify metrics-server is running and secure |
| HPA not scaling | Resource requests not set | Resource exhaustion risk | Configure proper resource requests |
| HPA not scaling | RBAC issues | Unauthorized access attempts | Check permissions for HPA controller |
| Erratic scaling | Metric source fluctuating | Potential DoS vector | Increase stabilization window |
| Erratic scaling | Improper probe configuration | Pod restart loops | Configure appropriate liveness/readiness probes |
| Scale-up but no scale-down | Memory leaks | Resource exhaustion | Review application for memory management issues |
| HPA at maximum replicas | DoS attack | Cluster resource exhaustion | Investigate unusual traffic patterns |
Debugging Commands:
# Check HPA status and events
kubectl describe hpa application-hpa -n production
# Verify metrics availability
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/production/pods"
# Check external metrics
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq
# Examine pod resource usage
kubectl top pods -n production
9.2 Frequently Asked Questions
Q: What’s the most secure way to implement custom metrics for HPA?
A: The most secure implementation involves:
- Deploying the custom metrics adapter with RBAC
- Enforcing TLS for all metrics communication
- Validating metrics against baselines to detect anomalies
- Using Pod Security Admission (the replacement for the removed PodSecurityPolicy) to restrict the metrics adapter’s permissions
Q: How can I prevent Denial of Wallet attacks through HPA manipulation?
A: Implement these defenses:
- Set reasonable maxReplicas limits
- Configure rate-limited scaling policies
- Implement cost monitoring with alerts for unusual scaling
- Use namespace quotas to limit total resource consumption
- Consider implementing custom admission controllers that validate scaling requests
Q: Should I use CPU, memory, or custom metrics for autoscaling?
A: From a security perspective:
- Use a combination of metric types for defense-in-depth
- CPU is a good baseline but can be manipulated through CPU-intensive attacks
- Memory is critical for detecting memory leaks and should always be monitored
- Custom metrics provide application-specific insight but must be secured against manipulation
- Always validate custom metrics by implementing authentication and TLS
Q: How do I prevent scaling based on malicious traffic?
A: Implement these safeguards:
- Rate limiting at the ingress level before traffic affects scaling metrics (see the sketch after this list)
- Web Application Firewall (WAF) to filter malicious requests
- Anomaly detection for traffic patterns that trigger scaling
- Separate metrics for legitimate vs. suspicious traffic (e.g., authenticated vs. unauthenticated requests)
- Configure longer stabilization windows to mitigate short-term traffic spikes
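A minimal rate-limiting sketch for ingress-nginx (the annotation names are specific to that controller; host, service, and values are illustrative):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main-route
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"             # requests per second per client IP
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3" # allowed burst above the base rate
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: secure-application
            port:
              number: 80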
Q: What RBAC permissions should the HPA controller have?
A: Apply least privilege principles:
- The HPA controller service account should have only: `get`, `list`, and `watch` on target resources; `get` on metrics endpoints; and `update` on scale subresources (see the sketch after this list)
- Never give the controller broad cluster-admin privileges
- Scope permissions to specific namespaces where possible
- Regularly audit RBAC configurations for drift
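As a rough sketch of what least privilege looks like for a namespaced autoscaling controller (the role name is hypothetical; the built-in HPA controller in kube-controller-manager already ships with equivalent permissions, so this pattern applies mainly to custom controllers):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-controller-minimal # hypothetical name
  namespace: production
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments/scale"]
  verbs: ["get", "update"]
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods"]
  verbs: ["get", "list"]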
9.3 Version-Specific Considerations
Kubernetes 1.23+
The autoscaling/v2 API reaches general availability, including the behavior field with stabilization window configuration for better control over scaling:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
  scaleUp:
    stabilizationWindowSeconds: 60
Kubernetes 1.24+
Supports container resource metrics (the ContainerResource metric type, behind the HPAContainerMetrics feature gate), allowing scaling on a single container's usage:
metrics:
- type: ContainerResource
  containerResource:
    name: cpu
    container: application
    target:
      type: Utilization
      averageUtilization: 70
Kubernetes 1.25+
Improved support for custom and external metrics with standardized labeling:
metrics:
- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: worker_tasks
    target:
      type: AverageValue
      averageValue: 30
Kubernetes 1.26+
Enhanced HPA status reporting with conditions:
kubectl get hpa application-hpa -o jsonpath='{.status.conditions}'
Status conditions provide better visibility into scaling decisions and potential security issues.
Security Checklist for Kubernetes HPA
Use this checklist to verify your HPA configuration meets security best practices:
- Resource Limits and Requests
  - All pods have CPU and Memory requests defined
  - All pods have CPU and Memory limits defined
  - Namespace resource quotas are implemented
- HPA Configuration
  - Both minReplicas and maxReplicas are explicitly set
  - Multiple metrics are used for scaling decisions
  - Stabilization windows are configured appropriately
  - Scale-up/down policies limit rate of change
- Access Controls
  - RBAC is implemented with least privilege
  - Service accounts use minimal permissions
  - HPA configuration changes require approval
- Metrics Security
  - Metrics Server is configured with TLS
  - Custom metrics adapters use authentication
  - External metrics sources use secure connections
- Monitoring and Auditing
  - Logging is enabled for all scaling events
  - Alerts exist for unusual scaling behavior
  - Audit logging captures HPA configuration changes
- Testing and Validation
  - Load testing validates scaling behavior
  - Security tests check for scaling vulnerabilities
  - Chaos testing verifies resilience during scaling
- CI/CD Integration
  - HPA configurations are version controlled
  - Security validation occurs before deployment
  - Changes are deployed progressively
By implementing the security controls outlined in this guide, you can ensure that your Kubernetes Horizontal Pod Autoscaler enhances both your application’s performance and security posture. Properly configured autoscaling provides resilience against both legitimate traffic spikes and potential attacks, while maintaining a secure and efficient resource footprint.
Remember that autoscaling is just one component of a comprehensive Kubernetes security strategy and should be implemented alongside other security controls such as network policies, pod security standards, and secure supply chain practices.
This guide was last updated: May 9, 2025