Kubernetes Horizontal Pod Autoscaler

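HPA Overview

The Horizontal Pod Autoscaler (HPA) is Kubernetes' built-in autoscaling mechanism. It automatically adjusts the number of Pod replicas in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization, memory usage, or custom metrics.

The HPA controller queries resource metrics periodically (every 15 seconds by default) and computes the required number of replicas from the configured target values. This lets an application scale out automatically during traffic peaks and scale back in when traffic drops, which helps balance cost and performance.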
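
To make the replica calculation concrete, here is a minimal sketch of the ratio formula described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric value / target value). The function name and the example numbers are illustrative assumptions, and the 10% tolerance is the documented default rather than anything configured in this article.

```python
import math

def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the HPA ratio formula (simplified sketch)."""
    ratio = current_value / target_value
    # When the ratio is close enough to 1.0, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(desired_replicas(4, 90.0, 50.0))  # 4 * 90/50 = 7.2, rounded up to 8
print(desired_replicas(4, 52.0, 50.0))  # ratio 1.04 is within tolerance, stays at 4
```

The real controller also accounts for Pods that are not ready or are missing metrics, which this sketch ignores.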

Prerequisites

Installing Metrics Server

HPA relies on Metrics Server to obtain resource usage metrics for Pods. Before using HPA, make sure Metrics Server is installed in the cluster:

# Install Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Verify the installation
kubectl get deployment metrics-server -n kube-system

# Test the metrics API
kubectl top nodes
kubectl top pods

Basic HPA Configuration

Creating an HPA with kubectl

# Create an HPA for the deployment with a target CPU utilization of 50%
kubectl autoscale deployment my-app --cpu-percent=50 --min=2 --max=10

Declarative Configuration with YAML

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

CPU and Memory Metrics

HPA can monitor several resource metrics at once. The following example sets scaling thresholds for both CPU and memory:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-server
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

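When multiple metrics are configured, HPA computes the replica count each metric requires and uses the largest of those values as the final replica count.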
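
The "largest value wins" rule can be sketched as follows. This is a simplified illustration: the function names and sample utilization figures are assumptions, and the tolerance band and readiness handling of the real controller are left out.

```python
import math

def replicas_for_metric(current_replicas: int, current_value: int, target_value: int) -> int:
    """Replica count one metric would ask for on its own."""
    return math.ceil(current_replicas * current_value / target_value)

def desired_replicas_multi(current_replicas, metrics, min_replicas, max_replicas):
    """metrics is a list of (current_value, target_value) pairs, one per metric."""
    proposals = [replicas_for_metric(current_replicas, cur, tgt) for cur, tgt in metrics]
    # Take the largest proposal, then clamp it to the minReplicas/maxReplicas range.
    return max(min_replicas, min(max_replicas, max(proposals)))

# 5 replicas, CPU at 84% against a 70% target, memory at 60% against an 80% target.
print(desired_replicas_multi(5, [(84, 70), (60, 80)], min_replicas=3, max_replicas=20))  # 6
```

Here CPU asks for 6 replicas and memory for 4, so the CPU proposal decides the outcome.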

Scaling on Custom Metrics

Besides CPU and memory, HPA also supports custom metrics and external metrics. Both require installing an additional metrics adapter, such as Prometheus Adapter.

Pod-Based Custom Metrics

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"

External Metrics

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: external-metric-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-processor
  minReplicas: 1
  maxReplicas: 30
  metrics:
  - type: External
    external:
      metric:
        name: sqs_queue_length
        selector:
          matchLabels:
            queue: "processing-queue"
      target:
        type: AverageValue
        averageValue: "5"
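
With an AverageValue target on an external metric, the total metric value is divided across the Pods, so the replica count works out to roughly the queue length divided by the per-Pod target. The sketch below illustrates that arithmetic for the manifest above; the function name and the sample queue lengths are assumptions, and the tolerance band is ignored.

```python
import math

def replicas_for_queue(queue_length: int, target_per_pod: int,
                       min_replicas: int, max_replicas: int) -> int:
    """Replicas needed so that each Pod handles about target_per_pod messages."""
    wanted = math.ceil(queue_length / target_per_pod)
    # Clamp to the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, wanted))

print(replicas_for_queue(42, 5, 1, 30))   # 42 / 5 = 8.4, rounded up to 9
print(replicas_for_queue(500, 5, 1, 30))  # 100 wanted, capped at maxReplicas = 30
print(replicas_for_queue(0, 5, 1, 30))    # empty queue, held at minReplicas = 1
```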

Configuring Scaling Behavior

The HPA v2 API provides fine-grained control over scaling behavior, with separate policies for scaling up and scaling down:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: advanced-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: critical-service
  minReplicas: 5
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Min

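Behavior Settings Explained

Parameter	Description
`stabilizationWindowSeconds`	Stabilization window that keeps metric fluctuations from triggering frequent scaling
`policies`	List of scaling policies
`type: Percent`	Scale by a percentage of the current replicas
`type: Pods`	Scale by a fixed number of Pods
`selectPolicy`	How to choose among multiple policies: Max, Min, or Disabled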
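
As a rough illustration of how the scaleUp policies above interact, the sketch below computes the highest replica count each policy would allow within one period and then applies selectPolicy. The function name is an assumption, and the real controller also tracks replicas already added during the current period, which is omitted here.

```python
import math

def scale_up_limit(period_start_replicas: int, policies, select_policy: str = "Max") -> int:
    """policies is a list of (type, value) pairs, e.g. ("Percent", 100) or ("Pods", 4)."""
    if select_policy == "Disabled":
        return period_start_replicas  # scaling in this direction is turned off
    limits = []
    for policy_type, value in policies:
        if policy_type == "Percent":
            limits.append(math.ceil(period_start_replicas * (1 + value / 100)))
        elif policy_type == "Pods":
            limits.append(period_start_replicas + value)
        else:
            raise ValueError(f"unknown policy type: {policy_type}")
    return max(limits) if select_policy == "Max" else min(limits)

policies = [("Percent", 100), ("Pods", 4)]
print(scale_up_limit(2, policies))          # max(4, 6) = 6
print(scale_up_limit(10, policies))         # max(20, 14) = 20
print(scale_up_limit(10, policies, "Min"))  # min(20, 14) = 14
```

With selectPolicy set to Max, small deployments grow by the fixed Pod count and large ones by the percentage, whichever permits more.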

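HPA v2 API Features

The HPA v2 API (autoscaling/v2) provides more powerful capabilities:

Multiple metrics: base scaling decisions on several metrics at once
Custom metrics: support application-specific metrics
External metrics: support metrics that originate outside the cluster
Behavior control: fine-tune scale-up and scale-down behavior
Container resource metrics: target the metrics of a specific container

Container Resource Metric Example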

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: container-resource-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: multi-container-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: main-app
      target:
        type: Utilization
        averageUtilization: 60

Monitoring HPA Status

Viewing HPA Status

# List all HPAs
kubectl get hpa

# Show detailed information about an HPA
kubectl describe hpa my-app-hpa

# Watch HPA status continuously
kubectl get hpa -w

Sample HPA Status Output

NAME         REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-app-hpa   Deployment/my-app   45%/50%   2         10        4          1h

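Common HPA Events

SuccessfulRescale: the replica count was adjusted successfully
DesiredReplicasComputed: the desired replica count was computed
FailedGetResourceMetric: resource metrics could not be retrieved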

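Best Practices

1. Set Appropriate Resource Requests

HPA calculates CPU utilization relative to the Pod's resource requests, so utilization targets only work when requests are defined. Always set reasonable resource requests for your containers: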

resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "512Mi"
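
The dependence on requests can be illustrated with a small calculation. This is a simplified sketch that assumes every Pod has the same CPU request; the function name and usage figures are made up for the example.

```python
def cpu_utilization_percent(usage_millicores, request_millicores: int) -> float:
    """Utilization across all Pods, expressed as a percentage of the requested CPU."""
    total_usage = sum(usage_millicores)
    total_request = len(usage_millicores) * request_millicores
    return 100 * total_usage / total_request

usage = [300, 400, 350]  # observed CPU usage per Pod, in millicores

print(cpu_utilization_percent(usage, 500))  # 70.0 with the 500m request shown above
print(cpu_utilization_percent(usage, 250))  # 140.0 if the request were only 250m
```

The same workload looks comfortably below a 80% target in the first case and far above it in the second, which is why an unrealistic request skews scaling decisions.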

2. Avoid Scaling Oscillation

Use stabilizationWindowSeconds to prevent frequent scaling
Set sensible scaling policies, with more conservative settings for scaling down
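
The effect of a scale-down stabilization window can be sketched like this: instead of acting on the latest recommendation alone, the controller uses the highest recommendation seen within the window. The function name, timestamps, and history below are illustrative assumptions.

```python
def stabilized_scale_down(history, current_recommendation: int,
                          now: int, window_seconds: int) -> int:
    """history is a list of (timestamp_seconds, desired_replicas) from earlier evaluations."""
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    # The highest recent recommendation wins, so a brief dip does not remove Pods.
    return max([current_recommendation] + recent)

history = [(0, 10), (60, 8), (120, 4), (180, 6)]

print(stabilized_scale_down(history, 3, now=240, window_seconds=300))  # 10
print(stabilized_scale_down(history, 3, now=240, window_seconds=100))  # 6
print(stabilized_scale_down(history, 3, now=240, window_seconds=0))    # 3
```

A longer window keeps more capacity around after a spike, at the price of releasing resources more slowly.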

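3. Monitoring and Alerting

Monitor HPA scaling events
Set up alerts for when the replica count approaches its upper limit
Track metric data to tune threshold settings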
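
One possible way to detect HPAs that are close to their ceiling is to inspect the output of `kubectl get hpa -o json`. The sketch below is hypothetical: the helper name and the 80% threshold are assumptions, and the sample document is a hand-written stand-in containing only the fields the function reads.

```python
import json

def hpas_near_limit(hpa_list_json: str, threshold: float = 0.8):
    """Return names of HPAs whose current replicas reach threshold * maxReplicas."""
    flagged = []
    for hpa in json.loads(hpa_list_json)["items"]:
        max_replicas = hpa["spec"]["maxReplicas"]
        current = hpa.get("status", {}).get("currentReplicas", 0)
        if current >= threshold * max_replicas:
            flagged.append(hpa["metadata"]["name"])
    return flagged

# Hand-written sample standing in for real kubectl output.
sample = json.dumps({"items": [
    {"metadata": {"name": "my-app-hpa"},
     "spec": {"maxReplicas": 10}, "status": {"currentReplicas": 9}},
    {"metadata": {"name": "advanced-hpa"},
     "spec": {"maxReplicas": 100}, "status": {"currentReplicas": 20}},
]})

print(hpas_near_limit(sample))  # ['my-app-hpa']
```

In practice the same check is often expressed as an alerting rule in the monitoring system rather than as a script.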

4. Combine with a PodDisruptionBudget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

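5. Account for Cold-Start Time

For applications that take a long time to start, use a more aggressive scale-up policy and a more conservative scale-down policy so that enough capacity is available to handle traffic.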