Kubernetes Vertical Pod Autoscaler

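VPA Overview

The Vertical Pod Autoscaler (VPA) is a component for Kubernetes that automatically adjusts the resource requests and limits of Pods. Whereas the Horizontal Pod Autoscaler (HPA) scales by adding or removing Pods, VPA scales vertically: it adjusts the CPU and memory allocated to each individual Pod.

VPA can:

Monitor the actual resource usage of Pods
Recommend resource values based on historical usage data
Automatically update Pod resource requests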

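VPA Component Architecture

VPA consists of three main components:

1. Recommender

Monitors resource usage
Analyzes historical data
Computes and publishes recommended resource values

2. Updater

Checks whether running Pods use the recommended resource configuration
Evicts Pods whose resources need updating
Lets the replacement Pods be created with the updated resource settings

3. Admission Controller

Intercepts Pod creation requests
Rewrites the Pod's resource requests according to the VPA recommendation
Ensures newly created Pods start with the recommended resource configuration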
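
The Updater's check can be pictured as a simple range test. The Python sketch below is a simplification under stated assumptions, not the Updater's actual code: it assumes the recommendation exposes a lower and an upper bound in millicores, and the function name `needs_update` is hypothetical. The real component weighs further factors before it evicts a Pod.

```python
def needs_update(current_request_m: int, lower_bound_m: int, upper_bound_m: int) -> bool:
    """Return True when the current CPU request (millicores) lies outside the recommended range."""
    return current_request_m < lower_bound_m or current_request_m > upper_bound_m


# A Pod requesting 100m while the recommended range is 200m to 400m is an eviction candidate.
print(needs_update(100, 200, 400))  # True
# A Pod requesting 300m sits inside the range and is left alone.
print(needs_update(300, 200, 400))  # False
```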

Installing VPA

Install with the official script

# Clone the autoscaler repository
git clone https://github.com/kubernetes/autoscaler.git

# Enter the VPA directory
cd autoscaler/vertical-pod-autoscaler

# Run the install script
./hack/vpa-up.sh

Install with Helm

# Add the Fairwinds Helm repository
helm repo add fairwinds-stable https://charts.fairwinds.com/stable

# Install VPA
helm install vpa fairwinds-stable/vpa \
  --namespace kube-system

Verify the installation

kubectl get pods -n kube-system | grep vpa

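Update Modes

This section covers three VPA update modes: Off, Initial, and Auto. (The API also defines a Recreate mode, which is not covered here.)

1. Off mode

VPA only publishes recommendations and never changes Pod resources. This mode suits an observation and analysis phase.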

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"

2. Initial mode

VPA applies the recommended resources only when a Pod is created and does not restart existing Pods.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Initial"

3. Auto mode

VPA updates Pod resources automatically, which may cause Pods to restart. This is the default mode.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"

VPA Configuration Examples

Full configuration example

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
    minReplicas: 2
  resourcePolicy:
    containerPolicies:
      - containerName: "web-container"
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "2000m"
          memory: "4Gi"
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits

Complete example with a Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: app
          image: nginx:latest
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: sample-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: sample-app
  updatePolicy:
    updateMode: "Auto"

Resource Policy Settings

containerPolicies in detail

resourcePolicy:
  containerPolicies:
    - containerName: "*"  # default policy for every container without its own entry
      minAllowed:
        cpu: "50m"
        memory: "64Mi"
      maxAllowed:
        cpu: "4"
        memory: "8Gi"
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
    - containerName: "sidecar"
      mode: "Off"  # exclude this container from VPA

controlledValues options

RequestsOnly: adjusts requests only
RequestsAndLimits: adjusts both requests and limits (the default)
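
With RequestsAndLimits, VPA scales each limit so that the container's original limit-to-request ratio is preserved. The Python sketch below works through that arithmetic. The function name `scaled_limit` and the use of plain integers (millicores or bytes) are assumptions made for illustration; this is not VPA's own code.

```python
def scaled_limit(old_request: int, old_limit: int, new_request: int,
                 controlled_values: str = "RequestsAndLimits") -> int:
    """Return the limit that goes with a new request under the given controlledValues setting."""
    if controlled_values == "RequestsOnly":
        # Only the request changes; the limit stays as it was defined.
        return old_limit
    if controlled_values != "RequestsAndLimits":
        raise ValueError(f"unknown controlledValues: {controlled_values}")
    # Keep the original limit/request ratio, using integer arithmetic.
    return new_request * old_limit // old_request


# Requests 100m with limits 500m (ratio 5): a new request of 250m yields a 1250m limit.
print(scaled_limit(100, 500, 250))  # 1250
# With RequestsOnly the limit is left at 500m.
print(scaled_limit(100, 500, 250, "RequestsOnly"))  # 500
```

Because the ratio is taken from the Pod template, a template with a wide gap between requests and limits produces an equally wide gap after VPA updates it.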

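Relationship with HPA

Considerations when using both

VPA and HPA can be used together, with the following caveats:

Avoid scaling on the same CPU or memory signal: if HPA uses CPU or memory as its scaling metric, VPA should not adjust that same resource
Recommended setup: HPA scales on custom metrics while VPA adjusts CPU and memory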

# HPA scales on a custom metric
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
---
# VPA adjusts resources
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
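
The overlap rule above can be checked mechanically before applying both objects. The following Python sketch inspects an HPA manifest that has already been loaded into a dict and reports which resources it scales on that VPA would also control. The function names are hypothetical, and the sketch assumes the `autoscaling/v2` metric layout (`Resource` and `ContainerResource` metric types).

```python
def hpa_resource_metrics(hpa: dict) -> set:
    """Collect the resource names (e.g. 'cpu', 'memory') that an HPA scales on."""
    names = set()
    for metric in hpa.get("spec", {}).get("metrics", []):
        if metric.get("type") == "Resource":
            names.add(metric["resource"]["name"])
        elif metric.get("type") == "ContainerResource":
            names.add(metric["containerResource"]["name"])
    return names


def conflicting_resources(hpa: dict, vpa_controlled=("cpu", "memory")) -> set:
    """Return resources that both the HPA and the VPA would act on."""
    return hpa_resource_metrics(hpa) & set(vpa_controlled)


# An HPA driven by a Pods metric, as in the manifest above: no overlap with VPA.
custom_hpa = {"spec": {"metrics": [
    {"type": "Pods", "pods": {"metric": {"name": "requests_per_second"}}},
]}}
print(conflicting_resources(custom_hpa))  # set()

# An HPA driven by CPU utilization overlaps with a VPA that controls CPU.
cpu_hpa = {"spec": {"metrics": [
    {"type": "Resource", "resource": {"name": "cpu"}},
]}}
print(conflicting_resources(cpu_hpa))  # {'cpu'}
```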

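Limitations and Caveats

Main limitations

Pod restarts: in Auto mode, applying new resources requires restarting the Pod
No dynamic adjustment for JVM applications: a Java application must be restarted to pick up new memory settings
Minimum replicas: keep at least 2 replicas to preserve service availability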
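DaemonSets: use VPA with caution on DaemonSets, because evicting a DaemonSet Pod leaves its node without that Pod until it is recreated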

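Notes

VPA recommendations are based on historical data, so VPA needs time to collect usage before its recommendations become accurate
Start new deployments in Off mode and observe the recommendations first
In production, set reasonable minAllowed and maxAllowed values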

Best Practices

1. Roll out gradually

# Phase 1: observation only
updatePolicy:
  updateMode: "Off"

# Phase 2: apply to newly created Pods only
updatePolicy:
  updateMode: "Initial"

# Phase 3: fully automatic
updatePolicy:
  updateMode: "Auto"

2. Set reasonable resource bounds

resourcePolicy:
  containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: "100m"
        memory: "128Mi"
      maxAllowed:
        cpu: "4"
        memory: "8Gi"
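
The effect of these bounds is a clamp: a recommendation below minAllowed is raised to it, and one above maxAllowed is lowered to it. The Python sketch below shows this with the values from the policy above. The quantity parsers are deliberately simplified (CPU as cores or millicores, memory as bytes or Ki/Mi/Gi/Ti) and all function names are hypothetical.

```python
_MEMORY_FACTORS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as '100m' or '4' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '128Mi' to bytes (binary suffixes only)."""
    for suffix, factor in _MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-2]) * factor)
    return int(quantity)


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def cap_recommendation(target: dict, min_allowed: dict, max_allowed: dict) -> dict:
    """Clamp a raw recommendation to the policy bounds; CPU in millicores, memory in bytes."""
    return {
        "cpu": clamp(parse_cpu(target["cpu"]),
                     parse_cpu(min_allowed["cpu"]), parse_cpu(max_allowed["cpu"])),
        "memory": clamp(parse_memory(target["memory"]),
                        parse_memory(min_allowed["memory"]), parse_memory(max_allowed["memory"])),
    }


min_allowed = {"cpu": "100m", "memory": "128Mi"}
max_allowed = {"cpu": "4", "memory": "8Gi"}

# A made-up raw recommendation: CPU below the floor, memory above the ceiling.
capped = cap_recommendation({"cpu": "50m", "memory": "16Gi"}, min_allowed, max_allowed)
print(capped["cpu"])                  # 100
print(capped["memory"] == 8 * 2**30)  # True
```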

3. Monitor VPA recommendations

# Show the VPA status and recommendations
kubectl describe vpa my-app-vpa

# Show a specific field
kubectl get vpa my-app-vpa -o jsonpath='{.status.recommendation}'
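
For scripted monitoring, the same data can be read from the object's JSON form (for example the output of `kubectl get vpa my-app-vpa -o json`). The Python sketch below extracts the target recommendation per container. The sample document is made up for illustration, and the function name `summarize_recommendation` is hypothetical.

```python
import json


def summarize_recommendation(vpa_json: str) -> dict:
    """Map each container name to its recommended target resources."""
    vpa = json.loads(vpa_json)
    # status.recommendation is absent until the Recommender has produced a result.
    recommendation = vpa.get("status", {}).get("recommendation") or {}
    summary = {}
    for item in recommendation.get("containerRecommendations", []):
        summary[item["containerName"]] = item.get("target", {})
    return summary


# Made-up sample shaped like a VPA object's status.
SAMPLE = """
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "app",
          "lowerBound": {"cpu": "100m", "memory": "128Mi"},
          "target": {"cpu": "250m", "memory": "256Mi"},
          "upperBound": {"cpu": "1", "memory": "1Gi"}
        }
      ]
    }
  }
}
"""

print(summarize_recommendation(SAMPLE))
# {'app': {'cpu': '250m', 'memory': '256Mi'}}
```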

4. Exclude containers that should not be adjusted

resourcePolicy:
  containerPolicies:
    - containerName: "init-container"
      mode: "Off"
    - containerName: "log-sidecar"
      mode: "Off"

5. Combine with a PodDisruptionBudget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app
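
Because the Updater removes Pods through the eviction API, a PodDisruptionBudget bounds how many it can take down at once. For an integer minAvailable the arithmetic is simple, as the Python sketch below shows. The function name is hypothetical, and the replica count of 2 is an assumption chosen to match the earlier examples.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of Pods that may be evicted right now without violating minAvailable."""
    return max(0, healthy_pods - min_available)


# 2 healthy replicas with minAvailable: 1 -> one Pod may be evicted at a time.
print(allowed_disruptions(2, 1))  # 1
# With a single healthy replica, the budget blocks eviction entirely.
print(allowed_disruptions(1, 1))  # 0
```

With one replica and minAvailable: 1, no eviction is ever allowed, so an eviction-based VPA update could not proceed.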