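Introduction
In Kubernetes environments, data backup and disaster recovery are essential concerns for every operations team. Velero (formerly Heptio Ark) is an open-source tool built for backing up, restoring, and migrating Kubernetes clusters. This article walks through Velero's architecture, installation and configuration, backup strategies, and hands-on examples.
1. Velero Overview and Architecture
What Is Velero?
Velero is an open-source project maintained by VMware. Its core capabilities are:
- Backing up Kubernetes cluster resources and persistent volumes
- Restoring cluster resources to the same cluster or a different one
- Migrating cluster resources to other clusters
- Supporting disaster recovery to keep the business running
Architecture Components
Velero's architecture consists of the following components: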
```text
┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                      │
│  ┌─────────────────┐    ┌─────────────────────────────────┐ │
│  │  Velero Server  │    │        Custom Resources         │ │
│  │  (Deployment)   │    │  - Backup                       │ │
│  │                 │◄───│  - Restore                      │ │
│  │  - Controller   │    │  - Schedule                     │ │
│  │  - Plugins      │    │  - BackupStorageLocation        │ │
│  └────────┬────────┘    │  - VolumeSnapshotLocation       │ │
│           │             └─────────────────────────────────┘ │
└───────────┼─────────────────────────────────────────────────┘
            │
            ▼
┌─────────────────────────────────────────────────────────────┐
│                       Object Storage                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   AWS S3    │  │    MinIO    │  │   Azure Blob/GCS    │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
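Component details:
| Component | Description |
|---|---|
| Velero Server | Controller running inside the cluster that carries out backup and restore operations |
| BackupStorageLocation | Defines where backup files are stored (for example, an S3 bucket) |
| VolumeSnapshotLocation | Defines where PV snapshots are stored |
| Backup | Custom resource representing a backup job |
| Restore | Custom resource representing a restore job |
| Schedule | Custom resource representing a scheduled backup |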
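2. Installation and Configuration (AWS S3)
Prerequisites
- A Kubernetes cluster (1.16+)
- kubectl configured with access to the cluster
- An AWS account with appropriate permissions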
Step 1: Create an S3 Bucket
```bash
# Set variables
BUCKET_NAME=velero-backup-$(date +%s)
REGION=ap-northeast-1

# Create the S3 bucket
aws s3api create-bucket \
  --bucket "$BUCKET_NAME" \
  --region "$REGION" \
  --create-bucket-configuration LocationConstraint="$REGION"

# Enable versioning (recommended)
aws s3api put-bucket-versioning \
  --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled
```
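One caveat when adapting the bucket creation step to another region: S3 rejects a `LocationConstraint` for `us-east-1`, so the `--create-bucket-configuration` argument has to be dropped there. The sketch below shows one way to handle that; `bucket_location_args` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical helper: print the extra create-bucket arguments for a region.
# us-east-1 is the one region where LocationConstraint must be omitted.
bucket_location_args() {
  region="$1"
  if [ "$region" = "us-east-1" ]; then
    printf '%s' ""
  else
    printf '%s' "--create-bucket-configuration LocationConstraint=$region"
  fi
}

# Example usage (requires the AWS CLI and valid credentials):
#   aws s3api create-bucket --bucket "$BUCKET_NAME" --region "$REGION" \
#     $(bucket_location_args "$REGION")
```

The command substitution is left unquoted on purpose so that the two words split into separate arguments, or vanish entirely for `us-east-1`.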
Step 2: Create an IAM User and Policy
```bash
# Create the IAM user
aws iam create-user --user-name velero

# Write the IAM policy file (the unquoted EOF lets the shell expand ${BUCKET_NAME})
cat > velero-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}"
      ]
    }
  ]
}
EOF

# Attach the policy to the user
aws iam put-user-policy \
  --user-name velero \
  --policy-name velero \
  --policy-document file://velero-policy.json

# Create an access key
aws iam create-access-key --user-name velero
```
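Because the policy file is rendered through a heredoc, an unset `BUCKET_NAME` silently produces ARNs such as `arn:aws:s3:::/*`. A small pre-flight check along the following lines can catch that before the policy is attached; `policy_has_bucket` is a hypothetical helper, and the check is deliberately simple (it only looks at the text right after the ARN prefix).

```shell
# Hypothetical check: succeed only if every S3 ARN in the policy file names a bucket.
# An unset BUCKET_NAME renders as "arn:aws:s3:::/*" or a bare "arn:aws:s3:::".
policy_has_bucket() {
  file="$1"
  # Fail if the file contains no S3 ARN at all.
  grep -q 'arn:aws:s3:::' "$file" || return 1
  # Fail if any ARN prefix is followed directly by "/" or a closing quote.
  if grep -q 'arn:aws:s3:::[/"]' "$file"; then
    return 1
  fi
  return 0
}

# Example usage:
#   policy_has_bucket velero-policy.json || echo "BUCKET_NAME was empty when the policy was written"
```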
Step 3: Install the Velero CLI
```bash
# Download the Velero CLI (Linux amd64 shown)
VELERO_VERSION=v1.13.0
wget "https://github.com/vmware-tanzu/velero/releases/download/${VELERO_VERSION}/velero-${VELERO_VERSION}-linux-amd64.tar.gz"

# Extract and install
tar -xvf "velero-${VELERO_VERSION}-linux-amd64.tar.gz"
sudo mv "velero-${VELERO_VERSION}-linux-amd64/velero" /usr/local/bin/

# Verify the installation
velero version --client-only
```
Step 4: Create the Credentials File
```bash
cat > credentials-velero <<EOF
[default]
aws_access_key_id=<YOUR_ACCESS_KEY_ID>
aws_secret_access_key=<YOUR_SECRET_ACCESS_KEY>
EOF
```
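The credentials file holds a long-lived secret, so it is worth creating it with owner-only permissions. The following is a minimal sketch of that idea; `write_velero_credentials` is a hypothetical helper, and the file layout is the one shown in the step above.

```shell
# Hypothetical helper: write the credentials file so that only the owner can read it.
# Arguments: <output path> <access key id> <secret access key>
write_velero_credentials() {
  out="$1"; key_id="$2"; secret="$3"
  # Remove any earlier copy so the new file picks up the restrictive mode.
  rm -f "$out"
  (
    # The umask change is confined to this subshell.
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$key_id" "$secret" > "$out"
  )
}

# Example usage (the two variables are assumed to be set in your shell):
#   write_velero_credentials ./credentials-velero "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY"
```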
Step 5: Install Velero into Kubernetes
```bash
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket "$BUCKET_NAME" \
  --backup-location-config region="$REGION" \
  --snapshot-location-config region="$REGION" \
  --secret-file ./credentials-velero

# Check the installation
kubectl get pods -n velero
kubectl get backupstoragelocations -n velero
```
Installing with Helm (Alternative)
```bash
# Add the Helm repo
helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update

# Write the values file (the unquoted EOF expands ${BUCKET_NAME} and ${REGION})
cat > velero-values.yaml <<EOF
configuration:
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: ${BUCKET_NAME}
      config:
        region: ${REGION}
  volumeSnapshotLocation:
    - name: default
      provider: aws
      config:
        region: ${REGION}
credentials:
  useSecret: true
  secretContents:
    cloud: |
      [default]
      aws_access_key_id=<YOUR_ACCESS_KEY_ID>
      aws_secret_access_key=<YOUR_SECRET_ACCESS_KEY>
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.9.0
    volumeMounts:
      - mountPath: /target
        name: plugins
EOF

# Install
helm install velero vmware-tanzu/velero \
  --namespace velero \
  --create-namespace \
  -f velero-values.yaml
```
3. Backup Strategy and Scheduling
Manual Backup
```bash
# Back up the entire cluster
velero backup create full-cluster-backup

# Check the backup status
velero backup describe full-cluster-backup

# View the detailed backup logs
velero backup logs full-cluster-backup
```
Scheduled Backups
Velero schedules backups with cron expressions:
```bash
# Back up every day at 02:00
velero schedule create daily-backup \
  --schedule="0 2 * * *"

# Back up every Sunday at 03:00, keep for 30 days (720h)
velero schedule create weekly-backup \
  --schedule="0 3 * * 0" \
  --ttl 720h

# Back up every 6 hours, keep for 7 days (168h)
velero schedule create six-hourly-backup \
  --schedule="0 */6 * * *" \
  --ttl 168h
```
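The `--ttl` flag takes a Go-style duration, which has no day unit, so retention periods are written in hours. The helpers below are a small sketch for doing that arithmetic and for estimating how many backups a schedule keeps around at steady state; both function names are hypothetical.

```shell
# Convert a retention period in days to an hour-based TTL string.
days_to_ttl() {
  echo "$(( $1 * 24 ))h"
}

# Rough number of backups a schedule retains at steady state:
# TTL divided by the interval between runs (both in hours).
retained_backups() {
  interval_hours="$1"; ttl_hours="$2"
  echo $(( ttl_hours / interval_hours ))
}

days_to_ttl 30           # prints 720h
retained_backups 6 168   # prints 28
```

For the six-hour schedule above, that means roughly 28 backups exist in the bucket at any time, which is useful when estimating storage cost.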
Viewing and Managing Schedules
```bash
# List all schedules
velero schedule get

# Show schedule details
velero schedule describe daily-backup

# Pause a schedule
velero schedule pause daily-backup

# Resume a schedule
velero schedule unpause daily-backup

# Delete a schedule
velero schedule delete daily-backup
```
Defining a Schedule in YAML
```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: production-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
      - production
      - staging
    excludedResources:
      - events
      - events.events.k8s.io
    snapshotVolumes: true
    ttl: 720h0m0s
    storageLocation: default
    volumeSnapshotLocations:
      - default
```
```bash
# Apply the manifest above, saved as production-schedule.yaml
kubectl apply -f production-schedule.yaml
```
4. Selective Backups (Namespaces, Labels)
Backing Up by Namespace
```bash
# Back up a single namespace
velero backup create app-backup \
  --include-namespaces production

# Back up several namespaces
velero backup create multi-ns-backup \
  --include-namespaces production,staging,development

# Exclude specific namespaces
velero backup create exclude-system-backup \
  --exclude-namespaces kube-system,kube-public,velero
```
Backing Up by Label
```bash
# Back up resources that carry a specific label
velero backup create labeled-backup \
  --selector app=nginx

# Combine several label conditions (all must match)
velero backup create complex-label-backup \
  --selector "app=myapp,environment=production"
```
Backing Up by Resource Type
```bash
# Back up only Deployments and Services
velero backup create resources-backup \
  --include-resources deployments,services

# Exclude Secrets and ConfigMaps
velero backup create exclude-secrets-backup \
  --exclude-resources secrets,configmaps

# Include cluster-scoped resources
velero backup create cluster-resources-backup \
  --include-cluster-resources=true
```
Combining Filters
```bash
# Back up Deployments, Services, and ConfigMaps labeled app=web in the production namespace
velero backup create selective-backup \
  --include-namespaces production \
  --selector app=web \
  --include-resources deployments,services,configmaps
```
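Backup names become Kubernetes object names, so they must be lowercase and unique. When backups are created from scripts, a timestamp suffix is a common way to keep names unique. The sketch below shows a simplified name check and one possible naming scheme; both helpers are hypothetical, and the regular expression is looser than the full DNS subdomain rules.

```shell
# Simplified validity check: lowercase alphanumerics, "-" and ".",
# starting and ending with an alphanumeric, at most 253 characters.
is_valid_backup_name() {
  name="$1"
  [ "${#name}" -le 253 ] || return 1
  printf '%s\n' "$name" | grep -Eq '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
}

# Hypothetical naming scheme: <prefix>-<UTC timestamp>, e.g. prod-web-20240101-020000.
timestamped_backup_name() {
  echo "$1-$(date -u +%Y%m%d-%H%M%S)"
}

# Example usage (requires the velero CLI):
#   name=$(timestamped_backup_name prod-web)
#   is_valid_backup_name "$name" && velero backup create "$name" --include-namespaces production
```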
Defining a Complex Backup in YAML
```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: selective-backup
  namespace: velero
spec:
  includedNamespaces:
    - production
    - staging
  excludedNamespaces:
    - kube-system
  includedResources:
    - deployments
    - services
    - configmaps
    - secrets
    - persistentvolumeclaims
  excludedResources:
    - events
  labelSelector:
    matchLabels:
      app: myapp
  includeClusterResources: true
  snapshotVolumes: true
  ttl: 720h0m0s
  storageLocation: default
  volumeSnapshotLocations:
    - default
```
5. Restore Operations and Verification
Basic Restore
```bash
# Restore from a backup (resources return to their original namespaces)
velero restore create --from-backup full-cluster-backup

# Give the restore a name
velero restore create my-restore --from-backup full-cluster-backup

# Check the restore status
velero restore describe my-restore

# View the restore logs
velero restore logs my-restore
```
Selective Restore
```bash
# Restore only one namespace
velero restore create ns-restore \
  --from-backup full-cluster-backup \
  --include-namespaces production

# Restore into a different namespace
velero restore create mapped-restore \
  --from-backup full-cluster-backup \
  --namespace-mappings old-namespace:new-namespace

# Exclude specific resources
velero restore create selective-restore \
  --from-backup full-cluster-backup \
  --exclude-resources secrets
```
Verifying a Restore
```bash
#!/bin/bash
# restore-verify.sh - restore verification script
# Usage: ./restore-verify.sh <backup-name>
BACKUP_NAME=${1:?"usage: $0 <backup-name>"}
RESTORE_NAME="restore-${BACKUP_NAME}-$(date +%s)"

echo "Starting restore from backup: ${BACKUP_NAME}"

# Run the restore
velero restore create "${RESTORE_NAME}" --from-backup "${BACKUP_NAME}"

# Wait for the restore to finish
echo "Waiting for restore to complete..."
while true; do
  # The velero CLI's -o flag accepts only table, json, or yaml,
  # so the phase is read from the Restore resource with kubectl.
  STATUS=$(kubectl get restores.velero.io "${RESTORE_NAME}" -n velero -o jsonpath='{.status.phase}')
  if [ "$STATUS" == "Completed" ]; then
    echo "Restore completed successfully!"
    break
  elif [ "$STATUS" == "Failed" ] || [ "$STATUS" == "PartiallyFailed" ] || [ "$STATUS" == "FailedValidation" ]; then
    echo "Restore failed with status: ${STATUS}"
    velero restore logs "${RESTORE_NAME}"
    exit 1
  fi
  echo "Current status: ${STATUS}"
  sleep 10
done

# Review the restored resources
echo "Verifying restored resources..."
velero restore describe "${RESTORE_NAME}"

# List Pods that are neither Running nor Completed
echo "Checking Pod status..."
kubectl get pods --all-namespaces | grep -v Running | grep -v Completed

echo "Restore verification complete!"
```
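The polling loop in the verification script runs until the restore reaches a terminal phase, which can hang forever if the restore stalls. The sketch below factors the loop into a reusable function with a timeout. `wait_for_phase` is a hypothetical helper; the command that reports the phase is passed in as arguments, so the function itself does not depend on any particular CLI.

```shell
# Poll a phase-reporting command until it prints a terminal phase or the
# timeout expires. Returns 0 on Completed, 1 on a failure phase, 2 on timeout.
# Arguments: <timeout seconds> <poll interval seconds> <command...>
wait_for_phase() {
  timeout="$1"; interval="$2"; shift 2
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # Treat a failing command as "no phase yet" and keep polling.
    phase=$("$@") || phase=""
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation) return 1 ;;
    esac
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
  return 2
}

# Example usage (assumes kubectl access and Velero in the "velero" namespace):
#   wait_for_phase 1800 10 kubectl get restores.velero.io "$RESTORE_NAME" \
#     -n velero -o jsonpath='{.status.phase}'
```

Separating the three return codes lets a calling script distinguish a failed restore, which needs log inspection, from one that is merely slow.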
Handling Restore Conflicts
```bash
# Update resources that already exist (by default they are skipped)
velero restore create update-restore \
  --from-backup full-cluster-backup \
  --existing-resource-policy update

# Default behavior: leave existing resources untouched
velero restore create preserve-restore \
  --from-backup full-cluster-backup \
  --existing-resource-policy none
```
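6. Cross-Cluster Migration
Before You Migrate
- Make sure Velero is installed on the target cluster
- Configure the same BackupStorageLocation on both clusters
- Verify network connectivity
Create a Backup on the Source Cluster
```bash
# Source cluster: create a full backup of the namespace
velero backup create migration-backup \
  --include-namespaces production \
  --snapshot-volumes \
  --wait

# Confirm the backup completed
velero backup describe migration-backup
```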
Configure Access on the Target Cluster
```bash
# Target cluster: point at the same BackupStorageLocation
cat > backup-location.yaml <<EOF
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: ${BUCKET_NAME}
    prefix: ""
  config:
    region: ${REGION}
  # Consider ReadOnly while migrating so the target cluster cannot
  # modify or delete the source cluster's backups.
  accessMode: ReadWrite
EOF
kubectl apply -f backup-location.yaml

# List backups; Velero syncs them from the bucket automatically after a short delay
velero backup get
```
Restore on the Target Cluster
```bash
# Target cluster: run the restore, mapping the backed-up namespace to a new one
velero restore create migration-restore \
  --from-backup migration-backup \
  --namespace-mappings production:target-ns

# Monitor restore progress
watch velero restore get

# Verify the result
kubectl get all -n target-ns
```
Migration Best Practices
```bash
#!/bin/bash
# migration-checklist.sh - pre-migration checklist
echo "=== Pre-Migration Checklist ==="

# 1. Verify the source cluster backup
echo "1. Verifying source cluster backup..."
velero backup describe migration-backup --details

# 2. Check connectivity to the target cluster
echo "2. Checking target cluster connectivity..."
kubectl cluster-info

# 3. Check that the required storage classes exist
echo "3. Checking Storage Classes..."
kubectl get storageclasses

# 4. Verify PV snapshots
echo "4. Verifying volume snapshots..."
velero backup describe migration-backup | grep -A 20 "Volume Snapshots"

# 5. Check CRDs
echo "5. Checking required CRDs..."
kubectl get crds | grep -E "(velero|cert-manager|istio)"

echo "=== Checklist Complete ==="
```
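If the storage class check shows that the target cluster uses different class names, Velero can rewrite them during restore through a labeled ConfigMap in its namespace. The sketch below generates such a ConfigMap following the layout described in Velero's restore reference; `storage_class_mapping` is a hypothetical helper and the class names are placeholders, so check the format against the Velero version you run.

```shell
# Print a ConfigMap that asks Velero to rewrite storage class names on restore.
# Arguments: <source storage class> <target storage class>
storage_class_mapping() {
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: change-storage-class-config
  namespace: velero
  labels:
    velero.io/plugin-config: ""
    velero.io/change-storage-class: RestoreItemAction
data:
  $1: $2
EOF
}

# Example usage on the target cluster (class names are placeholders):
#   storage_class_mapping gp2 gp3 | kubectl apply -f -
```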
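7. Backup Hooks
Velero hooks run custom commands during a backup or restore. They are useful for:
- Consistent database backups (for example, MySQL or PostgreSQL)
- Handling application state
- Clearing caches
Pre-Backup Hook
Run a command before the backup (for example, freezing database writes). Note that a lock taken with FLUSH TABLES WITH READ LOCK is released as soon as the mysql client session exits, so this example mainly illustrates the annotation syntax:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
  annotations:
    pre.hook.backup.velero.io/container: mysql
    pre.hook.backup.velero.io/command: '["/bin/bash", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"FLUSH TABLES WITH READ LOCK;\""]'
    pre.hook.backup.velero.io/timeout: 30s
    pre.hook.backup.velero.io/on-error: Fail
spec:
  containers:
    - name: mysql
      image: mysql:8.0
```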
Post-Backup Hook
Run a command after the backup (for example, releasing the database lock):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
  annotations:
    pre.hook.backup.velero.io/container: mysql
    pre.hook.backup.velero.io/command: '["/bin/bash", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"FLUSH TABLES WITH READ LOCK;\""]'
    post.hook.backup.velero.io/container: mysql
    post.hook.backup.velero.io/command: '["/bin/bash", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"UNLOCK TABLES;\""]'
    post.hook.backup.velero.io/timeout: 30s
spec:
  containers:
    - name: mysql
      image: mysql:8.0
```
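The hook command annotations are JSON arrays embedded in a string, and hand-escaping the inner quotes is easy to get wrong. The sketch below builds the array from ordinary shell arguments. `json_array` is a hypothetical helper; it escapes backslashes and double quotes only and does not handle control characters or embedded newlines.

```shell
# Hypothetical helper: encode the arguments as a JSON array of strings.
json_array() {
  out="["
  sep=""
  for arg in "$@"; do
    # Escape backslashes first, then double quotes.
    esc=$(printf '%s' "$arg" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    out="$out$sep\"$esc\""
    sep=", "
  done
  printf '%s]\n' "$out"
}

json_array /bin/bash -c 'mysql -u root -p$MYSQL_ROOT_PASSWORD -e "UNLOCK TABLES;"'
# prints ["/bin/bash", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"UNLOCK TABLES;\""]
```

The printed value matches the post-backup annotation in the manifest above, so it can be pasted into a manifest or passed to `kubectl annotate`.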
Restore Hooks
Run commands during a restore:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
  annotations:
    init.hook.restore.velero.io/container-name: restore-init
    init.hook.restore.velero.io/container-image: busybox:latest
    init.hook.restore.velero.io/command: '["/bin/sh", "-c", "echo Initializing restore..."]'
    post.hook.restore.velero.io/container: app
    post.hook.restore.velero.io/command: '["/bin/bash", "-c", "/scripts/post-restore.sh"]'
    post.hook.restore.velero.io/wait-timeout: 5m
    post.hook.restore.velero.io/exec-timeout: 2m
spec:
  containers:
    - name: app
      image: myapp:latest
```
PostgreSQL Backup Hook Example
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql
  namespace: database
spec:
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
      annotations:
        # Before the backup: enter backup mode.
        # Exclusive mode is used because each hook runs in its own psql session.
        pre.hook.backup.velero.io/container: postgresql
        pre.hook.backup.velero.io/command: '["/bin/bash", "-c", "psql -U postgres -c \"SELECT pg_start_backup(''velero-backup'', false, true);\""]'
        pre.hook.backup.velero.io/timeout: 60s
        pre.hook.backup.velero.io/on-error: Fail
        # After the backup: leave backup mode
        post.hook.backup.velero.io/container: postgresql
        post.hook.backup.velero.io/command: '["/bin/bash", "-c", "psql -U postgres -c \"SELECT pg_stop_backup();\""]'
        post.hook.backup.velero.io/timeout: 60s
    spec:
      containers:
        - name: postgresql
          # pg_start_backup/pg_stop_backup were removed in PostgreSQL 15,
          # so this example targets PostgreSQL 14.
          image: postgres:14
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgresql-secret
                  key: password
```
Defining Hooks in the Backup Spec
```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: database-backup
  namespace: velero
spec:
  includedNamespaces:
    - database
  hooks:
    resources:
      - name: postgresql-hook
        includedNamespaces:
          - database
        labelSelector:
          matchLabels:
            app: postgresql
        pre:
          - exec:
              container: postgresql
              command:
                - /bin/bash
                - -c
                # Assumes a volume is mounted at /backup in the container
                - "pg_dump -U postgres mydb > /backup/mydb.sql"
              onError: Fail
              timeout: 5m
        post:
          - exec:
              container: postgresql
              command:
                - /bin/bash
                - -c
                - "echo 'Backup completed successfully'"
              timeout: 30s
```
8. Monitoring and Troubleshooting
Basic Monitoring Commands
```bash
# Check the Velero components
kubectl get pods -n velero
kubectl logs deployment/velero -n velero

# Check the backup storage location status
velero backup-location get

# List backups with their labels
velero backup get --show-labels

# Show backup details
velero backup describe <backup-name> --details

# Check restore status
velero restore get
velero restore describe <restore-name> --details
```
Setting Up Prometheus Monitoring
Velero exposes a built-in Prometheus metrics endpoint. The ServiceMonitor below requires the Prometheus Operator, and its selector and port name assume the Service created by the Helm chart:
```yaml
# velero-service-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: velero
  namespace: velero
  labels:
    app: velero
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: velero
  endpoints:
    - port: http-monitoring
      interval: 30s
```
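Commonly Used Metrics
| Metric | Description |
|---|---|
| velero_backup_total | Number of backups that currently exist |
| velero_backup_attempt_total | Number of backups attempted |
| velero_backup_success_total | Number of successful backups |
| velero_backup_failure_total | Number of failed backups |
| velero_backup_duration_seconds | Backup duration (histogram) |
| velero_restore_total | Number of restores that currently exist |
| velero_restore_success_total | Number of successful restores |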
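To make the relationship between these counters concrete, the sketch below computes a backup success rate from metrics in Prometheus text format. It assumes Velero's `velero_backup_attempt_total` counter as the denominator and sums across the per-schedule label sets. `sum_metric` is a hypothetical helper, and the sample values are invented for illustration rather than taken from a real cluster.

```shell
# Sum one metric across all label sets in Prometheus text-format input on stdin.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                      # skip HELP/TYPE comment lines
    {
      i = index($1, "{")               # strip the {label="..."} part, if any
      metric = (i > 0) ? substr($1, 1, i - 1) : $1
      if (metric == name) total += $NF
    }
    END { print total + 0 }
  '
}

# Illustrative sample, not real cluster output.
sample='# TYPE velero_backup_attempt_total counter
velero_backup_attempt_total{schedule="daily-backup"} 30
velero_backup_attempt_total{schedule="weekly-backup"} 4
velero_backup_success_total{schedule="daily-backup"} 29
velero_backup_success_total{schedule="weekly-backup"} 4'

attempts=$(printf '%s\n' "$sample" | sum_metric velero_backup_attempt_total)
successes=$(printf '%s\n' "$sample" | sum_metric velero_backup_success_total)
if [ "$attempts" -gt 0 ]; then
  echo "success rate: $(( successes * 100 / attempts ))%"   # prints "success rate: 97%"
fi
```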
Grafana Dashboard
```json
{
  "dashboard": {
    "title": "Velero Backup Dashboard",
    "panels": [
      {
        "title": "Backup Success Rate",
        "type": "gauge",
        "targets": [
          {
            "expr": "sum(velero_backup_success_total) / sum(velero_backup_attempt_total) * 100"
          }
        ]
      },
      {
        "title": "Average Backup Duration",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(velero_backup_duration_seconds_sum) / sum(velero_backup_duration_seconds_count)"
          }
        ]
      }
    ]
  }
}
```
Common Issues
Issue 1: Backup Stuck in InProgress
```bash
# Follow the Velero logs
kubectl logs deployment/velero -n velero -f

# Inspect the Backup resource
kubectl describe backup <backup-name> -n velero

# Restart Velero
kubectl rollout restart deployment/velero -n velero
```
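Issue 2: Backup Storage Location Unavailable
```bash
# Check the BackupStorageLocation status
velero backup-location get

# Verify the credentials (cloud-credentials is the Secret created by velero install)
kubectl get secret -n velero cloud-credentials -o yaml

# Test the S3 connection
aws s3 ls "s3://${BUCKET_NAME}/"
```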
Issue 3: Volume Snapshot Failures
```bash
# Check the VolumeSnapshotLocation
velero snapshot-location get

# Check the CSI drivers
kubectl get csidrivers

# Check snapshot status
kubectl get volumesnapshots --all-namespaces
kubectl get volumesnapshotcontents
```
Issue 4: Pods Fail to Start After a Restore
```bash
# Check the restore logs
velero restore logs <restore-name>

# Check Pod events
kubectl describe pod <pod-name> -n <namespace>

# Check PVC status
kubectl get pvc -n <namespace>
kubectl describe pvc <pvc-name> -n <namespace>
```
Troubleshooting Script
```bash
#!/bin/bash
# velero-troubleshoot.sh
echo "=== Velero Troubleshooting Report ==="
echo ""
echo "1. Velero Version:"
velero version
echo ""
echo "2. Velero Pod Status:"
kubectl get pods -n velero
echo ""
echo "3. Backup Storage Locations:"
velero backup-location get
echo ""
echo "4. Volume Snapshot Locations:"
velero snapshot-location get
echo ""
echo "5. Recent Backups (last 5):"
velero backup get | head -6
echo ""
echo "6. Failed Backups:"
velero backup get | grep -E "Failed|PartiallyFailed"
echo ""
echo "7. Recent Restores (last 5):"
velero restore get | head -6
echo ""
echo "8. Velero Logs (last 50 lines):"
kubectl logs deployment/velero -n velero --tail=50
echo ""
echo "9. Velero Resource Usage:"
# Requires metrics-server in the cluster
kubectl top pod -n velero
echo ""
echo "=== End of Report ==="
```
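The report greps the whole table for failure phases. When the goal is to act on each failed backup, for example by fetching its logs, it helps to extract just the names by matching the STATUS column. The sketch below assumes the table layout of `velero backup get` with STATUS as the second column; `failed_backups` is a hypothetical helper and the sample rows are invented.

```shell
# Print the names of backups whose STATUS column is a failure phase.
# Reads `velero backup get` table output on stdin; assumes STATUS is column 2.
failed_backups() {
  awk 'NR > 1 && ($2 == "Failed" || $2 == "PartiallyFailed" || $2 == "FailedValidation") { print $1 }'
}

# Illustrative table, not real output.
sample='NAME               STATUS            ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
daily-backup-001   Completed         0        0          2024-01-01 02:00:00 +0000 UTC   29d       default            <none>
daily-backup-002   PartiallyFailed   2        0          2024-01-02 02:00:00 +0000 UTC   29d       default            <none>'

printf '%s\n' "$sample" | failed_backups   # prints daily-backup-002

# Example usage against a live cluster (requires the velero CLI):
#   velero backup get | failed_backups | while read -r name; do velero backup logs "$name"; done
```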
Collecting Logs
```bash
# Collect a full diagnostic bundle (written as a tarball)
velero debug \
  --backup <backup-name> \
  --restore <restore-name> \
  --output ./velero-debug.tar.gz

# Inspect the bundle contents
tar -tzf ./velero-debug.tar.gz
```
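Best Practices Summary
- Test restores regularly: run restore drills on a schedule to confirm the backups are usable
- Follow the 3-2-1 rule: three copies, on two types of media, with one copy off-site
- Set an appropriate TTL: choose backup retention periods based on compliance requirements
- Monitor backup status: set up alerts so failed backups are noticed quickly
- Use hooks for consistency: stateful applications such as databases should always use hooks
- Encrypt backup data: enable S3 server-side encryption
- Enable versioning: turn on S3 bucket versioning to guard against accidental deletion
- Document the process: write down the SOPs for backup and restore
Conclusion
Velero is an indispensable backup tool for Kubernetes environments, covering backup, restore, and migration. After reading this article, you should be able to:
- Understand Velero's architecture and how it works
- Install and configure Velero in an AWS environment
- Design and implement a backup strategy that fits your needs
- Perform selective backups and restores
- Migrate resources across clusters
- Use hooks to back up stateful applications
- Monitor Velero and troubleshoot failures
Before deploying to production, validate the full backup and restore workflow in a test environment so that services can be recovered smoothly when it really matters.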