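In modern cloud-native architectures, secure communication and authentication between services are critical. SPIFFE (Secure Production Identity Framework For Everyone) and SPIRE (SPIFFE Runtime Environment) offer a standardized way to establish and manage service identity in dynamic, heterogeneous environments. This article examines both open source projects in depth.
SPIFFE Standard Overview
What Is SPIFFE?
SPIFFE is an open standard hosted by the CNCF (Cloud Native Computing Foundation). It gives workloads in distributed systems a secure identity mechanism by defining a standard way to:
- Identify workloads: give every service a unique identity
- Verify identity: prove a service's authenticity cryptographically
- Establish trust: build a uniform trust relationship across platforms and environments
Core Advantages of SPIFFE
- Platform independence: runs on Kubernetes, virtual machines, bare metal, and other environments
- Automation: identities are issued and rotated without manual intervention
- Short-lived credentials: short validity periods limit the damage of a leaked credential
- Zero-trust foundation: satisfies the core requirements of the zero-trust security model
SPIFFE Trust Domain
A trust domain is a core SPIFFE concept. It represents the identity-issuing authority within one administrative boundary: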
```text
Trust domain example: example.org
                 │
        ┌────────┴────────┐
        │  SPIRE Server   │
        │  (Trust Root)   │
        └────────┬────────┘
                 │
        ┌────────┴────────┐
        │                 │
    Service A         Service B
```
SPIFFE ID and SVID Formats
SPIFFE ID
A SPIFFE ID uniquely identifies a workload and takes the form of a URI: spiffe://<trust-domain>/<path>.
Examples:
```text
spiffe://example.org/ns/production/sa/web-frontend
spiffe://example.org/region/us-west/host/database-01
spiffe://example.org/cluster/k8s-prod/ns/default/pod/api-server-abc123
```
Components of a SPIFFE ID
| Component | Description | Example |
|---|---|---|
| Scheme | Always spiffe | spiffe |
| Trust Domain | Name of the trust domain | example.org |
| Path | Workload path | /ns/production/sa/web-frontend |
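The three components above can be pulled apart with an ordinary URI parser. The following is a minimal sketch rather than a full implementation of the SPIFFE ID specification; the function name `parse_spiffe_id` is hypothetical, and it checks only the scheme, the presence of a trust domain, and the absence of the user-info, port, query, and fragment parts.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(value: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path); raise ValueError if malformed."""
    parts = urlsplit(value)
    if parts.scheme != "spiffe":
        raise ValueError(f"scheme must be 'spiffe': {value!r}")
    if not parts.netloc:
        raise ValueError(f"missing trust domain: {value!r}")
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"user info and port are not allowed: {value!r}")
    if parts.query or parts.fragment:
        raise ValueError(f"query and fragment are not allowed: {value!r}")
    return parts.netloc, parts.path


if __name__ == "__main__":
    trust_domain, path = parse_spiffe_id(
        "spiffe://example.org/ns/production/sa/web-frontend"
    )
    print(trust_domain, path)
```

In production code, a maintained SPIFFE library is a better choice than hand-rolled parsing, since the specification places further restrictions on allowed characters.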
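SVID (SPIFFE Verifiable Identity Document)
An SVID is the verifiable form of a SPIFFE ID. Two formats are currently supported:
X.509-SVID
The most common SVID format. It encodes the SPIFFE ID as a URI in the SAN (Subject Alternative Name) extension of an X.509 certificate: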
```text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1234567890
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=SPIRE Server CA
        Validity
            Not Before: Jun 28 00:00:00 2025 GMT
            Not After : Jun 28 01:00:00 2025 GMT
        Subject: CN=web-frontend
        Subject Public Key Info:
            ...
        X509v3 extensions:
            X509v3 Subject Alternative Name:
                URI:spiffe://example.org/ns/production/sa/web-frontend
```
JWT-SVID
A JWT-based format suited to HTTP and gRPC communication. The header and payload look like this:
```json
{
  "alg": "RS256",
  "kid": "key-id-12345",
  "typ": "JWT"
}

{
  "aud": ["spiffe://example.org/backend-service"],
  "exp": 1719619200,
  "iat": 1719615600,
  "sub": "spiffe://example.org/ns/production/sa/web-frontend"
}
```
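To make the claims above concrete, here is a small sketch of how a receiver might read and sanity-check a JWT-SVID payload. It deliberately does not verify the signature, which a real verifier must do against the trust bundle; the helper names (`make_unsigned_token`, `read_jwt_svid_claims`, `check_claims`) are hypothetical and the token is built locally only for illustration.

```python
import base64
import json


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def make_unsigned_token(claims: dict) -> str:
    """Build a JWT-shaped string with a placeholder signature (illustration only)."""
    header = {"alg": "RS256", "kid": "key-id-12345", "typ": "JWT"}
    segments = [
        _b64url_encode(json.dumps(header).encode()),
        _b64url_encode(json.dumps(claims).encode()),
        "signature-placeholder",
    ]
    return ".".join(segments)


def read_jwt_svid_claims(token: str) -> dict:
    """Return the payload claims WITHOUT verifying the signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(_b64url_decode(payload))


def check_claims(claims: dict, expected_audience: str, now: float) -> bool:
    """Check subject scheme, audience membership, and expiry."""
    audience = claims.get("aud", [])
    if isinstance(audience, str):
        audience = [audience]
    return (
        str(claims.get("sub", "")).startswith("spiffe://")
        and expected_audience in audience
        and claims.get("exp", 0) > now
    )


if __name__ == "__main__":
    sample = {
        "aud": ["spiffe://example.org/backend-service"],
        "exp": 1719619200,
        "iat": 1719615600,
        "sub": "spiffe://example.org/ns/production/sa/web-frontend",
    }
    token = make_unsigned_token(sample)
    print(check_claims(read_jwt_svid_claims(token),
                       "spiffe://example.org/backend-service", now=1719615700))
```

The audience check matters in practice: a token minted for one service should be rejected by every other service.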
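SPIRE Architecture and Components
SPIRE Overview
SPIRE is the reference implementation of the SPIFFE standard and provides a complete identity management solution.
Architecture Diagram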
```text
┌─────────────────────────────────────────────────────────────┐
│                        SPIRE Server                         │
│ ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐    │
│ │ CA Manager  │  │ Registration │  │ Node Attestor     │    │
│ │             │  │ API          │  │                   │    │
│ └─────────────┘  └──────────────┘  └───────────────────┘    │
│ ┌─────────────┐  ┌──────────────┐  ┌───────────────────┐    │
│ │ DataStore   │  │ Key Manager  │  │ Upstream Authority│    │
│ │ (SQL)       │  │              │  │                   │    │
│ └─────────────┘  └──────────────┘  └───────────────────┘    │
└─────────────────────────────┬───────────────────────────────┘
                              │ Node API
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  SPIRE Agent  │     │  SPIRE Agent  │     │  SPIRE Agent  │
│  ┌─────────┐  │     │  ┌─────────┐  │     │  ┌─────────┐  │
│  │Workload │  │     │  │Workload │  │     │  │Workload │  │
│  │Attestor │  │     │  │Attestor │  │     │  │Attestor │  │
│  └─────────┘  │     │  └─────────┘  │     │  └─────────┘  │
└───────┬───────┘     └───────┬───────┘     └───────┬───────┘
        │                     │                     │
        ▼                     ▼                     ▼
   ┌─────────┐           ┌─────────┐           ┌─────────┐
   │Workload │           │Workload │           │Workload │
   │    A    │           │    B    │           │    C    │
   └─────────┘           └─────────┘           └─────────┘
```
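Core Components
SPIRE Server
The SPIRE Server is the control plane of the whole system. It is responsible for:
- CA management: signing and managing X.509 certificates
- Registration management: maintaining workload identity registration entries
- Node attestation: verifying the identity of each SPIRE Agent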
```hcl
# Example SPIRE Server configuration
server {
  bind_address = "0.0.0.0"
  bind_port = "8081"
  trust_domain = "example.org"
  data_dir = "/run/spire/data"
  log_level = "INFO"
  ca_key_type = "ec-p256"
  ca_ttl = "24h"
  # Newer SPIRE releases replace default_svid_ttl with per-format settings
  default_x509_svid_ttl = "1h"
}

plugins {
  DataStore "sql" {
    plugin_data {
      database_type = "postgres"
      # Example credentials only; keep real passwords out of configuration files
      connection_string = "dbname=spire host=postgres user=spire password=secret"
    }
  }

  NodeAttestor "k8s_psat" {
    plugin_data {
      clusters = {
        "k8s-cluster" = {
          service_account_allow_list = ["spire:spire-agent"]
        }
      }
    }
  }

  KeyManager "disk" {
    plugin_data {
      keys_path = "/run/spire/data/keys.json"
    }
  }
}
```
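SPIRE Agent
A SPIRE Agent runs on every node. It is responsible for:
- Workload attestation: identifying and verifying local workloads
- SVID delivery: handing identity credentials to workloads
- Credential caching: managing the local SVID cache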
```hcl
# Example SPIRE Agent configuration
agent {
  data_dir = "/run/spire/agent"
  log_level = "INFO"
  server_address = "spire-server"
  server_port = "8081"
  trust_domain = "example.org"
  socket_path = "/run/spire/sockets/agent.sock"
  # The agent also needs a bootstrap trust bundle (trust_bundle_path) or,
  # for testing only, insecure_bootstrap = true; neither is shown here.
}

plugins {
  NodeAttestor "k8s_psat" {
    plugin_data {
      cluster = "k8s-cluster"
    }
  }

  KeyManager "memory" {
    plugin_data {}
  }

  WorkloadAttestor "k8s" {
    plugin_data {
      # Skips verification of the kubelet's certificate; convenient for demos,
      # but weakens the attestation path in production
      skip_kubelet_verification = true
    }
  }

  WorkloadAttestor "unix" {
    plugin_data {}
  }
}
```
Deploying the SPIRE Server and Agent
Deploying SPIRE on Kubernetes
Step 1: Create the namespace and RBAC
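```shell
# Create the SPIRE namespace
kubectl create namespace spire

# Apply the RBAC configuration
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-server
  namespace: spire
---
# Service account used by the agent DaemonSet in step 3
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spire-agent
  namespace: spire
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spire-server-cluster-role
rules:
- apiGroups: [""]
  resources: ["pods", "nodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["authentication.k8s.io"]
  resources: ["tokenreviews"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spire-server-cluster-role-binding
subjects:
- kind: ServiceAccount
  name: spire-server
  namespace: spire
roleRef:
  kind: ClusterRole
  name: spire-server-cluster-role
  apiGroup: rbac.authorization.k8s.io
---
# Lets the agent's k8s workload attestor query the kubelet for pod details
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spire-agent-cluster-role
rules:
- apiGroups: [""]
  resources: ["pods", "nodes", "nodes/proxy"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spire-agent-cluster-role-binding
subjects:
- kind: ServiceAccount
  name: spire-agent
  namespace: spire
roleRef:
  kind: ClusterRole
  name: spire-agent-cluster-role
  apiGroup: rbac.authorization.k8s.io
EOF
```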
Step 2: Deploy the SPIRE Server
```yaml
# spire-server-deployment.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spire-server
  serviceName: spire-server
  template:
    metadata:
      labels:
        app: spire-server
    spec:
      serviceAccountName: spire-server
      containers:
      - name: spire-server
        image: ghcr.io/spiffe/spire-server:1.9.0
        args:
        - -config
        - /run/spire/config/server.conf
        ports:
        - name: grpc
          containerPort: 8081
        volumeMounts:
        - name: spire-config
          mountPath: /run/spire/config
          readOnly: true
        - name: spire-data
          mountPath: /run/spire/data
        livenessProbe:
          exec:
            command:
            - /opt/spire/bin/spire-server
            - healthcheck
          initialDelaySeconds: 15
          periodSeconds: 60
        readinessProbe:
          exec:
            command:
            - /opt/spire/bin/spire-server
            - healthcheck
          initialDelaySeconds: 5
          periodSeconds: 10
      volumes:
      - name: spire-config
        configMap:
          name: spire-server-config
  volumeClaimTemplates:
  - metadata:
      name: spire-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: spire-server
  namespace: spire
spec:
  type: ClusterIP
  ports:
  - name: grpc
    port: 8081
    targetPort: 8081
  selector:
    app: spire-server
```
```shell
# Apply the SPIRE Server manifests
# (assumes a ConfigMap named spire-server-config holding server.conf already exists)
kubectl apply -f spire-server-deployment.yaml
```
Step 3: Deploy the SPIRE Agent
```yaml
# spire-agent-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      labels:
        app: spire-agent
    spec:
      serviceAccountName: spire-agent
      hostPID: true
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      initContainers:
      - name: init
        # Wait for the SPIRE Server to accept connections before the agent starts.
        # The SPIRE images ship without a shell, so a separate wait-for-it image
        # is used here; any image that waits on a TCP port works.
        image: cgr.dev/chainguard/wait-for-it
        args: ["-t", "30", "spire-server:8081"]
      containers:
      - name: spire-agent
        image: ghcr.io/spiffe/spire-agent:1.9.0
        args:
        - -config
        - /run/spire/config/agent.conf
        volumeMounts:
        - name: spire-config
          mountPath: /run/spire/config
          readOnly: true
        - name: spire-agent-socket
          mountPath: /run/spire/sockets
        - name: spire-token
          mountPath: /var/run/secrets/tokens
        livenessProbe:
          exec:
            command:
            - /opt/spire/bin/spire-agent
            - healthcheck
            - -socketPath
            - /run/spire/sockets/agent.sock
          initialDelaySeconds: 15
          periodSeconds: 60
        readinessProbe:
          exec:
            command:
            - /opt/spire/bin/spire-agent
            - healthcheck
            - -socketPath
            - /run/spire/sockets/agent.sock
          initialDelaySeconds: 5
          periodSeconds: 10
      volumes:
      - name: spire-config
        configMap:
          name: spire-agent-config
      - name: spire-agent-socket
        hostPath:
          path: /run/spire/sockets
          type: DirectoryOrCreate
      - name: spire-token
        projected:
          sources:
          - serviceAccountToken:
              path: spire-agent
              expirationSeconds: 7200
              audience: spire-server
```
```shell
# Apply the SPIRE Agent manifests
# (assumes a ConfigMap named spire-agent-config holding agent.conf already exists)
kubectl apply -f spire-agent-daemonset.yaml
```
Step 4: Verify the deployment
```shell
# Check SPIRE Server status
kubectl -n spire get pods -l app=spire-server

# Check SPIRE Agent status
kubectl -n spire get pods -l app=spire-agent

# View SPIRE Server logs
kubectl -n spire logs -l app=spire-server

# Run a health check
kubectl -n spire exec -it spire-server-0 -- \
  /opt/spire/bin/spire-server healthcheck
```
Deploying with Helm
```shell
# Add the SPIFFE Helm repository
helm repo add spiffe https://spiffe.github.io/helm-charts-hardened/
helm repo update

# Install the CRDs first, then SPIRE itself
helm install spire-crds spiffe/spire-crds \
  --namespace spire \
  --create-namespace
helm install spire spiffe/spire \
  --namespace spire \
  --set global.spire.trustDomain=example.org \
  --set global.spire.clusterName=k8s-cluster
```
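Workload Attestation
What Is Workload Attestation?
Workload attestation is how SPIRE verifies a workload's identity. The agent collects attributes of the calling workload, called selectors, and compares them with registration entries to decide which SPIFFE ID the workload is entitled to.
Attestation Flow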
```text
┌─────────────────────────────────────────────────────────────┐
│                      Attestation Flow                       │
└─────────────────────────────────────────────────────────────┘

Workload               SPIRE Agent              SPIRE Server
   │                        │                         │
   │ 1. Request SVID        │                         │
   │───────────────────────>│                         │
   │                        │                         │
   │ 2. Collect workload    │                         │
   │    attributes          │                         │
   │    (PID, UID, K8s Pod) │                         │
   │<───────────────────────│                         │
   │                        │                         │
   │                        │ 3. Look up matching     │
   │                        │    registration entries │
   │                        │────────────────────────>│
   │                        │                         │
   │                        │ 4. Return SPIFFE ID     │
   │                        │<────────────────────────│
   │                        │                         │
   │                        │ 5. Sign SVID            │
   │                        │────────────────────────>│
   │                        │<────────────────────────│
   │ 6. Return SVID         │                         │
   │<───────────────────────│                         │
   │                        │                         │
```
The diagram shows the logical order. In practice the agent keeps a synced cache of registration entries and already-issued SVIDs, so steps 3 to 5 are usually answered from that cache instead of a round trip per request.
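The matching rule behind this flow is simple: a registration entry applies to a workload when every selector on the entry is among the selectors the agent discovered for that workload. The sketch below illustrates that rule only; the `Entry` type and `match_entries` function are hypothetical names, not SPIRE APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A simplified registration entry: one SPIFFE ID plus its required selectors."""
    spiffe_id: str
    selectors: frozenset


def match_entries(entries: list, discovered: set) -> list:
    """Return the SPIFFE IDs whose selectors are all present on the workload."""
    return [e.spiffe_id for e in entries if e.selectors <= discovered]


if __name__ == "__main__":
    entries = [
        Entry("spiffe://example.org/ns/production/sa/web-frontend",
              frozenset({"k8s:ns:production", "k8s:sa:web-frontend"})),
        Entry("spiffe://example.org/app/api-server",
              frozenset({"k8s:ns:default", "k8s:pod-label:app:api-server"})),
    ]
    # Selectors the agent might discover for one pod (illustrative values).
    discovered = {
        "k8s:ns:production",
        "k8s:sa:web-frontend",
        "k8s:pod-label:app:web-frontend",
    }
    print(match_entries(entries, discovered))
```

Because extra discovered selectors never prevent a match, adding more selectors to an entry makes it stricter, which is why multi-selector entries are recommended for sensitive workloads.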
Common Workload Attestor Types
1. Kubernetes Workload Attestor
Identifies attributes of a Kubernetes Pod:
```text
# Available selector types
k8s:ns:namespace              # Namespace
k8s:sa:service-account        # Service account
k8s:pod-label:key:value       # Pod label
k8s:pod-name:name             # Pod name
k8s:container-name:name       # Container name
k8s:container-image:image     # Container image
k8s:node-name:name            # Node name
```
2. Unix Workload Attestor
Identifies attributes of a Unix process:
```text
# Available selector types
unix:uid:1000                 # User ID
unix:gid:1000                 # Group ID
unix:user:username            # User name
unix:group:groupname          # Group name
unix:path:/usr/bin/myapp      # Executable path
unix:sha256:abc123...         # SHA-256 hash of the executable
```
3. Docker Workload Attestor
Identifies attributes of a Docker container:
```text
# Available selector types
docker:label:key:value        # Container label
docker:image_id:id            # Image ID
docker:env:KEY=value          # Environment variable
```
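Creating Workload Registrations
```shell
# Create a node alias that every agent attested in this cluster maps to
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -node \
  -spiffeID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s_psat:cluster:k8s-cluster \
  -selector k8s_psat:agent_ns:spire \
  -selector k8s_psat:agent_sa:spire-agent

# Register a Kubernetes workload under one specific agent
# (the parent ID ends in that node's UID, so this covers only the first node)
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/ns/production/sa/web-frontend \
  -parentID spiffe://example.org/spire/agent/k8s_psat/k8s-cluster/$(kubectl get node -o jsonpath='{.items[0].metadata.uid}') \
  -selector k8s:ns:production \
  -selector k8s:sa:web-frontend

# Register by Pod label, parented to the node alias so it applies on every node
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/app/api-server \
  -parentID spiffe://example.org/ns/spire/sa/spire-agent \
  -selector k8s:pod-label:app:api-server \
  -selector k8s:ns:default

# List all registration entries
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry show
```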
Using the SPIRE Controller Manager
Registration entries can also be managed declaratively through a CRD:
```yaml
# ClusterSPIFFEID example
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: web-frontend
spec:
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  podSelector:
    matchLabels:
      app: web-frontend
  namespaceSelector:
    matchLabels:
      environment: production
```
Kubernetes Integration
Using the SPIFFE CSI Driver
The SPIFFE CSI Driver mounts the SPIRE Agent's Workload API socket directly into a Pod, so the workload can fetch its SVID without a hostPath volume:
```shell
# Install the SPIFFE CSI Driver
helm install spiffe-csi-driver spiffe/spiffe-csi-driver \
  --namespace spire
```
```yaml
# Example Pod that uses the CSI driver
apiVersion: v1
kind: Pod
metadata:
  name: my-workload
  namespace: production
spec:
  serviceAccountName: web-frontend
  containers:
  - name: app
    image: my-app:latest
    volumeMounts:
    - name: spiffe-workload-api
      mountPath: /spiffe-workload-api
      readOnly: true
    env:
    # The socket file name must match the agent's socket_path
    # (agent.sock in the manual agent configuration shown earlier)
    - name: SPIFFE_ENDPOINT_SOCKET
      value: unix:///spiffe-workload-api/spire-agent.sock
  volumes:
  - name: spiffe-workload-api
    csi:
      driver: "csi.spiffe.io"
      readOnly: true
```
Using the Workload API
Applications talk to the SPIFFE Workload API through a client library:
Go Example
```go
package main

import (
	"context"
	"log"
	"net/http"
	"time"

	"github.com/spiffe/go-spiffe/v2/spiffeid"
	"github.com/spiffe/go-spiffe/v2/spiffetls/tlsconfig"
	"github.com/spiffe/go-spiffe/v2/workloadapi"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Create a Workload API client
	client, err := workloadapi.New(ctx, workloadapi.WithAddr("unix:///spiffe-workload-api/spire-agent.sock"))
	if err != nil {
		log.Fatalf("unable to create Workload API client: %v", err)
	}
	defer client.Close()

	// Fetch the X.509 SVID
	x509SVID, err := client.FetchX509SVID(ctx)
	if err != nil {
		log.Fatalf("unable to fetch X.509 SVID: %v", err)
	}
	log.Printf("SPIFFE ID: %s", x509SVID.ID)
	log.Printf("certificate valid until: %s", x509SVID.Certificates[0].NotAfter)

	// Fetch the trust bundles used to verify peer certificates
	bundles, err := client.FetchX509Bundles(ctx)
	if err != nil {
		log.Fatalf("unable to fetch X.509 bundles: %v", err)
	}

	// Build an mTLS server that only accepts the given client identity.
	// This is a one-shot fetch: the SVID is not rotated automatically here.
	clientID := spiffeid.RequireFromString("spiffe://example.org/ns/production/sa/web-frontend")
	tlsConfig := tlsconfig.MTLSServerConfig(
		x509SVID,
		bundles,
		tlsconfig.AuthorizeID(clientID),
	)

	// Serve HTTPS with the SPIFFE-aware TLS configuration
	server := &http.Server{Addr: ":8443", TLSConfig: tlsConfig}
	log.Fatal(server.ListenAndServeTLS("", ""))
}
```
Python Example
```python
import logging

from pyspiffe.workloadapi.default_workload_api_client import DefaultWorkloadApiClient
from pyspiffe.workloadapi.x509_context import X509Context

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def main():
    # Create a Workload API client.
    # The pyspiffe names used here differ between releases;
    # check them against the installed version.
    socket_path = "unix:///spiffe-workload-api/spire-agent.sock"
    with DefaultWorkloadApiClient(socket_path) as client:
        # Fetch the X.509 context
        x509_context: X509Context = client.fetch_x509_context()

        # Get the default SVID
        svid = x509_context.default_svid()
        logger.info(f"SPIFFE ID: {svid.spiffe_id}")
        logger.info(f"Certificate: {svid.cert_chain[0].subject}")

        # Get the trust bundles
        bundles = x509_context.x509_bundle_set()
        logger.info(f"Number of trust domains: {len(bundles)}")


if __name__ == "__main__":
    main()
```
Mapping Kubernetes Service Accounts to SPIFFE IDs
```yaml
# Generate SPIFFE IDs automatically for every Pod
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: default-spiffe-id
spec:
  className: "spire-spiffe-class"
  spiffeIDTemplate: >-
    spiffe://{{ .TrustDomain }}/k8s/{{ .PodMeta.Namespace }}/{{ .PodSpec.ServiceAccountName }}
  podSelector: {}
  namespaceSelector: {}
  dnsNameTemplates:
  - "{{ .PodMeta.Name }}.{{ .PodMeta.Namespace }}.svc.cluster.local"
  workloadSelectorTemplates:
  - "k8s:ns:{{ .PodMeta.Namespace }}"
  - "k8s:sa:{{ .PodSpec.ServiceAccountName }}"
```
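To see what such a template produces for a given Pod, the substitution can be mimicked in a few lines. This is only an illustration of the idea: the real controller evaluates Go templates, whereas the hypothetical `render` function below handles just the plain `{{ .Field.SubField }}` lookups used in this article, and the Pod values are made up.

```python
import re

# Matches simple field references such as {{ .PodMeta.Namespace }}
_FIELD = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")


def render(template: str, data: dict) -> str:
    """Replace each {{ .A.B }} reference with the value found at data["A"]["B"]."""
    def lookup(match: re.Match) -> str:
        value = data
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError for unknown fields
        return str(value)

    return _FIELD.sub(lookup, template)


if __name__ == "__main__":
    pod = {
        "TrustDomain": "example.org",
        "PodMeta": {"Name": "web-frontend-7d9f", "Namespace": "production"},
        "PodSpec": {"ServiceAccountName": "web-frontend"},
    }
    template = (
        "spiffe://{{ .TrustDomain }}/k8s/"
        "{{ .PodMeta.Namespace }}/{{ .PodSpec.ServiceAccountName }}"
    )
    print(render(template, pod))
```

Rendering the selector templates the same way shows which selectors the generated entry will require, which is a quick way to review a template before applying it cluster-wide.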
Integrating with Envoy and Istio
SPIFFE and Envoy
Envoy natively supports SDS (Secret Discovery Service), which the SPIRE Agent implements, so Envoy can fetch SPIFFE certificates and trust bundles straight from the agent socket:
```yaml
# Example Envoy configuration
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8443
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: backend_cluster
          # The router filter terminates the HTTP filter chain
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          # Require a client certificate so the listener enforces mutual TLS
          require_client_certificate: true
          common_tls_context:
            tls_certificate_sds_secret_configs:
            - name: "spiffe://example.org/ns/production/sa/frontend"
              sds_config:
                resource_api_version: V3
                api_config_source:
                  api_type: GRPC
                  transport_api_version: V3
                  grpc_services:
                  - envoy_grpc:
                      cluster_name: spire_agent
            validation_context_sds_secret_config:
              name: "spiffe://example.org"
              sds_config:
                resource_api_version: V3
                api_config_source:
                  api_type: GRPC
                  transport_api_version: V3
                  grpc_services:
                  - envoy_grpc:
                      cluster_name: spire_agent
            tls_params:
              tls_minimum_protocol_version: TLSv1_3
  clusters:
  - name: spire_agent
    connect_timeout: 1s
    type: STATIC
    lb_policy: ROUND_ROBIN
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: spire_agent
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              pipe:
                path: /run/spire/sockets/agent.sock
  - name: backend_cluster
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: backend_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: backend-service
                port_value: 8080
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_certificate_sds_secret_configs:
          - name: "spiffe://example.org/ns/production/sa/frontend"
            sds_config:
              resource_api_version: V3
              api_config_source:
                api_type: GRPC
                transport_api_version: V3
                grpc_services:
                - envoy_grpc:
                    cluster_name: spire_agent
          validation_context_sds_secret_config:
            name: "spiffe://example.org"
            sds_config:
              resource_api_version: V3
              api_config_source:
                api_type: GRPC
                transport_api_version: V3
                grpc_services:
                - envoy_grpc:
                    cluster_name: spire_agent
```
Integrating with Istio
Istio can use SPIRE as its identity provider. Istio's documented integration has each proxy obtain its SVID from the SPIRE Agent over Envoy SDS and expects the agent socket to be exposed to the proxy at /run/secrets/workload-spiffe-uds/socket, so treat the operator values below as a starting point and verify them against the Istio release in use:
```yaml
# Configure Istio to use SPIRE
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-spire
spec:
  profile: default
  meshConfig:
    trustDomain: example.org
  values:
    global:
      caAddress: spire-server.spire:8081
    pilot:
      env:
        ENABLE_SPIFFE_WORKLOAD_API: "true"
        SPIFFE_ENDPOINT_SOCKET: "unix:///run/spire/sockets/agent.sock"
  components:
    ingressGateways:
    - name: istio-ingressgateway
      enabled: true
      k8s:
        overlays:
        - kind: Deployment
          name: istio-ingressgateway
          patches:
          # An index past the end of the list appends a new element
          - path: spec.template.spec.volumes[100]
            value:
              name: spire-agent-socket
              hostPath:
                path: /run/spire/sockets
                type: Directory
          - path: spec.template.spec.containers[0].volumeMounts[100]
            value:
              name: spire-agent-socket
              mountPath: /run/spire/sockets
              readOnly: true
```
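Cross-Cluster Trust Federation
In a multi-cluster environment, each cluster keeps its own trust domain and the SPIRE Servers exchange trust bundles so that workloads can verify identities issued by the other side:
```shell
# On cluster A: export the trust bundle
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server bundle show -format spiffe > cluster-a-bundle.json

# On cluster B: trust cluster A
# (the bundle file is local, so it is passed to the pod over stdin)
kubectl exec -i -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server bundle set \
  -format spiffe \
  -id spiffe://cluster-a.example.org < cluster-a-bundle.json

# On cluster B: create the federation relationship so the bundle stays current
# (the https_spiffe profile also needs the SPIFFE ID served by the remote endpoint)
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server federation create \
  -bundleEndpointURL https://spire-server.cluster-a.example.org:8443 \
  -bundleEndpointProfile https_spiffe \
  -endpointSpiffeID spiffe://cluster-a.example.org/spire/server \
  -trustDomain cluster-a.example.org \
  -trustDomainBundleFormat spiffe
```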
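The exported bundle is a JWKS document in which each key carries a `use` of `x509-svid` or `jwt-svid`, alongside the `spiffe_sequence` and `spiffe_refresh_hint` fields. Before importing a bundle into another cluster it can be worth a quick inspection; the sketch below does that with the standard library only. The function name `summarize_bundle` and the sample document are hypothetical, and the key material is a placeholder.

```python
import json
from collections import Counter


def summarize_bundle(document: str) -> dict:
    """Count the X.509 and JWT authorities in a SPIFFE-format trust bundle."""
    bundle = json.loads(document)
    uses = Counter(key.get("use", "unknown") for key in bundle.get("keys", []))
    return {
        "x509_authorities": uses.get("x509-svid", 0),
        "jwt_authorities": uses.get("jwt-svid", 0),
        "sequence": bundle.get("spiffe_sequence"),
        "refresh_hint": bundle.get("spiffe_refresh_hint"),
    }


if __name__ == "__main__":
    # Placeholder bundle; real entries carry full key parameters and certificates.
    sample = json.dumps({
        "keys": [
            {"use": "x509-svid", "kty": "EC", "x5c": ["<base64 DER certificate>"]},
            {"use": "jwt-svid", "kty": "EC", "kid": "key-id-12345"},
        ],
        "spiffe_sequence": 3,
        "spiffe_refresh_hint": 300,
    })
    print(summarize_bundle(sample))
```

A bundle with zero X.509 authorities would make every mTLS handshake against that trust domain fail, so this kind of check catches an empty or truncated export early.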
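Best Practices and Security Considerations
Credential Lifecycle Management
1. Set appropriate TTLs
```hcl
# SPIRE Server configuration
server {
  # CA certificate lifetime (24 to 72 hours suggested)
  ca_ttl = "24h"

  # Default X.509-SVID lifetime (1 hour or less suggested)
  default_x509_svid_ttl = "1h"

  # Default JWT-SVID lifetime
  default_jwt_svid_ttl = "5m"

  # Value of the "iss" claim in issued JWT-SVIDs (this is not a lifetime)
  jwt_issuer = "spire-server"
}
```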
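Short TTLs only work if credentials are renewed well before they expire. As a rough illustration of the arithmetic, the sketch below computes a renewal time from a certificate's validity window. The half-lifetime threshold is an assumption chosen for the example, not a statement of SPIRE's exact policy, and `rotation_time` is a hypothetical helper.

```python
from datetime import datetime, timezone


def rotation_time(not_before: datetime, not_after: datetime,
                  fraction: float = 0.5) -> datetime:
    """Return the instant at which `fraction` of the lifetime has elapsed."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    if not_after <= not_before:
        raise ValueError("not_after must be later than not_before")
    return not_before + (not_after - not_before) * fraction


if __name__ == "__main__":
    # Validity window from the X.509-SVID example earlier in this article.
    start = datetime(2025, 6, 28, 0, 0, tzinfo=timezone.utc)
    end = datetime(2025, 6, 28, 1, 0, tzinfo=timezone.utc)
    print(rotation_time(start, end).isoformat())
```

With a one-hour SVID this leaves a thirty-minute margin for retries if the server is briefly unreachable; shortening the TTL shrinks that margin proportionally.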
2. Credential rotation strategy
```go
// Watch for credential updates inside the application
package main

import (
	"context"
	"log"

	"github.com/spiffe/go-spiffe/v2/workloadapi"
)

func main() {
	ctx := context.Background()

	// Create a long-lived client; with no address option the socket
	// is taken from the SPIFFE_ENDPOINT_SOCKET environment variable
	client, err := workloadapi.New(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Watch for X.509 context updates; this call blocks until ctx is cancelled
	err = client.WatchX509Context(ctx, &x509Watcher{})
	if err != nil {
		log.Fatal(err)
	}
}

type x509Watcher struct{}

// OnX509ContextUpdate is called whenever a new SVID or bundle is delivered.
func (w *x509Watcher) OnX509ContextUpdate(x509Context *workloadapi.X509Context) {
	svid := x509Context.DefaultSVID()
	log.Printf("received new SVID: %s", svid.ID)
	log.Printf("new certificate valid until: %s", svid.Certificates[0].NotAfter)
	// Update the application's TLS configuration
	// updateTLSConfig(svid)
}

// OnX509ContextWatchError is called when the watch stream reports an error.
func (w *x509Watcher) OnX509ContextWatchError(err error) {
	log.Printf("watch error: %v", err)
}
```
Security Hardening Recommendations
1. Node attestation best practices
```hcl
# Use Projected Service Account Tokens (PSAT)
plugins {
  NodeAttestor "k8s_psat" {
    plugin_data {
      clusters = {
        "production" = {
          # Explicitly allowed service accounts
          service_account_allow_list = ["spire:spire-agent"]
          # Require a specific token audience
          audience = ["spire-server"]
        }
      }
    }
  }
}
```
2. Stricter workload selectors
```shell
# Combine several selectors to narrow which workloads match.
# The parent ID must be an attested agent's SPIFFE ID or a node alias;
# adjust the value below to your deployment.
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://example.org/critical-service \
  -parentID spiffe://example.org/spire/agent/k8s_psat/production \
  -selector k8s:ns:production \
  -selector k8s:sa:critical-service \
  -selector k8s:pod-label:security-level:high \
  -selector k8s:container-image:registry.example.org/critical-service@sha256:abc123
```
3. Trust domain isolation
```text
Production trust domain: prod.example.org
│
├── spiffe://prod.example.org/payment-service
├── spiffe://prod.example.org/user-service
└── spiffe://prod.example.org/inventory-service

Staging trust domain: staging.example.org
│
├── spiffe://staging.example.org/payment-service
├── spiffe://staging.example.org/user-service
└── spiffe://staging.example.org/inventory-service
```
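Isolation only helps if services actually reject identities from the wrong trust domain. A minimal sketch of such a check follows; `trust_domain_of` and `is_allowed` are hypothetical helpers, and a real service would apply the check to the SPIFFE ID taken from an already verified SVID.

```python
from urllib.parse import urlsplit


def trust_domain_of(spiffe_id: str) -> str:
    """Extract the trust domain from a SPIFFE ID, or raise ValueError."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return parts.netloc


def is_allowed(peer_id: str, allowed_trust_domains: set) -> bool:
    """Accept a peer only if its trust domain is explicitly allowed."""
    try:
        return trust_domain_of(peer_id) in allowed_trust_domains
    except ValueError:
        return False


if __name__ == "__main__":
    allowed = {"prod.example.org"}
    print(is_allowed("spiffe://prod.example.org/payment-service", allowed))
    print(is_allowed("spiffe://staging.example.org/payment-service", allowed))
```

Matching on the exact trust domain, rather than on a suffix such as example.org, keeps a staging identity from being accepted in production by accident.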
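Monitoring and Observability
1. SPIRE metrics
```yaml
# Prometheus ServiceMonitor
# (requires Prometheus telemetry enabled in the SPIRE Server configuration and
# a port named "metrics" on the spire-server Service; neither is shown above)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: spire-server
  namespace: spire
spec:
  selector:
    matchLabels:
      app: spire-server
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
```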
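2. Key metrics
```text
# Exact metric names depend on the SPIRE version and telemetry sink;
# confirm them against the metrics endpoint of your deployment.

# SPIRE Server metrics
spire_server_ca_manager_x509_ca_rotate_total           # CA rotations
spire_server_datastore_bundle_count                    # Trust bundles
spire_server_svid_issued_total                         # SVIDs issued
spire_server_node_attestation_duration_seconds         # Node attestation latency

# SPIRE Agent metrics
spire_agent_svid_rotations_total                       # SVID rotations
spire_agent_workload_attestation_duration_seconds      # Workload attestation latency
spire_agent_delegated_identity_api_connection_count    # Delegated Identity API connections
```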
3. Logging
```hcl
# Enable structured logging
server {
  log_level = "INFO"
  log_format = "json"
}
```
```shell
# Collect logs in Kubernetes and keep only errors and warnings
kubectl logs -n spire -l app=spire-server -f | \
  jq 'select(.level == "error" or .level == "warning")'
```
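Disaster Recovery
1. Back up SPIRE Server data
```shell
# Back up the DataStore (PostgreSQL in this example)
pg_dump -h postgres -U spire -d spire > spire_backup_$(date +%Y%m%d).sql

# Back up the CA keys.
# Both commands need tar inside the container; if the SPIRE image in use ships
# without it, back up the underlying persistent volume instead.
kubectl exec -n spire spire-server-0 -- \
  tar czf /tmp/keys_backup.tar.gz /run/spire/data/keys.json
kubectl cp spire/spire-server-0:/tmp/keys_backup.tar.gz ./keys_backup.tar.gz
```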
2. High-availability deployment
```yaml
# SPIRE Server high-availability configuration
# (all replicas must share the same SQL datastore)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
spec:
  replicas: 3  # Run several replicas
  selector:
    matchLabels:
      app: spire-server
  serviceName: spire-server
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: spire-server
            topologyKey: "kubernetes.io/hostname"
      # ... other settings
```
Troubleshooting
1. A workload cannot obtain an SVID
```shell
# Check the SPIRE Agent (kubectl exec needs a pod name, not a label selector)
AGENT_POD=$(kubectl get pods -n spire -l app=spire-agent \
  -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n spire "$AGENT_POD" -- \
  /opt/spire/bin/spire-agent healthcheck \
  -socketPath /run/spire/sockets/agent.sock

# Check the workload registration
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry show \
  -selector k8s:ns:YOUR_NAMESPACE

# Check the agent logs
kubectl logs -n spire -l app=spire-agent | grep -i error
```
2. Certificate verification fails
```shell
# Inspect the trust bundle
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server bundle show

# Test the mTLS connection
openssl s_client -connect service:443 \
  -cert svid.pem -key key.pem -CAfile bundle.pem
```
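Summary
SPIFFE and SPIRE give modern cloud-native architectures a robust service identity solution. This article covered:
- The SPIFFE standard: a uniform framework for identifying services
- SVID formats: when to use X.509-SVID and JWT-SVID
- SPIRE architecture: the roles of the Server and Agent and how they interact
- Deployment: running SPIRE on Kubernetes, with raw manifests or Helm
- Workload attestation: automated identity verification
- Service mesh integration: working with Envoy and Istio
- Security best practices: keeping a production deployment safe
Adopting SPIFFE/SPIRE can substantially improve the security of a microservice architecture and provides the identity layer that a zero-trust network depends on. Start with a small pilot, then expand step by step to service identity management across the whole organization.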