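Automated Certificate Deployment Pipeline Implementation

In modern cloud architectures, managing and deploying SSL/TLS certificates is a task operations teams cannot afford to neglect. This article walks through building an automated certificate deployment pipeline that covers certificate issuance, renewal, and multi-server deployment, so the whole certificate lifecycle runs without manual steps.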

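1. Why Certificate Automation Is Necessary

Challenges of Traditional Certificate Management

With manual certificate management, operations staff routinely run into the following problems:

  1. Manual steps are error-prone: requesting, downloading, and installing certificates by hand is tedious, and a single slip can cause a service outage
  2. Expiry risk: certificates are typically valid for 90 days to about one year, and a certificate that is not renewed in time makes browsers show security warnings
  3. Hard to scale: as the number of servers grows, managing certificates host by host becomes impractical
  4. Audit and compliance requirements: many industries impose strict compliance rules on certificate management and require a complete change record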

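Benefits of Automation

An automated certificate deployment pipeline provides:

  • Zero-downtime renewal: certificates are renewed and deployed automatically before they expire
  • Consistency: every environment uses the same certificate configuration
  • Complete audit trail: every operation is version-controlled and logged
  • Lower operating cost: less manual intervention, so the team can focus on higher-value work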

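2. Pipeline Architecture Design

Architecture Overview

A complete certificate deployment pipeline consists of the following core components: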

┌─────────────────────────────────────────────────────────────────┐
│                    Certificate Pipeline                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                   │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐  │
│  │ Certificate│    │  Secret   │    │  Deploy   │    │ Monitor  │  │
│  │  Request  │───▶│  Store   │───▶│  Agent   │───▶│ & Alert  │  │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘  │
│       │                │                │                │       │
│       ▼                ▼                ▼                ▼       │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐  │
│  │  ACME    │    │ Vault /  │    │  Nginx   │    │Prometheus│  │
│  │  Server  │    │   KMS    │    │  Servers │    │  Grafana │  │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘  │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘

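Key Design Principles

  1. Immutable infrastructure: certificates are treated as part of the configuration and deployed through the CI/CD process
  2. Least privilege: each component holds only the permissions it needs to do its job
  3. Encryption in transit and at rest: certificates and private keys must be encrypted both when transferred and when stored
  4. Retry on failure: transient network errors should not fail the whole pipeline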
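
The retry principle can be sketched as a small helper that re-runs an operation with exponential backoff. This is only an illustration: the name retry_with_backoff, its defaults, and the choice of which exceptions count as transient are assumptions, not part of any tool used in this article.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with jitter so that
            # many hosts do not retry at exactly the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

Only the exception types listed in retry_on are retried; anything else (for example a rejected API token) fails immediately, which is usually what you want for errors that a retry cannot fix.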

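3. Using Certbot and the ACME Protocol

About the ACME Protocol

ACME (Automatic Certificate Management Environment) is a protocol for automating certificate issuance and management, standardized as RFC 8555. Let's Encrypt is the best-known ACME certificate authority (CA) and issues SSL/TLS certificates free of charge.

Installing and Configuring Certbot

Install Certbot on Ubuntu/Debian: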

# Install Certbot
sudo apt update
sudo apt install -y certbot

# If you use Nginx, install the matching plugin
sudo apt install -y python3-certbot-nginx

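Automating the DNS-01 Challenge

The DNS-01 challenge fits hosts that the CA cannot reach over HTTP, and it is the only challenge type Let's Encrypt accepts for wildcard certificates. The following example uses Cloudflare DNS: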

# Install the Cloudflare DNS plugin
sudo apt install -y python3-certbot-dns-cloudflare

# Create the Cloudflare API credentials file.
# `sudo cat > file` would fail because the redirection runs as the calling user,
# so the file is written through `sudo tee` instead.
sudo mkdir -p /etc/letsencrypt/credentials
sudo tee /etc/letsencrypt/credentials/cloudflare.ini > /dev/null << EOF
dns_cloudflare_api_token = YOUR_CLOUDFLARE_API_TOKEN
EOF
sudo chmod 600 /etc/letsencrypt/credentials/cloudflare.ini

# Request the certificate
sudo certbot certonly \
  --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
  --dns-cloudflare-propagation-seconds 60 \
  -d "*.example.com" \
  -d "example.com" \
  --non-interactive \
  --agree-tos \
  --email admin@example.com
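
To make clear what the DNS plugin publishes on your behalf, the sketch below computes the TXT record for a DNS-01 challenge as described in RFC 8555, section 8.4: the record lives at _acme-challenge.<domain> and its value is the unpadded base64url encoding of the SHA-256 digest of the key authorization (token, a dot, then the account key thumbprint). The function names and the sample token and thumbprint are placeholders for illustration; Certbot performs this step internally.

```python
import base64
import hashlib
from typing import Tuple


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(domain: str, token: str, account_thumbprint: str) -> Tuple[str, str]:
    """Return (record name, TXT value) for a DNS-01 challenge."""
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # A wildcard name is validated on its base domain:
    # *.example.com is proven through _acme-challenge.example.com
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}", b64url(digest)


if __name__ == "__main__":
    # Placeholder values; real ones come from the ACME server and account key.
    name, value = dns01_record("*.example.com", "sample-token", "sample-thumbprint")
    print(name, "TXT", value)
```

Because the wildcard and the bare domain share the same record name, a request for both names (as in the certbot command above) results in two TXT values under one name, which is why the propagation wait matters.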

Automatic Renewal Script

Create a renewal script at /opt/scripts/renew-certificates.sh:

#!/bin/bash
set -euo pipefail

LOG_FILE="/var/log/certbot/renewal.log"
SLACK_WEBHOOK_URL="${SLACK_WEBHOOK_URL:-}"

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

send_notification() {
    local status="$1"
    local message="$2"

    if [[ -n "$SLACK_WEBHOOK_URL" ]]; then
        curl -s -X POST "$SLACK_WEBHOOK_URL" \
            -H 'Content-Type: application/json' \
            -d "{\"text\": \"[$status] Certificate Renewal: $message\"}"
    fi
}

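main() {
    # Create the log directory first; log() would otherwise fail on a fresh host
    mkdir -p "$(dirname "$LOG_FILE")"
    log "Starting certificate renewal process..."

    # Run the renewal; the deploy hook runs only for certificates that were actually renewed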
    if certbot renew --quiet --deploy-hook "/opt/scripts/post-renewal.sh"; then
        log "Certificate renewal completed successfully"
        send_notification "SUCCESS" "Certificates renewed successfully on $(hostname)"
    else
        log "Certificate renewal failed"
        send_notification "FAILURE" "Certificate renewal failed on $(hostname)"
        exit 1
    fi
}

main "$@"
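
The send_notification function in the script builds its JSON body by string interpolation, which breaks as soon as the message contains a double quote or a newline (for example, captured error output). A minimal Python sketch of the same notification, letting json.dumps handle the escaping, could look like this. The function names are illustrative, and the webhook URL is whatever you would otherwise put in SLACK_WEBHOOK_URL.

```python
import json
import urllib.request


def build_slack_payload(status: str, message: str) -> bytes:
    """Serialize the notification; json.dumps escapes quotes and newlines."""
    text = f"[{status}] Certificate Renewal: {message}"
    return json.dumps({"text": text}).encode("utf-8")


def send_notification(webhook_url: str, status: str, message: str,
                      timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_slack_payload(status, message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```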

4. GitLab CI/CD Integration

GitLab CI/CD Pipeline Configuration

The following is a complete .gitlab-ci.yml example:

stages:
  - validate
  - request
  - store
  - deploy
  - verify

variables:
  DOMAIN: "example.com"
  CERT_PATH: "/etc/letsencrypt/live/${DOMAIN}"

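# Validate stage: check the state of the certificate currently being served
validate:
  stage: validate
  image: alpine:latest
  script:
    - apk add --no-cache openssl
    - |
      # `openssl x509 -checkend N` exits 0 when the certificate is still valid
      # N seconds from now. This avoids parsing the date with BusyBox `date`,
      # which does not understand OpenSSL's date format.
      if echo | openssl s_client -servername "${DOMAIN}" -connect "${DOMAIN}:443" 2>/dev/null \
          | openssl x509 -noout -checkend $((30 * 86400)); then
        echo "Certificate is valid for more than 30 days, no renewal needed"
        echo "NEEDS_RENEWAL=false" > renewal.env
      else
        echo "Certificate expires within 30 days (or could not be read), renewal needed"
        echo "NEEDS_RENEWAL=true" > renewal.env
      fi
  # `exit 0` alone would not stop later stages, so the decision is passed on
  # as a dotenv variable that later jobs can read.
  artifacts:
    reports:
      dotenv: renewal.env
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"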

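# Request stage: obtain a new certificate with Certbot
request:
  stage: request
  image:
    name: certbot/dns-cloudflare:latest
    entrypoint: [""]  # the image's entrypoint is certbot itself; clear it so GitLab can run a shell
  script:
    - |
      # Skip when an earlier job set NEEDS_RENEWAL to something other than "true"
      if [ "${NEEDS_RENEWAL:-true}" != "true" ]; then
        echo "Renewal not needed, skipping"
        exit 0
      fi
    - mkdir -p /etc/letsencrypt/credentials
    - echo "dns_cloudflare_api_token = ${CLOUDFLARE_API_TOKEN}" > /etc/letsencrypt/credentials/cloudflare.ini
    - chmod 600 /etc/letsencrypt/credentials/cloudflare.ini
    - |
      certbot certonly \
        --dns-cloudflare \
        --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
        --dns-cloudflare-propagation-seconds 60 \
        -d "*.${DOMAIN}" \
        -d "${DOMAIN}" \
        --non-interactive \
        --agree-tos \
        --email "${CERT_EMAIL}"
    - |
      # Bundle the certificate files. -h follows the symlinks in live/ so the
      # archive holds the real PEM files rather than dangling links.
      tar -czhf certificates.tar.gz \
        /etc/letsencrypt/live/${DOMAIN}/fullchain.pem \
        /etc/letsencrypt/live/${DOMAIN}/privkey.pem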
  artifacts:
    paths:
      - certificates.tar.gz
    expire_in: 1 hour
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      when: on_success

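# Store stage: write the certificate to Vault
store:
  stage: store
  image: hashicorp/vault:latest
  script:
    - |
      # Skip when an earlier job set NEEDS_RENEWAL to something other than "true"
      if [ "${NEEDS_RENEWAL:-true}" != "true" ]; then
        echo "Renewal not needed, skipping"
        exit 0
      fi
    - tar -xzvf certificates.tar.gz
    - |
      export VAULT_ADDR="${VAULT_ADDR}"
      export VAULT_TOKEN="${VAULT_TOKEN}"

      # tar strips the leading "/" when archiving, so the files are extracted
      # relative to the working directory, not to /etc
      CERT_DIR="etc/letsencrypt/live/${DOMAIN}"

      # Read the certificate contents
      FULLCHAIN=$(base64 -w 0 < "${CERT_DIR}/fullchain.pem")
      PRIVKEY=$(base64 -w 0 < "${CERT_DIR}/privkey.pem")

      # Write to Vault
      vault kv put secret/certificates/${DOMAIN} \
        fullchain="${FULLCHAIN}" \
        privkey="${PRIVKEY}" \
        updated_at="$(date -Iseconds)"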
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      when: on_success

# Deploy stage: push the certificate to the target servers
deploy:
  stage: deploy
  image: alpine:latest
  before_script:
    - apk add --no-cache openssh-client curl jq
    - eval $(ssh-agent -s)
    - echo "${SSH_PRIVATE_KEY}" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - echo "${SSH_KNOWN_HOSTS}" > ~/.ssh/known_hosts
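  script:
    - |
      # Skip when an earlier job set NEEDS_RENEWAL to something other than "true"
      if [ "${NEEDS_RENEWAL:-true}" != "true" ]; then
        echo "Renewal not needed, skipping"
        exit 0
      fi

      # Fetch the certificate from Vault
      export VAULT_ADDR="${VAULT_ADDR}"
      CERT_DATA=$(curl -s -H "X-Vault-Token: ${VAULT_TOKEN}" \
        "${VAULT_ADDR}/v1/secret/data/certificates/${DOMAIN}" | jq -r '.data.data')

      FULLCHAIN=$(echo "${CERT_DATA}" | jq -r '.fullchain' | base64 -d)
      PRIVKEY=$(echo "${CERT_DATA}" | jq -r '.privkey' | base64 -d)

      # Deploy to every target server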
      for SERVER in ${DEPLOY_SERVERS}; do
        echo "Deploying to ${SERVER}..."

        ssh deploy@${SERVER} "sudo mkdir -p /etc/nginx/ssl/${DOMAIN}"
        echo "${FULLCHAIN}" | ssh deploy@${SERVER} "sudo tee /etc/nginx/ssl/${DOMAIN}/fullchain.pem > /dev/null"
        echo "${PRIVKEY}" | ssh deploy@${SERVER} "sudo tee /etc/nginx/ssl/${DOMAIN}/privkey.pem > /dev/null"
        ssh deploy@${SERVER} "sudo chmod 600 /etc/nginx/ssl/${DOMAIN}/*.pem"
        ssh deploy@${SERVER} "sudo nginx -t && sudo systemctl reload nginx"

        echo "Successfully deployed to ${SERVER}"
      done      
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      when: on_success

# Verify stage: confirm the certificate was deployed correctly
verify:
  stage: verify
  image: alpine:latest
  script:
    - apk add --no-cache openssl curl
    - |
      for SERVER in ${DEPLOY_SERVERS}; do
        echo "Verifying certificate on ${SERVER}..."

        CERT_INFO=$(echo | openssl s_client -servername ${DOMAIN} -connect ${SERVER}:443 2>/dev/null | openssl x509 -noout -dates -subject)

        echo "Certificate info for ${SERVER}:"
        echo "${CERT_INFO}"

        # Check the certificate subject
        if echo "${CERT_INFO}" | grep -q "${DOMAIN}"; then
          echo "✓ Certificate verified for ${SERVER}"
        else
          echo "✗ Certificate verification failed for ${SERVER}"
          exit 1
        fi
      done      
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      when: on_success
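
The verify job above only checks that the served certificate's subject contains the domain, which an old certificate would also satisfy. One stricter check, sketched below under the assumption that the freshly issued fullchain.pem is available to the verifying process, is to compare SHA-256 fingerprints of the issued leaf certificate and the one each server actually presents. The function names are illustrative.

```python
import hashlib
import socket
import ssl


def pem_fingerprint(pem_chain: str) -> str:
    """SHA-256 fingerprint of the first (leaf) certificate in a PEM chain."""
    end_marker = "-----END CERTIFICATE-----"
    end = pem_chain.index(end_marker) + len(end_marker)
    der = ssl.PEM_cert_to_DER_cert(pem_chain[:end].strip())
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host: str, server_name: str, port: int = 443) -> str:
    """Fingerprint of the leaf certificate that host presents for server_name."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # Connect to the individual server but send the public name via SNI,
        # mirroring `openssl s_client -servername DOMAIN -connect SERVER:443`.
        with context.wrap_socket(sock, server_hostname=server_name) as tls:
            der = tls.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()


def is_deployed(host: str, server_name: str, pem_chain: str) -> bool:
    """True when host serves exactly the certificate that was just issued."""
    return served_fingerprint(host, server_name) == pem_fingerprint(pem_chain)
```

A mismatch after a successful deploy job usually means the reload did not happen or the server block points at a different file.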

GitHub Actions Integration

If you use GitHub Actions, the equivalent workflow looks like this:

name: Certificate Renewal Pipeline

on:
  schedule:
    - cron: '0 2 * * *'  # Runs daily at 02:00 UTC
  workflow_dispatch:

env:
  DOMAIN: example.com

jobs:
  check-certificate:
    runs-on: ubuntu-latest
    outputs:
      needs_renewal: ${{ steps.check.outputs.needs_renewal }}
    steps:
      - name: Check certificate expiry
        id: check
        run: |
          EXPIRY=$(echo | openssl s_client -servername ${{ env.DOMAIN }} -connect ${{ env.DOMAIN }}:443 2>/dev/null | openssl x509 -noout -enddate | cut -d= -f2)
          EXPIRY_EPOCH=$(date -d "${EXPIRY}" +%s)
          NOW_EPOCH=$(date +%s)
          DAYS_LEFT=$(( (EXPIRY_EPOCH - NOW_EPOCH) / 86400 ))

          echo "Certificate expires in ${DAYS_LEFT} days"

          if [ ${DAYS_LEFT} -le 30 ]; then
            echo "needs_renewal=true" >> $GITHUB_OUTPUT
          else
            echo "needs_renewal=false" >> $GITHUB_OUTPUT
          fi          

  request-certificate:
    needs: check-certificate
    if: needs.check-certificate.outputs.needs_renewal == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Install Certbot
        run: |
          sudo apt-get update
          sudo apt-get install -y certbot python3-certbot-dns-cloudflare          

      - name: Configure Cloudflare credentials
        run: |
          sudo mkdir -p /etc/letsencrypt/credentials
          echo "dns_cloudflare_api_token = ${{ secrets.CLOUDFLARE_API_TOKEN }}" | sudo tee /etc/letsencrypt/credentials/cloudflare.ini
          sudo chmod 600 /etc/letsencrypt/credentials/cloudflare.ini          

      - name: Request certificate
        run: |
          sudo certbot certonly \
            --dns-cloudflare \
            --dns-cloudflare-credentials /etc/letsencrypt/credentials/cloudflare.ini \
            --dns-cloudflare-propagation-seconds 60 \
            -d "*.${{ env.DOMAIN }}" \
            -d "${{ env.DOMAIN }}" \
            --non-interactive \
            --agree-tos \
            --email ${{ secrets.CERT_EMAIL }}          

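      - name: Stage certificates for upload
        run: |
          # /etc/letsencrypt/live is readable by root only and holds symlinks,
          # so copy the resolved files into the workspace first
          mkdir -p certs
          sudo cp -L \
            /etc/letsencrypt/live/${{ env.DOMAIN }}/fullchain.pem \
            /etc/letsencrypt/live/${{ env.DOMAIN }}/privkey.pem \
            certs/
          sudo chown -R "$(id -u):$(id -g)" certs

      # Note: this artifact contains the private key; anyone who can download
      # workflow artifacts can read it, so keep retention short.
      - name: Upload certificates
        uses: actions/upload-artifact@v4
        with:
          name: certificates
          path: certs/
          retention-days: 1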

  deploy-certificate:
    needs: request-certificate
    runs-on: ubuntu-latest
    strategy:
      matrix:
        server: ${{ fromJson(vars.DEPLOY_SERVERS) }}
    steps:
      - name: Download certificates
        uses: actions/download-artifact@v4
        with:
          name: certificates
          path: ./certs

      - name: Deploy to server
        uses: appleboy/ssh-action@master
        with:
          host: ${{ matrix.server }}
          username: deploy
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            sudo mkdir -p /etc/nginx/ssl/${{ env.DOMAIN }}            

      - name: Copy certificates
        uses: appleboy/scp-action@master
        with:
          host: ${{ matrix.server }}
          username: deploy
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          source: "./certs/*"
          target: "/tmp/certs/"
          strip_components: 1  # drop the leading "certs/" so files land directly in /tmp/certs/

      - name: Install certificates and reload Nginx
        uses: appleboy/ssh-action@master
        with:
          host: ${{ matrix.server }}
          username: deploy
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            sudo cp /tmp/certs/* /etc/nginx/ssl/${{ env.DOMAIN }}/
            sudo chmod 600 /etc/nginx/ssl/${{ env.DOMAIN }}/*.pem
            sudo nginx -t && sudo systemctl reload nginx
            rm -rf /tmp/certs            

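5. Certificate Storage and Key Management

HashiCorp Vault Integration

Vault is a widely used secrets management tool. The following is a complete integration example: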

# Enable the KV v2 secrets engine
vault secrets enable -path=certificates kv-v2

# Define access policies
vault policy write certificate-reader - <<EOF
path "certificates/data/*" {
  capabilities = ["read"]
}
EOF

vault policy write certificate-writer - <<EOF
path "certificates/data/*" {
  capabilities = ["create", "update", "read"]
}
EOF

# Create an AppRole for CI/CD
vault auth enable approle

vault write auth/approle/role/cert-pipeline \
  token_policies="certificate-writer" \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_ttl=24h
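
The policies above use certificates/data/* because the KV v2 engine inserts a data/ segment between the mount point and the secret path for reads and writes, and a metadata/ segment for listing. The helper below is a small sketch of that mapping; kv2_api_path is a hypothetical name, not a Vault or hvac function. One consequence worth noting: listing secrets needs the list capability on certificates/metadata/*, which neither policy above grants.

```python
def kv2_api_path(mount: str, secret_path: str, kind: str = "data") -> str:
    """Build the HTTP API path for a KV v2 secret.

    kind is "data" for reading or writing a secret version and
    "metadata" for listing keys or reading version metadata.
    """
    if kind not in ("data", "metadata"):
        raise ValueError("kind must be 'data' or 'metadata'")
    return f"/v1/{mount.strip('/')}/{kind}/{secret_path.strip('/')}"


if __name__ == "__main__":
    # Mount enabled above with -path=certificates
    print(kv2_api_path("certificates", "domains/example.com"))
    # Default "secret" mount, as used by the GitLab pipeline earlier
    print(kv2_api_path("secret", "certificates/example.com"))
    # Listing goes through metadata/, not data/
    print(kv2_api_path("certificates", "domains", kind="metadata"))
```

Note that the GitLab pipeline earlier writes to the secret mount while this section sets up a dedicated certificates mount; pick one layout and use it in both places.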

Python Example: Managing Certificates in Vault

#!/usr/bin/env python3
"""
Certificate Vault Manager
Manages certificate storage and retrieval in HashiCorp Vault
"""

import hvac
import base64
import os
from datetime import datetime
from typing import Optional, Dict, Any


class CertificateVaultManager:
    def __init__(self, vault_addr: str, vault_token: str):
        self.client = hvac.Client(url=vault_addr, token=vault_token)
        self.mount_point = "certificates"

    def store_certificate(
        self,
        domain: str,
        fullchain_path: str,
        privkey_path: str,
        metadata: Optional[Dict[str, Any]] = None
    ) -> bool:
        """Store a certificate in Vault."""
        try:
            with open(fullchain_path, 'r') as f:
                fullchain = base64.b64encode(f.read().encode()).decode()

            with open(privkey_path, 'r') as f:
                privkey = base64.b64encode(f.read().encode()).decode()

            secret_data = {
                "fullchain": fullchain,
                "privkey": privkey,
                "updated_at": datetime.utcnow().isoformat(),
                "domain": domain,
            }

            if metadata:
                secret_data["metadata"] = metadata

            self.client.secrets.kv.v2.create_or_update_secret(
                path=f"domains/{domain}",
                secret=secret_data,
                mount_point=self.mount_point
            )

            print(f"Successfully stored certificate for {domain}")
            return True

        except Exception as e:
            print(f"Failed to store certificate: {e}")
            return False

    def retrieve_certificate(self, domain: str) -> Optional[Dict[str, str]]:
        """Retrieve a certificate from Vault."""
        try:
            response = self.client.secrets.kv.v2.read_secret_version(
                path=f"domains/{domain}",
                mount_point=self.mount_point
            )

            data = response['data']['data']

            return {
                "fullchain": base64.b64decode(data['fullchain']).decode(),
                "privkey": base64.b64decode(data['privkey']).decode(),
                "updated_at": data['updated_at'],
            }

        except Exception as e:
            print(f"Failed to retrieve certificate: {e}")
            return None

    def list_certificates(self) -> list:
        """List all stored certificates."""
        try:
            response = self.client.secrets.kv.v2.list_secrets(
                path="domains",
                mount_point=self.mount_point
            )
            return response['data']['keys']
        except Exception as e:
            print(f"Failed to list certificates: {e}")
            return []


if __name__ == "__main__":
    manager = CertificateVaultManager(
        vault_addr=os.environ.get("VAULT_ADDR", "http://localhost:8200"),
        vault_token=os.environ.get("VAULT_TOKEN")
    )

    # Example: store a certificate
    manager.store_certificate(
        domain="example.com",
        fullchain_path="/etc/letsencrypt/live/example.com/fullchain.pem",
        privkey_path="/etc/letsencrypt/live/example.com/privkey.pem",
        metadata={"environment": "production", "issuer": "Let's Encrypt"}
    )

AWS Secrets Manager Integration

On AWS, certificates can be managed through Secrets Manager:

#!/usr/bin/env python3
"""
AWS Secrets Manager Certificate Handler
"""

import boto3
import json
import base64
from datetime import datetime


class AWSCertificateManager:
    def __init__(self, region: str = "ap-northeast-1"):
        self.client = boto3.client('secretsmanager', region_name=region)

    def store_certificate(self, domain: str, fullchain: str, privkey: str):
        """Store a certificate in AWS Secrets Manager."""
        secret_name = f"certificates/{domain.replace('.', '-')}"

        secret_value = json.dumps({
            "fullchain": base64.b64encode(fullchain.encode()).decode(),
            "privkey": base64.b64encode(privkey.encode()).decode(),
            "updated_at": datetime.utcnow().isoformat(),
        })

        try:
            self.client.create_secret(
                Name=secret_name,
                SecretString=secret_value,
                Tags=[
                    {"Key": "Domain", "Value": domain},
                    {"Key": "Type", "Value": "SSL_Certificate"},
                ]
            )
        except self.client.exceptions.ResourceExistsException:
            self.client.update_secret(
                SecretId=secret_name,
                SecretString=secret_value
            )

        print(f"Certificate stored: {secret_name}")

    def retrieve_certificate(self, domain: str) -> dict:
        """Retrieve a certificate from AWS Secrets Manager."""
        secret_name = f"certificates/{domain.replace('.', '-')}"

        response = self.client.get_secret_value(SecretId=secret_name)
        data = json.loads(response['SecretString'])

        return {
            "fullchain": base64.b64decode(data['fullchain']).decode(),
            "privkey": base64.b64decode(data['privkey']).decode(),
            "updated_at": data['updated_at'],
        }

6. Automated Deployment to Multiple Servers

Deploying with an Ansible Playbook

Create the deploy-certificates.yml playbook:

---
- name: Deploy SSL Certificates
  hosts: webservers
  become: yes
  vars:
    domain: "example.com"
    cert_dir: "/etc/nginx/ssl/{{ domain }}"
    vault_addr: "{{ lookup('env', 'VAULT_ADDR') }}"
    vault_token: "{{ lookup('env', 'VAULT_TOKEN') }}"

  tasks:
    - name: Ensure certificate directory exists
      file:
        path: "{{ cert_dir }}"
        state: directory
        mode: '0750'
        owner: root
        group: nginx

    - name: Retrieve certificate from Vault
      uri:
        url: "{{ vault_addr }}/v1/certificates/data/domains/{{ domain }}"
        headers:
          X-Vault-Token: "{{ vault_token }}"
        return_content: yes
      register: vault_response
      delegate_to: localhost
      run_once: true
      become: false  # the lookup runs on the control node and does not need root
      no_log: true   # keep the Vault token and private key out of task output

    - name: Set certificate facts
      set_fact:
        cert_fullchain: "{{ vault_response.json.data.data.fullchain | b64decode }}"
        cert_privkey: "{{ vault_response.json.data.data.privkey | b64decode }}"
      no_log: true

    - name: Deploy fullchain certificate
      copy:
        content: "{{ cert_fullchain }}"
        dest: "{{ cert_dir }}/fullchain.pem"
        mode: '0644'
        owner: root
        group: nginx
      notify: Reload Nginx

    - name: Deploy private key
      copy:
        content: "{{ cert_privkey }}"
        dest: "{{ cert_dir }}/privkey.pem"
        mode: '0600'
        owner: root
        group: nginx
      no_log: true  # never print the private key in logs or diffs
      notify: Reload Nginx

    - name: Validate Nginx configuration
      command: nginx -t
      changed_when: false

  handlers:
    - name: Reload Nginx
      service:
        name: nginx
        state: reloaded
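
For hand-written deployment scripts (such as the SSH-based jobs earlier), a detail worth getting right is that a running web server should never see a half-written certificate or key file. One common technique, sketched here with a hypothetical helper name, is to write to a temporary file in the same directory and then rename it over the destination, which is atomic on POSIX filesystems.

```python
import os
import tempfile


def install_file_atomically(content: str, destination: str, mode: int = 0o600) -> None:
    """Write content to destination so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(destination))
    # The temporary file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-cert-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes are on disk first
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, destination)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The default mode of 0o600 matches the private key permissions used in the playbook; pass mode=0o644 for fullchain.pem.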

Parallel Deployment with Fabric

#!/usr/bin/env python3
"""
Certificate Deployment Script using Fabric
Deploys certificates to multiple servers
"""

from fabric import Connection, ThreadingGroup
import os
import tempfile


class CertificateDeployer:
    def __init__(self, servers: list, domain: str, ssh_key_path: str):
        self.servers = servers
        self.domain = domain
        self.ssh_key_path = ssh_key_path
        self.cert_dir = f"/etc/nginx/ssl/{domain}"

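    def deploy(self, fullchain: str, privkey: str):
        """Deploy the certificate to all servers.

        Directory creation runs in parallel; file uploads run one server at a time.
        """

        # Write the certificate and key to temporary files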
        with tempfile.NamedTemporaryFile(mode='w', suffix='.pem', delete=False) as f:
            f.write(fullchain)
            fullchain_path = f.name

        with tempfile.NamedTemporaryFile(mode='w', suffix='.pem', delete=False) as f:
            f.write(privkey)
            privkey_path = f.name

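        try:
            # Build the connection group
            connect_kwargs = {"key_filename": self.ssh_key_path}
            group = ThreadingGroup(
                *self.servers,
                user="deploy",
                connect_kwargs=connect_kwargs
            )

            # Create the target directory on all servers in parallel
            group.run(f"sudo mkdir -p {self.cert_dir}", hide=True)

            # Deploy the certificate (file transfers are handled one server at a time)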
            for server in self.servers:
                conn = Connection(
                    server,
                    user="deploy",
                    connect_kwargs=connect_kwargs
                )

                print(f"Deploying to {server}...")

                # Upload the certificate and key
                conn.put(fullchain_path, "/tmp/fullchain.pem")
                conn.put(privkey_path, "/tmp/privkey.pem")

                # Move the files into place and set ownership and permissions
                conn.run(f"sudo mv /tmp/fullchain.pem {self.cert_dir}/fullchain.pem")
                conn.run(f"sudo mv /tmp/privkey.pem {self.cert_dir}/privkey.pem")
                conn.run(f"sudo chmod 644 {self.cert_dir}/fullchain.pem")
                conn.run(f"sudo chmod 600 {self.cert_dir}/privkey.pem")
                conn.run(f"sudo chown root:nginx {self.cert_dir}/*.pem")

                # Test the configuration, then reload Nginx
                conn.run("sudo nginx -t")
                conn.run("sudo systemctl reload nginx")

                print(f"Successfully deployed to {server}")
                conn.close()

            print("Deployment completed successfully!")

        finally:
            # Remove the temporary files
            os.unlink(fullchain_path)
            os.unlink(privkey_path)


if __name__ == "__main__":
    deployer = CertificateDeployer(
        servers=["web1.example.com", "web2.example.com", "web3.example.com"],
        domain="example.com",
        ssh_key_path="/path/to/ssh/key"
    )

    # Call this after fetching the certificate from Vault or another source
    # deployer.deploy(fullchain, privkey)
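
The script above uploads files to one server after another. If the per-server steps are wrapped in a single function, the whole deployment can run concurrently with the standard library alone. The sketch below assumes such a function exists (called deploy_one here, a hypothetical name) and collects failures per server instead of stopping at the first one.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def deploy_in_parallel(
    servers: Iterable[str],
    deploy_one: Callable[[str], None],
    max_workers: int = 5,
) -> Dict[str, Optional[BaseException]]:
    """Run deploy_one(server) for every server concurrently.

    Returns a mapping of server name to None on success,
    or to the exception that deploy_one raised.
    """
    results: Dict[str, Optional[BaseException]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(deploy_one, server): server for server in servers}
        for future in as_completed(futures):
            # future.exception() returns None when the call succeeded
            results[futures[future]] = future.exception()
    return results


if __name__ == "__main__":
    def fake_deploy(server: str) -> None:
        # Stand-in for the upload, move, and reload steps
        if server == "web2.example.com":
            raise RuntimeError("nginx -t failed")

    outcome = deploy_in_parallel(
        ["web1.example.com", "web2.example.com", "web3.example.com"], fake_deploy
    )
    for server, error in sorted(outcome.items()):
        print(server, "OK" if error is None else f"FAILED: {error}")
```

Keeping max_workers below the size of the fleet limits how many servers reload at once, which leaves part of the pool serving traffic if a bad certificate slips through.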

7. Monitoring and Alerting

Certificate Monitoring with Prometheus

Create certificate_exporter.py to export certificate metrics:

#!/usr/bin/env python3
"""
Certificate Exporter for Prometheus
Exports SSL certificate metrics for Prometheus to scrape
"""

import ssl
import socket
from datetime import datetime
from prometheus_client import start_http_server, Gauge, Info
import time
import yaml


# Metric definitions
CERT_EXPIRY_SECONDS = Gauge(
    'ssl_certificate_expiry_seconds',
    'SSL certificate expiry time in seconds',
    ['domain', 'issuer']
)

CERT_VALID = Gauge(
    'ssl_certificate_valid',
    'SSL certificate validity (1=valid, 0=invalid)',
    ['domain']
)

CERT_DAYS_REMAINING = Gauge(
    'ssl_certificate_days_remaining',
    'Days remaining until certificate expires',
    ['domain']
)

CERT_INFO = Info(
    'ssl_certificate',
    'SSL certificate information',
    ['domain']
)


def get_certificate_info(hostname: str, port: int = 443) -> dict:
    """Fetch SSL certificate details for a host."""
    context = ssl.create_default_context()

    try:
        with socket.create_connection((hostname, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=hostname) as ssock:
                cert = ssock.getpeercert()

                # Parse the validity dates
                not_after = datetime.strptime(
                    cert['notAfter'],
                    '%b %d %H:%M:%S %Y %Z'
                )

                not_before = datetime.strptime(
                    cert['notBefore'],
                    '%b %d %H:%M:%S %Y %Z'
                )

                # Extract the issuer
                issuer = dict(x[0] for x in cert['issuer'])
                issuer_name = issuer.get('organizationName', 'Unknown')

                # Extract the subject
                subject = dict(x[0] for x in cert['subject'])
                common_name = subject.get('commonName', hostname)

                # Compute the time remaining
                now = datetime.utcnow()
                days_remaining = (not_after - now).days
                expiry_seconds = (not_after - now).total_seconds()

                return {
                    'valid': True,
                    'common_name': common_name,
                    'issuer': issuer_name,
                    'not_before': not_before.isoformat(),
                    'not_after': not_after.isoformat(),
                    'days_remaining': days_remaining,
                    'expiry_seconds': expiry_seconds,
                    'serial_number': cert.get('serialNumber', ''),
                }

    except Exception as e:
        return {
            'valid': False,
            'error': str(e),
            'days_remaining': -1,
            'expiry_seconds': -1,
        }


def update_metrics(domains: list):
    """Refresh the metrics for every domain."""
    for domain in domains:
        info = get_certificate_info(domain)

        CERT_VALID.labels(domain=domain).set(1 if info['valid'] else 0)

        if info['valid']:
            CERT_EXPIRY_SECONDS.labels(
                domain=domain,
                issuer=info['issuer']
            ).set(info['expiry_seconds'])

            CERT_DAYS_REMAINING.labels(domain=domain).set(info['days_remaining'])

            CERT_INFO.labels(domain=domain).info({
                'common_name': info['common_name'],
                'issuer': info['issuer'],
                'not_before': info['not_before'],
                'not_after': info['not_after'],
                'serial_number': info['serial_number'],
            })


def main():
    # Load the configuration
    with open('/etc/cert-exporter/config.yaml', 'r') as f:
        config = yaml.safe_load(f)

    domains = config.get('domains', [])
    port = config.get('port', 9117)
    interval = config.get('interval', 300)

    # Start the HTTP server
    start_http_server(port)
    print(f"Certificate exporter started on port {port}")

    while True:
        update_metrics(domains)
        time.sleep(interval)


if __name__ == "__main__":
    main()

Prometheus Alert Rules

Create certificate_alerts.yml:

groups:
  - name: certificate_alerts
    rules:
      - alert: CertificateExpiringSoon
        expr: ssl_certificate_days_remaining < 30
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "SSL certificate expiring soon for {{ $labels.domain }}"
          description: "Certificate for {{ $labels.domain }} expires in {{ $value }} days"

      - alert: CertificateExpiryCritical
        expr: ssl_certificate_days_remaining < 7
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "SSL certificate expiring critically soon for {{ $labels.domain }}"
          description: "Certificate for {{ $labels.domain }} expires in {{ $value }} days. Immediate action required!"

      - alert: CertificateExpired
        expr: ssl_certificate_days_remaining <= 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "SSL certificate has expired for {{ $labels.domain }}"
          description: "Certificate for {{ $labels.domain }} has expired!"

      - alert: CertificateInvalid
        expr: ssl_certificate_valid == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "SSL certificate invalid for {{ $labels.domain }}"
          description: "Unable to validate certificate for {{ $labels.domain }}. Check if the certificate is properly installed."
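
The thresholds in these rules can be mirrored in a few lines of Python, which is handy for unit-testing the boundaries or for a pre-flight check in a script. This is only a sketch: the function names are illustrative, and unlike Prometheus, which fires every rule whose expression matches (a certificate with 3 days left triggers both the warning and the critical alert), classify returns only the most severe level.

```python
from datetime import datetime, timezone
from typing import Optional


def days_remaining(not_after: datetime, now: Optional[datetime] = None) -> int:
    """Whole days until not_after; negative once the certificate has expired."""
    now = now or datetime.now(timezone.utc)
    return (not_after - now).days


def classify(days: int) -> str:
    """Map days remaining to the most severe matching alert level."""
    if days <= 0:
        return "expired"   # CertificateExpired
    if days < 7:
        return "critical"  # CertificateExpiryCritical
    if days < 30:
        return "warning"   # CertificateExpiringSoon
    return "ok"


if __name__ == "__main__":
    expiry = datetime(2025, 3, 1, tzinfo=timezone.utc)
    today = datetime(2025, 2, 1, tzinfo=timezone.utc)
    left = days_remaining(expiry, now=today)
    print(left, classify(left))
```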

Grafana Dashboard

The following is a fragment of the Grafana dashboard JSON:

{
  "dashboard": {
    "title": "SSL Certificate Monitoring",
    "panels": [
      {
        "title": "Certificate Expiry Timeline",
        "type": "timeseries",
        "targets": [
          {
            "expr": "ssl_certificate_days_remaining",
            "legendFormat": "{{ domain }}"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "unit": "d",
            "thresholds": {
              "mode": "absolute",
              "steps": [
                { "color": "red", "value": 0 },
                { "color": "orange", "value": 14 },
                { "color": "yellow", "value": 30 },
                { "color": "green", "value": 60 }
              ]
            }
          }
        }
      },
      {
        "title": "Certificate Status",
        "type": "stat",
        "targets": [
          {
            "expr": "ssl_certificate_valid",
            "legendFormat": "{{ domain }}"
          }
        ],
        "fieldConfig": {
          "defaults": {
            "mappings": [
              { "type": "value", "options": { "0": { "text": "Invalid", "color": "red" } } },
              { "type": "value", "options": { "1": { "text": "Valid", "color": "green" } } }
            ]
          }
        }
      }
    ]
  }
}

8. Best Practices and Security Considerations

Security Best Practices

1. Protecting Private Keys

# Make sure private key file permissions are correct
chmod 600 /etc/nginx/ssl/*/privkey.pem
chown root:root /etc/nginx/ssl/*/privkey.pem

# 使用 SELinux 加強保護(如適用)
semanage fcontext -a -t cert_t '/etc/nginx/ssl(/.*)?'
restorecon -Rv /etc/nginx/ssl/
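
The permissions set above can also be verified as part of a scheduled check. The following is a minimal sketch, not part of the original pipeline: `audit_private_keys` is a hypothetical helper that reports every `privkey.pem` under the SSL directory whose mode is not 600 or whose owner is not the expected user and group (root by default).

```python
import stat
from pathlib import Path


def audit_private_keys(ssl_root: str = "/etc/nginx/ssl",
                       expected_uid: int = 0,
                       expected_gid: int = 0) -> list[str]:
    """Describe every privkey.pem that deviates from mode 600 or the expected owner."""
    problems = []
    for key in sorted(Path(ssl_root).glob("*/privkey.pem")):
        info = key.stat()
        mode = stat.S_IMODE(info.st_mode)  # permission bits only
        if mode != 0o600:
            problems.append(f"{key}: mode {mode:03o}, expected 600")
        if (info.st_uid, info.st_gid) != (expected_uid, expected_gid):
            problems.append(
                f"{key}: owner {info.st_uid}:{info.st_gid}, "
                f"expected {expected_uid}:{expected_gid}"
            )
    return problems
```

An empty list means every key matched; anything else could be sent to the alerting channel the pipeline already uses.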

2. Transport Security

# Deploy with an SSH key rather than a password
# GitLab CI/CD example
deploy:
  script:
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | ssh-add -
    # Reach the target through a bastion host with ProxyJump
    - ssh -o ProxyJump=bastion@jump.example.com deploy@target.example.com
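
In a CI job the SSH session has no terminal, so the command usually carries the remote step to run and must never wait for a prompt. A hedged sketch of assembling such a command in Python follows; `build_ssh_command` is a hypothetical helper, and the remote command in the example is only a placeholder assumption.

```python
import shlex


def build_ssh_command(target: str, jump_host: str, remote_command: str) -> list[str]:
    """Build a non-interactive ssh invocation that hops through a bastion host."""
    return [
        "ssh",
        "-o", "BatchMode=yes",           # fail instead of prompting for a password
        "-o", f"ProxyJump={jump_host}",  # same option as in the CI job above
        target,
        remote_command,
    ]


if __name__ == "__main__":
    cmd = build_ssh_command(
        "deploy@target.example.com",
        "bastion@jump.example.com",
        "sudo nginx -t && sudo systemctl reload nginx",  # hypothetical remote step
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` so that a failed hop or remote command fails the job.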

3. Secret Rotation

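#!/usr/bin/env python3
"""
Automatically rotate the secrets used by CI/CD.
"""

import hvac
from datetime import datetime, timezone


def rotate_approle_secret(vault_client: hvac.Client, role_name: str) -> str:
    """Rotate an AppRole Secret ID and return the new value."""

    # Generate a new Secret ID
    response = vault_client.auth.approle.generate_secret_id(
        role_name=role_name,
        metadata={"rotated_at": datetime.now(timezone.utc).isoformat()}
    )

    new_secret_id = response['data']['secret_id']

    # Update the CI/CD variable (GitLab as an example)
    # The GitLab API should be used here to update the variable
    # Note: the previous Secret ID stays valid until it expires or is destroyed

    return new_secret_id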
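
The comment in the script above leaves the GitLab update open. One possible way to fill it in is sketched below, assuming a project-level variable and GitLab's REST endpoint `PUT /projects/:id/variables/:key` authenticated with a `PRIVATE-TOKEN` header. The helper names, the project ID, and the variable name `VAULT_SECRET_ID` used in the usage note are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def build_variable_update(api_url: str, project_id: int, key: str,
                          value: str, token: str) -> urllib.request.Request:
    """Prepare a PUT request that overwrites one project-level CI/CD variable."""
    quoted_key = urllib.parse.quote(key, safe="")
    url = f"{api_url}/projects/{project_id}/variables/{quoted_key}"
    body = urllib.parse.urlencode({"value": value}).encode()
    return urllib.request.Request(
        url, data=body, method="PUT", headers={"PRIVATE-TOKEN": token}
    )


def update_variable(request: urllib.request.Request) -> int:
    """Send the prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A call such as `update_variable(build_variable_update("https://gitlab.example.com/api/v4", 42, "VAULT_SECRET_ID", new_secret_id, token))` would then push the value returned by `rotate_approle_secret`; the URL and project ID here are placeholders.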

Operational Best Practices

1. Certificate Inventory Management

#!/usr/bin/env python3
"""
Certificate Inventory Manager
Maintains a certificate inventory and tracks certificate status.
"""

import json
from datetime import datetime
from pathlib import Path


class CertificateInventory:
    def __init__(self, inventory_file: str = "/etc/cert-inventory.json"):
        self.inventory_file = Path(inventory_file)
        self.load()

    def load(self):
        if self.inventory_file.exists():
            with open(self.inventory_file, 'r') as f:
                self.inventory = json.load(f)
        else:
            self.inventory = {"certificates": {}, "last_updated": None}

    def save(self):
        self.inventory["last_updated"] = datetime.utcnow().isoformat()
        with open(self.inventory_file, 'w') as f:
            json.dump(self.inventory, f, indent=2)

    def add_certificate(self, domain: str, info: dict):
        self.inventory["certificates"][domain] = {
            **info,
            "added_at": datetime.utcnow().isoformat()
        }
        self.save()

    def get_expiring_soon(self, days: int = 30) -> list:
        """Return certificates that expire within `days` days, soonest first."""
        expiring = []
        now = datetime.utcnow()  # naive UTC, so expiry_date must be naive UTC too

        for domain, info in self.inventory["certificates"].items():
            expiry_date = info.get("expiry_date")
            if not expiry_date:
                # fromisoformat("") raises ValueError; skip entries without a date
                continue
            expiry = datetime.fromisoformat(expiry_date)
            days_remaining = (expiry - now).days
            if days_remaining <= days:
                # Computed keys come last so stored fields cannot override them
                expiring.append({
                    **info,
                    "domain": domain,
                    "days_remaining": days_remaining,
                })

        return sorted(expiring, key=lambda x: x["days_remaining"])
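
The day count in `get_expiring_soon` comes from `timedelta.days`, which rounds toward negative infinity. The short sketch below isolates that arithmetic with a fixed clock; `days_remaining` is a hypothetical standalone helper, and the timestamps are made-up sample values in the naive-UTC ISO format the inventory expects.

```python
from datetime import datetime


def days_remaining(expiry_date: str, now: datetime) -> int:
    """Whole days between now and an ISO 8601 expiry timestamp (naive UTC)."""
    return (datetime.fromisoformat(expiry_date) - now).days


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0, 0)
    print(days_remaining("2025-01-31T12:00:00", now))  # → 30
    print(days_remaining("2025-01-01T18:00:00", now))  # → 0 (expires in 6 hours)
    print(days_remaining("2025-01-01T06:00:00", now))  # → -1 (expired 6 hours ago)
```

A certificate on its last day therefore reports 0, and any negative value means it has already expired.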

2. Disaster Recovery Plan

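#!/bin/bash
# Certificate Backup Script
# Periodically back up certificates to a secure location

set -euo pipefail

BACKUP_DIR="/backup/certificates/$(date +%Y%m%d)"
# Fail early if the passphrase is not provided
ENCRYPTION_KEY="${BACKUP_ENCRYPTION_KEY:?BACKUP_ENCRYPTION_KEY is not set}"

mkdir -p "${BACKUP_DIR}"

# Back up every certificate directory
for cert_dir in /etc/nginx/ssl/*/; do
    domain=$(basename "${cert_dir}")

    # Encrypt the archive with GPG. GnuPG 2.x needs --batch and
    # --pinentry-mode loopback to accept a passphrase non-interactively;
    # passing it on file descriptor 3 keeps it out of the process list.
    tar -czf - "${cert_dir}" | \
        gpg --batch --yes --pinentry-mode loopback \
            --symmetric --cipher-algo AES256 \
            --passphrase-fd 3 \
            -o "${BACKUP_DIR}/${domain}.tar.gz.gpg" 3<<< "${ENCRYPTION_KEY}"
done

# Upload to remote storage
aws s3 sync "${BACKUP_DIR}" "s3://cert-backups/$(date +%Y%m%d)/" \
    --sse AES256

# Remove old backups (keep 90 days); limit the search to the dated
# subdirectories so the backup root itself is never matched
find /backup/certificates -mindepth 1 -maxdepth 1 -type d -mtime +90 \
    -exec rm -rf {} +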
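
Because each backup directory is named after its date, retention can also be decided from the name instead of the directory's modification time, which changes whenever entries in the directory are added or removed. A minimal sketch under that assumption is shown below; `expired_backups` is a hypothetical helper and only lists candidates, leaving deletion to the caller.

```python
from datetime import date, datetime, timedelta
from pathlib import Path


def expired_backups(root: Path, today: date, keep_days: int = 90) -> list[Path]:
    """List dated backup directories (YYYYMMDD) older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, "%Y%m%d").date()
        except ValueError:
            continue  # ignore names that do not parse as a date
        if stamp < cutoff:
            expired.append(entry)
    return expired
```

A caller could then run `shutil.rmtree(path)` for each entry returned by `expired_backups(Path("/backup/certificates"), date.today())`, ideally after logging the list for the audit trail.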

3. Multi-Environment Certificate Management

# environments.yaml
environments:
  production:
    domains:
      - "example.com"
      - "*.example.com"
      - "api.example.com"
    servers:
      - "prod-web1.example.com"
      - "prod-web2.example.com"
    vault_path: "certificates/production"

  staging:
    domains:
      - "staging.example.com"
      - "*.staging.example.com"
    servers:
      - "staging-web1.example.com"
    vault_path: "certificates/staging"

  development:
    domains:
      - "dev.example.com"
    servers:
      - "dev-web1.example.com"
    vault_path: "certificates/development"
    # The development environment uses self-signed certificates
    use_self_signed: true
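
In the production list above, `api.example.com` is already covered by `*.example.com`. Some ACME CAs reject a single order that contains both a wildcard and a name it covers, so if those domains are meant to share one certificate it may be worth checking the list. The sketch below is one way to do that; both function names are hypothetical, and the domain list is copied from the YAML rather than parsed from it.

```python
def covered_by_wildcard(domain: str, wildcard: str) -> bool:
    """True if `wildcard` (e.g. "*.example.com") matches `domain`."""
    if not wildcard.startswith("*."):
        return False
    label, _, parent = domain.partition(".")
    # A wildcard stands for exactly one label and never matches a wildcard name.
    return bool(label) and label != "*" and parent == wildcard[2:]


def redundant_domains(domains: list[str]) -> list[str]:
    """Domains in the list that a wildcard entry in the same list already covers."""
    wildcards = [d for d in domains if d.startswith("*.")]
    return [d for d in domains
            if any(covered_by_wildcard(d, w) for w in wildcards)]


if __name__ == "__main__":
    production = ["example.com", "*.example.com", "api.example.com"]
    print(redundant_domains(production))  # → ['api.example.com']
```

The bare `example.com` is not reported, because a wildcard does not cover the parent domain itself.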

Performance Optimization

1. OCSP Stapling

# Nginx configuration to enable OCSP Stapling
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/nginx/ssl/example.com/fullchain.pem;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;

2. Session Resumption

# Enable the SSL session cache
ssl_session_cache shared:SSL:50m;
ssl_session_timeout 1d;
ssl_session_tickets off;  # Turning tickets off is recommended for better security

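Conclusion

Building a complete automated certificate deployment pipeline means integrating several systems and tools, but the effort pays off in significant operational benefits. With the architecture and implementation described in this article, you can:

  1. Eliminate human error: automated processes keep every deployment consistent
  2. Reduce operational risk: early warnings and automatic renewal prevent certificates from expiring
  3. Improve team efficiency: operations staff are freed from repetitive work
  4. Meet compliance requirements: a complete audit trail and change history

We recommend starting small: automate the most critical services first, then expand step by step to the entire infrastructure. Remember that security always comes first, and any automation should be thoroughly tested and reviewed.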

