Ubuntu 22.04 Nginx OpenTelemetry Tracing Integration

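Introduction

In a modern microservice architecture, distributed tracing is a core part of observability. When a request crosses several services, tracing shows its full lifecycle, which helps you locate performance bottlenecks and diagnose failures quickly.

This article walks through configuring OpenTelemetry tracing for Nginx on Ubuntu 22.04 and integrating it with two common tracing backends, Jaeger and Zipkin.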


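1. OpenTelemetry and Distributed Tracing Concepts

What is OpenTelemetry?

OpenTelemetry (OTel for short) is an open-source observability framework. It provides a standardized set of APIs, SDKs, and tools for collecting and exporting telemetry data, including:

  • Traces: the full path a request takes through the system
  • Metrics: quantitative measurements of the system
  • Logs: event records

Core concepts

| Term | Description |
| --- | --- |
| Trace | The record of one complete request, made up of multiple spans |
| Span | A single unit of work within a trace, with a start time, duration, attributes, and so on |
| Context | The tracing information passed between services, including the trace ID and span ID |
| Propagator | The component that carries the context from one service to the next |
| Exporter | The component that sends trace data to a backend system |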
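
To make the terms above concrete, the following is a minimal Python sketch of how spans relate to a trace: every span in a trace shares one trace ID, and a child span records its parent's span ID. The `Span` class and the `start_trace` / `start_child` helpers are illustrative names invented for this sketch, not part of the OpenTelemetry SDK.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Illustrative span record (not the OpenTelemetry SDK class)."""
    name: str
    trace_id: str                      # 32 hex characters, shared by the whole trace
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))  # 16 hex characters
    parent_id: Optional[str] = None    # None for the root span


def start_trace(name: str) -> Span:
    """Create a root span with a fresh trace ID."""
    return Span(name=name, trace_id=secrets.token_hex(16))


def start_child(parent: Span, name: str) -> Span:
    """Create a child span that inherits the trace ID and points at its parent."""
    return Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)


root = start_trace("GET /api/users")               # e.g. the span Nginx would create
child = start_child(root, "fetch_users_from_db")   # e.g. a span created by the backend
```

The pair (trace ID, span ID) is the context that a propagator passes between services.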

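Why use OpenTelemetry in Nginx?

Nginx usually runs as a reverse proxy or load balancer, which makes it the first point of contact for requests entering the system. Enabling tracing at the Nginx layer lets you:

  1. Trace a request's full path from the entry point to the backend
  2. Measure the latency added by Nginx itself
  3. Correlate frontend and backend trace data
  4. Get a unified observability view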

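2. Installing the Nginx OpenTelemetry Module

Prerequisites

First, make sure the system is up to date and the required dependencies are installed (the build tools are only needed if you compile the module from source):

# Update system packages
sudo apt update && sudo apt upgrade -y

# Install build tools and dependencies
sudo apt install -y build-essential git cmake libpcre3-dev zlib1g-dev \
    libssl-dev libcurl4-openssl-dev libprotobuf-dev protobuf-compiler \
    libgrpc++-dev libgtest-dev libbenchmark-dev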

Method 1: Use the prebuilt module (recommended)

The official Nginx repository ships OpenTelemetry as a dynamic module package that can be installed directly:

# Add the official Nginx repository
sudo apt install -y curl gnupg2 ca-certificates lsb-release ubuntu-keyring

curl https://nginx.org/keys/nginx_signing.key | gpg --dearmor \
    | sudo tee /usr/share/keyrings/nginx-archive-keyring.gpg >/dev/null

echo "deb [signed-by=/usr/share/keyrings/nginx-archive-keyring.gpg] \
    http://nginx.org/packages/mainline/ubuntu $(lsb_release -cs) nginx" \
    | sudo tee /etc/apt/sources.list.d/nginx.list

# Prefer packages from nginx.org over the Ubuntu ones
echo -e "Package: *\nPin: origin nginx.org\nPin: release o=nginx\nPin-Priority: 900\n" \
    | sudo tee /etc/apt/preferences.d/99nginx

# Install Nginx and the OpenTelemetry module
sudo apt update
sudo apt install -y nginx nginx-module-otel

Method 2: Build from source

If you need a custom build or a specific version, you can compile the module from source. The directives used in this article (otel_exporter, otel_trace, otel_span_name, and so on) belong to Nginx's own ngx_otel_module, whose source lives in the nginxinc/nginx-otel repository. The Nginx instrumentation in opentelemetry-cpp-contrib is a different module with different directives, so it is not used here.

# Set up a working directory
WORK_DIR=/opt/nginx-otel
sudo mkdir -p "$WORK_DIR"
sudo chown "$USER" "$WORK_DIR"
cd "$WORK_DIR"

# Extra build dependencies for the gRPC build that the module pulls in
sudo apt install -y pkg-config libc-ares-dev libre2-dev

# Download the Nginx source (the version must match the installed Nginx)
NGINX_VERSION=$(nginx -v 2>&1 | grep -oP 'nginx/\K[\d.]+')
wget https://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz
tar -xzf nginx-${NGINX_VERSION}.tar.gz

# Configure the source tree so the module is binary-compatible with the packaged Nginx
cd nginx-${NGINX_VERSION}
./configure --with-compat
cd "$WORK_DIR"

# Download the Nginx OpenTelemetry module
git clone https://github.com/nginxinc/nginx-otel.git
cd nginx-otel
mkdir build && cd build

# Build the module. By default CMake fetches and builds gRPC and
# opentelemetry-cpp itself, so the first build can take a long time.
cmake -DNGX_OTEL_NGINX_BUILD_DIR="$WORK_DIR/nginx-${NGINX_VERSION}/objs" ..
make -j$(nproc)

# Install the module
sudo cp ngx_otel_module.so /etc/nginx/modules/

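Loading the module

Edit the main Nginx configuration file to load the OpenTelemetry module:

sudo nano /etc/nginx/nginx.conf

Add the following line at the very top of the file, before the events and http blocks:

load_module modules/ngx_otel_module.so;

Validate the configuration:

sudo nginx -t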

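3. Tracing Configuration

Basic configuration

Create an OpenTelemetry configuration file. Files in conf.d are included inside the http block, which is where these directives belong:

sudo nano /etc/nginx/conf.d/otel.conf

Add the basic configuration:

# OpenTelemetry module configuration
otel_exporter {
    endpoint localhost:4317;  # OTLP gRPC endpoint
}

otel_service_name "nginx-gateway";

# Trace every request (no sampling) and propagate the trace context
otel_trace on;
otel_trace_context propagate;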

Complete configuration example

The following is a complete Nginx configuration example. It sets the otel_* directives directly in nginx.conf. Because the http block also includes conf.d/*.conf, keep these settings in only one place (either here or in otel.conf); repeating otel_exporter is likely to be rejected as a duplicate directive.

# /etc/nginx/nginx.conf

load_module modules/ngx_otel_module.so;

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log notice;
pid /run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    # Log format (includes trace identifiers)
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    'trace_id=$otel_trace_id span_id=$otel_span_id';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    keepalive_timeout 65;

    # OpenTelemetry configuration
    otel_exporter {
        endpoint localhost:4317;
        interval 5s;
        batch_size 512;
        batch_count 4;
    }

    otel_service_name "nginx-gateway";
    otel_trace on;
    otel_trace_context propagate;

    # Resource attributes
    otel_resource_attr "service.version" "1.0.0";
    otel_resource_attr "deployment.environment" "production";
    otel_resource_attr "service.namespace" "my-application";

    include /etc/nginx/conf.d/*.conf;
}
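
Because the log format above ends every access log line with `trace_id=... span_id=...`, a log line can be matched back to a trace in the backend UI. The following Python sketch shows one way to pull the two IDs out of a line. The function name and the sample log line are made up for illustration, and the sketch assumes the exact field order shown in the log_format.

```python
import re
from typing import Optional, Tuple

# Matches the trailing "trace_id=<32 hex> span_id=<16 hex>" written by the log_format
_IDS_RE = re.compile(
    r"trace_id=(?P<trace_id>[0-9a-f]{32}) span_id=(?P<span_id>[0-9a-f]{16})\s*$"
)


def extract_trace_ids(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (trace_id, span_id), or None if the request was not traced."""
    match = _IDS_RE.search(log_line)
    if match is None:
        return None
    return match.group("trace_id"), match.group("span_id")


# Hypothetical access log line in the format defined above
sample_line = (
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /api/users HTTP/1.1" '
    '200 17 "-" "curl/8.5.0" "-" '
    'trace_id=0af7651916cd43dd8448eb211c80319c span_id=b7ad6b7169203331'
)

ids = extract_trace_ids(sample_line)
```

Requests with tracing turned off leave both variables empty, in which case the function returns None.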

Site configuration

# /etc/nginx/conf.d/default.conf

upstream backend {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;
}

server {
    listen 80;
    server_name example.com;

    # Enable tracing
    otel_trace on;
    otel_trace_context propagate;

    # Set the span name
    otel_span_name $request_uri;

    location / {
        # Custom span attributes
        otel_span_attr "http.route" "/";
        otel_span_attr "custom.client_ip" $remote_addr;

        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /api/ {
        otel_span_attr "http.route" "/api/*";
        otel_span_attr "api.version" "v1";

        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /health {
        # Do not trace the health check endpoint
        otel_trace off;
        return 200 "OK";
    }

    location /static/ {
        # Static assets are traced at 100% here; to sample them at a lower
        # rate, pass a split_clients variable to otel_trace (see section 8)
        otel_trace on;
        alias /var/www/static/;
    }
}

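4. Trace Context Propagation

The W3C Trace Context standard

OpenTelemetry uses the W3C Trace Context standard to propagate tracing information. The main HTTP headers are:

| Header | Description |
| --- | --- |
| traceparent | Carries the trace-id, parent-id, and trace-flags |
| tracestate | Carries vendor-specific tracing information |

traceparent format

traceparent: 00-{trace-id}-{parent-id}-{trace-flags}

Example:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
  • 00: version
  • 0af7651916cd43dd8448eb211c80319c: the trace ID, 32 hexadecimal characters
  • b7ad6b7169203331: the parent span ID, 16 hexadecimal characters
  • 01: trace flags (01 means the trace is sampled)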
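
The field layout above can be checked mechanically. The following is a minimal Python sketch of a parser for version 00 of the header; the function name and the returned dictionary keys are choices made for this example, and the sketch deliberately ignores the extra fields that future header versions may append.

```python
import re
from typing import Optional

# version(2 hex) - trace-id(32 hex) - parent-id(16 hex) - trace-flags(2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header: str) -> Optional[dict]:
    """Parse a traceparent header value; return None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The standard forbids version "ff" and all-zero trace or parent IDs
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),  # lowest bit is the sampled flag
    }


parsed = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
)
```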

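Nginx propagation configuration

http {
    # Propagation mode
    otel_trace_context propagate;  # extract + inject: reuse the incoming context and pass it on
    # otel_trace_context extract;  # use the incoming context as the parent, but do not inject
    # otel_trace_context inject;   # inject a new context, overwriting any incoming headers
    # otel_trace_context ignore;   # skip trace context headers entirely (the default)

    server {
        location /api/ {
            # Make sure the trace headers reach the backend
            proxy_pass http://backend;

            # With "inject" or "propagate" the module adds traceparent and
            # tracestate to the proxied request itself, so no proxy_set_header
            # lines are needed. For logs or custom headers the module exposes
            # $otel_trace_id, $otel_span_id, $otel_parent_id and
            # $otel_parent_sampled.
        }
    }
}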

Integrating with backend services

Backend services must be able to parse and propagate the trace context. The examples below show two languages.

Python (Flask + OpenTelemetry)

from flask import Flask
from opentelemetry import trace
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

app = Flask(__name__)

# Configure OpenTelemetry: export spans in batches over OTLP gRPC
trace.set_tracer_provider(TracerProvider())
otlp_exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(otlp_exporter))

# Auto-instrument Flask (reads the incoming traceparent header)
FlaskInstrumentor().instrument_app(app)

@app.route('/api/users')
def get_users():
    tracer = trace.get_tracer(__name__)
    # Child span of the request span created by the Flask instrumentation
    with tracer.start_as_current_span("fetch_users_from_db"):
        # Database work goes here
        pass
    return {"users": []}

if __name__ == '__main__':
    app.run(port=8080)

Node.js (Express + OpenTelemetry)

const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http');
const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { BatchSpanProcessor } = require('@opentelemetry/sdk-trace-base');

const provider = new NodeTracerProvider();

// The gRPC exporter expects a URL that includes the scheme
const exporter = new OTLPTraceExporter({
    url: 'http://localhost:4317',
});

// Export spans in batches and make this provider the global one
provider.addSpanProcessor(new BatchSpanProcessor(exporter));
provider.register();

// Register instrumentations before express is loaded so they can patch it
registerInstrumentations({
    instrumentations: [
        new HttpInstrumentation(),
        new ExpressInstrumentation(),
    ],
});

const express = require('express');
const app = express();

app.get('/api/users', (req, res) => {
    res.json({ users: [] });
});

app.listen(8080);

5. Integrating with Jaeger

Installing Jaeger

Use Docker to deploy Jaeger quickly:

# Install Docker (if it is not installed yet)
sudo apt install -y docker.io
sudo systemctl enable docker
sudo systemctl start docker
# Log out and back in for the group change to take effect
sudo usermod -aG docker $USER

# Run Jaeger all-in-one
docker run -d --name jaeger \
    -e COLLECTOR_OTLP_ENABLED=true \
    -p 4317:4317 \
    -p 4318:4318 \
    -p 16686:16686 \
    -p 14268:14268 \
    -p 14250:14250 \
    jaegertracing/all-in-one:latest

Ports:

| Port | Protocol | Description |
| --- | --- | --- |
| 4317 | gRPC | OTLP gRPC receiver |
| 4318 | HTTP | OTLP HTTP receiver |
| 16686 | HTTP | Jaeger UI |
| 14268 | HTTP | Jaeger collector |
| 14250 | gRPC | Jaeger collector |
Configuring Nginx to send to Jaeger

otel_exporter {
    endpoint localhost:4317;  # Jaeger OTLP gRPC endpoint
    interval 5s;
    batch_size 512;
    batch_count 4;
}

Deploying the full environment with Docker Compose

In this setup Nginx sends spans to the OpenTelemetry Collector, which forwards them to Jaeger. If the standalone jaeger container from the previous step is still running, remove it first (docker rm -f jaeger), because its name and ports would clash with the services below.

Create docker-compose.yml:

version: '3.8'

services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      # OTLP (4317/4318) is reached by the collector over otel-network,
      # so it is not published here; publishing it on the host would
      # clash with the otel-collector ports below
      - "16686:16686"  # Jaeger UI
      - "14268:14268"  # Jaeger collector
    networks:
      - otel-network

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"    # OTLP gRPC
      - "4318:4318"    # OTLP HTTP
      - "8888:8888"    # Prometheus metrics
    depends_on:
      - jaeger
    networks:
      - otel-network

networks:
  otel-network:
    driver: bridge

Create otel-collector-config.yaml. Recent Collector releases no longer ship the dedicated jaeger exporter (Jaeger accepts OTLP natively) and have replaced the logging exporter with debug, so with the latest image the configuration uses the otlp and debug exporters:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

  memory_limiter:
    check_interval: 1s
    limit_mib: 1000
    spike_limit_mib: 200

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger, debug]
Start the services:

docker-compose up -d

Accessing the Jaeger UI

Open http://localhost:16686 in a browser to view the trace data.


6. Integrating with Zipkin

Installing Zipkin

Use Docker to deploy Zipkin:

docker run -d --name zipkin \
    -p 9411:9411 \
    openzipkin/zipkin:latest

Forwarding to Zipkin through the OpenTelemetry Collector

Update otel-collector-config.yaml. The endpoint uses the hostname zipkin, which only resolves when Zipkin runs on the same Docker network as the collector, as in the Compose file at the end of this section:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"
    format: proto

  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [zipkin, debug]

Sending to Jaeger and Zipkin at the same time

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 1s

exporters:
  # Jaeger accepts OTLP natively, so the generic otlp exporter is used
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, zipkin]

Complete Docker Compose configuration

version: '3.8'

services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"  # Jaeger UI; OTLP is reached internally as jaeger:4317
    networks:
      - otel-network

  zipkin:
    image: openzipkin/zipkin:latest
    ports:
      - "9411:9411"
    networks:
      - otel-network

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
    depends_on:
      - jaeger
      - zipkin
    networks:
      - otel-network

networks:
  otel-network:
    driver: bridge

7. Custom Attributes and Span Settings

Setting custom attributes

Add custom attributes in the Nginx configuration:

http {
    otel_exporter {
        endpoint localhost:4317;
    }

    otel_service_name "nginx-gateway";
    otel_trace on;

    # Global resource attributes
    otel_resource_attr "service.version" "2.1.0";
    otel_resource_attr "deployment.environment" "production";
    otel_resource_attr "cloud.provider" "aws";
    otel_resource_attr "cloud.region" "ap-northeast-1";

    server {
        listen 80;

        location /api/v1/ {
            # Span-level attributes
            otel_span_attr "http.route" "/api/v1/*";
            otel_span_attr "api.version" "v1";
            otel_span_attr "api.deprecated" "false";

            proxy_pass http://backend_v1;
        }

        location /api/v2/ {
            otel_span_attr "http.route" "/api/v2/*";
            otel_span_attr "api.version" "v2";
            otel_span_attr "api.deprecated" "false";

            proxy_pass http://backend_v2;
        }

        location /legacy/ {
            otel_span_attr "http.route" "/legacy/*";
            otel_span_attr "api.deprecated" "true";
            otel_span_attr "deprecation.date" "2025-12-31";

            proxy_pass http://legacy_backend;
        }
    }
}

Dynamic attributes

Use Nginx variables to set attributes whose values depend on the request:

server {
    # Derive a per-request value from a request header
    set $tenant_id "";
    if ($http_x_tenant_id) {
        set $tenant_id $http_x_tenant_id;
    }

    location / {
        otel_span_attr "tenant.id" $tenant_id;
        otel_span_attr "request.id" $request_id;
        otel_span_attr "client.ip" $remote_addr;
        otel_span_attr "client.user_agent" $http_user_agent;
        otel_span_attr "request.size" $content_length;

        proxy_pass http://backend;
    }
}

Custom span names

server {
    # Use the URI as the span name (fallback for unmatched locations)
    otel_span_name $request_uri;

    location ~ ^/users/(\d+)$ {
        # Use the normalized route as the span name
        otel_span_name "/users/{id}";
        otel_span_attr "user.id" $1;

        proxy_pass http://user_service;
    }

    location ~ ^/orders/(\d+)/items/(\d+)$ {
        otel_span_name "/orders/{order_id}/items/{item_id}";
        otel_span_attr "order.id" $1;
        otel_span_attr "item.id" $2;

        proxy_pass http://order_service;
    }
}
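
The point of the regex locations above is that many concrete URIs collapse into one span name, with the variable parts moved into attributes. The following Python sketch mirrors that mapping outside Nginx, which can be handy for checking route patterns before deploying them. The `ROUTES` table and the function name are written for this example; only the two patterns come from the configuration above.

```python
import re
from typing import Dict, Tuple

# (pattern, span name template, attribute names for the capture groups)
ROUTES = [
    (re.compile(r"^/users/(\d+)$"), "/users/{id}", ("user.id",)),
    (
        re.compile(r"^/orders/(\d+)/items/(\d+)$"),
        "/orders/{order_id}/items/{item_id}",
        ("order.id", "item.id"),
    ),
]


def span_name_for(path: str) -> Tuple[str, Dict[str, str]]:
    """Return (span name, attributes) for a request path."""
    for pattern, template, attr_names in ROUTES:
        match = pattern.match(path)
        if match:
            return template, dict(zip(attr_names, match.groups()))
    # No route matched: fall back to the raw path, as the server-level
    # otel_span_name does in the configuration above
    return path, {}
```

For instance, `/users/12345` and `/users/67890` both map to the span name `/users/{id}`, and the numeric ID ends up in the `user.id` attribute instead of the name.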

Conditional tracing

http {
    # Use a map to decide whether a request is traced
    map $request_uri $otel_trace_enabled {
        default          "on";
        ~^/health        "off";
        ~^/metrics       "off";
        ~^/favicon\.ico  "off";
        ~^/robots\.txt   "off";
    }

    # Flag well-known crawlers by user agent
    map $http_user_agent $is_bot {
        default         "false";
        ~*googlebot     "true";
        ~*bingbot       "true";
        ~*slurp         "true";
    }

    server {
        location / {
            otel_trace $otel_trace_enabled;
            otel_span_attr "client.is_bot" $is_bot;

            proxy_pass http://backend;
        }
    }
}

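8. Performance Impact and Best Practices

Performance considerations

Enabling tracing adds some overhead to the system:

| Item | Impact | Description |
| --- | --- | --- |
| CPU usage | Low to medium | Mainly span creation and serialization |
| Memory usage | (not given) | Spans are held in memory until the next batch is sent |
| Network bandwidth | Low to medium | Depends on the sampling rate and the number of span attributes |
| Latency | Very low | Spans are sent asynchronously and do not block request processing |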

Sampling strategies

Head sampling

Decide whether to trace a request at the moment it starts:

http {
    # Probabilistic sampling keyed on a per-request random value
    split_clients "${request_id}" $trace_sample {
        10%     "on";   # 10% of requests are traced
        *       "off";
    }

    server {
        location / {
            otel_trace $trace_sample;
            proxy_pass http://backend;
        }
    }
}
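
The idea behind `split_clients` is to hash the key into a fixed range and trace the request only if the hash lands in the first N percent of that range, so the same key always gets the same decision. The following Python sketch illustrates that idea. It is not Nginx's implementation: Nginx uses MurmurHash2, while this sketch uses SHA-256 purely because it is in the standard library, and the function name is invented.

```python
import hashlib


def is_sampled(key: str, percent: float) -> bool:
    """Deterministically decide whether a key falls in the sampled share."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the hash onto 10,000 buckets (0.01% resolution)
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100


# Simulate 10,000 requests with made-up request IDs and a 10% sample rate
traced = sum(is_sampled(f"req-{i}", 10) for i in range(10_000))
```

Across a large number of distinct keys, `traced` comes out close to 10% of the total, while any single key always yields the same answer.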

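Condition-based sampling

http {
    map $request_uri $trace_enabled {
        default             "on";
        ~^/static/          "off";    # do not trace static assets
        ~^/health           "off";    # do not trace health checks
    }

    # Sampling on the response status (for example, always keeping 4xx/5xx
    # responses) cannot be done here: the tracing decision is made when the
    # request starts, before $status is known. Keeping every error trace
    # requires tail sampling in the OpenTelemetry Collector instead.

    server {
        location / {
            otel_trace $trace_enabled;
            proxy_pass http://backend;
        }
    }
}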

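Batch configuration

Tune the exporter to reduce network overhead:

otel_exporter {
    endpoint localhost:4317;
    interval 5s;         # maximum interval between two exports
    batch_size 512;      # at most 512 spans per batch, per worker
    batch_count 4;       # at most 4 pending batches per worker; extra spans are dropped
}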
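
These two batch settings put an upper bound on how many spans Nginx holds in memory: each worker can buffer at most `batch_size` × `batch_count` spans before new spans are dropped. The following Python sketch is only that arithmetic; the function name and the worker count of 4 are assumptions for the example (with `worker_processes auto` the real count follows the number of CPU cores).

```python
def max_buffered_spans(batch_size: int, batch_count: int, workers: int = 1) -> int:
    """Upper bound on spans waiting to be exported across all workers."""
    return batch_size * batch_count * workers


per_worker = max_buffered_spans(512, 4)           # values from the block above
whole_server = max_buffered_spans(512, 4, workers=4)  # assumed 4 worker processes
```

If traffic regularly produces more spans than this within one export interval, raising `batch_count` or lowering the sampling rate are the two obvious knobs.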

Best practices

1. Use meaningful service names

# Good
otel_service_name "api-gateway";
otel_service_name "auth-proxy";

# Avoid
otel_service_name "nginx";
otel_service_name "server1";

2. Normalize span names

# Good: use route templates
otel_span_name "/users/{id}";
otel_span_name "/orders/{order_id}/items";

# Avoid: the full URI causes high cardinality
otel_span_name $request_uri;  # /users/12345 yields a different span name per user

3. Limit the number of attributes

# Add only the attributes you need
otel_span_attr "http.route" "/api/users";
otel_span_attr "tenant.id" $tenant_id;

# Avoid adding too many attributes
# otel_span_attr "header.accept" $http_accept;
# otel_span_attr "header.accept-encoding" $http_accept_encoding;
# otel_span_attr "header.accept-language" $http_accept_language;

4. Monitor the tracing system itself

# Watch the Nginx error log for OpenTelemetry messages
sudo tail -f /var/log/nginx/error.log | grep -i otel

# Read the OpenTelemetry Collector's own metrics (Prometheus format)
curl http://localhost:8888/metrics

5. Set appropriate timeouts and retries

In the OpenTelemetry Collector configuration:

exporters:
  otlp:
    endpoint: jaeger:4317
    tls:
      insecure: true
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    timeout: 10s

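Troubleshooting

Check that the module is loaded

# Dynamic modules are not listed by "nginx -V"; check the loaded configuration instead
sudo nginx -T 2>/dev/null | grep load_module

Validate the configuration

sudo nginx -t

Check the logs

# Nginx error log
sudo tail -f /var/log/nginx/error.log

# OpenTelemetry Collector log
docker logs -f otel-collector

Test tracing

# Send a test request, then look for it in the Jaeger UI
curl -v http://localhost/api/test

# Send a request with a specific trace ID
curl -H "traceparent: 00-12345678901234567890123456789012-1234567890123456-01" \
     http://localhost/api/test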
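
For repeated tests it is more convenient to generate a fresh, valid trace ID for each request and then search for that exact ID in the tracing UI. The following Python sketch builds such a header with the standard library. The helper names are invented for this example, and the URL is the same test endpoint used in the curl commands above.

```python
import secrets
import urllib.request
from typing import Tuple


def make_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent value with random IDs."""
    trace_id = secrets.token_hex(16)   # 32 hex characters
    span_id = secrets.token_hex(8)     # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def build_request(url: str) -> Tuple[urllib.request.Request, str]:
    """Prepare (but do not send) a request carrying a new traceparent header."""
    traceparent = make_traceparent()
    request = urllib.request.Request(url, headers={"traceparent": traceparent})
    return request, traceparent


request, traceparent = build_request("http://localhost/api/test")
trace_id = traceparent.split("-")[1]  # the ID to search for in the tracing UI
```

Sending the prepared request is then a matter of `urllib.request.urlopen(request, timeout=5)`; with `otel_trace_context propagate` or `extract` in effect, the resulting Nginx span should appear under `trace_id`.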

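Summary

This article showed how to configure OpenTelemetry tracing for Nginx on Ubuntu 22.04. The main topics were:

  1. OpenTelemetry fundamentals: core concepts such as trace, span, and context
  2. Module installation: using the prebuilt module or building from source
  3. Tracing configuration: setting the exporter, service name, and sampling strategy
  4. Context propagation: passing tracing information between services with the W3C Trace Context standard
  5. Backend integration: connecting to Jaeger and Zipkin
  6. Custom attributes: adding business-specific attributes and dynamic values
  7. Performance tuning: sampling strategies, batching, and best practices

With these configurations in place you get end-to-end request tracing, which makes it easier to understand system behavior, find performance bottlenecks, and diagnose problems quickly.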
