Ubuntu 22.04 PostgreSQL TimescaleDB Time Series Database

TimescaleDB is an open-source time-series database extension built on top of PostgreSQL, designed specifically for handling time-series data. This article covers installing, configuring, and optimizing TimescaleDB on Ubuntu 22.04 in detail.

Time Series Database Concepts

What Is Time Series Data

Time series data is a collection of data points recorded in chronological order; each data point carries a timestamp and one or more measured values. Common use cases include:

  • IoT sensor data: temperature, humidity, pressure, and other environmental monitoring
  • Application performance monitoring: CPU usage, memory consumption, response times
  • Financial data: stock prices, trading volumes, exchange rate movements
  • Log data: system logs, access logs, error logs
  • Smart grid data: power consumption, load balancing
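To make the "timestamp plus measurements" shape concrete, here is a minimal Python sketch; the SensorReading class and its field names are illustrative, chosen to mirror the sensor_data table used later in this article:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# A time-series data point: a timestamp plus measured values.
@dataclass(frozen=True)
class SensorReading:
    time: datetime
    sensor_id: int
    temperature: float
    humidity: float

start = datetime(2025, 1, 1, tzinfo=timezone.utc)
# Readings arrive in chronological order and are appended, never updated.
readings = [
    SensorReading(start + timedelta(hours=h), sensor_id=1,
                  temperature=20.0 + h, humidity=60.0 - h)
    for h in range(3)
]

# Append-only ingestion means the list is already sorted by time.
assert readings == sorted(readings, key=lambda r: r.time)
print(len(readings))  # → 3
```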

Characteristics of Time Series Databases

Compared with a traditional relational database, a time-series database is optimized for time-series workloads:

Characteristic    Traditional RDBMS       Time Series Database
Write pattern     Random writes           Sequential appends
Data updates      Frequent updates        Rarely updated
Query pattern     Mostly point queries    Mostly range queries
Compression       General-purpose         Time-series-specific
Data retention    Long-term storage       Automatic expiry and deletion

Advantages of TimescaleDB

Compared with other time-series databases, TimescaleDB offers:

  1. Full SQL support: the complete PostgreSQL SQL syntax is available
  2. Seamless integration: works with existing PostgreSQL tools and the surrounding ecosystem
  3. Automatic partitioning: hypertables partition data by time automatically
  4. Efficient compression: native compression can save up to 95% of storage space
  5. Continuous aggregates: pre-computed aggregates are maintained automatically
  6. Data retention policies: expired data is deleted automatically

Installing TimescaleDB

Prerequisites

Make sure PostgreSQL is installed:

# Update system packages
sudo apt update && sudo apt upgrade -y

# Install PostgreSQL (if not already installed)
sudo apt install -y postgresql postgresql-contrib

Adding the TimescaleDB Package Repository

# Install prerequisites for adding the repository
sudo apt install -y gnupg postgresql-common apt-transport-https lsb-release wget

# Run the PostgreSQL repository setup script
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh -y

# Add the TimescaleDB package repository
echo "deb https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main" | sudo tee /etc/apt/sources.list.d/timescaledb.list

# Import the TimescaleDB GPG key
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg

# Update the package list
sudo apt update

Installing the TimescaleDB Package

# List available versions
apt-cache search timescaledb

# Install TimescaleDB (pick the package matching your PostgreSQL version)
sudo apt install -y timescaledb-2-postgresql-14

# Or install against a newer PostgreSQL version
sudo apt install -y timescaledb-2-postgresql-16

Configuring PostgreSQL

Use the timescaledb-tune tool to optimize the configuration automatically:

# Run the tuning tool
sudo timescaledb-tune --quiet --yes

# Restart the PostgreSQL service
sudo systemctl restart postgresql

Manual configuration (if needed):

# Edit the PostgreSQL configuration file
sudo vim /etc/postgresql/14/main/postgresql.conf

Add or modify the following settings:

# Load the TimescaleDB extension
shared_preload_libraries = 'timescaledb'

# Memory settings (adjust to the system's resources)
shared_buffers = 4GB
effective_cache_size = 12GB
work_mem = 64MB
maintenance_work_mem = 2GB

# WAL settings
wal_level = replica
max_wal_size = 4GB
min_wal_size = 1GB

# Parallelism
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_parallel_maintenance_workers = 4

Enabling the TimescaleDB Extension

# Connect to PostgreSQL
sudo -u postgres psql

-- Create a database
CREATE DATABASE tsdb;

-- Connect to the new database
\c tsdb

-- Enable the TimescaleDB extension
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Confirm the installation
\dx timescaledb

Verify the installation succeeded:

-- Check the TimescaleDB version
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';

-- Example output
-- extversion
-- -----------
--  2.13.0

Creating and Using Hypertables

The Hypertable Concept

The hypertable is TimescaleDB's core abstraction: an automatically partitioned time-series table. A hypertable splits its data into chunks by time; underneath, each chunk is an ordinary, independent PostgreSQL table.

+------------------------------------------+
|              Hypertable                  |
|  (virtual table, transparent to users)   |
+------------------------------------------+
         |           |           |
    +--------+  +--------+  +--------+
    | Chunk  |  | Chunk  |  | Chunk  |
    | Week 1 |  | Week 2 |  | Week 3 |
    +--------+  +--------+  +--------+
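Conceptually, routing a row to a chunk is just integer division of its timestamp by the chunk interval. A small Python sketch of that idea (illustrative only, not TimescaleDB's actual implementation; the epoch origin is arbitrary):

```python
from datetime import datetime, timedelta, timezone

CHUNK_INTERVAL = timedelta(days=7)
EPOCH = datetime(2000, 1, 3, tzinfo=timezone.utc)  # arbitrary aligned origin (a Monday)

def chunk_for(ts: datetime) -> int:
    """Return the index of the weekly chunk a timestamp falls into."""
    return (ts - EPOCH) // CHUNK_INTERVAL

t1 = datetime(2025, 1, 1, tzinfo=timezone.utc)
t2 = t1 + timedelta(days=3)    # same week  -> same chunk
t3 = t1 + timedelta(days=10)   # next week  -> different chunk

assert chunk_for(t1) == chunk_for(t2)
assert chunk_for(t3) == chunk_for(t1) + 1
```

Because each chunk covers a disjoint time range, inserts of recent data touch only the newest chunk, and range queries can skip chunks entirely.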

Creating a Hypertable

-- Create an ordinary PostgreSQL table
CREATE TABLE sensor_data (
    time        TIMESTAMPTZ NOT NULL,
    sensor_id   INTEGER NOT NULL,
    location    TEXT NOT NULL,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION,
    pressure    DOUBLE PRECISION
);

-- Convert the table into a hypertable
SELECT create_hypertable('sensor_data', 'time');

Setting the Chunk Interval

-- Specify the chunk interval when creating the hypertable (e.g. 7 days)
SELECT create_hypertable(
    'sensor_data',
    'time',
    chunk_time_interval => INTERVAL '7 days'
);

-- Change the chunk interval of an existing hypertable
SELECT set_chunk_time_interval('sensor_data', INTERVAL '1 day');

Space Partitioning (Multi-Dimensional Partitioning)

For large-scale data, you can partition by an additional column as well as by time:

-- Create a hypertable with an additional space partition
SELECT create_hypertable(
    'sensor_data',
    'time',
    partitioning_column => 'sensor_id',
    number_partitions => 4
);

Inserting Data

-- Insert a single row
INSERT INTO sensor_data (time, sensor_id, location, temperature, humidity, pressure)
VALUES (NOW(), 1, 'Taipei', 25.5, 65.0, 1013.25);

-- Insert multiple rows at once
INSERT INTO sensor_data (time, sensor_id, location, temperature, humidity, pressure)
VALUES
    (NOW() - INTERVAL '1 hour', 1, 'Taipei', 24.8, 68.0, 1013.10),
    (NOW() - INTERVAL '2 hours', 1, 'Taipei', 24.2, 70.0, 1012.95),
    (NOW() - INTERVAL '3 hours', 1, 'Taipei', 23.5, 72.0, 1012.80);

-- Bulk-load with COPY (best performance)
COPY sensor_data FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);

Example Time-Series Queries

-- Data from the last 24 hours
SELECT * FROM sensor_data
WHERE time > NOW() - INTERVAL '24 hours'
ORDER BY time DESC;

-- Hourly average temperature
SELECT
    time_bucket('1 hour', time) AS hour,
    sensor_id,
    AVG(temperature) AS avg_temp,
    MIN(temperature) AS min_temp,
    MAX(temperature) AS max_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY hour, sensor_id
ORDER BY hour DESC;

-- Latest reading per sensor
SELECT DISTINCT ON (sensor_id)
    sensor_id,
    time,
    temperature,
    humidity
FROM sensor_data
ORDER BY sensor_id, time DESC;

-- Moving average
SELECT
    time,
    temperature,
    AVG(temperature) OVER (
        ORDER BY time
        ROWS BETWEEN 5 PRECEDING AND CURRENT ROW
    ) AS moving_avg
FROM sensor_data
WHERE sensor_id = 1
ORDER BY time DESC
LIMIT 100;
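time_bucket truncates each timestamp down to the start of its fixed-width bucket, much like date_trunc but for arbitrary intervals. A Python sketch of the bucketing and hourly averaging above (illustrative, not the extension's code):

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

def time_bucket(width: timedelta, ts: datetime) -> datetime:
    """Truncate ts down to the start of its fixed-width bucket."""
    origin = datetime(2000, 1, 1, tzinfo=ts.tzinfo)
    return origin + ((ts - origin) // width) * width

start = datetime(2025, 1, 1, tzinfo=timezone.utc)
# (time, temperature) samples every 20 minutes
samples = [(start + timedelta(minutes=20 * i), 20.0 + i) for i in range(6)]

# Group samples into 1-hour buckets and average, as the SQL query does
buckets: dict = defaultdict(list)
for ts, temp in samples:
    buckets[time_bucket(timedelta(hours=1), ts)].append(temp)
hourly_avg = {b: sum(v) / len(v) for b, v in buckets.items()}

print(hourly_avg[start])  # → 21.0 (average of the first hour's three samples)
```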

Viewing Hypertable Information

-- List all hypertables
SELECT hypertable_name, num_chunks, compression_enabled
FROM timescaledb_information.hypertables;

-- Inspect chunks
SELECT chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_data'
ORDER BY range_start DESC;

-- Hypertable size
SELECT hypertable_size('sensor_data');

-- Detailed size breakdown
SELECT * FROM hypertable_detailed_size('sensor_data');

Data Compression Strategy

Enabling Compression

TimescaleDB's native compression can dramatically reduce storage usage; compression ratios of 90-95% are common.

-- Enable compression on the hypertable
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'sensor_id',
    timescaledb.compress_orderby = 'time DESC'
);

Compression Settings Explained

  • compress_segmentby: groups rows by this column before compressing; choose columns whose rows are usually queried together
  • compress_orderby: the sort order of the compressed data, which affects query performance
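The effect of segmentby can be pictured as grouping rows by the segment column so that each compressed batch holds one sensor's data in time order, stored as columnar arrays. A rough Python illustration of that grouping (not the actual compression algorithm):

```python
from itertools import groupby

# (sensor_id, time_offset_s, temperature) rows, as they might arrive
rows = [
    (2, 60, 21.5),
    (1, 0, 25.0),
    (1, 60, 25.1),
    (2, 0, 21.4),
]

# segmentby = sensor_id, orderby = time: sort, then group per segment.
# Each group becomes one batch whose columns compress well together.
rows.sort(key=lambda r: (r[0], r[1]))
batches = {
    sensor: {"times": [r[1] for r in grp], "temps": [r[2] for r in grp]}
    for sensor, grp in ((s, list(g)) for s, g in groupby(rows, key=lambda r: r[0]))
}

print(batches[1]["temps"])  # → [25.0, 25.1]
```

Grouping similar values together is what makes the per-column encodings (delta, run-length, etc.) effective.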
-- A more elaborate compression configuration
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'sensor_id, location',
    timescaledb.compress_orderby = 'time DESC',
    timescaledb.compress_chunk_time_interval = '7 days'
);

Manually Compressing Chunks

-- Compress a specific chunk
SELECT compress_chunk('_timescaledb_internal._hyper_1_1_chunk');

-- Compress every chunk older than 7 days
SELECT compress_chunk(chunk)
FROM show_chunks('sensor_data', older_than => INTERVAL '7 days') AS chunk;

Automatic Compression Policy

-- Create an automatic compression policy: compress data older than 7 days
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

-- View compression policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_compression';

-- Remove the compression policy
SELECT remove_compression_policy('sensor_data');

Checking Compression Status

-- Per-chunk compression statistics
SELECT
    chunk_name,
    before_compression_total_bytes,
    after_compression_total_bytes,
    round(
        (1 - after_compression_total_bytes::numeric / before_compression_total_bytes) * 100,
        2
    ) AS compression_ratio
FROM timescaledb_information.compressed_chunk_stats
WHERE hypertable_name = 'sensor_data';

-- Overall compression savings
SELECT
    hypertable_name,
    SUM(before_compression_total_bytes) AS total_before,
    SUM(after_compression_total_bytes) AS total_after,
    round(
        (1 - SUM(after_compression_total_bytes)::numeric / SUM(before_compression_total_bytes)) * 100,
        2
    ) AS overall_compression_ratio
FROM timescaledb_information.compressed_chunk_stats
GROUP BY hypertable_name;

Decompressing Data

-- Decompress a specific chunk (e.g. to modify historical data)
SELECT decompress_chunk('_timescaledb_internal._hyper_1_1_chunk');

-- Decompress all chunks in a time range
SELECT decompress_chunk(chunk)
FROM show_chunks('sensor_data', older_than => INTERVAL '7 days', newer_than => INTERVAL '14 days') AS chunk;

Continuous Aggregates

The Continuous Aggregate Concept

Continuous aggregates are one of TimescaleDB's most powerful features: the extension automatically maintains pre-computed aggregates, which can speed up common analytical queries dramatically.
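The core idea can be sketched in a few lines of Python: keep per-bucket running sums and counts, so answering an average query never requires rescanning the raw rows. This is illustrative only; the real feature also tracks invalidation ranges and refreshes in the background:

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# bucket -> [sum, count]: the materialized state of an hourly average
state: dict = defaultdict(lambda: [0.0, 0])

def ingest(ts: datetime, temperature: float) -> None:
    """Update the pre-computed aggregate incrementally on insert."""
    bucket = ts.replace(minute=0, second=0, microsecond=0)
    s = state[bucket]
    s[0] += temperature
    s[1] += 1

def hourly_avg(bucket: datetime) -> float:
    """Answer the aggregate query from materialized state only."""
    total, n = state[bucket]
    return total / n

t = datetime(2025, 1, 1, 10, 15, tzinfo=timezone.utc)
ingest(t, 24.0)
ingest(t + timedelta(minutes=30), 26.0)

print(hourly_avg(t.replace(minute=0)))  # → 25.0
```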

Creating Continuous Aggregates

-- Continuous aggregate with hourly statistics
CREATE MATERIALIZED VIEW sensor_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    sensor_id,
    location,
    AVG(temperature) AS avg_temperature,
    MIN(temperature) AS min_temperature,
    MAX(temperature) AS max_temperature,
    AVG(humidity) AS avg_humidity,
    COUNT(*) AS sample_count
FROM sensor_data
GROUP BY bucket, sensor_id, location
WITH NO DATA;

-- Continuous aggregate with daily statistics
CREATE MATERIALIZED VIEW sensor_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', time) AS bucket,
    sensor_id,
    location,
    AVG(temperature) AS avg_temperature,
    MIN(temperature) AS min_temperature,
    MAX(temperature) AS max_temperature,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY temperature) AS median_temperature,
    STDDEV(temperature) AS stddev_temperature,
    COUNT(*) AS sample_count
FROM sensor_data
GROUP BY bucket, sensor_id, location
WITH NO DATA;

Manually Refreshing a Continuous Aggregate

-- Refresh a specific time range
CALL refresh_continuous_aggregate('sensor_hourly', '2025-01-01', '2025-12-31');

-- Refresh all data
CALL refresh_continuous_aggregate('sensor_hourly', NULL, NULL);

Automatic Refresh Policy

-- Set up an automatic refresh policy
SELECT add_continuous_aggregate_policy('sensor_hourly',
    start_offset => INTERVAL '3 days',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);

-- View refresh policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_refresh_continuous_aggregate';

-- Remove the refresh policy
SELECT remove_continuous_aggregate_policy('sensor_hourly');

Querying Continuous Aggregates

-- Query the continuous aggregate directly (very fast)
SELECT * FROM sensor_hourly
WHERE bucket > NOW() - INTERVAL '7 days'
ORDER BY bucket DESC;

-- Aggregate further (e.g. a daily report)
SELECT
    time_bucket('1 day', bucket) AS day,
    sensor_id,
    AVG(avg_temperature) AS daily_avg_temp,
    SUM(sample_count) AS total_samples
FROM sensor_hourly
WHERE bucket > NOW() - INTERVAL '30 days'
GROUP BY day, sensor_id
ORDER BY day DESC;
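One caveat when re-aggregating pre-computed averages: AVG(avg_temperature) weights every hour equally, which only matches the true daily mean when all hours have the same sample_count. A count-weighted average (SUM(avg * count) / SUM(count)) is exact. A small Python check of the difference, using illustrative numbers:

```python
# Hourly pre-aggregates: (avg_temperature, sample_count)
hours = [(20.0, 10), (30.0, 2)]  # uneven sample counts

# Plain average of averages, as AVG(avg_temperature) computes
plain = sum(avg for avg, _ in hours) / len(hours)

# Count-weighted average: the true mean over all raw samples
weighted = sum(avg * n for avg, n in hours) / sum(n for _, n in hours)

print(plain)     # → 25.0
print(weighted)  # ≈ 21.67
```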

Hierarchical Continuous Aggregates

-- Build a daily aggregate on top of the hourly one
CREATE MATERIALIZED VIEW sensor_daily_from_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', bucket) AS bucket,
    sensor_id,
    location,
    AVG(avg_temperature) AS avg_temperature,
    MIN(min_temperature) AS min_temperature,
    MAX(max_temperature) AS max_temperature,
    SUM(sample_count) AS total_samples
FROM sensor_hourly
GROUP BY time_bucket('1 day', bucket), sensor_id, location
WITH NO DATA;

Real-Time Continuous Aggregates

-- Real-time continuous aggregate (also includes recent, not-yet-materialized data)
CREATE MATERIALIZED VIEW sensor_hourly_realtime
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    sensor_id,
    AVG(temperature) AS avg_temperature,
    COUNT(*) AS sample_count
FROM sensor_data
GROUP BY bucket, sensor_id
WITH NO DATA;

Data Retention Policies

Creating a Retention Policy

-- Automatically delete data older than 90 days
SELECT add_retention_policy('sensor_data', INTERVAL '90 days');

-- View retention policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

Manually Deleting Old Data

-- Drop all chunks older than a cutoff
SELECT drop_chunks('sensor_data', older_than => INTERVAL '90 days');

-- Drop chunks within a specific time range
SELECT drop_chunks('sensor_data',
    older_than => INTERVAL '90 days',
    newer_than => INTERVAL '180 days'
);

Tiered Storage Strategy

For data that must be kept long-term but is rarely accessed, use a tiered approach:
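The tiering rules below boil down to classifying each chunk by age. A tiny Python sketch of that decision (the 7-day and 90-day thresholds mirror the policies below; the function itself is illustrative):

```python
from datetime import timedelta

def tier_for(age: timedelta) -> str:
    """Classify a chunk by age: raw, compressed, or aggregates-only."""
    if age < timedelta(days=7):
        return "raw"            # recent data, uncompressed
    if age < timedelta(days=90):
        return "compressed"     # the compression policy has run
    return "aggregates-only"    # raw chunks dropped; continuous aggregates remain

print(tier_for(timedelta(days=3)))    # → raw
print(tier_for(timedelta(days=30)))   # → compressed
print(tier_for(timedelta(days=120)))  # → aggregates-only
```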

-- A tiered policy:
-- 1. Last 7 days: raw, uncompressed data
-- 2. 7-90 days: compressed data
-- 3. Over 90 days: keep only the continuous aggregates; drop the raw data

-- Compression policy
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

-- Retention policy
SELECT add_retention_policy('sensor_data', INTERVAL '90 days');

-- Keep continuous aggregates for much longer
SELECT add_retention_policy('sensor_hourly', INTERVAL '365 days');
SELECT add_retention_policy('sensor_daily', INTERVAL '5 years');

Viewing and Managing Policies

-- List all background jobs
SELECT * FROM timescaledb_information.jobs;

-- Job execution statistics
SELECT * FROM timescaledb_information.job_stats;

-- Pause a policy
SELECT alter_job(job_id, scheduled => false)
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Re-enable a policy
SELECT alter_job(job_id, scheduled => true)
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Run a policy immediately
CALL run_job(1001);  -- use the actual job_id

Performance Tuning

Index Optimization

-- Composite index
CREATE INDEX ON sensor_data (sensor_id, time DESC);

-- Partial index (index only rows matching a condition)
CREATE INDEX ON sensor_data (time DESC)
WHERE temperature > 30;

-- BRIN index (well suited to time-series data)
CREATE INDEX ON sensor_data USING BRIN (time);

-- Index usage statistics
SELECT
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public';

Query Optimization

-- Analyze a query with EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT time_bucket('1 hour', time) AS hour, AVG(temperature)
FROM sensor_data
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY hour;

-- Enable parallel query for the session
SET max_parallel_workers_per_gather = 4;

-- Use the first() and last() aggregate functions
SELECT
    sensor_id,
    first(temperature, time) AS first_temp,
    last(temperature, time) AS last_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '1 day'
GROUP BY sensor_id;

-- Fill gaps in the data with time_bucket_gapfill
SELECT
    time_bucket_gapfill('1 hour', time) AS hour,
    sensor_id,
    COALESCE(AVG(temperature), 0) AS avg_temp,
    locf(AVG(temperature)) AS avg_temp_filled  -- carry the previous value forward
FROM sensor_data
WHERE time BETWEEN '2025-12-01' AND '2025-12-31'
GROUP BY hour, sensor_id;
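locf ("last observation carried forward") fills an empty bucket with the most recent non-null value. A Python sketch of that fill step over a list of hourly averages with gaps (illustrative):

```python
def locf(values: list) -> list:
    """Replace None entries with the last non-None value seen so far."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Hourly averages with two missing buckets, as gapfill would produce
hourly = [24.0, None, None, 25.5, None]
print(locf(hourly))  # → [24.0, 24.0, 24.0, 25.5, 25.5]
```

Note that leading gaps stay unfilled, since there is no earlier observation to carry forward.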

PostgreSQL Performance Parameters

# Edit the configuration file
sudo vim /etc/postgresql/14/main/postgresql.conf

# Memory settings
shared_buffers = 8GB                    # ~25% of system memory
effective_cache_size = 24GB             # ~75% of system memory
work_mem = 128MB                        # per-operation working memory
maintenance_work_mem = 2GB              # memory for maintenance tasks

# Parallelism
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
max_parallel_maintenance_workers = 4

# WAL settings
wal_level = replica
max_wal_size = 8GB
min_wal_size = 2GB
wal_compression = on
wal_buffers = 64MB

# Checkpoint settings
checkpoint_completion_target = 0.9
checkpoint_timeout = 15min

# Query planner
random_page_cost = 1.1                  # lower value for SSDs
effective_io_concurrency = 200          # higher value for SSDs

# TimescaleDB-specific settings
timescaledb.max_background_workers = 8
timescaledb.last_tuned = '2025-12-25'

Batch Write Optimization

-- Use COPY (fastest)
COPY sensor_data FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);

-- Batch inserts with a prepared statement
PREPARE insert_sensor (timestamptz, int, text, float8, float8, float8) AS
INSERT INTO sensor_data (time, sensor_id, location, temperature, humidity, pressure)
VALUES ($1, $2, $3, $4, $5, $6);

-- Execute the prepared statement
EXECUTE insert_sensor(NOW(), 1, 'Taipei', 25.5, 65.0, 1013.25);

-- Wrap many inserts in one transaction instead of autocommitting each
BEGIN;
-- multiple INSERT statements
COMMIT;
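On the client side, batching usually means buffering rows and flushing them in groups rather than issuing one round-trip per row. A minimal, database-free Python sketch of that pattern (the flush callback is a stand-in for a real bulk insert or COPY):

```python
from typing import Callable, List, Tuple

Row = Tuple[int, float]  # (sensor_id, temperature), simplified

class BatchWriter:
    """Buffer rows and flush them in fixed-size batches."""
    def __init__(self, flush: Callable[[List[Row]], None], batch_size: int = 3):
        self.flush, self.batch_size, self.buf = flush, batch_size, []

    def write(self, row: Row) -> None:
        self.buf.append(row)
        if len(self.buf) >= self.batch_size:
            self.flush(self.buf)
            self.buf = []

    def close(self) -> None:
        if self.buf:
            self.flush(self.buf)   # flush any trailing partial batch
            self.buf = []

batches: List[List[Row]] = []
w = BatchWriter(batches.append, batch_size=3)
for i in range(7):
    w.write((1, 20.0 + i))
w.close()

print([len(b) for b in batches])  # → [3, 3, 1]
```

The same shape applies whether the flush executes a multi-row INSERT, a COPY, or a transaction-wrapped loop.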

Parallel Loading with timescaledb-parallel-copy

# Parallel COPY requires the timescaledb-parallel-copy tool

# Install the tool
go install github.com/timescale/timescaledb-parallel-copy/cmd/timescaledb-parallel-copy@latest

# Usage example (run in a shell)
timescaledb-parallel-copy \
  --connection "host=localhost user=postgres dbname=tsdb" \
  --table sensor_data \
  --file data.csv \
  --workers 4 \
  --batch-size 10000

Monitoring and Maintenance

System Monitoring Queries

-- Hypertable information
SELECT
    hypertable_name,
    num_chunks,
    compression_enabled,
    num_dimensions
FROM timescaledb_information.hypertables;

-- Chunk statistics
SELECT
    hypertable_name,
    chunk_name,
    range_start,
    range_end,
    is_compressed,
    pg_size_pretty(total_bytes) AS size
FROM timescaledb_information.chunks
ORDER BY range_start DESC
LIMIT 20;

-- Compression savings
SELECT
    hypertable_name,
    pg_size_pretty(SUM(before_compression_total_bytes)) AS before_compression,
    pg_size_pretty(SUM(after_compression_total_bytes)) AS after_compression,
    round(
        (1 - SUM(after_compression_total_bytes)::numeric /
         NULLIF(SUM(before_compression_total_bytes), 0)) * 100,
        2
    ) AS compression_ratio_pct
FROM timescaledb_information.compressed_chunk_stats
GROUP BY hypertable_name;

-- Continuous aggregate status
SELECT
    view_name,
    materialized_only,
    compression_enabled,
    pg_size_pretty(materialization_hypertable_size) AS size
FROM timescaledb_information.continuous_aggregates;

Background Job Monitoring

-- All scheduled jobs
SELECT
    job_id,
    application_name,
    schedule_interval,
    scheduled,
    proc_name,
    hypertable_name
FROM timescaledb_information.jobs;

-- Job execution history
SELECT
    job_id,
    total_runs,
    total_successes,
    total_failures,
    last_run_status,
    last_run_started_at,
    last_run_duration
FROM timescaledb_information.job_stats;

-- Recent job failures
SELECT *
FROM timescaledb_information.job_errors
ORDER BY start_time DESC
LIMIT 10;

Performance Monitoring

-- Slow queries (requires pg_stat_statements)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT
    calls,
    round(total_exec_time::numeric / 1000, 2) AS total_seconds,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Table statistics
SELECT
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan,
    idx_tup_fetch,
    n_tup_ins,
    n_tup_upd,
    n_tup_del
FROM pg_stat_user_tables
WHERE relname LIKE '%sensor%';

-- Index usage
SELECT
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;

Maintenance Tasks

-- Refresh planner statistics
ANALYZE sensor_data;

-- Clean up dead tuples
VACUUM (VERBOSE, ANALYZE) sensor_data;

-- Rebuild indexes
REINDEX TABLE sensor_data;

-- Check for chunks missing their backing table
SELECT
    chunk_name,
    hypertable_name,
    range_start,
    range_end
FROM timescaledb_information.chunks
WHERE NOT EXISTS (
    SELECT 1 FROM pg_class WHERE relname = chunk_name
);

Automated Maintenance Script

Create a maintenance script:

# Create the maintenance script
sudo vim /opt/scripts/timescaledb_maintenance.sh

#!/bin/bash

# Daily TimescaleDB maintenance script
LOG_FILE="/var/log/timescaledb_maintenance.log"
DATE=$(date +%Y-%m-%d)

echo "[$DATE] Starting maintenance" >> $LOG_FILE

# Refresh planner statistics
sudo -u postgres psql -d tsdb -c "ANALYZE;" >> $LOG_FILE 2>&1

# Log compression statistics
sudo -u postgres psql -d tsdb -c "
SELECT
    hypertable_name,
    pg_size_pretty(SUM(before_compression_total_bytes)) AS before,
    pg_size_pretty(SUM(after_compression_total_bytes)) AS after
FROM timescaledb_information.compressed_chunk_stats
GROUP BY hypertable_name;
" >> $LOG_FILE 2>&1

# Log chunk counts
sudo -u postgres psql -d tsdb -c "
SELECT hypertable_name, COUNT(*) as chunk_count
FROM timescaledb_information.chunks
GROUP BY hypertable_name;
" >> $LOG_FILE 2>&1

# Check for failed background jobs
sudo -u postgres psql -d tsdb -c "
SELECT * FROM timescaledb_information.job_errors
WHERE start_time > NOW() - INTERVAL '1 day';
" >> $LOG_FILE 2>&1

echo "[$DATE] Maintenance finished" >> $LOG_FILE
# Make the script executable
sudo chmod +x /opt/scripts/timescaledb_maintenance.sh

# Add it to crontab
sudo crontab -e

# Run maintenance daily at 3 AM
0 3 * * * /opt/scripts/timescaledb_maintenance.sh

Prometheus Monitoring Integration

-- Create a view exposing metrics for scraping
CREATE VIEW timescaledb_metrics AS
SELECT
    'timescaledb_hypertable_size_bytes' AS metric_name,
    hypertable_name AS label,
    hypertable_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass) AS value
FROM timescaledb_information.hypertables
UNION ALL
SELECT
    'timescaledb_chunk_count' AS metric_name,
    hypertable_name AS label,
    num_chunks AS value
FROM timescaledb_information.hypertables
UNION ALL
SELECT
    'timescaledb_compression_ratio' AS metric_name,
    hypertable_name AS label,
    CASE
        WHEN SUM(before_compression_total_bytes) > 0
        THEN round((1 - SUM(after_compression_total_bytes)::numeric / SUM(before_compression_total_bytes)) * 100, 2)
        ELSE 0
    END AS value
FROM timescaledb_information.compressed_chunk_stats
GROUP BY hypertable_name;

Conclusion

TimescaleDB is a powerful time-series database solution that combines PostgreSQL's stability and SQL compatibility with optimizations specific to time-series workloads. With the features covered in this article, you can:

  1. Store efficiently: save space with hypertable auto-partitioning and compression
  2. Query quickly: pre-compute common statistics with continuous aggregates
  3. Maintain automatically: manage the data lifecycle with retention policies
  4. Tune performance: speed up queries with appropriate indexes and query techniques
  5. Monitor continuously: build out a complete monitoring and maintenance routine

Before deploying to production, validate these settings in a test environment and adjust them to your actual data volume and query patterns.
