[AI Collaboration Notes] gRPC Transmission Optimization: An Efficient Solution Based on Flattening and Bitset

Published: 1 month ago (December 16, 2025 at 10:13 PM EST)

Source: Dev.to

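Background and Technical Challenges

When designing the API for a database middleware, we usually need to return multi-row query results. Defining this the conventional gRPC way runs into the following performance and implementation problems.

1.1 Payload Redundancy (Key Repetition)

The most intuitive Protobuf definition models each row as a Map or Object: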

message Row {
    map<string, string> data = 1;
}
message Response {
    repeated Row rows = 1;
}

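Problem analysis
This structure inflates the payload badly. Suppose a query returns 10,000 rows with the columns customer_id, created_at, and status. Those column names (keys) are then transmitted 10,000 times, once per row, which wastes a large amount of bandwidth.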
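
To put a rough number on the repetition described above, here is a minimal PHP sketch. The helper name estimateKeyBytes is hypothetical, and it counts only the raw characters of the column names; Protobuf tags and length prefixes are ignored, so the real overhead would be somewhat higher.

```php
<?php
// Hypothetical helper: bytes spent on column names alone when every row
// repeats its keys. Only raw key characters are counted; Protobuf tags and
// length prefixes are left out, so this is a lower bound.
function estimateKeyBytes(array $columnNames, int $rowCount): int
{
    $perRow = 0;
    foreach ($columnNames as $name) {
        $perRow += strlen($name);
    }
    return $perRow * $rowCount;
}

$columns  = ['customer_id', 'created_at', 'status'];
$repeated = estimateKeyBytes($columns, 10000); // keys sent with every row
$once     = estimateKeyBytes($columns, 1);     // keys sent once, in a header

echo "repeated: {$repeated} bytes, once: {$once} bytes\n";
// prints "repeated: 270000 bytes, once: 27 bytes"
```

Under these assumptions the three column names cost about 270 KB when repeated per row, against 27 bytes when sent once.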

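1.2 Protobuf's Limitation on NULL Values

By design, Protobuf (proto3) treats scalar types as non-nullable.

If a string field is NULL, the client receives it as the empty string "".
The client cannot tell whether that is an empty value or a NULL in the original database.

Wrapper types such as google.protobuf.StringValue can solve this, but they add an extra level of message nesting and extra processing overhead.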
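
The loss of information can be reproduced in plain PHP without any gRPC code. The sketch below is only an illustration; the array keys status_a and status_b are made up for the example.

```php
<?php
// Two database values that mean different things: NULL and an empty string.
$fromDb = ['status_a' => null, 'status_b' => ''];

// A plain string field can only carry a string, so NULL has to be cast.
$onWire = array_map(fn ($v) => (string) $v, $fromDb);

// After the cast both values are the empty string; the NULL is gone.
var_dump($onWire['status_a'] === $onWire['status_b']); // bool(true)
```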

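Solution: A Database-Internals-Style Architecture

To address these challenges, the AI suggested stepping away from the usual API-object mindset and borrowing from how database internals (columnar storage) and ODBC/JDBC drivers are implemented. The core optimization strategy has two parts:

A. Array Flattening

Header (Metadata): the column definitions (columns) are transmitted once, separately.
Body (Values): all data values are flattened into one large one-dimensional array (values).

This design removes the per-row key transmission entirely and significantly reduces payload size.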
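
The header/body split can be sketched in a few lines of PHP. The sample rows below are invented for illustration, and plain arrays stand in for the generated message classes.

```php
<?php
// Hypothetical sample result set (associative rows, as many drivers return).
$rows = [
    ['customer_id' => 1, 'created_at' => '2025-01-01', 'status' => 'active'],
    ['customer_id' => 2, 'created_at' => '2025-01-02', 'status' => 'closed'],
];

// Header: column names are taken once, from the first row.
$columns  = array_keys($rows[0]);
$colCount = count($columns);

// Body: every value is appended to one flat list, row by row.
$values = [];
foreach ($rows as $row) {
    foreach ($row as $value) {
        $values[] = (string) $value;
    }
}

// The value at row $r, column $c lives at index $r * $colCount + $c.
$r = 1;
$c = 2;
echo $values[$r * $colCount + $c], "\n"; // prints "closed"
```

Because the row-major index is computed from the column count, the receiver needs only the header and the flat list to rebuild every row.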

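B. Bitset (Bitmap) Mechanism

A bitset marks whether each value is NULL.
One bit corresponds to one value: bit = 1 means the value is NULL, bit = 0 means the value is valid.

Space efficiency
Every 8 values cost only 1 byte of extra space. For 1,000 rows × 8 columns (8,000 values), about 1 KB of overhead is enough to record every NULL state exactly.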
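
The overhead figure follows from rounding the value count up to whole bytes. A small sketch, with nullBitmapBytes as a hypothetical helper name:

```php
<?php
// Hypothetical helper: bytes needed for a NULL bitmap with one bit per value.
function nullBitmapBytes(int $rowCount, int $colCount): int
{
    $valueCount = $rowCount * $colCount;
    return intdiv($valueCount + 7, 8); // round up to whole bytes
}

echo nullBitmapBytes(1000, 8), "\n"; // 8,000 values -> prints 1000
echo nullBitmapBytes(3, 3), "\n";    // 9 values -> prints 2 (last byte partly used)
```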

Implementation Essentials

3.1 Protobuf Definition (`proto/query.proto`)

message QueryResponse {
    repeated string values = 3;   // flattened values
    repeated Column columns = 4;  // column definitions
    bytes null_bitmap = 7;        // NULL marker bit stream
    int32 row_count = 6;
}

3.2 Server Side: Encoding and Compression

The server's job is to walk the database result once, flattening the values and building the bitmap in the same pass.

// $result['rows'] is the two-dimensional array returned by the database
$values = [];
$packedBytes = "";
$currentByte = 0;
$bitIndex = 0;

foreach ($result['rows'] as $row) {
    foreach ($row as $value) {
        if ($value === null) {
            $values[] = "";                     // empty string as a placeholder
            $currentByte |= (1 << $bitIndex);   // set the matching bit to 1
        } else {
            $values[] = (string)$value;
            // bit stays 0 (the default)
        }

        $bitIndex++;
        if ($bitIndex === 8) {
            $packedBytes .= chr($currentByte);
            $currentByte = 0;
            $bitIndex = 0;
        }
    }
}

// Flush the last byte when fewer than 8 bits remain
if ($bitIndex > 0) {
    $packedBytes .= chr($currentByte);
}

3.3 Client Side: Decoding and Restoration

After receiving the data, the client splits the values into rows by the number of columns and uses null_bitmap to restore the NULLs.

$fetchedRows = [];
$columns = $response->getColumns();
$colCount = count($columns);
$values = $response->getValues();       // flattened value array
$bitmap = $response->getNullBitmap();   // bitmap as a byte string
$rowCount = $response->getRowCount();

$bytePos = 0;
$bitPos = 0;

for ($r = 0; $r < $rowCount; $r++) {
    $row = [];
    for ($c = 0; $c < $colCount; $c++) {
        $flatIndex = $r * $colCount + $c;

        // Read the bit for this value
        $byte = ord($bitmap[$bytePos]);
        $isNull = ($byte >> $bitPos) & 1;

        $row[] = $isNull ? null : $values[$flatIndex];

        // Advance the bit pointer
        $bitPos++;
        if ($bitPos === 8) {
            $bitPos = 0;
            $bytePos++;
        }
    }
    $fetchedRows[] = $row;
}

With this symmetric logic, the data can be packed and restored at very low computational cost.
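
To check that the two sides really are symmetric, the logic can be wrapped into a pair of functions and run as a round trip. This is a sketch under stated assumptions: encodeRows and decodeRows are hypothetical names, and they work on plain arrays and strings instead of the generated gRPC message classes.

```php
<?php
// Hypothetical helper: flattens rows and builds the NULL bitmap in one pass.
function encodeRows(array $rows): array
{
    $values = [];
    $bitmap = '';
    $currentByte = 0;
    $bitIndex = 0;

    foreach ($rows as $row) {
        foreach ($row as $value) {
            if ($value === null) {
                $values[] = '';                    // placeholder
                $currentByte |= (1 << $bitIndex);  // mark as NULL
            } else {
                $values[] = (string) $value;
            }
            if (++$bitIndex === 8) {               // byte is full, flush it
                $bitmap .= chr($currentByte);
                $currentByte = 0;
                $bitIndex = 0;
            }
        }
    }
    if ($bitIndex > 0) {                           // flush the partial last byte
        $bitmap .= chr($currentByte);
    }
    return [$values, $bitmap];
}

// Hypothetical helper: rebuilds rows, restoring NULLs from the bitmap.
function decodeRows(array $values, string $bitmap, int $rowCount, int $colCount): array
{
    $rows = [];
    for ($r = 0; $r < $rowCount; $r++) {
        $row = [];
        for ($c = 0; $c < $colCount; $c++) {
            $i = $r * $colCount + $c;
            $isNull = (ord($bitmap[intdiv($i, 8)]) >> ($i % 8)) & 1;
            $row[] = $isNull ? null : $values[$i];
        }
        $rows[] = $row;
    }
    return $rows;
}

// Invented sample data: note the empty string next to a NULL in row 2.
$rows = [
    ['1', null, 'active'],
    ['2', '', null],
    [null, '2025-01-03', 'closed'],
];

[$values, $bitmap] = encodeRows($rows);
$decoded = decodeRows($values, $bitmap, count($rows), 3);

var_dump($decoded === $rows); // bool(true)
echo bin2hex($bitmap), "\n";  // prints "6200"
```

The decoder here derives the byte and bit position from the flat index instead of keeping two running counters; both approaches read the same bits, and this one is simply shorter to write.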

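Optimization Benefits

Adopting this architecture gave us the following concrete benefits:

Transmission efficiency: with the flattened layout, payload size grows linearly with the amount of data and does not depend on column-name length, so the bandwidth savings on large result sets are significant.
Exact NULL restoration: the client uses null_bitmap to restore the database's NULL state exactly, working around the limitation of gRPC's default types.
Parsing performance: for PHP and other languages, processing a one-dimensional array usually gives better CPU cache hit rates and less memory fragmentation than processing a large number of nested objects.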

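Summary

This optimization case shows the value of bringing a measured amount of low-level systems design thinking into modern distributed systems. By collaborating with AI, we stepped outside the frame of plain API design and used bitwise operations and data-structure optimization to solve, at very low cost, the inherent limitations of gRPC/Protobuf in database scenarios.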
