feat(reject-history): finalize sql runtime and archive completed openspec changes
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-03-04
@@ -0,0 +1,83 @@
## Context

The two-phase primary query of `reject-history` (`POST /api/reject-history/query`) currently reuses `reject_history/list.sql`. That SQL contains `COUNT(*) OVER()` and `OFFSET/FETCH` and was originally designed for the paginated `/api/reject-history/list` contract. When the primary query spans a large date range with batch chunking enabled, `reject_dataset_cache` re-executes the list query in an offset/limit loop, causing costly recomputation and long-tail latency (slow queries of 90-150 seconds have already been observed).

Meanwhile, `/api/reject-history/list` is still wired into the backend routes and tests, so the problem cannot be solved by editing `list.sql` directly without risking the pagination contract and legacy-caller compatibility.
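To make the replay cost concrete, here is a toy sketch (function and variable names are illustrative stand-ins, not the real `reject_dataset_cache` API): the paginated path pays the full query cost once per page, while a dedicated single-pass fetch pays it once per chunk.

```python
# Toy model of the two fetch strategies. Each call into the dataset stands in
# for one execution of the expensive list.sql (sort + COUNT(*) OVER() every
# time); names are illustrative, not the real service API.

def fetch_via_offset_loop(dataset, page_size, stats):
    """Old path: re-run the paginated query until a short page is returned."""
    rows, offset = [], 0
    while True:
        stats["executions"] += 1          # each page pays the full query cost
        page = dataset[offset:offset + page_size]
        rows.extend(page)
        if len(page) < page_size:
            return rows
        offset += page_size

def fetch_single_pass(dataset, stats):
    """New path: one dedicated non-paginated query per chunk."""
    stats["executions"] += 1
    return list(dataset)

dataset = list(range(10_000))
old_stats, new_stats = {"executions": 0}, {"executions": 0}
assert fetch_via_offset_loop(dataset, 500, old_stats) == dataset
assert fetch_single_pass(dataset, new_stats) == dataset
print(old_stats["executions"], new_stats["executions"])  # prints "21 1"
```

Both paths return the same rows; the difference is how many times the costly query runs.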
## Goals / Non-Goals

**Goals:**
- Switch `POST /api/reject-history/query` to a dedicated primary SQL template, removing its execution coupling to the paginated `list.sql`.
- Stop the batch chunk path from pulling full data through an `offset/limit` loop, reducing per-chunk recomputation.
- Keep the existing pagination and response semantics of `/api/reject-history/list` unchanged.
- Keep the data semantics and API field contracts of `/query`, `/view`, and `/export-cached` unchanged.

**Non-Goals:**
- Do not remove `/api/reject-history/list` or its legacy routes.
- Do not change business metric definitions (`REJECT_TOTAL_QTY`, `DEFECT_QTY`, policy filters).
- Do not introduce new infrastructure (new databases, new cache types, new third-party dependencies).
## Decisions

### D1: Add a dedicated primary SQL template instead of modifying the `list.sql` contract

- Decision: Add a dedicated primary-query SQL file (lot-level, non-paginated semantics) under `src/mes_dashboard/sql/reject_history/` for use by the dataset cache primary query.
- Why: Having `list.sql` serve both `/list` and `/query` is the root cause of the current performance/compatibility conflict. Splitting the sources satisfies both performance and compatibility.
- Alternatives considered:
  - Modify `list.sql` directly: would improve `/query` but very likely break the `/list` pagination/total_count contract.
  - Only tune concurrency parameters: mitigates the symptom but cannot remove the redundant computation cost inherent in the SQL itself.
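A minimal sketch of the intended contract split, with placeholder templates (neither string is the real file content; the real files live under `src/mes_dashboard/sql/reject_history/`): the list template keeps the pagination operators, while the primary template drops them along with the `offset`/`limit` binds.

```python
# Illustrative templates only. The point is the contract split: list.sql keeps
# pagination operators, the primary template drops them entirely.
import re

LIST_SQL = """
SELECT t.*, COUNT(*) OVER () AS TOTAL_COUNT
FROM reject_history t
WHERE t.report_date BETWEEN :date_from AND :date_to
ORDER BY t.report_date
OFFSET :offset ROWS FETCH NEXT :limit ROWS ONLY
"""

PRIMARY_SQL = """
SELECT t.*
FROM reject_history t
WHERE t.report_date BETWEEN :date_from AND :date_to
"""

def required_binds(sql: str) -> set:
    """Collect the named bind variables (:name) a template expects."""
    return set(re.findall(r":(\w+)", sql))

assert {"offset", "limit"} <= required_binds(LIST_SQL)
assert required_binds(PRIMARY_SQL) == {"date_from", "date_to"}
assert "COUNT(*) OVER" not in PRIMARY_SQL
```

The bind-variable check is also the natural seam for the regression tests described later.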
### D2: Both the direct and chunk paths of `reject_dataset_cache` use the primary SQL

- Decision: Switch both the direct path and the batch chunk path of `execute_primary_query()` to the primary SQL.
- Why: Changing only the direct path would leave long-range queries (the actual pain point) on the old chunk + paginated list pattern, with little benefit.
- Alternatives considered:
  - Change only the direct path: large-range queries would still hit slow queries.
  - Change only the chunk path: short-range direct queries would keep the list coupling and semantic inconsistency.
### D3: Chunk execution becomes a single query, with no offset/limit loop

- Decision: Fetch each chunk's complete dataset with a single primary SQL query, removing the `offset`-loop fetching logic.
- Why: The current approach re-runs a query that includes sorting and total_count several times within the same chunk, amplifying DB cost.
- Alternatives considered:
  - Keep the loop but raise the page size: only reduces the iteration count; the redundant computation and semantic burden remain.
### D4: Make `/list` compatibility a hard regression guard

- Decision: Keep the `/api/reject-history/list` route and `query_list()` logic unchanged, and add compatibility tests.
- Why: The project still has route, documentation, and smoke-test dependencies on it; regressions must be explicitly guarded against.
- Alternatives considered:
  - Remove `/list` in the same change: widens the scope and conflicts with the goal of this performance fix.
## Risks / Trade-offs

- [Risk] Column differences between the primary SQL and the list SQL break downstream pandas derivation
  → Mitigation: Define a minimal column contract for the primary SQL and add a unit test that checks column completeness.

- [Risk] A single-query chunk result is large enough to cause memory pressure
  → Mitigation: Keep the existing batch decomposition, max-rows/total-rows limits, and parquet spill guardrails.

- [Risk] Only `/query` is optimized while other slow-query sources remain
  → Mitigation: This change focuses on the reject-history primary path; other paths will be handled separately.

- [Trade-off] An additional SQL file raises maintenance cost
  → Mitigation: Document the split ("list for the paginated API / primary for the dataset cache") to prevent re-coupling.
## Migration Plan

1. Add the dedicated primary SQL and wire it into the SQL loader.
2. Update the direct and chunk paths of `reject_dataset_cache` to use the primary SQL.
3. Keep `reject_history_service.query_list()` and `list.sql` unchanged.
4. Extend the tests:
   - `/query` no longer sends `offset/limit` to the primary SQL path.
   - The `/list` pagination contract is unchanged.
5. Compare slow-query logs and frontend timeout events in the dev environment before promoting to production.

Rollback strategy:
- If the new primary SQL shows column or performance anomalies, revert `reject_dataset_cache` to the original `list.sql` path (keep the new SQL file but leave it disabled).
- The `/list` path is untouched, so its rollback risk is low.
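The first test item in step 4 could be sketched roughly as follows (the recorder and function names are hypothetical stand-ins for the real service and SQL execution layer):

```python
# Hypothetical regression-guard sketch: capture the bind variables handed to
# the SQL layer and assert the primary path no longer requires pagination
# binds. The real test would drive execute_primary_query() with a fake engine.

class BindRecorder:
    """Fake SQL executor that records every bind dict it receives."""
    def __init__(self):
        self.binds = []

    def execute(self, sql, params):
        self.binds.append(dict(params))
        return []  # returned rows are irrelevant to this guard

def run_primary_query(executor, date_from, date_to):
    # stands in for the primary path: compile the dedicated template and
    # pass only range filters, never offset/limit
    return executor.execute(
        "PRIMARY_SQL", {"date_from": date_from, "date_to": date_to}
    )

recorder = BindRecorder()
run_primary_query(recorder, "2026-01-01", "2026-03-01")
for params in recorder.binds:
    assert "offset" not in params and "limit" not in params
```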
## Open Questions

- Should the primary SQL file be named `primary.sql` or `dataset_primary.sql` (must match existing naming conventions)?
- Should the API `meta` expose a diagnostic field (e.g. `primary_sql_source=dedicated`) for production tracing?
@@ -0,0 +1,38 @@
## Why

The primary query path of reject-history's `POST /api/reject-history/query` currently reuses `list.sql` (which carries `COUNT(*) OVER()` and `OFFSET/FETCH` pagination semantics). Under large date ranges and batch chunking this causes costly recomputation; slow queries of 90-150 seconds and frontend timeouts have already occurred repeatedly. Because `list.sql` also serves the legacy `/api/reject-history/list`, changing it directly risks breaking the existing pagination contract, so the primary query source must be decoupled from the list query.
## What Changes

- Add a dedicated reject-history primary SQL file (lot-level, non-paginated semantics) for use by the dataset cache primary query.
- Update `reject_dataset_cache.execute_primary_query()` (both the direct and engine chunk paths) to use the dedicated primary SQL, dropping the dependency on the `offset/limit` pagination loop over `list.sql`.
- Keep the current behavior and response contract of `list.sql` and `GET /api/reject-history/list` (sorting, pagination, `TOTAL_COUNT`) unchanged.
- Add regression guards: new/updated tests verifying that the `/list` contract is unchanged and that the `/query` source has switched to the dedicated primary SQL.
- Keep the data semantics and field contracts of `/query`, `/view`, and `/export-cached` unchanged (non-breaking).
## Capabilities

### New Capabilities
- `reject-history-primary-query-source-isolation`: give the primary query an independent data source, decoupled from the paginated list SQL, reducing large-range query latency and timeout risk.

### Modified Capabilities
- `reject-history-api`: revise the primary-query implementation requirements, explicitly mandating that the `/query` and `/list` query paths be decoupled while the `/list` contract stays compatible.
- `batch-query-resilience`: revise the reject-history chunk execution requirements, removing the reliance on iterating the paginated list SQL to fetch full data and reducing per-chunk recomputation cost.
## Impact

- Affected backend code:
  - `src/mes_dashboard/services/reject_dataset_cache.py`
  - `src/mes_dashboard/services/reject_history_service.py` (if a SQL template slot needs extending)
  - `src/mes_dashboard/sql/reject_history/` (new dedicated primary SQL)
  - `src/mes_dashboard/routes/reject_history_routes.py` (only if extra meta/diagnostic info is added)
- Affected tests:
  - `tests/test_reject_dataset_cache.py`
  - `tests/test_reject_history_service.py`
  - `tests/test_reject_history_routes.py`
- API surface:
  - no endpoints added or removed
  - no breaking changes to existing parameters/responses
- Dependencies/infra:
  - no new external dependencies
  - the existing slow-query engine, batch engine, and cache/spool mechanisms are reused
@@ -0,0 +1,25 @@
## ADDED Requirements

### Requirement: reject_dataset_cache batch primary execution SHALL avoid paginated replay loops
Batch chunk execution for reject-history primary query SHALL avoid page-by-page replay against paginated list SQL semantics.

#### Scenario: Chunk execution avoids offset iteration
- **WHEN** batch engine executes a reject-history chunk in `execute_primary_query()`
- **THEN** chunk execution SHALL NOT iterate through `offset` pages to assemble full chunk data
- **THEN** chunk execution SHALL retrieve chunk data via the dedicated primary SQL path

#### Scenario: Chunk bind contract excludes pagination parameters
- **WHEN** chunk query parameters are prepared for batch execution
- **THEN** `offset` and `limit` SHALL NOT be required bind variables for normal chunk retrieval

### Requirement: Partial-failure resilience SHALL remain intact after source decoupling
Decoupling from paginated list SQL SHALL NOT regress partial-failure metadata behavior.

#### Scenario: Failed chunks still produce partial-failure metadata
- **WHEN** one or more reject-history chunks fail during batch execution
- **THEN** response `meta` SHALL still report partial-failure indicators according to existing resilience contract

#### Scenario: Successful chunks still merge and continue
- **WHEN** some chunks succeed and others fail
- **THEN** the system SHALL continue to merge successful chunks and return partial results
- **THEN** progress metadata SHALL remain available for diagnostics
@@ -0,0 +1,26 @@
## ADDED Requirements

### Requirement: Reject History API SHALL preserve paginated list contract after primary-query decoupling
The API SHALL keep `GET /api/reject-history/list` behavior and response schema stable after `/query` switches to a dedicated SQL source.

#### Scenario: List endpoint pagination schema remains stable
- **WHEN** `GET /api/reject-history/list` is called with valid date range and paging params
- **THEN** the response SHALL still include `items` and `pagination` with `page`, `perPage`, `total`, and `totalPages`
- **THEN** the endpoint SHALL continue to support page-bound retrieval semantics

#### Scenario: List endpoint sorting semantics remain stable
- **WHEN** two equivalent list requests are executed before and after the primary-query decoupling change
- **THEN** row ordering semantics SHALL remain consistent with existing list contract

### Requirement: Reject History API primary response contract SHALL remain backward compatible
Switching the primary SQL source SHALL NOT alter `/api/reject-history/query` response fields consumed by the current UI flow.

#### Scenario: Primary query response shape is unchanged
- **WHEN** `POST /api/reject-history/query` succeeds
- **THEN** the response SHALL continue to include `query_id`, `summary`, `trend`, `detail`, `available_filters`, and `meta`
- **THEN** existing `/view` and `/export-cached` workflows SHALL remain compatible with the returned `query_id`

#### Scenario: Cache-hit behavior remains unchanged
- **WHEN** the same primary query is executed again within cache lifetime
- **THEN** cache-hit behavior SHALL remain functionally equivalent to pre-decoupling behavior
- **THEN** response field names and types SHALL remain stable
@@ -0,0 +1,25 @@
## ADDED Requirements

### Requirement: Reject-history primary query SHALL use a dedicated non-paginated SQL source
The system SHALL execute `POST /api/reject-history/query` against a dedicated primary SQL template that is isolated from the paginated list SQL contract.

#### Scenario: Direct primary path uses dedicated SQL
- **WHEN** `execute_primary_query()` runs in direct mode (no batch decomposition)
- **THEN** it SHALL compile SQL from the dedicated primary template
- **THEN** it SHALL NOT require `offset` or `limit` bind parameters for result retrieval

#### Scenario: Batch chunk path uses dedicated SQL
- **WHEN** `execute_primary_query()` runs in batch chunk mode
- **THEN** each chunk query SHALL compile SQL from the same dedicated primary template
- **THEN** chunk queries SHALL apply chunk-specific filters without relying on page-by-page replay semantics

### Requirement: Dedicated primary SQL SHALL exclude pagination-only operators
The dedicated primary SQL template SHALL avoid pagination-only constructs used by `/api/reject-history/list`.

#### Scenario: Primary SQL excludes total-count window computation
- **WHEN** the dedicated primary SQL is loaded for `/query`
- **THEN** it SHALL NOT include `COUNT(*) OVER()` as a required output field

#### Scenario: Primary SQL excludes offset-fetch pagination
- **WHEN** the dedicated primary SQL is loaded for `/query`
- **THEN** it SHALL NOT include `OFFSET ... FETCH NEXT ...` pagination clauses
@@ -0,0 +1,25 @@
## 1. Primary SQL Source Isolation

- [x] 1.1 Add a dedicated reject-history primary SQL file under `src/mes_dashboard/sql/reject_history/` without paginated list operators
- [x] 1.2 Ensure the new SQL template preserves the column contract required by dataset-cache derivation (`summary`/`trend`/`detail`/`pareto`)
- [x] 1.3 Keep `src/mes_dashboard/sql/reject_history/list.sql` unchanged for legacy paginated list use

## 2. Service Path Decoupling

- [x] 2.1 Update `reject_dataset_cache.execute_primary_query()` direct path to compile and execute the dedicated primary SQL template
- [x] 2.2 Update reject-history batch chunk execution path to use the dedicated primary SQL template
- [x] 2.3 Remove reject chunk data assembly logic that depends on `offset/limit` pagination replay
- [x] 2.4 Preserve existing cache/spool write path and response shape (`query_id`, `summary`, `trend`, `detail`, `available_filters`, `meta`)

## 3. Compatibility and Resilience Guards

- [x] 3.1 Verify `query_list()` and `GET /api/reject-history/list` pagination behavior remains unchanged
- [x] 3.2 Verify partial-failure metadata behavior remains unchanged for batch mode (`has_partial_failure`, failed chunks/ranges)
- [x] 3.3 Add defensive logging/diagnostics confirming primary query source path selection for troubleshooting

## 4. Tests and Verification

- [x] 4.1 Add or update unit tests in `tests/test_reject_dataset_cache.py` to assert primary/chunk paths no longer require `offset/limit`
- [x] 4.2 Add or update tests in `tests/test_reject_history_service.py` and `tests/test_reject_history_routes.py` to assert `/list` contract compatibility
- [x] 4.3 Run targeted test suite for reject-history cache/service/routes and batch resilience coverage
- [ ] 4.4 Perform manual validation of large-range reject-history query latency and ensure no frontend timeout regression (requires integration env + Oracle data + frontend flow)
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-03-04
@@ -0,0 +1,124 @@
## Context

reject-history's post-cache queries currently rely on pandas (`apply_view`, `compute_batch_pareto`, `export_csv_from_cache`). On large datasets (hundreds of MB) this produces high peak RSS, causing the interactive memory guard to reject requests and the worker RSS guard to trigger restarts. The system already has a parquet spool (`query_spool_store`), but downstream computation still usually reloads the data as a DataFrame and runs full-table operations on it.

The design goal is to migrate post-cache computation to a SQL runtime (DuckDB) to reduce Python memory pressure, without changing the API interface or response schema, while keeping the existing guards as a last line of defense.

Constraints:
- Do not break the existing frontend parameters and data structures of `reject-history`.
- Preserve the materialized pareto hit path and its semantics.
- Maintain filter consistency and data completeness for detail/export.
- The rollout must be switchable and revertible.
## Goals / Non-Goals

**Goals:**
- Introduce a DuckDB SQL execution path over cache/spool data so that pandas full-table copy/groupby is no longer the primary path.
- Phase 1: migrate `batch-pareto` first, routing materialized misses to cache-SQL.
- Phase 2: migrate `view`, producing summary/trend/detail pagination via SQL aggregation and queries.
- Phase 3: migrate `export-cached` to streamed output, avoiding a one-shot `to_dict` full load.
- Keep the existing memory guards and continue observing them; adjust thresholds only after the new path stabilizes.

**Non-Goals:**
- Do not change the core strategy of the Oracle primary query and chunk engine.
- Do not add or remove reject-history API endpoints.
- Do not change the frontend query flow, URL parameter format, or field naming.
- Do not rewrite the cache computation of other pages (hold/resource/material-trace) in this change.
## Decisions

### D1. Use DuckDB as the cache-SQL runtime (not SQLite)

- **Decision**: Add a DuckDB dependency as the query and aggregation engine over parquet/spool data.
- **Rationale**:
  - DuckDB can query parquet directly and supports predicate pushdown, projection pushdown, and aggregation/window functions, which matches the requirements here.
  - SQLite cannot scan parquet natively; data would have to be loaded first, adding an extra memory and I/O cost.
  - Compared with pandas, DuckDB makes worker RSS easier to control on large filter/aggregation paths.
- **Alternatives considered**:
  - pandas optimizations (fewer columns, category dtype): already done, but high RSS and guard false rejections remain.
  - SQLite temporary tables: require an ETL step and cannot use the parquet spool directly.
### D2. Build a reject-history-specific cache-SQL facade

- **Decision**: Add `reject_cache_sql_runtime` (name adjustable during implementation) as a single module providing:
  - source resolution (parquet spool first, fallback when necessary)
  - parameter binding and safe SQL fragment assembly
  - shared filter construction (policy/supplementary/trend/pareto selections)
- **Rationale**:
  - Avoids scattering SQL string assembly across routes/services, reducing semantic drift.
  - Centralizes the parity rules, making side-by-side testing against the legacy pandas path easier.
- **Alternatives considered**:
  - Inline the SQL inside `reject_dataset_cache.py`: quick, but poorly maintainable with unclear test seams.
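The facade's fragment assembly could look roughly like this sketch (the column names and allow-list are assumptions): user-supplied values always travel as bound parameters, and only vetted column names are ever interpolated into the SQL text.

```python
# Sketch of safe filter-fragment assembly for the facade (names are
# assumptions, not the real module API). Values become bound parameters;
# column names must pass an allow-list before touching the SQL string.

ALLOWED_COLUMNS = {"reject_reason", "package", "workcenter"}

def build_where(filters: dict):
    """Return (sql_fragment, params) for a dict of column -> list of values."""
    clauses, params = [], []
    for column, values in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown filter column: {column}")
        if not values:
            continue
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    sql = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return sql, params

sql, params = build_where({"reject_reason": ["R1", "R2"], "package": ["P9"]})
assert sql == " WHERE reject_reason IN (?, ?) AND package IN (?)"
assert params == ["R1", "R2", "P9"]
```

Centralizing this in one helper is what keeps `view`, `batch-pareto`, and `export-cached` filter semantics from drifting apart.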
### D3. Migrate the batch-pareto path first, keeping the materialized hit path

- **Decision**:
  - `try_materialized_batch_pareto` behaves unchanged on a hit.
  - On miss/stale/build-fail, cache-SQL batch computation runs first.
  - Only when cache-SQL is unavailable does the path fall back to legacy DataFrame computation.
- **Rationale**:
  - `batch-pareto` is a high-frequency, high-cost aggregation point, so migrating it yields the biggest win.
  - Keeping the existing materialized fast path avoids rework.
- **Alternatives considered**:
  - Remove the materialized layer outright: high risk, and it would forfeit the existing hit-rate benefit.
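The fallback order can be sketched as follows (function names are hypothetical stand-ins): materialized snapshot first, then cache-SQL, then legacy, with the chosen source and any fallback reason recorded for telemetry.

```python
# Fallback-chain sketch for batch-pareto (names are illustrative). The real
# orchestration lives around try_materialized_batch_pareto.

def batch_pareto(snapshot_reader, cache_sql, legacy, telemetry):
    result = snapshot_reader()
    if result is not None:                 # materialized hit: unchanged path
        telemetry["source"] = "materialized"
        return result
    try:
        result = cache_sql()               # first fallback: cache-SQL
        telemetry["source"] = "cache_sql"
        return result
    except RuntimeError as exc:            # cache-SQL disabled/unavailable
        telemetry["fallback_reason"] = str(exc)
    telemetry["source"] = "legacy"         # last resort: pandas path
    return legacy()

def failing_cache_sql():
    raise RuntimeError("runtime disabled")

telemetry = {}
out = batch_pareto(
    snapshot_reader=lambda: None,          # simulate a materialized miss
    cache_sql=failing_cache_sql,
    legacy=lambda: {"dimensions": {}},
    telemetry=telemetry,
)
assert out == {"dimensions": {}}
```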
### D4. `view` becomes SQL aggregation + SQL pagination

- **Decision**:
  - Compute `summary`/`trend` via SQL aggregation.
  - Compute `detail` by applying all filters in SQL, then sorting and paginating in SQL.
  - Keep the existing output structure (`analytics_raw`, `summary`, `detail.pagination`).
- **Rationale**:
  - Fixes the current "guard first, filter later" ordering that causes many false rejections.
  - Shortens the lifetimes of pandas' multiple intermediate DataFrames.
### D5. `export-cached` becomes a streamed export

- **Decision**:
  - Use a generator to read in batches and write out the CSV response.
  - No longer build the complete rows list / `to_dict` before responding.
- **Rationale**:
  - Export is the classic large-output scenario; streaming effectively lowers peak RSS.
  - The existing filter conditions and column contract stay unchanged.
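The streamed export can be sketched as a generator (the batch source below is a stand-in for batched reads from the SQL runtime); peak memory is then bounded by the batch size instead of the full result set.

```python
# Streaming-CSV sketch: yield encoded chunks, one for the header and one per
# batch, instead of building the full rows list in memory first.
import csv
import io

def stream_csv(header, batches):
    """Yield CSV byte chunks from an iterable of row batches."""
    def encode(rows):
        buf = io.StringIO()
        csv.writer(buf).writerows(rows)
        return buf.getvalue().encode("utf-8")
    yield encode([header])
    for batch in batches:
        yield encode(batch)

chunks = list(stream_csv(
    ["lot_id", "reject_qty"],
    [[("LOT1", 3), ("LOT2", 5)], [("LOT3", 1)]],   # fake batched reads
))
body = b"".join(chunks).decode("utf-8")
assert body.splitlines() == ["lot_id,reject_qty", "LOT1,3", "LOT2,5", "LOT3,1"]
```

In Flask, such a generator can be passed directly to a `Response` with a CSV mimetype so chunks are sent as they are produced.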
### D6. Roll out gradually behind feature flags, keeping a dual-path fallback

- **Decision**: Add runtime switches (names to be finalized during implementation), at minimum:
  - a global switch (enable/disable cache-SQL)
  - endpoint-level switches (enable batch/view/export independently)
  - a fallback switch (allow reverting to legacy pandas)
- **Rationale**:
  - Enables online canarying and fast rollback.
  - Reduces the risk of a one-shot replacement.
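A minimal flag-resolution sketch (the flag names below are made up for illustration; the real names are to be finalized per the decision above): an endpoint takes the SQL path only when both the global switch and its own endpoint switch allow it.

```python
# Hypothetical flag resolution: global switch must be on, and the
# endpoint-level switch (default on) must not be explicitly off.
import os

def sql_runtime_enabled(endpoint: str, env=os.environ) -> bool:
    def flag(name, default="0"):
        return env.get(name, default) == "1"
    return flag("REJECT_CACHE_SQL_ENABLED") and flag(
        f"REJECT_CACHE_SQL_{endpoint.upper()}_ENABLED", default="1"
    )

env = {"REJECT_CACHE_SQL_ENABLED": "1", "REJECT_CACHE_SQL_VIEW_ENABLED": "0"}
assert sql_runtime_enabled("batch_pareto", env) is True   # endpoint default on
assert sql_runtime_enabled("view", env) is False          # explicitly off
assert sql_runtime_enabled("view", {}) is False           # global switch off
```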
## Risks / Trade-offs

- **[DuckDB dependency and runtime-environment compatibility]** → Pin a known-good version in `requirements`/`environment.yml`; add checks to CI and the VM startup scripts.
- **[Semantic drift between SQL and pandas]** → Build parity tests (same query_id, same filters; compare summary/trend/detail/pareto results).
- **[Inconsistent behavior when the spool is missing and the path falls back]** → Define an explicit source priority order and fallback-reason telemetry to guarantee observability.
- **[Query plans degrading under extreme conditions]** → Keep the guards and timeouts; add maximum scan/output limits to the SQL runtime if needed.
- **[Cost of maintaining both paths during the initial rollout]** → Enable in phases and converge away from the legacy path once stable.
## Migration Plan

1. **Phase 1 (batch-pareto)**
   - Introduce the DuckDB runtime and basic source resolution.
   - Route the `batch-pareto` materialized-miss path to cache-SQL.
   - Add endpoint-level switches and fallback telemetry.

2. **Phase 2 (view via SQL)**
   - Move `summary/trend/detail` to the SQL path.
   - Adjust where the memory guard triggers (shrink the data first and then guard, or gate on an estimate of the SQL result size).

3. **Phase 3 (streamed export)**
   - Switch `export-cached` to streamed CSV generation.
   - Verify filter consistency with the detail data.

4. **Rollout / Rollback**
   - Canary by default, enabling in order (batch -> view -> export).
   - If error rates or result drift rise, turn off the corresponding endpoint switch to fall back to legacy.
## Open Questions

- Must `view`'s `analytics_raw` keep exactly the same ordering (in case the frontend has an implicit dependency on it)?
- Should a dedicated "cache-SQL memory budget" metric be introduced in this change, or should the existing worker guard telemetry be reused first?
@@ -0,0 +1,42 @@
## Why

reject-history's interactive post-cache queries still run full-table filter/groupby/copy with pandas in worker memory, so large-range queries keep RSS high for long periods, triggering the memory guard, batch-pareto rejections, and worker restarts. The parquet spool already exists; computation should move to "SQL after cache" to lower peak memory while preserving the existing API contract.
## What Changes

- Add a cache-SQL execution layer (DuckDB) for reject-history that runs SQL queries and aggregations against parquet spool / cache data first, instead of reloading the whole pandas DataFrame for computation.
- Phase 1: switch `/api/reject-history/batch-pareto` to the DuckDB path (high benefit, low risk), keeping the existing cross-filter, top80, top20 behavior and response schema.
- Phase 2: move `/api/reject-history/view` to SQL (summary/trend aggregation and detail pagination all via SQL), reducing in-memory intermediate data.
- Phase 3: switch `/api/reject-history/export-cached` to streamed export, avoiding a full `to_dict` load into memory first.
- Keep the existing worker / interactive memory guards as a last line of defense; adjust guard thresholds based on monitoring data once the SQL path stabilizes.
- Add observability and regression tests to ensure frontend hints, detail data semantics, and export completeness remain compatible (non-breaking).
## Capabilities

### New Capabilities
- `reject-history-cache-sql-runtime`: provide SQL execution (DuckDB) and query routing over reject-history cache/spool data, turning interactive queries from pandas full-table computation into SQL pushdown / aggregation.

### Modified Capabilities
- `reject-history-api`: revise the backend computation-path requirements of `/batch-pareto`, `/view`, and `/export-cached` to be cache-SQL-first with an unchanged response contract.
- `reject-history-pareto-materialized-aggregate`: revise the materialized miss/fallback behavior to prefer the cache-SQL computation path over full-table DataFrame regrouping.
- `reject-history-detail-export-parity`: extend the export requirement to streamed output while keeping the data scope and field semantics consistent with the current filters.
## Impact

- Affected backend code:
  - `src/mes_dashboard/services/reject_dataset_cache.py`
  - `src/mes_dashboard/services/reject_pareto_materialized.py`
  - `src/mes_dashboard/core/query_spool_store.py` (read interface/metadata support for the SQL runtime)
  - `src/mes_dashboard/routes/reject_history_routes.py`
  - `src/mes_dashboard/sql/reject_history/` (new/adjusted SQL fragments)
- Affected tests:
  - `tests/test_reject_dataset_cache.py`
  - `tests/test_reject_history_routes.py`
  - `tests/test_reject_pareto_materialized.py`
  - new cache-SQL runtime and streamed-export tests
- API surface:
  - no new endpoints
  - no changes to existing parameters or response schema (non-breaking)
- Dependencies/infra:
  - new DuckDB Python dependency
  - possibly a few SQL-runtime env switches (enable, fallback, concurrency/memory caps)
@@ -0,0 +1,68 @@
## MODIFIED Requirements

### Requirement: Reject History API SHALL provide batch Pareto endpoint with cross-filter
The API SHALL provide a batch Pareto endpoint that returns all 6 dimension Pareto results in a single response, supporting cross-dimension filtering with exclude-self logic, and SHALL prefer materialized Pareto snapshots, then cache-SQL runtime, before considering legacy full-detail regrouping.

#### Scenario: Batch Pareto response structure
- **WHEN** `GET /api/reject-history/batch-pareto` is called with valid `query_id`
- **THEN** response SHALL be `{ success: true, data: { dimensions: { reason: {...}, package: {...}, type: {...}, workflow: {...}, workcenter: {...}, equipment: {...} } } }`
- **THEN** each dimension object SHALL include `items` array with schema (`reason`, `metric_value`, `pct`, `cumPct`, `MOVEIN_QTY`, `REJECT_TOTAL_QTY`, `DEFECT_QTY`, `count`)

#### Scenario: Cross-filter exclude-self logic
- **WHEN** `sel_reason=A&sel_type=X` is provided
- **THEN** reason Pareto SHALL be computed with type=X filter applied (but NOT reason=A filter)
- **THEN** type Pareto SHALL be computed with reason=A filter applied (but NOT type=X filter)
- **THEN** package/workflow/workcenter/equipment Paretos SHALL be computed with both reason=A AND type=X filters applied
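The exclude-self rule above can be sketched in a few lines (row fields and helper names are illustrative): each dimension's Pareto is computed with every selection applied except the one on its own dimension.

```python
# Exclude-self cross-filter sketch matching the scenario above. Rows and
# selections use illustrative field names.

def cross_filter(rows, selections, target_dim):
    """Apply every selection except the target dimension's own."""
    active = {dim: set(vals) for dim, vals in selections.items()
              if dim != target_dim}
    return [r for r in rows
            if all(r[dim] in vals for dim, vals in active.items())]

rows = [
    {"reason": "A", "type": "X", "package": "P1"},
    {"reason": "A", "type": "Y", "package": "P1"},
    {"reason": "B", "type": "X", "package": "P2"},
]
sel = {"reason": ["A"], "type": ["X"]}

# reason Pareto: only the type=X selection applies
assert cross_filter(rows, sel, "reason") == [rows[0], rows[2]]
# type Pareto: only the reason=A selection applies
assert cross_filter(rows, sel, "type") == [rows[0], rows[1]]
# other dimensions: both selections apply
assert cross_filter(rows, sel, "package") == [rows[0]]
```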
#### Scenario: Empty selections return unfiltered Paretos
- **WHEN** batch-pareto is called with no `sel_*` parameters
- **THEN** all 6 dimensions SHALL return their full Pareto distribution (subject to `pareto_scope`)

#### Scenario: Cache-only computation
- **WHEN** `query_id` does not exist in cache
- **THEN** the endpoint SHALL return HTTP 400 with error message indicating cache miss
- **THEN** the endpoint SHALL NOT fall back to Oracle query

#### Scenario: Materialized snapshot preferred
- **WHEN** a valid and fresh materialized Pareto snapshot exists for the request context
- **THEN** the endpoint SHALL return results from that snapshot
- **THEN** the endpoint SHALL avoid full lot-level regrouping for the same request

#### Scenario: Materialized miss fallback behavior
- **WHEN** materialized snapshot is unavailable, stale, or build fails
- **THEN** the endpoint SHALL fall back to cache-SQL computation before legacy DataFrame computation
- **THEN** the response schema and filter semantics SHALL remain unchanged

#### Scenario: SQL fallback unavailable
- **WHEN** cache-SQL runtime is disabled or unavailable under materialized miss
- **THEN** the endpoint SHALL follow configured fallback policy deterministically
- **THEN** the response metadata SHALL expose the fallback reason code

#### Scenario: Supplementary and policy filters apply
- **WHEN** batch-pareto is called with supplementary filters (packages, workcenter_groups, reason) and policy toggles
- **THEN** all 6 dimension Paretos SHALL be computed after applying policy and supplementary filters first (before cross-filter)

#### Scenario: Display scope (TOP20) support
- **WHEN** `pareto_display_scope=top20` is provided
- **THEN** applicable dimensions (type, workflow, equipment) SHALL truncate results to top 20 items after sorting
- **WHEN** `pareto_display_scope` is omitted or `all`
- **THEN** all items SHALL be returned (subject to `pareto_scope` filter)
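A rough sketch consistent with the scope scenarios above (the cumulative-percentage cut illustrates `pareto_scope=top80`; exact rounding and tie rules follow the real implementation, and the item values below are made up):

```python
# Scope sketch: top80 keeps items until cumulative share reaches 80%,
# and the display scope truncates what is shown after sorting.

def apply_scopes(items, scope=None, display_scope=None):
    items = sorted(items, key=lambda it: it["metric_value"], reverse=True)
    total = sum(it["metric_value"] for it in items) or 1
    cum = 0.0
    scoped = []
    for it in items:
        cum += it["metric_value"] / total * 100
        scoped.append({**it, "cumPct": round(cum, 2)})
        if scope == "top80" and cum >= 80:
            break
    if display_scope == "top20":
        scoped = scoped[:20]        # display-only truncation
    return scoped

items = [{"reason": r, "metric_value": v}
         for r, v in [("A", 50), ("B", 30), ("C", 15), ("D", 5)]]
top80 = apply_scopes(items, scope="top80")
assert [it["reason"] for it in top80] == ["A", "B"]  # 50% then 80% cumulative
```

Note the display scope only limits what is shown; per the export-parity spec below it must never remove rows from export output.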
## ADDED Requirements

### Requirement: Reject History API SHALL provide SQL-first cache view derivation with schema parity
The API SHALL derive cache-backed `view` responses through SQL-first runtime when enabled, while preserving existing response schema and filter behavior.

#### Scenario: View response contract preserved
- **WHEN** `GET /api/reject-history/view` is called with valid `query_id`
- **THEN** response payload SHALL keep existing top-level structure containing `analytics_raw`, `summary`, and `detail`
- **THEN** pagination field names and types SHALL remain compatible with current frontend usage

#### Scenario: View SQL-first with deterministic fallback
- **WHEN** SQL runtime is enabled for `view`
- **THEN** summary/trend/detail derivation SHALL use SQL runtime as primary path
- **THEN** fallback to legacy path SHALL follow configured policy and preserve response schema

#### Scenario: Cache-expired behavior unchanged
- **WHEN** `query_id` cache has expired
- **THEN** endpoint SHALL return the same cache-expired status behavior as current implementation
@@ -0,0 +1,45 @@
## ADDED Requirements

### Requirement: Reject History cache-SQL runtime SHALL execute against cached datasets without full DataFrame materialization
The system SHALL provide a SQL runtime for reject-history cached queries that reads from cache/spool data sources and avoids requiring full pandas DataFrame materialization as the primary execution path.

#### Scenario: Spool-backed execution
- **WHEN** a valid `query_id` has parquet spool metadata available
- **THEN** the runtime SHALL execute SQL directly against the spool dataset
- **THEN** the request SHALL NOT require loading the entire dataset into a pandas DataFrame before filtering and aggregation

#### Scenario: Source resolution fallback
- **WHEN** spool data is unavailable for a valid `query_id`
- **THEN** the runtime SHALL follow a deterministic fallback order configured by system policy
- **THEN** the fallback decision SHALL be observable via telemetry metadata

### Requirement: Reject History cache-SQL runtime SHALL preserve filter semantics across batch/view/export paths
The runtime SHALL apply policy, supplementary, trend-date, and pareto selection filters with the same business semantics used by existing reject-history APIs.

#### Scenario: Batch pareto filter parity
- **WHEN** `batch-pareto` is requested with policy toggles, supplementary filters, trend dates, and `sel_*` selections
- **THEN** SQL runtime output SHALL preserve exclude-self cross-filter semantics for each dimension
- **THEN** `pareto_scope=top80` and `pareto_display_scope=top20` behavior SHALL remain unchanged

#### Scenario: View filter parity
- **WHEN** `view` is requested with `query_id` and active supplementary/interactive filters
- **THEN** `summary`, `trend`, and paginated `detail` SHALL all reflect the same effective filter set
- **THEN** response schema SHALL remain compatible with existing frontend contracts

#### Scenario: Export filter parity
- **WHEN** `export-cached` is requested with the same filters as `view`
- **THEN** exported rows SHALL represent the same filtered data scope as view/detail
- **THEN** column naming and field semantics SHALL remain unchanged

### Requirement: Reject History cache-SQL runtime SHALL support controlled rollout and safe fallback
The system SHALL expose runtime switches to enable or disable SQL execution per endpoint and SHALL support fallback to legacy computation when SQL runtime is unavailable.

#### Scenario: Endpoint-level enablement
- **WHEN** SQL runtime is enabled only for `batch-pareto`
- **THEN** `batch-pareto` SHALL use SQL runtime
- **THEN** `view` and `export-cached` SHALL continue using legacy path until explicitly enabled

#### Scenario: SQL runtime fallback
- **WHEN** SQL runtime encounters an execution failure for a request
- **THEN** the system SHALL apply configured fallback behavior (legacy path or fail-fast)
- **THEN** the response or metadata SHALL include a deterministic fallback reason code for operations troubleshooting
@@ -0,0 +1,28 @@
## MODIFIED Requirements

### Requirement: Cached reject-history export SHALL support Pareto multi-select filter parity
The cached export endpoint SHALL support Pareto multi-select context so that exported rows match the currently drilled-down detail scope, and SHALL stream response output to avoid requiring full in-memory row materialization before sending data.

#### Scenario: Apply selected Pareto dimension values
- **WHEN** export request provides `pareto_dimension` and one or more `pareto_values`
- **THEN** the backend SHALL apply an OR-match filter against the mapped dimension column
- **THEN** only rows matching selected values SHALL be exported

#### Scenario: No Pareto selection keeps existing behavior
- **WHEN** `pareto_values` is absent or empty
- **THEN** export SHALL apply no extra Pareto-selected-item filter
- **THEN** existing supplementary and interactive filters SHALL still apply

#### Scenario: Invalid Pareto dimension is rejected
- **WHEN** `pareto_dimension` is not one of supported dimensions
- **THEN** API SHALL return HTTP 400 with descriptive validation error

#### Scenario: Export response is streamed
- **WHEN** cached export is requested for a large filtered dataset
- **THEN** endpoint SHALL stream CSV rows incrementally to the client
- **THEN** endpoint SHALL NOT require building a full rows list in memory before response begins

#### Scenario: Export scope matches view detail scope
- **WHEN** `view` and `export-cached` are called with the same `query_id` and filter set
- **THEN** exported rows SHALL represent the same filtered data scope as detail results
- **THEN** display-only pareto truncation rules SHALL NOT remove rows from export output
@@ -0,0 +1,19 @@
## ADDED Requirements

### Requirement: Materialized Pareto orchestration SHALL use cache-SQL fallback before legacy DataFrame regrouping
When materialized snapshots are not available, orchestration SHALL prefer cache-SQL runtime to compute batch Pareto results before attempting legacy DataFrame regrouping.

#### Scenario: Materialized miss uses cache-SQL fallback
- **WHEN** snapshot read misses, expires, or build fails for a batch-pareto request
- **THEN** orchestration SHALL invoke cache-SQL batch pareto computation as the first fallback path
- **THEN** returned payload SHALL preserve the same dimensions and item schema contract

#### Scenario: Cache-SQL unavailable fallback policy
- **WHEN** cache-SQL fallback is disabled or unavailable after materialized miss
- **THEN** orchestration SHALL apply configured fallback policy (legacy compute or fail-fast)
- **THEN** fallback reason SHALL be recorded in metadata for diagnostics

#### Scenario: Fallback path preserves cross-filter semantics
- **WHEN** cache-SQL fallback is used with multi-dimension `sel_*` filters
- **THEN** exclude-self cross-filter semantics SHALL remain equivalent to materialized and legacy behavior
- **THEN** `pareto_scope` and `pareto_display_scope` rules SHALL remain unchanged
@@ -0,0 +1,34 @@
## 1. SQL Runtime Foundation

- [x] 1.1 Add the reject-history cache-SQL runtime module (DuckDB connection management, source resolution, parameter-binding helpers)
- [x] 1.2 Add parquet-spool-first reads and a source fallback strategy (with deterministic fallback reasons)
- [x] 1.3 Add runtime feature flags (global and endpoint-level switches) with defaults
- [x] 1.4 Update dependency configuration (`requirements.txt` / `pyproject.toml` / `environment.yml`) and startup compatibility checks

## 2. Batch Pareto SQL-first Path

- [x] 2.1 Route `batch-pareto` to the cache-SQL computation path on materialized miss/stale/build-fail
- [x] 2.2 Preserve and verify exclude-self cross-filter, `top80`, and `top20` behavior
- [x] 2.3 Implement the fallback policy when SQL is unavailable (legacy or fail-fast)
- [x] 2.4 Add batch-pareto parity tests (SQL vs legacy) and fallback metadata tests

## 3. View via SQL

- [x] 3.1 Rebuild `summary` and `trend` aggregation in SQL (keeping the column and precision contract)
- [x] 3.2 Implement detail querying, sorting, and pagination in SQL (including policy/supplementary/trend/pareto selections)
- [x] 3.3 Switch `/api/reject-history/view` to the SQL-first path while keeping schema compatibility
- [x] 3.4 Add view parity tests and a cache-expired behavior regression test

## 4. Streamed Export Cached

- [x] 4.1 Switch `export-cached` to generator/streaming CSV output
- [x] 4.2 Ensure export and detail share the same filter-composition logic to keep scope parity
- [x] 4.3 Remove the full rows list / `to_dict` dependency so exports no longer load everything into memory first
- [x] 4.4 Add large-data export tests (streamed output, column contract, filter consistency)

## 5. Observability, Guard, Rollout

- [x] 5.1 Add SQL runtime telemetry (source, fallback reason, duration, row counts)
- [x] 5.2 Keep the existing memory guards; adjust trigger points and messages to fit the SQL-first flow
- [x] 5.3 Define the rollout strategy (batch -> view -> export) and matching rollback switches
- [x] 5.4 Update the operations docs and verification checklist (frontend hints, exports unaffected by display limits, load-test items)