feat(lineage): unified LineageEngine, EventFetcher, and progressive trace API

Introduce a unified Seed→Lineage→Event pipeline replacing per-page Python
BFS with Oracle CONNECT BY NOCYCLE queries, add staged /api/trace/*
endpoints with rate limiting and L2 Redis caching, and wire progressive
frontend loading via useTraceProgress composable.

Key changes:
- Add LineageEngine (split ancestors / merge sources / full genealogy)
  with QueryBuilder bind-param safety and batched IN clauses
- Add EventFetcher with 6-domain support and L2 Redis cache
- Add trace_routes Blueprint (seed-resolve, lineage, events) with
  profile dispatch, rate limiting, and Redis TTL=300s caching
- Refactor query_tool_service to use LineageEngine and QueryBuilder,
  removing raw string interpolation (SQL injection fix)
- Add rate limits and resolve cache to query_tool_routes
- Integrate useTraceProgress into mid-section-defect with skeleton
  placeholders and fade-in transitions
- Add lineageCache and on-demand lot lineage to query-tool
- Add TraceProgressBar shared component
- Remove legacy query-tool.js static script (3k lines)
- Fix MatrixTable package column truncation (.slice(0,15) removed)
- Archive unified-lineage-engine change, add trace-progressive-ui specs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
egg
2026-02-12 16:30:24 +08:00
parent c38b5f646a
commit 519f8ae2f4
52 changed files with 5074 additions and 4047 deletions


@@ -0,0 +1,2 @@
schema: spec-driven
status: proposal


@@ -0,0 +1,202 @@
## Context
The two pages with the highest query complexity (`/mid-section-defect` and `/query-tool`) each implement their own LOT lineage-tracing logic. mid-section-defect uses a Python BFS (`_bfs_split_chain()` + `_fetch_merge_sources()`); query-tool uses `_build_in_filter()` string concatenation. Both share the same underlying tables: `DWH.DW_MES_CONTAINER` (5.2M rows, CONTAINERID UNIQUE index) and `DWH.DW_MES_PJ_COMBINEDASSYLOTS` (1.97M rows, FINISHEDNAME indexed).
Current problems:
- The BFS makes one DB round-trip per level (3-16 rounds), and `genealogy_records.sql` full-scans `HM_LOTMOVEOUT` (48M rows)
- `_build_in_filter()` builds SQL via string concatenation, a SQL injection risk
- query-tool has no rate limit / cache, so it can exhaust the DB pool (pool_size=10, max_overflow=20)
- The two services are 1200-1300 lines each, duplicating the lineage logic
Existing safety infrastructure:
- `QueryBuilder` (`sql/builder.py`): `add_in_condition()` supports bind params `:p0, :p1, ...`
- `SQLLoader` (`sql/loader.py`): `load_with_params()` supports structural params `{{ PARAM }}`
- `configured_rate_limit()` (`core/rate_limit.py`): per-client rate limit with `Retry-After` header
- `LayeredCache` (`core/cache.py`): L1 MemoryTTL + L2 Redis
## Goals / Non-Goals
**Goals:**
- Replace the Python BFS with `CONNECT BY NOCYCLE`, reducing 3-16 DB round-trips to 1
- Build a unified `LineageEngine` module, eliminating duplicated lineage logic
- Eliminate the `_build_in_filter()` SQL injection risk
- Add rate limit + cache to query-tool, aligned with mid-section-defect
- Add a fast/full dual mode to `lot_split_merge_history`
**Non-Goals:**
- No new API endpoints (handled by the follow-up `trace-progressive-ui` change)
- No frontend changes
- No materialized views / no PARALLEL hints
- No changes to other pages (wip-detail, lot-detail, etc.)
## Decisions
### D1: CONNECT BY NOCYCLE as the primary recursive query strategy
**Choice**: Oracle `CONNECT BY NOCYCLE` with `LEVEL <= 20`
**Alternative**: Recursive `WITH` (recursive subquery factoring)
**Rationale**:
- `CONNECT BY` is Oracle's native recursive syntax, with the most mature execution-plan optimization on Oracle 19c
- `LEVEL <= 20` is equivalent to the existing BFS `bfs_round > 20` guard
- `NOCYCLE` handles cyclic references (`SPLITFROMID` may contain cycles from bad data)
- A recursive `WITH` variant is kept as a comment inside the SQL file, so we can switch quickly if the execution plan turns out poor
**SQL design** (`sql/lineage/split_ancestors.sql`):
```sql
SELECT
c.CONTAINERID,
c.SPLITFROMID,
c.CONTAINERNAME,
LEVEL AS SPLIT_DEPTH
FROM DWH.DW_MES_CONTAINER c
START WITH {{ CID_FILTER }}
CONNECT BY NOCYCLE PRIOR c.SPLITFROMID = c.CONTAINERID
AND LEVEL <= 20
```
- `{{ CID_FILTER }}` is generated by `QueryBuilder.get_conditions_sql()` and injected with bind params
- The Oracle IN clause limit is handled by batching at `ORACLE_IN_BATCH_SIZE=1000` and merging the per-batch results
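The batch-and-merge behavior described above can be sketched as follows. This is a minimal illustration; `run_batch` is a hypothetical stand-in for executing one CONNECT BY query against a chunk of IDs.

```python
from typing import Callable, Dict, List

ORACLE_IN_BATCH_SIZE = 1000  # Oracle limits IN lists to 1000 literals

def resolve_in_batches(
    container_ids: List[str],
    run_batch: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Split the ID list into <=1000-element chunks, run each chunk through
    one query, and merge the per-batch child->parent mappings."""
    merged: Dict[str, str] = {}
    for i in range(0, len(container_ids), ORACLE_IN_BATCH_SIZE):
        chunk = container_ids[i : i + ORACLE_IN_BATCH_SIZE]
        merged.update(run_batch(chunk))
    return merged

# Example with a fake query runner: 2500 IDs become three batches.
ids = [f"CID{n}" for n in range(2500)]
result = resolve_in_batches(ids, lambda chunk: {cid: "ROOT" for cid in chunk})
```

Because the merge is a plain dict update, callers see a single result regardless of how many batches were issued, which is why the design can keep `ORACLE_IN_BATCH_SIZE` invisible outside the module.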
### D2: LineageEngine module structure
```
src/mes_dashboard/services/lineage_engine.py
├── resolve_split_ancestors(container_ids: List[str]) -> Dict
│   └── returns {child_to_parent: {cid: parent_cid}, cid_to_name: {cid: name}}
├── resolve_merge_sources(container_names: List[str]) -> Dict
│   └── returns {finished_name: [{source_cid, source_name}]}
└── resolve_full_genealogy(container_ids: List[str], initial_names: Dict) -> Dict
    └── combines split + merge; returns {cid: Set[ancestor_cids]}
src/mes_dashboard/sql/lineage/
├── split_ancestors.sql (CONNECT BY NOCYCLE)
└── merge_sources.sql (from merge_lookup.sql)
```
**Function signature design**:
- profile-agnostic: accepts `container_ids: List[str]`, not bound to any page logic
- returns native Python data structures (dict/set), never DataFrames
- internally uses `QueryBuilder` + `SQLLoader.load_with_params()` + `read_sql_df()`
- batching is encapsulated inside the module; callers never deal with `ORACLE_IN_BATCH_SIZE`
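How `resolve_full_genealogy()` might combine the two intermediate structures into `{cid: Set[ancestor_cids]}` can be sketched as follows. This is a hypothetical illustration only; the real module also resolves names and issues the batched queries itself.

```python
from typing import Dict, List, Set

def combine_genealogy(
    child_to_parent: Dict[str, str],
    merge_sources: Dict[str, List[str]],
    seeds: List[str],
) -> Dict[str, Set[str]]:
    """Walk each seed's split chain upward and fold in merge sources
    discovered along the way, yielding the full ancestor set per seed."""
    result: Dict[str, Set[str]] = {}
    for seed in seeds:
        ancestors: Set[str] = set()
        frontier = [seed]
        while frontier:
            cid = frontier.pop()
            parent = child_to_parent.get(cid)  # split edge: child -> parent
            if parent and parent not in ancestors:
                ancestors.add(parent)
                frontier.append(parent)
            for src in merge_sources.get(cid, []):  # merge edges into cid
                if src not in ancestors:
                    ancestors.add(src)
                    frontier.append(src)
        result[seed] = ancestors
    return result

# C3 split from C2, C2 split from C1, and C2 was merged from M1:
g = combine_genealogy({"C3": "C2", "C2": "C1"}, {"C2": ["M1"]}, ["C3"])
```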
### D3: EventFetcher module structure
```
src/mes_dashboard/services/event_fetcher.py
├── fetch_events(container_ids: List[str], domain: str) -> List[Dict]
│   └── supported domains: history, materials, rejects, holds, jobs, upstream_history
├── _cache_key(domain: str, container_ids: List[str]) -> str
│   └── format: evt:{domain}:{sorted_cids_hash}
└── _get_rate_limit_config(domain: str) -> Dict
    └── returns {bucket, max_attempts, window_seconds}
```
**Caching strategy**:
- L2 Redis cache (aligned with the `core/cache.py` pattern), TTL configured per domain
- cache keys use `hashlib.md5("".join(sorted(cids)).encode()).hexdigest()[:12]` to avoid overly long keys
- the existing mid-section-defect `_fetch_upstream_history()` migrates to `fetch_events(cids, "upstream_history")`
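A minimal sketch of the cache-key scheme above; the join separator is an assumption, only the `evt:{domain}:{hash}` shape and the 12-character md5 prefix come from the design:

```python
import hashlib
from typing import List

def cache_key(domain: str, container_ids: List[str]) -> str:
    # Sort so the same ID set always hashes identically regardless of
    # request order; truncate the md5 hex digest to keep Redis keys short.
    joined = ",".join(sorted(container_ids))
    digest = hashlib.md5(joined.encode()).hexdigest()[:12]
    return f"evt:{domain}:{digest}"

k1 = cache_key("holds", ["CID2", "CID1"])
k2 = cache_key("holds", ["CID1", "CID2"])
# k1 == k2: the key is order-insensitive, so reordered requests hit cache
```

Sorting before hashing is what makes the cache effective for batch queries, since callers rarely submit the same IDs in the same order twice.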
### D4: query-tool SQL injection fix strategy
**Fix scope** (6 call sites):
1. `_resolve_by_lot_id()` (line 262): `_build_in_filter(lot_ids, 'CONTAINERNAME')` + `read_sql_df(sql, {})`
2. `_resolve_by_serial_number()` (line ~320): same pattern
3. `_resolve_by_work_order()` (line ~380): same pattern
4. the IN clause inside `get_lot_history()`
5. the IN clause inside `get_lot_associations()`
6. the `lot_split_merge_history` query
**Fix pattern** (uniform):
```python
# Before (unsafe)
in_filter = _build_in_filter(lot_ids, 'CONTAINERNAME')
sql = f"SELECT ... WHERE {in_filter}"
df = read_sql_df(sql, {})
# After (safe)
builder = QueryBuilder()
builder.add_in_condition("CONTAINERNAME", lot_ids)
sql = SQLLoader.load_with_params(
"query_tool/lot_resolve_id",
CONTAINER_FILTER=builder.get_conditions_sql(),
)
df = read_sql_df(sql, builder.params)
```
**`_build_in_filter()` and `_build_in_clause()` are deleted outright** (not deprecated; removed immediately because they are a security vulnerability)
### D5: query-tool rate limit + cache configuration
**Rate limit** (aligned with the `configured_rate_limit()` pattern):
| Endpoint | Bucket | Max/Window | Env Override |
|----------|--------|------------|-------------|
| `/resolve` | `query-tool-resolve` | 10/60s | `QT_RESOLVE_RATE_*` |
| `/lot-history` | `query-tool-history` | 20/60s | `QT_HISTORY_RATE_*` |
| `/lot-associations` | `query-tool-association` | 20/60s | `QT_ASSOC_RATE_*` |
| `/adjacent-lots` | `query-tool-adjacent` | 20/60s | `QT_ADJACENT_RATE_*` |
| `/equipment-period` | `query-tool-equipment` | 5/60s | `QT_EQUIP_RATE_*` |
| `/export-csv` | `query-tool-export` | 3/60s | `QT_EXPORT_RATE_*` |
**Cache**:
- resolve result: L2 Redis, TTL=60s, key=`qt:resolve:{input_type}:{values_hash}`
- other GET endpoints: no cache for now (results depend on dynamic CONTAINERID params, so hit rates would be low)
### D6: lot_split_merge_history fast/full dual mode
**Fast mode** (default):
```sql
-- conditions added to lot_split_merge_history.sql
AND h.TXNDATE >= ADD_MONTHS(SYSDATE, -6)
...
FETCH FIRST 500 ROWS ONLY
```
**Full mode** (`full_history=true`):
- SQL variant without the time window and row limit
- uses `read_sql_df_slow()` (120s timeout) instead of `read_sql_df()` (55s timeout)
- the route layer decides via `request.args.get('full_history', 'false').lower() == 'true'`
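The route-layer toggle can be factored into a tiny helper (a sketch; the actual route may simply inline the expression above):

```python
def parse_full_history(args: dict) -> bool:
    """Mirror the route-layer check: full_history defaults to 'false', and
    only the literal string 'true' (case-insensitive) selects full mode."""
    return args.get("full_history", "false").lower() == "true"

parse_full_history({})                          # -> False (fast mode)
parse_full_history({"full_history": "TRUE"})    # -> True  (full mode)
parse_full_history({"full_history": "1"})       # -> False (strict match)
```

The strict string match means `?full_history=1` silently stays in fast mode, which is the safer failure direction for a 120-second query.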
### D7: Refactoring order and regression guards
**Phase 1**: mid-section-defect (safer: protected by cache + distributed lock)
1. Create `lineage_engine.py` + SQL files
2. Replace the three BFS functions in `mid_section_defect_service.py` with `LineageEngine`
3. Golden test verifying BFS vs CONNECT BY results are identical
4. Retire `genealogy_records.sql` + `split_chain.sql` (marked deprecated)
**Phase 2**: query-tool (higher risk: no existing protection)
1. Fix all `_build_in_filter()` call sites → `QueryBuilder`
2. Delete `_build_in_filter()` + `_build_in_clause()`
3. Add route-level rate limits
4. Add the resolve cache
5. Add the `lot_split_merge_history` fast/full mode
**Phase 3**: EventFetcher
1. Create `event_fetcher.py`
2. Migrate `_fetch_upstream_history()` → `EventFetcher`
3. Migrate query-tool event fetch paths → `EventFetcher`
## Risks / Trade-offs
| Risk | Mitigation |
|------|-----------|
| CONNECT BY may produce an unexpected execution plan for very large lineage trees (>10000 nodes) | `LEVEL <= 20` hard cap + a recursive `WITH` alternative inside the SQL file for a quick switch |
| Insufficient golden-test coverage lets a regression slip through | Pick ≥5 LOTs with known lineage structures (multi-level splits + merge crossings); CI gate enforces a pass |
| A `_build_in_filter()` call site is missed after deletion | After Phase 2, `grep -r "_build_in_filter\|_build_in_clause" src/` must return 0 results |
| The fast-mode 6-month window may truncate traces that need full history | `full_history=true` switches to full mode; the frontend omits the param by default = fast mode |
| `QueryBuilder.add_in_condition()` does not auto-batch >1000 values | LineageEngine encapsulates the batching internally (`for i in range(0, len(ids), 1000)`); callers are unaware |
## Migration Plan
1. **Create new modules**: `lineage_engine.py`, `event_fetcher.py`, `sql/lineage/*.sql` — no side effects, safe to deploy
2. **Phase 1 switch**: mid-section-defect internals move to `LineageEngine` — protected by cache/lock; regressions caught via the golden test plus manual comparison
3. **Phase 2 switch**: query-tool fix + rate limit + cache — requires rerunning the query-tool route tests
4. **Phase 3 switch**: EventFetcher migration — last, smallest blast radius
5. **Cleanup**: delete the deprecated SQL files once confirmed unreferenced
**Rollback**: each Phase is independent and can be reverted on its own. `LineageEngine` and `EventFetcher` are new modules and do not affect existing code until each Phase's switchover commit.
## Open Questions
- Does `DW_MES_CONTAINER.SPLITFROMID` have an index? If not, `CONNECT BY` with `START WITH` may rely on full table scans instead of the CONTAINERID index. Needs an Oracle execution-plan check.
- Does `ORACLE_IN_BATCH_SIZE=1000` behave the same for `CONNECT BY START WITH ... IN (...)` as for a plain `WHERE ... IN (...)`? Needs verification in the dev environment.
- Should EventFetcher cache TTLs differ per domain (e.g. longer for `upstream_history`, shorter for `holds`)? Uniform 300s for now; tune later based on usage patterns.


@@ -0,0 +1,110 @@
## Why
The batch trace tool (`/query-tool`) and the mid-section defect trace analysis (`/mid-section-defect`) are the two most query-intensive pages in this project. Both need to resolve LOT lineage (split + merge), but each implemented its own tracing logic, which causes:
1. **Performance bottleneck**: mid-section-defect traces the split chain with a multi-round Python BFS (`_bfs_split_chain()`), costing 3-16 DB round-trips per run, plus a full scan of the 48M-row `HM_LOTMOVEOUT` via `genealogy_records.sql` (30-120 seconds).
2. **Security risk**: query-tool's `_build_in_filter()` builds IN clauses by string concatenation (`query_tool_service.py:156-174`); the `_resolve_by_lot_id()` / `_resolve_by_serial_number()` / `_resolve_by_work_order()` family passes empty params (`read_sql_df(sql, {})`) — values are embedded directly into the SQL string, a SQL injection risk.
3. **No protection**: query-tool has no rate limit and no cache; under high concurrency it can exhaust the DB connection pool (production pool_size=10, max_overflow=20).
4. **Duplicated code**: the two services each maintain the same split-chain tracing, merge lookup, and batched-IN segmentation logic.
Oracle 19c's `CONNECT BY NOCYCLE` can replace the entire Python BFS with a single SQL statement, reducing 3-16 DB round-trips to 1. The fallback is Oracle 19c's recursive `WITH` (recursive subquery factoring), functionally equivalent and more readable. The split/merge data sources (`DW_MES_CONTAINER.SPLITFROMID` + `DW_MES_PJ_COMBINEDASSYLOTS`) never need to touch `HM_LOTMOVEOUT`, eliminating the 48M-row full scan.
**Scope statement**: this change is a pure backend-internal refactor — no new API endpoints, no frontend changes. The existing API contract stays backward compatible (URLs and request/response formats unchanged); the only addition is the optional `full_history` query param as a backward-compatible extension. Staged frontend loading and new API endpoints belong to the separate follow-up `trace-progressive-ui` change.
## What Changes
- Create a unified `LineageEngine` module (`src/mes_dashboard/services/lineage_engine.py`) as the shared core of LOT lineage resolution:
  - `resolve_split_ancestors()` — replaces the Python BFS with a single `CONNECT BY NOCYCLE` SQL query (fallback: recursive `WITH`, noted as a commented alternative inside the SQL file)
  - `resolve_merge_sources()` — queries merge sources from `DW_MES_PJ_COMBINEDASSYLOTS`
  - `resolve_full_genealogy()` — combines split + merge into the complete lineage graph
  - designed as profile-agnostic utility functions: other pages (wip-detail, lot-detail) can call them directly later, but this change wires up only mid-section-defect and query-tool
- Create a unified `EventFetcher` module providing cached, rate-limited batch event queries, wrapping the existing domain queries (history, materials, rejects, holds, jobs, upstream_history)
- Refactor `mid_section_defect_service.py`: replace `_bfs_split_chain()` + `_fetch_merge_sources()` + `_resolve_full_genealogy()` with `LineageEngine`; replace `_fetch_upstream_history()` with `EventFetcher`
- Refactor `query_tool_service.py`: replace `_build_in_filter()` string concatenation with `QueryBuilder` bind params throughout; add route-level rate limit and cache aligned with the existing mid-section-defect patterns.
- New SQL files:
  - `sql/lineage/split_ancestors.sql` (CONNECT BY NOCYCLE implementation; the file includes a recursive WITH alternative as an Oracle-compatibility note)
  - `sql/lineage/merge_sources.sql` (migrated from `sql/mid_section_defect/merge_lookup.sql`)
- Deprecated SQL files (marked deprecated, kept for one release, then deleted):
  - `sql/mid_section_defect/genealogy_records.sql` (the 48M-row HM_LOTMOVEOUT full scan is no longer needed)
  - `sql/mid_section_defect/split_chain.sql` (replaced by the lineage CONNECT BY)
- Add dual-mode querying to query-tool's `lot_split_merge_history.sql`:
  - **fast mode** (default): `TXNDATE >= ADD_MONTHS(SYSDATE, -6)` + `FETCH FIRST 500 ROWS ONLY` — covers the last six months, responds in <5s
  - **full mode** (when the frontend passes `full_history=true`): no time window, preserving full-history tracing, using `read_sql_df_slow` (120s timeout)
- The query-tool route gains a `full_history` boolean query param; the service selects the SQL variant accordingly
## Capabilities
### New Capabilities
- `lineage-engine-core`: unified LOT lineage resolution engine, exposing three utility functions — `resolve_split_ancestors()` (CONNECT BY NOCYCLE, capped at `LEVEL <= 20`), `resolve_merge_sources()`, and `resolve_full_genealogy()` — all using `QueryBuilder` bind params, with batched IN support (`ORACLE_IN_BATCH_SIZE=1000`). Signatures are profile-agnostic: they accept `container_ids: List[str]` and return dict structures, unbound to any page logic.
- `event-fetcher-unified`: unified event query layer encapsulating cache-key generation (format: `evt:{domain}:{sorted_cids_hash}`), L1/L2 layered cache (aligned with the `core/cache.py` LayeredCache pattern), and rate-limit bucket configuration (aligned with the `configured_rate_limit()` pattern). Domains: `history`, `materials`, `rejects`, `holds`, `jobs`, `upstream_history`.
- `query-tool-safety-hardening`: fixes the query-tool SQL injection risk — `_build_in_filter()` and `_build_in_clause()` are fully replaced by `QueryBuilder.add_in_condition()`, eliminating the empty-params `read_sql_df(sql, {})` pattern; adds route-level rate limits (aligned with `configured_rate_limit()`: resolve 10/min, history 20/min, association 20/min) and response caching (L2 Redis, 60s TTL).
### Modified Capabilities
- `cache-indexed-query-acceleration`: mid-section-defect genealogy queries go from multi-round Python BFS + HM_LOTMOVEOUT full scan to a single CONNECT BY round + indexed lookups
- `oracle-query-fragment-governance`: `_build_in_filter()` / `_build_in_clause()` are retired, converging on `QueryBuilder.add_in_condition()`; the new `sql/lineage/` directory follows the existing SQLLoader convention
## Impact
- **Affected code**:
  - New: `src/mes_dashboard/services/lineage_engine.py`, `src/mes_dashboard/sql/lineage/split_ancestors.sql`, `src/mes_dashboard/sql/lineage/merge_sources.sql`
  - Refactored: `src/mes_dashboard/services/mid_section_defect_service.py` (1194L), `src/mes_dashboard/services/query_tool_service.py` (1329L), `src/mes_dashboard/routes/query_tool_routes.py`
  - Deprecated: `src/mes_dashboard/sql/mid_section_defect/genealogy_records.sql`, `src/mes_dashboard/sql/mid_section_defect/split_chain.sql` (replaced by the lineage module; marked deprecated and kept for one release)
  - Modified: `src/mes_dashboard/sql/query_tool/lot_split_merge_history.sql` (adds the time window + row limit)
- **Runtime/deploy**: no new dependencies; still Flask/Gunicorn + Oracle + Redis. The DB query pattern changes, but connection pool settings stay the same.
- **APIs/pages**: the existing `/query-tool` and `/mid-section-defect` API contracts remain backward compatible — URLs, input/output formats, and HTTP status codes are unchanged (pure internal replacement). Backward-compatible extensions: query-tool APIs gain rate-limit headers (`Retry-After`, aligned with the existing `rate_limit.py` implementation); query-tool split-merge history gains the optional `full_history` query param (default false = fast mode; omitting it behaves like the old version).
- **Performance**: see the quantified acceptance criteria in the Verification section below
- **Security**: the query-tool IN-clause SQL injection risk is eliminated; all `_build_in_filter()` / `_build_in_clause()` call sites move to `QueryBuilder.add_in_condition()`
- **Testing**: new LineageEngine unit tests plus a golden test comparing BFS vs CONNECT BY result consistency; existing mid-section-defect and query-tool tests need updated mock paths
## Verification
Performance acceptance criteria — all metrics must be measured under the following conditions.
**Test data scale**:
- LOT lineage tree: target seed lot with a split depth of 3, ≥50 ancestor nodes, and at least 1 merge path
- mid-section-defect: a date-range query where TMTT detection yields 10 seed lots
- query-tool: a work-order query whose resolve result is 20 lots
**Acceptance metrics** (cold query = cache miss; hot query = L2 Redis hit):
| Metric | Current (P95) | Target (P95) | Conditions |
|------|-----------|-----------|------|
| mid-section-defect genealogy | 30-120s | ≤8s | single CONNECT BY round, ≥50 ancestor nodes |
| mid-section-defect genealogy | 3-5s (L2 hit) | ≤1s | Redis cache hit |
| query-tool lot_split_merge_history fast mode | unbounded (>120s timeout) | ≤5s | 6-month window + FETCH FIRST 500 ROWS |
| query-tool lot_split_merge_history full mode | same as above | ≤60s | no time window, via `read_sql_df_slow` 120s timeout |
| LineageEngine.resolve_split_ancestors | N/A (new module) | ≤3s | ≥50 ancestor nodes, CONNECT BY |
| DB connection hold time | 3-16 round-trips × 0.5-2s each | single query ≤3s | one CONNECT BY query |
**Security acceptance**:
- zero references to `_build_in_filter()` and `_build_in_clause()` (verified by grep)
- every query containing user input (resolve_by_lot_id, resolve_by_serial_number, resolve_by_work_order, etc.) must use `QueryBuilder` bind params — no string concatenation. Purely static SQL (no user input) may pass empty params.
**Result consistency acceptance**:
- Golden test: pick ≥5 LOTs with known lineage structures and verify the BFS and CONNECT BY outputs (`child_to_parent`, `cid_to_name`) are identical as sets
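The golden-test comparison can be sketched as follows. `bfs_result` and `engine_result` are hypothetical stand-ins for the outputs of the legacy BFS path and the new `LineageEngine` path on the same seed LOTs; the real test would populate them from both implementations.

```python
def assert_lineage_equivalent(bfs_result: dict, engine_result: dict) -> None:
    """Compare the two keyed result sets the golden test cares about and
    report which entries exist on only one side when they diverge."""
    for key in ("child_to_parent", "cid_to_name"):
        assert bfs_result[key] == engine_result[key], (
            f"{key} mismatch: "
            f"only-BFS={set(bfs_result[key]) - set(engine_result[key])}, "
            f"only-engine={set(engine_result[key]) - set(bfs_result[key])}"
        )

golden = {
    "child_to_parent": {"C2": "C1"},
    "cid_to_name": {"C1": "LOT-A", "C2": "LOT-A-1"},
}
# Identical results pass silently; any divergence raises with a diff.
assert_lineage_equivalent(golden, {
    "child_to_parent": {"C2": "C1"},
    "cid_to_name": {"C1": "LOT-A", "C2": "LOT-A-1"},
})
```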
## Non-Goals
- Frontend UI changes are out of scope (staged frontend loading and progressive UX go to the follow-up `trace-progressive-ui` change).
- No new API endpoints — the existing API contract stays backward compatible (only the optional `full_history` query param is added as an extension). New endpoints belong to the follow-up `trace-progressive-ui`.
- No DB schema changes, no materialized views, no PARALLEL hints — all optimization happens at the application layer (SQL rewrites + Python refactoring + Redis cache).
- No changes to query logic on other pages (wip-detail, lot-detail, etc.) — `LineageEngine` is designed to be extensible, but this change wires up only the two target pages.
- No Oracle PARALLEL hints (their behavior is unpredictable in a connection-pool environment, so they are not used as an optimization technique).
## Dependencies
- No prerequisites; this change can be implemented independently.
- The follow-up `trace-progressive-ui` depends on the `LineageEngine` and `EventFetcher` modules delivered by this change.
## Risks
| Risk | Mitigation |
|------|------|
| CONNECT BY performance degrades on very large lineage trees (>10000 ancestors) | `LEVEL <= 20` cap + `NOCYCLE` cycle guard, equivalent to the current BFS `bfs_round > 20` check. If the Oracle 19c execution plan is poor, the SQL file contains a recursive `WITH` alternative for a quick switch |
| Lineage results differ from the BFS version (regression) | Golden test: compare BFS vs CONNECT BY outputs on ≥5 known LOTs; CI gate enforces identical result sets |
| The refactor spans two large services (2500+ lines) | Staged rollout: refactor mid-section-defect first (cache+lock protection, lower regression risk), then query-tool |
| Stale references after retiring `genealogy_records.sql` | Global grep to confirm no remaining reference points; the SQL file is marked deprecated and kept for one release before deletion |
| New query-tool rate limits degrade user experience | Lenient defaults (resolve 10/min, history 20/min) aligned with the existing mid-section-defect limits; responses include a `Retry-After` header |
| A call site is missed when replacing `_build_in_filter()` with `QueryBuilder` | Grep all references to `_build_in_filter` and `_build_in_clause`, replace them one by one, and confirm zero residual references |


@@ -0,0 +1,18 @@
## ADDED Requirements
### Requirement: Mid-section defect genealogy SHALL use CONNECT BY instead of Python BFS
The mid-section-defect genealogy resolution SHALL use `LineageEngine.resolve_full_genealogy()` (CONNECT BY NOCYCLE) instead of the existing `_bfs_split_chain()` Python BFS implementation.
#### Scenario: Genealogy cold query performance
- **WHEN** mid-section-defect analysis executes genealogy resolution with cache miss
- **THEN** `LineageEngine.resolve_split_ancestors()` SHALL be called (single CONNECT BY query)
- **THEN** response time SHALL be ≤8s (P95) for ≥50 ancestor nodes
- **THEN** Python BFS `_bfs_split_chain()` SHALL NOT be called
#### Scenario: Genealogy hot query performance
- **WHEN** mid-section-defect analysis executes genealogy resolution with L2 Redis cache hit
- **THEN** response time SHALL be ≤1s (P95)
#### Scenario: Golden test result equivalence
- **WHEN** golden test runs with ≥5 known LOTs
- **THEN** CONNECT BY output (`child_to_parent`, `cid_to_name`) SHALL be identical to BFS output for the same inputs


@@ -0,0 +1,20 @@
## ADDED Requirements
### Requirement: EventFetcher SHALL provide unified cached event querying across domains
`EventFetcher` SHALL encapsulate batch event queries with L1/L2 layered cache and rate limit bucket configuration, supporting domains: `history`, `materials`, `rejects`, `holds`, `jobs`, `upstream_history`.
#### Scenario: Cache miss for event domain query
- **WHEN** `EventFetcher` is called for a domain with container IDs and no cache exists
- **THEN** the domain query SHALL execute against Oracle via `read_sql_df()`
- **THEN** the result SHALL be stored in L2 Redis cache with key format `evt:{domain}:{sorted_cids_hash}`
- **THEN** L1 memory cache SHALL also be populated (aligned with `core/cache.py` LayeredCache pattern)
#### Scenario: Cache hit for event domain query
- **WHEN** `EventFetcher` is called for a domain and L2 Redis cache contains a valid entry
- **THEN** the cached result SHALL be returned without executing Oracle query
- **THEN** DB connection pool SHALL NOT be consumed
#### Scenario: Rate limit bucket per domain
- **WHEN** `EventFetcher` is used from a route handler
- **THEN** each domain SHALL have a configurable rate limit bucket aligned with `configured_rate_limit()` pattern
- **THEN** rate limit configuration SHALL be overridable via environment variables


@@ -0,0 +1,57 @@
## ADDED Requirements
### Requirement: LineageEngine SHALL provide unified split ancestor resolution via CONNECT BY NOCYCLE
`LineageEngine.resolve_split_ancestors()` SHALL accept a list of container IDs and return the complete split ancestry graph using a single Oracle `CONNECT BY NOCYCLE` query on `DW_MES_CONTAINER.SPLITFROMID`.
#### Scenario: Normal split chain resolution
- **WHEN** `resolve_split_ancestors()` is called with a list of container IDs
- **THEN** a single SQL query using `CONNECT BY NOCYCLE` SHALL be executed against `DW_MES_CONTAINER`
- **THEN** the result SHALL include a `child_to_parent` mapping and a `cid_to_name` mapping for all discovered ancestor nodes
- **THEN** the traversal depth SHALL be limited to `LEVEL <= 20` (equivalent to existing BFS `bfs_round > 20` guard)
#### Scenario: Large input batch exceeding Oracle IN clause limit
- **WHEN** the input `container_ids` list exceeds `ORACLE_IN_BATCH_SIZE` (1000)
- **THEN** `LineageEngine` SHALL batch the IDs (via `QueryBuilder.add_in_condition()` per batch) and combine the per-batch results
- **THEN** all bind parameters SHALL use `QueryBuilder.params` (no string concatenation)
#### Scenario: Cyclic split references in data
- **WHEN** `DW_MES_CONTAINER.SPLITFROMID` contains cyclic references
- **THEN** `NOCYCLE` SHALL prevent infinite traversal
- **THEN** the query SHALL return all non-cyclic ancestors up to `LEVEL <= 20`
#### Scenario: CONNECT BY performance regression
- **WHEN** Oracle 19c execution plan for `CONNECT BY NOCYCLE` performs worse than expected
- **THEN** the SQL file SHALL contain a commented-out recursive `WITH` (recursive subquery factoring) alternative that can be swapped in without code changes
### Requirement: LineageEngine SHALL provide unified merge source resolution
`LineageEngine.resolve_merge_sources()` SHALL accept a list of container IDs and return merge source mappings from `DW_MES_PJ_COMBINEDASSYLOTS`.
#### Scenario: Merge source lookup
- **WHEN** `resolve_merge_sources()` is called with container IDs
- **THEN** the result SHALL include `{cid: [merge_source_cid, ...]}` for all containers that have merge sources
- **THEN** all queries SHALL use `QueryBuilder` bind params
### Requirement: LineageEngine SHALL provide combined genealogy resolution
`LineageEngine.resolve_full_genealogy()` SHALL combine split ancestors and merge sources into a complete genealogy graph.
#### Scenario: Full genealogy for a set of seed lots
- **WHEN** `resolve_full_genealogy()` is called with seed container IDs
- **THEN** split ancestors SHALL be resolved first via `resolve_split_ancestors()`
- **THEN** merge sources SHALL be resolved for all discovered ancestor nodes
- **THEN** the combined result SHALL be equivalent to the existing `_resolve_full_genealogy()` output in `mid_section_defect_service.py`
### Requirement: LineageEngine functions SHALL be profile-agnostic
All `LineageEngine` public functions SHALL accept `container_ids: List[str]` and return dictionary structures without binding to any specific page logic.
#### Scenario: Reuse from different pages
- **WHEN** a new page (e.g., wip-detail) needs lineage resolution
- **THEN** it SHALL be able to call `LineageEngine` functions directly without modification
- **THEN** no page-specific logic (profile, TMTT detection, etc.) SHALL exist in `LineageEngine`
### Requirement: LineageEngine SQL files SHALL reside in `sql/lineage/` directory
New SQL files SHALL follow the existing `SQLLoader` convention under `src/mes_dashboard/sql/lineage/`.
#### Scenario: SQL file organization
- **WHEN** `LineageEngine` executes queries
- **THEN** `split_ancestors.sql` and `merge_sources.sql` SHALL be loaded via `SQLLoader.load_with_params("lineage/split_ancestors", ...)`
- **THEN** the SQL files SHALL NOT reference `HM_LOTMOVEOUT` (48M row table no longer needed for genealogy)


@@ -0,0 +1,23 @@
## ADDED Requirements
### Requirement: Lineage SQL fragments SHALL be centralized in `sql/lineage/` directory
Split ancestor and merge source SQL queries SHALL be defined in `sql/lineage/` and shared across services via `SQLLoader`.
#### Scenario: Mid-section-defect lineage query
- **WHEN** `mid_section_defect_service.py` needs split ancestry or merge source data
- **THEN** it SHALL call `LineageEngine` which loads SQL from `sql/lineage/split_ancestors.sql` and `sql/lineage/merge_sources.sql`
- **THEN** it SHALL NOT use `sql/mid_section_defect/split_chain.sql` or `sql/mid_section_defect/genealogy_records.sql`
#### Scenario: Deprecated SQL file handling
- **WHEN** `sql/mid_section_defect/genealogy_records.sql` and `sql/mid_section_defect/split_chain.sql` are deprecated
- **THEN** the files SHALL be marked with a deprecated comment at the top
- **THEN** grep SHALL confirm zero `SQLLoader.load` references to these files
- **THEN** the files SHALL be retained for one version before deletion
### Requirement: All user-input SQL queries SHALL use QueryBuilder bind params
`_build_in_filter()` and `_build_in_clause()` in `query_tool_service.py` SHALL be fully replaced by `QueryBuilder.add_in_condition()`.
#### Scenario: Complete migration to QueryBuilder
- **WHEN** the refactoring is complete
- **THEN** grep for `_build_in_filter` and `_build_in_clause` SHALL return zero results
- **THEN** all queries involving user-supplied values SHALL use `QueryBuilder.params`


@@ -0,0 +1,57 @@
## ADDED Requirements
### Requirement: query-tool resolve functions SHALL use QueryBuilder bind params for all user input
All `resolve_lots()` family functions (`_resolve_by_lot_id`, `_resolve_by_serial_number`, `_resolve_by_work_order`) SHALL use `QueryBuilder.add_in_condition()` with bind parameters instead of `_build_in_filter()` string concatenation.
#### Scenario: Lot resolve with user-supplied values
- **WHEN** a resolve function receives user-supplied lot IDs, serial numbers, or work order names
- **THEN** the SQL query SHALL use `:p0, :p1, ...` bind parameters via `QueryBuilder`
- **THEN** `read_sql_df()` SHALL receive `builder.params` (never an empty `{}` dict for queries with user input)
- **THEN** `_build_in_filter()` and `_build_in_clause()` SHALL NOT be called
#### Scenario: Pure static SQL without user input
- **WHEN** a query contains no user-supplied values (e.g., static lookups)
- **THEN** empty params `{}` is acceptable
- **THEN** no `_build_in_filter()` SHALL be used
#### Scenario: Zero residual references to deprecated functions
- **WHEN** the refactoring is complete
- **THEN** grep for `_build_in_filter` and `_build_in_clause` SHALL return zero results across the entire codebase
### Requirement: query-tool routes SHALL apply rate limiting
All query-tool API endpoints SHALL apply per-client rate limiting using the existing `configured_rate_limit` mechanism.
#### Scenario: Resolve endpoint rate limit exceeded
- **WHEN** a client sends more than 10 requests to query-tool resolve endpoints within 60 seconds
- **THEN** the endpoint SHALL return HTTP 429 with a `Retry-After` header
- **THEN** the resolve service function SHALL NOT be called
#### Scenario: History endpoint rate limit exceeded
- **WHEN** a client sends more than 20 requests to query-tool history endpoints within 60 seconds
- **THEN** the endpoint SHALL return HTTP 429 with a `Retry-After` header
#### Scenario: Association endpoint rate limit exceeded
- **WHEN** a client sends more than 20 requests to query-tool association endpoints within 60 seconds
- **THEN** the endpoint SHALL return HTTP 429 with a `Retry-After` header
### Requirement: query-tool routes SHALL apply response caching
High-cost query-tool endpoints SHALL cache responses in L2 Redis.
#### Scenario: Resolve result caching
- **WHEN** a resolve request succeeds
- **THEN** the response SHALL be cached in L2 Redis with TTL = 60s
- **THEN** subsequent identical requests within TTL SHALL return cached result without Oracle query
### Requirement: lot_split_merge_history SHALL support fast and full query modes
The `lot_split_merge_history.sql` query SHALL support two modes to balance traceability completeness vs performance.
#### Scenario: Fast mode (default)
- **WHEN** `full_history` query parameter is absent or `false`
- **THEN** the SQL SHALL include `TXNDATE >= ADD_MONTHS(SYSDATE, -6)` time window and `FETCH FIRST 500 ROWS ONLY`
- **THEN** query response time SHALL be ≤5s (P95)
#### Scenario: Full mode
- **WHEN** `full_history=true` query parameter is provided
- **THEN** the SQL SHALL NOT include time window restriction
- **THEN** the query SHALL use `read_sql_df_slow` (120s timeout)
- **THEN** query response time SHALL be ≤60s (P95)


@@ -0,0 +1,57 @@
## Phase 1: LineageEngine module creation
- [x] 1.1 Create `src/mes_dashboard/sql/lineage/split_ancestors.sql` (CONNECT BY NOCYCLE, with a recursive WITH alternative in comments)
- [x] 1.2 Create `src/mes_dashboard/sql/lineage/merge_sources.sql` (migrated from `mid_section_defect/merge_lookup.sql`, switched to the `{{ FINISHED_NAME_FILTER }}` structural param)
- [x] 1.3 Create `src/mes_dashboard/services/lineage_engine.py`: the three utility functions `resolve_split_ancestors()`, `resolve_merge_sources()`, `resolve_full_genealogy()`, using `QueryBuilder` bind params + `ORACLE_IN_BATCH_SIZE=1000` batching
- [x] 1.4 LineageEngine unit tests: mock `read_sql_df` to verify batch splitting, dict return structures, and the LEVEL <= 20 guard
## Phase 2: Switch mid-section-defect to LineageEngine
- [x] 2.1 In `mid_section_defect_service.py`, replace `_bfs_split_chain()` with `LineageEngine.resolve_split_ancestors()`
- [x] 2.2 Replace `_fetch_merge_sources()` with `LineageEngine.resolve_merge_sources()`
- [x] 2.3 Replace `_resolve_full_genealogy()` with `LineageEngine.resolve_full_genealogy()`
- [x] 2.4 Golden test: pick ≥5 LOTs with known lineage structures and verify the BFS vs CONNECT BY `child_to_parent` and `cid_to_name` result sets are identical
- [x] 2.5 Mark `sql/mid_section_defect/genealogy_records.sql` and `sql/mid_section_defect/split_chain.sql` as deprecated (add `-- DEPRECATED: replaced by sql/lineage/split_ancestors.sql` at the top of each file)
## Phase 3: query-tool SQL injection fix
- [x] 3.1 Create the `sql/query_tool/lot_resolve_id.sql`, `lot_resolve_serial.sql`, `lot_resolve_work_order.sql` SQL files (migrated from inline SQL into SQLLoader management)
- [x] 3.2 Fix `_resolve_by_lot_id()`: `_build_in_filter()` → `QueryBuilder.add_in_condition()` + `SQLLoader.load_with_params()` + `read_sql_df(sql, builder.params)`
- [x] 3.3 Fix `_resolve_by_serial_number()`: same pattern
- [x] 3.4 Fix `_resolve_by_work_order()`: same pattern
- [x] 3.5 Fix the IN clause inside `get_lot_history()`: switch to `QueryBuilder`
- [x] 3.6 Fix the user-input IN clauses along the lot-associations query paths (`get_lot_materials()` / `get_lot_rejects()` / `get_lot_holds()` / `get_lot_splits()` / `get_lot_jobs()`): switch to `QueryBuilder`
- [x] 3.7 Fix the `lot_split_merge_history` query: switch to `QueryBuilder`
- [x] 3.8 Delete the `_build_in_filter()` and `_build_in_clause()` functions
- [x] 3.9 Verify: `grep -r "_build_in_filter\|_build_in_clause" src/` returns 0 results
- [x] 3.10 Update mock paths in the existing query-tool route tests
## Phase 4: query-tool rate limit + cache
- [x] 4.1 In `query_tool_routes.py`, add `configured_rate_limit(bucket='query-tool-resolve', default_max_attempts=10, default_window_seconds=60)` to `/resolve`
- [x] 4.2 Add `configured_rate_limit(bucket='query-tool-history', default_max_attempts=20, default_window_seconds=60)` to `/lot-history`
- [x] 4.3 Add `configured_rate_limit(bucket='query-tool-association', default_max_attempts=20, default_window_seconds=60)` to `/lot-associations`
- [x] 4.4 Add `configured_rate_limit(bucket='query-tool-adjacent', default_max_attempts=20, default_window_seconds=60)` to `/adjacent-lots`
- [x] 4.5 Add `configured_rate_limit(bucket='query-tool-equipment', default_max_attempts=5, default_window_seconds=60)` to `/equipment-period`
- [x] 4.6 Add `configured_rate_limit(bucket='query-tool-export', default_max_attempts=3, default_window_seconds=60)` to `/export-csv`
- [x] 4.7 Add an L2 Redis cache for resolve results (key=`qt:resolve:{input_type}:{values_hash}`, TTL=60s)
## Phase 5: lot_split_merge_history fast/full dual mode
- [x] 5.1 Modify `sql/query_tool/lot_split_merge_history.sql`: add the `{{ TIME_WINDOW }}` and `{{ ROW_LIMIT }}` structural params
- [x] 5.2 In `query_tool_service.py`, select the SQL variant based on the `full_history` param (fast: `AND h.TXNDATE >= ADD_MONTHS(SYSDATE, -6)` + `FETCH FIRST 500 ROWS ONLY`; full: no restriction + `read_sql_df_slow`)
- [x] 5.3 In `query_tool_routes.py`, parse the `full_history` query param on the `/api/query-tool/lot-associations?type=splits` path and pass it through to the split-merge-history query
- [x] 5.4 Route tests: verify the behavioral difference between fast mode (default) and full mode (`full_history=true`)
## Phase 6: EventFetcher module creation
- [x] 6.1 Create `src/mes_dashboard/services/event_fetcher.py`: `fetch_events(container_ids, domain)` + cache-key generation + rate-limit config
- [x] 6.2 Migrate `_fetch_upstream_history()` in `mid_section_defect_service.py` to `EventFetcher.fetch_events(cids, "upstream_history")`
- [x] 6.3 Migrate the query-tool event fetch paths (the DB query portions of `get_lot_history` and `get_lot_associations`) to `EventFetcher`
- [x] 6.4 EventFetcher unit tests: mock the DB and verify cache-key format, rate-limit config, and domain branching
## Phase 7: Cleanup and verification
- [x] 7.1 Confirm `genealogy_records.sql` and `split_chain.sql` have no active references (`grep -r` check); keep the deprecated markers
- [x] 7.2 Confirm all user-input queries use `QueryBuilder` bind params (inspect every `read_sql_df` call site via grep)
- [x] 7.3 Run the full query-tool and mid-section-defect route tests and confirm no regressions