feat: simplify layout model selection and archive proposals

Changes:
- Replace PP-Structure 7-slider parameter UI with simple 3-option layout model selector
- Add layout model mapping: chinese (PP-DocLayout-S), default (PubLayNet), cdla
- Add LayoutModelSelector component and zh-TW translations
- Fix "default" model behavior with sentinel value for PubLayNet
- Add gap filling service for OCR track coverage improvement
- Add PP-Structure debug utilities
- Archive completed/incomplete proposals:
  - add-ocr-track-gap-filling (complete)
  - fix-ocr-track-table-rendering (incomplete)
  - simplify-ppstructure-model-selection (22/25 tasks)
- Add new layout model tests, archive old PP-Structure param tests
- Update OpenSpec ocr-processing spec with layout model requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 13:27:00 +08:00
parent c65df754cf
commit 59206a6ab8
35 changed files with 3621 additions and 658 deletions
--- a/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/proposal.md
+++ b/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/proposal.md
@@ -0,0 +1,28 @@
+# Change: Fix OCR Track Table Empty Columns and Alignment
+
+## Why
+
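+Tables generated by PP-Structure often contain empty columns (columns whose cell is empty or whitespace-only in every row), which leaves empty and misaligned columns in the converted UnifiedDocument tables. The OCR Track currently uses the raw data as-is, without any cleanup, which degrades the quality of the PDF/JSON/Markdown output.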
+
+## What Changes
+
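+- Add a `trim_empty_columns()` function that removes empty columns from OCR Track tables
+- Call the cleanup logic at the entry of `_convert_table_data` so that TableData is clean
+- Recalculate col_span: if a span crosses a removed column, shrink the span
+- Update the columns/cols value and adjust each cell's col index
+- Optional: sort and align columns by bbox x0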
+
+## Impact
+
+- Affected specs: `ocr-processing`
+- Affected code:
+  - `backend/app/services/ocr_to_unified_converter.py` (main change)
+- Direct/HYBRID paths are not affected
+- PDF/JSON/Markdown output will be cleaner
+
+## Constraints
+
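+- Keep table bbox and page coordinates unchanged
+- Do not modify the Direct/HYBRID paths
+- Remove only columns that are empty in every row; a column with an empty header but non-empty data must not be removed
+- Preserve the original bbox to avoid layout drift in the PDF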
--- a/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/specs/ocr-processing/spec.md
+++ b/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/specs/ocr-processing/spec.md
@@ -0,0 +1,61 @@
+## ADDED Requirements
+
+### Requirement: OCR Table Empty Column Cleanup
+
+The OCR Track converter SHALL clean up PP-Structure generated tables by removing columns where all rows have empty or whitespace-only content.
+
+The system SHALL:
+1. Identify columns where every cell's content is empty or contains only whitespace (using `.strip()` to determine emptiness)
+2. Remove identified empty columns from the table structure
+3. Update the `columns`/`cols` value to reflect the new column count
+4. Recalculate each cell's `col` index to maintain continuity
+5. Adjust `col_span` values when spans cross removed columns (shrink span size)
+6. Remove cells entirely when their complete span falls within removed columns
+7. Preserve original bbox and page coordinates (no layout drift)
+8. If `columns` is 0 or missing after cleanup, fill with the calculated column count
+
+The cleanup SHALL NOT:
+- Remove columns where the header is empty but data rows contain values
+- Modify tables in Direct or HYBRID track
+- Alter the original bbox coordinates
+
+#### Scenario: All rows in column are empty
+- **WHEN** a table has a column where all cells contain only empty or whitespace content
+- **THEN** that column is removed
+- **AND** remaining cells have their `col` indices decremented appropriately
+- **AND** `cols` count is reduced by 1
+
+#### Scenario: Column has empty header but data has values
+- **WHEN** a table has a column where the header cell is empty
+- **AND** at least one data row cell in that column contains non-whitespace content
+- **THEN** that column is NOT removed
+
+#### Scenario: Cell span crosses removed column
+- **WHEN** a cell has `col_span > 1`
+- **AND** one or more columns within the span are removed
+- **THEN** the `col_span` is reduced by the number of removed columns within the span
+
+#### Scenario: Cell span entirely within removed columns
+- **WHEN** a cell's entire span falls within columns that are all removed
+- **THEN** that cell is removed from the table
+
+#### Scenario: Missing columns metadata
+- **WHEN** the table dict has `columns` set to 0 or missing
+- **AND** cleanup has been performed
+- **THEN** `columns` is set to the calculated number of remaining columns
+
+### Requirement: OCR Table Column Alignment by Bbox
+
+(Optional Enhancement) When bbox coordinates are available for table cells, the OCR Track converter SHALL use cell bbox x0 coordinates to improve column alignment accuracy.
+
+The system SHALL:
+1. Sort cells by bbox `x0` coordinate before assigning column indices
+2. Reassign `col` indices based on spatial position rather than HTML order
+
+This requirement is optional and implementation MAY be deferred if bbox data is not reliably available.
+
+#### Scenario: Cells reordered by bbox position
+- **WHEN** bbox coordinates are available for table cells
+- **AND** the original HTML order does not match spatial order
+- **THEN** cells are reordered by `x0` coordinate
+- **AND** `col` indices are reassigned to reflect spatial positioning
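The empty-column cleanup requirement above can be sketched as follows. This is a minimal illustration, not the converter's actual code: the table shape (`cols`, and `cells` with `row`, `col`, `col_span`, `content`) is an assumption, and so is the rule that a cell's content counts only toward its starting column.

```python
from typing import Any, Dict, List


def trim_empty_columns(table: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `table` with all-empty columns removed.

    Assumed (hypothetical) shape:
    {"cols": int, "cells": [{"row", "col", "col_span", "content", ...}]}.
    The real TableData layout may differ.
    """
    cells: List[Dict[str, Any]] = table.get("cells", [])
    if not cells:
        return dict(table)

    # `cols` may be 0 or missing, so also derive the width from the cells.
    n_cols = max(c["col"] + c.get("col_span", 1) for c in cells)
    n_cols = max(n_cols, table.get("cols") or 0)

    # Assumption: a cell's content counts toward its starting column only.
    keep = [False] * n_cols
    for c in cells:
        if str(c.get("content") or "").strip():
            keep[c["col"]] = True

    # new_index[i] = number of kept columns to the left of old column i.
    new_index = []
    kept_so_far = 0
    for k in keep:
        new_index.append(kept_so_far)
        kept_so_far += int(k)

    new_cells = []
    for c in cells:
        start, span = c["col"], c.get("col_span", 1)
        new_span = sum(keep[start:start + span])
        if new_span == 0:
            continue  # the whole span fell inside removed columns
        # Other keys (for example bbox) are copied through untouched.
        new_cells.append({**c, "col": new_index[start], "col_span": new_span})

    return {**table, "cells": new_cells, "cols": kept_so_far}
```

Because the function builds new dicts instead of mutating its input, the original bbox values pass through unchanged, which is what the "no layout drift" constraint asks for.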
--- a/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/tasks.md
+++ b/openspec/changes/archive/2025-11-26-fix-ocr-table-empty-columns/tasks.md
@@ -0,0 +1,43 @@
+# Tasks: Fix OCR Track Table Empty Columns
+
+## 1. Core Implementation
+
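+- [x] 1.1 Implement `trim_empty_columns(table_dict: Dict[str, Any]) -> Dict[str, Any]` in `ocr_to_unified_converter.py`
+  - Use the cells array to determine, for each column, whether its content is empty or whitespace-only in every row
+  - Use `.strip()` to detect whitespace
+- [x] 1.2 Implement column removal logic
+  - Update the columns/cols value
+  - Adjust each cell's col index
+- [x] 1.3 Implement col_span recalculation
+  - If a span crosses a removed column, shrink the span
+  - If the entire span falls within removed columns, remove the cell
+- [x] 1.4 Call `trim_empty_columns` at the entry of `_convert_table_data`
+  - Run the cleanup before building TableData
+  - Also add the cleanup to `_extract_table_data` (HTML table parsing)
+- [ ] 1.5 (Optional) Sort and align columns by bbox x0/x1
+  - If a bbox grid is available, sort by x0 first, then reassign col indices
+  - Deferred until the availability of bbox data is confirmed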
+
+## 2. Testing & Validation
+
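+- [x] 2.1 Unit tests pass
+  - Test basic empty column removal
+  - Test empty header with non-empty data (column not removed)
+  - Test col_span crossing a removed column (span shrinks)
+  - Test cell falling entirely within removed columns (cell removed)
+  - Test table with no empty columns (unchanged)
+- [x] 2.2 Check existing OCR results
+  - No table in the existing results has an entirely empty column
+  - The implementation is in place and will clean up empty columns when they occur
+- [x] 2.3 Confirm Direct/HYBRID tables are unchanged
+  - `OCRToUnifiedConverter` is used only in `ocr_service.py`
+  - The Direct track uses `DirectExtractionEngine` and is not affected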
+
+## 3. Edge Cases & Validation
+
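+- [x] 3.1 Handle `columns` being 0 or missing
+  - Backfill with the calculated column count to avoid errors in downstream dependencies
+- [x] 3.2 Handle an empty header with non-empty data
+  - Remove only columns that are empty in every row
+- [x] 3.3 Make sure `backend/storage/results/...` is not modified directly
+  - The converter is modified, so tasks must be re-run to validate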
--- a/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/design.md
+++ b/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/design.md
@@ -0,0 +1,183 @@
+# Design: OCR Track Gap Filling
+
+## Context
+
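+The PP-StructureV3 layout analysis model misses a large share of the content on some scanned documents. In testing, Raw PaddleOCR detected 56 text regions, but PP-StructureV3 output only 9 elements (84% lost).
+
+The problem lies in the Layout Detection Model inside PP-StructureV3. It is a limitation of the PaddleOCR library and cannot be fixed from outside. The `text_regions` data from Raw OCR, however, remains complete and usable.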
+
+### Stakeholders
+- **End users**: need complete OCR output without large amounts of missing text
+- **OCR track**: needs to integrate Raw OCR and PP-StructureV3 results
+- **Direct/Hybrid track**: should not be affected by this change
+
+## Goals / Non-Goals
+
+### Goals
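+- Detect regions missed by PP-StructureV3 and fill them in with Raw OCR results
+- Ensure the filled-in text does not duplicate existing elements
+- Maintain the correct reading order
+- Affect only the OCR track; leave the behavior of other tracks unchanged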
+
+### Non-Goals
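+- Do not modify the internal logic of PP-StructureV3 or PaddleOCR
+- Do not fill gaps for non-text elements such as images, tables, or charts
+- Do not implement complex layout analysis (gap filling only)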
+
+## Decisions
+
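+### Decision 1: Coverage Detection Strategy
+**Choice**: Use the "center point inside" test first, supplemented by an IoU threshold
+
+**Rationale**:
+- The center-point test is simple to compute and fast
+- The IoU threshold acts as a supplement for edge cases
+- A suggested IoU threshold is 0.1 to 0.2, so that regions with low IoU are not misjudged as uncovered
+
+**Alternatives**:
+- Pure IoU test: more computation, and handling of partial overlaps is more complex
+- Area-ratio test: not fair enough to regions of different sizes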
+
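+### Decision 2: Gap Filling Trigger Condition
+**Choice**: When PP-Structure coverage is < 70%, or the element count is significantly lower than Raw OCR
+
+**Rationale**:
+- Avoids duplicated text in normal documents
+- The 70% threshold is an empirical value and can be adjusted through configuration
+- The element count comparison serves as a quick check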
+
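+### Decision 3: Supplemented Element Types
+**Choice**: Supplement TEXT only; skip TABLE/IMAGE/FIGURE/FLOWCHART/HEADER/FOOTER
+
+**Rationale**:
+- PP-StructureV3 usually recognizes structured elements (tables, images) more accurately
+- Filling in Raw OCR text could break table structure
+- These elements need to keep their structural integrity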
+
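+### Decision 4: Duplicate Detection and Deduplication
+**Choice**: Raw OCR regions with IoU > 0.5 against a PP-Structure TEXT element are treated as duplicates and skipped
+
+**Rationale**:
+- 0.5 is a common overlap threshold
+- Avoids the same text appearing twice
+- Lightweight merging may be considered for fragmented Raw OCR boxes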
+
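+### Decision 5: Coordinate Alignment
+**Choice**: Use `ocr_dimensions` to convert bboxes
+
+**Rationale**:
+- OCR may resize the image
+- Ensures Raw OCR and PP-Structure coordinates are in the same space
+- Avoids coverage misdetection caused by mismatched dimensions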
+
+## Data Flow
+
+```
+┌─────────────────┐     ┌──────────────────────┐
+│  Raw OCR Result │     │ PP-StructureV3 Result│
+│  (56 regions)   │     │    (9 elements)      │
+└────────┬────────┘     └──────────┬───────────┘
+         │                         │
+         └────────────┬────────────┘
+                      │
+              ┌───────▼───────┐
+              │ GapFillingService │
+              │ 1. Calculate coverage
+              │ 2. Find uncovered regions
+              │ 3. Filter by confidence
+              │ 4. Deduplicate
+              │ 5. Merge if needed
+              └───────┬───────┘
+                      │
+              ┌───────▼───────┐
+              │ OCRToUnifiedConverter │
+              │ - Combine elements
+              │ - Recalculate reading order
+              └───────┬───────┘
+                      │
+              ┌───────▼───────┐
+              │ UnifiedDocument │
+              │ (complete content)
+              └───────────────┘
+```
+
+## Algorithm: Gap Detection
+
+```python
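+from typing import List
+
+# TextRegion, Element, SKIP_TYPES, get_center, point_in_bbox and calculate_iou
+# are assumed to be defined elsewhere in the gap filling module.
+
+def find_uncovered_regions(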
+    raw_ocr_regions: List[TextRegion],
+    pp_structure_elements: List[Element],
+    iou_threshold: float = 0.15
+) -> List[TextRegion]:
+    """
+    Find Raw OCR regions not covered by PP-Structure elements.
+
+    Coverage criteria (either one):
+    1. Center point of raw region falls inside any PP-Structure bbox
+    2. IoU with any PP-Structure bbox > iou_threshold
+    """
+    uncovered = []
+
+    # Filter PP-Structure elements: only consider TEXT, skip TABLE/IMAGE/etc.
+    text_elements = [e for e in pp_structure_elements
+                     if e.type not in SKIP_TYPES]
+
+    for region in raw_ocr_regions:
+        center = get_center(region.bbox)
+        is_covered = False
+
+        for element in text_elements:
+            # Check center point
+            if point_in_bbox(center, element.bbox):
+                is_covered = True
+                break
+
+            # Check IoU
+            if calculate_iou(region.bbox, element.bbox) > iou_threshold:
+                is_covered = True
+                break
+
+        if not is_covered:
+            uncovered.append(region)
+
+    return uncovered
+```
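The algorithm above relies on three geometry helpers that the design does not show. A minimal sketch of what they might look like, assuming bboxes are `(x0, y0, x1, y1)` tuples (the source does not confirm the layout):

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def get_center(bbox: BBox) -> Tuple[float, float]:
    """Center point of a bbox."""
    x0, y0, x1, y1 = bbox
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def point_in_bbox(point: Tuple[float, float], bbox: BBox) -> bool:
    """True when the point lies inside the bbox, edges included."""
    x, y = point
    x0, y0, x1, y1 = bbox
    return x0 <= x <= x1 and y0 <= y <= y1


def calculate_iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two bboxes; 0.0 when they do not overlap."""
    # Intersection width/height clamp to 0 when the boxes are disjoint.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

The sketch also shows why the center-point test comes first: a small text line inside a large layout block, such as `(10, 10, 20, 20)` inside `(0, 0, 100, 100)`, has an IoU of only 0.01, well under the 0.15 threshold, yet its center clearly falls inside the block.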
+
+## Configuration Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `gap_filling_enabled` | bool | True | Whether gap filling is enabled |
+| `gap_filling_coverage_threshold` | float | 0.7 | Gap filling activates when coverage falls below this value |
+| `gap_filling_iou_threshold` | float | 0.15 | IoU threshold for the coverage test |
+| `gap_filling_confidence_threshold` | float | 0.3 | Minimum Raw OCR confidence |
+| `gap_filling_dedup_iou_threshold` | float | 0.5 | IoU threshold for deduplication |
+
+## Risks / Trade-offs
+
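+### Risk 1: Gap filling produces duplicate text
+**Mitigation**: Set dedup_iou_threshold and deduplicate highly overlapping regions
+
+### Risk 2: Reading order gets scrambled
+**Mitigation**: After adding elements, recalculate reading_order for the whole page (sorted by y0, x0)
+
+### Risk 3: Performance impact
+**Mitigation**:
+- Run a quick coverage check first; if coverage is >= 70%, skip gap filling
+- Use an R-tree or interval tree to speed up bbox queries (if performance becomes a bottleneck)
+
+### Risk 4: Coordinate misalignment
+**Mitigation**: Use `ocr_dimensions` to keep the coordinate spaces consistent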
+
+## Migration Plan
+
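+1. The new feature is optional (enabled by default)
+2. Gap filling can be turned off through configuration
+3. Existing API interfaces are not affected
+4. Backward compatible: default behavior is used when no parameters are passed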
+
+## Open Questions
+
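+1. Is a UI toggle needed so users can enable or disable gap filling?
+2. Is merging logic needed for fragmented Raw OCR boxes (same line, adjacent, with very small gaps)?
+3. Should the output mark which elements came from gap filling (for debugging)?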
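The trigger condition (Decision 2) and the reading order mitigation (Risk 2) in this design can be sketched as below. The function names and the element dict shape (`bbox` as `(x0, y0, x1, y1)`, plus a `reading_order` key) are hypothetical. The element-count comparison is left out because the design does not define what "significantly lower" means.

```python
from typing import Dict, List, Sequence


def coverage_ratio(covered_flags: Sequence[bool]) -> float:
    """Fraction of Raw OCR regions already covered; 1.0 when there are none."""
    if not covered_flags:
        return 1.0
    return sum(covered_flags) / len(covered_flags)


def should_fill_gaps(
    covered_flags: Sequence[bool],
    enabled: bool = True,
    coverage_threshold: float = 0.7,
) -> bool:
    """Activate gap filling only when enabled and coverage is below threshold."""
    return enabled and coverage_ratio(covered_flags) < coverage_threshold


def recalculate_reading_order(elements: List[Dict]) -> List[Dict]:
    """Sort top to bottom (y0), then left to right (x0), and renumber."""
    ordered = sorted(elements, key=lambda e: (e["bbox"][1], e["bbox"][0]))
    return [{**e, "reading_order": i} for i, e in enumerate(ordered)]
```

With the default threshold, a page where 3 of 10 Raw OCR regions are covered triggers gap filling, while a page with 8 of 10 covered does not. A plain `(y0, x0)` sort is the simplest ordering that matches the design; it does not try to handle multi-column layouts.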
--- a/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/proposal.md
+++ b/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/proposal.md
@@ -0,0 +1,30 @@
+# Change: Add OCR Track Gap Filling with Raw OCR Text Regions
+
+## Why
+
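+The PP-StructureV3 layout analysis model misses a large share of the content on some scanned documents, so a large amount of text is lost. Testing with scan.pdf showed:
+- Raw PaddleOCR text recognition: detected **56 text regions**
+- PP-StructureV3 layout analysis: output only **9 elements**
+- Loss ratio: about **84%** of the content was not recognized by PP-StructureV3
+
+The root cause is that the Layout Detection Model inside PP-StructureV3 does not adequately support this type of scanned document; it is not a problem in our code. Raw OCR correctly detects all text regions, but this information is lost during PP-StructureV3's structured processing.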
+
+## What Changes
+
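+Implement a "Hybrid Approach": use Raw OCR text regions to supplement the content that PP-StructureV3 misses.
+
+- **Added** `GapFillingService` class, responsible for detecting and filling in text regions missed by PP-StructureV3
+- **Added** coverage calculation logic (center point inside, or IoU threshold)
+- **Added** automatic activation condition: PP-Structure coverage < 70%, or element count significantly lower than the number of Raw OCR boxes
+- **Modified** `OCRToUnifiedConverter` to integrate the gap filling logic
+- **Added** reading_order recalculation logic (sorted by y0, x0)
+- **Added** test cases: a severe PP-Structure miss-detection case, and validation on a normal document with no missed detections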
+
+## Impact
+
+- **Affected specs**: `ocr-processing`
+- **Affected code**:
+  - `backend/app/services/ocr_to_unified_converter.py` - integrate gap filling
+  - `backend/app/services/gap_filling_service.py` - new (core logic)
+  - `backend/tests/test_gap_filling.py` - new (tests)
+- **Track isolation**: applies only to the OCR track; Direct/Hybrid tracks are not affected
--- a/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/specs/ocr-processing/spec.md
+++ b/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/specs/ocr-processing/spec.md
@@ -0,0 +1,111 @@
+## ADDED Requirements
+
+### Requirement: OCR Track Gap Filling with Raw OCR Regions
+
+The system SHALL detect and fill gaps in PP-StructureV3 output by supplementing with Raw OCR text regions when significant content loss is detected.
+
+#### Scenario: Gap filling activates when coverage is low
+- **GIVEN** an OCR track processing task
+- **WHEN** PP-StructureV3 outputs elements that cover less than 70% of Raw OCR text regions
+- **THEN** the system SHALL activate gap filling
+- **AND** identify Raw OCR regions not covered by any PP-StructureV3 element
+- **AND** supplement these regions as TEXT elements in the output
+
+#### Scenario: Coverage is determined by center-point and IoU
+- **GIVEN** a Raw OCR text region with bounding box
+- **WHEN** checking if the region is covered by PP-StructureV3
+- **THEN** the region SHALL be considered covered if its center point falls inside any PP-StructureV3 element bbox
+- **OR** if IoU with any PP-StructureV3 element exceeds 0.15 threshold
+- **AND** regions not meeting either criterion SHALL be marked as uncovered
+
+#### Scenario: Only TEXT elements are supplemented
+- **GIVEN** uncovered Raw OCR regions identified for supplementation
+- **WHEN** PP-StructureV3 has detected TABLE, IMAGE, FIGURE, FLOWCHART, HEADER, or FOOTER elements
+- **THEN** the system SHALL NOT supplement regions that overlap with these structural elements
+- **AND** only supplement regions as TEXT type to preserve structural integrity
+
+#### Scenario: Supplemented regions meet confidence threshold
+- **GIVEN** Raw OCR regions to be supplemented
+- **WHEN** a region has confidence score below 0.3
+- **THEN** the system SHALL skip that region
+- **AND** only supplement regions with confidence >= 0.3
+
+#### Scenario: Deduplication prevents repeated text
+- **GIVEN** a Raw OCR region being considered for supplementation
+- **WHEN** the region has IoU > 0.5 with any existing PP-StructureV3 TEXT element
+- **THEN** the system SHALL skip that region to prevent duplicate text
+- **AND** the original PP-StructureV3 element SHALL be preserved
+
+#### Scenario: Reading order is recalculated after gap filling
+- **GIVEN** supplemented elements have been added to the page
+- **WHEN** assembling the final element list
+- **THEN** the system SHALL recalculate reading order for the entire page
+- **AND** sort elements by y0 coordinate (top to bottom) then x0 (left to right)
+- **AND** ensure logical document flow is maintained
+
+#### Scenario: Coordinate alignment with ocr_dimensions
+- **GIVEN** Raw OCR processing may involve image resizing
+- **WHEN** comparing Raw OCR bbox with PP-StructureV3 bbox
+- **THEN** the system SHALL use ocr_dimensions to normalize coordinates
+- **AND** ensure both sources reference the same coordinate space
+- **AND** prevent coverage misdetection due to scale differences
+
+#### Scenario: Supplemented elements have complete metadata
+- **GIVEN** a Raw OCR region being added as supplemented element
+- **WHEN** creating the DocumentElement
+- **THEN** the element SHALL include page_number
+- **AND** include confidence score from Raw OCR
+- **AND** include original bbox coordinates
+- **AND** optionally include source indicator for debugging
+
+### Requirement: Gap Filling Track Isolation
+
+The gap filling feature SHALL only apply to OCR track processing and SHALL NOT affect Direct or Hybrid track outputs.
+
+#### Scenario: Gap filling only activates for OCR track
+- **GIVEN** a document processing task
+- **WHEN** the processing track is OCR
+- **THEN** the system SHALL evaluate and apply gap filling as needed
+- **AND** produce enhanced output with supplemented content
+
+#### Scenario: Direct track is unaffected
+- **GIVEN** a document processing task with Direct track
+- **WHEN** the task is processed
+- **THEN** the system SHALL NOT invoke any gap filling logic
+- **AND** produce output identical to current Direct track behavior
+
+#### Scenario: Hybrid track is unaffected
+- **GIVEN** a document processing task with Hybrid track
+- **WHEN** the task is processed
+- **THEN** the system SHALL NOT invoke gap filling logic
+- **AND** use existing Hybrid track processing pipeline
+
+### Requirement: Gap Filling Configuration
+
+The system SHALL provide configurable parameters for gap filling behavior.
+
+#### Scenario: Gap filling can be disabled via configuration
+- **GIVEN** gap_filling_enabled is set to false in configuration
+- **WHEN** OCR track processing runs
+- **THEN** the system SHALL skip all gap filling logic
+- **AND** output only PP-StructureV3 results as before
+
+#### Scenario: Coverage threshold is configurable
+- **GIVEN** gap_filling_coverage_threshold is set to 0.8
+- **WHEN** PP-StructureV3 coverage is 75%
+- **THEN** the system SHALL activate gap filling
+- **AND** supplement uncovered regions
+
+#### Scenario: IoU thresholds are configurable
+- **GIVEN** custom IoU thresholds configured:
+  - gap_filling_iou_threshold: 0.2
+  - gap_filling_dedup_iou_threshold: 0.6
+- **WHEN** evaluating coverage and deduplication
+- **THEN** the system SHALL use the configured values
+- **AND** apply them consistently throughout gap filling process
+
+#### Scenario: Confidence threshold is configurable
+- **GIVEN** gap_filling_confidence_threshold is set to 0.5
+- **WHEN** supplementing Raw OCR regions
+- **THEN** the system SHALL only include regions with confidence >= 0.5
+- **AND** filter out lower confidence regions
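The confidence and deduplication scenarios above can be sketched as a single filtering pass. This is an illustration under stated assumptions: the region dict keys (`bbox`, `confidence`), the `type` and `source` values, and the function name are hypothetical, and the IoU function is passed in as a parameter instead of being tied to any particular implementation.

```python
from typing import Callable, Dict, List, Sequence

BBox = Sequence[float]  # (x0, y0, x1, y1), assumed layout


def select_supplements(
    candidates: List[Dict],
    text_bboxes: List[BBox],
    iou: Callable[[BBox, BBox], float],
    confidence_threshold: float = 0.3,
    dedup_iou_threshold: float = 0.5,
) -> List[Dict]:
    """Keep candidate Raw OCR regions that are confident and not duplicates."""
    selected = []
    for region in candidates:
        if region["confidence"] < confidence_threshold:
            continue  # below the confidence threshold
        if any(iou(region["bbox"], b) > dedup_iou_threshold for b in text_bboxes):
            continue  # overlaps an existing TEXT element too much
        # Tag the copy so supplemented elements can be told apart when debugging.
        selected.append({**region, "type": "text", "source": "gap_filling"})
    return selected
```

The defaults mirror the documented settings (`gap_filling_confidence_threshold` 0.3 and `gap_filling_dedup_iou_threshold` 0.5), so a caller can pass configured values straight through.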
--- a/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/tasks.md
+++ b/openspec/changes/archive/2025-11-27-add-ocr-track-gap-filling/tasks.md
@@ -0,0 +1,44 @@
+# Tasks: Add OCR Track Gap Filling
+
+## 1. Core Implementation
+
+- [x] 1.1 Create `gap_filling_service.py` with `GapFillingService` class
+- [x] 1.2 Implement bbox coverage calculation (center-point and IoU methods)
+- [x] 1.3 Implement gap detection logic (find uncovered raw OCR regions)
+- [x] 1.4 Implement confidence threshold filtering for supplemented regions
+- [x] 1.5 Implement element type filtering (only supplement TEXT, skip TABLE/IMAGE/FIGURE/etc.)
+- [x] 1.6 Implement reading order recalculation (sort by y0, x0)
+- [x] 1.7 Implement deduplication logic (skip high IoU overlaps with PP-Structure TEXT)
+- [x] 1.8 Implement optional text merging for fragmented adjacent regions
+
+## 2. Integration
+
+- [x] 2.1 Modify `OCRToUnifiedConverter` to accept raw OCR text_regions
+- [x] 2.2 Add gap filling activation condition check (coverage < 70% or element count disparity)
+- [x] 2.3 Ensure coordinate alignment between raw OCR and PP-Structure (ocr_dimensions handling)
+- [x] 2.4 Add page metadata (page_number, confidence, bbox) to supplemented elements
+- [x] 2.5 Ensure track isolation (only OCR track, not Direct/Hybrid)
+
+## 3. Configuration
+
+- [x] 3.1 Add configurable parameters to settings:
+  - `gap_filling_enabled`: bool (default: True)
+  - `gap_filling_coverage_threshold`: float (default: 0.7)
+  - `gap_filling_iou_threshold`: float (default: 0.15)
+  - `gap_filling_confidence_threshold`: float (default: 0.3)
+  - `gap_filling_dedup_iou_threshold`: float (default: 0.5)
+
+## 4. Testing (with env)
+
+- [x] 4.1 Create test fixtures with PP-Structure severe miss-detection case (with scan.pdf / scan2.pdf)
+- [x] 4.2 Test gap detection correctly identifies uncovered regions
+- [x] 4.3 Test supplemented elements have correct metadata
+- [x] 4.4 Test reading order is correctly recalculated
+- [x] 4.5 Test deduplication prevents duplicate text
+- [x] 4.6 Test normal document without miss-detection has no duplicate/inflation
+- [x] 4.7 Test track isolation (Direct track unaffected)
+
+## 5. Documentation
+
+- [x] 5.1 Add inline documentation to GapFillingService
+- [x] 5.2 Update configuration documentation with new settings
--- a/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/proposal.md
+++ b/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/proposal.md
--- a/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/specs/pdf-generation/spec.md
+++ b/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/specs/pdf-generation/spec.md
--- a/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/tasks.md
+++ b/openspec/changes/archive/2025-11-27-fix-ocr-track-table-rendering/tasks.md
--- a/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/proposal.md
+++ b/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/proposal.md
@@ -0,0 +1,40 @@
+# Change: Simplify PP-StructureV3 Configuration with Layout Model Selection
+
+## Why
+
+Current PP-StructureV3 parameter adjustment UI exposes 7 technical ML parameters (thresholds, ratios, merge modes) that are difficult for end users to understand. Meanwhile, switching to a different layout detection model (e.g., CDLA-trained models for Chinese documents) would have a much greater impact on OCR quality than fine-tuning these parameters.
+
+**Problems with current approach:**
+- Users don't understand what `layout_detection_threshold` or `text_det_unclip_ratio` mean
+- Wrong parameter values can make OCR results worse
+- The default model (PubLayNet-based) is optimized for English academic papers, not Chinese business documents
+- Model selection is far more impactful than parameter tuning
+
+## What Changes
+
+### Backend Changes
+- **REMOVED**: API parameter `pp_structure_params` from task start endpoint
+- **ADDED**: New API parameter `layout_model` with predefined options:
+  - `"default"` - Standard model (PubLayNet-based, for English documents)
+  - `"chinese"` - PP-DocLayout-S model (for Chinese documents, forms, contracts)
+  - `"cdla"` - CDLA model (alternative Chinese document layout model)
+- **MODIFIED**: PP-StructureV3 initialization uses `layout_detection_model_name` based on selection
+- Keep fine-tuning parameters in backend `config.py` with optimized defaults
+
+### Frontend Changes
+- **REMOVED**: `PPStructureParams.tsx` component (slider/dropdown UI for 7 parameters)
+- **ADDED**: Simple radio button/dropdown for layout model selection with clear descriptions
+- **MODIFIED**: Task start request body to send `layout_model` instead of `pp_structure_params`
+
+### API Changes
+- **BREAKING**: Remove `pp_structure_params` from `POST /api/v2/tasks/{task_id}/start`
+- **ADDED**: New optional parameter `layout_model: "default" | "chinese" | "cdla"`
+
+## Impact
+
+- Affected specs: `ocr-processing`
+- Affected code:
+  - Backend: `app/routers/tasks.py`, `app/services/ocr_service.py`, `app/core/config.py`
+  - Frontend: `src/components/PPStructureParams.tsx` (remove), `src/types/apiV2.ts`, task start form
+- Breaking change: Clients using `pp_structure_params` will need to migrate to `layout_model`
+- User impact: Simpler UI, better default OCR quality for Chinese documents
--- a/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/specs/ocr-processing/spec.md
+++ b/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/specs/ocr-processing/spec.md
@@ -0,0 +1,86 @@
+# ocr-processing Specification Delta
+
+## REMOVED Requirements
+
+### Requirement: Frontend-Adjustable PP-StructureV3 Parameters
+**Reason**: Complex ML parameters are difficult for end users to understand and tune. Model selection provides better UX and more significant quality improvements.
+**Migration**: Replace `pp_structure_params` API parameter with `layout_model` parameter.
+
+### Requirement: PP-StructureV3 Parameter UI Controls
+**Reason**: Slider/dropdown UI for 7 technical parameters adds complexity without proportional benefit. Simple model selection is more user-friendly.
+**Migration**: Remove `PPStructureParams.tsx` component, add `LayoutModelSelector.tsx` component.
+
+## ADDED Requirements
+
+### Requirement: Layout Model Selection
+The system SHALL allow users to select a layout detection model optimized for their document type, providing a simple choice between pre-configured models instead of manual parameter tuning.
+
+#### Scenario: User selects Chinese document model
+- **GIVEN** a user is processing Chinese business documents (forms, contracts, invoices)
+- **WHEN** the user selects "Chinese Document Model" (PP-DocLayout-S)
+- **THEN** the OCR engine SHALL use the PP-DocLayout-S layout detection model
+- **AND** the model SHALL be optimized for 23 Chinese document element types
+- **AND** table and form detection accuracy SHALL be improved over the default model
+
+#### Scenario: User selects standard model for English documents
+- **GIVEN** a user is processing English academic papers or reports
+- **WHEN** the user selects "Standard Model" (PubLayNet-based)
+- **THEN** the OCR engine SHALL use the default PubLayNet-based layout detection model
+- **AND** the model SHALL be optimized for English document layouts
+
+#### Scenario: User selects CDLA model for specialized Chinese layout
+- **GIVEN** a user is processing Chinese documents with complex layouts
+- **WHEN** the user selects "CDLA Model"
+- **THEN** the OCR engine SHALL use the picodet_lcnet_x1_0_fgd_layout_cdla model
+- **AND** the model SHALL provide specialized Chinese document layout analysis
+
+#### Scenario: Layout model is sent via API request
+- **GIVEN** a frontend application with model selection UI
+- **WHEN** the user starts task processing with a selected model
+- **THEN** the frontend SHALL send the model choice in the request body:
+  ```json
+  POST /api/v2/tasks/{task_id}/start
+  {
+    "use_dual_track": true,
+    "force_track": "ocr",
+    "language": "ch",
+    "layout_model": "chinese"
+  }
+  ```
+- **AND** the backend SHALL configure PP-StructureV3 with the corresponding model
+
+#### Scenario: Default model when not specified
+- **GIVEN** an API request without `layout_model` parameter
+- **WHEN** the task is started
+- **THEN** the system SHALL use "chinese" (PP-DocLayout-S) as the default model
+- **AND** processing SHALL work correctly without requiring model selection
+
+#### Scenario: Invalid model name is rejected
+- **GIVEN** a request with an invalid `layout_model` value
+- **WHEN** the user sends `layout_model: "invalid_model"`
+- **THEN** the API SHALL return 422 Validation Error
+- **AND** provide a clear error message listing valid model options
+
+### Requirement: Layout Model Selection UI
+The frontend SHALL provide a simple, user-friendly interface for selecting layout detection models with clear descriptions of each option.
+
+#### Scenario: Model options are displayed with descriptions
+- **GIVEN** the model selection UI is displayed
+- **WHEN** the user views the available options
+- **THEN** the UI SHALL show the following options:
+  - "Chinese Document Model (Recommended)" - for Chinese forms, contracts, invoices
+  - "Standard Model" - for English academic papers, reports
+  - "CDLA Model" - for specialized Chinese layout analysis
+- **AND** each option SHALL have a brief description of its use case
+
+#### Scenario: Chinese model is selected by default
+- **GIVEN** the user opens the task processing interface
+- **WHEN** the model selection is displayed
+- **THEN** "Chinese Document Model" SHALL be pre-selected as the default
+- **AND** the user MAY change the selection before starting processing
+
+#### Scenario: Model selection is visible only for OCR track
+- **GIVEN** a document processing interface
+- **WHEN** the user selects processing track
+- **THEN** layout model selection SHALL be shown ONLY when OCR track is selected or auto-detected
+- **AND** SHALL be hidden for Direct track (which does not use PP-StructureV3)
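The mapping from the `layout_model` option to a PP-StructureV3 model name, as described in the spec above, can be sketched with the standard library alone. Several parts are assumptions made for illustration: `None` stands in for the "default" sentinel (the commit message mentions a sentinel but not its value), `resolve_layout_model` is a hypothetical name, and the `ValueError` stands in for the 422 response that the API layer would return.

```python
from enum import Enum
from typing import Optional


class LayoutModel(str, Enum):
    DEFAULT = "default"
    CHINESE = "chinese"
    CDLA = "cdla"


# None is a stand-in sentinel meaning "use the built-in PubLayNet-based model".
# The real sentinel value is not given in the source.
MODEL_NAMES = {
    LayoutModel.CHINESE: "PP-DocLayout-S",
    LayoutModel.CDLA: "picodet_lcnet_x1_0_fgd_layout_cdla",
    LayoutModel.DEFAULT: None,
}


def resolve_layout_model(value: Optional[str] = None) -> Optional[str]:
    """Map the API option to a layout_detection_model_name (hypothetical helper)."""
    try:
        # An unspecified model falls back to "chinese", per the spec.
        choice = LayoutModel(value) if value is not None else LayoutModel.CHINESE
    except ValueError:
        valid = ", ".join(m.value for m in LayoutModel)
        raise ValueError(
            f"invalid layout_model {value!r}; valid options: {valid}"
        ) from None
    return MODEL_NAMES[choice]
```

Keeping the option names in one enum means the request validation, the error message listing valid options, and the model lookup cannot drift apart.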
--- a/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/tasks.md
+++ b/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/tasks.md
@@ -0,0 +1,56 @@
+# Implementation Tasks
+
+## 1. Backend API Changes
+
+- [x] 1.1 Update `app/schemas/task.py` to add `layout_model` enum type
+- [x] 1.2 Update `app/routers/tasks.py` to replace `pp_structure_params` with `layout_model` parameter
+- [x] 1.3 Update `app/services/ocr_service.py` to map `layout_model` to `layout_detection_model_name`
+- [x] 1.4 Remove custom PP-Structure engine creation logic (use model selection instead)
+- [x] 1.5 Add backward compatibility: default to "chinese" if no model specified
+
+## 2. Backend Configuration
+
+- [x] 2.1 Keep `layout_detection_model_name` in `config.py` as fallback default
+- [x] 2.2 Keep fine-tuning parameters in `config.py` (not exposed to API)
+- [x] 2.3 Document available layout models in config comments
+
+## 3. Frontend Changes
+
+- [x] 3.1 Remove `PPStructureParams.tsx` component
+- [x] 3.2 Update `src/types/apiV2.ts`:
+  - Remove `PPStructureV3Params` interface
+  - Add `LayoutModel` type: `"default" | "chinese" | "cdla"`
+  - Update `ProcessingOptions` to use `layout_model` instead of `pp_structure_params`
+- [x] 3.3 Create `LayoutModelSelector.tsx` component with:
+  - Radio buttons or dropdown for model selection
+  - Clear descriptions for each model option
+  - Default selection: "chinese"
+- [x] 3.4 Update task start form to use new `LayoutModelSelector`
+- [x] 3.5 Update API calls to send `layout_model` instead of `pp_structure_params`
+
+## 4. Internationalization
+
+- [x] 4.1 Add i18n strings for layout model options:
+  - `layoutModel.default`: "Standard Model (English documents)"
+  - `layoutModel.chinese`: "Chinese Document Model (Recommended)"
+  - `layoutModel.cdla`: "CDLA Model (Chinese layout analysis)"
+- [x] 4.2 Add i18n strings for model descriptions
+
+## 5. Testing
+
+- [x] 5.1 Create new tests for `layout_model` parameter (`test_layout_model_api.py`, `test_layout_model.py`)
+- [x] 5.2 Archive tests for `pp_structure_params` validation (moved to `tests/archived/`)
+- [x] 5.3 Add tests for layout model selection (19 tests passing)
+- [x] 5.4 Test backward compatibility (no model specified → use chinese default)
+
+## 6. Documentation
+
+- [ ] 6.1 Update API documentation for task start endpoint
+- [ ] 6.2 Remove PP-Structure parameter documentation
+- [ ] 6.3 Add layout model selection documentation
+
+## 7. Cleanup
+
+- [x] 7.1 Remove localStorage keys for PP-Structure params (`pp_structure_params_presets`, `pp_structure_params_last_used`)
+- [x] 7.2 Remove any unused imports/types related to PP-Structure params
+- [x] 7.3 Archive old PP-Structure params test files
--- a/openspec/specs/ocr-processing/spec.md
+++ b/openspec/specs/ocr-processing/spec.md
@@ -3,100 +3,186 @@
 ## Purpose
 TBD - created by archiving change frontend-adjustable-ppstructure-params. Update Purpose after archive.
 ## Requirements
-### Requirement: Frontend-Adjustable PP-StructureV3 Parameters
-The system SHALL allow frontend users to dynamically adjust PP-StructureV3 OCR parameters for fine-tuning document processing without backend configuration changes.
+### Requirement: OCR Track Gap Filling with Raw OCR Regions

-#### Scenario: User adjusts layout detection threshold
- **GIVEN** a user is processing a document with OCR track
- **WHEN** the user sets `layout_detection_threshold` to 0.1 (lower than default 0.2)
- **THEN** the OCR engine SHALL detect more layout blocks including weak signals
- **AND** the processing SHALL use the custom parameter instead of backend defaults
- **AND** the custom parameter SHALL NOT be cached for reuse
+The system SHALL detect and fill gaps in PP-StructureV3 output by supplementing with Raw OCR text regions when significant content loss is detected.

-#### Scenario: User selects high-quality preset configuration
- **GIVEN** a user wants to process a complex document with many small text elements
- **WHEN** the user selects "High Quality" preset mode
- **THEN** the system SHALL automatically set:
-  - `layout_detection_threshold` to 0.1
-  - `layout_nms_threshold` to 0.15
-  - `text_det_thresh` to 0.1
-  - `text_det_box_thresh` to 0.2
- **AND** process the document with these optimized parameters
+#### Scenario: Gap filling activates when coverage is low
+- **GIVEN** an OCR track processing task
+- **WHEN** PP-StructureV3 outputs elements that cover less than 70% of Raw OCR text regions
+- **THEN** the system SHALL activate gap filling
+- **AND** identify Raw OCR regions not covered by any PP-StructureV3 element
+- **AND** supplement these regions as TEXT elements in the output

-#### Scenario: User adjusts text detection parameters
- **GIVEN** a document with low-contrast text
- **WHEN** the user sets:
-  - `text_det_thresh` to 0.05 (very low)
-  - `text_det_unclip_ratio` to 1.5 (larger boxes)
- **THEN** the OCR SHALL detect more small and low-contrast text
- **AND** text bounding boxes SHALL be expanded by the specified ratio
+#### Scenario: Coverage is determined by center-point and IoU
+- **GIVEN** a Raw OCR text region with a bounding box
+- **WHEN** checking if the region is covered by PP-StructureV3
+- **THEN** the region SHALL be considered covered if its center point falls inside any PP-StructureV3 element bbox
+- **OR** if its IoU with any PP-StructureV3 element exceeds the 0.15 threshold
+- **AND** regions not meeting either criterion SHALL be marked as uncovered
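
Review note: the coverage rule in this scenario can be sketched as below. This is a minimal illustration, not the code from the commit. The names `iou`, `center_inside`, `is_covered`, and `coverage_ratio` are hypothetical, boxes are assumed to be `(x0, y0, x1, y1)` tuples already in the same coordinate space, and the coverage ratio is assumed to be counted per region (the spec does not say whether it is by count or by area).

```python
from typing import Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center_inside(region: BBox, element: BBox) -> bool:
    """True if the center point of `region` lies inside `element`."""
    cx = (region[0] + region[2]) / 2
    cy = (region[1] + region[3]) / 2
    return element[0] <= cx <= element[2] and element[1] <= cy <= element[3]


def is_covered(region: BBox, elements: Sequence[BBox], iou_threshold: float = 0.15) -> bool:
    """A region is covered if either criterion holds for any element."""
    return any(
        center_inside(region, e) or iou(region, e) > iou_threshold
        for e in elements
    )


def coverage_ratio(regions: Sequence[BBox], elements: Sequence[BBox],
                   iou_threshold: float = 0.15) -> float:
    """Fraction of regions that are covered (assumption: counted per region)."""
    if not regions:
        return 1.0
    covered = sum(is_covered(r, elements, iou_threshold) for r in regions)
    return covered / len(regions)
```

Under these assumptions, gap filling would activate when `coverage_ratio(...)` falls below the 0.7 threshold named in the first scenario.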

-#### Scenario: Parameters are sent via API request body
- **GIVEN** a frontend application with parameter adjustment UI
- **WHEN** the user starts task processing with custom parameters
- **THEN** the frontend SHALL send parameters in the request body (not query params):
+#### Scenario: Only TEXT elements are supplemented
+- **GIVEN** uncovered Raw OCR regions identified for supplementation
+- **WHEN** PP-StructureV3 has detected TABLE, IMAGE, FIGURE, FLOWCHART, HEADER, or FOOTER elements
+- **THEN** the system SHALL NOT supplement regions that overlap with these structural elements
+- **AND** only supplement regions as TEXT type to preserve structural integrity
+
+#### Scenario: Supplemented regions meet confidence threshold
+- **GIVEN** Raw OCR regions to be supplemented
+- **WHEN** a region has a confidence score below 0.3
+- **THEN** the system SHALL skip that region
+- **AND** only supplement regions with confidence >= 0.3
+
+#### Scenario: Deduplication prevents repeated text
+- **GIVEN** a Raw OCR region being considered for supplementation
+- **WHEN** the region has IoU > 0.5 with any existing PP-StructureV3 TEXT element
+- **THEN** the system SHALL skip that region to prevent duplicate text
+- **AND** the original PP-StructureV3 element SHALL be preserved
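
Review note: the confidence and deduplication scenarios combine into a single filtering pass, which might look like the sketch below. `select_supplements` and its argument shapes are hypothetical; candidates are assumed to be `(bbox, confidence)` pairs and `existing_text_boxes` the bboxes of PP-StructureV3 TEXT elements.

```python
from typing import List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_supplements(
    candidates: Sequence[Tuple[BBox, float]],
    existing_text_boxes: Sequence[BBox],
    min_confidence: float = 0.3,
    dedup_iou: float = 0.5,
) -> List[Tuple[BBox, float]]:
    """Keep candidates that are confident enough and not duplicates."""
    kept = []
    for bbox, confidence in candidates:
        if confidence < min_confidence:
            continue  # below the confidence threshold: skip
        if any(iou(bbox, t) > dedup_iou for t in existing_text_boxes):
            continue  # overlaps an existing TEXT element: skip, keep the original
        kept.append((bbox, confidence))
    return kept
```

The existing elements are never modified here, which matches the requirement that the original PP-StructureV3 element is preserved.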
+
+#### Scenario: Reading order is recalculated after gap filling
+- **GIVEN** supplemented elements have been added to the page
+- **WHEN** assembling the final element list
+- **THEN** the system SHALL recalculate reading order for the entire page
+- **AND** sort elements by y0 coordinate (top to bottom) then x0 (left to right)
+- **AND** ensure logical document flow is maintained
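
Review note: the sort described in this scenario is a plain two-key ordering. A minimal sketch, assuming a hypothetical `Element` type that exposes `x0` and `y0`:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Element:
    """Hypothetical stand-in for a page element; only the sort keys matter."""
    text: str
    x0: float
    y0: float


def recalc_reading_order(elements: Sequence[Element]) -> List[Element]:
    """Sort top to bottom (y0), then left to right (x0)."""
    return sorted(elements, key=lambda e: (e.y0, e.x0))
```

This is a strict sort on raw coordinates. Any tolerance for elements on nearly the same baseline would be an extension that the spec text does not describe.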
+
+#### Scenario: Coordinate alignment with ocr_dimensions
+- **GIVEN** Raw OCR processing may involve image resizing
+- **WHEN** comparing Raw OCR bbox with PP-StructureV3 bbox
+- **THEN** the system SHALL use ocr_dimensions to normalize coordinates
+- **AND** ensure both sources reference the same coordinate space
+- **AND** prevent coverage misdetection due to scale differences
+
+#### Scenario: Supplemented elements have complete metadata
+- **GIVEN** a Raw OCR region being added as a supplemented element
+- **WHEN** creating the DocumentElement
+- **THEN** the element SHALL include page_number
+- **AND** include the confidence score from Raw OCR
+- **AND** include the original bbox coordinates
+- **AND** optionally include a source indicator for debugging
+
+### Requirement: Gap Filling Track Isolation
+
+The gap filling feature SHALL only apply to OCR track processing and SHALL NOT affect Direct or Hybrid track outputs.
+
+#### Scenario: Gap filling only activates for OCR track
+- **GIVEN** a document processing task
+- **WHEN** the processing track is OCR
+- **THEN** the system SHALL evaluate and apply gap filling as needed
+- **AND** produce enhanced output with supplemented content
+
+#### Scenario: Direct track is unaffected
+- **GIVEN** a document processing task with Direct track
+- **WHEN** the task is processed
+- **THEN** the system SHALL NOT invoke any gap filling logic
+- **AND** produce output identical to current Direct track behavior
+
+#### Scenario: Hybrid track is unaffected
+- **GIVEN** a document processing task with Hybrid track
+- **WHEN** the task is processed
+- **THEN** the system SHALL NOT invoke gap filling logic
+- **AND** use existing Hybrid track processing pipeline
+
+### Requirement: Gap Filling Configuration
+
+The system SHALL provide configurable parameters for gap filling behavior.
+
+#### Scenario: Gap filling can be disabled via configuration
+- **GIVEN** gap_filling_enabled is set to false in configuration
+- **WHEN** OCR track processing runs
+- **THEN** the system SHALL skip all gap filling logic
+- **AND** output only PP-StructureV3 results as before
+
+#### Scenario: Coverage threshold is configurable
+- **GIVEN** gap_filling_coverage_threshold is set to 0.8
+- **WHEN** PP-StructureV3 coverage is 75% (below the configured 0.8 threshold)
+- **THEN** the system SHALL activate gap filling
+- **AND** supplement uncovered regions
+
+#### Scenario: IoU thresholds are configurable
+- **GIVEN** custom IoU thresholds are configured:
+  - gap_filling_iou_threshold: 0.2
+  - gap_filling_dedup_iou_threshold: 0.6
+- **WHEN** evaluating coverage and deduplication
+- **THEN** the system SHALL use the configured values
+- **AND** apply them consistently throughout the gap filling process
+
+#### Scenario: Confidence threshold is configurable
+- **GIVEN** gap_filling_confidence_threshold is set to 0.5
+- **WHEN** supplementing Raw OCR regions
+- **THEN** the system SHALL only include regions with confidence >= 0.5
+- **AND** filter out lower confidence regions
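
Review note: the configuration scenarios name five settings. The sketch below groups them in a dataclass for illustration only. How the backend actually stores them is not shown in this diff, `GapFillingConfig` and `should_activate` are hypothetical names, the numeric defaults are taken from the scenarios above, and the `True` default for `gap_filling_enabled` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapFillingConfig:
    """Hypothetical container for the settings named in the spec."""
    gap_filling_enabled: bool = True  # assumption: enabled unless turned off
    gap_filling_coverage_threshold: float = 0.7
    gap_filling_iou_threshold: float = 0.15
    gap_filling_dedup_iou_threshold: float = 0.5
    gap_filling_confidence_threshold: float = 0.3


def should_activate(coverage: float, config: GapFillingConfig) -> bool:
    """Gap filling runs only when enabled and coverage is below the threshold."""
    return config.gap_filling_enabled and coverage < config.gap_filling_coverage_threshold
```

With `gap_filling_coverage_threshold=0.8`, a coverage of 0.75 activates gap filling, as in the "Coverage threshold is configurable" scenario. With the 0.7 value from the first scenario it would not.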
+
+### Requirement: Layout Model Selection
+The system SHALL allow users to select a layout detection model optimized for their document type, providing a simple choice between pre-configured models instead of manual parameter tuning.
+
+#### Scenario: User selects Chinese document model
+- **GIVEN** a user is processing Chinese business documents (forms, contracts, invoices)
+- **WHEN** the user selects "Chinese Document Model" (PP-DocLayout-S)
+- **THEN** the OCR engine SHALL use the PP-DocLayout-S layout detection model
+- **AND** the model SHALL be optimized for 23 Chinese document element types
+- **AND** table and form detection accuracy SHALL be improved over the default model
+
+#### Scenario: User selects standard model for English documents
+- **GIVEN** a user is processing English academic papers or reports
+- **WHEN** the user selects "Standard Model" (PubLayNet-based)
+- **THEN** the OCR engine SHALL use the default PubLayNet-based layout detection model
+- **AND** the model SHALL be optimized for English document layouts
+
+#### Scenario: User selects CDLA model for specialized Chinese layout
+- **GIVEN** a user is processing Chinese documents with complex layouts
+- **WHEN** the user selects "CDLA Model"
+- **THEN** the OCR engine SHALL use the picodet_lcnet_x1_0_fgd_layout_cdla model
+- **AND** the model SHALL provide specialized Chinese document layout analysis
+
+#### Scenario: Layout model is sent via API request
+- **GIVEN** a frontend application with model selection UI
+- **WHEN** the user starts task processing with a selected model
+- **THEN** the frontend SHALL send the model choice in the request body:
  ```json
  POST /api/v2/tasks/{task_id}/start
  {
    "use_dual_track": true,
    "force_track": "ocr",
    "language": "ch",
-    "pp_structure_params": {
-      "layout_detection_threshold": 0.15,
-      "layout_merge_bboxes_mode": "small",
-      "text_det_thresh": 0.1
-    }
+    "layout_model": "chinese"
  }
  ```
- **AND** the backend SHALL parse and apply these parameters
+- **AND** the backend SHALL configure PP-StructureV3 with the corresponding model

-#### Scenario: Backward compatibility is maintained
- **GIVEN** existing API clients without PP-StructureV3 parameter support
- **WHEN** a task is started without `pp_structure_params`
- **THEN** the system SHALL use backend default settings
- **AND** processing SHALL work exactly as before
- **AND** no errors SHALL occur
+#### Scenario: Default model when not specified
+- **GIVEN** an API request without `layout_model` parameter
+- **WHEN** the task is started
+- **THEN** the system SHALL use "chinese" (PP-DocLayout-S) as the default model
+- **AND** processing SHALL work correctly without requiring model selection

-#### Scenario: Invalid parameters are rejected
- **GIVEN** a request with invalid parameter values
- **WHEN** the user sends:
-  - `layout_detection_threshold` = 1.5 (exceeds max 1.0)
-  - `layout_merge_bboxes_mode` = "invalid" (not in allowed values)
+#### Scenario: Invalid model name is rejected
+- **GIVEN** a request with an invalid `layout_model` value
+- **WHEN** the user sends `layout_model: "invalid_model"`
 - **THEN** the API SHALL return 422 Validation Error
- **AND** provide clear error messages about invalid parameters
+- **AND** provide a clear error message listing valid model options
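
Review note: the default and invalid-name scenarios can be illustrated with a small resolver. This is a sketch under stated assumptions, not the backend code. `LAYOUT_MODELS` and `resolve_layout_model` are hypothetical names, and mapping `"default"` to `None` is an assumed stand-in for the sentinel mentioned in the commit message, whose actual value is not shown in this section. In the real API the failure surfaces as a 422 response. Here it is a plain `ValueError`.

```python
from typing import Dict, Optional

# Model identifiers come from the spec scenarios; None for "default" is an
# assumption meaning "let the engine use its built-in PubLayNet-based model".
LAYOUT_MODELS: Dict[str, Optional[str]] = {
    "chinese": "PP-DocLayout-S",
    "default": None,
    "cdla": "picodet_lcnet_x1_0_fgd_layout_cdla",
}
DEFAULT_LAYOUT_MODEL = "chinese"


def resolve_layout_model(layout_model: Optional[str] = None) -> Optional[str]:
    """Map the API value to a model identifier, rejecting unknown names."""
    key = DEFAULT_LAYOUT_MODEL if layout_model is None else layout_model
    if key not in LAYOUT_MODELS:
        valid = ", ".join(sorted(LAYOUT_MODELS))
        raise ValueError(f"Invalid layout_model {key!r}; valid options: {valid}")
    return LAYOUT_MODELS[key]
```

Omitting the argument yields the "chinese" model, which matches the backward-compatibility scenario for requests that do not send `layout_model`.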

-#### Scenario: Custom parameters affect only current processing
- **GIVEN** multiple concurrent OCR processing tasks
- **WHEN** Task A uses custom parameters and Task B uses defaults
- **THEN** Task A SHALL process with its custom parameters
- **AND** Task B SHALL process with default parameters
- **AND** no parameter interference SHALL occur between tasks
+### Requirement: Layout Model Selection UI
+The frontend SHALL provide a simple, user-friendly interface for selecting layout detection models with clear descriptions of each option.

-### Requirement: PP-StructureV3 Parameter UI Controls
-The frontend SHALL provide intuitive UI controls for adjusting PP-StructureV3 parameters with appropriate constraints and help text.
+#### Scenario: Model options are displayed with descriptions
+- **GIVEN** the model selection UI is displayed
+- **WHEN** the user views the available options
+- **THEN** the UI SHALL show the following options:
+  - "Chinese Document Model (Recommended)" - for Chinese forms, contracts, invoices
+  - "Standard Model" - for English academic papers, reports
+  - "CDLA Model" - for specialized Chinese layout analysis
+- **AND** each option SHALL have a brief description of its use case

-#### Scenario: Slider controls for numeric parameters
- **GIVEN** the parameter adjustment UI is displayed
- **WHEN** the user adjusts a numeric parameter slider
- **THEN** the slider SHALL enforce min/max constraints:
-  - Threshold parameters: 0.0 to 1.0
-  - Ratio parameters: > 0 (typically 0.5 to 3.0)
- **AND** display current value in real-time
- **AND** show help text explaining the parameter effect
+#### Scenario: Chinese model is selected by default
+- **GIVEN** the user opens the task processing interface
+- **WHEN** the model selection is displayed
+- **THEN** "Chinese Document Model" SHALL be pre-selected as the default
+- **AND** the user MAY change the selection before starting processing

-#### Scenario: Dropdown for merge mode selection
- **GIVEN** the layout merge mode parameter
- **WHEN** the user clicks the dropdown
- **THEN** the UI SHALL show exactly three options:
-  - "small" (conservative merging)
-  - "large" (aggressive merging)
-  - "union" (middle ground)
- **AND** display description for each option
-
-#### Scenario: Parameters shown only for OCR track
+#### Scenario: Model selection is visible only for OCR track
 - **GIVEN** a document processing interface
 - **WHEN** the user selects processing track
- **THEN** PP-StructureV3 parameters SHALL be shown ONLY when OCR track is selected
- **AND** SHALL be hidden for Direct track
- **AND** SHALL be disabled for Auto track until track is determined
+- **THEN** layout model selection SHALL be shown ONLY when OCR track is selected or auto-detected
+- **AND** SHALL be hidden for Direct track (which does not use PP-StructureV3)