test

2025-12-04 18:00:37 +08:00
parent 9437387ef1
commit 8265be1741
22 changed files with 2672 additions and 196 deletions
--- a/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/design.md
+++ b/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/design.md
--- a/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/proposal.md
+++ b/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/proposal.md
--- a/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/specs/result-export/spec.md
+++ b/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/specs/result-export/spec.md
--- a/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/tasks.md
+++ b/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/tasks.md
--- a/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/design.md
+++ b/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/design.md
--- a/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/proposal.md
+++ b/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/proposal.md
--- a/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/tasks.md
+++ b/openspec/changes/archive/2025-12-04-pdf-preprocessing-pipeline/tasks.md
--- a/openspec/changes/refactor-dual-track-architecture/design.md
+++ b/openspec/changes/refactor-dual-track-architecture/design.md
@@ -0,0 +1,240 @@
+# Design: Refactor Dual-Track Architecture
+
+## Context
+
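+Tool_OCR is a dual-track document processing system that supports:
+- **Direct Track**: extracts structured content directly from editable PDFs
+- **OCR Track**: performs optical character recognition with PaddleOCR + PP-StructureV3
+
+The system currently carries the following technical debt:
+- OCRService (2,326 lines) has too many responsibilities
+- PDFGeneratorService (4,644 lines) is a monolithic service
+- Memory management is scattered across multiple components
+- Known bugs degrade output quality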
+
+## Goals / Non-Goals
+
+### Goals
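+- Fix all known bugs listed in PLAN.md
+- Split OCRService into maintainable units of < 800 lines
+- Split PDFGeneratorService down to < 2,000 lines
+- Simplify memory management configuration
+- Improve the consistency of frontend state management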
+
+### Non-Goals
+- No changes to existing API contracts
+- No new external dependencies
+- No changes to the database schema
+- No changes to the user interface
+
+## Decisions
+
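+### Decision 1: Use PyMuPDF find_tables() instead of custom table detection
+
+**Choice**: Use PyMuPDF's built-in `page.find_tables()` API
+
+**Rationale**:
+- PyMuPDF's table detection correctly identifies merged cells
+- The returned `table.cells` structure includes span information
+- Reduces the maintenance burden of custom code
+
+**Alternatives**:
+- Improve the `_detect_tables_by_position()` algorithm
+  - Pros: no exposure to external API changes
+  - Cons: high complexity; hard to cover every edge case
+- Use Camelot or Tabula
+  - Pros: mature table extraction libraries
+  - Cons: adds new dependencies and increases system complexity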
+
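A minimal sketch of how row and column spans might be derived from cell rectangles. It assumes the input has the shape described for `table.cells` in the tasks (one `(x0, y0, x1, y1)` tuple per cell, with `None` for positions covered by a merge). The names `Cell` and `cells_with_spans` and the 1-point tolerance are hypothetical and are not taken from the codebase or from PyMuPDF.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Cell:
    row: int
    col: int
    row_span: int
    col_span: int
    bbox: BBox


def _cluster(values: list[float], tol: float) -> list[float]:
    """Collapse sorted coordinates that lie within `tol` of a cluster's first value."""
    starts: list[float] = []
    for v in values:
        if not starts or v - starts[-1] > tol:
            starts.append(v)
    return starts


def cells_with_spans(cell_boxes: Sequence[Optional[BBox]], tol: float = 1.0) -> list[Cell]:
    """Derive grid position and spans from cell rectangles, skipping None placeholders."""
    boxes = [b for b in cell_boxes if b is not None]
    # Every distinct left edge starts a column; every distinct top edge starts a row.
    col_starts = _cluster(sorted(b[0] for b in boxes), tol)
    row_starts = _cluster(sorted(b[1] for b in boxes), tol)

    cells = []
    for x0, y0, x1, y1 in boxes:
        # A cell spans every column/row whose start falls inside its extent.
        cols = [i for i, x in enumerate(col_starts) if x0 - tol <= x < x1 - tol]
        rows = [i for i, y in enumerate(row_starts) if y0 - tol <= y < y1 - tol]
        cells.append(Cell(rows[0], cols[0], len(rows), len(cols), (x0, y0, x1, y1)))
    return cells
```

This only recovers a column or row boundary when at least one cell starts at it, which is enough for illustration but may not cover every real table.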
+### Decision 2: Refactor the service layer with the Strategy pattern
+
+**Choice**: Introduce a ProcessingOrchestrator built on the Strategy pattern
+
+```python
+from typing import Protocol
+
+
+class ProcessingPipeline(Protocol):
+    def process(self, file_path: str, options: ProcessingOptions) -> UnifiedDocument:
+        ...
+
+class DirectPipeline(ProcessingPipeline):
+    def __init__(self, extraction_engine: DirectExtractionEngine):
+        self.engine = extraction_engine
+
+    def process(self, file_path, options):
+        return self.engine.extract(file_path)
+
+class OCRPipeline(ProcessingPipeline):
+    def __init__(self, ocr_service: OCRService, preprocessor: LayoutPreprocessingService):
+        self.ocr = ocr_service
+        self.preprocessor = preprocessor
+
+    def process(self, file_path, options):
+        # Preprocessing + OCR + Conversion
+        ...
+
+class ProcessingOrchestrator:
+    def __init__(self, detector: DocumentTypeDetector, pipelines: dict[str, ProcessingPipeline]):
+        self.detector = detector
+        self.pipelines = pipelines
+
+    def process(self, file_path, options):
+        track = options.force_track or self.detector.detect(file_path).track
+        return self.pipelines[track].process(file_path, options)
+```
+
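+**Rationale**:
+- Separation of concerns: detection, processing, and conversion are independent
+- Easy to test: each Pipeline can be tested in isolation
+- Easy to extend: a new processing mode only requires a new Pipeline
+
+**Alternatives**:
+- Use Chain of Responsibility
+  - Pros: a more flexible processing chain
+  - Cons: overly complex for an either-or scenario
+- Keep the status quo and only tidy up the code
+  - Pros: lowest risk
+  - Cons: does not address the root problems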
+
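To illustrate the testability point, here is a small self-contained sketch of the dispatch logic exercised with fakes. `ProcessingOptions`, `Detection`, `FakePipeline`, `FakeDetector`, and the track keys `"direct"` and `"ocr"` are simplified stand-ins assumed for this example; the real classes are not shown in this change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessingOptions:
    # Simplified stand-in: only the field the orchestrator reads.
    force_track: Optional[str] = None


@dataclass
class Detection:
    # Hypothetical detector result carrying the recommended track.
    track: str


class FakePipeline:
    """Records calls instead of doing real extraction or OCR."""

    def __init__(self, name: str):
        self.name = name
        self.calls: list[str] = []

    def process(self, file_path: str, options: ProcessingOptions) -> str:
        self.calls.append(file_path)
        return f"{self.name}:{file_path}"


class FakeDetector:
    def __init__(self, track: str):
        self.track = track

    def detect(self, file_path: str) -> Detection:
        return Detection(self.track)


class ProcessingOrchestrator:
    def __init__(self, detector, pipelines):
        self.detector = detector
        self.pipelines = pipelines

    def process(self, file_path: str, options: ProcessingOptions):
        # An explicit force_track wins; otherwise the detector decides.
        track = options.force_track or self.detector.detect(file_path).track
        return self.pipelines[track].process(file_path, options)
```

Because the orchestrator only depends on the `detect()` and `process()` methods, a unit test can swap in fakes without loading any OCR model.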
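+### Decision 3: Extract PDF generation logic into layers
+
+**Choice**: Split PDFGeneratorService into three modules
+
+```
+PDFGeneratorService (main orchestration)
+├── PDFTableRenderer (table rendering)
+│   ├── HTMLTableParser (HTML table parsing)
+│   └── CellRenderer (cell rendering)
+├── PDFFontManager (font management)
+│   ├── FontLoader (font loading)
+│   └── FontFallback (font fallback)
+└── PDFLayoutEngine (page layout)
+```
+
+**Rationale**:
+- Single responsibility: each module focuses on one thing
+- Reusable: FontManager can be used by other services
+- Easy to test: table rendering can be tested independently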
+
+### Decision 4: Unified memory policy engine
+
+**Choice**: Merge the memory management components into a single MemoryPolicyEngine
+
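+```python
+import asyncio
+from dataclasses import dataclass
+from enum import Enum, auto
+
+
+class MemoryStatus(Enum):
+    AVAILABLE = auto()
+    WARNING = auto()
+    CRITICAL = auto()
+    EMERGENCY = auto()
+
+
+# Defined before the engine so the type annotation below resolves at import time.
+@dataclass
+class MemoryConfig:
+    warning_threshold: float = 0.80      # 80%
+    critical_threshold: float = 0.95     # 95%
+    max_concurrent_predictions: int = 2
+    model_idle_timeout: int = 300        # 5 minutes
+
+
+class MemoryPolicyEngine:
+    """Unified memory policy engine."""
+
+    def __init__(self, config: MemoryConfig):
+        self.config = config
+        self._semaphore = asyncio.Semaphore(config.max_concurrent_predictions)
+
+    @property
+    def gpu_usage_percent(self) -> float:
+        # Single place to query GPU usage
+        ...
+
+    def check_availability(self) -> MemoryStatus:
+        # Returns AVAILABLE, WARNING, CRITICAL, or EMERGENCY
+        ...
+
+    async def acquire_prediction_slot(self):
+        # Unified concurrency control
+        ...
+
+    def cleanup_if_needed(self):
+        # Clean up automatically based on the current status
+        ...
+```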
+
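+**Rationale**:
+- Fewer settings: down from 8+ to 4 core settings
+- Simpler dependencies: services depend on a single memory engine
+- Consistent behavior: all memory decisions are made in one place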
+
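A rough sketch of two pieces of the engine, under stated assumptions: mapping a usage fraction to a status with the 80% and 95% thresholds, and capping concurrent predictions with an `asyncio.Semaphore`. The inclusive boundaries, the `classify_usage` and `PredictionSlots` names, and the omission of EMERGENCY (no threshold is given for it here) are choices made for this example only.

```python
import asyncio
from contextlib import asynccontextmanager
from enum import Enum


class MemoryStatus(Enum):
    AVAILABLE = "available"
    WARNING = "warning"
    CRITICAL = "critical"


def classify_usage(usage: float, warning: float = 0.80, critical: float = 0.95) -> MemoryStatus:
    """Map a usage fraction in [0, 1] to a status (boundaries inclusive by assumption)."""
    if usage >= critical:
        return MemoryStatus.CRITICAL
    if usage >= warning:
        return MemoryStatus.WARNING
    return MemoryStatus.AVAILABLE


class PredictionSlots:
    """Caps concurrent predictions and records the observed peak."""

    def __init__(self, max_concurrent: int = 2):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.active = 0
        self.peak = 0

    @asynccontextmanager
    async def acquire(self):
        async with self._semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                yield
            finally:
                self.active -= 1


async def _fake_prediction(slots: PredictionSlots, seconds: float = 0.01) -> None:
    async with slots.acquire():
        await asyncio.sleep(seconds)  # stands in for a model call


async def run_demo(jobs: int = 5) -> int:
    slots = PredictionSlots(max_concurrent=2)
    await asyncio.gather(*(_fake_prediction(slots) for _ in range(jobs)))
    return slots.peak
```

Exposing the slot as an async context manager keeps acquire and release paired even when a prediction raises.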
+### Decision 5: Manage task state with Zustand
+
+**Choice**: Add a TaskStore to centralize task state
+
+```typescript
+import { create } from 'zustand';
+import { persist } from 'zustand/middleware';
+
+interface TaskState {
+  currentTaskId: string | null;
+  tasks: Record<string, TaskDetail>;
+  processingStatus: Record<string, ProcessingStatus>;
+}
+
+interface TaskActions {
+  setCurrentTask: (taskId: string) => void;
+  updateTask: (taskId: string, updates: Partial<TaskDetail>) => void;
+  updateProcessingStatus: (taskId: string, status: ProcessingStatus) => void;
+  clearTasks: () => void;
+}
+
+const useTaskStore = create<TaskState & TaskActions>()(
+  persist(
+    (set) => ({
+      currentTaskId: null,
+      tasks: {},
+      processingStatus: {},
+      // ... actions
+    }),
+    { name: 'task-storage' }
+  )
+);
+```
+
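+**Rationale**:
+- Consistency: matches the pattern of the existing uploadStore and authStore
+- Traceability: task state changes are managed in one place
+- Persistence: state survives a page refresh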
+
+## Risks / Trade-offs
+
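+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| PyMuPDF find_tables() API changes | Medium | Wrap it in a standalone function so it is easy to replace |
+| Service refactoring introduces processing logic errors | High | Keep the existing tests and refactor incrementally |
+| Memory engine changes cause OOM | High | Keep the same thresholds and change only the code structure |
+| Frontend state migration introduces bugs | Medium | Migrate page by page and fully test each page |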
+
+## Migration Plan
+
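+### Step 1: Bug Fixes (independently deployable)
+1. Integrate PyMuPDF find_tables()
+2. Fix OCR Track image paths
+3. Add cell_boxes coordinate validation
+4. Test and deploy
+
+### Step 2: Service Refactoring (independently deployable)
+1. Extract ProcessingOrchestrator
+2. Extract TableRenderer and FontManager
+3. Update OCRService to use the new components
+4. Test and deploy
+
+### Step 3: Memory Management (independently deployable)
+1. Implement MemoryPolicyEngine
+2. Migrate services to the new engine incrementally
+3. Remove the old components
+4. Test and deploy
+
+### Step 4: Frontend Improvements (independently deployable)
+1. Add TaskStore
+2. Migrate ProcessingPage
+3. Migrate TaskDetailPage
+4. Merge type definitions
+5. Test and deploy
+
+### Rollback Plan
+- Each Step is deployed independently, so a problem can be rolled back to the previous stable version
+- Bug fixes go first to ensure core functionality is correct
+- Refactoring does not change external behavior, so rollbacks have minimal impact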
+
+## Open Questions
+
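+1. **PyMuPDF find_tables() version compatibility**: confirm that the PyMuPDF version currently in use supports this API
+2. **Scope of frontend state persistence**: should all tasks be persisted, or only those of the current session?
+3. **Memory threshold tuning**: have the existing thresholds been validated in production, so that they can be reused as-is?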
--- a/openspec/changes/refactor-dual-track-architecture/proposal.md
+++ b/openspec/changes/refactor-dual-track-architecture/proposal.md
@@ -0,0 +1,68 @@
+# Change: Refactor Dual-Track Architecture
+
+## Why
+
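+The current dual-track OCR system has several known issues and architectural debt:
+
+1. **Direct Track table problem**: `_detect_tables_by_position()` cannot recognize merged cells, so edit3.pdf yields 204 incorrectly split cells (should be 83)
+2. **OCR Track image paths lost**: the `saved_path` of visual elements such as CHART/DIAGRAM is lost during conversion, so images are not placed back into the PDF
+3. **OCR Track cell_boxes coordinates corrupted**: cell_boxes returned by PP-StructureV3 extend beyond the page boundaries
+4. **Overly complex service layer**: OCRService (2,326 lines) has too many responsibilities and is hard to maintain and test
+5. **Oversized PDF generator**: PDFGeneratorService (4,644 lines) is a monolithic service that is hard to extend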
+
+## What Changes
+
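+### Phase 1: Fix Known Bugs (priority: highest)
+
+- **Direct Track table fix**: switch to the PyMuPDF `find_tables()` API in place of `_detect_tables_by_position()`
+- **OCR Track image path fix**: extend `_convert_pp3_element` to handle all visual element types (IMAGE, FIGURE, CHART, DIAGRAM, LOGO, STAMP)
+- **Cell boxes coordinate validation**: add bounds checks; fall back to CV line detection when coordinates are out of range
+- **Filter tiny decorative images**: drop images smaller than 200 px²
+- **Remove covering images**: at render time, filter out images that overlap covering_images
+
+### Phase 2: Service Layer Refactoring (priority: high)
+
+- **Split OCRService**: extract a standalone `ProcessingOrchestrator` responsible for workflow orchestration
+- **Establish the Pipeline pattern**: use composition in place of the current aggregation
+- **Extract TableRenderer**: move the table rendering logic out of PDFGeneratorService
+- **Extract FontManager**: move the font management logic out of PDFGeneratorService
+
+### Phase 3: Memory Management Simplification (priority: medium)
+
+- **Unified memory policy**: merge MemoryManager, MemoryGuard, and the various Semaphores into a single policy engine
+- **Simplified configuration**: reduce the 8+ memory-related settings to 3-4 core ones
+
+### Phase 4: Frontend State Management Improvements (priority: medium)
+
+- **Add TaskStore**: manage task state with Zustand in place of scattered useState calls
+- **Merge type definitions**: consolidate api.ts and apiV2.ts into a single type definition file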
+
+## Impact
+
+- Affected specs: `document-processing`
+- Affected code:
+  - `backend/app/services/direct_extraction_engine.py` (table detection)
+  - `backend/app/services/ocr_to_unified_converter.py` (element conversion)
+  - `backend/app/services/ocr_service.py` (service orchestration)
+  - `backend/app/services/pdf_generator_service.py` (PDF generation)
+  - `backend/app/services/memory_manager.py` (memory management)
+  - `frontend/src/store/` (state management)
+  - `frontend/src/types/` (type definitions)
+
+## Risk Assessment
+
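+| Risk | Severity | Mitigation |
+|------|----------|------------|
+| Table rendering regression | High | Use edit.pdf and edit3.pdf as regression tests |
+| Memory management changes cause OOM | High | Keep the existing thresholds and refactor only the code structure |
+| Service refactoring causes processing failures | Medium | Refactor incrementally with full testing at each phase |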
+
+## Success Metrics
+
+| Metric | Current | Target |
+|--------|---------|--------|
+| edit3.pdf Direct Track cells | 204 (incorrect) | 83 (correct) |
+| OCR Track image restoration rate | 0% | 100% |
+| cell_boxes coordinate accuracy | ~40% | 100% |
+| OCRService line count | 2,326 | < 800 |
+| PDFGeneratorService line count | 4,644 | < 2,000 |
--- a/openspec/changes/refactor-dual-track-architecture/specs/document-processing/spec.md
+++ b/openspec/changes/refactor-dual-track-architecture/specs/document-processing/spec.md
@@ -0,0 +1,151 @@
+# document-processing Specification Delta
+
+## ADDED Requirements
+
+### Requirement: Table Cell Merging Detection
+The system SHALL correctly detect and preserve merged cells (rowspan/colspan) when extracting tables from PDF documents.
+
+#### Scenario: Detect merged cells in Direct Track
+- **WHEN** extracting tables from an editable PDF using Direct Track
+- **THEN** the system SHALL use PyMuPDF find_tables() API
+- **AND** correctly identify cells with rowspan > 1 or colspan > 1
+- **AND** preserve merge information in UnifiedDocument table structure
+- **AND** skip placeholder cells that are covered by merged cells
+
+#### Scenario: Handle complex table structures
+- **WHEN** processing a table with mixed merged and regular cells (e.g., edit3.pdf with 83 cells including 121 merges)
+- **THEN** the system SHALL NOT split merged cells into individual cells
+- **AND** the output cell count SHALL match the actual visual cell count
+- **AND** the rendered PDF SHALL display correct merged cell boundaries
+
+### Requirement: Visual Element Path Preservation
+The system SHALL preserve image paths for all visual element types during OCR conversion.
+
+#### Scenario: Preserve CHART element paths
+- **WHEN** converting PP-StructureV3 output containing CHART elements
+- **THEN** the system SHALL treat CHART as a visual element type
+- **AND** extract saved_path from the element data
+- **AND** include saved_path in the UnifiedDocument content field
+
+#### Scenario: Support all visual element types
+- **WHEN** processing visual elements of types IMAGE, FIGURE, CHART, DIAGRAM, LOGO, or STAMP
+- **THEN** the system SHALL extract saved_path or img_path for each element
+- **AND** preserve path, width, height, and format in content dictionary
+- **AND** enable downstream PDF generation to embed these images
+
+#### Scenario: Fallback path resolution
+- **WHEN** a visual element has multiple path fields (saved_path, img_path)
+- **THEN** the system SHALL prefer saved_path over img_path
+- **AND** fallback to img_path if saved_path is missing
+- **AND** log warning if both paths are missing
+
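The fallback rule above could look roughly like the following. This is a sketch only: the element is assumed to be a plain dict, and the `type`, `width`, `height`, and `format` keys on the input, as well as the function names, are assumptions made for illustration.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def resolve_visual_path(element: dict[str, Any]) -> Optional[str]:
    """Prefer saved_path, fall back to img_path, warn when both are missing."""
    path = element.get("saved_path") or element.get("img_path")
    if not path:
        logger.warning(
            "Visual element of type %s has neither saved_path nor img_path",
            element.get("type"),
        )
        return None
    return path


def build_visual_content(element: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build the content dict that downstream PDF generation would read."""
    path = resolve_visual_path(element)
    if path is None:
        return None
    return {
        "saved_path": path,
        "path": path,
        "width": element.get("width"),
        "height": element.get("height"),
        "format": element.get("format"),
    }
```

Treating an empty string the same as a missing key is a deliberate choice here, so that a blank `saved_path` still falls through to `img_path`.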
+### Requirement: Cell Box Coordinate Validation
+The system SHALL validate cell box coordinates from PP-StructureV3 and handle out-of-bounds cases.
+
+#### Scenario: Detect out-of-bounds coordinates
+- **WHEN** processing cell_boxes from PP-StructureV3
+- **THEN** the system SHALL validate each coordinate against page boundaries (0, 0, page_width, page_height)
+- **AND** log tables with coordinates exceeding page bounds
+- **AND** mark affected cells for fallback processing
+
+#### Scenario: Apply CV line detection fallback
+- **WHEN** cell_boxes coordinates are invalid (out of bounds)
+- **THEN** the system SHALL apply OpenCV line detection as fallback
+- **AND** reconstruct table structure from detected lines
+- **AND** include fallback_used flag in table metadata
+
+#### Scenario: Coordinate normalization
+- **WHEN** coordinates are within page bounds but slightly outside table bbox
+- **THEN** the system SHALL clamp coordinates to table boundaries
+- **AND** preserve relative cell positions
+- **AND** ensure no cells overlap after normalization
+
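A minimal sketch of the page-bounds check and clamping, assuming each cell box is an `(x0, y0, x1, y1)` tuple in page coordinates. The function name follows `validate_cell_boxes()` from the task list, but the signature, the `tolerance` parameter, and the returned `(clamped_boxes, needs_fallback)` pair are assumptions for illustration.

```python
from typing import Sequence

Box = tuple[float, float, float, float]  # (x0, y0, x1, y1)


def validate_cell_boxes(
    cell_boxes: Sequence[Box],
    page_width: float,
    page_height: float,
    tolerance: float = 0.0,
) -> tuple[list[Box], bool]:
    """Clamp boxes to the page and report whether any box was out of bounds."""
    clamped: list[Box] = []
    needs_fallback = False
    for x0, y0, x1, y1 in cell_boxes:
        out_of_bounds = (
            x0 < -tolerance
            or y0 < -tolerance
            or x1 > page_width + tolerance
            or y1 > page_height + tolerance
        )
        if out_of_bounds:
            needs_fallback = True
        # Clamp every coordinate into [0, page_width] x [0, page_height].
        clamped.append((
            min(max(x0, 0.0), page_width),
            min(max(y0, 0.0), page_height),
            min(max(x1, 0.0), page_width),
            min(max(y1, 0.0), page_height),
        ))
    return clamped, needs_fallback
```

A caller could use the flag to decide whether to trigger the line-detection fallback while still having usable clamped coordinates.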
+### Requirement: Decoration Image Filtering
+The system SHALL filter out tiny decorative images that do not contribute meaningful content.
+
+#### Scenario: Filter tiny images by area
+- **WHEN** extracting images from a document
+- **THEN** the system SHALL calculate image area (width x height)
+- **AND** filter out images with area < 200 square pixels
+- **AND** log filtered image count for debugging
+
+#### Scenario: Configurable filtering threshold
+- **WHEN** processing documents with intentionally small images
+- **THEN** the system SHALL support configuration of minimum image area threshold
+- **AND** default to 200 square pixels if not specified
+- **AND** allow threshold = 0 to disable filtering
+
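The area filter can be sketched as below, assuming each image is a dict with `width` and `height` in pixels. The function name and the input shape are hypothetical; the 200 px² default and the "0 disables filtering" behavior come from the scenarios above.

```python
from typing import Any, Sequence


def filter_decoration_images(
    images: Sequence[dict[str, Any]],
    min_image_area: float = 200.0,
) -> tuple[list[dict[str, Any]], int]:
    """Return (kept_images, filtered_count); a threshold of 0 disables filtering."""
    if min_image_area <= 0:
        return list(images), 0
    # Images with area strictly below the threshold are treated as decoration.
    kept = [img for img in images if img["width"] * img["height"] >= min_image_area]
    return kept, len(images) - len(kept)
```

Returning the filtered count alongside the kept images makes it easy to log how many images were dropped.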
+### Requirement: Covering Image Removal
+The system SHALL remove covering/redaction images from the final output.
+
+#### Scenario: Detect covering rectangles
+- **WHEN** preprocessing a PDF page
+- **THEN** the system SHALL detect black/white rectangles covering text regions
+- **AND** identify covering images by high IoU (> 0.8) with underlying content
+- **AND** mark covering images for exclusion
+
+#### Scenario: Exclude covering images from rendering
+- **WHEN** generating output PDF
+- **THEN** the system SHALL exclude images marked as covering
+- **AND** preserve the text content that was covered
+- **AND** include covering_images_removed count in metadata
+
+#### Scenario: Handle both black and white covering
+- **WHEN** detecting covering rectangles
+- **THEN** the system SHALL detect black-filled rectangles (redaction style)
+- **AND** white-filled rectangles (whiteout style)
+- **AND** low-contrast rectangles intended to hide content
+
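The IoU test used for covering detection can be sketched as follows, assuming axis-aligned `(x0, y0, x1, y1)` rectangles. The `calculate_iou` and `is_covering` names are illustrative; the 0.8 threshold is the one stated in the scenario above.

```python
BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1)


def calculate_iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    intersection = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_covering(image_bbox: BBox, content_bbox: BBox, threshold: float = 0.8) -> bool:
    """Treat an image as covering when it overlaps the content almost exactly."""
    return calculate_iou(image_bbox, content_bbox) > threshold
```

For example, two rectangles that share half their width have an IoU of one third and would not count as covering.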
+## MODIFIED Requirements
+
+### Requirement: Enhanced OCR with Full PP-StructureV3
+The system SHALL utilize the full capabilities of PP-StructureV3, extracting all 23 element types from parsing_res_list, with proper handling of visual elements and table coordinates.
+
+#### Scenario: Extract comprehensive document structure
+- **WHEN** processing through OCR track
+- **THEN** the system SHALL use page_result.json['parsing_res_list']
+- **AND** extract all element types including headers, lists, tables, figures
+- **AND** preserve layout_bbox coordinates for each element
+
+#### Scenario: Maintain reading order
+- **WHEN** extracting elements from PP-StructureV3
+- **THEN** the system SHALL preserve the reading order from parsing_res_list
+- **AND** assign sequential indices to elements
+- **AND** support reordering for complex layouts
+
+#### Scenario: Extract table structure
+- **WHEN** PP-StructureV3 identifies a table
+- **THEN** the system SHALL extract cell content and boundaries
+- **AND** validate cell_boxes coordinates against page boundaries
+- **AND** apply fallback detection for invalid coordinates
+- **AND** preserve table HTML for structure
+- **AND** extract plain text for translation
+
+#### Scenario: Extract visual elements with paths
+- **WHEN** PP-StructureV3 identifies visual elements (IMAGE, FIGURE, CHART, DIAGRAM)
+- **THEN** the system SHALL preserve saved_path for each element
+- **AND** include image dimensions and format
+- **AND** enable image embedding in output PDF
+
+### Requirement: Generate UnifiedDocument from direct extraction
+The system SHALL convert PyMuPDF results to UnifiedDocument with correct table cell merging.
+
+#### Scenario: Extract tables with cell merging
+- **WHEN** direct extraction encounters a table
+- **THEN** the system SHALL use PyMuPDF find_tables() API
+- **AND** extract cell content with correct rowspan/colspan
+- **AND** preserve merged cell boundaries
+- **AND** skip placeholder cells covered by merges
+
+#### Scenario: Filter decoration images
+- **WHEN** extracting images from PDF
+- **THEN** the system SHALL filter images smaller than minimum area threshold
+- **AND** exclude covering/redaction images
+- **AND** preserve meaningful content images
+
+#### Scenario: Preserve text styling with image handling
+- **WHEN** direct extraction completes
+- **THEN** the system SHALL convert PyMuPDF results to UnifiedDocument
+- **AND** preserve text styling, fonts, and exact positioning
+- **AND** extract tables with cell boundaries, content, and merge info
+- **AND** include only meaningful images in output
--- a/openspec/changes/refactor-dual-track-architecture/tasks.md
+++ b/openspec/changes/refactor-dual-track-architecture/tasks.md
@@ -0,0 +1,108 @@
+# Tasks: Refactor Dual-Track Architecture
+
+## Phase 1: Fix Known Bugs (completed)
+
+### 1.1 Direct Track Table Fix (completed ✓)
+- [x] 1.1.1 Modify `_process_native_table()` to handle merged cells via `table.cells`
+- [x] 1.1.2 Use the PyMuPDF `page.find_tables()` API (already in use)
+- [x] 1.1.3 Parse `table.cells` and compute `row_span`/`col_span` correctly
+- [x] 1.1.4 Handle merged-away cells (skip `None` values and build a covered grid)
+- [x] 1.1.5 Verify that edit3.pdf returns 83 correct cells ✓
+
+### 1.2 OCR Track Image Path Fix (completed ✓)
+- [x] 1.2.1 Modify lines 604-613 of `ocr_to_unified_converter.py`
+- [x] 1.2.2 Extend the visual element type check: `IMAGE, FIGURE, CHART, DIAGRAM, LOGO, STAMP`
+- [x] 1.2.3 Prefer `saved_path`, falling back to `img_path`
+- [x] 1.2.4 Ensure the content dict contains `saved_path`, `path`, `width`, `height`, `format`
+- [x] 1.2.5 Code fixed (pending full OCR Track test verification)
+- [x] 1.2.6 Code fixed (pending full OCR Track test verification)
+
+### 1.3 Cell Boxes Coordinate Validation (completed ✓)
+- [x] 1.3.1 Add a `validate_cell_boxes()` function to `ocr_to_unified_converter.py`
+- [x] 1.3.2 Check whether cell_boxes exceed the page boundaries (0, 0, page_width, page_height)
+- [x] 1.3.3 When out of range, use clamped coordinates and set needs_fallback
+- [x] 1.3.4 Log anomalous coordinates
+- [x] 1.3.5 Unit tests confirm the coordinate validation logic is correct ✓
+
+### 1.4 Filter Tiny Decorative Images (completed ✓)
+- [x] 1.4.1 Add an area check to the image extraction logic in `direct_extraction_engine.py`
+- [x] 1.4.2 Filter out images with `image_area < min_image_area` (default 200 px²)
+- [x] 1.4.3 Add a `min_image_area` setting so the threshold can be tuned
+- [x] 1.4.4 Verify that 3 tiny decorative images are detected in edit3.pdf ✓
+
+### 1.5 Remove Covering Images (completed ✓)
+- [x] 1.5.1 Pass `covering_images` to the `_extract_images()` method
+- [x] 1.5.2 Identify covering images using an IoU threshold (0.8) and xref matching
+- [x] 1.5.3 Exclude covering images from the final output
+- [x] 1.5.4 Add a `_calculate_iou()` helper method
+- [x] 1.5.5 Verify that 6 black-box covering images are detected in edit3.pdf ✓
+
+## Phase 2: Service Layer Refactoring
+
+### 2.1 Extract ProcessingOrchestrator
+- [ ] 2.1.1 Create `backend/app/services/processing_orchestrator.py`
+- [ ] 2.1.2 Extract the workflow orchestration logic from OCRService
+- [ ] 2.1.3 Define the `ProcessingPipeline` interface
+- [ ] 2.1.4 Implement DirectPipeline and OCRPipeline
+- [ ] 2.1.5 Update OCRService to use ProcessingOrchestrator
+- [ ] 2.1.6 Ensure existing functionality is unaffected
+
+### 2.2 Extract TableRenderer
+- [ ] 2.2.1 Create `backend/app/services/pdf_table_renderer.py`
+- [ ] 2.2.2 Extract HTMLTableParser from PDFGeneratorService
+- [ ] 2.2.3 Move the table rendering logic into a standalone class
+- [ ] 2.2.4 Support rendering of merged cells
+- [ ] 2.2.5 Update PDFGeneratorService to use TableRenderer
+
+### 2.3 Extract FontManager
+- [ ] 2.3.1 Create `backend/app/services/pdf_font_manager.py`
+- [ ] 2.3.2 Extract the font loading and caching logic
+- [ ] 2.3.3 Extract the CJK font support logic
+- [ ] 2.3.4 Implement the font fallback mechanism
+- [ ] 2.3.5 Update PDFGeneratorService to use FontManager
+
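+## Phase 3: Memory Management Simplification
+
+### 3.1 Unified Memory Policy Engine
+- [ ] 3.1.1 Create `backend/app/services/memory_policy_engine.py`
+- [ ] 3.1.2 Define a unified memory policy interface
+- [ ] 3.1.3 Merge the MemoryManager and MemoryGuard logic
+- [ ] 3.1.4 Integrate Semaphore management
+- [ ] 3.1.5 Reduce configuration to 3-4 core settings
+
+### 3.2 Update Services to Use the New Memory Engine
+- [ ] 3.2.1 Update OCRService to use MemoryPolicyEngine
+- [ ] 3.2.2 Update ServicePool to use MemoryPolicyEngine
+- [ ] 3.2.3 Remove references to the old MemoryGuard
+- [ ] 3.2.4 Verify that GPU memory monitoring works correctly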
+
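+## Phase 4: Frontend State Management Improvements
+
+### 4.1 Add TaskStore
+- [ ] 4.1.1 Create `frontend/src/store/taskStore.ts`
+- [ ] 4.1.2 Define the task state structure (currentTask, tasks, processingStatus)
+- [ ] 4.1.3 Implement CRUD operations and state transitions
+- [ ] 4.1.4 Add localStorage persistence
+- [ ] 4.1.5 Update ProcessingPage to use TaskStore
+- [ ] 4.1.6 Update TaskDetailPage to use TaskStore
+
+### 4.2 Merge Type Definitions
+- [ ] 4.2.1 Review the differences between `api.ts` and `apiV2.ts`
+- [ ] 4.2.2 Merge the type definitions into `apiV2.ts`
+- [ ] 4.2.3 Remove the duplicate definitions from `api.ts`
+- [ ] 4.2.4 Update all import paths
+- [ ] 4.2.5 Verify that TypeScript compiles without errors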
+
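+## Phase 5: Testing and Verification
+
+### 5.1 Regression Tests
+- [ ] 5.1.1 Test Direct Track with edit.pdf (ensure no regressions)
+- [ ] 5.1.2 Test Direct Track table merging with edit3.pdf
+- [ ] 5.1.3 Test OCR Track image restoration with edit.pdf
+- [ ] 5.1.4 Test OCR Track image restoration with edit3.pdf
+- [ ] 5.1.5 Verify that all cell_boxes coordinates are correct
+
+### 5.2 Performance Tests
+- [ ] 5.2.1 Measure processing time after the refactoring
+- [ ] 5.2.2 Verify that memory usage has not increased noticeably
+- [ ] 5.2.3 Verify that GPU utilization is normal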