OCR/tasks.md at 0edc56b03f0f5f33b67e0dba66eceea2766b8a7f

egg 0edc56b03f fix: correct page-number errors and text overlap in PDF generation

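## Bug Fixes

### 1. Incorrect page-number assignment
- **Problem**: the page numbers in layout_data and images_metadata were overwritten by 1-based numbering logic, so every entry ended up as 0
- **Fix**: added a current_page parameter to analyze_layout() so the correct 0-based page number is set at the source
- **Impact**: tables and images now appear on the correct pages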
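
A minimal sketch of the idea behind this fix, not the actual code in ocr_service.py. It assumes each page arrives as a list of element dicts; the dict fields (`type`, `bbox`, `page`), the return shape, and the `process_document()` helper are hypothetical. Only the `analyze_layout()` name and its `current_page` parameter come from the commit message.

```python
from typing import Any, Dict, List


def analyze_layout(page_elements: List[Dict[str, Any]], current_page: int = 0) -> Dict[str, Any]:
    """Tag every element with the 0-based page it came from.

    Hypothetical shape: the real analyze_layout() runs layout detection;
    here the detected elements are simply passed in.
    """
    layout_data = []
    images_metadata = []
    for element in page_elements:
        # Set the page number at the source, so no later step has to patch it.
        tagged = {**element, "page": current_page}
        layout_data.append(tagged)
        if element.get("type") in ("table", "image"):
            images_metadata.append(tagged)
    return {"layout_data": layout_data, "images_metadata": images_metadata}


def process_document(pages: List[List[Dict[str, Any]]]) -> Dict[str, Any]:
    """Hypothetical caller: merge per-page results without rewriting page numbers."""
    merged: Dict[str, Any] = {"layout_data": [], "images_metadata": []}
    for page_index, elements in enumerate(pages):  # enumerate() is 0-based
        result = analyze_layout(elements, current_page=page_index)
        merged["layout_data"].extend(result["layout_data"])
        merged["images_metadata"].extend(result["images_metadata"])
    return merged
```

Passing the index down, rather than assigning page numbers after the fact, leaves a single place where the page number is decided.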

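### 2. Text overlapping tables/images
- **Problem**: filtering relied on the nonexistent 'tables' and 'image_regions' fields, so it had no effect
- **Fix**: use images_metadata instead (it contains the bbox of every table and image)
- **Added**: _bbox_overlaps(), which detects any overlap (not only full containment)
- **Impact**: text no longer covers table and image regions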
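
The overlap test described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it assumes every bbox is an axis-aligned `[x0, y0, x1, y1]` rectangle and that each entry carries a `page` field. The function names mirror the commit message without the leading underscore, and their signatures are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def bbox_overlaps(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if two [x0, y0, x1, y1] boxes share any interior area.

    Assumption: boxes are axis-aligned; boxes that only touch along an
    edge are not counted as overlapping.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def filter_text_in_regions(
    text_regions: List[Dict[str, Any]],
    images_metadata: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Drop text boxes that overlap any table/image box on the same page."""
    kept = []
    for text in text_regions:
        blocked = any(
            region["page"] == text["page"]
            and bbox_overlaps(text["bbox"], region["bbox"])
            for region in images_metadata
        )
        if not blocked:
            kept.append(text)
    return kept
```

Testing for any overlap removes more text than a containment test would, which is consistent with the text loss noted under the known limitations.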

### 3. Rendering order optimization
- **Change**: images (bottom layer) → tables (middle layer) → text (top layer)
- **Impact**: more accurate visual layering
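
One way to express this layering is to sort the elements by a layer rank before drawing, since whatever is drawn later sits on top. The sketch below assumes each element is a dict with a `type` field of `image`, `table`, or `text`; the `LAYER_ORDER` table and `order_for_drawing()` helper are hypothetical and do not appear in pdf_generator_service.py.

```python
from typing import Any, Dict, List

# Lower rank is drawn first, so it ends up underneath later layers.
LAYER_ORDER = {"image": 0, "table": 1, "text": 2}


def order_for_drawing(elements: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return elements sorted bottom layer first (hypothetical helper).

    sorted() is stable, so elements within the same layer keep their
    original relative order.
    """
    return sorted(elements, key=lambda element: LAYER_ORDER[element["type"]])
```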

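## Technical Details

- ocr_service.py: pass the current_page parameter through and remove the page-number override logic
- pdf_generator_service.py:
  - Add the _bbox_overlaps() method
  - Update _filter_text_in_regions() to use overlap detection
  - Switch the data source to images_metadata
  - Adjust the drawing order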

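## Known Limitations

- 21.6% of the text is still lost to filtering (an inherent problem of the coordinate-based positioning approach)
- The full layout information from PP-StructureV3 (parsing_res_list, layout_bbox) is not used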

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>


Implementation Tasks

1. Backend - Fix Image Extraction and Saving (PREREQUISITE) ✅

2. Backend - Environment Setup ✅

3. Backend - PDF Generation Service ✅

4. Backend - PDF Download Endpoint Fix ✅

5. Backend - Integrate PDF Generation into OCR Flow (REQUIRED) ✅

6. Frontend - Install Dependencies ✅

7. Frontend - Create PDF Viewer Component ✅

8. Frontend - Results Page Integration ✅

9. Frontend - Task Detail Page Integration ✅

10. Testing ⚠️ (pending testing with a real OCR task)

Basic verification (completed) ✅

Functional testing (requires a real OCR task)
