OCR/openspec/changes/archive/2025-12-11-fix-ocr-track-table-rendering/proposal.md
egg cfe65158a3 feat: enable document orientation detection for scanned PDFs
- Enable PP-StructureV3's use_doc_orientation_classify feature
- Detect rotation angle from doc_preprocessor_res.angle
- Swap page dimensions (width <-> height) for 90°/270° rotations
- Output PDF now correctly displays landscape-scanned content

Also includes:
- Archive completed openspec proposals
- Add simplify-frontend-ocr-config proposal (pending)
- Code cleanup and frontend simplification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 17:13:46 +08:00

# Change: Fix OCR Track Table Rendering and Text Sizing
## Why
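PDFs produced by OCR Track processing have two main problems:
1. **Table content disappears**: PP-StructureV3 correctly returns `table_res_list` (containing `pred_html` and `cell_box_list`), but when matching by bbox overlap, `pp_structure_enhanced.py` extracts only `cell_boxes` and not `pred_html`, so the table's HTML content ends up empty.
2. **Inconsistent text size**: The scale factor (0.48) between the OCR coordinate system (1275x1650 pixels) and the PDF output size (612x792 pts) makes the font-size calculation inaccurate, so text comes out too small or inconsistently sized.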
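
The 0.48 factor in item 2 follows directly from the two page sizes. The sketch below reproduces that arithmetic; the helper names are hypothetical, and treating an OCR line height as the font size is an assumption made for illustration, not the project's actual sizing rule.

```python
# Page sizes quoted in the proposal.
OCR_WIDTH_PX, OCR_HEIGHT_PX = 1275, 1650   # OCR coordinate system (pixels)
PDF_WIDTH_PT, PDF_HEIGHT_PT = 612, 792     # PDF output size (points)


def scale_factors(src_w, src_h, dst_w, dst_h):
    """Return the (horizontal, vertical) factors mapping source units to destination units."""
    return dst_w / src_w, dst_h / src_h


def scaled_font_size(line_height_px, scale):
    """Hypothetical helper: scale an OCR line height (assumed to stand in for font size)."""
    return line_height_px * scale


sx, sy = scale_factors(OCR_WIDTH_PX, OCR_HEIGHT_PX, PDF_WIDTH_PT, PDF_HEIGHT_PT)
print(f"scale x={sx:.2f}, y={sy:.2f}")

# A 20 px line shrinks when mapped onto the 612x792 pt page.
print(f"shrunk size: {scaled_font_size(20, sx):.1f} pt")

# If the PDF page adopts the OCR dimensions instead, the factor is 1 and nothing shrinks.
print(f"unscaled size: {scaled_font_size(20, 1.0):.1f} pt")
```

With a factor of 1, OCR coordinates and font sizes carry over to the page unchanged, which is the motivation for using the OCR dimensions as the PDF output size.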
## What Changes
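- Fix the HTML extraction logic used during bbox overlap matching in `pp_structure_enhanced.py`
- Improve coordinate-system handling for the OCR Track in `pdf_generator_service.py`, using the OCR coordinate-system dimensions as the PDF output dimensions
- Adjust the decision logic of `_check_cell_boxes_quality()` to avoid over-filtering valid tables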
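
As a rough illustration of the first change, the sketch below matches a layout element to a table result by bbox overlap and returns both the cell boxes and the HTML. It is not the code in `pp_structure_enhanced.py`: the function names, the `bbox` key on each table entry, and the 0.5 threshold are assumptions. Only `pred_html` and `cell_box_list` come from the proposal.

```python
def overlap_ratio(a, b):
    """Intersection area divided by the area of box a; boxes are (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return inter / area_a if area_a > 0 else 0.0


def match_table(element_bbox, table_res_list, min_overlap=0.5):
    """Return cell boxes and HTML of the best-overlapping table, or None if no table qualifies."""
    best, best_score = None, min_overlap
    for table in table_res_list:
        # "bbox" is a hypothetical key holding the table region for this entry.
        score = overlap_ratio(element_bbox, table["bbox"])
        if score >= best_score:
            best, best_score = table, score
    if best is None:
        return None
    return {
        "cell_boxes": best.get("cell_box_list", []),
        # Carrying pred_html along with the boxes is the point of the fix.
        "html": best.get("pred_html", ""),
    }
```

Returning both fields from the same matched entry keeps the HTML and the cell geometry tied to one table, so a renderer never receives boxes without content.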
## Impact
- Affected specs: `ocr-processing`
- Affected code:
  - `backend/app/services/pp_structure_enhanced.py` - table HTML extraction logic
  - `backend/app/services/pdf_generator_service.py` - coordinate-system handling for PDF generation