fix: correct page-number errors and text overlap in PDF generation

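## Fixes

### 1. Incorrect page-number assignment
- **Problem**: Page numbers in layout_data and images_metadata were overwritten by 1-based page-number logic, which left every entry at 0
- **Fix**: Added a current_page parameter to analyze_layout() so the correct 0-based page number is set at the source
- **Impact**: Tables and images now appear on the correct pages

### 2. Text overlapping tables/images
- **Problem**: Filtering relied on the nonexistent 'tables' and 'image_regions' fields, so it had no effect
- **Fix**: Switched to images_metadata (which contains the bbox of every table and image)
- **Added**: _bbox_overlaps() detects any overlap (not only full containment)
- **Impact**: Text no longer covers table and image regions

### 3. Rendering order
- **Change**: images (bottom layer) → tables (middle layer) → text (top layer)
- **Impact**: Visual layering is more correct

## Technical details
- ocr_service.py: pass the current_page parameter through; remove the page-number overwrite logic
- pdf_generator_service.py:
  - Added the _bbox_overlaps() method
  - Updated _filter_text_in_regions() to use overlap detection
  - Corrected the data source to images_metadata
  - Adjusted the drawing order

## Known limitations
- 21.6% of text is still lost to filtering (an inherent problem of the coordinate-based positioning approach)
- The full layout information from PP-StructureV3 (parsing_res_list, layout_bbox) is not used

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>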
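For reviewers, the overlap-based filtering described in the message can be sketched roughly as follows. This is a minimal illustration, not the code in pdf_generator_service.py: the `(x0, y0, x1, y1)` bbox layout, the `"bbox"` and `"page"` dictionary keys, and the standalone function names are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Assumed bbox layout for this sketch: (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
BBox = Sequence[float]


def bbox_overlaps(a: BBox, b: BBox) -> bool:
    """Return True when the two rectangles share any interior area.

    Rectangles that only touch along an edge are not counted as overlapping.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def filter_text_in_regions(
    text_regions: List[Dict],
    images_metadata: List[Dict],
    page: int,
) -> List[Dict]:
    """Keep the text regions on `page` (0-based) that overlap no table/image bbox."""
    # Only regions on the same page can block text on that page.
    blocked = [m["bbox"] for m in images_metadata if m["page"] == page]
    return [
        t
        for t in text_regions
        if t["page"] == page
        and not any(bbox_overlaps(t["bbox"], b) for b in blocked)
    ]
```

Because any partial overlap removes a text region, a filter of this shape is strict, which is consistent with the text loss listed under the known limitations.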
2025-11-18 18:57:01 +08:00
parent 5cf4010c9b
commit 0edc56b03f
6 changed files with 485 additions and 45 deletions
--- a/openspec/changes/fix-result-preview-and-pdf-download/specs/result-export/spec.md
+++ b/openspec/changes/fix-result-preview-and-pdf-download/specs/result-export/spec.md
@@ -0,0 +1,57 @@
+# Result Export - Delta Changes
+
+## ADDED Requirements
+
+### Requirement: Image Extraction and Persistence
+The OCR system SHALL save extracted images to disk during layout analysis for later use in PDF generation.
+
+#### Scenario: Images extracted by PP-StructureV3 are saved to disk
+- **WHEN** OCR processes a document containing images (charts, tables, figures)
+- **THEN** system SHALL extract image objects from `markdown_images` dictionary
+- **AND** system SHALL create `imgs/` subdirectory in result folder
+- **AND** system SHALL save each image object to disk using PIL Image.save()
+- **AND** saved file paths SHALL match paths recorded in JSON `images_metadata`
+- **AND** system SHALL log warnings for failed image saves but continue processing
+
+#### Scenario: Multi-page documents with images on different pages
+- **WHEN** OCR processes multi-page PDF with images on multiple pages
+- **THEN** system SHALL save images from all pages to same `imgs/` folder
+- **AND** image filenames SHALL include bbox coordinates for uniqueness
+- **AND** images SHALL be available for PDF generation after OCR completes
+
+### Requirement: Layout-Preserving PDF Generation
+The system SHALL generate PDF files that preserve the original document layout using OCR JSON data.
+
+#### Scenario: PDF generated from JSON with accurate layout
+- **WHEN** user requests PDF download for a completed task
+- **THEN** system SHALL parse OCR JSON result file
+- **AND** system SHALL extract bounding box coordinates for each text region
+- **AND** system SHALL determine page dimensions from source file or bbox maximum values
+- **AND** system SHALL generate PDF with text positioned at precise coordinates
+- **AND** system SHALL use Chinese-compatible font (e.g., Noto Sans CJK)
+- **AND** system SHALL embed images from `imgs/` folder using paths in `images_metadata`
+- **AND** generated PDF SHALL visually resemble original document layout with images
+
+#### Scenario: PDF download works correctly
+- **WHEN** user clicks PDF download button
+- **THEN** system SHALL return cached PDF if already generated
+- **OR** system SHALL generate new PDF from JSON on first request
+- **AND** system SHALL NOT return 403 Forbidden error
+- **AND** downloaded PDF SHALL contain task OCR results with layout preserved
+
+#### Scenario: Multi-page PDF generation
+- **WHEN** OCR JSON contains results for multiple pages
+- **THEN** generated PDF SHALL contain same number of pages
+- **AND** each page SHALL display text regions for that page only
+- **AND** page dimensions SHALL match original document pages
+
+## MODIFIED Requirements
+
+### Requirement: Export Interface
+The Export page SHALL support downloading OCR results in multiple formats using V2 task APIs.
+
+#### Scenario: PDF caching improves performance
+- **WHEN** user downloads same PDF multiple times
+- **THEN** system SHALL serve cached PDF file on subsequent requests
+- **AND** system SHALL NOT regenerate PDF unless JSON changes
+- **AND** download response time SHALL be faster than initial generation
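Reviewer note on the caching scenario above: one way to satisfy "SHALL NOT regenerate PDF unless JSON changes" is to compare file modification times. The sketch below is only an illustration under that assumption; `get_or_generate_pdf` and the `generate` callback are hypothetical names, and the actual service may detect changes differently (for example, with a content hash).

```python
from pathlib import Path
from typing import Callable


def get_or_generate_pdf(
    json_path: Path,
    pdf_path: Path,
    generate: Callable[[Path, Path], None],
) -> Path:
    """Return a cached PDF, regenerating it only when the OCR JSON is newer.

    `generate(json_path, pdf_path)` is a hypothetical callback that writes
    the PDF to `pdf_path`.
    """
    # The cache is stale if the PDF is missing or older than its JSON source.
    stale = (
        not pdf_path.exists()
        or pdf_path.stat().st_mtime < json_path.stat().st_mtime
    )
    if stale:
        generate(json_path, pdf_path)
    return pdf_path
```

An mtime comparison is cheap but can miss edits that preserve timestamps; hashing the JSON would be more robust at the cost of reading the file on every request.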
--- a/openspec/changes/fix-result-preview-and-pdf-download/specs/task-management/spec.md
+++ b/openspec/changes/fix-result-preview-and-pdf-download/specs/task-management/spec.md
@@ -0,0 +1,63 @@
+# Task Management - Delta Changes
+
+## MODIFIED Requirements
+
+### Requirement: Task Result Display
+The system SHALL provide interactive PDF preview of OCR results with layout preservation on Results and Task Detail pages.
+
+#### Scenario: Results page shows layout-preserving PDF preview
+- **WHEN** Results page loads with a completed task
+- **THEN** page SHALL fetch PDF from `/api/v2/tasks/{task_id}/download/pdf`
+- **AND** page SHALL render PDF using react-pdf PDFViewer component
+- **AND** page SHALL NOT show placeholder text "請使用上方下載按鈕..."
+- **AND** PDF SHALL display with original document layout preserved
+- **AND** PDF SHALL support zoom and page navigation controls
+
+#### Scenario: Task detail page shows PDF preview
+- **WHEN** Task Detail page loads for a completed task
+- **THEN** page SHALL fetch layout-preserving PDF
+- **AND** page SHALL render PDF using PDFViewer component
+- **AND** page SHALL NOT show placeholder text
+- **AND** PDF SHALL visually match original document layout
+
+#### Scenario: Preview handles loading state
+- **WHEN** PDF is being generated or fetched
+- **THEN** page SHALL display loading spinner
+- **AND** page SHALL show progress indicator during PDF generation
+- **AND** page SHALL NOT show error or placeholder text
+
+#### Scenario: Preview handles errors gracefully
+- **WHEN** PDF generation fails or file is missing
+- **THEN** page SHALL display helpful error message
+- **AND** error message SHALL suggest retrying the download or contacting support
+- **AND** page SHALL NOT crash or expose technical errors to user
+- **AND** page MAY fall back to markdown preview if PDF unavailable
+
+## ADDED Requirements
+
+### Requirement: Interactive PDF Viewer Features
+The PDF viewer component SHALL provide essential viewing controls for user convenience.
+
+#### Scenario: PDF viewer provides zoom controls
+- **WHEN** user views PDF preview
+- **THEN** viewer SHALL provide zoom in (+) and zoom out (-) buttons
+- **AND** viewer SHALL provide fit-to-width option
+- **AND** viewer SHALL provide fit-to-page option
+- **AND** zoom level SHALL persist during page navigation
+
+#### Scenario: PDF viewer provides page navigation
+- **WHEN** PDF contains multiple pages
+- **THEN** viewer SHALL display current page number and total pages
+- **AND** viewer SHALL provide previous/next page buttons
+- **AND** viewer SHALL provide page selector dropdown
+- **AND** page navigation SHALL be smooth without flickering
+
+### Requirement: Frontend PDF Library Integration
+The frontend SHALL use react-pdf for PDF rendering capabilities.
+
+#### Scenario: react-pdf configured correctly
+- **WHEN** application initializes
+- **THEN** react-pdf library SHALL be installed and imported
+- **AND** PDF.js worker SHALL be configured properly
+- **AND** worker path SHALL point to correct pdfjs-dist worker file
+- **AND** PDF rendering SHALL work without console errors