OCR/openspec/changes/fix-result-preview-and-pdf-download/specs/task-management/spec.md
egg 0edc56b03f fix: correct page-number errors and text overlap in PDF generation
## Fixes

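### 1. Incorrect page-number assignment
- **Problem**: Page numbers in layout_data and images_metadata were overwritten by the 1-based logic, leaving all of them as 0
- **Fix**: Added a current_page parameter to analyze_layout() so the correct 0-based page number is set at the source
- **Impact**: Tables and images now appear on the correct pages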
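
The fix above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the simplified `analyze_layout()` signature, the region dictionaries, the `page` key, and the `process_document` helper are all assumptions made for the example. It only shows the idea of passing the 0-based index in as `current_page` and stamping it once, with no later overwrite.

```python
from typing import Any


def analyze_layout(regions: list[dict[str, Any]], current_page: int = 0) -> list[dict[str, Any]]:
    """Tag every detected region with the 0-based page it came from.

    `regions` stands in for whatever the layout model returns for one page;
    the real function in ocr_service.py takes different inputs (assumption).
    """
    return [{**region, "page": current_page} for region in regions]


def process_document(pages: list[list[dict[str, Any]]]) -> list[dict[str, Any]]:
    """Collect regions from all pages, setting the page number at the source."""
    images_metadata: list[dict[str, Any]] = []
    for page_index, regions in enumerate(pages):  # enumerate() is 0-based
        images_metadata.extend(analyze_layout(regions, current_page=page_index))
    # No later pass rewrites "page", so the values set above survive.
    return images_metadata
```

Because the page number is set where the region is created, downstream consumers can rely on it without a second normalization step.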

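### 2. Text overlapping tables/images
- **Problem**: Filtering relied on the nonexistent 'tables' and 'image_regions' fields, so the filter had no effect
- **Fix**: Use images_metadata instead (it contains the bbox of every table and image)
- **Added**: _bbox_overlaps() detects any overlap (not only full containment)
- **Impact**: Text no longer covers table and image regions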
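
A hedged sketch of the overlap-based filtering described above. It assumes axis-aligned boxes in `(x0, y0, x1, y1)` form and entries that carry `page` and `bbox` keys; the actual bbox format and the signatures of `_bbox_overlaps()` and `_filter_text_in_regions()` in pdf_generator_service.py may differ, so the function names here are illustrative only.

```python
BBox = tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed axis-aligned


def bbox_overlaps(a: BBox, b: BBox) -> bool:
    """Return True if the two rectangles share any area (touching edges do not count)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def filter_text_in_regions(text_regions, images_metadata):
    """Drop text regions that overlap any table/image bbox on the same page."""
    kept = []
    for text in text_regions:
        blocked = any(
            item["page"] == text["page"] and bbox_overlaps(text["bbox"], item["bbox"])
            for item in images_metadata
        )
        if not blocked:
            kept.append(text)
    return kept
```

A containment-only test would keep a text box that merely straddles a table edge; testing for any overlap removes it, at the cost of dropping more text.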

### 3. Rendering order optimization
- **Change**: Images (bottom layer) → tables (middle layer) → text (top layer)
- **Impact**: The visual layering is more correct
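
One way to express this layering, as a small sketch: the `type` key and the `LAYER_ORDER` table are hypothetical names introduced for illustration, and the real generator may simply call its drawing routines in a fixed sequence instead of sorting.

```python
# Lower values are drawn first, so later layers paint over earlier ones.
LAYER_ORDER = {"image": 0, "table": 1, "text": 2}


def in_draw_order(elements):
    """Return elements sorted bottom layer first; ties keep their input order."""
    return sorted(elements, key=lambda e: LAYER_ORDER[e["type"]])
```

Since `sorted()` is stable, elements of the same type keep their original relative order.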

## Technical details

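- ocr_service.py: pass the current_page parameter through and remove the page-number override logic
- pdf_generator_service.py:
  - Add the _bbox_overlaps() method
  - Update _filter_text_in_regions() to use overlap detection
  - Switch the data source to images_metadata
  - Adjust the drawing order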

## Known limitations

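- 21.6% of the text is still lost to filtering (an inherent problem of the coordinate-based positioning approach)
- The full layout information from PP-StructureV3 (parsing_res_list, layout_bbox) is not used yet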

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-18 18:57:01 +08:00

Task Management - Delta Changes

MODIFIED Requirements

Requirement: Task Result Display

The system SHALL provide interactive PDF preview of OCR results with layout preservation on Results and Task Detail pages.

Scenario: Results page shows layout-preserving PDF preview

  • WHEN Results page loads with a completed task
  • THEN page SHALL fetch PDF from /api/v2/tasks/{task_id}/download/pdf
  • AND page SHALL render PDF using react-pdf PDFViewer component
  • AND page SHALL NOT show placeholder text "請使用上方下載按鈕..."
  • AND PDF SHALL display with original document layout preserved
  • AND PDF SHALL support zoom and page navigation controls

Scenario: Task detail page shows PDF preview

  • WHEN Task Detail page loads for a completed task
  • THEN page SHALL fetch layout-preserving PDF
  • AND page SHALL render PDF using PDFViewer component
  • AND page SHALL NOT show placeholder text
  • AND PDF SHALL visually match original document layout

Scenario: Preview handles loading state

  • WHEN PDF is being generated or fetched
  • THEN page SHALL display loading spinner
  • AND page SHALL show progress indicator during PDF generation
  • AND page SHALL NOT show error or placeholder text

Scenario: Preview handles errors gracefully

  • WHEN PDF generation fails or file is missing
  • THEN page SHALL display helpful error message
  • AND error message SHALL suggest retrying the download or contacting support
  • AND page SHALL NOT crash or expose technical errors to user
  • AND page MAY fall back to markdown preview if PDF is unavailable

ADDED Requirements

Requirement: Interactive PDF Viewer Features

The PDF viewer component SHALL provide essential viewing controls for user convenience.

Scenario: PDF viewer provides zoom controls

  • WHEN user views PDF preview
  • THEN viewer SHALL provide zoom in (+) and zoom out (-) buttons
  • AND viewer SHALL provide fit-to-width option
  • AND viewer SHALL provide fit-to-page option
  • AND zoom level SHALL persist during page navigation

Scenario: PDF viewer provides page navigation

  • WHEN PDF contains multiple pages
  • THEN viewer SHALL display current page number and total pages
  • AND viewer SHALL provide previous/next page buttons
  • AND viewer SHALL provide page selector dropdown
  • AND page navigation SHALL be smooth without flickering

Requirement: Frontend PDF Library Integration

The frontend SHALL use react-pdf for PDF rendering capabilities.

Scenario: react-pdf configured correctly

  • WHEN application initializes
  • THEN react-pdf library SHALL be installed and imported
  • AND PDF.js worker SHALL be configured properly
  • AND worker path SHALL point to correct pdfjs-dist worker file
  • AND PDF rendering SHALL work without console errors