- Add OCR Track support for reflow PDF generation using raw_ocr_regions.json
- Add OCR Track translation extraction from raw_ocr_regions instead of elements
- Add raw_ocr_translations output format for OCR Track documents
- Add exclusion zone filtering to remove text overlapping with images
- Update API validation to accept both translations and raw_ocr_translations
- Add page_number field to TranslatedItem for proper tracking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

52 lines
2.0 KiB
Markdown
# Tasks: Fix OCR Track Reflow PDF

## 1. Modify generate_reflow_pdf Method

- [x] 1.1 Add processing track detection
  - File: `backend/app/services/pdf_generator_service.py`
  - Location: `generate_reflow_pdf` method (line ~4704)
  - Read `metadata.processing_track` from JSON data
  - Branch logic based on track type

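
The track detection in 1.1 could look roughly like the sketch below. The function names and the track values `"ocr"` and `"direct"` are assumptions made for illustration; the actual values stored in `metadata.processing_track` are not stated in this task list.

```python
from typing import Any, Dict


def detect_processing_track(json_data: Dict[str, Any], default: str = "direct") -> str:
    """Return the lower-cased processing track from the result JSON.

    Falls back to `default` when `metadata` or `processing_track` is
    missing or empty. The "direct" default is an assumption.
    """
    metadata = json_data.get("metadata") or {}
    track = metadata.get("processing_track") or default
    return str(track).lower()


def choose_reflow_path(json_data: Dict[str, Any]) -> str:
    """Pick which rendering branch to use (hypothetical branch labels)."""
    # Only the OCR Track takes the new raw-OCR path; everything else
    # keeps the existing Direct Track flow.
    return "ocr" if detect_processing_track(json_data) == "ocr" else "direct"
```

Defaulting to the Direct Track branch when the field is absent keeps older result files on the unchanged code path.
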
- [x] 1.2 Add helper function to load raw OCR regions
  - File: `backend/app/services/pdf_generator_service.py`
  - Using existing: `load_raw_ocr_regions` from `text_region_renderer.py`
  - Pattern: `{task_id}_*_page_{page_num}_raw_ocr_regions.json`
  - Return: List of text regions with bbox and content

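
As an illustration of the file pattern in 1.2, the sketch below resolves and reads one page's raw OCR file. It is not the existing `load_raw_ocr_regions`, whose signature is not shown here; `load_page_regions`, the `result_dir` parameter, and the assumption that the file holds a JSON list are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_page_regions(result_dir: Path, task_id: str, page_num: int) -> List[Dict[str, Any]]:
    """Load raw OCR regions for one page, or return [] if no file matches.

    Assumes the file contains a JSON list of region objects.
    """
    pattern = f"{task_id}_*_page_{page_num}_raw_ocr_regions.json"
    # Sort so the choice is deterministic if several files match.
    matches = sorted(Path(result_dir).glob(pattern))
    if not matches:
        return []
    with matches[0].open(encoding="utf-8") as fh:
        data = json.load(fh)
    return data if isinstance(data, list) else []
```

Because the page number is followed by `_raw_ocr_regions.json` in the pattern, a lookup for page 1 does not pick up the file for page 10.
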
- [x] 1.3 Implement OCR Track reflow rendering
  - File: `backend/app/services/pdf_generator_service.py`
  - For OCR Track: Load raw OCR regions per page
  - Sort text blocks by Y coordinate (top to bottom reading order)
  - Render text blocks as paragraphs
  - Still render images/charts from elements

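
The ordering step in 1.3 can be sketched as below. The key names `bbox` and `content`, the two supported bbox shapes (flat `[x0, y0, x1, y1]` or a list of `[x, y]` points), and a top-left coordinate origin are assumptions; the real region schema may differ.

```python
from typing import Any, Dict, List, Sequence, Tuple


def reading_order_key(bbox: Sequence[Any]) -> Tuple[float, float]:
    """Return (top, left) of a bbox so regions sort top to bottom, then left to right."""
    if bbox and isinstance(bbox[0], (list, tuple)):
        # Polygon form: [[x, y], [x, y], ...]
        xs = [float(p[0]) for p in bbox]
        ys = [float(p[1]) for p in bbox]
        return min(ys), min(xs)
    # Flat form: [x0, y0, x1, y1]
    x0, y0, x1, y1 = (float(v) for v in bbox[:4])
    return min(y0, y1), min(x0, x1)


def regions_to_paragraphs(regions: List[Dict[str, Any]]) -> List[str]:
    """Sort regions into reading order and return their non-empty text."""
    ordered = sorted(regions, key=lambda r: reading_order_key(r["bbox"]))
    texts = ((r.get("content") or "").strip() for r in ordered)
    return [t for t in texts if t]
```

A plain Y sort treats every region as its own paragraph; multi-column pages would need column grouping first, which this sketch does not attempt.
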
- [x] 1.4 Keep Direct Track logic unchanged
  - File: `backend/app/services/pdf_generator_service.py`
  - Direct Track continues using `content.cells` for tables
  - Extracted to `_render_reflow_elements` helper method
  - No changes to existing Direct Track flow

## 2. Handle Multi-page Documents

- [x] 2.1 Support per-page raw OCR files
  - Pattern: `{task_id}_*_page_{page_num}_raw_ocr_regions.json`
  - Iterate through pages and load corresponding raw OCR file
  - Handle missing files gracefully (fall back to elements)

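
The per-page fallback in 2.1 might be structured like the sketch below. The function name, the `page_number` and `elements` keys, and the injected `load_regions` callable are hypothetical and stand in for whatever the service actually uses.

```python
from typing import Any, Callable, Dict, List, Tuple

PagePlan = Tuple[int, str, List[Dict[str, Any]]]


def text_sources_per_page(
    pages: List[Dict[str, Any]],
    load_regions: Callable[[int], List[Dict[str, Any]]],
) -> List[PagePlan]:
    """Decide, page by page, whether text comes from raw OCR regions or elements."""
    plan: List[PagePlan] = []
    for index, page in enumerate(pages, start=1):
        page_num = page.get("page_number", index)
        try:
            regions = load_regions(page_num)
        except (OSError, ValueError):
            # Unreadable or malformed file (json.JSONDecodeError is a
            # ValueError): treat it the same as a missing file.
            regions = []
        if regions:
            plan.append((page_num, "raw_ocr", regions))
        else:
            plan.append((page_num, "elements", page.get("elements", [])))
    return plan
```

Deciding per page rather than per document means one missing raw OCR file only affects that page.
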
## 3. Testing

- [x] 3.1 Test OCR Track reflow PDF
  - Test with: `a9259180-fc49-4890-8184-2e6d5f4edad3` (scan document)
  - Verify: All 59 text blocks appear in reflow PDF
  - Verify: Images are embedded correctly

- [x] 3.2 Test Direct Track reflow PDF
  - Test with: `1b32428d-0609-4cfd-bc52-56be6956ac2e` (editable PDF)
  - Verify: Tables render with cells
  - Verify: No regression from changes

- [x] 3.3 Test translated reflow PDF
  - Test: Complete translation then download reflow PDF
  - Verify: Translated text appears correctly