Simple Text Positioning from Raw OCR
Summary
Simplify OCR track PDF generation by rendering raw OCR text at correct positions without complex table structure reconstruction.
Problem
Current OCR track processing has multiple failure points:
- PP-Structure table structure recognition fails for borderless tables
- Multi-column layouts get merged incorrectly into single tables
- Table HTML reconstruction produces wrong cell positions
- Complex column correction algorithms still can't fix fundamental structure errors
Meanwhile, the raw OCR output (raw_ocr_regions.json) correctly identifies the text and provides an accurate bounding box for each region.
Solution
Replace complex table reconstruction with simple text positioning:
- Read raw OCR regions directly
- Position text at bbox coordinates
- Calculate text rotation from bbox quadrilateral shape
- Estimate font size from bbox height
- Skip table HTML parsing entirely for OCR track
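The geometry behind the steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. It assumes each region's bbox is a quadrilateral of four (x, y) points in image pixels, ordered top-left, top-right, bottom-right, bottom-left, with y growing downward. The function names, the scale parameter (pixels to PDF points), and the 0.75 fill ratio are hypothetical and would need tuning against real output.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def text_rotation_degrees(quad: Sequence[Point]) -> float:
    """Counter-clockwise rotation of the text baseline, in degrees.

    Uses the top edge (top-left -> top-right) of the quadrilateral.
    Image y grows downward, so dy is negated to get a
    counter-clockwise angle as seen on the page.
    """
    (x0, y0), (x1, y1) = quad[0], quad[1]
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0))


def estimate_font_size(quad: Sequence[Point],
                       scale: float = 1.0,
                       fill_ratio: float = 0.75) -> float:
    """Estimate a font size from the bbox height.

    The height is the mean length of the left and right edges, so it
    stays meaningful for rotated boxes. fill_ratio is an assumed
    fraction of the box height occupied by the glyphs.
    """
    left = math.dist(quad[0], quad[3])
    right = math.dist(quad[1], quad[2])
    return (left + right) / 2 * scale * fill_ratio


def pdf_anchor(quad: Sequence[Point],
               page_height: float,
               scale: float = 1.0) -> Point:
    """Map the bbox bottom-left corner to PDF coordinates.

    PDF user space has its origin at the bottom-left of the page,
    so the y axis is flipped. page_height is in PDF points.
    """
    x, y = quad[3]
    return (x * scale, page_height - y * scale)


# Example: an axis-aligned 100x20 px box on an 800 pt tall page.
quad = [(10, 20), (110, 20), (110, 40), (10, 40)]
rotation = text_rotation_degrees(quad)
font_size = estimate_font_size(quad)
anchor = pdf_anchor(quad, page_height=800)
```

Using edge lengths rather than max(y) - min(y) for the height keeps the font size estimate stable when the text is rotated, because an axis-aligned extent would grow with the rotation angle.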
Benefits
- Reliability: Raw OCR text positions are accurate
- Simplicity: Eliminates complex table parsing logic
- Performance: Faster processing without structure analysis
- Consistency: Predictable output regardless of table type
Trade-offs
- No table borders in output
- No cell structure (colspan, rowspan)
- Visual layout approximation rather than semantic structure
Scope
- OCR track PDF generation only
- Direct track remains unchanged (uses native PDF text extraction)