chore: backup before code cleanup
Backup commit before executing remove-unused-code proposal.
This includes all pending changes and new features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
openspec/changes/simple-text-positioning/proposal.md
@@ -0,0 +1,42 @@
# Simple Text Positioning from Raw OCR

## Summary

Simplify OCR track PDF generation by rendering raw OCR text at the positions given by its bounding boxes, without reconstructing table structure.

## Problem

Current OCR track processing has multiple failure points:

1. PP-Structure table structure recognition fails for borderless tables
2. Multi-column layouts get merged incorrectly into single tables
3. Table HTML reconstruction produces wrong cell positions
4. Complex column correction algorithms still can't fix fundamental structure errors

Meanwhile, raw OCR (`raw_ocr_regions.json`) correctly identifies all text with accurate bounding boxes.

## Solution

Replace complex table reconstruction with simple text positioning:

1. Read raw OCR regions directly
2. Position text at bbox coordinates
3. Calculate text rotation from bbox quadrilateral shape
4. Estimate font size from bbox height
5. Skip table HTML parsing entirely for OCR track
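
Steps 2 through 4 can be sketched roughly as follows. This is a minimal illustration, not the project's implementation. The proposal does not specify the schema of `raw_ocr_regions.json`, so the `text` and `bbox` field names, the corner order (top-left, top-right, bottom-right, bottom-left in image pixels), the `TextPlacement` type, the `place_region` helper, and the `height_to_font` ratio of 0.75 are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class TextPlacement:
    """Hypothetical result type: where and how to draw one OCR region."""
    text: str
    x: float             # anchor x in PDF units (origin at bottom-left)
    y: float             # anchor y in PDF units (origin at bottom-left)
    rotation_deg: float  # counter-clockwise, as PDF drawing APIs expect
    font_size: float


def place_region(region: dict, page_height: float, scale: float = 1.0,
                 height_to_font: float = 0.75) -> TextPlacement:
    # ASSUMPTION: bbox is [top-left, top-right, bottom-right, bottom-left]
    # in image coordinates, where y grows downward.
    (x0, y0), (x1, y1), _, (x3, y3) = region["bbox"]

    # Rotation from the direction of the top edge. The sign is negated
    # because image y points down while PDF y points up.
    angle_img = math.degrees(math.atan2(y1 - y0, x1 - x0))
    rotation = -angle_img

    # Box height is the length of the left edge, which stays correct
    # for rotated quadrilaterals.
    box_height = math.hypot(x3 - x0, y3 - y0)

    # ASSUMPTION: glyph size is a fixed fraction of the box height.
    font_size = box_height * scale * height_to_font

    # Anchor at the bottom-left corner, flipped into PDF coordinates.
    return TextPlacement(
        text=region["text"],
        x=x3 * scale,
        y=page_height - y3 * scale,
        rotation_deg=rotation,
        font_size=font_size,
    )


region = {"text": "Total", "bbox": [[100, 200], [300, 200], [300, 240], [100, 240]]}
placement = place_region(region, page_height=800)
```

Using the left edge for height and the top edge for rotation keeps both estimates independent of axis alignment, so the same code handles upright and tilted text. The bottom-left corner is only an approximation of the text baseline, and the font ratio would likely need tuning against real output.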

## Benefits

- **Reliability**: Raw OCR text positions are accurate
- **Simplicity**: Eliminates complex table parsing logic
- **Performance**: Faster processing without structure analysis
- **Consistency**: Predictable output regardless of table type

## Trade-offs

- No table borders in output
- No cell structure (colspan, rowspan)
- Visual layout approximation rather than semantic structure

## Scope

- OCR track PDF generation only
- Direct track remains unchanged (uses native PDF text extraction)