fix: improve PDF layout generation for Direct track
Key fixes:
- Skip large vector_graphics charts (>50% page coverage) that cover text
- Fix font fallback to use NotoSansSC for CJK support instead of Helvetica
- Improve translated table rendering with dynamic font sizing
- Add merged cell (row_span/col_span) support for reflow tables
- Skip text elements inside table bboxes to avoid duplication

Archive openspec proposal: fix-pdf-table-rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -0,0 +1,18 @@
# Change: Fix PDF Table Rendering Issues
## Why
OCR track PDF exports have significant table rendering problems:
1. **Reflow PDF** (both translated and untranslated): Tables are misaligned due to missing row_span/col_span support
2. **Translated Layout PDF**: Table borders disappear and text overlaps because it doesn't use the accurate `cell_boxes` positioning
## What Changes
- **Translated Layout PDF**: Adopt layered rendering approach (borders + text separately) using `cell_boxes` from metadata
- **Reflow PDF Tables**: Fix cell extraction and add basic merged cell support
- Ensure embedded images in tables are rendered correctly in all PDF formats
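
The merged cell support described above could be sketched roughly as follows. This is a minimal illustration, not the actual `_create_reflow_table()` code: the `Cell` dataclass and `build_grid_and_spans()` are hypothetical names, and only the `row_span`/`col_span` fields come from the proposal. The span tuples follow the ReportLab `TableStyle` `SPAN` convention of `(col, row)` coordinates, on the assumption that the generator builds its tables with ReportLab; the proposal does not name the library.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical cell record; only row_span/col_span come from the proposal."""
    row: int
    col: int
    row_span: int = 1
    col_span: int = 1
    text: str = ""


def build_grid_and_spans(cells, n_rows, n_cols):
    """Return a dense text grid plus SPAN commands for merged cells.

    Each SPAN command is ("SPAN", (start_col, start_row), (end_col, end_row)),
    with inclusive end coordinates clamped to the table bounds.
    """
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    spans = []
    for cell in cells:
        # The text lives in the top-left cell of a merged region;
        # the covered cells stay empty.
        grid[cell.row][cell.col] = cell.text
        if cell.row_span > 1 or cell.col_span > 1:
            end_row = min(cell.row + cell.row_span, n_rows) - 1
            end_col = min(cell.col + cell.col_span, n_cols) - 1
            spans.append(("SPAN", (cell.col, cell.row), (end_col, end_row)))
    return grid, spans


# Usage: a header cell spanning two columns above two ordinary cells.
cells = [
    Cell(0, 0, col_span=2, text="Header"),
    Cell(1, 0, text="a"),
    Cell(1, 1, text="b"),
]
grid, spans = build_grid_and_spans(cells, n_rows=2, n_cols=2)
```

Keeping the grid dense, with empty placeholders under a merged region, keeps every row the same length, which is what typically prevents the column misalignment described under "Why".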
## Impact
- Affected specs: result-export
- Affected code:
  - `backend/app/services/pdf_generator_service.py`
    - `_draw_translated_table()` - needs complete rewrite
    - `_create_reflow_table()` - needs merged cell support