Key fixes:
- Skip large vector_graphics charts (>50% page coverage) that cover text
- Fix font fallback to use NotoSansSC for CJK support instead of Helvetica
- Improve translated table rendering with dynamic font sizing
- Add merged cell (row_span/col_span) support for reflow tables
- Skip text elements inside table bboxes to avoid duplication

Archive openspec proposal: fix-pdf-table-rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Change: Fix PDF Table Rendering Issues
Why
OCR track PDF exports have significant table rendering problems:
- Reflow PDF (both translated and untranslated): Tables are misaligned due to missing row_span/col_span support
- Translated Layout PDF: Table borders disappear and text overlaps because it doesn't use the accurate cell_boxes positioning
What Changes
- Translated Layout PDF: Adopt layered rendering approach (borders + text separately) using cell_boxes from metadata
- Reflow PDF Tables: Fix cell extraction and add basic merged cell support
- Ensure embedded images in tables are rendered correctly in all PDF formats
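The layered approach for the Translated Layout PDF could look roughly like the sketch below. It only plans the drawing order (all borders first, then all text) and does not call any PDF library. The box layout `(x0, y0, x1, y1)` with a top-left origin, the `DrawOp` type, the function names, and the average-character-width heuristic are all assumptions made for illustration; none of them are taken from the actual service code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed box layout: (x0, y0, x1, y1) in points, origin at the top-left.
Box = Tuple[float, float, float, float]


@dataclass
class DrawOp:
    """One planned drawing operation (hypothetical type, for illustration)."""
    kind: str  # "rect" or "text"
    x: float
    y: float
    width: float = 0.0
    height: float = 0.0
    text: str = ""
    font_size: float = 0.0


def fit_font_size(text: str, box_width: float, box_height: float,
                  max_size: float = 10.0, min_size: float = 4.0,
                  avg_char_width: float = 0.5) -> float:
    """Pick a font size so a single line of text roughly fits the box.

    avg_char_width is a crude guess in em units; CJK glyphs are closer
    to 1.0, so a real implementation would measure the string instead.
    """
    if not text:
        return max_size
    by_width = box_width / (len(text) * avg_char_width)
    by_height = box_height * 0.8
    return max(min_size, min(max_size, by_width, by_height))


def plan_table_layers(cell_boxes: Sequence[Box], texts: Sequence[str],
                      page_height: float, padding: float = 2.0) -> List[DrawOp]:
    """Return border ops followed by text ops, so text is drawn on top."""
    borders: List[DrawOp] = []
    labels: List[DrawOp] = []
    for (x0, y0, x1, y1), text in zip(cell_boxes, texts):
        width, height = x1 - x0, y1 - y0
        pdf_y = page_height - y1  # flip to PDF's bottom-left origin
        borders.append(DrawOp("rect", x0, pdf_y, width, height))
        if text:
            size = fit_font_size(text, width - 2 * padding, height - 2 * padding)
            labels.append(DrawOp("text", x0 + padding, pdf_y + padding,
                                 text=text, font_size=size))
    return borders + labels
```

Keeping borders and text in separate passes means a cell's border can never paint over a neighbour's text, and the border geometry comes straight from cell_boxes rather than from the text layout.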
Impact
- Affected specs: result-export
- Affected code:
  - backend/app/services/pdf_generator_service.py
    - _draw_translated_table() - needs complete rewrite
    - _create_reflow_table() - needs merged cell support
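
As a rough illustration of what basic merged cell support in _create_reflow_table() might involve, the sketch below turns a list of cells carrying row_span/col_span into a rectangular grid plus span commands. The cell dictionary keys (`row`, `col`, `text`), the function name `build_grid_with_spans`, and the explicit `n_rows`/`n_cols` arguments are hypothetical. The `("SPAN", (col, row), (col, row))` tuple shape follows ReportLab's TableStyle convention, on the assumption (not confirmed by this proposal) that the service builds its tables with ReportLab.

```python
from typing import Dict, List, Tuple

# ("SPAN", (start_col, start_row), (end_col, end_row)), inclusive on both ends.
SpanCommand = Tuple[str, Tuple[int, int], Tuple[int, int]]


def build_grid_with_spans(cells: List[Dict], n_rows: int,
                          n_cols: int) -> Tuple[List[List[str]], List[SpanCommand]]:
    """Place each cell's text at its anchor and record spans for merged cells."""
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    spans: List[SpanCommand] = []
    for cell in cells:
        row, col = cell["row"], cell["col"]
        # Missing or zero spans are treated as 1 (a normal, unmerged cell).
        row_span = max(1, cell.get("row_span") or 1)
        col_span = max(1, cell.get("col_span") or 1)
        # Clamp so a bad span from OCR cannot run past the table edge.
        last_row = min(row + row_span, n_rows) - 1
        last_col = min(col + col_span, n_cols) - 1
        grid[row][col] = cell.get("text", "")
        if last_row > row or last_col > col:
            spans.append(("SPAN", (col, row), (last_col, last_row)))
    return grid, spans
```

Cells covered by a span stay as empty strings in the grid, so every row keeps the same number of columns and the table no longer shifts out of alignment when a merged cell appears.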