Key fixes:
- Skip large vector_graphics charts (>50% page coverage) that cover text
- Fix font fallback to use NotoSansSC for CJK support instead of Helvetica
- Improve translated table rendering with dynamic font sizing
- Add merged cell (row_span/col_span) support for reflow tables
- Skip text elements inside table bboxes to avoid duplication

Archive openspec proposal: fix-pdf-table-rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
MODIFIED Requirements
Requirement: Translated Layout PDF Generation
The system SHALL generate layout-preserving PDFs with translated content that maintain accurate table structure.
Scenario: Table with accurate borders
- GIVEN an OCR result with tables containing `cell_boxes` metadata
- WHEN generating translated layout PDF
- THEN table cell borders SHALL be drawn at positions matching `cell_boxes`
- AND translated text SHALL be rendered within each cell's bounding box
Scenario: Text overflow handling
- GIVEN translated text longer than original text
- WHEN text exceeds cell bounding box
- THEN the system SHALL reduce font size (minimum 8pt) to fit content
- OR truncate with ellipsis if minimum font size is insufficient
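The shrink-then-truncate rule in this scenario can be sketched as follows. This is a minimal illustration for single-line cells, not the project's implementation: `fit_text`, `approx_width`, the 12pt starting size, and the 0.5pt step are all assumptions, and `approx_width` is a crude stand-in for a real font metric.

```python
from typing import Callable, Tuple

MIN_FONT_SIZE = 8.0  # minimum size named in the scenario


def approx_width(text: str, font_size: float) -> float:
    """Hypothetical metric: assumes every glyph is 0.5 em wide."""
    return len(text) * font_size * 0.5


def fit_text(
    text: str,
    box_width: float,
    start_size: float = 12.0,  # assumed default, not from the spec
    step: float = 0.5,         # assumed shrink step
    measure: Callable[[str, float], float] = approx_width,
) -> Tuple[str, float]:
    """Return (text_to_draw, font_size) for a single-line cell."""
    size = start_size
    # Shrink until the text fits or the minimum size is reached.
    while size > MIN_FONT_SIZE and measure(text, size) > box_width:
        size = max(MIN_FONT_SIZE, size - step)
    if measure(text, size) <= box_width:
        return text, size
    # Still too wide at the minimum size: drop trailing characters
    # until the text plus an ellipsis fits.
    ellipsis = "…"
    truncated = text
    while truncated and measure(truncated + ellipsis, size) > box_width:
        truncated = truncated[:-1]
    return truncated + ellipsis, size
```

In a real renderer the `measure` argument would be replaced by the PDF library's own string-width function for the active font, since CJK and Latin glyphs differ widely in width.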
Scenario: Embedded images in tables
- GIVEN a table with `embedded_images` in metadata
- WHEN generating translated layout PDF
- THEN images SHALL be rendered at their original positions within the table
Requirement: Reflow PDF Table Rendering
The system SHALL generate reflow PDFs with properly structured tables including merged cell support.
Scenario: Basic table rendering
- GIVEN an OCR result with table cells containing `row`, `col`, `content`
- WHEN generating reflow PDF
- THEN cells SHALL be grouped by row and column indices
- AND table SHALL render with visible borders
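As a rough sketch of the grouping step, the cells can be placed into a dense row-major grid. The function name `cells_to_grid` is hypothetical, and the sketch assumes that `row` and `col` are zero-based integers; the spec itself does not state the index base.

```python
from typing import Any, Dict, List


def cells_to_grid(cells: List[Dict[str, Any]]) -> List[List[str]]:
    """Group flat OCR cells into a row-major grid of strings.

    Assumes zero-based `row` and `col` indices (not confirmed by the spec).
    Positions with no cell are left as empty strings.
    """
    if not cells:
        return []
    n_rows = max(c["row"] for c in cells) + 1
    n_cols = max(c["col"] for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row"]][c["col"]] = str(c.get("content", ""))
    return grid
```

A grid in this shape is what most table renderers expect as input; the visible borders would then come from the renderer's own grid or border style.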
Scenario: Merged cells support
- GIVEN table cells with `row_span` or `col_span` greater than 1
- WHEN generating reflow PDF
- THEN the system SHALL render each such cell as a single cell spanning the indicated rows and columns
- AND merged cells SHALL display content without duplication
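One way to satisfy both conditions of this scenario is to write content only at the top-left (anchor) position of each merged region and record the region as a span. The sketch below is illustrative only: `build_spanned_grid` is a hypothetical name, indices are assumed zero-based, and the span tuples use `((col0, row0), (col1, row1))` ordering purely as a convention for this example.

```python
from typing import Any, Dict, List, Set, Tuple

# ((first_col, first_row), (last_col, last_row)), both ends inclusive
Span = Tuple[Tuple[int, int], Tuple[int, int]]


def build_spanned_grid(
    cells: List[Dict[str, Any]],
) -> Tuple[List[List[str]], List[Span]]:
    """Return (grid, spans); content appears only at each span's anchor."""
    if not cells:
        return [], []
    n_rows = max(c["row"] + c.get("row_span", 1) for c in cells)
    n_cols = max(c["col"] + c.get("col_span", 1) for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    spans: List[Span] = []
    covered: Set[Tuple[int, int]] = set()  # positions hidden by a merged cell

    # Visit anchors before the positions they cover.
    for c in sorted(cells, key=lambda c: (c["row"], c["col"])):
        r, col = c["row"], c["col"]
        if (r, col) in covered:
            continue  # skip repeated content inside a merged region
        rs, cs = c.get("row_span", 1), c.get("col_span", 1)
        grid[r][col] = str(c.get("content", ""))
        if rs > 1 or cs > 1:
            spans.append(((col, r), (col + cs - 1, r + rs - 1)))
            for rr in range(r, r + rs):
                for cc in range(col, col + cs):
                    if (rr, cc) != (r, col):
                        covered.add((rr, cc))
    return grid, spans
```

Tracking covered positions matters when the OCR output repeats a merged cell's content at every position it occupies; without that check the text would be drawn more than once.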
Scenario: Column width calculation
- GIVEN a table with `cell_boxes` metadata
- WHEN generating reflow PDF
- THEN column widths SHOULD be proportional to original cell widths
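A possible way to derive proportional widths is sketched below. The data layout is an assumption: each cell is taken to carry a hypothetical `bbox` entry of the form `[x0, y0, x1, y1]` alongside its `col` index, which may not match how `cell_boxes` is actually stored. The function name `column_widths` is likewise illustrative.

```python
from typing import Any, Dict, List


def column_widths(cells: List[Dict[str, Any]], available_width: float) -> List[float]:
    """Scale original column widths so they sum to `available_width`.

    Assumes each cell has `col` (zero-based) and a hypothetical
    `bbox` = [x0, y0, x1, y1] in the original page coordinates.
    """
    if not cells:
        return []
    n_cols = max(c["col"] for c in cells) + 1
    raw = [0.0] * n_cols
    for c in cells:
        if c.get("col_span", 1) != 1:
            continue  # merged cells do not describe a single column
        x0, _, x1, _ = c["bbox"]
        raw[c["col"]] = max(raw[c["col"]], x1 - x0)

    known = [w for w in raw if w > 0]
    if not known:
        # No usable boxes: fall back to equal widths.
        return [available_width / n_cols] * n_cols
    # Columns with no measurable cell get the average known width.
    fallback = sum(known) / len(known)
    raw = [w if w > 0 else fallback for w in raw]
    total = sum(raw)
    return [available_width * w / total for w in raw]
```

Because the requirement uses SHOULD rather than SHALL, the equal-width fallback for tables without usable boxes stays within the scenario's intent.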