fix: improve PDF layout generation for Direct track

Key fixes:
- Skip large vector_graphics charts (>50% page coverage) that cover text
- Fix font fallback to use NotoSansSC for CJK support instead of Helvetica
- Improve translated table rendering with dynamic font sizing
- Add merged cell (row_span/col_span) support for reflow tables
- Skip text elements inside table bboxes to avoid duplication

Archive openspec proposal: fix-pdf-table-rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -0,0 +1,41 @@
## MODIFIED Requirements

### Requirement: Translated Layout PDF Generation
The system SHALL generate layout-preserving PDFs with translated content that maintain accurate table structure.

#### Scenario: Table with accurate borders
- **GIVEN** an OCR result with tables containing `cell_boxes` metadata
- **WHEN** generating translated layout PDF
- **THEN** table cell borders SHALL be drawn at positions matching `cell_boxes`
- **AND** translated text SHALL be rendered within each cell's bounding box

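
To draw borders at positions matching `cell_boxes`, each box has to be converted into the PDF coordinate space. The sketch below is an illustration only. It assumes each entry is `[x0, y0, x1, y1]` in image pixels with a top-left origin, and that the target PDF uses points with a bottom-left origin. The spec states neither, and the function name and `scale` parameter are hypothetical.

```python
from typing import Sequence, Tuple


def cell_box_to_pdf_rect(
    box: Sequence[float], page_height_px: float, scale: float
) -> Tuple[float, float, float, float]:
    """Convert one cell box to (left, bottom, width, height) in PDF points.

    Assumptions (not confirmed by the spec): `box` is [x0, y0, x1, y1] in
    pixels measured from the top-left corner; `scale` is points per pixel.
    """
    x0, y0, x1, y1 = box
    left = x0 * scale
    width = (x1 - x0) * scale
    height = (y1 - y0) * scale
    # Flip the vertical axis: the PDF origin is at the bottom of the page.
    bottom = (page_height_px - y1) * scale
    return (left, bottom, width, height)


rect = cell_box_to_pdf_rect([100, 50, 300, 90], page_height_px=1000, scale=0.5)
```

The resulting rectangle can be used both for stroking the border and as the clipping region for the translated text of that cell.
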
#### Scenario: Text overflow handling
- **GIVEN** translated text longer than original text
- **WHEN** text exceeds cell bounding box
- **THEN** the system SHALL reduce the font size (minimum 8pt) to fit the content
- **OR** truncate the text with an ellipsis if the minimum font size is still insufficient

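
The shrink-then-truncate rule in this scenario can be sketched as follows. This is a minimal single-line illustration, not the project's implementation: `fit_text`, the 0.5pt step, the 10pt starting size, and the `approx_text_width` estimate are all assumptions. A real renderer would measure text with the metrics of the actual font.

```python
from typing import Callable, Tuple


def approx_text_width(text: str, font_size: float) -> float:
    """Hypothetical estimate: every glyph is treated as half an em wide."""
    return len(text) * font_size * 0.5


def fit_text(
    text: str,
    box_width: float,
    start_size: float = 10.0,
    min_size: float = 8.0,
    step: float = 0.5,
    measure: Callable[[str, float], float] = approx_text_width,
) -> Tuple[str, float]:
    """Return (text_to_draw, font_size) for a single-line cell."""
    size = start_size
    # Step 1: shrink the font until the text fits or the minimum is reached.
    while size > min_size and measure(text, size) > box_width:
        size = max(min_size, size - step)
    if measure(text, size) <= box_width:
        return text, size
    # Step 2: still too wide at the minimum size, so drop trailing characters
    # until the text plus an ellipsis fits. If the box is narrower than the
    # ellipsis itself, only the ellipsis is returned.
    ellipsis = "..."
    truncated = text
    while truncated and measure(truncated + ellipsis, size) > box_width:
        truncated = truncated[:-1]
    return truncated + ellipsis, size


print(fit_text("abcdefghij", 45))  # shrinks to 9.0pt, no truncation
print(fit_text("abcdefghij", 30))  # truncated at the 8.0pt minimum
```

Passing a different `measure` callable lets the same logic work with real font metrics without changing the fitting loop.
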
#### Scenario: Embedded images in tables
- **GIVEN** a table with `embedded_images` in metadata
- **WHEN** generating translated layout PDF
- **THEN** images SHALL be rendered at their original positions within the table

### Requirement: Reflow PDF Table Rendering
The system SHALL generate reflow PDFs with properly structured tables including merged cell support.

#### Scenario: Basic table rendering
- **GIVEN** an OCR result with table cells containing `row`, `col`, `content`
- **WHEN** generating reflow PDF
- **THEN** cells SHALL be grouped by row and column indices
- **AND** table SHALL render with visible borders

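
Grouping cells by row and column indices might look like the sketch below. It assumes each cell is a dict with the `row`, `col`, and `content` keys named in the scenario and that indices are 0-based. The function name `cells_to_grid` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def cells_to_grid(cells: List[dict]) -> List[List[str]]:
    """Arrange a flat cell list into a row-major grid of strings."""
    rows: Dict[int, Dict[int, str]] = defaultdict(dict)
    for cell in cells:
        rows[cell["row"]][cell["col"]] = cell.get("content", "")
    # Width of the grid is the largest column index seen, plus one.
    n_cols = max((c for r in rows.values() for c in r), default=-1) + 1
    # Positions with no cell are filled with empty strings so that every
    # row has the same number of columns.
    return [[rows[r].get(c, "") for c in range(n_cols)] for r in sorted(rows)]


grid = cells_to_grid([
    {"row": 1, "col": 0, "content": "c"},
    {"row": 0, "col": 1, "content": "b"},
    {"row": 0, "col": 0, "content": "a"},
])
print(grid)  # [['a', 'b'], ['c', '']]
```
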
#### Scenario: Merged cells support
- **GIVEN** table cells with `row_span` or `col_span` greater than 1
- **WHEN** generating reflow PDF
- **THEN** the system SHALL apply appropriate cell spanning
- **AND** merged cells SHALL display content without duplication

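
One way to satisfy both conditions is to write a merged cell's content only at its top-left position and record the span separately, as sketched below. The sketch assumes 0-based `row`/`col` indices and that missing `row_span`/`col_span` default to 1. `layout_merged_cells` and the span tuple layout `((start_col, start_row), (end_col, end_row))` are illustrative choices, not taken from the source.

```python
from typing import List, Tuple

Span = Tuple[Tuple[int, int], Tuple[int, int]]


def layout_merged_cells(cells: List[dict]) -> Tuple[List[List[str]], List[Span]]:
    """Return (grid, spans) where covered positions stay empty."""
    if not cells:
        return [], []
    n_rows = max(c["row"] + c.get("row_span", 1) for c in cells)
    n_cols = max(c["col"] + c.get("col_span", 1) for c in cells)
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    spans: List[Span] = []
    for c in cells:
        row, col = c["row"], c["col"]
        row_span, col_span = c.get("row_span", 1), c.get("col_span", 1)
        # Content goes in the anchor position only, so it is never duplicated
        # across the positions the merged cell covers.
        grid[row][col] = c.get("content", "")
        if row_span > 1 or col_span > 1:
            spans.append(((col, row), (col + col_span - 1, row + row_span - 1)))
    return grid, spans


grid, spans = layout_merged_cells([
    {"row": 0, "col": 0, "content": "Header", "col_span": 2},
    {"row": 1, "col": 0, "content": "x"},
    {"row": 1, "col": 1, "content": "y"},
])
print(grid)   # [['Header', ''], ['x', 'y']]
print(spans)  # [((0, 0), (1, 0))]
```

The span list can then be translated into whatever spanning mechanism the table renderer provides.
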
#### Scenario: Column width calculation
- **GIVEN** a table with `cell_boxes` metadata
- **WHEN** generating reflow PDF
- **THEN** column widths SHOULD be proportional to original cell widths

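
Proportional column widths can be derived by scaling the original cell widths to the width available on the reflow page. The sketch below assumes each box is `[x0, y0, x1, y1]` and that `row_boxes` holds one unmerged box per column. Both are assumptions about the `cell_boxes` layout, and `proportional_col_widths` is a hypothetical name.

```python
from typing import List, Sequence


def proportional_col_widths(
    row_boxes: Sequence[Sequence[float]], available_width: float
) -> List[float]:
    """Scale original column widths so they sum to `available_width`."""
    if not row_boxes:
        return []
    widths = [x1 - x0 for (x0, _y0, x1, _y1) in row_boxes]
    total = sum(widths)
    if total <= 0:
        # Degenerate boxes: fall back to equal-width columns.
        return [available_width / len(row_boxes)] * len(row_boxes)
    return [available_width * w / total for w in widths]


print(proportional_col_widths([[0, 0, 100, 20], [100, 0, 400, 20]], 200))
# [50.0, 150.0]
```
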