Changes:
- Replace PP-Structure 7-slider parameter UI with simple 3-option layout model selector
- Add layout model mapping: chinese (PP-DocLayout-S), default (PubLayNet), cdla
- Add LayoutModelSelector component and zh-TW translations
- Fix "default" model behavior with sentinel value for PubLayNet
- Add gap filling service for OCR track coverage improvement
- Add PP-Structure debug utilities
- Archive completed/incomplete proposals:
  - add-ocr-track-gap-filling (complete)
  - fix-ocr-track-table-rendering (incomplete)
  - simplify-ppstructure-model-selection (22/25 tasks)
- Add new layout model tests, archive old PP-Structure param tests
- Update OpenSpec ocr-processing spec with layout model requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
PDF Generation - OCR Track Table Rendering Fix
MODIFIED Requirements
Requirement: OCR Track Table Content Conversion
The PDF generator MUST properly convert table content from JSON dict format to renderable structure when processing OCR track results.
Scenario: Table dict with cells array converts to proper HTML
Given an OCR track JSON with a table element containing rows, cols, and a cells array
When the PDF generator processes this element
Then the table content MUST be converted to a TableData object
And TableData.to_html() MUST produce valid HTML with proper tr/td structure
And the generated PDF table MUST have cells positioned in the correct grid locations
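A minimal sketch of this conversion is shown here for illustration. The spec names only `TableData`, `to_html()`, and the `rows`, `cols`, and `cells` fields; the `TableCell` class, the `from_dict` constructor, and the per-cell keys `row`, `col`, and `content` are assumptions made for this example and may not match the actual JSON schema. Span handling is left out to keep the example short.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Any, Dict, List


@dataclass
class TableCell:
    # Hypothetical cell model: zero-based grid position plus text content.
    row: int
    col: int
    content: str = ""


@dataclass
class TableData:
    rows: int
    cols: int
    cells: List[TableCell] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TableData":
        """Build a TableData from a dict holding rows, cols, and a cells array."""
        cells = [
            TableCell(int(c["row"]), int(c["col"]), str(c.get("content", "")))
            for c in data.get("cells", [])
        ]
        return cls(int(data["rows"]), int(data["cols"]), cells)

    def to_html(self) -> str:
        """Emit one <tr> per row and one <td> per grid slot (empty if no cell maps there)."""
        by_pos = {(c.row, c.col): c for c in self.cells}
        out = ["<table>"]
        for r in range(self.rows):
            out.append("<tr>")
            for c in range(self.cols):
                cell = by_pos.get((r, c))
                out.append(f"<td>{escape(cell.content) if cell else ''}</td>")
            out.append("</tr>")
        out.append("</table>")
        return "".join(out)


table = TableData.from_dict({
    "rows": 2,
    "cols": 2,
    "cells": [
        {"row": 0, "col": 0, "content": "Item"},
        {"row": 0, "col": 1, "content": "Qty"},
        {"row": 1, "col": 0, "content": "A & B"},
    ],
})
html = table.to_html()
```

Filling every grid slot, even when the OCR result has no cell for it, keeps each row at the same column count, so later cells do not shift left into the wrong position.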
Scenario: Table with rowspan/colspan renders correctly
Given a table element with cells having rowspan > 1 or colspan > 1
When the PDF generator renders the table
Then merged cells MUST span the correct number of rows/columns
And content MUST appear in the merged cell position
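One way to check that merged cells cover the right slots is to build an occupancy grid before rendering. The sketch below assumes each cell is a dict with `row`, `col`, and optional `rowspan` / `colspan` keys; the function name `build_span_grid` is hypothetical and not taken from the codebase.

```python
from typing import Dict, List, Optional


def build_span_grid(rows: int, cols: int, cells: List[Dict]) -> List[List[Optional[int]]]:
    """Return a rows x cols grid where each slot holds the index of the cell covering it."""
    grid: List[List[Optional[int]]] = [[None] * cols for _ in range(rows)]
    for idx, cell in enumerate(cells):
        r0, c0 = cell["row"], cell["col"]
        # Clamp spans so a merged cell never extends past the table edge.
        r1 = min(rows, r0 + max(1, cell.get("rowspan", 1)))
        c1 = min(cols, c0 + max(1, cell.get("colspan", 1)))
        for r in range(r0, r1):
            for c in range(c0, c1):
                if grid[r][c] is None:  # the first cell to claim a slot keeps it
                    grid[r][c] = idx
    return grid


cells = [
    {"row": 0, "col": 0, "content": "Header", "colspan": 2},
    {"row": 1, "col": 0, "content": "Side", "rowspan": 2},
    {"row": 1, "col": 1, "content": "x"},
    {"row": 2, "col": 1, "content": "y"},
]
grid = build_span_grid(3, 2, cells)
# grid == [[0, 0], [1, 2], [1, 3]]
```

A renderer can then draw content only at a cell's top-left slot and skip the other slots that share its index, which is the same rule HTML applies to `rowspan` and `colspan`.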
Requirement: Table Visual Fidelity
The PDF generator MUST render OCR track tables with visual structure matching the original document.
Scenario: Table renders with grid lines
Given an OCR track table element
When rendered to PDF
Then the table MUST have visible grid lines/borders
And cell boundaries MUST be clearly defined
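As a rough illustration of what "visible grid lines" means geometrically, the sketch below computes the line segments for a table drawn inside a bounding box. It assumes evenly sized rows and columns and a simple `(x, y, width, height)` box; the function `grid_lines` is hypothetical, and the real generator may derive boundaries from per-cell bounding boxes instead.

```python
from typing import List, Tuple

Line = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def grid_lines(x: float, y: float, width: float, height: float,
               rows: int, cols: int) -> List[Line]:
    """Return horizontal then vertical segments for an evenly divided table grid."""
    row_h = height / rows
    col_w = width / cols
    lines: List[Line] = []
    for i in range(rows + 1):  # rows + 1 horizontal rules, including both borders
        yy = y + i * row_h
        lines.append((x, yy, x + width, yy))
    for j in range(cols + 1):  # cols + 1 vertical rules, including both borders
        xx = x + j * col_w
        lines.append((xx, y, xx, y + height))
    return lines


lines = grid_lines(0.0, 0.0, 200.0, 90.0, rows=3, cols=2)
```

This version draws every interior rule, so for tables with merged cells the segments crossing a merged region would need to be dropped before drawing.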
Scenario: Table text alignment preserved
Given an OCR track table with cell content
When rendered to PDF
Then text MUST be positioned within the correct cell boundaries
And text MUST NOT overflow into adjacent cells
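The no-overflow rule can be approximated by wrapping text to the cell width and shrinking the font until the wrapped lines fit the cell height. The sketch below is only an estimate under stated assumptions: the average glyph width of 0.5 × font size and the line height of 1.2 × font size are guesses suited to Latin text, and the function `fit_text_to_cell` is hypothetical. A real implementation would measure string widths with the PDF library's font metrics, and CJK text needs a wider per-glyph estimate.

```python
import textwrap
from typing import List, Tuple


def fit_text_to_cell(text: str, cell_width: float, cell_height: float,
                     font_size: float = 10.0,
                     min_font_size: float = 6.0) -> Tuple[float, List[str]]:
    """Wrap text to the cell width, shrinking the font until the lines fit the height."""
    size = font_size
    while True:
        char_w = size * 0.5   # assumed average glyph width
        line_h = size * 1.2   # assumed line height
        max_chars = max(1, int(cell_width // char_w))
        lines = textwrap.wrap(text, width=max_chars) or [""]
        # Stop when the block fits, or when the font cannot shrink further.
        if len(lines) * line_h <= cell_height or size <= min_font_size:
            return size, lines
        size -= 1.0


size, lines = fit_text_to_cell("Total amount due", cell_width=50.0, cell_height=30.0)
```

If the text still does not fit at the minimum font size, the function returns the wrapped lines anyway, so the caller has to clip or truncate them to keep the text inside the cell.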
Requirement: Backward Compatibility with Hybrid Mode
The table rendering fix MUST NOT break hybrid mode processing.
Scenario: Hybrid mode tables render correctly
Given a document processed with hybrid mode combining Direct and OCR tracks
When the PDF is generated
Then Direct track tables MUST render with existing quality
And OCR track tables MUST render with improved quality
And there MUST be no regression in table positioning or content