feat: create PDF layout restoration proposal

Create new OpenSpec change proposal to fix critical PDF generation issues:

**Problems Identified**:
1. Images never saved (empty _save_image implementation)
2. Image path mismatch (saved_path vs path lookup)
3. Tables never render (fake image dependency)
4. Text style completely lost (no font/color application)

**Solution Design**:
- Phase 1: Critical fixes (images, tables)
- Phase 2: Basic style preservation
- Phase 3: Advanced layout features
- Phase 4: Testing and optimization

**Key Improvements**:
- Implement actual image saving in pp_structure_enhanced
- Fix path resolution with fallback logic
- Use table's own bbox instead of fake images
- Track-specific rendering (rich for Direct, simple for OCR)
- Preserve StyleInfo (fonts, sizes, colors)

**Implementation Tasks**:
- 10 major task groups
- 4-week timeline
- No breaking changes
- Performance target: <10% overhead

Proposal validated: openspec validate pdf-layout-restoration ✓

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -0,0 +1,88 @@
# Result Export Specification

## ADDED Requirements

### Requirement: Layout-Preserving PDF Generation
The system MUST generate PDF files that preserve the original document layout including images, tables, and text formatting.

#### Scenario: Generate PDF with images
GIVEN a document processed through OCR or Direct track
WHEN images are detected and extracted
THEN the generated PDF MUST include all images at their original positions
AND images MUST maintain their aspect ratios
AND images MUST be saved to an imgs/ subdirectory
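
The aspect-ratio clause can be illustrated with a small sizing helper. This is a minimal sketch, not the project's code: `fit_within` is a hypothetical name, and it assumes the renderer knows the image's pixel size and the bbox size it should occupy.

```python
def fit_within(img_w, img_h, box_w, box_h):
    """Return (width, height) scaled to fit the box without distorting the image.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    if min(img_w, img_h, box_w, box_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # One uniform scale factor for both axes keeps the aspect ratio intact;
    # taking the smaller ratio guarantees the result stays inside the box.
    scale = min(box_w / img_w, box_h / img_h)
    return img_w * scale, img_h * scale
```

A 400x200 image placed in a 100x100 bbox would come out as 100x50 rather than being stretched to fill the box.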

#### Scenario: Generate PDF with tables
GIVEN a document containing tables
WHEN tables are detected and extracted
THEN the generated PDF MUST render tables with proper structure
AND tables MUST use their own bbox coordinates for positioning
AND tables MUST NOT depend on fake image references
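
Positioning a table from its own bbox mostly comes down to a coordinate conversion. The sketch below assumes the bbox is `(x0, y0, x1, y1)` with a top-left origin (common for OCR output) and that the PDF canvas uses a bottom-left origin; both are assumptions, and `bbox_to_pdf_rect` is a hypothetical helper name.

```python
def bbox_to_pdf_rect(bbox, page_height):
    """Convert a top-left-origin bbox to (x, y, width, height) in PDF space.

    Assumes bbox = (x0, y0, x1, y1) in the same units as page_height.
    """
    x0, y0, x1, y1 = bbox
    width = x1 - x0
    height = y1 - y0
    # PDF y grows upward, so the rectangle's lower edge sits at page_height - y1.
    return x0, page_height - y1, width, height
```

The returned rectangle can be handed to whatever table-drawing call the renderer uses, with no image lookup involved.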

#### Scenario: Generate PDF with styled text
GIVEN a document processed through Direct track with StyleInfo
WHEN text elements have style information
THEN the generated PDF MUST apply font families (with mapping)
AND the PDF MUST apply font sizes
AND the PDF MUST apply text colors
AND the PDF MUST apply bold/italic formatting
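
One way to read "font families (with mapping)" is a lookup from source font names to the standard PDF base fonts, with bold and italic selecting the variant. The table below is a sketch under that assumption; the family aliases and the `map_font` helper are hypothetical, and the actual StyleInfo fields are not shown in this proposal.

```python
# Source family name (lowercased) -> standard PDF base family. Illustrative only.
FAMILY_ALIASES = {
    "arial": "Helvetica", "helvetica": "Helvetica",
    "times new roman": "Times", "times": "Times",
    "courier new": "Courier", "courier": "Courier",
}

# Standard PDF font names, indexed by (bold, italic).
VARIANTS = {
    "Helvetica": {(False, False): "Helvetica", (True, False): "Helvetica-Bold",
                  (False, True): "Helvetica-Oblique", (True, True): "Helvetica-BoldOblique"},
    "Times": {(False, False): "Times-Roman", (True, False): "Times-Bold",
              (False, True): "Times-Italic", (True, True): "Times-BoldItalic"},
    "Courier": {(False, False): "Courier", (True, False): "Courier-Bold",
                (False, True): "Courier-Oblique", (True, True): "Courier-BoldOblique"},
}


def map_font(family, bold=False, italic=False, default="Helvetica"):
    """Map a source font family plus style flags to a standard PDF font name."""
    key = (family or "").strip().lower()
    base = FAMILY_ALIASES.get(key, default)  # unknown families use the default
    return VARIANTS[base][(bool(bold), bool(italic))]
```

Falling back to a default family for unknown fonts keeps rendering from failing on documents that use fonts outside the mapping.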

### Requirement: Track-Specific Rendering
The system MUST provide different rendering approaches based on the processing track.

#### Scenario: Direct track rendering
GIVEN a document processed through Direct extraction
WHEN generating a PDF
THEN the system MUST use rich formatting preservation
AND maintain precise positioning from the original
AND apply all available StyleInfo

#### Scenario: OCR track rendering
GIVEN a document processed through OCR
WHEN generating a PDF
THEN the system MUST use simplified rendering
AND apply best-effort positioning based on bbox
AND use estimated font sizes
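
The spec does not say how font sizes are estimated on the OCR track. One plausible heuristic, shown here purely as an assumption, derives the size from the bbox height per text line and clamps it to a sane range. The `ratio` of 0.75 and the clamp limits are illustrative values, not taken from the proposal.

```python
def estimate_font_size(bbox, line_count=1, ratio=0.75, min_size=6.0, max_size=72.0):
    """Estimate a font size from a text bbox (x0, y0, x1, y1).

    Heuristic sketch: glyph height is assumed to be a fixed fraction of the
    per-line box height. All defaults are assumptions.
    """
    _, y0, _, y1 = bbox
    line_height = (y1 - y0) / max(line_count, 1)
    # Clamp so tiny or oversized boxes do not produce unusable sizes.
    return max(min_size, min(max_size, line_height * ratio))
```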

### Requirement: Image Path Resolution
The system MUST correctly resolve image paths with fallback logic.

#### Scenario: Resolve saved image paths
GIVEN an element with image content
WHEN looking for the image path
THEN the system MUST check content["saved_path"] first
AND fall back to content["path"] if not found
AND fall back to content["image_path"] if not found
AND finally check metadata["path"]
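
The lookup order in this scenario can be sketched as a short fallback chain. The key names come from the scenario; the function name `resolve_image_path` and the choice to treat empty values as missing are assumptions.

```python
def resolve_image_path(content, metadata=None):
    """Return the first usable image path, or None if no key yields one."""
    content = content if isinstance(content, dict) else {}
    # Order matters: saved_path wins, then path, then image_path.
    for key in ("saved_path", "path", "image_path"):
        value = content.get(key)
        if value:  # empty strings and None count as "not found"
            return value
    # Last resort: the element's metadata.
    return (metadata or {}).get("path")
```

Returning `None` when nothing matches lets the caller decide whether to skip the image or log a warning.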

## MODIFIED Requirements

### Requirement: PDF Generation Pipeline
The PDF generation pipeline MUST be enhanced to support layout preservation.

#### Scenario: Enhanced PDF generation
GIVEN a UnifiedDocument from either track
WHEN generating a PDF
THEN the system MUST detect the processing track
AND route to the appropriate rendering method
AND preserve as much layout information as available
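
Track detection and routing might look like the dispatch below. This is a sketch only: the `Track` enum, the renderer functions, and the choice to route unknown tracks to the simpler OCR renderer are all assumptions, since the proposal does not show how UnifiedDocument exposes its track.

```python
from enum import Enum


class Track(Enum):
    DIRECT = "direct"
    OCR = "ocr"


def render_direct(document):
    """Placeholder for rich rendering (precise positions, full StyleInfo)."""
    return "rich"


def render_ocr(document):
    """Placeholder for simplified rendering (bbox positions, estimated sizes)."""
    return "simple"


RENDERERS = {Track.DIRECT: render_direct, Track.OCR: render_ocr}


def select_renderer(track_name):
    """Pick a renderer from a track name; unknown names use the OCR renderer."""
    try:
        track = Track(str(track_name).lower())
    except ValueError:
        track = Track.OCR  # assumed conservative default
    return RENDERERS[track]
```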

### Requirement: Image Handling in PP-Structure
The PP-Structure enhanced module MUST actually save extracted images.

#### Scenario: Save PP-Structure images
GIVEN PP-Structure extracts an image with img_path
WHEN processing the image element
THEN the _save_image method MUST save the image to disk
AND return a relative path for reference
AND handle both file paths and numpy arrays
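
A saving routine covering both input kinds could be sketched as follows. It is not the module's actual `_save_image`: the function name, parameters, and the `imgs/` layout are assumptions drawn from the scenarios in this spec, and the array branch assumes Pillow is installed and that the array is a uint8 image in a channel order Pillow accepts as-is.

```python
import shutil
from pathlib import Path


def save_image(image, output_dir, name):
    """Save an image given as a file path or an array.

    Returns the saved file's path relative to output_dir, e.g. "imgs/fig_1.png".
    """
    imgs_dir = Path(output_dir) / "imgs"
    imgs_dir.mkdir(parents=True, exist_ok=True)

    if isinstance(image, (str, Path)):
        # File path input: copy the file, keeping its extension when present.
        src = Path(image)
        target = imgs_dir / f"{name}{src.suffix or '.png'}"
        shutil.copyfile(src, target)
    else:
        # Array input (assumed numpy uint8). Imported lazily so the file-path
        # branch works without Pillow installed.
        from PIL import Image
        target = imgs_dir / f"{name}.png"
        Image.fromarray(image).save(target)

    return target.relative_to(output_dir).as_posix()
```

Returning a POSIX-style relative path keeps stored references portable across operating systems.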

### Requirement: Table Rendering Logic
Table rendering MUST use the element's bbox directly instead of an image lookup.

#### Scenario: Render table with direct bbox
GIVEN a table element with bbox coordinates
WHEN rendering the table in PDF
THEN the system MUST use the element's own bbox
AND the system MUST NOT look for non-existent table image files
AND the system MUST position the table accurately based on coordinates