Backend changes:
- Apply background image + invisible text layer to all Direct Track PDFs
- Add CHART to regions_to_avoid for text extraction
- Improve visual fidelity for native PDFs and Office documents

Frontend changes:
- Remove JSON, UnifiedDocument, Markdown download buttons
- Simplify to 2-column layout with only Layout PDF and Reflow PDF
- Remove translation JSON download and Layout PDF option
- Keep only Reflow PDF for translated document downloads
- Clean up unused imports (FileJson, Database, FileOutput)

Archives two OpenSpec proposals:
- unify-direct-track-pdf-rendering
- simplify-frontend-export-options

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Change: Unify Direct Track PDF Rendering with Background Image + Invisible Text Layer
Why
Direct Track PDF generation currently has visual rendering issues:
- Chart text overlap: Text elements extracted from PDF text layer (e.g., "Temperature, °C") overlap with chart images
- Z-order problems: White text on dark backgrounds becomes invisible when rendered incorrectly
- Office document issues: PDFs converted from PPT/DOC/XLS files lose visual fidelity (vector graphics, gradients)
The root cause is that Direct Track tries to render individual elements (text, images, tables) separately, which leads to z-order conflicts and missing visual content.
What Changes
Backend Changes
- Unified Background Image Rendering for All Direct Track
  - Render source PDF page as full-page background image (2x resolution)
  - Draw invisible text layer on top (PDF Text Rendering Mode 3)
  - Text remains searchable/extractable but doesn't visually overlap
- Chart Region Exclusion
  - Add the CHART element type to regions_to_avoid
  - Chart-internal text (axis labels, legends) will NOT be in the invisible text layer
  - This text is already visible in the background image and does not need translation
- Skip Element Rendering When Background Exists
  - When background image is rendered, skip individual image/table rendering
  - Only draw invisible text layer for searchability and translation extraction
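
As a rough illustration of the invisible text layer described above, the sketch below builds the raw PDF content-stream operators for one piece of text using text rendering mode 3 (the `3 Tr` operator, which neither fills nor strokes glyphs). It is not taken from pdf_generator_service.py. The function names, the font resource name `F1`, and the coordinates are hypothetical, and a real implementation would more likely go through a PDF library than emit operators by hand.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(text: str, x: float, y: float,
                       font_size: float = 10.0, font_name: str = "F1") -> str:
    """Return content-stream operators that place `text` at (x, y) invisibly.

    Assumptions: `font_name` is a font resource already declared on the page,
    and `text` only uses characters covered by that font's encoding
    (non-ASCII text such as "°C" needs encoding handling not shown here).
    """
    safe = escape_pdf_string(text)
    return "\n".join([
        "BT",                               # begin text object
        f"/{font_name} {font_size:g} Tf",   # select font and size
        "3 Tr",                             # rendering mode 3: invisible
        f"{x:g} {y:g} Td",                  # move to the text position
        f"({safe}) Tj",                     # show the string
        "ET",                               # end text object
    ])


if __name__ == "__main__":
    # Hypothetical usage: one label placed over the background image.
    print(invisible_text_ops("Temperature (C)", 72, 700, 12))
```

Because the glyphs are never painted, the background image alone determines what the reader sees, while the text stays available to search and extraction.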
Frontend Considerations
- No UI Changes Required for Layout PDF
  - Layout PDF generation is automatic, no user options needed
  - Visual output will match source PDF exactly
- Translation Flow Clarification
  - Layout PDF: Background image + invisible text (for preview)
  - Translated PDF: Reflow layout with real visible text (page-by-page)
  - Chart text excluded from translation (already in background image)
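
The chart exclusion described above could be sketched as a bounding-box filter: any text block that lies mostly inside a region in regions_to_avoid is left out of the invisible text layer and therefore out of translation. This is a minimal sketch, not the logic in direct_extraction_engine.py. The `TextBlock` type, the `overlap_ratio` and `filter_text_blocks` helpers, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page units


@dataclass
class TextBlock:
    text: str
    bbox: BBox


def overlap_ratio(inner: BBox, outer: BBox) -> float:
    """Fraction of `inner`'s area that falls inside `outer`."""
    ix0, iy0 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix1, iy1 = min(inner[2], outer[2]), min(inner[3], outer[3])
    intersection = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return intersection / area if area > 0 else 0.0


def filter_text_blocks(blocks: List[TextBlock],
                       regions_to_avoid: List[BBox],
                       threshold: float = 0.5) -> List[TextBlock]:
    """Keep blocks whose overlap with every avoided region is below threshold."""
    return [
        block for block in blocks
        if all(overlap_ratio(block.bbox, region) < threshold
               for region in regions_to_avoid)
    ]


if __name__ == "__main__":
    chart_region = (100.0, 100.0, 400.0, 300.0)  # hypothetical CHART bbox
    blocks = [
        TextBlock("Temperature, °C", (120.0, 280.0, 220.0, 295.0)),  # axis label
        TextBlock("Body paragraph", (100.0, 320.0, 400.0, 335.0)),   # outside chart
    ]
    kept = filter_text_blocks(blocks, [chart_region])
    print([b.text for b in kept])
```

Using an overlap fraction rather than strict containment is one possible design choice; it tolerates labels that poke slightly outside the detected chart box.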
Impact
- Affected specs: document-processing, result-export, translation
- Affected code:
  - backend/app/services/pdf_generator_service.py (main changes)
  - backend/app/services/direct_extraction_engine.py (chart detection)
- File size: Output PDF will be larger due to embedded page images (~2MB per page at 2x resolution)
- Processing time: Slight increase for page rendering
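
To see why embedded page images grow the output, the arithmetic below estimates the uncompressed size of one rendered page. It assumes that "2x resolution" means a zoom factor of 2 over the default PDF user space of 72 points per inch, and that the image is 8-bit RGB; neither detail is stated in the proposal. The helper name `raw_rgb_bytes` is hypothetical.

```python
def raw_rgb_bytes(width_pt: float, height_pt: float, zoom: float = 2.0) -> int:
    """Uncompressed size in bytes of a page rendered as 8-bit RGB at `zoom`."""
    width_px = round(width_pt * zoom)
    height_px = round(height_pt * zoom)
    return width_px * height_px * 3  # 3 bytes per pixel (R, G, B)


if __name__ == "__main__":
    # US Letter is 612 x 792 points.
    # At zoom 2: 1224 x 1584 px x 3 bytes = 5,816,448 bytes before compression.
    print(raw_rgb_bytes(612, 792, zoom=2.0))
    # Doubling the zoom quadruples the pixel count.
    print(raw_rgb_bytes(612, 792, zoom=2.0) // raw_rgb_bytes(612, 792, zoom=1.0))
```

The embedded size then depends on the image compression applied, which the proposal does not specify; the ~2MB per page figure is the proposal's own estimate.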
Migration
- No database changes required
- No API changes required
- Existing tasks can be re-exported with new PDF generation logic