feat: unify Direct Track PDF rendering and simplify export options

Backend changes:
- Apply background image + invisible text layer to all Direct Track PDFs
- Add CHART to regions_to_avoid for text extraction
- Improve visual fidelity for native PDFs and Office documents

Frontend changes:
- Remove JSON, UnifiedDocument, Markdown download buttons
- Simplify to 2-column layout with only Layout PDF and Reflow PDF
- Remove translation JSON download and Layout PDF option
- Keep only Reflow PDF for translated document downloads
- Clean up unused imports (FileJson, Database, FileOutput)

Archives two OpenSpec proposals:
- unify-direct-track-pdf-rendering
- simplify-frontend-export-options

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 07:50:43 +08:00
parent 53bfa88773
commit 24253ac15e
15 changed files with 891 additions and 195 deletions
--- a/openspec/changes/archive/2025-12-11-unify-direct-track-pdf-rendering/design.md
+++ b/openspec/changes/archive/2025-12-11-unify-direct-track-pdf-rendering/design.md
@@ -0,0 +1,130 @@
+# Design: Unify Direct Track PDF Rendering
+
+## Context
+
+The Tool_OCR system generates "Layout PDF" files that preserve the original document appearance while maintaining extractable text. Currently, Direct Track (editable PDFs and Office documents) uses element-by-element rendering, which causes:
+- Z-order conflicts (text behind images)
+- Missing vector graphics (chart bars, gradients)
+- White text becoming invisible on dark backgrounds
+
+## Goals / Non-Goals
+
+### Goals
+- Visual fidelity: Layout PDF matches source document exactly
+- Text extractability: All text remains searchable/selectable for translation
+- Unified logic: Same rendering approach for all Direct Track documents
+- Chart handling: Chart-internal text excluded from translation layer
+
+### Non-Goals
+- Editable text in Layout PDF (translation creates separate reflow PDF)
+- Reducing file size (trade-off for visual fidelity)
+- OCR Track changes (only affects Direct Track)
+
+## Decisions
+
+### Decision 1: Use Background Image + Invisible Text Layer
+
+**What**: Render each source PDF page as a full-page background image, then overlay invisible text.
+
+**Why**:
+- Preserves ALL visual content (vector graphics, gradients, complex layouts)
+- Invisible text (PDF text rendering mode 3, the `3 Tr` operator) allows text selection without visual overlap
+- Simplifies z-order handling (just one image layer + one text layer)
+
+**Implementation**:
+```python
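+# Render the source page as a background image (fitz is PyMuPDF)
+mat = fitz.Matrix(2.0, 2.0)  # 2x zoom, i.e. 144 DPI
+pix = source_page.get_pixmap(matrix=mat, alpha=False)
+# One way to hand the pixmap to ReportLab: wrap its PNG bytes in an
+# ImageReader (needs `import io` and reportlab.lib.utils.ImageReader)
+bg_img = ImageReader(io.BytesIO(pix.tobytes("png")))
+pdf_canvas.drawImage(bg_img, 0, 0, width=page_width, height=page_height)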
+
+# Set invisible text mode
+pdf_canvas._code.append('3 Tr')  # Text render mode: invisible
+
+# Draw text elements (invisible but selectable)
+for elem in text_elements:
+    if not is_inside_chart_region(elem):
+        draw_text_element(elem)
+
+pdf_canvas._code.append('0 Tr')  # Reset to normal
+```
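One fragile point in the snippet above is that the `0 Tr` reset is skipped if drawing a text element raises, which would leave later text on the page invisible. A minimal sketch of one way to guard against that, assuming only that the canvas exposes the `_code` operator list used above; the `invisible_text` helper and the `FakeCanvas` stand-in are hypothetical and not part of the codebase:

```python
from contextlib import contextmanager


class FakeCanvas:
    """Hypothetical stand-in exposing only the `_code` operator list."""

    def __init__(self):
        self._code = []


@contextmanager
def invisible_text(canvas):
    """Switch to text render mode 3 and always restore mode 0 afterwards."""
    canvas._code.append('3 Tr')  # invisible text
    try:
        yield canvas
    finally:
        canvas._code.append('0 Tr')  # reset, even if drawing failed


canvas = FakeCanvas()
with invisible_text(canvas):
    # Placeholder for the real text-drawing calls
    canvas._code.append('BT (hello) Tj ET')
```

The `try`/`finally` keeps the two operators paired no matter how the block exits, so the render mode cannot leak into content drawn later.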
+
+### Decision 2: Add CHART to regions_to_avoid
+
+**What**: Chart-internal text elements are excluded from the invisible text layer.
+
+**Why**:
+- Chart axis labels and legends are already visible in the background image
+- This text typically doesn't need translation
+- Excluding it prevents duplicate text extraction for translation
+
+**Implementation**:
+```python
+# In element classification loop
+if element.type == ElementType.CHART:
+    image_elements.append(element)
+    regions_to_avoid.append(element)  # Exclude chart region from text layer
+```
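The design does not show how a text element is tested against the avoided regions. A minimal sketch, assuming each element has an axis-aligned bounding box and that a text element counts as chart-internal when at least half of its area overlaps a chart region. The `BBox` type, the explicit `chart_regions` argument, and the 0.5 threshold are all hypothetical:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def intersection_area(self, other: "BBox") -> float:
        w = min(self.x1, other.x1) - max(self.x0, other.x0)
        h = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(0.0, w) * max(0.0, h)


def is_inside_chart_region(text_box: BBox,
                           chart_regions: Iterable[BBox],
                           min_overlap: float = 0.5) -> bool:
    """True if enough of the text box lies inside any chart region."""
    area = text_box.area()
    if area == 0:
        return False  # degenerate box: keep it in the text layer
    return any(text_box.intersection_area(region) / area >= min_overlap
               for region in chart_regions)


chart = BBox(100, 100, 300, 250)
axis_label = BBox(110, 230, 150, 245)   # fully inside the chart
body_text = BBox(50, 300, 400, 320)     # below the chart
print(is_inside_chart_region(axis_label, [chart]))  # True
print(is_inside_chart_region(body_text, [chart]))   # False
```

An overlap ratio, rather than strict containment, tolerates labels whose boxes poke slightly outside the chart frame. The right threshold would need tuning against real documents.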
+
+### Decision 3: Apply to ALL Direct Track Documents
+
+**What**: Use background image rendering for both Office documents and native PDFs.
+
+**Why**:
+- Consistent handling eliminates edge cases
+- Chart text overlap affects both document types
+- Office detection (LibreOffice producer) is unreliable for some PDFs
+
+**Detection logic removed**:
+```python
+# OLD: Only for Office documents
+is_office_document = 'LibreOffice' in producer or filename.endswith('.pptx')
+
+# NEW: All Direct Track uses background rendering
+if self.current_processing_track == ProcessingTrack.DIRECT:
+    render_background_image()
+```
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    PDF Generation Flow                       │
+├─────────────────────────────────────────────────────────────┤
+│                                                              │
+│  Source PDF ──► PyMuPDF ──► Page Pixmap (2x) ──► Background │
+│                    │                                         │
+│                    ▼                                         │
+│              Extract Text ──► Filter Chart Regions           │
+│                    │                                         │
+│                    ▼                                         │
+│         Invisible Text Layer (Mode 3) ──► Overlay            │
+│                                                              │
+│  Result: Background Image + Invisible Searchable Text        │
+│                                                              │
+└─────────────────────────────────────────────────────────────┘
+```
+
+## Risks / Trade-offs
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| Larger file size (~2 MB/page) | Storage, download time | Accept trade-off for visual fidelity |
+| Slightly slower generation | User wait time | Acceptable for quality improvement |
+| Chart text not translatable | Feature limitation | Document as expected behavior |
+| Source PDF required | Can't regenerate without source | Store source PDF reference in task |
+
+## File Size Estimation
+
+| Document | Pages | Current Size | New Size (est.) |
+|----------|-------|--------------|-----------------|
+| PPT | 25 | ~1.5 MB | ~43 MB |
+| PDF | 3 | ~68 KB | ~6 MB |
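The estimates in this table can be checked against the roughly 2 MB per page figure with simple arithmetic. The sketch below computes the average per-page size each row implies, plus the uncompressed RGB size of one page rendered at 2x. The A4 page size (595 x 842 pt) is an assumption for illustration only, and the helper names are hypothetical:

```python
def per_page_mb(total_mb: float, pages: int) -> float:
    """Average size per page, in MB."""
    return total_mb / pages


def raw_rgb_bytes(width_pt: float, height_pt: float, zoom: float = 2.0) -> int:
    """Uncompressed RGB size of one page rendered at `zoom`.

    PDF points are 1/72 inch, so zoom 2.0 corresponds to 144 DPI.
    """
    width_px = round(width_pt * zoom)
    height_px = round(height_pt * zoom)
    return width_px * height_px * 3  # 3 bytes per pixel, no alpha channel


print(per_page_mb(43, 25))        # 1.72
print(per_page_mb(6, 3))          # 2.0
print(raw_rgb_bytes(595, 842))    # 6011880
```

Under the A4 assumption, a raw 2x page is about 6 MB before compression, so an embedded size near 2 MB per page implies roughly 3:1 image compression.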
+
+## Open Questions
+
+1. Should we provide a "lightweight" option that skips background rendering for simple PDFs?
+   - **Decision**: No, keep unified approach for consistency
+
+2. Should chart text be optionally included in translation?
+   - **Decision**: No, chart labels rarely need translation and would require complex masking