OCR/openspec/changes/archive/2025-12-04-improve-translated-text-fitting/specs/result-export/spec.md
2025-12-04 18:00:37 +08:00

ADDED Requirements

Requirement: Dual PDF Generation Modes

The system SHALL support two distinct PDF generation modes to serve different use cases for both OCR and Direct tracks.

Scenario: Download layout preservation PDF

  • WHEN user requests PDF via /api/v2/tasks/{task_id}/download/pdf
  • THEN PDF SHALL use layout preservation mode
  • AND text positions SHALL match original document coordinates
  • AND this option SHALL be available for both OCR and Direct tracks
  • AND existing behavior SHALL remain unchanged

Scenario: Download reflow layout PDF without translation

  • WHEN user requests PDF via /api/v2/tasks/{task_id}/download/pdf?format=reflow
  • THEN PDF SHALL use reflow layout mode
  • AND text SHALL flow naturally with consistent font sizes
  • AND body text SHALL use approximately 12pt font size
  • AND headings SHALL use larger font sizes (14-18pt)
  • AND this option SHALL be available for both OCR and Direct tracks
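
The two download scenarios above differ only in the `format` query parameter. A minimal sketch of how a client might build either URL follows; the helper name `pdf_download_url` and the base URL are hypothetical, and only the path and the `format=reflow` parameter come from this spec.

```python
from urllib.parse import urlencode


def pdf_download_url(base_url: str, task_id: str, reflow: bool = False) -> str:
    """Build the PDF download URL for a task (hypothetical client helper).

    With reflow=False the URL requests the default layout preservation mode;
    with reflow=True it appends ?format=reflow as described in the spec.
    """
    url = f"{base_url.rstrip('/')}/api/v2/tasks/{task_id}/download/pdf"
    if reflow:
        url += "?" + urlencode({"format": "reflow"})
    return url
```

Omitting the parameter for layout preservation mode keeps existing clients working unchanged, which matches the "existing behavior SHALL remain unchanged" clause.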

Scenario: OCR track reading order in reflow mode

  • GIVEN document processed via OCR track
  • WHEN generating reflow PDF
  • THEN system SHALL use explicit reading_order array from JSON
  • AND elements SHALL appear in order specified by reading_order indices
  • AND if reading_order is missing, system SHALL fall back to a spatial sort by (y, x), i.e. top-to-bottom, then left-to-right
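
The ordering rule above can be sketched as follows. This is an illustration under assumptions, not the project's implementation: it assumes each element is a dict with a `bbox` of `(x0, y0, x1, y1)` and that `reading_order` holds indices into the element list. The function name `order_elements` is hypothetical.

```python
from typing import Any, Optional, Sequence


def order_elements(
    elements: Sequence[dict[str, Any]],
    reading_order: Optional[Sequence[int]] = None,
) -> list[dict[str, Any]]:
    """Return elements in reading order for reflow output.

    Assumed shapes (not confirmed by the spec): every element has
    "bbox" = (x0, y0, x1, y1); reading_order is a list of indices.
    """
    if reading_order:
        # Explicit order from the OCR JSON; out-of-range indices are skipped.
        return [elements[i] for i in reading_order if 0 <= i < len(elements)]
    # Fallback when reading_order is missing or empty:
    # sort top-to-bottom (y0), then left-to-right (x0).
    return sorted(elements, key=lambda e: (e["bbox"][1], e["bbox"][0]))
```

Note that this sketch treats an empty `reading_order` the same as a missing one; whether that matches the intended behavior is an open question for the implementation.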

Scenario: Direct track reading order in reflow mode

  • GIVEN document processed via Direct track
  • WHEN generating reflow PDF
  • THEN system SHALL use implicit element order from extraction
  • AND elements SHALL appear in list iteration order
  • AND the element order produced by PyMuPDF extraction with sort=True SHALL be trusted and used as-is

Requirement: Reflow PDF Semantic Structure

The reflow PDF generation SHALL preserve document semantic structure.

Scenario: Headings in reflow mode

  • WHEN original document contains headings (title, h1, h2, etc.)
  • THEN headings SHALL be rendered with larger font sizes
  • AND headings SHALL be visually distinguished from body text
  • AND heading hierarchy SHALL be preserved
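
One way to satisfy the heading rules is a fixed size table that decreases with heading depth. The sketch below is only an assumption about how this could look: the element type names (`title`, `h1`, `h2`, `h3`) and the exact point values are illustrative, chosen to stay within the 14-18pt heading range and the roughly 12pt body size stated elsewhere in this spec.

```python
# Hypothetical size table: values are illustrative, not taken from the codebase.
BODY_FONT_SIZE = 12.0
HEADING_FONT_SIZES = {
    "title": 18.0,
    "h1": 16.0,
    "h2": 15.0,
    "h3": 14.0,
}


def font_size_for(element_type: str) -> float:
    """Return the reflow font size in points for an element type.

    Known heading types get a larger size; everything else is body text.
    """
    return HEADING_FONT_SIZES.get(element_type.lower(), BODY_FONT_SIZE)
```

Because the sizes strictly decrease from `title` to `h3`, the heading hierarchy stays visible in the output, and every heading remains larger than body text.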

Scenario: Tables in reflow mode

  • WHEN original document contains tables
  • THEN tables SHALL render with visible cell borders
  • AND column widths SHALL auto-adjust to content
  • AND table content SHALL be fully visible
  • AND tables SHALL use appropriate cell padding
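
Auto-adjusting column widths could be approximated as below. This is a rough sketch under stated assumptions: text width is estimated from character count with a fixed per-character width (`char_width`), which a real renderer would replace with actual font metrics, and the function name and default values are hypothetical.

```python
from typing import Sequence


def column_widths(
    rows: Sequence[Sequence[object]],
    available_width: float,
    padding: float = 4.0,
    char_width: float = 6.0,
) -> list[float]:
    """Estimate column widths (points) from the longest cell in each column.

    If the natural widths exceed available_width, all columns are scaled
    down proportionally so the table still fits on the page.
    """
    if not rows:
        return []
    n_cols = max(len(row) for row in rows)
    natural = [0.0] * n_cols
    for row in rows:
        for i, cell in enumerate(row):
            # Cell width = estimated text width plus padding on both sides.
            width = len(str(cell)) * char_width + 2 * padding
            natural[i] = max(natural[i], width)
    total = sum(natural)
    if total <= available_width:
        return natural
    scale = available_width / total
    return [w * scale for w in natural]
```

When columns are scaled down, cell text has to wrap onto multiple lines for the content to stay fully visible; that wrapping step is outside this sketch.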

Scenario: Images in reflow mode

  • WHEN original document contains images
  • THEN images SHALL be embedded inline in flowing content
  • AND images SHALL be scaled to fit page width if necessary
  • AND images SHALL maintain aspect ratio
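
The image rules above reduce to a single scale factor applied to both dimensions. A minimal sketch, assuming sizes are given in points and that images narrower than the page are left at their original size (the spec does not say whether small images should be enlarged); `fit_image` is a hypothetical name.

```python
def fit_image(width: float, height: float, max_width: float) -> tuple[float, float]:
    """Scale an image down to max_width if needed, keeping its aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if width <= max_width:
        # Already fits: keep the original size (assumption: never upscale).
        return width, height
    scale = max_width / width
    return max_width, height * scale
```

For example, a 1000 x 500 image on a page with 500 points of usable width is scaled by 0.5 to 500 x 250.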

Scenario: Lists in reflow mode

  • WHEN original document contains numbered or bulleted lists
  • THEN lists SHALL preserve their formatting
  • AND list items SHALL flow naturally

MODIFIED Requirements

Requirement: Translated PDF Export API

The system SHALL expose an API endpoint for downloading translated documents as PDF files using reflow layout mode only.

Scenario: Download translated PDF via API

  • GIVEN a task with completed translation
  • WHEN a POST request is sent to /api/v2/translate/{task_id}/pdf?lang={lang}
  • THEN system returns PDF file with translated content
  • AND PDF SHALL use reflow layout mode (not layout preservation)
  • AND Content-Type is application/pdf
  • AND Content-Disposition suggests filename like {task_id}_translated_{lang}.pdf
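
The response headers described above might be assembled like this. The helper name is hypothetical, and the `attachment` disposition type is an assumption; the spec only fixes the content type and the filename pattern.

```python
def translated_pdf_headers(task_id: str, lang: str) -> dict[str, str]:
    """Build response headers for a translated PDF download.

    The filename follows the {task_id}_translated_{lang}.pdf pattern
    from the spec; "attachment" is assumed so browsers offer a download.
    """
    filename = f"{task_id}_translated_{lang}.pdf"
    return {
        "Content-Type": "application/pdf",
        "Content-Disposition": f'attachment; filename="{filename}"',
    }
```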

Scenario: Translated PDF uses reflow layout

  • WHEN user downloads translated PDF
  • THEN the PDF SHALL use reflow layout mode
  • AND text SHALL flow naturally with consistent font sizes
  • AND body text SHALL use approximately 12pt font size
  • AND headings SHALL use larger font sizes (14-18pt)
  • AND content SHALL be readable without magnification

Scenario: Translated PDF for OCR track

  • GIVEN document processed via OCR track with translation
  • WHEN generating translated PDF
  • THEN reading order SHALL follow reading_order array
  • AND translated text SHALL replace original in correct positions

Scenario: Translated PDF for Direct track

  • GIVEN document processed via Direct track with translation
  • WHEN generating translated PDF
  • THEN reading order SHALL follow implicit element order
  • AND translated text SHALL replace original in correct positions

Scenario: Invalid language parameter

  • GIVEN a task that has been translated only into English
  • WHEN user requests PDF with lang=ja (Japanese)
  • THEN system returns 404 Not Found
  • AND response includes available languages in error message

Scenario: Task not found

  • GIVEN non-existent task_id
  • WHEN user requests translated PDF
  • THEN system returns 404 Not Found
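
The two error scenarios above can be sketched as one validation step that runs before PDF generation. This assumes, purely for illustration, that completed translations are available as a mapping from task ID to a set of language codes; the function name and message wording are hypothetical.

```python
from typing import Mapping, Set


def check_translated_pdf_request(
    translations_by_task: Mapping[str, Set[str]],
    task_id: str,
    lang: str,
) -> tuple[int, str]:
    """Return (HTTP status, message) for a translated PDF request."""
    if task_id not in translations_by_task:
        return 404, f"Task '{task_id}' not found"
    available = sorted(translations_by_task[task_id])
    if lang not in available:
        # Per the spec, the error message lists the languages that do exist.
        return 404, (
            f"Translation '{lang}' not found; "
            f"available languages: {', '.join(available)}"
        )
    return 200, "OK"
```

Both failures return 404, so clients that need to tell them apart have to inspect the message body.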

Requirement: Frontend Download Options

The frontend SHALL provide appropriate download options based on translation status.

Scenario: Download options without translation

  • GIVEN a task without any completed translations
  • WHEN user views TaskDetailPage
  • THEN page SHALL display "Download Layout PDF" button (original coordinates)
  • AND page SHALL display "Download Reflow PDF" button (flowing layout)
  • AND both options SHALL be available in the download section

Scenario: Download options with translation

  • GIVEN a task with completed translation
  • WHEN user views TaskDetailPage
  • THEN page SHALL display "Download Translated PDF" button for each language
  • AND each translated PDF button SHALL remain a single option (no Layout/Reflow choice)
  • AND translated PDF SHALL automatically use reflow layout

Scenario: Remove outdated MADLAD-400 references

  • WHEN displaying translation section
  • THEN page SHALL NOT display "MADLAD-400" badge
  • AND description text SHALL reflect cloud translation service (Dify)
  • AND description SHALL NOT mention local model loading time