OCR/openspec/changes/archive/2025-12-02-add-translated-pdf-export/specs/translation/spec.md
egg a07aad96b3 feat: add translated PDF export with layout preservation
Adds the ability to download translated documents as PDF files while
preserving the original document layout. Key changes:

- Add apply_translations() function to merge translation JSON with UnifiedDocument
- Add generate_translated_pdf() method to PDFGeneratorService
- Add POST /api/v2/translate/{task_id}/pdf endpoint
- Add downloadTranslatedPdf() method and PDF button in frontend
- Add comprehensive unit tests (52 tests: merge, PDF generation, API endpoints)
- Archive add-translated-pdf-export proposal

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 12:33:31 +08:00


ADDED Requirements

Requirement: Translated PDF Generation

The system SHALL support generating PDF files with translated content while preserving the original document layout.

Scenario: Generate translated PDF from Direct track document

  • GIVEN a completed translation for a Direct track processed document
  • WHEN user requests translated PDF via POST /api/v2/translate/{task_id}/pdf?lang={target_lang}
  • THEN the system loads the translation JSON file
  • AND merges translations with UnifiedDocument by element_id
  • AND generates PDF with translated text at original positions
  • AND returns PDF file with Content-Type application/pdf
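
As a client-side illustration of the request in this scenario, the sketch below builds the endpoint URL and checks the response Content-Type. The helper names `translated_pdf_url` and `is_pdf_response` and the base URL are hypothetical; only the path, the `lang` query parameter, and the `application/pdf` content type come from the spec.

```python
from urllib.parse import quote, urlencode


def translated_pdf_url(base_url: str, task_id: str, target_lang: str) -> str:
    """Build the URL for POST /api/v2/translate/{task_id}/pdf?lang={target_lang}."""
    # Escape the task id so it cannot break out of its path segment.
    path = f"/api/v2/translate/{quote(task_id, safe='')}/pdf"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'lang': target_lang})}"


def is_pdf_response(headers: dict[str, str]) -> bool:
    """Return True when the Content-Type header is application/pdf.

    Header names are matched case-insensitively and parameters such as
    a charset are ignored.
    """
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":
            content_type = value
    return content_type.split(";")[0].strip().lower() == "application/pdf"
```

The URL would be sent with the POST method; the HTTP client itself is left out so the sketch stays independent of any particular library.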

Scenario: Generate translated PDF from OCR track document

  • GIVEN a completed translation for an OCR track processed document
  • WHEN user requests translated PDF
  • THEN the system generates PDF preserving all OCR layout information
  • AND replaces original text with translated content
  • AND maintains table structure with translated cell content

Scenario: Handle missing translations gracefully

  • GIVEN a translation JSON missing some element_id entries
  • WHEN generating translated PDF
  • THEN the system uses original content for missing translations
  • AND logs a warning for each fallback
  • AND completes PDF generation successfully
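
The fallback behaviour above can be sketched as follows. This is a minimal illustration, not the project's `apply_translations()`: it assumes both the original content and the translations are flat mappings from element_id to text, and the function name and logger name are hypothetical.

```python
import logging

logger = logging.getLogger("translated_pdf")  # hypothetical logger name


def merge_with_fallback(originals: dict[str, str],
                        translations: dict[str, str]) -> dict[str, str]:
    """Return element_id -> text, preferring the translation when present."""
    merged = {}
    for element_id, original in originals.items():
        if element_id in translations:
            merged[element_id] = translations[element_id]
        else:
            # One warning per fallback, as the scenario requires.
            logger.warning(
                "No translation for element %s; using original content",
                element_id,
            )
            merged[element_id] = original
    return merged
```

Because a missing entry only triggers a warning and never an exception, generation can continue with a mix of translated and original text.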

Scenario: Translated PDF for incomplete translation

  • GIVEN a task with translation status "pending" or "translating"
  • WHEN user requests translated PDF
  • THEN the system returns 400 Bad Request
  • AND includes an error message indicating that the translation is not complete

Scenario: Translated PDF for non-existent translation

  • GIVEN a task that has not been translated to the requested language
  • WHEN user requests translated PDF with lang=fr
  • THEN the system returns 404 Not Found
  • AND includes an error message indicating that no translation exists for the requested language
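
The two error scenarios above can be expressed as one pre-flight check. The sketch assumes translation status is tracked per target language in a mapping such as `{"en": "completed"}`; that mapping, the function name, and the message wording are all hypothetical. Only the status values "pending" and "translating" and the 400/404 codes come from the spec.

```python
from typing import Optional

# Statuses the spec names as "not complete".
IN_PROGRESS = {"pending", "translating"}


def check_pdf_request(statuses: dict[str, str],
                      lang: str) -> Optional[tuple[int, str]]:
    """Return (http_status, message) for a rejected request, else None."""
    status = statuses.get(lang)
    if status is None:
        # The task was never translated to this language.
        return 404, f"No translation found for language '{lang}'"
    if status in IN_PROGRESS:
        return 400, f"Translation to '{lang}' is not complete (status: {status})"
    # Any other status is treated as ready in this sketch.
    return None
```

A real service would probably distinguish further states (for example a failed translation) rather than treating every other status as ready.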

Requirement: Translation Merge Service

The system SHALL provide a service to merge translation data with UnifiedDocument.

Scenario: Merge text element translations

  • GIVEN a UnifiedDocument with text elements
  • AND a translation JSON with matching element_ids
  • WHEN applying translations
  • THEN the system replaces content field for each matched element
  • AND preserves all other element properties (bounding_box, style_info, etc.)
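
One way to guarantee that only the content field changes is to model elements as immutable records and derive the translated element from the original. The `TextElement` class below is a hypothetical stand-in for the project's element type; only the field names `element_id`, `content`, `bounding_box`, and `style_info` are taken from the spec.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class TextElement:
    """Hypothetical, simplified text element."""
    element_id: str
    content: str
    bounding_box: tuple[float, float, float, float]
    style_info: dict = field(default_factory=dict)


def translate_element(element: TextElement,
                      translations: dict[str, str]) -> TextElement:
    """Return the element with translated content, or unchanged if unmatched."""
    translated = translations.get(element.element_id)
    if translated is None:
        return element
    # replace() copies every field and overrides only `content`.
    return replace(element, content=translated)
```

Usage, under the same assumptions:

```python
heading = TextElement("p1", "Hello", (0.0, 0.0, 100.0, 20.0), {"font_size": 12})
translated = translate_element(heading, {"p1": "Bonjour"})
```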

Scenario: Merge table cell translations

  • GIVEN a UnifiedDocument containing table elements
  • AND a translation JSON with table_cell translations like:
    {
      "table_1_0": {
        "cells": [{"row": 0, "col": 0, "content": "Translated"}]
      }
    }
    
  • WHEN applying translations
  • THEN the system updates cell content at matching row/col positions
  • AND preserves cell structure and styling
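
A minimal sketch of the cell merge, using the translation shape shown in this scenario. The table representation (a dict with a `cells` list whose entries carry `row`, `col`, `content`, and optional styling) and the function name are assumptions for illustration.

```python
import copy


def apply_cell_translations(table: dict, translated: dict) -> dict:
    """Return a copy of `table` with content replaced at matching row/col."""
    # Index the translated cells by position for direct lookup.
    by_position = {
        (cell["row"], cell["col"]): cell["content"]
        for cell in translated.get("cells", [])
    }
    result = copy.deepcopy(table)
    for cell in result["cells"]:
        key = (cell["row"], cell["col"])
        if key in by_position:
            # Only `content` changes; styling and other keys stay as they were.
            cell["content"] = by_position[key]
    return result
```

Cells without a matching row/col entry keep their original content, which matches the fallback behaviour required for missing translations.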

Scenario: Non-destructive merge operation

  • GIVEN a UnifiedDocument
  • WHEN applying translations
  • THEN the system creates a modified copy
  • AND original UnifiedDocument remains unchanged
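
The non-destructive requirement lends itself to a simple check: snapshot the input, run the merge, and compare. The sketch below assumes a document represented as a dict with an `elements` list; that shape and all function names are hypothetical. It contrasts an in-place merge, which violates the requirement, with a copy-first merge, which satisfies it.

```python
import copy
from typing import Any, Callable


def merge_in_place(document: dict, translations: dict[str, str]) -> dict:
    """Mutates `document`; shown only as the behaviour to avoid."""
    for element in document["elements"]:
        element["content"] = translations.get(
            element["element_id"], element["content"]
        )
    return document


def merge_on_copy(document: dict, translations: dict[str, str]) -> dict:
    """Deep-copies first, so the caller's document is never touched."""
    return merge_in_place(copy.deepcopy(document), translations)


def merge_leaves_input_unchanged(merge: Callable[[Any, dict], Any],
                                 document: Any,
                                 translations: dict) -> bool:
    """True if `merge` returns a new object and the input still equals its snapshot."""
    snapshot = copy.deepcopy(document)
    result = merge(document, translations)
    return result is not document and document == snapshot
```

A deep copy is the simplest way to meet the requirement; for large documents a copy-on-write approach may be cheaper, at the cost of more careful bookkeeping.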