Adds the ability to download translated documents as PDF files while
preserving the original document layout. Key changes:
- Add apply_translations() function to merge translation JSON with UnifiedDocument
- Add generate_translated_pdf() method to PDFGeneratorService
- Add POST /api/v2/translate/{task_id}/pdf endpoint
- Add downloadTranslatedPdf() method and PDF button in frontend
- Add comprehensive unit tests (52 tests: merge, PDF generation, API endpoints)
- Archive add-translated-pdf-export proposal
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
# translation Specification

## Purpose
TBD - created by archiving change add-document-translation. Update Purpose after archive.
## Requirements
### Requirement: Document Translation Service

The system SHALL provide a document translation service that translates extracted text from OCR-processed documents into target languages using DIFY AI API.

#### Scenario: Successful translation of Direct track document
- **GIVEN** a completed OCR task with Direct track processing
- **WHEN** user requests translation to English
- **THEN** the system extracts all translatable elements (text, title, header, footer, paragraph, footnote, table cells)
- **AND** translates them using DIFY AI API
- **AND** saves the result to `{task_id}_translated_en.json`

#### Scenario: Successful translation of OCR track document
- **GIVEN** a completed OCR task with OCR track processing
- **WHEN** user requests translation to Japanese
- **THEN** the system extracts all translatable elements from UnifiedDocument format
- **AND** translates them preserving element_id mapping
- **AND** saves the result to `{task_id}_translated_ja.json`

#### Scenario: Successful translation of Hybrid track document
- **GIVEN** a completed OCR task with Hybrid track processing
- **WHEN** translation is requested
- **THEN** the system processes the document using the same unified logic
- **AND** handles any combination of element types present

#### Scenario: Table cell translation
- **GIVEN** a document containing table elements
- **WHEN** translation is requested
- **THEN** the system extracts text from each table cell
- **AND** translates each cell content individually
- **AND** preserves row/col position in the translation result
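
The extraction step described in the scenarios above could be sketched as follows. This is a minimal illustration, not the project's implementation: the dict layout of an element (`element_id`, `type`, `content`, `cells`) and the names `TranslatableItem` and `iter_translatable` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional

# Element types listed in the Direct track scenario (table cells are handled separately).
TRANSLATABLE_TYPES = {"text", "title", "header", "footer", "paragraph", "footnote"}


@dataclass
class TranslatableItem:
    """One unit of text to translate (hypothetical structure)."""
    element_id: str
    text: str
    row: Optional[int] = None  # set only for table cells
    col: Optional[int] = None


def iter_translatable(elements: List[Dict[str, Any]]) -> Iterator[TranslatableItem]:
    """Yield every non-empty text element and table cell, keeping element_id and row/col."""
    for el in elements:
        if el["type"] in TRANSLATABLE_TYPES and el.get("content"):
            yield TranslatableItem(el["element_id"], el["content"])
        elif el["type"] == "table":
            for cell in el.get("cells", []):
                if cell.get("content"):
                    yield TranslatableItem(
                        el["element_id"], cell["content"], cell["row"], cell["col"]
                    )
```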

---

### Requirement: Translation API Endpoints

The system SHALL expose REST API endpoints for translation operations.

#### Scenario: Start translation request
- **GIVEN** a completed OCR task with task_id
- **WHEN** POST request to `/api/v2/translate/{task_id}` with target_lang parameter
- **THEN** the system starts background translation process
- **AND** returns translation job status with 202 Accepted

#### Scenario: Query translation status
- **GIVEN** an active translation job
- **WHEN** GET request to `/api/v2/translate/{task_id}/status`
- **THEN** the system returns current status (pending, translating, completed, failed)
- **AND** includes progress information (current_element, total_elements)

#### Scenario: Retrieve translation result
- **GIVEN** a completed translation job
- **WHEN** GET request to `/api/v2/translate/{task_id}/result?lang={target_lang}`
- **THEN** the system returns the translation JSON content

#### Scenario: Translation for non-existent task
- **GIVEN** an invalid or non-existent task_id
- **WHEN** translation is requested
- **THEN** the system returns 404 Not Found error
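
From a client's point of view, the endpoints above suggest a start, poll, fetch flow. The sketch below shows only the URL construction and the polling loop; the HTTP call is injected as `fetch_status`, a hypothetical callable that returns the decoded status JSON, and the helper names, polling interval, and poll limit are likewise assumptions.

```python
import time
from typing import Any, Callable, Dict
from urllib.parse import quote, urlencode

BASE = "/api/v2/translate"


def status_url(task_id: str) -> str:
    return f"{BASE}/{quote(task_id, safe='')}/status"


def result_url(task_id: str, lang: str) -> str:
    return f"{BASE}/{quote(task_id, safe='')}/result?{urlencode({'lang': lang})}"


def wait_for_translation(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll the status endpoint until the job reaches a terminal state."""
    for _ in range(max_polls):
        status = fetch_status(status_url(task_id))
        if status["status"] in ("completed", "failed"):
            return status
        sleep(interval)
    raise TimeoutError(f"translation for {task_id} did not finish in time")
```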

---

### Requirement: DIFY API Integration

The system SHALL integrate with DIFY AI service for translation.

#### Scenario: API request format
- **GIVEN** text to be translated
- **WHEN** calling DIFY API
- **THEN** the system sends POST request to `/chat-messages` endpoint
- **AND** includes query with translation prompt
- **AND** uses blocking response mode
- **AND** includes user identifier for tracking

#### Scenario: API response handling
- **GIVEN** DIFY API returns translation response
- **WHEN** parsing the response
- **THEN** the system extracts translated text from `answer` field
- **AND** records usage statistics (tokens, latency)

#### Scenario: API error handling
- **GIVEN** DIFY API returns error or times out
- **WHEN** handling the error
- **THEN** the system retries up to 3 times with exponential backoff
- **AND** returns appropriate error message if all retries fail
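
The retry behavior in the error handling scenario might look like the sketch below. It assumes "retries up to 3 times" means one initial attempt followed by at most three retries, and that the delay doubles from a 1 second base; neither detail is fixed by the spec. `call_with_retry` and `TranslationError` are illustrative names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TranslationError(Exception):
    """Raised once every retry has failed (hypothetical error type)."""


def call_with_retry(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with delays of base_delay * 2**attempt."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:  # API error or timeout
            last_error = exc
            if attempt < max_retries:
                sleep(base_delay * 2 ** attempt)
    raise TranslationError(
        f"translation failed after {max_retries} retries: {last_error}"
    ) from last_error
```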

#### Scenario: API rate limiting
- **GIVEN** high volume of translation requests
- **WHEN** requests approach rate limits
- **THEN** the system queues requests appropriately
- **AND** provides feedback about wait times

---

### Requirement: Translation Prompt Format

The system SHALL use structured prompts for translation requests.

#### Scenario: Generate translation prompt
- **GIVEN** source text to translate
- **WHEN** preparing DIFY API request
- **THEN** the system formats prompt as:
  ```
  Translate the following text to {language}.
  Return ONLY the translated text, no explanations.

  {text}
  ```

#### Scenario: Language name mapping
- **GIVEN** language code like "zh-TW" or "ja"
- **WHEN** constructing translation prompt
- **THEN** the system maps to full language name (Traditional Chinese, Japanese)
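
Prompt construction and the request body could be combined roughly as shown here. The prompt template and the `query`, blocking mode, and user identifier come from this spec; the empty `inputs` object, the fallback to the raw code for unmapped languages, the default user value, and the function names are assumptions for illustration.

```python
from typing import Any, Dict

# Code-to-name mapping, using the languages named in this spec.
LANGUAGE_NAMES = {
    "en": "English",
    "zh-TW": "Traditional Chinese",
    "zh-CN": "Simplified Chinese",
    "ja": "Japanese",
    "ko": "Korean",
}

PROMPT_TEMPLATE = (
    "Translate the following text to {language}.\n"
    "Return ONLY the translated text, no explanations.\n"
    "\n"
    "{text}"
)


def build_prompt(text: str, lang_code: str) -> str:
    # Assumption: fall back to the raw code when no full name is known.
    language = LANGUAGE_NAMES.get(lang_code, lang_code)
    return PROMPT_TEMPLATE.format(language=language, text=text)


def build_chat_payload(text: str, lang_code: str, user: str = "ocr-translation") -> Dict[str, Any]:
    """JSON body for POST /chat-messages; `user` default is a placeholder."""
    return {
        "inputs": {},  # assumption: no app-level input variables
        "query": build_prompt(text, lang_code),
        "response_mode": "blocking",
        "user": user,
    }
```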

---

### Requirement: Translation Progress Reporting

The system SHALL provide real-time progress feedback during translation.

#### Scenario: Progress during multi-element translation
- **GIVEN** a document with 50 translatable elements
- **WHEN** user queries status
- **THEN** the system returns progress like `{"status": "translating", "current_element": 25, "total_elements": 50}`

#### Scenario: Translation starting status
- **GIVEN** translation job just started
- **WHEN** user queries status
- **THEN** the system returns `{"status": "pending"}`
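
One way to produce the two status payloads above is a small progress tracker that the translation loop updates after each element. The class and method names are hypothetical, and moving to `completed` when the counter reaches the total is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class TranslationProgress:
    total_elements: int
    current_element: int = 0
    status: str = "pending"

    def advance(self) -> None:
        """Record one more translated element."""
        self.current_element += 1
        if self.current_element >= self.total_elements:
            self.status = "completed"
        else:
            self.status = "translating"

    def to_dict(self) -> Dict[str, Any]:
        if self.status == "pending":
            return {"status": "pending"}
        return {
            "status": self.status,
            "current_element": self.current_element,
            "total_elements": self.total_elements,
        }
```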

---

### Requirement: Translation Result Storage

The system SHALL store translation results as independent JSON files.

#### Scenario: Save translation result
- **GIVEN** translation completes successfully
- **WHEN** saving results
- **THEN** the system creates `{original_filename}_translated_{lang}.json`
- **AND** includes schema_version, metadata, and translations dict

#### Scenario: Multiple language translations
- **GIVEN** a document translated to English and Japanese
- **WHEN** checking result files
- **THEN** both `xxx_translated_en.json` and `xxx_translated_ja.json` exist
- **AND** original `xxx_result.json` is unchanged
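
Storing each language in its own file can be sketched as below. Only the file name pattern and the three top-level keys are taken from the scenarios; the `schema_version` value, the metadata fields, and the `save_translation` signature are assumptions.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Union


def save_translation(
    result_dir: Union[str, Path],
    original_filename: str,
    lang: str,
    translations: Dict[str, Any],
    metadata: Optional[Dict[str, Any]] = None,
) -> Path:
    """Write {original_filename}_translated_{lang}.json without touching other files."""
    path = Path(result_dir) / f"{original_filename}_translated_{lang}.json"
    payload = {
        "schema_version": "1.0.0",  # placeholder value
        "metadata": {"target_lang": lang, **(metadata or {})},
        "translations": translations,
    }
    # ensure_ascii=False keeps CJK text readable in the stored file.
    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```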

---

### Requirement: Language Support

The system SHALL support common languages through DIFY AI service.

#### Scenario: Common language translation
- **GIVEN** target language is English, Chinese, Japanese, or Korean
- **WHEN** translation is requested
- **THEN** the system includes appropriate language name in prompt
- **AND** executes translation successfully

#### Scenario: Automatic source language detection
- **GIVEN** source_lang is set to "auto"
- **WHEN** translation is executed
- **THEN** the AI model automatically detects source language
- **AND** translates to target language

#### Scenario: Supported languages list
- **GIVEN** user queries supported languages
- **WHEN** checking language support
- **THEN** the system provides list including:
  - English (en)
  - Traditional Chinese (zh-TW)
  - Simplified Chinese (zh-CN)
  - Japanese (ja)
  - Korean (ko)
  - German (de)
  - French (fr)
  - Spanish (es)
  - Portuguese (pt)
  - Italian (it)
  - Russian (ru)
  - Vietnamese (vi)
  - Thai (th)

### Requirement: Translated PDF Generation

The system SHALL support generating PDF files with translated content while preserving the original document layout.

#### Scenario: Generate translated PDF from Direct track document
- **GIVEN** a completed translation for a Direct track processed document
- **WHEN** user requests translated PDF via `POST /api/v2/translate/{task_id}/pdf?lang={target_lang}`
- **THEN** the system loads the translation JSON file
- **AND** merges translations with UnifiedDocument by element_id
- **AND** generates PDF with translated text at original positions
- **AND** returns PDF file with Content-Type `application/pdf`

#### Scenario: Generate translated PDF from OCR track document
- **GIVEN** a completed translation for an OCR track processed document
- **WHEN** user requests translated PDF
- **THEN** the system generates PDF preserving all OCR layout information
- **AND** replaces original text with translated content
- **AND** maintains table structure with translated cell content

#### Scenario: Handle missing translations gracefully
- **GIVEN** a translation JSON missing some element_id entries
- **WHEN** generating translated PDF
- **THEN** the system uses original content for missing translations
- **AND** logs warning for each fallback
- **AND** completes PDF generation successfully

#### Scenario: Translated PDF for incomplete translation
- **GIVEN** a task with translation status "pending" or "translating"
- **WHEN** user requests translated PDF
- **THEN** the system returns 400 Bad Request
- **AND** includes error message indicating translation not complete

#### Scenario: Translated PDF for non-existent translation
- **GIVEN** a task that has not been translated to requested language
- **WHEN** user requests translated PDF with `lang=fr`
- **THEN** the system returns 404 Not Found
- **AND** includes error message indicating no translation for language
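
The two error scenarios above amount to a precondition check before PDF generation starts. A possible shape for that check is sketched here; the function name, the message wording, and the order in which the conditions are tested are assumptions, not behavior confirmed by the spec.

```python
from typing import Optional, Tuple


def check_pdf_request(
    translation_status: Optional[str],
    translation_file_exists: bool,
    lang: str,
) -> Tuple[int, str]:
    """Return (HTTP status code, message) for a translated-PDF request."""
    if translation_status in ("pending", "translating"):
        return 400, "Translation is not complete"
    if not translation_file_exists:
        return 404, f"No translation found for language '{lang}'"
    return 200, "OK"
```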

---

### Requirement: Translation Merge Service

The system SHALL provide a service to merge translation data with UnifiedDocument.

#### Scenario: Merge text element translations
- **GIVEN** a UnifiedDocument with text elements
- **AND** a translation JSON with matching element_ids
- **WHEN** applying translations
- **THEN** the system replaces content field for each matched element
- **AND** preserves all other element properties (bounding_box, style_info, etc.)

#### Scenario: Merge table cell translations
- **GIVEN** a UnifiedDocument containing table elements
- **AND** a translation JSON with table_cell translations like:
  ```json
  {
    "table_1_0": {
      "cells": [{"row": 0, "col": 0, "content": "Translated"}]
    }
  }
  ```
- **WHEN** applying translations
- **THEN** the system updates cell content at matching row/col positions
- **AND** preserves cell structure and styling

#### Scenario: Non-destructive merge operation
- **GIVEN** a UnifiedDocument
- **WHEN** applying translations
- **THEN** the system creates a modified copy
- **AND** original UnifiedDocument remains unchanged
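
A non-destructive merge along these lines can be sketched with plain dicts standing in for UnifiedDocument. The name `apply_translations` comes from the commit message, but this body is only an illustration: the document layout (`elements`, `cells`) and the assumption that text translations are stored as `element_id -> string` are not confirmed by the spec.

```python
import copy
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def apply_translations(document: Dict[str, Any], translations: Dict[str, Any]) -> Dict[str, Any]:
    """Return a translated deep copy of `document`; the input is never mutated."""
    merged = copy.deepcopy(document)
    for element in merged.get("elements", []):
        element_id = element["element_id"]
        translated = translations.get(element_id)
        if translated is None:
            # Fallback: keep the original content and log it.
            logger.warning("No translation for %s; keeping original content", element_id)
            continue
        if element.get("type") == "table":
            # Index translated cells by (row, col), then update matching cells only.
            by_position = {(c["row"], c["col"]): c["content"] for c in translated.get("cells", [])}
            for cell in element.get("cells", []):
                key = (cell["row"], cell["col"])
                if key in by_position:
                    cell["content"] = by_position[key]
        else:
            element["content"] = translated
    return merged
```

Because only `content` is overwritten on the copy, properties such as `bounding_box` and `style_info` carry over untouched.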