feat: add document translation via DIFY AI API

Implement document translation feature using DIFY AI API with batch processing:

Backend:
- Add DIFY client with batch translation support (5000 chars, 20 items per batch)
- Add translation service with element extraction and result building
- Add translation router with start/status/result/list/delete endpoints
- Add translation schemas (TranslationRequest, TranslationStatus, etc.)

Frontend:
- Enable translation UI in TaskDetailPage
- Add translation API methods to apiV2.ts
- Add translation types

Features:
- Batch translation with numbered markers [1], [2], [3]...
- Support for text, title, header, footer, paragraph, footnote, table cells
- Translation result JSON with statistics (tokens, latency, batch_count)
- Background task processing with progress tracking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 11:57:02 +08:00
parent 87dc97d951
commit 8d9b69ba93
18 changed files with 2970 additions and 26 deletions
--- a/openspec/changes/archive/2025-12-02-add-document-translation/specs/result-export/spec.md
+++ b/openspec/changes/archive/2025-12-02-add-document-translation/specs/result-export/spec.md
@@ -0,0 +1,55 @@
+## ADDED Requirements
+
+### Requirement: Translation Result JSON Export
+
+The system SHALL support exporting translation results as independent JSON files following a defined schema.
+
+#### Scenario: Export translation result JSON
+- **WHEN** translation completes for a document
+- **THEN** system SHALL save translation to `{filename}_translated_{lang}.json`
+- **AND** file SHALL be stored alongside original `{filename}_result.json`
+- **AND** original result file SHALL remain unchanged
+
+#### Scenario: Translation JSON schema compliance
+- **WHEN** translation result is saved
+- **THEN** JSON SHALL include schema_version field ("1.0.0")
+- **AND** SHALL include source_document reference
+- **AND** SHALL include source_lang and target_lang
+- **AND** SHALL include provider identifier (e.g., "dify")
+- **AND** SHALL include translated_at timestamp
+- **AND** SHALL include translations dict mapping element_id to translated content
+
+#### Scenario: Translation statistics in export
+- **WHEN** translation result is saved
+- **THEN** JSON SHALL include statistics object with:
+  - total_elements: count of all elements in document
+  - translated_elements: count of successfully translated elements
+  - skipped_elements: count of non-translatable elements (images, charts, etc.)
+  - total_characters: character count of translated text
+  - processing_time_seconds: translation duration
+
+#### Scenario: Table cell translation in export
+- **WHEN** document contains tables
+- **THEN** translation JSON SHALL represent table translations as:
+  ```json
+  {
+    "table_1_0": {
+      "cells": [
+        {"row": 0, "col": 0, "content": "Translated cell text"},
+        {"row": 0, "col": 1, "content": "Another cell"}
+      ]
+    }
+  }
+  ```
+- **AND** row/col positions SHALL match original table structure
+
+#### Scenario: Download translation result via API
+- **WHEN** GET request to `/api/v2/translate/{task_id}/result?lang={lang}`
+- **THEN** system SHALL return translation JSON content
+- **AND** Content-Type SHALL be application/json
+- **AND** response SHALL include appropriate cache headers
+
+#### Scenario: List available translations
+- **WHEN** GET request to `/api/v2/tasks/{task_id}/translations`
+- **THEN** system SHALL return list of available translation languages
+- **AND** include translation metadata (translated_at, provider, statistics)
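
Reviewer note on `result-export/spec.md`: the export scenarios above can be sketched as a small builder, which makes the required fields easier to check at a glance. This is a minimal illustration, not the code from this commit. The function names `build_translation_result` and `translation_path`, the element id `text_1`, and the assumption that non-table entries are plain strings are all hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_translation_result(source_document, source_lang, target_lang,
                             translations, total_elements, skipped_elements,
                             processing_time_seconds, provider="dify"):
    """Assemble a dict carrying the fields the export scenarios require."""

    def text_of(value):
        # Assumption: table entries hold {"cells": [...]}, everything else is a string.
        if isinstance(value, dict):
            return "".join(cell["content"] for cell in value.get("cells", []))
        return value

    return {
        "schema_version": "1.0.0",
        "source_document": source_document,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "provider": provider,
        "translated_at": datetime.now(timezone.utc).isoformat(),
        "statistics": {
            "total_elements": total_elements,
            "translated_elements": len(translations),
            "skipped_elements": skipped_elements,
            "total_characters": sum(len(text_of(v)) for v in translations.values()),
            "processing_time_seconds": processing_time_seconds,
        },
        "translations": translations,
    }


def translation_path(result_path, lang):
    """Map `{filename}_result.json` to `{filename}_translated_{lang}.json` in the same directory."""
    result_path = Path(result_path)
    stem = result_path.stem
    if stem.endswith("_result"):
        stem = stem[: -len("_result")]
    return result_path.with_name(f"{stem}_translated_{lang}.json")


# Example with one text element and one table element.
translations = {
    "text_1": "Hello",
    "table_1_0": {"cells": [{"row": 0, "col": 0, "content": "Translated cell text"}]},
}
result = build_translation_result("doc_result.json", "auto", "en", translations,
                                  total_elements=3, skipped_elements=1,
                                  processing_time_seconds=2.5)
output_path = translation_path("out/doc_result.json", "en")
```

Because the builder returns a new dict and `translation_path` only derives a sibling file name, the original `{filename}_result.json` is never touched, which matches the "SHALL remain unchanged" requirement.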
--- a/openspec/changes/archive/2025-12-02-add-document-translation/specs/translation/spec.md
+++ b/openspec/changes/archive/2025-12-02-add-document-translation/specs/translation/spec.md
@@ -0,0 +1,184 @@
+## ADDED Requirements
+
+### Requirement: Document Translation Service
+
+The system SHALL provide a document translation service that translates extracted text from OCR-processed documents into target languages using DIFY AI API.
+
+#### Scenario: Successful translation of Direct track document
+- **GIVEN** a completed OCR task with Direct track processing
+- **WHEN** user requests translation to English
+- **THEN** the system extracts all translatable elements (text, title, header, footer, paragraph, footnote, table cells)
+- **AND** translates them using DIFY AI API
+- **AND** saves the result to `{task_id}_translated_en.json`
+
+#### Scenario: Successful translation of OCR track document
+- **GIVEN** a completed OCR task with OCR track processing
+- **WHEN** user requests translation to Japanese
+- **THEN** the system extracts all translatable elements from UnifiedDocument format
+- **AND** translates them preserving element_id mapping
+- **AND** saves the result to `{task_id}_translated_ja.json`
+
+#### Scenario: Successful translation of Hybrid track document
+- **GIVEN** a completed OCR task with Hybrid track processing
+- **WHEN** translation is requested
+- **THEN** the system processes the document using the same unified logic
+- **AND** handles any combination of element types present
+
+#### Scenario: Table cell translation
+- **GIVEN** a document containing table elements
+- **WHEN** translation is requested
+- **THEN** the system extracts text from each table cell
+- **AND** translates each cell content individually
+- **AND** preserves row/col position in the translation result
+
+---
+
+### Requirement: Translation API Endpoints
+
+The system SHALL expose REST API endpoints for translation operations.
+
+#### Scenario: Start translation request
+- **GIVEN** a completed OCR task with task_id
+- **WHEN** POST request to `/api/v2/translate/{task_id}` with target_lang parameter
+- **THEN** the system starts background translation process
+- **AND** returns translation job status with 202 Accepted
+
+#### Scenario: Query translation status
+- **GIVEN** an active translation job
+- **WHEN** GET request to `/api/v2/translate/{task_id}/status`
+- **THEN** the system returns current status (pending, translating, completed, failed)
+- **AND** includes progress information (current_element, total_elements)
+
+#### Scenario: Retrieve translation result
+- **GIVEN** a completed translation job
+- **WHEN** GET request to `/api/v2/translate/{task_id}/result?lang={target_lang}`
+- **THEN** the system returns the translation JSON content
+
+#### Scenario: Translation for non-existent task
+- **GIVEN** an invalid or non-existent task_id
+- **WHEN** translation is requested
+- **THEN** the system returns 404 Not Found error
+
+---
+
+### Requirement: DIFY API Integration
+
+The system SHALL integrate with DIFY AI service for translation.
+
+#### Scenario: API request format
+- **GIVEN** text to be translated
+- **WHEN** calling DIFY API
+- **THEN** the system sends POST request to `/chat-messages` endpoint
+- **AND** includes query with translation prompt
+- **AND** uses blocking response mode
+- **AND** includes user identifier for tracking
+
+#### Scenario: API response handling
+- **GIVEN** DIFY API returns translation response
+- **WHEN** parsing the response
+- **THEN** the system extracts translated text from `answer` field
+- **AND** records usage statistics (tokens, latency)
+
+#### Scenario: API error handling
+- **GIVEN** DIFY API returns error or times out
+- **WHEN** handling the error
+- **THEN** the system retries up to 3 times with exponential backoff
+- **AND** returns appropriate error message if all retries fail
+
+#### Scenario: API rate limiting
+- **GIVEN** high volume of translation requests
+- **WHEN** requests approach rate limits
+- **THEN** the system queues requests appropriately
+- **AND** provides feedback about wait times
+
+---
+
+### Requirement: Translation Prompt Format
+
+The system SHALL use structured prompts for translation requests.
+
+#### Scenario: Generate translation prompt
+- **GIVEN** source text to translate
+- **WHEN** preparing DIFY API request
+- **THEN** the system formats prompt as:
+  ```
+  Translate the following text to {language}.
+  Return ONLY the translated text, no explanations.
+
+  {text}
+  ```
+
+#### Scenario: Language name mapping
+- **GIVEN** language code like "zh-TW" or "ja"
+- **WHEN** constructing translation prompt
+- **THEN** the system maps to full language name (Traditional Chinese, Japanese)
+
+---
+
+### Requirement: Translation Progress Reporting
+
+The system SHALL provide real-time progress feedback during translation.
+
+#### Scenario: Progress during multi-element translation
+- **GIVEN** a document with 50 translatable elements
+- **WHEN** user queries status
+- **THEN** the system returns progress like `{"status": "translating", "current_element": 25, "total_elements": 50}`
+
+#### Scenario: Translation starting status
+- **GIVEN** translation job just started
+- **WHEN** user queries status
+- **THEN** the system returns `{"status": "pending"}`
+
+---
+
+### Requirement: Translation Result Storage
+
+The system SHALL store translation results as independent JSON files.
+
+#### Scenario: Save translation result
+- **GIVEN** translation completes successfully
+- **WHEN** saving results
+- **THEN** the system creates `{original_filename}_translated_{lang}.json`
+- **AND** includes schema_version, metadata, and translations dict
+
+#### Scenario: Multiple language translations
+- **GIVEN** a document translated to English and Japanese
+- **WHEN** checking result files
+- **THEN** both `xxx_translated_en.json` and `xxx_translated_ja.json` exist
+- **AND** original `xxx_result.json` is unchanged
+
+---
+
+### Requirement: Language Support
+
+The system SHALL support common languages through DIFY AI service.
+
+#### Scenario: Common language translation
+- **GIVEN** target language is English, Chinese, Japanese, or Korean
+- **WHEN** translation is requested
+- **THEN** the system includes appropriate language name in prompt
+- **AND** executes translation successfully
+
+#### Scenario: Automatic source language detection
+- **GIVEN** source_lang is set to "auto"
+- **WHEN** translation is executed
+- **THEN** the AI model automatically detects source language
+- **AND** translates to target language
+
+#### Scenario: Supported languages list
+- **GIVEN** user queries supported languages
+- **WHEN** checking language support
+- **THEN** the system provides list including:
+  - English (en)
+  - Traditional Chinese (zh-TW)
+  - Simplified Chinese (zh-CN)
+  - Japanese (ja)
+  - Korean (ko)
+  - German (de)
+  - French (fr)
+  - Spanish (es)
+  - Portuguese (pt)
+  - Italian (it)
+  - Russian (ru)
+  - Vietnamese (vi)
+  - Thai (th)
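
Reviewer note on `translation/spec.md`: the prompt format, language name mapping, and retry scenarios above fit together roughly as follows. This is a sketch under stated assumptions, not the client added in this commit. The `send` callable is a hypothetical stand-in for the HTTP call to `/chat-messages`; the payload field names (`inputs`, `query`, `response_mode`, `user`) follow my reading of DIFY's chat API and should be checked against its documentation; and "retries up to 3 times" is read here as one initial attempt plus three retries.

```python
import time

# Code-to-name map taken from the supported languages list in the spec.
LANGUAGE_NAMES = {
    "en": "English", "zh-TW": "Traditional Chinese", "zh-CN": "Simplified Chinese",
    "ja": "Japanese", "ko": "Korean", "de": "German", "fr": "French",
    "es": "Spanish", "pt": "Portuguese", "it": "Italian", "ru": "Russian",
    "vi": "Vietnamese", "th": "Thai",
}


def build_prompt(text, target_lang):
    """Format the prompt from the 'Generate translation prompt' scenario."""
    # Assumption: unknown codes fall back to the raw code instead of raising.
    language = LANGUAGE_NAMES.get(target_lang, target_lang)
    return (
        f"Translate the following text to {language}.\n"
        "Return ONLY the translated text, no explanations.\n"
        "\n"
        f"{text}"
    )


def translate_with_retry(send, text, target_lang, max_retries=3,
                         base_delay=1.0, sleep=time.sleep):
    """Call `send(payload)`; on failure wait base_delay * 2**attempt and retry."""
    payload = {
        "inputs": {},
        "query": build_prompt(text, target_lang),
        "response_mode": "blocking",
        "user": "translation-service",  # hypothetical user identifier
    }
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            response = send(payload)
            return response["answer"]  # translated text lives in the `answer` field
        except Exception as exc:  # broad on purpose: the transport is unspecified here
            last_error = exc
            if attempt < max_retries:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"Translation failed after {max_retries} retries") from last_error
```

Passing `sleep` as a parameter keeps the backoff testable without real delays: a test can supply `delays.append` and assert on the recorded waits (1.0, 2.0, 4.0 seconds with the defaults).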