# translation Specification Delta

## MODIFIED Requirements

### Requirement: Translation Content Extraction

The translation service SHALL extract content based on processing track type.

#### Scenario: OCR Track translation extraction

- **GIVEN** a document processed with OCR Track
- **AND** the result JSON has `metadata.processing_track = "ocr"`
- **WHEN** translation service extracts translatable content
- **THEN** it SHALL load `raw_ocr_regions.json` for each page
- **AND** it SHALL extract all text blocks from raw OCR regions
- **AND** it SHALL NOT rely on `content.cells` from table elements

#### Scenario: Direct Track translation extraction (unchanged)

- **GIVEN** a document processed with Direct Track
- **AND** the result JSON has `metadata.processing_track = "direct"` or no track specified
- **WHEN** translation service extracts translatable content
- **THEN** it SHALL extract from `pages[].elements[]` in result JSON
- **AND** it SHALL extract table cell content from `content.cells`

### Requirement: Translation Result Format

The translation result JSON SHALL support both element-based and raw OCR translations.

#### Scenario: OCR Track translation result format

- **GIVEN** an OCR Track document has been translated
- **WHEN** translation result is saved
- **THEN** the JSON SHALL include `raw_ocr_translations` array
- **AND** each item SHALL have `index`, `original`, and `translated` fields
- **AND** the `translations` object MAY be empty or contain header text translations

#### Scenario: Direct Track translation result format (unchanged)

- **GIVEN** a Direct Track document has been translated
- **WHEN** translation result is saved
- **THEN** the JSON SHALL use `translations` object mapping element_id to translated text
- **AND** `raw_ocr_translations` field SHALL NOT be present

### Requirement: Translated PDF Generation

The translated PDF generation SHALL use appropriate translation source based on processing track.

#### Scenario: OCR Track translated PDF generation

- **GIVEN** an OCR Track document with translations
- **AND** the translation JSON contains `raw_ocr_translations`
- **WHEN** generating translated reflow PDF
- **THEN** it SHALL apply translations from `raw_ocr_translations` by index
- **AND** it SHALL render all translated text blocks in reading order

#### Scenario: Direct Track translated PDF generation (unchanged)

- **GIVEN** a Direct Track document with translations
- **WHEN** generating translated reflow PDF
- **THEN** it SHALL apply translations from `translations` object by element_id
- **AND** existing behavior SHALL be unchanged
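
As a non-normative illustration, the track-based selection and the two lookup styles (by index for OCR Track, by element_id for Direct Track) could be sketched as follows. The function names, the `element_id`/`text` element keys, the treatment of `index` as a zero-based position in the raw OCR block list, the fallback to the original text when no translation exists, and the handling of unrecognized track values as Direct Track are all assumptions made for this sketch; none of them are defined by this spec.

```python
from typing import Any


def select_track(result: dict[str, Any]) -> str:
    """Return "ocr" or "direct" based on metadata.processing_track.

    A missing track is treated as Direct Track, as the scenarios state.
    Treating any other unrecognized value as "direct" is an assumption.
    """
    track = result.get("metadata", {}).get("processing_track")
    return "ocr" if track == "ocr" else "direct"


def apply_raw_ocr_translations(
    blocks: list[str], translation_json: dict[str, Any]
) -> list[str]:
    """Replace raw OCR text blocks with translations matched by index.

    Assumes `index` is the zero-based position of the block; blocks
    without a matching entry keep their original text (assumption).
    """
    by_index = {
        item["index"]: item["translated"]
        for item in translation_json.get("raw_ocr_translations", [])
    }
    return [by_index.get(i, text) for i, text in enumerate(blocks)]


def apply_element_translations(
    elements: list[dict[str, Any]], translation_json: dict[str, Any]
) -> list[str]:
    """Replace element text with translations matched by element_id.

    The element shape ({"element_id": ..., "text": ...}) is hypothetical.
    """
    by_id = translation_json.get("translations", {})
    return [by_id.get(el["element_id"], el["text"]) for el in elements]
```

For example, with blocks `["Hello", "World"]` and a translation JSON containing a single `raw_ocr_translations` item for index 1, only the second block would be replaced; the first would be rendered untranslated under the fallback assumed here.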