fix: multi-worker translation status and OCR fallback handling

Translation status (multi-worker support):
- Add filesystem lock files (.translating) to track in-progress translations
- Check lock files in /status API when job_state not found in current worker
- Remove lock files on translation success or failure

OCR fallback fix:
- Fix empty pages when layout analysis fails but OCR succeeds
- Change 'enhanced_results' in ocr_results to ocr_results.get('enhanced_results')
- This ensures fallback to text_regions when enhanced_results is empty list

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
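
The lock-file part of this change is not visible in the hunk shown here, so the following is only a rough sketch of the idea, not the code in this commit. The helper names (`lock_path`, `run_translation`, `status`), the directory layout, and the status strings are all assumptions made for illustration; only the `.translating` suffix comes from the message above.

```python
from pathlib import Path

LOCK_SUFFIX = ".translating"  # suffix named in the commit message


def lock_path(result_dir: Path, task_id: str) -> Path:
    # Hypothetical layout: one lock file per task inside a shared directory.
    return result_dir / f"{task_id}{LOCK_SUFFIX}"


def run_translation(result_dir: Path, task_id: str, translate):
    # Create the lock before starting, and remove it on success or failure.
    lock = lock_path(result_dir, task_id)
    lock.touch()
    try:
        return translate()
    finally:
        lock.unlink(missing_ok=True)


def status(result_dir: Path, task_id: str, job_states: dict) -> str:
    # job_states stands in for the per-worker in-memory state.
    state = job_states.get(task_id)
    if state is not None:
        return state
    # Another worker may own the job: fall back to the shared filesystem.
    if lock_path(result_dir, task_id).exists():
        return "translating"
    return "not_found"
```

Because the lock lives on the filesystem rather than in process memory, any worker that can see the same directory can answer a status request for a job it did not start.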
2025-12-14 16:36:36 +08:00
parent 1c37585be2
commit 3ccbdb8394
4 changed files with 36 additions and 3 deletions
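
To illustrate why the OCR fallback change matters, here is a minimal sketch of the two checks side by side. The functions `pick_branch_old` and `pick_branch_new` and the sample dictionary are hypothetical stand-ins; they mirror only the branch conditions in the diff, not the converter itself.

```python
def pick_branch_old(ocr_results: dict) -> str:
    # Key-existence check: an empty list still selects the enhanced branch.
    if 'enhanced_results' in ocr_results:
        return 'enhanced'
    elif 'text_regions' in ocr_results:
        return 'traditional'
    return 'other'


def pick_branch_new(ocr_results: dict) -> str:
    # Truthiness check: a missing key, None, or an empty list falls through.
    if ocr_results.get('enhanced_results'):
        return 'enhanced'
    elif ocr_results.get('text_regions'):
        return 'traditional'
    return 'other'


# Assumed shape: layout analysis produced nothing, but OCR found text.
sample = {'enhanced_results': [], 'text_regions': [{'text': 'hello'}]}
old_choice = pick_branch_old(sample)
new_choice = pick_branch_new(sample)
```

With this sample, the old check takes the enhanced branch and has no pages to extract, while the new check falls through to the traditional `text_regions` path.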
--- a/backend/app/services/ocr_to_unified_converter.py
+++ b/backend/app/services/ocr_to_unified_converter.py
@@ -439,14 +439,15 @@ class OCRToUnifiedConverter:
        ocr_dimensions = ocr_results.get('ocr_dimensions', {})

        # Check if we have enhanced results from PPStructureEnhanced
-        if 'enhanced_results' in ocr_results:
+        # Note: Must check for non-empty list, not just key existence (key may exist with empty list)
+        if ocr_results.get('enhanced_results'):
            pages = self._extract_from_enhanced_results(
                ocr_results['enhanced_results'],
                raw_text_regions=raw_text_regions,
                ocr_dimensions=ocr_dimensions
            )
        # Check for traditional OCR results with text_regions at top level (from process_file_traditional)
-        elif 'text_regions' in ocr_results:
+        elif ocr_results.get('text_regions'):
            pages = self._extract_from_traditional_ocr(ocr_results)
        # Check for traditional layout_data structure
        elif 'layout_data' in ocr_results: