feat: add OCR to UnifiedDocument converter for PP-StructureV3 integration

Implements the converter that transforms PP-StructureV3 OCR results into the UnifiedDocument format, enabling consistent output for both OCR and direct extraction tracks.

- Create OCRToUnifiedConverter class with full element type mapping
- Handle both enhanced (parsing_res_list) and standard markdown results
- Support 4-point and simple bbox formats for coordinates
- Establish element relationships (captions, lists, headers)
- Integrate converter into OCR service dual-track processing
- Update tasks.md marking section 3.3 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 08:05:20 +08:00
parent 062cb1f423
commit a3a6fbe58b
4 changed files with 1172 additions and 29 deletions
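
The commit message mentions support for both 4-point and simple bbox formats. The converter source is not shown in this diff, so the following is only a minimal sketch of what such normalization might look like. The function name `normalize_bbox` and the assumed input shapes (a list of four `[x, y]` corner points, or a flat `[x0, y0, x1, y1]` list) are hypothetical and not taken from the commit.

```python
from typing import Sequence, Tuple


def normalize_bbox(bbox: Sequence) -> Tuple[float, float, float, float]:
    """Return (x0, y0, x1, y1) from a 4-point polygon or a flat box.

    Hypothetical helper: the accepted shapes are assumptions, not the
    actual OCRToUnifiedConverter behavior.
    """
    if len(bbox) != 4:
        raise ValueError(f"expected 4 entries, got {len(bbox)}")

    # 4-point format: [[x, y], [x, y], [x, y], [x, y]]
    if all(isinstance(p, (list, tuple)) and len(p) == 2 for p in bbox):
        xs = [float(p[0]) for p in bbox]
        ys = [float(p[1]) for p in bbox]
        # Take the axis-aligned rectangle that encloses the polygon.
        return (min(xs), min(ys), max(xs), max(ys))

    # Simple format: [x0, y0, x1, y1]
    if all(isinstance(v, (int, float)) for v in bbox):
        x0, y0, x1, y1 = (float(v) for v in bbox)
        # Reorder so the first corner is always the top-left one.
        return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

    raise ValueError("unrecognized bbox format")
```

Reducing both formats to a single axis-aligned tuple would let downstream code treat OCR and direct-extraction coordinates uniformly, at the cost of discarding rotation information carried by a 4-point polygon.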
--- a/openspec/changes/dual-track-document-processing/tasks.md
+++ b/openspec/changes/dual-track-document-processing/tasks.md
@@ -42,15 +42,15 @@
  - [ ] 3.1.2 Enable batch processing for GPU efficiency
  - [ ] 3.1.3 Configure memory management settings
  - [ ] 3.1.4 Set up model caching
- [ ] 3.2 Enhance OCR service to use parsing_res_list
-  - [ ] 3.2.1 Replace markdown extraction with parsing_res_list
-  - [ ] 3.2.2 Extract all 23 element types
-  - [ ] 3.2.3 Preserve bbox coordinates from PP-StructureV3
-  - [ ] 3.2.4 Maintain reading order information
- [ ] 3.3 Create OCR to UnifiedDocument converter
-  - [ ] 3.3.1 Map PP-StructureV3 elements to UnifiedDocument
-  - [ ] 3.3.2 Handle complex nested structures
-  - [ ] 3.3.3 Preserve all metadata
+- [x] 3.2 Enhance OCR service to use parsing_res_list
+  - [x] 3.2.1 Replace markdown extraction with parsing_res_list
+  - [x] 3.2.2 Extract all 23 element types
+  - [x] 3.2.3 Preserve bbox coordinates from PP-StructureV3
+  - [x] 3.2.4 Maintain reading order information
+- [x] 3.3 Create OCR to UnifiedDocument converter
+  - [x] 3.3.1 Map PP-StructureV3 elements to UnifiedDocument
+  - [x] 3.3.2 Handle complex nested structures
+  - [x] 3.3.3 Preserve all metadata

 ## 4. Unified Processing Pipeline
 - [x] 4.1 Update main OCR service for dual-track processing