- Enable PP-StructureV3's use_doc_orientation_classify feature
- Detect rotation angle from doc_preprocessor_res.angle
- Swap page dimensions (width <-> height) for 90°/270° rotations
- Output PDF now correctly displays landscape-scanned content

Also includes:
- Archive completed openspec proposals
- Add simplify-frontend-ocr-config proposal (pending)
- Code cleanup and frontend simplification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1. Algorithm Changes (gap_filling_service.py)
1.1 IoA Implementation
- 1.1.1 Add `_calculate_ioa()` method alongside existing `_calculate_iou()`
- 1.1.2 Modify `_is_region_covered()` to use IoA instead of IoU
- 1.1.3 Update deduplication logic to use IoA
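
A minimal sketch of the difference between the two metrics, assuming IoA here means the intersection area divided by the area of the candidate (OCR) region, and that boxes are `(x0, y0, x1, y1)` tuples. The function names mirror the task items but are illustrative, not the actual code in `gap_filling_service.py`.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def _area(b: BBox) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def _intersection(a: BBox, b: BBox) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def calculate_iou(a: BBox, b: BBox) -> float:
    """Intersection over union: symmetric, penalizes size mismatch."""
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    return inter / union if union > 0 else 0.0


def calculate_ioa(inner: BBox, outer: BBox) -> float:
    """Intersection over the area of `inner` (assumed: the OCR region)."""
    area = _area(inner)
    return _intersection(inner, outer) / area if area > 0 else 0.0


# A small text line fully inside a large table region.
ocr_line = (10.0, 10.0, 110.0, 30.0)   # area 2,000
table = (0.0, 0.0, 500.0, 400.0)       # area 200,000

iou = calculate_iou(ocr_line, table)   # 0.01: looks "uncovered" under IoU
ioa = calculate_ioa(ocr_line, table)   # 1.0: fully covered under IoA
```

The example shows why IoU can miss containment: a small region inside a much larger one scores close to zero on IoU even though it is entirely covered.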
1.2 Dynamic Threshold Strategy
- 1.2.1 Add element-type-specific thresholds as class constants
- 1.2.2 Modify `_is_region_covered()` to accept element type parameter
- 1.2.3 Apply different thresholds based on element type (TEXT: 0.6, TABLE: 0.1, FIGURE: 0.8)
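
One way the per-type lookup could look, as a sketch. The threshold values come from the task list. The `>=` comparison, the fallback for unknown element types, and the function signature are assumptions made for illustration.

```python
# Thresholds from the task list; a region counts as covered when its IoA
# against an element of the given type reaches the threshold.
IOA_THRESHOLDS = {"TEXT": 0.6, "TABLE": 0.1, "FIGURE": 0.8}
DEFAULT_IOA_THRESHOLD = 0.6  # assumption: unknown types behave like TEXT


def is_region_covered(ioa: float, element_type: str) -> bool:
    """Return True if `ioa` meets the threshold for `element_type`."""
    threshold = IOA_THRESHOLDS.get(element_type.upper(), DEFAULT_IOA_THRESHOLD)
    return ioa >= threshold  # assumption: inclusive comparison


# The same 30% overlap is "covered" by a table but not by a text block.
covered_by_table = is_region_covered(0.3, "TABLE")   # True
covered_by_text = is_region_covered(0.3, "TEXT")     # False
```

Under this reading, the low TABLE threshold treats even slight overlap with a table as coverage, while the high FIGURE threshold only suppresses text that lies almost entirely inside a figure.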
1.3 Boundary Shrinking
- 1.3.1 Add optional `shrink_pixels` parameter to coverage detection
- 1.3.2 Implement bbox shrinking logic (inward 1-2 px)
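
A possible shape for the shrinking step, assuming `(x0, y0, x1, y1)` boxes. The clamping behavior for very small boxes is an assumption added so that the box never inverts; the task list does not specify it.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def shrink_bbox(bbox: BBox, shrink_pixels: float = 1) -> BBox:
    """Move each edge inward by `shrink_pixels`.

    Assumption: the shrink is clamped to half the box size per axis, so a
    tiny box collapses to its center line instead of flipping inside out.
    """
    x0, y0, x1, y1 = bbox
    dx = min(shrink_pixels, (x1 - x0) / 2)
    dy = min(shrink_pixels, (y1 - y0) / 2)
    return (x0 + dx, y0 + dy, x1 - dx, y1 - dy)


shrunk = shrink_bbox((0, 0, 10, 10), shrink_pixels=1)   # (1, 1, 9, 9)
narrow = shrink_bbox((0, 0, 1, 10), shrink_pixels=2)    # (0.5, 2, 0.5, 8)
```

Shrinking before the coverage check keeps boxes that merely touch or overlap by a pixel at the border from registering as overlapping.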
2. OCR Data Source Changes
2.1 Extract overall_ocr_res from PP-StructureV3
- 2.1.1 Modify `pp_structure_enhanced.py` to extract `overall_ocr_res` from result
- 2.1.2 Convert `dt_polys` + `rec_texts` + `rec_scores` to TextRegion format
- 2.1.3 Store extracted OCR in result dict for gap filling
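
The conversion in 2.1.2 might be sketched as follows. The `TextRegion` dataclass below is a hypothetical stand-in (the project's real class may have different fields), and `overall_ocr_res` is assumed to be dict-like with parallel `dt_polys`, `rec_texts`, and `rec_scores` sequences, where each polygon is a sequence of `(x, y)` points.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Tuple


@dataclass
class TextRegion:  # hypothetical stand-in for the project's TextRegion
    text: str
    bbox: Tuple[float, float, float, float]  # axis-aligned (x0, y0, x1, y1)
    confidence: float


def to_text_regions(overall_ocr_res: Mapping[str, Any]) -> List[TextRegion]:
    """Zip detection polygons, texts, and scores into TextRegion objects."""
    polys = overall_ocr_res.get("dt_polys", [])
    texts = overall_ocr_res.get("rec_texts", [])
    scores = overall_ocr_res.get("rec_scores", [])

    regions: List[TextRegion] = []
    for poly, text, score in zip(polys, texts, scores):
        # Reduce the polygon to its axis-aligned bounding box.
        xs = [float(p[0]) for p in poly]
        ys = [float(p[1]) for p in poly]
        regions.append(
            TextRegion(text, (min(xs), min(ys), max(xs), max(ys)), float(score))
        )
    return regions


sample = {
    "dt_polys": [[(10, 20), (110, 22), (110, 40), (10, 38)]],
    "rec_texts": ["Invoice No. 42"],
    "rec_scores": [0.98],
}
regions = to_text_regions(sample)
```

Reducing each polygon to an axis-aligned box loses rotation information, which is acceptable for coverage checks but may not be for rendering skewed text.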
2.2 Update Processing Orchestrator
- 2.2.1 Add option to use `overall_ocr_res` as OCR source
- 2.2.2 Skip separate Raw OCR inference when using PP-StructureV3's OCR
- 2.2.3 Maintain backward compatibility with explicit Raw OCR mode
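
A rough sketch of how the orchestrator could choose its OCR source. Everything here is hypothetical: the function name, the `"overall_ocr_regions"` result key, and the `run_raw_ocr` callable are placeholders, and falling back to Raw OCR when the structure result carries no OCR data is an assumption rather than a stated requirement.

```python
from typing import Any, Callable, List, Mapping


def select_ocr_regions(
    structure_result: Mapping[str, Any],
    use_overall_ocr: bool,
    run_raw_ocr: Callable[[], List[Any]],
) -> List[Any]:
    """Pick the OCR source for gap filling.

    When `use_overall_ocr` is set and the structure result already carries
    OCR regions, reuse them and skip the separate Raw OCR pass. Otherwise
    run Raw OCR, which keeps the explicit Raw OCR mode working as before.
    """
    if use_overall_ocr:
        regions = structure_result.get("overall_ocr_regions")  # hypothetical key
        if regions is not None:
            return regions
    return run_raw_ocr()
```

Passing the Raw OCR step as a callable means the expensive inference only runs when it is actually needed.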
3. Configuration Updates
3.1 Add Settings (config.py)
- 3.1.1 Add `gap_filling_ioa_threshold_text: float = 0.6`
- 3.1.2 Add `gap_filling_ioa_threshold_table: float = 0.1`
- 3.1.3 Add `gap_filling_ioa_threshold_figure: float = 0.8`
- 3.1.4 Add `gap_filling_use_overall_ocr: bool = True`
- 3.1.5 Add `gap_filling_shrink_pixels: int = 1`
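
Taken together, the settings above could be declared like this. The field names and defaults are from the task list; using a plain dataclass named `GapFillingSettings` is an assumption for the sake of a self-contained example, since the real `config.py` may use a different settings base class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapFillingSettings:  # hypothetical container; config.py may differ
    gap_filling_ioa_threshold_text: float = 0.6
    gap_filling_ioa_threshold_table: float = 0.1
    gap_filling_ioa_threshold_figure: float = 0.8
    gap_filling_use_overall_ocr: bool = True
    gap_filling_shrink_pixels: int = 1


defaults = GapFillingSettings()

# Explicit Raw OCR mode stays available by overriding a single flag.
raw_ocr_mode = GapFillingSettings(gap_filling_use_overall_ocr=False)
```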
4. Testing
4.1 Unit Tests
- 4.1.1 Test IoA calculation with known values
- 4.1.2 Test dynamic threshold selection by element type
- 4.1.3 Test boundary shrinking edge cases
4.2 Integration Tests
- 4.2.1 Test with scan.pdf (current problematic file)
- 4.2.2 Compare results: old IoU vs new IoA approach
- 4.2.3 Verify no duplicate text rendering in output PDF
- 4.2.4 Verify table content is not duplicated outside table bounds
5. Documentation
- 5.1 Update spec documentation with new algorithm
- 5.2 Add inline code comments explaining IoA vs IoU