chore: backup before code cleanup

Backup commit before executing remove-unused-code proposal.
This includes all pending changes and new features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
54
openspec/changes/improve-ocr-track-algorithm/tasks.md
Normal file
@@ -0,0 +1,54 @@
## 1. Algorithm Changes (gap_filling_service.py)

### 1.1 IoA Implementation
- [x] 1.1.1 Add `_calculate_ioa()` method alongside existing `_calculate_iou()`
- [x] 1.1.2 Modify `_is_region_covered()` to use IoA instead of IoU
- [x] 1.1.3 Update deduplication logic to use IoA
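
The difference between the two metrics in 1.1 can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as `(x0, y0, x1, y1)` and assumes that IoA divides the intersection by the area of the smaller OCR text box; the function names are illustrative and are not the actual private methods in `gap_filling_service.py`.

```python
from typing import Tuple

# Assumed box layout: (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
BBox = Tuple[float, float, float, float]


def box_area(box: BBox) -> float:
    """Area of an axis-aligned box (0.0 for degenerate boxes)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: BBox, b: BBox) -> float:
    """Area of the overlap between two boxes (0.0 if they are disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def calculate_iou(a: BBox, b: BBox) -> float:
    """Intersection over union of the two boxes."""
    inter = intersection_area(a, b)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def calculate_ioa(inner: BBox, outer: BBox) -> float:
    """Intersection over the area of `inner` (assumed here: the OCR text box)."""
    area = box_area(inner)
    return intersection_area(inner, outer) / area if area > 0 else 0.0


text_box = (10.0, 10.0, 20.0, 20.0)    # small OCR line, area 100
table_box = (0.0, 0.0, 100.0, 100.0)   # large layout element, area 10000

iou = calculate_iou(text_box, table_box)
ioa = calculate_ioa(text_box, table_box)
```

In this example the text box lies entirely inside the table, yet IoU stays small because the union is dominated by the table area, while IoA reports full containment. That asymmetry is the usual motivation for preferring IoA when a small box is checked against a much larger one.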

### 1.2 Dynamic Threshold Strategy
- [x] 1.2.1 Add element-type-specific thresholds as class constants
- [x] 1.2.2 Modify `_is_region_covered()` to accept element type parameter
- [x] 1.2.3 Apply different thresholds based on element type (TEXT: 0.6, TABLE: 0.1, FIGURE: 0.8)
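
One way to picture the per-type thresholds in 1.2 is the sketch below. The three threshold values come from the task list; the `ElementType` enum, the fallback threshold for unknown types, and the `>=` comparison are assumptions made for illustration.

```python
from enum import Enum


class ElementType(str, Enum):
    """Hypothetical element types; the real service may use different names."""
    TEXT = "text"
    TABLE = "table"
    FIGURE = "figure"


# Threshold values taken from task 1.2.3.
IOA_THRESHOLDS = {
    ElementType.TEXT: 0.6,
    ElementType.TABLE: 0.1,
    ElementType.FIGURE: 0.8,
}

# Assumption: unknown element types fall back to the TEXT threshold.
DEFAULT_IOA_THRESHOLD = 0.6


def is_region_covered(ioa: float, element_type: ElementType) -> bool:
    """Treat an OCR region as covered once its IoA reaches the type's threshold."""
    threshold = IOA_THRESHOLDS.get(element_type, DEFAULT_IOA_THRESHOLD)
    return ioa >= threshold
```

With these values, an OCR region that overlaps a table by only 20% of its own area already counts as covered, whereas the same overlap with a text block or a figure does not.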

### 1.3 Boundary Shrinking
- [x] 1.3.1 Add optional `shrink_pixels` parameter to coverage detection
- [x] 1.3.2 Implement bbox shrinking logic (inward 1-2 px)
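
The inward shrink in 1.3 could look roughly like the following sketch. The task list does not say which box is shrunk or how tiny boxes are handled, so the clamping rule here (never shrink past the box centre) is an assumption, and `shrink_bbox` is a hypothetical helper name.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # assumed layout: (x0, y0, x1, y1)


def shrink_bbox(bbox: BBox, shrink_pixels: float = 1) -> BBox:
    """Move every edge of the box inward by `shrink_pixels`.

    The shrink is clamped to half the box size on each axis so that a very
    small box collapses to a line or point instead of inverting.
    """
    x0, y0, x1, y1 = bbox
    amount = max(0, shrink_pixels)  # negative values are treated as "no shrink"
    dx = min(amount, (x1 - x0) / 2)
    dy = min(amount, (y1 - y0) / 2)
    return (x0 + dx, y0 + dy, x1 - dx, y1 - dy)
```

Shrinking by a pixel or two before the overlap test keeps boxes that merely touch along an edge from registering as overlapping.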

## 2. OCR Data Source Changes

### 2.1 Extract overall_ocr_res from PP-StructureV3
- [x] 2.1.1 Modify `pp_structure_enhanced.py` to extract `overall_ocr_res` from result
- [x] 2.1.2 Convert `dt_polys` + `rec_texts` + `rec_scores` to TextRegion format
- [x] 2.1.3 Store extracted OCR in result dict for gap filling
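
The conversion in 2.1.2 might be sketched as follows. It assumes `overall_ocr_res` behaves like a mapping whose `dt_polys`, `rec_texts`, and `rec_scores` entries are parallel sequences, and that each polygon is a sequence of `(x, y)` points. The `TextRegion` dataclass is a hypothetical stand-in for the project's own type, and the function name is illustrative.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Tuple


@dataclass
class TextRegion:
    """Hypothetical stand-in for the project's TextRegion type."""
    text: str
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1)
    confidence: float


def convert_overall_ocr(overall_ocr_res: Mapping[str, Any]) -> List[TextRegion]:
    """Turn parallel polygon/text/score sequences into TextRegion objects."""
    polys = overall_ocr_res.get("dt_polys", [])
    texts = overall_ocr_res.get("rec_texts", [])
    scores = overall_ocr_res.get("rec_scores", [])

    regions: List[TextRegion] = []
    # zip() stops at the shortest sequence; this sketch assumes equal lengths.
    for poly, text, score in zip(polys, texts, scores):
        xs = [float(point[0]) for point in poly]
        ys = [float(point[1]) for point in poly]
        # Reduce the polygon to its axis-aligned bounding box.
        bbox = (min(xs), min(ys), max(xs), max(ys))
        regions.append(TextRegion(text=text, bbox=bbox, confidence=float(score)))
    return regions


sample = {
    "dt_polys": [[[10, 10], [60, 12], [60, 30], [10, 28]]],
    "rec_texts": ["Invoice"],
    "rec_scores": [0.98],
}
regions = convert_overall_ocr(sample)
```

Reducing each polygon to a bounding box loses rotation information, which is acceptable only if downstream coverage checks work on axis-aligned boxes.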

### 2.2 Update Processing Orchestrator
- [x] 2.2.1 Add option to use `overall_ocr_res` as OCR source
- [x] 2.2.2 Skip separate Raw OCR inference when using PP-StructureV3's OCR
- [x] 2.2.3 Maintain backward compatibility with explicit Raw OCR mode
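
The source selection in 2.2 can be pictured with the sketch below. The function name, the `overall_ocr_regions` result key, and the fallback to Raw OCR when that key is missing are all assumptions for illustration; the real orchestrator may be structured differently.

```python
from typing import Any, Callable, Dict, List


def select_ocr_regions(
    structure_result: Dict[str, Any],
    use_overall_ocr: bool,
    run_raw_ocr: Callable[[], List[Any]],
) -> List[Any]:
    """Pick the OCR regions used for gap filling.

    When `use_overall_ocr` is set and the structure result already carries
    OCR regions, reuse them and skip the separate Raw OCR pass. Otherwise
    fall back to the explicit Raw OCR mode.
    """
    if use_overall_ocr:
        regions = structure_result.get("overall_ocr_regions")  # hypothetical key
        if regions is not None:
            return regions
    return run_raw_ocr()
```

Passing the Raw OCR step as a callable means the expensive inference only runs when it is actually needed.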

## 3. Configuration Updates

### 3.1 Add Settings (config.py)
- [x] 3.1.1 Add `gap_filling_ioa_threshold_text: float = 0.6`
- [x] 3.1.2 Add `gap_filling_ioa_threshold_table: float = 0.1`
- [x] 3.1.3 Add `gap_filling_ioa_threshold_figure: float = 0.8`
- [x] 3.1.4 Add `gap_filling_use_overall_ocr: bool = True`
- [x] 3.1.5 Add `gap_filling_shrink_pixels: int = 1`
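
Taken together, the five settings in 3.1 could sit in a typed settings class along these lines. The field names and defaults come from the task list; the use of a standard-library dataclass (rather than whatever settings base class `config.py` actually uses), the class name, and the `ioa_threshold_for` helper are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapFillingSettings:
    """Stand-in container for the gap-filling settings listed in 3.1."""
    gap_filling_ioa_threshold_text: float = 0.6
    gap_filling_ioa_threshold_table: float = 0.1
    gap_filling_ioa_threshold_figure: float = 0.8
    gap_filling_use_overall_ocr: bool = True
    gap_filling_shrink_pixels: int = 1

    def ioa_threshold_for(self, element_type: str) -> float:
        """Hypothetical helper: look up the threshold for an element type name.

        Unknown types fall back to the text threshold (an assumption).
        """
        thresholds = {
            "text": self.gap_filling_ioa_threshold_text,
            "table": self.gap_filling_ioa_threshold_table,
            "figure": self.gap_filling_ioa_threshold_figure,
        }
        return thresholds.get(element_type.lower(), self.gap_filling_ioa_threshold_text)


settings = GapFillingSettings()
```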

## 4. Testing

### 4.1 Unit Tests
- [ ] 4.1.1 Test IoA calculation with known values
- [ ] 4.1.2 Test dynamic threshold selection by element type
- [ ] 4.1.3 Test boundary shrinking edge cases

### 4.2 Integration Tests
- [ ] 4.2.1 Test with scan.pdf (current problematic file)
- [ ] 4.2.2 Compare results: old IoU vs new IoA approach
- [ ] 4.2.3 Verify no duplicate text rendering in output PDF
- [ ] 4.2.4 Verify table content is not duplicated outside table bounds

## 5. Documentation

- [x] 5.1 Update spec documentation with new algorithm
- [x] 5.2 Add inline code comments explaining IoA vs IoU