fix: disable chart recognition due to PaddlePaddle 3.0.0 API limitation

PaddleOCR-VL chart recognition model requires `fused_rms_norm_ext` API which is
not available in PaddlePaddle 3.0.0 stable release.

Changes:
- Set use_chart_recognition=False in PP-StructureV3 initialization
- Remove unsupported show_log parameter from PaddleOCR 3.x API calls
- Document known limitation in openspec proposal
- Add limitation documentation to README
- Update tasks.md with documentation task for known issues

Impact:
- Layout analysis still detects/extracts charts as images ✓
- Tables, formulas, and text recognition work normally ✓
- Deep chart understanding (type detection, data extraction) disabled ✗
- Chart to structured data conversion disabled ✗

Workaround: Charts saved as image files for manual review

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@@ -49,3 +49,32 @@ PaddleOCR supports CUDA GPU acceleration which can significantly improve OCR pro
- Fully backward compatible - existing CPU-only installations continue to work
- No breaking changes to API or configuration
- Existing installations can opt-in by re-running setup script on GPU-enabled hardware

## Known Issues and Limitations

### Chart Recognition Feature Disabled (PaddlePaddle 3.0.0 API Limitation)

**Issue**: The chart recognition feature in PP-StructureV3 is currently disabled because of an API incompatibility.

**Root Cause**:
- The PaddleOCR-VL chart recognition model requires the `paddle.incubate.nn.functional.fused_rms_norm_ext` API
- PaddlePaddle 3.0.0 stable provides only the base version, `fused_rms_norm`
- The extended version, `fused_rms_norm_ext`, is not yet available in a stable release

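
The missing symbol can be checked at runtime before the feature is enabled. The following is a minimal sketch that assumes only the module path and attribute name quoted in the root cause; the helper name `has_api` and the constant `CHART_API_AVAILABLE` are hypothetical and are not part of PaddlePaddle or of this project.

```python
import importlib


def has_api(module_path: str, attr_name: str) -> bool:
    """Return True if `module_path` imports cleanly and exposes `attr_name`.

    `has_api` is a hypothetical helper used for illustration only.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # The module (or its whole package) is not installed here.
        return False
    return hasattr(module, attr_name)


# Module path and attribute name are taken from the root cause above.
CHART_API_AVAILABLE = has_api(
    "paddle.incubate.nn.functional", "fused_rms_norm_ext"
)
```

A flag computed this way would let chart recognition switch back on by itself once an installed PaddlePaddle release ships the extended API, instead of hard-coding a version comparison.
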

**Impact**:
- ✅ **Still Works**: Layout analysis can detect and extract chart/figure regions as images
- ✅ **Still Works**: Tables, formulas, and text recognition all function normally
- ❌ **Disabled**: Deep chart understanding (chart type detection, data extraction, axis/legend parsing)
- ❌ **Disabled**: Converting chart content to structured data (JSON, tables)

**Workaround**:
- Set `use_chart_recognition=False` in PP-StructureV3 initialization
- Charts are saved as image files, but their content is not analyzed

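
The workaround could look roughly like the sketch below. It is not the code in `ocr_service.py`: it assumes the pipeline class is importable as `paddleocr.PPStructureV3` in PaddleOCR 3.x and that it accepts `use_chart_recognition` as a constructor keyword, as the workaround bullet implies. The names `structure_kwargs` and `build_pipeline` are hypothetical.

```python
def structure_kwargs(enable_chart_recognition: bool = False) -> dict:
    """Keyword arguments for PP-StructureV3 (hypothetical helper).

    Chart recognition defaults to off because PaddlePaddle 3.0.0 stable
    does not provide `fused_rms_norm_ext`.
    """
    return {"use_chart_recognition": enable_chart_recognition}


def build_pipeline(enable_chart_recognition: bool = False):
    """Create the pipeline with chart recognition disabled by default."""
    # Assumption: PaddleOCR 3.x exposes the pipeline as `PPStructureV3`.
    # Imported lazily so this module loads even without PaddleOCR installed.
    from paddleocr import PPStructureV3

    return PPStructureV3(**structure_kwargs(enable_chart_recognition))
```

Keeping the flag in one small function means the limitation is lifted in a single place once the framework update lands.
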

**Future Resolution**:
- Wait for a PaddlePaddle 3.0.x/3.1.x update that adds the `fused_rms_norm_ext` API
- Or use the PaddlePaddle develop version (unstable, not recommended for production)

**Code Location**: [backend/app/services/ocr_service.py:216](../../backend/app/services/ocr_service.py#L216)

**Status**: Documented limitation, pending PaddlePaddle framework update