Chart Recognition Status Investigation:
- OpenSpec limitation record is ACCURATE but based on old PaddlePaddle 3.0.0 (Mar 2025)
- PaddlePaddle has released multiple updates (3.1.x, 3.2.x, latest: 3.2.2 Nov 2025)
- The fused_rms_norm_ext API MAY now be available in newer versions

Root Cause:
- PaddleOCR-VL chart recognition requires paddle.incubate.nn.functional.fused_rms_norm_ext
- PaddlePaddle 3.0.0 only provided fused_rms_norm (base version)
- Not a compatibility issue - PaddleOCR 3.x is fully compatible with PaddlePaddle 3.x
- Issue is missing API, not version mismatch

What Still Works (Even with Chart Recognition Disabled):
✅ Chart detection and extraction as images
✅ Table recognition (with nested formulas/images)
✅ Formula recognition
✅ Text recognition (OCR core)

What's Disabled:
❌ Deep chart understanding (type, data extraction, axis/legend parsing)
❌ Converting chart content to structured data

Created Files:
1. CHART_RECOGNITION.md - Comprehensive guide explaining:
   - Current limitation status and history
   - What works vs what's disabled
   - How to verify if newer PaddlePaddle versions support the API
   - How to enable chart recognition if API becomes available
   - Troubleshooting and performance considerations
2. backend/verify_chart_recognition.py - Verification script to:
   - Check if fused_rms_norm_ext API is available
   - Display current PaddlePaddle version
   - Provide actionable recommendations

Next Steps for Users:
1. Run: conda activate tool_ocr && python backend/verify_chart_recognition.py
2. If API is available, enable chart recognition in ocr_service.py:217
3. Update OpenSpec if limitation is resolved in newer versions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
62 lines
2.2 KiB
Python
Executable File
#!/usr/bin/env python3
"""
Verify whether chart recognition can be enabled with the installed PaddlePaddle version.

Run this in the conda environment, from the repository root:
    conda activate tool_ocr && python backend/verify_chart_recognition.py
"""

import sys


def check_paddle_api():
    """Return True if the fused_rms_norm_ext API is available, False otherwise."""
    try:
        import paddle
        print(f"✅ PaddlePaddle version: {paddle.__version__}")

        # Chart recognition needs the extended API, not just the base fused_rms_norm.
        import paddle.incubate.nn.functional as F

        has_base = hasattr(F, "fused_rms_norm")
        has_ext = hasattr(F, "fused_rms_norm_ext")

        print("\n📊 API Availability:")
        print(f"   - fused_rms_norm: {'✅ Available' if has_base else '❌ Not found'}")
        print(f"   - fused_rms_norm_ext: {'✅ Available' if has_ext else '❌ Not found'}")

        if has_ext:
            print("\n🎉 Chart recognition CAN be enabled!")
            print("\n📝 Action required:")
            print("   1. Edit backend/app/services/ocr_service.py")
            print("   2. Change line 217: use_chart_recognition=False → True")
            print("   3. Restart the backend service")
            print("\n⚠️ Note: This will enable deep chart analysis (may increase processing time)")
            return True

        print("\n❌ Chart recognition CANNOT be enabled yet")
        print(f"\n📝 Current PaddlePaddle version ({paddle.__version__}) does not support fused_rms_norm_ext")
        print("\n💡 Options:")
        # Quote the requirement so the shell does not treat '>' as a redirect.
        print('   1. Upgrade PaddlePaddle: pip install --upgrade "paddlepaddle>=3.2.0"')
        # 'pip search' is no longer supported by PyPI; 'pip index versions' lists releases.
        print("   2. Check for newer versions: pip index versions paddlepaddle")
        print("   3. Wait for official PaddlePaddle update")
        return False

    except ImportError as e:
        print(f"❌ PaddlePaddle not installed: {e}")
        print("\n💡 Install PaddlePaddle:")
        print('   pip install "paddlepaddle>=3.2.0"')
        return False
    except Exception as e:
        print(f"❌ Error: {e}")
        return False


if __name__ == "__main__":
    print("=" * 70)
    print("Chart Recognition Availability Checker")
    print("=" * 70)
    print()

    can_enable = check_paddle_api()

    print()
    print("=" * 70)
    # Exit code 0 means chart recognition can be enabled; 1 means it cannot.
    sys.exit(0 if can_enable else 1)
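The check above prints its findings for a human reader. Where the same test is needed programmatically, for example to decide the value of `use_chart_recognition` at startup instead of editing it by hand, a quiet variant might look like the sketch below. The helper names `has_module_attr` and `chart_recognition_supported` are hypothetical and do not exist in the repository; only the module path and attribute name come from the script itself.

```python
import importlib


def has_module_attr(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers a missing package as well as import-time failures
        # (for example, broken native libraries).
        return False
    return hasattr(module, attr)


def chart_recognition_supported() -> bool:
    """Hypothetical helper: True if the installed PaddlePaddle exposes fused_rms_norm_ext."""
    return has_module_attr("paddle.incubate.nn.functional", "fused_rms_norm_ext")


if __name__ == "__main__":
    # Assumption: the caller would pass this value on as use_chart_recognition.
    print(f"use_chart_recognition={chart_recognition_supported()}")
```

Because the helper returns `False` both when PaddlePaddle is absent and when the API is missing, it is safe to call in environments where the package is not installed; it does not distinguish the two cases, which the verification script does.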