Updates all project documentation to reflect that chart recognition is now fully enabled with PaddlePaddle 3.2.1+. Changes: - README.md: Remove Known Limitations section about chart recognition, update tech stack and prerequisites to include PaddlePaddle 3.2.1+, add WSL CUDA configuration notes - openspec/project.md: Add comprehensive chart recognition feature descriptions, update system requirements for GPU/CUDA support - openspec/changes/add-gpu-acceleration-support/tasks.md: Mark task 5.4 as completed with resolution details - openspec/changes/add-gpu-acceleration-support/proposal.md: Update Known Issues section to show chart recognition is now resolved - setup_dev_env.sh: Upgrade PaddlePaddle from 3.0.0 to 3.2.1+, add WSL CUDA library path configuration, add chart recognition API verification All documentation now accurately reflects: ✅ Chart recognition fully enabled ✅ PaddlePaddle 3.2.1+ with fused_rms_norm_ext API ✅ WSL CUDA path auto-configuration ✅ Comprehensive PP-StructureV3 capabilities 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
4.2 KiB
4.2 KiB
Change: Add GPU Acceleration Support for OCR Processing
Why
PaddleOCR supports CUDA GPU acceleration which can significantly improve OCR processing speed for batch operations. Currently, the system always uses CPU processing, which is slower and less efficient for large document batches. By adding GPU detection and automatic CUDA support, the system will:
- Automatically utilize available GPU hardware when present
- Fall back gracefully to CPU processing when GPU is unavailable
- Reduce processing time for large batches by leveraging parallel GPU computation
- Improve overall system throughput and user experience
What Changes
- Add GPU detection logic to environment setup script (
setup_dev_env.sh) - Automatically install CUDA-enabled PaddlePaddle when compatible GPU is detected
- Install CPU-only PaddlePaddle when no compatible GPU is found
- Add GPU availability detection in OCR processing code
- Automatically enable GPU acceleration in PaddleOCR when GPU is available
- Add configuration option to force CPU mode (for testing or troubleshooting)
- Add GPU status reporting in API health check endpoint
- Update documentation with GPU requirements and setup instructions
Impact
-
Affected capabilities:
ocr-processing: Add GPU acceleration support with automatic detectionenvironment-setup: Add GPU detection and CUDA installation logic
-
Affected code:
setup_dev_env.sh: GPU detection and conditional CUDA package installationbackend/app/services/ocr_service.py: GPU availability detection and configurationbackend/app/api/v1/endpoints/health.py: GPU status reportingbackend/app/core/config.py: GPU configuration settings.env.local: GPU-related environment variables
-
Dependencies:
- When GPU available:
paddlepaddle-gpu(with matching CUDA version) - When GPU unavailable:
paddlepaddle(CPU-only, current default) - Detection tools:
nvidia-smi(NVIDIA GPUs),lspci(hardware detection)
- When GPU available:
-
Configuration:
- New env var:
FORCE_CPU_MODE(default: false) - Override GPU detection - New env var:
CUDA_VERSION(auto-detected or manual override) - GPU memory allocation settings for PaddleOCR
- Batch size adjustment based on GPU memory availability
- New env var:
-
Performance Impact:
- Expected 3-10x speedup for OCR processing on GPU-enabled systems
- No performance degradation on CPU-only systems (same as current behavior)
- Automatic memory management to prevent GPU OOM errors
-
Backward Compatibility:
- Fully backward compatible - existing CPU-only installations continue to work
- No breaking changes to API or configuration
- Existing installations can opt-in by re-running setup script on GPU-enabled hardware
Known Issues and Limitations
Chart Recognition Feature Disabled ✅ RESOLVED (2025-11-16)
Previous Issue: Chart recognition feature in PP-StructureV3 was disabled due to API incompatibility with PaddlePaddle 3.0.0.
Resolution:
- Fixed in: PaddlePaddle 3.2.1 (released 2025-10-30)
- Current Status: ✅ Chart recognition FULLY ENABLED
- API Status:
paddle.incubate.nn.functional.fused_rms_norm_extnow available - Documentation: See CHART_RECOGNITION.md for details
Root Cause (Historical):
- PaddleOCR-VL chart recognition model requires
paddle.incubate.nn.functional.fused_rms_norm_extAPI - PaddlePaddle 3.0.0 stable only provided
fused_rms_norm(base version) - The extended version
fused_rms_norm_extwas not available in 3.0.0
Current Capabilities (✅ All Enabled):
- ✅ Layout analysis detects and extracts chart/figure regions as images
- ✅ Tables, formulas, and text recognition function normally
- ✅ Deep chart understanding (chart type detection, data extraction, axis/legend parsing)
- ✅ Converting chart content to structured data (JSON, tables)
Actions Taken:
- Upgraded system to PaddlePaddle 3.2.1+
- Enabled chart recognition in PP-StructureV3 initialization
- Configured WSL CUDA library paths for GPU support
- Updated all documentation to reflect enabled status
Code Location: backend/app/services/ocr_service.py:217
Status: ✅ RESOLVED - Chart recognition fully operational