fix: disable chart recognition due to PaddlePaddle 3.0.0 API limitation

The PaddleOCR-VL chart recognition model requires the `fused_rms_norm_ext` API,
which is not available in the PaddlePaddle 3.0.0 stable release.

Changes:
- Set use_chart_recognition=False in PP-StructureV3 initialization
- Remove unsupported show_log parameter from PaddleOCR 3.x API calls
- Document known limitation in openspec proposal
- Add limitation documentation to README
- Update tasks.md with documentation task for known issues

Impact:
- Layout analysis still detects/extracts charts as images ✓
- Tables, formulas, and text recognition work normally ✓
- Deep chart understanding (type detection, data extraction) disabled ✗
- Chart to structured data conversion disabled ✗

Workaround: Charts saved as image files for manual review

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-14 13:16:17 +08:00
parent 80c091b89a
commit b048f2d640
5 changed files with 119 additions and 133 deletions


@@ -49,3 +49,32 @@ PaddleOCR supports CUDA GPU acceleration which can significantly improve OCR pro
- Fully backward compatible - existing CPU-only installations continue to work
- No breaking changes to API or configuration
- Existing installations can opt-in by re-running setup script on GPU-enabled hardware
## Known Issues and Limitations
### Chart Recognition Feature Disabled (PaddlePaddle 3.0.0 API Limitation)
**Issue**: The chart recognition feature in PP-StructureV3 is currently disabled because of an API incompatibility.
**Root Cause**:
- The PaddleOCR-VL chart recognition model requires the `paddle.incubate.nn.functional.fused_rms_norm_ext` API
- PaddlePaddle 3.0.0 stable only provides `fused_rms_norm` (the base version)
- The extended version, `fused_rms_norm_ext`, is not yet available in a stable release
**Impact**:
- **Still Works**: Layout analysis can detect and extract chart/figure regions as images
- **Still Works**: Tables, formulas, and text recognition all function normally
- **Disabled**: Deep chart understanding (chart type detection, data extraction, axis/legend parsing)
- **Disabled**: Converting chart content to structured data (JSON, tables)
**Workaround**:
- Set `use_chart_recognition=False` in PP-StructureV3 initialization
- Charts are saved as image files but content is not analyzed
**Future Resolution**:
- Wait for a PaddlePaddle 3.0.x/3.1.x update that adds the `fused_rms_norm_ext` API
- Or use the PaddlePaddle develop version (unstable, not recommended for production)
**Code Location**: [backend/app/services/ocr_service.py:216](../../backend/app/services/ocr_service.py#L216)
**Status**: Documented limitation, pending PaddlePaddle framework update
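
The root cause above suggests a way to re-enable the feature automatically once the API ships. The following is a minimal sketch, not the code in `ocr_service.py`: it assumes the only blocker is the missing `fused_rms_norm_ext` attribute, and the helper name `chart_recognition_supported` and the `pipeline_kwargs` dictionary are hypothetical.

```python
import importlib


def chart_recognition_supported() -> bool:
    """Return True only if the installed PaddlePaddle exposes fused_rms_norm_ext."""
    try:
        functional = importlib.import_module("paddle.incubate.nn.functional")
    except Exception:
        # PaddlePaddle is missing, too old, or failed to load its native libraries.
        return False
    return hasattr(functional, "fused_rms_norm_ext")


# Hypothetical keyword arguments for the PP-StructureV3 initialization.
# Only use_chart_recognition comes from this document; the dict is illustrative.
pipeline_kwargs = {"use_chart_recognition": chart_recognition_supported()}
```

With a check like this, the hard-coded `use_chart_recognition=False` would only need to stay in place until the extended API is available in the installed release.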


@@ -1,59 +1,59 @@
# Implementation Tasks
## 1. Environment Setup Enhancement
- [ ] 1.1 Add GPU detection function in `setup_dev_env.sh`
- [x] 1.1 Add GPU detection function in `setup_dev_env.sh`
- Detect NVIDIA GPU using `nvidia-smi` or `lspci`
- Detect CUDA version if GPU is available
- Output GPU detection results to user
- [ ] 1.2 Add conditional CUDA package installation
- [x] 1.2 Add conditional CUDA package installation
- Install `paddlepaddle-gpu` with matching CUDA version when GPU detected
- Install `paddlepaddle` (CPU-only) when no GPU detected
- Handle different CUDA versions (11.2, 11.6, 11.7, 12.0, etc.)
- [ ] 1.3 Add GPU verification step after installation
- Handle different CUDA versions (11.x, 12.x, 13.x)
- [x] 1.3 Add GPU verification step after installation
- Test PaddlePaddle GPU availability
- Report GPU status and CUDA version to user
- Provide fallback instructions if GPU setup fails
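
The detection and package-selection logic in section 1 can be sketched as follows. The real logic lives in the shell script `setup_dev_env.sh`; this Python version is only an illustration, it covers the `nvidia-smi` path (not `lspci`), and the function names are hypothetical.

```python
import shutil
import subprocess
from typing import List


def detect_nvidia_gpus() -> List[str]:
    """Return GPU names reported by nvidia-smi, or an empty list if none are found."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Driver not loaded, command timed out, or non-zero exit: treat as no GPU.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def choose_paddle_package(gpu_names: List[str]) -> str:
    """Pick the GPU build when at least one GPU was detected, else the CPU build."""
    return "paddlepaddle-gpu" if gpu_names else "paddlepaddle"


if __name__ == "__main__":
    gpus = detect_nvidia_gpus()
    print(f"Detected GPUs: {gpus or 'none'}")
    print(f"Package to install: {choose_paddle_package(gpus)}")
```

Matching the package to a specific CUDA version is left out here, since the exact version mapping is not given in this document.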
## 2. Configuration Updates
- [ ] 2.1 Add GPU configuration to `.env.local`
- [x] 2.1 Add GPU configuration to `.env.local`
- Add `FORCE_CPU_MODE` option (default: false)
- Add `CUDA_VERSION` for manual override
- Add `GPU_DEVICE_ID` for device selection
- Add `GPU_MEMORY_FRACTION` for memory allocation control
- [ ] 2.2 Update backend configuration
- [x] 2.2 Update backend configuration
- Add GPU settings to `backend/app/core/config.py`
- Load GPU-related environment variables
- Add validation for GPU configuration values
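
As an illustration of tasks 2.1 and 2.2, the sketch below loads and validates the four environment variables using only the standard library. It is not the contents of `backend/app/core/config.py`: the `GpuSettings` class, the `load_gpu_settings` function, and the `0.8` default for `GPU_MEMORY_FRACTION` are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GpuSettings:
    force_cpu_mode: bool = False          # FORCE_CPU_MODE (default: false)
    cuda_version: Optional[str] = None    # CUDA_VERSION manual override
    gpu_device_id: int = 0                # GPU_DEVICE_ID
    gpu_memory_fraction: float = 0.8      # GPU_MEMORY_FRACTION (assumed default)


def load_gpu_settings(env: Mapping[str, str] = os.environ) -> GpuSettings:
    """Read GPU settings from an environment mapping and validate their ranges."""
    force_cpu = env.get("FORCE_CPU_MODE", "false").strip().lower() in {"1", "true", "yes"}
    device_id = int(env.get("GPU_DEVICE_ID", "0"))
    fraction = float(env.get("GPU_MEMORY_FRACTION", "0.8"))
    if device_id < 0:
        raise ValueError("GPU_DEVICE_ID must be non-negative")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("GPU_MEMORY_FRACTION must be in the range (0, 1]")
    return GpuSettings(force_cpu, env.get("CUDA_VERSION") or None, device_id, fraction)
```

Passing the environment as a mapping keeps the validation testable without touching the real process environment.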
## 3. OCR Service GPU Integration
- [ ] 3.1 Add GPU detection in OCR service initialization
- [x] 3.1 Add GPU detection in OCR service initialization
- Create GPU availability check function
- Detect available GPU devices
- Log GPU status (available/unavailable, device name, memory)
- [ ] 3.2 Implement automatic GPU/CPU mode selection
- [x] 3.2 Implement automatic GPU/CPU mode selection
- Enable GPU mode in PaddleOCR when GPU is available
- Fall back to CPU mode when GPU is unavailable or forced
- Set appropriate `use_gpu` parameter for PaddleOCR initialization
- [ ] 3.3 Add GPU memory management
- Use global device setting via `paddle.set_device()` for PaddleOCR 3.x
- [x] 3.3 Add GPU memory management
- Set GPU memory fraction to prevent OOM errors
- Adjust batch size based on GPU memory availability
- Detect GPU memory and compute capability
- Handle GPU memory allocation failures gracefully
- [ ] 3.4 Update `backend/app/services/ocr_service.py`
- Modify PaddleOCR initialization with GPU parameters
- [x] 3.4 Update `backend/app/services/ocr_service.py`
- Modify PaddleOCR initialization for PaddleOCR 3.x API
- Add GPU status logging
- Add error handling for GPU-related issues
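
The GPU/CPU selection in task 3.2 can be sketched as a small pure function plus a thin wrapper around `paddle.set_device()`. This is an assumed design, not the code in `ocr_service.py`: the function names are hypothetical, and falling back to CPU when the requested device ID is out of range is a choice made for this example.

```python
def select_device(gpu_count: int, force_cpu: bool = False, device_id: int = 0) -> str:
    """Return a Paddle device string: 'gpu:<id>' when usable, otherwise 'cpu'."""
    if force_cpu or not 0 <= device_id < gpu_count:
        # Forced CPU mode, no GPUs, or an invalid device ID all fall back to CPU.
        return "cpu"
    return f"gpu:{device_id}"


def apply_device(device: str) -> None:
    """Set the global Paddle device, as PaddleOCR 3.x relies on the global setting."""
    import paddle  # imported lazily so select_device stays usable without Paddle

    paddle.set_device(device)
```

For example, `select_device(gpu_count=2, device_id=1)` yields `"gpu:1"`, while `select_device(gpu_count=0)` yields `"cpu"`.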
## 4. Health Check and Monitoring
- [ ] 4.1 Add GPU status to health check endpoint
- [x] 4.1 Add GPU status to health check endpoint
- Report GPU availability (true/false)
- Report GPU device name and compute capability
- Report CUDA version
- Report current GPU memory usage
- [ ] 4.2 Update `backend/app/api/v1/endpoints/health.py`
- [x] 4.2 Update `backend/app/main.py`
- Add GPU status fields to health check response
- Handle cases where GPU detection fails
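
One way to handle a failing GPU probe in the health check (task 4.2) is to wrap the probe and degrade to a "GPU unavailable" response. This is a sketch under assumptions: the `gpu_health` function, the `probe` callable, and the response field names are hypothetical and may not match the actual endpoint in `backend/app/main.py`.

```python
from typing import Any, Callable, Dict


def gpu_health(probe: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Build the GPU part of a health response, tolerating probe failures."""
    try:
        info = probe()
    except Exception as exc:
        # Detection failed: report the GPU as unavailable instead of failing the check.
        return {"gpu_available": False, "error": str(exc)}
    return {"gpu_available": True, **info}
```

The probe would return whatever details are available (device name, CUDA version, memory usage), so the health endpoint itself never raises because of GPU detection.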
## 5. Documentation Updates
- [ ] 5.1 Update README.md
- [x] 5.1 Update README.md
- Add GPU requirements section
- Document GPU detection and setup process
- Add troubleshooting for GPU issues
@@ -65,6 +65,11 @@
- Document NVIDIA driver installation for WSL
- Document CUDA toolkit installation
- Provide GPU verification steps
- [ ] 5.4 Document known limitations
- Chart recognition feature disabled (PaddlePaddle 3.0.0 API limitation)
- Document `fused_rms_norm_ext` API incompatibility
- Explain impact and workarounds for users
- Update README with limitations section
## 6. Testing
- [ ] 6.1 Test GPU detection on GPU-enabled system