- Removed all test files and directories - Deleted outdated documentation (will be rewritten) - Cleaned up temporary files, logs, and uploads - Archived 5 completed OpenSpec proposals - Created new dual-track-document-processing proposal with complete OpenSpec structure - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF) - UnifiedDocument model for consistent output - Support for structure-preserving translation - Updated .gitignore to prevent future test/temp files This is a major cleanup preparing for the complete refactoring of the document processing pipeline. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
104 lines
4.2 KiB
Markdown
104 lines
4.2 KiB
Markdown
# Implementation Tasks
|
|
|
|
## 1. Environment Setup Enhancement
|
|
- [x] 1.1 Add GPU detection function in `setup_dev_env.sh`
|
|
- Detect NVIDIA GPU using `nvidia-smi` or `lspci`
|
|
- Detect CUDA version if GPU is available
|
|
- Output GPU detection results to user
|
|
- [x] 1.2 Add conditional CUDA package installation
|
|
- Install `paddlepaddle-gpu` with matching CUDA version when GPU detected
|
|
- Install `paddlepaddle` (CPU-only) when no GPU detected
|
|
- Handle different CUDA versions (11.x, 12.x, 13.x)
|
|
- [x] 1.3 Add GPU verification step after installation
|
|
- Test PaddlePaddle GPU availability
|
|
- Report GPU status and CUDA version to user
|
|
- Provide fallback instructions if GPU setup fails
|
|
|
|
## 2. Configuration Updates
|
|
- [x] 2.1 Add GPU configuration to `.env.local`
|
|
- Add `FORCE_CPU_MODE` option (default: false)
|
|
- Add `GPU_DEVICE_ID` for device selection
|
|
- Add `GPU_MEMORY_FRACTION` for memory allocation control
|
|
- [x] 2.2 Update backend configuration
|
|
- Add GPU settings to `backend/app/core/config.py`
|
|
- Load GPU-related environment variables
|
|
- Add validation for GPU configuration values
|
|
|
|
## 3. OCR Service GPU Integration
|
|
- [x] 3.1 Add GPU detection in OCR service initialization
|
|
- Create GPU availability check function
|
|
- Detect available GPU devices
|
|
- Log GPU status (available/unavailable, device name, memory)
|
|
- [x] 3.2 Implement automatic GPU/CPU mode selection
|
|
- Enable GPU mode in PaddleOCR when GPU is available
|
|
- Fall back to CPU mode when GPU is unavailable or forced
|
|
- Use global device setting via `paddle.set_device()` for PaddleOCR 3.x
|
|
- [x] 3.3 Add GPU memory management
|
|
- Set GPU memory fraction to prevent OOM errors
|
|
- Detect GPU memory and compute capability
|
|
- Handle GPU memory allocation failures gracefully
|
|
- [x] 3.4 Update `backend/app/services/ocr_service.py`
|
|
- Modify PaddleOCR initialization for PaddleOCR 3.x API
|
|
- Add GPU status logging
|
|
- Add error handling for GPU-related issues
|
|
|
|
## 4. Health Check and Monitoring
|
|
- [x] 4.1 Add GPU status to health check endpoint
|
|
- Report GPU availability (true/false)
|
|
- Report GPU device name and compute capability
|
|
- Report CUDA version
|
|
- Report current GPU memory usage
|
|
- [x] 4.2 Update `backend/app/main.py`
|
|
- Add GPU status fields to health check response
|
|
- Handle cases where GPU detection fails
|
|
|
|
## 5. Documentation Updates
|
|
- [x] 5.1 Update README.md
|
|
- Add GPU requirements section
|
|
- Document GPU detection and setup process
|
|
- Add troubleshooting for GPU issues
|
|
- [ ] 5.2 Update openspec/project.md
|
|
- Add GPU hardware recommendations
|
|
- Document CUDA version compatibility
|
|
- Add GPU-specific configuration options
|
|
- [ ] 5.3 Create GPU setup guide
|
|
- Document NVIDIA driver installation for WSL
|
|
- Document CUDA toolkit installation
|
|
- Provide GPU verification steps
|
|
- [x] 5.4 Document known limitations
|
|
- ~~Chart recognition feature disabled (PaddlePaddle 3.0.0 API limitation)~~ **RESOLVED**
|
|
- ~~Document `fused_rms_norm_ext` API incompatibility~~ **RESOLVED in PaddlePaddle 3.2.1+**
|
|
- Updated README to reflect chart recognition is now enabled
|
|
- Created CHART_RECOGNITION.md with detailed status and history
|
|
|
|
## 6. Testing
|
|
- [ ] 6.1 Test GPU detection on GPU-enabled system
|
|
- Verify correct CUDA version detection
|
|
- Verify correct PaddlePaddle GPU installation
|
|
- Verify OCR processing uses GPU
|
|
- [ ] 6.2 Test CPU fallback on non-GPU system
|
|
- Verify CPU-only installation
|
|
- Verify OCR processing works without GPU
|
|
- Verify no errors or warnings about missing GPU
|
|
- [ ] 6.3 Test FORCE_CPU_MODE override
|
|
- Verify GPU is ignored when FORCE_CPU_MODE=true
|
|
- Verify CPU processing works on GPU-enabled system
|
|
- [ ] 6.4 Performance benchmarking
|
|
- Measure OCR processing time with GPU
|
|
- Measure OCR processing time with CPU
|
|
- Document performance improvements
|
|
|
|
## 7. Error Handling and Edge Cases
|
|
- [ ] 7.1 Handle GPU out-of-memory errors
|
|
- Catch CUDA OOM exceptions
|
|
- Automatically fall back to CPU mode
|
|
- Log warning message to user
|
|
- [ ] 7.2 Handle CUDA version mismatch
|
|
- Detect PaddlePaddle/CUDA compatibility issues
|
|
- Provide clear error messages
|
|
- Suggest correct CUDA version installation
|
|
- [ ] 7.3 Handle missing NVIDIA drivers
|
|
- Detect when GPU hardware exists but drivers are missing
|
|
- Provide installation instructions
|
|
- Fall back to CPU mode gracefully
|