feat: add GPU acceleration support OpenSpec proposal

OpenSpec change proposal adding GPU acceleration support.

Summary:
- Add GPU detection to the environment setup script
- Automatically install the PaddlePaddle GPU package matching the detected CUDA version
- Add GPU availability detection to the OCR processing code
- Automatically enable GPU acceleration when available, or use CPU when it is not
- Support a force-CPU-mode option
- Add GPU status reporting to the health check API

Scope:
- New capability: environment-setup
- Modified capability: ocr-processing (adds GPU support)

Implementation tasks cover:
1. Environment setup script enhancements (GPU detection, CUDA installation)
2. Configuration updates (GPU-related environment variables)
3. OCR service GPU integration (automatic detection, memory management)
4. Health check and monitoring (GPU status reporting)
5. Documentation updates
6. Testing and performance evaluation
7. Error handling and edge cases

Expected impact:
- GPU systems: 3-10x faster OCR processing
- CPU systems: no impact, backward compatible
- Automatic hardware detection and optimized configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

openspec/changes/add-gpu-acceleration-support/proposal.md (new file, 51 lines)

# Change: Add GPU Acceleration Support for OCR Processing

## Why

PaddleOCR supports CUDA GPU acceleration, which can significantly improve OCR processing speed for batch operations. Currently, the system always uses CPU processing, which is slower and less efficient for large document batches. By adding GPU detection and automatic CUDA support, the system will:

- Automatically utilize available GPU hardware when present
- Fall back gracefully to CPU processing when GPU is unavailable
- Reduce processing time for large batches by leveraging parallel GPU computation
- Improve overall system throughput and user experience

## What Changes

- Add GPU detection logic to the environment setup script (`setup_dev_env.sh`)
- Automatically install CUDA-enabled PaddlePaddle when a compatible GPU is detected
- Install CPU-only PaddlePaddle when no compatible GPU is found
- Add GPU availability detection in the OCR processing code
- Automatically enable GPU acceleration in PaddleOCR when a GPU is available
- Add a configuration option to force CPU mode (for testing or troubleshooting)
- Add GPU status reporting to the API health check endpoint
- Update documentation with GPU requirements and setup instructions

## Impact

- **Affected capabilities**:
  - `ocr-processing`: Add GPU acceleration support with automatic detection
  - `environment-setup`: Add GPU detection and CUDA installation logic

- **Affected code**:
  - `setup_dev_env.sh`: GPU detection and conditional CUDA package installation
  - `backend/app/services/ocr_service.py`: GPU availability detection and configuration
  - `backend/app/api/v1/endpoints/health.py`: GPU status reporting
  - `backend/app/core/config.py`: GPU configuration settings
  - `.env.local`: GPU-related environment variables

- **Dependencies**:
  - When a GPU is available: `paddlepaddle-gpu` (with matching CUDA version)
  - When no GPU is available: `paddlepaddle` (CPU-only, current default)
  - Detection tools: `nvidia-smi` (NVIDIA GPUs), `lspci` (hardware detection)

- **Configuration**:
  - New env var: `FORCE_CPU_MODE` (default: false) - Override GPU detection
  - New env var: `CUDA_VERSION` (auto-detected or manual override)
  - GPU memory allocation settings for PaddleOCR
  - Batch size adjustment based on GPU memory availability

- **Performance Impact**:
  - Expected 3-10x speedup for OCR processing on GPU-enabled systems
  - No performance degradation on CPU-only systems (same as current behavior)
  - Automatic memory management to prevent GPU OOM errors

- **Backward Compatibility**:
  - Fully backward compatible - existing CPU-only installations continue to work
  - No breaking changes to the API or configuration
  - Existing installations can opt in by re-running the setup script on GPU-enabled hardware

# Environment Setup Specification

## ADDED Requirements

### Requirement: GPU Detection and CUDA Installation

The system SHALL automatically detect compatible GPU hardware during environment setup and install appropriate PaddlePaddle packages (GPU-enabled or CPU-only) based on hardware availability.

#### Scenario: GPU detected with CUDA support

- **WHEN** setup script runs on system with NVIDIA GPU and CUDA drivers
- **THEN** the script detects GPU using `nvidia-smi` command
- **AND** determines CUDA version from driver
- **AND** installs `paddlepaddle-gpu` with matching CUDA version
- **AND** verifies GPU availability through Python
- **AND** displays GPU information (device name, CUDA version, memory)

#### Scenario: No GPU detected

- **WHEN** setup script runs on system without compatible GPU
- **THEN** the script detects absence of GPU hardware
- **AND** installs CPU-only `paddlepaddle` package
- **AND** displays message that CPU mode will be used
- **AND** continues setup without errors

#### Scenario: GPU detected but no CUDA drivers

- **WHEN** setup script detects NVIDIA GPU but CUDA drivers are missing
- **THEN** the script displays warning about missing drivers
- **AND** provides installation instructions for CUDA drivers
- **AND** falls back to CPU-only installation
- **AND** suggests re-running setup after driver installation

#### Scenario: CUDA version mismatch

- **WHEN** detected CUDA version is not compatible with available PaddlePaddle packages
- **THEN** the script displays available CUDA versions
- **AND** installs closest compatible PaddlePaddle GPU package
- **AND** warns user about potential compatibility issues
- **AND** provides instructions to upgrade/downgrade CUDA if needed

#### Scenario: Manual CUDA version override

- **WHEN** user sets CUDA_VERSION environment variable before running setup
- **THEN** the script uses specified CUDA version instead of auto-detection
- **AND** installs corresponding PaddlePaddle GPU package
- **AND** skips automatic CUDA detection
- **AND** displays warning if specified version differs from detected version
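
For illustration, a minimal sketch of the detection logic these scenarios describe is shown below. The actual implementation belongs in `setup_dev_env.sh` as shell code; the function name and the parsing heuristics here are assumptions, and only `nvidia-smi` calls that actually exist are used (the banner output contains a "CUDA Version:" field, and `--query-gpu=name` lists device names).

```python
import re
import shutil
import subprocess
from typing import Optional


def detect_nvidia_gpu() -> Optional[dict]:
    """Return basic GPU info if nvidia-smi reports a usable device, else None."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver/tooling installed
    try:
        banner = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        ).stdout
    except subprocess.CalledProcessError:
        return None  # tool present but no accessible GPU
    # The nvidia-smi banner includes a line such as "CUDA Version: 12.2".
    match = re.search(r"CUDA Version:\s*([\d.]+)", banner)
    names = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip().splitlines()
    return {"cuda_version": match.group(1) if match else None, "devices": names}


if __name__ == "__main__":
    info = detect_nvidia_gpu()
    print("GPU detected:", info if info else "none (CPU-only install)")
```

The setup script would use this result to choose between installing `paddlepaddle-gpu` and the CPU-only `paddlepaddle` package.
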

### Requirement: GPU Verification

The system SHALL verify GPU functionality after installation and provide clear status reporting.

#### Scenario: Successful GPU setup verification

- **WHEN** PaddlePaddle GPU installation completes
- **THEN** the script runs GPU availability test using Python
- **AND** confirms CUDA devices are accessible
- **AND** displays GPU count, device names, and memory capacity
- **AND** marks GPU setup as successful

#### Scenario: GPU verification fails

- **WHEN** GPU verification test fails after installation
- **THEN** the script displays detailed error message
- **AND** provides troubleshooting steps
- **AND** suggests fallback to CPU mode
- **AND** does not fail entire setup process
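
A minimal verification step along these lines could be run by the setup script after installation. This is a sketch; the exact PaddlePaddle device APIs can differ slightly between releases.

```python
import paddle


def verify_gpu() -> bool:
    """Report whether the installed PaddlePaddle build can see a CUDA device."""
    if not paddle.device.is_compiled_with_cuda():
        print("PaddlePaddle was installed without CUDA support; CPU mode will be used.")
        return False
    count = paddle.device.cuda.device_count()
    if count == 0:
        print("CUDA build installed, but no GPU device is visible.")
        return False
    for i in range(count):
        props = paddle.device.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**2:.0f} MB")
    return True


if __name__ == "__main__":
    verify_gpu()
```
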

### Requirement: Environment Configuration for GPU

The system SHALL create appropriate configuration settings for GPU usage in environment files.

#### Scenario: GPU-enabled configuration

- **WHEN** GPU is successfully detected and verified
- **THEN** the setup script adds GPU settings to `.env.local`
- **AND** sets `FORCE_CPU_MODE=false`
- **AND** sets detected `CUDA_VERSION`
- **AND** sets recommended `GPU_MEMORY_FRACTION` (e.g., 0.8)
- **AND** adds GPU-related comments and documentation

#### Scenario: CPU-only configuration

- **WHEN** no GPU is detected or verification fails
- **THEN** the setup script creates CPU-only configuration
- **AND** sets `FORCE_CPU_MODE=true`
- **AND** omits or comments out GPU-specific settings
- **AND** adds note about GPU requirements

# OCR Processing Specification

## ADDED Requirements

### Requirement: GPU Acceleration

The system SHALL automatically detect and utilize GPU hardware for OCR processing when available, with graceful fallback to CPU mode when GPU is unavailable or disabled.

#### Scenario: GPU available and enabled

- **WHEN** PaddleOCR service initializes on system with compatible GPU
- **THEN** the system detects GPU availability using CUDA runtime
- **AND** initializes PaddleOCR with `use_gpu=True` parameter
- **AND** sets appropriate GPU memory fraction to prevent OOM errors
- **AND** logs GPU device information (name, memory, CUDA version)
- **AND** processes OCR tasks using GPU acceleration

#### Scenario: CPU fallback when GPU unavailable

- **WHEN** PaddleOCR service initializes on system without GPU
- **THEN** the system detects absence of GPU
- **AND** initializes PaddleOCR with `use_gpu=False` parameter
- **AND** logs CPU mode status
- **AND** processes OCR tasks using CPU without errors

#### Scenario: Force CPU mode override

- **WHEN** FORCE_CPU_MODE environment variable is set to true
- **THEN** the system ignores GPU availability
- **AND** initializes PaddleOCR in CPU mode
- **AND** logs that CPU mode is forced by configuration
- **AND** processes OCR tasks using CPU

#### Scenario: GPU out-of-memory error handling

- **WHEN** GPU runs out of memory during OCR processing
- **THEN** the system catches CUDA OOM exception
- **AND** logs error with GPU memory information
- **AND** attempts to process the task using CPU mode
- **AND** continues batch processing without failure
- **AND** records GPU failure in task metadata

#### Scenario: Multiple GPU devices available

- **WHEN** system has multiple CUDA devices
- **THEN** the system detects all available GPUs
- **AND** uses primary GPU (device 0) by default
- **AND** allows GPU device selection via configuration
- **AND** logs selected GPU device information
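
The mode-selection behaviour in the first three scenarios could look roughly like the sketch below. The spec only fixes `FORCE_CPU_MODE` and the PaddleOCR `use_gpu` flag; the factory function name and the `use_angle_cls`/`lang` values are illustrative assumptions, and `use_gpu` is the PaddleOCR 2.x style flag (newer releases select the device differently).

```python
import logging
import os

import paddle
from paddleocr import PaddleOCR

logger = logging.getLogger(__name__)


def gpu_is_usable() -> bool:
    """True when the Paddle build has CUDA support and at least one device is visible."""
    return paddle.device.is_compiled_with_cuda() and paddle.device.cuda.device_count() > 0


def create_ocr_engine() -> PaddleOCR:
    """Choose GPU or CPU mode at service initialization, honouring FORCE_CPU_MODE."""
    force_cpu = os.getenv("FORCE_CPU_MODE", "false").lower() == "true"
    use_gpu = gpu_is_usable() and not force_cpu
    if force_cpu:
        logger.info("CPU mode forced by FORCE_CPU_MODE configuration")
    elif use_gpu:
        logger.info("GPU detected; enabling GPU acceleration for PaddleOCR")
    else:
        logger.info("No usable GPU detected; falling back to CPU mode")
    return PaddleOCR(use_angle_cls=True, lang="en", use_gpu=use_gpu)
```
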

### Requirement: GPU Performance Optimization

The system SHALL optimize GPU memory usage and batch processing for efficient OCR performance.

#### Scenario: Automatic batch size adjustment

- **WHEN** GPU mode is enabled
- **THEN** the system queries available GPU memory
- **AND** calculates optimal batch size based on memory capacity
- **AND** adjusts concurrent processing threads accordingly
- **AND** monitors memory usage during processing
- **AND** prevents memory allocation beyond safe threshold

#### Scenario: GPU memory management

- **WHEN** GPU memory fraction is configured
- **THEN** the system allocates specified fraction of total GPU memory
- **AND** reserves memory for PaddleOCR model
- **AND** prevents other processes from causing OOM
- **AND** releases memory after batch completion
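
One way the memory-fraction and batch-size behaviour could be realised is sketched below. `FLAGS_fraction_of_gpu_memory_to_use` is a real PaddlePaddle flag (it can also be set as an environment variable before startup), but the batch-size heuristic and the `estimated_mb_per_page` constant are assumptions for illustration only.

```python
import os

import paddle


def apply_gpu_memory_fraction(fraction: float) -> None:
    """Limit the share of GPU memory Paddle may allocate (set before first CUDA use)."""
    paddle.set_flags({"FLAGS_fraction_of_gpu_memory_to_use": fraction})


def suggest_batch_size(device_id: int = 0, estimated_mb_per_page: int = 400,
                       max_batch: int = 16) -> int:
    """Rough heuristic: size the batch from available GPU memory, capped at max_batch."""
    if not paddle.device.is_compiled_with_cuda() or paddle.device.cuda.device_count() == 0:
        return 1  # CPU path: process pages one at a time
    props = paddle.device.cuda.get_device_properties(device_id)
    total_mb = props.total_memory / (1024 ** 2)
    usable_mb = total_mb * float(os.getenv("GPU_MEMORY_FRACTION", "0.8"))
    return max(1, min(max_batch, int(usable_mb // estimated_mb_per_page)))
```
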

### Requirement: GPU Status Reporting

The system SHALL provide GPU status information through health check API and logging.

#### Scenario: Health check with GPU available

- **WHEN** client requests `/health` endpoint on GPU-enabled system
- **THEN** the system returns health status including:
  - `gpu_available`: true
  - `gpu_device_name`: detected GPU name
  - `cuda_version`: CUDA runtime version
  - `gpu_memory_total`: total GPU memory in MB
  - `gpu_memory_used`: currently used GPU memory in MB
  - `gpu_utilization`: current GPU utilization percentage

#### Scenario: Health check without GPU

- **WHEN** client requests `/health` endpoint on CPU-only system
- **THEN** the system returns health status including:
  - `gpu_available`: false
  - `processing_mode`: "CPU"
  - `reason`: explanation for CPU mode (e.g., "No GPU detected", "CPU mode forced")

#### Scenario: Startup GPU status logging

- **WHEN** OCR service starts
- **THEN** the system logs GPU detection results
- **AND** logs selected processing mode (GPU/CPU)
- **AND** logs GPU device details if available
- **AND** logs any GPU-related warnings or errors
- **AND** continues startup successfully regardless of GPU status
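
A sketch of how the health-check payload above could be assembled follows. Only the field names come from the scenarios; `pynvml` is an assumed optional dependency used for the memory and utilization figures, and the helper name is hypothetical.

```python
import os

import paddle


def gpu_health_status() -> dict:
    """Build the GPU portion of the /health response."""
    if os.getenv("FORCE_CPU_MODE", "false").lower() == "true":
        return {"gpu_available": False, "processing_mode": "CPU", "reason": "CPU mode forced"}
    if not paddle.device.is_compiled_with_cuda() or paddle.device.cuda.device_count() == 0:
        return {"gpu_available": False, "processing_mode": "CPU", "reason": "No GPU detected"}
    status = {
        "gpu_available": True,
        "gpu_device_name": paddle.device.cuda.get_device_properties(0).name,
        "cuda_version": paddle.version.cuda(),
    }
    try:
        import pynvml  # assumed optional dependency for memory/utilization figures
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        status["gpu_memory_total"] = mem.total // (1024 ** 2)
        status["gpu_memory_used"] = mem.used // (1024 ** 2)
        status["gpu_utilization"] = util.gpu
        pynvml.nvmlShutdown()
    except Exception:
        pass  # still report availability even if detailed metrics cannot be read
    return status
```
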

openspec/changes/add-gpu-acceleration-support/tasks.md (new file, 98 lines)

# Implementation Tasks

## 1. Environment Setup Enhancement

- [ ] 1.1 Add GPU detection function in `setup_dev_env.sh`
  - Detect NVIDIA GPU using `nvidia-smi` or `lspci`
  - Detect CUDA version if GPU is available
  - Output GPU detection results to user
- [ ] 1.2 Add conditional CUDA package installation
  - Install `paddlepaddle-gpu` with matching CUDA version when GPU detected
  - Install `paddlepaddle` (CPU-only) when no GPU detected
  - Handle different CUDA versions (11.2, 11.6, 11.7, 12.0, etc.)
- [ ] 1.3 Add GPU verification step after installation
  - Test PaddlePaddle GPU availability
  - Report GPU status and CUDA version to user
  - Provide fallback instructions if GPU setup fails

## 2. Configuration Updates

- [ ] 2.1 Add GPU configuration to `.env.local`
  - Add `FORCE_CPU_MODE` option (default: false)
  - Add `CUDA_VERSION` for manual override
  - Add `GPU_MEMORY_FRACTION` for memory allocation control
- [ ] 2.2 Update backend configuration (see the sketch after this section)
  - Add GPU settings to `backend/app/core/config.py`
  - Load GPU-related environment variables
  - Add validation for GPU configuration values
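
A minimal sketch of how the settings from 2.1-2.2 could be surfaced in `backend/app/core/config.py`, assuming a pydantic-settings style config module (a common choice for FastAPI backends; the class and field names here are hypothetical and the project's actual config style may differ):

```python
from typing import Optional

from pydantic import field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class GPUSettings(BaseSettings):
    """GPU-related options loaded from .env.local / environment variables."""
    model_config = SettingsConfigDict(env_file=".env.local", extra="ignore")

    FORCE_CPU_MODE: bool = False
    CUDA_VERSION: Optional[str] = None   # auto-detected by setup, or manual override
    GPU_MEMORY_FRACTION: float = 0.8     # share of GPU memory Paddle may use

    @field_validator("GPU_MEMORY_FRACTION")
    @classmethod
    def _fraction_in_range(cls, v: float) -> float:
        if not 0.0 < v <= 1.0:
            raise ValueError("GPU_MEMORY_FRACTION must be in (0, 1]")
        return v


settings = GPUSettings()
```
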

## 3. OCR Service GPU Integration

- [ ] 3.1 Add GPU detection in OCR service initialization
  - Create GPU availability check function
  - Detect available GPU devices
  - Log GPU status (available/unavailable, device name, memory)
- [ ] 3.2 Implement automatic GPU/CPU mode selection
  - Enable GPU mode in PaddleOCR when GPU is available
  - Fall back to CPU mode when GPU is unavailable or forced
  - Set appropriate `use_gpu` parameter for PaddleOCR initialization
- [ ] 3.3 Add GPU memory management
  - Set GPU memory fraction to prevent OOM errors
  - Adjust batch size based on GPU memory availability
  - Handle GPU memory allocation failures gracefully
- [ ] 3.4 Update `backend/app/services/ocr_service.py`
  - Modify PaddleOCR initialization with GPU parameters
  - Add GPU status logging
  - Add error handling for GPU-related issues

## 4. Health Check and Monitoring

- [ ] 4.1 Add GPU status to health check endpoint
  - Report GPU availability (true/false)
  - Report GPU device name and compute capability
  - Report CUDA version
  - Report current GPU memory usage
- [ ] 4.2 Update `backend/app/api/v1/endpoints/health.py`
  - Add GPU status fields to health check response
  - Handle cases where GPU detection fails

## 5. Documentation Updates

- [ ] 5.1 Update README.md
  - Add GPU requirements section
  - Document GPU detection and setup process
  - Add troubleshooting for GPU issues
- [ ] 5.2 Update openspec/project.md
  - Add GPU hardware recommendations
  - Document CUDA version compatibility
  - Add GPU-specific configuration options
- [ ] 5.3 Create GPU setup guide
  - Document NVIDIA driver installation for WSL
  - Document CUDA toolkit installation
  - Provide GPU verification steps

## 6. Testing

- [ ] 6.1 Test GPU detection on GPU-enabled system
  - Verify correct CUDA version detection
  - Verify correct PaddlePaddle GPU installation
  - Verify OCR processing uses GPU
- [ ] 6.2 Test CPU fallback on non-GPU system
  - Verify CPU-only installation
  - Verify OCR processing works without GPU
  - Verify no errors or warnings about missing GPU
- [ ] 6.3 Test FORCE_CPU_MODE override
  - Verify GPU is ignored when FORCE_CPU_MODE=true
  - Verify CPU processing works on GPU-enabled system
- [ ] 6.4 Performance benchmarking (see the sketch after this section)
  - Measure OCR processing time with GPU
  - Measure OCR processing time with CPU
  - Document performance improvements
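
A simple timing harness for task 6.4 might look like the following. The image directory, file pattern, and `lang` value are placeholders, and `use_gpu` is the PaddleOCR 2.x flag; treat it as a sketch rather than the benchmarking tool itself.

```python
import time
from pathlib import Path

from paddleocr import PaddleOCR


def benchmark(use_gpu: bool, image_dir: str = "benchmark_pages") -> float:
    """Return average seconds per page for a directory of sample page images."""
    ocr = PaddleOCR(use_angle_cls=True, lang="en", use_gpu=use_gpu)
    images = sorted(Path(image_dir).glob("*.png"))
    # note: the first call includes model warm-up; a real benchmark should discard it
    start = time.perf_counter()
    for image in images:
        ocr.ocr(str(image))
    return (time.perf_counter() - start) / max(1, len(images))


if __name__ == "__main__":
    cpu_time = benchmark(use_gpu=False)
    gpu_time = benchmark(use_gpu=True)
    print(f"CPU: {cpu_time:.2f}s/page, GPU: {gpu_time:.2f}s/page, "
          f"speedup: {cpu_time / gpu_time:.1f}x")
```
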

## 7. Error Handling and Edge Cases

- [ ] 7.1 Handle GPU out-of-memory errors (see the sketch after this list)
  - Catch CUDA OOM exceptions
  - Automatically fall back to CPU mode
  - Log warning message to user
- [ ] 7.2 Handle CUDA version mismatch
  - Detect PaddlePaddle/CUDA compatibility issues
  - Provide clear error messages
  - Suggest correct CUDA version installation
- [ ] 7.3 Handle missing NVIDIA drivers
  - Detect when GPU hardware exists but drivers are missing
  - Provide installation instructions
  - Fall back to CPU mode gracefully
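
For task 7.1, the per-task fallback could be sketched as below. Paddle surfaces CUDA out-of-memory conditions as different exception types across versions, so catching `RuntimeError` and checking the message text is a pragmatic assumption rather than the definitive handling; the function and parameter names are likewise hypothetical.

```python
import logging

from paddleocr import PaddleOCR

logger = logging.getLogger(__name__)


def ocr_with_cpu_fallback(gpu_engine: PaddleOCR, cpu_engine: PaddleOCR, image_path: str):
    """Run OCR on the GPU engine; retry on the CPU engine if CUDA memory runs out."""
    try:
        return gpu_engine.ocr(image_path)
    except RuntimeError as exc:  # assumed: Paddle reports CUDA OOM as a runtime error
        if "out of memory" not in str(exc).lower():
            raise  # not an OOM condition; let it propagate
        logger.warning("GPU out of memory for %s; retrying on CPU", image_path)
        result = cpu_engine.ocr(image_path)
        # a real implementation would also record the GPU failure in task metadata
        return result
```
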