OCR/openspec/changes/add-gpu-acceleration-support/specs/ocr-processing/spec.md
egg 6452797abe feat: add GPU acceleration support OpenSpec proposal
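Add an OpenSpec change proposal for GPU acceleration support

Main contents:
- Add GPU detection to the environment setup script
- Automatically install the PaddlePaddle GPU package matching the CUDA version
- Add GPU availability detection to the OCR processing code
- Automatically enable GPU acceleration when available, or use the CPU when it is not
- Support an option to force CPU mode
- Add GPU status reporting to the health check API

Scope of changes:
- New capability: environment-setup
- Modified capability: ocr-processing (adds GPU support)

Implementation tasks include:
1. Environment setup script enhancements (GPU detection, CUDA installation)
2. Configuration updates (GPU-related environment variables)
3. OCR service GPU integration (automatic detection, memory management)
4. Health check and monitoring (GPU status reporting)
5. Documentation updates
6. Testing and performance evaluation
7. Error handling and edge cases

Expected results:
- GPU systems: 3-10x faster OCR processing
- CPU systems: no impact, backward compatible
- Automatic hardware detection and optimized configuration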

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 07:34:06 +08:00

OCR Processing Specification

ADDED Requirements

Requirement: GPU Acceleration

The system SHALL automatically detect and use GPU hardware for OCR processing when available, with graceful fallback to CPU mode when a GPU is unavailable or disabled.

Scenario: GPU available and enabled

  • WHEN the PaddleOCR service initializes on a system with a compatible GPU
  • THEN the system detects GPU availability using the CUDA runtime
  • AND initializes PaddleOCR with the use_gpu=True parameter
  • AND sets an appropriate GPU memory fraction to prevent out-of-memory (OOM) errors
  • AND logs GPU device information (name, memory, CUDA version)
  • AND processes OCR tasks using GPU acceleration
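The detection step in this scenario could be sketched as follows. This is a minimal illustration, not the project's implementation: the function name detect_gpu and the shape of the returned dictionary are hypothetical, and the sketch assumes a PaddlePaddle 2.x build that exposes paddle.device.is_compiled_with_cuda, paddle.device.cuda.device_count, paddle.device.cuda.get_device_properties, and paddle.version.cuda.

```python
def detect_gpu():
    """Return (gpu_available, details) and never raise if Paddle is missing.

    The function name and the keys of `details` are illustrative only.
    """
    try:
        import paddle  # imported lazily so CPU-only hosts still start
    except ImportError:
        return False, {"reason": "PaddlePaddle is not installed"}

    if not paddle.device.is_compiled_with_cuda():
        return False, {"reason": "PaddlePaddle build has no CUDA support"}

    count = paddle.device.cuda.device_count()
    if count == 0:
        return False, {"reason": "No GPU detected"}

    # Describe the primary device (device 0) for logging purposes.
    props = paddle.device.cuda.get_device_properties(0)
    return True, {
        "device_count": count,
        "device_name": props.name,
        "memory_total_mb": props.total_memory // (1024 * 1024),
        "cuda_version": paddle.version.cuda(),
    }


if __name__ == "__main__":
    available, details = detect_gpu()
    print("GPU available:", available, details)
```

The boolean result is what would be passed to PaddleOCR as the use_gpu parameter named in the scenario, and the details dictionary carries the values the scenario asks to log.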

Scenario: CPU fallback when GPU unavailable

  • WHEN the PaddleOCR service initializes on a system without a GPU
  • THEN the system detects the absence of a GPU
  • AND initializes PaddleOCR with use_gpu=False parameter
  • AND logs CPU mode status
  • AND processes OCR tasks using CPU without errors

Scenario: Force CPU mode override

  • WHEN the FORCE_CPU_MODE environment variable is set to true
  • THEN the system ignores GPU availability
  • AND initializes PaddleOCR in CPU mode
  • AND logs that CPU mode is forced by configuration
  • AND processes OCR tasks using CPU
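One way to express the override is a small pure function that combines the detection result with the FORCE_CPU_MODE variable. The sketch below is an assumption about how this might look: the function name, the set of accepted truthy spellings, and the returned dictionary are hypothetical. Only FORCE_CPU_MODE and use_gpu come from the spec.

```python
import os

# Accepted spellings for "true"; the spec only says "set to true",
# so this set is an assumption.
_TRUTHY = {"1", "true", "yes", "on"}


def resolve_processing_mode(gpu_available, env=None):
    """Decide whether to use the GPU, honoring FORCE_CPU_MODE first."""
    env = os.environ if env is None else env
    forced = env.get("FORCE_CPU_MODE", "").strip().lower() in _TRUTHY

    if forced:
        # The override wins even when a GPU was detected.
        return {"use_gpu": False, "reason": "CPU mode forced"}
    if not gpu_available:
        return {"use_gpu": False, "reason": "No GPU detected"}
    return {"use_gpu": True, "reason": None}


if __name__ == "__main__":
    print(resolve_processing_mode(True, {"FORCE_CPU_MODE": "true"}))
    print(resolve_processing_mode(True, {}))
```

Passing the environment as a parameter keeps the decision testable without touching the real process environment.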

Scenario: GPU out-of-memory error handling

  • WHEN the GPU runs out of memory during OCR processing
  • THEN the system catches the CUDA out-of-memory (OOM) exception
  • AND logs the error with GPU memory information
  • AND attempts to process the task using CPU mode
  • AND continues batch processing without failure
  • AND records GPU failure in task metadata
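The fallback described above might be structured as in the following sketch. The spec does not name the exception type that PaddlePaddle raises on a CUDA OOM, so the sketch uses a message-based heuristic, which is an assumption. The names is_oom_error and run_with_cpu_fallback and the metadata keys are hypothetical, and gpu_engine and cpu_engine stand in for two callables wrapping PaddleOCR instances.

```python
import logging

logger = logging.getLogger("ocr.gpu")


def is_oom_error(exc):
    """Heuristic OOM check (assumption: the real exception type is unspecified)."""
    return isinstance(exc, MemoryError) or "out of memory" in str(exc).lower()


def run_with_cpu_fallback(image, gpu_engine, cpu_engine, metadata):
    """Try the GPU engine first; on OOM, record the failure and retry on CPU."""
    try:
        return gpu_engine(image)
    except Exception as exc:
        if not is_oom_error(exc):
            raise  # unrelated errors are not masked by the fallback
        logger.error("GPU out of memory, retrying on CPU: %s", exc)
        metadata["gpu_failure"] = str(exc)
        metadata["processing_mode"] = "CPU"
        return cpu_engine(image)


if __name__ == "__main__":
    def failing_gpu(_image):
        raise RuntimeError("CUDA error: out of memory")

    meta = {}
    print(run_with_cpu_fallback("page.png", failing_gpu, lambda i: "text", meta), meta)
```

Because the fallback is per task, a batch loop that calls this function for each image keeps running after a single GPU failure.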

Scenario: Multiple GPU devices available

  • WHEN the system has multiple CUDA devices
  • THEN the system detects all available GPUs
  • AND uses the primary GPU (device 0) by default
  • AND allows GPU device selection via configuration
  • AND logs selected GPU device information
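Device selection could be validated with a helper like the one below. The spec says only that selection happens "via configuration", so the configured_id parameter and the function name are hypothetical, and the sketch makes no claim about which setting name the project uses.

```python
def select_gpu_device(device_count, configured_id=None):
    """Pick a CUDA device index, defaulting to the primary GPU (device 0)."""
    if device_count <= 0:
        raise ValueError("no CUDA devices available")
    if configured_id is None:
        return 0

    device_id = int(configured_id)  # accepts "1" from an environment variable
    if not 0 <= device_id < device_count:
        raise ValueError(
            f"GPU device {device_id} is out of range for {device_count} device(s)"
        )
    return device_id


if __name__ == "__main__":
    print(select_gpu_device(2))        # default device
    print(select_gpu_device(2, "1"))   # explicit selection
```

With PaddlePaddle, the chosen index would typically be applied through paddle.device.set_device with a string such as "gpu:1" before the OCR engine is created.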

Requirement: GPU Performance Optimization

The system SHALL optimize GPU memory usage and batch processing for efficient OCR performance.

Scenario: Automatic batch size adjustment

  • WHEN GPU mode is enabled
  • THEN the system queries available GPU memory
  • AND calculates optimal batch size based on memory capacity
  • AND adjusts concurrent processing threads accordingly
  • AND monitors memory usage during processing
  • AND prevents memory allocation beyond safe threshold
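The batch size calculation can be illustrated with simple arithmetic. The per-image memory estimate, the 0.8 safety fraction, and the cap of 64 are placeholder assumptions, not values from the spec. A real deployment would need to measure them.

```python
def compute_batch_size(free_mb, per_image_mb, safety_fraction=0.8, max_batch=64):
    """Estimate how many images fit into the safe share of free GPU memory.

    All defaults are illustrative assumptions.
    """
    if per_image_mb <= 0:
        raise ValueError("per_image_mb must be positive")
    if not 0 < safety_fraction <= 1:
        raise ValueError("safety_fraction must be in (0, 1]")

    budget_mb = free_mb * safety_fraction      # stay below the safe threshold
    fit = int(budget_mb // per_image_mb)       # whole images that fit
    return max(1, min(max_batch, fit))         # always process at least one


if __name__ == "__main__":
    print(compute_batch_size(free_mb=8000, per_image_mb=500))
```

For example, with 8000 MB free and an estimated 500 MB per image, the budget is 6400 MB and the batch size is 12. The lower bound of 1 means a nearly full GPU still attempts single-image processing, which pairs naturally with an OOM fallback.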

Scenario: GPU memory management

  • WHEN GPU memory fraction is configured
  • THEN the system allocates the specified fraction of total GPU memory
  • AND reserves memory for the PaddleOCR model
  • AND prevents other processes from causing OOM
  • AND releases memory after batch completion
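A memory fraction is commonly passed to PaddlePaddle through the FLAGS_fraction_of_gpu_memory_to_use environment flag, which has to be set before paddle is imported. The sketch below only validates and stores the value. The function name is hypothetical, and how the flag interacts with Paddle's allocator strategy depends on the Paddle version, so treat this as an assumption to check against the Paddle documentation.

```python
import os


def apply_gpu_memory_fraction(fraction, env=None):
    """Validate a memory fraction and export it as a Paddle flag.

    Must run before `import paddle` for the flag to take effect.
    """
    env = os.environ if env is None else env
    value = float(fraction)
    if not 0.0 < value <= 1.0:
        raise ValueError("GPU memory fraction must be in (0, 1]")
    env["FLAGS_fraction_of_gpu_memory_to_use"] = str(value)
    return value


if __name__ == "__main__":
    settings = {}
    apply_gpu_memory_fraction("0.5", settings)
    print(settings)
```

Releasing cached memory after a batch would be a separate step. PaddlePaddle provides paddle.device.cuda.empty_cache for that purpose.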

Requirement: GPU Status Reporting

The system SHALL provide GPU status information through the health check API and logging.

Scenario: Health check with GPU available

  • WHEN a client requests the /health endpoint on a GPU-enabled system
  • THEN the system returns health status including:
    • gpu_available: true
    • gpu_device_name: detected GPU name
    • cuda_version: CUDA runtime version
    • gpu_memory_total: total GPU memory in MB
    • gpu_memory_used: currently used GPU memory in MB
    • gpu_utilization: current GPU utilization percentage

Scenario: Health check without GPU

  • WHEN a client requests the /health endpoint on a CPU-only system
  • THEN the system returns health status including:
    • gpu_available: false
    • processing_mode: "CPU"
    • reason: explanation for CPU mode (e.g., "No GPU detected", "CPU mode forced")
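The two health check responses can be sketched as a single payload builder. The output field names come from the scenarios above. The function name and the keys of the input dictionary (name, memory_total_mb, and so on) are hypothetical, and the sample values in the usage example are invented purely for illustration.

```python
def build_health_payload(gpu=None, reason="No GPU detected"):
    """Build the GPU section of a /health response.

    `gpu` is None in CPU mode, or a dict describing the active device.
    """
    if gpu is None:
        return {
            "gpu_available": False,
            "processing_mode": "CPU",
            "reason": reason,
        }
    return {
        "gpu_available": True,
        "gpu_device_name": gpu["name"],
        "cuda_version": gpu["cuda_version"],
        "gpu_memory_total": gpu["memory_total_mb"],   # MB
        "gpu_memory_used": gpu["memory_used_mb"],     # MB
        "gpu_utilization": gpu["utilization_pct"],    # percent
    }


if __name__ == "__main__":
    print(build_health_payload(reason="CPU mode forced"))
    print(build_health_payload({
        "name": "Example GPU", "cuda_version": "11.8",
        "memory_total_mb": 8192, "memory_used_mb": 1024, "utilization_pct": 35,
    }))
```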

Scenario: Startup GPU status logging

  • WHEN the OCR service starts
  • THEN the system logs GPU detection results
  • AND logs selected processing mode (GPU/CPU)
  • AND logs GPU device details if available
  • AND logs any GPU-related warnings or errors
  • AND continues startup successfully regardless of GPU status
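The startup behavior could be wrapped so that a detection failure is logged as a warning rather than aborting startup. In this sketch, detect is any callable returning (available, details). The function name, logger name, and message wording are all hypothetical.

```python
import logging

logger = logging.getLogger("ocr.startup")


def log_startup_gpu_status(detect, log=logger):
    """Log GPU detection results and return the selected mode ("GPU" or "CPU").

    Never raises, so service startup continues regardless of GPU status.
    """
    try:
        available, details = detect()
    except Exception as exc:
        log.warning("GPU detection failed, falling back to CPU: %s", exc)
        available, details = False, {"reason": str(exc)}

    mode = "GPU" if available else "CPU"
    log.info("Selected processing mode: %s", mode)
    if available:
        log.info("GPU device details: %s", details)
    else:
        log.info("GPU not used: %s", details.get("reason", "unknown"))
    return mode


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_startup_gpu_status(lambda: (False, {"reason": "No GPU detected"}))
```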