OCR/openspec/changes/archive/2025-11-18-add-gpu-acceleration-support/specs/ocr-processing/spec.md
egg cd3cbea49d chore: project cleanup and prepare for dual-track processing refactor
2025-11-18 20:02:31 +08:00


OCR Processing Specification

ADDED Requirements

Requirement: GPU Acceleration

The system SHALL automatically detect and use GPU hardware for OCR processing when available, and SHALL fall back gracefully to CPU mode when a GPU is unavailable or disabled.

Scenario: GPU available and enabled

  • WHEN the PaddleOCR service initializes on a system with a compatible GPU
  • THEN the system detects GPU availability using CUDA runtime
  • AND initializes PaddleOCR with use_gpu=True parameter
  • AND sets appropriate GPU memory fraction to prevent OOM errors
  • AND logs GPU device information (name, memory, CUDA version)
  • AND processes OCR tasks using GPU acceleration

Scenario: CPU fallback when GPU unavailable

  • WHEN the PaddleOCR service initializes on a system without a GPU
  • THEN the system detects the absence of a GPU
  • AND initializes PaddleOCR with use_gpu=False parameter
  • AND logs CPU mode status
  • AND processes OCR tasks using CPU without errors

Scenario: Force CPU mode override

  • WHEN the FORCE_CPU_MODE environment variable is set to true
  • THEN the system ignores GPU availability
  • AND initializes PaddleOCR in CPU mode
  • AND logs that CPU mode is forced by configuration
  • AND processes OCR tasks using CPU
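
The three initialization scenarios above reduce to one decision: forced CPU mode wins, then GPU detection decides. A minimal sketch of that decision follows. The names `ProcessingMode` and `select_processing_mode`, the injected `gpu_detected` probe, and the set of values accepted as "true" are all assumptions made for illustration; the spec itself only names the FORCE_CPU_MODE variable and the use_gpu parameter.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class ProcessingMode:
    """Hypothetical result type: the use_gpu flag plus a loggable reason."""
    use_gpu: bool
    reason: str


def _is_truthy(value: Optional[str]) -> bool:
    # Assumption: these spellings count as "true"; the spec only says "set to true".
    return value is not None and value.strip().lower() in {"1", "true", "yes", "on"}


def select_processing_mode(
    gpu_detected: Callable[[], bool],
    env: Mapping[str, str] = os.environ,
) -> ProcessingMode:
    """Decide between GPU and CPU mode without touching any GPU library directly."""
    # Force-CPU override: GPU availability is not even probed.
    if _is_truthy(env.get("FORCE_CPU_MODE")):
        return ProcessingMode(False, "CPU mode forced")
    try:
        available = gpu_detected()
    except Exception as exc:  # a failing probe must not break startup
        return ProcessingMode(False, f"GPU detection failed: {exc}")
    if available:
        return ProcessingMode(True, "GPU detected")
    return ProcessingMode(False, "No GPU detected")


if __name__ == "__main__":
    mode = select_processing_mode(lambda: False)
    print(mode.use_gpu, mode.reason)
```

The probe is injected so the decision logic can be tested on machines without a GPU. In a Paddle-based deployment the probe might wrap a check such as `paddle.device.cuda.device_count() > 0`; that wiring is an assumption and is left out of the sketch.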

Scenario: GPU out-of-memory error handling

  • WHEN the GPU runs out of memory during OCR processing
  • THEN the system catches the CUDA OOM exception
  • AND logs error with GPU memory information
  • AND attempts to process the task using CPU mode
  • AND continues batch processing without failure
  • AND records GPU failure in task metadata
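
One way to satisfy the out-of-memory scenario is to wrap each task in a retry that switches to CPU and records what happened. The sketch below assumes a hypothetical `GpuOutOfMemoryError` as a stand-in for whatever exception the GPU backend actually raises, and hypothetical metadata keys (`device`, `gpu_failed`, `gpu_error`); none of these names come from the spec.

```python
from typing import Any, Callable, Dict, Tuple


class GpuOutOfMemoryError(RuntimeError):
    """Hypothetical stand-in for the backend's real CUDA OOM exception."""


def run_with_cpu_fallback(
    task_id: str,
    run_on_gpu: Callable[[], Any],
    run_on_cpu: Callable[[], Any],
    log: Callable[[str], None] = print,
) -> Tuple[Any, Dict[str, Any]]:
    """Run one OCR task on GPU; on OOM, log it, retry on CPU, and record the failure."""
    metadata: Dict[str, Any] = {"task_id": task_id, "device": "gpu", "gpu_failed": False}
    try:
        return run_on_gpu(), metadata
    except GpuOutOfMemoryError as exc:
        log(f"GPU OOM on task {task_id}: {exc}; retrying on CPU")
        metadata.update(device="cpu", gpu_failed=True, gpu_error=str(exc))
        # Only the OOM case is retried; other errors propagate to the caller.
        return run_on_cpu(), metadata
```

Because the fallback is applied per task, a batch loop that calls this helper keeps going after a single GPU failure, and the returned metadata carries the record that the scenario asks for.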

Scenario: Multiple GPU devices available

  • WHEN the system has multiple CUDA devices
  • THEN the system detects all available GPUs
  • AND uses the primary GPU (device 0) by default
  • AND allows GPU device selection via configuration
  • AND logs selected GPU device information
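
Device selection for the multi-GPU scenario can be sketched as a small pure function. The function name, the `configured_id` parameter, and the choice to reject an out-of-range ID with an error (rather than silently falling back to device 0) are assumptions; the spec only fixes device 0 as the default.

```python
from typing import Optional


def select_gpu_device(device_count: int, configured_id: Optional[int] = None) -> Optional[int]:
    """Return the GPU index to use, or None when no GPU is present."""
    if device_count <= 0:
        return None  # no GPU: caller should use CPU mode
    if configured_id is None:
        return 0  # primary GPU by default
    if 0 <= configured_id < device_count:
        return configured_id
    # Assumption: a misconfigured ID is reported instead of being ignored.
    raise ValueError(
        f"configured GPU {configured_id} is out of range for {device_count} device(s)"
    )
```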

Requirement: GPU Performance Optimization

The system SHALL optimize GPU memory usage and batch processing for efficient OCR performance.

Scenario: Automatic batch size adjustment

  • WHEN GPU mode is enabled
  • THEN the system queries available GPU memory
  • AND calculates optimal batch size based on memory capacity
  • AND adjusts concurrent processing threads accordingly
  • AND monitors memory usage during processing
  • AND prevents memory allocation beyond safe threshold
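
The batch size calculation can be illustrated with simple arithmetic: take a safe share of the free memory and divide it by an estimated per-item cost. The sketch below is one possible heuristic, not the system's actual formula; `per_item_mb`, `safe_fraction`, and `max_batch` are hypothetical tuning parameters with placeholder defaults.

```python
def compute_batch_size(
    free_mb: float,
    per_item_mb: float,
    safe_fraction: float = 0.8,
    max_batch: int = 32,
) -> int:
    """Estimate how many items fit in the safe share of free GPU memory."""
    if per_item_mb <= 0:
        raise ValueError("per_item_mb must be positive")
    if not 0 < safe_fraction <= 1:
        raise ValueError("safe_fraction must be in (0, 1]")
    budget_mb = free_mb * safe_fraction  # never plan beyond the safe threshold
    fit = int(budget_mb // per_item_mb)
    # Always process at least one item; cap the batch at max_batch.
    return max(1, min(max_batch, fit))
```

For example, with 8000 MB free, an assumed cost of 500 MB per item, and the default 0.8 threshold, the budget is 6400 MB and the function returns 12.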

Scenario: GPU memory management

  • WHEN GPU memory fraction is configured
  • THEN the system allocates specified fraction of total GPU memory
  • AND reserves memory for PaddleOCR model
  • AND caps its own allocation at that fraction so that memory use by other processes on the same GPU is less likely to cause OOM
  • AND releases memory after batch completion
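
The relationship between the configured fraction, the model reservation, and the memory left for batches can be sketched as follows. The function name, the returned keys, and the rule that a reservation larger than the allocation is an error are assumptions for illustration.

```python
from typing import Dict


def plan_memory_budget(total_mb: int, fraction: float, model_reserve_mb: int) -> Dict[str, int]:
    """Split the configured share of GPU memory into a model reserve and a batch budget."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    allocated_mb = int(total_mb * fraction)
    if model_reserve_mb > allocated_mb:
        # Assumption: the model must fit inside the configured share.
        raise ValueError("model reserve exceeds the configured memory fraction")
    return {
        "allocated_mb": allocated_mb,
        "model_reserve_mb": model_reserve_mb,
        "batch_budget_mb": allocated_mb - model_reserve_mb,
    }
```

With 8192 MB total, a fraction of 0.5, and an assumed 1024 MB model reservation, the allocation is 4096 MB and 3072 MB remain for batch processing.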

Requirement: GPU Status Reporting

The system SHALL provide GPU status information through health check API and logging.

Scenario: Health check with GPU available

  • WHEN a client requests the /health endpoint on a GPU-enabled system
  • THEN the system returns health status including:
    • gpu_available: true
    • gpu_device_name: detected GPU name
    • cuda_version: CUDA runtime version
    • gpu_memory_total: total GPU memory in MB
    • gpu_memory_used: currently used GPU memory in MB
    • gpu_utilization: current GPU utilization percentage

Scenario: Health check without GPU

  • WHEN a client requests the /health endpoint on a CPU-only system
  • THEN the system returns health status including:
    • gpu_available: false
    • processing_mode: "CPU"
    • reason: explanation for CPU mode (e.g., "No GPU detected", "CPU mode forced")
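
The two health check payloads above can be produced by one helper that branches on whether GPU information is present. The field names in the returned dictionaries are taken from the scenarios; the `GpuInfo` container and the `build_gpu_health` function are hypothetical, and how the values are collected (for example via NVML) is left open.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class GpuInfo:
    """Hypothetical snapshot of GPU state gathered elsewhere."""
    device_name: str
    cuda_version: str
    memory_total_mb: int
    memory_used_mb: int
    utilization_percent: float


def build_gpu_health(gpu: Optional[GpuInfo], reason: str = "No GPU detected") -> Dict[str, Any]:
    """Build the GPU section of the /health response for either mode."""
    if gpu is None:
        return {"gpu_available": False, "processing_mode": "CPU", "reason": reason}
    return {
        "gpu_available": True,
        "gpu_device_name": gpu.device_name,
        "cuda_version": gpu.cuda_version,
        "gpu_memory_total": gpu.memory_total_mb,  # MB
        "gpu_memory_used": gpu.memory_used_mb,    # MB
        "gpu_utilization": gpu.utilization_percent,  # percent
    }
```

Passing `reason="CPU mode forced"` covers the case where a GPU exists but FORCE_CPU_MODE is set.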

Scenario: Startup GPU status logging

  • WHEN OCR service starts
  • THEN the system logs GPU detection results
  • AND logs selected processing mode (GPU/CPU)
  • AND logs GPU device details if available
  • AND logs any GPU-related warnings or errors
  • AND continues startup successfully regardless of GPU status
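
The key property of the startup scenario is that logging GPU status never blocks startup. A minimal sketch using the standard `logging` module is shown below; the logger name `ocr.startup`, the function name, and the injected `detect` callable (returning device details or `None`) are assumptions.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("ocr.startup")  # hypothetical logger name


def log_startup_gpu_status(detect: Callable[[], Optional[Any]]) -> str:
    """Log detection results and the chosen mode; return "GPU" or "CPU"."""
    try:
        info = detect()
    except Exception:
        # A failing probe is logged as a warning and treated as "no GPU".
        logger.warning("GPU detection failed; continuing in CPU mode", exc_info=True)
        info = None
    mode = "GPU" if info else "CPU"
    logger.info("GPU detected: %s", bool(info))
    logger.info("Processing mode: %s", mode)
    if info:
        logger.info("GPU device details: %s", info)
    return mode
```

The function always returns normally, so the caller can continue startup regardless of what the probe reports.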