egg/OCR

egg cd3cbea49d chore: project cleanup and prepare for dual-track processing refactor

- Removed all test files and directories
- Deleted outdated documentation (will be rewritten)
- Cleaned up temporary files, logs, and uploads
- Archived 5 completed OpenSpec proposals
- Created new dual-track-document-processing proposal with complete OpenSpec structure
  - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF)
  - UnifiedDocument model for consistent output
  - Support for structure-preserving translation
- Updated .gitignore to prevent future test/temp files

This is a major cleanup preparing for the complete refactoring of the document processing pipeline.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-18 20:02:31 +08:00

3.8 KiB

OCR Processing Specification

ADDED Requirements

Requirement: GPU Acceleration

The system SHALL automatically detect and utilize GPU hardware for OCR processing when available, with graceful fallback to CPU mode when GPU is unavailable or disabled.

Scenario: GPU available and enabled

WHEN the PaddleOCR service initializes on a system with a compatible GPU
THEN the system detects GPU availability using the CUDA runtime
AND initializes PaddleOCR with use_gpu=True parameter
AND sets appropriate GPU memory fraction to prevent OOM errors
AND logs GPU device information (name, memory, CUDA version)
AND processes OCR tasks using GPU acceleration

Scenario: CPU fallback when GPU unavailable

WHEN the PaddleOCR service initializes on a system without a GPU
THEN the system detects the absence of a GPU
AND initializes PaddleOCR with use_gpu=False parameter
AND logs CPU mode status
AND processes OCR tasks using CPU without errors

Scenario: Force CPU mode override

WHEN FORCE_CPU_MODE environment variable is set to true
THEN the system ignores GPU availability
AND initializes PaddleOCR in CPU mode
AND logs that CPU mode is forced by configuration
AND processes OCR tasks using CPU
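
The three initialization scenarios above reduce to one decision: forced CPU mode wins, then GPU availability decides. A minimal sketch of that decision, assuming GPU detection is performed elsewhere and passed in as a boolean. `select_processing_mode` and `ModeDecision` are hypothetical names, and accepting "1", "yes" and "on" in addition to "true" is an assumption that goes beyond the specification.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ModeDecision(NamedTuple):
    use_gpu: bool   # value to pass on as the use_gpu parameter
    reason: str     # human-readable explanation for logs


def _is_truthy(value: str) -> bool:
    # Assumption: the spec only names "true"; the other spellings are a convenience.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def select_processing_mode(
    gpu_available: bool,
    env: Optional[Mapping[str, str]] = None,
) -> ModeDecision:
    """Decide between GPU and CPU mode; FORCE_CPU_MODE overrides detection."""
    env = os.environ if env is None else env
    if _is_truthy(env.get("FORCE_CPU_MODE", "")):
        return ModeDecision(False, "CPU mode forced")
    if not gpu_available:
        return ModeDecision(False, "No GPU detected")
    return ModeDecision(True, "GPU detected")
```

Passing the environment as a mapping keeps the function testable without touching the real process environment.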

Scenario: GPU out-of-memory error handling

WHEN GPU runs out of memory during OCR processing
THEN the system catches CUDA OOM exception
AND logs error with GPU memory information
AND attempts to process the task using CPU mode
AND continues batch processing without failure
AND records GPU failure in task metadata
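
The out-of-memory scenario can be sketched as a wrapper that retries a task on CPU and records the failure. This is an illustration only: the exception type PaddleOCR raises on GPU OOM is not stated in this specification, so the sketch assumes a message-based heuristic (`is_oom_error`), and `run_with_cpu_fallback` and the metadata keys are hypothetical names.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def is_oom_error(exc: BaseException) -> bool:
    # Assumption: OOM errors can be recognized by their message text.
    return "out of memory" in str(exc).lower()


def run_with_cpu_fallback(
    task_id: str,
    run_on_gpu: Callable[[], Any],
    run_on_cpu: Callable[[], Any],
    metadata: Dict[str, Any],
) -> Any:
    """Run a task on GPU; on an OOM error, record it and rerun the task on CPU."""
    try:
        return run_on_gpu()
    except Exception as exc:
        if not is_oom_error(exc):
            raise  # unrelated failures are not masked by the fallback
        logger.error("GPU OOM on task %s, falling back to CPU: %s", task_id, exc)
        metadata["gpu_failed"] = True      # hypothetical metadata keys
        metadata["gpu_error"] = str(exc)
        return run_on_cpu()
```

Because the wrapper returns the CPU result instead of raising, a batch loop that calls it per task continues past a GPU failure.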

Scenario: Multiple GPU devices available

WHEN the system has multiple CUDA devices
THEN the system detects all available GPUs
AND uses primary GPU (device 0) by default
AND allows GPU device selection via configuration
AND logs selected GPU device information
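
Device selection for the multi-GPU scenario might look like the following sketch. `pick_gpu_device` is a hypothetical helper; the device count is assumed to come from whatever CUDA query the service uses. Rejecting an out-of-range configured index with an error (rather than silently falling back to device 0) is an assumption, since the specification does not define that case.

```python
from typing import Optional


def pick_gpu_device(device_count: int, configured: Optional[int] = None) -> Optional[int]:
    """Return the GPU index to use, or None when no GPU is present."""
    if device_count <= 0:
        return None          # no CUDA devices: caller should use CPU mode
    if configured is None:
        return 0             # primary GPU by default
    if not 0 <= configured < device_count:
        raise ValueError(
            f"configured GPU index {configured} is outside 0..{device_count - 1}"
        )
    return configured
```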

Requirement: GPU Performance Optimization

The system SHALL optimize GPU memory usage and batch processing for efficient OCR performance.

Scenario: Automatic batch size adjustment

WHEN GPU mode is enabled
THEN the system queries available GPU memory
AND calculates optimal batch size based on memory capacity
AND adjusts concurrent processing threads accordingly
AND monitors memory usage during processing
AND prevents memory allocation beyond safe threshold
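
One way to derive a batch size from available GPU memory is sketched below. The per-image memory cost, the 0.8 safety fraction and the cap of 32 are illustrative assumptions, not values from this specification, and `compute_batch_size` is a hypothetical name.

```python
def compute_batch_size(
    free_mb: float,
    per_image_mb: float,
    safe_fraction: float = 0.8,   # assumed safe threshold
    max_batch: int = 32,          # assumed upper bound
) -> int:
    """Largest batch that fits in the safe share of free GPU memory (at least 1)."""
    if per_image_mb <= 0:
        raise ValueError("per_image_mb must be positive")
    budget_mb = free_mb * safe_fraction
    fits = int(budget_mb // per_image_mb)
    return max(1, min(max_batch, fits))
```

The lower bound of 1 keeps processing alive on a nearly full GPU; a real service would probably pair it with the CPU fallback described earlier.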

Scenario: GPU memory management

WHEN GPU memory fraction is configured
THEN the system allocates specified fraction of total GPU memory
AND reserves memory for PaddleOCR model
AND caps its own allocation so that other processes sharing the GPU are less likely to cause OOM
AND releases memory after batch completion
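
The memory-fraction arithmetic can be illustrated as follows. `plan_memory_budget` and its result keys are hypothetical, and the model reservation size is an input the caller would have to measure; nothing here calls a real GPU API.

```python
from typing import Dict


def plan_memory_budget(total_mb: int, fraction: float, model_reserve_mb: int) -> Dict[str, int]:
    """Split the configured share of GPU memory into model and working memory."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    allocated_mb = int(total_mb * fraction)
    working_mb = allocated_mb - model_reserve_mb
    if working_mb <= 0:
        raise ValueError("memory fraction too small to hold the model")
    return {
        "allocated_mb": allocated_mb,
        "model_reserve_mb": model_reserve_mb,
        "working_mb": working_mb,
    }
```

For example, with 8192 MB total, a fraction of 0.5 and a 1024 MB model reservation, 3072 MB remain for batch data.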

Requirement: GPU Status Reporting

The system SHALL provide GPU status information through health check API and logging.

Scenario: Health check with GPU available

WHEN a client requests the /health endpoint on a GPU-enabled system
THEN the system returns health status including:
- gpu_available: true
- gpu_device_name: detected GPU name
- cuda_version: CUDA runtime version
- gpu_memory_total: total GPU memory in MB
- gpu_memory_used: currently used GPU memory in MB
- gpu_utilization: current GPU utilization percentage

Scenario: Health check without GPU

WHEN a client requests the /health endpoint on a CPU-only system
THEN the system returns health status including:
- gpu_available: false
- processing_mode: "CPU"
- reason: explanation for CPU mode (e.g., "No GPU detected", "CPU mode forced")
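
The two health check responses can be built from a single function. The sketch below uses only the field names listed in the scenarios; `GpuInfo` and `build_health_payload` are hypothetical names, and how the GPU figures are collected is left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class GpuInfo:
    device_name: str
    cuda_version: str
    memory_total_mb: int
    memory_used_mb: int
    utilization_pct: float


def build_health_payload(
    gpu: Optional[GpuInfo],
    cpu_reason: str = "No GPU detected",
) -> Dict[str, Any]:
    """Return the GPU-related part of the /health response."""
    if gpu is None:
        return {
            "gpu_available": False,
            "processing_mode": "CPU",
            "reason": cpu_reason,
        }
    return {
        "gpu_available": True,
        "gpu_device_name": gpu.device_name,
        "cuda_version": gpu.cuda_version,
        "gpu_memory_total": gpu.memory_total_mb,   # MB
        "gpu_memory_used": gpu.memory_used_mb,     # MB
        "gpu_utilization": gpu.utilization_pct,    # percent
    }
```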

Scenario: Startup GPU status logging

WHEN OCR service starts
THEN the system logs GPU detection results
AND logs selected processing mode (GPU/CPU)
AND logs GPU device details if available
AND logs any GPU-related warnings or errors
AND continues startup successfully regardless of GPU status
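
The key property of the startup scenario is that GPU detection can never abort startup. A minimal sketch, assuming detection is supplied as a callable that returns a device name or None; `log_startup_gpu_status` and the logger name are hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("ocr.startup")


def log_startup_gpu_status(detect: Callable[[], Optional[str]]) -> str:
    """Log GPU detection results and return the selected mode, never raising."""
    try:
        device_name = detect()
    except Exception as exc:
        # A failing probe is logged as a warning and treated as "no GPU".
        logger.warning("GPU detection failed, continuing in CPU mode: %s", exc)
        device_name = None
    mode = "GPU" if device_name else "CPU"
    logger.info("Processing mode: %s", mode)
    if device_name:
        logger.info("GPU device: %s", device_name)
    return mode
```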
