feat: add GPU optimization and fix TableData consistency

GPU Optimization (Section 3.1):
- Add comprehensive memory management for RTX 4060 8GB
- Enable all recognition features (chart, formula, table, seal, text)
- Implement model cache with auto-unload for idle models
- Add memory monitoring and warning system

Bug Fix (Section 3.3):
- Fix TableData field inconsistency: 'columns' -> 'cols'
- Remove invalid 'html' and 'extracted_text' parameters
- Add proper TableCell conversion in _convert_table_data

Documentation:
- Add Future Improvements section for batch processing enhancement

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-19 09:17:27 +08:00
parent ecdce961ca
commit 8b9a364452
4 changed files with 205 additions and 24 deletions


@@ -63,10 +63,41 @@ class Settings(BaseSettings):
return [lang.strip() for lang in self.ocr_languages.split(",")]
# ===== GPU Acceleration Configuration =====
# Basic GPU settings
force_cpu_mode: bool = Field(default=False)
-gpu_memory_fraction: float = Field(default=0.8)
+gpu_memory_fraction: float = Field(default=0.7) # Optimized for RTX 4060 8GB
gpu_device_id: int = Field(default=0)
# Memory management for RTX 4060 8GB
gpu_memory_limit_mb: int = Field(default=6144) # 6GB max for models (leave 2GB buffer)
gpu_memory_reserve_mb: int = Field(default=512) # Reserve for CUDA overhead
enable_memory_optimization: bool = Field(default=True)
# Model loading and caching
enable_lazy_model_loading: bool = Field(default=True) # Load models on demand
enable_model_cache: bool = Field(default=True)
model_cache_limit_mb: int = Field(default=4096) # Max 4GB for cached models
auto_unload_unused_models: bool = Field(default=True) # Unload unused language models
model_idle_timeout_seconds: int = Field(default=300) # Unload after 5 min idle
# Batch processing configuration
enable_batch_processing: bool = Field(default=True)
inference_batch_size: int = Field(default=1) # Conservative for 8GB VRAM
max_concurrent_pages: int = Field(default=2) # Process 2 pages concurrently
# PP-StructureV3 optimization
enable_chart_recognition: bool = Field(default=True) # Chart/diagram recognition
enable_formula_recognition: bool = Field(default=True) # Math formula recognition
enable_table_recognition: bool = Field(default=True) # Table structure recognition
enable_seal_recognition: bool = Field(default=True) # Seal/stamp recognition
enable_text_recognition: bool = Field(default=True) # General text recognition
layout_detection_threshold: float = Field(default=0.5)
# Performance tuning
use_fp16_inference: bool = Field(default=False) # Half-precision (if supported)
enable_cudnn_benchmark: bool = Field(default=True) # Optimize convolution algorithms
num_threads: int = Field(default=4) # CPU threads for preprocessing
# ===== File Upload Configuration =====
max_upload_size: int = Field(default=52428800) # 50MB
allowed_extensions: str = Field(default="png,jpg,jpeg,pdf,bmp,tiff,doc,docx,ppt,pptx")
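
The model-cache settings in this hunk (`enable_lazy_model_loading`, `model_cache_limit_mb`, `auto_unload_unused_models`, `model_idle_timeout_seconds`) suggest a cache that loads models on demand, keeps them within a size budget, and drops the ones that have sat idle. The cache implementation itself is not part of this hunk, so the following is only a minimal sketch under that assumption. The `ModelCache` class, its method names, the injected `clock`, and the per-model size estimates are all hypothetical, chosen for illustration.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, List


class ModelCache:
    """Hypothetical cache: lazy loading, size budget, idle-timeout unloading."""

    def __init__(
        self,
        limit_mb: int = 4096,            # mirrors model_cache_limit_mb
        idle_timeout_s: float = 300.0,   # mirrors model_idle_timeout_seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit_mb = limit_mb
        self.idle_timeout_s = idle_timeout_s
        self._clock = clock
        # name -> (model, size_mb, last_used); insertion order tracks recency.
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def used_mb(self) -> int:
        """Sum of the (caller-estimated) sizes of all cached models."""
        return sum(size for _, size, _ in self._entries.values())

    def get(self, name: str, loader: Callable[[], Any], size_mb: int) -> Any:
        """Return the cached model, loading it on first use (lazy loading)."""
        entry = self._entries.pop(name, None)
        if entry is None:
            # Free space before loading so the budget is not exceeded mid-load.
            self._make_room(size_mb)
            model, size = loader(), size_mb
        else:
            model, size, _ = entry
        # Re-insert at the end: this entry is now the most recently used.
        self._entries[name] = (model, size, self._clock())
        return model

    def unload_idle(self) -> List[str]:
        """Drop models unused for at least idle_timeout_s; return their names."""
        now = self._clock()
        idle = [
            name
            for name, (_, _, last_used) in self._entries.items()
            if now - last_used >= self.idle_timeout_s
        ]
        for name in idle:
            # Only the reference is dropped here. Actually releasing GPU
            # memory is framework-specific and omitted from this sketch.
            del self._entries[name]
        return idle

    def _make_room(self, needed_mb: int) -> None:
        """Evict least recently used models until needed_mb fits the budget."""
        while self._entries and self.used_mb() + needed_mb > self.limit_mb:
            self._entries.popitem(last=False)
```

In a service, `unload_idle()` would presumably run on a timer or before each request, and `get("en", load_english_model, 600)` would replace direct model construction. Here `load_english_model` and the 600 MB estimate are placeholders, not values from the commit.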