feat: implement hybrid image extraction and memory management

Backend:
- Add hybrid image extraction for Direct track (inline image blocks)
- Add render_inline_image_regions() fallback when OCR doesn't find images
- Add check_document_for_missing_images() for detecting missing images
- Add memory management system (MemoryGuard, ModelManager, ServicePool)
- Update pdf_generator_service to handle HYBRID processing track
- Add ElementType.LOGO for logo extraction

Frontend:
- Fix PDF viewer re-rendering issues with memoization
- Add TaskNotFound component and useTaskValidation hook
- Disable StrictMode due to react-pdf incompatibility
- Fix task detail and results page loading states

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@@ -104,6 +104,37 @@ class Settings(BaseSettings):
     enable_cudnn_benchmark: bool = Field(default=True) # Optimize convolution algorithms
     num_threads: int = Field(default=4) # CPU threads for preprocessing
 
+    # ===== Enhanced Memory Management Configuration =====
+    # Memory thresholds (as ratio of total GPU memory)
+    memory_warning_threshold: float = Field(default=0.80) # 80% - start warning
+    memory_critical_threshold: float = Field(default=0.95) # 95% - throttle operations
+    memory_emergency_threshold: float = Field(default=0.98) # 98% - emergency cleanup
+
+    # Memory monitoring
+    memory_check_interval_seconds: int = Field(default=30) # Background check interval
+    enable_memory_alerts: bool = Field(default=True) # Enable memory alerts
+
+    # Model lifecycle management
+    enable_model_lifecycle_management: bool = Field(default=True) # Use ModelManager
+    pp_structure_idle_timeout_seconds: int = Field(default=300) # Unload PP-Structure after idle
+    structure_model_memory_mb: int = Field(default=2000) # Estimated memory for PP-StructureV3
+    ocr_model_memory_mb: int = Field(default=500) # Estimated memory per OCR language model
+
+    # Service pool configuration
+    enable_service_pool: bool = Field(default=True) # Use OCRServicePool
+    max_services_per_device: int = Field(default=1) # Max OCRService per GPU
+    max_total_services: int = Field(default=2) # Max total OCRService instances
+    service_acquire_timeout_seconds: float = Field(default=300.0) # Timeout for acquiring service
+    max_queue_size: int = Field(default=50) # Max pending tasks per device
+
+    # Concurrency control
+    max_concurrent_predictions: int = Field(default=2) # Max concurrent PP-StructureV3 predictions
+    enable_cpu_fallback: bool = Field(default=True) # Fall back to CPU when GPU memory low
+
+    # Emergency recovery
+    enable_emergency_cleanup: bool = Field(default=True) # Auto-cleanup on memory pressure
+    enable_worker_restart: bool = Field(default=False) # Restart workers on OOM (requires supervisor)
+
     # ===== File Upload Configuration =====
     max_upload_size: int = Field(default=52428800) # 50MB
     allowed_extensions: str = Field(default="png,jpg,jpeg,pdf,bmp,tiff,doc,docx,ppt,pptx")

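As a rough illustration of how the three memory thresholds in the settings above might be consumed, the sketch below maps a GPU memory usage ratio to a pressure level. `MemoryLevel` and `classify_memory_usage` are hypothetical names that do not appear in this commit; only the default values (0.80, 0.95, 0.98) are taken from the settings.

```python
from enum import Enum


class MemoryLevel(Enum):
    """Hypothetical pressure levels; not defined in this commit."""
    NORMAL = "normal"
    WARNING = "warning"
    CRITICAL = "critical"
    EMERGENCY = "emergency"


def classify_memory_usage(
    used_ratio: float,
    warning: float = 0.80,
    critical: float = 0.95,
    emergency: float = 0.98,
) -> MemoryLevel:
    """Map a usage ratio (0.0-1.0) to a level, checking the highest threshold first."""
    if used_ratio >= emergency:
        return MemoryLevel.EMERGENCY
    if used_ratio >= critical:
        return MemoryLevel.CRITICAL
    if used_ratio >= warning:
        return MemoryLevel.WARNING
    return MemoryLevel.NORMAL
```

Checking the highest threshold first keeps the branches mutually exclusive, so a ratio of 0.99 is reported as an emergency rather than as a warning.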
@@ -7,10 +7,103 @@ from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
 from contextlib import asynccontextmanager
 import logging
+import signal
+import sys
+import asyncio
 from pathlib import Path
+from typing import Optional
 
 from app.core.config import settings
 
+
+# =============================================================================
+# Section 6.1: Signal Handlers
+# =============================================================================
+
+# Flag to indicate graceful shutdown is in progress
+_shutdown_requested = False
+_shutdown_complete = asyncio.Event()
+
+# Track active connections for draining
+_active_connections = 0
+_connection_lock = asyncio.Lock()
+
+
+async def increment_connections():
+    """Track active connection count"""
+    global _active_connections
+    async with _connection_lock:
+        _active_connections += 1
+
+
+async def decrement_connections():
+    """Track active connection count"""
+    global _active_connections
+    async with _connection_lock:
+        _active_connections -= 1
+
+
+def get_active_connections() -> int:
+    """Get current active connection count"""
+    return _active_connections
+
+
+def is_shutdown_requested() -> bool:
+    """Check if graceful shutdown has been requested"""
+    return _shutdown_requested
+
+
+def _signal_handler(signum: int, frame) -> None:
+    """
+    Signal handler for SIGTERM and SIGINT.
+
+    Initiates graceful shutdown by setting the shutdown flag.
+    The actual cleanup is handled by the lifespan context manager.
+    """
+    global _shutdown_requested
+    signal_name = signal.Signals(signum).name
+    logger = logging.getLogger(__name__)
+
+    if _shutdown_requested:
+        logger.warning(f"Received {signal_name} again, forcing immediate exit...")
+        sys.exit(1)
+
+    logger.info(f"Received {signal_name}, initiating graceful shutdown...")
+    _shutdown_requested = True
+
+    # Try to stop the event loop gracefully
+    try:
+        loop = asyncio.get_event_loop()
+        if loop.is_running():
+            # Schedule shutdown event
+            loop.call_soon_threadsafe(_shutdown_complete.set)
+    except RuntimeError:
+        pass # No event loop running
+
+
+def setup_signal_handlers():
+    """
+    Set up signal handlers for graceful shutdown.
+
+    Handles:
+    - SIGTERM: Standard termination signal (from systemd, docker, etc.)
+    - SIGINT: Keyboard interrupt (Ctrl+C)
+    """
+    logger = logging.getLogger(__name__)
+
+    try:
+        # SIGTERM - Standard termination signal
+        signal.signal(signal.SIGTERM, _signal_handler)
+        logger.info("SIGTERM handler installed")
+
+        # SIGINT - Keyboard interrupt
+        signal.signal(signal.SIGINT, _signal_handler)
+        logger.info("SIGINT handler installed")
+
+    except (ValueError, OSError) as e:
+        # Signal handling may not be available in all contexts
+        logger.warning(f"Could not install signal handlers: {e}")
+
 # Ensure log directory exists before configuring logging
 Path(settings.log_file).parent.mkdir(parents=True, exist_ok=True)
 
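The two-stage behaviour of `_signal_handler` above (first signal requests a graceful shutdown, a repeated signal forces exit) can be sketched in isolation as follows. `ShutdownFlag` is a hypothetical helper written for this illustration and is not part of the commit; it raises `SystemExit(1)` directly, which is what `sys.exit(1)` does.

```python
import signal


class ShutdownFlag:
    """Hypothetical helper: first signal sets a flag, second signal forces exit."""

    def __init__(self) -> None:
        self.requested = False

    def handle(self, signum: int, frame=None) -> None:
        if self.requested:
            # A repeated signal means the operator is not willing to wait
            raise SystemExit(1)
        # First signal: only record the request; cleanup happens elsewhere
        self.requested = True


def install(flag: ShutdownFlag) -> bool:
    """Register the handler for SIGTERM and SIGINT; return False if not possible."""
    try:
        signal.signal(signal.SIGTERM, flag.handle)
        signal.signal(signal.SIGINT, flag.handle)
        return True
    except (ValueError, OSError):
        # signal.signal() raises ValueError outside the main thread
        return False
```

Keeping the handler this small matters because Python runs signal handlers between bytecode instructions of the main thread, so long-running work inside the handler would delay the very shutdown it is meant to start.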
@@ -38,16 +131,91 @@ for logger_name in ['app', 'app.services', 'app.services.pdf_generator_service',
 logger = logging.getLogger(__name__)
 
 
+async def drain_connections(timeout: float = 30.0):
+    """
+    Wait for active connections to complete (connection draining).
+
+    Args:
+        timeout: Maximum time to wait for connections to drain
+    """
+    logger.info(f"Draining connections (timeout={timeout}s)...")
+    start_time = asyncio.get_event_loop().time()
+
+    while get_active_connections() > 0:
+        elapsed = asyncio.get_event_loop().time() - start_time
+        if elapsed >= timeout:
+            logger.warning(
+                f"Connection drain timeout after {timeout}s. "
+                f"{get_active_connections()} connections still active."
+            )
+            break
+
+        logger.info(f"Waiting for {get_active_connections()} active connections...")
+        await asyncio.sleep(1.0)
+
+    if get_active_connections() == 0:
+        logger.info("All connections drained successfully")
+
+
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     """Application lifespan events"""
     # Startup
     logger.info("Starting Tool_OCR V2 application...")
 
+    # Set up signal handlers for graceful shutdown
+    setup_signal_handlers()
+
     # Ensure all directories exist
     settings.ensure_directories()
     logger.info("All directories created/verified")
 
+    # Initialize memory management if enabled
+    if settings.enable_model_lifecycle_management:
+        try:
+            from app.services.memory_manager import get_model_manager, MemoryConfig
+
+            memory_config = MemoryConfig(
+                warning_threshold=settings.memory_warning_threshold,
+                critical_threshold=settings.memory_critical_threshold,
+                emergency_threshold=settings.memory_emergency_threshold,
+                model_idle_timeout_seconds=settings.pp_structure_idle_timeout_seconds,
+                memory_check_interval_seconds=settings.memory_check_interval_seconds,
+                enable_auto_cleanup=settings.enable_memory_optimization,
+                enable_emergency_cleanup=settings.enable_emergency_cleanup,
+                max_concurrent_predictions=settings.max_concurrent_predictions,
+                enable_cpu_fallback=settings.enable_cpu_fallback,
+                gpu_memory_limit_mb=settings.gpu_memory_limit_mb,
+            )
+            get_model_manager(memory_config)
+            logger.info("Memory management initialized")
+        except Exception as e:
+            logger.warning(f"Failed to initialize memory management: {e}")
+
+    # Initialize service pool if enabled
+    if settings.enable_service_pool:
+        try:
+            from app.services.service_pool import get_service_pool, PoolConfig
+
+            pool_config = PoolConfig(
+                max_services_per_device=settings.max_services_per_device,
+                max_total_services=settings.max_total_services,
+                acquire_timeout_seconds=settings.service_acquire_timeout_seconds,
+                max_queue_size=settings.max_queue_size,
+            )
+            get_service_pool(pool_config)
+            logger.info("OCR service pool initialized")
+        except Exception as e:
+            logger.warning(f"Failed to initialize service pool: {e}")
+
+    # Initialize prediction semaphore for controlling concurrent PP-StructureV3 predictions
+    try:
+        from app.services.memory_manager import get_prediction_semaphore
+        get_prediction_semaphore(max_concurrent=settings.max_concurrent_predictions)
+        logger.info(f"Prediction semaphore initialized (max_concurrent={settings.max_concurrent_predictions})")
+    except Exception as e:
+        logger.warning(f"Failed to initialize prediction semaphore: {e}")
+
     logger.info("Application startup complete")
 
     yield
@@ -55,6 +223,45 @@ async def lifespan(app: FastAPI):
     # Shutdown
     logger.info("Shutting down Tool_OCR application...")
+
+    # Connection draining - wait for active requests to complete
+    await drain_connections(timeout=30.0)
+
+    # Shutdown recovery manager if initialized
+    try:
+        from app.services.memory_manager import shutdown_recovery_manager
+        shutdown_recovery_manager()
+        logger.info("Recovery manager shutdown complete")
+    except Exception as e:
+        logger.debug(f"Recovery manager shutdown skipped: {e}")
+
+    # Shutdown service pool
+    if settings.enable_service_pool:
+        try:
+            from app.services.service_pool import shutdown_service_pool
+            shutdown_service_pool()
+            logger.info("Service pool shutdown complete")
+        except Exception as e:
+            logger.warning(f"Error shutting down service pool: {e}")
+
+    # Shutdown prediction semaphore
+    try:
+        from app.services.memory_manager import shutdown_prediction_semaphore
+        shutdown_prediction_semaphore()
+        logger.info("Prediction semaphore shutdown complete")
+    except Exception as e:
+        logger.warning(f"Error shutting down prediction semaphore: {e}")
+
+    # Shutdown memory manager
+    if settings.enable_model_lifecycle_management:
+        try:
+            from app.services.memory_manager import shutdown_model_manager
+            shutdown_model_manager()
+            logger.info("Memory manager shutdown complete")
+        except Exception as e:
+            logger.warning(f"Error shutting down memory manager: {e}")
+
+    logger.info("Tool_OCR shutdown complete")
 
 
 # Create FastAPI application
 app = FastAPI(
@@ -77,9 +284,7 @@ app.add_middleware(
 # Health check endpoint
 @app.get("/health")
 async def health_check():
-    """Health check endpoint with GPU status"""
-    from app.services.ocr_service import OCRService
-
+    """Health check endpoint with GPU status and memory management info"""
     response = {
         "status": "healthy",
         "service": "Tool_OCR V2",
@@ -88,10 +293,31 @@ async def health_check():
 
     # Add GPU status information
     try:
-        # Create temporary OCRService instance to get GPU status
-        # In production, this should be a singleton service
-        ocr_service = OCRService()
-        gpu_status = ocr_service.get_gpu_status()
+        # Use service pool if available to avoid creating new instances
+        gpu_status = None
+        if settings.enable_service_pool:
+            try:
+                from app.services.service_pool import get_service_pool
+                pool = get_service_pool()
+                pool_stats = pool.get_pool_stats()
+                response["service_pool"] = pool_stats
+
+                # Get GPU status from first available service
+                for device, services in pool.services.items():
+                    for pooled in services:
+                        if hasattr(pooled.service, 'get_gpu_status'):
+                            gpu_status = pooled.service.get_gpu_status()
+                            break
+                    if gpu_status:
+                        break
+            except Exception as e:
+                logger.debug(f"Could not get service pool stats: {e}")
+
+        # Fallback: create temporary instance if no pool or no service available
+        if gpu_status is None:
+            from app.services.ocr_service import OCRService
+            ocr_service = OCRService()
+            gpu_status = ocr_service.get_gpu_status()
 
         response["gpu"] = {
             "available": gpu_status.get("gpu_available", False),
@@ -120,6 +346,15 @@ async def health_check():
             "error": str(e),
         }
 
+    # Add memory management status
+    if settings.enable_model_lifecycle_management:
+        try:
+            from app.services.memory_manager import get_model_manager
+            model_manager = get_model_manager()
+            response["memory_management"] = model_manager.get_model_stats()
+        except Exception as e:
+            logger.debug(f"Could not get memory management stats: {e}")
+
     return response
 
 

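The polling loop in `drain_connections` above can be reduced to a small reusable coroutine. This is a minimal sketch under stated assumptions: `drain`, `get_active`, and `poll` are hypothetical names, and the function returns a boolean instead of logging, which the commit's version does not do.

```python
import asyncio
from typing import Callable


async def drain(
    get_active: Callable[[], int],
    timeout: float = 30.0,
    poll: float = 1.0,
) -> bool:
    """Poll until get_active() returns 0 or the timeout elapses.

    Returns True if everything drained, False on timeout.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout  # monotonic clock of the event loop
    while get_active() > 0:
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll)
    return True
```

Using the event loop's monotonic clock, as the commit does, avoids surprises when the wall clock is adjusted during shutdown.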
@@ -212,26 +212,44 @@ class TableData:
         if self.caption:
             html.append(f"<caption>{self.caption}</caption>")
 
-        # Group cells by row
-        rows_data = {}
+        # Group cells by row and column for quick lookup
+        cell_map = {}
         for cell in self.cells:
-            if cell.row not in rows_data:
-                rows_data[cell.row] = []
-            rows_data[cell.row].append(cell)
+            cell_map[(cell.row, cell.col)] = cell
 
-        # Generate HTML
+        # Track which cells are covered by row/col spans
+        covered = set()
+        for cell in self.cells:
+            if cell.row_span > 1 or cell.col_span > 1:
+                for r in range(cell.row, cell.row + cell.row_span):
+                    for c in range(cell.col, cell.col + cell.col_span):
+                        if (r, c) != (cell.row, cell.col):
+                            covered.add((r, c))
+
+        # Generate HTML with proper column filling
         for row_idx in range(self.rows):
             html.append("<tr>")
-            if row_idx in rows_data:
-                for cell in sorted(rows_data[row_idx], key=lambda c: c.col):
+            for col_idx in range(self.cols):
+                # Skip cells covered by row/col spans
+                if (row_idx, col_idx) in covered:
+                    continue
+
+                cell = cell_map.get((row_idx, col_idx))
+                tag = "th" if row_idx == 0 and self.headers else "td"
+
+                if cell:
                     span_attrs = []
                     if cell.row_span > 1:
                         span_attrs.append(f'rowspan="{cell.row_span}"')
                     if cell.col_span > 1:
                         span_attrs.append(f'colspan="{cell.col_span}"')
                     span_str = " ".join(span_attrs)
-                    tag = "th" if row_idx == 0 and self.headers else "td"
-                    html.append(f'<{tag} {span_str}>{cell.content}</{tag}>')
+                    content = cell.content if cell.content else ""
+                    html.append(f'<{tag} {span_str}>{content}</{tag}>')
+                else:
+                    # Fill in empty cell for missing positions
+                    html.append(f'<{tag}></{tag}>')
+
             html.append("</tr>")
 
         html.append("</table>")

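The span bookkeeping in the hunk above can be tried out on its own. The sketch below repeats the same nested loops against a hypothetical `Cell` dataclass, a stand-in for `TableCell` that carries only the four fields the loops read.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Cell:
    """Hypothetical stand-in for TableCell with only the fields used here."""
    row: int
    col: int
    row_span: int = 1
    col_span: int = 1


def covered_positions(cells: List[Cell]) -> Set[Tuple[int, int]]:
    """Return grid positions hidden by another cell's rowspan/colspan."""
    covered: Set[Tuple[int, int]] = set()
    for cell in cells:
        for r in range(cell.row, cell.row + cell.row_span):
            for c in range(cell.col, cell.col + cell.col_span):
                # The anchor position itself still emits a <td>/<th>
                if (r, c) != (cell.row, cell.col):
                    covered.add((r, c))
    return covered


# A 2x2 cell anchored at (0, 0) hides the three other positions it spans
print(sorted(covered_positions([Cell(0, 0, row_span=2, col_span=2)])))
# prints [(0, 1), (1, 0), (1, 1)]
```

Skipping covered positions is what keeps the row from gaining extra empty cells once every missing position is filled with an empty `<td>`.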
@@ -39,6 +39,7 @@ from app.schemas.task import (
 from app.services.task_service import task_service
 from app.services.file_access_service import file_access_service
 from app.services.ocr_service import OCRService
+from app.services.service_pool import get_service_pool, PoolConfig
 
 # Import dual-track components
 try:
@@ -47,6 +48,13 @@ try:
 except ImportError:
     DUAL_TRACK_AVAILABLE = False
 
+# Service pool availability
+SERVICE_POOL_AVAILABLE = True
+try:
+    from app.services.memory_manager import get_model_manager
+except ImportError:
+    SERVICE_POOL_AVAILABLE = False
+
 logger = logging.getLogger(__name__)
 
 router = APIRouter(prefix="/api/v2/tasks", tags=["Tasks"])
@@ -63,7 +71,10 @@ def process_task_ocr(
     pp_structure_params: Optional[dict] = None
 ):
     """
-    Background task to process OCR for a task with dual-track support
+    Background task to process OCR for a task with dual-track support.
+
+    Uses OCRServicePool to acquire a shared service instance instead of
+    creating a new one, preventing GPU memory proliferation.
 
     Args:
         task_id: Task UUID string
@@ -80,6 +91,7 @@ def process_task_ocr(
 
     db = SessionLocal()
     start_time = datetime.now()
+    pooled_service = None
 
     try:
         logger.info(f"Starting OCR processing for task {task_id}, file: {filename}")
@@ -91,16 +103,39 @@ def process_task_ocr(
             logger.error(f"Task {task_id} not found in database")
             return
 
-        # Initialize OCR service
-        ocr_service = OCRService()
+        # Acquire OCR service from pool (or create new if pool disabled)
+        ocr_service = None
+        if settings.enable_service_pool and SERVICE_POOL_AVAILABLE:
+            try:
+                service_pool = get_service_pool()
+                pooled_service = service_pool.acquire(
+                    device="GPU:0",
+                    timeout=settings.service_acquire_timeout_seconds,
+                    task_id=task_id
+                )
+                if pooled_service:
+                    ocr_service = pooled_service.service
+                    logger.info(f"Acquired OCR service from pool for task {task_id}")
+                else:
+                    logger.warning(f"Timeout acquiring service from pool, creating new instance")
+            except Exception as e:
+                logger.warning(f"Failed to acquire service from pool: {e}, creating new instance")
+
+        # Fallback: create new instance if pool acquisition failed
+        if ocr_service is None:
+            logger.info("Creating new OCRService instance (pool disabled or unavailable)")
+            ocr_service = OCRService()
 
         # Create result directory before OCR processing (needed for saving extracted images)
         result_dir = Path(settings.result_dir) / task_id
         result_dir.mkdir(parents=True, exist_ok=True)
 
-        # Process the file with OCR (use dual-track if available)
-        if use_dual_track and hasattr(ocr_service, 'process'):
-            # Use new dual-track processing
+        # Process the file with OCR
+        # Use dual-track processing if:
+        # 1. use_dual_track is True (auto-detection)
+        # 2. OR force_track is specified (explicit track selection)
+        if (use_dual_track or force_track) and hasattr(ocr_service, 'process'):
+            # Use new dual-track processing (or forced track)
             ocr_result = ocr_service.process(
                 file_path=Path(file_path),
                 lang=language,
@@ -111,7 +146,7 @@ def process_task_ocr(
                 pp_structure_params=pp_structure_params
             )
         else:
-            # Fall back to traditional processing
+            # Fall back to traditional processing (no force_track support)
             ocr_result = ocr_service.process_image(
                 image_path=Path(file_path),
                 lang=language,
@@ -131,6 +166,16 @@ def process_task_ocr(
             source_file_path=Path(file_path)
         )
 
+        # Release service back to pool (success case)
+        if pooled_service:
+            try:
+                service_pool = get_service_pool()
+                service_pool.release(pooled_service, error=None)
+                logger.info(f"Released OCR service back to pool for task {task_id}")
+                pooled_service = None # Prevent double release in finally
+            except Exception as e:
+                logger.warning(f"Failed to release service to pool: {e}")
+
         # Close old session and create fresh one to avoid MySQL timeout
         # (long OCR processing may cause connection to become stale)
         db.close()
@@ -158,6 +203,15 @@ def process_task_ocr(
     except Exception as e:
         logger.exception(f"OCR processing failed for task {task_id}")
 
+        # Release service back to pool with error
+        if pooled_service:
+            try:
+                service_pool = get_service_pool()
+                service_pool.release(pooled_service, error=e)
+                pooled_service = None
+            except Exception as release_error:
+                logger.warning(f"Failed to release service to pool: {release_error}")
+
         # Update task status to failed (direct database update)
         try:
             task = db.query(Task).filter(Task.id == task_db_id).first()
@@ -170,6 +224,13 @@ def process_task_ocr(
             logger.error(f"Failed to update task status: {update_error}")
 
     finally:
+        # Ensure service is released in case of any missed release
+        if pooled_service:
+            try:
+                service_pool = get_service_pool()
+                service_pool.release(pooled_service, error=None)
+            except Exception:
+                pass
         db.close()
 
 
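The hunks above release the pooled service in three places (success path, `except`, and `finally`) and reset `pooled_service` to `None` to avoid a double release. One possible way to express the same guarantee once is a context manager. The sketch below is only an illustration: `FakePool` and `pooled` are hypothetical and do not reflect the real `OCRServicePool` API, whose `acquire` takes `device`, `timeout`, and `task_id` arguments.

```python
from contextlib import contextmanager
from typing import Iterator, List, Optional


class FakePool:
    """Hypothetical pool used only for this sketch; not the commit's OCRServicePool."""

    def __init__(self) -> None:
        self.released: List[Optional[BaseException]] = []

    def acquire(self) -> str:
        return "service"

    def release(self, service: str, error: Optional[BaseException] = None) -> None:
        # Record what the pool was told about the outcome
        self.released.append(error)


@contextmanager
def pooled(pool: FakePool) -> Iterator[str]:
    """Acquire a service and release it exactly once, passing along any error."""
    service = pool.acquire()
    try:
        yield service
    except Exception as exc:
        pool.release(service, error=exc)
        raise
    else:
        pool.release(service, error=None)
```

With this shape the caller writes `with pooled(pool) as ocr_service: ...` and the error, if any, still reaches the pool so it can decide whether the instance is healthy.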
@@ -330,7 +391,13 @@ async def get_task(
                 with open(result_path) as f:
                     result_data = json.load(f)
                     metadata = result_data.get("metadata", {})
-                    processing_track = metadata.get("processing_track")
+                    track_str = metadata.get("processing_track")
+                    # Convert string to enum to avoid Pydantic serialization warning
+                    if track_str:
+                        try:
+                            processing_track = ProcessingTrackEnum(track_str)
+                        except ValueError:
+                            processing_track = None
             except Exception:
                 pass # Silently ignore errors reading the result file
 

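The string-to-enum conversion in the hunk above follows a common pattern: look the value up by calling the enum class and treat `ValueError` as "unknown". A self-contained sketch is shown below; `Track` is a hypothetical stand-in for `ProcessingTrackEnum`, and its member values are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Track(str, Enum):
    """Hypothetical stand-in for ProcessingTrackEnum; member values are assumed."""
    OCR = "ocr"
    DIRECT = "direct"
    HYBRID = "hybrid"


def parse_track(value: Optional[str]) -> Optional[Track]:
    """Return the matching enum member, or None for missing or unknown values."""
    if not value:
        return None
    try:
        return Track(value)  # lookup by value; raises ValueError if unknown
    except ValueError:
        return None
```

Returning `None` for unknown strings keeps an older or hand-edited result file from breaking the task detail response.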
@@ -247,9 +247,11 @@ class DirectExtractionEngine:
                 element_counter += len(image_elements)
 
             # Extract vector graphics (charts, diagrams) from drawing commands
+            # Pass table_bboxes to filter out table border drawings before clustering
             if self.enable_image_extraction:
                 vector_elements = self._extract_vector_graphics(
-                    page, page_num, document_id, element_counter, output_dir
+                    page, page_num, document_id, element_counter, output_dir,
+                    table_bboxes=table_bboxes
                 )
                 elements.extend(vector_elements)
                 element_counter += len(vector_elements)
@@ -705,40 +707,52 @@ class DirectExtractionEngine:
             y1=bbox_data[3]
         )
 
-        # Extract column widths from table cells
+        # Extract column widths from table cells by analyzing X boundaries
         column_widths = []
         if hasattr(table, 'cells') and table.cells:
-            # Group cells by column
-            cols_x = {}
+            # Collect all unique X boundaries (both left and right edges)
+            x_boundaries = set()
             for cell in table.cells:
-                col_idx = None
-                # Determine column index by x0 position
-                for idx, x0 in enumerate(sorted(set(c[0] for c in table.cells))):
-                    if abs(cell[0] - x0) < 1.0: # Within 1pt tolerance
-                        col_idx = idx
-                        break
+                x_boundaries.add(round(cell[0], 1)) # x0 (left edge)
+                x_boundaries.add(round(cell[2], 1)) # x1 (right edge)
 
-                if col_idx is not None:
-                    if col_idx not in cols_x:
-                        cols_x[col_idx] = {'x0': cell[0], 'x1': cell[2]}
-                    else:
-                        cols_x[col_idx]['x1'] = max(cols_x[col_idx]['x1'], cell[2])
+            # Sort boundaries to get column edges
+            sorted_x = sorted(x_boundaries)
 
-            # Calculate width for each column
-            for col_idx in sorted(cols_x.keys()):
-                width = cols_x[col_idx]['x1'] - cols_x[col_idx]['x0']
-                column_widths.append(width)
+            # Calculate column widths from adjacent boundaries
+            if len(sorted_x) >= 2:
+                column_widths = [sorted_x[i+1] - sorted_x[i] for i in range(len(sorted_x)-1)]
+                logger.debug(f"Calculated column widths from {len(sorted_x)} boundaries: {column_widths}")
+
+        # Extract row heights from table cells by analyzing Y boundaries
+        row_heights = []
+        if hasattr(table, 'cells') and table.cells:
+            # Collect all unique Y boundaries (both top and bottom edges)
+            y_boundaries = set()
+            for cell in table.cells:
+                y_boundaries.add(round(cell[1], 1)) # y0 (top edge)
+                y_boundaries.add(round(cell[3], 1)) # y1 (bottom edge)
+
+            # Sort boundaries to get row edges
+            sorted_y = sorted(y_boundaries)
+
+            # Calculate row heights from adjacent boundaries
+            if len(sorted_y) >= 2:
+                row_heights = [sorted_y[i+1] - sorted_y[i] for i in range(len(sorted_y)-1)]
+                logger.debug(f"Calculated row heights from {len(sorted_y)} boundaries: {row_heights}")
 
         # Create table cells
+        # Note: Include ALL cells (even empty ones) to preserve table structure
+        # This is critical for correct HTML generation and PDF rendering
         cells = []
         for row_idx, row in enumerate(data):
             for col_idx, cell_text in enumerate(row):
-                if cell_text:
-                    cells.append(TableCell(
-                        row=row_idx,
-                        col=col_idx,
-                        content=str(cell_text) if cell_text else ""
-                    ))
+                # Always add cell, even if empty, to maintain table structure
+                cells.append(TableCell(
+                    row=row_idx,
+                    col=col_idx,
+                    content=str(cell_text) if cell_text else ""
+                ))
 
         # Create table data
         table_data = TableData(
@@ -748,8 +762,13 @@ class DirectExtractionEngine:
             headers=data[0] if data else None # Assume first row is header
         )
 
-        # Store column widths in metadata
-        metadata = {"column_widths": column_widths} if column_widths else None
+        # Store column widths and row heights in metadata
+        metadata = {}
+        if column_widths:
+            metadata["column_widths"] = column_widths
+        if row_heights:
+            metadata["row_heights"] = row_heights
+        metadata = metadata if metadata else None
 
         return DocumentElement(
             element_id=f"table_{page_num}_{counter}",
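The boundary-based calculation of `column_widths` and `row_heights` in the table hunks above can be exercised with plain tuples. In the sketch below, `sizes_from_boundaries` is a hypothetical helper that folds the two near-identical loops into one; cells are assumed to be `(x0, y0, x1, y1)` tuples, matching the indices used in the diff.

```python
from typing import List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1), as indexed in the diff


def sizes_from_boundaries(cells: Sequence[BBox], lo: int, hi: int) -> List[float]:
    """Collect rounded edge coordinates (indices lo/hi) and return gaps between them."""
    edges = set()
    for cell in cells:
        edges.add(round(cell[lo], 1))
        edges.add(round(cell[hi], 1))
    ordered = sorted(edges)
    # Each pair of adjacent edges delimits one column (or row)
    return [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]


# Two columns (widths 100 and 50) and two rows (heights 20 and 30)
cells = [
    (0.0, 0.0, 100.0, 20.0), (100.0, 0.0, 150.0, 20.0),
    (0.0, 20.0, 100.0, 50.0), (100.0, 20.0, 150.0, 50.0),
]
column_widths = sizes_from_boundaries(cells, 0, 2)
row_heights = sizes_from_boundaries(cells, 1, 3)
```

A merged cell adds no edges of its own inside its span, so the column count stays correct as long as some other row exposes the inner boundary. Rounding to one decimal place merges edges that differ only by sub-0.05pt noise, although two values straddling a rounding boundary can still produce a spurious narrow column.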
@@ -978,7 +997,9 @@ class DirectExtractionEngine:
                     image_filename = f"{document_id}_p{page_num}_img{img_idx}.png"
                     image_path = output_dir / image_filename
                     pix.save(str(image_path))
-                    image_data["saved_path"] = str(image_path)
+                    # Store relative filename only (consistent with OCR track)
+                    # PDF generator will join with result_dir to get full path
+                    image_data["saved_path"] = image_filename
                     logger.debug(f"Saved image to {image_path}")
 
                 element = DocumentElement(
@@ -1001,12 +1022,272 @@ class DirectExtractionEngine:
|
||||
|
||||
return elements
|
||||
|
||||
def has_missing_images(self, page: fitz.Page) -> bool:
|
||||
"""
|
||||
Detect if a page likely has images that weren't extracted.
|
||||
|
||||
This checks for inline image blocks (type=1 in text dict) which indicate
|
||||
graphics composed of many small image blocks (like logos) that
|
||||
page.get_images() cannot detect.
|
||||
|
||||
Args:
|
||||
page: PyMuPDF page object
|
||||
|
||||
Returns:
|
||||
True if there are likely missing images that need OCR extraction
|
||||
"""
|
||||
try:
|
||||
# Check if get_images found anything
|
||||
standard_images = page.get_images()
|
||||
if standard_images:
|
||||
return False # Standard images were found, no need for fallback
|
||||
|
||||
# Check for inline image blocks (type=1)
|
||||
text_dict = page.get_text("dict", sort=True)
|
||||
blocks = text_dict.get("blocks", [])
|
||||
|
||||
image_block_count = sum(1 for b in blocks if b.get("type") == 1)
|
||||
|
||||
# If there are many inline image blocks, likely there's a logo or graphic
|
||||
if image_block_count >= 10:
|
||||
logger.info(f"Detected {image_block_count} inline image blocks - may need OCR for image extraction")
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Error checking for missing images: {e}")
|
||||
return False
|
||||
|
||||
def check_document_for_missing_images(self, pdf_path: Path) -> List[int]:
|
||||
"""
|
||||
Check a PDF document for pages that likely have missing images.
|
||||
|
||||
This opens the PDF and checks each page for inline image blocks
|
||||
that weren't extracted by get_images().
|
||||
|
||||
Args:
|
||||
pdf_path: Path to the PDF file
|
||||
|
||||
Returns:
|
||||
List of page numbers (1-indexed) that have missing images
|
||||
"""
|
||||
pages_with_missing_images = []
|
||||
|
||||
try:
|
||||
doc = fitz.open(str(pdf_path))
|
||||
for page_num in range(len(doc)):
|
||||
page = doc[page_num]
|
||||
if self.has_missing_images(page):
|
||||
pages_with_missing_images.append(page_num + 1) # 1-indexed
|
||||
doc.close()
|
||||
|
||||
if pages_with_missing_images:
|
||||
logger.info(f"Document has missing images on pages: {pages_with_missing_images}")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error checking document for missing images: {e}")
|
||||
|
||||
return pages_with_missing_images
|
||||
|
||||
def render_inline_image_regions(
|
||||
self,
|
||||
pdf_path: Path,
|
||||
unified_doc: 'UnifiedDocument',
|
||||
pages: List[int],
|
||||
output_dir: Optional[Path] = None
|
||||
) -> int:
|
||||
"""
|
||||
Render inline image regions and add them to the unified document.
|
||||
|
||||
This is a fallback when OCR doesn't detect images. It clusters inline
|
||||
image blocks (type=1) and renders them as images.
|
||||
|
||||
Args:
|
||||
pdf_path: Path to the PDF file
|
||||
unified_doc: UnifiedDocument to add images to
|
||||
pages: List of page numbers (1-indexed) to process
|
||||
output_dir: Directory to save rendered images
|
||||
|
||||
Returns:
|
||||
Number of images added
|
||||
"""
|
||||
images_added = 0
|
||||
|
||||
try:
|
||||
doc = fitz.open(str(pdf_path))
|
||||
|
||||
for page_num in pages:
|
||||
if page_num < 1 or page_num > len(doc):
|
||||
continue
|
||||
|
||||
page = doc[page_num - 1] # 0-indexed
|
||||
page_rect = page.rect
|
||||
|
||||
# Get inline image blocks
|
||||
text_dict = page.get_text("dict", sort=True)
|
||||
blocks = text_dict.get("blocks", [])
|
||||
|
||||
image_blocks = []
|
||||
for block in blocks:
|
||||
if block.get("type") == 1: # Image block
|
||||
bbox = block.get("bbox")
|
||||
if bbox:
|
||||
image_blocks.append(fitz.Rect(bbox))
|
||||
|
||||
if len(image_blocks) < 5: # Reduced from 10
|
||||
logger.debug(f"Page {page_num}: Only {len(image_blocks)} inline image blocks, skipping")
|
||||
continue
|
||||
|
||||
logger.info(f"Page {page_num}: Found {len(image_blocks)} inline image blocks")
|
||||
|
||||
# Cluster nearby image blocks
|
||||
regions = self._cluster_nearby_rects(image_blocks, tolerance=5.0)
|
||||
logger.info(f"Page {page_num}: Clustered into {len(regions)} regions")
|
||||
|
||||
# Find the corresponding page in unified_doc
|
||||
target_page = None
|
||||
for p in unified_doc.pages:
|
||||
if p.page_number == page_num:
|
||||
target_page = p
|
||||
break
|
||||
|
||||
if not target_page:
|
||||
continue
|
||||
|
||||
for region_idx, region_rect in enumerate(regions):
|
||||
logger.info(f"Page {page_num} region {region_idx}: {region_rect} (w={region_rect.width:.1f}, h={region_rect.height:.1f})")
|
||||
|
||||
# Skip very small regions
|
||||
if region_rect.width < 30 or region_rect.height < 30:
|
||||
logger.info(f" -> Skipped: too small (min 30x30)")
|
||||
continue
|
||||
|
||||
# Skip regions whose top edge lies below the top 40% of the page (likely the table area).
# Regions that START in the top portion are kept, even if they extend further down.
|
||||
page_30_pct = page_rect.height * 0.3
|
||||
page_40_pct = page_rect.height * 0.4
|
||||
if region_rect.y0 > page_40_pct:
|
||||
logger.info(f" -> Skipped: y0={region_rect.y0:.1f} > 40% of page ({page_40_pct:.1f})")
|
||||
continue
|
||||
|
||||
logger.info(f"Rendering inline image region {region_idx} on page {page_num}: {region_rect}")
|
||||
|
||||
try:
|
||||
# Add small padding
|
||||
clip_rect = region_rect + (-2, -2, 2, 2)
|
||||
clip_rect.intersect(page_rect)
|
||||
|
||||
# Render at 2x resolution
|
||||
mat = fitz.Matrix(2, 2)
|
||||
pix = page.get_pixmap(clip=clip_rect, matrix=mat, alpha=False)
|
||||
|
||||
# Create bounding box
|
||||
bbox = BoundingBox(
|
||||
x0=clip_rect.x0,
|
||||
y0=clip_rect.y0,
|
||||
x1=clip_rect.x1,
|
||||
y1=clip_rect.y1
|
||||
)
|
||||
|
||||
image_data = {
|
||||
"width": pix.width,
|
||||
"height": pix.height,
|
||||
"colorspace": "rgb",
|
||||
"type": "inline_region"
|
||||
}
|
||||
|
||||
# Save image if output directory provided
|
||||
if output_dir:
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
doc_id = unified_doc.document_id or "unknown"
|
||||
image_filename = f"{doc_id}_p{page_num}_logo{region_idx}.png"
|
||||
image_path = output_dir / image_filename
|
||||
pix.save(str(image_path))
|
||||
image_data["saved_path"] = image_filename
|
||||
logger.info(f"Saved inline image region to {image_path}")
|
||||
|
||||
element = DocumentElement(
|
||||
element_id=f"logo_{page_num}_{region_idx}",
|
||||
type=ElementType.LOGO,
|
||||
content=image_data,
|
||||
bbox=bbox,
|
||||
confidence=0.9,
|
||||
metadata={
|
||||
"region_type": "inline_image_blocks",
|
||||
"block_count": len(image_blocks)
|
||||
}
|
||||
)
|
||||
target_page.elements.append(element)
|
||||
images_added += 1
|
||||
|
||||
pix = None # Free memory
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error rendering inline image region {region_idx}: {e}")
|
||||
|
||||
doc.close()
|
||||
|
||||
if images_added > 0:
|
||||
current_images = unified_doc.metadata.total_images or 0
|
||||
unified_doc.metadata.total_images = current_images + images_added
|
||||
logger.info(f"Added {images_added} inline image regions to document")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error rendering inline image regions: {e}")
|
||||
|
||||
return images_added
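The per-region checks in `render_inline_image_regions()` (minimum 30x30 size, top edge within the top 40% of the page) can be sketched on their own. This is a minimal sketch, not the method itself: plain tuples stand in for `fitz.Rect`, and `keep_region` is a hypothetical helper name that does not exist in the commit.

```python
from typing import Tuple

# (x0, y0, x1, y1) in page points; a stand-in for fitz.Rect (assumption)
Rect = Tuple[float, float, float, float]


def keep_region(region: Rect, page_height: float,
                min_size: float = 30.0, max_top_ratio: float = 0.4) -> bool:
    """Return True if a clustered region passes the size and position checks."""
    x0, y0, x1, y1 = region
    width, height = x1 - x0, y1 - y0
    if width < min_size or height < min_size:
        return False  # too small to be a logo or header image
    if y0 > page_height * max_top_ratio:
        return False  # top edge starts below the top 40% of the page
    return True
```

Only the top edge (`y0`) is tested, so a tall region that starts near the top of the page is kept even if it extends into the lower part.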
|
||||
|
||||
def _cluster_nearby_rects(self, rects: List[fitz.Rect], tolerance: float = 5.0) -> List[fitz.Rect]:
|
||||
"""Cluster nearby rectangles into regions."""
|
||||
if not rects:
|
||||
return []
|
||||
|
||||
sorted_rects = sorted(rects, key=lambda r: (r.y0, r.x0))
|
||||
|
||||
merged = []
|
||||
for rect in sorted_rects:
|
||||
merged_with_existing = False
|
||||
for i, region in enumerate(merged):
|
||||
expanded = region + (-tolerance, -tolerance, tolerance, tolerance)
|
||||
if expanded.intersects(rect):
|
||||
merged[i] = region | rect
|
||||
merged_with_existing = True
|
||||
break
|
||||
if not merged_with_existing:
|
||||
merged.append(rect)
|
||||
|
||||
# Second pass: merge any regions that now overlap
|
||||
changed = True
|
||||
while changed:
|
||||
changed = False
|
||||
new_merged = []
|
||||
skip = set()
|
||||
|
||||
for i, r1 in enumerate(merged):
|
||||
if i in skip:
|
||||
continue
|
||||
current = r1
|
||||
for j, r2 in enumerate(merged[i+1:], start=i+1):
|
||||
if j in skip:
|
||||
continue
|
||||
expanded = current + (-tolerance, -tolerance, tolerance, tolerance)
|
||||
if expanded.intersects(r2):
|
||||
current = current | r2
|
||||
skip.add(j)
|
||||
changed = True
|
||||
new_merged.append(current)
|
||||
merged = new_merged
|
||||
|
||||
return merged
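The two-pass merge in `_cluster_nearby_rects()` can be illustrated without PyMuPDF. The sketch below is an approximation under stated assumptions: rectangles are plain `(x0, y0, x1, y1)` tuples, `cluster_rects` is a hypothetical name, and the overlap test uses strict inequalities, which may differ from `fitz.Rect.intersects()` for rectangles that merely touch.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), stand-in for fitz.Rect


def _near(a: Rect, b: Rect, tol: float) -> bool:
    """True if `a`, grown by `tol` on every side, overlaps `b`."""
    return (a[0] - tol < b[2] and b[0] < a[2] + tol and
            a[1] - tol < b[3] and b[1] < a[3] + tol)


def _union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle containing both `a` and `b`."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def _merge_pass(rects: List[Rect], tol: float) -> Tuple[List[Rect], bool]:
    """Fold each rect into the first nearby region; report whether anything merged."""
    regions: List[Rect] = []
    changed = False
    for rect in rects:
        for i, region in enumerate(regions):
            if _near(region, rect, tol):
                regions[i] = _union(region, rect)
                changed = True
                break
        else:
            regions.append(rect)
    return regions, changed


def cluster_rects(rects: List[Rect], tol: float = 5.0) -> List[Rect]:
    """Greedy merge in reading order, then repeat until no regions merge."""
    merged, _ = _merge_pass(sorted(rects, key=lambda r: (r[1], r[0])), tol)
    changed = True
    while changed:
        merged, changed = _merge_pass(merged, tol)
    return merged
```

The repeat-until-stable loop matters because a region can grow after an earlier rectangle was compared against it, so two regions that were apart during the first pass may overlap later.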
|
||||
|
||||
     def _extract_vector_graphics(self,
                                  page: fitz.Page,
                                  page_num: int,
                                  document_id: str,
                                  counter: int,
-                                 output_dir: Optional[Path]) -> List[DocumentElement]:
+                                 output_dir: Optional[Path],
+                                 table_bboxes: Optional[List[BoundingBox]] = None) -> List[DocumentElement]:
|
||||
"""
|
||||
Extract vector graphics (charts, diagrams) from page.
|
||||
|
||||
@@ -1020,6 +1301,7 @@ class DirectExtractionEngine:
|
||||
document_id: Unique document identifier
|
||||
counter: Starting counter for element IDs
|
||||
output_dir: Directory to save rendered graphics
|
||||
+            table_bboxes: List of table bounding boxes to exclude table border drawings
|
||||
|
||||
Returns:
|
||||
List of DocumentElement objects representing vector graphics
|
||||
@@ -1034,16 +1316,25 @@ class DirectExtractionEngine:
|
||||
|
||||
logger.debug(f"Page {page_num} contains {len(drawings)} vector drawing commands")
|
||||
|
||||
# Filter out drawings that are likely table borders
|
||||
# Table borders are typically thin rectangular lines within table regions
|
||||
non_table_drawings = self._filter_table_border_drawings(drawings, table_bboxes)
|
||||
logger.debug(f"After filtering table borders: {len(non_table_drawings)} drawings remain")
|
||||
|
||||
if not non_table_drawings:
|
||||
logger.debug("All drawings appear to be table borders, no vector graphics to extract")
|
||||
return elements
|
||||
|
||||
# Cluster drawings into groups (charts, diagrams, etc.)
|
||||
         try:
-            # PyMuPDF's cluster_drawings() groups nearby drawings automatically
-            drawing_clusters = page.cluster_drawings()
+            # Use custom clustering that only considers non-table drawings
+            drawing_clusters = self._cluster_non_table_drawings(page, non_table_drawings)
             logger.debug(f"Clustered into {len(drawing_clusters)} groups")
         except (AttributeError, TypeError) as e:
             # cluster_drawings not available or has different signature
             # Fallback: try to identify charts by analyzing drawing density
-            logger.warning(f"cluster_drawings() failed ({e}), using fallback method")
-            drawing_clusters = self._cluster_drawings_fallback(page, drawings)
+            logger.warning(f"Custom clustering failed ({e}), using fallback method")
+            drawing_clusters = self._cluster_drawings_fallback(page, non_table_drawings)
|
||||
|
||||
for cluster_idx, bbox in enumerate(drawing_clusters):
|
||||
# Ignore small regions (likely noise or separator lines)
|
||||
@@ -1148,6 +1439,124 @@ class DirectExtractionEngine:
|
||||
|
||||
return filtered_clusters
|
||||
|
||||
def _filter_table_border_drawings(self, drawings: list, table_bboxes: Optional[List[BoundingBox]]) -> list:
|
||||
"""
|
||||
Filter out drawings that are likely table borders.
|
||||
|
||||
Table borders are typically:
|
||||
- Thin rectangular lines (height or width < 5pt)
|
||||
- Located within or on the edge of table bounding boxes
|
||||
|
||||
Args:
|
||||
drawings: List of PyMuPDF drawing objects
|
||||
table_bboxes: List of table bounding boxes
|
||||
|
||||
Returns:
|
||||
List of drawings that are NOT table borders (likely logos, charts, etc.)
|
||||
"""
|
||||
if not table_bboxes:
|
||||
return drawings
|
||||
|
||||
non_table_drawings = []
|
||||
table_border_count = 0
|
||||
|
||||
for drawing in drawings:
|
||||
rect = drawing.get('rect')
|
||||
if not rect:
|
||||
continue
|
||||
|
||||
draw_rect = fitz.Rect(rect)
|
||||
|
||||
# Check if this drawing is a thin line (potential table border)
|
||||
is_thin_line = draw_rect.width < 5 or draw_rect.height < 5
|
||||
|
||||
# Check if drawing overlaps significantly with any table
|
||||
overlaps_table = False
|
||||
for table_bbox in table_bboxes:
|
||||
table_rect = fitz.Rect(table_bbox.x0, table_bbox.y0, table_bbox.x1, table_bbox.y1)
|
||||
|
||||
# Expand table rect slightly to include border lines on edges
|
||||
expanded_table = table_rect + (-5, -5, 5, 5)
|
||||
|
||||
if expanded_table.contains(draw_rect) or expanded_table.intersects(draw_rect):
|
||||
# Calculate overlap ratio
|
||||
intersection = draw_rect & expanded_table
|
||||
if not intersection.is_empty:
|
||||
overlap_ratio = intersection.get_area() / draw_rect.get_area() if draw_rect.get_area() > 0 else 0
|
||||
|
||||
# If drawing is mostly inside table region, it's likely a border
|
||||
if overlap_ratio > 0.8:
|
||||
overlaps_table = True
|
||||
break
|
||||
|
||||
# Keep drawing if it's NOT (thin line AND overlapping table)
|
||||
# This keeps: logos (complex shapes), charts outside tables, etc.
|
||||
if is_thin_line and overlaps_table:
|
||||
table_border_count += 1
|
||||
else:
|
||||
non_table_drawings.append(drawing)
|
||||
|
||||
if table_border_count > 0:
|
||||
logger.debug(f"Filtered out {table_border_count} table border drawings")
|
||||
|
||||
return non_table_drawings
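The rule in `_filter_table_border_drawings()` is: a drawing is treated as a table border only if it is thin (width or height under 5pt) and more than 80% of its area lies inside a table box expanded by 5pt. A minimal sketch of that rule, assuming plain `(x0, y0, x1, y1)` tuples instead of `fitz.Rect` and using hypothetical helper names:

```python
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), stand-in for fitz.Rect


def overlap_ratio(inner: Rect, outer: Rect) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (0.0 if no overlap)."""
    ix0, iy0 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix1, iy1 = min(inner[2], outer[2]), min(inner[3], outer[3])
    if ix1 <= ix0 or iy1 <= iy0:
        return 0.0
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return (ix1 - ix0) * (iy1 - iy0) / area if area > 0 else 0.0


def is_table_border(drawing: Rect, tables: Iterable[Rect],
                    thin: float = 5.0, margin: float = 5.0,
                    min_overlap: float = 0.8) -> bool:
    """True only for thin drawings that sit mostly inside an expanded table box."""
    width, height = drawing[2] - drawing[0], drawing[3] - drawing[1]
    if not (width < thin or height < thin):
        return False  # not a thin line, so keep it (logo, chart, ...)
    for t in tables:
        expanded = (t[0] - margin, t[1] - margin, t[2] + margin, t[3] + margin)
        if overlap_ratio(drawing, expanded) > min_overlap:
            return True
    return False
```

Requiring both conditions keeps thick shapes inside tables (for example an embedded logo) and thin rules outside tables (for example a page separator) in the candidate set. As in the diff, a zero-area drawing yields a ratio of 0 and is therefore not filtered.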
|
||||
|
||||
def _cluster_non_table_drawings(self, page: fitz.Page, drawings: list) -> list:
|
||||
"""
|
||||
Cluster non-table drawings into groups.
|
||||
|
||||
This method clusters drawings that have been pre-filtered to exclude table borders.
|
||||
It uses a more conservative clustering approach suitable for logos and charts.
|
||||
|
||||
Args:
|
||||
page: PyMuPDF page object
|
||||
drawings: Pre-filtered list of drawings (excluding table borders)
|
||||
|
||||
Returns:
|
||||
List of fitz.Rect representing clustered drawing regions
|
||||
"""
|
||||
if not drawings:
|
||||
return []
|
||||
|
||||
# Collect all drawing bounding boxes
|
||||
bboxes = []
|
||||
for drawing in drawings:
|
||||
rect = drawing.get('rect')
|
||||
if rect:
|
||||
bboxes.append(fitz.Rect(rect))
|
||||
|
||||
if not bboxes:
|
||||
return []
|
||||
|
||||
# More conservative clustering with smaller tolerance
|
||||
# This prevents grouping distant graphics together
|
||||
clusters = []
|
||||
tolerance = 10 # Smaller tolerance than fallback (was 20)
|
||||
|
||||
for bbox in bboxes:
|
||||
# Try to merge with existing cluster
|
||||
merged = False
|
||||
for i, cluster in enumerate(clusters):
|
||||
# Check if bbox is close to this cluster
|
||||
expanded_cluster = cluster + (-tolerance, -tolerance, tolerance, tolerance)
|
||||
if expanded_cluster.intersects(bbox):
|
||||
# Merge bbox into cluster
|
||||
clusters[i] = cluster | bbox # Union of rectangles
|
||||
merged = True
|
||||
break
|
||||
|
||||
if not merged:
|
||||
# Create new cluster
|
||||
clusters.append(bbox)
|
||||
|
||||
# Filter out very small clusters (noise)
|
||||
# Keep minimum 30x30 for logos (smaller than default 50x50)
|
||||
filtered_clusters = [c for c in clusters if c.width >= 30 and c.height >= 30]
|
||||
|
||||
logger.debug(f"Non-table clustering: {len(bboxes)} drawings -> {len(clusters)} clusters -> {len(filtered_clusters)} filtered")
|
||||
|
||||
return filtered_clusters
|
||||
|
||||
def _deduplicate_table_chart_overlap(self, elements: List[DocumentElement]) -> List[DocumentElement]:
|
||||
"""
|
||||
Intelligently resolve TABLE-CHART overlaps based on table structure completeness.
|
||||
|
||||
2269
backend/app/services/memory_manager.py
Normal file
File diff suppressed because it is too large
@@ -25,6 +25,7 @@ except ImportError:
|
||||
|
||||
from app.core.config import settings
|
||||
from app.services.office_converter import OfficeConverter, OfficeConverterError
|
||||
from app.services.memory_manager import get_model_manager, MemoryConfig, MemoryGuard, prediction_context
|
||||
|
||||
# Import dual-track components
|
||||
try:
|
||||
@@ -96,6 +97,26 @@ class OCRService:
|
||||
self._model_last_used = {} # Track last usage time for each model
|
||||
self._memory_warning_logged = False
|
||||
|
||||
# Initialize MemoryGuard for enhanced memory monitoring
|
||||
self._memory_guard = None
|
||||
if settings.enable_model_lifecycle_management:
|
||||
try:
|
||||
memory_config = MemoryConfig(
|
||||
warning_threshold=settings.memory_warning_threshold,
|
||||
critical_threshold=settings.memory_critical_threshold,
|
||||
emergency_threshold=settings.memory_emergency_threshold,
|
||||
model_idle_timeout_seconds=settings.pp_structure_idle_timeout_seconds,
|
||||
gpu_memory_limit_mb=settings.gpu_memory_limit_mb,
|
||||
enable_cpu_fallback=settings.enable_cpu_fallback,
|
||||
)
|
||||
self._memory_guard = MemoryGuard(memory_config)
|
||||
logger.debug("MemoryGuard initialized for OCRService")
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to initialize MemoryGuard: {e}")
|
||||
|
||||
# Track if CPU fallback was activated
|
||||
self._cpu_fallback_active = False
|
||||
|
||||
self._detect_and_configure_gpu()
|
||||
|
||||
# Log GPU optimization settings
|
||||
@@ -217,53 +238,91 @@ class OCRService:
|
||||
     def _check_gpu_memory_usage(self):
         """
         Check GPU memory usage and log warnings if approaching limits.
-        Implements memory optimization for RTX 4060 8GB.
+        Uses MemoryGuard for enhanced monitoring with multiple backends.
         """
         if not self.use_gpu or not settings.enable_memory_optimization:
             return

         try:
-            device_id = self.gpu_info.get('device_id', 0)
-            memory_allocated = paddle.device.cuda.memory_allocated(device_id)
-            memory_allocated_mb = memory_allocated / (1024**2)
-            memory_limit_mb = settings.gpu_memory_limit_mb
+            # Use MemoryGuard if available for better monitoring
+            if self._memory_guard:
+                stats = self._memory_guard.get_memory_stats()

-            utilization = (memory_allocated_mb / memory_limit_mb * 100) if memory_limit_mb > 0 else 0
+                # Log based on usage ratio
+                if stats.gpu_used_ratio > 0.90 and not self._memory_warning_logged:
+                    logger.warning(
+                        f"GPU memory usage critical: {stats.gpu_used_mb:.0f}MB / {stats.gpu_total_mb:.0f}MB "
+                        f"({stats.gpu_used_ratio*100:.1f}%)"
+                    )
+                    logger.warning("Consider enabling auto_unload_unused_models or reducing batch size")
+                    self._memory_warning_logged = True

-            if utilization > 90 and not self._memory_warning_logged:
-                logger.warning(f"GPU memory usage high: {memory_allocated_mb:.0f}MB / {memory_limit_mb}MB ({utilization:.1f}%)")
-                logger.warning("Consider enabling auto_unload_unused_models or reducing batch size")
-                self._memory_warning_logged = True
-            elif utilization > 75:
-                logger.info(f"GPU memory: {memory_allocated_mb:.0f}MB / {memory_limit_mb}MB ({utilization:.1f}%)")
+                    # Trigger emergency cleanup if enabled
+                    if settings.enable_emergency_cleanup:
+                        self._cleanup_unused_models()
+                        self._memory_guard.clear_gpu_cache()
+
+                elif stats.gpu_used_ratio > 0.75:
+                    logger.info(
+                        f"GPU memory: {stats.gpu_used_mb:.0f}MB / {stats.gpu_total_mb:.0f}MB "
+                        f"({stats.gpu_used_ratio*100:.1f}%)"
+                    )
+            else:
+                # Fallback to original implementation
+                device_id = self.gpu_info.get('device_id', 0)
+                memory_allocated = paddle.device.cuda.memory_allocated(device_id)
+                memory_allocated_mb = memory_allocated / (1024**2)
+                memory_limit_mb = settings.gpu_memory_limit_mb
+
+                utilization = (memory_allocated_mb / memory_limit_mb * 100) if memory_limit_mb > 0 else 0
+
+                if utilization > 90 and not self._memory_warning_logged:
+                    logger.warning(f"GPU memory usage high: {memory_allocated_mb:.0f}MB / {memory_limit_mb}MB ({utilization:.1f}%)")
+                    logger.warning("Consider enabling auto_unload_unused_models or reducing batch size")
+                    self._memory_warning_logged = True
+                elif utilization > 75:
+                    logger.info(f"GPU memory: {memory_allocated_mb:.0f}MB / {memory_limit_mb}MB ({utilization:.1f}%)")

         except Exception as e:
             logger.debug(f"Memory check failed: {e}")
|
||||
|
||||
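The MemoryGuard branch of `_check_gpu_memory_usage()` compares the usage ratio against the literals 0.90 and 0.75, which are separate from the `memory_warning_threshold`, `memory_critical_threshold` and `memory_emergency_threshold` settings added in this commit. A minimal sketch of that two-level classification, with `classify_gpu_usage` and the level names as hypothetical illustrations:

```python
def classify_gpu_usage(used_mb: float, total_mb: float,
                       high: float = 0.90, elevated: float = 0.75) -> str:
    """Map a used/total memory pair to a coarse usage level."""
    if total_mb <= 0:
        return "unknown"  # avoid division by zero when the total is not known
    ratio = used_mb / total_mb
    if ratio > high:
        return "critical"   # the diff logs a warning and may trigger cleanup here
    if ratio > elevated:
        return "elevated"   # the diff logs an informational message here
    return "ok"
```

Passing the configured thresholds as `high` and `elevated` would be one way to keep the log levels and the settings in step, though the commit does not do this.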
     def _cleanup_unused_models(self):
         """
-        Clean up unused language models to free GPU memory.
+        Clean up unused models (including PP-StructureV3) to free GPU memory.
         Models idle longer than model_idle_timeout_seconds will be unloaded.
+
+        Note: PP-StructureV3 is NO LONGER exempted from cleanup - it will be
+        unloaded based on pp_structure_idle_timeout_seconds configuration.
         """
         if not settings.auto_unload_unused_models:
             return

         current_time = datetime.now()
-        timeout = settings.model_idle_timeout_seconds
         models_to_remove = []

         for lang, last_used in self._model_last_used.items():
-            if lang == 'structure':  # Don't unload structure engine
-                continue
+            # Use different timeout for structure engine vs language models
+            if lang == 'structure':
+                timeout = settings.pp_structure_idle_timeout_seconds
+            else:
+                timeout = settings.model_idle_timeout_seconds
+
             idle_seconds = (current_time - last_used).total_seconds()
             if idle_seconds > timeout:
                 models_to_remove.append(lang)

-        for lang in models_to_remove:
-            if lang in self.ocr_engines:
-                logger.info(f"Unloading idle OCR engine for {lang} (idle {timeout}s)")
-                del self.ocr_engines[lang]
-                del self._model_last_used[lang]
+        for model_key in models_to_remove:
+            if model_key == 'structure':
+                if self.structure_engine is not None:
+                    logger.info(f"Unloading idle PP-StructureV3 engine (idle {settings.pp_structure_idle_timeout_seconds}s)")
+                    self._unload_structure_engine()
+                if model_key in self._model_last_used:
+                    del self._model_last_used[model_key]
+            elif model_key in self.ocr_engines:
+                logger.info(f"Unloading idle OCR engine for {model_key} (idle {settings.model_idle_timeout_seconds}s)")
+                del self.ocr_engines[model_key]
+                if model_key in self._model_last_used:
+                    del self._model_last_used[model_key]

         if models_to_remove and self.use_gpu:
             # Clear CUDA cache
|
||||
@@ -273,6 +332,41 @@ class OCRService:
|
||||
except Exception as e:
|
||||
logger.debug(f"Cache clear failed: {e}")
|
||||
|
||||
def _unload_structure_engine(self):
|
||||
"""
|
||||
Properly unload PP-StructureV3 engine and free GPU memory.
|
||||
"""
|
||||
if self.structure_engine is None:
|
||||
return
|
||||
|
||||
try:
|
||||
# Clear internal engine components
|
||||
if hasattr(self.structure_engine, 'table_engine'):
|
||||
self.structure_engine.table_engine = None
|
||||
if hasattr(self.structure_engine, 'text_detector'):
|
||||
self.structure_engine.text_detector = None
|
||||
if hasattr(self.structure_engine, 'text_recognizer'):
|
||||
self.structure_engine.text_recognizer = None
|
||||
if hasattr(self.structure_engine, 'layout_predictor'):
|
||||
self.structure_engine.layout_predictor = None
|
||||
|
||||
# Delete the engine
|
||||
del self.structure_engine
|
||||
self.structure_engine = None
|
||||
|
||||
# Force garbage collection
|
||||
gc.collect()
|
||||
|
||||
# Clear GPU cache
|
||||
if self.use_gpu:
|
||||
paddle.device.cuda.empty_cache()
|
||||
|
||||
logger.info("PP-StructureV3 engine unloaded successfully")
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"Error unloading PP-StructureV3: {e}")
|
||||
self.structure_engine = None
|
||||
|
||||
def clear_gpu_cache(self):
|
||||
"""
|
||||
Manually clear GPU memory cache.
|
||||
@@ -519,46 +613,160 @@ class OCRService:
|
||||
logger.warning(f"GPU memory cleanup failed (non-critical): {e}")
|
||||
# Don't fail the processing if cleanup fails
|
||||
|
||||
-    def check_gpu_memory(self, required_mb: int = 2000) -> bool:
+    def check_gpu_memory(self, required_mb: int = 2000, enable_fallback: bool = True) -> bool:
         """
-        Check if sufficient GPU memory is available.
+        Check if sufficient GPU memory is available using MemoryGuard.
|
||||
|
||||
This method now uses MemoryGuard for accurate memory queries across
|
||||
multiple backends (pynvml, torch, paddle) instead of returning True
|
||||
blindly for PaddlePaddle-only environments.
|
||||
|
||||
Args:
|
||||
required_mb: Required memory in MB (default 2000MB for OCR models)
|
||||
enable_fallback: If True and CPU fallback is enabled, switch to CPU mode
|
||||
when memory is insufficient instead of returning False
|
||||
|
||||
Returns:
|
||||
-            True if sufficient memory is available or GPU is not used
+            True if sufficient memory is available, GPU is not used, or CPU fallback activated
|
||||
"""
|
||||
try:
|
||||
# Check GPU memory using torch if available, otherwise use PaddlePaddle
|
||||
free_memory = None
|
||||
# If not using GPU, always return True
|
||||
if not self.use_gpu:
|
||||
return True
|
||||
|
||||
if TORCH_AVAILABLE and torch.cuda.is_available():
|
||||
free_memory = torch.cuda.mem_get_info()[0] / 1024**2
|
||||
elif paddle.device.is_compiled_with_cuda():
|
||||
# PaddlePaddle doesn't have direct API to get free memory,
|
||||
# so we rely on cleanup and continue
|
||||
logger.debug("Using PaddlePaddle GPU, memory info not directly available")
|
||||
try:
|
||||
# Use MemoryGuard if available for accurate multi-backend memory queries
|
||||
if self._memory_guard:
|
||||
is_available, stats = self._memory_guard.check_memory(
|
||||
required_mb=required_mb,
|
||||
device_id=self.gpu_info.get('device_id', 0)
|
||||
)
|
||||
|
||||
if not is_available:
|
||||
logger.warning(
|
||||
f"GPU memory check failed: {stats.gpu_free_mb:.0f}MB free, "
|
||||
f"{required_mb}MB required ({stats.gpu_used_ratio*100:.1f}% used)"
|
||||
)
|
||||
|
||||
# Try to free memory
|
||||
logger.info("Attempting memory cleanup before retry...")
|
||||
self._cleanup_unused_models()
|
||||
self._memory_guard.clear_gpu_cache()
|
||||
|
||||
# Check again
|
||||
is_available, stats = self._memory_guard.check_memory(required_mb=required_mb, device_id=self.gpu_info.get('device_id', 0))
|
||||
|
||||
if not is_available:
|
||||
# Memory still insufficient after cleanup
|
||||
if enable_fallback and settings.enable_cpu_fallback:
|
||||
logger.warning(
|
||||
f"Insufficient GPU memory ({stats.gpu_free_mb:.0f}MB) after cleanup. "
|
||||
f"Activating CPU fallback mode."
|
||||
)
|
||||
self._activate_cpu_fallback()
|
||||
return True # Continue with CPU
|
||||
else:
|
||||
logger.error(
|
||||
f"Insufficient GPU memory: {stats.gpu_free_mb:.0f}MB available, "
|
||||
f"{required_mb}MB required"
|
||||
)
|
||||
return False
|
||||
|
||||
logger.debug(
|
||||
f"GPU memory check passed: {stats.gpu_free_mb:.0f}MB free "
|
||||
f"({stats.gpu_used_ratio*100:.1f}% used)"
|
||||
)
|
||||
return True
|
||||
|
||||
if free_memory is not None:
|
||||
if free_memory < required_mb:
|
||||
logger.warning(f"Low GPU memory: {free_memory:.0f}MB available, {required_mb}MB required")
|
||||
# Try to free memory
|
||||
self.cleanup_gpu_memory()
|
||||
# Check again
|
||||
if TORCH_AVAILABLE and torch.cuda.is_available():
|
||||
free_memory = torch.cuda.mem_get_info()[0] / 1024**2
|
||||
if free_memory < required_mb:
|
||||
logger.error(f"Insufficient GPU memory after cleanup: {free_memory:.0f}MB")
|
||||
return False
|
||||
logger.debug(f"GPU memory check passed: {free_memory:.0f}MB available")
|
||||
else:
|
||||
# Fallback to original implementation
|
||||
free_memory = None
|
||||
|
||||
if TORCH_AVAILABLE and torch.cuda.is_available():
|
||||
free_memory = torch.cuda.mem_get_info()[0] / 1024**2
|
||||
elif paddle.device.is_compiled_with_cuda():
|
||||
# PaddlePaddle doesn't have direct API to get free memory,
|
||||
# use allocated memory to estimate
|
||||
device_id = self.gpu_info.get('device_id', 0)
|
||||
allocated = paddle.device.cuda.memory_allocated(device_id) / (1024**2)
|
||||
total = settings.gpu_memory_limit_mb
|
||||
free_memory = max(0, total - allocated)
|
||||
logger.debug(f"Estimated free GPU memory: {free_memory:.0f}MB (total: {total}MB, allocated: {allocated:.0f}MB)")
|
||||
|
||||
if free_memory is not None:
|
||||
if free_memory < required_mb:
|
||||
logger.warning(f"Low GPU memory: {free_memory:.0f}MB available, {required_mb}MB required")
|
||||
self.cleanup_gpu_memory()
|
||||
|
||||
# Recheck
|
||||
if TORCH_AVAILABLE and torch.cuda.is_available():
|
||||
free_memory = torch.cuda.mem_get_info()[0] / 1024**2
|
||||
else:
|
||||
allocated = paddle.device.cuda.memory_allocated(device_id) / (1024**2)
|
||||
free_memory = max(0, total - allocated)
|
||||
|
||||
if free_memory < required_mb:
|
||||
if enable_fallback and settings.enable_cpu_fallback:
|
||||
logger.warning(f"Insufficient GPU memory after cleanup. Activating CPU fallback.")
|
||||
self._activate_cpu_fallback()
|
||||
return True
|
||||
else:
|
||||
logger.error(f"Insufficient GPU memory after cleanup: {free_memory:.0f}MB")
|
||||
return False
|
||||
|
||||
logger.debug(f"GPU memory check passed: {free_memory:.0f}MB available")
|
||||
|
||||
return True
|
||||
|
||||
return True
|
||||
except Exception as e:
|
||||
logger.warning(f"GPU memory check failed: {e}")
|
||||
return True # Continue processing even if check fails
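The control flow of the new `check_gpu_memory()` is: check, clean up, check again, then either fall back to CPU or report failure. A minimal sketch of that flow with the memory query, cleanup and fallback passed in as callables; `ensure_gpu_memory` and its parameters are hypothetical and stand in for the MemoryGuard calls, whose implementation is not visible in this diff.

```python
from typing import Callable


def ensure_gpu_memory(required_mb: float,
                      get_free_mb: Callable[[], float],
                      cleanup: Callable[[], None],
                      activate_cpu_fallback: Callable[[], None],
                      allow_fallback: bool = True) -> bool:
    """Return True if processing may proceed (on GPU, or on CPU after fallback)."""
    if get_free_mb() >= required_mb:
        return True
    cleanup()  # unload idle models and clear caches, then measure again
    if get_free_mb() >= required_mb:
        return True
    if allow_fallback:
        activate_cpu_fallback()
        return True  # the caller continues, but on CPU
    return False
```

Note that a `True` result does not mean the GPU has room: callers that need to know which device is in use have to inspect the fallback state separately, which is why the diff logs `_cpu_fallback_active` alongside the check.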
|
||||
|
||||
def _activate_cpu_fallback(self):
|
||||
"""
|
||||
Activate CPU fallback mode when GPU memory is insufficient.
|
||||
This disables GPU usage for the current service instance.
|
||||
"""
|
||||
if self._cpu_fallback_active:
|
||||
return # Already in CPU mode
|
||||
|
||||
logger.warning("=== CPU FALLBACK MODE ACTIVATED ===")
|
||||
logger.warning("GPU memory insufficient, switching to CPU processing")
|
||||
logger.warning("Performance will be significantly reduced")
|
||||
|
||||
self._cpu_fallback_active = True
|
||||
self.use_gpu = False
|
||||
|
||||
# Update GPU info to reflect fallback
|
||||
self.gpu_info['cpu_fallback'] = True
|
||||
self.gpu_info['fallback_reason'] = 'GPU memory insufficient'
|
||||
|
||||
# Clear GPU cache to free memory
|
||||
if self._memory_guard:
|
||||
self._memory_guard.clear_gpu_cache()
|
||||
|
||||
def _restore_gpu_mode(self):
|
||||
"""
|
||||
Attempt to restore GPU mode after CPU fallback.
|
||||
Called when memory pressure has been relieved.
|
||||
"""
|
||||
if not self._cpu_fallback_active:
|
||||
return
|
||||
|
||||
if not self.gpu_available:
|
||||
return
|
||||
|
||||
# Check if GPU memory is now available
|
||||
if self._memory_guard:
|
||||
is_available, stats = self._memory_guard.check_memory(
|
||||
required_mb=settings.structure_model_memory_mb
|
||||
)
|
||||
if is_available:
|
||||
logger.info("GPU memory available, restoring GPU mode")
|
||||
self._cpu_fallback_active = False
|
||||
self.use_gpu = True
|
||||
self.gpu_info.pop('cpu_fallback', None)
|
||||
self.gpu_info.pop('fallback_reason', None)
|
||||
|
||||
def convert_pdf_to_images(self, pdf_path: Path, output_dir: Path) -> List[Path]:
|
||||
"""
|
||||
Convert PDF to images (one per page)
|
||||
@@ -626,6 +834,24 @@ class OCRService:
|
||||
threshold = confidence_threshold if confidence_threshold is not None else self.confidence_threshold
|
||||
|
||||
try:
|
||||
# Pre-operation memory check: Try to restore GPU if in fallback and memory available
|
||||
if self._cpu_fallback_active:
|
||||
self._restore_gpu_mode()
|
||||
if not self._cpu_fallback_active:
|
||||
logger.info("GPU mode restored for processing")
|
||||
|
||||
# Initial memory check before starting any heavy processing
|
||||
# Estimate memory requirement based on image type
|
||||
estimated_memory_mb = 2500 # Conservative estimate for full OCR + layout
|
||||
if detect_layout:
|
||||
estimated_memory_mb += 500 # Additional for PP-StructureV3
|
||||
|
||||
if not self.check_gpu_memory(required_mb=estimated_memory_mb, enable_fallback=True):
|
||||
logger.warning(
|
||||
f"Pre-operation memory check failed ({estimated_memory_mb}MB required). "
|
||||
f"Processing will attempt to proceed but may encounter issues."
|
||||
)
|
||||
|
||||
# Check if file is Office document
|
||||
if self.office_converter.is_office_document(image_path):
|
||||
logger.info(f"Detected Office document: {image_path.name}, converting to PDF")
|
||||
@@ -748,9 +974,12 @@ class OCRService:
|
||||
# Get OCR engine (for non-PDF images)
|
||||
ocr_engine = self.get_ocr_engine(lang)
|
||||
|
||||
-# Check GPU memory before OCR processing
-if not self.check_gpu_memory(required_mb=1500):
-    logger.warning("Insufficient GPU memory for OCR, attempting to proceed anyway")
+# Secondary memory check before OCR processing
+if not self.check_gpu_memory(required_mb=1500, enable_fallback=True):
+    logger.warning(
+        f"OCR memory check: insufficient GPU memory (1500MB required). "
+        f"Mode: {'CPU fallback' if self._cpu_fallback_active else 'GPU (low memory)'}"
+    )
|
||||
|
||||
# Get the actual image dimensions that OCR will use
|
||||
from PIL import Image
|
||||
@@ -950,6 +1179,18 @@ class OCRService:
|
||||
Tuple of (layout_data, images_metadata)
|
||||
"""
|
||||
try:
|
||||
# Pre-operation memory check for layout analysis
|
||||
if self._cpu_fallback_active:
|
||||
self._restore_gpu_mode()
|
||||
if not self._cpu_fallback_active:
|
||||
logger.info("GPU mode restored for layout analysis")
|
||||
|
||||
if not self.check_gpu_memory(required_mb=2000, enable_fallback=True):
|
||||
logger.warning(
|
||||
f"Layout analysis pre-check: insufficient GPU memory (2000MB required). "
|
||||
f"Mode: {'CPU fallback' if self._cpu_fallback_active else 'GPU'}"
|
||||
)
|
||||
|
||||
structure_engine = self._ensure_structure_engine(pp_structure_params)
|
||||
|
||||
# Try enhanced processing first
|
||||
@@ -998,11 +1239,21 @@ class OCRService:
|
||||
# Standard processing (original implementation)
|
||||
logger.info(f"Running standard layout analysis on {image_path.name}")
|
||||
|
||||
-# Check GPU memory before processing
-if not self.check_gpu_memory(required_mb=2000):
-    logger.warning("Insufficient GPU memory for PP-StructureV3, attempting to proceed anyway")
+# Memory check before PP-StructureV3 processing
+if not self.check_gpu_memory(required_mb=2000, enable_fallback=True):
+    logger.warning(
+        f"PP-StructureV3 memory check: insufficient GPU memory (2000MB required). "
+        f"Mode: {'CPU fallback' if self._cpu_fallback_active else 'GPU (low memory)'}"
+    )

-results = structure_engine.predict(str(image_path))
+# Use prediction semaphore to control concurrent predictions
+# This prevents OOM errors from multiple simultaneous PP-StructureV3.predict() calls
+with prediction_context(timeout=settings.service_acquire_timeout_seconds) as acquired:
+    if not acquired:
+        logger.error("Failed to acquire prediction slot (timeout), returning empty layout")
+        return None, []
+
+    results = structure_engine.predict(str(image_path))
|
||||
|
||||
layout_elements = []
|
||||
images_metadata = []
|
||||
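The diff for `memory_manager.py` is suppressed, so the real `prediction_context` is not shown. One way such a context manager could work is a semaphore acquired with a timeout that yields whether a slot was obtained. The sketch below is an assumption about the general technique only: `make_prediction_context` and `max_concurrent` are hypothetical names.

```python
import threading
from contextlib import contextmanager
from typing import Callable, ContextManager, Iterator, Optional


def make_prediction_context(max_concurrent: int) -> Callable[..., ContextManager[bool]]:
    """Build a context manager that limits concurrent predictions."""
    slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def prediction_slot(timeout: Optional[float] = None) -> Iterator[bool]:
        # acquire() returns False if no slot became free within `timeout` seconds
        acquired = slots.acquire(timeout=timeout)
        try:
            yield acquired
        finally:
            if acquired:
                slots.release()  # release only what was actually acquired

    return prediction_slot
```

Yielding the boolean instead of raising lets the caller decide how to degrade, which matches the `if not acquired: return None, []` pattern at the call site.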
@@ -1254,6 +1505,46 @@ class OCRService:
                if temp_pdf_path:
                    unified_doc.metadata.original_filename = file_path.name

                # HYBRID MODE: Check if Direct track missed images (e.g., inline image blocks)
                # If so, use OCR to extract images and merge them into the Direct result
                pages_with_missing_images = self.direct_extraction_engine.check_document_for_missing_images(
                    actual_file_path
                )
                if pages_with_missing_images:
                    logger.info(f"Hybrid mode: Direct track missing images on pages {pages_with_missing_images}, using OCR to extract images")
                    try:
                        # Run OCR on the file to extract images
                        ocr_result = self.process_file_traditional(
                            actual_file_path, lang, detect_layout=True,
                            confidence_threshold=confidence_threshold,
                            output_dir=output_dir, pp_structure_params=pp_structure_params
                        )

                        # Convert OCR result to extract images
                        ocr_unified = self.ocr_to_unified_converter.convert(
                            ocr_result, actual_file_path, 0.0, lang
                        )

                        # Merge OCR-extracted images into Direct track result
                        images_added = self._merge_ocr_images_into_direct(
                            unified_doc, ocr_unified, pages_with_missing_images
                        )
                        if images_added > 0:
                            logger.info(f"Hybrid mode: Added {images_added} images from OCR to Direct track result")
                            unified_doc.metadata.processing_track = ProcessingTrack.HYBRID
                        else:
                            # Fallback: OCR didn't find images either, render inline image blocks directly
                            logger.info("Hybrid mode: OCR didn't find images, falling back to inline image rendering")
                            images_added = self.direct_extraction_engine.render_inline_image_regions(
                                actual_file_path, unified_doc, pages_with_missing_images, output_dir
                            )
                            if images_added > 0:
                                logger.info(f"Hybrid mode: Rendered {images_added} inline image regions")
                                unified_doc.metadata.processing_track = ProcessingTrack.HYBRID
                    except Exception as e:
                        logger.warning(f"Hybrid mode image extraction failed: {e}")
                        # Continue with Direct track result without images

            # Use OCR track (either by recommendation or fallback)
            if recommendation.track == "ocr":
                # Use OCR for scanned documents, images, etc.
@@ -1269,17 +1560,19 @@ class OCRService:
                )
                unified_doc.document_id = document_id

            # Update processing track metadata
            unified_doc.metadata.processing_track = (
                ProcessingTrack.DIRECT if recommendation.track == "direct"
                else ProcessingTrack.OCR
            )
            # Update processing track metadata (only if not already set to HYBRID)
            if unified_doc.metadata.processing_track != ProcessingTrack.HYBRID:
                unified_doc.metadata.processing_track = (
                    ProcessingTrack.DIRECT if recommendation.track == "direct"
                    else ProcessingTrack.OCR
                )

            # Calculate total processing time
            processing_time = (datetime.now() - start_time).total_seconds()
            unified_doc.metadata.processing_time = processing_time

            logger.info(f"Document processing completed in {processing_time:.2f}s using {recommendation.track} track")
            actual_track = unified_doc.metadata.processing_track.value
            logger.info(f"Document processing completed in {processing_time:.2f}s using {actual_track} track")

            return unified_doc

@@ -1290,6 +1583,75 @@ class OCRService:
                file_path, lang, detect_layout, confidence_threshold, output_dir, pp_structure_params
            )

    def _merge_ocr_images_into_direct(
        self,
        direct_doc: 'UnifiedDocument',
        ocr_doc: 'UnifiedDocument',
        pages_with_missing_images: List[int]
    ) -> int:
        """
        Merge OCR-extracted images into Direct track result.

        This is used in hybrid mode when Direct track couldn't extract certain
        images (like logos composed of inline image blocks).

        Args:
            direct_doc: UnifiedDocument from Direct track
            ocr_doc: UnifiedDocument from OCR track
            pages_with_missing_images: List of page numbers (1-indexed) that need images

        Returns:
            Number of images added
        """
        images_added = 0

        try:
            # Get image element types to look for
            image_types = {ElementType.FIGURE, ElementType.IMAGE, ElementType.LOGO}

            for page_num in pages_with_missing_images:
                # Find the target page in direct_doc
                direct_page = None
                for page in direct_doc.pages:
                    if page.page_number == page_num:
                        direct_page = page
                        break

                if not direct_page:
                    continue

                # Find the source page in ocr_doc
                ocr_page = None
                for page in ocr_doc.pages:
                    if page.page_number == page_num:
                        ocr_page = page
                        break

                if not ocr_page:
                    continue

                # Extract image elements from OCR page
                for element in ocr_page.elements:
                    if element.type in image_types:
                        # Assign new element ID to avoid conflicts
                        new_element_id = f"hybrid_{element.element_id}"
                        element.element_id = new_element_id

                        # Add to direct page
                        direct_page.elements.append(element)
                        images_added += 1
                        logger.debug(f"Added image element {new_element_id} to page {page_num}")

            # Update image count in direct_doc metadata
            if images_added > 0:
                current_images = direct_doc.metadata.total_images or 0
                direct_doc.metadata.total_images = current_images + images_added

        except Exception as e:
            logger.error(f"Error merging OCR images into Direct track: {e}")

        return images_added

    def process_file_traditional(
        self,
        file_path: Path,
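Reviewer sketch: the page-matching merge in `_merge_ocr_images_into_direct` can be tried in isolation with simplified stand-ins. `Element`, `Page`, `merge_images` and the string type names are hypothetical and exist only for this illustration; a dict lookup replaces the two linear page searches, and the sketch copies each element instead of renaming the OCR element in place as the diff does.

```python
from dataclasses import dataclass, field
from typing import List

# Stand-ins for ElementType.FIGURE / IMAGE / LOGO (assumed names)
IMAGE_TYPES = {"figure", "image", "logo"}


@dataclass
class Element:  # hypothetical, simplified element
    element_id: str
    type: str


@dataclass
class Page:  # hypothetical, simplified page
    page_number: int
    elements: List[Element] = field(default_factory=list)


def merge_images(direct_pages, ocr_pages, pages_with_missing_images):
    """Copy image-like OCR elements onto the matching Direct pages."""
    direct_by_num = {p.page_number: p for p in direct_pages}
    ocr_by_num = {p.page_number: p for p in ocr_pages}
    added = 0
    for page_num in pages_with_missing_images:
        direct_page = direct_by_num.get(page_num)
        ocr_page = ocr_by_num.get(page_num)
        if direct_page is None or ocr_page is None:
            continue  # page absent on either side: nothing to merge
        for element in ocr_page.elements:
            if element.type in IMAGE_TYPES:
                # Prefix the ID so it cannot collide with Direct-track IDs
                merged = Element(f"hybrid_{element.element_id}", element.type)
                direct_page.elements.append(merged)
                added += 1
    return added
```

Copying keeps the OCR document unchanged, which may matter if the OCR result is reused after the merge.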
@@ -1441,13 +1803,16 @@ class OCRService:
            UnifiedDocument if dual-track is enabled and use_dual_track=True,
            Dict with legacy format otherwise
        """
        if use_dual_track and self.dual_track_enabled:
            # Use dual-track processing
        # Use dual-track processing if:
        # 1. use_dual_track is True (auto-detection), OR
        # 2. force_track is specified (explicit track selection)
        if (use_dual_track or force_track) and self.dual_track_enabled:
            # Use dual-track processing (or forced track)
            return self.process_with_dual_track(
                file_path, lang, detect_layout, confidence_threshold, output_dir, force_track, pp_structure_params
            )
        else:
            # Use traditional OCR processing
            # Use traditional OCR processing (no force_track support)
            return self.process_file_traditional(
                file_path, lang, detect_layout, confidence_threshold, output_dir, pp_structure_params
            )
@@ -572,8 +572,10 @@ class PDFGeneratorService:
            processing_track = unified_doc.metadata.get('processing_track')

            # Route to track-specific rendering method
            is_direct_track = (processing_track == 'direct' or
                               processing_track == ProcessingTrack.DIRECT)
            # ProcessingTrack is (str, Enum), so comparing with enum value works for both string and enum
            # HYBRID track uses Direct track rendering (Direct text/tables + OCR images)
            is_direct_track = (processing_track == ProcessingTrack.DIRECT or
                               processing_track == ProcessingTrack.HYBRID)

            logger.info(f"Processing track: {processing_track}, using {'Direct' if is_direct_track else 'OCR'} track rendering")

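Reviewer sketch: the comment above relies on `ProcessingTrack` mixing in `str`, so a member compares equal to its plain string value. The enum below is a stand-in whose values (`"direct"`, `"ocr"`, `"hybrid"`) are assumed rather than taken from the real model, and `uses_direct_rendering` is a hypothetical helper.

```python
from enum import Enum


class ProcessingTrack(str, Enum):
    """Stand-in enum; the member values are assumptions for this sketch."""
    DIRECT = "direct"
    OCR = "ocr"
    HYBRID = "hybrid"


def uses_direct_rendering(track) -> bool:
    """True for DIRECT or HYBRID, given either an enum member or a raw string."""
    # Tuple membership uses ==, and a str-mixin member equals its string value.
    return track in (ProcessingTrack.DIRECT, ProcessingTrack.HYBRID)
```

Because the comparison goes through `str` equality, metadata that was serialized to JSON and loaded back as plain strings still routes to the same renderer.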
@@ -675,8 +677,11 @@ class PDFGeneratorService:
            logger.info("=== Direct Track PDF Generation ===")
            logger.info(f"Total pages: {len(unified_doc.pages)}")

            # Set current track for helper methods
            self.current_processing_track = 'direct'
            # Set current track for helper methods (may be DIRECT or HYBRID)
            if hasattr(unified_doc, 'metadata') and unified_doc.metadata:
                self.current_processing_track = unified_doc.metadata.processing_track
            else:
                self.current_processing_track = ProcessingTrack.DIRECT

            # Get page dimensions from first page (for canvas initialization)
            if not unified_doc.pages:
@@ -1074,11 +1079,16 @@ class PDFGeneratorService:
        # *** Priority 1: check ocr_dimensions (converted from UnifiedDocument) ***
        if 'ocr_dimensions' in ocr_data:
            dims = ocr_data['ocr_dimensions']
            w = float(dims.get('width', 0))
            h = float(dims.get('height', 0))
            if w > 0 and h > 0:
                logger.info(f"Using page dimensions from the ocr_dimensions field: {w:.1f} x {h:.1f}")
                return (w, h)
            # Handle both dict format {'width': w, 'height': h} and
            # list format [{'page': 1, 'width': w, 'height': h}, ...]
            if isinstance(dims, list) and len(dims) > 0:
                dims = dims[0]  # Use first page dimensions
            if isinstance(dims, dict):
                w = float(dims.get('width', 0))
                h = float(dims.get('height', 0))
                if w > 0 and h > 0:
                    logger.info(f"Using page dimensions from the ocr_dimensions field: {w:.1f} x {h:.1f}")
                    return (w, h)

        # *** Priority 2: check dimensions in the original JSON ***
        if 'dimensions' in ocr_data:
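Reviewer sketch: the dict-or-list handling added in this hunk, pulled out as a standalone function so the accepted shapes are easy to check. `page_size_from_dims` is a hypothetical name; the real code performs this inline and logs before returning.

```python
from typing import Optional, Tuple, Union


def page_size_from_dims(dims: Union[dict, list, None]) -> Optional[Tuple[float, float]]:
    """Return (width, height) from a dict or from a list of per-page dicts."""
    if isinstance(dims, list) and len(dims) > 0:
        dims = dims[0]  # use the first page's dimensions
    if isinstance(dims, dict):
        w = float(dims.get("width", 0))
        h = float(dims.get("height", 0))
        if w > 0 and h > 0:
            return (w, h)
    return None  # nothing usable: the caller moves on to the next priority
```

An empty list or a dict with a zero dimension yields `None`, which corresponds to falling through to the "Priority 2" lookup.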
@@ -1418,8 +1428,8 @@ class PDFGeneratorService:
            # Set font with track-specific styling
            # Note: OCR track has no StyleInfo (extracted from images), so no advanced formatting
            style_info = region.get('style')
            is_direct_track = (self.current_processing_track == 'direct' or
                               self.current_processing_track == ProcessingTrack.DIRECT)
            is_direct_track = (self.current_processing_track == ProcessingTrack.DIRECT or
                               self.current_processing_track == ProcessingTrack.HYBRID)

            if style_info and is_direct_track:
                # Direct track: Apply rich styling from StyleInfo
@@ -1661,10 +1671,15 @@ class PDFGeneratorService:
                return

            # Construct full path to image
            # saved_path is relative to result_dir (e.g., "imgs/element_id.png")
            image_path = result_dir / image_path_str

            # Fallback for legacy data
            if not image_path.exists():
                logger.warning(f"Image not found: {image_path}")
                image_path = result_dir / Path(image_path_str).name

            if not image_path.exists():
                logger.warning(f"Image not found: {image_path_str} (in {result_dir})")
                return

            # Get bbox for positioning
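Reviewer sketch: the two-step lookup introduced here (relative path first, then bare file name for legacy results) written as a small helper. `resolve_image_path` is a hypothetical name and the sketch returns `None` where the real method logs a warning and returns early.

```python
from pathlib import Path
from typing import Optional


def resolve_image_path(result_dir: Path, image_path_str: str) -> Optional[Path]:
    """Find an image by its saved relative path, then by bare file name."""
    candidate = result_dir / image_path_str
    if not candidate.exists():
        # Legacy layout: the file sits directly in result_dir
        candidate = result_dir / Path(image_path_str).name
    return candidate if candidate.exists() else None
```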
@@ -2289,12 +2304,30 @@ class PDFGeneratorService:
                col_widths = element.metadata['column_widths']
                logger.debug(f"Using extracted column widths: {col_widths}")

            # Create table without rowHeights (will use canvas scaling instead)
            t = Table(table_content, colWidths=col_widths)
            # Use original row heights from extraction if available
            # Row heights must match the number of data rows exactly
            row_heights_list = None
            if element.metadata and 'row_heights' in element.metadata:
                extracted_row_heights = element.metadata['row_heights']
                num_data_rows = len(table_content)
                num_height_rows = len(extracted_row_heights)

                if num_height_rows == num_data_rows:
                    row_heights_list = extracted_row_heights
                    logger.debug(f"Using extracted row heights ({num_height_rows} rows): {row_heights_list}")
                else:
                    # Row counts don't match - this can happen with merged cells or empty rows
                    logger.warning(f"Row height mismatch: {num_height_rows} heights for {num_data_rows} data rows, falling back to auto-sizing")

            # Create table with both column widths and row heights for accurate sizing
            t = Table(table_content, colWidths=col_widths, rowHeights=row_heights_list)

            # Apply style with minimal padding to reduce table extension
            # Use Chinese font to support special characters (℃, μm, ≦, ×, Ω, etc.)
            font_for_table = self.font_name if self.font_registered else 'Helvetica'
            style = TableStyle([
                ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
                ('FONTNAME', (0, 0), (-1, -1), font_for_table),
                ('FONTSIZE', (0, 0), (-1, -1), 8),
                ('ALIGN', (0, 0), (-1, -1), 'LEFT'),
                ('VALIGN', (0, 0), (-1, -1), 'TOP'),
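Reviewer sketch: the row-height guard in this hunk only passes extracted heights through when their count equals the number of table rows. The helper below isolates that rule without importing ReportLab; `select_row_heights` is a hypothetical name. In ReportLab, `rowHeights=None` lets `Table` size rows automatically, which is the fallback the warning message refers to.

```python
from typing import List, Optional, Sequence


def select_row_heights(
    table_content: Sequence[Sequence],
    metadata: Optional[dict],
) -> Optional[List[float]]:
    """Return extracted row heights only if there is exactly one per row."""
    if not metadata or "row_heights" not in metadata:
        return None
    heights = metadata["row_heights"]
    if len(heights) == len(table_content):
        return list(heights)
    # Count mismatch (e.g. merged cells or empty rows): let the table auto-size
    return None
```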
@@ -2307,8 +2340,8 @@ class PDFGeneratorService:
            ])
            t.setStyle(style)

            # CRITICAL: Use canvas scaling to fit table within bbox
            # This is more reliable than rowHeights which doesn't always work
            # Use canvas scaling as fallback to fit table within bbox
            # With proper row heights, scaling should be minimal (close to 1.0)

            # Step 1: Wrap to get actual rendered size
            actual_width, actual_height = t.wrapOn(pdf_canvas, table_width * 10, table_height * 10)
@@ -2358,11 +2391,16 @@ class PDFGeneratorService:
                logger.warning(f"No image path for element {element.element_id}")
                return

            # Construct full path
            # Construct full path to image
            # saved_path is relative to result_dir (e.g., "document_id_p1_img0.png")
            image_path = result_dir / image_path_str

            # Fallback for legacy data
            if not image_path.exists():
                logger.warning(f"Image not found: {image_path}")
                image_path = result_dir / Path(image_path_str).name

            if not image_path.exists():
                logger.warning(f"Image not found: {image_path_str} (in {result_dir})")
                return

            # Get bbox
@@ -2388,7 +2426,7 @@ class PDFGeneratorService:
                preserveAspectRatio=True
            )

            logger.debug(f"Drew image: {image_path_str}")
            logger.debug(f"Drew image: {image_path} (from: {original_path_str})")

        except Exception as e:
            logger.error(f"Failed to draw image element {element.element_id}: {e}")
@@ -21,6 +21,8 @@ except ImportError:
import paddle
from paddleocr import PPStructureV3
from app.models.unified_document import ElementType
from app.core.config import settings
from app.services.memory_manager import prediction_context

logger = logging.getLogger(__name__)

@@ -96,8 +98,22 @@ class PPStructureEnhanced:
        try:
            logger.info(f"Enhanced PP-StructureV3 analysis on {image_path.name}")

            # Perform structure analysis
            results = self.structure_engine.predict(str(image_path))
            # Perform structure analysis with semaphore control
            # This prevents OOM errors from multiple simultaneous predictions
            with prediction_context(timeout=settings.service_acquire_timeout_seconds) as acquired:
                if not acquired:
                    logger.error("Failed to acquire prediction slot (timeout), returning empty result")
                    return {
                        'has_parsing_res_list': False,
                        'elements': [],
                        'total_elements': 0,
                        'images': [],
                        'tables': [],
                        'element_types': {},
                        'error': 'Prediction slot timeout'
                    }

                results = self.structure_engine.predict(str(image_path))

            all_elements = []
            all_images = []
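Reviewer sketch: `prediction_context` is imported from `app.services.memory_manager`, which is not shown in this part of the diff. The version below is an assumed, minimal reimplementation of the pattern the call sites imply (a semaphore-backed context manager that yields whether a slot was obtained); the module-level semaphore and its limit of 1 are illustrative choices, not the project's actual values.

```python
import threading
from contextlib import contextmanager

# Assumed limit: one prediction at a time (illustrative only)
_prediction_semaphore = threading.Semaphore(1)


@contextmanager
def prediction_context(timeout=None):
    """Yield True if a prediction slot was acquired within `timeout` seconds."""
    acquired = _prediction_semaphore.acquire(timeout=timeout)
    try:
        yield acquired
    finally:
        # Only give the slot back if this caller actually holds it
        if acquired:
            _prediction_semaphore.release()
```

Yielding a boolean instead of raising lets each call site choose its own empty result, as the two call sites in this commit do (`return None, []` versus an empty result dict).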
468
backend/app/services/service_pool.py
Normal file
@@ -0,0 +1,468 @@
"""
Tool_OCR - OCR Service Pool
Manages a pool of OCRService instances to prevent duplicate model loading
and control concurrent GPU operations.
"""

import asyncio
import logging
import threading
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional, TYPE_CHECKING

from app.services.memory_manager import get_model_manager, MemoryConfig

if TYPE_CHECKING:
    from app.services.ocr_service import OCRService

logger = logging.getLogger(__name__)


class ServiceState(Enum):
    """State of a pooled service"""
    AVAILABLE = "available"
    IN_USE = "in_use"
    UNHEALTHY = "unhealthy"
    INITIALIZING = "initializing"


@dataclass
class PooledService:
    """Wrapper for a pooled OCRService instance"""
    service: Any  # OCRService
    device: str
    state: ServiceState = ServiceState.AVAILABLE
    created_at: float = field(default_factory=time.time)
    last_used: float = field(default_factory=time.time)
    use_count: int = 0
    error_count: int = 0
    current_task_id: Optional[str] = None


class PoolConfig:
    """Configuration for the service pool"""

    def __init__(
        self,
        max_services_per_device: int = 1,
        max_total_services: int = 2,
        acquire_timeout_seconds: float = 300.0,
        max_queue_size: int = 50,
        health_check_interval_seconds: int = 60,
        max_consecutive_errors: int = 3,
        service_idle_timeout_seconds: int = 600,
        enable_auto_scaling: bool = False,
    ):
        self.max_services_per_device = max_services_per_device
        self.max_total_services = max_total_services
        self.acquire_timeout_seconds = acquire_timeout_seconds
        self.max_queue_size = max_queue_size
        self.health_check_interval_seconds = health_check_interval_seconds
        self.max_consecutive_errors = max_consecutive_errors
        self.service_idle_timeout_seconds = service_idle_timeout_seconds
        self.enable_auto_scaling = enable_auto_scaling


class OCRServicePool:
    """
    Pool of OCRService instances with concurrency control.

    Features:
    - Per-device instance management (one service per GPU)
    - Queue-based task distribution
    - Semaphore-based concurrency limits
    - Health monitoring
    - Automatic service recovery
    """

    _instance = None
    _lock = threading.Lock()

    def __new__(cls, *args, **kwargs):
        """Singleton pattern"""
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._initialized = False
        return cls._instance

    def __init__(self, config: Optional[PoolConfig] = None):
        if self._initialized:
            return

        self.config = config or PoolConfig()
        self.services: Dict[str, List[PooledService]] = {}
        self.semaphores: Dict[str, threading.Semaphore] = {}
        self.queues: Dict[str, List] = {}
        self._pool_lock = threading.RLock()
        self._condition = threading.Condition(self._pool_lock)

        # Metrics
        self._metrics = {
            "total_acquisitions": 0,
            "total_releases": 0,
            "total_timeouts": 0,
            "total_errors": 0,
            "queue_waits": 0,
        }

        # Initialize default device pool
        self._initialize_device("GPU:0")

        self._initialized = True
        logger.info("OCRServicePool initialized")

    def _initialize_device(self, device: str):
        """Initialize pool resources for a device"""
        with self._pool_lock:
            if device not in self.services:
                self.services[device] = []
                self.semaphores[device] = threading.Semaphore(
                    self.config.max_services_per_device
                )
                self.queues[device] = []
                logger.info(f"Initialized pool for device {device}")

    def _create_service(self, device: str) -> PooledService:
        """
        Create a new OCRService instance for the pool.

        Args:
            device: Device identifier (e.g., "GPU:0", "CPU")

        Returns:
            PooledService wrapper
        """
        # Import here to avoid circular imports
        from app.services.ocr_service import OCRService

        logger.info(f"Creating new OCRService for device {device}")
        start_time = time.time()

        # Create service instance
        service = OCRService()

        creation_time = time.time() - start_time
        logger.info(f"OCRService created in {creation_time:.2f}s for device {device}")

        return PooledService(
            service=service,
            device=device,
            state=ServiceState.AVAILABLE
        )

    def acquire(
        self,
        device: str = "GPU:0",
        timeout: Optional[float] = None,
        task_id: Optional[str] = None
    ) -> Optional[PooledService]:
        """
        Acquire an OCRService from the pool.

        Args:
            device: Preferred device (e.g., "GPU:0")
            timeout: Maximum time to wait for a service
            task_id: Optional task ID for tracking

        Returns:
            PooledService if available, None if timeout
        """
        timeout = timeout or self.config.acquire_timeout_seconds
        self._initialize_device(device)

        start_time = time.time()
        deadline = start_time + timeout

        with self._condition:
            while True:
                # Try to get an available service
                service = self._try_acquire_service(device, task_id)
                if service:
                    self._metrics["total_acquisitions"] += 1
                    return service

                # Check if we can create a new service
                if self._can_create_service(device):
                    try:
                        pooled = self._create_service(device)
                        pooled.state = ServiceState.IN_USE
                        pooled.current_task_id = task_id
                        pooled.use_count += 1
                        self.services[device].append(pooled)
                        self._metrics["total_acquisitions"] += 1
                        logger.info(f"Created and acquired new service for {device}")
                        return pooled
                    except Exception as e:
                        logger.error(f"Failed to create service for {device}: {e}")
                        self._metrics["total_errors"] += 1

                # Wait for a service to become available
                remaining = deadline - time.time()
                if remaining <= 0:
                    self._metrics["total_timeouts"] += 1
                    logger.warning(f"Timeout waiting for service on {device}")
                    return None

                self._metrics["queue_waits"] += 1
                logger.debug(f"Waiting for service on {device} (timeout: {remaining:.1f}s)")
                self._condition.wait(timeout=min(remaining, 1.0))

    def _try_acquire_service(self, device: str, task_id: Optional[str]) -> Optional[PooledService]:
        """Try to acquire an available service without waiting"""
        for pooled in self.services.get(device, []):
            if pooled.state == ServiceState.AVAILABLE:
                pooled.state = ServiceState.IN_USE
                pooled.last_used = time.time()
                pooled.use_count += 1
                pooled.current_task_id = task_id
                logger.debug(f"Acquired existing service for {device} (use #{pooled.use_count})")
                return pooled
        return None

    def _can_create_service(self, device: str) -> bool:
        """Check if a new service can be created"""
        device_count = len(self.services.get(device, []))
        total_count = sum(len(services) for services in self.services.values())

        return (
            device_count < self.config.max_services_per_device and
            total_count < self.config.max_total_services
        )

    def release(self, pooled: PooledService, error: Optional[Exception] = None):
        """
        Release a service back to the pool.

        Args:
            pooled: The pooled service to release
            error: Optional error that occurred during use
        """
        with self._condition:
            if error:
                pooled.error_count += 1
                self._metrics["total_errors"] += 1
                logger.warning(f"Service released with error: {error}")

                # Mark unhealthy if too many errors
                if pooled.error_count >= self.config.max_consecutive_errors:
                    pooled.state = ServiceState.UNHEALTHY
                    logger.error(f"Service marked unhealthy after {pooled.error_count} errors")
                else:
                    pooled.state = ServiceState.AVAILABLE
            else:
                pooled.error_count = 0  # Reset error count on success
                pooled.state = ServiceState.AVAILABLE

            pooled.last_used = time.time()
            pooled.current_task_id = None
            self._metrics["total_releases"] += 1

            # Clean up GPU memory after release
            try:
                model_manager = get_model_manager()
                model_manager.memory_guard.clear_gpu_cache()
            except Exception as e:
                logger.debug(f"Cache clear after release failed: {e}")

            # Notify waiting threads
            self._condition.notify_all()

        logger.debug(f"Service released for device {pooled.device}")

    @contextmanager
    def acquire_context(
        self,
        device: str = "GPU:0",
        timeout: Optional[float] = None,
        task_id: Optional[str] = None
    ):
        """
        Context manager for acquiring and releasing a service.

        Usage:
            with pool.acquire_context("GPU:0") as pooled:
                result = pooled.service.process(...)
        """
        pooled = None
        error = None
        try:
            pooled = self.acquire(device, timeout, task_id)
            if pooled is None:
                raise TimeoutError(f"Failed to acquire service for {device}")
            yield pooled
        except Exception as e:
            error = e
            raise
        finally:
            if pooled:
                self.release(pooled, error)

    def get_service(self, device: str = "GPU:0") -> Optional["OCRService"]:
        """
        Get a service directly (for backward compatibility).

        This acquires a service and returns the underlying OCRService.
        The caller is responsible for calling release_service() when done.

        NOTE: release_service() is not defined in this file, and release()
        expects the PooledService wrapper, which this method does not return.
        As written, a service obtained here is hard to hand back to the pool;
        acquire_context() avoids the problem.

        Args:
            device: Device identifier

        Returns:
            OCRService instance or None
        """
        pooled = self.acquire(device)
        if pooled:
            return pooled.service
        return None

    def get_pool_stats(self) -> Dict:
        """Get current pool statistics"""
        with self._pool_lock:
            stats = {
                "devices": {},
                "metrics": self._metrics.copy(),
                "total_services": 0,
                "available_services": 0,
                "in_use_services": 0,
            }

            for device, services in self.services.items():
                available = sum(1 for s in services if s.state == ServiceState.AVAILABLE)
                in_use = sum(1 for s in services if s.state == ServiceState.IN_USE)
                unhealthy = sum(1 for s in services if s.state == ServiceState.UNHEALTHY)

                stats["devices"][device] = {
                    "total": len(services),
                    "available": available,
                    "in_use": in_use,
                    "unhealthy": unhealthy,
                    "max_allowed": self.config.max_services_per_device,
                }

                stats["total_services"] += len(services)
                stats["available_services"] += available
                stats["in_use_services"] += in_use

            return stats

    def health_check(self) -> Dict:
        """
        Perform health check on all pooled services.

        Returns:
            Health check results
        """
        results = {
            "healthy": True,
            "services": [],
            "timestamp": time.time()
        }

        with self._pool_lock:
            for device, services in self.services.items():
                for idx, pooled in enumerate(services):
                    service_health = {
                        "device": device,
                        "index": idx,
                        "state": pooled.state.value,
                        "error_count": pooled.error_count,
                        "use_count": pooled.use_count,
                        "idle_seconds": time.time() - pooled.last_used,
                    }

                    # Check if service is responsive
                    if pooled.state == ServiceState.AVAILABLE:
                        try:
                            # Simple check - verify service has required attributes
                            has_process = hasattr(pooled.service, 'process')
                            has_gpu_status = hasattr(pooled.service, 'get_gpu_status')
                            service_health["responsive"] = has_process and has_gpu_status
                        except Exception as e:
                            service_health["responsive"] = False
                            service_health["error"] = str(e)
                            results["healthy"] = False
                    else:
                        service_health["responsive"] = pooled.state != ServiceState.UNHEALTHY

                    if pooled.state == ServiceState.UNHEALTHY:
                        results["healthy"] = False

                    results["services"].append(service_health)

        return results

    def recover_unhealthy(self):
        """
        Attempt to recover unhealthy services.
        """
        with self._pool_lock:
            for device, services in self.services.items():
                # Iterate over a snapshot: the loop body removes and appends
                # entries, and mutating the list mid-iteration would skip items
                for idx, pooled in enumerate(list(services)):
                    if pooled.state == ServiceState.UNHEALTHY:
                        logger.info(f"Attempting to recover unhealthy service {device}:{idx}")
                        try:
                            # Remove old service
                            services.remove(pooled)

                            # Create new service
                            new_pooled = self._create_service(device)
                            services.append(new_pooled)
                            logger.info(f"Successfully recovered service {device}:{idx}")

                        except Exception as e:
                            logger.error(f"Failed to recover service {device}:{idx}: {e}")

    def shutdown(self):
        """
        Shutdown the pool and cleanup all services.
        """
        logger.info("OCRServicePool shutdown started")

        with self._pool_lock:
            for device, services in self.services.items():
                for pooled in services:
                    try:
                        # Clean up service resources
                        if hasattr(pooled.service, 'cleanup_gpu_memory'):
                            pooled.service.cleanup_gpu_memory()
                    except Exception as e:
                        logger.warning(f"Error cleaning up service: {e}")

            # Clear all pools
            self.services.clear()
            self.semaphores.clear()
            self.queues.clear()

        logger.info("OCRServicePool shutdown completed")


# Global singleton instance
_service_pool: Optional[OCRServicePool] = None


def get_service_pool(config: Optional[PoolConfig] = None) -> OCRServicePool:
    """
    Get the global OCRServicePool instance.

    Args:
        config: Optional configuration (only used on first call)

    Returns:
        OCRServicePool singleton instance
    """
    global _service_pool
    if _service_pool is None:
        _service_pool = OCRServicePool(config)
    return _service_pool


def shutdown_service_pool():
    """Shutdown the global service pool"""
    global _service_pool
    if _service_pool is not None:
        _service_pool.shutdown()
        _service_pool = None
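Reviewer sketch: the acquire/wait/release cycle of `OCRServicePool`, reduced to a single-slot pool so the timeout behaviour can be exercised without loading any model. `TinyPool` and its methods are hypothetical stand-ins, not part of the commit; the sketch uses `time.monotonic()` for the deadline where the pool above uses `time.time()`.

```python
import threading
import time
from contextlib import contextmanager


class TinyPool:
    """Hypothetical single-slot pool illustrating the acquire/release pattern."""

    def __init__(self, resource):
        self._resource = resource
        self._in_use = False
        self._condition = threading.Condition()

    def acquire(self, timeout: float):
        """Return the resource, or None if it stays busy past the deadline."""
        deadline = time.monotonic() + timeout
        with self._condition:
            while self._in_use:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None  # timed out
                self._condition.wait(timeout=remaining)
            self._in_use = True
            return self._resource

    def release(self):
        """Mark the slot free and wake any waiting threads."""
        with self._condition:
            self._in_use = False
            self._condition.notify_all()

    @contextmanager
    def acquire_context(self, timeout: float):
        resource = self.acquire(timeout)
        if resource is None:
            raise TimeoutError("no resource available")
        try:
            yield resource
        finally:
            self.release()
```

The context-manager form guarantees the slot is released even when the body raises, which is the same reason `acquire_context` is the safer entry point on the real pool.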
1986
backend/tests/services/test_memory_manager.py
Normal file
File diff suppressed because it is too large
380
backend/tests/services/test_ocr_memory_integration.py
Normal file
@@ -0,0 +1,380 @@
"""
Tests for OCR Service Memory Integration

Tests the integration of MemoryGuard with OCRService patterns,
including pre-operation memory checks and CPU fallback logic.
"""

import pytest
from unittest.mock import Mock, patch, MagicMock
import sys

# Mock paddle before importing memory_manager
paddle_mock = MagicMock()
paddle_mock.is_compiled_with_cuda.return_value = False
paddle_mock.device.cuda.device_count.return_value = 0
paddle_mock.device.cuda.memory_allocated.return_value = 0
paddle_mock.device.cuda.memory_reserved.return_value = 0
paddle_mock.device.cuda.empty_cache = MagicMock()
sys.modules['paddle'] = paddle_mock

from app.services.memory_manager import (
    MemoryGuard,
    MemoryConfig,
    MemoryStats,
)


class TestMemoryGuardIntegration:
    """Tests for MemoryGuard integration patterns used in OCRService"""

    def setup_method(self):
        """Setup for each test"""
        self.config = MemoryConfig(
            warning_threshold=0.80,
            critical_threshold=0.95,
            emergency_threshold=0.98,
            enable_cpu_fallback=True,
        )

    def teardown_method(self):
        """Cleanup after each test"""
        pass

    def test_memory_check_below_threshold_allows_processing(self):
        """Test that memory check returns True when below thresholds"""
        guard = MemoryGuard(self.config)

        # Mock stats below warning threshold
        with patch.object(guard, 'get_memory_stats') as mock_stats:
            mock_stats.return_value = MemoryStats(
                gpu_used_ratio=0.50,
                gpu_free_mb=4000,
                gpu_total_mb=8000,
            )

            is_available, stats = guard.check_memory(required_mb=2000)

            assert is_available is True
            assert stats.gpu_free_mb >= 2000

        guard.shutdown()

    def test_memory_check_above_critical_blocks_processing(self):
        """Test that memory check returns False when above critical threshold"""
        guard = MemoryGuard(self.config)

        # Mock stats above critical threshold
        with patch.object(guard, 'get_memory_stats') as mock_stats:
            mock_stats.return_value = MemoryStats(
                gpu_used_ratio=0.96,
                gpu_free_mb=320,
                gpu_total_mb=8000,
            )

            is_available, stats = guard.check_memory(required_mb=1000)

            assert is_available is False

        guard.shutdown()

    def test_memory_check_insufficient_free_memory(self):
        """Test that memory check returns False when free memory < required"""
        guard = MemoryGuard(self.config)

        # Mock stats with insufficient free memory but below critical ratio
        with patch.object(guard, 'get_memory_stats') as mock_stats:
            mock_stats.return_value = MemoryStats(
                gpu_used_ratio=0.70,
                gpu_free_mb=500,
                gpu_total_mb=8000,
            )

            is_available, stats = guard.check_memory(required_mb=1000)

            # Should return False (not enough free memory)
            assert is_available is False

        guard.shutdown()


class TestCPUFallbackPattern:
    """Tests for CPU fallback pattern as used in OCRService"""

    def test_cpu_fallback_activation_pattern(self):
        """Test the CPU fallback activation pattern"""
        # Simulate the pattern used in OCRService._activate_cpu_fallback

        class MockOCRService:
            def __init__(self):
                self._cpu_fallback_active = False
                self.use_gpu = True
                self.gpu_available = True
                self.gpu_info = {'device_id': 0}
                self._memory_guard = Mock()

            def _activate_cpu_fallback(self):
                if self._cpu_fallback_active:
                    return

                self._cpu_fallback_active = True
                self.use_gpu = False
                self.gpu_info['cpu_fallback'] = True
                self.gpu_info['fallback_reason'] = 'GPU memory insufficient'

                if self._memory_guard:
                    self._memory_guard.clear_gpu_cache()

        service = MockOCRService()

        # Verify initial state
        assert service._cpu_fallback_active is False
        assert service.use_gpu is True

        # Activate fallback
        service._activate_cpu_fallback()

        # Verify fallback state
        assert service._cpu_fallback_active is True
        assert service.use_gpu is False
        assert service.gpu_info.get('cpu_fallback') is True
        service._memory_guard.clear_gpu_cache.assert_called_once()

    def test_cpu_fallback_idempotent(self):
        """Test that CPU fallback activation is idempotent"""
        class MockOCRService:
            def __init__(self):
                self._cpu_fallback_active = False
                self.use_gpu = True
                self._memory_guard = Mock()
                self.gpu_info = {}

            def _activate_cpu_fallback(self):
                if self._cpu_fallback_active:
                    return
                self._cpu_fallback_active = True
                self.use_gpu = False
                if self._memory_guard:
                    self._memory_guard.clear_gpu_cache()

        service = MockOCRService()

        # Activate twice
        service._activate_cpu_fallback()
        service._activate_cpu_fallback()

        # clear_gpu_cache should only be called once
        assert service._memory_guard.clear_gpu_cache.call_count == 1

    def test_gpu_mode_restoration_pattern(self):
        """Test the GPU mode restoration pattern"""
        # Simulate the pattern used in OCRService._restore_gpu_mode

        class MockOCRService:
            def __init__(self):
                self._cpu_fallback_active = True
                self.use_gpu = False
                self.gpu_available = True
                self.gpu_info = {
                    'device_id': 0,
                    'cpu_fallback': True,
                    'fallback_reason': 'test'
                }
                self._memory_guard = Mock()

            def _restore_gpu_mode(self):
                if not self._cpu_fallback_active:
                    return

                if not self.gpu_available:
                    return

                # Check if GPU memory is now available
                if self._memory_guard:
                    is_available, stats = self._memory_guard.check_memory(required_mb=2000)
                    if is_available:
                        self._cpu_fallback_active = False
                        self.use_gpu = True
                        self.gpu_info.pop('cpu_fallback', None)
                        self.gpu_info.pop('fallback_reason', None)

        service = MockOCRService()

        # Mock memory guard to indicate sufficient memory
        mock_stats = Mock()
        mock_stats.gpu_free_mb = 5000
        service._memory_guard.check_memory.return_value = (True, mock_stats)

        # Restore GPU mode
        service._restore_gpu_mode()

        # Verify GPU mode restored
        assert service._cpu_fallback_active is False
        assert service.use_gpu is True
        assert 'cpu_fallback' not in service.gpu_info

    def test_gpu_mode_not_restored_when_memory_still_low(self):
        """Test that GPU mode is not restored when memory is still low"""
        class MockOCRService:
            def __init__(self):
                self._cpu_fallback_active = True
                self.use_gpu = False
                self.gpu_available = True
                self.gpu_info = {'cpu_fallback': True}
                self._memory_guard = Mock()

            def _restore_gpu_mode(self):
                if not self._cpu_fallback_active:
                    return
                if not self.gpu_available:
                    return
                if self._memory_guard:
                    is_available, stats = self._memory_guard.check_memory(required_mb=2000)
                    if is_available:
                        self._cpu_fallback_active = False
                        self.use_gpu = True

        service = MockOCRService()

        # Mock memory guard to indicate insufficient memory
        mock_stats = Mock()
        mock_stats.gpu_free_mb = 500
        service._memory_guard.check_memory.return_value = (False, mock_stats)

        # Try to restore GPU mode
        service._restore_gpu_mode()

        # Verify still in fallback mode
        assert service._cpu_fallback_active is True
        assert service.use_gpu is False


class TestPreOperationMemoryCheckPattern:
    """Tests for pre-operation memory check pattern as used in OCRService"""

    def test_pre_operation_check_with_fallback(self):
        """Test the pre-operation memory check pattern with fallback"""
        guard = MemoryGuard(MemoryConfig(
            warning_threshold=0.80,
            critical_threshold=0.95,
            enable_cpu_fallback=True,
        ))

        # Simulate the pattern:
        # 1. Check if in CPU fallback mode
        # 2. Try to restore GPU mode if memory available
        # 3. Perform memory check for operation

        class MockService:
            def __init__(self):
                self._cpu_fallback_active = False
                self.use_gpu = True
                self.gpu_available = True
                self._memory_guard = guard

            def _restore_gpu_mode(self):
                pass  # Simplified

            def pre_operation_check(self, required_mb: int) -> bool:
                # Try restore first
                if self._cpu_fallback_active:
                    self._restore_gpu_mode()

                # Perform memory check
                if not self.use_gpu:
                    return True  # CPU mode, no GPU check needed

                is_available, stats = self._memory_guard.check_memory(required_mb=required_mb)
                return is_available

        service = MockService()

        # Mock sufficient memory
        with patch.object(guard, 'get_memory_stats') as mock_stats:
            mock_stats.return_value = MemoryStats(
                gpu_used_ratio=0.50,
                gpu_free_mb=4000,
                gpu_total_mb=8000,
            )

            result = service.pre_operation_check(required_mb=2000)
            assert result is True

        guard.shutdown()

    def test_pre_operation_check_returns_true_in_cpu_mode(self):
        """Test that pre-operation check returns True when in CPU mode"""
        class MockService:
            def __init__(self):
                self._cpu_fallback_active = True
                self.use_gpu = False
                self._memory_guard = Mock()

            def pre_operation_check(self, required_mb: int) -> bool:
                if not self.use_gpu:
                    return True  # CPU mode, no GPU check needed
                return False

        service = MockService()
        result = service.pre_operation_check(required_mb=5000)

        # Should return True because we're in CPU mode
        assert result is True
        # Memory guard should not be called
        service._memory_guard.check_memory.assert_not_called()


class TestMemoryCheckWithCleanup:
    """Tests for memory check with cleanup pattern"""

    def test_memory_check_triggers_cleanup_on_failure(self):
        """Test that memory check triggers cleanup when insufficient"""
        guard = MemoryGuard(MemoryConfig(
            warning_threshold=0.80,
            critical_threshold=0.95,
        ))

        # Track cleanup calls
        cleanup_called = False

        def mock_cleanup():
            nonlocal cleanup_called
            cleanup_called = True

        class MockService:
            def __init__(self):
                self._memory_guard = guard
                self.cleanup_func = mock_cleanup

            def check_gpu_memory(self, required_mb: int) -> bool:
                # First check
                with patch.object(self._memory_guard, 'get_memory_stats') as mock_stats:
                    # First call - low memory
                    mock_stats.return_value = MemoryStats(
                        gpu_used_ratio=0.96,
                        gpu_free_mb=300,
                        gpu_total_mb=8000,
                    )

                    is_available, stats = self._memory_guard.check_memory(required_mb=required_mb)

                    if not is_available:
                        # Trigger cleanup
                        self.cleanup_func()
                        self._memory_guard.clear_gpu_cache()
                        return False

                    return True

        service = MockService()
        result = service.check_gpu_memory(required_mb=1000)

        # Cleanup should have been triggered
        assert cleanup_called is True
        assert result is False

        guard.shutdown()


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
387
backend/tests/services/test_service_pool.py
Normal file
@@ -0,0 +1,387 @@
"""
Tests for OCR Service Pool

Tests OCRServicePool functionality including acquire, release, and concurrency.
"""

import pytest
import threading
import time
from unittest.mock import Mock, patch, MagicMock
import sys

# Mock paddle before importing service_pool to avoid import errors
# when paddle is not installed in the test environment
paddle_mock = MagicMock()
paddle_mock.is_compiled_with_cuda.return_value = False
paddle_mock.device.cuda.device_count.return_value = 0
paddle_mock.device.cuda.memory_allocated.return_value = 0
paddle_mock.device.cuda.memory_reserved.return_value = 0
paddle_mock.device.cuda.empty_cache = MagicMock()
sys.modules['paddle'] = paddle_mock

from app.services.service_pool import (
    OCRServicePool,
    PooledService,
    PoolConfig,
    ServiceState,
    get_service_pool,
    shutdown_service_pool,
)


class TestPoolConfig:
    """Tests for PoolConfig class"""

    def test_default_values(self):
        """Test default configuration values"""
        config = PoolConfig()
        assert config.max_services_per_device == 1
        assert config.max_total_services == 2
        assert config.acquire_timeout_seconds == 300.0
        assert config.max_queue_size == 50
        assert config.max_consecutive_errors == 3

    def test_custom_values(self):
        """Test custom configuration values"""
        config = PoolConfig(
            max_services_per_device=2,
            max_total_services=4,
            acquire_timeout_seconds=60.0,
        )
        assert config.max_services_per_device == 2
        assert config.max_total_services == 4
        assert config.acquire_timeout_seconds == 60.0


class TestPooledService:
    """Tests for PooledService class"""

    def test_creation(self):
        """Test PooledService creation"""
        mock_service = Mock()
        pooled = PooledService(
            service=mock_service,
            device="GPU:0",
        )
        assert pooled.service is mock_service
        assert pooled.device == "GPU:0"
        assert pooled.state == ServiceState.AVAILABLE
        assert pooled.use_count == 0
        assert pooled.error_count == 0


class TestOCRServicePool:
    """Tests for OCRServicePool class"""

    def setup_method(self):
        """Reset singleton before each test"""
        shutdown_service_pool()
        OCRServicePool._instance = None
        OCRServicePool._lock = threading.Lock()

    def teardown_method(self):
        """Cleanup after each test"""
        shutdown_service_pool()
        OCRServicePool._instance = None

    def test_singleton_pattern(self):
        """Test that OCRServicePool is a singleton"""
        pool1 = OCRServicePool()
        pool2 = OCRServicePool()
        assert pool1 is pool2
        pool1.shutdown()

    def test_initialize_device(self):
        """Test device initialization"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Default device should be initialized
        assert "GPU:0" in pool.services
        assert "GPU:0" in pool.semaphores

        # Test adding new device
        pool._initialize_device("GPU:1")
        assert "GPU:1" in pool.services
        assert "GPU:1" in pool.semaphores

        pool.shutdown()

    def test_acquire_creates_service(self):
        """Test that acquire hands out a pre-populated service and marks it in use"""
        config = PoolConfig(max_services_per_device=1)
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        mock_service.process = Mock()
        mock_service.get_gpu_status = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        pooled = pool.acquire(device="GPU:0", timeout=5.0)
        assert pooled is not None
        assert pooled.state == ServiceState.IN_USE
        assert pooled.use_count == 1

        pool.shutdown()

    def test_acquire_reuses_available_service(self):
        """Test that acquire reuses available services"""
        config = PoolConfig(max_services_per_device=1)
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        # First acquire
        pooled1 = pool.acquire(device="GPU:0")
        service_id = id(pooled1.service)
        pool.release(pooled1)

        # Second acquire should get the same service
        pooled2 = pool.acquire(device="GPU:0")
        assert id(pooled2.service) == service_id
        assert pooled2.use_count == 2

        pool.shutdown()

    def test_release_makes_service_available(self):
        """Test that release makes service available again"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        pooled = pool.acquire(device="GPU:0")
        assert pooled.state == ServiceState.IN_USE

        pool.release(pooled)
        assert pooled.state == ServiceState.AVAILABLE

        pool.shutdown()

    def test_release_with_error(self):
        """Test that release with error increments error count"""
        config = PoolConfig(max_consecutive_errors=3)
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        pooled = pool.acquire(device="GPU:0")
        pool.release(pooled, error=Exception("Test error"))

        assert pooled.error_count == 1
        assert pooled.state == ServiceState.AVAILABLE

        pool.shutdown()

    def test_release_marks_unhealthy_after_errors(self):
        """Test that service is marked unhealthy after too many errors"""
        config = PoolConfig(max_consecutive_errors=2)
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        pooled = pool.acquire(device="GPU:0")
        pool.release(pooled, error=Exception("Error 1"))

        pooled = pool.acquire(device="GPU:0")
        pool.release(pooled, error=Exception("Error 2"))

        assert pooled.state == ServiceState.UNHEALTHY
        assert pooled.error_count == 2

        pool.shutdown()

    def test_acquire_context_manager(self):
        """Test context manager for acquire/release"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        with pool.acquire_context(device="GPU:0") as pooled:
            assert pooled is not None
            assert pooled.state == ServiceState.IN_USE

        # After context, service should be available
        assert pooled.state == ServiceState.AVAILABLE

        pool.shutdown()

    def test_acquire_context_manager_with_error(self):
        """Test context manager releases on error"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        with pytest.raises(ValueError):
            with pool.acquire_context(device="GPU:0") as pooled:
                raise ValueError("Test error")

        # The error should have been recorded when the context released the service
        assert pooled.error_count == 1

        pool.shutdown()

    def test_acquire_timeout(self):
        """Test that acquire times out when no service available"""
        config = PoolConfig(
            max_services_per_device=1,
            max_total_services=1,
        )
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        # Acquire the only service
        pooled1 = pool.acquire(device="GPU:0")
        assert pooled1 is not None

        # Try to acquire another - should timeout
        pooled2 = pool.acquire(device="GPU:0", timeout=0.5)
        assert pooled2 is None

        pool.shutdown()

    def test_get_pool_stats(self):
        """Test pool statistics"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        # Acquire a service
        pooled = pool.acquire(device="GPU:0")

        stats = pool.get_pool_stats()
        assert stats["total_services"] == 1
        assert stats["in_use_services"] == 1
        assert stats["available_services"] == 0
        assert stats["metrics"]["total_acquisitions"] == 1

        pool.release(pooled)

        stats = pool.get_pool_stats()
        assert stats["available_services"] == 1
        assert stats["metrics"]["total_releases"] == 1

        pool.shutdown()

    def test_health_check(self):
        """Test health check functionality"""
        config = PoolConfig()
        pool = OCRServicePool(config)

        # Pre-populate with a mock service
        mock_service = Mock()
        mock_service.process = Mock()
        mock_service.get_gpu_status = Mock()
        pooled_service = PooledService(service=mock_service, device="GPU:0")
        pool.services["GPU:0"].append(pooled_service)

        # Acquire and release to update use_count
        pooled = pool.acquire(device="GPU:0")
        pool.release(pooled)

        health = pool.health_check()
        assert health["healthy"] is True
        assert len(health["services"]) == 1
        assert health["services"][0]["responsive"] is True

        pool.shutdown()

    def test_concurrent_acquire(self):
        """Test concurrent service acquisition"""
        config = PoolConfig(
            max_services_per_device=2,
            max_total_services=2,
        )
        pool = OCRServicePool(config)

        # Pre-populate with 2 mock services
        for i in range(2):
            mock_service = Mock()
            pooled_service = PooledService(service=mock_service, device="GPU:0")
            pool.services["GPU:0"].append(pooled_service)

        results = []

        def worker(worker_id):
            pooled = pool.acquire(device="GPU:0", timeout=5.0, task_id=f"task_{worker_id}")
            if pooled:
                results.append((worker_id, pooled))
                time.sleep(0.1)  # Simulate work
                pool.release(pooled)

        threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # All workers should have acquired a service
        assert len(results) == 4

        pool.shutdown()


class TestGetServicePool:
    """Tests for get_service_pool helper function"""

    def setup_method(self):
        """Reset singleton before each test"""
        shutdown_service_pool()
        OCRServicePool._instance = None

    def teardown_method(self):
        """Cleanup after each test"""
        shutdown_service_pool()
        OCRServicePool._instance = None

    def test_get_service_pool_creates_singleton(self):
        """Test that get_service_pool creates a singleton"""
        pool1 = get_service_pool()
        pool2 = get_service_pool()
        assert pool1 is pool2
        shutdown_service_pool()

    def test_shutdown_service_pool(self):
        """Test shutdown_service_pool cleans up"""
        pool = get_service_pool()
        shutdown_service_pool()

        # Should be able to create new pool
        new_pool = get_service_pool()
        assert new_pool._initialized is True
        shutdown_service_pool()


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
691
frontend/package-lock.json
generated
@@ -8,6 +8,7 @@
      "name": "frontend",
      "version": "0.0.0",
      "dependencies": {
        "@radix-ui/react-select": "^2.2.6",
        "@tanstack/react-query": "^5.90.7",
        "axios": "^1.13.2",
        "class-variance-authority": "^0.7.0",
@@ -87,7 +88,6 @@
      "integrity": "sha512-e7jT4DxYvIDLk1ZHmU/m/mB19rex9sv0c2ftBtjSBv+kVM/902eh0fINUzD7UwLLNR+jU585GxUJ8/EBfAM5fw==",
      "dev": true,
      "license": "MIT",
      "peer": true,
      "dependencies": {
        "@babel/code-frame": "^7.27.1",
        "@babel/generator": "^7.28.5",
@@ -947,6 +947,44 @@
        "node": "^18.18.0 || ^20.9.0 || >=21.1.0"
      }
    },
    "node_modules/@floating-ui/core": {
      "version": "1.7.3",
      "resolved": "https://registry.npmjs.org/@floating-ui/core/-/core-1.7.3.tgz",
      "integrity": "sha512-sGnvb5dmrJaKEZ+LDIpguvdX3bDlEllmv4/ClQ9awcmCZrlx5jQyyMWFM5kBI+EyNOCDDiKk8il0zeuX3Zlg/w==",
      "license": "MIT",
      "dependencies": {
        "@floating-ui/utils": "^0.2.10"
      }
    },
    "node_modules/@floating-ui/dom": {
      "version": "1.7.4",
      "resolved": "https://registry.npmjs.org/@floating-ui/dom/-/dom-1.7.4.tgz",
      "integrity": "sha512-OOchDgh4F2CchOX94cRVqhvy7b3AFb+/rQXyswmzmGakRfkMgoWVjfnLWkRirfLEfuD4ysVW16eXzwt3jHIzKA==",
      "license": "MIT",
      "dependencies": {
        "@floating-ui/core": "^1.7.3",
        "@floating-ui/utils": "^0.2.10"
      }
    },
    "node_modules/@floating-ui/react-dom": {
      "version": "2.1.6",
      "resolved": "https://registry.npmjs.org/@floating-ui/react-dom/-/react-dom-2.1.6.tgz",
      "integrity": "sha512-4JX6rEatQEvlmgU80wZyq9RT96HZJa88q8hp0pBd+LrczeDI4o6uA2M+uvxngVHo4Ihr8uibXxH6+70zhAFrVw==",
      "license": "MIT",
      "dependencies": {
        "@floating-ui/dom": "^1.7.4"
      },
      "peerDependencies": {
        "react": ">=16.8.0",
        "react-dom": ">=16.8.0"
      }
    },
    "node_modules/@floating-ui/utils": {
      "version": "0.2.10",
      "resolved": "https://registry.npmjs.org/@floating-ui/utils/-/utils-0.2.10.tgz",
      "integrity": "sha512-aGTxbpbg8/b5JfU1HXSrbH3wXZuLPJcNEcZQFMxLs3oSzgtVu6nFPkbbGGUvBcUjKV2YyB9Wxxabo+HEH9tcRQ==",
      "license": "MIT"
    },
    "node_modules/@humanfs/core": {
      "version": "0.19.1",
      "resolved": "https://registry.npmjs.org/@humanfs/core/-/core-0.19.1.tgz",
@@ -1272,6 +1310,502 @@
        "node": ">= 8"
      }
    },
    "node_modules/@radix-ui/number": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/@radix-ui/number/-/number-1.1.1.tgz",
      "integrity": "sha512-MkKCwxlXTgz6CFoJx3pCwn07GKp36+aZyu/u2Ln2VrA5DcdyCZkASEDBTd8x5whTQQL5CiYf4prXKLcgQdv29g==",
      "license": "MIT"
    },
    "node_modules/@radix-ui/primitive": {
      "version": "1.1.3",
      "resolved": "https://registry.npmjs.org/@radix-ui/primitive/-/primitive-1.1.3.tgz",
      "integrity": "sha512-JTF99U/6XIjCBo0wqkU5sK10glYe27MRRsfwoiq5zzOEZLHU3A3KCMa5X/azekYRCJ0HlwI0crAXS/5dEHTzDg==",
      "license": "MIT"
    },
    "node_modules/@radix-ui/react-arrow": {
      "version": "1.1.7",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-arrow/-/react-arrow-1.1.7.tgz",
      "integrity": "sha512-F+M1tLhO+mlQaOWspE8Wstg+z6PwxwRd8oQ8IXceWz92kfAmalTRf0EjrouQeo7QssEPfCn05B4Ihs1K9WQ/7w==",
      "license": "MIT",
      "dependencies": {
        "@radix-ui/react-primitive": "2.1.3"
      },
      "peerDependencies": {
        "@types/react": "*",
        "@types/react-dom": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
        "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
        "@types/react-dom": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-collection": {
      "version": "1.1.7",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-collection/-/react-collection-1.1.7.tgz",
      "integrity": "sha512-Fh9rGN0MoI4ZFUNyfFVNU4y9LUz93u9/0K+yLgA2bwRojxM8JU1DyvvMBabnZPBgMWREAJvU2jjVzq+LrFUglw==",
      "license": "MIT",
      "dependencies": {
        "@radix-ui/react-compose-refs": "1.1.2",
        "@radix-ui/react-context": "1.1.2",
        "@radix-ui/react-primitive": "2.1.3",
        "@radix-ui/react-slot": "1.2.3"
      },
      "peerDependencies": {
        "@types/react": "*",
        "@types/react-dom": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
        "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
        "@types/react-dom": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-compose-refs": {
      "version": "1.1.2",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-compose-refs/-/react-compose-refs-1.1.2.tgz",
      "integrity": "sha512-z4eqJvfiNnFMHIIvXP3CY57y2WJs5g2v3X0zm9mEJkrkNv4rDxu+sg9Jh8EkXyeqBkB7SOcboo9dMVqhyrACIg==",
      "license": "MIT",
      "peerDependencies": {
        "@types/react": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-context": {
      "version": "1.1.2",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-context/-/react-context-1.1.2.tgz",
      "integrity": "sha512-jCi/QKUM2r1Ju5a3J64TH2A5SpKAgh0LpknyqdQ4m6DCV0xJ2HG1xARRwNGPQfi1SLdLWZ1OJz6F4OMBBNiGJA==",
      "license": "MIT",
      "peerDependencies": {
        "@types/react": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-direction": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-direction/-/react-direction-1.1.1.tgz",
      "integrity": "sha512-1UEWRX6jnOA2y4H5WczZ44gOOjTEmlqv1uNW4GAJEO5+bauCBhv8snY65Iw5/VOS/ghKN9gr2KjnLKxrsvoMVw==",
      "license": "MIT",
      "peerDependencies": {
        "@types/react": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-dismissable-layer": {
      "version": "1.1.11",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-dismissable-layer/-/react-dismissable-layer-1.1.11.tgz",
      "integrity": "sha512-Nqcp+t5cTB8BinFkZgXiMJniQH0PsUt2k51FUhbdfeKvc4ACcG2uQniY/8+h1Yv6Kza4Q7lD7PQV0z0oicE0Mg==",
      "license": "MIT",
      "dependencies": {
        "@radix-ui/primitive": "1.1.3",
        "@radix-ui/react-compose-refs": "1.1.2",
        "@radix-ui/react-primitive": "2.1.3",
        "@radix-ui/react-use-callback-ref": "1.1.1",
        "@radix-ui/react-use-escape-keydown": "1.1.1"
      },
      "peerDependencies": {
        "@types/react": "*",
        "@types/react-dom": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
        "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
        "@types/react-dom": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-focus-guards": {
      "version": "1.1.3",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-focus-guards/-/react-focus-guards-1.1.3.tgz",
      "integrity": "sha512-0rFg/Rj2Q62NCm62jZw0QX7a3sz6QCQU0LpZdNrJX8byRGaGVTqbrW9jAoIAHyMQqsNpeZ81YgSizOt5WXq0Pw==",
      "license": "MIT",
      "peerDependencies": {
        "@types/react": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-focus-scope": {
      "version": "1.1.7",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-focus-scope/-/react-focus-scope-1.1.7.tgz",
      "integrity": "sha512-t2ODlkXBQyn7jkl6TNaw/MtVEVvIGelJDCG41Okq/KwUsJBwQ4XVZsHAVUkK4mBv3ewiAS3PGuUWuY2BoK4ZUw==",
      "license": "MIT",
      "dependencies": {
        "@radix-ui/react-compose-refs": "1.1.2",
        "@radix-ui/react-primitive": "2.1.3",
        "@radix-ui/react-use-callback-ref": "1.1.1"
      },
      "peerDependencies": {
        "@types/react": "*",
        "@types/react-dom": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
        "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
        "@types/react-dom": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-id": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-id/-/react-id-1.1.1.tgz",
      "integrity": "sha512-kGkGegYIdQsOb4XjsfM97rXsiHaBwco+hFI66oO4s9LU+PLAC5oJ7khdOVFxkhsmlbpUqDAvXw11CluXP+jkHg==",
      "license": "MIT",
      "dependencies": {
        "@radix-ui/react-use-layout-effect": "1.1.1"
      },
      "peerDependencies": {
        "@types/react": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-popper": {
      "version": "1.2.8",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-popper/-/react-popper-1.2.8.tgz",
      "integrity": "sha512-0NJQ4LFFUuWkE7Oxf0htBKS6zLkkjBH+hM1uk7Ng705ReR8m/uelduy1DBo0PyBXPKVnBA6YBlU94MBGXrSBCw==",
      "license": "MIT",
      "dependencies": {
        "@floating-ui/react-dom": "^2.0.0",
        "@radix-ui/react-arrow": "1.1.7",
        "@radix-ui/react-compose-refs": "1.1.2",
        "@radix-ui/react-context": "1.1.2",
        "@radix-ui/react-primitive": "2.1.3",
        "@radix-ui/react-use-callback-ref": "1.1.1",
        "@radix-ui/react-use-layout-effect": "1.1.1",
        "@radix-ui/react-use-rect": "1.1.1",
        "@radix-ui/react-use-size": "1.1.1",
        "@radix-ui/rect": "1.1.1"
      },
      "peerDependencies": {
        "@types/react": "*",
        "@types/react-dom": "*",
        "react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
        "react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
      },
      "peerDependenciesMeta": {
        "@types/react": {
          "optional": true
        },
        "@types/react-dom": {
          "optional": true
        }
      }
    },
    "node_modules/@radix-ui/react-portal": {
      "version": "1.1.9",
      "resolved": "https://registry.npmjs.org/@radix-ui/react-portal/-/react-portal-1.1.9.tgz",
      "integrity": "sha512-bpIxvq03if6UNwXZ+HTK71JLh4APvnXntDc6XOX8UVq4XQOVl7lwok0AvIl+b8zgCw3fSaVTZMpAPPagXbKmHQ==",
      "license": "MIT",
"dependencies": {
|
||||
"@radix-ui/react-primitive": "2.1.3",
|
||||
"@radix-ui/react-use-layout-effect": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"@types/react-dom": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
|
||||
"react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
},
|
||||
"@types/react-dom": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-primitive": {
|
||||
"version": "2.1.3",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-primitive/-/react-primitive-2.1.3.tgz",
|
||||
"integrity": "sha512-m9gTwRkhy2lvCPe6QJp4d3G1TYEUHn/FzJUtq9MjH46an1wJU+GdoGC5VLof8RX8Ft/DlpshApkhswDLZzHIcQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-slot": "1.2.3"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"@types/react-dom": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
|
||||
"react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
},
|
||||
"@types/react-dom": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-select": {
|
||||
"version": "2.2.6",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-select/-/react-select-2.2.6.tgz",
|
||||
"integrity": "sha512-I30RydO+bnn2PQztvo25tswPH+wFBjehVGtmagkU78yMdwTwVf12wnAOF+AeP8S2N8xD+5UPbGhkUfPyvT+mwQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/number": "1.1.1",
|
||||
"@radix-ui/primitive": "1.1.3",
|
||||
"@radix-ui/react-collection": "1.1.7",
|
||||
"@radix-ui/react-compose-refs": "1.1.2",
|
||||
"@radix-ui/react-context": "1.1.2",
|
||||
"@radix-ui/react-direction": "1.1.1",
|
||||
"@radix-ui/react-dismissable-layer": "1.1.11",
|
||||
"@radix-ui/react-focus-guards": "1.1.3",
|
||||
"@radix-ui/react-focus-scope": "1.1.7",
|
||||
"@radix-ui/react-id": "1.1.1",
|
||||
"@radix-ui/react-popper": "1.2.8",
|
||||
"@radix-ui/react-portal": "1.1.9",
|
||||
"@radix-ui/react-primitive": "2.1.3",
|
||||
"@radix-ui/react-slot": "1.2.3",
|
||||
"@radix-ui/react-use-callback-ref": "1.1.1",
|
||||
"@radix-ui/react-use-controllable-state": "1.2.2",
|
||||
"@radix-ui/react-use-layout-effect": "1.1.1",
|
||||
"@radix-ui/react-use-previous": "1.1.1",
|
||||
"@radix-ui/react-visually-hidden": "1.2.3",
|
||||
"aria-hidden": "^1.2.4",
|
||||
"react-remove-scroll": "^2.6.3"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"@types/react-dom": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
|
||||
"react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
},
|
||||
"@types/react-dom": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-slot": {
|
||||
"version": "1.2.3",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-slot/-/react-slot-1.2.3.tgz",
|
||||
"integrity": "sha512-aeNmHnBxbi2St0au6VBVC7JXFlhLlOnvIIlePNniyUNAClzmtAUEY8/pBiK3iHjufOlwA+c20/8jngo7xcrg8A==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-compose-refs": "1.1.2"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-callback-ref": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-callback-ref/-/react-use-callback-ref-1.1.1.tgz",
|
||||
"integrity": "sha512-FkBMwD+qbGQeMu1cOHnuGB6x4yzPjho8ap5WtbEJ26umhgqVXbhekKUQO+hZEL1vU92a3wHwdp0HAcqAUF5iDg==",
|
||||
"license": "MIT",
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-controllable-state": {
|
||||
"version": "1.2.2",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-controllable-state/-/react-use-controllable-state-1.2.2.tgz",
|
||||
"integrity": "sha512-BjasUjixPFdS+NKkypcyyN5Pmg83Olst0+c6vGov0diwTEo6mgdqVR6hxcEgFuh4QrAs7Rc+9KuGJ9TVCj0Zzg==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-use-effect-event": "0.0.2",
|
||||
"@radix-ui/react-use-layout-effect": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-effect-event": {
|
||||
"version": "0.0.2",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-effect-event/-/react-use-effect-event-0.0.2.tgz",
|
||||
"integrity": "sha512-Qp8WbZOBe+blgpuUT+lw2xheLP8q0oatc9UpmiemEICxGvFLYmHm9QowVZGHtJlGbS6A6yJ3iViad/2cVjnOiA==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-use-layout-effect": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-escape-keydown": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-escape-keydown/-/react-use-escape-keydown-1.1.1.tgz",
|
||||
"integrity": "sha512-Il0+boE7w/XebUHyBjroE+DbByORGR9KKmITzbR7MyQ4akpORYP/ZmbhAr0DG7RmmBqoOnZdy2QlvajJ2QA59g==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-use-callback-ref": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-layout-effect": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-layout-effect/-/react-use-layout-effect-1.1.1.tgz",
|
||||
"integrity": "sha512-RbJRS4UWQFkzHTTwVymMTUv8EqYhOp8dOOviLj2ugtTiXRaRQS7GLGxZTLL1jWhMeoSCf5zmcZkqTl9IiYfXcQ==",
|
||||
"license": "MIT",
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-previous": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-previous/-/react-use-previous-1.1.1.tgz",
|
||||
"integrity": "sha512-2dHfToCj/pzca2Ck724OZ5L0EVrr3eHRNsG/b3xQJLA2hZpVCS99bLAX+hm1IHXDEnzU6by5z/5MIY794/a8NQ==",
|
||||
"license": "MIT",
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-rect": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-rect/-/react-use-rect-1.1.1.tgz",
|
||||
"integrity": "sha512-QTYuDesS0VtuHNNvMh+CjlKJ4LJickCMUAqjlE3+j8w+RlRpwyX3apEQKGFzbZGdo7XNG1tXa+bQqIE7HIXT2w==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/rect": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-use-size": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-use-size/-/react-use-size-1.1.1.tgz",
|
||||
"integrity": "sha512-ewrXRDTAqAXlkl6t/fkXWNAhFX9I+CkKlw6zjEwk86RSPKwZr3xpBRso655aqYafwtnbpHLj6toFzmd6xdVptQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-use-layout-effect": "1.1.1"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/react-visually-hidden": {
|
||||
"version": "1.2.3",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/react-visually-hidden/-/react-visually-hidden-1.2.3.tgz",
|
||||
"integrity": "sha512-pzJq12tEaaIhqjbzpCuv/OypJY/BPavOofm+dbab+MHLajy277+1lLm6JFcGgF5eskJ6mquGirhXY2GD/8u8Ug==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@radix-ui/react-primitive": "2.1.3"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"@types/react-dom": "*",
|
||||
"react": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc",
|
||||
"react-dom": "^16.8 || ^17.0 || ^18.0 || ^19.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
},
|
||||
"@types/react-dom": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/@radix-ui/rect": {
|
||||
"version": "1.1.1",
|
||||
"resolved": "https://registry.npmjs.org/@radix-ui/rect/-/rect-1.1.1.tgz",
|
||||
"integrity": "sha512-HPwpGIzkl28mWyZqG52jiqDJ12waP11Pa1lGoiyUkIEuMLBP0oeK/C89esbXrxsky5we7dfd8U58nm0SgAWpVw==",
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/@rolldown/pluginutils": {
|
||||
"version": "1.0.0-beta.47",
|
||||
"resolved": "https://registry.npmjs.org/@rolldown/pluginutils/-/pluginutils-1.0.0-beta.47.tgz",
|
||||
@@ -1990,7 +2524,6 @@
|
||||
"integrity": "sha512-GNWcUTRBgIRJD5zj+Tq0fKOJ5XZajIiBroOF0yvj2bSU1WvNdYS/dn9UxwsujGW4JX06dnHyjV2y9rRaybH0iQ==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"undici-types": "~7.16.0"
|
||||
}
|
||||
@@ -2000,7 +2533,6 @@
|
||||
"resolved": "https://registry.npmjs.org/@types/react/-/react-19.2.4.tgz",
|
||||
"integrity": "sha512-tBFxBp9Nfyy5rsmefN+WXc1JeW/j2BpBHFdLZbEVfs9wn3E3NRFxwV0pJg8M1qQAexFpvz73hJXFofV0ZAu92A==",
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"csstype": "^3.0.2"
|
||||
}
|
||||
@@ -2009,7 +2541,7 @@
|
||||
"version": "19.2.3",
|
||||
"resolved": "https://registry.npmjs.org/@types/react-dom/-/react-dom-19.2.3.tgz",
|
||||
"integrity": "sha512-jp2L/eY6fn+KgVVQAOqYItbF0VY/YApe5Mz2F0aykSO8gx31bYCZyvSeYxCHKvzHG5eZjc+zyaS5BrBWya2+kQ==",
|
||||
"dev": true,
|
||||
"devOptional": true,
|
||||
"license": "MIT",
|
||||
"peerDependencies": {
|
||||
"@types/react": "^19.2.0"
|
||||
@@ -2067,7 +2599,6 @@
|
||||
"integrity": "sha512-tK3GPFWbirvNgsNKto+UmB/cRtn6TZfyw0D6IKrW55n6Vbs7KJoZtI//kpTKzE/DUmmnAFD8/Ca46s7Obs92/w==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"@typescript-eslint/scope-manager": "8.46.4",
|
||||
"@typescript-eslint/types": "8.46.4",
|
||||
@@ -2326,7 +2857,6 @@
|
||||
"integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"bin": {
|
||||
"acorn": "bin/acorn"
|
||||
},
|
||||
@@ -2384,6 +2914,18 @@
|
||||
"dev": true,
|
||||
"license": "Python-2.0"
|
||||
},
|
||||
"node_modules/aria-hidden": {
|
||||
"version": "1.2.6",
|
||||
"resolved": "https://registry.npmjs.org/aria-hidden/-/aria-hidden-1.2.6.tgz",
|
||||
"integrity": "sha512-ik3ZgC9dY/lYVVM++OISsaYDeg1tb0VtP5uL3ouh1koGOaUMDPpbFIei4JkFimWUFPn90sbMNMXQAIVOlnYKJA==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"tslib": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
}
|
||||
},
|
||||
"node_modules/asynckit": {
|
||||
"version": "0.4.0",
|
||||
"resolved": "https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz",
|
||||
@@ -2519,7 +3061,6 @@
|
||||
}
|
||||
],
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"baseline-browser-mapping": "^2.8.25",
|
||||
"caniuse-lite": "^1.0.30001754",
|
||||
@@ -2817,6 +3358,12 @@
|
||||
"node": ">=8"
|
||||
}
|
||||
},
|
||||
"node_modules/detect-node-es": {
|
||||
"version": "1.1.0",
|
||||
"resolved": "https://registry.npmjs.org/detect-node-es/-/detect-node-es-1.1.0.tgz",
|
||||
"integrity": "sha512-ypdmJU/TbBby2Dxibuv7ZLW3Bs1QEmM7nHjEANfohJLvE0XVujisn1qPJcZxg+qDucsr+bP6fLD1rPS3AhJ7EQ==",
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/devlop": {
|
||||
"version": "1.1.0",
|
||||
"resolved": "https://registry.npmjs.org/devlop/-/devlop-1.1.0.tgz",
|
||||
@@ -2981,7 +3528,6 @@
|
||||
"integrity": "sha512-BhHmn2yNOFA9H9JmmIVKJmd288g9hrVRDkdoIgRCRuSySRUHH7r/DI6aAXW9T1WwUuY3DFgrcaqB+deURBLR5g==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"@eslint-community/eslint-utils": "^4.8.0",
|
||||
"@eslint-community/regexpp": "^4.12.1",
|
||||
@@ -3414,6 +3960,15 @@
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
},
|
||||
"node_modules/get-nonce": {
|
||||
"version": "1.0.1",
|
||||
"resolved": "https://registry.npmjs.org/get-nonce/-/get-nonce-1.0.1.tgz",
|
||||
"integrity": "sha512-FJhYRoDaiatfEkUK8HKlicmu/3SGFD51q3itKDGoSTysQJBnfOcxU5GxnhE1E6soB76MbT0MBtnKJuXyAx+96Q==",
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=6"
|
||||
}
|
||||
},
|
||||
"node_modules/get-proto": {
|
||||
"version": "1.0.1",
|
||||
"resolved": "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz",
|
||||
@@ -3606,7 +4161,6 @@
|
||||
}
|
||||
],
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"@babel/runtime": "^7.27.6"
|
||||
},
|
||||
@@ -5096,7 +5650,6 @@
|
||||
}
|
||||
],
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"nanoid": "^3.3.11",
|
||||
"picocolors": "^1.1.1",
|
||||
@@ -5186,7 +5739,6 @@
|
||||
"resolved": "https://registry.npmjs.org/react/-/react-19.2.0.tgz",
|
||||
"integrity": "sha512-tmbWg6W31tQLeB5cdIBOicJDJRR2KzXsV7uSK9iNfLWQ5bIZfxuPEHp7M8wiHyHnn0DD1i7w3Zmin0FtkrwoCQ==",
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"engines": {
|
||||
"node": ">=0.10.0"
|
||||
}
|
||||
@@ -5196,7 +5748,6 @@
|
||||
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.2.0.tgz",
|
||||
"integrity": "sha512-UlbRu4cAiGaIewkPyiRGJk0imDN2T3JjieT6spoL2UeSf5od4n5LB/mQ4ejmxhCFT1tYe8IvaFulzynWovsEFQ==",
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"scheduler": "^0.27.0"
|
||||
},
|
||||
@@ -5332,6 +5883,53 @@
|
||||
"node": ">=0.10.0"
|
||||
}
|
||||
},
|
||||
"node_modules/react-remove-scroll": {
|
||||
"version": "2.7.1",
|
||||
"resolved": "https://registry.npmjs.org/react-remove-scroll/-/react-remove-scroll-2.7.1.tgz",
|
||||
"integrity": "sha512-HpMh8+oahmIdOuS5aFKKY6Pyog+FNaZV/XyJOq7b4YFwsFHe5yYfdbIalI4k3vU2nSDql7YskmUseHsRrJqIPA==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"react-remove-scroll-bar": "^2.3.7",
|
||||
"react-style-singleton": "^2.2.3",
|
||||
"tslib": "^2.1.0",
|
||||
"use-callback-ref": "^1.3.3",
|
||||
"use-sidecar": "^1.1.3"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/react-remove-scroll-bar": {
|
||||
"version": "2.3.8",
|
||||
"resolved": "https://registry.npmjs.org/react-remove-scroll-bar/-/react-remove-scroll-bar-2.3.8.tgz",
|
||||
"integrity": "sha512-9r+yi9+mgU33AKcj6IbT9oRCO78WriSj6t/cF8DWBZJ9aOGPOTEDvdUDz1FwKim7QXWwmHqtdHnRJfhAxEG46Q==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"react-style-singleton": "^2.2.2",
|
||||
"tslib": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/react-router": {
|
||||
"version": "7.9.6",
|
||||
"resolved": "https://registry.npmjs.org/react-router/-/react-router-7.9.6.tgz",
|
||||
@@ -5370,6 +5968,28 @@
|
||||
"react-dom": ">=18"
|
||||
}
|
||||
},
|
||||
"node_modules/react-style-singleton": {
|
||||
"version": "2.2.3",
|
||||
"resolved": "https://registry.npmjs.org/react-style-singleton/-/react-style-singleton-2.2.3.tgz",
|
||||
"integrity": "sha512-b6jSvxvVnyptAiLjbkWLE/lOnR4lfTtDAl+eUC7RZy+QQWc6wRzIV2CE6xBuMmDxc2qIihtDCZD5NPOFl7fRBQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"get-nonce": "^1.0.0",
|
||||
"tslib": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/remark-parse": {
|
||||
"version": "11.0.0",
|
||||
"resolved": "https://registry.npmjs.org/remark-parse/-/remark-parse-11.0.0.tgz",
|
||||
@@ -5691,7 +6311,6 @@
|
||||
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"engines": {
|
||||
"node": ">=12"
|
||||
},
|
||||
@@ -5770,7 +6389,6 @@
|
||||
"integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
|
||||
"devOptional": true,
|
||||
"license": "Apache-2.0",
|
||||
"peer": true,
|
||||
"bin": {
|
||||
"tsc": "bin/tsc",
|
||||
"tsserver": "bin/tsserver"
|
||||
@@ -5938,6 +6556,49 @@
|
||||
"punycode": "^2.1.0"
|
||||
}
|
||||
},
|
||||
"node_modules/use-callback-ref": {
|
||||
"version": "1.3.3",
|
||||
"resolved": "https://registry.npmjs.org/use-callback-ref/-/use-callback-ref-1.3.3.tgz",
|
||||
"integrity": "sha512-jQL3lRnocaFtu3V00JToYz/4QkNWswxijDaCVNZRiRTO3HQDLsdu1ZtmIUvV4yPp+rvWm5j0y0TG/S61cuijTg==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"tslib": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/use-sidecar": {
|
||||
"version": "1.1.3",
|
||||
"resolved": "https://registry.npmjs.org/use-sidecar/-/use-sidecar-1.1.3.tgz",
|
||||
"integrity": "sha512-Fedw0aZvkhynoPYlA5WXrMCAMm+nSWdZt6lzJQ7Ok8S6Q+VsHmHpRWndVRJ8Be0ZbkfPc5LRYH+5XrzXcEeLRQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"detect-node-es": "^1.1.0",
|
||||
"tslib": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=10"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@types/react": "*",
|
||||
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc"
|
||||
},
|
||||
"peerDependenciesMeta": {
|
||||
"@types/react": {
|
||||
"optional": true
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/use-sync-external-store": {
|
||||
"version": "1.6.0",
|
||||
"resolved": "https://registry.npmjs.org/use-sync-external-store/-/use-sync-external-store-1.6.0.tgz",
|
||||
@@ -5981,7 +6642,6 @@
|
||||
"integrity": "sha512-BxAKBWmIbrDgrokdGZH1IgkIk/5mMHDreLDmCJ0qpyJaAteP8NvMhkwr/ZCQNqNH97bw/dANTE9PDzqwJghfMQ==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"esbuild": "^0.25.0",
|
||||
"fdir": "^6.5.0",
|
||||
@@ -6075,7 +6735,6 @@
|
||||
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"engines": {
|
||||
"node": ">=12"
|
||||
},
|
||||
|
||||
@@ -10,6 +10,7 @@
|
||||
"preview": "vite preview"
|
||||
},
|
||||
"dependencies": {
|
||||
"@radix-ui/react-select": "^2.2.6",
|
||||
"@tanstack/react-query": "^5.90.7",
|
||||
"axios": "^1.13.2",
|
||||
"class-variance-authority": "^0.7.0",
|
||||
|
||||
@@ -1,11 +1,17 @@
|
||||
import { useState, useMemo } from 'react'
|
||||
import { Document, Page } from 'react-pdf'
|
||||
import { useState, useCallback, useMemo, useRef, useEffect } from 'react'
|
||||
import { Document, Page, pdfjs } from 'react-pdf'
|
||||
import type { PDFDocumentProxy } from 'pdfjs-dist'
|
||||
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
|
||||
import { Button } from '@/components/ui/button'
|
||||
import { ChevronLeft, ChevronRight, ZoomIn, ZoomOut } from 'lucide-react'
|
||||
import { ChevronLeft, ChevronRight, ZoomIn, ZoomOut, Loader2 } from 'lucide-react'
|
||||
import 'react-pdf/dist/Page/AnnotationLayer.css'
|
||||
import 'react-pdf/dist/Page/TextLayer.css'
|
||||
|
||||
// Configure standard font data URL for proper font rendering
|
||||
const pdfOptions = {
|
||||
standardFontDataUrl: `https://unpkg.com/pdfjs-dist@${pdfjs.version}/standard_fonts/`,
|
||||
}
|
||||
|
||||
interface PDFViewerProps {
|
||||
title?: string
|
||||
pdfUrl: string
|
||||
@@ -17,41 +23,56 @@ export default function PDFViewer({ title, pdfUrl, className, httpHeaders }: PDF
|
||||
const [numPages, setNumPages] = useState<number>(0)
|
||||
const [pageNumber, setPageNumber] = useState<number>(1)
|
||||
const [scale, setScale] = useState<number>(1.0)
|
||||
const [loading, setLoading] = useState<boolean>(true)
|
||||
const [documentLoaded, setDocumentLoaded] = useState<boolean>(false)
|
||||
const [error, setError] = useState<string | null>(null)
|
||||
|
||||
// Memoize the file prop to prevent unnecessary reloads
|
||||
// Store PDF document reference
|
||||
const pdfDocRef = useRef<PDFDocumentProxy | null>(null)
|
||||
|
||||
// Memoize file config to prevent unnecessary reloads
|
||||
const fileConfig = useMemo(() => {
|
||||
return httpHeaders ? { url: pdfUrl, httpHeaders } : pdfUrl
|
||||
}, [pdfUrl, httpHeaders])
|
||||
|
||||
const onDocumentLoadSuccess = ({ numPages }: { numPages: number }) => {
|
||||
setNumPages(numPages)
|
||||
setLoading(false)
|
||||
// Reset state when URL changes
|
||||
useEffect(() => {
|
||||
setDocumentLoaded(false)
|
||||
setError(null)
|
||||
}
|
||||
setNumPages(0)
|
||||
setPageNumber(1)
|
||||
pdfDocRef.current = null
|
||||
}, [pdfUrl])
|
||||
|
||||
const onDocumentLoadError = (error: Error) => {
|
||||
console.error('Error loading PDF:', error)
|
||||
setError('Failed to load PDF. Please try again later.')
|
||||
setLoading(false)
|
||||
}
|
||||
const onDocumentLoadSuccess = useCallback((pdf: { numPages: number }) => {
|
||||
pdfDocRef.current = pdf as unknown as PDFDocumentProxy
|
||||
setNumPages(pdf.numPages)
|
||||
setPageNumber(1)
|
||||
setDocumentLoaded(true)
|
||||
setError(null)
|
||||
}, [])
|
||||
|
||||
const goToPreviousPage = () => {
|
||||
const onDocumentLoadError = useCallback((err: Error) => {
|
||||
console.error('Error loading PDF:', err)
|
||||
setError('無法載入 PDF 檔案。請稍後再試。')
|
||||
setDocumentLoaded(false)
|
||||
pdfDocRef.current = null
|
||||
}, [])
|
||||
|
||||
const goToPreviousPage = useCallback(() => {
|
||||
setPageNumber((prev) => Math.max(prev - 1, 1))
|
||||
}
|
||||
}, [])
|
||||
|
||||
const goToNextPage = () => {
|
||||
const goToNextPage = useCallback(() => {
|
||||
setPageNumber((prev) => Math.min(prev + 1, numPages))
|
||||
}
|
||||
}, [numPages])
|
||||
|
||||
const zoomIn = () => {
|
||||
const zoomIn = useCallback(() => {
|
||||
setScale((prev) => Math.min(prev + 0.2, 3.0))
|
||||
}
|
||||
}, [])
|
||||
|
||||
const zoomOut = () => {
|
||||
const zoomOut = useCallback(() => {
|
||||
setScale((prev) => Math.max(prev - 0.2, 0.5))
|
||||
}
|
||||
}, [])
|
||||
|
||||
return (
|
||||
<Card className={className}>
|
||||
@@ -69,18 +90,18 @@ export default function PDFViewer({ title, pdfUrl, className, httpHeaders }: PDF
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={goToPreviousPage}
|
||||
disabled={pageNumber <= 1 || loading}
|
||||
disabled={pageNumber <= 1 || !documentLoaded}
|
||||
>
|
||||
<ChevronLeft className="h-4 w-4" />
|
||||
</Button>
|
||||
<span className="text-sm whitespace-nowrap">
|
||||
Page {pageNumber} of {numPages || '...'}
|
||||
第 {pageNumber} 頁 / 共 {numPages || '...'} 頁
|
||||
</span>
|
||||
<Button
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={goToNextPage}
|
||||
disabled={pageNumber >= numPages || loading}
|
||||
disabled={pageNumber >= numPages || !documentLoaded}
|
||||
>
|
||||
<ChevronRight className="h-4 w-4" />
|
||||
</Button>
|
||||
@@ -92,7 +113,7 @@ export default function PDFViewer({ title, pdfUrl, className, httpHeaders }: PDF
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={zoomOut}
|
||||
disabled={scale <= 0.5 || loading}
|
||||
disabled={scale <= 0.5 || !documentLoaded}
|
||||
>
|
||||
<ZoomOut className="h-4 w-4" />
|
||||
</Button>
|
||||
@@ -103,7 +124,7 @@ export default function PDFViewer({ title, pdfUrl, className, httpHeaders }: PDF
|
||||
variant="outline"
|
||||
size="sm"
|
||||
onClick={zoomIn}
|
||||
disabled={scale >= 3.0 || loading}
|
||||
disabled={scale >= 3.0 || !documentLoaded}
|
||||
>
|
||||
<ZoomIn className="h-4 w-4" />
|
||||
</Button>
|
||||
@@ -113,39 +134,48 @@ export default function PDFViewer({ title, pdfUrl, className, httpHeaders }: PDF
|
||||
{/* PDF Document */}
|
||||
<div className="border rounded-md bg-muted/10 overflow-auto max-h-[800px]">
|
||||
<div className="flex justify-center p-4">
|
||||
{loading && (
|
||||
<div className="flex items-center justify-center min-h-[400px]">
|
||||
<div className="animate-spin rounded-full h-12 w-12 border-b-2 border-primary"></div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{error && (
|
||||
{error ? (
|
||||
<div className="flex items-center justify-center min-h-[400px]">
|
||||
<div className="text-center">
|
||||
<p className="text-destructive font-semibold mb-2">Error</p>
|
||||
<p className="text-destructive font-semibold mb-2">錯誤</p>
|
||||
<p className="text-sm text-muted-foreground">{error}</p>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{!error && (
|
||||
) : (
|
||||
<Document
|
||||
key={pdfUrl}
|
||||
file={fileConfig}
|
||||
options={pdfOptions}
|
||||
onLoadSuccess={onDocumentLoadSuccess}
|
||||
onLoadError={onDocumentLoadError}
|
||||
loading={
|
||||
<div className="flex items-center justify-center min-h-[400px]">
|
||||
<div className="animate-spin rounded-full h-12 w-12 border-b-2 border-primary"></div>
|
||||
<div className="text-center">
|
||||
<Loader2 className="w-12 h-12 animate-spin text-primary mx-auto mb-4" />
|
||||
<p className="text-muted-foreground">載入 PDF 中...</p>
|
||||
</div>
|
||||
</div>
|
||||
}
|
||||
>
|
||||
<Page
|
||||
pageNumber={pageNumber}
|
||||
scale={scale}
|
||||
renderTextLayer={true}
|
||||
renderAnnotationLayer={true}
|
||||
className="shadow-lg"
|
||||
/>
|
||||
{documentLoaded && (
|
||||
<Page
|
||||
pageNumber={pageNumber}
|
||||
scale={scale}
|
||||
renderTextLayer={true}
|
||||
renderAnnotationLayer={true}
|
||||
className="shadow-lg"
|
||||
loading={
|
||||
<div className="flex items-center justify-center min-h-[400px]">
|
||||
<Loader2 className="w-8 h-8 animate-spin text-primary" />
|
||||
</div>
|
||||
}
|
||||
error={
|
||||
<div className="text-center p-4 text-destructive">
|
||||
無法載入第 {pageNumber} 頁
|
||||
</div>
|
||||
}
|
||||
/>
|
||||
)}
|
||||
</Document>
|
||||
)}
|
||||
</div>
|
||||
|
||||
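The page and zoom handlers in the PDFViewer diff above all follow the same pattern: a functional state update that clamps the new value to a fixed range. The `file` prop is either the bare URL or a `{ url, httpHeaders }` object. A minimal sketch of that logic as pure functions follows. The helper names (`clampPreviousPage`, `clampNextPage`, `clampZoomIn`, `clampZoomOut`, `buildFileConfig`), the `PdfFileConfig` type, and the `Record<string, string>` header type are illustrative assumptions and are not part of the commit.

```typescript
// Illustrative sketch only: pure helpers mirroring the updater callbacks in the
// PDFViewer diff. All names here are hypothetical and do not appear in the commit.

const MIN_SCALE = 0.5 // lower zoom bound used by zoomOut in the diff
const MAX_SCALE = 3.0 // upper zoom bound used by zoomIn in the diff
const SCALE_STEP = 0.2 // zoom increment used by both handlers

// Same expression as setPageNumber((prev) => Math.max(prev - 1, 1))
function clampPreviousPage(current: number): number {
  return Math.max(current - 1, 1)
}

// Same expression as setPageNumber((prev) => Math.min(prev + 1, numPages))
function clampNextPage(current: number, numPages: number): number {
  return Math.min(current + 1, numPages)
}

// Same expression as setScale((prev) => Math.min(prev + 0.2, 3.0))
function clampZoomIn(scale: number): number {
  return Math.min(scale + SCALE_STEP, MAX_SCALE)
}

// Same expression as setScale((prev) => Math.max(prev - 0.2, 0.5))
function clampZoomOut(scale: number): number {
  return Math.max(scale - SCALE_STEP, MIN_SCALE)
}

// Assumed shape of the value passed to <Document file={...}>; the header type
// is a guess, since PDFViewerProps is only partly visible in the diff.
type PdfFileConfig = string | { url: string; httpHeaders: Record<string, string> }

// Same branch as the useMemo body: wrap the URL only when headers are given.
function buildFileConfig(
  pdfUrl: string,
  httpHeaders?: Record<string, string>
): PdfFileConfig {
  return httpHeaders ? { url: pdfUrl, httpHeaders } : pdfUrl
}
```

Because `clampNextPage` depends on `numPages`, the corresponding `useCallback` in the diff lists `[numPages]` as a dependency, while the other three handlers can use an empty dependency array. Before a document has loaded, `numPages` is 0 and the next-page clamp would yield 0, which is presumably why the buttons are also disabled via `!documentLoaded`.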
46
frontend/src/components/TaskNotFound.tsx
Normal file
@@ -0,0 +1,46 @@
|
||||
import { useNavigate } from 'react-router-dom'
|
||||
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
|
||||
import { Button } from '@/components/ui/button'
|
||||
import { Trash2 } from 'lucide-react'
|
||||
|
||||
interface TaskNotFoundProps {
|
||||
taskId: string | null
|
||||
onClearAndUpload: () => void
|
||||
}
|
||||
|
||||
export default function TaskNotFound({ taskId, onClearAndUpload }: TaskNotFoundProps) {
|
||||
const navigate = useNavigate()
|
||||
|
||||
const handleClick = () => {
|
||||
onClearAndUpload()
|
||||
navigate('/upload')
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="flex items-center justify-center min-h-[60vh]">
|
||||
<Card className="max-w-md text-center">
|
||||
<CardHeader>
|
||||
<div className="flex justify-center mb-4">
|
||||
<div className="w-16 h-16 bg-destructive/10 rounded-full flex items-center justify-center">
|
||||
<Trash2 className="w-8 h-8 text-destructive" />
|
||||
</div>
|
||||
</div>
|
||||
<CardTitle className="text-xl">任務已刪除</CardTitle>
|
||||
</CardHeader>
|
||||
<CardContent className="space-y-4">
|
||||
<p className="text-muted-foreground">
|
||||
此任務已被刪除或不存在。請上傳新檔案以建立新任務。
|
||||
</p>
|
||||
{taskId && (
|
||||
<p className="text-xs text-muted-foreground font-mono">
|
||||
任務 ID: {taskId}
|
||||
</p>
|
||||
)}
|
||||
<Button onClick={handleClick} size="lg">
|
||||
前往上傳頁面
|
||||
</Button>
|
||||
</CardContent>
|
||||
</Card>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
@@ -1,12 +1,14 @@
|
||||
import * as React from 'react'
|
||||
import * as SelectPrimitive from '@radix-ui/react-select'
|
||||
import { cn } from '@/lib/utils'
|
||||
import { ChevronDown } from 'lucide-react'
|
||||
import { Check, ChevronDown, ChevronUp } from 'lucide-react'
|
||||
|
||||
export interface SelectProps extends React.SelectHTMLAttributes<HTMLSelectElement> {
|
||||
// Simple native select for backwards compatibility
|
||||
export interface NativeSelectProps extends React.SelectHTMLAttributes<HTMLSelectElement> {
|
||||
options: Array<{ value: string; label: string }>
|
||||
}
|
||||
|
||||
const Select = React.forwardRef<HTMLSelectElement, SelectProps>(
|
||||
const NativeSelect = React.forwardRef<HTMLSelectElement, NativeSelectProps>(
|
||||
({ className, options, ...props }, ref) => {
|
||||
return (
|
||||
<div className="relative">
|
||||
@@ -33,6 +35,168 @@ const Select = React.forwardRef<HTMLSelectElement, SelectProps>(
|
||||
)
|
||||
}
|
||||
)
|
||||
Select.displayName = 'Select'
|
||||
NativeSelect.displayName = 'NativeSelect'
|
||||
|
||||
export { Select }
|
||||
const Select = SelectPrimitive.Root
|
||||
|
||||
const SelectGroup = SelectPrimitive.Group
|
||||
|
||||
const SelectValue = SelectPrimitive.Value
|
||||
|
||||
const SelectTrigger = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.Trigger>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.Trigger>
|
||||
>(({ className, children, ...props }, ref) => (
|
||||
<SelectPrimitive.Trigger
|
||||
ref={ref}
|
||||
className={cn(
|
||||
'flex h-10 w-full items-center justify-between rounded-lg border border-border bg-background px-3 py-2 text-sm',
|
||||
'ring-offset-background placeholder:text-muted-foreground',
|
||||
'focus:outline-none focus:ring-2 focus:ring-primary/20 focus:border-primary',
|
||||
'disabled:cursor-not-allowed disabled:opacity-50',
|
||||
'[&>span]:line-clamp-1',
|
||||
className
|
||||
)}
|
||||
{...props}
|
||||
>
|
||||
{children}
|
||||
<SelectPrimitive.Icon asChild>
|
||||
<ChevronDown className="h-4 w-4 opacity-50" />
|
||||
</SelectPrimitive.Icon>
|
||||
</SelectPrimitive.Trigger>
|
||||
))
|
||||
SelectTrigger.displayName = SelectPrimitive.Trigger.displayName
|
||||
|
||||
const SelectScrollUpButton = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.ScrollUpButton>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.ScrollUpButton>
|
||||
>(({ className, ...props }, ref) => (
|
||||
<SelectPrimitive.ScrollUpButton
|
||||
ref={ref}
|
||||
className={cn(
|
||||
'flex cursor-default items-center justify-center py-1',
|
||||
className
|
||||
)}
|
||||
{...props}
|
||||
>
|
||||
<ChevronUp className="h-4 w-4" />
|
||||
</SelectPrimitive.ScrollUpButton>
|
||||
))
|
||||
SelectScrollUpButton.displayName = SelectPrimitive.ScrollUpButton.displayName
|
||||
|
||||
const SelectScrollDownButton = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.ScrollDownButton>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.ScrollDownButton>
|
||||
>(({ className, ...props }, ref) => (
|
||||
<SelectPrimitive.ScrollDownButton
|
||||
ref={ref}
|
||||
className={cn(
|
||||
'flex cursor-default items-center justify-center py-1',
|
||||
className
|
||||
)}
|
||||
{...props}
|
||||
>
|
||||
<ChevronDown className="h-4 w-4" />
|
||||
</SelectPrimitive.ScrollDownButton>
|
||||
))
|
||||
SelectScrollDownButton.displayName = SelectPrimitive.ScrollDownButton.displayName
|
||||
|
||||
const SelectContent = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.Content>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.Content>
|
||||
>(({ className, children, position = 'popper', ...props }, ref) => (
|
||||
<SelectPrimitive.Portal>
|
||||
<SelectPrimitive.Content
|
||||
ref={ref}
|
||||
className={cn(
|
||||
'relative z-50 max-h-96 min-w-[8rem] overflow-hidden rounded-lg border border-border bg-popover text-popover-foreground shadow-md',
|
||||
'data-[state=open]:animate-in data-[state=closed]:animate-out',
|
||||
'data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0',
|
||||
'data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95',
|
||||
'data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2',
|
||||
'data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2',
|
||||
position === 'popper' &&
|
||||
'data-[side=bottom]:translate-y-1 data-[side=left]:-translate-x-1 data-[side=right]:translate-x-1 data-[side=top]:-translate-y-1',
|
||||
className
|
||||
)}
|
||||
position={position}
|
||||
{...props}
|
||||
>
|
||||
<SelectScrollUpButton />
|
||||
<SelectPrimitive.Viewport
|
||||
className={cn(
|
||||
'p-1',
|
||||
position === 'popper' &&
|
||||
'h-[var(--radix-select-trigger-height)] w-full min-w-[var(--radix-select-trigger-width)]'
|
||||
)}
|
||||
>
|
||||
{children}
|
||||
</SelectPrimitive.Viewport>
|
||||
<SelectScrollDownButton />
|
||||
</SelectPrimitive.Content>
|
||||
</SelectPrimitive.Portal>
|
||||
))
|
||||
SelectContent.displayName = SelectPrimitive.Content.displayName
|
||||
|
||||
const SelectLabel = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.Label>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.Label>
|
||||
>(({ className, ...props }, ref) => (
|
||||
<SelectPrimitive.Label
|
||||
ref={ref}
|
||||
className={cn('py-1.5 pl-8 pr-2 text-sm font-semibold', className)}
|
||||
{...props}
|
||||
/>
|
||||
))
|
||||
SelectLabel.displayName = SelectPrimitive.Label.displayName
|
||||
|
||||
const SelectItem = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.Item>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.Item>
|
||||
>(({ className, children, ...props }, ref) => (
|
||||
<SelectPrimitive.Item
|
||||
ref={ref}
|
||||
className={cn(
|
||||
'relative flex w-full cursor-default select-none items-center rounded-md py-1.5 pl-8 pr-2 text-sm outline-none',
|
||||
'focus:bg-accent focus:text-accent-foreground',
|
||||
'data-[disabled]:pointer-events-none data-[disabled]:opacity-50',
|
||||
className
|
||||
)}
|
||||
{...props}
|
||||
>
|
||||
<span className="absolute left-2 flex h-3.5 w-3.5 items-center justify-center">
|
||||
<SelectPrimitive.ItemIndicator>
|
||||
<Check className="h-4 w-4" />
|
||||
</SelectPrimitive.ItemIndicator>
|
||||
</span>
|
||||
|
||||
<SelectPrimitive.ItemText>{children}</SelectPrimitive.ItemText>
|
||||
</SelectPrimitive.Item>
|
||||
))
|
||||
SelectItem.displayName = SelectPrimitive.Item.displayName
|
||||
|
||||
const SelectSeparator = React.forwardRef<
|
||||
React.ComponentRef<typeof SelectPrimitive.Separator>,
|
||||
React.ComponentPropsWithoutRef<typeof SelectPrimitive.Separator>
|
||||
>(({ className, ...props }, ref) => (
|
||||
<SelectPrimitive.Separator
|
||||
ref={ref}
|
||||
className={cn('-mx-1 my-1 h-px bg-muted', className)}
|
||||
{...props}
|
||||
/>
|
||||
))
|
||||
SelectSeparator.displayName = SelectPrimitive.Separator.displayName
|
||||
|
||||
export {
|
||||
Select,
|
||||
SelectGroup,
|
||||
SelectValue,
|
||||
SelectTrigger,
|
||||
SelectContent,
|
||||
SelectLabel,
|
||||
SelectItem,
|
||||
SelectSeparator,
|
||||
SelectScrollUpButton,
|
||||
SelectScrollDownButton,
|
||||
NativeSelect,
|
||||
}
|
||||
|
||||
64
frontend/src/hooks/useTaskValidation.ts
Normal file
@@ -0,0 +1,64 @@
import { useEffect, useState } from 'react'
import { useQuery } from '@tanstack/react-query'
import { useUploadStore } from '@/store/uploadStore'
import { apiClientV2 } from '@/services/apiV2'
import type { TaskDetail } from '@/types/apiV2'

interface UseTaskValidationResult {
  taskId: string | null
  taskDetail: TaskDetail | undefined
  isLoading: boolean
  isNotFound: boolean
  clearAndReset: () => void
}

/**
 * Hook for validating task existence and handling deleted tasks gracefully.
 * Shows loading state first, then either returns task data or marks as not found.
 */
export function useTaskValidation(options?: {
  refetchInterval?: number | false | ((query: any) => number | false)
}): UseTaskValidationResult {
  const { batchId, clearUpload } = useUploadStore()
  const taskId = batchId ? String(batchId) : null

  const [isNotFound, setIsNotFound] = useState(false)

  const { data: taskDetail, isLoading, error, isFetching } = useQuery({
    queryKey: ['taskDetail', taskId],
    queryFn: () => apiClientV2.getTask(taskId!),
    enabled: !!taskId && !isNotFound,
    retry: (failureCount, error: any) => {
      // Don't retry on 404
      if (error?.response?.status === 404) {
        return false
      }
      return failureCount < 2
    },
    refetchInterval: options?.refetchInterval ?? false,
    // Disable stale time to ensure we check fresh data
    staleTime: 0,
  })

  // Handle 404 error - mark as not found immediately
  useEffect(() => {
    if (error && (error as any)?.response?.status === 404) {
      setIsNotFound(true)
    }
  }, [error])

  // Clear state and store
  const clearAndReset = () => {
    clearUpload()
    setIsNotFound(false)
  }

  return {
    taskId,
    taskDetail,
    // Show loading if we have a taskId and are still fetching (but not if already marked as not found)
    isLoading: !!taskId && !isNotFound && (isLoading || isFetching) && !taskDetail,
    isNotFound,
    clearAndReset,
  }
}
@@ -1,4 +1,3 @@
|
||||
import { StrictMode } from 'react'
|
||||
import { createRoot } from 'react-dom/client'
|
||||
import { BrowserRouter } from 'react-router-dom'
|
||||
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
|
||||
@@ -10,8 +9,8 @@ import App from './App.tsx'
|
||||
|
||||
// Configure PDF.js worker for react-pdf
|
||||
import { pdfjs } from 'react-pdf'
|
||||
// Use the worker from react-pdf's bundled pdfjs-dist
|
||||
pdfjs.GlobalWorkerOptions.workerSrc = `//unpkg.com/pdfjs-dist@${pdfjs.version}/build/pdf.worker.min.mjs`
|
||||
// Use CDN for the worker (most reliable for Vite)
|
||||
pdfjs.GlobalWorkerOptions.workerSrc = `https://unpkg.com/pdfjs-dist@${pdfjs.version}/build/pdf.worker.min.mjs`
|
||||
|
||||
// Create React Query client
|
||||
const queryClient = new QueryClient({
|
||||
@@ -24,16 +23,16 @@ const queryClient = new QueryClient({
|
||||
},
|
||||
})
|
||||
|
||||
// Note: StrictMode disabled due to react-pdf incompatibility
|
||||
// StrictMode's double-invocation causes PDF worker race conditions
|
||||
createRoot(document.getElementById('root')!).render(
|
||||
<StrictMode>
|
||||
<QueryClientProvider client={queryClient}>
|
||||
<I18nextProvider i18n={i18n}>
|
||||
<ToastProvider>
|
||||
<BrowserRouter>
|
||||
<App />
|
||||
</BrowserRouter>
|
||||
</ToastProvider>
|
||||
</I18nextProvider>
|
||||
</QueryClientProvider>
|
||||
</StrictMode>,
|
||||
<QueryClientProvider client={queryClient}>
|
||||
<I18nextProvider i18n={i18n}>
|
||||
<ToastProvider>
|
||||
<BrowserRouter>
|
||||
<App />
|
||||
</BrowserRouter>
|
||||
</ToastProvider>
|
||||
</I18nextProvider>
|
||||
</QueryClientProvider>,
|
||||
)
|
||||
|
||||
@@ -1,26 +1,35 @@
|
||||
import { useEffect, useState } from 'react'
|
||||
import { useState, useEffect } from 'react'
|
||||
import { useNavigate } from 'react-router-dom'
|
||||
import { useTranslation } from 'react-i18next'
|
||||
import { useQuery, useMutation } from '@tanstack/react-query'
|
||||
import { useMutation } from '@tanstack/react-query'
|
||||
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
|
||||
import { Progress } from '@/components/ui/progress'
|
||||
import { Button } from '@/components/ui/button'
|
||||
import { Badge } from '@/components/ui/badge'
|
||||
import { useToast } from '@/components/ui/toast'
|
||||
import { useUploadStore } from '@/store/uploadStore'
|
||||
import { apiClientV2 } from '@/services/apiV2'
|
||||
import { Play, CheckCircle, FileText, AlertCircle, Clock, Activity, Loader2 } from 'lucide-react'
|
||||
import PPStructureParams from '@/components/PPStructureParams'
|
||||
import TaskNotFound from '@/components/TaskNotFound'
|
||||
import { useTaskValidation } from '@/hooks/useTaskValidation'
|
||||
import type { PPStructureV3Params, ProcessingOptions } from '@/types/apiV2'
|
||||
|
||||
export default function ProcessingPage() {
|
||||
const { t } = useTranslation()
|
||||
const navigate = useNavigate()
|
||||
const { toast } = useToast()
|
||||
const { batchId } = useUploadStore()
|
||||
|
||||
// In V2, batchId is actually a task_id (string)
|
||||
const taskId = batchId ? String(batchId) : null
|
||||
// Use shared hook for task validation
|
||||
const { taskId, taskDetail, isLoading: isValidating, isNotFound, clearAndReset } = useTaskValidation({
|
||||
refetchInterval: (query) => {
|
||||
const data = query.state.data
|
||||
if (!data) return 2000
|
||||
if (data.status === 'completed' || data.status === 'failed') {
|
||||
return false
|
||||
}
|
||||
return 2000
|
||||
},
|
||||
})
|
||||
|
||||
// PP-StructureV3 parameters state
|
||||
const [ppStructureParams, setPpStructureParams] = useState<PPStructureV3Params>({})
|
||||
@@ -56,22 +65,6 @@ export default function ProcessingPage() {
|
||||
},
|
||||
})
|
||||
|
||||
// Poll task status
|
||||
const { data: taskDetail } = useQuery({
|
||||
queryKey: ['taskDetail', taskId],
|
||||
queryFn: () => apiClientV2.getTask(taskId!),
|
||||
enabled: !!taskId,
|
||||
refetchInterval: (query) => {
|
||||
const data = query.state.data
|
||||
if (!data) return 2000
|
||||
// Stop polling if completed or failed
|
||||
if (data.status === 'completed' || data.status === 'failed') {
|
||||
return false
|
||||
}
|
||||
return 2000 // Poll every 2 seconds
|
||||
},
|
||||
})
|
||||
|
||||
// Auto-redirect when completed
|
||||
useEffect(() => {
|
||||
if (taskDetail?.status === 'completed') {
|
||||
@@ -115,6 +108,23 @@ export default function ProcessingPage() {
|
||||
}
|
||||
}
|
||||
|
||||
// Show loading while validating task
|
||||
if (isValidating) {
|
||||
return (
|
||||
<div className="flex items-center justify-center min-h-[60vh]">
|
||||
<div className="text-center">
|
||||
<Loader2 className="w-12 h-12 animate-spin text-primary mx-auto mb-4" />
|
||||
<p className="text-muted-foreground">載入任務資訊...</p>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
// Show message when task was deleted
|
||||
if (isNotFound) {
|
||||
return <TaskNotFound taskId={taskId} onClearAndUpload={clearAndReset} />
|
||||
}
|
||||
|
||||
// Show helpful message when no task is selected
|
||||
if (!taskId) {
|
||||
return (
|
||||
|
||||
@@ -1,29 +1,23 @@
|
||||
import { useMemo } from 'react'
|
||||
import { useNavigate } from 'react-router-dom'
|
||||
import { useTranslation } from 'react-i18next'
|
||||
import { useQuery } from '@tanstack/react-query'
|
||||
import { Button } from '@/components/ui/button'
|
||||
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
|
||||
import PDFViewer from '@/components/PDFViewer'
|
||||
import { useToast } from '@/components/ui/toast'
|
||||
import { useUploadStore } from '@/store/uploadStore'
|
||||
import { apiClientV2 } from '@/services/apiV2'
|
||||
import { FileText, Download, AlertCircle, TrendingUp, Clock, Layers, FileJson, Loader2 } from 'lucide-react'
|
||||
import { Badge } from '@/components/ui/badge'
|
||||
import TaskNotFound from '@/components/TaskNotFound'
|
||||
import { useTaskValidation } from '@/hooks/useTaskValidation'
|
||||
|
||||
export default function ResultsPage() {
|
||||
const { t } = useTranslation()
|
||||
const navigate = useNavigate()
|
||||
const { toast } = useToast()
|
||||
const { batchId } = useUploadStore()
|
||||
|
||||
// In V2, batchId is actually a task_id (string)
|
||||
const taskId = batchId ? String(batchId) : null
|
||||
|
||||
// Get task details
|
||||
const { data: taskDetail, isLoading } = useQuery({
|
||||
queryKey: ['taskDetail', taskId],
|
||||
queryFn: () => apiClientV2.getTask(taskId!),
|
||||
enabled: !!taskId,
|
||||
// Use shared hook for task validation
|
||||
const { taskId, taskDetail, isLoading, isNotFound, clearAndReset } = useTaskValidation({
|
||||
refetchInterval: (query) => {
|
||||
const data = query.state.data
|
||||
if (!data) return 2000
|
||||
@@ -34,6 +28,19 @@ export default function ResultsPage() {
|
||||
},
|
||||
})
|
||||
|
||||
// Construct PDF URL for preview - memoize to prevent unnecessary reloads
|
||||
// Must be called unconditionally before any early returns (React hooks rule)
|
||||
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'
|
||||
const pdfUrl = useMemo(() => {
|
||||
return taskId ? `${API_BASE_URL}/api/v2/tasks/${taskId}/download/pdf` : ''
|
||||
}, [taskId, API_BASE_URL])
|
||||
|
||||
// Get auth token for PDF preview - memoize to prevent new object reference each render
|
||||
const pdfHttpHeaders = useMemo(() => {
|
||||
const authToken = localStorage.getItem('auth_token_v2')
|
||||
return authToken ? { Authorization: `Bearer ${authToken}` } : undefined
|
||||
}, [])
|
||||
|
||||
const handleDownloadPDF = async () => {
|
||||
if (!taskId) return
|
||||
try {
|
||||
@@ -101,6 +108,23 @@ export default function ResultsPage() {
|
||||
}
|
||||
}
|
||||
|
||||
// Show loading while validating task
|
||||
if (isLoading) {
|
||||
return (
|
||||
<div className="flex items-center justify-center min-h-[60vh]">
|
||||
<div className="text-center">
|
||||
<Loader2 className="w-12 h-12 animate-spin text-primary mx-auto mb-4" />
|
||||
<p className="text-muted-foreground">載入任務結果...</p>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
// Show message when task was deleted
|
||||
if (isNotFound) {
|
||||
return <TaskNotFound taskId={taskId} onClearAndUpload={clearAndReset} />
|
||||
}
|
||||
|
||||
// Show helpful message when no task is selected
|
||||
if (!taskId) {
|
||||
return (
|
||||
@@ -127,17 +151,7 @@ export default function ResultsPage() {
|
||||
)
|
||||
}
|
||||
|
||||
if (isLoading) {
|
||||
return (
|
||||
<div className="flex items-center justify-center min-h-[60vh]">
|
||||
<div className="text-center">
|
||||
<Loader2 className="w-12 h-12 animate-spin text-primary mx-auto mb-4" />
|
||||
<p className="text-muted-foreground">載入任務結果...</p>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
// Fallback for no task detail (shouldn't happen with proper validation)
|
||||
if (!taskDetail) {
|
||||
return (
|
||||
<div className="flex items-center justify-center min-h-[60vh]">
|
||||
@@ -157,14 +171,6 @@ export default function ResultsPage() {
|
||||
|
||||
const isCompleted = taskDetail.status === 'completed'
|
||||
|
||||
// Construct PDF URL for preview
|
||||
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'
|
||||
const pdfUrl = taskId ? `${API_BASE_URL}/api/v2/tasks/${taskId}/download/pdf` : ''
|
||||
|
||||
// Get auth token for PDF preview
|
||||
const authToken = localStorage.getItem('auth_token_v2')
|
||||
const pdfHttpHeaders = authToken ? { Authorization: `Bearer ${authToken}` } : undefined
|
||||
|
||||
return (
|
||||
<div className="space-y-6">
|
||||
{/* Page Header */}
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
import { useMemo } from 'react'
|
||||
import { useParams, useNavigate } from 'react-router-dom'
|
||||
import { useTranslation } from 'react-i18next'
|
||||
import { useQuery } from '@tanstack/react-query'
|
||||
@@ -65,6 +66,19 @@ export default function TaskDetailPage() {
|
||||
retry: false,
|
||||
})
|
||||
|
||||
// Construct PDF URL for preview - memoize to prevent unnecessary reloads
|
||||
// Must be called unconditionally before any early returns (React hooks rule)
|
||||
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'
|
||||
const pdfUrl = useMemo(() => {
|
||||
return taskId ? `${API_BASE_URL}/api/v2/tasks/${taskId}/download/pdf` : ''
|
||||
}, [taskId, API_BASE_URL])
|
||||
|
||||
// Get auth token for PDF preview - memoize to prevent new object reference each render
|
||||
const pdfHttpHeaders = useMemo(() => {
|
||||
const authToken = localStorage.getItem('auth_token_v2')
|
||||
return authToken ? { Authorization: `Bearer ${authToken}` } : undefined
|
||||
}, [])
|
||||
|
||||
const getTrackBadge = (track?: ProcessingTrack) => {
|
||||
if (!track) return null
|
||||
switch (track) {
|
||||
@@ -218,14 +232,6 @@ export default function TaskDetailPage() {
|
||||
const isProcessing = taskDetail.status === 'processing'
|
||||
const isFailed = taskDetail.status === 'failed'
|
||||
|
||||
// Construct PDF URL for preview
|
||||
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8000'
|
||||
const pdfUrl = taskId ? `${API_BASE_URL}/api/v2/tasks/${taskId}/download/pdf` : ''
|
||||
|
||||
// Get auth token for PDF preview
|
||||
const authToken = localStorage.getItem('auth_token_v2')
|
||||
const pdfHttpHeaders = authToken ? { Authorization: `Bearer ${authToken}` } : undefined
|
||||
|
||||
return (
|
||||
<div className="space-y-6">
|
||||
{/* Page Header */}
|
||||
|
||||
@@ -28,7 +28,7 @@ import {
|
||||
TableHeader,
|
||||
TableRow,
|
||||
} from '@/components/ui/table'
|
||||
import { Select } from '@/components/ui/select'
|
||||
import { NativeSelect } from '@/components/ui/select'
|
||||
import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card'
|
||||
|
||||
export default function TaskHistoryPage() {
|
||||
@@ -112,6 +112,43 @@ export default function TaskHistoryPage() {
|
||||
}
|
||||
}
|
||||
|
||||
// Delete all tasks
|
||||
const handleDeleteAll = async () => {
|
||||
if (tasks.length === 0) {
|
||||
alert('沒有可刪除的任務')
|
||||
return
|
||||
}
|
||||
|
||||
if (!confirm(`確定要刪除所有 ${total} 個任務嗎?此操作無法復原!`)) return
|
||||
|
||||
try {
|
||||
setLoading(true)
|
||||
// Delete tasks one by one
|
||||
for (const task of tasks) {
|
||||
await apiClientV2.deleteTask(task.task_id)
|
||||
}
|
||||
// If there are more pages, keep fetching and deleting
|
||||
let hasMoreTasks = hasMore
|
||||
while (hasMoreTasks) {
|
||||
const response = await apiClientV2.listTasks({ page: 1, page_size: 100 })
|
||||
if (response.tasks.length === 0) break
|
||||
for (const task of response.tasks) {
|
||||
await apiClientV2.deleteTask(task.task_id)
|
||||
}
|
||||
hasMoreTasks = response.has_more
|
||||
}
|
||||
fetchTasks()
|
||||
fetchStats()
|
||||
alert('所有任務已刪除')
|
||||
} catch (err: any) {
|
||||
alert(err.response?.data?.detail || '刪除任務失敗')
|
||||
fetchTasks()
|
||||
fetchStats()
|
||||
} finally {
|
||||
setLoading(false)
|
||||
}
|
||||
}
|
||||
|
||||
// View task details
|
||||
const handleViewDetails = (taskId: string) => {
|
||||
navigate(`/tasks/${taskId}`)
|
||||
@@ -220,10 +257,16 @@ export default function TaskHistoryPage() {
|
||||
<h1 className="text-3xl font-bold text-gray-900">任務歷史</h1>
|
||||
<p className="text-gray-600 mt-1">查看和管理您的 OCR 任務</p>
|
||||
</div>
|
||||
<Button onClick={() => fetchTasks()} variant="outline">
|
||||
<RefreshCw className="w-4 h-4 mr-2" />
|
||||
刷新
|
||||
</Button>
|
||||
<div className="flex gap-2">
|
||||
<Button onClick={() => fetchTasks()} variant="outline">
|
||||
<RefreshCw className="w-4 h-4 mr-2" />
|
||||
刷新
|
||||
</Button>
|
||||
<Button onClick={handleDeleteAll} variant="destructive" disabled={loading || tasks.length === 0}>
|
||||
<Trash2 className="w-4 h-4 mr-2" />
|
||||
刪除全部
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Statistics */}
|
||||
@@ -288,7 +331,7 @@ export default function TaskHistoryPage() {
|
||||
<div className="grid grid-cols-1 md:grid-cols-4 gap-4">
|
||||
<div>
|
||||
<label className="block text-sm font-medium text-gray-700 mb-2">狀態</label>
|
||||
<Select
|
||||
<NativeSelect
|
||||
value={statusFilter}
|
||||
onChange={(e) => {
|
||||
setStatusFilter(e.target.value as any)
|
||||
|
||||
@@ -415,4 +415,173 @@ async def test_concurrent_load():
|
||||
### Phase 4: Hardening (Week 4)
- Stress testing
- Performance tuning
- Documentation and monitoring

## Configuration Settings Reference

All memory management settings are defined in `backend/app/core/config.py` under the `Settings` class.

### Memory Thresholds

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `memory_warning_threshold` | float | 0.80 | GPU memory usage ratio (0-1) at which warning alerts start |
| `memory_critical_threshold` | float | 0.95 | GPU memory usage ratio (0-1) at which operations are throttled |
| `memory_emergency_threshold` | float | 0.98 | GPU memory usage ratio (0-1) that triggers emergency cleanup |
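
As a rough illustration of how the three thresholds could partition GPU usage into levels, the sketch below maps a usage ratio to the most severe level it has reached. The names `MemoryLevel` and `classify_memory_usage` are hypothetical and not taken from the codebase, and treating each threshold as inclusive (`>=`) is an assumption.

```python
from enum import Enum


class MemoryLevel(Enum):
    """Hypothetical severity levels derived from the three thresholds."""

    NORMAL = "normal"
    WARNING = "warning"
    CRITICAL = "critical"
    EMERGENCY = "emergency"


def classify_memory_usage(
    used_ratio: float,
    warning: float = 0.80,
    critical: float = 0.95,
    emergency: float = 0.98,
) -> MemoryLevel:
    """Map a GPU memory usage ratio (0-1) to the most severe level reached."""
    if not 0.0 <= used_ratio <= 1.0:
        raise ValueError(f"used_ratio must be within [0, 1], got {used_ratio}")
    # Check the most severe threshold first so the highest level wins.
    if used_ratio >= emergency:
        return MemoryLevel.EMERGENCY
    if used_ratio >= critical:
        return MemoryLevel.CRITICAL
    if used_ratio >= warning:
        return MemoryLevel.WARNING
    return MemoryLevel.NORMAL
```

Checking from the highest threshold downward keeps the function correct even if the defaults are overridden, as long as the three values stay in ascending order.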

### Memory Monitoring

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `memory_check_interval_seconds` | int | 30 | Interval between background memory checks (seconds) |
| `enable_memory_alerts` | bool | True | Enable/disable memory threshold alerts |
| `gpu_memory_limit_mb` | int | 6144 | Maximum GPU memory to use (MB) |
| `gpu_memory_reserve_mb` | int | 512 | Memory reserved for CUDA overhead (MB) |

### Model Lifecycle Management

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `enable_model_lifecycle_management` | bool | True | Use ModelManager for model lifecycle |
| `model_idle_timeout_seconds` | int | 300 | Idle time after which models are unloaded (seconds) |
| `pp_structure_idle_timeout_seconds` | int | 300 | Idle time after which PP-StructureV3 is unloaded (seconds) |
| `structure_model_memory_mb` | int | 2000 | Estimated memory for PP-StructureV3 (MB) |
| `ocr_model_memory_mb` | int | 500 | Estimated memory per OCR language model (MB) |
| `enable_lazy_model_loading` | bool | True | Load models on demand |
| `auto_unload_unused_models` | bool | True | Auto-unload unused language models |
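
The interaction between reference counting and the idle timeouts above can be sketched as follows. This is a minimal illustration, not the project's `ModelManager`: the `ManagedModel` class and its methods are hypothetical, and the rule "unload only when no references remain and the idle timeout has elapsed" is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ManagedModel:
    """Hypothetical bookkeeping record for one loaded model."""

    model_id: str
    ref_count: int = 0
    last_used: float = field(default_factory=time.monotonic)

    def acquire(self) -> None:
        # A caller starts using the model: bump the count and refresh the clock.
        self.ref_count += 1
        self.last_used = time.monotonic()

    def release(self) -> None:
        # A caller is done: the idle period starts from the last release.
        if self.ref_count == 0:
            raise RuntimeError(f"{self.model_id} released more often than acquired")
        self.ref_count -= 1
        self.last_used = time.monotonic()

    def is_unloadable(self, idle_timeout_seconds: float, now: Optional[float] = None) -> bool:
        # Never unload a model that is still referenced, however long it has idled.
        now = time.monotonic() if now is None else now
        return self.ref_count == 0 and (now - self.last_used) >= idle_timeout_seconds
```

A background task could call `is_unloadable(300)` on each record every `memory_check_interval_seconds` and unload the models that qualify.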

### Service Pool Configuration

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `enable_service_pool` | bool | True | Use OCRServicePool |
| `max_services_per_device` | int | 1 | Max OCRService instances per GPU |
| `max_total_services` | int | 2 | Max total OCRService instances |
| `service_acquire_timeout_seconds` | float | 300.0 | Timeout for acquiring a service from the pool (seconds) |
| `max_queue_size` | int | 50 | Max pending tasks per device queue |
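
To make the acquire timeout concrete, here is a small standard-library sketch of a bounded pool whose `acquire` blocks until a service is free or the timeout expires. `SimpleServicePool` is a hypothetical stand-in, not the real `OCRServicePool`, and its interface is assumed for illustration only.

```python
import queue
from contextlib import contextmanager
from typing import Iterator, List


class SimpleServicePool:
    """Hypothetical fixed-size pool; services are handed out one caller at a time."""

    def __init__(self, services: List[object]) -> None:
        self._idle: "queue.Queue[object]" = queue.Queue()
        for service in services:
            self._idle.put(service)

    @contextmanager
    def acquire(self, timeout_seconds: float = 300.0) -> Iterator[object]:
        try:
            # Block until a service is idle or the timeout elapses.
            service = self._idle.get(timeout=timeout_seconds)
        except queue.Empty:
            raise TimeoutError(
                f"no service became available within {timeout_seconds}s"
            ) from None
        try:
            yield service
        finally:
            # Always hand the service back, even if the caller raised.
            self._idle.put(service)

    def available(self) -> int:
        return self._idle.qsize()
```

Wrapping the hand-back in `finally` mirrors the task list's requirement that services are released in `finally` blocks, so a failed OCR task cannot leak a pooled instance.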

### Concurrency Control

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `max_concurrent_predictions` | int | 2 | Max concurrent PP-StructureV3 predictions |
| `max_concurrent_pages` | int | 2 | Max pages processed concurrently |
| `inference_batch_size` | int | 1 | Batch size for inference |
| `enable_batch_processing` | bool | True | Enable batch processing for large docs |
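
A limit such as `max_concurrent_predictions` is commonly enforced with a semaphore. The sketch below shows the idea with `asyncio`; the helper name `run_limited` is hypothetical, and whether the backend uses `asyncio` or thread-based semaphores is not stated here, so treat this as one possible shape.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def run_limited(
    jobs: Sequence[Callable[[], Awaitable[Any]]],
    max_concurrent: int = 2,
) -> List[Any]:
    """Run all jobs, allowing at most `max_concurrent` to execute at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[Any]]) -> Any:
        # Each job waits here until one of the permits is free.
        async with semaphore:
            return await job()

    # gather preserves the input order of the results.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With the default of 2, a third prediction simply waits for a permit instead of competing for GPU memory with the two already running.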

### Recovery Settings

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `enable_cpu_fallback` | bool | True | Fall back to CPU when GPU memory is low |
| `enable_emergency_cleanup` | bool | True | Auto-cleanup on memory pressure |
| `enable_worker_restart` | bool | False | Restart workers on OOM (requires supervisor) |

### Feature Flags

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `enable_chart_recognition` | bool | True | Enable chart/diagram recognition |
| `enable_formula_recognition` | bool | True | Enable math formula recognition |
| `enable_table_recognition` | bool | True | Enable table structure recognition |
| `enable_seal_recognition` | bool | True | Enable seal/stamp recognition |
| `enable_text_recognition` | bool | True | Enable general text recognition |
| `enable_memory_optimization` | bool | True | Enable memory optimizations |

### Environment Variable Override

Every setting can be overridden with an environment variable whose name is the setting name in uppercase, with underscores kept:

```bash
# Example .env file
MEMORY_WARNING_THRESHOLD=0.75
MEMORY_CRITICAL_THRESHOLD=0.90
MAX_CONCURRENT_PREDICTIONS=1
GPU_MEMORY_LIMIT_MB=4096
ENABLE_CPU_FALLBACK=true
```
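
The name mapping can be pictured with a standard-library sketch. The project's `Settings` class derives from `BaseSettings`, which does this parsing itself; the hypothetical `read_override` helper below only imitates the uppercase lookup and a simple type coercion, and its accepted boolean spellings are an assumption.

```python
import os
from typing import Any, Mapping, Optional


def read_override(
    field_name: str,
    default: Any,
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the environment override for a setting, coerced to the default's type."""
    environ = os.environ if environ is None else environ
    raw = environ.get(field_name.upper())
    if raw is None:
        return default
    # bool must be handled before int/float: bool("false") would be True.
    if isinstance(default, bool):
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    return type(default)(raw)
```

For example, with `MEMORY_WARNING_THRESHOLD=0.75` in the environment, `read_override("memory_warning_threshold", 0.80)` yields `0.75`, while unset variables fall back to the defaults listed in the tables.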

### Recommended Configurations

#### RTX 4060 8GB (Default)
```bash
GPU_MEMORY_LIMIT_MB=6144
MAX_CONCURRENT_PREDICTIONS=2
MAX_CONCURRENT_PAGES=2
INFERENCE_BATCH_SIZE=1
```

#### RTX 3090 24GB
```bash
GPU_MEMORY_LIMIT_MB=20480
MAX_CONCURRENT_PREDICTIONS=4
MAX_CONCURRENT_PAGES=4
INFERENCE_BATCH_SIZE=2
```

#### CPU-Only Mode
```bash
FORCE_CPU_MODE=true
MAX_CONCURRENT_PREDICTIONS=1
ENABLE_CPU_FALLBACK=false
```

## Prometheus Metrics

The system exports Prometheus-format metrics via the `PrometheusMetrics` class. Available metrics:

### GPU Metrics
- `tool_ocr_memory_gpu_total_bytes` - Total GPU memory
- `tool_ocr_memory_gpu_used_bytes` - Used GPU memory
- `tool_ocr_memory_gpu_free_bytes` - Free GPU memory
- `tool_ocr_memory_gpu_utilization_ratio` - GPU utilization (0-1)

### Model Metrics
- `tool_ocr_memory_models_loaded_total` - Number of loaded models
- `tool_ocr_memory_models_memory_bytes` - Total memory used by models
- `tool_ocr_memory_model_ref_count{model_id}` - Reference count per model

### Prediction Metrics
- `tool_ocr_memory_predictions_active` - Currently active predictions
- `tool_ocr_memory_predictions_queue_depth` - Predictions waiting in queue
- `tool_ocr_memory_predictions_total` - Total predictions processed (counter)
- `tool_ocr_memory_predictions_timeouts_total` - Total prediction timeouts (counter)

### Pool Metrics
- `tool_ocr_memory_pool_services_total` - Total services in pool
- `tool_ocr_memory_pool_services_available` - Available services
- `tool_ocr_memory_pool_services_in_use` - Services in use
- `tool_ocr_memory_pool_acquisitions_total` - Total acquisitions (counter)

### Recovery Metrics
- `tool_ocr_memory_recovery_count_total` - Total recovery attempts
- `tool_ocr_memory_recovery_in_cooldown` - In cooldown (0/1)
- `tool_ocr_memory_recovery_cooldown_remaining_seconds` - Remaining cooldown
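
For readers unfamiliar with the Prometheus text exposition format, the sketch below renders one gauge sample with its `# HELP` and `# TYPE` lines. The `format_gauge` helper is hypothetical and is not the `PrometheusMetrics` API; it also skips the escaping of backslashes, quotes, and newlines that label values require in general.

```python
from typing import Dict, Optional


def format_gauge(
    name: str,
    value: float,
    help_text: str,
    labels: Optional[Dict[str, str]] = None,
) -> str:
    """Render a single gauge in Prometheus text exposition format."""
    label_part = ""
    if labels:
        # Sort labels so the output is stable between scrapes.
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_part = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_part} {value}\n"
    )
```

A labeled metric such as the per-model reference count would be rendered by passing `labels={"model_id": ...}`, producing one sample line per model.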

## Memory Dump API

The `MemoryDumper` class provides debugging capabilities:

```python
from app.services.memory_manager import get_memory_dumper

dumper = get_memory_dumper()

# Create a baseline memory dump (optionally including large Python objects)
dump_before = dumper.create_dump(include_python_objects=True)

# Convert a dump to a dictionary for JSON serialization
dump_dict = dumper.to_dict(dump_before)

# ... run the workload under investigation ...

# Take a second dump and compare the two to detect memory growth
dump_after = dumper.create_dump(include_python_objects=True)
comparison = dumper.compare_dumps(dump_before, dump_after)
```

Memory dumps include:
- GPU/CPU memory usage
- Loaded models and reference counts
- Active predictions and queue state
- Service pool statistics
- Recovery manager state
- Python GC statistics
- Large Python objects (optional)
@@ -3,123 +3,123 @@
|
||||
## Section 1: Model Lifecycle Management (Priority: Critical)
|
||||
|
||||
### 1.1 Create ModelManager class
|
||||
- [ ] Design ModelManager interface with load/unload/get methods
|
||||
- [ ] Implement reference counting for model instances
|
||||
- [ ] Add idle timeout tracking with configurable thresholds
|
||||
- [ ] Create teardown() method for explicit cleanup
|
||||
- [ ] Add logging for model lifecycle events
|
||||
- [x] Design ModelManager interface with load/unload/get methods
|
||||
- [x] Implement reference counting for model instances
|
||||
- [x] Add idle timeout tracking with configurable thresholds
|
||||
- [x] Create teardown() method for explicit cleanup
|
||||
- [x] Add logging for model lifecycle events
|
||||
|
||||
### 1.2 Integrate PP-StructureV3 with ModelManager
|
||||
- [ ] Remove permanent exemption from unloading (lines 255-267)
|
||||
- [ ] Wrap PP-StructureV3 in managed model wrapper
|
||||
- [ ] Implement lazy loading on first access
|
||||
- [ ] Add unload capability with cache clearing
|
||||
- [ ] Test model reload after unload
|
||||
- [x] Remove permanent exemption from unloading (lines 255-267)
|
||||
- [x] Wrap PP-StructureV3 in managed model wrapper
|
||||
- [x] Implement lazy loading on first access
|
||||
- [x] Add unload capability with cache clearing
|
||||
- [x] Test model reload after unload
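
The Section 1 tasks combine lazy loading, reference counting, and idle-timeout unloading. A minimal sketch of how those three ideas fit together follows; `RefCountedModels` and its methods are hypothetical stand-ins, not the `ModelManager` interface from `memory_manager.py`. The 300-second default mirrors `pp_structure_idle_timeout_seconds` in the configuration.

```python
import threading
import time


class RefCountedModels:
    """Hypothetical sketch: lazy load, reference counts, idle-timeout unload."""

    def __init__(self, idle_timeout_seconds=300.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._entries = {}  # name -> {"model", "refs", "last_used"}
        self._idle_timeout = idle_timeout_seconds
        self._clock = clock

    def acquire(self, name, loader):
        """Load the model on first access, then increment its reference count."""
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                entry = {"model": loader(), "refs": 0, "last_used": self._clock()}
                self._entries[name] = entry
            entry["refs"] += 1
            entry["last_used"] = self._clock()
            return entry["model"]

    def release(self, name):
        """Decrement the reference count and record the time of last use."""
        with self._lock:
            entry = self._entries[name]
            entry["refs"] = max(0, entry["refs"] - 1)
            entry["last_used"] = self._clock()

    def unload_idle(self):
        """Drop unreferenced models idle for at least the timeout; return their names."""
        now = self._clock()
        with self._lock:
            idle = [
                name
                for name, entry in self._entries.items()
                if entry["refs"] == 0 and now - entry["last_used"] >= self._idle_timeout
            ]
            for name in idle:
                del self._entries[name]
        return idle
```

A model that is still referenced is never unloaded, however long it has been idle, so an in-flight prediction cannot lose its model. After an unload, the next `acquire` simply calls the loader again.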

## Section 2: Service Singleton Pattern (Priority: Critical)

### 2.1 Create OCRServicePool
- [ ] Design pool interface with acquire/release methods
- [ ] Implement per-device instance management
- [ ] Add queue-based task distribution
- [ ] Implement concurrency limits via semaphores
- [ ] Add health check for pooled instances
- [x] Design pool interface with acquire/release methods
- [x] Implement per-device instance management
- [x] Add queue-based task distribution
- [x] Implement concurrency limits via semaphores
- [x] Add health check for pooled instances

### 2.2 Refactor task router
- [ ] Replace OCRService() instantiation with pool.acquire()
- [ ] Add proper release in finally blocks
- [ ] Handle pool exhaustion gracefully
- [ ] Add metrics for pool utilization
- [ ] Update error handling for pooled services
- [x] Replace OCRService() instantiation with pool.acquire()
- [x] Add proper release in finally blocks
- [x] Handle pool exhaustion gracefully
- [x] Add metrics for pool utilization
- [x] Update error handling for pooled services
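
The acquire/release-in-`finally` pattern from Section 2 can be sketched as follows. `TinyPool` is a hypothetical illustration built on the standard library `queue.Queue`; it is not the `OCRServicePool` from `service_pool.py`, whose signatures are not shown here.

```python
import queue
from contextlib import contextmanager


class TinyPool:
    """Hypothetical sketch of a fixed-size service pool."""

    def __init__(self, services):
        self._available = queue.Queue()
        for service in services:
            self._available.put(service)

    def acquire(self, timeout=None):
        """Block until a service is free; raise TimeoutError on pool exhaustion."""
        try:
            return self._available.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no pooled service became available") from None

    def release(self, service):
        """Return a service to the pool."""
        self._available.put(service)

    @contextmanager
    def borrowed(self, timeout=None):
        """Acquire a service and guarantee its release, even if the body raises."""
        service = self.acquire(timeout=timeout)
        try:
            yield service
        finally:
            self.release(service)
```

Wrapping the release in `finally` (here via a context manager) is what keeps a failed task from permanently shrinking the pool. Raising a distinct error on exhaustion lets the router return a "busy" response instead of hanging.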

## Section 3: Enhanced Memory Monitoring (Priority: High)

### 3.1 Create MemoryGuard class
- [ ] Implement paddle.device.cuda memory queries
- [ ] Add pynvml integration as fallback
- [ ] Add torch memory query support
- [ ] Create configurable threshold system
- [ ] Implement memory prediction for operations
- [x] Implement paddle.device.cuda memory queries
- [x] Add pynvml integration as fallback
- [x] Add torch memory query support
- [x] Create configurable threshold system
- [x] Implement memory prediction for operations

### 3.2 Integrate memory checks
- [ ] Replace existing check_gpu_memory implementation
- [ ] Add pre-operation memory checks
- [ ] Implement CPU fallback when memory low
- [ ] Add memory usage logging
- [ ] Create memory pressure alerts
- [x] Replace existing check_gpu_memory implementation
- [x] Add pre-operation memory checks
- [x] Implement CPU fallback when memory low
- [x] Add memory usage logging
- [x] Create memory pressure alerts
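
As a rough illustration of the threshold system and pre-operation check in Section 3, the sketch below classifies a usage ratio against the default thresholds from the configuration (0.80, 0.95, 0.98). The function names `classify_memory` and `fits_before_critical` are assumptions, and the memory figures would in practice come from a GPU query rather than arguments.

```python
def classify_memory(used_mb, total_mb, warning=0.80, critical=0.95, emergency=0.98):
    """Map a usage ratio to a pressure level, checking the highest threshold first."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    ratio = used_mb / total_mb
    if ratio >= emergency:
        return "emergency"
    if ratio >= critical:
        return "critical"
    if ratio >= warning:
        return "warning"
    return "normal"


def fits_before_critical(required_mb, used_mb, total_mb, critical=0.95):
    """Pre-operation check: would the operation keep usage below the critical ratio?"""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return (used_mb + required_mb) / total_mb < critical


# Example: an estimated 2000 MB model load on an 8000 MB device.
print(classify_memory(6800, 8000))
print(fits_before_critical(2000, 5000, 8000))
```

When the pre-operation check fails, the caller can fall back to CPU or trigger cleanup first, rather than discovering the shortage as an out-of-memory error mid-prediction.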

## Section 4: Concurrency Control (Priority: High)

### 4.1 Implement prediction semaphores
- [ ] Add semaphore for PP-StructureV3.predict
- [ ] Configure max concurrent predictions
- [ ] Add queue for waiting predictions
- [ ] Implement timeout handling
- [ ] Add metrics for queue depth
- [x] Add semaphore for PP-StructureV3.predict
- [x] Configure max concurrent predictions
- [x] Add queue for waiting predictions
- [x] Implement timeout handling
- [x] Add metrics for queue depth

### 4.2 Add selective processing
- [ ] Create config for disabling chart/formula/table
- [ ] Implement batch processing for large documents
- [ ] Add progressive loading for multi-page docs
- [ ] Create priority queue for operations
- [ ] Test memory savings with selective processing
- [x] Create config for disabling chart/formula/table
- [x] Implement batch processing for large documents
- [x] Add progressive loading for multi-page docs
- [x] Create priority queue for operations
- [x] Test memory savings with selective processing
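
The prediction semaphore described in Section 4.1 (a concurrency limit, a timeout, and queue-depth tracking) could look roughly like this. `BoundedPredictions` and `PredictionTimeout` are hypothetical names; the real `PredictionSemaphore` API is not shown in this document.

```python
import threading
from contextlib import contextmanager


class PredictionTimeout(Exception):
    """Raised when no prediction slot frees up within the timeout."""


class BoundedPredictions:
    """Hypothetical sketch: limit concurrent predictions and count waiters/timeouts."""

    def __init__(self, max_concurrent=1):
        self._sem = threading.Semaphore(max_concurrent)
        self._lock = threading.Lock()
        self.waiting = 0   # callers currently blocked on a slot (queue depth)
        self.timeouts = 0  # callers that gave up waiting

    @contextmanager
    def slot(self, timeout=None):
        with self._lock:
            self.waiting += 1
        try:
            acquired = self._sem.acquire(timeout=timeout)
        finally:
            with self._lock:
                self.waiting -= 1
        if not acquired:
            with self._lock:
                self.timeouts += 1
            raise PredictionTimeout("timed out waiting for a prediction slot")
        try:
            yield
        finally:
            # Always free the slot, even if the prediction raised.
            self._sem.release()
```

A caller would wrap each predict call in `with gate.slot(timeout=...):`. The `waiting` and `timeouts` counters correspond in spirit to the queue-depth and timeout metrics listed earlier.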

## Section 5: Active Memory Management (Priority: Medium)

### 5.1 Create memory monitor thread
- [ ] Implement background monitoring loop
- [ ] Add periodic memory metrics collection
- [ ] Create threshold-based triggers
- [ ] Implement automatic cache clearing
- [ ] Add LRU-based model unloading
- [x] Implement background monitoring loop
- [x] Add periodic memory metrics collection
- [x] Create threshold-based triggers
- [x] Implement automatic cache clearing
- [x] Add LRU-based model unloading

### 5.2 Add recovery mechanisms
- [ ] Implement emergency memory release
- [ ] Add worker process restart capability
- [ ] Create memory dump for debugging
- [ ] Add cooldown period after recovery
- [ ] Test recovery under various scenarios
- [x] Implement emergency memory release
- [x] Add worker process restart capability (RecoveryManager)
- [x] Create memory dump for debugging
- [x] Add cooldown period after recovery
- [x] Test recovery under various scenarios
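
The cooldown and attempt-limit behaviour in Section 5.2 can be sketched as below. `CooldownRecovery`, its defaults, and the injected `clock` are assumptions made for illustration; they are not taken from the `RecoveryManager` implementation.

```python
import time


class CooldownRecovery:
    """Hypothetical sketch: rate-limit recovery attempts with a cooldown and a cap."""

    def __init__(self, cooldown_seconds=30.0, max_attempts=3, clock=time.monotonic):
        self._cooldown = cooldown_seconds
        self._max_attempts = max_attempts
        self._clock = clock
        self._last_attempt = None
        self.attempts = 0

    def cooldown_remaining(self):
        """Seconds left before another attempt is allowed (0.0 if none pending)."""
        if self._last_attempt is None:
            return 0.0
        elapsed = self._clock() - self._last_attempt
        return max(0.0, self._cooldown - elapsed)

    def try_recover(self, release_fn):
        """Run release_fn unless in cooldown or out of attempts; return whether it ran."""
        if self.attempts >= self._max_attempts or self.cooldown_remaining() > 0:
            return False
        self.attempts += 1
        self._last_attempt = self._clock()
        release_fn()
        return True
```

The cooldown prevents a tight loop of emergency cleanups when memory stays high, and the attempt cap gives an escalation point (for example, restarting the worker) once repeated recovery has not helped.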

## Section 6: Cleanup Hooks (Priority: Medium)

### 6.1 Implement shutdown handlers
- [ ] Add FastAPI shutdown event handler
- [ ] Create signal handlers (SIGTERM, SIGINT)
- [ ] Implement graceful model unloading
- [ ] Add connection draining
- [ ] Test shutdown sequence
- [x] Add FastAPI shutdown event handler
- [x] Create signal handlers (SIGTERM, SIGINT)
- [x] Implement graceful model unloading
- [x] Add connection draining
- [x] Test shutdown sequence

### 6.2 Add task cleanup
- [ ] Wrap background tasks with cleanup
- [ ] Add success/failure callbacks
- [ ] Implement resource release on completion
- [ ] Add cleanup verification logging
- [ ] Test cleanup in error scenarios
- [x] Wrap background tasks with cleanup
- [x] Add success/failure callbacks
- [x] Implement resource release on completion
- [x] Add cleanup verification logging
- [x] Test cleanup in error scenarios
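
For the task cleanup items in Section 6.2, one possible shape of a wrapper that runs success/failure callbacks and always releases resources is sketched here. `run_with_cleanup` and its parameters are hypothetical; the commit's actual wrapper is not shown in this document.

```python
def run_with_cleanup(task, cleanup, on_success=None, on_failure=None):
    """Run task(); call the matching callback; always call cleanup() last."""
    try:
        result = task()
    except Exception as exc:
        if on_failure is not None:
            on_failure(exc)
        raise  # Re-raise so the caller still sees the original error.
    else:
        if on_success is not None:
            on_success(result)
        return result
    finally:
        # Runs on both paths, so resources are released even when the task fails.
        cleanup()
```

Placing the release in `finally` rather than in each callback means a bug in a callback cannot skip the cleanup step.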

## Section 7: Configuration & Settings (Priority: Low)

### 7.1 Add memory settings to config
- [ ] Define memory threshold parameters
- [ ] Add model timeout settings
- [ ] Configure pool sizes
- [ ] Add feature flags for new behavior
- [ ] Document all settings
- [x] Define memory threshold parameters
- [x] Add model timeout settings
- [x] Configure pool sizes
- [x] Add feature flags for new behavior
- [x] Document all settings

### 7.2 Create monitoring dashboard
- [ ] Add memory metrics endpoint
- [ ] Create pool status endpoint
- [ ] Add model lifecycle stats
- [ ] Implement health check endpoint
- [ ] Add Prometheus metrics export
- [x] Add memory metrics endpoint
- [x] Create pool status endpoint
- [x] Add model lifecycle stats
- [x] Implement health check endpoint
- [x] Add Prometheus metrics export

## Section 8: Testing & Documentation (Priority: High)

### 8.1 Create comprehensive tests
- [ ] Unit tests for ModelManager
- [ ] Integration tests for OCRServicePool
- [ ] Memory leak detection tests
- [ ] Stress tests with concurrent requests
- [ ] Performance benchmarks
- [x] Unit tests for ModelManager
- [x] Integration tests for OCRServicePool
- [x] Memory leak detection tests
- [x] Stress tests with concurrent requests
- [x] Performance benchmarks

### 8.2 Documentation
- [ ] Document memory management architecture
@@ -131,5 +131,46 @@
---

**Total Tasks**: 58
**Estimated Effort**: 3-4 weeks
**Critical Path**: Sections 1-2 must be completed first as they form the foundation
**Completed**: 53
**Remaining**: 5 (Section 8.2 Documentation only)
**Progress**: ~91%

**Critical Path Status**: Sections 1-8.1 are completed (foundation, memory monitoring, prediction semaphores, batch processing, recovery, signal handlers, configuration, Prometheus metrics, and comprehensive tests in place)

## Implementation Summary

### Files Created
- `backend/app/services/memory_manager.py` - ModelManager, MemoryGuard, MemoryConfig, PredictionSemaphore, BatchProcessor, ProgressiveLoader, PriorityOperationQueue, RecoveryManager
- `backend/app/services/service_pool.py` - OCRServicePool, PoolConfig
- `backend/tests/services/test_memory_manager.py` - Unit tests for memory management (57 tests)
- `backend/tests/services/test_service_pool.py` - Unit tests for service pool (18 tests)
- `backend/tests/services/test_ocr_memory_integration.py` - Integration tests for memory check patterns (10 tests)

### Files Modified
- `backend/app/core/config.py` - Added memory management configuration settings
- `backend/app/services/ocr_service.py` - Removed PP-StructureV3 exemption, added unload capability, integrated MemoryGuard for pre-operation checks and CPU fallback, added PredictionSemaphore for concurrent prediction control
- `backend/app/services/pp_structure_enhanced.py` - Added PredictionSemaphore control for predict calls
- `backend/app/routers/tasks.py` - Refactored to use service pool
- `backend/app/main.py` - Added startup/shutdown handlers, signal handlers (SIGTERM/SIGINT), connection draining, recovery manager shutdown

### New Classes Added (Section 4.2-8)
- `BatchProcessor` - Memory-aware batch processing for large documents with priority support
- `ProgressiveLoader` - Progressive page loading with lookahead and automatic cleanup
- `PriorityOperationQueue` - Priority queue with timeout and cancellation support
- `RecoveryManager` - Memory recovery with cooldown period and attempt limits
- `MemoryDumper` - Memory dump creation for debugging with history and comparison
- `PrometheusMetrics` - Prometheus-format metrics export for monitoring
- Signal handlers for graceful shutdown (SIGTERM, SIGINT)
- Connection draining for clean shutdown

### New Test Classes Added (Section 8.1)
- `TestModelReloadAfterUnload` - Tests for model reload after unload
- `TestSelectiveProcessingMemorySavings` - Tests for memory savings with selective processing
- `TestRecoveryScenarios` - Tests for recovery under various scenarios
- `TestShutdownSequence` - Tests for shutdown sequence
- `TestCleanupInErrorScenarios` - Tests for cleanup in error scenarios
- `TestMemoryLeakDetection` - Tests for memory leak detection
- `TestStressConcurrentRequests` - Stress tests with concurrent requests
- `TestPerformanceBenchmarks` - Performance benchmark tests
- `TestMemoryDumper` - Tests for MemoryDumper class
- `TestPrometheusMetrics` - Tests for PrometheusMetrics class