Files
OCR/backend/app/core/config.py
egg ad2b832fb6 feat: complete external auth V2 migration with advanced features
This commit implements comprehensive external Azure AD authentication
with complete task management, file download, and admin monitoring systems.

## Core Features Implemented (80% Complete)

### 1. Token Auto-Refresh Mechanism 
- Backend: POST /api/v2/auth/refresh endpoint
- Frontend: Auto-refresh 5 minutes before expiration
- Auto-retry on 401 errors with seamless token refresh

### 2. File Download System 
- Three format support: JSON / Markdown / PDF
- Endpoints: GET /api/v2/tasks/{id}/download/{format}
- File access control with ownership validation
- Frontend download buttons in TaskHistoryPage

### 3. Complete Task Management 
Backend Endpoints:
- POST /api/v2/tasks/{id}/start - Start task
- POST /api/v2/tasks/{id}/cancel - Cancel task
- POST /api/v2/tasks/{id}/retry - Retry failed task
- GET /api/v2/tasks - List with filters (status, filename, date range)
- GET /api/v2/tasks/stats - User statistics

Frontend Features:
- Status-based action buttons (Start/Cancel/Retry)
- Advanced search and filtering (status, filename, date range)
- Pagination and sorting
- Task statistics dashboard (5 stat cards)

### 4. Admin Monitoring System  (Backend)
Admin APIs:
- GET /api/v2/admin/stats - System statistics
- GET /api/v2/admin/users - User list with stats
- GET /api/v2/admin/users/top - User leaderboard
- GET /api/v2/admin/audit-logs - Audit log query system
- GET /api/v2/admin/audit-logs/user/{id}/summary

Admin Features:
- Email-based admin check (ymirliu@panjit.com.tw)
- Comprehensive system metrics (users, tasks, sessions, activity)
- Audit logging service for security tracking

### 5. User Isolation & Security 
- Row-level security on all task queries
- File access control with ownership validation
- Strict user_id filtering on all operations
- Session validation and expiry checking
- Admin privilege verification

## New Files Created

Backend:
- backend/app/models/user_v2.py - User model for external auth
- backend/app/models/task.py - Task model with user isolation
- backend/app/models/session.py - Session management
- backend/app/models/audit_log.py - Audit log model
- backend/app/services/external_auth_service.py - External API client
- backend/app/services/task_service.py - Task CRUD with isolation
- backend/app/services/file_access_service.py - File access control
- backend/app/services/admin_service.py - Admin operations
- backend/app/services/audit_service.py - Audit logging
- backend/app/routers/auth_v2.py - V2 auth endpoints
- backend/app/routers/tasks.py - Task management endpoints
- backend/app/routers/admin.py - Admin endpoints
- backend/alembic/versions/5e75a59fb763_*.py - DB migration

Frontend:
- frontend/src/services/apiV2.ts - Complete V2 API client
- frontend/src/types/apiV2.ts - V2 type definitions
- frontend/src/pages/TaskHistoryPage.tsx - Task history UI

Modified Files:
- backend/app/core/deps.py - Added get_current_admin_user_v2
- backend/app/main.py - Registered admin router
- frontend/src/pages/LoginPage.tsx - V2 login integration
- frontend/src/components/Layout.tsx - User display and logout
- frontend/src/App.tsx - Added /tasks route

## Documentation
- openspec/changes/.../PROGRESS_UPDATE.md - Detailed progress report

## Pending Items (20%)
1. Database migration execution for audit_logs table
2. Frontend admin dashboard page
3. Frontend audit log viewer

## Testing Status
- Manual testing:  Authentication flow verified
- Unit tests:  Pending
- Integration tests:  Pending

## Security Enhancements
-  User isolation (row-level security)
-  File access control
-  Token expiry validation
-  Admin privilege verification
-  Audit logging infrastructure
-  Token encryption (noted, low priority)
-  Rate limiting (noted, low priority)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 17:19:43 +08:00

149 lines
5.6 KiB
Python

"""
Tool_OCR - Configuration Management
Loads environment variables and provides centralized configuration
"""
from typing import List
from pydantic_settings import BaseSettings
from pydantic import Field
from pathlib import Path
class Settings(BaseSettings):
"""Application settings loaded from environment variables"""
# ===== Database Configuration =====
mysql_host: str = Field(default="mysql.theaken.com")
mysql_port: int = Field(default=33306)
mysql_user: str = Field(default="A060")
mysql_password: str = Field(default="")
mysql_database: str = Field(default="db_A060")
@property
def database_url(self) -> str:
"""Construct SQLAlchemy database URL"""
return (
f"mysql+pymysql://{self.mysql_user}:{self.mysql_password}"
f"@{self.mysql_host}:{self.mysql_port}/{self.mysql_database}"
)
# ===== Application Configuration =====
backend_port: int = Field(default=12010)
frontend_port: int = Field(default=12011)
secret_key: str = Field(default="your-secret-key-change-this")
algorithm: str = Field(default="HS256")
access_token_expire_minutes: int = Field(default=1440) # 24 hours
# ===== External Authentication Configuration =====
external_auth_api_url: str = Field(default="https://pj-auth-api.vercel.app")
external_auth_endpoint: str = Field(default="/api/auth/login")
external_auth_timeout: int = Field(default=30)
token_refresh_buffer: int = Field(default=300) # Refresh tokens 5 minutes before expiry
@property
def external_auth_full_url(self) -> str:
"""Construct full external authentication URL"""
return f"{self.external_auth_api_url.rstrip('/')}{self.external_auth_endpoint}"
# ===== Task Management Configuration =====
database_table_prefix: str = Field(default="tool_ocr_")
enable_task_history: bool = Field(default=True)
task_retention_days: int = Field(default=30)
max_tasks_per_user: int = Field(default=1000)
# ===== OCR Configuration =====
paddleocr_model_dir: str = Field(default="./models/paddleocr")
ocr_languages: str = Field(default="ch,en,japan,korean")
ocr_confidence_threshold: float = Field(default=0.5)
max_ocr_workers: int = Field(default=4)
@property
def ocr_languages_list(self) -> List[str]:
"""Get OCR languages as list"""
return [lang.strip() for lang in self.ocr_languages.split(",")]
# ===== GPU Acceleration Configuration =====
force_cpu_mode: bool = Field(default=False)
gpu_memory_fraction: float = Field(default=0.8)
gpu_device_id: int = Field(default=0)
# ===== File Upload Configuration =====
max_upload_size: int = Field(default=52428800) # 50MB
allowed_extensions: str = Field(default="png,jpg,jpeg,pdf,bmp,tiff,doc,docx,ppt,pptx")
upload_dir: str = Field(default="./uploads")
temp_dir: str = Field(default="./uploads/temp")
processed_dir: str = Field(default="./uploads/processed")
images_dir: str = Field(default="./uploads/images")
@property
def allowed_extensions_list(self) -> List[str]:
"""Get allowed extensions as list"""
return [ext.strip() for ext in self.allowed_extensions.split(",")]
# ===== Export Configuration =====
storage_dir: str = Field(default="./storage")
markdown_dir: str = Field(default="./storage/markdown")
json_dir: str = Field(default="./storage/json")
exports_dir: str = Field(default="./storage/exports")
# ===== PDF Generation Configuration =====
pandoc_path: str = Field(default="/opt/homebrew/bin/pandoc")
font_dir: str = Field(default="/System/Library/Fonts")
pdf_page_size: str = Field(default="A4")
pdf_margin_top: int = Field(default=20)
pdf_margin_bottom: int = Field(default=20)
pdf_margin_left: int = Field(default=20)
pdf_margin_right: int = Field(default=20)
# ===== Translation Configuration (Reserved) =====
enable_translation: bool = Field(default=False)
translation_engine: str = Field(default="offline")
argostranslate_models_dir: str = Field(default="./models/argostranslate")
# ===== Background Tasks Configuration =====
task_queue_type: str = Field(default="memory")
redis_url: str = Field(default="redis://localhost:6379/0")
# ===== CORS Configuration =====
cors_origins: str = Field(default="http://localhost:12011,http://127.0.0.1:12011")
@property
def cors_origins_list(self) -> List[str]:
"""Get CORS origins as list"""
return [origin.strip() for origin in self.cors_origins.split(",")]
# ===== Logging Configuration =====
log_level: str = Field(default="INFO")
log_file: str = Field(default="./logs/app.log")
class Config:
# Look for .env in project root (one level up from backend/)
env_file = str(Path(__file__).resolve().parent.parent.parent.parent / ".env")
env_file_encoding = "utf-8"
case_sensitive = False
def ensure_directories(self):
"""Create all necessary directories if they don't exist"""
dirs = [
self.upload_dir,
self.temp_dir,
self.processed_dir,
self.images_dir,
self.storage_dir,
self.markdown_dir,
self.json_dir,
self.exports_dir,
self.paddleocr_model_dir,
Path(self.log_file).parent,
]
if self.enable_translation and self.translation_engine == "offline":
dirs.append(self.argostranslate_models_dir)
for dir_path in dirs:
Path(dir_path).mkdir(parents=True, exist_ok=True)
# Global settings instance
settings = Settings()