feat: complete external auth V2 migration with advanced features

This commit adds external Azure AD authentication (V2) together with
task management, file download, and backend admin monitoring.

## Core Features Implemented (80% Complete)

### 1. Token Auto-Refresh Mechanism 
- Backend: POST /api/v2/auth/refresh endpoint
- Frontend: Auto-refresh 5 minutes before expiration
- Auto-retry on 401 errors with seamless token refresh
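
The refresh behaviour listed above can be sketched roughly as follows. The frontend client in this commit is TypeScript; this Python sketch only illustrates the logic. `should_refresh`, `call_with_refresh`, and the `send`/`refresh` callables are hypothetical names, not code from this commit.

```python
from typing import Callable, Tuple

REFRESH_MARGIN_SECONDS = 5 * 60  # refresh 5 minutes before expiration


def should_refresh(expires_at: float, now: float,
                   margin: float = REFRESH_MARGIN_SECONDS) -> bool:
    """Return True once the token is within `margin` seconds of expiring."""
    return now >= expires_at - margin


def call_with_refresh(send: Callable[[str], Tuple[int, str]],
                      refresh: Callable[[], str],
                      token: str) -> Tuple[int, str]:
    """Send a request; on a 401, refresh the token once and retry."""
    status, body = send(token)
    if status == 401:
        # Hypothetical hook; in this project it would call POST /api/v2/auth/refresh.
        token = refresh()
        status, body = send(token)
    return status, body
```

Retrying only once keeps a permanently rejected token from causing a refresh loop.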

### 2. File Download System 
- Three supported formats: JSON, Markdown, PDF
- Endpoints: GET /api/v2/tasks/{id}/download/{format}
- File access control with ownership validation
- Frontend download buttons in TaskHistoryPage
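
A minimal sketch of the ownership check behind the download endpoint, assuming the `{format}` segment takes the values `json`, `markdown`, and `pdf`. The `Task` stand-in and `resolve_download_path` are hypothetical; only the `result_*_path` and `user_id` field names come from the task schema in this commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Minimal stand-in for a task record (field names follow TaskResponse)."""
    user_id: int
    result_json_path: Optional[str] = None
    result_markdown_path: Optional[str] = None
    result_pdf_path: Optional[str] = None


# Assumed values for the {format} path segment.
FORMAT_FIELDS = {
    "json": "result_json_path",
    "markdown": "result_markdown_path",
    "pdf": "result_pdf_path",
}


def resolve_download_path(task: Task, requester_id: int, fmt: str) -> str:
    """Return the stored result path, or raise if the request is not allowed."""
    if task.user_id != requester_id:
        raise PermissionError("task belongs to another user")
    field = FORMAT_FIELDS.get(fmt)
    if field is None:
        raise ValueError(f"unsupported format: {fmt}")
    path = getattr(task, field)
    if not path:
        raise FileNotFoundError(f"no {fmt} result for this task")
    return path
```

Checking ownership before looking at the format means a caller cannot learn which results exist for another user's task.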

### 3. Complete Task Management 
Backend Endpoints:
- POST /api/v2/tasks/{id}/start - Start task
- POST /api/v2/tasks/{id}/cancel - Cancel task
- POST /api/v2/tasks/{id}/retry - Retry failed task
- GET /api/v2/tasks - List with filters (status, filename, date range)
- GET /api/v2/tasks/stats - User statistics
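
One way the start/cancel/retry endpoints could guard against invalid transitions is a status-to-action table. The mapping below is an assumption for illustration: the commit only states that retry applies to failed tasks, and `ALLOWED_ACTIONS` and `can_perform` are hypothetical names.

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed mapping; only "retry a failed task" is stated in the commit message.
ALLOWED_ACTIONS = {
    TaskStatus.PENDING: frozenset({"start", "cancel"}),
    TaskStatus.PROCESSING: frozenset({"cancel"}),
    TaskStatus.COMPLETED: frozenset(),
    TaskStatus.FAILED: frozenset({"retry"}),
}


def can_perform(status: TaskStatus, action: str) -> bool:
    """Return True if `action` is allowed for a task in `status`."""
    return action in ALLOWED_ACTIONS[status]
```

The same table could drive which action buttons the frontend shows for each task.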

Frontend Features:
- Status-based action buttons (Start/Cancel/Retry)
- Advanced search and filtering (status, filename, date range)
- Pagination and sorting
- Task statistics dashboard (5 stat cards)
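
The filter semantics can be illustrated with an in-memory sketch. The parameter names follow `TaskHistoryQuery`; `TaskRow` and `filter_tasks` are hypothetical, and the case-insensitive substring match on `filename` is an assumption rather than confirmed behaviour.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class TaskRow:
    filename: str
    status: str
    created_at: datetime


def filter_tasks(tasks: List[TaskRow],
                 status: Optional[str] = None,
                 filename: Optional[str] = None,
                 date_from: Optional[datetime] = None,
                 date_to: Optional[datetime] = None,
                 order_desc: bool = True) -> List[TaskRow]:
    """Apply each filter only when it is provided, then sort by created_at."""
    result = [
        t for t in tasks
        if (status is None or t.status == status)
        and (filename is None or filename.lower() in t.filename.lower())
        and (date_from is None or t.created_at >= date_from)
        and (date_to is None or t.created_at <= date_to)
    ]
    return sorted(result, key=lambda t: t.created_at, reverse=order_desc)
```

In the real service these conditions would presumably be pushed into the database query rather than applied in Python.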

### 4. Admin Monitoring System (Backend)
Admin APIs:
- GET /api/v2/admin/stats - System statistics
- GET /api/v2/admin/users - User list with stats
- GET /api/v2/admin/users/top - User leaderboard
- GET /api/v2/admin/audit-logs - Audit log query system
- GET /api/v2/admin/audit-logs/user/{id}/summary - Per-user audit log summary

Admin Features:
- Email-based admin check (ymirliu@panjit.com.tw)
- Comprehensive system metrics (users, tasks, sessions, activity)
- Audit logging service for security tracking
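
An email-based admin check can be as small as the sketch below. `is_admin` and `ADMIN_EMAIL` are hypothetical names, the address is a placeholder, and ignoring case and surrounding whitespace is an assumption about the comparison.

```python
from typing import Optional

ADMIN_EMAIL = "admin@example.com"  # placeholder; the real address is configured elsewhere


def is_admin(email: Optional[str], admin_email: str = ADMIN_EMAIL) -> bool:
    """Compare addresses ignoring case and surrounding whitespace (an assumption)."""
    if not email:
        return False
    return email.strip().lower() == admin_email.strip().lower()
```

A dependency such as `get_current_admin_user_v2` could call a check like this and reject non-admin users.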

### 5. User Isolation & Security 
- Row-level security on all task queries
- File access control with ownership validation
- Strict user_id filtering on all operations
- Session validation and expiry checking
- Admin privilege verification
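
Application-level user isolation usually means that every query carries the caller's `user_id` in its `WHERE` clause. The following is a minimal sketch using the standard-library `sqlite3` module; the `tasks` table layout and `get_task_for_user` are assumptions for illustration, not the project's actual service code.

```python
import sqlite3
from typing import Optional, Tuple


def get_task_for_user(conn: sqlite3.Connection, task_id: str,
                      user_id: int) -> Optional[Tuple]:
    """Fetch a task only if it belongs to `user_id`; otherwise return None."""
    cur = conn.execute(
        "SELECT task_id, user_id, filename FROM tasks "
        "WHERE task_id = ? AND user_id = ?",
        (task_id, user_id),
    )
    return cur.fetchone()


# Tiny in-memory database to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (task_id TEXT, user_id INTEGER, filename TEXT)")
conn.execute("INSERT INTO tasks VALUES ('t1', 1, 'a.pdf')")
```

Here `get_task_for_user(conn, 't1', 2)` returns `None`, so a task owned by another user is indistinguishable from a task that does not exist.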

## New Files Created

Backend:
- backend/app/models/user_v2.py - User model for external auth
- backend/app/models/task.py - Task model with user isolation
- backend/app/models/session.py - Session management
- backend/app/models/audit_log.py - Audit log model
- backend/app/services/external_auth_service.py - External API client
- backend/app/services/task_service.py - Task CRUD with isolation
- backend/app/services/file_access_service.py - File access control
- backend/app/services/admin_service.py - Admin operations
- backend/app/services/audit_service.py - Audit logging
- backend/app/routers/auth_v2.py - V2 auth endpoints
- backend/app/routers/tasks.py - Task management endpoints
- backend/app/routers/admin.py - Admin endpoints
- backend/alembic/versions/5e75a59fb763_*.py - DB migration

Frontend:
- frontend/src/services/apiV2.ts - Complete V2 API client
- frontend/src/types/apiV2.ts - V2 type definitions
- frontend/src/pages/TaskHistoryPage.tsx - Task history UI

Modified Files:
- backend/app/core/deps.py - Added get_current_admin_user_v2
- backend/app/main.py - Registered admin router
- frontend/src/pages/LoginPage.tsx - V2 login integration
- frontend/src/components/Layout.tsx - User display and logout
- frontend/src/App.tsx - Added /tasks route

## Documentation
- openspec/changes/.../PROGRESS_UPDATE.md - Detailed progress report

## Pending Items (20%)
1. Database migration execution for audit_logs table
2. Frontend admin dashboard page
3. Frontend audit log viewer

## Testing Status
- Manual testing: Authentication flow verified
- Unit tests: Pending
- Integration tests: Pending

## Security Enhancements
- User isolation (row-level security)
- File access control
- Token expiry validation
- Admin privilege verification
- Audit logging infrastructure
- Token encryption (noted, low priority)
- Rate limiting (noted, low priority)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-14 17:19:43 +08:00
parent 470fa96428
commit ad2b832fb6
32 changed files with 6450 additions and 26 deletions


@@ -20,18 +20,31 @@ class LoginRequest(BaseModel):
        }


class UserInfo(BaseModel):
    """User information schema"""
    id: int
    email: str
    display_name: Optional[str] = None


class Token(BaseModel):
    """JWT token response schema"""
    access_token: str = Field(..., description="JWT access token")
    token_type: str = Field(default="bearer", description="Token type")
    expires_in: int = Field(..., description="Token expiration time in seconds")
    user: Optional[UserInfo] = Field(None, description="User information (V2 only)")

    class Config:
        json_schema_extra = {
            "example": {
                "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
                "token_type": "bearer",
                "expires_in": 3600,
                "user": {
                    "id": 1,
                    "email": "user@example.com",
                    "display_name": "User Name"
                }
            }
        }
@@ -40,3 +53,18 @@ class TokenData(BaseModel):
    """Token payload data"""
    user_id: Optional[int] = None
    username: Optional[str] = None
    email: Optional[str] = None
    session_id: Optional[int] = None


class UserResponse(BaseModel):
    """User response schema"""
    id: int
    email: str
    display_name: Optional[str] = None
    created_at: Optional[str] = None
    last_login: Optional[str] = None
    is_active: bool = True

    class Config:
        from_attributes = True

backend/app/schemas/task.py Normal file

@@ -0,0 +1,103 @@
"""
Tool_OCR - Task Management Schemas
"""

from typing import Optional, List
from datetime import datetime
from pydantic import BaseModel, Field
from enum import Enum


class TaskStatusEnum(str, Enum):
    """Task status enumeration"""
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


class TaskCreate(BaseModel):
    """Task creation request"""
    filename: Optional[str] = Field(None, description="Original filename")
    file_type: Optional[str] = Field(None, description="File MIME type")


class TaskUpdate(BaseModel):
    """Task update request"""
    status: Optional[TaskStatusEnum] = None
    error_message: Optional[str] = None
    processing_time_ms: Optional[int] = None
    result_json_path: Optional[str] = None
    result_markdown_path: Optional[str] = None
    result_pdf_path: Optional[str] = None


class TaskFileResponse(BaseModel):
    """Task file response schema"""
    id: int
    original_name: Optional[str] = None
    stored_path: Optional[str] = None
    file_size: Optional[int] = None
    mime_type: Optional[str] = None
    file_hash: Optional[str] = None
    created_at: datetime

    class Config:
        from_attributes = True


class TaskResponse(BaseModel):
    """Task response schema"""
    id: int
    user_id: int
    task_id: str
    filename: Optional[str] = None
    file_type: Optional[str] = None
    status: TaskStatusEnum
    result_json_path: Optional[str] = None
    result_markdown_path: Optional[str] = None
    result_pdf_path: Optional[str] = None
    error_message: Optional[str] = None
    processing_time_ms: Optional[int] = None
    created_at: datetime
    updated_at: datetime
    completed_at: Optional[datetime] = None
    file_deleted: bool = False

    class Config:
        from_attributes = True


class TaskDetailResponse(TaskResponse):
    """Detailed task response with files"""
    files: List[TaskFileResponse] = []


class TaskListResponse(BaseModel):
    """Paginated task list response"""
    tasks: List[TaskResponse]
    total: int
    page: int
    page_size: int
    has_more: bool


class TaskStatsResponse(BaseModel):
    """User task statistics"""
    total: int
    pending: int
    processing: int
    completed: int
    failed: int


class TaskHistoryQuery(BaseModel):
    """Task history query parameters"""
    status: Optional[TaskStatusEnum] = None
    filename: Optional[str] = None
    date_from: Optional[datetime] = None
    date_to: Optional[datetime] = None
    page: int = Field(default=1, ge=1)
    page_size: int = Field(default=50, ge=1, le=100)
    order_by: str = Field(default="created_at")
    order_desc: bool = Field(default=True)
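
The `page`, `page_size`, and `has_more` fields suggest offset-based pagination. The sketch below shows one way the query window and `has_more` could be derived. `page_window` is a hypothetical helper and the formula is an assumption; the bounds mirror the `ge`/`le` constraints on `TaskHistoryQuery`.

```python
from typing import Tuple


def page_window(total: int, page: int = 1, page_size: int = 50) -> Tuple[int, int, bool]:
    """Return (offset, limit, has_more) for a 1-based page number."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    offset = (page - 1) * page_size
    # More rows remain if the current window ends before the last row.
    has_more = offset + page_size < total
    return offset, page_size, has_more
```

For example, with 120 matching tasks and the default page size of 50, pages 1 and 2 report `has_more=True` and page 3 reports `has_more=False`.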