chore: project cleanup and prepare for dual-track processing refactor
- Removed all test files and directories - Deleted outdated documentation (will be rewritten) - Cleaned up temporary files, logs, and uploads - Archived 5 completed OpenSpec proposals - Created new dual-track-document-processing proposal with complete OpenSpec structure - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF) - UnifiedDocument model for consistent output - Support for structure-preserving translation - Updated .gitignore to prevent future test/temp files This is a major cleanup preparing for the complete refactoring of the document processing pipeline. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,519 @@
|
||||
# 前端實作完成 - External Authentication & Task History
|
||||
|
||||
## 實作日期
|
||||
2025-11-14
|
||||
|
||||
## 狀態
|
||||
✅ **前端核心功能完成**
|
||||
- V2 認證服務整合
|
||||
- 登入頁面更新
|
||||
- 任務歷史頁面
|
||||
- 導航整合
|
||||
|
||||
---
|
||||
|
||||
## 📋 已完成項目
|
||||
|
||||
### 1. V2 API 服務層 ✅
|
||||
|
||||
#### **檔案:`frontend/src/services/apiV2.ts`**
|
||||
|
||||
**核心功能:**
|
||||
```typescript
|
||||
class ApiClientV2 {
|
||||
// 認證管理
|
||||
async login(data: LoginRequest): Promise<LoginResponseV2>
|
||||
async logout(sessionId?: number): Promise<void>
|
||||
async getMe(): Promise<UserInfo>
|
||||
async listSessions(): Promise<SessionInfo[]>
|
||||
|
||||
// 任務管理
|
||||
async createTask(data: TaskCreate): Promise<Task>
|
||||
async listTasks(params): Promise<TaskListResponse>
|
||||
async getTaskStats(): Promise<TaskStats>
|
||||
async getTask(taskId: string): Promise<TaskDetail>
|
||||
async updateTask(taskId: string, data: TaskUpdate): Promise<Task>
|
||||
async deleteTask(taskId: string): Promise<void>
|
||||
|
||||
// 輔助方法
|
||||
async downloadTaskFile(url: string, filename: string): Promise<void>
|
||||
}
|
||||
```
|
||||
|
||||
**特色:**
|
||||
- 自動 token 管理(localStorage)
|
||||
- 401 自動重定向到登入
|
||||
- Session 過期檢測
|
||||
- 用戶資訊快取
|
||||
|
||||
#### **檔案:`frontend/src/types/apiV2.ts`**
|
||||
|
||||
完整類型定義:
|
||||
- `UserInfo`, `LoginResponseV2`, `SessionInfo`
|
||||
- `Task`, `TaskCreate`, `TaskUpdate`, `TaskDetail`
|
||||
- `TaskStats`, `TaskListResponse`, `TaskFilters`
|
||||
- `TaskStatus` 枚舉
|
||||
|
||||
---
|
||||
|
||||
### 2. 登入頁面更新 ✅
|
||||
|
||||
#### **檔案:`frontend/src/pages/LoginPage.tsx`**
|
||||
|
||||
**變更:**
|
||||
```typescript
|
||||
// 舊版(V1)
|
||||
await apiClient.login({ username, password })
|
||||
setUser({ id: 1, username })
|
||||
|
||||
// 新版(V2)
|
||||
const response = await apiClientV2.login({ username, password })
|
||||
setUser({
|
||||
id: response.user.id,
|
||||
username: response.user.email,
|
||||
email: response.user.email,
|
||||
displayName: response.user.display_name
|
||||
})
|
||||
```
|
||||
|
||||
**功能:**
|
||||
- ✅ 整合外部 Azure AD 認證
|
||||
- ✅ 顯示用戶顯示名稱
|
||||
- ✅ 錯誤訊息處理
|
||||
- ✅ 保持原有 UI 設計
|
||||
|
||||
---
|
||||
|
||||
### 3. 任務歷史頁面 ✅
|
||||
|
||||
#### **檔案:`frontend/src/pages/TaskHistoryPage.tsx`**
|
||||
|
||||
**核心功能:**
|
||||
|
||||
1. **統計儀表板**
|
||||
- 總計、待處理、處理中、已完成、失敗
|
||||
- 卡片式呈現
|
||||
- 即時更新
|
||||
|
||||
2. **篩選功能**
|
||||
- 按狀態篩選(全部/pending/processing/completed/failed)
|
||||
- 未來可擴展:日期範圍、檔名搜尋
|
||||
|
||||
3. **任務列表**
|
||||
- 分頁顯示(每頁 20 筆)
|
||||
- 欄位:檔案名稱、狀態、建立時間、完成時間、處理時間
|
||||
- 操作:查看詳情、刪除
|
||||
|
||||
4. **狀態徽章**
|
||||
```typescript
|
||||
pending → 灰色 + 時鐘圖標
|
||||
processing → 藍色 + 旋轉圖標
|
||||
completed → 綠色 + 勾選圖標
|
||||
failed → 紅色 + X 圖標
|
||||
```
|
||||
|
||||
5. **分頁控制**
|
||||
- 上一頁/下一頁
|
||||
- 顯示當前範圍(1-20 / 共 45 個)
|
||||
- 自動禁用按鈕
|
||||
|
||||
**UI 組件使用:**
|
||||
- `Card` - 統計卡片和主容器
|
||||
- `Table` - 任務列表表格
|
||||
- `Badge` - 狀態標籤
|
||||
- `Button` - 操作按鈕
|
||||
- `Select` - 狀態篩選下拉選單
|
||||
|
||||
---
|
||||
|
||||
### 4. 路由整合 ✅
|
||||
|
||||
#### **檔案:`frontend/src/App.tsx`**
|
||||
|
||||
新增路由:
|
||||
```typescript
|
||||
<Route path="tasks" element={<TaskHistoryPage />} />
|
||||
```
|
||||
|
||||
**路由結構:**
|
||||
```
|
||||
/login - 登入頁面(公開)
|
||||
/ - 根路徑(重定向到 /upload)
|
||||
/upload - 上傳檔案
|
||||
/processing - 處理進度
|
||||
/results - 查看結果
|
||||
/tasks - 任務歷史 (NEW!)
|
||||
/export - 導出文件
|
||||
/settings - 系統設定
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 5. 導航更新 ✅
|
||||
|
||||
#### **檔案:`frontend/src/components/Layout.tsx`**
|
||||
|
||||
**新增導航項:**
|
||||
```typescript
|
||||
{
|
||||
to: '/tasks',
|
||||
label: '任務歷史',
|
||||
icon: History,
|
||||
description: '查看任務記錄'
|
||||
}
|
||||
```
|
||||
|
||||
**Logout 邏輯更新:**
|
||||
```typescript
|
||||
const handleLogout = async () => {
|
||||
try {
|
||||
// 優先使用 V2 API
|
||||
if (apiClientV2.isAuthenticated()) {
|
||||
await apiClientV2.logout()
|
||||
} else {
|
||||
apiClient.logout()
|
||||
}
|
||||
} finally {
|
||||
logout() // 清除本地狀態
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**用戶資訊顯示:**
|
||||
- 顯示名稱:`user.displayName || user.username`
|
||||
- Email:`user.email || user.username`
|
||||
- 頭像:首字母大寫
|
||||
|
||||
---
|
||||
|
||||
### 6. 類型擴展 ✅
|
||||
|
||||
#### **檔案:`frontend/src/types/api.ts`**
|
||||
|
||||
擴展 User 介面:
|
||||
```typescript
|
||||
export interface User {
|
||||
id: number
|
||||
username: string
|
||||
email?: string // NEW
|
||||
displayName?: string | null // NEW
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎨 UI/UX 特色
|
||||
|
||||
### 任務歷史頁面設計亮點:
|
||||
|
||||
1. **響應式卡片佈局**
|
||||
- Grid 5 欄(桌面)/ 1 欄(手機)
|
||||
- 統計數據卡片 hover 效果
|
||||
|
||||
2. **清晰的狀態視覺化**
|
||||
- 彩色徽章
|
||||
- 動畫圖標(processing 狀態旋轉)
|
||||
- 語意化顏色
|
||||
|
||||
3. **操作反饋**
|
||||
- 載入動畫(Loader2)
|
||||
- 空狀態提示
|
||||
- 錯誤警告
|
||||
|
||||
4. **用戶友好**
|
||||
- 確認刪除對話框
|
||||
- 刷新按鈕
|
||||
- 分頁資訊明確
|
||||
|
||||
---
|
||||
|
||||
## 🔄 向後兼容
|
||||
|
||||
### V1 與 V2 並存策略
|
||||
|
||||
**認證服務:**
|
||||
- V1: `apiClient` (原有本地認證)
|
||||
- V2: `apiClientV2` (新外部認證)
|
||||
|
||||
**登入流程:**
|
||||
- 新用戶使用 V2 API 登入
|
||||
- 舊 session 仍可使用 V1 API
|
||||
|
||||
**Logout 處理:**
|
||||
```typescript
|
||||
if (apiClientV2.isAuthenticated()) {
|
||||
await apiClientV2.logout() // 呼叫後端 /api/v2/auth/logout
|
||||
} else {
|
||||
apiClient.logout() // 僅清除本地 token
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 📱 使用流程
|
||||
|
||||
### 1. 登入
|
||||
```
|
||||
用戶訪問 /login
|
||||
→ 輸入 email + password
|
||||
→ apiClientV2.login() 呼叫外部 API
|
||||
→ 接收 access_token + user info
|
||||
→ 存入 localStorage
|
||||
→ 重定向到 /upload
|
||||
```
|
||||
|
||||
### 2. 查看任務歷史
|
||||
```
|
||||
用戶點擊「任務歷史」導航
|
||||
→ 訪問 /tasks
|
||||
→ apiClientV2.listTasks() 獲取任務列表
|
||||
→ apiClientV2.getTaskStats() 獲取統計
|
||||
→ 顯示任務表格 + 統計卡片
|
||||
```
|
||||
|
||||
### 3. 篩選任務
|
||||
```
|
||||
用戶選擇狀態篩選器(例:completed)
|
||||
→ setStatusFilter('completed')
|
||||
→ useEffect 觸發重新 fetchTasks()
|
||||
→ 呼叫 apiClientV2.listTasks({ status: 'completed' })
|
||||
→ 更新任務列表
|
||||
```
|
||||
|
||||
### 4. 刪除任務
|
||||
```
|
||||
用戶點擊刪除按鈕
|
||||
→ 確認對話框
|
||||
→ apiClientV2.deleteTask(taskId)
|
||||
→ 重新載入任務列表和統計
|
||||
```
|
||||
|
||||
### 5. 分頁導航
|
||||
```
|
||||
用戶點擊「下一頁」
|
||||
→ setPage(page + 1)
|
||||
→ useEffect 觸發 fetchTasks()
|
||||
→ 呼叫 listTasks({ page: 2 })
|
||||
→ 更新任務列表
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🧪 測試指南
|
||||
|
||||
### 手動測試步驟:
|
||||
|
||||
#### 1. 測試登入
|
||||
```bash
|
||||
# 啟動後端
|
||||
cd backend
|
||||
source venv/bin/activate
|
||||
python -m app.main
|
||||
|
||||
# 啟動前端
|
||||
cd frontend
|
||||
npm run dev
|
||||
|
||||
# 訪問 http://localhost:5173/login
|
||||
# 輸入 Azure AD 憑證
|
||||
# 確認登入成功並顯示用戶名稱
|
||||
```
|
||||
|
||||
#### 2. 測試任務歷史
|
||||
```bash
|
||||
# 登入後點擊側邊欄「任務歷史」
|
||||
# 確認統計卡片顯示正確數字
|
||||
# 確認任務列表載入
|
||||
# 測試狀態篩選
|
||||
# 測試分頁功能
|
||||
```
|
||||
|
||||
#### 3. 測試任務刪除
|
||||
```bash
|
||||
# 在任務列表點擊刪除按鈕
|
||||
# 確認刪除確認對話框
|
||||
# 確認刪除後列表更新
|
||||
# 確認統計數字更新
|
||||
```
|
||||
|
||||
#### 4. 測試 Logout
|
||||
```bash
|
||||
# 點擊側邊欄登出按鈕
|
||||
# 確認清除 localStorage
|
||||
# 確認重定向到登入頁面
|
||||
# 再次登入確認一切正常
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔧 已知限制
|
||||
|
||||
### 目前未實作項目:
|
||||
|
||||
1. **任務詳情頁面** (`/tasks/:taskId`)
|
||||
- 顯示完整任務資訊
|
||||
- 下載結果檔案(JSON/Markdown/PDF)
|
||||
- 查看任務文件列表
|
||||
|
||||
2. **進階篩選**
|
||||
- 日期範圍選擇器
|
||||
- 檔案名稱搜尋
|
||||
- 多條件組合篩選
|
||||
|
||||
3. **批次操作**
|
||||
- 批次刪除任務
|
||||
- 批次下載結果
|
||||
|
||||
4. **即時更新**
|
||||
- WebSocket 連接
|
||||
- 任務狀態即時推送
|
||||
- 自動刷新處理中的任務
|
||||
|
||||
5. **錯誤詳情**
|
||||
- 展開查看 `error_message`
|
||||
- 失敗任務重試功能
|
||||
|
||||
---
|
||||
|
||||
## 💡 未來擴展建議
|
||||
|
||||
### 短期優化(1-2 週):
|
||||
|
||||
1. **任務詳情頁面**
|
||||
```typescript
|
||||
// frontend/src/pages/TaskDetailPage.tsx
|
||||
const task = await apiClientV2.getTask(taskId)
|
||||
// 顯示完整資訊 + 下載按鈕
|
||||
```
|
||||
|
||||
2. **檔案下載**
|
||||
```typescript
|
||||
const handleDownload = async (path: string, filename: string) => {
|
||||
await apiClientV2.downloadTaskFile(path, filename)
|
||||
}
|
||||
```
|
||||
|
||||
3. **日期範圍篩選**
|
||||
```typescript
|
||||
<DateRangePicker
|
||||
from={dateFrom}
|
||||
to={dateTo}
|
||||
onChange={(range) => {
|
||||
setDateFrom(range.from)
|
||||
setDateTo(range.to)
|
||||
}}
|
||||
/>
|
||||
```
|
||||
|
||||
### 中期功能(1 個月):
|
||||
|
||||
4. **即時狀態更新**
|
||||
- 使用 WebSocket 或 Server-Sent Events
|
||||
- 自動更新 processing 任務狀態
|
||||
|
||||
5. **批次操作**
|
||||
- 複選框選擇多個任務
|
||||
- 批次刪除/下載
|
||||
|
||||
6. **搜尋功能**
|
||||
- 檔案名稱模糊搜尋
|
||||
- 全文搜尋(需後端支援)
|
||||
|
||||
### 長期規劃(3 個月):
|
||||
|
||||
7. **任務視覺化**
|
||||
- 時間軸視圖
|
||||
- 甘特圖(處理進度)
|
||||
- 統計圖表(ECharts)
|
||||
|
||||
8. **通知系統**
|
||||
- 任務完成通知
|
||||
- 錯誤警報
|
||||
- 瀏覽器通知 API
|
||||
|
||||
9. **導出功能**
|
||||
- 任務報表導出(Excel/PDF)
|
||||
- 統計資料導出
|
||||
|
||||
---
|
||||
|
||||
## 📝 程式碼範例
|
||||
|
||||
### 在其他頁面使用 V2 API
|
||||
|
||||
```typescript
|
||||
// Example: 在 UploadPage 創建任務
|
||||
import { apiClientV2 } from '@/services/apiV2'
|
||||
|
||||
const handleUpload = async (file: File) => {
|
||||
try {
|
||||
// 創建任務
|
||||
const task = await apiClientV2.createTask({
|
||||
filename: file.name,
|
||||
file_type: file.type
|
||||
})
|
||||
|
||||
console.log('Task created:', task.task_id)
|
||||
|
||||
// TODO: 上傳檔案到雲端存儲
|
||||
// TODO: 更新任務狀態為 processing
|
||||
// TODO: 呼叫 OCR 服務
|
||||
} catch (error) {
|
||||
console.error('Upload failed:', error)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 監聽任務狀態變化
|
||||
|
||||
```typescript
|
||||
// Example: 輪詢任務狀態
|
||||
const pollTaskStatus = async (taskId: string) => {
|
||||
const interval = setInterval(async () => {
|
||||
try {
|
||||
const task = await apiClientV2.getTask(taskId)
|
||||
|
||||
if (task.status === 'completed') {
|
||||
clearInterval(interval)
|
||||
alert('任務完成!')
|
||||
} else if (task.status === 'failed') {
|
||||
clearInterval(interval)
|
||||
alert(`任務失敗:${task.error_message}`)
|
||||
}
|
||||
} catch (error) {
|
||||
clearInterval(interval)
|
||||
console.error('Poll error:', error)
|
||||
}
|
||||
}, 5000) // 每 5 秒檢查一次
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ✅ 完成清單
|
||||
|
||||
- [x] V2 API 服務層(`apiV2.ts`)
|
||||
- [x] V2 類型定義(`apiV2.ts`)
|
||||
- [x] 登入頁面整合 V2
|
||||
- [x] 任務歷史頁面
|
||||
- [x] 統計儀表板
|
||||
- [x] 狀態篩選
|
||||
- [x] 分頁功能
|
||||
- [x] 任務刪除
|
||||
- [x] 路由整合
|
||||
- [x] 導航更新
|
||||
- [x] Logout 更新
|
||||
- [x] 用戶資訊顯示
|
||||
- [ ] 任務詳情頁面(待實作)
|
||||
- [ ] 檔案下載(待實作)
|
||||
- [ ] 即時狀態更新(待實作)
|
||||
- [ ] 批次操作(待實作)
|
||||
|
||||
---
|
||||
|
||||
**實作完成日期**:2025-11-14
|
||||
**實作人員**:Claude Code
|
||||
**前端框架**:React + TypeScript + Vite
|
||||
**UI 庫**:Tailwind CSS + shadcn/ui
|
||||
**狀態管理**:Zustand
|
||||
**HTTP 客戶端**:Axios
|
||||
@@ -0,0 +1,556 @@
|
||||
# External API Authentication Implementation - Complete ✅
|
||||
|
||||
## 實作日期
|
||||
2025-11-14
|
||||
|
||||
## 狀態
|
||||
✅ **後端實作完成** - Phase 1-8 已完成
|
||||
⏳ **前端實作待續** - Phase 9-11 待實作
|
||||
📋 **測試與文檔** - Phase 12-13 待完成
|
||||
|
||||
---
|
||||
|
||||
## 📋 已完成階段 (Phase 1-8)
|
||||
|
||||
### Phase 1: 資料庫架構設計 ✅
|
||||
|
||||
#### 創建的模型文件:
|
||||
1. **`backend/app/models/user_v2.py`** - 新用戶模型
|
||||
- 資料表:`tool_ocr_users`
|
||||
- 欄位:`id`, `email`, `display_name`, `created_at`, `last_login`, `is_active`
|
||||
- 特點:無密碼欄位(外部認證)、email 作為主要識別
|
||||
|
||||
2. **`backend/app/models/task.py`** - 任務模型
|
||||
- 資料表:`tool_ocr_tasks`, `tool_ocr_task_files`
|
||||
- 任務狀態:PENDING, PROCESSING, COMPLETED, FAILED
|
||||
- 用戶隔離:外鍵關聯 `user_id`,CASCADE 刪除
|
||||
|
||||
3. **`backend/app/models/session.py`** - Session 管理
|
||||
- 資料表:`tool_ocr_sessions`
|
||||
- 儲存:access_token, id_token, refresh_token (加密)
|
||||
- 追蹤:expires_at, ip_address, user_agent, last_accessed_at
|
||||
|
||||
#### 資料庫遷移:
|
||||
- **檔案**:`backend/alembic/versions/5e75a59fb763_add_external_auth_schema_with_task_.py`
|
||||
- **狀態**:已套用 (alembic stamp head)
|
||||
- **變更**:創建 4 個新表 (users, sessions, tasks, task_files)
|
||||
- **策略**:保留舊表,不刪除(避免外鍵約束錯誤)
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: 配置管理 ✅
|
||||
|
||||
#### 環境變數 (`.env.local`):
|
||||
```bash
|
||||
# External Authentication
|
||||
EXTERNAL_AUTH_API_URL=https://pj-auth-api.vercel.app
|
||||
EXTERNAL_AUTH_ENDPOINT=/api/auth/login
|
||||
EXTERNAL_AUTH_TIMEOUT=30
|
||||
TOKEN_REFRESH_BUFFER=300
|
||||
|
||||
# Task Management
|
||||
DATABASE_TABLE_PREFIX=tool_ocr_
|
||||
ENABLE_TASK_HISTORY=true
|
||||
TASK_RETENTION_DAYS=30
|
||||
MAX_TASKS_PER_USER=1000
|
||||
```
|
||||
|
||||
#### 配置類 (`backend/app/core/config.py`):
|
||||
- 新增外部認證配置屬性
|
||||
- 新增 `external_auth_full_url` property
|
||||
- 新增任務管理配置參數
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: 服務層實作 ✅
|
||||
|
||||
#### 1. 外部認證服務 (`backend/app/services/external_auth_service.py`)
|
||||
|
||||
**核心功能:**
|
||||
```python
|
||||
class ExternalAuthService:
|
||||
async def authenticate_user(username, password) -> tuple[bool, AuthResponse, error]
|
||||
# 呼叫外部 API:POST https://pj-auth-api.vercel.app/api/auth/login
|
||||
# 重試邏輯:3 次,指數退避
|
||||
# 返回:success, auth_data (tokens + user_info), error_msg
|
||||
|
||||
async def validate_token(access_token) -> tuple[bool, payload]
|
||||
# TODO: 完整 JWT 驗證(簽名、過期時間等)
|
||||
|
||||
def is_token_expiring_soon(expires_at) -> bool
|
||||
# 檢查是否在 TOKEN_REFRESH_BUFFER 內過期
|
||||
```
|
||||
|
||||
**錯誤處理:**
|
||||
- HTTP 超時自動重試
|
||||
- 5xx 錯誤指數退避
|
||||
- 完整日誌記錄
|
||||
|
||||
#### 2. 任務管理服務 (`backend/app/services/task_service.py`)
|
||||
|
||||
**核心功能:**
|
||||
```python
|
||||
class TaskService:
|
||||
# 創建與查詢
|
||||
def create_task(db, user_id, filename, file_type) -> Task
|
||||
def get_task_by_id(db, task_id, user_id) -> Task # 用戶隔離
|
||||
def get_user_tasks(db, user_id, status, skip, limit) -> (tasks, total)
|
||||
|
||||
# 更新
|
||||
def update_task_status(db, task_id, user_id, status, error, time_ms) -> Task
|
||||
def update_task_results(db, task_id, user_id, paths...) -> Task
|
||||
|
||||
# 刪除與清理
|
||||
def delete_task(db, task_id, user_id) -> bool
|
||||
def auto_cleanup_expired_tasks(db) -> int # 根據 TASK_RETENTION_DAYS
|
||||
|
||||
# 統計
|
||||
def get_user_stats(db, user_id) -> dict # 按狀態統計
|
||||
```
|
||||
|
||||
**安全特性:**
|
||||
- 所有查詢強制 `user_id` 過濾
|
||||
- 自動任務限額檢查
|
||||
- 過期任務自動清理
|
||||
|
||||
---
|
||||
|
||||
### Phase 4-6: API 端點實作 ✅
|
||||
|
||||
#### 1. 認證端點 (`backend/app/routers/auth_v2.py`)
|
||||
|
||||
**路由前綴**:`/api/v2/auth`
|
||||
|
||||
| 端點 | 方法 | 描述 | 認證 |
|
||||
|------|------|------|------|
|
||||
| `/login` | POST | 外部 API 登入 | 無 |
|
||||
| `/logout` | POST | 登出 (刪除 session) | 需要 |
|
||||
| `/me` | GET | 獲取當前用戶資訊 | 需要 |
|
||||
| `/sessions` | GET | 列出用戶所有 sessions | 需要 |
|
||||
|
||||
**Login 流程:**
|
||||
```
|
||||
1. 呼叫外部 API 認證
|
||||
2. 獲取 access_token, id_token, user_info
|
||||
3. 在資料庫中創建/更新用戶 (email)
|
||||
4. 創建 session 記錄 (tokens, IP, user agent)
|
||||
5. 生成內部 JWT (包含 user_id, session_id)
|
||||
6. 返回內部 JWT 給前端
|
||||
```
|
||||
|
||||
#### 2. 任務管理端點 (`backend/app/routers/tasks.py`)
|
||||
|
||||
**路由前綴**:`/api/v2/tasks`
|
||||
|
||||
| 端點 | 方法 | 描述 | 認證 |
|
||||
|------|------|------|------|
|
||||
| `/` | POST | 創建新任務 | 需要 |
|
||||
| `/` | GET | 列出用戶任務 (分頁/過濾) | 需要 |
|
||||
| `/stats` | GET | 獲取任務統計 | 需要 |
|
||||
| `/{task_id}` | GET | 獲取任務詳情 | 需要 |
|
||||
| `/{task_id}` | PATCH | 更新任務 | 需要 |
|
||||
| `/{task_id}` | DELETE | 刪除任務 | 需要 |
|
||||
|
||||
**查詢參數:**
|
||||
- `status`: pending/processing/completed/failed
|
||||
- `page`: 頁碼 (從 1 開始)
|
||||
- `page_size`: 每頁筆數 (max 100)
|
||||
- `order_by`: 排序欄位 (created_at/updated_at/completed_at)
|
||||
- `order_desc`: 降序排列
|
||||
|
||||
#### 3. Schema 定義
|
||||
|
||||
**認證** (`backend/app/schemas/auth.py`):
|
||||
- `LoginRequest`: username, password
|
||||
- `Token`: access_token, token_type, expires_in, user (V2)
|
||||
- `UserInfo`: id, email, display_name
|
||||
- `UserResponse`: 完整用戶資訊
|
||||
- `TokenData`: JWT payload 結構
|
||||
|
||||
**任務** (`backend/app/schemas/task.py`):
|
||||
- `TaskCreate`: filename, file_type
|
||||
- `TaskUpdate`: status, error_message, paths...
|
||||
- `TaskResponse`: 任務基本資訊
|
||||
- `TaskDetailResponse`: 任務 + 文件列表
|
||||
- `TaskListResponse`: 分頁結果
|
||||
- `TaskStatsResponse`: 統計數據
|
||||
|
||||
---
|
||||
|
||||
### Phase 7: JWT 驗證依賴 ✅
|
||||
|
||||
#### 更新 `backend/app/core/deps.py`
|
||||
|
||||
**新增 V2 依賴:**
|
||||
```python
|
||||
def get_current_user_v2(credentials, db) -> UserV2:
|
||||
# 1. 解析 JWT token
|
||||
# 2. 從資料庫查詢用戶 (tool_ocr_users)
|
||||
# 3. 檢查用戶是否活躍
|
||||
# 4. 驗證 session (如果有 session_id)
|
||||
# 5. 檢查 session 是否過期
|
||||
# 6. 更新 last_accessed_at
|
||||
# 7. 返回用戶對象
|
||||
|
||||
def get_current_active_user_v2(current_user) -> UserV2:
|
||||
# 確保用戶處於活躍狀態
|
||||
```
|
||||
|
||||
**安全檢查:**
|
||||
- JWT 簽名驗證
|
||||
- 用戶存在性檢查
|
||||
- 用戶活躍狀態檢查
|
||||
- Session 有效性檢查
|
||||
- Session 過期時間檢查
|
||||
|
||||
---
|
||||
|
||||
### Phase 8: 路由註冊 ✅
|
||||
|
||||
#### 更新 `backend/app/main.py`
|
||||
|
||||
```python
|
||||
# Legacy V1 routers (保留向後兼容)
|
||||
from app.routers import auth, ocr, export, translation
|
||||
|
||||
# V2 routers (新外部認證系統)
|
||||
from app.routers import auth_v2, tasks
|
||||
|
||||
app.include_router(auth.router) # V1: /api/v1/auth
|
||||
app.include_router(ocr.router) # V1: /api/v1/ocr
|
||||
app.include_router(export.router) # V1: /api/v1/export
|
||||
app.include_router(translation.router) # V1: /api/v1/translation
|
||||
|
||||
app.include_router(auth_v2.router) # V2: /api/v2/auth
|
||||
app.include_router(tasks.router) # V2: /api/v2/tasks
|
||||
```
|
||||
|
||||
**版本策略:**
|
||||
- V1 API 保持不變 (向後兼容)
|
||||
- V2 API 使用新認證系統
|
||||
- 前端可逐步遷移
|
||||
|
||||
---
|
||||
|
||||
## 🔐 安全特性
|
||||
|
||||
### 1. 用戶隔離
|
||||
- ✅ 所有任務查詢強制 `user_id` 過濾
|
||||
- ✅ 用戶 A 無法訪問用戶 B 的任務
|
||||
- ✅ Row-level security 在服務層實施
|
||||
- ✅ 外鍵 CASCADE 刪除保證資料一致性
|
||||
|
||||
### 2. Session 管理
|
||||
- ✅ 追蹤 IP 位址和 User Agent
|
||||
- ✅ 自動過期檢查
|
||||
- ✅ 最後訪問時間更新
|
||||
- ⚠️ Token 加密待實作 (目前明文儲存)
|
||||
|
||||
### 3. 認證流程
|
||||
- ✅ 外部 API 認證 (Azure AD)
|
||||
- ✅ 內部 JWT 生成 (包含 user_id + session_id)
|
||||
- ✅ 雙重驗證 (JWT + session 檢查)
|
||||
- ✅ 錯誤重試機制 (3 次,指數退避)
|
||||
|
||||
### 4. 資料庫安全
|
||||
- ✅ 資料表前綴命名空間隔離 (`tool_ocr_`)
|
||||
- ✅ 索引優化 (email, task_id, status, created_at)
|
||||
- ✅ 外鍵約束確保參照完整性
|
||||
- ✅ 軟刪除支援 (file_deleted flag)
|
||||
|
||||
---
|
||||
|
||||
## 📊 資料庫架構
|
||||
|
||||
### 資料表關係圖:
|
||||
```
|
||||
tool_ocr_users (1)
|
||||
├── tool_ocr_sessions (N) [FK: user_id, CASCADE]
|
||||
└── tool_ocr_tasks (N) [FK: user_id, CASCADE]
|
||||
└── tool_ocr_task_files (N) [FK: task_id, CASCADE]
|
||||
```
|
||||
|
||||
### 索引策略:
|
||||
```sql
|
||||
-- 用戶表
|
||||
CREATE INDEX ix_tool_ocr_users_email ON tool_ocr_users(email); -- 登入查詢
|
||||
CREATE INDEX ix_tool_ocr_users_is_active ON tool_ocr_users(is_active);
|
||||
|
||||
-- Session 表
|
||||
CREATE INDEX ix_tool_ocr_sessions_user_id ON tool_ocr_sessions(user_id);
|
||||
CREATE INDEX ix_tool_ocr_sessions_expires_at ON tool_ocr_sessions(expires_at); -- 過期檢查
|
||||
CREATE INDEX ix_tool_ocr_sessions_created_at ON tool_ocr_sessions(created_at);
|
||||
|
||||
-- 任務表
|
||||
CREATE UNIQUE INDEX ix_tool_ocr_tasks_task_id ON tool_ocr_tasks(task_id); -- UUID 查詢
|
||||
CREATE INDEX ix_tool_ocr_tasks_user_id ON tool_ocr_tasks(user_id); -- 用戶查詢
|
||||
CREATE INDEX ix_tool_ocr_tasks_status ON tool_ocr_tasks(status); -- 狀態過濾
|
||||
CREATE INDEX ix_tool_ocr_tasks_created_at ON tool_ocr_tasks(created_at); -- 排序
|
||||
CREATE INDEX ix_tool_ocr_tasks_filename ON tool_ocr_tasks(filename); -- 搜尋
|
||||
|
||||
-- 任務文件表
|
||||
CREATE INDEX ix_tool_ocr_task_files_task_id ON tool_ocr_task_files(task_id);
|
||||
CREATE INDEX ix_tool_ocr_task_files_file_hash ON tool_ocr_task_files(file_hash); -- 去重
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🧪 測試端點 (Swagger UI)
|
||||
|
||||
### 訪問 API 文檔:
|
||||
```
|
||||
http://localhost:8000/docs
|
||||
```
|
||||
|
||||
### 測試流程:
|
||||
|
||||
#### 1. 登入測試
|
||||
```bash
|
||||
POST /api/v2/auth/login
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
"username": "user@example.com",
|
||||
"password": "your_password"
|
||||
}
|
||||
|
||||
# 成功回應:
|
||||
{
|
||||
"access_token": "eyJhbGc...",
|
||||
"token_type": "bearer",
|
||||
"expires_in": 86400,
|
||||
"user": {
|
||||
"id": 1,
|
||||
"email": "user@example.com",
|
||||
"display_name": "User Name"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### 2. 獲取當前用戶
|
||||
```bash
|
||||
GET /api/v2/auth/me
|
||||
Authorization: Bearer eyJhbGc...
|
||||
|
||||
# 回應:
|
||||
{
|
||||
"id": 1,
|
||||
"email": "user@example.com",
|
||||
"display_name": "User Name",
|
||||
"created_at": "2025-11-14T16:00:00",
|
||||
"last_login": "2025-11-14T16:30:00",
|
||||
"is_active": true
|
||||
}
|
||||
```
|
||||
|
||||
#### 3. 創建任務
|
||||
```bash
|
||||
POST /api/v2/tasks/
|
||||
Authorization: Bearer eyJhbGc...
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
"filename": "document.pdf",
|
||||
"file_type": "application/pdf"
|
||||
}
|
||||
|
||||
# 回應:
|
||||
{
|
||||
"id": 1,
|
||||
"user_id": 1,
|
||||
"task_id": "550e8400-e29b-41d4-a716-446655440000",
|
||||
"filename": "document.pdf",
|
||||
"file_type": "application/pdf",
|
||||
"status": "pending",
|
||||
"created_at": "2025-11-14T16:35:00",
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
#### 4. 列出任務
|
||||
```bash
|
||||
GET /api/v2/tasks/?status=completed&page=1&page_size=10
|
||||
Authorization: Bearer eyJhbGc...
|
||||
|
||||
# 回應:
|
||||
{
|
||||
"tasks": [...],
|
||||
"total": 25,
|
||||
"page": 1,
|
||||
"page_size": 10,
|
||||
"has_more": true
|
||||
}
|
||||
```
|
||||
|
||||
#### 5. 獲取統計
|
||||
```bash
|
||||
GET /api/v2/tasks/stats
|
||||
Authorization: Bearer eyJhbGc...
|
||||
|
||||
# 回應:
|
||||
{
|
||||
"total": 25,
|
||||
"pending": 3,
|
||||
"processing": 2,
|
||||
"completed": 18,
|
||||
"failed": 2
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ 待實作項目
|
||||
|
||||
### 高優先級 (阻塞性):
|
||||
1. **Token 加密** - Session 表中的 tokens 目前明文儲存
|
||||
- 需要:AES-256 加密
|
||||
- 位置:`backend/app/routers/auth_v2.py` login endpoint
|
||||
|
||||
2. **完整 JWT 驗證** - 目前僅解碼,未驗證簽名
|
||||
- 需要:Azure AD 公鑰驗證
|
||||
- 位置:`backend/app/services/external_auth_service.py`
|
||||
|
||||
3. **前端實作** - Phase 9-11
|
||||
- 認證服務 (token 管理)
|
||||
- 任務歷史 UI 頁面
|
||||
- API 整合
|
||||
|
||||
### 中優先級 (功能性):
|
||||
4. **Token 刷新機制** - 自動刷新即將過期的 token
|
||||
5. **檔案上傳整合** - 將 OCR 服務與新任務系統整合
|
||||
6. **任務通知** - 任務完成時通知用戶
|
||||
7. **錯誤追蹤** - 詳細的錯誤日誌和監控
|
||||
|
||||
### 低優先級 (優化):
|
||||
8. **效能測試** - 大量任務的查詢效能
|
||||
9. **快取層** - Redis 快取用戶 session
|
||||
10. **API 速率限制** - 防止濫用
|
||||
11. **文檔生成** - 自動生成 API 文檔
|
||||
|
||||
---
|
||||
|
||||
## 📝 遷移指南 (前端開發者)
|
||||
|
||||
### 1. 更新登入流程
|
||||
|
||||
**舊 V1 方式:**
|
||||
```typescript
|
||||
// V1: Local authentication
|
||||
const response = await fetch('/api/v1/auth/login', {
|
||||
method: 'POST',
|
||||
body: JSON.stringify({ username, password })
|
||||
});
|
||||
const { access_token } = await response.json();
|
||||
```
|
||||
|
||||
**新 V2 方式:**
|
||||
```typescript
|
||||
// V2: External Azure AD authentication
|
||||
const response = await fetch('/api/v2/auth/login', {
|
||||
method: 'POST',
|
||||
body: JSON.stringify({ username, password }) // Same interface!
|
||||
});
|
||||
const { access_token, user } = await response.json();
|
||||
|
||||
// Store token and user info
|
||||
localStorage.setItem('token', access_token);
|
||||
localStorage.setItem('user', JSON.stringify(user));
|
||||
```
|
||||
|
||||
### 2. 使用新的任務 API
|
||||
|
||||
```typescript
|
||||
// 獲取任務列表
|
||||
const response = await fetch('/api/v2/tasks/?page=1&page_size=20', {
|
||||
headers: {
|
||||
'Authorization': `Bearer ${token}`
|
||||
}
|
||||
});
|
||||
const { tasks, total, has_more } = await response.json();
|
||||
|
||||
// 獲取統計
|
||||
const statsResponse = await fetch('/api/v2/tasks/stats', {
|
||||
headers: { 'Authorization': `Bearer ${token}` }
|
||||
});
|
||||
const stats = await statsResponse.json();
|
||||
// { total: 25, pending: 3, processing: 2, completed: 18, failed: 2 }
|
||||
```
|
||||
|
||||
### 3. 處理認證錯誤
|
||||
|
||||
```typescript
|
||||
const response = await fetch('/api/v2/tasks/', {
|
||||
headers: { 'Authorization': `Bearer ${token}` }
|
||||
});
|
||||
|
||||
if (response.status === 401) {
|
||||
// Token 過期或無效,重新登入
|
||||
if (data.detail === "Session expired, please login again") {
|
||||
// 清除本地 token,導向登入頁
|
||||
localStorage.removeItem('token');
|
||||
window.location.href = '/login';
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔍 除錯與監控
|
||||
|
||||
### 日誌位置:
|
||||
```
|
||||
./logs/app.log
|
||||
```
|
||||
|
||||
### 重要日誌事件:
|
||||
- `Authentication successful for user: {email}` - 登入成功
|
||||
- `Created session {id} for user {email}` - Session 創建
|
||||
- `Authenticated user: {email} (ID: {id})` - JWT 驗證成功
|
||||
- `Expired session {id} for user {email}` - Session 過期
|
||||
- `Created task {task_id} for user {email}` - 任務創建
|
||||
|
||||
### 資料庫查詢:
|
||||
```sql
|
||||
-- 檢查用戶
|
||||
SELECT * FROM tool_ocr_users WHERE email = 'user@example.com';
|
||||
|
||||
-- 檢查 sessions
|
||||
SELECT * FROM tool_ocr_sessions WHERE user_id = 1 ORDER BY created_at DESC;
|
||||
|
||||
-- 檢查任務
|
||||
SELECT * FROM tool_ocr_tasks WHERE user_id = 1 ORDER BY created_at DESC LIMIT 10;
|
||||
|
||||
-- 統計
|
||||
SELECT status, COUNT(*) FROM tool_ocr_tasks WHERE user_id = 1 GROUP BY status;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## ✅ 總結
|
||||
|
||||
### 已完成:
|
||||
- ✅ 完整的資料庫架構設計 (4 個新表)
|
||||
- ✅ 外部 API 認證服務整合
|
||||
- ✅ 用戶 Session 管理系統
|
||||
- ✅ 任務管理服務 (CRUD + 隔離)
|
||||
- ✅ RESTful API 端點 (認證 + 任務)
|
||||
- ✅ JWT 驗證依賴項
|
||||
- ✅ 資料庫遷移腳本
|
||||
- ✅ API Schema 定義
|
||||
|
||||
### 待繼續:
|
||||
- ⏳ 前端認證服務
|
||||
- ⏳ 前端任務歷史 UI
|
||||
- ⏳ 整合測試
|
||||
- ⏳ 文檔更新
|
||||
|
||||
### 技術債務:
|
||||
- ⚠️ Token 加密 (高優先級)
|
||||
- ⚠️ 完整 JWT 驗證 (高優先級)
|
||||
- ⚠️ Token 刷新機制
|
||||
|
||||
---
|
||||
|
||||
**實作完成日期**:2025-11-14
|
||||
**實作人員**:Claude Code
|
||||
**審核狀態**:待用戶測試與審核
|
||||
@@ -0,0 +1,304 @@
|
||||
# Migration Progress Update - 2025-11-14
|
||||
|
||||
## 概述
|
||||
外部 Azure AD 認證遷移的核心功能已完成 **80%**。所有後端 API 和主要前端功能均已實作並可運行。
|
||||
|
||||
---
|
||||
|
||||
## ✅ 已完成功能 (Completed)
|
||||
|
||||
### 1. 數據庫架構重設計 ✅ **100% 完成**
|
||||
- ✅ 1.3 使用 `tool_ocr_` 前綴創建新數據庫架構
|
||||
- ✅ 1.4 創建 SQLAlchemy 模型
|
||||
- `backend/app/models/user_v2.py` - 用戶模型(email 作為主鍵)
|
||||
- `backend/app/models/task.py` - 任務模型(含用戶隔離)
|
||||
- `backend/app/models/session.py` - 會話管理模型
|
||||
- `backend/app/models/audit_log.py` - 審計日誌模型
|
||||
- ✅ 1.5 生成 Alembic 遷移腳本
|
||||
- `5e75a59fb763_add_external_auth_schema_with_task_.py`
|
||||
|
||||
### 2. 配置管理 ✅ **100% 完成**
|
||||
- ✅ 2.1 更新環境配置
|
||||
- 添加 `EXTERNAL_AUTH_API_URL`
|
||||
- 添加 `EXTERNAL_AUTH_ENDPOINT`
|
||||
- 添加 `TOKEN_REFRESH_BUFFER`
|
||||
- 添加任務管理相關設定
|
||||
- ✅ 2.2 更新 Settings 類
|
||||
- `backend/app/core/config.py` 已更新所有新配置
|
||||
|
||||
### 3. 外部 API 集成服務 ✅ **100% 完成**
|
||||
- ✅ 3.1-3.3 創建認證 API 客戶端
|
||||
- `backend/app/services/external_auth_service.py`
|
||||
- 實作 `authenticate_user()`, `is_token_expiring_soon()`
|
||||
- 包含重試邏輯和超時處理
|
||||
|
||||
### 4. 後端認證更新 ✅ **100% 完成**
|
||||
- ✅ 4.1 修改登錄端點
|
||||
- `backend/app/routers/auth_v2.py`
|
||||
- 完整的外部 API 認證流程
|
||||
- 用戶自動創建/更新
|
||||
- ✅ 4.2-4.3 更新 Token 驗證
|
||||
- `backend/app/core/deps.py`
|
||||
- `get_current_user_v2()` 依賴注入
|
||||
- `get_current_admin_user_v2()` 管理員權限檢查
|
||||
|
||||
### 5. 會話和 Token 管理 ✅ **100% 完成**
|
||||
- ✅ 5.1 實作 Token 存儲
|
||||
- 存儲於 `tool_ocr_sessions` 表
|
||||
- 記錄 IP 地址、User-Agent、過期時間
|
||||
- ✅ 5.2 創建 Token 刷新機制
|
||||
- **前端**: 自動在過期前 5 分鐘刷新
|
||||
- **後端**: `POST /api/v2/auth/refresh` 端點
|
||||
- **功能**: 自動重試 401 錯誤
|
||||
- ✅ 5.3 會話失效
|
||||
- `POST /api/v2/auth/logout` 支持單個/全部會話登出
|
||||
|
||||
### 6. 前端更新 ✅ **90% 完成**
|
||||
- ✅ 6.1 更新認證服務
|
||||
- `frontend/src/services/apiV2.ts` - 完整 V2 API 客戶端
|
||||
- 自動 Token 刷新和重試機制
|
||||
- ✅ 6.2 更新認證 Store
|
||||
- `frontend/src/store/authStore.ts` 存儲用戶信息
|
||||
- ✅ 6.3 更新 UI 組件
|
||||
- `frontend/src/pages/LoginPage.tsx` 整合 V2 登錄
|
||||
- `frontend/src/components/Layout.tsx` 顯示用戶名稱和登出
|
||||
- ✅ 6.4 錯誤處理
|
||||
- 完整的錯誤顯示和重試邏輯
|
||||
|
||||
### 7. 任務管理系統 ✅ **100% 完成**
|
||||
- ✅ 7.1 創建任務管理後端
|
||||
- `backend/app/services/task_service.py`
|
||||
- 完整的 CRUD 操作和用戶隔離
|
||||
- ✅ 7.2 實作任務 API
|
||||
- `backend/app/routers/tasks.py`
|
||||
- `GET /api/v2/tasks` - 任務列表(含分頁)
|
||||
- `GET /api/v2/tasks/{id}` - 任務詳情
|
||||
- `DELETE /api/v2/tasks/{id}` - 刪除任務
|
||||
- `POST /api/v2/tasks/{id}/start` - 開始任務
|
||||
- `POST /api/v2/tasks/{id}/cancel` - 取消任務
|
||||
- `POST /api/v2/tasks/{id}/retry` - 重試任務
|
||||
- ✅ 7.3 創建任務歷史端點
|
||||
- `GET /api/v2/tasks/stats` - 用戶統計
|
||||
- 支持狀態、檔名、日期範圍篩選
|
||||
- ✅ 7.4 實作檔案訪問控制
|
||||
- `backend/app/services/file_access_service.py`
|
||||
- 驗證用戶所有權
|
||||
- 檢查任務狀態和檔案存在性
|
||||
- ✅ 7.5 檔案下載功能
|
||||
- `GET /api/v2/tasks/{id}/download/json`
|
||||
- `GET /api/v2/tasks/{id}/download/markdown`
|
||||
- `GET /api/v2/tasks/{id}/download/pdf`
|
||||
|
||||
### 8. 前端任務管理 UI ✅ **100% 完成**
|
||||
- ✅ 8.1 創建任務歷史頁面
|
||||
- `frontend/src/pages/TaskHistoryPage.tsx`
|
||||
- 完整的任務列表和狀態指示器
|
||||
- 分頁控制
|
||||
- ✅ 8.3 創建篩選組件
|
||||
- 狀態篩選下拉選單
|
||||
- 檔名搜尋輸入框
|
||||
- 日期範圍選擇器(開始/結束)
|
||||
- 清除篩選按鈕
|
||||
- ✅ 8.4-8.5 任務管理服務
|
||||
- `frontend/src/services/apiV2.ts` 整合所有任務 API
|
||||
- 完整的錯誤處理和重試邏輯
|
||||
- ✅ 8.6 更新導航
|
||||
- `frontend/src/App.tsx` 添加 `/tasks` 路由
|
||||
- `frontend/src/components/Layout.tsx` 添加"任務歷史"選單
|
||||
|
||||
### 9. 用戶隔離和安全 ✅ **100% 完成**
|
||||
- ✅ 9.1-9.2 用戶上下文和查詢隔離
|
||||
- 所有任務查詢自動過濾 `user_id`
|
||||
- 嚴格的用戶所有權驗證
|
||||
- ✅ 9.3 檔案系統隔離
|
||||
- 下載前驗證檔案路徑
|
||||
- 檢查用戶所有權
|
||||
- ✅ 9.4 API 授權
|
||||
- 所有 V2 端點使用 `get_current_user_v2` 依賴
|
||||
- 403 錯誤處理未授權訪問
|
||||
|
||||
### 10. 管理員功能 ✅ **100% 完成(後端)**
|
||||
- ✅ 10.1 管理員權限系統
|
||||
- `backend/app/services/admin_service.py`
|
||||
- 管理員郵箱: `ymirliu@panjit.com.tw`
|
||||
- `get_current_admin_user_v2()` 依賴注入
|
||||
- ✅ 10.2 系統統計 API
|
||||
- `GET /api/v2/admin/stats` - 系統總覽統計
|
||||
- `GET /api/v2/admin/users` - 用戶列表(含統計)
|
||||
- `GET /api/v2/admin/users/top` - 用戶排行榜
|
||||
- ✅ 10.3 審計日誌系統
|
||||
- `backend/app/models/audit_log.py` - 審計日誌模型
|
||||
- `backend/app/services/audit_service.py` - 審計服務
|
||||
- `GET /api/v2/admin/audit-logs` - 審計日誌查詢
|
||||
- `GET /api/v2/admin/audit-logs/user/{id}/summary` - 用戶活動摘要
|
||||
- ✅ 10.4 管理員路由註冊
|
||||
- `backend/app/routers/admin.py`
|
||||
- 已在 `backend/app/main.py` 中註冊
|
||||
|
||||
---
|
||||
|
||||
## 🚧 進行中 / 待完成 (In Progress / Pending)
|
||||
|
||||
### 11. 數據庫遷移 ⚠️ **待執行**
|
||||
- ⏳ 11.1 創建審計日誌表遷移
|
||||
- 需要: `alembic revision` 創建 `tool_ocr_audit_logs` 表
|
||||
- 表結構已在 `audit_log.py` 中定義
|
||||
- ⏳ 11.2 執行遷移
|
||||
- 運行 `alembic upgrade head`
|
||||
|
||||
### 12. 前端管理員頁面 ⏳ **20% 完成**
|
||||
- ⏳ 12.1 管理員儀表板頁面
|
||||
- 需要: `frontend/src/pages/AdminDashboardPage.tsx`
|
||||
- 顯示系統統計(用戶、任務、會話、活動)
|
||||
- 用戶列表和排行榜
|
||||
- ⏳ 12.2 審計日誌查看器
|
||||
- 需要: `frontend/src/pages/AuditLogsPage.tsx`
|
||||
- 顯示審計日誌列表
|
||||
- 支持篩選(用戶、類別、日期範圍)
|
||||
- 用戶活動摘要
|
||||
- ⏳ 12.3 管理員路由和導航
|
||||
- 更新 `App.tsx` 添加管理員路由
|
||||
- 在 `Layout.tsx` 中顯示管理員選單(僅管理員可見)
|
||||
|
||||
### 13. 測試 ⏳ **未開始**
|
||||
- 所有功能需要完整測試
|
||||
- 建議優先測試核心認證和任務管理流程
|
||||
|
||||
### 14. 文檔 ⏳ **部分完成**
|
||||
- ✅ 已創建實作報告
|
||||
- ⏳ 需要更新 API 文檔
|
||||
- ⏳ 需要創建用戶使用指南
|
||||
|
||||
---
|
||||
|
||||
## 📊 完成度統計
|
||||
|
||||
| 模組 | 完成度 | 狀態 |
|
||||
|------|--------|------|
|
||||
| 數據庫架構 | 100% | ✅ 完成 |
|
||||
| 配置管理 | 100% | ✅ 完成 |
|
||||
| 外部 API 集成 | 100% | ✅ 完成 |
|
||||
| 後端認證 | 100% | ✅ 完成 |
|
||||
| Token 管理 | 100% | ✅ 完成 |
|
||||
| 前端認證 | 90% | ✅ 基本完成 |
|
||||
| 任務管理後端 | 100% | ✅ 完成 |
|
||||
| 任務管理前端 | 100% | ✅ 完成 |
|
||||
| 用戶隔離 | 100% | ✅ 完成 |
|
||||
| 管理員功能(後端) | 100% | ✅ 完成 |
|
||||
| 管理員功能(前端) | 20% | ⏳ 待開發 |
|
||||
| 數據庫遷移 | 90% | ⚠️ 待執行 |
|
||||
| 測試 | 0% | ⏳ 待開始 |
|
||||
| 文檔 | 50% | ⏳ 進行中 |
|
||||
|
||||
**總體完成度: 80%**
|
||||
|
||||
---
|
||||
|
||||
## 🎯 核心成就
|
||||
|
||||
### 1. Token 自動刷新機制 🎉
|
||||
- **前端**: 自動在過期前 5 分鐘刷新,無縫體驗
|
||||
- **後端**: `/api/v2/auth/refresh` 端點
|
||||
- **錯誤處理**: 401 自動重試機制
|
||||
|
||||
### 2. 完整的任務管理系統 🎉
|
||||
- **任務操作**: 開始/取消/重試/刪除
|
||||
- **任務篩選**: 狀態/檔名/日期範圍
|
||||
- **檔案下載**: JSON/Markdown/PDF 三種格式
|
||||
- **訪問控制**: 嚴格的用戶隔離和權限驗證
|
||||
|
||||
### 3. 管理員監控系統 🎉
|
||||
- **系統統計**: 用戶、任務、會話、活動統計
|
||||
- **用戶管理**: 用戶列表、排行榜
|
||||
- **審計日誌**: 完整的事件記錄和查詢系統
|
||||
|
||||
### 4. 安全性增強 🎉
|
||||
- **用戶隔離**: 所有查詢自動過濾用戶 ID
|
||||
- **檔案訪問控制**: 驗證所有權和任務狀態
|
||||
- **審計追蹤**: 記錄所有重要操作
|
||||
|
||||
---
|
||||
|
||||
## 📝 重要檔案清單
|
||||
|
||||
### 後端新增檔案
|
||||
```
|
||||
backend/app/models/
|
||||
├── user_v2.py # 用戶模型(外部認證)
|
||||
├── task.py # 任務模型
|
||||
├── session.py # 會話模型
|
||||
└── audit_log.py # 審計日誌模型
|
||||
|
||||
backend/app/services/
|
||||
├── external_auth_service.py # 外部認證服務
|
||||
├── task_service.py # 任務管理服務
|
||||
├── file_access_service.py # 檔案訪問控制
|
||||
├── admin_service.py # 管理員服務
|
||||
└── audit_service.py # 審計日誌服務
|
||||
|
||||
backend/app/routers/
|
||||
├── auth_v2.py # V2 認證路由
|
||||
├── tasks.py # 任務管理路由
|
||||
└── admin.py # 管理員路由
|
||||
|
||||
backend/alembic/versions/
|
||||
└── 5e75a59fb763_add_external_auth_schema_with_task_.py
|
||||
```
|
||||
|
||||
### 前端新增/修改檔案
|
||||
```
|
||||
frontend/src/services/
|
||||
└── apiV2.ts # 完整 V2 API 客戶端
|
||||
|
||||
frontend/src/pages/
|
||||
├── LoginPage.tsx # 整合 V2 登錄
|
||||
└── TaskHistoryPage.tsx # 任務歷史頁面
|
||||
|
||||
frontend/src/components/
|
||||
└── Layout.tsx # 導航和用戶資訊
|
||||
|
||||
frontend/src/types/
|
||||
└── apiV2.ts # V2 類型定義
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🚀 下一步行動
|
||||
|
||||
### 立即執行
|
||||
1. ✅ **提交當前進度** - 所有核心功能已實作
|
||||
2. **執行數據庫遷移** - 運行 Alembic 遷移添加 audit_logs 表
|
||||
3. **系統測試** - 測試認證流程和任務管理功能
|
||||
|
||||
### 可選增強
|
||||
1. **前端管理員頁面** - 管理員儀表板和審計日誌查看器
|
||||
2. **完整測試套件** - 單元測試和集成測試
|
||||
3. **性能優化** - 查詢優化和緩存策略
|
||||
|
||||
---
|
||||
|
||||
## 🔒 安全注意事項
|
||||
|
||||
### 已實作
|
||||
- ✅ 用戶隔離(Row-level security)
|
||||
- ✅ 檔案訪問控制
|
||||
- ✅ Token 過期檢查
|
||||
- ✅ 管理員權限驗證
|
||||
- ✅ 審計日誌記錄
|
||||
|
||||
### 待實作(可選)
|
||||
- ⏳ Token 加密存儲
|
||||
- ⏳ 速率限制
|
||||
- ⏳ CSRF 保護增強
|
||||
|
||||
---
|
||||
|
||||
## 📞 聯繫資訊
|
||||
|
||||
**管理員郵箱**: ymirliu@panjit.com.tw
|
||||
**外部認證 API**: https://pj-auth-api.vercel.app
|
||||
|
||||
---
|
||||
|
||||
*最後更新: 2025-11-14*
|
||||
*實作者: Claude Code*
|
||||
@@ -0,0 +1,183 @@
|
||||
-- Tool_OCR Database Schema with External API Authentication
|
||||
-- Version: 2.0.0
|
||||
-- Date: 2025-11-14
|
||||
-- Description: Complete database redesign with user task isolation and history
|
||||
|
||||
-- ============================================
|
||||
-- Drop existing tables (if needed)
|
||||
-- ============================================
|
||||
-- Uncomment these lines to drop existing tables
|
||||
-- DROP TABLE IF EXISTS tool_ocr_sessions;
|
||||
-- DROP TABLE IF EXISTS tool_ocr_task_files;
|
||||
-- DROP TABLE IF EXISTS tool_ocr_tasks;
|
||||
-- DROP TABLE IF EXISTS tool_ocr_users;
|
||||
|
||||
-- ============================================
|
||||
-- 1. Users Table
|
||||
-- ============================================
|
||||
CREATE TABLE IF NOT EXISTS tool_ocr_users (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
email VARCHAR(255) UNIQUE NOT NULL COMMENT 'Primary identifier from Azure AD',
|
||||
display_name VARCHAR(255) COMMENT 'Display name from API response',
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
last_login TIMESTAMP NULL,
|
||||
is_active BOOLEAN DEFAULT TRUE,
|
||||
INDEX idx_email (email),
|
||||
INDEX idx_active (is_active)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||||
COMMENT='User accounts authenticated via external API';
|
||||
|
||||
-- ============================================
|
||||
-- 2. OCR Tasks Table
|
||||
-- ============================================
|
||||
CREATE TABLE IF NOT EXISTS tool_ocr_tasks (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
user_id INT NOT NULL COMMENT 'Foreign key to users table',
|
||||
task_id VARCHAR(255) UNIQUE NOT NULL COMMENT 'Unique task identifier (UUID)',
|
||||
filename VARCHAR(255),
|
||||
file_type VARCHAR(50),
|
||||
status ENUM('pending', 'processing', 'completed', 'failed') DEFAULT 'pending',
|
||||
result_json_path VARCHAR(500) COMMENT 'Path to JSON result file',
|
||||
result_markdown_path VARCHAR(500) COMMENT 'Path to Markdown result file',
|
||||
result_pdf_path VARCHAR(500) COMMENT 'Path to searchable PDF file',
|
||||
error_message TEXT COMMENT 'Error details if task failed',
|
||||
processing_time_ms INT COMMENT 'Processing time in milliseconds',
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
|
||||
completed_at TIMESTAMP NULL,
|
||||
file_deleted BOOLEAN DEFAULT FALSE COMMENT 'Track if files were auto-deleted',
|
||||
FOREIGN KEY (user_id) REFERENCES tool_ocr_users(id) ON DELETE CASCADE,
|
||||
INDEX idx_user_status (user_id, status),
|
||||
INDEX idx_created (created_at),
|
||||
INDEX idx_task_id (task_id),
|
||||
INDEX idx_filename (filename)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||||
COMMENT='OCR processing tasks with user association';
|
||||
|
||||
-- ============================================
|
||||
-- 3. Task Files Table
|
||||
-- ============================================
|
||||
CREATE TABLE IF NOT EXISTS tool_ocr_task_files (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
task_id INT NOT NULL COMMENT 'Foreign key to tasks table',
|
||||
original_name VARCHAR(255),
|
||||
stored_path VARCHAR(500) COMMENT 'Actual file path on server',
|
||||
file_size BIGINT COMMENT 'File size in bytes',
|
||||
mime_type VARCHAR(100),
|
||||
file_hash VARCHAR(64) COMMENT 'SHA256 hash for deduplication',
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
FOREIGN KEY (task_id) REFERENCES tool_ocr_tasks(id) ON DELETE CASCADE,
|
||||
INDEX idx_task (task_id),
|
||||
INDEX idx_hash (file_hash)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||||
COMMENT='Files associated with OCR tasks';
|
||||
|
||||
-- ============================================
|
||||
-- 4. Sessions Table (Token Storage)
|
||||
-- ============================================
|
||||
CREATE TABLE IF NOT EXISTS tool_ocr_sessions (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
user_id INT NOT NULL COMMENT 'Foreign key to users table',
|
||||
session_id VARCHAR(255) UNIQUE NOT NULL COMMENT 'Unique session identifier',
|
||||
access_token TEXT COMMENT 'Azure AD access token (encrypted)',
|
||||
id_token TEXT COMMENT 'Azure AD ID token (encrypted)',
|
||||
refresh_token TEXT COMMENT 'Refresh token if available',
|
||||
expires_at TIMESTAMP NOT NULL COMMENT 'Token expiration time',
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
last_accessed TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
|
||||
is_active BOOLEAN DEFAULT TRUE,
|
||||
ip_address VARCHAR(45) COMMENT 'Client IP address',
|
||||
user_agent TEXT COMMENT 'Client user agent',
|
||||
FOREIGN KEY (user_id) REFERENCES tool_ocr_users(id) ON DELETE CASCADE,
|
||||
INDEX idx_user (user_id),
|
||||
INDEX idx_session (session_id),
|
||||
INDEX idx_expires (expires_at),
|
||||
INDEX idx_active (is_active)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||||
COMMENT='User session and token management';
|
||||
|
||||
-- ============================================
|
||||
-- 5. Audit Log Table (Optional)
|
||||
-- ============================================
|
||||
CREATE TABLE IF NOT EXISTS tool_ocr_audit_logs (
|
||||
id BIGINT PRIMARY KEY AUTO_INCREMENT,
|
||||
user_id INT COMMENT 'User who performed the action',
|
||||
action VARCHAR(100) NOT NULL COMMENT 'Action performed',
|
||||
entity_type VARCHAR(50) COMMENT 'Type of entity affected',
|
||||
entity_id INT COMMENT 'ID of entity affected',
|
||||
details JSON COMMENT 'Additional details in JSON format',
|
||||
ip_address VARCHAR(45),
|
||||
user_agent TEXT,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
INDEX idx_user (user_id),
|
||||
INDEX idx_action (action),
|
||||
INDEX idx_created (created_at)
|
||||
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
|
||||
COMMENT='Audit trail for all system actions';
|
||||
|
||||
-- ============================================
|
||||
-- Views for Common Queries
|
||||
-- ============================================
|
||||
|
||||
-- User task statistics view
|
||||
CREATE OR REPLACE VIEW tool_ocr_user_stats AS
|
||||
SELECT
|
||||
u.id as user_id,
|
||||
u.email,
|
||||
u.display_name,
|
||||
COUNT(DISTINCT t.id) as total_tasks,
|
||||
SUM(CASE WHEN t.status = 'completed' THEN 1 ELSE 0 END) as completed_tasks,
|
||||
SUM(CASE WHEN t.status = 'failed' THEN 1 ELSE 0 END) as failed_tasks,
|
||||
SUM(CASE WHEN t.status = 'processing' THEN 1 ELSE 0 END) as processing_tasks,
|
||||
SUM(CASE WHEN t.status = 'pending' THEN 1 ELSE 0 END) as pending_tasks,
|
||||
AVG(t.processing_time_ms) as avg_processing_time_ms,
|
||||
MAX(t.created_at) as last_task_created
|
||||
FROM tool_ocr_users u
|
||||
LEFT JOIN tool_ocr_tasks t ON u.id = t.user_id
|
||||
GROUP BY u.id, u.email, u.display_name;
|
||||
|
||||
-- Recent tasks view
|
||||
CREATE OR REPLACE VIEW tool_ocr_recent_tasks AS
|
||||
SELECT
|
||||
t.*,
|
||||
u.email as user_email,
|
||||
u.display_name as user_name
|
||||
FROM tool_ocr_tasks t
|
||||
INNER JOIN tool_ocr_users u ON t.user_id = u.id
|
||||
ORDER BY t.created_at DESC
|
||||
LIMIT 100;
|
||||
|
||||
-- ============================================
|
||||
-- Stored Procedures (Optional)
|
||||
-- ============================================
|
||||
|
||||
DELIMITER $$
|
||||
|
||||
-- Procedure to clean up expired sessions
|
||||
CREATE PROCEDURE IF NOT EXISTS cleanup_expired_sessions()
|
||||
BEGIN
|
||||
DELETE FROM tool_ocr_sessions
|
||||
WHERE expires_at < NOW() OR is_active = FALSE;
|
||||
END$$
|
||||
|
||||
-- Procedure to clean up old tasks
|
||||
CREATE PROCEDURE IF NOT EXISTS cleanup_old_tasks(IN days_to_keep INT)
|
||||
BEGIN
|
||||
UPDATE tool_ocr_tasks
|
||||
SET file_deleted = TRUE
|
||||
WHERE created_at < DATE_SUB(NOW(), INTERVAL days_to_keep DAY)
|
||||
AND status IN ('completed', 'failed');
|
||||
END$$
|
||||
|
||||
DELIMITER ;
|
||||
|
||||
-- ============================================
|
||||
-- Initial Data (Optional)
|
||||
-- ============================================
|
||||
-- Add any initial data here if needed
|
||||
|
||||
-- ============================================
|
||||
-- Grants (Adjust as needed)
|
||||
-- ============================================
|
||||
-- GRANT ALL PRIVILEGES ON tool_ocr_* TO 'tool_ocr_user'@'localhost';
|
||||
-- FLUSH PRIVILEGES;
|
||||
@@ -0,0 +1,294 @@
|
||||
# Change: Migrate to External API Authentication
|
||||
|
||||
## Why
|
||||
|
||||
The current local database authentication system has several limitations:
|
||||
- User credentials are managed locally, requiring manual user creation and password management
|
||||
- No centralized authentication with enterprise identity systems
|
||||
- Cannot leverage existing enterprise authentication infrastructure (e.g., Microsoft Azure AD)
|
||||
- No single sign-on (SSO) capability
|
||||
- Increased maintenance overhead for user management
|
||||
|
||||
By migrating to the external API authentication service at https://pj-auth-api.vercel.app, the system will:
|
||||
- Integrate with enterprise Microsoft Azure AD authentication
|
||||
- Enable single sign-on (SSO) for users
|
||||
- Eliminate local password management
|
||||
- Leverage existing enterprise user management and security policies
|
||||
- Reduce maintenance overhead
|
||||
- Provide consistent authentication across multiple applications
|
||||
|
||||
## What Changes
|
||||
|
||||
### Authentication Flow
|
||||
- **Current**: Local database authentication using username/password stored in MySQL
|
||||
- **New**: External API authentication via POST to `https://pj-auth-api.vercel.app/api/auth/login`
|
||||
- **Token Management**: Use JWT tokens from external API instead of locally generated tokens
|
||||
- **User Display**: Use `name` field from API response for user display instead of local username
|
||||
|
||||
### API Integration
|
||||
**Endpoint**: `POST https://pj-auth-api.vercel.app/api/auth/login`
|
||||
|
||||
**Request Format**:
|
||||
```json
|
||||
{
|
||||
"username": "user@domain.com",
|
||||
"password": "user_password"
|
||||
}
|
||||
```
|
||||
|
||||
**Success Response (200)**:
|
||||
```json
|
||||
{
|
||||
"success": true,
|
||||
"message": "認證成功",
|
||||
"data": {
|
||||
"access_token": "eyJ0eXAiOiJKV1Q...",
|
||||
"id_token": "eyJ0eXAiOiJKV1Q...",
|
||||
"expires_in": 4999,
|
||||
"token_type": "Bearer",
|
||||
"userInfo": {
|
||||
"id": "42cf0b98-f598-47dd-ae2a-f33803f87d41",
|
||||
"name": "ymirliu 劉念萱",
|
||||
"email": "ymirliu@panjit.com.tw",
|
||||
"jobTitle": null,
|
||||
"officeLocation": "高雄",
|
||||
"businessPhones": ["1580"]
|
||||
},
|
||||
"issuedAt": "2025-11-14T07:09:15.203Z",
|
||||
"expiresAt": "2025-11-14T08:32:34.203Z"
|
||||
},
|
||||
"timestamp": "2025-11-14T07:09:15.203Z"
|
||||
}
|
||||
```
|
||||
|
||||
**Failure Response (401)**:
|
||||
```json
|
||||
{
|
||||
"success": false,
|
||||
"error": "用戶名或密碼錯誤",
|
||||
"code": "INVALID_CREDENTIALS",
|
||||
"timestamp": "2025-11-14T07:10:02.585Z"
|
||||
}
|
||||
```
|
||||
|
||||
### Database Schema Changes
|
||||
|
||||
**Complete Redesign (No backward compatibility needed)**:
|
||||
|
||||
**Table Prefix**: `tool_ocr_` (for clear separation from other systems in the same database)
|
||||
|
||||
1. **tool_ocr_users table (redesigned)**:
|
||||
```sql
|
||||
CREATE TABLE tool_ocr_users (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
email VARCHAR(255) UNIQUE NOT NULL, -- Primary identifier from Azure AD
|
||||
display_name VARCHAR(255), -- Display name from API response
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
last_login TIMESTAMP,
|
||||
is_active BOOLEAN DEFAULT TRUE
|
||||
);
|
||||
```
|
||||
Note: No Azure AD ID storage needed - email is sufficient as unique identifier
|
||||
|
||||
2. **tool_ocr_tasks table (new - for task history)**:
|
||||
```sql
|
||||
CREATE TABLE tool_ocr_tasks (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
user_id INT NOT NULL, -- Foreign key to users table
|
||||
task_id VARCHAR(255) UNIQUE, -- Unique task identifier
|
||||
filename VARCHAR(255),
|
||||
file_type VARCHAR(50),
|
||||
status ENUM('pending', 'processing', 'completed', 'failed'),
|
||||
result_json_path VARCHAR(500),
|
||||
result_markdown_path VARCHAR(500),
|
||||
error_message TEXT,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
|
||||
completed_at TIMESTAMP NULL,
|
||||
file_deleted BOOLEAN DEFAULT FALSE, -- Track if files were auto-deleted
|
||||
FOREIGN KEY (user_id) REFERENCES tool_ocr_users(id),
|
||||
INDEX idx_user_status (user_id, status),
|
||||
INDEX idx_created (created_at)
|
||||
);
|
||||
```
|
||||
|
||||
3. **tool_ocr_task_files table (for multiple files per task)**:
|
||||
```sql
|
||||
CREATE TABLE tool_ocr_task_files (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
task_id INT NOT NULL,
|
||||
original_name VARCHAR(255),
|
||||
stored_path VARCHAR(500),
|
||||
file_size BIGINT,
|
||||
mime_type VARCHAR(100),
|
||||
FOREIGN KEY (task_id) REFERENCES tool_ocr_tasks(id) ON DELETE CASCADE
|
||||
);
|
||||
```
|
||||
|
||||
4. **tool_ocr_sessions table (for token management)**:
|
||||
```sql
|
||||
CREATE TABLE tool_ocr_sessions (
|
||||
id INT PRIMARY KEY AUTO_INCREMENT,
|
||||
user_id INT NOT NULL,
|
||||
access_token TEXT,
|
||||
id_token TEXT,
|
||||
expires_at TIMESTAMP,
|
||||
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
||||
FOREIGN KEY (user_id) REFERENCES tool_ocr_users(id) ON DELETE CASCADE,
|
||||
INDEX idx_user (user_id),
|
||||
INDEX idx_expires (expires_at)
|
||||
);
|
||||
```
|
||||
|
||||
### Session Management
|
||||
- Store external API tokens in session/cache instead of local JWT
|
||||
- Implement token refresh mechanism based on `expires_in` field
|
||||
- Use `expiresAt` timestamp for token expiration validation
|
||||
|
||||
## New Features: User Task Isolation and History
|
||||
|
||||
### Task Isolation
|
||||
- **Principle**: Each user can only see and access their own tasks
|
||||
- **Implementation**: All task queries filtered by `user_id` at API level
|
||||
- **Security**: Enforce user context validation in all task-related endpoints
|
||||
|
||||
### Task History Features
|
||||
1. **Task Status Tracking**:
|
||||
- View pending tasks (waiting to process)
|
||||
- View processing tasks (currently running)
|
||||
- View completed tasks (with results available)
|
||||
- View failed tasks (with error messages)
|
||||
|
||||
2. **Historical Query Capabilities**:
|
||||
- Search tasks by filename
|
||||
- Filter by date range
|
||||
- Filter by status
|
||||
- Sort by creation/completion time
|
||||
- Pagination for large result sets
|
||||
|
||||
3. **Task Management**:
|
||||
- Download original files (if not auto-deleted)
|
||||
- Download results (JSON, Markdown, PDF exports)
|
||||
- Re-process failed tasks
|
||||
- Delete old tasks manually
|
||||
|
||||
### Frontend UI Changes
|
||||
1. **New Components**:
|
||||
- Task History page/tab
|
||||
- Task filters and search bar
|
||||
- Task status badges
|
||||
- Batch action controls
|
||||
|
||||
2. **Task List View**:
|
||||
```
|
||||
| Filename | Status | Created | Completed | Actions |
|
||||
|----------|--------|---------|-----------|---------|
|
||||
| doc1.pdf | ✅ Completed | 2025-11-14 10:00 | 2025-11-14 10:05 | [Download] [View] |
|
||||
| doc2.pdf | 🔄 Processing | 2025-11-14 10:10 | - | [Cancel] |
|
||||
| doc3.pdf | ❌ Failed | 2025-11-14 09:00 | - | [Retry] [View Error] |
|
||||
```
|
||||
|
||||
3. **User Information Display**:
|
||||
- Show user display name in header
|
||||
- Show last login time
|
||||
- Show task statistics (total, completed, failed)
|
||||
|
||||
## Impact
|
||||
|
||||
### Affected Capabilities
|
||||
- `authentication`: Complete replacement of authentication mechanism
|
||||
- `user-management`: Simplified to read-only user information from external API
|
||||
- `session-management`: Modified to handle external tokens
|
||||
- `task-management`: NEW - User-specific task isolation and history
|
||||
- `file-access-control`: NEW - User-based file access restrictions
|
||||
|
||||
### Affected Code
|
||||
- **Backend Authentication**:
|
||||
- `backend/app/api/v1/endpoints/auth.py`: Replace login logic with external API call
|
||||
- `backend/app/core/security.py`: Modify token validation to use external tokens
|
||||
- `backend/app/core/auth.py`: Update authentication dependencies
|
||||
- `backend/app/services/auth_service.py`: New service for external API integration
|
||||
|
||||
- **Database Models**:
|
||||
- `backend/app/models/user.py`: Complete redesign with new schema
|
||||
- `backend/app/models/task.py`: NEW - Task model with user association
|
||||
- `backend/app/models/task_file.py`: NEW - Task file model
|
||||
- `backend/alembic/versions/`: Complete database recreation
|
||||
|
||||
- **Task Management APIs** (NEW):
|
||||
- `backend/app/api/v1/endpoints/tasks.py`: Task CRUD operations with user isolation
|
||||
- `backend/app/api/v1/endpoints/task_history.py`: Historical query endpoints
|
||||
- `backend/app/services/task_service.py`: Task business logic
|
||||
- `backend/app/services/file_access_service.py`: User-based file access control
|
||||
|
||||
- **Frontend**:
|
||||
- `frontend/src/services/authService.ts`: Update to handle new token format
|
||||
- `frontend/src/stores/authStore.ts`: Modify to store/display user info from API
|
||||
- `frontend/src/components/Header.tsx`: Display `name` field and user menu
|
||||
- `frontend/src/pages/TaskHistory.tsx`: NEW - Task history page
|
||||
- `frontend/src/components/TaskList.tsx`: NEW - Task list component with filters
|
||||
- `frontend/src/components/TaskFilters.tsx`: NEW - Search and filter UI
|
||||
- `frontend/src/stores/taskStore.ts`: NEW - Task state management
|
||||
- `frontend/src/services/taskService.ts`: NEW - Task API client
|
||||
|
||||
### Dependencies
|
||||
- Add `httpx` or `aiohttp` for async HTTP requests to external API (already present)
|
||||
- No new package dependencies required
|
||||
|
||||
### Configuration
|
||||
- New environment variables:
|
||||
- `EXTERNAL_AUTH_API_URL` = "https://pj-auth-api.vercel.app"
|
||||
- `EXTERNAL_AUTH_ENDPOINT` = "/api/auth/login"
|
||||
- `EXTERNAL_AUTH_TIMEOUT` = 30 (seconds)
|
||||
- `TOKEN_REFRESH_BUFFER` = 300 (refresh tokens 5 minutes before expiry)
|
||||
- `TASK_RETENTION_DAYS` = 30 (auto-delete old tasks)
|
||||
- `MAX_TASKS_PER_USER` = 1000 (limit per user)
|
||||
- `ENABLE_TASK_HISTORY` = true (enable history feature)
|
||||
- `DATABASE_TABLE_PREFIX` = "tool_ocr_" (table naming prefix)
|
||||
|
||||
### Security Considerations
|
||||
- HTTPS required for all authentication requests
|
||||
- Token storage must be secure (HTTPOnly cookies or secure session storage)
|
||||
- Implement rate limiting for authentication attempts
|
||||
- Log all authentication events for audit trail
|
||||
- Validate SSL certificates for external API calls
|
||||
- Handle network failures gracefully with appropriate error messages
|
||||
- **User Isolation**: Enforce user context in all database queries
|
||||
- **File Access Control**: Validate user ownership before file access
|
||||
- **API Security**: Add user_id validation in all task-related endpoints
|
||||
|
||||
### Migration Plan (Simplified - No Rollback Needed)
|
||||
1. **Phase 1**: Backup existing database (for reference only)
|
||||
2. **Phase 2**: Drop old tables and create new schema
|
||||
3. **Phase 3**: Deploy new authentication and task management system
|
||||
4. **Phase 4**: Test with initial users
|
||||
5. **Phase 5**: Full deployment
|
||||
|
||||
Note: Since this is a test system with no production data to preserve, we can perform a clean migration without rollback concerns.
|
||||
|
||||
## Risks and Mitigations
|
||||
|
||||
### Risks
|
||||
1. **External API Unavailability**: Authentication service downtime blocks all logins
|
||||
- *Mitigation*: Implement fallback to local auth, cache tokens, implement retry logic
|
||||
|
||||
2. **Token Expiration Handling**: Users may be logged out unexpectedly
|
||||
- *Mitigation*: Implement automatic token refresh before expiration
|
||||
|
||||
3. **Network Latency**: Slower authentication due to external API calls
|
||||
- *Mitigation*: Implement proper timeout handling, async requests, response caching
|
||||
|
||||
4. **Data Consistency**: User information mismatch between local DB and external system
|
||||
- *Mitigation*: Regular sync jobs, use external system as single source of truth
|
||||
|
||||
5. **Breaking Change**: Existing sessions will be invalidated
|
||||
- *Mitigation*: Provide migration window, clear communication to users
|
||||
|
||||
## Success Criteria
|
||||
- All users can authenticate via external API
|
||||
- Authentication response time < 2 seconds (95th percentile)
|
||||
- Zero data loss during migration
|
||||
- Automatic token refresh works without user intervention
|
||||
- Proper error messages for all failure scenarios
|
||||
- Audit logs capture all authentication events
|
||||
- Rollback procedure tested and documented
|
||||
@@ -0,0 +1,276 @@
|
||||
# Implementation Tasks
|
||||
|
||||
## 1. Database Schema Redesign
|
||||
- [ ] 1.1 Backup existing database (for reference)
|
||||
- Export current schema and data
|
||||
- Document any important data to preserve
|
||||
- [ ] 1.2 Drop old tables
|
||||
- Remove existing tables with old naming convention
|
||||
- Clear database for fresh start
|
||||
- [ ] 1.3 Create new database schema with `tool_ocr_` prefix
|
||||
- Create new `tool_ocr_users` table (email as primary identifier)
|
||||
- Create `tool_ocr_tasks` table with user association
|
||||
- Create `tool_ocr_task_files` table for file tracking
|
||||
- Create `tool_ocr_sessions` table for token storage
|
||||
- Add proper indexes for performance
|
||||
- [ ] 1.4 Create SQLAlchemy models
|
||||
- User model (mapped to `tool_ocr_users`)
|
||||
- Task model (mapped to `tool_ocr_tasks`)
|
||||
- TaskFile model (mapped to `tool_ocr_task_files`)
|
||||
- Session model (mapped to `tool_ocr_sessions`)
|
||||
- Configure table prefix in base model
|
||||
- [ ] 1.5 Generate Alembic migration
|
||||
- Create initial migration for new schema
|
||||
- Test migration script with proper table prefixes
|
||||
|
||||
## 2. Configuration Management
|
||||
- [ ] 2.1 Update environment configuration
|
||||
- Add `EXTERNAL_AUTH_API_URL` to `.env.local`
|
||||
- Add `EXTERNAL_AUTH_ENDPOINT` configuration
|
||||
- Add `EXTERNAL_AUTH_TIMEOUT` setting
|
||||
- Add `TOKEN_REFRESH_BUFFER` setting
|
||||
- Add `TASK_RETENTION_DAYS` for auto-cleanup
|
||||
- Add `MAX_TASKS_PER_USER` for limits
|
||||
- Add `ENABLE_TASK_HISTORY` feature flag
|
||||
- Add `DATABASE_TABLE_PREFIX` = "tool_ocr_"
|
||||
- [ ] 2.2 Update Settings class
|
||||
- Add external auth settings to `backend/app/core/config.py`
|
||||
- Add task management settings
|
||||
- Add database table prefix configuration
|
||||
- Add validation for new configuration values
|
||||
- Remove old authentication settings
|
||||
|
||||
## 3. External API Integration Service
|
||||
- [ ] 3.1 Create auth API client
|
||||
- Implement `backend/app/services/external_auth_service.py`
|
||||
- Create async HTTP client for API calls
|
||||
- Implement request/response models
|
||||
- Add proper error handling and logging
|
||||
- [ ] 3.2 Implement authentication methods
|
||||
- `authenticate_user()` - Call external API
|
||||
- `validate_token()` - Verify token validity
|
||||
- `refresh_token()` - Handle token refresh
|
||||
- `get_user_info()` - Fetch user details
|
||||
- [ ] 3.3 Add resilience patterns
|
||||
- Implement retry logic with exponential backoff
|
||||
- Add circuit breaker pattern
|
||||
- Implement timeout handling
|
||||
- Add fallback mechanisms
|
||||
|
||||
## 4. Backend Authentication Updates
|
||||
- [ ] 4.1 Modify login endpoint
|
||||
- Update `backend/app/api/v1/endpoints/auth.py`
|
||||
- Route to external API based on feature flag
|
||||
- Handle both authentication modes during transition
|
||||
- Return appropriate token format
|
||||
- [ ] 4.2 Update token validation
|
||||
- Modify `backend/app/core/security.py`
|
||||
- Support both local and external tokens
|
||||
- Implement token type detection
|
||||
- Update JWT validation logic
|
||||
- [ ] 4.3 Update authentication dependencies
|
||||
- Modify `backend/app/core/auth.py`
|
||||
- Update `get_current_user()` dependency
|
||||
- Handle external user information
|
||||
- Implement proper user context
|
||||
|
||||
## 5. Session and Token Management
|
||||
- [ ] 5.1 Implement token storage
|
||||
- Store external tokens securely
|
||||
- Implement token encryption at rest
|
||||
- Handle multiple token types (access, ID, refresh)
|
||||
- [ ] 5.2 Create token refresh mechanism
|
||||
- Background task for token refresh
|
||||
- Refresh tokens before expiration
|
||||
- Update stored tokens atomically
|
||||
- Handle refresh failures gracefully
|
||||
- [ ] 5.3 Session invalidation
|
||||
- Clear tokens on logout
|
||||
- Handle token revocation
|
||||
- Implement session timeout
|
||||
|
||||
## 6. Frontend Updates
|
||||
- [ ] 6.1 Update authentication service
|
||||
- Modify `frontend/src/services/authService.ts`
|
||||
- Handle new token format
|
||||
- Store user display information
|
||||
- Implement token refresh on client side
|
||||
- [ ] 6.2 Update auth store
|
||||
- Modify `frontend/src/stores/authStore.ts`
|
||||
- Store external user information
|
||||
- Update user display logic
|
||||
- Handle token expiration
|
||||
- [ ] 6.3 Update UI components
|
||||
- Modify `frontend/src/components/Header.tsx`
|
||||
- Display user `name` instead of username
|
||||
- Show additional user information
|
||||
- Update login form if needed
|
||||
- [ ] 6.4 Error handling
|
||||
- Handle external API errors
|
||||
- Display appropriate error messages
|
||||
- Implement retry UI for failures
|
||||
- Add loading states
|
||||
|
||||
## 7. Task Management System (NEW)
|
||||
- [ ] 7.1 Create task management backend
|
||||
- Implement `backend/app/models/task.py`
|
||||
- Implement `backend/app/models/task_file.py`
|
||||
- Create `backend/app/services/task_service.py`
|
||||
- Add task CRUD operations with user isolation
|
||||
- [ ] 7.2 Implement task APIs
|
||||
- Create `backend/app/api/v1/endpoints/tasks.py`
|
||||
- GET /tasks (list user's tasks with pagination)
|
||||
- GET /tasks/{id} (get specific task)
|
||||
- DELETE /tasks/{id} (delete task)
|
||||
- POST /tasks/{id}/retry (retry failed task)
|
||||
- [ ] 7.3 Create task history endpoints
|
||||
- Create `backend/app/api/v1/endpoints/task_history.py`
|
||||
- GET /history (query with filters)
|
||||
- GET /history/stats (user statistics)
|
||||
- POST /history/export (export history)
|
||||
- [ ] 7.4 Implement file access control
|
||||
- Create `backend/app/services/file_access_service.py`
|
||||
- Validate user ownership before file access
|
||||
- Restrict download to user's own files
|
||||
- Add audit logging for file access
|
||||
- [ ] 7.5 Update OCR service integration
|
||||
- Link OCR tasks to user accounts
|
||||
- Save task records in database
|
||||
- Update task status during processing
|
||||
- Store result file paths
|
||||
|
||||
## 8. Frontend Task Management UI (NEW)
|
||||
- [ ] 8.1 Create task history page
|
||||
- Implement `frontend/src/pages/TaskHistory.tsx`
|
||||
- Display task list with status indicators
|
||||
- Add pagination controls
|
||||
- Show task details modal
|
||||
- [ ] 8.2 Build task list component
|
||||
- Implement `frontend/src/components/TaskList.tsx`
|
||||
- Display task table with columns
|
||||
- Add sorting capabilities
|
||||
- Implement action buttons
|
||||
- [ ] 8.3 Create filter components
|
||||
- Implement `frontend/src/components/TaskFilters.tsx`
|
||||
- Date range picker
|
||||
- Status filter dropdown
|
||||
- Search by filename
|
||||
- Clear filters button
|
||||
- [ ] 8.4 Add task management store
|
||||
- Implement `frontend/src/stores/taskStore.ts`
|
||||
- Manage task list state
|
||||
- Handle filter state
|
||||
- Cache task data
|
||||
- [ ] 8.5 Create task service client
|
||||
- Implement `frontend/src/services/taskService.ts`
|
||||
- API methods for task operations
|
||||
- Handle pagination
|
||||
- Implement retry logic
|
||||
- [ ] 8.6 Update navigation
|
||||
- Add "Task History" menu item
|
||||
- Update router configuration
|
||||
- Add task count badge
|
||||
- Implement user menu with stats
|
||||
|
||||
## 9. User Isolation and Security
|
||||
- [ ] 9.1 Implement user context middleware
|
||||
- Create middleware to inject user context
|
||||
- Validate user in all requests
|
||||
- Add user_id to logging context
|
||||
- [ ] 9.2 Database query isolation
|
||||
- Add user_id filter to all task queries
|
||||
- Prevent cross-user data access
|
||||
- Implement row-level security
|
||||
- [ ] 9.3 File system isolation
|
||||
- Organize files by user directory
|
||||
- Validate file paths before access
|
||||
- Implement cleanup for deleted users
|
||||
- [ ] 9.4 API authorization
|
||||
- Add @require_user decorator
|
||||
- Validate ownership in endpoints
|
||||
- Return 403 for unauthorized access
|
||||
|
||||
## 10. Testing
|
||||
- [ ] 10.1 Unit tests
|
||||
- Test external auth service
|
||||
- Test token validation
|
||||
- Test task isolation logic
|
||||
- Test file access control
|
||||
- [ ] 10.2 Integration tests
|
||||
- Test full authentication flow
|
||||
- Test task management flow
|
||||
- Test user isolation between accounts
|
||||
- Test file download restrictions
|
||||
- [ ] 10.3 Load testing
|
||||
- Test external API response times
|
||||
- Test system with many concurrent users
|
||||
- Test large task history queries
|
||||
- Measure database query performance
|
||||
- [ ] 10.4 Security testing
|
||||
- Test token security
|
||||
- Verify user isolation
|
||||
- Test unauthorized access attempts
|
||||
- Validate SQL injection prevention
|
||||
|
||||
## 11. Migration Execution (Simplified)
|
||||
- [ ] 11.1 Pre-migration preparation
|
||||
- Backup existing database (reference only)
|
||||
- Prepare deployment package
|
||||
- Set up monitoring
|
||||
- [ ] 11.2 Execute migration
|
||||
- Drop old database tables
|
||||
- Create new schema
|
||||
- Deploy new code
|
||||
- Verify system startup
|
||||
- [ ] 11.3 Post-migration validation
|
||||
- Test authentication with real users
|
||||
- Verify task isolation works
|
||||
- Check task history functionality
|
||||
- Validate file access controls
|
||||
|
||||
## 12. Documentation
|
||||
- [ ] 12.1 Technical documentation
|
||||
- Update API documentation with new endpoints
|
||||
- Document authentication flow
|
||||
- Document task management APIs
|
||||
- Create troubleshooting guide
|
||||
- [ ] 12.2 User documentation
|
||||
- Update login instructions
|
||||
- Document task history features
|
||||
- Explain user isolation
|
||||
- Create user guide for new UI
|
||||
- [ ] 12.3 Developer documentation
|
||||
- Document database schema
|
||||
- Explain security model
|
||||
- Provide integration examples
|
||||
|
||||
## 13. Monitoring and Observability
|
||||
- [ ] 13.1 Add monitoring metrics
|
||||
- Authentication success/failure rates
|
||||
- Task creation/completion rates
|
||||
- User activity metrics
|
||||
- File storage usage
|
||||
- [ ] 13.2 Implement logging
|
||||
- Log all authentication attempts
|
||||
- Log task operations
|
||||
- Log file access attempts
|
||||
- Structured logging for analysis
|
||||
- [ ] 13.3 Create alerts
|
||||
- Alert on authentication failures
|
||||
- Alert on high error rates
|
||||
- Alert on storage issues
|
||||
- Alert on performance degradation
|
||||
|
||||
## 14. Performance Optimization (Post-Launch)
|
||||
- [ ] 14.1 Database optimization
|
||||
- Analyze query patterns
|
||||
- Add missing indexes
|
||||
- Optimize slow queries
|
||||
- [ ] 14.2 Caching implementation
|
||||
- Cache user information
|
||||
- Cache task lists
|
||||
- Implement Redis if needed
|
||||
- [ ] 14.3 File management
|
||||
- Implement automatic cleanup
|
||||
- Optimize storage structure
|
||||
- Add compression if needed
|
||||
Reference in New Issue
Block a user