Commit Graph

9 Commits

Author SHA1 Message Date
egg
5cf4010c9b fix: 修復多頁PDF頁碼分配錯誤和logging配置問題
Critical Bug #1: 多頁PDF頁碼分配錯誤
問題:
- 在處理多頁PDF時,雖然text_regions有正確的頁碼標記
- 但layout_data.elements(表格)和images_metadata(圖片)都保持page=0
- 導致所有頁面的表格和圖片都被錯誤地繪製在第1頁
- 造成嚴重的版面錯誤、元素重疊和位置錯誤

根本原因:
- ocr_service.py (第359-372行) 在累積多頁結果時
- text_regions有添加頁碼:region['page'] = page_num
- 但images_metadata和layout_data.elements沒有更新頁碼
- 它們保持單頁處理時的默認值page=0

修復方案:
- backend/app/services/ocr_service.py (第359-372行)
  - 為layout_data.elements中的每個元素添加正確的頁碼
  - 為images_metadata中的每個圖片添加正確的頁碼
  - 確保多頁PDF的每個元素都有正確的page標記

Critical Bug #2: Logging配置被uvicorn覆蓋
問題:
- uvicorn啟動時會設置自己的logging配置
- 這會覆蓋應用程式的logging.basicConfig()
- 導致應用層的INFO/WARNING/ERROR log完全消失
- 只能看到uvicorn的HTTP請求log和第三方庫的DEBUG log
- 無法診斷PDF生成過程中的問題

修復方案:
- backend/app/main.py (第17-36行)
  - 添加force=True參數強制重新配置logging (Python 3.8+)
  - 顯式設置root logger的level
  - 配置app-specific loggers (app.services.pdf_generator_service等)
  - 啟用log propagation確保訊息能傳遞到root logger

其他修復:
- backend/app/services/pdf_generator_service.py
  - 將重要的debug logging改為info level (第371, 379, 490, 613行)
    原因:預設log level是INFO,debug log不會顯示
  - 修復max_cols UnboundLocalError (第507-509行)
    將logger.info()移到max_cols定義之後
  - 移除危險的.get('page', 0)默認值 (第762行)
    改為.get('page'),沒有page的元素會被正確跳過

影響:
 多頁PDF的表格和圖片現在會正確分配到對應頁面
 詳細的PDF生成log現在可以正確顯示(座標轉換、縮放比例等)
 能夠診斷文字擠壓、間距和位置錯誤的問題

測試建議:
1. 重新啟動後端清除Python cache
2. 上傳多頁PDF進行OCR處理
3. 檢查生成的JSON中每個元素是否有正確的page標記
4. 檢查終端log是否顯示詳細的PDF生成過程
5. 驗證生成的PDF中每頁的元素位置是否正確

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-18 12:13:25 +08:00
egg
7e12f162b4 feat: enable chart recognition with PaddlePaddle 3.2.1
- Fixed WSL CUDA library path in ~/.bashrc
- Upgraded PaddlePaddle from 3.0.0 to 3.2.1
- Verified fused_rms_norm_ext API is now available
- Enabled chart recognition in ocr_service.py
- Updated CHART_RECOGNITION.md to reflect enabled status

Chart recognition now supports:
 Chart type identification
 Data extraction from charts
 Axis and legend parsing
 Converting charts to structured data

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-16 18:57:38 +08:00
egg
fd98018ddd refactor: complete V1 to V2 migration and remove legacy architecture
Remove all V1 architecture components and promote V2 to primary:
- Delete all paddle_ocr_* table models (export, ocr, translation, user)
- Delete legacy routers (auth, export, ocr, translation)
- Delete legacy schemas and services
- Promote user_v2.py to user.py as primary user model
- Update all imports and dependencies to use V2 models only
- Update main.py version to 2.0.0

Database changes:
- Fix SQLAlchemy reserved word: rename audit_log.metadata to extra_data
- Add migration to drop all paddle_ocr_* tables
- Update alembic env to only import V2 models

Frontend fixes:
- Fix Select component exports in TaskHistoryPage.tsx
- Update to use simplified Select API with options prop
- Fix AxiosInstance TypeScript import syntax

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 21:27:39 +08:00
egg
ad2b832fb6 feat: complete external auth V2 migration with advanced features
This commit implements comprehensive external Azure AD authentication
with complete task management, file download, and admin monitoring systems.

## Core Features Implemented (80% Complete)

### 1. Token Auto-Refresh Mechanism 
- Backend: POST /api/v2/auth/refresh endpoint
- Frontend: Auto-refresh 5 minutes before expiration
- Auto-retry on 401 errors with seamless token refresh

### 2. File Download System 
- Three format support: JSON / Markdown / PDF
- Endpoints: GET /api/v2/tasks/{id}/download/{format}
- File access control with ownership validation
- Frontend download buttons in TaskHistoryPage

### 3. Complete Task Management 
Backend Endpoints:
- POST /api/v2/tasks/{id}/start - Start task
- POST /api/v2/tasks/{id}/cancel - Cancel task
- POST /api/v2/tasks/{id}/retry - Retry failed task
- GET /api/v2/tasks - List with filters (status, filename, date range)
- GET /api/v2/tasks/stats - User statistics

Frontend Features:
- Status-based action buttons (Start/Cancel/Retry)
- Advanced search and filtering (status, filename, date range)
- Pagination and sorting
- Task statistics dashboard (5 stat cards)

### 4. Admin Monitoring System  (Backend)
Admin APIs:
- GET /api/v2/admin/stats - System statistics
- GET /api/v2/admin/users - User list with stats
- GET /api/v2/admin/users/top - User leaderboard
- GET /api/v2/admin/audit-logs - Audit log query system
- GET /api/v2/admin/audit-logs/user/{id}/summary

Admin Features:
- Email-based admin check (ymirliu@panjit.com.tw)
- Comprehensive system metrics (users, tasks, sessions, activity)
- Audit logging service for security tracking

### 5. User Isolation & Security 
- Row-level security on all task queries
- File access control with ownership validation
- Strict user_id filtering on all operations
- Session validation and expiry checking
- Admin privilege verification

## New Files Created

Backend:
- backend/app/models/user_v2.py - User model for external auth
- backend/app/models/task.py - Task model with user isolation
- backend/app/models/session.py - Session management
- backend/app/models/audit_log.py - Audit log model
- backend/app/services/external_auth_service.py - External API client
- backend/app/services/task_service.py - Task CRUD with isolation
- backend/app/services/file_access_service.py - File access control
- backend/app/services/admin_service.py - Admin operations
- backend/app/services/audit_service.py - Audit logging
- backend/app/routers/auth_v2.py - V2 auth endpoints
- backend/app/routers/tasks.py - Task management endpoints
- backend/app/routers/admin.py - Admin endpoints
- backend/alembic/versions/5e75a59fb763_*.py - DB migration

Frontend:
- frontend/src/services/apiV2.ts - Complete V2 API client
- frontend/src/types/apiV2.ts - V2 type definitions
- frontend/src/pages/TaskHistoryPage.tsx - Task history UI

Modified Files:
- backend/app/core/deps.py - Added get_current_admin_user_v2
- backend/app/main.py - Registered admin router
- frontend/src/pages/LoginPage.tsx - V2 login integration
- frontend/src/components/Layout.tsx - User display and logout
- frontend/src/App.tsx - Added /tasks route

## Documentation
- openspec/changes/.../PROGRESS_UPDATE.md - Detailed progress report

## Pending Items (20%)
1. Database migration execution for audit_logs table
2. Frontend admin dashboard page
3. Frontend audit log viewer

## Testing Status
- Manual testing:  Authentication flow verified
- Unit tests:  Pending
- Integration tests:  Pending

## Security Enhancements
-  User isolation (row-level security)
-  File access control
-  Token expiry validation
-  Admin privilege verification
-  Audit logging infrastructure
-  Token encryption (noted, low priority)
-  Rate limiting (noted, low priority)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 17:19:43 +08:00
egg
7536f43513 feat: implement GPU acceleration support for OCR processing
實作 GPU 加速支援,自動偵測並啟用 CUDA GPU 加速 OCR 處理

主要變更:

1. 環境設置增強 (setup_dev_env.sh)
   - 新增 GPU 和 CUDA 版本偵測功能
   - 自動安裝對應的 PaddlePaddle GPU/CPU 版本
   - CUDA 11.2+ 安裝 GPU 版本,否則安裝 CPU 版本
   - 安裝後驗證 GPU 可用性並顯示設備資訊

2. 配置更新
   - .env.local: 加入 GPU 配置選項
     * FORCE_CPU_MODE: 強制 CPU 模式選項
     * GPU_MEMORY_FRACTION: GPU 記憶體使用比例
     * GPU_DEVICE_ID: GPU 裝置 ID
   - backend/app/core/config.py: 加入 GPU 配置欄位

3. OCR 服務 GPU 整合 (backend/app/services/ocr_service.py)
   - 新增 _detect_and_configure_gpu() 方法自動偵測 GPU
   - 新增 get_gpu_status() 方法回報 GPU 狀態和記憶體使用
   - 修改 get_ocr_engine() 支援 GPU 參數和錯誤降級
   - 修改 get_structure_engine() 支援 GPU 參數和錯誤降級
   - 自動 GPU/CPU 切換,GPU 失敗時自動降級到 CPU

4. 健康檢查與監控 (backend/app/main.py)
   - /health endpoint 加入 GPU 狀態資訊
   - 回報 GPU 可用性、裝置名稱、記憶體使用等資訊

5. 文檔更新 (README.md)
   - Features: 加入 GPU 加速功能說明
   - Prerequisites: 加入 GPU 硬體要求(可選)
   - Quick Start: 更新自動化設置說明包含 GPU 偵測
   - Configuration: 加入 GPU 配置選項和說明
   - Notes: 加入 GPU 支援注意事項

技術特性:
- 自動偵測 NVIDIA GPU 和 CUDA 版本
- 支援 CUDA 11.2-12.x
- GPU 初始化失敗時優雅降級到 CPU
- GPU 記憶體分配控制防止 OOM
- 即時 GPU 狀態監控和報告
- 完全向後相容 CPU-only 環境

預期效能:
- GPU 系統: 3-10x OCR 處理速度提升
- CPU 系統: 無影響,維持現有效能

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 07:42:13 +08:00
egg
d7e64737b7 feat: migrate to WSL Ubuntu native development environment
從 Docker/macOS+Conda 部署遷移到 WSL2 Ubuntu 原生開發環境

主要變更:
- 移除所有 Docker 相關配置檔案 (Dockerfile, docker-compose.yml, .dockerignore 等)
- 移除 macOS/Conda 設置腳本 (SETUP.md, setup_conda.sh)
- 新增 WSL Ubuntu 自動化環境設置腳本 (setup_dev_env.sh)
- 新增後端/前端快速啟動腳本 (start_backend.sh, start_frontend.sh)
- 統一開發端口配置 (backend: 8000, frontend: 5173)
- 改進資料庫連接穩定性(連接池、超時設置、重試機制)
- 更新專案文檔以反映當前 WSL 開發環境

Technical improvements:
- Database connection pooling with health checks and auto-reconnection
- Retry logic for long-running OCR tasks to prevent DB timeouts
- Extended JWT token expiration to 24 hours
- Support for Office documents (pptx, docx) via LibreOffice headless
- Comprehensive system dependency installation in single script

Environment:
- OS: WSL2 Ubuntu 24.04
- Python: 3.12 (venv)
- Node.js: 24.x LTS (nvm)
- Backend Port: 8000
- Frontend Port: 5173

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 21:00:42 +08:00
beabigegg
fed112656f update FRONTEND documentation 2025-11-12 23:55:21 +08:00
beabigegg
21bc2f92f1 feat: modernize frontend architecture with professional UI/UX design
Complete redesign of frontend interface with focus on usability, visual hierarchy, and professional appearance:

**Design System:**
- Implemented clean blue color theme (#3B82F6) with professional palette
- Created consistent spacing, shadows, and typography system
- Added reusable utility classes (page-header, section, status-badge-*)
- Removed excessive gradients and decorative effects

**Layout Architecture:**
- Redesigned main layout with 256px sidebar navigation
- Sidebar includes logo, navigation with descriptions, and user profile
- Main content area with search bar and scrollable content
- Replaced horizontal navigation with vertical sidebar pattern

**Page Redesigns:**
1. LoginPage: Split-screen design with branding (left) and clean form (right)
   - Feature highlights with icons and statistics
   - Mobile responsive design
   - Professional gradient background with subtle pattern

2. UploadPage: Added 3-step visual progress indicator
   - Better file organization with summary and status badges
   - Clear action bar with confirmation message
   - Improved file list presentation

3. ProcessingPage: Enhanced progress visualization
   - Large progress bar with percentage display
   - 4-column stats grid (Completed, Processing, Failed, Total)
   - Clean file status list with processing times

4. ResultsPage: Improved 5-column layout (2 for list, 3 for preview)
   - Added stats cards for accuracy, processing time, and text blocks
   - Better preview panel with detailed metrics
   - Export and translate action buttons

5. ExportPage: Better organization with 2-column layout
   - Visual format selection with icons (TXT, JSON, Excel, Markdown, PDF)
   - Improved form controls and option organization
   - Sticky preview sidebar showing current configuration

**Component Updates:**
- Updated Button component with proper variants
- Enhanced Card component with hover effects
- Maintained FileUpload component functionality
- Added lucide-react for modern iconography

**Technical Improvements:**
- Fixed Tailwind CSS v4 compatibility issues with @apply
- Removed decorative animations in favor of functional ones
- Improved accessibility with proper labels and ARIA attributes
- Better color contrast and readability

This redesign transforms the interface from a basic layout to a professional, enterprise-ready application with clear visual hierarchy and excellent usability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 23:54:44 +08:00
beabigegg
da700721fa first 2025-11-12 22:53:17 +08:00