Added `tool_ocr_` prefix to all database tables for clear separation
from other systems in the same database.
Changes:
- All tables now use `tool_ocr_` prefix
- Added tool_ocr_sessions table for token management
- Created complete SQL schema file with:
  - Full table definitions with comments
  - Indexes for performance
  - Views for common queries
  - Stored procedures for maintenance
  - Audit log table (optional)
New files:
- database_schema.sql: Ready-to-use SQL script for deployment
Configuration:
- Added DATABASE_TABLE_PREFIX environment variable
- Updated all references to use prefixed table names
Benefits:
- Clear namespace separation in shared databases
- Easier identification of Tool_OCR tables
- Prevent conflicts with other applications
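Illustrative sketch of how application code might resolve prefixed table names from DATABASE_TABLE_PREFIX. The helper name `table_name` and the identifier check are assumptions for illustration, not the code in this commit:

```python
import os
import re

# Default taken from this commit; the environment variable overrides it.
DEFAULT_PREFIX = "tool_ocr_"


def table_name(base, env=None):
    """Return the prefixed name for a logical table (hypothetical helper)."""
    env = os.environ if env is None else env
    prefix = env.get("DATABASE_TABLE_PREFIX", DEFAULT_PREFIX)
    full = prefix + base
    # Table names are interpolated into SQL text, so only plain
    # identifiers are accepted here.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", full):
        raise ValueError(f"unsafe table name: {full!r}")
    return full
```

With no override, `table_name("sessions")` resolves to `tool_ocr_sessions`, the sessions table added above.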
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major updates based on feedback:
1. Remove Azure AD ID storage - use email as primary identifier
2. Complete database redesign - no backward compatibility needed
3. Add comprehensive user task isolation and history features
Database changes:
- Simplified users table (email-based)
- New ocr_tasks table with user association
- New task_files table for file tracking
- Proper indexes for performance
New features:
- User task isolation (user A cannot see user B's tasks)
- Task history with status tracking (pending/processing/completed/failed)
- Historical query capabilities with filters
- Download support for completed tasks
- Task management UI with search and filters
Security enhancements:
- User context validation in all endpoints
- File access control based on ownership
- Row-level security in database queries
- API-level authorization checks
Implementation approach:
- Clean migration without rollback concerns
- Drop old tables and start fresh
- Simplified deployment process
- Comprehensive task management system
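Minimal sketch of the task isolation described above, using SQLite for illustration. The table names `users` and `ocr_tasks` and the four status values come from this commit; the column names, the index name, and the helpers `create_schema`, `list_tasks`, and `get_task` are assumptions:

```python
import sqlite3
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


def create_schema(conn):
    # Column layout is assumed; only the table names come from the commit.
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE ocr_tasks (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            status TEXT NOT NULL DEFAULT 'pending'
        );
        CREATE INDEX idx_ocr_tasks_user_status ON ocr_tasks (user_id, status);
        """
    )


def list_tasks(conn, user_id, status=None):
    """List one user's tasks; every query is scoped by user_id."""
    sql = "SELECT id, status FROM ocr_tasks WHERE user_id = ?"
    params = [user_id]
    if status is not None:
        sql += " AND status = ?"
        params.append(status.value)
    sql += " ORDER BY id"
    return conn.execute(sql, params).fetchall()


def get_task(conn, user_id, task_id):
    """Return the task only if it belongs to user_id, else None."""
    return conn.execute(
        "SELECT id, status FROM ocr_tasks WHERE id = ? AND user_id = ?",
        (task_id, user_id),
    ).fetchone()
```

Filtering on the owner inside the query itself means a guessed task id returns nothing for another user, rather than relying on a separate check after the row is loaded.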
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Create OpenSpec proposal for migrating from local database authentication
to external API authentication using Microsoft Azure AD.
Changes proposed:
- Replace local username/password auth with external API
- Integrate with https://pj-auth-api.vercel.app/api/auth/login
- Use Azure AD tokens instead of local JWT
- Display user 'name' from API response in UI
- Maintain backward compatibility with feature flag
Benefits:
- Single Sign-On (SSO) capability
- Leverage enterprise identity management
- Reduce local user management overhead
- Consistent authentication across applications
Database changes:
- Add external_user_id for Azure AD user mapping
- Add display_name for UI display
- Keep existing schema for rollback capability
Implementation includes:
- Detailed migration plan with phased rollout
- Comprehensive task list for implementation
- Test script for API validation
- Risk assessment and mitigation strategies
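Rough sketch of the client side of the proposed login call. Only the endpoint URL and the `name` response field come from this proposal; the JSON request fields (`username`, `password`), the position of `name` at the top level of the response, and both helper names are assumptions to be checked against the actual API:

```python
import json
import urllib.request

LOGIN_URL = "https://pj-auth-api.vercel.app/api/auth/login"


def build_login_request(username, password):
    """Build (but do not send) the POST request for the external login API."""
    # Assumed payload shape; the real API may expect different field names.
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def display_name(payload, fallback):
    """Pick the UI display name from a decoded response, else use fallback."""
    name = payload.get("name")
    if isinstance(name, str) and name.strip():
        return name
    return fallback
```

The request would be sent with `urllib.request.urlopen(build_login_request(...))`; the fallback keeps the UI usable if the response carries no name.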
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The PaddleOCR-VL chart recognition model requires the `fused_rms_norm_ext` API,
which is not available in the PaddlePaddle 3.0.0 stable release.
Changes:
- Set use_chart_recognition=False in PP-StructureV3 initialization
- Remove unsupported show_log parameter from PaddleOCR 3.x API calls
- Document known limitation in openspec proposal
- Add limitation documentation to README
- Update tasks.md with documentation task for known issues
Impact:
- Layout analysis still detects/extracts charts as images ✓
- Tables, formulas, and text recognition work normally ✓
- Deep chart understanding (type detection, data extraction) disabled ✗
- Chart to structured data conversion disabled ✗
Workaround: Charts saved as image files for manual review
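Sketch of how the engine options could be sanitized before PP-StructureV3 is initialized. The option names `use_chart_recognition` and `show_log` come from this commit; the helper `structure_kwargs` and the set of dropped keys are illustrative assumptions:

```python
# Options the 3.x API rejects, per this commit.
UNSUPPORTED_3X_KWARGS = {"show_log"}


def structure_kwargs(user_kwargs):
    """Return constructor options with unsupported keys dropped and chart recognition off."""
    kwargs = {k: v for k, v in user_kwargs.items() if k not in UNSUPPORTED_3X_KWARGS}
    # Forced off until a PaddlePaddle build provides fused_rms_norm_ext.
    kwargs["use_chart_recognition"] = False
    return kwargs
```

The result is meant to be unpacked into the PP-StructureV3 constructor; layout analysis still runs, so charts keep being extracted as images.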
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PaddlePaddle 3.0.0b2 crashes with an "Illegal instruction" error on the current CPU.
Downgrade to stable 2.6.2, which works but uses a different API.
Changes:
- Auto-detect PaddlePaddle version at runtime
- Use 'device' parameter for 3.x (device="gpu:0" or "cpu")
- Use 'use_gpu' + 'gpu_mem' parameters for 2.x
- Apply to both get_ocr_engine() and get_structure_engine()
- Log PaddlePaddle version in initialization messages
Current setup:
- paddlepaddle-gpu==2.6.2 (stable, CUDA compiled)
- paddleocr==3.3.1
- paddlex==3.3.9
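Sketch of the version-based parameter selection described above. The parameter names (`device`, `use_gpu`, `gpu_mem`) come from this commit; the helper names and the `gpu_mem` default of 500 are assumptions:

```python
def paddle_major_version(version):
    """Parse the major version from strings such as '2.6.2' or '3.0.0b2'."""
    return int(version.split(".", 1)[0])


def device_kwargs(paddle_version, use_gpu, gpu_device_id=0, gpu_mem=500):
    """Return engine keyword arguments matching the detected PaddlePaddle version."""
    if paddle_major_version(paddle_version) >= 3:
        # 3.x style: a single device string.
        return {"device": f"gpu:{gpu_device_id}" if use_gpu else "cpu"}
    # 2.x style: boolean flag plus GPU memory setting.
    return {"use_gpu": use_gpu, "gpu_mem": gpu_mem}
```

At runtime the version string would typically come from `paddle.__version__`, and the same dictionary can be passed to both get_ocr_engine() and get_structure_engine().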
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes to setup_dev_env.sh:
- Add support for CUDA 13.x (install CUDA 12.x compatible version)
- Use official PaddlePaddle source for GPU versions
- Install paddlepaddle-gpu==3.0.0b2 from official index
  - CUDA 13.x: use cu123 package (backward compatible)
  - CUDA 12.x: use cu123 package
  - CUDA 11.7+: use cu118 package
  - CUDA 11.2-11.6: use cu117 package
Changes to requirements.txt:
- Comment out paddlepaddle dependency
- Let setup script handle GPU/CPU version installation
This fixes the issue where pip installed CPU-only paddlepaddle 3.2.1
instead of GPU version, causing GPU acceleration to be unavailable.
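The CUDA-to-package mapping above, written out as a small Python sketch for clarity (the setup script itself is shell; the function name `paddle_cuda_tag` and the `None` return for unsupported versions are assumptions):

```python
def paddle_cuda_tag(cuda_version):
    """Map a CUDA version string such as '12.4' to the package tag listed above."""
    parts = cuda_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major in (12, 13):
        return "cu123"  # CUDA 13.x reuses the CUDA 12.x build
    if major == 11 and minor >= 7:
        return "cu118"
    if major == 11 and 2 <= minor <= 6:
        return "cu117"
    return None  # no GPU package listed for this version
```

A `None` result would be the point at which a setup script falls back to the CPU package or stops with an error.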
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PaddleOCR 3.x changed the API:
- Removed: use_gpu=True/False and gpu_mem=<value>
- Added: device="gpu:0" or device="cpu"
Changes:
- Updated get_ocr_engine() to use device parameter
- Updated get_structure_engine() to use device parameter
- GPU mode: device="gpu:{gpu_device_id}"
- CPU mode: device="cpu"
This fixes the "ValueError: Unknown argument: gpu_mem" runtime error.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Complete redesign of frontend interface with focus on usability, visual hierarchy, and professional appearance:
**Design System:**
- Implemented clean blue color theme (#3B82F6) with professional palette
- Created consistent spacing, shadows, and typography system
- Added reusable utility classes (page-header, section, status-badge-*)
- Removed excessive gradients and decorative effects
**Layout Architecture:**
- Redesigned main layout with 256px sidebar navigation
- Sidebar includes logo, navigation with descriptions, and user profile
- Main content area with search bar and scrollable content
- Replaced horizontal navigation with vertical sidebar pattern
**Page Redesigns:**
1. LoginPage: Split-screen design with branding (left) and clean form (right)
   - Feature highlights with icons and statistics
   - Mobile responsive design
   - Professional gradient background with subtle pattern
2. UploadPage: Added 3-step visual progress indicator
   - Better file organization with summary and status badges
   - Clear action bar with confirmation message
   - Improved file list presentation
3. ProcessingPage: Enhanced progress visualization
   - Large progress bar with percentage display
   - 4-column stats grid (Completed, Processing, Failed, Total)
   - Clean file status list with processing times
4. ResultsPage: Improved 5-column layout (2 for list, 3 for preview)
   - Added stats cards for accuracy, processing time, and text blocks
   - Better preview panel with detailed metrics
   - Export and translate action buttons
5. ExportPage: Better organization with 2-column layout
   - Visual format selection with icons (TXT, JSON, Excel, Markdown, PDF)
   - Improved form controls and option organization
   - Sticky preview sidebar showing current configuration
**Component Updates:**
- Updated Button component with proper variants
- Enhanced Card component with hover effects
- Maintained FileUpload component functionality
- Added lucide-react for modern iconography
**Technical Improvements:**
- Fixed Tailwind CSS v4 compatibility issues with @apply
- Removed decorative animations in favor of functional ones
- Improved accessibility with proper labels and ARIA attributes
- Better color contrast and readability
This redesign transforms the interface from a basic layout to a professional, enterprise-ready application with clear visual hierarchy and excellent usability.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>