Implements missing layout analysis capabilities:
- Add footer detection based on page position (bottom 10%)
- Build hierarchical section structure from font sizes
- Create nested list structure from indentation levels
All elements now have proper metadata for:
- section_level, parent_section, child_sections (headers)
- list_level, parent_item, children (list items)
- is_page_header, is_page_footer flags
Updates tasks.md to reflect accurate completion status.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements the converter that transforms PP-StructureV3 OCR results into
the UnifiedDocument format, enabling consistent output for both OCR and
direct extraction tracks.
- Create OCRToUnifiedConverter class with full element type mapping
- Handle both enhanced (parsing_res_list) and standard markdown results
- Support 4-point and simple bbox formats for coordinates
- Establish element relationships (captions, lists, headers)
- Integrate converter into OCR service dual-track processing
- Update tasks.md marking section 3.3 complete
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Progress update:
- Unified Processing Pipeline: 4/4 tasks completed (section 4.1)
- Total progress: 34/147 tasks (23.1%)
Completed:
✅ Integrated DocumentTypeDetector into OCR service
✅ Automatic routing to OCR or Direct extraction tracks
✅ UnifiedDocument output from both tracks
✅ Full backward compatibility maintained
Progress update:
- Core Infrastructure: 13/14 tasks completed
- Direct Extraction Track: 18/18 tasks completed
- Total progress: 30/147 tasks (20.4%)
Completed major components:
✅ UnifiedDocument model with all structures
✅ DocumentTypeDetector service
✅ DirectExtractionEngine with PyMuPDF
✅ Dependencies added to requirements.txt
Next priorities:
- Update OCR service for dual-track integration
- Enhance PP-StructureV3 usage
- Update PDF generator for UnifiedDocument
- Removed all test files and directories
- Deleted outdated documentation (will be rewritten)
- Cleaned up temporary files, logs, and uploads
- Archived 5 completed OpenSpec proposals
- Created new dual-track-document-processing proposal with complete OpenSpec structure
- Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF)
- UnifiedDocument model for consistent output
- Support for structure-preserving translation
- Updated .gitignore to prevent future test/temp files
This is a major cleanup preparing for the complete refactoring of the document processing pipeline.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Backend fixes:
- Fix markdown generation using correct 'markdown_content' key in tasks.py
- Update admin service to return flat data structure matching frontend types
- Add task_count and failed_tasks fields to user statistics
- Fix top users endpoint to return complete user data
Frontend fixes:
- Migrate ResultsPage from V1 batch API to V2 task API with polling
- Create TaskDetailPage component with markdown preview and download buttons
- Refactor ExportPage to support multi-task selection using V2 download endpoints
- Fix login infinite refresh loop with concurrency control flags
- Create missing Checkbox UI component
New features:
- Add /tasks/:taskId route for task detail view
- Implement multi-task batch export functionality
- Add real-time task status polling (2s interval)
OpenSpec:
- Archive completed proposal 2025-11-17-fix-v2-api-ui-issues
- Create result-export and task-management specifications
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Updates all project documentation to reflect that chart recognition
is now fully enabled with PaddlePaddle 3.2.1+.
Changes:
- README.md: Remove Known Limitations section about chart recognition,
update tech stack and prerequisites to include PaddlePaddle 3.2.1+,
add WSL CUDA configuration notes
- openspec/project.md: Add comprehensive chart recognition feature
descriptions, update system requirements for GPU/CUDA support
- openspec/changes/add-gpu-acceleration-support/tasks.md: Mark task
5.4 as completed with resolution details
- openspec/changes/add-gpu-acceleration-support/proposal.md: Update
Known Issues section to show chart recognition is now resolved
- setup_dev_env.sh: Upgrade PaddlePaddle from 3.0.0 to 3.2.1+, add
WSL CUDA library path configuration, add chart recognition API
verification
All documentation now accurately reflects:
✅ Chart recognition fully enabled
✅ PaddlePaddle 3.2.1+ with fused_rms_norm_ext API
✅ WSL CUDA path auto-configuration
✅ Comprehensive PP-StructureV3 capabilities
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added `tool_ocr_` prefix to all database tables for clear separation
from other systems in the same database.
Changes:
- All tables now use `tool_ocr_` prefix
- Added tool_ocr_sessions table for token management
- Created complete SQL schema file with:
- Full table definitions with comments
- Indexes for performance
- Views for common queries
- Stored procedures for maintenance
- Audit log table (optional)
New files:
- database_schema.sql: Ready-to-use SQL script for deployment
Configuration:
- Added DATABASE_TABLE_PREFIX environment variable
- Updated all references to use prefixed table names
Benefits:
- Clear namespace separation in shared databases
- Easier identification of Tool_OCR tables
- Prevent conflicts with other applications
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major updates based on feedback:
1. Remove Azure AD ID storage - use email as primary identifier
2. Complete database redesign - no backward compatibility needed
3. Add comprehensive user task isolation and history features
Database changes:
- Simplified users table (email-based)
- New ocr_tasks table with user association
- New task_files table for file tracking
- Proper indexes for performance
New features:
- User task isolation (A cannot see B's tasks)
- Task history with status tracking (pending/processing/completed/failed)
- Historical query capabilities with filters
- Download support for completed tasks
- Task management UI with search and filters
Security enhancements:
- User context validation in all endpoints
- File access control based on ownership
- Row-level security in database queries
- API-level authorization checks
Implementation approach:
- Clean migration without rollback concerns
- Drop old tables and start fresh
- Simplified deployment process
- Comprehensive task management system
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Create OpenSpec proposal for migrating from local database authentication
to external API authentication using Microsoft Azure AD.
Changes proposed:
- Replace local username/password auth with external API
- Integrate with https://pj-auth-api.vercel.app/api/auth/login
- Use Azure AD tokens instead of local JWT
- Display user 'name' from API response in UI
- Maintain backward compatibility with feature flag
Benefits:
- Single Sign-On (SSO) capability
- Leverage enterprise identity management
- Reduce local user management overhead
- Consistent authentication across applications
Database changes:
- Add external_user_id for Azure AD user mapping
- Add display_name for UI display
- Keep existing schema for rollback capability
Implementation includes:
- Detailed migration plan with phased rollout
- Comprehensive task list for implementation
- Test script for API validation
- Risk assessment and mitigation strategies
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PaddleOCR-VL chart recognition model requires `fused_rms_norm_ext` API
which is not available in PaddlePaddle 3.0.0 stable release.
Changes:
- Set use_chart_recognition=False in PP-StructureV3 initialization
- Remove unsupported show_log parameter from PaddleOCR 3.x API calls
- Document known limitation in openspec proposal
- Add limitation documentation to README
- Update tasks.md with documentation task for known issues
Impact:
- Layout analysis still detects/extracts charts as images ✓
- Tables, formulas, and text recognition work normally ✓
- Deep chart understanding (type detection, data extraction) disabled ✗
- Chart to structured data conversion disabled ✗
Workaround: Charts saved as image files for manual review
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>