- Fix MySQL connection timeout by creating fresh DB session after OCR
- Fix /analyze endpoint attribute errors (detect vs analyze, metadata)
- Add processing_track field extraction to TaskDetailResponse
- Update E2E tests to use POST for /analyze endpoint
- Increase Office document timeout to 300s
- Add Section 2.4 tasks for Office document direct extraction
- Document Office → PDF → Direct track strategy in design.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Removed all test files and directories
- Deleted outdated documentation (will be rewritten)
- Cleaned up temporary files, logs, and uploads
- Archived 5 completed OpenSpec proposals
- Created new dual-track-document-processing proposal with complete OpenSpec structure
  - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF)
  - UnifiedDocument model for consistent output
  - Support for structure-preserving translation
- Updated .gitignore to prevent future test/temp files

Major cleanup in preparation for a complete refactoring of the document processing pipeline.

Co-Authored-By: Claude <noreply@anthropic.com>