OCR/tasks.md at 8ba61f51b3200752b796fe46d46fab8792ddd149

egg cd3cbea49d chore: project cleanup and prepare for dual-track processing refactor

- Removed all test files and directories
- Deleted outdated documentation (will be rewritten)
- Cleaned up temporary files, logs, and uploads
- Archived 5 completed OpenSpec proposals
- Created new dual-track-document-processing proposal with complete OpenSpec structure
  - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF)
  - UnifiedDocument model for consistent output
  - Support for structure-preserving translation
- Updated .gitignore to prevent future test/temp files

This is a major cleanup preparing for the complete refactoring of the document processing pipeline.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>


Implementation Tasks

Phase 1: Dependencies & Configuration

Phase 2: Document Conversion Implementation

Phase 3: OCR Service Integration

Phase 4: API Updates

Phase 5: Testing

Phase 6: Documentation
