egg/OCR - OCR

Author	SHA1	Message	Date
egg	c2288ba935	feat: add frontend support for dual-track processing - Add ProcessingTrack, ProcessingMetadata types to apiV2.ts - Add analyzeDocument, getProcessingMetadata, downloadUnified API methods - Update startTask to support ProcessingOptions - Update TaskDetailPage with: - Processing track badge and description display - Enhanced stats grid (pages, text regions, tables, images, confidence) - UnifiedDocument download option - Translation UI preparation (disabled, awaiting backend) - Mark Section 7 Frontend Updates as completed in tasks.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 12:34:01 +08:00
egg	0fcb2492c9	test: add unit tests for DocumentTypeDetector - Create test directory structure for backend - Add pytest fixtures for test files (PDF, images, Office docs) - Add 20 unit tests covering: - PDF type detection (editable, scanned, mixed) - Image file detection (PNG, JPG) - Office document detection (DOCX) - Text file detection - Edge cases (file not found, unknown types) - Batch processing and statistics - Mark tasks 1.1.4 and 1.3.5 as completed in tasks.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 12:16:49 +08:00
egg	1d0b63854a	feat: add dual-track API endpoints for document processing - Add ProcessingTrackEnum, ProcessingOptions, ProcessingMetadata schemas - Add DocumentAnalysisResponse for document type detection - Update /start endpoint with dual-track query parameters - Add /analyze endpoint for document type detection with confidence scores - Add /metadata endpoint for processing track information - Add /download/unified endpoint for UnifiedDocument format export - Update tasks.md to mark Section 6 API updates as completed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 09:38:12 +08:00
egg	8b9a364452	feat: add GPU optimization and fix TableData consistency GPU Optimization (Section 3.1): - Add comprehensive memory management for RTX 4060 8GB - Enable all recognition features (chart, formula, table, seal, text) - Implement model cache with auto-unload for idle models - Add memory monitoring and warning system Bug Fix (Section 3.3): - Fix TableData field inconsistency: 'columns' -> 'cols' - Remove invalid 'html' and 'extracted_text' parameters - Add proper TableCell conversion in _convert_table_data Documentation: - Add Future Improvements section for batch processing enhancement 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 09:17:27 +08:00
egg	ecdce961ca	feat: update PDF generator to support UnifiedDocument directly - Add generate_from_unified_document() method for direct UnifiedDocument processing - Create convert_unified_document_to_ocr_data() for format conversion - Extract _generate_pdf_from_data() as reusable core logic - Support both OCR and DIRECT processing tracks in PDF generation - Handle coordinate transformations (BoundingBox to polygon format) - Update OCR service to use appropriate PDF generation method Completes Section 4 (Unified Processing Pipeline) of dual-track proposal. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 08:48:25 +08:00
egg	ab89a40e8d	feat: add unified JSON export with standardized schema - Create JSON Schema definition for UnifiedDocument format - Implement UnifiedDocumentExporter service with multiple export formats - Include comprehensive processing metadata and statistics - Update OCR service to use new exporter for dual-track outputs - Support JSON, Markdown, Text, and legacy format exports 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 08:36:24 +08:00
egg	5bcf3dfd42	fix: complete layout analysis features for DirectExtractionEngine Implements missing layout analysis capabilities: - Add footer detection based on page position (bottom 10%) - Build hierarchical section structure from font sizes - Create nested list structure from indentation levels All elements now have proper metadata for: - section_level, parent_section, child_sections (headers) - list_level, parent_item, children (list items) - is_page_header, is_page_footer flags Updates tasks.md to reflect accurate completion status. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 08:15:11 +08:00
egg	a3a6fbe58b	feat: add OCR to UnifiedDocument converter for PP-StructureV3 integration Implements the converter that transforms PP-StructureV3 OCR results into the UnifiedDocument format, enabling consistent output for both OCR and direct extraction tracks. - Create OCRToUnifiedConverter class with full element type mapping - Handle both enhanced (parsing_res_list) and standard markdown results - Support 4-point and simple bbox formats for coordinates - Establish element relationships (captions, lists, headers) - Integrate converter into OCR service dual-track processing - Update tasks.md marking section 3.3 complete 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 08:05:20 +08:00
egg	062cb1f423	chore: update tasks - OCR service dual-track integration complete Progress update: - Unified Processing Pipeline: 4/4 tasks completed (section 4.1) - Total progress: 34/147 tasks (23.1%) Completed: ✅ Integrated DocumentTypeDetector into OCR service ✅ Automatic routing to OCR or Direct extraction tracks ✅ UnifiedDocument output from both tracks ✅ Full backward compatibility maintained	2025-11-19 07:29:47 +08:00
egg	0608017a02	chore: update tasks.md with completed infrastructure work Progress update: - Core Infrastructure: 13/14 tasks completed - Direct Extraction Track: 18/18 tasks completed - Total progress: 30/147 tasks (20.4%) Completed major components: ✅ UnifiedDocument model with all structures ✅ DocumentTypeDetector service ✅ DirectExtractionEngine with PyMuPDF ✅ Dependencies added to requirements.txt Next priorities: - Update OCR service for dual-track integration - Enhance PP-StructureV3 usage - Update PDF generator for UnifiedDocument	2025-11-18 20:37:30 +08:00
egg	cd3cbea49d	chore: project cleanup and prepare for dual-track processing refactor - Removed all test files and directories - Deleted outdated documentation (will be rewritten) - Cleaned up temporary files, logs, and uploads - Archived 5 completed OpenSpec proposals - Created new dual-track-document-processing proposal with complete OpenSpec structure - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF) - UnifiedDocument model for consistent output - Support for structure-preserving translation - Updated .gitignore to prevent future test/temp files This is a major cleanup preparing for the complete refactoring of the document processing pipeline. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-18 20:02:31 +08:00

11 Commits
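
Several of the commits above mention coordinate handling: "Support 4-point and simple bbox formats", "BoundingBox to polygon format", and "footer detection based on page position (bottom 10%)". The following is a minimal sketch of what such helpers might look like, not the repository's actual code. The function names (`normalize_bbox`, `is_page_footer`), the input shapes, and the top-left-origin coordinate system with y growing downward are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize_bbox(raw: Sequence) -> List[Point]:
    """Return a 4-point polygon (clockwise from top-left) for either input shape.

    Assumed shapes (hypothetical, not taken from the repository):
      simple:  [x0, y0, x1, y1]
      4-point: [[x, y], [x, y], [x, y], [x, y]]
    """
    if len(raw) != 4:
        raise ValueError("expected 4 numbers or 4 points")
    # Simple format: four plain numbers describing opposite corners.
    if all(isinstance(v, (int, float)) for v in raw):
        x0, y0, x1, y1 = (float(v) for v in raw)
        return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    # 4-point format: already a polygon, only coerce to float tuples.
    if all(len(p) == 2 for p in raw):
        return [(float(x), float(y)) for x, y in raw]
    raise ValueError("unrecognized bbox format")


def is_page_footer(polygon: List[Point], page_height: float,
                   band: float = 0.10) -> bool:
    """True if the element lies entirely within the bottom `band` of the page.

    Assumes a top-left origin with y increasing downward.
    """
    top_edge = min(y for _, y in polygon)
    return top_edge >= page_height * (1.0 - band)
```

For example, on an 800-unit-tall page the footer band under these assumptions starts at y = 720, so `is_page_footer(normalize_bbox([50, 760, 300, 780]), 800)` evaluates to true, while the same box placed at y = 20 to 40 does not qualify.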