OCR/tasks.md at fa9b542b06c6ffb6fb8902ef6560c965a507f00d

egg 1afdb822c3 feat: implement hybrid image extraction and memory management

Backend:
- Add hybrid image extraction for Direct track (inline image blocks)
- Add render_inline_image_regions() fallback when OCR doesn't find images
- Add check_document_for_missing_images() for detecting missing images
- Add memory management system (MemoryGuard, ModelManager, ServicePool)
- Update pdf_generator_service to handle HYBRID processing track
- Add ElementType.LOGO for logo extraction

Frontend:
- Fix PDF viewer re-rendering issues with memoization
- Add TaskNotFound component and useTaskValidation hook
- Disable StrictMode due to react-pdf incompatibility
- Fix task detail and results page loading states

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Tasks for Enhanced Memory Management

Section 1: Model Lifecycle Management (Priority: Critical)

1.1 Create ModelManager class

1.2 Integrate PP-StructureV3 with ModelManager

Section 2: Service Singleton Pattern (Priority: Critical)

2.1 Create OCRServicePool

2.2 Refactor task router

Section 3: Enhanced Memory Monitoring (Priority: High)

3.1 Create MemoryGuard class

3.2 Integrate memory checks

Section 4: Concurrency Control (Priority: High)

4.1 Implement prediction semaphores

4.2 Add selective processing

Section 5: Active Memory Management (Priority: Medium)

5.1 Create memory monitor thread

5.2 Add recovery mechanisms

Section 6: Cleanup Hooks (Priority: Medium)

6.1 Implement shutdown handlers

6.2 Add task cleanup

Section 7: Configuration & Settings (Priority: Low)

7.1 Add memory settings to config

7.2 Create monitoring dashboard

Section 8: Testing & Documentation (Priority: High)

8.1 Create comprehensive tests

8.2 Documentation

Implementation Summary

Files Created

Files Modified

New Classes Added (Section 4.2-8)

New Test Classes Added (Section 8.1)
