8 Commits

Author SHA1 Message Date
egg
a07aad96b3 feat: add translated PDF export with layout preservation
Adds the ability to download translated documents as PDF files while
preserving the original document layout. Key changes:

- Add apply_translations() function to merge translation JSON with UnifiedDocument
- Add generate_translated_pdf() method to PDFGeneratorService
- Add POST /api/v2/translate/{task_id}/pdf endpoint
- Add downloadTranslatedPdf() method and PDF button in frontend
- Add comprehensive unit tests (52 tests: merge, PDF generation, API endpoints)
- Archive add-translated-pdf-export proposal

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 12:33:31 +08:00
egg
6235280c45 feat: upgrade PP-StructureV3 models to latest versions
- Layout: PP-DocLayout-S → PP-DocLayout_plus-L (83.2% mAP)
- Table: Single model → Dual SLANeXt (wired/wireless)
- Formula: PP-FormulaNet_plus-L for enhanced recognition
- Add preprocessing flags support (orientation, unwarping)
- Update frontend i18n descriptions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 14:22:06 +08:00
egg
59206a6ab8 feat: simplify layout model selection and archive proposals
Changes:
- Replace PP-Structure 7-slider parameter UI with simple 3-option layout model selector
- Add layout model mapping: chinese (PP-DocLayout-S), default (PubLayNet), cdla
- Add LayoutModelSelector component and zh-TW translations
- Fix "default" model behavior with sentinel value for PubLayNet
- Add gap filling service for OCR track coverage improvement
- Add PP-Structure debug utilities
- Archive completed/incomplete proposals:
  - add-ocr-track-gap-filling (complete)
  - fix-ocr-track-table-rendering (incomplete)
  - simplify-ppstructure-model-selection (22/25 tasks)
- Add new layout model tests, archive old PP-Structure param tests
- Update OpenSpec ocr-processing spec with layout model requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 13:27:00 +08:00
egg
1afdb822c3 feat: implement hybrid image extraction and memory management
Backend:
- Add hybrid image extraction for Direct track (inline image blocks)
- Add render_inline_image_regions() fallback when OCR doesn't find images
- Add check_document_for_missing_images() for detecting missing images
- Add memory management system (MemoryGuard, ModelManager, ServicePool)
- Update pdf_generator_service to handle HYBRID processing track
- Add ElementType.LOGO for logo extraction

Frontend:
- Fix PDF viewer re-rendering issues with memoization
- Add TaskNotFound component and useTaskValidation hook
- Disable StrictMode due to react-pdf incompatibility
- Fix task detail and results page loading states

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 10:56:22 +08:00
egg
2312b4cd66 feat: add frontend-adjustable PP-StructureV3 parameters with comprehensive testing
Implement user-configurable PP-StructureV3 parameters to allow fine-tuning OCR behavior
from the frontend. This addresses issues with over-merging, missing small text, and
document-specific optimization needs.

Backend:
- Add PPStructureV3Params schema with 7 adjustable parameters
- Update OCR service to accept custom parameters with smart caching
- Modify /tasks/{task_id}/start endpoint to receive params in request body
- Parameter priority: custom > settings default
- Conditional caching (no cache for custom params to avoid pollution)

Frontend:
- Create PPStructureParams component with collapsible UI
- Add 3 presets: default, high-quality, fast
- Implement localStorage persistence for user parameters
- Add import/export JSON functionality
- Integrate into ProcessingPage with conditional rendering

Testing:
- Unit tests: 7/10 passing (core functionality verified)
- API integration tests for schema validation
- E2E tests with authentication support
- Performance benchmarks for memory and initialization
- Test runner script with venv activation

Environment:
- Remove duplicate backend/venv (use root venv only)
- Update test runner to use correct virtual environment

OpenSpec:
- Archive fix-pdf-coordinate-system proposal
- Archive frontend-adjustable-ppstructure-params proposal
- Create ocr-processing spec
- Update result-export spec

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 14:39:19 +08:00
egg
ef335cf3af feat: implement Office document direct extraction (Section 2.4)
- Update DocumentTypeDetector._analyze_office to convert Office to PDF first
- Analyze converted PDF for text extractability before routing
- Route text-based Office documents to direct track (10x faster)
- Update OCR service to convert Office files for DirectExtractionEngine
- Add unit tests for Office → PDF → Direct extraction flow
- Handle conversion failures with fallback to OCR track

This optimization reduces Office document processing from >300s to ~2-5s
for text-based documents by avoiding unnecessary OCR processing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-20 12:20:50 +08:00
egg
c50a5e9d2b test: add unit and integration tests for dual-track processing
Add comprehensive test suite for DirectExtractionEngine and dual-track
integration. All 65 tests pass covering text extraction, structure
preservation, routing logic, and backward compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 12:50:44 +08:00
egg
0fcb2492c9 test: add unit tests for DocumentTypeDetector
- Create test directory structure for backend
- Add pytest fixtures for test files (PDF, images, Office docs)
- Add 20 unit tests covering:
  - PDF type detection (editable, scanned, mixed)
  - Image file detection (PNG, JPG)
  - Office document detection (DOCX)
  - Text file detection
  - Edge cases (file not found, unknown types)
  - Batch processing and statistics
- Mark tasks 1.1.4 and 1.3.5 as completed in tasks.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 12:16:49 +08:00