OCR/openspec/changes/archive/2025-12-10-add-ocr-processing-presets/tasks.md
egg 940a406dce chore: backup before code cleanup
Backup commit before executing remove-unused-code proposal.
This includes all pending changes and new features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 11:55:39 +08:00

Tasks: Add OCR Processing Presets

Phase 1: Backend API and Presets

  • Define preset configurations as Pydantic models

    • Create OCRPreset enum with preset names
    • Create OCRConfig model with all configurable parameters
    • Define preset mappings (preset name -> config values)
  • Update task creation API

    • Add ocr_preset optional parameter
    • Add ocr_config optional parameter for custom settings
    • Validate preset/config combinations
    • Apply configuration to OCR service
  • Implement preset configuration loader

    • Load preset from enum name
    • Merge custom config with preset defaults
    • Validate parameter ranges
  • Remove/disable patch behaviors (already done)

    • Disable cell_validation_enabled (default=False)
    • Disable gap_filling_enabled (default=False)
    • Disable table_content_rebuilder_enabled (default=False)
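
The Phase 1 items above (preset enum, config model, preset mapping, merge, range validation) could be sketched roughly as follows. This is a dependency-free illustration that uses standard-library dataclasses as a stand-in for the Pydantic models the tasks call for. Only `OCRPreset`, `OCRConfig`, the `datasheet` and `custom` preset names, and the three `*_enabled` flags come from this task list; the other preset names, `table_parsing_mode`, `layout_threshold`, and `resolve_config` are assumptions made for the example.

```python
from dataclasses import dataclass, fields, replace
from enum import Enum
from typing import Any, Mapping, Optional


class OCRPreset(str, Enum):
    DATASHEET = "datasheet"
    TABLE_HEAVY = "table_heavy"  # hypothetical preset name
    TEXT_ONLY = "text_only"      # hypothetical preset name
    CUSTOM = "custom"


@dataclass(frozen=True)
class OCRConfig:
    # Patch behaviors listed in the tasks; all default to False.
    cell_validation_enabled: bool = False
    gap_filling_enabled: bool = False
    table_content_rebuilder_enabled: bool = False
    # Hypothetical parameters, present only to show merging and range checks.
    table_parsing_mode: str = "conservative"
    layout_threshold: float = 0.5


# Preset name -> config values (the values are assumptions).
PRESET_CONFIGS = {
    OCRPreset.DATASHEET: OCRConfig(table_parsing_mode="conservative"),
    OCRPreset.TABLE_HEAVY: OCRConfig(table_parsing_mode="full"),
    OCRPreset.TEXT_ONLY: OCRConfig(table_parsing_mode="disabled"),
    OCRPreset.CUSTOM: OCRConfig(),
}

ALLOWED_TABLE_MODES = {"conservative", "full", "disabled"}


def validate(config: OCRConfig) -> OCRConfig:
    """Reject values outside the (assumed) allowed ranges."""
    if config.table_parsing_mode not in ALLOWED_TABLE_MODES:
        raise ValueError(f"invalid table_parsing_mode: {config.table_parsing_mode!r}")
    if not 0.0 <= config.layout_threshold <= 1.0:
        raise ValueError("layout_threshold must be between 0.0 and 1.0")
    return config


def resolve_config(
    preset_name: str = "datasheet",
    overrides: Optional[Mapping[str, Any]] = None,
) -> OCRConfig:
    """Load a preset by name, merge custom values over it, then validate."""
    preset = OCRPreset(preset_name)  # raises ValueError for unknown names
    base = PRESET_CONFIGS[preset]
    overrides = dict(overrides or {})
    unknown = set(overrides) - {f.name for f in fields(OCRConfig)}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return validate(replace(base, **overrides))
```

With these assumptions, `resolve_config("table_heavy", {"layout_threshold": 0.8})` returns the table-heavy defaults with one overridden field, and an out-of-range or unknown parameter raises `ValueError` before anything reaches the OCR service.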

Phase 2: Frontend Preset Selector

  • Create preset selection component

    • Card selector with document type icons
    • Preset description and use case tooltips
    • Visual preview of expected behavior (info box)
  • Integrate with processing flow

    • Add preset selection to ProcessingPage
    • Pass selected preset to API
    • Default to 'datasheet' preset
  • Add preset management

    • List available presets in grid layout
    • Show recommended preset (datasheet)
    • Allow preset change before processing

Phase 3: Advanced Parameter Panel

  • Create parameter configuration component

    • Collapsible "Advanced Settings" section
    • Group parameters by category (Table, Layout, Preprocessing)
    • Input controls for each parameter type
  • Implement parameter validation

    • Client-side input validation
    • Disabled state when preset != custom
    • Reset hint when not in custom mode
  • Add parameter tooltips

    • Chinese labels for all parameters
    • Help text for custom mode
    • Info box with usage notes
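
The Phase 3 rule that advanced parameters are editable only in custom mode can be expressed as a small piece of state logic. The sketch below is written in Python for consistency with the backend; the actual component would live in the frontend. The three categories come from the task list, while the parameter names, the hint wording, and `panel_state` are hypothetical.

```python
from typing import Dict, List

# Categories are from the task list; the parameter names are hypothetical.
PARAMETER_GROUPS: Dict[str, List[str]] = {
    "Table": ["table_parsing_mode"],
    "Layout": ["layout_threshold"],
    "Preprocessing": ["contrast_enhancement"],
}


def panel_state(preset: str) -> Dict[str, object]:
    """Describe how the Advanced Settings panel should render for a preset."""
    editable = preset == "custom"
    # Show a hint instead of live inputs when a named preset is selected.
    hint = None if editable else "Select the custom preset to edit these parameters."
    return {"editable": editable, "hint": hint, "groups": PARAMETER_GROUPS}
```

Keeping this decision in one function means the disabled state and the reset hint cannot drift apart.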

Phase 4: Documentation and Testing

  • Create user documentation

    • Preset selection guide
    • Parameter reference
    • Troubleshooting common issues
  • Add API documentation

    • OpenAPI spec auto-generated by FastAPI
    • Pydantic models provide schema documentation
    • Field descriptions in OCRConfig
  • Test with various document types

    • Verify datasheet processing with conservative mode (see test-notes.md; execution pending on target runtime)
    • Verify table-heavy documents with full mode (see test-notes.md; execution pending on target runtime)
    • Verify text documents with disabled mode (see test-notes.md; execution pending on target runtime)
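
The three verification items above pair a document type with an expected mode, which lends itself to a table-driven check. The following is a minimal sketch under stated assumptions: the document-type keys, `find_mismatches`, and `stub_resolver` are hypothetical, and the resolver stands in for whatever the real pipeline reports once the tests run on the target runtime.

```python
from typing import Callable, Dict, List, Tuple

# Document type -> expected mode, taken from the three verification items.
EXPECTED_MODE: Dict[str, str] = {
    "datasheet": "conservative",
    "table_heavy": "full",
    "text": "disabled",
}


def find_mismatches(resolve_mode: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Return (doc_type, expected, actual) for every case that disagrees."""
    mismatches = []
    for doc_type, expected in EXPECTED_MODE.items():
        actual = resolve_mode(doc_type)
        if actual != expected:
            mismatches.append((doc_type, expected, actual))
    return mismatches


def stub_resolver(doc_type: str) -> str:
    """Stand-in for the real pipeline; simply echoes the expected mode."""
    return EXPECTED_MODE[doc_type]
```

An empty list from `find_mismatches` means every document type resolved to its expected mode.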