Backup commit before executing remove-unused-code proposal. This includes all pending changes and new features. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
3.4 KiB
3.4 KiB
OCR Processing - Delta Spec
ADDED Requirements
Requirement: REQ-OCR-PRESETS - Document Type Presets
The system MUST provide predefined OCR processing configurations for common document types.
Available presets:
text_heavy: Optimized for text-heavy documents (reports, articles)datasheet: Optimized for technical datasheetstable_heavy: Optimized for documents with many tablesform: Optimized for forms and applicationsmixed: Balanced configuration for mixed contentcustom: User-defined configuration
Scenario: User selects datasheet preset
- Given a user uploading a technical datasheet
- When they select the "datasheet" preset
- Then the system applies conservative table parsing mode
- And disables wireless table detection
- And sets layout threshold to 0.65
Scenario: User selects text_heavy preset
- Given a user uploading a text-heavy report
- When they select the "text_heavy" preset
- Then the system disables table recognition
- And focuses on text extraction
Requirement: REQ-OCR-PARAMS - Advanced Parameter Configuration
The system MUST allow advanced users to configure individual PP-Structure parameters.
Configurable parameters include:
- Table parsing mode (full/conservative/classification_only/disabled)
- Table layout threshold (0.0-1.0)
- Wired/wireless table detection toggles
- Layout detection model selection
- Preprocessing options (orientation, unwarping, textline)
- Recognition module toggles (chart, formula, seal)
Scenario: User adjusts table layout threshold
- Given a user experiencing table over-detection
- When they increase table_layout_threshold to 0.7
- Then fewer regions are classified as tables
- And text regions are preserved correctly
Scenario: User disables wireless table detection
- Given a user processing a datasheet with cell explosion
- When they disable enable_wireless_table
- Then only bordered tables are detected
- And structured text is not split into cells
Requirement: REQ-OCR-API - OCR Configuration API
The task creation API MUST accept OCR configuration parameters.
API accepts:
ocr_preset: Preset name to applyocr_config: Custom configuration object (overrides preset)
Scenario: Create task with preset
- Given an API request with ocr_preset="datasheet"
- When the task is created
- Then the datasheet preset configuration is applied
- And the task processes with conservative table parsing
Scenario: Create task with custom config
- Given an API request with ocr_config containing custom values
- When the task is created
- Then the custom configuration overrides defaults
- And the task uses the specified parameters
MODIFIED Requirements
Requirement: REQ-OCR-DEFAULTS - Default Processing Configuration
The system default configuration MUST be conservative to prevent over-detection.
Default values:
table_parsing_mode: "conservative"table_layout_threshold: 0.65enable_wireless_table: falseuse_doc_unwarping: false
Patch behaviors MUST be disabled by default:
cell_validation_enabled: falsegap_filling_enabled: falsetable_content_rebuilder_enabled: false
Scenario: New task uses conservative defaults
- Given a task created without specifying OCR configuration
- When the task is processed
- Then conservative table parsing is used
- And wireless table detection is disabled
- And no post-processing patches are applied