# OCR Processing - Delta Spec

## ADDED Requirements

### Requirement: REQ-OCR-PRESETS - Document Type Presets

The system MUST provide predefined OCR processing configurations for common document types.

Available presets:
- `text_heavy`: Optimized for text-heavy documents (reports, articles)
- `datasheet`: Optimized for technical datasheets
- `table_heavy`: Optimized for documents with many tables
- `form`: Optimized for forms and applications
- `mixed`: Balanced configuration for mixed content
- `custom`: User-defined configuration

#### Scenario: User selects datasheet preset
- Given a user uploading a technical datasheet
- When they select the "datasheet" preset
- Then the system applies conservative table parsing mode
- And disables wireless table detection
- And sets layout threshold to 0.65

#### Scenario: User selects text_heavy preset
- Given a user uploading a text-heavy report
- When they select the "text_heavy" preset
- Then the system disables table recognition
- And focuses on text extraction

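The two preset scenarios above could be backed by a small lookup table. The following is a minimal sketch, not the project's implementation: `PresetSettings`, `PRESETS`, and `get_preset` are hypothetical names, and it assumes that the scenario's "layout threshold" corresponds to `table_layout_threshold` and that "disables table recognition" corresponds to the `disabled` table parsing mode. Only values stated in the scenarios are filled in; the other presets are left out because this spec does not define their settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PresetSettings:
    """Settings a preset pins down; None means the spec does not state a value."""
    table_parsing_mode: Optional[str] = None
    enable_wireless_table: Optional[bool] = None
    table_layout_threshold: Optional[float] = None


# Hypothetical registry: only the values named in the scenarios are included.
PRESETS = {
    "datasheet": PresetSettings(
        table_parsing_mode="conservative",
        enable_wireless_table=False,
        table_layout_threshold=0.65,
    ),
    # Assumption: "disables table recognition" maps to the "disabled" mode.
    "text_heavy": PresetSettings(table_parsing_mode="disabled"),
}


def get_preset(name: str) -> PresetSettings:
    """Return the settings for a preset, rejecting names without a definition."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown or unspecified preset: {name!r}") from None
```

Keeping unspecified fields as `None` makes it explicit which settings a preset leaves to the system defaults.
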
### Requirement: REQ-OCR-PARAMS - Advanced Parameter Configuration

The system MUST allow advanced users to configure individual PP-Structure parameters.

Configurable parameters include:
- Table parsing mode (full/conservative/classification_only/disabled)
- Table layout threshold (0.0-1.0)
- Wired/wireless table detection toggles
- Layout detection model selection
- Preprocessing options (orientation, unwarping, textline)
- Recognition module toggles (chart, formula, seal)

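The value constraints in the list above (four table parsing modes, a threshold between 0.0 and 1.0) lend themselves to validation at construction time. This is a sketch under assumptions: `TableParams` and `enable_wired_table` are hypothetical names, while `table_parsing_mode`, `table_layout_threshold`, and `enable_wireless_table` are taken from elsewhere in this spec. The remaining parameter groups are omitted because the spec does not list their allowed values.

```python
from dataclasses import dataclass

# The four modes listed under "Table parsing mode".
TABLE_PARSING_MODES = ("full", "conservative", "classification_only", "disabled")


@dataclass(frozen=True)
class TableParams:
    """Table-related parameters, validated when the object is created."""
    table_parsing_mode: str
    table_layout_threshold: float
    enable_wired_table: bool      # hypothetical field name
    enable_wireless_table: bool

    def __post_init__(self) -> None:
        if self.table_parsing_mode not in TABLE_PARSING_MODES:
            raise ValueError(
                f"table_parsing_mode must be one of {TABLE_PARSING_MODES}, "
                f"got {self.table_parsing_mode!r}"
            )
        if not 0.0 <= self.table_layout_threshold <= 1.0:
            raise ValueError(
                "table_layout_threshold must be within 0.0-1.0, "
                f"got {self.table_layout_threshold}"
            )
```

Rejecting bad values early keeps an invalid configuration from reaching the OCR pipeline.
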
#### Scenario: User adjusts table layout threshold
- Given a user experiencing table over-detection
- When they increase table_layout_threshold to 0.7
- Then fewer regions are classified as tables
- And text regions are preserved correctly

#### Scenario: User disables wireless table detection
- Given a user processing a datasheet with cell explosion
- When they disable enable_wireless_table
- Then only bordered tables are detected
- And structured text is not split into cells

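The threshold scenario can be illustrated with a simple confidence filter. This is only one plausible reading of how `table_layout_threshold` takes effect: the `Region` type, the `table_regions` helper, and the scores below are made up for illustration and do not describe PP-Structure internals.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """A hypothetical layout detection result."""
    label: str
    score: float


def table_regions(regions: List[Region], table_layout_threshold: float) -> List[Region]:
    """Keep only table candidates whose score meets the threshold."""
    return [
        r for r in regions
        if r.label == "table" and r.score >= table_layout_threshold
    ]


# Illustrative scores, not measured data.
candidates = [Region("table", 0.90), Region("table", 0.68), Region("table", 0.50)]

kept_at_065 = table_regions(candidates, 0.65)  # two candidates pass
kept_at_070 = table_regions(candidates, 0.70)  # only the strongest passes
```

Under this reading, raising the threshold from 0.65 to 0.7 drops the borderline candidate, which matches the scenario's "fewer regions are classified as tables".
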
### Requirement: REQ-OCR-API - OCR Configuration API

The task creation API MUST accept OCR configuration parameters.

API accepts:
- `ocr_preset`: Preset name to apply
- `ocr_config`: Custom configuration object (overrides preset)

#### Scenario: Create task with preset
- Given an API request with ocr_preset="datasheet"
- When the task is created
- Then the datasheet preset configuration is applied
- And the task processes with conservative table parsing

#### Scenario: Create task with custom config
- Given an API request with ocr_config containing custom values
- When the task is created
- Then the custom configuration overrides defaults
- And the task uses the specified parameters

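The precedence implied by this requirement (system defaults, then `ocr_preset`, then `ocr_config`) can be sketched as a layered merge. The function name `resolve_ocr_config` and the shallow, key-by-key merge are assumptions; the spec states only that `ocr_config` overrides the preset and that custom values override defaults.

```python
from typing import Any, Dict, Mapping, Optional


def resolve_ocr_config(
    defaults: Mapping[str, Any],
    presets: Mapping[str, Mapping[str, Any]],
    ocr_preset: Optional[str] = None,
    ocr_config: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge defaults, then the named preset, then custom values (last wins)."""
    resolved = dict(defaults)
    if ocr_preset is not None:
        if ocr_preset not in presets:
            raise ValueError(f"unknown preset: {ocr_preset!r}")
        resolved.update(presets[ocr_preset])
    if ocr_config:
        resolved.update(ocr_config)
    return resolved


# Usage: a preset changes one key, a custom value changes another.
resolved = resolve_ocr_config(
    defaults={"table_parsing_mode": "conservative", "table_layout_threshold": 0.65},
    presets={"text_heavy": {"table_parsing_mode": "disabled"}},
    ocr_preset="text_heavy",
    ocr_config={"table_layout_threshold": 0.7},
)
```

Passing the defaults and presets in as arguments keeps the merge rule independent of where those tables are stored.
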
## MODIFIED Requirements

### Requirement: REQ-OCR-DEFAULTS - Default Processing Configuration

The system default configuration MUST be conservative to prevent over-detection.

Default values:
- `table_parsing_mode`: "conservative"
- `table_layout_threshold`: 0.65
- `enable_wireless_table`: false
- `use_doc_unwarping`: false

Patch behaviors MUST be disabled by default:
- `cell_validation_enabled`: false
- `gap_filling_enabled`: false
- `table_content_rebuilder_enabled`: false

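The defaults above map directly onto a configuration object. In this sketch the field names and values come from the two lists; the class name `DefaultOcrConfig` and the helper `enabled_patches` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DefaultOcrConfig:
    """Conservative defaults as listed in REQ-OCR-DEFAULTS."""
    table_parsing_mode: str = "conservative"
    table_layout_threshold: float = 0.65
    enable_wireless_table: bool = False
    use_doc_unwarping: bool = False
    # Patch behaviors, all off by default.
    cell_validation_enabled: bool = False
    gap_filling_enabled: bool = False
    table_content_rebuilder_enabled: bool = False


PATCH_FLAGS = (
    "cell_validation_enabled",
    "gap_filling_enabled",
    "table_content_rebuilder_enabled",
)


def enabled_patches(config: DefaultOcrConfig) -> List[str]:
    """Return the names of the patch behaviors that are switched on."""
    return [name for name in PATCH_FLAGS if getattr(config, name)]
```

A check such as `enabled_patches(DefaultOcrConfig()) == []` is one way to assert that a task created without OCR configuration applies no post-processing patches.
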
#### Scenario: New task uses conservative defaults
- Given a task created without specifying OCR configuration
- When the task is processed
- Then conservative table parsing is used
- And wireless table detection is disabled
- And no post-processing patches are applied