chore: backup before code cleanup

Backup commit before executing remove-unused-code proposal.
This includes all pending changes and new features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
egg
2025-12-11 11:55:39 +08:00
parent eff9b0bcd5
commit 940a406dce
58 changed files with 8226 additions and 175 deletions

@@ -0,0 +1,96 @@
# OCR Processing - Delta Spec
## ADDED Requirements
### Requirement: REQ-OCR-PRESETS - Document Type Presets
The system MUST provide predefined OCR processing configurations for common document types.
Available presets:
- `text_heavy`: Optimized for text-heavy documents (reports, articles)
- `datasheet`: Optimized for technical datasheets
- `table_heavy`: Optimized for documents with many tables
- `form`: Optimized for forms and applications
- `mixed`: Balanced configuration for mixed content
- `custom`: User-defined configuration
#### Scenario: User selects datasheet preset
- Given a user uploading a technical datasheet
- When they select the "datasheet" preset
- Then the system applies conservative table parsing mode
- And disables wireless table detection
- And sets `table_layout_threshold` to 0.65
#### Scenario: User selects text_heavy preset
- Given a user uploading a text-heavy report
- When they select the "text_heavy" preset
- Then the system disables table recognition
- And focuses on text extraction
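
A minimal sketch of how the presets could be stored as a lookup table. Only the values stated in the two scenarios above are filled in. The names `PRESETS` and `get_preset` are hypothetical, and mapping "disables table recognition" to `table_parsing_mode = "disabled"` is an assumption, not something this spec states.

```python
# Hypothetical preset table. Values for presets other than "datasheet" and
# "text_heavy" are not given in this spec, so they are left empty rather
# than guessed.
PRESETS = {
    "text_heavy": {
        # Assumption: "disables table recognition" maps to this mode.
        "table_parsing_mode": "disabled",
    },
    "datasheet": {
        "table_parsing_mode": "conservative",
        "enable_wireless_table": False,
        "table_layout_threshold": 0.65,
    },
    "table_heavy": {},
    "form": {},
    "mixed": {},
    "custom": {},
}


def get_preset(name: str) -> dict:
    """Return a copy of the named preset, or raise ValueError if unknown."""
    try:
        return dict(PRESETS[name])
    except KeyError:
        raise ValueError(f"unknown OCR preset: {name!r}") from None
```

Returning a copy keeps callers from mutating the shared table by accident.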
### Requirement: REQ-OCR-PARAMS - Advanced Parameter Configuration
The system MUST allow advanced users to configure individual PP-Structure parameters.
Configurable parameters include:
- Table parsing mode (full/conservative/classification_only/disabled)
- Table layout threshold (0.0-1.0)
- Wired/wireless table detection toggles
- Layout detection model selection
- Preprocessing options (document orientation, document unwarping, textline orientation)
- Recognition module toggles (chart, formula, seal)
#### Scenario: User adjusts table layout threshold
- Given a user experiencing table over-detection
- When they increase table_layout_threshold to 0.7
- Then fewer regions are classified as tables
- And text regions are preserved correctly
#### Scenario: User disables wireless table detection
- Given a user processing a datasheet where structured text is over-segmented into many table cells ("cell explosion")
- When they disable enable_wireless_table
- Then only bordered tables are detected
- And structured text is not split into cells
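
The parameter constraints above (four parsing modes, a threshold between 0.0 and 1.0) could be enforced at construction time. This is a sketch under assumptions: the class name `TableParams` is hypothetical, and only the three field names that appear elsewhere in this spec are modelled.

```python
from dataclasses import dataclass

# The four modes listed under "Table parsing mode" in this requirement.
TABLE_PARSING_MODES = ("full", "conservative", "classification_only", "disabled")


@dataclass(frozen=True)
class TableParams:
    """Hypothetical container for the table-related parameters."""

    table_parsing_mode: str = "conservative"
    table_layout_threshold: float = 0.65
    enable_wireless_table: bool = False

    def __post_init__(self) -> None:
        # Reject modes that are not in the documented set.
        if self.table_parsing_mode not in TABLE_PARSING_MODES:
            raise ValueError(
                f"table_parsing_mode must be one of {TABLE_PARSING_MODES}, "
                f"got {self.table_parsing_mode!r}"
            )
        # Enforce the documented 0.0-1.0 range, inclusive on both ends.
        if not 0.0 <= self.table_layout_threshold <= 1.0:
            raise ValueError(
                "table_layout_threshold must be within 0.0-1.0, "
                f"got {self.table_layout_threshold!r}"
            )
```

For the over-detection scenario, a user would construct `TableParams(table_layout_threshold=0.7)`; an out-of-range value such as `1.5` is rejected before any OCR work starts.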
### Requirement: REQ-OCR-API - OCR Configuration API
The task creation API MUST accept OCR configuration parameters.
The API accepts:
- `ocr_preset`: Name of the preset to apply
- `ocr_config`: Custom configuration object; its values override those of the preset
#### Scenario: Create task with preset
- Given an API request with ocr_preset="datasheet"
- When the task is created
- Then the datasheet preset configuration is applied
- And the task processes with conservative table parsing
#### Scenario: Create task with custom config
- Given an API request with ocr_config containing custom values
- When the task is created
- Then the custom configuration overrides defaults
- And the task uses the specified parameters
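
The precedence described here (system defaults, then `ocr_preset`, then `ocr_config`) can be sketched as a layered merge. The function name `resolve_ocr_config` is hypothetical, and the key-by-key shallow merge is an assumption: the spec does not say whether `ocr_config` replaces the preset wholesale or individual keys only.

```python
from typing import Optional

# Default values as listed in REQ-OCR-DEFAULTS.
DEFAULTS = {
    "table_parsing_mode": "conservative",
    "table_layout_threshold": 0.65,
    "enable_wireless_table": False,
    "use_doc_unwarping": False,
}

# Only the preset described in the scenarios is included in this sketch.
PRESETS = {
    "datasheet": {
        "table_parsing_mode": "conservative",
        "enable_wireless_table": False,
        "table_layout_threshold": 0.65,
    },
}


def resolve_ocr_config(
    ocr_preset: Optional[str] = None,
    ocr_config: Optional[dict] = None,
) -> dict:
    """Merge defaults, then the preset, then the custom config (last wins)."""
    resolved = dict(DEFAULTS)
    if ocr_preset is not None:
        if ocr_preset not in PRESETS:
            raise ValueError(f"unknown OCR preset: {ocr_preset!r}")
        resolved.update(PRESETS[ocr_preset])
    # Assumption: custom values override key by key rather than wholesale.
    resolved.update(ocr_config or {})
    return resolved
```

With this ordering, `resolve_ocr_config("datasheet", {"table_layout_threshold": 0.7})` keeps the preset's conservative parsing mode while taking the threshold from the custom object.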
## MODIFIED Requirements
### Requirement: REQ-OCR-DEFAULTS - Default Processing Configuration
The system default configuration MUST be conservative to prevent over-detection.
Default values:
- `table_parsing_mode`: "conservative"
- `table_layout_threshold`: 0.65
- `enable_wireless_table`: false
- `use_doc_unwarping`: false
Post-processing patch behaviors MUST be disabled by default:
- `cell_validation_enabled`: false
- `gap_filling_enabled`: false
- `table_content_rebuilder_enabled`: false
#### Scenario: New task uses conservative defaults
- Given a task created without specifying OCR configuration
- When the task is processed
- Then conservative table parsing is used
- And wireless table detection is disabled
- And no post-processing patches are applied
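
The last scenario can be checked mechanically. The sketch below collects the default values listed in this requirement into one dictionary and reports which patch flags are switched on. `DEFAULT_OCR_CONFIG`, `PATCH_FLAGS`, and `enabled_patches` are hypothetical names chosen for illustration.

```python
# Default values and patch flags exactly as listed in REQ-OCR-DEFAULTS.
DEFAULT_OCR_CONFIG = {
    "table_parsing_mode": "conservative",
    "table_layout_threshold": 0.65,
    "enable_wireless_table": False,
    "use_doc_unwarping": False,
    "cell_validation_enabled": False,
    "gap_filling_enabled": False,
    "table_content_rebuilder_enabled": False,
}

PATCH_FLAGS = (
    "cell_validation_enabled",
    "gap_filling_enabled",
    "table_content_rebuilder_enabled",
)


def enabled_patches(config: dict) -> list:
    """Return the names of the patch flags that are enabled in config."""
    # A missing flag is treated as disabled, matching the required default.
    return [flag for flag in PATCH_FLAGS if config.get(flag, False)]
```

A regression test for this requirement could assert that `enabled_patches(DEFAULT_OCR_CONFIG)` is empty.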