feat: implement layout preprocessing backend

Backend implementation for add-layout-preprocessing proposal:
- Add LayoutPreprocessingService with CLAHE, sharpen, binarize
- Add auto-detection: analyze_image_quality() for contrast/edge metrics
- Integrate preprocessing into OCR pipeline (analyze_layout)
- Add Preview API: POST /api/v2/tasks/{id}/preview/preprocessing
- Add config options: layout_preprocessing_mode, thresholds
- Add schemas: PreprocessingConfig, PreprocessingPreviewResponse
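
The auto-detection step listed above can be sketched as follows. This is a minimal illustration, not the committed implementation. The metric definitions (standard deviation for contrast, mean neighbour difference for edge strength), the `QualityMetrics` and `auto_config` names, the `PreprocessingConfig` fields, and the threshold values are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import List

Gray = List[List[int]]  # rectangular rows of 0-255 grayscale values


@dataclass
class QualityMetrics:
    contrast: float       # population std. deviation of pixel intensities
    edge_strength: float  # mean absolute difference between neighbours


@dataclass
class PreprocessingConfig:
    # Field names are assumed for this sketch; the real schema may differ.
    contrast: str = "none"   # "none" or "clahe"
    sharpen: bool = False
    binarize: bool = False


def analyze_image_quality(gray: Gray) -> QualityMetrics:
    """Compute simple contrast and edge metrics for a grayscale image."""
    pixels = [p for row in gray for p in row]
    contrast = pstdev(pixels)
    diffs = []
    for y, row in enumerate(gray):
        for x, p in enumerate(row):
            if x + 1 < len(row):       # horizontal neighbour
                diffs.append(abs(row[x + 1] - p))
            if y + 1 < len(gray):      # vertical neighbour
                diffs.append(abs(gray[y + 1][x] - p))
    edge_strength = sum(diffs) / len(diffs) if diffs else 0.0
    return QualityMetrics(contrast, edge_strength)


def auto_config(metrics: QualityMetrics,
                contrast_threshold: float = 40.0,   # assumed value
                edge_threshold: float = 10.0) -> PreprocessingConfig:
    """Enable enhancement only when a metric falls below its threshold."""
    return PreprocessingConfig(
        contrast="clahe" if metrics.contrast < contrast_threshold else "none",
        sharpen=metrics.edge_strength < edge_threshold,
    )
```

In this sketch a flat, low-contrast page selects both CLAHE and sharpening, while a page that already has strong contrast and edges is left alone.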

Preprocessing only affects layout detection input.
Original images preserved for element extraction.
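
The separation described above might look like the sketch below. It is illustrative only: `prepare_inputs` and `stretch_contrast` are hypothetical names, and the min-max stretch is a simple stand-in for the real enhancement steps (CLAHE, sharpen, binarize). The mode names follow the auto/manual/disabled values mentioned in the test plan; here auto and manual both apply the same stand-in.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale values

VALID_MODES = {"auto", "manual", "disabled"}


def stretch_contrast(gray: Gray) -> Gray:
    """Stand-in enhancement: linear min-max stretch into a new image."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:
        # Flat image: nothing to stretch, but still return a copy.
        return [list(row) for row in gray]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in gray]


def prepare_inputs(original: Gray, mode: str = "auto") -> Tuple[Gray, Gray]:
    """Return (layout_input, extraction_input).

    Only the layout-detection input is enhanced. The extraction input is
    always the untouched original, so cropped elements keep their quality.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown preprocessing mode: {mode!r}")
    if mode == "disabled":
        layout_input = original
    else:
        layout_input = stretch_contrast(original)
    return layout_input, original
```

Because the enhancement always builds a new image, the original pixel data is never modified in place.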

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-27 15:17:20 +08:00
parent 06a5973f2e
commit ea0dd7456c
7 changed files with 800 additions and 22 deletions

@@ -93,7 +93,7 @@
- `frontend/src/i18n/locales/zh-TW.json` - Traditional Chinese
- `frontend/src/i18n/locales/en.json` - English (if exists)
-## 6. Testing
+## 6. Testing (with env)
- [ ] 6.1 Unit tests for preprocessing_service
- Test contrast enhancement methods
@@ -106,7 +106,7 @@
- Test preview endpoint returns correct images
- Test auto-detection returns sensible config
-- [ ] 6.3 Integration tests
+- [ ] 6.3 Integration tests (account: [redacted]; password: [redacted])
- Test OCR track with preprocessing modes (auto/manual/disabled)
- Verify image element quality is preserved
- Test with known problematic documents (faint table borders)