proposal: add-layout-preprocessing for improved table detection
Problem: PP-Structure misses tables with faint lines/borders

Solution: Preprocess images (contrast, sharpen) for layout detection
- Preprocessed image only used for layout detection
- Original image preserved for element extraction (quality)

Includes: proposal.md, design.md, tasks.md, spec delta

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
54
openspec/changes/add-layout-preprocessing/tasks.md
Normal file
@@ -0,0 +1,54 @@
# Tasks: Add Image Preprocessing for Layout Detection

## 1. Configuration

- [ ] 1.1 Add preprocessing configuration to `backend/app/core/config.py`
  - `layout_preprocessing_enabled: bool = True` - Enable/disable preprocessing
  - `layout_preprocessing_contrast: str = "clahe"` - Options: none, histogram, clahe
  - `layout_preprocessing_sharpen: bool = True` - Enable sharpening for faint lines
  - `layout_preprocessing_binarize: bool = False` - Optional binarization (aggressive)
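The four options above could be grouped roughly as follows. This is a minimal sketch, not the project's actual `config.py`: the class name `LayoutPreprocessingSettings` is hypothetical, and a plain frozen dataclass is assumed here only to keep the example self-contained. The field names and defaults are taken from the task list.

```python
from dataclasses import dataclass

# Allowed values for layout_preprocessing_contrast, as listed in task 1.1.
ALLOWED_CONTRAST_METHODS = ("none", "histogram", "clahe")


@dataclass(frozen=True)
class LayoutPreprocessingSettings:  # hypothetical name, for illustration only
    layout_preprocessing_enabled: bool = True
    layout_preprocessing_contrast: str = "clahe"
    layout_preprocessing_sharpen: bool = True
    layout_preprocessing_binarize: bool = False

    def __post_init__(self) -> None:
        # Reject typos such as "CLAHE" or "hist" early instead of at OCR time.
        if self.layout_preprocessing_contrast not in ALLOWED_CONTRAST_METHODS:
            raise ValueError(
                "layout_preprocessing_contrast must be one of "
                f"{ALLOWED_CONTRAST_METHODS}, "
                f"got {self.layout_preprocessing_contrast!r}"
            )
```

Validating the contrast option when the settings object is created keeps a misconfigured value from silently disabling enhancement later in the pipeline.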

## 2. Preprocessing Service

- [ ] 2.1 Create `backend/app/services/preprocessing_service.py`
  - Image loading utility (supports PIL, OpenCV)
  - Contrast enhancement methods (histogram equalization, CLAHE)
  - Sharpening filter for line enhancement
  - Optional adaptive binarization
  - Return preprocessed image as numpy array or PIL Image

- [ ] 2.2 Implement `enhance_for_layout_detection()` function
  - Input: Original image path or PIL Image
  - Output: Preprocessed image (same format as input)
  - Steps: contrast → sharpen → (optional) binarize
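The contrast → sharpen → (optional) binarize order can be sketched as below. Several simplifications are assumptions of this sketch rather than the planned design: it takes an 8-bit grayscale NumPy array instead of a path or PIL Image, it supports only `"none"` and `"histogram"` (CLAHE would normally come from OpenCV and is left out to keep the example NumPy-only), and it uses a global mean threshold as a stand-in for adaptive binarization. The helper names and the `sharpen_lines` and `binarize_output` parameters are hypothetical.

```python
import numpy as np


def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image."""
    if gray.size == 0:
        return gray.copy()
    cdf = np.bincount(gray.ravel(), minlength=256).cumsum()
    cdf_min = cdf[cdf > 0][0]
    total = cdf[-1]
    if total == cdf_min:  # single intensity level: nothing to spread out
        return gray.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


def sharpen(gray: np.ndarray) -> np.ndarray:
    """Apply the 3x3 kernel [[0,-1,0],[-1,5,-1],[0,-1,0]] with replicated edges."""
    img = np.pad(gray.astype(np.float32), 1, mode="edge")
    center = img[1:-1, 1:-1]
    neighbours = img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
    return np.clip(5.0 * center - neighbours, 0, 255).astype(np.uint8)


def binarize(gray: np.ndarray) -> np.ndarray:
    """Global mean threshold (a simple stand-in for adaptive binarization)."""
    return np.where(gray > gray.mean(), 255, 0).astype(np.uint8)


def enhance_for_layout_detection(
    gray: np.ndarray,
    contrast: str = "histogram",
    sharpen_lines: bool = True,
    binarize_output: bool = False,
) -> np.ndarray:
    """Run contrast -> sharpen -> (optional) binarize; output keeps the input shape."""
    if gray.dtype != np.uint8 or gray.ndim != 2:
        raise ValueError("expected a 2-D uint8 grayscale image")
    if contrast == "histogram":
        out = equalize_histogram(gray)
    elif contrast == "none":
        out = gray.copy()
    else:
        raise ValueError(f"unsupported contrast method in this sketch: {contrast!r}")
    if sharpen_lines:
        out = sharpen(out)
    if binarize_output:
        out = binarize(out)
    return out
```

Every step preserves the image dimensions, which matters later: bounding boxes detected on the enhanced image can then be applied to the original without rescaling.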

## 3. Integration with OCR Service

- [ ] 3.1 Update `backend/app/services/ocr_service.py`
  - Import preprocessing service
  - Before `_run_ppstructure()`, preprocess image if enabled
  - Pass preprocessed image to PP-Structure for layout detection
  - Keep original image reference for image extraction

- [ ] 3.2 Ensure image element extraction uses the original image
  - Verify `saved_path` and `img_path` in elements reference the original image
  - Apply bbox coordinates detected on the preprocessed image to crops of the original
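The split between "detect on the preprocessed image" and "crop from the original" can be illustrated with the sketch below. It does not call PP-Structure: `detect_layout` is a hypothetical callable standing in for whatever `_run_ppstructure()` returns, and the function name, the `(x0, y0, x1, y1)` box format, and the `enabled` flag are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

import numpy as np

BBox = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixel coordinates (assumed)


def crop_elements_from_original(
    original: np.ndarray,
    preprocess: Callable[[np.ndarray], np.ndarray],
    detect_layout: Callable[[np.ndarray], List[BBox]],
    enabled: bool = True,
) -> List[np.ndarray]:
    """Detect layout on the preprocessed image, but crop from the original."""
    layout_input = preprocess(original) if enabled else original

    # Boxes only transfer 1:1 if preprocessing kept the image dimensions.
    if layout_input.shape[:2] != original.shape[:2]:
        raise ValueError("preprocessing must not change image dimensions")

    height, width = original.shape[:2]
    crops: List[np.ndarray] = []
    for x0, y0, x1, y1 in detect_layout(layout_input):
        # Clamp to the image bounds so a slightly oversized box cannot wrap around.
        x0, x1 = max(0, x0), min(width, x1)
        y0, y1 = max(0, y0), min(height, y1)
        crops.append(original[y0:y1, x0:x1].copy())
    return crops
```

The dimension check is the key design point: if a preprocessing step ever resizes or pads the image, the boxes would need to be mapped back before cropping, and failing loudly is safer than producing shifted crops.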

## 4. Testing

- [ ] 4.1 Unit tests for preprocessing_service
  - Test contrast enhancement methods
  - Test sharpening filter
  - Test binarization
  - Test with various image formats (PNG, JPEG)

- [ ] 4.2 Integration tests
  - Test OCR track with preprocessing enabled/disabled
  - Verify image element quality is preserved
  - Test with known problematic documents (faint table borders)
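Until real problem documents are collected, a synthetic fixture can stand in for the faint-border case. The sketch below is one possible approach, not part of the planned test suite: `make_faint_table`, `border_contrast`, and `assert_border_contrast_improves` are hypothetical helpers, and the intensity values (245 background, 225 lines) are arbitrary choices for a barely visible grid.

```python
from typing import Callable

import numpy as np


def make_faint_table(rows: int = 3, cols: int = 4, cell: int = 20,
                     background: int = 245, line: int = 225) -> np.ndarray:
    """Grayscale image of a table grid whose borders are only slightly darker."""
    img = np.full((rows * cell + 1, cols * cell + 1), background, dtype=np.uint8)
    img[::cell, :] = line  # horizontal borders
    img[:, ::cell] = line  # vertical borders
    return img


def border_contrast(img: np.ndarray, cell: int = 20) -> float:
    """Mean background intensity minus mean border intensity."""
    mask = np.zeros(img.shape, dtype=bool)
    mask[::cell, :] = True
    mask[:, ::cell] = True
    return float(img[~mask].mean() - img[mask].mean())


def assert_border_contrast_improves(
    preprocess: Callable[[np.ndarray], np.ndarray], min_gain: float = 1.0
) -> None:
    """Fail unless preprocessing makes the grid lines stand out more."""
    img = make_faint_table()
    gain = border_contrast(preprocess(img)) - border_contrast(img)
    assert gain >= min_gain, f"border contrast gain {gain:.1f} < {min_gain}"
```

A unit test can pass any preprocessing callable to `assert_border_contrast_improves`, which keeps the fixture independent of how the service is eventually implemented.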

## 5. Documentation

- [ ] 5.1 Update API documentation
  - Document new configuration options
  - Explain preprocessing behavior