# Tasks: Add Image Preprocessing for Layout Detection

## 1. Configuration

- [ ] 1.1 Add preprocessing configuration to `backend/app/core/config.py`
  - `layout_preprocessing_mode: str = "auto"` - Options: auto, manual, disabled
  - `layout_preprocessing_contrast: str = "clahe"` - Options: none, histogram, clahe
  - `layout_preprocessing_sharpen: bool = True` - Enable sharpening for faint lines
  - `layout_preprocessing_binarize: bool = False` - Optional binarization (aggressive)
- [ ] 1.2 Add preprocessing schema to `backend/app/schemas/task.py`
  - `PreprocessingMode` enum: auto, manual, disabled
  - `PreprocessingConfig` schema for API request/response

## 2. Preprocessing Service

- [ ] 2.1 Create `backend/app/services/preprocessing_service.py`
  - Image loading utility (supports PIL, OpenCV)
  - Contrast enhancement methods (histogram equalization, CLAHE)
  - Sharpening filter for line enhancement
  - Optional adaptive binarization
  - Return preprocessed image as numpy array or PIL Image
- [ ] 2.2 Implement `enhance_for_layout_detection()` function
  - Input: Original image path or PIL Image + config
  - Output: Preprocessed image (same format as input)
  - Steps: contrast → sharpen → (optional) binarize
- [ ] 2.3 Implement `analyze_image_quality()` function (Auto mode)
  - Calculate contrast level (standard deviation of grayscale)
  - Detect edge clarity (Sobel/Canny edge strength)
  - Return recommended `PreprocessingConfig` based on analysis
  - Thresholds:
    - Low contrast < 40: Apply CLAHE
    - Faint edges < 0.1: Apply sharpen
    - Very low contrast < 20: Consider binarize

## 3. Integration with OCR Service

- [ ] 3.1 Update `backend/app/services/ocr_service.py`
  - Import preprocessing service
  - Check preprocessing mode (auto/manual/disabled)
  - If auto: call `analyze_image_quality()` first
  - Before `_run_ppstructure()`, preprocess image based on config
  - Pass preprocessed image to PP-Structure for layout detection
  - Keep original image reference for image extraction
- [ ] 3.2 Ensure image element extraction uses original
  - Verify `saved_path` and `img_path` in elements reference original
  - Bbox coordinates from preprocessed detection applied to original crop
- [ ] 3.3 Update task start API to accept preprocessing options
  - Add `preprocessing_mode` parameter to start request
  - Add `preprocessing_config` for manual mode overrides

## 4. Preview API

- [ ] 4.1 Create `backend/app/api/v2/endpoints/preview.py`
  - `POST /api/v2/tasks/{task_id}/preview/preprocessing`
  - Input: page number, preprocessing config (optional)
  - Output:
    - Original image (base64 or URL)
    - Preprocessed image (base64 or URL)
    - Auto-detected config (if mode=auto)
    - Image quality metrics (contrast, edge_strength)
- [ ] 4.2 Add preview router to API
  - Register in `backend/app/api/v2/api.py`
  - Add appropriate authentication/authorization

## 5. Frontend UI

- [ ] 5.1 Create `frontend/src/components/PreprocessingSettings.tsx`
  - Radio buttons: Auto / Manual / Disabled
  - Manual mode shows:
    - Contrast dropdown: None / Histogram / CLAHE
    - Sharpen checkbox
    - Binarize checkbox
  - Preview button to trigger comparison view
- [ ] 5.2 Create `frontend/src/components/PreprocessingPreview.tsx`
  - Side-by-side image comparison (original vs preprocessed)
  - Display detected quality metrics
  - Show which auto settings would be applied
  - Slider or toggle to switch between views
- [ ] 5.3 Integrate with task start flow
  - Add PreprocessingSettings to OCR track options
  - Pass selected config to task start API
  - Store user preference in localStorage
- [ ] 5.4 Add i18n translations
  - `frontend/src/i18n/locales/zh-TW.json` - Traditional Chinese
  - `frontend/src/i18n/locales/en.json` - English (if exists)

## 6. Testing

- [ ] 6.1 Unit tests for preprocessing_service
  - Test contrast enhancement methods
  - Test sharpening filter
  - Test binarization
  - Test `analyze_image_quality()` with various images
  - Test with various image formats (PNG, JPEG)
- [ ] 6.2 Unit tests for preview API
  - Test preview endpoint returns correct images
  - Test auto-detection returns sensible config
- [ ] 6.3 Integration tests
  - Test OCR track with preprocessing modes (auto/manual/disabled)
  - Verify image element quality is preserved
  - Test with known problematic documents (faint table borders)
  - Verify auto mode improves detection for low-quality images

## 7. Documentation

- [ ] 7.1 Update API documentation
  - Document new configuration options
  - Document preview endpoint
  - Explain preprocessing behavior and modes
- [ ] 7.2 Add user guide section
  - When to use auto vs manual
  - How to interpret quality metrics
  - Troubleshooting tips
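
## Appendix: Auto-mode threshold sketch

The auto-mode thresholds listed under task 2.3 can be sketched as a small decision function. This is a minimal illustration, not the planned implementation. It uses only the standard library and takes the two metrics as plain numbers. The helper names `grayscale_contrast` and `recommend_config`, the field names on `PreprocessingConfig`, and the assumption that edge strength is normalized to the 0 to 1 range are all hypothetical. The sketch also reads "Consider binarize" as "enable binarize", which the real service may choose to soften.

```python
from dataclasses import dataclass
from statistics import pstdev

# Thresholds copied from task 2.3.
LOW_CONTRAST = 40.0       # below this: apply CLAHE
VERY_LOW_CONTRAST = 20.0  # below this: consider binarization
FAINT_EDGES = 0.1         # below this: apply sharpening


@dataclass
class PreprocessingConfig:
    """Hypothetical stand-in for the schema planned in task 1.2."""
    contrast: str = "none"  # one of: none, histogram, clahe
    sharpen: bool = False
    binarize: bool = False


def grayscale_contrast(pixels):
    """Contrast level as the standard deviation of 0-255 grayscale values."""
    return pstdev(pixels)


def recommend_config(contrast, edge_strength):
    """Map quality metrics to a recommended config using the 2.3 thresholds.

    Assumes edge_strength is already normalized to the range 0..1.
    """
    return PreprocessingConfig(
        contrast="clahe" if contrast < LOW_CONTRAST else "none",
        sharpen=edge_strength < FAINT_EDGES,
        # Assumption: "consider binarize" is treated as "enable binarize".
        binarize=contrast < VERY_LOW_CONTRAST,
    )
```

For example, `recommend_config(30.0, 0.2)` selects CLAHE only, while a page with contrast below 20 and edge strength below 0.1 enables all three steps. Keeping the decision separate from the metric computation lets the unit tests in 6.1 cover the thresholds without loading any images.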