# Tasks: Unify Image Scaling Strategy

## 1. Configuration

- [x] 1.1 Add min_dimension setting to `backend/app/core/config.py`
  - `layout_image_scaling_min_dimension: int = 1200`
  - Description: "Min dimension (pixels) before upscaling. Images smaller than this will be scaled up."

## 2. Bidirectional Scaling Logic

- [x] 2.1 Update `scale_for_layout_detection()` in `layout_preprocessing_service.py`
  - Add upscaling condition: `max_dim < min_dimension`
  - Use `cv2.INTER_CUBIC` for upscaling (better quality than INTER_LINEAR)
  - Update docstring to reflect bidirectional behavior
- [x] 2.2 Update scaling decision logic
  ```python
  # Current: only downscale
  should_scale = max_dim > max_dimension

  # New: bidirectional
  should_downscale = max_dim > max_dimension
  should_upscale = max_dim < min_dimension
  should_scale = should_downscale or should_upscale
  ```
- [x] 2.3 Update logging to indicate scale direction
  - "Scaled DOWN for layout detection: 2480x3508 -> 1131x1600"
  - "Scaled UP for layout detection: 800x600 -> 1600x1200"

## 3. PDF DPI Handling (Optional Optimization)

- [x] 3.1 Evaluate current PDF conversion impact
  - Decision: Keep 300 DPI and let bidirectional scaling handle it
  - Reason: Raw OCR benefits from high resolution, while scaling covers PP-Structure's needs
- [x] 3.2 Option A: Keep 300 DPI, let scaling handle it ✓
  - Simplest approach, no change needed
  - Raw OCR benefits from high resolution
- [ ] ~~3.3 Option B: Add configurable PDF DPI~~ (Not needed)

## 4. Testing

- [x] 4.1 Test upscaling with small images
  - Small image (800x600): Scaled UP → 1600x1200, scale_factor=0.500
  - Very small (400x300): Scaled UP → 1600x1200, scale_factor=0.250
- [x] 4.2 Test no scaling for optimal range
  - Optimal image (1500x1000): was_scaled=False, scale_factor=1.000
- [x] 4.3 Test downscaling (existing behavior)
  - Large image (2480x3508): Scaled DOWN → 1131x1600, scale_factor=2.192
- [ ] 4.4 Test PDF workflow (manual test recommended)
  - PDF page should be detected correctly
  - Scaling should apply after PDF conversion

## 5. Documentation

- [x] 5.1 Update config.py Field descriptions
  - Explained bidirectional scaling in the enabled field's description
  - Updated max/min/target descriptions
- [x] 5.2 Add logging for scaling decisions
  - Logs direction (UP/DOWN), original size, target size, scale_factor

---

## Implementation Summary

**Files Modified:**

- `backend/app/core/config.py` - Added `layout_image_scaling_min_dimension` setting
- `backend/app/services/layout_preprocessing_service.py` - Updated bidirectional scaling logic

**Test Results (2025-11-27):**

| Test Case | Max dimension check | Result | scale_factor |
|-----------|---------------------|--------|--------------|
| Small (800×600) | max=800 < 1200 | UP → 1600×1200 | 0.500 |
| Optimal (1500×1000) | 1200 ≤ 1500 ≤ 2000 | No scaling | 1.000 |
| Large (2480×3508) | max=3508 > 2000 | DOWN → 1131×1600 | 2.192 |
| Very small (400×300) | max=400 < 1200 | UP → 1600×1200 | 0.250 |

---

## Implementation Notes

### Scaling Decision Matrix

| Max dimension | Action | Resize scale | Interpolation |
|---------------|--------|--------------|---------------|
| < 1200px | Scale UP | target/max_dim | INTER_CUBIC |
| 1200-2000px | No scaling | 1.0 | N/A |
| > 2000px | Scale DOWN | target/max_dim | INTER_AREA |

The resize scale is what gets applied to the image. The `scale_factor` reported in logs and test results is its inverse (max_dim/target), used to restore bounding boxes to original coordinates.

### Example Scenarios

1. **Small scan (800×600)**
   - max_dim = 800 < 1200 → Scale UP
   - target = 1600, scale = 1600/800 = 2.0
   - Result: 1600×1200
   - scale_factor (for bbox restore) = 0.5

2. **Optimal image (1400×1000)**
   - max_dim = 1400, 1200 <= 1400 <= 2000 → No scaling
   - Result: unchanged
   - scale_factor = 1.0

3. **High-res scan (2480×3508)**
   - max_dim = 3508 > 2000 → Scale DOWN
   - target = 1600, scale = 1600/3508 = 0.456
   - Result: 1131×1600
   - scale_factor (for bbox restore) = 3508/1600 = 2.192
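
### Decision Logic Sketch

The decision matrix and scenarios above can be reproduced with a small pure-Python sketch. This is not the actual `scale_for_layout_detection()` implementation: the names `plan_layout_scaling` and `ScalingPlan`, the `max_dimension` and `target_dimension` parameter names, and the use of `round()` for the output size are assumptions made for illustration. The thresholds (1200, 2000, 1600) are taken from the tables above. The sketch only plans the resize; it returns the interpolation flag by name instead of calling OpenCV.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScalingPlan:
    """Hypothetical result type describing what the scaler would do."""
    direction: Optional[str]      # "UP", "DOWN", or None when no scaling applies
    new_size: Tuple[int, int]     # (width, height) after scaling
    scale_factor: float           # multiply scaled bbox coordinates by this to restore them
    interpolation: Optional[str]  # name of the cv2 flag listed in the decision matrix


def plan_layout_scaling(
    width: int,
    height: int,
    min_dimension: int = 1200,     # from layout_image_scaling_min_dimension
    max_dimension: int = 2000,     # assumed parameter name; value from the matrix
    target_dimension: int = 1600,  # assumed parameter name; value from the scenarios
) -> ScalingPlan:
    max_dim = max(width, height)
    should_downscale = max_dim > max_dimension
    should_upscale = max_dim < min_dimension

    if not (should_downscale or should_upscale):
        # Already within [min_dimension, max_dimension]: leave the image alone.
        return ScalingPlan(None, (width, height), 1.0, None)

    # Resize scale brings the longest side to target_dimension.
    scale = target_dimension / max_dim
    # Rounding is an assumption; the real service may truncate instead.
    new_size = (round(width * scale), round(height * scale))

    direction = "UP" if should_upscale else "DOWN"
    interpolation = "INTER_CUBIC" if should_upscale else "INTER_AREA"
    # The bbox-restore factor is the inverse of the resize scale.
    return ScalingPlan(direction, new_size, max_dim / target_dimension, interpolation)


if __name__ == "__main__":
    for w, h in [(800, 600), (1400, 1000), (2480, 3508)]:
        print((w, h), plan_layout_scaling(w, h))
```

Keeping the decision separate from the actual resize makes the thresholds easy to unit-test without image data; a real caller would pass `new_size` and the matching `cv2` interpolation flag to `cv2.resize`.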