egg/OCR


egg 801ee9c4b6 feat: create extract-table-cell-boxes proposal and archive old proposal

- Archive unify-image-scaling proposal to archive/2025-11-28
- Create new extract-table-cell-boxes proposal for supplementing PPStructureV3
  with direct SLANeXt model calls to extract table cell bounding boxes
- Add debug logging to pp_structure_enhanced.py for table cell boxes investigation
- Discovered that PPStructureV3 high-level API filters out cell bbox data,
  but paddlex.create_model() can directly invoke underlying models

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-28 12:15:06 +08:00


Tasks: Unify Image Scaling Strategy

1. Configuration

1.1 Add min_dimension setting to backend/app/core/config.py
- layout_image_scaling_min_dimension: int = 1200
- Description: "Min dimension (pixels) before upscaling. Images smaller than this will be scaled up."
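
A minimal sketch of how the three thresholds fit together. It uses a stdlib dataclass as a stand-in for the project's settings class, so the class name and the `max`/`target` field names are assumptions; only `layout_image_scaling_min_dimension` and the values 1200, 1600, and 2000 come from this task list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LayoutScalingSettings:
    """Hypothetical stand-in for the scaling fields in config.py."""

    # From the task list: upscale when the longest side is below this value.
    layout_image_scaling_min_dimension: int = field(
        default=1200,
        metadata={
            "description": "Min dimension (pixels) before upscaling. "
            "Images smaller than this will be scaled up."
        },
    )
    # Assumed field names; the values match the decision matrix in this document.
    layout_image_scaling_max_dimension: int = 2000
    layout_image_scaling_target_dimension: int = 1600

    def __post_init__(self) -> None:
        # The target should sit inside the no-scaling band; otherwise a freshly
        # scaled image would qualify for scaling again.
        lo = self.layout_image_scaling_min_dimension
        hi = self.layout_image_scaling_max_dimension
        target = self.layout_image_scaling_target_dimension
        if not lo <= target <= hi:
            raise ValueError(f"target {target} must lie within [{lo}, {hi}]")
```

The range check is not part of the task list; it is included only to show why the target (1600) lies between the two thresholds.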

2. Bidirectional Scaling Logic

2.1 Update scale_for_layout_detection() in layout_preprocessing_service.py
- Add upscaling condition: max_dim < min_dimension
- Use cv2.INTER_CUBIC for upscaling (better quality than INTER_LINEAR)
- Update docstring to reflect bidirectional behavior

2.2 Update scaling decision logic

# Current: only downscale
should_scale = max_dim > max_dimension

# New: bidirectional
should_downscale = max_dim > max_dimension
should_upscale = max_dim < min_dimension
should_scale = should_downscale or should_upscale
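
The decision logic above can be extended into a small pure-Python sketch that also computes the target size and the bbox-restore factor. The function name, the `ScalingDecision` type, and the use of `round()` for the new dimensions are assumptions, not the actual code in layout_preprocessing_service.py.

```python
from typing import NamedTuple, Optional, Tuple


class ScalingDecision(NamedTuple):
    direction: Optional[str]      # "UP", "DOWN", or None when no scaling applies
    new_size: Tuple[int, int]     # (width, height) after scaling
    scale_factor: float           # multiply scaled coordinates by this to restore originals
    interpolation: Optional[str]  # name of the cv2 constant to use, or None


def decide_scaling(
    width: int,
    height: int,
    min_dimension: int = 1200,
    max_dimension: int = 2000,
    target_dimension: int = 1600,
) -> ScalingDecision:
    max_dim = max(width, height)
    should_downscale = max_dim > max_dimension
    should_upscale = max_dim < min_dimension
    if not (should_downscale or should_upscale):
        return ScalingDecision(None, (width, height), 1.0, None)

    # Resize ratio brings the longest side to the target dimension.
    scale = target_dimension / max_dim
    new_size = (round(width * scale), round(height * scale))
    direction = "UP" if should_upscale else "DOWN"
    # INTER_CUBIC for enlarging, INTER_AREA for shrinking, as in the decision matrix.
    interpolation = "INTER_CUBIC" if should_upscale else "INTER_AREA"
    # scale_factor is the inverse of the resize ratio (original / scaled).
    return ScalingDecision(direction, new_size, 1.0 / scale, interpolation)
```

With OpenCV available, the interpolation name could be resolved via `getattr(cv2, decision.interpolation)` and passed to `cv2.resize`.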

2.3 Update logging to indicate scale direction
- "Scaled DOWN for layout detection: 2480x3508 -> 1131x1600"
- "Scaled UP for layout detection: 800x600 -> 1600x1200"
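
One way to produce messages in this shape is sketched below. The helper name is hypothetical, and the trailing `scale_factor` suffix is an assumption based on the Documentation tasks, which list scale_factor among the logged values.

```python
import logging
from typing import Tuple

logger = logging.getLogger(__name__)


def format_scaling_log(
    direction: str,
    original: Tuple[int, int],
    scaled: Tuple[int, int],
    scale_factor: float,
) -> str:
    """Build the log line; sizes are (width, height) tuples."""
    ow, oh = original
    sw, sh = scaled
    return (
        f"Scaled {direction} for layout detection: "
        f"{ow}x{oh} -> {sw}x{sh} (scale_factor={scale_factor:.3f})"
    )


# Emits nothing unless logging is configured at INFO level or lower.
logger.info(format_scaling_log("UP", (800, 600), (1600, 1200), 0.5))
```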

3. PDF DPI Handling (Optional Optimization)

3.1 Evaluate current PDF conversion impact
- Decision: Keep 300 DPI, let bidirectional scaling handle it
- Reason: Raw OCR benefits from high resolution, and bidirectional scaling already brings the image into the range PP-Structure needs
3.2 Option A: Keep 300 DPI, let scaling handle it ✓
- Simplest approach, no change needed
- Raw OCR benefits from high resolution
~~3.3 Option B: Add configurable PDF DPI~~ (Not needed)
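
As a quick arithmetic check on why 300 DPI pages always hit the downscale path: the 2480x3508 size used in the tests matches an A4 page rendered at 300 DPI. The A4 page size is an assumption here; the task list does not name the paper format.

```python
MM_PER_INCH = 25.4


def page_pixels(width_mm: float, height_mm: float, dpi: int) -> tuple:
    """Pixel dimensions of a page rendered at the given DPI."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


# A4 is 210 mm x 297 mm.
a4_at_300 = page_pixels(210, 297, 300)
# The longest side exceeds the 2000 px maximum, so the page is scaled down.
needs_downscale = max(a4_at_300) > 2000
```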

4. Testing

4.1 Test upscaling with small images
- Small image (800x600): Scaled UP → 1600x1200, scale_factor=0.500
- Very small (400x300): Scaled UP → 1600x1200, scale_factor=0.250
4.2 Test no scaling for optimal range
- Optimal image (1500x1000): was_scaled=False, scale_factor=1.000
4.3 Test downscaling (existing behavior)
- Large image (2480x3508): Scaled DOWN → 1131x1600, scale_factor=2.192
4.4 Test PDF workflow (manual test recommended)
- PDF page should be detected correctly
- Scaling should apply after PDF conversion

5. Documentation

5.1 Update config.py Field descriptions
- Explained bidirectional scaling in the description of the enabled field
- Updated max/min/target descriptions
5.2 Add logging for scaling decisions
- Logs direction (UP/DOWN), original size, target size, scale_factor

Implementation Summary

Files Modified:

backend/app/core/config.py - Added layout_image_scaling_min_dimension setting
backend/app/services/layout_preprocessing_service.py - Updated scaling logic to be bidirectional

Test Results (2025-11-27):

| Test Case | Original | Result | scale_factor |
|---|---|---|---|
| Small (800×600) | max=800 < 1200 | UP → 1600×1200 | 0.500 |
| Optimal (1500×1000) | 1200 ≤ 1500 ≤ 2000 | No scaling | 1.000 |
| Large (2480×3508) | max=3508 > 2000 | DOWN → 1131×1600 | 2.192 |
| Very small (400×300) | max=400 < 1200 | UP → 1600×1200 | 0.250 |

Implementation Notes

Scaling Decision Matrix

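| Longest Side (max_dim) | Action | Resize Scale | Interpolation |
|---|---|---|---|
| < 1200px | Scale UP | target/max_dim | INTER_CUBIC |
| 1200-2000px | No scaling | 1.0 | N/A |
| > 2000px | Scale DOWN | target/max_dim | INTER_AREA |

The scale_factor reported in logs and test results is the inverse of the resize scale (max_dim/target); it is the multiplier used to restore bounding boxes to original coordinates.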

Example Scenarios

Small scan (800×600)
- max_dim = 800 < 1200 → Scale UP
- target = 1600, scale = 1600/800 = 2.0
- Result: 1600×1200
- scale_factor (for bbox restore) = 1/2.0 = 0.5
Optimal image (1400×1000)
- max_dim = 1400, 1200 <= 1400 <= 2000 → No scaling
- Result: unchanged
- scale_factor = 1.0
High-res scan (2480×3508)
- max_dim = 3508 > 2000 → Scale DOWN
- target = 1600, scale = 1600/3508 = 0.456
- Result: 1131×1600
- scale_factor (for bbox restore) = 3508/1600 = 2.19
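
The "bbox restore" step in these scenarios can be sketched as follows. The helper name and the `(x1, y1, x2, y2)` pixel box format are assumptions; the source only states that scale_factor is used to restore bounding boxes.

```python
from typing import Tuple

BBox = Tuple[int, int, int, int]  # assumed format: (x1, y1, x2, y2) in pixels


def restore_bbox(bbox: BBox, scale_factor: float) -> BBox:
    """Map a box detected on the scaled image back to original-image coordinates."""
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * scale_factor),
        round(y1 * scale_factor),
        round(x2 * scale_factor),
        round(y2 * scale_factor),
    )


# Small scan (800×600) upscaled to 1600×1200: scale_factor = 0.5,
# so a box found on the scaled image shrinks by half.
restored = restore_bbox((200, 100, 1000, 500), 0.5)
```

For the high-res scan, the same call with scale_factor = 3508/1600 enlarges boxes detected on the 1131×1600 image back to 2480×3508 coordinates.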
