feat: enhance layout preprocessing and unify image scaling proposal

Backend changes:
- Add image scaling configuration for PP-Structure processing
- Enhance layout preprocessing service with scaling support
- Update OCR service with improved memory management
- Add PP-Structure enhanced processing improvements

Frontend changes:
- Update preprocessing settings UI
- Fix processing page layout and state management
- Update API types for new parameters

Proposals:
- Archive add-layout-preprocessing proposal (completed)
- Add unify-image-scaling proposal for consistent coordinate handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-28 09:23:19 +08:00
parent 86bbea6fbf
commit dda9621e17
17 changed files with 826 additions and 104 deletions


@@ -0,0 +1,128 @@
## ADDED Requirements
### Requirement: Layout Detection Image Preprocessing
The system SHALL provide optional image preprocessing to enhance layout detection accuracy for documents with faint lines, low contrast, or poor scan quality.
#### Scenario: Preprocessing improves table detection
- **GIVEN** a document with faint table borders that PP-Structure fails to detect
- **WHEN** layout preprocessing is enabled
- **THEN** the system SHALL preprocess the image before layout detection
- **AND** contrast enhancement SHALL make faint lines more visible
- **AND** PP-Structure SHALL receive the preprocessed image for layout detection
#### Scenario: Image element extraction uses original quality
- **GIVEN** an image element detected by PP-Structure from preprocessed input
- **WHEN** the system extracts the image element
- **THEN** the system SHALL crop from the ORIGINAL image, not the preprocessed version
- **AND** the extracted image SHALL maintain original quality and colors
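The extraction rule above can be illustrated with a minimal sketch. It assumes the preprocessed image has the same dimensions as the original, so a bounding box from layout detection maps directly onto original pixels; if the image were scaled before detection, the box would first need to be mapped back to original coordinates. The `crop_from_original` helper, the nested-list image representation, and the `(x0, y0, x1, y1)` box format are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]            # rows of grayscale pixel values (illustrative)
BBox = Tuple[int, int, int, int]   # (x0, y0, x1, y1), with x1 and y1 exclusive


def crop_from_original(original: Image, bbox: BBox) -> Image:
    """Crop a detected element from the ORIGINAL image, clamped to its bounds."""
    height = len(original)
    width = len(original[0]) if height else 0
    x0, y0, x1, y1 = bbox
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in original[y0:y1]]


original = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# The preprocessed copy is only the layout detector's input; it is never cropped.
preprocessed = [[0, 0, 255], [0, 255, 255], [255, 255, 255]]

# Box reported by layout detection on the preprocessed image (hypothetical values).
element = crop_from_original(original, (1, 0, 3, 2))
```

Because the crop reads only from `original`, contrast or sharpening changes applied for detection cannot leak into the saved element.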
#### Scenario: CLAHE contrast enhancement
- **WHEN** `layout_preprocessing_contrast` is set to "clahe"
- **THEN** the system SHALL apply Contrast Limited Adaptive Histogram Equalization
- **AND** the enhancement SHALL NOT over-saturate already bright regions
#### Scenario: Sharpening enhances faint lines
- **WHEN** `layout_preprocessing_sharpen` is enabled
- **THEN** the system SHALL apply unsharp masking to enhance edges
- **AND** faint table borders SHALL become more detectable
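As a rough illustration of unsharp masking, the sketch below computes `sharpened = original + amount * (original - blurred)` on a small grayscale grid. It is not the service's implementation: it swaps in a 3x3 box blur with edge replication where a production pipeline would more likely use a Gaussian blur, and the function names and the `amount` parameter are assumptions.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values in 0..255 (illustrative)


def box_blur(gray: Image) -> List[List[float]]:
    """3x3 box blur; border pixels are handled by replicating the nearest edge."""
    h, w = len(gray), len(gray[0])
    blurred = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += gray[yy][xx]
            row.append(acc / 9.0)
        blurred.append(row)
    return blurred


def unsharp_mask(gray: Image, amount: float = 1.0) -> Image:
    """Add the scaled difference between the image and its blur, clamped to 0..255."""
    blurred = box_blur(gray)
    return [
        [int(min(255, max(0, round(p + amount * (p - b))))) for p, b in zip(row, brow)]
        for row, brow in zip(gray, blurred)
    ]


# A faint vertical line (180) on a light background (200).
faint = [[200, 200, 180, 200, 200] for _ in range(5)]
sharpened = unsharp_mask(faint)
```

In this example the line pixel becomes darker and its neighbours become lighter, which widens the gap a layout detector has to pick up.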
#### Scenario: Optional binarization for extreme cases
- **WHEN** `layout_preprocessing_binarize` is enabled
- **THEN** the system SHALL apply adaptive thresholding
- **AND** this SHALL be used only for documents with very poor contrast
### Requirement: Preprocessing Hybrid Control Mode
The system SHALL support three preprocessing modes: automatic, manual, and disabled, with automatic as the default.
#### Scenario: Auto mode analyzes image quality
- **GIVEN** preprocessing mode is set to "auto"
- **WHEN** processing begins for a page
- **THEN** the system SHALL analyze image quality metrics (contrast, edge strength)
- **AND** automatically determine optimal preprocessing parameters
- **AND** apply recommended settings without user intervention
#### Scenario: Auto mode detects low contrast
- **GIVEN** preprocessing mode is "auto"
- **WHEN** image contrast (standard deviation) is below 40
- **THEN** the system SHALL automatically enable CLAHE contrast enhancement
#### Scenario: Auto mode detects faint edges
- **GIVEN** preprocessing mode is "auto"
- **WHEN** image edge strength (Sobel gradient mean) is below 15
- **THEN** the system SHALL automatically enable sharpening
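The two auto-mode thresholds above can be sketched as follows. Only the threshold values (40 for contrast, 15 for edge strength) come from the scenarios; how the metrics are computed is an assumption. Here contrast is the population standard deviation of grayscale values, and edge strength is the mean Sobel gradient magnitude over interior pixels, without normalization. The function names are hypothetical.

```python
import math
from statistics import pstdev
from typing import Dict, List

Image = List[List[int]]  # rows of grayscale pixel values in 0..255 (illustrative)

CONTRAST_THRESHOLD = 40.0  # from the scenario: standard deviation below 40
EDGE_THRESHOLD = 15.0      # from the scenario: Sobel gradient mean below 15


def contrast(gray: Image) -> float:
    """Population standard deviation of all pixel values."""
    return pstdev([p for row in gray for p in row])


def edge_strength(gray: Image) -> float:
    """Mean Sobel gradient magnitude over interior pixels (assumed definition)."""
    h, w = len(gray), len(gray[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1]
            )
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) - (
                gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1]
            )
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


def auto_settings(gray: Image) -> Dict[str, object]:
    """Pick preprocessing settings from the two quality metrics."""
    c, e = contrast(gray), edge_strength(gray)
    return {
        "contrast": "clahe" if c < CONTRAST_THRESHOLD else "none",
        "sharpen": e < EDGE_THRESHOLD,
        "metrics": {"contrast": c, "edge_strength": e},
    }


washed_out = [[128] * 5 for _ in range(5)]            # no contrast, no edges
crisp = [[0, 0, 255, 255, 255] for _ in range(5)]     # strong vertical edge
```

With these inputs, `auto_settings(washed_out)` enables both CLAHE and sharpening, while `auto_settings(crisp)` enables neither.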
#### Scenario: Manual mode uses user-specified settings
- **GIVEN** preprocessing mode is set to "manual"
- **WHEN** processing begins
- **THEN** the system SHALL use the user-provided preprocessing configuration
- **AND** ignore automatic quality analysis
#### Scenario: Disabled mode skips preprocessing
- **GIVEN** preprocessing mode is set to "disabled"
- **WHEN** processing begins
- **THEN** the system SHALL skip all preprocessing
- **AND** PP-Structure SHALL receive the original image directly
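One possible way to express the three modes is a small resolver that returns the configuration to apply, or `None` when preprocessing is skipped. The `PreprocessingMode`, `PreprocessingConfig`, and `resolve_config` names are hypothetical, and the `analyze` callback stands in for whatever quality analysis the auto mode performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class PreprocessingMode(str, Enum):
    AUTO = "auto"
    MANUAL = "manual"
    DISABLED = "disabled"


@dataclass(frozen=True)
class PreprocessingConfig:
    contrast: str = "none"  # "none", "histogram", or "clahe"
    sharpen: bool = False
    binarize: bool = False


def resolve_config(
    mode: PreprocessingMode,
    manual: Optional[PreprocessingConfig],
    analyze: Callable[[], PreprocessingConfig],
) -> Optional[PreprocessingConfig]:
    """Return the config to apply, or None when preprocessing is skipped."""
    if mode is PreprocessingMode.DISABLED:
        return None  # the layout detector receives the original image
    if mode is PreprocessingMode.MANUAL:
        if manual is None:
            raise ValueError("manual mode requires a user-provided configuration")
        return manual  # automatic quality analysis is not run
    return analyze()  # auto mode: derive settings from image quality


calls = []


def fake_analyze() -> PreprocessingConfig:
    # Stand-in for real quality analysis; records that it was invoked.
    calls.append("analyze")
    return PreprocessingConfig(contrast="clahe", sharpen=True)


user_config = PreprocessingConfig(contrast="histogram")
disabled = resolve_config(PreprocessingMode.DISABLED, user_config, fake_analyze)
manual = resolve_config(PreprocessingMode.MANUAL, user_config, fake_analyze)
auto = resolve_config(PreprocessingMode("auto"), None, fake_analyze)
```

Passing the analysis as a callback keeps the manual and disabled paths from paying for quality analysis they would ignore.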
### Requirement: Preprocessing Preview API
The system SHALL provide a preview endpoint that allows users to compare original and preprocessed images before processing.
#### Scenario: Preview returns comparison images
- **GIVEN** a task with an uploaded document
- **WHEN** the user requests a preprocessing preview for a specific page
- **THEN** the system SHALL return URLs or data for both the original and preprocessed images
- **AND** the user SHALL be able to visually compare the two
#### Scenario: Preview shows auto-detected settings
- **GIVEN** preview is requested with mode "auto"
- **WHEN** the system analyzes the page
- **THEN** the response SHALL include the auto-detected preprocessing configuration
- **AND** include quality metrics (contrast, edge_strength)
#### Scenario: Preview accepts manual configuration
- **GIVEN** preview is requested with mode "manual"
- **WHEN** user provides specific preprocessing settings
- **THEN** the system SHALL apply those settings to generate the preview
- **AND** return the preprocessed result for user verification
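A preview response covering the three scenarios above might be shaped roughly like this. The field names, the URL paths, and the metric values are all hypothetical placeholders; the spec only requires that both images, the applied configuration, and (in auto mode) the quality metrics be returned.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class PreviewResponse:
    original_url: str       # hypothetical field: where to fetch the original page image
    preprocessed_url: str   # hypothetical field: where to fetch the preprocessed image
    mode: str               # "auto" or "manual"
    config: Dict[str, object]  # settings that were actually applied
    quality_metrics: Optional[Dict[str, float]] = None  # filled in for auto mode


# Placeholder values for illustration only.
auto_preview = PreviewResponse(
    original_url="/preview/original.png",
    preprocessed_url="/preview/preprocessed.png",
    mode="auto",
    config={"contrast": "clahe", "sharpen": True, "binarize": False},
    quality_metrics={"contrast": 32.0, "edge_strength": 12.5},
)

payload = asdict(auto_preview)  # plain dict, ready for JSON serialization
```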
### Requirement: Preprocessing Track Isolation
The layout preprocessing feature SHALL only affect layout detection input without impacting other processing components.
#### Scenario: Raw OCR is unaffected
- **GIVEN** layout preprocessing is enabled
- **WHEN** Raw OCR processing runs
- **THEN** Raw OCR SHALL use the original image
- **AND** text detection quality SHALL NOT be affected by preprocessing
#### Scenario: Preprocessed image is temporary
- **GIVEN** an image is preprocessed for layout detection
- **WHEN** layout detection completes
- **THEN** the preprocessed image SHALL NOT be persisted to storage
- **AND** only the original image and element crops SHALL be saved
### Requirement: Preprocessing Frontend UI
The frontend SHALL provide a user interface for configuring and previewing preprocessing settings.
#### Scenario: Mode selection is available
- **GIVEN** the user is configuring OCR track processing
- **WHEN** the preprocessing settings panel is displayed
- **THEN** the user SHALL be able to select mode: Auto (default), Manual, or Disabled
- **AND** Auto mode SHALL be pre-selected
#### Scenario: Manual mode shows configuration options
- **GIVEN** the user selects Manual mode
- **WHEN** the settings panel updates
- **THEN** the user SHALL see options for:
  - Contrast enhancement (None / Histogram / CLAHE)
  - Sharpen toggle
  - Binarize toggle
#### Scenario: Preview button triggers comparison view
- **GIVEN** preprocessing settings are configured
- **WHEN** the user clicks the Preview button
- **THEN** the system SHALL display a side-by-side comparison of the original and preprocessed images
- **AND** show detected quality metrics