feat: enable document orientation detection for scanned PDFs

- Enable PP-StructureV3's use_doc_orientation_classify feature
- Detect rotation angle from doc_preprocessor_res.angle
- Swap page dimensions (width <-> height) for 90°/270° rotations
- Output PDF now correctly displays landscape-scanned content

Also includes:
- Archive completed openspec proposals
- Add simplify-frontend-ocr-config proposal (pending)
- Code cleanup and frontend simplification

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
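
The dimension swap described in the commit message can be sketched as below. This is only an illustration: `oriented_page_size` is a hypothetical helper, not a function from this repository, and it assumes the detected angle is one of 0, 90, 180, or 270 degrees.

```python
from typing import Tuple


def oriented_page_size(width: float, height: float, angle: int) -> Tuple[float, float]:
    """Return the page size to use once the detected rotation is applied.

    Hypothetical helper: `angle` stands for the value read from
    doc_preprocessor_res.angle and is assumed to be a multiple of 90.
    """
    angle = angle % 360  # fold negative or >= 360 values into 0..359
    if angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported rotation angle: {angle}")
    if angle in (90, 270):
        # A quarter turn exchanges the roles of width and height.
        return (height, width)
    # 0 and 180 degrees keep the page dimensions as they are.
    return (width, height)
```

With an A4 portrait page of 595 x 842 points, a 90° or 270° angle yields 842 x 595, while 0° and 180° leave the size unchanged.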
@@ -0,0 +1,42 @@
## REMOVED Requirements

### Requirement: Legacy PDF Generator Service

**Reason**: `pdf_generator.py` (507 lines) was the original PDF generation implementation, built on Pandoc/WeasyPrint. It has been completely superseded by `pdf_generator_service.py`, which uses ReportLab for low-level PDF generation with full layout preservation, table rendering, and image support.

**Migration**: No migration needed. `pdf_generator_service.py` already provides all functionality of the legacy generator, along with the layout, table, and image support described above.

#### Scenario: Legacy PDF generator file removal
- **WHEN** the legacy `pdf_generator.py` file is removed
- **THEN** the system continues to function normally using `pdf_generator_service.py`
- **AND** PDF generation works correctly with layout preservation
- **AND** no import errors occur in any service or router

### Requirement: Deprecated IoU Configuration Parameters

**Reason**: `gap_filling_iou_threshold` and `gap_filling_dedup_iou_threshold` are deprecated configuration parameters, replaced by IoA (Intersection over Area) thresholds for better accuracy.

**Migration**: Use `gap_filling_dedup_ioa_threshold` instead.

#### Scenario: Deprecated config removal
- **WHEN** the deprecated IoU configuration parameters are removed from `config.py`
- **THEN** the gap filling service uses IoA-based thresholds
- **AND** the system starts without configuration errors
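
To illustrate why an IoA threshold can behave differently from an IoU threshold, here is a minimal sketch of both measures. The function names are hypothetical and not taken from the gap filling service. It also assumes that IoA divides the intersection by the area of the first (candidate) box, which this spec does not state explicitly.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def _area(box: BBox) -> float:
    # Degenerate or inverted boxes count as zero area.
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def _intersection(a: BBox, b: BBox) -> float:
    # Overlap rectangle; its area is zero when the boxes do not overlap.
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return _area((x0, y0, x1, y1))


def iou(a: BBox, b: BBox) -> float:
    """Intersection over Union: symmetric, penalizes size mismatch."""
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    return inter / union if union > 0 else 0.0


def ioa(inner: BBox, outer: BBox) -> float:
    """Intersection over Area of `inner` (assumed denominator)."""
    inner_area = _area(inner)
    return _intersection(inner, outer) / inner_area if inner_area > 0 else 0.0


# A small text box fully contained in a large region.
word = (10.0, 10.0, 30.0, 20.0)
region = (0.0, 0.0, 200.0, 100.0)
```

For the two boxes above, `iou(word, region)` is 0.01 while `ioa(word, region)` is 1.0, so a containment check based on IoA flags the small box as covered even though the IoU is close to zero.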

## ADDED Requirements

### Requirement: Unified Bbox Utility Module

The system SHALL provide a centralized bbox utility module (`backend/app/utils/bbox_utils.py`) for consistent bounding box normalization across all services.

#### Scenario: Bbox normalization from polygon format
- **WHEN** a bbox in polygon format `[[x1,y1], [x2,y2], [x3,y3], [x4,y4]]` is provided
- **THEN** the utility returns a normalized tuple `(x0, y0, x1, y1)` representing min/max coordinates

#### Scenario: Bbox normalization from flat array
- **WHEN** a bbox in flat array format `[x0, y0, x1, y1]` is provided
- **THEN** the utility returns a normalized tuple `(x0, y0, x1, y1)`

#### Scenario: Bbox normalization from 8-point polygon
- **WHEN** a bbox in 8-point format `[x1, y1, x2, y2, x3, y3, x4, y4]` is provided
- **THEN** the utility calculates and returns a normalized tuple `(min_x, min_y, max_x, max_y)`
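
The three scenarios above could be served by a single entry point, sketched below. The name `normalize_bbox` is an assumption, since this spec does not list the actual functions in `bbox_utils.py`. Applying min/max to the flat format is also an assumption made here so that swapped corners still produce a well-ordered result.

```python
from typing import Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def normalize_bbox(bbox: Sequence) -> BBox:
    """Normalize polygon, flat, or 8-point input to (x0, y0, x1, y1).

    Hypothetical sketch; not the repository's implementation.
    """
    if len(bbox) == 4 and all(isinstance(p, (list, tuple)) for p in bbox):
        # Polygon format: [[x1, y1], [x2, y2], [x3, y3], [x4, y4]]
        xs = [float(p[0]) for p in bbox]
        ys = [float(p[1]) for p in bbox]
    elif len(bbox) == 8:
        # 8-point format: [x1, y1, x2, y2, x3, y3, x4, y4]
        xs = [float(v) for v in bbox[0::2]]
        ys = [float(v) for v in bbox[1::2]]
    elif len(bbox) == 4:
        # Flat format: [x0, y0, x1, y1]
        xs = [float(bbox[0]), float(bbox[2])]
        ys = [float(bbox[1]), float(bbox[3])]
    else:
        raise ValueError(f"unsupported bbox format: {bbox!r}")
    # The axis-aligned bounds cover every input point.
    return (min(xs), min(ys), max(xs), max(ys))
```

All three input shapes describing the same rectangle, for example `[[10, 20], [110, 20], [110, 60], [10, 60]]`, `[10, 20, 110, 60]`, and `[10, 20, 110, 20, 110, 60, 10, 60]`, normalize to the same tuple, which is what lets callers stop handling each format themselves.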