docs: archive enable-doc-orientation-detection proposal
Feature implementation completed and tested successfully.

- PP-StructureV3 orientation detection enabled
- Page dimensions correctly swapped for 90°/270° rotations
- Output PDF now displays landscape content correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -0,0 +1,80 @@
# ocr-processing Specification Delta

## ADDED Requirements

### Requirement: Document Orientation Detection

The system SHALL detect and correct document orientation for scanned PDFs where the content orientation differs from PDF page metadata.

#### Scenario: Portrait PDF with landscape content is corrected
- **GIVEN** a PDF with portrait page dimensions (width < height)
- **AND** the scanned content is rotated 90° (landscape scan in portrait page)
- **WHEN** PP-StructureV3 processes the image with `use_doc_orientation_classify=True`
- **THEN** the system SHALL detect rotation angle as "90" or "270"
- **AND** the output PDF page dimensions SHALL be swapped (width ↔ height)
- **AND** all text elements SHALL be correctly positioned in the rotated coordinate space

#### Scenario: Landscape PDF with portrait content is corrected
- **GIVEN** a PDF with landscape page dimensions (width > height)
- **AND** the scanned content is rotated 90° (portrait scan in landscape page)
- **WHEN** PP-StructureV3 processes the image
- **THEN** the system SHALL detect rotation angle as "90" or "270"
- **AND** the output PDF page dimensions SHALL be swapped
- **AND** all text elements SHALL be correctly positioned

#### Scenario: Upside-down content is corrected
- **GIVEN** a scanned document that is upside down (180° rotation)
- **WHEN** PP-StructureV3 processes the image
- **THEN** the system SHALL detect rotation angle as "180"
- **AND** page dimensions SHALL NOT be swapped (the page orientation is unchanged; the content is only flipped)
- **AND** text elements SHALL be correctly positioned after internal rotation

#### Scenario: Correctly oriented documents remain unchanged
- **GIVEN** a PDF where page metadata matches actual content orientation
- **WHEN** PP-StructureV3 processes the image
- **THEN** the system SHALL detect rotation angle as "0"
- **AND** page dimensions SHALL remain unchanged
- **AND** processing SHALL proceed normally without dimension adjustment
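
The four scenarios above reduce to one rule: swap width and height for "90" and "270", and leave them alone for "0" and "180". A minimal sketch of that rule follows; the function name `adjusted_page_size` is hypothetical and not taken from the codebase.

```python
def adjusted_page_size(width: float, height: float, rotation: str) -> tuple[float, float]:
    """Return the output page size for a detected rotation label.

    `rotation` is one of the string labels used in the scenarios above:
    "0", "90", "180", or "270".
    """
    if rotation in ("90", "270"):
        # Content is rotated a quarter turn, so the page axes swap.
        return height, width
    if rotation in ("0", "180"):
        # Upright or upside-down content keeps the same page shape.
        return width, height
    raise ValueError(f"unexpected rotation label: {rotation!r}")


# A4 portrait page (in points) whose scanned content is landscape.
print(adjusted_page_size(595, 842, "90"))
```

Rejecting unknown labels is a design choice made here for illustration; an implementation could equally fall back to the unchanged dimensions.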

#### Scenario: Rotation angle is captured from PP-StructureV3 results
- **GIVEN** PP-StructureV3 is configured with `use_doc_orientation_classify=True`
- **WHEN** processing completes
- **THEN** the system SHALL extract rotation angle from `doc_preprocessor_res.label_names`
- **AND** include `detected_rotation` in the OCR result metadata
- **AND** log the detected rotation for debugging
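
As an illustration of this scenario, the sketch below reads the rotation label from a plain dict shaped like the path named above (`doc_preprocessor_res.label_names`). The dict stands in for the real PP-StructureV3 result, whose exact structure may differ between PaddleOCR versions; the helper names and the fallback to "0" are assumptions, not part of the specification.

```python
import logging
from typing import Any, Mapping

logger = logging.getLogger(__name__)

VALID_ROTATIONS = {"0", "90", "180", "270"}


def extract_detected_rotation(result: Mapping[str, Any]) -> str:
    """Read the first label from result["doc_preprocessor_res"]["label_names"].

    Assumption: a missing or unrecognized label is treated as "0" (no rotation).
    """
    preprocessor = result.get("doc_preprocessor_res") or {}
    labels = preprocessor.get("label_names") or []
    rotation = str(labels[0]) if labels else "0"
    if rotation not in VALID_ROTATIONS:
        logger.warning("Unrecognized rotation label %r, assuming 0", rotation)
        rotation = "0"
    logger.debug("Detected document rotation: %s", rotation)
    return rotation


def build_ocr_metadata(result: Mapping[str, Any]) -> dict[str, str]:
    """Attach the detected rotation to the OCR result metadata."""
    return {"detected_rotation": extract_detected_rotation(result)}


sample = {"doc_preprocessor_res": {"label_names": ["90"]}}
print(build_ocr_metadata(sample))
```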

#### Scenario: Dimension adjustment happens before PDF generation
- **GIVEN** OCR processing detects rotation angle of "90" or "270"
- **WHEN** creating the UnifiedDocument for PDF generation
- **THEN** the Page dimensions SHALL use adjusted (swapped) width and height
- **AND** OCR coordinates SHALL be used directly (already in rotated space)
- **AND** no additional coordinate transformation SHALL be applied
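
To see why the swap has to happen before PDF generation, consider an OCR box that is valid in the rotated space but falls outside the original portrait page. The sketch below uses simplified stand-ins for the project's `Page` and element types; the field names and the `build_page` and `fits` helpers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    text: str
    bbox: tuple[float, float, float, float]  # x0, y0, x1, y1 as reported by OCR


@dataclass
class Page:
    width: float
    height: float
    elements: list[Element] = field(default_factory=list)


def build_page(orig_width: float, orig_height: float,
               detected_rotation: str, elements: list[Element]) -> Page:
    """Create a page whose size matches the space the OCR coordinates live in."""
    if detected_rotation in ("90", "270"):
        width, height = orig_height, orig_width
    else:
        width, height = orig_width, orig_height
    # Elements are copied as-is: no coordinate transformation is applied.
    return Page(width=width, height=height, elements=list(elements))


def fits(page: Page, element: Element) -> bool:
    """Check that an element's box lies inside the page bounds."""
    x0, y0, x1, y1 = element.bbox
    return 0 <= x0 <= x1 <= page.width and 0 <= y0 <= y1 <= page.height


box = Element("Invoice total", (700.0, 100.0, 800.0, 150.0))
swapped = build_page(595, 842, "90", [box])
unswapped = build_page(595, 842, "0", [box])
print(fits(swapped, box), fits(unswapped, box))
```

With the swapped 842 × 595 page the box is inside the bounds; on the original 595 × 842 page the same coordinates would run off the right edge.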

### Requirement: Orientation Detection Configuration

The system SHALL provide configuration for enabling/disabling document orientation detection.

#### Scenario: Orientation detection is enabled by default
- **GIVEN** default configuration settings
- **WHEN** OCR track processing runs
- **THEN** `use_doc_orientation_classify` SHALL be `True`
- **AND** PP-StructureV3 SHALL perform document orientation classification

#### Scenario: Orientation detection can be disabled
- **GIVEN** `use_doc_orientation_classify` is set to `False` in configuration
- **WHEN** OCR track processing runs
- **THEN** the system SHALL NOT perform orientation detection
- **AND** page dimensions SHALL be based on original image dimensions
- **AND** this setting SHALL preserve backward compatibility for controlled environments
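
The two configuration scenarios can be sketched with a settings object that defaults the flag to `True` and forwards it to the pipeline. The `OcrSettings` class and `structure_kwargs` helper are hypothetical; only the flag name `use_doc_orientation_classify` comes from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrSettings:
    # Enabled by default, per the "enabled by default" scenario.
    use_doc_orientation_classify: bool = True


def structure_kwargs(settings: OcrSettings) -> dict[str, bool]:
    """Build the keyword arguments passed on to the PP-StructureV3 pipeline."""
    return {"use_doc_orientation_classify": settings.use_doc_orientation_classify}


default_settings = OcrSettings()
legacy_settings = OcrSettings(use_doc_orientation_classify=False)

print(structure_kwargs(default_settings))
print(structure_kwargs(legacy_settings))
```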

## MODIFIED Requirements

### Requirement: Layout Model Selection (Modified)

The system SHALL apply document orientation detection before layout detection regardless of the selected layout model.

#### Scenario: Orientation detection works with all layout models
- **GIVEN** a user selects any layout model (chinese, default, cdla)
- **WHEN** OCR processing runs with `use_doc_orientation_classify=True`
- **THEN** orientation detection SHALL be applied regardless of layout model choice
- **AND** orientation detection SHALL run in Stage 1 (preprocessing), before layout detection (Stage 3)