Archived as 2025-11-27-upgrade-ppstructure-models Spec updated: ocr-processing (added PP-StructureV3 Configuration) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
ADDED Requirements
Requirement: PP-StructureV3 Configuration
The system SHALL configure PP-StructureV3 with the following settings:
Preprocessing (Stage 1):
- Document orientation classification MUST be enabled (use_doc_orientation_classify=True)
- Document unwarping MUST be enabled (use_doc_unwarping=True)
- Textline orientation detection MUST be enabled (use_textline_orientation=True)
Layout Detection (Stage 3):
- The "chinese" layout model option SHALL use PP-DocLayout_plus-L (83.2% mAP)
- The "default" layout model option SHALL use PubLayNet for English documents
- The "cdla" layout model option SHALL use picodet_lcnet_x1_0_fgd_layout_cdla
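The option-to-model mapping above can be sketched as a small lookup. This is only an illustration: the model names come from the requirement text, while `LAYOUT_MODELS` and `resolve_layout_model` are hypothetical names, not part of the system or of PaddleOCR.

```python
# Layout model names as listed in the requirement.
# The dict and function names are hypothetical, chosen for illustration.
LAYOUT_MODELS = {
    "chinese": "PP-DocLayout_plus-L",
    "default": "PubLayNet",
    "cdla": "picodet_lcnet_x1_0_fgd_layout_cdla",
}


def resolve_layout_model(option: str) -> str:
    """Return the layout model name for a user-facing layout option."""
    try:
        return LAYOUT_MODELS[option]
    except KeyError:
        # Reject unknown options explicitly instead of silently falling back.
        raise ValueError(f"unknown layout model option: {option!r}") from None
```

Raising on an unknown option is a design choice made for this sketch; the requirement does not say how unknown options are handled.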
Element Recognition (Stage 4):
- Table structure recognition SHALL use SLANeXt_wired and SLANeXt_wireless models (69.65% combined accuracy)
- Formula recognition SHALL use PP-FormulaNet_plus-L (92.22% English, 90.64% Chinese BLEU)
- Chart parsing SHALL use PP-Chart2Table
- Seal recognition SHALL use PP-OCRv4_seal
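As a rough sketch, the Stage 1 and Stage 4 settings could be gathered into one keyword-argument dict. The three `use_*` flags are named exactly as in the requirement. The `*_model_name` keys are assumptions about how the pipeline constructor names its parameters and should be checked against the installed PaddleOCR version. `build_ppstructure_kwargs` is a hypothetical helper.

```python
def build_ppstructure_kwargs() -> dict:
    """Collect the required PP-StructureV3 settings as keyword arguments."""
    return {
        # Stage 1: preprocessing flags, named as in the requirement.
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": True,
        "use_textline_orientation": True,
        # Stage 4: model names from the requirement.
        # ASSUMPTION: the keyword names below match the pipeline constructor.
        "wired_table_structure_recognition_model_name": "SLANeXt_wired",
        "wireless_table_structure_recognition_model_name": "SLANeXt_wireless",
        "formula_recognition_model_name": "PP-FormulaNet_plus-L",
        "chart_recognition_model_name": "PP-Chart2Table",
        # Seal recognition is left out on purpose: the requirement names
        # PP-OCRv4_seal without giving an exact model identifier.
    }
```

If the assumed keyword names hold, the dict could be passed to the pipeline as `PPStructureV3(**build_ppstructure_kwargs())`. Keeping the settings in a plain dict also makes them easy to assert against in tests without loading any model.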
Scenario: Processing rotated scanned document
- WHEN a PDF document with rotated pages is processed using the OCR track
- THEN the system SHALL automatically detect and correct the orientation before OCR processing
Scenario: Processing complex Chinese document with tables
- WHEN a Chinese document containing tables, images, and formulas is processed
- AND the user selects "chinese" layout model
- THEN the system SHALL use PP-DocLayout_plus-L for layout detection (83.2% mAP)
- AND the system SHALL correctly identify table regions
Scenario: Table structure recognition with wired tables
- WHEN a document contains wired (bordered) tables
- THEN the system SHALL use SLANeXt_wired model for structure recognition
- AND output correct HTML table structure with proper row/column spanning
Scenario: Table structure recognition with wireless tables
- WHEN a document contains wireless (borderless) tables
- THEN the system SHALL use SLANeXt_wireless model for structure recognition
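The two table scenarios above amount to a selection between two models. A minimal sketch follows, assuming the caller already knows whether the table is bordered. How that is decided (for example by a table classification step) is outside this example, and `select_table_model` is a hypothetical name.

```python
# Table structure models named in the two scenarios.
TABLE_STRUCTURE_MODELS = {
    "wired": "SLANeXt_wired",        # bordered tables
    "wireless": "SLANeXt_wireless",  # borderless tables
}


def select_table_model(has_borders: bool) -> str:
    """Pick the structure recognition model for a table.

    ASSUMPTION: `has_borders` is supplied by an earlier step; this sketch
    does not detect borders itself.
    """
    return TABLE_STRUCTURE_MODELS["wired" if has_borders else "wireless"]
```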
Scenario: Chinese formula recognition
- WHEN a document contains mathematical formulas with Chinese characters
- THEN the system SHALL use PP-FormulaNet_plus-L for recognition
- AND output LaTeX code with correct Chinese character representation
ADDED Requirements
Requirement: Model Cache Cleanup
The system SHALL provide documentation for cleaning up unused model caches to optimize storage space.
Scenario: User wants to free disk space after model upgrade
- WHEN the user has upgraded from older models (PP-DocLayout-S, SLANet) to newer models
- THEN the documentation SHALL explain how to delete unused cached models from ~/.paddlex/official_models/
- AND list which model directories can be safely removed
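A cleanup helper for the documentation might look like the sketch below. It assumes each cached model lives in its own subdirectory of ~/.paddlex/official_models/, named after the model. The function names and the `keep` set are hypothetical. Which directories are actually safe to remove must come from the documentation itself, so the helper defaults to a dry run.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Cache location named in the scenario.
DEFAULT_CACHE = Path.home() / ".paddlex" / "official_models"


def find_unused_model_dirs(cache_root: Path, keep: set[str]) -> list[Path]:
    """List model directories under cache_root whose names are not in keep."""
    if not cache_root.is_dir():
        return []
    return sorted(
        p for p in cache_root.iterdir() if p.is_dir() and p.name not in keep
    )


def remove_model_dirs(dirs: list[Path], dry_run: bool = True) -> list[str]:
    """Delete the given directories, or only report them when dry_run is True."""
    names = []
    for d in dirs:
        if not dry_run:
            shutil.rmtree(d)
        names.append(d.name)
    return names
```

A typical use would call `find_unused_model_dirs(DEFAULT_CACHE, keep={...})` with the model names currently in use, review the returned list, and only then call `remove_model_dirs(..., dry_run=False)`.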