OCR/openspec/changes/upgrade-ppstructure-models/tasks.md
egg 6235280c45 feat: upgrade PP-StructureV3 models to latest versions
- Layout: PP-DocLayout-S → PP-DocLayout_plus-L (83.2% mAP)
- Table: Single model → Dual SLANeXt (wired/wireless)
- Formula: PP-FormulaNet_plus-L for enhanced recognition
- Add preprocessing flags support (orientation, unwarping)
- Update frontend i18n descriptions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 14:22:06 +08:00

Tasks: Upgrade PP-StructureV3 Models

1. Backend Configuration Changes

  • 1.1 Update backend/app/core/config.py - Enable preprocessing flags

    • Set use_doc_orientation_classify default to True
    • Set use_doc_unwarping default to True
    • Set use_textline_orientation default to True
    • Add table_structure_model_name configuration
    • Add formula_recognition_model_name configuration
  • 1.2 Update backend/app/services/ocr_service.py - Model mapping changes

    • Update LAYOUT_MODEL_MAPPING:
      • Change "chinese" from "PP-DocLayout-S" to "PP-DocLayout_plus-L"
      • Keep "default" as PubLayNet
      • Keep "cdla" as is
    • Update _ensure_structure_engine():
      • Pass preprocessing flags to PPStructureV3
      • Configure SLANeXt models for table recognition
      • Configure PP-FormulaNet_plus-L for formula recognition
  • 1.3 Update PPStructureV3 initialization kwargs

    • Add table_structure_model_name="SLANeXt_wired" (or configure both SLANeXt models, wired and wireless)
    • Add formula_recognition_model_name="PP-FormulaNet_plus-L"
    • Verify preprocessing flags are passed correctly
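
The configuration items in 1.1 to 1.3 could be wired together roughly as follows. This is a minimal sketch, not the project's code: `StructureSettings` and `build_structure_kwargs` are hypothetical names, a plain dataclass stands in for whatever settings class config.py actually uses, and the keyword names (those from the task list plus an assumed `layout_detection_model_name`) should be checked against the PPStructureV3 signature of the installed PaddleOCR version.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StructureSettings:
    """Hypothetical stand-in for the settings in backend/app/core/config.py."""

    use_doc_orientation_classify: bool = True
    use_doc_unwarping: bool = True
    use_textline_orientation: bool = True
    table_structure_model_name: str = "SLANeXt_wired"
    formula_recognition_model_name: str = "PP-FormulaNet_plus-L"


# Only the "chinese" entry changes in this task list. The "default" and "cdla"
# entries stay as they are in ocr_service.py and are omitted from this sketch.
LAYOUT_MODEL_MAPPING = {"chinese": "PP-DocLayout_plus-L"}


def build_structure_kwargs(settings, layout_model):
    """Collect the keyword arguments intended for the structure engine."""
    kwargs = asdict(settings)
    model_name = LAYOUT_MODEL_MAPPING.get(layout_model)
    if model_name is not None:
        # Assumed keyword name; verify it against the installed PaddleOCR version.
        kwargs["layout_detection_model_name"] = model_name
    return kwargs
```

Under these assumptions, `_ensure_structure_engine()` would pass the resulting dict on as `PPStructureV3(**kwargs)`, which keeps the defaults in one place and makes the flags easy to assert on in tests.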

2. Schema Updates

  • 2.1 Update backend/app/schemas/task.py - LayoutModelEnum
    • Rename CHINESE, or update its description, to reflect PP-DocLayout_plus-L
    • Update docstrings to reflect new model capabilities
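
One possible shape for the updated enum is sketched below. The member values come from the mapping keys in 1.2; the member names other than CHINESE, the `LAYOUT_MODEL_DESCRIPTIONS` dict, and the description wording are assumptions made for illustration, and the real schema in task.py may be organized differently.

```python
from enum import Enum


class LayoutModelEnum(str, Enum):
    """Layout detection model choices (member names partly assumed)."""

    CHINESE = "chinese"  # now backed by PP-DocLayout_plus-L
    DEFAULT = "default"  # unchanged (PubLayNet)
    CDLA = "cdla"        # unchanged


# Hypothetical description table; the wording is illustrative only.
LAYOUT_MODEL_DESCRIPTIONS = {
    LayoutModelEnum.CHINESE: "PP-DocLayout_plus-L layout detection (83.2% mAP)",
    LayoutModelEnum.DEFAULT: "PubLayNet layout detection (unchanged)",
    LayoutModelEnum.CDLA: "CDLA layout detection (unchanged)",
}
```

Keeping the enum value "chinese" unchanged means existing API clients keep working; only the description and the model behind it change.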

3. Frontend Updates

  • 3.1 Update frontend/src/components/LayoutModelSelector.tsx

    • Update Chinese option description to mention PP-DocLayout_plus-L
    • Update accuracy information displayed to users
  • 3.2 Update frontend/src/i18n/locales/zh-TW.json

    • Update layoutModel.chinese.description to reflect new model
    • Update any accuracy percentages in descriptions

4. Testing

  • 4.1 Create unit tests for new model configuration

    • Test preprocessing flags are correctly passed
    • Test model mapping resolves correctly
    • Test engine initialization with new models
  • 4.2 Integration testing with real documents

    • Test rotated document handling (preprocessing)
    • Test complex Chinese document layout detection
    • Test table structure recognition accuracy
    • Test formula recognition with Chinese formulas
  • 4.3 Update existing tests

    • Update backend/tests/services/test_layout_model.py for new mapping
    • Update backend/tests/api/test_layout_model_api.py if needed
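
The "flags are correctly passed" check in 4.1 can be written without loading any model by substituting a mock for the engine class. The sketch below assumes a hypothetical `create_engine` helper that forwards keyword arguments to the engine constructor; in the real code the equivalent logic lives in `_ensure_structure_engine()`, and the test would patch PPStructureV3 where ocr_service.py imports it.

```python
from unittest.mock import MagicMock

PREPROCESSING_FLAGS = (
    "use_doc_orientation_classify",
    "use_doc_unwarping",
    "use_textline_orientation",
)


def create_engine(engine_cls, **kwargs):
    """Hypothetical stand-in for _ensure_structure_engine(): forwards kwargs."""
    return engine_cls(**kwargs)


def test_preprocessing_flags_are_passed():
    # The mock records the constructor call instead of loading real models.
    fake_engine_cls = MagicMock(name="PPStructureV3")
    flags = {name: True for name in PREPROCESSING_FLAGS}

    create_engine(fake_engine_cls, **flags)

    fake_engine_cls.assert_called_once_with(**flags)
```

Because nothing is downloaded or initialized, a test like this runs in milliseconds and can sit alongside the slower integration tests from 4.2.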

5. Documentation

  • 5.1 Create model cleanup documentation

    • Document ~/.paddlex/official_models/ cache location
    • List models that can be safely deleted after upgrade
    • Provide cleanup script/commands
    • See: MODEL_CLEANUP.md
  • 5.2 Update API documentation

    • Document preprocessing feature behavior
    • Update layout model descriptions
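
For the cleanup commands mentioned in 5.1, a small script along these lines may be enough. It is a sketch under assumptions: the function names are hypothetical, it assumes each cached model sits in its own directory under ~/.paddlex/official_models/, and the list of obsolete names must come from MODEL_CLEANUP.md (the "PP-DocLayout-S" entry in the usage note is only an example). It defaults to a dry run so that nothing is deleted by accident.

```python
import shutil
from pathlib import Path

# Cache location named in task 5.1.
CACHE_DIR = Path.home() / ".paddlex" / "official_models"


def find_obsolete_models(cache_dir, obsolete_names):
    """Return cached model directories whose names are listed as obsolete."""
    if not cache_dir.is_dir():
        return []
    return sorted(
        path
        for path in cache_dir.iterdir()
        if path.is_dir() and path.name in obsolete_names
    )


def remove_models(paths, dry_run=True):
    """Delete the given directories, or only report them when dry_run is True."""
    for path in paths:
        print(("would remove " if dry_run else "removing ") + str(path))
        if not dry_run:
            shutil.rmtree(path)
```

A first pass could call `remove_models(find_obsolete_models(CACHE_DIR, {"PP-DocLayout-S"}))` to review the output, then repeat with `dry_run=False` once the list has been confirmed. Deleting by an explicit list, rather than keeping only the new models, avoids removing OCR models that the pipeline still needs.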

6. Verification & Deployment

  • 6.1 Verify new models download correctly on first use
  • 6.2 Measure memory/GPU usage with new models
  • 6.3 Compare processing speed before/after upgrade
  • 6.4 Verify existing functionality is not broken
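
The speed comparison in 6.3 could use a small timing harness such as the one below. It is a generic sketch with hypothetical function names: `before` and `after` are any callables that process one document (for example, wrappers around the old and new engine), and only wall-clock time is measured. Memory and GPU usage from 6.2 need separate tooling, since Python-level timers do not see native or device allocations.

```python
import statistics
import time


def time_calls(func, inputs, repeats=3):
    """Return the median wall-clock seconds per input over all runs."""
    samples = []
    for _ in range(repeats):
        for item in inputs:
            start = time.perf_counter()
            func(item)
            samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(before, after, inputs, repeats=3):
    """Time both callables on the same inputs and report the slowdown ratio."""
    old = time_calls(before, inputs, repeats)
    new = time_calls(after, inputs, repeats)
    ratio = new / old if old > 0 else float("inf")
    return {"before_s": old, "after_s": new, "ratio": ratio}
```

A ratio above 1.0 means the new configuration is slower per document. The first call to each engine typically includes model loading or download time, so a warm-up call before measuring gives more representative numbers.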