OCR/openspec/changes/upgrade-ppstructure-models/tasks.md
egg 6235280c45 feat: upgrade PP-StructureV3 models to latest versions
- Layout: PP-DocLayout-S → PP-DocLayout_plus-L (83.2% mAP)
- Table: Single model → Dual SLANeXt (wired/wireless)
- Formula: PP-FormulaNet_plus-L for enhanced recognition
- Add preprocessing flags support (orientation, unwarping)
- Update frontend i18n descriptions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 14:22:06 +08:00

# Tasks: Upgrade PP-StructureV3 Models
## 1. Backend Configuration Changes
- [x] 1.1 Update `backend/app/core/config.py` - Enable preprocessing flags
  - Set `use_doc_orientation_classify` default to True
  - Set `use_doc_unwarping` default to True
  - Set `use_textline_orientation` default to True
  - Add `table_structure_model_name` configuration
  - Add `formula_recognition_model_name` configuration
- [x] 1.2 Update `backend/app/services/ocr_service.py` - Model mapping changes
  - Update `LAYOUT_MODEL_MAPPING`:
    - Change `"chinese"` from `"PP-DocLayout-S"` to `"PP-DocLayout_plus-L"`
    - Keep `"default"` as PubLayNet
    - Keep `"cdla"` as is
  - Update `_ensure_structure_engine()`:
    - Pass preprocessing flags to PPStructureV3
    - Configure SLANeXt models for table recognition
    - Configure PP-FormulaNet_plus-L for formula recognition
- [x] 1.3 Update PPStructureV3 initialization kwargs
  - Add `table_structure_model_name="SLANeXt_wired"` (or configure dual model)
  - Add `formula_recognition_model_name="PP-FormulaNet_plus-L"`
  - Verify preprocessing flags are passed correctly
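
The configuration and mapping changes in this section can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `StructureSettings` and `build_structure_kwargs` are hypothetical names, the `layout_detection_model_name` key is an assumption, and the remaining keyword names are copied from the tasks above and should be checked against the installed PaddleOCR version before being passed to `PPStructureV3`.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Only the entry changed by this task is shown; "default" and "cdla" keep
# their existing values and are omitted here.
LAYOUT_MODEL_MAPPING: Dict[str, str] = {
    "chinese": "PP-DocLayout_plus-L",  # previously "PP-DocLayout-S"
}


@dataclass(frozen=True)
class StructureSettings:
    """Hypothetical stand-in for the settings in backend/app/core/config.py."""

    use_doc_orientation_classify: bool = True
    use_doc_unwarping: bool = True
    use_textline_orientation: bool = True
    table_structure_model_name: str = "SLANeXt_wired"
    formula_recognition_model_name: str = "PP-FormulaNet_plus-L"


def build_structure_kwargs(settings: StructureSettings, layout_model: str) -> Dict[str, Any]:
    """Collect the keyword arguments intended for the PPStructureV3 constructor."""
    kwargs: Dict[str, Any] = {
        "use_doc_orientation_classify": settings.use_doc_orientation_classify,
        "use_doc_unwarping": settings.use_doc_unwarping,
        "use_textline_orientation": settings.use_textline_orientation,
        "table_structure_model_name": settings.table_structure_model_name,
        "formula_recognition_model_name": settings.formula_recognition_model_name,
    }
    # Assumed kwarg name; it is only set when the mapping knows the requested model.
    layout_name = LAYOUT_MODEL_MAPPING.get(layout_model)
    if layout_name is not None:
        kwargs["layout_detection_model_name"] = layout_name
    return kwargs
```

Keeping the kwargs assembly in a pure function like this makes task 1.3 ("verify preprocessing flags are passed correctly") checkable without loading any model weights.
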
## 2. Schema Updates
- [x] 2.1 Update `backend/app/schemas/task.py` - LayoutModelEnum
  - Rename or update `CHINESE` description to reflect PP-DocLayout_plus-L
  - Update docstrings to reflect new model capabilities
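
One possible shape for the updated enum is sketched below. The member set is an assumption inferred from the `LAYOUT_MODEL_MAPPING` keys named in section 1, and `LAYOUT_MODEL_DESCRIPTIONS` is a hypothetical helper, not something the schema is known to contain.

```python
from enum import Enum
from typing import Dict


class LayoutModelEnum(str, Enum):
    """Layout detection model choices (member set assumed from the mapping keys).

    CHINESE now refers to PP-DocLayout_plus-L instead of PP-DocLayout-S.
    """

    CHINESE = "chinese"
    DEFAULT = "default"
    CDLA = "cdla"


# Hypothetical description table; only the entry touched by this change is listed.
LAYOUT_MODEL_DESCRIPTIONS: Dict[LayoutModelEnum, str] = {
    LayoutModelEnum.CHINESE: "PP-DocLayout_plus-L (83.2% mAP)",
}
```
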
## 3. Frontend Updates
- [x] 3.1 Update `frontend/src/components/LayoutModelSelector.tsx`
  - Update Chinese option description to mention PP-DocLayout_plus-L
  - Update accuracy information displayed to users
- [x] 3.2 Update `frontend/src/i18n/locales/zh-TW.json`
  - Update `layoutModel.chinese.description` to reflect new model
  - Update any accuracy percentages in descriptions
## 4. Testing
- [x] 4.1 Create unit tests for new model configuration
  - Test preprocessing flags are correctly passed
  - Test model mapping resolves correctly
  - Test engine initialization with new models
- [ ] 4.2 Integration testing with real documents
  - Test rotated document handling (preprocessing)
  - Test complex Chinese document layout detection
  - Test table structure recognition accuracy
  - Test formula recognition with Chinese formulas
- [x] 4.3 Update existing tests
  - Update `backend/tests/services/test_layout_model.py` for new mapping
  - Update `backend/tests/api/test_layout_model_api.py` if needed
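
A unit test for "preprocessing flags are correctly passed" could follow the pattern below: the engine constructor is replaced by a mock, so no model is downloaded. `ensure_structure_engine` here is a simplified, hypothetical stand-in for the service's `_ensure_structure_engine()`, whose real signature is not shown in this document.

```python
from typing import Any, Callable, Dict
from unittest.mock import MagicMock


def ensure_structure_engine(engine_factory: Callable[..., Any], flags: Dict[str, Any]) -> Any:
    """Simplified stand-in: build the engine by forwarding the flags as kwargs."""
    return engine_factory(**flags)


def test_preprocessing_flags_are_passed() -> None:
    # The mock plays the role of the PPStructureV3 class.
    factory = MagicMock(name="PPStructureV3")
    flags = {
        "use_doc_orientation_classify": True,
        "use_doc_unwarping": True,
        "use_textline_orientation": True,
    }

    engine = ensure_structure_engine(factory, flags)

    # The constructor must be called exactly once with exactly these flags.
    factory.assert_called_once_with(**flags)
    assert engine is factory.return_value
```

In the real test suite the same assertion would typically be made by patching the `PPStructureV3` name where `ocr_service.py` imports it.
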
## 5. Documentation
- [x] 5.1 Create model cleanup documentation
  - Document `~/.paddlex/official_models/` cache location
  - List models that can be safely deleted after upgrade
  - Provide cleanup script/commands
  - See: [MODEL_CLEANUP.md](./MODEL_CLEANUP.md)
- [x] 5.2 Update API documentation
  - Document preprocessing feature behavior
  - Update layout model descriptions
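
A cleanup helper along the lines of task 5.1 might look like the sketch below. It assumes each cached model lives in its own directory under `~/.paddlex/official_models/` named after the model. The function name is hypothetical, and the authoritative list of models that are safe to delete is the one in MODEL_CLEANUP.md, not anything shown here.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

DEFAULT_CACHE_DIR = Path.home() / ".paddlex" / "official_models"


def remove_stale_models(
    stale_names: Iterable[str],
    cache_dir: Path = DEFAULT_CACHE_DIR,
    dry_run: bool = True,
) -> List[Path]:
    """Return cached model directories matching stale_names; delete them unless dry_run."""
    found = [cache_dir / name for name in stale_names if (cache_dir / name).is_dir()]
    if not dry_run:
        for path in found:
            shutil.rmtree(path)
    return found
```

Defaulting to `dry_run=True` means a call such as `remove_stale_models(["PP-DocLayout-S"])` only reports what would be removed; deletion has to be requested explicitly.
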
## 6. Verification & Deployment
- [ ] 6.1 Verify new models download correctly on first use
- [ ] 6.2 Measure memory/GPU usage with new models
- [ ] 6.3 Compare processing speed before/after upgrade
- [ ] 6.4 Verify existing functionality is not broken
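
For the before/after comparison in 6.2 and 6.3, a small timing harness such as the following could be used. It is a sketch under stated assumptions: `measure` is a hypothetical helper, and `tracemalloc` only sees Python-level allocations, so native and GPU memory used by the models would need a separate tool.

```python
import time
import tracemalloc
from typing import Callable, Dict


def measure(run: Callable[[], object], repeats: int = 3) -> Dict[str, float]:
    """Time a callable over several runs and record peak Python heap usage."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    tracemalloc.start()
    try:
        for _ in range(repeats):
            start = time.perf_counter()
            run()
            durations.append(time.perf_counter() - start)
        # Peak size of Python allocations traced since tracemalloc.start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return {
        "best_seconds": min(durations),
        "mean_seconds": sum(durations) / len(durations),
        "peak_python_bytes": float(peak),
    }
```

Running it once with the old model configuration and once with the new one, on the same document set, gives comparable numbers; the first run after an upgrade also includes model download time and is best discarded.
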