OCR/openspec/changes/archive/2025-11-27-simplify-ppstructure-model-selection/tasks.md
egg 59206a6ab8 feat: simplify layout model selection and archive proposals
Changes:
- Replace PP-Structure 7-slider parameter UI with simple 3-option layout model selector
- Add layout model mapping: chinese (PP-DocLayout-S), default (PubLayNet), cdla
- Add LayoutModelSelector component and zh-TW translations
- Fix "default" model behavior with sentinel value for PubLayNet
- Add gap filling service for OCR track coverage improvement
- Add PP-Structure debug utilities
- Archive completed/incomplete proposals:
  - add-ocr-track-gap-filling (complete)
  - fix-ocr-track-table-rendering (incomplete)
  - simplify-ppstructure-model-selection (22/25 tasks)
- Add new layout model tests, archive old PP-Structure param tests
- Update OpenSpec ocr-processing spec with layout model requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 13:27:00 +08:00


Implementation Tasks

1. Backend API Changes

  • 1.1 Update app/schemas/task.py to add layout_model enum type
  • 1.2 Update app/routers/tasks.py to replace pp_structure_params with layout_model parameter
  • 1.3 Update app/services/ocr_service.py to map layout_model to layout_detection_model_name
  • 1.4 Remove custom PP-Structure engine creation logic (use model selection instead)
  • 1.5 Add backward compatibility: default to "chinese" when the request specifies no layout_model
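
Tasks 1.1, 1.3 and 1.5 can be sketched roughly as follows. This is a minimal illustration, not the code in app/schemas/task.py or app/services/ocr_service.py: the names `LAYOUT_MODEL_NAMES`, `resolve_layout_model` and `layout_detection_model_name_for` are hypothetical, the CDLA model name is left as a placeholder because the source does not state it, and using `None` as the sentinel for "default" (PubLayNet) is an assumption about how the sentinel is represented.

```python
from enum import Enum
from typing import Dict, Optional


class LayoutModel(str, Enum):
    """Layout model options exposed through the API (task 1.1)."""

    DEFAULT = "default"
    CHINESE = "chinese"
    CDLA = "cdla"


# Assumption: None acts as the sentinel meaning "do not override the engine's
# built-in layout model", which the commit message associates with PubLayNet.
USE_ENGINE_DEFAULT: Optional[str] = None

# Hypothetical mapping table (task 1.3). Only "PP-DocLayout-S" is named in the
# source; the CDLA entry is a placeholder, not a real model identifier.
LAYOUT_MODEL_NAMES: Dict[LayoutModel, Optional[str]] = {
    LayoutModel.CHINESE: "PP-DocLayout-S",
    LayoutModel.DEFAULT: USE_ENGINE_DEFAULT,
    LayoutModel.CDLA: "<cdla-model-name>",
}


def resolve_layout_model(value: Optional[str]) -> LayoutModel:
    """Return the requested model, falling back to "chinese" (task 1.5)."""
    if value is None:
        return LayoutModel.CHINESE
    # Enum lookup by value raises ValueError for anything outside the 3 options.
    return LayoutModel(value)


def layout_detection_model_name_for(value: Optional[str]) -> Optional[str]:
    """Map an API-level layout_model to a layout_detection_model_name."""
    return LAYOUT_MODEL_NAMES[resolve_layout_model(value)]
```

Keeping the mapping in a single table means a new option only needs one enum member and one dictionary entry, and an unknown value fails early at the enum lookup instead of reaching the OCR engine.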

2. Backend Configuration

  • 2.1 Keep layout_detection_model_name in config.py as fallback default
  • 2.2 Keep fine-tuning parameters in config.py (not exposed to API)
  • 2.3 Document available layout models in config comments

3. Frontend Changes

  • 3.1 Remove PPStructureParams.tsx component
  • 3.2 Update src/types/apiV2.ts:
    • Remove PPStructureV3Params interface
    • Add LayoutModel type: "default" | "chinese" | "cdla"
    • Update ProcessingOptions to use layout_model instead of pp_structure_params
  • 3.3 Create LayoutModelSelector.tsx component with:
    • Radio buttons or dropdown for model selection
    • Clear descriptions for each model option
    • Default selection: "chinese"
  • 3.4 Update task start form to use new LayoutModelSelector
  • 3.5 Update API calls to send layout_model instead of pp_structure_params

4. Internationalization

  • 4.1 Add i18n strings for layout model options:
    • layoutModel.default: "Standard Model (English documents)"
    • layoutModel.chinese: "Chinese Document Model (Recommended)"
    • layoutModel.cdla: "CDLA Model (Chinese layout analysis)"
  • 4.2 Add i18n strings for model descriptions

5. Testing

  • 5.1 Create new tests for layout_model parameter (test_layout_model_api.py, test_layout_model.py)
  • 5.2 Archive tests for pp_structure_params validation (moved to tests/archived/)
  • 5.3 Add tests for layout model selection (19 tests passing)
  • 5.4 Test backward compatibility (no model specified → use chinese default)
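
The backward-compatibility check in 5.4 might look something like the sketch below. It does not reproduce test_layout_model_api.py: `StartTaskOptions` and `parse_options` are hypothetical stand-ins for the real request schema, and silently ignoring a legacy `pp_structure_params` field is an assumption made only for this example.

```python
import unittest
from dataclasses import dataclass

VALID_LAYOUT_MODELS = ("default", "chinese", "cdla")


@dataclass
class StartTaskOptions:
    """Hypothetical stand-in for the task start request body."""

    layout_model: str = "chinese"  # default when the client sends nothing

    def __post_init__(self) -> None:
        if self.layout_model not in VALID_LAYOUT_MODELS:
            raise ValueError(f"unknown layout_model: {self.layout_model!r}")


def parse_options(payload: dict) -> StartTaskOptions:
    """Build options from a request payload, keeping only layout_model."""
    # Assumption: a legacy pp_structure_params field is dropped, not rejected.
    known = {k: v for k, v in payload.items() if k == "layout_model"}
    return StartTaskOptions(**known)


class LayoutModelBackwardCompatTest(unittest.TestCase):
    def test_missing_model_defaults_to_chinese(self):
        self.assertEqual(parse_options({}).layout_model, "chinese")

    def test_legacy_payload_falls_back_to_default(self):
        legacy = {"pp_structure_params": {"some_legacy_key": 0.5}}
        self.assertEqual(parse_options(legacy).layout_model, "chinese")

    def test_explicit_model_is_preserved(self):
        self.assertEqual(parse_options({"layout_model": "cdla"}).layout_model, "cdla")

    def test_unknown_model_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_options({"layout_model": "bogus"})
```

Saved as a module, the cases can be run with `python -m unittest`.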

6. Documentation

  • 6.1 Update API documentation for task start endpoint
  • 6.2 Remove PP-Structure parameter documentation
  • 6.3 Add layout model selection documentation

7. Cleanup

  • 7.1 Remove localStorage keys for PP-Structure params (pp_structure_params_presets, pp_structure_params_last_used)
  • 7.2 Remove any unused imports/types related to PP-Structure params
  • 7.3 Archive old PP-Structure params test files