# Implementation Tasks

## 1. Backend Schema (✅ COMPLETED)

- [x] 1.1 Define `PPStructureV3Params` schema in `backend/app/schemas/task.py`
  - [x] Add 7 parameter fields with validation
  - [x] Set appropriate constraints (ge, le, gt, pattern)
  - [x] Add descriptive documentation
- [x] 1.2 Update `ProcessingOptions` schema
  - [x] Add optional `pp_structure_params` field
  - [x] Ensure backward compatibility

## 2. Backend OCR Service Implementation

- [x] 2.1 Modify `backend/app/services/ocr_service.py`
  - [x] Update `_ensure_structure_engine()` method signature
    - [x] Add `custom_params: Optional[Dict[str, Any]] = None` parameter
    - [x] Implement parameter priority logic (custom > settings)
    - [x] Conditional caching (skip cache for custom params)
  - [x] Update `process_image()` method
    - [x] Add `pp_structure_params` parameter
    - [x] Pass params to `_ensure_structure_engine()`
  - [x] Update `process_with_dual_track()` method
    - [x] Add `pp_structure_params` parameter
    - [x] Forward params to OCR track processing
  - [x] Update main `process()` method
    - [x] Add `pp_structure_params` parameter
    - [x] Ensure params flow through all code paths
- [x] 2.2 Add parameter logging
  - [x] Log when custom params are used
  - [x] Log parameter values for debugging
  - [x] Add performance metrics for custom vs default

## 3. Backend API Endpoint Updates

- [x] 3.1 Modify `backend/app/routers/tasks.py`
  - [x] Update `start_task` endpoint
    - [x] Accept `ProcessingOptions` as request body (not query params)
    - [x] Extract `pp_structure_params` from options
    - [x] Convert to dict using `model_dump(exclude_none=True)`
    - [x] Pass to OCR service
  - [x] Update `analyze_document` endpoint (if needed)
    - [x] Support PP-StructureV3 params for analysis
- [x] 3.2 Update API documentation
  - [x] Add OpenAPI schema for new parameters
  - [x] Include parameter descriptions and ranges

## 4. Frontend TypeScript Types

- [x] 4.1 Update `frontend/src/types/apiV2.ts`
  - [x] Define `PPStructureV3Params` interface

    ```typescript
    export interface PPStructureV3Params {
      layout_detection_threshold?: number
      layout_nms_threshold?: number
      layout_merge_bboxes_mode?: 'union' | 'large' | 'small'
      layout_unclip_ratio?: number
      text_det_thresh?: number
      text_det_box_thresh?: number
      text_det_unclip_ratio?: number
    }
    ```

  - [x] Update `ProcessingOptions` interface
    - [x] Add `pp_structure_params?: PPStructureV3Params`

## 5. Frontend API Client Updates

- [x] 5.1 Modify `frontend/src/services/apiV2.ts`
  - [x] Update `startTask()` method
    - [x] Change from query params to request body
    - [x] Send full `ProcessingOptions` object

    ```typescript
    async startTask(taskId: string, options?: ProcessingOptions): Promise {
      const response = await this.client.post(
        `/tasks/${taskId}/start`,
        options // Send as body, not query params
      )
      return response.data
    }
    ```

## 6. Frontend UI Implementation

- [x] 6.1 Create parameter adjustment component
  - [x] Create `frontend/src/components/PPStructureParams.tsx`
  - [x] Slider components for numeric parameters
  - [x] Select dropdown for merge mode
  - [x] Help tooltips for each parameter
  - [x] Reset to defaults button
- [x] 6.2 Add preset configurations
  - [x] Default mode (use backend defaults)
  - [x] High Quality mode (lower thresholds)
  - [x] Fast mode (higher thresholds)
  - [x] Custom mode (show all sliders)
- [x] 6.3 Integrate into task processing flow
  - [x] Add to `ProcessingPage.tsx`
  - [x] Show only when task is pending
  - [x] Store params in component state
  - [x] Pass params to `startTask()` API call

## 7. Frontend UI/UX Polish

- [x] 7.1 Add visual feedback
  - [x] Loading state while processing with custom params
  - [x] Success/error notifications with save confirmation
  - [x] Parameter value display (current vs default with highlight)
- [x] 7.2 Add parameter persistence
  - [x] Save last used params to localStorage (auto-save on change)
  - [x] Create preset configurations (default, high-quality, fast)
  - [x] Import/export parameter configurations (JSON format)
- [x] 7.3 Add help documentation
  - [x] Inline help text for each parameter with tooltips
  - [x] Descriptive labels explaining parameter effects
  - [x] Info panel explaining OCR track requirement

## 8. Testing

- [x] 8.1 Backend unit tests
  - [x] Test schema validation (min/max, types, patterns)
  - [x] Test parameter passing through service layers
  - [x] Test caching behavior with custom params (no caching)
  - [x] Test parameter priority (custom > settings)
  - [x] Test fallback to defaults on error
  - [x] Test parameter flow through processing pipeline
  - [x] Test logging of custom parameters
- [x] 8.2 API integration tests
  - [x] Test endpoint with various parameter combinations
  - [x] Test backward compatibility (no params)
  - [x] Test validation errors for invalid params (422 responses)
  - [x] Test partial parameter sets
  - [x] Test OpenAPI schema documentation
  - [x] Test parameter serialization/deserialization
- [ ] 8.3 Frontend component tests
  - [ ] Test slider value changes
  - [ ] Test preset selection
  - [ ] Test API call generation
- [ ] 8.4 End-to-end tests
  - [ ] Upload document → adjust params → process → verify results
  - [ ] Test with different document types
  - [ ] Compare results: default vs custom params
- [ ] 8.5 Performance tests
  - [ ] Ensure no memory leaks with custom params
  - [ ] Verify engine cleanup after processing
  - [ ] Benchmark processing time impact

## 9. Documentation

- [ ] 9.1 Update API documentation
  - [ ] Document new request body format
  - [ ] Add parameter reference guide
  - [ ] Include example requests
- [ ] 9.2 Create user guide
  - [ ] When to adjust each parameter
  - [ ] Common scenarios and recommended settings
  - [ ] Troubleshooting guide
- [ ] 9.3 Update README
  - [ ] Add feature description
  - [ ] Include screenshots of UI
  - [ ] Add configuration examples

## 10. Deployment & Rollout

- [ ] 10.1 Database migration (if needed)
  - [ ] Store user parameter preferences
  - [ ] Log parameter usage statistics
- [ ] 10.2 Feature flag (optional)
  - [ ] Add feature toggle for gradual rollout
  - [ ] Default to enabled
- [ ] 10.3 Monitoring
  - [ ] Add metrics for parameter usage
  - [ ] Track processing success rates by param config
  - [ ] Monitor performance impact

## Critical Path for Testing

**Minimum required for frontend testing:**

1. ✅ Backend Schema (Section 1) - DONE
2. Backend OCR Service (Section 2) - REQUIRED
3. Backend API Endpoint (Section 3) - REQUIRED
4. Frontend Types (Section 4) - REQUIRED
5. Frontend API Client (Section 5) - REQUIRED
6. Basic UI Component (Section 6.1-6.3) - REQUIRED

**Nice to have but not blocking:**

- UI Polish (Section 7)
- Full test suite (Section 8)
- Documentation (Section 9)
- Deployment features (Section 10)
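
**Illustrative sketch: presets to request body**

The preset modes from Section 6.2 and the request-body change from Section 5 might fit together roughly as shown here. This is a minimal sketch, not the project's actual code: `PresetName`, `PRESETS`, and `buildProcessingOptions` are hypothetical names, the numeric preset values are placeholders chosen only so that "high-quality" is lower than "fast", and `ProcessingOptions` is reduced to the single field relevant here. The sketch drops unset fields on the client so that the body carries only explicit overrides, similar in spirit to the `model_dump(exclude_none=True)` step in Section 3.

```typescript
// Field names copied from the PPStructureV3Params interface in Section 4.
interface PPStructureV3Params {
  layout_detection_threshold?: number
  layout_nms_threshold?: number
  layout_merge_bboxes_mode?: 'union' | 'large' | 'small'
  layout_unclip_ratio?: number
  text_det_thresh?: number
  text_det_box_thresh?: number
  text_det_unclip_ratio?: number
}

// Assumption: reduced to the one field this sketch needs.
interface ProcessingOptions {
  pp_structure_params?: PPStructureV3Params
}

// Hypothetical preset identifiers mirroring the four modes in Section 6.2.
type PresetName = 'default' | 'high-quality' | 'fast' | 'custom'

// Placeholder values for illustration only; real presets need tuning.
const PRESETS: Record<Exclude<PresetName, 'custom'>, PPStructureV3Params> = {
  default: {}, // send nothing, so the backend defaults apply
  'high-quality': { layout_detection_threshold: 0.3, text_det_thresh: 0.2 },
  fast: { layout_detection_threshold: 0.6, text_det_thresh: 0.4 },
}

// Build the body for startTask(): choose the preset (or the custom slider
// values), remove unset fields, and omit the key entirely when nothing is left.
function buildProcessingOptions(
  preset: PresetName,
  custom: PPStructureV3Params = {}
): ProcessingOptions {
  const source = preset === 'custom' ? custom : PRESETS[preset]
  const params: PPStructureV3Params = { ...source }
  for (const key of Object.keys(params) as Array<keyof PPStructureV3Params>) {
    if (params[key] === undefined) {
      delete params[key]
    }
  }
  return Object.keys(params).length === 0 ? {} : { pp_structure_params: params }
}
```

With this shape, `buildProcessingOptions('default')` returns an empty object, which keeps the backward-compatible "no params" path from Section 8.2 intact, while `buildProcessingOptions('custom', sliderState)` forwards only the fields the user actually set.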