feat: simplify layout model selection and archive proposals

Changes:
- Replace PP-Structure 7-slider parameter UI with simple 3-option layout model selector
- Add layout model mapping: chinese (PP-DocLayout-S), default (PubLayNet), cdla
- Add LayoutModelSelector component and zh-TW translations
- Fix "default" model behavior with sentinel value for PubLayNet
- Add gap filling service for OCR track coverage improvement
- Add PP-Structure debug utilities
- Archive completed/incomplete proposals:
  - add-ocr-track-gap-filling (complete)
  - fix-ocr-track-table-rendering (incomplete)
  - simplify-ppstructure-model-selection (22/25 tasks)
- Add new layout model tests, archive old PP-Structure param tests
- Update OpenSpec ocr-processing spec with layout model requirements

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 13:27:00 +08:00
parent c65df754cf
commit 59206a6ab8
35 changed files with 3621 additions and 658 deletions
--- a/openspec/changes/fix-ocr-track-table-rendering/tasks.md
+++ b/openspec/changes/fix-ocr-track-table-rendering/tasks.md
@@ -1,55 +0,0 @@
-# Implementation Tasks
-
-## Phase 1: Core Fix - Table Content Conversion
-
-### 1.1 Add TableData.from_dict() class method
- [ ] In `unified_document.py`, add `from_dict()` method to `TableData` class
- [ ] Handle conversion of cells list (list of dicts) to `TableCell` objects
- [ ] Preserve rows, cols, headers, caption fields
-
-### 1.2 Fix _json_to_document_element for TABLE elements
- [ ] In `pdf_generator_service.py`, modify `_json_to_document_element`
- [ ] When `elem_type == ElementType.TABLE` and content is dict with 'cells', convert to `TableData`
- [ ] Use `TableData.from_dict()` for clean conversion
-
-### 1.3 Verify TableData.to_html() generates correct HTML
- [ ] Test that `to_html()` produces parseable HTML with proper row/cell structure
- [ ] Verify colspan/rowspan attributes are correctly generated
- [ ] Ensure empty cells are properly handled
-
-## Phase 2: OCR Track Rendering Consistency
-
-### 2.1 Review convert_unified_document_to_ocr_data
- [ ] Verify TableData objects are properly converted to HTML
- [ ] Add fallback handling for dict content with 'cells' key
- [ ] Log warning if content cannot be converted to HTML
-
-### 2.2 Review draw_table_region
- [ ] Verify HTMLTableParser correctly parses generated HTML
- [ ] Check that ReportLab Table is positioned at correct bbox
- [ ] Verify font and style application
-
-## Phase 3: Testing and Verification
-
-### 3.1 Test OCR Track
- [ ] Test scan.pdf - verify tables have correct structure
- [ ] Test img1.png, img2.png, img3.png
- [ ] Compare generated PDF with original documents
-
-### 3.2 Test Direct Track (Regression)
- [ ] Test PDF files with Direct track
- [ ] Verify table rendering unchanged
-
-### 3.3 Test Hybrid Mode
- [ ] Test files that trigger hybrid processing
- [ ] Verify mixed Direct + OCR elements render correctly
-
-## Phase 4: Code Quality
-
-### 4.1 Add logging
- [ ] Add debug logging for table content type detection
- [ ] Log conversion steps for troubleshooting
-
-### 4.2 Error handling
- [ ] Handle malformed cell data gracefully
- [ ] Log warnings for unexpected content formats
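
Note on tasks 1.1 and 1.3 of the archived list: the conversion they describe could look roughly like the sketch below. This is an illustration only, not the code in `unified_document.py`. The field names (`row`, `col`, `content`, `row_span`, `col_span`) and the choice to silently skip non-dict cells are assumptions, since the diff does not show the real `TableData` or `TableCell` definitions.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Any, Dict, List, Optional


@dataclass
class TableCell:
    # Hypothetical field names; the real TableCell may differ.
    row: int
    col: int
    content: str = ""
    row_span: int = 1
    col_span: int = 1


@dataclass
class TableData:
    rows: int
    cols: int
    cells: List[TableCell] = field(default_factory=list)
    headers: Optional[List[str]] = None
    caption: Optional[str] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "TableData":
        """Build a TableData from a plain dict, converting cell dicts to TableCell."""
        cells = []
        for raw in data.get("cells") or []:
            if not isinstance(raw, dict):
                continue  # assumption: malformed entries are skipped, not fatal
            cells.append(
                TableCell(
                    row=int(raw.get("row", 0)),
                    col=int(raw.get("col", 0)),
                    content=str(raw.get("content", "")),
                    row_span=int(raw.get("row_span", 1)),
                    col_span=int(raw.get("col_span", 1)),
                )
            )
        return cls(
            rows=int(data.get("rows", 0)),
            cols=int(data.get("cols", 0)),
            cells=cells,
            headers=data.get("headers"),
            caption=data.get("caption"),
        )

    def to_html(self) -> str:
        """Render a minimal <table>, emitting rowspan/colspan and empty cells."""
        by_pos = {(c.row, c.col): c for c in self.cells}
        covered = set()  # grid positions hidden behind a spanning cell
        parts = ["<table>"]
        for r in range(self.rows):
            parts.append("<tr>")
            for c in range(self.cols):
                if (r, c) in covered:
                    continue
                cell = by_pos.get((r, c))
                if cell is None:
                    parts.append("<td></td>")  # keep the grid rectangular
                    continue
                for dr in range(cell.row_span):
                    for dc in range(cell.col_span):
                        if (dr, dc) != (0, 0):
                            covered.add((r + dr, c + dc))
                attrs = ""
                if cell.row_span > 1:
                    attrs += f' rowspan="{cell.row_span}"'
                if cell.col_span > 1:
                    attrs += f' colspan="{cell.col_span}"'
                parts.append(f"<td{attrs}>{escape(cell.content)}</td>")
            parts.append("</tr>")
        parts.append("</table>")
        return "".join(parts)
```

A caller along the lines of task 1.2 would check that the element is a table and that its content is a dict with a `cells` key before calling `TableData.from_dict(content)`. A real implementation would probably also log a warning where this sketch skips a malformed cell, as tasks 4.1 and 4.2 suggest.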