feat: add table detection options and scan artifact removal

- Add TableDetectionSelector component for wired/wireless/region detection
- Add CV-based table line detector module (disabled due to poor performance)
- Add scan artifact removal preprocessing step (removes faint horizontal lines)
- Add PreprocessingConfig schema with remove_scan_artifacts option
- Update frontend PreprocessingSettings with scan artifact toggle
- Integrate table detection config into ProcessingPage
- Archive extract-table-cell-boxes proposal

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-30 13:21:50 +08:00
parent f5a2c8a750
commit 95ae1f1bdb
17 changed files with 1906 additions and 344 deletions

@@ -64,6 +64,16 @@
"recommended": "推薦",
"note": "版面模型會影響文件結構(表格、文字區塊、圖片)的偵測效果。請根據您的文件類型選擇適合的模型。"
},
"tableDetection": {
"title": "表格偵測模式",
"wired": "有框線表格",
"wiredDesc": "偵測有明顯格線邊框的表格,適用於正式表格文件",
"wireless": "無框線表格",
"wirelessDesc": "偵測無邊框的表格,透過對齊方式推斷表格結構",
"region": "區域偵測",
"regionDesc": "輔助偵測表格區域,改善複雜表格的儲存格識別",
"note": "可同時啟用多種偵測模式,系統會自動整合偵測結果。如果表格儲存格框線不正確,請嘗試調整偵測模式。"
},
"preprocessing": {
"title": "影像前處理",
"mode": {
@@ -92,6 +102,8 @@
"strong": "強",
"maximum": "最強"
},
"removeScanArtifacts": "移除掃描瑕疵",
"removeScanArtifactsDesc": "移除掃描時光源產生的水平線痕,避免被誤判為表格框線",
"advanced": "進階選項",
"binarize": "二值化處理",
"binarizeWarning": "不建議使用",