feat: add batch processing for multiple file uploads

- Add BatchState management in taskStore with progress tracking
- Implement batch processing service with concurrency control
  - Direct Track: max 5 parallel tasks
  - OCR Track: sequential processing (GPU VRAM limit)
- Refactor ProcessingPage to support batch mode with BatchProcessingPanel
- Update UploadPage to initialize batch state for multi-file uploads
- Add i18n translations for batch processing (zh-TW, en-US)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
egg
2025-12-12 17:05:16 +08:00
parent d5bc311757
commit d20751d56b
11 changed files with 1469 additions and 5 deletions


@@ -0,0 +1,43 @@
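# Change: Add Batch Processing
## Why
The system currently supports uploading multiple files in one batch, but processing still requires the user to select and start each task one at a time. This is inconvenient when handling large numbers of documents. Batch processing is needed so that the user can configure and start all uploaded tasks in one step.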
## What Changes
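### 1. Batch State Management
- Extend taskStore to support batch task tracking
- Add batch progress state (total, completed, processing, failed)
- Store the settings shared across the batch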
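The state described above can be sketched roughly as follows. This is a minimal illustration, not the actual taskStore code: the source only names `BatchState`, `taskIds`, `taskStates`, and `processingOptions`, so the field layout, the `TaskStatus` values, and the helpers `createBatch`, `setTaskStatus`, and `getProgress` are all assumptions.

```typescript
// Hypothetical shapes: only the names BatchState, taskIds, taskStates and
// processingOptions come from the change description; the rest is assumed.
type TaskStatus = "pending" | "processing" | "completed" | "failed";

interface BatchState {
  taskIds: string[];
  taskStates: Record<string, TaskStatus>;
  processingOptions: Record<string, unknown>;
}

interface BatchProgress {
  total: number;
  completed: number;
  processing: number;
  failed: number;
}

// Create a batch in which every task starts as "pending".
function createBatch(
  taskIds: string[],
  processingOptions: Record<string, unknown> = {},
): BatchState {
  const taskStates: Record<string, TaskStatus> = {};
  for (const id of taskIds) {
    taskStates[id] = "pending";
  }
  return { taskIds: [...taskIds], taskStates, processingOptions };
}

// Return a new BatchState with one task's status replaced (no mutation).
// Unknown task IDs leave the batch unchanged.
function setTaskStatus(
  batch: BatchState,
  taskId: string,
  status: TaskStatus,
): BatchState {
  if (!(taskId in batch.taskStates)) {
    return batch;
  }
  return { ...batch, taskStates: { ...batch.taskStates, [taskId]: status } };
}

// Derive the progress counters from the per-task statuses.
function getProgress(batch: BatchState): BatchProgress {
  const progress: BatchProgress = {
    total: batch.taskIds.length,
    completed: 0,
    processing: 0,
    failed: 0,
  };
  for (const id of batch.taskIds) {
    const status = batch.taskStates[id];
    if (status === "completed") progress.completed++;
    else if (status === "processing") progress.processing++;
    else if (status === "failed") progress.failed++;
  }
  return progress;
}
```

Deriving the counters from `taskStates` instead of storing them separately avoids the two getting out of sync; whether the real store does this is not stated in the source.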
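### 2. Batch Processing Logic
- After the upload completes, analyze all files to decide each file's processing track
- Dispatch tasks according to track type:
  - Direct Track: up to 5 in parallel (CPU-bound)
  - OCR Track: single queue (GPU VRAM limit)
  - Tasks on the two tracks can run at the same time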
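### 3. Batch Settings UI
- Modify ProcessingPage to support multi-task mode
- Unified settings interface:
  - Processing strategy (auto-detect / force OCR / force Direct)
  - Layout Model (OCR only)
  - Preprocessing mode (OCR only)
- Batch progress display (overall progress + per-task status)
### 4. Processing Strategies
- **Auto-detect** (recommended): the system analyzes each file and automatically selects the best track
- **All OCR**: force all files to use the OCR track
- **All Direct**: force all PDFs to use the Direct track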
## Impact
- Affected specs: frontend-ui (modified)
- Affected code:
  - `frontend/src/store/taskStore.ts` - extend with batch state
  - `frontend/src/pages/ProcessingPage.tsx` - support multi-task processing
  - `frontend/src/pages/UploadPage.tsx` - pass multiple task IDs
  - `frontend/src/services/apiV2.ts` - add batch processing helper functions
  - `frontend/src/i18n/locales/*.json` - add translations
- No backend changes required (uses the existing API)


@@ -0,0 +1,100 @@
# Frontend UI Specification - Batch Processing
## ADDED Requirements
### Requirement: Batch Processing Support
The system SHALL support batch processing of multiple uploaded files with a single configuration.
After uploading multiple files, the user SHALL be able to:
- Configure processing settings once for all files
- Start processing all files with one action
- Monitor progress of all files in a unified view
#### Scenario: Multiple files uploaded
- **WHEN** user uploads multiple files
- **AND** navigates to processing page
- **THEN** the system displays batch processing mode
- **AND** shows all pending tasks in a list
#### Scenario: Batch configuration
- **WHEN** user is in batch processing mode
- **THEN** user can select a processing strategy (auto/OCR/Direct)
- **AND** user can configure layout model for OCR tasks
- **AND** user can configure preprocessing for OCR tasks
- **AND** settings apply to all applicable tasks
---
### Requirement: Batch Processing Strategy
The system SHALL support three batch processing strategies:
1. **Auto Detection** (default): System analyzes each file and selects optimal track
2. **Force OCR**: All files processed with OCR track
3. **Force Direct**: All PDF files processed with Direct track
#### Scenario: Auto detection strategy
- **WHEN** user selects auto detection strategy
- **THEN** the system analyzes each file before processing
- **AND** assigns OCR or Direct track based on file characteristics
#### Scenario: Force OCR strategy
- **WHEN** user selects force OCR strategy
- **THEN** all files are processed using OCR track
- **AND** layout model and preprocessing settings are applied
#### Scenario: Force Direct strategy
- **WHEN** user selects force Direct strategy
- **AND** file is a PDF
- **THEN** the file is processed using Direct track
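As a rough illustration of the three strategies, the sketch below resolves a track per file. The names `Strategy`, `Track`, `BatchFile`, `recommendedTrack`, and `resolveTrack` are hypothetical. The spec does not say what happens to non-PDF files under Force Direct; falling back to the analysis recommendation is an assumption made here only so the function is total.

```typescript
type Strategy = "auto" | "ocr" | "direct";
type Track = "ocr" | "direct";

// Hypothetical input shape: the track suggested by the per-file analysis step
// is assumed to be available alongside the filename.
interface BatchFile {
  filename: string;
  recommendedTrack: Track;
}

function isPdf(filename: string): boolean {
  return filename.toLowerCase().endsWith(".pdf");
}

// Resolve the processing track for one file under the chosen batch strategy.
function resolveTrack(strategy: Strategy, file: BatchFile): Track {
  if (strategy === "ocr") {
    // Force OCR: every file goes through the OCR track.
    return "ocr";
  }
  if (strategy === "direct" && isPdf(file.filename)) {
    // Force Direct: applies to PDF files only.
    return "direct";
  }
  // Auto detection, or a non-PDF under Force Direct.
  // ASSUMPTION: non-PDF files fall back to the analysis result.
  return file.recommendedTrack;
}
```

For example, `resolveTrack("direct", { filename: "scan.png", recommendedTrack: "ocr" })` yields `"ocr"` under this assumption, while a PDF with the same recommendation yields `"direct"`.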
---
### Requirement: Parallel Processing Limits
The system SHALL enforce different parallelism limits based on processing track:
- Direct Track: Maximum 5 concurrent tasks (CPU-based)
- OCR Track: Maximum 1 concurrent task (GPU VRAM constraint)
Direct and OCR tasks MAY run simultaneously as they use different resources.
#### Scenario: Direct track parallelism
- **WHEN** batch contains multiple Direct track tasks
- **THEN** up to 5 tasks process concurrently
- **AND** remaining tasks wait in queue
#### Scenario: OCR track serialization
- **WHEN** batch contains multiple OCR track tasks
- **THEN** only 1 task processes at a time
- **AND** remaining tasks wait in queue
#### Scenario: Mixed track processing
- **WHEN** batch contains both Direct and OCR tasks
- **THEN** Direct tasks run in parallel pool (max 5)
- **AND** OCR tasks run in serial queue (max 1)
- **AND** both pools operate simultaneously
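The limits above can be sketched with a small worker-pool helper. This is one possible approach, not the service shipped in this commit: `Job`, `runWithLimit`, `runBatch`, `DIRECT_LIMIT`, and `OCR_LIMIT` are hypothetical names, and each job is assumed to be a function that starts a task and resolves when it finishes.

```typescript
type Job<T> = () => Promise<T>;

// Run jobs with at most `limit` in flight at once. Results keep input order,
// and a failing job is recorded instead of aborting the others.
async function runWithLimit<T>(
  jobs: Job<T>[],
  limit: number,
): Promise<PromiseSettledResult<T>[]> {
  const results: PromiseSettledResult<T>[] = new Array(jobs.length);
  let next = 0;

  // Each worker repeatedly claims the next unstarted job until none remain.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      try {
        results[index] = { status: "fulfilled", value: await jobs[index]() };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.min(Math.max(1, limit), jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

const DIRECT_LIMIT = 5; // Direct Track: CPU-based, up to 5 concurrent tasks
const OCR_LIMIT = 1; // OCR Track: one task at a time (GPU VRAM constraint)

// Start both pools together so Direct and OCR tasks run simultaneously.
async function runBatch<T>(directJobs: Job<T>[], ocrJobs: Job<T>[]) {
  const [direct, ocr] = await Promise.all([
    runWithLimit(directJobs, DIRECT_LIMIT),
    runWithLimit(ocrJobs, OCR_LIMIT),
  ]);
  return { direct, ocr };
}
```

Keeping one pool per track means a slow OCR task never occupies a Direct slot, which matches the requirement that the two pools operate independently.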
---
### Requirement: Batch Progress Display
The system SHALL display unified progress for batch processing.
Progress display SHALL include:
- Overall progress (completed / total)
- Count by status (processing, completed, failed)
- Individual task status list
- Estimated time remaining (optional)
#### Scenario: Batch progress monitoring
- **WHEN** batch processing is in progress
- **THEN** user sees overall completion percentage
- **AND** user sees count of tasks in each status
- **AND** user sees status of each individual task
#### Scenario: Batch completion
- **WHEN** all tasks in batch are completed or failed
- **THEN** user sees final summary
- **AND** user can navigate to results page
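A minimal sketch of how the displayed values might be derived from the status counts. `BatchCounts`, `BatchSummary`, and `summarizeBatch` are hypothetical names. The percentage follows the "completed / total" wording above, so failed tasks do not count toward it; the rounding is an assumption.

```typescript
interface BatchCounts {
  total: number;
  completed: number;
  processing: number;
  failed: number;
}

interface BatchSummary {
  percent: number; // overall completion, 0 to 100
  pending: number; // tasks not yet started
  finished: boolean; // true once every task is completed or failed
}

function summarizeBatch(counts: BatchCounts): BatchSummary {
  const { total, completed, processing, failed } = counts;
  const pending = Math.max(0, total - completed - processing - failed);
  // Overall progress is completed / total; an empty batch reports 0%.
  const percent = total === 0 ? 0 : Math.round((completed / total) * 100);
  // The batch is done when every task has reached a terminal state.
  const finished = total > 0 && completed + failed === total;
  return { percent, pending, finished };
}
```

With these definitions a batch of 4 tasks with 3 completed and 1 failed is finished at 75%, which is the point at which the final summary scenario applies.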


@@ -0,0 +1,42 @@
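# Tasks: Add Batch Processing
## 1. Batch State Management
- [x] 1.1 Extend taskStore with a batch state interface (BatchState)
- [x] 1.2 Implement batch task tracking (taskIds, taskStates)
- [x] 1.3 Implement batch progress calculation (total, completed, processing, failed)
- [x] 1.4 Implement batch settings storage (processingOptions)
## 2. Batch Processing Logic
- [x] 2.1 Add a batch analysis function (analyze all tasks to decide each one's track)
- [x] 2.2 Implement Direct Track parallel processing (up to 5 in parallel)
- [x] 2.3 Implement OCR Track queue processing (single queue)
- [x] 2.4 Implement mixed-mode processing (Direct and OCR run simultaneously)
- [x] 2.5 Implement task status polling and updates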
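Status polling (item 2.5) could look roughly like the sketch below. It is an illustration under assumptions: `pollUntilDone`, `FetchStatus`, the `TaskStatus` values, the 2-second default interval, and the attempt cap are all hypothetical, and the status fetcher is passed in as a parameter because the actual API call is not shown in this change.

```typescript
type TaskStatus = "pending" | "processing" | "completed" | "failed";

// Injected so the sketch does not assume any particular API client.
type FetchStatus = (taskId: string) => Promise<TaskStatus>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll one task until it reaches a terminal status, reporting every
// observed status through onUpdate (e.g. to update the batch state).
async function pollUntilDone(
  taskId: string,
  fetchStatus: FetchStatus,
  onUpdate: (taskId: string, status: TaskStatus) => void,
  intervalMs = 2000, // assumed default, not taken from the source
  maxAttempts = 300, // assumed safety cap so the loop cannot run forever
): Promise<TaskStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(taskId);
    onUpdate(taskId, status);
    if (status === "completed" || status === "failed") {
      return status;
    }
    await sleep(intervalMs);
  }
  throw new Error(`Task ${taskId} did not finish after ${maxAttempts} polls`);
}
```

A job handed to a concurrency-limited runner could start the task and then await `pollUntilDone`, so a slot is released only when the task actually finishes.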
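## 3. Upload Page Changes
- [x] 3.1 Modify UploadPage to store all taskIds after the upload completes
- [x] 3.2 Pass a batch-mode flag when navigating to ProcessingPage
## 4. Processing Page Refactor
- [x] 4.1 Modify ProcessingPage to support batch mode
- [x] 4.2 Add a batch settings section (strategy selection, unified settings)
- [x] 4.3 Add a batch progress display component
- [x] 4.4 Add a task list display (status of each task)
- [x] 4.5 Implement the batch start-processing button
## 5. i18n Translations
- [x] 5.1 Add Chinese translations for batch processing
- [x] 5.2 Add English translations for batch processing
## 6. Testing and Verification
- [x] 6.1 Test single-file processing (backward compatibility)
- [x] 6.2 Test multi-file Direct Track parallelism
- [x] 6.3 Test multi-file OCR Track queueing
- [x] 6.4 Test mixed-mode processing
- [x] 6.5 Verify that TypeScript compilation passes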