TTS/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/tasks.md
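beabigegg 33ea22f259 feat: Add Smart Slide Voiceover System
- Add Excel input module: parse script files in .xlsx format
- Add TTS engine module: integrate edge-tts to call Azure Neural Voice
- Add PyQt6 GUI: file selection, voice selection, progress monitoring
- Add threading model: QThread + asyncio to keep the UI responsive
- Support 10 Neural Voices (Chinese/Vietnamese/English)
- Support mixed Chinese-English and mixed Vietnamese-English pronunciation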

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 15:42:11 +08:00


1. Environment Setup

  • 1.1 Create environment.yml with Python 3.10, PyQt6, edge-tts, pandas, openpyxl
  • 1.2 Test conda environment creation and activation

2. Excel Input Module

  • 2.1 Implement Excel file loading with openpyxl/pandas
  • 2.2 Implement column parsing (Filename, Text, Lang)
  • 2.3 Add data validation for required fields
  • 2.4 Implement default language fallback (zh)
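
Tasks 2.2 to 2.4 can be sketched as a small validation pass over already-loaded rows. This is only an illustration under stated assumptions: `ScriptRow`, `parse_rows`, and the choice to treat `Lang` as optional are hypothetical and not taken from the project. The rows are assumed to arrive as plain dicts, for example from `pandas.read_excel(path).to_dict("records")`.

```python
from dataclasses import dataclass

REQUIRED_COLUMNS = ("Filename", "Text")  # assumption: Lang is optional
DEFAULT_LANG = "zh"                      # task 2.4: fallback language


@dataclass
class ScriptRow:
    """One validated script line (hypothetical container)."""
    filename: str
    text: str
    lang: str


def _clean(value):
    """Normalize a cell: None and NaN become an empty string, the rest is stripped."""
    if value is None or (isinstance(value, float) and value != value):
        return ""
    return str(value).strip()


def parse_rows(records):
    """Validate raw row dicts and return (rows, errors).

    Rows missing a required field are skipped and reported, not raised,
    so one bad row does not abort the whole batch.
    """
    rows, errors = [], []
    # Spreadsheet row 1 is the header, so data starts at row 2.
    for index, record in enumerate(records, start=2):
        missing = [c for c in REQUIRED_COLUMNS if not _clean(record.get(c))]
        if missing:
            errors.append(f"row {index}: missing {', '.join(missing)}")
            continue
        lang = _clean(record.get("Lang")).lower() or DEFAULT_LANG
        rows.append(ScriptRow(_clean(record["Filename"]), _clean(record["Text"]), lang))
    return rows, errors
```

Collecting errors as strings keeps them easy to forward to a log console later; whether invalid rows should instead stop the run is a design decision the task list leaves open.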

3. TTS Engine Module

  • 3.1 Define voice registry with all available voices and bilingual support annotations
  • 3.2 Implement default voice mapping dictionary (zh/vi/en -> Voice ID)
  • 3.3 Implement edge-tts wrapper for voice synthesis with configurable voice
  • 3.4 Add rate limit delay (0.5s between requests)
  • 3.5 Implement network error handling with retry
  • 3.6 Add output directory creation logic
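
As a rough sketch of tasks 3.2 to 3.6, the engine logic can be written around an injected async `synthesize(text, voice, out_path)` callable so that it runs without network access. Everything named here is an assumption for illustration: the voice IDs in `DEFAULT_VOICES` are examples, not the project's actual mapping, and the retry count, backoff, and retried exception types are placeholders.

```python
import asyncio
from pathlib import Path

# Illustrative defaults only; the project's real mapping is not shown in the source.
DEFAULT_VOICES = {
    "zh": "zh-TW-HsiaoChenNeural",
    "vi": "vi-VN-HoaiMyNeural",
    "en": "en-US-JennyNeural",
}
REQUEST_DELAY_S = 0.5  # task 3.4: pause between requests


def resolve_voice(lang, override=None):
    """Prefer the user-selected voice, then the language default, then zh."""
    return override or DEFAULT_VOICES.get(lang, DEFAULT_VOICES["zh"])


async def synthesize_with_retry(synthesize, text, voice, out_path, retries=3,
                                backoff_s=1.0,
                                retry_on=(OSError, asyncio.TimeoutError)):
    """Call synthesize, retrying on the given errors; return the attempt count."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)  # task 3.6
    for attempt in range(1, retries + 1):
        try:
            await synthesize(text, voice, str(out_path))
            return attempt
        except retry_on:
            if attempt == retries:
                raise  # give up and let the caller log the failure
            await asyncio.sleep(backoff_s * attempt)  # linear backoff


async def run_batch(synthesize, items, out_dir, override_voice=None,
                    delay_s=REQUEST_DELAY_S, backoff_s=1.0):
    """Process (filename, text, lang) items one by one with a delay in between."""
    for i, (filename, text, lang) in enumerate(items):
        if i:
            await asyncio.sleep(delay_s)
        voice = resolve_voice(lang, override_voice)
        target = Path(out_dir) / f"{filename}.mp3"
        await synthesize_with_retry(synthesize, text, voice, target,
                                    backoff_s=backoff_s)
```

With edge-tts, the injected callable could be a thin wrapper that awaits `edge_tts.Communicate(text, voice).save(out_path)`. Which exception types deserve a retry depends on the HTTP client in use and would need checking against real failures.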

4. Threading Model

  • 4.1 Create TTS Worker class extending QThread
  • 4.2 Implement asyncio event loop within worker thread
  • 4.3 Define pyqtSignal instances for progress, log, and completion events
  • 4.4 Implement stop flag for graceful cancellation
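
The shape of this threading model can be shown without Qt. In the sketch below, `threading.Thread` stands in for `QThread` and plain callbacks stand in for `pyqtSignal.emit`, so it is a substitute for the real design, not the project's worker. The class name, callback names, and log message are hypothetical.

```python
import asyncio
import threading


class TTSWorker(threading.Thread):
    """Stand-in for a QThread subclass: run() owns a private asyncio event loop."""

    def __init__(self, items, synthesize, on_progress, on_log, on_finished):
        super().__init__(daemon=True)
        self._items = list(items)
        self._synthesize = synthesize    # async callable taking one item
        self._on_progress = on_progress  # in Qt: a pyqtSignal(int, int).emit
        self._on_log = on_log            # in Qt: a pyqtSignal(str).emit
        self._on_finished = on_finished  # in Qt: a pyqtSignal(bool).emit
        self._stop_requested = threading.Event()  # task 4.4: stop flag

    def stop(self):
        """Request cancellation; the item in flight is allowed to finish."""
        self._stop_requested.set()

    def run(self):
        # asyncio.run creates a fresh event loop in this thread and closes it after.
        completed = asyncio.run(self._process())
        self._on_finished(completed)

    async def _process(self):
        total = len(self._items)
        for done, item in enumerate(self._items, start=1):
            if self._stop_requested.is_set():
                self._on_log("Stopped by user")
                return False
            await self._synthesize(item)
            self._on_progress(done, total)
        return True
```

Checking the flag only between items keeps cancellation graceful: no output file is left half-written. In a PyQt6 version the callbacks would become signals, which Qt delivers to the GUI thread through queued connections.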

5. GUI Implementation

  • 5.1 Create main window with PyQt6
  • 5.2 Add file browser widget with .xlsx filter
  • 5.3 Add voice selection dropdown (grouped by language, with bilingual annotations)
  • 5.4 Add Start/Stop buttons with state management
  • 5.5 Add progress bar widget
  • 5.6 Add log console (QTextEdit) with auto-scroll
  • 5.7 Connect signals to UI update slots
  • 5.8 Implement completion notification dialog
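
For task 5.3, one possible way to prepare the dropdown contents is to build the labels separately from the widget, which keeps the grouping logic testable without a running GUI. The registry entries, the tuple layout, and the header format below are all hypothetical; the bilingual notes are placeholders, not verified capabilities of these voices.

```python
from itertools import groupby

# Hypothetical registry entries: (voice_id, language_label, bilingual_note)
VOICE_REGISTRY = [
    ("zh-TW-HsiaoChenNeural", "Chinese", "Chinese + English"),
    ("vi-VN-HoaiMyNeural", "Vietnamese", "Vietnamese + English"),
    ("en-US-JennyNeural", "English", ""),
]


def build_dropdown_entries(registry):
    """Return (label, voice_id) pairs; voice_id is None for group header rows."""
    entries = []
    # groupby only merges adjacent items, so sort by language first.
    ordered = sorted(registry, key=lambda voice: voice[1])
    for language, voices in groupby(ordered, key=lambda voice: voice[1]):
        entries.append((f"--- {language} ---", None))
        for voice_id, _, note in voices:
            label = f"{voice_id} ({note})" if note else voice_id
            entries.append((label, voice_id))
    return entries
```

Each pair maps naturally onto `QComboBox.addItem(label, voice_id)`, with the header rows disabled in the widget so they cannot be selected.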

6. Integration

  • 6.1 Wire Excel parser to TTS worker
  • 6.2 Pass selected voice from GUI to TTS worker
  • 6.3 Connect worker signals to GUI updates
  • 6.4 Implement start/stop button handlers

7. Testing & Documentation

  • 7.1 Create template.xlsx with sample data
  • 7.2 Test batch processing with 10+ items
  • 7.3 Test voice selection (verify different voices produce correct output)
  • 7.4 Test UI responsiveness during processing
  • 7.5 Test stop functionality mid-batch
  • 7.6 Test error handling (invalid file, network error)
  • 7.7 Write README.md with usage instructions (including voice selection guide)