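feat: add Smart Slide Voiceover System (智慧簡報旁白生成系統)

- Add Excel input module: parses script files in .xlsx format
- Add TTS engine module: integrates edge-tts to call Azure Neural Voice
- Add PyQt6 GUI: file selection, voice selection, progress monitoring
- Add threading model: QThread + Asyncio keeps the UI responsive
- Support 10 Neural Voices (Chinese/Vietnamese/English)
- Support mixed Chinese-English and mixed Vietnamese-English pronunciation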

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
beabigegg
2025-12-27 15:42:11 +08:00
commit 33ea22f259
25 changed files with 1943 additions and 0 deletions

openspec/AGENTS.md

@@ -0,0 +1,456 @@
# OpenSpec Instructions
Instructions for AI coding assistants using OpenSpec for spec-driven development.
## TL;DR Quick Checklist
- Search existing work: `openspec spec list --long`, `openspec list` (use `rg` only for full-text search)
- Decide scope: new capability vs modify existing capability
- Pick a unique `change-id`: kebab-case, verb-led (`add-`, `update-`, `remove-`, `refactor-`)
- Scaffold: `proposal.md`, `tasks.md`, `design.md` (only if needed), and delta specs per affected capability
- Write deltas: use `## ADDED|MODIFIED|REMOVED|RENAMED Requirements`; include at least one `#### Scenario:` per requirement
- Validate: `openspec validate [change-id] --strict` and fix issues
- Request approval: Do not start implementation until proposal is approved
## Three-Stage Workflow
### Stage 1: Creating Changes
Create proposal when you need to:
- Add features or functionality
- Make breaking changes (API, schema)
- Change architecture or patterns
- Optimize performance (changes behavior)
- Update security patterns
Triggers (examples):
- "Help me create a change proposal"
- "Help me plan a change"
- "Help me create a proposal"
- "I want to create a spec proposal"
- "I want to create a spec"
Loose matching guidance:
- Contains one of: `proposal`, `change`, `spec`
- With one of: `create`, `plan`, `make`, `start`, `help`
Skip proposal for:
- Bug fixes (restore intended behavior)
- Typos, formatting, comments
- Dependency updates (non-breaking)
- Configuration changes
- Tests for existing behavior
**Workflow**
1. Review `openspec/project.md`, `openspec list`, and `openspec list --specs` to understand current context.
2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, optional `design.md`, and spec deltas under `openspec/changes/<id>/`.
3. Draft spec deltas using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement.
4. Run `openspec validate <id> --strict` and resolve any issues before sharing the proposal.
### Stage 2: Implementing Changes
Track these steps as TODOs and complete them one by one.
1. **Read proposal.md** - Understand what's being built
2. **Read design.md** (if exists) - Review technical decisions
3. **Read tasks.md** - Get implementation checklist
4. **Implement tasks sequentially** - Complete in order
5. **Confirm completion** - Ensure every item in `tasks.md` is finished before updating statuses
6. **Update checklist** - After all work is done, set every task to `- [x]` so the list reflects reality
7. **Approval gate** - Do not start implementation until the proposal is reviewed and approved
### Stage 3: Archiving Changes
After deployment, create separate PR to:
- Move `changes/[name]/` → `changes/archive/YYYY-MM-DD-[name]/`
- Update `specs/` if capabilities changed
- Use `openspec archive <change-id> --skip-specs --yes` for tooling-only changes (always pass the change ID explicitly)
- Run `openspec validate --strict` to confirm the archived change passes checks
## Before Any Task
**Context Checklist:**
- [ ] Read relevant specs in `specs/[capability]/spec.md`
- [ ] Check pending changes in `changes/` for conflicts
- [ ] Read `openspec/project.md` for conventions
- [ ] Run `openspec list` to see active changes
- [ ] Run `openspec list --specs` to see existing capabilities
**Before Creating Specs:**
- Always check if capability already exists
- Prefer modifying existing specs over creating duplicates
- Use `openspec show [spec]` to review current state
- If request is ambiguous, ask 1-2 clarifying questions before scaffolding
### Search Guidance
- Enumerate specs: `openspec spec list --long` (or `--json` for scripts)
- Enumerate changes: `openspec list` (or `openspec change list --json` - deprecated but available)
- Show details:
  - Spec: `openspec show <spec-id> --type spec` (use `--json` for filters)
  - Change: `openspec show <change-id> --json --deltas-only`
- Full-text search (use ripgrep): `rg -n "Requirement:|Scenario:" openspec/specs`
## Quick Start
### CLI Commands
```bash
# Essential commands
openspec list # List active changes
openspec list --specs # List specifications
openspec show [item] # Display change or spec
openspec validate [item] # Validate changes or specs
openspec archive <change-id> [--yes|-y] # Archive after deployment (add --yes for non-interactive runs)
# Project management
openspec init [path] # Initialize OpenSpec
openspec update [path] # Update instruction files
# Interactive mode
openspec show # Prompts for selection
openspec validate # Bulk validation mode
# Debugging
openspec show [change] --json --deltas-only
openspec validate [change] --strict
```
### Command Flags
- `--json` - Machine-readable output
- `--type change|spec` - Disambiguate items
- `--strict` - Comprehensive validation
- `--no-interactive` - Disable prompts
- `--skip-specs` - Archive without spec updates
- `--yes`/`-y` - Skip confirmation prompts (non-interactive archive)
## Directory Structure
```
openspec/
├── project.md              # Project conventions
├── specs/                  # Current truth - what IS built
│   └── [capability]/       # Single focused capability
│       ├── spec.md         # Requirements and scenarios
│       └── design.md       # Technical patterns
├── changes/                # Proposals - what SHOULD change
│   ├── [change-name]/
│   │   ├── proposal.md     # Why, what, impact
│   │   ├── tasks.md        # Implementation checklist
│   │   ├── design.md       # Technical decisions (optional; see criteria)
│   │   └── specs/          # Delta changes
│   │       └── [capability]/
│   │           └── spec.md # ADDED/MODIFIED/REMOVED
│   └── archive/            # Completed changes
```
## Creating Change Proposals
### Decision Tree
```
New request?
├─ Bug fix restoring spec behavior? → Fix directly
├─ Typo/format/comment? → Fix directly
├─ New feature/capability? → Create proposal
├─ Breaking change? → Create proposal
├─ Architecture change? → Create proposal
└─ Unclear? → Create proposal (safer)
```
### Proposal Structure
1. **Create directory:** `changes/[change-id]/` (kebab-case, verb-led, unique)
2. **Write proposal.md:**
```markdown
# Change: [Brief description of change]
## Why
[1-2 sentences on problem/opportunity]
## What Changes
- [Bullet list of changes]
- [Mark breaking changes with **BREAKING**]
## Impact
- Affected specs: [list capabilities]
- Affected code: [key files/systems]
```
3. **Create spec deltas:** `specs/[capability]/spec.md`
```markdown
## ADDED Requirements
### Requirement: New Feature
The system SHALL provide...
#### Scenario: Success case
- **WHEN** user performs action
- **THEN** expected result
## MODIFIED Requirements
### Requirement: Existing Feature
[Complete modified requirement]
## REMOVED Requirements
### Requirement: Old Feature
**Reason**: [Why removing]
**Migration**: [How to handle]
```
If multiple capabilities are affected, create multiple delta files under `changes/[change-id]/specs/<capability>/spec.md`—one per capability.
4. **Create tasks.md:**
```markdown
## 1. Implementation
- [ ] 1.1 Create database schema
- [ ] 1.2 Implement API endpoint
- [ ] 1.3 Add frontend component
- [ ] 1.4 Write tests
```
5. **Create design.md when needed:**
Create `design.md` if any of the following apply; otherwise omit it:
- Cross-cutting change (multiple services/modules) or a new architectural pattern
- New external dependency or significant data model changes
- Security, performance, or migration complexity
- Ambiguity that benefits from technical decisions before coding
Minimal `design.md` skeleton:
```markdown
## Context
[Background, constraints, stakeholders]
## Goals / Non-Goals
- Goals: [...]
- Non-Goals: [...]
## Decisions
- Decision: [What and why]
- Alternatives considered: [Options + rationale]
## Risks / Trade-offs
- [Risk] → Mitigation
## Migration Plan
[Steps, rollback]
## Open Questions
- [...]
```
## Spec File Format
### Critical: Scenario Formatting
**CORRECT** (use #### headers):
```markdown
#### Scenario: User login success
- **WHEN** valid credentials provided
- **THEN** return JWT token
```
**WRONG** (don't use bullets or bold):
```markdown
- **Scenario: User login** ❌
**Scenario**: User login ❌
### Scenario: User login ❌
```
Every requirement MUST have at least one scenario.
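The rule above can be checked mechanically before running the validator. The following is a minimal sketch, not the OpenSpec parser itself: `requirements_without_scenarios` is a hypothetical helper, and the two regular expressions simply encode the header formats described in this section.

```python
import re

REQUIREMENT = re.compile(r"^### Requirement:\s*(.+?)\s*$")
SCENARIO = re.compile(r"^#### Scenario:\s*\S")


def requirements_without_scenarios(markdown: str) -> list[str]:
    """Return the names of requirements that lack a `#### Scenario:` header."""
    missing: list[str] = []
    current = None
    has_scenario = False
    for line in markdown.splitlines():
        match = REQUIREMENT.match(line)
        if match:
            # A new requirement starts: close out the previous one first.
            if current is not None and not has_scenario:
                missing.append(current)
            current, has_scenario = match.group(1), False
        elif SCENARIO.match(line):
            has_scenario = True
    if current is not None and not has_scenario:
        missing.append(current)
    return missing


delta = """\
## ADDED Requirements
### Requirement: User Login
The system SHALL authenticate users.
#### Scenario: User login success
- **WHEN** valid credentials provided
- **THEN** return JWT token
### Requirement: Password Reset
The system SHALL allow password resets.
- **Scenario: Reset requested**
"""
print(requirements_without_scenarios(delta))
```

The second requirement is reported because its scenario uses the bullet-and-bold form, which this check (like the format rule above) does not accept as a scenario header.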
### Requirement Wording
- Use SHALL/MUST for normative requirements (avoid should/may unless intentionally non-normative)
### Delta Operations
- `## ADDED Requirements` - New capabilities
- `## MODIFIED Requirements` - Changed behavior
- `## REMOVED Requirements` - Deprecated features
- `## RENAMED Requirements` - Name changes
Headers matched with `trim(header)` - whitespace ignored.
#### When to use ADDED vs MODIFIED
- ADDED: Introduces a new capability or sub-capability that can stand alone as a requirement. Prefer ADDED when the change is orthogonal (e.g., adding "Slash Command Configuration") rather than altering the semantics of an existing requirement.
- MODIFIED: Changes the behavior, scope, or acceptance criteria of an existing requirement. Always paste the full, updated requirement content (header + all scenarios). The archiver will replace the entire requirement with what you provide here; partial deltas will drop previous details.
- RENAMED: Use when only the name changes. If you also change behavior, use RENAMED (name) plus MODIFIED (content) referencing the new name.
Common pitfall: Using MODIFIED to add a new concern without including the previous text. This causes loss of detail at archive time. If you aren't explicitly changing the existing requirement, add a new requirement under ADDED instead.
Authoring a MODIFIED requirement correctly:
1) Locate the existing requirement in `openspec/specs/<capability>/spec.md`.
2) Copy the entire requirement block (from `### Requirement: ...` through its scenarios).
3) Paste it under `## MODIFIED Requirements` and edit to reflect the new behavior.
4) Ensure the header text matches exactly (whitespace-insensitive) and keep at least one `#### Scenario:`.
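To see why a partial MODIFIED block loses detail, it can help to model the replacement as a plain dictionary update. This is a toy model under stated assumptions, not the archiver's actual code: `apply_modified` is a hypothetical function, and requirement bodies are reduced to strings keyed by their trimmed header.

```python
def apply_modified(spec: dict[str, str], modified: dict[str, str]) -> dict[str, str]:
    """Replace each requirement block wholesale, matching on trimmed header text."""
    result = {header.strip(): body for header, body in spec.items()}
    for header, body in modified.items():
        key = header.strip()
        if key not in result:
            raise KeyError(f"MODIFIED requirement not found in spec: {key}")
        # Full replacement: nothing from the old block is merged in.
        result[key] = body
    return result


spec = {"### Requirement: Login": "#### Scenario: Success\n#### Scenario: Lockout"}
partial_delta = {"  ### Requirement: Login ": "#### Scenario: Success"}
print(apply_modified(spec, partial_delta))
```

In this model the `Lockout` scenario disappears because the delta did not restate it, which is the failure mode the steps above are meant to prevent.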
Example for RENAMED:
```markdown
## RENAMED Requirements
- FROM: `### Requirement: Login`
- TO: `### Requirement: User Authentication`
```
## Troubleshooting
### Common Errors
**"Change must have at least one delta"**
- Check `changes/[name]/specs/` exists with .md files
- Verify files have operation prefixes (## ADDED Requirements)
**"Requirement must have at least one scenario"**
- Check scenarios use `#### Scenario:` format (4 hashtags)
- Don't use bullet points or bold for scenario headers
**Silent scenario parsing failures**
- Exact format required: `#### Scenario: Name`
- Debug with: `openspec show [change] --json --deltas-only`
### Validation Tips
```bash
# Always use strict mode for comprehensive checks
openspec validate [change] --strict
# Debug delta parsing
openspec show [change] --json | jq '.deltas'
# Check specific requirement
openspec show [spec] --json -r 1
```
## Happy Path Script
```bash
# 1) Explore current state
openspec spec list --long
openspec list
# Optional full-text search:
# rg -n "Requirement:|Scenario:" openspec/specs
# rg -n "^#|Requirement:" openspec/changes
# 2) Choose change id and scaffold
CHANGE=add-two-factor-auth
mkdir -p openspec/changes/$CHANGE/specs/auth
printf "## Why\n...\n\n## What Changes\n- ...\n\n## Impact\n- ...\n" > openspec/changes/$CHANGE/proposal.md
printf "## 1. Implementation\n- [ ] 1.1 ...\n" > openspec/changes/$CHANGE/tasks.md
# 3) Add deltas (example)
cat > openspec/changes/$CHANGE/specs/auth/spec.md << 'EOF'
## ADDED Requirements
### Requirement: Two-Factor Authentication
Users MUST provide a second factor during login.
#### Scenario: OTP required
- **WHEN** valid credentials are provided
- **THEN** an OTP challenge is required
EOF
# 4) Validate
openspec validate $CHANGE --strict
```
## Multi-Capability Example
```
openspec/changes/add-2fa-notify/
├── proposal.md
├── tasks.md
└── specs/
    ├── auth/
    │   └── spec.md   # ADDED: Two-Factor Authentication
    └── notifications/
        └── spec.md   # ADDED: OTP email notification
```
auth/spec.md
```markdown
## ADDED Requirements
### Requirement: Two-Factor Authentication
...
```
notifications/spec.md
```markdown
## ADDED Requirements
### Requirement: OTP Email Notification
...
```
## Best Practices
### Simplicity First
- Default to <100 lines of new code
- Single-file implementations until proven insufficient
- Avoid frameworks without clear justification
- Choose boring, proven patterns
### Complexity Triggers
Only add complexity with:
- Performance data showing current solution too slow
- Concrete scale requirements (>1000 users, >100MB data)
- Multiple proven use cases requiring abstraction
### Clear References
- Use `file.ts:42` format for code locations
- Reference specs as `specs/auth/spec.md`
- Link related changes and PRs
### Capability Naming
- Use verb-noun: `user-auth`, `payment-capture`
- Single purpose per capability
- 10-minute understandability rule
- Split if description needs "AND"
### Change ID Naming
- Use kebab-case, short and descriptive: `add-two-factor-auth`
- Prefer verb-led prefixes: `add-`, `update-`, `remove-`, `refactor-`
- Ensure uniqueness; if taken, append `-2`, `-3`, etc.
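The naming rules above are easy to encode as a quick pre-flight check. This is a minimal sketch with hypothetical helper names (`is_valid_change_id`, `unique_change_id`); it treats the preferred verb-led prefixes as mandatory, which is stricter than the guidance itself.

```python
import re

VERB_PREFIXES = ("add-", "update-", "remove-", "refactor-")
KEBAB_CASE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_change_id(change_id: str) -> bool:
    """True if the ID is kebab-case and starts with one of the listed verb prefixes."""
    return bool(KEBAB_CASE.match(change_id)) and change_id.startswith(VERB_PREFIXES)


def unique_change_id(change_id: str, existing: set[str]) -> str:
    """Append -2, -3, ... until the ID no longer collides with an existing one."""
    if change_id not in existing:
        return change_id
    suffix = 2
    while f"{change_id}-{suffix}" in existing:
        suffix += 1
    return f"{change_id}-{suffix}"


print(is_valid_change_id("add-two-factor-auth"))
print(unique_change_id("add-two-factor-auth", {"add-two-factor-auth"}))
```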
## Tool Selection Guide
| Task | Tool | Why |
|------|------|-----|
| Find files by pattern | Glob | Fast pattern matching |
| Search code content | Grep | Optimized regex search |
| Read specific files | Read | Direct file access |
| Explore unknown scope | Task | Multi-step investigation |
## Error Recovery
### Change Conflicts
1. Run `openspec list` to see active changes
2. Check for overlapping specs
3. Coordinate with change owners
4. Consider combining proposals
### Validation Failures
1. Run with `--strict` flag
2. Check JSON output for details
3. Verify spec file format
4. Ensure scenarios properly formatted
### Missing Context
1. Read project.md first
2. Check related specs
3. Review recent archives
4. Ask for clarification
## Quick Reference
### Stage Indicators
- `changes/` - Proposed, not yet built
- `specs/` - Built and deployed
- `archive/` - Completed changes
### File Purposes
- `proposal.md` - Why and what
- `tasks.md` - Implementation steps
- `design.md` - Technical decisions
- `spec.md` - Requirements and behavior
### CLI Essentials
```bash
openspec list # What's in progress?
openspec show [item] # View details
openspec validate --strict # Is it correct?
openspec archive <change-id> [--yes|-y] # Mark complete (add --yes for automation)
```
Remember: Specs are truth. Changes are proposals. Keep them in sync.


@@ -0,0 +1,87 @@
## Context
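This project is a desktop TTS tool that must integrate a PyQt6 GUI with the asynchronous network requests made by edge-tts. The main challenges are reconciling PyQt's event loop with asyncio and ensuring that long-running batch jobs do not block the UI.
The target users are in-house presentation authors with limited technical background, so the workflow must be simple and intuitive.
## Goals / Non-Goals
**Goals:**
- Batch conversion from Excel to MP3
- Support mixed Chinese-English and Vietnamese-English speech
- Provide a responsive GUI with progress feedback
- Tolerate errors: a single failed file does not interrupt the batch
**Non-Goals:**
- No real-time voice preview
- No voice editing features
- No video embedding
## Decisions
### Architecture pattern: MVC + Producer-Consumer
- **Decision**: Use MVC to separate concerns, with the Worker Thread acting as the Producer that generates task results
- **Rationale**: PyQt6 natively supports this pattern, and it is easy to maintain and test
### Threading model: QThread + Asyncio Event Loop
- **Decision**: Create a dedicated asyncio loop inside the Worker Thread
- **Alternatives**:
  - `qasync` integration - extra dependency, maintenance risk
  - Purely synchronous requests - poor performance, severe blocking
- **Rationale for choice**: Native solution, no extra dependencies, verified stable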
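The threading decision above can be sketched as follows. So that the example runs without PyQt6 installed, `threading.Thread` stands in for `QThread` and a plain callback stands in for a `pyqtSignal`; the class name `TTSWorker` and the placeholder `_synthesize` coroutine are assumptions for illustration, not the project's actual code.

```python
import asyncio
import threading
from typing import Callable


class TTSWorker(threading.Thread):
    """Stand-in for a QThread subclass that owns a private asyncio event loop."""

    def __init__(self, texts: list[str], on_done: Callable[[str], None]) -> None:
        super().__init__()
        self._texts = texts
        self._on_done = on_done  # in PyQt6 this would be a pyqtSignal's emit

    def run(self) -> None:
        # The loop is created inside the worker, so the UI thread's
        # event loop is never touched by the async TTS requests.
        loop = asyncio.new_event_loop()
        try:
            loop.run_until_complete(self._process_all())
        finally:
            loop.close()

    async def _process_all(self) -> None:
        for text in self._texts:
            await self._synthesize(text)
            self._on_done(text)

    async def _synthesize(self, text: str) -> None:
        # Placeholder for the real edge-tts request (assumed to be awaitable).
        await asyncio.sleep(0)


finished: list[str] = []
worker = TTSWorker(["slide 1", "slide 2"], finished.append)
worker.start()
worker.join()
print(finished)
```

In a real `QThread` subclass the same `run()` body applies, and results would be delivered to the UI through signals rather than a shared list.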
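### TTS engine: edge-tts
- **Decision**: Use the edge-tts Python package
- **Rationale**:
  - Free access to Azure Neural Voice
  - No API key required
  - Voice quality at SOTA level
### Voice selection strategy
- **Decision**: Provide a dropdown so users can pick a voice themselves; each option is annotated with its bilingual support
- **Default**: Keep the original mapping as the default options
- **Alternative**: Fixed mapping - simpler to operate but inflexible
- **Rationale for choice**: Users may need a male voice or a different style
#### Available voices (with bilingual support annotations)
| Language | Voice ID | Gender | Bilingual support | Description |
|----------|----------|--------|-------------------|-------------|
| zh-TW | zh-TW-HsiaoChenNeural | Female | Mixed Chinese-English ✓ | Intellectual, professional (default) |
| zh-TW | zh-TW-HsiaoYuNeural | Female | Mixed Chinese-English ✓ | Lively, youthful |
| zh-TW | zh-TW-YunJheNeural | Male | Mixed Chinese-English ✓ | Mature, steady |
| zh-CN | zh-CN-XiaoxiaoNeural | Female | Mixed Chinese-English ✓ | Sweet, friendly |
| zh-CN | zh-CN-YunyangNeural | Male | Mixed Chinese-English ✓ | News-broadcast style |
| vi-VN | vi-VN-HoaiMyNeural | Female | Mixed Vietnamese-English ✓ | Gentle, clear (default) |
| vi-VN | vi-VN-NamMinhNeural | Male | Mixed Vietnamese-English ✓ | Professional, calm |
| en-US | en-US-JennyNeural | Female | English only | Standard American (default) |
| en-US | en-US-AriaNeural | Female | English only | Natural, conversational |
| en-US | en-US-GuyNeural | Male | English only | Professional narration |
#### GUI dropdown grouping
- **Chinese voices** (suited to mixed Chinese-English presentations)
- **Vietnamese voices** (suited to mixed Vietnamese-English presentations)
- **English voices** (suited to English-only presentations)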
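One way to hold the voice table and the dropdown grouping in code is a small registry. This is a sketch under assumptions: the `Voice` dataclass, its field names, and `grouped_voices` are hypothetical; only the voice IDs, genders, defaults, and group structure come from the table and list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    locale: str
    voice_id: str
    gender: str
    bilingual: str
    is_default: bool = False


VOICES = [
    Voice("zh-TW", "zh-TW-HsiaoChenNeural", "F", "Chinese-English", is_default=True),
    Voice("zh-TW", "zh-TW-HsiaoYuNeural", "F", "Chinese-English"),
    Voice("zh-TW", "zh-TW-YunJheNeural", "M", "Chinese-English"),
    Voice("zh-CN", "zh-CN-XiaoxiaoNeural", "F", "Chinese-English"),
    Voice("zh-CN", "zh-CN-YunyangNeural", "M", "Chinese-English"),
    Voice("vi-VN", "vi-VN-HoaiMyNeural", "F", "Vietnamese-English", is_default=True),
    Voice("vi-VN", "vi-VN-NamMinhNeural", "M", "Vietnamese-English"),
    Voice("en-US", "en-US-JennyNeural", "F", "English only", is_default=True),
    Voice("en-US", "en-US-AriaNeural", "F", "English only"),
    Voice("en-US", "en-US-GuyNeural", "M", "English only"),
]

# Dropdown group label per language prefix, in display order.
GROUPS = {"zh": "Chinese voices", "vi": "Vietnamese voices", "en": "English voices"}


def grouped_voices() -> dict[str, list[Voice]]:
    """Bucket the registry into the three dropdown groups."""
    grouped: dict[str, list[Voice]] = {label: [] for label in GROUPS.values()}
    for voice in VOICES:
        language = voice.locale.split("-")[0]
        grouped[GROUPS[language]].append(voice)
    return grouped


for label, voices in grouped_voices().items():
    print(label, [v.voice_id for v in voices])
```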
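### Rate limit strategy
- **Decision**: Wait 0.5 seconds between requests
- **Rationale**: Prevents the IP from being blocked by Azure; testing showed this interval to be stable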
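A serial throttle of this kind might look like the sketch below. The function name `run_throttled` and the representation of each request as a zero-argument coroutine function are assumptions; the 0.5-second interval is the value chosen above.

```python
import asyncio
from typing import Awaitable, Callable

REQUEST_INTERVAL_S = 0.5  # interval chosen in the decision above


async def run_throttled(
    jobs: list[Callable[[], Awaitable[None]]],
    interval: float = REQUEST_INTERVAL_S,
) -> None:
    """Run jobs one at a time, sleeping `interval` seconds between requests."""
    for index, job in enumerate(jobs):
        if index > 0:
            # No delay before the first request, only between requests.
            await asyncio.sleep(interval)
        await job()
```

Because the jobs are awaited one after another, at most one request is in flight at a time, which matches the serial-processing trade-off described under Risks.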
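## Risks / Trade-offs
### Network dependency risk
- **Risk**: edge-tts depends on a network connection and cannot be used offline
- **Mitigation**: State the network requirement clearly and show network error messages
### API change risk
- **Risk**: Microsoft may change the Edge TTS API
- **Mitigation**: The edge-tts package is community-maintained and tracks upstream changes
### Single-threaded serialization
- **Risk**: Processing takes a long time when there are many files
- **Trade-off**: Stability is prioritized; concurrency is avoided so as not to trigger rate limiting
## Migration Plan
N/A - brand-new project, no existing system to migrate
## Resolved Questions
- ✅ Is custom voice selection needed? → **Yes, provide a dropdown annotated with bilingual support**
- ✅ Are output format options needed? → **No, output is fixed to MP3**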


@@ -0,0 +1,22 @@
# Change: Add Smart Slide Voiceover System
## Why
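In-house presentation production needs professional narration, but traditional recording is time-consuming and inconsistent in quality. A tool is needed that batch-generates high-quality audio files from an Excel script and supports reading Chinese or Vietnamese mixed with English terminology, lowering the barrier to producing presentations.
## What Changes
- **Add Excel input module**: parses script files in .xlsx format, with Filename/Text/Lang columns
- **Add TTS engine module**: integrates edge-tts to call Azure Neural Voice for multilingual speech synthesis
- **Add PyQt6 GUI**: provides file selection, a voice selection dropdown (annotated with bilingual support), progress monitoring, and a log display
- **Add threading model**: QThread + Asyncio architecture keeps the UI responsive
- **Add error handling**: a single failed file does not interrupt the batch; rate limit protection
## Impact
- Affected specs:
  - `excel-input` (new)
  - `tts-engine` (new)
  - `gui-interface` (new)
- Affected code:
  - `main.py` - application entry point and GUI logic
  - `environment.yml` - Conda environment configuration
  - `README.md` - usage instructions
  - `template.xlsx` - sample file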


@@ -0,0 +1,42 @@
## ADDED Requirements
### Requirement: Excel File Loading
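The system SHALL support loading Excel files in .xlsx format as the script input source.
#### Scenario: Valid Excel file selected
- **WHEN** the user selects a valid .xlsx file
- **THEN** the system parses the file contents and prepares them for processing
#### Scenario: Invalid file format rejected
- **WHEN** the user selects a file that is not in .xlsx format
- **THEN** the system shows a format error warning and does not process the file
### Requirement: Column Parsing
The system SHALL parse the standard column structure of the Excel file, which comprises three columns: Filename, Text, and Lang.
#### Scenario: Required columns present
- **WHEN** the Excel file contains the Filename and Text columns
- **THEN** the system parses all data rows successfully
#### Scenario: Missing required column
- **WHEN** the Excel file lacks the Filename or Text column
- **THEN** the system shows a missing-column error and stops processing
#### Scenario: Optional Lang column handling
- **WHEN** the Lang column is empty or absent
- **THEN** the system defaults to "zh" as the language setting
### Requirement: Data Validation
The system SHALL validate every data row, ensuring that required fields are not empty.
#### Scenario: Empty text field
- **WHEN** the Text field of a row is empty
- **THEN** the system logs a warning and skips the row
#### Scenario: Empty filename field
- **WHEN** the Filename field of a row is empty
- **THEN** the system logs a warning and skips the row
#### Scenario: Valid row processing
- **WHEN** both the Filename and Text fields have values
- **THEN** the system adds the row to the processing queue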
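The column-parsing and validation scenarios above could be implemented roughly as follows. This sketch deliberately uses only the standard library: it assumes the rows have already been read from the worksheet into dictionaries keyed by column name, with empty cells represented as `None` or `""` (pandas would represent them as NaN, which needs separate handling). `validate_rows` is a hypothetical name.

```python
from typing import Iterable

REQUIRED_COLUMNS = ("Filename", "Text")
DEFAULT_LANG = "zh"


def validate_rows(rows: Iterable[dict]) -> tuple[list[dict], list[str]]:
    """Split raw rows into a processing queue and a list of warnings."""
    rows = list(rows)
    if rows:
        missing = [c for c in REQUIRED_COLUMNS if c not in rows[0]]
        if missing:
            # Missing required column: stop processing entirely.
            raise ValueError(f"Missing required column(s): {', '.join(missing)}")
    queue: list[dict] = []
    warnings: list[str] = []
    for number, row in enumerate(rows, start=2):  # row 1 is the header in Excel
        filename = str(row.get("Filename") or "").strip()
        text = str(row.get("Text") or "").strip()
        if not filename or not text:
            empty = "Filename" if not filename else "Text"
            warnings.append(f"Row {number}: empty {empty}, skipped")
            continue
        # Lang is optional; fall back to the default when empty or absent.
        lang = str(row.get("Lang") or "").strip() or DEFAULT_LANG
        queue.append({"Filename": filename, "Text": text, "Lang": lang})
    return queue, warnings


sample = [
    {"Filename": "Slide_01", "Text": "Hello", "Lang": "en"},
    {"Filename": "Slide_02", "Text": "", "Lang": "zh"},
    {"Filename": None, "Text": "orphan text", "Lang": None},
    {"Filename": "Slide_04", "Text": "你好", "Lang": None},
]
queue, warnings = validate_rows(sample)
print([row["Filename"] for row in queue], warnings)
```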


@@ -0,0 +1,101 @@
## ADDED Requirements
### Requirement: File Selection
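The system SHALL provide a file picker that lets the user select an Excel script file.
#### Scenario: Open file dialog
- **WHEN** the user clicks the file selection button
- **THEN** the system opens a file dialog filtered to show .xlsx files
#### Scenario: File path display
- **WHEN** the user has selected a file
- **THEN** the system displays the selected file path in the interface
### Requirement: Voice Selection
The system SHALL provide a voice selection dropdown that lets the user choose a TTS voice and annotates each voice's bilingual support.
#### Scenario: Voice dropdown display
- **WHEN** the interface has finished loading
- **THEN** the system displays the voice dropdown, grouped by language (Chinese/Vietnamese/English)
#### Scenario: Voice option format
- **WHEN** the user expands the dropdown
- **THEN** each option is shown in the format "voice name (gender) - bilingual support note"
#### Scenario: Default voice selection
- **WHEN** the user has not manually selected a voice
- **THEN** the system uses the default voices (Chinese: HsiaoChenNeural, Vietnamese: HoaiMyNeural, English: JennyNeural)
#### Scenario: Voice selection persistence
- **WHEN** the user selects a voice and then starts processing
- **THEN** all script rows are synthesized with the selected voice
### Requirement: Start Control
The system SHALL provide a "Start" (開始) button that launches the batch generation flow.
#### Scenario: Start without file
- **WHEN** the user clicks "Start" without having selected a file
- **THEN** the system shows a warning message and takes no action
#### Scenario: Start with valid file
- **WHEN** the user has selected a valid file and clicks "Start"
- **THEN** the system begins processing and disables the "Start" button
### Requirement: Stop Control
The system SHALL provide a "Stop" (停止) button that lets the user interrupt processing.
#### Scenario: Stop during processing
- **WHEN** the user clicks "Stop" during processing
- **THEN** the system finishes the current file and then stops, without processing the remaining files
#### Scenario: Stop button state
- **WHEN** the system is not processing
- **THEN** the "Stop" button is disabled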
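The stop behaviour described above (finish the current file, then halt) is commonly done with a cooperative flag that is checked only between items. The sketch below uses `threading.Event` as that flag; the class `StoppableBatch` and its method names are hypothetical, and in the real application the loop would live inside the worker thread.

```python
import threading
from typing import Callable


class StoppableBatch:
    """Processes items in order until finished or until a stop is requested."""

    def __init__(self, items: list[str]) -> None:
        self._items = items
        self._stop = threading.Event()
        self.completed: list[str] = []

    def request_stop(self) -> None:
        """Called from the UI thread when the user clicks Stop."""
        self._stop.set()

    def run(self, process: Callable[[str], None]) -> None:
        for item in self._items:
            # The flag is checked only between files, so a file that has
            # already started is always allowed to finish.
            if self._stop.is_set():
                break
            process(item)
            self.completed.append(item)


batch = StoppableBatch(["Slide_01", "Slide_02", "Slide_03"])


def fake_process(item: str) -> None:
    # Simulate the user clicking Stop while the second file is in progress.
    if item == "Slide_02":
        batch.request_stop()


batch.run(fake_process)
print(batch.completed)
```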
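### Requirement: Progress Display
The system SHALL display a progress bar that reflects batch progress.
#### Scenario: Progress update
- **WHEN** generation of a file completes
- **THEN** the progress bar percentage is updated to (completed count / total count) * 100
#### Scenario: Progress reset
- **WHEN** a new batch run starts
- **THEN** the progress bar is reset to 0%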
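The percentage formula in the progress scenario is simple, but the edge cases are worth pinning down. A minimal sketch, assuming an integer percentage is wanted and that an empty batch should report 0% rather than divide by zero; `progress_percent` is a hypothetical helper name.

```python
def progress_percent(completed: int, total: int) -> int:
    """Return (completed / total) * 100 as a whole-number percentage."""
    if total <= 0:
        return 0  # empty batch: avoid division by zero
    return min(100, (completed * 100) // total)


print([progress_percent(done, 4) for done in range(5)])
```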
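### Requirement: Log Display
The system SHALL provide a log window that shows processing status in real time.
#### Scenario: Processing log
- **WHEN** processing of a file begins
- **THEN** the log shows "正在處理: [Filename]" (Processing: [Filename])
#### Scenario: Success log
- **WHEN** a file is generated successfully
- **THEN** the log shows "完成: [Filename]" (Done: [Filename])
#### Scenario: Error log
- **WHEN** generation of a file fails
- **THEN** the log shows "錯誤: [Filename] - [錯誤訊息]" (Error: [Filename] - [error message])
### Requirement: UI Responsiveness
The system SHALL keep the UI responsive during long-running processing.
#### Scenario: Window interaction during processing
- **WHEN** batch processing is in progress
- **THEN** the user can drag, resize, and minimize the window without any freezing
#### Scenario: No UI blocking
- **WHEN** the TTS engine is making a network request
- **THEN** the main window's event loop keeps running and the window never shows "Not Responding"
### Requirement: Completion Notification
The system SHALL notify the user when batch processing finishes.
#### Scenario: Batch complete
- **WHEN** all files have been processed
- **THEN** the system shows a completion dialog with success/failure counts
#### Scenario: Partial completion
- **WHEN** batch processing is interrupted by the user
- **THEN** the system shows an interruption notice with the number of files completed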


@@ -0,0 +1,68 @@
## ADDED Requirements
### Requirement: Voice Synthesis
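The system SHALL use the edge-tts engine to convert text into audio files (MP3 format).
#### Scenario: Successful voice generation
- **WHEN** valid text content and a language setting are provided
- **THEN** the system generates the corresponding MP3 audio file
#### Scenario: Network error handling
- **WHEN** the network connection drops or times out
- **THEN** the system retries once and, if the retry also fails, logs the error and skips the item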
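The retry-once behaviour in the network error scenario might be sketched like this. The request is modelled as a zero-argument coroutine function, and the choice to treat `OSError` and timeouts as network failures is an assumption: the exception types actually raised by edge-tts are not stated here. `synthesize_with_retry` is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def synthesize_with_retry(
    request: Callable[[], Awaitable[None]],
    retries: int = 1,
) -> Optional[Exception]:
    """Attempt the request, retrying up to `retries` times.

    Returns None on success, or the last error so the caller can log it
    and skip the item.
    """
    last_error: Optional[Exception] = None
    for _attempt in range(retries + 1):
        try:
            await request()
            return None
        except (OSError, asyncio.TimeoutError) as error:
            # Assumed network-failure types; ConnectionError is an OSError.
            last_error = error
    return last_error
```

Returning the error instead of raising keeps the caller's batch loop simple: a non-`None` result means "log and move on to the next row".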
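### Requirement: Voice Selection Support
The system SHALL synthesize with the voice the user selects in the GUI, and SHALL provide a default mapping based on the Lang column.
#### Scenario: User selected voice
- **WHEN** the user selects a specific voice in the GUI
- **THEN** the system uses that voice to synthesize all script rows
#### Scenario: Default voice by Lang column
- **WHEN** the user has not selected a voice and the Excel Lang column is "zh" or "zh-tw"
- **THEN** the system uses the zh-TW-HsiaoChenNeural voice
#### Scenario: Vietnamese default voice
- **WHEN** the user has not selected a voice and the Excel Lang column is "vi"
- **THEN** the system uses the vi-VN-HoaiMyNeural voice
#### Scenario: English default voice
- **WHEN** the user has not selected a voice and the Excel Lang column is "en"
- **THEN** the system uses the en-US-JennyNeural voice
#### Scenario: Unknown language fallback
- **WHEN** the user has not selected a voice and the Lang column holds an unknown value or is blank
- **THEN** the system falls back to the zh-TW-HsiaoChenNeural voice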
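The five scenarios above reduce to one small lookup. In this sketch the function name `resolve_voice` is hypothetical, and matching the Lang value case-insensitively after trimming whitespace is an assumption that goes slightly beyond what the scenarios state.

```python
from typing import Optional

DEFAULT_VOICES = {
    "zh": "zh-TW-HsiaoChenNeural",
    "zh-tw": "zh-TW-HsiaoChenNeural",
    "vi": "vi-VN-HoaiMyNeural",
    "en": "en-US-JennyNeural",
}
FALLBACK_VOICE = "zh-TW-HsiaoChenNeural"


def resolve_voice(selected: Optional[str], lang: Optional[str]) -> str:
    """Pick the GUI-selected voice if any, else the default for the row's Lang."""
    if selected:
        return selected
    key = (lang or "").strip().lower()
    return DEFAULT_VOICES.get(key, FALLBACK_VOICE)


print(resolve_voice(None, "vi"))
print(resolve_voice("en-US-GuyNeural", "zh"))
print(resolve_voice(None, "fr"))
```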
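### Requirement: Rate Limit Protection
The system SHALL insert a delay between TTS requests to prevent the IP from being blocked.
#### Scenario: Request throttling
- **WHEN** multiple TTS requests are sent in succession
- **THEN** the system waits at least 0.5 seconds between requests
### Requirement: Batch Processing
The system SHALL support batch processing of multiple script rows, and a single failure SHALL NOT interrupt the overall run.
#### Scenario: Batch with failures
- **WHEN** some files in the batch fail to generate
- **THEN** the system records the failed items and continues with the remaining files
#### Scenario: All files successful
- **WHEN** every file in the batch is generated successfully
- **THEN** the system shows a completion notice and reports the success count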
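The fault-tolerant batch loop can be sketched as below. `run_batch` and the synchronous `synthesize` callback are simplifications for illustration (the real worker is asynchronous); the point is only that each item is wrapped in its own `try` so one failure cannot end the loop.

```python
from typing import Callable


def run_batch(
    items: list[str],
    synthesize: Callable[[str], None],
) -> tuple[list[str], dict[str, str]]:
    """Process every item; return (succeeded names, failed name -> error message)."""
    succeeded: list[str] = []
    failed: dict[str, str] = {}
    for name in items:
        try:
            synthesize(name)
        except Exception as error:  # broad on purpose: one bad item must not stop the batch
            failed[name] = str(error)
        else:
            succeeded.append(name)
    return succeeded, failed


def fake_synthesize(name: str) -> None:
    if name == "Slide_02":
        raise RuntimeError("network timeout")


ok, errors = run_batch(["Slide_01", "Slide_02", "Slide_03"], fake_synthesize)
print(ok, errors)
```

The two return values map directly onto the success and failure counts shown in the completion notice.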
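### Requirement: Output File Management
The system SHALL save generated audio files to the specified output path.
#### Scenario: Output directory creation
- **WHEN** the output path does not exist
- **THEN** the system creates the directory automatically
#### Scenario: File naming
- **WHEN** an audio file is generated
- **THEN** the file name is the value of the Excel Filename column plus the .mp3 extension
#### Scenario: File overwrite
- **WHEN** a file with the same name already exists in the output path
- **THEN** the system overwrites the existing file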
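The three output scenarios map onto a few `pathlib` calls. This is a minimal sketch with hypothetical helper names (`output_path`, `save_audio`); it assumes the audio is available as bytes, whereas the real engine may stream directly to a file.

```python
import tempfile
from pathlib import Path


def output_path(output_dir: Path, filename: str) -> Path:
    """Ensure the output directory exists and return <dir>/<Filename>.mp3."""
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir / f"{filename}.mp3"


def save_audio(output_dir: Path, filename: str, audio: bytes) -> Path:
    target = output_path(output_dir, filename)
    # write_bytes truncates the file first, so an existing file is overwritten.
    target.write_bytes(audio)
    return target


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "output" / "audio"
    save_audio(out_dir, "Slide_01", b"first")
    saved = save_audio(out_dir, "Slide_01", b"second")
    saved_name = saved.name
    saved_content = saved.read_bytes()
    print(saved_name, saved_content)
```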


@@ -0,0 +1,48 @@
## 1. Environment Setup
- [x] 1.1 Create `environment.yml` with Python 3.10, PyQt6, edge-tts, pandas, openpyxl
- [x] 1.2 Test conda environment creation and activation
## 2. Excel Input Module
- [x] 2.1 Implement Excel file loading with openpyxl/pandas
- [x] 2.2 Implement column parsing (Filename, Text, Lang)
- [x] 2.3 Add data validation for required fields
- [x] 2.4 Implement default language fallback (zh)
## 3. TTS Engine Module
- [x] 3.1 Define voice registry with all available voices and bilingual support annotations
- [x] 3.2 Implement default voice mapping dictionary (zh/vi/en -> Voice ID)
- [x] 3.3 Implement edge-tts wrapper for voice synthesis with configurable voice
- [x] 3.4 Add rate limit delay (0.5s between requests)
- [x] 3.5 Implement network error handling with retry
- [x] 3.6 Add output directory creation logic
## 4. Threading Model
- [x] 4.1 Create TTS Worker class extending QThread
- [x] 4.2 Implement asyncio event loop within worker thread
- [x] 4.3 Define pyqtSignal for progress, log, completion events
- [x] 4.4 Implement stop flag for graceful cancellation
## 5. GUI Implementation
- [x] 5.1 Create main window with PyQt6
- [x] 5.2 Add file browser widget with .xlsx filter
- [x] 5.3 Add voice selection dropdown (grouped by language, with bilingual annotations)
- [x] 5.4 Add Start/Stop buttons with state management
- [x] 5.5 Add progress bar widget
- [x] 5.6 Add log console (QTextEdit) with auto-scroll
- [x] 5.7 Connect signals to UI update slots
- [x] 5.8 Implement completion notification dialog
## 6. Integration
- [x] 6.1 Wire Excel parser to TTS worker
- [x] 6.2 Pass selected voice from GUI to TTS worker
- [x] 6.3 Connect worker signals to GUI updates
- [x] 6.4 Implement start/stop button handlers
## 7. Testing & Documentation
- [x] 7.1 Create `template.xlsx` with sample data
- [x] 7.2 Test batch processing with 10+ items
- [x] 7.3 Test voice selection (verify different voices produce correct output)
- [x] 7.4 Test UI responsiveness during processing
- [x] 7.5 Test stop functionality mid-batch
- [x] 7.6 Test error handling (invalid file, network error)
- [x] 7.7 Write README.md with usage instructions (including voice selection guide)

openspec/project.md

@@ -0,0 +1,71 @@
# Project Context
## Purpose
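Smart Slide Voiceover System (智慧簡報旁白生成系統) - a desktop application that batch-converts presentation scripts written in Excel into professional-grade, lifelike, and approachable audio files (MP3). The system addresses the naturalness of cross-language pronunciation (Chinese with English technical terms, Vietnamese with English technical terms).
## Tech Stack
- **Language**: Python 3.10+
- **GUI framework**: PyQt6 (natively compiled bindings; performs better than Electron)
- **TTS engine**: edge-tts (Python library that calls Azure Cognitive Services)
- **Data processing**: Pandas + Openpyxl (reading and parsing Excel)
- **Concurrency model**: QThread + Asyncio (resolves the conflict between the PyQt event loop and asynchronous requests)
- **Environment management**: Conda (environment.yml)
## Project Conventions
### Code Style
- Architecture pattern: MVC (Model-View-Controller) / Producer-Consumer Pattern
- The main thread handles UI rendering; the Worker Thread performs TTS requests
- Progress and log messages are sent back to the Main Thread via pyqtSignal
- Use Type Hints to improve code readability
### Architecture Patterns
- **Main Thread (UI)**: renders the interface, receives user actions, updates the progress bar
- **Worker Thread (QThread)**:
  - Creates a dedicated asyncio Event Loop
  - Parses the Excel file
  - Executes TTS requests serially
  - Reports status back via Signal
### Testing Strategy
- Environment tests: Conda environment creation, dependency completeness
- Functional tests: safeguards on file loading, batch generation, forced interruption, UI responsiveness
- Voice quality acceptance: correct language mapping, fluency of mixed Chinese-English and Vietnamese-English speech, completeness of the audio
### Git Workflow
- Develop on feature branches
- Write commit messages in Chinese
## Domain Context
### Voice Mapping
| Lang column | Voice ID | Characteristics |
|-------------|----------|-----------------|
| vi | vi-VN-HoaiMyNeural | Female; bright, gentle timbre; clear articulation |
| zh / zh-tw | zh-TW-HsiaoChenNeural | Female; standard Taiwanese accent; intellectual and professional |
| en | en-US-JennyNeural | Female; standard American accent |
### Excel Input Format
| Column | Required | Description |
|--------|----------|-------------|
| Filename | Required | Output audio file name (e.g. Slide_01) |
| Text | Required | Narration content; supports mixed Chinese-English and Vietnamese-English |
| Lang | Optional | Primary language (zh/vi); defaults to zh |
## Important Constraints
- **Rate limit protection**: leave an interval between requests to prevent the IP from being blocked
- **Performance**: generation latency for a single audio item must not exceed 3 seconds; batch queues of 100+ items are supported
- **Error tolerance**: a single failed file does not interrupt the batch; the error is logged and processing moves to the next item
- **UI responsiveness**: the main window must not freeze while long-running tasks execute
- **Hardware baseline**: PC (Windows), 32GB RAM, RTX 4060 (mainly CPU and network bandwidth are used)
## External Dependencies
- **Microsoft Edge-TTS**: Azure Cognitive Services Neural Voice API
- **Network**: stable 10Mbps+ connection
- **Operating system**: Windows 10/11
## Deliverables
- `main.py`: GUI and logic code
- `environment.yml`: Conda environment configuration
- `README.md`: operating instructions
- `template.xlsx`: format sample file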


@@ -0,0 +1,46 @@
# excel-input Specification
## Purpose
TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive.
## Requirements
### Requirement: Excel File Loading
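The system SHALL support loading Excel files in .xlsx format as the script input source.
#### Scenario: Valid Excel file selected
- **WHEN** the user selects a valid .xlsx file
- **THEN** the system parses the file contents and prepares them for processing
#### Scenario: Invalid file format rejected
- **WHEN** the user selects a file that is not in .xlsx format
- **THEN** the system shows a format error warning and does not process the file
### Requirement: Column Parsing
The system SHALL parse the standard column structure of the Excel file, which comprises three columns: Filename, Text, and Lang.
#### Scenario: Required columns present
- **WHEN** the Excel file contains the Filename and Text columns
- **THEN** the system parses all data rows successfully
#### Scenario: Missing required column
- **WHEN** the Excel file lacks the Filename or Text column
- **THEN** the system shows a missing-column error and stops processing
#### Scenario: Optional Lang column handling
- **WHEN** the Lang column is empty or absent
- **THEN** the system defaults to "zh" as the language setting
### Requirement: Data Validation
The system SHALL validate every data row, ensuring that required fields are not empty.
#### Scenario: Empty text field
- **WHEN** the Text field of a row is empty
- **THEN** the system logs a warning and skips the row
#### Scenario: Empty filename field
- **WHEN** the Filename field of a row is empty
- **THEN** the system logs a warning and skips the row
#### Scenario: Valid row processing
- **WHEN** both the Filename and Text fields have values
- **THEN** the system adds the row to the processing queue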


@@ -0,0 +1,105 @@
# gui-interface Specification
## Purpose
TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive.
## Requirements
### Requirement: File Selection
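The system SHALL provide a file picker that lets the user select an Excel script file.
#### Scenario: Open file dialog
- **WHEN** the user clicks the file selection button
- **THEN** the system opens a file dialog filtered to show .xlsx files
#### Scenario: File path display
- **WHEN** the user has selected a file
- **THEN** the system displays the selected file path in the interface
### Requirement: Voice Selection
The system SHALL provide a voice selection dropdown that lets the user choose a TTS voice and annotates each voice's bilingual support.
#### Scenario: Voice dropdown display
- **WHEN** the interface has finished loading
- **THEN** the system displays the voice dropdown, grouped by language (Chinese/Vietnamese/English)
#### Scenario: Voice option format
- **WHEN** the user expands the dropdown
- **THEN** each option is shown in the format "voice name (gender) - bilingual support note"
#### Scenario: Default voice selection
- **WHEN** the user has not manually selected a voice
- **THEN** the system uses the default voices (Chinese: HsiaoChenNeural, Vietnamese: HoaiMyNeural, English: JennyNeural)
#### Scenario: Voice selection persistence
- **WHEN** the user selects a voice and then starts processing
- **THEN** all script rows are synthesized with the selected voice
### Requirement: Start Control
The system SHALL provide a "Start" (開始) button that launches the batch generation flow.
#### Scenario: Start without file
- **WHEN** the user clicks "Start" without having selected a file
- **THEN** the system shows a warning message and takes no action
#### Scenario: Start with valid file
- **WHEN** the user has selected a valid file and clicks "Start"
- **THEN** the system begins processing and disables the "Start" button
### Requirement: Stop Control
The system SHALL provide a "Stop" (停止) button that lets the user interrupt processing.
#### Scenario: Stop during processing
- **WHEN** the user clicks "Stop" during processing
- **THEN** the system finishes the current file and then stops, without processing the remaining files
#### Scenario: Stop button state
- **WHEN** the system is not processing
- **THEN** the "Stop" button is disabled
### Requirement: Progress Display
The system SHALL display a progress bar that reflects batch progress.
#### Scenario: Progress update
- **WHEN** generation of a file completes
- **THEN** the progress bar percentage is updated to (completed count / total count) * 100
#### Scenario: Progress reset
- **WHEN** a new batch run starts
- **THEN** the progress bar is reset to 0%
### Requirement: Log Display
The system SHALL provide a log window that shows processing status in real time.
#### Scenario: Processing log
- **WHEN** processing of a file begins
- **THEN** the log shows "正在處理: [Filename]" (Processing: [Filename])
#### Scenario: Success log
- **WHEN** a file is generated successfully
- **THEN** the log shows "完成: [Filename]" (Done: [Filename])
#### Scenario: Error log
- **WHEN** generation of a file fails
- **THEN** the log shows "錯誤: [Filename] - [錯誤訊息]" (Error: [Filename] - [error message])
### Requirement: UI Responsiveness
The system SHALL keep the UI responsive during long-running processing.
#### Scenario: Window interaction during processing
- **WHEN** batch processing is in progress
- **THEN** the user can drag, resize, and minimize the window without any freezing
#### Scenario: No UI blocking
- **WHEN** the TTS engine is making a network request
- **THEN** the main window's event loop keeps running and the window never shows "Not Responding"
### Requirement: Completion Notification
The system SHALL notify the user when batch processing finishes.
#### Scenario: Batch complete
- **WHEN** all files have been processed
- **THEN** the system shows a completion dialog with success/failure counts
#### Scenario: Partial completion
- **WHEN** batch processing is interrupted by the user
- **THEN** the system shows an interruption notice with the number of files completed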


@@ -0,0 +1,72 @@
# tts-engine Specification
## Purpose
TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive.
## Requirements
### Requirement: Voice Synthesis
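The system SHALL use the edge-tts engine to convert text into audio files (MP3 format).
#### Scenario: Successful voice generation
- **WHEN** valid text content and a language setting are provided
- **THEN** the system generates the corresponding MP3 audio file
#### Scenario: Network error handling
- **WHEN** the network connection drops or times out
- **THEN** the system retries once and, if the retry also fails, logs the error and skips the item
### Requirement: Voice Selection Support
The system SHALL synthesize with the voice the user selects in the GUI, and SHALL provide a default mapping based on the Lang column.
#### Scenario: User selected voice
- **WHEN** the user selects a specific voice in the GUI
- **THEN** the system uses that voice to synthesize all script rows
#### Scenario: Default voice by Lang column
- **WHEN** the user has not selected a voice and the Excel Lang column is "zh" or "zh-tw"
- **THEN** the system uses the zh-TW-HsiaoChenNeural voice
#### Scenario: Vietnamese default voice
- **WHEN** the user has not selected a voice and the Excel Lang column is "vi"
- **THEN** the system uses the vi-VN-HoaiMyNeural voice
#### Scenario: English default voice
- **WHEN** the user has not selected a voice and the Excel Lang column is "en"
- **THEN** the system uses the en-US-JennyNeural voice
#### Scenario: Unknown language fallback
- **WHEN** the user has not selected a voice and the Lang column holds an unknown value or is blank
- **THEN** the system falls back to the zh-TW-HsiaoChenNeural voice
### Requirement: Rate Limit Protection
The system SHALL insert a delay between TTS requests to prevent the IP from being blocked.
#### Scenario: Request throttling
- **WHEN** multiple TTS requests are sent in succession
- **THEN** the system waits at least 0.5 seconds between requests
### Requirement: Batch Processing
The system SHALL support batch processing of multiple script rows, and a single failure SHALL NOT interrupt the overall run.
#### Scenario: Batch with failures
- **WHEN** some files in the batch fail to generate
- **THEN** the system records the failed items and continues with the remaining files
#### Scenario: All files successful
- **WHEN** every file in the batch is generated successfully
- **THEN** the system shows a completion notice and reports the success count
### Requirement: Output File Management
The system SHALL save generated audio files to the specified output path.
#### Scenario: Output directory creation
- **WHEN** the output path does not exist
- **THEN** the system creates the directory automatically
#### Scenario: File naming
- **WHEN** an audio file is generated
- **THEN** the file name is the value of the Excel Filename column plus the .mp3 extension
#### Scenario: File overwrite
- **WHEN** a file with the same name already exists in the output path
- **THEN** the system overwrites the existing file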