commit 33ea22f259f84f90716abd393a17eb405e397e05
Author: beabigegg
Date:   Sat Dec 27 15:42:11 2025 +0800

    feat: Add Smart Slide Voiceover System (智慧簡報旁白生成系統)

    - Add Excel input module: parses script files in .xlsx format
    - Add TTS engine module: integrates edge-tts to call Azure Neural Voice
    - Add PyQt6 GUI: file selection, voice selection, progress monitoring
    - Add threading model: QThread + Asyncio to keep the UI responsive
    - Support 10 Neural Voices (Chinese/Vietnamese/English)
    - Support mixed Chinese-English and mixed Vietnamese-English pronunciation

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude Opus 4.5

diff --git a/.claude/commands/openspec/apply.md b/.claude/commands/openspec/apply.md
new file mode 100644
index 0000000..a36fd96
--- /dev/null
+++ b/.claude/commands/openspec/apply.md
@@ -0,0 +1,23 @@
+---
+name: OpenSpec: Apply
+description: Implement an approved OpenSpec change and keep tasks in sync.
+category: OpenSpec
+tags: [openspec, apply]
+---
+
+**Guardrails**
+- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
+- Keep changes tightly scoped to the requested outcome.
+- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
+
+**Steps**
+Track these steps as TODOs and complete them one by one.
+1. Read `changes//proposal.md`, `design.md` (if present), and `tasks.md` to confirm scope and acceptance criteria.
+2. Work through tasks sequentially, keeping edits minimal and focused on the requested change.
+3. Confirm completion before updating statuses—make sure every item in `tasks.md` is finished.
+4. Update the checklist after all work is done so each task is marked `- [x]` and reflects reality.
+5. Reference `openspec list` or `openspec show ` when additional context is required.
+
+**Reference**
+- Use `openspec show --json --deltas-only` if you need additional context from the proposal while implementing.
+
diff --git a/.claude/commands/openspec/archive.md b/.claude/commands/openspec/archive.md
new file mode 100644
index 0000000..dbc7695
--- /dev/null
+++ b/.claude/commands/openspec/archive.md
@@ -0,0 +1,27 @@
+---
+name: OpenSpec: Archive
+description: Archive a deployed OpenSpec change and update specs.
+category: OpenSpec
+tags: [openspec, archive]
+---
+
+**Guardrails**
+- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
+- Keep changes tightly scoped to the requested outcome.
+- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
+
+**Steps**
+1. Determine the change ID to archive:
+   - If this prompt already includes a specific change ID (for example inside a `` block populated by slash-command arguments), use that value after trimming whitespace.
+   - If the conversation references a change loosely (for example by title or summary), run `openspec list` to surface likely IDs, share the relevant candidates, and confirm which one the user intends.
+   - Otherwise, review the conversation, run `openspec list`, and ask the user which change to archive; wait for a confirmed change ID before proceeding.
+   - If you still cannot identify a single change ID, stop and tell the user you cannot archive anything yet.
+2. Validate the change ID by running `openspec list` (or `openspec show `) and stop if the change is missing, already archived, or otherwise not ready to archive.
+3. Run `openspec archive --yes` so the CLI moves the change and applies spec updates without prompts (use `--skip-specs` only for tooling-only work).
+4. Review the command output to confirm the target specs were updated and the change landed in `changes/archive/`.
+5. Validate with `openspec validate --strict` and inspect with `openspec show ` if anything looks off.
+
+**Reference**
+- Use `openspec list` to confirm change IDs before archiving.
+- Inspect refreshed specs with `openspec list --specs` and address any validation issues before handing off.
+
diff --git a/.claude/commands/openspec/proposal.md b/.claude/commands/openspec/proposal.md
new file mode 100644
index 0000000..cbb75ce
--- /dev/null
+++ b/.claude/commands/openspec/proposal.md
@@ -0,0 +1,28 @@
+---
+name: OpenSpec: Proposal
+description: Scaffold a new OpenSpec change and validate strictly.
+category: OpenSpec
+tags: [openspec, change]
+---
+
+**Guardrails**
+- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
+- Keep changes tightly scoped to the requested outcome.
+- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
+- Identify any vague or ambiguous details and ask the necessary follow-up questions before editing files.
+- Do not write any code during the proposal stage. Only create design documents (proposal.md, tasks.md, design.md, and spec deltas). Implementation happens in the apply stage after approval.
+
+**Steps**
+1. Review `openspec/project.md`, run `openspec list` and `openspec list --specs`, and inspect related code or docs (e.g., via `rg`/`ls`) to ground the proposal in current behaviour; note any gaps that require clarification.
+2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, and `design.md` (when needed) under `openspec/changes//`.
+3. Map the change into concrete capabilities or requirements, breaking multi-scope efforts into distinct spec deltas with clear relationships and sequencing.
+4. Capture architectural reasoning in `design.md` when the solution spans multiple systems, introduces new patterns, or demands trade-off discussion before committing to specs.
+5.
Draft spec deltas in `changes//specs//spec.md` (one folder per capability) using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement and cross-reference related capabilities when relevant.
+6. Draft `tasks.md` as an ordered list of small, verifiable work items that deliver user-visible progress, include validation (tests, tooling), and highlight dependencies or parallelizable work.
+7. Validate with `openspec validate --strict` and resolve every issue before sharing the proposal.
+
+**Reference**
+- Use `openspec show --json --deltas-only` or `openspec show --type spec` to inspect details when validation fails.
+- Search existing requirements with `rg -n "Requirement:|Scenario:" openspec/specs` before writing new ones.
+- Explore the codebase with `rg `, `ls`, or direct file reads so proposals align with current implementation realities.
+
diff --git a/.claude/settings.local.json b/.claude/settings.local.json
new file mode 100644
index 0000000..e8be119
--- /dev/null
+++ b/.claude/settings.local.json
@@ -0,0 +1,19 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(openspec list:*)",
+      "Bash(openspec validate:*)",
+      "Bash(python:*)",
+      "Bash(conda env create:*)",
+      "Bash(conda activate:*)",
+      "Bash(conda run:*)",
+      "Bash(set PYTHONUTF8=1)",
+      "Bash(del:*)",
+      "Bash(openspec archive add-tts-voiceover-system:*)",
+      "Bash(git init:*)",
+      "Bash(git config:*)",
+      "Bash(git add:*)",
+      "Bash(git commit:*)"
+    ]
+  }
+}
diff --git a/.vscode/settings.json b/.vscode/settings.json
new file mode 100644
index 0000000..a8c2003
--- /dev/null
+++ b/.vscode/settings.json
@@ -0,0 +1,5 @@
+{
+  "python-envs.defaultEnvManager": "ms-python.python:conda",
+  "python-envs.defaultPackageManager": "ms-python.python:conda",
+  "python-envs.pythonProjects": []
+}
\ No newline at end of file
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..0669699
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,18 @@
+
+# OpenSpec Instructions
+
+These instructions
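are for AI assistants working in this project.
+
+Always open `@/openspec/AGENTS.md` when the request:
+- Mentions planning or proposals (words like proposal, spec, change, plan)
+- Introduces new capabilities, breaking changes, architecture shifts, or big performance/security work
+- Sounds ambiguous and you need the authoritative spec before coding
+
+Use `@/openspec/AGENTS.md` to learn:
+- How to create and apply change proposals
+- Spec format and conventions
+- Project structure and guidelines
+
+Keep this managed block so 'openspec update' can refresh the instructions.
+
+
\ No newline at end of file
diff --git a/Acceptance Criteria UAT.txt b/Acceptance Criteria UAT.txt
new file mode 100644
index 0000000..62f3719
--- /dev/null
+++ b/Acceptance Criteria UAT.txt
@@ -0,0 +1 @@
+Test environment: Windows 10/11, working network connection (10Mbps+), Conda environment activated. 1. Installation and environment tests (columns: ID | Test item | Expected result | P/F): ENV-01 | Conda environment creation | Running conda env create -f environment.yml reports no errors, and the environment can be activated successfully; ENV-02 | Dependency completeness | After running python main.py, the GUI window opens within 5 seconds. 2. Functional and operational tests (columns: ID | Test item | Expected result | P/F): FUN-01 | File-loading safeguard | Pressing "開始" (Start) without selecting an Excel file makes the system show a warning dialog, and the program does not crash; FUN-02 | Batch generation flow | After importing an Excel file with 10 rows and clicking Start, the progress bar advances with generation progress, and the Log window shows the file name currently being processed in real time; FUN-03 | Forced interruption test | When "停止" (Stop) is pressed while the 3rd item is being generated, the program should stop after finishing the 3rd item and must not go on to the 4th; FUN-04 | UI responsiveness | During generation, dragging or resizing the window causes no ghosting and no freezing (Not Responding). 3. Voice quality acceptance (Quality Assurance) (columns: ID | Test item | Expected result | P/F): QA-01 | Language mapping correctness | Files set to Lang=vi are confirmed by ear to be Vietnamese; files set to Lang=zh are Chinese; QA-02 | Chinese-English mixing fluency | (Test sentence: 本季的 Revenue 成長了 10%) Confirm that the English word is read naturally by the Chinese voice, with no pronunciation errors; QA-03 | Vietnamese-English mixing fluency | (Test sentence: Cái này yield rate rất tốt) Confirm that the English term is clear and the Vietnamese part has no robotic sound; QA-04 | Audio completeness | Spot-check 3 random audio files: no truncation at the end and no background electrical noise. 4. Deliverables: Source Code: main.py (GUI and logic). Config: environment.yml. Documentation: README.md (with operating instructions). Sample Data: template.xlsx (with format examples).
\ No newline at end of file
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..0669699
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,18 @@
+
+# OpenSpec Instructions
+
+These instructions are for AI assistants working in this project.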
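+
+Always open `@/openspec/AGENTS.md` when the request:
+- Mentions planning or proposals (words like proposal, spec, change, plan)
+- Introduces new capabilities, breaking changes, architecture shifts, or big performance/security work
+- Sounds ambiguous and you need the authoritative spec before coding
+
+Use `@/openspec/AGENTS.md` to learn:
+- How to create and apply change proposals
+- Spec format and conventions
+- Project structure and guidelines
+
+Keep this managed block so 'openspec update' can refresh the instructions.
+
+
\ No newline at end of file
diff --git a/Project Requirements Document.txt b/Project Requirements Document.txt
new file mode 100644
index 0000000..1c36282
--- /dev/null
+++ b/Project Requirements Document.txt
@@ -0,0 +1,60 @@
+Project name: Smart Slide Voiceover System (智慧簡報旁白生成系統) Version: v1.0 Date: 2025/12/27 Hardware baseline: PC (Windows), 32GB RAM, RTX 4060 (mainly uses CPU and network bandwidth)
+
+1. Executive Summary
+This project aims to develop a desktop application that helps users batch-convert presentation scripts written in Excel into professional-grade, lifelike, and approachable voice files. The system must solve the problem of natural pronunciation across languages (Chinese + English terminology, Vietnamese + English terminology) and provide a graphical interface to lower the barrier to use.
+
+2. User Flow
+Environment startup: the user launches the application through Conda.
+
+Material import: the user selects a standard-format Excel script file (.xlsx) through the GUI.
+
+Parameter confirmation: the user confirms the output path (created automatically by default).
+
+Run generation: the user clicks the "開始" (Start) button, and the system processes each script row in order.
+
+Status monitoring: the user monitors generation status through the progress bar and log window, and the interface stays responsive.
+
+Result acceptance: when generation finishes, a notification pops up and the user collects the MP3 files from the folder.
+
+3. Functional Requirements
+3.1 Input module
+Format support: only the .xlsx format is supported.
+
+Column definitions:
+
+Filename (required): output audio file name (for example, Slide_01).
+
+Text (required): narration content; mixed Chinese-English or mixed Vietnamese-English is supported.
+
+Lang (optional): the main language of the passage (zh or vi); defaults to zh if empty.
+
+3.2 Core processing module (TTS Engine)
+Engine selection: uses Microsoft Edge-TTS (Neural Voice) technology; no local GPU model inference is needed.
+
+Language handling logic:
+
+Vietnamese mode (vi): for Vietnam plant/customer presentations. Must read Vietnamese accurately and read embedded English terms (such as "Yield rate") with a natural accent.
+
+Chinese mode (zh): for internal reporting. Must use a Taiwanese accent with a professional, friendly tone and no robotic sound, and must read English terms fluently.
+
+Stability mechanism: must include rate-limit protection (an interval between requests) to prevent the IP from being blocked because requests are sent too quickly.
+
+3.3 Graphical User Interface (GUI)
+Framework: PyQt6.
+
+Main components:
+
+File browser.
+
+Action buttons: Start, Stop (forced stop).
+
+Visual feedback: percentage progress bar, multi-line log console.
+
+Interactivity: while a long-running task executes, the main window must not freeze (No Freeze) and must remain draggable and minimizable.
+
+4.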
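Non-Functional Requirements
+Deployability: dependencies must be managed with Conda to ensure the environment is portable.
+
+Performance: generation latency for a single audio item does not exceed 3 seconds (depending on network conditions); batch queues of 100 or more items are supported.
+
+Fault tolerance: a failure generating a single file (for example, a special-character error) should not interrupt the whole batch; the error is only logged and processing moves on to the next item.
\ No newline at end of file
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..28fdfe1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,130 @@
+# Smart Slide Voiceover System (智慧簡報旁白生成系統)
+
+Smart Slide Voiceover System - batch-convert Excel scripts into professional voice files
+
+## Features
+
+- Batch-process Excel scripts and generate MP3 audio files automatically
+- Chinese, Vietnamese, and English voices
+- Mixed Chinese-English and mixed Vietnamese-English pronunciation
+- 10 Neural Voices to choose from
+- Graphical interface that is simple to operate
+- Progress tracking and log display
+
+## System Requirements
+
+- Windows 10/11
+- Network connection (10Mbps+)
+- Conda environment
+
+## Installation
+
+1. Create the Conda environment:
+```bash
+conda env create -f environment.yml
+```
+
+2. Activate the environment:
+```bash
+conda activate tts_project
+```
+
+3. Run the program:
+```bash
+python main.py
+```
+
+## Usage
+
+### 1. Prepare the Excel script
+
+Create an `.xlsx` file with the following columns:
+
+| Column | Required | Description |
+|------|------|------|
+| Filename | Yes | Output audio file name (without extension) |
+| Text | Yes | Narration content |
+| Lang | No | Language code (default: zh) |
+
+**Accepted values for the Lang column:**
+- `zh` / `zh-tw` - Chinese (Taiwan)
+- `zh-cn` - Chinese (Mainland)
+- `vi` - Vietnamese
+- `en` - English
+
+### 2. Choose a voice
+
+The program offers two ways to choose a voice:
+
+**Automatic mode (default)**
+- Picks the matching voice automatically from the Lang column in the Excel file
+- Suited to presentations that mix several languages
+
+**Manual selection**
+- Pick a specific voice from the drop-down list
+- All scripts use the same voice
+
+### 3. Available voices
+
+#### Chinese voices (support mixed Chinese-English)
+
+| Voice ID | Gender | Characteristics |
+|----------|------|------|
+| zh-TW-HsiaoChenNeural | Female | Intellectual, professional (default) |
+| zh-TW-HsiaoYuNeural | Female | Lively, youthful |
+| zh-TW-YunJheNeural | Male | Mature, steady |
+| zh-CN-XiaoxiaoNeural | Female | Sweet, friendly |
+| zh-CN-YunyangNeural | Male | News-anchor style |
+
+#### Vietnamese voices (support mixed Vietnamese-English)
+
+| Voice ID | Gender | Characteristics |
+|----------|------|------|
+| vi-VN-HoaiMyNeural | Female | Gentle, clear (default) |
+| vi-VN-NamMinhNeural | Male | Professional, calm |
+
+#### English voices
+
+| Voice ID | Gender | Characteristics |
+|----------|------|------|
+| en-US-JennyNeural | Female | Standard American (default) |
+| en-US-AriaNeural | Female | Natural conversational |
+| en-US-GuyNeural | Male | Professional narration |
+
+### 4. Run generation
+
+1. Click "瀏覽" (Browse) to choose the Excel file
+2. Confirm the output folder (created automatically by default)
+3. Choose a voice (or use automatic mode)
+4. Click "開始" (Start)
+5. Wait for processing to finish
+
+### 5.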
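Tips
+
+- **Stopping midway**: click the "停止" (Stop) button; the system stops after finishing the current file
+- **Error handling**: a single failed file does not interrupt the whole batch
+- **Viewing results**: when finished, collect the MP3 files from the output folder
+
+## Sample Data
+
+See `template.xlsx` for the correct file format.
+
+## FAQ
+
+**Q: Why do some files fail to generate?**
+A: Possible causes:
+- Unstable network connection
+- Text contains special characters
+- A single text entry is too long
+
+**Q: Can the speaking rate be adjusted?**
+A: The current version uses the default rate; a future version may add an adjustment option.
+
+**Q: Can it be used offline?**
+A: No. The system uses a cloud TTS service and needs a network connection.
+
+## Technical Information
+
+- Framework: PyQt6
+- TTS engine: edge-tts (Azure Neural Voice)
+- Supported formats: Excel (.xlsx) -> MP3
diff --git a/Technical Specifications.txt b/Technical Specifications.txt
new file mode 100644
index 0000000..4472c48
--- /dev/null
+++ b/Technical Specifications.txt
@@ -0,0 +1,10 @@
+Architecture pattern: MVC (Model-View-Controller) / Producer-Consumer Pattern. Development language: Python 3.10+. 1. Tech Stack (columns: Component | Technology | Notes): GUI Framework | PyQt6 | Natively compiled bindings, better performance than Electron, suited to standalone tools; TTS Core | edge-tts (Python Lib) | Calls the Azure Cognitive Services interface to obtain SOTA-grade voices; Environment | Conda | Pins the Python and package versions through environment.yml; Data Processing | Pandas / Openpyxl | Reads and parses Excel data efficiently; Concurrency | QThread + Asyncio | Resolves the conflict between the PyQt event loop and asynchronous network requests. 2. Voice Mapping Strategy: to make sure the output voice meets the "professional and friendly" presentation requirement, the system binds the following Neural Voice IDs. Primary Mapping (Excel Lang Column): vi → vi-VN-HoaiMyNeural (characteristics: female, bright and gentle timbre, clear articulation, suited to external presentations); zh / zh-tw → zh-TW-HsiaoChenNeural (characteristics: female, standard Taiwanese accent, intellectual and professional, suited to training and reporting); en → en-US-JennyNeural (characteristics: female, standard American accent). 3.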
System Module Design. 3.1 Environment configuration (environment.yml): defines the project's standard runtime environment so that it stays consistent across team collaboration or machine changes. YAML: name: tts_project
+dependencies:
+  - python=3.10
+  - pip
+  - pip:
+    - PyQt6
+    - edge-tts
+    - pandas
+    - openpyxl
+3.2 Threading Model: because edge-tts performs asynchronous operations based on asyncio while PyQt runs on a synchronous event loop, a Worker Thread pattern must be used. Main Thread (UI): draws the interface, receives user clicks, and updates the progress bar. Worker Thread (QThread): creates an independent asyncio event loop inside this thread; parses the Excel file; executes TTS requests serially; sends progress and log messages back to the Main Thread through pyqtSignal. 3.3 Exception handling flow: Network interruption: catch aiohttp-related errors → retry once → on failure write to the log → skip. File in use: if the output file is open in a player, catch PermissionError → prompt the user to close the file.
\ No newline at end of file
diff --git a/environment.yml b/environment.yml
new file mode 100644
index 0000000..d25d7c2
--- /dev/null
+++ b/environment.yml
@@ -0,0 +1,12 @@
+name: tts_project
+channels:
+  - defaults
+  - conda-forge
+dependencies:
+  - python=3.10
+  - pip
+  - pip:
+    - PyQt6
+    - edge-tts
+    - pandas
+    - openpyxl
diff --git a/main.py b/main.py
new file mode 100644
index 0000000..3fd1376
--- /dev/null
+++ b/main.py
@@ -0,0 +1,474 @@
+"""
+Smart Slide Voiceover System
+智慧簡報旁白生成系統
+
+A desktop application for batch converting Excel scripts to professional voiceover audio files.
+"""
+
+import sys
+import asyncio
+import os
+from pathlib import Path
+from dataclasses import dataclass
+from typing import Optional
+
+import pandas as pd
+import edge_tts
+from PyQt6.QtWidgets import (
+    QApplication, QMainWindow, QWidget, QVBoxLayout, QHBoxLayout,
+    QPushButton, QLabel, QLineEdit, QFileDialog, QProgressBar,
+    QTextEdit, QComboBox, QMessageBox, QGroupBox
+)
+from PyQt6.QtCore import QThread, pyqtSignal, Qt
+
+
+# =============================================================================
+# Voice Registry - All available voices with bilingual support annotations
+# =============================================================================
+
+@dataclass
+class VoiceInfo:
+    voice_id: str
+    language: str
+    gender: str
+    bilingual: str
+    description: str
+
+VOICE_REGISTRY = [
+    # Chinese (Taiwan) - supports Chinese-English mixing
+    VoiceInfo("zh-TW-HsiaoChenNeural", "zh-TW", "女", "中英混雜 ✓", "知性專業 (預設)"),
+    VoiceInfo("zh-TW-HsiaoYuNeural", "zh-TW", "女", "中英混雜 ✓", "活潑年輕"),
+    VoiceInfo("zh-TW-YunJheNeural", "zh-TW", "男", "中英混雜 ✓", "成熟穩重"),
+    # Chinese (Mainland) - supports Chinese-English mixing
+    VoiceInfo("zh-CN-XiaoxiaoNeural", "zh-CN", "女", "中英混雜 ✓", "甜美親切"),
+    VoiceInfo("zh-CN-YunyangNeural", "zh-CN", "男", "中英混雜 ✓", "新聞播報風格"),
+    # Vietnamese - supports Vietnamese-English mixing
+    VoiceInfo("vi-VN-HoaiMyNeural", "vi-VN", "女", "越英混雜 ✓", "溫柔清晰 (預設)"),
+    VoiceInfo("vi-VN-NamMinhNeural", "vi-VN", "男", "越英混雜 ✓", "專業沉穩"),
+    # English (US) - English only
+    VoiceInfo("en-US-JennyNeural", "en-US", "女", "純英文", "標準美式 (預設)"),
+    VoiceInfo("en-US-AriaNeural", "en-US", "女", "純英文", "自然對話"),
+    VoiceInfo("en-US-GuyNeural", "en-US", "男", "純英文", "專業旁白"),
+]
+
+# Default voice mapping by language code
+DEFAULT_VOICE_MAP = {
+    "zh": "zh-TW-HsiaoChenNeural",
+    "zh-tw": "zh-TW-HsiaoChenNeural",
+    "zh-cn": "zh-CN-XiaoxiaoNeural",
+    "vi": "vi-VN-HoaiMyNeural",
+    "en": "en-US-JennyNeural",
+}
+
+# Voice groups for dropdown
+VOICE_GROUPS = {
+    "中文語音 (適合中英混雜簡報)":
["zh-TW", "zh-CN"],
+    "越南語音 (適合越英混雜簡報)": ["vi-VN"],
+    "英文語音 (適合純英文簡報)": ["en-US"],
+}
+
+
+# =============================================================================
+# Excel Input Module
+# =============================================================================
+
+@dataclass
+class ScriptRow:
+    filename: str
+    text: str
+    lang: str
+
+def load_excel(file_path: str) -> list[ScriptRow]:
+    """Load and parse Excel file, returning list of ScriptRow objects."""
+    df = pd.read_excel(file_path, engine='openpyxl')
+
+    # Normalize column names (case-insensitive)
+    df.columns = df.columns.str.strip().str.lower()
+
+    # Check required columns
+    if 'filename' not in df.columns:
+        raise ValueError("Excel 檔案缺少必要欄位: Filename")
+    if 'text' not in df.columns:
+        raise ValueError("Excel 檔案缺少必要欄位: Text")
+
+    rows = []
+    for idx, row in df.iterrows():
+        filename = str(row.get('filename', '')).strip()
+        text = str(row.get('text', '')).strip()
+        lang = str(row.get('lang', 'zh')).strip().lower()
+
+        # Skip rows with empty required fields
+        if not filename or filename == 'nan':
+            continue
+        if not text or text == 'nan':
+            continue
+
+        # Default language fallback
+        if not lang or lang == 'nan':
+            lang = 'zh'
+
+        rows.append(ScriptRow(filename=filename, text=text, lang=lang))
+
+    return rows
+
+
+# =============================================================================
+# TTS Engine Module
+# =============================================================================
+
+async def synthesize_speech(text: str, voice_id: str, output_path: str) -> None:
+    """Generate speech audio using edge-tts."""
+    communicate = edge_tts.Communicate(text, voice_id)
+    await communicate.save(output_path)
+
+
+def get_voice_for_lang(lang: str) -> str:
+    """Get default voice ID for a language code."""
+    return DEFAULT_VOICE_MAP.get(lang.lower(), "zh-TW-HsiaoChenNeural")
+
+
+# =============================================================================
+# TTS Worker Thread
+#
=============================================================================
+
+class TTSWorker(QThread):
+    """Worker thread for batch TTS processing."""
+
+    progress = pyqtSignal(int, int)  # current, total
+    log_message = pyqtSignal(str)
+    finished_batch = pyqtSignal(int, int)  # success_count, fail_count
+
+    def __init__(self, rows: list[ScriptRow], output_dir: str,
+                 selected_voice: Optional[str] = None):
+        super().__init__()
+        self.rows = rows
+        self.output_dir = output_dir
+        self.selected_voice = selected_voice
+        self._stop_flag = False
+
+    def stop(self):
+        """Request graceful stop after current file."""
+        self._stop_flag = True
+
+    def run(self):
+        """Execute batch TTS processing in worker thread."""
+        # Create output directory if needed
+        Path(self.output_dir).mkdir(parents=True, exist_ok=True)
+
+        # Create new event loop for this thread
+        loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(loop)
+
+        success_count = 0
+        fail_count = 0
+        total = len(self.rows)
+
+        try:
+            for i, row in enumerate(self.rows):
+                if self._stop_flag:
+                    self.log_message.emit(f"已停止處理 (完成 {i}/{total})")
+                    break
+
+                # Determine voice to use
+                if self.selected_voice:
+                    voice_id = self.selected_voice
+                else:
+                    voice_id = get_voice_for_lang(row.lang)
+
+                output_path = os.path.join(self.output_dir, f"{row.filename}.mp3")
+
+                self.log_message.emit(f"正在處理: {row.filename}")
+
+                try:
+                    # Run async TTS with retry
+                    loop.run_until_complete(
+                        self._synthesize_with_retry(row.text, voice_id, output_path)
+                    )
+                    self.log_message.emit(f"完成: {row.filename}")
+                    success_count += 1
+                except Exception as e:
+                    self.log_message.emit(f"錯誤: {row.filename} - {str(e)}")
+                    fail_count += 1
+
+                self.progress.emit(i + 1, total)
+
+                # Rate limit delay (0.5s between requests)
+                if not self._stop_flag and i < total - 1:
+                    loop.run_until_complete(asyncio.sleep(0.5))
+
+        finally:
+            loop.close()
+
+        self.finished_batch.emit(success_count, fail_count)
+
+    async def _synthesize_with_retry(self, text: str, voice_id:
str,
+                                     output_path: str, max_retries: int = 1):
+        """Synthesize with retry on network error."""
+        last_error = None
+        for attempt in range(max_retries + 1):
+            try:
+                await synthesize_speech(text, voice_id, output_path)
+                return
+            except Exception as e:
+                last_error = e
+                if attempt < max_retries:
+                    await asyncio.sleep(1)  # Wait before retry
+        raise last_error
+
+
+# =============================================================================
+# Main Window GUI
+# =============================================================================
+
+class MainWindow(QMainWindow):
+    def __init__(self):
+        super().__init__()
+        self.worker = None
+        self.init_ui()
+
+    def init_ui(self):
+        self.setWindowTitle("智慧簡報旁白生成系統 - Smart Slide Voiceover System")
+        self.setMinimumSize(700, 500)
+
+        central_widget = QWidget()
+        self.setCentralWidget(central_widget)
+        layout = QVBoxLayout(central_widget)
+
+        # File selection group
+        file_group = QGroupBox("檔案設定")
+        file_layout = QVBoxLayout(file_group)
+
+        # Excel file browser
+        excel_layout = QHBoxLayout()
+        excel_layout.addWidget(QLabel("Excel 講稿:"))
+        self.file_path_edit = QLineEdit()
+        self.file_path_edit.setReadOnly(True)
+        self.file_path_edit.setPlaceholderText("請選擇 .xlsx 檔案...")
+        excel_layout.addWidget(self.file_path_edit, 1)
+        self.browse_btn = QPushButton("瀏覽...")
+        self.browse_btn.clicked.connect(self.browse_file)
+        excel_layout.addWidget(self.browse_btn)
+        file_layout.addLayout(excel_layout)
+
+        # Output directory
+        output_layout = QHBoxLayout()
+        output_layout.addWidget(QLabel("輸出資料夾:"))
+        self.output_path_edit = QLineEdit()
+        self.output_path_edit.setPlaceholderText("預設: Excel 檔案所在目錄/output")
+        output_layout.addWidget(self.output_path_edit, 1)
+        self.output_browse_btn = QPushButton("瀏覽...")
+        self.output_browse_btn.clicked.connect(self.browse_output)
+        output_layout.addWidget(self.output_browse_btn)
+        file_layout.addLayout(output_layout)
+
+        layout.addWidget(file_group)
+
+        # Voice selection group
+        voice_group =
QGroupBox("語音設定")
+        voice_layout = QHBoxLayout(voice_group)
+        voice_layout.addWidget(QLabel("選擇語音:"))
+        self.voice_combo = QComboBox()
+        self.voice_combo.setMinimumWidth(350)
+        self._populate_voice_combo()
+        voice_layout.addWidget(self.voice_combo, 1)
+        layout.addWidget(voice_group)
+
+        # Control buttons
+        btn_layout = QHBoxLayout()
+        self.start_btn = QPushButton("開始")
+        self.start_btn.setMinimumHeight(40)
+        self.start_btn.clicked.connect(self.start_processing)
+        btn_layout.addWidget(self.start_btn)
+
+        self.stop_btn = QPushButton("停止")
+        self.stop_btn.setMinimumHeight(40)
+        self.stop_btn.setEnabled(False)
+        self.stop_btn.clicked.connect(self.stop_processing)
+        btn_layout.addWidget(self.stop_btn)
+        layout.addLayout(btn_layout)
+
+        # Progress bar
+        progress_layout = QHBoxLayout()
+        progress_layout.addWidget(QLabel("進度:"))
+        self.progress_bar = QProgressBar()
+        self.progress_bar.setValue(0)
+        progress_layout.addWidget(self.progress_bar, 1)
+        self.progress_label = QLabel("0/0")
+        progress_layout.addWidget(self.progress_label)
+        layout.addLayout(progress_layout)
+
+        # Log console
+        log_group = QGroupBox("處理日誌")
+        log_layout = QVBoxLayout(log_group)
+        self.log_console = QTextEdit()
+        self.log_console.setReadOnly(True)
+        self.log_console.setMinimumHeight(150)
+        log_layout.addWidget(self.log_console)
+        layout.addWidget(log_group, 1)
+
+    def _populate_voice_combo(self):
+        """Populate voice dropdown with grouped options."""
+        # Add "Auto" option first
+        self.voice_combo.addItem("自動 (依 Excel Lang 欄位決定)", None)
+
+        # Add voices grouped by language
+        for group_name, lang_codes in VOICE_GROUPS.items():
+            self.voice_combo.addItem(f"─── {group_name} ───", "separator")
+            # Make separator non-selectable
+            idx = self.voice_combo.count() - 1
+            self.voice_combo.model().item(idx).setEnabled(False)
+
+            for voice in VOICE_REGISTRY:
+                if voice.language in lang_codes:
+                    display = f" {voice.voice_id} ({voice.gender}) - {voice.bilingual} - {voice.description}"
+
self.voice_combo.addItem(display, voice.voice_id)
+
+    def browse_file(self):
+        """Open file dialog to select Excel file."""
+        file_path, _ = QFileDialog.getOpenFileName(
+            self, "選擇 Excel 講稿檔案", "",
+            "Excel Files (*.xlsx);;All Files (*)"
+        )
+        if file_path:
+            self.file_path_edit.setText(file_path)
+            # Auto-set output directory
+            if not self.output_path_edit.text():
+                default_output = os.path.join(os.path.dirname(file_path), "output")
+                self.output_path_edit.setText(default_output)
+
+    def browse_output(self):
+        """Open dialog to select output directory."""
+        dir_path = QFileDialog.getExistingDirectory(self, "選擇輸出資料夾")
+        if dir_path:
+            self.output_path_edit.setText(dir_path)
+
+    def start_processing(self):
+        """Start batch TTS processing."""
+        # Validate file selection
+        file_path = self.file_path_edit.text()
+        if not file_path:
+            QMessageBox.warning(self, "警告", "請先選擇 Excel 講稿檔案")
+            return
+
+        if not os.path.exists(file_path):
+            QMessageBox.warning(self, "警告", "選擇的檔案不存在")
+            return
+
+        # Get output directory
+        output_dir = self.output_path_edit.text()
+        if not output_dir:
+            output_dir = os.path.join(os.path.dirname(file_path), "output")
+            self.output_path_edit.setText(output_dir)
+
+        # Get selected voice
+        selected_voice = self.voice_combo.currentData()
+
+        # Load Excel
+        try:
+            rows = load_excel(file_path)
+            if not rows:
+                QMessageBox.warning(self, "警告", "Excel 檔案中沒有有效的資料")
+                return
+        except Exception as e:
+            QMessageBox.critical(self, "錯誤", f"載入 Excel 失敗:\n{str(e)}")
+            return
+
+        # Clear log and reset progress
+        self.log_console.clear()
+        self.progress_bar.setValue(0)
+        self.progress_label.setText(f"0/{len(rows)}")
+
+        # Update UI state
+        self.start_btn.setEnabled(False)
+        self.stop_btn.setEnabled(True)
+        self.browse_btn.setEnabled(False)
+        self.output_browse_btn.setEnabled(False)
+        self.voice_combo.setEnabled(False)
+
+        # Log start
+        self.log_console.append(f"開始處理 {len(rows)} 筆資料...")
+        if selected_voice:
+            self.log_console.append(f"使用語音: {selected_voice}")
+
else:
+            self.log_console.append("使用自動語音選擇 (依 Lang 欄位)")
+        self.log_console.append("")
+
+        # Create and start worker
+        self.worker = TTSWorker(rows, output_dir, selected_voice)
+        self.worker.progress.connect(self.on_progress)
+        self.worker.log_message.connect(self.on_log)
+        self.worker.finished_batch.connect(self.on_finished)
+        self.worker.start()
+
+    def stop_processing(self):
+        """Request stop of current processing."""
+        if self.worker:
+            self.worker.stop()
+            self.stop_btn.setEnabled(False)
+            self.log_console.append("\n正在停止...")
+
+    def on_progress(self, current: int, total: int):
+        """Update progress bar."""
+        percent = int((current / total) * 100) if total > 0 else 0
+        self.progress_bar.setValue(percent)
+        self.progress_label.setText(f"{current}/{total}")
+
+    def on_log(self, message: str):
+        """Append message to log console."""
+        self.log_console.append(message)
+        # Auto-scroll to bottom
+        scrollbar = self.log_console.verticalScrollBar()
+        scrollbar.setValue(scrollbar.maximum())
+
+    def on_finished(self, success_count: int, fail_count: int):
+        """Handle batch completion."""
+        # Reset UI state
+        self.start_btn.setEnabled(True)
+        self.stop_btn.setEnabled(False)
+        self.browse_btn.setEnabled(True)
+        self.output_browse_btn.setEnabled(True)
+        self.voice_combo.setEnabled(True)
+
+        # Show completion message
+        total = success_count + fail_count
+        self.log_console.append("")
+        self.log_console.append(f"===== 處理完成 =====")
+        self.log_console.append(f"成功: {success_count} / {total}")
+        if fail_count > 0:
+            self.log_console.append(f"失敗: {fail_count}")
+
+        # Show dialog
+        if fail_count == 0:
+            QMessageBox.information(
+                self, "完成",
+                f"所有 {success_count} 個音檔已成功生成!\n\n"
+                f"輸出位置: {self.output_path_edit.text()}"
+            )
+        else:
+            QMessageBox.warning(
+                self, "完成 (有錯誤)",
+                f"處理完成\n\n成功: {success_count}\n失敗: {fail_count}\n\n"
+                f"請查看日誌了解詳情。"
+            )
+
+        self.worker = None
+
+
+# =============================================================================
+# Application Entry Point
+#
=============================================================================
+
+def main():
+    app = QApplication(sys.argv)
+
+    # Set application style
+    app.setStyle("Fusion")
+
+    window = MainWindow()
+    window.show()
+
+    sys.exit(app.exec())
+
+
+if __name__ == "__main__":
+    main()
diff --git a/openspec/AGENTS.md b/openspec/AGENTS.md
new file mode 100644
index 0000000..96ab0bb
--- /dev/null
+++ b/openspec/AGENTS.md
@@ -0,0 +1,456 @@
+# OpenSpec Instructions
+
+Instructions for AI coding assistants using OpenSpec for spec-driven development.
+
+## TL;DR Quick Checklist
+
+- Search existing work: `openspec spec list --long`, `openspec list` (use `rg` only for full-text search)
+- Decide scope: new capability vs modify existing capability
+- Pick a unique `change-id`: kebab-case, verb-led (`add-`, `update-`, `remove-`, `refactor-`)
+- Scaffold: `proposal.md`, `tasks.md`, `design.md` (only if needed), and delta specs per affected capability
+- Write deltas: use `## ADDED|MODIFIED|REMOVED|RENAMED Requirements`; include at least one `#### Scenario:` per requirement
+- Validate: `openspec validate [change-id] --strict` and fix issues
+- Request approval: Do not start implementation until proposal is approved
+
+## Three-Stage Workflow
+
+### Stage 1: Creating Changes
+Create proposal when you need to:
+- Add features or functionality
+- Make breaking changes (API, schema)
+- Change architecture or patterns
+- Optimize performance (changes behavior)
+- Update security patterns
+
+Triggers (examples):
+- "Help me create a change proposal"
+- "Help me plan a change"
+- "Help me create a proposal"
+- "I want to create a spec proposal"
+- "I want to create a spec"
+
+Loose matching guidance:
+- Contains one of: `proposal`, `change`, `spec`
+- With one of: `create`, `plan`, `make`, `start`, `help`
+
+Skip proposal for:
+- Bug fixes (restore intended behavior)
+- Typos, formatting, comments
+- Dependency updates (non-breaking)
+- Configuration changes
+- Tests for existing
behavior
+
+**Workflow**
+1. Review `openspec/project.md`, `openspec list`, and `openspec list --specs` to understand current context.
+2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, optional `design.md`, and spec deltas under `openspec/changes//`.
+3. Draft spec deltas using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement.
+4. Run `openspec validate --strict` and resolve any issues before sharing the proposal.
+
+### Stage 2: Implementing Changes
+Track these steps as TODOs and complete them one by one.
+1. **Read proposal.md** - Understand what's being built
+2. **Read design.md** (if exists) - Review technical decisions
+3. **Read tasks.md** - Get implementation checklist
+4. **Implement tasks sequentially** - Complete in order
+5. **Confirm completion** - Ensure every item in `tasks.md` is finished before updating statuses
+6. **Update checklist** - After all work is done, set every task to `- [x]` so the list reflects reality
+7.
**Approval gate** - Do not start implementation until the proposal is reviewed and approved + +### Stage 3: Archiving Changes +After deployment, create separate PR to: +- Move `changes/[name]/` → `changes/archive/YYYY-MM-DD-[name]/` +- Update `specs/` if capabilities changed +- Use `openspec archive --skip-specs --yes` for tooling-only changes (always pass the change ID explicitly) +- Run `openspec validate --strict` to confirm the archived change passes checks + +## Before Any Task + +**Context Checklist:** +- [ ] Read relevant specs in `specs/[capability]/spec.md` +- [ ] Check pending changes in `changes/` for conflicts +- [ ] Read `openspec/project.md` for conventions +- [ ] Run `openspec list` to see active changes +- [ ] Run `openspec list --specs` to see existing capabilities + +**Before Creating Specs:** +- Always check if capability already exists +- Prefer modifying existing specs over creating duplicates +- Use `openspec show [spec]` to review current state +- If request is ambiguous, ask 1–2 clarifying questions before scaffolding + +### Search Guidance +- Enumerate specs: `openspec spec list --long` (or `--json` for scripts) +- Enumerate changes: `openspec list` (or `openspec change list --json` - deprecated but available) +- Show details: + - Spec: `openspec show --type spec` (use `--json` for filters) + - Change: `openspec show --json --deltas-only` +- Full-text search (use ripgrep): `rg -n "Requirement:|Scenario:" openspec/specs` + +## Quick Start + +### CLI Commands + +```bash +# Essential commands +openspec list # List active changes +openspec list --specs # List specifications +openspec show [item] # Display change or spec +openspec validate [item] # Validate changes or specs +openspec archive [--yes|-y] # Archive after deployment (add --yes for non-interactive runs) + +# Project management +openspec init [path] # Initialize OpenSpec +openspec update [path] # Update instruction files + +# Interactive mode +openspec show # Prompts for selection 
+openspec validate # Bulk validation mode + +# Debugging +openspec show [change] --json --deltas-only +openspec validate [change] --strict +``` + +### Command Flags + +- `--json` - Machine-readable output +- `--type change|spec` - Disambiguate items +- `--strict` - Comprehensive validation +- `--no-interactive` - Disable prompts +- `--skip-specs` - Archive without spec updates +- `--yes`/`-y` - Skip confirmation prompts (non-interactive archive) + +## Directory Structure + +``` +openspec/ +├── project.md # Project conventions +├── specs/ # Current truth - what IS built +│ └── [capability]/ # Single focused capability +│ ├── spec.md # Requirements and scenarios +│ └── design.md # Technical patterns +├── changes/ # Proposals - what SHOULD change +│ ├── [change-name]/ +│ │ ├── proposal.md # Why, what, impact +│ │ ├── tasks.md # Implementation checklist +│ │ ├── design.md # Technical decisions (optional; see criteria) +│ │ └── specs/ # Delta changes +│ │ └── [capability]/ +│ │ └── spec.md # ADDED/MODIFIED/REMOVED +│ └── archive/ # Completed changes +``` + +## Creating Change Proposals + +### Decision Tree + +``` +New request? +├─ Bug fix restoring spec behavior? → Fix directly +├─ Typo/format/comment? → Fix directly +├─ New feature/capability? → Create proposal +├─ Breaking change? → Create proposal +├─ Architecture change? → Create proposal +└─ Unclear? → Create proposal (safer) +``` + +### Proposal Structure + +1. **Create directory:** `changes/[change-id]/` (kebab-case, verb-led, unique) + +2. **Write proposal.md:** +```markdown +# Change: [Brief description of change] + +## Why +[1-2 sentences on problem/opportunity] + +## What Changes +- [Bullet list of changes] +- [Mark breaking changes with **BREAKING**] + +## Impact +- Affected specs: [list capabilities] +- Affected code: [key files/systems] +``` + +3. **Create spec deltas:** `specs/[capability]/spec.md` +```markdown +## ADDED Requirements +### Requirement: New Feature +The system SHALL provide... 
+ +#### Scenario: Success case +- **WHEN** user performs action +- **THEN** expected result + +## MODIFIED Requirements +### Requirement: Existing Feature +[Complete modified requirement] + +## REMOVED Requirements +### Requirement: Old Feature +**Reason**: [Why removing] +**Migration**: [How to handle] +``` +If multiple capabilities are affected, create multiple delta files under `changes/[change-id]/specs//spec.md`—one per capability. + +4. **Create tasks.md:** +```markdown +## 1. Implementation +- [ ] 1.1 Create database schema +- [ ] 1.2 Implement API endpoint +- [ ] 1.3 Add frontend component +- [ ] 1.4 Write tests +``` + +5. **Create design.md when needed:** +Create `design.md` if any of the following apply; otherwise omit it: +- Cross-cutting change (multiple services/modules) or a new architectural pattern +- New external dependency or significant data model changes +- Security, performance, or migration complexity +- Ambiguity that benefits from technical decisions before coding + +Minimal `design.md` skeleton: +```markdown +## Context +[Background, constraints, stakeholders] + +## Goals / Non-Goals +- Goals: [...] +- Non-Goals: [...] + +## Decisions +- Decision: [What and why] +- Alternatives considered: [Options + rationale] + +## Risks / Trade-offs +- [Risk] → Mitigation + +## Migration Plan +[Steps, rollback] + +## Open Questions +- [...] +``` + +## Spec File Format + +### Critical: Scenario Formatting + +**CORRECT** (use #### headers): +```markdown +#### Scenario: User login success +- **WHEN** valid credentials provided +- **THEN** return JWT token +``` + +**WRONG** (don't use bullets or bold): +```markdown +- **Scenario: User login** ❌ +**Scenario**: User login ❌ +### Scenario: User login ❌ +``` + +Every requirement MUST have at least one scenario. 
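The scenario-format rule above can be checked mechanically. The following is a minimal sketch, not the OpenSpec validator itself: the function name `requirements_missing_scenarios` and both regular expressions are assumptions made for illustration, and only the `### Requirement:` / `#### Scenario:` header shapes come from the text.

```python
import re

# Hypothetical patterns modelled on the header shapes described above.
REQUIREMENT_RE = re.compile(r"^###\s+Requirement:\s*(.+?)\s*$")
SCENARIO_RE = re.compile(r"^####\s+Scenario:\s*\S")


def requirements_missing_scenarios(spec_text: str) -> list[str]:
    """Return the names of requirements that have no `#### Scenario:` header."""
    missing: list[str] = []
    current = None
    has_scenario = False
    for line in spec_text.splitlines():
        match = REQUIREMENT_RE.match(line)
        if match:
            # Close out the previous requirement before starting a new one.
            if current is not None and not has_scenario:
                missing.append(current)
            current = match.group(1)
            has_scenario = False
        elif current is not None and SCENARIO_RE.match(line):
            has_scenario = True
    if current is not None and not has_scenario:
        missing.append(current)
    return missing
```

A bulleted or bold "scenario" line is not counted by this sketch, so a requirement written that way is reported as missing a scenario, which mirrors the failure mode the WRONG examples warn about.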
+ +### Requirement Wording +- Use SHALL/MUST for normative requirements (avoid should/may unless intentionally non-normative) + +### Delta Operations + +- `## ADDED Requirements` - New capabilities +- `## MODIFIED Requirements` - Changed behavior +- `## REMOVED Requirements` - Deprecated features +- `## RENAMED Requirements` - Name changes + +Headers matched with `trim(header)` - whitespace ignored. + +#### When to use ADDED vs MODIFIED +- ADDED: Introduces a new capability or sub-capability that can stand alone as a requirement. Prefer ADDED when the change is orthogonal (e.g., adding "Slash Command Configuration") rather than altering the semantics of an existing requirement. +- MODIFIED: Changes the behavior, scope, or acceptance criteria of an existing requirement. Always paste the full, updated requirement content (header + all scenarios). The archiver will replace the entire requirement with what you provide here; partial deltas will drop previous details. +- RENAMED: Use when only the name changes. If you also change behavior, use RENAMED (name) plus MODIFIED (content) referencing the new name. + +Common pitfall: Using MODIFIED to add a new concern without including the previous text. This causes loss of detail at archive time. If you aren’t explicitly changing the existing requirement, add a new requirement under ADDED instead. + +Authoring a MODIFIED requirement correctly: +1) Locate the existing requirement in `openspec/specs//spec.md`. +2) Copy the entire requirement block (from `### Requirement: ...` through its scenarios). +3) Paste it under `## MODIFIED Requirements` and edit to reflect the new behavior. +4) Ensure the header text matches exactly (whitespace-insensitive) and keep at least one `#### Scenario:`. 
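The full-replacement behavior behind the MODIFIED steps above can be illustrated with a toy model. This is a sketch under stated assumptions, not the archiver's real code: the spec is modelled as a dict from requirement header to full block text, and `apply_modified` is a hypothetical helper name.

```python
def apply_modified(requirements: dict[str, str], header: str, new_block: str) -> dict[str, str]:
    """Replace an entire requirement block, matching headers after trimming whitespace.

    Illustrative model only: the whole block is swapped, so anything missing
    from `new_block` (for example earlier scenarios) is lost.
    """
    wanted = header.strip()
    # Map trimmed header text back to the key actually stored in the spec.
    by_trimmed = {existing.strip(): existing for existing in requirements}
    if wanted not in by_trimmed:
        raise KeyError(f"No existing requirement matches {wanted!r}; consider ADDED instead")
    updated = dict(requirements)  # leave the caller's spec untouched
    updated[by_trimmed[wanted]] = new_block
    return updated
```

In this model, passing a block that contains only the new scenario silently drops the old ones, which is the pitfall described above; pasting the complete requirement first avoids it.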
+ +Example for RENAMED: +```markdown +## RENAMED Requirements +- FROM: `### Requirement: Login` +- TO: `### Requirement: User Authentication` +``` + +## Troubleshooting + +### Common Errors + +**"Change must have at least one delta"** +- Check `changes/[name]/specs/` exists with .md files +- Verify files have operation prefixes (## ADDED Requirements) + +**"Requirement must have at least one scenario"** +- Check scenarios use `#### Scenario:` format (4 hashtags) +- Don't use bullet points or bold for scenario headers + +**Silent scenario parsing failures** +- Exact format required: `#### Scenario: Name` +- Debug with: `openspec show [change] --json --deltas-only` + +### Validation Tips + +```bash +# Always use strict mode for comprehensive checks +openspec validate [change] --strict + +# Debug delta parsing +openspec show [change] --json | jq '.deltas' + +# Check specific requirement +openspec show [spec] --json -r 1 +``` + +## Happy Path Script + +```bash +# 1) Explore current state +openspec spec list --long +openspec list +# Optional full-text search: +# rg -n "Requirement:|Scenario:" openspec/specs +# rg -n "^#|Requirement:" openspec/changes + +# 2) Choose change id and scaffold +CHANGE=add-two-factor-auth +mkdir -p "openspec/changes/$CHANGE/specs/auth" +printf "## Why\n...\n\n## What Changes\n- ...\n\n## Impact\n- ...\n" > openspec/changes/$CHANGE/proposal.md +printf "## 1. Implementation\n- [ ] 1.1 ...\n" > openspec/changes/$CHANGE/tasks.md + +# 3) Add deltas (example) +cat > openspec/changes/$CHANGE/specs/auth/spec.md << 'EOF' +## ADDED Requirements +### Requirement: Two-Factor Authentication +Users MUST provide a second factor during login. 
+ +#### Scenario: OTP required +- **WHEN** valid credentials are provided +- **THEN** an OTP challenge is required +EOF + +# 4) Validate +openspec validate $CHANGE --strict +``` + +## Multi-Capability Example + +``` +openspec/changes/add-2fa-notify/ +├── proposal.md +├── tasks.md +└── specs/ + ├── auth/ + │ └── spec.md # ADDED: Two-Factor Authentication + └── notifications/ + └── spec.md # ADDED: OTP email notification +``` + +auth/spec.md +```markdown +## ADDED Requirements +### Requirement: Two-Factor Authentication +... +``` + +notifications/spec.md +```markdown +## ADDED Requirements +### Requirement: OTP Email Notification +... +``` + +## Best Practices + +### Simplicity First +- Default to <100 lines of new code +- Single-file implementations until proven insufficient +- Avoid frameworks without clear justification +- Choose boring, proven patterns + +### Complexity Triggers +Only add complexity with: +- Performance data showing current solution too slow +- Concrete scale requirements (>1000 users, >100MB data) +- Multiple proven use cases requiring abstraction + +### Clear References +- Use `file.ts:42` format for code locations +- Reference specs as `specs/auth/spec.md` +- Link related changes and PRs + +### Capability Naming +- Use verb-noun: `user-auth`, `payment-capture` +- Single purpose per capability +- 10-minute understandability rule +- Split if description needs "AND" + +### Change ID Naming +- Use kebab-case, short and descriptive: `add-two-factor-auth` +- Prefer verb-led prefixes: `add-`, `update-`, `remove-`, `refactor-` +- Ensure uniqueness; if taken, append `-2`, `-3`, etc. + +## Tool Selection Guide + +| Task | Tool | Why | +|------|------|-----| +| Find files by pattern | Glob | Fast pattern matching | +| Search code content | Grep | Optimized regex search | +| Read specific files | Read | Direct file access | +| Explore unknown scope | Task | Multi-step investigation | + +## Error Recovery + +### Change Conflicts +1. 
Run `openspec list` to see active changes +2. Check for overlapping specs +3. Coordinate with change owners +4. Consider combining proposals + +### Validation Failures +1. Run with `--strict` flag +2. Check JSON output for details +3. Verify spec file format +4. Ensure scenarios properly formatted + +### Missing Context +1. Read project.md first +2. Check related specs +3. Review recent archives +4. Ask for clarification + +## Quick Reference + +### Stage Indicators +- `changes/` - Proposed, not yet built +- `specs/` - Built and deployed +- `archive/` - Completed changes + +### File Purposes +- `proposal.md` - Why and what +- `tasks.md` - Implementation steps +- `design.md` - Technical decisions +- `spec.md` - Requirements and behavior + +### CLI Essentials +```bash +openspec list # What's in progress? +openspec show [item] # View details +openspec validate --strict # Is it correct? +openspec archive [--yes|-y] # Mark complete (add --yes for automation) +``` + +Remember: Specs are truth. Changes are proposals. Keep them in sync. 
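The change-ID naming guidance in this file (kebab-case, a preferred verb-led prefix, and a numeric suffix when the ID is taken) could be sketched as follows. The helper names and the regular expression are assumptions for illustration; the OpenSpec CLI may enforce these rules differently or not at all.

```python
import re

# Prefixes listed in the naming guidance; treated here as a preference check.
VERB_PREFIXES = ("add-", "update-", "remove-", "refactor-")
KEBAB_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def follows_naming_convention(change_id: str) -> bool:
    """True if the ID is kebab-case and starts with one of the preferred verbs."""
    return bool(KEBAB_RE.match(change_id)) and change_id.startswith(VERB_PREFIXES)


def unique_change_id(candidate: str, taken: set[str]) -> str:
    """Append -2, -3, ... until the ID does not collide with an existing one."""
    if candidate not in taken:
        return candidate
    suffix = 2
    while f"{candidate}-{suffix}" in taken:
        suffix += 1
    return f"{candidate}-{suffix}"
```

In practice the `taken` set would come from the output of `openspec list`; how that output is parsed is left out because its exact format is not shown here.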
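diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/design.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/design.md new file mode 100644 index 0000000..83bc289 --- /dev/null +++ b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/design.md @@ -0,0 +1,87 @@ +## Context +This project is a desktop TTS tool that must integrate a PyQt6 GUI with edge-tts asynchronous network requests. The main challenges are reconciling PyQt's event loop with asyncio and ensuring that long-running batch jobs do not block the UI. + +The target users are in-house presentation authors with limited technical background who need a simple, intuitive workflow. + +## Goals / Non-Goals +**Goals:** +- Batch conversion from Excel to MP3 +- Support mixed Chinese-English and Vietnamese-English speech +- Provide a responsive GUI with progress feedback +- Tolerate errors: a single failed file does not interrupt the batch + +**Non-Goals:** +- No real-time voice preview +- No voice editing features +- No video embedding + +## Decisions + +### Architecture pattern: MVC + Producer-Consumer +- **Decision**: Use MVC to separate concerns, with the worker thread acting as the producer of task results +- **Rationale**: PyQt6 supports this pattern natively, and it is easy to maintain and test + +### Threading model: QThread + Asyncio Event Loop +- **Decision**: Create a dedicated asyncio loop inside the worker thread +- **Alternatives**: + - `qasync` integration - extra dependency, maintenance risk + - Purely synchronous requests - poor performance, severe blocking +- **Why chosen**: Native solution, no extra dependencies, verified stable + +### TTS engine: edge-tts +- **Decision**: Use the edge-tts Python package +- **Rationale**: + - Free access to Azure Neural Voice + - No API key required + - Voice quality at SOTA level + +### Voice selection strategy +- **Decision**: Provide a dropdown so users can pick a voice; each option is annotated with its bilingual support +- **Default**: Keep the original mapping as the default options +- **Alternative**: Fixed mapping - simpler to operate but inflexible +- **Why chosen**: Users may need a male voice or a different style + +#### Available voices (with bilingual support annotations) + +| Language | Voice ID | Gender | Bilingual support | Notes | +|------|----------|------|----------|------| +| zh-TW | zh-TW-HsiaoChenNeural | F | Chinese-English mixed ✓ | Intellectual, professional (default) | +| zh-TW | zh-TW-HsiaoYuNeural | F | Chinese-English mixed ✓ | Lively, youthful | +| zh-TW | zh-TW-YunJheNeural | M | Chinese-English mixed ✓ | Mature, steady | +| zh-CN | zh-CN-XiaoxiaoNeural | F | Chinese-English mixed ✓ | Sweet, friendly | +| zh-CN | zh-CN-YunyangNeural | M | Chinese-English mixed ✓ | News-anchor style | +| vi-VN | vi-VN-HoaiMyNeural | F | Vietnamese-English mixed ✓ | Gentle, clear (default) | +| vi-VN | vi-VN-NamMinhNeural | M | Vietnamese-English mixed ✓ | Professional, composed | +| en-US | en-US-JennyNeural | F | English only | Standard American (default) | +| en-US | en-US-AriaNeural | F | English only | Natural conversational | +| en-US | en-US-GuyNeural | M | English only | Professional narration | + +#### GUI dropdown grouping +- **Chinese voices** (suited to mixed Chinese-English slides) +- **Vietnamese voices** (suited to mixed Vietnamese-English slides) +- **English voices** (suited to English-only slides) + +### Rate Limit strategy +- **Decision**: 0.5-second interval between requests +- 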
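**Rationale**: Prevents the IP from being blocked by Azure; testing showed this interval to be stable + +## Risks / Trade-offs + +### Network dependency risk +- **Risk**: edge-tts depends on a network connection and cannot be used offline +- **Mitigation**: State the network requirement clearly and show network error messages + +### API change risk +- **Risk**: Microsoft may change the Edge TTS API +- **Mitigation**: The edge-tts package is community-maintained and tracks updates + +### Single-threaded serialization +- **Risk**: Processing takes a long time with many files +- **Trade-off**: Stability is prioritized, avoiding rate limiting caused by concurrency + +## Migration Plan +N/A - brand-new project, no existing system to migrate + +## Resolved Questions +- ✅ Should custom voice selection be supported? → **Yes, provide a dropdown annotated with bilingual support** +- ✅ Should output format options be offered? → **No, MP3 only** diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/proposal.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/proposal.md new file mode 100644 index 0000000..bcc7067 --- /dev/null +++ b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/proposal.md @@ -0,0 +1,22 @@ +# Change: Add Smart Slide Voiceover System + +## Why +In-house presentations need professional narration, but traditional recording is time-consuming and inconsistent in quality. A tool is needed that batch-generates high-quality audio files from an Excel script, supports reading Chinese or Vietnamese mixed with English terminology, and lowers the barrier to producing presentations. + +## What Changes +- **Add Excel input module**: Parse .xlsx script files, supporting Filename/Text/Lang columns +- **Add TTS engine module**: Integrate edge-tts to call Azure Neural Voice for multilingual speech synthesis +- **Add PyQt6 GUI**: File selection, voice selection dropdown (with bilingual support annotations), progress monitoring, log display, and other interactive features +- **Add threading model**: QThread + Asyncio architecture keeps the UI responsive +- **Add error handling**: A single failed file does not interrupt the batch; Rate Limit protection + +## Impact +- Affected specs: + - `excel-input` (new) + - `tts-engine` (new) + - `gui-interface` (new) +- Affected code: + - `main.py` - Main entry point and GUI logic + - `environment.yml` - Conda environment configuration + - `README.md` - Usage instructions + - `template.xlsx` - Sample file diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/excel-input/spec.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/excel-input/spec.md new file mode 100644 index 0000000..6975bec --- /dev/null +++ b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/excel-input/spec.md @@ -0,0 +1,42 @@ +## ADDED Requirements + +### Requirement: Excel File Loading +The system SHALL support loading .xlsx Excel files as the script input source. + +#### Scenario: Valid Excel file selected +- **WHEN** the user selects a valid .xlsx file +- **THEN** the system parses the file contents and prepares for processing + 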
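+#### Scenario: Invalid file format rejected +- **WHEN** the user selects a non-.xlsx file +- **THEN** the system shows a format error warning and does not process the file + +### Requirement: Column Parsing +The system SHALL parse the standard column structure of the Excel file, comprising the three columns Filename, Text, and Lang. + +#### Scenario: Required columns present +- **WHEN** the Excel file contains the Filename and Text columns +- **THEN** the system successfully parses all data rows + +#### Scenario: Missing required column +- **WHEN** the Excel file lacks the Filename or Text column +- **THEN** the system shows a missing-column error and stops processing + +#### Scenario: Optional Lang column handling +- **WHEN** the Lang column is empty or absent +- **THEN** the system defaults to "zh" as the language setting + +### Requirement: Data Validation +The system SHALL validate every data row, ensuring required fields are non-empty. + +#### Scenario: Empty text field +- **WHEN** a row's Text field is empty +- **THEN** the system logs a warning and skips the row + +#### Scenario: Empty filename field +- **WHEN** a row's Filename field is empty +- **THEN** the system logs a warning and skips the row + +#### Scenario: Valid row processing +- **WHEN** both the Filename and Text fields have values +- **THEN** the system adds the row to the processing queue diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/gui-interface/spec.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/gui-interface/spec.md new file mode 100644 index 0000000..886da50 --- /dev/null +++ b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/gui-interface/spec.md @@ -0,0 +1,101 @@ +## ADDED Requirements + +### Requirement: File Selection +The system SHALL provide a file picker so the user can select an Excel script file. + +#### Scenario: Open file dialog +- **WHEN** the user clicks the file selection button +- **THEN** the system opens a file dialog filtered to .xlsx files + +#### Scenario: File path display +- **WHEN** the user has selected a file +- **THEN** the system displays the selected file path in the interface + +### Requirement: Voice Selection +The system SHALL provide a voice selection dropdown that lets the user choose a TTS voice and annotates each voice's bilingual support. + +#### Scenario: Voice dropdown display +- **WHEN** the interface has finished loading +- **THEN** the system displays the voice dropdown, grouped by language (Chinese/Vietnamese/English) + +#### Scenario: Voice option format +- **WHEN** the user expands the dropdown +- **THEN** each option is shown in the format "voice name (gender) - bilingual support note" + +#### Scenario: Default voice selection +- **WHEN** the user has not manually selected a voice +- **THEN** the system uses the default voice (Chinese: HsiaoChenNeural, Vietnamese: HoaiMyNeural, English: JennyNeural) + +#### Scenario: Voice selection persistence +- **WHEN** 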
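the user selects a voice and then starts processing +- **THEN** all scripts are synthesized with the selected voice + +### Requirement: Start Control +The system SHALL provide a "開始" (Start) button that launches the batch generation process. + +#### Scenario: Start without file +- **WHEN** the user clicks "開始" (Start) without selecting a file +- **THEN** the system shows a warning message and takes no action + +#### Scenario: Start with valid file +- **WHEN** the user has selected a valid file and clicks "開始" (Start) +- **THEN** the system starts processing and disables the "開始" (Start) button + +### Requirement: Stop Control +The system SHALL provide a "停止" (Stop) button that lets the user interrupt processing. + +#### Scenario: Stop during processing +- **WHEN** the user clicks "停止" (Stop) during processing +- **THEN** the system finishes the current file and then stops, without processing subsequent files + +#### Scenario: Stop button state +- **WHEN** the system is not processing +- **THEN** the "停止" (Stop) button is disabled + +### Requirement: Progress Display +The system SHALL display a progress bar reflecting batch progress. + +#### Scenario: Progress update +- **WHEN** each file finishes generating +- **THEN** the progress bar percentage is updated to (completed / total) * 100 + +#### Scenario: Progress reset +- **WHEN** a new batch starts +- **THEN** the progress bar resets to 0% + +### Requirement: Log Display +The system SHALL provide a log window that shows processing status in real time. + +#### Scenario: Processing log +- **WHEN** processing of a file begins +- **THEN** the log shows "正在處理: [Filename]" (Processing: [Filename]) + +#### Scenario: Success log +- **WHEN** a file is generated successfully +- **THEN** the log shows "完成: [Filename]" (Done: [Filename]) + +#### Scenario: Error log +- **WHEN** file generation fails +- **THEN** the log shows "錯誤: [Filename] - [錯誤訊息]" (Error: [Filename] - [error message]) + +### Requirement: UI Responsiveness +The system SHALL ensure the UI stays responsive during long-running processing. + +#### Scenario: Window interaction during processing +- **WHEN** batch processing is in progress +- **THEN** the user can drag, resize, and minimize the window without freezing + +#### Scenario: No UI blocking +- **WHEN** the TTS engine is making network requests +- **THEN** the main window's event loop keeps running and does not show "Not Responding" + +### Requirement: Completion Notification +The system SHALL notify the user when batch processing completes. + +#### Scenario: Batch complete +- **WHEN** all files have been processed +- **THEN** the system shows a completion dialog with success/failure counts + +#### Scenario: Partial completion +- **WHEN** batch processing is interrupted by the user +- **THEN** the system shows an interruption notice including the number completed diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/tts-engine/spec.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/tts-engine/spec.md new file mode 100644 index 0000000..364abf6 --- /dev/null +++ b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/specs/tts-engine/spec.md @@ -0,0 +1,68 @@ +## ADDED Requirements + +### 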
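Requirement: Voice Synthesis +The system SHALL use the edge-tts engine to convert text into audio files (MP3 format). + +#### Scenario: Successful voice generation +- **WHEN** valid text content and a language setting are provided +- **THEN** the system generates the corresponding MP3 file + +#### Scenario: Network error handling +- **WHEN** the network connection drops or times out +- **THEN** the system retries once and, if it still fails, logs the error and skips the item + +### Requirement: Voice Selection Support +The system SHALL synthesize with the voice the user selects in the GUI, and provide a default mapping based on the Lang column. + +#### Scenario: User selected voice +- **WHEN** the user selects a specific voice in the GUI +- **THEN** the system uses that voice to synthesize all scripts + +#### Scenario: Default voice by Lang column +- **WHEN** the user has not selected a voice and the Excel Lang column is "zh" or "zh-tw" +- **THEN** the system uses the zh-TW-HsiaoChenNeural voice + +#### Scenario: Vietnamese default voice +- **WHEN** the user has not selected a voice and the Excel Lang column is "vi" +- **THEN** the system uses the vi-VN-HoaiMyNeural voice + +#### Scenario: English default voice +- **WHEN** the user has not selected a voice and the Excel Lang column is "en" +- **THEN** the system uses the en-US-JennyNeural voice + +#### Scenario: Unknown language fallback +- **WHEN** the user has not selected a voice and the Lang column is an unknown value or blank +- **THEN** the system defaults to the zh-TW-HsiaoChenNeural voice + +### Requirement: Rate Limit Protection +The system SHALL add a delay between TTS requests to prevent the IP from being blocked. + +#### Scenario: Request throttling +- **WHEN** multiple TTS requests are sent in succession +- **THEN** the system waits at least 0.5 seconds between requests + +### Requirement: Batch Processing +The system SHALL support batch processing of multiple scripts, where a single failure does not interrupt the overall flow. + +#### Scenario: Batch with failures +- **WHEN** some files in the batch fail to generate +- **THEN** the system logs the failed items and continues with the remaining files + +#### Scenario: All files successful +- **WHEN** all files in the batch are generated successfully +- **THEN** the system shows a completion notice with the success count + +### Requirement: Output File Management +The system SHALL save generated audio files to the specified output path. + +#### Scenario: Output directory creation +- **WHEN** the output path does not exist +- **THEN** the system creates the directory automatically + +#### Scenario: File naming +- **WHEN** an audio file is generated +- **THEN** the file name is the Excel Filename column value plus the .mp3 extension + +#### Scenario: File overwrite +- **WHEN** a file with the same name already exists in the output path +- **THEN** the system overwrites the existing file diff --git a/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/tasks.md b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/tasks.md new file mode 100644 index 0000000..1cb45eb --- /dev/null +++ 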
b/openspec/changes/archive/2025-12-27-add-tts-voiceover-system/tasks.md @@ -0,0 +1,48 @@ +## 1. Environment Setup +- [x] 1.1 Create `environment.yml` with Python 3.10, PyQt6, edge-tts, pandas, openpyxl +- [x] 1.2 Test conda environment creation and activation + +## 2. Excel Input Module +- [x] 2.1 Implement Excel file loading with openpyxl/pandas +- [x] 2.2 Implement column parsing (Filename, Text, Lang) +- [x] 2.3 Add data validation for required fields +- [x] 2.4 Implement default language fallback (zh) + +## 3. TTS Engine Module +- [x] 3.1 Define voice registry with all available voices and bilingual support annotations +- [x] 3.2 Implement default voice mapping dictionary (zh/vi/en -> Voice ID) +- [x] 3.3 Implement edge-tts wrapper for voice synthesis with configurable voice +- [x] 3.4 Add rate limit delay (0.5s between requests) +- [x] 3.5 Implement network error handling with retry +- [x] 3.6 Add output directory creation logic + +## 4. Threading Model +- [x] 4.1 Create TTS Worker class extending QThread +- [x] 4.2 Implement asyncio event loop within worker thread +- [x] 4.3 Define pyqtSignal for progress, log, completion events +- [x] 4.4 Implement stop flag for graceful cancellation + +## 5. GUI Implementation +- [x] 5.1 Create main window with PyQt6 +- [x] 5.2 Add file browser widget with .xlsx filter +- [x] 5.3 Add voice selection dropdown (grouped by language, with bilingual annotations) +- [x] 5.4 Add Start/Stop buttons with state management +- [x] 5.5 Add progress bar widget +- [x] 5.6 Add log console (QTextEdit) with auto-scroll +- [x] 5.7 Connect signals to UI update slots +- [x] 5.8 Implement completion notification dialog + +## 6. Integration +- [x] 6.1 Wire Excel parser to TTS worker +- [x] 6.2 Pass selected voice from GUI to TTS worker +- [x] 6.3 Connect worker signals to GUI updates +- [x] 6.4 Implement start/stop button handlers + +## 7. 
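Testing & Documentation +- [x] 7.1 Create `template.xlsx` with sample data +- [x] 7.2 Test batch processing with 10+ items +- [x] 7.3 Test voice selection (verify different voices produce correct output) +- [x] 7.4 Test UI responsiveness during processing +- [x] 7.5 Test stop functionality mid-batch +- [x] 7.6 Test error handling (invalid file, network error) +- [x] 7.7 Write README.md with usage instructions (including voice selection guide) diff --git a/openspec/project.md b/openspec/project.md new file mode 100644 index 0000000..9c8af16 --- /dev/null +++ b/openspec/project.md @@ -0,0 +1,71 @@ +# Project Context + +## Purpose +Smart Slide Voiceover System (智慧簡報旁白生成系統) - a desktop application that helps users batch-convert presentation scripts written in Excel into professional-grade, lifelike, and approachable audio files (MP3). The system addresses pronunciation naturalness across languages (Chinese + English terminology, Vietnamese + English terminology). + +## Tech Stack +- **Language**: Python 3.10+ +- **GUI framework**: PyQt6 (natively compiled bindings, better performance than Electron) +- **TTS engine**: edge-tts (Python library that calls Azure Cognitive Services) +- **Data processing**: Pandas + Openpyxl (Excel reading and parsing) +- **Concurrency model**: QThread + Asyncio (resolves the conflict between the PyQt event loop and asynchronous requests) +- **Environment management**: Conda (environment.yml) + +## Project Conventions + +### Code Style +- Architecture pattern: MVC (Model-View-Controller) / Producer-Consumer Pattern +- The main thread handles UI rendering; the worker thread executes TTS requests +- Progress and log messages are sent back to the main thread via pyqtSignal +- Use type hints to improve code readability + +### Architecture Patterns +- **Main Thread (UI)**: Renders the interface, receives user actions, updates the progress bar +- **Worker Thread (QThread)**: + - Creates a dedicated asyncio event loop + - Parses the Excel file + - Runs TTS requests serially + - Reports status via signals + +### Testing Strategy +- Environment tests: Conda environment creation, dependency completeness +- Functional tests: input file validation, batch generation, forced interruption, UI responsiveness +- Voice quality acceptance: correct language mapping, fluency of mixed Chinese-English and Vietnamese-English speech, audio completeness + +### Git Workflow +- Develop on feature branches +- Commit messages are written in Chinese + +## Domain Context + +### Voice Mapping +| Lang column | Voice ID | Characteristics | +|-----------|----------|------| +| vi | vi-VN-HoaiMyNeural | Female; bright, gentle timbre; clear articulation | +| zh / zh-tw | zh-TW-HsiaoChenNeural | Female; standard Taiwanese accent; intellectual and professional | +| en | en-US-JennyNeural | Female; standard American accent | + +### Excel input format +| Column | Required | Description | +|------|--------|------| +| Filename | Required | Output audio file name (e.g. 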
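Slide_01) | +| Text | Required | Narration content; supports mixed Chinese-English and Vietnamese-English | +| Lang | Optional | Primary language (zh/vi); defaults to zh | + +## Important Constraints +- **Rate Limit protection**: Requests must be spaced apart to prevent the IP from being blocked +- **Performance**: Per-item audio generation latency must not exceed 3 seconds; batch queues of 100+ items are supported +- **Error tolerance**: A single failed file does not interrupt the batch; the error is logged and processing moves to the next item +- **UI responsiveness**: The main window must not freeze while long-running tasks execute +- **Hardware baseline**: PC (Windows), 32GB RAM, RTX 4060 (mainly uses CPU and network bandwidth) + +## External Dependencies +- **Microsoft Edge-TTS**: Azure Cognitive Services Neural Voice API +- **Network**: Stable 10Mbps+ connection +- **Operating system**: Windows 10/11 + +## Deliverables +- `main.py`: GUI and logic code +- `environment.yml`: Conda environment configuration +- `README.md`: User guide +- `template.xlsx`: Format sample file diff --git a/openspec/specs/excel-input/spec.md b/openspec/specs/excel-input/spec.md new file mode 100644 index 0000000..e29f40d --- /dev/null +++ b/openspec/specs/excel-input/spec.md @@ -0,0 +1,46 @@ +# excel-input Specification + +## Purpose +TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive. +## Requirements +### Requirement: Excel File Loading +The system SHALL support loading .xlsx Excel files as the script input source. + +#### Scenario: Valid Excel file selected +- **WHEN** the user selects a valid .xlsx file +- **THEN** the system parses the file contents and prepares for processing + +#### Scenario: Invalid file format rejected +- **WHEN** the user selects a non-.xlsx file +- **THEN** the system shows a format error warning and does not process the file + +### Requirement: Column Parsing +The system SHALL parse the standard column structure of the Excel file, comprising the three columns Filename, Text, and Lang. + +#### Scenario: Required columns present +- **WHEN** the Excel file contains the Filename and Text columns +- **THEN** the system successfully parses all data rows + +#### Scenario: Missing required column +- **WHEN** the Excel file lacks the Filename or Text column +- **THEN** the system shows a missing-column error and stops processing + +#### Scenario: Optional Lang column handling +- **WHEN** the Lang column is empty or absent +- **THEN** the system defaults to "zh" as the language setting + +### Requirement: Data Validation +The system SHALL validate every data row, ensuring required fields are non-empty. + +#### Scenario: Empty text field +- **WHEN** a row's Text field is empty +- **THEN** the system logs a warning and skips the row + +#### Scenario: Empty filename field +- **WHEN** a row's Filename field is empty +- **THEN** the system logs a warning and skips the row + +#### Scenario: Valid row processing +- **WHEN** both the Filename and Text fields have values +- **THEN** the system adds the row to the processing queue 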
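The excel-input requirements above (required columns, skipping rows with an empty field, defaulting Lang to "zh") can be sketched without any Excel library. This is a minimal illustration, not the code in `main.py`: rows are modelled as plain dicts, empty cells as `None` or blank strings, and `validate_sheet` is a hypothetical name.

```python
REQUIRED_COLUMNS = ("Filename", "Text")
DEFAULT_LANG = "zh"


def _clean(value: object) -> str:
    """Treat None and whitespace-only cells as empty (an assumption of this sketch)."""
    return "" if value is None else str(value).strip()


def validate_sheet(columns: list[str], rows: list[dict]) -> tuple[list[dict], list[str]]:
    """Return (processing queue, warnings); raise if a required column is absent."""
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        raise ValueError(f"Missing required column(s): {', '.join(missing)}")

    queue: list[dict] = []
    warnings: list[str] = []
    for number, row in enumerate(rows, start=1):
        filename = _clean(row.get("Filename"))
        text = _clean(row.get("Text"))
        if not filename or not text:
            empty_field = "Filename" if not filename else "Text"
            warnings.append(f"Row {number}: empty {empty_field}, skipped")
            continue
        # Lang is optional: fall back to the default when blank or absent.
        lang = _clean(row.get("Lang")) or DEFAULT_LANG
        queue.append({"Filename": filename, "Text": text, "Lang": lang})
    return queue, warnings
```

With pandas, empty cells usually arrive as NaN rather than `None`, so a real implementation would need an extra check for that case.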
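+ diff --git a/openspec/specs/gui-interface/spec.md b/openspec/specs/gui-interface/spec.md new file mode 100644 index 0000000..9583836 --- /dev/null +++ b/openspec/specs/gui-interface/spec.md @@ -0,0 +1,105 @@ +# gui-interface Specification + +## Purpose +TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive. +## Requirements +### Requirement: File Selection +The system SHALL provide a file picker so the user can select an Excel script file. + +#### Scenario: Open file dialog +- **WHEN** the user clicks the file selection button +- **THEN** the system opens a file dialog filtered to .xlsx files + +#### Scenario: File path display +- **WHEN** the user has selected a file +- **THEN** the system displays the selected file path in the interface + +### Requirement: Voice Selection +The system SHALL provide a voice selection dropdown that lets the user choose a TTS voice and annotates each voice's bilingual support. + +#### Scenario: Voice dropdown display +- **WHEN** the interface has finished loading +- **THEN** the system displays the voice dropdown, grouped by language (Chinese/Vietnamese/English) + +#### Scenario: Voice option format +- **WHEN** the user expands the dropdown +- **THEN** each option is shown in the format "voice name (gender) - bilingual support note" + +#### Scenario: Default voice selection +- **WHEN** the user has not manually selected a voice +- **THEN** the system uses the default voice (Chinese: HsiaoChenNeural, Vietnamese: HoaiMyNeural, English: JennyNeural) + +#### Scenario: Voice selection persistence +- **WHEN** the user selects a voice and then starts processing +- **THEN** all scripts are synthesized with the selected voice + +### Requirement: Start Control +The system SHALL provide a "開始" (Start) button that launches the batch generation process. + +#### Scenario: Start without file +- **WHEN** the user clicks "開始" (Start) without selecting a file +- **THEN** the system shows a warning message and takes no action + +#### Scenario: Start with valid file +- **WHEN** the user has selected a valid file and clicks "開始" (Start) +- **THEN** the system starts processing and disables the "開始" (Start) button + +### Requirement: Stop Control +The system SHALL provide a "停止" (Stop) button that lets the user interrupt processing. + +#### Scenario: Stop during processing +- **WHEN** the user clicks "停止" (Stop) during processing +- **THEN** the system finishes the current file and then stops, without processing subsequent files + +#### Scenario: Stop button state +- **WHEN** the system is not processing +- **THEN** the "停止" (Stop) button is disabled + +### Requirement: Progress Display +The system SHALL display a progress bar reflecting batch progress. + +#### Scenario: Progress update +- **WHEN** each file finishes generating +- **THEN** the progress bar percentage is updated to (completed / total) * 100 + +#### Scenario: Progress reset +- **WHEN** a new batch starts +- **THEN** the progress bar resets to 0% + +### Requirement: Log Display +The system SHALL provide a log window that shows processing status in real time. + +#### Scenario: Processing log +- **WHEN** processing of a file begins +- **THEN** 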
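the log shows "正在處理: [Filename]" (Processing: [Filename]) + +#### Scenario: Success log +- **WHEN** a file is generated successfully +- **THEN** the log shows "完成: [Filename]" (Done: [Filename]) + +#### Scenario: Error log +- **WHEN** file generation fails +- **THEN** the log shows "錯誤: [Filename] - [錯誤訊息]" (Error: [Filename] - [error message]) + +### Requirement: UI Responsiveness +The system SHALL ensure the UI stays responsive during long-running processing. + +#### Scenario: Window interaction during processing +- **WHEN** batch processing is in progress +- **THEN** the user can drag, resize, and minimize the window without freezing + +#### Scenario: No UI blocking +- **WHEN** the TTS engine is making network requests +- **THEN** the main window's event loop keeps running and does not show "Not Responding" + +### Requirement: Completion Notification +The system SHALL notify the user when batch processing completes. + +#### Scenario: Batch complete +- **WHEN** all files have been processed +- **THEN** the system shows a completion dialog with success/failure counts + +#### Scenario: Partial completion +- **WHEN** batch processing is interrupted by the user +- **THEN** the system shows an interruption notice including the number completed + diff --git a/openspec/specs/tts-engine/spec.md b/openspec/specs/tts-engine/spec.md new file mode 100644 index 0000000..d440e40 --- /dev/null +++ b/openspec/specs/tts-engine/spec.md @@ -0,0 +1,72 @@ +# tts-engine Specification + +## Purpose +TBD - created by archiving change add-tts-voiceover-system. Update Purpose after archive. 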
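+## Requirements +### Requirement: Voice Synthesis +The system SHALL use the edge-tts engine to convert text into audio files (MP3 format). + +#### Scenario: Successful voice generation +- **WHEN** valid text content and a language setting are provided +- **THEN** the system generates the corresponding MP3 file + +#### Scenario: Network error handling +- **WHEN** the network connection drops or times out +- **THEN** the system retries once and, if it still fails, logs the error and skips the item + +### Requirement: Voice Selection Support +The system SHALL synthesize with the voice the user selects in the GUI, and provide a default mapping based on the Lang column. + +#### Scenario: User selected voice +- **WHEN** the user selects a specific voice in the GUI +- **THEN** the system uses that voice to synthesize all scripts + +#### Scenario: Default voice by Lang column +- **WHEN** the user has not selected a voice and the Excel Lang column is "zh" or "zh-tw" +- **THEN** the system uses the zh-TW-HsiaoChenNeural voice + +#### Scenario: Vietnamese default voice +- **WHEN** the user has not selected a voice and the Excel Lang column is "vi" +- **THEN** the system uses the vi-VN-HoaiMyNeural voice + +#### Scenario: English default voice +- **WHEN** the user has not selected a voice and the Excel Lang column is "en" +- **THEN** the system uses the en-US-JennyNeural voice + +#### Scenario: Unknown language fallback +- **WHEN** the user has not selected a voice and the Lang column is an unknown value or blank +- **THEN** the system defaults to the zh-TW-HsiaoChenNeural voice + +### Requirement: Rate Limit Protection +The system SHALL add a delay between TTS requests to prevent the IP from being blocked. + +#### Scenario: Request throttling +- **WHEN** multiple TTS requests are sent in succession +- **THEN** the system waits at least 0.5 seconds between requests + +### Requirement: Batch Processing +The system SHALL support batch processing of multiple scripts, where a single failure does not interrupt the overall flow. + +#### Scenario: Batch with failures +- **WHEN** some files in the batch fail to generate +- **THEN** the system logs the failed items and continues with the remaining files + +#### Scenario: All files successful +- **WHEN** all files in the batch are generated successfully +- **THEN** the system shows a completion notice with the success count + +### Requirement: Output File Management +The system SHALL save generated audio files to the specified output path. + +#### Scenario: Output directory creation +- **WHEN** the output path does not exist +- **THEN** the system creates the directory automatically + +#### Scenario: File naming +- **WHEN** an audio file is generated +- **THEN** the file name is the Excel Filename column value plus the .mp3 extension + +#### Scenario: File overwrite +- **WHEN** a file with the same name already exists in the output path +- **THEN** the system overwrites the existing file + diff --git a/template.xlsx b/template.xlsx new file mode 100644 index 0000000..8e136ba Binary files /dev/null and b/template.xlsx differ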
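The voice-selection and file-naming scenarios in the tts-engine spec above can be sketched as two small pure functions. The voice IDs and the fallback come from the spec; the function names `resolve_voice` and `output_path`, and the case-insensitive handling of the Lang value, are assumptions of this sketch rather than the behavior of `main.py`.

```python
from pathlib import Path
from typing import Optional

# Default mapping taken from the spec's Lang-column scenarios.
DEFAULT_VOICES = {
    "zh": "zh-TW-HsiaoChenNeural",
    "zh-tw": "zh-TW-HsiaoChenNeural",
    "vi": "vi-VN-HoaiMyNeural",
    "en": "en-US-JennyNeural",
}
FALLBACK_VOICE = "zh-TW-HsiaoChenNeural"


def resolve_voice(selected_voice: Optional[str], lang: Optional[str]) -> str:
    """Prefer the GUI selection; otherwise map Lang to a default voice."""
    if selected_voice:
        return selected_voice
    # Assumption: Lang values are compared case-insensitively after trimming.
    key = (lang or "").strip().lower()
    return DEFAULT_VOICES.get(key, FALLBACK_VOICE)


def output_path(output_dir: str, filename: str) -> Path:
    """Build the target path: the Filename column value plus the .mp3 extension."""
    return Path(output_dir) / f"{filename}.mp3"
```

Keeping these rules free of GUI and network code makes them easy to unit-test separately from the worker thread that performs the actual synthesis.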