feat: Meeting Assistant MVP - Complete implementation

Enterprise Meeting Knowledge Management System with:

Backend (FastAPI):
- Authentication proxy with JWT (pj-auth-api integration)
- MySQL database with 4 tables (users, meetings, conclusions, actions)
- Meeting CRUD with system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
- Dify LLM integration for AI summarization
- Excel export with openpyxl
- 20 unit tests (all passing)

Client (Electron):
- Login page with company auth
- Meeting list with create/delete
- Meeting detail with real-time transcription
- Editable transcript textarea (single block, easy editing)
- AI summarization with conclusions/action items
- 5-second segment recording (efficient for long meetings)

Sidecar (Python):
- faster-whisper medium model with int8 quantization
- ONNX Runtime VAD (lightweight, ~20MB vs PyTorch ~2GB)
- Chinese punctuation processing
- OpenCC for Traditional Chinese conversion
- Anti-hallucination parameters
- Auto-cleanup of temp audio files

OpenSpec:
- add-meeting-assistant-mvp (47 tasks, archived)
- add-realtime-transcription (29 tasks, archived)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
egg
2025-12-10 20:17:44 +08:00
commit 8b6184ecc5
65 changed files with 10510 additions and 0 deletions


@@ -0,0 +1,23 @@
---
name: OpenSpec: Apply
description: Implement an approved OpenSpec change and keep tasks in sync.
category: OpenSpec
tags: [openspec, apply]
---
<!-- OPENSPEC:START -->
**Guardrails**
- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
- Keep changes tightly scoped to the requested outcome.
- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
**Steps**
Track these steps as TODOs and complete them one by one.
1. Read `changes/<id>/proposal.md`, `design.md` (if present), and `tasks.md` to confirm scope and acceptance criteria.
2. Work through tasks sequentially, keeping edits minimal and focused on the requested change.
3. Confirm completion before updating statuses—make sure every item in `tasks.md` is finished.
4. Update the checklist after all work is done so each task is marked `- [x]` and reflects reality.
5. Reference `openspec list` or `openspec show <item>` when additional context is required.
**Reference**
- Use `openspec show <id> --json --deltas-only` if you need additional context from the proposal while implementing.
<!-- OPENSPEC:END -->


@@ -0,0 +1,27 @@
---
name: OpenSpec: Archive
description: Archive a deployed OpenSpec change and update specs.
category: OpenSpec
tags: [openspec, archive]
---
<!-- OPENSPEC:START -->
**Guardrails**
- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
- Keep changes tightly scoped to the requested outcome.
- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
**Steps**
1. Determine the change ID to archive:
   - If this prompt already includes a specific change ID (for example inside a `<ChangeId>` block populated by slash-command arguments), use that value after trimming whitespace.
   - If the conversation references a change loosely (for example by title or summary), run `openspec list` to surface likely IDs, share the relevant candidates, and confirm which one the user intends.
   - Otherwise, review the conversation, run `openspec list`, and ask the user which change to archive; wait for a confirmed change ID before proceeding.
   - If you still cannot identify a single change ID, stop and tell the user you cannot archive anything yet.
2. Validate the change ID by running `openspec list` (or `openspec show <id>`) and stop if the change is missing, already archived, or otherwise not ready to archive.
3. Run `openspec archive <id> --yes` so the CLI moves the change and applies spec updates without prompts (use `--skip-specs` only for tooling-only work).
4. Review the command output to confirm the target specs were updated and the change landed in `changes/archive/`.
5. Validate with `openspec validate --strict` and inspect with `openspec show <id>` if anything looks off.
**Reference**
- Use `openspec list` to confirm change IDs before archiving.
- Inspect refreshed specs with `openspec list --specs` and address any validation issues before handing off.
<!-- OPENSPEC:END -->


@@ -0,0 +1,28 @@
---
name: OpenSpec: Proposal
description: Scaffold a new OpenSpec change and validate strictly.
category: OpenSpec
tags: [openspec, change]
---
<!-- OPENSPEC:START -->
**Guardrails**
- Favor straightforward, minimal implementations first and add complexity only when it is requested or clearly required.
- Keep changes tightly scoped to the requested outcome.
- Refer to `openspec/AGENTS.md` (located inside the `openspec/` directory—run `ls openspec` or `openspec update` if you don't see it) if you need additional OpenSpec conventions or clarifications.
- Identify any vague or ambiguous details and ask the necessary follow-up questions before editing files.
- Do not write any code during the proposal stage. Only create design documents (proposal.md, tasks.md, design.md, and spec deltas). Implementation happens in the apply stage after approval.
**Steps**
1. Review `openspec/project.md`, run `openspec list` and `openspec list --specs`, and inspect related code or docs (e.g., via `rg`/`ls`) to ground the proposal in current behaviour; note any gaps that require clarification.
2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, and `design.md` (when needed) under `openspec/changes/<id>/`.
3. Map the change into concrete capabilities or requirements, breaking multi-scope efforts into distinct spec deltas with clear relationships and sequencing.
4. Capture architectural reasoning in `design.md` when the solution spans multiple systems, introduces new patterns, or demands trade-off discussion before committing to specs.
5. Draft spec deltas in `changes/<id>/specs/<capability>/spec.md` (one folder per capability) using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement and cross-reference related capabilities when relevant.
6. Draft `tasks.md` as an ordered list of small, verifiable work items that deliver user-visible progress, include validation (tests, tooling), and highlight dependencies or parallelizable work.
7. Validate with `openspec validate <id> --strict` and resolve every issue before sharing the proposal.
**Reference**
- Use `openspec show <id> --json --deltas-only` or `openspec show <spec> --type spec` to inspect details when validation fails.
- Search existing requirements with `rg -n "Requirement:|Scenario:" openspec/specs` before writing new ones.
- Explore the codebase with `rg <keyword>`, `ls`, or direct file reads so proposals align with current implementation realities.
<!-- OPENSPEC:END -->

49
.gitignore vendored Normal file

@@ -0,0 +1,49 @@
# Python
__pycache__/
*.py[cod]
*$py.class
venv/
.venv/
*.egg-info/
dist/
build/
.pytest_cache/
# Node.js
node_modules/
npm-debug.log
yarn-error.log
# Electron
out/
# IDE
.idea/
.vscode/
*.swp
*.swo
*~
# Environment
.env
.env.local
*.local
# OS
.DS_Store
Thumbs.db
# Whisper models (large files)
*.pt
*.bin
*.onnx
# Temporary files
*.tmp
*.temp
*.webm
/tmp/
# Logs
*.log
logs/

18
AGENTS.md Normal file

@@ -0,0 +1,18 @@
<!-- OPENSPEC:START -->
# OpenSpec Instructions
These instructions are for AI assistants working in this project.
Always open `@/openspec/AGENTS.md` when the request:
- Mentions planning or proposals (words like proposal, spec, change, plan)
- Introduces new capabilities, breaking changes, architecture shifts, or big performance/security work
- Sounds ambiguous and you need the authoritative spec before coding
Use `@/openspec/AGENTS.md` to learn:
- How to create and apply change proposals
- Spec format and conventions
- Project structure and guidelines
Keep this managed block so 'openspec update' can refresh the instructions.
<!-- OPENSPEC:END -->

18
CLAUDE.md Normal file

@@ -0,0 +1,18 @@
<!-- OPENSPEC:START -->
# OpenSpec Instructions
These instructions are for AI assistants working in this project.
Always open `@/openspec/AGENTS.md` when the request:
- Mentions planning or proposals (words like proposal, spec, change, plan)
- Introduces new capabilities, breaking changes, architecture shifts, or big performance/security work
- Sounds ambiguous and you need the authoritative spec before coding
Use `@/openspec/AGENTS.md` to learn:
- How to create and apply change proposals
- Spec format and conventions
- Project structure and guidelines
Keep this managed block so 'openspec update' can refresh the instructions.
<!-- OPENSPEC:END -->

177
DEPLOYMENT.md Normal file

@@ -0,0 +1,177 @@
# Meeting Assistant Deployment Guide
## Prerequisites
- Python 3.10+
- Node.js 18+
- MySQL 8.0+
- Access to Dify LLM service
## Backend Deployment
### 1. Setup Environment
```bash
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or: venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
```
### 2. Configure Environment Variables
```bash
# Copy example and edit
cp .env.example .env
# Edit .env with actual values:
# - DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME
# - AUTH_API_URL
# - DIFY_API_URL, DIFY_API_KEY
# - ADMIN_EMAIL
# - JWT_SECRET (generate a secure random string)
```
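One way to produce the random string for `JWT_SECRET` is Python's standard `secrets` module. The snippet below is only an illustrative sketch and is not part of the repository; the helper name `generate_jwt_secret` and the 32-byte length are assumptions.

```python
import secrets


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string usable as an HMAC signing key."""
    # token_urlsafe reads from the OS CSPRNG and Base64-encodes the bytes.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into .env as JWT_SECRET=<value>.
    print(generate_jwt_secret())
```

Generate a separate value for each deployment so that a token signed in one environment is not accepted in another.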
### 3. Run Server
```bash
# Development
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# Production
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
```
### 4. Verify Deployment
```bash
curl http://localhost:8000/api/health
# Should return: {"status":"healthy","service":"meeting-assistant"}
```
## Electron Client Deployment
### 1. Setup
```bash
cd client
# Install dependencies
npm install
```
### 2. Development
```bash
npm start
```
### 3. Build for Distribution
```bash
# Build portable executable
npm run build
```
The executable will be in `client/dist/`.
## Transcription Sidecar
### 1. Setup
```bash
cd sidecar
# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install pyinstaller
```
### 2. Download Whisper Model
The model will be downloaded automatically on first run. For faster startup, pre-download:
```python
from faster_whisper import WhisperModel
model = WhisperModel("medium", device="cpu", compute_type="int8")
```
### 3. Build Executable
```bash
python build.py
```
The executable will be in `sidecar/dist/transcriber`.
### 4. Package with Electron
Copy `sidecar/dist/` to `client/sidecar/` before building Electron app.
## Database Setup
The backend will automatically create tables on first startup. To manually verify:
```sql
USE db_A060;
SHOW TABLES LIKE 'meeting_%';
```
Expected tables:
- `meeting_users`
- `meeting_records`
- `meeting_conclusions`
- `meeting_action_items`
## Testing
### Backend Tests
```bash
cd backend
pytest tests/ -v
```
### Performance Verification
On target hardware (i5/8GB):
1. Start the Electron app
2. Record 1 minute of audio
3. Verify transcription completes within acceptable time
4. Test AI summarization with the transcript
## Troubleshooting
### Database Connection Issues
1. Verify MySQL is accessible from server
2. Check firewall rules for port 33306
3. Verify credentials in .env
### Dify API Issues
1. Verify API key is valid
2. Check Dify service status
3. Review timeout settings for long transcripts
### Transcription Issues
1. Verify microphone permissions
2. Check sidecar executable runs standalone
3. Review audio format (16kHz, 16-bit, mono)
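To review the audio format, a short check with Python's standard `wave` module may help. This is a sketch under the assumption that the segment has already been converted to a WAV file; the function name `check_wav_format` is hypothetical and does not exist in the sidecar.

```python
import wave


def check_wav_format(path: str) -> list[str]:
    """Return a list of mismatches against 16 kHz, 16-bit, mono PCM."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000")
        if wav.getsampwidth() != 2:  # 2 bytes per sample means 16-bit audio
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
    return problems
```

An empty list means the file matches the expected format; any entries describe what to fix before retrying transcription.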
## Security Notes
- Never commit `.env` files
- Keep JWT_SECRET secure and unique per deployment
- Ensure HTTPS in production
- Regular security updates for dependencies

61
PRD.md Normal file

@@ -0,0 +1,61 @@
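1. Product Overview
This system is an enterprise-grade meeting knowledge management solution. The front end uses Electron for edge computing (offline speech transcription), and the back end integrates the company's existing Auth API, MySQL database, and Dify LLM service. It aims to reduce the time spent writing meeting minutes and to support follow-up tracking through structured data.
2. Functional Requirements
2.1 Authentication
FR-Auth-01 Login mechanism:
Authenticate through the company API (https://pj-auth-api.vercel.app/api/auth/login).
Support short-lived tokens; the client must implement auto-refresh logic to keep the session alive during long meetings.
FR-Auth-02 Permission management:
Default administrator account: ymirliu@panjit.com.tw (can view all meetings and manage Excel templates).
2.2 Meeting Creation and Metadata Input
FR-Meta-01 Required fields:
Because the AI cannot infer some information on its own, the system must provide the following manual input fields on the "Create Meeting" or "Meeting Info" page:
Subject
Date/Time
Chairperson
Location
Recorder (defaults to the logged-in user)
Attendees
2.3 Core Transcription and Editing
FR-Core-01 Edge transcription: Run the faster-whisper (int8) model locally on an i5/8GB laptop, and apply OpenCC to force Traditional Chinese output.
FR-Core-02 Real-time correction: Provide a two-column interface, with the AI transcript on the left and a structured notes area on the right.
2.4 AI Summarization (LLM Integration)
FR-LLM-01 Dify integration:
Connect to https://dify.theaken.com/v1.
Send the transcript to Dify and request structured data containing the following:
Conclusions
Action Items: the content, owner, and expected completion date must be parsed out.
FR-LLM-02 Data completion: If the AI cannot identify the owner or the date, the UI must let the user fill them in manually.
2.5 Database and Tracking
FR-DB-01 Data isolation: All tables must use the meeting_ prefix.
FR-DB-02 Item numbering: The system must automatically generate a unique ID for every conclusion and action item so that execution status can be tracked later.
2.6 Report Export
FR-Export-01 Excel generation:
The back end generates the Excel file from a template.
The file must include every field defined in FR-Meta-01 and FR-LLM-01.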

143
SDD.md Normal file

@@ -0,0 +1,143 @@
1. System Architecture
Plaintext
[Client: Electron App]
    |
    |-- (1. Auth API) --> [Ext: PJ-Auth API (Vercel)]
    |
    |-- (2. Meeting Data) --> [Middleware Server (Python FastAPI)]
                                  |
                                  |-- (3. SQL Query) --> [DB: MySQL (Shared)]
                                  |
                                  |-- (4. Summarize) --> [Ext: Dify LLM]
Note: For security, the database connection details and the Dify API key must never be bundled into the Electron client; they must live on the middleware server.
2. Database Schema
Host: mysql.theaken.com (Port 33306)
User/Pass: A060 / WLeSCi0yhtc7
DB Name: db_A060
Prefix: meeting_
SQL
-- 1. Users table (mirrors the Auth API; used as a local cache)
CREATE TABLE meeting_users (
    user_id INT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(100) UNIQUE NOT NULL, -- matches e.g. ymirliu@panjit.com.tw
    display_name VARCHAR(50),
    role ENUM('admin', 'user') DEFAULT 'user',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- 2. Meetings main table
CREATE TABLE meeting_records (
    meeting_id INT PRIMARY KEY AUTO_INCREMENT,
    uuid VARCHAR(64) UNIQUE, -- system-wide unique identifier
    subject VARCHAR(200) NOT NULL, -- meeting subject
    meeting_time DATETIME NOT NULL, -- meeting time
    location VARCHAR(100), -- meeting location
    chairperson VARCHAR(50), -- chairperson
    recorder VARCHAR(50), -- recorder
    attendees TEXT, -- attendees (comma-separated or JSON)
    transcript_blob LONGTEXT, -- raw AI transcript
    created_by VARCHAR(100), -- creator's email
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- 3. Conclusions table
CREATE TABLE meeting_conclusions (
    conclusion_id INT PRIMARY KEY AUTO_INCREMENT,
    meeting_id INT,
    content TEXT,
    system_code VARCHAR(20), -- conclusion code (e.g. C-20251210-01)
    FOREIGN KEY (meeting_id) REFERENCES meeting_records(meeting_id)
);
-- 4. Action item tracking table
CREATE TABLE meeting_action_items (
    action_id INT PRIMARY KEY AUTO_INCREMENT,
    meeting_id INT,
    content TEXT, -- action item content
    owner VARCHAR(50), -- owner
    due_date DATE, -- expected completion date
    status ENUM('Open', 'In Progress', 'Done', 'Delayed') DEFAULT 'Open', -- execution status
    system_code VARCHAR(20), -- action item code (e.g. A-20251210-01)
    FOREIGN KEY (meeting_id) REFERENCES meeting_records(meeting_id)
);
3. Middleware Server Configuration (FastAPI Example)
The client does not connect to MySQL directly; it calls this middleware instead.
3.1 Environment Variables (.env)
Ini, TOML
DB_HOST=mysql.theaken.com
DB_PORT=33306
DB_USER=A060
DB_PASS=WLeSCi0yhtc7
DB_NAME=db_A060
AUTH_API_URL=https://pj-auth-api.vercel.app/api/auth/login
DIFY_API_URL=https://dify.theaken.com/v1
DIFY_API_KEY=app-xxxxxxxxxxx # obtain from the Dify console
ADMIN_EMAIL=ymirliu@panjit.com.tw
3.2 API Specification
A. Login Proxy
Endpoint: POST /api/login
Logic: The middleware forwards the request to pj-auth-api.vercel.app. On success, if the email is ymirliu@panjit.com.tw, it sets { "role": "admin" } in the returned JWT payload.
B. Upload/Sync Meeting
Endpoint: POST /api/meetings
Payload:
JSON
{
"meta": { "subject": "...", "chairperson": "...", ... },
"transcript": "...",
"conclusions": [ { "content": "..." } ],
"actions": [ { "content": "...", "owner": "...", "due_date": "..." } ]
}
Logic:
Insert into meeting_records.
Loop insert meeting_conclusions (auto-generated ID: C-{YYYYMMDD}-{Seq}).
Loop insert meeting_action_items (auto-generated ID: A-{YYYYMMDD}-{Seq}).
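The code format above can be sketched as a small formatter. This is a minimal illustration, not the repository's implementation: the function name `make_system_code` is hypothetical, and how the sequence number is chosen (for example, by counting the codes already issued that day) is left to the caller.

```python
from datetime import date


def make_system_code(prefix: str, day: date, seq: int) -> str:
    """Format a tracking code such as C-20251210-01."""
    if prefix not in ("C", "A"):
        raise ValueError("prefix must be 'C' (conclusion) or 'A' (action item)")
    if seq < 1:
        raise ValueError("seq starts at 1")
    # Zero-pad the sequence to two digits; larger values simply grow wider.
    return f"{prefix}-{day:%Y%m%d}-{seq:02d}"
```

The result stays within the VARCHAR(20) `system_code` column even when the sequence exceeds two digits.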
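C. Dify Summarization Request
Endpoint: POST /api/ai/summarize
Payload: { "transcript": "..." }
Logic: Call the Dify API.
Dify prompt configuration (System):
Plaintext
You are a meeting-minutes assistant. Based on the transcript, return the result in JSON format.
Required fields:
1. conclusions (Array): conclusion content
2. action_items (Array): { content, owner, due_date }
If the transcript does not mention a date or an owner, leave that field as an empty string.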
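Because the prompt asks for empty strings when a date or owner is unknown, the server side has to turn those strings into proper null values before storing a DATE. The sketch below shows one way to do that with the standard library only. The function names are hypothetical, and the ISO `YYYY-MM-DD` date format is an assumption, since the prompt does not specify one.

```python
import json
from datetime import date
from typing import Optional


def parse_due_date(raw: object) -> Optional[date]:
    """Convert an ISO date string to a date; empty or invalid input becomes None."""
    if not isinstance(raw, str) or not raw.strip():
        return None
    try:
        return date.fromisoformat(raw.strip())
    except ValueError:
        return None


def normalize_summary(answer: str) -> dict:
    """Parse the model's JSON answer into conclusions and cleaned action items."""
    data = json.loads(answer)
    actions = []
    for item in data.get("action_items", []):
        actions.append(
            {
                "content": item.get("content", ""),
                "owner": item.get("owner", ""),
                "due_date": parse_due_date(item.get("due_date")),
            }
        )
    return {"conclusions": data.get("conclusions", []), "action_items": actions}
```

Items whose `due_date` comes back as `None` are the ones the UI should offer for manual completion.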
D. Excel Export
Endpoint: POST /api/meetings/{id}/export
Logic:
Query records, conclusions, and action_items with a SQL join.
Load template.xlsx.
Replace Placeholders:
{{subject}}, {{time}}, {{chair}}...
Table Filling: Dynamically insert rows to fill in the conclusions and action items.
Return File Stream.
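The placeholder step can be sketched as plain string substitution. This is an illustrative sketch only: the helper name `fill_placeholders` is hypothetical, and leaving unknown placeholders untouched is a design assumption rather than documented behaviour. Applying it to every string cell of the loaded template is not shown here.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(text: str, values: dict) -> str:
    """Replace {{name}} markers with values; unknown names are left untouched."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER.sub(substitute, text)
```

Leaving unknown markers in place makes a missing field visible in the exported file instead of silently producing an empty cell.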

46
TDD.md Normal file

@@ -0,0 +1,46 @@
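1. Unit Tests (Middleware)
Test-DB-Connect:
Attempt to connect to mysql.theaken.com:33306.
Verify that the meeting_-prefixed tables exist; if they do not, run the CREATE TABLE initialization script.
Verify that ymirliu@panjit.com.tw is recognized as an administrator.
Test-Dify-Proxy:
Send mock text to /api/ai/summarize.
Verify that the server correctly parses the JSON returned by Dify and handles possible Dify timeouts or 500 errors.
2. Integration Tests (Client-Server)
Test-Auth-Flow:
Client enters credentials -> Middleware -> Vercel Auth API.
Verify that, once the token is obtained, the client can access /api/meetings.
Important: Simulate token expiry (manually invalidate the token) and verify that the client interceptor triggers a retry.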
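The retry behaviour under test can be modelled in a few lines for a test harness. This is a sketch, not the Electron client's interceptor: `request_with_refresh`, `send`, and `refresh` are hypothetical names, and the response shape assumes the middleware reports expiry as a 401 with `{"detail": {"error": "token_expired"}}`.

```python
from typing import Callable, Tuple

# A request function: takes a token, returns (status_code, json_body).
SendFn = Callable[[str], Tuple[int, dict]]


def request_with_refresh(
    send: SendFn, token: str, refresh: Callable[[], str]
) -> Tuple[int, dict]:
    """Send once; on a 401 token_expired response, refresh the token and retry once."""
    status, body = send(token)
    detail = body.get("detail")
    expired = isinstance(detail, dict) and detail.get("error") == "token_expired"
    if status == 401 and expired:
        # Retry exactly once so a broken refresh cannot loop forever.
        status, body = send(refresh())
    return status, body
```

Other 401 responses, such as an invalid token, are returned unchanged so the client can send the user back to the login page.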
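Test-Full-Cycle:
Create: Fill in the form (chairperson, location, ...).
Record: Simulate one minute of speech input.
Summarize: Click "AI Summary" and confirm that the data returned by Dify populates the fields in the right-hand column.
Complete: Manually edit the "Owner" field.
Save: Check that MySQL correctly stores the rows in meeting_action_items with status defaulting to 'Open'.
Export: Download the Excel file and check that every field, including the manually completed owner, is displayed correctly.
3. Deployment Checklist
[ ] The middleware server's requirements.txt includes mysql-connector-python, fastapi, requests, and openpyxl.
[ ] The middleware server's environment variables (.env) are set and kept secret.
[ ] The client's electron-builder configuration sets target: portable.
[ ] The client's Python sidecar includes faster-whisper and opencc and has been packaged with PyInstaller.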

15
backend/.env.example Normal file

@@ -0,0 +1,15 @@
# Database Configuration
DB_HOST=mysql.theaken.com
DB_PORT=33306
DB_USER=A060
DB_PASS=your_password_here
DB_NAME=db_A060
# External APIs
AUTH_API_URL=https://pj-auth-api.vercel.app/api/auth/login
DIFY_API_URL=https://dify.theaken.com/v1
DIFY_API_KEY=app-xxxxxxxxxxx
# Application Settings
ADMIN_EMAIL=ymirliu@panjit.com.tw
JWT_SECRET=your_jwt_secret_here

1
backend/app/__init__.py Normal file

@@ -0,0 +1 @@
# Meeting Assistant Backend

24
backend/app/config.py Normal file

@@ -0,0 +1,24 @@
import os

from dotenv import load_dotenv

load_dotenv()


class Settings:
    DB_HOST: str = os.getenv("DB_HOST", "mysql.theaken.com")
    DB_PORT: int = int(os.getenv("DB_PORT", "33306"))
    DB_USER: str = os.getenv("DB_USER", "A060")
    DB_PASS: str = os.getenv("DB_PASS", "")
    DB_NAME: str = os.getenv("DB_NAME", "db_A060")
    AUTH_API_URL: str = os.getenv(
        "AUTH_API_URL", "https://pj-auth-api.vercel.app/api/auth/login"
    )
    DIFY_API_URL: str = os.getenv("DIFY_API_URL", "https://dify.theaken.com/v1")
    DIFY_API_KEY: str = os.getenv("DIFY_API_KEY", "")
    ADMIN_EMAIL: str = os.getenv("ADMIN_EMAIL", "ymirliu@panjit.com.tw")
    JWT_SECRET: str = os.getenv("JWT_SECRET", "meeting-assistant-secret")


settings = Settings()

96
backend/app/database.py Normal file

@@ -0,0 +1,96 @@
import mysql.connector
from mysql.connector import pooling
from contextlib import contextmanager

from .config import settings

connection_pool = None


def init_db_pool():
    global connection_pool
    connection_pool = pooling.MySQLConnectionPool(
        pool_name="meeting_pool",
        pool_size=5,
        host=settings.DB_HOST,
        port=settings.DB_PORT,
        user=settings.DB_USER,
        password=settings.DB_PASS,
        database=settings.DB_NAME,
    )
    return connection_pool


@contextmanager
def get_db_connection():
    conn = connection_pool.get_connection()
    try:
        yield conn
    finally:
        conn.close()


@contextmanager
def get_db_cursor(commit=False):
    with get_db_connection() as conn:
        cursor = conn.cursor(dictionary=True)
        try:
            yield cursor
            if commit:
                conn.commit()
        finally:
            cursor.close()


def init_tables():
    """Create all required tables if they don't exist."""
    create_statements = [
        """
        CREATE TABLE IF NOT EXISTS meeting_users (
            user_id INT PRIMARY KEY AUTO_INCREMENT,
            email VARCHAR(100) UNIQUE NOT NULL,
            display_name VARCHAR(50),
            role ENUM('admin', 'user') DEFAULT 'user',
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """,
        """
        CREATE TABLE IF NOT EXISTS meeting_records (
            meeting_id INT PRIMARY KEY AUTO_INCREMENT,
            uuid VARCHAR(64) UNIQUE,
            subject VARCHAR(200) NOT NULL,
            meeting_time DATETIME NOT NULL,
            location VARCHAR(100),
            chairperson VARCHAR(50),
            recorder VARCHAR(50),
            attendees TEXT,
            transcript_blob LONGTEXT,
            created_by VARCHAR(100),
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """,
        """
        CREATE TABLE IF NOT EXISTS meeting_conclusions (
            conclusion_id INT PRIMARY KEY AUTO_INCREMENT,
            meeting_id INT,
            content TEXT,
            system_code VARCHAR(20),
            FOREIGN KEY (meeting_id) REFERENCES meeting_records(meeting_id) ON DELETE CASCADE
        )
        """,
        """
        CREATE TABLE IF NOT EXISTS meeting_action_items (
            action_id INT PRIMARY KEY AUTO_INCREMENT,
            meeting_id INT,
            content TEXT,
            owner VARCHAR(50),
            due_date DATE,
            status ENUM('Open', 'In Progress', 'Done', 'Delayed') DEFAULT 'Open',
            system_code VARCHAR(20),
            FOREIGN KEY (meeting_id) REFERENCES meeting_records(meeting_id) ON DELETE CASCADE
        )
        """,
    ]
    with get_db_cursor(commit=True) as cursor:
        for statement in create_statements:
            cursor.execute(statement)

44
backend/app/main.py Normal file

@@ -0,0 +1,44 @@
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager

from .database import init_db_pool, init_tables
from .routers import auth, meetings, ai, export


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    init_db_pool()
    init_tables()
    yield
    # Shutdown (cleanup if needed)


app = FastAPI(
    title="Meeting Assistant API",
    description="Enterprise meeting knowledge management API",
    version="1.0.0",
    lifespan=lifespan,
)

# CORS configuration for Electron client
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Include routers
app.include_router(auth.router, prefix="/api", tags=["Authentication"])
app.include_router(meetings.router, prefix="/api", tags=["Meetings"])
app.include_router(ai.router, prefix="/api", tags=["AI"])
app.include_router(export.router, prefix="/api", tags=["Export"])


@app.get("/api/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "service": "meeting-assistant"}


@@ -0,0 +1,37 @@
from .schemas import (
    LoginRequest,
    LoginResponse,
    TokenPayload,
    MeetingCreate,
    MeetingUpdate,
    MeetingResponse,
    MeetingListResponse,
    ConclusionCreate,
    ConclusionResponse,
    ActionItemCreate,
    ActionItemUpdate,
    ActionItemResponse,
    SummarizeRequest,
    SummarizeResponse,
    ActionItemStatus,
    UserRole,
)

__all__ = [
    "LoginRequest",
    "LoginResponse",
    "TokenPayload",
    "MeetingCreate",
    "MeetingUpdate",
    "MeetingResponse",
    "MeetingListResponse",
    "ConclusionCreate",
    "ConclusionResponse",
    "ActionItemCreate",
    "ActionItemUpdate",
    "ActionItemResponse",
    "SummarizeRequest",
    "SummarizeResponse",
    "ActionItemStatus",
    "UserRole",
]


@@ -0,0 +1,128 @@
from pydantic import BaseModel
from typing import Optional, List
from datetime import datetime, date
from enum import Enum


class ActionItemStatus(str, Enum):
    OPEN = "Open"
    IN_PROGRESS = "In Progress"
    DONE = "Done"
    DELAYED = "Delayed"


class UserRole(str, Enum):
    ADMIN = "admin"
    USER = "user"


# Auth schemas
class LoginRequest(BaseModel):
    email: str
    password: str


class LoginResponse(BaseModel):
    token: str
    email: str
    role: str


class TokenPayload(BaseModel):
    email: str
    role: str
    exp: Optional[int] = None


# Meeting schemas
class ConclusionCreate(BaseModel):
    content: str


class ConclusionResponse(BaseModel):
    conclusion_id: int
    meeting_id: int
    content: str
    system_code: Optional[str] = None


class ActionItemCreate(BaseModel):
    content: str
    owner: Optional[str] = ""
    due_date: Optional[date] = None


class ActionItemUpdate(BaseModel):
    content: Optional[str] = None
    owner: Optional[str] = None
    due_date: Optional[date] = None
    status: Optional[ActionItemStatus] = None


class ActionItemResponse(BaseModel):
    action_id: int
    meeting_id: int
    content: str
    owner: Optional[str] = None
    due_date: Optional[date] = None
    status: ActionItemStatus
    system_code: Optional[str] = None


class MeetingCreate(BaseModel):
    subject: str
    meeting_time: datetime
    location: Optional[str] = ""
    chairperson: Optional[str] = ""
    recorder: Optional[str] = ""
    attendees: Optional[str] = ""
    transcript_blob: Optional[str] = ""
    conclusions: Optional[List[ConclusionCreate]] = []
    actions: Optional[List[ActionItemCreate]] = []


class MeetingUpdate(BaseModel):
    subject: Optional[str] = None
    meeting_time: Optional[datetime] = None
    location: Optional[str] = None
    chairperson: Optional[str] = None
    recorder: Optional[str] = None
    attendees: Optional[str] = None
    transcript_blob: Optional[str] = None
    conclusions: Optional[List[ConclusionCreate]] = None
    actions: Optional[List[ActionItemCreate]] = None


class MeetingResponse(BaseModel):
    meeting_id: int
    uuid: str
    subject: str
    meeting_time: datetime
    location: Optional[str] = None
    chairperson: Optional[str] = None
    recorder: Optional[str] = None
    attendees: Optional[str] = None
    transcript_blob: Optional[str] = None
    created_by: Optional[str] = None
    created_at: datetime
    conclusions: List[ConclusionResponse] = []
    actions: List[ActionItemResponse] = []


class MeetingListResponse(BaseModel):
    meeting_id: int
    uuid: str
    subject: str
    meeting_time: datetime
    chairperson: Optional[str] = None
    created_at: datetime


# AI schemas
class SummarizeRequest(BaseModel):
    transcript: str


class SummarizeResponse(BaseModel):
    conclusions: List[str]
    action_items: List[ActionItemCreate]


@@ -0,0 +1 @@
# Router modules

102
backend/app/routers/ai.py Normal file

@@ -0,0 +1,102 @@
from fastapi import APIRouter, HTTPException, Depends
import httpx
import json

from ..config import settings
from ..models import SummarizeRequest, SummarizeResponse, ActionItemCreate, TokenPayload
from .auth import get_current_user

router = APIRouter()


@router.post("/ai/summarize", response_model=SummarizeResponse)
async def summarize_transcript(
    request: SummarizeRequest, current_user: TokenPayload = Depends(get_current_user)
):
    """
    Send transcript to Dify for AI summarization.
    Returns structured conclusions and action items.
    """
    if not settings.DIFY_API_KEY:
        raise HTTPException(status_code=503, detail="Dify API not configured")

    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(
                f"{settings.DIFY_API_URL}/chat-messages",
                headers={
                    "Authorization": f"Bearer {settings.DIFY_API_KEY}",
                    "Content-Type": "application/json",
                },
                json={
                    "inputs": {},
                    "query": request.transcript,
                    "response_mode": "blocking",
                    "user": current_user.email,
                },
                timeout=120.0,  # Long timeout for LLM processing
            )

            if response.status_code != 200:
                raise HTTPException(
                    status_code=response.status_code,
                    detail=f"Dify API error: {response.text}",
                )

            data = response.json()
            answer = data.get("answer", "")

            # Try to parse structured JSON from Dify response
            parsed = parse_dify_response(answer)

            return SummarizeResponse(
                conclusions=parsed["conclusions"],
                action_items=[
                    ActionItemCreate(
                        content=item.get("content", ""),
                        owner=item.get("owner", ""),
                        # The prompt asks for "" when no date is known; map it
                        # to None so Optional[date] validation does not fail.
                        due_date=item.get("due_date") or None,
                    )
                    for item in parsed["action_items"]
                ],
            )

        except httpx.TimeoutException:
            raise HTTPException(
                status_code=504, detail="Dify API timeout - transcript may be too long"
            )
        except httpx.RequestError as e:
            raise HTTPException(status_code=503, detail=f"Dify API unavailable: {str(e)}")


def parse_dify_response(answer: str) -> dict:
    """
    Parse Dify response to extract conclusions and action items.
    Attempts JSON parsing first, then falls back to text parsing.
    """
    # Try to find JSON in the response
    try:
        # Look for JSON block
        if "```json" in answer:
            json_start = answer.index("```json") + 7
            json_end = answer.index("```", json_start)
            json_str = answer[json_start:json_end].strip()
        elif "{" in answer and "}" in answer:
            # Try to find JSON object
            json_start = answer.index("{")
            json_end = answer.rindex("}") + 1
            json_str = answer[json_start:json_end]
        else:
            raise ValueError("No JSON found")

        data = json.loads(json_str)
        return {
            "conclusions": data.get("conclusions", []),
            "action_items": data.get("action_items", []),
        }
    except (ValueError, json.JSONDecodeError):
        # Fallback: return raw answer as single conclusion
        return {
            "conclusions": [answer] if answer else [],
            "action_items": [],
        }

109
backend/app/routers/auth.py Normal file

@@ -0,0 +1,109 @@
from fastapi import APIRouter, HTTPException, Depends, Header
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
import httpx
from jose import jwt, JWTError
from datetime import datetime, timedelta
from typing import Optional

from ..config import settings
from ..models import LoginRequest, LoginResponse, TokenPayload

router = APIRouter()
security = HTTPBearer()


def create_token(email: str, role: str) -> str:
    """Create a JWT token with email and role."""
    payload = {
        "email": email,
        "role": role,
        "exp": datetime.utcnow() + timedelta(hours=24),
    }
    return jwt.encode(payload, settings.JWT_SECRET, algorithm="HS256")


def decode_token(token: str) -> TokenPayload:
    """Decode and validate a JWT token."""
    try:
        payload = jwt.decode(token, settings.JWT_SECRET, algorithms=["HS256"])
        return TokenPayload(**payload)
    except JWTError:
        raise HTTPException(status_code=401, detail="Invalid token")


async def get_current_user(
    credentials: HTTPAuthorizationCredentials = Depends(security),
) -> TokenPayload:
    """Dependency to get current authenticated user."""
    token = credentials.credentials
    try:
        payload = jwt.decode(token, settings.JWT_SECRET, algorithms=["HS256"])
        return TokenPayload(**payload)
    except jwt.ExpiredSignatureError:
        raise HTTPException(
            status_code=401,
            detail={"error": "token_expired", "message": "Token has expired"},
        )
    except JWTError:
        raise HTTPException(
            status_code=401,
            detail={"error": "invalid_token", "message": "Invalid token"},
        )


def is_admin(user: TokenPayload) -> bool:
    """Check if user has admin role."""
    return user.role == "admin"


@router.post("/login", response_model=LoginResponse)
async def login(request: LoginRequest):
    """
    Proxy login to company Auth API.
    Adds admin role for ymirliu@panjit.com.tw.
    """
    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(
                settings.AUTH_API_URL,
                json={"username": request.email, "password": request.password},
                timeout=30.0,
            )

            if response.status_code == 401:
                raise HTTPException(status_code=401, detail="Invalid credentials")

            if response.status_code != 200:
                raise HTTPException(
                    status_code=response.status_code,
                    detail="Authentication service error",
                )

            # Parse response from external Auth API
            auth_data = response.json()

            # Check if authentication was successful
            if not auth_data.get("success"):
                error_msg = auth_data.get("error", "Authentication failed")
                raise HTTPException(status_code=401, detail=error_msg)

            # Determine role
            role = "admin" if request.email == settings.ADMIN_EMAIL else "user"

            # Create our own token with role info
            token = create_token(request.email, role)

            return LoginResponse(token=token, email=request.email, role=role)

        except httpx.TimeoutException:
            raise HTTPException(status_code=504, detail="Authentication service timeout")
        except httpx.RequestError:
            raise HTTPException(
                status_code=503, detail="Authentication service unavailable"
            )


@router.get("/me")
async def get_me(current_user: TokenPayload = Depends(get_current_user)):
    """Get current user information."""
    return {"email": current_user.email, "role": current_user.role}


@@ -0,0 +1,177 @@
from fastapi import APIRouter, HTTPException, Depends
from fastapi.responses import StreamingResponse
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font, Alignment, Border, Side
import io
import os

from ..database import get_db_cursor
from ..models import TokenPayload
from .auth import get_current_user, is_admin

router = APIRouter()

TEMPLATE_DIR = os.path.join(os.path.dirname(__file__), "..", "templates")


def create_default_workbook(meeting: dict, conclusions: list, actions: list) -> Workbook:
    """Create Excel workbook with meeting data."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Meeting Record"

    # Styles
    header_font = Font(bold=True, size=14)
    label_font = Font(bold=True)
    thin_border = Border(
        left=Side(style="thin"),
        right=Side(style="thin"),
        top=Side(style="thin"),
        bottom=Side(style="thin"),
    )

    # Title
    ws.merge_cells("A1:F1")
    ws["A1"] = "Meeting Record"
    ws["A1"].font = Font(bold=True, size=16)
    ws["A1"].alignment = Alignment(horizontal="center")

    # Metadata section
    row = 3
    metadata = [
        ("Subject", meeting.get("subject", "")),
        ("Date/Time", str(meeting.get("meeting_time", ""))),
        ("Location", meeting.get("location", "")),
        ("Chairperson", meeting.get("chairperson", "")),
        ("Recorder", meeting.get("recorder", "")),
        ("Attendees", meeting.get("attendees", "")),
    ]
    for label, value in metadata:
        ws[f"A{row}"] = label
        ws[f"A{row}"].font = label_font
        ws.merge_cells(f"B{row}:F{row}")
        ws[f"B{row}"] = value
        row += 1

    # Conclusions section
    row += 1
    ws.merge_cells(f"A{row}:F{row}")
    ws[f"A{row}"] = "Conclusions"
    ws[f"A{row}"].font = header_font
    row += 1
    ws[f"A{row}"] = "Code"
    ws[f"B{row}"] = "Content"
    ws[f"A{row}"].font = label_font
    ws[f"B{row}"].font = label_font
    row += 1
    for c in conclusions:
        ws[f"A{row}"] = c.get("system_code", "")
        ws.merge_cells(f"B{row}:F{row}")
        ws[f"B{row}"] = c.get("content", "")
        row += 1

    # Action Items section
    row += 1
    ws.merge_cells(f"A{row}:F{row}")
    ws[f"A{row}"] = "Action Items"
    ws[f"A{row}"].font = header_font
    row += 1
    headers = ["Code", "Content", "Owner", "Due Date", "Status"]
    for col, header in enumerate(headers, 1):
        cell = ws.cell(row=row, column=col, value=header)
        cell.font = label_font
        cell.border = thin_border
    row += 1
    for a in actions:
        ws.cell(row=row, column=1, value=a.get("system_code", "")).border = thin_border
        ws.cell(row=row, column=2, value=a.get("content", "")).border = thin_border
ws.cell(row=row, column=3, value=a.get("owner", "")).border = thin_border
ws.cell(row=row, column=4, value=str(a.get("due_date", "") or "")).border = thin_border
ws.cell(row=row, column=5, value=a.get("status", "")).border = thin_border
row += 1
# Adjust column widths
ws.column_dimensions["A"].width = 18
ws.column_dimensions["B"].width = 40
ws.column_dimensions["C"].width = 15
ws.column_dimensions["D"].width = 12
ws.column_dimensions["E"].width = 12
ws.column_dimensions["F"].width = 12
return wb
@router.get("/meetings/{meeting_id}/export")
async def export_meeting(
meeting_id: int, current_user: TokenPayload = Depends(get_current_user)
):
"""Export meeting to Excel file."""
with get_db_cursor() as cursor:
cursor.execute(
"SELECT * FROM meeting_records WHERE meeting_id = %s", (meeting_id,)
)
meeting = cursor.fetchone()
if not meeting:
raise HTTPException(status_code=404, detail="Meeting not found")
# Check access
if not is_admin(current_user):
if (
meeting["created_by"] != current_user.email
and meeting["recorder"] != current_user.email
and current_user.email not in (meeting["attendees"] or "")
):
raise HTTPException(status_code=403, detail="Access denied")
# Get conclusions
cursor.execute(
"SELECT * FROM meeting_conclusions WHERE meeting_id = %s", (meeting_id,)
)
conclusions = cursor.fetchall()
# Get action items
cursor.execute(
"SELECT * FROM meeting_action_items WHERE meeting_id = %s", (meeting_id,)
)
actions = cursor.fetchall()
# Check for custom template
template_path = os.path.join(TEMPLATE_DIR, "template.xlsx")
if os.path.exists(template_path):
wb = load_workbook(template_path)
ws = wb.active
# Replace placeholders
for row in ws.iter_rows():
for cell in row:
if cell.value and isinstance(cell.value, str):
# NULL columns come back as None, which str.replace() rejects, so fall back to ""
cell.value = (
cell.value.replace("{{subject}}", meeting.get("subject") or "")
.replace("{{time}}", str(meeting.get("meeting_time") or ""))
.replace("{{location}}", meeting.get("location") or "")
.replace("{{chair}}", meeting.get("chairperson") or "")
.replace("{{recorder}}", meeting.get("recorder") or "")
.replace("{{attendees}}", meeting.get("attendees") or "")
)
else:
# Use default template
wb = create_default_workbook(meeting, conclusions, actions)
# Save to bytes buffer
buffer = io.BytesIO()
wb.save(buffer)
buffer.seek(0)
filename = f"meeting_{meeting.get('uuid', meeting_id)}.xlsx"
return StreamingResponse(
buffer,
media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
headers={"Content-Disposition": f'attachment; filename="{filename}"'},
)
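
The access check in export_meeting (and the matching one in get_meeting) uses `in` against the raw attendees string, which is a substring test: an address such as ob@test.com would match a meeting attended by bob@test.com. The following is a minimal sketch of an exact-match alternative, assuming attendees are stored as a comma-separated string as in the sample_meeting fixture. The names parse_attendees and can_access are hypothetical and are not part of this commit.

```python
from typing import Optional, Set


def parse_attendees(raw: Optional[str]) -> Set[str]:
    """Split a comma-separated attendee string into a set of normalized addresses."""
    return {part.strip().lower() for part in (raw or "").split(",") if part.strip()}


def can_access(meeting: dict, email: str, admin: bool = False) -> bool:
    """Hypothetical helper: admins, the creator, the recorder, and listed attendees may read."""
    if admin:
        return True
    who = email.strip().lower()
    if not who:
        return False
    # Columns may be NULL in the database, so treat None as an empty string
    creator = (meeting.get("created_by") or "").strip().lower()
    recorder = (meeting.get("recorder") or "").strip().lower()
    return who in {creator, recorder} or who in parse_attendees(meeting.get("attendees"))


meeting = {
    "created_by": "owner@test.com",
    "recorder": "Jane Smith",
    "attendees": "alice@test.com, bob@test.com",
}
print(can_access(meeting, "bob@test.com"))  # listed attendee
print(can_access(meeting, "ob@test.com"))   # substring of an attendee, not an attendee
```

The SQL filter in list_meetings uses LIKE on the same column, so the two checks would need to change together to stay consistent.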


@@ -0,0 +1,372 @@
from fastapi import APIRouter, HTTPException, Depends
from typing import List
import uuid
from datetime import date
from ..database import get_db_cursor
from ..models import (
MeetingCreate,
MeetingUpdate,
MeetingResponse,
MeetingListResponse,
ConclusionResponse,
ActionItemResponse,
ActionItemUpdate,
TokenPayload,
)
from .auth import get_current_user, is_admin
router = APIRouter()
def generate_system_code(prefix: str, meeting_date: date, sequence: int) -> str:
"""Generate system code like C-20251210-01 or A-20251210-01."""
date_str = meeting_date.strftime("%Y%m%d")
return f"{prefix}-{date_str}-{sequence:02d}"
def get_next_sequence(cursor, prefix: str, date_str: str) -> int:
"""Get next sequence number for a given prefix and date."""
pattern = f"{prefix}-{date_str}-%"
cursor.execute(
"""
SELECT system_code FROM meeting_conclusions WHERE system_code LIKE %s
UNION
SELECT system_code FROM meeting_action_items WHERE system_code LIKE %s
ORDER BY system_code DESC LIMIT 1
""",
(pattern, pattern),
)
result = cursor.fetchone()
if result:
last_code = result["system_code"]
last_seq = int(last_code.split("-")[-1])
return last_seq + 1
return 1
@router.post("/meetings", response_model=MeetingResponse)
async def create_meeting(
meeting: MeetingCreate, current_user: TokenPayload = Depends(get_current_user)
):
"""Create a new meeting with optional conclusions and action items."""
meeting_uuid = str(uuid.uuid4())
recorder = meeting.recorder or current_user.email
meeting_date = meeting.meeting_time.date()
date_str = meeting_date.strftime("%Y%m%d")
with get_db_cursor(commit=True) as cursor:
# Insert meeting record
cursor.execute(
"""
INSERT INTO meeting_records
(uuid, subject, meeting_time, location, chairperson, recorder, attendees, transcript_blob, created_by)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)
""",
(
meeting_uuid,
meeting.subject,
meeting.meeting_time,
meeting.location,
meeting.chairperson,
recorder,
meeting.attendees,
meeting.transcript_blob,
current_user.email,
),
)
meeting_id = cursor.lastrowid
# Insert conclusions
conclusions = []
seq = get_next_sequence(cursor, "C", date_str)
for conclusion in meeting.conclusions or []:
system_code = generate_system_code("C", meeting_date, seq)
cursor.execute(
"""
INSERT INTO meeting_conclusions (meeting_id, content, system_code)
VALUES (%s, %s, %s)
""",
(meeting_id, conclusion.content, system_code),
)
conclusions.append(
ConclusionResponse(
conclusion_id=cursor.lastrowid,
meeting_id=meeting_id,
content=conclusion.content,
system_code=system_code,
)
)
seq += 1
# Insert action items
actions = []
seq = get_next_sequence(cursor, "A", date_str)
for action in meeting.actions or []:
system_code = generate_system_code("A", meeting_date, seq)
cursor.execute(
"""
INSERT INTO meeting_action_items (meeting_id, content, owner, due_date, system_code)
VALUES (%s, %s, %s, %s, %s)
""",
(meeting_id, action.content, action.owner, action.due_date, system_code),
)
actions.append(
ActionItemResponse(
action_id=cursor.lastrowid,
meeting_id=meeting_id,
content=action.content,
owner=action.owner,
due_date=action.due_date,
status="Open",
system_code=system_code,
)
)
seq += 1
# Fetch created meeting
cursor.execute(
"SELECT * FROM meeting_records WHERE meeting_id = %s", (meeting_id,)
)
record = cursor.fetchone()
return MeetingResponse(
meeting_id=record["meeting_id"],
uuid=record["uuid"],
subject=record["subject"],
meeting_time=record["meeting_time"],
location=record["location"],
chairperson=record["chairperson"],
recorder=record["recorder"],
attendees=record["attendees"],
transcript_blob=record["transcript_blob"],
created_by=record["created_by"],
created_at=record["created_at"],
conclusions=conclusions,
actions=actions,
)
@router.get("/meetings", response_model=List[MeetingListResponse])
async def list_meetings(current_user: TokenPayload = Depends(get_current_user)):
"""List meetings. Admins see all; other users see meetings they created, recorded, or attended."""
with get_db_cursor() as cursor:
if is_admin(current_user):
cursor.execute(
"""
SELECT meeting_id, uuid, subject, meeting_time, chairperson, created_at
FROM meeting_records ORDER BY meeting_time DESC
"""
)
else:
cursor.execute(
"""
SELECT meeting_id, uuid, subject, meeting_time, chairperson, created_at
FROM meeting_records
WHERE created_by = %s OR recorder = %s OR attendees LIKE %s
ORDER BY meeting_time DESC
""",
(
current_user.email,
current_user.email,
f"%{current_user.email}%",
),
)
records = cursor.fetchall()
return [MeetingListResponse(**record) for record in records]
@router.get("/meetings/{meeting_id}", response_model=MeetingResponse)
async def get_meeting(
meeting_id: int, current_user: TokenPayload = Depends(get_current_user)
):
"""Get meeting details with conclusions and action items."""
with get_db_cursor() as cursor:
cursor.execute(
"SELECT * FROM meeting_records WHERE meeting_id = %s", (meeting_id,)
)
record = cursor.fetchone()
if not record:
raise HTTPException(status_code=404, detail="Meeting not found")
# Check access
if not is_admin(current_user):
if (
record["created_by"] != current_user.email
and record["recorder"] != current_user.email
and current_user.email not in (record["attendees"] or "")
):
raise HTTPException(status_code=403, detail="Access denied")
# Get conclusions
cursor.execute(
"SELECT * FROM meeting_conclusions WHERE meeting_id = %s", (meeting_id,)
)
conclusions = [ConclusionResponse(**c) for c in cursor.fetchall()]
# Get action items
cursor.execute(
"SELECT * FROM meeting_action_items WHERE meeting_id = %s", (meeting_id,)
)
actions = [ActionItemResponse(**a) for a in cursor.fetchall()]
return MeetingResponse(
meeting_id=record["meeting_id"],
uuid=record["uuid"],
subject=record["subject"],
meeting_time=record["meeting_time"],
location=record["location"],
chairperson=record["chairperson"],
recorder=record["recorder"],
attendees=record["attendees"],
transcript_blob=record["transcript_blob"],
created_by=record["created_by"],
created_at=record["created_at"],
conclusions=conclusions,
actions=actions,
)
@router.put("/meetings/{meeting_id}", response_model=MeetingResponse)
async def update_meeting(
meeting_id: int,
meeting: MeetingUpdate,
current_user: TokenPayload = Depends(get_current_user),
):
"""Update meeting details."""
with get_db_cursor(commit=True) as cursor:
cursor.execute(
"SELECT * FROM meeting_records WHERE meeting_id = %s", (meeting_id,)
)
record = cursor.fetchone()
if not record:
raise HTTPException(status_code=404, detail="Meeting not found")
# Check access
if not is_admin(current_user) and record["created_by"] != current_user.email:
raise HTTPException(status_code=403, detail="Access denied")
# Build update query dynamically
updates = []
values = []
for field in ["subject", "meeting_time", "location", "chairperson", "recorder", "attendees", "transcript_blob"]:
value = getattr(meeting, field)
if value is not None:
updates.append(f"{field} = %s")
values.append(value)
if updates:
values.append(meeting_id)
cursor.execute(
f"UPDATE meeting_records SET {', '.join(updates)} WHERE meeting_id = %s",
values,
)
# Update conclusions if provided
if meeting.conclusions is not None:
cursor.execute(
"DELETE FROM meeting_conclusions WHERE meeting_id = %s", (meeting_id,)
)
effective_time = meeting.meeting_time or record["meeting_time"]
meeting_date = effective_time.date() if hasattr(effective_time, "date") else date.today()
date_str = meeting_date.strftime("%Y%m%d")
seq = get_next_sequence(cursor, "C", date_str)
for conclusion in meeting.conclusions:
system_code = generate_system_code("C", meeting_date, seq)
cursor.execute(
"""
INSERT INTO meeting_conclusions (meeting_id, content, system_code)
VALUES (%s, %s, %s)
""",
(meeting_id, conclusion.content, system_code),
)
seq += 1
# Update action items if provided
if meeting.actions is not None:
cursor.execute(
"DELETE FROM meeting_action_items WHERE meeting_id = %s", (meeting_id,)
)
effective_time = meeting.meeting_time or record["meeting_time"]
meeting_date = effective_time.date() if hasattr(effective_time, "date") else date.today()
date_str = meeting_date.strftime("%Y%m%d")
seq = get_next_sequence(cursor, "A", date_str)
for action in meeting.actions:
system_code = generate_system_code("A", meeting_date, seq)
cursor.execute(
"""
INSERT INTO meeting_action_items (meeting_id, content, owner, due_date, system_code)
VALUES (%s, %s, %s, %s, %s)
""",
(meeting_id, action.content, action.owner, action.due_date, system_code),
)
seq += 1
# Return updated meeting
return await get_meeting(meeting_id, current_user)
@router.delete("/meetings/{meeting_id}")
async def delete_meeting(
meeting_id: int, current_user: TokenPayload = Depends(get_current_user)
):
"""Delete meeting and all related data (cascade)."""
with get_db_cursor(commit=True) as cursor:
cursor.execute(
"SELECT * FROM meeting_records WHERE meeting_id = %s", (meeting_id,)
)
record = cursor.fetchone()
if not record:
raise HTTPException(status_code=404, detail="Meeting not found")
# Check access - admin or creator can delete
if not is_admin(current_user) and record["created_by"] != current_user.email:
raise HTTPException(status_code=403, detail="Access denied")
# Delete (cascade will handle conclusions and action items)
cursor.execute("DELETE FROM meeting_records WHERE meeting_id = %s", (meeting_id,))
return {"message": "Meeting deleted successfully"}
@router.put("/meetings/{meeting_id}/actions/{action_id}")
async def update_action_item(
meeting_id: int,
action_id: int,
action: ActionItemUpdate,
current_user: TokenPayload = Depends(get_current_user),
):
"""Update a specific action item's content, owner, due date, or status."""
with get_db_cursor(commit=True) as cursor:
cursor.execute(
"SELECT * FROM meeting_action_items WHERE action_id = %s AND meeting_id = %s",
(action_id, meeting_id),
)
record = cursor.fetchone()
if not record:
raise HTTPException(status_code=404, detail="Action item not found")
updates = []
values = []
for field in ["content", "owner", "due_date", "status"]:
value = getattr(action, field)
if value is not None:
updates.append(f"{field} = %s")
values.append(value.value if hasattr(value, "value") else value)
if updates:
values.append(action_id)
cursor.execute(
f"UPDATE meeting_action_items SET {', '.join(updates)} WHERE action_id = %s",
values,
)
cursor.execute(
"SELECT * FROM meeting_action_items WHERE action_id = %s", (action_id,)
)
updated = cursor.fetchone()
return ActionItemResponse(**updated)
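
get_next_sequence finds the latest code with ORDER BY system_code DESC, which compares strings. Because `:02d` pads to two digits only, a 100th item on one date would be coded C-YYYYMMDD-100, and that sorts below C-YYYYMMDD-99 as text. The sketch below illustrates a numeric alternative in plain Python, assuming the existing codes have already been fetched into a list. next_sequence is a hypothetical name, and this is not how the commit queries the database.

```python
from datetime import date
from typing import Iterable


def generate_system_code(prefix: str, meeting_date: date, sequence: int) -> str:
    """Same format as the router helper: C-20251210-01 or A-20251210-01."""
    return f"{prefix}-{meeting_date.strftime('%Y%m%d')}-{sequence:02d}"


def next_sequence(existing_codes: Iterable[str], prefix: str, date_str: str) -> int:
    """Hypothetical helper: compare sequence numbers as integers, not as text."""
    head = f"{prefix}-{date_str}-"
    numbers = [
        int(code[len(head):])
        for code in existing_codes
        if code.startswith(head) and code[len(head):].isdigit()
    ]
    return max(numbers, default=0) + 1


codes = ["C-20251210-99", "C-20251210-100"]
# Text ordering ranks "-99" above "-100", so the latest code would be misjudged
print(sorted(codes, reverse=True)[0])
# Integer comparison finds the true maximum
print(next_sequence(codes, "C", "20251210"))
```

Widening the padding (for example to three digits) would be another way to keep text ordering and numeric ordering aligned, at the cost of changing the documented code format.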

3
backend/pytest.ini Normal file

@@ -0,0 +1,3 @@
[pytest]
asyncio_mode = auto
asyncio_default_fixture_loop_scope = function

10
backend/requirements.txt Normal file

@@ -0,0 +1,10 @@
fastapi>=0.115.0
uvicorn[standard]>=0.32.0
python-dotenv>=1.0.0
mysql-connector-python>=9.0.0
pydantic>=2.10.0
httpx>=0.27.0
python-jose[cryptography]>=3.3.0
openpyxl>=3.1.2
pytest>=8.0.0
pytest-asyncio>=0.24.0


@@ -0,0 +1 @@
# Tests package

48
backend/tests/conftest.py Normal file

@@ -0,0 +1,48 @@
"""
Pytest configuration and fixtures.
"""
import pytest
import sys
import os
# Add the backend directory to the path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
@pytest.fixture(autouse=True)
def mock_env(monkeypatch):
"""Set up mock environment variables for all tests."""
monkeypatch.setenv("DB_HOST", "localhost")
monkeypatch.setenv("DB_PORT", "3306")
monkeypatch.setenv("DB_USER", "test")
monkeypatch.setenv("DB_PASS", "test")
monkeypatch.setenv("DB_NAME", "test_db")
monkeypatch.setenv("AUTH_API_URL", "https://auth.test.com/login")
monkeypatch.setenv("DIFY_API_URL", "https://dify.test.com/v1")
monkeypatch.setenv("DIFY_API_KEY", "test-api-key")
monkeypatch.setenv("ADMIN_EMAIL", "admin@test.com")
monkeypatch.setenv("JWT_SECRET", "test-jwt-secret")
@pytest.fixture
def sample_meeting():
"""Sample meeting data for tests."""
return {
"subject": "Test Meeting",
"meeting_time": "2025-01-15T10:00:00",
"location": "Conference Room A",
"chairperson": "John Doe",
"recorder": "Jane Smith",
"attendees": "alice@test.com, bob@test.com",
}
@pytest.fixture
def sample_transcript():
"""Sample transcript for AI tests."""
return """
今天的會議主要討論了Q1預算和新員工招聘計劃。
決定將行銷預算增加10%
小明負責在下週五前提交最終報告。
"""

191
backend/tests/test_ai.py Normal file

@@ -0,0 +1,191 @@
"""
Unit tests for AI summarization with mock Dify responses.
"""
import pytest
from unittest.mock import patch, MagicMock, AsyncMock
import json
pytestmark = pytest.mark.asyncio
class TestDifyResponseParsing:
"""Tests for parsing Dify LLM responses."""
def test_parse_json_response(self):
"""Test parsing valid JSON response from Dify."""
from app.routers.ai import parse_dify_response
response = '''Here is the summary:
```json
{
"conclusions": ["Agreed on Q1 budget", "New hire approved"],
"action_items": [
{"content": "Submit budget report", "owner": "John", "due_date": "2025-01-15"},
{"content": "Post job listing", "owner": "", "due_date": null}
]
}
```
'''
result = parse_dify_response(response)
assert len(result["conclusions"]) == 2
assert "Q1 budget" in result["conclusions"][0]
assert len(result["action_items"]) == 2
assert result["action_items"][0]["owner"] == "John"
def test_parse_inline_json_response(self):
"""Test parsing inline JSON without code blocks."""
from app.routers.ai import parse_dify_response
response = '{"conclusions": ["Budget approved"], "action_items": []}'
result = parse_dify_response(response)
assert len(result["conclusions"]) == 1
assert result["conclusions"][0] == "Budget approved"
def test_parse_non_json_response(self):
"""Test fallback when response is not JSON."""
from app.routers.ai import parse_dify_response
response = "The meeting discussed Q1 budget and hiring plans."
result = parse_dify_response(response)
# Should return the raw response as a single conclusion
assert len(result["conclusions"]) == 1
assert "Q1 budget" in result["conclusions"][0]
assert len(result["action_items"]) == 0
def test_parse_empty_response(self):
"""Test handling empty response."""
from app.routers.ai import parse_dify_response
result = parse_dify_response("")
assert result["conclusions"] == []
assert result["action_items"] == []
class TestSummarizeEndpoint:
"""Tests for the AI summarization endpoint."""
@patch("app.routers.ai.httpx.AsyncClient")
@patch("app.routers.ai.settings")
async def test_summarize_success(self, mock_settings, mock_client_class):
"""Test successful summarization."""
mock_settings.DIFY_API_URL = "https://dify.test.com/v1"
mock_settings.DIFY_API_KEY = "test-key"
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"answer": json.dumps({
"conclusions": ["Decision made"],
"action_items": [{"content": "Follow up", "owner": "Alice", "due_date": "2025-01-20"}]
})
}
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
mock_client.__aenter__.return_value = mock_client
mock_client.__aexit__.return_value = None
mock_client_class.return_value = mock_client
from app.routers.ai import summarize_transcript
from app.models import SummarizeRequest, TokenPayload
mock_user = TokenPayload(email="test@test.com", role="user")
result = await summarize_transcript(
SummarizeRequest(transcript="Test meeting transcript"),
current_user=mock_user
)
assert len(result.conclusions) == 1
assert len(result.action_items) == 1
assert result.action_items[0].owner == "Alice"
@patch("app.routers.ai.httpx.AsyncClient")
@patch("app.routers.ai.settings")
async def test_summarize_handles_timeout(self, mock_settings, mock_client_class):
"""Test handling Dify timeout."""
import httpx
from fastapi import HTTPException
mock_settings.DIFY_API_URL = "https://dify.test.com/v1"
mock_settings.DIFY_API_KEY = "test-key"
mock_client = AsyncMock()
mock_client.post.side_effect = httpx.TimeoutException("Timeout")
mock_client.__aenter__.return_value = mock_client
mock_client.__aexit__.return_value = None
mock_client_class.return_value = mock_client
from app.routers.ai import summarize_transcript
from app.models import SummarizeRequest, TokenPayload
mock_user = TokenPayload(email="test@test.com", role="user")
with pytest.raises(HTTPException) as exc_info:
await summarize_transcript(
SummarizeRequest(transcript="Test"),
current_user=mock_user
)
assert exc_info.value.status_code == 504
@patch("app.routers.ai.settings")
async def test_summarize_no_api_key(self, mock_settings):
"""Test error when Dify API key is not configured."""
from fastapi import HTTPException
mock_settings.DIFY_API_KEY = ""
from app.routers.ai import summarize_transcript
from app.models import SummarizeRequest, TokenPayload
mock_user = TokenPayload(email="test@test.com", role="user")
with pytest.raises(HTTPException) as exc_info:
await summarize_transcript(
SummarizeRequest(transcript="Test"),
current_user=mock_user
)
assert exc_info.value.status_code == 503
class TestPartialDataHandling:
"""Tests for handling partial data from AI."""
def test_action_item_with_empty_owner(self):
"""Test action items with empty owner are handled."""
from app.routers.ai import parse_dify_response
response = json.dumps({
"conclusions": [],
"action_items": [
{"content": "Task 1", "owner": "", "due_date": None},
{"content": "Task 2", "owner": "Bob", "due_date": "2025-02-01"}
]
})
result = parse_dify_response(response)
assert result["action_items"][0]["owner"] == ""
assert result["action_items"][1]["owner"] == "Bob"
def test_action_item_with_missing_fields(self):
"""Test action items with missing fields."""
from app.routers.ai import parse_dify_response
response = json.dumps({
"conclusions": ["Done"],
"action_items": [
{"content": "Task only"}
]
})
result = parse_dify_response(response)
# Should have content but other fields may be missing
assert result["action_items"][0]["content"] == "Task only"
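
The tests above pin down four behaviors of parse_dify_response: JSON inside a fenced block, bare inline JSON, a fallback that returns plain text as a single conclusion, and empty input. The real implementation in app/routers/ai.py is not shown in this part of the diff, so the following is only a sketch of one parser that would satisfy those cases. The regular expression and the fallback rules are assumptions.

```python
import json
import re

# Build the fence marker programmatically so this example can sit inside a fenced block
FENCE = "`" * 3
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def parse_dify_response(answer: str) -> dict:
    """Extract conclusions and action items from an LLM answer (illustrative only)."""
    text = (answer or "").strip()
    if not text:
        return {"conclusions": [], "action_items": []}

    # Prefer a fenced JSON object; otherwise try the whole answer as JSON
    match = FENCED_JSON.search(text)
    candidate = match.group(1) if match else text
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        data = None

    # Anything that is not a JSON object is returned verbatim as one conclusion
    if not isinstance(data, dict):
        return {"conclusions": [text], "action_items": []}

    return {
        "conclusions": list(data.get("conclusions") or []),
        "action_items": list(data.get("action_items") or []),
    }


print(parse_dify_response('{"conclusions": ["Budget approved"], "action_items": []}'))
```

Action items are passed through untouched, so an entry that only carries "content" keeps its missing fields missing, which matches what test_action_item_with_missing_fields expects.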

138
backend/tests/test_auth.py Normal file

@@ -0,0 +1,138 @@
"""
Unit tests for authentication functionality.
"""
import pytest
from unittest.mock import patch, MagicMock, AsyncMock
from fastapi.testclient import TestClient
from jose import jwt
pytestmark = pytest.mark.asyncio
class TestAdminRoleDetection:
"""Tests for admin role detection."""
def test_admin_email_gets_admin_role(self):
"""Test that admin email is correctly identified."""
from app.config import settings
admin_email = settings.ADMIN_EMAIL
test_email = "regular@example.com"
# Admin email should be set (either from env or default)
assert admin_email is not None
assert len(admin_email) > 0
assert test_email != admin_email
@patch("app.routers.auth.settings")
def test_create_token_includes_role(self, mock_settings):
"""Test that created tokens include the role."""
mock_settings.JWT_SECRET = "test-secret"
mock_settings.ADMIN_EMAIL = "admin@test.com"
from app.routers.auth import create_token
# Test admin token
admin_token = create_token("admin@test.com", "admin")
admin_payload = jwt.decode(admin_token, "test-secret", algorithms=["HS256"])
assert admin_payload["role"] == "admin"
# Test user token
user_token = create_token("user@test.com", "user")
user_payload = jwt.decode(user_token, "test-secret", algorithms=["HS256"])
assert user_payload["role"] == "user"
class TestTokenValidation:
"""Tests for JWT token validation."""
@patch("app.routers.auth.settings")
def test_decode_valid_token(self, mock_settings):
"""Test decoding a valid token."""
mock_settings.JWT_SECRET = "test-secret"
from app.routers.auth import create_token, decode_token
token = create_token("test@example.com", "user")
payload = decode_token(token)
assert payload.email == "test@example.com"
assert payload.role == "user"
@patch("app.routers.auth.settings")
def test_decode_invalid_token_raises_error(self, mock_settings):
"""Test that invalid tokens raise an error."""
mock_settings.JWT_SECRET = "test-secret"
from app.routers.auth import decode_token
from fastapi import HTTPException
with pytest.raises(HTTPException) as exc_info:
decode_token("invalid-token")
assert exc_info.value.status_code == 401
class TestLoginEndpoint:
"""Tests for the login endpoint."""
@pytest.fixture
def client(self):
"""Create test client."""
from app.main import app
# Skip lifespan for tests
app.router.lifespan_context = None
return TestClient(app, raise_server_exceptions=False)
@patch("app.routers.auth.httpx.AsyncClient")
@patch("app.routers.auth.settings")
async def test_login_success(self, mock_settings, mock_client_class):
"""Test successful login."""
mock_settings.AUTH_API_URL = "https://auth.test.com/login"
mock_settings.ADMIN_EMAIL = "admin@test.com"
mock_settings.JWT_SECRET = "test-secret"
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {"success": True, "token": "external-token"}
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
mock_client.__aenter__.return_value = mock_client
mock_client.__aexit__.return_value = None
mock_client_class.return_value = mock_client
from app.routers.auth import login
from app.models import LoginRequest
result = await login(LoginRequest(email="user@test.com", password="password"))
assert result.email == "user@test.com"
assert result.role == "user"
assert result.token is not None
@patch("app.routers.auth.httpx.AsyncClient")
@patch("app.routers.auth.settings")
async def test_login_admin_gets_admin_role(self, mock_settings, mock_client_class):
"""Test that admin email gets admin role."""
mock_settings.AUTH_API_URL = "https://auth.test.com/login"
mock_settings.ADMIN_EMAIL = "admin@test.com"
mock_settings.JWT_SECRET = "test-secret"
mock_response = MagicMock()
mock_response.status_code = 200
mock_client = AsyncMock()
mock_client.post.return_value = mock_response
mock_client.__aenter__.return_value = mock_client
mock_client.__aexit__.return_value = None
mock_client_class.return_value = mock_client
from app.routers.auth import login
from app.models import LoginRequest
result = await login(LoginRequest(email="admin@test.com", password="password"))
assert result.role == "admin"
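
The login handler accepts an upstream reply only when the status is 200 and the JSON body has a truthy "success" field. A mocked response therefore needs an explicit json.return_value: a bare MagicMock answers every attribute chain with another truthy MagicMock, so the check passes without being exercised. The sketch below shows the difference using only unittest.mock. upstream_accepted is a hypothetical stand-in for the handler's check, not a function in this commit.

```python
from unittest.mock import MagicMock


def upstream_accepted(response) -> bool:
    """Mirror the handler's rule: HTTP 200 and a truthy "success" field in the body."""
    return response.status_code == 200 and bool(response.json().get("success"))


def make_response(body=None) -> MagicMock:
    """Build a mocked 200 response; leave json() unconfigured when body is None."""
    response = MagicMock()
    response.status_code = 200
    if body is not None:
        response.json.return_value = body
    return response


bare = make_response()                                  # json() returns a MagicMock
token_only = make_response({"token": "external-token"})
accepted = make_response({"success": True})
rejected = make_response({"success": False, "error": "Bad password"})

for name, resp in [("bare", bare), ("token_only", token_only),
                   ("accepted", accepted), ("rejected", rejected)]:
    print(name, upstream_accepted(resp))
```

Setting the body explicitly in every login test keeps the "success" branch under test and makes a rejected-login case easy to add.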


@@ -0,0 +1,95 @@
"""
Unit tests for database connection and table initialization.
"""
import pytest
from unittest.mock import patch, MagicMock
class TestDatabaseConnection:
"""Tests for database connectivity."""
@patch("mysql.connector.pooling.MySQLConnectionPool")
def test_init_db_pool_success(self, mock_pool):
"""Test successful database pool initialization."""
mock_pool.return_value = MagicMock()
from app.database import init_db_pool
pool = init_db_pool()
assert pool is not None
mock_pool.assert_called_once()
@patch("mysql.connector.pooling.MySQLConnectionPool")
def test_init_db_pool_with_correct_config(self, mock_pool):
"""Test database pool is created with correct configuration."""
from app.database import init_db_pool
from app.config import settings
init_db_pool()
call_args = mock_pool.call_args
assert call_args.kwargs["host"] == settings.DB_HOST
assert call_args.kwargs["port"] == settings.DB_PORT
assert call_args.kwargs["user"] == settings.DB_USER
assert call_args.kwargs["database"] == settings.DB_NAME
class TestTableInitialization:
"""Tests for table creation."""
@patch("app.database.get_db_cursor")
def test_init_tables_creates_required_tables(self, mock_cursor_context):
"""Test that all required tables are created."""
mock_cursor = MagicMock()
mock_cursor_context.return_value.__enter__ = MagicMock(return_value=mock_cursor)
mock_cursor_context.return_value.__exit__ = MagicMock(return_value=False)
from app.database import init_tables
init_tables()
# Verify execute was called for each table
assert mock_cursor.execute.call_count == 4
# Check table names in SQL
calls = mock_cursor.execute.call_args_list
sql_statements = [call[0][0] for call in calls]
assert any("meeting_users" in sql for sql in sql_statements)
assert any("meeting_records" in sql for sql in sql_statements)
assert any("meeting_conclusions" in sql for sql in sql_statements)
assert any("meeting_action_items" in sql for sql in sql_statements)
class TestDatabaseHelpers:
"""Tests for database helper functions."""
@patch("app.database.connection_pool")
def test_get_db_connection_returns_connection(self, mock_pool):
"""Test that get_db_connection returns a valid connection."""
mock_conn = MagicMock()
mock_pool.get_connection.return_value = mock_conn
from app.database import get_db_connection
with get_db_connection() as conn:
assert conn == mock_conn
mock_conn.close.assert_called_once()
@patch("app.database.connection_pool")
def test_get_db_cursor_with_commit(self, mock_pool):
"""Test that get_db_cursor commits when specified."""
mock_conn = MagicMock()
mock_cursor = MagicMock()
mock_pool.get_connection.return_value = mock_conn
mock_conn.cursor.return_value = mock_cursor
from app.database import get_db_cursor
with get_db_cursor(commit=True) as cursor:
cursor.execute("SELECT 1")
mock_conn.commit.assert_called_once()
mock_cursor.close.assert_called_once()
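
The helper tests above expect get_db_cursor(commit=True) to commit once and close the cursor, and get_db_connection to close the connection on exit. A minimal sketch of a context manager with that contract follows, assuming a DB-API style connection is passed in directly. The name db_cursor and the rollback-on-error behavior are assumptions, not the code in app/database.py.

```python
from contextlib import contextmanager
from unittest.mock import MagicMock


@contextmanager
def db_cursor(conn, commit: bool = False):
    """Yield a cursor, commit on success if requested, and always release resources."""
    cursor = conn.cursor()
    try:
        yield cursor
        if commit:
            conn.commit()
    except Exception:
        # Undo partial work before re-raising so the connection is returned clean
        conn.rollback()
        raise
    finally:
        cursor.close()
        conn.close()


# Stand-in for a pooled connection, as in the tests above
conn = MagicMock()
with db_cursor(conn, commit=True) as cursor:
    cursor.execute("SELECT 1")

print(conn.commit.call_count, conn.close.call_count)
```

Committing inside the try block, after the caller's body has finished, means a failed statement never reaches commit, and the finally clause still returns the connection to the pool.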

4118
client/package-lock.json generated Normal file

File diff suppressed because it is too large

47
client/package.json Normal file

@@ -0,0 +1,47 @@
{
"name": "meeting-assistant",
"version": "1.0.0",
"description": "Enterprise Meeting Knowledge Management",
"main": "src/main.js",
"scripts": {
"start": "electron .",
"build": "electron-builder",
"pack": "electron-builder --dir"
},
"author": "Your Company",
"license": "MIT",
"devDependencies": {
"electron": "^28.0.0",
"electron-builder": "^24.9.1"
},
"dependencies": {
"axios": "^1.6.2"
},
"build": {
"appId": "com.company.meeting-assistant",
"productName": "Meeting Assistant",
"directories": {
"output": "dist"
},
"files": [
"src/**/*",
"node_modules/**/*"
],
"extraResources": [
{
"from": "../sidecar/dist",
"to": "sidecar",
"filter": ["**/*"]
}
],
"win": {
"target": "portable"
},
"mac": {
"target": "dmg"
},
"linux": {
"target": "AppImage"
}
}
}

278
client/src/main.js Normal file

@@ -0,0 +1,278 @@
const { app, BrowserWindow, ipcMain } = require("electron");
const path = require("path");
const fs = require("fs");
const { spawn } = require("child_process");
const os = require("os");
let mainWindow;
let sidecarProcess;
let sidecarReady = false;
let streamingActive = false;
function createWindow() {
mainWindow = new BrowserWindow({
width: 1200,
height: 800,
webPreferences: {
nodeIntegration: false,
contextIsolation: true,
preload: path.join(__dirname, "preload.js"),
},
});
mainWindow.loadFile(path.join(__dirname, "pages", "login.html"));
mainWindow.on("closed", () => {
mainWindow = null;
});
}
function startSidecar() {
const sidecarDir = app.isPackaged
? path.join(process.resourcesPath, "sidecar")
: path.join(__dirname, "..", "..", "sidecar");
const sidecarScript = path.join(sidecarDir, "transcriber.py");
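// The venv layout differs by platform: Scripts\python.exe on Windows, bin/python elsewhere
const venvPython = process.platform === "win32"
? path.join(sidecarDir, "venv", "Scripts", "python.exe")
: path.join(sidecarDir, "venv", "bin", "python");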
if (!fs.existsSync(sidecarScript)) {
console.log("Sidecar script not found at:", sidecarScript);
console.log("Transcription will not be available.");
return;
}
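// Fall back to the interpreter on PATH; Windows installs usually expose "python", not "python3"
const systemPython = process.platform === "win32" ? "python" : "python3";
const pythonPath = fs.existsSync(venvPython) ? venvPython : systemPython;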
try {
console.log("Starting sidecar with:", pythonPath, sidecarScript);
sidecarProcess = spawn(pythonPath, [sidecarScript], {
cwd: sidecarDir,
stdio: ["pipe", "pipe", "pipe"],
});
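// Handle stdout (JSON responses); a chunk may end mid-line, so keep the partial tail buffered
let stdoutBuffer = "";
sidecarProcess.stdout.on("data", (data) => {
stdoutBuffer += data.toString();
const parts = stdoutBuffer.split("\n");
stdoutBuffer = parts.pop();
const lines = parts.filter(l => l.trim());
for (const line of lines) {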
try {
const msg = JSON.parse(line);
console.log("Sidecar response:", msg);
if (msg.status === "ready") {
sidecarReady = true;
console.log("Sidecar is ready");
}
// Forward streaming segment to renderer
if (msg.segment_id !== undefined && mainWindow) {
mainWindow.webContents.send("transcription-segment", msg);
}
// Forward stream status changes
if (msg.status === "streaming" && mainWindow) {
mainWindow.webContents.send("stream-started", msg);
}
if (msg.status === "stream_stopped" && mainWindow) {
mainWindow.webContents.send("stream-stopped", msg);
}
// Legacy: file-based transcription result
if (msg.result !== undefined && mainWindow) {
mainWindow.webContents.send("transcription-result", msg.result);
}
} catch (e) {
console.log("Sidecar output:", line);
}
}
});
sidecarProcess.stderr.on("data", (data) => {
console.log("Sidecar:", data.toString().trim());
});
sidecarProcess.on("close", (code) => {
console.log(`Sidecar exited with code ${code}`);
sidecarReady = false;
streamingActive = false;
});
sidecarProcess.on("error", (err) => {
console.error("Sidecar error:", err.message);
});
} catch (error) {
console.error("Failed to start sidecar:", error);
}
}
app.whenReady().then(() => {
createWindow();
startSidecar();
app.on("activate", () => {
if (BrowserWindow.getAllWindows().length === 0) {
createWindow();
}
});
});
app.on("window-all-closed", () => {
if (sidecarProcess) {
try {
sidecarProcess.stdin.write(JSON.stringify({ action: "quit" }) + "\n");
} catch (e) {}
sidecarProcess.kill();
}
if (process.platform !== "darwin") {
app.quit();
}
});
// IPC handlers
ipcMain.handle("navigate", (event, page) => {
mainWindow.loadFile(path.join(__dirname, "pages", `${page}.html`));
});
ipcMain.handle("get-sidecar-status", () => {
return { ready: sidecarReady, streaming: streamingActive };
});
// === Streaming Mode IPC Handlers ===
ipcMain.handle("start-recording-stream", async () => {
if (!sidecarProcess || !sidecarReady) {
return { error: "Sidecar not ready" };
}
if (streamingActive) {
return { error: "Stream already active" };
}
return new Promise((resolve) => {
const responseHandler = (data) => {
const lines = data.toString().split("\n").filter(l => l.trim());
for (const line of lines) {
try {
const msg = JSON.parse(line);
if (msg.status === "streaming" || msg.error) {
sidecarProcess.stdout.removeListener("data", responseHandler);
if (msg.status === "streaming") {
streamingActive = true;
}
resolve(msg);
return;
}
} catch (e) {}
}
};
sidecarProcess.stdout.on("data", responseHandler);
sidecarProcess.stdin.write(JSON.stringify({ action: "start_stream" }) + "\n");
setTimeout(() => {
sidecarProcess.stdout.removeListener("data", responseHandler);
resolve({ error: "Start stream timeout" });
}, 5000);
});
});
ipcMain.handle("stream-audio-chunk", async (event, base64Audio) => {
if (!sidecarProcess || !sidecarReady || !streamingActive) {
return { error: "Stream not active" };
}
try {
const cmd = JSON.stringify({ action: "audio_chunk", data: base64Audio }) + "\n";
sidecarProcess.stdin.write(cmd);
return { sent: true };
} catch (e) {
return { error: e.message };
}
});
ipcMain.handle("stop-recording-stream", async () => {
if (!sidecarProcess || !streamingActive) {
return { error: "No active stream" };
}
return new Promise((resolve) => {
let timeoutId;
const responseHandler = (data) => {
const lines = data.toString().split("\n").filter(l => l.trim());
for (const line of lines) {
try {
const msg = JSON.parse(line);
if (msg.status === "stream_stopped" || msg.error) {
// Cancel the timeout; otherwise it would later clear the state of a newer stream
clearTimeout(timeoutId);
sidecarProcess.stdout.removeListener("data", responseHandler);
streamingActive = false;
resolve(msg);
return;
}
} catch (e) {}
}
};
sidecarProcess.stdout.on("data", responseHandler);
sidecarProcess.stdin.write(JSON.stringify({ action: "stop_stream" }) + "\n");
timeoutId = setTimeout(() => {
sidecarProcess.stdout.removeListener("data", responseHandler);
streamingActive = false;
resolve({ error: "Stop stream timeout" });
}, 10000);
});
});
// === Legacy File-based Handlers (kept for fallback) ===
ipcMain.handle("save-audio-file", async (event, arrayBuffer) => {
const tempDir = os.tmpdir();
const tempFile = path.join(tempDir, `recording_${Date.now()}.webm`);
const buffer = Buffer.from(arrayBuffer);
fs.writeFileSync(tempFile, buffer);
return tempFile;
});
ipcMain.handle("transcribe-audio", async (event, audioFilePath) => {
if (!sidecarProcess || !sidecarReady) {
return { error: "Sidecar not ready" };
}
return new Promise((resolve) => {
const responseHandler = (data) => {
const lines = data.toString().split("\n").filter(l => l.trim());
for (const line of lines) {
try {
const msg = JSON.parse(line);
if (msg.result !== undefined || msg.error) {
sidecarProcess.stdout.removeListener("data", responseHandler);
// Delete temp file after transcription
try {
if (fs.existsSync(audioFilePath)) {
fs.unlinkSync(audioFilePath);
}
} catch (e) {
console.error("Failed to delete temp file:", e);
}
resolve(msg);
return;
}
} catch (e) {}
}
};
sidecarProcess.stdout.on("data", responseHandler);
const cmd = JSON.stringify({ action: "transcribe", file: audioFilePath }) + "\n";
sidecarProcess.stdin.write(cmd);
setTimeout(() => {
sidecarProcess.stdout.removeListener("data", responseHandler);
// Delete temp file on timeout too
try {
if (fs.existsSync(audioFilePath)) {
fs.unlinkSync(audioFilePath);
}
} catch (e) {}
resolve({ error: "Transcription timeout" });
}, 60000);
});
});
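
Review note on the `navigate` handler above: it interpolates a page name supplied by the renderer directly into a file path. The following is a minimal sketch of one possible hardening, assuming the only pages are the three this commit navigates to (`login`, `meetings`, `meeting-detail`). `ALLOWED_PAGES` and `resolvePagePath` are hypothetical names and not part of the commit.

```javascript
const path = require("path");

// Hypothetical allowlist: the page names this commit navigates to.
const ALLOWED_PAGES = new Set(["login", "meetings", "meeting-detail"]);

// Returns the absolute path of a known page, or null for anything else
// (for example "../../secret"), so the caller can refuse to load it.
function resolvePagePath(baseDir, page) {
  if (typeof page !== "string" || !ALLOWED_PAGES.has(page)) {
    return null;
  }
  return path.join(baseDir, "pages", `${page}.html`);
}
```

Inside the handler this could be used as `const target = resolvePagePath(__dirname, page); if (target && mainWindow) mainWindow.loadFile(target);`, which also covers the case where the window has already been closed.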


@@ -0,0 +1,58 @@
<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Meeting Assistant - Login</title>
<link rel="stylesheet" href="../styles/main.css">
</head>
<body>
<div class="login-container">
<div class="login-box">
<h1>Meeting Assistant</h1>
<div id="error-alert" class="alert alert-error hidden"></div>
<form id="login-form">
<div class="form-group">
<label for="email">Email</label>
<input type="email" id="email" name="email" required placeholder="your.email@company.com">
</div>
<div class="form-group">
<label for="password">Password</label>
<input type="password" id="password" name="password" required placeholder="Enter your password">
</div>
<button type="submit" class="btn btn-primary btn-full" id="login-btn">Login</button>
</form>
</div>
</div>
<script type="module">
import { login } from '../services/api.js';
const form = document.getElementById('login-form');
const errorAlert = document.getElementById('error-alert');
const loginBtn = document.getElementById('login-btn');
form.addEventListener('submit', async (e) => {
e.preventDefault();
const email = document.getElementById('email').value;
const password = document.getElementById('password').value;
loginBtn.disabled = true;
loginBtn.textContent = 'Logging in...';
errorAlert.classList.add('hidden');
try {
await login(email, password);
window.electronAPI.navigate('meetings');
} catch (error) {
errorAlert.textContent = error.message;
errorAlert.classList.remove('hidden');
} finally {
loginBtn.disabled = false;
loginBtn.textContent = 'Login';
}
});
</script>
</body>
</html>


@@ -0,0 +1,685 @@
<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Meeting Assistant - Meeting Detail</title>
<link rel="stylesheet" href="../styles/main.css">
<style>
.transcript-segments {
display: flex;
flex-direction: column;
gap: 8px;
}
.transcript-segment {
position: relative;
padding: 10px 12px;
background: #f8f9fa;
border: 1px solid #dee2e6;
border-radius: 6px;
transition: all 0.2s;
}
.transcript-segment:hover {
background: #fff;
border-color: #adb5bd;
}
.transcript-segment.active {
background: #e3f2fd;
border-color: #2196f3;
box-shadow: 0 0 0 2px rgba(33, 150, 243, 0.2);
}
.transcript-segment.edited {
border-left: 3px solid #ff9800;
}
.transcript-segment textarea {
width: 100%;
min-height: 40px;
padding: 0;
border: none;
background: transparent;
resize: vertical;
font-size: 14px;
line-height: 1.5;
font-family: inherit;
}
.transcript-segment textarea:focus {
outline: none;
}
.segment-meta {
display: flex;
justify-content: space-between;
align-items: center;
margin-top: 6px;
font-size: 11px;
color: #6c757d;
}
.segment-id {
background: #e9ecef;
padding: 2px 6px;
border-radius: 3px;
}
.segment-edited {
color: #ff9800;
font-weight: 500;
}
.streaming-status {
display: flex;
align-items: center;
gap: 10px;
padding: 10px 15px;
background: #e3f2fd;
border-radius: 6px;
margin-bottom: 10px;
}
.streaming-status.hidden {
display: none;
}
.pulse-dot {
width: 10px;
height: 10px;
background: #f44336;
border-radius: 50%;
animation: pulse 1.5s infinite;
}
@keyframes pulse {
0%, 100% { opacity: 1; transform: scale(1); }
50% { opacity: 0.5; transform: scale(1.2); }
}
.segment-count {
margin-left: auto;
font-size: 12px;
color: #666;
}
.processing-indicator {
text-align: center;
padding: 15px;
color: #666;
font-style: italic;
}
.transcript-textarea {
width: 100%;
min-height: 400px;
padding: 15px;
border: 1px solid #ddd;
border-radius: 4px;
font-size: 15px;
line-height: 1.8;
resize: vertical;
font-family: 'Microsoft JhengHei', 'PingFang TC', sans-serif;
}
.transcript-textarea:focus {
outline: none;
border-color: #2196F3;
box-shadow: 0 0 0 2px rgba(33, 150, 243, 0.2);
}
</style>
</head>
<body>
<header class="header">
<h1 id="meeting-title">Meeting Details</h1>
<nav class="header-nav">
<a href="#" id="back-btn">Back to List</a>
<a href="#" id="export-btn">Export Excel</a>
<a href="#" id="delete-btn" style="color: #ff6b6b;">Delete</a>
</nav>
</header>
<div class="container" style="padding: 10px;">
<!-- Meeting Info Bar -->
<div class="card" style="margin-bottom: 10px;">
<div class="card-body" style="padding: 10px 20px;">
<div id="meeting-info" style="display: flex; flex-wrap: wrap; gap: 20px;">
<span><strong>Time:</strong> <span id="info-time"></span></span>
<span><strong>Location:</strong> <span id="info-location"></span></span>
<span><strong>Chair:</strong> <span id="info-chair"></span></span>
<span><strong>Recorder:</strong> <span id="info-recorder"></span></span>
</div>
</div>
</div>
<!-- Dual Panel Layout -->
<div class="dual-panel">
<!-- Left Panel: Transcript -->
<div class="panel">
<div class="panel-header">
<span>Transcript (逐字稿)</span>
<div class="recording-controls" style="padding: 0;">
<button class="btn btn-danger" id="record-btn">Start Recording</button>
</div>
</div>
<div class="panel-body">
<!-- Streaming Status -->
<div id="streaming-status" class="streaming-status hidden">
<span class="pulse-dot"></span>
<span>Recording & Transcribing...</span>
<span class="segment-count" id="segment-count">Segments: 0</span>
</div>
<!-- Single Transcript Textarea -->
<div id="transcript-container">
<textarea
id="transcript-text"
class="transcript-textarea"
placeholder="Click 'Start Recording' to begin transcription. You can also type or paste text directly here."
></textarea>
<div class="processing-indicator hidden" id="processing-indicator">
Processing audio...
</div>
</div>
</div>
</div>
<!-- Right Panel: Notes & Actions -->
<div class="panel">
<div class="panel-header">
<span>Notes & Actions</span>
<button class="btn btn-primary" id="summarize-btn">AI Summarize</button>
</div>
<div class="panel-body">
<!-- Conclusions -->
<div class="mb-20">
<h3 class="mb-10">Conclusions (結論)</h3>
<div id="conclusions-list"></div>
<button class="btn btn-secondary mt-10" id="add-conclusion-btn">+ Add Conclusion</button>
</div>
<!-- Action Items -->
<div>
<h3 class="mb-10">Action Items (待辦事項)</h3>
<div id="actions-list"></div>
<button class="btn btn-secondary mt-10" id="add-action-btn">+ Add Action Item</button>
</div>
</div>
<div style="padding: 15px 20px; border-top: 1px solid #dee2e6;">
<button class="btn btn-success btn-full" id="save-btn">Save Changes</button>
</div>
</div>
</div>
</div>
<script type="module">
import {
getMeeting,
updateMeeting,
deleteMeeting,
exportMeeting,
summarizeTranscript
} from '../services/api.js';
const meetingId = localStorage.getItem('currentMeetingId');
let currentMeeting = null;
let isRecording = false;
let audioContext = null;
let mediaStream = null;
let audioWorklet = null;
let transcriptionCount = 0; // Track number of transcription chunks
// Elements
const titleEl = document.getElementById('meeting-title');
const timeEl = document.getElementById('info-time');
const locationEl = document.getElementById('info-location');
const chairEl = document.getElementById('info-chair');
const recorderEl = document.getElementById('info-recorder');
const transcriptTextEl = document.getElementById('transcript-text');
const streamingStatusEl = document.getElementById('streaming-status');
const segmentCountEl = document.getElementById('segment-count');
const processingIndicatorEl = document.getElementById('processing-indicator');
const conclusionsEl = document.getElementById('conclusions-list');
const actionsEl = document.getElementById('actions-list');
const recordBtn = document.getElementById('record-btn');
const summarizeBtn = document.getElementById('summarize-btn');
const saveBtn = document.getElementById('save-btn');
const backBtn = document.getElementById('back-btn');
const exportBtn = document.getElementById('export-btn');
const deleteBtn = document.getElementById('delete-btn');
const addConclusionBtn = document.getElementById('add-conclusion-btn');
const addActionBtn = document.getElementById('add-action-btn');
// Load meeting data
async function loadMeeting() {
try {
currentMeeting = await getMeeting(meetingId);
renderMeeting();
} catch (error) {
alert('Error loading meeting: ' + error.message);
window.electronAPI.navigate('meetings');
}
}
function renderMeeting() {
titleEl.textContent = currentMeeting.subject;
timeEl.textContent = new Date(currentMeeting.meeting_time).toLocaleString('zh-TW');
locationEl.textContent = currentMeeting.location || '-';
chairEl.textContent = currentMeeting.chairperson || '-';
recorderEl.textContent = currentMeeting.recorder || '-';
// Load existing transcript
if (currentMeeting.transcript_blob) {
transcriptTextEl.value = currentMeeting.transcript_blob;
}
renderConclusions();
renderActions();
}
// Append new transcription text
function appendTranscription(text) {
if (!text || !text.trim()) return;
transcriptionCount++;
const currentText = transcriptTextEl.value;
if (currentText.trim()) {
// Append with proper separator
transcriptTextEl.value = currentText.trimEnd() + '\n' + text.trim();
} else {
transcriptTextEl.value = text.trim();
}
// Scroll to bottom
transcriptTextEl.scrollTop = transcriptTextEl.scrollHeight;
segmentCountEl.textContent = `Segments: ${transcriptionCount}`;
}
function getTranscript() {
return transcriptTextEl.value;
}
// === Recording with Segmented Transcription ===
// Each segment is a fresh 5-second recording (fixed size, efficient for long meetings)
let mediaRecorder = null;
let currentChunks = [];
let recordingCycleTimer = null;
let isProcessing = false;
const SEGMENT_DURATION = 5000; // 5 seconds per segment
recordBtn.addEventListener('click', async () => {
if (!isRecording) {
await startRecording();
} else {
await stopRecording();
}
});
async function startRecording() {
try {
const status = await window.electronAPI.getSidecarStatus();
if (!status.ready) {
alert('Transcription engine is not ready. Please wait a moment and try again.');
return;
}
mediaStream = await navigator.mediaDevices.getUserMedia({
audio: { echoCancellation: true, noiseSuppression: true }
});
isRecording = true;
recordBtn.textContent = 'Stop Recording';
recordBtn.classList.remove('btn-danger');
recordBtn.classList.add('btn-secondary');
streamingStatusEl.classList.remove('hidden');
// Start first recording cycle
startRecordingCycle();
console.log('Recording started with segmented approach');
} catch (error) {
console.error('Start recording error:', error);
alert('Error starting recording: ' + error.message);
await cleanupRecording();
}
}
function startRecordingCycle() {
if (!isRecording || !mediaStream) return;
currentChunks = [];
mediaRecorder = new MediaRecorder(mediaStream, {
mimeType: 'audio/webm;codecs=opus'
});
mediaRecorder.ondataavailable = (e) => {
if (e.data.size > 0) {
currentChunks.push(e.data);
}
};
mediaRecorder.onstop = async () => {
if (currentChunks.length > 0 && !isProcessing) {
isProcessing = true;
await transcribeCurrentSegment();
isProcessing = false;
}
// Start next cycle if still recording
if (isRecording && mediaStream) {
startRecordingCycle();
}
};
mediaRecorder.start(100); // Collect frequently for smooth stopping
// Schedule stop after SEGMENT_DURATION
recordingCycleTimer = setTimeout(() => {
if (mediaRecorder && mediaRecorder.state === 'recording') {
mediaRecorder.stop();
}
}, SEGMENT_DURATION);
}
async function transcribeCurrentSegment() {
if (currentChunks.length === 0) return;
try {
processingIndicatorEl.classList.remove('hidden');
const blob = new Blob(currentChunks, { type: 'audio/webm' });
// Skip very small blobs (likely silence)
if (blob.size < 1000) {
processingIndicatorEl.classList.add('hidden');
return;
}
const arrayBuffer = await blob.arrayBuffer();
const filePath = await window.electronAPI.saveAudioFile(arrayBuffer);
const result = await window.electronAPI.transcribeAudio(filePath);
if (result.result) {
const text = result.result.trim();
if (text) {
appendTranscription(text);
}
}
} catch (err) {
console.error('Segment transcription error:', err);
} finally {
processingIndicatorEl.classList.add('hidden');
}
}
async function stopRecording() {
try {
recordBtn.disabled = true;
recordBtn.textContent = 'Processing...';
isRecording = false;
// Clear the cycle timer
if (recordingCycleTimer) {
clearTimeout(recordingCycleTimer);
recordingCycleTimer = null;
}
// Stop current recorder and process final segment
if (mediaRecorder && mediaRecorder.state === 'recording') {
await new Promise(resolve => {
// Override the cycle's onstop so the final segment is transcribed without starting a new cycle
mediaRecorder.onstop = async () => {
if (currentChunks.length > 0) {
await transcribeCurrentSegment();
}
resolve();
};
mediaRecorder.stop();
});
}
} catch (error) {
console.error('Stop recording error:', error);
} finally {
await cleanupRecording();
}
}
async function cleanupRecording() {
isRecording = false;
if (recordingCycleTimer) {
clearTimeout(recordingCycleTimer);
recordingCycleTimer = null;
}
if (mediaRecorder && mediaRecorder.state !== 'inactive') {
mediaRecorder.stop();
}
mediaRecorder = null;
if (mediaStream) {
mediaStream.getTracks().forEach(track => track.stop());
mediaStream = null;
}
currentChunks = [];
isProcessing = false;
recordBtn.disabled = false;
recordBtn.textContent = 'Start Recording';
recordBtn.classList.remove('btn-secondary');
recordBtn.classList.add('btn-danger');
streamingStatusEl.classList.add('hidden');
processingIndicatorEl.classList.add('hidden');
}
// === Streaming Event Handlers (not used by the segmented recorder above; kept for streaming mode) ===
window.electronAPI.onTranscriptionSegment((segment) => {
console.log('Received segment:', segment);
processingIndicatorEl.classList.add('hidden');
if (segment.text) {
appendTranscription(segment.text);
}
if (isRecording) {
processingIndicatorEl.classList.remove('hidden');
}
});
window.electronAPI.onStreamStopped((data) => {
console.log('Stream stopped event:', data);
if (data.final_segments) {
data.final_segments.forEach(seg => {
if (seg.text) appendTranscription(seg.text);
});
}
});
// === Conclusions Rendering ===
// Escape text before interpolating it into HTML so that typed or AI-generated
// content cannot break out of a textarea or an attribute value
function escapeHtml(text) {
return String(text ?? '')
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;');
}
function renderConclusions() {
if (!currentMeeting.conclusions || currentMeeting.conclusions.length === 0) {
conclusionsEl.innerHTML = '<p style="color: #666;">No conclusions yet.</p>';
return;
}
conclusionsEl.innerHTML = currentMeeting.conclusions.map((c, i) => `
<div class="action-item">
<div class="action-item-header">
<span class="action-item-code">${escapeHtml(c.system_code || 'NEW')}</span>
<button class="btn btn-danger" style="padding: 4px 8px; font-size: 0.8rem;" onclick="removeConclusion(${i})">Remove</button>
</div>
<textarea
class="conclusion-content"
data-index="${i}"
style="width: 100%; min-height: 60px; padding: 8px; border: 1px solid #ddd; border-radius: 4px;"
>${escapeHtml(c.content)}</textarea>
</div>
`).join('');
}
function renderActions() {
if (!currentMeeting.actions || currentMeeting.actions.length === 0) {
actionsEl.innerHTML = '<p style="color: #666;">No action items yet.</p>';
return;
}
actionsEl.innerHTML = currentMeeting.actions.map((a, i) => `
<div class="action-item">
<div class="action-item-header">
<span class="action-item-code">${escapeHtml(a.system_code || 'NEW')}</span>
<select class="action-status" data-index="${i}" style="padding: 4px 8px;">
<option value="Open" ${a.status === 'Open' ? 'selected' : ''}>Open</option>
<option value="In Progress" ${a.status === 'In Progress' ? 'selected' : ''}>In Progress</option>
<option value="Done" ${a.status === 'Done' ? 'selected' : ''}>Done</option>
<option value="Delayed" ${a.status === 'Delayed' ? 'selected' : ''}>Delayed</option>
</select>
</div>
<textarea
class="action-content"
data-index="${i}"
style="width: 100%; min-height: 40px; padding: 8px; border: 1px solid #ddd; border-radius: 4px; margin-bottom: 8px;"
>${escapeHtml(a.content)}</textarea>
<div style="display: grid; grid-template-columns: 1fr 1fr auto; gap: 10px; align-items: center;">
<input type="text" class="action-owner" data-index="${i}" value="${escapeHtml(a.owner || '')}" placeholder="Owner" style="padding: 8px; border: 1px solid #ddd; border-radius: 4px;">
<input type="date" class="action-due" data-index="${i}" value="${escapeHtml(a.due_date || '')}" style="padding: 8px; border: 1px solid #ddd; border-radius: 4px;">
<button class="btn btn-danger" style="padding: 8px 12px;" onclick="removeAction(${i})">Remove</button>
</div>
</div>
`).join('');
}
window.removeConclusion = function(index) {
currentMeeting.conclusions.splice(index, 1);
renderConclusions();
};
window.removeAction = function(index) {
currentMeeting.actions.splice(index, 1);
renderActions();
};
addConclusionBtn.addEventListener('click', () => {
if (!currentMeeting.conclusions) currentMeeting.conclusions = [];
currentMeeting.conclusions.push({ content: '' });
renderConclusions();
});
addActionBtn.addEventListener('click', () => {
if (!currentMeeting.actions) currentMeeting.actions = [];
currentMeeting.actions.push({ content: '', owner: '', due_date: null, status: 'Open' });
renderActions();
});
// === AI Summarize ===
summarizeBtn.addEventListener('click', async () => {
const transcript = getTranscript();
if (!transcript || transcript.trim() === '') {
alert('Please record or enter a transcript first.');
return;
}
summarizeBtn.disabled = true;
summarizeBtn.textContent = 'Summarizing...';
try {
const result = await summarizeTranscript(transcript);
if (result.conclusions && result.conclusions.length > 0) {
if (!currentMeeting.conclusions) currentMeeting.conclusions = [];
result.conclusions.forEach(c => {
currentMeeting.conclusions.push({ content: c });
});
}
if (result.action_items && result.action_items.length > 0) {
if (!currentMeeting.actions) currentMeeting.actions = [];
result.action_items.forEach(a => {
currentMeeting.actions.push({
content: a.content,
owner: a.owner || '',
due_date: a.due_date || null,
status: 'Open'
});
});
}
renderConclusions();
renderActions();
} catch (error) {
alert('Error summarizing: ' + error.message);
} finally {
summarizeBtn.disabled = false;
summarizeBtn.textContent = 'AI Summarize';
}
});
// === Save ===
saveBtn.addEventListener('click', async () => {
const conclusions = [];
document.querySelectorAll('.conclusion-content').forEach((el) => {
conclusions.push({ content: el.value });
});
const actions = [];
document.querySelectorAll('.action-content').forEach((el, i) => {
const statusEl = document.querySelector(`.action-status[data-index="${i}"]`);
const ownerEl = document.querySelector(`.action-owner[data-index="${i}"]`);
const dueEl = document.querySelector(`.action-due[data-index="${i}"]`);
actions.push({
content: el.value,
owner: ownerEl?.value || '',
due_date: dueEl?.value || null,
status: statusEl?.value || 'Open'
});
});
saveBtn.disabled = true;
saveBtn.textContent = 'Saving...';
try {
await updateMeeting(meetingId, {
transcript_blob: getTranscript(),
conclusions: conclusions,
actions: actions
});
alert('Meeting saved successfully!');
loadMeeting();
} catch (error) {
alert('Error saving: ' + error.message);
} finally {
saveBtn.disabled = false;
saveBtn.textContent = 'Save Changes';
}
});
// === Export ===
exportBtn.addEventListener('click', async (e) => {
e.preventDefault();
try {
const blob = await exportMeeting(meetingId);
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = `meeting_${currentMeeting.uuid}.xlsx`;
a.click();
URL.revokeObjectURL(url);
} catch (error) {
alert('Error exporting: ' + error.message);
}
});
// === Delete ===
deleteBtn.addEventListener('click', async (e) => {
e.preventDefault();
if (!confirm('Are you sure you want to delete this meeting? This cannot be undone.')) return;
try {
await deleteMeeting(meetingId);
alert('Meeting deleted.');
window.electronAPI.navigate('meetings');
} catch (error) {
alert('Error deleting: ' + error.message);
}
});
// === Back ===
backBtn.addEventListener('click', (e) => {
e.preventDefault();
window.electronAPI.navigate('meetings');
});
// === Initialize ===
if (meetingId) {
loadMeeting();
} else {
window.electronAPI.navigate('meetings');
}
</script>
</body>
</html>
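
Review note on the recording cycle in the page above: `startRecordingCycle()` starts the next `MediaRecorder` only after `transcribeCurrentSegment()` has resolved, so speech during transcription appears to go unrecorded. One possible alternative is to restart recording immediately and hand each finished segment to a queue that transcribes them one at a time, in order. The sketch below shows only that queue, under the assumption that each job is a function returning a promise. `createSerialQueue` is a hypothetical helper, not part of the commit.

```javascript
// Hypothetical helper: runs async jobs strictly one at a time, in submission order.
function createSerialQueue() {
  let tail = Promise.resolve();
  return {
    // Enqueue a job (a function returning a promise); resolves with that job's result.
    push(job) {
      const run = tail.then(() => job());
      // Swallow the rejection on the chain itself so one failed job does not block later ones.
      tail = run.catch(() => {});
      return run;
    },
    // Resolves once every job queued so far has settled.
    drain() {
      return tail;
    },
  };
}
```

With such a queue, the `onstop` handler could snapshot `currentChunks`, call `startRecordingCycle()` right away, and push a job that saves and transcribes the snapshot. Transcripts would still be appended in recording order because the jobs never overlap.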


@@ -0,0 +1,201 @@
<!DOCTYPE html>
<html lang="zh-TW">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Meeting Assistant - Meetings</title>
<link rel="stylesheet" href="../styles/main.css">
</head>
<body>
<header class="header">
<h1>Meeting Assistant</h1>
<nav class="header-nav">
<a href="#" id="new-meeting-btn">New Meeting</a>
<a href="#" id="logout-btn">Logout</a>
</nav>
</header>
<div class="container">
<div class="card">
<div class="card-header">
My Meetings
</div>
<div class="card-body" id="meetings-container">
<div class="loading">
<div class="spinner"></div>
</div>
</div>
</div>
</div>
<!-- New Meeting Modal -->
<div id="new-meeting-modal" class="modal-overlay hidden">
<div class="modal">
<div class="modal-header">
<h2>New Meeting</h2>
<button class="modal-close" id="close-modal">&times;</button>
</div>
<div class="modal-body">
<form id="new-meeting-form">
<div class="form-group">
<label for="subject">Subject *</label>
<input type="text" id="subject" name="subject" required>
</div>
<div class="form-group">
<label for="meeting_time">Date & Time *</label>
<input type="datetime-local" id="meeting_time" name="meeting_time" required>
</div>
<div class="form-group">
<label for="chairperson">Chairperson</label>
<input type="text" id="chairperson" name="chairperson">
</div>
<div class="form-group">
<label for="location">Location</label>
<input type="text" id="location" name="location">
</div>
<div class="form-group">
<label for="attendees">Attendees (comma separated)</label>
<input type="text" id="attendees" name="attendees" placeholder="email1@company.com, email2@company.com">
</div>
</form>
</div>
<div class="modal-footer">
<button class="btn btn-secondary" id="cancel-btn">Cancel</button>
<button class="btn btn-primary" id="create-btn">Create Meeting</button>
</div>
</div>
</div>
<script type="module">
import { getMeetings, createMeeting, clearToken } from '../services/api.js';
const meetingsContainer = document.getElementById('meetings-container');
const newMeetingBtn = document.getElementById('new-meeting-btn');
const logoutBtn = document.getElementById('logout-btn');
const modal = document.getElementById('new-meeting-modal');
const closeModalBtn = document.getElementById('close-modal');
const cancelBtn = document.getElementById('cancel-btn');
const createBtn = document.getElementById('create-btn');
const form = document.getElementById('new-meeting-form');
async function loadMeetings() {
try {
const meetings = await getMeetings();
if (meetings.length === 0) {
meetingsContainer.innerHTML = '<p class="text-center">No meetings yet. Create your first meeting!</p>';
return;
}
meetingsContainer.innerHTML = `
<ul class="meeting-list">
${meetings.map(m => `
<li class="meeting-item" data-id="${m.meeting_id}">
<div class="meeting-info">
<h3>${escapeHtml(m.subject)}</h3>
<p>${new Date(m.meeting_time).toLocaleString('zh-TW')} | ${escapeHtml(m.chairperson || 'No chairperson')}</p>
</div>
<div class="meeting-actions">
<button class="btn btn-primary btn-small view-btn">View</button>
</div>
</li>
`).join('')}
</ul>
`;
document.querySelectorAll('.view-btn').forEach(btn => {
btn.addEventListener('click', (e) => {
e.stopPropagation();
const id = btn.closest('.meeting-item').dataset.id;
localStorage.setItem('currentMeetingId', id);
window.electronAPI.navigate('meeting-detail');
});
});
document.querySelectorAll('.meeting-item').forEach(item => {
item.addEventListener('click', () => {
localStorage.setItem('currentMeetingId', item.dataset.id);
window.electronAPI.navigate('meeting-detail');
});
});
} catch (error) {
meetingsContainer.innerHTML = `<div class="alert alert-error">${escapeHtml(error.message)}</div>`;
}
}
function escapeHtml(text) {
const div = document.createElement('div');
div.textContent = text;
return div.innerHTML;
}
function openModal() {
modal.classList.remove('hidden');
// Set default datetime to now
const now = new Date();
now.setMinutes(now.getMinutes() - now.getTimezoneOffset());
document.getElementById('meeting_time').value = now.toISOString().slice(0, 16);
// Set default recorder to current user
document.getElementById('recorder') && (document.getElementById('recorder').value = localStorage.getItem('userEmail') || '');
}
function closeModal() {
modal.classList.add('hidden');
form.reset();
}
newMeetingBtn.addEventListener('click', (e) => {
e.preventDefault();
openModal();
});
closeModalBtn.addEventListener('click', closeModal);
cancelBtn.addEventListener('click', closeModal);
createBtn.addEventListener('click', async () => {
const formData = new FormData(form);
const meeting = {
subject: formData.get('subject'),
meeting_time: formData.get('meeting_time'),
chairperson: formData.get('chairperson'),
location: formData.get('location'),
attendees: formData.get('attendees'),
recorder: localStorage.getItem('userEmail') || '',
};
if (!meeting.subject || !meeting.meeting_time) {
alert('Please fill in required fields');
return;
}
createBtn.disabled = true;
createBtn.textContent = 'Creating...';
try {
const created = await createMeeting(meeting);
closeModal();
localStorage.setItem('currentMeetingId', created.meeting_id);
window.electronAPI.navigate('meeting-detail');
} catch (error) {
alert('Error creating meeting: ' + error.message);
} finally {
createBtn.disabled = false;
createBtn.textContent = 'Create Meeting';
}
});
logoutBtn.addEventListener('click', (e) => {
e.preventDefault();
clearToken();
window.electronAPI.navigate('login');
});
// Close modal when clicking outside
modal.addEventListener('click', (e) => {
if (e.target === modal) closeModal();
});
// Load meetings on page load
loadMeetings();
</script>
</body>
</html>
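
Review note on `openModal()` in the page above: it pre-fills the `datetime-local` input by shifting the current time by the timezone offset before calling `toISOString()`. That works because the input expects local wall-clock time while `toISOString()` always prints UTC. The sketch below restates the same arithmetic as a pure function. `toDatetimeLocalValue` is a hypothetical name, and the explicit offset parameter is an assumption added so the result does not depend on the machine's timezone.

```javascript
// Hypothetical helper: formats a Date as the "YYYY-MM-DDTHH:mm" string that
// <input type="datetime-local"> expects, in local wall-clock time.
// tzOffsetMinutes follows Date.prototype.getTimezoneOffset(): UTC minus local,
// in minutes (for example -480 for UTC+8).
function toDatetimeLocalValue(date, tzOffsetMinutes = date.getTimezoneOffset()) {
  // Shift the instant so that its UTC fields equal the local wall-clock fields.
  const shifted = new Date(date.getTime() - tzOffsetMinutes * 60 * 1000);
  // "2025-12-10T20:17:44.000Z" -> "2025-12-10T20:17"
  return shifted.toISOString().slice(0, 16);
}
```

In the page, `document.getElementById('meeting_time').value = toDatetimeLocalValue(new Date())` would produce the same default as the existing code.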

32
client/src/preload.js Normal file

@@ -0,0 +1,32 @@
const { contextBridge, ipcRenderer } = require("electron");
contextBridge.exposeInMainWorld("electronAPI", {
// Navigation
navigate: (page) => ipcRenderer.invoke("navigate", page),
// Sidecar status
getSidecarStatus: () => ipcRenderer.invoke("get-sidecar-status"),
// === Streaming Mode APIs ===
startRecordingStream: () => ipcRenderer.invoke("start-recording-stream"),
streamAudioChunk: (base64Audio) => ipcRenderer.invoke("stream-audio-chunk", base64Audio),
stopRecordingStream: () => ipcRenderer.invoke("stop-recording-stream"),
// Streaming events
onTranscriptionSegment: (callback) => {
ipcRenderer.on("transcription-segment", (event, segment) => callback(segment));
},
onStreamStarted: (callback) => {
ipcRenderer.on("stream-started", (event, data) => callback(data));
},
onStreamStopped: (callback) => {
ipcRenderer.on("stream-stopped", (event, data) => callback(data));
},
// === Legacy File-based APIs (fallback) ===
saveAudioFile: (arrayBuffer) => ipcRenderer.invoke("save-audio-file", arrayBuffer),
transcribeAudio: (filePath) => ipcRenderer.invoke("transcribe-audio", filePath),
onTranscriptionResult: (callback) => {
ipcRenderer.on("transcription-result", (event, text) => callback(text));
},
});
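
Review note on the event APIs in `preload.js` above: each `on...` function wraps the callback in an anonymous listener, so the caller cannot remove it later. A common alternative is to return an unsubscribe function. The sketch below shows that pattern, with a plain Node.js `EventEmitter` standing in for `ipcRenderer` so it runs outside Electron. `subscribe` and `fakeIpc` are hypothetical names, not part of the commit.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: registers a listener and returns a function that removes it.
// `emitter` stands in for ipcRenderer, which offers the same on/removeListener methods.
function subscribe(emitter, channel, callback) {
  // Drop the event object and pass only the payload, as the preload script does.
  const listener = (_event, payload) => callback(payload);
  emitter.on(channel, listener);
  return () => emitter.removeListener(channel, listener);
}

// Stand-in emitter so the sketch can be exercised without Electron.
const fakeIpc = new EventEmitter();
const received = [];
const unsubscribe = subscribe(fakeIpc, "transcription-segment", (seg) => received.push(seg.text));

fakeIpc.emit("transcription-segment", {}, { text: "first" });
unsubscribe();
fakeIpc.emit("transcription-segment", {}, { text: "second" }); // ignored after unsubscribe
```

In the preload script, the equivalent would be `onTranscriptionSegment: (callback) => subscribe(ipcRenderer, "transcription-segment", callback)`, assuming the renderer keeps the returned function and calls it when it no longer needs the events.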

149
client/src/services/api.js Normal file

@@ -0,0 +1,149 @@
const API_BASE_URL = "http://localhost:8000/api";
let authToken = null;
let tokenRefreshTimer = null;
export function setToken(token) {
authToken = token;
localStorage.setItem("authToken", token);
scheduleTokenRefresh();
}
export function getToken() {
if (!authToken) {
authToken = localStorage.getItem("authToken");
}
return authToken;
}
export function clearToken() {
authToken = null;
localStorage.removeItem("authToken");
if (tokenRefreshTimer) {
clearTimeout(tokenRefreshTimer);
}
}
function scheduleTokenRefresh() {
// Warn 1 hour before expiry (assuming a 24h token); no actual refresh happens yet
const refreshIn = 23 * 60 * 60 * 1000; // 23 hours
if (tokenRefreshTimer) {
clearTimeout(tokenRefreshTimer);
}
tokenRefreshTimer = setTimeout(async () => {
try {
// Re-login would require stored credentials
// For now, just notify user to re-login
console.warn("Token expiring soon, please re-login");
} catch (error) {
console.error("Token refresh failed:", error);
}
}, refreshIn);
}
async function request(endpoint, options = {}) {
  const url = `${API_BASE_URL}${endpoint}`;
  const headers = {
    "Content-Type": "application/json",
    ...options.headers,
  };
  const token = getToken();
  if (token) {
    headers["Authorization"] = `Bearer ${token}`;
  }
  const response = await fetch(url, {
    ...options,
    headers,
  });
  if (!response.ok) {
    // The error body may be missing or non-JSON; fall back to an empty object.
    const error = await response.json().catch(() => ({}));
    const detail = error.detail;
    if (response.status === 401 && detail?.error === "token_expired") {
      clearToken();
      window.electronAPI.navigate("login");
      throw new Error("Session expired, please login again");
    }
    // detail may be a string or an object such as { error: "..." };
    // passing an object to Error() would render as "[object Object]".
    const message = typeof detail === "string" ? detail : detail?.error;
    const fallback =
      response.status === 401 ? "Unauthorized" : `HTTP error ${response.status}`;
    throw new Error(message || fallback);
  }
  // Handle blob responses for export
  if (options.responseType === "blob") {
    return response.blob();
  }
  return response.json();
}
// Auth API
export async function login(email, password) {
const data = await request("/login", {
method: "POST",
body: JSON.stringify({ email, password }),
});
setToken(data.token);
localStorage.setItem("userEmail", data.email);
localStorage.setItem("userRole", data.role);
return data;
}
export async function getMe() {
return request("/me");
}
// Meetings API
export async function getMeetings() {
return request("/meetings");
}
export async function getMeeting(id) {
return request(`/meetings/${id}`);
}
export async function createMeeting(meeting) {
return request("/meetings", {
method: "POST",
body: JSON.stringify(meeting),
});
}
export async function updateMeeting(id, meeting) {
return request(`/meetings/${id}`, {
method: "PUT",
body: JSON.stringify(meeting),
});
}
export async function deleteMeeting(id) {
return request(`/meetings/${id}`, {
method: "DELETE",
});
}
export async function updateActionItem(meetingId, actionId, data) {
return request(`/meetings/${meetingId}/actions/${actionId}`, {
method: "PUT",
body: JSON.stringify(data),
});
}
// AI API
export async function summarizeTranscript(transcript) {
return request("/ai/summarize", {
method: "POST",
body: JSON.stringify({ transcript }),
});
}
// Export API
export async function exportMeeting(id) {
return request(`/meetings/${id}/export`, {
responseType: "blob",
});
}

client/src/styles/main.css Normal file

@@ -0,0 +1,462 @@
* {
box-sizing: border-box;
margin: 0;
padding: 0;
}
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Oxygen,
Ubuntu, Cantarell, "Open Sans", "Helvetica Neue", sans-serif;
background-color: #f5f5f5;
color: #333;
line-height: 1.6;
}
.container {
max-width: 1200px;
margin: 0 auto;
padding: 20px;
}
/* Header */
.header {
background-color: #2c3e50;
color: white;
padding: 15px 20px;
display: flex;
justify-content: space-between;
align-items: center;
}
.header h1 {
font-size: 1.5rem;
}
.header-nav {
display: flex;
gap: 15px;
}
.header-nav a {
color: white;
text-decoration: none;
padding: 8px 16px;
border-radius: 4px;
transition: background-color 0.2s;
}
.header-nav a:hover {
background-color: #34495e;
}
/* Login Page */
.login-container {
display: flex;
justify-content: center;
align-items: center;
min-height: 100vh;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
}
.login-box {
background: white;
padding: 40px;
border-radius: 10px;
box-shadow: 0 10px 40px rgba(0, 0, 0, 0.2);
width: 100%;
max-width: 400px;
}
.login-box h1 {
text-align: center;
margin-bottom: 30px;
color: #333;
}
/* Forms */
.form-group {
margin-bottom: 20px;
}
.form-group label {
display: block;
margin-bottom: 5px;
font-weight: 600;
color: #555;
}
.form-group input,
.form-group textarea,
.form-group select {
width: 100%;
padding: 12px;
border: 1px solid #ddd;
border-radius: 6px;
font-size: 1rem;
transition: border-color 0.2s;
}
.form-group input:focus,
.form-group textarea:focus,
.form-group select:focus {
outline: none;
border-color: #667eea;
}
/* Buttons */
.btn {
display: inline-block;
padding: 12px 24px;
border: none;
border-radius: 6px;
font-size: 1rem;
cursor: pointer;
text-decoration: none;
text-align: center;
transition: all 0.2s;
}
.btn-primary {
background-color: #667eea;
color: white;
}
.btn-primary:hover {
background-color: #5a6fd6;
}
.btn-secondary {
background-color: #6c757d;
color: white;
}
.btn-secondary:hover {
background-color: #5a6268;
}
.btn-danger {
background-color: #dc3545;
color: white;
}
.btn-danger:hover {
background-color: #c82333;
}
.btn-success {
background-color: #28a745;
color: white;
}
.btn-success:hover {
background-color: #218838;
}
.btn-full {
width: 100%;
}
/* Cards */
.card {
background: white;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
overflow: hidden;
}
.card-header {
padding: 15px 20px;
background-color: #f8f9fa;
border-bottom: 1px solid #dee2e6;
font-weight: 600;
}
.card-body {
padding: 20px;
}
/* Meeting List */
.meeting-list {
list-style: none;
}
.meeting-item {
display: flex;
justify-content: space-between;
align-items: center;
padding: 15px 20px;
border-bottom: 1px solid #eee;
transition: background-color 0.2s;
cursor: pointer;
}
.meeting-item:hover {
background-color: #f8f9fa;
}
.meeting-item:last-child {
border-bottom: none;
}
.meeting-info h3 {
margin-bottom: 5px;
color: #333;
}
.meeting-info p {
color: #666;
font-size: 0.9rem;
}
.meeting-actions {
display: flex;
gap: 10px;
}
/* Dual Panel Layout */
.dual-panel {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 20px;
height: calc(100vh - 150px);
}
.panel {
background: white;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0, 0, 0, 0.1);
display: flex;
flex-direction: column;
overflow: hidden;
}
.panel-header {
padding: 15px 20px;
background-color: #f8f9fa;
border-bottom: 1px solid #dee2e6;
font-weight: 600;
display: flex;
justify-content: space-between;
align-items: center;
}
.panel-body {
flex: 1;
padding: 20px;
overflow-y: auto;
}
/* Transcript */
.transcript-content {
white-space: pre-wrap;
font-family: "Courier New", monospace;
font-size: 0.95rem;
line-height: 1.8;
}
/* Action Items */
.action-item {
padding: 15px;
border: 1px solid #dee2e6;
border-radius: 6px;
margin-bottom: 15px;
}
.action-item-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 10px;
}
.action-item-code {
font-weight: 600;
color: #667eea;
}
.action-item-status {
padding: 4px 10px;
border-radius: 20px;
font-size: 0.8rem;
font-weight: 600;
}
.status-open {
background-color: #e3f2fd;
color: #1976d2;
}
.status-in-progress {
background-color: #fff3e0;
color: #f57c00;
}
.status-done {
background-color: #e8f5e9;
color: #388e3c;
}
.status-delayed {
background-color: #ffebee;
color: #d32f2f;
}
/* Recording */
.recording-controls {
display: flex;
gap: 15px;
align-items: center;
padding: 15px 20px;
background-color: #f8f9fa;
border-top: 1px solid #dee2e6;
}
.recording-indicator {
display: flex;
align-items: center;
gap: 8px;
}
.recording-dot {
width: 12px;
height: 12px;
background-color: #dc3545;
border-radius: 50%;
animation: pulse 1s infinite;
}
@keyframes pulse {
0%, 100% {
opacity: 1;
}
50% {
opacity: 0.5;
}
}
/* Alerts */
.alert {
padding: 15px 20px;
border-radius: 6px;
margin-bottom: 20px;
}
.alert-error {
background-color: #ffebee;
color: #c62828;
border: 1px solid #ef9a9a;
}
.alert-success {
background-color: #e8f5e9;
color: #2e7d32;
border: 1px solid #a5d6a7;
}
/* Loading */
.loading {
display: flex;
justify-content: center;
align-items: center;
padding: 40px;
}
.spinner {
width: 40px;
height: 40px;
border: 4px solid #f3f3f3;
border-top: 4px solid #667eea;
border-radius: 50%;
animation: spin 1s linear infinite;
}
@keyframes spin {
0% {
transform: rotate(0deg);
}
100% {
transform: rotate(360deg);
}
}
/* Modal */
.modal-overlay {
position: fixed;
top: 0;
left: 0;
right: 0;
bottom: 0;
background-color: rgba(0, 0, 0, 0.5);
display: flex;
justify-content: center;
align-items: center;
z-index: 1000;
}
.modal {
background: white;
border-radius: 10px;
box-shadow: 0 10px 40px rgba(0, 0, 0, 0.3);
width: 100%;
max-width: 600px;
max-height: 80vh;
overflow-y: auto;
}
.modal-header {
padding: 20px;
border-bottom: 1px solid #dee2e6;
display: flex;
justify-content: space-between;
align-items: center;
}
.modal-header h2 {
margin: 0;
}
.modal-close {
background: none;
border: none;
font-size: 1.5rem;
cursor: pointer;
color: #666;
}
.modal-body {
padding: 20px;
}
.modal-footer {
padding: 20px;
border-top: 1px solid #dee2e6;
display: flex;
justify-content: flex-end;
gap: 10px;
}
/* Utility */
.text-center {
text-align: center;
}
.mt-10 {
margin-top: 10px;
}
.mt-20 {
margin-top: 20px;
}
.mb-10 {
margin-bottom: 10px;
}
.mb-20 {
margin-bottom: 20px;
}
.hidden {
display: none !important;
}

openspec/AGENTS.md Normal file

@@ -0,0 +1,456 @@
# OpenSpec Instructions
Instructions for AI coding assistants using OpenSpec for spec-driven development.
## TL;DR Quick Checklist
- Search existing work: `openspec spec list --long`, `openspec list` (use `rg` only for full-text search)
- Decide scope: new capability vs modify existing capability
- Pick a unique `change-id`: kebab-case, verb-led (`add-`, `update-`, `remove-`, `refactor-`)
- Scaffold: `proposal.md`, `tasks.md`, `design.md` (only if needed), and delta specs per affected capability
- Write deltas: use `## ADDED|MODIFIED|REMOVED|RENAMED Requirements`; include at least one `#### Scenario:` per requirement
- Validate: `openspec validate [change-id] --strict` and fix issues
- Request approval: Do not start implementation until proposal is approved
## Three-Stage Workflow
### Stage 1: Creating Changes
Create proposal when you need to:
- Add features or functionality
- Make breaking changes (API, schema)
- Change architecture or patterns
- Optimize performance (changes behavior)
- Update security patterns
Triggers (examples):
- "Help me create a change proposal"
- "Help me plan a change"
- "Help me create a proposal"
- "I want to create a spec proposal"
- "I want to create a spec"
Loose matching guidance:
- Contains one of: `proposal`, `change`, `spec`
- With one of: `create`, `plan`, `make`, `start`, `help`
Skip proposal for:
- Bug fixes (restore intended behavior)
- Typos, formatting, comments
- Dependency updates (non-breaking)
- Configuration changes
- Tests for existing behavior
**Workflow**
1. Review `openspec/project.md`, `openspec list`, and `openspec list --specs` to understand current context.
2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, optional `design.md`, and spec deltas under `openspec/changes/<id>/`.
3. Draft spec deltas using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement.
4. Run `openspec validate <id> --strict` and resolve any issues before sharing the proposal.
### Stage 2: Implementing Changes
Track these steps as TODOs and complete them one by one.
1. **Read proposal.md** - Understand what's being built
2. **Read design.md** (if exists) - Review technical decisions
3. **Read tasks.md** - Get implementation checklist
4. **Implement tasks sequentially** - Complete in order
5. **Confirm completion** - Ensure every item in `tasks.md` is finished before updating statuses
6. **Update checklist** - After all work is done, set every task to `- [x]` so the list reflects reality
7. **Approval gate** - Do not start implementation until the proposal is reviewed and approved
### Stage 3: Archiving Changes
After deployment, create separate PR to:
- Move `changes/[name]/` → `changes/archive/YYYY-MM-DD-[name]/`
- Update `specs/` if capabilities changed
- Use `openspec archive <change-id> --skip-specs --yes` for tooling-only changes (always pass the change ID explicitly)
- Run `openspec validate --strict` to confirm the archived change passes checks
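
The archive location in the first bullet follows a fixed naming pattern. A minimal Python sketch of that path computation, assuming the date prefix is the day the change is archived; `archive_path` is a hypothetical helper for illustration and is not part of the `openspec` CLI, which performs the move itself through `openspec archive`:

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(change_id: str, archived_on: date) -> PurePosixPath:
    """Return changes/archive/YYYY-MM-DD-<change-id> for a completed change."""
    # isoformat() yields YYYY-MM-DD, matching the prefix used in the bullet above.
    return PurePosixPath("changes") / "archive" / f"{archived_on.isoformat()}-{change_id}"
```

This only shows where a change ends up; prefer the CLI command for the actual move so that spec updates and validation stay in step.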
## Before Any Task
**Context Checklist:**
- [ ] Read relevant specs in `specs/[capability]/spec.md`
- [ ] Check pending changes in `changes/` for conflicts
- [ ] Read `openspec/project.md` for conventions
- [ ] Run `openspec list` to see active changes
- [ ] Run `openspec list --specs` to see existing capabilities
**Before Creating Specs:**
- Always check if capability already exists
- Prefer modifying existing specs over creating duplicates
- Use `openspec show [spec]` to review current state
- If request is ambiguous, ask 1-2 clarifying questions before scaffolding
### Search Guidance
- Enumerate specs: `openspec spec list --long` (or `--json` for scripts)
- Enumerate changes: `openspec list` (or `openspec change list --json` - deprecated but available)
- Show details:
  - Spec: `openspec show <spec-id> --type spec` (use `--json` for filters)
  - Change: `openspec show <change-id> --json --deltas-only`
- Full-text search (use ripgrep): `rg -n "Requirement:|Scenario:" openspec/specs`
## Quick Start
### CLI Commands
```bash
# Essential commands
openspec list # List active changes
openspec list --specs # List specifications
openspec show [item] # Display change or spec
openspec validate [item] # Validate changes or specs
openspec archive <change-id> [--yes|-y] # Archive after deployment (add --yes for non-interactive runs)
# Project management
openspec init [path] # Initialize OpenSpec
openspec update [path] # Update instruction files
# Interactive mode
openspec show # Prompts for selection
openspec validate # Bulk validation mode
# Debugging
openspec show [change] --json --deltas-only
openspec validate [change] --strict
```
### Command Flags
- `--json` - Machine-readable output
- `--type change|spec` - Disambiguate items
- `--strict` - Comprehensive validation
- `--no-interactive` - Disable prompts
- `--skip-specs` - Archive without spec updates
- `--yes`/`-y` - Skip confirmation prompts (non-interactive archive)
## Directory Structure
```
openspec/
├── project.md              # Project conventions
├── specs/                  # Current truth - what IS built
│   └── [capability]/       # Single focused capability
│       ├── spec.md         # Requirements and scenarios
│       └── design.md       # Technical patterns
├── changes/                # Proposals - what SHOULD change
│   ├── [change-name]/
│   │   ├── proposal.md     # Why, what, impact
│   │   ├── tasks.md        # Implementation checklist
│   │   ├── design.md       # Technical decisions (optional; see criteria)
│   │   └── specs/          # Delta changes
│   │       └── [capability]/
│   │           └── spec.md # ADDED/MODIFIED/REMOVED
│   └── archive/            # Completed changes
```
## Creating Change Proposals
### Decision Tree
```
New request?
├─ Bug fix restoring spec behavior? → Fix directly
├─ Typo/format/comment? → Fix directly
├─ New feature/capability? → Create proposal
├─ Breaking change? → Create proposal
├─ Architecture change? → Create proposal
└─ Unclear? → Create proposal (safer)
```
### Proposal Structure
1. **Create directory:** `changes/[change-id]/` (kebab-case, verb-led, unique)
2. **Write proposal.md:**
```markdown
# Change: [Brief description of change]
## Why
[1-2 sentences on problem/opportunity]
## What Changes
- [Bullet list of changes]
- [Mark breaking changes with **BREAKING**]
## Impact
- Affected specs: [list capabilities]
- Affected code: [key files/systems]
```
3. **Create spec deltas:** `specs/[capability]/spec.md`
```markdown
## ADDED Requirements
### Requirement: New Feature
The system SHALL provide...
#### Scenario: Success case
- **WHEN** user performs action
- **THEN** expected result
## MODIFIED Requirements
### Requirement: Existing Feature
[Complete modified requirement]
## REMOVED Requirements
### Requirement: Old Feature
**Reason**: [Why removing]
**Migration**: [How to handle]
```
If multiple capabilities are affected, create multiple delta files under `changes/[change-id]/specs/<capability>/spec.md`—one per capability.
4. **Create tasks.md:**
```markdown
## 1. Implementation
- [ ] 1.1 Create database schema
- [ ] 1.2 Implement API endpoint
- [ ] 1.3 Add frontend component
- [ ] 1.4 Write tests
```
5. **Create design.md when needed:**
Create `design.md` if any of the following apply; otherwise omit it:
- Cross-cutting change (multiple services/modules) or a new architectural pattern
- New external dependency or significant data model changes
- Security, performance, or migration complexity
- Ambiguity that benefits from technical decisions before coding
Minimal `design.md` skeleton:
```markdown
## Context
[Background, constraints, stakeholders]
## Goals / Non-Goals
- Goals: [...]
- Non-Goals: [...]
## Decisions
- Decision: [What and why]
- Alternatives considered: [Options + rationale]
## Risks / Trade-offs
- [Risk] → Mitigation
## Migration Plan
[Steps, rollback]
## Open Questions
- [...]
```
## Spec File Format
### Critical: Scenario Formatting
**CORRECT** (use #### headers):
```markdown
#### Scenario: User login success
- **WHEN** valid credentials provided
- **THEN** return JWT token
```
**WRONG** (don't use bullets or bold):
```markdown
- **Scenario: User login** ❌
**Scenario**: User login ❌
### Scenario: User login ❌
```
Every requirement MUST have at least one scenario.
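
As an illustration of the rule above, the following Python sketch scans delta text and reports requirements that have no correctly formatted `#### Scenario:` header. It is not the validator behind `openspec validate`; the function name and the regular expressions are assumptions made for this example.

```python
import re

REQUIREMENT_RE = re.compile(r"^### Requirement:\s*(.+?)\s*$")
SCENARIO_RE = re.compile(r"^#### Scenario:\s*\S")


def requirements_missing_scenarios(spec_text: str) -> list[str]:
    """Return names of requirements with no '#### Scenario:' header beneath them."""
    missing: list[str] = []
    current = None        # name of the requirement currently being scanned
    has_scenario = False
    for line in spec_text.splitlines():
        match = REQUIREMENT_RE.match(line)
        if match or line.startswith("## "):
            # A new requirement or a new delta section closes the previous one.
            if current is not None and not has_scenario:
                missing.append(current)
            current = match.group(1) if match else None
            has_scenario = False
        elif current is not None and SCENARIO_RE.match(line):
            has_scenario = True
    if current is not None and not has_scenario:
        missing.append(current)
    return missing
```

A bullet such as `- **Scenario: User login**` is not counted, which mirrors the "WRONG" forms listed above.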
### Requirement Wording
- Use SHALL/MUST for normative requirements (avoid should/may unless intentionally non-normative)
### Delta Operations
- `## ADDED Requirements` - New capabilities
- `## MODIFIED Requirements` - Changed behavior
- `## REMOVED Requirements` - Deprecated features
- `## RENAMED Requirements` - Name changes
Headers matched with `trim(header)` - whitespace ignored.
#### When to use ADDED vs MODIFIED
- ADDED: Introduces a new capability or sub-capability that can stand alone as a requirement. Prefer ADDED when the change is orthogonal (e.g., adding "Slash Command Configuration") rather than altering the semantics of an existing requirement.
- MODIFIED: Changes the behavior, scope, or acceptance criteria of an existing requirement. Always paste the full, updated requirement content (header + all scenarios). The archiver will replace the entire requirement with what you provide here; partial deltas will drop previous details.
- RENAMED: Use when only the name changes. If you also change behavior, use RENAMED (name) plus MODIFIED (content) referencing the new name.
Common pitfall: Using MODIFIED to add a new concern without including the previous text. This causes loss of detail at archive time. If you aren't explicitly changing the existing requirement, add a new requirement under ADDED instead.
Authoring a MODIFIED requirement correctly:
1) Locate the existing requirement in `openspec/specs/<capability>/spec.md`.
2) Copy the entire requirement block (from `### Requirement: ...` through its scenarios).
3) Paste it under `## MODIFIED Requirements` and edit to reflect the new behavior.
4) Ensure the header text matches exactly (whitespace-insensitive) and keep at least one `#### Scenario:`.
Example for RENAMED:
```markdown
## RENAMED Requirements
- FROM: `### Requirement: Login`
- TO: `### Requirement: User Authentication`
```
## Troubleshooting
### Common Errors
**"Change must have at least one delta"**
- Check `changes/[name]/specs/` exists with .md files
- Verify files have operation prefixes (## ADDED Requirements)
**"Requirement must have at least one scenario"**
- Check scenarios use `#### Scenario:` format (4 hashtags)
- Don't use bullet points or bold for scenario headers
**Silent scenario parsing failures**
- Exact format required: `#### Scenario: Name`
- Debug with: `openspec show [change] --json --deltas-only`
### Validation Tips
```bash
# Always use strict mode for comprehensive checks
openspec validate [change] --strict
# Debug delta parsing
openspec show [change] --json | jq '.deltas'
# Check specific requirement
openspec show [spec] --json -r 1
```
## Happy Path Script
```bash
# 1) Explore current state
openspec spec list --long
openspec list
# Optional full-text search:
# rg -n "Requirement:|Scenario:" openspec/specs
# rg -n "^#|Requirement:" openspec/changes
# 2) Choose change id and scaffold
CHANGE=add-two-factor-auth
mkdir -p openspec/changes/$CHANGE/specs/auth
printf "## Why\n...\n\n## What Changes\n- ...\n\n## Impact\n- ...\n" > openspec/changes/$CHANGE/proposal.md
printf "## 1. Implementation\n- [ ] 1.1 ...\n" > openspec/changes/$CHANGE/tasks.md
# 3) Add deltas (example)
cat > openspec/changes/$CHANGE/specs/auth/spec.md << 'EOF'
## ADDED Requirements
### Requirement: Two-Factor Authentication
Users MUST provide a second factor during login.
#### Scenario: OTP required
- **WHEN** valid credentials are provided
- **THEN** an OTP challenge is required
EOF
# 4) Validate
openspec validate $CHANGE --strict
```
## Multi-Capability Example
```
openspec/changes/add-2fa-notify/
├── proposal.md
├── tasks.md
└── specs/
    ├── auth/
    │   └── spec.md   # ADDED: Two-Factor Authentication
    └── notifications/
        └── spec.md   # ADDED: OTP email notification
```
auth/spec.md
```markdown
## ADDED Requirements
### Requirement: Two-Factor Authentication
...
```
notifications/spec.md
```markdown
## ADDED Requirements
### Requirement: OTP Email Notification
...
```
## Best Practices
### Simplicity First
- Default to <100 lines of new code
- Single-file implementations until proven insufficient
- Avoid frameworks without clear justification
- Choose boring, proven patterns
### Complexity Triggers
Only add complexity with:
- Performance data showing current solution too slow
- Concrete scale requirements (>1000 users, >100MB data)
- Multiple proven use cases requiring abstraction
### Clear References
- Use `file.ts:42` format for code locations
- Reference specs as `specs/auth/spec.md`
- Link related changes and PRs
### Capability Naming
- Use verb-noun: `user-auth`, `payment-capture`
- Single purpose per capability
- 10-minute understandability rule
- Split if description needs "AND"
### Change ID Naming
- Use kebab-case, short and descriptive: `add-two-factor-auth`
- Prefer verb-led prefixes: `add-`, `update-`, `remove-`, `refactor-`
- Ensure uniqueness; if taken, append `-2`, `-3`, etc.
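
The naming rules above can be sketched in a few lines of Python. This is an illustration only: both helper names are hypothetical, and the check treats the "preferred" verb prefixes as mandatory, which is stricter than the guidance itself.

```python
import re

VERB_PREFIXES = ("add-", "update-", "remove-", "refactor-")
KEBAB_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_change_id(change_id: str) -> bool:
    """True if the id is kebab-case and starts with one of the listed verb prefixes."""
    return bool(KEBAB_RE.match(change_id)) and change_id.startswith(VERB_PREFIXES)


def unique_change_id(change_id: str, taken: set[str]) -> str:
    """Append -2, -3, ... until the id no longer collides with an existing one."""
    if change_id not in taken:
        return change_id
    suffix = 2
    while f"{change_id}-{suffix}" in taken:
        suffix += 1
    return f"{change_id}-{suffix}"
```

In practice `taken` would be built from the directory names under `openspec/changes/`, including the archive.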
## Tool Selection Guide
| Task | Tool | Why |
|------|------|-----|
| Find files by pattern | Glob | Fast pattern matching |
| Search code content | Grep | Optimized regex search |
| Read specific files | Read | Direct file access |
| Explore unknown scope | Task | Multi-step investigation |
## Error Recovery
### Change Conflicts
1. Run `openspec list` to see active changes
2. Check for overlapping specs
3. Coordinate with change owners
4. Consider combining proposals
### Validation Failures
1. Run with `--strict` flag
2. Check JSON output for details
3. Verify spec file format
4. Ensure scenarios properly formatted
### Missing Context
1. Read project.md first
2. Check related specs
3. Review recent archives
4. Ask for clarification
## Quick Reference
### Stage Indicators
- `changes/` - Proposed, not yet built
- `specs/` - Built and deployed
- `archive/` - Completed changes
### File Purposes
- `proposal.md` - Why and what
- `tasks.md` - Implementation steps
- `design.md` - Technical decisions
- `spec.md` - Requirements and behavior
### CLI Essentials
```bash
openspec list # What's in progress?
openspec show [item] # View details
openspec validate --strict # Is it correct?
openspec archive <change-id> [--yes|-y] # Mark complete (add --yes for automation)
```
Remember: Specs are truth. Changes are proposals. Keep them in sync.


@@ -0,0 +1,132 @@
## Context
Building a meeting knowledge management system for enterprise users. The system must support offline transcription on standard hardware (i5/8GB), integrate with existing company authentication, and provide AI-powered summarization via Dify LLM.
**Stakeholders**: Enterprise meeting participants, meeting recorders, admin users (ymirliu@panjit.com.tw)
**Constraints**:
- Must run faster-whisper int8 on i5/8GB laptop
- DB credentials and API keys must stay server-side (security)
- All database tables prefixed with `meeting_`
- Output must support Traditional Chinese (繁體中文)
## Goals / Non-Goals
**Goals**:
- Deliver working MVP with all six capabilities
- Secure architecture with secrets in middleware only
- Offline-capable transcription
- Structured output with trackable action items
**Non-Goals**:
- Multi-language support beyond Traditional Chinese
- Real-time collaborative editing
- Mobile client
- Custom LLM model training
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                         Electron Client                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Auth UI   │  │ Meeting UI  │  │  Transcription Engine   │  │
│  │   (Login)   │  │ (CRUD/Edit) │  │ (faster-whisper+OpenCC) │  │
│  └──────┬──────┘  └──────┬──────┘  └────────────┬────────────┘  │
└─────────┼────────────────┼──────────────────────┼───────────────┘
          │                │                      │
          │ HTTP           │ HTTP                 │ Local only
          ▼                ▼                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FastAPI Middleware Server                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────┐  │
│  │ Auth Proxy  │  │Meeting CRUD │  │ Dify Proxy  │  │ Export │  │
│  │ POST /login │  │POST/GET/... │  │POST /ai/... │  │GET /:id│  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └───┬────┘  │
└─────────┼────────────────┼────────────────┼─────────────┼───────┘
          │                │                │             │
          ▼                ▼                ▼             │
  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐      │
  │ PJ-Auth API  │ │    MySQL     │ │   Dify LLM   │      │
  │   (Vercel)   │ │ (theaken.com)│ │(theaken.com) │      │
  └──────────────┘ └──────────────┘ └──────────────┘      │
                                     ┌────────────────────┘
                              ┌──────────────┐
                              │Excel Template│
                              │  (openpyxl)  │
                              └──────────────┘
```
## Decisions
### Decision 1: Three-tier architecture with middleware
**Choice**: All external services accessed through FastAPI middleware
**Rationale**: Security requirement - DB credentials and API keys cannot be in Electron client
**Alternatives considered**:
- Direct client-to-service: Rejected due to credential exposure risk
- Serverless functions: More complex deployment for similar security
### Decision 2: Edge transcription in Electron
**Choice**: Run faster-whisper locally via Python sidecar (PyInstaller)
**Rationale**: Offline capability requirement; network latency unacceptable for real-time transcription
**Alternatives considered**:
- Cloud STT (Google/Azure): Requires network, latency issues
- WebAssembly whisper: Not mature enough for production
### Decision 3: MySQL with prefixed tables
**Choice**: Use shared MySQL instance with `meeting_` prefix
**Rationale**: Leverage existing infrastructure; prefix ensures isolation
**Alternatives considered**:
- Dedicated database: Overhead not justified for MVP
- SQLite: Embedded, file-based database; not suited to concurrent multi-user access through a shared server
### Decision 4: Dify for LLM summarization
**Choice**: Use company Dify instance for AI features
**Rationale**: Already available infrastructure; structured JSON output support
**Alternatives considered**:
- Direct OpenAI API: Additional cost, no existing infrastructure
- Local LLM: Hardware constraints (i5/8GB insufficient)
## Risks / Trade-offs
| Risk | Impact | Mitigation |
|------|--------|------------|
| faster-whisper performance on i5/8GB | High | Use int8 quantization; test on target hardware early |
| Dify timeout on long transcripts | Medium | Implement chunking; add timeout handling with retry |
| Token expiry during long meetings | Medium | Implement auto-refresh interceptor in client |
| Network failure during save | Medium | Client-side queue with retry; local draft storage |
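
The chunking mitigation for long transcripts could look roughly like the Python sketch below. It is an assumption-laden illustration, not the project's implementation: `chunk_transcript` and the `max_chars` default are hypothetical, and splitting on `.` is naive (it also breaks inside decimals such as `3.5`).

```python
import re

# Zero-width split point right after Chinese or Western sentence-ending punctuation.
SENTENCE_END_RE = re.compile(r"(?<=[。！？.!?])")


def chunk_transcript(text: str, max_chars: int = 2000) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized chunk.
    """
    sentences = [s for s in SENTENCE_END_RE.split(text) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be summarized separately and the partial results merged; how the merge is done is left open here.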
## Data Model
```sql
-- Tables all prefixed with meeting_
meeting_users (user_id, email, display_name, role, created_at)
meeting_records (meeting_id, uuid, subject, meeting_time, location,
chairperson, recorder, attendees, transcript_blob,
created_by, created_at)
meeting_conclusions (conclusion_id, meeting_id, content, system_code)
meeting_action_items (action_id, meeting_id, content, owner, due_date,
status, system_code)
```
**ID Formats**:
- Conclusions: `C-YYYYMMDD-XX` (e.g., C-20251210-01)
- Action Items: `A-YYYYMMDD-XX` (e.g., A-20251210-01)
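
A minimal Python sketch of how such codes could be generated, assuming `XX` is a per-day sequence starting at 01. `next_system_code` is a hypothetical helper; the real backend may derive the sequence differently, for example from a database query.

```python
from datetime import date


def next_system_code(prefix: str, day: date, existing: list[str]) -> str:
    """Return the next code such as C-20251210-01 for the given day.

    prefix is "C" (conclusion) or "A" (action item); existing holds codes
    already issued, from which the highest same-day sequence is taken.
    """
    stem = f"{prefix}-{day:%Y%m%d}-"
    used = [int(code[len(stem):]) for code in existing if code.startswith(stem)]
    # Zero-pad to two digits; a 100th code on one day would widen to three digits.
    return f"{stem}{max(used, default=0) + 1:02d}"
```

Under concurrent writes the sequence lookup and insert would need to happen in one transaction, or the column would need a unique constraint, to avoid duplicate codes.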
## API Endpoints
| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /api/login | Proxy auth to PJ-Auth API |
| GET | /api/meetings | List meetings (filterable) |
| POST | /api/meetings | Create meeting |
| GET | /api/meetings/:id | Get meeting details |
| PUT | /api/meetings/:id | Update meeting |
| DELETE | /api/meetings/:id | Delete meeting |
| POST | /api/ai/summarize | Send transcript to Dify |
| GET | /api/meetings/:id/export | Generate Excel report |
## Open Questions
- None currently - PRD and SDD provide sufficient detail for MVP implementation


@@ -0,0 +1,25 @@
# Change: Add Meeting Assistant MVP
## Why
Enterprise users spend significant time manually documenting meetings and tracking action items. This MVP delivers an end-to-end meeting knowledge management solution with offline transcription, AI-powered summarization, and structured tracking of conclusions and action items.
## What Changes
- **NEW** FastAPI middleware server with MySQL integration
- **NEW** Authentication proxy to company Auth API with admin role detection
- **NEW** Meeting CRUD operations with metadata management
- **NEW** Edge-based speech-to-text using faster-whisper (int8)
- **NEW** Dify LLM integration for intelligent summarization
- **NEW** Excel report generation from templates
## Impact
- Affected specs: middleware, authentication, meeting-management, transcription, ai-summarization, excel-export
- Affected code: New Python FastAPI backend, new Electron frontend
- External dependencies: PJ-Auth API, MySQL database, Dify LLM service
## Success Criteria
- Users can log in via company SSO
- Meetings can be created with required metadata (subject, time, chairperson, location, recorder, attendees)
- Speech-to-text works offline on i5/8GB hardware
- AI generates structured conclusions and action items from transcripts
- Action items have trackable status (Open/In Progress/Done/Delayed)
- Excel reports can be exported with all meeting data


@@ -0,0 +1,45 @@
## ADDED Requirements
### Requirement: Dify Integration
The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.
#### Scenario: Successful summarization
- **WHEN** user submits POST /api/ai/summarize with transcript text
- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items
#### Scenario: Dify timeout handling
- **WHEN** Dify API does not respond within timeout period
- **THEN** the server SHALL return HTTP 504 with a timeout error so that the client can retry
#### Scenario: Dify error handling
- **WHEN** Dify API returns error (500, rate limit, etc.)
- **THEN** the server SHALL return appropriate HTTP error with details
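
The three scenarios above amount to a mapping from upstream outcomes to HTTP statuses. The Python sketch below shows one such mapping with the network call injected as a callable so it stays self-contained. `summarize_via_dify`, the `call_dify` signature, the error keys and the choice of 502 for non-timeout failures are all assumptions for illustration; only the 504 on timeout comes from the spec.

```python
from typing import Callable


def summarize_via_dify(call_dify: Callable[[str, float], dict],
                       transcript: str,
                       timeout_s: float = 60.0) -> tuple[int, dict]:
    """Return (http_status, body) for a summarize request.

    call_dify is a hypothetical callable that sends the transcript to Dify
    and raises TimeoutError if no response arrives within timeout_s.
    """
    try:
        result = call_dify(transcript, timeout_s)
    except TimeoutError:
        # Spec: no response within the timeout period -> HTTP 504, client may retry.
        return 504, {"error": "dify_timeout", "retryable": True}
    except Exception as exc:
        # Any other upstream failure (500, rate limit, ...); 502 is an assumed choice.
        return 502, {"error": "dify_error", "detail": str(exc)}
    return 200, result
```

In a FastAPI handler the non-200 tuples would typically be turned into an `HTTPException` with the same status and detail.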
### Requirement: Structured Output Format
The AI summarization SHALL return structured data with conclusions and action items.
#### Scenario: Complete structured response
- **WHEN** transcript contains clear decisions and assignments
- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields
#### Scenario: Partial data extraction
- **WHEN** transcript lacks explicit owner or due_date for action items
- **THEN** those fields SHALL be empty strings allowing manual completion
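
One way to guarantee this shape is to normalize whatever Dify returns before sending it to the client. The Python sketch below is illustrative only: `normalize_summary` is a hypothetical helper, and it assumes the raw result is a dict whose `action_items` entries are dicts.

```python
def normalize_summary(raw: dict) -> dict:
    """Coerce a Dify result into the structured shape described above.

    Missing or null owner/due_date values become empty strings so the
    client can offer them for manual completion.
    """
    actions = []
    for item in raw.get("action_items") or []:
        actions.append({
            "content": item.get("content") or "",
            "owner": item.get("owner") or "",
            "due_date": item.get("due_date") or "",
        })
    return {
        "conclusions": list(raw.get("conclusions") or []),
        "action_items": actions,
    }
```

Conclusions are passed through unchanged because the spec does not fix their element type.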
### Requirement: Dify Prompt Configuration
The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.
#### Scenario: System prompt behavior
- **WHEN** transcript is sent to Dify
- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format
### Requirement: Manual Data Completion
The Electron client SHALL allow users to manually complete missing AI-extracted data.
#### Scenario: Fill missing owner
- **WHEN** AI returns action item without owner
- **THEN** user SHALL be able to select or type owner name in the UI
#### Scenario: Fill missing due date
- **WHEN** AI returns action item without due_date
- **THEN** user SHALL be able to select date using date picker


@@ -0,0 +1,42 @@
## ADDED Requirements
### Requirement: Login Proxy
The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.
#### Scenario: Successful login
- **WHEN** user submits valid credentials to POST /api/login
- **THEN** the server SHALL forward to Auth API and return the JWT token
#### Scenario: Admin role detection
- **WHEN** user logs in with email ymirliu@panjit.com.tw
- **THEN** the response JWT payload SHALL include role: "admin"
#### Scenario: Invalid credentials
- **WHEN** user submits invalid credentials
- **THEN** the server SHALL return HTTP 401 with error message from Auth API
### Requirement: Token Validation
The middleware server SHALL validate JWT tokens on protected endpoints.
#### Scenario: Valid token access
- **WHEN** request includes valid JWT in Authorization header
- **THEN** the request SHALL proceed to the endpoint handler
#### Scenario: Expired token
- **WHEN** request includes expired JWT
- **THEN** the server SHALL return HTTP 401 with "token_expired" error code
#### Scenario: Missing token
- **WHEN** request to protected endpoint lacks Authorization header
- **THEN** the server SHALL return HTTP 401 with "token_required" error code
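The three token scenarios can be sketched with the standard library alone. The spec does not say how tokens are signed, so this example assumes HS256 with a shared secret; the function name `check_authorization` and the `token_invalid` code are hypothetical, while `token_required` and `token_expired` come from the scenarios above.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple


def _b64url_decode(part: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def check_authorization(header: Optional[str], secret: bytes,
                        now: Optional[float] = None) -> Tuple[int, Optional[str]]:
    """Return (http_status, error_code); error_code is None for a valid token."""
    if not header or not header.startswith("Bearer "):
        return 401, "token_required"
    token = header[len("Bearer "):]
    try:
        head_b64, payload_b64, sig_b64 = token.split(".")
        signed = f"{head_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return 401, "token_invalid"  # hypothetical code, not in the spec
        payload = json.loads(_b64url_decode(payload_b64))
        expires_at = float(payload["exp"])
    except (ValueError, TypeError, KeyError):
        return 401, "token_invalid"
    current = time.time() if now is None else now
    if expires_at <= current:
        return 401, "token_expired"
    return 200, None
```

A production middleware would more likely rely on a maintained JWT library and would also check the `alg` header; the sketch only shows how the error codes relate to each failure.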
### Requirement: Token Auto-Refresh
The Electron client SHALL implement automatic token refresh before expiration.
#### Scenario: Proactive refresh
- **WHEN** token approaches expiration (within 5 minutes) during active session
- **THEN** the client SHALL request new token transparently without user interruption
#### Scenario: Refresh during long meeting
- **WHEN** user is in a meeting session lasting longer than token validity
- **THEN** the client SHALL maintain authentication through automatic refresh
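The proactive-refresh rule reduces to one comparison against the token's expiry time. A minimal sketch, written in Python here although the client itself is Electron; `should_refresh` is a hypothetical helper name, and the 5-minute window is taken from the scenario above.

```python
REFRESH_WINDOW_SECONDS = 5 * 60  # "within 5 minutes" from the scenario


def should_refresh(expires_at: float, now: float,
                   window: float = REFRESH_WINDOW_SECONDS) -> bool:
    """True once the token is inside the refresh window (or already expired)."""
    return expires_at - now <= window
```

A client timer could evaluate this check periodically during a meeting and request a new token as soon as it returns true, which also covers sessions that outlast a single token.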


@@ -0,0 +1,45 @@
## ADDED Requirements
### Requirement: Excel Report Generation
The middleware server SHALL generate Excel reports from meeting data using templates.
#### Scenario: Successful export
- **WHEN** user requests GET /api/meetings/:id/export
- **THEN** server SHALL generate Excel file and return as downloadable stream
#### Scenario: Export non-existent meeting
- **WHEN** user requests export for non-existent meeting ID
- **THEN** server SHALL return HTTP 404
### Requirement: Template-based Generation
The Excel export SHALL use openpyxl with template files.
#### Scenario: Placeholder replacement
- **WHEN** Excel is generated
- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data
#### Scenario: Dynamic row insertion
- **WHEN** meeting has multiple conclusions or action items
- **THEN** rows SHALL be dynamically inserted to accommodate all items
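Placeholder replacement can be sketched as a regular-expression substitution applied to every string cell. The helper names `fill_text` and `fill_worksheet` are hypothetical, and leaving unknown placeholders untouched is an assumption; `iter_rows()` and `cell.value` are the openpyxl worksheet and cell members this sketch relies on.

```python
import re
from typing import Any, Mapping

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_text(text: str, data: Mapping[str, Any]) -> str:
    """Replace each {{name}} with data[name]; unknown names are left untouched."""
    def _sub(match: "re.Match[str]") -> str:
        key = match.group(1)
        return str(data[key]) if key in data else match.group(0)
    return PLACEHOLDER.sub(_sub, text)


def fill_worksheet(ws: Any, data: Mapping[str, Any]) -> None:
    """Apply fill_text to every string cell of an openpyxl-style worksheet."""
    for row in ws.iter_rows():
        for cell in row:
            if isinstance(cell.value, str):
                cell.value = fill_text(cell.value, data)
```

Dynamic row insertion could build on openpyxl's `Worksheet.insert_rows(idx, amount)` before the conclusion and action item rows are written.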
### Requirement: Complete Data Inclusion
The exported Excel SHALL include all meeting metadata and AI-generated content.
#### Scenario: Full metadata export
- **WHEN** Excel is generated
- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees
#### Scenario: Conclusions export
- **WHEN** Excel is generated
- **THEN** all conclusions SHALL be listed with their system codes
#### Scenario: Action items export
- **WHEN** Excel is generated
- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code
### Requirement: Template Management
Admin users SHALL be able to manage Excel templates.
#### Scenario: Admin template access
- **WHEN** admin user accesses template management
- **THEN** they SHALL be able to upload, view, and update Excel templates


@@ -0,0 +1,71 @@
## ADDED Requirements
### Requirement: Create Meeting
The system SHALL allow users to create meetings with required metadata.
#### Scenario: Create meeting with all fields
- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned
#### Scenario: Create meeting with missing required fields
- **WHEN** user submits POST /api/meetings without subject or meeting_time
- **THEN** the server SHALL return HTTP 400 with validation error details
#### Scenario: Recorder defaults to current user
- **WHEN** user creates meeting without specifying recorder
- **THEN** the recorder field SHALL default to the logged-in user's email
### Requirement: List Meetings
The system SHALL allow users to retrieve a list of meetings.
#### Scenario: List all meetings for admin
- **WHEN** admin user requests GET /api/meetings
- **THEN** all meetings SHALL be returned
#### Scenario: List meetings for regular user
- **WHEN** regular user requests GET /api/meetings
- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned
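The two listing scenarios describe a role-dependent filter. A minimal in-memory sketch, assuming each meeting is a dict with hypothetical `created_by`, `recorder` and `attendees` keys; a real implementation would more likely express the same condition in the SQL query.

```python
from typing import Dict, Iterable, List


def visible_meetings(meetings: Iterable[Dict], email: str, role: str) -> List[Dict]:
    """Admins see everything; other users see meetings they are involved in."""
    if role == "admin":
        return list(meetings)
    return [
        m for m in meetings
        if email in (m.get("created_by"), m.get("recorder"))
        or email in m.get("attendees", [])
    ]
```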
### Requirement: Get Meeting Details
The system SHALL allow users to retrieve full meeting details including conclusions and action items.
#### Scenario: Get meeting with related data
- **WHEN** user requests GET /api/meetings/:id
- **THEN** meeting record with all conclusions and action_items SHALL be returned
#### Scenario: Get non-existent meeting
- **WHEN** user requests GET /api/meetings/:id for non-existent ID
- **THEN** the server SHALL return HTTP 404
### Requirement: Update Meeting
The system SHALL allow users to update meeting data, conclusions, and action items.
#### Scenario: Update meeting metadata
- **WHEN** user submits PUT /api/meetings/:id with updated fields
- **THEN** the meeting record SHALL be updated and new data returned
#### Scenario: Update action item status
- **WHEN** user updates action item status to "Done"
- **THEN** the action_items record SHALL reflect the new status
### Requirement: Delete Meeting
The system SHALL allow authorized users to delete meetings.
#### Scenario: Admin deletes any meeting
- **WHEN** admin user requests DELETE /api/meetings/:id
- **THEN** the meeting and all related conclusions and action_items SHALL be deleted
#### Scenario: User deletes own meeting
- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
- **THEN** the meeting and all related data SHALL be deleted
### Requirement: System Code Generation
The system SHALL auto-generate unique system codes for conclusions and action items.
#### Scenario: Generate conclusion code
- **WHEN** a conclusion is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number
#### Scenario: Generate action item code
- **WHEN** an action item is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number
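The code format can be sketched as a small formatting helper. It assumes that `XX` is a two-digit, zero-padded sequence starting at 1; the spec only says "sequence number", and how the next sequence value is obtained (for example, by counting existing rows for that date) is left out. The name `make_system_code` is hypothetical.

```python
from datetime import date


def make_system_code(kind: str, meeting_date: date, sequence: int) -> str:
    """Build C-YYYYMMDD-XX for conclusions or A-YYYYMMDD-XX for action items."""
    prefix = {"conclusion": "C", "action": "A"}[kind]
    if not 1 <= sequence <= 99:
        raise ValueError("sequence must fit the two-digit XX field")
    return f"{prefix}-{meeting_date:%Y%m%d}-{sequence:02d}"
```

For a meeting on 2025-12-10, the first conclusion would receive `C-20251210-01`.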


@@ -0,0 +1,41 @@
## ADDED Requirements
### Requirement: FastAPI Server Configuration
The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.
#### Scenario: Server startup with valid configuration
- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
- **THEN** the server SHALL start successfully and accept connections
#### Scenario: Server startup with missing configuration
- **WHEN** the server starts with missing required environment variables
- **THEN** the server SHALL fail to start with descriptive error message
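The startup check can be sketched as a function that fails fast with the names of all missing variables. The variable list is taken from the scenario above; `load_settings` is a hypothetical name, and treating empty values as missing is an assumption.

```python
from typing import Dict, Mapping

REQUIRED_VARS = ("DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
                 "DIFY_API_URL", "DIFY_API_KEY")


def load_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Return the required settings or raise with a descriptive message."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

At startup this could be called as `load_settings(os.environ)` after python-dotenv's `load_dotenv()` has read the `.env` file.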
### Requirement: Database Connection Pool
The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.
#### Scenario: Database connection success
- **WHEN** the server connects to MySQL with valid credentials
- **THEN** a connection pool SHALL be established and queries SHALL execute successfully
#### Scenario: Database connection failure
- **WHEN** the database is unreachable
- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints
### Requirement: Table Initialization
The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.
#### Scenario: Tables created on first run
- **WHEN** the server starts and tables do not exist
- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables
#### Scenario: Tables already exist
- **WHEN** the server starts and tables already exist
- **THEN** the server SHALL skip table creation and continue normally
### Requirement: CORS Configuration
The middleware server SHALL allow cross-origin requests from the Electron client.
#### Scenario: CORS preflight request
- **WHEN** Electron client sends OPTIONS request
- **THEN** the server SHALL respond with appropriate CORS headers allowing the request


@@ -0,0 +1,41 @@
## ADDED Requirements
### Requirement: Edge Speech-to-Text
The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.
#### Scenario: Successful transcription
- **WHEN** user records audio during a meeting
- **THEN** the audio SHALL be transcribed locally without network dependency
#### Scenario: Transcription on target hardware
- **WHEN** running on i5 processor with 8GB RAM
- **THEN** transcription SHALL complete within acceptable latency for real-time display
### Requirement: Traditional Chinese Output
The transcription engine SHALL output Traditional Chinese (繁體中文) text.
#### Scenario: Simplified to Traditional conversion
- **WHEN** whisper outputs Simplified Chinese characters
- **THEN** OpenCC SHALL convert output to Traditional Chinese
#### Scenario: Native Traditional Chinese
- **WHEN** whisper outputs Traditional Chinese directly
- **THEN** the text SHALL pass through unchanged
### Requirement: Real-time Display
The Electron client SHALL display transcription results in real-time.
#### Scenario: Streaming transcription
- **WHEN** user is recording
- **THEN** transcribed text SHALL appear in the left panel within seconds of speech
### Requirement: Python Sidecar
The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.
#### Scenario: Sidecar startup
- **WHEN** Electron app launches
- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available
#### Scenario: Sidecar communication
- **WHEN** Electron sends audio data to sidecar
- **THEN** transcribed text SHALL be returned via IPC


@@ -0,0 +1,67 @@
## 1. Middleware Server Foundation
- [x] 1.1 Initialize Python project with FastAPI, uvicorn, python-dotenv
- [x] 1.2 Create .env.example with all required environment variables
- [x] 1.3 Implement database connection pool with mysql-connector-python
- [x] 1.4 Create table initialization script (meeting_users, meeting_records, meeting_conclusions, meeting_action_items)
- [x] 1.5 Configure CORS middleware for Electron client
- [x] 1.6 Add health check endpoint GET /api/health
## 2. Authentication
- [x] 2.1 Implement POST /api/login proxy to PJ-Auth API
- [x] 2.2 Add admin role detection for ymirliu@panjit.com.tw
- [x] 2.3 Create JWT validation middleware for protected routes
- [x] 2.4 Handle token expiration with appropriate error codes
## 3. Meeting CRUD
- [x] 3.1 Implement POST /api/meetings (create meeting)
- [x] 3.2 Implement GET /api/meetings (list meetings with user filtering)
- [x] 3.3 Implement GET /api/meetings/:id (get meeting with conclusions and action items)
- [x] 3.4 Implement PUT /api/meetings/:id (update meeting)
- [x] 3.5 Implement DELETE /api/meetings/:id (delete meeting cascade)
- [x] 3.6 Implement system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
## 4. AI Summarization
- [x] 4.1 Implement POST /api/ai/summarize endpoint
- [x] 4.2 Configure Dify API client with timeout and retry
- [x] 4.3 Parse Dify response into conclusions and action_items structure
- [x] 4.4 Handle partial data (empty owner/due_date)
## 5. Excel Export
- [x] 5.1 Create Excel template with placeholders
- [x] 5.2 Implement GET /api/meetings/:id/export endpoint
- [x] 5.3 Implement placeholder replacement ({{subject}}, {{time}}, etc.)
- [x] 5.4 Implement dynamic row insertion for conclusions and action items
## 6. Electron Client - Core
- [x] 6.1 Initialize Electron project with electron-builder
- [x] 6.2 Create main window and basic navigation
- [x] 6.3 Implement login page with auth API integration
- [x] 6.4 Implement token storage and auto-refresh interceptor
## 7. Electron Client - Meeting UI
- [x] 7.1 Create meeting list page
- [x] 7.2 Create meeting creation form (metadata fields)
- [x] 7.3 Create dual-panel meeting view (transcript left, notes right)
- [x] 7.4 Implement conclusion/action item editing with manual completion UI
- [x] 7.5 Add export button with download handling
## 8. Transcription Engine
- [x] 8.1 Create Python sidecar project with faster-whisper and OpenCC
- [x] 8.2 Implement audio input capture
- [x] 8.3 Implement transcription with int8 model
- [x] 8.4 Implement OpenCC Traditional Chinese conversion
- [x] 8.5 Set up IPC communication between Electron and sidecar
- [x] 8.6 Package sidecar with PyInstaller
## 9. Testing
- [x] 9.1 Unit tests: DB connection, table creation
- [x] 9.2 Unit tests: Dify proxy with mock responses
- [x] 9.3 Unit tests: Admin role detection
- [x] 9.4 Integration test: Auth flow with token refresh
- [x] 9.5 Integration test: Full meeting cycle (create → transcribe → summarize → save → export)
## 10. Deployment Preparation
- [x] 10.1 Create requirements.txt with all dependencies
- [x] 10.2 Create deployment documentation
- [x] 10.3 Configure electron-builder for portable target
- [x] 10.4 Verify faster-whisper performance on i5/8GB hardware


@@ -0,0 +1,117 @@
## Context
The Meeting Assistant currently uses batch transcription: audio is recorded, saved to file, then sent to Whisper for processing. This creates a poor UX where users must wait until recording stops to see any text. Users also cannot correct transcription errors.
**Stakeholders**: End users recording meetings, admin reviewing transcripts
**Constraints**: i5/8GB hardware target, offline capability required
## Goals / Non-Goals
### Goals
- Real-time text display during recording (< 3 second latency)
- Segment-based editing without disrupting ongoing transcription
- Punctuation in output (Chinese: 。,?!;:)
- Maintain offline capability (all processing local)
### Non-Goals
- Speaker diarization (who said what) - future enhancement
- Multi-language mixing - Chinese only for MVP
- Cloud-based transcription fallback
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Renderer Process (meeting-detail.html)                      │
│  ┌────────────────┐    ┌─────────────────────────────────┐  │
│  │ MediaRecorder  │───▶│ Editable Transcript Component   │  │
│  │ (audio chunks) │    │ [Segment 1] [Segment 2] [...]   │  │
│  └──────┬─────────┘    └─────────────────────────────────┘  │
│         │ IPC: stream-audio-chunk                           │
└─────────┼───────────────────────────────────────────────────┘
          ▼
┌─────────────────────────────────────────────────────────────┐
│ Main Process (main.js)                                      │
│  ┌──────────────────┐     ┌─────────────────────────────┐   │
│  │ Audio Buffer     │────▶│ Sidecar (stdin pipe)        │   │
│  │ (accumulate PCM) │     │                             │   │
│  └──────────────────┘     └──────────┬──────────────────┘   │
│           IPC: transcription-segment │                      │
│                                      ▼                      │
│                             Forward to renderer             │
└─────────────────────────────────────────────────────────────┘
          ▼ stdin (WAV chunks)
┌─────────────────────────────────────────────────────────────┐
│ Sidecar Process (transcriber.py)                            │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────────┐   │
│  │ VAD Buffer   │──▶│ Whisper      │──▶│ Punctuator     │   │
│  │ (silero-vad) │   │ (transcribe) │   │ (rule-based)   │   │
│  └──────────────┘   └──────────────┘   └────────────────┘   │
│         │                                       │           │
│         │ Detect speech end                     │           │
│         ▼                                       ▼           │
│ stdout: {"segment_id": 1, "text": "今天開會討論。", ...}    │
└─────────────────────────────────────────────────────────────┘
```
## Decisions
### Decision 1: VAD-triggered Segmentation
**What**: Use Silero VAD to detect speech boundaries, transcribe complete utterances
**Why**:
- More accurate than fixed-interval chunking
- Natural sentence boundaries
- Reduces partial/incomplete transcriptions
**Alternatives**:
- Fixed 5-second chunks (simpler but cuts mid-sentence)
- Word-level streaming (too fragmented, higher latency)
### Decision 2: Segment-based Editing
**What**: Each VAD segment becomes an editable text block with unique ID
**Why**:
- Users can edit specific segments without affecting others
- New segments append without disrupting editing
- Simple merge on save (concatenate all segments)
**Alternatives**:
- Single textarea (editing conflicts with appending text)
- Contenteditable div (complex cursor management)
### Decision 3: Audio Format Pipeline
**What**: WebM (MediaRecorder) → WAV conversion in main.js → raw PCM to sidecar
**Why**:
- MediaRecorder in Electron's Chromium runtime outputs WebM/Opus by default
- Whisper works best with WAV/PCM
- Conversion in main.js keeps sidecar simple
**Alternatives**:
- ffmpeg in sidecar (adds large dependency)
- Raw PCM from AudioWorklet (complex, browser compatibility issues)
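The WAV-to-PCM step of this pipeline is small once the audio is already in WAV form. The design places the conversion in main.js; the sketch below uses Python's standard `wave` module purely to illustrate what "raw PCM" means here. The 16 kHz mono 16-bit check is an assumption about the expected format, and `wav_to_pcm` is a hypothetical name.

```python
import io
import wave


def wav_to_pcm(wav_bytes: bytes) -> bytes:
    """Strip the WAV container and return the raw PCM frames."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        if fmt != (1, 2, 16000):
            raise ValueError("expected 16 kHz mono 16-bit audio")
        return wav.readframes(wav.getnframes())
```

Decoding WebM/Opus into WAV is the harder part and is not shown; it needs a real decoder, which is why the alternatives above weigh ffmpeg and AudioWorklet.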
### Decision 4: Punctuation via Whisper + Rules
**What**: Enable Whisper word_timestamps, apply rule-based punctuation after
**Why**:
- Whisper alone outputs minimal punctuation for Chinese
- Rule-based post-processing adds 。,? based on pauses and patterns
- No additional model needed
**Alternatives**:
- Separate punctuation model (adds latency and complexity)
- No punctuation (rejected: punctuation is a user requirement)
## Risks / Trade-offs
| Risk | Mitigation |
|------|------------|
| Latency > 3s on slow hardware | Use "tiny" model option, skip VAD if needed |
| WebM→WAV conversion quality loss | Use lossless conversion, test on various inputs |
| Memory usage with long meetings | Limit audio buffer to 30s, process and discard |
| Segment boundary splits words | Use VAD with 500ms silence threshold |
## Implementation Phases
1. **Phase 1**: Sidecar streaming mode with VAD
2. **Phase 2**: IPC audio streaming pipeline
3. **Phase 3**: Frontend editable segment component
4. **Phase 4**: Punctuation post-processing
## Open Questions
- Should segments be auto-merged after N seconds of no editing?
- Maximum segment count before auto-archiving old segments?


@@ -0,0 +1,24 @@
# Change: Add Real-time Streaming Transcription
## Why
Current transcription workflow requires users to stop recording before seeing results. Users cannot edit transcription errors, and output lacks punctuation. For meeting scenarios, real-time feedback with editable text is essential for immediate correction and context awareness.
## What Changes
- **Sidecar**: Implement streaming VAD-based transcription with sentence segmentation
- **IPC**: Add continuous audio streaming from renderer to main process to sidecar
- **Frontend**: Make transcript editable with real-time segment updates
- **Punctuation**: Enable Whisper's word timestamps and add sentence boundary detection
## Impact
- Affected specs: `transcription` (new), `frontend-transcript` (new)
- Affected code:
- `sidecar/transcriber.py` - Add streaming mode with VAD
- `client/src/main.js` - Add audio streaming IPC handlers
- `client/src/preload.js` - Expose streaming APIs
- `client/src/pages/meeting-detail.html` - Editable transcript component
## Success Criteria
1. User sees text appearing within 2-3 seconds of speaking
2. Each segment is individually editable
3. Output includes punctuation (。,?!)
4. Recording can continue while user edits previous segments


@@ -0,0 +1,58 @@
## ADDED Requirements
### Requirement: Editable Transcript Segments
The frontend SHALL display transcribed text as individually editable segments that can be modified without disrupting ongoing transcription.
#### Scenario: Display new segment
- **WHEN** a new transcription segment is received from sidecar
- **THEN** a new editable text block SHALL appear in the transcript area
- **AND** the block SHALL be visually distinct (e.g., border, background)
- **AND** the block SHALL be immediately editable
#### Scenario: Edit existing segment
- **WHEN** user modifies text in a segment
- **THEN** only that segment's local data SHALL be updated
- **AND** new incoming segments SHALL continue to append below
- **AND** the edited segment SHALL show an "edited" indicator
#### Scenario: Save merged transcript
- **WHEN** user clicks Save button
- **THEN** all segments (edited and unedited) SHALL be concatenated in order
- **AND** the merged text SHALL be saved as transcript_blob
### Requirement: Real-time Streaming UI
The frontend SHALL provide clear visual feedback during streaming transcription.
#### Scenario: Recording active indicator
- **WHEN** streaming recording is active
- **THEN** a pulsing recording indicator SHALL be visible
- **AND** the current/active segment SHALL have distinct styling (e.g., highlighted border)
- **AND** the Start Recording button SHALL change to Stop Recording
#### Scenario: Processing indicator
- **WHEN** audio is being processed but no text has appeared yet
- **THEN** a "Processing..." indicator SHALL appear in the active segment area
- **AND** the indicator SHALL disappear when text arrives
#### Scenario: Streaming status display
- **WHEN** streaming session is active
- **THEN** the UI SHALL display segment count (e.g., "Segment 5/5")
- **AND** total recording duration
### Requirement: Audio Streaming IPC
The Electron main process SHALL provide IPC handlers for continuous audio streaming between renderer and sidecar.
#### Scenario: Start streaming
- **WHEN** renderer calls `startRecordingStream()`
- **THEN** main process SHALL send start_stream command to sidecar
- **AND** return session confirmation to renderer
#### Scenario: Stream audio data
- **WHEN** renderer sends audio chunk via `streamAudioChunk(arrayBuffer)`
- **THEN** main process SHALL convert WebM to PCM if needed
- **AND** forward to sidecar stdin as base64-encoded audio_chunk command
#### Scenario: Receive transcription
- **WHEN** sidecar emits a segment result on stdout
- **THEN** main process SHALL parse the JSON
- **AND** forward to renderer via `transcription-segment` IPC event


@@ -0,0 +1,46 @@
## ADDED Requirements
### Requirement: Streaming Transcription Mode
The sidecar SHALL support a streaming mode where audio chunks are continuously received and transcribed in real-time with VAD-triggered segmentation.
#### Scenario: Start streaming session
- **WHEN** sidecar receives `{"action": "start_stream"}` command
- **THEN** it SHALL initialize audio buffer and VAD processor
- **AND** respond with `{"status": "streaming", "session_id": "<uuid>"}`
#### Scenario: Process audio chunk
- **WHEN** sidecar receives `{"action": "audio_chunk", "data": "<base64_pcm>"}` during active stream
- **THEN** it SHALL append audio to buffer and run VAD detection
- **AND** if speech boundary detected, transcribe accumulated audio
- **AND** emit `{"segment_id": <int>, "text": "<transcription>", "is_final": true}`
#### Scenario: Stop streaming session
- **WHEN** sidecar receives `{"action": "stop_stream"}` command
- **THEN** it SHALL transcribe any remaining buffered audio
- **AND** respond with `{"status": "stream_stopped", "total_segments": <int>}`
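The three commands can be sketched as a small session object that turns each JSON command line into the messages to emit. The message shapes follow the scenarios above; the `StreamSession` class, its two injected callables (stand-ins for Whisper and the VAD) and the `error` reply are assumptions made for illustration.

```python
import base64
import json
import uuid
from typing import Callable, List, Optional


class StreamSession:
    """Handles the start_stream / audio_chunk / stop_stream commands."""

    def __init__(self, transcribe: Callable[[bytes], str],
                 boundary_reached: Callable[[bytes], bool]) -> None:
        self.transcribe = transcribe              # stand-in for the Whisper call
        self.boundary_reached = boundary_reached  # stand-in for the VAD check
        self.buffer = bytearray()
        self.segment_id = 0
        self.active = False

    def _flush(self) -> Optional[dict]:
        """Transcribe and clear the buffer; return None if it is empty."""
        if not self.buffer:
            return None
        self.segment_id += 1
        text = self.transcribe(bytes(self.buffer))
        self.buffer.clear()
        return {"segment_id": self.segment_id, "text": text, "is_final": True}

    def handle(self, line: str) -> List[dict]:
        """Process one JSON command line and return the messages to emit."""
        cmd = json.loads(line)
        action = cmd.get("action")
        if action == "start_stream":
            self.buffer.clear()
            self.segment_id = 0
            self.active = True
            return [{"status": "streaming", "session_id": str(uuid.uuid4())}]
        if action == "audio_chunk" and self.active:
            self.buffer.extend(base64.b64decode(cmd["data"]))
            if self.boundary_reached(bytes(self.buffer)):
                segment = self._flush()
                return [segment] if segment else []
            return []
        if action == "stop_stream" and self.active:
            self.active = False
            segment = self._flush()
            out = [segment] if segment else []
            out.append({"status": "stream_stopped",
                        "total_segments": self.segment_id})
            return out
        return [{"error": f"unexpected action: {action}"}]  # hypothetical reply
```

A main loop would read lines from `sys.stdin`, pass each to `handle`, and print every returned message with `json.dumps(msg, ensure_ascii=False)` followed by a flush so that Chinese text reaches the main process unescaped.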
### Requirement: VAD-based Speech Segmentation
The sidecar SHALL use Voice Activity Detection to identify natural speech boundaries for segmentation.
#### Scenario: Detect speech end
- **WHEN** VAD detects silence exceeding 500ms after speech
- **THEN** the accumulated speech audio SHALL be sent for transcription
- **AND** a new segment SHALL begin for subsequent speech
#### Scenario: Handle continuous speech
- **WHEN** speech continues for more than 15 seconds without pause
- **THEN** the sidecar SHALL force a segment boundary
- **AND** transcribe the 15-second chunk to prevent excessive latency
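The two boundary rules form a small state machine over per-frame speech decisions. This sketch assumes the VAD has already classified each frame as speech or not; the `Segmenter` class and its return values are hypothetical, while the 500 ms and 15 s thresholds come from the scenarios above.

```python
from typing import Optional

SILENCE_LIMIT_MS = 500    # silence after speech that closes a segment
MAX_SEGMENT_MS = 15_000   # forced boundary for continuous speech


class Segmenter:
    """Tracks speech/silence frames and reports when a segment should close."""

    def __init__(self) -> None:
        self.segment_ms = 0   # length of the open segment, 0 if none is open
        self.silence_ms = 0   # trailing silence since the last speech frame

    def _reset(self) -> None:
        self.segment_ms = 0
        self.silence_ms = 0

    def feed(self, is_speech: bool, frame_ms: int) -> Optional[str]:
        """Return "silence", "max_length", or None for each incoming frame."""
        if is_speech:
            self.segment_ms += frame_ms
            self.silence_ms = 0
            if self.segment_ms >= MAX_SEGMENT_MS:
                self._reset()
                return "max_length"
            return None
        if self.segment_ms == 0:
            return None  # leading silence: no segment is open yet
        self.segment_ms += frame_ms
        self.silence_ms += frame_ms
        if self.silence_ms > SILENCE_LIMIT_MS:
            self._reset()
            return "silence"
        return None
```

Either non-`None` result means the accumulated audio should be handed to the transcriber and a new segment started.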
### Requirement: Punctuation in Transcription Output
The sidecar SHALL output transcribed text with appropriate Chinese punctuation marks.
#### Scenario: Add sentence-ending punctuation
- **WHEN** transcription completes for a segment
- **THEN** the output SHALL include period (。) at natural sentence boundaries
- **AND** question marks (?) for interrogative sentences
- **AND** commas (,) for clause breaks within sentences
#### Scenario: Detect question patterns
- **WHEN** transcribed text ends with question particles (嗎、呢、什麼、怎麼、為什麼)
- **THEN** the punctuation processor SHALL append question mark (?)
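The question-pattern rule can be sketched as a suffix check. This is a simplified stand-in rather than the project's actual punctuator: it only handles terminal punctuation, the function name `punctuate` is hypothetical, and the punctuation marks are written as Unicode escapes so that the fullwidth characters are unambiguous.

```python
QUESTION_MARK = "\uff1f"     # fullwidth question mark
PERIOD = "\u3002"            # ideographic full stop
EXCLAMATION = "\uff01"       # fullwidth exclamation mark

QUESTION_ENDINGS = ("嗎", "呢", "什麼", "怎麼", "為什麼")
SENTENCE_ENDERS = (PERIOD, QUESTION_MARK, EXCLAMATION)


def punctuate(text: str) -> str:
    """Append a question mark or period to a segment that lacks one."""
    stripped = text.strip()
    if not stripped or stripped.endswith(SENTENCE_ENDERS):
        return stripped
    if stripped.endswith(QUESTION_ENDINGS):
        return stripped + QUESTION_MARK
    return stripped + PERIOD
```

Comma placement at clause breaks depends on pause timing from word timestamps and is not covered by this sketch.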


@@ -0,0 +1,53 @@
## 1. Sidecar Streaming Infrastructure
- [x] 1.1 Add silero-vad dependency to requirements.txt
- [x] 1.2 Implement VADProcessor class with speech boundary detection
- [x] 1.3 Add streaming mode to Transcriber (action: "start_stream", "audio_chunk", "stop_stream")
- [x] 1.4 Implement audio buffer with VAD-triggered transcription
- [x] 1.5 Add segment_id tracking for each utterance
- [x] 1.6 Test VAD with sample Chinese speech audio
## 2. Punctuation Processing
- [x] 2.1 Enable word_timestamps in Whisper transcribe()
- [x] 2.2 Implement ChinesePunctuator class with rule-based punctuation
- [x] 2.3 Add pause-based sentence boundary detection (>500ms → period)
- [x] 2.4 Add question detection (嗎、呢、什麼 patterns → ?)
- [x] 2.5 Test punctuation output quality with sample transcripts
## 3. IPC Audio Streaming
- [x] 3.1 Add "start-recording-stream" IPC handler in main.js
- [x] 3.2 Add "stream-audio-chunk" IPC handler to forward audio to sidecar
- [x] 3.3 Add "stop-recording-stream" IPC handler
- [x] 3.4 Implement WebM to PCM conversion using web-audio-api or ffmpeg.wasm
- [x] 3.5 Forward sidecar segment events to renderer via "transcription-segment" IPC
- [x] 3.6 Update preload.js with streaming API exposure
## 4. Frontend Editable Transcript
- [x] 4.1 Create TranscriptSegment component (editable text block with segment_id)
- [x] 4.2 Implement segment container with append-only behavior during recording
- [x] 4.3 Add edit handler that updates local segment data
- [x] 4.4 Style active segment (currently receiving text) differently
- [x] 4.5 Update Save button to merge all segments into transcript_blob
- [x] 4.6 Add visual indicator for streaming status
## 5. Integration & Testing
- [x] 5.1 End-to-end test: start recording → speak → see text appear
- [x] 5.2 Test editing segment while new segments arrive
- [x] 5.3 Test save with mixed edited/unedited segments
- [x] 5.4 Performance test on i5/8GB target hardware
- [x] 5.5 Test with 30+ minute continuous recording
- [x] 5.6 Update meeting-detail.html recording flow documentation
## Dependencies
- Task 3 depends on Task 1 (sidecar must support streaming first)
- Task 4 depends on Task 3 (frontend needs IPC to receive segments)
## Parallelizable Work
- Tasks 1 and 4 can start simultaneously (sidecar and frontend scaffolding)
- Task 2 can run in parallel with Task 3
## Implementation Notes
- VAD uses Silero VAD with fallback to 5-second time-based segmentation if torch unavailable
- Audio captured at 16kHz mono, converted to int16 PCM, sent as base64
- ChinesePunctuator uses regex patterns for question detection
- Segments are editable immediately, edited segments marked with orange border
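The audio note above (16 kHz mono, int16 PCM, base64) can be illustrated with the encoding step on its own. This sketch assumes float samples in the range [-1.0, 1.0] and little-endian byte order, neither of which is stated in the notes; `encode_audio_chunk` is a hypothetical helper that produces the `audio_chunk` command.

```python
import base64
import json
import struct
from typing import Sequence


def encode_audio_chunk(samples: Sequence[float]) -> str:
    """Convert float samples to int16 PCM and wrap them in an audio_chunk command."""
    # Scale to the int16 range and clamp out-of-range input.
    ints = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    pcm = struct.pack(f"<{len(ints)}h", *ints)  # little-endian signed 16-bit
    data = base64.b64encode(pcm).decode("ascii")
    return json.dumps({"action": "audio_chunk", "data": data})
```

At 16 kHz mono, one second of audio is 32,000 bytes of PCM, or roughly 43 KB once base64-encoded.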

openspec/project.md Normal file

@@ -0,0 +1,56 @@
# Project Context
## Purpose
Enterprise meeting knowledge management solution that automates meeting transcription and generates structured summaries. Solves the time-consuming problem of manual meeting notes by using edge AI for speech-to-text and LLM for intelligent summarization with action item tracking.
## Tech Stack
- **Frontend**: Electron (edge computing for offline transcription)
- **Backend**: Python FastAPI (middleware server)
- **Database**: MySQL (shared instance at mysql.theaken.com:33306)
- **AI/ML**:
- faster-whisper (int8) for local speech-to-text
- OpenCC for Traditional Chinese conversion
- Dify LLM for summarization
- **Key Libraries**: mysql-connector-python, fastapi, requests, openpyxl, PyInstaller
## Project Conventions
### Code Style
- Database tables must use `meeting_` prefix
- System IDs follow format: `C-YYYYMMDD-XX` (conclusions), `A-YYYYMMDD-XX` (action items)
- API endpoints use `/api/` prefix
- Environment variables for sensitive config (DB credentials, API keys)
### Architecture Patterns
- **Three-tier architecture**: Electron Client → FastAPI Middleware → MySQL/Dify
- **Security**: DB connections and API keys must NOT be in Electron client; all secrets stay in middleware
- **Edge Computing**: Speech-to-text runs locally in Electron for offline capability
- **Proxy Pattern**: Middleware proxies auth requests to external Auth API
### Testing Strategy
- **Unit Tests**: DB connectivity, Dify proxy, admin role detection
- **Integration Tests**: Auth flow with token refresh, full meeting cycle (create → record → summarize → save → export)
- **Deployment Checklist**: Environment validation, table creation, package verification
### Git Workflow
- Feature branches for new capabilities
- OpenSpec change proposals for significant features
## Domain Context
- **會議記錄 (Meeting Records)**: Core entity with metadata (subject, time, chairperson, location, recorder, attendees)
- **逐字稿 (Transcript)**: Raw AI-generated speech-to-text output
- **會議結論 (Conclusions)**: Summarized key decisions from meetings
- **待辦事項 (Action Items)**: Tracked tasks with owner, due date, and status (Open/In Progress/Done/Delayed)
- **Admin User**: ymirliu@panjit.com.tw has full access to all meetings and Excel template management
## Important Constraints
- Target hardware: i5/8GB laptop must run faster-whisper int8 locally
- Security: No DB credentials or API keys in client-side code
- Language: Must support Traditional Chinese (繁體中文) output
- Data Isolation: All tables prefixed with `meeting_`
- Token Management: Client must implement auto-refresh for long meetings
## External Dependencies
- **Auth API**: https://pj-auth-api.vercel.app/api/auth/login (company SSO)
- **Dify LLM**: https://dify.theaken.com/v1 (AI summarization)
- **MySQL**: mysql.theaken.com:33306, database `db_A060`


@@ -0,0 +1,49 @@
# ai-summarization Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Dify Integration
The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.
#### Scenario: Successful summarization
- **WHEN** user submits POST /api/ai/summarize with transcript text
- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items
#### Scenario: Dify timeout handling
- **WHEN** Dify API does not respond within timeout period
- **THEN** the server SHALL return HTTP 504 with a timeout error so that the client can retry
#### Scenario: Dify error handling
- **WHEN** Dify API returns error (500, rate limit, etc.)
- **THEN** the server SHALL return appropriate HTTP error with details
### Requirement: Structured Output Format
The AI summarization SHALL return structured data with conclusions and action items.
#### Scenario: Complete structured response
- **WHEN** transcript contains clear decisions and assignments
- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields
#### Scenario: Partial data extraction
- **WHEN** transcript lacks explicit owner or due_date for action items
- **THEN** those fields SHALL be empty strings allowing manual completion
### Requirement: Dify Prompt Configuration
The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.
#### Scenario: System prompt behavior
- **WHEN** transcript is sent to Dify
- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format
### Requirement: Manual Data Completion
The Electron client SHALL allow users to manually complete missing AI-extracted data.
#### Scenario: Fill missing owner
- **WHEN** AI returns action item without owner
- **THEN** user SHALL be able to select or type owner name in the UI
#### Scenario: Fill missing due date
- **WHEN** AI returns action item without due_date
- **THEN** user SHALL be able to select date using date picker


@@ -0,0 +1,46 @@
# authentication Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Login Proxy
The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.
#### Scenario: Successful login
- **WHEN** user submits valid credentials to POST /api/login
- **THEN** the server SHALL forward to Auth API and return the JWT token
#### Scenario: Admin role detection
- **WHEN** user logs in with email ymirliu@panjit.com.tw
- **THEN** the response JWT payload SHALL include role: "admin"
#### Scenario: Invalid credentials
- **WHEN** user submits invalid credentials
- **THEN** the server SHALL return HTTP 401 with error message from Auth API
### Requirement: Token Validation
The middleware server SHALL validate JWT tokens on protected endpoints.
#### Scenario: Valid token access
- **WHEN** request includes valid JWT in Authorization header
- **THEN** the request SHALL proceed to the endpoint handler
#### Scenario: Expired token
- **WHEN** request includes expired JWT
- **THEN** the server SHALL return HTTP 401 with "token_expired" error code
#### Scenario: Missing token
- **WHEN** request to protected endpoint lacks Authorization header
- **THEN** the server SHALL return HTTP 401 with "token_required" error code
### Requirement: Token Auto-Refresh
The Electron client SHALL implement automatic token refresh before expiration.
#### Scenario: Proactive refresh
- **WHEN** token approaches expiration (within 5 minutes) during active session
- **THEN** the client SHALL request new token transparently without user interruption
#### Scenario: Refresh during long meeting
- **WHEN** user is in a meeting session lasting longer than token validity
- **THEN** the client SHALL maintain authentication through automatic refresh
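
The proactive-refresh check can be sketched as follows. This is an illustration only: the real client is Electron, Python is used here to match the rest of the repository, and the sketch assumes the JWT payload carries a standard `exp` claim (the spec does not say so). `jwt_expiry` and `should_refresh` are hypothetical names.

```python
import base64
import json
import time
from typing import Optional

REFRESH_MARGIN_SECONDS = 5 * 60  # "within 5 minutes" from the scenario above


def jwt_expiry(token: str) -> Optional[int]:
    """Read the `exp` claim from a JWT payload without verifying the signature."""
    try:
        payload_b64 = token.split(".")[1]
        payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
        return int(payload["exp"])
    except (IndexError, KeyError, ValueError, TypeError):
        return None


def should_refresh(token: str, now: Optional[float] = None) -> bool:
    """Return True when the token is unreadable or expires within the margin."""
    exp = jwt_expiry(token)
    if exp is None:
        return True
    current = time.time() if now is None else now
    return exp - current <= REFRESH_MARGIN_SECONDS
```

A client timer could call `should_refresh` periodically during a meeting and request a new token whenever it returns `True`.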


@@ -0,0 +1,49 @@
# excel-export Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Excel Report Generation
The middleware server SHALL generate Excel reports from meeting data using templates.
#### Scenario: Successful export
- **WHEN** user requests GET /api/meetings/:id/export
- **THEN** server SHALL generate Excel file and return as downloadable stream
#### Scenario: Export non-existent meeting
- **WHEN** user requests export for non-existent meeting ID
- **THEN** server SHALL return HTTP 404
### Requirement: Template-based Generation
The Excel export SHALL use openpyxl with template files.
#### Scenario: Placeholder replacement
- **WHEN** Excel is generated
- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data
#### Scenario: Dynamic row insertion
- **WHEN** meeting has multiple conclusions or action items
- **THEN** rows SHALL be dynamically inserted to accommodate all items
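
The placeholder-replacement scenario can be sketched on plain cell text. This is a hedged sketch rather than the export code itself: `fill_placeholders` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption. In an openpyxl-based export it would be applied to each string cell value of the template.

```python
import re
from typing import Mapping

# Matches {{name}} with optional whitespace inside the braces
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(cell_text: str, values: Mapping[str, object]) -> str:
    """Replace {{name}} tokens with meeting data; unknown names are left as-is."""
    def _sub(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER.sub(_sub, cell_text)
```

Leaving unknown tokens in place makes a template typo visible in the exported file instead of silently producing an empty cell.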
### Requirement: Complete Data Inclusion
The exported Excel SHALL include all meeting metadata and AI-generated content.
#### Scenario: Full metadata export
- **WHEN** Excel is generated
- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees
#### Scenario: Conclusions export
- **WHEN** Excel is generated
- **THEN** all conclusions SHALL be listed with their system codes
#### Scenario: Action items export
- **WHEN** Excel is generated
- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code
### Requirement: Template Management
Admin users SHALL be able to manage Excel templates.
#### Scenario: Admin template access
- **WHEN** admin user accesses template management
- **THEN** they SHALL be able to upload, view, and update Excel templates


@@ -0,0 +1,62 @@
# frontend-transcript Specification
## Purpose
TBD - created by archiving change add-realtime-transcription. Update Purpose after archive.
## Requirements
### Requirement: Editable Transcript Segments
The frontend SHALL display transcribed text as individually editable segments that can be modified without disrupting ongoing transcription.
#### Scenario: Display new segment
- **WHEN** a new transcription segment is received from sidecar
- **THEN** a new editable text block SHALL appear in the transcript area
- **AND** the block SHALL be visually distinct (e.g., border, background)
- **AND** the block SHALL be immediately editable
#### Scenario: Edit existing segment
- **WHEN** user modifies text in a segment
- **THEN** only that segment's local data SHALL be updated
- **AND** new incoming segments SHALL continue to append below
- **AND** the edited segment SHALL show an "edited" indicator
#### Scenario: Save merged transcript
- **WHEN** user clicks Save button
- **THEN** all segments (edited and unedited) SHALL be concatenated in order
- **AND** the merged text SHALL be saved as transcript_blob
### Requirement: Real-time Streaming UI
The frontend SHALL provide clear visual feedback during streaming transcription.
#### Scenario: Recording active indicator
- **WHEN** streaming recording is active
- **THEN** a pulsing recording indicator SHALL be visible
- **AND** the current/active segment SHALL have distinct styling (e.g., highlighted border)
- **AND** the Start Recording button SHALL change to Stop Recording
#### Scenario: Processing indicator
- **WHEN** audio is being processed but no text has appeared yet
- **THEN** a "Processing..." indicator SHALL appear in the active segment area
- **AND** the indicator SHALL disappear when text arrives
#### Scenario: Streaming status display
- **WHEN** streaming session is active
- **THEN** the UI SHALL display segment count (e.g., "Segment 5/5")
- **AND** total recording duration
### Requirement: Audio Streaming IPC
The Electron main process SHALL provide IPC handlers for continuous audio streaming between renderer and sidecar.
#### Scenario: Start streaming
- **WHEN** renderer calls `startRecordingStream()`
- **THEN** main process SHALL send start_stream command to sidecar
- **AND** return session confirmation to renderer
#### Scenario: Stream audio data
- **WHEN** renderer sends audio chunk via `streamAudioChunk(arrayBuffer)`
- **THEN** main process SHALL convert WebM to PCM if needed
- **AND** forward to sidecar stdin as base64-encoded audio_chunk command
#### Scenario: Receive transcription
- **WHEN** sidecar emits a segment result on stdout
- **THEN** main process SHALL parse the JSON
- **AND** forward to renderer via `transcription-segment` IPC event


@@ -0,0 +1,75 @@
# meeting-management Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Create Meeting
The system SHALL allow users to create meetings with required metadata.
#### Scenario: Create meeting with all fields
- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned
#### Scenario: Create meeting with missing required fields
- **WHEN** user submits POST /api/meetings without subject or meeting_time
- **THEN** the server SHALL return HTTP 400 with validation error details
#### Scenario: Recorder defaults to current user
- **WHEN** user creates meeting without specifying recorder
- **THEN** the recorder field SHALL default to the logged-in user's email
### Requirement: List Meetings
The system SHALL allow users to retrieve a list of meetings.
#### Scenario: List all meetings for admin
- **WHEN** admin user requests GET /api/meetings
- **THEN** all meetings SHALL be returned
#### Scenario: List meetings for regular user
- **WHEN** regular user requests GET /api/meetings
- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned
### Requirement: Get Meeting Details
The system SHALL allow users to retrieve full meeting details including conclusions and action items.
#### Scenario: Get meeting with related data
- **WHEN** user requests GET /api/meetings/:id
- **THEN** meeting record with all conclusions and action_items SHALL be returned
#### Scenario: Get non-existent meeting
- **WHEN** user requests GET /api/meetings/:id for non-existent ID
- **THEN** the server SHALL return HTTP 404
### Requirement: Update Meeting
The system SHALL allow users to update meeting data, conclusions, and action items.
#### Scenario: Update meeting metadata
- **WHEN** user submits PUT /api/meetings/:id with updated fields
- **THEN** the meeting record SHALL be updated and new data returned
#### Scenario: Update action item status
- **WHEN** user updates action item status to "Done"
- **THEN** the action_items record SHALL reflect the new status
### Requirement: Delete Meeting
The system SHALL allow authorized users to delete meetings.
#### Scenario: Admin deletes any meeting
- **WHEN** admin user requests DELETE /api/meetings/:id
- **THEN** the meeting and all related conclusions and action_items SHALL be deleted
#### Scenario: User deletes own meeting
- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
- **THEN** the meeting and all related data SHALL be deleted
### Requirement: System Code Generation
The system SHALL auto-generate unique system codes for conclusions and action items.
#### Scenario: Generate conclusion code
- **WHEN** a conclusion is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number
#### Scenario: Generate action item code
- **WHEN** an action item is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number
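
The code format can be sketched as a small formatting function. This is a minimal sketch: `make_system_code` is a hypothetical name, and it assumes `XX` is a 1-based sequence number zero-padded to two digits. How the sequence itself is allocated (for example per day or per meeting) is not specified above and is left to the caller.

```python
from datetime import date


def make_system_code(kind: str, meeting_date: date, sequence: int) -> str:
    """Format a system code such as C-20251210-01 or A-20251210-01.

    kind: "C" for conclusions, "A" for action items.
    sequence: 1-based counter supplied by the caller (assumption).
    """
    if kind not in ("C", "A"):
        raise ValueError(f"unknown code kind: {kind!r}")
    if sequence < 1:
        raise ValueError("sequence must be >= 1")
    return f"{kind}-{meeting_date:%Y%m%d}-{sequence:02d}"
```

For a meeting on 2025-12-10, the first conclusion would be `C-20251210-01` and the first action item `A-20251210-01`.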


@@ -0,0 +1,45 @@
# middleware Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: FastAPI Server Configuration
The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.
#### Scenario: Server startup with valid configuration
- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
- **THEN** the server SHALL start successfully and accept connections
#### Scenario: Server startup with missing configuration
- **WHEN** the server starts with missing required environment variables
- **THEN** the server SHALL fail to start with descriptive error message
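
The fail-fast startup check can be sketched with the standard library alone. This is an illustrative sketch, not the server's actual configuration module: `load_settings` is a hypothetical helper, and treating an empty value as missing is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the startup scenario above
REQUIRED_VARS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
    "DIFY_API_URL", "DIFY_API_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings or raise with a descriptive message."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in REQUIRED_VARS}
```

Calling this once before the application object is created means a misconfigured deployment stops immediately with the list of missing names, rather than failing on the first database or Dify request.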
### Requirement: Database Connection Pool
The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.
#### Scenario: Database connection success
- **WHEN** the server connects to MySQL with valid credentials
- **THEN** a connection pool SHALL be established and queries SHALL execute successfully
#### Scenario: Database connection failure
- **WHEN** the database is unreachable
- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints
### Requirement: Table Initialization
The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.
#### Scenario: Tables created on first run
- **WHEN** the server starts and tables do not exist
- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables
#### Scenario: Tables already exist
- **WHEN** the server starts and tables already exist
- **THEN** the server SHALL skip table creation and continue normally
### Requirement: CORS Configuration
The middleware server SHALL allow cross-origin requests from the Electron client.
#### Scenario: CORS preflight request
- **WHEN** Electron client sends OPTIONS request
- **THEN** the server SHALL respond with appropriate CORS headers allowing the request


@@ -0,0 +1,90 @@
# transcription Specification
## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Edge Speech-to-Text
The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.
#### Scenario: Successful transcription
- **WHEN** user records audio during a meeting
- **THEN** the audio SHALL be transcribed locally without network dependency
#### Scenario: Transcription on target hardware
- **WHEN** running on i5 processor with 8GB RAM
- **THEN** transcription SHALL complete within acceptable latency for real-time display
### Requirement: Traditional Chinese Output
The transcription engine SHALL output Traditional Chinese (繁體中文) text.
#### Scenario: Simplified to Traditional conversion
- **WHEN** whisper outputs Simplified Chinese characters
- **THEN** OpenCC SHALL convert output to Traditional Chinese
#### Scenario: Native Traditional Chinese
- **WHEN** whisper outputs Traditional Chinese directly
- **THEN** the text SHALL pass through unchanged
### Requirement: Real-time Display
The Electron client SHALL display transcription results in real-time.
#### Scenario: Streaming transcription
- **WHEN** user is recording
- **THEN** transcribed text SHALL appear in the left panel within seconds of speech
### Requirement: Python Sidecar
The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.
#### Scenario: Sidecar startup
- **WHEN** Electron app launches
- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available
#### Scenario: Sidecar communication
- **WHEN** Electron sends audio data to sidecar
- **THEN** transcribed text SHALL be returned via IPC
### Requirement: Streaming Transcription Mode
The sidecar SHALL support a streaming mode where audio chunks are continuously received and transcribed in real-time with VAD-triggered segmentation.
#### Scenario: Start streaming session
- **WHEN** sidecar receives `{"action": "start_stream"}` command
- **THEN** it SHALL initialize audio buffer and VAD processor
- **AND** respond with `{"status": "streaming", "session_id": "<uuid>"}`
#### Scenario: Process audio chunk
- **WHEN** sidecar receives `{"action": "audio_chunk", "data": "<base64_pcm>"}` during active stream
- **THEN** it SHALL append audio to buffer and run VAD detection
- **AND** if speech boundary detected, transcribe accumulated audio
- **AND** emit `{"segment_id": <int>, "text": "<transcription>", "is_final": true}`
#### Scenario: Stop streaming session
- **WHEN** sidecar receives `{"action": "stop_stream"}` command
- **THEN** it SHALL transcribe any remaining buffered audio
- **AND** respond with `{"status": "stream_stopped", "total_segments": <int>}`
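
The three streaming commands can be sketched from the sender's side. This is a hedged illustration: the helper names are hypothetical, and it assumes one JSON object per line with audio encoded as 16-bit little-endian mono PCM at 16 kHz, base64-encoded into the `data` field.

```python
import base64
import json
import struct
from typing import Iterable


def start_stream_command() -> str:
    """JSON line that opens a streaming session."""
    return json.dumps({"action": "start_stream"})


def audio_chunk_command(samples: Iterable[int]) -> str:
    """Pack 16-bit PCM samples into an audio_chunk JSON line."""
    values = list(samples)
    pcm = struct.pack(f"<{len(values)}h", *values)  # little-endian int16
    return json.dumps({
        "action": "audio_chunk",
        "data": base64.b64encode(pcm).decode("ascii"),
    })


def stop_stream_command() -> str:
    """JSON line that flushes remaining audio and closes the session."""
    return json.dumps({"action": "stop_stream"})
```

A caller would write each returned string followed by a newline to the sidecar's stdin and read JSON replies line by line from its stdout. An `audio_chunk` command produces a reply only when a segment boundary is reached.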
### Requirement: VAD-based Speech Segmentation
The sidecar SHALL use Voice Activity Detection to identify natural speech boundaries for segmentation.
#### Scenario: Detect speech end
- **WHEN** VAD detects silence exceeding 500ms after speech
- **THEN** the accumulated speech audio SHALL be sent for transcription
- **AND** a new segment SHALL begin for subsequent speech
#### Scenario: Handle continuous speech
- **WHEN** speech continues for more than 15 seconds without pause
- **THEN** the sidecar SHALL force a segment boundary
- **AND** transcribe the 15-second chunk to prevent excessive latency
### Requirement: Punctuation in Transcription Output
The sidecar SHALL output transcribed text with appropriate Chinese punctuation marks.
#### Scenario: Add sentence-ending punctuation
- **WHEN** transcription completes for a segment
- **THEN** the output SHALL include period (。) at natural sentence boundaries
- **AND** question marks (?) for interrogative sentences
- **AND** commas (,) for clause breaks within sentences
#### Scenario: Detect question patterns
- **WHEN** transcribed text ends with question particles (嗎、呢、什麼、怎麼、為什麼)
- **THEN** the punctuation processor SHALL append question mark (?)

sidecar/build.py

@@ -0,0 +1,45 @@
#!/usr/bin/env python3
"""
Build script for creating standalone transcriber executable using PyInstaller.
"""
import subprocess
import sys
import os


def build():
    """Build the transcriber executable."""
    # PyInstaller command
    cmd = [
        sys.executable, "-m", "PyInstaller",
        "--onefile",
        "--name", "transcriber",
        "--distpath", "dist",
        "--workpath", "build",
        "--specpath", "build",
        "--hidden-import", "faster_whisper",
        "--hidden-import", "opencc",
        "--hidden-import", "numpy",
        "--hidden-import", "ctranslate2",
        "--hidden-import", "huggingface_hub",
        "--hidden-import", "tokenizers",
        "--collect-data", "faster_whisper",
        "--collect-data", "opencc",
        "transcriber.py"
    ]
    print("Building transcriber executable...")
    print(f"Command: {' '.join(cmd)}")
    # Run from this script's directory so relative paths resolve correctly
    result = subprocess.run(cmd, cwd=os.path.dirname(os.path.abspath(__file__)))
    if result.returncode == 0:
        print("\nBuild successful! Executable created at: dist/transcriber")
    else:
        print("\nBuild failed!")
        sys.exit(1)


if __name__ == "__main__":
    build()


@@ -0,0 +1,3 @@
# Development/Build dependencies
-r requirements.txt
pyinstaller>=6.0.0

sidecar/requirements.txt

@@ -0,0 +1,5 @@
# Runtime dependencies
faster-whisper>=1.0.0
opencc-python-reimplemented>=0.1.7
numpy>=1.26.0
onnxruntime>=1.16.0

sidecar/transcriber.py

@@ -0,0 +1,510 @@
#!/usr/bin/env python3
"""
Meeting Assistant Transcription Sidecar

Provides speech-to-text transcription using faster-whisper
with automatic Traditional Chinese conversion via OpenCC.

Modes:
1. File mode: transcriber.py <audio_file>
2. Server mode: transcriber.py (default, listens on stdin for JSON commands)
3. Streaming mode: Continuous audio processing with VAD segmentation

Uses ONNX Runtime for VAD (lightweight, ~20MB vs PyTorch ~2GB)
"""
import sys
import os
import json
import tempfile
import base64
import uuid
import re
import urllib.request
from pathlib import Path
from typing import Optional, List

# Tolerate duplicate OpenMP runtimes loaded by different native libraries
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

try:
    from faster_whisper import WhisperModel
    import opencc
    import numpy as np
except ImportError as e:
    print(json.dumps({"error": f"Missing dependency: {e}"}), file=sys.stderr)
    sys.exit(1)

# Try to import ONNX Runtime for VAD
try:
    import onnxruntime as ort
    ONNX_AVAILABLE = True
except ImportError:
    ONNX_AVAILABLE = False


class ChinesePunctuator:
    """Rule-based Chinese punctuation processor."""

    QUESTION_PATTERNS = [
        r'嗎$', r'呢$', r'什麼$', r'怎麼$', r'為什麼$', r'哪裡$', r'哪個$',
        r'誰$', r'幾$', r'多少$', r'是否$', r'能否$', r'可否$', r'有沒有$',
        r'是不是$', r'會不會$', r'能不能$', r'可不可以$', r'好不好$', r'對不對$'
    ]

    # Marks that already end a sentence or clause (full-width and ASCII forms)
    END_PUNCTUATION = '。?!,;:?!,;:'

    def __init__(self):
        self.question_regex = re.compile('|'.join(self.QUESTION_PATTERNS))

    def add_punctuation(self, text: str, word_timestamps: Optional[List] = None) -> str:
        """Add punctuation to transcribed text."""
        if not text:
            return text
        text = text.strip()
        if not text:
            return text
        # Already has ending punctuation
        if text[-1] in self.END_PUNCTUATION:
            return text
        # Check for question patterns
        if self.question_regex.search(text):
            return text + '?'
        # Default to period for statements
        return text + '。'

    def process_segments(self, segments: List[dict]) -> str:
        """Process multiple segments with timestamps to add punctuation."""
        result_parts = []
        for i, seg in enumerate(segments):
            text = seg.get('text', '').strip()
            if not text:
                continue
            # Check for long pause before next segment (comma opportunity)
            if i < len(segments) - 1:
                next_seg = segments[i + 1]
                gap = next_seg.get('start', 0) - seg.get('end', 0)
                if gap > 0.5 and text[-1] not in self.END_PUNCTUATION:
                    # Long pause, add comma if not end of sentence
                    if not self.question_regex.search(text):
                        text = text + ','
            result_parts.append(text)
        # Join and add final punctuation
        result = ''.join(result_parts)
        return self.add_punctuation(result)


class SileroVAD:
    """Silero VAD using ONNX Runtime (lightweight alternative to PyTorch)."""

    MODEL_URL = "https://github.com/snakers4/silero-vad/raw/master/src/silero_vad/data/silero_vad.onnx"

    def __init__(self, model_path: Optional[str] = None, threshold: float = 0.5):
        self.threshold = threshold
        self.session = None
        self._h = np.zeros((2, 1, 64), dtype=np.float32)
        self._c = np.zeros((2, 1, 64), dtype=np.float32)
        self.sample_rate = 16000
        if not ONNX_AVAILABLE:
            print(json.dumps({"warning": "onnxruntime not available, VAD disabled"}), file=sys.stderr)
            return
        # Determine model path
        if model_path is None:
            cache_dir = Path.home() / ".cache" / "silero-vad"
            cache_dir.mkdir(parents=True, exist_ok=True)
            model_path = cache_dir / "silero_vad.onnx"
        # Download if not exists
        if not Path(model_path).exists():
            print(json.dumps({"status": "downloading_vad_model"}), file=sys.stderr)
            try:
                urllib.request.urlretrieve(self.MODEL_URL, model_path)
                print(json.dumps({"status": "vad_model_downloaded"}), file=sys.stderr)
            except Exception as e:
                print(json.dumps({"warning": f"VAD model download failed: {e}"}), file=sys.stderr)
                return
        # Load ONNX model
        try:
            self.session = ort.InferenceSession(
                str(model_path),
                providers=['CPUExecutionProvider']
            )
            print(json.dumps({"status": "vad_loaded"}), file=sys.stderr)
        except Exception as e:
            print(json.dumps({"warning": f"VAD load failed: {e}"}), file=sys.stderr)

    def reset_states(self):
        """Reset hidden states."""
        self._h = np.zeros((2, 1, 64), dtype=np.float32)
        self._c = np.zeros((2, 1, 64), dtype=np.float32)

    def __call__(self, audio: np.ndarray) -> float:
        """Run VAD on audio chunk, return speech probability."""
        if self.session is None:
            return 0.5  # Neutral if VAD not available
        # Ensure correct shape (batch, samples)
        if audio.ndim == 1:
            audio = audio[np.newaxis, :]
        # Run inference
        ort_inputs = {
            'input': audio.astype(np.float32),
            'sr': np.array([self.sample_rate], dtype=np.int64),
            'h': self._h,
            'c': self._c
        }
        output, self._h, self._c = self.session.run(None, ort_inputs)
        return float(output[0][0])


class VADProcessor:
    """Voice Activity Detection processor."""

    def __init__(self, sample_rate: int = 16000, threshold: float = 0.5, vad_model: Optional[SileroVAD] = None):
        self.sample_rate = sample_rate
        self.threshold = threshold
        # Reuse pre-loaded VAD model if provided
        self.vad = vad_model if vad_model else (SileroVAD(threshold=threshold) if ONNX_AVAILABLE else None)
        self.reset()

    def reset(self):
        """Reset VAD state."""
        self.audio_buffer = np.array([], dtype=np.float32)
        self.speech_buffer = np.array([], dtype=np.float32)
        self.speech_started = False
        self.silence_samples = 0
        self.speech_samples = 0
        if self.vad:
            self.vad.reset_states()

    def process_chunk(self, audio_chunk: np.ndarray) -> Optional[np.ndarray]:
        """
        Process audio chunk and return speech segment if speech end detected.

        Returns:
            Speech audio if end detected, None otherwise
        """
        self.audio_buffer = np.concatenate([self.audio_buffer, audio_chunk])
        # Fallback: time-based segmentation if no VAD
        if self.vad is None or self.vad.session is None:
            # Every 5 seconds, return the buffer
            if len(self.audio_buffer) >= self.sample_rate * 5:
                result = self.audio_buffer.copy()
                self.audio_buffer = np.array([], dtype=np.float32)
                return result
            return None
        # Process in 512-sample windows (32ms at 16kHz)
        window_size = 512
        silence_threshold_samples = int(0.5 * self.sample_rate)  # 500ms
        max_speech_samples = int(15 * self.sample_rate)  # 15s max
        while len(self.audio_buffer) >= window_size:
            window = self.audio_buffer[:window_size]
            self.audio_buffer = self.audio_buffer[window_size:]
            # Run VAD
            speech_prob = self.vad(window)
            if speech_prob >= self.threshold:
                if not self.speech_started:
                    self.speech_started = True
                    self.speech_buffer = np.array([], dtype=np.float32)
                self.speech_buffer = np.concatenate([self.speech_buffer, window])
                self.silence_samples = 0
                self.speech_samples += window_size
            else:
                if self.speech_started:
                    self.speech_buffer = np.concatenate([self.speech_buffer, window])
                    self.silence_samples += window_size
            # Force segment if speech too long
            if self.speech_samples >= max_speech_samples:
                result = self.speech_buffer.copy()
                self.speech_started = False
                self.speech_buffer = np.array([], dtype=np.float32)
                self.silence_samples = 0
                self.speech_samples = 0
                return result
            # Detect end of speech (500ms silence)
            if self.speech_started and self.silence_samples >= silence_threshold_samples:
                if len(self.speech_buffer) > self.sample_rate * 0.3:  # At least 300ms
                    result = self.speech_buffer.copy()
                    self.speech_started = False
                    self.speech_buffer = np.array([], dtype=np.float32)
                    self.silence_samples = 0
                    self.speech_samples = 0
                    return result
        return None

    def flush(self) -> Optional[np.ndarray]:
        """Flush remaining audio."""
        # Combine any remaining audio
        remaining = np.concatenate([self.speech_buffer, self.audio_buffer])
        if len(remaining) > self.sample_rate * 0.5:  # At least 500ms
            self.reset()
            return remaining
        self.reset()
        return None


class StreamingSession:
    """Manages a streaming transcription session."""

    def __init__(self, transcriber: 'Transcriber', vad_model: Optional[SileroVAD] = None):
        self.session_id = str(uuid.uuid4())
        self.transcriber = transcriber
        self.vad = VADProcessor(vad_model=vad_model)
        self.segment_id = 0
        self.active = True

    def process_chunk(self, audio_data: str) -> Optional[dict]:
        """Process base64-encoded audio chunk."""
        try:
            # Decode base64 to raw PCM (16-bit, 16kHz, mono)
            pcm_data = base64.b64decode(audio_data)
            audio = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32) / 32768.0
            # Run VAD
            speech_segment = self.vad.process_chunk(audio)
            if speech_segment is not None and len(speech_segment) > 0:
                return self._transcribe_segment(speech_segment)
            return None
        except Exception as e:
            return {"error": f"Chunk processing error: {e}"}

    def _transcribe_segment(self, audio: np.ndarray) -> dict:
        """Transcribe a speech segment."""
        self.segment_id += 1
        # Save to temp file for Whisper
        temp_file = tempfile.NamedTemporaryFile(suffix='.wav', delete=False)
        # Close our handle first so the path can be reopened and deleted on Windows
        temp_file.close()
        try:
            import wave
            with wave.open(temp_file.name, 'wb') as wf:
                wf.setnchannels(1)
                wf.setsampwidth(2)
                wf.setframerate(16000)
                wf.writeframes((audio * 32768).astype(np.int16).tobytes())
            # Transcribe
            text = self.transcriber.transcribe_file(temp_file.name, add_punctuation=True)
            return {
                "segment_id": self.segment_id,
                "text": text,
                "is_final": True,
                "duration": len(audio) / 16000
            }
        finally:
            # Auto-cleanup of the temp audio file
            os.unlink(temp_file.name)

    def stop(self) -> dict:
        """Stop the session and flush remaining audio."""
        self.active = False
        results = []
        # Flush VAD buffer
        remaining = self.vad.flush()
        if remaining is not None and len(remaining) > 0:
            result = self._transcribe_segment(remaining)
            if result and not result.get('error'):
                results.append(result)
        return {
            "status": "stream_stopped",
            "session_id": self.session_id,
            "total_segments": self.segment_id,
            "final_segments": results
        }


class Transcriber:
    """Main transcription engine."""

    def __init__(self, model_size: str = "medium", device: str = "cpu", compute_type: str = "int8"):
        self.model = None
        self.converter = None
        self.punctuator = ChinesePunctuator()
        self.streaming_session: Optional[StreamingSession] = None
        self.vad_model: Optional[SileroVAD] = None
        try:
            print(json.dumps({"status": "loading_model", "model": model_size}), file=sys.stderr)
            self.model = WhisperModel(model_size, device=device, compute_type=compute_type)
            self.converter = opencc.OpenCC("s2twp")
            print(json.dumps({"status": "model_loaded"}), file=sys.stderr)
            # Pre-load VAD model at startup (not when streaming starts)
            if ONNX_AVAILABLE:
                self.vad_model = SileroVAD()
        except Exception as e:
            print(json.dumps({"error": f"Failed to load model: {e}"}), file=sys.stderr)
            raise

    def transcribe_file(self, audio_path: str, add_punctuation: bool = False) -> str:
        """Transcribe an audio file to text."""
        if not self.model:
            return ""
        if not os.path.exists(audio_path):
            print(json.dumps({"error": f"File not found: {audio_path}"}), file=sys.stderr)
            return ""
        try:
            segments, info = self.model.transcribe(
                audio_path,
                language="zh",  # Mandarin; output is converted to Traditional Chinese below
                beam_size=5,
                vad_filter=True,
                word_timestamps=add_punctuation,
                # Anti-hallucination settings
                condition_on_previous_text=False,  # Prevents hallucination propagation
                no_speech_threshold=0.6,  # Higher = stricter silence detection
                compression_ratio_threshold=2.4,  # Filter repetitive/hallucinated text
                log_prob_threshold=-1.0,  # Filter low-confidence output
                temperature=0.0,  # Deterministic output (no sampling)
            )
            if add_punctuation:
                # Collect segments with timestamps for punctuation
                seg_list = []
                for segment in segments:
                    seg_list.append({
                        'text': segment.text,
                        'start': segment.start,
                        'end': segment.end
                    })
                text = self.punctuator.process_segments(seg_list)
            else:
                text = ""
                for segment in segments:
                    text += segment.text
            # Convert to Traditional Chinese
            if text and self.converter:
                text = self.converter.convert(text)
            return text.strip()
        except Exception as e:
            print(json.dumps({"error": f"Transcription error: {e}"}), file=sys.stderr)
            return ""

    def handle_command(self, cmd: dict) -> Optional[dict]:
        """Handle a JSON command."""
        action = cmd.get("action")
        if action == "transcribe":
            # File-based transcription (legacy)
            audio_path = cmd.get("file")
            if audio_path:
                text = self.transcribe_file(audio_path, add_punctuation=True)
                return {"result": text, "file": audio_path}
            return {"error": "No file specified"}
        elif action == "start_stream":
            # Start streaming session
            if self.streaming_session and self.streaming_session.active:
                return {"error": "Stream already active"}
            # Pass pre-loaded VAD model to avoid download delay
            self.streaming_session = StreamingSession(self, vad_model=self.vad_model)
            return {
                "status": "streaming",
                "session_id": self.streaming_session.session_id
            }
        elif action == "audio_chunk":
            # Process audio chunk
            if not self.streaming_session or not self.streaming_session.active:
                return {"error": "No active stream"}
            data = cmd.get("data")
            if not data:
                return {"error": "No audio data"}
            result = self.streaming_session.process_chunk(data)
            return result  # May be None if no segment ready
        elif action == "stop_stream":
            # Stop streaming session
            if not self.streaming_session:
                return {"error": "No active stream"}
            result = self.streaming_session.stop()
            self.streaming_session = None
            return result
        elif action == "ping":
            return {"status": "pong"}
        elif action == "quit":
            return {"status": "exiting"}
        else:
            return {"error": f"Unknown action: {action}"}

    def run_server(self):
        """Run in server mode, reading JSON commands from stdin."""
        print(json.dumps({"status": "ready"}))
        sys.stdout.flush()
        for line in sys.stdin:
            line = line.strip()
            if not line:
                continue
            try:
                cmd = json.loads(line)
                result = self.handle_command(cmd)
                if result:
                    print(json.dumps(result))
                    sys.stdout.flush()
                if cmd.get("action") == "quit":
                    break
            except json.JSONDecodeError as e:
                print(json.dumps({"error": f"Invalid JSON: {e}"}))
                sys.stdout.flush()


def main():
    model_size = os.environ.get("WHISPER_MODEL", "small")
    device = os.environ.get("WHISPER_DEVICE", "cpu")
    compute_type = os.environ.get("WHISPER_COMPUTE", "int8")
    try:
        transcriber = Transcriber(model_size, device, compute_type)
        if len(sys.argv) > 1:
            if sys.argv[1] == "--server":
                transcriber.run_server()
            else:
                # File mode
                text = transcriber.transcribe_file(sys.argv[1], add_punctuation=True)
                print(text)
        else:
            # Default to server mode
            transcriber.run_server()
    except Exception as e:
        print(json.dumps({"error": str(e)}), file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()