feat: Meeting Assistant MVP - Complete implementation

Enterprise Meeting Knowledge Management System with:

Backend (FastAPI):
- Authentication proxy with JWT (pj-auth-api integration)
- MySQL database with 4 tables (users, meetings, conclusions, actions)
- Meeting CRUD with system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
- Dify LLM integration for AI summarization
- Excel export with openpyxl
- 20 unit tests (all passing)

Client (Electron):
- Login page with company auth
- Meeting list with create/delete
- Meeting detail with real-time transcription
- Editable transcript textarea (single block, easy editing)
- AI summarization with conclusions/action items
- 5-second segment recording (efficient for long meetings)

Sidecar (Python):
- faster-whisper medium model with int8 quantization
- ONNX Runtime VAD (lightweight, ~20MB vs PyTorch ~2GB)
- Chinese punctuation processing
- OpenCC for Traditional Chinese conversion
- Anti-hallucination parameters
- Auto-cleanup of temp audio files

OpenSpec:
- add-meeting-assistant-mvp (47 tasks, archived)
- add-realtime-transcription (29 tasks, archived)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-10 20:17:44 +08:00
commit 8b6184ecc5
65 changed files with 10510 additions and 0 deletions
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/design.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/design.md
@@ -0,0 +1,132 @@
+## Context
+Building a meeting knowledge management system for enterprise users. The system must support offline transcription on standard hardware (i5/8GB), integrate with existing company authentication, and provide AI-powered summarization via Dify LLM.
+
+**Stakeholders**: Enterprise meeting participants, meeting recorders, admin users (ymirliu@panjit.com.tw)
+
+**Constraints**:
+- Must run faster-whisper int8 on i5/8GB laptop
+- DB credentials and API keys must stay server-side (security)
+- All database tables prefixed with `meeting_`
+- Output must support Traditional Chinese (繁體中文)
+
+## Goals / Non-Goals
+
+**Goals**:
+- Deliver working MVP with all six capabilities
+- Secure architecture with secrets in middleware only
+- Offline-capable transcription
+- Structured output with trackable action items
+
+**Non-Goals**:
+- Multi-language support beyond Traditional Chinese
+- Real-time collaborative editing
+- Mobile client
+- Custom LLM model training
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                     Electron Client                              │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
+│  │  Auth UI    │  │ Meeting UI  │  │  Transcription Engine   │  │
+│  │  (Login)    │  │ (CRUD/Edit) │  │  (faster-whisper+OpenCC)│  │
+│  └──────┬──────┘  └──────┬──────┘  └────────────┬────────────┘  │
+└─────────┼────────────────┼──────────────────────┼───────────────┘
+          │                │                      │
+          │ HTTP           │ HTTP                 │ Local only
+          ▼                ▼                      ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                  FastAPI Middleware Server                       │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────┐  │
+│  │ Auth Proxy  │  │Meeting CRUD │  │ Dify Proxy  │  │ Export │  │
+│  │ POST /login │  │POST/GET/... │  │POST /ai/... │  │GET /:id│  │
+│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └───┬────┘  │
+└─────────┼────────────────┼────────────────┼─────────────┼───────┘
+          │                │                │             │
+          ▼                ▼                ▼             │
+┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
+│ PJ-Auth API  │  │    MySQL     │  │   Dify LLM   │     │
+│  (Vercel)    │  │ (theaken.com)│  │(theaken.com) │     │
+└──────────────┘  └──────────────┘  └──────────────┘     │
+                                                         │
+                                    ┌────────────────────┘
+                                    ▼
+                           ┌──────────────┐
+                           │Excel Template│
+                           │ (openpyxl)   │
+                           └──────────────┘
+```
+
+## Decisions
+
+### Decision 1: Three-tier architecture with middleware
+**Choice**: All external services accessed through FastAPI middleware
+**Rationale**: Security requirement - DB credentials and API keys cannot be in Electron client
+**Alternatives considered**:
+- Direct client-to-service: Rejected due to credential exposure risk
+- Serverless functions: More complex deployment for similar security
+
+### Decision 2: Edge transcription in Electron
+**Choice**: Run faster-whisper locally via Python sidecar (PyInstaller)
+**Rationale**: Offline capability requirement; network latency unacceptable for real-time transcription
+**Alternatives considered**:
+- Cloud STT (Google/Azure): Requires network, latency issues
+- WebAssembly whisper: Not mature enough for production
+
+### Decision 3: MySQL with prefixed tables
+**Choice**: Use shared MySQL instance with `meeting_` prefix
+**Rationale**: Leverage existing infrastructure; prefix ensures isolation
+**Alternatives considered**:
+- Dedicated database: Overhead not justified for MVP
+- SQLite: File-based, so poorly suited to concurrent multi-user access from a shared server
+
+### Decision 4: Dify for LLM summarization
+**Choice**: Use company Dify instance for AI features
+**Rationale**: Already available infrastructure; structured JSON output support
+**Alternatives considered**:
+- Direct OpenAI API: Additional cost, no existing infrastructure
+- Local LLM: Hardware constraints (i5/8GB insufficient)
+
+## Risks / Trade-offs
+
+| Risk | Impact | Mitigation |
+|------|--------|------------|
+| faster-whisper performance on i5/8GB | High | Use int8 quantization; test on target hardware early |
+| Dify timeout on long transcripts | Medium | Implement chunking; add timeout handling with retry |
+| Token expiry during long meetings | Medium | Implement auto-refresh interceptor in client |
+| Network failure during save | Medium | Client-side queue with retry; local draft storage |
+
+## Data Model
+
+```sql
+-- Tables all prefixed with meeting_
+
+meeting_users (user_id, email, display_name, role, created_at)
+meeting_records (meeting_id, uuid, subject, meeting_time, location,
+                 chairperson, recorder, attendees, transcript_blob,
+                 created_by, created_at)
+meeting_conclusions (conclusion_id, meeting_id, content, system_code)
+meeting_action_items (action_id, meeting_id, content, owner, due_date,
+                      status, system_code)
+```
+
+**ID Formats**:
+- Conclusions: `C-YYYYMMDD-XX` (e.g., C-20251210-01)
+- Action Items: `A-YYYYMMDD-XX` (e.g., A-20251210-01)
+
+## API Endpoints
+
+| Method | Endpoint | Purpose |
+|--------|----------|---------|
+| POST | /api/login | Proxy auth to PJ-Auth API |
+| GET | /api/meetings | List meetings (filterable) |
+| POST | /api/meetings | Create meeting |
+| GET | /api/meetings/:id | Get meeting details |
+| PUT | /api/meetings/:id | Update meeting |
+| DELETE | /api/meetings/:id | Delete meeting |
+| POST | /api/ai/summarize | Send transcript to Dify |
+| GET | /api/meetings/:id/export | Generate Excel report |
+
+## Open Questions
+- None currently - PRD and SDD provide sufficient detail for MVP implementation
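The Risks / Trade-offs table in design.md lists "Implement chunking" as the mitigation for Dify timeouts on long transcripts, but does not spell it out. A minimal sketch of one way to do it, assuming chunks are cut at sentence-ending punctuation and capped by character count. The name `chunk_transcript` and the `max_chars` default are hypothetical and do not come from this commit:

```python
import re

# Zero-width split point right after a sentence-ending mark (the mark is kept).
_SENTENCE_END = re.compile(r"(?<=[。？！?!])")


def chunk_transcript(transcript: str, max_chars: int = 2000) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is hard-split, so no chunk
    exceeds the limit. Hypothetical helper, not taken from the commit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(transcript):
        if not sentence:
            continue
        # Break an oversized sentence into max_chars-sized pieces.
        pieces = [sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            if current and len(current) + len(piece) > max_chars:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be summarized separately and the resulting conclusions and action items concatenated. Whether that merge needs de-duplication depends on the Dify prompt, which this change does not show.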
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/proposal.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/proposal.md
@@ -0,0 +1,25 @@
+# Change: Add Meeting Assistant MVP
+
+## Why
+Enterprise users spend significant time manually documenting meetings and tracking action items. This MVP delivers an end-to-end meeting knowledge management solution with offline transcription, AI-powered summarization, and structured tracking of conclusions and action items.
+
+## What Changes
+- **NEW** FastAPI middleware server with MySQL integration
+- **NEW** Authentication proxy to company Auth API with admin role detection
+- **NEW** Meeting CRUD operations with metadata management
+- **NEW** Edge-based speech-to-text using faster-whisper (int8)
+- **NEW** Dify LLM integration for intelligent summarization
+- **NEW** Excel report generation from templates
+
+## Impact
+- Affected specs: middleware, authentication, meeting-management, transcription, ai-summarization, excel-export
+- Affected code: New Python FastAPI backend, new Electron frontend
+- External dependencies: PJ-Auth API, MySQL database, Dify LLM service
+
+## Success Criteria
+- Users can log in via company SSO
+- Meetings can be created with required metadata (subject, time, chairperson, location, recorder, attendees)
+- Speech-to-text works offline on i5/8GB hardware
+- AI generates structured conclusions and action items from transcripts
+- Action items have trackable status (Open/In Progress/Done/Delayed)
+- Excel reports can be exported with all meeting data
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/ai-summarization/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/ai-summarization/spec.md
@@ -0,0 +1,45 @@
+## ADDED Requirements
+
+### Requirement: Dify Integration
+The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.
+
+#### Scenario: Successful summarization
+- **WHEN** user submits POST /api/ai/summarize with transcript text
+- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items
+
+#### Scenario: Dify timeout handling
+- **WHEN** Dify API does not respond within timeout period
+- **THEN** the server SHALL return HTTP 504 with a timeout error so that the client can retry
+
+#### Scenario: Dify error handling
+- **WHEN** Dify API returns error (500, rate limit, etc.)
+- **THEN** the server SHALL return appropriate HTTP error with details
+
+### Requirement: Structured Output Format
+The AI summarization SHALL return structured data with conclusions and action items.
+
+#### Scenario: Complete structured response
+- **WHEN** transcript contains clear decisions and assignments
+- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields
+
+#### Scenario: Partial data extraction
+- **WHEN** transcript lacks explicit owner or due_date for action items
+- **THEN** those fields SHALL be empty strings allowing manual completion
+
+### Requirement: Dify Prompt Configuration
+The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.
+
+#### Scenario: System prompt behavior
+- **WHEN** transcript is sent to Dify
+- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format
+
+### Requirement: Manual Data Completion
+The Electron client SHALL allow users to manually complete missing AI-extracted data.
+
+#### Scenario: Fill missing owner
+- **WHEN** AI returns action item without owner
+- **THEN** user SHALL be able to select or type owner name in the UI
+
+#### Scenario: Fill missing due date
+- **WHEN** AI returns action item without due_date
+- **THEN** user SHALL be able to select date using date picker
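The "Structured Output Format" requirement and its partial-data scenario in the ai-summarization spec can be illustrated with a small normalizer. This is only a sketch: the real shape of the Dify workflow answer is not shown in this change, so the input is assumed to be a JSON object with optional `conclusions` and `action_items` keys, and `normalize_summary` is a hypothetical name.

```python
import json


def normalize_summary(raw_json: str) -> dict:
    """Coerce an LLM JSON answer into the structure the spec describes.

    Missing owner or due_date values become empty strings so the user can
    complete them manually in the client.
    """
    data = json.loads(raw_json)
    conclusions = [
        str(c).strip() for c in (data.get("conclusions") or []) if str(c).strip()
    ]
    action_items = []
    for item in data.get("action_items") or []:
        # Assumption: a bare string is an action item that only has content.
        if isinstance(item, str):
            item = {"content": item}
        action_items.append({
            "content": str(item.get("content") or "").strip(),
            "owner": str(item.get("owner") or ""),
            "due_date": str(item.get("due_date") or ""),
        })
    return {"conclusions": conclusions, "action_items": action_items}
```

Returning empty strings rather than `null` matches the spec wording and lets the client bind the fields directly to text inputs and a date picker.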
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/authentication/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/authentication/spec.md
@@ -0,0 +1,42 @@
+## ADDED Requirements
+
+### Requirement: Login Proxy
+The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.
+
+#### Scenario: Successful login
+- **WHEN** user submits valid credentials to POST /api/login
+- **THEN** the server SHALL forward to Auth API and return the JWT token
+
+#### Scenario: Admin role detection
+- **WHEN** user logs in with email ymirliu@panjit.com.tw
+- **THEN** the response JWT payload SHALL include role: "admin"
+
+#### Scenario: Invalid credentials
+- **WHEN** user submits invalid credentials
+- **THEN** the server SHALL return HTTP 401 with error message from Auth API
+
+### Requirement: Token Validation
+The middleware server SHALL validate JWT tokens on protected endpoints.
+
+#### Scenario: Valid token access
+- **WHEN** request includes valid JWT in Authorization header
+- **THEN** the request SHALL proceed to the endpoint handler
+
+#### Scenario: Expired token
+- **WHEN** request includes expired JWT
+- **THEN** the server SHALL return HTTP 401 with "token_expired" error code
+
+#### Scenario: Missing token
+- **WHEN** request to protected endpoint lacks Authorization header
+- **THEN** the server SHALL return HTTP 401 with "token_required" error code
+
+### Requirement: Token Auto-Refresh
+The Electron client SHALL implement automatic token refresh before expiration.
+
+#### Scenario: Proactive refresh
+- **WHEN** token approaches expiration (within 5 minutes) during active session
+- **THEN** the client SHALL request new token transparently without user interruption
+
+#### Scenario: Refresh during long meeting
+- **WHEN** user is in a meeting session lasting longer than token validity
+- **THEN** the client SHALL maintain authentication through automatic refresh
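The token validation and proactive-refresh scenarios in the authentication spec could look roughly like the following. This is a sketch under loud assumptions: the spec does not say how PJ-Auth signs its tokens, so HS256 with a shared secret is assumed here purely for illustration, and `sign_hs256`, `check_authorization`, `should_refresh`, and the `token_invalid` code are hypothetical names that do not appear in the commit. Only `token_expired`, `token_required`, and the 5-minute window come from the spec.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (used here only to exercise the checker)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url_encode(signature)


def check_authorization(header_value, secret: bytes, now=None):
    """Return (http_status, error_code, payload) for an Authorization header."""
    if not header_value or not header_value.startswith("Bearer "):
        return 401, "token_required", None
    token = header_value[len("Bearer "):]
    try:
        head_b64, body_b64, sig_b64 = token.split(".")
        expected = hmac.new(
            secret, f"{head_b64}.{body_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return 401, "token_invalid", None  # hypothetical code, not in the spec
        payload = json.loads(_b64url_decode(body_b64))
    except ValueError:  # covers bad base64, bad JSON, and a wrong segment count
        return 401, "token_invalid", None
    if not isinstance(payload, dict):
        return 401, "token_invalid", None
    now = time.time() if now is None else now
    if payload.get("exp", 0) <= now:
        return 401, "token_expired", None
    return 200, None, payload


def should_refresh(exp: float, now: float, window_seconds: int = 300) -> bool:
    """True when a still-valid token expires within the refresh window."""
    return 0 < exp - now <= window_seconds
```

In a real deployment a maintained JWT library and a check of the `alg` header would be preferable to hand-rolled verification. The point here is only the mapping from token state to the error codes the spec names.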
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/excel-export/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/excel-export/spec.md
@@ -0,0 +1,45 @@
+## ADDED Requirements
+
+### Requirement: Excel Report Generation
+The middleware server SHALL generate Excel reports from meeting data using templates.
+
+#### Scenario: Successful export
+- **WHEN** user requests GET /api/meetings/:id/export
+- **THEN** server SHALL generate Excel file and return as downloadable stream
+
+#### Scenario: Export non-existent meeting
+- **WHEN** user requests export for non-existent meeting ID
+- **THEN** server SHALL return HTTP 404
+
+### Requirement: Template-based Generation
+The Excel export SHALL use openpyxl with template files.
+
+#### Scenario: Placeholder replacement
+- **WHEN** Excel is generated
+- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data
+
+#### Scenario: Dynamic row insertion
+- **WHEN** meeting has multiple conclusions or action items
+- **THEN** rows SHALL be dynamically inserted to accommodate all items
+
+### Requirement: Complete Data Inclusion
+The exported Excel SHALL include all meeting metadata and AI-generated content.
+
+#### Scenario: Full metadata export
+- **WHEN** Excel is generated
+- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees
+
+#### Scenario: Conclusions export
+- **WHEN** Excel is generated
+- **THEN** all conclusions SHALL be listed with their system codes
+
+#### Scenario: Action items export
+- **WHEN** Excel is generated
+- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code
+
+### Requirement: Template Management
+Admin users SHALL be able to manage Excel templates.
+
+#### Scenario: Admin template access
+- **WHEN** admin user accesses template management
+- **THEN** they SHALL be able to upload, view, and update Excel templates
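The "Placeholder replacement" scenario in the excel-export spec can be sketched without touching a workbook. The version below works on plain strings and on a list-of-lists grid that stands in for worksheet cells. `fill_placeholders` and `fill_grid` are hypothetical names, and leaving unknown placeholders untouched is an assumption, since the spec does not say what should happen to them.

```python
import re

# Matches {{name}} with optional inner whitespace, e.g. {{ chair }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(text: str, values: dict) -> str:
    """Replace {{name}} markers; unknown names are left as they are."""
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)
    return _PLACEHOLDER.sub(substitute, text)


def fill_grid(rows, values):
    """Apply fill_placeholders to every string cell of a list-of-lists grid."""
    return [
        [fill_placeholders(cell, values) if isinstance(cell, str) else cell for cell in row]
        for row in rows
    ]
```

With openpyxl the same function could be applied while walking `ws.iter_rows()` and assigning the result back to `cell.value` for string cells. The template layout itself is not part of this diff.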
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/meeting-management/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/meeting-management/spec.md
@@ -0,0 +1,71 @@
+## ADDED Requirements
+
+### Requirement: Create Meeting
+The system SHALL allow users to create meetings with required metadata.
+
+#### Scenario: Create meeting with all fields
+- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
+- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned
+
+#### Scenario: Create meeting with missing required fields
+- **WHEN** user submits POST /api/meetings without subject or meeting_time
+- **THEN** the server SHALL return HTTP 400 with validation error details
+
+#### Scenario: Recorder defaults to current user
+- **WHEN** user creates meeting without specifying recorder
+- **THEN** the recorder field SHALL default to the logged-in user's email
+
+### Requirement: List Meetings
+The system SHALL allow users to retrieve a list of meetings.
+
+#### Scenario: List all meetings for admin
+- **WHEN** admin user requests GET /api/meetings
+- **THEN** all meetings SHALL be returned
+
+#### Scenario: List meetings for regular user
+- **WHEN** regular user requests GET /api/meetings
+- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned
+
+### Requirement: Get Meeting Details
+The system SHALL allow users to retrieve full meeting details including conclusions and action items.
+
+#### Scenario: Get meeting with related data
+- **WHEN** user requests GET /api/meetings/:id
+- **THEN** meeting record with all conclusions and action_items SHALL be returned
+
+#### Scenario: Get non-existent meeting
+- **WHEN** user requests GET /api/meetings/:id for non-existent ID
+- **THEN** the server SHALL return HTTP 404
+
+### Requirement: Update Meeting
+The system SHALL allow users to update meeting data, conclusions, and action items.
+
+#### Scenario: Update meeting metadata
+- **WHEN** user submits PUT /api/meetings/:id with updated fields
+- **THEN** the meeting record SHALL be updated and new data returned
+
+#### Scenario: Update action item status
+- **WHEN** user updates action item status to "Done"
+- **THEN** the action_items record SHALL reflect the new status
+
+### Requirement: Delete Meeting
+The system SHALL allow authorized users to delete meetings.
+
+#### Scenario: Admin deletes any meeting
+- **WHEN** admin user requests DELETE /api/meetings/:id
+- **THEN** the meeting and all related conclusions and action_items SHALL be deleted
+
+#### Scenario: User deletes own meeting
+- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
+- **THEN** the meeting and all related data SHALL be deleted
+
+### Requirement: System Code Generation
+The system SHALL auto-generate unique system codes for conclusions and action items.
+
+#### Scenario: Generate conclusion code
+- **WHEN** a conclusion is created for a meeting on 2025-12-10
+- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number
+
+#### Scenario: Generate action item code
+- **WHEN** an action item is created for a meeting on 2025-12-10
+- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number
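The "System Code Generation" requirement in the meeting-management spec amounts to a per-day, per-prefix counter. A minimal sketch follows, assuming the sequence is derived from the codes already issued for that date. `next_system_code` is a hypothetical helper, and how the commit actually computes the sequence is not shown in this section:

```python
from datetime import date


def next_system_code(prefix: str, meeting_date: date, existing_codes) -> str:
    """Return the next code for the day, e.g. C-20251210-03."""
    stem = f"{prefix}-{meeting_date:%Y%m%d}-"
    used = [
        int(code[len(stem):])
        for code in existing_codes
        if code.startswith(stem) and code[len(stem):].isdecimal()
    ]
    # Continue after the highest sequence seen so deleted codes are not reused.
    sequence = max(used, default=0) + 1
    return f"{stem}{sequence:02d}"
```

Two caveats under these assumptions: the spec's `XX` suggests two digits, so the 100th code on one day would need a decision, and concurrent inserts would need a unique constraint or a transaction on the database side.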
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/middleware/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/middleware/spec.md
@@ -0,0 +1,41 @@
+## ADDED Requirements
+
+### Requirement: FastAPI Server Configuration
+The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.
+
+#### Scenario: Server startup with valid configuration
+- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
+- **THEN** the server SHALL start successfully and accept connections
+
+#### Scenario: Server startup with missing configuration
+- **WHEN** the server starts with missing required environment variables
+- **THEN** the server SHALL fail to start with descriptive error message
+
+### Requirement: Database Connection Pool
+The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.
+
+#### Scenario: Database connection success
+- **WHEN** the server connects to MySQL with valid credentials
+- **THEN** a connection pool SHALL be established and queries SHALL execute successfully
+
+#### Scenario: Database connection failure
+- **WHEN** the database is unreachable
+- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints
+
+### Requirement: Table Initialization
+The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.
+
+#### Scenario: Tables created on first run
+- **WHEN** the server starts and tables do not exist
+- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables
+
+#### Scenario: Tables already exist
+- **WHEN** the server starts and tables already exist
+- **THEN** the server SHALL skip table creation and continue normally
+
+### Requirement: CORS Configuration
+The middleware server SHALL allow cross-origin requests from the Electron client.
+
+#### Scenario: CORS preflight request
+- **WHEN** Electron client sends OPTIONS request
+- **THEN** the server SHALL respond with appropriate CORS headers allowing the request
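The two server-startup scenarios in the middleware spec (start with a valid `.env`, fail with a descriptive error otherwise) can be sketched as a fail-fast loader. The variable names come from the spec. `ConfigError` and `load_config` are hypothetical, and loading the `.env` file into the environment (for example with python-dotenv) is assumed to have happened beforehand.

```python
import os

REQUIRED_VARS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
    "DIFY_API_URL", "DIFY_API_KEY",
)


class ConfigError(RuntimeError):
    """Raised at startup when required settings are absent or malformed."""


def load_config(env=None) -> dict:
    """Validate required settings and return them, with DB_PORT as an int."""
    env = os.environ if env is None else env
    # Treat empty strings as missing so a blank line in .env is caught too.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ConfigError("Missing required environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    try:
        config["DB_PORT"] = int(config["DB_PORT"])
    except ValueError:
        raise ConfigError("DB_PORT must be an integer") from None
    return config
```

Calling this before the application object is created means a misconfigured deployment stops immediately with the list of missing names, rather than failing on the first database query.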
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/transcription/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/specs/transcription/spec.md
@@ -0,0 +1,41 @@
+## ADDED Requirements
+
+### Requirement: Edge Speech-to-Text
+The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.
+
+#### Scenario: Successful transcription
+- **WHEN** user records audio during a meeting
+- **THEN** the audio SHALL be transcribed locally without network dependency
+
+#### Scenario: Transcription on target hardware
+- **WHEN** running on i5 processor with 8GB RAM
+- **THEN** transcription SHALL complete within acceptable latency for real-time display
+
+### Requirement: Traditional Chinese Output
+The transcription engine SHALL output Traditional Chinese (繁體中文) text.
+
+#### Scenario: Simplified to Traditional conversion
+- **WHEN** whisper outputs Simplified Chinese characters
+- **THEN** OpenCC SHALL convert output to Traditional Chinese
+
+#### Scenario: Native Traditional Chinese
+- **WHEN** whisper outputs Traditional Chinese directly
+- **THEN** the text SHALL pass through unchanged
+
+### Requirement: Real-time Display
+The Electron client SHALL display transcription results in real-time.
+
+#### Scenario: Streaming transcription
+- **WHEN** user is recording
+- **THEN** transcribed text SHALL appear in the left panel within seconds of speech
+
+### Requirement: Python Sidecar
+The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.
+
+#### Scenario: Sidecar startup
+- **WHEN** Electron app launches
+- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available
+
+#### Scenario: Sidecar communication
+- **WHEN** Electron sends audio data to sidecar
+- **THEN** transcribed text SHALL be returned via IPC
--- a/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/tasks.md
+++ b/openspec/changes/archive/2025-12-10-add-meeting-assistant-mvp/tasks.md
@@ -0,0 +1,67 @@
+## 1. Middleware Server Foundation
+- [x] 1.1 Initialize Python project with FastAPI, uvicorn, python-dotenv
+- [x] 1.2 Create .env.example with all required environment variables
+- [x] 1.3 Implement database connection pool with mysql-connector-python
+- [x] 1.4 Create table initialization script (meeting_users, meeting_records, meeting_conclusions, meeting_action_items)
+- [x] 1.5 Configure CORS middleware for Electron client
+- [x] 1.6 Add health check endpoint GET /api/health
+
+## 2. Authentication
+- [x] 2.1 Implement POST /api/login proxy to PJ-Auth API
+- [x] 2.2 Add admin role detection for ymirliu@panjit.com.tw
+- [x] 2.3 Create JWT validation middleware for protected routes
+- [x] 2.4 Handle token expiration with appropriate error codes
+
+## 3. Meeting CRUD
+- [x] 3.1 Implement POST /api/meetings (create meeting)
+- [x] 3.2 Implement GET /api/meetings (list meetings with user filtering)
+- [x] 3.3 Implement GET /api/meetings/:id (get meeting with conclusions and action items)
+- [x] 3.4 Implement PUT /api/meetings/:id (update meeting)
+- [x] 3.5 Implement DELETE /api/meetings/:id (delete meeting cascade)
+- [x] 3.6 Implement system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
+
+## 4. AI Summarization
+- [x] 4.1 Implement POST /api/ai/summarize endpoint
+- [x] 4.2 Configure Dify API client with timeout and retry
+- [x] 4.3 Parse Dify response into conclusions and action_items structure
+- [x] 4.4 Handle partial data (empty owner/due_date)
+
+## 5. Excel Export
+- [x] 5.1 Create Excel template with placeholders
+- [x] 5.2 Implement GET /api/meetings/:id/export endpoint
+- [x] 5.3 Implement placeholder replacement ({{subject}}, {{time}}, etc.)
+- [x] 5.4 Implement dynamic row insertion for conclusions and action items
+
+## 6. Electron Client - Core
+- [x] 6.1 Initialize Electron project with electron-builder
+- [x] 6.2 Create main window and basic navigation
+- [x] 6.3 Implement login page with auth API integration
+- [x] 6.4 Implement token storage and auto-refresh interceptor
+
+## 7. Electron Client - Meeting UI
+- [x] 7.1 Create meeting list page
+- [x] 7.2 Create meeting creation form (metadata fields)
+- [x] 7.3 Create dual-panel meeting view (transcript left, notes right)
+- [x] 7.4 Implement conclusion/action item editing with manual completion UI
+- [x] 7.5 Add export button with download handling
+
+## 8. Transcription Engine
+- [x] 8.1 Create Python sidecar project with faster-whisper and OpenCC
+- [x] 8.2 Implement audio input capture
+- [x] 8.3 Implement transcription with int8 model
+- [x] 8.4 Implement OpenCC Traditional Chinese conversion
+- [x] 8.5 Set up IPC communication between Electron and sidecar
+- [x] 8.6 Package sidecar with PyInstaller
+
+## 9. Testing
+- [x] 9.1 Unit tests: DB connection, table creation
+- [x] 9.2 Unit tests: Dify proxy with mock responses
+- [x] 9.3 Unit tests: Admin role detection
+- [x] 9.4 Integration test: Auth flow with token refresh
+- [x] 9.5 Integration test: Full meeting cycle (create → transcribe → summarize → save → export)
+
+## 10. Deployment Preparation
+- [x] 10.1 Create requirements.txt with all dependencies
+- [x] 10.2 Create deployment documentation
+- [x] 10.3 Configure electron-builder for portable target
+- [x] 10.4 Verify faster-whisper performance on i5/8GB hardware
--- a/openspec/changes/archive/2025-12-10-add-realtime-transcription/design.md
+++ b/openspec/changes/archive/2025-12-10-add-realtime-transcription/design.md
@@ -0,0 +1,117 @@
+## Context
+The Meeting Assistant currently uses batch transcription: audio is recorded, saved to file, then sent to Whisper for processing. This creates a poor UX where users must wait until recording stops to see any text. Users also cannot correct transcription errors.
+
+**Stakeholders**: End users recording meetings, admin reviewing transcripts
+**Constraints**: i5/8GB hardware target, offline capability required
+
+## Goals / Non-Goals
+
+### Goals
+- Real-time text display during recording (< 3 second latency)
+- Segment-based editing without disrupting ongoing transcription
+- Punctuation in output (Chinese: 。，？！；：)
+- Maintain offline capability (all processing local)
+
+### Non-Goals
+- Speaker diarization (who said what) - future enhancement
+- Multi-language mixing - Chinese only for MVP
+- Cloud-based transcription fallback
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│ Renderer Process (meeting-detail.html)                      │
+│  ┌──────────────┐    ┌─────────────────────────────────┐   │
+│  │ MediaRecorder│───▶│ Editable Transcript Component   │   │
+│  │(audio chunks)│    │  [Segment 1] [Segment 2] [...]  │   │
+│  └──────┬───────┘    └─────────────────────────────────┘   │
+│         │ IPC: stream-audio-chunk                          │
+└─────────┼──────────────────────────────────────────────────┘
+          ▼
+┌─────────────────────────────────────────────────────────────┐
+│ Main Process (main.js)                                      │
+│  ┌──────────────────┐     ┌─────────────────────────────┐  │
+│  │ Audio Buffer     │────▶│ Sidecar (stdin pipe)        │  │
+│  │ (accumulate PCM) │     │                             │  │
+│  └──────────────────┘     └──────────┬──────────────────┘  │
+│                                      │ IPC: transcription-segment
+│                                      ▼                      │
+│                           Forward to renderer               │
+└─────────────────────────────────────────────────────────────┘
+          │
+          ▼ stdin (WAV chunks)
+┌─────────────────────────────────────────────────────────────┐
+│ Sidecar Process (transcriber.py)                            │
+│  ┌──────────────┐   ┌──────────────┐   ┌────────────────┐  │
+│  │ VAD Buffer   │──▶│ Whisper      │──▶│ Punctuator     │  │
+│  │ (silero-vad) │   │ (transcribe) │   │ (rule-based)   │  │
+│  └──────────────┘   └──────────────┘   └────────────────┘  │
+│         │                                      │            │
+│         │ Detect speech end                    │            │
+│         ▼                                      ▼            │
+│  stdout: {"segment_id": 1, "text": "今天開會討論。", ...}  │
+└─────────────────────────────────────────────────────────────┘
+```
+
+## Decisions
+
+### Decision 1: VAD-triggered Segmentation
+**What**: Use Silero VAD to detect speech boundaries, transcribe complete utterances
+**Why**:
+- More accurate than fixed-interval chunking
+- Natural sentence boundaries
+- Reduces partial/incomplete transcriptions
+**Alternatives**:
+- Fixed 5-second chunks (simpler but cuts mid-sentence)
+- Word-level streaming (too fragmented, higher latency)
+
+### Decision 2: Segment-based Editing
+**What**: Each VAD segment becomes an editable text block with unique ID
+**Why**:
+- Users can edit specific segments without affecting others
+- New segments append without disrupting editing
+- Simple merge on save (concatenate all segments)
+**Alternatives**:
+- Single textarea (editing conflicts with appending text)
+- Contenteditable div (complex cursor management)
+
+### Decision 3: Audio Format Pipeline
+**What**: WebM (MediaRecorder) → WAV conversion in main.js → raw PCM to sidecar
+**Why**:
+- MediaRecorder in Electron (Chromium) records WebM/Opus by default
+- Whisper works best with WAV/PCM
+- Conversion in main.js keeps sidecar simple
+**Alternatives**:
+- ffmpeg in sidecar (adds large dependency)
+- Raw PCM from AudioWorklet (complex, browser compatibility issues)
+
+### Decision 4: Punctuation via Whisper + Rules
+**What**: Enable Whisper word_timestamps, apply rule-based punctuation after
+**Why**:
+- Whisper alone outputs minimal punctuation for Chinese
+- Rule-based post-processing adds 。，？ based on pauses and patterns
+- No additional model needed
+**Alternatives**:
+- Separate punctuation model (adds latency and complexity)
+- No punctuation (rejected: punctuation is a user requirement)
+
+## Risks / Trade-offs
+
+| Risk | Mitigation |
+|------|------------|
+| Latency > 3s on slow hardware | Use "tiny" model option, skip VAD if needed |
+| WebM→WAV conversion quality loss | Use lossless conversion, test on various inputs |
+| Memory usage with long meetings | Limit audio buffer to 30s, process and discard |
+| Segment boundary splits words | Use VAD with 500ms silence threshold |
+
+## Implementation Phases
+
+1. **Phase 1**: Sidecar streaming mode with VAD
+2. **Phase 2**: IPC audio streaming pipeline
+3. **Phase 3**: Frontend editable segment component
+4. **Phase 4**: Punctuation post-processing
+
+## Open Questions
+- Should segments be auto-merged after N seconds of no editing?
+- Maximum segment count before auto-archiving old segments?
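Decision 4 in the realtime-transcription design.md describes rule-based punctuation driven by pauses and patterns. A minimal sketch of that idea, assuming each word arrives as a `(text, start, end)` tuple in seconds. The pause thresholds, the list of question endings, and the name `punctuate` are illustrative guesses and not values from the commit:

```python
QUESTION_ENDINGS = ("嗎", "呢")  # hypothetical pattern list
SENTENCE_MARKS = "。，？！；："


def punctuate(words, comma_pause=0.3, period_pause=0.8):
    """Insert Chinese punctuation based on the silence between words.

    words: list of (text, start_seconds, end_seconds) tuples in time order.
    A short pause yields a comma, a long pause or the end of the segment
    yields a full stop, or a question mark after a question particle.
    """
    out = []
    for index, (text, _start, end) in enumerate(words):
        out.append(text)
        if text and text[-1] in SENTENCE_MARKS:
            continue  # the recognizer already punctuated this word
        is_last = index == len(words) - 1
        gap = None if is_last else words[index + 1][1] - end
        if is_last or gap >= period_pause:
            out.append("？" if text.endswith(QUESTION_ENDINGS) else "。")
        elif gap >= comma_pause:
            out.append("，")
    return "".join(out)
```

The word tuples could be built from faster-whisper's `word_timestamps=True` output, where each word carries `start` and `end` times. Tuning the two thresholds against real meeting audio would be necessary before relying on them.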
--- a/openspec/changes/archive/2025-12-10-add-realtime-transcription/proposal.md
+++ b/openspec/changes/archive/2025-12-10-add-realtime-transcription/proposal.md
@@ -0,0 +1,24 @@
+# Change: Add Real-time Streaming Transcription
+
+## Why
+Current transcription workflow requires users to stop recording before seeing results. Users cannot edit transcription errors, and output lacks punctuation. For meeting scenarios, real-time feedback with editable text is essential for immediate correction and context awareness.
+
+## What Changes
+- **Sidecar**: Implement streaming VAD-based transcription with sentence segmentation
+- **IPC**: Add continuous audio streaming from the renderer through the main process to the sidecar
+- **Frontend**: Make transcript editable with real-time segment updates
+- **Punctuation**: Enable Whisper's word timestamps and add sentence boundary detection
+
+## Impact
+- Affected specs: `transcription` (new), `frontend-transcript` (new)
+- Affected code:
+  - `sidecar/transcriber.py` - Add streaming mode with VAD
+  - `client/src/main.js` - Add audio streaming IPC handlers
+  - `client/src/preload.js` - Expose streaming APIs
+  - `client/src/pages/meeting-detail.html` - Editable transcript component
+
+## Success Criteria
+1. User sees text appearing within 2-3 seconds of speaking
+2. Each segment is individually editable
+3. Output includes punctuation (。，？！)
+4. Recording can continue while user edits previous segments
--- a/openspec/changes/archive/2025-12-10-add-realtime-transcription/specs/frontend-transcript/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-realtime-transcription/specs/frontend-transcript/spec.md
@@ -0,0 +1,58 @@
+## ADDED Requirements
+
+### Requirement: Editable Transcript Segments
+The frontend SHALL display transcribed text as individually editable segments that can be modified without disrupting ongoing transcription.
+
+#### Scenario: Display new segment
+- **WHEN** a new transcription segment is received from sidecar
+- **THEN** a new editable text block SHALL appear in the transcript area
+- **AND** the block SHALL be visually distinct (e.g., border, background)
+- **AND** the block SHALL be immediately editable
+
+#### Scenario: Edit existing segment
+- **WHEN** user modifies text in a segment
+- **THEN** only that segment's local data SHALL be updated
+- **AND** new incoming segments SHALL continue to append below
+- **AND** the edited segment SHALL show an "edited" indicator
+
+#### Scenario: Save merged transcript
+- **WHEN** user clicks Save button
+- **THEN** all segments (edited and unedited) SHALL be concatenated in order
+- **AND** the merged text SHALL be saved as transcript_blob
+
+### Requirement: Real-time Streaming UI
+The frontend SHALL provide clear visual feedback during streaming transcription.
+
+#### Scenario: Recording active indicator
+- **WHEN** streaming recording is active
+- **THEN** a pulsing recording indicator SHALL be visible
+- **AND** the current/active segment SHALL have distinct styling (e.g., highlighted border)
+- **AND** the Start Recording button SHALL change to Stop Recording
+
+#### Scenario: Processing indicator
+- **WHEN** audio is being processed but no text has appeared yet
+- **THEN** a "Processing..." indicator SHALL appear in the active segment area
+- **AND** the indicator SHALL disappear when text arrives
+
+#### Scenario: Streaming status display
+- **WHEN** streaming session is active
+- **THEN** the UI SHALL display segment count (e.g., "Segment 5/5")
+- **AND** the UI SHALL display the total recording duration
+
+### Requirement: Audio Streaming IPC
+The Electron main process SHALL provide IPC handlers for continuous audio streaming between renderer and sidecar.
+
+#### Scenario: Start streaming
+- **WHEN** renderer calls `startRecordingStream()`
+- **THEN** main process SHALL send start_stream command to sidecar
+- **AND** return session confirmation to renderer
+
+#### Scenario: Stream audio data
+- **WHEN** renderer sends audio chunk via `streamAudioChunk(arrayBuffer)`
+- **THEN** main process SHALL convert WebM to PCM if needed
+- **AND** forward to sidecar stdin as base64-encoded audio_chunk command
+
+#### Scenario: Receive transcription
+- **WHEN** sidecar emits a segment result on stdout
+- **THEN** main process SHALL parse the JSON
+- **AND** forward to renderer via `transcription-segment` IPC event
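The "Stream audio data" and "Receive transcription" scenarios amount to a line-oriented JSON exchange over the sidecar's stdin and stdout. The sketch below shows that framing in Python, for consistency with the sidecar, even though the real main process is JavaScript. The function names are hypothetical, and newline-delimited JSON is an assumption: the spec fixes the message fields but not the framing.

```python
import base64
import json
from typing import Optional


def encode_audio_chunk(pcm: bytes) -> str:
    """Build one newline-terminated audio_chunk command for the sidecar's stdin."""
    payload = {
        "action": "audio_chunk",
        "data": base64.b64encode(pcm).decode("ascii"),
    }
    return json.dumps(payload) + "\n"


def parse_sidecar_line(line: str) -> Optional[dict]:
    """Parse one stdout line; return a segment dict, or None for anything else."""
    line = line.strip()
    if not line:
        return None
    try:
        message = json.loads(line)
    except json.JSONDecodeError:
        return None  # e.g. stray log output that is not JSON
    if isinstance(message, dict) and "segment_id" in message and "text" in message:
        return message
    return None  # status replies such as {"status": "streaming"} are not segments
```

Filtering on `segment_id` and `text` means only segment results would be forwarded on the `transcription-segment` event, while status replies can be handled separately.
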
--- a/openspec/changes/archive/2025-12-10-add-realtime-transcription/specs/transcription/spec.md
+++ b/openspec/changes/archive/2025-12-10-add-realtime-transcription/specs/transcription/spec.md
@@ -0,0 +1,46 @@
+## ADDED Requirements
+
+### Requirement: Streaming Transcription Mode
+The sidecar SHALL support a streaming mode where audio chunks are continuously received and transcribed in real-time with VAD-triggered segmentation.
+
+#### Scenario: Start streaming session
+- **WHEN** sidecar receives `{"action": "start_stream"}` command
+- **THEN** it SHALL initialize audio buffer and VAD processor
+- **AND** respond with `{"status": "streaming", "session_id": "<uuid>"}`
+
+#### Scenario: Process audio chunk
+- **WHEN** sidecar receives `{"action": "audio_chunk", "data": "<base64_pcm>"}` during active stream
+- **THEN** it SHALL append audio to buffer and run VAD detection
+- **AND** if a speech boundary is detected, it SHALL transcribe the accumulated audio
+- **AND** it SHALL emit `{"segment_id": <int>, "text": "<transcription>", "is_final": true}`
+
+#### Scenario: Stop streaming session
+- **WHEN** sidecar receives `{"action": "stop_stream"}` command
+- **THEN** it SHALL transcribe any remaining buffered audio
+- **AND** respond with `{"status": "stream_stopped", "total_segments": <int>}`
+
+### Requirement: VAD-based Speech Segmentation
+The sidecar SHALL use Voice Activity Detection to identify natural speech boundaries for segmentation.
+
+#### Scenario: Detect speech end
+- **WHEN** VAD detects silence exceeding 500ms after speech
+- **THEN** the accumulated speech audio SHALL be sent for transcription
+- **AND** a new segment SHALL begin for subsequent speech
+
+#### Scenario: Handle continuous speech
+- **WHEN** speech continues for more than 15 seconds without pause
+- **THEN** the sidecar SHALL force a segment boundary
+- **AND** transcribe the 15-second chunk to prevent excessive latency
+
+### Requirement: Punctuation in Transcription Output
+The sidecar SHALL output transcribed text with appropriate Chinese punctuation marks.
+
+#### Scenario: Add sentence-ending punctuation
+- **WHEN** transcription completes for a segment
+- **THEN** the output SHALL include period (。) at natural sentence boundaries
+- **AND** question marks (？) for interrogative sentences
+- **AND** commas (，) for clause breaks within sentences
+
+#### Scenario: Detect question patterns
+- **WHEN** transcribed text ends with a question particle or interrogative word (嗎、呢、什麼、怎麼、為什麼)
+- **THEN** the punctuation processor SHALL append question mark (？)
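The "Detect question patterns" scenario could be implemented with a single end-anchored regular expression. This is a minimal sketch under stated assumptions, not the shipped ChinesePunctuator: the name `punctuate_segment` is hypothetical, and keeping any sentence-ending mark Whisper already produced and defaulting to 。 otherwise are design choices made for the example.

```python
import re

# Endings listed in the scenario above; anchored to the end of the segment.
QUESTION_ENDING = re.compile(r"(嗎|呢|什麼|怎麼|為什麼)$")
SENTENCE_ENDERS = "。？！?!"


def punctuate_segment(text: str) -> str:
    """Append ？ or 。 to one transcribed segment (hypothetical helper)."""
    core = text.strip()
    if not core:
        return ""
    if core[-1] in SENTENCE_ENDERS:
        return core  # keep a terminal mark that is already present
    core = core.rstrip("，、, ")  # drop a dangling clause separator
    if not core:
        return ""
    mark = "？" if QUESTION_ENDING.search(core) else "。"
    return core + mark
```

For example, `punctuate_segment("你吃飯了嗎")` yields `你吃飯了嗎？`, while a segment without a listed ending receives 。.
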
--- a/openspec/changes/archive/2025-12-10-add-realtime-transcription/tasks.md
+++ b/openspec/changes/archive/2025-12-10-add-realtime-transcription/tasks.md
@@ -0,0 +1,53 @@
+## 1. Sidecar Streaming Infrastructure
+- [x] 1.1 Add silero-vad dependency to requirements.txt
+- [x] 1.2 Implement VADProcessor class with speech boundary detection
+- [x] 1.3 Add streaming mode to Transcriber (action: "start_stream", "audio_chunk", "stop_stream")
+- [x] 1.4 Implement audio buffer with VAD-triggered transcription
+- [x] 1.5 Add segment_id tracking for each utterance
+- [x] 1.6 Test VAD with sample Chinese speech audio
+
+## 2. Punctuation Processing
+- [x] 2.1 Enable word_timestamps in Whisper transcribe()
+- [x] 2.2 Implement ChinesePunctuator class with rule-based punctuation
+- [x] 2.3 Add pause-based sentence boundary detection (>500ms → period)
+- [x] 2.4 Add question detection (嗎、呢、什麼 patterns → ？)
+- [x] 2.5 Test punctuation output quality with sample transcripts
+
+## 3. IPC Audio Streaming
+- [x] 3.1 Add "start-recording-stream" IPC handler in main.js
+- [x] 3.2 Add "stream-audio-chunk" IPC handler to forward audio to sidecar
+- [x] 3.3 Add "stop-recording-stream" IPC handler
+- [x] 3.4 Implement WebM to PCM conversion using web-audio-api or ffmpeg.wasm
+- [x] 3.5 Forward sidecar segment events to renderer via "transcription-segment" IPC
+- [x] 3.6 Update preload.js with streaming API exposure
+
+## 4. Frontend Editable Transcript
+- [x] 4.1 Create TranscriptSegment component (editable text block with segment_id)
+- [x] 4.2 Implement segment container with append-only behavior during recording
+- [x] 4.3 Add edit handler that updates local segment data
+- [x] 4.4 Style active segment (currently receiving text) differently
+- [x] 4.5 Update Save button to merge all segments into transcript_blob
+- [x] 4.6 Add visual indicator for streaming status
+
+## 5. Integration & Testing
+- [x] 5.1 End-to-end test: start recording → speak → see text appear
+- [x] 5.2 Test editing segment while new segments arrive
+- [x] 5.3 Test save with mixed edited/unedited segments
+- [x] 5.4 Performance test on i5/8GB target hardware
+- [x] 5.5 Test with 30+ minute continuous recording
+- [x] 5.6 Update meeting-detail.html recording flow documentation
+
+## Dependencies
+- Task 3 depends on Task 1 (sidecar must support streaming first)
+- Task 4 depends on Task 3 (frontend needs IPC to receive segments)
+- Task 2 can run in parallel with Task 3
+
+## Parallelizable Work
+- Tasks 1 and 4 can start simultaneously (sidecar and frontend scaffolding)
+- Task 2 can run in parallel with Task 3
+
+## Implementation Notes
+- Segmentation uses Silero VAD, falling back to fixed 5-second time-based segments if torch is unavailable
+- Audio captured at 16kHz mono, converted to int16 PCM, sent as base64
+- ChinesePunctuator uses regex patterns for question detection
+- Segments are editable immediately; edited segments are marked with an orange border
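The audio note above (16 kHz mono, int16 PCM, base64) can be illustrated with a short round trip. This sketch assumes the capture side yields float samples in the range [-1.0, 1.0], as Web Audio buffers do, and little-endian sample order. The function names and the 32767 scaling factor are illustrative choices, not taken from the implementation.

```python
import base64
import struct
from typing import List, Sequence

SAMPLE_RATE = 16_000  # Hz, mono, per the implementation note


def floats_to_pcm16(samples: Sequence[float]) -> bytes:
    """Clip to [-1, 1], scale to int16, and pack as little-endian PCM."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def pcm16_to_base64(pcm: bytes) -> str:
    """Encode raw PCM bytes for embedding in a JSON command."""
    return base64.b64encode(pcm).decode("ascii")


def base64_to_ints(data: str) -> List[int]:
    """Decode the base64 payload back into int16 sample values."""
    raw = base64.b64decode(data)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


def duration_seconds(pcm: bytes) -> float:
    """Length of a mono int16 buffer at SAMPLE_RATE (2 bytes per sample)."""
    return len(pcm) / 2 / SAMPLE_RATE
```

At this rate one second of audio is 32,000 bytes of PCM before base64 encoding, which is the figure to keep in mind when sizing chunks sent over IPC.
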