feat: Meeting Assistant MVP - Complete implementation
Enterprise Meeting Knowledge Management System with:

Backend (FastAPI):
- Authentication proxy with JWT (pj-auth-api integration)
- MySQL database with 4 tables (users, meetings, conclusions, actions)
- Meeting CRUD with system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
- Dify LLM integration for AI summarization
- Excel export with openpyxl
- 20 unit tests (all passing)

Client (Electron):
- Login page with company auth
- Meeting list with create/delete
- Meeting detail with real-time transcription
- Editable transcript textarea (single block, easy editing)
- AI summarization with conclusions/action items
- 5-second segment recording (efficient for long meetings)

Sidecar (Python):
- faster-whisper medium model with int8 quantization
- ONNX Runtime VAD (lightweight, ~20MB vs PyTorch ~2GB)
- Chinese punctuation processing
- OpenCC for Traditional Chinese conversion
- Anti-hallucination parameters
- Auto-cleanup of temp audio files

OpenSpec:
- add-meeting-assistant-mvp (47 tasks, archived)
- add-realtime-transcription (29 tasks, archived)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
openspec/AGENTS.md
@@ -0,0 +1,456 @@
# OpenSpec Instructions

Instructions for AI coding assistants using OpenSpec for spec-driven development.

## TL;DR Quick Checklist

- Search existing work: `openspec spec list --long`, `openspec list` (use `rg` only for full-text search)
- Decide scope: new capability vs modify existing capability
- Pick a unique `change-id`: kebab-case, verb-led (`add-`, `update-`, `remove-`, `refactor-`)
- Scaffold: `proposal.md`, `tasks.md`, `design.md` (only if needed), and delta specs per affected capability
- Write deltas: use `## ADDED|MODIFIED|REMOVED|RENAMED Requirements`; include at least one `#### Scenario:` per requirement
- Validate: `openspec validate [change-id] --strict` and fix issues
- Request approval: Do not start implementation until proposal is approved

## Three-Stage Workflow

### Stage 1: Creating Changes
Create proposal when you need to:
- Add features or functionality
- Make breaking changes (API, schema)
- Change architecture or patterns
- Optimize performance (changes behavior)
- Update security patterns

Triggers (examples):
- "Help me create a change proposal"
- "Help me plan a change"
- "Help me create a proposal"
- "I want to create a spec proposal"
- "I want to create a spec"

Loose matching guidance:
- Contains one of: `proposal`, `change`, `spec`
- With one of: `create`, `plan`, `make`, `start`, `help`

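The loose matching guidance above can be read as a two-set keyword test. The following is only a sketch of that reading, not how OpenSpec or any assistant actually detects triggers; the function name `looks_like_proposal_request` and the whole-word matching rule are assumptions made for illustration.

```python
import re

# Keyword sets copied from the guidance above.
TOPIC_WORDS = {"proposal", "change", "spec"}
ACTION_WORDS = {"create", "plan", "make", "start", "help"}


def looks_like_proposal_request(message: str) -> bool:
    """Return True when the message has one topic word and one action word.

    Assumption: words are matched whole and case-insensitively, so "specs"
    or "helpful" do not count. A substring match would be a looser reading.
    """
    words = set(re.findall(r"[a-z]+", message.lower()))
    return bool(words & TOPIC_WORDS) and bool(words & ACTION_WORDS)
```

A message such as "Help me plan a change" satisfies both sets, while "Fix the typo in README" satisfies neither and would fall under the skip list.
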
Skip proposal for:
- Bug fixes (restore intended behavior)
- Typos, formatting, comments
- Dependency updates (non-breaking)
- Configuration changes
- Tests for existing behavior

**Workflow**
1. Review `openspec/project.md`, `openspec list`, and `openspec list --specs` to understand current context.
2. Choose a unique verb-led `change-id` and scaffold `proposal.md`, `tasks.md`, optional `design.md`, and spec deltas under `openspec/changes/<id>/`.
3. Draft spec deltas using `## ADDED|MODIFIED|REMOVED Requirements` with at least one `#### Scenario:` per requirement.
4. Run `openspec validate <id> --strict` and resolve any issues before sharing the proposal.

### Stage 2: Implementing Changes
Track these steps as TODOs and complete them one by one.
1. **Read proposal.md** - Understand what's being built
2. **Read design.md** (if exists) - Review technical decisions
3. **Read tasks.md** - Get implementation checklist
4. **Implement tasks sequentially** - Complete in order
5. **Confirm completion** - Ensure every item in `tasks.md` is finished before updating statuses
6. **Update checklist** - After all work is done, set every task to `- [x]` so the list reflects reality
7. **Approval gate** - Do not start implementation until the proposal is reviewed and approved

### Stage 3: Archiving Changes
After deployment, create separate PR to:
- Move `changes/[name]/` → `changes/archive/YYYY-MM-DD-[name]/`
- Update `specs/` if capabilities changed
- Use `openspec archive <change-id> --skip-specs --yes` for tooling-only changes (always pass the change ID explicitly)
- Run `openspec validate --strict` to confirm the archived change passes checks

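To make the archive naming rule concrete, here is a small sketch that computes the destination path for a change. It only builds the path; the `openspec archive` command does the real work. The helper name `archive_path` and the `openspec/changes` root are illustrative assumptions.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(change_id: str, archived_on: date) -> PurePosixPath:
    """Build changes/archive/YYYY-MM-DD-[name]/ for a given change id.

    Assumption: the date prefix is the ISO calendar date of archiving.
    """
    root = PurePosixPath("openspec/changes/archive")
    return root / f"{archived_on.isoformat()}-{change_id}"
```

For example, archiving `add-two-factor-auth` on 10 December 2025 would give `openspec/changes/archive/2025-12-10-add-two-factor-auth`.
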
## Before Any Task

**Context Checklist:**
- [ ] Read relevant specs in `specs/[capability]/spec.md`
- [ ] Check pending changes in `changes/` for conflicts
- [ ] Read `openspec/project.md` for conventions
- [ ] Run `openspec list` to see active changes
- [ ] Run `openspec list --specs` to see existing capabilities

**Before Creating Specs:**
- Always check if capability already exists
- Prefer modifying existing specs over creating duplicates
- Use `openspec show [spec]` to review current state
- If request is ambiguous, ask 1–2 clarifying questions before scaffolding

### Search Guidance
- Enumerate specs: `openspec spec list --long` (or `--json` for scripts)
- Enumerate changes: `openspec list` (or `openspec change list --json` - deprecated but available)
- Show details:
  - Spec: `openspec show <spec-id> --type spec` (use `--json` for filters)
  - Change: `openspec show <change-id> --json --deltas-only`
- Full-text search (use ripgrep): `rg -n "Requirement:|Scenario:" openspec/specs`

## Quick Start

### CLI Commands

```bash
# Essential commands
openspec list                  # List active changes
openspec list --specs          # List specifications
openspec show [item]           # Display change or spec
openspec validate [item]       # Validate changes or specs
openspec archive <change-id> [--yes|-y]   # Archive after deployment (add --yes for non-interactive runs)

# Project management
openspec init [path]           # Initialize OpenSpec
openspec update [path]         # Update instruction files

# Interactive mode
openspec show                  # Prompts for selection
openspec validate              # Bulk validation mode

# Debugging
openspec show [change] --json --deltas-only
openspec validate [change] --strict
```

### Command Flags

- `--json` - Machine-readable output
- `--type change|spec` - Disambiguate items
- `--strict` - Comprehensive validation
- `--no-interactive` - Disable prompts
- `--skip-specs` - Archive without spec updates
- `--yes`/`-y` - Skip confirmation prompts (non-interactive archive)

## Directory Structure

```
openspec/
├── project.md              # Project conventions
├── specs/                  # Current truth - what IS built
│   └── [capability]/       # Single focused capability
│       ├── spec.md         # Requirements and scenarios
│       └── design.md       # Technical patterns
├── changes/                # Proposals - what SHOULD change
│   ├── [change-name]/
│   │   ├── proposal.md     # Why, what, impact
│   │   ├── tasks.md        # Implementation checklist
│   │   ├── design.md       # Technical decisions (optional; see criteria)
│   │   └── specs/          # Delta changes
│   │       └── [capability]/
│   │           └── spec.md # ADDED/MODIFIED/REMOVED
│   └── archive/            # Completed changes
```

## Creating Change Proposals

### Decision Tree

```
New request?
├─ Bug fix restoring spec behavior? → Fix directly
├─ Typo/format/comment? → Fix directly
├─ New feature/capability? → Create proposal
├─ Breaking change? → Create proposal
├─ Architecture change? → Create proposal
└─ Unclear? → Create proposal (safer)
```

### Proposal Structure

1. **Create directory:** `changes/[change-id]/` (kebab-case, verb-led, unique)

2. **Write proposal.md:**
```markdown
# Change: [Brief description of change]

## Why
[1-2 sentences on problem/opportunity]

## What Changes
- [Bullet list of changes]
- [Mark breaking changes with **BREAKING**]

## Impact
- Affected specs: [list capabilities]
- Affected code: [key files/systems]
```

3. **Create spec deltas:** `specs/[capability]/spec.md`
```markdown
## ADDED Requirements
### Requirement: New Feature
The system SHALL provide...

#### Scenario: Success case
- **WHEN** user performs action
- **THEN** expected result

## MODIFIED Requirements
### Requirement: Existing Feature
[Complete modified requirement]

## REMOVED Requirements
### Requirement: Old Feature
**Reason**: [Why removing]
**Migration**: [How to handle]
```
If multiple capabilities are affected, create multiple delta files under `changes/[change-id]/specs/<capability>/spec.md`, one per capability.

4. **Create tasks.md:**
```markdown
## 1. Implementation
- [ ] 1.1 Create database schema
- [ ] 1.2 Implement API endpoint
- [ ] 1.3 Add frontend component
- [ ] 1.4 Write tests
```

5. **Create design.md when needed:**
Create `design.md` if any of the following apply; otherwise omit it:
- Cross-cutting change (multiple services/modules) or a new architectural pattern
- New external dependency or significant data model changes
- Security, performance, or migration complexity
- Ambiguity that benefits from technical decisions before coding

Minimal `design.md` skeleton:
```markdown
## Context
[Background, constraints, stakeholders]

## Goals / Non-Goals
- Goals: [...]
- Non-Goals: [...]

## Decisions
- Decision: [What and why]
- Alternatives considered: [Options + rationale]

## Risks / Trade-offs
- [Risk] → Mitigation

## Migration Plan
[Steps, rollback]

## Open Questions
- [...]
```

## Spec File Format

### Critical: Scenario Formatting

**CORRECT** (use #### headers):
```markdown
#### Scenario: User login success
- **WHEN** valid credentials provided
- **THEN** return JWT token
```

**WRONG** (don't use bullets or bold):
```markdown
- **Scenario: User login**  ❌
**Scenario**: User login     ❌
### Scenario: User login      ❌
```

Every requirement MUST have at least one scenario.

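The scenario rule above is mechanical enough to check with a few lines of code. The sketch below is not the `openspec validate` implementation; it is a hypothetical pre-check (the name `requirements_without_scenarios` is invented here) that lists requirements with no `#### Scenario:` header beneath them.

```python
import re

REQUIREMENT = re.compile(r"^### Requirement:\s*(.+?)\s*$")
SCENARIO = re.compile(r"^#### Scenario:\s*\S")


def requirements_without_scenarios(markdown: str) -> list[str]:
    """Return names of requirements that lack a '#### Scenario:' header."""
    missing: list[str] = []
    current = None          # name of the requirement being scanned
    has_scenario = False

    def close() -> None:
        if current is not None and not has_scenario:
            missing.append(current)

    for line in markdown.splitlines():
        match = REQUIREMENT.match(line)
        if match:
            close()
            current, has_scenario = match.group(1), False
        elif line.startswith("## "):
            # A new delta section (for example "## MODIFIED Requirements")
            # ends the requirement that was open.
            close()
            current = None
        elif current is not None and SCENARIO.match(line):
            has_scenario = True
    close()
    return missing
```

Bulleted or bold scenario lines, as in the WRONG examples, do not match the header pattern, so the requirement above them is reported as missing a scenario.
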
### Requirement Wording
- Use SHALL/MUST for normative requirements (avoid should/may unless intentionally non-normative)

### Delta Operations

- `## ADDED Requirements` - New capabilities
- `## MODIFIED Requirements` - Changed behavior
- `## REMOVED Requirements` - Deprecated features
- `## RENAMED Requirements` - Name changes

Headers are matched after `trim(header)`, so leading and trailing whitespace is ignored.

#### When to use ADDED vs MODIFIED
- ADDED: Introduces a new capability or sub-capability that can stand alone as a requirement. Prefer ADDED when the change is orthogonal (e.g., adding "Slash Command Configuration") rather than altering the semantics of an existing requirement.
- MODIFIED: Changes the behavior, scope, or acceptance criteria of an existing requirement. Always paste the full, updated requirement content (header + all scenarios). The archiver will replace the entire requirement with what you provide here; partial deltas will drop previous details.
- RENAMED: Use when only the name changes. If you also change behavior, use RENAMED (name) plus MODIFIED (content) referencing the new name.

Common pitfall: Using MODIFIED to add a new concern without including the previous text. This causes loss of detail at archive time. If you aren’t explicitly changing the existing requirement, add a new requirement under ADDED instead.

Authoring a MODIFIED requirement correctly:
1) Locate the existing requirement in `openspec/specs/<capability>/spec.md`.
2) Copy the entire requirement block (from `### Requirement: ...` through its scenarios).
3) Paste it under `## MODIFIED Requirements` and edit to reflect the new behavior.
4) Ensure the header text matches exactly (whitespace-insensitive) and keep at least one `#### Scenario:`.

Example for RENAMED:
```markdown
## RENAMED Requirements
- FROM: `### Requirement: Login`
- TO: `### Requirement: User Authentication`
```

## Troubleshooting

### Common Errors

**"Change must have at least one delta"**
- Check `changes/[name]/specs/` exists with .md files
- Verify files have operation prefixes (## ADDED Requirements)

**"Requirement must have at least one scenario"**
- Check scenarios use `#### Scenario:` format (4 hashtags)
- Don't use bullet points or bold for scenario headers

**Silent scenario parsing failures**
- Exact format required: `#### Scenario: Name`
- Debug with: `openspec show [change] --json --deltas-only`

### Validation Tips

```bash
# Always use strict mode for comprehensive checks
openspec validate [change] --strict

# Debug delta parsing
openspec show [change] --json | jq '.deltas'

# Check specific requirement
openspec show [spec] --json -r 1
```

## Happy Path Script

```bash
# 1) Explore current state
openspec spec list --long
openspec list
# Optional full-text search:
# rg -n "Requirement:|Scenario:" openspec/specs
# rg -n "^#|Requirement:" openspec/changes

# 2) Choose change id and scaffold
CHANGE=add-two-factor-auth
# A single-item brace group such as {specs/auth} is not expanded by bash,
# so the nested path is spelled out directly.
mkdir -p "openspec/changes/$CHANGE/specs/auth"
printf "## Why\n...\n\n## What Changes\n- ...\n\n## Impact\n- ...\n" > openspec/changes/$CHANGE/proposal.md
printf "## 1. Implementation\n- [ ] 1.1 ...\n" > openspec/changes/$CHANGE/tasks.md

# 3) Add deltas (example)
cat > openspec/changes/$CHANGE/specs/auth/spec.md << 'EOF'
## ADDED Requirements
### Requirement: Two-Factor Authentication
Users MUST provide a second factor during login.

#### Scenario: OTP required
- **WHEN** valid credentials are provided
- **THEN** an OTP challenge is required
EOF

# 4) Validate
openspec validate $CHANGE --strict
```

## Multi-Capability Example

```
openspec/changes/add-2fa-notify/
├── proposal.md
├── tasks.md
└── specs/
    ├── auth/
    │   └── spec.md   # ADDED: Two-Factor Authentication
    └── notifications/
        └── spec.md   # ADDED: OTP email notification
```

auth/spec.md
```markdown
## ADDED Requirements
### Requirement: Two-Factor Authentication
...
```

notifications/spec.md
```markdown
## ADDED Requirements
### Requirement: OTP Email Notification
...
```

## Best Practices

### Simplicity First
- Default to <100 lines of new code
- Single-file implementations until proven insufficient
- Avoid frameworks without clear justification
- Choose boring, proven patterns

### Complexity Triggers
Only add complexity with:
- Performance data showing current solution too slow
- Concrete scale requirements (>1000 users, >100MB data)
- Multiple proven use cases requiring abstraction

### Clear References
- Use `file.ts:42` format for code locations
- Reference specs as `specs/auth/spec.md`
- Link related changes and PRs

### Capability Naming
- Use verb-noun: `user-auth`, `payment-capture`
- Single purpose per capability
- 10-minute understandability rule
- Split if description needs "AND"

### Change ID Naming
- Use kebab-case, short and descriptive: `add-two-factor-auth`
- Prefer verb-led prefixes: `add-`, `update-`, `remove-`, `refactor-`
- Ensure uniqueness; if taken, append `-2`, `-3`, etc.

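The change ID rules above can be sketched as two small helpers. This is an illustration only, not part of the OpenSpec CLI: the names `is_valid_change_id` and `unique_change_id` are hypothetical, and treating the four listed prefixes as mandatory is a stricter reading than the guidance, which only says "prefer".

```python
import re

# Prefixes listed in the naming guidance above.
VERB_PREFIXES = ("add-", "update-", "remove-", "refactor-")
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_change_id(change_id: str) -> bool:
    """Kebab-case and starting with one of the listed verb prefixes."""
    return bool(KEBAB_CASE.match(change_id)) and change_id.startswith(VERB_PREFIXES)


def unique_change_id(change_id: str, existing: set[str]) -> str:
    """Append -2, -3, ... until the id does not collide with an existing one."""
    if change_id not in existing:
        return change_id
    suffix = 2
    while f"{change_id}-{suffix}" in existing:
        suffix += 1
    return f"{change_id}-{suffix}"
```

In practice `existing` would come from the names under `openspec/changes/`, including archived ones if those should stay unique too.
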
## Tool Selection Guide

| Task | Tool | Why |
|------|------|-----|
| Find files by pattern | Glob | Fast pattern matching |
| Search code content | Grep | Optimized regex search |
| Read specific files | Read | Direct file access |
| Explore unknown scope | Task | Multi-step investigation |

## Error Recovery

### Change Conflicts
1. Run `openspec list` to see active changes
2. Check for overlapping specs
3. Coordinate with change owners
4. Consider combining proposals

### Validation Failures
1. Run with `--strict` flag
2. Check JSON output for details
3. Verify spec file format
4. Ensure scenarios properly formatted

### Missing Context
1. Read project.md first
2. Check related specs
3. Review recent archives
4. Ask for clarification

## Quick Reference

### Stage Indicators
- `changes/` - Proposed, not yet built
- `specs/` - Built and deployed
- `archive/` - Completed changes

### File Purposes
- `proposal.md` - Why and what
- `tasks.md` - Implementation steps
- `design.md` - Technical decisions
- `spec.md` - Requirements and behavior

### CLI Essentials
```bash
openspec list                              # What's in progress?
openspec show [item]                       # View details
openspec validate --strict                 # Is it correct?
openspec archive <change-id> [--yes|-y]    # Mark complete (add --yes for automation)
```

Remember: Specs are truth. Changes are proposals. Keep them in sync.
@@ -0,0 +1,132 @@
## Context
Building a meeting knowledge management system for enterprise users. The system must support offline transcription on standard hardware (i5/8GB), integrate with existing company authentication, and provide AI-powered summarization via Dify LLM.

**Stakeholders**: Enterprise meeting participants, meeting recorders, admin users (ymirliu@panjit.com.tw)

**Constraints**:
- Must run faster-whisper int8 on i5/8GB laptop
- DB credentials and API keys must stay server-side (security)
- All database tables prefixed with `meeting_`
- Output must support Traditional Chinese (繁體中文)

## Goals / Non-Goals

**Goals**:
- Deliver working MVP with all six capabilities
- Secure architecture with secrets in middleware only
- Offline-capable transcription
- Structured output with trackable action items

**Non-Goals**:
- Multi-language support beyond Traditional Chinese
- Real-time collaborative editing
- Mobile client
- Custom LLM model training

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         Electron Client                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Auth UI   │  │ Meeting UI  │  │  Transcription Engine   │  │
│  │   (Login)   │  │ (CRUD/Edit) │  │  (faster-whisper+OpenCC)│  │
│  └──────┬──────┘  └──────┬──────┘  └────────────┬────────────┘  │
└─────────┼────────────────┼──────────────────────┼───────────────┘
          │                │                      │
          │ HTTP           │ HTTP                 │ Local only
          ▼                ▼                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FastAPI Middleware Server                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────┐  │
│  │ Auth Proxy  │  │Meeting CRUD │  │ Dify Proxy  │  │ Export │  │
│  │ POST /login │  │POST/GET/... │  │POST /ai/... │  │GET /:id│  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └───┬────┘  │
└─────────┼────────────────┼────────────────┼─────────────┼───────┘
          │                │                │             │
          ▼                ▼                ▼             │
  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐      │
  │ PJ-Auth API  │ │    MySQL     │ │   Dify LLM   │      │
  │   (Vercel)   │ │ (theaken.com)│ │(theaken.com) │      │
  └──────────────┘ └──────────────┘ └──────────────┘      │
                                                          │
                                     ┌────────────────────┘
                                     ▼
                              ┌──────────────┐
                              │Excel Template│
                              │  (openpyxl)  │
                              └──────────────┘
```

## Decisions

### Decision 1: Three-tier architecture with middleware
**Choice**: All external services accessed through FastAPI middleware
**Rationale**: Security requirement - DB credentials and API keys cannot be in Electron client
**Alternatives considered**:
- Direct client-to-service: Rejected due to credential exposure risk
- Serverless functions: More complex deployment for similar security

### Decision 2: Edge transcription in Electron
**Choice**: Run faster-whisper locally via Python sidecar (PyInstaller)
**Rationale**: Offline capability requirement; network latency unacceptable for real-time transcription
**Alternatives considered**:
- Cloud STT (Google/Azure): Requires network, latency issues
- WebAssembly whisper: Not mature enough for production

### Decision 3: MySQL with prefixed tables
**Choice**: Use shared MySQL instance with `meeting_` prefix
**Rationale**: Leverage existing infrastructure; prefix ensures isolation
**Alternatives considered**:
- Dedicated database: Overhead not justified for MVP
- SQLite: Poor fit for concurrent multi-user access to a shared server-side database

### Decision 4: Dify for LLM summarization
**Choice**: Use company Dify instance for AI features
**Rationale**: Already available infrastructure; structured JSON output support
**Alternatives considered**:
- Direct OpenAI API: Additional cost, no existing infrastructure
- Local LLM: Hardware constraints (i5/8GB insufficient)

## Risks / Trade-offs

| Risk | Impact | Mitigation |
|------|--------|------------|
| faster-whisper performance on i5/8GB | High | Use int8 quantization; test on target hardware early |
| Dify timeout on long transcripts | Medium | Implement chunking; add timeout handling with retry |
| Token expiry during long meetings | Medium | Implement auto-refresh interceptor in client |
| Network failure during save | Medium | Client-side queue with retry; local draft storage |

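The chunking mitigation for long transcripts is not specified further in this design. One possible shape, offered only as a sketch, is to split on line boundaries under a character budget. The function name `chunk_transcript` and the 4000-character default are assumptions, and the real limit would depend on how the Dify workflow is configured.

```python
def chunk_transcript(text: str, max_chars: int = 4000) -> list[str]:
    """Split a transcript into chunks of at most max_chars characters.

    Lines are kept whole where possible; a single line longer than the
    budget is hard-split so no chunk ever exceeds max_chars.
    """
    chunks: list[str] = []
    current = ""
    for line in text.splitlines():
        # Hard-split any line that cannot fit in one chunk by itself.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > max_chars:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be summarized separately and the results merged, though how partial summaries are combined is left open here.
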
## Data Model

```sql
-- Tables all prefixed with meeting_

meeting_users (user_id, email, display_name, role, created_at)
meeting_records (meeting_id, uuid, subject, meeting_time, location,
                 chairperson, recorder, attendees, transcript_blob,
                 created_by, created_at)
meeting_conclusions (conclusion_id, meeting_id, content, system_code)
meeting_action_items (action_id, meeting_id, content, owner, due_date,
                      status, system_code)
```

**ID Formats**:
- Conclusions: `C-YYYYMMDD-XX` (e.g., C-20251210-01)
- Action Items: `A-YYYYMMDD-XX` (e.g., A-20251210-01)

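As an illustration of the ID formats above, the sketch below derives the next code for a given day from the codes already issued. It is not the backend's actual generator: `next_system_code` is a hypothetical name, and taking `XX` to be a per-day, per-kind sequence starting at 01 is an assumption based on the examples.

```python
from datetime import date


def next_system_code(kind: str, on: date, existing: list[str]) -> str:
    """Return the next C-YYYYMMDD-XX or A-YYYYMMDD-XX code for a day.

    kind is "C" (conclusion) or "A" (action item). existing holds codes
    already issued; only codes with the same kind and date are counted.
    """
    if kind not in ("C", "A"):
        raise ValueError("kind must be 'C' or 'A'")
    prefix = f"{kind}-{on:%Y%m%d}-"
    used = [
        int(code[len(prefix):])
        for code in existing
        if code.startswith(prefix) and code[len(prefix):].isdigit()
    ]
    return f"{prefix}{max(used, default=0) + 1:02d}"
```

With two-digit padding the sequence widens to three digits after 99 items in one day. A real implementation would also need to guard against two requests computing the same number concurrently, for example with a unique constraint on `system_code`.
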
## API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /api/login | Proxy auth to PJ-Auth API |
| GET | /api/meetings | List meetings (filterable) |
| POST | /api/meetings | Create meeting |
| GET | /api/meetings/:id | Get meeting details |
| PUT | /api/meetings/:id | Update meeting |
| DELETE | /api/meetings/:id | Delete meeting |
| POST | /api/ai/summarize | Send transcript to Dify |
| GET | /api/meetings/:id/export | Generate Excel report |

## Open Questions
- None currently - PRD and SDD provide sufficient detail for MVP implementation
@@ -0,0 +1,25 @@
# Change: Add Meeting Assistant MVP

## Why
Enterprise users spend significant time manually documenting meetings and tracking action items. This MVP delivers an end-to-end meeting knowledge management solution with offline transcription, AI-powered summarization, and structured tracking of conclusions and action items.

## What Changes
- **NEW** FastAPI middleware server with MySQL integration
- **NEW** Authentication proxy to company Auth API with admin role detection
- **NEW** Meeting CRUD operations with metadata management
- **NEW** Edge-based speech-to-text using faster-whisper (int8)
- **NEW** Dify LLM integration for intelligent summarization
- **NEW** Excel report generation from templates

## Impact
- Affected specs: middleware, authentication, meeting-management, transcription, ai-summarization, excel-export
- Affected code: New Python FastAPI backend, new Electron frontend
- External dependencies: PJ-Auth API, MySQL database, Dify LLM service

## Success Criteria
- Users can log in via company SSO
- Meetings can be created with required metadata (subject, time, chairperson, location, recorder, attendees)
- Speech-to-text works offline on i5/8GB hardware
- AI generates structured conclusions and action items from transcripts
- Action items have trackable status (Open/In Progress/Done/Delayed)
- Excel reports can be exported with all meeting data
@@ -0,0 +1,45 @@
## ADDED Requirements

### Requirement: Dify Integration
The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.

#### Scenario: Successful summarization
- **WHEN** user submits POST /api/ai/summarize with transcript text
- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items

#### Scenario: Dify timeout handling
- **WHEN** Dify API does not respond within timeout period
- **THEN** the server SHALL return HTTP 504 with timeout error and client can retry

#### Scenario: Dify error handling
- **WHEN** Dify API returns error (500, rate limit, etc.)
- **THEN** the server SHALL return appropriate HTTP error with details

### Requirement: Structured Output Format
The AI summarization SHALL return structured data with conclusions and action items.

#### Scenario: Complete structured response
- **WHEN** transcript contains clear decisions and assignments
- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields

#### Scenario: Partial data extraction
- **WHEN** transcript lacks explicit owner or due_date for action items
- **THEN** those fields SHALL be empty strings allowing manual completion

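The structured output requirement can be pictured as a normalization step applied to whatever the LLM returns. The sketch below is an assumption about how that might look, not the backend's code: `normalize_summary` is a hypothetical helper, and the input is taken to be an already-parsed JSON object.

```python
def normalize_summary(raw: dict) -> dict:
    """Coerce an LLM response into the conclusions/action_items shape.

    Missing or null owner and due_date values become empty strings so the
    client can offer them for manual completion.
    """
    conclusions = [str(item) for item in raw.get("conclusions") or []]
    action_items = []
    for item in raw.get("action_items") or []:
        action_items.append(
            {
                "content": str(item.get("content") or ""),
                "owner": str(item.get("owner") or ""),
                "due_date": str(item.get("due_date") or ""),
            }
        )
    return {"conclusions": conclusions, "action_items": action_items}
```

Normalizing on the server keeps the client's form logic simple: every action item always has the same three keys, whether or not the transcript mentioned an owner or a deadline.
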
### Requirement: Dify Prompt Configuration
The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.

#### Scenario: System prompt behavior
- **WHEN** transcript is sent to Dify
- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format

### Requirement: Manual Data Completion
The Electron client SHALL allow users to manually complete missing AI-extracted data.

#### Scenario: Fill missing owner
- **WHEN** AI returns action item without owner
- **THEN** user SHALL be able to select or type owner name in the UI

#### Scenario: Fill missing due date
- **WHEN** AI returns action item without due_date
- **THEN** user SHALL be able to select date using date picker
@@ -0,0 +1,42 @@
## ADDED Requirements

### Requirement: Login Proxy
The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.

#### Scenario: Successful login
- **WHEN** user submits valid credentials to POST /api/login
- **THEN** the server SHALL forward to Auth API and return the JWT token

#### Scenario: Admin role detection
- **WHEN** user logs in with email ymirliu@panjit.com.tw
- **THEN** the response JWT payload SHALL include role: "admin"

#### Scenario: Invalid credentials
- **WHEN** user submits invalid credentials
- **THEN** the server SHALL return HTTP 401 with error message from Auth API

### Requirement: Token Validation
The middleware server SHALL validate JWT tokens on protected endpoints.

#### Scenario: Valid token access
- **WHEN** request includes valid JWT in Authorization header
- **THEN** the request SHALL proceed to the endpoint handler

#### Scenario: Expired token
- **WHEN** request includes expired JWT
- **THEN** the server SHALL return HTTP 401 with "token_expired" error code

#### Scenario: Missing token
- **WHEN** request to protected endpoint lacks Authorization header
- **THEN** the server SHALL return HTTP 401 with "token_required" error code

### Requirement: Token Auto-Refresh
The Electron client SHALL implement automatic token refresh before expiration.

#### Scenario: Proactive refresh
- **WHEN** token approaches expiration (within 5 minutes) during active session
- **THEN** the client SHALL request new token transparently without user interruption

#### Scenario: Refresh during long meeting
- **WHEN** user is in a meeting session lasting longer than token validity
- **THEN** the client SHALL maintain authentication through automatic refresh
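The proactive refresh scenario reduces to a timing check against the token's expiry. The real client is Electron, so the Python below is only a language-neutral sketch of the logic. It assumes the JWT carries a standard `exp` claim in seconds since the epoch, which this spec does not state, and the names `jwt_expiry` and `needs_refresh` are hypothetical.

```python
import base64
import json

REFRESH_WINDOW_SECONDS = 5 * 60  # "within 5 minutes" from the scenario


def jwt_expiry(token: str) -> int:
    """Read the exp claim from a JWT payload WITHOUT verifying the signature.

    This is acceptable only for scheduling a refresh on the client; the
    server must still verify every token it receives.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return int(payload["exp"])


def needs_refresh(token: str, now: int) -> bool:
    """True when the token expires within the refresh window (or already has)."""
    return jwt_expiry(token) - now <= REFRESH_WINDOW_SECONDS
```

A client could run this check before each request, or on a timer during recording, and call its refresh endpoint when it returns true.
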
@@ -0,0 +1,45 @@
## ADDED Requirements

### Requirement: Excel Report Generation
The middleware server SHALL generate Excel reports from meeting data using templates.

#### Scenario: Successful export
- **WHEN** user requests GET /api/meetings/:id/export
- **THEN** server SHALL generate Excel file and return as downloadable stream

#### Scenario: Export non-existent meeting
- **WHEN** user requests export for non-existent meeting ID
- **THEN** server SHALL return HTTP 404

### Requirement: Template-based Generation
The Excel export SHALL use openpyxl with template files.

#### Scenario: Placeholder replacement
- **WHEN** Excel is generated
- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data

#### Scenario: Dynamic row insertion
- **WHEN** meeting has multiple conclusions or action items
- **THEN** rows SHALL be dynamically inserted to accommodate all items

### Requirement: Complete Data Inclusion
|
||||
The exported Excel SHALL include all meeting metadata and AI-generated content.
|
||||
|
||||
#### Scenario: Full metadata export
|
||||
- **WHEN** Excel is generated
|
||||
- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees
|
||||
|
||||
#### Scenario: Conclusions export
|
||||
- **WHEN** Excel is generated
|
||||
- **THEN** all conclusions SHALL be listed with their system codes
|
||||
|
||||
#### Scenario: Action items export
|
||||
- **WHEN** Excel is generated
|
||||
- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code
|
||||
|
||||
### Requirement: Template Management
|
||||
Admin users SHALL be able to manage Excel templates.
|
||||
|
||||
#### Scenario: Admin template access
|
||||
- **WHEN** admin user accesses template management
|
||||
- **THEN** they SHALL be able to upload, view, and update Excel templates
|
||||
@@ -0,0 +1,71 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Create Meeting
|
||||
The system SHALL allow users to create meetings with required metadata.
|
||||
|
||||
#### Scenario: Create meeting with all fields
|
||||
- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
|
||||
- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned
|
||||
|
||||
#### Scenario: Create meeting with missing required fields
|
||||
- **WHEN** user submits POST /api/meetings without subject or meeting_time
|
||||
- **THEN** the server SHALL return HTTP 400 with validation error details
|
||||
|
||||
#### Scenario: Recorder defaults to current user
|
||||
- **WHEN** user creates meeting without specifying recorder
|
||||
- **THEN** the recorder field SHALL default to the logged-in user's email
|
||||
|
||||
### Requirement: List Meetings
|
||||
The system SHALL allow users to retrieve a list of meetings.
|
||||
|
||||
#### Scenario: List all meetings for admin
|
||||
- **WHEN** admin user requests GET /api/meetings
|
||||
- **THEN** all meetings SHALL be returned
|
||||
|
||||
#### Scenario: List meetings for regular user
|
||||
- **WHEN** regular user requests GET /api/meetings
|
||||
- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned
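The visibility rule for the two list scenarios could look roughly like the following. The field names `created_by`, `recorder`, and `attendees` (a list of emails) are assumptions made for illustration; the actual schema may store attendees differently, and a real implementation would likely filter in SQL rather than in memory.

```python
def visible_meetings(meetings, user_email, is_admin):
    """Return all meetings for admins, otherwise only those the user is involved in."""
    if is_admin:
        return list(meetings)
    return [
        m for m in meetings
        if user_email in (m.get("created_by"), m.get("recorder"))
        or user_email in m.get("attendees", [])
    ]
```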
|
||||
|
||||
### Requirement: Get Meeting Details
|
||||
The system SHALL allow users to retrieve full meeting details including conclusions and action items.
|
||||
|
||||
#### Scenario: Get meeting with related data
|
||||
- **WHEN** user requests GET /api/meetings/:id
|
||||
- **THEN** meeting record with all conclusions and action_items SHALL be returned
|
||||
|
||||
#### Scenario: Get non-existent meeting
|
||||
- **WHEN** user requests GET /api/meetings/:id for non-existent ID
|
||||
- **THEN** the server SHALL return HTTP 404
|
||||
|
||||
### Requirement: Update Meeting
|
||||
The system SHALL allow users to update meeting data, conclusions, and action items.
|
||||
|
||||
#### Scenario: Update meeting metadata
|
||||
- **WHEN** user submits PUT /api/meetings/:id with updated fields
|
||||
- **THEN** the meeting record SHALL be updated and new data returned
|
||||
|
||||
#### Scenario: Update action item status
|
||||
- **WHEN** user updates action item status to "Done"
|
||||
- **THEN** the action_items record SHALL reflect the new status
|
||||
|
||||
### Requirement: Delete Meeting
|
||||
The system SHALL allow authorized users to delete meetings.
|
||||
|
||||
#### Scenario: Admin deletes any meeting
|
||||
- **WHEN** admin user requests DELETE /api/meetings/:id
|
||||
- **THEN** the meeting and all related conclusions and action_items SHALL be deleted
|
||||
|
||||
#### Scenario: User deletes own meeting
|
||||
- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
|
||||
- **THEN** the meeting and all related data SHALL be deleted
|
||||
|
||||
### Requirement: System Code Generation
|
||||
The system SHALL auto-generate unique system codes for conclusions and action items.
|
||||
|
||||
#### Scenario: Generate conclusion code
|
||||
- **WHEN** a conclusion is created for a meeting on 2025-12-10
|
||||
- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number
|
||||
|
||||
#### Scenario: Generate action item code
|
||||
- **WHEN** an action item is created for a meeting on 2025-12-10
|
||||
- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number
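As a sketch of the code format, the function below builds a code from a prefix, the meeting date, and a sequence number. The name `make_system_code` is hypothetical, and zero-padding the sequence to two digits is an assumption based on the `XX` placeholder. How the next sequence number is obtained, for example by counting existing rows for that date, is not shown.

```python
from datetime import date


def make_system_code(prefix: str, meeting_date: date, sequence: int) -> str:
    """Build a code such as C-20251210-01 (conclusion) or A-20251210-01 (action item)."""
    if prefix not in ("C", "A"):
        raise ValueError("prefix must be 'C' or 'A'")
    if sequence < 1:
        raise ValueError("sequence starts at 1")
    return f"{prefix}-{meeting_date:%Y%m%d}-{sequence:02d}"
```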
|
||||
@@ -0,0 +1,41 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: FastAPI Server Configuration
|
||||
The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.
|
||||
|
||||
#### Scenario: Server startup with valid configuration
|
||||
- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
|
||||
- **THEN** the server SHALL start successfully and accept connections
|
||||
|
||||
#### Scenario: Server startup with missing configuration
|
||||
- **WHEN** the server starts with missing required environment variables
|
||||
- **THEN** the server SHALL fail to start with descriptive error message
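A minimal sketch of the fail-fast startup check, assuming the variable names listed in the first scenario. The function name `load_config` and the choice of `RuntimeError` are illustrative, not taken from the implementation.

```python
import os

REQUIRED_VARS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
    "DIFY_API_URL", "DIFY_API_KEY",
)


def load_config(env=None):
    """Return the required settings, or raise with the names of all missing variables."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in REQUIRED_VARS}
```

Reporting every missing name at once, rather than stopping at the first, keeps the error message descriptive as the scenario requires.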
|
||||
|
||||
### Requirement: Database Connection Pool
|
||||
The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.
|
||||
|
||||
#### Scenario: Database connection success
|
||||
- **WHEN** the server connects to MySQL with valid credentials
|
||||
- **THEN** a connection pool SHALL be established and queries SHALL execute successfully
|
||||
|
||||
#### Scenario: Database connection failure
|
||||
- **WHEN** the database is unreachable
|
||||
- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints
|
||||
|
||||
### Requirement: Table Initialization
|
||||
The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.
|
||||
|
||||
#### Scenario: Tables created on first run
|
||||
- **WHEN** the server starts and tables do not exist
|
||||
- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables
|
||||
|
||||
#### Scenario: Tables already exist
|
||||
- **WHEN** the server starts and tables already exist
|
||||
- **THEN** the server SHALL skip table creation and continue normally
|
||||
|
||||
### Requirement: CORS Configuration
|
||||
The middleware server SHALL allow cross-origin requests from the Electron client.
|
||||
|
||||
#### Scenario: CORS preflight request
|
||||
- **WHEN** Electron client sends OPTIONS request
|
||||
- **THEN** the server SHALL respond with appropriate CORS headers allowing the request
|
||||
@@ -0,0 +1,41 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Edge Speech-to-Text
|
||||
The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.
|
||||
|
||||
#### Scenario: Successful transcription
|
||||
- **WHEN** user records audio during a meeting
|
||||
- **THEN** the audio SHALL be transcribed locally without network dependency
|
||||
|
||||
#### Scenario: Transcription on target hardware
|
||||
- **WHEN** running on i5 processor with 8GB RAM
|
||||
- **THEN** transcription SHALL complete within acceptable latency for real-time display
|
||||
|
||||
### Requirement: Traditional Chinese Output
|
||||
The transcription engine SHALL output Traditional Chinese (繁體中文) text.
|
||||
|
||||
#### Scenario: Simplified to Traditional conversion
|
||||
- **WHEN** whisper outputs Simplified Chinese characters
|
||||
- **THEN** OpenCC SHALL convert output to Traditional Chinese
|
||||
|
||||
#### Scenario: Native Traditional Chinese
|
||||
- **WHEN** whisper outputs Traditional Chinese directly
|
||||
- **THEN** the text SHALL pass through unchanged
|
||||
|
||||
### Requirement: Real-time Display
|
||||
The Electron client SHALL display transcription results in real-time.
|
||||
|
||||
#### Scenario: Streaming transcription
|
||||
- **WHEN** user is recording
|
||||
- **THEN** transcribed text SHALL appear in the left panel within seconds of speech
|
||||
|
||||
### Requirement: Python Sidecar
|
||||
The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.
|
||||
|
||||
#### Scenario: Sidecar startup
|
||||
- **WHEN** Electron app launches
|
||||
- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available
|
||||
|
||||
#### Scenario: Sidecar communication
|
||||
- **WHEN** Electron sends audio data to sidecar
|
||||
- **THEN** transcribed text SHALL be returned via IPC
|
||||
@@ -0,0 +1,67 @@
|
||||
## 1. Middleware Server Foundation
|
||||
- [x] 1.1 Initialize Python project with FastAPI, uvicorn, python-dotenv
|
||||
- [x] 1.2 Create .env.example with all required environment variables
|
||||
- [x] 1.3 Implement database connection pool with mysql-connector-python
|
||||
- [x] 1.4 Create table initialization script (meeting_users, meeting_records, meeting_conclusions, meeting_action_items)
|
||||
- [x] 1.5 Configure CORS middleware for Electron client
|
||||
- [x] 1.6 Add health check endpoint GET /api/health
|
||||
|
||||
## 2. Authentication
|
||||
- [x] 2.1 Implement POST /api/login proxy to PJ-Auth API
|
||||
- [x] 2.2 Add admin role detection for ymirliu@panjit.com.tw
|
||||
- [x] 2.3 Create JWT validation middleware for protected routes
|
||||
- [x] 2.4 Handle token expiration with appropriate error codes
|
||||
|
||||
## 3. Meeting CRUD
|
||||
- [x] 3.1 Implement POST /api/meetings (create meeting)
|
||||
- [x] 3.2 Implement GET /api/meetings (list meetings with user filtering)
|
||||
- [x] 3.3 Implement GET /api/meetings/:id (get meeting with conclusions and action items)
|
||||
- [x] 3.4 Implement PUT /api/meetings/:id (update meeting)
|
||||
- [x] 3.5 Implement DELETE /api/meetings/:id (delete meeting cascade)
|
||||
- [x] 3.6 Implement system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
|
||||
|
||||
## 4. AI Summarization
|
||||
- [x] 4.1 Implement POST /api/ai/summarize endpoint
|
||||
- [x] 4.2 Configure Dify API client with timeout and retry
|
||||
- [x] 4.3 Parse Dify response into conclusions and action_items structure
|
||||
- [x] 4.4 Handle partial data (empty owner/due_date)
|
||||
|
||||
## 5. Excel Export
|
||||
- [x] 5.1 Create Excel template with placeholders
|
||||
- [x] 5.2 Implement GET /api/meetings/:id/export endpoint
|
||||
- [x] 5.3 Implement placeholder replacement ({{subject}}, {{time}}, etc.)
|
||||
- [x] 5.4 Implement dynamic row insertion for conclusions and action items
|
||||
|
||||
## 6. Electron Client - Core
|
||||
- [x] 6.1 Initialize Electron project with electron-builder
|
||||
- [x] 6.2 Create main window and basic navigation
|
||||
- [x] 6.3 Implement login page with auth API integration
|
||||
- [x] 6.4 Implement token storage and auto-refresh interceptor
|
||||
|
||||
## 7. Electron Client - Meeting UI
|
||||
- [x] 7.1 Create meeting list page
|
||||
- [x] 7.2 Create meeting creation form (metadata fields)
|
||||
- [x] 7.3 Create dual-panel meeting view (transcript left, notes right)
|
||||
- [x] 7.4 Implement conclusion/action item editing with manual completion UI
|
||||
- [x] 7.5 Add export button with download handling
|
||||
|
||||
## 8. Transcription Engine
|
||||
- [x] 8.1 Create Python sidecar project with faster-whisper and OpenCC
|
||||
- [x] 8.2 Implement audio input capture
|
||||
- [x] 8.3 Implement transcription with int8 model
|
||||
- [x] 8.4 Implement OpenCC Traditional Chinese conversion
|
||||
- [x] 8.5 Set up IPC communication between Electron and sidecar
|
||||
- [x] 8.6 Package sidecar with PyInstaller
|
||||
|
||||
## 9. Testing
|
||||
- [x] 9.1 Unit tests: DB connection, table creation
|
||||
- [x] 9.2 Unit tests: Dify proxy with mock responses
|
||||
- [x] 9.3 Unit tests: Admin role detection
|
||||
- [x] 9.4 Integration test: Auth flow with token refresh
|
||||
- [x] 9.5 Integration test: Full meeting cycle (create → transcribe → summarize → save → export)
|
||||
|
||||
## 10. Deployment Preparation
|
||||
- [x] 10.1 Create requirements.txt with all dependencies
|
||||
- [x] 10.2 Create deployment documentation
|
||||
- [x] 10.3 Configure electron-builder for portable target
|
||||
- [x] 10.4 Verify faster-whisper performance on i5/8GB hardware
|
||||
@@ -0,0 +1,117 @@
|
||||
## Context
|
||||
The Meeting Assistant currently uses batch transcription: audio is recorded, saved to file, then sent to Whisper for processing. This creates a poor UX where users must wait until recording stops to see any text. Users also cannot correct transcription errors.
|
||||
|
||||
**Stakeholders**: End users recording meetings, admin reviewing transcripts
|
||||
**Constraints**: i5/8GB hardware target, offline capability required
|
||||
|
||||
## Goals / Non-Goals
|
||||
|
||||
### Goals
|
||||
- Real-time text display during recording (< 3 second latency)
|
||||
- Segment-based editing without disrupting ongoing transcription
|
||||
- Punctuation in output (Chinese: 。,?!;:)
|
||||
- Maintain offline capability (all processing local)
|
||||
|
||||
### Non-Goals
|
||||
- Speaker diarization (who said what) - future enhancement
|
||||
- Multi-language mixing - Chinese only for MVP
|
||||
- Cloud-based transcription fallback
|
||||
|
||||
## Architecture
|
||||
|
||||
```
┌─────────────────────────────────────────────────────────────┐
│ Renderer Process (meeting-detail.html)                      │
│ ┌────────────────┐    ┌─────────────────────────────────┐   │
│ │ MediaRecorder  │───▶│ Editable Transcript Component   │   │
│ │ (audio chunks) │    │ [Segment 1] [Segment 2] [...]   │   │
│ └───────┬────────┘    └─────────────────────────────────┘   │
│         │ IPC: stream-audio-chunk                           │
└─────────┼───────────────────────────────────────────────────┘
          ▼
┌─────────────────────────────────────────────────────────────┐
│ Main Process (main.js)                                      │
│ ┌──────────────────┐     ┌─────────────────────────────┐    │
│ │ Audio Buffer     │────▶│ Sidecar (stdin pipe)        │    │
│ │ (accumulate PCM) │     │                             │    │
│ └──────────────────┘     └──────┬──────────────────────┘    │
│                                 │ IPC: transcription-segment│
│                                 ▼                           │
│                        Forward to renderer                  │
└─────────────────────────────────────────────────────────────┘
          │
          ▼ stdin (WAV chunks)
┌─────────────────────────────────────────────────────────────┐
│ Sidecar Process (transcriber.py)                            │
│ ┌──────────────┐   ┌──────────────┐   ┌────────────────┐    │
│ │ VAD Buffer   │──▶│ Whisper      │──▶│ Punctuator     │    │
│ │ (silero-vad) │   │ (transcribe) │   │ (rule-based)   │    │
│ └──────────────┘   └──────────────┘   └────────────────┘    │
│        │                                      │             │
│        │ Detect speech end                    │             │
│        ▼                                      ▼             │
│ stdout: {"segment_id": 1, "text": "今天開會討論。", ...}    │
└─────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
## Decisions
|
||||
|
||||
### Decision 1: VAD-triggered Segmentation
|
||||
**What**: Use Silero VAD to detect speech boundaries, transcribe complete utterances
|
||||
**Why**:
|
||||
- More accurate than fixed-interval chunking
|
||||
- Natural sentence boundaries
|
||||
- Reduces partial/incomplete transcriptions
|
||||
**Alternatives**:
|
||||
- Fixed 5-second chunks (simpler but cuts mid-sentence)
|
||||
- Word-level streaming (too fragmented, higher latency)
|
||||
|
||||
### Decision 2: Segment-based Editing
|
||||
**What**: Each VAD segment becomes an editable text block with unique ID
|
||||
**Why**:
|
||||
- Users can edit specific segments without affecting others
|
||||
- New segments append without disrupting editing
|
||||
- Simple merge on save (concatenate all segments)
|
||||
**Alternatives**:
|
||||
- Single textarea (editing conflicts with appending text)
|
||||
- Contenteditable div (complex cursor management)
|
||||
|
||||
### Decision 3: Audio Format Pipeline
|
||||
**What**: WebM (MediaRecorder) → WAV conversion in main.js → raw PCM to sidecar
|
||||
**Why**:
|
||||
- MediaRecorder in Chromium (and therefore Electron) outputs WebM/Opus
|
||||
- Whisper works best with WAV/PCM
|
||||
- Conversion in main.js keeps sidecar simple
|
||||
**Alternatives**:
|
||||
- ffmpeg in sidecar (adds large dependency)
|
||||
- Raw PCM from AudioWorklet (complex, browser compatibility issues)
|
||||
|
||||
### Decision 4: Punctuation via Whisper + Rules
|
||||
**What**: Enable Whisper word_timestamps, apply rule-based punctuation after
|
||||
**Why**:
|
||||
- Whisper alone outputs minimal punctuation for Chinese
|
||||
- Rule-based post-processing adds 。,? based on pauses and patterns
|
||||
- No additional model needed
|
||||
**Alternatives**:
|
||||
- Separate punctuation model (adds latency and complexity)
|
||||
- No punctuation (rejected: punctuation is a user requirement)
|
||||
|
||||
## Risks / Trade-offs
|
||||
|
||||
| Risk | Mitigation |
|------|------------|
| Latency > 3s on slow hardware | Use "tiny" model option, skip VAD if needed |
| WebM→WAV conversion quality loss | Use lossless conversion, test on various inputs |
| Memory usage with long meetings | Limit audio buffer to 30s, process and discard |
| Segment boundary splits words | Use VAD with 500ms silence threshold |
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
1. **Phase 1**: Sidecar streaming mode with VAD
|
||||
2. **Phase 2**: IPC audio streaming pipeline
|
||||
3. **Phase 3**: Frontend editable segment component
|
||||
4. **Phase 4**: Punctuation post-processing
|
||||
|
||||
## Open Questions
|
||||
- Should segments be auto-merged after N seconds of no editing?
|
||||
- Maximum segment count before auto-archiving old segments?
|
||||
@@ -0,0 +1,24 @@
|
||||
# Change: Add Real-time Streaming Transcription
|
||||
|
||||
## Why
|
||||
Current transcription workflow requires users to stop recording before seeing results. Users cannot edit transcription errors, and output lacks punctuation. For meeting scenarios, real-time feedback with editable text is essential for immediate correction and context awareness.
|
||||
|
||||
## What Changes
|
||||
- **Sidecar**: Implement streaming VAD-based transcription with sentence segmentation
|
||||
- **IPC**: Add continuous audio streaming from renderer to main process to sidecar
|
||||
- **Frontend**: Make transcript editable with real-time segment updates
|
||||
- **Punctuation**: Enable Whisper's word timestamps and add sentence boundary detection
|
||||
|
||||
## Impact
|
||||
- Affected specs: `transcription` (new), `frontend-transcript` (new)
|
||||
- Affected code:
|
||||
- `sidecar/transcriber.py` - Add streaming mode with VAD
|
||||
- `client/src/main.js` - Add audio streaming IPC handlers
|
||||
- `client/src/preload.js` - Expose streaming APIs
|
||||
- `client/src/pages/meeting-detail.html` - Editable transcript component
|
||||
|
||||
## Success Criteria
|
||||
1. User sees text appearing within 2-3 seconds of speaking
|
||||
2. Each segment is individually editable
|
||||
3. Output includes punctuation (。,?!)
|
||||
4. Recording can continue while user edits previous segments
|
||||
@@ -0,0 +1,58 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Editable Transcript Segments
|
||||
The frontend SHALL display transcribed text as individually editable segments that can be modified without disrupting ongoing transcription.
|
||||
|
||||
#### Scenario: Display new segment
|
||||
- **WHEN** a new transcription segment is received from sidecar
|
||||
- **THEN** a new editable text block SHALL appear in the transcript area
|
||||
- **AND** the block SHALL be visually distinct (e.g., border, background)
|
||||
- **AND** the block SHALL be immediately editable
|
||||
|
||||
#### Scenario: Edit existing segment
|
||||
- **WHEN** user modifies text in a segment
|
||||
- **THEN** only that segment's local data SHALL be updated
|
||||
- **AND** new incoming segments SHALL continue to append below
|
||||
- **AND** the edited segment SHALL show an "edited" indicator
|
||||
|
||||
#### Scenario: Save merged transcript
|
||||
- **WHEN** user clicks Save button
|
||||
- **THEN** all segments (edited and unedited) SHALL be concatenated in order
|
||||
- **AND** the merged text SHALL be saved as transcript_blob
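The merge-on-save step can be illustrated as follows. The keys `segment_id` and `text` follow the sidecar's segment JSON; joining without a separator is an assumption that suits Chinese text, and the function name `merge_segments` is hypothetical. The sketch is in Python for consistency with the other examples, although the real merge runs in the renderer.

```python
def merge_segments(segments):
    """Concatenate segment texts in segment_id order, whether edited or not."""
    ordered = sorted(segments, key=lambda s: s["segment_id"])
    return "".join(s["text"] for s in ordered)
```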
|
||||
|
||||
### Requirement: Real-time Streaming UI
|
||||
The frontend SHALL provide clear visual feedback during streaming transcription.
|
||||
|
||||
#### Scenario: Recording active indicator
|
||||
- **WHEN** streaming recording is active
|
||||
- **THEN** a pulsing recording indicator SHALL be visible
|
||||
- **AND** the current/active segment SHALL have distinct styling (e.g., highlighted border)
|
||||
- **AND** the Start Recording button SHALL change to Stop Recording
|
||||
|
||||
#### Scenario: Processing indicator
|
||||
- **WHEN** audio is being processed but no text has appeared yet
|
||||
- **THEN** a "Processing..." indicator SHALL appear in the active segment area
|
||||
- **AND** the indicator SHALL disappear when text arrives
|
||||
|
||||
#### Scenario: Streaming status display
|
||||
- **WHEN** streaming session is active
|
||||
- **THEN** the UI SHALL display segment count (e.g., "Segment 5/5")
|
||||
- **AND** total recording duration
|
||||
|
||||
### Requirement: Audio Streaming IPC
|
||||
The Electron main process SHALL provide IPC handlers for continuous audio streaming between renderer and sidecar.
|
||||
|
||||
#### Scenario: Start streaming
|
||||
- **WHEN** renderer calls `startRecordingStream()`
|
||||
- **THEN** main process SHALL send start_stream command to sidecar
|
||||
- **AND** return session confirmation to renderer
|
||||
|
||||
#### Scenario: Stream audio data
|
||||
- **WHEN** renderer sends audio chunk via `streamAudioChunk(arrayBuffer)`
|
||||
- **THEN** main process SHALL convert WebM to PCM if needed
|
||||
- **AND** forward to sidecar stdin as base64-encoded audio_chunk command
|
||||
|
||||
#### Scenario: Receive transcription
|
||||
- **WHEN** sidecar emits a segment result on stdout
|
||||
- **THEN** main process SHALL parse the JSON
|
||||
- **AND** forward to renderer via `transcription-segment` IPC event
|
||||
@@ -0,0 +1,46 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Streaming Transcription Mode
|
||||
The sidecar SHALL support a streaming mode where audio chunks are continuously received and transcribed in real-time with VAD-triggered segmentation.
|
||||
|
||||
#### Scenario: Start streaming session
|
||||
- **WHEN** sidecar receives `{"action": "start_stream"}` command
|
||||
- **THEN** it SHALL initialize audio buffer and VAD processor
|
||||
- **AND** respond with `{"status": "streaming", "session_id": "<uuid>"}`
|
||||
|
||||
#### Scenario: Process audio chunk
|
||||
- **WHEN** sidecar receives `{"action": "audio_chunk", "data": "<base64_pcm>"}` during active stream
|
||||
- **THEN** it SHALL append audio to buffer and run VAD detection
|
||||
- **AND** if speech boundary detected, transcribe accumulated audio
|
||||
- **AND** emit `{"segment_id": <int>, "text": "<transcription>", "is_final": true}`
|
||||
|
||||
#### Scenario: Stop streaming session
|
||||
- **WHEN** sidecar receives `{"action": "stop_stream"}` command
|
||||
- **THEN** it SHALL transcribe any remaining buffered audio
|
||||
- **AND** respond with `{"status": "stream_stopped", "total_segments": <int>}`
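To make the command protocol concrete, here is a rough sketch of a session object that handles one JSON line at a time. The class name `StreamSession`, the `buffer` attribute, and the error response for unknown actions are hypothetical. VAD and transcription are deliberately left out, so `audio_chunk` only buffers the decoded bytes and `stop_stream` does not flush any remaining audio.

```python
import base64
import json
import uuid


class StreamSession:
    """Handles start_stream / audio_chunk / stop_stream commands (no VAD or Whisper here)."""

    def __init__(self):
        self.session_id = None
        self.active = False
        self.buffer = bytearray()
        self.total_segments = 0

    def handle(self, line):
        """Process one JSON command line; return a response dict or None."""
        message = json.loads(line)
        action = message.get("action")
        if action == "start_stream":
            self.session_id = str(uuid.uuid4())
            self.active = True
            self.buffer.clear()
            self.total_segments = 0
            return {"status": "streaming", "session_id": self.session_id}
        if action == "audio_chunk" and self.active:
            # A full implementation would run VAD here and emit a segment
            # when a speech boundary is detected.
            self.buffer.extend(base64.b64decode(message["data"]))
            return None
        if action == "stop_stream":
            self.active = False
            return {"status": "stream_stopped", "total_segments": self.total_segments}
        # Hypothetical error shape; the spec does not define one.
        return {"error": "unknown or invalid action: %s" % action}
```

A main loop could read lines from stdin, pass each to `handle`, and write any non-`None` result to stdout as one JSON line.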
|
||||
|
||||
### Requirement: VAD-based Speech Segmentation
|
||||
The sidecar SHALL use Voice Activity Detection to identify natural speech boundaries for segmentation.
|
||||
|
||||
#### Scenario: Detect speech end
|
||||
- **WHEN** VAD detects silence exceeding 500ms after speech
|
||||
- **THEN** the accumulated speech audio SHALL be sent for transcription
|
||||
- **AND** a new segment SHALL begin for subsequent speech
|
||||
|
||||
#### Scenario: Handle continuous speech
|
||||
- **WHEN** speech continues for more than 15 seconds without pause
|
||||
- **THEN** the sidecar SHALL force a segment boundary
|
||||
- **AND** transcribe the 15-second chunk to prevent excessive latency
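The two boundary rules can be sketched as a small state machine fed with one VAD decision per audio frame. The class name `Segmenter` and the 30 ms default frame length are assumptions; the 500 ms and 15 s thresholds come from the scenarios above. The VAD model itself is not shown.

```python
class Segmenter:
    """Tracks per-frame VAD decisions and reports when a segment should be cut."""

    def __init__(self, frame_ms=30, silence_ms=500, max_segment_ms=15000):
        self.frame_ms = frame_ms
        self.silence_ms = silence_ms
        self.max_segment_ms = max_segment_ms
        self._reset()

    def _reset(self):
        self.in_speech = False
        self.segment_ms = 0
        self.trailing_silence_ms = 0

    def push(self, is_speech):
        """Feed one frame; return True when a segment boundary is reached."""
        if is_speech:
            self.in_speech = True
            self.trailing_silence_ms = 0
        elif self.in_speech:
            self.trailing_silence_ms += self.frame_ms
        if not self.in_speech:
            return False  # leading silence never opens a segment
        self.segment_ms += self.frame_ms
        silence_cut = self.trailing_silence_ms > self.silence_ms
        forced_cut = self.segment_ms >= self.max_segment_ms
        if silence_cut or forced_cut:
            self._reset()
            return True
        return False
```

When `push` returns `True`, the caller would hand the accumulated audio to the transcriber and start collecting the next segment.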
|
||||
|
||||
### Requirement: Punctuation in Transcription Output
|
||||
The sidecar SHALL output transcribed text with appropriate Chinese punctuation marks.
|
||||
|
||||
#### Scenario: Add sentence-ending punctuation
|
||||
- **WHEN** transcription completes for a segment
|
||||
- **THEN** the output SHALL include period (。) at natural sentence boundaries
|
||||
- **AND** question marks (?) for interrogative sentences
|
||||
- **AND** commas (,) for clause breaks within sentences
|
||||
|
||||
#### Scenario: Detect question patterns
|
||||
- **WHEN** transcribed text ends with a question particle or interrogative word (嗎、呢、什麼、怎麼、為什麼)
|
||||
- **THEN** the punctuation processor SHALL append question mark (?)
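A minimal sketch of the ending rule, assuming full-width marks (?and 。) in the output and treating the end of a segment as a sentence boundary. Both are assumptions, as is the function name `punctuate_ending`; comma insertion based on pauses is not covered.

```python
import re

# Endings listed in the scenario above
QUESTION_ENDING = re.compile(r"(嗎|呢|什麼|怎麼|為什麼)$")
# Full-width and ASCII marks that count as existing punctuation
EXISTING_MARKS = "。,?!;:,?!;:."


def punctuate_ending(text):
    """Append ? for question endings, 。 otherwise; keep existing punctuation."""
    stripped = text.strip()
    if not stripped or stripped[-1] in EXISTING_MARKS:
        return stripped
    if QUESTION_ENDING.search(stripped):
        return stripped + "?"
    return stripped + "。"
```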
|
||||
@@ -0,0 +1,53 @@
|
||||
## 1. Sidecar Streaming Infrastructure
|
||||
- [x] 1.1 Add silero-vad dependency to requirements.txt
|
||||
- [x] 1.2 Implement VADProcessor class with speech boundary detection
|
||||
- [x] 1.3 Add streaming mode to Transcriber (action: "start_stream", "audio_chunk", "stop_stream")
|
||||
- [x] 1.4 Implement audio buffer with VAD-triggered transcription
|
||||
- [x] 1.5 Add segment_id tracking for each utterance
|
||||
- [x] 1.6 Test VAD with sample Chinese speech audio
|
||||
|
||||
## 2. Punctuation Processing
|
||||
- [x] 2.1 Enable word_timestamps in Whisper transcribe()
|
||||
- [x] 2.2 Implement ChinesePunctuator class with rule-based punctuation
|
||||
- [x] 2.3 Add pause-based sentence boundary detection (>500ms → period)
|
||||
- [x] 2.4 Add question detection (嗎、呢、什麼 patterns → ?)
|
||||
- [x] 2.5 Test punctuation output quality with sample transcripts
|
||||
|
||||
## 3. IPC Audio Streaming
|
||||
- [x] 3.1 Add "start-recording-stream" IPC handler in main.js
|
||||
- [x] 3.2 Add "stream-audio-chunk" IPC handler to forward audio to sidecar
|
||||
- [x] 3.3 Add "stop-recording-stream" IPC handler
|
||||
- [x] 3.4 Implement WebM to PCM conversion using web-audio-api or ffmpeg.wasm
|
||||
- [x] 3.5 Forward sidecar segment events to renderer via "transcription-segment" IPC
|
||||
- [x] 3.6 Update preload.js with streaming API exposure
|
||||
|
||||
## 4. Frontend Editable Transcript
|
||||
- [x] 4.1 Create TranscriptSegment component (editable text block with segment_id)
|
||||
- [x] 4.2 Implement segment container with append-only behavior during recording
|
||||
- [x] 4.3 Add edit handler that updates local segment data
|
||||
- [x] 4.4 Style active segment (currently receiving text) differently
|
||||
- [x] 4.5 Update Save button to merge all segments into transcript_blob
|
||||
- [x] 4.6 Add visual indicator for streaming status
|
||||
|
||||
## 5. Integration & Testing
|
||||
- [x] 5.1 End-to-end test: start recording → speak → see text appear
|
||||
- [x] 5.2 Test editing segment while new segments arrive
|
||||
- [x] 5.3 Test save with mixed edited/unedited segments
|
||||
- [x] 5.4 Performance test on i5/8GB target hardware
|
||||
- [x] 5.5 Test with 30+ minute continuous recording
|
||||
- [x] 5.6 Update meeting-detail.html recording flow documentation
|
||||
|
||||
## Dependencies
|
||||
- Task 3 depends on Task 1 (sidecar must support streaming first)
|
||||
- Task 4 depends on Task 3 (frontend needs IPC to receive segments)
|
||||
- Task 2 can run in parallel with Task 3
|
||||
|
||||
## Parallelizable Work
|
||||
- Tasks 1 and 4 can start simultaneously (sidecar and frontend scaffolding)
|
||||
- Task 2 can run in parallel with Task 3
|
||||
|
||||
## Implementation Notes
|
||||
- VAD uses Silero VAD with fallback to 5-second time-based segmentation if torch unavailable
|
||||
- Audio captured at 16kHz mono, converted to int16 PCM, sent as base64
|
||||
- ChinesePunctuator uses regex patterns for question detection
|
||||
- Segments are editable immediately, edited segments marked with orange border
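The audio encoding described in the notes above (16 kHz mono, int16 PCM, base64) can be illustrated with the standard library. The function names are hypothetical, little-endian byte order is an assumption, and resampling to 16 kHz is not shown. The real conversion happens on the Electron side; Python is used here only for consistency with the other examples.

```python
import base64
import json
import struct


def encode_pcm_chunk(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian int16 PCM, base64-encoded."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    raw = struct.pack("<%dh" % len(ints), *ints)
    return base64.b64encode(raw).decode("ascii")


def build_audio_chunk_command(samples):
    """Wrap the encoded audio in the audio_chunk command sent to the sidecar."""
    return json.dumps({"action": "audio_chunk", "data": encode_pcm_chunk(samples)})
```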
|
||||
56
openspec/project.md
Normal file
@@ -0,0 +1,56 @@
|
||||
# Project Context
|
||||
|
||||
## Purpose
|
||||
Enterprise meeting knowledge management solution that automates meeting transcription and generates structured summaries. It replaces time-consuming manual note-taking with edge AI for speech-to-text and an LLM for summarization with action item tracking.
|
||||
|
||||
## Tech Stack
|
||||
- **Frontend**: Electron (edge computing for offline transcription)
|
||||
- **Backend**: Python FastAPI (middleware server)
|
||||
- **Database**: MySQL (shared instance at mysql.theaken.com:33306)
|
||||
- **AI/ML**:
|
||||
- faster-whisper (int8) for local speech-to-text
|
||||
- OpenCC for Traditional Chinese conversion
|
||||
- Dify LLM for summarization
|
||||
- **Key Libraries**: mysql-connector-python, fastapi, requests, openpyxl, PyInstaller
|
||||
|
||||
## Project Conventions
|
||||
|
||||
### Code Style
|
||||
- Database tables must use `meeting_` prefix
|
||||
- System IDs follow format: `C-YYYYMMDD-XX` (conclusions), `A-YYYYMMDD-XX` (action items)
|
||||
- API endpoints use `/api/` prefix
|
||||
- Environment variables for sensitive config (DB credentials, API keys)
|
||||
|
||||
### Architecture Patterns
|
||||
- **Three-tier architecture**: Electron Client → FastAPI Middleware → MySQL/Dify
|
||||
- **Security**: DB connections and API keys must NOT be in Electron client; all secrets stay in middleware
|
||||
- **Edge Computing**: Speech-to-text runs locally in Electron for offline capability
|
||||
- **Proxy Pattern**: Middleware proxies auth requests to external Auth API
|
||||
|
||||
### Testing Strategy
|
||||
- **Unit Tests**: DB connectivity, Dify proxy, admin role detection
|
||||
- **Integration Tests**: Auth flow with token refresh, full meeting cycle (create → record → summarize → save → export)
|
||||
- **Deployment Checklist**: Environment validation, table creation, package verification
|
||||
|
||||
### Git Workflow
|
||||
- Feature branches for new capabilities
|
||||
- OpenSpec change proposals for significant features
|
||||
|
||||
## Domain Context
|
||||
- **會議記錄 (Meeting Records)**: Core entity with metadata (subject, time, chairperson, location, recorder, attendees)
|
||||
- **逐字稿 (Transcript)**: Raw AI-generated speech-to-text output
|
||||
- **會議結論 (Conclusions)**: Summarized key decisions from meetings
|
||||
- **待辦事項 (Action Items)**: Tracked tasks with owner, due date, and status (Open/In Progress/Done/Delayed)
|
||||
- **Admin User**: ymirliu@panjit.com.tw has full access to all meetings and Excel template management
|
||||
|
||||
## Important Constraints
|
||||
- Target hardware: i5/8GB laptop must run faster-whisper int8 locally
|
||||
- Security: No DB credentials or API keys in client-side code
|
||||
- Language: Must support Traditional Chinese (繁體中文) output
|
||||
- Data Isolation: All tables prefixed with `meeting_`
|
||||
- Token Management: Client must implement auto-refresh for long meetings
|
||||
|
||||
## External Dependencies
|
||||
- **Auth API**: https://pj-auth-api.vercel.app/api/auth/login (company SSO)
|
||||
- **Dify LLM**: https://dify.theaken.com/v1 (AI summarization)
|
||||
- **MySQL**: mysql.theaken.com:33306, database `db_A060`
|
||||
49
openspec/specs/ai-summarization/spec.md
Normal file
@@ -0,0 +1,49 @@
# ai-summarization Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Dify Integration
The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.

#### Scenario: Successful summarization
- **WHEN** user submits POST /api/ai/summarize with transcript text
- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items

#### Scenario: Dify timeout handling
- **WHEN** Dify API does not respond within the timeout period
- **THEN** the server SHALL return HTTP 504 with a timeout error so that the client can retry

#### Scenario: Dify error handling
- **WHEN** Dify API returns an error (500, rate limit, etc.)
- **THEN** the server SHALL return an appropriate HTTP error with details

### Requirement: Structured Output Format
The AI summarization SHALL return structured data with conclusions and action items.

#### Scenario: Complete structured response
- **WHEN** transcript contains clear decisions and assignments
- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields

#### Scenario: Partial data extraction
- **WHEN** transcript lacks explicit owner or due_date for action items
- **THEN** those fields SHALL be empty strings allowing manual completion

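The two structured-output scenarios above can be sketched as a small normalization step. This is a minimal illustration, not the project's implementation: the function name `normalize_summary` is hypothetical, and it assumes the Dify answer has already been parsed into a Python dict and that each conclusion is a plain string.

```python
from typing import Any


def normalize_summary(raw: dict[str, Any]) -> dict[str, list]:
    """Coerce a parsed LLM answer into the conclusions/action_items shape.

    Assumption: `raw` is the already-parsed JSON answer; conclusions are
    treated as plain strings (the spec does not fix their element type).
    """
    conclusions = [str(c) for c in raw.get("conclusions") or []]
    action_items = []
    for item in raw.get("action_items") or []:
        action_items.append({
            "content": str(item.get("content") or ""),
            # Missing or null owner/due_date become empty strings so the
            # user can complete them manually in the client.
            "owner": str(item.get("owner") or ""),
            "due_date": str(item.get("due_date") or ""),
        })
    return {"conclusions": conclusions, "action_items": action_items}
```

Normalizing on the server keeps the response shape stable for the client even when the model omits a field.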
### Requirement: Dify Prompt Configuration
The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.

#### Scenario: System prompt behavior
- **WHEN** transcript is sent to Dify
- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format

### Requirement: Manual Data Completion
The Electron client SHALL allow users to manually complete missing AI-extracted data.

#### Scenario: Fill missing owner
- **WHEN** AI returns action item without owner
- **THEN** user SHALL be able to select or type owner name in the UI

#### Scenario: Fill missing due date
- **WHEN** AI returns action item without due_date
- **THEN** user SHALL be able to select date using date picker

46
openspec/specs/authentication/spec.md
Normal file
@@ -0,0 +1,46 @@
# authentication Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Login Proxy
The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.

#### Scenario: Successful login
- **WHEN** user submits valid credentials to POST /api/login
- **THEN** the server SHALL forward to Auth API and return the JWT token

#### Scenario: Admin role detection
- **WHEN** user logs in with email ymirliu@panjit.com.tw
- **THEN** the response JWT payload SHALL include role: "admin"

#### Scenario: Invalid credentials
- **WHEN** user submits invalid credentials
- **THEN** the server SHALL return HTTP 401 with error message from Auth API

### Requirement: Token Validation
The middleware server SHALL validate JWT tokens on protected endpoints.

#### Scenario: Valid token access
- **WHEN** request includes valid JWT in Authorization header
- **THEN** the request SHALL proceed to the endpoint handler

#### Scenario: Expired token
- **WHEN** request includes expired JWT
- **THEN** the server SHALL return HTTP 401 with "token_expired" error code

#### Scenario: Missing token
- **WHEN** request to protected endpoint lacks Authorization header
- **THEN** the server SHALL return HTTP 401 with "token_required" error code

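The three token-validation scenarios can be illustrated with a standard-library sketch. Several things here are assumptions rather than facts from this spec: that tokens are HS256-signed with a shared secret, the `Bearer ` header scheme, the function name `check_token`, and the extra `token_invalid` code for malformed or badly signed tokens. A real deployment would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def check_token(auth_header, secret: bytes, now=None):
    """Return (http_status, error_code, payload) for an Authorization header.

    Assumptions: HS256 signature, "Bearer <token>" scheme, and a
    hypothetical "token_invalid" code that the spec does not define.
    """
    if not auth_header or not auth_header.startswith("Bearer "):
        return 401, "token_required", None
    token = auth_header[len("Bearer "):]
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signed = f"{header_b64}.{payload_b64}".encode()
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return 401, "token_invalid", None
        payload = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return 401, "token_invalid", None
    current = time.time() if now is None else now
    if payload.get("exp", 0) <= current:
        return 401, "token_expired", None
    return 200, None, payload
```

Returning a distinct `token_expired` code is what lets the client tell "refresh and retry" apart from "log in again".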
### Requirement: Token Auto-Refresh
The Electron client SHALL implement automatic token refresh before expiration.

#### Scenario: Proactive refresh
- **WHEN** token approaches expiration (within 5 minutes) during active session
- **THEN** the client SHALL request new token transparently without user interruption

#### Scenario: Refresh during long meeting
- **WHEN** user is in a meeting session lasting longer than token validity
- **THEN** the client SHALL maintain authentication through automatic refresh

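The proactive-refresh rule reduces to a little timing arithmetic. The sketch below expresses only that logic in Python for consistency with the backend examples; the actual client is Electron, and the names `needs_refresh` and `seconds_until_refresh` are hypothetical.

```python
REFRESH_WINDOW_S = 5 * 60  # "within 5 minutes" from the scenario above


def needs_refresh(exp: float, now: float, window_s: float = REFRESH_WINDOW_S) -> bool:
    """True once the token is inside the refresh window (or already expired)."""
    return exp - now <= window_s


def seconds_until_refresh(exp: float, now: float, window_s: float = REFRESH_WINDOW_S) -> float:
    """Delay before the next refresh attempt; 0 means refresh immediately."""
    return max(0.0, exp - window_s - now)
```

A client could schedule a timer with `seconds_until_refresh` after every successful login or refresh, which covers meetings that outlast a single token.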
49
openspec/specs/excel-export/spec.md
Normal file
@@ -0,0 +1,49 @@
# excel-export Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Excel Report Generation
The middleware server SHALL generate Excel reports from meeting data using templates.

#### Scenario: Successful export
- **WHEN** user requests GET /api/meetings/:id/export
- **THEN** server SHALL generate Excel file and return as downloadable stream

#### Scenario: Export non-existent meeting
- **WHEN** user requests export for non-existent meeting ID
- **THEN** server SHALL return HTTP 404

### Requirement: Template-based Generation
The Excel export SHALL use openpyxl with template files.

#### Scenario: Placeholder replacement
- **WHEN** Excel is generated
- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data

#### Scenario: Dynamic row insertion
- **WHEN** meeting has multiple conclusions or action items
- **THEN** rows SHALL be dynamically inserted to accommodate all items

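Placeholder replacement can be sketched as a pure string function, kept independent of openpyxl so it runs on its own. The name `fill_placeholders` and the choice to leave unknown placeholders untouched are assumptions, not behaviour stated in this spec.

```python
import re

# Matches {{name}} with optional inner whitespace, e.g. {{subject}} or {{ chair }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(text: str, data: dict) -> str:
    """Replace {{name}} tokens with values from `data`.

    Assumption: names missing from `data` are left as-is so that an
    incomplete template remains visible in the generated file.
    """
    def repl(match: re.Match) -> str:
        key = match.group(1)
        return str(data[key]) if key in data else match.group(0)

    return _PLACEHOLDER.sub(repl, text)
```

With openpyxl, the same function could be applied to each string-valued `cell.value` while iterating `worksheet.iter_rows()`; that wiring and the dynamic row insertion are omitted here.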
### Requirement: Complete Data Inclusion
The exported Excel SHALL include all meeting metadata and AI-generated content.

#### Scenario: Full metadata export
- **WHEN** Excel is generated
- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees

#### Scenario: Conclusions export
- **WHEN** Excel is generated
- **THEN** all conclusions SHALL be listed with their system codes

#### Scenario: Action items export
- **WHEN** Excel is generated
- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code

### Requirement: Template Management
Admin users SHALL be able to manage Excel templates.

#### Scenario: Admin template access
- **WHEN** admin user accesses template management
- **THEN** they SHALL be able to upload, view, and update Excel templates

62
openspec/specs/frontend-transcript/spec.md
Normal file
@@ -0,0 +1,62 @@
# frontend-transcript Specification

## Purpose
TBD - created by archiving change add-realtime-transcription. Update Purpose after archive.
## Requirements
### Requirement: Editable Transcript Segments
The frontend SHALL display transcribed text as individually editable segments that can be modified without disrupting ongoing transcription.

#### Scenario: Display new segment
- **WHEN** a new transcription segment is received from sidecar
- **THEN** a new editable text block SHALL appear in the transcript area
- **AND** the block SHALL be visually distinct (e.g., border, background)
- **AND** the block SHALL be immediately editable

#### Scenario: Edit existing segment
- **WHEN** user modifies text in a segment
- **THEN** only that segment's local data SHALL be updated
- **AND** new incoming segments SHALL continue to append below
- **AND** the edited segment SHALL show an "edited" indicator

#### Scenario: Save merged transcript
- **WHEN** user clicks Save button
- **THEN** all segments (edited and unedited) SHALL be concatenated in order
- **AND** the merged text SHALL be saved as transcript_blob

### Requirement: Real-time Streaming UI
The frontend SHALL provide clear visual feedback during streaming transcription.

#### Scenario: Recording active indicator
- **WHEN** streaming recording is active
- **THEN** a pulsing recording indicator SHALL be visible
- **AND** the current/active segment SHALL have distinct styling (e.g., highlighted border)
- **AND** the Start Recording button SHALL change to Stop Recording

#### Scenario: Processing indicator
- **WHEN** audio is being processed but no text has appeared yet
- **THEN** a "Processing..." indicator SHALL appear in the active segment area
- **AND** the indicator SHALL disappear when text arrives

#### Scenario: Streaming status display
- **WHEN** streaming session is active
- **THEN** the UI SHALL display segment count (e.g., "Segment 5/5")
- **AND** total recording duration

### Requirement: Audio Streaming IPC
The Electron main process SHALL provide IPC handlers for continuous audio streaming between renderer and sidecar.

#### Scenario: Start streaming
- **WHEN** renderer calls `startRecordingStream()`
- **THEN** main process SHALL send start_stream command to sidecar
- **AND** return session confirmation to renderer

#### Scenario: Stream audio data
- **WHEN** renderer sends audio chunk via `streamAudioChunk(arrayBuffer)`
- **THEN** main process SHALL convert WebM to PCM if needed
- **AND** forward to sidecar stdin as base64-encoded audio_chunk command

#### Scenario: Receive transcription
- **WHEN** sidecar emits a segment result on stdout
- **THEN** main process SHALL parse the JSON
- **AND** forward to renderer via `transcription-segment` IPC event

75
openspec/specs/meeting-management/spec.md
Normal file
@@ -0,0 +1,75 @@
# meeting-management Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Create Meeting
The system SHALL allow users to create meetings with required metadata.

#### Scenario: Create meeting with all fields
- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned

#### Scenario: Create meeting with missing required fields
- **WHEN** user submits POST /api/meetings without subject or meeting_time
- **THEN** the server SHALL return HTTP 400 with validation error details

#### Scenario: Recorder defaults to current user
- **WHEN** user creates meeting without specifying recorder
- **THEN** the recorder field SHALL default to the logged-in user's email

### Requirement: List Meetings
The system SHALL allow users to retrieve a list of meetings.

#### Scenario: List all meetings for admin
- **WHEN** admin user requests GET /api/meetings
- **THEN** all meetings SHALL be returned

#### Scenario: List meetings for regular user
- **WHEN** regular user requests GET /api/meetings
- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned

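The listing rule for admins versus regular users can be expressed as a filter over meeting records. The record shape used here is an assumption made for illustration: dicts with a hypothetical `created_by` field, a `recorder` field, and `attendees` as a list of email addresses. In the real service this would more likely be a SQL `WHERE` clause.

```python
def visible_meetings(meetings, user_email: str, is_admin: bool = False) -> list:
    """Return the meetings a user may see.

    Assumptions: each meeting is a dict with hypothetical keys
    "created_by", "recorder", and "attendees" (a list of emails).
    """
    if is_admin:
        return list(meetings)
    visible = []
    for meeting in meetings:
        related = {meeting.get("created_by"), meeting.get("recorder")}
        related.update(meeting.get("attendees") or [])
        if user_email in related:
            visible.append(meeting)
    return visible
```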
### Requirement: Get Meeting Details
The system SHALL allow users to retrieve full meeting details including conclusions and action items.

#### Scenario: Get meeting with related data
- **WHEN** user requests GET /api/meetings/:id
- **THEN** meeting record with all conclusions and action_items SHALL be returned

#### Scenario: Get non-existent meeting
- **WHEN** user requests GET /api/meetings/:id for non-existent ID
- **THEN** the server SHALL return HTTP 404

### Requirement: Update Meeting
The system SHALL allow users to update meeting data, conclusions, and action items.

#### Scenario: Update meeting metadata
- **WHEN** user submits PUT /api/meetings/:id with updated fields
- **THEN** the meeting record SHALL be updated and new data returned

#### Scenario: Update action item status
- **WHEN** user updates action item status to "Done"
- **THEN** the action_items record SHALL reflect the new status

### Requirement: Delete Meeting
The system SHALL allow authorized users to delete meetings.

#### Scenario: Admin deletes any meeting
- **WHEN** admin user requests DELETE /api/meetings/:id
- **THEN** the meeting and all related conclusions and action_items SHALL be deleted

#### Scenario: User deletes own meeting
- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
- **THEN** the meeting and all related data SHALL be deleted

### Requirement: System Code Generation
The system SHALL auto-generate unique system codes for conclusions and action items.

#### Scenario: Generate conclusion code
- **WHEN** a conclusion is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number

#### Scenario: Generate action item code
- **WHEN** an action item is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number

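The code format in the two scenarios above can be sketched as a formatting helper. The name `make_system_code` is hypothetical, and reading `XX` as a two-digit, zero-padded sequence number is an assumption; how the next sequence number is obtained (for example, by counting existing rows for that date) is left out.

```python
from datetime import date

_PREFIXES = {"conclusion": "C", "action": "A"}


def make_system_code(kind: str, meeting_date: date, sequence: int) -> str:
    """Format a system code such as C-20251210-01.

    Assumption: "XX" is zero-padded to at least two digits.
    """
    prefix = _PREFIXES[kind]  # raises KeyError for an unknown kind
    return f"{prefix}-{meeting_date:%Y%m%d}-{sequence:02d}"
```

For example, `make_system_code("action", date(2025, 12, 10), 3)` yields `A-20251210-03`.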
45
openspec/specs/middleware/spec.md
Normal file
@@ -0,0 +1,45 @@
# middleware Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: FastAPI Server Configuration
The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.

#### Scenario: Server startup with valid configuration
- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
- **THEN** the server SHALL start successfully and accept connections

#### Scenario: Server startup with missing configuration
- **WHEN** the server starts with missing required environment variables
- **THEN** the server SHALL fail to start with descriptive error message

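The fail-fast startup behaviour can be sketched with a small loader that checks the variables named in the scenario above. The function name `load_config` and the choice of `RuntimeError` are assumptions; loading the `.env` file into the environment is assumed to have happened already.

```python
REQUIRED_VARS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
    "DIFY_API_URL", "DIFY_API_KEY",
)


def load_config(env) -> dict:
    """Validate required settings from a mapping such as os.environ.

    Raises RuntimeError naming every missing variable, so startup
    fails with a descriptive message instead of a later connection error.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    config = {name: env[name] for name in REQUIRED_VARS}
    config["DB_PORT"] = int(config["DB_PORT"])  # ports arrive as strings
    return config
```

Calling `load_config(os.environ)` before creating the FastAPI app would surface all missing names in one error rather than one at a time.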
### Requirement: Database Connection Pool
The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.

#### Scenario: Database connection success
- **WHEN** the server connects to MySQL with valid credentials
- **THEN** a connection pool SHALL be established and queries SHALL execute successfully

#### Scenario: Database connection failure
- **WHEN** the database is unreachable
- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints

### Requirement: Table Initialization
The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.

#### Scenario: Tables created on first run
- **WHEN** the server starts and tables do not exist
- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables

#### Scenario: Tables already exist
- **WHEN** the server starts and tables already exist
- **THEN** the server SHALL skip table creation and continue normally

### Requirement: CORS Configuration
The middleware server SHALL allow cross-origin requests from the Electron client.

#### Scenario: CORS preflight request
- **WHEN** Electron client sends OPTIONS request
- **THEN** the server SHALL respond with appropriate CORS headers allowing the request

90
openspec/specs/transcription/spec.md
Normal file
@@ -0,0 +1,90 @@
# transcription Specification

## Purpose
TBD - created by archiving change add-meeting-assistant-mvp. Update Purpose after archive.
## Requirements
### Requirement: Edge Speech-to-Text
The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.

#### Scenario: Successful transcription
- **WHEN** user records audio during a meeting
- **THEN** the audio SHALL be transcribed locally without network dependency

#### Scenario: Transcription on target hardware
- **WHEN** running on i5 processor with 8GB RAM
- **THEN** transcription SHALL complete within acceptable latency for real-time display

### Requirement: Traditional Chinese Output
The transcription engine SHALL output Traditional Chinese (繁體中文) text.

#### Scenario: Simplified to Traditional conversion
- **WHEN** whisper outputs Simplified Chinese characters
- **THEN** OpenCC SHALL convert output to Traditional Chinese

#### Scenario: Native Traditional Chinese
- **WHEN** whisper outputs Traditional Chinese directly
- **THEN** the text SHALL pass through unchanged

### Requirement: Real-time Display
The Electron client SHALL display transcription results in real-time.

#### Scenario: Streaming transcription
- **WHEN** user is recording
- **THEN** transcribed text SHALL appear in the left panel within seconds of speech

### Requirement: Python Sidecar
The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.

#### Scenario: Sidecar startup
- **WHEN** Electron app launches
- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available

#### Scenario: Sidecar communication
- **WHEN** Electron sends audio data to sidecar
- **THEN** transcribed text SHALL be returned via IPC

### Requirement: Streaming Transcription Mode
The sidecar SHALL support a streaming mode where audio chunks are continuously received and transcribed in real-time with VAD-triggered segmentation.

#### Scenario: Start streaming session
- **WHEN** sidecar receives `{"action": "start_stream"}` command
- **THEN** it SHALL initialize audio buffer and VAD processor
- **AND** respond with `{"status": "streaming", "session_id": "<uuid>"}`

#### Scenario: Process audio chunk
- **WHEN** sidecar receives `{"action": "audio_chunk", "data": "<base64_pcm>"}` during active stream
- **THEN** it SHALL append audio to buffer and run VAD detection
- **AND** if speech boundary detected, transcribe accumulated audio
- **AND** emit `{"segment_id": <int>, "text": "<transcription>", "is_final": true}`

#### Scenario: Stop streaming session
- **WHEN** sidecar receives `{"action": "stop_stream"}` command
- **THEN** it SHALL transcribe any remaining buffered audio
- **AND** respond with `{"status": "stream_stopped", "total_segments": <int>}`

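The three streaming commands can be sketched as a small state machine that maps each JSON command line to a list of reply messages. This is an illustration under stated assumptions: the class name `StreamSession`, the injected `transcribe` and `is_boundary` callables (stand-ins for faster-whisper and the VAD), and the `error` reply for unexpected commands are all hypothetical.

```python
import base64
import json
import uuid


class StreamSession:
    """Handle start_stream / audio_chunk / stop_stream commands.

    Assumptions: `transcribe(pcm: bytes) -> str` and
    `is_boundary(buffer: bytes) -> bool` are supplied by the caller.
    """

    def __init__(self, transcribe, is_boundary):
        self.transcribe = transcribe
        self.is_boundary = is_boundary
        self.buffer = bytearray()
        self.segments = 0
        self.active = False

    def _flush(self) -> dict:
        # Transcribe everything buffered so far and start a new segment.
        self.segments += 1
        text = self.transcribe(bytes(self.buffer))
        self.buffer.clear()
        return {"segment_id": self.segments, "text": text, "is_final": True}

    def handle(self, line: str) -> list:
        cmd = json.loads(line)
        action = cmd.get("action")
        if action == "start_stream":
            self.buffer.clear()
            self.segments = 0
            self.active = True
            return [{"status": "streaming", "session_id": str(uuid.uuid4())}]
        if action == "audio_chunk" and self.active:
            self.buffer += base64.b64decode(cmd["data"])
            return [self._flush()] if self.is_boundary(bytes(self.buffer)) else []
        if action == "stop_stream" and self.active:
            replies = [self._flush()] if self.buffer else []
            self.active = False
            replies.append({"status": "stream_stopped", "total_segments": self.segments})
            return replies
        # Hypothetical reply; the spec does not define an error message.
        return [{"error": f"unexpected action: {action}"}]
```

A sidecar loop would read one line from stdin, call `handle`, and write each reply to stdout as a JSON line.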
### Requirement: VAD-based Speech Segmentation
The sidecar SHALL use Voice Activity Detection to identify natural speech boundaries for segmentation.

#### Scenario: Detect speech end
- **WHEN** VAD detects silence exceeding 500ms after speech
- **THEN** the accumulated speech audio SHALL be sent for transcription
- **AND** a new segment SHALL begin for subsequent speech

#### Scenario: Handle continuous speech
- **WHEN** speech continues for more than 15 seconds without pause
- **THEN** the sidecar SHALL force a segment boundary
- **AND** transcribe the 15-second chunk to prevent excessive latency

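The two boundary rules (more than 500 ms of trailing silence, or 15 seconds of uninterrupted audio) can be sketched over a list of per-frame VAD decisions. The function name `segment_frames`, the 100 ms frame length, and the representation of VAD output as booleans are assumptions; the real sidecar works on an ONNX VAD model whose interface is not described here.

```python
def segment_frames(frames, frame_ms=100, silence_ms=500, max_segment_ms=15_000):
    """Return (start, end) frame index pairs, end exclusive, per segment.

    Assumption: `frames` is a list of booleans, True meaning the VAD
    classified that frame as speech.
    """
    segments = []
    start = None   # first frame of the currently open segment
    silence = 0    # trailing silence inside the open segment, in ms
    for i, is_speech in enumerate(frames):
        if start is None:
            if is_speech:
                start, silence = i, 0
            continue
        silence = 0 if is_speech else silence + frame_ms
        length = (i + 1 - start) * frame_ms
        if silence > silence_ms:
            # Natural boundary: close the segment at the last speech frame.
            segments.append((start, i + 1 - silence // frame_ms))
            start = None
        elif length >= max_segment_ms:
            # Forced boundary to cap latency during continuous speech.
            segments.append((start, i + 1))
            start = None
    if start is not None:
        # Flush whatever is still open, dropping trailing silence.
        segments.append((start, len(frames) - silence // frame_ms))
    return segments
```

Trimming the trailing silence before transcription keeps empty audio away from the model, which may help with the hallucination problem mentioned in the commit message.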
### Requirement: Punctuation in Transcription Output
The sidecar SHALL output transcribed text with appropriate Chinese punctuation marks.

#### Scenario: Add sentence-ending punctuation
- **WHEN** transcription completes for a segment
- **THEN** the output SHALL include period (。) at natural sentence boundaries
- **AND** question marks (?) for interrogative sentences
- **AND** commas (,) for clause breaks within sentences

#### Scenario: Detect question patterns
- **WHEN** transcribed text ends with question particles (嗎、呢、什麼、怎麼、為什麼)
- **THEN** the punctuation processor SHALL append question mark (?)

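The question-pattern rule can be sketched as a suffix check on the transcribed segment. The function name `add_final_punctuation` is hypothetical, and defaulting to the full-width question mark (U+FF1F) is an assumption based on the requirement for Chinese punctuation; pass `question_mark="?"` if the ASCII form is wanted. Comma insertion at clause breaks is not covered.

```python
QUESTION_PARTICLES = ("嗎", "呢", "什麼", "怎麼", "為什麼")
FULLWIDTH_QUESTION = "\uff1f"   # assumption: full-width mark by default
IDEOGRAPHIC_PERIOD = "\u3002"   # the period shown in the spec
_TERMINATORS = "\u3002\uff1f\uff01?!"


def add_final_punctuation(text: str,
                          question_mark: str = FULLWIDTH_QUESTION,
                          period: str = IDEOGRAPHIC_PERIOD) -> str:
    """Append sentence-final punctuation to one transcribed segment."""
    text = text.strip()
    if not text or text[-1] in _TERMINATORS:
        return text  # empty, or already punctuated
    if text.endswith(QUESTION_PARTICLES):
        return text + question_mark
    return text + period
```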