feat: Meeting Assistant MVP - Complete implementation

Enterprise Meeting Knowledge Management System with:

Backend (FastAPI):
- Authentication proxy with JWT (pj-auth-api integration)
- MySQL database with 4 tables (users, meetings, conclusions, actions)
- Meeting CRUD with system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
- Dify LLM integration for AI summarization
- Excel export with openpyxl
- 20 unit tests (all passing)

Client (Electron):
- Login page with company auth
- Meeting list with create/delete
- Meeting detail with real-time transcription
- Editable transcript textarea (single block, easy editing)
- AI summarization with conclusions/action items
- 5-second segment recording (efficient for long meetings)

Sidecar (Python):
- faster-whisper medium model with int8 quantization
- ONNX Runtime VAD (lightweight, ~20MB vs PyTorch ~2GB)
- Chinese punctuation processing
- OpenCC for Traditional Chinese conversion
- Anti-hallucination parameters
- Auto-cleanup of temp audio files

OpenSpec:
- add-meeting-assistant-mvp (47 tasks, archived)
- add-realtime-transcription (29 tasks, archived)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
egg
2025-12-10 20:17:44 +08:00
commit 8b6184ecc5
65 changed files with 10510 additions and 0 deletions


@@ -0,0 +1,132 @@
## Context
Building a meeting knowledge management system for enterprise users. The system must support offline transcription on standard hardware (i5/8GB), integrate with existing company authentication, and provide AI-powered summarization via Dify LLM.
**Stakeholders**: Enterprise meeting participants, meeting recorders, admin users (ymirliu@panjit.com.tw)
**Constraints**:
- Must run faster-whisper int8 on i5/8GB laptop
- DB credentials and API keys must stay server-side (security)
- All database tables prefixed with `meeting_`
- Output must support Traditional Chinese (繁體中文)
## Goals / Non-Goals
**Goals**:
- Deliver working MVP with all six capabilities
- Secure architecture with secrets in middleware only
- Offline-capable transcription
- Structured output with trackable action items
**Non-Goals**:
- Multi-language support beyond Traditional Chinese
- Real-time collaborative editing
- Mobile client
- Custom LLM model training
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                         Electron Client                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │   Auth UI   │  │ Meeting UI  │  │  Transcription Engine   │  │
│  │   (Login)   │  │ (CRUD/Edit) │  │ (faster-whisper+OpenCC) │  │
│  └──────┬──────┘  └──────┬──────┘  └────────────┬────────────┘  │
└─────────┼────────────────┼──────────────────────┼───────────────┘
          │                │                      │
          │ HTTP           │ HTTP                 │ Local only
          ▼                ▼                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FastAPI Middleware Server                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────┐  │
│  │ Auth Proxy  │  │Meeting CRUD │  │ Dify Proxy  │  │ Export │  │
│  │ POST /login │  │POST/GET/... │  │POST /ai/... │  │GET /:id│  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └───┬────┘  │
└─────────┼────────────────┼────────────────┼─────────────┼───────┘
          │                │                │             │
          ▼                ▼                ▼             │
  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐      │
  │ PJ-Auth API  │ │    MySQL     │ │  Dify LLM    │      │
  │   (Vercel)   │ │ (theaken.com)│ │(theaken.com) │      │
  └──────────────┘ └──────────────┘ └──────────────┘      │
                                     ┌────────────────────┘
                             ┌────────────────┐
                             │ Excel Template │
                             │   (openpyxl)   │
                             └────────────────┘
```
## Decisions
### Decision 1: Three-tier architecture with middleware
**Choice**: All external services accessed through FastAPI middleware
**Rationale**: Security requirement - DB credentials and API keys cannot be in Electron client
**Alternatives considered**:
- Direct client-to-service: Rejected due to credential exposure risk
- Serverless functions: More complex deployment for similar security
### Decision 2: Edge transcription in Electron
**Choice**: Run faster-whisper locally via Python sidecar (PyInstaller)
**Rationale**: Offline capability requirement; network latency unacceptable for real-time transcription
**Alternatives considered**:
- Cloud STT (Google/Azure): Requires network, latency issues
- WebAssembly whisper: Not mature enough for production
### Decision 3: MySQL with prefixed tables
**Choice**: Use shared MySQL instance with `meeting_` prefix
**Rationale**: Leverage existing infrastructure; prefix ensures isolation
**Alternatives considered**:
- Dedicated database: Overhead not justified for MVP
- SQLite: File-based and single-writer; not suited to concurrent multi-user access through a shared server
### Decision 4: Dify for LLM summarization
**Choice**: Use company Dify instance for AI features
**Rationale**: Already available infrastructure; structured JSON output support
**Alternatives considered**:
- Direct OpenAI API: Additional cost, no existing infrastructure
- Local LLM: Hardware constraints (i5/8GB insufficient)
## Risks / Trade-offs
| Risk | Impact | Mitigation |
|------|--------|------------|
| faster-whisper performance on i5/8GB | High | Use int8 quantization; test on target hardware early |
| Dify timeout on long transcripts | Medium | Implement chunking; add timeout handling with retry |
| Token expiry during long meetings | Medium | Implement auto-refresh interceptor in client |
| Network failure during save | Medium | Client-side queue with retry; local draft storage |
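
The chunking mitigation for long transcripts could look like the minimal sketch below. `chunk_transcript` and the `max_chars` value are hypothetical and not taken from the implementation; the split prefers line boundaries so that utterances stay intact, and it assumes the limit is measured in characters.

```python
from typing import List


def chunk_transcript(text: str, max_chars: int = 4000) -> List[str]:
    """Split a transcript into chunks of at most max_chars characters.

    Assumption: the limit is expressed in characters; the real Dify limit
    (tokens, bytes, or request time) is not stated in the design.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit is hard-split.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when this line would overflow the current one.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be summarized separately and the resulting conclusions and action items merged, at the cost of losing context that spans chunk boundaries.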
## Data Model
```sql
-- Tables all prefixed with meeting_
meeting_users (user_id, email, display_name, role, created_at)
meeting_records (meeting_id, uuid, subject, meeting_time, location,
                 chairperson, recorder, attendees, transcript_blob,
                 created_by, created_at)
meeting_conclusions (conclusion_id, meeting_id, content, system_code)
meeting_action_items (action_id, meeting_id, content, owner, due_date,
                      status, system_code)
```
**ID Formats**:
- Conclusions: `C-YYYYMMDD-XX` (e.g., C-20251210-01)
- Action Items: `A-YYYYMMDD-XX` (e.g., A-20251210-01)
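
As an illustration of the ID formats above, here is a minimal sketch of a code generator. `next_system_code` is a hypothetical helper, and it assumes that the sequence restarts for each date and prefix and that `XX` is zero-padded to two digits; the actual implementation may allocate sequence numbers differently (for example, inside a database transaction).

```python
from datetime import date
from typing import Iterable


def next_system_code(prefix: str, meeting_date: date, existing: Iterable[str]) -> str:
    """Return the next code, such as C-20251210-01, for the given date.

    Assumptions: the sequence restarts per date and prefix, and the
    sequence number is zero-padded to at least two digits.
    """
    if prefix not in ("C", "A"):
        raise ValueError("prefix must be 'C' (conclusion) or 'A' (action item)")
    stem = f"{prefix}-{meeting_date:%Y%m%d}-"
    # Collect the sequence numbers already used for this prefix and date.
    used = [
        int(code[len(stem):])
        for code in existing
        if code.startswith(stem) and code[len(stem):].isdecimal()
    ]
    seq = max(used, default=0) + 1
    return f"{stem}{seq:02d}"
```

Computing the next number from existing codes is not safe under concurrent writes on its own; a unique index on `system_code` would be one way to catch collisions.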
## API Endpoints
| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /api/login | Proxy auth to PJ-Auth API |
| GET | /api/meetings | List meetings (filterable) |
| POST | /api/meetings | Create meeting |
| GET | /api/meetings/:id | Get meeting details |
| PUT | /api/meetings/:id | Update meeting |
| DELETE | /api/meetings/:id | Delete meeting |
| POST | /api/ai/summarize | Send transcript to Dify |
| GET | /api/meetings/:id/export | Generate Excel report |
## Open Questions
- None currently - PRD and SDD provide sufficient detail for MVP implementation


@@ -0,0 +1,25 @@
# Change: Add Meeting Assistant MVP
## Why
Enterprise users spend significant time manually documenting meetings and tracking action items. This MVP delivers an end-to-end meeting knowledge management solution with offline transcription, AI-powered summarization, and structured tracking of conclusions and action items.
## What Changes
- **NEW** FastAPI middleware server with MySQL integration
- **NEW** Authentication proxy to company Auth API with admin role detection
- **NEW** Meeting CRUD operations with metadata management
- **NEW** Edge-based speech-to-text using faster-whisper (int8)
- **NEW** Dify LLM integration for intelligent summarization
- **NEW** Excel report generation from templates
## Impact
- Affected specs: middleware, authentication, meeting-management, transcription, ai-summarization, excel-export
- Affected code: New Python FastAPI backend, new Electron frontend
- External dependencies: PJ-Auth API, MySQL database, Dify LLM service
## Success Criteria
- Users can log in via company SSO
- Meetings can be created with required metadata (subject, time, chairperson, location, recorder, attendees)
- Speech-to-text works offline on i5/8GB hardware
- AI generates structured conclusions and action items from transcripts
- Action items have trackable status (Open/In Progress/Done/Delayed)
- Excel reports can be exported with all meeting data


@@ -0,0 +1,45 @@
## ADDED Requirements
### Requirement: Dify Integration
The middleware server SHALL integrate with Dify LLM at https://dify.theaken.com/v1 for transcript summarization.
#### Scenario: Successful summarization
- **WHEN** user submits POST /api/ai/summarize with transcript text
- **THEN** the server SHALL call Dify API and return structured JSON with conclusions and action_items
#### Scenario: Dify timeout handling
- **WHEN** Dify API does not respond within the configured timeout period
- **THEN** the server SHALL return HTTP 504 with a timeout error so that the client can retry
#### Scenario: Dify error handling
- **WHEN** Dify API returns error (500, rate limit, etc.)
- **THEN** the server SHALL return appropriate HTTP error with details
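
The timeout and error scenarios above could be handled along these lines. This is a sketch only: `UpstreamError` and `summarize_with_error_mapping` are hypothetical names, the built-in `TimeoutError` stands in for whatever timeout exception the real HTTP client raises, and the choice of 429 and 502 for upstream failures is an assumption, since the spec only asks for an "appropriate" error.

```python
from typing import Callable, Dict, Tuple


class UpstreamError(Exception):
    """Hypothetical error raised when Dify answers with a non-2xx status."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(detail)
        self.status = status
        self.detail = detail


def summarize_with_error_mapping(call_dify: Callable[[], Dict]) -> Tuple[int, Dict]:
    """Run the Dify call and translate failures into (http_status, body)."""
    try:
        return 200, call_dify()
    except TimeoutError:
        # No answer within the timeout period: 504 so the client can retry.
        return 504, {"error": "dify_timeout", "retryable": True}
    except UpstreamError as exc:
        # Assumption: rate limits pass through as 429, other failures become 502.
        status = 429 if exc.status == 429 else 502
        return status, {"error": "dify_error", "detail": exc.detail}
```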
### Requirement: Structured Output Format
The AI summarization SHALL return structured data with conclusions and action items.
#### Scenario: Complete structured response
- **WHEN** transcript contains clear decisions and assignments
- **THEN** response SHALL include conclusions array and action_items array with content, owner, due_date fields
#### Scenario: Partial data extraction
- **WHEN** transcript lacks explicit owner or due_date for action items
- **THEN** those fields SHALL be empty strings allowing manual completion
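
A minimal sketch of how the structured output could be normalized so that missing fields become empty strings. `normalize_summary` is a hypothetical helper, and the input shape (a dict already parsed from Dify's JSON answer) is an assumption.

```python
from typing import Any, Dict, List


def normalize_summary(payload: Dict[str, Any]) -> Dict[str, List]:
    """Coerce a parsed Dify answer into conclusions and action_items.

    Missing owner or due_date values become empty strings so the user can
    fill them in later. The input shape is an assumption.
    """
    conclusions = [str(c) for c in payload.get("conclusions") or []]
    action_items = []
    for item in payload.get("action_items") or []:
        if isinstance(item, str):
            # Tolerate a bare string as an action item with content only.
            item = {"content": item}
        action_items.append({
            "content": str(item.get("content") or ""),
            "owner": str(item.get("owner") or ""),
            "due_date": str(item.get("due_date") or ""),
        })
    return {"conclusions": conclusions, "action_items": action_items}
```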
### Requirement: Dify Prompt Configuration
The Dify workflow SHALL be configured with appropriate system prompt for meeting summarization.
#### Scenario: System prompt behavior
- **WHEN** transcript is sent to Dify
- **THEN** Dify SHALL use configured prompt to extract conclusions and action_items in JSON format
### Requirement: Manual Data Completion
The Electron client SHALL allow users to manually complete missing AI-extracted data.
#### Scenario: Fill missing owner
- **WHEN** AI returns action item without owner
- **THEN** user SHALL be able to select or type owner name in the UI
#### Scenario: Fill missing due date
- **WHEN** AI returns action item without due_date
- **THEN** user SHALL be able to select date using date picker


@@ -0,0 +1,42 @@
## ADDED Requirements
### Requirement: Login Proxy
The middleware server SHALL proxy login requests to the company Auth API at https://pj-auth-api.vercel.app/api/auth/login.
#### Scenario: Successful login
- **WHEN** user submits valid credentials to POST /api/login
- **THEN** the server SHALL forward to Auth API and return the JWT token
#### Scenario: Admin role detection
- **WHEN** user logs in with email ymirliu@panjit.com.tw
- **THEN** the response JWT payload SHALL include role: "admin"
#### Scenario: Invalid credentials
- **WHEN** user submits invalid credentials
- **THEN** the server SHALL return HTTP 401 with error message from Auth API
### Requirement: Token Validation
The middleware server SHALL validate JWT tokens on protected endpoints.
#### Scenario: Valid token access
- **WHEN** request includes valid JWT in Authorization header
- **THEN** the request SHALL proceed to the endpoint handler
#### Scenario: Expired token
- **WHEN** request includes expired JWT
- **THEN** the server SHALL return HTTP 401 with "token_expired" error code
#### Scenario: Missing token
- **WHEN** request to protected endpoint lacks Authorization header
- **THEN** the server SHALL return HTTP 401 with "token_required" error code
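
The three token scenarios above reduce to a small decision function, sketched here. `auth_error_code` is a hypothetical name, and `verify` is a placeholder for real JWT signature verification (for example, via a JWT library); this sketch only inspects the `exp` claim and assumes a `Bearer` scheme in the Authorization header.

```python
import time
from typing import Callable, Dict, Optional


def auth_error_code(
    authorization: Optional[str],
    verify: Callable[[str], Dict],
    now: Optional[float] = None,
) -> Optional[str]:
    """Return None when the request may proceed, else the 401 error code.

    `verify` is a placeholder for real signature verification and must
    return the token's claims; this sketch only checks `exp`.
    """
    if not authorization or not authorization.startswith("Bearer "):
        return "token_required"
    claims = verify(authorization[len("Bearer "):])
    current = time.time() if now is None else now
    # Assumption: a token without an exp claim is treated as expired.
    if claims.get("exp", 0) <= current:
        return "token_expired"
    return None
```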
### Requirement: Token Auto-Refresh
The Electron client SHALL implement automatic token refresh before expiration.
#### Scenario: Proactive refresh
- **WHEN** token approaches expiration (within 5 minutes) during active session
- **THEN** the client SHALL request new token transparently without user interruption
#### Scenario: Refresh during long meeting
- **WHEN** user is in a meeting session lasting longer than token validity
- **THEN** the client SHALL maintain authentication through automatic refresh


@@ -0,0 +1,45 @@
## ADDED Requirements
### Requirement: Excel Report Generation
The middleware server SHALL generate Excel reports from meeting data using templates.
#### Scenario: Successful export
- **WHEN** user requests GET /api/meetings/:id/export
- **THEN** server SHALL generate Excel file and return as downloadable stream
#### Scenario: Export non-existent meeting
- **WHEN** user requests export for non-existent meeting ID
- **THEN** server SHALL return HTTP 404
### Requirement: Template-based Generation
The Excel export SHALL use openpyxl with template files.
#### Scenario: Placeholder replacement
- **WHEN** Excel is generated
- **THEN** placeholders ({{subject}}, {{time}}, {{chair}}, etc.) SHALL be replaced with actual meeting data
#### Scenario: Dynamic row insertion
- **WHEN** meeting has multiple conclusions or action items
- **THEN** rows SHALL be dynamically inserted to accommodate all items
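
Placeholder replacement can be illustrated on plain strings, independent of the spreadsheet library. In this sketch `fill_placeholders` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption, since the spec does not say how they should be handled.

```python
import re
from typing import Mapping

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(cell_text: str, data: Mapping[str, object]) -> str:
    """Replace {{name}} markers with values from data.

    Unknown placeholders are left as-is so template mistakes stay visible
    (an assumption; the spec does not define this case).
    """
    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        return str(data[key]) if key in data else match.group(0)

    return _PLACEHOLDER.sub(substitute, cell_text)
```

With openpyxl, a function like this would presumably be applied to every string-valued cell while iterating a worksheet with `iter_rows()`, and `insert_rows()` could make room for additional conclusions and action items.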
### Requirement: Complete Data Inclusion
The exported Excel SHALL include all meeting metadata and AI-generated content.
#### Scenario: Full metadata export
- **WHEN** Excel is generated
- **THEN** it SHALL include subject, meeting_time, location, chairperson, recorder, and attendees
#### Scenario: Conclusions export
- **WHEN** Excel is generated
- **THEN** all conclusions SHALL be listed with their system codes
#### Scenario: Action items export
- **WHEN** Excel is generated
- **THEN** all action items SHALL be listed with content, owner, due_date, status, and system code
### Requirement: Template Management
Admin users SHALL be able to manage Excel templates.
#### Scenario: Admin template access
- **WHEN** admin user accesses template management
- **THEN** they SHALL be able to upload, view, and update Excel templates


@@ -0,0 +1,71 @@
## ADDED Requirements
### Requirement: Create Meeting
The system SHALL allow users to create meetings with required metadata.
#### Scenario: Create meeting with all fields
- **WHEN** user submits POST /api/meetings with subject, meeting_time, chairperson, location, recorder, attendees
- **THEN** a new meeting record SHALL be created with auto-generated UUID and the meeting data SHALL be returned
#### Scenario: Create meeting with missing required fields
- **WHEN** user submits POST /api/meetings without subject or meeting_time
- **THEN** the server SHALL return HTTP 400 with validation error details
#### Scenario: Recorder defaults to current user
- **WHEN** user creates meeting without specifying recorder
- **THEN** the recorder field SHALL default to the logged-in user's email
### Requirement: List Meetings
The system SHALL allow users to retrieve a list of meetings.
#### Scenario: List all meetings for admin
- **WHEN** admin user requests GET /api/meetings
- **THEN** all meetings SHALL be returned
#### Scenario: List meetings for regular user
- **WHEN** regular user requests GET /api/meetings
- **THEN** only meetings where user is creator, recorder, or attendee SHALL be returned
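
The listing rule above can be expressed as a filter. This is a sketch under assumptions: `visible_meetings` is a hypothetical helper, `attendees` is assumed to be a comma-separated string of emails, and a real implementation would more likely push the condition into the SQL query.

```python
from typing import Dict, Iterable, List


def visible_meetings(meetings: Iterable[Dict], user_email: str, is_admin: bool) -> List[Dict]:
    """Admins see everything; others see meetings they created, record, or attend.

    Assumption: attendees is stored as a comma-separated string of emails.
    """
    if is_admin:
        return list(meetings)
    me = user_email.strip().lower()
    if not me:
        return []

    def involves_me(meeting: Dict) -> bool:
        attendees = {a.strip().lower() for a in (meeting.get("attendees") or "").split(",")}
        return (
            (meeting.get("created_by") or "").lower() == me
            or (meeting.get("recorder") or "").lower() == me
            or me in attendees
        )

    return [m for m in meetings if involves_me(m)]
```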
### Requirement: Get Meeting Details
The system SHALL allow users to retrieve full meeting details including conclusions and action items.
#### Scenario: Get meeting with related data
- **WHEN** user requests GET /api/meetings/:id
- **THEN** meeting record with all conclusions and action_items SHALL be returned
#### Scenario: Get non-existent meeting
- **WHEN** user requests GET /api/meetings/:id for non-existent ID
- **THEN** the server SHALL return HTTP 404
### Requirement: Update Meeting
The system SHALL allow users to update meeting data, conclusions, and action items.
#### Scenario: Update meeting metadata
- **WHEN** user submits PUT /api/meetings/:id with updated fields
- **THEN** the meeting record SHALL be updated and new data returned
#### Scenario: Update action item status
- **WHEN** user updates action item status to "Done"
- **THEN** the action_items record SHALL reflect the new status
### Requirement: Delete Meeting
The system SHALL allow authorized users to delete meetings.
#### Scenario: Admin deletes any meeting
- **WHEN** admin user requests DELETE /api/meetings/:id
- **THEN** the meeting and all related conclusions and action_items SHALL be deleted
#### Scenario: User deletes own meeting
- **WHEN** user requests DELETE /api/meetings/:id for meeting they created
- **THEN** the meeting and all related data SHALL be deleted
### Requirement: System Code Generation
The system SHALL auto-generate unique system codes for conclusions and action items.
#### Scenario: Generate conclusion code
- **WHEN** a conclusion is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format C-20251210-XX where XX is sequence number
#### Scenario: Generate action item code
- **WHEN** an action item is created for a meeting on 2025-12-10
- **THEN** the system_code SHALL follow format A-20251210-XX where XX is sequence number


@@ -0,0 +1,41 @@
## ADDED Requirements
### Requirement: FastAPI Server Configuration
The middleware server SHALL be implemented using Python FastAPI framework with environment-based configuration.
#### Scenario: Server startup with valid configuration
- **WHEN** the server starts with valid .env file containing DB_HOST, DB_PORT, DB_USER, DB_PASS, DB_NAME, DIFY_API_URL, DIFY_API_KEY
- **THEN** the server SHALL start successfully and accept connections
#### Scenario: Server startup with missing configuration
- **WHEN** the server starts with missing required environment variables
- **THEN** the server SHALL fail to start with descriptive error message
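
The fail-fast startup behaviour could be sketched as follows. The variable names come from the scenario above; `load_settings` and `REQUIRED_VARS` are hypothetical names, and treating an empty value as missing is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASS", "DB_NAME",
    "DIFY_API_URL", "DIFY_API_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings or raise naming every missing variable."""
    source = os.environ if env is None else env
    # Assumption: an empty value counts as missing.
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: source[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` before the application object is created would make the process exit with the descriptive message instead of failing on the first request.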
### Requirement: Database Connection Pool
The middleware server SHALL maintain a connection pool to the MySQL database at mysql.theaken.com:33306.
#### Scenario: Database connection success
- **WHEN** the server connects to MySQL with valid credentials
- **THEN** a connection pool SHALL be established and queries SHALL execute successfully
#### Scenario: Database connection failure
- **WHEN** the database is unreachable
- **THEN** the server SHALL return HTTP 503 with error details for affected endpoints
### Requirement: Table Initialization
The middleware server SHALL ensure all required tables exist on startup with the `meeting_` prefix.
#### Scenario: Tables created on first run
- **WHEN** the server starts and tables do not exist
- **THEN** the server SHALL create meeting_users, meeting_records, meeting_conclusions, and meeting_action_items tables
#### Scenario: Tables already exist
- **WHEN** the server starts and tables already exist
- **THEN** the server SHALL skip table creation and continue normally
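
Idempotent table initialization is typically done with `CREATE TABLE IF NOT EXISTS`. The sketch below uses an in-memory SQLite database purely as a runnable stand-in for MySQL; the table and column names follow the design document, while the column types and the `init_tables` helper are assumptions.

```python
import sqlite3
from typing import List

# Column types are assumptions; only table and column names come from the design.
DDL_STATEMENTS = [
    """CREATE TABLE IF NOT EXISTS meeting_users (
        user_id INTEGER PRIMARY KEY, email TEXT, display_name TEXT,
        role TEXT, created_at TEXT)""",
    """CREATE TABLE IF NOT EXISTS meeting_records (
        meeting_id INTEGER PRIMARY KEY, uuid TEXT, subject TEXT,
        meeting_time TEXT, location TEXT, chairperson TEXT, recorder TEXT,
        attendees TEXT, transcript_blob TEXT, created_by TEXT, created_at TEXT)""",
    """CREATE TABLE IF NOT EXISTS meeting_conclusions (
        conclusion_id INTEGER PRIMARY KEY, meeting_id INTEGER,
        content TEXT, system_code TEXT)""",
    """CREATE TABLE IF NOT EXISTS meeting_action_items (
        action_id INTEGER PRIMARY KEY, meeting_id INTEGER, content TEXT,
        owner TEXT, due_date TEXT, status TEXT, system_code TEXT)""",
]


def init_tables(conn: sqlite3.Connection) -> List[str]:
    """Create any missing tables and return the meeting_ tables present."""
    for statement in DDL_STATEMENTS:
        conn.execute(statement)  # IF NOT EXISTS makes a second run a no-op
    conn.commit()
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE 'meeting_%' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```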
### Requirement: CORS Configuration
The middleware server SHALL allow cross-origin requests from the Electron client.
#### Scenario: CORS preflight request
- **WHEN** Electron client sends OPTIONS request
- **THEN** the server SHALL respond with appropriate CORS headers allowing the request


@@ -0,0 +1,41 @@
## ADDED Requirements
### Requirement: Edge Speech-to-Text
The Electron client SHALL perform speech-to-text conversion locally using faster-whisper int8 model.
#### Scenario: Successful transcription
- **WHEN** user records audio during a meeting
- **THEN** the audio SHALL be transcribed locally without network dependency
#### Scenario: Transcription on target hardware
- **WHEN** running on i5 processor with 8GB RAM
- **THEN** transcription SHALL complete within acceptable latency for real-time display
### Requirement: Traditional Chinese Output
The transcription engine SHALL output Traditional Chinese (繁體中文) text.
#### Scenario: Simplified to Traditional conversion
- **WHEN** whisper outputs Simplified Chinese characters
- **THEN** OpenCC SHALL convert output to Traditional Chinese
#### Scenario: Native Traditional Chinese
- **WHEN** whisper outputs Traditional Chinese directly
- **THEN** the text SHALL pass through unchanged
### Requirement: Real-time Display
The Electron client SHALL display transcription results in real-time.
#### Scenario: Streaming transcription
- **WHEN** user is recording
- **THEN** transcribed text SHALL appear in the left panel within seconds of speech
### Requirement: Python Sidecar
The transcription engine SHALL be packaged as a Python sidecar using PyInstaller.
#### Scenario: Sidecar startup
- **WHEN** Electron app launches
- **THEN** the Python sidecar containing faster-whisper and OpenCC SHALL be available
#### Scenario: Sidecar communication
- **WHEN** Electron sends audio data to sidecar
- **THEN** transcribed text SHALL be returned via IPC
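
The spec does not say which IPC mechanism is used. One common arrangement is newline-delimited JSON over the sidecar's stdin and stdout, and the sketch below shows the sidecar side of that under this assumption. The message shape (`id`, `audio_path`) and `handle_message` are hypothetical, and `transcribe` stands in for the faster-whisper and OpenCC pipeline.

```python
import json
from typing import Callable


def handle_message(line: str, transcribe: Callable[[str], str]) -> str:
    """Turn one JSON request line into one JSON response line.

    Assumed message shape: {"id": ..., "audio_path": "..."}; `transcribe`
    stands in for the faster-whisper + OpenCC pipeline.
    """
    try:
        request = json.loads(line)
        text = transcribe(request["audio_path"])
        reply = {"id": request.get("id"), "ok": True, "text": text}
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, a missing field, or a non-object message.
        reply = {"id": None, "ok": False, "error": f"{type(exc).__name__}: {exc}"}
    # ensure_ascii=False keeps Chinese text readable on the wire.
    return json.dumps(reply, ensure_ascii=False)
```

A sidecar main loop would read lines from `sys.stdin`, pass each to `handle_message`, and write the result followed by a newline to `sys.stdout`, flushing after every reply.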


@@ -0,0 +1,67 @@
## 1. Middleware Server Foundation
- [x] 1.1 Initialize Python project with FastAPI, uvicorn, python-dotenv
- [x] 1.2 Create .env.example with all required environment variables
- [x] 1.3 Implement database connection pool with mysql-connector-python
- [x] 1.4 Create table initialization script (meeting_users, meeting_records, meeting_conclusions, meeting_action_items)
- [x] 1.5 Configure CORS middleware for Electron client
- [x] 1.6 Add health check endpoint GET /api/health
## 2. Authentication
- [x] 2.1 Implement POST /api/login proxy to PJ-Auth API
- [x] 2.2 Add admin role detection for ymirliu@panjit.com.tw
- [x] 2.3 Create JWT validation middleware for protected routes
- [x] 2.4 Handle token expiration with appropriate error codes
## 3. Meeting CRUD
- [x] 3.1 Implement POST /api/meetings (create meeting)
- [x] 3.2 Implement GET /api/meetings (list meetings with user filtering)
- [x] 3.3 Implement GET /api/meetings/:id (get meeting with conclusions and action items)
- [x] 3.4 Implement PUT /api/meetings/:id (update meeting)
- [x] 3.5 Implement DELETE /api/meetings/:id (delete meeting and cascade to its conclusions and action items)
- [x] 3.6 Implement system code generation (C-YYYYMMDD-XX, A-YYYYMMDD-XX)
## 4. AI Summarization
- [x] 4.1 Implement POST /api/ai/summarize endpoint
- [x] 4.2 Configure Dify API client with timeout and retry
- [x] 4.3 Parse Dify response into conclusions and action_items structure
- [x] 4.4 Handle partial data (empty owner/due_date)
## 5. Excel Export
- [x] 5.1 Create Excel template with placeholders
- [x] 5.2 Implement GET /api/meetings/:id/export endpoint
- [x] 5.3 Implement placeholder replacement ({{subject}}, {{time}}, etc.)
- [x] 5.4 Implement dynamic row insertion for conclusions and action items
## 6. Electron Client - Core
- [x] 6.1 Initialize Electron project with electron-builder
- [x] 6.2 Create main window and basic navigation
- [x] 6.3 Implement login page with auth API integration
- [x] 6.4 Implement token storage and auto-refresh interceptor
## 7. Electron Client - Meeting UI
- [x] 7.1 Create meeting list page
- [x] 7.2 Create meeting creation form (metadata fields)
- [x] 7.3 Create dual-panel meeting view (transcript left, notes right)
- [x] 7.4 Implement conclusion/action item editing with manual completion UI
- [x] 7.5 Add export button with download handling
## 8. Transcription Engine
- [x] 8.1 Create Python sidecar project with faster-whisper and OpenCC
- [x] 8.2 Implement audio input capture
- [x] 8.3 Implement transcription with int8 model
- [x] 8.4 Implement OpenCC Traditional Chinese conversion
- [x] 8.5 Set up IPC communication between Electron and sidecar
- [x] 8.6 Package sidecar with PyInstaller
## 9. Testing
- [x] 9.1 Unit tests: DB connection, table creation
- [x] 9.2 Unit tests: Dify proxy with mock responses
- [x] 9.3 Unit tests: Admin role detection
- [x] 9.4 Integration test: Auth flow with token refresh
- [x] 9.5 Integration test: Full meeting cycle (create → transcribe → summarize → save → export)
## 10. Deployment Preparation
- [x] 10.1 Create requirements.txt with all dependencies
- [x] 10.2 Create deployment documentation
- [x] 10.3 Configure electron-builder for portable target
- [x] 10.4 Verify faster-whisper performance on i5/8GB hardware