# Change: Add Dify Audio Transcription for Uploaded Files

## Why

Users need to transcribe pre-recorded audio files (e.g., meeting recordings from external sources). Currently, transcription only works with real-time recording via the local sidecar. Adding Dify-based transcription for uploaded files provides flexibility while keeping real-time transcription local for low latency.

## What Changes

- Add audio file upload UI in Electron client (meeting detail page)
- Add `segment_audio` command to sidecar for VAD-based audio chunking
- Add backend API endpoint to receive audio files, chunk via sidecar, and forward to Dify STT service
- Each chunk (~5 minutes max) is sent to Dify separately, and the results are concatenated
- Transcription result replaces the transcript field (same as real-time transcription)
- Support common audio formats: MP3, WAV, M4A, WebM, OGG

## Impact

- Affected specs: `transcription`
- Affected code:
  - `sidecar/transcriber.py` - Add `segment_audio` action for VAD chunking
  - `client/src/pages/meeting-detail.html` - Add upload button and progress UI
  - `backend/app/routers/ai.py` - Add `/api/ai/transcribe-audio` endpoint
  - `backend/app/config.py` - Add Dify STT API key configuration

## Technical Notes

- Dify STT API Key: `app-xQeSipaQecs0cuKeLvYDaRsu`
- Real-time transcription continues to use local sidecar (no change)
- File upload transcription uses Dify cloud service with VAD chunking
- VAD chunking ensures each chunk < 25MB (Dify API limit)
- Max file size: 500MB (chunked processing handles large files)
- Both methods output to the same `transcript_blob` field
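
The chunk-and-concatenate flow listed under What Changes can be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: `plan_chunks`, `transcribe_chunks`, and the `transcribe_one` callback are hypothetical names, and the sketch assumes the sidecar's VAD step yields speech regions as `(start_sec, end_sec)` pairs. The actual `segment_audio` output format and the Dify request code are not specified in this proposal.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# ~5 minutes per chunk, as stated in the proposal.
MAX_CHUNK_SECONDS = 5 * 60

# Assumed VAD output shape: (start_sec, end_sec) of one speech region.
Segment = Tuple[float, float]


def plan_chunks(
    segments: Iterable[Segment],
    max_seconds: float = MAX_CHUNK_SECONDS,
) -> List[Segment]:
    """Merge consecutive speech regions into chunks of at most max_seconds.

    Chunk boundaries fall in the silence between regions where possible.
    A single region longer than max_seconds is split at fixed intervals.
    """
    chunks: List[Segment] = []
    cur_start: Optional[float] = None
    cur_end: Optional[float] = None

    for start, end in segments:
        # Hard-split a region that exceeds the cap on its own.
        while end - start > max_seconds:
            if cur_start is not None:
                chunks.append((cur_start, cur_end))
                cur_start = cur_end = None
            chunks.append((start, start + max_seconds))
            start += max_seconds

        if cur_start is None:
            cur_start, cur_end = start, end
        elif end - cur_start <= max_seconds:
            # Still within the cap: extend the current chunk over the gap.
            cur_end = end
        else:
            # Adding this region would exceed the cap: close the chunk.
            chunks.append((cur_start, cur_end))
            cur_start, cur_end = start, end

    if cur_start is not None:
        chunks.append((cur_start, cur_end))
    return chunks


def transcribe_chunks(
    chunks: Iterable[Segment],
    transcribe_one: Callable[[Segment], str],
) -> str:
    """Transcribe chunks in order and join the non-empty results.

    transcribe_one is a hypothetical callback standing in for the
    per-chunk request to the Dify STT service.
    """
    texts = [transcribe_one(chunk).strip() for chunk in chunks]
    return " ".join(text for text in texts if text)
```

Capping by duration does not by itself guarantee the 25MB limit; that depends on the encoding of each chunk. As a rough check under an assumed format of 16 kHz mono 16-bit PCM, a 5-minute chunk is 300 × 16000 × 2 bytes, or about 9.6 MB, which leaves headroom under the limit.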