Add Realtime Messaging
Summary
Implement a WebSocket-based real-time messaging system for incident rooms, enabling instant communication between team members during production incidents. Messages will be persisted to the database for audit trail and report generation.
Requirements
Requirement: WebSocket Connection Management
The system SHALL provide WebSocket endpoints for establishing persistent bidirectional connections between clients and server, with automatic reconnection handling and connection state management.
Scenario: Establish WebSocket connection
- WHEN an authenticated user connects to ws://localhost:8000/ws/{room_id}
- THEN the system SHALL validate user's room membership
- AND establish a WebSocket connection
- AND add the connection to the room's active connections pool
- AND broadcast a "user joined" event to other room members
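The connection steps above can be sketched as a small in-memory pool. This is an illustration only, not the project's implementation: `RoomConnectionPool`, `JsonSocket`, and the `is_member` flag are hypothetical names, and the only thing assumed of the socket is a `send_json` coroutine, which FastAPI's `WebSocket` provides.

```python
import asyncio
from collections import defaultdict
from typing import Any, Protocol


class JsonSocket(Protocol):
    """Hypothetical minimal socket interface (FastAPI's WebSocket has send_json)."""

    async def send_json(self, data: Any) -> None: ...


class RoomConnectionPool:
    """Tracks active connections per room (in memory, single process)."""

    def __init__(self) -> None:
        self._rooms: dict[str, dict[str, JsonSocket]] = defaultdict(dict)

    async def join(self, room_id: str, user_id: str,
                   socket: JsonSocket, is_member: bool) -> bool:
        # Assumption: membership was looked up by the caller and passed in.
        if not is_member:
            return False
        event = {"type": "system", "event": "user_joined", "user_id": user_id}
        # Notify members already connected, then register the newcomer.
        await self.broadcast(room_id, event)
        self._rooms[room_id][user_id] = socket
        return True

    async def broadcast(self, room_id: str, payload: dict) -> None:
        sockets = list(self._rooms[room_id].values())
        if sockets:
            await asyncio.gather(*(s.send_json(payload) for s in sockets))

    def leave(self, room_id: str, user_id: str) -> None:
        self._rooms[room_id].pop(user_id, None)
```

Broadcasting before registering the new socket keeps the "user joined" event from being echoed back to the user who just joined.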
Scenario: Handle connection authentication
- WHEN a WebSocket connection request is made without valid authentication token
- THEN the system SHALL reject the connection with status 401
- AND close the WebSocket immediately
Scenario: Automatic reconnection
- WHEN a WebSocket connection is dropped unexpectedly
- THEN the client SHALL attempt to reconnect automatically
- AND resume from the last received message sequence number
- AND request any missed messages during disconnection
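One way to picture the reconnection flow is a capped exponential backoff on the client and a replay of everything after the last sequence number on the server. The sketch below is hypothetical: the backoff schedule (1 s base, 30 s cap) is not specified by this document, and `StoredMessage`, `backoff_delays`, and `missed_since` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredMessage:
    sequence_number: int
    content: str


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delay in seconds before each reconnect attempt (assumed schedule)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def missed_since(history: list[StoredMessage], last_seen: int) -> list[StoredMessage]:
    """Messages the client has not seen yet, oldest first."""
    newer = (m for m in history if m.sequence_number > last_seen)
    return sorted(newer, key=lambda m: m.sequence_number)
```

The client would send its last received sequence number after reconnecting, and the server would answer with the result of `missed_since`.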
Requirement: Real-time Message Broadcasting
The system SHALL broadcast messages to all active room members in real-time, ensuring message ordering and delivery acknowledgment.
Scenario: Send text message
- WHEN a room member sends a message via WebSocket:
  { "type": "message", "content": "Equipment temperature rising to 85°C", "message_type": "text" }
- THEN the system SHALL:
- Validate user has write permission (OWNER or EDITOR role)
- Assign a unique message_id and timestamp
- Store the message in database
- Broadcast to all active WebSocket connections in the room
- Return acknowledgment to sender with message_id
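The five steps above can be sketched as a single handler. This is a minimal illustration under stated assumptions: `Room`, `handle_message`, and the shape of the `ack` and `error` replies are hypothetical, and plain lists stand in for the database and the broadcast channel.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

WRITE_ROLES = {"OWNER", "EDITOR"}


@dataclass
class Room:
    store: list[dict] = field(default_factory=list)   # stand-in for the database
    outbox: list[dict] = field(default_factory=list)  # stand-in for the broadcast


def handle_message(room: Room, sender_id: str, role: str, payload: dict) -> dict:
    # 1. Validate write permission.
    if role not in WRITE_ROLES:
        return {"type": "error", "error": "Insufficient permissions"}
    # 2. Assign a unique message_id and a UTC timestamp.
    record = {
        "message_id": str(uuid.uuid4()),
        "sender_id": sender_id,
        "content": payload["content"],
        "message_type": payload.get("message_type", "text"),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    # 3. Store, 4. broadcast, 5. acknowledge (reply shape is an assumption).
    room.store.append(record)
    room.outbox.append({"type": "message", **record})
    return {"type": "ack", "message_id": record["message_id"]}
```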
Scenario: Send system notification
- WHEN a system event occurs (user joined, room status changed, etc.)
- THEN the system SHALL broadcast a system message:
  { "type": "system", "event": "user_joined", "user_id": "john.doe@panjit.com.tw", "timestamp": "2025-11-17T10:00:00Z" }
- AND all connected clients SHALL display the notification
Scenario: Handle message ordering
- WHEN multiple messages are sent simultaneously
- THEN the system SHALL ensure FIFO ordering using message sequence numbers
- AND clients SHALL display messages in the correct order
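On the client side, sequence-based ordering can be pictured as a small reorder buffer that releases messages only when the next expected number has arrived. The sketch assumes gapless sequence numbers within a room, which this document does not guarantee (a table-wide `BIGSERIAL` would leave gaps per room); `OrderedInbox` is a hypothetical name.

```python
class OrderedInbox:
    """Releases messages in sequence order, buffering early arrivals."""

    def __init__(self, next_seq: int = 1) -> None:
        self.next_seq = next_seq
        self._pending: dict[int, str] = {}

    def receive(self, seq: int, content: str) -> list[str]:
        if seq < self.next_seq:
            return []  # duplicate of a message that was already displayed
        self._pending[seq] = content
        ready: list[str] = []
        # Drain every consecutive message starting at the expected number.
        while self.next_seq in self._pending:
            ready.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

If sequence numbers are not gapless per room, sorting the displayed list by sequence number is the simpler alternative.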
Requirement: Message Persistence and History
The system SHALL persist all messages to database for audit trail, report generation, and history retrieval.
Scenario: Store message in database
- WHEN a message is sent through WebSocket
- THEN the system SHALL create a database record with:
- message_id (UUID)
- room_id (FK to incident_rooms)
- sender_id (user email)
- content (text or JSON for structured messages)
- message_type (text, image_ref, file_ref, incident_data, system)
- created_at timestamp
- edited_at (nullable for message edits)
- deleted_at (nullable for soft delete)
Scenario: Retrieve message history
- WHEN a user joins a room or reconnects
- THEN the system SHALL load recent messages via GET /api/rooms/{room_id}/messages?limit=50&before={timestamp}
- AND return messages in reverse chronological order
- AND include pagination metadata for loading more history
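The `limit` and `before` parameters describe cursor-style pagination. A minimal in-memory sketch of that behavior follows; the response keys `has_more` and `next_before` are assumptions, since the document does not define the pagination metadata.

```python
from datetime import datetime
from typing import Optional


def page_messages(messages: list[dict], limit: int = 50,
                  before: Optional[datetime] = None) -> dict:
    """Return up to `limit` messages older than `before`, newest first."""
    older = [m for m in messages if before is None or m["created_at"] < before]
    older.sort(key=lambda m: m["created_at"], reverse=True)
    page = older[:limit]
    has_more = len(older) > limit
    return {
        "messages": page,
        "has_more": has_more,
        # Cursor for the next request: timestamp of the oldest message returned.
        "next_before": page[-1]["created_at"] if has_more else None,
    }
```

A client loading more history would pass `next_before` as the `before` value of its next request.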
Scenario: Search messages
- WHEN a user searches for messages containing specific keywords
- THEN the system SHALL query the database with full-text search
- AND return matching messages with highlighted search terms
- AND enforce the user's access control (only rooms they are a member of)
Requirement: Message Types and Formatting
The system SHALL support various message types including text, image references, file references, and structured data for production incidents.
Scenario: Text message with mentions
- WHEN a user sends a message with @mentions
  { "content": "@maintenance_team Please check Line 3 immediately", "mentions": ["maintenance_team@panjit.com.tw"] }
- THEN the system SHALL parse and store mentions
- AND MAY trigger notifications to mentioned users
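Server-side mention parsing could look like the sketch below. It assumes a hypothetical `directory` mapping from handle to address; how handles are actually resolved (for example through AD) is not stated in this document.

```python
import re

# "@handle" that is not part of an e-mail address such as john.doe@panjit.com.tw
MENTION_RE = re.compile(r"(?<![\w.])@([A-Za-z0-9_.-]+)")


def extract_mentions(content: str, directory: dict[str, str]) -> list[str]:
    """Resolve @handles in `content` to addresses, keeping first-seen order."""
    resolved: list[str] = []
    for handle in MENTION_RE.findall(content):
        # Drop sentence punctuation captured at the end of a handle.
        address = directory.get(handle.rstrip("."))
        if address and address not in resolved:
            resolved.append(address)
    return resolved
```

Unknown handles are ignored rather than rejected, so a typo in a mention does not block the message.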
Scenario: Image reference message
- WHEN a user uploads an image and sends reference
  {
    "type": "message",
    "message_type": "image_ref",
    "content": "Defect found on product",
    "file_id": "550e8400-e29b-41d4-a716-446655440000",
    "file_url": "http://localhost:9000/bucket/room-123/image.jpg"
  }
- THEN the system SHALL store the file reference
- AND clients SHALL display image preview inline
Scenario: Structured incident data
- WHEN reporting specific incident metrics
  {
    "type": "message",
    "message_type": "incident_data",
    "content": {
      "temperature": 85,
      "pressure": 120,
      "production_rate": 450,
      "timestamp": "2025-11-17T10:15:00Z"
    }
  }
- THEN the system SHALL store structured data as JSON
- AND enable querying/filtering by specific fields later
Requirement: Connection State Management
The system SHALL track online presence and typing indicators for better collaboration experience.
Scenario: Track online users
- WHEN users connect/disconnect from a room
- THEN the system SHALL maintain a list of online users
- AND broadcast presence updates to all room members
- AND display online status indicators in UI
Scenario: Typing indicators
- WHEN a user starts typing a message
- THEN the client SHALL send a "typing" event via WebSocket
- AND the system SHALL broadcast to other room members
- AND automatically clear typing status after 3 seconds of inactivity
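The 3-second expiry can be handled without timers by recording the last typing event per user and filtering on read. A minimal sketch, with `TypingTracker` as a hypothetical name and an injectable clock so the behavior can be tested:

```python
import time
from typing import Callable


class TypingTracker:
    """Remembers who is typing; entries lapse after `ttl` seconds of silence."""

    def __init__(self, ttl: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._last: dict[str, float] = {}

    def touch(self, user_id: str) -> None:
        """Record a "typing" event from the client."""
        self._last[user_id] = self._clock()

    def active(self) -> list[str]:
        """Users whose last typing event is younger than the TTL."""
        now = self._clock()
        self._last = {u: t for u, t in self._last.items() if now - t < self._ttl}
        return sorted(self._last)
```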
Scenario: Connection health monitoring
- WHEN a WebSocket connection is established
- THEN the system SHALL send ping frames every 30 seconds
- AND expect pong responses within 10 seconds
- AND terminate connection if no response received
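The ping/pong rule can be sketched as a loop that gives up when one ping goes unanswered. The `ping` argument is a hypothetical coroutine that sends a ping and completes when the matching pong arrives; in practice the ASGI server may already handle protocol-level ping frames, so this is an illustration of the timing logic only.

```python
import asyncio
from typing import Awaitable, Callable


async def heartbeat(ping: Callable[[], Awaitable[None]],
                    interval: float = 30.0, timeout: float = 10.0) -> None:
    """Return once a ping is not answered within `timeout` seconds.

    The caller is expected to close the connection when this returns.
    """
    while True:
        await asyncio.sleep(interval)
        try:
            await asyncio.wait_for(ping(), timeout)
        except asyncio.TimeoutError:
            return
```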
Requirement: Message Operations
The system SHALL support message editing and deletion with proper audit trail and permissions.
Scenario: Edit own message
- WHEN a user edits their own message within 15 minutes
  { "type": "edit_message", "message_id": "msg-123", "content": "Updated: Equipment temperature stabilized at 75°C" }
- THEN the system SHALL update the message content
- AND set edited_at timestamp
- AND broadcast the edit to all connected clients
- AND preserve original message in audit log
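The edit rules (own message, 15-minute window, original kept for audit) fit in one function. This sketch uses plain dictionaries and a list as a stand-in for the audit log; the audit record layout and the refusal to edit deleted messages are assumptions.

```python
from datetime import datetime, timedelta

EDIT_WINDOW = timedelta(minutes=15)


def edit_message(message: dict, editor_id: str, new_content: str,
                 audit_log: list[dict], now: datetime) -> bool:
    """Apply an edit if permitted; returns False when the edit is refused."""
    if message["sender_id"] != editor_id:
        return False  # only the author may edit
    if message.get("deleted_at") is not None:
        return False  # assumption: deleted messages cannot be edited
    if now - message["created_at"] > EDIT_WINDOW:
        return False  # edit window has closed
    # Preserve the original content before overwriting it.
    audit_log.append({
        "message_id": message["message_id"],
        "previous_content": message["content"],
        "edited_at": now,
    })
    message["content"] = new_content
    message["edited_at"] = now
    return True
```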
Scenario: Delete message
- WHEN a user deletes their own message or admin deletes any message
- THEN the system SHALL perform soft delete (set deleted_at)
- AND broadcast deletion event to all clients
- AND clients SHALL show "message deleted" placeholder
- AND preserve message in database for audit
Scenario: React to message
- WHEN a user adds a reaction emoji to a message
  { "type": "add_reaction", "message_id": "msg-123", "emoji": "👍" }
- THEN the system SHALL store the reaction
- AND broadcast to all connected clients
- AND aggregate reaction counts for display
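Reaction storage and aggregation can be sketched with a set of `(message_id, user_id, emoji)` tuples, which gives the same one-reaction-per-user-per-emoji behavior as a uniqueness constraint. The function names are hypothetical.

```python
from collections import Counter

Reaction = tuple[str, str, str]  # (message_id, user_id, emoji)


def add_reaction(reactions: set[Reaction], message_id: str,
                 user_id: str, emoji: str) -> bool:
    """Store a reaction; returns False if the user already added this emoji."""
    key = (message_id, user_id, emoji)
    if key in reactions:
        return False
    reactions.add(key)
    return True


def reaction_counts(reactions: set[Reaction], message_id: str) -> dict[str, int]:
    """Count reactions per emoji for one message, for display."""
    return dict(Counter(e for m, _, e in reactions if m == message_id))
```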
Scenarios
Happy Path: Production Incident Communication
- Equipment failure detected on Line 3
- Operator creates incident room via REST API
- Opens WebSocket connection to room
- Sends initial message: "Conveyor belt stopped, investigating"
- Maintenance team members join via WebSocket
- Real-time messages exchanged with status updates
- Images uploaded and referenced in messages
- Temperature/pressure data shared as structured messages
- Issue resolved, final message sent
- Room status changed to "resolved"
- All messages available for report generation
Edge Case: Network Interruption Recovery
- User actively chatting in incident room
- Network connection drops for 2 minutes
- Client automatically attempts reconnection
- Connection re-established with same room_id
- Client requests messages since last sequence number
- Server sends missed messages
- UI updates seamlessly without user intervention
Error Case: Unauthorized Message Attempt
- User with VIEWER role connects to room WebSocket
- Attempts to send a message
- Server validates permission based on role
- Rejects message with error: "Insufficient permissions"
- WebSocket connection remains active (not terminated)
- User can still receive messages from others
Performance Case: High-traffic Incident
- Major production issue affects multiple lines
- 20+ team members join incident room
- Rapid message exchange (100+ messages/minute)
- System maintains sub-100ms message broadcast latency
- Database writes queued to prevent blocking
- All messages preserved in correct order
- No message loss despite high load
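Queued database writes can be pictured with an `asyncio.Queue` drained by a single writer task, so the broadcast path never waits on storage while insertion order is preserved. In this sketch a list stands in for the database, and `writer` and `persist_all` are hypothetical names.

```python
import asyncio


async def writer(queue: asyncio.Queue, sink: list) -> None:
    """Drain the queue forever, writing records in arrival order."""
    while True:
        record = await queue.get()
        try:
            sink.append(record)  # stand-in for the database INSERT
        finally:
            queue.task_done()


async def persist_all(records: list[dict]) -> list[dict]:
    """Enqueue records without blocking, then wait until all are written."""
    queue: asyncio.Queue = asyncio.Queue()
    sink: list[dict] = []
    task = asyncio.create_task(writer(queue, sink))
    for record in records:
        queue.put_nowait(record)  # the sender is not held up by storage
    await queue.join()
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return sink
```

A single consumer keeps writes in FIFO order; with several writer tasks, ordering would have to rely on the sequence number instead.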
Technical Considerations
WebSocket Implementation
- Use FastAPI's native WebSocket support
- Implement connection pooling per room
- Use Redis pub/sub for multi-server scaling (future)
- Graceful shutdown handling to notify clients
Database Schema
-- pg_trgm provides the gin_trgm_ops operator class used by idx_message_search
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE messages (
    message_id VARCHAR(36) PRIMARY KEY,
    room_id VARCHAR(36) NOT NULL REFERENCES incident_rooms(room_id),
    sender_id VARCHAR(255) NOT NULL,
    content TEXT NOT NULL,
    message_type VARCHAR(20) DEFAULT 'text',
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW(),
    edited_at TIMESTAMP,
    deleted_at TIMESTAMP,
    sequence_number BIGSERIAL
);

-- PostgreSQL has no inline INDEX clause in CREATE TABLE; indexes are created separately
CREATE INDEX idx_room_messages ON messages (room_id, created_at DESC);
-- Trigram index for keyword search on message content
CREATE INDEX idx_message_search ON messages USING gin (content gin_trgm_ops);

CREATE TABLE message_reactions (
    reaction_id SERIAL PRIMARY KEY,
    message_id VARCHAR(36) REFERENCES messages(message_id),
    user_id VARCHAR(255) NOT NULL,
    emoji VARCHAR(10) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    -- One reaction per user, per emoji, per message
    UNIQUE(message_id, user_id, emoji)
);
Security Considerations
- Validate JWT tokens on WebSocket connection
- Rate limiting per user (max 10 messages/second)
- Message size limit (10 KB for text, 100 KB for structured data)
- XSS prevention for message content
- SQL injection prevention using parameterized queries
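The rate and size limits above can be sketched as follows. This is one possible reading, not a fixed design: a sliding one-second window per user, and KB interpreted as 1024 bytes of UTF-8 encoded content. `RateLimiter` and `within_size_limit` are hypothetical names.

```python
import json
import time
from collections import defaultdict, deque
from typing import Callable

TEXT_LIMIT = 10 * 1024          # assumption: 1 KB = 1024 bytes
STRUCTURED_LIMIT = 100 * 1024


class RateLimiter:
    """Sliding-window limiter: at most `max_events` per `window` seconds per user."""

    def __init__(self, max_events: int = 10, window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_events
        self._window = window
        self._clock = clock
        self._events: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        events = self._events[user_id]
        # Forget events that have left the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._max:
            return False
        events.append(now)
        return True


def within_size_limit(payload: dict) -> bool:
    """Check content size: text against 10 KB, structured data against 100 KB."""
    content = payload.get("content")
    if isinstance(content, str):
        return len(content.encode("utf-8")) <= TEXT_LIMIT
    return len(json.dumps(content).encode("utf-8")) <= STRUCTURED_LIMIT
```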
Performance Requirements
- Message broadcast latency < 100ms for same server
- Support 100+ concurrent connections per room
- Message history query < 200ms for 1000 messages
- Automatic connection cleanup for dropped clients