feat: Initial commit - Task Reporter incident response system

Complete implementation of the production line incident response system (生產線異常即時反應系統) including:

Backend (FastAPI):
- User authentication with AD integration and session management
- Chat room management (create, list, update, members, roles)
- Real-time messaging via WebSocket (typing indicators, reactions)
- File storage with MinIO (upload, download, image preview)

Frontend (React + Vite):
- Authentication flow with token management
- Room list with filtering, search, and pagination
- Real-time chat interface with WebSocket
- File upload with drag-and-drop and image preview
- Member management and room settings
- Breadcrumb navigation
- 53 unit tests (Vitest)

Specifications:
- authentication: AD auth, sessions, JWT tokens
- chat-room: rooms, members, templates
- realtime-messaging: WebSocket, messages, reactions
- file-storage: MinIO integration, file management
- frontend-core: React SPA structure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-12-01 17:42:52 +08:00
commit c8966477b9
135 changed files with 23269 additions and 0 deletions


@@ -0,0 +1,283 @@
# Add Realtime Messaging
## Summary
Implement a WebSocket-based real-time messaging system for incident rooms, enabling instant communication between team members during production incidents. Messages will be persisted to the database for the audit trail and report generation.
## Requirements
### Requirement: WebSocket Connection Management
The system SHALL provide WebSocket endpoints for establishing persistent bidirectional connections between clients and server, with automatic reconnection handling and connection state management.
#### Scenario: Establish WebSocket connection
- **WHEN** an authenticated user connects to `ws://localhost:8000/ws/{room_id}`
- **THEN** the system SHALL validate user's room membership
- **AND** establish a WebSocket connection
- **AND** add the connection to the room's active connections pool
- **AND** broadcast a "user joined" event to other room members
#### Scenario: Handle connection authentication
- **WHEN** a WebSocket connection request is made without valid authentication token
- **THEN** the system SHALL reject the connection with status 401
- **AND** close the WebSocket immediately
#### Scenario: Automatic reconnection
- **WHEN** a WebSocket connection is dropped unexpectedly
- **THEN** the client SHALL attempt to reconnect automatically
- **AND** resume from the last received message sequence number
- **AND** request any missed messages during disconnection
### Requirement: Real-time Message Broadcasting
The system SHALL broadcast messages to all active room members in real-time, ensuring message ordering and delivery acknowledgment.
#### Scenario: Send text message
- **WHEN** a room member sends a message via WebSocket:
```json
{
"type": "message",
"content": "Equipment temperature rising to 85°C",
"message_type": "text"
}
```
- **THEN** the system SHALL:
  - Validate user has write permission (OWNER or EDITOR role)
  - Assign a unique message_id and timestamp
  - Store the message in database
  - Broadcast to all active WebSocket connections in the room
  - Return acknowledgment to sender with message_id
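The steps in this scenario could be sketched roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the project's implementation: `Room`, `handle_text_message`, the `store` and `outbox` lists, and the shape of the `ack` and `error` replies are all hypothetical names chosen for the example, and the database write and the WebSocket broadcast are stubbed out as list appends.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

WRITE_ROLES = {"OWNER", "EDITOR"}  # roles allowed to send, per the scenario


@dataclass
class Room:
    """Hypothetical in-memory stand-in for a room."""
    roles: dict[str, str]                              # user_id -> role
    store: list[dict] = field(default_factory=list)    # stands in for the database
    outbox: list[dict] = field(default_factory=list)   # stands in for the broadcast


def handle_text_message(room: Room, sender_id: str, payload: dict) -> dict:
    """Validate, stamp, store, broadcast, and return an acknowledgment."""
    if room.roles.get(sender_id) not in WRITE_ROLES:
        # The connection itself stays open; only this message is rejected.
        return {"type": "error", "detail": "Insufficient permissions"}

    message = {
        "message_id": str(uuid.uuid4()),
        "sender_id": sender_id,
        "content": payload["content"],
        "message_type": payload.get("message_type", "text"),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    room.store.append(message)    # a real system would insert a database row here
    room.outbox.append(message)   # ...and send to every active connection here
    return {"type": "ack", "message_id": message["message_id"]}
```

Returning an error payload instead of closing the socket matches the "Unauthorized Message Attempt" case, where a VIEWER keeps receiving messages after a rejected send.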
#### Scenario: Send system notification
- **WHEN** a system event occurs (user joined, room status changed, etc.)
- **THEN** the system SHALL broadcast a system message:
```json
{
"type": "system",
"event": "user_joined",
"user_id": "john.doe@panjit.com.tw",
"timestamp": "2025-11-17T10:00:00Z"
}
```
- **AND** all connected clients SHALL display the notification
#### Scenario: Handle message ordering
- **WHEN** multiple messages are sent simultaneously
- **THEN** the system SHALL ensure FIFO ordering using message sequence numbers
- **AND** clients SHALL display messages in the correct order
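On the client side, ordering by sequence number could look something like the sketch below. It assumes each broadcast carries a `sequence_number` field that increases with send order but may have gaps (for example if one counter is shared across rooms); `MessageList` and its methods are hypothetical names, not part of the specification.

```python
import bisect


class MessageList:
    """Keeps received messages sorted by sequence_number and drops duplicates."""

    def __init__(self) -> None:
        self._seqs: list[int] = []
        self.messages: list[dict] = []

    def insert(self, message: dict) -> bool:
        """Insert in order; return False if this sequence number was already seen."""
        seq = message["sequence_number"]
        i = bisect.bisect_left(self._seqs, seq)
        if i < len(self._seqs) and self._seqs[i] == seq:
            return False  # duplicate delivery, e.g. replayed after a reconnect
        self._seqs.insert(i, seq)
        self.messages.insert(i, message)
        return True

    @property
    def last_sequence(self) -> int:
        """Highest sequence number seen so far (0 if none)."""
        return self._seqs[-1] if self._seqs else 0
```

The `last_sequence` value is what a reconnecting client would send when asking for missed messages.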
### Requirement: Message Persistence and History
The system SHALL persist all messages to database for audit trail, report generation, and history retrieval.
#### Scenario: Store message in database
- **WHEN** a message is sent through WebSocket
- **THEN** the system SHALL create a database record with:
  - message_id (UUID)
  - room_id (FK to incident_rooms)
  - sender_id (user email)
  - content (text or JSON for structured messages)
  - message_type (text, image_ref, file_ref, system, incident_data)
  - created_at timestamp
  - edited_at (nullable for message edits)
  - deleted_at (nullable for soft delete)
  - sequence_number (for ordering)
#### Scenario: Retrieve message history
- **WHEN** a user joins a room or reconnects
- **THEN** the system SHALL load recent messages via `GET /api/rooms/{room_id}/messages?limit=50&before={timestamp}`
- **AND** return messages in reverse chronological order
- **AND** include pagination metadata for loading more history
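The `limit` and `before` parameters describe cursor-style pagination. A minimal sketch over an in-memory list is shown below; the function name `get_messages` mirrors the task list, but the response fields `has_more` and `next_before` are assumptions made for the example, since the specification does not define the pagination metadata.

```python
from datetime import datetime, timedelta
from typing import Optional


def get_messages(messages: list[dict], limit: int = 50,
                 before: Optional[datetime] = None) -> dict:
    """Return one page of history, newest first, with pagination metadata."""
    older = [m for m in messages if before is None or m["created_at"] < before]
    older.sort(key=lambda m: m["created_at"], reverse=True)
    page = older[:limit]
    has_more = len(older) > limit
    return {
        "messages": page,
        "has_more": has_more,
        # Pass this value as `before` to fetch the next (older) page.
        "next_before": page[-1]["created_at"] if has_more and page else None,
    }


# Example: five messages one minute apart, fetched two at a time.
start = datetime(2025, 11, 17, 10, 0)
history = [{"content": f"msg {i}", "created_at": start + timedelta(minutes=i)}
           for i in range(5)]
first = get_messages(history, limit=2)
second = get_messages(history, limit=2, before=first["next_before"])
```

A timestamp cursor can skip messages that share the exact same `created_at`; paging on a unique, ordered column such as a sequence number would avoid that.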
#### Scenario: Search messages
- **WHEN** a user searches for messages containing specific keywords
- **THEN** the system SHALL query the database with full-text search
- **AND** return matching messages with highlighted search terms
- **AND** maintain user's access control (only rooms they're members of)
### Requirement: Message Types and Formatting
The system SHALL support various message types including text, image references, file references, and structured data for production incidents.
#### Scenario: Text message with mentions
- **WHEN** a user sends a message with @mentions
```json
{
"content": "@maintenance_team Please check Line 3 immediately",
"mentions": ["maintenance_team@panjit.com.tw"]
}
```
- **THEN** the system SHALL parse and store mentions
- **AND** MAY trigger notifications to mentioned users
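Server-side mention parsing might be sketched as below. The regular expression and the helper names `extract_mentions` and `resolve_mentions` are illustrative assumptions, as is the idea of a `directory` mapping handles to email addresses; the specification only states that mentions are parsed and stored.

```python
import re

# "@" must not follow a word character, a dot, or another "@", so the "@" inside
# an email address such as john.doe@panjit.com.tw is not treated as a mention.
MENTION_RE = re.compile(r"(?<![\w.@])@([A-Za-z0-9_.-]+)")


def extract_mentions(content: str) -> list[str]:
    """Return unique mention handles in order of first appearance."""
    handles: list[str] = []
    for raw in MENTION_RE.findall(content):
        handle = raw.rstrip(".-")  # drop trailing sentence punctuation, as in "@ops."
        if handle and handle not in handles:
            handles.append(handle)
    return handles


def resolve_mentions(handles: list[str], directory: dict[str, str]) -> list[str]:
    """Map handles to addresses, silently skipping handles that do not exist."""
    return [directory[h] for h in handles if h in directory]
```

The resolved list corresponds to the `mentions` array in the example payload and could be stored in the message metadata.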
#### Scenario: Image reference message
- **WHEN** a user uploads an image and sends reference
```json
{
"type": "message",
"message_type": "image_ref",
"content": "Defect found on product",
"file_id": "550e8400-e29b-41d4-a716-446655440000",
"file_url": "http://localhost:9000/bucket/room-123/image.jpg"
}
```
- **THEN** the system SHALL store the file reference
- **AND** clients SHALL display image preview inline
#### Scenario: Structured incident data
- **WHEN** reporting specific incident metrics
```json
{
  "type": "message",
  "message_type": "incident_data",
  "content": {
    "temperature": 85,
    "pressure": 120,
    "production_rate": 450,
    "timestamp": "2025-11-17T10:15:00Z"
  }
}
```
- **THEN** the system SHALL store structured data as JSON
- **AND** enable querying/filtering by specific fields later
### Requirement: Connection State Management
The system SHALL track online presence and typing indicators for better collaboration experience.
#### Scenario: Track online users
- **WHEN** users connect/disconnect from a room
- **THEN** the system SHALL maintain a list of online users
- **AND** broadcast presence updates to all room members
- **AND** clients SHALL display online status indicators in the UI
#### Scenario: Typing indicators
- **WHEN** a user starts typing a message
- **THEN** the client SHALL send a "typing" event via WebSocket
- **AND** the system SHALL broadcast to other room members
- **AND** automatically clear typing status after 3 seconds of inactivity
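The 3-second expiry could be tracked with a timestamp per user rather than a timer per user. The sketch below takes that approach; `TypingTracker` and its injectable `clock` parameter are hypothetical names introduced so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable

TYPING_TIMEOUT = 3.0  # seconds of inactivity before typing status clears


class TypingTracker:
    """Remembers when each user last sent a typing event, per room."""

    def __init__(self, timeout: float = TYPING_TIMEOUT,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout
        self._clock = clock
        self._last_seen: dict[tuple[str, str], float] = {}

    def typing(self, room_id: str, user_id: str) -> None:
        """Record a typing event from this user."""
        self._last_seen[(room_id, user_id)] = self._clock()

    def active(self, room_id: str) -> list[str]:
        """Return users still considered typing, dropping expired entries."""
        now = self._clock()
        expired = [key for key, seen in self._last_seen.items()
                   if now - seen >= self._timeout]
        for key in expired:
            del self._last_seen[key]
        return sorted(user for (room, user) in self._last_seen if room == room_id)
```

Expiry here is evaluated lazily when `active` is called; a server that must broadcast a "stopped typing" event promptly would also need a periodic sweep.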
#### Scenario: Connection health monitoring
- **WHEN** a WebSocket connection is established
- **THEN** the system SHALL send ping frames every 30 seconds
- **AND** expect pong responses within 10 seconds
- **AND** terminate connection if no response received
### Requirement: Message Operations
The system SHALL support message editing and deletion with proper audit trail and permissions.
#### Scenario: Edit own message
- **WHEN** a user edits their own message within 15 minutes
```json
{
"type": "edit_message",
"message_id": "msg-123",
"content": "Updated: Equipment temperature stabilized at 75°C"
}
```
- **THEN** the system SHALL update the message content
- **AND** set edited_at timestamp
- **AND** broadcast the edit to all connected clients
- **AND** preserve original message in audit log
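The edit rules could be enforced along these lines. This sketch assumes the 15-minute window is measured from `created_at`, which the scenario does not state explicitly, and uses a plain list as a stand-in for the audit log; `edit_message` and the history field names are illustrative.

```python
from datetime import datetime, timedelta

EDIT_WINDOW = timedelta(minutes=15)  # assumed to start at message creation


def edit_message(message: dict, user_id: str, new_content: str,
                 now: datetime, history: list[dict]) -> dict:
    """Apply an edit if permitted, keeping the original content in `history`."""
    if message["sender_id"] != user_id:
        raise PermissionError("only the sender may edit this message")
    if now - message["created_at"] > EDIT_WINDOW:
        raise PermissionError("edit window has expired")

    # Preserve the pre-edit content before overwriting it.
    history.append({
        "message_id": message["message_id"],
        "original_content": message["content"],
        "edited_by": user_id,
        "edited_at": now,
    })
    message["content"] = new_content
    message["edited_at"] = now
    return message
```

After a successful call the updated message is what would be broadcast to connected clients as the edit event.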
#### Scenario: Delete message
- **WHEN** a user deletes their own message or admin deletes any message
- **THEN** the system SHALL perform soft delete (set deleted_at)
- **AND** broadcast deletion event to all clients
- **AND** clients SHALL show "message deleted" placeholder
- **AND** preserve message in database for audit
#### Scenario: React to message
- **WHEN** a user adds a reaction emoji to a message
```json
{
"type": "add_reaction",
"message_id": "msg-123",
"emoji": "👍"
}
```
- **THEN** the system SHALL store the reaction
- **AND** broadcast to all connected clients
- **AND** aggregate reaction counts for display
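Storing and aggregating reactions could be sketched as follows, with a set of `(message_id, user_id, emoji)` tuples standing in for the reactions table so that each user can add a given emoji to a message only once. The function names follow the task list, but the signatures here are simplified assumptions.

```python
from collections import Counter

Reaction = tuple[str, str, str]  # (message_id, user_id, emoji)


def add_reaction(reactions: set[Reaction], message_id: str,
                 user_id: str, emoji: str) -> bool:
    """Store a reaction; return False if this user already added this emoji."""
    key = (message_id, user_id, emoji)
    if key in reactions:
        return False
    reactions.add(key)
    return True


def get_reaction_counts(reactions: set[Reaction], message_id: str) -> dict[str, int]:
    """Count reactions per emoji for one message."""
    return dict(Counter(emoji for (mid, _user, emoji) in reactions
                        if mid == message_id))
```

The resulting mapping, for example `{"👍": 2}`, is the kind of aggregate a client could render next to the message.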
## Scenarios
### Happy Path: Production Incident Communication
1. Equipment failure detected on Line 3
2. Operator creates incident room via REST API
3. Opens WebSocket connection to room
4. Sends initial message: "Conveyor belt stopped, investigating"
5. Maintenance team members join via WebSocket
6. Real-time messages exchanged with status updates
7. Images uploaded and referenced in messages
8. Temperature/pressure data shared as structured messages
9. Issue resolved, final message sent
10. Room status changed to "resolved"
11. All messages available for report generation
### Edge Case: Network Interruption Recovery
1. User actively chatting in incident room
2. Network connection drops for 2 minutes
3. Client automatically attempts reconnection
4. Connection re-established with same room_id
5. Client requests messages since last sequence number
6. Server sends missed messages
7. UI updates seamlessly without user intervention
### Error Case: Unauthorized Message Attempt
1. User with VIEWER role connects to room WebSocket
2. Attempts to send a message
3. Server validates permission based on role
4. Rejects message with error: "Insufficient permissions"
5. WebSocket connection remains active (not terminated)
6. User can still receive messages from others
### Performance Case: High-traffic Incident
1. Major production issue affects multiple lines
2. 20+ team members join incident room
3. Rapid message exchange (100+ messages/minute)
4. System maintains sub-100ms message broadcast latency
5. Database writes queued to prevent blocking
6. All messages preserved in correct order
7. No message loss despite high load
## Technical Considerations
### WebSocket Implementation
- Use FastAPI's native WebSocket support
- Implement connection pooling per room
- Use Redis pub/sub for multi-server scaling (future)
- Graceful shutdown handling to notify clients
### Database Schema
```sql
-- The trigram index below requires the pg_trgm extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE messages (
    message_id VARCHAR(36) PRIMARY KEY,
    room_id VARCHAR(36) NOT NULL REFERENCES incident_rooms(room_id),
    sender_id VARCHAR(255) NOT NULL,
    content TEXT NOT NULL,
    message_type VARCHAR(20) DEFAULT 'text',
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW(),
    edited_at TIMESTAMP,
    deleted_at TIMESTAMP,
    sequence_number BIGSERIAL
);

-- PostgreSQL has no inline INDEX clause in CREATE TABLE, so indexes are created separately.
CREATE INDEX idx_room_messages ON messages (room_id, created_at DESC);
-- Trigram (pg_trgm) index for keyword/substring search on message content
CREATE INDEX idx_message_search ON messages USING gin (content gin_trgm_ops);

CREATE TABLE message_reactions (
    reaction_id SERIAL PRIMARY KEY,
    message_id VARCHAR(36) REFERENCES messages(message_id),
    user_id VARCHAR(255) NOT NULL,
    emoji VARCHAR(10) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    UNIQUE(message_id, user_id, emoji)
);
```
### Security Considerations
- Validate JWT tokens on WebSocket connection
- Rate limiting per user (max 10 messages/second)
- Message size limit (10KB for text, 100KB for structured data)
- XSS prevention for message content
- SQL injection prevention using parameterized queries
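The rate and size limits above could be checked before a message is processed. The sketch below uses a sliding one-second window per user; `RateLimiter`, `within_size_limit`, and the choice to measure size in UTF-8 bytes are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Callable

MAX_TEXT_BYTES = 10 * 1024          # 10KB limit for text messages
MAX_STRUCTURED_BYTES = 100 * 1024   # 100KB limit for structured data


class RateLimiter:
    """Sliding-window limiter: at most `max_events` per `window` seconds per user."""

    def __init__(self, max_events: int = 10, window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_events
        self._window = window
        self._clock = clock
        self._events: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        """Return True and record the event if the user is under the limit."""
        now = self._clock()
        events = self._events[user_id]
        while events and now - events[0] >= self._window:
            events.popleft()  # forget events that fell out of the window
        if len(events) >= self._max:
            return False
        events.append(now)
        return True


def within_size_limit(content: str, limit: int = MAX_TEXT_BYTES) -> bool:
    """Check the encoded size, since multi-byte characters count more than once."""
    return len(content.encode("utf-8")) <= limit
```

A rejected message would typically produce an error reply on the socket rather than a disconnect, in line with the unauthorized-message case.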
### Performance Requirements
- Message broadcast latency < 100ms for same server
- Support 100+ concurrent connections per room
- Message history query < 200ms for 1000 messages
- Automatic connection cleanup for dropped clients


@@ -0,0 +1,199 @@
# realtime-messaging Specification
## Purpose
Enable real-time bidirectional communication within incident rooms using WebSocket protocol, allowing production teams to collaborate instantly during incidents with message persistence for audit trail and report generation.
## Target Specs
- chat-room: extend
- NEW: realtime-messaging
## Requirements
### Requirement: WebSocket Connection Management
The system SHALL provide WebSocket endpoints for establishing persistent bidirectional connections between clients and server, with automatic reconnection handling and connection state management.
#### Scenario: Establish WebSocket connection
- **WHEN** an authenticated user connects to `ws://localhost:8000/ws/{room_id}`
- **THEN** the system SHALL validate user's room membership
- **AND** establish a WebSocket connection
- **AND** add the connection to the room's active connections pool
- **AND** broadcast a "user joined" event to other room members
#### Scenario: Handle connection authentication
- **WHEN** a WebSocket connection request is made without valid authentication token
- **THEN** the system SHALL reject the connection with status 401
- **AND** close the WebSocket immediately
#### Scenario: Automatic reconnection
- **WHEN** a WebSocket connection is dropped unexpectedly
- **THEN** the client SHALL attempt to reconnect automatically with exponential backoff
- **AND** resume from the last received message sequence number
- **AND** request any missed messages during disconnection
### Requirement: Real-time Message Broadcasting
The system SHALL broadcast messages to all active room members in real-time, ensuring message ordering and delivery acknowledgment.
#### Scenario: Send text message
- **WHEN** a room member sends a message via WebSocket:
```json
{
"type": "message",
"content": "Equipment temperature rising to 85°C",
"message_type": "text"
}
```
- **THEN** the system SHALL:
  - Validate user has write permission (OWNER or EDITOR role)
  - Assign a unique message_id and timestamp
  - Store the message in database
  - Broadcast to all active WebSocket connections in the room
  - Return acknowledgment to sender with message_id
#### Scenario: Send system notification
- **WHEN** a system event occurs (user joined, room status changed, etc.)
- **THEN** the system SHALL broadcast a system message:
```json
{
"type": "system",
"event": "user_joined",
"user_id": "john.doe@panjit.com.tw",
"timestamp": "2025-11-17T10:00:00Z"
}
```
- **AND** all connected clients SHALL display the notification
#### Scenario: Handle message ordering
- **WHEN** multiple messages are sent simultaneously
- **THEN** the system SHALL ensure FIFO ordering using message sequence numbers
- **AND** clients SHALL display messages in the correct order
### Requirement: Message Persistence and History
The system SHALL persist all messages to database for audit trail, report generation, and history retrieval.
#### Scenario: Store message in database
- **WHEN** a message is sent through WebSocket
- **THEN** the system SHALL create a database record with:
  - message_id (UUID)
  - room_id (FK to incident_rooms)
  - sender_id (user email)
  - content (text or JSON for structured messages)
  - message_type (text, image_ref, file_ref, system, incident_data)
  - created_at timestamp
  - edited_at (nullable for message edits)
  - deleted_at (nullable for soft delete)
  - sequence_number (for ordering)
#### Scenario: Retrieve message history
- **WHEN** a user joins a room or reconnects
- **THEN** the system SHALL load recent messages via `GET /api/rooms/{room_id}/messages?limit=50&before={timestamp}`
- **AND** return messages in reverse chronological order
- **AND** include pagination metadata for loading more history
#### Scenario: Search messages
- **WHEN** a user searches for messages containing specific keywords
- **THEN** the system SHALL query the database with full-text search
- **AND** return matching messages with highlighted search terms
- **AND** maintain user's access control (only rooms they're members of)
### Requirement: Message Types and Formatting
The system SHALL support various message types including text, image references, file references, and structured data for production incidents.
#### Scenario: Text message with mentions
- **WHEN** a user sends a message with @mentions
```json
{
"content": "@maintenance_team Please check Line 3 immediately",
"mentions": ["maintenance_team@panjit.com.tw"]
}
```
- **THEN** the system SHALL parse and store mentions
- **AND** MAY trigger notifications to mentioned users
#### Scenario: Image reference message
- **WHEN** a user uploads an image and sends reference
```json
{
"type": "message",
"message_type": "image_ref",
"content": "Defect found on product",
"file_id": "550e8400-e29b-41d4-a716-446655440000",
"file_url": "http://localhost:9000/bucket/room-123/image.jpg"
}
```
- **THEN** the system SHALL store the file reference
- **AND** clients SHALL display image preview inline
#### Scenario: Structured incident data
- **WHEN** reporting specific incident metrics
```json
{
  "type": "message",
  "message_type": "incident_data",
  "content": {
    "temperature": 85,
    "pressure": 120,
    "production_rate": 450,
    "timestamp": "2025-11-17T10:15:00Z"
  }
}
```
- **THEN** the system SHALL store structured data as JSON
- **AND** enable querying/filtering by specific fields later
### Requirement: Connection State Management
The system SHALL track online presence and typing indicators for better collaboration experience.
#### Scenario: Track online users
- **WHEN** users connect/disconnect from a room
- **THEN** the system SHALL maintain a list of online users
- **AND** broadcast presence updates to all room members
- **AND** clients SHALL display online status indicators in the UI
#### Scenario: Typing indicators
- **WHEN** a user starts typing a message
- **THEN** the client SHALL send a "typing" event via WebSocket
- **AND** the system SHALL broadcast to other room members
- **AND** automatically clear typing status after 3 seconds of inactivity
#### Scenario: Connection health monitoring
- **WHEN** a WebSocket connection is established
- **THEN** the system SHALL send ping frames every 30 seconds
- **AND** expect pong responses within 10 seconds
- **AND** terminate connection if no response received
### Requirement: Message Operations
The system SHALL support message editing and deletion with proper audit trail and permissions.
#### Scenario: Edit own message
- **WHEN** a user edits their own message within 15 minutes
```json
{
"type": "edit_message",
"message_id": "msg-123",
"content": "Updated: Equipment temperature stabilized at 75°C"
}
```
- **THEN** the system SHALL update the message content
- **AND** set edited_at timestamp
- **AND** broadcast the edit to all connected clients
- **AND** preserve original message in audit log
#### Scenario: Delete message
- **WHEN** a user deletes their own message or admin deletes any message
- **THEN** the system SHALL perform soft delete (set deleted_at)
- **AND** broadcast deletion event to all clients
- **AND** clients SHALL show "message deleted" placeholder
- **AND** preserve message in database for audit
#### Scenario: React to message
- **WHEN** a user adds a reaction emoji to a message
```json
{
"type": "add_reaction",
"message_id": "msg-123",
"emoji": "👍"
}
```
- **THEN** the system SHALL store the reaction
- **AND** broadcast to all connected clients
- **AND** aggregate reaction counts for display


@@ -0,0 +1,195 @@
# realtime-messaging Specification
## Purpose
Enable real-time bidirectional communication within incident rooms using WebSocket protocol, allowing production teams to collaborate instantly during incidents with message persistence for audit trail and report generation.
## ADDED Requirements
### Requirement: WebSocket Connection Management
The system SHALL provide WebSocket endpoints for establishing persistent bidirectional connections between clients and server, with automatic reconnection handling and connection state management.
#### Scenario: Establish WebSocket connection
- **WHEN** an authenticated user connects to `ws://localhost:8000/ws/{room_id}`
- **THEN** the system SHALL validate user's room membership
- **AND** establish a WebSocket connection
- **AND** add the connection to the room's active connections pool
- **AND** broadcast a "user joined" event to other room members
#### Scenario: Handle connection authentication
- **WHEN** a WebSocket connection request is made without valid authentication token
- **THEN** the system SHALL reject the connection with status 401
- **AND** close the WebSocket immediately
#### Scenario: Automatic reconnection
- **WHEN** a WebSocket connection is dropped unexpectedly
- **THEN** the client SHALL attempt to reconnect automatically with exponential backoff
- **AND** resume from the last received message sequence number
- **AND** request any missed messages during disconnection
### Requirement: Real-time Message Broadcasting
The system SHALL broadcast messages to all active room members in real-time, ensuring message ordering and delivery acknowledgment.
#### Scenario: Send text message
- **WHEN** a room member sends a message via WebSocket:
```json
{
"type": "message",
"content": "Equipment temperature rising to 85°C",
"message_type": "text"
}
```
- **THEN** the system SHALL:
  - Validate user has write permission (OWNER or EDITOR role)
  - Assign a unique message_id and timestamp
  - Store the message in database
  - Broadcast to all active WebSocket connections in the room
  - Return acknowledgment to sender with message_id
#### Scenario: Send system notification
- **WHEN** a system event occurs (user joined, room status changed, etc.)
- **THEN** the system SHALL broadcast a system message:
```json
{
"type": "system",
"event": "user_joined",
"user_id": "john.doe@panjit.com.tw",
"timestamp": "2025-11-17T10:00:00Z"
}
```
- **AND** all connected clients SHALL display the notification
#### Scenario: Handle message ordering
- **WHEN** multiple messages are sent simultaneously
- **THEN** the system SHALL ensure FIFO ordering using message sequence numbers
- **AND** clients SHALL display messages in the correct order
### Requirement: Message Persistence and History
The system SHALL persist all messages to database for audit trail, report generation, and history retrieval.
#### Scenario: Store message in database
- **WHEN** a message is sent through WebSocket
- **THEN** the system SHALL create a database record with:
  - message_id (UUID)
  - room_id (FK to incident_rooms)
  - sender_id (user email)
  - content (text or JSON for structured messages)
  - message_type (text, image_ref, file_ref, system, incident_data)
  - created_at timestamp
  - edited_at (nullable for message edits)
  - deleted_at (nullable for soft delete)
  - sequence_number (for ordering)
#### Scenario: Retrieve message history
- **WHEN** a user joins a room or reconnects
- **THEN** the system SHALL load recent messages via `GET /api/rooms/{room_id}/messages?limit=50&before={timestamp}`
- **AND** return messages in reverse chronological order
- **AND** include pagination metadata for loading more history
#### Scenario: Search messages
- **WHEN** a user searches for messages containing specific keywords
- **THEN** the system SHALL query the database with full-text search
- **AND** return matching messages with highlighted search terms
- **AND** maintain user's access control (only rooms they're members of)
### Requirement: Message Types and Formatting
The system SHALL support various message types including text, image references, file references, and structured data for production incidents.
#### Scenario: Text message with mentions
- **WHEN** a user sends a message with @mentions
```json
{
"content": "@maintenance_team Please check Line 3 immediately",
"mentions": ["maintenance_team@panjit.com.tw"]
}
```
- **THEN** the system SHALL parse and store mentions
- **AND** MAY trigger notifications to mentioned users
#### Scenario: Image reference message
- **WHEN** a user uploads an image and sends reference
```json
{
"type": "message",
"message_type": "image_ref",
"content": "Defect found on product",
"file_id": "550e8400-e29b-41d4-a716-446655440000",
"file_url": "http://localhost:9000/bucket/room-123/image.jpg"
}
```
- **THEN** the system SHALL store the file reference
- **AND** clients SHALL display image preview inline
#### Scenario: Structured incident data
- **WHEN** reporting specific incident metrics
```json
{
  "type": "message",
  "message_type": "incident_data",
  "content": {
    "temperature": 85,
    "pressure": 120,
    "production_rate": 450,
    "timestamp": "2025-11-17T10:15:00Z"
  }
}
```
- **THEN** the system SHALL store structured data as JSON
- **AND** enable querying/filtering by specific fields later
### Requirement: Connection State Management
The system SHALL track online presence and typing indicators for better collaboration experience.
#### Scenario: Track online users
- **WHEN** users connect/disconnect from a room
- **THEN** the system SHALL maintain a list of online users
- **AND** broadcast presence updates to all room members
- **AND** clients SHALL display online status indicators in the UI
#### Scenario: Typing indicators
- **WHEN** a user starts typing a message
- **THEN** the client SHALL send a "typing" event via WebSocket
- **AND** the system SHALL broadcast to other room members
- **AND** automatically clear typing status after 3 seconds of inactivity
#### Scenario: Connection health monitoring
- **WHEN** a WebSocket connection is established
- **THEN** the system SHALL send ping frames every 30 seconds
- **AND** expect pong responses within 10 seconds
- **AND** terminate connection if no response received
### Requirement: Message Operations
The system SHALL support message editing and deletion with proper audit trail and permissions.
#### Scenario: Edit own message
- **WHEN** a user edits their own message within 15 minutes
```json
{
"type": "edit_message",
"message_id": "msg-123",
"content": "Updated: Equipment temperature stabilized at 75°C"
}
```
- **THEN** the system SHALL update the message content
- **AND** set edited_at timestamp
- **AND** broadcast the edit to all connected clients
- **AND** preserve original message in audit log
#### Scenario: Delete message
- **WHEN** a user deletes their own message or admin deletes any message
- **THEN** the system SHALL perform soft delete (set deleted_at)
- **AND** broadcast deletion event to all clients
- **AND** clients SHALL show "message deleted" placeholder
- **AND** preserve message in database for audit
#### Scenario: React to message
- **WHEN** a user adds a reaction emoji to a message
```json
{
"type": "add_reaction",
"message_id": "msg-123",
"emoji": "👍"
}
```
- **THEN** the system SHALL store the reaction
- **AND** broadcast to all connected clients
- **AND** aggregate reaction counts for display


@@ -0,0 +1,312 @@
# Implementation Tasks
## 1. Database Schema
- [ ] 1.1 Create `messages` table with columns:
  - [ ] message_id (UUID, PK)
  - [ ] room_id (FK to incident_rooms)
  - [ ] sender_id (VARCHAR, user email)
  - [ ] content (TEXT)
  - [ ] message_type (ENUM: text, image_ref, file_ref, system, incident_data)
  - [ ] metadata (JSON, for structured data and mentions)
  - [ ] created_at (TIMESTAMP)
  - [ ] edited_at (TIMESTAMP, nullable)
  - [ ] deleted_at (TIMESTAMP, nullable for soft delete)
  - [ ] sequence_number (BIGSERIAL for ordering)
- [ ] 1.2 Create `message_reactions` table:
  - [ ] reaction_id (SERIAL, PK)
  - [ ] message_id (FK to messages)
  - [ ] user_id (VARCHAR)
  - [ ] emoji (VARCHAR(10))
  - [ ] created_at (TIMESTAMP)
  - [ ] UNIQUE constraint on (message_id, user_id, emoji)
- [ ] 1.3 Create `message_edit_history` table:
  - [ ] edit_id (SERIAL, PK)
  - [ ] message_id (FK to messages)
  - [ ] original_content (TEXT)
  - [ ] edited_by (VARCHAR)
  - [ ] edited_at (TIMESTAMP)
- [ ] 1.4 Create indexes:
  - [ ] Index on messages(room_id, created_at DESC)
  - [ ] Index on messages(room_id, sequence_number)
  - [ ] Full-text index on messages(content) for search
  - [ ] Index on message_reactions(message_id)
- [ ] 1.5 Create database migration using Alembic
## 2. WebSocket Infrastructure
### 2.1 Connection Management
- [ ] 2.1.1 Create `app/modules/realtime/` module structure:
  - [ ] `__init__.py`
  - [ ] `websocket_manager.py` (connection pool management)
  - [ ] `models.py` (message models)
  - [ ] `schemas.py` (WebSocket message schemas)
  - [ ] `handlers.py` (message type handlers)
  - [ ] `router.py` (WebSocket endpoint)
- [ ] 2.1.2 Implement `WebSocketManager` class:
  - [ ] `connect(websocket, room_id, user_id)` - Add connection to pool
  - [ ] `disconnect(websocket, room_id, user_id)` - Remove and cleanup
  - [ ] `broadcast_to_room(room_id, message)` - Send to all in room
  - [ ] `send_personal(user_id, message)` - Send to specific user
  - [ ] `get_room_connections(room_id)` - List active connections
- [ ] 2.1.3 Implement connection authentication:
  - [ ] Extract JWT token from WebSocket headers or query params
  - [ ] Validate token using existing auth module
  - [ ] Check room membership before allowing connection
  - [ ] Reject unauthorized connections
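A connection pool along the lines of task 2.1.2 might start out like the sketch below. It is a simplified illustration, not the planned implementation: it keeps one connection per user per room, omits `send_personal`, and only assumes the socket object offers an async `send_json` method (which FastAPI/Starlette `WebSocket` objects do), so it can be exercised with a stand-in object.

```python
import asyncio
from collections import defaultdict
from typing import Any


class WebSocketManager:
    """Tracks one connection per (room, user); a simplification for illustration."""

    def __init__(self) -> None:
        # room_id -> {user_id: websocket}
        self._rooms: dict[str, dict[str, Any]] = defaultdict(dict)

    async def connect(self, websocket: Any, room_id: str, user_id: str) -> None:
        # With FastAPI, `await websocket.accept()` would normally happen first.
        self._rooms[room_id][user_id] = websocket

    def disconnect(self, websocket: Any, room_id: str, user_id: str) -> None:
        room = self._rooms.get(room_id, {})
        if room.get(user_id) is websocket:
            del room[user_id]
        if not room:
            self._rooms.pop(room_id, None)  # drop empty rooms

    def get_room_connections(self, room_id: str) -> list[str]:
        """Return the user IDs currently connected to the room."""
        return sorted(self._rooms.get(room_id, {}))

    async def broadcast_to_room(self, room_id: str, message: dict) -> int:
        """Send to every connection in the room; return how many sends succeeded."""
        targets = list(self._rooms.get(room_id, {}).items())
        results = await asyncio.gather(
            *(ws.send_json(message) for _, ws in targets),
            return_exceptions=True,
        )
        delivered = 0
        for (user_id, ws), result in zip(targets, results):
            if isinstance(result, BaseException):
                self.disconnect(ws, room_id, user_id)  # clean up dead connections
            else:
                delivered += 1
        return delivered
```

Collecting exceptions with `return_exceptions=True` keeps one broken socket from interrupting delivery to the rest of the room.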
### 2.2 WebSocket Endpoint
- [ ] 2.2.1 Create WebSocket route `/ws/{room_id}`:
  - [ ] Accept WebSocket connection
  - [ ] Authenticate user
  - [ ] Add to connection pool
  - [ ] Listen for incoming messages
  - [ ] Handle disconnection gracefully
- [ ] 2.2.2 Implement reconnection handling:
  - [ ] Track last sequence number per connection
  - [ ] Support resume from sequence number
  - [ ] Queue messages during brief disconnections
- [ ] 2.2.3 Implement heartbeat/ping-pong:
  - [ ] Send ping every 30 seconds
  - [ ] Wait for pong response
  - [ ] Terminate stale connections
## 3. Message Handling
### 3.1 Message Models and Schemas
- [ ] 3.1.1 Create SQLAlchemy models:
  - [ ] `Message` model
  - [ ] `MessageReaction` model
  - [ ] `MessageEditHistory` model
- [ ] 3.1.2 Create Pydantic schemas:
  - [ ] `WebSocketMessage` (incoming)
  - [ ] `MessageBroadcast` (outgoing)
  - [ ] `MessageCreate` (for database)
  - [ ] `MessageResponse` (for REST API)
  - [ ] `ReactionRequest/Response`
- [ ] 3.1.3 Define message type enums:
  - [ ] MessageType (text, image_ref, file_ref, system, incident_data)
  - [ ] SystemEventType (user_joined, user_left, status_changed, etc.)
### 3.2 Message Processing
- [ ] 3.2.1 Create message handlers for each type:
  - [ ] `handle_text_message()` - Process plain text
  - [ ] `handle_image_reference()` - Validate and store image refs
  - [ ] `handle_file_reference()` - Validate and store file refs
  - [ ] `handle_incident_data()` - Process structured data
  - [ ] `handle_system_message()` - Broadcast system events
- [ ] 3.2.2 Implement mention parsing:
  - [ ] Extract @mentions from content
  - [ ] Validate mentioned users exist
  - [ ] Store mentions in metadata
- [ ] 3.2.3 Implement message validation:
  - [ ] Content length limits (10KB text, 100KB structured)
  - [ ] Rate limiting (10 messages/second per user)
  - [ ] Spam detection
  - [ ] XSS prevention (sanitize HTML)
### 3.3 Message Operations
- [ ] 3.3.1 Implement message editing:
  - [ ] Validate edit permission (own message, within 15 mins)
  - [ ] Store original in edit history
  - [ ] Update message content
  - [ ] Broadcast edit event
- [ ] 3.3.2 Implement message deletion:
  - [ ] Soft delete (set deleted_at)
  - [ ] Validate delete permission
  - [ ] Broadcast deletion event
- [ ] 3.3.3 Implement reactions:
  - [ ] Add/remove reactions
  - [ ] Aggregate reaction counts
  - [ ] Broadcast reaction updates
## 4. Message Service Layer
### 4.1 Message Service
- [ ] 4.1.1 Create `services/message_service.py`:
  - [ ] `create_message(db, room_id, sender_id, content, message_type)`
  - [ ] `get_messages(db, room_id, limit, before_timestamp)`
  - [ ] `edit_message(db, message_id, user_id, new_content)`
  - [ ] `delete_message(db, message_id, user_id)`
  - [ ] `search_messages(db, room_id, query, user_id)`
- [ ] 4.1.2 Implement message persistence:
  - [ ] Auto-assign sequence numbers
  - [ ] Handle concurrent writes
  - [ ] Maintain message ordering
- [ ] 4.1.3 Implement message retrieval:
  - [ ] Paginated history loading
  - [ ] Filter by message type
  - [ ] Include reaction aggregates
### 4.2 Reaction Service
- [ ] 4.2.1 Create `services/reaction_service.py`:
- [ ] `add_reaction(db, message_id, user_id, emoji)`
- [ ] `remove_reaction(db, message_id, user_id, emoji)`
- [ ] `get_message_reactions(db, message_id)`
- [ ] `get_reaction_counts(db, message_id)`
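
The reaction functions above can be sketched with a set of `(message_id, user_id, emoji)` tuples. `ReactionStore` is a hypothetical in-memory substitute for the `db` session; it assumes a user can add a given emoji to a message only once.

```python
from collections import Counter


class ReactionStore:
    """In-memory sketch of the reaction service (illustration only)."""

    def __init__(self) -> None:
        self._reactions: set[tuple[str, str, str]] = set()

    def add_reaction(self, message_id: str, user_id: str, emoji: str) -> bool:
        """Return False if this user already reacted with this emoji."""
        key = (message_id, user_id, emoji)
        if key in self._reactions:
            return False
        self._reactions.add(key)
        return True

    def remove_reaction(self, message_id: str, user_id: str, emoji: str) -> bool:
        """Return False if there was nothing to remove."""
        key = (message_id, user_id, emoji)
        if key not in self._reactions:
            return False
        self._reactions.remove(key)
        return True

    def get_reaction_counts(self, message_id: str) -> dict[str, int]:
        """Aggregate counts per emoji for one message."""
        return dict(Counter(e for m, _, e in self._reactions if m == message_id))
```

In a relational schema the same uniqueness rule would typically be a unique constraint on the three columns.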
## 5. REST API Endpoints
### 5.1 Message Endpoints
- [ ] 5.1.1 Implement message REST endpoints:
- [ ] `GET /api/rooms/{room_id}/messages` - Get message history
- [ ] `GET /api/rooms/{room_id}/messages/{message_id}` - Get single message
- [ ] `POST /api/rooms/{room_id}/messages` - Send message (alternative to WS)
- [ ] `PATCH /api/messages/{message_id}` - Edit message
- [ ] `DELETE /api/messages/{message_id}` - Delete message
- [ ] 5.1.2 Implement search endpoint:
- [ ] `GET /api/rooms/{room_id}/messages/search?q={query}`
- [ ] 5.1.3 Implement reaction endpoints:
- [ ] `POST /api/messages/{message_id}/reactions` - Add reaction
- [ ] `DELETE /api/messages/{message_id}/reactions/{emoji}` - Remove reaction
### 5.2 Presence Endpoints
- [ ] 5.2.1 Create presence tracking:
- [ ] `GET /api/rooms/{room_id}/online` - Get online users
- [ ] `GET /api/rooms/{room_id}/typing` - Get typing users
## 6. Broadcasting System
### 6.1 Event Broadcasting
- [ ] 6.1.1 Create broadcast service:
- [ ] Room-level broadcasts (all members)
- [ ] User-level broadcasts (specific user)
- [ ] System-wide broadcasts (all connections)
- [ ] 6.1.2 Implement event types:
- [ ] Message events (new, edit, delete)
- [ ] User events (joined, left, typing)
- [ ] Room events (status changed, member added)
- [ ] 6.1.3 Create event queue:
- [ ] Queue events during processing
- [ ] Batch broadcasts for efficiency
- [ ] Handle failed deliveries
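
One possible shape for a room-level broadcast that tolerates failed deliveries is shown below. It assumes each connection is represented by an async `send(event)` callable keyed by user ID; in the actual service this would wrap the WebSocket object, and the function name `broadcast` is illustrative.

```python
import asyncio
from typing import Awaitable, Callable

Sender = Callable[[dict], Awaitable[None]]


async def broadcast(connections: dict[str, Sender], event: dict) -> list[str]:
    """Send `event` to every connection and return the user IDs that failed."""
    user_ids = list(connections)
    # return_exceptions=True keeps one broken socket from cancelling the rest.
    results = await asyncio.gather(
        *(connections[uid](event) for uid in user_ids),
        return_exceptions=True,
    )
    return [uid for uid, result in zip(user_ids, results) if isinstance(result, Exception)]
```

The caller can use the returned list to drop dead connections or to schedule a retry.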
### 6.2 Typing Indicators
- [ ] 6.2.1 Implement typing state management:
- [ ] Track typing users per room
- [ ] Auto-expire after 3 seconds
- [ ] Broadcast typing events
- [ ] 6.2.2 Debounce typing events:
- [ ] Limit typing event frequency
- [ ] Aggregate multiple typing users
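
The typing state with a 3-second expiry might be tracked as in this sketch. `TypingTracker` and its injectable `clock` parameter are assumptions made so the expiry can be exercised without sleeping.

```python
import time

TYPING_TTL_SECONDS = 3.0  # auto-expire window from task 6.2.1


class TypingTracker:
    """Track who is typing in each room and expire stale entries."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._last_seen: dict[tuple[str, str], float] = {}

    def mark_typing(self, room_id: str, user_id: str) -> None:
        """Record (or refresh) a typing signal from a user."""
        self._last_seen[(room_id, user_id)] = self._clock()

    def typing_users(self, room_id: str) -> list[str]:
        """Return users in the room whose last signal is younger than the TTL."""
        now = self._clock()
        # Purge expired entries lazily on read instead of running a timer.
        self._last_seen = {
            key: seen for key, seen in self._last_seen.items()
            if now - seen < TYPING_TTL_SECONDS
        }
        return sorted(user for (room, user) in self._last_seen if room == room_id)
```

Returning the whole list per room also covers the aggregation step: one broadcast can carry every typing user instead of one event per user.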
## 7. Client Helpers
### 7.1 Connection Recovery
- [ ] 7.1.1 Implement reconnection logic:
- [ ] Exponential backoff (1s, 2s, 4s, 8s, ..., capped at 30s)
- [ ] Store last sequence number
- [ ] Request missed messages on reconnect
- [ ] 7.1.2 Implement offline queue:
- [ ] Queue messages while disconnected
- [ ] Send queued messages on reconnect
- [ ] Handle conflicts
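
The reconnection schedule in 7.1.1 can be expressed as a small generator. This is a sketch of the delay sequence only; the function name and parameters are illustrative, and it does not add jitter, which a production client may want in order to avoid synchronized reconnects.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 30.0) -> Iterator[float]:
    """Yield reconnect delays in seconds: base, base*factor, ... capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First seven delays a client would wait between attempts.
schedule = list(islice(backoff_delays(), 7))
```

A client would create a fresh generator after each successful connection so the delay resets to the base value.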
### 7.2 Client SDK
- [ ] 7.2.1 Create Python client example:
- [ ] WebSocket connection wrapper
- [ ] Auto-reconnection
- [ ] Message sending/receiving
- [ ] 7.2.2 Create JavaScript client example:
- [ ] WebSocket wrapper class
- [ ] React hooks for messages
- [ ] TypeScript types
## 8. Testing
### 8.1 Unit Tests
- [ ] 8.1.1 Test message service:
- [ ] Message creation with validation
- [ ] Message ordering
- [ ] Edit/delete permissions
- [ ] Reaction operations
- [ ] 8.1.2 Test WebSocket manager:
- [ ] Connection pooling
- [ ] Broadcasting logic
- [ ] Disconnection handling
### 8.2 Integration Tests
- [ ] 8.2.1 Test WebSocket flow:
- [ ] Connect → Send → Receive → Disconnect
- [ ] Multi-user message exchange
- [ ] Reconnection with message recovery
- [ ] 8.2.2 Test REST endpoints:
- [ ] Message history retrieval
- [ ] Search functionality
- [ ] Permission enforcement
### 8.3 Performance Tests
- [ ] 8.3.1 Load testing:
- [ ] 100+ concurrent WebSocket connections
- [ ] 1000+ messages/minute throughput
- [ ] Message broadcast latency < 100ms
- [ ] 8.3.2 Stress testing:
- [ ] Connection/disconnection cycling
- [ ] Large message payloads
- [ ] Database query performance
## 9. Security
### 9.1 Authentication & Authorization
- [ ] 9.1.1 Implement WebSocket authentication:
- [ ] JWT token validation
- [ ] Room membership verification
- [ ] Permission checks for operations
- [ ] 9.1.2 Implement rate limiting:
- [ ] Per-user message rate limits
- [ ] Connection attempt limits
- [ ] Global room limits
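
A per-user message rate limit such as the 10 messages/second figure from 3.2.3 could be enforced with a sliding window, as sketched here. The class name and the in-process storage are assumptions; a multi-worker deployment would need shared state instead.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_events` per user within any `window_seconds` span."""

    def __init__(self, max_events: int = 10, window_seconds: float = 1.0,
                 clock=time.monotonic) -> None:
        self._max = max_events
        self._window = window_seconds
        self._clock = clock
        self._events: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self._window:
            events.popleft()
        if len(events) >= self._max:
            return False
        events.append(now)
        return True
```

The same class could back the connection attempt limit by keying on client address instead of user ID.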
### 9.2 Input Validation
- [ ] 9.2.1 Sanitize message content:
- [ ] XSS prevention
- [ ] SQL injection prevention (parameterized queries)
- [ ] Script injection prevention
- [ ] 9.2.2 Validate message size:
- [ ] Enforce content length limits
- [ ] Validate JSON structure
- [ ] Reject malformed messages
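
The validation steps in 9.2 might be combined into a single parsing function like the one below. The frame shape (a JSON object with a string `content` field) and the name `parse_frame` are assumptions. Escaping with `html.escape` is one simple way to neutralize markup; it does not replace parameterized queries for SQL safety.

```python
import html
import json

MAX_TEXT_BYTES = 10 * 1024         # plain text limit
MAX_STRUCTURED_BYTES = 100 * 1024  # whole-frame limit for structured data


def parse_frame(raw: str) -> dict:
    """Validate an incoming frame and return it with escaped content.

    Raises ValueError for oversized, malformed, or incomplete frames.
    """
    if len(raw.encode("utf-8")) > MAX_STRUCTURED_BYTES:
        raise ValueError("payload too large")
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("malformed JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("frame must be a JSON object")
    content = data.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content must be a non-empty string")
    if len(content.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("content too long")
    # Escape <, >, & and quotes so the text cannot be rendered as markup.
    data["content"] = html.escape(content)
    return data
```

Raising a single exception type lets the WebSocket handler turn every rejection into one error response path.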
## 10. Monitoring & Logging
### 10.1 Metrics
- [ ] 10.1.1 Track WebSocket metrics:
- [ ] Active connections count
- [ ] Messages per second
- [ ] Broadcast latency
- [ ] Error rates
- [ ] 10.1.2 Track message metrics:
- [ ] Messages per room
- [ ] Message types distribution
- [ ] Edit/delete rates
### 10.2 Logging
- [ ] 10.2.1 Log WebSocket events:
- [ ] Connection/disconnection
- [ ] Authentication failures
- [ ] Message errors
- [ ] 10.2.2 Log message operations:
- [ ] Message sent/edited/deleted
- [ ] Search queries
- [ ] Permission denials
## 11. Documentation
### 11.1 API Documentation
- [ ] 11.1.1 Document WebSocket protocol:
- [ ] Connection process
- [ ] Message formats
- [ ] Event types
- [ ] 11.1.2 Document REST endpoints:
- [ ] Request/response formats
- [ ] Error codes
- [ ] Rate limits
### 11.2 Integration Guide
- [ ] 11.2.1 Client integration examples:
- [ ] Python client code
- [ ] JavaScript/React examples
- [ ] Error handling patterns
- [ ] 11.2.2 Server deployment guide:
- [ ] WebSocket configuration
- [ ] Nginx proxy setup
- [ ] Scaling considerations