# file-storage Specification

## Purpose
TBD - created by archiving change add-file-upload-minio. Update Purpose after archive.

## Requirements

### Requirement: File Upload with Validation
The system SHALL accept multipart file uploads to incident rooms, validate file type and size, persist files to MinIO object storage with metadata tracking in PostgreSQL, AND create an associated message in the chat for context.

#### Scenario: Upload image to incident room
- **WHEN** a user with OWNER or EDITOR role uploads an image file via `POST /api/rooms/{room_id}/files`
  ```http
  POST /api/rooms/room-123/files
  Content-Type: multipart/form-data
  Authorization: Bearer {jwt_token}

  file: [binary data of defect.jpg, 2.5MB]
  description: "Defect found on product batch A-45"
  ```
- **THEN** the system SHALL:
  - Validate JWT token and extract user_id
  - Verify user is member of room-123 with OWNER or EDITOR role
  - Validate file MIME type is image/jpeg, image/png, or image/gif
  - Validate file size <= 10MB
  - Generate unique file_id (UUID)
  - Upload file to MinIO bucket `task-reporter-files` at path `room-123/images/{file_id}.jpg`
  - Create a message with `message_type=image_ref` containing file metadata
  - Create database record in `room_files` table with `message_id` reference
  - Return file metadata with presigned download URL (1-hour expiry) and message_id

#### Scenario: Upload document to incident room
- **WHEN** a user uploads a non-image file (PDF, log, etc.)
- **THEN** the system SHALL:
  - Create a message with `message_type=file_ref`
  - Store message_id in the file record
  - Return file metadata with message_id

#### Scenario: Reject oversized file upload
- **WHEN** a user attempts to upload a 25MB PDF file
- **THEN** the system SHALL:
  - Detect file size exceeds 20MB limit for documents
  - Return 413 Payload Too Large error
  - Include error message: "File size exceeds limit: 25MB > 20MB"
  - NOT upload file to MinIO
  - NOT create database record
  - NOT create chat message

#### Scenario: Reject unauthorized file type
- **WHEN** a user attempts to upload an executable file (e.g., .exe, .sh, .bat)
- **THEN** the system SHALL:
  - Detect MIME type not in whitelist
  - Return 400 Bad Request error
  - Include error message: "File type not allowed: application/x-msdownload"
  - NOT upload file to MinIO
  - NOT create database record

#### Scenario: Upload log file to incident room
- **WHEN** an engineer uploads a machine log file
  ```http
  POST /api/rooms/room-456/files
  Content-Type: multipart/form-data

  file: [machine_error.log, 1.2MB]
  description: "Equipment error log from 2025-11-17"
  ```
- **THEN** the system SHALL:
  - Validate MIME type is text/plain
  - Upload to MinIO at `room-456/logs/{file_id}.log`
  - Store metadata with file_type='log'
  - Create a message with message_type='file_ref'
  - Return success response with file_id and message_id

### Requirement: File Download with Access Control
The system SHALL generate time-limited presigned download URLs for files, enforcing room membership-based access control.
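
As a rough illustration of the access-control order this requirement implies, the Python sketch below checks room membership and soft-delete state before requesting a presigned URL. `FileRecord`, `resolve_download`, the injected `presign` callable, and the "File not found" message for a missing record are hypothetical and not defined by this specification. In a real service `presign` might wrap the MinIO Python SDK's `Minio.presigned_get_object` with a one-hour `expires` value.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

URL_EXPIRY = timedelta(hours=1)  # 1-hour expiry, as stated in the scenarios


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for a row of `room_files`."""
    file_id: str
    room_id: str
    object_path: str
    deleted_at: Optional[datetime] = None


def resolve_download(
    user_id: str,
    room_members: set[str],
    record: Optional[FileRecord],
    presign: Callable[[str, timedelta], str],
) -> tuple[int, dict]:
    """Return (HTTP status, JSON body) for a download request."""
    # Membership is checked first so non-members learn nothing about the file.
    if user_id not in room_members:
        return 403, {"error": "You are not a member of this room"}
    if record is None:
        # Assumption: the spec does not define this case explicitly.
        return 404, {"error": "File not found"}
    if record.deleted_at is not None:
        return 404, {"error": "File has been deleted"}
    # Only now is a presigned URL generated.
    url = presign(record.object_path, URL_EXPIRY)
    return 200, {"file_id": record.file_id, "download_url": url}
```

Passing the signer in as a callable keeps the decision logic testable without a running MinIO server.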
#### Scenario: Download file with valid membership
- **WHEN** a user who is a member of room-123 requests `GET /api/rooms/room-123/files/{file_id}`
- **THEN** the system SHALL:
  - Validate user is member of room-123
  - Retrieve file metadata from database
  - Generate MinIO presigned URL with 1-hour expiry
  - Return JSON response:
  ```json
  {
    "file_id": "550e8400-e29b-41d4-a716-446655440000",
    "filename": "defect.jpg",
    "file_type": "image",
    "file_size": 2621440,
    "download_url": "http://localhost:9000/task-reporter-files/room-123/images/...?X-Amz-Expires=3600",
    "uploaded_at": "2025-11-17T10:30:00Z",
    "uploader_id": "operator@panjit.com.tw"
  }
  ```

#### Scenario: Reject download for non-member
- **WHEN** a user who is NOT a member of room-123 requests a file from that room
- **THEN** the system SHALL:
  - Validate room membership
  - Return 403 Forbidden error
  - Include error message: "You are not a member of this room"
  - NOT generate presigned URL
  - Log unauthorized access attempt

#### Scenario: Download deleted file
- **WHEN** a user requests a file where `deleted_at` is NOT NULL
- **THEN** the system SHALL:
  - Return 404 Not Found error
  - Include error message: "File has been deleted"
  - NOT generate presigned URL

### Requirement: File Metadata Management
The system SHALL maintain comprehensive metadata for all uploaded files, support file listing with pagination, and implement soft delete for audit trail preservation.
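
The listing and soft-delete behaviour in this requirement can be sketched in plain Python as follows. This is only an illustration under stated assumptions: rows are modelled as dictionaries with `datetime` values for `uploaded_at`, and the function names `list_room_files` and `soft_delete` are hypothetical. A real implementation would presumably express the same filter, ordering, and limit/offset in SQL against `room_files`.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any, Optional


def list_room_files(
    rows: list[dict[str, Any]],
    room_id: str,
    limit: int = 20,
    offset: int = 0,
    file_type: Optional[str] = None,
) -> dict[str, Any]:
    """Build a paginated listing of the non-deleted files in one room."""
    visible = [
        r for r in rows
        if r["room_id"] == room_id
        and r.get("deleted_at") is None  # soft-deleted rows are hidden
        and (file_type is None or r["file_type"] == file_type)
    ]
    # Newest first, matching "Order by uploaded_at DESC".
    visible.sort(key=lambda r: r["uploaded_at"], reverse=True)
    page = visible[offset:offset + limit]
    return {
        "files": page,
        "total": len(visible),
        "limit": limit,
        "offset": offset,
        "has_more": offset + len(page) < len(visible),
    }


def soft_delete(row: dict[str, Any], now: datetime) -> None:
    """Mark a row as deleted; the stored object itself is left untouched."""
    row["deleted_at"] = now
```

Note that `total` counts only visible rows, so `has_more` stays consistent with what a client can actually page through.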
#### Scenario: List files in incident room
- **WHEN** a user requests `GET /api/rooms/room-123/files?limit=20&offset=0`
- **THEN** the system SHALL:
  - Validate user is room member
  - Query `room_files` table filtered by room_id
  - Exclude soft-deleted files (return only rows where `deleted_at IS NULL`)
  - Order by uploaded_at DESC
  - Return paginated response:
  ```json
  {
    "files": [
      {
        "file_id": "...",
        "filename": "defect.jpg",
        "file_type": "image",
        "file_size": 2621440,
        "uploaded_at": "2025-11-17T10:30:00Z",
        "uploader_id": "operator@panjit.com.tw",
        "description": "Defect found on product"
      }
    ],
    "total": 45,
    "limit": 20,
    "offset": 0,
    "has_more": true
  }
  ```

#### Scenario: Filter files by type
- **WHEN** a user requests `GET /api/rooms/room-123/files?file_type=image`
- **THEN** the system SHALL:
  - Filter results where file_type='image'
  - Return only image files
  - Include pagination metadata

#### Scenario: Soft delete file
- **WHEN** a file uploader or room OWNER requests `DELETE /api/rooms/room-123/files/{file_id}`
- **THEN** the system SHALL:
  - Validate user is uploader OR room OWNER
  - Set `deleted_at = NOW()` in database
  - NOT delete file from MinIO (preserve for audit)
  - Return 204 No Content
  - Broadcast file deletion event via WebSocket

#### Scenario: Prevent deletion by non-owner
- **WHEN** a user who is neither uploader nor room OWNER attempts DELETE
- **THEN** the system SHALL:
  - Return 403 Forbidden error
  - Include error message: "Only file uploader or room owner can delete files"
  - NOT modify database record

### Requirement: MinIO Integration and Connection Management
The system SHALL maintain a persistent connection pool to the MinIO server, handle connection failures gracefully, and support bucket initialization.
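
One way to read the retry behaviour under this requirement is an initial attempt followed by up to three retries with doubling delays. The sketch below assumes that interpretation. `TransientStorageError`, `upload_with_retry`, and the 0.5-second base delay are hypothetical choices for illustration, not values fixed by this specification.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientStorageError(Exception):
    """Hypothetical marker for retryable failures such as timeouts."""


def upload_with_retry(
    upload: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `upload`, retrying transient failures with exponential backoff."""
    attempt = 0
    while True:
        try:
            return upload()
        except TransientStorageError:
            if attempt >= max_retries:
                # Retries exhausted: the caller would map this to a 500 response.
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable so the backoff schedule can be verified in tests without real waiting.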
#### Scenario: Initialize MinIO connection on startup
- **WHEN** the FastAPI application starts
- **THEN** the system SHALL:
  - Read MinIO configuration from environment variables (MINIO_ENDPOINT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY)
  - Initialize Minio client instance
  - Check if bucket `task-reporter-files` exists
  - Create bucket if not exists with appropriate permissions
  - Log successful connection: "MinIO connected: localhost:9000"

#### Scenario: Handle MinIO connection failure
- **WHEN** MinIO service is unreachable during file upload
- **THEN** the system SHALL:
  - Catch connection exception
  - Return 503 Service Unavailable error
  - Include error message: "File storage service temporarily unavailable"
  - Log error with stack trace
  - NOT create database record

#### Scenario: Retry failed MinIO upload
- **WHEN** MinIO upload fails with transient error (e.g., timeout)
- **THEN** the system SHALL:
  - Retry upload up to 3 times with exponential backoff
  - On success after retry, proceed normally
  - On failure after 3 retries, return 500 Internal Server Error
  - Log retry attempts for debugging

### Requirement: Realtime File Upload Notifications
The system SHALL broadcast file upload events to all room members via WebSocket, enabling instant file availability notifications.

#### Scenario: Broadcast file upload to room members
- **WHEN** a file upload completes successfully
- **THEN** the system SHALL:
  - Retrieve all active WebSocket connections for the room
  - Broadcast message to all connected members:
  ```json
  {
    "type": "file_uploaded",
    "room_id": "room-123",
    "file": {
      "file_id": "550e8400-e29b-41d4-a716-446655440000",
      "filename": "defect.jpg",
      "file_type": "image",
      "file_size": 2621440,
      "uploader_id": "operator@panjit.com.tw",
      "uploaded_at": "2025-11-17T10:30:00Z",
      "thumbnail_url": "http://localhost:9000/task-reporter-files/room-123/images/..."
    },
    "timestamp": "2025-11-17T10:30:01Z"
  }
  ```
  - Connected clients SHALL update UI to display new file

#### Scenario: Send file upload acknowledgment to uploader
- **WHEN** file upload completes
- **THEN** the system SHALL:
  - Send personal WebSocket message to uploader:
  ```json
  {
    "type": "file_upload_ack",
    "file_id": "550e8400-e29b-41d4-a716-446655440000",
    "status": "success",
    "download_url": "http://localhost:9000/...",
    "timestamp": "2025-11-17T10:30:01Z"
  }
  ```

#### Scenario: Broadcast file deletion event
- **WHEN** a file is soft-deleted
- **THEN** the system SHALL:
  - Broadcast to all room members:
  ```json
  {
    "type": "file_deleted",
    "room_id": "room-123",
    "file_id": "550e8400-e29b-41d4-a716-446655440000",
    "deleted_by": "supervisor@panjit.com.tw",
    "timestamp": "2025-11-17T11:00:00Z"
  }
  ```
  - Connected clients SHALL remove file from UI

### Requirement: File Type Detection and Security
The system SHALL validate file types using MIME type detection (not just file extension), prevent malicious file uploads, and enforce strict content-type validation.

#### Scenario: Detect real MIME type regardless of extension
- **WHEN** a user uploads a file named "image.jpg" but actual content is PDF
- **THEN** the system SHALL:
  - Use `python-magic` to detect actual MIME type (application/pdf)
  - Reject upload with error: "File content does not match extension"
  - Log potential security violation
  - Return 400 Bad Request

#### Scenario: Allow only whitelisted file types
- **WHEN** validating uploaded file
- **THEN** the system SHALL:
  - Check MIME type against whitelist:
    - Images: image/jpeg, image/png, image/gif
    - Documents: application/pdf
    - Logs: text/plain, text/csv
  - Reject any MIME type not in whitelist
  - Return 400 Bad Request with specific error

#### Scenario: Prevent script file uploads
- **WHEN** a user attempts to upload .js, .sh, .bat, .exe, or other executable
- **THEN** the system SHALL:
  - Detect script/executable MIME type (application/x-sh, application/javascript, etc.)
  - Return 400 Bad Request error: "Executable files are not allowed"
  - NOT upload to MinIO
  - Log security event

### Requirement: File-Message Association
The system SHALL maintain a foreign key relationship between uploaded files and their associated chat messages, enabling contextual display of files in conversations.

#### Scenario: Query file with associated message
- **WHEN** a client requests file metadata via `GET /api/rooms/{room_id}/files/{file_id}`
- **THEN** the response SHALL include:
  ```json
  {
    "file_id": "550e8400-e29b-41d4-a716-446655440000",
    "message_id": "msg-789",
    "filename": "defect.jpg",
    "file_type": "image",
    "download_url": "...",
    "uploaded_at": "2025-12-08T10:30:00+08:00"
  }
  ```

#### Scenario: Delete file cascades to message
- **WHEN** a file is soft-deleted via `DELETE /api/rooms/{room_id}/files/{file_id}`
- **THEN** the system SHALL also soft-delete the associated message
- **AND** broadcast both `file_deleted` and `message_deleted` events

### Requirement: Image Thumbnail URLs
The system SHALL generate presigned URLs suitable for thumbnail display, allowing frontends to efficiently render image previews.

#### Scenario: File metadata includes thumbnail URL
- **WHEN** file metadata is returned for an image file
- **THEN** the response SHALL include a `thumbnail_url` field
- **AND** the URL SHALL be a presigned MinIO URL valid for 1 hour
- **AND** the frontend SHALL use CSS to constrain thumbnail display size

#### Scenario: Non-image files have no thumbnail
- **WHEN** file metadata is returned for a non-image file (PDF, log, etc.)
- **THEN** the response SHALL NOT include a `thumbnail_url` field
- **AND** the frontend SHALL display a file-type icon instead
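
The two thumbnail scenarios above amount to a conditional field in the metadata response, which might be sketched as follows. `build_file_metadata`, the `object_path` key, and the `presign` callable are hypothetical names. The sketch also assumes the thumbnail URL points at the original object, since the frontend is expected to constrain the display size with CSS.

```python
from __future__ import annotations

from typing import Any, Callable


def build_file_metadata(
    record: dict[str, Any],
    presign: Callable[[str], str],
) -> dict[str, Any]:
    """Build the metadata response, adding `thumbnail_url` only for images."""
    body = {
        "file_id": record["file_id"],
        "message_id": record["message_id"],
        "filename": record["filename"],
        "file_type": record["file_type"],
        "download_url": presign(record["object_path"]),
    }
    if record["file_type"] == "image":
        # Assumption: the thumbnail is the same object, presigned again.
        body["thumbnail_url"] = presign(record["object_path"])
    # For non-image files the key is omitted entirely rather than set to null.
    return body
```

Omitting the key, rather than returning `null`, matches the wording "SHALL NOT include a `thumbnail_url` field".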