# ai-report-generation Specification

## Purpose
TBD - created by archiving change add-ai-report-generation. Update Purpose after archive.

## Requirements
### Requirement: User Display Name Resolution
The system SHALL maintain a permanent `users` table to store user display names from AD authentication, enabling reports to show names instead of email addresses.

#### Scenario: Create user record on first login
- **GIVEN** user "ymirliu@panjit.com.tw" logs in for the first time
- **AND** the AD API returns userInfo with name "ymirliu 劉念蓉"
- **WHEN** authentication succeeds
- **THEN** the system SHALL create a new record in `users` table with:
  - user_id: "ymirliu@panjit.com.tw"
  - display_name: "ymirliu 劉念蓉"
  - office_location: "高雄" (from AD API)
  - job_title: null (from AD API)
  - last_login_at: current timestamp
  - created_at: current timestamp

#### Scenario: Update user record on subsequent login
- **GIVEN** user "ymirliu@panjit.com.tw" already exists in `users` table
- **AND** the user's display_name in AD has changed to "劉念蓉 Ymir"
- **WHEN** the user logs in again
- **THEN** the system SHALL update the existing record with:
  - display_name: "劉念蓉 Ymir"
  - last_login_at: current timestamp
- **AND** preserve the original created_at timestamp

#### Scenario: Resolve display name for report
- **GIVEN** a message was sent by "ymirliu@panjit.com.tw"
- **AND** the users table contains display_name "ymirliu 劉念蓉" for this user
- **WHEN** report data is collected
- **THEN** the system SHALL JOIN with users table
- **AND** return display_name "ymirliu 劉念蓉" instead of email address

#### Scenario: Handle unknown user gracefully
- **GIVEN** a message was sent by "olduser@panjit.com.tw"
- **AND** this user does not exist in the users table (never logged in to new system)
- **WHEN** report data is collected
- **THEN** the system SHALL use the email address as fallback display name
- **AND** format it as "olduser@panjit.com.tw" in the report

---

### Requirement: Report Data Collection
The
system SHALL collect all relevant room data for AI processing, including messages, members, files, and room metadata.

#### Scenario: Collect complete room data for report generation
- **GIVEN** an incident room with ID `room-123` exists
- **AND** the room has 50 messages from 5 members
- **AND** the room has 3 uploaded files (2 images, 1 PDF)
- **WHEN** the report data service collects room data
- **THEN** the system SHALL return a structured data object containing:
  - Room metadata (title, incident_type, severity, status, location, description, timestamps)
  - All 50 messages sorted by created_at ascending
  - All 5 members with their roles (owner, editor, viewer)
  - All 3 files with metadata (filename, type, uploader, upload time)
- **AND** messages SHALL include sender display name (not just user_id)
- **AND** file references in messages SHALL be annotated as "[附件: filename.ext]"

#### Scenario: Handle room with no messages
- **GIVEN** an incident room was just created with no messages
- **WHEN** report generation is requested
- **THEN** the system SHALL return an error indicating insufficient data for report generation
- **AND** the error message SHALL be "事件聊天室尚無訊息記錄,無法生成報告"

#### Scenario: Summarize large rooms exceeding message limit
- **GIVEN** an incident room has 500 messages spanning 5 days
- **AND** the REPORT_MAX_MESSAGES limit is 200
- **WHEN** report data is collected
- **THEN** the system SHALL keep the most recent 150 messages in full
- **AND** summarize older messages by day (e.g., "2025-12-01: 45 則訊息討論設備檢修")
- **AND** the total formatted content SHALL stay within token limits

---

### Requirement: DIFY AI Integration
The system SHALL integrate with DIFY Chat API to generate structured report content from collected room data.

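
As a rough illustration of this requirement, the sketch below builds the request that the scenarios in this section describe and parses the `answer` field with a regex fallback. It does not send anything over the network. The helper names `build_dify_request` and `parse_answer` are hypothetical, and placing the prompt in `query` with an empty `inputs` object is an assumption about how the DIFY app is configured.

```python
import json
import re


def build_dify_request(base_url: str, api_key: str, room_id: str, prompt: str) -> dict:
    """Describe the HTTP call without sending it (hypothetical helper)."""
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/chat-messages",
        "headers": {
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        "json": {
            "inputs": {},                 # assumption: no extra app variables
            "query": prompt,              # assumption: the prompt is sent as `query`
            "response_mode": "blocking",  # from the spec
            "user": room_id,              # from the spec: room_id used for tracking
        },
    }


def parse_answer(answer: str) -> dict:
    """Parse the `answer` text as JSON, falling back to regex extraction."""
    try:
        parsed = json.loads(answer)
    except json.JSONDecodeError:
        # Fallback: take everything from the first "{" to the last "}".
        match = re.search(r"\{.*\}", answer, re.DOTALL)
        if match is None:
            raise ValueError("no JSON object found in answer")
        parsed = json.loads(match.group(0))
    if not isinstance(parsed, dict):
        raise ValueError("answer is not a JSON object")
    return parsed
```

A caller would retry once with a simplified prompt when `parse_answer` raises `ValueError`, and store the raw response if the retry also fails.
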
#### Scenario: Successful report generation via DIFY
- **GIVEN** room data has been collected successfully
- **WHEN** the DIFY service is called with the formatted prompt
- **THEN** the system SHALL send a POST request to `{DIFY_BASE_URL}/chat-messages`
- **AND** include Authorization header with Bearer token
- **AND** set response_mode to "blocking"
- **AND** set user to the room_id for tracking
- **AND** parse the JSON from the `answer` field in the response
- **AND** validate the JSON structure matches expected schema

#### Scenario: DIFY returns invalid JSON
- **GIVEN** DIFY returns a response where `answer` is not valid JSON
- **WHEN** the system attempts to parse the response
- **THEN** the system SHALL attempt to extract JSON using regex patterns
- **AND** if extraction fails, retry the request once with a simplified prompt
- **AND** if retry fails, return error with status "failed" and store raw response for debugging

#### Scenario: DIFY API timeout
- **GIVEN** the DIFY API does not respond within DIFY_TIMEOUT_SECONDS (120s)
- **WHEN** the timeout is reached
- **THEN** the system SHALL cancel the request
- **AND** return error with message "AI 服務回應超時,請稍後再試"
- **AND** log the timeout event with room_id and request duration

#### Scenario: DIFY API authentication failure
- **GIVEN** the DIFY_API_KEY is invalid or expired
- **WHEN** the DIFY API returns 401 Unauthorized
- **THEN** the system SHALL return error with message "AI 服務認證失敗,請聯繫系統管理員"
- **AND** log the authentication failure (without exposing the key)

---

### Requirement: Document Assembly
The system SHALL assemble professional .docx documents from AI-generated content with embedded images from MinIO.

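
The assembly rules in this requirement can be sketched as two planning steps that run before any .docx library is involved: deciding which sections appear, and deciding what the attachments section shows. This is a minimal sketch, not the actual service. The function names, the `file_id`/`type` keys, and the location of `has_resolution` inside `final_resolution` are assumptions made for illustration.

```python
MISSING_IMAGE = "[圖片無法載入: {filename}]"  # placeholder text from the spec
NO_ATTACHMENTS = "本事件無附件檔案"            # shown when the room has no files


def plan_sections(report: dict) -> list:
    """Return section titles in document order (hypothetical helper)."""
    titles = ["事件摘要", "事件時間軸", "參與人員", "處理過程", "目前狀態"]
    # Assumption: has_resolution lives inside final_resolution.
    if report.get("final_resolution", {}).get("has_resolution"):
        titles.append("最終處置結果")
    titles.append("附件")
    return titles


def plan_attachments(files: list, available_ids: set) -> list:
    """Return one display line per file, with placeholders for missing images."""
    if not files:
        return [NO_ATTACHMENTS]
    lines = []
    for f in files:
        missing = f["type"] == "image" and f["file_id"] not in available_ids
        if missing:
            # Skip the image but keep assembling; the caller should log a warning.
            lines.append(MISSING_IMAGE.format(filename=f["filename"]))
        else:
            lines.append(f["filename"])
    return lines
```

Keeping these decisions in plain functions makes the skip-and-continue behavior for missing MinIO objects easy to test without generating a document.
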
#### Scenario: Generate complete report document
- **GIVEN** DIFY has returned valid JSON report content
- **AND** the room has 2 image attachments in MinIO
- **WHEN** the docx assembly service creates the document
- **THEN** the system SHALL create a .docx file with:
  - Report title: "生產線異常處理報告 - {room.title}"
  - Generation metadata: 生成時間, 事件編號, 生成者
  - Section 1: 事件摘要 (from AI summary.content)
  - Section 2: 事件時間軸 (formatted table from AI timeline.events)
  - Section 3: 參與人員 (formatted list from AI participants.members)
  - Section 4: 處理過程 (from AI resolution_process.content)
  - Section 5: 目前狀態 (from AI current_status)
  - Section 6: 最終處置結果 (from AI final_resolution, if has_resolution=true)
  - Section 7: 附件 (embedded images + file list)
- **AND** images SHALL be embedded at appropriate size (max width 15cm)
- **AND** document SHALL use professional formatting (標楷體 or similar)

#### Scenario: Handle missing images during assembly
- **GIVEN** a file reference exists in the database
- **BUT** the actual file is missing from MinIO
- **WHEN** the docx service attempts to embed the image
- **THEN** the system SHALL skip the missing image
- **AND** add a placeholder text: "[圖片無法載入: {filename}]"
- **AND** continue with document assembly
- **AND** log a warning with file_id and room_id

#### Scenario: Generate report for room without images
- **GIVEN** the room has no image attachments
- **WHEN** the docx assembly service creates the document
- **THEN** the system SHALL create a complete document without the embedded images section
- **AND** the attachments section SHALL show "本事件無附件檔案" if no files exist

---

### Requirement: Report Generation API
The system SHALL provide REST API endpoints for triggering report generation and downloading generated reports.

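
One detail of the download endpoint is worth sketching: report titles contain non-ASCII characters, and a raw `filename=` parameter is not reliably handled for such names. The sketch below shows one way to build the download headers, using the `filename*` form from RFC 6266 alongside an ASCII fallback. The helper name, the `report_` fallback prefix, and the `YYYYMMDD` date format are assumptions; the spec only gives the pattern `{report_title}_{date}.docx`.

```python
from datetime import date
from urllib.parse import quote

DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def download_headers(report_title: str, generated_on: date) -> dict:
    """Build response headers for a .docx download (hypothetical helper)."""
    stamp = generated_on.strftime("%Y%m%d")  # assumption: date format
    filename = report_title + "_" + stamp + ".docx"
    fallback = "report_" + stamp + ".docx"   # assumption: ASCII-only fallback name
    encoded = quote(filename, safe="")       # percent-encode as UTF-8
    disposition = (
        'attachment; filename="' + fallback + '"; '
        "filename*=UTF-8''" + encoded
    )
    return {"Content-Type": DOCX_MIME, "Content-Disposition": disposition}
```

Clients that understand `filename*` show the full title, while older ones fall back to the plain ASCII name.
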
#### Scenario: Trigger report generation
- **GIVEN** user "supervisor@company.com" is a member of room "room-123"
- **AND** the room status is "resolved" or "archived"
- **WHEN** the user sends `POST /api/rooms/room-123/reports/generate`
- **THEN** the system SHALL create a new report record with status "generating"
- **AND** return immediately with report_id and status
- **AND** process the report generation asynchronously
- **AND** update status to "completed" when done

#### Scenario: Generate report for active room
- **GIVEN** user requests report for a room with status "active"
- **WHEN** the request is processed
- **THEN** the system SHALL allow generation with a warning
- **AND** include note in report: "注意:本報告生成時事件尚未結案"

#### Scenario: Download generated report
- **GIVEN** a report with ID "report-456" has status "completed"
- **AND** the report belongs to room "room-123"
- **WHEN** user sends `GET /api/rooms/room-123/reports/report-456/download`
- **THEN** the system SHALL return the .docx file
- **AND** set Content-Type to "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
- **AND** set Content-Disposition to "attachment; filename={report_title}_{date}.docx"

#### Scenario: List room reports
- **GIVEN** room "room-123" has 3 previously generated reports
- **WHEN** user sends `GET /api/rooms/room-123/reports`
- **THEN** the system SHALL return a list of reports with:
  - report_id
  - generated_at
  - generated_by
  - status
  - report_title
- **AND** results SHALL be sorted by generated_at descending

#### Scenario: Unauthorized report access
- **GIVEN** user "outsider@company.com" is NOT a member of room "room-123"
- **WHEN** the user attempts to generate or download a report
- **THEN** the system SHALL return 403 Forbidden
- **AND** the error message SHALL be "您沒有此事件的存取權限"

---

### Requirement: Report Generation Status and Notifications
The system SHALL track report generation status and notify users of completion via WebSocket.

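
The status stages named in this requirement can be modeled as a small linear state machine. The sketch below is one possible reading, not the actual implementation: it assumes stages advance strictly in order, that "failed" is reachable from any stage that is not yet finished, and that "completed" and "failed" are terminal. The name `next_status` is hypothetical.

```python
# Happy-path stages in the order listed by the spec.
STAGES = [
    "pending",
    "collecting_data",
    "generating_content",
    "assembling_document",
    "completed",
]
TERMINAL = {"completed", "failed"}


def next_status(current: str, failed: bool = False) -> str:
    """Return the status that follows `current` (hypothetical helper)."""
    if current in TERMINAL:
        raise ValueError("no transition out of terminal status: " + current)
    if failed:
        return "failed"
    # list.index raises ValueError for an unknown status, which is the desired outcome.
    return STAGES[STAGES.index(current) + 1]
```

Centralizing transitions in one function keeps the stored status and the WebSocket notifications from drifting apart.
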
#### Scenario: Track report generation progress
- **GIVEN** a report generation has been triggered
- **WHEN** the generation process runs
- **THEN** the system SHALL update report status through stages:
  - "pending" → initial state
  - "collecting_data" → gathering room data
  - "generating_content" → calling DIFY API
  - "assembling_document" → creating .docx
  - "completed" → finished successfully
  - "failed" → error occurred

#### Scenario: Notify via WebSocket on completion
- **GIVEN** user is connected to room WebSocket
- **AND** report generation completes successfully
- **WHEN** the status changes to "completed"
- **THEN** the system SHALL broadcast to room members:
  ```json
  {
    "type": "report_generated",
    "report_id": "report-456",
    "report_title": "生產線異常處理報告",
    "generated_by": "supervisor@company.com",
    "generated_at": "2025-12-04T16:30:00+08:00"
  }
  ```

#### Scenario: Notify on generation failure
- **GIVEN** report generation fails
- **WHEN** the status changes to "failed"
- **THEN** the system SHALL broadcast to the user who triggered generation:
  ```json
  {
    "type": "report_generation_failed",
    "report_id": "report-456",
    "error": "AI 服務回應超時,請稍後再試"
  }
  ```
- **AND** the error message SHALL be user-friendly (no technical details)
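
To illustrate the "user-friendly (no technical details)" rule, the sketch below maps internal exceptions to the user-facing messages defined elsewhere in this specification before building the failure payload. It is only a sketch: using `TimeoutError` and `PermissionError` as stand-ins for the service's own error types is an assumption, and `FALLBACK_MESSAGE` is a hypothetical string that does not appear in the specification.

```python
import json

TIMEOUT_MESSAGE = "AI 服務回應超時,請稍後再試"      # from the DIFY timeout scenario
AUTH_MESSAGE = "AI 服務認證失敗,請聯繫系統管理員"  # from the DIFY 401 scenario
FALLBACK_MESSAGE = "報告生成失敗,請稍後再試"        # hypothetical, not in the spec


def user_facing_error(exc: Exception) -> str:
    """Choose a message that hides technical details (hypothetical helper)."""
    if isinstance(exc, TimeoutError):
        return TIMEOUT_MESSAGE
    if isinstance(exc, PermissionError):
        return AUTH_MESSAGE
    return FALLBACK_MESSAGE


def failure_event(report_id: str, exc: Exception) -> str:
    """Serialize the WebSocket payload sent to the triggering user."""
    payload = {
        "type": "report_generation_failed",
        "report_id": report_id,
        "error": user_facing_error(exc),  # never str(exc)
    }
    return json.dumps(payload, ensure_ascii=False)
```

The original exception would still be logged server-side with the room_id, so debugging detail is kept without reaching the client.
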