Task_Reporter/openspec/specs/ai-report-generation/spec.md
egg 3927441103 feat: Add AI report generation with DIFY integration
- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 18:32:40 +08:00

ai-report-generation Specification

Purpose

TBD - created by archiving change add-ai-report-generation. Update Purpose after archive.

Requirements

Requirement: User Display Name Resolution

The system SHALL maintain a permanent users table to store user display names from AD authentication, enabling reports to show names instead of email addresses.

Scenario: Create user record on first login

  • GIVEN user "ymirliu@panjit.com.tw" logs in for the first time
  • AND the AD API returns userInfo with name "ymirliu 劉念蓉"
  • WHEN authentication succeeds
  • THEN the system SHALL create a new record in the users table with:
    • user_id: "ymirliu@panjit.com.tw"
    • display_name: "ymirliu 劉念蓉"
    • office_location: "高雄" (from AD API)
    • job_title: null (from AD API)
    • last_login_at: current timestamp
    • created_at: current timestamp

Scenario: Update user record on subsequent login

  • GIVEN user "ymirliu@panjit.com.tw" already exists in users table
  • AND the user's display_name in AD has changed to "劉念蓉 Ymir"
  • WHEN the user logs in again
  • THEN the system SHALL update the existing record with:
    • display_name: "劉念蓉 Ymir"
    • last_login_at: current timestamp
  • AND preserve the original created_at timestamp
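
The two login scenarios above amount to an upsert keyed on user_id. A minimal sketch using SQLite's `INSERT ... ON CONFLICT` follows. The column types, the `upsert_user` helper, and the choice of SQLite are assumptions for illustration, since the spec does not name a database.

```python
import sqlite3

# Hypothetical schema: column names come from the scenario, types are assumed.
SCHEMA = """
CREATE TABLE users (
    user_id         TEXT PRIMARY KEY,
    display_name    TEXT NOT NULL,
    office_location TEXT,
    job_title       TEXT,
    last_login_at   TEXT NOT NULL,
    created_at      TEXT NOT NULL
)
"""

# Requires SQLite 3.24 or newer for ON CONFLICT ... DO UPDATE.
UPSERT = """
INSERT INTO users (user_id, display_name, office_location, job_title,
                   last_login_at, created_at)
VALUES (:user_id, :display_name, :office_location, :job_title, :now, :now)
ON CONFLICT(user_id) DO UPDATE SET
    display_name  = excluded.display_name,
    last_login_at = excluded.last_login_at
"""


def upsert_user(conn, user_info, now):
    """Create the user on first login, refresh it on later logins.

    created_at is absent from the UPDATE branch, so it is never overwritten.
    """
    conn.execute(UPSERT, {**user_info, "now": now})
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

first = {"user_id": "ymirliu@panjit.com.tw", "display_name": "ymirliu 劉念蓉",
         "office_location": "高雄", "job_title": None}
upsert_user(conn, first, "2025-12-04T09:00:00+08:00")

# Same user, display name changed in AD.
second = {**first, "display_name": "劉念蓉 Ymir"}
upsert_user(conn, second, "2025-12-05T09:00:00+08:00")

row = conn.execute(
    "SELECT display_name, last_login_at, created_at FROM users"
).fetchone()
```

After the second call, `row` holds the new display name and login time, while created_at still carries the first login's timestamp.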

Scenario: Resolve display name for report

  • GIVEN a message was sent by "ymirliu@panjit.com.tw"
  • AND the users table contains display_name "ymirliu 劉念蓉" for this user
  • WHEN report data is collected
  • THEN the system SHALL join the message data with the users table
  • AND return display_name "ymirliu 劉念蓉" instead of email address

Scenario: Handle unknown user gracefully

  • GIVEN a message was sent by "olduser@panjit.com.tw"
  • AND this user does not exist in the users table (never logged in to new system)
  • WHEN report data is collected
  • THEN the system SHALL use the email address as fallback display name
  • AND format it as "olduser@panjit.com.tw" in the report
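
The resolution and fallback scenarios can both be covered by one query: a LEFT JOIN keeps messages whose sender has no users row, and COALESCE substitutes the email address. This is a sketch only; the `messages` table and its `sender_id` column are hypothetical names, not taken from the spec.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id TEXT PRIMARY KEY, display_name TEXT NOT NULL);
-- Hypothetical messages table, reduced to the columns needed here.
CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    sender_id TEXT NOT NULL,
    content   TEXT NOT NULL
);
INSERT INTO users VALUES ('ymirliu@panjit.com.tw', 'ymirliu 劉念蓉');
INSERT INTO messages (sender_id, content) VALUES
    ('ymirliu@panjit.com.tw', 'first'),
    ('olduser@panjit.com.tw', 'second');
""")

# LEFT JOIN keeps the message from the unknown sender;
# COALESCE falls back to the email when display_name is NULL.
rows = conn.execute("""
SELECT COALESCE(u.display_name, m.sender_id) AS sender_name, m.content
FROM messages AS m
LEFT JOIN users AS u ON u.user_id = m.sender_id
ORDER BY m.id
""").fetchall()
```

An inner join would silently drop the second message, which is why the fallback scenario implies an outer join.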

Requirement: Report Data Collection

The system SHALL collect all relevant room data for AI processing, including messages, members, files, and room metadata.

Scenario: Collect complete room data for report generation

  • GIVEN an incident room with ID room-123 exists
  • AND the room has 50 messages from 5 members
  • AND the room has 3 uploaded files (2 images, 1 PDF)
  • WHEN the report data service collects room data
  • THEN the system SHALL return a structured data object containing:
    • Room metadata (title, incident_type, severity, status, location, description, timestamps)
    • All 50 messages sorted by created_at ascending
    • All 5 members with their roles (owner, editor, viewer)
    • All 3 files with metadata (filename, type, uploader, upload time)
  • AND messages SHALL include sender display name (not just user_id)
  • AND file references in messages SHALL be annotated as "[附件: filename.ext]" (附件 means "attachment")

Scenario: Handle room with no messages

  • GIVEN an incident room was just created with no messages
  • WHEN report generation is requested
  • THEN the system SHALL return an error indicating insufficient data for report generation
  • AND the error message SHALL be "事件聊天室尚無訊息記錄,無法生成報告" ("The incident room has no message records yet; a report cannot be generated")

Scenario: Summarize large rooms exceeding message limit

  • GIVEN an incident room has 500 messages spanning 5 days
  • AND the REPORT_MAX_MESSAGES limit is 200
  • WHEN report data is collected
  • THEN the system SHALL keep the most recent 150 messages in full
  • AND summarize older messages by day (e.g., "2025-12-01: 45 則訊息討論設備檢修", meaning "2025-12-01: 45 messages discussing equipment maintenance")
  • AND the total formatted content SHALL stay within token limits
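
One way to read the large-room scenario is sketched below: when the room exceeds REPORT_MAX_MESSAGES, keep the newest 150 messages and collapse the rest into one line per calendar day. The function name and the message dict shape are assumptions, and only the per-day count is produced here; the topic text shown in the scenario's example would need a separate summarization step that the spec does not describe.

```python
from collections import Counter
from datetime import datetime, timedelta

REPORT_MAX_MESSAGES = 200  # limit named in the scenario
KEEP_RECENT = 150          # number of messages kept in full, per the scenario


def split_for_report(messages, max_messages=REPORT_MAX_MESSAGES,
                     keep_recent=KEEP_RECENT):
    """Return (daily_summaries, recent_messages).

    `messages` must already be sorted by created_at ascending.
    """
    if len(messages) <= max_messages:
        return [], list(messages)
    cut = len(messages) - keep_recent
    older, recent = messages[:cut], messages[cut:]
    per_day = Counter(m["created_at"].date().isoformat() for m in older)
    summaries = [f"{day}: {count} 則訊息"
                 for day, count in sorted(per_day.items())]
    return summaries, recent


# 500 messages over 5 days, 100 per day, one minute apart.
start = datetime(2025, 12, 1, 8, 0)
messages = [
    {"created_at": start + timedelta(days=i // 100, minutes=i % 100),
     "content": f"message {i}"}
    for i in range(500)
]
summaries, recent = split_for_report(messages)
```

With this input the 350 older messages cover three full days plus half of the fourth, so four summary lines are produced alongside the 150 recent messages.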

Requirement: DIFY AI Integration

The system SHALL integrate with DIFY Chat API to generate structured report content from collected room data.

Scenario: Successful report generation via DIFY

  • GIVEN room data has been collected successfully
  • WHEN the DIFY service is called with the formatted prompt
  • THEN the system SHALL send a POST request to {DIFY_BASE_URL}/chat-messages
  • AND include Authorization header with Bearer token
  • AND set response_mode to "blocking"
  • AND set user to the room_id for tracking
  • AND parse the JSON from the answer field in the response
  • AND validate the JSON structure matches expected schema
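
The request described above can be sketched with the standard library alone. The `response_mode`, `user`, and `answer` fields come from the scenario; the `query` and `inputs` body fields reflect my understanding of the Dify chat API and should be checked against its documentation. The helper names and the example base URL are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_dify_request(base_url, api_key, room_id, prompt):
    """Build (but do not send) the POST request for {DIFY_BASE_URL}/chat-messages."""
    body = {
        "inputs": {},                 # assumed: no app-level input variables
        "query": prompt,              # assumed field name for the prompt text
        "response_mode": "blocking",  # from the scenario
        "user": room_id,              # from the scenario: room_id for tracking
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat-messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_dify_answer(response_body):
    """Decode the response envelope, then the JSON carried in its answer field."""
    envelope = json.loads(response_body)
    return json.loads(envelope["answer"])


req = build_dify_request("https://dify.example.com/v1/", "secret-key",
                         "room-123", "formatted prompt")
```

Sending would be a call such as `urllib.request.urlopen(req, timeout=120)`, with the timeout taken from DIFY_TIMEOUT_SECONDS.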

Scenario: DIFY returns invalid JSON

  • GIVEN DIFY returns a response where answer is not valid JSON
  • WHEN the system attempts to parse the response
  • THEN the system SHALL attempt to extract JSON using regex patterns
  • AND if extraction fails, retry the request once with a simplified prompt
  • AND if retry fails, return error with status "failed" and store raw response for debugging
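
The spec does not say which regex patterns are used for extraction. The sketch below tries three candidates in order: the raw answer, the contents of a Markdown code fence, and the span from the first `{` to the last `}`. Both patterns and the `extract_json` name are assumptions.

```python
import json
import re

# `{3} matches a run of three backticks, i.e. a Markdown code fence.
FENCE_RE = re.compile(r"`{3}(?:json)?\s*(\{.*?\})\s*`{3}", re.DOTALL)
# Greedy match from the first "{" to the last "}".
BRACES_RE = re.compile(r"\{.*\}", re.DOTALL)


def extract_json(answer):
    """Return the first candidate that parses to a JSON object, else None."""
    candidates = [answer]
    candidates += FENCE_RE.findall(answer)
    match = BRACES_RE.search(answer)
    if match:
        candidates.append(match.group(0))
    for text in candidates:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

A `None` result is the point at which the scenario's single retry with a simplified prompt would be triggered.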

Scenario: DIFY API timeout

  • GIVEN the DIFY API does not respond within DIFY_TIMEOUT_SECONDS (120s)
  • WHEN the timeout is reached
  • THEN the system SHALL cancel the request
  • AND return error with message "AI 服務回應超時,請稍後再試" ("The AI service timed out; please try again later")
  • AND log the timeout event with room_id and request duration

Scenario: DIFY API authentication failure

  • GIVEN the DIFY_API_KEY is invalid or expired
  • WHEN the DIFY API returns 401 Unauthorized
  • THEN the system SHALL return error with message "AI 服務認證失敗,請聯繫系統管理員" ("AI service authentication failed; please contact the system administrator")
  • AND log the authentication failure (without exposing the key)
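
The timeout and authentication scenarios both map a low-level failure to a fixed user-facing message plus a log entry. A sketch of that mapping follows; the function name, its parameters, and the logger name are assumptions. The API key is not an argument, so it cannot end up in the log.

```python
import logging

logger = logging.getLogger("report.dify")  # logger name is an assumption

TIMEOUT_MESSAGE = "AI 服務回應超時,請稍後再試"
AUTH_MESSAGE = "AI 服務認證失敗,請聯繫系統管理員"


def classify_failure(*, timed_out, status_code, room_id, duration_seconds):
    """Map a failed DIFY call to the user-facing message from the scenarios.

    Returns None for failures that the two scenarios do not cover.
    """
    if timed_out:
        logger.warning("DIFY timeout: room_id=%s duration=%.1fs",
                       room_id, duration_seconds)
        return TIMEOUT_MESSAGE
    if status_code == 401:
        logger.error("DIFY authentication failed: room_id=%s", room_id)
        return AUTH_MESSAGE
    return None
```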

Requirement: Document Assembly

The system SHALL assemble professional .docx documents from AI-generated content with embedded images from MinIO.

Scenario: Generate complete report document

  • GIVEN DIFY has returned valid JSON report content
  • AND the room has 2 image attachments in MinIO
  • WHEN the docx assembly service creates the document
  • THEN the system SHALL create a .docx file with:
    • Report title: "生產線異常處理報告 - {room.title}" ("Production Line Incident Handling Report - {room.title}")
    • Generation metadata: 生成時間 (generation time), 事件編號 (incident number), 生成者 (generated by)
    • Section 1: 事件摘要 (Summary, from AI summary.content)
    • Section 2: 事件時間軸 (Timeline, formatted table from AI timeline.events)
    • Section 3: 參與人員 (Participants, formatted list from AI participants.members)
    • Section 4: 處理過程 (Resolution Process, from AI resolution_process.content)
    • Section 5: 目前狀態 (Current Status, from AI current_status)
    • Section 6: 最終處置結果 (Final Resolution, from AI final_resolution, if has_resolution=true)
    • Section 7: 附件 (Attachments: embedded images + file list)
  • AND images SHALL be embedded at appropriate size (max width 15cm)
  • AND document SHALL use professional formatting (標楷體 or similar)
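
Before any .docx library is involved, the section list above can be derived from the AI JSON as plain data. The sketch below does that for sections 1 to 6 and omits the final resolution when has_resolution is false. The exact JSON shape is an assumption inferred from the field paths in the list; in particular, placing has_resolution inside final_resolution is a guess.

```python
TITLE_PREFIX = "生產線異常處理報告"


def report_title(room_title):
    return f"{TITLE_PREFIX} - {room_title}"


def plan_sections(report):
    """Return (heading, content) pairs for sections 1-6, in document order."""
    sections = [
        ("事件摘要", report["summary"]["content"]),
        ("事件時間軸", report["timeline"]["events"]),
        ("參與人員", report["participants"]["members"]),
        ("處理過程", report["resolution_process"]["content"]),
        ("目前狀態", report["current_status"]),
    ]
    final = report.get("final_resolution") or {}
    # Assumed location of the flag: final_resolution.has_resolution.
    if final.get("has_resolution"):
        sections.append(("最終處置結果", final))
    return sections


example = {
    "summary": {"content": "..."},
    "timeline": {"events": []},
    "participants": {"members": []},
    "resolution_process": {"content": "..."},
    "current_status": {"content": "..."},
    "final_resolution": {"has_resolution": False},
}
headings = [heading for heading, _ in plan_sections(example)]
```

Keeping this step separate from rendering makes the conditional section easy to test without opening a document.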

Scenario: Handle missing images during assembly

  • GIVEN a file reference exists in the database
  • BUT the actual file is missing from MinIO
  • WHEN the docx service attempts to embed the image
  • THEN the system SHALL skip the missing image
  • AND add a placeholder text: "[圖片無法載入: {filename}]" ("[Image could not be loaded: {filename}]")
  • AND continue with document assembly
  • AND log a warning with file_id and room_id
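
The skip-and-continue behaviour can be isolated in a small helper that returns either the image bytes or the placeholder text. In this sketch `fetch` stands in for whatever call reads the object from MinIO, and `StorageObjectMissing` is a hypothetical exception; a real client raises its own error type.

```python
import logging

logger = logging.getLogger("report.docx")  # logger name is an assumption


class StorageObjectMissing(Exception):
    """Hypothetical error raised when the object is absent from storage."""


def load_image_or_placeholder(fetch, file_id, filename, room_id):
    """Return ("image", bytes) on success or ("text", placeholder) on a miss."""
    try:
        return ("image", fetch(file_id))
    except StorageObjectMissing:
        logger.warning("image missing: file_id=%s room_id=%s", file_id, room_id)
        return ("text", f"[圖片無法載入: {filename}]")


def fake_fetch(file_id):
    # Stand-in for object storage: only "file-1" exists.
    if file_id == "file-1":
        return b"\x89PNG..."
    raise StorageObjectMissing(file_id)


found = load_image_or_placeholder(fake_fetch, "file-1", "a.png", "room-123")
missing = load_image_or_placeholder(fake_fetch, "file-2", "b.png", "room-123")
```

Because the helper never raises for a missing object, the assembly loop can continue with the next attachment.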

Scenario: Generate report for room without images

  • GIVEN the room has no image attachments
  • WHEN the docx assembly service creates the document
  • THEN the system SHALL create a complete document without the embedded images section
  • AND the attachments section SHALL show "本事件無附件檔案" ("This incident has no attached files") if no files exist

Requirement: Report Generation API

The system SHALL provide REST API endpoints for triggering report generation and downloading generated reports.

Scenario: Trigger report generation

  • GIVEN user "supervisor@company.com" is a member of room "room-123"
  • AND the room status is "resolved" or "archived"
  • WHEN the user sends POST /api/rooms/room-123/reports/generate
  • THEN the system SHALL create a new report record with status "generating"
  • AND return immediately with report_id and status
  • AND process the report generation asynchronously
  • AND update status to "completed" when done

Scenario: Generate report for active room

  • GIVEN a user requests a report for a room with status "active"
  • WHEN the request is processed
  • THEN the system SHALL allow generation with a warning
  • AND include a note in the report: "注意:本報告生成時事件尚未結案" ("Note: the incident had not been closed when this report was generated")

Scenario: Download generated report

  • GIVEN a report with ID "report-456" has status "completed"
  • AND the report belongs to room "room-123"
  • WHEN user sends GET /api/rooms/room-123/reports/report-456/download
  • THEN the system SHALL return the .docx file
  • AND set Content-Type to "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
  • AND set Content-Disposition to "attachment; filename={report_title}_{date}.docx"
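
Report titles contain Chinese characters, and non-ASCII text in a bare `filename=` parameter is handled inconsistently by HTTP clients. A common approach, sketched here under that assumption, is to send an ASCII fallback plus the RFC 6266 `filename*` form with percent-encoded UTF-8. The helper name and the fallback name `report_{date}.docx` are hypothetical and not part of the spec.

```python
from urllib.parse import quote

DOCX_MIME = ("application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document")


def download_headers(report_title, date_str):
    """Build the response headers for a .docx download."""
    filename = f"{report_title}_{date_str}.docx"
    fallback = f"report_{date_str}.docx"  # assumed ASCII-only fallback name
    encoded = quote(filename, safe="")    # percent-encode as UTF-8
    return {
        "Content-Type": DOCX_MIME,
        "Content-Disposition": (
            f'attachment; filename="{fallback}"; '
            f"filename*=UTF-8''{encoded}"
        ),
    }


headers = download_headers("報告", "2025-12-04")
```

Clients that understand `filename*` use the full title, while older ones fall back to the ASCII name.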

Scenario: List room reports

  • GIVEN room "room-123" has 3 previously generated reports
  • WHEN user sends GET /api/rooms/room-123/reports
  • THEN the system SHALL return a list of reports with:
    • report_id
    • generated_at
    • generated_by
    • status
    • report_title
  • AND results SHALL be sorted by generated_at descending

Scenario: Unauthorized report access

  • GIVEN user "outsider@company.com" is NOT a member of room "room-123"
  • WHEN the user attempts to generate or download a report
  • THEN the system SHALL return 403 Forbidden
  • AND the error message SHALL be "您沒有此事件的存取權限" ("You do not have access to this incident")

Requirement: Report Generation Status and Notifications

The system SHALL track report generation status and notify users of completion via WebSocket.

Scenario: Track report generation progress

  • GIVEN a report generation has been triggered
  • WHEN the generation process runs
  • THEN the system SHALL update report status through stages:
    • "pending" → initial state
    • "collecting_data" → gathering room data
    • "generating_content" → calling DIFY API
    • "assembling_document" → creating .docx
    • "completed" → finished successfully
    • "failed" → error occurred (reachable from any earlier stage)
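
The stages above form a simple linear state machine with one failure exit. A sketch of a transition check follows; the enum, the `can_transition` helper, and the rule that "failed" may be entered from any non-terminal stage are my reading of the list, not something the spec states outright.

```python
from enum import Enum


class ReportStatus(str, Enum):
    PENDING = "pending"
    COLLECTING_DATA = "collecting_data"
    GENERATING_CONTENT = "generating_content"
    ASSEMBLING_DOCUMENT = "assembling_document"
    COMPLETED = "completed"
    FAILED = "failed"


# Happy-path order, as listed in the scenario.
_ORDER = [
    ReportStatus.PENDING,
    ReportStatus.COLLECTING_DATA,
    ReportStatus.GENERATING_CONTENT,
    ReportStatus.ASSEMBLING_DOCUMENT,
    ReportStatus.COMPLETED,
]
_TERMINAL = {ReportStatus.COMPLETED, ReportStatus.FAILED}


def can_transition(current, new):
    """Allow one step forward, or a jump to FAILED from a non-terminal stage."""
    if current in _TERMINAL:
        return False
    if new is ReportStatus.FAILED:
        return True
    return _ORDER.index(new) == _ORDER.index(current) + 1
```

Rejecting skipped stages means each progress update a client sees corresponds to work that has started.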

Scenario: Notify via WebSocket on completion

  • GIVEN a user is connected to the room WebSocket
  • AND report generation completes successfully
  • WHEN the status changes to "completed"
  • THEN the system SHALL broadcast to room members:
    {
      "type": "report_generated",
      "report_id": "report-456",
      "report_title": "生產線異常處理報告",
      "generated_by": "supervisor@company.com",
      "generated_at": "2025-12-04T16:30:00+08:00"
    }
    

Scenario: Notify on generation failure

  • GIVEN report generation fails
  • WHEN the status changes to "failed"
  • THEN the system SHALL send to the user who triggered generation:
    {
      "type": "report_generation_failed",
      "report_id": "report-456",
      "error": "AI 服務回應超時,請稍後再試"
    }
    
  • AND the error message SHALL be user-friendly (no technical details)
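
The two notification scenarios differ in payload and in audience: success goes to every room member, failure only to the user who triggered generation. The sketch below builds both payloads and picks the recipients. The function names are hypothetical, and the actual WebSocket send is left out.

```python
import json
from datetime import datetime, timedelta, timezone


def report_generated_event(report_id, report_title, generated_by, generated_at):
    return {
        "type": "report_generated",
        "report_id": report_id,
        "report_title": report_title,
        "generated_by": generated_by,
        "generated_at": generated_at.isoformat(),
    }


def report_failed_event(report_id, user_message):
    # user_message must already be the user-friendly text, not a stack trace.
    return {
        "type": "report_generation_failed",
        "report_id": report_id,
        "error": user_message,
    }


def recipients(event, room_members, triggered_by):
    """Success goes to all room members, failure only to the triggering user."""
    if event["type"] == "report_generated":
        return list(room_members)
    return [triggered_by]


tz = timezone(timedelta(hours=8))
ok_event = report_generated_event(
    "report-456", "生產線異常處理報告", "supervisor@company.com",
    datetime(2025, 12, 4, 16, 30, tzinfo=tz),
)
# ensure_ascii=False keeps the Chinese title readable on the wire.
wire = json.dumps(ok_event, ensure_ascii=False)
fail_event = report_failed_event("report-456", "AI 服務回應超時,請稍後再試")
```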