Task_Reporter/openspec/changes/archive/2025-12-04-add-ai-report-generation/proposal.md
egg 3927441103 feat: Add AI report generation with DIFY integration
- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 18:32:40 +08:00

Change: Add AI Report Generation with DIFY

Why

The Task Reporter system currently supports real-time incident communication, file uploads, and chat room management. However, after an incident is resolved, operators must manually compile reports from scattered chat messages, which is time-consuming and error-prone. According to the project specification, automated report generation is a core business value:

"The system uses on-premise AI to automatically generate professional .docx reports with timelines and embedded evidence."

Without this capability:

  • Report compilation takes hours instead of minutes
  • Important details may be missed in manual transcription
  • No standardized report format across incidents
  • Difficult to maintain audit trails for compliance

This change integrates the DIFY AI service to distill chat room conversations into structured incident reports with embedded images and a list of file attachments.

What Changes

This proposal adds a new ai-report-generation capability that:

  1. Integrates DIFY Chat API for AI-powered content generation
  2. Collects room data (messages, members, files, metadata) for AI processing
  3. Generates structured JSON reports using carefully crafted prompts with examples
  4. Assembles .docx documents using python-docx with embedded images from MinIO
  5. Provides REST API endpoints for report generation and download
  6. Adds WebSocket notifications for report generation progress

Integration Details

| Component       | Value                                 |
|-----------------|---------------------------------------|
| DIFY Base URL   | https://dify.theaken.com/v1           |
| DIFY Endpoint   | /chat-messages (Chat Flow)            |
| Response Mode   | blocking (wait for complete response) |
| AI Output       | JSON format with predefined structure |
| Document Format | .docx (Microsoft Word)                |
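
To make the integration settings concrete, here is a minimal sketch of how a blocking /chat-messages request could be assembled. The function names are hypothetical, the request fields follow the public DIFY chat API as commonly documented, and reading the report from the `answer` field is an assumption about how the Chat Flow is configured.

```python
import json
from typing import Any

DIFY_BASE_URL = "https://dify.theaken.com/v1"  # base URL from the integration details


def build_dify_chat_request(prompt: str, user_id: str, api_key: str) -> dict[str, Any]:
    """Return the URL, headers, and JSON body for a blocking /chat-messages call."""
    return {
        "url": f"{DIFY_BASE_URL}/chat-messages",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "inputs": {},
            "query": prompt,
            "response_mode": "blocking",  # wait for the complete response
            "conversation_id": "",        # empty string starts a new conversation
            "user": user_id,
        },
    }


def extract_report_json(response_body: dict[str, Any]) -> dict[str, Any]:
    """Parse the report from the `answer` field (assumed to hold a JSON string)."""
    return json.loads(response_body["answer"])
```

With httpx, the request could then be sent as `httpx.post(req["url"], headers=req["headers"], json=req["json"], timeout=120)`; the timeout value is illustrative only.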

Report Sections (Generated by AI)

  • 事件摘要 (Event Summary)
  • 時間軸 (Timeline)
  • 參與人員 (Participants)
  • 處理過程 (Resolution Process)
  • 目前狀態 (Current Status)
  • 最終處置結果 (Final Resolution - if available)

File/Image Handling Strategy

  • AI does NOT receive files/images (avoids DIFY's complex file upload flow)
  • Files are mentioned in text: "[附件: filename.jpg]" annotations in messages
  • Images embedded by Python: Downloaded from MinIO and inserted into .docx
  • File attachments section: Listed with metadata at end of report
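
The text-only annotation strategy above could look roughly like the following. This is a sketch under assumptions: the `ChatMessage` shape and the line layout are hypothetical, and only the "[附件: ...]" marker comes from this proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatMessage:
    """Hypothetical message shape; the real model may differ."""
    sender: str
    sent_at: str  # ISO 8601 timestamp, kept as text for simplicity
    text: str
    attachment_name: Optional[str] = None


def format_message_for_prompt(msg: ChatMessage) -> str:
    """Render one message as a prompt line, mentioning any attachment by name only."""
    line = f"[{msg.sent_at}] {msg.sender}: {msg.text}".rstrip()
    if msg.attachment_name:
        # The file itself is never sent to the AI, only this annotation.
        line += f" [附件: {msg.attachment_name}]"
    return line
```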

User Display Name Resolution

The system needs to display user names (e.g., "劉念蓉") in reports instead of email addresses. Since user_sessions data is temporary, we add a permanent users table:

CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,  -- email address
    display_name VARCHAR(255) NOT NULL,
    office_location VARCHAR(100),
    job_title VARCHAR(100),
    last_login_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW()
);
  • Populated on login: Auth module creates/updates user record from AD API response
  • Used in reports: JOIN with messages/room_members to get display names
  • Permanent storage: User info persists even after session expires
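
As an illustration of the "populated on login" step, the sketch below upserts a user record. It uses SQLite from the standard library as a stand-in so that it runs anywhere; the production database, the `upsert_user` helper, and the profile keys are assumptions, and the demo schema uses CURRENT_TIMESTAMP because SQLite has no NOW().

```python
import sqlite3

# Demo copy of the users table, adapted for SQLite (assumption: stand-in only).
DEMO_SCHEMA = """
CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,
    display_name VARCHAR(255) NOT NULL,
    office_location VARCHAR(100),
    job_title VARCHAR(100),
    last_login_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

UPSERT_USER = """
INSERT INTO users (user_id, display_name, office_location, job_title, last_login_at)
VALUES (:user_id, :display_name, :office_location, :job_title, CURRENT_TIMESTAMP)
ON CONFLICT (user_id) DO UPDATE SET
    display_name = excluded.display_name,
    office_location = excluded.office_location,
    job_title = excluded.job_title,
    last_login_at = excluded.last_login_at
"""


def upsert_user(conn: sqlite3.Connection, profile: dict) -> None:
    """Create the user on first login, refresh profile fields on later logins."""
    conn.execute(UPSERT_USER, profile)
    conn.commit()
```

Because `created_at` is not part of the update clause, it keeps the time of the first login while `last_login_at` moves forward.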

Dependencies

  • Requires: authentication (user identity + users table), chat-room (room data), realtime-messaging (message history), file-storage (embedded images)
  • External: DIFY API service at https://dify.theaken.com/v1
  • Python packages: python-docx, httpx (async HTTP client)

Spec Deltas

  • ADDED ai-report-generation spec with 5 requirements covering data collection, AI integration, document assembly, API endpoints, and error handling

Risks

  • DIFY service must be accessible (network dependency)
  • AI may produce inconsistent JSON if prompts are not well-structured (mitigation: strict JSON schema + examples in prompt)
  • Large rooms with many messages may exceed token limits (mitigation: summarize older messages)

Impact

  • Affected specs: ai-report-generation (new capability)
  • Affected code:
    • Backend: New app/modules/report_generation/ module with:
      • Routes: POST /api/rooms/{room_id}/reports/generate, GET /api/rooms/{room_id}/reports/{report_id}, GET /api/rooms/{room_id}/reports/{report_id}/download
      • Services: DifyService, ReportDataService, DocxAssemblyService
      • Models: GeneratedReport (SQLAlchemy)
      • Schemas: Request/Response models
    • Config: New DIFY settings in app/core/config.py
    • Storage: Reports stored temporarily or in MinIO
  • Database: New generated_reports table for report metadata and status tracking

Scenarios

Happy Path: Generate Incident Report

  1. Supervisor opens resolved incident room
  2. Clicks "Generate Report" button
  3. Frontend calls POST /api/rooms/{room_id}/reports/generate
  4. Backend collects all messages, members, files, and room metadata
  5. Backend sends structured prompt to DIFY Chat API
  6. DIFY returns JSON report structure
  7. Backend parses JSON and validates structure
  8. Backend downloads images from MinIO
  9. Backend assembles .docx with python-docx
  10. Backend stores report and returns report_id
  11. Frontend downloads report via GET /api/rooms/{room_id}/reports/{report_id}/download
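
The backend steps of this happy path could be driven by a small orchestrator that also emits the progress notifications described under What Changes. The sketch below is an assumption about structure, not the actual module: the stage names, the `notify` callback, and the shared `context` dict are all hypothetical.

```python
from typing import Any, Callable

# A stage is a (name, step) pair; each step mutates the shared context.
Stage = tuple[str, Callable[[dict[str, Any]], None]]


def run_report_pipeline(
    context: dict[str, Any],
    stages: list[Stage],
    notify: Callable[[str, str], None],
) -> dict[str, Any]:
    """Run each stage in order, publishing its status before and after it runs."""
    for name, step in stages:
        notify(name, "started")
        try:
            step(context)
        except Exception:
            notify(name, "failed")  # let the client know which stage broke
            raise
        notify(name, "completed")
    return context
```

In a real service, `notify` would presumably push a WebSocket message to the room, and the stages would wrap data collection, the DIFY call, and .docx assembly.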

Edge Case: AI Returns Invalid JSON

  1. DIFY returns malformed JSON or missing fields
  2. Backend detects validation error
  3. Backend retries with simplified prompt (max 2 retries)
  4. If still failing, returns error with raw AI response for debugging
  5. Frontend displays error message to user
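
The validate-and-retry flow above might be sketched as follows. The required section keys, the `ask_ai` callback (which receives a flag requesting the simplified prompt), and the exception type are hypothetical; only the limit of 2 retries and the idea of returning the raw AI response come from this scenario.

```python
import json
from typing import Callable

# Hypothetical JSON keys for the mandatory report sections.
REQUIRED_SECTIONS = (
    "summary", "timeline", "participants", "resolution_process", "current_status",
)


class ReportGenerationError(Exception):
    """Raised when the AI output stays invalid; keeps the raw text for debugging."""

    def __init__(self, message: str, raw_response: str):
        super().__init__(message)
        self.raw_response = raw_response


def parse_report(raw: str) -> dict:
    """Parse and validate the AI output; raises ValueError on any problem."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("report must be a JSON object")
    missing = [key for key in REQUIRED_SECTIONS if key not in data]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return data


def generate_with_retry(ask_ai: Callable[[bool], str], max_retries: int = 2) -> dict:
    """Call the AI once, then retry with the simplified prompt up to max_retries times."""
    raw, last_error = "", None
    for attempt in range(max_retries + 1):
        raw = ask_ai(attempt > 0)  # True asks for the simplified prompt
        try:
            return parse_report(raw)
        except ValueError as exc:
            last_error = exc
    raise ReportGenerationError(
        f"invalid AI output after {max_retries} retries: {last_error}", raw
    )
```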

Edge Case: Large Room with Many Messages

  1. Room has 500+ messages spanning multiple days
  2. Backend detects message count exceeds threshold (e.g., 200)
  3. Backend summarizes older messages by day
  4. Sends condensed data to DIFY within token limits
  5. Report generation completes successfully
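
One way to condense a large room is sketched below. It keeps the most recent messages verbatim and collapses everything older into one digest line per day. The per-day digest is only a placeholder for whatever summarization the backend actually performs, and the tuple layout and function name are assumptions; the threshold of 200 is the example value from the scenario.

```python
from datetime import datetime

MESSAGE_THRESHOLD = 200  # example value from the scenario above

Message = tuple[datetime, str, str]  # (sent_at, sender, text), hypothetical shape


def condense_messages(messages: list[Message], threshold: int = MESSAGE_THRESHOLD) -> list[str]:
    """Return prompt lines: per-day digests for older messages, then recent ones verbatim.

    `messages` must be sorted oldest first.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    def fmt(msg: Message) -> str:
        sent_at, sender, text = msg
        return f"[{sent_at.isoformat(timespec='minutes')}] {sender}: {text}"

    if len(messages) <= threshold:
        return [fmt(m) for m in messages]

    older, recent = messages[:-threshold], messages[-threshold:]
    per_day: dict = {}
    for sent_at, sender, _ in older:
        info = per_day.setdefault(sent_at.date(), {"count": 0, "senders": set()})
        info["count"] += 1
        info["senders"].add(sender)

    digest = [
        f"{day.isoformat()}: {info['count']} earlier messages from {', '.join(sorted(info['senders']))}"
        for day, info in sorted(per_day.items())
    ]
    return digest + [fmt(m) for m in recent]
```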