
egg 3927441103 feat: Add AI report generation with DIFY integration

- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-12-04 18:32:40 +08:00

5.6 KiB


Change: Add AI Report Generation with DIFY

Why

The Task Reporter system currently supports real-time incident communication, file uploads, and chat room management. However, after an incident is resolved, operators must manually compile reports from scattered chat messages, which is time-consuming and error-prone. According to the project specification, automated report generation is a core business value:

"The system uses on-premise AI to automatically generate professional .docx reports with timelines and embedded evidence."

Without this capability:

Report compilation takes hours instead of minutes
Important details may be missed in manual transcription
No standardized report format across incidents
Difficult to maintain audit trails for compliance

This change integrates the DIFY AI service to distill chat room conversations into structured incident reports, with images embedded in the document and other file attachments listed alongside them.

What Changes

This proposal adds a new ai-report-generation capability that:

Integrates DIFY Chat API for AI-powered content generation
Collects room data (messages, members, files, metadata) for AI processing
Generates structured JSON reports using carefully crafted prompts with examples
Assembles .docx documents using python-docx with embedded images from MinIO
Provides REST API endpoints for report generation and download
Adds WebSocket notifications for report generation progress

Integration Details

| Component | Value |
| --- | --- |
| DIFY Base URL | `https://dify.theaken.com/v1` |
| DIFY Endpoint | `/chat-messages` (Chat Flow) |
| Response Mode | `blocking` (wait for complete response) |
| AI Output | JSON format with predefined structure |
| Document Format | `.docx` (Microsoft Word) |
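A minimal sketch of how one blocking call could be prepared from the values above. The body fields (`inputs`, `query`, `response_mode`, `conversation_id`, `user`) and the `answer` field of the response follow the Dify chat API as commonly documented, but should be checked against the deployed version. The function names and the empty `inputs` dict are assumptions, not part of this proposal.

```python
import json

DIFY_BASE_URL = "https://dify.theaken.com/v1"  # from Integration Details


def build_dify_request(api_key: str, prompt: str, user_id: str) -> dict:
    """Return the URL, headers, and JSON body for one blocking chat call."""
    return {
        "url": f"{DIFY_BASE_URL}/chat-messages",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "inputs": {},             # assumption: the Chat Flow defines no input variables
            "query": prompt,
            "response_mode": "blocking",
            "conversation_id": "",    # assumption: each report starts a fresh conversation
            "user": user_id,
        },
    }


def extract_report_json(response_body: dict) -> dict:
    """Parse the JSON report carried in the 'answer' field of a blocking response."""
    return json.loads(response_body["answer"])
```

With httpx, the request could then be sent as `await client.post(req["url"], headers=req["headers"], json=req["json"])` on an `httpx.AsyncClient`. Blocking mode holds the connection open until generation finishes, so a generous timeout is likely needed.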

Report Sections (Generated by AI)

事件摘要 (Event Summary)
時間軸 (Timeline)
參與人員 (Participants)
處理過程 (Resolution Process)
目前狀態 (Current Status)
最終處置結果 (Final Resolution - if available)

File/Image Handling Strategy

The AI does NOT receive files or images (this avoids DIFY's complex file upload flow)
Files are referenced in text: messages carry "[附件: filename.jpg]" (attachment) annotations
Images are embedded by Python: downloaded from MinIO and inserted into the .docx
File attachments section: listed with metadata at the end of the report
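The text-only annotation step could look roughly like this. The `ChatMessage` shape and both function names are hypothetical; only the "[附件: filename.jpg]" annotation format comes from this proposal.

```python
from dataclasses import dataclass, field


@dataclass
class ChatMessage:
    # Hypothetical shape; the real message model may differ.
    sender: str
    sent_at: str                                           # pre-formatted timestamp
    text: str
    attachments: list[str] = field(default_factory=list)   # file names only


def to_prompt_line(message: ChatMessage) -> str:
    """Render one message as text, appending one annotation per attached file."""
    notes = "".join(f" [附件: {name}]" for name in message.attachments)
    return f"[{message.sent_at}] {message.sender}: {message.text}{notes}"


def to_transcript(messages: list[ChatMessage]) -> str:
    """Join all messages into the plain-text transcript sent to the AI."""
    return "\n".join(to_prompt_line(m) for m in messages)
```

Because only file names reach the prompt, the file bytes never leave MinIO until the .docx is assembled.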

User Display Name Resolution

The system needs to display user names (e.g., "劉念蓉") in reports instead of email addresses. Since user_sessions data is temporary, we add a permanent users table:

CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,  -- email address
    display_name VARCHAR(255) NOT NULL,
    office_location VARCHAR(100),
    job_title VARCHAR(100),
    last_login_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW()
);

Populated on login: Auth module creates/updates user record from AD API response
Used in reports: JOIN with messages/room_members to get display names
Permanent storage: User info persists even after session expires
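The create-or-update behavior on login can be sketched as an upsert. This example uses an in-memory SQLite database as a stand-in (so `NOW()` becomes `CURRENT_TIMESTAMP`); the production database, the function names, and the fallback to the email address are assumptions.

```python
import sqlite3

# Same columns as the proposed table, adapted for SQLite.
SCHEMA = """
CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,
    display_name VARCHAR(255) NOT NULL,
    office_location VARCHAR(100),
    job_title VARCHAR(100),
    last_login_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

UPSERT = """
INSERT INTO users (user_id, display_name, office_location, job_title, last_login_at)
VALUES (:user_id, :display_name, :office_location, :job_title, CURRENT_TIMESTAMP)
ON CONFLICT(user_id) DO UPDATE SET
    display_name = excluded.display_name,
    office_location = excluded.office_location,
    job_title = excluded.job_title,
    last_login_at = excluded.last_login_at
"""


def record_login(conn: sqlite3.Connection, profile: dict) -> None:
    """Create the user row on first login and refresh it on later logins."""
    conn.execute(UPSERT, profile)
    conn.commit()


def display_name_for(conn: sqlite3.Connection, user_id: str) -> str:
    """Return the stored display name, or the email address if no row exists."""
    row = conn.execute(
        "SELECT display_name FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else user_id
```

The upsert leaves `created_at` untouched on later logins, so it keeps recording the first login.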

Dependencies

Requires: authentication (user identity + user table), chat-room (room data), realtime-messaging (message history), file-storage (embedded images)
External: DIFY API service at https://dify.theaken.com/v1
Python packages: python-docx, httpx (async HTTP client)

Spec Deltas

ADDED ai-report-generation spec with 5 requirements covering data collection, AI integration, document assembly, API endpoints, and error handling

Risks

DIFY service must be accessible (network dependency)
AI may produce inconsistent JSON if prompts are not well-structured (mitigation: strict JSON schema + examples in prompt)
Large rooms with many messages may exceed token limits (mitigation: summarize older messages)

Impact

Affected specs: ai-report-generation (new capability)
Affected code:
- Backend: New app/modules/report_generation/ module with:
  - Routes: POST /api/rooms/{room_id}/reports/generate, GET /api/rooms/{room_id}/reports/{report_id}, GET /api/rooms/{room_id}/reports/{report_id}/download
  - Services: DifyService, ReportDataService, DocxAssemblyService
  - Models: GeneratedReport (SQLAlchemy)
  - Schemas: Request/Response models
- Config: New DIFY settings in app/core/config.py
- Storage: Reports stored temporarily or in MinIO
Database: New generated_reports table for report metadata and status tracking

Scenarios

Happy Path: Generate Incident Report

Supervisor opens resolved incident room
Clicks "Generate Report" button
Frontend calls POST /api/rooms/{room_id}/reports/generate
Backend collects all messages, members, files, and room metadata
Backend sends structured prompt to DIFY Chat API
DIFY returns JSON report structure
Backend parses JSON and validates structure
Backend downloads images from MinIO
Backend assembles .docx with python-docx
Backend stores report and returns report_id
Frontend downloads report via GET /api/rooms/{room_id}/reports/{report_id}/download
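The backend steps above, combined with the WebSocket progress notifications, could be organized as a small staged pipeline. This is only a sketch: the stage names, the shared context dict, and the `notify` callback (standing in for a WebSocket broadcast) are all hypothetical.

```python
from typing import Callable

# Hypothetical stage names mirroring the backend steps in the happy path.
STAGES = ("collect", "generate", "validate", "fetch_images", "assemble", "store")


def run_pipeline(
    steps: dict[str, Callable[[dict], dict]],
    notify: Callable[[str, int], None],
) -> dict:
    """Run each stage in order, threading a context dict through them.

    After each stage finishes, notify() receives the stage name and the
    overall completion percentage.
    """
    context: dict = {}
    for index, name in enumerate(STAGES, start=1):
        context = steps[name](context)
        notify(name, round(100 * index / len(STAGES)))
    return context
```

Keeping the stages as injected callables lets the integration tests swap the DIFY and MinIO calls for fakes.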

Edge Case: AI Returns Invalid JSON

DIFY returns malformed JSON or missing fields
Backend detects validation error
Backend retries with simplified prompt (max 2 retries)
If still failing, returns error with raw AI response for debugging
Frontend displays error message to user
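A sketch of the validate-and-retry loop described above. The JSON key names are hypothetical placeholders for the report sections (the final resolution is treated as optional, since it is only present "if available"), and `ask` stands in for whatever function sends a prompt to DIFY and returns the raw answer text.

```python
import json
from typing import Callable

# Hypothetical key names for the required report sections.
REQUIRED_KEYS = ("summary", "timeline", "participants",
                 "resolution_process", "current_status")
MAX_RETRIES = 2  # from the scenario above


class ReportGenerationError(Exception):
    """Raised when no attempt yields a valid report; keeps the raw answer."""

    def __init__(self, message: str, raw_response: str):
        super().__init__(message)
        self.raw_response = raw_response


def parse_report(raw: str) -> dict:
    """Parse the AI answer and check that every required section is present."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value is not an object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return data


def generate_report(ask: Callable[[str], str], full_prompt: str,
                    simple_prompt: str) -> dict:
    """Try the full prompt once, then the simplified prompt up to MAX_RETRIES times."""
    raw = ""
    for prompt in [full_prompt] + [simple_prompt] * MAX_RETRIES:
        raw = ask(prompt)
        try:
            return parse_report(raw)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
    raise ReportGenerationError("AI returned invalid JSON after retries", raw)
```

The exception carries the last raw answer so the API layer can return it for debugging.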

Edge Case: Large Room with Many Messages

Room has 500+ messages spanning multiple days
Backend detects message count exceeds threshold (e.g., 200)
Backend summarizes older messages by day
Sends condensed data to DIFY within token limits
Report generation completes successfully
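One way the condensing step might work is to keep the most recent messages verbatim and collapse everything older into one digest line per day. The proposal does not say how each day is summarized, so the digest here (message count plus senders) is only a placeholder, and the function name and tuple layout are assumptions.

```python
from collections import defaultdict
from datetime import datetime

MESSAGE_THRESHOLD = 200  # example value from the scenario above


def condense(messages: list[tuple[datetime, str, str]],
             keep_recent: int = MESSAGE_THRESHOLD) -> list[str]:
    """Condense (sent_at, sender, text) tuples, sorted oldest first, into prompt lines."""
    def render(ts: datetime, who: str, text: str) -> str:
        return f"[{ts:%Y-%m-%d %H:%M}] {who}: {text}"

    # Small rooms are sent unchanged.
    if len(messages) <= MESSAGE_THRESHOLD:
        return [render(*m) for m in messages]

    split = max(0, len(messages) - keep_recent)
    older, recent = messages[:split], messages[split:]

    # Group the older messages by calendar day.
    senders_by_day = defaultdict(list)
    for ts, who, _ in older:
        senders_by_day[ts.date()].append(who)

    lines = []
    for day in sorted(senders_by_day):
        senders = senders_by_day[day]
        names = ", ".join(sorted(set(senders)))
        lines.append(f"[{day}] {len(senders)} messages from {names}")
    return lines + [render(*m) for m in recent]
```

A real implementation would probably budget by tokens rather than message count, since message lengths vary widely.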
