Task_Reporter/openspec/changes/archive/2025-12-04-add-ai-report-generation/design.md
egg 3927441103 feat: Add AI report generation with DIFY integration
- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 18:32:40 +08:00

Design: AI Report Generation Architecture

Overview

This document describes the architecture for integrating the DIFY AI service to generate incident reports from chat room data.

System Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                           Frontend (React)                                   │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │ Generate Button │  │ Progress Modal  │  │ Download Button             │  │
│  └────────┬────────┘  └────────▲────────┘  └──────────────┬──────────────┘  │
└───────────┼────────────────────┼──────────────────────────┼─────────────────┘
            │ POST /generate     │ WebSocket: progress      │ GET /download
            ▼                    │                          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         FastAPI Backend                                      │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                    Report Generation Router                             │ │
│  │  POST /api/rooms/{room_id}/reports/generate                            │ │
│  │  GET  /api/rooms/{room_id}/reports                                     │ │
│  │  GET  /api/rooms/{room_id}/reports/{report_id}                         │ │
│  │  GET  /api/rooms/{room_id}/reports/{report_id}/download                │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                              │                                               │
│                              ▼                                               │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                    Report Generation Service                            │ │
│  │                                                                         │ │
│  │  1. ReportDataService.collect_room_data()                              │ │
│  │     ├── Get room metadata (title, type, severity, status)              │ │
│  │     ├── Get all messages (sorted by time)                              │ │
│  │     ├── Get member list (with roles)                                   │ │
│  │     └── Get file list (with metadata, not content)                     │ │
│  │                                                                         │ │
│  │  2. DifyService.generate_report_content()                              │ │
│  │     ├── Build prompt with system instructions + room data              │ │
│  │     ├── Call DIFY Chat API (blocking mode)                             │ │
│  │     ├── Parse JSON response                                            │ │
│  │     └── Validate against expected schema                               │ │
│  │                                                                         │ │
│  │  3. DocxAssemblyService.create_document()                              │ │
│  │     ├── Create docx with python-docx                                   │ │
│  │     ├── Add title, metadata header                                     │ │
│  │     ├── Add AI-generated sections (summary, timeline, etc.)            │ │
│  │     ├── Download images from MinIO                                     │ │
│  │     ├── Embed images in document                                       │ │
│  │     └── Add file attachment list                                       │ │
│  │                                                                         │ │
│  │  4. Store report metadata in database                                  │ │
│  │  5. Upload .docx to MinIO or store locally                             │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
│                              │                                               │
└──────────────────────────────┼───────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
    ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
    │   DIFY API   │   │    MinIO     │   │  PostgreSQL  │
    │ Chat Messages│   │ File Storage │   │   Database   │
    └──────────────┘   └──────────────┘   └──────────────┘
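
The numbered steps in the diagram can be sketched as a single orchestration function. This is a minimal illustration, not the project's actual service code: `generate_report`, the `notify` callback, the progress stage names, and the three callables standing in for ReportDataService, DifyService, and DocxAssemblyService are all hypothetical. Only the final `completed` and `failed` values come from the status column defined later in this document.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical callback used to push progress to the WebSocket layer.
Notify = Callable[[str], Awaitable[None]]


async def generate_report(
    room_id: str,
    collect: Callable[[str], Awaitable[dict[str, Any]]],
    generate: Callable[[dict[str, Any]], Awaitable[dict[str, Any]]],
    assemble: Callable[[dict[str, Any], dict[str, Any]], Awaitable[bytes]],
    notify: Notify,
) -> dict[str, Any]:
    """Run collect -> generate -> assemble, reporting progress before each stage."""
    try:
        await notify("collecting_data")      # step 1: room metadata, messages, members, files
        room_data = await collect(room_id)

        await notify("generating_content")   # step 2: DIFY call and JSON validation
        report_json = await generate(room_data)

        await notify("assembling_document")  # step 3: build the .docx
        docx_bytes = await assemble(room_data, report_json)

        await notify("completed")
        return {"status": "completed", "report_json": report_json, "docx": docx_bytes}
    except Exception as exc:
        # Any stage failure marks the report as failed and keeps the message.
        await notify("failed")
        return {"status": "failed", "error_message": str(exc)}
```

Passing the services in as callables keeps the sketch testable without a database, DIFY, or MinIO; steps 4 and 5 (persisting metadata and uploading the file) would consume the returned dictionary.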

Data Flow

1. Data Collection Phase

RoomReportData:
  room_id: str
  title: str
  incident_type: str
  severity: str
  status: str
  location: str
  description: str
  created_at: datetime
  resolved_at: datetime | None

  messages: List[MessageData]
    - sender_name: str
    - content: str
    - message_type: str
    - created_at: datetime
    - has_file_attachment: bool
    - file_name: str | None

  members: List[MemberData]
    - user_id: str
    - display_name: str
    - role: str

  files: List[FileData]
    - file_id: str
    - filename: str
    - file_type: str
    - mime_type: str
    - uploaded_at: datetime
    - uploader_name: str

2. DIFY Prompt Construction

System Prompt (configured in the DIFY application):
  - Role definition (professional report-writing assistant)
  - Output format requirements (JSON only)
  - Report section definitions
  - JSON schema with examples

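User Query (sent with each request):
  ## 事件資訊 (Incident Information)
  - 標題 (Title): {room.title}
  - 類型 (Type): {room.incident_type}
  - 嚴重程度 (Severity): {room.severity}
  - 狀態 (Status): {room.status}
  - 地點 (Location): {room.location}
  - 建立時間 (Created At): {room.created_at}

  ## 參與人員 (Participants)
  {formatted member list}

  ## 對話記錄 (Conversation Log)
  {formatted message timeline}

  ## 附件清單 (Attachment List)
  {formatted file list - names only}

  請根據以上資料生成報告 JSON。 (Generate the report JSON from the data above.)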
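
A minimal sketch of how the per-request query could be rendered from collected room data. It uses the Traditional Chinese headings from the template above; the function name `build_user_query`, the plain-dict inputs, and the exact line formats for members, messages, and files are assumptions made for illustration.

```python
from datetime import datetime


def build_user_query(room: dict, members: list, messages: list, files: list) -> str:
    """Render the user query; messages are sorted by time before formatting."""
    member_lines = [f"- {m['display_name']} ({m['role']})" for m in members]
    message_lines = [
        f"[{msg['created_at']:%Y-%m-%d %H:%M}] {msg['sender_name']}: {msg['content']}"
        for msg in sorted(messages, key=lambda m: m["created_at"])
    ]
    file_lines = [f"- {f['filename']}" for f in files]  # names only, never content

    parts = [
        "## 事件資訊",
        f"- 標題: {room['title']}",
        f"- 類型: {room['incident_type']}",
        f"- 嚴重程度: {room['severity']}",
        f"- 狀態: {room['status']}",
        f"- 地點: {room['location']}",
        f"- 建立時間: {room['created_at']:%Y-%m-%d %H:%M}",
        "",
        "## 參與人員",
        *member_lines,
        "",
        "## 對話記錄",
        *message_lines,
        "",
        "## 附件清單",
        *file_lines,
        "",
        "請根據以上資料生成報告 JSON。",
    ]
    return "\n".join(parts)
```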

3. DIFY API Request/Response

# Request
POST https://dify.theaken.com/v1/chat-messages
Headers:
  Authorization: Bearer {DIFY_API_KEY}
  Content-Type: application/json

# conversation_id is left empty so that each request starts a new conversation;
# user carries the room_id for tracking.
Body:
{
  "inputs": {},
  "query": "{constructed_prompt}",
  "response_mode": "blocking",
  "conversation_id": "",
  "user": "{room_id}"
}

# Response
{
  "event": "message",
  "message_id": "...",
  "answer": "{...JSON report content...}",
  "metadata": {
    "usage": {...}
  }
}
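
The request and response shapes above can be wrapped in a few small helpers. This sketch deliberately stops short of the HTTP call itself so that it stays independent of any particular client library; the function names are hypothetical, and the field names are the ones shown above.

```python
import json


def build_chat_payload(query: str, room_id: str) -> dict:
    """Build the JSON body for POST {DIFY_BASE_URL}/chat-messages."""
    return {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",
        "conversation_id": "",  # empty: start a new conversation each time
        "user": room_id,        # room_id doubles as the tracking identifier
    }


def build_headers(api_key: str) -> dict:
    """Build the request headers; the key itself should never be logged."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def parse_chat_response(body: str) -> dict:
    """Unpack the response envelope and decode the report JSON held in "answer"."""
    envelope = json.loads(body)
    return {
        "message_id": envelope.get("message_id"),
        "usage": envelope.get("metadata", {}).get("usage", {}),
        # json.loads raises json.JSONDecodeError (a ValueError) if the answer is not JSON
        "report": json.loads(envelope["answer"]),
    }
```

Because the report arrives as a JSON string inside the `answer` field, it has to be decoded a second time after the outer response is parsed.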

4. AI Output JSON Schema

{
  "summary": {
    "content": "string (incident summary, 50-100 characters)"
  },
  "timeline": {
    "events": [
      {
        "time": "string (HH:MM or YYYY-MM-DD HH:MM)",
        "description": "string"
      }
    ]
  },
  "participants": {
    "members": [
      {
        "name": "string",
        "role": "string (e.g. 事件發起人 (incident initiator), 維修負責人 (maintenance lead))"
      }
    ]
  },
  "resolution_process": {
    "content": "string (detailed resolution process)"
  },
  "current_status": {
    "status": "active|resolved|archived",
    "description": "string"
  },
  "final_resolution": {
    "has_resolution": "boolean",
    "content": "string (may be empty when has_resolution=false)"
  }
}
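
A minimal sketch of the "validate against expected schema" step, written with the standard library only. The section and key names come from the schema above; the function name `validate_report` and the error message wording are assumptions, and a real implementation might use Pydantic models instead.

```python
REQUIRED_SECTIONS = {
    "summary": ["content"],
    "timeline": ["events"],
    "participants": ["members"],
    "resolution_process": ["content"],
    "current_status": ["status", "description"],
    "final_resolution": ["has_resolution", "content"],
}
ALLOWED_STATUSES = ("active", "resolved", "archived")


def validate_report(report: dict) -> list:
    """Return a list of problems; an empty list means the report matches the schema."""
    errors = []
    for section, keys in REQUIRED_SECTIONS.items():
        value = report.get(section)
        if not isinstance(value, dict):
            errors.append(f"missing section: {section}")
            continue
        errors.extend(f"missing key: {section}.{k}" for k in keys if k not in value)

    status = report.get("current_status")
    if isinstance(status, dict) and "status" in status:
        if status["status"] not in ALLOWED_STATUSES:
            errors.append(f"invalid current_status.status: {status['status']!r}")

    final = report.get("final_resolution")
    if isinstance(final, dict) and "has_resolution" in final:
        if not isinstance(final["has_resolution"], bool):
            errors.append("final_resolution.has_resolution must be a boolean")
    return errors
```

Returning the full list of problems, rather than raising on the first one, makes it straightforward to log them alongside the raw response.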

Module Structure

app/modules/report_generation/
├── __init__.py
├── models.py              # GeneratedReport SQLAlchemy model
├── schemas.py             # Pydantic schemas for API
├── router.py              # FastAPI endpoints
├── dependencies.py        # Auth and permission checks
├── prompts.py             # System prompt and prompt templates
└── services/
    ├── __init__.py
    ├── dify_service.py    # DIFY API client
    ├── report_data_service.py   # Collect room data
    └── docx_service.py    # python-docx assembly

Database Schema

Users Table (New - for display name resolution)

CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,  -- email address (e.g., ymirliu@panjit.com.tw)
    display_name VARCHAR(255) NOT NULL, -- from AD API userInfo.name (e.g., "ymirliu 劉念蓉")
    office_location VARCHAR(100),       -- from AD API userInfo.officeLocation
    job_title VARCHAR(100),             -- from AD API userInfo.jobTitle
    last_login_at TIMESTAMP,            -- updated on each login
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX ix_users_display_name ON users (display_name);

Population Strategy:

  • On successful login, the auth module calls upsert_user() with the AD API response data
  • Uses INSERT ... ON CONFLICT DO UPDATE for atomic upsert
  • last_login_at updated on every login
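
The upsert described above can be sketched as follows. To keep the example runnable without a database server, it uses an in-memory SQLite connection as a stand-in for PostgreSQL; SQLite accepts the same `INSERT ... ON CONFLICT DO UPDATE` form. The `upsert_user` signature and the `create_users_table` helper are assumptions, and a PostgreSQL driver would typically use `%s` placeholders instead of `?`.

```python
import sqlite3

UPSERT_SQL = """
INSERT INTO users (user_id, display_name, office_location, job_title, last_login_at)
VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
ON CONFLICT (user_id) DO UPDATE SET
    display_name = excluded.display_name,
    office_location = excluded.office_location,
    job_title = excluded.job_title,
    last_login_at = excluded.last_login_at
"""


def create_users_table(conn: sqlite3.Connection) -> None:
    """Create a SQLite approximation of the users table (illustration only)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS users (
            user_id TEXT PRIMARY KEY,
            display_name TEXT NOT NULL,
            office_location TEXT,
            job_title TEXT,
            last_login_at TIMESTAMP,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def upsert_user(conn, user_id, display_name, office_location=None, job_title=None):
    """Insert the user, or refresh profile fields and last_login_at if the row exists."""
    conn.execute(UPSERT_SQL, (user_id, display_name, office_location, job_title))
    conn.commit()
```

Because `created_at` is not listed in the `DO UPDATE SET` clause, it keeps the value from the first insert while `last_login_at` moves forward on every login.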

Usage in Reports:

SELECT m.content, m.created_at, u.display_name
FROM messages m
LEFT JOIN users u ON m.sender_id = u.user_id
WHERE m.room_id = ?
ORDER BY m.created_at;

Generated Reports Table

CREATE TABLE generated_reports (
    report_id VARCHAR(36) PRIMARY KEY,
    room_id VARCHAR(36) NOT NULL REFERENCES incident_rooms(room_id),

    -- Generation metadata
    generated_by VARCHAR(255) NOT NULL,  -- User who triggered generation
    generated_at TIMESTAMP DEFAULT NOW(),

    -- Status tracking
    status VARCHAR(20) NOT NULL DEFAULT 'pending',  -- pending, generating, completed, failed
    error_message TEXT,

    -- AI metadata
    dify_message_id VARCHAR(100),
    dify_conversation_id VARCHAR(100),
    prompt_tokens INTEGER,
    completion_tokens INTEGER,

    -- Report storage
    report_title VARCHAR(255),
    report_json JSONB,        -- Parsed AI output
    docx_storage_path VARCHAR(500)   -- MinIO path or local path
);

-- Indexes
CREATE INDEX ix_generated_reports_room ON generated_reports (room_id, generated_at DESC);
CREATE INDEX ix_generated_reports_status ON generated_reports (status);

Configuration

# app/core/config.py additions
class Settings(BaseSettings):
    # ... existing settings ...

    # DIFY AI Service
    DIFY_BASE_URL: str = "https://dify.theaken.com/v1"
    DIFY_API_KEY: str  # Required, from .env
    DIFY_TIMEOUT_SECONDS: int = 120  # AI generation can take time

    # Report Generation
    REPORT_MAX_MESSAGES: int = 200  # Summarize if exceeded
    REPORT_STORAGE_PATH: str = "reports"  # MinIO path prefix

Error Handling Strategy

Error Type                      Handling
DIFY API timeout                Retry once, then fail with a timeout error
DIFY returns non-JSON           Attempt to extract JSON from the response; retry if extraction fails
JSON schema validation fails    Log the raw response, return an error with details
MinIO image download fails      Skip the image, add a note in the report
python-docx assembly fails      Return a partial report or an error
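
The first two rows of the table can be sketched with two small helpers. Both names are hypothetical, and the extraction heuristic (take the span from the first `{` to the last `}`) is an assumption about how "extract JSON from response" might be done, not a description of the actual implementation.

```python
import json


def extract_json(text: str) -> dict:
    """Parse text as a JSON object, falling back to the outermost {...} span."""
    candidates = [text]
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start : end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    raise ValueError("no JSON object found in response")


def call_with_one_retry(call, retry_on=(TimeoutError,)):
    """Invoke call(); on a listed exception, try exactly once more.

    A second failure propagates to the caller unchanged.
    """
    try:
        return call()
    except retry_on:
        return call()
```

For the non-JSON case, `retry_on=(ValueError,)` would cover a failed extraction, since `extract_json` raises `ValueError` when nothing parseable is found.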

Security Considerations

  • DIFY API key stored in environment variable, never logged
  • Room membership verified before report generation
  • Generated reports inherit room access permissions
  • Reports are downloaded directly through the authenticated API endpoint, so presigned URLs are not needed

Performance Considerations

  • The DIFY call uses blocking response mode for simplicity; the rest of the generation flow is async-friendly
  • Large rooms: messages older than 7 days are summarized by day
  • Images are downloaded in parallel using asyncio.gather
  • Reports are cached in the database to avoid regeneration
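
The parallel image download can be sketched with `asyncio.gather` and `return_exceptions=True`, which also gives the "skip image, add note" behaviour from the error handling table. The `download_images` name and the injected `fetch` coroutine are assumptions; if the MinIO client in use is synchronous, `fetch` could wrap it with `asyncio.to_thread`.

```python
import asyncio
from typing import Awaitable, Callable


async def download_images(
    object_names: list,
    fetch: Callable[[str], Awaitable[bytes]],
) -> tuple:
    """Fetch all images concurrently; return (downloaded, skipped)."""
    results = await asyncio.gather(
        *(fetch(name) for name in object_names),
        return_exceptions=True,  # one failed download must not cancel the others
    )
    downloaded, skipped = {}, []
    for name, result in zip(object_names, results):
        if isinstance(result, BaseException):
            skipped.append(name)  # caller adds a note to the report instead
        else:
            downloaded[name] = result
    return downloaded, skipped
```

`gather` returns results in the same order as its inputs, which is why zipping them back against `object_names` is safe.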