feat: Add AI report generation with DIFY integration

- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-12-04 18:32:40 +08:00
parent 77091eefb5
commit 3927441103
32 changed files with 4374 additions and 8 deletions

@@ -0,0 +1,320 @@
# Design: AI Report Generation Architecture
## Overview
This document describes the architectural design for integrating DIFY AI service to generate incident reports from chat room data.
## System Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              Frontend (React)                               │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────────┐  │
│  │ Generate Button │  │ Progress Modal  │  │       Download Button       │  │
│  └────────┬────────┘  └────────▲────────┘  └──────────────┬──────────────┘  │
└───────────┼────────────────────┼──────────────────────────┼─────────────────┘
            │ POST /generate     │ WebSocket: progress      │ GET /download
            ▼                    │                          ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                               FastAPI Backend                               │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                       Report Generation Router                        │  │
│  │  POST /api/rooms/{room_id}/reports/generate                           │  │
│  │  GET  /api/rooms/{room_id}/reports                                    │  │
│  │  GET  /api/rooms/{room_id}/reports/{report_id}                        │  │
│  │  GET  /api/rooms/{room_id}/reports/{report_id}/download               │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                      │                                      │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                       Report Generation Service                       │  │
│  │                                                                       │  │
│  │  1. ReportDataService.collect_room_data()                             │  │
│  │     ├── Get room metadata (title, type, severity, status)             │  │
│  │     ├── Get all messages (sorted by time)                             │  │
│  │     ├── Get member list (with roles)                                  │  │
│  │     └── Get file list (with metadata, not content)                    │  │
│  │                                                                       │  │
│  │  2. DifyService.generate_report_content()                             │  │
│  │     ├── Build prompt with system instructions + room data             │  │
│  │     ├── Call DIFY Chat API (blocking mode)                            │  │
│  │     ├── Parse JSON response                                           │  │
│  │     └── Validate against expected schema                              │  │
│  │                                                                       │  │
│  │  3. DocxAssemblyService.create_document()                             │  │
│  │     ├── Create docx with python-docx                                  │  │
│  │     ├── Add title, metadata header                                    │  │
│  │     ├── Add AI-generated sections (summary, timeline, etc.)           │  │
│  │     ├── Download images from MinIO                                    │  │
│  │     ├── Embed images in document                                      │  │
│  │     └── Add file attachment list                                      │  │
│  │                                                                       │  │
│  │  4. Store report metadata in database                                 │  │
│  │  5. Upload .docx to MinIO or store locally                            │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                      │                                      │
└──────────────────────────────────────┼──────────────────────────────────────┘
                    ┌──────────────────┼──────────────────┐
                    ▼                  ▼                  ▼
             ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
             │   DIFY API   │   │    MinIO     │   │  PostgreSQL  │
             │ Chat Messages│   │ File Storage │   │   Database   │
             └──────────────┘   └──────────────┘   └──────────────┘
```
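The numbered steps inside the Report Generation Service box can be read as a linear pipeline with status transitions. The following is a minimal sketch of that reading, assuming each step is injected as a plain callable. The names `run_pipeline`, `ReportResult`, `collect`, `generate`, `assemble`, and `store` are hypothetical and do not come from the codebase; the status strings follow the `pending`, `generating`, `completed`, `failed` values listed for `generated_reports.status`.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ReportResult:
    """Hypothetical result holder; statuses mirror generated_reports.status."""
    status: str = "pending"
    storage_path: Optional[str] = None
    error_message: Optional[str] = None


def run_pipeline(
    room_id: str,
    collect: Callable[[str], Any],            # step 1: gather room data
    generate: Callable[[Any], dict],          # step 2: call the AI service, return parsed JSON
    assemble: Callable[[Any, dict], bytes],   # step 3: build the .docx bytes
    store: Callable[[str, bytes], str],       # steps 4-5: persist and return a storage path
) -> ReportResult:
    result = ReportResult()
    try:
        result.status = "generating"
        room_data = collect(room_id)
        report_json = generate(room_data)
        docx_bytes = assemble(room_data, report_json)
        result.storage_path = store(room_id, docx_bytes)
        result.status = "completed"
    except Exception as exc:
        # Any failing step marks the whole report as failed and keeps the reason.
        result.status = "failed"
        result.error_message = str(exc)
    return result
```

Passing the steps in as callables keeps the sketch independent of the real service classes and makes each stage easy to stub in tests.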
## Data Flow
### 1. Data Collection Phase
```python
# Shown as dataclasses for illustration; the real models may be Pydantic schemas.
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class MessageData:
    sender_name: str
    content: str
    message_type: str
    created_at: datetime
    has_file_attachment: bool
    file_name: str | None


@dataclass
class MemberData:
    user_id: str
    display_name: str
    role: str


@dataclass
class FileData:
    file_id: str
    filename: str
    file_type: str
    mime_type: str
    uploaded_at: datetime
    uploader_name: str


@dataclass
class RoomReportData:
    room_id: str
    title: str
    incident_type: str
    severity: str
    status: str
    location: str
    description: str
    created_at: datetime
    resolved_at: datetime | None
    messages: List[MessageData]
    members: List[MemberData]
    files: List[FileData]
```
### 2. DIFY Prompt Construction
```
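System Prompt (configured in the DIFY application settings):
- Role definition (專業報告撰寫助手, i.e. professional report-writing assistant)
- Output format requirements (JSON only)
- Report section definitions
- JSON schema with examples

User Query (sent with each request; the template text stays in Traditional Chinese):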
## 事件資訊
- 標題: {room.title}
- 類型: {room.incident_type}
- 嚴重程度: {room.severity}
- 狀態: {room.status}
- 地點: {room.location}
- 建立時間: {room.created_at}
## 參與人員
{formatted member list}
## 對話記錄
{formatted message timeline}
## 附件清單
{formatted file list - names only}
請根據以上資料生成報告 JSON。
```
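As a rough illustration of how the per-request user query could be rendered from collected room data, here is a sketch that follows the section order of the template above. The function name `build_user_query`, the use of plain dicts, and the exact line formats for members, messages, and files are assumptions made for this example; the design only specifies "formatted" lists.

```python
from datetime import datetime
from typing import Dict, List


def build_user_query(room: Dict, members: List[Dict], messages: List[Dict], files: List[Dict]) -> str:
    """Render the per-request user query in the section order of the template."""
    # Line formats below are illustrative assumptions, not the project's actual formatting.
    member_lines = [f"- {m['display_name']} ({m['role']})" for m in members]
    message_lines = [
        f"[{m['created_at']:%Y-%m-%d %H:%M}] {m['sender_name']}: {m['content']}"
        for m in sorted(messages, key=lambda m: m["created_at"])  # chronological order
    ]
    file_lines = [f"- {f['filename']}" for f in files]  # names only, never file content
    parts = [
        "## 事件資訊",
        f"- 標題: {room['title']}",
        f"- 類型: {room['incident_type']}",
        f"- 嚴重程度: {room['severity']}",
        f"- 狀態: {room['status']}",
        f"- 地點: {room['location']}",
        f"- 建立時間: {room['created_at']:%Y-%m-%d %H:%M}",
        "## 參與人員",
        *member_lines,
        "## 對話記錄",
        *message_lines,
        "## 附件清單",
        *file_lines,
        "請根據以上資料生成報告 JSON。",
    ]
    return "\n".join(parts)


# Example call with made-up data.
example_room = {
    "title": "Line 3 stop", "incident_type": "equipment", "severity": "high",
    "status": "active", "location": "Fab A", "created_at": datetime(2025, 12, 4, 9, 0),
}
example_query = build_user_query(example_room, [], [], [])
```

Sorting the messages inside the builder keeps the timeline chronological even if the caller passes them in arbitrary order.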
### 3. DIFY API Request/Response
```python
# Request
POST https://dify.theaken.com/v1/chat-messages
Headers:
  Authorization: Bearer {DIFY_API_KEY}
  Content-Type: application/json
Body:
{
  "inputs": {},
  "query": "{constructed_prompt}",
  "response_mode": "blocking",
  "conversation_id": "",   # New conversation each time
  "user": "{room_id}"      # Use room_id for tracking
}

# Response
{
  "event": "message",
  "message_id": "...",
  "answer": "{...JSON report content...}",
  "metadata": {
    "usage": {...}
  }
}
```
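The request and response shapes above can be exercised without a live DIFY instance by separating request construction from response parsing. The sketch below uses only the standard library. The helper names `build_dify_request` and `parse_dify_response` are hypothetical, and the `prompt_tokens` and `completion_tokens` keys inside `usage` are an assumption based on the column names in the Generated Reports table, since the response example elides the contents of `usage`.

```python
import json
import urllib.request


def build_dify_request(base_url: str, api_key: str, prompt: str, room_id: str) -> urllib.request.Request:
    """Build the blocking-mode chat-messages request described above."""
    body = {
        "inputs": {},
        "query": prompt,
        "response_mode": "blocking",
        "conversation_id": "",  # new conversation each time
        "user": room_id,        # room_id is used for tracking
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat-messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


def parse_dify_response(raw: bytes) -> dict:
    """Pull out the fields the design stores: message id, answer text, token usage."""
    payload = json.loads(raw)
    usage = payload.get("metadata", {}).get("usage", {})  # key names inside usage are assumed
    return {
        "message_id": payload.get("message_id"),
        "answer": payload.get("answer", ""),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }
```

A caller would send the request with something like `urllib.request.urlopen(req, timeout=120)` and pass the response body to `parse_dify_response`; the actual service may well use an async HTTP client instead.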
### 4. AI Output JSON Schema
```json
{
  "summary": {
    "content": "string (incident summary, 50-100 characters)"
  },
  "timeline": {
    "events": [
      {
        "time": "string (HH:MM or YYYY-MM-DD HH:MM)",
        "description": "string"
      }
    ]
  },
  "participants": {
    "members": [
      {
        "name": "string",
        "role": "string (e.g. 事件發起人 [incident initiator], 維修負責人 [repair lead], etc.)"
      }
    ]
  },
  "resolution_process": {
    "content": "string (detailed resolution process)"
  },
  "current_status": {
    "status": "active|resolved|archived",
    "description": "string"
  },
  "final_resolution": {
    "has_resolution": "boolean",
    "content": "string (may be empty if has_resolution=false)"
  }
}
```
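The "validate against expected schema" step could be as simple as checking required sections and field types. The sketch below is one possible standard-library approach, not the project's validator (which may use Pydantic via `schemas.py`). The names `REQUIRED_FIELDS`, `ALLOWED_STATUSES`, and `validate_report` are hypothetical.

```python
from typing import Any, Dict, List

# Required fields per section, taken from the schema above.
REQUIRED_FIELDS: Dict[str, Dict[str, type]] = {
    "summary": {"content": str},
    "timeline": {"events": list},
    "participants": {"members": list},
    "resolution_process": {"content": str},
    "current_status": {"status": str, "description": str},
    "final_resolution": {"has_resolution": bool, "content": str},
}
ALLOWED_STATUSES = {"active", "resolved", "archived"}


def validate_report(report: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the report is valid."""
    errors: List[str] = []
    for section, fields in REQUIRED_FIELDS.items():
        body = report.get(section)
        if not isinstance(body, dict):
            errors.append(f"missing section: {section}")
            continue
        for name, expected in fields.items():
            if not isinstance(body.get(name), expected):
                errors.append(f"{section}.{name} must be {expected.__name__}")
    status = report.get("current_status")
    if isinstance(status, dict):
        value = status.get("status")
        if isinstance(value, str) and value not in ALLOWED_STATUSES:
            errors.append(f"current_status.status has unexpected value: {value!r}")
    return errors
```

Returning a list of problems rather than raising on the first one fits the error-handling goal of logging the raw response and returning an error with details.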
## Module Structure
```
app/modules/report_generation/
├── __init__.py
├── models.py                    # GeneratedReport SQLAlchemy model
├── schemas.py                   # Pydantic schemas for API
├── router.py                    # FastAPI endpoints
├── dependencies.py              # Auth and permission checks
├── prompts.py                   # System prompt and prompt templates
└── services/
    ├── __init__.py
    ├── dify_service.py          # DIFY API client
    ├── report_data_service.py   # Collect room data
    └── docx_service.py          # python-docx assembly
```
## Database Schema
### Users Table (New - for display name resolution)
```sql
CREATE TABLE users (
    user_id         VARCHAR(255) PRIMARY KEY,   -- email address (e.g., ymirliu@panjit.com.tw)
    display_name    VARCHAR(255) NOT NULL,      -- from AD API userInfo.name (e.g., "ymirliu 劉念蓉")
    office_location VARCHAR(100),               -- from AD API userInfo.officeLocation
    job_title       VARCHAR(100),               -- from AD API userInfo.jobTitle
    last_login_at   TIMESTAMP,                  -- updated on each login
    created_at      TIMESTAMP DEFAULT NOW()
);
-- PostgreSQL does not accept inline INDEX clauses, so the index is created separately
CREATE INDEX ix_users_display_name ON users (display_name);
```
**Population Strategy:**
- On successful login, auth module calls `upsert_user()` with AD API response data
- Uses `INSERT ... ON CONFLICT DO UPDATE` for atomic upsert
- `last_login_at` updated on every login
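
The upsert described above can be sketched as follows. To keep the example self-contained it runs against an in-memory SQLite database, which accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form in recent versions; the production target is PostgreSQL, and the real `upsert_user()` signature is not shown in this document, so the parameters here are assumptions.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def upsert_user(
    conn: sqlite3.Connection,
    user_id: str,
    display_name: str,
    office_location: Optional[str] = None,
    job_title: Optional[str] = None,
) -> None:
    """Insert the user, or refresh profile fields and last_login_at if the row exists."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        """
        INSERT INTO users (user_id, display_name, office_location, job_title, last_login_at)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (user_id) DO UPDATE SET
            display_name = excluded.display_name,
            office_location = excluded.office_location,
            job_title = excluded.job_title,
            last_login_at = excluded.last_login_at
        """,
        (user_id, display_name, office_location, job_title, now),
    )
    conn.commit()


# Stand-in schema for the demo; column types are simplified for SQLite.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id TEXT PRIMARY KEY, display_name TEXT NOT NULL, "
    "office_location TEXT, job_title TEXT, last_login_at TEXT)"
)
upsert_user(conn, "a@example.com", "Old Name")
upsert_user(conn, "a@example.com", "New Name", job_title="Engineer")  # second login updates the row
```

Because the conflict target is the primary key, two logins by the same user leave exactly one row with the latest profile data.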
**Usage in Reports:**
```sql
SELECT m.content, m.created_at, u.display_name
FROM messages m
LEFT JOIN users u ON m.sender_id = u.user_id
WHERE m.room_id = ?
ORDER BY m.created_at;
```
### Generated Reports Table
```sql
CREATE TABLE generated_reports (
    report_id   VARCHAR(36) PRIMARY KEY,
    room_id     VARCHAR(36) NOT NULL REFERENCES incident_rooms(room_id),
    -- Generation metadata
    generated_by VARCHAR(255) NOT NULL,          -- User who triggered generation
    generated_at TIMESTAMP DEFAULT NOW(),
    -- Status tracking
    status        VARCHAR(20) NOT NULL DEFAULT 'pending',  -- pending, generating, completed, failed
    error_message TEXT,
    -- AI metadata
    dify_message_id      VARCHAR(100),
    dify_conversation_id VARCHAR(100),
    prompt_tokens        INTEGER,
    completion_tokens    INTEGER,
    -- Report storage
    report_title      VARCHAR(255),
    report_json       JSONB,                     -- Parsed AI output
    docx_storage_path VARCHAR(500)               -- MinIO path or local path
);
-- Indexes (PostgreSQL requires separate CREATE INDEX statements)
CREATE INDEX ix_generated_reports_room ON generated_reports (room_id, generated_at DESC);
CREATE INDEX ix_generated_reports_status ON generated_reports (status);
```
## Configuration
```python
# app/core/config.py additions
class Settings(BaseSettings):
    # ... existing settings ...

    # DIFY AI Service
    DIFY_BASE_URL: str = "https://dify.theaken.com/v1"
    DIFY_API_KEY: str                      # Required, from .env
    DIFY_TIMEOUT_SECONDS: int = 120        # AI generation can take time

    # Report Generation
    REPORT_MAX_MESSAGES: int = 200         # Summarize if exceeded
    REPORT_STORAGE_PATH: str = "reports"   # MinIO path prefix
```
## Error Handling Strategy
| Error Type | Handling |
|------------|----------|
| DIFY API timeout | Retry once, then fail with timeout error |
| DIFY returns non-JSON | Attempt to extract JSON from the response; retry the request if extraction fails |
| JSON schema validation fails | Log raw response, return error with details |
| MinIO image download fails | Skip image, add note in report |
| python-docx assembly fails | Return partial report or error |
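
Two rows of the table above lend themselves to small helpers: retrying once on timeout, and extracting JSON from a response that wraps it in extra text. The sketch below shows one way to do both with the standard library. The names `extract_json` and `call_with_one_retry` are hypothetical, and treating `TimeoutError` as the timeout signal is an assumption, since the exception type depends on the HTTP client in use.

```python
import json
from typing import Callable, Optional


def extract_json(text: str) -> Optional[dict]:
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            parsed, _end = decoder.raw_decode(text, start)
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    return None


def call_with_one_retry(call: Callable[[], str]) -> str:
    """Run call(); on TimeoutError run it once more and let a second timeout propagate."""
    try:
        return call()
    except TimeoutError:
        return call()
```

Scanning from each `{` handles answers where the model adds a sentence before or after the JSON, which is the common failure mode for "JSON only" prompts.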
## Security Considerations
- DIFY API key stored in environment variable, never logged
- Room membership verified before report generation
- Generated reports inherit room access permissions
- Reports are downloaded through the authenticated API endpoint, so presigned MinIO URLs are not needed
## Performance Considerations
- The DIFY call uses blocking response mode for simplicity, although the generation flow itself is async-friendly
- Large rooms: messages older than 7 days are summarized by day
- Images are downloaded in parallel using asyncio.gather
- Generated reports are stored in the database so that existing reports can be served without regeneration