feat: Add AI report generation with DIFY integration

- Add Users table for display name resolution from AD authentication
- Integrate DIFY AI service for report content generation
- Create docx assembly service with image embedding from MinIO
- Add REST API endpoints for report generation and download
- Add WebSocket notifications for generation progress
- Add frontend UI with progress modal and download functionality
- Add integration tests for report generation flow

Report sections (Traditional Chinese):
- 事件摘要 (Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-12-04 18:32:40 +08:00
parent 77091eefb5
commit 3927441103
32 changed files with 4374 additions and 8 deletions


@@ -0,0 +1,320 @@
# Design: AI Report Generation Architecture
## Overview
This document describes the architecture for integrating the DIFY AI service to generate incident reports from chat room data.
## System Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Frontend (React) │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────────┐ │
│ │ Generate Button │ │ Progress Modal │ │ Download Button │ │
│ └────────┬────────┘ └────────▲────────┘ └──────────────┬──────────────┘ │
└───────────┼────────────────────┼──────────────────────────┼─────────────────┘
│ POST /generate │ WebSocket: progress │ GET /download
▼ │ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ FastAPI Backend │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Report Generation Router │ │
│ │ POST /api/rooms/{room_id}/reports/generate │ │
│ │ GET /api/rooms/{room_id}/reports │ │
│ │ GET /api/rooms/{room_id}/reports/{report_id} │ │
│ │ GET /api/rooms/{room_id}/reports/{report_id}/download │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ Report Generation Service │ │
│ │ │ │
│ │ 1. ReportDataService.collect_room_data() │ │
│ │ ├── Get room metadata (title, type, severity, status) │ │
│ │ ├── Get all messages (sorted by time) │ │
│ │ ├── Get member list (with roles) │ │
│ │ └── Get file list (with metadata, not content) │ │
│ │ │ │
│ │ 2. DifyService.generate_report_content() │ │
│ │ ├── Build prompt with system instructions + room data │ │
│ │ ├── Call DIFY Chat API (blocking mode) │ │
│ │ ├── Parse JSON response │ │
│ │ └── Validate against expected schema │ │
│ │ │ │
│ │ 3. DocxAssemblyService.create_document() │ │
│ │ ├── Create docx with python-docx │ │
│ │ ├── Add title, metadata header │ │
│ │ ├── Add AI-generated sections (summary, timeline, etc.) │ │
│ │ ├── Download images from MinIO │ │
│ │ ├── Embed images in document │ │
│ │ └── Add file attachment list │ │
│ │ │ │
│ │ 4. Store report metadata in database │ │
│ │ 5. Upload .docx to MinIO or store locally │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │ │
└──────────────────────────────┼───────────────────────────────────────────────┘
┌──────────────────┼──────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ DIFY API │ │ MinIO │ │ PostgreSQL │
│ Chat Messages│ │ File Storage │ │ Database │
└──────────────┘ └──────────────┘ └──────────────┘
```
## Data Flow
### 1. Data Collection Phase
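```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class MessageData:
    sender_name: str
    content: str
    message_type: str
    created_at: datetime
    has_file_attachment: bool
    file_name: Optional[str] = None


@dataclass
class MemberData:
    user_id: str
    display_name: str
    role: str


@dataclass
class FileData:
    file_id: str
    filename: str
    file_type: str
    mime_type: str
    uploaded_at: datetime
    uploader_name: str


@dataclass
class RoomReportData:
    room_id: str
    title: str
    incident_type: str
    severity: str
    status: str
    location: str
    description: str
    created_at: datetime
    resolved_at: Optional[datetime] = None
    messages: List[MessageData] = field(default_factory=list)
    members: List[MemberData] = field(default_factory=list)
    files: List[FileData] = field(default_factory=list)
```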
### 2. DIFY Prompt Construction
```
System Prompt (configured in the DIFY application):
  - Role definition (專業報告撰寫助手, "professional report-writing assistant")
  - Output format requirements (JSON only)
  - Report section definitions
  - JSON schema with examples

User Query (sent with each request; the template text is Traditional Chinese):
  ## 事件資訊
  - 標題: {room.title}
  - 類型: {room.incident_type}
  - 嚴重程度: {room.severity}
  - 狀態: {room.status}
  - 地點: {room.location}
  - 建立時間: {room.created_at}
  ## 參與人員
  {formatted member list}
  ## 對話記錄
  {formatted message timeline}
  ## 附件清單
  {formatted file list - names only}
  請根據以上資料生成報告 JSON。
```
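The user query above can be assembled with plain string formatting. The following is a minimal sketch, not the project's `build_report_prompt`: the function name `build_user_query`, the tuple layouts, and the timestamp format are assumptions made for illustration.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical row layout: (created_at, sender_name, content, file_name or None)
MessageRow = Tuple[datetime, str, str, Optional[str]]


def format_message(row: MessageRow) -> str:
    """Render one message as a timeline line, annotating any attachment."""
    created_at, sender, content, file_name = row
    line = f"[{created_at:%Y-%m-%d %H:%M}] {sender}: {content}"
    if file_name:
        line += f" [附件: {file_name}]"
    return line


def build_user_query(
    room: Dict[str, str],
    members: List[Tuple[str, str]],  # (display_name, role)
    messages: List[MessageRow],
    file_names: List[str],
) -> str:
    """Assemble the per-request query in the section order of the template."""
    lines = [
        "## 事件資訊",
        f"- 標題: {room['title']}",
        f"- 類型: {room['incident_type']}",
        f"- 嚴重程度: {room['severity']}",
        f"- 狀態: {room['status']}",
        f"- 地點: {room['location']}",
        f"- 建立時間: {room['created_at']}",
        "## 參與人員",
        *[f"- {name} ({role})" for name, role in members],
        "## 對話記錄",
        # Messages are sorted by time so the AI sees a chronological timeline.
        *[format_message(m) for m in sorted(messages, key=lambda m: m[0])],
        "## 附件清單",
        *[f"- {name}" for name in file_names],
        "請根據以上資料生成報告 JSON。",
    ]
    return "\n".join(lines)
```

Only file names are included, which matches the design decision that the AI never receives file content.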
### 3. DIFY API Request/Response
```python
# Request
POST https://dify.theaken.com/v1/chat-messages
Headers:
  Authorization: Bearer {DIFY_API_KEY}
  Content-Type: application/json
Body:
{
  "inputs": {},
  "query": "{constructed_prompt}",
  "response_mode": "blocking",
  "conversation_id": "",  # New conversation each time
  "user": "{room_id}"     # Use room_id for tracking
}

# Response
{
  "event": "message",
  "message_id": "...",
  "answer": "{...JSON report content...}",
  "metadata": {
    "usage": {...}
  }
}
```
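As a rough illustration of the exchange above, the request can be built and the response unpacked with pure functions, leaving the HTTP call to whichever client is used (the task list names `httpx`). The function names `build_dify_request` and `extract_answer` are hypothetical; only the URL path, headers, and body fields come from this document.

```python
import json
from typing import Dict, Tuple


def build_dify_request(
    base_url: str, api_key: str, prompt: str, room_id: str
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for a blocking chat-messages call."""
    url = f"{base_url.rstrip('/')}/chat-messages"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "inputs": {},
        "query": prompt,
        "response_mode": "blocking",
        "conversation_id": "",  # empty string starts a new conversation
        "user": room_id,        # room_id doubles as the tracking identifier
    }
    # ensure_ascii=False keeps the Chinese prompt readable in logs and on the wire.
    return url, headers, json.dumps(body, ensure_ascii=False).encode("utf-8")


def extract_answer(response_body: bytes) -> str:
    """Pull the raw `answer` string (expected to contain JSON) from the response."""
    payload = json.loads(response_body)
    return payload["answer"]
```

Keeping request construction separate from transport makes the unit tests listed in the task file (timeout, auth failure, invalid JSON) easier to write with a mocked client.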
### 4. AI Output JSON Schema
```json
{
  "summary": {
    "content": "string (incident summary, 50-100 Chinese characters)"
  },
  "timeline": {
    "events": [
      {
        "time": "string (HH:MM or YYYY-MM-DD HH:MM)",
        "description": "string"
      }
    ]
  },
  "participants": {
    "members": [
      {
        "name": "string",
        "role": "string (e.g. 事件發起人 'incident initiator', 維修負責人 'maintenance lead')"
      }
    ]
  },
  "resolution_process": {
    "content": "string (detailed resolution process)"
  },
  "current_status": {
    "status": "active|resolved|archived",
    "description": "string"
  },
  "final_resolution": {
    "has_resolution": "boolean",
    "content": "string (may be empty when has_resolution=false)"
  }
}
```
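A structural check against this schema might look like the sketch below. The task list plans a Pydantic `AIReportContent` model for this purpose; the standard-library version here is only an illustration, and the function name `validate_report` and the error strings are assumptions.

```python
from typing import Any, Dict, List

# Required keys per section, taken from the schema above.
# final_resolution.content is not required because it may be empty or absent
# when has_resolution is false (an interpretation of the schema comment).
REQUIRED_FIELDS = {
    "summary": ["content"],
    "timeline": ["events"],
    "participants": ["members"],
    "resolution_process": ["content"],
    "current_status": ["status", "description"],
    "final_resolution": ["has_resolution"],
}
ALLOWED_STATUSES = {"active", "resolved", "archived"}


def validate_report(report: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the structure is valid."""
    errors: List[str] = []
    for section, fields in REQUIRED_FIELDS.items():
        value = report.get(section)
        if not isinstance(value, dict):
            errors.append(f"missing section: {section}")
            continue
        for name in fields:
            if name not in value:
                errors.append(f"missing field: {section}.{name}")

    status = report.get("current_status")
    if isinstance(status, dict) and "status" in status:
        if status["status"] not in ALLOWED_STATUSES:
            errors.append(f"invalid status: {status['status']}")

    resolution = report.get("final_resolution")
    if isinstance(resolution, dict) and "has_resolution" in resolution:
        if not isinstance(resolution["has_resolution"], bool):
            errors.append("final_resolution.has_resolution must be a boolean")
    return errors
```

Returning all problems at once, rather than failing on the first, gives more useful detail when the raw response is logged for debugging.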
## Module Structure
```
app/modules/report_generation/
├── __init__.py
├── models.py        # GeneratedReport SQLAlchemy model
├── schemas.py       # Pydantic schemas for API
├── router.py        # FastAPI endpoints
├── dependencies.py  # Auth and permission checks
├── prompts.py       # System prompt and prompt templates
└── services/
    ├── __init__.py
    ├── dify_service.py         # DIFY API client
    ├── report_data_service.py  # Collect room data
    └── docx_service.py         # python-docx assembly
```
## Database Schema
### Users Table (New - for display name resolution)
```sql
CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,    -- email address (e.g., ymirliu@panjit.com.tw)
    display_name VARCHAR(255) NOT NULL,  -- from AD API userInfo.name (e.g., "ymirliu 劉念蓉")
    office_location VARCHAR(100),        -- from AD API userInfo.officeLocation
    job_title VARCHAR(100),              -- from AD API userInfo.jobTitle
    last_login_at TIMESTAMP,             -- updated on each login
    created_at TIMESTAMP DEFAULT NOW()
);
-- PostgreSQL does not accept inline INDEX clauses, so the index is created separately
CREATE INDEX ix_users_display_name ON users (display_name);
```
**Population Strategy:**
- On successful login, auth module calls `upsert_user()` with AD API response data
- Uses `INSERT ... ON CONFLICT DO UPDATE` for atomic upsert
- `last_login_at` updated on every login
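The upsert behaviour described above can be sketched as follows. This uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so the example runs anywhere; SQLite (3.24 or newer) accepts the same `ON CONFLICT ... DO UPDATE` form with the `excluded` pseudo-table. The table definition and the signature of `upsert_user` here are simplified assumptions, not the project's code.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Simplified SQLite variant of the users table, for illustration only.
CREATE_USERS = """
CREATE TABLE IF NOT EXISTS users (
    user_id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL,
    office_location TEXT,
    job_title TEXT,
    last_login_at TEXT,
    created_at TEXT
)
"""

# created_at is absent from the UPDATE clause, so the original value is preserved.
UPSERT_USER = """
INSERT INTO users (user_id, display_name, office_location, job_title,
                   last_login_at, created_at)
VALUES (:user_id, :display_name, :office_location, :job_title, :now, :now)
ON CONFLICT (user_id) DO UPDATE SET
    display_name = excluded.display_name,
    office_location = excluded.office_location,
    job_title = excluded.job_title,
    last_login_at = excluded.last_login_at
"""


def upsert_user(
    conn: sqlite3.Connection,
    user_id: str,
    display_name: str,
    office_location: Optional[str] = None,
    job_title: Optional[str] = None,
    now: Optional[str] = None,
) -> None:
    """Create the user on first login, refresh profile fields on later logins."""
    now = now or datetime.now(timezone.utc).isoformat()
    conn.execute(
        UPSERT_USER,
        {
            "user_id": user_id,
            "display_name": display_name,
            "office_location": office_location,
            "job_title": job_title,
            "now": now,
        },
    )
    conn.commit()
```

A single statement avoids the race between "check if the user exists" and "insert" when the same user logs in from two sessions at once.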
**Usage in Reports:**
```sql
SELECT m.content, m.created_at, u.display_name
FROM messages m
LEFT JOIN users u ON m.sender_id = u.user_id
WHERE m.room_id = ?
ORDER BY m.created_at;
```
### Generated Reports Table
```sql
CREATE TABLE generated_reports (
    report_id VARCHAR(36) PRIMARY KEY,
    room_id VARCHAR(36) NOT NULL REFERENCES incident_rooms(room_id),
    -- Generation metadata
    generated_by VARCHAR(255) NOT NULL,  -- User who triggered generation
    generated_at TIMESTAMP DEFAULT NOW(),
    -- Status tracking: pending, collecting_data, generating_content,
    -- assembling_document, completed, failed
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    error_message TEXT,
    -- AI metadata
    dify_message_id VARCHAR(100),
    dify_conversation_id VARCHAR(100),
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    -- Report storage
    report_title VARCHAR(255),
    report_json JSONB,               -- Parsed AI output
    docx_storage_path VARCHAR(500)   -- MinIO path or local path
);
-- Indexes (PostgreSQL requires separate CREATE INDEX statements)
CREATE INDEX ix_generated_reports_room ON generated_reports (room_id, generated_at DESC);
CREATE INDEX ix_generated_reports_status ON generated_reports (status);
```
## Configuration
```python
# app/core/config.py additions
class Settings(BaseSettings):
    # ... existing settings ...

    # DIFY AI Service
    DIFY_BASE_URL: str = "https://dify.theaken.com/v1"
    DIFY_API_KEY: str  # Required, from .env
    DIFY_TIMEOUT_SECONDS: int = 120  # AI generation can take time

    # Report Generation
    REPORT_MAX_MESSAGES: int = 200  # Summarize if exceeded
    REPORT_STORAGE_PATH: str = "reports"  # MinIO path prefix
```
## Error Handling Strategy
| Error Type | Handling |
|------------|----------|
| DIFY API timeout | Retry once, then fail with timeout error |
| DIFY returns non-JSON | Attempt to extract JSON from the response; if extraction fails, retry once with a simplified prompt |
| JSON schema validation fails | Log raw response, return error with details |
| MinIO image download fails | Skip image, add note in report |
| python-docx assembly fails | Return partial report or error |
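For the "DIFY returns non-JSON" row, one plausible extraction strategy is to try the raw answer first, then any fenced block, then the outermost brace pair. The sketch below is an assumption about how such a fallback could work; `extract_json_object` is a hypothetical name and the exact patterns used by the project are not stated in this document.

```python
import json
import re
from typing import Any, Dict, Optional

# The fence marker is built at runtime so this example can itself sit inside a fenced block.
_FENCE = "`" * 3
_FENCED_BLOCK = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL)


def extract_json_object(answer: str) -> Optional[Dict[str, Any]]:
    """Best-effort recovery of a JSON object from a model answer."""
    candidates = [answer]
    # Models often wrap JSON in a Markdown code fence.
    candidates.extend(_FENCED_BLOCK.findall(answer))
    # Fall back to the text between the first "{" and the last "}".
    start, end = answer.find("{"), answer.rfind("}")
    if start != -1 and end > start:
        candidates.append(answer[start:end + 1])

    for text in candidates:
        try:
            parsed = json.loads(text.strip())
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None  # caller decides whether to retry or mark the report as failed
```

Returning `None` instead of raising keeps the retry decision in the orchestration layer, where the report status is updated.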
## Security Considerations
- DIFY API key stored in environment variable, never logged
- Room membership verified before report generation
- Generated reports inherit room access permissions
- Report download URLs are direct (no presigned URLs needed as they're behind auth)
## Performance Considerations
- Report generation can run as a background task; the DIFY call itself uses `blocking` response mode (no streaming) for simplicity
- Large rooms: when the message count exceeds `REPORT_MAX_MESSAGES`, older messages are summarized by day
- Images are downloaded in parallel using asyncio.gather
- Reports cached in database to avoid regeneration
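The parallel image download mentioned above could be sketched like this. The `fetch` callable stands in for whatever MinIO download coroutine the project uses; its signature and the name `download_images` are assumptions. Passing `return_exceptions=True` to `asyncio.gather` lets one missing file be skipped without cancelling the others, which matches the "skip image, add note in report" policy.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


async def download_images(
    file_ids: List[str],
    fetch: Callable[[str], Awaitable[bytes]],
) -> Tuple[Dict[str, bytes], List[str]]:
    """Download all images concurrently; return (images, failed_ids)."""
    results = await asyncio.gather(
        *(fetch(file_id) for file_id in file_ids),
        return_exceptions=True,  # collect failures instead of raising the first one
    )
    images: Dict[str, bytes] = {}
    failed: List[str] = []
    # gather preserves input order, so results line up with file_ids.
    for file_id, result in zip(file_ids, results):
        if isinstance(result, BaseException):
            failed.append(file_id)
        else:
            images[file_id] = result
    return images, failed
```

The `failed` list gives the document assembler what it needs to insert a placeholder for each image that could not be loaded.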


@@ -0,0 +1,127 @@
# Change: Add AI Report Generation with DIFY
## Why
The Task Reporter system currently supports real-time incident communication, file uploads, and chat room management. However, after an incident is resolved, operators must manually compile reports from scattered chat messages, which is time-consuming and error-prone. According to the project specification, automated report generation is a core business value:
> "The system uses on-premise AI to automatically generate professional .docx reports with timelines and embedded evidence."

Without this capability:
- Report compilation takes hours instead of minutes
- Important details may be missed in manual transcription
- No standardized report format across incidents
- Difficult to maintain audit trails for compliance

This change integrates the DIFY AI service to automatically distill chat room conversations into structured incident reports, with embedded images and file attachments.
## What Changes
This proposal adds a new **ai-report-generation** capability that:
1. **Integrates DIFY Chat API** for AI-powered content generation
2. **Collects room data** (messages, members, files, metadata) for AI processing
3. **Generates structured JSON reports** using carefully crafted prompts with examples
4. **Assembles .docx documents** using python-docx with embedded images from MinIO
5. **Provides REST API endpoints** for report generation and download
6. **Adds WebSocket notifications** for report generation progress
### Integration Details
| Component | Value |
|-----------|-------|
| DIFY Base URL | `https://dify.theaken.com/v1` |
| DIFY Endpoint | `/chat-messages` (Chat Flow) |
| Response Mode | `blocking` (wait for complete response) |
| AI Output | JSON format with predefined structure |
| Document Format | `.docx` (Microsoft Word) |
### Report Sections (Generated by AI)
- 事件摘要 (Event Summary)
- 時間軸 (Timeline)
- 參與人員 (Participants)
- 處理過程 (Resolution Process)
- 目前狀態 (Current Status)
- 最終處置結果 (Final Resolution - if available)
### File/Image Handling Strategy
- **AI does NOT receive files/images** (avoids DIFY's complex file upload flow)
- **Files are mentioned in text**: "[附件: filename.jpg]" ("attachment") annotations in messages
- **Images embedded by Python**: Downloaded from MinIO and inserted into .docx
- **File attachments section**: Listed with metadata at end of report
### User Display Name Resolution
The system needs to display user names (e.g., "劉念蓉") in reports instead of email addresses. Since `user_sessions` data is temporary, we add a permanent `users` table:
```sql
CREATE TABLE users (
    user_id VARCHAR(255) PRIMARY KEY,  -- email address
    display_name VARCHAR(255) NOT NULL,
    office_location VARCHAR(100),
    job_title VARCHAR(100),
    last_login_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW()
);
```
- **Populated on login**: Auth module creates/updates user record from AD API response
- **Used in reports**: JOIN with messages/room_members to get display names
- **Permanent storage**: User info persists even after session expires
### Dependencies
- **Requires**: `authentication` (user identity + user table), `chat-room` (room data), `realtime-messaging` (message history), `file-storage` (embedded images)
- **External**: DIFY API service at `https://dify.theaken.com/v1`
- **Python packages**: `python-docx`, `httpx` (async HTTP client)
### Spec Deltas
- **ADDED** `ai-report-generation` spec with 6 requirements covering user display name resolution, data collection, AI integration, document assembly, API endpoints, and status tracking with notifications
### Risks
- DIFY service must be accessible (network dependency)
- AI may produce inconsistent JSON if prompts are not well-structured (mitigation: strict JSON schema + examples in prompt)
- Large rooms with many messages may exceed token limits (mitigation: summarize older messages)
## Impact
- **Affected specs**: `ai-report-generation` (new capability)
- **Affected code**:
  - Backend: New `app/modules/report_generation/` module with:
    - Routes: `POST /api/rooms/{room_id}/reports/generate`, `GET /api/rooms/{room_id}/reports`, `GET /api/rooms/{room_id}/reports/{report_id}`, `GET /api/rooms/{room_id}/reports/{report_id}/download`
    - Services: `DifyService`, `ReportDataService`, `DocxAssemblyService`
    - Models: `GeneratedReport` (SQLAlchemy)
    - Schemas: Request/Response models
  - Config: New DIFY settings in `app/core/config.py`
  - Storage: Reports stored temporarily or in MinIO
- **Database**: New `users` table for display name resolution and new `generated_reports` table for report metadata and status tracking
## Scenarios
### Happy Path: Generate Incident Report
1. Supervisor opens resolved incident room
2. Clicks "Generate Report" button
3. Frontend calls `POST /api/rooms/{room_id}/reports/generate`
4. Backend collects all messages, members, files, and room metadata
5. Backend sends structured prompt to DIFY Chat API
6. DIFY returns JSON report structure
7. Backend parses JSON and validates structure
8. Backend downloads images from MinIO
9. Backend assembles .docx with python-docx
10. Backend stores report and returns report_id
11. Frontend downloads report via `GET /api/rooms/{room_id}/reports/{report_id}/download`
### Edge Case: AI Returns Invalid JSON
1. DIFY returns malformed JSON or missing fields
2. Backend detects validation error
3. Backend attempts to extract JSON from the response; if that fails, it retries once with a simplified prompt
4. If still failing, returns error with raw AI response for debugging
5. Frontend displays error message to user
### Edge Case: Large Room with Many Messages
1. Room has 500+ messages spanning multiple days
2. Backend detects message count exceeds threshold (e.g., 200)
3. Backend summarizes older messages by day
4. Sends condensed data to DIFY within token limits
5. Report generation completes successfully


@@ -0,0 +1,264 @@
# Capability: AI Report Generation
Automated incident report generation using DIFY AI service to distill chat room conversations into structured .docx documents with embedded images.
## ADDED Requirements
### Requirement: User Display Name Resolution
The system SHALL maintain a permanent `users` table to store user display names from AD authentication, enabling reports to show names instead of email addresses.
#### Scenario: Create user record on first login
- **GIVEN** user "ymirliu@panjit.com.tw" logs in for the first time
- **AND** the AD API returns userInfo with name "ymirliu 劉念蓉"
- **WHEN** authentication succeeds
- **THEN** the system SHALL create a new record in `users` table with:
  - user_id: "ymirliu@panjit.com.tw"
  - display_name: "ymirliu 劉念蓉"
  - office_location: "高雄" (from AD API)
  - job_title: null (from AD API)
  - last_login_at: current timestamp
  - created_at: current timestamp
#### Scenario: Update user record on subsequent login
- **GIVEN** user "ymirliu@panjit.com.tw" already exists in `users` table
- **AND** the user's display_name in AD has changed to "劉念蓉 Ymir"
- **WHEN** the user logs in again
- **THEN** the system SHALL update the existing record with:
  - display_name: "劉念蓉 Ymir"
  - last_login_at: current timestamp
- **AND** preserve the original created_at timestamp
#### Scenario: Resolve display name for report
- **GIVEN** a message was sent by "ymirliu@panjit.com.tw"
- **AND** the users table contains display_name "ymirliu 劉念蓉" for this user
- **WHEN** report data is collected
- **THEN** the system SHALL JOIN with users table
- **AND** return display_name "ymirliu 劉念蓉" instead of email address
#### Scenario: Handle unknown user gracefully
- **GIVEN** a message was sent by "olduser@panjit.com.tw"
- **AND** this user does not exist in the users table (never logged in to new system)
- **WHEN** report data is collected
- **THEN** the system SHALL use the email address as fallback display name
- **AND** format it as "olduser@panjit.com.tw" in the report
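The join-with-fallback behaviour in the two scenarios above can be expressed in a single query using `COALESCE`. This sketch runs against `sqlite3` for portability; the column names follow this document, while the function name `fetch_messages_with_names` is an assumption.

```python
import sqlite3
from typing import List, Tuple

# LEFT JOIN keeps messages from users who never logged in;
# COALESCE then falls back to the sender's email address.
MESSAGES_WITH_NAMES = """
SELECT m.content,
       m.created_at,
       COALESCE(u.display_name, m.sender_id) AS sender_name
FROM messages m
LEFT JOIN users u ON m.sender_id = u.user_id
WHERE m.room_id = ?
ORDER BY m.created_at
"""


def fetch_messages_with_names(
    conn: sqlite3.Connection, room_id: str
) -> List[Tuple[str, str, str]]:
    """Return (content, created_at, sender_name) rows for one room."""
    return conn.execute(MESSAGES_WITH_NAMES, (room_id,)).fetchall()
```

Doing the fallback in SQL means every consumer of the query gets a non-null sender name without repeating the logic in Python.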
---
### Requirement: Report Data Collection
The system SHALL collect all relevant room data for AI processing, including messages, members, files, and room metadata.
#### Scenario: Collect complete room data for report generation
- **GIVEN** an incident room with ID `room-123` exists
- **AND** the room has 50 messages from 5 members
- **AND** the room has 3 uploaded files (2 images, 1 PDF)
- **WHEN** the report data service collects room data
- **THEN** the system SHALL return a structured data object containing:
  - Room metadata (title, incident_type, severity, status, location, description, timestamps)
  - All 50 messages sorted by created_at ascending
  - All 5 members with their roles (owner, editor, viewer)
  - All 3 files with metadata (filename, type, uploader, upload time)
- **AND** messages SHALL include sender display name (not just user_id)
- **AND** file references in messages SHALL be annotated as "[附件: filename.ext]"
#### Scenario: Handle room with no messages
- **GIVEN** an incident room was just created with no messages
- **WHEN** report generation is requested
- **THEN** the system SHALL return an error indicating insufficient data for report generation
- **AND** the error message SHALL be "事件聊天室尚無訊息記錄,無法生成報告" ("The incident room has no messages yet; a report cannot be generated")
#### Scenario: Summarize large rooms exceeding message limit
- **GIVEN** an incident room has 500 messages spanning 5 days
- **AND** the REPORT_MAX_MESSAGES limit is 200
- **WHEN** report data is collected
- **THEN** the system SHALL keep the most recent 150 messages in full
- **AND** summarize older messages by day (e.g., "2025-12-01: 45 則訊息討論設備檢修", meaning "45 messages discussing equipment maintenance")
- **AND** the total formatted content SHALL stay within token limits
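A minimal sketch of the split described in this scenario follows. It keeps the most recent 150 messages and reduces older ones to per-day counts; producing the topic part of each digest (such as "討論設備檢修") would need an extra summarization step that is not shown. The function name `split_for_report`, the tuple layout, and the `keep_recent` parameter are assumptions.

```python
from collections import Counter
from datetime import datetime
from typing import List, Tuple

# Hypothetical row layout: (created_at, sender_name, content)
Message = Tuple[datetime, str, str]


def split_for_report(
    messages: List[Message],
    max_messages: int = 200,  # mirrors REPORT_MAX_MESSAGES
    keep_recent: int = 150,   # number of recent messages kept in full
) -> Tuple[List[str], List[Message]]:
    """Return (daily_digests, recent_messages) for prompt construction."""
    ordered = sorted(messages, key=lambda m: m[0])
    if len(ordered) <= max_messages:
        return [], ordered  # small room: send everything in full

    older, recent = ordered[:-keep_recent], ordered[-keep_recent:]
    per_day = Counter(m[0].date().isoformat() for m in older)
    # Count-only digest; topic text would come from a separate summarizer.
    digests = [f"{day}: {count} 則訊息" for day, count in sorted(per_day.items())]
    return digests, recent
```

Splitting by count rather than by age keeps the behaviour predictable for rooms where most activity happens on a single day.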
---
### Requirement: DIFY AI Integration
The system SHALL integrate with DIFY Chat API to generate structured report content from collected room data.
#### Scenario: Successful report generation via DIFY
- **GIVEN** room data has been collected successfully
- **WHEN** the DIFY service is called with the formatted prompt
- **THEN** the system SHALL send a POST request to `{DIFY_BASE_URL}/chat-messages`
- **AND** include Authorization header with Bearer token
- **AND** set response_mode to "blocking"
- **AND** set user to the room_id for tracking
- **AND** parse the JSON from the `answer` field in the response
- **AND** validate the JSON structure matches expected schema
#### Scenario: DIFY returns invalid JSON
- **GIVEN** DIFY returns a response where `answer` is not valid JSON
- **WHEN** the system attempts to parse the response
- **THEN** the system SHALL attempt to extract JSON using regex patterns
- **AND** if extraction fails, retry the request once with a simplified prompt
- **AND** if retry fails, return error with status "failed" and store raw response for debugging
#### Scenario: DIFY API timeout
- **GIVEN** the DIFY API does not respond within DIFY_TIMEOUT_SECONDS (120s)
- **WHEN** the timeout is reached
- **THEN** the system SHALL cancel the request
- **AND** return error with message "AI 服務回應超時,請稍後再試" ("The AI service timed out; please try again later")
- **AND** log the timeout event with room_id and request duration
#### Scenario: DIFY API authentication failure
- **GIVEN** the DIFY_API_KEY is invalid or expired
- **WHEN** the DIFY API returns 401 Unauthorized
- **THEN** the system SHALL return error with message "AI 服務認證失敗,請聯繫系統管理員" ("AI service authentication failed; please contact the system administrator")
- **AND** log the authentication failure (without exposing the key)
---
### Requirement: Document Assembly
The system SHALL assemble professional .docx documents from AI-generated content with embedded images from MinIO.
#### Scenario: Generate complete report document
- **GIVEN** DIFY has returned valid JSON report content
- **AND** the room has 2 image attachments in MinIO
- **WHEN** the docx assembly service creates the document
- **THEN** the system SHALL create a .docx file with:
  - Report title: "生產線異常處理報告 - {room.title}" ("Production Line Incident Report - {room.title}")
  - Generation metadata: 生成時間 (generated at), 事件編號 (incident ID), 生成者 (generated by)
  - Section 1: 事件摘要 (Summary, from AI summary.content)
  - Section 2: 事件時間軸 (Timeline, formatted table from AI timeline.events)
  - Section 3: 參與人員 (Participants, formatted list from AI participants.members)
  - Section 4: 處理過程 (Resolution Process, from AI resolution_process.content)
  - Section 5: 目前狀態 (Current Status, from AI current_status)
  - Section 6: 最終處置結果 (Final Resolution, from AI final_resolution, if has_resolution=true)
  - Section 7: 附件 (Attachments: embedded images + file list)
- **AND** images SHALL be embedded at appropriate size (max width 15cm)
- **AND** document SHALL use professional formatting (標楷體, i.e. DFKai-SB, or similar)
#### Scenario: Handle missing images during assembly
- **GIVEN** a file reference exists in the database
- **BUT** the actual file is missing from MinIO
- **WHEN** the docx service attempts to embed the image
- **THEN** the system SHALL skip the missing image
- **AND** add a placeholder text: "[圖片無法載入: {filename}]" ("[Image could not be loaded: {filename}]")
- **AND** continue with document assembly
- **AND** log a warning with file_id and room_id
#### Scenario: Generate report for room without images
- **GIVEN** the room has no image attachments
- **WHEN** the docx assembly service creates the document
- **THEN** the system SHALL create a complete document without the embedded images section
- **AND** the attachments section SHALL show "本事件無附件檔案" ("This incident has no attached files") if no files exist
---
### Requirement: Report Generation API
The system SHALL provide REST API endpoints for triggering report generation and downloading generated reports.
#### Scenario: Trigger report generation
- **GIVEN** user "supervisor@company.com" is a member of room "room-123"
- **AND** the room status is "resolved" or "archived"
- **WHEN** the user sends `POST /api/rooms/room-123/reports/generate`
- **THEN** the system SHALL create a new report record with status "pending"
- **AND** return immediately with report_id and status
- **AND** process the report generation asynchronously
- **AND** update status to "completed" when done
#### Scenario: Generate report for active room
- **GIVEN** user requests report for a room with status "active"
- **WHEN** the request is processed
- **THEN** the system SHALL allow generation with a warning
- **AND** include note in report: "注意:本報告生成時事件尚未結案" ("Note: the incident had not been closed when this report was generated")
#### Scenario: Download generated report
- **GIVEN** a report with ID "report-456" has status "completed"
- **AND** the report belongs to room "room-123"
- **WHEN** user sends `GET /api/rooms/room-123/reports/report-456/download`
- **THEN** the system SHALL return the .docx file
- **AND** set Content-Type to "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
- **AND** set Content-Disposition to "attachment; filename={report_title}_{date}.docx"
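Because report titles are in Traditional Chinese, the filename in `Content-Disposition` cannot be sent as raw non-ASCII text in an HTTP header. One common approach, sketched below, is to send an ASCII fallback plus a percent-encoded `filename*` parameter (RFC 5987 / RFC 6266). The date format `YYYYMMDD`, the fallback name `report_<date>.docx`, and the function name `download_headers` are assumptions; the spec only states `{report_title}_{date}.docx`.

```python
from datetime import date
from typing import Dict
from urllib.parse import quote

DOCX_MIME = (
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)


def download_headers(report_title: str, generated_on: date) -> Dict[str, str]:
    """Build response headers for a .docx download with a Unicode filename."""
    stamp = f"{generated_on:%Y%m%d}"  # assumed date format
    filename = f"{report_title}_{stamp}.docx"
    ascii_fallback = f"report_{stamp}.docx"  # for clients that ignore filename*
    encoded = quote(filename, safe="")  # percent-encode the UTF-8 bytes
    disposition = (
        f'attachment; filename="{ascii_fallback}"; '
        f"filename*=UTF-8''{encoded}"
    )
    return {"Content-Type": DOCX_MIME, "Content-Disposition": disposition}
```

Clients that understand `filename*` show the Chinese title; older clients fall back to the plain ASCII name.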
#### Scenario: List room reports
- **GIVEN** room "room-123" has 3 previously generated reports
- **WHEN** user sends `GET /api/rooms/room-123/reports`
- **THEN** the system SHALL return a list of reports with:
  - report_id
  - generated_at
  - generated_by
  - status
  - report_title
- **AND** results SHALL be sorted by generated_at descending
#### Scenario: Unauthorized report access
- **GIVEN** user "outsider@company.com" is NOT a member of room "room-123"
- **WHEN** the user attempts to generate or download a report
- **THEN** the system SHALL return 403 Forbidden
- **AND** the error message SHALL be "您沒有此事件的存取權限" ("You do not have access to this incident")
---
### Requirement: Report Generation Status and Notifications
The system SHALL track report generation status and notify users of completion via WebSocket.
#### Scenario: Track report generation progress
- **GIVEN** a report generation has been triggered
- **WHEN** the generation process runs
- **THEN** the system SHALL update report status through stages:
  - "pending" → initial state
  - "collecting_data" → gathering room data
  - "generating_content" → calling DIFY API
  - "assembling_document" → creating .docx
  - "completed" → finished successfully
  - "failed" → error occurred
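The stages above form a simple linear pipeline with one failure exit. A sketch of how that could be encoded is shown below; the enum and the `next_status` helper are illustrative names, not the project's code. Note that every value fits the `VARCHAR(20)` status column.

```python
from enum import Enum


class ReportStatus(str, Enum):
    PENDING = "pending"
    COLLECTING_DATA = "collecting_data"
    GENERATING_CONTENT = "generating_content"
    ASSEMBLING_DOCUMENT = "assembling_document"
    COMPLETED = "completed"
    FAILED = "failed"


# Successful runs move through these stages in order; any stage may jump to FAILED.
_PIPELINE = [
    ReportStatus.PENDING,
    ReportStatus.COLLECTING_DATA,
    ReportStatus.GENERATING_CONTENT,
    ReportStatus.ASSEMBLING_DOCUMENT,
    ReportStatus.COMPLETED,
]


def next_status(current: ReportStatus) -> ReportStatus:
    """Return the stage that follows `current` on the success path."""
    if current in (ReportStatus.COMPLETED, ReportStatus.FAILED):
        raise ValueError(f"{current.value} is a terminal status")
    return _PIPELINE[_PIPELINE.index(current) + 1]
```

Subclassing `str` lets the enum values be stored in the database column and serialized in API responses without conversion.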
#### Scenario: Notify via WebSocket on completion
- **GIVEN** user is connected to room WebSocket
- **AND** report generation completes successfully
- **WHEN** the status changes to "completed"
- **THEN** the system SHALL broadcast to room members:
```json
{
"type": "report_generated",
"report_id": "report-456",
"report_title": "生產線異常處理報告",
"generated_by": "supervisor@company.com",
"generated_at": "2025-12-04T16:30:00+08:00"
}
```
#### Scenario: Notify on generation failure
- **GIVEN** report generation fails
- **WHEN** the status changes to "failed"
- **THEN** the system SHALL broadcast to the user who triggered generation:
```json
{
"type": "report_generation_failed",
"report_id": "report-456",
"error": "AI 服務回應超時,請稍後再試"
}
```
- **AND** the error message SHALL be user-friendly (no technical details)


@@ -0,0 +1,304 @@
# Implementation Tasks
## 0. Users Table for Display Name Resolution
- [x] 0.1 Create `app/modules/auth/models.py` - Add `User` model:
  - `user_id` (PK, VARCHAR 255) - email address
  - `display_name` (VARCHAR 255, NOT NULL)
  - `office_location` (VARCHAR 100, nullable)
  - `job_title` (VARCHAR 100, nullable)
  - `last_login_at` (TIMESTAMP)
  - `created_at` (TIMESTAMP, default NOW)
- [x] 0.2 Create `app/modules/auth/services/user_service.py`:
  - `upsert_user(user_id, display_name, office_location, job_title)` function
  - Uses SQLAlchemy merge or INSERT ON CONFLICT for atomic upsert
  - Updates `last_login_at` on every call
- [x] 0.3 Modify `app/modules/auth/router.py` login endpoint:
  - After successful AD authentication, call `upsert_user()` with:
    - `user_id`: userInfo.email
    - `display_name`: userInfo.name
    - `office_location`: userInfo.officeLocation
    - `job_title`: userInfo.jobTitle
- [x] 0.4 Run database migration to create `users` table
- [x] 0.5 Write unit tests for user upsert:
  - Test new user creation
  - Test existing user update
  - Test last_login_at update
## 1. Configuration and Dependencies
- [x] 1.1 Add DIFY settings to `app/core/config.py`:
  - `DIFY_BASE_URL`: str = "https://dify.theaken.com/v1"
  - `DIFY_API_KEY`: str (required)
  - `DIFY_TIMEOUT_SECONDS`: int = 120
  - `REPORT_MAX_MESSAGES`: int = 200
  - `REPORT_STORAGE_PATH`: str = "reports"
- [x] 1.2 Update `.env.example` with DIFY configuration variables
- [x] 1.3 Add dependencies to `requirements.txt`:
  - `python-docx>=1.1.0`
  - `httpx>=0.27.0` (async HTTP client for DIFY API)
- [x] 1.4 Install dependencies: `pip install -r requirements.txt`
## 2. Database Schema and Models
- [x] 2.1 Create `app/modules/report_generation/models.py`:
  - `GeneratedReport` SQLAlchemy model with fields:
    - report_id (PK, UUID)
    - room_id (FK to incident_rooms)
    - generated_by, generated_at
    - status (pending/collecting_data/generating_content/assembling_document/completed/failed)
    - error_message
    - dify_message_id, dify_conversation_id
    - prompt_tokens, completion_tokens
    - report_title, report_json (JSONB)
    - docx_storage_path
- [x] 2.2 Create `app/modules/report_generation/schemas.py`:
  - `ReportGenerateRequest` (optional parameters)
  - `ReportGenerateResponse` (report_id, status)
  - `ReportStatusResponse` (full report metadata)
  - `ReportListResponse` (list of reports)
  - `AIReportContent` (validated JSON schema from DIFY)
- [x] 2.3 Run database migration to create `generated_reports` table
## 3. DIFY Service Integration
- [x] 3.1 Create `app/modules/report_generation/prompts.py`:
  - System prompt template (Traditional Chinese)
  - JSON output schema with examples
  - User query template for room data formatting
- [x] 3.2 Create `app/modules/report_generation/services/dify_service.py`:
  - `DifyService` class with httpx async client
  - `generate_report_content(prompt: str) -> dict` method
  - Request construction with headers and body
  - Response parsing and JSON extraction
  - Error handling (timeout, auth failure, invalid JSON)
  - Retry logic for recoverable errors
- [x] 3.3 Write unit tests for DIFY service:
  - Mock successful API response
  - Mock timeout scenario
  - Mock invalid JSON response
  - Mock authentication failure
## 4. Report Data Collection Service
- [x] 4.1 Create `app/modules/report_generation/services/report_data_service.py`:
  - `ReportDataService` class
  - `collect_room_data(room_id: str) -> RoomReportData` method
  - Query room metadata from `incident_rooms`
  - Query messages with sender display names
  - Query members with roles
  - Query files with metadata
  - Handle message limit (summarize if exceeds REPORT_MAX_MESSAGES)
- [x] 4.2 Create data models for collected data:
  - `RoomReportData` dataclass
  - `MessageData` dataclass
  - `MemberData` dataclass
  - `FileData` dataclass
- [x] 4.3 Create prompt builder function:
  - `build_report_prompt(room_data: RoomReportData) -> str`
  - Format room metadata section
  - Format members section
  - Format messages timeline
  - Format files section
- [x] 4.4 Write unit tests for data collection:
  - Test with normal room data
  - Test with empty room (should raise error)
  - Test message summarization for large rooms
## 5. Document Assembly Service
- [x] 5.1 Create `app/modules/report_generation/services/docx_service.py`:
  - `DocxAssemblyService` class
  - `create_document(report_content: dict, room_data: RoomReportData) -> BytesIO` method
  - Document title and metadata header
  - Section formatting (headings, paragraphs, tables)
  - Timeline table generation
  - Member list formatting
- [x] 5.2 Implement image embedding:
  - Download images from MinIO using existing `minio_service`
  - Resize images to max width (15cm)
  - Insert images into document
  - Handle missing images gracefully
- [x] 5.3 Implement document styling:
  - Set default font (標楷體 / DFKai-SB or 微軟正黑體 / Microsoft JhengHei)
  - Set heading styles
  - Set paragraph spacing
  - Set table styles
- [x] 5.4 Write unit tests for docx assembly:
  - Test basic document creation
  - Test with embedded images (mock MinIO)
  - Test without images
  - Test missing image handling
## 6. REST API Router
- [x] 6.1 Create `app/modules/report_generation/router.py`:
  - `POST /api/rooms/{room_id}/reports/generate` - Trigger generation
  - `GET /api/rooms/{room_id}/reports` - List reports
  - `GET /api/rooms/{room_id}/reports/{report_id}` - Get report status
  - `GET /api/rooms/{room_id}/reports/{report_id}/download` - Download .docx
- [x] 6.2 Create `app/modules/report_generation/dependencies.py`:
  - `verify_room_access` - Check user is room member
  - `verify_report_access` - Check report belongs to accessible room
- [x] 6.3 Implement generate endpoint:
  - Verify room membership
  - Create report record with status "pending"
  - Return report_id immediately
  - Trigger async report generation (can use background task or sync for MVP)
- [x] 6.4 Implement download endpoint:
  - Verify report exists and is completed
  - Load .docx from storage
  - Return as file response with proper headers
- [x] 6.5 Register router in `app/main.py`
## 7. Report Generation Orchestration
- [x] 7.1 Create main orchestration function in `services/__init__.py`:
  - `generate_report(room_id: str, user_id: str, db: Session) -> str`
  - Update status at each stage
  - Call data collection service
  - Call DIFY service
  - Call docx assembly service
  - Store document (MinIO or local)
  - Update final status
- [x] 7.2 Implement error handling:
  - Catch and log all exceptions
  - Update report status to "failed" with user-friendly error message
  - Store technical error in database for debugging
- [x] 7.3 Implement document storage:
  - Upload .docx to MinIO under `reports/{room_id}/{report_id}.docx`
  - Store path in database
## 8. WebSocket Notifications
- [x] 8.1 Add report notification schemas to `app/modules/realtime/schemas.py`:
  - `ReportGeneratedBroadcast`
  - `ReportGenerationFailedBroadcast`
- [x] 8.2 Integrate WebSocket broadcast in report generation:
  - Broadcast `report_generated` on success
  - Broadcast `report_generation_failed` on failure
## 9. Frontend Integration
- [x] 9.1 Create `frontend/src/services/reports.ts`:
  - `generateReport(roomId: string): Promise<{report_id: string}>`
  - `listReports(roomId: string): Promise<Report[]>`
  - `getReportStatus(roomId: string, reportId: string): Promise<Report>`
  - `downloadReport(roomId: string, reportId: string): Promise<Blob>`
- [x] 9.2 Add TypeScript types for reports in `frontend/src/types/index.ts`
- [x] 9.3 Create report generation hooks in `frontend/src/hooks/useReports.ts`:
  - `useGenerateReport` mutation
  - `useReportList` query
  - `useReportStatus` query
- [x] 9.4 Add "Generate Report" button to RoomDetail page:
  - Show only for resolved/archived rooms (or with warning for active)
  - Disable during generation
  - Show progress indicator
- [x] 9.5 Add report list and download UI:
  - Show list of generated reports
  - Download button for each completed report
  - Status indicator for pending/failed reports
- [x] 9.6 Handle WebSocket report notifications:
  - Update UI when report_generated received
  - Show toast notification
  - Refresh report list
## 10. Integration Testing
- [x] 10.1 Create `tests/test_report_generation.py`:
  - Test full report generation flow (with mocked DIFY)
  - Test API endpoints (generate, list, download)
  - Test permission checks
  - Test error scenarios
- [x] 10.2 Create integration test with real DIFY (optional, manual):
  - Test with sample room data
  - Verify JSON output format
  - Check document quality
## 11. Documentation
- [x] 11.1 Update API documentation with new endpoints
- [x] 11.2 Update .env.example with all DIFY configuration
---
## Task Dependencies
```
0.1 ─▶ 0.2 ─▶ 0.3 ─▶ 0.4 ─▶ 0.5
1.1 ─┬─▶ 2.1 ─▶ 2.3 ─┴─▶ 4.1 (needs users table for JOIN)
1.2 ─┘
1.3 ─▶ 1.4 ─┬─▶ 3.1 ─▶ 3.2 ─▶ 3.3
            ├─▶ 4.1 ─▶ 4.2 ─▶ 4.3 ─▶ 4.4
            └─▶ 5.1 ─▶ 5.2 ─▶ 5.3 ─▶ 5.4
2.2 ─┬─▶ 6.1 ─▶ 6.2 ─▶ 6.3 ─▶ 6.4 ─▶ 6.5
3.2 ─┼─▶ 7.1 ─▶ 7.2 ─▶ 7.3
4.1 ─┤
5.1 ─┘
7.1 ─▶ 8.1 ─▶ 8.2
6.5 ─▶ 9.1 ─▶ 9.2 ─▶ 9.3 ─▶ 9.4 ─▶ 9.5 ─▶ 9.6
All ─▶ 10.1 ─▶ 10.2 ─▶ 11.1 ─▶ 11.2
```
## Parallelizable Work
Ordering and parallelization notes:
- Section 0 (Users Table) is a prerequisite and should be done first
- Sections 3 (DIFY Service), 4 (Data Collection), and 5 (Docx Assembly) can be done in parallel
- Section 6 (API Router) can start once Section 2 (Schemas) is done
- Section 9 (Frontend) can start once Section 6 (API) is done
## Summary
| Section | Tasks | Description |
|---------|-------|-------------|
| 0. Users Table | 5 | Display name resolution |
| 1. Config | 4 | Configuration and dependencies |
| 2. Database | 3 | Models and schemas |
| 3. DIFY | 3 | AI service integration |
| 4. Data Collection | 4 | Room data gathering |
| 5. Docx Assembly | 4 | Document generation |
| 6. REST API | 5 | API endpoints |
| 7. Orchestration | 3 | Main generation flow |
| 8. WebSocket | 2 | Notifications |
| 9. Frontend | 6 | UI integration |
| 10. Testing | 2 | Integration tests |
| 11. Documentation | 2 | Docs update |
| **Total** | **43** | |