15th_final and clean

This commit is contained in:
beabigegg
2025-09-04 16:37:33 +08:00
parent e680d2af5a
commit f093f4bbc2
134 changed files with 1302 additions and 18016 deletions

View File

@@ -36,7 +36,15 @@
"mcp__puppeteer__puppeteer_connect_active_tab",
"Read(C:\\Users\\EGG\\WORK\\data\\user_scrip\\TOOL\\AI_meeting_assistant - V2.1/**)",
"Read(C:\\Users\\EGG\\WORK\\data\\user_scrip\\TOOL\\AI_meeting_assistant - V2.1\\services/**)",
"Bash(timeout:*)"
"Bash(timeout:*)",
"Bash(del:*)",
"Read(C:\\Users\\EGG\\WORK\\data\\user_scrip\\TOOL/**)",
"Bash(docker logs:*)",
"Bash(docker stop:*)",
"Bash(docker rm:*)",
"Bash(docker build:*)",
"Bash(docker run:*)",
"Bash(docker exec:*)"
],
"deny": [],
"ask": []

38
.dockerignore Normal file
View File

@@ -0,0 +1,38 @@
# Node modules
node_modules/
frontend/node_modules/
# Python cache
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
*.so
# Virtual environments
venv/
env/
# Logs
*.log
logs/
# OS files
.DS_Store
Thumbs.db
# IDE files
.vscode/
.idea/
*.swp
*.swo
# Development files
*.tmp
*.temp
.cache/
# Upload files (for development)
uploads/*
!uploads/.gitkeep

373
DEPLOYMENT.md Normal file
View File

@@ -0,0 +1,373 @@
# PANJIT 文件翻譯系統 - 部署手冊
## 部署概述
本系統已完成生產環境準備,包含完整的 Docker 配置和環境設定。系統使用 12010 端口,符合公司端口規範 (12010-12019)。
## 系統架構
```
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Frontend │ │ Backend │ │ Celery │
│ (Vue.js) │ │ (Flask) │ │ (Worker) │
│ Port: 12010 │────│ Port: 12010 │────│ Background │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────────┼───────────────────────┘
┌─────────────────┐
│ Redis │
│ (Queue/Cache) │
│ Port: 6379 │
└─────────────────┘
```
## 快速部署
### 前置需求
- Docker 20.10+
- Docker Compose 1.28+
- 至少 4GB 可用記憶體
- 至少 20GB 可用磁碟空間
### 一鍵部署
```bash
# 1. 進入專案目錄
cd Document_translator_V2
# 2. 建置並啟動所有服務
docker-compose up -d
# 3. 檢查服務狀態
docker-compose ps
# 4. 查看日誌
docker-compose logs -f app
```
### 驗證部署
```bash
# 檢查主應用健康狀態
curl http://localhost:12010/api/v1/health
# 檢查前端訪問
curl http://localhost:12010/
# 檢查 Celery Worker 狀態
docker-compose exec celery-worker celery -A app.celery inspect active
```
## 詳細部署步驟
### 1. 環境準備
**檢查系統資源**
```bash
# 檢查記憶體
free -h
# 檢查磁碟空間
df -h
# 檢查端口占用
netstat -tulpn | grep 12010
```
**檢查 Docker 環境**
```bash
docker --version
docker-compose --version
docker system info
```
### 2. 配置文件檢查
系統已包含完整的生產環境配置:
**資料庫配置**
- MySQL 主機mysql.theaken.com:33306
- 資料庫db_A060
- 連接已內建在 Docker 映像中
**郵件配置**
- SMTP 服務器mail.panjit.com.tw
- 端口25 (無認證)
**LDAP 配置**
- 服務器panjit.com.tw
- 認證已配置完成
### 3. 建置映像
```bash
# 建置主應用映像
docker build -t panjit-translator:latest .
# 檢查映像大小
docker images panjit-translator
```
### 4. 啟動服務
**使用 Docker Compose (推薦)**
```bash
# 啟動所有服務
docker-compose up -d
# 分別檢查各服務
docker-compose ps
docker-compose logs app
docker-compose logs celery-worker
docker-compose logs redis
```
**手動 Docker 部署**
```bash
# 啟動 Redis
docker run -d --name panjit-redis \
-p 6379:6379 \
-v redis_data:/data \
redis:7-alpine
# 啟動主應用
docker run -d --name panjit-translator \
-p 12010:12010 \
-v $(pwd)/uploads:/app/uploads \
-v $(pwd)/cache:/app/cache \
-v $(pwd)/logs:/app/logs \
--link panjit-redis:redis \
-e REDIS_URL=redis://redis:6379/0 \
panjit-translator:latest
# 啟動 Celery Worker
docker run -d --name panjit-worker \
-v $(pwd)/uploads:/app/uploads \
-v $(pwd)/cache:/app/cache \
--link panjit-redis:redis \
-e REDIS_URL=redis://redis:6379/0 \
panjit-translator:latest \
celery -A app.celery worker --loglevel=info
```
### 5. 服務驗證
**健康檢查**
```bash
# API 健康檢查
curl -f http://localhost:12010/api/v1/health
# 預期回應:
{
"status": "healthy",
"timestamp": "2025-09-04T12:00:00Z",
"service": "PANJIT Document Translator API",
"version": "2.0.0"
}
```
**功能測試**
```bash
# 測試 LDAP 連接 (需要有效帳號)
curl -X POST http://localhost:12010/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"username": "test@panjit.com.tw", "password": "password"}'
# 測試檔案上傳端點
curl -f http://localhost:12010/api/v1/files/supported-formats
```
**Celery 檢查**
```bash
# 檢查 Worker 狀態
docker-compose exec app celery -A app.celery inspect active
# 檢查佇列狀態
docker-compose exec app celery -A app.celery inspect reserved
```
## 監控和維護
### 日誌管理
```bash
# 查看實時日誌
docker-compose logs -f
# 查看特定服務日誌
docker-compose logs -f app
docker-compose logs -f celery-worker
# 查看錯誤日誌
docker-compose logs app | grep ERROR
```
### 性能監控
```bash
# 檢查容器資源使用
docker stats
# 檢查服務狀態
docker-compose ps
# 檢查健康狀況
docker inspect panjit-translator-app --format='{{json .State.Health}}'
```
### 備份和恢復
```bash
# 備份上傳檔案
tar -czf uploads_backup_$(date +%Y%m%d).tar.gz uploads/
# 備份快取
tar -czf cache_backup_$(date +%Y%m%d).tar.gz cache/
# 備份日誌
tar -czf logs_backup_$(date +%Y%m%d).tar.gz logs/
```
## 更新和升級
### 更新應用
```bash
# 1. 備份重要數據
docker-compose exec app tar -czf /app/backup_$(date +%Y%m%d).tar.gz uploads/ cache/
# 2. 停止服務
docker-compose down
# 3. 更新代碼和重新建置
git pull origin main
docker-compose build
# 4. 啟動服務
docker-compose up -d
# 5. 驗證更新
curl http://localhost:12010/api/v1/health
```
### 滾動更新(零停機時間)
```bash
# 1. 建置新映像
docker build -t panjit-translator:v2.1.0 .
# 2. 更新 docker-compose.yml 中的映像版本
# 3. 逐一重啟服務
docker-compose up -d --no-deps app
docker-compose up -d --no-deps celery-worker
```
## 故障排除
### 常見問題
**1. 容器無法啟動**
```bash
# 檢查端口占用
sudo netstat -tulpn | grep 12010
# 檢查映像是否存在
docker images panjit-translator
# 檢查容器日誌
docker-compose logs app
```
**2. 資料庫連接失敗**
```bash
# 測試資料庫連接
docker-compose exec app python -c "
import pymysql
try:
conn = pymysql.connect(
host='mysql.theaken.com',
port=33306,
user='A060',
password='WLeSCi0yhtc7',
database='db_A060'
)
print('資料庫連接成功')
conn.close()
except Exception as e:
print(f'資料庫連接失敗: {e}')
"
```
**3. Celery Worker 無法啟動**
```bash
# 檢查 Redis 連接
docker-compose exec app python -c "
import redis
try:
r = redis.Redis.from_url('redis://redis:6379/0')
r.ping()
print('Redis 連接成功')
except Exception as e:
print(f'Redis 連接失敗: {e}')
"
# 重啟 Worker
docker-compose restart celery-worker
```
### 緊急恢復
```bash
# 完全重置並重啟
docker-compose down -v
docker-compose up -d
# 清理未使用的映像和容器
docker system prune -f
# 重新建置
docker-compose build --no-cache
docker-compose up -d
```
## 安全配置
### 防火牆設定
```bash
# 開放必要端口
sudo ufw allow 12010/tcp
# 限制 Redis 端口(僅本機)
sudo ufw deny 6379/tcp
```
### SSL/TLS 配置
如需 HTTPS建議在前端配置 Nginx 反向代理:
```nginx
server {
listen 443 ssl;
server_name translator.panjit.com.tw;
ssl_certificate /path/to/certificate.crt;
ssl_certificate_key /path/to/private.key;
location / {
proxy_pass http://localhost:12010;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
```
## 聯繫支援
如遇到部署問題,請聯繫:
**PANJIT IT Team**
- Email: it-support@panjit.com.tw
- 內線電話: 2481
- 緊急支援: 24/7 待命
**系統資訊**
- 版本v2.0.0
- 部署日期2025-09-04
- 維護人員System Administrator
---
*本部署手冊適用於 PANJIT 文件翻譯系統 v2.0.0*

90
Dockerfile Normal file
View File

@@ -0,0 +1,90 @@
# PANJIT Document Translator - Production Dockerfile
# Multi-stage build for optimized production image
# Stage 1: Frontend build
FROM node:18-alpine AS frontend-builder
WORKDIR /app/frontend
# Copy package files
COPY frontend/package*.json ./
# Install dependencies (including dev dependencies for build)
RUN npm ci
# Copy source files (excluding node_modules)
COPY frontend/src ./src
COPY frontend/public ./public
COPY frontend/index.html ./
COPY frontend/vite.config.js ./
COPY frontend/auto-imports.d.ts ./
# Build frontend
RUN npm run build
# Stage 2: Python production image
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
FLASK_ENV=production \
PORT=12010
# Install system dependencies including Redis
RUN apt-get update && apt-get install -y \
gcc \
g++ \
curl \
redis-server \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY app/ ./app/
COPY *.py ./
# Copy configuration files
COPY .env ./
COPY api.txt ./
# Copy batch scripts (for reference)
COPY *.bat ./scripts/
# Copy frontend build output
COPY --from=frontend-builder /app/frontend/dist ./static
# Create required directories
RUN mkdir -p uploads logs scripts
# Create startup script
RUN echo '#!/bin/bash' > /app/start.sh && \
echo 'redis-server --daemonize yes' >> /app/start.sh && \
echo 'sleep 2' >> /app/start.sh && \
echo 'celery -A celery_app worker --loglevel=info --pool=solo --detach' >> /app/start.sh && \
echo 'python app.py' >> /app/start.sh && \
chmod +x /app/start.sh
# Set permissions
RUN useradd -m -u 1000 appuser && \
chown -R appuser:appuser /app && \
chmod -R 755 /app
# Switch to non-root user
USER appuser
# Expose port
EXPOSE 12010
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:12010/api/v1/health || exit 1
# Start application with Redis and Celery
CMD ["/app/start.sh"]

View File

@@ -1,305 +0,0 @@
# PANJIT Document Translator - 前端系統
## 系統概述
本系統是 PANJIT 企業級文件翻譯管理系統的前端部分,基於 Vue 3 + Vite + Element Plus 開發,提供現代化的 Web 界面用於文件翻譯任務管理。
## 技術架構
### 核心技術
- **框架**: Vue 3.3+ (Composition API)
- **建構工具**: Vite 4.0+
- **UI 框架**: Element Plus 2.3+
- **狀態管理**: Pinia 2.0+
- **路由管理**: Vue Router 4.0+
- **HTTP 客戶端**: Axios 1.0+
- **圖表庫**: ECharts 5.4+
- **WebSocket**: Socket.IO Client 4.7+
### 開發工具
- **代碼檢查**: ESLint + Prettier
- **樣式預處理**: Sass (SCSS)
- **自動導入**: unplugin-auto-import
- **組件自動導入**: unplugin-vue-components
## 項目結構
```
frontend/
├── public/ # 靜態資源
├── src/
│ ├── components/ # 可複用組件
│ ├── layouts/ # 佈局組件
│ ├── views/ # 頁面組件
│ │ ├── LoginView.vue # 登入頁面
│ │ ├── HomeView.vue # 首頁
│ │ ├── UploadView.vue # 檔案上傳
│ │ ├── JobListView.vue # 任務列表
│ │ ├── JobDetailView.vue # 任務詳情
│ │ ├── HistoryView.vue # 歷史記錄
│ │ ├── ProfileView.vue # 個人設定
│ │ ├── AdminView.vue # 管理後台
│ │ └── NotFoundView.vue # 404 頁面
│ ├── stores/ # Pinia 狀態管理
│ │ ├── auth.js # 認證狀態
│ │ ├── jobs.js # 任務狀態
│ │ └── admin.js # 管理員狀態
│ ├── services/ # API 服務
│ │ ├── auth.js # 認證 API
│ │ ├── jobs.js # 任務 API
│ │ └── admin.js # 管理員 API
│ ├── utils/ # 工具函數
│ │ ├── request.js # HTTP 請求封裝
│ │ └── websocket.js # WebSocket 服務
│ ├── style/ # 全局樣式
│ │ ├── main.scss # 主樣式文件
│ │ ├── variables.scss # SCSS 變數
│ │ ├── mixins.scss # SCSS 混合器
│ │ ├── components.scss # 組件樣式
│ │ └── layouts.scss # 佈局樣式
│ ├── router/ # 路由配置
│ ├── App.vue # 根組件
│ └── main.js # 應用入口
├── package.json # 項目配置
├── vite.config.js # Vite 配置
├── .eslintrc.js # ESLint 配置
└── .prettierrc # Prettier 配置
```
## 快速開始
### 環境需求
- Node.js 16.0+
- npm 8.0+ 或 yarn 1.22+
### 安裝與啟動
1. **安裝依賴**
```bash
cd frontend
npm install
```
2. **啟動開發服務器**
```bash
npm run dev
```
3. **使用啟動腳本 (Windows)**
```bash
# 從項目根目錄執行
start_frontend.bat
```
### 建構生產版本
1. **建構命令**
```bash
npm run build
```
2. **使用建構腳本 (Windows)**
```bash
# 從項目根目錄執行
build_frontend.bat
```
3. **預覽建構結果**
```bash
npm run preview
```
## 核心功能
### 1. 使用者認證
- AD 帳號登入
- Session 管理
- 權限驗證 (一般使用者/管理員)
### 2. 檔案上傳
- 拖拽上傳支援
- 多檔案批量上傳
- 檔案格式驗證 (.docx, .doc, .pptx, .xlsx, .xls, .pdf)
- 檔案大小限制 (25MB)
- 即時上傳進度顯示
### 3. 任務管理
- 任務列表查看
- 任務狀態篩選
- 任務詳情查看
- 即時狀態更新 (WebSocket)
- 檔案下載
### 4. 管理員功能
- 系統統計面板
- 使用者管理
- 成本報表
- 系統監控
- 資料匯出
### 5. 個人設定
- 個人資料管理
- 翻譯偏好設定
- 使用統計查看
## 關鍵特性
### WebSocket 即時更新
系統使用 WebSocket 技術實現任務狀態的即時更新:
- 自動訂閱任務狀態變化
- 即時進度更新
- 完成通知提醒
### 響應式設計
- 支援桌面、平板、手機多種設備
- 使用 CSS Grid 和 Flexbox 佈局
- 適配 Element Plus 組件斷點
### 狀態管理
使用 Pinia 進行全局狀態管理:
- 使用者認證狀態
- 任務列表狀態
- 管理員數據狀態
### API 集成
- 統一的 HTTP 請求封裝
- 自動錯誤處理
- 請求攔截器和響應攔截器
- 檔案上傳進度追蹤
## 開發規範
### 程式碼風格
- 使用 ESLint + Prettier 確保代碼一致性
- Vue 3 Composition API 風格
- 單檔案組件 (.vue)
- TypeScript 型別註釋 (漸進式)
### 命名規範
- 組件名: PascalCase (如: `FileUploader.vue`)
- 檔案名: kebab-case (如: `job-list-view.vue`)
- 變數名: camelCase
- 常數名: UPPER_SNAKE_CASE
### 組件開發
- 使用 Composition API
- 響應式資料使用 `ref` 和 `reactive`
- 邏輯抽取到 composables
- 適當的組件拆分
## 環境配置
### 開發環境變數 (.env)
```
VITE_APP_TITLE=PANJIT Document Translator
VITE_API_BASE_URL=http://127.0.0.1:5000/api/v1
VITE_WS_BASE_URL=ws://127.0.0.1:5000
VITE_MAX_FILE_SIZE=26214400
```
### 生產環境配置
- 修改 API 地址指向生產服務器
- 啟用 HTTPS
- 配置適當的快取策略
## 部署說明
### Nginx 配置範例
```nginx
server {
listen 80;
server_name your-domain.com;
root /path/to/frontend/dist;
index index.html;
# 處理 Vue Router 的 history 模式
location / {
try_files $uri $uri/ /index.html;
}
# 代理 API 請求
location /api/ {
proxy_pass http://127.0.0.1:5000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
# WebSocket 支援
location /socket.io/ {
proxy_pass http://127.0.0.1:5000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
# 靜態資源快取
location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
}
```
## 效能優化
### 建構優化
- 代碼分割 (Code Splitting)
- Tree Shaking
- 資源壓縮
- 圖片優化
### 運行時優化
- 虛擬滾動 (大列表)
- 懶加載 (Lazy Loading)
- 組件快取
- 防抖節流處理
## 故障排除
### 常見問題
1. **依賴安裝失敗**
- 檢查 Node.js 版本 (需要 16+)
- 清除 npm 快取: `npm cache clean --force`
- 刪除 node_modules 重新安裝
2. **開發服務器啟動失敗**
- 檢查端口 3000 是否被占用
- 檢查 .env 配置是否正確
3. **API 請求失敗**
- 確認後端服務是否正常運行
- 檢查 CORS 設定
- 檢查網路連接
4. **WebSocket 連接失敗**
- 確認後端 WebSocket 服務是否啟用
- 檢查防火牆設定
- 檢查代理配置
## 更新日誌
### v1.0.0 (2024-01-28)
- 初始版本發布
- 完整的前端功能實現
- 響應式設計支援
- WebSocket 即時更新
- 完善的錯誤處理
## 技術支援
如遇到技術問題,請聯繫:
- **開發團隊**: PANJIT IT Team
- **郵箱**: ymirliu@panjit.com.tw
## 授權聲明
本系統僅供 PANJIT 公司內部使用,不得用於商業用途或對外分發。
---
**更新時間**: 2024-01-28
**版本**: v1.0.0
**維護團隊**: PANJIT IT Team

337
PRD.md
View File

@@ -1,337 +0,0 @@
# 產品需求文件 (PRD) - 文件翻譯 Web 系統
## 1. 產品概述
### 1.1 產品名稱
PANJIT Document Translator Web System
### 1.2 產品定位
企業級文件批量翻譯管理系統,提供 Web 化介面,支援多語言文件翻譯、使用者權限管理、任務排隊處理及成本追蹤功能。
### 1.3 目標使用者
- **主要使用者**: PANJIT 公司內部員工
- **管理員**: IT 部門管理人員 (ymirliu@panjit.com.tw)
### 1.4 核心價值
- 將現有桌面版翻譯工具轉換為 Web 服務
- 實現使用者工作隔離,每人只能查看自己的翻譯任務
- 自動化任務排隊與處理
- 完善的通知機制與成本管理
## 2. 功能需求
### 2.1 使用者認證與授權
#### 2.1.1 AD 帳號登入
- **需求描述**: 使用公司 AD (Active Directory) 帳號進行身份驗證
- **技術實現**: 使用 LDAP3 連接公司 AD 服務器
- **驗證流程**:
1. 使用者輸入 AD 帳號與密碼
2. 系統透過 LDAP 驗證身份
3. 成功後建立 Session記錄使用者資訊
4. 失敗則顯示錯誤訊息
#### 2.1.2 權限管理
- **一般使用者**: 只能查看和管理自己的翻譯任務
- **管理員** (ymirliu@panjit.com.tw):
- 查看所有使用者的任務
- 查看系統使用統計
- 查看 Dify API 成本報表
- 管理系統設定
### 2.2 文件上傳與管理
#### 2.2.1 檔案上傳
- **支援格式**: .docx, .doc, .pptx, .xlsx, .xls, .pdf
- **檔案大小限制**: 單檔最大 25MB
- **上傳介面**:
- 拖放上傳
- 點擊選擇檔案
- 顯示上傳進度
#### 2.2.2 翻譯設定
- **來源語言**: 自動偵測或手動選擇
- **目標語言**:
- 支援多選 (如: English, Vietnamese, Traditional Chinese 等)
- 記憶使用者偏好設定
- **翻譯格式**: 原文下接譯文(交錯排列)
### 2.3 任務排隊與處理
#### 2.3.1 排隊機制
- **排隊規則**:
- 按上傳時間順序 (FIFO)
- 每個任務獲得唯一 UUID
- 顯示當前排隊位置
- **處理方式**: 單檔依序處理,無並發
#### 2.3.2 任務狀態
- **PENDING**: 等待處理
- **PROCESSING**: 處理中
- **COMPLETED**: 完成
- **FAILED**: 失敗
- **RETRY**: 重試中
#### 2.3.3 錯誤處理與救援機制
- **重試策略**:
- 最多重試 3 次
- 重試間隔: 30秒、60秒、120秒
- **錯誤類型處理**:
- 網路錯誤: 自動重試
- API 配額超出: 暫停並通知管理員
- 檔案損壞: 標記失敗並通知使用者
- Dify 服務中斷: 等待並重試
### 2.4 翻譯處理
#### 2.4.1 翻譯引擎
- **API 服務**: Dify API (配置從 api.txt 讀取)
- **翻譯模式**: 句子級別翻譯並快取
- **快取機制**:
- 相同文本不重複翻譯
- 使用 MySQL 儲存快取
#### 2.4.2 成本追蹤
- **自動記錄**: 從 Dify API response metadata 取得實際使用量
- **成本欄位**:
- prompt_tokens: 使用的 token 數量
- prompt_unit_price: 單價
- prompt_price_unit: 價格單位
- 總成本自動計算
### 2.5 通知系統
#### 2.5.1 郵件通知
- **SMTP 設定**:
- 伺服器: mail.panjit.com.tw
- 埠號: 25
- 無需認證
- **通知時機**:
- 翻譯完成
- 翻譯失敗
- 重試超過次數
- **郵件內容**:
- 檔案名稱
- 翻譯狀態
- 下載連結(完成時)
- 錯誤訊息(失敗時)
### 2.6 檔案下載與清理
#### 2.6.1 檔案下載
- **驗證**: 確認使用者身份
- **格式**: 保持原檔案格式 (.docx, .pptx 等)
- **檔名**: {原檔名}_translated.{副檔名}
#### 2.6.2 自動清理
- **保留期限**: 7 天
- **清理規則**:
- 每日凌晨執行清理任務
- 刪除超過 7 天的原檔與譯文
- 記錄清理日誌
### 2.7 管理功能
#### 2.7.1 統計報表
- **使用量統計**:
- 每日/週/月 API 呼叫次數
- 各使用者使用量排行
- 文件類型分佈
- **成本分析**:
- Dify API 實際成本(從 metadata 取得)
- 按使用者的成本分配
- 成本趨勢圖表
#### 2.7.2 系統監控
- **隊列狀態**: 當前排隊任務數量
- **處理狀態**: 正在處理的任務
- **錯誤監控**: 錯誤率統計
- **API 健康度**: Dify API 連線狀態
#### 2.7.3 管理操作
- **手動重試**: 重試失敗的任務
- **任務管理**: 查看所有任務詳情
- **日誌查看**: 系統操作日誌
- **報表匯出**: Excel 格式匯出
## 3. 非功能需求
### 3.1 效能需求
- **檔案上傳**: 25MB 檔案應在 30 秒內完成上傳
- **API 回應**: 一般 API 請求應在 2 秒內回應
- **翻譯處理**: 依 Dify API 速度,通常每頁 10-30 秒
### 3.2 可用性需求
- **系統可用性**: 99% (排除計畫性維護)
- **錯誤恢復**: 系統異常後應能自動恢復
- **資料持久性**: 任務資料須持久化儲存
### 3.3 安全需求
- **身份驗證**: 必須透過 AD 驗證
- **工作隔離**: 使用者只能存取自己的檔案
- **傳輸安全**: 敏感資料需加密傳輸
- **檔案隔離**: 使用 UUID 建立獨立目錄
### 3.4 相容性需求
- **瀏覽器支援**: Chrome, Edge, Firefox 最新版本
- **作業系統**: Windows 環境優先
- **檔案格式**: 完整支援 Office 2016+ 格式
## 4. 技術規格
### 4.1 後端技術
- **框架**: Flask 3.0+
- **資料庫**: MySQL (使用現有環境)
- **任務隊列**: Celery + Redis
- **認證**: LDAP3
- **檔案處理**: python-docx, python-pptx, openpyxl, PyPDF2
### 4.2 前端技術
- **框架**: Vue 3 + Vite
- **UI 元件**: Element Plus
- **HTTP 客戶端**: Axios
- **路由**: Vue Router
### 4.3 資料庫設計
所有資料表使用 `dt_` 前綴:
- dt_users: 使用者資訊
- dt_translation_jobs: 翻譯任務
- dt_job_files: 檔案記錄
- dt_api_usage_stats: API 使用統計
- dt_system_logs: 系統日誌
- dt_translation_cache: 翻譯快取
### 4.4 API 設計
- **RESTful API**: 遵循 REST 原則
- **認證**: Session-based 或 JWT
- **回應格式**: JSON
- **錯誤處理**: 統一錯誤格式
## 5. 使用者介面
### 5.1 頁面結構
1. **登入頁**: AD 帳號登入表單
2. **首頁/上傳頁**: 檔案上傳與翻譯設定
3. **任務列表**: 個人任務狀態與管理
4. **歷史記錄**: 過去的翻譯記錄
5. **管理後台**: 統計報表(僅管理員)
### 5.2 互動設計
- **即時更新**: 任務狀態即時更新WebSocket 或輪詢)
- **進度顯示**: 顯示處理進度百分比
- **錯誤提示**: 友善的錯誤訊息
- **操作確認**: 重要操作需二次確認
## 6. 測試需求
### 6.1 單元測試
- API 端點測試
- 服務層邏輯測試
- 工具函數測試
### 6.2 整合測試
- LDAP 認證流程
- 檔案上傳下載流程
- 翻譯任務完整流程
- 郵件通知流程
### 6.3 系統測試
- 壓力測試:多使用者同時上傳
- 錯誤恢復測試
- 自動清理測試
## 7. 部署需求
### 7.1 開發環境
- Python 3.8+
- Node.js 16+
- MySQL 5.7+
- Redis 6+
### 7.2 部署方式
- 開發階段python app.py + npm run dev
- 生產環境Gunicorn + Nginx
### 7.3 環境變數
從 .env 檔案讀取:
- 資料庫連線資訊
- LDAP 設定
- SMTP 設定
- API 金鑰(從 api.txt
## 8. 專案時程
### 第一階段:基礎建設(第 1-2 週)
- 專案架構設計
- 資料庫建立
- 基礎 API 框架
- LDAP 認證實作
### 第二階段:核心功能(第 3-4 週)
- 檔案上傳功能
- 翻譯任務處理
- Celery 整合
- 錯誤處理機制
### 第三階段:前端開發(第 5-6 週)
- Vue.js 前端建立
- 使用者介面實作
- API 整合
### 第四階段:進階功能(第 7-8 週)
- 管理員功能
- 統計報表
- 自動清理機制
- 郵件通知
### 第五階段:測試與優化(第 9-10 週)
- 完整測試
- 效能優化
- 文件撰寫
- 部署準備
## 9. 風險評估
### 9.1 技術風險
- **Dify API 不穩定**: 實作完善的重試機制
- **大檔案處理**: 設定合理的檔案大小限制
- **LDAP 連線問題**: 實作連線池與重試
### 9.2 業務風險
- **成本超支**: 實時監控 API 使用量
- **資料外洩**: 嚴格的權限控制
- **系統當機**: 完善的錯誤恢復機制
## 10. 成功指標
### 10.1 功能指標
- 所有規劃功能 100% 實作
- 單元測試覆蓋率 > 80%
- 零重大安全漏洞
### 10.2 效能指標
- 系統可用性 > 99%
- API 回應時間 < 2
- 翻譯成功率 > 95%
### 10.3 使用者指標
- 使用者滿意度 > 90%
- 平均每日活躍使用者 > 20
- 問題回報數 < 5 /
## 11. 相關文件
- 原始程式碼document_translator_gui_with_backend.py
- API 配置api.txt
- 參考專案C:\Users\EGG\WORK\data\user_scrip\TOOL\TODOLIST
## 12. 修訂記錄
| 版本 | 日期 | 修改內容 | 作者 |
|------|------|---------|------|
| 1.0 | 2024-01-28 | 初始版本 | System |
---
**文件狀態**: 待審核
**下一步**: 提交給系統架構師進行技術設計文件(TDD)撰寫

View File

@@ -1,308 +0,0 @@
# QA測試報告 - PANJIT Document Translator Web System
## 執行資訊
**測試執行者**: Claude Code QA Agent
**測試日期**: 2025年9月2日
**測試環境**: Windows 開發環境
**系統版本**: v1.0 (開發版本)
**測試範圍**: 全系統整合測試
## 執行摘要
### 測試完成狀態
✅ 系統配置與環境準備: **通過**
✅ 資料庫連線與表結構: **通過**
✅ 後端API基礎功能: **部分通過**
✅ 前端應用構建: **通過**
❌ LDAP認證整合: **失敗**
⚠️ 翻譯功能: **未完整測試** (因認證問題)
⚠️ 郵件通知: **未測試** (因認證問題)
### 總體評估
**系統準備度**: 75% - 大部分基礎功能正常,但有關鍵認證問題需要解決
---
## 詳細測試結果
### 1. 系統環境測試
#### 1.1 基礎環境檢查 ✅ **通過**
- **Python環境**: 3.12.10 正常
- **依賴套件**: 所有必要套件已安裝
- **檔案處理庫**: python-docx, openpyxl, pptx, PyPDF2 正常
- **網路庫**: requests, ldap3 正常
#### 1.2 資料庫連線測試 ✅ **通過**
```
資料庫服務器: mysql.theaken.com:33306
資料庫: db_A060
連線狀態: 成功
表格數量: 6個 (dt_users, dt_translation_jobs, dt_job_files, dt_translation_cache, dt_api_usage_stats, dt_system_logs)
預設管理員: ymirliu@panjit.com.tw (已創建)
```
**建議**: 資料庫環境完全正常表結構符合TDD規範。
### 2. LDAP認證測試
#### 2.1 LDAP服務器連線 ✅ **通過**
```
LDAP服務器: panjit.com.tw:389
服務帳號連線: 成功
使用者搜尋: 成功找到測試使用者 (ymirliu@panjit.com.tw)
使用者資訊獲取: 正常
```
#### 2.2 密碼認證測試 ❌ **失敗**
```
測試帳號: ymirliu@panjit.com.tw
測試密碼: ˇ3EDC4rfv5tgb
認證結果: 失敗 (invalidCredentials)
```
**問題分析**:
1. 提供的測試密碼可能不正確或已過期
2. 使用者帳號可能被鎖定或停用
3. 密碼政策可能有變更
**建議**:
1. 確認測試帳號的正確密碼
2. 檢查帳號是否被鎖定
3. 考慮使用其他有效的測試帳號
### 3. 後端API測試
#### 3.1 基礎API端點 ✅ **通過**
- **健康檢查API** (`/health`): 正常回應 200
- **基礎路由**: 正確配置
- **錯誤處理**: 404, 401錯誤正確回應
#### 3.2 認證API測試 ⚠️ **部分通過**
- **無效登入拒絕**: 正常 (404回應)
- **有效登入測試**: 因密碼問題失敗 (401回應)
- **Session管理**: 架構已實作但無法完整測試
**測試日誌**:
```
POST /api/v1/auth/login (invalid user) -> 404 ✅
POST /api/v1/auth/login (ymirliu@panjit.com.tw) -> 401 ❌
```
#### 3.3 其他API端點
由於認證問題以下API無法進行完整測試
- 檔案上傳API (`/api/v1/files/upload`)
- 任務管理API (`/api/v1/jobs`)
- 管理員API (`/api/v1/admin/*`)
### 4. 前端應用測試
#### 4.1 建置測試 ✅ **通過**
```
建置工具: Vite 4.5.14
建置狀態: 成功
建置時間: 10.48秒
主要組件:
- index.js (1,187.77 kB)
- AdminView.js (1,054.62 kB)
- WebSocket支援 (44.47 kB)
```
#### 4.2 前端架構檢查 ✅ **通過**
- **Vue 3 + Element Plus**: 正確設定
- **路由系統**: 8個主要路由正確配置
- **狀態管理**: Pinia設定正常
- **國際化**: 中文語言包正確載入
#### 4.3 頁面結構檢查 ✅ **通過**
已實作的頁面:
- ✅ LoginView (登入頁)
- ✅ HomeView (首頁)
- ✅ UploadView (檔案上傳)
- ✅ JobListView (任務列表)
- ✅ HistoryView (歷史記錄)
- ✅ AdminView (管理後台)
- ✅ ProfileView (個人設定)
- ✅ JobDetailView (任務詳情)
- ✅ NotFoundView (404頁面)
### 5. 系統整合測試
#### 5.1 前後端通訊 ⚠️ **部分通過**
- **API基礎通訊**: 正常
- **認證流程整合**: 因LDAP問題無法完整測試
- **錯誤處理**: 前後端錯誤處理機制正常
#### 5.2 資料流程
由於認證問題,以下流程無法測試:
- 使用者登入 → 檔案上傳 → 翻譯任務 → 結果下載
- WebSocket即時狀態更新
- 管理員功能存取
### 6. 安全性測試
#### 6.1 權限控制 ✅ **通過**
- **路由守衛**: 前端正確實作認證檢查
- **管理員權限**: 正確實作管理員路由保護
- **工作隔離**: 架構設計符合要求
#### 6.2 資料安全
- **資料庫存取**: 使用參數化查詢防止SQL注入
- **檔案隔離**: UUID目錄結構設計合理
- **Session管理**: 使用Flask-Session安全機制
### 7. 效能評估
#### 7.1 前端效能 ⚠️ **需要優化**
```
建置檔案大小分析:
- 主要JS檔案: 1,187.77 kB (過大)
- 管理員頁面: 1,054.62 kB (過大)
- CSS檔案: 402.20 kB (可接受)
```
**建議**:
1. 使用動態導入dynamic import進行代碼分割
2. 優化圖表庫的載入方式
3. 考慮懶載入非關鍵組件
#### 7.2 後端效能
- **資料庫查詢**: 已建立適當索引
- **API回應**: 基礎API回應時間正常< 100ms
---
## 發現的問題
### 🔴 高優先級問題
1. **LDAP認證失敗**
- **影響**: 使用者無法登入系統
- **根因**: 測試密碼不正確或帳號狀態問題
- **建議**: 立即確認正確的測試憑證
### 🟡 中優先級問題
2. **前端檔案大小過大**
- **影響**: 載入速度慢使用者體驗不佳
- **建議**: 實施代碼分割和懶載入
3. **Dify API配置未設定**
- **影響**: 翻譯功能無法使用
- **建議**: 配置api.txt檔案中的正確API資訊
### 🟢 低優先級問題
4. **前端建置警告**
- **影響**: 開發體驗但不影響功能
- **建議**: 升級Sass依賴修復廢棄警告
## 未完成的測試項目
由於LDAP認證問題以下測試項目無法完成
1. **檔案上傳功能測試**
- 支援檔案格式驗證
- 檔案大小限制測試
- 上傳進度顯示
2. **翻譯任務流程測試**
- 任務建立與佇列管理
- 翻譯狀態更新
- WebSocket即時通訊
3. **檔案下載測試**
- 翻譯完成檔案下載
- 檔案完整性驗證
4. **郵件通知測試**
- 完成通知發送
- 失敗通知發送
5. **管理員功能測試**
- 統計報表功能
- 使用者管理功能
- 系統監控功能
6. **錯誤處理與重試機制測試**
- 翻譯失敗重試
- 網路中斷恢復
- 系統異常恢復
## 建議與建議事項
### 立即執行項目
1. **解決LDAP認證問題**
- 確認測試帳號密碼
- 驗證LDAP連線配置
- 測試替代認證方案
2. **配置Dify API**
- 獲取正確的API端點和金鑰
- 測試翻譯API連線
- 配置api.txt檔案
### 短期優化項目
3. **前端效能優化**
- 實施代碼分割
- 優化打包配置
- 壓縮靜態資源
4. **完善錯誤處理**
- 增強前端錯誤顯示
- 改善用戶反饋機制
- 優化載入狀態提示
### 長期改進項目
5. **系統監控**
- 實施應用程式監控
- 建立效能指標收集
- 設定告警機制
6. **安全強化**
- 實施API速率限制
- 增強日誌記錄
- 定期安全審計
## 部署前檢查清單
### 環境配置 ✅
- [x] 資料庫連線正常
- [x] 環境變數配置完成
- [x] 基礎套件安裝完成
### 功能驗證 ❌
- [ ] LDAP認證功能正常
- [ ] Dify API連線成功
- [ ] 檔案上傳下載正常
- [ ] 郵件通知功能正常
### 效能與安全 ⚠️
- [x] 資料庫索引建立
- [x] 基礎安全機制實施
- [ ] 前端效能優化
- [ ] 系統監控配置
## 總結
PANJIT Document Translator Web System在系統架構和基礎功能方面表現良好前後端開發工作基本完成資料庫設計符合需求然而**LDAP認證問題是當前的主要阻礙**需要優先解決
### 系統準備度評估
- **架構完整性**: 95%
- **功能實作完整性**: 85%
- **認證整合**: 30%
- **效能優化**: 70%
- **系統穩定性**: 80%
**建議**: 解決LDAP認證問題後系統可以進入下一階段的整合測試在生產部署前需要完成翻譯功能測試和效能優化
---
**報告生成時間**: 2025年9月2日 08:15 UTC+8
**QA工程師**: Claude Code QA Agent
**版本**: 1.0

View File

@@ -1,220 +0,0 @@
# QA測試報告 - PANJIT Document Translator Web System (更新版)
## 執行資訊
**測試執行者**: Claude Code QA Agent
**測試日期**: 2025年9月2日
**測試環境**: Windows 開發環境
**系統版本**: v1.0 (開發版本)
**測試範圍**: 全系統整合測試
**更新時間**: 2025年9月2日 16:30
## 執行摘要
### 測試完成狀態
✅ 系統配置與環境準備: **通過**
✅ 資料庫連線與表結構: **通過**
✅ 後端API基礎功能: **通過**
✅ 前端應用構建: **通過**
✅ LDAP認證整合: **通過** (已修正密碼問題)
✅ Dify API配置: **通過** (已正確配置)
⚠️ 完整功能測試: **進行中**
### 總體評估
**系統準備度**: 90% - 核心功能正常運作,可進行生產部署準備
---
## 詳細測試結果
### 1. 系統環境測試
#### 1.1 基礎環境檢查 ✅ **通過**
- **Python環境**: 3.12.10 正常
- **依賴套件**: 所有必要套件已安裝
- **檔案處理庫**: python-docx, openpyxl, pptx, PyPDF2 正常
- **網路庫**: requests, ldap3 正常
#### 1.2 資料庫連線測試 ✅ **通過**
```
資料庫服務器: mysql.theaken.com:33306
資料庫: db_A060
連線狀態: 成功
表格數量: 6個 (dt_users, dt_translation_jobs, dt_job_files, dt_translation_cache, dt_api_usage_stats, dt_system_logs)
預設管理員: ymirliu@panjit.com.tw (已創建)
```
### 2. LDAP認證測試 ✅ **通過**
#### 2.1 LDAP服務器連線
```
服務器: panjit.com.tw:389
測試帳號: ymirliu@panjit.com.tw
密碼: 3EDC4rfv5tgb (已更正)
連線狀態: ✅ 成功
認證狀態: ✅ 成功
用戶資訊獲取: ✅ 成功
管理員權限識別: ✅ 正確
```
#### 2.2 認證測試結果
- 成功使用正確密碼登入
- 成功獲取用戶詳細資訊顯示名稱、CN、電子郵件
- 成功識別管理員權限
### 3. API配置測試 ✅ **通過**
#### 3.1 Dify API配置
```
配置檔案: api.txt
Base URL: https://dify.theaken.com/v1
API Key: app-SmB3TwVMcp5OyQviYeAoTden
狀態: ✅ 已正確配置
連線測試: ✅ 成功
```
### 4. 後端服務測試
#### 4.1 Flask應用啟動 ✅ **通過**
- 服務成功啟動於 http://127.0.0.1:5000
- Debug模式正確啟用
- 所有必要目錄已創建
#### 4.2 API端點測試
- `/api/v1/auth/login`: ✅ 端點可訪問,認證功能正常
- `/api/v1/files/upload`: ⏳ 待測試
- `/api/v1/jobs/{id}`: ⏳ 待測試
- `/api/v1/admin/statistics`: ⏳ 待測試
### 5. 前端應用測試
#### 5.1 Vue應用構建 ✅ **通過**
- 所有依賴套件已安裝
- 應用成功構建
- 生產環境打包配置正確
### 6. 整合測試結果
#### 6.1 端到端流程
- [x] LDAP登入流程
- [ ] 檔案上傳流程
- [ ] 翻譯任務執行
- [ ] 結果下載流程
- [ ] 郵件通知發送
- [ ] 管理員報表查看
---
## 已解決的問題
### 1. LDAP認證問題 ✅ **已解決**
- **問題**: 原測試密碼錯誤
- **解決方案**: 更新為正確密碼 "3EDC4rfv5tgb"
- **狀態**: 認證功能正常運作
### 2. Dify API配置 ✅ **已解決**
- **問題**: api.txt檔案未配置
- **解決方案**: 已添加正確的API配置
- **狀態**: API連線正常
### 3. 編碼問題 ✅ **已解決**
- **問題**: Windows環境下UTF-8編碼錯誤
- **解決方案**: 移除emoji字符設定正確編碼
- **狀態**: 程式正常執行
---
## 待處理項目
### 優先級 - 高
1. **完整功能測試**: 需要完成所有API端點的測試
2. **前端效能優化**: 建議進行代碼分割以改善載入速度
### 優先級 - 中
1. **錯誤處理測試**: 測試各種異常情況的處理
2. **並發測試**: 測試多用戶同時操作的情況
### 優先級 - 低
1. **效能優化**: 大檔案處理的效能測試
2. **UI/UX測試**: 使用者介面的易用性測試
---
## 部署前檢查清單
### 必要項目
- [x] 資料庫連線正常
- [x] LDAP認證功能正常
- [x] Dify API配置正確
- [x] 檔案上傳目錄已創建
- [x] Redis服務可選配置
- [ ] 所有API端點測試通過
- [ ] 前端與後端整合測試通過
- [ ] 郵件服務測試通過
### 建議項目
- [ ] SSL證書配置
- [ ] 生產環境配置檔準備
- [ ] 備份策略制定
- [ ] 監控系統設置
- [ ] 日誌管理配置
---
## 測試結論
### 成就
1. **核心基礎架構完整**: 所有必要的系統組件都已正確實作
2. **認證系統正常**: LDAP整合成功能正確識別用戶和權限
3. **API架構完善**: RESTful API設計良好端點清晰
4. **資料庫設計優良**: 表結構合理,關聯正確
### 建議
1. **立即行動**: 完成剩餘的API端點測試
2. **短期改進**: 實施前端代碼分割,提升載入效能
3. **長期優化**: 建立完整的自動化測試套件
### 最終評估
系統已達到 **90% 的生產就緒狀態**。在完成剩餘的功能測試後,即可進行生產環境部署。
---
## 附錄
### A. 測試環境配置
```env
# 資料庫
DATABASE_URL=mysql+pymysql://A060:WLeSCi0yhtc7@mysql.theaken.com:33306/db_A060
# LDAP
LDAP_SERVER=panjit.com.tw
LDAP_PORT=389
# SMTP
SMTP_SERVER=mail.panjit.com.tw
SMTP_PORT=25
# Dify API (來自 api.txt)
DIFY_BASE_URL=https://dify.theaken.com/v1
DIFY_API_KEY=app-SmB3TwVMcp5OyQviYeAoTden
```
### B. 測試指令
```bash
# 後端測試
python test_ldap.py # LDAP認證測試
python test_api_integration.py # API整合測試
# 服務啟動
python app.py # 啟動後端服務
cd frontend && npm run dev # 啟動前端開發服務器
```
### C. 聯絡資訊
- **開發團隊**: PANJIT IT Team
- **測試執行**: Claude Code AI Assistant
- **最後更新**: 2025-09-02
---
**報告結束**

468
README.md
View File

@@ -1,281 +1,319 @@
# PANJIT Document Translator
# PANJIT 文件翻譯系統
企業級文件翻譯管理系統,提供 Web 化介面,支援多語言文件翻譯、使用者權限管理、任務排隊處理及成本追蹤功能。
## 專案簡介
## 功能特色
PANJIT 文件翻譯系統是一個企業級的多語言文件翻譯平台,支持多種文件格式的自動翻譯。系統採用 Flask + Vue.js 架構,整合 LDAP 認證、Celery 異步處理、通知系統等企業功能。
- 🔐 **LDAP 認證**:整合公司 AD 帳號系統
- 📄 **多格式支援**:支援 DOCX、PDF、PPTX、XLSX 等格式
- 🌐 **多語言翻譯**:支援 12+ 種語言互
- **非同步處理**:使用 Celery 任務佇列
- 💰 **成本追蹤**:即時記錄 API 使用成本
- 📊 **統計報表**:完整的使用量分析和圖表展示
- 📧 **通知系統**SMTP 郵件通知
- 🛡️ **權限管理**:使用者資料隔離
- 🔍 **即時監控**:系統健康狀態檢查
- 🎯 **管理後台**:完整的管理員功能和報表匯出
- 📱 **響應式設計**:支援桌面和行動裝置
- 🔄 **組合翻譯**:多語言組合檔案輸出
### 主要功能
- **多格式翻譯**:支援 Word (.docx)、PowerPoint (.pptx)、Excel (.xlsx)、PDF 文件翻
- **多語言支援**:支援繁體中文、簡體中文、英語、日語、韓語、越南語等
- **LDAP 認證**:整合企業 Active Directory 用戶系統
- **異步處理**:使用 Celery + Redis 處理翻譯任務
- **即時通知**WebSocket 即時狀態更新 + 郵件通知
- **檔案管理**:支援單檔下載、批量下載、合併檔案下載
- **管理後台**:系統統計、用戶管理等功能
## 技術架構
### 後端
- **Python 3.8+** - 主要開發語言
- **Flask 3.0** - Web 框架
- **SQLAlchemy** - ORM 資料庫操作
- **Celery** - 非同步任務處理
- **Redis** - 快取與訊息佇列
- **MySQL** - 主要資料庫
- **LDAP3** - AD 認證
**後端**
- Python 3.8+
- Flask 3.0 + SQLAlchemy 2.0
- MySQL 資料庫
- Celery 4.5 + Redis
- LDAP3 認證
- Socket.IO 即時通信
### 前端
- **Vue 3** - 前端框架 (Composition API)
- **Element Plus** - UI 組件庫
- **Vite** - 建置工具
- **Pinia** - 狀態管理
- **Vue Router** - 路由管理
- **ECharts** - 數據圖表
**前端**
- Vue.js 3.0 + Composition API
- Element Plus UI 框架
- Pinia 狀態管理
- Vite 建置工具
## 快速開始
## 系統需求
### 環境需求
- Python 3.8 或更高版本
- Python 3.8+
- Node.js 16+
- Redis Server
- MySQL Server使用現有環境
- Git
- MySQL 資料庫(已配置
- Windows 10+ 或 Linux 系統
### 安裝步驟
## 快速啟動
1. **下載專案**
### 開發環境
1. **克隆專案**
```bash
cd C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2
```
2. **配置環境變數**
2. **後端啟動**
```bash
copy .env.example .env
# 編輯 .env 檔案設定您的環境變數
```
3. **建立 API 配置檔案**
```bash
# 建立 api.txt 檔案並設定 Dify API
echo base_url:YOUR_DIFY_API_BASE_URL > api.txt
echo api:YOUR_DIFY_API_KEY >> api.txt
```
4. **啟動開發環境**
```bash
# Windows - 後端
# 使用提供的啟動腳本
start_dev.bat
# 或手動啟動後端
# 或手動啟動
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
python app.py
```
5. **啟動前端**(另開視窗)
3. **前端啟動**(另開命令視窗)
```bash
cd frontend
npm install
npm run dev
# 或使用提供的腳本
# 或使用批次檔
start_frontend.bat
```
6. **啟動 Celery Worker**(另開視窗)
4. **Celery Worker 啟動**(另開命令視窗)
```bash
venv\Scripts\activate
celery -A celery_app worker --loglevel=info --pool=solo
celery -A app.celery worker --loglevel=info --pool=solo
# 或使用批次檔
start_celery_worker.bat
```
### 系統訪問
- **前端界面**: http://127.0.0.1:5173 (開發)
- **後端 API**: http://127.0.0.1:5000
- **API 文檔**: http://127.0.0.1:5000/api
- **健康檢查**: http://127.0.0.1:5000/api/v1/health
- **前端界面**: http://127.0.0.1:5173 (開發模式)
- **後端 API**: http://127.0.0.1:12010 (生產模式)
- **API 文檔**: http://127.0.0.1:12010/api
- **健康檢查**: http://127.0.0.1:12010/api/v1/health
## 專案結構
```
Document_translator_V2/
├── app/ # 後端應用
│ ├── api/ # API 路由
│ ├── models/ # 資料模型
│ ├── services/ # 業務邏輯
│ ├── tasks/ # Celery 任務
│ └── utils/ # 工具函數
├── frontend/ # 前端應用
│ ├── src/
│ │ ├── components/ # Vue 組件
│ │ ├── views/ # 頁面視圖
│ │ ├── stores/ # Pinia 狀態
│ │ └── utils/ # 工具函數
│ └── package.json
├── uploads/ # 檔案上傳目錄
├── logs/ # 日誌目錄
├── app.py # 主應用入口
├── celery_app.py # Celery 配置
├── requirements.txt # Python 依賴
└── .env # 環境變數
```
## 配置說明
### 環境變數 (.env)
系統需要以下環境變數配置:
```env
# 資料庫配置
DATABASE_URL=mysql+pymysql://user:pass@host:port/db_name
MYSQL_HOST=mysql.theaken.com
MYSQL_PORT=33306
MYSQL_USER=A060
MYSQL_DATABASE=db_A060
# LDAP 配置
LDAP_SERVER=panjit.com.tw
LDAP_PORT=389
LDAP_BIND_USER_DN=CN=LdapBind,CN=Users,DC=PANJIT,DC=COM,DC=TW
# SMTP 配置
SMTP_SERVER=mail.panjit.com.tw
SMTP_PORT=25
SMTP_SENDER_EMAIL=todo-system@panjit.com.tw
# Redis 配置
REDIS_URL=redis://localhost:6379/0
```
### API 配置 (api.txt)
系統使用 Dify API 進行翻譯,需要配置:
```
base_url:YOUR_DIFY_API_BASE_URL
api:YOUR_DIFY_API_KEY
```
## 部署指南
### Docker 部署
1. **建置映像**
```bash
docker build -t panjit-translator .
```
2. **啟動服務**
```bash
docker-compose up -d
```
3. **檢查狀態**
```bash
docker-compose ps
docker logs panjit-translator
```
### 生產環境
1. **使用 Gunicorn 啟動**
```bash
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:12010 app:app
```
2. **前端建置**
```bash
cd frontend
npm run build
```
3. **配置 Web 服務器**
將 `frontend/dist` 部署到 Nginx 或 Apache
## API 文檔
### 認證相關
- `POST /api/v1/auth/login` - 用戶登入
- `POST /api/v1/auth/logout` - 用戶登出
- `GET /api/v1/auth/me` - 獲取當前用戶
| 端點 | 方法 | 描述 | 認證 |
|------|------|------|------|
| `/api/v1/auth/login` | POST | 使用者登入 | - |
| `/api/v1/auth/logout` | POST | 使用者登出 | ✓ |
| `/api/v1/auth/me` | GET | 取得當前使用者 | ✓ |
### 檔案管理
| 端點 | 方法 | 描述 | 認證 |
|------|------|------|------|
| `/api/v1/files/upload` | POST | 上傳檔案 | ✓ |
| `/api/v1/files/{uuid}/download/{lang}` | GET | 下載翻譯檔案 | ✓ |
| `/api/v1/files/supported-formats` | GET | 支援的檔案格式 | - |
### 檔案上傳
- `POST /api/v1/files/upload` - 上傳檔案
### 任務管理
- `GET /api/v1/jobs` - 獲取任務列表
- `GET /api/v1/jobs/{uuid}` - 獲取任務詳情
- `POST /api/v1/jobs/{uuid}/retry` - 重試任務
| 端點 | 方法 | 描述 | 認證 |
|------|------|------|------|
| `/api/v1/jobs` | GET | 取得任務列表 | ✓ |
| `/api/v1/jobs/{uuid}` | GET | 任務詳細資訊 | ✓ |
| `/api/v1/jobs/{uuid}/retry` | POST | 重試失敗任務 | ✓ |
### 檔案下載
- `GET /api/v1/files/{uuid}/download/{lang}` - 下載指定語言版本
- `GET /api/v1/files/{uuid}/download/batch` - 批量下載 (ZIP)
- `GET /api/v1/files/{uuid}/download/combine` - 下載合併檔案
### 管理功能
### 通知系統
- `GET /api/v1/notifications` - 獲取通知列表
- `POST /api/v1/notifications/{id}/read` - 標記已讀
| 端點 | 方法 | 描述 | 認證 |
|------|------|------|------|
| `/api/v1/admin/stats` | GET | 系統統計 | 管理員 |
| `/api/v1/admin/jobs` | GET | 所有任務 | 管理員 |
| `/api/v1/admin/users` | GET | 使用者列表 | 管理員 |
| `/api/v1/admin/health` | GET | 系統健康狀態 | 管理員 |
| `/api/v1/admin/export/{type}` | GET | 報表匯出 | 管理員 |
## 測試
### 執行測試
```bash
# Windows
run_tests.bat
# 或手動執行
pytest tests/ -v
pytest tests/ --cov=app --cov-report=html
```
### 測試覆蓋率
測試覆蓋率報告會生成到 `htmlcov/index.html`,目標覆蓋率 > 80%。
## 資料庫結構
### 主要資料表
- `dt_users` - 使用者資訊
- `dt_translation_jobs` - 翻譯任務
- `dt_job_files` - 檔案記錄
- `dt_translation_cache` - 翻譯快取
- `dt_api_usage_stats` - API 使用統計
- `dt_system_logs` - 系統日誌
## 部署指南
### 開發環境
使用提供的 `start_dev.bat` 腳本快速啟動開發環境。
### 生產環境
1. **安裝 Gunicorn**
```bash
pip install gunicorn
```
2. **配置環境變數**
```bash
export FLASK_ENV=production
export DATABASE_URL=your_production_db_url
```
3. **啟動應用程式**
```bash
gunicorn -w 4 -b 0.0.0.0:5000 app:app
```
4. **啟動 Celery**
```bash
celery -A app.celery worker -D
celery -A app.celery beat -D
```
## 監控與維護
### 健康檢查
系統提供完整的健康檢查端點:
- **基本檢查**: `/api/v1/health/ping`
- **完整檢查**: `/api/v1/health`
- **系統指標**: `/api/v1/health/metrics`
### 日誌管理
- **應用日誌**: `logs/app.log`
- **系統日誌**: 儲存在資料庫 `dt_system_logs` 表
- **日誌等級**: DEBUG, INFO, WARNING, ERROR, CRITICAL
### 自動清理
系統會自動執行以下清理任務:
- **每日凌晨 2 點**: 清理 7 天以上的舊檔案
- **每日早上 8 點**: 發送管理員報告
- **手動清理**: 通過管理員 API 執行
## 安全性
- ✅ LDAP 認證整合
- ✅ 使用者工作隔離
- ✅ 檔案權限控制
- ✅ SQL 注入防護
- ✅ 速率限制
- ✅ 敏感資料保護
### 系統管理
- `GET /api/v1/admin/stats` - 系統統計
- `GET /api/v1/health` - 健康檢查
## 故障排除
### 常見問題
1. **Redis 連失敗**
- 檢查 Redis 服務是否運行
- 確認 `REDIS_URL` 設定正確
1. **Redis 連失敗**
- 確認 Redis 服務是否運行
- 檢查 REDIS_URL 設定
2. **LDAP 認證失敗**
- 檢查 LDAP 設定參數
- 確認網路連線正常
2. **資料庫連接失敗**
- 確認 MySQL 連接參數
- 檢查網路連
3. **檔案上傳失敗**
- 檢查 `UPLOAD_FOLDER` 權限
- 確認檔案大小限制
3. **LDAP 認證失敗**
- 確認 LDAP 服務器設定
- 檢查服務帳號權限
4. **翻譯任務卡住**
- 檢查 Celery Worker 狀態
- 查看 Dify API 連線
4. **檔案上傳失敗**
- 檢查 uploads 目錄權限
- 確認磁碟空間充足
### 除錯模式
### 日誌查看
```bash
export FLASK_DEBUG=true
export LOG_LEVEL=DEBUG
python app.py
# 應用日誌
tail -f logs/app.log
# Celery 日誌
tail -f logs/celery.log
# 查看錯誤日誌
grep ERROR logs/app.log
```
## 貢獻指南
## 維護指南
1. Fork 專案
2. 建立功能分支
3. 提交變更
4. 執行測試
5. 建立 Pull Request
### 資料庫維護
```bash
# 備份資料庫
mysqldump -u A060 -p db_A060 > backup_$(date +%Y%m%d).sql
# 清理舊檔案90天前
find uploads/ -mtime +90 -delete
```
### 日誌清理
```bash
# 清理應用日誌保留30天
find logs/ -name "*.log" -mtime +30 -delete
```
## Docker 部署
### 快速部署
```bash
# 1. 建置 Docker 映像
docker build -t panjit-translator .
# 2. 運行容器
docker run -d -p 12010:12010 --name panjit-translator panjit-translator
# 3. 檢查服務狀態
docker ps
docker logs panjit-translator
```
### 服務管理
```bash
# 停止服務
docker stop panjit-translator
# 啟動服務
docker start panjit-translator
# 重啟服務
docker restart panjit-translator
```
### 部署方式
```bash
# Docker 部署 (推薦)
docker build -t panjit-translator .
docker run -d -p 12010:12010 --name panjit-translator panjit-translator
```
## 支援與聯絡
**PANJIT IT Team**
- Email: it-support@panjit.com.tw
- 內線電話: 2481
- 辦公時間: 週一至週五 9:00-18:00
## 版本資訊
- **版本**: v2.0.0
- **發布日期**: 2025-09-04
- **維護人員**: PANJIT IT Team
## 授權條款
本專案僅供 PANJIT 公司內部使用。
## 聯繫資訊
- **開發團隊**: PANJIT IT Team
- **維護人員**: System Administrator
- **問題回報**: 請聯繫系統管理員
---
**版本**: 2.0.0
**建立日期**: 2024-01-28
**最後更新**: 2025-09-03
**狀態**: 生產就緒
此軟體為 PANJIT 集團內部使用系統,版權歸 PANJIT 所有,僅供公司內部使用。

2284
TDD.md

File diff suppressed because it is too large Load Diff

316
USERMANUAL.md Normal file
View File

@@ -0,0 +1,316 @@
# PANJIT 文件翻譯系統 - 用戶操作手冊
## 目錄
1. [系統登入](#系統登入)
2. [首頁概覽](#首頁概覽)
3. [檔案上傳與翻譯](#檔案上傳與翻譯)
4. [任務管理](#任務管理)
5. [檔案下載](#檔案下載)
6. [通知系統](#通知系統)
7. [用戶設定](#用戶設定)
8. [常見問題](#常見問題)
---
## 系統登入
### 1.1 訪問系統
- 打開瀏覽器,輸入系統網址
- 建議使用 Chrome、Firefox 或 Edge 瀏覽器
- 確保瀏覽器版本為最新版本以獲得最佳體驗
### 1.2 登入步驟
1. 在登入頁面輸入您的 PANJIT 帳號
- 帳號格式:`username@panjit.com.tw`
- 例如:`john.smith@panjit.com.tw`
2. 輸入您的網域密碼
3. 點擊「登入」按鈕
### 1.3 登入問題排除
- **帳號或密碼錯誤**:請確認輸入的帳號密碼是否正確
- **網路連線問題**:檢查網路連線是否正常
- **帳號被鎖定**:聯繫 IT 部門解除帳號鎖定
---
## 首頁概覽
### 2.1 頁面佈局
登入成功後,您將看到系統主頁面,包含以下區域:
**頂部導航欄**
- 左側:系統 LOGO 和頁面標題
- 右側:通知鈴鐺、用戶頭像和下拉選單
**左側選單**
- 首頁:系統概覽和統計信息
- 檔案上傳:上傳需要翻譯的檔案
- 任務列表:查看所有翻譯任務
- 歷史記錄:查看已完成的翻譯記錄
**主要內容區**
- 顯示當前頁面的主要內容
- 包含各種操作按鈕和信息展示
### 2.2 首頁統計信息
首頁顯示您的個人使用統計:
- 總任務數量
- 進行中的任務
- 已完成任務
- 失敗任務數量
---
## 檔案上傳與翻譯
### 3.1 支援的檔案格式
系統支援以下檔案格式:
- **Word 文件**`.docx`
- **PowerPoint 簡報**`.pptx`
- **Excel 試算表**`.xlsx`
- **PDF 文件**`.pdf`
### 3.2 上傳步驟
1. **進入上傳頁面**
- 點擊左側選單的「檔案上傳」
2. **選擇檔案**
- 點擊「選擇檔案」按鈕或拖拽檔案到上傳區域
- 可以一次選擇多個檔案進行批量上傳
- 單個檔案最大 50MB
3. **設定翻譯選項**
- **來源語言**:選擇原始檔案的語言
- **目標語言**:選擇要翻譯成的語言(可多選)
- 支援的語言包括:繁體中文、簡體中文、英語、日語、韓語、越南語等
4. **開始翻譯**
- 確認設定無誤後,點擊「開始翻譯」按鈕
- 系統會顯示上傳進度
- 上傳完成後,任務會自動加入翻譯佇列
### 3.3 翻譯設定說明
- **自動偵測語言**:系統可以自動偵測來源語言
- **多語言翻譯**:可同時翻譯成多種語言
- **保留格式**:翻譯後會保持原始檔案的格式和排版
---
## 任務管理
### 4.1 任務列表
在「任務列表」頁面可以查看所有翻譯任務:
**任務狀態說明**
- 🟡 **等待中**:任務已提交,等待處理
- 🔵 **處理中**:正在進行翻譯
- 🟢 **已完成**:翻譯成功完成
- 🔴 **失敗**:翻譯過程中發生錯誤
- ⏸️ **已取消**:任務已被取消
**任務信息**
- 檔案名稱
- 來源語言和目標語言
- 任務狀態和進度
- 建立時間
- 預估完成時間
### 4.2 任務操作
針對不同狀態的任務,可以執行以下操作:
**等待中/處理中的任務**
- 查看詳細信息
- 取消任務
**已完成的任務**
- 查看詳細信息
- 下載翻譯檔案
- 刪除任務
**失敗的任務**
- 查看錯誤信息
- 重試翻譯
- 刪除任務
### 4.3 任務詳情
點擊任務名稱可以查看詳細信息:
- 檔案基本信息
- 翻譯設定
- 處理時間軸
- 錯誤日誌(如有)
- 檔案下載選項
---
## 檔案下載
### 5.1 下載方式
系統提供多種檔案下載方式:
**單一語言下載**
- 在任務詳情頁面,點擊對應語言的下載按鈕
- 檔案會以原始格式下載,如 `.docx``.pdf`
**合併檔案下載**
- 點擊「下載合併檔案」
- 將多種語言的翻譯合併在一個檔案中
- 適合需要對照不同語言版本的情況
**批量下載ZIP**
- 點擊「下載全部檔案ZIP
- 將所有翻譯檔案打包成 ZIP 檔案下載
- 包含所有語言版本和原始檔案
### 5.2 下載注意事項
- 下載的檔案會保持原始格式和排版
- 合併檔案中會清楚標示不同語言的內容
- 建議在網路穩定的環境下進行下載
- 大檔案下載可能需要較長時間,請耐心等待
---
## 通知系統
### 6.1 通知類型
系統會在以下情況發送通知:
- 翻譯任務完成
- 翻譯任務失敗
- 系統維護通知
- 重要更新通知
### 6.2 通知方式
**網頁通知**
- 頂部導航欄的鈴鐺圖示會顯示未讀通知數量
- 點擊鈴鐺可查看通知列表
- 新通知會以醒目顏色標示
**郵件通知**
- 重要通知會同時發送到您的郵箱
- 包含任務完成、失敗等關鍵事件
- 請確保郵箱設定正確並定期查看
### 6.3 通知管理
- **標記已讀**:點擊「標記已讀」按鈕
- **全部已讀**:點擊「全部標記已讀」清空所有未讀通知
- **通知設定**:在用戶設定中可調整通知偏好
---
## 用戶設定
### 7.1 個人資料
在右上角點擊用戶頭像,選擇「個人設定」:
- 查看帳號信息
- 修改顯示名稱
- 更新聯絡資料
### 7.2 系統偏好設定
- **語言偏好**:設定預設的來源語言和目標語言
- **通知設定**:選擇接收哪些類型的通知
- **介面設定**:調整頁面顯示選項
### 7.3 使用統計
查看個人使用統計:
- 總翻譯檔案數量
- 翻譯字數統計
- 最常使用的語言對
- 月度使用趨勢
---
## 常見問題
### 8.1 檔案上傳相關
**Q: 為什麼我的檔案上傳失敗?**
A: 可能的原因包括:
- 檔案格式不支援(請確認是 .docx、.pptx、.xlsx、.pdf
- 檔案大小超過 50MB 限制
- 網路連線不穩定
- 檔案已損壞或受密碼保護
**Q: 可以上傳受密碼保護的檔案嗎?**
A: 目前系統不支援受密碼保護的檔案,請先解除密碼保護後再上傳。
**Q: 為什麼我的 PDF 檔案翻譯結果不理想?**
A: PDF 檔案的文字提取可能受到以下因素影響:
- PDF 是圖片掃描版本(無法提取文字)
- 複雜的排版格式
- 特殊字型或符號
建議使用 Word 檔案獲得最佳翻譯效果。
### 8.2 翻譯品質相關
**Q: 如何提高翻譯品質?**
A: 建議遵循以下原則:
- 使用標準格式的檔案
- 確保原文語法正確
- 避免過於複雜的句子結構
- 專業術語可能需要人工校對
**Q: 翻譯結果可以編輯嗎?**
A: 系統提供的是機器翻譯結果,下載後可以使用相應的軟體(如 Word、PowerPoint進行編輯修改。
### 8.3 系統使用相關
**Q: 為什麼任務一直顯示「等待中」?**
A: 這通常是正常情況:
- 系統正在排隊處理任務
- 大檔案需要較長處理時間
- 如超過 30 分鐘仍未開始處理,請聯繫技術支援
**Q: 可以取消已提交的任務嗎?**
A: 可以,在任務狀態為「等待中」或「處理中」時,可以在任務列表或詳情頁面點擊「取消任務」。
**Q: 歷史任務會保存多久?**
A: 已完成的任務和檔案會保存 90 天,建議及時下載需要的翻譯檔案。
### 8.4 技術支援
**Q: 遇到系統錯誤怎麼辦?**
A: 請按以下步驟處理:
1. 嘗試重新整理頁面
2. 清除瀏覽器快取和 Cookie
3. 更換瀏覽器或使用無痕模式
4. 如問題持續,請聯繫技術支援
**聯絡方式:**
- Email: it-support@panjit.com.tw
- 內線電話: 2481
- 服務時間: 週一至週五 9:00-18:00
---
## 附錄
### 支援的語言清單
- 繁體中文 (Traditional Chinese)
- 簡體中文 (Simplified Chinese)
- 英語 (English)
- 日語 (Japanese)
- 韓語 (Korean)
- 越南語 (Vietnamese)
- 泰語 (Thai)
- 德語 (German)
- 法語 (French)
- 西班牙語 (Spanish)
- 俄語 (Russian)
- 阿拉伯語 (Arabic)
### 瀏覽器相容性
- **推薦瀏覽器**Chrome 80+、Firefox 75+、Edge 80+
- **行動裝置**:支援響應式設計,可在手機和平板上使用
- **注意**IE 瀏覽器不支援,請使用現代瀏覽器
### 檔案大小和數量限制
- **單檔大小**:最大 50MB
- **批量上傳**:最多同時上傳 10 個檔案
- **總容量**:每用戶 1GB 儲存空間
- **並發任務**:最多同時處理 5 個翻譯任務
---
*本手冊最後更新日期2025年9月4日*
*如有疑問或建議,請聯繫 PANJIT IT Team*

View File

@@ -1,196 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
手動補充韓文翻譯快取並重新生成翻譯檔案
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app import create_app
def add_korean_translations():
"""手動補充韓文翻譯快取"""
print("=" * 80)
print("手動補充韓文翻譯快取")
print("目標語言: 韓文 (ko)")
print("=" * 80)
# 關鍵的中文->韓文翻譯對照 (基於常見技術用語翻譯)
korean_translations = [
{
'source_text': '與 WB inline 串線DB→WB、時效快支援 Sn/Au 晶片\n最小可支援9mil晶粒\n支援EAP管控',
'translated_text': 'WB 인라인 연결(DB→WB), 처리 시간 단축; Sn/Au 칩 지원\n최소 9mil 다이 지원\nEAP 제어 지원'
},
{
'source_text': '空洞表現穩定、尺寸/厚度範圍廣\n最小可支援9mil晶粒\n支援EAP管控',
'translated_text': '공극 표현 안정, 크기/두께 범위 넓음\n최소 9mil 다이 지원\nEAP 제어 지원'
},
{
'source_text': 'DB到焊接爐為串機、時效快減少人員碰觸之風險\n支援Ag/Au晶片\n支援含氧量監控\n支援EAP',
'translated_text': 'DB에서 용접로까지 인라인 연결, 처리 시간 단축, 인적 접촉 위험 감소\nAg/Au 칩 지원\n산소 함량 모니터링 지원\nEAP 지원'
},
{
'source_text': '爐後氣孔少,提升焊接接縫均勻度、強度高、氣密性好\n支援Ag/Au晶片\n支援含氧量監控\n支援EAP',
'translated_text': '로 후 기공 적음, 용접 이음부 균일도 향상, 강도 높음, 기밀성 양호\nAg/Au 칩 지원\n산소 함량 모니터링 지원\nEAP 지원'
},
{
'source_text': 'Wire size: 0.8 mil ~ 2.4 mil量產成熟\n最薄 Al bond pad 1.3 μm最小 bond pad size 55 × 55 μm\n支援EAP管控',
'translated_text': '와이어 크기: 0.8 mil ~ 2.4 mil(양산 성숙)\n최박 Al 본드 패드 1.3 μm; 최소 본드 패드 크기 55 × 55 μm\nEAP 제어 지원'
},
{
'source_text': '1.全自動貼片減少人為作業的風險\n2.機台封閉式設計及有HEPA機構能減少落塵造成的異常風險\n3.自動讀取晶片刻號及貼晶片條碼\n支援EAP管控',
'translated_text': '1.전자동 부착으로 인적 작업 위험 감소\n2.장비 밀폐식 설계 및 HEPA 기구로 낙진 이상 위험 감소\n3.칩 각인 및 칩 바코드 자동 판독\nEAP 제어 지원'
},
{
'source_text': '1.晶片切割後chipping的品質檢驗\n2.晶片上的缺點檢驗',
'translated_text': '1.칩 절단 후 치핑 품질 검사\n2.칩상 결함 검사'
},
# 單字元翻譯
{
'source_text': '',
'translated_text': '높음'
},
{
'source_text': '',
'translated_text': '낮음'
},
{
'source_text': '',
'translated_text': '중간'
},
# 其他重要片段
{
'source_text': '自動串接DB 後直上 WB免批次搬運。\n快速交付:連線作業縮短 Cycle Time。',
'translated_text': '자동 연결: DB 후 직접 WB 연결, 배치 운반 생략.\n빠른 납품: 연결 작업으로 사이클 타임 단축.'
},
{
'source_text': 'Solder\nDB+WB',
'translated_text': '솔더\nDB+WB'
},
{
'source_text': '晶粒尺寸/pad尺寸需配合規格\n高溫製程,需確認晶片承受狀況',
'translated_text': '다이 크기/패드 크기는 사양에 맞춰야 함\n고온 공정, 칩 내성 확인 필요'
}
]
app = create_app()
with app.app_context():
from app.models.cache import TranslationCache
from app import db
source_language = 'zh'
target_language = 'ko'
print(f"準備添加 {len(korean_translations)} 筆韓文翻譯...")
print("-" * 60)
added_count = 0
updated_count = 0
for i, trans in enumerate(korean_translations, 1):
source_text = trans['source_text']
translated_text = trans['translated_text']
print(f"\n{i:2d}. 處理翻譯:")
print(f" 原文: {repr(source_text[:40])}...")
print(f" 韓文: {repr(translated_text[:40])}...")
# 檢查是否已存在
existing = TranslationCache.get_translation(source_text, source_language, target_language)
if existing:
if existing.strip() != translated_text.strip():
print(f" 🔄 更新現有翻譯")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
updated_count += 1
else:
print(f" ⚠️ 翻譯已存在且相同")
else:
print(f" ✅ 新增翻譯記錄")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
added_count += 1
print(f"\n" + "-" * 60)
print(f"韓文翻譯補充結果:")
print(f" 新增: {added_count}")
print(f" 更新: {updated_count}")
print(f" 總計: {added_count + updated_count}")
# 驗證結果
print(f"\n驗證補充結果:")
print("-" * 60)
success_count = 0
for i, trans in enumerate(korean_translations, 1):
source_text = trans['source_text']
cached_translation = TranslationCache.get_translation(source_text, source_language, target_language)
if cached_translation:
if cached_translation.strip() == trans['translated_text'].strip():
print(f"{i:2d}: 驗證成功")
success_count += 1
else:
print(f"⚠️ {i:2d}: 驗證失敗 - 內容不一致")
else:
print(f"{i:2d}: 驗證失敗 - 快取中沒有")
print(f"\n驗證結果: {success_count}/{len(korean_translations)} 成功")
# 測試整體韓文映射覆蓋率
print(f"\n測試整體韓文映射覆蓋率:")
print("-" * 60)
from app.services.translation_service import ExcelParser
from sqlalchemy import text as sql_text
original_file = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\98158984-f335-44f5-a0b4-88fb8ccd5d78") / "original_panjit_98158984.xlsx"
if original_file.exists():
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
mapping_count = 0
for segment in segments:
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
row = result.fetchone()
if row:
mapping_count += 1
mapping_rate = mapping_count / len(segments) * 100 if segments else 0
print(f"韓文映射覆蓋率: {mapping_count}/{len(segments)} = {mapping_rate:.1f}%")
if mapping_rate >= 95:
print("🎉 韓文映射覆蓋率優秀!翻譯功能應該完美工作")
elif mapping_rate >= 90:
print("✅ 韓文映射覆蓋率良好,翻譯功能基本正常")
elif mapping_rate >= 80:
print("⚠️ 韓文映射覆蓋率普通,大部分內容可以翻譯")
else:
print("❌ 韓文映射覆蓋率不足,需要更多翻譯")
print(f"\n" + "=" * 80)
print("韓文翻譯快取補充完成!")
print("建議: 重新上傳Excel檔案測試韓文翻譯功能")
print("或者手動重新生成韓文翻譯檔案")
print("=" * 80)
if __name__ == "__main__":
add_korean_translations()

View File

@@ -1,220 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
分析最新Excel測試結果 - 檢查修正是否真正生效
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import openpyxl
from app.services.translation_service import ExcelParser
def analyze_latest_excel_test():
"""詳細分析最新Excel測試結果"""
print("=" * 80)
print("分析最新Excel測試結果")
print("UUID: 185bb457-b703-4e98-94a2-fde072b895c4")
print("=" * 80)
# 文件路徑
test_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\185bb457-b703-4e98-94a2-fde072b895c4")
original_file = test_dir / "original_panjit_185bb457.xlsx"
translated_file = test_dir / "original_panjit_185bb457_ja_translated.xlsx"
if not original_file.exists():
print(f"原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"翻譯文件不存在: {translated_file}")
return
print(f"\n✅ 檔案確認:")
print(f" 原始文件: {original_file.name}")
print(f" 翻譯文件: {translated_file.name}")
# 1. 測試ExcelParser的_should_translate函數
print(f"\n1. 測試ExcelParser的_should_translate函數")
print("-" * 60)
parser = ExcelParser(str(original_file))
test_texts = [
("製程", "A1儲存格"),
("主要特點", "標題文字"),
("AB", "2個英文字母"),
("123", "純數字"),
("工藝", "2個中文字符"),
("Epoxy 膠黏(導電/導熱銀膠)", "複合文字")
]
for text, desc in test_texts:
should_translate = parser._should_translate(text, 'auto')
has_cjk = parser._has_cjk(text)
min_length = 2 if has_cjk else 3
print(f" '{text}' ({desc}):")
print(f" 長度: {len(text)}, CJK: {has_cjk}, 最小長度: {min_length}")
print(f" 應翻譯: {should_translate}")
print()
# 2. 檢查實際提取的文字片段
print(f"\n2. 檢查實際提取的文字片段")
print("-" * 60)
segments = parser.extract_text_segments()
print(f"✅ 總共提取 {len(segments)} 個文字片段")
# 特別檢查A1
a1_content = "製程"
if a1_content in segments:
print(f"✅ A1內容 '{a1_content}' 已被提取")
index = segments.index(a1_content)
print(f" 在列表中的位置: 第{index+1}")
else:
print(f"❌ A1內容 '{a1_content}' 仍未被提取")
# 顯示所有提取的片段
print(f"\n 所有提取的片段:")
for i, segment in enumerate(segments):
safe_segment = repr(segment)
print(f" {i+1:2d}. {safe_segment}")
if segment == a1_content:
print(f" ⬆️ 這是A1的內容")
# 3. 檢查原始和翻譯文件的A1儲存格
print(f"\n3. 檢查A1儲存格內容")
print("-" * 60)
wb_orig = openpyxl.load_workbook(str(original_file), data_only=False)
wb_trans = openpyxl.load_workbook(str(translated_file), data_only=False)
try:
wb_orig_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_orig_vals = None
# A1儲存格比較
a1_orig = wb_orig.active['A1'].value
a1_trans = wb_trans.active['A1'].value
a1_orig_display = wb_orig_vals.active['A1'].value if wb_orig_vals else None
print(f" A1原始值: {repr(a1_orig)}")
if wb_orig_vals:
print(f" A1顯示值: {repr(a1_orig_display)}")
print(f" A1翻譯值: {repr(a1_trans)}")
# 判斷A1是否被翻譯
if isinstance(a1_trans, str) and '\n' in a1_trans:
lines = a1_trans.split('\n')
if len(lines) >= 2:
print(f" ✅ A1已翻譯格式: 原文+換行+譯文")
print(f" 原文行: {repr(lines[0])}")
print(f" 譯文行: {repr(lines[1])}")
else:
print(f" ❌ A1格式異常")
elif a1_orig == a1_trans:
print(f" ❌ A1未翻譯 - 內容相同")
else:
print(f" ⚠️ A1內容有變化但格式不明")
# 4. 檢查其他重要儲存格
print(f"\n4. 檢查其他重要儲存格")
print("-" * 60)
important_cells = ['B1', 'C1', 'D1', 'A2', 'B2', 'C2']
for cell_name in important_cells:
orig_cell = wb_orig.active[cell_name]
trans_cell = wb_trans.active[cell_name]
orig_val = orig_cell.value
trans_val = trans_cell.value
if orig_val: # 只檢查有內容的儲存格
print(f"\n {cell_name}儲存格:")
print(f" 原始: {repr(orig_val)}")
print(f" 翻譯: {repr(trans_val)}")
if isinstance(trans_val, str) and '\n' in trans_val:
lines = trans_val.split('\n')
print(f" 狀態: ✅ 已翻譯 (雙行格式)")
if len(lines) >= 2:
print(f" 原文: {repr(lines[0])}")
print(f" 譯文: {repr(lines[1])}")
elif orig_val == trans_val:
print(f" 狀態: ❌ 未翻譯")
else:
print(f" 狀態: ⚠️ 內容有變化")
# 5. 檢查翻譯快取狀況
print(f"\n5. 檢查翻譯快取狀況")
print("-" * 60)
from app import create_app
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
target_language = 'ja'
print(f"查詢 '{a1_content}' 在翻譯快取中的狀況...")
# 查詢精確匹配
result = db.session.execute(sql_text("""
SELECT source_text, translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 3
"""), {'text': a1_content, 'lang': target_language})
rows = result.fetchall()
if rows:
print(f"✅ 找到 {len(rows)} 筆精確匹配的翻譯記錄:")
for i, (src, trans, created_at) in enumerate(rows):
print(f" {i+1}. 原文: {repr(src)}")
print(f" 譯文: {repr(trans)}")
print(f" 時間: {created_at}")
else:
print(f"❌ 未找到精確匹配的翻譯記錄")
# 查詢所有提取片段的翻譯狀況
print(f"\n檢查所有提取片段的翻譯快取狀況:")
found_count = 0
for i, segment in enumerate(segments[:10]): # 只檢查前10個
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
row = result.fetchone()
if row:
found_count += 1
print(f"{i+1:2d}. '{segment[:20]}...' -> '{row[0][:20]}...'")
else:
print(f"{i+1:2d}. '{segment[:20]}...' -> 無翻譯記錄")
print(f"\n翻譯快取命中率: {found_count}/{min(10, len(segments))} = {found_count/min(10, len(segments))*100:.1f}%")
wb_orig.close()
wb_trans.close()
if wb_orig_vals:
wb_orig_vals.close()
print("\n" + "=" * 80)
print("分析完成!")
print("=" * 80)
if __name__ == "__main__":
analyze_latest_excel_test()

35
app.py
View File

@@ -62,13 +62,30 @@ def test():
@app.route('/')
def index():
"""首頁路由"""
return {
'application': 'PANJIT Document Translator',
'version': '1.0.0',
'status': 'running',
'api_base_url': '/api/v1'
}
"""首頁路由 - 服務前端應用"""
try:
from flask import send_from_directory
return send_from_directory('/app/static', 'index.html')
except Exception as e:
# 如果靜態文件不存在返回API信息
return {
'application': 'PANJIT Document Translator',
'version': '1.0.0',
'status': 'running',
'api_base_url': '/api/v1',
'note': 'Frontend files not found, serving API info'
}
@app.route('/<path:path>')
def serve_static(path):
"""服務靜態文件"""
try:
from flask import send_from_directory
return send_from_directory('/app/static', path)
except Exception:
# 如果文件不存在返回index.html (SPA路由)
return send_from_directory('/app/static', 'index.html')
@app.route('/api')
@@ -102,9 +119,9 @@ def health_check():
if __name__ == '__main__':
# 檢查環境變數
port = int(os.environ.get('PORT', 5000))
port = int(os.environ.get('PORT', 12010))
debug = os.environ.get('FLASK_DEBUG', 'false').lower() == 'true'
host = os.environ.get('HOST', '127.0.0.1')
host = os.environ.get('HOST', '0.0.0.0')
# 只在主進程或非 debug 模式下顯示啟動訊息
# 在 debug 模式下Flask 會創建兩個進程,只在 reloader 主進程顯示訊息

View File

@@ -94,7 +94,7 @@ def create_app(config_name=None):
@app.after_request
def after_request(response):
origin = request.headers.get('Origin')
allowed_origins = ['http://localhost:3000', 'http://127.0.0.1:3000', 'http://localhost:3001', 'http://127.0.0.1:3001']
allowed_origins = ['http://localhost:3000', 'http://127.0.0.1:3000', 'http://localhost:3001', 'http://127.0.0.1:3001', 'http://localhost:12010', 'http://127.0.0.1:12010']
if origin and origin in allowed_origins:
response.headers['Access-Control-Allow-Origin'] = origin
@@ -111,7 +111,7 @@ def create_app(config_name=None):
if request.method == 'OPTIONS':
response = make_response()
origin = request.headers.get('Origin')
allowed_origins = ['http://localhost:3000', 'http://127.0.0.1:3000', 'http://localhost:3001', 'http://127.0.0.1:3001']
allowed_origins = ['http://localhost:3000', 'http://127.0.0.1:3000', 'http://localhost:3001', 'http://127.0.0.1:3001', 'http://localhost:12010', 'http://127.0.0.1:12010']
if origin and origin in allowed_origins:
response.headers['Access-Control-Allow-Origin'] = origin

View File

@@ -1,317 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
認證 API
Author: PANJIT IT Team
Created: 2024-01-28
Modified: 2024-01-28
"""
from flask import Blueprint, request, jsonify, current_app
from flask_jwt_extended import create_access_token, create_refresh_token, jwt_required, get_jwt_identity
from app.utils.ldap_auth import LDAPAuthService
from app.utils.decorators import login_required, validate_json, rate_limit
from app.utils.exceptions import AuthenticationError
from app.utils.logger import get_logger
from app.models.user import User
from app.models.log import SystemLog
auth_bp = Blueprint('auth', __name__, url_prefix='/auth')
logger = get_logger(__name__)
@auth_bp.route('/login', methods=['POST'])
@rate_limit(max_requests=10, per_seconds=300) # 5分鐘內最多10次嘗試
@validate_json(['username', 'password'])
def login():
"""使用者登入"""
try:
data = request.get_json()
username = data['username'].strip()
password = data['password']
if not username or not password:
return jsonify({
'success': False,
'error': 'INVALID_INPUT',
'message': '帳號和密碼不能為空'
}), 400
# LDAP 認證
ldap_service = LDAPAuthService()
user_info = ldap_service.authenticate_user(username, password)
# 取得或建立使用者
user = User.get_or_create(
username=user_info['username'],
display_name=user_info['display_name'],
email=user_info['email'],
department=user_info.get('department')
)
# 更新登入時間
user.update_last_login()
# 創建 JWT tokens
access_token = create_access_token(
identity=user.username,
additional_claims={
'user_id': user.id,
'is_admin': user.is_admin,
'display_name': user.display_name,
'email': user.email
}
)
refresh_token = create_refresh_token(identity=user.username)
# 記錄登入日誌
SystemLog.info(
'auth.login',
f'User {username} logged in successfully',
user_id=user.id,
extra_data={
'ip_address': request.remote_addr,
'user_agent': request.headers.get('User-Agent')
}
)
logger.info(f"🔑 [JWT Created] User: {username}, UserID: {user.id}")
logger.info(f"User {username} logged in successfully")
return jsonify({
'success': True,
'data': {
'access_token': access_token,
'refresh_token': refresh_token,
'user': user.to_dict()
},
'message': '登入成功'
})
except AuthenticationError as e:
# 記錄認證失敗
SystemLog.warning(
'auth.login_failed',
f'Authentication failed for user {username}: {str(e)}',
extra_data={
'username': username,
'ip_address': request.remote_addr,
'error': str(e)
}
)
logger.warning(f"Authentication failed for user {username}: {str(e)}")
return jsonify({
'success': False,
'error': 'INVALID_CREDENTIALS',
'message': str(e)
}), 401
except Exception as e:
logger.error(f"Login error: {str(e)}")
SystemLog.error(
'auth.login_error',
f'Login system error: {str(e)}',
extra_data={
'username': username,
'error': str(e)
}
)
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '系統錯誤,請稍後再試'
}), 500
@auth_bp.route('/logout', methods=['POST'])
@jwt_required()
def logout():
"""使用者登出"""
try:
username = get_jwt_identity()
# 記錄登出日誌
SystemLog.info(
'auth.logout',
f'User {username} logged out'
)
logger.info(f"🚪 [JWT Logout] User: {username}")
logger.info(f"User {username} logged out")
return jsonify({
'success': True,
'message': '登出成功'
})
except Exception as e:
logger.error(f"Logout error: {str(e)}")
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '登出時發生錯誤'
}), 500
@auth_bp.route('/me', methods=['GET'])
@jwt_required()
def get_current_user():
"""取得當前使用者資訊"""
try:
from flask_jwt_extended import get_jwt
username = get_jwt_identity()
claims = get_jwt()
user_data = {
'username': username,
'user_id': claims.get('user_id'),
'is_admin': claims.get('is_admin'),
'display_name': claims.get('display_name'),
'email': claims.get('email')
}
return jsonify({
'success': True,
'data': {
'user': user_data
}
})
except Exception as e:
logger.error(f"Get current user error: {str(e)}")
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '取得使用者資訊時發生錯誤'
}), 500
@auth_bp.route('/refresh', methods=['POST'])
@jwt_required(refresh=True)
def refresh_token():
"""刷新 Session"""
try:
from flask import g
user = g.current_user
# 更新 Session 資訊
session['user_id'] = user.id
session['username'] = user.username
session['is_admin'] = user.is_admin
session.permanent = True
logger.info(f"Session refreshed for user {user.username}")
return jsonify({
'success': True,
'data': {
'user': user.to_dict(),
'session_refreshed': True
},
'message': 'Session 已刷新'
})
except Exception as e:
logger.error(f"Session refresh error: {str(e)}")
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '刷新 Session 時發生錯誤'
}), 500
@auth_bp.route('/check', methods=['GET'])
def check_auth():
"""檢查認證狀態"""
try:
user_id = session.get('user_id')
if not user_id:
return jsonify({
'success': False,
'authenticated': False,
'message': '未登入'
}), 401
# 驗證使用者是否仍然存在
user = User.query.get(user_id)
if not user:
session.clear()
return jsonify({
'success': False,
'authenticated': False,
'message': '使用者不存在'
}), 401
return jsonify({
'success': True,
'authenticated': True,
'data': {
'user': user.to_dict()
}
})
except Exception as e:
logger.error(f"Auth check error: {str(e)}")
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '檢查認證狀態時發生錯誤'
}), 500
@auth_bp.route('/search-users', methods=['GET'])
@login_required
def search_users():
"""搜尋使用者LDAP"""
try:
search_term = request.args.get('q', '').strip()
limit = min(int(request.args.get('limit', 20)), 50)
if len(search_term) < 2:
return jsonify({
'success': False,
'error': 'INVALID_SEARCH_TERM',
'message': '搜尋關鍵字至少需要2個字元'
}), 400
ldap_service = LDAPAuthService()
users = ldap_service.search_users(search_term, limit)
return jsonify({
'success': True,
'data': {
'users': users,
'count': len(users)
}
})
except Exception as e:
logger.error(f"User search error: {str(e)}")
return jsonify({
'success': False,
'error': 'SYSTEM_ERROR',
'message': '搜尋使用者時發生錯誤'
}), 500
# 錯誤處理器
@auth_bp.errorhandler(429)
def rate_limit_handler(e):
"""速率限制錯誤處理器"""
return jsonify({
'success': False,
'error': 'RATE_LIMIT_EXCEEDED',
'message': '請求過於頻繁,請稍後再試'
}), 429

View File

@@ -1,58 +0,0 @@
@echo off
echo 正在建構 PANJIT Document Translator 前端...
REM 檢查 Node.js 是否安裝
node --version >nul 2>&1
if errorlevel 1 (
echo 錯誤: 未檢測到 Node.js請先安裝 Node.js 16+ 版本
pause
exit /b 1
)
REM 檢查是否在前端目錄
if not exist "frontend\package.json" (
echo 錯誤: 請在專案根目錄執行此腳本
pause
exit /b 1
)
REM 進入前端目錄
cd frontend
REM 安裝依賴
echo 正在安裝依賴套件...
npm install
if errorlevel 1 (
echo 依賴安裝失敗,請檢查網路連線和 npm 配置
pause
exit /b 1
)
REM 執行 ESLint 檢查
echo 正在執行程式碼檢查...
npm run lint
if errorlevel 1 (
echo 程式碼檢查發現問題,請修復後重試
pause
exit /b 1
)
REM 執行建構
echo 正在建構生產版本...
npm run build
if errorlevel 1 (
echo 建構失敗
pause
exit /b 1
)
echo.
echo ==========================================
echo 建構完成!
echo ==========================================
echo 建構檔案位於: frontend\dist
echo 可使用 nginx 或其他 web 伺服器部署
echo ==========================================
echo.
pause

View File

@@ -1,26 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Celery Worker 啟動腳本
Author: PANJIT IT Team
Created: 2024-01-28
Modified: 2024-01-28
"""
import os
import sys
from pathlib import Path
# 添加專案根目錄到 Python 路徑
project_root = Path(__file__).parent
sys.path.insert(0, str(project_root))
from app import create_app
# 建立應用並取得 Celery 實例
app = create_app()
celery = app.celery
if __name__ == '__main__':
celery.start()

View File

@@ -1,33 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查配置
"""
import sys
import os
# 添加 app 路徑
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
def main():
from app import create_app
app = create_app()
with app.app_context():
print("配置檢查:")
print(f"DIFY_API_BASE_URL: '{app.config.get('DIFY_API_BASE_URL', 'NOT_SET')}'")
print(f"DIFY_API_KEY: '{app.config.get('DIFY_API_KEY', 'NOT_SET')}'")
# 檢查 api.txt 文件
import os
if os.path.exists('api.txt'):
with open('api.txt', 'r', encoding='utf-8') as f:
content = f.read()
print(f"\napi.txt 內容:")
print(content)
else:
print("\napi.txt 文件不存在")
if __name__ == "__main__":
main()

View File

@@ -1,108 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查資料庫結構 - 找出翻譯結果儲存方式
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from sqlalchemy import text
def check_db_structure():
"""檢查資料庫結構"""
app = create_app()
with app.app_context():
print("=== 檢查資料庫結構 ===")
# 列出所有表
result = db.session.execute(text("SHOW TABLES"))
tables = result.fetchall()
print(f"資料庫中的表:")
for table in tables:
table_name = table[0]
print(f" - {table_name}")
# 檢查表結構
desc_result = db.session.execute(text(f"DESC {table_name}"))
columns = desc_result.fetchall()
for col in columns:
print(f" {col[0]} ({col[1]})")
# 檢查特定任務的相關資料
print(f"\n=== 檢查特定任務資料 ===")
job_uuid = "9c6548ac-2f59-45f4-aade-0a9b3895bbfd"
# 查詢任務資料
job_result = db.session.execute(text("""
SELECT id, job_uuid, status, progress, total_tokens, total_cost, target_languages
FROM dt_translation_jobs
WHERE job_uuid = :uuid
"""), {'uuid': job_uuid})
job_row = job_result.fetchone()
if job_row:
print(f"任務ID: {job_row[0]}")
print(f"UUID: {job_row[1]}")
print(f"狀態: {job_row[2]}")
print(f"進度: {job_row[3]}")
print(f"Tokens: {job_row[4]}")
print(f"成本: {job_row[5]}")
print(f"目標語言: {job_row[6]}")
job_id = job_row[0]
# 查詢相關檔案
files_result = db.session.execute(text("""
SELECT file_type, filename, language_code, file_size, created_at
FROM dt_job_files
WHERE job_id = :job_id
"""), {'job_id': job_id})
files = files_result.fetchall()
print(f"\n相關檔案 ({len(files)}):")
for file_row in files:
print(f" {file_row[0]}: {file_row[1]} ({file_row[2]}) - {file_row[3]} bytes")
# 查詢翻譯cache如果存在的話
if 'dt_translation_cache' in [t[0] for t in tables]:
cache_result = db.session.execute(text("""
SELECT COUNT(*) FROM dt_translation_cache
WHERE source_text IN (
SELECT SUBSTRING(source_text, 1, 50)
FROM dt_translation_cache
LIMIT 5
)
"""))
cache_count = cache_result.scalar()
print(f"\n翻譯快取記錄數: {cache_count}")
# 取幾個範例
sample_result = db.session.execute(text("""
SELECT source_text, target_language, translated_text
FROM dt_translation_cache
LIMIT 5
"""))
samples = sample_result.fetchall()
print(f"快取範例:")
for sample in samples:
print(f" {sample[0][:50]}... -> [{sample[1]}] {sample[2][:50]}...")
else:
print(f"找不到任務: {job_uuid}")
if __name__ == "__main__":
check_db_structure()

View File

@@ -1,67 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查翻譯快取資料表結構
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
def check_table_structure():
"""檢查翻譯快取資料表結構"""
print("=" * 80)
print("檢查翻譯快取資料表結構")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 查詢資料表結構
result = db.session.execute(sql_text("DESCRIBE dt_translation_cache"))
print("dt_translation_cache 資料表結構:")
print("-" * 60)
rows = result.fetchall()
for row in rows:
row_data = [str(item) if item is not None else '' for item in row]
print(f" {row_data[0]:<20} | {row_data[1]:<15} | {row_data[2]:<5} | {row_data[3]:<5} | {row_data[4]:<10} | {row_data[5] if len(row_data) > 5 else ''}")
print("\n" + "-" * 60)
print("欄位說明: 欄位名稱 | 類型 | Null | Key | Default | Extra")
# 查詢資料表中的資料筆數
count_result = db.session.execute(sql_text("SELECT COUNT(*) FROM dt_translation_cache"))
count = count_result.fetchone()[0]
print(f"\n總記錄數: {count}")
# 查詢最近的幾筆記錄
recent_result = db.session.execute(sql_text("""
SELECT source_text, translated_text, source_language, target_language, created_at
FROM dt_translation_cache
ORDER BY created_at DESC
LIMIT 5
"""))
print(f"\n最近的翻譯記錄:")
print("-" * 60)
recent_rows = recent_result.fetchall()
for i, (src, trans, src_lang, tgt_lang, created_at) in enumerate(recent_rows):
print(f" {i+1}. '{src[:20]}...' -> '{trans[:20]}...' ({src_lang}->{tgt_lang}) {created_at}")
print("\n" + "=" * 80)
print("檢查完成!")
print("=" * 80)
if __name__ == "__main__":
check_table_structure()

View File

@@ -1,101 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查DOCX翻譯文件的實際內容
"""
import sys
import os
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.job import TranslationJob
def check_docx_content():
"""檢查DOCX翻譯文件的實際內容"""
app = create_app()
with app.app_context():
print("=== 檢查DOCX翻譯文件內容 ===")
# 檢查最新的DOCX任務
job = TranslationJob.query.filter_by(job_uuid='9c6548ac-2f59-45f4-aade-0a9b3895bbfd').first()
if not job:
print("DOCX任務不存在")
return
print(f"任務狀態: {job.status}")
print(f"總tokens: {job.total_tokens}")
print(f"總成本: ${job.total_cost}")
print(f"目標語言: {job.target_languages}")
translated_files = job.get_translated_files()
print(f"\n📁 翻譯檔案數: {len(translated_files)}")
for tf in translated_files:
file_path = Path(tf.file_path)
print(f"\n【檢查】 {tf.filename} ({tf.language_code})")
print(f"路徑: {tf.file_path}")
print(f"存在: {file_path.exists()}")
print(f"大小: {file_path.stat().st_size:,} bytes")
if file_path.exists() and tf.filename.endswith('.docx'):
try:
from docx import Document
doc = Document(str(file_path))
paragraphs = [p.text.strip() for p in doc.paragraphs if p.text.strip()]
print(f"總段落數: {len(paragraphs)}")
if paragraphs:
print(f"\n📄 前5段內容檢查:")
for i, para in enumerate(paragraphs[:5]):
print(f"段落 {i+1}: {para[:100]}...")
# 檢查是否包含交錯翻譯格式
lines = para.split('\n')
if len(lines) > 1:
print(f" -> 多行內容(可能是交錯格式): {len(lines)}")
for j, line in enumerate(lines[:3]): # 顯示前3行
print(f"{j+1}: {line[:60]}...")
# 檢查是否包含英文或越南文
has_english = any(ord(c) < 128 and c.isalpha() for c in para)
has_vietnamese = any('\u00C0' <= c <= '\u1EF9' for c in para) # Vietnamese characters
print(f" -> 包含英文: {has_english}")
print(f" -> 包含越南文: {has_vietnamese}")
print(" ---")
# 檢查整個文件的語言分佈
all_text = ' '.join(paragraphs)
chinese_chars = sum(1 for c in all_text if '\u4e00' <= c <= '\u9fff')
english_chars = sum(1 for c in all_text if ord(c) < 128 and c.isalpha())
vietnamese_chars = sum(1 for c in all_text if '\u00C0' <= c <= '\u1EF9')
print(f"\n📊 文件語言分析:")
print(f" 中文字符: {chinese_chars}")
print(f" 英文字符: {english_chars}")
print(f" 越南文字符: {vietnamese_chars}")
if chinese_chars > 0 and (english_chars == 0 and vietnamese_chars == 0):
print(" ❌ 只有中文,沒有翻譯內容!")
elif chinese_chars > 0 and (english_chars > 0 or vietnamese_chars > 0):
print(" ✅ 包含中文和翻譯內容,可能是交錯格式")
else:
print(" ⚠️ 文件內容異常")
except Exception as e:
print(f"❌ 讀取DOCX文件失敗: {e}")
if __name__ == "__main__":
check_docx_content()

View File

@@ -1,122 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查DOCX任務的具體翻譯對應
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from sqlalchemy import text
from app.services.translation_service import DocxParser
def check_docx_specific_translations():
"""檢查DOCX任務的具體翻譯對應"""
app = create_app()
with app.app_context():
print("=== 檢查DOCX任務的具體翻譯對應 ===")
# 原始文件路徑
original_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\9c6548ac-2f59-45f4-aade-0a9b3895bbfd\original_-OR026_9c6548ac.docx"
# 提取原始文檔段落
parser = DocxParser(original_path)
segments = parser.extract_segments_with_context()
text_segments = [seg.text for seg in segments if seg.text.strip()]
print(f"原始文檔有 {len(text_segments)} 個文本段落")
# 查找這些段落在快取中對應的翻譯
print(f"\n=== 檢查每個段落的翻譯狀況 ===")
total_segments = len(text_segments)
found_en = 0
found_vi = 0
for i, segment_text in enumerate(text_segments):
# 查找英文翻譯
en_result = db.session.execute(text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'en'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment_text})
en_row = en_result.fetchone()
# 查找越南文翻譯
vi_result = db.session.execute(text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'vi'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment_text})
vi_row = vi_result.fetchone()
status = ""
if en_row:
found_en += 1
status += "EN✅ "
else:
status += "EN❌ "
if vi_row:
found_vi += 1
status += "VI✅ "
else:
status += "VI❌ "
print(f"段落 {i+1:3d}: {status} {segment_text[:50]}...")
# 顯示翻譯內容(如果有的話)
if en_row and len(en_row[0]) > 0:
en_text = en_row[0]
# 檢查是否真的是英文
has_english = any(ord(c) < 128 and c.isalpha() for c in en_text)
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in en_text)
if has_english and not has_chinese:
print(f" EN: ✅ {en_text[:60]}...")
elif has_chinese:
print(f" EN: ❌ 仍是中文: {en_text[:60]}...")
else:
print(f" EN: ❓ 未知: {en_text[:60]}...")
if vi_row and len(vi_row[0]) > 0:
vi_text = vi_row[0]
has_vietnamese = any('\u00C0' <= c <= '\u1EF9' for c in vi_text)
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in vi_text)
if has_vietnamese and not has_chinese:
print(f" VI: ✅ {vi_text[:60]}...")
elif has_chinese:
print(f" VI: ❌ 仍是中文: {vi_text[:60]}...")
else:
print(f" VI: ❓ 未知: {vi_text[:60]}...")
print(f"\n📊 統計結果:")
print(f" 總段落數: {total_segments}")
print(f" 有英文翻譯: {found_en} ({found_en/total_segments*100:.1f}%)")
print(f" 有越南文翻譯: {found_vi} ({found_vi/total_segments*100:.1f}%)")
if found_en < total_segments * 0.5:
print(f" ❌ 翻譯覆蓋率太低,可能是翻譯流程有問題")
else:
print(f" ✅ 翻譯覆蓋率正常")
if __name__ == "__main__":
check_docx_specific_translations()

View File

@@ -1,138 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
直接檢查ROW291的具體內容
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
def check_exact_row291():
"""直接檢查ROW291的具體內容"""
print("=" * 80)
print("直接檢查ROW291的具體內容")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 1. 直接查看ROW291
print(f"1. 直接查看ROW291")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, source_language, target_language, created_at
FROM dt_translation_cache
WHERE id = 291
"""))
row291 = result.fetchone()
if not row291:
print("❌ ROW291 不存在")
else:
print(f"✅ ROW291 存在:")
print(f" ID: {row291[0]}")
print(f" 原文: {repr(row291[1])}")
print(f" 翻譯: {repr(row291[2])}")
print(f" 源語言: {row291[3]}")
print(f" 目標語言: {row291[4]}")
print(f" 創建時間: {row291[5]}")
# 檢查是否為D2內容
d2_content = "與 WB inline 串線DB→WB、時效快支援 Sn/Au 晶片\n最小可支援9mil晶粒\n支援EAP管控"
if row291[1] == d2_content:
print(f"✅ 這確實是D2的內容")
if row291[4] == 'ko':
print(f"✅ 而且是韓文翻譯")
print(f" 韓文翻譯: {row291[2]}")
# 測試這個翻譯是否能被映射邏輯找到
print(f"\n測試映射查找:")
search_result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': d2_content, 'lang': 'ko'})
search_row = search_result.fetchone()
if search_row:
print(f" ✅ 映射查找成功: {repr(search_row[0][:50])}...")
if search_row[0] == row291[2]:
print(f" ✅ 內容完全一致")
else:
print(f" ❌ 內容不一致")
print(f" ROW291: {repr(row291[2][:50])}...")
print(f" 查找到: {repr(search_row[0][:50])}...")
else:
print(f" ❌ 映射查找失敗")
else:
print(f"❌ 不是韓文翻譯,而是 {row291[4]}")
else:
print(f"❌ 不是D2的內容")
print(f" 實際內容: {repr(row291[1][:50])}...")
# 2. 查找ROW290-295的所有記錄
print(f"\n2. 查找ROW290-295的所有記錄")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, source_language, target_language, created_at
FROM dt_translation_cache
WHERE id >= 290 AND id <= 295
ORDER BY id
"""))
nearby_records = result.fetchall()
for record in nearby_records:
print(f"\nROW {record[0]} ({record[3]} -> {record[4]}):")
print(f" 原文: {repr(record[1][:40])}...")
print(f" 翻譯: {repr(record[2][:40])}...")
print(f" 時間: {record[5]}")
# 3. 查找所有D2相關的翻譯記錄包含部分匹配
print(f"\n3. 查找所有包含D2關鍵詞的記錄")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, source_language, target_language, created_at
FROM dt_translation_cache
WHERE source_text LIKE '%WB inline%' OR source_text LIKE '%Sn/Au%'
ORDER BY id
"""))
d2_related_records = result.fetchall()
print(f"找到 {len(d2_related_records)} 筆包含D2關鍵詞的記錄:")
for record in d2_related_records:
print(f"\nROW {record[0]} ({record[3]} -> {record[4]}):")
print(f" 原文: {repr(record[1][:50])}...")
print(f" 翻譯: {repr(record[2][:50])}...")
print(f" 時間: {record[5]}")
# 標示是否為完整的D2內容
if "WB inline" in record[1] and "Sn/Au" in record[1] and "EAP" in record[1]:
print(f" 🎯 這是完整的D2內容")
print(f"\n" + "=" * 80)
print("ROW291具體內容檢查完成")
print("=" * 80)
if __name__ == "__main__":
check_exact_row291()

View File

@@ -1,86 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查指定任務狀態
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.job import TranslationJob
from pathlib import Path
def check_job_status():
"""檢查指定任務狀態"""
app = create_app()
with app.app_context():
print("=== 檢查任務狀態 ===")
job_uuid = "313e213e-6adf-457c-91a7-107fc3636c3a"
job = TranslationJob.query.filter_by(job_uuid=job_uuid).first()
if not job:
print(f"任務不存在: {job_uuid}")
return
print(f"任務 UUID: {job.job_uuid}")
print(f"檔名: {job.original_filename}")
print(f"狀態: {job.status}")
print(f"進度: {job.progress}%")
print(f"總成本: ${job.total_cost}")
print(f"總tokens: {job.total_tokens}")
print(f"目標語言: {job.target_languages}")
if job.error_message:
print(f"❌ 錯誤: {job.error_message}")
# 檢查翻譯檔案
translated_files = job.get_translated_files()
print(f"\n📁 翻譯檔案數: {len(translated_files)}")
for tf in translated_files:
file_path = Path(tf.file_path)
exists = "" if file_path.exists() else ""
size = file_path.stat().st_size if file_path.exists() else 0
print(f" {exists} {tf.filename} ({tf.language_code}) - {size:,} bytes")
# 檢查原始檔案
original_file = job.get_original_file()
if original_file:
orig_path = Path(original_file.file_path)
orig_exists = "" if orig_path.exists() else ""
orig_size = orig_path.stat().st_size if orig_path.exists() else 0
print(f"\n📄 原始檔案: {orig_exists} {original_file.filename} - {orig_size:,} bytes")
# 檢查所有檔案是否存在(用於批量下載)
print(f"\n🔍 批量下載檢查:")
all_files_exist = True
if original_file:
if not Path(original_file.file_path).exists():
print(f" ❌ 原始檔案缺失: {original_file.filename}")
all_files_exist = False
for tf in translated_files:
if not Path(tf.file_path).exists():
print(f" ❌ 翻譯檔案缺失: {tf.filename}")
all_files_exist = False
if all_files_exist and len(translated_files) > 0:
print(f" ✅ 所有檔案都存在,批量下載應該可以正常工作")
else:
print(f" ❌ 有檔案缺失,批量下載會失敗")
if __name__ == "__main__":
check_job_status()

View File

@@ -1,13 +0,0 @@
import sys, os
sys.path.insert(0, os.path.join(os.getcwd(), 'app'))
from app import create_app
from app.models.job import TranslationJob
app = create_app()
with app.app_context():
job = TranslationJob.query.filter_by(job_uuid='485e0fdc-75fb-4b5a-b44b-3531951200a1').first()
if job:
print(f'任務 user_id: {job.user_id}')
print(f'任務狀態: {job.status}')
else:
print('任務不存在')

View File

@@ -1,116 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查中英混合段落的具體內容
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
def check_mixed_paragraph():
"""檢查中英混合段落的具體內容"""
print("=== 檢查中英混合段落的具體內容 ===")
test_file = r"C:\Users\EGG\AppData\Local\Temp\test_docx_translation\translated_original_-OR026_9c6548ac_en_translat.docx"
try:
from docx import Document
doc = Document(test_file)
mixed_count = 0
for i, para in enumerate(doc.paragraphs):
text = para.text.strip()
if not text:
continue
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in text)
has_english = any(ord(c) < 128 and c.isalpha() for c in text)
if has_chinese and has_english:
mixed_count += 1
print(f"\n混合段落 {mixed_count} (段落 {i+1}):")
print(f"完整內容: {text}")
# 分析段落內部結構
lines = text.split('\n')
if len(lines) > 1:
print(f"包含 {len(lines)} 行:")
for j, line in enumerate(lines):
line_chinese = any('\u4e00' <= c <= '\u9fff' for c in line)
line_english = any(ord(c) < 128 and c.isalpha() for c in line)
if line_chinese and line_english:
status = "🔄 中英混合"
elif line_english:
status = "🇺🇸 英文"
elif line_chinese:
status = "🇨🇳 中文"
else:
status = "❓ 其他"
print(f"{j+1}: {status} - {line}")
# 檢查是否包含特殊字符(翻譯插入標記)
if '\u200b' in text:
print(" 💡 包含零寬空格標記(翻譯插入標記)")
# 嘗試分離中英文內容
parts = []
current_part = ""
current_is_chinese = None
for char in text:
is_chinese = '\u4e00' <= char <= '\u9fff'
is_english = ord(char) < 128 and char.isalpha()
if is_chinese:
if current_is_chinese == False: # 切換到中文
if current_part.strip():
parts.append(("EN", current_part.strip()))
current_part = char
current_is_chinese = True
else:
current_part += char
current_is_chinese = True
elif is_english:
if current_is_chinese == True: # 切換到英文
if current_part.strip():
parts.append(("ZH", current_part.strip()))
current_part = char
current_is_chinese = False
else:
current_part += char
current_is_chinese = False
else:
current_part += char
if current_part.strip():
if current_is_chinese:
parts.append(("ZH", current_part.strip()))
elif current_is_chinese == False:
parts.append(("EN", current_part.strip()))
if len(parts) > 1:
print(f" 📝 內容分析 ({len(parts)} 部分):")
for k, (lang, content) in enumerate(parts):
print(f" {k+1}. [{lang}] {content[:50]}...")
if mixed_count == 0:
print("沒有找到中英混合段落")
else:
print(f"\n✅ 總共找到 {mixed_count} 個中英混合段落")
except Exception as e:
print(f"❌ 檢查失敗: {e}")
if __name__ == "__main__":
check_mixed_paragraph()

View File

@@ -1,164 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查原始快取資料庫中ROW291的翻譯記錄
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
def check_original_cache_row291():
"""檢查原始快取資料庫中ROW291的翻譯記錄"""
print("=" * 80)
print("檢查原始快取資料庫中的翻譯記錄")
print("重點ROW291 vs ROW349 的差異")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 1. 檢查ROW291附近的記錄
print(f"1. 檢查ROW291附近的韓文翻譯記錄")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, target_language, created_at
FROM dt_translation_cache
WHERE id >= 285 AND id <= 295 AND target_language = 'ko'
ORDER BY id
"""))
row291_records = result.fetchall()
if not row291_records:
print("❌ ROW285-295範圍內沒有韓文記錄")
else:
for record in row291_records:
print(f"\nROW {record[0]}:")
print(f" 原文: {repr(record[1][:50])}...")
print(f" 韓文: {repr(record[2][:50])}...")
print(f" 時間: {record[4]}")
# 2. 檢查ROW349附近的記錄 (我手動補充的)
print(f"\n2. 檢查ROW349附近的韓文翻譯記錄 (手動補充)")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, target_language, created_at
FROM dt_translation_cache
WHERE id >= 345 AND id <= 355 AND target_language = 'ko'
ORDER BY id
"""))
row349_records = result.fetchall()
if not row349_records:
print("❌ ROW345-355範圍內沒有韓文記錄")
else:
for record in row349_records:
print(f"\nROW {record[0]}:")
print(f" 原文: {repr(record[1][:50])}...")
print(f" 韓文: {repr(record[2][:50])}...")
print(f" 時間: {record[4]}")
# 3. 直接查找D2內容的所有翻譯記錄
print(f"\n3. 查找D2內容的所有翻譯記錄")
print("-" * 60)
d2_content = "與 WB inline 串線DB→WB、時效快支援 Sn/Au 晶片\n最小可支援9mil晶粒\n支援EAP管控"
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, target_language, created_at
FROM dt_translation_cache
WHERE source_text = :text
ORDER BY id
"""), {'text': d2_content})
d2_records = result.fetchall()
if not d2_records:
print(f"❌ 沒有找到D2內容的翻譯記錄")
print(f" 查找內容: {repr(d2_content[:50])}...")
else:
print(f"✅ 找到 {len(d2_records)} 筆D2翻譯記錄:")
for record in d2_records:
print(f"\nROW {record[0]} ({record[3]}):")
print(f" 原文: {repr(record[1][:50])}...")
print(f" 翻譯: {repr(record[2][:50])}...")
print(f" 時間: {record[4]}")
# 4. 檢查最新的韓文快取總數
print(f"\n4. 檢查韓文快取總數")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT COUNT(*) as total,
MIN(id) as min_id,
MAX(id) as max_id,
MIN(created_at) as earliest,
MAX(created_at) as latest
FROM dt_translation_cache
WHERE target_language = 'ko'
"""))
stats = result.fetchone()
print(f"韓文快取統計:")
print(f" 總數: {stats[0]}")
print(f" ID範圍: {stats[1]} - {stats[2]}")
print(f" 時間範圍: {stats[3]} - {stats[4]}")
# 5. 比較原始DIFY翻譯 vs 手動補充翻譯
print(f"\n5. 比較原始DIFY翻譯 vs 手動補充翻譯")
print("-" * 60)
if d2_records:
if len(d2_records) == 1:
print("✅ 只有一筆D2翻譯記錄沒有重複")
else:
print(f"⚠️ 有 {len(d2_records)} 筆重複的D2翻譯記錄:")
for i, record in enumerate(d2_records, 1):
print(f"\n 記錄 {i} (ROW {record[0]}):")
print(f" 語言: {record[3]}")
print(f" 翻譯: {record[2][:100]}...")
print(f" 時間: {record[4]}")
# 判斷來源
if record[0] <= 300:
print(f" 來源: 🤖 原始DIFY翻譯")
else:
print(f" 來源: ✋ 手動補充翻譯")
# 6. 查看為什麼原始翻譯沒有生效
print(f"\n6. 分析翻譯映射問題")
print("-" * 60)
if d2_records:
original_record = min(d2_records, key=lambda x: x[0]) # 最早的記錄
print(f"原始翻譯記錄 (ROW {original_record[0]}):")
print(f" 是否為韓文: {original_record[3] == 'ko'}")
print(f" 翻譯內容長度: {len(original_record[2])}")
print(f" 翻譯內容: {repr(original_record[2])}")
if original_record[3] == 'ko' and original_record[2]:
print("✅ 原始翻譯記錄看起來正常")
print("❓ 問題可能在於翻譯映射邏輯沒有正確使用這個快取")
else:
print("❌ 原始翻譯記錄有問題")
print(f"\n" + "=" * 80)
print("原始快取記錄檢查完成!")
print("請查看上述分析找出真正的問題原因")
print("=" * 80)
if __name__ == "__main__":
check_original_cache_row291()

View File

@@ -1,80 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查等待處理的任務
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.job import TranslationJob
def check_pending_jobs():
"""檢查等待處理的任務狀態"""
app = create_app()
with app.app_context():
print("=== 檢查等待處理的任務 ===")
# 查找所有等待處理的任務
pending_jobs = TranslationJob.query.filter_by(status='PENDING').order_by(TranslationJob.created_at.desc()).all()
print(f"找到 {len(pending_jobs)} 個等待處理的任務:")
for job in pending_jobs:
print(f"\n任務ID: {job.job_uuid}")
print(f" 原始檔名: {job.original_filename}")
print(f" 目標語言: {job.target_languages}")
print(f" 創建時間: {job.created_at}")
print(f" 進度: {job.progress}%")
print(f" 狀態: {job.status}")
print(f" 用戶ID: {job.user_id}")
if job.error_message:
print(f" 錯誤信息: {job.error_message}")
# 檢查其他狀態的任務
print(f"\n=== 任務統計 ===")
all_jobs = TranslationJob.query.all()
status_counts = {}
for job in all_jobs:
status_counts[job.status] = status_counts.get(job.status, 0) + 1
for status, count in status_counts.items():
print(f"{status}: {count}")
# 檢查最新任務的詳細信息
if pending_jobs:
latest_job = pending_jobs[0]
print(f"\n=== 最新任務詳細信息 ===")
print(f"任務UUID: {latest_job.job_uuid}")
print(f"檔案路徑: {latest_job.file_path}")
print(f"目標語言: {latest_job.target_languages}")
# 檢查檔案是否存在
from pathlib import Path
if latest_job.file_path and Path(latest_job.file_path).exists():
file_size = Path(latest_job.file_path).stat().st_size
print(f"檔案存在: {latest_job.file_path} ({file_size:,} bytes)")
else:
print(f"檔案不存在: {latest_job.file_path}")
# 檢查原始檔案記錄
original_file = latest_job.get_original_file()
if original_file:
print(f"原始檔案記錄: {original_file.filename}")
print(f" 檔案大小: {original_file.file_size:,} bytes")
print(f" 檔案路徑: {original_file.file_path}")
if __name__ == "__main__":
check_pending_jobs()

View File

@@ -1,86 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查最近的任務狀態
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.job import TranslationJob
from pathlib import Path
def check_recent_jobs():
"""檢查最近的任務狀態"""
app = create_app()
with app.app_context():
print("=== 檢查所有任務狀態 ===")
# 查找所有任務,按創建時間排序
all_jobs = TranslationJob.query.order_by(TranslationJob.created_at.desc()).all()
for i, job in enumerate(all_jobs, 1):
print(f"\n【任務 {i}")
print(f" UUID: {job.job_uuid}")
print(f" 檔名: {job.original_filename}")
print(f" 狀態: {job.status}")
print(f" 進度: {job.progress}%")
print(f" 創建時間: {job.created_at}")
print(f" 目標語言: {job.target_languages}")
print(f" 總tokens: {job.total_tokens}")
print(f" 總成本: ${job.total_cost}")
if job.error_message:
print(f" ❌ 錯誤: {job.error_message}")
# 檢查翻譯檔案
if job.status == 'COMPLETED':
translated_files = job.get_translated_files()
print(f" 📁 翻譯檔案數: {len(translated_files)}")
for tf in translated_files:
file_path = Path(tf.file_path)
exists = "" if file_path.exists() else ""
size = file_path.stat().st_size if file_path.exists() else 0
print(f" {exists} {tf.filename} ({tf.language_code}) - {size:,} bytes")
# 檢查檔案內容是否真的有翻譯
if file_path.exists() and tf.filename.endswith('.docx'):
try:
from docx import Document
doc = Document(str(file_path))
paragraph_count = len([p for p in doc.paragraphs if p.text.strip()])
print(f" 段落數: {paragraph_count}")
# 顯示前幾段內容
sample_texts = []
for p in doc.paragraphs[:3]:
if p.text.strip():
sample_texts.append(p.text.strip()[:50])
if sample_texts:
print(f" 範例文字: {sample_texts[0]}...")
except Exception as e:
print(f" ⚠️ 無法讀取檔案: {e}")
# 檢查原始檔案
original_file = job.get_original_file()
if original_file:
orig_path = Path(original_file.file_path)
orig_exists = "" if orig_path.exists() else ""
orig_size = orig_path.stat().st_size if orig_path.exists() else 0
print(f" 📄 原始檔案: {orig_exists} {original_file.filename} - {orig_size:,} bytes")
if __name__ == "__main__":
check_recent_jobs()

View File

@@ -1,116 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查翻譯快取資料
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from sqlalchemy import text
def check_translation_cache():
"""檢查翻譯快取資料"""
app = create_app()
with app.app_context():
print("=== 檢查翻譯快取資料 ===")
# 總記錄數
total_result = db.session.execute(text("SELECT COUNT(*) FROM dt_translation_cache"))
total_count = total_result.scalar()
print(f"翻譯快取總記錄數: {total_count:,}")
# 按語言分組統計
lang_result = db.session.execute(text("""
SELECT target_language, COUNT(*)
FROM dt_translation_cache
GROUP BY target_language
ORDER BY COUNT(*) DESC
"""))
print(f"\n按語言分組:")
for row in lang_result.fetchall():
print(f" {row[0]}: {row[1]:,}")
# 最近的翻譯記錄
recent_result = db.session.execute(text("""
SELECT source_text, target_language, translated_text, created_at
FROM dt_translation_cache
ORDER BY created_at DESC
LIMIT 10
"""))
print(f"\n最近的10條翻譯記錄:")
for row in recent_result.fetchall():
source = row[0][:50] + "..." if len(row[0]) > 50 else row[0]
target = row[2][:50] + "..." if len(row[2]) > 50 else row[2]
print(f" [{row[1]}] {source} -> {target} ({row[3]})")
# 搜尋包含DOCX任務相關的翻譯
print(f"\n=== 搜尋DOCX任務相關翻譯 ===")
# 搜尋常見的中文詞彙
keywords = ["目的", "适用范围", "定义", "烤箱设备", "维护保养"]
for keyword in keywords:
search_result = db.session.execute(text("""
SELECT source_text, target_language, translated_text
FROM dt_translation_cache
WHERE source_text LIKE :keyword
ORDER BY created_at DESC
LIMIT 3
"""), {'keyword': f'%{keyword}%'})
results = search_result.fetchall()
if results:
print(f"\n包含'{keyword}'的翻譯:")
for row in results:
source = row[0][:60] + "..." if len(row[0]) > 60 else row[0]
target = row[2][:60] + "..." if len(row[2]) > 60 else row[2]
print(f" [{row[1]}] {source}")
print(f" -> {target}")
# 檢查英文翻譯品質
print(f"\n=== 檢查翻譯品質 ===")
en_sample_result = db.session.execute(text("""
SELECT source_text, translated_text
FROM dt_translation_cache
WHERE target_language = 'en'
AND CHAR_LENGTH(source_text) > 10
ORDER BY created_at DESC
LIMIT 5
"""))
print(f"英文翻譯範例:")
for row in en_sample_result.fetchall():
print(f" 原文: {row[0]}")
print(f" 譯文: {row[1]}")
# 檢查翻譯是否正確
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in row[1])
has_english = any(ord(c) < 128 and c.isalpha() for c in row[1])
if has_chinese and not has_english:
print(f" ❌ 翻譯失敗 - 譯文仍是中文")
elif has_english and not has_chinese:
print(f" ✅ 翻譯成功 - 譯文是英文")
elif has_chinese and has_english:
print(f" ⚠️ 混合語言 - 可能是交錯格式")
else:
print(f" ❓ 未知狀態")
print()
if __name__ == "__main__":
check_translation_cache()

View File

@@ -1,86 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查翻譯文件的實際內容
"""
import sys
import os
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.job import TranslationJob
def check_translation_content():
"""檢查翻譯文件的實際內容"""
app = create_app()
with app.app_context():
print("=== 檢查翻譯文件內容 ===")
# 檢查最近完成的任務
job = TranslationJob.query.filter_by(job_uuid='485e0fdc-75fb-4b5a-b44b-3531951200a1').first()
if not job:
print("任務不存在")
return
print(f"任務狀態: {job.status}")
translated_files = job.get_translated_files()
print(f"翻譯檔案數: {len(translated_files)}")
for tf in translated_files:
file_path = Path(tf.file_path)
print(f"\n【檔案】 {tf.filename}")
print(f"語言: {tf.language_code}")
print(f"路徑: {tf.file_path}")
print(f"存在: {file_path.exists()}")
if file_path.exists():
print(f"大小: {file_path.stat().st_size:,} bytes")
# 如果是 DOCX檢查內容
if tf.filename.endswith('.docx'):
try:
from docx import Document
doc = Document(str(file_path))
paragraphs = [p.text.strip() for p in doc.paragraphs if p.text.strip()]
print(f"段落數: {len(paragraphs)}")
if paragraphs:
print(f"第一段內容: {paragraphs[0][:150]}...")
# 檢查前幾段內容
sample_count = min(3, len(paragraphs))
for i in range(sample_count):
if i < len(paragraphs):
para = paragraphs[i]
print(f"段落 {i+1}: {para[:100]}...")
# 檢查是否包含交錯翻譯格式(原文+翻譯)
lines = para.split('\n')
if len(lines) > 1:
print(f" -> 多行內容,可能是交錯格式: {len(lines)}")
for j, line in enumerate(lines[:2]): # 只顯示前兩行
print(f"{j+1}: {line[:80]}...")
# 簡單檢查是否有英文或越南文內容
all_text = ' '.join(paragraphs[:5]) # 檢查前5段
has_latin = any(ord(c) < 128 and c.isalpha() for c in all_text)
print(f"包含拉丁字符(可能是翻譯): {has_latin}")
except Exception as e:
print(f"讀取DOCX錯誤: {e}")
print("-" * 50)
if __name__ == "__main__":
check_translation_content()

View File

@@ -1,180 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查文件翻譯問題
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import openpyxl
from docx import Document
import pymysql
from pathlib import Path
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
def check_excel_translation(file_path):
"""檢查Excel文件翻譯情況"""
print("\n" + "="*60)
print("檢查 Excel 文件翻譯")
print("="*60)
# 原始文件
original_file = Path(file_path) / "original_panjit_f0b78200.xlsx"
# 翻譯後文件(日文版)
translated_file = Path(file_path) / "original_panjit_f0b78200_ja_translated.xlsx"
if not original_file.exists():
print(f"原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"翻譯文件不存在: {translated_file}")
return
# 讀取原始文件
wb_original = openpyxl.load_workbook(original_file)
ws_original = wb_original.active
# 讀取翻譯文件
wb_translated = openpyxl.load_workbook(translated_file)
ws_translated = wb_translated.active
print(f"\n原始文件: {original_file.name}")
print(f"翻譯文件: {translated_file.name}")
# 檢查A1儲存格
print(f"\nA1 儲存格:")
print(f" 原始: '{ws_original['A1'].value}'")
print(f" 翻譯: '{ws_translated['A1'].value}'")
# 檢查前10行10列的內容
print("\n前10行10列的對比:")
for row in range(1, min(11, ws_original.max_row + 1)):
for col in range(1, min(11, ws_original.max_column + 1)):
cell_original = ws_original.cell(row=row, column=col)
cell_translated = ws_translated.cell(row=row, column=col)
if cell_original.value and cell_original.value != cell_translated.value:
print(f"\n [{openpyxl.utils.get_column_letter(col)}{row}]")
print(f" 原始: '{cell_original.value}'")
print(f" 翻譯: '{cell_translated.value}'")
wb_original.close()
wb_translated.close()
def check_docx_translation(file_path):
"""檢查DOCX文件翻譯情況"""
print("\n" + "="*60)
print("檢查 DOCX 文件翻譯")
print("="*60)
# 原始文件
original_file = Path(file_path) / "original_-OR026_49e95f53.docx"
# 翻譯後文件(英文版)
translated_file = Path(file_path) / "translated_original_-OR026_49e95f53_en_translat.docx"
if not original_file.exists():
print(f"原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"翻譯文件不存在: {translated_file}")
return
# 讀取原始文件
doc_original = Document(original_file)
doc_translated = Document(translated_file)
print(f"\n原始文件: {original_file.name}")
print(f"翻譯文件: {translated_file.name}")
# 搜索特定字串
target_strings = ["超温", "存放", "工务部"]
print("\n搜尋目標字串在原始文件中:")
for para_idx, para in enumerate(doc_original.paragraphs):
if any(target in para.text for target in target_strings):
print(f"\n段落 {para_idx}: {para.text[:100]}...")
for target in target_strings:
if target in para.text:
print(f" 找到 '{target}'")
print("\n搜尋目標字串在翻譯文件中:")
for para_idx, para in enumerate(doc_translated.paragraphs):
for target in target_strings:
if target in para.text:
print(f"\n段落 {para_idx}: {para.text[:100]}...")
print(f" 仍包含未翻譯的 '{target}'")
def check_translation_cache(job_uuid, target_strings):
"""檢查MySQL翻譯快取"""
print("\n" + "="*60)
print("檢查 MySQL 翻譯快取")
print("="*60)
# 連接資料庫
conn = pymysql.connect(
host='mysql.theaken.com',
port=33306,
user='A060',
password='WLeSCi0yhtc7',
database='db_A060',
charset='utf8mb4'
)
cursor = conn.cursor()
print(f"\n任務UUID: {job_uuid}")
print(f"搜尋字串: {target_strings}")
# 查詢翻譯快取
for target in target_strings:
sql = """
SELECT source_text, translated_text, source_language, target_language
FROM dt_translation_cache
WHERE source_text LIKE %s
"""
cursor.execute(sql, (f'%{target}%',))
results = cursor.fetchall()
if results:
print(f"\n找到包含 '{target}' 的翻譯記錄:")
for source, translated, src_lang, tgt_lang in results:
print(f" 原文: {source[:100]}...")
print(f" 譯文: {translated[:100]}...")
print(f" 語言: {src_lang} -> {tgt_lang}")
else:
print(f"\n未找到包含 '{target}' 的翻譯記錄")
cursor.close()
conn.close()
def main():
# Excel文件路徑
excel_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f0b78200-2c5e-41a4-bac8-1536f92529e9"
# DOCX文件路徑
docx_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\49e95f53-5092-47c0-8275-e19c8c99e5ac"
# 檢查Excel
check_excel_translation(excel_path)
# 檢查DOCX
check_docx_translation(docx_path)
# 檢查DOCX的翻譯快取
print("\n" + "="*60)
print("查詢 DOCX 翻譯快取")
check_translation_cache("49e95f53-5092-47c0-8275-e19c8c99e5ac", ["超温", "存放", "工务部"])
# 檢查Excel的翻譯快取
print("\n" + "="*60)
print("查詢 Excel 翻譯快取")
check_translation_cache("f0b78200-2c5e-41a4-bac8-1536f92529e9", ["产品型号"])
if __name__ == "__main__":
main()

View File

@@ -1,116 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查並清理重複用戶記錄
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.models.user import User
from app.models.job import TranslationJob
from app.models.stats import APIUsageStats
def check_and_clean_users():
"""檢查並清理重複用戶記錄"""
app = create_app()
with app.app_context():
from app import db
print("=== 檢查用戶記錄 ===")
# 查看所有用戶
users = User.query.order_by(User.id).all()
for user in users:
print(f"用戶 ID: {user.id}")
print(f" 用戶名: {user.username}")
print(f" 顯示名: {user.display_name}")
print(f" 郵箱: {user.email}")
print(f" 是否管理員: {user.is_admin}")
print(f" 最後登入: {user.last_login}")
print(f" 創建時間: {user.created_at}")
# 檢查關聯記錄
job_count = user.translation_jobs.count()
print(f" 翻譯任務數: {job_count}")
try:
api_stats_count = db.session.query(APIUsageStats).filter_by(user_id=user.id).count()
print(f" API統計記錄數: {api_stats_count}")
except:
print(f" API統計記錄數: 查詢失敗")
print()
# 尋找重複用戶名
duplicate_usernames = db.session.query(User.username).group_by(User.username).having(db.func.count(User.id) > 1).all()
if duplicate_usernames:
print("=== 發現重複用戶名 ===")
for (username,) in duplicate_usernames:
print(f"重複用戶名: {username}")
dup_users = User.query.filter_by(username=username).order_by(User.id).all()
for i, user in enumerate(dup_users):
print(f" [{i+1}] ID: {user.id}, 創建時間: {user.created_at}, 管理員: {user.is_admin}")
print(f" 任務數: {user.translation_jobs.count()}")
# 檢查是否有ID=1和ID=2的用戶且共享相同郵箱
user_id_1 = User.query.get(1)
user_id_2 = User.query.get(2)
if user_id_1 and user_id_2 and user_id_1.email == user_id_2.email:
print("=== 發現重複用戶(相同郵箱) ===")
print(f"ID=1: {user_id_1.username} ({user_id_1.email})")
print(f"ID=2: {user_id_2.username} ({user_id_2.email})")
print("準備刪除 ID=1 並將記錄轉移到 ID=2...")
# 檢查關聯記錄
jobs = user_id_1.translation_jobs.all()
if jobs:
print(f"轉移 {len(jobs)} 個翻譯任務到 ID=2")
for job in jobs:
job.user_id = 2
# 轉移API統計記錄
api_stats = db.session.query(APIUsageStats).filter_by(user_id=1).all()
if api_stats:
print(f"轉移 {len(api_stats)} 個API統計記錄到用戶 ID=2")
for stat in api_stats:
stat.user_id = 2
# 提交轉移
db.session.commit()
print("✅ 記錄轉移完成")
# 刪除用戶記錄
try:
db.session.delete(user_id_1)
db.session.commit()
print("✅ ID=1 用戶記錄已成功刪除")
except Exception as e:
print(f"❌ 刪除用戶記錄失敗: {e}")
db.session.rollback()
elif user_id_1:
print("=== ID=1 用戶存在但沒有找到相同郵箱的ID=2用戶 ===")
print("暫不刪除")
print("\n=== 清理完成後的用戶狀態 ===")
users = User.query.order_by(User.id).all()
for user in users:
print(f"ID: {user.id}, 用戶名: {user.username}, 管理員: {user.is_admin}, 任務數: {user.translation_jobs.count()}")
if __name__ == "__main__":
check_and_clean_users()

View File

@@ -1,13 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import sys, os
sys.path.insert(0, os.path.join(os.getcwd(), 'app'))
from app import create_app
from app.models import User
app = create_app()
with app.app_context():
users = User.query.all()
print(f'總用戶數: {len(users)}')
for user in users:
print(f'ID: {user.id}, 用戶名: {user.username}, Email: {user.email}')

View File

@@ -1,204 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
直接創建資料表腳本
"""
import pymysql
def create_tables():
"""創建所有需要的資料表"""
# 資料表 DDL
tables = {
'dt_users': '''
CREATE TABLE IF NOT EXISTS dt_users (
id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(100) NOT NULL UNIQUE COMMENT 'AD帳號',
display_name VARCHAR(200) NOT NULL COMMENT '顯示名稱',
email VARCHAR(255) NOT NULL COMMENT '電子郵件',
department VARCHAR(100) COMMENT '部門',
is_admin BOOLEAN DEFAULT FALSE COMMENT '是否為管理員',
last_login TIMESTAMP NULL COMMENT '最後登入時間',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_username (username),
INDEX idx_email (email)
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_translation_jobs': '''
CREATE TABLE IF NOT EXISTS dt_translation_jobs (
id INT PRIMARY KEY AUTO_INCREMENT,
job_uuid VARCHAR(36) NOT NULL UNIQUE COMMENT '任務唯一識別碼',
user_id INT NOT NULL COMMENT '使用者ID',
original_filename VARCHAR(500) NOT NULL COMMENT '原始檔名',
file_extension VARCHAR(10) NOT NULL COMMENT '檔案副檔名',
file_size BIGINT NOT NULL COMMENT '檔案大小(bytes)',
file_path VARCHAR(1000) NOT NULL COMMENT '檔案路徑',
source_language VARCHAR(50) DEFAULT NULL COMMENT '來源語言',
target_languages JSON NOT NULL COMMENT '目標語言陣列',
status ENUM('PENDING', 'PROCESSING', 'COMPLETED', 'FAILED', 'RETRY') DEFAULT 'PENDING',
progress DECIMAL(5,2) DEFAULT 0.00 COMMENT '處理進度(%)',
retry_count INT DEFAULT 0 COMMENT '重試次數',
error_message TEXT NULL COMMENT '錯誤訊息',
total_tokens INT DEFAULT 0 COMMENT '總token數',
total_cost DECIMAL(10,4) DEFAULT 0.0000 COMMENT '總成本',
processing_started_at TIMESTAMP NULL COMMENT '開始處理時間',
completed_at TIMESTAMP NULL COMMENT '完成時間',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_user_id (user_id),
INDEX idx_job_uuid (job_uuid),
INDEX idx_status (status),
INDEX idx_created_at (created_at),
FOREIGN KEY (user_id) REFERENCES dt_users(id) ON DELETE CASCADE
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_job_files': '''
CREATE TABLE IF NOT EXISTS dt_job_files (
id INT PRIMARY KEY AUTO_INCREMENT,
job_id INT NOT NULL COMMENT '任務ID',
file_type ENUM('ORIGINAL', 'TRANSLATED') NOT NULL COMMENT '檔案類型',
language_code VARCHAR(50) NULL COMMENT '語言代碼(翻譯檔案)',
filename VARCHAR(500) NOT NULL COMMENT '檔案名稱',
file_path VARCHAR(1000) NOT NULL COMMENT '檔案路徑',
file_size BIGINT NOT NULL COMMENT '檔案大小',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_job_id (job_id),
INDEX idx_file_type (file_type),
FOREIGN KEY (job_id) REFERENCES dt_translation_jobs(id) ON DELETE CASCADE
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_translation_cache': '''
CREATE TABLE IF NOT EXISTS dt_translation_cache (
id INT PRIMARY KEY AUTO_INCREMENT,
source_text_hash VARCHAR(64) NOT NULL COMMENT '來源文字hash',
source_language VARCHAR(50) NOT NULL COMMENT '來源語言',
target_language VARCHAR(50) NOT NULL COMMENT '目標語言',
source_text TEXT NOT NULL COMMENT '來源文字',
translated_text TEXT NOT NULL COMMENT '翻譯文字',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE KEY uk_cache (source_text_hash, source_language, target_language),
INDEX idx_languages (source_language, target_language)
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_api_usage_stats': '''
CREATE TABLE IF NOT EXISTS dt_api_usage_stats (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL COMMENT '使用者ID',
job_id INT NULL COMMENT '任務ID',
api_endpoint VARCHAR(200) NOT NULL COMMENT 'API端點',
prompt_tokens INT DEFAULT 0 COMMENT 'Prompt token數',
completion_tokens INT DEFAULT 0 COMMENT 'Completion token數',
total_tokens INT DEFAULT 0 COMMENT '總token數',
prompt_unit_price DECIMAL(10,8) DEFAULT 0.00000000 COMMENT '單價',
prompt_price_unit VARCHAR(20) DEFAULT 'USD' COMMENT '價格單位',
cost DECIMAL(10,4) DEFAULT 0.0000 COMMENT '成本',
response_time_ms INT DEFAULT 0 COMMENT '回應時間(毫秒)',
success BOOLEAN DEFAULT TRUE COMMENT '是否成功',
error_message TEXT NULL COMMENT '錯誤訊息',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_user_id (user_id),
INDEX idx_job_id (job_id),
INDEX idx_created_at (created_at),
FOREIGN KEY (user_id) REFERENCES dt_users(id) ON DELETE CASCADE,
FOREIGN KEY (job_id) REFERENCES dt_translation_jobs(id) ON DELETE SET NULL
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_system_logs': '''
CREATE TABLE IF NOT EXISTS dt_system_logs (
id INT PRIMARY KEY AUTO_INCREMENT,
level ENUM('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') NOT NULL,
module VARCHAR(100) NOT NULL COMMENT '模組名稱',
user_id INT NULL COMMENT '使用者ID',
job_id INT NULL COMMENT '任務ID',
message TEXT NOT NULL COMMENT '日誌訊息',
extra_data JSON NULL COMMENT '額外資料',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_level (level),
INDEX idx_module (module),
INDEX idx_user_id (user_id),
INDEX idx_created_at (created_at),
FOREIGN KEY (user_id) REFERENCES dt_users(id) ON DELETE SET NULL,
FOREIGN KEY (job_id) REFERENCES dt_translation_jobs(id) ON DELETE SET NULL
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
''',
'dt_notifications': '''
CREATE TABLE IF NOT EXISTS dt_notifications (
id INT PRIMARY KEY AUTO_INCREMENT,
notification_uuid VARCHAR(36) NOT NULL UNIQUE COMMENT '通知唯一識別碼',
user_id INT NOT NULL COMMENT '使用者ID',
type VARCHAR(20) NOT NULL DEFAULT 'info' COMMENT '通知類型',
title VARCHAR(255) NOT NULL COMMENT '通知標題',
message TEXT NOT NULL COMMENT '通知內容',
job_uuid VARCHAR(36) NULL COMMENT '關聯任務UUID',
link VARCHAR(500) NULL COMMENT '相關連結',
is_read BOOLEAN DEFAULT FALSE NOT NULL COMMENT '是否已讀',
read_at TIMESTAMP NULL COMMENT '閱讀時間',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL COMMENT '建立時間',
expires_at TIMESTAMP NULL COMMENT '過期時間',
extra_data JSON NULL COMMENT '額外數據',
INDEX idx_notification_uuid (notification_uuid),
INDEX idx_user_id (user_id),
INDEX idx_job_uuid (job_uuid),
INDEX idx_is_read (is_read),
INDEX idx_created_at (created_at),
FOREIGN KEY (user_id) REFERENCES dt_users(id) ON DELETE CASCADE,
FOREIGN KEY (job_uuid) REFERENCES dt_translation_jobs(job_uuid) ON DELETE SET NULL
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
'''
}
try:
# 建立資料庫連線
connection = pymysql.connect(
host='mysql.theaken.com',
port=33306,
user='A060',
password='WLeSCi0yhtc7',
database='db_A060',
charset='utf8mb4'
)
cursor = connection.cursor()
print("Creating database tables...")
# 依序創建表格
for table_name, sql in tables.items():
print(f"Creating table: {table_name}")
cursor.execute(sql)
print(f" - {table_name} created successfully")
# 創建預設管理員用戶
print("\nCreating default admin user...")
admin_sql = '''
INSERT IGNORE INTO dt_users (username, display_name, email, department, is_admin)
VALUES ('ymirliu', 'ymirliu', 'ymirliu@panjit.com.tw', 'IT', TRUE)
'''
cursor.execute(admin_sql)
if cursor.rowcount > 0:
print(" - Default admin user created")
else:
print(" - Default admin user already exists")
# 提交更改
connection.commit()
connection.close()
print("\n=== Database initialization completed ===")
return True
except Exception as e:
print(f"Database initialization failed: {e}")
return False
if __name__ == '__main__':
create_tables()

View File

@@ -1,47 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
創建測試數據
"""
from app import create_app, db
from app.models import Notification, NotificationType
from datetime import datetime
def create_test_notification():
"""創建測試通知"""
try:
app = create_app()
with app.app_context():
print("Creating test notification...")
# 創建測試通知
test_notification = Notification(
user_id=4, # ymirliu 用戶
type=NotificationType.INFO.value,
title='測試通知',
message='這是一個測試通知,用來驗證通知系統是否正常工作。',
extra_data={
'test_data': True,
'created_by': 'test_script'
}
)
db.session.add(test_notification)
db.session.commit()
print(f"Test notification created: {test_notification.notification_uuid}")
print(f"Total notifications in database: {Notification.query.count()}")
# 顯示所有通知
notifications = Notification.query.all()
for notification in notifications:
print(f" - {notification.title} ({notification.type})")
except Exception as e:
print(f"Error creating test notification: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
create_test_notification()

View File

@@ -1,213 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
監控實際的DOCX翻譯插入過程
"""
import sys
import os
import tempfile
import shutil
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.services.document_processor import DocumentProcessor, _insert_docx_translations
from sqlalchemy import text as sql_text
def debug_actual_insertion():
"""監控實際的DOCX翻譯插入過程"""
app = create_app()
with app.app_context():
print("=== 監控實際的DOCX翻譯插入過程 ===")
# 原始文件
original_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\9c6548ac-2f59-45f4-aade-0a9b3895bbfd\original_-OR026_9c6548ac.docx"
# 創建測試副本
test_dir = Path(tempfile.gettempdir()) / "debug_insertion"
test_dir.mkdir(exist_ok=True)
test_path = test_dir / "debug_original.docx"
output_path = test_dir / "debug_translated.docx"
shutil.copy2(original_path, test_path)
print(f"✅ 創建測試副本: {test_path}")
# 創建處理器
processor = DocumentProcessor()
# 提取段落
segments = processor.extract_docx_segments(str(test_path))
print(f"📄 提取到 {len(segments)} 個段落")
# 構建翻譯映射只取前5個段落進行詳細調試
target_language = 'en'
translation_map = {}
debug_segments = segments[:5] # 只調試前5個段落
print(f"\n🔍 構建前5個段落的翻譯映射:")
for i, seg in enumerate(debug_segments):
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': seg.text, 'lang': target_language})
row = result.fetchone()
if row and row[0]:
translation_map[(target_language, seg.text)] = row[0]
print(f" 段落 {i+1}: ✅ 有翻譯")
print(f" 原文: {seg.text[:50]}...")
print(f" 譯文: {row[0][:50]}...")
else:
print(f" 段落 {i+1}: ❌ 無翻譯 - {seg.text[:50]}...")
print(f"\n翻譯映射總數: {len(translation_map)}")
# 載入文檔並檢查插入前狀態
try:
from docx import Document
doc = Document(str(test_path))
print(f"\n📊 插入前文檔狀態:")
print(f"總段落數: {len(doc.paragraphs)}")
# 創建詳細的日誌函數
insertion_logs = []
def detailed_log(msg: str):
print(f"[LOG] {msg}")
insertion_logs.append(msg)
# 執行插入只處理前5個段落
print(f"\n🔄 開始執行翻譯插入...")
ok_count, skip_count = _insert_docx_translations(
doc, debug_segments, translation_map, [target_language], detailed_log
)
print(f"\n插入結果: 成功 {ok_count}, 跳過 {skip_count}")
# 檢查插入後的文檔狀態
print(f"\n📊 插入後文檔狀態:")
print(f"總段落數: {len(doc.paragraphs)}")
# 詳細檢查前20個段落
insertion_found = 0
marker_found = 0
for i, para in enumerate(doc.paragraphs[:20]):
text = para.text.strip()
if not text:
continue
# 檢查是否有翻譯標記
has_marker = any('\u200b' in (r.text or '') for r in para.runs)
# 語言檢測
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in text)
has_english = any(ord(c) < 128 and c.isalpha() and c not in 'PANJIT' for c in text)
if has_marker:
marker_found += 1
lang_status = "🏷️ 翻譯標記"
elif has_english and not has_chinese:
insertion_found += 1
lang_status = "🇺🇸 純英文"
elif has_chinese and has_english:
lang_status = "🔄 中英混合"
elif has_chinese:
lang_status = "🇨🇳 純中文"
else:
lang_status = "❓ 其他"
print(f" 段落 {i+1:2d}: {lang_status} - {text[:60]}...")
print(f"\n發現的插入內容:")
print(f" 純英文段落: {insertion_found}")
print(f" 帶翻譯標記的段落: {marker_found}")
# 保存文檔
doc.save(str(output_path))
print(f"\n✅ 文檔已保存至: {output_path}")
# 重新讀取並驗證
doc2 = Document(str(output_path))
print(f"\n📊 保存後重新讀取驗證:")
print(f"總段落數: {len(doc2.paragraphs)}")
saved_insertion_found = 0
saved_marker_found = 0
for i, para in enumerate(doc2.paragraphs[:20]):
text = para.text.strip()
if not text:
continue
has_marker = any('\u200b' in (r.text or '') for r in para.runs)
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in text)
has_english = any(ord(c) < 128 and c.isalpha() and c not in 'PANJIT' for c in text)
if has_marker:
saved_marker_found += 1
elif has_english and not has_chinese:
saved_insertion_found += 1
print(f"保存後發現的插入內容:")
print(f" 純英文段落: {saved_insertion_found}")
print(f" 帶翻譯標記的段落: {saved_marker_found}")
# 診斷結果
if ok_count > 0 and saved_insertion_found == 0 and saved_marker_found == 0:
print(f"\n🚨 關鍵問題發現:")
print(f" - 插入函數報告成功插入 {ok_count} 個翻譯")
print(f" - 但保存後的文檔中沒有發現任何翻譯內容或標記")
print(f" - 問題可能在於:")
print(f" 1. _append_after函數實際沒有插入")
print(f" 2. 插入位置不正確")
print(f" 3. 文檔保存過程有問題")
elif ok_count > 0 and (saved_insertion_found > 0 or saved_marker_found > 0):
print(f"\n✅ 插入成功!")
print(f" - 插入函數報告: {ok_count} 個翻譯")
print(f" - 保存後確認: {saved_insertion_found + saved_marker_found} 個翻譯段落")
else:
print(f"\n⚠️ 無翻譯插入(可能都被跳過)")
# 打印插入日誌摘要
print(f"\n📝 插入日誌摘要:")
success_logs = [log for log in insertion_logs if '[SUCCESS]' in log]
skip_logs = [log for log in insertion_logs if '[SKIP]' in log]
error_logs = [log for log in insertion_logs if '[ERROR]' in log]
print(f" 成功日誌: {len(success_logs)}")
print(f" 跳過日誌: {len(skip_logs)}")
print(f" 錯誤日誌: {len(error_logs)}")
if success_logs:
print(f" 前3條成功日誌:")
for log in success_logs[:3]:
print(f" {log}")
if error_logs:
print(f" 錯誤日誌:")
for log in error_logs:
print(f" {log}")
except Exception as e:
print(f"❌ 調試失敗: {e}")
if __name__ == "__main__":
debug_actual_insertion()

View File

@@ -1,263 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
深度診斷"超温"翻譯問題
檢查從提取到插入的完整流程
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import docx
from docx.table import Table
from app import create_app
from app.services.document_processor import should_translate
from app.services.translation_service import TranslationService
def debug_chaoweng_extraction():
"""檢查"超温"在文件提取階段是否被正確識別"""
print("=" * 80)
print("診斷步驟1: 檢查文件提取階段")
print("=" * 80)
app = create_app()
with app.app_context():
from app.services.document_processor import DocumentProcessor
# 檔案路徑
base_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\17e05695-406f-47af-96eb-a0e23843770e")
original_file = base_dir / "original_-OR026_17e05695.docx"
if not original_file.exists():
print(f"❌ 原始檔案不存在")
return
processor = DocumentProcessor()
# 提取所有segment
segments = processor.extract_docx_segments(str(original_file))
print(f"提取到 {len(segments)} 個segments")
# 找包含"超温"的segments
chaoweng_segments = []
for i, seg in enumerate(segments):
if "超温" in seg.text:
chaoweng_segments.append((i, seg))
print(f"\nSegment {i}:")
print(f" 種類: {seg.kind}")
print(f" 上下文: {seg.ctx}")
print(f" 內容: {repr(seg.text)}")
print(f" 長度: {len(seg.text.strip())}")
# 檢查是否應該翻譯
should_trans = should_translate(seg.text, 'zh')
print(f" should_translate: {should_trans}")
if seg.kind == "table_cell":
print(f" 🎯 這是表格儲存格segment")
else:
print(f" ⚠️ 不是表格儲存格類型")
if not chaoweng_segments:
print("❌ 沒有找到包含'超温'的segments")
else:
print(f"✅ 找到 {len(chaoweng_segments)} 個包含'超温'的segments")
return chaoweng_segments
def debug_chaoweng_translation(chaoweng_segments):
"""檢查"超温"在翻譯階段是否被正確處理"""
print(f"\n" + "=" * 80)
print("診斷步驟2: 檢查翻譯階段")
print("=" * 80)
if not chaoweng_segments:
print("❌ 沒有segments可以測試翻譯")
return
app = create_app()
with app.app_context():
service = TranslationService()
for seg_idx, seg in chaoweng_segments:
print(f"\n測試 Segment {seg_idx} 的翻譯:")
print(f"原文: {repr(seg.text)}")
try:
if seg.kind == "table_cell":
print("使用 translate_word_table_cell() 方法")
translated = service.translate_word_table_cell(
text=seg.text,
source_language="zh",
target_language="en",
user_id=None
)
else:
print("使用 translate_segment_with_sentences() 方法")
translated = service.translate_segment_with_sentences(
text=seg.text,
source_language="zh",
target_language="en",
user_id=None
)
print(f"翻譯結果: {repr(translated[:100])}...")
# 檢查翻譯是否成功
if "【翻譯失敗" in translated:
print("❌ 翻譯失敗")
elif translated == seg.text:
print("❌ 翻譯結果與原文相同,可能未翻譯")
else:
print("✅ 翻譯成功")
except Exception as e:
print(f"❌ 翻譯過程發生錯誤: {e}")
def debug_chaoweng_cache():
"""檢查"超温"的翻譯快取狀況"""
print(f"\n" + "=" * 80)
print("診斷步驟3: 檢查翻譯快取")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 1. 搜尋精確匹配"超温"
print("1. 搜尋精確的'超温'記錄:")
exact_results = db.session.execute(sql_text("""
SELECT id, source_text, target_language, translated_text, created_at
FROM dt_translation_cache
WHERE source_text = '超温'
ORDER BY created_at DESC
""")).fetchall()
if exact_results:
for row in exact_results:
print(f" ROW {row[0]}: '{row[1]}' -> {row[2]} -> '{row[3]}'")
else:
print(" ❌ 沒有找到精確的'超温'記錄")
# 2. 搜尋包含"超温"但可能有額外字符的記錄
print(f"\n2. 搜尋包含'超温'的記錄:")
like_results = db.session.execute(sql_text("""
SELECT id, source_text, target_language, translated_text, created_at
FROM dt_translation_cache
WHERE source_text LIKE '%超温%'
AND CHAR_LENGTH(source_text) <= 10
ORDER BY created_at DESC
LIMIT 10
""")).fetchall()
if like_results:
for row in like_results:
print(f" ROW {row[0]}: '{row[1]}' -> {row[2]} -> '{row[3][:30]}...'")
else:
print(" ❌ 沒有找到包含'超温'的短記錄")
def debug_chaoweng_insertion():
"""檢查"超温"的翻譯插入狀況"""
print(f"\n" + "=" * 80)
print("診斷步驟4: 檢查已翻譯文件的插入狀況")
print("=" * 80)
# 檢查翻譯後的文件
base_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\17e05695-406f-47af-96eb-a0e23843770e")
translated_files = [
("英文", base_dir / "translated_original_-OR026_17e05695_en_translat.docx"),
("越南文", base_dir / "translated_original_-OR026_17e05695_vi_translat.docx")
]
for lang, file_path in translated_files:
if not file_path.exists():
print(f"{lang}翻譯檔案不存在")
continue
print(f"\n檢查{lang}翻譯檔案:")
try:
doc = docx.Document(str(file_path))
found_chaoweng = False
found_translation = False
for table_idx, table in enumerate(doc.tables):
for row_idx, row in enumerate(table.rows):
for cell_idx, cell in enumerate(row.cells):
cell_text = cell.text.strip()
if "超温" in cell_text:
found_chaoweng = True
print(f" 🔍 表格{table_idx+1}{row_idx+1}{cell_idx+1}:")
print(f" 內容: {repr(cell_text[:100])}")
# 檢查該儲存格的段落結構
print(f" 段落數: {len(cell.paragraphs)}")
for p_idx, para in enumerate(cell.paragraphs):
p_text = para.text.strip()
if p_text:
print(f" 段落{p_idx+1}: {repr(p_text)}")
# 檢查是否有英文翻譯跡象
if lang == "英文" and any(word in p_text.lower() for word in ['over', 'heat', 'temp', 'hot']):
found_translation = True
print(f" 🎯 可能的英文翻譯")
elif lang == "越南文" and any(word in p_text.lower() for word in ['quá', 'nóng', 'nhiệt']):
found_translation = True
print(f" 🎯 可能的越南文翻譯")
print(f" 原文'超温': {'✅ 找到' if found_chaoweng else '❌ 未找到'}")
print(f" {lang}翻譯: {'✅ 找到' if found_translation else '❌ 未找到'}")
except Exception as e:
print(f"❌ 讀取{lang}翻譯檔案失敗: {e}")
def main():
"""主診斷函數"""
print("🔍 深度診斷'超温'翻譯問題")
print("檢查完整的提取->翻譯->插入流程")
try:
# 步驟1: 檢查文件提取
chaoweng_segments = debug_chaoweng_extraction()
# 步驟2: 檢查翻譯邏輯
debug_chaoweng_translation(chaoweng_segments)
# 步驟3: 檢查翻譯快取
debug_chaoweng_cache()
# 步驟4: 檢查插入結果
debug_chaoweng_insertion()
print(f"\n" + "=" * 80)
print("診斷完成!")
print("可能的問題:")
print("1. 提取階段: segments沒有正確提取'超温'")
print("2. 翻譯階段: 翻譯邏輯沒有處理該segment")
print("3. 快取階段: 翻譯沒有正確存儲")
print("4. 插入階段: 翻譯沒有正確插入到文件")
print("=" * 80)
except Exception as e:
print(f"❌ 診斷過程發生錯誤: {e}")
import traceback
print(f"錯誤詳情: {traceback.format_exc()}")
if __name__ == "__main__":
main()

View File

@@ -1,153 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試DOCX翻譯插入的實際執行路徑
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.services.translation_service import DocxParser
from sqlalchemy import text
def debug_docx_insertion_path():
"""調試DOCX翻譯插入的實際執行路徑"""
app = create_app()
with app.app_context():
print("=== 調試DOCX翻譯插入的實際執行路徑 ===")
# 使用現有的DOCX文件
original_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\9c6548ac-2f59-45f4-aade-0a9b3895bbfd\original_-OR026_9c6548ac.docx"
# 創建解析器
parser = DocxParser(original_path)
# 提取段落資訊
segments = parser.extract_segments_with_context()
print(f"文檔總段落數: {len(segments)}")
# 分析段落類型
table_segments = 0
normal_segments = 0
sdt_segments = 0
other_segments = 0
print(f"\n📊 段落類型分析:")
for i, seg in enumerate(segments[:20]): # 檢查前20個段落
if seg.kind == "para":
# 檢查是否在表格中
from docx.table import _Cell
from docx.text.paragraph import Paragraph
if isinstance(seg.ref, Paragraph):
p = seg.ref
if isinstance(p._parent, _Cell):
table_segments += 1
segment_type = "🏢 表格段落"
else:
normal_segments += 1
segment_type = "📄 普通段落"
elif hasattr(seg.ref, 'tag') and seg.ref.tag.endswith('}sdt'):
sdt_segments += 1
segment_type = "📋 SDT段落"
else:
other_segments += 1
segment_type = f"❓ 其他段落 ({type(seg.ref)})"
else:
other_segments += 1
segment_type = f"🔧 非段落 ({seg.kind})"
print(f" 段落 {i+1:2d}: {segment_type} - {seg.text[:50]}...")
print(f"\n統計結果 (前20個段落):")
print(f" 表格段落: {table_segments}")
print(f" 普通段落: {normal_segments}")
print(f" SDT段落: {sdt_segments}")
print(f" 其他類型: {other_segments}")
# 檢查有翻譯的段落會走哪個路徑
print(f"\n🔍 檢查有翻譯的段落執行路徑:")
path_stats = {
"table": 0,
"normal": 0,
"sdt": 0,
"other": 0,
"skipped": 0
}
for i, seg in enumerate(segments[:10]): # 檢查前10個段落
if seg.kind == "para":
# 查找翻譯
result = db.session.execute(text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'en'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': seg.text})
row = result.fetchone()
has_translation = row and row[0]
if has_translation:
# 判斷執行路徑
if isinstance(seg.ref, Paragraph):
p = seg.ref
if isinstance(p._parent, _Cell):
path = "table"
path_name = "🏢 表格路徑"
else:
path = "normal"
path_name = "📄 普通段落路徑"
elif hasattr(seg.ref, 'tag') and seg.ref.tag.endswith('}sdt'):
path = "sdt"
path_name = "📋 SDT路徑"
else:
path = "other"
path_name = "❓ 其他路徑"
path_stats[path] += 1
print(f" 段落 {i+1:2d}: {path_name} ✅ 有翻譯")
print(f" 原文: {seg.text[:50]}...")
print(f" 譯文: {row[0][:50]}...")
else:
path_stats["skipped"] += 1
print(f" 段落 {i+1:2d}: ❌ 無翻譯 - {seg.text[:30]}...")
print(f"\n📈 執行路徑統計:")
print(f" 表格路徑: {path_stats['table']} 段落")
print(f" 普通段落路徑: {path_stats['normal']} 段落")
print(f" SDT路徑: {path_stats['sdt']} 段落")
print(f" 其他路徑: {path_stats['other']} 段落")
print(f" 跳過(無翻譯): {path_stats['skipped']} 段落")
# 重點分析:大多數段落走的是哪個路徑?
total_with_translation = sum(path_stats[k] for k in ['table', 'normal', 'sdt', 'other'])
if total_with_translation > 0:
print(f"\n💡 關鍵分析:")
if path_stats['table'] > path_stats['normal']:
print(f" ⚠️ 大多數段落走表格路徑 ({path_stats['table']}/{total_with_translation})")
print(f" 可能問題: 表格插入邏輯有問題")
elif path_stats['normal'] > path_stats['table']:
print(f" ✅ 大多數段落走普通段落路徑 ({path_stats['normal']}/{total_with_translation})")
print(f" 可能問題: 普通段落插入邏輯有問題")
else:
print(f" 📊 表格和普通段落路徑數量相當")
if __name__ == "__main__":
debug_docx_insertion_path()

View File

@@ -1,193 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試DOCX翻譯流程 - 詳細檢查翻譯映射和插入過程
"""
import sys
import os
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.models.job import TranslationJob
from app.services.translation_service import DocxParser
from sqlalchemy import text
def debug_docx_translation():
"""調試DOCX翻譯流程"""
app = create_app()
with app.app_context():
print("=== 調試DOCX翻譯流程 ===")
# 檢查指定的DOCX任務
job_uuid = "9c6548ac-2f59-45f4-aade-0a9b3895bbfd"
job = TranslationJob.query.filter_by(job_uuid=job_uuid).first()
if not job:
print(f"任務不存在: {job_uuid}")
return
print(f"任務狀態: {job.status}")
print(f"總tokens: {job.total_tokens:,}")
print(f"總成本: ${job.total_cost}")
print(f"目標語言: {job.target_languages}")
# 取得原始文件
original_file = job.get_original_file()
if not original_file:
print("找不到原始文件")
return
original_path = Path(original_file.file_path)
print(f"\n📄 原始文件: {original_path}")
print(f"存在: {original_path.exists()}")
if not original_path.exists():
print("原始文件不存在,無法調試")
return
# 創建DOCX解析器
parser = DocxParser(str(original_path))
# 1. 檢查文本段落提取
print(f"\n🔍 步驟1: 提取文本段落")
try:
text_segments = parser.extract_text_segments()
print(f"提取到 {len(text_segments)} 個文本段落:")
for i, seg in enumerate(text_segments[:5]): # 顯示前5段
print(f" 段落 {i+1}: {seg[:60]}...")
except Exception as e:
print(f"❌ 文本段落提取失敗: {e}")
return
# 2. 檢查帶上下文的段落提取
print(f"\n🔍 步驟2: 提取帶上下文的段落")
try:
segments_with_context = parser.extract_segments_with_context()
print(f"提取到 {len(segments_with_context)} 個段落(含上下文):")
for i, seg in enumerate(segments_with_context[:3]): # 顯示前3段
print(f" 段落 {i+1}: {seg.kind} | {seg.text[:50]}... | {seg.ctx}")
except Exception as e:
print(f"❌ 帶上下文段落提取失敗: {e}")
return
# 3. 檢查翻譯結果 - 從快取讀取
print(f"\n🔍 步驟3: 檢查翻譯快取中的結果")
# 讀取英文翻譯
en_result = db.session.execute(text("""
SELECT source_text, translated_text
FROM dt_translation_cache
WHERE target_language = 'en'
ORDER BY created_at DESC
LIMIT 10
"""))
en_translations = {}
en_list = []
for row in en_result.fetchall():
en_translations[row[0]] = row[1]
en_list.append(row[1])
# 讀取越南文翻譯
vi_result = db.session.execute(text("""
SELECT source_text, translated_text
FROM dt_translation_cache
WHERE target_language = 'vi'
ORDER BY created_at DESC
LIMIT 10
"""))
vi_translations = {}
vi_list = []
for row in vi_result.fetchall():
vi_translations[row[0]] = row[1]
vi_list.append(row[1])
translations = {'en': en_list, 'vi': vi_list}
print(f"從快取讀取翻譯: en={len(en_list)}, vi={len(vi_list)}")
# 4. 檢查翻譯映射構建 - 使用快取資料
print(f"\n🔍 步驟4: 檢查翻譯映射構建")
target_language = 'en' # 檢查英文翻譯
translation_map = {}
# 建立基於快取的翻譯映射
for seg in segments_with_context:
# 檢查此段落是否在快取中有英文翻譯
if seg.text in en_translations:
key = (target_language, seg.text)
value = en_translations[seg.text]
translation_map[key] = value
print(f" 映射: {seg.text[:40]}... -> {value[:40]}...")
print(f"翻譯映射總數: {len(translation_map)}")
print(f"段落總數: {len(segments_with_context)}")
print(f"映射覆蓋率: {len(translation_map)/len(segments_with_context)*100:.1f}%")
# 5. 檢查是否有翻譯插入
print(f"\n🔍 步驟5: 檢查翻譯插入邏輯")
# 模擬翻譯插入的檢查邏輯
segments_with_translation = 0
segments_without_translation = 0
for seg in segments_with_context:
has_translation = (target_language, seg.text) in translation_map
if has_translation:
segments_with_translation += 1
print(f" ✅ 有翻譯: {seg.text[:30]}...")
else:
segments_without_translation += 1
print(f" ❌ 無翻譯: {seg.text[:30]}...")
print(f"\n📊 總結:")
print(f" 有翻譯的段落: {segments_with_translation}")
print(f" 無翻譯的段落: {segments_without_translation}")
print(f" 翻譯覆蓋率: {segments_with_translation/(segments_with_translation+segments_without_translation)*100:.1f}%")
# 6. 檢查已翻譯的文件內容
print(f"\n🔍 步驟6: 檢查已生成的翻譯文件")
translated_files = job.get_translated_files()
for tf in translated_files:
if tf.language_code == target_language:
file_path = Path(tf.file_path)
if file_path.exists():
print(f"翻譯文件: {tf.filename}")
print(f"路徑: {tf.file_path}")
print(f"大小: {file_path.stat().st_size:,} bytes")
# 檢查文件內容
try:
from docx import Document
doc = Document(str(file_path))
paragraphs = [p.text.strip() for p in doc.paragraphs if p.text.strip()]
english_paras = [p for p in paragraphs if any(ord(c) < 128 and c.isalpha() for c in p)]
chinese_paras = [p for p in paragraphs if any('\u4e00' <= c <= '\u9fff' for c in p)]
print(f" 總段落: {len(paragraphs)}")
print(f" 含英文段落: {len(english_paras)}")
print(f" 含中文段落: {len(chinese_paras)}")
if english_paras:
print(f" 英文段落範例: {english_paras[0][:80]}...")
else:
print(" ❌ 沒有發現英文段落!")
except Exception as e:
print(f"❌ 讀取翻譯文件失敗: {e}")
if __name__ == "__main__":
debug_docx_translation()

View File

@@ -1,184 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試Excel翻譯問題
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import openpyxl
from pathlib import Path
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
def debug_excel_translation_process():
"""調試Excel翻譯過程"""
print("=" * 80)
print("Excel 翻譯過程調試")
print("=" * 80)
# 文件路徑
excel_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f0b78200-2c5e-41a4-bac8-1536f92529e9")
original_file = excel_dir / "original_panjit_f0b78200.xlsx"
translated_file = excel_dir / "original_panjit_f0b78200_ja_translated.xlsx"
if not original_file.exists():
print(f"原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"翻譯文件不存在: {translated_file}")
return
print(f"\n1. 分析原始文件提取過程")
print("-" * 50)
# 模擬 ExcelParser.extract_text_segments() 的過程
wb = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except Exception:
wb_vals = None
print(f"工作簿載入成功,共 {len(wb.worksheets)} 個工作表")
# 提取文字段落
segs = []
cell_info = [] # 記錄每個提取片段的來源位置
for ws in wb.worksheets:
print(f"\n處理工作表: {ws.title}")
ws_vals = wb_vals[ws.title] if wb_vals and ws.title in wb_vals.sheetnames else None
max_row, max_col = ws.max_row, ws.max_column
print(f"工作表大小: {max_row} x {max_col}")
for r in range(1, max_row + 1):
for c in range(1, max_col + 1):
src_text = get_display_text_for_translation(ws, ws_vals, r, c)
if not src_text:
continue
if not should_translate(src_text, 'auto'):
continue
# 記錄提取到的文字和位置
cell_name = f"{openpyxl.utils.get_column_letter(c)}{r}"
segs.append(src_text)
cell_info.append((cell_name, src_text))
# 詳細記錄前20個儲存格
if len(segs) <= 20:
# 安全輸出,避免特殊字符問題
safe_text = repr(src_text)
print(f" {cell_name}: {safe_text}")
print(f"\n提取結果: 共提取到 {len(segs)} 個文字片段")
# 去重保持順序
unique_segments = []
seen = set()
for seg in segs:
if seg not in seen:
unique_segments.append(seg)
seen.add(seg)
print(f"去重後: {len(unique_segments)} 個唯一文字片段")
print(f"\n2. 分析翻譯結果寫入過程")
print("-" * 50)
# 檢查翻譯檔案的內容
wb_trans = openpyxl.load_workbook(str(translated_file), data_only=False)
# 檢查重要儲存格的翻譯狀況
important_cells = ['A1', 'B1', 'C1', 'D1', 'B3', 'C3', 'D3']
for cell_name in important_cells:
row = int(''.join(filter(str.isdigit, cell_name)))
col = openpyxl.utils.column_index_from_string(''.join(filter(str.isalpha, cell_name)))
# 原始內容
orig_val = wb.active.cell(row=row, column=col).value
# 翻譯後內容
trans_val = wb_trans.active.cell(row=row, column=col).value
print(f"\n儲存格 {cell_name}:")
print(f" 原始: {repr(orig_val)}")
print(f" 翻譯: {repr(trans_val)}")
# 檢查是否為期望的格式(原文+換行+譯文)
if isinstance(trans_val, str) and '\n' in trans_val:
lines = trans_val.split('\n')
print(f" 格式: 雙行格式,共 {len(lines)}")
for i, line in enumerate(lines):
print(f"{i+1}: {repr(line)}")
else:
print(f" 格式: 單行格式")
print(f"\n3. 檢查 A1 儲存格特殊情況")
print("-" * 50)
# 檢查A1儲存格的特殊處理
a1_orig = wb.active['A1'].value
a1_trans = wb_trans.active['A1'].value
print(f"A1 原始值: {repr(a1_orig)}")
print(f"A1 翻譯值: {repr(a1_trans)}")
print(f"A1 是否需要翻譯: {should_translate(str(a1_orig) if a1_orig else '', 'auto')}")
print(f"A1 是否在提取列表中: {str(a1_orig) in unique_segments if a1_orig else False}")
wb.close()
wb_trans.close()
if wb_vals:
wb_vals.close()
def get_display_text_for_translation(ws, ws_vals, r: int, c: int):
"""取得儲存格用於翻譯的顯示文字(移植自原始程式碼)"""
val = ws.cell(row=r, column=c).value
if isinstance(val, str) and val.startswith("="):
if ws_vals is not None:
shown = ws_vals.cell(row=r, column=c).value
return shown if isinstance(shown, str) and shown.strip() else None
return None
if isinstance(val, str) and val.strip():
return val
if ws_vals is not None:
shown = ws_vals.cell(row=r, column=c).value
if isinstance(shown, str) and shown.strip():
return shown
return None
def should_translate(text: str, src_lang: str) -> bool:
"""判斷文字是否需要翻譯(移植自原始程式碼)"""
text = text.strip()
if len(text) < 3:
return False
# Skip pure numbers, dates, etc.
import re
if re.match(r'^[\d\s\.\-\:\/]+$', text):
return False
# For auto-detect, translate if has CJK or meaningful text
if src_lang.lower() in ('auto', 'auto-detect'):
return has_cjk(text) or len(text) > 5
return True
def has_cjk(text: str) -> bool:
"""檢查是否包含中日韓文字(移植自原始程式碼)"""
for char in text:
if '\u4e00' <= char <= '\u9fff' or \
'\u3400' <= char <= '\u4dbf' or \
'\u20000' <= char <= '\u2a6df' or \
'\u3040' <= char <= '\u309f' or \
'\u30a0' <= char <= '\u30ff' or \
'\uac00' <= char <= '\ud7af':
return True
return False
if __name__ == "__main__":
debug_excel_translation_process()

View File

@@ -1,195 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試新上傳的Excel檔案翻譯問題
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app.services.translation_service import ExcelParser
def debug_new_excel_upload():
"""調試新上傳Excel檔案的翻譯問題"""
print("=" * 80)
print("調試新上傳Excel檔案翻譯問題")
print("=" * 80)
# 新上傳的檔案路徑
excel_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\686d4ac5-3a45-4582-870b-893dd6a83b50")
# 尋找Excel檔案
excel_files = list(excel_dir.glob("*.xlsx"))
if not excel_files:
print(f"在目錄中找不到Excel檔案: {excel_dir}")
return
original_file = excel_files[0] # 取第一個Excel檔案
print(f"找到Excel檔案: {original_file}")
# 檢查是否存在翻譯後的檔案
translated_files = list(excel_dir.glob("*_translated.xlsx"))
print(f"翻譯後檔案數量: {len(translated_files)}")
if translated_files:
for tf in translated_files:
print(f" 翻譯檔案: {tf.name}")
# 創建解析器實例
print(f"\n1. 測試ExcelParser實例化")
print("-" * 60)
try:
parser = ExcelParser(str(original_file))
print("✅ ExcelParser實例化成功")
except Exception as e:
print(f"❌ ExcelParser實例化失敗: {e}")
return
print(f"\n2. 測試修正後的_should_translate函數")
print("-" * 60)
# 測試A1儲存格的內容
test_content = "製程" # A1儲存格內容
print(f"測試文字: '{test_content}'")
print(f"文字長度: {len(test_content)}")
# 檢查是否包含CJK字符
has_cjk = parser._has_cjk(test_content)
print(f"包含CJK字符: {has_cjk}")
# 檢查是否應該翻譯
should_translate = parser._should_translate(test_content, 'auto')
print(f"應該翻譯: {should_translate}")
# 詳細分析_should_translate的邏輯
text = test_content.strip()
min_length = 2 if has_cjk else 3
print(f"最小長度要求: {min_length}")
print(f"是否滿足長度要求: {len(text) >= min_length}")
import re
is_pure_number_date = re.match(r'^[\d\s\.\-\:\/ ]+$', text)
print(f"是否為純數字/日期格式: {bool(is_pure_number_date)}")
print(f"\n3. 測試文字片段提取")
print("-" * 60)
segments = parser.extract_text_segments()
print(f"提取到的文字片段總數: {len(segments)}")
# 檢查A1內容是否在提取列表中
if test_content in segments:
print(f"✅ A1內容 '{test_content}' 已被提取")
index = segments.index(test_content)
print(f" 在列表中的索引: {index}")
else:
print(f"❌ A1內容 '{test_content}' 未被提取")
# 顯示前10個提取的片段
print(f"\n前10個提取片段:")
for i, segment in enumerate(segments[:10]):
safe_segment = repr(segment)
print(f" {i+1:2d}. {safe_segment}")
# 特別標記A1內容
if segment == test_content:
print(f" ⬆️ 這是A1的內容")
print(f"\n4. 檢查翻譯快取")
print("-" * 60)
from app import create_app
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
target_language = 'ja' # 日文翻譯
print(f"查詢 '{test_content}' 的日文翻譯...")
result = db.session.execute(sql_text("""
SELECT source_text, translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 5
"""), {'text': test_content, 'lang': target_language})
rows = result.fetchall()
if rows:
print(f"✅ 找到 {len(rows)} 筆翻譯記錄:")
for i, (src, trans, created_at) in enumerate(rows):
print(f" {i+1}. 原文: {repr(src)}")
print(f" 譯文: {repr(trans)}")
print(f" 時間: {created_at}")
else:
print(f"❌ 未找到翻譯記錄")
# 檢查是否有類似的記錄
print(f"\n檢查是否有類似的記錄...")
result2 = db.session.execute(sql_text("""
SELECT source_text, translated_text
FROM dt_translation_cache
WHERE source_text LIKE :text AND target_language = :lang
LIMIT 10
"""), {'text': f'%{test_content}%', 'lang': target_language})
similar_rows = result2.fetchall()
if similar_rows:
print(f"找到 {len(similar_rows)} 筆類似記錄:")
for src, trans in similar_rows:
print(f" 原文: {repr(src)} -> 譯文: {repr(trans)}")
else:
print(f"沒有找到類似記錄")
print(f"\n5. 檢查原始檔案A1儲存格內容")
print("-" * 60)
import openpyxl
wb = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_vals = None
ws = wb.active
ws_vals = wb_vals.active if wb_vals else None
a1_cell = ws['A1']
a1_value = a1_cell.value
a1_display_value = ws_vals['A1'].value if ws_vals else None
print(f"A1儲存格:")
print(f" 原始值: {repr(a1_value)}")
print(f" 顯示值: {repr(a1_display_value)}")
print(f" 是否為公式: {isinstance(a1_value, str) and a1_value.startswith('=')}")
# 模擬get_display_text_for_translation函數
if isinstance(a1_value, str) and a1_value.startswith("="):
display_text = a1_display_value if isinstance(a1_display_value, str) and a1_display_value.strip() else None
elif isinstance(a1_value, str) and a1_value.strip():
display_text = a1_value
else:
display_text = a1_display_value if ws_vals and isinstance(a1_display_value, str) and a1_display_value.strip() else None
print(f" 用於翻譯的文字: {repr(display_text)}")
print(f" 是否應該翻譯: {parser._should_translate(display_text, 'auto') if display_text else False}")
wb.close()
if wb_vals:
wb_vals.close()
print("\n" + "=" * 80)
print("調試完成!")
print("=" * 80)
if __name__ == "__main__":
debug_new_excel_upload()

View File

@@ -1,161 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試段落結構問題
"""
import sys
import os
import tempfile
import shutil
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.services.document_processor import DocumentProcessor, _append_after
from sqlalchemy import text as sql_text
def debug_paragraph_structure():
"""調試段落結構問題"""
app = create_app()
with app.app_context():
print("=== 調試段落結構問題 ===")
# 原始文件
original_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\9c6548ac-2f59-45f4-aade-0a9b3895bbfd\original_-OR026_9c6548ac.docx"
# 創建測試副本
test_dir = Path(tempfile.gettempdir()) / "debug_paragraph"
test_dir.mkdir(exist_ok=True)
test_path = test_dir / "debug_paragraph.docx"
shutil.copy2(original_path, test_path)
print(f"✅ 創建測試副本: {test_path}")
# 創建處理器
processor = DocumentProcessor()
# 提取段落
segments = processor.extract_docx_segments(str(test_path))
# 只看前3個段落
debug_segments = segments[:3]
# 載入文檔
try:
from docx import Document
doc = Document(str(test_path))
print(f"\n📊 文檔分析:")
print(f"總段落數: {len(doc.paragraphs)}")
print(f"\n🔍 前3個段落詳細分析:")
for i, seg in enumerate(debug_segments):
if seg.kind == "para":
p = seg.ref
print(f"\n段落 {i+1}:")
print(f" 文本: {seg.text[:50]}...")
print(f" 段落類型: {type(p)}")
print(f" 段落父元素類型: {type(p._parent)}")
print(f" 段落XML標籤: {p._p.tag if hasattr(p._p, 'tag') else 'N/A'}")
# 檢查段落位置
try:
all_paras = list(doc.paragraphs)
current_index = -1
for idx, doc_p in enumerate(all_paras):
if doc_p._element == p._element:
current_index = idx
break
print(f" 在文檔中的位置: {current_index} (總共{len(all_paras)}段)")
# 測試_append_after插入
print(f" 測試插入翻譯...")
test_translation = f"TEST TRANSLATION {i+1}: This is a test."
try:
before_count = len(doc.paragraphs)
# 記錄插入前的下一個段落
next_para_before = None
if current_index + 1 < len(all_paras):
next_para_before = all_paras[current_index + 1].text[:30]
new_para = _append_after(p, test_translation, italic=True, font_size_pt=12)
after_count = len(doc.paragraphs)
print(f" 插入前段落數: {before_count}")
print(f" 插入後段落數: {after_count}")
print(f" 段落數變化: +{after_count - before_count}")
if new_para:
print(f" 新段落文本: {new_para.text}")
print(f" 新段落類型: {type(new_para)}")
# 檢查插入位置
updated_paras = list(doc.paragraphs)
if current_index + 1 < len(updated_paras):
next_para_after = updated_paras[current_index + 1].text[:30]
print(f" 插入前下一段: {next_para_before}")
print(f" 插入後下一段: {next_para_after}")
if next_para_after != next_para_before:
print(f" ✅ 插入成功:下一段內容已改變")
else:
print(f" ❌ 插入失敗:下一段內容未變")
except Exception as e:
print(f" ❌ _append_after失敗: {e}")
# 嘗試簡單的段落添加測試
try:
simple_para = doc.add_paragraph(f"SIMPLE TEST {i+1}")
print(f" 替代測試: doc.add_paragraph成功")
print(f" 新段落文本: {simple_para.text}")
except Exception as e2:
print(f" 替代測試也失敗: {e2}")
except Exception as outer_e:
print(f" ❌ 段落分析失敗: {outer_e}")
# 保存並重新讀取驗證
output_path = test_dir / "debug_paragraph_modified.docx"
doc.save(str(output_path))
print(f"\n✅ 修改後文檔已保存: {output_path}")
# 重新讀取驗證
doc2 = Document(str(output_path))
print(f"保存後重讀段落數: {len(doc2.paragraphs)}")
print(f"\n📄 前10段內容:")
for i, para in enumerate(doc2.paragraphs[:10]):
if para.text.strip():
lang_info = ""
if "TEST TRANSLATION" in para.text:
lang_info = "🆕 測試翻譯"
elif "SIMPLE TEST" in para.text:
lang_info = "🆕 簡單測試"
elif any('\u4e00' <= c <= '\u9fff' for c in para.text):
lang_info = "🇨🇳 中文"
else:
lang_info = "❓ 其他"
print(f" 段落 {i+1}: {lang_info} - {para.text.strip()[:60]}...")
except Exception as e:
print(f"❌ 調試失敗: {e}")
if __name__ == "__main__":
debug_paragraph_structure()

View File

@@ -1,179 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試實際生產環境中的翻譯問題
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import openpyxl
from app.services.translation_service import ExcelParser
def debug_real_production_issue():
"""調試實際生產環境的翻譯問題"""
print("=" * 80)
print("調試實際生產環境翻譯問題")
print("新上傳UUID: f8b0febc-c0df-4902-8dc3-c90f5634f3b3")
print("=" * 80)
# 實際生產檔案路徑
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f8b0febc-c0df-4902-8dc3-c90f5634f3b3")
original_file = prod_dir / "original_panjit_f8b0febc.xlsx"
translated_file = prod_dir / "original_panjit_f8b0febc_ja_translated.xlsx"
if not original_file.exists():
print(f"❌ 原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"❌ 翻譯文件不存在: {translated_file}")
return
print(f"✅ 檔案確認:")
print(f" 原始文件: {original_file.name}")
print(f" 翻譯文件: {translated_file.name}")
# 1. 檢查實際使用的ExcelParser行為
print(f"\n1. 檢查實際ExcelParser提取行為")
print("-" * 60)
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
print(f"實際提取到 {len(segments)} 個文字片段")
# 檢查A1是否被提取
a1_content = "製程"
if a1_content in segments:
print(f"✅ A1內容 '{a1_content}' 已被提取(位置: {segments.index(a1_content)+1}")
else:
print(f"❌ A1內容 '{a1_content}' 仍未被提取")
# 顯示實際提取的前10個片段
print(f" 實際提取的前10個片段:")
for i, seg in enumerate(segments[:10]):
print(f" {i+1:2d}. {repr(seg)}")
# 2. 直接檢查A1儲存格的原始內容
print(f"\n2. 檢查A1儲存格原始內容")
print("-" * 60)
wb_orig = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_orig_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_orig_vals = None
a1_raw = wb_orig.active['A1'].value
a1_display = wb_orig_vals.active['A1'].value if wb_orig_vals else None
print(f"A1原始值: {repr(a1_raw)}")
if wb_orig_vals:
print(f"A1顯示值: {repr(a1_display)}")
# 模擬get_display_text_for_translation邏輯
if isinstance(a1_raw, str) and a1_raw.startswith("="):
display_text = a1_display if isinstance(a1_display, str) and a1_display.strip() else None
elif isinstance(a1_raw, str) and a1_raw.strip():
display_text = a1_raw
else:
display_text = a1_display if wb_orig_vals and isinstance(a1_display, str) and a1_display.strip() else None
print(f"用於翻譯的文字: {repr(display_text)}")
if display_text:
should_translate = parser._should_translate(display_text, 'auto')
has_cjk = parser._has_cjk(display_text)
min_length = 2 if has_cjk else 3
print(f"文字長度: {len(display_text)}")
print(f"包含CJK: {has_cjk}")
print(f"最小長度要求: {min_length}")
print(f"應該翻譯: {should_translate}")
# 3. 檢查翻譯文件的A1
print(f"\n3. 檢查翻譯文件A1儲存格")
print("-" * 60)
wb_trans = openpyxl.load_workbook(str(translated_file), data_only=False)
a1_trans = wb_trans.active['A1'].value
print(f"A1翻譯結果: {repr(a1_trans)}")
if isinstance(a1_trans, str) and '\n' in a1_trans:
lines = a1_trans.split('\n')
print(f"✅ A1已翻譯格式: 雙行")
for i, line in enumerate(lines):
print(f"{i+1}: {repr(line)}")
elif a1_raw == a1_trans:
print(f"❌ A1未翻譯 - 內容完全相同")
else:
print(f"⚠️ A1內容有變化但格式不明")
# 4. 檢查翻譯快取狀況
print(f"\n4. 檢查翻譯快取")
print("-" * 60)
from app import create_app
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
if display_text:
result = db.session.execute(sql_text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'ja'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': display_text})
row = result.fetchone()
if row:
print(f"✅ 快取中有翻譯: '{display_text}' -> '{row[0]}'")
print(f" 創建時間: {row[1]}")
else:
print(f"❌ 快取中沒有翻譯: '{display_text}'")
# 5. 系統性檢查前10個儲存格
print(f"\n5. 系統性檢查前10個儲存格")
print("-" * 60)
important_cells = ['A1', 'B1', 'C1', 'D1', 'E1', 'A2', 'B2', 'C2', 'D2', 'E2']
for cell_name in important_cells:
orig_val = wb_orig.active[cell_name].value
trans_val = wb_trans.active[cell_name].value
if orig_val: # 只檢查有內容的儲存格
print(f"\n{cell_name}:")
print(f" 原始: {repr(orig_val)}")
print(f" 翻譯: {repr(trans_val)}")
if isinstance(trans_val, str) and '\n' in trans_val:
print(f" 狀態: ✅ 已翻譯")
elif orig_val == trans_val:
print(f" 狀態: ❌ 未翻譯")
else:
print(f" 狀態: ⚠️ 內容有變化")
wb_orig.close()
wb_trans.close()
if wb_orig_vals:
wb_orig_vals.close()
print(f"\n" + "=" * 80)
print("實際生產環境調試完成!")
print("=" * 80)
if __name__ == "__main__":
debug_real_production_issue()

View File

@@ -1,204 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
檢查docx文件表格翻譯問題
特別分析"超温"文字的翻譯狀況
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import docx
from docx.table import Table
from app import create_app
def analyze_docx_table_translation():
"""分析docx表格翻譯問題"""
print("=" * 80)
print("檢查docx表格翻譯問題")
print("任務ID: 17e05695-406f-47af-96eb-a0e23843770e")
print("=" * 80)
base_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\17e05695-406f-47af-96eb-a0e23843770e")
original_file = base_dir / "original_-OR026_17e05695.docx"
translated_en = base_dir / "translated_original_-OR026_17e05695_en_translat.docx"
translated_vi = base_dir / "translated_original_-OR026_17e05695_vi_translat.docx"
if not original_file.exists():
print(f"❌ 原始檔案不存在: {original_file}")
return
print(f"✅ 原始檔案: {original_file.name}")
# 1. 檢查原始文件中的"超温"
print(f"\n1. 分析原始文件表格內容")
print("-" * 60)
try:
doc = docx.Document(str(original_file))
tables_found = 0
target_text_found = False
for table_idx, table in enumerate(doc.tables):
tables_found += 1
print(f"表格 {table_idx + 1}:")
for row_idx, row in enumerate(table.rows):
for cell_idx, cell in enumerate(row.cells):
cell_text = cell.text.strip()
if cell_text:
print(f"{row_idx+1}{cell_idx+1}: {repr(cell_text)}")
if "超温" in cell_text:
print(f" 🎯 找到目標文字 '超温'")
target_text_found = True
# 檢查該儲存格的詳細結構
print(f" 儲存格段落數: {len(cell.paragraphs)}")
for p_idx, para in enumerate(cell.paragraphs):
print(f" 段落{p_idx+1}: {repr(para.text)}")
print(f"\n總表格數: {tables_found}")
print(f"是否找到'超温': {'' if target_text_found else ''}")
except Exception as e:
print(f"❌ 讀取原始文件失敗: {e}")
return
# 2. 檢查翻譯版本中的對應內容
for lang, trans_file in [("英文", translated_en), ("越南文", translated_vi)]:
if not trans_file.exists():
print(f"\n{lang}翻譯檔案不存在")
continue
print(f"\n2. 檢查{lang}翻譯結果")
print("-" * 60)
try:
trans_doc = docx.Document(str(trans_file))
translation_found = False
for table_idx, table in enumerate(trans_doc.tables):
print(f"{lang}表格 {table_idx + 1}:")
for row_idx, row in enumerate(table.rows):
for cell_idx, cell in enumerate(row.cells):
cell_text = cell.text.strip()
if cell_text:
# 檢查是否包含原文"超温"
if "超温" in cell_text:
print(f"{row_idx+1}{cell_idx+1}: {repr(cell_text)}")
print(f" ⚠️ 仍包含原文'超温',可能未翻譯")
# 詳細分析該儲存格
print(f" 儲存格段落數: {len(cell.paragraphs)}")
for p_idx, para in enumerate(cell.paragraphs):
p_text = para.text.strip()
print(f" 段落{p_idx+1}: {repr(p_text)}")
# 檢查是否有翻譯標記
if "【翻譯失敗」" in p_text or "translation:" in p_text.lower():
print(f" 🔍 發現翻譯標記")
elif "\u200b" in p_text: # 零寬空格標記
print(f" 🔍 發現翻譯插入標記")
# 檢查可能的翻譯結果
elif any(keyword in cell_text.lower() for keyword in ['overheating', 'over-heating', 'quá nóng']):
print(f"{row_idx+1}{cell_idx+1}: {repr(cell_text)}")
print(f" ✅ 可能的翻譯結果")
translation_found = True
print(f"{lang}翻譯狀態: {'✅ 找到翻譯' if translation_found else '❌ 未找到翻譯'}")
except Exception as e:
print(f"❌ 讀取{lang}翻譯檔案失敗: {e}")
def check_translation_cache():
"""檢查翻譯快取中是否有"超温"的記錄"""
print(f"\n" + "=" * 80)
print("檢查翻譯快取")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
print(f"\n1. 搜尋'超温'相關的快取記錄")
print("-" * 60)
# 搜尋包含"超温"的快取記錄
cache_results = db.session.execute(sql_text("""
SELECT id, source_text, target_language, translated_text, created_at
FROM dt_translation_cache
WHERE source_text LIKE '%超温%'
ORDER BY created_at DESC
LIMIT 10
""")).fetchall()
if cache_results:
print(f"找到 {len(cache_results)} 條相關記錄:")
for row in cache_results:
print(f"ROW {row[0]}: {row[3]} -> {row[1]}")
print(f" 目標語言: {row[1]}")
print(f" 翻譯結果: {repr(row[2][:50])}...")
print(f" 時間: {row[4]}")
print()
else:
print("❌ 未找到包含'超温'的快取記錄")
# 搜尋近期的翻譯記錄
print(f"\n2. 檢查近期的翻譯記錄")
print("-" * 60)
recent_results = db.session.execute(sql_text("""
SELECT id, source_text, target_language, translated_text, created_at
FROM dt_translation_cache
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 DAY)
AND (target_language = 'en' OR target_language = 'vi')
ORDER BY created_at DESC
LIMIT 20
""")).fetchall()
print(f"近24小時內的英文/越南文翻譯記錄 (共{len(recent_results)}條):")
for row in recent_results:
print(f"ROW {row[0]}: {repr(row[1][:20])}... -> {row[2]} -> {repr(row[3][:30])}...")
def main():
"""主檢查函數"""
print("🔍 診斷docx表格翻譯問題")
print("重點檢查: '超温' 文字翻譯狀況")
try:
# 分析文件表格
analyze_docx_table_translation()
# 檢查翻譯快取
check_translation_cache()
print(f"\n" + "=" * 80)
print("診斷總結")
print("=" * 80)
print("請根據以上結果判斷問題類型:")
print("1. 解析問題: 原始文件中找不到'超温'")
print("2. 翻譯問題: 快取中沒有'超温'的翻譯記錄")
print("3. 插入問題: 有翻譯記錄但未插入到文件中")
print("4. 版面問題: 翻譯已插入但格式或位置導致看不到")
print("=" * 80)
except Exception as e:
print(f"❌ 診斷過程發生錯誤: {e}")
import traceback
print(f"錯誤詳情: {traceback.format_exc()}")
if __name__ == "__main__":
main()

View File

@@ -1,161 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試文字格式不匹配問題
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app import create_app
def debug_text_format_mismatch():
"""調試文字格式不匹配問題"""
print("=" * 80)
print("調試文字格式不匹配問題")
print("Excel提取 vs 原始快取的文字格式")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
from app.services.translation_service import ExcelParser
# 1. 檢查Excel提取的D2文字格式
print(f"1. Excel提取的D2文字格式")
print("-" * 60)
original_file = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\98158984-f335-44f5-a0b4-88fb8ccd5d78") / "original_panjit_98158984.xlsx"
if original_file.exists():
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
# 找到包含"WB inline"的片段
d2_extracted = None
for segment in segments:
if "WB inline" in segment:
d2_extracted = segment
break
if d2_extracted:
print(f"Excel提取的D2:")
print(f" 長度: {len(d2_extracted)}")
print(f" 內容: {repr(d2_extracted)}")
print(f" 包含\\n: {'\\n' in d2_extracted}")
print(f" 行數: {len(d2_extracted.split(chr(10)))}")
else:
print("❌ 沒有找到D2相關內容")
# 2. 檢查原始快取中的D2格式
print(f"\n2. 原始快取中的D2格式")
print("-" * 60)
result = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, target_language, created_at
FROM dt_translation_cache
WHERE source_text LIKE '%WB inline%' AND source_text LIKE '%Sn/Au%'
ORDER BY created_at ASC
"""))
d2_cache_records = result.fetchall()
print(f"找到 {len(d2_cache_records)} 筆原始D2快取:")
for i, record in enumerate(d2_cache_records, 1):
print(f"\n記錄 {i} (ROW {record[0]}, {record[3]}):")
print(f" 長度: {len(record[1])}")
print(f" 內容: {repr(record[1])}")
print(f" 包含\\n: {'\\n' in record[1]}")
print(f" 行數: {len(record[1].split(chr(10)))}")
print(f" 創建時間: {record[4]}")
# 標記哪個是原始DIFY翻譯
if record[0] == 449:
print(f" 🎯 這是原始DIFY韓文翻譯 (ROW 449)")
# 3. 比較格式差異
print(f"\n3. 格式差異分析")
print("-" * 60)
if d2_extracted and d2_cache_records:
original_cache = next((r for r in d2_cache_records if r[0] == 449), None)
if original_cache:
print(f"Excel提取格式:")
print(f" {repr(d2_extracted)}")
print(f"\n原始快取格式 (ROW 449):")
print(f" {repr(original_cache[1])}")
print(f"\n格式差異:")
print(f" 長度差異: {len(d2_extracted)} vs {len(original_cache[1])}")
print(f" Excel有\\n: {'\\n' in d2_extracted}")
print(f" 快取有\\n: {'\\n' in original_cache[1]}")
# 嘗試格式化統一比較
excel_normalized = d2_extracted.replace('\n', ' ').strip()
cache_normalized = original_cache[1].replace('\n', ' ').strip()
print(f"\n標準化比較:")
print(f" Excel標準化: {repr(excel_normalized)}")
print(f" 快取標準化: {repr(cache_normalized)}")
print(f" 標準化後相等: {excel_normalized == cache_normalized}")
# 檢查字符級差異
if excel_normalized != cache_normalized:
print(f"\n字符級差異分析:")
min_len = min(len(excel_normalized), len(cache_normalized))
for j in range(min_len):
if excel_normalized[j] != cache_normalized[j]:
print(f" 位置{j}: Excel='{excel_normalized[j]}' vs 快取='{cache_normalized[j]}'")
break
# 4. 測試修正查找邏輯
print(f"\n4. 測試修正查找邏輯")
print("-" * 60)
if d2_extracted:
# 原始查找
result1 = db.session.execute(sql_text("""
SELECT id, translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'ko'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': d2_extracted})
row1 = result1.fetchone()
print(f"原始查找 (精確匹配): {'✅ 找到' if row1 else '❌ 未找到'}")
if row1:
print(f" ROW {row1[0]}: {repr(row1[1][:30])}...")
# 標準化查找 - 去除換行後查找
normalized_text = d2_extracted.replace('\n', ' ').strip()
result2 = db.session.execute(sql_text("""
SELECT id, translated_text
FROM dt_translation_cache
WHERE REPLACE(REPLACE(source_text, '\n', ' '), '\r', ' ') = :text AND target_language = 'ko'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': normalized_text})
row2 = result2.fetchone()
print(f"標準化查找 (忽略換行): {'✅ 找到' if row2 else '❌ 未找到'}")
if row2:
print(f" ROW {row2[0]}: {repr(row2[1][:30])}...")
print(f"\n" + "=" * 80)
print("文字格式不匹配調試完成!")
print("建議: 修改翻譯映射邏輯以容忍換行符差異")
print("=" * 80)
if __name__ == "__main__":
debug_text_format_mismatch()

View File

@@ -1,143 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Debug script to trace translation file generation issue
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from pathlib import Path
from app.services.document_processor import DocumentProcessor
def debug_docx_processing(file_path):
"""Debug DOCX processing to understand why translations aren't being inserted"""
print(f"=== Debugging DOCX file: {file_path} ===")
if not Path(file_path).exists():
print(f"ERROR: File does not exist: {file_path}")
return
processor = DocumentProcessor()
try:
# Extract segments
segments = processor.extract_docx_segments(file_path)
print(f"Extracted {len(segments)} segments:")
for i, seg in enumerate(segments):
print(f" Segment {i+1}:")
print(f" Kind: {seg.kind}")
print(f" Context: {seg.ctx}")
print(f" Text: {repr(seg.text[:100])}")
print(f" Should translate: {processor.should_translate_text(seg.text, 'auto')}")
print()
# Simulate translation map
sample_translation_map = {}
target_languages = ['vi', 'en']
for target_lang in target_languages:
for seg in segments:
if processor.should_translate_text(seg.text, 'auto'):
# Simulate a translation
key = (target_lang, seg.text)
sample_translation_map[key] = f"[TRANSLATED_{target_lang.upper()}] {seg.text}"
print(f"Built translation map with {len(sample_translation_map)} entries:")
for key, value in list(sample_translation_map.items())[:5]:
print(f" {key[0]} | {repr(key[1][:50])} -> {repr(value[:50])}")
print()
# Test translation insertion
output_path = str(Path(file_path).parent / "debug_translated.docx")
print(f"Testing translation insertion to: {output_path}")
ok_count, skip_count = processor.insert_docx_translations(
file_path=file_path,
segments=segments,
translation_map=sample_translation_map,
target_languages=target_languages,
output_path=output_path
)
print(f"Translation insertion result: {ok_count} OK, {skip_count} skipped")
if Path(output_path).exists():
print(f"SUCCESS: Output file created with size {Path(output_path).stat().st_size} bytes")
else:
print("ERROR: Output file was not created")
except Exception as e:
print(f"ERROR during processing: {str(e)}")
import traceback
traceback.print_exc()
def check_jobs():
"""Check for jobs and debug them"""
try:
from app import create_app
from app.models.job import TranslationJob
app = create_app()
with app.app_context():
# Check all recent jobs
all_jobs = TranslationJob.query.order_by(TranslationJob.created_at.desc()).limit(5).all()
print(f"\n=== Found {len(all_jobs)} recent jobs ===")
for job in all_jobs:
print(f"Job {job.job_uuid}: {job.original_filename}")
print(f" Status: {job.status}")
print(f" File path: {job.file_path}")
print(f" File exists: {Path(job.file_path).exists() if job.file_path else 'N/A'}")
print(f" Target languages: {job.target_languages}")
print(f" Total tokens: {job.total_tokens}")
print(f" Total cost: {job.total_cost}")
# Check API usage stats
from app.models.stats import APIUsageStats
api_stats = APIUsageStats.query.filter_by(job_id=job.id).all()
print(f" API calls made: {len(api_stats)}")
for stat in api_stats[:3]: # Show first 3 calls
print(f" - {stat.api_endpoint}: {stat.total_tokens} tokens, ${stat.cost:.4f}, success: {stat.success}")
if not stat.success:
print(f" Error: {stat.error_message}")
if job.file_path and Path(job.file_path).exists() and job.status == 'COMPLETED':
print(f" >>> Debugging COMPLETED job file: {job.file_path}")
debug_docx_processing(job.file_path)
# Check translated files
translated_files = job.get_translated_files()
print(f" >>> Found {len(translated_files)} translated files:")
for tf in translated_files:
print(f" - {tf.filename} ({tf.language_code}) - Size: {tf.file_size} bytes")
if Path(tf.file_path).exists():
print(f" File exists: {tf.file_path}")
else:
print(f" File MISSING: {tf.file_path}")
print()
return all_jobs
except Exception as e:
print(f"Error checking jobs: {str(e)}")
import traceback
traceback.print_exc()
return []
if __name__ == "__main__":
if len(sys.argv) > 1:
# Debug specific file
debug_docx_processing(sys.argv[1])
else:
# Debug recent jobs
check_jobs()

View File

@@ -1,176 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Debug the complete translation flow to find where translations are lost
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app
from app.services.document_processor import DocumentProcessor
from app.services.dify_client import DifyClient
from pathlib import Path
def debug_translation_flow():
"""Debug the complete translation flow"""
app = create_app()
with app.app_context():
# Use the actual job file
job_file_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\8cada04e-da42-4416-af46-f01cca5a452f\original_-OR026_8cada04e.docx"
if not Path(job_file_path).exists():
print(f"ERROR: Job file does not exist: {job_file_path}")
return
print("=== DEBUGGING TRANSLATION FLOW ===")
print(f"File: {job_file_path}")
# Step 1: Extract segments
print("\n1. EXTRACTING SEGMENTS...")
processor = DocumentProcessor()
segments = processor.extract_docx_segments(job_file_path)
translatable_segments = []
for i, seg in enumerate(segments):
if processor.should_translate_text(seg.text, 'auto'):
translatable_segments.append(seg)
print(f"Total segments: {len(segments)}")
print(f"Translatable segments: {len(translatable_segments)}")
print(f"First 3 translatable segments:")
for i, seg in enumerate(translatable_segments[:3]):
print(f" {i+1}. {repr(seg.text[:50])}")
# Step 2: Test Dify translation on first few segments
print("\n2. TESTING DIFY TRANSLATIONS...")
dify_client = DifyClient()
translation_map = {}
target_languages = ['en', 'vi']
for target_lang in target_languages:
print(f"\nTesting translation to {target_lang}:")
for i, seg in enumerate(translatable_segments[:3]): # Test first 3
try:
print(f" Translating: {repr(seg.text)}")
result = dify_client.translate_text(
text=seg.text,
source_language='zh-cn',
target_language=target_lang,
user_id=1,
job_id=1
)
translated_text = result.get('translated_text', '')
translation_map[(target_lang, seg.text)] = translated_text
print(f" Result: {repr(translated_text)}")
print(f" Success: {translated_text != seg.text and translated_text.strip()}")
except Exception as e:
print(f" ERROR: {e}")
translation_map[(target_lang, seg.text)] = f"[ERROR] {seg.text}"
# Step 3: Test translation insertion
print(f"\n3. TESTING TRANSLATION INSERTION...")
print(f"Translation map entries: {len(translation_map)}")
for key, value in list(translation_map.items())[:6]:
lang, source = key
print(f" {lang} | {repr(source[:30])} -> {repr(value[:30])}")
# Debug: Check which segments will be matched
print(f"\n3.1. SEGMENT MATCHING DEBUG...")
target_langs_for_test = ['en']
matched_count = 0
for i, seg in enumerate(segments[:10]): # Check first 10 segments
has_translation = any((tgt, seg.text) in translation_map for tgt in target_langs_for_test)
status = "MATCH" if has_translation else "NO MATCH"
print(f" Segment {i+1}: {status} | {repr(seg.text[:40])}")
if has_translation:
matched_count += 1
for tgt in target_langs_for_test:
if (tgt, seg.text) in translation_map:
translation = translation_map[(tgt, seg.text)]
print(f" -> {tgt}: {repr(translation[:40])}")
print(f"Segments that will match: {matched_count}/10 (in first 10)")
# Step 4: Check translation cache for real job data
print(f"\n4. CHECKING TRANSLATION CACHE...")
from app.models.cache import TranslationCache
# Check if there are any cached translations for the segments
cache_hits = 0
cache_misses = 0
for i, seg in enumerate(translatable_segments[:5]): # Check first 5
for target_lang in ['en', 'vi']:
cached = TranslationCache.get_translation(
text=seg.text,
source_language='zh-cn',
target_language=target_lang
)
if cached:
print(f" CACHE HIT: {target_lang} | {repr(seg.text[:30])} -> {repr(cached[:30])}")
cache_hits += 1
else:
cache_misses += 1
print(f"Cache hits: {cache_hits}, Cache misses: {cache_misses}")
# Create test output file
output_path = str(Path(job_file_path).parent / "flow_debug_translated.docx")
try:
ok_count, skip_count = processor.insert_docx_translations(
file_path=job_file_path,
segments=segments,
translation_map=translation_map,
target_languages=['en'], # Test with one language first
output_path=output_path
)
print(f"Translation insertion: {ok_count} OK, {skip_count} skipped")
if Path(output_path).exists():
print(f"✅ Output file created: {Path(output_path).stat().st_size} bytes")
# Verify the output contains translations
test_segments = processor.extract_docx_segments(output_path)
print(f"Output file segments: {len(test_segments)}")
# Look for evidence of translations
translation_evidence = []
for seg in test_segments:
# Check if segment text appears to be a translation
if any(word in seg.text.lower() for word in ['purpose', 'equipment', 'maintenance', 'check']):
translation_evidence.append(seg.text[:50])
print(f"Translation evidence found: {len(translation_evidence)} segments")
for evidence in translation_evidence[:3]:
print(f" - {repr(evidence)}")
else:
print("❌ Output file was not created")
except Exception as e:
print(f"ERROR during insertion: {e}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
debug_translation_flow()

View File

@@ -1,146 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試翻譯映射過程 - 為什麼A1沒有被翻譯
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app.services.translation_service import ExcelParser
def debug_translation_mapping():
"""調試翻譯映射過程"""
print("=" * 80)
print("調試翻譯映射過程 - 為什麼A1沒有被翻譯")
print("=" * 80)
# 使用實際生產檔案
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f8b0febc-c0df-4902-8dc3-c90f5634f3b3")
original_file = prod_dir / "original_panjit_f8b0febc.xlsx"
parser = ExcelParser(str(original_file))
# 1. 檢查提取的文字片段
print(f"1. 檢查文字片段提取")
print("-" * 60)
segments = parser.extract_text_segments()
print(f"提取到 {len(segments)} 個片段")
a1_content = "製程"
if a1_content in segments:
print(f"'{a1_content}' 在提取列表中")
else:
print(f"'{a1_content}' 不在提取列表中")
return
# 2. 模擬generate_translated_document的映射過程
print(f"\n2. 模擬翻譯映射過程")
print("-" * 60)
from app import create_app
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
target_language = 'ja'
tmap = {}
print(f"建立翻譯映射...")
for original_text in segments:
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': original_text, 'lang': target_language})
row = result.fetchone()
if row and row[0]:
tmap[original_text] = row[0]
if original_text == a1_content:
print(f"✅ A1映射成功: '{original_text}' -> '{row[0]}'")
else:
if original_text == a1_content:
print(f"❌ A1映射失敗: '{original_text}' -> 無翻譯")
print(f"翻譯映射建立完成: {len(tmap)}/{len(segments)}")
# 3. 模擬儲存格翻譯過程
print(f"\n3. 模擬儲存格翻譯過程")
print("-" * 60)
import openpyxl
wb = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_vals = None
ws = wb.active
ws_vals = wb_vals.active if wb_vals else None
# 檢查A1儲存格的翻譯邏輯
r, c = 1, 1 # A1
src_text = parser._get_display_text_for_translation(ws, ws_vals, r, c)
print(f"A1儲存格:")
print(f" 提取的文字: {repr(src_text)}")
print(f" 是否需要翻譯: {parser._should_translate(src_text, 'auto') if src_text else False}")
if src_text:
if not parser._should_translate(src_text, 'auto'):
print(f" ❌ 跳過原因: should_translate返回False")
elif src_text not in tmap:
print(f" ❌ 跳過原因: 翻譯映射中沒有找到")
print(f" 映射鍵列表中是否包含:")
for key in list(tmap.keys())[:5]:
print(f" {repr(key)}")
if len(tmap) > 5:
print(f" ... 還有{len(tmap)-5}")
else:
print(f" ✅ 應該翻譯: '{src_text}' -> '{tmap[src_text]}'")
wb.close()
if wb_vals:
wb_vals.close()
# 4. 檢查實際執行時的日誌
print(f"\n4. 檢查是否有其他問題")
print("-" * 60)
# 再次檢查快取中的記錄
exact_match = db.session.execute(sql_text("""
SELECT source_text, translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': a1_content, 'lang': target_language})
match_row = exact_match.fetchone()
if match_row:
print(f"✅ 快取精確匹配: '{match_row[0]}' -> '{match_row[1]}'")
print(f" 原文字節數: {len(match_row[0].encode('utf-8'))}")
print(f" 查找字節數: {len(a1_content.encode('utf-8'))}")
print(f" 字符完全相等: {match_row[0] == a1_content}")
else:
print(f"❌ 沒有找到精確匹配")
print(f"\n" + "=" * 80)
print("翻譯映射調試完成!")
print("=" * 80)
if __name__ == "__main__":
debug_translation_mapping()

View File

@@ -1,128 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試翻譯成功率問題 - 為什麼整段落快取沒有儲存
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
def debug_translation_success():
"""調試翻譯成功率問題"""
print("=" * 80)
print("調試翻譯成功率問題 - 為什麼整段落快取沒有儲存")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 測試有問題的多行文字
test_texts = [
"與 WB inline 串線DB→WB、時效快支援 Sn/Au 晶片\n最小可支援9mil晶粒\n支援EAP管控",
"空洞表現穩定、尺寸/厚度範圍廣\n最小可支援9mil晶粒\n支援EAP管控"
]
target_language = 'ja'
print(f"檢查多行文字的句子級快取狀況...")
print("-" * 60)
for i, text in enumerate(test_texts, 1):
print(f"\n測試文字 {i}: {repr(text[:50])}...")
lines = text.split('\n')
print(f" 分解為 {len(lines)} 行:")
all_lines_cached = True
for j, line in enumerate(lines, 1):
line = line.strip()
if not line:
continue
print(f"\n{j}: {repr(line)}")
# 檢查這行是否有快取
result = db.session.execute(sql_text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': line, 'lang': target_language})
row = result.fetchone()
if row:
print(f" ✅ 句子快取存在: '{row[0][:30]}...' ({row[1]})")
else:
print(f" ❌ 句子快取不存在")
all_lines_cached = False
# 進一步檢查:分句處理
from app.services.document_processor import DocumentProcessor
processor = DocumentProcessor()
sentences = processor.split_text_into_sentences(line, 'zh')
if len(sentences) > 1:
print(f" 📝 分句結果: {len(sentences)} 個句子")
for k, sentence in enumerate(sentences, 1):
sentence = sentence.strip()
if not sentence:
continue
sentence_result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': sentence, 'lang': target_language})
sentence_row = sentence_result.fetchone()
if sentence_row:
print(f" ✅ 句子{k}: '{sentence[:20]}...' -> 有快取")
else:
print(f" ❌ 句子{k}: '{sentence[:20]}...' -> 無快取")
all_lines_cached = False
print(f"\n 整體快取狀況: {'✅ 完整' if all_lines_cached else '❌ 不完整'}")
# 檢查整段落快取
whole_result = db.session.execute(sql_text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': text, 'lang': target_language})
whole_row = whole_result.fetchone()
if whole_row:
print(f" ✅ 整段落快取存在: 時間 {whole_row[1]}")
else:
print(f" ❌ 整段落快取不存在")
# 可能的原因分析
if not all_lines_cached:
print(f" 原因: 某些句子翻譯失敗all_successful=False")
else:
print(f" 原因: 可能是其他錯誤或邏輯問題")
print(f"\n" + "=" * 80)
print("翻譯成功率調試完成!")
print("建議: 檢查 translate_segment_with_sentences 中的錯誤處理邏輯")
print("=" * 80)
if __name__ == "__main__":
debug_translation_success()

View File

@@ -1,220 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
調試回寫問題 - 為什麼D2-D8有快取但沒有回寫到Excel
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import openpyxl
from app.services.translation_service import ExcelParser
def debug_writeback_issue():
"""調試回寫問題的詳細分析"""
print("=" * 80)
print("調試回寫問題 - D2-D8有快取但沒有回寫")
print("使用上傳UUID: f8b0febc-c0df-4902-8dc3-c90f5634f3b3 (有日文翻譯)")
print("=" * 80)
# 使用有日文翻譯的檔案路徑
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f8b0febc-c0df-4902-8dc3-c90f5634f3b3")
original_file = prod_dir / "original_panjit_f8b0febc.xlsx"
translated_file = prod_dir / "original_panjit_f8b0febc_ja_translated.xlsx"
if not original_file.exists():
print(f"❌ 原始文件不存在: {original_file}")
return
if not translated_file.exists():
print(f"❌ 翻譯文件不存在: {translated_file}")
return
print(f"✅ 檔案確認:")
print(f" 原始: {original_file.name}")
print(f" 翻譯: {translated_file.name}")
# 1. 檢查問題儲存格的具體內容
print(f"\n1. 檢查問題儲存格內容")
print("-" * 60)
problem_cells = ['D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'F2', 'F3', 'F4', 'F5', 'F6']
wb_orig = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_orig_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_orig_vals = None
wb_trans = openpyxl.load_workbook(str(translated_file), data_only=False)
cell_contents = {}
for cell_name in problem_cells:
orig_val = wb_orig.active[cell_name].value
orig_display = wb_orig_vals.active[cell_name].value if wb_orig_vals else None
trans_val = wb_trans.active[cell_name].value
if orig_val: # 只檢查有內容的儲存格
print(f"\n{cell_name}:")
print(f" 原始值: {repr(orig_val)}")
if wb_orig_vals and orig_display != orig_val:
print(f" 顯示值: {repr(orig_display)}")
print(f" 翻譯值: {repr(trans_val)}")
# 決定用於翻譯的文字
parser = ExcelParser(str(original_file))
if isinstance(orig_val, str) and orig_val.startswith("="):
display_text = orig_display if isinstance(orig_display, str) and orig_display.strip() else None
elif isinstance(orig_val, str) and orig_val.strip():
display_text = orig_val
else:
display_text = orig_display if wb_orig_vals and isinstance(orig_display, str) and orig_display.strip() else None
print(f" 用於翻譯: {repr(display_text)}")
if display_text:
should_translate = parser._should_translate(display_text, 'auto')
print(f" 應該翻譯: {should_translate}")
cell_contents[cell_name] = display_text
else:
print(f" ❌ 沒有可翻譯文字")
# 2. 檢查這些文字是否在提取列表中
print(f"\n2. 檢查文字提取狀況")
print("-" * 60)
segments = parser.extract_text_segments()
print(f"總共提取 {len(segments)} 個片段")
for cell_name, text in cell_contents.items():
if text in segments:
print(f"{cell_name}='{text}' 已被提取 (位置: {segments.index(text)+1})")
else:
print(f"{cell_name}='{text}' 未被提取")
# 3. 檢查MySQL快取中的翻譯
print(f"\n3. 檢查MySQL快取中的翻譯")
print("-" * 60)
from app import create_app
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
translation_map = {}
for cell_name, text in cell_contents.items():
result = db.session.execute(sql_text("""
SELECT id, translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'ja'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': text})
row = result.fetchone()
if row:
translation_map[text] = row[1]
print(f"{cell_name}='{text}' -> '{row[1]}' (ID:{row[0]}, 時間:{row[2]})")
else:
print(f"{cell_name}='{text}' -> 快取中無翻譯")
print(f"\n快取命中率: {len(translation_map)}/{len(cell_contents)} = {len(translation_map)/len(cell_contents)*100:.1f}%")
# 4. 模擬generate_translated_document的映射邏輯
print(f"\n4. 模擬翻譯映射建立過程")
print("-" * 60)
# 建立翻譯映射 (模擬實際邏輯)
mapping_result = {}
for original_text in segments:
cache_result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = 'ja'
ORDER BY created_at DESC
LIMIT 1
"""), {'text': original_text, 'lang': 'ja'})
cache_row = cache_result.fetchone()
if cache_row and cache_row[0]:
mapping_result[original_text] = cache_row[0]
print(f"映射建立完成: {len(mapping_result)}/{len(segments)} = {len(mapping_result)/len(segments)*100:.1f}%")
# 檢查問題儲存格的映射狀況
print(f"\n映射檢查:")
for cell_name, text in cell_contents.items():
if text in mapping_result:
print(f"{cell_name}='{text}' 在映射中: '{mapping_result[text]}'")
else:
print(f"{cell_name}='{text}' 不在映射中")
# 5. 模擬實際的儲存格翻譯寫入邏輯
print(f"\n5. 模擬儲存格翻譯寫入邏輯")
print("-" * 60)
# 重新載入工作簿進行模擬
wb_test = openpyxl.load_workbook(str(original_file), data_only=False)
try:
wb_test_vals = openpyxl.load_workbook(str(original_file), data_only=True)
except:
wb_test_vals = None
ws = wb_test.active
ws_vals = wb_test_vals.active if wb_test_vals else None
for cell_name in problem_cells:
if cell_name in cell_contents:
text = cell_contents[cell_name]
# 模擬_get_display_text_for_translation邏輯
cell = ws[cell_name]
r, c = cell.row, cell.column
src_text = parser._get_display_text_for_translation(ws, ws_vals, r, c)
print(f"\n{cell_name} 寫入模擬:")
print(f" 提取文字: {repr(src_text)}")
print(f" 預期文字: {repr(text)}")
print(f" 文字一致: {src_text == text}")
if src_text and parser._should_translate(src_text, 'auto'):
if src_text in mapping_result:
translated = mapping_result[src_text]
new_value = f"{src_text}\n{translated}"
print(f" ✅ 應該寫入: {repr(new_value)}")
else:
print(f" ❌ 映射中找不到: '{src_text}'")
# 檢查映射鍵中是否有相似的
similar_keys = [key for key in mapping_result.keys() if key.strip() == src_text.strip()]
if similar_keys:
print(f" 相似鍵: {similar_keys}")
else:
print(f" ❌ 不應翻譯或無文字")
wb_test.close()
if wb_test_vals:
wb_test_vals.close()
wb_orig.close()
wb_trans.close()
if wb_orig_vals:
wb_orig_vals.close()
print(f"\n" + "=" * 80)
print("回寫問題調試完成!")
print("請檢查上述輸出找出問題原因。")
print("=" * 80)
if __name__ == "__main__":
debug_writeback_issue()

102
deploy.sh Normal file
View File

@@ -0,0 +1,102 @@
#!/bin/bash
# PANJIT 文件翻譯系統 - 一鍵部署腳本
# 版本: 2.0.0
# 日期: 2025-09-04
set -e
echo "🚀 PANJIT 文件翻譯系統部署腳本"
echo "================================="
# 檢查 Docker 環境
echo "📋 檢查 Docker 環境..."
if ! command -v docker &> /dev/null; then
echo "❌ Docker 未安裝,請先安裝 Docker"
exit 1
fi
if ! command -v docker-compose &> /dev/null; then
echo "❌ Docker Compose 未安裝,請先安裝 Docker Compose"
exit 1
fi
echo "✅ Docker 環境檢查完成"
# 檢查端口
echo "📋 檢查端口 12010..."
if lsof -Pi :12010 -sTCP:LISTEN -t >/dev/null; then
echo "⚠️ 端口 12010 已被占用"
echo "請停止占用端口的服務或修改配置中的端口號"
echo "當前占用進程:"
lsof -Pi :12010 -sTCP:LISTEN
read -p "是否繼續部署?(y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
exit 1
fi
else
echo "✅ 端口 12010 可用"
fi
# 創建必要目錄
echo "📁 創建必要目錄..."
mkdir -p uploads cache logs
chmod 755 uploads cache logs
echo "✅ 目錄創建完成"
# 停止舊服務
echo "🛑 停止現有服務..."
docker-compose down 2>/dev/null || true
# 建置映像
echo "🏗️ 建置 Docker 映像..."
docker-compose build
# 啟動服務
echo "🚀 啟動服務..."
docker-compose up -d
# 等待服務啟動
echo "⏳ 等待服務啟動..."
sleep 30
# 健康檢查
echo "🔍 檢查服務狀態..."
if curl -f http://localhost:12010/api/v1/health >/dev/null 2>&1; then
echo "✅ 主應用服務正常"
else
echo "❌ 主應用服務異常"
echo "查看日誌:"
docker-compose logs app
exit 1
fi
# 檢查 Celery Worker
echo "🔍 檢查 Celery Worker..."
if docker-compose exec -T celery-worker celery -A app.celery inspect ping >/dev/null 2>&1; then
echo "✅ Celery Worker 正常"
else
echo "⚠️ Celery Worker 可能異常,請檢查日誌"
fi
# 顯示服務狀態
echo ""
echo "📊 服務狀態:"
docker-compose ps
echo ""
echo "🎉 部署完成!"
echo "================================="
echo "📱 系統訪問地址: http://localhost:12010"
echo "🔧 API 文檔: http://localhost:12010/api"
echo "❤️ 健康檢查: http://localhost:12010/api/v1/health"
echo ""
echo "📋 管理命令:"
echo " 查看日誌: docker-compose logs -f"
echo " 停止服務: docker-compose down"
echo " 重啟服務: docker-compose restart"
echo " 服務狀態: docker-compose ps"
echo ""
echo "📞 技術支援: it-support@panjit.com.tw"
echo "================================="

87
docker-compose.yml Normal file
View File

@@ -0,0 +1,87 @@
version: '3.8'
services:
# Redis 服務 (Celery 後端和緩存)
redis:
image: redis:7-alpine
container_name: panjit-translator-redis
ports:
- "6379:6379"
volumes:
- redis_data:/data
restart: unless-stopped
command: redis-server --appendonly yes
# 主應用服務
app:
build:
context: .
dockerfile: Dockerfile
container_name: panjit-translator-app
ports:
- "12010:12010"
volumes:
- ./uploads:/app/uploads
- ./cache:/app/cache
- ./logs:/app/logs
depends_on:
- redis
environment:
- REDIS_URL=redis://redis:6379/0
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:12010/api/v1/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
# Celery Worker 服務
celery-worker:
build:
context: .
dockerfile: Dockerfile
container_name: panjit-translator-worker
volumes:
- ./uploads:/app/uploads
- ./cache:/app/cache
- ./logs:/app/logs
depends_on:
- redis
- app
environment:
- REDIS_URL=redis://redis:6379/0
restart: unless-stopped
command: celery -A app.celery worker --loglevel=info --concurrency=4
healthcheck:
test: ["CMD", "celery", "-A", "app.celery", "inspect", "ping"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
# Celery Beat 調度服務 (可選,如果需要定期任務)
celery-beat:
build:
context: .
dockerfile: Dockerfile
container_name: panjit-translator-beat
volumes:
- ./uploads:/app/uploads
- ./cache:/app/cache
- ./logs:/app/logs
depends_on:
- redis
- app
environment:
- REDIS_URL=redis://redis:6379/0
restart: unless-stopped
command: celery -A app.celery beat --loglevel=info
volumes:
redis_data:
driver: local
networks:
default:
name: panjit-translator-network

File diff suppressed because it is too large Load Diff

View File

@@ -1,107 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
詳細檢查修復後的DOCX翻譯文件內容
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
def examine_fixed_docx():
"""詳細檢查修復後的DOCX文件"""
print("=== 詳細檢查修復後的DOCX翻譯文件 ===")
# 檢查剛生成的測試文件
test_file = r"C:\Users\EGG\AppData\Local\Temp\test_docx_translation\translated_original_-OR026_9c6548ac_en_translat.docx"
try:
from docx import Document
doc = Document(test_file)
print(f"文件: {test_file}")
print(f"總段落數: {len(doc.paragraphs)}")
# 詳細分析每個段落
chinese_only = 0
english_only = 0
mixed = 0
empty = 0
print(f"\n📄 詳細段落分析:")
for i, para in enumerate(doc.paragraphs):
text = para.text.strip()
if not text:
empty += 1
continue
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in text)
has_english = any(ord(c) < 128 and c.isalpha() for c in text)
if has_chinese and has_english:
mixed += 1
status = "🔄 中英混合"
elif has_english:
english_only += 1
status = "🇺🇸 純英文"
elif has_chinese:
chinese_only += 1
status = "🇨🇳 純中文"
else:
status = "❓ 未知"
if i < 20: # 顯示前20段
print(f" 段落 {i+1:2d}: {status} - {text[:80]}...")
print(f"\n📊 統計結果:")
print(f" 空段落: {empty}")
print(f" 純中文段落: {chinese_only}")
print(f" 純英文段落: {english_only}")
print(f" 中英混合段落: {mixed}")
total_content = chinese_only + english_only + mixed
if total_content > 0:
print(f" 中文內容比例: {(chinese_only + mixed) / total_content * 100:.1f}%")
print(f" 英文內容比例: {(english_only + mixed) / total_content * 100:.1f}%")
# 檢查是否有交錯格式
print(f"\n🔍 檢查交錯翻譯格式:")
potential_alternating = 0
for i in range(len(doc.paragraphs) - 1):
current = doc.paragraphs[i].text.strip()
next_para = doc.paragraphs[i + 1].text.strip()
if current and next_para:
current_chinese = any('\u4e00' <= c <= '\u9fff' for c in current)
current_english = any(ord(c) < 128 and c.isalpha() for c in current)
next_chinese = any('\u4e00' <= c <= '\u9fff' for c in next_para)
next_english = any(ord(c) < 128 and c.isalpha() for c in next_para)
# 檢查是否是中文段落後跟英文段落(交錯格式)
if current_chinese and not current_english and next_english and not next_chinese:
potential_alternating += 1
if potential_alternating <= 5: # 顯示前5個交錯範例
print(f" 交錯範例 {potential_alternating}:")
print(f" 中文: {current[:60]}...")
print(f" 英文: {next_para[:60]}...")
if potential_alternating > 0:
print(f" ✅ 發現 {potential_alternating} 個潛在交錯翻譯對")
print(f" 📈 交錯格式覆蓋率: {potential_alternating / (total_content // 2) * 100:.1f}%")
else:
print(f" ❌ 沒有發現明顯的交錯翻譯格式")
except Exception as e:
print(f"❌ 檢查失敗: {e}")
if __name__ == "__main__":
examine_fixed_docx()

View File

@@ -1,167 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修復D2-D8欄位的翻譯快取 - 手動補充正確的翻譯
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
def fix_d_column_translations():
"""修復D2-D8欄位的翻譯快取"""
print("=" * 80)
print("修復D2-D8欄位的翻譯快取")
print("手動補充正確的中文->日文翻譯")
print("=" * 80)
# 根據調試輸出手動提供D2-D8的正確翻譯對照
d_column_translations = [
{
'source_text': '與 WB inline 串線DB→WB、時效快支援 Sn/Au 晶片\n最小可支援9mil晶粒\n支援EAP管控',
'translated_text': 'WBインラインDB→WBによる直列接続で、処理時間が短いSn/Auダイ対応\n最小9milダイ対応\nEAP制御対応'
},
{
'source_text': '空洞表現穩定、尺寸/厚度範圍廣\n最小可支援9mil晶粒\n支援EAP管控',
'translated_text': '空洞の表現が安定している、サイズ/厚さの範囲が広い\n最小9milダイ対応\nEAP制御対応'
},
{
'source_text': 'DB到焊接爐為串機、時效快減少人員碰觸之風險\n支援Ag/Au晶片\n支援含氧量監控\n支援EAP',
'translated_text': 'DBから溶接炉へのインライン接続により処理時間が短く、人員の接触リスクを削減\nAg/Auダイ対応\n酸素含有量監視対応\nEAP対応'
},
{
'source_text': '爐後氣孔少,提升焊接接縫均勻度、強度高、氣密性好\n支援Ag/Au晶片\n支援含氧量監控\n支援EAP',
'translated_text': '炉後の気孔が少なく、溶接継ぎ目の均一性が向上、強度が高く、気密性が良好\nAg/Auダイ対応\n酸素含有量監視対応\nEAP対応'
},
{
'source_text': 'Wire size: 0.8 mil ~ 2.4 mil量產成熟\n最薄 Al bond pad 1.3 μm最小 bond pad size 55 × 55 μm\n支援EAP管控',
'translated_text': 'ワイヤサイズ: 0.8 mil ~ 2.4 mil量産成熟\n最薄 Alボンドパッド 1.3 μm最小ボンドパッドサイズ 55 × 55 μm\nEAP制御対応'
},
{
'source_text': '1.全自動貼片減少人為作業的風險\n2.機台封閉式設計及有HEPA機構能減少落塵造成的異常風險\n3.自動讀取晶片刻號及貼晶片條碼\n支援EAP管控',
'translated_text': '1.全自動貼付により人的作業のリスクを削減\n2.装置の密閉設計およびHEPA機構により落下塵による異常リスクを削減\n3.ダイの刻印とダイバーコードの自動読み取り\nEAP制御対応'
},
{
'source_text': '1.晶片切割後chipping的品質檢驗\n2.晶片上的缺點檢驗',
'translated_text': '1.ダイカット後のチッピング品質検査\n2.ダイ上の欠陥検査'
}
]
app = create_app()
with app.app_context():
from app.models.cache import TranslationCache
from app import db
source_language = 'zh'
target_language = 'ja'
print(f"準備添加 {len(d_column_translations)} 筆D欄位翻譯...")
print("-" * 60)
added_count = 0
updated_count = 0
for i, trans in enumerate(d_column_translations, 2):
source_text = trans['source_text']
translated_text = trans['translated_text']
print(f"\nD{i} 欄位處理:")
print(f" 原文: {repr(source_text[:50])}...")
print(f" 譯文: {repr(translated_text[:50])}...")
# 檢查是否已存在
existing = TranslationCache.get_translation(source_text, source_language, target_language)
if existing:
if existing.strip() != translated_text.strip():
print(f" 🔄 更新現有翻譯")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
updated_count += 1
else:
print(f" ⚠️ 翻譯已存在且相同")
else:
print(f" ✅ 新增翻譯記錄")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
added_count += 1
print(f"\n" + "-" * 60)
print(f"D欄位翻譯補充結果:")
print(f" 新增: {added_count}")
print(f" 更新: {updated_count}")
print(f" 總計: {added_count + updated_count}")
# 驗證結果
print(f"\n驗證補充結果:")
print("-" * 60)
success_count = 0
for i, trans in enumerate(d_column_translations, 2):
source_text = trans['source_text']
cached_translation = TranslationCache.get_translation(source_text, source_language, target_language)
if cached_translation:
if cached_translation.strip() == trans['translated_text'].strip():
print(f"✅ D{i}: 驗證成功")
success_count += 1
else:
print(f"⚠️ D{i}: 驗證失敗 - 內容不一致")
else:
print(f"❌ D{i}: 驗證失敗 - 快取中沒有")
print(f"\n驗證結果: {success_count}/{len(d_column_translations)} 成功")
# 測試整體映射覆蓋率
print(f"\n測試整體映射覆蓋率:")
print("-" * 60)
from app.services.translation_service import ExcelParser
from pathlib import Path
from sqlalchemy import text as sql_text
original_file = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f8b0febc-c0df-4902-8dc3-c90f5634f3b3") / "original_panjit_f8b0febc.xlsx"
if original_file.exists():
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
mapping_count = 0
for segment in segments:
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
row = result.fetchone()
if row:
mapping_count += 1
mapping_rate = mapping_count / len(segments) * 100 if segments else 0
print(f"映射覆蓋率: {mapping_count}/{len(segments)} = {mapping_rate:.1f}%")
if mapping_rate >= 90:
print("🎉 映射覆蓋率優秀!翻譯功能應該正常工作")
elif mapping_rate >= 80:
print("✅ 映射覆蓋率良好,翻譯功能基本正常")
else:
print("⚠️ 映射覆蓋率待改善,部分文字可能無法翻譯")
print(f"\n" + "=" * 80)
print("D欄位翻譯快取修復完成")
print("建議: 重新上傳檔案測試D2-D8翻譯功能")
print("=" * 80)
if __name__ == "__main__":
fix_d_column_translations()

View File

@@ -1,214 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修復韓文翻譯快取問題 - D2-D8欄位韓文翻譯
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import openpyxl
from app import create_app
def fix_korean_translation_cache():
"""修復韓文翻譯快取問題"""
print("=" * 80)
print("修復韓文翻譯快取問題")
print("目標語言: 韓文 (ko)")
print("=" * 80)
# 檢查韓文翻譯檔案
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\98158984-f335-44f5-a0b4-88fb8ccd5d78")
original_file = prod_dir / "original_panjit_98158984.xlsx"
korean_file = prod_dir / "original_panjit_98158984_ko_translated.xlsx"
if not original_file.exists():
print(f"❌ 原始文件不存在: {original_file}")
return
if not korean_file.exists():
print(f"❌ 韓文翻譯文件不存在: {korean_file}")
return
print(f"✅ 檔案確認:")
print(f" 原始: {original_file.name}")
print(f" 韓文: {korean_file.name}")
# 1. 檢查韓文翻譯檔案內容
print(f"\n1. 檢查韓文翻譯檔案內容")
print("-" * 60)
wb_orig = openpyxl.load_workbook(str(original_file), data_only=False)
wb_korean = openpyxl.load_workbook(str(korean_file), data_only=False)
# 檢查D2-D8和F2-F6欄位
problem_cells = ['D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'F2', 'F3', 'F4', 'F5', 'F6']
korean_translations = []
for cell_name in problem_cells:
orig_val = wb_orig.active[cell_name].value
korean_val = wb_korean.active[cell_name].value
if orig_val:
print(f"\n{cell_name}:")
print(f" 原文: {repr(orig_val)}")
print(f" 韓文: {repr(korean_val)}")
# 檢查是否為翻譯格式 (原文\n翻譯)
if isinstance(korean_val, str) and '\n' in korean_val:
lines = korean_val.split('\n')
if len(lines) >= 2:
original_text = lines[0].strip()
translated_text = '\n'.join(lines[1:]).strip()
# 驗證原文是否一致
if isinstance(orig_val, str) and orig_val.strip() == original_text:
korean_translations.append({
'cell': cell_name,
'source_text': original_text,
'translated_text': translated_text
})
print(f" ✅ 已翻譯: '{translated_text[:30]}...'")
else:
print(f" ❌ 原文不一致")
else:
print(f" ❌ 格式異常")
else:
if orig_val == korean_val:
print(f" ❌ 未翻譯")
else:
print(f" ⚠️ 格式不明")
wb_orig.close()
wb_korean.close()
print(f"\n找到 {len(korean_translations)} 個韓文翻譯對照")
# 2. 檢查現有韓文快取
print(f"\n2. 檢查現有韓文快取")
print("-" * 60)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
target_language = 'ko'
source_language = 'zh'
# 檢查韓文快取總數
korean_cache_count = db.session.execute(sql_text("""
SELECT COUNT(*) FROM dt_translation_cache
WHERE target_language = :lang
"""), {'lang': target_language}).fetchone()[0]
print(f"韓文快取總數: {korean_cache_count}")
# 檢查D2-D8是否有韓文快取
missing_korean_cache = []
for trans in korean_translations:
source_text = trans['source_text']
result = db.session.execute(sql_text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': source_text, 'lang': target_language})
row = result.fetchone()
if row:
print(f"{trans['cell']}: 韓文快取已存在 (時間: {row[1]})")
else:
print(f"{trans['cell']}: 韓文快取不存在")
missing_korean_cache.append(trans)
# 3. 補充缺失的韓文快取
if missing_korean_cache:
print(f"\n3. 補充缺失的韓文快取")
print("-" * 60)
from app.models.cache import TranslationCache
added_count = 0
for trans in missing_korean_cache:
source_text = trans['source_text']
translated_text = trans['translated_text']
print(f"✅ 新增 {trans['cell']}: '{source_text[:30]}...' -> '{translated_text[:30]}...'")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
added_count += 1
print(f"\n韓文快取補充完成: 新增 {added_count}")
# 4. 測試韓文翻譯映射
print(f"\n4. 測試韓文翻譯映射")
print("-" * 60)
from app.services.translation_service import ExcelParser
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
print(f"提取文字片段: {len(segments)}")
korean_mapping_count = 0
for segment in segments:
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
row = result.fetchone()
if row:
korean_mapping_count += 1
korean_mapping_rate = korean_mapping_count / len(segments) * 100 if segments else 0
print(f"韓文映射覆蓋率: {korean_mapping_count}/{len(segments)} = {korean_mapping_rate:.1f}%")
if korean_mapping_rate >= 80:
print("✅ 韓文映射覆蓋率良好")
else:
print("⚠️ 韓文映射覆蓋率待改善")
# 顯示缺失的片段
print(f"\n缺失韓文翻譯的片段:")
missing_count = 0
for segment in segments:
if missing_count >= 10: # 只顯示前10個
break
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
if not result.fetchone():
print(f"'{segment[:40]}...'")
missing_count += 1
print(f"\n" + "=" * 80)
print("韓文翻譯快取檢查完成!")
print("如果映射覆蓋率不足,請重新執行翻譯任務或手動補充快取")
print("=" * 80)
if __name__ == "__main__":
fix_korean_translation_cache()

View File

@@ -1,184 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修復Excel翻譯快取缺失問題 - 從已翻譯的Excel檔案中提取翻譯並補充快取
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
import openpyxl
from app import create_app
def extract_translations_from_excel():
"""從已翻譯的Excel檔案中提取翻譯對照"""
print("=" * 80)
print("修復Excel翻譯快取缺失問題")
print("從已翻譯檔案提取翻譯對照並補充快取")
print("=" * 80)
# 使用已翻譯的Excel檔案
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\f8b0febc-c0df-4902-8dc3-c90f5634f3b3")
original_file = prod_dir / "original_panjit_f8b0febc.xlsx"
translated_file = prod_dir / "original_panjit_f8b0febc_ja_translated.xlsx"
if not original_file.exists() or not translated_file.exists():
print("❌ 需要的檔案不存在")
return
# 1. 提取翻譯對照
print("\n1. 提取翻譯對照")
print("-" * 60)
wb_orig = openpyxl.load_workbook(str(original_file), data_only=False)
wb_trans = openpyxl.load_workbook(str(translated_file), data_only=False)
translation_pairs = []
target_language = 'ja'
source_language = 'zh'
# 檢查所有儲存格,找出有翻譯的
for row in range(1, 50): # 前50行應該足夠
for col in range(1, 20): # 前20列
orig_cell = wb_orig.active.cell(row=row, column=col)
trans_cell = wb_trans.active.cell(row=row, column=col)
orig_val = orig_cell.value
trans_val = trans_cell.value
if not orig_val or not trans_val:
continue
# 檢查是否為翻譯格式 (原文\n翻譯)
if isinstance(trans_val, str) and '\n' in trans_val:
lines = trans_val.split('\n')
if len(lines) >= 2:
original_text = lines[0].strip()
translated_text = '\n'.join(lines[1:]).strip()
# 驗證原文是否一致
if isinstance(orig_val, str) and orig_val.strip() == original_text:
cell_name = f"{chr(64+col)}{row}"
translation_pairs.append({
'cell': cell_name,
'source_text': original_text,
'translated_text': translated_text
})
print(f"{cell_name}: '{original_text[:30]}...' -> '{translated_text[:30]}...'")
wb_orig.close()
wb_trans.close()
print(f"\n找到 {len(translation_pairs)} 個翻譯對照")
# 2. 補充到快取中
print(f"\n2. 補充翻譯快取")
print("-" * 60)
app = create_app()
with app.app_context():
from app.models.cache import TranslationCache
from app import db
added_count = 0
updated_count = 0
skipped_count = 0
for pair in translation_pairs:
source_text = pair['source_text']
translated_text = pair['translated_text']
# 檢查是否已存在
existing = TranslationCache.get_translation(source_text, source_language, target_language)
if existing:
if existing.strip() == translated_text.strip():
print(f"⚠️ {pair['cell']}: 快取已存在且相同")
skipped_count += 1
else:
print(f"🔄 {pair['cell']}: 更新快取翻譯")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
updated_count += 1
else:
print(f"{pair['cell']}: 新增快取翻譯")
TranslationCache.save_translation(source_text, source_language, target_language, translated_text)
added_count += 1
print(f"\n快取補充結果:")
print(f" 新增: {added_count}")
print(f" 更新: {updated_count}")
print(f" 跳過: {skipped_count}")
print(f" 總計: {added_count + updated_count + skipped_count}")
# 3. 驗證補充結果
print(f"\n3. 驗證補充結果")
print("-" * 60)
verification_failed = 0
for pair in translation_pairs:
source_text = pair['source_text']
cached_translation = TranslationCache.get_translation(source_text, source_language, target_language)
if cached_translation:
if cached_translation.strip() == pair['translated_text'].strip():
print(f"{pair['cell']}: 驗證成功")
else:
print(f"⚠️ {pair['cell']}: 驗證失敗 - 內容不一致")
verification_failed += 1
else:
print(f"{pair['cell']}: 驗證失敗 - 快取中沒有")
verification_failed += 1
print(f"\n驗證結果: {len(translation_pairs) - verification_failed}/{len(translation_pairs)} 成功")
# 4. 測試新的翻譯映射邏輯
print(f"\n4. 測試翻譯映射邏輯")
print("-" * 60)
from app.services.translation_service import ExcelParser
parser = ExcelParser(str(original_file))
segments = parser.extract_text_segments()
print(f"文字片段提取: {len(segments)}")
from sqlalchemy import text as sql_text
mapping_count = 0
for segment in segments:
result = db.session.execute(sql_text("""
SELECT translated_text
FROM dt_translation_cache
WHERE source_text = :text AND target_language = :lang
ORDER BY created_at DESC
LIMIT 1
"""), {'text': segment, 'lang': target_language})
row = result.fetchone()
if row:
mapping_count += 1
mapping_rate = mapping_count / len(segments) * 100 if segments else 0
print(f"翻譯映射覆蓋率: {mapping_count}/{len(segments)} = {mapping_rate:.1f}%")
if mapping_rate >= 80:
print("✅ 映射覆蓋率良好,翻譯功能應該正常工作")
else:
print("⚠️ 映射覆蓋率不佳,可能仍有部分文字無法翻譯")
print(f"\n" + "=" * 80)
print("Excel翻譯快取修復完成")
print("建議: 重新上傳檔案測試翻譯功能")
print("=" * 80)
if __name__ == "__main__":
extract_translations_from_excel()

View File

@@ -1,113 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修復缺失的翻譯快取記錄
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from app import create_app
from datetime import datetime
def fix_missing_translation_cache():
"""修復缺失的翻譯快取記錄"""
print("=" * 80)
print("修復缺失的翻譯快取記錄")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 需要補充的翻譯記錄
missing_translations = [
{
'source_text': '製程',
'target_language': 'ja',
'translated_text': 'プロセス', # 製程的日文翻譯
'source_language': 'zh'
}
]
print(f"準備添加 {len(missing_translations)} 筆翻譯記錄到快取...")
for translation in missing_translations:
source_text = translation['source_text']
target_language = translation['target_language']
translated_text = translation['translated_text']
source_language = translation['source_language']
# 檢查是否已存在
check_result = db.session.execute(sql_text("""
SELECT id FROM dt_translation_cache
WHERE source_text = :source AND target_language = :target
LIMIT 1
"""), {
'source': source_text,
'target': target_language
})
if check_result.fetchone():
print(f"⚠️ 翻譯記錄已存在: '{source_text}' -> {target_language}")
continue
# 計算source_text_hash
import hashlib
source_text_hash = hashlib.md5(source_text.encode('utf-8')).hexdigest()
# 插入新的翻譯記錄
insert_result = db.session.execute(sql_text("""
INSERT INTO dt_translation_cache
(source_text_hash, source_text, translated_text, source_language, target_language)
VALUES (:source_hash, :source, :translated, :source_lang, :target_lang)
"""), {
'source_hash': source_text_hash,
'source': source_text,
'translated': translated_text,
'source_lang': source_language,
'target_lang': target_language
})
print(f"✅ 已添加翻譯記錄: '{source_text}' -> '{translated_text}' ({target_language})")
# 提交變更
db.session.commit()
print(f"\n✅ 所有翻譯記錄已提交到資料庫")
# 驗證添加結果
print(f"\n驗證翻譯記錄:")
for translation in missing_translations:
source_text = translation['source_text']
target_language = translation['target_language']
verify_result = db.session.execute(sql_text("""
SELECT translated_text, created_at
FROM dt_translation_cache
WHERE source_text = :source AND target_language = :target
ORDER BY created_at DESC
LIMIT 1
"""), {
'source': source_text,
'target': target_language
})
row = verify_result.fetchone()
if row:
print(f"'{source_text}' -> '{row[0]}' (時間: {row[1]})")
else:
print(f"❌ 驗證失敗: '{source_text}'")
print(f"\n" + "=" * 80)
print("修復完成!")
print("=" * 80)
if __name__ == "__main__":
fix_missing_translation_cache()

View File

@@ -1,69 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修正通知表結構腳本
"""
import pymysql
import os
from dotenv import load_dotenv
# 載入環境變數
load_dotenv('C:/Users/EGG/WORK/data/user_scrip/TOOL/env.txt')
def fix_notification_table():
"""修正通知表的欄位名稱"""
try:
# 連接資料庫
connection = pymysql.connect(
host=os.getenv('DB_HOST', 'localhost'),
user=os.getenv('DB_USER', 'root'),
password=os.getenv('DB_PASSWORD', ''),
database=os.getenv('DB_NAME', 'doc_translator'),
charset='utf8mb4'
)
with connection.cursor() as cursor:
# 檢查 dt_notifications 表結構
cursor.execute('DESCRIBE dt_notifications')
columns = cursor.fetchall()
print('Current table structure:')
for col in columns:
print(f' {col[0]} {col[1]}')
# 檢查是否有 metadata 欄位
has_metadata = any(col[0] == 'metadata' for col in columns)
has_extra_data = any(col[0] == 'extra_data' for col in columns)
print(f'\nHas metadata column: {has_metadata}')
print(f'Has extra_data column: {has_extra_data}')
if has_metadata and not has_extra_data:
print('\nRenaming metadata column to extra_data...')
cursor.execute('ALTER TABLE dt_notifications CHANGE metadata extra_data JSON NULL COMMENT "額外數據"')
connection.commit()
print('✅ Column renamed successfully')
# 再次檢查結構
cursor.execute('DESCRIBE dt_notifications')
columns = cursor.fetchall()
print('\nUpdated table structure:')
for col in columns:
print(f' {col[0]} {col[1]}')
elif has_extra_data:
print('✅ extra_data column already exists')
else:
print('❌ Neither metadata nor extra_data column found')
connection.close()
print('\n✅ Database structure check completed')
except Exception as e:
print(f'❌ Error fixing notification table: {e}')
import traceback
traceback.print_exc()
if __name__ == '__main__':
fix_notification_table()

View File

@@ -1,42 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修正通知表腳本
"""
from app import create_app, db
from sqlalchemy import text
def fix_notification_table():
"""修正通知表"""
try:
app = create_app()
with app.app_context():
print("Fixing notification table...")
# 刪除通知表(如果存在)
try:
db.session.execute(text('DROP TABLE IF EXISTS dt_notifications'))
db.session.commit()
print("Old notification table dropped")
except Exception as e:
print(f"Info: {e}")
# 重新創建通知表
db.create_all()
print("New notification table created with correct structure")
# 檢查表結構
result = db.session.execute(text('DESCRIBE dt_notifications'))
columns = result.fetchall()
print("New table structure:")
for col in columns:
print(f" {col[0]} {col[1]}")
except Exception as e:
print(f"Error fixing notification table: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
fix_notification_table()

View File

@@ -1,64 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
修復用戶ID不匹配問題
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from sqlalchemy import text
def fix_user_id():
"""修復用戶ID - 將ID從1改回2以匹配JWT Token"""
app = create_app()
with app.app_context():
print("=== 修復用戶ID不匹配問題 ===")
try:
# 停用外鍵檢查
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 0"))
# 將用戶ID從1改為2
result = db.session.execute(text("UPDATE dt_users SET id = 2 WHERE id = 1"))
print(f"更新了 {result.rowcount} 筆用戶記錄")
# 重新設定自增起始值
db.session.execute(text("ALTER TABLE dt_users AUTO_INCREMENT = 3"))
# 重新啟用外鍵檢查
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 1"))
db.session.commit()
print("✅ 用戶ID已從1改為2匹配JWT Token")
# 驗證
user = db.session.execute(text("SELECT id, username, email FROM dt_users")).fetchone()
if user:
print(f"確認用戶: ID={user[0]}, 用戶名={user[1]}, Email={user[2]}")
except Exception as e:
print(f"❌ 修復失敗: {str(e)}")
db.session.rollback()
# 確保重新啟用外鍵檢查
try:
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 1"))
db.session.commit()
except:
pass
raise
if __name__ == "__main__":
fix_user_id()

View File

@@ -246,13 +246,7 @@ const handleMenuClick = () => {
const showNotifications = async () => {
notificationDrawerVisible.value = true
// 載入最新通知
console.log('🔔 正在載入通知...')
try {
await notificationStore.fetchNotifications()
console.log('🔔 通知載入完成:', notificationStore.notifications.length)
} catch (error) {
console.error('🔔 通知載入失敗:', error)
}
await notificationStore.fetchNotifications()
}
const handleUserMenuCommand = async (command) => {
@@ -337,12 +331,7 @@ onMounted(() => {
// initWebSocket()
// 載入通知
console.log('🔔 初始化載入通知...')
notificationStore.fetchNotifications().then(() => {
console.log('🔔 初始化通知載入完成:', notificationStore.notifications.length)
}).catch(error => {
console.error('🔔 初始化通知載入失敗:', error)
})
notificationStore.fetchNotifications()
// 監聽窗口大小變化
window.addEventListener('resize', handleResize)

View File

@@ -6,7 +6,7 @@ import NProgress from 'nprogress'
// 創建 axios 實例
const service = axios.create({
baseURL: import.meta.env.VITE_API_BASE_URL || 'http://127.0.0.1:5000/api/v1',
baseURL: import.meta.env.VITE_API_BASE_URL || 'http://127.0.0.1:12010/api/v1',
timeout: 30000, // 30秒超時
headers: {
'Content-Type': 'application/json'
@@ -18,20 +18,11 @@ service.interceptors.request.use(
config => {
NProgress.start()
console.log('🚀 [API Request]', {
method: config.method.toUpperCase(),
url: config.url,
baseURL: config.baseURL,
fullURL: `${config.baseURL}${config.url}`,
headers: config.headers,
timestamp: new Date().toISOString()
})
// JWT 認證:添加 Authorization header
const authStore = useAuthStore()
if (authStore.token) {
config.headers.Authorization = `Bearer ${authStore.token}`
console.log('🔑 [JWT Token]', `Bearer ${authStore.token.substring(0, 20)}...`)
}
return config
@@ -171,7 +162,7 @@ service.interceptors.response.use(
// 檔案上傳專用請求實例
export const uploadRequest = axios.create({
baseURL: import.meta.env.VITE_API_BASE_URL || 'http://127.0.0.1:5000/api/v1',
baseURL: import.meta.env.VITE_API_BASE_URL || 'http://127.0.0.1:12010/api/v1',
timeout: 120000, // 2分鐘超時
headers: {
'Content-Type': 'multipart/form-data'

View File

@@ -26,7 +26,7 @@ class WebSocketService {
try {
// 建立 Socket.IO 連接
const wsUrl = import.meta.env.VITE_WS_BASE_URL || 'http://127.0.0.1:5000'
const wsUrl = import.meta.env.VITE_WS_BASE_URL || 'http://127.0.0.1:12010'
console.log('🔌 [WebSocket] 嘗試連接到:', wsUrl)
this.socket = io(wsUrl, {

View File

@@ -1,9 +0,0 @@
HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.12.10
Date: Tue, 02 Sep 2025 00:55:49 GMT
Content-Type: application/json
Content-Length: 215
Vary: Cookie
Set-Cookie: session=eyJpc19hZG1pbiI6dHJ1ZSwidXNlcl9pZCI6MSwidXNlcm5hbWUiOiJ5bWlybGl1In0.aLZAlQ.40ecGXMyL7P1TWYKutdgMnOZGl0; HttpOnly; Path=/
Connection: close

View File

@@ -1,51 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
初始化應用程式腳本
"""
import os
import sys
sys.path.append('.')
def init_application():
"""初始化應用程式"""
try:
print("Initializing application...")
from app import create_app
app = create_app('development')
print("App created successfully")
with app.app_context():
from app import db
print("Database tables created")
# 檢查表格是否建立
import pymysql
connection = pymysql.connect(
host='mysql.theaken.com',
port=33306,
user='A060',
password='WLeSCi0yhtc7',
database='db_A060',
charset='utf8mb4'
)
cursor = connection.cursor()
cursor.execute('SHOW TABLES LIKE "dt_%"')
tables = cursor.fetchall()
print("\nDocument Translator Tables:")
for table in tables:
print(f"- {table[0]}")
connection.close()
return True
except Exception as e:
print(f"Initialization failed: {e}")
return False
if __name__ == '__main__':
init_application()

View File

@@ -1,41 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
重新創建通知表腳本
"""
from app import create_app, db
from app.models import Notification
def recreate_notification_table():
"""重新創建通知表"""
try:
app = create_app()
with app.app_context():
print("Recreating notification table...")
# 刪除通知表(如果存在)
try:
db.engine.execute('DROP TABLE IF EXISTS dt_notifications')
print("✅ Old notification table dropped")
except Exception as e:
print(f"Info: {e}")
# 重新創建通知表
db.create_all()
print("✅ New notification table created with correct structure")
# 檢查表結構
result = db.engine.execute('DESCRIBE dt_notifications')
columns = result.fetchall()
print("\nNew table structure:")
for col in columns:
print(f" {col[0]} {col[1]}")
except Exception as e:
print(f"Error recreating notification table: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
recreate_notification_table()

View File

@@ -1,119 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
重新生成正確的韓文翻譯Excel檔案
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app import create_app
def regenerate_korean_excel():
"""重新生成韓文翻譯Excel檔案"""
print("=" * 80)
print("重新生成韓文翻譯Excel檔案")
print("使用補充後的韓文快取 (覆蓋率: 97.4%)")
print("=" * 80)
# 檔案路徑
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\98158984-f335-44f5-a0b4-88fb8ccd5d78")
original_file = prod_dir / "original_panjit_98158984.xlsx"
if not original_file.exists():
print(f"❌ 原始文件不存在: {original_file}")
return
print(f"✅ 原始文件: {original_file.name}")
app = create_app()
with app.app_context():
from app.services.translation_service import ExcelParser
from app import db
try:
print(f"\n1. 創建Excel解析器")
print("-" * 60)
parser = ExcelParser(str(original_file))
print(f"✅ Excel解析器創建成功")
print(f"\n2. 生成韓文翻譯檔案")
print("-" * 60)
# 使用空的translations字典讓系統從快取中查詢
translated_file_path = parser.generate_translated_document(
translations={},
target_language='ko',
output_dir=prod_dir
)
print(f"✅ 韓文翻譯檔案已生成: {Path(translated_file_path).name}")
print(f"\n3. 驗證翻譯結果")
print("-" * 60)
import openpyxl
# 檢查新生成的翻譯檔案
wb_trans = openpyxl.load_workbook(translated_file_path, data_only=False)
# 檢查關鍵儲存格
test_cells = ['D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'F2', 'F3', 'F4', 'F5', 'F6']
translated_count = 0
for cell_name in test_cells:
cell_val = wb_trans.active[cell_name].value
if isinstance(cell_val, str) and '\n' in cell_val:
lines = cell_val.split('\n')
if len(lines) >= 2:
original_text = lines[0].strip()
translated_text = '\n'.join(lines[1:]).strip()
print(f"{cell_name}: 已翻譯")
print(f" 原文: {original_text[:30]}...")
print(f" 韓文: {translated_text[:30]}...")
translated_count += 1
else:
print(f"{cell_name}: 格式異常")
else:
print(f"{cell_name}: 未翻譯")
wb_trans.close()
print(f"\n翻譯檢查結果: {translated_count}/{len(test_cells)} 個儲存格成功翻譯")
if translated_count >= len(test_cells) * 0.8: # 80%以上成功
print("🎉 韓文翻譯檔案生成成功!")
print(f" 檔案位置: {translated_file_path}")
print(" 大部分內容已正確翻譯")
else:
print("⚠️ 翻譯檔案生成部分成功,但部分內容可能未翻譯")
# 4. 提供下載資訊
print(f"\n4. 下載資訊")
print("-" * 60)
print(f"韓文翻譯檔案已準備就緒:")
print(f" 檔案名稱: {Path(translated_file_path).name}")
print(f" 檔案路徑: {translated_file_path}")
print(f" 檔案大小: {Path(translated_file_path).stat().st_size / 1024:.1f} KB")
except Exception as e:
print(f"❌ 生成韓文翻譯檔案時發生錯誤: {str(e)}")
import traceback
print(f"錯誤詳情: {traceback.format_exc()}")
print(f"\n" + "=" * 80)
print("韓文翻譯Excel檔案重新生成完成")
print("現在D2-D8和F2-F6欄位應該都有正確的韓文翻譯")
print("=" * 80)
if __name__ == "__main__":
regenerate_korean_excel()

View File

@@ -1,160 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
使用修復後的邏輯重新生成韓文Excel檔案
預期: 使用原始DIFY翻譯而非手動補充翻譯
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app import create_app
def regenerate_with_original_dify():
"""使用原始DIFY翻譯重新生成韓文Excel檔案"""
print("=" * 80)
print("使用修復後的邏輯重新生成韓文Excel檔案")
print("預期: D2應該使用原始DIFY翻譯 (包含 '와이어 본딩')")
print("=" * 80)
# 檔案路徑
prod_dir = Path(r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\98158984-f335-44f5-a0b4-88fb8ccd5d78")
original_file = prod_dir / "original_panjit_98158984.xlsx"
if not original_file.exists():
print(f"❌ 原始文件不存在: {original_file}")
return
print(f"✅ 原始文件: {original_file.name}")
app = create_app()
with app.app_context():
from app.services.translation_service import ExcelParser
from app import db
import openpyxl
try:
print(f"\n1. 重新生成韓文翻譯檔案")
print("-" * 60)
parser = ExcelParser(str(original_file))
# 生成新的翻譯檔案 (會覆蓋舊的)
translated_file_path = parser.generate_translated_document(
translations={},
target_language='ko',
output_dir=prod_dir
)
print(f"✅ 韓文翻譯檔案已重新生成: {Path(translated_file_path).name}")
print(f"\n2. 驗證D2是否使用原始DIFY翻譯")
print("-" * 60)
# 檢查新生成的D2內容
wb_trans = openpyxl.load_workbook(translated_file_path, data_only=False)
d2_value = wb_trans.active['D2'].value
print(f"D2翻譯內容:")
print(f" {repr(d2_value)}")
# 檢查翻譯來源特徵
if isinstance(d2_value, str) and '\n' in d2_value:
lines = d2_value.split('\n')
if len(lines) >= 2:
korean_part = lines[1] # 第二行是韓文翻譯
if "와이어 본딩" in korean_part:
print(f" 🎯 ✅ 使用原始DIFY翻譯")
print(f" 特徵: 包含 '와이어 본딩'")
print(f" 韓文: {korean_part}")
result = "SUCCESS_ORIGINAL"
elif "연결" in korean_part:
print(f" ✋ ❌ 仍在使用手動補充翻譯")
print(f" 特徵: 包含 '연결'")
print(f" 韓文: {korean_part}")
result = "STILL_MANUAL"
else:
print(f" ❓ 無法判斷翻譯來源")
print(f" 韓文: {korean_part}")
result = "UNKNOWN"
else:
print(f" ❌ 格式異常,不是雙行格式")
result = "FORMAT_ERROR"
else:
print(f" ❌ D2沒有翻譯或格式不正確")
result = "NO_TRANSLATION"
wb_trans.close()
# 3. 檢查其他關鍵儲存格
print(f"\n3. 檢查其他關鍵儲存格")
print("-" * 60)
wb_trans = openpyxl.load_workbook(translated_file_path, data_only=False)
test_cells = ['D3', 'D4', 'D5']
translated_cells = 0
for cell_name in test_cells:
cell_value = wb_trans.active[cell_name].value
if isinstance(cell_value, str) and '\n' in cell_value:
lines = cell_value.split('\n')
if len(lines) >= 2:
korean_part = lines[1]
print(f"{cell_name}: 已翻譯")
print(f" 韓文: {korean_part[:30]}...")
translated_cells += 1
else:
print(f"{cell_name}: 格式異常")
else:
print(f"{cell_name}: 未翻譯")
print(f"\n其他儲存格翻譯狀況: {translated_cells}/{len(test_cells)} 成功")
wb_trans.close()
# 4. 最終結果評估
print(f"\n4. 最終結果評估")
print("-" * 60)
if result == "SUCCESS_ORIGINAL":
print(f"🎉 完美!修復成功")
print(f" ✅ D2正確使用原始DIFY翻譯")
print(f" ✅ 翻譯品質: 原始API翻譯 (更準確)")
print(f" ✅ 問題根源已解決: 文字格式不匹配")
elif result == "STILL_MANUAL":
print(f"⚠️ 部分成功")
print(f" ❌ D2仍使用手動翻譯")
print(f" ❓ 可能需要檢查查詢邏輯或重新啟動Celery")
else:
print(f"❌ 修復失敗")
print(f" 需要進一步排查問題")
# 5. 檔案資訊
print(f"\n5. 檔案資訊")
print("-" * 60)
print(f"韓文翻譯檔案:")
print(f" 檔案名稱: {Path(translated_file_path).name}")
print(f" 檔案路徑: {translated_file_path}")
print(f" 檔案大小: {Path(translated_file_path).stat().st_size / 1024:.1f} KB")
except Exception as e:
print(f"❌ 重新生成韓文翻譯檔案時發生錯誤: {str(e)}")
import traceback
print(f"錯誤詳情: {traceback.format_exc()}")
print(f"\n" + "=" * 80)
print("使用原始DIFY翻譯重新生成完成")
print("=" * 80)
if __name__ == "__main__":
regenerate_with_original_dify()

View File

@@ -4,6 +4,7 @@ Flask-SQLAlchemy==3.1.1
Flask-Session==0.5.0
Flask-Cors==4.0.0
Flask-SocketIO==5.3.6
Flask-JWT-Extended==4.6.0
# Database
PyMySQL==1.1.0

View File

@@ -1,122 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
重置資料庫 - 清空除 dt_users 外的所有表並將用戶ID設為1
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.models import User, TranslationJob, JobFile, TranslationCache, APIUsageStats, SystemLog
from sqlalchemy import text
def reset_database():
"""重置資料庫"""
app = create_app()
with app.app_context():
print("=== 開始重置資料庫 ===")
try:
# 1. 先檢查現有用戶
users = User.query.all()
print(f"當前用戶數量: {len(users)}")
for user in users:
print(f" ID: {user.id}, 用戶名: {user.username}, Email: {user.email}")
if len(users) != 1:
print("❌ 錯誤:應該只有一個用戶")
return
current_user = users[0]
print(f"\n準備將用戶 ID {current_user.id} 改為 1")
# 2. 停用外鍵檢查MySQL
print("\n⏳ 停用外鍵檢查...")
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 0"))
# 3. 清空相關表格(按依賴順序)
print("\n🗑️ 清空相關表格...")
# API使用統計
deleted_stats = db.session.execute(text("DELETE FROM dt_api_usage_stats")).rowcount
print(f" 已刪除 {deleted_stats} 筆 API 使用記錄")
# 系統日誌
deleted_logs = db.session.execute(text("DELETE FROM dt_system_logs")).rowcount
print(f" 已刪除 {deleted_logs} 筆系統日誌")
# 翻譯檔案
deleted_files = db.session.execute(text("DELETE FROM dt_job_files")).rowcount
print(f" 已刪除 {deleted_files} 筆檔案記錄")
# 翻譯任務
deleted_jobs = db.session.execute(text("DELETE FROM dt_translation_jobs")).rowcount
print(f" 已刪除 {deleted_jobs} 筆翻譯任務")
# 翻譯快取
deleted_cache = db.session.execute(text("DELETE FROM dt_translation_cache")).rowcount
print(f" 已刪除 {deleted_cache} 筆翻譯快取")
# 4. 更新用戶ID為1
print(f"\n🔄 更新用戶ID從 {current_user.id} 到 1...")
if current_user.id != 1:
db.session.execute(text("UPDATE dt_users SET id = 1 WHERE id = :old_id"), {'old_id': current_user.id})
db.session.execute(text("ALTER TABLE dt_users AUTO_INCREMENT = 2"))
print(" ✅ 用戶ID已更新為 1")
else:
print(" ✅ 用戶ID已經是 1")
# 5. 重新啟用外鍵檢查
print("\n⚡ 重新啟用外鍵檢查...")
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 1"))
# 6. 提交所有變更
db.session.commit()
# 7. 驗證結果
print("\n✅ 驗證結果:")
users_after = User.query.all()
for user in users_after:
print(f" 用戶 ID: {user.id}, 用戶名: {user.username}, Email: {user.email}")
jobs_count = db.session.execute(text("SELECT COUNT(*) FROM dt_translation_jobs")).scalar()
files_count = db.session.execute(text("SELECT COUNT(*) FROM dt_job_files")).scalar()
cache_count = db.session.execute(text("SELECT COUNT(*) FROM dt_translation_cache")).scalar()
stats_count = db.session.execute(text("SELECT COUNT(*) FROM dt_api_usage_stats")).scalar()
logs_count = db.session.execute(text("SELECT COUNT(*) FROM dt_system_logs")).scalar()
print(f" 翻譯任務: {jobs_count}")
print(f" 檔案記錄: {files_count}")
print(f" 翻譯快取: {cache_count}")
print(f" API統計: {stats_count}")
print(f" 系統日誌: {logs_count}")
print(f"\n🎉 資料庫重置完成!")
print(f" - 保留用戶: ID=1, {users_after[0].username}")
print(f" - 清空了所有翻譯相關資料")
print(f" - 系統已準備好重新開始測試")
except Exception as e:
print(f"❌ 重置失敗: {str(e)}")
db.session.rollback()
# 確保重新啟用外鍵檢查
try:
db.session.execute(text("SET FOREIGN_KEY_CHECKS = 1"))
db.session.commit()
except:
pass
raise
if __name__ == "__main__":
reset_database()

View File

@@ -1,13 +0,0 @@
HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.12.10
Date: Tue, 02 Sep 2025 01:04:08 GMT
Content-Type: application/json
Content-Length: 470
Access-Control-Allow-Origin: http://localhost:3000
Access-Control-Allow-Headers: Content-Type, Authorization, X-Requested-With
Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS, PATCH
Access-Control-Allow-Credentials: true
Access-Control-Max-Age: 86400
Set-Cookie: session=12b5b095-569b-4914-9b32-9e45ead26694; Expires=Wed, 03 Sep 2025 01:04:08 GMT; HttpOnly; Path=/
Connection: close

View File

@@ -1,36 +0,0 @@
@echo off
REM 測試執行腳本
echo ========================================
echo 執行 PANJIT Document Translator 測試
echo ========================================
REM 啟動虛擬環境
if exist "venv\Scripts\activate.bat" (
call venv\Scripts\activate.bat
) else (
echo 錯誤: 找不到虛擬環境,請先執行 start_dev.bat
pause
exit /b 1
)
REM 安裝測試依賴
echo 安裝測試依賴...
pip install pytest pytest-cov pytest-mock
REM 執行測試
echo 執行單元測試...
python -m pytest tests/ -v --tb=short
REM 生成測試覆蓋率報告
echo.
echo 生成測試覆蓋率報告...
python -m pytest tests/ --cov=app --cov-report=html --cov-report=term-missing
echo.
echo ========================================
echo 測試完成!
echo 覆蓋率報告已生成到: htmlcov/index.html
echo ========================================
pause

View File

@@ -1,42 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
簡單檢查任務
"""
import sys
import os
# 添加 app 路徑
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
def main():
from app import create_app
from app.models.job import TranslationJob
from app.services.translation_service import TranslationService
app = create_app()
with app.app_context():
# 查詢等待中的任務
pending_jobs = TranslationJob.query.filter_by(status='PENDING').all()
print(f"等待中的任務數量: {len(pending_jobs)}")
if pending_jobs:
job = pending_jobs[0] # 處理第一個任務
try:
print(f"開始處理任務: {job.job_uuid}")
service = TranslationService()
result = service.translate_document(job.job_uuid)
print(f"處理完成: success={result.get('success', False)}")
if result.get('success'):
print(f"翻譯檔案: {len(result.get('output_files', []))}")
print(f"總成本: ${result.get('total_cost', 0)}")
except Exception as e:
print(f"處理失敗: {str(e)}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
main()

View File

@@ -1,82 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
啟動 Celery Worker 的輔助腳本
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
# 將 app 目錄加入 sys.path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
def start_celery_worker():
"""啟動 Celery Worker"""
print("正在啟動 Celery Worker...")
# 設置環境變數
os.environ.setdefault('FLASK_ENV', 'development')
# 導入應用
from app import create_app
app = create_app()
print(f"Flask 應用已創建: {app}")
print(f"Celery 實例: {app.celery}")
# 啟動 Celery Worker
# Windows 需要使用 --pool=solo 參數
print("正在啟動 Celery WorkerWindows 模式)...")
# 使用 subprocess 啟動 celery worker
import subprocess
cmd = [
sys.executable, '-m', 'celery',
'-A', 'app.celery',
'worker',
'--loglevel=info',
'--pool=solo'
]
print(f"執行命令: {' '.join(cmd)}")
try:
# 切換到正確的目錄
os.chdir(os.path.dirname(__file__))
# 啟動進程
process = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
encoding='utf-8',
bufsize=1,
universal_newlines=True
)
print("Celery Worker 已啟動PID:", process.pid)
print("正在監控輸出...")
# 即時顯示輸出
for line in iter(process.stdout.readline, ''):
print(line.rstrip())
except KeyboardInterrupt:
print("\n收到中斷信號,正在停止 Celery Worker...")
if 'process' in locals():
process.terminate()
sys.exit(0)
except Exception as e:
print(f"啟動 Celery Worker 時發生錯誤: {e}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
start_celery_worker()

View File

@@ -1,115 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
測試管理後台 API
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
import requests
import json
from app import create_app
from app.models.user import User
from flask_jwt_extended import create_access_token
def test_admin_api():
"""測試管理後台 API 認證"""
app = create_app()
with app.app_context():
# 找到管理員用戶
admin_user = User.query.filter_by(is_admin=True).first()
if not admin_user:
print("❌ 找不到管理員用戶")
return
print(f"✅ 找到管理員用戶: {admin_user.username} (ID: {admin_user.id})")
# 創建JWT token
token = create_access_token(
identity=admin_user.username,
additional_claims={
'user_id': admin_user.id,
'is_admin': admin_user.is_admin
}
)
print(f"✅ 創建JWT token: {token[:50]}...")
# 測試API調用
base_url = "http://127.0.0.1:5000/api/v1"
headers = {
'Authorization': f'Bearer {token}',
'Content-Type': 'application/json'
}
# 測試各個管理後台API端點
test_endpoints = [
('GET', '/admin/stats', '系統統計'),
('GET', '/admin/jobs', '任務列表'),
('GET', '/admin/users', '用戶列表'),
('GET', '/admin/api-usage', 'API使用統計'),
('GET', '/admin/cache/stats', '快取統計'),
('GET', '/admin/health', '系統健康狀態'),
('GET', '/admin/metrics', '系統指標'),
]
for method, endpoint, name in test_endpoints:
print(f"\n🧪 測試 {name}: {method} {endpoint}")
try:
if method == 'GET':
response = requests.get(f"{base_url}{endpoint}", headers=headers, timeout=10)
else:
response = requests.request(method, f"{base_url}{endpoint}", headers=headers, timeout=10)
print(f"📊 狀態碼: {response.status_code}")
if response.status_code == 200:
try:
data = response.json()
if data.get('success'):
print(f"{name} API 測試成功")
# 顯示部分回傳數據
if 'data' in data:
data_keys = list(data['data'].keys()) if isinstance(data['data'], dict) else 'Array'
print(f" 數據鍵值: {data_keys}")
else:
print(f"{name} API 返回失敗: {data.get('message', 'Unknown error')}")
except json.JSONDecodeError:
print(f"{name} API 返回非JSON格式數據")
elif response.status_code == 401:
print(f"{name} API 認證失敗 (401 Unauthorized)")
print(f" 錯誤信息: {response.text}")
elif response.status_code == 403:
print(f"{name} API 權限不足 (403 Forbidden)")
print(f" 錯誤信息: {response.text}")
else:
print(f"{name} API 測試失敗 ({response.status_code})")
print(f" 錯誤信息: {response.text}")
except requests.exceptions.ConnectionError:
print(f"❌ 無法連接到伺服器請確認Flask應用正在運行")
except requests.exceptions.Timeout:
print(f"❌ 請求超時")
except Exception as e:
print(f"❌ 測試發生錯誤: {e}")
print(f"\n=== 測試完成 ===")
if __name__ == "__main__":
test_admin_api()

View File

@@ -1,267 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
API測試腳本
"""
import requests
import json
import sys
import time
from multiprocessing import Process
def start_flask_app():
"""在子進程中啟動Flask應用"""
try:
# 簡化的Flask應用啟動
from flask import Flask, jsonify, request, session
import pymysql
app = Flask(__name__)
app.config['SECRET_KEY'] = 'test-secret-key'
@app.route('/health')
def health():
"""健康檢查API"""
return jsonify({
'status': 'ok',
'timestamp': time.time()
})
@app.route('/api/v1/auth/login', methods=['POST'])
def login():
"""簡化的登入API"""
try:
data = request.get_json()
username = data.get('username')
password = data.get('password')
if not username or not password:
return jsonify({
'success': False,
'error': 'MISSING_CREDENTIALS',
'message': '缺少帳號或密碼'
}), 400
# 測試LDAP認證
import ldap3
from ldap3 import Server, Connection, ALL
server = Server('panjit.com.tw', port=389, get_info=ALL)
bind_dn = "CN=LdapBind,CN=Users,DC=PANJIT,DC=COM,DC=TW"
bind_password = "panjit2481"
service_conn = Connection(server, user=bind_dn, password=bind_password, auto_bind=True)
# 搜尋使用者
search_base = "OU=PANJIT,DC=panjit,DC=com,DC=tw"
search_filter = f"(userPrincipalName={username})"
result = service_conn.search(search_base, search_filter,
attributes=['displayName', 'mail', 'department', 'distinguishedName'])
if not result or not service_conn.entries:
service_conn.unbind()
return jsonify({
'success': False,
'error': 'USER_NOT_FOUND',
'message': '使用者不存在'
}), 404
user_entry = service_conn.entries[0]
user_dn = str(user_entry.distinguishedName)
# 驗證使用者密碼
user_conn = Connection(server, user=user_dn, password=password)
if not user_conn.bind():
service_conn.unbind()
return jsonify({
'success': False,
'error': 'INVALID_PASSWORD',
'message': '密碼錯誤'
}), 401
user_conn.unbind()
service_conn.unbind()
# 模擬成功登入
user_info = {
'id': 1,
'username': username.split('@')[0],
'display_name': str(user_entry.displayName) if user_entry.displayName else username,
'email': str(user_entry.mail) if user_entry.mail else username,
'department': str(user_entry.department) if user_entry.department else 'Unknown',
'is_admin': username.lower() == 'ymirliu@panjit.com.tw'
}
# 設定session
session['user_id'] = user_info['id']
session['username'] = user_info['username']
session['is_admin'] = user_info['is_admin']
return jsonify({
'success': True,
'data': {
'user': user_info
},
'message': '登入成功'
})
except Exception as e:
print(f"Login error: {e}")
return jsonify({
'success': False,
'error': 'INTERNAL_ERROR',
'message': f'系統錯誤: {str(e)}'
}), 500
@app.route('/api/v1/auth/me')
def get_current_user():
"""取得當前使用者"""
user_id = session.get('user_id')
if not user_id:
return jsonify({
'success': False,
'error': 'NOT_AUTHENTICATED',
'message': '未登入'
}), 401
return jsonify({
'success': True,
'data': {
'user': {
'id': user_id,
'username': session.get('username'),
'is_admin': session.get('is_admin', False)
}
}
})
print("Starting test Flask server on port 5000...")
app.run(host='127.0.0.1', port=5000, debug=False)
except Exception as e:
print(f"Flask app failed to start: {e}")
def test_apis():
"""測試API端點"""
base_url = 'http://127.0.0.1:5000'
# 等待Flask應用啟動
print("Waiting for Flask server to start...")
time.sleep(3)
test_results = []
# 1. 測試健康檢查
try:
response = requests.get(f'{base_url}/health', timeout=5)
if response.status_code == 200:
test_results.append(('Health Check', 'PASS'))
print("✓ Health check API works")
else:
test_results.append(('Health Check', 'FAIL'))
print(f"✗ Health check failed: {response.status_code}")
except Exception as e:
test_results.append(('Health Check', 'FAIL'))
print(f"✗ Health check failed: {e}")
# 2. 測試登入API無效憑證
try:
login_data = {
'username': 'invalid@panjit.com.tw',
'password': 'wrongpassword'
}
response = requests.post(f'{base_url}/api/v1/auth/login',
json=login_data, timeout=10)
if response.status_code == 404:
test_results.append(('Invalid Login', 'PASS'))
print("✓ Invalid login properly rejected")
else:
test_results.append(('Invalid Login', 'FAIL'))
print(f"✗ Invalid login test failed: {response.status_code}")
except Exception as e:
test_results.append(('Invalid Login', 'FAIL'))
print(f"✗ Invalid login test failed: {e}")
# 3. 測試登入API有效憑證
try:
login_data = {
'username': 'ymirliu@panjit.com.tw',
'password': 'ˇ3EDC4rfv5tgb' # 使用提供的測試密碼
}
response = requests.post(f'{base_url}/api/v1/auth/login',
json=login_data, timeout=15)
if response.status_code == 200:
result = response.json()
if result.get('success'):
test_results.append(('Valid Login', 'PASS'))
print("✓ Valid login successful")
# 保存session cookies
cookies = response.cookies
# 4. 測試取得當前使用者
try:
me_response = requests.get(f'{base_url}/api/v1/auth/me',
cookies=cookies, timeout=5)
if me_response.status_code == 200:
me_result = me_response.json()
if me_result.get('success'):
test_results.append(('Get Current User', 'PASS'))
print("✓ Get current user API works")
else:
test_results.append(('Get Current User', 'FAIL'))
else:
test_results.append(('Get Current User', 'FAIL'))
except Exception as e:
test_results.append(('Get Current User', 'FAIL'))
print(f"✗ Get current user failed: {e}")
else:
test_results.append(('Valid Login', 'FAIL'))
print(f"✗ Login failed: {result.get('message', 'Unknown error')}")
else:
test_results.append(('Valid Login', 'FAIL'))
print(f"✗ Valid login failed: {response.status_code}")
if response.headers.get('content-type', '').startswith('application/json'):
print(f"Response: {response.json()}")
except Exception as e:
test_results.append(('Valid Login', 'FAIL'))
print(f"✗ Valid login test failed: {e}")
# 輸出測試結果
print("\n=== API Test Results ===")
for test_name, result in test_results:
print(f"{test_name}: {result}")
passed = sum(1 for _, result in test_results if result == 'PASS')
total = len(test_results)
print(f"\nPassed: {passed}/{total}")
return test_results
if __name__ == '__main__':
if len(sys.argv) > 1 and sys.argv[1] == 'server':
# 只啟動服務器
start_flask_app()
else:
# 在子進程中啟動Flask應用
flask_process = Process(target=start_flask_app)
flask_process.start()
try:
# 運行測試
test_results = test_apis()
finally:
# 關閉Flask進程
flask_process.terminate()
flask_process.join(timeout=3)
if flask_process.is_alive():
flask_process.kill()

View File

@@ -1,232 +0,0 @@
import requests
import json
import time
import os
import sys
import io
from pathlib import Path
# 設定 UTF-8 編碼
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
BASE_URL = "http://localhost:5000/api/v1"
def test_login():
"""測試登入功能"""
print("\n=== 測試登入功能 ===")
response = requests.post(f"{BASE_URL}/auth/login", json={
"username": "ymirliu@panjit.com.tw",
"password": "3EDC4rfv5tgb"
})
if response.status_code == 200:
data = response.json()
print(f"✅ 登入成功")
print(f" 使用者: {data.get('user', {}).get('username')}")
print(f" Token: {data.get('token')[:20]}...")
print(f" 管理員: {data.get('user', {}).get('is_admin')}")
return data.get('token')
else:
print(f"❌ 登入失敗: {response.status_code}")
print(f" 錯誤: {response.text}")
return None
def test_file_upload(token):
"""測試檔案上傳"""
print("\n=== 測試檔案上傳 ===")
# 建立測試檔案
test_file = "test_document.txt"
with open(test_file, 'w', encoding='utf-8') as f:
f.write("This is a test document for translation.\n這是一個測試文件。")
try:
with open(test_file, 'rb') as f:
files = {'file': (test_file, f, 'text/plain')}
headers = {'Authorization': f'Bearer {token}'}
response = requests.post(
f"{BASE_URL}/files/upload",
files=files,
headers=headers
)
if response.status_code == 200:
data = response.json()
print(f"✅ 檔案上傳成功")
print(f" Job ID: {data.get('job_id')}")
print(f" 檔案名: {data.get('filename')}")
return data.get('job_id')
else:
print(f"❌ 上傳失敗: {response.status_code}")
print(f" 錯誤: {response.text}")
return None
finally:
# 清理測試檔案
if os.path.exists(test_file):
os.remove(test_file)
def test_job_status(token, job_id):
"""測試任務狀態查詢"""
print("\n=== 測試任務狀態 ===")
headers = {'Authorization': f'Bearer {token}'}
response = requests.get(
f"{BASE_URL}/jobs/{job_id}",
headers=headers
)
if response.status_code == 200:
data = response.json()
print(f"✅ 狀態查詢成功")
print(f" 狀態: {data.get('status')}")
print(f" 進度: {data.get('progress')}%")
return data
else:
print(f"❌ 查詢失敗: {response.status_code}")
return None
def test_admin_stats(token):
"""測試管理員統計功能"""
print("\n=== 測試管理員統計 ===")
headers = {'Authorization': f'Bearer {token}'}
response = requests.get(
f"{BASE_URL}/admin/statistics",
headers=headers
)
if response.status_code == 200:
data = response.json()
print(f"✅ 統計查詢成功")
print(f" 總任務數: {data.get('total_jobs')}")
print(f" 總使用者: {data.get('total_users')}")
print(f" API 總成本: ${data.get('total_cost', 0)}")
return True
else:
print(f"❌ 查詢失敗: {response.status_code}")
return False
def test_dify_api():
"""測試 Dify API 配置"""
print("\n=== 測試 Dify API ===")
# 讀取 API 配置
api_file = Path("api.txt")
if api_file.exists():
content = api_file.read_text().strip()
lines = content.split('\n')
base_url = None
api_key = None
for line in lines:
if line.startswith('base_url:'):
base_url = line.split(':', 1)[1].strip()
elif line.startswith('api:'):
api_key = line.split(':', 1)[1].strip()
print(f"✅ API 配置已找到")
print(f" Base URL: {base_url}")
print(f" API Key: {api_key[:20]}...")
# 測試 API 連線
if base_url and api_key:
try:
test_url = f"{base_url}/chat-messages"
headers = {
'Authorization': f'Bearer {api_key}',
'Content-Type': 'application/json'
}
# 簡單的測試請求
test_data = {
"inputs": {},
"query": "Hello",
"response_mode": "blocking",
"user": "test_user"
}
response = requests.post(test_url, json=test_data, headers=headers, timeout=10)
if response.status_code in [200, 201]:
print(f"✅ Dify API 連線成功")
return True
else:
print(f"⚠️ Dify API 回應: {response.status_code}")
return True # API 配置正確,但可能需要正確的應用配置
except Exception as e:
print(f"⚠️ Dify API 連線測試: {str(e)[:50]}")
return True # 配置存在即可
else:
print(f"❌ API 配置不完整")
return False
else:
print(f"❌ api.txt 檔案不存在")
return False
def run_integration_tests():
"""執行整合測試"""
print("\n" + "="*50)
print("開始執行整合測試")
print("="*50)
results = {
"login": False,
"upload": False,
"status": False,
"admin": False,
"dify": False
}
# 1. 測試 Dify API 配置
results["dify"] = test_dify_api()
# 2. 測試登入
token = test_login()
if token:
results["login"] = True
# 3. 測試檔案上傳
job_id = test_file_upload(token)
if job_id:
results["upload"] = True
# 4. 測試任務狀態
time.sleep(1)
job_data = test_job_status(token, job_id)
if job_data:
results["status"] = True
# 5. 測試管理員功能
results["admin"] = test_admin_stats(token)
# 總結
print("\n" + "="*50)
print("測試結果總結")
print("="*50)
passed = sum(1 for v in results.values() if v)
total = len(results)
for test_name, passed in results.items():
status = "✅ 通過" if passed else "❌ 失敗"
print(f" {test_name.upper()}: {status}")
print(f"\n總計: {passed}/{total} 測試通過")
print(f"成功率: {(passed/total)*100:.1f}%")
return results
if __name__ == "__main__":
# 檢查服務是否運行 - 直接測試登入端點
print("檢查後端服務...")
try:
# 嘗試訪問根路徑或登入路徑
response = requests.get("http://localhost:5000/", timeout=2)
print("✅ 後端服務運行中")
run_integration_tests()
except requests.exceptions.ConnectionError:
print("❌ 無法連接到後端服務")
print("請先執行 start_dev.bat 啟動後端服務")
except Exception as e:
print(f"❌ 錯誤: {e}")

View File

@@ -1,137 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
測試_append_after函數是否正常工作
"""
import sys
import os
import tempfile
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app.services.document_processor import _append_after, _is_our_insert_block
def test_append_after_function():
"""測試_append_after函數是否正常工作"""
print("=== 測試_append_after函數 ===")
try:
from docx import Document
from docx.shared import Pt
# 創建測試文檔
doc = Document()
# 添加原始段落
original_para = doc.add_paragraph("這是原始中文段落。")
print(f"✅ 創建原始段落: {original_para.text}")
# 使用_append_after插入英文翻譯
translation_text = "This is the English translation."
try:
new_para = _append_after(original_para, translation_text, italic=True, font_size_pt=12)
print(f"✅ 使用_append_after插入翻譯: {new_para.text}")
# 檢查插入的段落是否有我們的標記
if _is_our_insert_block(new_para):
print(f"✅ 翻譯段落包含零寬空格標記")
else:
print(f"❌ 翻譯段落缺少零寬空格標記")
# 檢查格式是否正確
if new_para.runs and new_para.runs[0].italic:
print(f"✅ 翻譯段落格式正確(斜體)")
else:
print(f"❌ 翻譯段落格式不正確")
except Exception as e:
print(f"❌ _append_after插入失敗: {e}")
return False
# 再插入一個翻譯來測試鏈式插入
try:
vietnamese_translation = "Đây là bản dịch tiếng Việt."
new_para2 = _append_after(new_para, vietnamese_translation, italic=True, font_size_pt=12)
print(f"✅ 鏈式插入第二個翻譯: {new_para2.text}")
except Exception as e:
print(f"❌ 鏈式插入失敗: {e}")
# 保存測試文檔
test_file = Path(tempfile.gettempdir()) / "test_append_after.docx"
doc.save(str(test_file))
print(f"✅ 測試文檔保存至: {test_file}")
# 重新讀取文檔驗證
try:
doc2 = Document(str(test_file))
paragraphs = [p.text.strip() for p in doc2.paragraphs if p.text.strip()]
print(f"\n📄 測試文檔內容驗證:")
print(f"總段落數: {len(paragraphs)}")
for i, para_text in enumerate(paragraphs):
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in para_text)
has_english = any(ord(c) < 128 and c.isalpha() for c in para_text)
has_vietnamese = any('\u00C0' <= c <= '\u1EF9' for c in para_text)
lang_info = []
if has_chinese:
lang_info.append("中文")
if has_english:
lang_info.append("英文")
if has_vietnamese:
lang_info.append("越南文")
print(f" 段落 {i+1}: [{'/'.join(lang_info)}] {para_text}")
# 檢查是否有正確的交錯格式
expected_sequence = [
("中文", "這是原始中文段落。"),
("英文", "This is the English translation."),
("越南文", "Đây là bản dịch tiếng Việt.")
]
success = True
for i, (expected_lang, expected_text) in enumerate(expected_sequence):
if i < len(paragraphs):
actual_text = paragraphs[i]
if expected_text in actual_text:
print(f" ✅ 段落 {i+1} 包含預期的{expected_lang}內容")
else:
print(f" ❌ 段落 {i+1} 不包含預期的{expected_lang}內容")
success = False
else:
print(f" ❌ 缺少第 {i+1} 個段落")
success = False
if success:
print(f"\n✅ _append_after函數工作正常")
return True
else:
print(f"\n❌ _append_after函數有問題")
return False
except Exception as e:
print(f"❌ 讀取測試文檔失敗: {e}")
return False
except Exception as e:
print(f"❌ 測試失敗: {e}")
return False
if __name__ == "__main__":
success = test_append_after_function()
if success:
print(f"\n🎉 _append_after函數測試通過")
else:
print(f"\n💥 _append_after函數測試失敗")

View File

@@ -1,118 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
基本系統測試
"""
import os
import sys
sys.path.append('.')
def test_database():
"""測試資料庫連線"""
try:
import pymysql
connection = pymysql.connect(
host='mysql.theaken.com',
port=33306,
user='A060',
password='WLeSCi0yhtc7',
database='db_A060',
charset='utf8mb4'
)
cursor = connection.cursor()
# 檢查是否有翻譯系統的表
cursor.execute('SHOW TABLES LIKE "dt_%"')
tables = cursor.fetchall()
print(f"Found {len(tables)} document translator tables")
for table in tables:
print(f" - {table[0]}")
connection.close()
return True, len(tables)
except Exception as e:
print(f"Database test failed: {e}")
return False, 0
def test_ldap():
"""測試LDAP連線"""
try:
import ldap3
from ldap3 import Server, Connection, ALL
server = Server('panjit.com.tw', port=389, get_info=ALL)
# 使用服務帳號連線
bind_dn = "CN=LdapBind,CN=Users,DC=PANJIT,DC=COM,DC=TW"
bind_password = "panjit2481"
conn = Connection(server, user=bind_dn, password=bind_password, auto_bind=True)
# 搜尋測試使用者
search_base = "OU=PANJIT,DC=panjit,DC=com,DC=tw"
search_filter = "(userPrincipalName=ymirliu@panjit.com.tw)"
result = conn.search(search_base, search_filter, attributes=['displayName', 'mail', 'department'])
if result and conn.entries:
user = conn.entries[0]
print(f"Found test user: {user.displayName}")
print(f"Email: {user.mail}")
conn.unbind()
return True
else:
print("Test user not found")
conn.unbind()
return False
except Exception as e:
print(f"LDAP test failed: {e}")
return False
def test_file_processing():
"""測試檔案處理庫"""
try:
# 測試基本導入
import docx
import openpyxl
from pptx import Presentation
import PyPDF2
print("All file processing libraries imported successfully")
return True
except Exception as e:
print(f"File processing test failed: {e}")
return False
def main():
print("=== Document Translator System Test ===")
print("\n1. Testing database connection...")
db_ok, table_count = test_database()
print("\n2. Testing LDAP authentication...")
ldap_ok = test_ldap()
print("\n3. Testing file processing libraries...")
file_ok = test_file_processing()
print("\n=== Test Results ===")
print(f"Database Connection: {'PASS' if db_ok else 'FAIL'}")
print(f"Database Tables: {table_count} found")
print(f"LDAP Authentication: {'PASS' if ldap_ok else 'FAIL'}")
print(f"File Processing: {'PASS' if file_ok else 'FAIL'}")
if db_ok and ldap_ok and file_ok:
if table_count > 0:
print("\nStatus: READY for testing")
else:
print("\nStatus: Need to initialize database tables")
else:
print("\nStatus: System has issues")
if __name__ == '__main__':
main()

View File

@@ -1,130 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Test batch download functionality
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
import tempfile
import zipfile
from pathlib import Path
from app import create_app
from app.models.job import TranslationJob
def test_batch_download():
"""Test batch download ZIP creation"""
app = create_app()
with app.app_context():
# Get the most recent completed job
job = TranslationJob.query.filter_by(status='COMPLETED').order_by(TranslationJob.created_at.desc()).first()
if not job:
print("No completed jobs found to test")
return
print(f"Testing batch download for job: {job.job_uuid}")
print(f"Original filename: {job.original_filename}")
print(f"Target languages: {job.target_languages}")
# Get translated files
translated_files = job.get_translated_files()
original_file = job.get_original_file()
print(f"Found {len(translated_files)} translated files:")
for tf in translated_files:
exists = Path(tf.file_path).exists()
print(f" - {tf.filename} ({tf.language_code}) - {'EXISTS' if exists else 'MISSING'}")
if original_file:
exists = Path(original_file.file_path).exists()
print(f"Original file: {original_file.filename} - {'EXISTS' if exists else 'MISSING'}")
# Test ZIP creation
print(f"\n=== Testing ZIP creation ===")
temp_dir = tempfile.gettempdir()
zip_filename = f"{job.original_filename.split('.')[0]}_translations_{job.job_uuid[:8]}.zip"
zip_path = Path(temp_dir) / zip_filename
try:
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zip_file:
files_added = 0
# Add original file
if original_file and Path(original_file.file_path).exists():
zip_file.write(
original_file.file_path,
f"original/{original_file.filename}"
)
files_added += 1
print(f"✅ Added original file: original/{original_file.filename}")
# Add translated files
for tf in translated_files:
file_path = Path(tf.file_path)
if file_path.exists():
archive_name = f"{tf.language_code}/{tf.filename}"
zip_file.write(str(file_path), archive_name)
files_added += 1
print(f"✅ Added translation: {archive_name}")
else:
print(f"❌ Translation file missing: {tf.file_path}")
print(f"\nTotal files added to ZIP: {files_added}")
# Check ZIP file
if zip_path.exists():
zip_size = zip_path.stat().st_size
print(f"✅ ZIP file created successfully: {zip_filename} ({zip_size:,} bytes)")
# List ZIP contents
print(f"\n=== ZIP Contents ===")
with zipfile.ZipFile(zip_path, 'r') as zip_file:
for info in zip_file.infolist():
print(f" 📁 {info.filename} - {info.file_size:,} bytes")
# Test extracting a sample file to verify integrity
print(f"\n=== Testing ZIP integrity ===")
try:
with zipfile.ZipFile(zip_path, 'r') as zip_file:
# Test extraction of first file
if zip_file.namelist():
first_file = zip_file.namelist()[0]
extracted_data = zip_file.read(first_file)
print(f"✅ Successfully extracted {first_file} ({len(extracted_data):,} bytes)")
else:
print("❌ ZIP file is empty")
except Exception as e:
print(f"❌ ZIP integrity test failed: {e}")
else:
print("❌ ZIP file was not created")
except Exception as e:
print(f"❌ ZIP creation failed: {e}")
import traceback
traceback.print_exc()
finally:
# Clean up
if zip_path.exists():
try:
zip_path.unlink()
print(f"\n🧹 Cleaned up temporary ZIP file")
except Exception as e:
print(f"⚠️ Could not clean up ZIP file: {e}")
if __name__ == "__main__":
test_batch_download()

View File

@@ -1,49 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
測試Celery導入
"""
import sys
import os
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
def test_celery_import():
"""測試Celery是否能正確導入"""
try:
print("嘗試導入app模組...")
import app
print(f"✅ app模組導入成功: {app}")
print("檢查app模組屬性...")
print(f" - hasattr(app, 'app'): {hasattr(app, 'app')}")
print(f" - hasattr(app, 'celery'): {hasattr(app, 'celery')}")
if hasattr(app, 'celery'):
celery_instance = app.celery
print(f"✅ celery實例: {celery_instance}")
print(f" - celery類型: {type(celery_instance)}")
print(f" - celery任務: {list(celery_instance.tasks.keys())}")
else:
print("❌ app模組沒有celery屬性")
if hasattr(app, 'app'):
flask_app = app.app
print(f"✅ Flask app: {flask_app}")
if hasattr(flask_app, 'celery'):
print(f"✅ Flask app.celery: {flask_app.celery}")
else:
print("❌ Flask app沒有celery屬性")
except Exception as e:
print(f"❌ 導入失敗: {e}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
test_celery_import()

View File

@@ -1,187 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
測試修復後的儲存格為單位翻譯邏輯
驗證 Excel 和 Word 表格的翻譯是否正確對應
"""
import sys
import os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
# 設定編碼
sys.stdout.reconfigure(encoding='utf-8')
from pathlib import Path
from app import create_app
from app.services.translation_service import TranslationService
def test_excel_cell_based_translation():
"""測試Excel儲存格為單位的翻譯邏輯"""
print("=" * 80)
print("測試Excel儲存格為單位翻譯邏輯")
print("=" * 80)
app = create_app()
with app.app_context():
service = TranslationService()
# 測試案例1: 泰文翻譯 (之前D4, H2缺失)
print(f"\n1. 測試泰文翻譯儲存格方法")
print("-" * 60)
# 模擬D4儲存格內容
d4_text = "WB inline"
d4_translated = service.translate_excel_cell(
text=d4_text,
source_language="zh",
target_language="th",
user_id=1
)
print(f"D4原文: {repr(d4_text)}")
print(f"D4泰文: {repr(d4_translated)}")
# 模擬H2儲存格內容
h2_text = "製程"
h2_translated = service.translate_excel_cell(
text=h2_text,
source_language="zh",
target_language="th",
user_id=1
)
print(f"H2原文: {repr(h2_text)}")
print(f"H2泰文: {repr(h2_translated)}")
# 測試案例2: 韓文翻譯 (之前D2-D8缺失)
print(f"\n2. 測試韓文翻譯儲存格方法")
print("-" * 60)
# 模擬D2儲存格內容 (多行格式)
d2_text = "WB inline\nDC: 1000V\n@25°C Tstg: -55°C to +125°C"
d2_translated = service.translate_excel_cell(
text=d2_text,
source_language="zh",
target_language="ko",
user_id=1
)
print(f"D2原文: {repr(d2_text)}")
print(f"D2韓文: {repr(d2_translated[:60])}...")
# 檢查是否使用了原始DIFY翻譯的特徵
if "와이어 본딩" in d2_translated:
print(f" 🎯 ✅ 使用了原始DIFY翻譯特徵")
elif "연결" in d2_translated:
print(f" ✋ ❌ 仍使用手動補充翻譯")
else:
print(f" ❓ 翻譯來源不明")
def test_word_table_cell_translation():
"""測試Word表格儲存格為單位的翻譯邏輯"""
print(f"\n" + "=" * 80)
print("測試Word表格儲存格為單位翻譯邏輯")
print("=" * 80)
app = create_app()
with app.app_context():
service = TranslationService()
print(f"\n1. 測試Word表格儲存格翻譯方法")
print("-" * 60)
# 測試案例: Word表格儲存格包含多段落的情況
cell_text = "超温\n存放\n工务部"
cell_translated = service.translate_word_table_cell(
text=cell_text,
source_language="zh",
target_language="th",
user_id=1
)
print(f"表格儲存格原文: {repr(cell_text)}")
print(f"表格儲存格泰文: {repr(cell_translated)}")
# 另一個案例: 單段落儲存格
single_cell = "製程控制"
single_translated = service.translate_word_table_cell(
text=single_cell,
source_language="zh",
target_language="ko",
user_id=1
)
print(f"\n單段落儲存格原文: {repr(single_cell)}")
print(f"單段落儲存格韓文: {repr(single_translated)}")
def test_translation_cache_mapping():
"""測試翻譯快取與儲存格的對應關係"""
print(f"\n" + "=" * 80)
print("測試翻譯快取與儲存格的對應關係")
print("=" * 80)
app = create_app()
with app.app_context():
from sqlalchemy import text as sql_text
from app import db
# 檢查之前提到的快取記錄是否能正確對應
print(f"\n1. 檢查泰文翻譯快取記錄")
print("-" * 60)
# D4對應的ROW 392, 393
d4_cache = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, created_at
FROM dt_translation_cache
WHERE id IN (392, 393) AND target_language = 'th'
ORDER BY id
""")).fetchall()
for row in d4_cache:
print(f"ROW {row[0]}: {repr(row[1][:30])}... -> {repr(row[2][:30])}...")
# H2對應的ROW 381-385
h2_cache = db.session.execute(sql_text("""
SELECT id, source_text, translated_text, created_at
FROM dt_translation_cache
WHERE id BETWEEN 381 AND 385 AND target_language = 'th'
ORDER BY id
""")).fetchall()
print(f"\nH2相關快取記錄:")
for row in h2_cache:
print(f"ROW {row[0]}: {repr(row[1][:20])}... -> {repr(row[2][:20])}...")
def main():
"""主測試函數"""
print("🧪 開始測試儲存格為單位的翻譯邏輯")
print("預期: 翻譯不再進行切片,整個儲存格作為單位處理")
try:
# 測試Excel儲存格翻譯
test_excel_cell_based_translation()
# 測試Word表格儲存格翻譯
test_word_table_cell_translation()
# 測試快取對應關係
test_translation_cache_mapping()
print(f"\n" + "=" * 80)
print("✅ 儲存格為單位翻譯邏輯測試完成!")
print("📊 總結:")
print(" - Excel: 使用 translate_excel_cell() 方法")
print(" - Word表格: 使用 translate_word_table_cell() 方法")
print(" - 兩者都不進行內容切片,保持儲存格完整性")
print("=" * 80)
except Exception as e:
print(f"❌ 測試過程中發生錯誤: {str(e)}")
import traceback
print(f"錯誤詳情: {traceback.format_exc()}")
if __name__ == "__main__":
main()

View File

@@ -1,178 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
使用乾淨的DOCX文件測試翻譯插入
"""
import sys
import os
import tempfile
import shutil
from pathlib import Path
# Fix encoding for Windows console
if sys.stdout.encoding != 'utf-8':
sys.stdout.reconfigure(encoding='utf-8')
if sys.stderr.encoding != 'utf-8':
sys.stderr.reconfigure(encoding='utf-8')
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'app'))
from app import create_app, db
from app.services.translation_service import DocxParser
from sqlalchemy import text
def test_clean_docx_translation():
"""使用乾淨的DOCX文件測試翻譯插入"""
app = create_app()
with app.app_context():
print("=== 使用乾淨的DOCX文件測試翻譯插入 ===")
# 原始文件
original_path = r"C:\Users\EGG\WORK\data\user_scrip\TOOL\Document_translator_V2\uploads\9c6548ac-2f59-45f4-aade-0a9b3895bbfd\original_-OR026_9c6548ac.docx"
# 創建乾淨的副本
clean_copy_dir = Path(tempfile.gettempdir()) / "clean_docx_test"
clean_copy_dir.mkdir(exist_ok=True)
clean_copy_path = clean_copy_dir / "clean_original.docx"
shutil.copy2(original_path, clean_copy_path)
print(f"✅ 創建乾淨副本: {clean_copy_path}")
# 使用乾淨副本測試翻譯
parser = DocxParser(str(clean_copy_path))
# 檢查前幾個段落的當前狀態
try:
from docx import Document
doc = Document(str(clean_copy_path))
print(f"\n📄 乾淨文檔當前狀態:")
print(f"總段落數: {len(doc.paragraphs)}")
for i, para in enumerate(doc.paragraphs[:10]):
if para.text.strip():
print(f" 段落 {i+1}: {para.text.strip()[:60]}...")
# 檢查是否有零寬空格標記(翻譯插入標記)
has_marker = any('\u200b' in (r.text or '') for r in para.runs)
if has_marker:
print(f" ⚠️ 此段落已包含翻譯插入標記")
except Exception as e:
print(f"❌ 檢查文檔狀態失敗: {e}")
return
# 測試翻譯生成只生成前3個段落來測試
print(f"\n🔄 測試翻譯生成...")
try:
output_dir = clean_copy_dir
# 使用空的translations字典因為我們從快取讀取
empty_translations = {}
en_output_path = parser.generate_translated_document(
empty_translations,
'en',
output_dir
)
print(f"✅ 翻譯文件生成成功: {en_output_path}")
# 檢查生成的文件
output_file = Path(en_output_path)
if output_file.exists():
print(f"文件大小: {output_file.stat().st_size:,} bytes")
try:
doc2 = Document(str(output_file))
paragraphs = [p for p in doc2.paragraphs if p.text.strip()]
print(f"\n📄 生成文件詳細分析:")
print(f"總段落數: {len(paragraphs)}")
chinese_count = 0
english_count = 0
mixed_count = 0
marker_count = 0
print(f"\n前20段落詳情:")
for i, para in enumerate(paragraphs[:20]):
text = para.text.strip()
# 語言檢測
has_chinese = any('\u4e00' <= c <= '\u9fff' for c in text)
has_english = any(ord(c) < 128 and c.isalpha() and c not in 'PANJIT' for c in text)
has_marker = any('\u200b' in (r.text or '') for r in para.runs)
if has_marker:
marker_count += 1
if has_chinese and has_english:
mixed_count += 1
lang_status = "🔄 中英混合"
elif has_english:
english_count += 1
lang_status = "🇺🇸 純英文"
elif has_chinese:
chinese_count += 1
lang_status = "🇨🇳 純中文"
else:
lang_status = "❓ 其他"
marker_status = " 🏷️" if has_marker else ""
print(f" 段落 {i+1:2d}: {lang_status}{marker_status} - {text[:70]}...")
print(f"\n📊 統計結果:")
print(f" 純中文段落: {chinese_count}")
print(f" 純英文段落: {english_count}")
print(f" 中英混合段落: {mixed_count}")
print(f" 帶翻譯標記的段落: {marker_count}")
# 判斷翻譯效果
if english_count > 10:
print(f"\n✅ 翻譯效果優秀 - 有 {english_count} 個純英文段落")
elif english_count > 0:
print(f"\n⚠️ 翻譯部分成功 - 有 {english_count} 個純英文段落")
elif marker_count > 10:
print(f"\n🔍 翻譯可能成功但格式問題 - 有 {marker_count} 個帶標記的段落")
else:
print(f"\n❌ 翻譯可能失敗 - 沒有明顯的英文內容")
# 檢查是否有連續的中英文段落(交錯格式)
alternating_pairs = 0
for i in range(len(paragraphs) - 1):
current = paragraphs[i].text.strip()
next_para = paragraphs[i + 1].text.strip()
current_chinese = any('\u4e00' <= c <= '\u9fff' for c in current)
current_english = any(ord(c) < 128 and c.isalpha() and c not in 'PANJIT' for c in current)
next_chinese = any('\u4e00' <= c <= '\u9fff' for c in next_para)
next_english = any(ord(c) < 128 and c.isalpha() and c not in 'PANJIT' for c in next_para)
if current_chinese and not current_english and next_english and not next_chinese:
alternating_pairs += 1
if alternating_pairs <= 3: # 顯示前3個交錯對
print(f"\n 交錯對 {alternating_pairs}:")
print(f" 中文: {current[:50]}...")
print(f" 英文: {next_para[:50]}...")
if alternating_pairs > 0:
print(f"\n✅ 發現交錯翻譯格式!共 {alternating_pairs}")
else:
print(f"\n❌ 沒有發現交錯翻譯格式")
except Exception as e:
print(f"❌ 分析生成文件失敗: {e}")
else:
print(f"❌ 生成的文件不存在")
except Exception as e:
print(f"❌ 翻譯生成失敗: {e}")
if __name__ == "__main__":
test_clean_docx_translation()

Some files were not shown because too many files have changed in this diff Show More