feat: Dockerize deployment - switch to a single-container architecture

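Convert Tool_OCR from a macOS conda environment to a single-container Docker deployment.
The frontend and backend run in the same container behind an Nginx reverse proxy, and only one port is exposed externally.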

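## New features
- Single-container Docker architecture (Frontend + Backend + Nginx)
- Multi-stage build to reduce image size
- Supervisor for process management
- Health check mechanism
- Complete deployment documentation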

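## Technical details
- External port: 12015 (the original port 12010 was already in use)
- Internal architecture: Nginx (12015) → FastAPI (8000)
- Frontend static files are served directly by Nginx
- API requests go through the Nginx reverse proxy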
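
The routing described above can be sketched as a single Nginx server block. The actual `docker/default.conf` is not shown in this commit view, so everything below is an assumption for illustration: the `/api/` prefix, the way `/health` is forwarded, and the scratch file name are hypothetical. Only port 12015, port 8000, and the `/app/frontend/dist` path come from the commit.

```shell
#!/bin/sh
# Hypothetical sketch of what docker/default.conf might contain.
# It is written to a scratch file so it can be inspected or diffed.
conf_file="${TMPDIR:-/tmp}/tool_ocr_default.conf"

# The quoted 'EOF' keeps $host, $uri, etc. literal for Nginx.
cat > "$conf_file" <<'EOF'
server {
    listen 12015;
    server_name _;

    # Static frontend build copied from the frontend-builder stage.
    root /app/frontend/dist;
    index index.html;

    # Assumed: the health endpoint is answered by FastAPI.
    location = /health {
        proxy_pass http://127.0.0.1:8000;
    }

    # Assumed API prefix; requests are forwarded unchanged to uvicorn.
    location /api/ {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Single-page app fallback: unknown paths return index.html.
    location / {
        try_files $uri $uri/ /index.html;
    }
}
EOF

echo "wrote $conf_file"
```

Because the frontend and the API share one origin in this layout, the frontend can call the API with relative URLs, which is consistent with the empty `VITE_API_BASE_URL` written during the frontend build.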

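## System dependencies completed
- libmagic1: file type detection
- LibreOffice: Office document conversion
- paddlex[ocr]: PP-StructureV3 layout analysis
- CJK (Chinese, Japanese, Korean) font support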
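
One way to confirm that these dependencies are present in the built image is to look for their executables and Python modules from inside the container. The helper below is a minimal sketch and not part of the repository. The command names in the usage comment are assumptions about what the packages install.

```shell
#!/bin/sh
# Print the subset of the given commands that is not found on PATH,
# separated by spaces. Prints an empty line when nothing is missing.
missing_commands() {
    missing=""
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
    done
    # Drop the leading space left by the concatenation above.
    printf '%s\n' "${missing# }"
}

# Example, run inside the container (names are assumptions):
#   missing_commands soffice nginx supervisord pandoc
#   python -c "import magic, paddlex"
```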

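## Configuration changes
- Environment variable paths: macOS paths → absolute container paths
- Frontend API URL: corrected to the unified port 12015
- pip install: timeout extended to 600 seconds, with 5 retries
- CRLF conversion: Windows line endings are handled automatically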
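
The CRLF handling uses the `sed` expression from the Dockerfile, which strips a trailing carriage return from every line. The sketch below reproduces it on a scratch file. It assumes GNU `sed` (as shipped in the Debian-based image), and the temporary path is illustrative.

```shell
#!/bin/sh
# Demonstrate the CRLF -> LF fix on a throwaway copy of an entrypoint script.
work_dir=$(mktemp -d)
script="$work_dir/entrypoint.sh"

# Write a two-line script with Windows (CRLF) line endings.
printf '#!/bin/sh\r\necho started\r\n' > "$script"

# Same commands as in the Dockerfile: remove the CR at each line end,
# then mark the script executable.
sed -i 's/\r$//' "$script"
chmod +x "$script"

# With the carriage returns gone, the output has no stray CR appended.
output=$(sh "$script")
echo "$output"
```

Without this step, a script checked out on Windows can fail at container start because the kernel looks for an interpreter named `/bin/sh\r`.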

## Cleanup
- Removed temporary documents (7 documents, including API_FIX_SUMMARY.md)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
beabigegg
2025-11-13 13:12:59 +08:00
parent 57cf91271c
commit 0f81d5e70b
26 changed files with 1166 additions and 2985 deletions

Dockerfile (new file, 131 lines)

@@ -0,0 +1,131 @@
# ============================================
# Tool_OCR - Unified Docker Image
# Frontend (React + Vite) + Backend (FastAPI)
# Served by Nginx with reverse proxy
# ============================================
# ============================================
# Stage 1: Build Frontend
# ============================================
FROM node:20-alpine AS frontend-builder
WORKDIR /app/frontend
# Copy package files
COPY frontend/package*.json ./
# Install all dependencies (including devDependencies for build)
RUN npm ci
# Copy frontend source
COPY frontend/ ./
# Create production environment file
RUN echo "VITE_API_BASE_URL=" > .env.production
# Build frontend for production
RUN npm run build
# ============================================
# Stage 2: Build Backend + Final Image
# ============================================
FROM python:3.10-slim-bookworm
# Set working directory
WORKDIR /app
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    DEBIAN_FRONTEND=noninteractive
# Install system dependencies
# - nginx: web server and reverse proxy
# - supervisor: process manager for nginx + uvicorn
# - curl: for health checks
# - pandoc: for markdown to PDF conversion
# - poppler-utils: for pdf2image (PDF processing)
# - libpango-1.0-0, libpangocairo-1.0-0: for WeasyPrint
# - libgdk-pixbuf2.0-0: for WeasyPrint image handling
# - libffi-dev: for cryptography
# - fonts-noto-cjk: Chinese/Japanese/Korean font support
# - libgomp1, libgl1, libglib2.0-0: for OpenCV and PaddleOCR (libgl1 provides libGL.so.1 on bookworm)
# - libmagic1: for python-magic file type detection
# - libreoffice-writer, libreoffice-impress: for Office document conversion (doc/docx/ppt/pptx)
RUN apt-get update && apt-get install -y --no-install-recommends \
    nginx \
    supervisor \
    curl \
    pandoc \
    poppler-utils \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libgdk-pixbuf2.0-0 \
    libffi-dev \
    fonts-noto-cjk \
    fonts-noto-cjk-extra \
    libgomp1 \
    libgl1 \
    libglib2.0-0 \
    libmagic1 \
    libreoffice-writer \
    libreoffice-impress \
    && rm -rf /var/lib/apt/lists/*
# Copy Python requirements
COPY requirements.txt .
# Install Python dependencies with extended timeout
# PaddlePaddle is 189MB and may take time to download
# Timeout: 600 seconds (10 minutes), Retries: 5
RUN pip install --timeout 600 --retries 5 -r requirements.txt
# Copy backend application
COPY backend/ ./backend/
# Copy frontend build from frontend-builder stage
COPY --from=frontend-builder /app/frontend/dist /app/frontend/dist
# Copy Nginx configuration
COPY docker/nginx.conf /etc/nginx/nginx.conf
COPY docker/default.conf /etc/nginx/conf.d/default.conf
# Copy supervisor configuration
COPY docker/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# Copy startup script and fix line endings (Windows CRLF -> Linux LF)
COPY docker/entrypoint.sh /entrypoint.sh
RUN sed -i 's/\r$//' /entrypoint.sh && chmod +x /entrypoint.sh
# Create necessary directories with proper permissions
RUN mkdir -p \
    /app/backend/uploads/temp \
    /app/backend/uploads/processed \
    /app/backend/uploads/images \
    /app/backend/storage/markdown \
    /app/backend/storage/json \
    /app/backend/storage/exports \
    /app/backend/models/paddleocr \
    /app/backend/logs \
    /var/log/supervisor \
    /var/log/nginx \
    /var/cache/nginx \
    /var/run \
    && chmod -R 755 /app \
    && chown -R www-data:www-data /var/log/nginx /var/cache/nginx
# Expose port (only one port needed!)
EXPOSE 12015
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:12015/health || exit 1
# Set working directory to backend for Python app
WORKDIR /app/backend
# Use entrypoint script to start supervisor
ENTRYPOINT ["/entrypoint.sh"]
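
The `HEALTHCHECK` above probes `/health` with `curl -f` and marks the container unhealthy after three consecutive failures. The same probe can be approximated from the host after starting the container. The sketch below is a hedged illustration: the function name, its defaults, and the `tool-ocr` image tag are placeholders, not part of the repository.

```shell
#!/bin/sh
# Host-side probe that mirrors the HEALTHCHECK settings:
# up to 3 attempts, 10 s timeout per attempt, 30 s between attempts.
# Usage: wait_for_health URL [RETRIES] [INTERVAL_SECONDS]
wait_for_health() {
    url=$1
    retries=${2:-3}
    interval=${3:-30}
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        # -f makes curl exit non-zero on HTTP 4xx/5xx, as in the Dockerfile.
        if curl -fsS --max-time 10 "$url" >/dev/null 2>&1; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -lt "$retries" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (the image tag is a placeholder):
#   docker build -t tool-ocr .
#   docker run -d -p 12015:12015 tool-ocr
#   wait_for_health http://localhost:12015/health && echo healthy
```

Docker's `--start-period=40s` gives Supervisor, Nginx, and uvicorn time to come up before failures count, so a host-side check may also need a short delay before its first attempt.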