OCR/requirements.txt
egg fa1abcd8e6 feat: implement layout-preserving PDF generation with table reconstruction
Major Features:
- Add PDF generation service with Chinese font support
- Parse HTML tables from PP-StructureV3 and rebuild with ReportLab
- Extract table text for translation purposes
- Auto-filter text regions inside tables to avoid overlaps
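
The table parsing and text extraction steps listed above can be sketched with the standard library alone. This is a minimal sketch, not the commit's HTMLTableParser: the names TableRowsParser, parse_table and table_text are hypothetical, and colspan/rowspan handling is left out. It turns a table's HTML into a list of rows, which is the shape that reportlab.platypus.Table takes as its data argument, and flattens the same rows into plain text for translation.

```python
from html.parser import HTMLParser


class TableRowsParser(HTMLParser):
    """Collect the cell text of an HTML table as a list of rows (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being built, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed by malformed HTML


def parse_table(html):
    """Return the table's cells as a list of rows of strings."""
    parser = TableRowsParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def table_text(rows):
    """Flatten rows into plain text, one line per row, skipping empty cells."""
    return "\n".join(" ".join(cell for cell in row if cell) for row in rows)
```

Keeping the parser independent of ReportLab means the same rows can feed both the PDF table and the translation text, at the cost of ignoring merged cells in this sketch.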

Backend Changes:
1. pdf_generator_service.py (NEW)
   - HTMLTableParser: Parse HTML tables to extract structure
   - PDFGeneratorService: Generate layout-preserving PDFs
   - Coordinate transformation: OCR (top-left) → PDF (bottom-left)
   - Font size heuristics: 75% of bbox height with width checking
   - Table reconstruction: Parse HTML → ReportLab Table
   - Image embedding: Extract bbox from filenames

2. ocr_service.py
   - Add _extract_table_text() for translation support
   - Add output_dir parameter to save images to result directory
   - Extract bbox from image filenames (img_in_table_box_x1_y1_x2_y2.jpg)

3. tasks.py
   - Update process_task_ocr to use save_results() with PDF generation
   - Fix download_pdf endpoint to use database-stored PDF paths
   - Support on-demand PDF generation from JSON

4. config.py
   - Add chinese_font_path configuration
   - Add pdf_enable_bbox_debug flag
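
The geometry steps named in the backend changes above (bbox from image filenames, the top-left to bottom-left coordinate flip, and the 75% font-size heuristic) can be sketched as follows. This is an illustration under stated assumptions, not the code in pdf_generator_service.py: all function names are hypothetical, OCR pixels are assumed to map 1:1 to PDF points, and the width check is one plausible reading of "with width checking" that estimates text width from a per-character ratio.

```python
import re

# Matches the trailing "_box_x1_y1_x2_y2.ext" part of names such as
# img_in_table_box_x1_y1_x2_y2.jpg (pattern is an assumption based on that example).
BBOX_RE = re.compile(r"_box_(\d+)_(\d+)_(\d+)_(\d+)\.\w+$")


def bbox_from_filename(name):
    """Return (x1, y1, x2, y2) parsed from the filename, or None if absent."""
    match = BBOX_RE.search(name)
    if match is None:
        return None
    x1, y1, x2, y2 = (int(group) for group in match.groups())
    return x1, y1, x2, y2


def ocr_to_pdf_rect(bbox, page_height):
    """Convert a top-left-origin bbox to (x, y, width, height) with a bottom-left origin."""
    x1, y1, x2, y2 = bbox
    # PDF y grows upward, so the box's lower edge (y2) becomes its PDF y position.
    return x1, page_height - y2, x2 - x1, y2 - y1


def fit_font_size(text, bbox, char_width_ratio=1.0, min_size=4.0):
    """Start at 75% of the bbox height, then shrink if the estimated width overflows.

    char_width_ratio (glyph width / font size) and min_size are assumed values.
    """
    x1, y1, x2, y2 = bbox
    size = 0.75 * (y2 - y1)
    estimated_width = len(text) * size * char_width_ratio
    if text and estimated_width > (x2 - x1):
        size *= (x2 - x1) / estimated_width
    return max(size, min_size)
```

A ratio of 1.0 roughly suits full-width CJK glyphs; for Latin text a measured width, for example from the PDF library's string-width function, would be more accurate than this estimate.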

Frontend Changes:
1. PDFViewer.tsx (NEW)
   - React PDF viewer with zoom and pagination
   - Memoized file config to prevent unnecessary reloads

2. TaskDetailPage.tsx & ResultsPage.tsx
   - Integrate PDF preview and download

3. main.tsx
   - Configure PDF.js worker via CDN

4. vite.config.ts
   - Add host: '0.0.0.0' for network access
   - Use VITE_API_URL environment variable for backend proxy

Dependencies:
- reportlab: PDF generation library
- Noto Sans SC font: Chinese character support

🤖 Generated with Claude Code
https://claude.com/claude-code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 20:21:56 +08:00

# Tool_OCR - Backend Dependencies
# Python 3.10+
# ===== Core Framework =====
fastapi==0.115.0
uvicorn[standard]==0.32.0
pydantic==2.9.2
pydantic-settings==2.6.1
email-validator>=2.0.0 # For pydantic EmailStr validation
# ===== OCR Engine =====
paddleocr>=3.0.0
# paddlepaddle>=3.0.0 # Installed separately in setup script (GPU/CPU version)
paddlex[ocr]>=3.0.0 # Required for PP-StructureV3 layout analysis
# ===== Image Processing =====
pillow>=10.0.0
pdf2image>=1.17.0
opencv-python>=4.8.0
# ===== PDF Generation =====
weasyprint>=60.0
markdown>=3.5.0
reportlab>=4.0.0 # Layout-preserving PDF generation with precise coordinate control
# Note: pandoc is not a pip package and must be installed separately (e.g. on macOS: brew install pandoc)
# ===== Data Export =====
pandas>=2.1.0
openpyxl>=3.1.0 # Excel support
# ===== Database =====
sqlalchemy>=2.0.0
pymysql>=1.1.0
alembic>=1.13.0
# ===== Authentication =====
python-jose[cryptography]>=3.3.0
passlib[bcrypt]>=1.7.4
bcrypt==4.2.1 # Pin to 4.2.1 for passlib compatibility
python-multipart>=0.0.6
# ===== Configuration =====
python-dotenv>=1.0.0
pyyaml>=6.0
# ===== HTTP Client =====
httpx>=0.25.0
requests>=2.31.0
# ===== Background Tasks (Optional) =====
# redis>=5.0.0 # Uncomment if using Redis for task queue
# celery>=5.3.0 # Uncomment if using Celery
# ===== Translation (Reserved) =====
# argostranslate>=1.9.0 # Uncomment when implementing translation
# ===== Development Tools =====
pytest>=7.4.0
pytest-asyncio>=0.21.0
pytest-cov>=4.1.0
black>=23.9.0
pylint>=3.0.0
# ===== Utilities =====
python-magic>=0.4.27 # File type detection