feat: add translated PDF format selection (layout/reflow)

- Add generate_translated_layout_pdf() method for layout-preserving translated PDFs
- Add generate_translated_pdf() method for reflow translated PDFs
- Update translate router to accept format parameter (layout/reflow)
- Update frontend with dropdown to select translated PDF format
- Fix reflow PDF table cell extraction from content dict
- Add embedded images handling in reflow PDF tables
- Archive improve-translated-text-fitting openspec proposal

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
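
The format selection described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the code from this commit: only the names generate_translated_layout_pdf() and generate_translated_pdf() and the values layout/reflow come from the message. The signatures, the return values, the TranslatedPdfFormat enum, and the select_generator() helper are assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Dict


class TranslatedPdfFormat(str, Enum):
    """Hypothetical enum for the two format values named in the commit."""
    LAYOUT = "layout"
    REFLOW = "reflow"


def generate_translated_layout_pdf(task_id: str) -> str:
    # Stand-in for the layout-preserving generator; the real signature is unknown.
    return f"{task_id}_translated_layout.pdf"


def generate_translated_pdf(task_id: str) -> str:
    # Stand-in for the reflow generator; the real signature is unknown.
    return f"{task_id}_translated_reflow.pdf"


# Map each accepted format value to the generator that handles it.
_GENERATORS: Dict[TranslatedPdfFormat, Callable[[str], str]] = {
    TranslatedPdfFormat.LAYOUT: generate_translated_layout_pdf,
    TranslatedPdfFormat.REFLOW: generate_translated_pdf,
}


def select_generator(format_value: str) -> Callable[[str], str]:
    """Validate the requested format and return the matching generator."""
    try:
        fmt = TranslatedPdfFormat(format_value.lower())
    except ValueError:
        allowed = ", ".join(f.value for f in TranslatedPdfFormat)
        raise ValueError(
            f"Unsupported format {format_value!r}; expected one of: {allowed}"
        ) from None
    return _GENERATORS[fmt]
```

A router handler could call select_generator(format)(task_id) and turn the ValueError into a 4xx response, which keeps the set of accepted values defined in one place.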
@@ -10,8 +10,11 @@ email-validator>=2.0.0 # For pydantic EmailStr validation
 
 # ===== OCR Engine =====
 paddleocr>=3.0.0
-# paddlepaddle>=3.0.0 # Installed separately in setup script (GPU/CPU version)
 paddlex[ocr]>=3.0.0 # Required for PP-StructureV3 layout analysis
+# PaddlePaddle Installation (NOT available on PyPI for 3.x):
+# GPU (CUDA 12.6): pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
+# GPU (CUDA 12.9): pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
+# CPU: pip install paddlepaddle -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
 
 # ===== Image Processing =====
 pillow>=10.0.0
@@ -28,7 +31,7 @@ PyPDF2>=3.0.0 # Extract dimensions from source PDF files
 # ===== Direct PDF Extraction (Dual-track Processing) =====
 PyMuPDF>=1.23.0 # Primary library for editable PDF text/structure extraction
 pdfplumber>=0.10.0 # Fallback for table extraction and validation
-python-magic-bin>=0.4.14 # Windows-compatible file type detection
+# Note: python-magic requires libmagic (apt install libmagic1 on Linux)
 
 # ===== Data Export =====
 pandas>=2.1.0
@@ -57,8 +60,9 @@ requests>=2.31.0
 # redis>=5.0.0 # Uncomment if using Redis for task queue
 # celery>=5.3.0 # Uncomment if using Celery
 
-# ===== Translation (Reserved) =====
-# argostranslate>=1.9.0 # Uncomment when implementing translation
+# ===== Translation =====
+# Translation will use external API (to be implemented)
+# See openspec/changes/add-document-translation/ for proposal
 
 # ===== Development Tools =====
 pytest>=7.4.0