# Tool_OCR Setup Guide

Complete setup instructions for a macOS environment.

## Prerequisites Check

Before starting, verify you have:

- ✅ macOS (Apple Silicon or Intel)
- ✅ Terminal access (zsh or bash)
- ✅ Internet connection for downloads

## Step-by-Step Setup

### Step 1: Install Conda Environment

Run the automated setup script:

```bash
chmod +x setup_conda.sh
./setup_conda.sh
```

**Expected output:**

- If Conda is not installed: downloads and installs Miniconda for Apple Silicon
- If Conda is already installed: creates the `tool_ocr` environment with Python 3.10

**If Conda was just installed:**

```bash
# Reload your shell to activate Conda
source ~/.zshrc   # if using zsh (default on macOS)
source ~/.bashrc  # if using bash

# Run setup script again to create environment
./setup_conda.sh
```

### Step 2: Activate Environment

```bash
conda activate tool_ocr
```

You should see the `(tool_ocr)` prefix in your terminal prompt.

### Step 3: Install Python Dependencies

```bash
pip install -r requirements.txt
```

**This will install:**

- FastAPI and Uvicorn (web framework)
- PaddleOCR and PaddlePaddle (OCR engine)
- Image processing libraries (Pillow, OpenCV, pdf2image)
- PDF generation tools (WeasyPrint, Markdown)
- Database tools (SQLAlchemy, PyMySQL, Alembic)
- Authentication libraries (python-jose, passlib)
- Testing tools (pytest, pytest-asyncio)

**Installation time:** ~5-10 minutes depending on your internet speed

### Step 4: Install System Dependencies

```bash
# Install libmagic (required for python-magic file type detection)
brew install libmagic

# Install WeasyPrint dependencies (required for PDF generation)
brew install pango gdk-pixbuf libffi

# Install Pandoc (optional - for enhanced PDF generation)
brew install pandoc

# Install Chinese fonts for PDF output (optional - macOS has built-in Chinese fonts)
brew install --cask font-noto-sans-cjk
# Note: If above fails, skip it - macOS built-in fonts (PingFang SC, Heiti TC) work fine
```

**If Homebrew is not installed:**

```bash
/bin/bash \
-c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

### Step 5: Configure Environment Variables

```bash
# Copy template
cp .env.example .env

# Edit with your preferred editor
nano .env
# or
code .env
```

**Important settings to verify in `.env`:**

```bash
# Database (pre-configured, should work as-is)
MYSQL_HOST=mysql.theaken.com
MYSQL_PORT=33306
MYSQL_USER=A060
MYSQL_PASSWORD=WLeSCi0yhtc7
MYSQL_DATABASE=db_A060

# Application ports
BACKEND_PORT=12010
FRONTEND_PORT=12011

# Security (CHANGE THIS!)
SECRET_KEY=your-secret-key-here-please-change-this-to-random-string
```

**Generate a secure SECRET_KEY:**

```bash
python -c "import secrets; print(secrets.token_urlsafe(32))"
```

Copy the output and paste it as your `SECRET_KEY` value.

### Step 6: Set Environment Variable for WeasyPrint

Add to your shell config (`~/.zshrc` or `~/.bash_profile`):

```bash
# /opt/homebrew is the Homebrew prefix on Apple Silicon; on Intel Macs it is /usr/local
export DYLD_LIBRARY_PATH="/opt/homebrew/lib:$DYLD_LIBRARY_PATH"
```

Then reload:

```bash
source ~/.zshrc
# or
source ~/.bash_profile
```

### Step 7: Run Service Layer Tests

Verify all services are working:

```bash
cd backend
python test_services.py
```

Expected output:

```
✓ PASS - database
✓ PASS - preprocessor
✓ PASS - pdf_generator
✓ PASS - file_manager
Total: 4-5/5 tests passed
```

**Note:** The OCR engine test may fail on the first run while PaddleOCR downloads its models (~900MB). This is normal.

### Step 8: Verify Directory Structure and Start Backend Server

The directories should already exist, but verify:

```bash
# Run from the project root (use `cd ..` first if you are still in backend/ after Step 7)
ls -la
```

You should see:

- `backend/` - FastAPI application
- `frontend/` - React application (will be populated later)
- `uploads/` - File upload storage
- `storage/` - Processed results
- `models/` - PaddleOCR models (empty until first run)
- `logs/` - Application logs

**Start the backend server:**

```bash
cd backend
python -m app.main
```

**Expected output:**

```
INFO: Started server process
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:12010
```

**Test the server:**

Open a browser and visit:

- http://localhost:12010 - API root
- http://localhost:12010/docs - Interactive API documentation
- http://localhost:12010/health - Health check endpoint

### Step 9: Download PaddleOCR Models

On the first OCR request, PaddleOCR will automatically download models (~900MB).

**To pre-download models manually:**

```bash
python -c "
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='ch', use_gpu=False)
print('Models downloaded successfully')
"
```

This will download:

- Detection model: ch_PP-OCRv4_det
- Recognition model: ch_PP-OCRv4_rec
- Angle classifier: ch_ppocr_mobile_v2.0_cls

Models are stored in: `./models/paddleocr/`

## Troubleshooting

### Issue: "conda: command not found"

**Solution:**

```bash
# Reload shell configuration
source ~/.zshrc
# or
source ~/.bashrc

# If still not working, manually add Conda to PATH
export PATH="$HOME/miniconda3/bin:$PATH"
```

### Issue: PaddlePaddle installation fails

**Solution:**

```bash
# For Apple Silicon Macs, ensure you're using ARM version
pip uninstall paddlepaddle
pip install paddlepaddle --no-cache-dir
```

### Issue: WeasyPrint fails to install

**Solution:**

```bash
# Install required system libraries
brew install cairo pango gdk-pixbuf libffi
pip install --upgrade weasyprint
```

### Issue: Database connection fails

**Solution:**

```bash
# Test database connection
python -c "
import pymysql
conn = pymysql.connect(
    host='mysql.theaken.com',
    port=33306,
    user='A060',
    password='WLeSCi0yhtc7',
    database='db_A060'
)
print('Database connection OK')
conn.close()
"
```

If this fails, verify:

- Internet connection is active
- Firewall is not blocking port 33306
- Database credentials in `.env` are correct

### Issue: Port 12010 already in use

**Solution:**

```bash
# Find what's using the port
lsof -i :12010

# Kill the process or change the port in .env
# Edit BACKEND_PORT=12012 (or any free port; 12011 is the default FRONTEND_PORT)
```

## Next Steps

After
successful setup:

1. ✅ Environment is ready
2. ✅ Backend server can start
3. ✅ Database connection configured

**Ready to develop:**

- Implement database models (`backend/app/models/`)
- Create API endpoints (`backend/app/api/v1/`)
- Build OCR service (`backend/app/services/ocr_service.py`)
- Develop frontend UI (`frontend/src/`)

**Start with Phase 1 tasks:**

Refer to [openspec/changes/add-ocr-batch-processing/tasks.md](openspec/changes/add-ocr-batch-processing/tasks.md) for detailed implementation tasks.

## Development Workflow

```bash
# Activate environment
conda activate tool_ocr

# Start backend in development mode (auto-reload)
cd backend
python -m app.main

# Alternative: load the shell config, activate the environment, set the WeasyPrint path, and start the backend in one command
bash -c "source ~/.zshrc && conda activate tool_ocr && export DYLD_LIBRARY_PATH=/opt/homebrew/lib:$DYLD_LIBRARY_PATH && python -m app.main"

# In another terminal, start frontend
cd frontend
npm run dev

# Run tests
cd backend
pytest tests/ -v

# Check code style
black app/
pylint app/
```

## Background Services

### Automatic Cleanup Scheduler

The application automatically runs a cleanup scheduler that:

- **Runs every**: 1 hour (configurable via `BackgroundTaskManager.cleanup_interval`)
- **Deletes files older than**: 24 hours (configurable via `BackgroundTaskManager.file_retention_hours`)
- **Cleans up**:
  - Physical files and directories
  - Database records (results, files, batches)
  - Expired batches in COMPLETED, FAILED, or PARTIAL status

The cleanup scheduler starts automatically when the backend application starts and stops gracefully on shutdown.

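To make the interval and retention settings concrete, here is a minimal sketch of a periodic file cleanup loop. It is not the actual `BackgroundTaskManager` implementation: the function names `delete_expired_files` and `cleanup_loop` are hypothetical, and the sketch only removes files by modification time, leaving out database records and batch status handling.

```python
import asyncio
import time
from pathlib import Path
from typing import List, Optional


def delete_expired_files(
    root: Path, retention_hours: float, now: Optional[float] = None
) -> List[Path]:
    """Delete files under `root` last modified before the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    removed = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


async def cleanup_loop(
    root: Path,
    interval_seconds: float,
    retention_hours: float,
    stop_event: asyncio.Event,
) -> None:
    """Run one cleanup pass per interval until `stop_event` is set."""
    while not stop_event.is_set():
        delete_expired_files(root, retention_hours)
        try:
            # Sleep for one interval, but wake up immediately on shutdown.
            await asyncio.wait_for(stop_event.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without a stop request; run the next pass.
```

With the defaults described above, the loop would be started as `cleanup_loop(Path("uploads"), 3600, 24, stop_event)` from inside a running event loop, and setting `stop_event` on shutdown lets the current wait end without finishing the full interval.
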

**Monitor cleanup activity:**

```bash
# Watch cleanup logs in real-time
tail -f /tmp/tool_ocr_startup.log | grep cleanup

# Or check application logs
tail -f backend/logs/app.log | grep cleanup
```

### Retry Logic

OCR processing includes automatic retry logic:

- **Maximum retries**: 3 attempts (configurable)
- **Retry delay**: 5 seconds between attempts (configurable)
- **Tracks**: `retry_count` field in database
- **Error handling**: Detailed error messages with retry attempt information

**Configuration** (in [backend/app/services/background_tasks.py](backend/app/services/background_tasks.py)):

```python
task_manager = BackgroundTaskManager(
    max_retries=3,            # Number of retry attempts
    retry_delay=5,            # Delay between retries (seconds)
    cleanup_interval=3600,    # Cleanup runs every hour
    file_retention_hours=24   # Keep files for 24 hours
)
```

### Background Task Status

Check if background services are running:

```bash
# Check health endpoint
curl http://localhost:12010/health

# Check application startup logs for cleanup scheduler
grep "cleanup scheduler" /tmp/tool_ocr_startup.log
# Expected output: "Started cleanup scheduler for expired files"
# Expected output: "Starting cleanup scheduler (interval: 3600s, retention: 24h)"
```

## Deactivate Environment

When done working:

```bash
conda deactivate
```

## Environment Management

```bash
# List Conda environments
conda env list

# Remove environment (if needed)
conda env remove -n tool_ocr

# Export environment
conda env export > environment.yml

# Create from exported environment
conda env create -f environment.yml
```
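
After recreating or switching environments, it can help to confirm that the right one is active before starting the backend. The sketch below is one possible check, not part of Tool_OCR: `check_environment` is a hypothetical helper, and it relies on `conda activate` exporting the active environment name in the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys
from typing import List, Mapping, Optional, Sequence


def check_environment(
    expected_env: str = "tool_ocr",
    expected_python: Sequence[int] = (3, 10),
    environ: Optional[Mapping[str, str]] = None,
    version_info: Optional[Sequence[int]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the environment looks right."""
    environ = os.environ if environ is None else environ
    version = tuple(sys.version_info if version_info is None else version_info)[:2]
    problems = []

    # `conda activate` exports the active environment's name in CONDA_DEFAULT_ENV.
    active_env = environ.get("CONDA_DEFAULT_ENV")
    if active_env != expected_env:
        problems.append(
            f"active Conda environment is {active_env!r}, expected {expected_env!r}"
        )

    # Compare only major.minor, since the guide pins Python 3.10.
    if version != tuple(expected_python):
        found = ".".join(str(part) for part in version)
        wanted = ".".join(str(part) for part in expected_python)
        problems.append(f"Python is {found}, expected {wanted}")

    return problems
```

Calling `check_environment()` with no arguments inspects the current process; printing each returned string gives a quick summary of what needs fixing.
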