first

2025-11-12 22:53:17 +08:00
commit da700721fa
130 changed files with 23393 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,233 @@
+# Tool_OCR
+
+**OCR Batch Processing System with Structure Extraction**
+
+A web-based system for batch extraction of text, images, and document structure from multiple files using PaddleOCR-VL.
+
+## Features
+
+- 🔍 **Multi-Language OCR**: Support for 109 languages (Chinese, English, Japanese, Korean, etc.)
+- 📄 **Document Structure Analysis**: Intelligent layout analysis with PP-StructureV3
+- 🖼️ **Image Extraction**: Preserve document images alongside text content
+- 📑 **Batch Processing**: Process multiple files concurrently with progress tracking
+- 📤 **Multiple Export Formats**: TXT, JSON, Excel, Markdown with images, searchable PDF
+- 🔧 **Flexible Configuration**: Rule-based output formatting
+- 🌐 **Translation Ready**: Reserved architecture for future translation features
+
+## Tech Stack
+
+### Backend
+- **Framework**: FastAPI 0.115.0
+- **OCR Engine**: PaddleOCR 3.0+ with PaddleOCR-VL
+- **Database**: MySQL via SQLAlchemy
+- **PDF Generation**: Pandoc + WeasyPrint
+- **Image Processing**: OpenCV, Pillow, pdf2image
+
+### Frontend
+- **Framework**: React 18 with Vite
+- **Styling**: TailwindCSS + shadcn/ui
+- **HTTP Client**: Axios with React Query
+
+## Prerequisites
+
+- **macOS**: Apple Silicon (M1/M2/M3) or Intel
+- **Python**: 3.10+
+- **Conda**: Miniconda or Anaconda (installed automatically by `setup_conda.sh` if missing)
+- **Homebrew**: For system dependencies
+- **MySQL**: External database server (provided)
+
+## Installation
+
+### 1. Automated Setup (Recommended)
+
+```bash
+# Change to the project directory (clone the repository first if needed)
+cd /Users/egg/Projects/Tool_OCR
+
+# Run automated setup script
+chmod +x setup_conda.sh
+./setup_conda.sh
+
+# If Conda was just installed, reload your shell
+source ~/.zshrc  # or source ~/.bash_profile
+
+# Run the script again to create environment
+./setup_conda.sh
+```
+
+### 2. Install Dependencies
+
+```bash
+# Activate Conda environment
+conda activate tool_ocr
+
+# Install Python dependencies
+pip install -r requirements.txt
+
+# Install system dependencies (Pandoc for PDF generation)
+brew install pandoc
+
+# Install Chinese fonts for PDF generation (optional)
+brew install --cask font-noto-sans-cjk
+# Note: macOS built-in fonts work fine, so this step can be skipped
+```
+
+### 3. Download PaddleOCR Models
+
+```bash
+# Create models directory
+mkdir -p models/paddleocr
+
+# Models will be automatically downloaded on first run
+# (~900MB total, includes PaddleOCR-VL 0.9B model)
+```
+
+### 4. Configure Environment
+
+```bash
+# Copy environment template
+cp .env.example .env
+
+# Edit .env with your settings
+# Database credentials are pre-configured
+nano .env
+```
+
+### 5. Initialize Database
+
+```bash
+# Database schema will be created automatically on first run
+# Using: mysql.theaken.com:33306/db_A060
+```
+
+## Usage
+
+### Start Backend Server
+
+```bash
+# Activate environment
+conda activate tool_ocr
+
+# Start FastAPI server
+cd backend
+python -m app.main
+
+# Server runs at: http://localhost:12010
+# API docs: http://localhost:12010/docs
+```
+
+### Start Frontend (Coming Soon)
+
+```bash
+# Install frontend dependencies
+cd frontend
+npm install
+
+# Start development server
+npm run dev
+
+# Frontend runs at: http://localhost:12011
+```
+
+## Project Structure
+
+```
+Tool_OCR/
+├── backend/
+│   ├── app/
+│   │   ├── api/v1/          # API endpoints
+│   │   ├── core/            # Configuration, database
+│   │   ├── models/          # Database models
+│   │   ├── services/        # Business logic
+│   │   ├── utils/           # Utilities
+│   │   └── main.py          # Application entry point
+│   └── tests/               # Test suite
+├── frontend/
+│   └── src/                 # React application
+├── uploads/
+│   ├── temp/                # Temporary uploads
+│   ├── processed/           # Processed files
+│   └── images/              # Extracted images
+├── storage/
+│   ├── markdown/            # Markdown outputs
+│   ├── json/                # JSON results
+│   └── exports/             # Export files
+├── models/
+│   └── paddleocr/           # PaddleOCR models
+├── config/                  # Configuration files
+├── templates/               # PDF templates
+├── logs/                    # Application logs
+├── requirements.txt         # Python dependencies
+├── setup_conda.sh           # Environment setup script
+├── .env.example             # Environment template
+└── README.md
+```
+
+## API Endpoints (Planned)
+
+- `POST /api/v1/ocr/upload` - Upload files for OCR processing
+- `GET /api/v1/ocr/tasks` - List all OCR tasks
+- `GET /api/v1/ocr/tasks/{task_id}` - Get task details
+- `POST /api/v1/ocr/batch` - Create batch processing task
+- `GET /api/v1/export/{task_id}` - Export results (TXT/JSON/Excel/MD/PDF)
+- `POST /api/v1/translate/document` - Translate document (reserved, returns 501)
+
+## Development
+
+### Run Tests
+
+```bash
+cd backend
+pytest tests/ -v --cov=app
+```
+
+### Code Quality
+
+```bash
+# Format code
+black app/
+
+# Lint code
+pylint app/
+```
+
+## OpenSpec Workflow
+
+This project follows OpenSpec for specification-driven development:
+
+```bash
+# View current changes
+openspec list
+
+# Validate specifications
+openspec validate add-ocr-batch-processing
+
+# View implementation tasks
+cat openspec/changes/add-ocr-batch-processing/tasks.md
+```
+
+## Roadmap
+
+- [x] **Phase 0**: Environment setup and configuration
+- [ ] **Phase 1**: Core OCR with structure extraction
+- [ ] **Phase 2**: Frontend development
+- [ ] **Phase 3**: Testing & optimization
+- [ ] **Phase 4**: Deployment
+- [ ] **Phase 5**: Translation feature (future)
+
+## License
+
+[To be determined]
+
+## Contributors
+
+- Development environment: macOS Apple Silicon
+- Database: MySQL external server
+- OCR Engine: PaddleOCR-VL 0.9B with PP-StructureV3
+
+## Support
+
+For issues and questions, refer to:
+- OpenSpec documentation: `openspec/AGENTS.md`
+- Task breakdown: `openspec/changes/add-ocr-batch-processing/tasks.md`
+- Specifications: `openspec/changes/add-ocr-batch-processing/specs/`
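
The README's "API Endpoints (Planned)" list and the backend address from "Start Backend Server" can be combined into a small client-side sketch. The snippet below only builds the method and URL for each planned route; it does not call the server, since the endpoints are not implemented in this commit. The dictionary keys and the `build_request` helper are hypothetical names introduced for illustration, not part of the project.

```python
from urllib.parse import urljoin

# Base URL taken from the "Start Backend Server" section of the README.
BASE_URL = "http://localhost:12010"

# Planned routes copied from the README. The dictionary keys are
# hypothetical labels used only in this sketch.
PLANNED_ENDPOINTS = {
    "upload": ("POST", "/api/v1/ocr/upload"),
    "list_tasks": ("GET", "/api/v1/ocr/tasks"),
    "task_detail": ("GET", "/api/v1/ocr/tasks/{task_id}"),
    "batch": ("POST", "/api/v1/ocr/batch"),
    "export": ("GET", "/api/v1/export/{task_id}"),
    "translate": ("POST", "/api/v1/translate/document"),
}


def build_request(name, base_url=BASE_URL, **path_params):
    """Return (HTTP method, absolute URL) for one planned endpoint."""
    method, template = PLANNED_ENDPOINTS[name]
    # Fill in placeholders such as {task_id}; a missing one raises KeyError.
    path = template.format(**path_params)
    return method, urljoin(base_url, path)


if __name__ == "__main__":
    # Print every planned route with an example task id.
    for name in PLANNED_ENDPOINTS:
        print(name, build_request(name, task_id=42))
```

Keeping the routes in one table makes it easy to update the client if the paths change before Phase 1 lands. The resulting pair can be handed to any HTTP client once the backend exists.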