docs: update documentation for chart recognition enablement

Updates all project documentation to reflect that chart recognition
is now fully enabled with PaddlePaddle 3.2.1+.

Changes:
- README.md: Remove Known Limitations section about chart recognition,
  update tech stack and prerequisites to include PaddlePaddle 3.2.1+,
  add WSL CUDA configuration notes
- openspec/project.md: Add comprehensive chart recognition feature
  descriptions, update system requirements for GPU/CUDA support
- openspec/changes/add-gpu-acceleration-support/tasks.md: Mark task
  5.4 as completed with resolution details
- openspec/changes/add-gpu-acceleration-support/proposal.md: Update
  Known Issues section to show chart recognition is now resolved
- setup_dev_env.sh: Upgrade PaddlePaddle from 3.0.0 to 3.2.1+, add
  WSL CUDA library path configuration, add chart recognition API
  verification

All documentation now accurately reflects:
- Chart recognition fully enabled
- PaddlePaddle 3.2.1+ with fused_rms_norm_ext API
- WSL CUDA path auto-configuration
- Comprehensive PP-StructureV3 capabilities

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-16 19:04:30 +08:00
parent 7e12f162b4
commit 3f41a33877
5 changed files with 147 additions and 67 deletions
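
The "chart recognition API verification" mentioned in the commit message is not shown in this diff. A minimal sketch of what such a check could look like, assuming it only needs to confirm that `paddle.incubate.nn.functional` exposes `fused_rms_norm_ext` (the helper name `has_api` is hypothetical and is not taken from `setup_dev_env.sh`):

```python
import importlib


def has_api(module_name: str, attr_name: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr_name`."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        # ImportError: the package is not installed.
        # OSError: a native library it depends on could not be loaded.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # Hypothetical usage: report whether the extended RMS norm API is present.
    ok = has_api("paddle.incubate.nn.functional", "fused_rms_norm_ext")
    print("chart recognition API available:", ok)
```

Returning `False` instead of raising lets a setup script print a warning and continue with chart recognition disabled, which matches the graceful-fallback behaviour the README describes for GPU detection.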

@@ -20,7 +20,8 @@ A web-based solution to extract text, images, and document structure from multip
### Backend
- **Framework**: FastAPI 0.115.0
- **OCR Engine**: PaddleOCR 3.0+ with PaddleOCR-VL
- **OCR Engine**: PaddleOCR 3.0+ with PaddleOCR-VL and PP-StructureV3
- **Deep Learning**: PaddlePaddle 3.2.1+ (GPU/CPU support)
- **Database**: MySQL via SQLAlchemy
- **PDF Generation**: Pandoc + WeasyPrint
- **Image Processing**: OpenCV, Pillow, pdf2image
@@ -39,7 +40,9 @@ A web-based solution to extract text, images, and document structure from multip
- **Python**: 3.12+
- **Node.js**: 24.x LTS
- **MySQL**: External database server (provided)
- **GPU** (Optional): NVIDIA GPU with CUDA 11.2+ for hardware acceleration
- **GPU** (Optional): NVIDIA GPU with CUDA 11.8+ for hardware acceleration
  - PaddlePaddle 3.2.1+ requires CUDA 11.8, 12.3, or 12.6+
  - WSL2 users: Ensure NVIDIA CUDA drivers are installed
## Quick Start
@@ -55,10 +58,11 @@ This script automatically:
- Installs Python development tools (pip, venv, build-essential)
- Installs system dependencies (pandoc, LibreOffice, fonts, etc.)
- Installs Node.js (via nvm)
- Installs PaddlePaddle GPU version (if GPU detected) or CPU version
- Installs other Python packages
- Installs PaddlePaddle 3.2.1+ GPU version (if GPU detected) or CPU version
- Configures WSL CUDA library paths (for WSL2 GPU users)
- Installs other Python packages (PaddleOCR, PaddleX, etc.)
- Installs frontend dependencies
- Verifies GPU functionality (if GPU detected)
- Verifies GPU functionality and chart recognition API availability
### 2. Initialize Database
@@ -155,26 +159,11 @@ The system automatically detects and utilizes NVIDIA GPU hardware when available
- **Graceful fallback**: If GPU is unavailable or fails, system automatically uses CPU mode
- **Performance**: GPU acceleration provides 3-10x speedup for OCR processing
- **Configuration**: Control GPU usage via `.env.local` environment variables
- **WSL2 CUDA Setup**: For WSL2 users, CUDA library paths are automatically configured in `~/.bashrc`
Check GPU status at: http://localhost:8000/health
**Chart Recognition**: Requires PaddlePaddle 3.2.0+ for full PP-StructureV3 chart recognition capabilities (chart type detection, data extraction, axis/legend parsing). The setup script installs PaddlePaddle 3.2.1+ which includes all required APIs.
### Known Limitations
**Chart Recognition (PP-StructureV3)**
Due to API incompatibility between PaddleOCR 3.x and PaddlePaddle 3.0.0 stable, the chart recognition feature is currently disabled:
- **Works**: Layout analysis detects and extracts charts/figures as image files
- **Works**: Tables, formulas, and text recognition function normally
- **Disabled**: Deep chart content understanding (chart type, data extraction, axis/legend parsing)
- **Disabled**: Converting chart content to structured data
**Technical Details**:
- The PaddleOCR-VL chart recognition model requires `paddle.incubate.nn.functional.fused_rms_norm_ext` API
- PaddlePaddle 3.0.0 stable only provides the base `fused_rms_norm` function
- This limitation will be resolved when PaddlePaddle releases an update with the extended API
**Workaround**: Charts are saved as images and can be viewed manually. For chart data extraction, consider using specialized chart recognition tools separately.
Check GPU status and chart recognition availability at: http://localhost:8000/health
## API Endpoints
@@ -274,6 +263,8 @@ Internal project use
- Token expiration is set to 24 hours by default
- Office conversion requires LibreOffice (installed via setup script)
- Development environment: WSL2 Ubuntu 24.04 with Python venv
- **GPU acceleration**: Automatically detected and enabled if NVIDIA GPU with CUDA 11.2+ is available
- **WSL GPU support**: Ensure NVIDIA CUDA drivers are installed in WSL for GPU acceleration
- GPU status can be checked via `/health` API endpoint
- **GPU acceleration**: Automatically detected and enabled if NVIDIA GPU with CUDA 11.8+ is available
- **PaddlePaddle version**: System uses PaddlePaddle 3.2.1+ which includes full chart recognition support
- **WSL GPU support**: WSL2 CUDA library paths (`/usr/lib/wsl/lib`) are automatically configured in `~/.bashrc`
- **Chart recognition**: Fully enabled with PP-StructureV3 for chart type detection, data extraction, and structure analysis
- GPU status and chart recognition availability can be checked via `/health` API endpoint
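
The WSL2 path configuration described above can be illustrated with a short sketch. It assumes the setup script's goal is to put `/usr/lib/wsl/lib` at the front of `LD_LIBRARY_PATH` without adding it twice. Both the variable name and the helper `prepend_library_path` are assumptions for illustration; the actual `setup_dev_env.sh` logic is not shown in this diff.

```python
import os

# Directory named in the README notes for WSL2 CUDA libraries.
WSL_CUDA_LIB_DIR = "/usr/lib/wsl/lib"


def prepend_library_path(current: str, new_dir: str = WSL_CUDA_LIB_DIR) -> str:
    """Return the colon-separated list `current` with `new_dir` placed first.

    Empty entries and any existing copy of `new_dir` are dropped, so calling
    the function repeatedly yields the same result.
    """
    parts = [p for p in current.split(":") if p and p != new_dir]
    return ":".join([new_dir, *parts])


if __name__ == "__main__":
    updated = prepend_library_path(os.environ.get("LD_LIBRARY_PATH", ""))
    # Print the line a shell profile such as ~/.bashrc could contain.
    print(f'export LD_LIBRARY_PATH="{updated}"')
```

Making the update idempotent matters because a setup script may be run more than once, and appending the same directory on every run would let the variable grow without bound.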