# PP-StructureV3 Model Cache Cleanup Guide

## Overview

After a PP-StructureV3 model upgrade, older models that are no longer used may remain in the cache directory. This guide explains how to remove them safely to free disk space.

## Model Cache Location

PaddleX/PaddleOCR 3.x stores downloaded models in:

```
~/.paddlex/official_models/
```

## Models After Upgrade

### Current Active Models (DO NOT DELETE)

| Model | Purpose | Approx. Size |
|-------|---------|--------------|
| `PP-DocLayout_plus-L` | Layout detection for Chinese documents | ~350MB |
| `SLANeXt_wired` | Table structure recognition (bordered tables) | ~351MB |
| `SLANeXt_wireless` | Table structure recognition (borderless tables) | ~351MB |
| `PP-FormulaNet_plus-L` | Formula recognition (Chinese + English) | ~800MB |
| `PP-OCRv5_*` | Text detection and recognition | ~150MB |
| `picodet_lcnet_x1_0_fgd_layout_cdla` | CDLA layout model option | ~10MB |

### Deprecated Models (Safe to Delete)

| Model | Reason | Approx. Size |
|-------|--------|--------------|
| `PP-DocLayout-S` | Replaced by PP-DocLayout_plus-L | ~50MB |
| `SLANet` | Replaced by SLANeXt_wired/wireless | ~7MB |
| `SLANet_plus` | Replaced by SLANeXt_wired/wireless | ~7MB |
| `PP-FormulaNet-S` | Replaced by PP-FormulaNet_plus-L | ~200MB |
| `PP-FormulaNet-L` | Replaced by PP-FormulaNet_plus-L | ~400MB |

## Cleanup Commands

### List Current Cache

```bash
# List all cached models
ls -la ~/.paddlex/official_models/

# Show disk usage per model
du -sh ~/.paddlex/official_models/*
```

### Delete Deprecated Models

```bash
# Remove deprecated layout model
rm -rf ~/.paddlex/official_models/PP-DocLayout-S

# Remove deprecated table models
rm -rf ~/.paddlex/official_models/SLANet
rm -rf ~/.paddlex/official_models/SLANet_plus

# Remove deprecated formula models (if present)
rm -rf ~/.paddlex/official_models/PP-FormulaNet-S
rm -rf ~/.paddlex/official_models/PP-FormulaNet-L
```

### Cleanup Script

```bash
#!/bin/bash
# cleanup_old_models.sh - Remove deprecated PP-StructureV3 models

CACHE_DIR="$HOME/.paddlex/official_models"

echo "PP-StructureV3 Model Cleanup"
echo "============================"
echo ""

# Check if cache directory exists
if [ ! -d "$CACHE_DIR" ]; then
    echo "Cache directory not found: $CACHE_DIR"
    exit 0
fi

# List deprecated models
DEPRECATED_MODELS=(
    "PP-DocLayout-S"
    "SLANet"
    "SLANet_plus"
    "PP-FormulaNet-S"
    "PP-FormulaNet-L"
)

echo "Checking for deprecated models..."
echo ""

# Count the deprecated models that are present (a count, not a byte total)
FOUND_COUNT=0
for model in "${DEPRECATED_MODELS[@]}"; do
    MODEL_PATH="$CACHE_DIR/$model"
    if [ -d "$MODEL_PATH" ]; then
        SIZE=$(du -sh "$MODEL_PATH" 2>/dev/null | cut -f1)
        echo "Found: $model ($SIZE)"
        FOUND_COUNT=$((FOUND_COUNT + 1))
    fi
done

if [ "$FOUND_COUNT" -eq 0 ]; then
    echo "No deprecated models found. Cache is clean."
    exit 0
fi

echo ""
# -r keeps backslashes in the answer literal
read -r -p "Delete these models? [y/N]: " confirm

if [ "$confirm" = "y" ] || [ "$confirm" = "Y" ]; then
    for model in "${DEPRECATED_MODELS[@]}"; do
        MODEL_PATH="$CACHE_DIR/$model"
        if [ -d "$MODEL_PATH" ]; then
            rm -rf "$MODEL_PATH"
            echo "Deleted: $model"
        fi
    done
    echo ""
    echo "Cleanup complete."
else
    echo "Cleanup cancelled."
fi
```

## Space Savings Estimate

Based on the approximate sizes listed above, cleanup frees roughly:

- **~50MB** from the deprecated layout model
- **~14MB** from the deprecated table models
- **~600MB** from the deprecated formula models (if present)

Total potential savings: **~664MB**

## Notes

1. Models are downloaded on first use. Deleting an active model triggers a re-download the next time it is needed.
2. The cache directory may differ if the `PADDLEX_HOME` environment variable is set.
3. Always verify which models your configuration uses before deleting anything.
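
Notes 2 and 3 can be checked before deleting anything. The following is a minimal sketch, not part of the official tooling. It assumes that `PADDLEX_HOME`, when set, points to the parent of `official_models/`, and that your pipeline configuration names its models in plain text (for example `model_name: SLANet_plus`). The helper names `resolve_cache_dir` and `config_uses_model` are hypothetical.

```bash
#!/bin/bash
# Sketch only: the PADDLEX_HOME layout and the config format are assumptions.

# Print the model cache directory, honoring PADDLEX_HOME when it is set.
# Assumption: models live in an official_models/ subdirectory of that path.
resolve_cache_dir() {
    echo "${PADDLEX_HOME:-$HOME/.paddlex}/official_models"
}

# Succeed (exit status 0) if the config file mentions the model name.
# -F treats the name as a fixed string; -w requires a whole-word match,
# so "SLANet" does not match inside "SLANet_plus".
config_uses_model() {
    local config_file="$1" model="$2"
    grep -qwF -- "$model" "$config_file"
}
```

For example, `config_uses_model ./my_pipeline.yaml SLANet && echo "still in use"` would flag a deprecated model that a (hypothetical) `my_pipeline.yaml` still references, and `du -sh "$(resolve_cache_dir)"/*` lists sizes in whichever cache directory is in effect.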