# Change: Fix OCR Track Reflow PDF

## Why

The OCR Track reflow PDF is missing most of the document's content:

1. PP-StructureV3 extracts tables as elements but stores `content: ""` (an empty string) instead of structured `content.cells` data.
2. The `generate_reflow_pdf` method expects `content.cells` for tables, so it skips these tables.
3. The table text exists in `raw_ocr_regions.json` (59 text blocks), but reflow PDF generation does not use it.
4. As a result, the reflow PDF renders only 6 text elements, against 59 raw OCR regions.
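
The skip behavior described in items 1 and 2 can be illustrated with a minimal sketch. The element shape (`type`, the fields inside `cells`) and the helper name `has_renderable_table` are assumptions made for illustration, not the actual code in `generate_reflow_pdf`:

```python
from typing import Any


def has_renderable_table(element: dict[str, Any]) -> bool:
    """Return True only if a table element carries structured cell data."""
    content = element.get("content")
    # An OCR Track table arrives with content == "", which is not a dict,
    # so a renderer that looks for content["cells"] has nothing to draw.
    return isinstance(content, dict) and bool(content.get("cells"))


# Hypothetical elements: field names inside "cells" are illustrative only.
ocr_table = {"type": "table", "content": ""}
direct_table = {
    "type": "table",
    "content": {"cells": [{"row": 0, "col": 0, "content": "A"}]},
}

skipped = [t for t in (ocr_table, direct_table) if not has_renderable_table(t)]
```

Under these assumptions only the OCR Track table lands in `skipped`, which matches the content loss described above.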

The Layout PDF works correctly because it uses `raw_ocr_regions.json` via Simple Text Positioning mode, bypassing the need for structured table data.

## What Changes

### Reflow PDF Generation for OCR Track

Modify `generate_reflow_pdf` to use `raw_ocr_regions.json` as the primary text source for OCR Track documents:

1. **Detect processing track** from JSON metadata
2. **For OCR Track**: Load `raw_ocr_regions.json` and render all text blocks in reading order
3. **For Direct Track**: Continue using `content.cells` for tables (already works)
4. **Images/Charts**: Continue using `content.saved_path` from elements (works for both tracks)
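
The four steps above could be sketched as a small dispatch function. This is a sketch under stated assumptions, not the real service code: the metadata key `processing_track`, the element type names `image` and `chart`, the fallback to `direct` when metadata is missing, and the helper names `detect_track` and `plan_reflow` are all hypothetical.

```python
from typing import Any, Optional

VISUAL_TYPES = {"image", "chart"}  # assumed element type names


def detect_track(result: dict[str, Any]) -> str:
    """Read the processing track from JSON metadata (key name is an assumption)."""
    metadata = result.get("metadata") or {}
    # Falling back to "direct" keeps the existing behavior when the key is absent.
    return str(metadata.get("processing_track", "direct")).lower()


def plan_reflow(
    result: dict[str, Any],
    raw_regions: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Choose the text source for the reflow PDF; visuals always come from elements."""
    elements = result.get("elements", [])
    visuals = [e for e in elements if e.get("type") in VISUAL_TYPES]
    if detect_track(result) == "ocr" and raw_regions:
        # OCR Track: raw OCR text blocks replace the element text and tables.
        return {"text_source": "raw_ocr_regions", "text": raw_regions, "visuals": visuals}
    # Direct Track (or no raw regions available): keep using the elements.
    body = [e for e in elements if e.get("type") not in VISUAL_TYPES]
    return {"text_source": "elements", "text": body, "visuals": visuals}
```

Falling back to the element-based path when `raw_regions` is empty or missing is a design choice in this sketch; it means an OCR Track result without its raw regions file renders as it does today rather than failing.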

### Data Flow

**OCR Track Reflow PDF (NEW):**
```
raw_ocr_regions.json (59 text blocks)
  + scan_result.json (images/charts only)
  → Sort by Y coordinate (reading order)
  → Render text paragraphs + images
```
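
The "sort by Y coordinate" step might look like the sketch below. It assumes `raw_ocr_regions.json` holds a top-level list of objects with `text` and `bbox` keys, and that `bbox` is either a flat `[x0, y0, x1, y1]` list or a polygon of `[x, y]` points. The real schema may differ, and the helper names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def top_left(bbox: list) -> tuple[float, float]:
    """Return (y, x) of the top-left corner for a flat box or a polygon."""
    if bbox and isinstance(bbox[0], (list, tuple)):
        # Polygon form: [[x, y], [x, y], ...]
        xs = [point[0] for point in bbox]
        ys = [point[1] for point in bbox]
        return float(min(ys)), float(min(xs))
    # Flat form: [x0, y0, x1, y1]
    return float(bbox[1]), float(bbox[0])


def sort_reading_order(regions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop blank regions, then sort top-to-bottom with X as the tie-breaker."""
    non_empty = (r for r in regions if str(r.get("text", "")).strip())
    return sorted(non_empty, key=lambda r: top_left(r["bbox"]))


def load_regions(path: Path) -> list[dict[str, Any]]:
    """Load the raw OCR regions file and return its text blocks in reading order."""
    return sort_reading_order(json.loads(path.read_text(encoding="utf-8")))
```

A plain Y sort treats two blocks on the same visual line as separate rows whenever their top edges differ by even a pixel, and it does not account for multi-column layouts. Grouping blocks into lines with a tolerance may be worth considering if ordering issues show up in practice.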

**Direct Track Reflow PDF (UNCHANGED):**
```
*_result.json (elements with content.cells)
  → Render tables, text, images in order
```

## Impact

- **Affected file**: `backend/app/services/pdf_generator_service.py`
- **User experience**: OCR Track reflow PDF will contain all text content (matching Layout PDF)
- **Translation**: Reflow translated PDF will also work correctly for OCR Track

## Migration

- No data migration required
- Existing `raw_ocr_regions.json` files contain all necessary data
- No API changes