OCR/openspec/changes/simple-text-positioning/proposal.md

# Simple Text Positioning from Raw OCR

## Summary

Simplify OCR track PDF generation by rendering raw OCR text at its detected bounding-box positions instead of reconstructing table structure.

## Problem

Current OCR track processing has multiple failure points:
1. PP-Structure table structure recognition fails for borderless tables
2. Multi-column layouts get merged incorrectly into single tables
3. Table HTML reconstruction produces wrong cell positions
4. Complex column correction algorithms still can't fix fundamental structure errors

Meanwhile, raw OCR (`raw_ocr_regions.json`) correctly identifies all text with accurate bounding boxes.

## Solution

Replace complex table reconstruction with simple text positioning:
1. Read raw OCR regions directly
2. Position text at bbox coordinates
3. Calculate text rotation from bbox quadrilateral shape
4. Estimate font size from bbox height
5. Skip table HTML parsing entirely for OCR track
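
Steps 1 to 4 can be sketched roughly as follows. This is a minimal illustration, not the project's implementation, and it rests on several assumptions: the schema of `raw_ocr_regions.json` is not specified here, so the sketch assumes a JSON list of objects with a `text` string and a `bbox` of four `[x, y]` points ordered top-left, top-right, bottom-right, bottom-left, in image coordinates (y grows downward) already scaled to PDF units. The names `Placement`, `placement_from_region`, `load_placements`, and the `height_to_font` ratio of 0.75 are hypothetical and chosen only for illustration.

```python
import json
import math
from dataclasses import dataclass


@dataclass
class Placement:
    """Where and how to draw one OCR text region (hypothetical helper type)."""
    text: str
    x: float          # baseline anchor, PDF coordinates (origin bottom-left)
    y: float
    angle_deg: float  # counter-clockwise rotation in degrees
    font_size: float


def placement_from_region(region, page_height, height_to_font=0.75):
    """Derive position, rotation, and font size from one region.

    Assumes region["bbox"] holds four [x, y] points ordered
    top-left, top-right, bottom-right, bottom-left, with y growing downward.
    """
    (x0, y0), (x1, y1), _, (x3, y3) = region["bbox"]

    # Rotation: direction of the top edge. The y delta is negated because
    # image y points down while PDF y points up.
    angle_deg = math.degrees(math.atan2(-(y1 - y0), x1 - x0))

    # Box height: length of the left edge, which stays valid for rotated boxes.
    box_height = math.hypot(x3 - x0, y3 - y0)

    # Font size: a fixed fraction of the box height (illustrative ratio only).
    font_size = box_height * height_to_font

    # Anchor at the bottom-left corner, flipped into PDF coordinates.
    return Placement(region["text"], x3, page_height - y3, angle_deg, font_size)


def load_placements(json_text, page_height):
    """Parse the assumed JSON list and skip regions with blank text."""
    regions = json.loads(json_text)
    return [
        placement_from_region(r, page_height)
        for r in regions
        if r.get("text", "").strip()
    ]
```

Each `Placement` carries everything a PDF drawing call needs (anchor point, rotation, font size), so no table HTML has to be parsed on this path. The height-to-font ratio would need tuning against the actual fonts used in the output.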

## Benefits

- **Reliability**: Output depends only on raw OCR text positions, which are accurate, not on table structure recognition
- **Simplicity**: Eliminates complex table parsing logic
- **Performance**: Faster processing without structure analysis
- **Consistency**: Predictable output regardless of table type

## Trade-offs

- No table borders in output
- No cell structure (colspan, rowspan)
- Output approximates the visual layout rather than preserving semantic structure

## Scope

- OCR track PDF generation only
- Direct track remains unchanged (uses native PDF text extraction)