# Change: Fix Table Column Alignment with Header-Anchor Correction

## Why

PP-Structure's table structure recognition frequently outputs cells with incorrect column indices, causing a "column shift" in which content appears in the wrong column. This happens because:

1. **Semantic over Geometric**: The model infers row/col from semantic patterns rather than physical coordinates
2. **Vertical text fragmentation**: Chinese vertical text (e.g., "报价内容", "quotation content") gets split into fragments
3. **Missing left boundary**: When the table's left border is unclear, cells shift left incorrectly

The result: A cell with X-coordinate 213 gets assigned to column 0 (range 96-162) instead of column 1 (range 204-313).

## What Changes

- **Add Header-Anchor Alignment**: Use the first row (header) X-coordinates as column reference points
- **Add Coordinate-Based Column Correction**: Validate and correct cell column assignments based on X-coordinate overlap with header columns
- **Add Vertical Fragment Merging**: Detect and merge vertically stacked narrow text blocks that represent vertical text
- **Add Configuration Options**: Enable/disable correction features independently

## Impact

- Affected specs: `document-processing`
- Affected code:
  - `backend/app/services/table_column_corrector.py` (new)
  - `backend/app/services/pdf_generator_service.py`
  - `backend/app/core/config.py`

## Problem Analysis

### Example: scan.pdf Table 7

**Raw PP-Structure Output:**
```
Row 5: "3、適應產品..." at X=213
  Model says: col=0

Header Row 0:
  - Column 0 (序號): X range [96, 162]
  - Column 1 (產品名稱): X range [204, 313]
```

**Problem:** X=213 lies well outside column 0's range (max 162) but falls within column 1's range (204 to 313).

**Solution:** Force-correct col=0 → col=1 based on X-coordinate alignment with the header.

### Vertical Text Issue

**Raw OCR:**
```
Block A: "报价内" at X≈100, Y=[100, 200]
Block B: "容--" at X≈102, Y=[200, 300]
```

**Problem:** These should be one cell spanning multiple rows, but they appear as separate fragments.

**Solution:** Merge vertically aligned narrow blocks before structure recognition.
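
The two fixes in this section can be sketched roughly as follows. This is a minimal illustration, not the contents of `table_column_corrector.py`: the `Block` type, the function names `correct_column` and `merge_vertical_fragments`, and the thresholds `min_overlap`, `max_width`, `x_tol`, and `y_gap` are all hypothetical, and the box widths used when trying it out would have to be assumed, since the examples above give only a single X value per block.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Block:
    """Hypothetical OCR block: text plus an axis-aligned bounding box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def correct_column(cell_x0: float, cell_x1: float, model_col: int,
                   header_ranges: Sequence[Tuple[float, float]],
                   min_overlap: float = 0.5) -> int:
    """Pick the header column with the largest horizontal overlap.

    The model's column is kept when no header column covers at least
    `min_overlap` (an assumed threshold) of the cell's width.
    """
    width = max(cell_x1 - cell_x0, 1e-6)
    best_col, best_ratio = model_col, 0.0
    for col, (h0, h1) in enumerate(header_ranges):
        overlap = max(0.0, min(cell_x1, h1) - max(cell_x0, h0))
        ratio = overlap / width
        if ratio > best_ratio:
            best_col, best_ratio = col, ratio
    return best_col if best_ratio >= min_overlap else model_col


def merge_vertical_fragments(blocks: Sequence[Block], max_width: float = 40.0,
                             x_tol: float = 10.0, y_gap: float = 5.0) -> List[Block]:
    """Merge narrow blocks that are stacked directly on top of each other.

    Two blocks are merged when both are narrower than `max_width`, their
    horizontal centers differ by at most `x_tol`, and the lower block starts
    within `y_gap` of where the upper block ends. All thresholds are assumptions.
    """
    merged: List[Block] = []
    for b in sorted(blocks, key=lambda blk: blk.y0):  # top to bottom
        target = None
        if b.x1 - b.x0 <= max_width:
            for m in merged:
                narrow = m.x1 - m.x0 <= max_width
                aligned = abs((m.x0 + m.x1) / 2 - (b.x0 + b.x1) / 2) <= x_tol
                stacked = abs(b.y0 - m.y1) <= y_gap
                if narrow and aligned and stacked:
                    target = m
                    break
        if target is None:
            # Copy so the caller's blocks are never mutated.
            merged.append(Block(b.text, b.x0, b.y0, b.x1, b.y1))
        else:
            # Extend the existing fragment downward and append the text.
            target.text += b.text
            target.x0 = min(target.x0, b.x0)
            target.x1 = max(target.x1, b.x1)
            target.y1 = max(target.y1, b.y1)
    return merged
```

With the Table 7 header ranges `[(96, 162), (204, 313)]` and a cell assumed to span X=213 to X=300, `correct_column` moves the cell from column 0 to column 1. Overlap ratio is used instead of a single X value so that a cell straddling a column boundary goes to the column covering most of its width.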