fix: add image_regions and tables to bbox dimension calculation

Critical Fix - Complete Solution:
The previous fix missed the image_regions and tables fields, causing incorrect
scale factors when images or tables extended beyond the text regions.

User's Scenario (multiple JSON files):
- text_regions: max coordinates ~1850
- image_regions: max coordinates ~2204 (beyond text!)
- tables: max coordinates ~3500 (beyond both!)
- Without checking all fields → scale=1.0 → content out of bounds

Complete Fix:
Now checks ALL possible bbox sources:
1. text_regions - text content
2. image_regions - images/figures/charts (NEW)
3. tables - table structures (NEW)
4. layout - legacy field
5. layout_data.elements - PP-StructureV3 format

Changes:
- backend/app/services/pdf_generator_service.py:
  - Add image_regions check (critical for images at X=1434, X=2204)
  - Add tables check (critical for tables at Y=3500)
  - Add type checks for all fields for safety
  - Update warning message to list all checked fields
- backend/test_all_regions.py:
  - Test all region types are properly checked
  - Validates max dimensions from ALL sources
  - Confirms correct scale factors (~0.27, ~0.24)

Test Results:
✓ All 5 regions checked (text + image + table)
✓ OCR dimensions: 2204 x 3500 (from ALL regions)
✓ Scale factors: X=0.270, Y=0.241 (correct!)

This is the COMPLETE fix for the dimension inference bug.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
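
The scale factors quoted in the test results can be checked by hand. The following is a minimal sketch, assuming the target page is A4 measured in PDF points (595 x 842); the commit message does not state the page size, and `scale_factors` is a hypothetical helper, not a function from pdf_generator_service.py.

```python
from typing import Tuple

# Assumption: the target page is A4 in PDF points. The commit message does
# not state the page size; these values are chosen for illustration.
PAGE_WIDTH_PT = 595.0
PAGE_HEIGHT_PT = 842.0


def scale_factors(
    ocr_width: float,
    ocr_height: float,
    page_width: float = PAGE_WIDTH_PT,
    page_height: float = PAGE_HEIGHT_PT,
) -> Tuple[float, float]:
    """Return (scale_x, scale_y) mapping OCR pixel coordinates to page units."""
    if ocr_width <= 0 or ocr_height <= 0:
        raise ValueError("OCR dimensions must be positive")
    return page_width / ocr_width, page_height / ocr_height


# OCR dimensions reported in the commit message: 2204 x 3500.
scale_x, scale_y = scale_factors(2204, 3500)
print(f"X={scale_x:.3f}, Y={scale_y:.3f}")
```

Under that page-size assumption, the result rounds to the X=0.270 and Y=0.241 values listed in the test results.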
@@ -158,14 +158,22 @@ class PDFGeneratorService:
         all_regions = []
 
         # 1. text_regions - contains all text regions (most common)
-        if 'text_regions' in ocr_data:
+        if 'text_regions' in ocr_data and isinstance(ocr_data['text_regions'], list):
             all_regions.extend(ocr_data['text_regions'])
 
-        # 2. layout - may contain layout information
+        # 2. image_regions - contains image regions
+        if 'image_regions' in ocr_data and isinstance(ocr_data['image_regions'], list):
+            all_regions.extend(ocr_data['image_regions'])
+
+        # 3. tables - contains table regions
+        if 'tables' in ocr_data and isinstance(ocr_data['tables'], list):
+            all_regions.extend(ocr_data['tables'])
+
+        # 4. layout - may contain layout information (may be an empty list)
         if 'layout' in ocr_data and isinstance(ocr_data['layout'], list):
             all_regions.extend(ocr_data['layout'])
 
-        # 3. layout_data.elements - PP-StructureV3 format
+        # 5. layout_data.elements - PP-StructureV3 format
         if 'layout_data' in ocr_data and isinstance(ocr_data['layout_data'], dict):
             elements = ocr_data['layout_data'].get('elements', [])
             if elements:
@@ -173,7 +181,7 @@ class PDFGeneratorService:
 
         if not all_regions:
             # If the JSON is empty, fall back to the original file dimensions
-            logger.warning("No region containing a bbox was found in the JSON; falling back to the original file dimensions.")
+            logger.warning("No text_regions, image_regions, tables, layout, or layout_data.elements found in the JSON; falling back to the original file dimensions.")
             if source_file_path:
                 dims = self.get_original_page_size(source_file_path)
                 if dims:
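
The collection logic in the hunks above, followed by a pass that takes the maximum coordinates, might look roughly like the sketch below. `collect_regions` and `max_dimensions` are hypothetical names, and the bbox layout (a list of [x, y] points stored under a `bbox` key) is an assumption; the real service may store boxes differently. The sample coordinates reuse the figures from the commit message.

```python
from typing import Any, Dict, List, Tuple

# Top-level fields that the diff treats as lists of regions.
LIST_FIELDS = ("text_regions", "image_regions", "tables", "layout")


def collect_regions(ocr_data: Dict[str, Any]) -> List[Any]:
    """Gather regions from every field the diff checks, skipping non-lists."""
    regions: List[Any] = []
    for key in LIST_FIELDS:
        value = ocr_data.get(key)
        if isinstance(value, list):
            regions.extend(value)
    layout_data = ocr_data.get("layout_data")
    if isinstance(layout_data, dict):
        elements = layout_data.get("elements", [])
        if isinstance(elements, list):
            regions.extend(elements)
    return regions


def max_dimensions(regions: List[Any]) -> Tuple[float, float]:
    """Return the largest x and y seen in any bbox.

    Assumption: each region is a dict whose 'bbox' is a list of [x, y] points.
    """
    max_x, max_y = 0.0, 0.0
    for region in regions:
        bbox = region.get("bbox") if isinstance(region, dict) else None
        if not isinstance(bbox, list):
            continue
        for point in bbox:
            if isinstance(point, (list, tuple)) and len(point) >= 2:
                max_x = max(max_x, point[0])
                max_y = max(max_y, point[1])
    return max_x, max_y


# Illustrative data only; coordinates echo the commit message.
sample = {
    "text_regions": [{"bbox": [[100, 100], [1850, 100], [1850, 200], [100, 200]]}],
    "image_regions": [{"bbox": [[1434, 300], [2204, 300], [2204, 900], [1434, 900]]}],
    "tables": [{"bbox": [[100, 3000], [1800, 3000], [1800, 3500], [100, 3500]]}],
    "layout": [],
    "layout_data": {"elements": []},
}

text_only = max_dimensions(sample["text_regions"])
all_fields = max_dimensions(collect_regions(sample))
print("text only:", text_only)
print("all fields:", all_fields)
```

Comparing the two results shows why the extra fields matter: with text regions alone, the inferred width stops at 1850 and the table at Y=3500 is not seen at all.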