fix: properly complete task 2.1 - remove fake table image dependency

Implement task 2.1 as originally intended by removing every dependency on
the fake table_*.png references.

**Changes**:
- Set table image_path to None instead of fake "table_*.png"
- Removed backward compatibility fallback that looked for fake table images
- Tables now exclusively use element's own bbox for rendering
- Kept bbox in images_metadata only for text overlap filtering
- Added element_id to the table entry in images_metadata
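
The bbox-only entries described above could be consumed roughly as follows. This is a minimal sketch, not the service's actual code: `polygon_to_rect`, `overlaps`, `filter_text_regions`, `drawable_images`, and the shape of `text_regions` are hypothetical names introduced for illustration. Only the `image_path`, `bbox`, `page`, and `type` keys come from this change.

```python
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def polygon_to_rect(polygon) -> Optional[Rect]:
    """Collapse a polygon of [x, y] points into an axis-aligned rectangle."""
    if not polygon:
        return None
    xs = [point[0] for point in polygon]
    ys = [point[1] for point in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def overlaps(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def filter_text_regions(text_regions, images_metadata):
    """Drop text regions that overlap an image or table bbox on the same page."""
    kept = []
    for region in text_regions:
        region_rect = polygon_to_rect(region.get("bbox"))
        blocked = False
        for meta in images_metadata:
            meta_rect = polygon_to_rect(meta.get("bbox"))
            if region_rect is None or meta_rect is None:
                continue
            if meta["page"] == region["page"] and overlaps(region_rect, meta_rect):
                blocked = True
                break
        if not blocked:
            kept.append(region)
    return kept


def drawable_images(images_metadata):
    """Only entries with a real path are drawn; table entries have image_path=None."""
    return [meta for meta in images_metadata if meta.get("image_path")]
```

Under these assumptions, a table entry still suppresses overlapping text, but is never handed to image drawing, because its `image_path` is `None`.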

**Rationale**:
The previous implementation kept creating fake table_*.png references
and included fallback logic to find them. This defeated the purpose of
task 2.1, which was to eliminate the dependency on non-existent image files.

Now tables render purely based on their own bbox data without any
reference to fake image files.
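
The bbox-only lookup might look like the sketch below. It is written under assumptions, not taken from the service: `resolve_table_bbox` is a hypothetical helper, and the polygon is assumed to list the top-left corner at index 0 and the bottom-right corner at index 2, which matches the `bbox_polygon[2][1]  # y1` line in the diff.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)


def resolve_table_bbox(element_bbox) -> Optional[Tuple[float, float, float, float]]:
    """Return (x0, y0, x1, y1) from the element's own bbox, or None if unusable."""
    if not element_bbox:
        # No fallback to images_metadata: a missing bbox is reported and skipped.
        logger.warning("No bbox found for table element")
        return None

    first = element_bbox[0]
    if isinstance(first, (list, tuple)):
        # Polygon format: [[x, y], ...]; assumed corner order is described above.
        if len(element_bbox) < 3:
            logger.warning("Table bbox polygon has too few points")
            return None
        return (element_bbox[0][0], element_bbox[0][1],
                element_bbox[2][0], element_bbox[2][1])

    if len(element_bbox) >= 4:
        # Flat format: [x0, y0, x1, y1]
        return tuple(element_bbox[:4])

    logger.warning("Unrecognized table bbox format")
    return None
```

A caller would skip rendering when the helper returns `None`, which mirrors the early `return` that follows the warning in the diff.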

**Files Modified**:
- backend/app/services/pdf_generator_service.py:251-259 (fake path removed)
- backend/app/services/pdf_generator_service.py:874-891 (fallback removed)
- openspec/changes/pdf-layout-restoration/tasks.md (status corrected)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-24 07:31:43 +08:00
parent 0aff468c51
commit 2911ee16ea
2 changed files with 8 additions and 18 deletions


@@ -248,14 +248,14 @@ class PDFGeneratorService:
         'page': page_num - 1 # layout uses 0-based
     })
-    # Also add to images_metadata for overlap filtering
-    # Tables are often rendered as images
-    table_id = element.element_id or f"table_{page_num}_{len(images_metadata)}"
+    # Add bbox to images_metadata for text overlap filtering
+    # (no actual image file, just bbox for filtering)
     images_metadata.append({
-        'image_path': f"table_{table_id}.png",
+        'image_path': None, # No fake table image
         'bbox': bbox_polygon,
         'page': page_num - 1, # 0-based for images_metadata
-        'type': 'table'
+        'type': 'table',
+        'element_id': element.element_id
     })
 # Handle image/visual elements
@@ -886,18 +886,8 @@ class PDFGeneratorService:
     bbox_polygon[2][1] # y1
 ]
-# Final fallback: check images_metadata (for backward compatibility)
-if not table_bbox:
-    for img_meta in images_metadata:
-        img_path = img_meta.get('image_path', '')
-        if 'table' in img_path.lower() and img_meta.get('type') == 'table':
-            bbox = img_meta.get('bbox', [])
-            if bbox and len(bbox) >= 4:
-                table_bbox = bbox
-                break
 if not table_bbox:
-    logger.warning("No bbox found for table element")
+    logger.warning(f"No bbox found for table element")
     return
 # Handle different bbox formats