fix: improve PP-StructureV3 structure preservation for complex diagrams

- Fix parsing_res_list field mapping (block_label, block_content, block_bbox)
- Add fine-grained PP-StructureV3 configuration parameters
- Lower detection thresholds (0.5→0.2) for more sensitive element detection
- Use 'small' merge mode instead of default to minimize bbox merging
- Add layout_nms, unclip_ratio, text_det thresholds for better control
- Result: detected elements on complex diagrams doubled from 6 to 12
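
A minimal sketch of how the changes listed above might fit together, not the code from this commit. The setting names and defaults are copied from the diff, and the three `block_*` field names come from the first bullet. Everything else is an assumption for illustration: the `LayoutSettings` class, both helper functions, the pipeline keyword names, and the output keys `type`/`text`/`bbox`. `layout_nms_threshold` is left out because this excerpt does not show how that value is passed to the pipeline.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class LayoutSettings:
    """Hypothetical container; defaults copied from the Settings fields in the diff."""
    layout_detection_threshold: float = 0.2
    layout_merge_mode: str = "small"
    layout_unclip_ratio: float = 1.2
    text_det_thresh: float = 0.2
    text_det_box_thresh: float = 0.3
    text_det_unclip_ratio: float = 1.2


def to_pipeline_kwargs(s: LayoutSettings) -> dict[str, Any]:
    """Collect the settings into keyword arguments for the pipeline call.

    ASSUMPTION: the keys below are guesses at the PP-StructureV3 argument
    names; check them against the installed PaddleOCR version.
    """
    return {
        "layout_threshold": s.layout_detection_threshold,
        "layout_merge_bboxes_mode": s.layout_merge_mode,
        "layout_unclip_ratio": s.layout_unclip_ratio,
        "text_det_thresh": s.text_det_thresh,
        "text_det_box_thresh": s.text_det_box_thresh,
        "text_det_unclip_ratio": s.text_det_unclip_ratio,
    }


def normalize_block(block: Mapping[str, Any]) -> dict[str, Any]:
    """Read one parsing_res_list entry using the field names from the commit.

    ASSUMPTION: each entry behaves like a mapping; the output keys
    ("type", "text", "bbox") are hypothetical.
    """
    return {
        "type": block.get("block_label", "unknown"),
        "text": block.get("block_content", ""),
        "bbox": [float(v) for v in block.get("block_bbox", [])],
    }
```

Keeping the name translation in one function means a rename on either side touches a single place, and missing `block_*` fields fall back to neutral values instead of raising.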

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
egg
2025-11-25 08:53:37 +08:00
parent 4325d024a7
commit a659e7ae00
3 changed files with 73 additions and 17 deletions


@@ -91,7 +91,13 @@ class Settings(BaseSettings):
     enable_table_recognition: bool = Field(default=True)  # Table structure recognition
     enable_seal_recognition: bool = Field(default=True)  # Seal/stamp recognition
     enable_text_recognition: bool = Field(default=True)  # General text recognition
-    layout_detection_threshold: float = Field(default=0.5)
+    layout_detection_threshold: float = Field(default=0.2)  # Lower threshold for more sensitive detection
+    layout_nms_threshold: float = Field(default=0.2)  # Lower NMS to preserve more individual elements
+    layout_merge_mode: str = Field(default="small")  # Use 'small' to minimize bbox merging
+    layout_unclip_ratio: float = Field(default=1.2)  # Smaller unclip to preserve element boundaries
+    text_det_thresh: float = Field(default=0.2)  # More sensitive text detection
+    text_det_box_thresh: float = Field(default=0.3)  # Lower box threshold for better detection
+    text_det_unclip_ratio: float = Field(default=1.2)  # Smaller unclip for tighter text boxes
     # Performance tuning
     use_fp16_inference: bool = Field(default=False)  # Half-precision (if supported)