Files
OCR/backend/app/services
egg ad879d48e5 feat: implement Phase 3 list formatting for Direct track
Add comprehensive list rendering with automatic detection and formatting:

**Task 6.1: List Element Detection**
- Detect LIST_ITEM elements by type (element.type == ElementType.LIST_ITEM)
- Extract list_level from element metadata (lines 1566-1567)
- Determine list type via regex pattern matching:
  - Ordered lists: ^\d+[\.\)]\s (e.g., "1. ", "2) ")
  - Unordered lists: ^[•·▪▫◦‣⁃]\s (various bullet symbols)
- Parse and extract list markers from text content (lines 1571-1588)

**Task 6.2: List Rendering**
- Add list markers to first line of each item:
  - Ordered: Preserve original numbering (e.g., "1. ")
  - Unordered: Standardize to bullet "• "
- Remove original markers from text content
- Apply list indentation: 20pt per nesting level (lines 1594-1598)
- Combine list indent with existing paragraph indent
- List spacing: Inherited from bbox-based layout (spacing_before/after)

**Implementation Details**
- Lines 1565-1598: List detection and indentation logic
- Lines 1629-1632: Prepend list marker to first line (rendered_line)
- Lines 1635-1676: Update all text width calculations to use rendered_line
- Lines 1688-1692: Enhanced logging with list type and level

**Technical Notes**
- Direct track only (OCR track has no list metadata)
- Integrates with existing alignment and indentation system
- Preserves line breaks and multi-line list items
- Works with all text alignment modes (left/center/right/justify)

**Modified Files**
- backend/app/services/pdf_generator_service.py
  - Added import re for regex pattern matching
  - Lines 1565-1598: List detection and indentation
  - Lines 1629-1676: List marker rendering
  - Lines 1688-1692: Enhanced debug logging
- openspec/changes/pdf-layout-restoration/tasks.md
  - Marked Task 6.1 (all subtasks) as completed
  - Marked Task 6.2 (all subtasks) as completed
  - Added implementation line references

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 09:54:15 +08:00
..
2025-11-12 22:53:17 +08:00
2025-11-12 22:53:17 +08:00