chore: project cleanup and prepare for dual-track processing refactor

- Removed all test files and directories
- Deleted outdated documentation (will be rewritten)
- Cleaned up temporary files, logs, and uploads
- Archived 5 completed OpenSpec proposals
- Created new dual-track-document-processing proposal with complete OpenSpec structure
  - Dual-track architecture: OCR track (PaddleOCR) + Direct track (PyMuPDF)
  - UnifiedDocument model for consistent output
  - Support for structure-preserving translation
- Updated .gitignore to prevent future test/temp files

This is a major cleanup preparing for the complete refactoring of the document processing pipeline.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
egg
2025-11-18 20:02:31 +08:00
parent 0edc56b03f
commit cd3cbea49d
64 changed files with 3573 additions and 8190 deletions


@@ -0,0 +1,817 @@
# Tool_OCR Architecture Overhaul Plan
## A refactoring plan built on the full capabilities of PaddleOCR PP-StructureV3
**Planning date**: 2025-01-18
**Hardware**: RTX 4060 8GB VRAM
**Priority**: P0 (highest)
---
## 📊 Current State Analysis
### Problems with the current architecture
#### 1. **PP-StructureV3 capabilities are severely underused**
```python
# ❌ Current implementation (ocr_service.py:614-646)
markdown_dict = page_result.markdown  # only the simplified view
markdown_texts = markdown_dict.get('markdown_texts', '')
'bbox': [],  # coordinates are all empty!
```
**Problems**:
- Only ~20% of PP-StructureV3's functionality is used
- `parsing_res_list` (the core data structure) is never read
- `layout_bbox` (precise coordinates) is never read
- `reading_order` is never used
- The 23 layout element categories are never used
#### 2. **GPU configuration is not optimized**
```python
# Current configuration (ocr_service.py:211-219)
self.structure_engine = PPStructureV3(
    use_doc_orientation_classify=False,  # ❌ preprocessing disabled
    use_doc_unwarping=False,             # ❌ unwarping disabled
    use_textline_orientation=False,      # ❌ orientation correction disabled
    # ... default settings otherwise
)
```
**Problems**:
- An RTX 4060 8GB can run the server models, yet the defaults are used
- Important preprocessing features are turned off
- GPU compute is underutilized
#### 3. **Single PDF generation strategy**
```python
# Only the coordinate-placement mode exists today,
# causing 21.6% text loss (overlap filtering)
filtered_text_regions = self._filter_text_in_regions(text_regions, regions_to_avoid)
```
**Problems**:
- Only coordinate placement is supported; no flow layout
- Zero information loss is impossible
- Translation support is limited
---
## 🎯 Refactoring Goals
### Core goals
1. **Fully exploit PP-StructureV3**
   - Extract `parsing_res_list` (23 element categories + reading order)
   - Extract `layout_bbox` (precise coordinates)
   - Extract `layout_det_res` (layout detection details)
   - Extract `overall_ocr_res` (coordinates for all text)
2. **Dual-mode PDF generation**
   - Mode A: coordinate placement (faithful layout reproduction)
   - Mode B: flow layout (zero information loss, translation-ready)
3. **Optimized GPU configuration**
   - Tuned for an RTX 4060 with 8GB VRAM
   - Server models plus all feature modules
   - Sensible memory management
4. **Backward compatibility**
   - Keep the existing API
   - Old JSON files remain usable
   - Incremental upgrade path
---
## 🏗️ New Architecture Design
### Architecture layers
```
┌──────────────────────────────────────────────────────┐
│ API Layer                                            │
│ /tasks, /results, /download (backward compatible)    │
└────────────────┬─────────────────────────────────────┘
┌────────────────▼─────────────────────────────────────┐
│ Service Layer                                        │
├──────────────────────────────────────────────────────┤
│ OCRService (existing, kept)                          │
│   └─ analyze_layout() [upgraded] ───┐                │
│                                     │                │
│ AdvancedLayoutExtractor (new) ◄── shares same engine │
│   └─ extract_complete_layout() ─────┘                │
│                                                      │
│ PDFGeneratorService (refactored)                     │
│   ├─ generate_coordinate_pdf() [Mode A]              │
│   └─ generate_flow_pdf()       [Mode B]              │
└────────────────┬─────────────────────────────────────┘
┌────────────────▼─────────────────────────────────────┐
│ Engine Layer                                         │
├──────────────────────────────────────────────────────┤
│ PPStructureV3Engine (new, unified manager)           │
│   ├─ GPU config (tuned for RTX 4060 8GB)             │
│   ├─ Model config (server models)                    │
│   └─ Feature switches (all enabled)                  │
└──────────────────────────────────────────────────────┘
```
### Core class design
#### 1. PPStructureV3Engine (new)
**Purpose**: manage the PP-StructureV3 engine in one place and avoid duplicate initialization
```python
class PPStructureV3Engine:
    """
    PP-StructureV3 engine manager (singleton),
    configured for an RTX 4060 with 8GB VRAM.
    """
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialize()
        return cls._instance

    def _initialize(self):
        """Initialize the engine."""
        logger.info("Initializing PP-StructureV3 with RTX 4060 8GB optimized config")
        self.engine = PPStructureV3(
            # ===== GPU configuration =====
            use_gpu=True,
            gpu_mem=6144,  # keep 2GB for the system (8GB - 2GB)
            # ===== Preprocessing modules (all enabled) =====
            use_doc_orientation_classify=True,  # document orientation correction
            use_doc_unwarping=True,             # document image unwarping
            use_textline_orientation=True,      # text-line orientation correction
            # ===== Feature modules (all enabled) =====
            use_table_recognition=True,    # table recognition
            use_formula_recognition=True,  # formula recognition
            use_chart_recognition=True,    # chart recognition
            use_seal_recognition=True,     # seal recognition
            # ===== OCR model configuration (server models) =====
            text_detection_model_name="ch_PP-OCRv4_server_det",
            text_recognition_model_name="ch_PP-OCRv4_server_rec",
            # ===== Layout detection parameters =====
            layout_threshold=0.5,     # layout detection threshold
            layout_nms=0.5,           # NMS threshold
            layout_unclip_ratio=1.5,  # bounding-box expansion ratio
            # ===== OCR parameters =====
            text_det_limit_side_len=1920,  # high-resolution detection
            text_det_thresh=0.3,           # detection threshold
            text_det_box_thresh=0.5,       # box threshold
            # ===== Misc =====
            show_log=True,
            use_angle_cls=False,  # superseded by textline_orientation
        )
        logger.info("PP-StructureV3 engine initialized successfully")
        logger.info("  - GPU: Enabled (RTX 4060 8GB)")
        logger.info("  - Models: Server (High Accuracy)")
        logger.info("  - Features: All Enabled (Table/Formula/Chart/Seal)")

    def predict(self, image_path: str):
        """Run prediction."""
        return self.engine.predict(image_path)

    def get_engine(self):
        """Return the underlying engine instance."""
        return self.engine
```
#### 2. AdvancedLayoutExtractor (new)
**Purpose**: extract all of the layout information PP-StructureV3 produces
```python
class AdvancedLayoutExtractor:
    """
    Advanced layout extractor.
    Fully exploits PP-StructureV3's parsing_res_list, layout_bbox and layout_det_res.
    """

    def __init__(self):
        self.engine = PPStructureV3Engine()

    def extract_complete_layout(
        self,
        image_path: Path,
        output_dir: Optional[Path] = None,
        current_page: int = 0
    ) -> Tuple[Optional[Dict], List[Dict]]:
        """
        Extract the complete layout information (via page_result.json).

        Returns:
            (layout_data, images_metadata)
            layout_data = {
                "elements": [
                    {
                        "element_id": int,
                        "type": str,            # one of the 23 categories
                        "bbox": [[x1,y1], [x2,y1], [x2,y2], [x1,y2]],  # ✅ no longer empty
                        "content": str,
                        "reading_order": int,   # ✅ reading order
                        "layout_type": str,     # ✅ single/double/multi-column
                        "confidence": float,    # ✅ confidence score
                        "page": int
                    },
                    ...
                ],
                "reading_order": [0, 1, 2, ...],
                "layout_types": ["single", "double"],
                "total_elements": int
            }
        """
        try:
            results = self.engine.predict(str(image_path))
            layout_elements = []
            images_metadata = []
            for page_idx, page_result in enumerate(results):
                # ✅ Key change: use page_result.json instead of page_result.markdown
                json_data = page_result.json
                # ===== Source 1: parsing_res_list (primary) =====
                parsing_res_list = json_data.get('parsing_res_list', [])
                if parsing_res_list:
                    logger.info(f"Found {len(parsing_res_list)} elements in parsing_res_list")
                    for idx, item in enumerate(parsing_res_list):
                        element = self._create_element_from_parsing_res(
                            item, idx, current_page
                        )
                        if element:
                            layout_elements.append(element)
                # ===== Source 2: layout_det_res (supplementary) =====
                layout_det_res = json_data.get('layout_det_res', {})
                layout_boxes = layout_det_res.get('boxes', [])
                # Enrich elements when parsing_res_list lacks some fields
                self._enrich_elements_with_layout_det(layout_elements, layout_boxes)
                # ===== Source 3: images (from markdown_images) =====
                markdown_dict = page_result.markdown
                markdown_images = markdown_dict.get('markdown_images', {})
                for img_idx, (img_path, img_obj) in enumerate(markdown_images.items()):
                    # Save the image to disk
                    self._save_image(img_obj, img_path, output_dir or image_path.parent)
                    # Look up the bbox in parsing_res_list or layout_det_res
                    bbox = self._find_image_bbox(
                        img_path, parsing_res_list, layout_boxes
                    )
                    images_metadata.append({
                        'element_id': len(layout_elements) + img_idx,
                        'image_path': img_path,
                        'type': 'image',
                        'page': current_page,
                        'bbox': bbox,
                    })
            if layout_elements:
                layout_data = {
                    'elements': layout_elements,
                    'total_elements': len(layout_elements),
                    'reading_order': [e['reading_order'] for e in layout_elements],
                    'layout_types': list(set(e.get('layout_type') for e in layout_elements)),
                }
                logger.info(f"✅ Extracted {len(layout_elements)} elements with complete info")
                return layout_data, images_metadata
            else:
                logger.warning("No layout elements found")
                return None, []
        except Exception as e:
            logger.error(f"Advanced layout extraction failed: {e}")
            import traceback
            traceback.print_exc()
            return None, []

    def _create_element_from_parsing_res(
        self, item: Dict, idx: int, current_page: int
    ) -> Optional[Dict]:
        """Create one element from a parsing_res_list item."""
        # Extract layout_bbox
        layout_bbox = item.get('layout_bbox')
        bbox = self._convert_bbox_to_4point(layout_bbox)
        # Extract the layout type
        layout_type = item.get('layout', 'single')
        # Build the base element
        element = {
            'element_id': idx,
            'page': current_page,
            'bbox': bbox,  # ✅ full coordinates
            'layout_type': layout_type,
            'reading_order': idx,
            'confidence': item.get('score', 0.0),
        }
        # Fill type and content based on the content kind.
        # Order matters! Priority: table > formula > image > title > text
        if 'table' in item and item['table']:
            element['type'] = 'table'
            element['content'] = item['table']
            # Extract plain table text (for translation)
            element['extracted_text'] = self._extract_table_text(item['table'])
        elif 'formula' in item and item['formula']:
            element['type'] = 'formula'
            element['content'] = item['formula']  # LaTeX
        elif 'figure' in item or 'image' in item:
            element['type'] = 'image'
            element['content'] = item.get('figure') or item.get('image')
        elif 'title' in item and item['title']:
            element['type'] = 'title'
            element['content'] = item['title']
        elif 'text' in item and item['text']:
            element['type'] = 'text'
            element['content'] = item['text']
        else:
            # Unknown type: take the first non-system field with a value
            for key, value in item.items():
                if key not in ['layout_bbox', 'layout', 'score'] and value:
                    element['type'] = key
                    element['content'] = value
                    break
            else:
                return None  # no content, skip
        return element

    def _convert_bbox_to_4point(self, layout_bbox) -> List:
        """Convert layout_bbox into the 4-point format."""
        if layout_bbox is None:
            return []
        # Handle numpy arrays
        if hasattr(layout_bbox, 'tolist'):
            bbox = layout_bbox.tolist()
        else:
            bbox = list(layout_bbox)
        if len(bbox) == 4:  # [x1, y1, x2, y2]
            x1, y1, x2, y2 = bbox
            return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
        return []

    def _extract_table_text(self, html_content: str) -> str:
        """Extract plain text from an HTML table (for translation)."""
        try:
            from bs4 import BeautifulSoup
            soup = BeautifulSoup(html_content, 'html.parser')
            # Collect the text of every cell
            cells = []
            for cell in soup.find_all(['td', 'th']):
                text = cell.get_text(strip=True)
                if text:
                    cells.append(text)
            return ' | '.join(cells)
        except Exception as e:
            logger.warning(f"Failed to extract table text: {e}")
            # Fallback: strip HTML tags naively
            import re
            text = re.sub(r'<[^>]+>', ' ', html_content)
            text = re.sub(r'\s+', ' ', text)
            return text.strip()
```
#### 3. PDFGeneratorService (refactored)
**Purpose**: support dual-mode PDF generation
```python
class PDFGeneratorService:
    """
    PDF generation service (refactored).
    Supports two modes:
    - coordinate: coordinate-placement mode (faithful layout reproduction)
    - flow: flow-layout mode (zero information loss, translation-ready)
    """

    def generate_pdf(
        self,
        json_path: Path,
        output_path: Path,
        mode: str = 'coordinate',  # 'coordinate' or 'flow'
        source_file_path: Optional[Path] = None
    ) -> bool:
        """
        Generate a PDF.

        Args:
            json_path: path to the OCR JSON file
            output_path: output PDF path
            mode: generation mode ('coordinate' or 'flow')
            source_file_path: original file path (used for page dimensions)
        Returns:
            True on success
        """
        try:
            # Load the OCR data
            ocr_data = self.load_ocr_json(json_path)
            if not ocr_data:
                return False
            # Pick the generation strategy by mode
            if mode == 'flow':
                return self._generate_flow_pdf(ocr_data, output_path)
            else:
                return self._generate_coordinate_pdf(ocr_data, output_path, source_file_path, json_path)
        except Exception as e:
            logger.error(f"PDF generation failed: {e}")
            import traceback
            traceback.print_exc()
            return False

    def _generate_coordinate_pdf(
        self,
        ocr_data: Dict,
        output_path: Path,
        source_file_path: Optional[Path],
        json_path: Path  # needed to resolve image paths stored relative to the JSON
    ) -> bool:
        """
        Mode A: coordinate placement.
        - Positions each element precisely via layout_bbox
        - Preserves the original document's visual appearance
        - For scenarios that need a faithful layout reproduction
        """
        from reportlab.pdfgen import canvas

        logger.info("Generating PDF in COORDINATE mode (layout-preserving)")
        # Extract the data
        layout_data = ocr_data.get('layout_data', {})
        elements = layout_data.get('elements', [])
        if not elements:
            logger.warning("No layout elements found")
            return False
        # Sort by page, then reading_order
        sorted_elements = sorted(elements, key=lambda x: (
            x.get('page', 0),
            x.get('reading_order', 0)
        ))
        # Compute page dimensions
        ocr_width, ocr_height = self.calculate_page_dimensions(ocr_data, source_file_path)
        target_width, target_height = self._get_target_dimensions(source_file_path, ocr_width, ocr_height)
        scale_w = target_width / ocr_width
        scale_h = target_height / ocr_height
        # Create the PDF canvas
        pdf_canvas = canvas.Canvas(str(output_path), pagesize=(target_width, target_height))
        # Group elements by page number
        pages = {}
        for elem in sorted_elements:
            page = elem.get('page', 0)
            if page not in pages:
                pages[page] = []
            pages[page].append(elem)
        # Render each page
        for page_num, page_elements in sorted(pages.items()):
            if page_num > 0:
                pdf_canvas.showPage()
            logger.info(f"Rendering page {page_num + 1} with {len(page_elements)} elements")
            # Render each element in reading order
            for elem in page_elements:
                bbox = elem.get('bbox', [])
                elem_type = elem.get('type')
                content = elem.get('content', '')
                if not bbox:
                    logger.warning(f"Element {elem['element_id']} has no bbox, skipping")
                    continue
                # Render by element type
                try:
                    if elem_type == 'table':
                        self._draw_table_at_bbox(pdf_canvas, content, bbox, target_height, scale_w, scale_h)
                    elif elem_type == 'text':
                        self._draw_text_at_bbox(pdf_canvas, content, bbox, target_height, scale_w, scale_h)
                    elif elem_type == 'title':
                        self._draw_title_at_bbox(pdf_canvas, content, bbox, target_height, scale_w, scale_h)
                    elif elem_type == 'image':
                        img_path = json_path.parent / content  # images live next to the JSON
                        if img_path.exists():
                            self._draw_image_at_bbox(pdf_canvas, str(img_path), bbox, target_height, scale_w, scale_h)
                    elif elem_type == 'formula':
                        self._draw_formula_at_bbox(pdf_canvas, content, bbox, target_height, scale_w, scale_h)
                    # ... other types
                except Exception as e:
                    logger.warning(f"Failed to draw {elem_type} element: {e}")
        pdf_canvas.save()
        logger.info(f"✅ Coordinate PDF generated: {output_path}")
        return True

    def _generate_flow_pdf(
        self,
        ocr_data: Dict,
        output_path: Path
    ) -> bool:
        """
        Mode B: flow layout.
        - Flows content in reading_order
        - Zero information loss (nothing is filtered out)
        - Uses the high-level ReportLab Platypus API
        - For scenarios that need translation or content processing
        """
        from reportlab.platypus import (
            SimpleDocTemplate, Paragraph, Spacer,
            Table, TableStyle, Image as RLImage, PageBreak
        )
        from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
        from reportlab.lib import colors
        from reportlab.lib.enums import TA_LEFT, TA_CENTER

        logger.info("Generating PDF in FLOW mode (content-preserving)")
        # Extract the data
        layout_data = ocr_data.get('layout_data', {})
        elements = layout_data.get('elements', [])
        if not elements:
            logger.warning("No layout elements found")
            return False
        # Sort by reading_order
        sorted_elements = sorted(elements, key=lambda x: (
            x.get('page', 0),
            x.get('reading_order', 0)
        ))
        # Build the document
        doc = SimpleDocTemplate(str(output_path))
        story = []
        styles = getSampleStyleSheet()
        # Custom styles
        styles.add(ParagraphStyle(
            name='CustomTitle',
            parent=styles['Heading1'],
            fontSize=18,
            alignment=TA_CENTER,
            spaceAfter=12
        ))
        current_page = -1
        # Append elements in order
        for elem in sorted_elements:
            elem_type = elem.get('type')
            content = elem.get('content', '')
            page = elem.get('page', 0)
            # Page breaks
            if page != current_page and current_page != -1:
                story.append(PageBreak())
            current_page = page
            try:
                if elem_type == 'title':
                    story.append(Paragraph(content, styles['CustomTitle']))
                    story.append(Spacer(1, 12))
                elif elem_type == 'text':
                    story.append(Paragraph(content, styles['Normal']))
                    story.append(Spacer(1, 8))
                elif elem_type == 'table':
                    # Parse the HTML table into a ReportLab Table
                    table_obj = self._html_to_reportlab_table(content)
                    if table_obj:
                        story.append(table_obj)
                        story.append(Spacer(1, 12))
                elif elem_type == 'image':
                    # Embed the image
                    img_path = output_path.parent.parent / content
                    if img_path.exists():
                        img = RLImage(str(img_path), width=400, height=300, kind='proportional')
                        story.append(img)
                        story.append(Spacer(1, 12))
                elif elem_type == 'formula':
                    # Render formulas in a monospaced font
                    story.append(Paragraph(f"<font name='Courier'>{content}</font>", styles['Code']))
                    story.append(Spacer(1, 8))
            except Exception as e:
                logger.warning(f"Failed to add {elem_type} element to flow: {e}")
        # Build the PDF
        doc.build(story)
        logger.info(f"✅ Flow PDF generated: {output_path}")
        return True
```
---
## 🔧 Implementation Steps
### Phase 1: Engine layer refactor (2-3 hours)
1. **Create the PPStructureV3Engine singleton class**
   - File: `backend/app/engines/ppstructure_engine.py` (new)
   - Manages the PP-StructureV3 engine in one place
   - RTX 4060 8GB optimized configuration
2. **Create the AdvancedLayoutExtractor class**
   - File: `backend/app/services/advanced_layout_extractor.py` (new)
   - Implement `extract_complete_layout()`
   - Fully extract parsing_res_list, layout_bbox, layout_det_res
3. **Update OCRService**
   - Change `analyze_layout()` to use `AdvancedLayoutExtractor`
   - Stay backward compatible (fall back to the old logic)
### Phase 2: PDF generator refactor (3-4 hours)
1. **Refactor PDFGeneratorService**
   - Add a `mode` parameter
   - Implement `_generate_coordinate_pdf()`
   - Implement `_generate_flow_pdf()`
2. **Add helper methods**
   - `_draw_table_at_bbox()`: draw a table at given coordinates
   - `_draw_text_at_bbox()`: draw text at given coordinates
   - `_draw_title_at_bbox()`: draw a title at given coordinates
   - `_draw_formula_at_bbox()`: draw a formula at given coordinates
   - `_html_to_reportlab_table()`: convert HTML to a ReportLab Table
3. **Update the API endpoints**
   - `/tasks/{id}/download/pdf?mode=coordinate` (default)
   - `/tasks/{id}/download/pdf?mode=flow`
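A minimal sketch of validating the `mode` query parameter before dispatching to the two generators. The endpoint path and parameter name come from the plan above; the route decorator appears only as a comment because the web framework is not specified here, and `resolve_pdf_mode` is a hypothetical helper name.

```python
from typing import Optional

VALID_PDF_MODES = {"coordinate", "flow"}

def resolve_pdf_mode(mode: Optional[str]) -> str:
    """Return a safe PDF generation mode, defaulting to 'coordinate'."""
    if mode is None:
        return "coordinate"
    mode = mode.strip().lower()
    if mode not in VALID_PDF_MODES:
        raise ValueError(f"unsupported pdf mode: {mode!r}")
    return mode

# In the route handler this could be used roughly as:
# @router.get("/tasks/{task_id}/download/pdf")
# def download_pdf(task_id: str, mode: Optional[str] = None):
#     pdf_mode = resolve_pdf_mode(mode)
#     ...
```

Rejecting unknown values early keeps the default behavior intact for old clients that never send `mode`.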
### Phase 3: Testing and tuning (2-3 hours)
1. **Unit tests**
   - Test AdvancedLayoutExtractor
   - Test both PDF modes
   - Test backward compatibility
2. **Performance tests**
   - GPU memory monitoring
   - Throughput tests
   - Concurrent request tests
3. **Quality validation**
   - Coordinate accuracy
   - Reading-order correctness
   - Table recognition accuracy
---
## 📈 Expected Results
### Functional improvements
| Metric | Current | After refactor | Gain |
|------|-----|--------|------|
| bbox availability | 0% (all empty) | 100% | ✅ ∞ |
| Layout element categories | 2 | 23 | ✅ 11.5x |
| Reading order | None | Fully preserved | ✅ 100% |
| Information loss | 21.6% | 0% (flow mode) | ✅ 100% |
| PDF modes | 1 | 2 | ✅ 2x |
| Translation support | Difficult | Full | ✅ 100% |
### GPU usage optimization
Expected effect of the RTX 4060 8GB configuration:

| Config item | Current | After refactor |
|----------------|--------|--------|
| GPU utilization | ~30% | ~70% |
| Processing speed | 0.5 | 1.2 |
| Preprocessing features | off | all on |
| Recognition accuracy | ~85% | ~95% |
---
## 🎯 Migration Strategy
### Backward compatibility guarantees
1. **API level**
   - Keep every existing API endpoint
   - Add an optional `mode` parameter
   - Default behavior unchanged
2. **Data level**
   - Old JSON files remain usable
   - New fields do not affect old logic
   - Incremental updates
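One way to honor the data-level guarantee is to normalize elements on load, so that JSON written before the refactor (which lacks `bbox`, `reading_order`, or `layout_type`) flows through the new code paths unchanged. A sketch; `normalize_element` is a hypothetical helper:

```python
from typing import Dict

def normalize_element(elem: Dict, fallback_order: int) -> Dict:
    """Fill fields that pre-refactor JSON files lack, so newer rendering
    code can treat both formats uniformly. Existing values are preserved."""
    out = dict(elem)
    out.setdefault("bbox", [])              # old files had no per-element bbox
    out.setdefault("reading_order", fallback_order)
    out.setdefault("layout_type", "single")
    out.setdefault("page", 0)
    return out
```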
3. **Deployment strategy**
   - Deploy the new engine and services first
   - Enable new features gradually
   - Monitor performance and error rates
---
## 📝 Configuration Files
### requirements.txt updates
```txt
# Existing dependencies
paddlepaddle-gpu>=3.0.0
paddleocr>=3.0.0
# New dependencies
python-docx>=0.8.11     # Word document generation (optional)
PyMuPDF>=1.23.0         # enhanced PDF handling
beautifulsoup4>=4.12.0  # HTML parsing
lxml>=4.9.0             # faster XML/HTML parsing
```
### Environment variables
```bash
# Additions to .env.local
PADDLE_GPU_MEMORY=6144          # RTX 4060 8GB: keep 2GB for the system
PADDLE_USE_SERVER_MODEL=true
PADDLE_ENABLE_ALL_FEATURES=true
# Default PDF generation mode
PDF_DEFAULT_MODE=coordinate     # or flow
```
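A sketch of reading these variables with the documented defaults. The variable names are the ones listed above, while `load_paddle_settings` itself is a hypothetical helper:

```python
import os

def load_paddle_settings() -> dict:
    """Read the environment variables from .env.local with their
    documented defaults; values arrive as strings and are parsed here."""
    return {
        "gpu_memory": int(os.getenv("PADDLE_GPU_MEMORY", "6144")),
        "use_server_model": os.getenv("PADDLE_USE_SERVER_MODEL", "true").lower() == "true",
        "enable_all_features": os.getenv("PADDLE_ENABLE_ALL_FEATURES", "true").lower() == "true",
        "pdf_default_mode": os.getenv("PDF_DEFAULT_MODE", "coordinate"),
    }
```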
---
## 🚀 Implementation Priorities
### P0 (implement immediately)
1. ✅ PPStructureV3Engine unified engine
2. ✅ AdvancedLayoutExtractor complete extraction
3. ✅ Coordinate-placement PDF mode
### P1 (second stage)
4. ⭐ Flow-layout PDF mode
5. ⭐ API endpoint updates (`mode` parameter)
### P2 (optimization stage)
6. Performance monitoring and tuning
7. Batch processing support
8. Quality-check tooling
---
## ⚠️ Risks and Mitigations
### Risk 1: GPU out-of-memory
**Mitigation**:
- Set `gpu_mem=6144` conservatively (keep 2GB free)
- Add memory monitoring
- Process large documents in batches
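For the monitoring bullet, one framework-agnostic option is to poll `nvidia-smi`, which ships with the RTX 4060 driver. The parsing is split out so it can be tested without a GPU; treat this as a sketch, not the plan's chosen mechanism:

```python
import subprocess

def gpu_memory_used_mib(smi_output: str) -> int:
    """Parse the used-memory figure from
    `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`,
    which prints one integer per GPU in MiB. Only the first GPU is read."""
    first_line = smi_output.strip().splitlines()[0]
    return int(first_line.strip())

def query_gpu_memory_used() -> int:
    """Run nvidia-smi and return used VRAM in MiB for the first GPU."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return gpu_memory_used_mib(out)
```

A periodic call to `query_gpu_memory_used()` could feed a warning log when usage approaches the 6144 MiB budget.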
### Risk 2: Slower processing
**Mitigation**:
- Server models are faster than Mobile models on GPU
- Process pages in parallel
- Cache results
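The result-cache bullet could be as simple as keying OCR output by a content hash, so re-uploads of the same file skip the pipeline entirely. A sketch, with a hypothetical `run_ocr` callable standing in for the real engine:

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict

_ocr_cache: Dict[str, dict] = {}

def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used as the cache key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def cached_ocr(path: Path, run_ocr: Callable[[Path], dict]) -> dict:
    """Return a cached OCR result for unchanged files; run the pipeline
    only when this content hash has not been seen before."""
    key = file_digest(path)
    if key not in _ocr_cache:
        _ocr_cache[key] = run_ocr(path)
    return _ocr_cache[key]
```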
### Risk 3: Backward compatibility breakage
**Mitigation**:
- Keep the old logic as a fallback
- Migrate gradually
- Full test coverage
---
**Estimated total development time**: 7-10 hours
**Expected outcome**: 100% utilization of PP-StructureV3 + zero information loss + full translation support
Which stage would you like implemented first?


@@ -0,0 +1,691 @@
# PP-StructureV3 Complete Layout Information Utilization Plan
## 📋 Executive Summary
### Problem diagnosis
The current implementation **severely underuses PP-StructureV3**: it reads only the `page_result.markdown` attribute and ignores the core layout information in `page_result.json`.
### Key findings
1. **PP-StructureV3 provides complete layout parsing information**, including:
   - `parsing_res_list`: layout elements in reading order
   - `layout_bbox`: precise coordinates for every element
   - `layout_det_res`: layout detection results (region type, confidence)
   - `overall_ocr_res`: full OCR results (a bbox for every text line)
   - `layout`: layout type (single/double/multi-column)
2. **Flaws in the current implementation**
```python
# ❌ Current approach (ocr_service.py:615-646)
markdown_dict = page_result.markdown  # only fetches markdown and images
markdown_texts = markdown_dict.get('markdown_texts', '')
# bbox is set to an empty list
'bbox': [],  # PP-StructureV3 doesn't provide individual bbox in this format
```
3. **What it should do instead**
```python
# ✅ Correct approach
json_data = page_result.json  # fetch the full structured information
parsing_list = json_data.get('parsing_res_list', [])  # reading order + bbox
layout_det = json_data.get('layout_det_res', {})      # layout detection
overall_ocr = json_data.get('overall_ocr_res', {})    # coordinates for all text
```
---
## 🎯 Planning Goals
### Phase 1: Extract the complete layout information (high priority)
**Goal**: change `analyze_layout()` to use PP-StructureV3's full capabilities
**Expected results**:
- ✅ Every layout element has a precise `layout_bbox`
- ✅ The original reading order is preserved (the order of `parsing_res_list`)
- ✅ Layout type information is captured (single/double column)
- ✅ Region categories are extracted (text/table/figure/title/formula)
- ✅ Zero information loss (no need to filter overlapping text)
### Phase 2: Implement dual-mode PDF generation (medium priority)
**Goal**: provide two PDF generation modes
**Mode A: precise coordinate placement**
- Positions each element precisely via `layout_bbox`
- Preserves the original document's visual appearance
- For scenarios that need a faithful layout reproduction
**Mode B: flow layout**
- Flows content in `parsing_res_list` order
- Uses the high-level ReportLab Platypus API
- Zero information loss; all content is searchable
- For scenarios that need translation or content processing
### Phase 3: Multi-column layout handling (low priority)
**Goal**: exploit PP-StructureV3's multi-column detection
---
## 📊 The Complete PP-StructureV3 Data Structure
### 1. Full structure of `page_result.json`
```python
{
    # Basic information
    "input_path": str,   # source file path
    "page_index": int,   # page number (PDF only)

    # Layout detection result
    "layout_det_res": {
        "boxes": [
            {
                "cls_id": int,    # class ID
                "label": str,     # region type: text/table/figure/title/formula/seal
                "score": float,   # confidence 0-1
                "coordinate": [x1, y1, x2, y2]  # rectangle coordinates
            },
            ...
        ]
    },

    # Full OCR result
    "overall_ocr_res": {
        "dt_polys": np.ndarray,   # text detection polygons
        "rec_polys": np.ndarray,  # text recognition polygons
        "rec_boxes": np.ndarray,  # text recognition boxes (n, 4, 2) int16
        "rec_texts": List[str],   # recognized text
        "rec_scores": np.ndarray  # recognition confidence
    },

    # **Core layout parsing result (in reading order)**
    "parsing_res_list": [
        {
            "layout_bbox": np.ndarray,  # region bounding box [x1, y1, x2, y2]
            "layout": str,              # layout type: single/double/multi-column
            "text": str,                # text content (if a text region)
            "table": str,               # table HTML (if a table region)
            "image": str,               # image path (if an image region)
            "formula": str,             # formula LaTeX (if a formula region)
            # ... other region types
        },
        ...  # list order == reading order
    ],

    # Text-paragraph OCR (in reading order)
    "text_paragraphs_ocr_res": {
        "rec_polys": np.ndarray,
        "rec_texts": List[str],
        "rec_scores": np.ndarray
    },

    # Optional module results
    "formula_res_region1": {...},  # formula recognition result
    "table_cell_img": {...},       # table cell images
    "seal_res_region1": {...}      # seal recognition result
}
```
### 2. Key fields
| Field | Purpose | Format | Importance |
|------|------|---------|--------|
| `parsing_res_list` | **Core data**: all layout elements in reading order | List[Dict] | ⭐⭐⭐⭐⭐ |
| `layout_bbox` | Precise coordinates for each element | np.ndarray [x1,y1,x2,y2] | ⭐⭐⭐⭐⭐ |
| `layout` | Layout type (single/double/multi-column) | str: single/double/multi | ⭐⭐⭐⭐ |
| `layout_det_res` | Detailed layout detection (region classification) | Dict with boxes list | ⭐⭐⭐⭐ |
| `overall_ocr_res` | OCR results and coordinates for all text | Dict with np.ndarray | ⭐⭐⭐⭐ |
| `markdown` | Simplified Markdown output | Dict with texts/images | ⭐⭐ |
---
## 🔧 Implementation Plan
### Task 1: Refactor the `analyze_layout()` function
**File**: `/backend/app/services/ocr_service.py`
**Scope**: lines 590-710
**Core changes**:
```python
def analyze_layout(self, image_path: Path, output_dir: Optional[Path] = None, current_page: int = 0) -> Tuple[Optional[Dict], List[Dict]]:
    """
    Analyze document layout using PP-StructureV3 (using the full JSON output).
    """
    try:
        structure_engine = self.get_structure_engine()
        results = structure_engine.predict(str(image_path))
        layout_elements = []
        images_metadata = []
        for page_idx, page_result in enumerate(results):
            # ✅ Change 1: use the full JSON data instead of just markdown
            json_data = page_result.json
            # ✅ Change 2: extract the layout detection result
            layout_det_res = json_data.get('layout_det_res', {})
            layout_boxes = layout_det_res.get('boxes', [])
            # ✅ Change 3: extract the core parsing_res_list (reading order + bbox)
            parsing_res_list = json_data.get('parsing_res_list', [])
            if parsing_res_list:
                # *** Core logic: consume parsing_res_list ***
                for idx, item in enumerate(parsing_res_list):
                    # Extract the bbox (no longer an empty list!)
                    bbox = []
                    layout_bbox = item.get('layout_bbox')
                    if layout_bbox is not None:
                        # Convert numpy arrays to a standard list
                        bbox_vals = layout_bbox.tolist() if hasattr(layout_bbox, 'tolist') else list(layout_bbox)
                        # Convert to 4-point format: [[x1,y1], [x2,y1], [x2,y2], [x1,y2]]
                        if len(bbox_vals) == 4:  # [x1, y1, x2, y2]
                            x1, y1, x2, y2 = bbox_vals
                            bbox = [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
                    # Extract the layout type
                    layout_type = item.get('layout', 'single')
                    # Create the element (with all the information)
                    element = {
                        'element_id': idx,
                        'page': current_page,
                        'bbox': bbox,               # ✅ no longer an empty list!
                        'layout_type': layout_type, # ✅ new: layout type
                        'reading_order': idx,       # ✅ new: reading order
                    }
                    # Fill content based on element kind
                    if 'table' in item:
                        element['type'] = 'table'
                        element['content'] = item['table']
                        # Extract plain table text (for translation)
                        element['extracted_text'] = self._extract_table_text(item['table'])
                    elif 'text' in item:
                        element['type'] = 'text'
                        element['content'] = item['text']
                    elif 'figure' in item or 'image' in item:
                        element['type'] = 'image'
                        element['content'] = item.get('figure') or item.get('image')
                    elif 'formula' in item:
                        element['type'] = 'formula'
                        element['content'] = item['formula']
                    elif 'title' in item:
                        element['type'] = 'title'
                        element['content'] = item['title']
                    else:
                        # Unknown type: record the first non-system field
                        for key, value in item.items():
                            if key not in ['layout_bbox', 'layout']:
                                element['type'] = key
                                element['content'] = value
                                break
                    layout_elements.append(element)
            else:
                # Fall back to markdown parsing (backward compatibility)
                logger.warning("No parsing_res_list found, falling back to markdown parsing")
                markdown_dict = page_result.markdown
                # ... existing markdown parsing logic ...
            # ✅ Change 4: still save extracted images to disk
            markdown_dict = page_result.markdown
            markdown_images = markdown_dict.get('markdown_images', {})
            for img_idx, (img_path, img_obj) in enumerate(markdown_images.items()):
                # Save the image to disk
                try:
                    base_dir = output_dir if output_dir else image_path.parent
                    full_img_path = base_dir / img_path
                    full_img_path.parent.mkdir(parents=True, exist_ok=True)
                    if hasattr(img_obj, 'save'):
                        img_obj.save(str(full_img_path))
                        logger.info(f"Saved extracted image to {full_img_path}")
                except Exception as e:
                    logger.warning(f"Failed to save image {img_path}: {e}")
                # Extract the bbox (from the filename, or by matching parsing_res_list)
                bbox = self._find_image_bbox(img_path, parsing_res_list, layout_boxes)
                images_metadata.append({
                    'element_id': len(layout_elements) + img_idx,
                    'image_path': img_path,
                    'type': 'image',
                    'page': current_page,
                    'bbox': bbox,
                })
        if layout_elements:
            layout_data = {
                'elements': layout_elements,
                'total_elements': len(layout_elements),
                'reading_order': [e['reading_order'] for e in layout_elements],  # ✅ keep reading order
                'layout_types': list(set(e.get('layout_type') for e in layout_elements)),  # ✅ layout type stats
            }
            logger.info(f"Detected {len(layout_elements)} layout elements (with bbox and reading order)")
            return layout_data, images_metadata
        else:
            logger.warning("No layout elements detected")
            return None, []
    except Exception as e:
        import traceback
        logger.error(f"Layout analysis error: {str(e)}\n{traceback.format_exc()}")
        return None, []

def _find_image_bbox(self, img_path: str, parsing_res_list: List[Dict], layout_boxes: List[Dict]) -> List:
    """
    Look up an image's bbox in parsing_res_list or layout_det_res.
    """
    # Method 1: parse it out of the filename (current approach)
    import re
    match = re.search(r'box_(\d+)_(\d+)_(\d+)_(\d+)', img_path)
    if match:
        x1, y1, x2, y2 = map(int, match.groups())
        return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
    # Method 2: match against parsing_res_list (if it carries image paths)
    for item in parsing_res_list:
        if 'image' in item or 'figure' in item:
            content = item.get('image') or item.get('figure')
            if img_path in str(content):
                bbox = item.get('layout_bbox')
                if bbox is not None:
                    bbox_list = bbox.tolist() if hasattr(bbox, 'tolist') else list(bbox)
                    if len(bbox_list) == 4:
                        x1, y1, x2, y2 = bbox_list
                        return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
    # Method 3: match against layout_det_res (by label)
    for box in layout_boxes:
        if box.get('label') in ['figure', 'image']:
            coord = box.get('coordinate', [])
            if len(coord) == 4:
                x1, y1, x2, y2 = coord
                return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
    logger.warning(f"Could not find bbox for image {img_path}")
    return []
```
---
### Task 2: Update the PDF generator to use the new information
**File**: `/backend/app/services/pdf_generator_service.py`
**Core changes**:
1. **Remove the text-filtering logic** (no longer needed!)
   - `parsing_res_list` is already in reading order
   - Tables/images and text each occupy their own regions
   - There is no overlap problem
2. **Render elements by `reading_order`**
```python
def generate_layout_pdf(self, json_path: Path, output_path: Path, mode: str = 'coordinate') -> bool:
    """
    mode: 'coordinate' or 'flow'
    """
    # Load the data
    ocr_data = self.load_ocr_json(json_path)
    layout_data = ocr_data.get('layout_data', {})
    elements = layout_data.get('elements', [])
    if mode == 'coordinate':
        # Mode A: coordinate placement
        return self._generate_coordinate_pdf(elements, output_path, ocr_data)
    else:
        # Mode B: flow layout
        return self._generate_flow_pdf(elements, output_path, ocr_data)

def _generate_coordinate_pdf(self, elements: List[Dict], output_path: Path, ocr_data: Dict) -> bool:
    """Coordinate-placement mode: faithful layout reproduction."""
    # Sort elements by reading_order
    sorted_elements = sorted(elements, key=lambda x: x.get('reading_order', 0))
    # Group by page number
    pages = {}
    for elem in sorted_elements:
        page = elem.get('page', 0)
        if page not in pages:
            pages[page] = []
        pages[page].append(elem)
    # Render each page
    # (canvas creation, page sizing and scale computation elided in this sketch)
    for page_num, page_elements in sorted(pages.items()):
        for elem in page_elements:
            bbox = elem.get('bbox', [])
            elem_type = elem.get('type')
            content = elem.get('content', '')
            if not bbox:
                logger.warning(f"Element {elem['element_id']} has no bbox, skipping")
                continue
            # Render at the exact coordinates
            if elem_type == 'table':
                self.draw_table_at_bbox(pdf_canvas, content, bbox, page_height, scale_w, scale_h)
            elif elem_type == 'text':
                self.draw_text_at_bbox(pdf_canvas, content, bbox, page_height, scale_w, scale_h)
            elif elem_type == 'image':
                self.draw_image_at_bbox(pdf_canvas, content, bbox, page_height, scale_w, scale_h)
            # ... other types
    return True

def _generate_flow_pdf(self, elements: List[Dict], output_path: Path, ocr_data: Dict) -> bool:
    """Flow-layout mode: zero information loss."""
    from reportlab.platypus import SimpleDocTemplate, Paragraph, Table, Image, Spacer
    from reportlab.lib.styles import getSampleStyleSheet
    # Sort elements by reading_order
    sorted_elements = sorted(elements, key=lambda x: x.get('reading_order', 0))
    # Build the story (flowable content)
    story = []
    styles = getSampleStyleSheet()
    for elem in sorted_elements:
        elem_type = elem.get('type')
        content = elem.get('content', '')
        if elem_type == 'title':
            story.append(Paragraph(content, styles['Title']))
        elif elem_type == 'text':
            story.append(Paragraph(content, styles['Normal']))
        elif elem_type == 'table':
            # Parse the HTML table into a ReportLab Table
            table_obj = self._html_to_reportlab_table(content)
            story.append(table_obj)
        elif elem_type == 'image':
            # Embed the image (resolved relative to the output directory in this sketch)
            img_path = output_path.parent / content
            if img_path.exists():
                story.append(Image(str(img_path), width=400, height=300))
        story.append(Spacer(1, 12))  # spacing
    # Build the PDF
    doc = SimpleDocTemplate(str(output_path))
    doc.build(story)
    return True
```
---
## 📈 Expected Improvement
### Current vs new implementation
| Metric | Current ❌ | New ✅ | Improvement |
|------|-----------|----------|------|
| **bbox info** | empty list `[]` | precise coords `[x1,y1,x2,y2]` | ✅ 100% |
| **Reading order** | none (mixed HTML) | `reading_order` field | ✅ 100% |
| **Layout type** | none | `layout_type` (single/double) | ✅ 100% |
| **Element classification** | naive `<table` check | precise classification (9+ types) | ✅ 100% |
| **Information loss** | 21.6% of text filtered out | 0% loss (flow mode) | ✅ 100% |
| **Coordinate precision** | only some image bboxes | bbox for every element | ✅ 100% |
| **PDF modes** | coordinate only | dual (coordinate + flow) | ✅ new feature |
| **Translation support** | hard (lossy) | full (lossless) | ✅ 100% |
### Concrete improvements
#### 1. Zero information loss
```python
# ❌ Current: 342 text regions → 268 after filtering = 74 lost (21.6%)
filtered_text_regions = self._filter_text_in_regions(text_regions, regions_to_avoid)
# ✅ New: no filtering needed, consume parsing_res_list directly.
# Every element (text, table, image) has its own region; nothing overlaps.
for elem in sorted(elements, key=lambda x: x['reading_order']):
    render_element(elem)  # render everything, zero loss
```
#### 2. Precise bboxes
```python
# ❌ Current: bbox is an empty list
{
    'element_id': 0,
    'type': 'table',
    'bbox': [],  # ← cannot be positioned!
}
# ✅ New: precise coordinates taken from layout_bbox
{
    'element_id': 0,
    'type': 'table',
    'bbox': [[770, 776], [1122, 776], [1122, 1058], [770, 1058]],  # ← precisely positioned!
    'reading_order': 3,
    'layout_type': 'single'
}
```
#### 3. Reading order
```python
# ❌ Current: no guarantee of correct reading order;
# tables, images and text are jumbled together.
# ✅ New: the order of parsing_res_list == the reading order
elements = sorted(elements, key=lambda x: x['reading_order'])
# Elements render as reading_order 0, 1, 2, 3, ...
# perfectly preserving the document's logical order.
```
---
## 🚀 Implementation Steps
### Phase 1: Core refactor (2-3 hours)
1. **Modify the `analyze_layout()` function**
   - Extract `parsing_res_list` from `page_result.json`
   - Use `layout_bbox` as each element's bbox
   - Preserve `reading_order`
   - Extract `layout_type`
   - Verify the output JSON structure
2. **Add helper functions**
   - `_find_image_bbox()`: look up an image bbox from multiple sources
   - `_convert_bbox_format()`: normalize the bbox format
   - `_extract_element_content()`: extract content by element type
3. **Test and validate**
   - Re-run OCR on the existing test documents
   - Check that the generated JSON contains bboxes
   - Verify that reading_order is correct
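Step 3's checks can be automated with a small validator run over the generated JSON. A sketch: the field names match the element schema above, while `validate_layout_data` is a hypothetical helper:

```python
from typing import Dict, List

def validate_layout_data(layout_data: Dict) -> List[str]:
    """Return a list of problems found in extracted layout JSON:
    missing bboxes and out-of-order reading_order values.
    An empty list means the checks passed."""
    problems = []
    elements = layout_data.get("elements", [])
    for elem in elements:
        if not elem.get("bbox"):
            problems.append(f"element {elem.get('element_id')} has empty bbox")
    orders = [e.get("reading_order", 0) for e in elements]
    if orders != sorted(orders):
        problems.append("reading_order is not monotonically increasing")
    return problems
```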
### Phase 2: PDF generation improvements (2-3 hours)
1. **Implement the coordinate-placement mode**
   - Remove the text-filtering logic
   - Render each element precisely by its bbox
   - Order same-page elements by reading_order
2. **Implement the flow-layout mode**
   - Use ReportLab Platypus
   - Build the story in reading_order
   - Implement flow rendering for each element type
3. **Add the API parameter**
   - `/tasks/{id}/download/pdf?mode=coordinate` (default)
   - `/tasks/{id}/download/pdf?mode=flow`
### Phase 3: Testing and polish (1-2 hours)
1. **Full testing**
   - Single-page documents
   - Multi-page PDFs
   - Multi-column layouts
   - Complex tables
2. **Performance tuning**
   - Avoid redundant computation
   - Optimize bbox conversion
   - Cache results
3. **Documentation updates**
   - Update the API docs
   - Add usage examples
   - Update the architecture diagram
---
## 💡 Key Technical Details
### 1. Numpy array handling
```python
# layout_bbox is a numpy.ndarray and must be converted to a standard format
layout_bbox = item.get('layout_bbox')
if hasattr(layout_bbox, 'tolist'):
    bbox = layout_bbox.tolist()  # [x1, y1, x2, y2]
else:
    bbox = list(layout_bbox)
# Convert to 4-point format
x1, y1, x2, y2 = bbox
bbox_4point = [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]
```
### 2. Layout type handling
```python
# Adjust the rendering strategy based on layout_type
layout_type = elem.get('layout_type', 'single')
if layout_type == 'double':
    # Two-column layout: may need special handling
    pass
elif layout_type == 'multi':
    # Multi-column layout: more complex handling
    pass
```
### 3. Reading-order guarantee
```python
# Make sure elements render in the correct order
elements = layout_data.get('elements', [])
sorted_elements = sorted(elements, key=lambda x: (
    x.get('page', 0),          # page number first
    x.get('reading_order', 0)  # then reading order
))
```
---
## ⚠️ Risks and Mitigations
### Risk 1: Backward compatibility
**Problem**: old JSON files lack the new fields
**Mitigation**:
```python
# Add fallback logic inside analyze_layout()
parsing_res_list = json_data.get('parsing_res_list', [])
if not parsing_res_list:
    logger.warning("No parsing_res_list, using markdown fallback")
    # use the old markdown parsing logic
```
### 風險 2: PaddleOCR 版本差異
**問題**: 不同版本的 PaddleOCR 可能輸出格式不同
**緩解措施**:
- 記錄 PaddleOCR 版本到 JSON
- 添加版本檢測邏輯
- 提供多版本支援
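記錄版本的部分可以是一個小工具(示意;以 `importlib.metadata` 讀取已安裝的 paddleocr 版本,函數名稱為假設):

```python
import json
from importlib import metadata


def save_result_with_version(result: dict, path: str) -> None:
    """將 PaddleOCR 版本一併寫入結果 JSON供日後的版本偵測邏輯使用。"""
    try:
        version = metadata.version("paddleocr")
    except metadata.PackageNotFoundError:
        version = "unknown"
    result["paddleocr_version"] = version
    with open(path, "w", encoding="utf-8") as f:
        json.dump(result, f, ensure_ascii=False)
```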
### 風險 3: 效能影響
**問題**: 提取更多資訊可能增加處理時間
**緩解措施**:
- 只在需要時提取詳細資訊
- 使用快取
- 並行處理多頁
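多頁並行可以用 `concurrent.futures` 簡單示意(草稿;實際上 GPU 推論通常仍需序列化執行,這裡較適合 bbox 轉換等 CPU 後處理):

```python
from concurrent.futures import ThreadPoolExecutor


def process_pages_parallel(pages, process_page, max_workers: int = 4):
    """並行處理多頁,輸出順序與輸入一致。"""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_page, pages))
```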
---
## 📝 TODO Checklist
### 階段 1: 核心重構
- [ ] 修改 `analyze_layout()` 使用 `page_result.json`
- [ ] 提取 `parsing_res_list`
- [ ] 提取 `layout_bbox` 並轉換格式
- [ ] 保留 `reading_order`
- [ ] 提取 `layout_type`
- [ ] 實作 `_find_image_bbox()`
- [ ] 添加回退邏輯(向後相容)
- [ ] 測試新 JSON 輸出結構
### 階段 2: PDF 生成優化
- [ ] 實作 `_generate_coordinate_pdf()`
- [ ] 實作 `_generate_flow_pdf()`
- [ ] 移除舊的文字過濾邏輯
- [ ] 添加 mode 參數到 API
- [ ] 實作 HTML 表格解析器(用於流式模式)
- [ ] 測試兩種模式的 PDF 輸出
### 階段 3: 測試與文檔
- [ ] 單頁文件測試
- [ ] 多頁 PDF 測試
- [ ] 複雜版面測試(多欄、表格密集)
- [ ] 效能測試
- [ ] 更新 API 文檔
- [ ] 更新使用說明
- [ ] 創建遷移指南
---
## 🎓 學習資源
1. **PaddleOCR 官方文檔**
- [PP-StructureV3 Usage Tutorial](http://www.paddleocr.ai/main/en/version3.x/pipeline_usage/PP-StructureV3.html)
- [PaddleX PP-StructureV3](https://paddlepaddle.github.io/PaddleX/3.0/en/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.html)
2. **ReportLab 文檔**
- [Platypus User Guide](https://www.reportlab.com/docs/reportlab-userguide.pdf)
- [Table Styling](https://www.reportlab.com/docs/reportlab-userguide.pdf#page=80)
3. **參考實作**
- PaddleOCR GitHub: `/paddlex/inference/pipelines/layout_parsing/pipeline_v2.py`
---
## 🏁 成功標準
### 必須達成
- 所有版面元素都有精確的 bbox
- 閱讀順序正確保留
- 零資訊損失(流式模式)
- 向後相容(舊 JSON 仍可用)
### 期望達成
- 雙模式 PDF 生成(座標 + 流式)
- 多欄版面正確處理
- 翻譯功能支援(表格文字可提取)
- 效能無明顯下降
### 附加目標
- 支援更多元素類型(公式、印章)
- 版面類型統計和分析
- 視覺化版面結構
---
**規劃完成時間**: 2025-01-18
**預計開發時間**: 5-8 小時
**優先級**: P0 (最高優先級)

View File

@@ -0,0 +1,276 @@
# Technical Design: Dual-track Document Processing
## Context
### Background
The current OCR tool processes all documents through PaddleOCR, even when dealing with editable PDFs that contain extractable text. This causes:
- Unnecessary processing overhead
- Potential quality degradation from re-OCRing already digital text
- Loss of precise formatting information
- Inefficient GPU usage on documents that don't need OCR
### Constraints
- RTX 4060 8GB GPU memory limitation
- Need to maintain backward compatibility with existing API
- Must support future translation features
- Should handle mixed documents (partially scanned, partially digital)
### Stakeholders
- API consumers expecting consistent JSON/PDF output
- Translation system requiring structure preservation
- Performance-sensitive deployments
## Goals / Non-Goals
### Goals
- Intelligently route documents to appropriate processing track
- Preserve document structure for translation
- Optimize GPU usage by avoiding unnecessary OCR
- Maintain unified output format across tracks
- Reduce processing time for editable PDFs by 70%+
### Non-Goals
- Implementing the actual translation engine (future phase)
- Supporting video or audio transcription
- Real-time collaborative editing
- OCR model training or fine-tuning
## Decisions
### Decision 1: Dual-track Architecture
**What**: Implement two separate processing pipelines - OCR track and Direct extraction track
**Why**:
- Editable PDFs don't need OCR, can be processed 10-100x faster
- Direct extraction preserves exact formatting and fonts
- OCR track remains optimal for scanned documents
**Alternatives considered**:
1. **Single enhanced OCR pipeline**: Would still waste resources on editable PDFs
2. **Hybrid approach per page**: Too complex, most documents are uniformly editable or scanned
3. **Multiple specialized pipelines**: Over-engineering for current requirements
### Decision 2: UnifiedDocument Model
**What**: Create a standardized intermediate representation for both tracks
**Why**:
- Provides consistent API interface regardless of processing track
- Simplifies downstream processing (PDF generation, translation)
- Enables track switching without breaking changes
**Structure**:
```python
@dataclass
class UnifiedDocument:
document_id: str
metadata: DocumentMetadata
pages: List[Page]
processing_track: Literal["ocr", "direct"]
@dataclass
class Page:
page_number: int
elements: List[DocumentElement]
dimensions: Dimensions
@dataclass
class DocumentElement:
element_id: str
type: ElementType # text, table, image, header, etc.
content: Union[str, Dict, bytes]
bbox: BoundingBox
style: Optional[StyleInfo]
confidence: Optional[float] # Only for OCR track
```
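A minimal runnable sketch of the model, with the referenced helper types (`BoundingBox`, `ElementType`, `DocumentMetadata`, etc.) simplified to plain fields for illustration:

```python
from dataclasses import dataclass, field, asdict
from typing import List, Literal, Optional


@dataclass
class BoundingBox:
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass
class DocumentElement:
    element_id: str
    type: str            # simplified stand-in for ElementType
    content: str
    bbox: BoundingBox
    confidence: Optional[float] = None  # only meaningful on the OCR track


@dataclass
class Page:
    page_number: int
    elements: List[DocumentElement] = field(default_factory=list)


@dataclass
class UnifiedDocument:
    document_id: str
    pages: List[Page]
    processing_track: Literal["ocr", "direct"]


doc = UnifiedDocument(
    document_id="doc-001",
    pages=[Page(1, [DocumentElement("e0", "text", "Hello", BoundingBox(0, 0, 100, 20), 0.98)])],
    processing_track="ocr",
)
print(asdict(doc)["processing_track"])  # → ocr
```

Because every layer is a dataclass, `asdict()` gives the JSON-serializable form for free, which keeps the export path identical for both tracks.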
### Decision 3: PyMuPDF for Direct Extraction
**What**: Use PyMuPDF (fitz) library for editable PDF processing
**Why**:
- Mature, well-maintained library
- Excellent coordinate preservation
- Fast C++ backend
- Supports text, tables, and image extraction with positions
**Alternatives considered**:
1. **pdfplumber**: Good but slower, less precise coordinates
2. **PyPDF2**: Limited layout information
3. **PDFMiner**: Complex API, slower performance
### Decision 4: Processing Track Auto-detection
**What**: Automatically determine optimal track based on document analysis
**Detection logic**:
```python
from pathlib import Path

import fitz  # PyMuPDF
import magic

def detect_track(file_path: Path) -> str:
    file_type = magic.from_file(str(file_path), mime=True)
    if file_type.startswith('image/'):
        return "ocr"
    if file_type == 'application/pdf':
        # Check whether the PDF has enough extractable text
        doc = fitz.open(file_path)
        try:
            sample_text = "".join(
                doc[i].get_text() for i in range(min(3, doc.page_count))
            )
        finally:
            doc.close()
        if len(sample_text.strip()) < 100:  # effectively no embedded text
            return "ocr"
        return "direct"
    if file_type in OFFICE_MIMES:
        return "ocr"  # for now, may add direct Office support later
    return "ocr"  # default fallback
```
### Decision 5: GPU Memory Management
**What**: Implement dynamic batch sizing and model caching for RTX 4060 8GB
**Why**:
- Prevents OOM errors
- Maximizes throughput
- Enables concurrent request handling
**Strategy**:
```python
# Adaptive batch sizing based on available memory
batch_size = calculate_batch_size(
available_memory=get_gpu_memory(),
image_size=image.shape,
model_size=MODEL_MEMORY_REQUIREMENTS
)
# Model caching to avoid reload overhead
@lru_cache(maxsize=2)
def get_model(model_type: str):
return load_model(model_type)
```
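The adaptive batch sizing above could be sketched as a simple heuristic (illustrative only; the safety margin, per-image cost, and `max_batch` cap are assumptions, not measured values):

```python
def calculate_batch_size(available_memory_mb: float,
                         per_image_mb: float,
                         model_mb: float,
                         safety_margin: float = 0.8,
                         max_batch: int = 16) -> int:
    """Estimate a safe OCR batch size for an 8GB-class GPU.

    Reserves a safety margin, subtracts resident model memory, then
    divides the remainder by the estimated per-image footprint.
    """
    usable = available_memory_mb * safety_margin - model_mb
    if usable <= 0:
        return 1  # fall back to single-image processing
    return max(1, min(max_batch, int(usable // per_image_mb)))
```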
### Decision 6: Backward Compatibility
**What**: Maintain existing API while adding new capabilities
**How**:
- Existing endpoints continue working unchanged
- New `processing_track` parameter is optional
- Output format compatible with current consumers
- Gradual migration path for clients
## Risks / Trade-offs
### Risk 1: Mixed Content Documents
**Risk**: Documents with both scanned and digital pages
**Mitigation**:
- Page-level track detection as fallback
- Confidence scoring to identify uncertain pages
- Manual override option via API
### Risk 2: Direct Extraction Quality
**Risk**: Some PDFs have poor internal structure
**Mitigation**:
- Fallback to OCR track if extraction quality is low
- Quality metrics: text density, structure coherence
- User-reportable quality issues
### Risk 3: Memory Pressure
**Risk**: RTX 4060 8GB limitation with concurrent requests
**Mitigation**:
- Request queuing system
- Dynamic batch adjustment
- CPU fallback for overflow
### Trade-off 1: Processing Time vs Accuracy
- Direct extraction: Fast but depends on PDF quality
- OCR: Slower but consistent quality
- **Decision**: Prioritize speed for editable PDFs, accuracy for scanned
### Trade-off 2: Complexity vs Flexibility
- Two tracks increase system complexity
- But enable optimal processing per document type
- **Decision**: Accept complexity for 10x+ performance gains
## Migration Plan
### Phase 1: Infrastructure (Week 1-2)
1. Deploy UnifiedDocument model
2. Implement DocumentTypeDetector
3. Add DirectExtractionEngine
4. Update logging and monitoring
### Phase 2: Integration (Week 3)
1. Update OCR service with routing logic
2. Modify PDF generator for unified model
3. Add new API endpoints
4. Deploy to staging
### Phase 3: Validation (Week 4)
1. A/B testing with subset of traffic
2. Performance benchmarking
3. Quality validation
4. Client integration testing
### Rollback Plan
1. Feature flag to disable dual-track
2. Fallback all requests to OCR track
3. Maintain old code paths during transition
4. Database migration reversible
## Open Questions
### Resolved
- Q: Should we support page-level track mixing?
- A: No, adds complexity with minimal benefit. Document-level is sufficient.
- Q: How to handle Office documents?
- A: OCR track initially, consider python-docx/openpyxl later if needed.
### Pending
- Q: What translation services to integrate with?
- Needs stakeholder input on cost/quality trade-offs
- Q: Should we cache extracted text for repeated processing?
- Depends on storage costs vs reprocessing frequency
- Q: How to handle password-protected PDFs?
- May need API parameter for passwords
## Performance Targets
### Direct Extraction Track
- Latency: <500ms per page
- Throughput: 100+ pages/minute
- Memory: <500MB per document
### OCR Track (Optimized)
- Latency: 2-5s per page (GPU)
- Throughput: 20-30 pages/minute
- Memory: <2GB per batch
### API Response Times
- Document type detection: <100ms
- Processing initiation: <200ms
- Result retrieval: <100ms
## Technical Dependencies
### Python Packages
```python
# Direct extraction
PyMuPDF==1.23.x
pdfplumber==0.10.x # Fallback/validation
python-magic-bin==0.4.x
# OCR enhancement
paddlepaddle-gpu==2.5.2
paddleocr==2.7.3
# Infrastructure
pydantic==2.x
fastapi==0.100+
redis==5.x # For caching
```
### System Requirements
- CUDA 11.8+ for PaddlePaddle
- libmagic for file detection
- 16GB RAM minimum
- 50GB disk for models and cache

View File

@@ -0,0 +1,35 @@
# Change: Dual-track Document Processing with Structure-Preserving Translation
## Why
The current system processes all documents through PaddleOCR, causing unnecessary overhead for editable PDFs that already contain extractable text. Additionally, we're only using ~20% of PP-StructureV3's capabilities, missing out on comprehensive document structure extraction. The system needs to support structure-preserving document translation as a future goal.
## What Changes
- **ADDED** Dual-track processing architecture with intelligent routing
- OCR track for scanned documents, images, and Office files using PaddleOCR
- Direct extraction track for editable PDFs using PyMuPDF
- **ADDED** UnifiedDocument model as common output format for both tracks
- **ADDED** DocumentTypeDetector service for automatic track selection
- **MODIFIED** OCR service to use PP-StructureV3's parsing_res_list instead of markdown
- Now extracts all 23 element types with bbox coordinates
- Preserves reading order and hierarchical structure
- **MODIFIED** PDF generator to handle UnifiedDocument format
- Enhanced overlap detection to prevent text/image/table collisions
- Improved coordinate transformation for accurate layout
- **ADDED** Foundation for structure-preserving translation system
- **BREAKING** JSON output structure will include new fields (backward compatible with defaults)
## Impact
- **Affected specs**:
- `document-processing` (new capability)
- `result-export` (enhanced with track metadata and structure data)
- `task-management` (tracks processing route and history)
- **Affected code**:
- `backend/app/services/ocr_service.py` - Major refactoring for dual-track
- `backend/app/services/pdf_generator_service.py` - UnifiedDocument support
- `backend/app/api/v2/tasks.py` - New endpoints for track detection
- `frontend/src/pages/TaskDetailPage.tsx` - Display processing track info
- **Performance**: 5-10x faster for editable PDFs, same speed for scanned documents
- **Dependencies**: Adds PyMuPDF, pdfplumber, python-magic-bin

View File

@@ -0,0 +1,108 @@
# Document Processing Spec Delta
## ADDED Requirements
### Requirement: Dual-track Processing
The system SHALL support two distinct processing tracks for documents: OCR track for scanned/image documents and Direct extraction track for editable PDFs.
#### Scenario: Process scanned PDF through OCR track
- **WHEN** a scanned PDF is uploaded
- **THEN** the system SHALL detect it requires OCR
- **AND** route it through PaddleOCR PP-StructureV3 pipeline
- **AND** return results in UnifiedDocument format
#### Scenario: Process editable PDF through direct extraction
- **WHEN** an editable PDF with extractable text is uploaded
- **THEN** the system SHALL detect it can be directly extracted
- **AND** route it through PyMuPDF extraction pipeline
- **AND** return results in UnifiedDocument format without OCR
#### Scenario: Auto-detect processing track
- **WHEN** a document is uploaded without explicit track specification
- **THEN** the system SHALL analyze the document type and content
- **AND** automatically select the optimal processing track
- **AND** include the selected track in processing metadata
### Requirement: Document Type Detection
The system SHALL provide intelligent document type detection to determine the optimal processing track.
#### Scenario: Detect editable PDF
- **WHEN** analyzing a PDF document
- **THEN** the system SHALL check for extractable text content
- **AND** return confidence score for editability
- **AND** recommend "direct" track if text coverage > 90%
#### Scenario: Detect scanned document
- **WHEN** analyzing an image or scanned PDF
- **THEN** the system SHALL identify lack of extractable text
- **AND** recommend "ocr" track for processing
- **AND** configure appropriate OCR models
#### Scenario: Detect Office documents
- **WHEN** analyzing .docx, .xlsx, .pptx files
- **THEN** the system SHALL identify Office format
- **AND** route to OCR track for initial implementation
- **AND** preserve option for future direct Office extraction
### Requirement: Unified Document Model
The system SHALL use a standardized UnifiedDocument model as the common output format for both processing tracks.
#### Scenario: Generate UnifiedDocument from OCR
- **WHEN** OCR processing completes
- **THEN** the system SHALL convert PP-StructureV3 results to UnifiedDocument
- **AND** preserve all element types, coordinates, and confidence scores
- **AND** maintain reading order and hierarchical structure
#### Scenario: Generate UnifiedDocument from direct extraction
- **WHEN** direct extraction completes
- **THEN** the system SHALL convert PyMuPDF results to UnifiedDocument
- **AND** preserve text styling, fonts, and exact positioning
- **AND** extract tables with cell boundaries and content
#### Scenario: Consistent output regardless of track
- **WHEN** processing completes through either track
- **THEN** the output SHALL conform to UnifiedDocument schema
- **AND** include processing_track metadata field
- **AND** support identical downstream operations (PDF generation, translation)
### Requirement: Enhanced OCR with Full PP-StructureV3
The system SHALL utilize the full capabilities of PP-StructureV3, extracting all 23 element types from parsing_res_list.
#### Scenario: Extract comprehensive document structure
- **WHEN** processing through OCR track
- **THEN** the system SHALL use page_result.json['parsing_res_list']
- **AND** extract all element types including headers, lists, tables, figures
- **AND** preserve layout_bbox coordinates for each element
#### Scenario: Maintain reading order
- **WHEN** extracting elements from PP-StructureV3
- **THEN** the system SHALL preserve the reading order from parsing_res_list
- **AND** assign sequential indices to elements
- **AND** support reordering for complex layouts
#### Scenario: Extract table structure
- **WHEN** PP-StructureV3 identifies a table
- **THEN** the system SHALL extract cell content and boundaries
- **AND** preserve table HTML for structure
- **AND** extract plain text for translation
### Requirement: Structure-Preserving Translation Foundation
The system SHALL maintain document structure and layout information to support future translation features.
#### Scenario: Preserve coordinates for translation
- **WHEN** processing any document
- **THEN** the system SHALL retain bbox coordinates for all text elements
- **AND** calculate space requirements for text expansion/contraction
- **AND** maintain element relationships and groupings
#### Scenario: Extract translatable content
- **WHEN** processing tables and lists
- **THEN** the system SHALL extract plain text content
- **AND** maintain mapping to original structure
- **AND** preserve formatting markers for reconstruction
#### Scenario: Support layout adjustment
- **WHEN** preparing for translation
- **THEN** the system SHALL identify flexible vs fixed layout regions
- **AND** calculate maximum text expansion ratios
- **AND** preserve non-translatable elements (logos, signatures)

View File

@@ -0,0 +1,74 @@
# Result Export Spec Delta
## MODIFIED Requirements
### Requirement: Export Interface
The Export page SHALL support downloading OCR results in multiple formats using V2 task APIs, with processing track information and enhanced structure data.
#### Scenario: Export page uses V2 download endpoints
- **WHEN** user selects a format and clicks export button
- **THEN** frontend SHALL call V2 endpoint `/api/v2/tasks/{task_id}/download/{format}`
- **AND** frontend SHALL NOT call V1 `/api/v2/export` endpoint (which returns 404)
- **AND** file SHALL download successfully
#### Scenario: Export supports multiple formats
- **WHEN** user exports a completed task
- **THEN** system SHALL support downloading as TXT, JSON, Excel, Markdown, and PDF
- **AND** each format SHALL use correct V2 download endpoint
- **AND** downloaded files SHALL contain task OCR results
#### Scenario: Export includes processing track metadata
- **WHEN** user exports a task processed through dual-track system
- **THEN** exported JSON SHALL include "processing_track" field indicating "ocr" or "direct"
- **AND** SHALL include "processing_metadata" with track-specific information
- **AND** SHALL maintain backward compatibility for clients not expecting these fields
#### Scenario: Export UnifiedDocument format
- **WHEN** user requests JSON export with unified=true parameter
- **THEN** system SHALL return UnifiedDocument structure
- **AND** include complete element hierarchy with coordinates
- **AND** preserve all PP-StructureV3 element types for OCR track
## ADDED Requirements
### Requirement: Enhanced PDF Export with Layout Preservation
The PDF export SHALL accurately preserve document layout from both OCR and direct extraction tracks.
#### Scenario: Export PDF from direct extraction track
- **WHEN** exporting PDF from a direct-extraction processed document
- **THEN** the PDF SHALL maintain exact text positioning from source
- **AND** preserve original fonts and styles where possible
- **AND** include extracted images at correct positions
#### Scenario: Export PDF from OCR track with full structure
- **WHEN** exporting PDF from OCR-processed document
- **THEN** the PDF SHALL use all 23 PP-StructureV3 element types
- **AND** render tables with proper cell boundaries
- **AND** maintain reading order from parsing_res_list
#### Scenario: Handle coordinate transformations
- **WHEN** generating PDF from UnifiedDocument
- **THEN** system SHALL correctly transform bbox coordinates to PDF space
- **AND** handle page size variations
- **AND** prevent text overlap using enhanced overlap detection
### Requirement: Structure Data Export
The system SHALL provide export formats that preserve document structure for downstream processing.
#### Scenario: Export structured JSON with hierarchy
- **WHEN** user selects structured JSON format
- **THEN** export SHALL include element hierarchy and relationships
- **AND** preserve parent-child relationships (sections, lists)
- **AND** include style and formatting information
#### Scenario: Export for translation preparation
- **WHEN** user exports with translation_ready=true parameter
- **THEN** export SHALL include translatable text segments
- **AND** maintain coordinate mappings for each segment
- **AND** mark non-translatable regions
#### Scenario: Export with layout analysis
- **WHEN** user requests layout analysis export
- **THEN** system SHALL include reading order indices
- **AND** identify layout regions (header, body, footer, sidebar)
- **AND** provide confidence scores for layout detection

View File

@@ -0,0 +1,105 @@
# Task Management Spec Delta
## MODIFIED Requirements
### Requirement: Task Result Generation
The OCR service SHALL generate both JSON and Markdown result files for completed tasks with actual content, including processing track information and enhanced structure data.
#### Scenario: Markdown file contains OCR results
- **WHEN** a task completes OCR processing successfully
- **THEN** the generated `.md` file SHALL contain the extracted text in markdown format
- **AND** the file size SHALL be greater than 0 bytes
- **AND** the markdown SHALL include headings, paragraphs, and formatting based on OCR layout detection
#### Scenario: Result files stored in task directory
- **WHEN** OCR processing completes for task ID `88c6c2d2-37e1-48fd-a50f-406142987bdf`
- **THEN** result files SHALL be stored in `storage/results/88c6c2d2-37e1-48fd-a50f-406142987bdf/`
- **AND** both `<filename>_result.json` and `<filename>_result.md` SHALL exist
- **AND** both files SHALL contain valid OCR output data
#### Scenario: Include processing track in results
- **WHEN** a task completes through dual-track processing
- **THEN** the JSON result SHALL include "processing_track" field
- **AND** SHALL indicate whether "ocr" or "direct" track was used
- **AND** SHALL include track-specific metadata (confidence for OCR, extraction quality for direct)
#### Scenario: Store UnifiedDocument format
- **WHEN** processing completes through either track
- **THEN** system SHALL save results in UnifiedDocument format
- **AND** maintain backward-compatible JSON structure
- **AND** include enhanced structure from PP-StructureV3 or PyMuPDF
### Requirement: Task Detail View
The frontend SHALL provide a dedicated page for viewing individual task details with processing track information and enhanced preview capabilities.
#### Scenario: Navigate to task detail page
- **WHEN** user clicks "View Details" button on task in Task History page
- **THEN** browser SHALL navigate to `/tasks/{task_id}`
- **AND** TaskDetailPage component SHALL render
#### Scenario: Display task information
- **WHEN** TaskDetailPage loads for a valid task ID
- **THEN** page SHALL display task metadata (filename, status, processing time, confidence)
- **AND** page SHALL show markdown preview of OCR results
- **AND** page SHALL provide download buttons for JSON, Markdown, and PDF formats
#### Scenario: Download from task detail page
- **WHEN** user clicks download button for a specific format
- **THEN** browser SHALL download the file using `/api/v2/tasks/{task_id}/download/{format}` endpoint
- **AND** downloaded file SHALL contain the task's OCR results in requested format
#### Scenario: Display processing track information
- **WHEN** viewing task processed through dual-track system
- **THEN** page SHALL display processing track used (OCR or Direct)
- **AND** show track-specific metrics (OCR confidence or extraction quality)
- **AND** provide option to reprocess with alternate track if applicable
#### Scenario: Preview document structure
- **WHEN** user enables structure view
- **THEN** page SHALL display document element hierarchy
- **AND** show bounding boxes overlay on preview
- **AND** highlight different element types (headers, tables, lists) with distinct colors
## ADDED Requirements
### Requirement: Processing Track Management
The task management system SHALL track and display processing track information for all tasks.
#### Scenario: Track processing route selection
- **WHEN** a task begins processing
- **THEN** system SHALL record the selected processing track
- **AND** log the reason for track selection
- **AND** store auto-detection confidence score
#### Scenario: Allow track override
- **WHEN** user views a completed task
- **THEN** system SHALL offer option to reprocess with different track
- **AND** maintain both results for comparison
- **AND** track which result user prefers
#### Scenario: Display processing metrics
- **WHEN** task completes processing
- **THEN** system SHALL record track-specific metrics
- **AND** OCR track SHALL show confidence scores and character count
- **AND** Direct track SHALL show extraction coverage and structure quality
### Requirement: Task Processing History
The system SHALL maintain detailed processing history for tasks including track changes and reprocessing.
#### Scenario: Record reprocessing attempts
- **WHEN** a task is reprocessed with different track
- **THEN** system SHALL maintain processing history
- **AND** store results from each attempt
- **AND** allow comparison between different processing attempts
#### Scenario: Track quality improvements
- **WHEN** viewing task history
- **THEN** system SHALL show quality metrics over time
- **AND** indicate if reprocessing improved results
- **AND** suggest optimal track based on document characteristics
#### Scenario: Export processing analytics
- **WHEN** exporting task data
- **THEN** system SHALL include processing history
- **AND** provide track selection statistics
- **AND** include performance metrics for each processing attempt

View File

@@ -0,0 +1,170 @@
# Implementation Tasks: Dual-track Document Processing
## 1. Core Infrastructure
- [ ] 1.1 Add PyMuPDF and other dependencies to requirements.txt
- [ ] 1.1.1 Add PyMuPDF==1.23.x
- [ ] 1.1.2 Add pdfplumber==0.10.x
- [ ] 1.1.3 Add python-magic-bin==0.4.x
- [ ] 1.1.4 Test dependency installation
- [ ] 1.2 Create UnifiedDocument model in backend/app/models/
- [ ] 1.2.1 Define UnifiedDocument dataclass
- [ ] 1.2.2 Add DocumentElement model
- [ ] 1.2.3 Add DocumentMetadata model
- [ ] 1.2.4 Create converters for both OCR and direct extraction outputs
- [ ] 1.3 Create DocumentTypeDetector service
- [ ] 1.3.1 Implement file type detection using python-magic
- [ ] 1.3.2 Add PDF editability checking logic
- [ ] 1.3.3 Add Office document detection
- [ ] 1.3.4 Create routing logic to determine processing track
- [ ] 1.3.5 Add unit tests for detector
## 2. Direct Extraction Track
- [ ] 2.1 Create DirectExtractionEngine service
- [ ] 2.1.1 Implement PyMuPDF-based text extraction
- [ ] 2.1.2 Add structure preservation logic
- [ ] 2.1.3 Extract tables with coordinates
- [ ] 2.1.4 Extract images and their positions
- [ ] 2.1.5 Maintain reading order
- [ ] 2.1.6 Handle multi-column layouts
- [ ] 2.2 Implement layout analysis for editable PDFs
- [ ] 2.2.1 Detect headers and footers
- [ ] 2.2.2 Identify sections and subsections
- [ ] 2.2.3 Parse lists and nested structures
- [ ] 2.2.4 Extract font and style information
- [ ] 2.3 Create direct extraction to UnifiedDocument converter
- [ ] 2.3.1 Map PyMuPDF structures to UnifiedDocument
- [ ] 2.3.2 Preserve coordinate information
- [ ] 2.3.3 Maintain element relationships
## 3. OCR Track Enhancement
- [ ] 3.1 Upgrade PP-StructureV3 configuration
- [ ] 3.1.1 Update config for RTX 4060 8GB optimization
- [ ] 3.1.2 Enable batch processing for GPU efficiency
- [ ] 3.1.3 Configure memory management settings
- [ ] 3.1.4 Set up model caching
- [ ] 3.2 Enhance OCR service to use parsing_res_list
- [ ] 3.2.1 Replace markdown extraction with parsing_res_list
- [ ] 3.2.2 Extract all 23 element types
- [ ] 3.2.3 Preserve bbox coordinates from PP-StructureV3
- [ ] 3.2.4 Maintain reading order information
- [ ] 3.3 Create OCR to UnifiedDocument converter
- [ ] 3.3.1 Map PP-StructureV3 elements to UnifiedDocument
- [ ] 3.3.2 Handle complex nested structures
- [ ] 3.3.3 Preserve all metadata
## 4. Unified Processing Pipeline
- [ ] 4.1 Update main OCR service for dual-track processing
- [ ] 4.1.1 Integrate DocumentTypeDetector
- [ ] 4.1.2 Route to appropriate processing engine
- [ ] 4.1.3 Return UnifiedDocument from both tracks
- [ ] 4.1.4 Maintain backward compatibility
- [ ] 4.2 Create unified JSON export
- [ ] 4.2.1 Define standardized JSON schema
- [ ] 4.2.2 Include processing metadata
- [ ] 4.2.3 Support both track outputs
- [ ] 4.3 Update PDF generator for UnifiedDocument
- [ ] 4.3.1 Adapt PDF generation to use UnifiedDocument
- [ ] 4.3.2 Preserve layout from both tracks
- [ ] 4.3.3 Handle coordinate transformations
## 5. Translation System Foundation
- [ ] 5.1 Create TranslationEngine interface
- [ ] 5.1.1 Define translation API contract
- [ ] 5.1.2 Support element-level translation
- [ ] 5.1.3 Preserve formatting markers
- [ ] 5.2 Implement structure-preserving translation
- [ ] 5.2.1 Translate text while maintaining coordinates
- [ ] 5.2.2 Handle table cell translations
- [ ] 5.2.3 Preserve list structures
- [ ] 5.2.4 Maintain header hierarchies
- [ ] 5.3 Create translated document renderer
- [ ] 5.3.1 Generate PDF with translated text
- [ ] 5.3.2 Adjust layouts for text expansion/contraction
- [ ] 5.3.3 Handle font substitution for target languages
## 6. API Updates
- [ ] 6.1 Update OCR endpoints
- [ ] 6.1.1 Add processing_track parameter
- [ ] 6.1.2 Support track auto-detection
- [ ] 6.1.3 Return processing metadata
- [ ] 6.2 Add document type detection endpoint
- [ ] 6.2.1 Create /analyze endpoint
- [ ] 6.2.2 Return recommended processing track
- [ ] 6.2.3 Provide confidence scores
- [ ] 6.3 Update result export endpoints
- [ ] 6.3.1 Support UnifiedDocument format
- [ ] 6.3.2 Add format conversion options
- [ ] 6.3.3 Include processing track information
## 7. Frontend Updates
- [ ] 7.1 Update task detail view
- [ ] 7.1.1 Display processing track information
- [ ] 7.1.2 Show track-specific metadata
- [ ] 7.1.3 Add track selection UI (if manual override needed)
- [ ] 7.2 Update results preview
- [ ] 7.2.1 Handle UnifiedDocument format
- [ ] 7.2.2 Display enhanced structure information
- [ ] 7.2.3 Show coordinate overlays (debug mode)
- [ ] 7.3 Add translation UI preparation
- [ ] 7.3.1 Add translation toggle/button
- [ ] 7.3.2 Language selection dropdown
- [ ] 7.3.3 Translation progress indicator
## 8. Testing
- [ ] 8.1 Unit tests for DocumentTypeDetector
- [ ] 8.1.1 Test various file types
- [ ] 8.1.2 Test editability detection
- [ ] 8.1.3 Test edge cases
- [ ] 8.2 Unit tests for DirectExtractionEngine
- [ ] 8.2.1 Test text extraction accuracy
- [ ] 8.2.2 Test structure preservation
- [ ] 8.2.3 Test coordinate extraction
- [ ] 8.3 Integration tests for dual-track processing
- [ ] 8.3.1 Test routing logic
- [ ] 8.3.2 Test UnifiedDocument generation
- [ ] 8.3.3 Test backward compatibility
- [ ] 8.4 End-to-end tests
- [ ] 8.4.1 Test scanned PDF processing (OCR track)
- [ ] 8.4.2 Test editable PDF processing (direct track)
- [ ] 8.4.3 Test Office document processing
- [ ] 8.4.4 Test image file processing
- [ ] 8.5 Performance testing
- [ ] 8.5.1 Benchmark both processing tracks
- [ ] 8.5.2 Test GPU memory usage
- [ ] 8.5.3 Compare processing times
## 9. Documentation
- [ ] 9.1 Update API documentation
- [ ] 9.1.1 Document new endpoints
- [ ] 9.1.2 Update existing endpoint docs
- [ ] 9.1.3 Add processing track information
- [ ] 9.2 Create architecture documentation
- [ ] 9.2.1 Document dual-track flow
- [ ] 9.2.2 Explain UnifiedDocument structure
- [ ] 9.2.3 Add decision trees for track selection
- [ ] 9.3 Add deployment guide
- [ ] 9.3.1 Document GPU requirements
- [ ] 9.3.2 Add environment configuration
- [ ] 9.3.3 Include troubleshooting guide
## 10. Deployment Preparation
- [ ] 10.1 Update Docker configuration
- [ ] 10.1.1 Add new dependencies to Dockerfile
- [ ] 10.1.2 Configure GPU support
- [ ] 10.1.3 Update volume mappings
- [ ] 10.2 Update environment variables
- [ ] 10.2.1 Add processing track settings
- [ ] 10.2.2 Configure GPU memory limits
- [ ] 10.2.3 Add feature flags
- [ ] 10.3 Create migration plan
- [ ] 10.3.1 Plan for existing data migration
- [ ] 10.3.2 Create rollback procedures
- [ ] 10.3.3 Document breaking changes
## Completion Checklist
- [ ] All unit tests passing
- [ ] Integration tests passing
- [ ] Performance benchmarks acceptable
- [ ] Documentation complete
- [ ] Code reviewed
- [ ] Deployment tested in staging

View File

@@ -1,226 +0,0 @@
#!/usr/bin/env python3
"""
Proof of Concept: External API Authentication Test
Tests the external authentication API at https://pj-auth-api.vercel.app
"""
import asyncio
import json
from datetime import datetime
from typing import Dict, Any, Optional
import httpx
from pydantic import BaseModel, Field
class UserInfo(BaseModel):
"""User information from external API"""
id: str
name: str
email: str
job_title: Optional[str] = Field(None, alias="jobTitle")
office_location: Optional[str] = Field(None, alias="officeLocation")
business_phones: list[str] = Field(default_factory=list, alias="businessPhones")
class AuthSuccessData(BaseModel):
"""Successful authentication response data"""
access_token: str
id_token: str
expires_in: int
token_type: str
user_info: UserInfo = Field(alias="userInfo")
issued_at: str = Field(alias="issuedAt")
expires_at: str = Field(alias="expiresAt")
class AuthSuccessResponse(BaseModel):
"""Successful authentication response"""
success: bool
message: str
data: AuthSuccessData
timestamp: str
class AuthErrorResponse(BaseModel):
"""Failed authentication response"""
success: bool
error: str
code: str
timestamp: str
class ExternalAuthClient:
"""Client for external authentication API"""
def __init__(self, base_url: str = "https://pj-auth-api.vercel.app", timeout: int = 30):
self.base_url = base_url
self.timeout = timeout
self.endpoint = "/api/auth/login"
async def authenticate(self, username: str, password: str) -> Dict[str, Any]:
"""
Authenticate user with external API
Args:
username: User email/username
password: User password
Returns:
Authentication result dictionary
"""
url = f"{self.base_url}{self.endpoint}"
print(f" Endpoint: POST {url}")
print(f" Username: {username}")
print(f" Timestamp: {datetime.now().isoformat()}")
print()
async with httpx.AsyncClient() as client:
try:
# Make authentication request
start_time = datetime.now()
response = await client.post(
url,
json={"username": username, "password": password},
timeout=self.timeout
)
elapsed = (datetime.now() - start_time).total_seconds()
# Print response details
print("Response Details:")
print(f" Status Code: {response.status_code}")
print(f" Response Time: {elapsed:.3f}s")
print(f" Content-Type: {response.headers.get('content-type', 'N/A')}")
print()
# Parse response
response_data = response.json()
print("Response Body:")
print(json.dumps(response_data, indent=2, ensure_ascii=False))
print()
# Handle success/failure
if response.status_code == 200:
auth_response = AuthSuccessResponse(**response_data)
return {
"success": True,
"status_code": response.status_code,
"data": auth_response.dict(),
"user_display_name": auth_response.data.user_info.name,
"user_email": auth_response.data.user_info.email,
"token": auth_response.data.access_token,
"expires_in": auth_response.data.expires_in,
"expires_at": auth_response.data.expires_at
}
elif response.status_code == 401:
error_response = AuthErrorResponse(**response_data)
return {
"success": False,
"status_code": response.status_code,
"error": error_response.error,
"code": error_response.code
}
else:
return {
"success": False,
"status_code": response.status_code,
"error": f"Unexpected status code: {response.status_code}",
"response": response_data
}
except httpx.TimeoutException:
print(f"❌ Request timeout after {self.timeout} seconds")
return {
"success": False,
"error": "Request timeout",
"code": "TIMEOUT"
}
except httpx.RequestError as e:
print(f"❌ Request error: {e}")
return {
"success": False,
"error": str(e),
"code": "REQUEST_ERROR"
}
except Exception as e:
print(f"❌ Unexpected error: {e}")
return {
"success": False,
"error": str(e),
"code": "UNKNOWN_ERROR"
}
async def test_authentication():
"""Test authentication with different scenarios"""
client = ExternalAuthClient()
# Test scenarios
test_cases = [
{
"name": "Valid Credentials (Example)",
"username": "ymirliu@panjit.com.tw",
"password": "correct_password", # Replace with actual password for testing
"expected": "success"
},
{
"name": "Invalid Credentials",
"username": "test@example.com",
"password": "wrong_password",
"expected": "failure"
}
]
for i, test_case in enumerate(test_cases, 1):
print(f"{'='*60}")
print(f"Test Case {i}: {test_case['name']}")
print(f"{'='*60}")
result = await client.authenticate(
username=test_case["username"],
password=test_case["password"]
)
# Analyze result
print("\nAnalysis:")
if result["success"]:
print("✅ Authentication successful")
print(f" User: {result.get('user_display_name', 'N/A')}")
print(f" Email: {result.get('user_email', 'N/A')}")
print(f" Token expires in: {result.get('expires_in', 0)} seconds")
print(f" Expires at: {result.get('expires_at', 'N/A')}")
else:
print("❌ Authentication failed")
print(f" Error: {result.get('error', 'Unknown error')}")
print(f" Code: {result.get('code', 'N/A')}")
print("\n")
async def test_token_validation():
"""Test token validation and refresh logic"""
# This would be implemented when we have a valid token
print("Token validation test - To be implemented with actual tokens")
pass
def main():
"""Main entry point"""
print("External Authentication API Test")
print("================================\n")
# Run tests
asyncio.run(test_authentication())
print("\nTest completed!")
print("\nNotes for implementation:")
print("1. Use httpx for async HTTP requests (already in requirements)")
print("2. Store tokens securely (consider encryption)")
print("3. Implement automatic token refresh before expiration")
print("4. Handle network failures with retry logic")
print("5. Map external user ID to local user records")
print("6. Display user 'name' field in UI instead of username")
if __name__ == "__main__":
main()