test(msd): add unit tests for _build_detail_table structured output and CSV flatten

Covers task 14.3: verify UPSTREAM_MACHINES/UPSTREAM_MATERIALS list format,
WAFER_ROOT field, multi-reason row expansion, machine deduplication,
and CSV export flatten logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
egg
2026-02-25 11:28:28 +08:00
parent 86984cfeb1
commit cd061e0cfd
2 changed files with 276 additions and 0 deletions


@@ -0,0 +1,93 @@
## 1. Backend: Multi-factor attribution engine
- [x] 1.1 Add `_attribute_materials()` to `mid_section_defect_service.py` — symmetric to `_attribute_defects()`, keyed by `(MATERIALPARTNAME, MATERIALLOTNAME)`, handles NULL lot name gracefully
- [x] 1.2 Add `_attribute_wafer_roots()` to `mid_section_defect_service.py` — keyed by `root_container_name`, builds `root → detection_lots` mapping from lineage roots
- [x] 1.3 Update `DIMENSION_MAP` — remove `by_package`, `by_pj_type`, `by_workflow`; add `by_material`, `by_wafer_root`
- [x] 1.4 Update `_build_all_charts()` to call the new attribution functions for `by_material` and `by_wafer_root` dimensions
- [x] 1.5 Add `lot_count` field to each Pareto bar entry in `_build_chart_data()` (number of associated detection LOTs for that factor)
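The attribution steps in 1.1 and 1.5 reduce to grouping detection rows by an upstream key and computing a defect rate plus an associated-LOT count per key. A minimal sketch of that grouping idea — the function name, record shapes, and field names here are assumptions for illustration, not the actual service code:

```python
from collections import defaultdict

def attribute_by_material(detection_rows, ancestors, materials_by_cid):
    """Aggregate defect/input qty per upstream (part, lot) material key.

    detection_rows:   [{'cid', 'input_qty', 'reject_qty'}, ...]
    ancestors:        {cid: set(ancestor_cids)}
    materials_by_cid: {cid: [{'MATERIALPARTNAME': ..., 'MATERIALLOTNAME': ...}]}
    """
    agg = defaultdict(lambda: {'input': 0, 'reject': 0, 'lots': set()})
    for row in detection_rows:
        cid = row['cid']
        # Consider materials on the detection LOT itself and on its ancestors.
        for acid in ancestors.get(cid, set()) | {cid}:
            for mat in materials_by_cid.get(acid, []):
                # NULL lot names collapse to '' so they still group together.
                key = (mat['MATERIALPARTNAME'], mat.get('MATERIALLOTNAME') or '')
                agg[key]['input'] += row['input_qty']
                agg[key]['reject'] += row['reject_qty']
                agg[key]['lots'].add(cid)
    result = [
        {'part': part, 'lot': lot,
         'defect_rate': (v['reject'] / v['input'] * 100) if v['input'] else 0.0,
         'lot_count': len(v['lots'])}
        for (part, lot), v in agg.items()
    ]
    return sorted(result, key=lambda r: r['defect_rate'], reverse=True)
```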
## 2. Backend: Lineage root extraction
- [x] 2.1 Add root identification logic to `lineage_engine.py` — traverse `child_to_parent` map to find the node with no further parent for each seed
- [x] 2.2 Include `roots` field (`{seed_cid: root_container_name}`) in lineage stage response
- [x] 2.3 Pass `roots` through `build_trace_aggregation_from_events()` into aggregation context
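Task 2.1's root identification is a straight walk up the parent map. A minimal sketch, assuming a `{child_cid: parent_cid}` dict (the actual traversal in `lineage_engine.py` may differ):

```python
def find_roots(seed_cids, child_to_parent):
    """Walk each seed up the child->parent map until no parent remains."""
    roots = {}
    for seed in seed_cids:
        node, seen = seed, set()
        while node in child_to_parent and node not in seen:
            seen.add(node)      # guard against accidental cycles in the map
            node = child_to_parent[node]
        roots[seed] = node      # a seed with no parent is its own root
    return roots
```

The self-root case in test 14.2 falls out naturally: a seed absent from the map never enters the loop and maps to itself.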
## 3. Backend: Staged trace materials domain
- [x] 3.1 In `trace_routes.py` events stage, add `materials` to the domain list for `mid_section_defect` profile backward mode
- [x] 3.2 Wire materials domain records through `_flatten_domain_records()` into aggregation input
## 4. Backend: Structured detail table
- [x] 4.1 Modify `_build_detail_table()` — change `UPSTREAM_MACHINES` from comma-separated string to list of `{"station": "...", "machine": "..."}` objects
- [x] 4.2 Add `UPSTREAM_MATERIALS` field to detail records — list of `{"part": "...", "lot": "..."}` objects (when materials data is available)
- [x] 4.3 Add `WAFER_ROOT` field to detail records — root ancestor `CONTAINERNAME` string
- [x] 4.4 Add `UPSTREAM_MACHINE_COUNT` field to detail records — count of unique upstream machines per LOT
- [x] 4.5 Update CSV export in `mid_section_defect_routes.py` — flatten structured `UPSTREAM_MACHINES` back to comma-separated `station/machine` format for CSV compatibility
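The flatten in 4.5 is the inverse of 4.1/4.2: join each structured list back into the comma-separated `station/machine` (and `part/lot`) strings the CSV columns expect. A sketch of that logic (the route code may flatten inline rather than through a helper like this):

```python
def flatten_for_csv(record):
    """Flatten structured detail fields back to CSV-friendly strings."""
    flat = dict(record)
    flat['UPSTREAM_MACHINES'] = ', '.join(
        f"{m['station']}/{m['machine']}"
        for m in record.get('UPSTREAM_MACHINES', [])
    )
    flat['UPSTREAM_MATERIALS'] = ', '.join(
        f"{m['part']}/{m['lot']}"
        for m in record.get('UPSTREAM_MATERIALS', [])
    )
    return flat
```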
## 5. Backend: Equipment recent jobs endpoint
- [x] 5.1 Add `GET /api/query-tool/equipment-recent-jobs/<equipment_id>` endpoint in `query_tool_routes.py` — query `DW_MES_JOB` for last 30 days, return top 5 most recent JOB records (JOBID, JOBSTATUS, JOBMODELNAME, CREATEDATE, COMPLETEDATE)
- [x] 5.2 Add SQL file `src/mes_dashboard/sql/query_tool/equipment_recent_jobs.sql` for the query
## 6. Backend: Reject history Pareto dimensions
- [x] 6.1 Add `dimension` parameter to `query_reason_pareto()` in `reject_history_service.py` — support `reason` (default), `package`, `type`, `workflow`, `workcenter`, `equipment` as groupby keys
- [x] 6.2 Update `reject_history_routes.py` to accept and pass `dimension` query parameter
- [x] 6.3 Ensure two-phase caching still works (groupby from cached DataFrame, no re-query)
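The two-phase idea in 6.3: phase one caches the raw query result as a DataFrame; phase two only re-groups that cached frame when the dimension changes, with no re-query. A pandas sketch of the groupby phase — the column mapping and names are assumptions:

```python
import pandas as pd

DIMENSION_COLUMNS = {  # assumed dimension -> groupby column mapping
    'reason': 'LOSSREASONNAME',
    'workcenter': 'WORKCENTERNAME',
    'equipment': 'EQUIPMENTNAME',
}

def pareto_from_cache(df, dimension='reason'):
    """Re-group a cached DataFrame by the chosen dimension (no re-query)."""
    col = DIMENSION_COLUMNS[dimension]
    grouped = (df.groupby(col, as_index=False)['REJECTQTY'].sum()
                 .sort_values('REJECTQTY', ascending=False))
    total = grouped['REJECTQTY'].sum()
    grouped['CUM_PCT'] = grouped['REJECTQTY'].cumsum() / total * 100
    return grouped
```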
## 7. Backend: Analysis summary data
- [x] 7.1 Add `total_ancestor_count` to lineage stage response — count of unique ancestor CIDs (excluding seed CIDs)
- [x] 7.2 Ensure backward aggregation response includes summary fields: total detection lots, total input qty, defective lot count, total reject qty, ancestor coverage count
## 8. Frontend: Multi-factor Pareto charts
- [x] 8.1 Update `App.vue` backward chart section — replace 6-chart layout with 5-chart layout (2-2-1): machine | material, wafer_root | loss_reason, detection_machine
- [x] 8.2 Add chart builder functions for materials and wafer root attribution data (same pattern as `buildMachineChartFromAttribution`)
- [x] 8.3 Update `useTraceProgress.js` — in backward mode, request `domains: ['upstream_history', 'materials']`
- [x] 8.4 Wire new chart data through session caching (save/load from sessionStorage)
## 9. Frontend: Pareto chart enhancements (ParetoChart.vue)
- [x] 9.1 Add sort toggle button (依不良數 / 依不良率, i.e. sort by defect count / by defect rate) — per-chart state, re-sort data and recalculate cumulative %
- [x] 9.2 Add 80% cumulative markLine — horizontal dashed line at y=80 on the percentage axis, muted color `#94a3b8`, labeled「80%」
- [x] 9.3 Add `lot_count` to tooltip formatter — show「關聯 LOT 數: N (xx%)」(associated LOT count)
## 10. Frontend: Analysis summary panel
- [x] 10.1 Create `AnalysisSummary.vue` component — collapsible panel with query context, data scope stats, and attribution methodology text
- [x] 10.2 Integrate into `App.vue` above KPI cards — pass query params and summary data as props
- [x] 10.3 Handle container mode variant (show input type and resolved count instead of date range)
- [x] 10.4 Persist collapsed/expanded state in sessionStorage
## 11. Frontend: Detail table suspect hit column
- [x] 11.1 Update `DetailTable.vue` — replace the「上游機台」(upstream machines) column with a「嫌疑命中」(suspect hit) column
- [x] 11.2 Implement suspect list derivation — extract machine names from the current Pareto Top N (respecting inline station/spec filters)
- [x] 11.3 Render hit cell: show matching machine names with a ratio (e.g., `WIRE-03, DIE-01 (2/5)`), star/highlight for a full match,「-」for no hits
- [x] 11.4 Add an「上游台數」(upstream machine count) column showing the total unique upstream machine count per LOT
- [x] 11.5 Make suspect list reactive to Pareto inline filter changes
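The hit derivation in 11.2–11.3 is language-agnostic even though it lives in Vue; a Python sketch of the same logic, with illustrative names only:

```python
def suspect_hits(lot_machines, suspect_top_n):
    """Return (matched_machine_names, ratio_string) for one LOT row."""
    suspects = {m['machine'] for m in suspect_top_n}
    hits = [m['machine'] for m in lot_machines if m['machine'] in suspects]
    if not hits:
        return [], '-'          # rendered as「-」when nothing matches
    return hits, f"{len(hits)}/{len(lot_machines)}"
```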
## 12. Frontend: Suspect machine context panel
- [x] 12.1 Create `SuspectContextPanel.vue` — popover component with attribution summary section and maintenance section
- [x] 12.2 Attribution summary content: equipment name, workcenter group, resource family, defect rate, defect count, input count, LOT count (all available from existing attribution data)
- [x] 12.3 Maintenance section: fetch recent JOB records from `/api/query-tool/equipment-recent-jobs/<equipment_id>`, show up to 5 records; loading state while fetching;「近 30 天無維修紀錄」("no maintenance records in the last 30 days") when empty
- [x] 12.4 Integrate with ParetoChart.vue — emit a click event on bars of the「依上游機台歸因」(attribution by upstream machine) chart only; position the popover near the clicked bar
- [x] 12.5 Close on outside click or re-click of same bar
## 13. Frontend: Reject history Pareto dimensions
- [x] 13.1 Add dimension selector dropdown to `ParetoSection.vue` in reject-history — options: 不良原因 (loss reason), PACKAGE, TYPE, WORKFLOW, 站點 (station), 機台 (machine)
- [x] 13.2 Update API call to pass `dimension` parameter
- [x] 13.3 Update `App.vue` in reject-history to wire dimension state
## 14. Tests
- [x] 14.1 Add unit tests for `_attribute_materials()` in `tests/test_mid_section_defect.py` — verify correct rate calculation, NULL lot name handling
- [x] 14.2 Add unit tests for `_attribute_wafer_roots()` — verify root mapping, self-root case
- [x] 14.3 Add unit tests for structured `_build_detail_table()` output — verify list format, CSV flatten
- [x] 14.4 Add tests for equipment-recent-jobs endpoint in `tests/test_query_tool_routes.py`
- [x] 14.5 Add tests for reject history dimension Pareto in `tests/test_reject_history_routes.py`
- [x] 14.6 Run full test suite and fix regressions


@@ -10,7 +10,9 @@ import pandas as pd
from mes_dashboard.services.mid_section_defect_service import (
_attribute_materials,
_attribute_wafer_roots,
_build_detail_table,
build_trace_aggregation_from_events,
export_csv,
query_analysis,
query_analysis_detail,
query_all_loss_reasons,
@@ -362,3 +364,184 @@ def test_attribute_wafer_roots_multiple_roots():
# Sorted by DEFECT_RATE desc
assert result[0]['ROOT_CONTAINER_NAME'] == 'ROOT-B'
assert result[1]['ROOT_CONTAINER_NAME'] == 'ROOT-A'

# --- _build_detail_table tests ---

def _make_detection_df(rows):
"""Helper: build a DataFrame like _fetch_station_detection_data output."""
return pd.DataFrame(rows)

def test_build_detail_table_structured_upstream_machines():
"""UPSTREAM_MACHINES should be a list of {station, machine} objects."""
df = _make_detection_df([
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'DET-01', 'TRACKINQTY': 100,
'REJECTQTY': 5, 'LOSSREASONNAME': 'R1',
},
])
ancestors = {'C1': {'A1'}}
upstream_by_cid = {
'A1': [
{'workcenter_group': '中段', 'equipment_name': 'WIRE-01'},
{'workcenter_group': '後段', 'equipment_name': 'DIE-01'},
],
'C1': [
{'workcenter_group': '測試', 'equipment_name': 'TEST-01'},
],
}
result = _build_detail_table(df, ancestors, upstream_by_cid)
assert len(result) == 1
row = result[0]
machines = row['UPSTREAM_MACHINES']
assert isinstance(machines, list)
assert len(machines) == 3
assert {'station': '中段', 'machine': 'WIRE-01'} in machines
assert {'station': '後段', 'machine': 'DIE-01'} in machines
assert {'station': '測試', 'machine': 'TEST-01'} in machines
assert row['UPSTREAM_MACHINE_COUNT'] == 3

def test_build_detail_table_structured_upstream_materials():
"""UPSTREAM_MATERIALS should be a list of {part, lot} objects."""
df = _make_detection_df([
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'DET-01', 'TRACKINQTY': 100,
'REJECTQTY': 0, 'LOSSREASONNAME': '',
},
])
ancestors = {'C1': {'A1'}}
upstream_by_cid = {}
materials_by_cid = {
'A1': [
{'MATERIALPARTNAME': 'PART-X', 'MATERIALLOTNAME': 'ML-1'},
{'MATERIALPARTNAME': 'PART-Y', 'MATERIALLOTNAME': ''},
],
}
result = _build_detail_table(
df, ancestors, upstream_by_cid, materials_by_cid=materials_by_cid,
)
assert len(result) == 1
materials = result[0]['UPSTREAM_MATERIALS']
assert isinstance(materials, list)
assert len(materials) == 2
assert {'part': 'PART-X', 'lot': 'ML-1'} in materials
assert {'part': 'PART-Y', 'lot': ''} in materials

def test_build_detail_table_wafer_root():
"""WAFER_ROOT should be the root ancestor container name."""
df = _make_detection_df([
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'D', 'TRACKINQTY': 100,
'REJECTQTY': 3, 'LOSSREASONNAME': 'R1',
},
])
ancestors = {'C1': set()}
upstream_by_cid = {}
roots = {'C1': 'WAFER-ROOT-001'}
result = _build_detail_table(
df, ancestors, upstream_by_cid, roots=roots,
)
assert result[0]['WAFER_ROOT'] == 'WAFER-ROOT-001'

def test_build_detail_table_multiple_defect_reasons_expand_rows():
"""LOT with multiple defect reasons should produce one row per reason."""
df = _make_detection_df([
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'D', 'TRACKINQTY': 200,
'REJECTQTY': 5, 'LOSSREASONNAME': 'R1',
},
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'D', 'TRACKINQTY': 200,
'REJECTQTY': 3, 'LOSSREASONNAME': 'R2',
},
])
result = _build_detail_table(df, {'C1': set()}, {})
assert len(result) == 2
reasons = [r['LOSS_REASON'] for r in result]
assert 'R1' in reasons
assert 'R2' in reasons
assert result[0]['DEFECT_QTY'] + result[1]['DEFECT_QTY'] == 8

def test_build_detail_table_deduplicates_machines():
"""Same machine appearing in multiple ancestors should appear only once."""
df = _make_detection_df([
{
'CONTAINERID': 'C1', 'CONTAINERNAME': 'LOT-1', 'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P', 'WORKFLOW': 'W', 'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'D', 'TRACKINQTY': 100,
'REJECTQTY': 1, 'LOSSREASONNAME': 'R1',
},
])
ancestors = {'C1': {'A1', 'A2'}}
# Same machine in both ancestors
upstream_by_cid = {
'A1': [{'workcenter_group': '中段', 'equipment_name': 'EQ-01'}],
'A2': [{'workcenter_group': '中段', 'equipment_name': 'EQ-01'}],
}
result = _build_detail_table(df, ancestors, upstream_by_cid)
assert result[0]['UPSTREAM_MACHINE_COUNT'] == 1
assert len(result[0]['UPSTREAM_MACHINES']) == 1

@patch('mes_dashboard.services.mid_section_defect_service.query_analysis')
def test_export_csv_flattens_structured_fields(mock_query_analysis):
"""CSV export should flatten UPSTREAM_MACHINES and UPSTREAM_MATERIALS to strings."""
mock_query_analysis.return_value = {
'detail': [
{
'CONTAINERNAME': 'LOT-1',
'PJ_TYPE': 'T',
'PRODUCTLINENAME': 'P',
'WORKFLOW': 'W',
'FINISHEDRUNCARD': 'FR',
'DETECTION_EQUIPMENTNAME': 'D',
'INPUT_QTY': 100,
'LOSS_REASON': 'R1',
'DEFECT_QTY': 5,
'DEFECT_RATE': 5.0,
'ANCESTOR_COUNT': 1,
'UPSTREAM_MACHINE_COUNT': 2,
'UPSTREAM_MACHINES': [
{'station': '中段', 'machine': 'WIRE-01'},
{'station': '後段', 'machine': 'DIE-02'},
],
'UPSTREAM_MATERIALS': [
{'part': 'PART-A', 'lot': 'ML-1'},
],
'WAFER_ROOT': 'ROOT-001',
},
],
}
lines = list(export_csv('2025-01-01', '2025-01-31', direction='backward'))
# First line is BOM, second is header, third is data
assert len(lines) == 3
data_line = lines[2]
assert '中段/WIRE-01, 後段/DIE-02' in data_line
assert 'PART-A/ML-1' in data_line