Chunk failures in BatchQueryEngine were silently discarded — `has_partial_failure` was tracked
in Redis but never surfaced to the API response or frontend. Users could see incomplete data
without any warning. This commit closes the gap end-to-end:
Backend:
- Track failed chunk time ranges (`failed_ranges`) in batch engine progress metadata
- Add single retry for transient Oracle errors (timeout, connection) in `_execute_single_chunk`
- Read `get_batch_progress()` after merge but before `redis_clear_batch()` cleanup
- Inject `has_partial_failure`, `failed_chunk_count`, `failed_ranges` into API response meta
- Persist partial failure flag to independent Redis key with TTL aligned to data storage layer
- Add shared container-resolution policy module with wildcard/expansion guardrails
- Refactor reason filter from single-value to multi-select (`reason` → `reasons`)
Frontend:
- Add client-side date range validation (730-day limit) before API submission
- Display amber warning banner on partial failure with specific failed date ranges
- Support generic fallback message for container-mode queries without date ranges
- Update FilterPanel to support multi-select reason chips
Specs & tests:
- Create batch-query-resilience spec; update reject-history-api and reject-history-page specs
- Add 7 new tests for retry, memory guard, failed ranges, partial failure propagation, TTL
- Cross-service regression verified (hold, resource, job, msd — 411 tests pass)
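The retry and failed-range bookkeeping described under Backend could look roughly like the sketch below. It is not the code in `_execute_single_chunk`: `TransientOracleError`, `execute_chunk_with_retry`, and `run_batch` are hypothetical names, and only the meta keys (`has_partial_failure`, `failed_chunk_count`, `failed_ranges`) come from this commit.

```python
from datetime import date
from typing import Callable, Dict, List, Optional, Tuple

Chunk = Tuple[date, date]  # (start, end) of one chunk's time range


class TransientOracleError(Exception):
    """Hypothetical stand-in for a timeout or connection error."""


def execute_chunk_with_retry(
    run_chunk: Callable[[Chunk], List[str]],
    chunk: Chunk,
    max_retries: int = 1,
) -> Optional[List[str]]:
    """Run one chunk, retrying transient failures; return None if it still fails."""
    for attempt in range(max_retries + 1):
        try:
            return run_chunk(chunk)
        except TransientOracleError:
            if attempt == max_retries:
                return None
    return None


def run_batch(
    run_chunk: Callable[[Chunk], List[str]], chunks: List[Chunk]
) -> Tuple[List[str], Dict[str, object]]:
    """Merge successful chunks and record the time ranges of failed ones."""
    rows: List[str] = []
    failed_ranges: List[Dict[str, str]] = []
    for chunk in chunks:
        result = execute_chunk_with_retry(run_chunk, chunk)
        if result is None:
            failed_ranges.append(
                {"start": chunk[0].isoformat(), "end": chunk[1].isoformat()}
            )
        else:
            rows.extend(result)
    meta = {
        "has_partial_failure": bool(failed_ranges),
        "failed_chunk_count": len(failed_ranges),
        "failed_ranges": failed_ranges,
    }
    return rows, meta
```

The returned `meta` dict is what would be merged into the API response so the frontend can render the warning banner with the specific date ranges.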
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Date changes no longer auto-trigger Oracle queries. Users can freely adjust
dates and click "查詢" (Query) to execute. Hold Type still auto-refreshes from cache.
Record Type checkboxes now use pill-toggle style (hidden native checkbox).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix dimension Pareto datasources: PJ_TYPE/PRODUCTLINENAME from DW_MES_CONTAINER,
WORKFLOWNAME from DW_MES_LOTWIPHISTORY via WIPTRACKINGGROUPKEYID, EQUIPMENTNAME
from LOTREJECTHISTORY only (no WIP fallback), workcenter dimension uses WORKCENTER_GROUP
- Add multi-select Pareto click filtering with chip display and detail list integration
- Add TOP 20 display scope selector for TYPE/WORKFLOW/equipment (機台) dimensions
- Pass Pareto selection (dimension + values) through to list/export endpoints
- Enable TRACE_WORKER_ENABLED=true by default in start_server.sh and .env.example
- Archive reject-history-pareto-datasource-fix and reject-history-pareto-ux-enhancements
- Update reject-history-api and reject-history-page specs with new Pareto behaviors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use CSS container queries to dynamically size summary values based on
card width, preventing large QTY numbers from overflowing the 7-column
card grid.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename page title to "Hold 歷史績效" (Hold Historical Performance). Change
trend, reason pareto, and duration derivation to use QTY-based counting so
cards, trend chart, and analytical charts are consistent with each other.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add PACKAGE_LEF as a dedicated `package` field in the QC-GATE API payload
and display it as a new column after LOT ID in LotTable.vue. Archive
qc-gate-lot-package-column, historical-query-slow-connection, and
msd-multifactor-backward-tracing changes with their delta specs synced
to main specs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes for the reject history query feature:
1. Fix DPY-4010 bind variable error when querying by WORKORDER: the
workflow_lookup CTE hardcoded :start_date/:end_date, which are not
provided in container mode. Replaced them with a {{ WORKFLOW_FILTER }}
template slot that resolves to a date-based filter in date mode and a
container-based filter in container mode.
2. Move policy toggle filters (material scrap, PB_diode, excluded reasons)
from SQL-level to in-memory pandas filtering. Cache now stores unfiltered
data so toggling policy filters reuses cached results instantly instead
of requiring a ~30s Oracle round-trip per combination.
3. Add per-WORKORDER expansion_info display in FilterPanel for multi-order
container resolution diagnostics.
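Fix 1 can be illustrated with a minimal sketch of the slot substitution. Only the `{{ WORKFLOW_FILTER }}` slot and the `:start_date`/`:end_date` bind names come from this commit; the SQL text, the column names, and `render_workflow_sql` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical query text; only the slot name is taken from the commit.
SQL_TEMPLATE = (
    "SELECT h.CONTAINERID, h.WORKFLOWNAME "
    "FROM DW_MES_LOTWIPHISTORY h "
    "WHERE {{ WORKFLOW_FILTER }}"
)


def render_workflow_sql(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    container_ids: Optional[List[str]] = None,
) -> Tuple[str, Dict[str, str]]:
    """Fill the slot so the SQL only references binds that are supplied."""
    if container_ids:
        # Container mode: one bind per ID, no date binds at all.
        names = [f":cid_{i}" for i in range(len(container_ids))]
        clause = f"h.CONTAINERID IN ({', '.join(names)})"
        binds = {f"cid_{i}": cid for i, cid in enumerate(container_ids)}
    else:
        # Date mode: TXN_DATE is an assumed column name.
        if start_date is None or end_date is None:
            raise ValueError("date mode needs start_date and end_date")
        clause = "h.TXN_DATE >= :start_date AND h.TXN_DATE < :end_date"
        binds = {"start_date": start_date, "end_date": end_date}
    return SQL_TEMPLATE.replace("{{ WORKFLOW_FILTER }}", clause), binds
```

Because the clause and the bind dict are built together, the statement can never reference a bind variable that the caller did not pass, which is the failure mode behind DPY-4010.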
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove Jinja2 template fallback (1249 lines) — /admin/performance now serves
Vue SPA exclusively via send_from_directory.
Backend:
- Add _SLOW_QUERY_WAITING counter with get_slow_query_waiting_count()
- Record slow-path latency in read_sql_df_slow/iter via record_query_latency()
- Extend metrics_history schema with slow_query_active, slow_query_waiting,
worker_rss_bytes columns + ALTER TABLE migration for existing DBs
- Add cleanup_archive_logs() with configurable ARCHIVE_LOG_DIR/KEEP_COUNT
- Integrate archive cleanup into MetricsHistoryCollector 50-min cycle
Frontend:
- Add slow_query_active and slow_query_waiting StatCards to connection pool
- Add slow_query_active trend line to pool trend chart
- Add Worker memory (RSS MB) trend chart with preprocessing
- Update modernization gate check path to frontend style.css
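The ALTER TABLE migration for existing databases could be sketched as below, assuming the metrics store is SQLite. The three column names are from this commit; the `INTEGER` types and the function name `migrate_metrics_history` are assumptions.

```python
import sqlite3
from typing import List

# Column names from the commit; the INTEGER types are an assumption.
NEW_COLUMNS = {
    "slow_query_active": "INTEGER",
    "slow_query_waiting": "INTEGER",
    "worker_rss_bytes": "INTEGER",
}


def migrate_metrics_history(conn: sqlite3.Connection) -> List[str]:
    """Add any missing columns to metrics_history; safe to run repeatedly."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute("PRAGMA table_info(metrics_history)")}
    added = []
    for name, sql_type in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE metrics_history ADD COLUMN {name} {sql_type}")
            added.append(name)
    conn.commit()
    return added
```

Checking the existing columns first keeps the migration idempotent, so it can run on every startup without failing on databases that were already upgraded.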
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three proposals addressing the 2026-02-25 trace pipeline OOM crash (114K CIDs):
1. trace-events-memory-triage: fetchmany iterator (read_sql_df_slow_iter),
admission control (50K CID limit for non-MSD), cache skip for large queries,
early memory release with gc.collect()
2. trace-async-job-queue: RQ-based async jobs for queries >20K CIDs,
separate worker process with isolated memory, frontend polling via
useTraceProgress composable, systemd service + deploy scripts
3. trace-streaming-response: chunked Redis storage (TRACE_STREAM_BATCH_SIZE=5000),
NDJSON stream endpoint (GET /api/trace/job/<id>/stream), frontend
ReadableStream consumer for progressive rendering, backward-compatible
with legacy single-key storage
All three proposals archived. 1101 tests pass, frontend builds clean.
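The batching and NDJSON framing in proposal 3 can be sketched in a few lines. The batch size of 5000 matches `TRACE_STREAM_BATCH_SIZE`; `iter_batches` and `ndjson_lines` are hypothetical helper names, and the Redis and HTTP layers are left out.

```python
import json
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]


def iter_batches(records: Iterable[Record], batch_size: int = 5000) -> Iterator[List[Record]]:
    """Group records into fixed-size batches; the last batch may be shorter."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def ndjson_lines(batches: Iterable[List[Record]]) -> Iterator[str]:
    """Emit one JSON document per line so a client can parse incrementally."""
    for batch in batches:
        for record in batch:
            yield json.dumps(record, ensure_ascii=False) + "\n"
```

Each batch would be stored under its own key and streamed in order, so neither the server nor the browser has to hold the full result in memory at once.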
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Station detection query with large date ranges (5+ months) exceeds the
55s pool call_timeout. Switch to read_sql_df_slow (300s, dedicated
connection) matching other historical query services.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two changes combined:
1. historical-query-slow-connection: Migrate all historical query pages
to read_sql_df_slow with semaphore concurrency control (max 3),
raise DB slow timeout to 300s, gunicorn timeout to 360s, and
unify frontend timeouts to 360s for all historical pages.
2. hold-resource-history-dataset-cache: Convert hold-history and
resource-history from multi-query to single-query + dataset cache
pattern (L1 ProcessLevelCache + L2 Redis parquet/base64, TTL=900s).
Replace old GET endpoints with POST /query + GET /view two-phase
API. Frontend auto-retries on 410 cache_expired.
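The semaphore-based concurrency control in change 1 might look like the sketch below. The limit of 3 is from this commit; `slow_query_slot` and its wait timeout are assumptions, and the real `read_sql_df_slow` may acquire its slot differently.

```python
import threading
from contextlib import contextmanager
from typing import Iterator

MAX_SLOW_QUERIES = 3  # concurrency limit stated in the commit
_slow_query_slots = threading.BoundedSemaphore(MAX_SLOW_QUERIES)


@contextmanager
def slow_query_slot(wait_timeout_s: float = 60.0) -> Iterator[None]:
    """Hold one of the slow-query slots for the duration of the block.

    wait_timeout_s is an assumed parameter: how long a caller waits for a
    free slot before giving up.
    """
    if not _slow_query_slots.acquire(timeout=wait_timeout_s):
        raise TimeoutError("no slow-query slot became available")
    try:
        yield
    finally:
        # Always release, even if the query raised.
        _slow_query_slots.release()
```

A caller would wrap the dedicated-connection query in `with slow_query_slot(): ...`, so at most three long-running historical queries hit Oracle at the same time.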
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers task 14.3: verify UPSTREAM_MACHINES/UPSTREAM_MATERIALS list format,
WAFER_ROOT field, multi-reason row expansion, machine deduplication,
and CSV export flatten logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reject History:
- Compute dimension pareto (package/type/workflow/workcenter/equipment) from
cached DataFrame instead of re-querying Oracle per dimension change
- Propagate supplementary filters and trend date selection to dimension pareto
- Add staleness tracking to prevent race conditions on rapid dimension switches
- Add WORKFLOWNAME to detail and export outputs
- Fix button hover visibility with CSS specificity
MSD (製程不良追溯分析, process defect traceability analysis):
- Separate raw events caching from aggregation computation so changing
loss_reasons uses EventFetcher per-domain cache (fast) and recomputes
aggregation with current filters instead of returning stale cached results
- Exclude loss_reasons from MSD seed cache key since seed resolution does
not use it, avoiding unnecessary Oracle re-queries
- Add suspect context panel, analysis summary, upstream station/spec filters
- Add machine bar click drill-down and filtered attribution charts
Query Tool:
- Support batch container_ids in lot CSV export (history/materials/rejects/holds)
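The MSD seed cache key change can be sketched as follows. Only the idea of leaving `loss_reasons` out of the key comes from this commit; the key prefix, the hashing scheme, and `seed_cache_key` are hypothetical.

```python
import hashlib
import json
from typing import Dict

# Parameters that do not affect seed resolution and so must not affect the key.
IGNORED_FOR_SEED = {"loss_reasons"}


def seed_cache_key(params: Dict[str, object]) -> str:
    """Build a stable key from only the parameters seed resolution uses."""
    relevant = {k: v for k, v in params.items() if k not in IGNORED_FOR_SEED}
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps(relevant, sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return "msd:seed:" + digest
```

With this shape, changing only `loss_reasons` maps to the same cached seeds, while changing the station or date range still produces a new key.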
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Increase CSS specificity for .btn-export to prevent portal-shell override on hover
- Remove RESOURCEID and CONTAINERIDS from jobs tab display columns
- Add lot_jobs_with_txn.sql joining JOB with JOBTXNHISTORY for complete export
- Route lot_jobs export through get_lot_jobs_with_history() for full transaction data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Export was sending only one container_id while the detail table loaded
data for all selected CIDs (including subtree), causing "查無資料" (no data
found) errors.
Now sends container_ids array and uses batch service functions.
Also rounds hold detail age KPI cards to 1 decimal place.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add DEFECTQTY to reject SUM in station_detection, station_detection_by_ids,
and downstream_rejects SQL so KPI/charts include both charge-off and
non-charge-off reject quantities
- Wire forward direction through events-based trace pipeline so downstream
pareto charts and detail table populate correctly
- Remove inappropriate 5-min auto-refresh from query tool page; replace
useAutoRefresh with local createAbortSignal for request cancellation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Transform /mid-section-defect from TMTT-only backward analysis into a full-line
bidirectional defect traceability center supporting all detection stations.
Key changes:
- Parameterized station detection: any workcenter group as detection station
- Bidirectional tracing: backward (upstream attribution) + forward (downstream reject rates)
- Dual query mode: date range OR LOT/work order (工單)/WAFER container-based seed resolution
- Multi-select filters for upstream station, equipment model (RESOURCEFAMILYNAME), and loss reasons
- Progressive 3-stage trace pipeline (seed-resolve → lineage → events) with streaming UI
- Equipment model lookup via resource cache instead of SPECNAME
- Session caching, auto-refresh, searchable MultiSelect with fuzzy matching
- Remove legacy tmtt-defect module (fully superseded)
- Archive openspec change artifacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Equipment selector shows only RESOURCENAME (remove redundant RESOURCEID)
- Equipment lots table: remove CONTAINERID, add WAFER LOT/TYPE/BOP/WORKORDER
- Rename CONTAINERNAME to LOT ID across all tables and CSV exports
- Rename PJ_TYPE/PJ_BOP/PJ_WORKORDER to TYPE/BOP/WORKORDER in history and equipment lots
- Add export formatters for equipment_lots, lot_history, and lot_rejects
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary grid used fixed 10-column layout with viewport-based media queries
that didn't account for sidebar width changes, causing overflow when sidebar
opened and blank space at certain breakpoints.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace dynamic Object.keys(row) column derivation with explicit display
column lists matching field_contracts.json txn_table definition, ensuring
consistent column order across job-query and query-tool transaction tables.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move query-tool from dev-tools drawer to main drawer and update status
from dev to released now that the UI alignment work is complete.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The MultiSelect component styles (.multi-select-*) and .btn-sm were
previously provided by resource-shared/styles.css which is no longer
loaded by the portal-shell native module registry for this route.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convert all 18 query-tool Vue components from Tailwind utility classes to
semantic CSS classes (.header, .card, .btn-primary, .query-tool-tab, etc.)
consistent with reject-history, hold-overview, and other pages. Create
self-contained style.css with design tokens, base classes, and page-specific
styles. Fix portal-shell native module loader to load query-tool/style.css
instead of resource-shared/styles.css. Add CSS link tags to Django template
for standalone page rendering.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rebuild /admin/performance from Jinja2 to Vue 3 SPA with ECharts, adding
cache telemetry infrastructure, connection pool monitoring, and SQLite-backed
historical metrics collection with trend chart visualization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace per-interaction Oracle queries with a two-phase model:
- POST /query: single Oracle hit, cache full LOT-level DataFrame (L1+L2)
- GET /view: read cache, apply supplementary/interactive filters via pandas
Add container query mode (LOT, work order (工單), or WAFER LOT, with wildcard support),
supplementary filters (Package/WC GROUP/Reason) from cached data,
PB_* series exclusion (was PB_Diode only), and query loading spinner.
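The two-phase model can be sketched as below. Plain lists of dicts stand in for the cached pandas DataFrame, a module-level dict stands in for the L1+L2 cache, and `post_query`, `get_view`, and the row field names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]

# Stand-in for the L1+L2 dataset cache, keyed by query ID.
_DATASETS: Dict[str, List[Row]] = {}


def post_query(query_id: str, fetch_rows: Callable[[], List[Row]]) -> int:
    """Phase 1: hit the database once and cache the full LOT-level dataset."""
    _DATASETS[query_id] = fetch_rows()
    return len(_DATASETS[query_id])


def get_view(
    query_id: str,
    packages: Optional[List[str]] = None,
    reasons: Optional[List[str]] = None,
) -> List[Row]:
    """Phase 2: filter the cached dataset; no database access."""
    rows = _DATASETS[query_id]

    def keep(row: Row) -> bool:
        if packages and row["package"] not in packages:
            return False
        if reasons and row["reason"] not in reasons:
            return False
        return True

    return [row for row in rows if keep(row)]
```

Every filter interaction after the initial query calls only `get_view`, which is why toggling supplementary filters no longer costs an Oracle round-trip.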
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pareto filter now includes the item that crosses the 80% threshold and
guarantees at least 5 items, so the chart stays useful when one reason
dominates (e.g. defect-only mode). Detail table shows a spinner overlay
while the list is refreshing.
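A minimal sketch of the Pareto cutoff rule described above, assuming items arrive as `(name, qty)` pairs. `pareto_cutoff` and its parameter names are hypothetical; the 80% threshold and the minimum of 5 items are from this commit.

```python
from typing import List, Tuple

Item = Tuple[str, float]


def pareto_cutoff(items: List[Item], threshold: float = 0.8, min_items: int = 5) -> List[Item]:
    """Keep items up to and including the one that crosses the threshold,
    but never fewer than min_items when that many exist."""
    ordered = sorted(items, key=lambda item: item[1], reverse=True)
    total = sum(qty for _, qty in ordered)
    if total <= 0:
        return ordered[:min_items]
    kept: List[Item] = []
    running = 0.0
    for name, qty in ordered:
        # Append first so the item that crosses the threshold is included.
        kept.append((name, qty))
        running += qty
        if running / total >= threshold and len(kept) >= min_items:
            break
    return kept
```

When a single reason holds 90% of the quantity, the loop keeps going until five items are collected instead of stopping at one bar.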
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SQL CTEs now join on SPECNAME instead of WORKCENTERNAME to resolve
correct WORKCENTER/GROUP from DW_MES_SPEC_WORKCENTER_V, fixing cases
where the raw WORKCENTERNAME was mismatched (e.g. W/B-END with 成型_料).
WORKCENTER_GROUP filter converts groups→specs via cached mapping before
querying. Pareto chart now recalculates on legend toggle by spreading
the ECharts selected object to trigger Vue reactivity.
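The group-to-spec conversion for the WORKCENTER_GROUP filter could be sketched like this. The mapping contents and `specs_for_groups` are hypothetical; in the real service the mapping would come from the cached DW_MES_SPEC_WORKCENTER_V lookup.

```python
from typing import Dict, List


def specs_for_groups(group_to_specs: Dict[str, List[str]], groups: List[str]) -> List[str]:
    """Expand selected workcenter groups into a de-duplicated list of SPECNAMEs."""
    specs: List[str] = []
    seen = set()
    for group in groups:
        # Unknown groups contribute nothing rather than raising.
        for spec in group_to_specs.get(group, []):
            if spec not in seen:
                seen.add(spec)
                specs.append(spec)
    return specs
```

The resulting spec list is what the SQL would filter on, so the query matches rows by SPECNAME even when the raw WORKCENTERNAME is mismatched.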
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add EQUIPMENTNAME and REJECTCOMMENT columns to the detail table, list SQL,
and per-LOT base query. Rewrite CSV export to use per-LOT rows with Chinese
headers, BOM UTF-8 encoding, and fetch-based blob download with loading
spinner. Sync trend chart legend filter (reject/defect) to detail table and
export via metric_filter parameter through the full stack. Fix chart sizing
with containLabel and autoresize throttle.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Tree: render separate ECharts series per wafer-lot root instead of
overlapping single-data trees
- Lazy loading: resolve builds tree only; detail/timeline load on node
click to reduce initial resource consumption
- Timeline: group tracks by WORKCENTER_GROUP × LOT ID × Equipment with
multi-line Y-axis labels (LOT ID + equipment ID)
- Timeline: backend enriches history rows with WORKCENTER_GROUP via
filter_cache; timeRange derives only from history bars for dynamic
updates on filter/selection change
- TimelineChart: teleport tooltip to body (fixed positioning) to prevent
clipping; adaptive chart width scaling; edge-aware boundary detection
- Build script: add reject-history HTML copy; feature flag registered
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>