Chunk failures in BatchQueryEngine were silently discarded: `has_partial_failure` was tracked
in Redis but never surfaced in the API response or the frontend, so users could see incomplete
data without any warning. This commit closes the gap end to end:
Backend:
- Track failed chunk time ranges (`failed_ranges`) in batch engine progress metadata
- Add single retry for transient Oracle errors (timeout, connection) in `_execute_single_chunk`
- Read `get_batch_progress()` after merge but before `redis_clear_batch()` cleanup
- Inject `has_partial_failure`, `failed_chunk_count`, `failed_ranges` into API response meta
- Persist partial failure flag to independent Redis key with TTL aligned to data storage layer
- Add shared container-resolution policy module with wildcard/expansion guardrails
- Refactor reason filter from single-value to multi-select (`reason` → `reasons`)
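A minimal sketch of the retry and failed-range tracking described above, not the actual engine code. `BatchProgress`, `execute_chunk_with_retry`, and the `TRANSIENT_ERRORS` tuple are hypothetical names; the built-in `TimeoutError` and `ConnectionError` stand in for the Oracle driver's timeout and connection errors, and the field names follow the commit text.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Assumption: stand-ins for the transient Oracle errors (timeout, connection).
TRANSIENT_ERRORS = (TimeoutError, ConnectionError)

TimeRange = Tuple[str, str]  # (start, end) of one chunk


@dataclass
class BatchProgress:
    """Hypothetical progress metadata; field names follow the commit text."""
    has_partial_failure: bool = False
    failed_ranges: List[TimeRange] = field(default_factory=list)

    @property
    def failed_chunk_count(self) -> int:
        return len(self.failed_ranges)


def execute_chunk_with_retry(
    run_chunk: Callable[[TimeRange], list],
    time_range: TimeRange,
    progress: BatchProgress,
    max_retries: int = 1,
) -> Optional[list]:
    """Run one chunk, retrying transient errors up to max_retries times.

    If every attempt fails, the chunk's time range is recorded in the
    progress metadata and None is returned. Non-transient errors propagate.
    """
    for attempt in range(max_retries + 1):
        try:
            return run_chunk(time_range)
        except TRANSIENT_ERRORS:
            if attempt < max_retries:
                continue  # one more attempt is allowed
            progress.has_partial_failure = True
            progress.failed_ranges.append(time_range)
    return None
```

Under this sketch, the API layer would read `has_partial_failure`, `failed_chunk_count`, and `failed_ranges` from the progress object before the batch state is cleared, then copy them into the response meta.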
Frontend:
- Add client-side date range validation (730-day limit) before API submission
- Display amber warning banner on partial failure with specific failed date ranges
- Support generic fallback message for container-mode queries without date ranges
- Update FilterPanel to support multi-select reason chips
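The 730-day rule above can be sketched as follows. The real check runs in the frontend; it is written in Python here only to illustrate the rule. `validate_date_range` is a hypothetical helper, and treating a span of exactly 730 days as valid is an assumption.

```python
from datetime import date
from typing import Optional

MAX_RANGE_DAYS = 730  # limit stated in the commit message


def validate_date_range(
    start: date, end: date, max_days: int = MAX_RANGE_DAYS
) -> Optional[str]:
    """Return an error message, or None when the range may be submitted."""
    if end < start:
        return "End date must not be earlier than start date."
    # Assumption: a span of exactly max_days is still accepted.
    if (end - start).days > max_days:
        return f"Date range must not exceed {max_days} days."
    return None
```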
Specs & tests:
- Create batch-query-resilience spec; update reject-history-api and reject-history-page specs
- Add 7 new tests for retry, memory guard, failed ranges, partial failure propagation, TTL
- Cross-service regression verified (hold, resource, job, msd; 411 tests pass)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix dimension Pareto datasources: PJ_TYPE/PRODUCTLINENAME from DW_MES_CONTAINER;
  WORKFLOWNAME from DW_MES_LOTWIPHISTORY via WIPTRACKINGGROUPKEYID; EQUIPMENTNAME
  from LOTREJECTHISTORY only (no WIP fallback); workcenter dimension uses WORKCENTER_GROUP
- Add multi-select Pareto click filtering with chip display and detail list integration
- Add TOP 20 display scope selector for TYPE/WORKFLOW/equipment (機台) dimensions
- Pass Pareto selection (dimension + values) through to list/export endpoints
- Enable TRACE_WORKER_ENABLED=true by default in start_server.sh and .env.example
- Archive reject-history-pareto-datasource-fix and reject-history-pareto-ux-enhancements
- Update reject-history-api and reject-history-page specs with new Pareto behaviors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three proposals addressing the 2026-02-25 trace pipeline OOM crash (114K CIDs):
1. trace-events-memory-triage: fetchmany iterator (read_sql_df_slow_iter),
   admission control (50K CID limit for non-MSD), cache skip for large queries,
   early memory release with gc.collect()
2. trace-async-job-queue: RQ-based async jobs for queries >20K CIDs,
   separate worker process with isolated memory, frontend polling via
   useTraceProgress composable, systemd service + deploy scripts
3. trace-streaming-response: chunked Redis storage (TRACE_STREAM_BATCH_SIZE=5000),
   NDJSON stream endpoint (GET /api/trace/job/<id>/stream), frontend
   ReadableStream consumer for progressive rendering, backward-compatible
   with legacy single-key storage
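As a rough illustration of proposal 1, the sketch below pairs an admission check with a `fetchmany`-based batch iterator. The signature of `read_sql_df_slow_iter` is not shown in this message, so `iter_rows_in_batches` and `check_admission` are hypothetical names, and any DB-API 2.0 connection (for example `sqlite3`) stands in for the Oracle connection.

```python
from typing import Any, Iterator, List, Sequence, Tuple

CID_LIMIT = 50_000  # admission limit for non-MSD queries, per the commit text


def check_admission(
    cids: Sequence[str], is_msd: bool, limit: int = CID_LIMIT
) -> None:
    """Reject oversized non-MSD requests before any rows are fetched."""
    if not is_msd and len(cids) > limit:
        raise ValueError(f"{len(cids)} CIDs exceeds the limit of {limit}")


def iter_rows_in_batches(
    conn: Any, sql: str, batch_size: int = 1000
) -> Iterator[List[Tuple]]:
    """Yield result rows in fixed-size batches via DB-API fetchmany().

    Only one batch is held in memory at a time, instead of the whole
    result set that fetchall() would materialize.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            yield batch
    finally:
        cursor.close()
```

A caller would process and discard each batch inside the loop, so peak memory tracks `batch_size` rather than the total row count.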
All three proposals archived. 1101 tests pass, frontend builds clean.
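For proposal 3, the batching and NDJSON framing could look roughly like the sketch below. It is an illustration under assumptions, not the shipped endpoint: `split_into_batches` and `ndjson_lines` are hypothetical helpers, and the Redis storage and HTTP layer are left out.

```python
import json
from typing import Iterable, Iterator, List

TRACE_STREAM_BATCH_SIZE = 5000  # value quoted in the commit message


def split_into_batches(
    records: List[dict], batch_size: int = TRACE_STREAM_BATCH_SIZE
) -> List[List[dict]]:
    """Split records into consecutive batches, one per storage key."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def ndjson_lines(batches: Iterable[List[dict]]) -> Iterator[str]:
    """Yield one JSON document per line (NDJSON).

    A streaming client can parse each line as soon as it arrives,
    which is what allows progressive rendering.
    """
    for batch in batches:
        for record in batch:
            yield json.dumps(record, ensure_ascii=False) + "\n"
```

A web framework's streaming response could consume `ndjson_lines(...)` directly as its body generator.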
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Transform /mid-section-defect from TMTT-only backward analysis into a full-line
bidirectional defect traceability center supporting all detection stations.
Key changes:
- Parameterized station detection: any workcenter group as detection station
- Bidirectional tracing: backward (upstream attribution) + forward (downstream reject rates)
- Dual query mode: date range OR LOT/work order (工單)/WAFER container-based seed resolution
- Multi-select filters for upstream station, equipment model (RESOURCEFAMILYNAME), and loss reasons
- Progressive 3-stage trace pipeline (seed-resolve → lineage → events) with streaming UI
- Equipment model lookup via resource cache instead of SPECNAME
- Session caching, auto-refresh, searchable MultiSelect with fuzzy matching
- Remove legacy tmtt-defect module (fully superseded)
- Archive openspec change artifacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Promote /tables, /excel-query, /query-tool, /mid-section-defect from
deferred to full shell-governed in-scope routes with canonical redirects,
content contracts, governance artifacts, and updated CI gates.
Unify all page header gradients to #667eea → #764ba2 and h1 font-size
to 24px for visual consistency across all dashboard pages. Remove
Native Route-View dev annotations from job-query, excel-query, and
query-tool headers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>