feat(reject-history): fix silent data loss by propagating partial failure metadata to frontend

Chunk failures in BatchQueryEngine were silently discarded: `has_partial_failure` was tracked
in Redis but never surfaced in the API response or the frontend. Users could see incomplete data
without any warning. This commit closes the gap end-to-end:

Backend:
- Track failed chunk time ranges (`failed_ranges`) in batch engine progress metadata
- Add single retry for transient Oracle errors (timeout, connection) in `_execute_single_chunk`
- Read `get_batch_progress()` after merge but before `redis_clear_batch()` cleanup
- Inject `has_partial_failure`, `failed_chunk_count`, `failed_ranges` into API response meta
- Persist partial failure flag to independent Redis key with TTL aligned to data storage layer
- Add shared container-resolution policy module with wildcard/expansion guardrails
- Refactor reason filter from single-value to multi-select (`reason` → `reasons`)
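
A minimal sketch of the retry and failed-range bookkeeping described above, not the actual
BatchQueryEngine code. `TransientQueryError`, `BatchProgress`, `execute_chunk_with_retry` and
`build_meta` are hypothetical names introduced for illustration; only the meta keys
(`has_partial_failure`, `failed_chunk_count`, `failed_ranges`) come from this commit message,
and the shape of each `failed_ranges` entry is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class TransientQueryError(Exception):
    """Hypothetical stand-in for an Oracle timeout or connection error."""


@dataclass
class BatchProgress:
    # Time ranges (start, end) of chunks that still failed after the retry.
    failed_ranges: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def has_partial_failure(self) -> bool:
        return bool(self.failed_ranges)


def execute_chunk_with_retry(
    run_chunk: Callable[[str, str], list],
    start: str,
    end: str,
    progress: BatchProgress,
) -> list:
    """Run one chunk; retry once on a transient error, then record the range."""
    for attempt in range(2):  # first try plus a single retry
        try:
            return run_chunk(start, end)
        except TransientQueryError:
            if attempt == 1:
                # Second failure: remember the range instead of dropping it silently.
                progress.failed_ranges.append((start, end))
    return []


def build_meta(progress: BatchProgress) -> Dict[str, object]:
    """Build the partial-failure fields to inject into the API response meta."""
    return {
        "has_partial_failure": progress.has_partial_failure,
        "failed_chunk_count": len(progress.failed_ranges),
        "failed_ranges": [
            {"start": s, "end": e} for s, e in progress.failed_ranges
        ],
    }
```

In this sketch the meta is built from the in-memory progress object; in the commit the same
information is read via `get_batch_progress()` after the merge and before `redis_clear_batch()`
removes it.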

Frontend:
- Add client-side date range validation (730-day limit) before API submission
- Display amber warning banner on partial failure with specific failed date ranges
- Support generic fallback message for container-mode queries without date ranges
- Update FilterPanel to support multi-select reason chips

Specs & tests:
- Create batch-query-resilience spec; update reject-history-api and reject-history-page specs
- Add 7 new tests for retry, memory guard, failed ranges, partial failure propagation, TTL
- Cross-service regression verified (hold, resource, job, msd; 411 tests pass)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
egg
2026-03-03 14:00:07 +08:00
parent f1506787fb
commit a275c30c0e
35 changed files with 3028 additions and 1460 deletions


@@ -284,6 +284,15 @@ QUERY_TOOL_MAX_CONTAINER_IDS=200
RESOURCE_DETAIL_DEFAULT_LIMIT=500
RESOURCE_DETAIL_MAX_LIMIT=500
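# Shared container-resolution guardrails (LOT/WAFER/work order)
CONTAINER_RESOLVE_INPUT_MAX_VALUES=0 # 0 = no limit on the number of input values
CONTAINER_RESOLVE_PATTERN_MIN_PREFIX_LEN=4 # Minimum prefix length before a wildcard (e.g. GA25%)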
CONTAINER_RESOLVE_MAX_EXPANSION_PER_TOKEN=2000
CONTAINER_RESOLVE_MAX_CONTAINER_IDS=30000
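# EventFetcher batch fault-tolerance policy
EVENT_FETCHER_ALLOW_PARTIAL_RESULTS=false # false = any failed batch fails the whole request, to avoid silently missing data
# Reverse-proxy trust boundary (must stay false when there is no reverse proxy)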
TRUST_PROXY_HEADERS=false
TRUSTED_PROXY_IPS=127.0.0.1