feat(reject-history): add materialized Pareto aggregate layer with feature-flagged rollout
Pre-compute 6-dimension metric cubes from cached LOT-level DataFrames so interactive Pareto requests read compact snapshots instead of re-scanning detail rows on every filter change. Includes a single-flight build guard, TTL/size guardrails, cross-filter exclude-self evaluation, a safe legacy fallback, response metadata exposure, telemetry counters, and a 3-stage rollout plan (telemetry-only → build-enabled → read-through).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Purpose

Define stable requirements for cache-observability-hardening.

## Requirements

### Requirement: Layered Cache SHALL Expose Operational State

The route cache implementation SHALL expose layered cache operational state, including mode, freshness, and degradation status.

#### Scenario: Redis unavailable degradation state

- **WHEN** Redis is unavailable
- **THEN** health endpoints MUST indicate degraded cache mode while keeping L1 memory cache active

## MODIFIED Requirements
### Requirement: Cache Telemetry MUST be Queryable for Operations

The system MUST provide cache telemetry suitable for operations diagnostics, including materialized Pareto cache behavior for reject-history workloads.

#### Scenario: Telemetry inspection

- **WHEN** operators request deep health status
- **THEN** cache-related metrics/state SHALL be present and interpretable for troubleshooting

#### Scenario: Materialized Pareto telemetry visibility

- **WHEN** the materialized Pareto cache is enabled
- **THEN** telemetry SHALL expose at least hit count/rate, miss count/rate, build count, build failure count, and fallback count
- **THEN** telemetry SHALL expose latest snapshot freshness indicators and aggregate payload size indicators

### Requirement: Health Endpoints SHALL Expose Pool Saturation and Degradation Reason Codes

Operational health endpoints MUST report connection pool saturation indicators and explicit degradation reason codes.

#### Scenario: Pool saturation observed

- **WHEN** checked-out connections and overflow approach configured limits
- **THEN** deep health output MUST expose saturation metrics and degraded reason classification

## ADDED Requirements
### Requirement: Degraded Responses MUST Be Correlatable Across API and Health Telemetry

Error responses for degraded states SHALL include stable codes that can be mapped to health telemetry and operational dashboards.

#### Scenario: Degraded API response correlation

- **WHEN** an API request fails due to circuit-open or pool-exhausted conditions
- **THEN** operators MUST be able to match the response code to current health telemetry state

### Requirement: Pareto materialization fallback reasons SHALL be operationally classifiable

Telemetry MUST classify fallback outcomes with stable reason codes so repeated degradations can be monitored and alerted on.

#### Scenario: Snapshot miss fallback reason

- **WHEN** a request falls back because no snapshot exists
- **THEN** telemetry SHALL record a stable reason code for the snapshot miss
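Reason codes stay stable and alertable when they are defined once as an enumeration and counted per code. The following is a minimal sketch, assuming an in-process counter backend; the enum members, code strings, and function names are illustrative, not the project's actual names:

```python
from collections import Counter
from enum import Enum

class ParetoFallbackReason(str, Enum):
    """Stable, alertable reason codes for materialized-Pareto fallbacks."""
    SNAPSHOT_MISS = "snapshot_miss"    # no snapshot exists for the key
    SNAPSHOT_STALE = "snapshot_stale"  # freshness/version check failed
    BUILD_FAILURE = "build_failure"    # materialization build raised

fallback_counts: Counter = Counter()

def record_fallback(reason: ParetoFallbackReason) -> None:
    # Counters keyed by the stable code let dashboards alert on repeated degradations.
    fallback_counts[reason.value] += 1

record_fallback(ParetoFallbackReason.SNAPSHOT_MISS)
record_fallback(ParetoFallbackReason.SNAPSHOT_MISS)
record_fallback(ParetoFallbackReason.BUILD_FAILURE)
```

Because the enum is the single source of truth, renaming a member is a visible, reviewable change rather than a silent drift of ad-hoc strings.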
### Requirement: Operational Alert Thresholds SHALL Be Explicitly Defined

The system MUST define alert thresholds for sustained degraded state, repeated worker recovery, and abnormal retry pressure.

#### Scenario: Sustained degradation threshold exceeded

- **WHEN** degraded status persists beyond configured duration
- **THEN** the monitoring contract MUST classify the service as alert-worthy with actionable context

### Requirement: Cache Telemetry SHALL Include Memory Amplification Signals

Operational telemetry MUST expose cache-domain memory usage indicators and representation amplification factors, and MUST differentiate between authoritative data payload and derived/index helper structures.

#### Scenario: Deep health telemetry request after representation normalization

- **WHEN** operators inspect cache telemetry for resource or WIP domains
- **THEN** telemetry MUST include per-domain memory footprint, amplification indicators, and enough structure detail to verify that full-record duplication is not reintroduced

### Requirement: Efficiency Benchmarks SHALL Gate Cache Refactor Rollout

Cache/query efficiency changes MUST be validated against baseline latency and memory benchmarks before rollout.

#### Scenario: Pre-release validation

- **WHEN** cache refactor changes are prepared for deployment
- **THEN** benchmark results MUST demonstrate no regression beyond configured thresholds for P95 latency and memory usage
### Requirement: Process-Level Cache SHALL Use Bounded Capacity with Deterministic Eviction

Process-level parsed-data caches MUST enforce a configurable maximum key capacity and use deterministic eviction behavior when capacity is exceeded.

#### Scenario: Cache capacity reached

- **WHEN** a new cache entry is inserted and key capacity is at limit
- **THEN** cache MUST evict entries according to defined policy before storing the new key

#### Scenario: Repeated access updates recency

- **WHEN** an existing cache key is read or overwritten
- **THEN** eviction order MUST reflect recency semantics so hot keys are retained preferentially
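The bounded-capacity and recency behavior above can be sketched with a small LRU wrapper over `collections.OrderedDict`. This is an illustrative sketch, not the project's actual cache class; the class and parameter names are assumptions:

```python
from collections import OrderedDict

class BoundedLRUCache:
    """Process-level cache with configurable key capacity and deterministic LRU eviction."""

    def __init__(self, max_keys: int):
        self.max_keys = max_keys
        self._data: "OrderedDict[str, object]" = OrderedDict()

    def get(self, key: str):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # reads refresh recency so hot keys survive
        return self._data[key]

    def put(self, key: str, value: object) -> None:
        if key in self._data:
            self._data.move_to_end(key)  # overwrite also refreshes recency
        self._data[key] = value
        while len(self._data) > self.max_keys:
            self._data.popitem(last=False)  # deterministic: evict least-recent first

cache = BoundedLRUCache(max_keys=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes most recent
cache.put("c", 3)  # capacity exceeded: "b" (least recent) is evicted, not "a"
```

The eviction order is fully determined by access history, which satisfies both scenarios: capacity overflow always removes the least-recently-used key, and reads/overwrites move keys to the back of the eviction queue.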
### Requirement: Cache Publish MUST Preserve Previous Readable Snapshot on Failure

When refreshing full-table cache payloads, the system MUST avoid exposing partially published states to readers.

#### Scenario: Publish fails after payload serialization

- **WHEN** a cache refresh has prepared new payload but publish operation fails
- **THEN** previously published cache keys MUST remain readable and metadata MUST remain consistent with old snapshot

#### Scenario: Publish succeeds

- **WHEN** publish operation completes successfully
- **THEN** data payload and metadata keys MUST be visible as one coherent new snapshot
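One common way to satisfy both scenarios is to stage the new payload under version-scoped keys and flip a single "current" pointer last, so readers see either the complete old snapshot or the complete new one. A minimal sketch, assuming a dict stands in for the shared store; the key layout and method names are assumptions, not the project's actual implementation:

```python
class SnapshotStore:
    """Publish payload + metadata as one coherent snapshot via a version pointer."""

    def __init__(self):
        self._kv = {}  # stand-in for Redis / shared cache

    def publish(self, version: str, payload: bytes, meta: dict) -> None:
        # Stage under version-scoped keys first; readers never address these directly.
        self._kv[f"snapshot:{version}:data"] = payload
        self._kv[f"snapshot:{version}:meta"] = meta
        # Flip the pointer last: any failure before this line leaves the
        # previously published snapshot fully readable and internally consistent.
        self._kv["snapshot:current"] = version

    def read(self):
        version = self._kv.get("snapshot:current")
        if version is None:
            return None
        return (self._kv[f"snapshot:{version}:data"],
                self._kv[f"snapshot:{version}:meta"])

store = SnapshotStore()
store.publish("v1", b"old-rows", {"rows": 1})
try:
    # Simulate a refresh whose publish step dies after serialization:
    store._kv["snapshot:v2:data"] = b"new-rows"
    raise RuntimeError("publish interrupted before pointer flip")
except RuntimeError:
    pass
```

Because the pointer flip is a single-key write, readers never observe a mix of new data with old metadata or vice versa.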
### Requirement: Process-Level Cache Slow Path SHALL Minimize Lock Hold Time

Large payload parsing MUST NOT happen inside long-held process cache locks.

#### Scenario: Cache miss under concurrent requests

- **WHEN** multiple requests hit process cache miss
- **THEN** parsing work SHALL happen outside lock-protected mutation section, and lock scope SHALL be limited to consistency check + commit
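The slow-path shape above can be sketched as two short lock holds around an unlocked parse. This is an illustrative sketch with assumed names; the commit message's single-flight build guard would be a further refinement (only one thread parses while others wait), which this minimal version omits:

```python
import threading

_cache: dict = {}
_lock = threading.Lock()

def expensive_parse(raw: bytes) -> list:
    # Stand-in for large payload parsing (e.g., JSON -> DataFrame in the service).
    return raw.decode().split(",")

def get_parsed(key: str, raw_payload: bytes):
    """Slow path keeps the lock out of the expensive parse step."""
    with _lock:                 # short hold: lookup only
        hit = _cache.get(key)
    if hit is not None:
        return hit

    parsed = expensive_parse(raw_payload)  # outside any lock

    with _lock:                 # short hold: consistency check + commit
        winner = _cache.get(key)
        if winner is None:      # another thread may have committed first
            _cache[key] = parsed
            winner = winner if winner is not None else parsed
    return winner
```

Duplicate parses can still happen under a race, but the double-check at commit time guarantees all callers observe one consistent cached value, and no request ever blocks behind another request's parse.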
### Requirement: Process-Level Cache Policies MUST Stay Consistent Across Services

All service-local process caches MUST support bounded capacity with deterministic eviction.

#### Scenario: Realtime equipment cache growth

- **WHEN** realtime equipment process cache reaches configured capacity
- **THEN** entries MUST be evicted according to deterministic LRU behavior

#### Scenario: Snapshot stale fallback reason

- **WHEN** a request falls back because the snapshot fails freshness/version checks
- **THEN** telemetry SHALL record a stable reason code for the stale/incompatible snapshot

#### Scenario: Build failure fallback reason

- **WHEN** a request falls back because the materialization build failed
- **THEN** telemetry SHALL record a stable reason code for the build failure
# reject-history-api Specification

## Purpose

TBD - created by archiving change reject-history-query-page. Update Purpose after archive.

## Requirements
### Requirement: Reject History API SHALL validate required query parameters

The API SHALL validate date parameters and basic paging bounds before executing database work.

#### Scenario: Missing required dates

- **WHEN** a reject-history endpoint requiring a date range is called without `start_date` or `end_date`
- **THEN** the API SHALL return HTTP 400 with a descriptive validation error

#### Scenario: Invalid date order

- **WHEN** `end_date` is earlier than `start_date`
- **THEN** the API SHALL return HTTP 400 and SHALL NOT run SQL queries

#### Scenario: Date range exceeds maximum

- **WHEN** the date range between `start_date` and `end_date` exceeds 730 days
- **THEN** the API SHALL return HTTP 400 with error message "日期範圍不可超過 730 天" ("date range must not exceed 730 days")
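The three validation scenarios fit in one guard that runs before any database work. A minimal sketch, assuming ISO-format date strings and pre-handled parse failures; the function name and English messages are illustrative (only the 730-day message text is specified by this spec):

```python
from datetime import date, timedelta

MAX_RANGE_DAYS = 730

def validate_date_range(start_date, end_date):
    """Return (ok, error_message) before any SQL is executed."""
    if not start_date or not end_date:
        return False, "start_date and end_date are required"          # -> HTTP 400
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if end < start:
        return False, "end_date must not be earlier than start_date"  # -> HTTP 400
    if (end - start) > timedelta(days=MAX_RANGE_DAYS):
        return False, "日期範圍不可超過 730 天"                         # -> HTTP 400
    return True, None
```

Running this check first satisfies the "SHALL NOT run SQL queries" clause: invalid requests are rejected before a connection is ever borrowed.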
### Requirement: Reject History API primary query response SHALL include partial failure metadata

The primary query endpoint SHALL include batch execution completeness information in the response `meta` field when chunks fail during batch query execution.

#### Scenario: Partial failure metadata in response

- **WHEN** `POST /api/reject-history/query` completes with some chunks failing
- **THEN** the response SHALL include `meta.has_partial_failure: true`
- **THEN** the response SHALL include `meta.failed_chunk_count` as a positive integer
- **THEN** the response SHALL include `meta.failed_ranges` as an array of `{start, end}` date strings (if available)
- **THEN** the HTTP status SHALL still be 200 (data is partially available)

#### Scenario: No partial failure metadata on full success

- **WHEN** `POST /api/reject-history/query` completes with all chunks succeeding
- **THEN** the response `meta` SHALL NOT include `has_partial_failure`, `failed_chunk_count`, or `failed_ranges`

#### Scenario: Partial failure metadata preserved on cache hit

- **WHEN** `POST /api/reject-history/query` returns cached data that originally had partial failures
- **THEN** the response SHALL include the same `meta.has_partial_failure`, `meta.failed_chunk_count`, and `meta.failed_ranges` as the original response
### Requirement: Reject History API SHALL provide summary metrics endpoint

The API SHALL provide aggregated summary metrics for the selected filter context.

#### Scenario: Summary response payload

- **WHEN** `GET /api/reject-history/summary` is called with valid filters
- **THEN** response SHALL be `{ success: true, data: { ... } }`
- **THEN** data SHALL include `MOVEIN_QTY`, `REJECT_TOTAL_QTY`, `DEFECT_QTY`, `REJECT_RATE_PCT`, `DEFECT_RATE_PCT`, `REJECT_SHARE_PCT`, `AFFECTED_LOT_COUNT`, and `AFFECTED_WORKORDER_COUNT`
### Requirement: Reject History API SHALL support yield-exclusion policy toggle

The API SHALL support excluding or including policy-marked scrap reasons through a shared query parameter.

#### Scenario: Default policy mode

- **WHEN** reject-history endpoints are called without `include_excluded_scrap`
- **THEN** `include_excluded_scrap` SHALL default to `false`
- **THEN** rows mapped to `ERP_PJ_WIP_SCRAP_REASONS_EXCLUDE.ENABLE_FLAG='Y'` SHALL be excluded from yield-related calculations

#### Scenario: Explicitly include policy-marked scrap

- **WHEN** `include_excluded_scrap=true` is provided
- **THEN** policy-marked rows SHALL be included in summary/trend/pareto/list/export calculations
- **THEN** API response `meta` SHALL include the effective `include_excluded_scrap` value

#### Scenario: Invalid toggle value

- **WHEN** `include_excluded_scrap` is not parseable as boolean
- **THEN** the API SHALL return HTTP 400 with a descriptive validation error
### Requirement: Reject History API SHALL provide trend endpoint

The API SHALL return time-series trend data for quantity and rate metrics.

#### Scenario: Trend response structure

- **WHEN** `GET /api/reject-history/trend` is called
- **THEN** response SHALL be `{ success: true, data: { items: [...] } }`
- **THEN** each trend item SHALL contain bucket date, `REJECT_TOTAL_QTY`, `DEFECT_QTY`, `REJECT_RATE_PCT`, and `DEFECT_RATE_PCT`

#### Scenario: Trend granularity

- **WHEN** `granularity` is provided as `day`, `week`, or `month`
- **THEN** the API SHALL aggregate by the requested granularity
- **THEN** invalid granularity SHALL return HTTP 400
### Requirement: Reject History API SHALL provide reason Pareto endpoint

The API SHALL return sorted reason distribution data with cumulative percentages. The endpoint supports dimension selection via the `dimension` parameter for single-dimension queries.

#### Scenario: Pareto response payload

- **WHEN** `GET /api/reject-history/reason-pareto` is called
- **THEN** each item SHALL include `reason`, `category`, selected metric value, `pct`, and `cumPct`
- **THEN** items SHALL be sorted descending by selected metric

#### Scenario: Metric mode validation

- **WHEN** `metric_mode` is provided
- **THEN** accepted values SHALL be `reject_total` or `defect`
- **THEN** invalid `metric_mode` SHALL return HTTP 400

#### Scenario: Dimension selection

- **WHEN** `dimension` parameter is provided with a valid value (`reason`, `package`, `type`, `workflow`, `workcenter`, `equipment`)
- **THEN** the endpoint SHALL return Pareto data for that dimension
- **WHEN** `dimension` is not provided
- **THEN** the endpoint SHALL default to `reason`
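The `pct`/`cumPct` contract above is a standard Pareto computation: sort descending by the selected metric, then accumulate shares. A pure-Python sketch (the production path operates on cached pandas DataFrames; the helper name is an assumption, while the item field names follow this spec's schema):

```python
def pareto_items(rows, metric):
    """Sort rows descending by `metric` and attach pct / cumPct percentages."""
    ordered = sorted(rows, key=lambda r: r[metric], reverse=True)
    total = sum(r[metric] for r in ordered) or 1  # guard against division by zero
    cum = 0.0
    items = []
    for r in ordered:
        cum += r[metric]
        items.append({**r,
                      "pct": round(100.0 * r[metric] / total, 2),
                      "cumPct": round(100.0 * cum / total, 2)})
    return items

items = pareto_items(
    [{"reason": "scratch", "metric_value": 30},
     {"reason": "crack", "metric_value": 60},
     {"reason": "stain", "metric_value": 10}],
    metric="metric_value",
)
```

With these inputs, `crack` leads with 60% of the total, `scratch` brings the cumulative share to 90%, and the final item always closes at a `cumPct` of 100.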
## MODIFIED Requirements

### Requirement: Reject History API SHALL provide batch Pareto endpoint with cross-filter

The API SHALL provide a batch Pareto endpoint that returns all 6 dimension Pareto results in a single response, supporting cross-dimension filtering with exclude-self logic, and SHALL prefer materialized Pareto snapshots over full detail regrouping.

#### Scenario: Batch Pareto response structure

- **WHEN** `GET /api/reject-history/batch-pareto` is called with a valid `query_id`
- **THEN** response SHALL be `{ success: true, data: { dimensions: { reason: {...}, package: {...}, type: {...}, workflow: {...}, workcenter: {...}, equipment: {...} } } }`
- **THEN** each dimension object SHALL include an `items` array with schema (`reason`, `metric_value`, `pct`, `cumPct`, `MOVEIN_QTY`, `REJECT_TOTAL_QTY`, `DEFECT_QTY`, `count`)

#### Scenario: Cross-filter exclude-self logic

- **WHEN** `sel_reason=A&sel_type=X` is provided
- **THEN** the reason Pareto SHALL apply the `type=X` filter but not the `reason=A` self-filter
- **THEN** the type Pareto SHALL apply the `reason=A` filter but not the `type=X` self-filter
#### Scenario: Empty selections return unfiltered Paretos

- **WHEN** batch-pareto is called with no `sel_*` parameters
- **THEN** all 6 dimensions SHALL return their full Pareto distribution (subject to `pareto_scope`)

#### Scenario: Cache-only computation

- **WHEN** `query_id` does not exist in cache
- **THEN** the endpoint SHALL return HTTP 400 with an error message indicating a cache miss
- **THEN** the endpoint SHALL NOT fall back to an Oracle query
#### Scenario: Materialized snapshot preferred

- **WHEN** a valid and fresh materialized Pareto snapshot exists for the request context
- **THEN** the endpoint SHALL return results from that snapshot
- **THEN** the endpoint SHALL avoid full lot-level DataFrame regrouping for the same request

#### Scenario: Materialized miss fallback behavior

- **WHEN** materialized snapshot is unavailable, stale, or build fails
- **THEN** the endpoint SHALL fall back to legacy cache DataFrame computation
- **THEN** the response schema and filter semantics SHALL remain unchanged
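The preferred/fallback pair above is a read-through pattern: try the snapshot, classify any failure with a stable reason code, and recompute via the legacy path while keeping the response shape identical. A minimal sketch; the function signature, store shape, and meta field names are assumptions for illustration, not the service's actual API:

```python
def batch_pareto(snapshot_store, key, legacy_compute, is_fresh):
    """Serve from a materialized snapshot when possible, else fall back to legacy."""
    snap = snapshot_store.get(key)
    if snap is None:
        reason = "snapshot_miss"
    elif not is_fresh(snap):
        reason = "snapshot_stale"
    else:
        return {"dimensions": snap["dimensions"],
                "meta": {"source": "materialized", "version": snap["version"]}}
    # Legacy fallback: identical schema, plus a stable reason code in meta.
    return {"dimensions": legacy_compute(),
            "meta": {"source": "legacy_fallback", "fallback_reason": reason}}

store = {"k1": {"version": "v3", "dimensions": {"reason": []}}}
hit = batch_pareto(store, "k1", legacy_compute=lambda: {}, is_fresh=lambda s: True)
miss = batch_pareto(store, "k2", legacy_compute=lambda: {}, is_fresh=lambda s: True)
```

Keeping the fallback's return shape identical to the hit path is what lets clients ignore the `meta` source entirely while operators still get the reason code.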
#### Scenario: Supplementary and policy filters apply

- **WHEN** batch-pareto is called with supplementary filters (packages, workcenter_groups, reason) and policy toggles
- **THEN** all 6 dimension Paretos SHALL be computed after applying policy and supplementary filters first (before cross-filter)

#### Scenario: Data source is cached DataFrame only

- **WHEN** batch-pareto computes dimension Paretos
- **THEN** computation SHALL operate on the in-memory cached Pandas DataFrame (populated by the primary query)
- **THEN** the endpoint SHALL NOT issue any additional Oracle database queries
- **THEN** response time SHALL be sub-100ms since all computation is in-memory

#### Scenario: Display scope (TOP20) support

- **WHEN** `pareto_display_scope=top20` is provided
- **THEN** applicable dimensions (type, workflow, equipment) SHALL truncate results to the top 20 items after sorting
- **WHEN** `pareto_display_scope` is omitted or `all`
- **THEN** all items SHALL be returned (subject to the `pareto_scope` filter)
### Requirement: Reject History API SHALL support multi-dimension Pareto selection in view and export

The detail view and export endpoints SHALL accept multiple dimension selections simultaneously and apply them with AND logic.

#### Scenario: Multi-dimension filter on view endpoint

- **WHEN** `GET /api/reject-history/view` is called with `sel_reason=A&sel_type=X`
- **THEN** returned rows SHALL match reason=A AND type=X (both filters applied simultaneously)

#### Scenario: Multi-dimension filter on export endpoint

- **WHEN** `GET /api/reject-history/export-cached` is called with `sel_reason=A&sel_type=X`
- **THEN** the exported CSV SHALL contain only rows matching reason=A AND type=X

#### Scenario: Backward compatibility with single-dimension params

- **WHEN** `pareto_dimension` and `pareto_values` are provided (legacy format)
- **THEN** the API SHALL still accept and apply them as before
- **WHEN** both `sel_*` params and legacy params are provided
- **THEN** `sel_*` params SHALL take precedence

### Requirement: Reject History API SHALL provide paginated detail endpoint

The API SHALL return paginated detail rows for the selected filter context.

#### Scenario: List response payload

- **WHEN** `GET /api/reject-history/list?page=1&per_page=50` is called
- **THEN** response SHALL include `{ items: [...], pagination: { page, perPage, total, totalPages } }`
- **THEN** each row SHALL include date, process dimensions, reason fields, `MOVEIN_QTY`, `REJECT_TOTAL_QTY`, `DEFECT_QTY`, and reject component columns

#### Scenario: Paging bounds

- **WHEN** `page < 1`
- **THEN** page SHALL be treated as 1
- **WHEN** `per_page > 200`
- **THEN** `per_page` SHALL be capped at 200

### Requirement: Reject History API SHALL provide CSV export endpoint

The API SHALL provide CSV export using the same filter and metric semantics as the list/query APIs.

#### Scenario: Export payload consistency

- **WHEN** `GET /api/reject-history/export` is called with valid filters
- **THEN** CSV headers SHALL include both `REJECT_TOTAL_QTY` and `DEFECT_QTY`
- **THEN** export rows SHALL follow the same semantic definitions as summary/list endpoints

#### Scenario: Cached export supports full detail-filter parity

- **WHEN** `GET /api/reject-history/export-cached` is called with an existing `query_id`
- **THEN** the endpoint SHALL apply primary policy toggles, supplementary filters, trend-date filters, metric filter, and Pareto-selected item filters
- **THEN** returned rows SHALL match the same filtered detail dataset semantics used by `GET /api/reject-history/view`

#### Scenario: CSV encoding and escaping are stable

- **WHEN** either export endpoint returns CSV
- **THEN** response charset SHALL be `utf-8-sig`
- **THEN** values containing commas, quotes, or newlines SHALL be CSV-escaped correctly

### Requirement: Reject History API SHALL centralize SQL in reject_history SQL directory

The service SHALL load SQL from dedicated files under `src/mes_dashboard/sql/reject_history/`.

#### Scenario: SQL file loading

- **WHEN** the reject-history service executes queries
- **THEN** SQL SHALL be loaded from files in `sql/reject_history`
- **THEN** user-supplied filters SHALL be passed through bind parameters
- **THEN** user input SHALL NOT be interpolated into SQL strings directly

### Requirement: Reject History API SHALL use cached exclusion-policy source

The API SHALL read exclusion-policy reasons from cached `ERP_PJ_WIP_SCRAP_REASONS_EXCLUDE` data instead of querying Oracle on every request.

#### Scenario: Enabled exclusions only

- **WHEN** exclusion-policy data is loaded
- **THEN** only rows with `ENABLE_FLAG='Y'` SHALL be treated as active exclusions

#### Scenario: Daily full-table cache refresh

- **WHEN** the exclusion cache is initialized
- **THEN** the full table SHALL be loaded and refreshed at least once per 24 hours
- **THEN** Redis SHOULD be used as shared cache when available, with in-memory fallback when unavailable

### Requirement: Reject History API SHALL apply rate limiting on expensive endpoints

The API SHALL rate-limit high-cost endpoints to protect Oracle and application resources.

#### Scenario: List and export rate limiting

- **WHEN** `/api/reject-history/list` or `/api/reject-history/export` receives excessive requests
- **THEN** configured rate limiting SHALL reject requests beyond the threshold within the time window

### Requirement: Database query execution path

The reject-history service (`reject_history_service.py` and `reject_dataset_cache.py`) SHALL use `read_sql_df_slow` (dedicated connection) instead of `read_sql_df` (pooled connection) for all Oracle queries.

#### Scenario: Primary query uses dedicated connection

- **WHEN** the reject-history primary query is executed
- **THEN** it uses `read_sql_df_slow`, which creates a dedicated Oracle connection outside the pool
- **AND** the connection has a 300-second call_timeout (configurable)
- **AND** the connection is subject to the global slow query semaphore

## ADDED Requirements

### Requirement: Reject History API SHALL expose materialized Pareto freshness metadata

The API SHALL surface stable metadata so operators and clients can identify whether Pareto responses came from materialized snapshots or fallback paths.

#### Scenario: Materialized hit metadata

- **WHEN** a batch pareto response is served from a materialized snapshot
- **THEN** response metadata SHALL indicate the materialized source and snapshot freshness/version identifiers

#### Scenario: Fallback metadata

- **WHEN** a response uses legacy fallback due to snapshot miss/stale/build failure
- **THEN** response metadata SHALL include a stable fallback reason code
## ADDED Requirements

### Requirement: Reject History Pareto materialization SHALL build reusable aggregate snapshots

The system SHALL build reusable Pareto aggregate snapshots from cached reject-history query datasets so interactive Pareto requests do not require full lot-level regrouping on every call.

#### Scenario: Build snapshot from cached dataset

- **WHEN** a valid `query_id` has a cached reject-history dataset and Pareto data is requested
- **THEN** the system SHALL build a materialized snapshot containing the six supported Pareto dimensions (`reason`, `package`, `type`, `workflow`, `workcenter`, `equipment`)
- **THEN** the snapshot SHALL include quantities needed to compute `metric_value`, `pct`, `cumPct`, and affected count fields

#### Scenario: Build skipped for missing dataset cache

- **WHEN** the referenced `query_id` dataset is missing or expired
- **THEN** snapshot build SHALL NOT proceed
- **THEN** the caller SHALL receive a deterministic cache-miss outcome
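Building a snapshot amounts to pre-aggregating the lot-level rows once per dimension, so later requests re-rank small aggregate tables instead of regrouping detail rows. A pure-Python sketch of that pre-aggregation (the service does this over cached pandas DataFrames; the dimension and quantity column names follow this spec, the function name is an assumption):

```python
from collections import defaultdict

DIMENSIONS = ("reason", "package", "type", "workflow", "workcenter", "equipment")

def build_snapshot(rows):
    """Pre-aggregate lot-level rows into per-dimension quantity totals."""
    snapshot = {}
    for dim in DIMENSIONS:
        agg = defaultdict(lambda: {"REJECT_TOTAL_QTY": 0, "DEFECT_QTY": 0, "count": 0})
        for row in rows:
            bucket = agg[row[dim]]          # one bucket per distinct dimension value
            bucket["REJECT_TOTAL_QTY"] += row["REJECT_TOTAL_QTY"]
            bucket["DEFECT_QTY"] += row["DEFECT_QTY"]
            bucket["count"] += 1
        snapshot[dim] = dict(agg)
    return snapshot

rows = [
    {"reason": "crack", "package": "P1", "type": "T1", "workflow": "W1",
     "workcenter": "C1", "equipment": "E1", "REJECT_TOTAL_QTY": 5, "DEFECT_QTY": 2},
    {"reason": "crack", "package": "P2", "type": "T1", "workflow": "W1",
     "workcenter": "C1", "equipment": "E2", "REJECT_TOTAL_QTY": 3, "DEFECT_QTY": 1},
]
snap = build_snapshot(rows)
```

The snapshot's size scales with the number of distinct dimension values, not the number of lots, which is what makes it cheap to store and re-rank.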
### Requirement: Materialized snapshot keys SHALL encode filter identity and schema version

The system SHALL key materialized Pareto snapshots by canonical filter identity and schema version to prevent cross-context reuse.

#### Scenario: Distinct supplementary filters generate distinct snapshots

- **WHEN** two requests share the same `query_id` but differ in supplementary filters or policy toggles
- **THEN** they SHALL resolve to different materialized snapshot keys

#### Scenario: Schema version invalidates prior snapshots

- **WHEN** materialization schema version is incremented
- **THEN** snapshots produced by prior versions SHALL NOT be treated as valid hits
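Both scenarios fall out of a key that embeds the schema version and a digest of a canonicalized filter set. A minimal sketch; the key format, version constant, and digest length are illustrative assumptions:

```python
import hashlib
import json

SCHEMA_VERSION = 3  # bumping this constant invalidates all prior snapshots

def snapshot_key(query_id, filters):
    """Canonical key: schema version + query_id + digest of sorted, normalized filters."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"pareto:v{SCHEMA_VERSION}:{query_id}:{digest}"

k1 = snapshot_key("q-123", {"packages": ["P1"], "include_excluded_scrap": False})
k2 = snapshot_key("q-123", {"include_excluded_scrap": False, "packages": ["P1"]})
k3 = snapshot_key("q-123", {"packages": ["P1"], "include_excluded_scrap": True})
```

`sort_keys=True` makes the key insensitive to parameter order (so `k1 == k2`), while any change to a policy toggle or supplementary filter changes the digest (`k3` differs), and a version bump changes the prefix so old snapshots can never match.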
### Requirement: Materialized snapshots SHALL preserve cross-filter semantics

Materialized read paths SHALL produce the same cross-filter behavior as legacy DataFrame-based Pareto computation.

#### Scenario: Exclude-self behavior parity

- **WHEN** `sel_reason=A` and `sel_type=X` are active
- **THEN** reason Pareto SHALL be computed with `type=X` applied but without the `reason=A` self-filter
- **THEN** type Pareto SHALL be computed with `reason=A` applied but without the `type=X` self-filter
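The exclude-self rule can be stated in one line: when ranking dimension D, apply every selection except the one on D itself. A pure-Python sketch (the production path does this with DataFrame masks; the function name is an assumption):

```python
def cross_filter_rows(rows, selections, target_dim):
    """Apply every selected dimension EXCEPT the one being ranked (exclude-self)."""
    active = {dim: val for dim, val in selections.items() if dim != target_dim}
    return [r for r in rows if all(r[dim] == val for dim, val in active.items())]

rows = [
    {"reason": "A", "type": "X"},
    {"reason": "A", "type": "Y"},
    {"reason": "B", "type": "X"},
]
selections = {"reason": "A", "type": "X"}

# Reason Pareto: type=X applied, reason=A self-filter skipped -> reasons A and B survive.
reason_rows = cross_filter_rows(rows, selections, target_dim="reason")
# Type Pareto: reason=A applied, type=X self-filter skipped -> types X and Y survive.
type_rows = cross_filter_rows(rows, selections, target_dim="type")
```

Skipping the self-filter is what keeps the selected item's competitors visible in its own chart, so users can see what they would get by switching the selection.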
### Requirement: Materialized snapshots SHALL enforce bounded lifecycle and capacity

Materialized Pareto cache storage SHALL be bounded by TTL and size guardrails to avoid unbounded memory growth.

#### Scenario: Snapshot expiry follows configured retention

- **WHEN** a materialized snapshot exceeds configured TTL
- **THEN** it SHALL be treated as expired and SHALL NOT be returned as a cache hit

#### Scenario: Oversized snapshot handling

- **WHEN** a snapshot build exceeds the configured snapshot size guardrail
- **THEN** the snapshot SHALL be rejected or degraded according to policy
- **THEN** the system SHALL record the rejection/degradation reason for operations telemetry
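Both guardrails reduce to two small checks at read and admit time. A minimal sketch, assuming timestamped snapshots and an in-process telemetry list; the constants, field names, and function names are illustrative, with real values coming from configuration:

```python
import time

TTL_SECONDS = 900             # illustrative retention; actual value is configuration
MAX_SNAPSHOT_BYTES = 512_000  # illustrative size guardrail; actual value is configuration

def is_fresh(snapshot, now=None):
    """Expired snapshots must never be returned as cache hits."""
    now = time.time() if now is None else now
    return (now - snapshot["built_at"]) <= TTL_SECONDS

def admit(payload, telemetry):
    """Reject oversized snapshot payloads and record the reason for operations."""
    if len(payload) > MAX_SNAPSHOT_BYTES:
        telemetry.append({"event": "snapshot_rejected", "reason": "size_guardrail"})
        return False
    return True

events = []
fresh = is_fresh({"built_at": 1_000.0}, now=1_500.0)  # 500s old -> still fresh
stale = is_fresh({"built_at": 1_000.0}, now=2_000.0)  # 1000s old -> expired
ok = admit(b"x" * 100, events)
too_big = admit(b"x" * 600_000, events)
```

Rejecting at admit time (rather than evicting later) keeps the memory bound hard: an oversized build never enters the store, and the recorded reason feeds the fallback-classification telemetry required elsewhere in this change.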