Trace pipeline pool isolation:
- Switch event_fetcher and lineage_engine to read_sql_df_slow (non-pooled)
- Reduce EVENT_FETCHER_MAX_WORKERS 4→2, TRACE_EVENTS_MAX_WORKERS 4→2
- Add 60s timeout per batch query, cache skip for CID>10K
- Early del raw_domain_results + gc.collect() for large queries
- Increase DB_SLOW_MAX_CONCURRENT: base 3→5, dev 2→3, prod 3→5

Test fixes (51 pre-existing failures → 0):
- reject_history: WORKFLOW CSV header, strict bool validation, pareto mock path
- portal shell: remove non-existent /tmtt-defect route from tests
- conftest: add --run-stress option to skip stress/load tests by default
- migration tests: skipif baseline directory missing
- performance test: update Vite asset assertion
- wip hold: add firstname/waferdesc mock params
- template integration: add /reject-history canonical route

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# event-fetcher-unified Specification
## Purpose
TBD - created by archiving change unified-lineage-engine. Update Purpose after archive.
## Requirements
### Requirement: EventFetcher SHALL provide unified cached event querying across domains
`EventFetcher` SHALL encapsulate batch event queries behind an L1/L2 layered cache and a per-domain rate limit bucket configuration, supporting the domains `history`, `materials`, `rejects`, `holds`, `jobs`, `upstream_history`, and `downstream_rejects`.
#### Scenario: Cache miss for event domain query
- **WHEN** `EventFetcher` is called for a domain with container IDs and no cache exists
- **THEN** the domain query SHALL execute against Oracle via `read_sql_df_slow()` (non-pooled dedicated connection)
- **THEN** each batch query SHALL use `timeout_seconds=60`
- **THEN** the result SHALL be stored in the L2 Redis cache with key format `evt:{domain}:{sorted_cids_hash}` if the CID count is within the cache threshold
- **THEN** the L1 memory cache SHALL also be populated if the CID count is within the cache threshold
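
The key format `evt:{domain}:{sorted_cids_hash}` implies that the same set of container IDs maps to the same key regardless of input order. A minimal sketch of one way to build such a key follows; the function name `event_cache_key`, the normalization rules (strip, drop empties, deduplicate), and the choice of a truncated SHA-256 digest are assumptions for illustration, not details taken from this spec.

```python
import hashlib
from typing import Iterable


def event_cache_key(domain: str, container_ids: Iterable[str]) -> str:
    """Build an order-insensitive key shaped like evt:{domain}:{sorted_cids_hash}."""
    # Assumed normalization: stringify, strip whitespace, drop empties, dedupe.
    normalized = sorted(
        {str(cid).strip() for cid in container_ids if str(cid).strip()}
    )
    # Assumed hash: SHA-256 over the sorted IDs, truncated to keep keys short.
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()[:16]
    return f"evt:{domain}:{digest}"
```

Sorting before hashing is what makes `["B", "A"]` and `["A", "B"]` share one cache entry; the digest length is a trade-off between key size and collision risk.
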
#### Scenario: Cache hit for event domain query
- **WHEN** `EventFetcher` is called for a domain and L2 Redis cache contains a valid entry
- **THEN** the cached result SHALL be returned without executing an Oracle query
- **THEN** the DB connection pool SHALL NOT be consumed
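
The hit and miss scenarios together describe a read-through lookup: check the caches first and only run the query on a miss. The sketch below illustrates that flow with plain dictionaries standing in for the L1 memory cache and the L2 Redis cache. The function name `get_events`, the L1-before-L2 lookup order, and the promotion of an L2 hit into L1 are assumptions, not behavior confirmed by this spec.

```python
from typing import Any, Callable, Dict


def get_events(
    key: str,
    l1: Dict[str, Any],
    l2: Dict[str, Any],
    run_query: Callable[[], Any],
) -> Any:
    """Return cached events for `key`, running `run_query` only on a full miss."""
    if key in l1:
        return l1[key]
    if key in l2:
        # Assumed behavior: promote the L2 hit into L1 for later calls.
        l1[key] = l2[key]
        return l1[key]
    # Full miss: this is where the Oracle query would run in the real service.
    result = run_query()
    l2[key] = result
    l1[key] = result
    return result
```

Because `run_query` is only invoked on the last branch, a cache hit never touches the database, which is the property the cache-hit scenario requires.
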
#### Scenario: Rate limit bucket per domain
- **WHEN** `EventFetcher` is used from a route handler
- **THEN** each domain SHALL have a configurable rate limit bucket aligned with the `configured_rate_limit()` pattern
- **THEN** rate limit configuration SHALL be overridable via environment variables
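
As an illustration of per-domain buckets with environment overrides, the sketch below resolves one bucket per domain. Everything named here is hypothetical: the `RateLimitBucket` type, the `domain_rate_limit` helper, the environment variable names (`EVT_<DOMAIN>_RATE_MAX_REQUESTS`, `EVT_<DOMAIN>_RATE_WINDOW_SECONDS`), and the default values. The real `configured_rate_limit()` signature is not shown in this spec and may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitBucket:
    """Hypothetical container for one domain's rate limit settings."""
    bucket: str
    max_requests: int
    window_seconds: int


def domain_rate_limit(
    domain: str,
    env: Optional[Mapping[str, str]] = None,
    default_max_requests: int = 10,   # assumed default, not from the spec
    default_window_seconds: int = 60,  # assumed default, not from the spec
) -> RateLimitBucket:
    """Resolve a domain's bucket, letting environment variables override defaults."""
    env = os.environ if env is None else env
    prefix = f"EVT_{domain.upper()}_RATE"  # hypothetical variable naming scheme

    def _positive_int(name: str, default: int) -> int:
        # Fall back to the default on missing, non-numeric, or non-positive values.
        try:
            value = int(env.get(name, default))
        except (TypeError, ValueError):
            return default
        return value if value > 0 else default

    return RateLimitBucket(
        bucket=f"evt-{domain}",
        max_requests=_positive_int(f"{prefix}_MAX_REQUESTS", default_max_requests),
        window_seconds=_positive_int(f"{prefix}_WINDOW_SECONDS", default_window_seconds),
    )
```

Falling back to defaults on malformed values keeps a bad deployment variable from disabling rate limiting outright.
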
#### Scenario: Large CID set exceeds cache threshold
- **WHEN** the normalized CID count exceeds `CACHE_SKIP_CID_THRESHOLD` (default 10000, env: `EVENT_FETCHER_CACHE_SKIP_CID_THRESHOLD`)
- **THEN** `EventFetcher` SHALL skip both L1 and L2 cache writes
- **THEN** a warning log SHALL be emitted with domain name, CID count, and threshold value
- **THEN** the query result SHALL still be returned to the caller
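
The cache-skip rule reduces to a single decision made before any cache write. A minimal sketch of that decision follows, using the `EVENT_FETCHER_CACHE_SKIP_CID_THRESHOLD` variable and the default of 10000 named above. The helper names, the logger name, and the exact log message are assumptions.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger("event_fetcher")  # assumed logger name

DEFAULT_CACHE_SKIP_CID_THRESHOLD = 10000


def cache_skip_threshold(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the threshold from the environment, falling back to the default."""
    env = os.environ if env is None else env
    try:
        return int(env.get("EVENT_FETCHER_CACHE_SKIP_CID_THRESHOLD",
                           DEFAULT_CACHE_SKIP_CID_THRESHOLD))
    except (TypeError, ValueError):
        return DEFAULT_CACHE_SKIP_CID_THRESHOLD


def should_write_cache(domain: str, cid_count: int, threshold: int) -> bool:
    """Return False, and log a warning, when the CID count exceeds the threshold."""
    if cid_count > threshold:
        logger.warning(
            "EventFetcher cache skip: domain=%s cid_count=%d threshold=%d",
            domain, cid_count, threshold,
        )
        return False
    return True
```

The caller would consult `should_write_cache` once and apply the answer to both L1 and L2, then return the query result either way.
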
#### Scenario: Batch concurrency default
- **WHEN** `EventFetcher` processes batches for a domain with more than 1000 CIDs
- **THEN** the default `EVENT_FETCHER_MAX_WORKERS` SHALL be 2 (env: `EVENT_FETCHER_MAX_WORKERS`)
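
To show how the worker default might bound batch concurrency, the sketch below splits a CID list into batches and runs them through a thread pool sized from `EVENT_FETCHER_MAX_WORKERS`. The batch size of 1000 is an assumption inferred from the ">1000 CIDs" condition, and `fetch_in_batches` and `fetch_batch` are hypothetical names. In the real service each batch would presumably call `read_sql_df_slow()` with `timeout_seconds=60`, which is not reproduced here.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Mapping, Optional, Sequence

BATCH_SIZE = 1000  # assumed batch size, inferred from the ">1000 CIDs" condition
DEFAULT_MAX_WORKERS = 2


def max_workers(env: Optional[Mapping[str, str]] = None) -> int:
    """Read EVENT_FETCHER_MAX_WORKERS, defaulting to 2 on missing or bad values."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("EVENT_FETCHER_MAX_WORKERS", DEFAULT_MAX_WORKERS))
    except (TypeError, ValueError):
        return DEFAULT_MAX_WORKERS
    return value if value > 0 else DEFAULT_MAX_WORKERS


def fetch_in_batches(
    cids: Sequence[str],
    fetch_batch: Callable[[Sequence[str]], List[Any]],
    workers: Optional[int] = None,
) -> List[Any]:
    """Split `cids` into batches and fetch them with bounded concurrency."""
    batches = [cids[i:i + BATCH_SIZE] for i in range(0, len(cids), BATCH_SIZE)]
    if len(batches) <= 1:
        # A single batch needs no thread pool.
        return [row for batch in batches for row in fetch_batch(batch)]
    with ThreadPoolExecutor(max_workers=workers or max_workers()) as pool:
        # pool.map yields results in batch order, so row order is preserved.
        results = list(pool.map(fetch_batch, batches))
    return [row for rows in results for row in rows]
```

A small worker count limits how many dedicated database connections one request can hold at once, at the cost of longer wall-clock time for very large CID sets.
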