chore: finalize vite migration hardening and archive openspec changes
33
openspec/specs/api-safety-hygiene/spec.md
Normal file
@@ -0,0 +1,33 @@
# api-safety-hygiene Specification

## Purpose
TBD - created by archiving change residual-hardening-round3. Update Purpose after archive.
## Requirements
### Requirement: Recursive Payload Cleaning MUST Enforce Depth Safety
Routes that normalize nested payloads MUST prevent unbounded recursion depth.

#### Scenario: Deeply nested response object
- **WHEN** NaN-cleaning helper receives deeply nested list/dict payload
- **THEN** cleaning logic MUST enforce max depth or iterative traversal and return safely without recursion failure
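
One way to satisfy the depth-safety scenario is to walk the payload with an explicit stack and truncate below a configured depth. The sketch below is illustrative only: `clean_nan`, `MAX_DEPTH`, and the choice to replace over-deep containers with `None` are assumptions, not names or behavior taken from the project.

```python
import math
from typing import Any

MAX_DEPTH = 64  # hypothetical bound; the spec only requires that some bound exists


def clean_nan(payload: Any, max_depth: int = MAX_DEPTH) -> Any:
    """Return a copy of payload with float NaN replaced by None, without recursion."""

    def scalar(value: Any) -> Any:
        if isinstance(value, float) and math.isnan(value):
            return None
        return value

    if not isinstance(payload, (dict, list)):
        return scalar(payload)

    root: Any = {} if isinstance(payload, dict) else []
    stack = [(payload, root, 1)]  # (source container, target container, depth)
    while stack:
        src, dst, depth = stack.pop()
        items = src.items() if isinstance(src, dict) else enumerate(src)
        for key, value in items:
            if isinstance(value, (dict, list)):
                if depth >= max_depth:
                    cleaned: Any = None  # assumption: truncate instead of descending
                else:
                    cleaned = {} if isinstance(value, dict) else []
                    stack.append((value, cleaned, depth + 1))
            else:
                cleaned = scalar(value)
            if isinstance(dst, dict):
                dst[key] = cleaned
            else:
                dst.append(cleaned)
    return root
```

Because the traversal never calls itself, a payload nested thousands of levels deep cannot raise `RecursionError`; the depth bound only limits how much of it is copied.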

### Requirement: Filter Source Names MUST Be Configurable
Filter cache query sources MUST NOT rely on hardcoded view names only.

#### Scenario: Environment-specific view names
- **WHEN** deployment sets custom filter-source environment variables
- **THEN** filter cache loader MUST resolve and query configured view names
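
As a rough illustration of the scenario above, a loader could resolve each view name from the environment and fall back to a built-in default. The function name, the variable names in the usage note, and the identifier check are all assumptions made for this sketch; the check is included because a configured name would end up inside SQL text.

```python
import os
import re
from typing import Mapping, Optional

# Accepts NAME or SCHEMA.NAME; an assumed, deliberately strict identifier pattern.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*(\.[A-Za-z][A-Za-z0-9_$#]*)?$")


def resolve_filter_source(
    env_var: str, default: str, environ: Optional[Mapping[str, str]] = None
) -> str:
    """Return the configured view name for env_var, or default when unset/blank."""
    env = os.environ if environ is None else environ
    name = (env.get(env_var) or "").strip() or default
    if not _IDENTIFIER.match(name):
        raise ValueError(f"{env_var} is not a valid view name: {name!r}")
    return name
```

A call such as `resolve_filter_source("FILTER_SOURCE_VIEW", "DEFAULT_VIEW")` (both names hypothetical) would then feed the loader's query builder.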

### Requirement: High-Cost APIs SHALL Apply Basic Rate Guardrails
High-cost read endpoints SHALL apply configurable request-rate guardrails to reduce abuse and accidental bursts.

#### Scenario: Burst traffic from same client
- **WHEN** a client exceeds configured request budget for guarded endpoints
- **THEN** endpoint SHALL return throttled response with clear retry guidance
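
A minimal sketch of such a guardrail, assuming an in-process sliding window keyed by client identifier. `RateGuard` and its parameters are hypothetical; the spec does not say which algorithm or storage the project uses.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class RateGuard:
    """Sliding-window request budget per client key (illustrative only)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, client_key: str) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) and record the hit when allowed."""
        now = self._clock()
        with self._lock:
            hits = self._hits[client_key]
            # Drop hits that have aged out of the window.
            while hits and now - hits[0] >= self.window_seconds:
                hits.popleft()
            if len(hits) >= self.max_requests:
                return False, self.window_seconds - (now - hits[0])
            hits.append(now)
            return True, 0.0
```

A route could translate a denied check into an HTTP 429 response whose `Retry-After` header carries the returned delay, which is one way to give "clear retry guidance".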

### Requirement: Common Boolean Query Parsing SHALL Be Shared
Boolean query parsing in routes SHALL use shared helper behavior.

#### Scenario: Different routes parse include flags
- **WHEN** routes parse common boolean query parameters
- **THEN** parsing behavior MUST be consistent across routes via shared utility
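
The shared utility could be as small as the sketch below. The helper name and the accepted token sets are assumptions; the point is only that every route calls the same function so that `?include_x=Yes` and `?include_x=1` behave identically everywhere.

```python
from typing import Optional

_TRUE_TOKENS = frozenset({"1", "true", "yes", "on"})
_FALSE_TOKENS = frozenset({"0", "false", "no", "off"})


def parse_bool_query(raw: Optional[str], default: bool = False) -> bool:
    """Parse a query-string flag; unknown or missing values fall back to default."""
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE_TOKENS:
        return True
    if value in _FALSE_TOKENS:
        return False
    return default
```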
26
openspec/specs/cache-indexed-query-acceleration/spec.md
Normal file
@@ -0,0 +1,26 @@
# cache-indexed-query-acceleration Specification

## Purpose
TBD - created by archiving change p1-cache-query-efficiency. Update Purpose after archive.
## Requirements
### Requirement: Incremental Synchronization SHALL Use Versioned Watermarks
For heavy non-full-snapshot datasets, cache refresh SHALL support incremental synchronization keyed by stable version or watermark boundaries.

#### Scenario: Incremental refresh cycle
- **WHEN** source data version indicates partial changes since last sync
- **THEN** cache update logic MUST fetch and merge only changed partitions while preserving correctness guarantees

### Requirement: Query Paths SHALL Use Indexed Access for High-Frequency Filters
Query execution over cached data SHALL use prebuilt indexes for known high-frequency filter columns.

#### Scenario: Filtered report query
- **WHEN** request filters target indexed fields
- **THEN** result selection MUST avoid full dataset scans and maintain existing response contract

### Requirement: Business-Mandated Full-Table Caches SHALL Be Preserved for Resource and WIP
The system SHALL continue to maintain full-table cache behavior for `resource` and `wip` domains.

#### Scenario: Resource or WIP cache refresh
- **WHEN** cache update runs for `resource` or `wip`
- **THEN** the updater MUST retain full-table snapshot semantics and MUST NOT switch these domains to partial-only cache mode
@@ -36,3 +36,53 @@ The system MUST define alert thresholds for sustained degraded state, repeated w
- **WHEN** degraded status persists beyond configured duration
- **THEN** the monitoring contract MUST classify the service as alert-worthy with actionable context

### Requirement: Cache Telemetry SHALL Include Memory Amplification Signals
Operational telemetry MUST expose cache-domain memory usage indicators and representation amplification factors, and MUST differentiate between authoritative data payload and derived/index helper structures.

#### Scenario: Deep health telemetry request after representation normalization
- **WHEN** operators inspect cache telemetry for resource or WIP domains
- **THEN** telemetry MUST include per-domain memory footprint, amplification indicators, and enough structure detail to verify that full-record duplication is not reintroduced

### Requirement: Efficiency Benchmarks SHALL Gate Cache Refactor Rollout
Cache/query efficiency changes MUST be validated against baseline latency and memory benchmarks before rollout.

#### Scenario: Pre-release validation
- **WHEN** cache refactor changes are prepared for deployment
- **THEN** benchmark results MUST demonstrate no regression beyond configured thresholds for P95 latency and memory usage

### Requirement: Process-Level Cache SHALL Use Bounded Capacity with Deterministic Eviction
Process-level parsed-data caches MUST enforce a configurable maximum key capacity and use deterministic eviction behavior when capacity is exceeded.

#### Scenario: Cache capacity reached
- **WHEN** a new cache entry is inserted and key capacity is at limit
- **THEN** cache MUST evict entries according to defined policy before storing the new key

#### Scenario: Repeated access updates recency
- **WHEN** an existing cache key is read or overwritten
- **THEN** eviction order MUST reflect recency semantics so hot keys are retained preferentially
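
The two scenarios above describe least-recently-used behavior. A compact sketch built on `collections.OrderedDict` follows; the class name `BoundedLRUCache` and the `max_keys` parameter are assumptions, not the project's actual cache type.

```python
import threading
from collections import OrderedDict
from typing import Any, Hashable


class BoundedLRUCache:
    """Process-local cache with a hard key limit and LRU eviction (illustrative)."""

    def __init__(self, max_keys: int) -> None:
        if max_keys < 1:
            raise ValueError("max_keys must be >= 1")
        self.max_keys = max_keys
        self._lock = threading.Lock()
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        with self._lock:
            if key not in self._data:
                return default
            self._data.move_to_end(key)  # a read refreshes recency
            return self._data[key]

    def set(self, key: Hashable, value: Any) -> None:
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)  # an overwrite refreshes recency
            elif len(self._data) >= self.max_keys:
                self._data.popitem(last=False)  # evict the oldest key first
            self._data[key] = value

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)
```

Eviction is deterministic because the order depends only on the sequence of reads and writes, never on timing or hashing.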

### Requirement: Cache Publish MUST Preserve Previous Readable Snapshot on Failure
When refreshing full-table cache payloads, the system MUST avoid exposing partially published states to readers.

#### Scenario: Publish fails after payload serialization
- **WHEN** a cache refresh has prepared new payload but publish operation fails
- **THEN** previously published cache keys MUST remain readable and metadata MUST remain consistent with old snapshot

#### Scenario: Publish succeeds
- **WHEN** publish operation completes successfully
- **THEN** data payload and metadata keys MUST be visible as one coherent new snapshot

### Requirement: Process-Level Cache Slow Path SHALL Minimize Lock Hold Time
Large payload parsing MUST NOT happen inside long-held process cache locks.

#### Scenario: Cache miss under concurrent requests
- **WHEN** multiple requests hit process cache miss
- **THEN** parsing work SHALL happen outside lock-protected mutation section, and lock scope SHALL be limited to consistency check + commit
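
The "consistency check + commit" pattern can be sketched as below, assuming cache entries are keyed by name and tagged with a snapshot version. `ParsedPayloadCache`, its `version` argument, and the default `json.loads` parser are hypothetical stand-ins for whatever the services actually use.

```python
import json
import threading
from typing import Any, Callable, Dict, Tuple


class ParsedPayloadCache:
    """Parses outside the lock; the lock only guards lookup and commit."""

    def __init__(self, parse: Callable[[str], Any] = json.loads) -> None:
        self._parse = parse
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, Any]] = {}  # key -> (version, parsed)

    def get(self, key: str, version: str, raw: str) -> Any:
        # Fast path: short critical section, no parsing.
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] == version:
                return entry[1]

        parsed = self._parse(raw)  # slow path runs with no lock held

        # Commit: re-check in case another thread finished first.
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] == version:
                return entry[1]
            self._entries[key] = (version, parsed)
            return parsed
```

The trade-off in this sketch is that concurrent misses may each parse the same payload once; in exchange, no request waits on another request's parse.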

### Requirement: Process-Level Cache Policies MUST Stay Consistent Across Services
All service-local process caches MUST support bounded capacity with deterministic eviction.

#### Scenario: Realtime equipment cache growth
- **WHEN** realtime equipment process cache reaches configured capacity
- **THEN** entries MUST be evicted according to deterministic LRU behavior
@@ -24,3 +24,17 @@ Runbooks and deployment documentation MUST describe the same conda/systemd/watch
- **WHEN** an operator performs deploy, health check, and rollback from documentation
- **THEN** documented commands and paths MUST work without requiring venv-specific assumptions

### Requirement: Runtime Path Drift SHALL Be Detectable Before Service Start
Service startup checks MUST validate configured conda runtime paths across app, watchdog, and worker control scripts.

#### Scenario: Conda path mismatch detected
- **WHEN** startup validation finds runtime path inconsistency between configured units and scripts
- **THEN** service start MUST fail with actionable diagnostics instead of running with partial mismatch

### Requirement: Conda/Systemd Contract SHALL Be Versioned in Operations Docs
The documented runtime contract MUST include versioned path assumptions and verification commands.

#### Scenario: Operator verifies deployment contract
- **WHEN** operator follows runbook validation steps
- **THEN** commands MUST confirm active runtime paths match documented conda/systemd contract
@@ -50,3 +50,17 @@ Frontend matrix/filter computations SHALL produce deterministic selection and fi
- **WHEN** users toggle matrix cells across group, family, and resource rows
- **THEN** selected-state rendering and filtered equipment result sets MUST remain level-correct and reversible

### Requirement: Reusable Browser Compute Modules SHALL Power Report Derivations
Derived computations for report filters, KPI cards, chart series, and table projections SHALL be implemented through reusable frontend modules.

#### Scenario: Shared report derivation logic
- **WHEN** multiple report pages require equivalent data-shaping behavior
- **THEN** pages MUST consume shared compute modules instead of duplicating transformation logic per page

### Requirement: Browser Compute Shift SHALL Preserve Export and Field Contracts
Moving computations to frontend MUST preserve existing field naming and export column contracts.

#### Scenario: User exports report after frontend-side derivation
- **WHEN** transformed data is rendered and exported
- **THEN** exported field names and ordering MUST remain consistent with governed field contract definitions
@@ -0,0 +1,19 @@
# maintainability-type-and-constant-hygiene Specification

## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Core Cache and Service Boundaries MUST Use Consistent Type Annotation Style
Core cache/service modules touched by this change SHALL use a consistent and explicit type-annotation style for public and internal helper boundaries.

#### Scenario: Reviewing updated cache/service modules
- **WHEN** maintainers inspect function signatures in affected modules
- **THEN** optional and collection types MUST follow a single consistent style and remain compatible with the project Python baseline

### Requirement: High-Frequency Magic Numbers MUST Be Replaced by Named Constants
Cache, throttling, and index-related numeric literals that control behavior MUST be extracted to named constants or env-configurable settings.

#### Scenario: Tuning cache/index behavior
- **WHEN** operators need to tune cache/index thresholds
- **THEN** they MUST find values in named constants or environment variables rather than scattered inline literals
19
openspec/specs/oracle-query-fragment-governance/spec.md
Normal file
@@ -0,0 +1,19 @@
# oracle-query-fragment-governance Specification

## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Shared Oracle Query Fragments SHALL Have a Single Source of Truth
Cross-service Oracle query fragments for resource and equipment cache loading MUST be defined in a shared module and imported by service implementations.

#### Scenario: Update common table/view reference
- **WHEN** a common table or view name changes
- **THEN** operators and developers MUST be able to update one shared definition without editing duplicated SQL literals across services

### Requirement: Service Queries MUST Preserve Existing Columns and Semantics
Services consuming shared Oracle query fragments SHALL preserve existing selected columns, filters, and downstream payload behavior.

#### Scenario: Resource and equipment cache refresh after refactor
- **WHEN** cache services execute queries via shared fragments
- **THEN** resulting payload structure MUST remain compatible with existing aggregation and API contracts
@@ -0,0 +1,26 @@
# resource-cache-representation-normalization Specification

## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Resource Derived Index MUST Avoid Full Record Duplication
Resource derived index SHALL use lightweight row-position references instead of storing full duplicated record payloads alongside the process DataFrame cache.

#### Scenario: Build index from cached DataFrame
- **WHEN** resource cache data is parsed from Redis into process-level DataFrame
- **THEN** the derived index MUST store position-based references and metadata without a second full records copy

### Requirement: Resource Query APIs SHALL Preserve Existing Response Contract
Resource query APIs MUST keep existing output fields and semantics after index representation normalization.

#### Scenario: Read all resources after normalization
- **WHEN** callers request all resources or filtered resource lists
- **THEN** the returned payload MUST remain field-compatible with pre-normalization responses

### Requirement: Cache Invalidation MUST Keep Index/Data Coherent
The system SHALL invalidate and rebuild DataFrame/index representations atomically at cache refresh boundaries.

#### Scenario: Redis-backed cache refresh completes
- **WHEN** a new resource cache snapshot is published
- **THEN** stale index references MUST be invalidated before subsequent reads use refreshed DataFrame data
@@ -48,3 +48,47 @@ The system MUST expose machine-readable resilience thresholds, restart-churn ind
#### Scenario: Admin status includes restart churn summary
- **WHEN** operators call `/admin/api/system-status` or `/admin/api/worker/status`
- **THEN** responses MUST include bounded restart history summary within a configured time window and indicate whether churn threshold is exceeded

### Requirement: Recovery Recommendations SHALL Reflect Self-Healing Policy State
Health and admin resilience payloads MUST expose whether automated recovery is allowed, cooling down, or blocked by churn policy.

#### Scenario: Operator inspects degraded state
- **WHEN** `/health` or `/admin/api/worker/status` is requested during degradation
- **THEN** response MUST include policy state, cooldown remaining time, and next recommended action

### Requirement: Manual Recovery Override SHALL Be Explicit and Controlled
Manual restart actions MUST bypass automatic block only through authenticated operator pathways with explicit acknowledgement.

#### Scenario: Churn-blocked state with manual override request
- **WHEN** authorized admin requests manual restart while auto-recovery is blocked
- **THEN** system MUST execute controlled restart path and log the override context for auditability

### Requirement: Circuit Breaker State Transitions SHALL Avoid Lock-Held Logging
Circuit breaker state transitions MUST avoid executing logger I/O while internal state locks are held.

#### Scenario: State transition occurs
- **WHEN** circuit breaker transitions between CLOSED, OPEN, or HALF_OPEN
- **THEN** lock-protected section MUST complete state mutation before emitting transition log output

#### Scenario: Slow log handler under load
- **WHEN** logger handlers are slow or blocked
- **THEN** circuit breaker lock contention MUST remain bounded and MUST NOT serialize unrelated request paths behind logging latency
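
A common way to meet both scenarios is to record the transition inside the lock and emit the log line only after the lock is released. The breaker below is a deliberately reduced sketch: it has no HALF_OPEN timer, and the class shape, logger name, and `failure_threshold` parameter are assumptions.

```python
import logging
import threading
from typing import Optional, Tuple

logger = logging.getLogger("circuit_breaker")  # hypothetical logger name


class CircuitBreaker:
    CLOSED, OPEN = "CLOSED", "OPEN"

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self._lock = threading.Lock()
        self._state = self.CLOSED
        self._failures = 0

    @property
    def state(self) -> str:
        with self._lock:
            return self._state

    def record_failure(self) -> None:
        transition: Optional[Tuple[str, str]] = None
        with self._lock:  # mutate state only; no I/O in here
            self._failures += 1
            if self._state != self.OPEN and self._failures >= self.failure_threshold:
                transition = (self._state, self.OPEN)
                self._state = self.OPEN
        if transition is not None:  # lock already released
            logger.warning("circuit breaker %s -> %s", *transition)

    def record_success(self) -> None:
        transition: Optional[Tuple[str, str]] = None
        with self._lock:
            self._failures = 0
            if self._state != self.CLOSED:
                transition = (self._state, self.CLOSED)
                self._state = self.CLOSED
        if transition is not None:
            logger.info("circuit breaker %s -> %s", *transition)
```

With this shape, a slow log handler delays only the thread that caused the transition; other threads can still acquire the state lock.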

### Requirement: Health Endpoints SHALL Use Short Internal Memoization
Health and deep-health computation SHALL use a short-lived internal cache to prevent probe storms from amplifying backend load.

#### Scenario: Frequent monitor scrapes
- **WHEN** health endpoints are called repeatedly within a small window
- **THEN** service SHALL return memoized payload for up to 5 seconds in non-testing environments

#### Scenario: Testing mode
- **WHEN** app is running in testing mode
- **THEN** health endpoint memoization MUST be bypassed to preserve deterministic tests
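
The following sketch shows one possible shape for this memoization. Only the 5-second window and the testing bypass come from the scenarios; `HealthMemo`, the injected `compute` callable, and the `testing` flag are hypothetical.

```python
import threading
import time
from typing import Any, Callable, Optional, Tuple

HEALTH_MEMO_TTL_SECONDS = 5.0  # "up to 5 seconds" per the scenario above


class HealthMemo:
    """Caches the last health payload for a short TTL; bypassed in testing mode."""

    def __init__(
        self,
        compute: Callable[[], Any],
        ttl_seconds: float = HEALTH_MEMO_TTL_SECONDS,
        testing: bool = False,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute
        self._ttl = ttl_seconds
        self._testing = testing
        self._clock = clock
        self._lock = threading.Lock()
        self._entry: Optional[Tuple[float, Any]] = None  # (stored_at, payload)

    def get(self) -> Any:
        if self._testing:
            return self._compute()  # always fresh so tests stay deterministic
        now = self._clock()
        with self._lock:
            if self._entry is not None and now - self._entry[0] < self._ttl:
                return self._entry[1]
        payload = self._compute()  # computed without holding the lock
        with self._lock:
            self._entry = (now, payload)
        return payload
```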

### Requirement: Logs MUST Redact Connection Secrets
Runtime logs MUST avoid exposing DB connection credentials.

#### Scenario: Connection string appears in log message
- **WHEN** a log message contains DB URL credentials
- **THEN** logger output MUST redact password and sensitive userinfo before emission
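
One way to apply this at the logging layer is a `logging.Filter` that rewrites the formatted message before any handler emits it. The regular expression and the names `redact_credentials` and `RedactingFilter` are assumptions for illustration; this sketch masks the whole userinfo part of any `scheme://user:password@host` URL and does not cover credentials passed in other forms.

```python
import logging
import re

# scheme://<userinfo>@ where userinfo runs up to the last "@" before "/" or whitespace.
_URL_CREDENTIALS = re.compile(r"(?P<scheme>[A-Za-z][A-Za-z0-9+._\-]*://)[^/\s]+@")


def redact_credentials(text: str) -> str:
    """Replace URL userinfo (user and password) with a fixed mask."""
    return _URL_CREDENTIALS.sub(r"\g<scheme>***@", text)


class RedactingFilter(logging.Filter):
    """Redacts the fully formatted message so %-style arguments are covered too."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact_credentials(record.getMessage())
        record.args = ()
        return True
```

Attaching the filter to each handler (`handler.addFilter(RedactingFilter())`) rather than to a single logger means records propagated from child loggers are covered as well.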
38
openspec/specs/security-surface-hardening/spec.md
Normal file
@@ -0,0 +1,38 @@
# security-surface-hardening Specification

## Purpose
TBD - created by archiving change security-stability-hardening-round2. Update Purpose after archive.
## Requirements
### Requirement: LDAP Authentication Endpoint Configuration SHALL Be Strictly Validated
The system MUST validate LDAP authentication endpoint configuration before use, including HTTPS scheme enforcement and host allowlist checks.

#### Scenario: Invalid LDAP URL configuration detected
- **WHEN** `LDAP_API_URL` is missing, non-HTTPS, or points to a host outside the configured allowlist
- **THEN** the service MUST reject LDAP authentication calls and emit actionable diagnostics without sending credentials to that endpoint

#### Scenario: Valid LDAP URL configuration accepted
- **WHEN** `LDAP_API_URL` uses HTTPS and host is allowlisted
- **THEN** LDAP authentication requests MAY proceed with normal timeout and error handling behavior
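
The validation described in these two scenarios could look roughly like the sketch below, which uses `urllib.parse.urlsplit` from the standard library. Only `LDAP_API_URL` appears in the spec; the function name and the way the allowlist is passed in are assumptions.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit


def validate_ldap_api_url(url: Optional[str], allowed_hosts: Iterable[str]) -> str:
    """Return the URL if it is HTTPS and allowlisted; raise ValueError otherwise."""
    if not url or not url.strip():
        raise ValueError("LDAP_API_URL is not configured")
    candidate = url.strip()
    parts = urlsplit(candidate)
    if parts.scheme != "https":  # urlsplit lowercases the scheme
        raise ValueError("LDAP_API_URL must use https")
    host = (parts.hostname or "").lower()
    if host not in {h.lower() for h in allowed_hosts}:
        raise ValueError(f"LDAP_API_URL host {host!r} is not in the allowlist")
    return candidate
```

Calling this before any request is built keeps credentials from being sent to a rejected endpoint, and the exception message can serve as the diagnostic.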

### Requirement: Security Response Headers SHALL Be Applied Globally
All HTTP responses MUST include baseline security headers suitable for dashboard and API traffic.

#### Scenario: Standard response emitted
- **WHEN** any route returns a response
- **THEN** response MUST include `Content-Security-Policy`, `X-Frame-Options`, `X-Content-Type-Options`, and `Referrer-Policy`

#### Scenario: Production transport hardening
- **WHEN** runtime environment is production
- **THEN** response MUST include `Strict-Transport-Security`

### Requirement: Pagination Input Boundaries SHALL Be Enforced
Endpoints accepting pagination parameters MUST enforce lower and upper bounds before query execution.

#### Scenario: Negative or zero pagination inputs
- **WHEN** client sends `page <= 0` or `page_size <= 0`
- **THEN** server MUST normalize values to minimum supported bounds

#### Scenario: Excessive page size requested
- **WHEN** client sends `page_size` above configured maximum
- **THEN** server MUST clamp to maximum supported page size
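
Both pagination scenarios reduce to a clamp applied before the query runs. In this sketch the helper name, the default page size of 50, the maximum of 500, and the fallback for non-numeric input are all assumed values, not project settings.

```python
from typing import Any, Tuple

DEFAULT_PAGE_SIZE = 50  # hypothetical default
MAX_PAGE_SIZE = 500     # hypothetical configured maximum


def normalize_pagination(
    page: Any, page_size: Any, max_page_size: int = MAX_PAGE_SIZE
) -> Tuple[int, int]:
    """Clamp page to >= 1 and page_size to the range [1, max_page_size]."""

    def to_int(raw: Any, fallback: int) -> int:
        try:
            return int(raw)
        except (TypeError, ValueError):
            return fallback  # assumption: unparseable input uses the default

    page_number = max(1, to_int(page, 1))
    size = min(max(1, to_int(page_size, DEFAULT_PAGE_SIZE)), max_page_size)
    return page_number, size
```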
26
openspec/specs/worker-self-healing-governance/spec.md
Normal file
@@ -0,0 +1,26 @@
# worker-self-healing-governance Specification

## Purpose
TBD - created by archiving change p2-ops-self-healing-runbook. Update Purpose after archive.
## Requirements
### Requirement: Automated Worker Recovery SHALL Use Bounded Policy Guards
Automated worker restart behavior MUST enforce cooldown periods and bounded restart attempts within a configurable time window.

#### Scenario: Repeated worker degradation within short window
- **WHEN** degradation events exceed configured restart-attempt budget
- **THEN** automated restarts MUST pause and surface a blocked-recovery signal for operator intervention

### Requirement: Restart-Churn Protection SHALL Prevent Recovery Storms
The runtime MUST classify restart churn and prevent uncontrolled restart loops.

#### Scenario: Churn threshold exceeded
- **WHEN** restart count crosses churn threshold in active window
- **THEN** watchdog MUST enter guarded mode and require explicit manual override before further restart attempts
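
The cooldown, restart budget, and manual override described so far can be combined into a single decision function. The sketch below is one possible reading of those rules; `RestartPolicy`, its parameters, the state labels, and the returned dictionary fields are assumptions rather than the watchdog's real interface.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Dict


class RestartPolicy:
    """Decides whether a restart may run, based on recent restart history."""

    def __init__(
        self,
        max_restarts: int,
        window_seconds: float,
        cooldown_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._history: Deque[float] = deque()  # timestamps of executed restarts

    def decide(self, manual_override: bool = False) -> Dict[str, Any]:
        now = self._clock()
        while self._history and now - self._history[0] > self.window_seconds:
            self._history.popleft()  # forget restarts outside the active window
        count = len(self._history)
        if not manual_override:
            if count >= self.max_restarts:
                return {"allowed": False, "state": "blocked",
                        "restarts_in_window": count, "cooldown_remaining": 0.0}
            if self._history and now - self._history[-1] < self.cooldown_seconds:
                remaining = self.cooldown_seconds - (now - self._history[-1])
                return {"allowed": False, "state": "cooldown",
                        "restarts_in_window": count, "cooldown_remaining": remaining}
        self._history.append(now)
        return {"allowed": True,
                "state": "manual_override" if manual_override else "allowed",
                "restarts_in_window": count + 1, "cooldown_remaining": 0.0}
```

Returning a dictionary instead of a bare boolean gives the caller something it can write directly into a structured log or status payload.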

### Requirement: Recovery Decisions SHALL Be Audit-Ready
Every auto-recovery decision and manual override action MUST be recorded with structured metadata.

#### Scenario: Worker restart decision emitted
- **WHEN** system executes or denies a restart action
- **THEN** structured logs/events MUST include reason, thresholds, actor/source, and resulting state