chore: finalize vite migration hardening and archive openspec changes

beabigegg
2026-02-08 20:03:36 +08:00
parent b56e80381b
commit c8e225101e
119 changed files with 6547 additions and 1301 deletions

@@ -0,0 +1,33 @@
# api-safety-hygiene Specification
## Purpose
TBD - created by archiving change residual-hardening-round3. Update Purpose after archive.
## Requirements
### Requirement: Recursive Payload Cleaning MUST Enforce Depth Safety
Routes that normalize nested payloads MUST prevent unbounded recursion depth.
#### Scenario: Deeply nested response object
- **WHEN** NaN-cleaning helper receives deeply nested list/dict payload
- **THEN** cleaning logic MUST enforce max depth or iterative traversal and return safely without recursion failure
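
One way to satisfy the depth-safety scenario is iterative traversal with an explicit stack plus a depth cap. The sketch below is only an illustration: `clean_nan`, `MAX_DEPTH`, and the choice to raise `ValueError` when the cap is exceeded are assumptions, not the project's actual helper.

```python
import math
from typing import Any

MAX_DEPTH = 64  # hypothetical cap; a real deployment would likely make this configurable


def clean_nan(payload: Any, max_depth: int = MAX_DEPTH) -> Any:
    """Return a copy of payload with float NaN values replaced by None.

    Traversal uses an explicit stack instead of recursion, so deep nesting
    cannot raise RecursionError. Nesting beyond max_depth raises ValueError.
    """

    def scalar(value: Any) -> Any:
        if isinstance(value, float) and math.isnan(value):
            return None
        return value

    if not isinstance(payload, (dict, list)):
        return scalar(payload)

    root: Any = {} if isinstance(payload, dict) else []
    stack = [(payload, root, 1)]  # (source container, target container, depth)
    while stack:
        src, dst, depth = stack.pop()
        if depth > max_depth:
            raise ValueError(f"payload nesting exceeds max depth {max_depth}")
        items = src.items() if isinstance(src, dict) else enumerate(src)
        for key, value in items:
            if isinstance(value, (dict, list)):
                # Attach an empty child now and fill it on a later iteration.
                cleaned: Any = {} if isinstance(value, dict) else []
                stack.append((value, cleaned, depth + 1))
            else:
                cleaned = scalar(value)
            if isinstance(dst, dict):
                dst[key] = cleaned
            else:
                dst.append(cleaned)
    return root
```

A route could catch the `ValueError` and return a client error; truncating at the cap instead of raising would be an equally valid reading of the requirement.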
### Requirement: Filter Source Names MUST Be Configurable
Filter cache query sources MUST NOT rely solely on hardcoded view names.
#### Scenario: Environment-specific view names
- **WHEN** deployment sets custom filter-source environment variables
- **THEN** filter cache loader MUST resolve and query configured view names
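
A minimal sketch of environment-driven view resolution follows. The default view names, the `FILTER_SOURCE_` prefix, and `resolve_filter_sources` are hypothetical names chosen for illustration. Because a view name ends up inside SQL text, the sketch also rejects values that are not plain identifiers.

```python
import os
import re
from typing import Dict, Mapping, Optional

# Hypothetical defaults and variable prefix, for illustration only.
DEFAULT_FILTER_SOURCES = {
    "workcenter": "V_FILTER_WORKCENTER",
    "package": "V_FILTER_PACKAGE",
}
ENV_PREFIX = "FILTER_SOURCE_"

# Accepts NAME or SCHEMA.NAME; anything else is refused before it reaches SQL.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*(\.[A-Za-z][A-Za-z0-9_$#]*)?")


def resolve_filter_sources(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the view name per filter source, preferring environment overrides."""
    env = os.environ if env is None else env
    resolved = {}
    for key, default in DEFAULT_FILTER_SOURCES.items():
        name = env.get(f"{ENV_PREFIX}{key.upper()}", "").strip() or default
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid view name for {key!r}: {name!r}")
        resolved[key] = name
    return resolved
```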
### Requirement: High-Cost APIs SHALL Apply Basic Rate Guardrails
High-cost read endpoints SHALL apply configurable request-rate guardrails to reduce abuse and accidental bursts.
#### Scenario: Burst traffic from same client
- **WHEN** a client exceeds configured request budget for guarded endpoints
- **THEN** endpoint SHALL return throttled response with clear retry guidance
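
The guardrail could take the form of a per-client sliding window. The sketch below assumes that shape; `RateGuard` and its parameters are illustrative names. The second return value is the number of seconds a caller should wait, which an endpoint might surface as a `Retry-After` header on a 429 response.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class RateGuard:
    """Sliding-window request budget per client (illustrative sketch)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def check(self, client_id: str) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) and record the hit if allowed."""
        now = self._clock()
        with self._lock:
            hits = self._hits[client_id]
            # Drop timestamps that have aged out of the window.
            while hits and now - hits[0] >= self._window:
                hits.popleft()
            if len(hits) < self._max:
                hits.append(now)
                return True, 0.0
            # The oldest hit leaving the window frees the next slot.
            return False, self._window - (now - hits[0])
```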
### Requirement: Common Boolean Query Parsing SHALL Be Shared
Boolean query parsing in routes SHALL use shared helper behavior.
#### Scenario: Different routes parse include flags
- **WHEN** routes parse common boolean query parameters
- **THEN** parsing behavior MUST be consistent across routes via shared utility
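
A shared helper for this requirement might look like the sketch below. The accepted tokens and the name `parse_bool_query` are assumptions; the point is that every route calls one function, so `?include_x=YES` and `?include_y=1` are interpreted identically.

```python
from typing import Optional

# Assumed token sets; the project may accept a different vocabulary.
_TRUE_TOKENS = frozenset({"1", "true", "yes", "on"})
_FALSE_TOKENS = frozenset({"0", "false", "no", "off"})


def parse_bool_query(raw: Optional[str], default: bool = False) -> bool:
    """Parse a boolean query parameter; fall back to default when absent or unknown."""
    if raw is None:
        return default
    token = raw.strip().lower()
    if token in _TRUE_TOKENS:
        return True
    if token in _FALSE_TOKENS:
        return False
    return default
```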

@@ -0,0 +1,26 @@
# cache-indexed-query-acceleration Specification
## Purpose
TBD - created by archiving change p1-cache-query-efficiency. Update Purpose after archive.
## Requirements
### Requirement: Incremental Synchronization SHALL Use Versioned Watermarks
For heavy non-full-snapshot datasets, cache refresh SHALL support incremental synchronization keyed by stable version or watermark boundaries.
#### Scenario: Incremental refresh cycle
- **WHEN** source data version indicates partial changes since last sync
- **THEN** cache update logic MUST fetch and merge only changed partitions while preserving correctness guarantees
### Requirement: Query Paths SHALL Use Indexed Access for High-Frequency Filters
Query execution over cached data SHALL use prebuilt indexes for known high-frequency filter columns.
#### Scenario: Filtered report query
- **WHEN** request filters target indexed fields
- **THEN** result selection MUST avoid full dataset scans and maintain existing response contract
### Requirement: Business-Mandated Full-Table Caches SHALL Be Preserved for Resource and WIP
The system SHALL continue to maintain full-table cache behavior for `resource` and `wip` domains.
#### Scenario: Resource or WIP cache refresh
- **WHEN** cache update runs for `resource` or `wip`
- **THEN** the updater MUST retain full-table snapshot semantics and MUST NOT switch these domains to partial-only cache mode

@@ -36,3 +36,53 @@ The system MUST define alert thresholds for sustained degraded state, repeated w
- **WHEN** degraded status persists beyond configured duration
- **THEN** the monitoring contract MUST classify the service as alert-worthy with actionable context
### Requirement: Cache Telemetry SHALL Include Memory Amplification Signals
Operational telemetry MUST expose cache-domain memory usage indicators and representation amplification factors, and MUST differentiate between authoritative data payload and derived/index helper structures.
#### Scenario: Deep health telemetry request after representation normalization
- **WHEN** operators inspect cache telemetry for resource or WIP domains
- **THEN** telemetry MUST include per-domain memory footprint, amplification indicators, and enough structure detail to verify that full-record duplication is not reintroduced
### Requirement: Efficiency Benchmarks SHALL Gate Cache Refactor Rollout
Cache/query efficiency changes MUST be validated against baseline latency and memory benchmarks before rollout.
#### Scenario: Pre-release validation
- **WHEN** cache refactor changes are prepared for deployment
- **THEN** benchmark results MUST demonstrate no regression beyond configured thresholds for P95 latency and memory usage
### Requirement: Process-Level Cache SHALL Use Bounded Capacity with Deterministic Eviction
Process-level parsed-data caches MUST enforce a configurable maximum key capacity and use deterministic eviction behavior when capacity is exceeded.
#### Scenario: Cache capacity reached
- **WHEN** a new cache entry is inserted and key capacity is at limit
- **THEN** cache MUST evict entries according to defined policy before storing the new key
#### Scenario: Repeated access updates recency
- **WHEN** an existing cache key is read or overwritten
- **THEN** eviction order MUST reflect recency semantics so hot keys are retained preferentially
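
Both scenarios above describe LRU behavior. A minimal sketch using `collections.OrderedDict` is shown here; the class name `BoundedLRUCache` and its interface are assumptions rather than the project's actual cache.

```python
import threading
from collections import OrderedDict
from typing import Any, Hashable


class BoundedLRUCache:
    """Process-level cache with a key cap and least-recently-used eviction."""

    def __init__(self, max_keys: int) -> None:
        if max_keys < 1:
            raise ValueError("max_keys must be at least 1")
        self._max_keys = max_keys
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: Hashable, default: Any = None) -> Any:
        with self._lock:
            if key not in self._data:
                return default
            self._data.move_to_end(key)  # a read refreshes recency
            return self._data[key]

    def set(self, key: Hashable, value: Any) -> None:
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)  # an overwrite refreshes recency
            elif len(self._data) >= self._max_keys:
                self._data.popitem(last=False)  # evict the oldest key first
            self._data[key] = value

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)
```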
### Requirement: Cache Publish MUST Preserve Previous Readable Snapshot on Failure
When refreshing full-table cache payloads, the system MUST avoid exposing partially published states to readers.
#### Scenario: Publish fails after payload serialization
- **WHEN** a cache refresh has prepared new payload but publish operation fails
- **THEN** previously published cache keys MUST remain readable and metadata MUST remain consistent with old snapshot
#### Scenario: Publish succeeds
- **WHEN** publish operation completes successfully
- **THEN** data payload and metadata keys MUST be visible as one coherent new snapshot
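
One common way to get these semantics is to stage the new snapshot under a versioned key and make a single pointer write the only publish step. The sketch below uses a plain mapping as a stand-in for the real cache backend; the key layout and function names are hypothetical.

```python
from typing import Any, Dict, MutableMapping, Optional


def publish_snapshot(
    store: MutableMapping[str, Any],
    domain: str,
    version: str,
    payload: Any,
    meta: Dict[str, Any],
) -> None:
    """Stage payload and metadata together, then flip one pointer key."""
    pointer_key = f"{domain}:current"
    staged_key = f"{domain}:snapshot:{version}"
    # Staging is invisible to readers because they only follow the pointer.
    store[staged_key] = {"payload": payload, "meta": meta}
    previous = store.get(pointer_key)
    # If this write fails, the pointer still names the old snapshot.
    store[pointer_key] = staged_key
    if previous and previous != staged_key:
        store.pop(previous, None)  # retire the old snapshot after the flip


def read_snapshot(store: MutableMapping[str, Any], domain: str) -> Optional[Dict[str, Any]]:
    """Return the currently published payload and metadata, or None."""
    current = store.get(f"{domain}:current")
    return store.get(current) if current else None
```

A failed publish leaves the staged key orphaned, so a real implementation would also need expiry or cleanup for staged entries.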
### Requirement: Process-Level Cache Slow Path SHALL Minimize Lock Hold Time
Large payload parsing MUST NOT happen inside long-held process cache locks.
#### Scenario: Cache miss under concurrent requests
- **WHEN** multiple requests hit process cache miss
- **THEN** parsing work SHALL happen outside the lock-protected mutation section, and lock scope SHALL be limited to the consistency check and commit
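
The pattern described here can be sketched as check, parse without the lock, then re-check and commit. `ParsedCache`, the version argument, and the use of `json.loads` as the parser are assumptions for illustration.

```python
import json
import threading
from typing import Any, Callable, Dict, Tuple


class ParsedCache:
    """Caches parsed payloads keyed by (key, version); parsing runs unlocked."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, Any]] = {}

    def get(
        self,
        key: str,
        version: str,
        load_raw: Callable[[], str],
        parse: Callable[[str], Any] = json.loads,
    ) -> Any:
        # Fast path: a short lock hold for the lookup only.
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] == version:
                return entry[1]

        # Slow path: fetch and parse with no lock held.
        parsed = parse(load_raw())

        # Commit: re-check in case another thread finished first.
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and entry[0] == version:
                return entry[1]
            self._entries[key] = (version, parsed)
            return parsed
```

The trade-off is that concurrent misses may each parse the payload once; only one result is kept, and no request waits on another's parse.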
### Requirement: Process-Level Cache Policies MUST Stay Consistent Across Services
All service-local process caches MUST support bounded capacity with deterministic eviction.
#### Scenario: Realtime equipment cache growth
- **WHEN** realtime equipment process cache reaches configured capacity
- **THEN** entries MUST be evicted according to deterministic LRU behavior

@@ -24,3 +24,17 @@ Runbooks and deployment documentation MUST describe the same conda/systemd/watch
- **WHEN** an operator performs deploy, health check, and rollback from documentation
- **THEN** documented commands and paths MUST work without requiring venv-specific assumptions
### Requirement: Runtime Path Drift SHALL Be Detectable Before Service Start
Service startup checks MUST validate configured conda runtime paths across app, watchdog, and worker control scripts.
#### Scenario: Conda path mismatch detected
- **WHEN** startup validation finds runtime path inconsistency between configured units and scripts
- **THEN** service start MUST fail with actionable diagnostics instead of running with partial mismatch
### Requirement: Conda/Systemd Contract SHALL Be Versioned in Operations Docs
The documented runtime contract MUST include versioned path assumptions and verification commands.
#### Scenario: Operator verifies deployment contract
- **WHEN** operator follows runbook validation steps
- **THEN** commands MUST confirm active runtime paths match documented conda/systemd contract

@@ -50,3 +50,17 @@ Frontend matrix/filter computations SHALL produce deterministic selection and fi
- **WHEN** users toggle matrix cells across group, family, and resource rows
- **THEN** selected-state rendering and filtered equipment result sets MUST remain level-correct and reversible
### Requirement: Reusable Browser Compute Modules SHALL Power Report Derivations
Derived computations for report filters, KPI cards, chart series, and table projections SHALL be implemented through reusable frontend modules.
#### Scenario: Shared report derivation logic
- **WHEN** multiple report pages require equivalent data-shaping behavior
- **THEN** pages MUST consume shared compute modules instead of duplicating transformation logic per page
### Requirement: Browser Compute Shift SHALL Preserve Export and Field Contracts
Moving computations to frontend MUST preserve existing field naming and export column contracts.
#### Scenario: User exports report after frontend-side derivation
- **WHEN** transformed data is rendered and exported
- **THEN** exported field names and ordering MUST remain consistent with governed field contract definitions

@@ -0,0 +1,19 @@
# maintainability-type-and-constant-hygiene Specification
## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Core Cache and Service Boundaries MUST Use Consistent Type Annotation Style
Core cache/service modules touched by this change SHALL use a consistent and explicit type-annotation style for public and internal helper boundaries.
#### Scenario: Reviewing updated cache/service modules
- **WHEN** maintainers inspect function signatures in affected modules
- **THEN** optional and collection types MUST follow a single consistent style and remain compatible with the project Python baseline
### Requirement: High-Frequency Magic Numbers MUST Be Replaced by Named Constants
Cache, throttling, and index-related numeric literals that control behavior MUST be extracted to named constants or env-configurable settings.
#### Scenario: Tuning cache/index behavior
- **WHEN** operators need to tune cache/index thresholds
- **THEN** they MUST find values in named constants or environment variables rather than scattered inline literals

@@ -0,0 +1,19 @@
# oracle-query-fragment-governance Specification
## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Shared Oracle Query Fragments SHALL Have a Single Source of Truth
Cross-service Oracle query fragments for resource and equipment cache loading MUST be defined in a shared module and imported by service implementations.
#### Scenario: Update common table/view reference
- **WHEN** a common table or view name changes
- **THEN** operators and developers MUST be able to update one shared definition without editing duplicated SQL literals across services
### Requirement: Service Queries MUST Preserve Existing Columns and Semantics
Services consuming shared Oracle query fragments SHALL preserve existing selected columns, filters, and downstream payload behavior.
#### Scenario: Resource and equipment cache refresh after refactor
- **WHEN** cache services execute queries via shared fragments
- **THEN** resulting payload structure MUST remain compatible with existing aggregation and API contracts

@@ -0,0 +1,26 @@
# resource-cache-representation-normalization Specification
## Purpose
TBD - created by archiving change residual-hardening-round4. Update Purpose after archive.
## Requirements
### Requirement: Resource Derived Index MUST Avoid Full Record Duplication
Resource derived index SHALL use lightweight row-position references instead of storing full duplicated record payloads alongside the process DataFrame cache.
#### Scenario: Build index from cached DataFrame
- **WHEN** resource cache data is parsed from Redis into process-level DataFrame
- **THEN** the derived index MUST store position-based references and metadata without a second full records copy
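
To make "position-based references" concrete, the sketch below indexes a list of row dictionaries, used here as a stand-in for the process-level DataFrame. The function names are hypothetical. With pandas, the same position lists could be passed to `DataFrame.iloc` to select rows without keeping a second copy of the records.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, List, Sequence


def build_position_index(
    rows: Sequence[Dict[str, Any]], column: str
) -> Dict[Hashable, List[int]]:
    """Map each value of column to the row positions where it occurs."""
    index: Dict[Hashable, List[int]] = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)


def select_rows(
    rows: Sequence[Dict[str, Any]],
    index: Dict[Hashable, List[int]],
    value: Hashable,
) -> List[Dict[str, Any]]:
    """Resolve positions back to rows; no row data is stored in the index."""
    return [rows[position] for position in index.get(value, [])]
```

Positions are only valid for the snapshot they were built from, so the index has to be rebuilt whenever the underlying rows are replaced.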
### Requirement: Resource Query APIs SHALL Preserve Existing Response Contract
Resource query APIs MUST keep existing output fields and semantics after index representation normalization.
#### Scenario: Read all resources after normalization
- **WHEN** callers request all resources or filtered resource lists
- **THEN** the returned payload MUST remain field-compatible with pre-normalization responses
### Requirement: Cache Invalidation MUST Keep Index/Data Coherent
The system SHALL invalidate and rebuild DataFrame/index representations atomically at cache refresh boundaries.
#### Scenario: Redis-backed cache refresh completes
- **WHEN** a new resource cache snapshot is published
- **THEN** stale index references MUST be invalidated before subsequent reads use refreshed DataFrame data

@@ -48,3 +48,47 @@ The system MUST expose machine-readable resilience thresholds, restart-churn ind
#### Scenario: Admin status includes restart churn summary
- **WHEN** operators call `/admin/api/system-status` or `/admin/api/worker/status`
- **THEN** responses MUST include bounded restart history summary within a configured time window and indicate whether churn threshold is exceeded
### Requirement: Recovery Recommendations SHALL Reflect Self-Healing Policy State
Health and admin resilience payloads MUST expose whether automated recovery is allowed, cooling down, or blocked by churn policy.
#### Scenario: Operator inspects degraded state
- **WHEN** `/health` or `/admin/api/worker/status` is requested during degradation
- **THEN** response MUST include policy state, cooldown remaining time, and next recommended action
### Requirement: Manual Recovery Override SHALL Be Explicit and Controlled
Manual restart actions MUST bypass the automatic block only through authenticated operator pathways with explicit acknowledgement.
#### Scenario: Churn-blocked state with manual override request
- **WHEN** authorized admin requests manual restart while auto-recovery is blocked
- **THEN** system MUST execute controlled restart path and log the override context for auditability
### Requirement: Circuit Breaker State Transitions SHALL Avoid Lock-Held Logging
Circuit breaker state transitions MUST avoid executing logger I/O while internal state locks are held.
#### Scenario: State transition occurs
- **WHEN** circuit breaker transitions between CLOSED, OPEN, or HALF_OPEN
- **THEN** lock-protected section MUST complete state mutation before emitting transition log output
#### Scenario: Slow log handler under load
- **WHEN** logger handlers are slow or blocked
- **THEN** circuit breaker lock contention MUST remain bounded and MUST NOT serialize unrelated request paths behind logging latency
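
A minimal sketch of the lock discipline: the transition is captured while the lock is held, and the log call happens after release. `CircuitBreaker` here is a simplified illustration with an assumed failure threshold; it omits HALF_OPEN timing entirely.

```python
import logging
import threading
from typing import Optional, Tuple

logger = logging.getLogger("circuit_breaker")


class CircuitBreaker:
    """Simplified breaker showing mutation under lock, logging outside it."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self._threshold = failure_threshold
        self._lock = threading.Lock()
        self._state = "CLOSED"
        self._failures = 0

    @property
    def state(self) -> str:
        with self._lock:
            return self._state

    def record_failure(self) -> None:
        transition: Optional[Tuple[str, str]] = None
        with self._lock:
            self._failures += 1
            if self._state != "OPEN" and self._failures >= self._threshold:
                transition = (self._state, "OPEN")
                self._state = "OPEN"
        # Lock is released here; a slow handler cannot stall other callers.
        if transition is not None:
            logger.warning("circuit breaker %s -> %s", *transition)

    def record_success(self) -> None:
        transition: Optional[Tuple[str, str]] = None
        with self._lock:
            self._failures = 0
            if self._state != "CLOSED":
                transition = (self._state, "CLOSED")
                self._state = "CLOSED"
        if transition is not None:
            logger.info("circuit breaker %s -> %s", *transition)
```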
### Requirement: Health Endpoints SHALL Use Short Internal Memoization
Health and deep-health computation SHALL use a short-lived internal cache to prevent probe storms from amplifying backend load.
#### Scenario: Frequent monitor scrapes
- **WHEN** health endpoints are called repeatedly within a small window
- **THEN** service SHALL return memoized payload for up to 5 seconds in non-testing environments
#### Scenario: Testing mode
- **WHEN** app is running in testing mode
- **THEN** health endpoint memoization MUST be bypassed to preserve deterministic tests
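
The two scenarios can be sketched as a small TTL wrapper around the health computation. `HealthMemo`, the `testing` flag, and the injectable clock are assumptions; only the 5-second window comes from the scenario above.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple

HEALTH_MEMO_TTL_SECONDS = 5.0


class HealthMemo:
    """Short-lived memoization of a health payload, bypassed in testing mode."""

    def __init__(
        self,
        compute: Callable[[], Dict[str, Any]],
        ttl_seconds: float = HEALTH_MEMO_TTL_SECONDS,
        testing: bool = False,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute
        self._ttl = ttl_seconds
        self._testing = testing
        self._clock = clock
        self._lock = threading.Lock()
        self._entry: Optional[Tuple[float, Dict[str, Any]]] = None  # (expires_at, payload)

    def get(self) -> Dict[str, Any]:
        if self._testing:
            return self._compute()  # always fresh, keeps tests deterministic
        with self._lock:
            if self._entry is not None and self._clock() < self._entry[0]:
                return self._entry[1]
        payload = self._compute()  # computed without holding the lock
        with self._lock:
            self._entry = (self._clock() + self._ttl, payload)
        return payload
```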
### Requirement: Logs MUST Redact Connection Secrets
Runtime logs MUST avoid exposing DB connection credentials.
#### Scenario: Connection string appears in log message
- **WHEN** a log message contains DB URL credentials
- **THEN** logger output MUST redact password and sensitive userinfo before emission
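
One possible implementation is a `logging.Filter` that rewrites the formatted message before handlers see it. The regular expression and the names `redact_credentials` and `RedactingFilter` are assumptions; this sketch masks the whole userinfo component of any URL-shaped string.

```python
import logging
import re

# Matches scheme://userinfo@ and captures the scheme so it can be kept.
_URL_USERINFO = re.compile(r"(?P<scheme>[A-Za-z][\w+.-]*://)[^/\s]+@")


def redact_credentials(text: str) -> str:
    """Replace URL userinfo (user and password) with a fixed mask."""
    return _URL_USERINFO.sub(r"\g<scheme>***@", text)


class RedactingFilter(logging.Filter):
    """Redacts credentials from the fully formatted log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so credentials passed as %-style args are covered too.
        record.msg = redact_credentials(record.getMessage())
        record.args = ()
        return True
```

A filter attached to a logger applies only to records logged directly on that logger, so attaching it to each handler is the more reliable way to cover all output.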

@@ -0,0 +1,38 @@
# security-surface-hardening Specification
## Purpose
TBD - created by archiving change security-stability-hardening-round2. Update Purpose after archive.
## Requirements
### Requirement: LDAP Authentication Endpoint Configuration SHALL Be Strictly Validated
The system MUST validate LDAP authentication endpoint configuration before use, including HTTPS scheme enforcement and host allowlist checks.
#### Scenario: Invalid LDAP URL configuration detected
- **WHEN** `LDAP_API_URL` is missing, non-HTTPS, or points to a host outside the configured allowlist
- **THEN** the service MUST reject LDAP authentication calls and emit actionable diagnostics without sending credentials to that endpoint
#### Scenario: Valid LDAP URL configuration accepted
- **WHEN** `LDAP_API_URL` uses HTTPS and host is allowlisted
- **THEN** LDAP authentication requests MAY proceed with normal timeout and error handling behavior
### Requirement: Security Response Headers SHALL Be Applied Globally
All HTTP responses MUST include baseline security headers suitable for dashboard and API traffic.
#### Scenario: Standard response emitted
- **WHEN** any route returns a response
- **THEN** response MUST include `Content-Security-Policy`, `X-Frame-Options`, `X-Content-Type-Options`, and `Referrer-Policy`
#### Scenario: Production transport hardening
- **WHEN** runtime environment is production
- **THEN** response MUST include `Strict-Transport-Security`
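
A framework-neutral sketch of the header set is shown below. The header names come from the scenarios above; every header value, and both function names, are placeholder assumptions that a real deployment would need to tune, the CSP in particular.

```python
from typing import Dict, MutableMapping


def security_headers(is_production: bool) -> Dict[str, str]:
    """Baseline headers for every response; values here are placeholders."""
    headers = {
        "Content-Security-Policy": "default-src 'self'",
        "X-Frame-Options": "DENY",
        "X-Content-Type-Options": "nosniff",
        "Referrer-Policy": "strict-origin-when-cross-origin",
    }
    if is_production:
        # HSTS only makes sense when the site is served over HTTPS.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return headers


def apply_security_headers(
    response_headers: MutableMapping[str, str], is_production: bool
) -> MutableMapping[str, str]:
    """Add baseline headers without overwriting values a route set explicitly."""
    for name, value in security_headers(is_production).items():
        response_headers.setdefault(name, value)
    return response_headers
```

In a web framework this would typically be called from a single after-response hook so that no route can be missed.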
### Requirement: Pagination Input Boundaries SHALL Be Enforced
Endpoints accepting pagination parameters MUST enforce lower and upper bounds before query execution.
#### Scenario: Negative or zero pagination inputs
- **WHEN** client sends `page <= 0` or `page_size <= 0`
- **THEN** server MUST normalize values to minimum supported bounds
#### Scenario: Excessive page size requested
- **WHEN** client sends `page_size` above configured maximum
- **THEN** server MUST clamp to maximum supported page size
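
Both pagination scenarios reduce to a clamp applied before the query runs. In the sketch below the constants and `normalize_pagination` are hypothetical; unparseable input falls back to defaults, which is an additional assumption beyond what the scenarios state.

```python
from typing import Any, Tuple

# Hypothetical bounds; the spec only requires that they be configured.
MIN_PAGE = 1
MIN_PAGE_SIZE = 1
DEFAULT_PAGE_SIZE = 50
MAX_PAGE_SIZE = 200


def _to_int(raw: Any, fallback: int) -> int:
    try:
        return int(raw)
    except (TypeError, ValueError):
        return fallback


def normalize_pagination(
    page: Any, page_size: Any, max_page_size: int = MAX_PAGE_SIZE
) -> Tuple[int, int]:
    """Clamp page to at least MIN_PAGE and page_size into [MIN_PAGE_SIZE, max_page_size]."""
    page_value = max(MIN_PAGE, _to_int(page, MIN_PAGE))
    size_value = _to_int(page_size, DEFAULT_PAGE_SIZE)
    size_value = min(max_page_size, max(MIN_PAGE_SIZE, size_value))
    return page_value, size_value
```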

@@ -0,0 +1,26 @@
# worker-self-healing-governance Specification
## Purpose
TBD - created by archiving change p2-ops-self-healing-runbook. Update Purpose after archive.
## Requirements
### Requirement: Automated Worker Recovery SHALL Use Bounded Policy Guards
Automated worker restart behavior MUST enforce cooldown periods and bounded restart attempts within a configurable time window.
#### Scenario: Repeated worker degradation within short window
- **WHEN** degradation events exceed configured restart-attempt budget
- **THEN** automated restarts MUST pause and surface a blocked-recovery signal for operator intervention
### Requirement: Restart-Churn Protection SHALL Prevent Recovery Storms
The runtime MUST classify restart churn and prevent uncontrolled restart loops.
#### Scenario: Churn threshold exceeded
- **WHEN** restart count crosses churn threshold in active window
- **THEN** watchdog MUST enter guarded mode and require explicit manual override before further restart attempts
### Requirement: Recovery Decisions SHALL Be Audit-Ready
Every auto-recovery decision and manual override action MUST be recorded with structured metadata.
#### Scenario: Worker restart decision emitted
- **WHEN** system executes or denies a restart action
- **THEN** structured logs/events MUST include reason, thresholds, actor/source, and resulting state
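
The three requirements in this specification can be sketched together as one policy object that decides whether a restart may proceed and returns a structured record of the decision. `RestartPolicy`, its parameters, and the state names are assumptions for illustration.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Dict


class RestartPolicy:
    """Cooldown plus bounded restart budget within a sliding window."""

    def __init__(
        self,
        max_restarts: int,
        window_seconds: float,
        cooldown_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_restarts
        self._window = window_seconds
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._history: Deque[float] = deque()  # timestamps of executed restarts

    def decide(self, manual_override: bool = False, actor: str = "watchdog") -> Dict[str, Any]:
        """Return an audit-ready decision; records the restart when allowed."""
        now = self._clock()
        while self._history and now - self._history[0] >= self._window:
            self._history.popleft()
        churn_exceeded = len(self._history) >= self._max
        cooldown_remaining = 0.0
        if self._history:
            cooldown_remaining = max(0.0, self._history[-1] + self._cooldown - now)

        if manual_override:
            allowed, state = True, "manual_override"
        elif churn_exceeded:
            allowed, state = False, "blocked"
        elif cooldown_remaining > 0:
            allowed, state = False, "cooldown"
        else:
            allowed, state = True, "allowed"
        if allowed:
            self._history.append(now)
        return {
            "allowed": allowed,
            "state": state,
            "actor": actor,
            "cooldown_remaining": cooldown_remaining,
            "restarts_in_window": len(self._history),
            "thresholds": {
                "max_restarts": self._max,
                "window_seconds": self._window,
                "cooldown_seconds": self._cooldown,
            },
        }
```

The returned dictionary can be written to a structured log as-is, which covers both executed and denied restart decisions.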