DashBoard/design.md at 7cb0985b1228dcb69f33c865827658605a843cac

Files

egg 7cb0985b12 feat(modernization): full architecture blueprint with hardening follow-up

Implement phased modernization infrastructure for transitioning from
multi-page legacy routing to SPA portal-shell architecture, plus
post-delivery hardening fixes for policy loading, fallback consistency,
and governance drift detection.

Key changes:
- Add route contract enrichment with scope/visibility/compatibility policies
- Canonical 302 redirects from legacy direct-entry to /portal-shell/ routes
- Asset readiness enforcement and runtime fallback retirement for in-scope routes
- Shared feature-flag helpers (env > config > default) replacing duplicated _to_bool
- Defensive copy for lru_cached policy payloads preventing mutation corruption
- Unified retired-fallback response helper across app and blueprint routes
- Frontend/backend route-contract cross-validation in governance gates
- Shell CSS token fallback values for routes rendered outside shell scope
- Local-safe .env.example defaults with production recommendation comments
- Legacy contract fallback warning logging and single-hop redirect optimization

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-12 11:26:02 +08:00

6.8 KiB

Raw Blame History

Context

Phase 1 modernization was completed and archived, but post-delivery review identified several hardening gaps across policy loading, fallback behavior consistency, feature-flag ergonomics, and route-governance drift detection. The gaps are cross-cutting: backend route hosts, frontend contract metadata, CI governance checks, and operator onboarding defaults.

Current risks include:

shared mutable cached policy payloads via lru_cache return values,
inconsistent retired-fallback 503 response surfaces between app routes and blueprint routes,
local bootstrap failures caused by strict .env.example defaults,
duplicated boolean feature-flag parsing with slightly different precedence logic,
missing frontend/backend route-inventory cross-validation.

Goals / Non-Goals

Goals:

Make modernization policy loading deterministic, mutation-safe, and explicitly documented.
Standardize retired-fallback error response behavior across all in-scope route hosts.
Keep .env.example local-safe while documenting production hardening expectations.
Centralize feature-flag resolution semantics in shared helpers.
Enforce route-contract parity across backend artifacts and frontend shell contracts.
Improve operational observability for legacy contract fallback loading.

Non-Goals:

No new route migrations in this change.
No redesign of shell navigation UX.
No deferred-route modernization implementation (/tables, /excel-query, /query-tool, /mid-section-defect).
No runtime hot-reload framework for all policy artifacts.

Decisions

Decision 1: Introduce shared feature-flag helpers for bool parsing and env/config precedence

Choice: Add shared helpers (for example parse_bool, resolve_bool_flag) used by app/policy/runtime modules.
Rationale: eliminates duplicated parsing and precedence drift.
Alternative considered: keep local _to_bool and alignment-by-convention.
Why not alternative: high regression risk from future divergence and incomplete audits.

Decision 2: Protect cached policy payloads from cross-caller mutation

Choice: keep internal cache, but return defensive copies to callers; document refresh semantics and expose explicit cache-clear helper for tests/controlled refresh points.
Rationale: avoids shared-reference corruption without changing call sites.
Alternative considered: return MappingProxyType.
Why not alternative: nested list/dict payloads still remain mutable unless deeply frozen.

Decision 3: Unify retired-fallback response generation

Choice: move to a shared fallback-retirement response helper callable from both app-level and blueprint-level routes.
Rationale: consistent status/template/body contract and easier testing.
Alternative considered: leave blueprint-specific inline HTML responses.
Why not alternative: inconsistent user/operator behavior and duplicated logic.

Decision 4: Rebalance `.env.example` for safe local onboarding

Choice: set strict modernization toggles to local-safe defaults and annotate production-recommended values inline.
Rationale: avoid false-negative startup failures in local/test environments while preserving explicit production guidance.
Alternative considered: keep strict defaults and require all local users to override manually.
Why not alternative: unnecessary onboarding friction and frequent bootstrap failures.

Decision 5: Add governance parity checks across frontend and backend route contracts

Choice: extend governance checks/tests to compare backend route contract artifacts with frontend route inventory/scope metadata.
Rationale: catches silent drift before release.
Alternative considered: rely only on backend JSON consistency.
Why not alternative: frontend contract drift can still break runtime behavior silently.

Decision 6: Emit explicit warning when legacy contract source is used

Choice: log warning when loader falls back from primary contract artifact to legacy path.
Rationale: improves observability during migration tail.
Alternative considered: silent fallback.
Why not alternative: hard to detect stale-source dependency in production.

Decision 7: Reduce unnecessary redirect hops in `/hold-detail` missing-reason flow

Choice: when SPA shell mode is enabled, redirect directly to canonical shell overview path.
Rationale: reduces redirect chain complexity and improves deterministic route tracing.
Alternative considered: keep current two-hop behavior.
Why not alternative: no benefit, adds trace/debug noise.

Decision 8: Add token fallbacks for shell-dependent route styles

Choice: where route-local CSS consumes shell variables, include fallback values in var(--token, fallback) form.
Rationale: prevents degraded rendering when route is rendered outside shell variable scope.
Alternative considered: assume shell-only render path.
Why not alternative: fallback/compatibility entry paths still exist in this phase.

Risks / Trade-offs

[Risk] Shared helper refactor may alter existing truthy/falsey behavior in edge env values.
- Mitigation: add unit tests covering canonical and malformed env values before replacing call sites.
[Risk] Contract parity gate can fail current CI if artifacts are already drifted.
- Mitigation: land parity test with synchronized artifacts in same change.
[Risk] Defensive-copy strategy adds minor per-call overhead.
- Mitigation: policy payloads are small and low-frequency; prioritize correctness over micro-optimization.
[Risk] .env.example default changes may be interpreted as weaker production stance.
- Mitigation: add explicit production recommendation comments next to each local-safe default.

Migration Plan

Add shared feature-flag helpers and migrate existing bool parsing call sites.
Refactor modernization policy cache-return behavior to mutation-safe contract and document refresh semantics.
Introduce shared retired-fallback response helper and migrate hold-overview/hold-history/hold-detail route handlers.
Update .env.example defaults and production guidance comments.
Extend governance script/tests for frontend/backend route-contract parity.
Add warning log on legacy contract-source fallback.
Update /hold-detail missing-reason redirect to single-hop canonical target under SPA mode.
Add fallback values for QC-GATE shell-derived CSS variables.
Run targeted unit/integration/e2e + governance checks.

Rollback strategy:

Changes are config/code-level and can be reverted by standard git rollback.
If parity gate causes unexpected release blocking, gate can temporarily run in warning mode while drift is fixed in same release window.

Open Questions

Should policy cache refresh be strictly restart-based, or do we want an operator-triggered cache-clear hook in production later?
Do we want a single centralized governance artifact as source-of-truth long-term, with generated frontend/backend contract outputs?

6.8 KiB Raw Blame History