feat: dataset cache for hold/resource history + slow connection migration

Two changes combined:

1. historical-query-slow-connection: Migrate all historical query pages
   to read_sql_df_slow with semaphore concurrency control (max 3),
   raise DB slow timeout to 300s, gunicorn timeout to 360s, and
   unify frontend timeouts to 360s for all historical pages.

2. hold-resource-history-dataset-cache: Convert hold-history and
   resource-history from multi-query to single-query + dataset cache
   pattern (L1 ProcessLevelCache + L2 Redis parquet/base64, TTL=900s).
   Replace old GET endpoints with POST /query + GET /view two-phase
   API. Frontend auto-retries on 410 cache_expired.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
egg
2026-02-25 13:15:02 +08:00
parent cd061e0cfd
commit 71c8102de6
64 changed files with 3806 additions and 1442 deletions

View File

@@ -6,9 +6,9 @@ threads = int(os.getenv("GUNICORN_THREADS", "4"))
worker_class = "gthread"
# Timeout settings - critical for dashboard stability.
# Keep this above slow-query timeout paths (e.g. query-tool 120s) and DB pool timeout.
timeout = int(os.getenv("GUNICORN_TIMEOUT", "130"))
graceful_timeout = int(os.getenv("GUNICORN_GRACEFUL_TIMEOUT", "60"))
# Keep this above slow-query timeout paths (e.g. read_sql_df_slow 300s) and DB pool timeout.
timeout = int(os.getenv("GUNICORN_TIMEOUT", "360"))
graceful_timeout = int(os.getenv("GUNICORN_GRACEFUL_TIMEOUT", "120"))
keepalive = 5 # Keep-alive connections timeout
# Worker lifecycle management - prevent state accumulation.