feat: dataset cache for hold/resource history + slow connection migration

Two changes combined:

1. historical-query-slow-connection: Migrate all historical query pages to read_sql_df_slow with semaphore concurrency control (max 3), raise the DB slow timeout to 300s and the gunicorn timeout to 360s, and unify frontend timeouts to 360s for all historical pages.

2. hold-resource-history-dataset-cache: Convert hold-history and resource-history from multi-query to a single-query + dataset cache pattern (L1 ProcessLevelCache + L2 Redis parquet/base64, TTL=900s). Replace the old GET endpoints with a two-phase POST /query + GET /view API. The frontend auto-retries on 410 cache_expired.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
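
For readers unfamiliar with the pattern, the "semaphore concurrency control (max 3)" mentioned above can be sketched roughly as follows. This is an illustration only, not the code from this commit: `run_slow_query`, `_slow_slots`, and the choice to reuse the 300s figure as the slot-acquire timeout are assumptions, and the real `read_sql_df_slow` is not shown in this diff.

```python
import threading

# Hypothetical names; only the numbers come from the commit message.
MAX_SLOW_QUERIES = 3        # "max 3" concurrent slow queries
SLOW_QUERY_TIMEOUT_S = 300  # DB slow timeout; reused here as the wait limit (assumption)

# BoundedSemaphore raises ValueError if released more often than acquired,
# which helps catch slot-accounting bugs early.
_slow_slots = threading.BoundedSemaphore(MAX_SLOW_QUERIES)


def run_slow_query(execute, acquire_timeout=SLOW_QUERY_TIMEOUT_S):
    """Call execute() while holding one of the slow-query slots."""
    if not _slow_slots.acquire(timeout=acquire_timeout):
        raise TimeoutError("no slow-query slot became available")
    try:
        return execute()
    finally:
        # Always hand the slot back, even if the query raised.
        _slow_slots.release()
```

A caller would wrap the actual database call, for example `run_slow_query(lambda: fetch_history(params))`, where `fetch_history` stands in for whatever function issues the query. Keeping the gunicorn timeout (360s) above the slow-query timeout (300s) leaves the worker time to return an error response instead of being killed mid-request.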
2026-02-25 13:15:02 +08:00
parent cd061e0cfd
commit 71c8102de6
64 changed files with 3806 additions and 1442 deletions
--- a/gunicorn.conf.py
+++ b/gunicorn.conf.py
@@ -6,9 +6,9 @@ threads = int(os.getenv("GUNICORN_THREADS", "4"))
 worker_class = "gthread"

 # Timeout settings - critical for dashboard stability.
-# Keep this above slow-query timeout paths (e.g. query-tool 120s) and DB pool timeout.
-timeout = int(os.getenv("GUNICORN_TIMEOUT", "130"))
-graceful_timeout = int(os.getenv("GUNICORN_GRACEFUL_TIMEOUT", "60"))
+# Keep this above slow-query timeout paths (e.g. read_sql_df_slow 300s) and DB pool timeout.
+timeout = int(os.getenv("GUNICORN_TIMEOUT", "360"))
+graceful_timeout = int(os.getenv("GUNICORN_GRACEFUL_TIMEOUT", "120"))
 keepalive = 5         # Keep-alive connections timeout

 # Worker lifecycle management - prevent state accumulation.