DashBoard/openspec/changes/archive/trace-streaming-response/tasks.md
dbe0da057c feat(trace-pipeline): memory triage, async job queue, and NDJSON streaming
Three proposals addressing the 2026-02-25 trace pipeline OOM crash (114K CIDs):

1. trace-events-memory-triage: fetchmany iterator (read_sql_df_slow_iter),
   admission control (50K CID limit for non-MSD), cache skip for large queries,
   early memory release with gc.collect()

2. trace-async-job-queue: RQ-based async jobs for queries >20K CIDs,
   separate worker process with isolated memory, frontend polling via
   useTraceProgress composable, systemd service + deploy scripts

3. trace-streaming-response: chunked Redis storage (TRACE_STREAM_BATCH_SIZE=5000),
   NDJSON stream endpoint (GET /api/trace/job/<id>/stream), frontend
   ReadableStream consumer for progressive rendering, backward-compatible
   with legacy single-key storage

All three proposals archived. 1101 tests pass, frontend builds clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:01:27 +08:00


1. EventFetcher Iterator Mode

  • 1.1 Add fetch_events_iter(container_ids, domain, batch_size) static method to EventFetcher class: yields Dict[str, List[Dict]] batches using read_sql_df_slow_iter
  • 1.2 Add unit tests for fetch_events_iter (mock read_sql_df_slow_iter, verify batch yields)
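
  Task 1.1 might be sketched as below. The table name, query shape, and the `read_sql_df_slow_iter` signature are assumptions; the real helper is assumed to yield pandas DataFrames of at most `batch_size` rows, modelled here as lists of row dicts.

  ```python
  from typing import Dict, Iterable, Iterator, List


  def read_sql_df_slow_iter(sql: str, params: tuple, batch_size: int) -> Iterable[List[dict]]:
      """Placeholder for the project's fetchmany-based iterator (assumed API)."""
      raise NotImplementedError("wired to the real DB layer in the project")


  class EventFetcher:
      @staticmethod
      def fetch_events_iter(
          container_ids: List[str], domain: str, batch_size: int = 5000
      ) -> Iterator[Dict[str, List[dict]]]:
          """Yield {container_id: [event, ...]} batches without materialising
          the full result set in memory."""
          placeholders = ", ".join(["%s"] * len(container_ids))
          sql = (
              "SELECT * FROM trace_events "  # table name is illustrative
              f"WHERE domain = %s AND container_id IN ({placeholders})"
          )
          params = (domain, *container_ids)
          for rows in read_sql_df_slow_iter(sql, params, batch_size=batch_size):
              batch: Dict[str, List[dict]] = {}
              for row in rows:
                  batch.setdefault(row["container_id"], []).append(row)
              yield batch
  ```

  The unit test in 1.2 would then mock `read_sql_df_slow_iter` and assert one grouped dict is yielded per underlying batch.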

2. NDJSON Stream Endpoint

  • 2.1 Add GET /api/trace/job/<job_id>/stream endpoint: returns Content-Type: application/x-ndjson with Flask Response(generate(), mimetype='application/x-ndjson')
  • 2.2 Implement NDJSON generator: yield meta → domain_start → records batches → domain_end → aggregation → complete lines
  • 2.3 Add TRACE_STREAM_BATCH_SIZE env var (default 5000)
  • 2.4 Modify execute_trace_events_job() to store results in chunked Redis keys: trace:job:{job_id}:result:{domain}:{chunk_idx}
  • 2.5 Add unit tests for NDJSON stream endpoint
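
  A minimal sketch of the generator from 2.2, with the event-type names taken from that task. The `get_chunk` callable is a stand-in for reading the chunked Redis keys from 2.4 (`trace:job:{job_id}:result:{domain}:{chunk_idx}`); its signature is an assumption, as is the exact JSON shape of each line.

  ```python
  import json
  import os

  # Default mirrors task 2.3; the real app is assumed to read this once at startup.
  TRACE_STREAM_BATCH_SIZE = int(os.environ.get("TRACE_STREAM_BATCH_SIZE", 5000))


  def generate_ndjson(job_id, domains, get_chunk, aggregation=None):
      """Yield one JSON document per line (NDJSON).

      get_chunk(domain, idx) returns the list of records stored in chunk idx
      for that domain, or None once the domain's chunks are exhausted.
      """
      yield json.dumps({"type": "meta", "job_id": job_id, "domains": domains}) + "\n"
      for domain in domains:
          yield json.dumps({"type": "domain_start", "domain": domain}) + "\n"
          idx = 0
          while (chunk := get_chunk(domain, idx)) is not None:
              yield json.dumps({"type": "records", "domain": domain, "records": chunk}) + "\n"
              idx += 1
          yield json.dumps({"type": "domain_end", "domain": domain}) + "\n"
      if aggregation is not None:
          yield json.dumps({"type": "aggregation", "data": aggregation}) + "\n"
      yield json.dumps({"type": "complete"}) + "\n"


  # Flask wiring for task 2.1 (sketch):
  #   @app.route("/api/trace/job/<job_id>/stream")
  #   def stream_trace_job(job_id):
  #       return Response(generate_ndjson(job_id, ...),
  #                       mimetype="application/x-ndjson")
  ```

  Streaming a generator through `Response` lets Flask flush each line as it is produced, so the frontend can render before the last domain is read from Redis.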

3. Result Pagination API

  • 3.1 Enhance GET /api/trace/job/<job_id>/result with domain, offset, limit query params
  • 3.2 Implement pagination over chunked Redis keys
  • 3.3 Add unit tests for pagination (offset/limit boundary cases)
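
  Pagination over chunked keys (3.2) reduces to mapping an (offset, limit) window onto chunk indices so only overlapping chunks are fetched. A sketch, again with a hypothetical `get_chunk` accessor standing in for the Redis reads:

  ```python
  def paginate_chunks(get_chunk, chunk_size, offset, limit):
      """Return up to `limit` records starting at `offset`.

      get_chunk(idx) -> list of records for chunk idx, or None if the chunk
      does not exist (assumed accessor over the chunked Redis keys).
      Only chunks overlapping [offset, offset + limit) are read.
      """
      out = []
      idx = offset // chunk_size      # first chunk touching the window
      skip = offset % chunk_size      # records to drop inside that chunk
      while len(out) < limit:
          chunk = get_chunk(idx)
          if chunk is None:
              break
          out.extend(chunk[skip : skip + (limit - len(out))])
          skip = 0
          idx += 1
      return out
  ```

  The 3.3 boundary cases fall out naturally: an offset inside a chunk, a window spanning two chunks, and an offset past the last chunk (empty result).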

4. Frontend Streaming Consumer

  • 4.1 Add consumeNDJSONStream(url, onChunk) utility using ReadableStream
  • 4.2 Modify useTraceProgress.js: for async jobs, prefer stream endpoint over full result endpoint
  • 4.3 Add progressive rendering: update table data as each NDJSON batch arrives
  • 4.4 Add error handling: stream interruption, malformed NDJSON lines
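
  Tasks 4.1 and 4.4 together might look like the sketch below. The function and parameter names come from 4.1; the `fetchImpl` parameter and the exact buffering scheme are additions here (a line can be split across reads, so a trailing partial line is buffered until the next chunk arrives).

  ```javascript
  // Progressive NDJSON consumer: calls onChunk(parsedObject) for every
  // complete line as bytes arrive, instead of waiting for the full body.
  async function consumeNDJSONStream(url, onChunk, fetchImpl = fetch) {
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`stream request failed: ${res.status}`);
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buf = "";

    const emit = (line) => {
      if (!line.trim()) return;
      try {
        onChunk(JSON.parse(line));
      } catch {
        // Task 4.4: skip malformed lines rather than aborting the stream.
        console.warn("skipping malformed NDJSON line:", line);
      }
    };

    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      buf += decoder.decode(value, { stream: true });
      const lines = buf.split("\n");
      buf = lines.pop(); // keep any trailing partial line for the next read
      lines.forEach(emit);
    }
    emit(buf + decoder.decode()); // flush a final unterminated line, if any
  }
  ```

  In `useTraceProgress.js`, each `records` document from `onChunk` can be appended to the table's reactive data, giving the progressive rendering described in 4.3.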

5. Deployment

  • 5.1 Update .env.example: add TRACE_STREAM_BATCH_SIZE with description
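
  The `.env.example` entry from 5.1 might read as follows (the default mirrors task 2.3; the comment wording is a suggestion):

  ```shell
  # Records stored per chunked Redis key and emitted per NDJSON records line.
  # Larger values mean fewer Redis round-trips but bigger per-batch memory use.
  TRACE_STREAM_BATCH_SIZE=5000
  ```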

6. Verification

  • 6.1 Run python -m pytest tests/ -v — all existing tests pass
  • 6.2 Run cd frontend && npm run build — frontend builds successfully
  • 6.3 Manual test: verify NDJSON stream produces valid output for multi-domain query