Metrics Contract (v1)

Date: 2026-03-04
Endpoint: GET /v1/admin/metrics

Stability Policy

  • Metric names below are stable for v1 dashboards/alerts.
  • New metrics may be added without breaking changes.
  • Existing metric name removal/rename requires a documented migration.
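One way a dashboard or alerting pipeline could guard against an accidental rename is to check a response against the v1 names. The sketch below is illustrative only: it assumes the endpoint returns a flat JSON object keyed by metric name (the contract does not state the response shape), and `missing_v1_metrics` is a hypothetical helper, not part of the service.

```python
# Metric names copied from the "Metrics" list of this contract.
V1_METRIC_NAMES = frozenset({
    "retention_runs",
    "retention_failures",
    "retention_last_purged_diffs",
    "retention_last_duration_ms",
    "github_publish_failures",
    "window_minutes",
    "retention_runs_window",
    "retention_failures_window",
    "github_publish_failures_window",
    "webhook_deliveries_window",
    "webhook_accepted_window",
    "webhook_deduped_window",
    "webhook_idempotency_conflicts_window",
    "webhook_ignored_window",
    "scan_jobs_queued",
    "scan_jobs_running",
    "scan_jobs_dead_letter",
})


def missing_v1_metrics(payload: dict) -> list:
    """Return the v1 names absent from the payload, sorted.

    Assumption: payload is the decoded JSON body as a flat dict.
    Extra keys are ignored, since new metrics may be added without
    breaking the contract.
    """
    return sorted(V1_METRIC_NAMES - set(payload))
```

An empty result means every stable name is present; a non-empty result points at a removal or rename that should have come with a documented migration.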

Metrics

  • retention_runs (int): Total number of retention cleanup runs attempted by this process.
  • retention_failures (int): Total number of retention cleanup failures in this process.
  • retention_last_purged_diffs (int): Number of diff blobs purged in the most recent successful run.
  • retention_last_duration_ms (int): Duration of the most recent cleanup run attempt, in milliseconds.
  • github_publish_failures (int): Total GitHub check publish failures observed in this process.
  • window_minutes (int): Aggregation window applied to distributed DB-backed metrics.
  • retention_runs_window (int): Count of retention.cleanup_run audit events in the window.
  • retention_failures_window (int): Count of retention.cleanup_failed audit events in the window.
  • github_publish_failures_window (int): Count of github_check.publish_failed audit events in the window.
  • webhook_deliveries_window (int): Count of dedupe-tracked webhook deliveries recorded in the window.
  • webhook_accepted_window (int): Count of accepted GitHub webhook submissions in the window.
  • webhook_deduped_window (int): Count of deduped GitHub webhook deliveries in the window.
  • webhook_idempotency_conflicts_window (int): Count of webhook idempotency conflicts in the window.
  • webhook_ignored_window (int): Count of ignored GitHub webhook actions in the window.
  • scan_jobs_queued (int): Current queued scan-job count from the shared store.
  • scan_jobs_running (int): Current running scan-job count from the shared store.
  • scan_jobs_dead_letter (int): Current dead-letter scan-job count from the shared store.
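Because every *_window count shares the same window_minutes, the counts can be normalized to per-minute rates so that values stay comparable if the window length changes. The following is a minimal sketch, assuming the response is a flat JSON object keyed by metric name (not confirmed by this contract); `per_minute_rates` is a hypothetical helper.

```python
def per_minute_rates(payload: dict) -> dict:
    """Convert each *_window count into an average per-minute rate.

    Assumption: payload is the decoded response as a flat dict that
    contains window_minutes alongside the *_window counts.
    """
    window = payload["window_minutes"]
    if window <= 0:
        raise ValueError("window_minutes must be positive")
    return {
        name: value / window
        for name, value in payload.items()
        if name.endswith("_window")
    }


# Example values are made up for illustration.
sample = {
    "window_minutes": 60,
    "webhook_deliveries_window": 120,
    "webhook_deduped_window": 30,
    "retention_runs": 5,  # process-local total, not windowed: skipped
}
rates = per_minute_rates(sample)
```

Process-local totals such as retention_runs are left out on purpose, since they are not bounded by the window.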

Prometheus /metrics additions:

  • diffver_http_requests_total (counter by method, path, status_code)
  • diffver_http_request_duration_ms (histogram by method, path, status_code)
  • diffver_scan_queue_delay_ms (histogram)
  • diffver_scan_processing_duration_ms (histogram by outcome)
  • diffver_scan_end_to_end_duration_ms (histogram by outcome)
  • diffver_scan_completed_total (counter by outcome)
  • diffver_worker_polls_total (counter by outcome)
  • diffver_worker_requeue_total (counter)
  • diffver_scan_jobs_queued/running/dead_letter (gauges)
  • diffver_scan_jobs_dead_letter_total (counter)
  • diffver_github_publish_total (counter by stage, outcome)

Semantics Notes

  • Unless a metric is described as windowed or as read from the shared store, it is a process-local in-memory value.
  • Process-local counters reset on process restart.
  • Windowed distributed metrics are derived from shared DB records.
  • For fleet-wide dashboards, prefer /metrics scraping + central aggregation.
  • Histogram-based P50/P95 should be computed from /metrics in the telemetry backend.
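Two of the notes above have practical consequences for anyone consuming these values: a counter that resets on restart needs reset-aware differencing, and percentiles have to be estimated from cumulative histogram buckets. The sketch below illustrates both under stated assumptions. The sample values and bucket bounds are made up, both function names are hypothetical, and the interpolation is the usual linear-within-bucket estimate rather than anything this service is known to implement. A telemetry backend would normally do this work for you.

```python
import math


def counter_increase(samples):
    """Total increase across consecutive samples of one counter.

    A drop between two samples is treated as a process restart, so the
    new value is counted in full from zero.
    """
    total = 0
    for prev, curr in zip(samples, samples[1:]):
        total += curr - prev if curr >= prev else curr
    return total


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound,
    with the last bound equal to +inf. The lowest bucket is assumed to
    start at 0, which fits durations in milliseconds.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # No upper edge to interpolate toward in the +inf bucket,
            # and nothing to interpolate in an empty bucket.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Illustrative data only.
increase = counter_increase([10, 15, 3, 7])  # restart between 15 and 3
buckets = [(100.0, 50), (200.0, 90), (500.0, 100), (math.inf, 100)]
p50 = histogram_quantile(0.5, buckets)
p95 = histogram_quantile(0.95, buckets)
```

The estimate is only as fine as the bucket boundaries, which is one reason to compute P50/P95 centrally from /metrics rather than from process-local values.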