Metrics Contract (v1)
Date: 2026-03-04
Endpoint: GET /v1/admin/metrics
Stability Policy
- Metric names below are stable for v1 dashboards/alerts.
- New metrics may be added without breaking changes.
- Removing or renaming an existing metric requires a documented migration.
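The policy above can be turned into a small client-side contract check. This is a minimal sketch, assuming the endpoint returns a flat JSON object keyed by metric name (the payload shape is not specified in this document); `V1_METRIC_NAMES` and `missing_v1_metrics` are hypothetical names introduced for illustration.

```python
# Assumption: GET /v1/admin/metrics returns a flat JSON object whose keys
# are the metric names listed in this contract.
V1_METRIC_NAMES = frozenset({
    "retention_runs",
    "retention_failures",
    "retention_last_purged_diffs",
    "retention_last_duration_ms",
    "github_publish_failures",
    "window_minutes",
    "retention_runs_window",
    "retention_failures_window",
    "github_publish_failures_window",
    "webhook_deliveries_window",
    "webhook_accepted_window",
    "webhook_deduped_window",
    "webhook_idempotency_conflicts_window",
    "webhook_ignored_window",
    "scan_jobs_queued",
    "scan_jobs_running",
    "scan_jobs_dead_letter",
})


def missing_v1_metrics(payload: dict) -> list[str]:
    """Return the v1 metric names absent from the payload, sorted.

    Extra keys are deliberately ignored, because the policy allows new
    metrics to be added without a breaking change.
    """
    return sorted(V1_METRIC_NAMES - set(payload))
```

A dashboard smoke test could fail only when `missing_v1_metrics(payload)` is non-empty, so newly added metrics never trip it.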
Metrics
retention_runs (int) - Total number of retention cleanup runs attempted by this process.
retention_failures (int) - Total number of retention cleanup failures in this process.
retention_last_purged_diffs (int) - Number of diff blobs purged in the most recent successful run.
retention_last_duration_ms (int) - Duration of the most recent cleanup run attempt, in milliseconds.
github_publish_failures (int) - Total GitHub check publish failures observed in this process.
window_minutes (int) - Aggregation window applied to distributed DB-backed metrics.
retention_runs_window (int) - Count of retention.cleanup_run audit events in the window.
retention_failures_window (int) - Count of retention.cleanup_failed audit events in the window.
github_publish_failures_window (int) - Count of github_check.publish_failed audit events in the window.
webhook_deliveries_window (int) - Count of dedupe-tracked webhook deliveries recorded in the window.
webhook_accepted_window (int) - Count of accepted GitHub webhook submissions in the window.
webhook_deduped_window (int) - Count of deduped GitHub webhook deliveries in the window.
webhook_idempotency_conflicts_window (int) - Count of webhook idempotency conflicts in the window.
webhook_ignored_window (int) - Count of ignored GitHub webhook actions in the window.
scan_jobs_queued (int) - Current queued scan-job count from the shared store.
scan_jobs_running (int) - Current running scan-job count from the shared store.
scan_jobs_dead_letter (int) - Current dead-letter scan-job count from the shared store.
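Because every `*_window` metric is a count over `window_minutes`, a consumer can normalize them to per-minute rates so values stay comparable if the window changes. The sketch below assumes a flat JSON payload keyed by metric name; `per_minute_rates` is a hypothetical helper, not part of the service.

```python
def per_minute_rates(payload: dict) -> dict[str, float]:
    """Convert each *_window count into an average per-minute rate.

    Assumption: the payload is a flat mapping of metric name to int.
    Non-windowed metrics (process-local counters, scan-job gauges) are skipped.
    """
    window = payload["window_minutes"]
    if window <= 0:
        raise ValueError("window_minutes must be positive")
    return {
        name: value / window
        for name, value in payload.items()
        if name.endswith("_window")
    }
```

For example, 120 accepted webhooks over a 60-minute window comes out as an average of 2.0 per minute.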
Prometheus /metrics additions:
- diffver_http_requests_total (counter by method, path, status_code)
- diffver_http_request_duration_ms (histogram by method, path, status_code)
- diffver_scan_queue_delay_ms (histogram)
- diffver_scan_processing_duration_ms (histogram by outcome)
- diffver_scan_end_to_end_duration_ms (histogram by outcome)
- diffver_scan_completed_total (counter by outcome)
- diffver_worker_polls_total (counter by outcome)
- diffver_worker_requeue_total (counter)
- diffver_scan_jobs_queued/running/dead_letter (gauges)
- diffver_scan_jobs_dead_letter_total (counter)
- diffver_github_publish_total (counter by stage, outcome)
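For a quick look at these series without a full telemetry stack, the Prometheus text exposition output can be read with a small parser. This is a minimal sketch, not a complete implementation of the format: it does not handle label values that contain `}`. The sample text and the `outcome` label values (`success`, `failed`) are assumptions made up for illustration.

```python
import re

# Hypothetical scrape output; the numbers and outcome values are invented.
SAMPLE = """\
# HELP diffver_scan_completed_total Completed scans.
# TYPE diffver_scan_completed_total counter
diffver_scan_completed_total{outcome="success"} 41
diffver_scan_completed_total{outcome="failed"} 3
diffver_scan_jobs_queued 7
"""

# name, optional {labels}, then the sample value (a trailing timestamp is ignored).
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{([^}]*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (metric name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def completed_by_outcome(text: str) -> dict[str, float]:
    """Sum diffver_scan_completed_total per outcome label."""
    totals: dict[str, float] = {}
    for name, labels, value in parse_samples(text):
        if name == "diffver_scan_completed_total":
            outcome = labels.get("outcome", "")
            totals[outcome] = totals.get(outcome, 0.0) + value
    return totals
```

In production, a maintained Prometheus client or the scraper itself is a safer choice than hand-rolled parsing.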
Semantics Notes
- Unless noted otherwise, metrics are process-local in-memory counters.
- Process-local counters reset on process restart.
- Windowed distributed metrics are derived from shared DB records.
- For fleet-wide dashboards, prefer /metrics scraping + central aggregation.
- Histogram-based P50/P95 should be computed from /metrics in the telemetry backend.
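Since process-local counters reset on restart, anything that polls them over time has to treat a drop in value as a reset rather than a negative change. The sketch below shows one common way to do that, similar in spirit to how telemetry backends compute counter increases; `counter_increase` is a hypothetical helper and the sample values are invented.

```python
def counter_increase(samples: list[int]) -> int:
    """Total increase across successive readings of a resettable counter.

    A reading lower than the previous one is taken to mean the process
    restarted and the counter began again from zero, so the whole new
    reading counts as increase. Increments that happened between the last
    reading before a restart and the restart itself cannot be recovered.
    """
    total = 0
    previous = None
    for value in samples:
        if previous is not None:
            if value >= previous:
                total += value - previous
            else:
                total += value  # reset detected: counter restarted from zero
        previous = value
    return total
```

With readings of 5, 8, 2, 4 for `retention_runs`, the increase is 3 before the restart plus 2 and 2 after it, for a total of 7.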