Architecture¶
Scope¶
This document defines the v1 architecture for the Diff Verification + Injection Scan microservice.
Services¶
Current v1 runtime is a two-process deployment:
- api: FastAPI app for scan ingestion, governance APIs, webhook ingress, and admin endpoints.
- worker: polling worker that claims queued jobs, executes verification + injection analysis, writes artifacts, and updates GitHub checks.
Logical responsibilities are still separated into code modules (policy, verification, rules, github_checks, artifacts), but these modules are not deployed as independent services.
Deferred from v1 scope:
- billing-meter and other monetization microservices.
Storage¶
- Postgres:
  - tenants, users, api_keys
  - scans, scan_jobs, findings
  - policies, policy_exceptions
  - audit_events
  - webhook_deliveries (GitHub replay/dedup tracking)
  - diff_blobs (raw diff object-storage references + deletion markers)
- Object storage:
  - raw diffs (diffs/{scan_id}.patch)
  - signed artifacts (artifacts/{scan_id}.json)
  - optional HTML artifact views (artifacts/{scan_id}.html)
- Queue:
  - implemented in Postgres table scan_jobs
  - dead-letter state tracked as scan_jobs.status = dead_letter
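The object key layout and the dead-letter state above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the service's code: the `ScanJob` class, the `running` status, the `attempts` counter, and the `MAX_ATTEMPTS` retry budget are all assumptions, since the document only names the `queued` and `dead_letter` states and the key patterns.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3  # assumption: the retry budget is not specified in this document


@dataclass
class ScanJob:
    """Hypothetical in-memory stand-in for one row of scan_jobs."""
    scan_id: str
    status: str = "queued"  # queued -> running -> dead_letter (running is assumed)
    attempts: int = 0


def diff_key(scan_id: str) -> str:
    """Object-storage key for the raw diff of a scan."""
    return f"diffs/{scan_id}.patch"


def artifact_key(scan_id: str, html: bool = False) -> str:
    """Object-storage key for the signed artifact or its optional HTML view."""
    ext = "html" if html else "json"
    return f"artifacts/{scan_id}.{ext}"


def claim_next(jobs: list[ScanJob]) -> Optional[ScanJob]:
    """Claim the first queued job and mark it running."""
    for job in jobs:
        if job.status == "queued":
            job.status = "running"
            job.attempts += 1
            return job
    return None


def record_failure(job: ScanJob) -> None:
    """Requeue a failed job, or dead-letter it once the retry budget is spent."""
    job.status = "dead_letter" if job.attempts >= MAX_ATTEMPTS else "queued"
```

In a Postgres-backed queue with several workers, the claim step would need to be atomic. A common way to do that is a `SELECT ... FOR UPDATE SKIP LOCKED` inside the claiming transaction, though this document does not say which mechanism the worker uses.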
Request Lifecycle¶
- POST /v1/scans arrives at api-gateway.
- API validates token and tenant access.
- API persists scan row (queued) and enqueues scan_jobs.
- Worker claims job and runs deterministic verification + injection rules.
- Worker evaluates findings against active policy profile and applies suppression/baseline controls.
- Worker stores signed result artifact and marks scan completed/error.
- Worker and API update GitHub check state when GitHub source is used.
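The hand-off between API and worker can be illustrated with an in-memory sketch. Everything here is hypothetical: the dictionaries stand in for the scans table, the scan_jobs queue, and object storage, and `find_issues` is a placeholder rule rather than the real verification or injection logic. Token validation, policy evaluation, signing, and GitHub check updates are left out.

```python
import uuid

SCANS: dict[str, dict] = {}      # stand-in for the scans table
JOB_QUEUE: list[str] = []        # stand-in for scan_jobs
ARTIFACTS: dict[str, dict] = {}  # stand-in for object storage


def submit_scan(tenant_id: str, diff_text: str) -> str:
    """API side: persist the scan as queued and enqueue a job."""
    scan_id = uuid.uuid4().hex
    SCANS[scan_id] = {"tenant_id": tenant_id, "diff": diff_text, "status": "queued"}
    JOB_QUEUE.append(scan_id)
    return scan_id


def find_issues(diff_text: str) -> list[str]:
    """Placeholder rule: flag added lines that contain a suspicious phrase."""
    return [
        line
        for line in diff_text.splitlines()
        if line.startswith("+") and "ignore previous instructions" in line.lower()
    ]


def run_worker_once() -> None:
    """Worker side: claim one job, analyse it, store the artifact, finish the scan."""
    if not JOB_QUEUE:
        return
    scan_id = JOB_QUEUE.pop(0)
    scan = SCANS[scan_id]
    try:
        findings = find_issues(scan["diff"])
        ARTIFACTS[f"artifacts/{scan_id}.json"] = {"scan_id": scan_id, "findings": findings}
        scan["status"] = "completed"
    except Exception:
        # Any internal failure ends the scan in the error state.
        scan["status"] = "error"
```

Calling `submit_scan(...)` followed by `run_worker_once()` moves a scan from queued to completed and leaves one artifact under its `artifacts/{scan_id}.json` key.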
Decision Model¶
- pass: no blocking findings under current policy.
- fail: one or more blocking findings.
- error: internal failure; configurable fail-open/fail-closed at tenant level.
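The three outcomes above reduce to a small decision function. The sketch below assumes a hypothetical `Finding` type with a `blocking` flag already set by policy evaluation, and it reads fail-open in the usual sense: an error does not block under fail-open and does block under fail-closed.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding; `blocking` is assumed to be set by policy evaluation."""
    rule_id: str
    blocking: bool


def decide(findings: list[Finding], internal_failure: bool = False) -> str:
    """Return pass, fail, or error for one scan."""
    if internal_failure:
        return "error"
    return "fail" if any(f.blocking for f in findings) else "pass"


def should_block(decision: str, fail_open: bool) -> bool:
    """Map a decision to a blocking outcome using the tenant's error setting."""
    if decision == "error":
        return not fail_open
    return decision == "fail"
```

Keeping `decide` separate from `should_block` means the stored decision stays the same regardless of the tenant's fail-open or fail-closed setting; only the enforcement step consults it.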
Isolation and Security¶
- Tenant-scoped data access via tenant id in every table and query.
- Configurable API auth with tenant and admin roles.
- Workers run in ephemeral sandboxed containers with read-only filesystem.
- No default outbound network from workers.
- Artifacts signed via signer mode (deterministic, kms-hmac, aws-kms).
- Production mode blocks deterministic signing operations.
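The signer modes and the production guard can be sketched with Python's standard `hmac` module. This is an illustration under stated assumptions: the document does not define what the deterministic mode does, so it is modelled here as HMAC-SHA256 with a fixed development key (`DEV_KEY`, hypothetical). For kms-hmac the key is simply passed in rather than fetched from a KMS, and aws-kms is not sketched.

```python
import hashlib
import hmac
import json

DEV_KEY = b"deterministic-dev-key"  # assumption: fixed key for the deterministic mode


def sign_artifact(artifact: dict, signer_mode: str, key: bytes = DEV_KEY,
                  production: bool = False) -> str:
    """Return a hex HMAC-SHA256 signature over a canonical JSON encoding."""
    if signer_mode == "deterministic":
        if production:
            raise RuntimeError("deterministic signing is blocked in production mode")
        key = DEV_KEY
    elif signer_mode != "kms-hmac":
        raise NotImplementedError(f"signer mode {signer_mode!r} is not sketched here")
    # Sorted keys and fixed separators make the signed bytes stable across runs.
    payload = json.dumps(artifact, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_artifact(artifact: dict, signature: str, signer_mode: str,
                    key: bytes = DEV_KEY) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_artifact(artifact, signer_mode, key)
    return hmac.compare_digest(expected, signature)
```

Serialising with sorted keys before signing matters because two dictionaries with the same content but different key order would otherwise produce different signatures.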
Idempotency¶
POST /v1/scans supports Idempotency-Key header.
- Key collision with same payload returns original scan record.
- Key collision with different payload returns 409.
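The two collision rules can be sketched as a lookup keyed on the Idempotency-Key plus a payload digest. The names below are hypothetical, the in-memory dictionary stands in for persistent storage, and scoping keys per tenant is an assumption. Only the 409 comes from this document; the 201 and 200 status codes for the first and repeated requests are illustrative.

```python
import hashlib
import json
import uuid

# (tenant_id, idempotency_key) -> (payload digest, original scan record)
_IDEMPOTENCY: dict[tuple[str, str], tuple[str, dict]] = {}


def create_scan(tenant_id: str, idempotency_key: str, payload: dict) -> tuple[int, dict]:
    """Return (status_code, body) for a POST /v1/scans request."""
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    slot = (tenant_id, idempotency_key)
    if slot in _IDEMPOTENCY:
        stored_digest, scan = _IDEMPOTENCY[slot]
        if stored_digest == digest:
            return 200, scan  # same key, same payload: return the original record
        return 409, {"error": "Idempotency-Key reused with a different payload"}
    scan = {"scan_id": uuid.uuid4().hex, "status": "queued"}
    _IDEMPOTENCY[slot] = (digest, scan)
    return 201, scan
```

Storing a digest rather than the full payload keeps the lookup table small while still detecting a changed request body.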
Initial Tech Choices¶
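- API: FastAPI (Python) or Node (NestJS) with OpenAPI-first contract. The v1 api service uses FastAPI.
- Queue: Redis Streams or SQS. The v1 queue is implemented in the Postgres table scan_jobs instead.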
- DB: Postgres 15+.
- Object store: S3-compatible.
Next Implementation Tasks¶
- Validate cloud object storage bindings in staging/prod with backup/restore drills.
- Validate aws-kms signing/verification path in staging smoke and production dry run.
- Wire infra/k8s/prometheus-rules.diffver.yaml to cluster Prometheus and alert routing.
- Add SLO dashboards (P50/P95 scan latency, queue depth, publish failures, retention outcomes).