Audit Ledger
The audit ledger is Caracal’s tamper-evident record of every authorization decision. Every token exchange and OPA evaluation — whether allowed or denied — produces an audit event that flows from the STS through a Redis stream into an append-only PostgreSQL table. Events are HMAC-chained so that any modification or deletion is detectable.
What triggers an audit event
The STS emits an audit event for every OPA evaluation it performs:
- On allow: one event per resource granted, with decision = "allow" and the full OPA result
- On deny: one event per resource denied, with decision = "deny" and the OPA diagnostics that caused it
- On challenge required: an event with the step-up challenge type
- On JTI collision: an event with event_type = "jti_collision"
- On challenge cooldown or invalid challenge: events with the corresponding denial reason
Every exchange that touches the STS produces at least one audit event. An exchange requesting three resources produces three events.
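The one-event-per-resource rule can be sketched as follows. This is an illustration only, not the STS implementation: the helper `events_for_exchange` and the shape of its `results` argument are hypothetical, and placing the resource identifier under `metadata` is an assumption based on the field descriptions in the next section.

```python
from typing import Dict, List


def events_for_exchange(request_id: str, results: Dict[str, bool]) -> List[dict]:
    """Build one audit event per requested resource (hypothetical helper).

    `results` maps a resource identifier to True (allowed) or False (denied).
    """
    events = []
    for resource, allowed in results.items():
        events.append({
            "request_id": request_id,
            "decision": "allow" if allowed else "deny",
            # Assumption: the resource identifier travels in `metadata`.
            "metadata": {"resource": resource},
        })
    return events


events = events_for_exchange(
    "req-123",
    {"db:read": True, "db:write": False, "queue:publish": True},
)
print(len(events))  # 3
```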
Audit event structure
Each event contains:
| Field | Description |
|---|---|
| id | UUID |
| zone_id | Zone the exchange occurred in |
| event_type | Classification of the event (e.g., "token_issued", "deny", "jti_collision") |
| request_id | Trace ID from the exchange request |
| decision | "allow" or "deny" |
| policy_set_id | ID of the policy set that was active |
| policy_set_version_id | Specific version of the policy set that evaluated the exchange |
| manifest_sha | SHA-256 of the policy set version manifest |
| evaluation_status | "complete" or another status from OPA |
| determining_policies | JSON array of policy IDs/names that caused the decision |
| diagnostics | JSON array of arbitrary diagnostic metadata from the policy |
| metadata | Additional context (principal ID, resource identifier, session ID, etc.) |
| occurred_at | Timestamp at microsecond precision |
The combination of policy_set_version_id and manifest_sha uniquely identifies exactly which version of which policies produced a given decision. This makes audit events reproducible: given the same inputs and the same policy version, the decision should be identical.
The event pipeline

    STS (evaluation)
      │
      ├─ emits to AuditBuffer (in-process, capacity 10,000 events)
      │    flushed every 1,000 events or every 50ms
      │
      ▼
    Redis stream: caracal.audit.events
      │
      ├─ consumer group: audit-ingestor → Audit service → PostgreSQL
      └─ consumer group: siem-export → (external SIEM, if configured)

AuditBuffer: The STS does not write to Redis synchronously per evaluation. It enqueues events to an in-process buffer that batches flushes. If Redis is unavailable, events are written to an on-disk NDJSON fallback file with HMAC-SHA256 per event. The fallback prevents data loss during Redis outages.
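A minimal sketch of the batching and fallback behavior described above. It is not Caracal's AuditBuffer: the class layout, the `send` callback (standing in for the Redis publish), the use of `ConnectionError` to signal an outage, and the NDJSON record shape are all assumptions. The 10,000-event capacity is left out because the overflow behavior is not described here, and the 50 ms interval is only checked when an event is enqueued, whereas a real buffer would presumably use a timer.

```python
import hashlib
import hmac
import json
import time
from typing import Callable, List


class AuditBuffer:
    """Illustrative batching buffer: flush at batch_size events or after interval_s."""

    def __init__(self, send: Callable[[List[dict]], None], fallback_path: str,
                 hmac_key: bytes, batch_size: int = 1000, interval_s: float = 0.05):
        self.send = send                    # hypothetical: publishes one batch to Redis
        self.fallback_path = fallback_path  # hypothetical path for the NDJSON fallback
        self.hmac_key = hmac_key
        self.batch_size = batch_size
        self.interval_s = interval_s
        self.pending: List[dict] = []
        self.last_flush = time.monotonic()

    def enqueue(self, event: dict) -> None:
        self.pending.append(event)
        due = time.monotonic() - self.last_flush >= self.interval_s
        if len(self.pending) >= self.batch_size or due:
            self.flush()

    def flush(self) -> None:
        batch, self.pending = self.pending, []
        self.last_flush = time.monotonic()
        if not batch:
            return
        try:
            self.send(batch)
        except ConnectionError:
            # Assumption: a Redis outage surfaces as ConnectionError.
            self._write_fallback(batch)

    def _write_fallback(self, batch: List[dict]) -> None:
        # One JSON object per line, each carrying its own HMAC-SHA256 tag.
        with open(self.fallback_path, "a", encoding="utf-8") as f:
            for event in batch:
                body = json.dumps(event, sort_keys=True, separators=(",", ":"))
                tag = hmac.new(self.hmac_key, body.encode("utf-8"),
                               hashlib.sha256).hexdigest()
                f.write(json.dumps({"event": event, "hmac": tag}) + "\n")


sent: List[List[dict]] = []
buf = AuditBuffer(send=sent.append, fallback_path="audit-fallback.ndjson",
                  hmac_key=bytes(32), batch_size=2, interval_s=60.0)
buf.enqueue({"id": "a"})
buf.enqueue({"id": "b"})  # reaching batch_size triggers a flush through `send`
```

Tagging each fallback line individually means a partially written file can still be verified line by line when it is replayed.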
Dead letter queue: Events that the Audit service cannot process after retries are routed to caracal.audit.events.dlq, consumed by the audit-dlq-observer consumer group. The DLQ is monitored separately.
XACK semantics: The Audit service sends XACK only after a terminal outcome — successful insert into PostgreSQL, or a detected benign duplicate. Transient errors cause the event to remain pending in the stream for retry.
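The acknowledgement rule can be sketched as a small decision function. The `Outcome` enum, `process_batch`, and the `insert` and `ack` callbacks are hypothetical names for illustration; `ack` stands in for an XACK against the stream, and DLQ routing is omitted.

```python
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class Outcome(Enum):
    INSERTED = "inserted"          # row written to PostgreSQL
    DUPLICATE = "duplicate"        # benign duplicate detected
    TRANSIENT_ERROR = "transient"  # e.g. database briefly unreachable


def is_terminal(outcome: Outcome) -> bool:
    """Only terminal outcomes may be acknowledged."""
    return outcome in (Outcome.INSERTED, Outcome.DUPLICATE)


def process_batch(entries: Iterable[Tuple[str, dict]],
                  insert: Callable[[dict], Outcome],
                  ack: Callable[[str], None]) -> List[str]:
    """Ack terminal entries; return the IDs left pending for retry."""
    left_pending = []
    for entry_id, event in entries:
        if is_terminal(insert(event)):
            ack(entry_id)
        else:
            left_pending.append(entry_id)
    return left_pending


acked: List[str] = []
outcomes = {"a": Outcome.INSERTED, "b": Outcome.TRANSIENT_ERROR, "c": Outcome.DUPLICATE}
pending = process_batch(
    [("1-0", {"id": "a"}), ("2-0", {"id": "b"}), ("3-0", {"id": "c"})],
    insert=lambda event: outcomes[event["id"]],
    ack=acked.append,
)
print(acked, pending)  # ['1-0', '3-0'] ['2-0']
```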
HMAC chain integrity
The Audit service writes events to PostgreSQL with a cryptographic chain that enables tamper detection. For each event, the service computes:
Content hash (SHA-256):
The SHA-256 over all audit event fields, joined in a fixed order with 0x1f (ASCII unit separator) between adjacent fields. Fields included: id, zone_id, event_type, request_id, decision, policy_set_id, policy_set_version_id, manifest_sha, evaluation_status, determining_policies_json, diagnostics_json, metadata_json, then occurred_at as Unix nanoseconds.
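As an illustration, the content hash could be computed along these lines. The field order and the 0x1f separator come from the description above; everything else is an assumption: UTF-8 encoding of each field, the JSON fields being supplied as already-serialized strings, and the timestamp rendered as a decimal string. Digests from this sketch are therefore not expected to match those stored by the Audit service.

```python
import hashlib
from datetime import datetime, timezone

FIELD_ORDER = [
    "id", "zone_id", "event_type", "request_id", "decision",
    "policy_set_id", "policy_set_version_id", "manifest_sha",
    "evaluation_status", "determining_policies_json",
    "diagnostics_json", "metadata_json",
]
SEPARATOR = b"\x1f"  # ASCII unit separator
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_nanos(ts: datetime) -> int:
    """Convert a timezone-aware datetime to Unix nanoseconds using integer math."""
    delta = ts - EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000


def content_sha256(event: dict) -> bytes:
    # Assumption: each field is encoded as its UTF-8 string form.
    parts = [str(event[name]).encode("utf-8") for name in FIELD_ORDER]
    parts.append(str(unix_nanos(event["occurred_at"])).encode("ascii"))
    return hashlib.sha256(SEPARATOR.join(parts)).digest()


event = {name: "example" for name in FIELD_ORDER}
event["decision"] = "allow"
event["occurred_at"] = datetime(2024, 1, 1, 12, 0, 0, 250, tzinfo=timezone.utc)
print(content_sha256(event).hex())
```

A separator that cannot appear in ordinary field values keeps two different field splits from producing the same byte string.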
Chain HMAC (HMAC-SHA256):

    chain_hmac = HMAC-SHA256(
      key  = AUDIT_HMAC_KEY,
      data = hex(content_sha256) || "|" || hex(prev_content_sha256)
    )

Each event stores:
- content_sha256: hash of this event’s fields
- prev_content_sha256: hash of the previous event in the sequence
- chain_hmac: HMAC linking this event to the previous one
- chain_seq: monotonic sequence number
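A sketch of how the four stored values relate when an event is appended. The `chain_hmac` formula follows the definition above; the rest is assumed for illustration: lowercase hex, a first event that links to 32 zero bytes, sequence numbers starting at 1, and the helper name `append_event`.

```python
import hashlib
import hmac
from typing import List

GENESIS = bytes(32)  # Assumption: the first event links to 32 zero bytes.


def chain_hmac(key: bytes, content_sha256: bytes, prev_content_sha256: bytes) -> bytes:
    # data = hex(content_sha256) || "|" || hex(prev_content_sha256)
    data = (content_sha256.hex() + "|" + prev_content_sha256.hex()).encode("ascii")
    return hmac.new(key, data, hashlib.sha256).digest()


def append_event(chain: List[dict], key: bytes, content_sha256: bytes) -> dict:
    """Link a new event to the current tail of the chain (hypothetical helper)."""
    prev = chain[-1]["content_sha256"] if chain else GENESIS
    row = {
        "chain_seq": len(chain) + 1,  # assumption: numbering starts at 1
        "content_sha256": content_sha256,
        "prev_content_sha256": prev,
        "chain_hmac": chain_hmac(key, content_sha256, prev),
    }
    chain.append(row)
    return row


key = bytes(32)  # placeholder; a real deployment uses AUDIT_HMAC_KEY
chain: List[dict] = []
append_event(chain, key, hashlib.sha256(b"event-1").digest())
append_event(chain, key, hashlib.sha256(b"event-2").digest())
print(chain[1]["prev_content_sha256"] == chain[0]["content_sha256"])  # True
```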
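An attacker who modifies an event’s content changes its content_sha256. Because SHA-256 is unkeyed, the attacker can recompute that hash, but cannot produce a matching chain_hmac for the modified event or for its successor without the HMAC key. An attacker who deletes an event breaks the chain because the next event’s prev_content_sha256 no longer matches the content_sha256 of the event that now precedes it, and rewriting prev_content_sha256 would invalidate that event’s chain_hmac. Inserting a fake event also breaks the chain because the HMAC key is required to produce a valid chain_hmac.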
Tamper detection runs:
- At startup: a full retention sweep validates the chain from beginning to end
- Every hour: an incremental check validates recent events
Detection results are logged. No automatic remediation is performed — the ledger is append-only and the chain is evidence, not a correctable state.
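A validation pass over stored rows might look like the sketch below, which reports the first row that fails and changes nothing. It is not the Audit service's sweep: `first_broken_seq` and `build_rows` are hypothetical helpers, the genesis value of 32 zero bytes is an assumption, and a full check would also recompute content_sha256 from each event's fields. Because the first row passed in is only checked against its own HMAC, the same function works for a full sweep or for a window of recent rows.

```python
import hashlib
import hmac
from typing import List, Optional


def expected_chain_hmac(key: bytes, content: bytes, prev: bytes) -> bytes:
    data = (content.hex() + "|" + prev.hex()).encode("ascii")
    return hmac.new(key, data, hashlib.sha256).digest()


def first_broken_seq(rows: List[dict], key: bytes) -> Optional[int]:
    """Return the chain_seq of the first row that fails validation, or None."""
    previous = None
    for row in rows:
        if previous is not None:
            if row["chain_seq"] <= previous["chain_seq"]:
                return row["chain_seq"]  # sequence is not increasing
            if row["prev_content_sha256"] != previous["content_sha256"]:
                return row["chain_seq"]  # link to predecessor is broken
        want = expected_chain_hmac(key, row["content_sha256"],
                                   row["prev_content_sha256"])
        if not hmac.compare_digest(row["chain_hmac"], want):
            return row["chain_seq"]      # HMAC does not cover these hashes
        previous = row
    return None


def build_rows(key: bytes, payloads: List[bytes]) -> List[dict]:
    """Build a valid chain for the demonstration (assumed genesis: 32 zero bytes)."""
    rows, prev = [], bytes(32)
    for seq, payload in enumerate(payloads, start=1):
        content = hashlib.sha256(payload).digest()
        rows.append({"chain_seq": seq, "content_sha256": content,
                     "prev_content_sha256": prev,
                     "chain_hmac": expected_chain_hmac(key, content, prev)})
        prev = content
    return rows


key = bytes(32)  # placeholder key
rows = build_rows(key, [b"e1", b"e2", b"e3"])
print(first_broken_seq(rows, key))                # None
print(first_broken_seq([rows[0], rows[2]], key))  # 3 (row 2 was deleted)
```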
Append-only guarantee
The Audit service never executes UPDATE or DELETE on the audit table. The PostgreSQL user used by the Audit service has INSERT and SELECT permissions only. This is enforced at the database layer, not just by application code.
Per-zone writes are serialized with pg_advisory_xact_lock to maintain chain ordering within a zone.
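The serialization step can be sketched with a DB-API style cursor. `pg_advisory_xact_lock` takes a 64-bit integer key and holds the lock until the surrounding transaction ends. How Caracal derives that key from the zone is not stated here, so hashing zone_id with SHA-256 is purely an assumption, as are the helper names and the `%s` parameter style.

```python
import hashlib
import struct
from typing import Any, Callable


def advisory_lock_key(zone_id: str) -> int:
    """Map a zone ID to a signed 64-bit key (assumed derivation)."""
    digest = hashlib.sha256(zone_id.encode("utf-8")).digest()
    return struct.unpack(">q", digest[:8])[0]


def append_serialized(cursor: Any, zone_id: str,
                      insert_row: Callable[[Any], None]) -> None:
    """Take the per-zone lock, then insert.

    Must run inside an open transaction: the lock is released automatically
    at commit or rollback, so concurrent writers for the same zone queue up
    and each one sees the previous writer's row as the chain tail.
    """
    cursor.execute("SELECT pg_advisory_xact_lock(%s)",
                   (advisory_lock_key(zone_id),))
    insert_row(cursor)  # hypothetical callback: reads the tail, inserts the new row


print(advisory_lock_key("zone-a") == advisory_lock_key("zone-a"))  # True
```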
Querying the audit ledger
The CLI and TUI provide two views:
Live tail (caracal audit tail): streams events from the Redis stream in near real-time. Events can be filtered by decision (allow / deny), and the stream can be paused and resumed.
Request trace (caracal explain <request-id>): retrieves all audit events for a specific request_id, showing the full OPA evaluation — which resources were requested, which were allowed, which were denied, and which policy versions determined each decision.
The admin SDK exposes the same queries via adminClient.audit.list(zoneId, query) and adminClient.audit.byRequest(zoneId, requestId).
AUDIT_HMAC_KEY
The AUDIT_HMAC_KEY environment variable is a 32-byte hex-encoded key required by both the STS (which signs events before emitting to Redis) and the Audit service (which verifies events on ingest and uses the key for the chain HMAC). Both services must use the same key. Changing this key after events have been written breaks tamper detection for historical events.
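A 32-byte key is 64 hexadecimal characters. The sketch below shows one way to generate such a key and to validate it at startup; `load_audit_hmac_key` and its error messages are hypothetical, not the services' actual startup checks.

```python
import os
import secrets
from typing import Mapping


def load_audit_hmac_key(env: Mapping[str, str] = os.environ) -> bytes:
    """Read AUDIT_HMAC_KEY and check that it decodes to exactly 32 bytes."""
    raw = env.get("AUDIT_HMAC_KEY")
    if not raw:
        raise RuntimeError("AUDIT_HMAC_KEY is not set")
    try:
        key = bytes.fromhex(raw.strip())
    except ValueError as exc:
        raise RuntimeError("AUDIT_HMAC_KEY must be hex-encoded") from exc
    if len(key) != 32:
        raise RuntimeError(f"AUDIT_HMAC_KEY must be 32 bytes, got {len(key)}")
    return key


# Generating a fresh key: 32 random bytes rendered as 64 hex characters.
new_key_hex = secrets.token_hex(32)
key = load_audit_hmac_key({"AUDIT_HMAC_KEY": new_key_hex})
print(len(new_key_hex), len(key))  # 64 32
```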
Next steps
- Authority Model: what decisions produce audit events
- Sessions and Revocation: how revocation events appear in the audit stream
- Step-Up Challenge: how step-up events are recorded