Event Topics

Caracal services communicate asynchronously through Redis streams. Events are published via an outbox table to ensure at-least-once delivery. All streams use flat key-value message fields (not JSON-encoded strings). Messages are optionally HMAC-signed using STREAMS_HMAC_KEY for origin verification.

Message encoding: Redis stream entries with flat string key-value fields. All values are strings; JSON objects (like metadata_json) are serialized as JSON strings within a single field.
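As an illustration of the encoding rule above, the following minimal sketch flattens an event into string-only fields. The helper name `encode_fields`, the compact JSON settings, and the handling of booleans and `None` are assumptions made for the example, not documented Caracal behavior.

```python
import json


def encode_fields(event: dict) -> dict[str, str]:
    """Flatten an event into string-only stream fields (illustrative sketch)."""
    fields: dict[str, str] = {}
    for key, value in event.items():
        if isinstance(value, (dict, list)):
            # Nested structures are serialized as JSON inside a single field.
            fields[key] = json.dumps(value, separators=(",", ":"), sort_keys=True)
        elif isinstance(value, bool):
            # Assumption: booleans are written as lowercase strings.
            fields[key] = "true" if value else "false"
        elif value is None:
            # Assumption: a missing value becomes an empty string.
            fields[key] = ""
        else:
            fields[key] = str(value)
    return fields
```

The result can be passed to any Redis client as the field map of a stream entry, since every value is already a string.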

HMAC signing: When STREAMS_HMAC_KEY is configured (hex-encoded, ≥32 bytes), a _sig field is appended to each message. The signed payload is the stream name followed by every field except _sig, sorted alphabetically by key and joined as {stream_name}\n{key}={value}\n...; the signature is the HMAC-SHA256 of that payload, hex-encoded. Consumers verify it with a timing-safe comparison. Messages that fail verification are dropped and dead-lettered; they are still acknowledged so that they do not block the consumer group.
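The signing scheme can be sketched with the Python standard library as follows. The function names are hypothetical, and two details are assumptions because the format string does not pin them down: the payload is built without a trailing newline, and "alphabetical" is taken to mean code-point order of the keys.

```python
import hashlib
import hmac


def canonical_payload(stream: str, fields: dict[str, str]) -> bytes:
    # Assumption: stream name first, then key=value lines sorted by key,
    # joined with "\n" and no trailing newline. The _sig field is excluded.
    lines = [stream] + [f"{k}={fields[k]}" for k in sorted(fields) if k != "_sig"]
    return "\n".join(lines).encode("utf-8")


def sign(stream: str, fields: dict[str, str], hex_key: str) -> dict[str, str]:
    key = bytes.fromhex(hex_key)
    if len(key) < 32:
        raise ValueError("STREAMS_HMAC_KEY must decode to at least 32 bytes")
    digest = hmac.new(key, canonical_payload(stream, fields), hashlib.sha256)
    return {**fields, "_sig": digest.hexdigest()}


def verify(stream: str, fields: dict[str, str], hex_key: str) -> bool:
    key = bytes.fromhex(hex_key)
    expected = hmac.new(key, canonical_payload(stream, fields), hashlib.sha256)
    received = fields.get("_sig", "")
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(expected.hexdigest().encode(), received.encode())
```

Because the stream name is part of the payload, a signed message copied onto a different stream fails verification.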

Outbox pattern: Mutations that produce stream events first write to the event_outbox table within the same database transaction. A background poller (OutboxDispatcher) reads pending rows, signs the payload if STREAMS_HMAC_KEY is set, and publishes to Redis via XADD ... MAXLEN ~ {maxlen}. A Redis failure at write time therefore delays the event instead of losing it: the row stays in the table and is retried. Since a retry can publish the same event more than once, delivery is at-least-once rather than exactly-once.
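A minimal sketch of the outbox pattern, using an in-memory SQLite database in place of the real store. Only the event_outbox table name comes from this page; the column names, the grants table, and the `publish` callback are hypothetical stand-ins chosen for the example.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE grants (id TEXT PRIMARY KEY, revoked INTEGER NOT NULL DEFAULT 0);
CREATE TABLE event_outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    stream TEXT NOT NULL,
    fields_json TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0
);
"""


def revoke_grant(db: sqlite3.Connection, grant_id: str, zone_id: str, session_id: str) -> None:
    fields = {
        "zone_id": zone_id,
        "session_id": session_id,
        "reason": "grant_revoked",
        "grant_id": grant_id,
    }
    # One transaction: the mutation and its outbox row commit or roll back together.
    with db:
        db.execute("UPDATE grants SET revoked = 1 WHERE id = ?", (grant_id,))
        db.execute(
            "INSERT INTO event_outbox (stream, fields_json) VALUES (?, ?)",
            ("caracal.sessions.revoke", json.dumps(fields)),
        )


def dispatch_once(db: sqlite3.Connection, publish, batch: int = 32) -> int:
    """Publish up to `batch` pending rows; return how many were sent."""
    rows = db.execute(
        "SELECT id, stream, fields_json FROM event_outbox "
        "WHERE status = 'pending' ORDER BY id LIMIT ?",
        (batch,),
    ).fetchall()
    sent = 0
    for row_id, stream, fields_json in rows:
        try:
            publish(stream, json.loads(fields_json))  # e.g. a wrapper around XADD
        except Exception:
            # Leave the row pending so a later poll retries it.
            with db:
                db.execute(
                    "UPDATE event_outbox SET attempts = attempts + 1 WHERE id = ?",
                    (row_id,),
                )
            continue
        with db:
            db.execute("UPDATE event_outbox SET status = 'sent' WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

If the process crashes between a successful `publish` and the status update, the row is published again on the next poll, which is the duplicate case that at-least-once delivery allows.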

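Dead-letter queues: Messages that cannot be consumed (HMAC failure, parse failure, repeated processing errors) are moved to a dead-letter stream. The general naming pattern is {stream}.dead; the audit stream is the exception and uses caracal.audit.events.dlq.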
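The consumer-side handling described above (dead-letter on failure, acknowledge in every case) might look like the sketch below. `FakeStreams` is an in-memory stand-in for the Redis XADD and XACK calls, and the `dead_reason` field, the in-process retry loop, and the `max_attempts` default are assumptions for illustration only.

```python
class FakeStreams:
    """In-memory stand-in for the Redis XADD and XACK commands (illustrative)."""

    def __init__(self):
        self.added = []
        self.acked = []

    def xadd(self, stream, fields):
        self.added.append((stream, fields))

    def xack(self, stream, group, entry_id):
        self.acked.append((stream, group, entry_id))


def handle_entry(client, stream, group, entry_id, fields, verify, process, max_attempts=3):
    """Process one stream entry; return True if it was consumed successfully."""
    reason = None
    if not verify(fields):
        reason = "hmac_failure"
    else:
        last_error = None
        for _ in range(max_attempts):
            try:
                process(fields)  # parse errors raised here are treated like any other failure
                last_error = None
                break
            except Exception as exc:
                last_error = exc
        if last_error is not None:
            reason = "processing_error"
    if reason is not None:
        # Hypothetical extra field recording why the entry was dead-lettered.
        client.xadd(f"{stream}.dead", {**fields, "dead_reason": reason})
    # Always acknowledge, so a bad entry cannot block the consumer group.
    client.xack(stream, group, entry_id)
    return reason is None
```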


caracal.sessions.revoke

Session revocation events. Every service that issues or verifies mandates must consume this stream to enforce sub-TTL revocation.

Publishers: Control-plane API (grant revocation), Coordinator (agent termination and delegation cascade revocation).

Consumers:

| Service | Consumer group | Consumer ID pattern |
| --- | --- | --- |
| Gateway | gateway-revocation | gateway-{hostname}-{pid} |
| STS | sts-revocation | sts-{hostname}-{pid} |
| Resource servers (Redis connector) | configurable; default resource-revocation | caller-supplied |

Message fields:

| Field | Type | Description |
| --- | --- | --- |
| zone_id | string | Zone the session belongs to |
| session_id | string | Session ID (sid claim) to revoke |
| reason | string | Revocation reason, e.g. "grant_revoked" or "agent_terminated" |
| grant_id | string | Grant ID that triggered revocation (when reason is grant_revoked) |
| _sig | string | HMAC-SHA256 hex digest (present when STREAMS_HMAC_KEY is set) |

Retention: MAXLEN ~ 100,000

Dead-letter queue: caracal.sessions.revoke.dead

The Gateway and STS pick up revocation events within one pollOnce() cycle (typically under 5 seconds). Mandates issued before the revocation that have not yet expired will be rejected on the next request because both the Gateway and STS check the sid claim against their revocation caches on every inbound request.
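A revocation cache of the kind described above can be sketched as follows. The class and method names are hypothetical, and the expiry rule is an assumption: an entry only has to be remembered for as long as a mandate issued before the revocation could still be valid.

```python
import time


class RevocationCache:
    """Remembers revoked sessions for at least the maximum mandate lifetime."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._revoked: dict[tuple[str, str], float] = {}  # (zone_id, session_id) -> expiry

    def apply_event(self, fields: dict[str, str]) -> None:
        """Record one entry read from caracal.sessions.revoke."""
        key = (fields["zone_id"], fields["session_id"])
        self._revoked[key] = self._clock() + self._ttl

    def is_revoked(self, zone_id: str, sid: str) -> bool:
        """Check the sid claim of an inbound mandate."""
        expiry = self._revoked.get((zone_id, sid))
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Any mandate for this session has expired by now; drop the entry.
            del self._revoked[(zone_id, sid)]
            return False
        return True
```

A request handler would call `is_revoked(zone_id, sid)` before honoring a mandate and reject the request when it returns `True`.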


caracal.audit.events

Authorization decision events from the STS. Every token exchange attempt produces an audit event whether it succeeds or fails.

Publisher: STS.

Consumers:

| Service | Consumer group |
| --- | --- |
| Audit service | audit-ingestor |
| SIEM integrations | siem-export |

Message fields:

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique event UUID |
| zone_id | string | Zone where the decision was made |
| event_type | string | "token_exchange", "authz_decision", "replay_detected", etc. |
| request_id | string | Request correlation ID |
| decision | string | "allow" or "deny" |
| policy_set_id | string | Policy set ID used for evaluation |
| policy_set_version_id | string | Policy set version ID |
| manifest_sha | string | SHA-256 of the policy manifest |
| evaluation_status | string | "complete", "partial", or "error" |
| determining_policies_json | string | JSON array of policy IDs that determined the outcome |
| diagnostics_json | string | JSON array of diagnostic objects from OPA evaluation |
| metadata_json | string | JSON object with additional context (client, resource, principal); sensitive fields are redacted |
| occurred_at | string | RFC 3339 timestamp with nanosecond precision |
| sig | string | HMAC-SHA256 over the data field using AUDIT_HMAC_KEY (separate from STREAMS_HMAC_KEY) |

Signing: Audit events use a dedicated AUDIT_HMAC_KEY environment variable. The HMAC covers the data field (a JSON-serialized copy of the full event) rather than the canonical flat-field format used by other streams.
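The difference from the flat-field scheme can be sketched like this. The function names are hypothetical, and several details are assumptions not stated on this page: that AUDIT_HMAC_KEY is hex-encoded like STREAMS_HMAC_KEY, that sig is a hex digest, and that the JSON is serialized compactly with sorted keys. The other flat fields from the table are omitted for brevity.

```python
import hashlib
import hmac
import json


def sign_audit_event(event: dict, hex_key: str) -> dict[str, str]:
    # The HMAC covers the serialized JSON copy of the event, not the flat fields.
    data = json.dumps(event, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(bytes.fromhex(hex_key), data.encode("utf-8"), hashlib.sha256)
    return {"data": data, "sig": sig.hexdigest()}


def verify_audit_event(fields: dict[str, str], hex_key: str) -> bool:
    # Verify over the bytes of data exactly as received; re-serializing the
    # parsed JSON could change key order or spacing and break the signature.
    expected = hmac.new(
        bytes.fromhex(hex_key), fields["data"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected.encode(), fields.get("sig", "").encode())
```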

Publication: Events are buffered in-process and batch-flushed every 50 ms or 1,000 events, whichever comes first. Failed batches are persisted to disk for replay on restart.
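The flush rule (time limit or size limit, whichever comes first) can be sketched as a small buffer class. The class name, the `tick()` method, and the choice to measure age from the oldest buffered event are assumptions; a fixed 50 ms ticker would be an equally plausible reading. Persisting failed batches to disk is left out of the sketch.

```python
import time


class AuditBuffer:
    """Buffers events and flushes on size or age, whichever comes first."""

    def __init__(self, flush, max_events: int = 1000, max_age_s: float = 0.050,
                 clock=time.monotonic):
        self._flush = flush          # callable that receives a list of events
        self._max_events = max_events
        self._max_age_s = max_age_s
        self._clock = clock
        self._events: list[dict] = []
        self._oldest = 0.0           # arrival time of the first buffered event

    def add(self, event: dict) -> None:
        if not self._events:
            self._oldest = self._clock()
        self._events.append(event)
        if len(self._events) >= self._max_events:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes once the oldest event is max_age_s old."""
        if self._events and self._clock() - self._oldest >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        if not self._events:
            return
        batch, self._events = self._events, []
        self._flush(batch)
```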

Retention: MAXLEN ~ 1,000,000

Dead-letter queue: caracal.audit.events.dlq (up to 100,000 entries)


caracal.policy.invalidate

Policy invalidation signals. When a policy set version is activated, the control-plane API publishes this event so the STS discards its cached compiled policy bundle and reloads from the database.

Publisher: Control-plane API (on POST /v1/zones/{zoneId}/policy-sets/{id}/activate).

Consumer:

| Service | Consumer group |
| --- | --- |
| STS | opa-engine |

Message fields:

| Field | Type | Description |
| --- | --- | --- |
| zone_id | string | Zone whose policy set changed |
| policy_set_id | string | Policy set ID |
| policy_set_version_id | string | Newly activated version ID |
| shadow_version_id | string | Shadow version ID (empty string if none) |
| _sig | string | HMAC-SHA256 hex digest |

Retention: MAXLEN ~ 100,000

STS behavior: On receipt, the STS recompiles the Rego bundle for the zone using the new version’s manifest. Until the bundle is ready, the zone continues using the previous bundle (fail-open for policy updates, since the old policy is still valid). If no prior bundle exists and compilation fails, the STS installs a deny-all policy (fail-closed).
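The swap-or-keep behavior can be sketched as follows. `BundleStore`, the `compile_bundle` callback, and the `DENY_ALL` sentinel are hypothetical names for the example; the real STS compiles Rego bundles, which the sketch replaces with an opaque value.

```python
class BundleStore:
    """Keeps one compiled policy bundle per zone."""

    DENY_ALL = "deny-all"  # sentinel standing in for a deny-all policy

    def __init__(self, compile_bundle):
        # compile_bundle(zone_id, version_id) returns a bundle or raises.
        self._compile = compile_bundle
        self._bundles: dict[str, object] = {}

    def on_invalidate(self, fields: dict[str, str]) -> None:
        zone_id = fields["zone_id"]
        try:
            bundle = self._compile(zone_id, fields["policy_set_version_id"])
        except Exception:
            # Keep the previous bundle if there is one; otherwise fail closed.
            if zone_id not in self._bundles:
                self._bundles[zone_id] = self.DENY_ALL
            return
        # Swap only after compilation succeeded, so requests never see a
        # half-built bundle.
        self._bundles[zone_id] = bundle

    def current(self, zone_id: str):
        return self._bundles.get(zone_id)
```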


caracal.agents.lifecycle

Agent session lifecycle events from the Coordinator.

Publisher: Coordinator (on spawn, suspend, resume, terminate).

Message fields: Service-specific payload including agent_session_id, zone_id, application_id, status, and occurred_at.


caracal.invocations.lifecycle

Invocation state change events from the Coordinator.

Publisher: Coordinator.


caracal.delegations.invalidate

Delegation graph invalidation signals. Published when a delegation edge is revoked or the graph epoch changes.

Publisher: Coordinator.


Both the control-plane API and Coordinator run an outbox dispatcher that polls the event_outbox table and publishes to Redis.

| Environment variable | Default | Description |
| --- | --- | --- |
| CARACAL_OUTBOX_POLL_MS | 250 | Polling interval in milliseconds |
| CARACAL_OUTBOX_BATCH | 32 | Number of rows fetched per poll |
| CARACAL_OUTBOX_MAX_ATTEMPTS | 100 | Maximum delivery attempts before a row is marked failed |
| CARACAL_OUTBOX_STREAM_MAXLEN | 100000 | MAXLEN argument passed to XADD |

Rows that reach CARACAL_OUTBOX_MAX_ATTEMPTS are marked failed, are not retried again, and remain in the table for manual inspection. Until then, the backoff between retries is exponential with jitter, capped at 60 seconds.
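One common way to compute such a delay is "full jitter", sketched below. The base delay of 0.25 seconds and the full-jitter variant are assumptions; this page specifies only that the backoff is exponential, jittered, and capped at 60 seconds.

```python
import random


def retry_delay(attempt: int, base_s: float = 0.25, cap_s: float = 60.0,
                rng=random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Exponential growth, bounded by the cap.
    ceiling = min(cap_s, base_s * (2 ** attempt))
    # Full jitter: pick a uniform point between 0 and the ceiling so that
    # many failing rows do not all retry at the same instant.
    return rng() * ceiling
```

With these assumed defaults the ceiling reaches the 60-second cap at the ninth retry (attempt 8, since 0.25 × 2⁸ = 64), so most of the 100 allowed attempts would be spaced up to a minute apart.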


| Stream | MAXLEN (approximate) |
| --- | --- |
| caracal.sessions.revoke | 100,000 |
| caracal.audit.events | 1,000,000 |
| caracal.policy.invalidate | 100,000 |
| caracal.audit.events.dlq | 100,000 |
| caracal.sessions.revoke.dead | 100,000 |

MAXLEN is approximate (~), meaning Redis may retain slightly more entries than the configured limit. Entries are not time-expired; they are evicted only when the stream exceeds MAXLEN.
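For reference, the shape of an XADD call with approximate trimming is sketched below as a plain argument list. The helper name `xadd_command` is hypothetical; the argument order follows the Redis XADD syntax, with `*` asking Redis to assign the entry ID.

```python
def xadd_command(stream: str, fields: dict[str, str], maxlen: int) -> list[str]:
    """Build the argument list for XADD with approximate MAXLEN trimming."""
    # "~" lets Redis trim lazily, so the stream may briefly exceed maxlen.
    args = ["XADD", stream, "MAXLEN", "~", str(maxlen), "*"]
    for key, value in fields.items():
        args += [key, value]
    return args
```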


| Stream | Publisher | Consumer(s) |
| --- | --- | --- |
| caracal.sessions.revoke | API, Coordinator | Gateway, STS, resource servers |
| caracal.audit.events | STS | Audit service, SIEM |
| caracal.policy.invalidate | API | STS |
| caracal.agents.lifecycle | Coordinator | |
| caracal.invocations.lifecycle | Coordinator | |
| caracal.delegations.invalidate | Coordinator | |