Inspect Diagnostics and Audit

Observability workflows live in the Console because they need operator context, selected zones, and structured API responses.

Open diagnostics with d. The Console Doctor view runs the shared engine diagnostics across four sections:

| Section | Checks |
| --- | --- |
| health | API reachability, admin authentication, and operator clock skew against the platform. |
| readiness | Per-service /ready probes plus threshold evaluation of STS, Gateway, audit, and Coordinator operator metrics. |
| zones | Visible zones, resources, policy sets, and audit queryability. |
| preflight | Local secrets, keys, TLS material, and Postgres/Redis reachability. |

Doctor does more than report liveness: it evaluates the operator metrics it collects and raises a warning or failure when a signal needs action. Notable deeper checks include:

| Signal | Result | Why it matters |
| --- | --- | --- |
| Clock skew vs the API exceeds 30s | fail | Token iat/nbf/exp validation and audit ordering break when hosts are not NTP-synced. |
| STS OPA compile_errors > 0 | fail | A policy bundle will not compile, so token exchanges deny or run on a stale bundle. |
| Audit tamper_mismatch_total or chain_breaks > 0 | fail | Audit chain integrity is broken; treat as a security incident. |
| Audit dlq_size or hmac_failures > 0 | warn | Events are failing processing or HMAC verification before they age out. |
| Coordinator outbox dead rows, or a large pending backlog | warn | Spawn and revocation events are not propagating downstream. |
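
The threshold evaluation described above can be pictured as a small rule table. The following is a minimal sketch, not Doctor's actual implementation: the metric key names (`clock_skew_seconds`, `outbox_pending`, and so on) and the backlog limit of 1000 are assumptions made for illustration, since the signal list only says "large" for the pending backlog.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

CLOCK_SKEW_LIMIT_SECONDS = 30   # taken from the signal list above
PENDING_BACKLOG_LIMIT = 1000    # assumption: the docs only say "large"

Metrics = Mapping[str, float]


@dataclass(frozen=True)
class Rule:
    signal: str                          # hypothetical short name for the signal
    result: str                          # "fail" or "warn"
    triggered: Callable[[Metrics], bool]


def _total(metrics: Metrics, *names: str) -> float:
    """Sum the named counters, treating missing metrics as 0."""
    return sum(metrics.get(name, 0) for name in names)


# One rule per signal; all metric key names are hypothetical.
RULES = [
    Rule("clock_skew", "fail",
         lambda m: abs(m.get("clock_skew_seconds", 0)) > CLOCK_SKEW_LIMIT_SECONDS),
    Rule("sts_opa_compile_errors", "fail",
         lambda m: _total(m, "sts_opa_compile_errors") > 0),
    Rule("audit_chain_integrity", "fail",
         lambda m: _total(m, "audit_tamper_mismatch_total", "audit_chain_breaks") > 0),
    Rule("audit_processing", "warn",
         lambda m: _total(m, "audit_dlq_size", "audit_hmac_failures") > 0),
    Rule("coordinator_outbox", "warn",
         lambda m: m.get("outbox_dead_rows", 0) > 0
         or m.get("outbox_pending", 0) > PENDING_BACKLOG_LIMIT),
]


def evaluate(metrics: Metrics) -> list:
    """Return (signal, result) pairs for every rule the metrics trigger."""
    return [(rule.signal, rule.result) for rule in RULES if rule.triggered(metrics)]
```

Keeping the rules as data makes it easy to see that skew is compared by absolute value (a host can run ahead of or behind the API) and that any non-zero integrity counter fails outright rather than warning.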

Useful Doctor controls include reload, all-zones or selected-zone checks, preflight-only checks, and strict readiness where warnings fail automation gates. The Next actions block collects the advice for every non-passing check.
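
To make the strict readiness behavior concrete, here is a minimal sketch of how an automation gate might treat results in normal and strict mode, and how a Next actions list could be assembled. The `Check` type and both function names are hypothetical; they only illustrate the behavior described above and are not the Console's own API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    result: str       # "pass", "warn", or "fail"
    advice: str = ""  # guidance shown when the check does not pass


def gate_passes(checks, strict=False):
    """Failures always block; warnings block only in strict mode."""
    blocking = {"fail", "warn"} if strict else {"fail"}
    return not any(check.result in blocking for check in checks)


def next_actions(checks):
    """Collect the advice for every non-passing check, in order."""
    return [f"{check.name}: {check.advice}" for check in checks if check.result != "pass"]
```

With one passing check and one warning, `gate_passes(checks)` returns `True` while `gate_passes(checks, strict=True)` returns `False`, which is the difference an automation gate would see.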

This view goes deeper than caracal status: status answers “is the stack up”, while Doctor inspects the internal conditions that keep the platform working correctly over time, namely policy compilation, audit integrity, event propagation, clock alignment, and local key material.

Open audit with a. Audit records tie STS, Gateway, Coordinator, policy, and Control decisions to request IDs and authority anchors. Use the audit view to filter recent events, inspect denials, and gather incident context.

The audit view opens recent events immediately and streams new ones as they arrive. Press f for advanced filters: decision, event type, time window, request ID, agent session, agent label, and result limit. Time-window fields accept a relative window such as 15m, 2h, or 7d, an ISO timestamp, or a date — the Console converts the value before querying the API. Use row actions to open the full event group (enter) or the decision trace (x) for the selected request.
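
The time-window conversion can be sketched as follows. This is an illustration of the three accepted input forms, not the Console's actual parser: the function name `parse_time_field`, the choice to treat timestamps without an offset as UTC, and the restriction to the m, h, and d units shown in the examples are all assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

_RELATIVE = re.compile(r"^(\d+)([mhd])$")
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_time_field(value: str, now: datetime) -> datetime:
    """Convert a relative window, ISO timestamp, or date to an aware UTC datetime."""
    value = value.strip()
    match = _RELATIVE.match(value)
    if match:
        # Relative window such as "15m": count back from the current time.
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # A trailing "Z" is rewritten so older Python versions can parse it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    # fromisoformat accepts both "2024-05-01" and full ISO timestamps.
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)
```

Passing `now` in explicitly keeps the function deterministic; a caller would normally supply `datetime.now(timezone.utc)`.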

Audit is an event ledger: it records what authorization decided. It is distinct from the sessions views, which show the live state of authority sessions, agent sessions, and delegation edges. Filter audit by agent session or agent label to connect a decision back to the session that made it.

Open request trace with t when you already have a request ID — from a log line, error response, or Gateway header — and need the decision trace directly without scanning the audit list. It is the same trace reached by x on an audit row, exposed as a direct entry point. Trace output connects the request, selected resource, subject, scopes, determining policies, diagnostics, and result.

| Symptom | Check |
| --- | --- |
| Diagnostics cannot reach API | Run caracal status --ready; custom deployments should confirm the Management API URL override. |
| Zone diagnostics show no zones | Confirm the admin token has visibility into the expected zone. |
| Audit records are missing | Confirm the selected zone, time window, and audit service readiness. |

For a symptom-first path from a failing application call to the right surface and tool, see Troubleshoot by Symptom.

Use Manage Agents and Delegation when a request involves agent sessions, child sessions, or delegated authority.