Deployment with Docker Compose
The Docker Compose stack at infra/docker/docker-compose.yml is the reference deployment for all five Caracal services plus their infrastructure dependencies. It is the canonical local development environment and the blueprint for adapting to Kubernetes, ECS, or other orchestrators.
Services and startup order
The stack defines eight containers. Health-based dependencies enforce this startup sequence:

    postgres ─────┐
    redis ────────┤── init ── sts ── api ── gateway
                  │            └── coordinator
                  └── init ── audit

The init container must complete before STS, API, audit, or coordinator start. STS must be healthy before API and coordinator start. API must be healthy before the gateway starts.
postgres
    image: caracal/postgres:dev   # postgres:18-alpine
    ports: 127.0.0.1:5432:5432
    volumes: postgresData:/var/lib/postgresql
    healthcheck: pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}
      interval: 5s  timeout: 3s  retries: 20

The custom image adds no logic on top of the base Postgres image. Database migrations run from the API service on startup, not from this container.
redis

    image: caracal/redis:dev   # redis:8-alpine with custom redis.conf
    ports: 127.0.0.1:6379:6379
    volumes: redisData:/data
    command: --requirepass ${REDIS_PASSWORD}
    healthcheck: redis-cli -a ${REDIS_PASSWORD} --no-auth-warning PING | grep PONG
      interval: 5s  timeout: 3s  retries: 20

The container starts with redis-server /etc/caracal/redis.conf, which enables appendonly yes, appendfsync everysec, and maxmemory-policy noeviction.
init
A one-shot container (restart: "no") that runs infra/redis/provision-streams.sh after both Postgres and Redis are healthy. It creates all Redis streams and consumer groups idempotently using XGROUP CREATE ... MKSTREAM, so it is safe to re-run at any time.
All other services depend (directly or transitively) on init completing with exit code 0.
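The idempotent provisioning described above can be sketched as follows. This is not the contents of provision-streams.sh, which is not shown here; it only illustrates one way to make XGROUP CREATE ... MKSTREAM safe to repeat, by treating Redis's BUSYGROUP reply (group already exists) as success. The helper names redis_cmd and ensure_group are hypothetical, and so is the group name in the usage line.

```shell
#!/usr/bin/env bash
# Sketch of idempotent stream provisioning; not the actual provision-streams.sh.

# Thin wrapper so connection details live in one place (hypothetical helper).
# REDIS_PASSWORD is read from the environment, as in the compose file.
redis_cmd() {
  redis-cli -a "${REDIS_PASSWORD:-}" --no-auth-warning "$@"
}

# Create a consumer group, creating the stream as well if it does not exist.
# A BUSYGROUP reply means the group is already there, which counts as success.
ensure_group() {
  local stream group out
  stream="$1"
  group="$2"
  out="$(redis_cmd XGROUP CREATE "$stream" "$group" '$' MKSTREAM 2>&1)" || true
  case "$out" in
    OK|*BUSYGROUP*) return 0 ;;
    *)
      printf 'failed to create group %s on %s: %s\n' "$group" "$stream" "$out" >&2
      return 1
      ;;
  esac
}
```

A call such as `ensure_group caracal.audit.events audit-writers` (the group name is an assumption) can then be repeated on every start without failing once the group exists.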
sts

    image: caracal/sts:dev
    ports: 127.0.0.1:8080:8080
    depends_on:
      init: { condition: service_completed_successfully }
    healthcheck: wget -qO- http://127.0.0.1:8080/ready || exit 1

Go binary. Does not run migrations. Healthy once Postgres and Redis are reachable and the audit replay buffer is ready.
Required: DATABASE_URL, REDIS_URL, ZONE_KEK, AUDIT_HMAC_KEY, ISSUER_URL
api

    image: caracal/api:dev
    ports: 127.0.0.1:3000:3000
    depends_on:
      sts: { condition: service_healthy }
    healthcheck: wget -qO- http://127.0.0.1:3000/ready || exit 1

TypeScript/Node.js service. Runs Postgres migrations (0001 through current) on every startup; the migrations are idempotent. Seeds the bootstrap admin token if CARACAL_ADMIN_TOKEN is set and the token is not yet present. Starts the transactional outbox dispatcher as a background worker.
Required: DATABASE_URL, REDIS_URL, STS_URL, ZONE_KEK, CARACAL_ADMIN_TOKEN
gateway
    image: caracal/gateway:dev
    ports: 127.0.0.1:8081:8081
    depends_on:
      sts: { condition: service_healthy }
      api: { condition: service_healthy }
    healthcheck: wget -qO- http://127.0.0.1:8081/ready || exit 1

Go binary. The only service exposed externally in production. Requires TLS in production (TLS_CERT_FILE, TLS_KEY_FILE). In development, set INSECURE_HTTP=true and INSECURE_STS=true.
Required: DATABASE_URL, REDIS_URL, STS_URL, STREAMS_HMAC_KEY
audit

    image: caracal/audit:dev
    ports: 127.0.0.1:9090:9090
    depends_on:
      init: { condition: service_completed_successfully }
    healthcheck: wget -qO- http://127.0.0.1:9090/ready || exit 1

Go binary. Consumes the caracal.audit.events stream, persists events to the partitioned audit_events table, runs the chain HMAC tamper sweeper, and drives Parquet export to S3 if configured.
Required: DATABASE_URL, REDIS_URL, AUDIT_HMAC_KEY
coordinator
    image: caracal/coordinator:dev
    ports: 127.0.0.1:4000:4000
    depends_on:
      sts: { condition: service_healthy }
    healthcheck: wget -qO- http://127.0.0.1:4000/ready || exit 1

Runs two processes via start.sh: the Go relay binary (background) and the Node.js coordinator (foreground). The relay forwards lifecycle events to Redis streams. The coordinator manages agent sessions, delegation edges, and invocations.
Required: DATABASE_URL, REDIS_URL, STS_URL, ISSUER_URL, AGENT_COORDINATOR_SCOPE
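Because each service refuses to run without its required variables, it can help to check an env file before starting the stack. The sketch below copies the per-service lists from the "Required:" lines above. It assumes every variable appears directly in the file as a NAME=value line; the real compose file may derive some of them (DATABASE_URL, for example) from other values, in which case this check would over-report. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report required variables that are missing or empty in an env file.

# Required variables per service, copied from the "Required:" lines above.
required_vars() {
  case "$1" in
    sts)         echo "DATABASE_URL REDIS_URL ZONE_KEK AUDIT_HMAC_KEY ISSUER_URL" ;;
    api)         echo "DATABASE_URL REDIS_URL STS_URL ZONE_KEK CARACAL_ADMIN_TOKEN" ;;
    gateway)     echo "DATABASE_URL REDIS_URL STS_URL STREAMS_HMAC_KEY" ;;
    audit)       echo "DATABASE_URL REDIS_URL AUDIT_HMAC_KEY" ;;
    coordinator) echo "DATABASE_URL REDIS_URL STS_URL ISSUER_URL AGENT_COORDINATOR_SCOPE" ;;
    *)           return 1 ;;
  esac
}

# Print every required variable that has no non-empty NAME=value line in the file.
# Returns 0 only when nothing is missing, 2 for an unknown service name.
missing_vars() {
  local service env_file vars name status
  service="$1"
  env_file="$2"
  status=0
  vars="$(required_vars "$service")" || {
    echo "unknown service: $service" >&2
    return 2
  }
  for name in $vars; do
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_vars gateway infra/docker/.env` prints one variable name per line and exits non-zero if anything is absent.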
First-run procedure
1. Copy and populate the env file:

    cp infra/docker/.env.example infra/docker/.env

Set these values in infra/docker/.env:

    POSTGRES_PASSWORD=<strong-password>
    REDIS_PASSWORD=<strong-password>
    CARACAL_ADMIN_TOKEN=<strong-token>

    # Compose does not evaluate $(...) in .env files: run each command in a shell and paste its output
    ZONE_KEK=$(openssl rand -hex 32)
    AUDIT_HMAC_KEY=$(openssl rand -hex 32)
    STREAMS_HMAC_KEY=$(openssl rand -hex 32)

    # Development only; do not use in production
    INSECURE_STS=true
    INSECURE_HTTP=true
    CARACAL_LOCAL_BOOTSTRAP_ENABLED=true

2. Build images:

    cd infra/docker && docker compose build

3. Start the stack:

    docker compose up -d

4. Verify health:

    docker compose ps

All services should show healthy. The init container shows Exited (0).
5. Bootstrap (first run only):
With CARACAL_LOCAL_BOOTSTRAP_ENABLED=true, the API seeds initial zone and application data automatically on first startup. To trigger bootstrap explicitly:
    caracal stack bootstrap

Volumes
| Volume | Mount inside container | Contents |
|---|---|---|
| postgresData | /var/lib/postgresql | All Postgres data files |
| redisData | /data | Redis AOF log and RDB snapshot |
Both are Docker-managed named volumes. They survive docker compose down but are deleted with docker compose down -v.
In production, replace these with external volumes backed by persistent block storage (EBS, GCP PD, Azure Disk). Do not use ephemeral container storage for either.
Port summary
| Service | Local binding | Purpose |
|---|---|---|
| Postgres | 127.0.0.1:5432 | DB access from CLI tools |
| Redis | 127.0.0.1:6379 | Redis CLI debugging |
| STS | 127.0.0.1:8080 | Token exchange; JWKS endpoint |
| API | 127.0.0.1:3000 | Control-plane REST; CLI target |
| Gateway | 127.0.0.1:8081 | Upstream proxy (public-facing in prod) |
| Coordinator | 127.0.0.1:4000 | Agent lifecycle; delegation graph |
| Audit | 127.0.0.1:9090 | Audit query and export status |
All bindings use 127.0.0.1 only. In production, services communicate over a private internal network and only the Gateway is exposed to external traffic.
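With the loopback bindings in the table above, readiness of the whole stack can be polled from the host instead of reading docker compose ps by eye. The following is a minimal sketch, assuming wget is installed on the host and that every service answers on /ready as its healthcheck suggests. The function names and the one-second retry interval are illustrative choices, not part of the stack.

```shell
#!/usr/bin/env bash
# Sketch: poll each service's /ready endpoint until it answers or attempts run out.

# One readiness probe; the same wget call the compose healthchecks use.
probe() {
  wget -qO- "$1" >/dev/null 2>&1
}

# Wait for one URL, retrying once per second up to the given number of attempts.
wait_ready() {
  local url attempts i
  url="$1"
  attempts="${2:-60}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if probe "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready after ${attempts} attempts: $url" >&2
  return 1
}

# Check the services in startup order: sts, api, gateway, coordinator, audit.
wait_stack() {
  local port
  for port in 8080 3000 8081 4000 9090; do
    wait_ready "http://127.0.0.1:${port}/ready" "${1:-60}" || return 1
  done
}
```

Running `wait_stack 30` after `docker compose up -d` stops at the first service that does not become ready within 30 attempts and names its URL on stderr.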
Production adaptation checklist
- Replace named volumes with persistent external volumes.
- Run multiple replicas of stateless services (STS, API, Gateway, Coordinator) behind a load balancer.
- Supply secrets from a vault or secrets manager; do not use .env files.
- Set INSECURE_HTTP=false and INSECURE_STS=false; provide TLS_CERT_FILE and TLS_KEY_FILE for the Gateway.
- Set CARACAL_ENV=production on the Gateway.
- Remove CARACAL_LOCAL_BOOTSTRAP_ENABLED=true from the API env.
- Set STREAMS_HMAC_KEY on all services that publish or consume from Redis streams.
- Scale the Audit service to a single replica with leader election active (the service uses advisory locks to elect an export leader and a retention leader).
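Several of the checklist items are mechanical enough to check automatically. The sketch below assumes the effective environment of a service has been rendered to a file of NAME=value lines (for example, exported from a secrets manager for review); it is a simplification, since the checklist applies some settings to the Gateway and others to the API, and this check treats them as one file. The function name lint_prod_env is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: flag development-only settings in a rendered environment file.

# Print one line per violation; return non-zero if any were found.
lint_prod_env() {
  local env_file status name
  env_file="$1"
  status=0
  # Flags the checklist says must not be enabled in production.
  for name in INSECURE_HTTP INSECURE_STS CARACAL_LOCAL_BOOTSTRAP_ENABLED; do
    if grep -Eq "^${name}=true[[:space:]]*$" "$env_file"; then
      echo "${name}=true is development-only"
      status=1
    fi
  done
  # The Gateway requires TLS material in production.
  for name in TLS_CERT_FILE TLS_KEY_FILE; do
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "${name} is not set"
      status=1
    fi
  done
  # The checklist asks for CARACAL_ENV=production on the Gateway.
  if ! grep -Eq "^CARACAL_ENV=production[[:space:]]*$" "$env_file"; then
    echo "CARACAL_ENV is not production"
    status=1
  fi
  return "$status"
}
```

A deployment pipeline could run `lint_prod_env rendered.env` as a gate and fail the rollout on a non-zero exit; the file name here is only an example.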