TLS and Production Hardening

The Gateway is the only Caracal service exposed to external traffic. It enforces TLS, validates upstream URLs, and applies several production-mode constraints that are deliberately disabled in development. This page covers all hardening controls and how to configure them correctly.

The Gateway requires TLS when CARACAL_ENV=production (the default). Provide a certificate and private key:

TLS_CERT_FILE=/etc/caracal/tls/server.crt
TLS_KEY_FILE=/etc/caracal/tls/server.key

Both variables must be set together. Setting one without the other causes the Gateway to panic at startup with:

TLS_CERT_FILE and TLS_KEY_FILE must both be set

If neither is set and INSECURE_HTTP is not true, the Gateway panics:

TLS_CERT_FILE/TLS_KEY_FILE required; set INSECURE_HTTP=true to run plaintext
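The startup rules above can be sketched as a small decision function. This is an illustration only, not the Gateway's actual code: the function name tlsMode is hypothetical, and choosing TLS when a certificate pair and INSECURE_HTTP=true are both present is an assumption the text above does not settle.

```go
package main

import (
	"errors"
	"fmt"
)

// tlsMode is a hypothetical helper that mirrors the documented startup rules.
// It reports whether TLS should be served, or an error for an invalid setup.
func tlsMode(certFile, keyFile string, insecureHTTP bool) (bool, error) {
	switch {
	case (certFile == "") != (keyFile == ""):
		// Exactly one of the two variables is set.
		return false, errors.New("TLS_CERT_FILE and TLS_KEY_FILE must both be set")
	case certFile != "":
		// Both are set. Assumption: TLS wins even if INSECURE_HTTP is true.
		return true, nil
	case insecureHTTP:
		// Neither is set, but plaintext was requested explicitly.
		return false, nil
	default:
		return false, errors.New("TLS_CERT_FILE/TLS_KEY_FILE required; set INSECURE_HTTP=true to run plaintext")
	}
}

func main() {
	useTLS, err := tlsMode("/etc/caracal/tls/server.crt", "", false)
	fmt.Println(useTLS, err)
}
```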

Minimum TLS version: TLS 1.2, set through Go’s crypto/tls configuration (tls.Config{MinVersion: tls.VersionTLS12}). Clients that offer only TLS 1.0 or 1.1 fail the handshake. Cipher suite selection uses Go’s defaults for TLS 1.2 and TLS 1.3.

Certificate format: Standard PEM. Provide a full certificate chain (leaf + intermediates) in TLS_CERT_FILE if your CA requires chain presentation.
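For readers who want to see what such a TLS floor looks like in Go, here is a minimal sketch built on the standard library. The function name newTLSServer and the listen address are assumptions for illustration; only the MinVersion setting comes from this page.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newTLSServer is a hypothetical constructor showing a TLS 1.2 floor.
// Cipher suites are left unset so Go's defaults apply.
func newTLSServer(addr string) *http.Server {
	return &http.Server{
		Addr:      addr,
		TLSConfig: &tls.Config{MinVersion: tls.VersionTLS12},
	}
}

func main() {
	srv := newTLSServer(":8081")
	// Print the configured floor as the TLS wire version number.
	fmt.Printf("min TLS version: %#x\n", srv.TLSConfig.MinVersion)
	// A real server would now call:
	//   srv.ListenAndServeTLS(certFile, keyFile)
	// where certFile holds the PEM chain (leaf first, then intermediates).
}
```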


Two escape-hatch flags relax the TLS requirements for local development:

| Variable | Effect | Forbidden when |
| --- | --- | --- |
| INSECURE_HTTP=true | Run the Gateway without TLS | CARACAL_ENV=production |
| INSECURE_STS=true | Allow the Gateway to connect to STS over HTTP | CARACAL_ENV=production |

Both variables are explicitly rejected in production mode. The Gateway panics at startup if either is set to true with CARACAL_ENV=production.

Do not set these flags in any environment that handles real credentials or tokens.


CARACAL_ENV=production activates the full set of production safety rules. The Gateway validates all of the following at startup and panics if any constraint is violated:

  • INSECURE_HTTP must not be true
  • INSECURE_STS must not be true
  • JTI_FAIL_OPEN must not be true
  • TLS_CERT_FILE and TLS_KEY_FILE must both be set
  • PORT must be 8081

CARACAL_ENV=dev disables these checks and allows the escape hatches. Any value other than production or dev is rejected.
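The checks listed above can be sketched as one validation pass over the environment. This is a hedged illustration, not the Gateway's implementation: validateEnv and its getter parameter are hypothetical, and requiring PORT to be set explicitly (rather than applying a default) is an assumption.

```go
package main

import "fmt"

// validateEnv is a hypothetical sketch of the documented startup checks.
// get looks up a variable by name and returns "" when it is unset.
func validateEnv(get func(string) string) error {
	env := get("CARACAL_ENV")
	if env == "" {
		env = "production" // production is the documented default
	}
	switch env {
	case "dev":
		return nil // dev disables the checks below
	case "production":
		// continue with the production checks
	default:
		return fmt.Errorf("CARACAL_ENV must be production or dev, got %q", env)
	}
	for _, name := range []string{"INSECURE_HTTP", "INSECURE_STS", "JTI_FAIL_OPEN"} {
		if get(name) == "true" {
			return fmt.Errorf("%s=true is forbidden in production", name)
		}
	}
	if get("TLS_CERT_FILE") == "" || get("TLS_KEY_FILE") == "" {
		return fmt.Errorf("TLS_CERT_FILE and TLS_KEY_FILE must both be set")
	}
	// Assumption: PORT has to be present and equal to 8081.
	if get("PORT") != "8081" {
		return fmt.Errorf("PORT must be 8081 in production")
	}
	return nil
}

func main() {
	sample := map[string]string{
		"CARACAL_ENV":   "production",
		"TLS_CERT_FILE": "/etc/caracal/tls/server.crt",
		"TLS_KEY_FILE":  "/etc/caracal/tls/server.key",
		"PORT":          "8081",
		"INSECURE_STS":  "true", // deliberately invalid
	}
	lookup := func(name string) string { return sample[name] }
	fmt.Println(validateEnv(lookup))
}
```

Passing a getter instead of reading os.Getenv directly keeps the function easy to exercise with a plain map; in a real process the call would be validateEnv(os.Getenv).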


The Gateway proxies requests to upstream resource URLs fetched from Postgres. To prevent server-side request forgery (SSRF), it applies the following checks.

Private upstream blocking (default):

By default, ALLOW_PRIVATE_UPSTREAMS=false prevents the Gateway from forwarding requests to private IP ranges (RFC 1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and loopback addresses. Any resource with a private upstream URL is rejected at proxy time.
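As a rough illustration of the address test described above, the following sketch classifies an IP literal using Go's net/netip package. The function name isBlockedAddr is hypothetical, and this is not the Gateway's code. Note that netip's IsPrivate also covers the IPv6 unique local range (fc00::/7), which goes beyond what this page states.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isBlockedAddr is a hypothetical check: it reports whether an IP literal
// is private (RFC 1918, plus IPv6 unique local) or loopback.
func isBlockedAddr(s string) (bool, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false, err
	}
	// Unmap turns IPv4-mapped IPv6 (::ffff:10.0.0.1) into plain IPv4 so
	// the mapped form cannot be used to slip past the check.
	addr = addr.Unmap()
	return addr.IsPrivate() || addr.IsLoopback(), nil
}

func main() {
	for _, s := range []string{"10.1.2.3", "172.32.0.1", "127.0.0.1", "8.8.8.8"} {
		blocked, err := isBlockedAddr(s)
		fmt.Println(s, blocked, err)
	}
}
```

A check like this is only meaningful when applied to the address the connection actually uses, since a public hostname can resolve to a private IP.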

Upstream allowlist (production with private upstreams):

If your architecture requires the Gateway to reach private upstreams (e.g., internal microservices), set:

ALLOW_PRIVATE_UPSTREAMS=true
UPSTREAM_HOST_ALLOWLIST=api.internal.company.com,payments.internal.company.com

UPSTREAM_HOST_ALLOWLIST is a comma-separated list of permitted hostnames. In production with ALLOW_PRIVATE_UPSTREAMS=true, the allowlist is required. Requests to hosts not on the list are rejected.
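A minimal sketch of how a comma-separated allowlist might be parsed and applied follows. The helper names are hypothetical, and trimming whitespace and comparing hostnames case-insensitively are assumptions; this page only specifies the comma-separated format and exact-host semantics.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseAllowlist turns a comma-separated value into a set of hostnames.
// Assumption: surrounding whitespace is ignored and case is folded.
func parseAllowlist(raw string) map[string]bool {
	hosts := make(map[string]bool)
	for _, h := range strings.Split(raw, ",") {
		h = strings.ToLower(strings.TrimSpace(h))
		if h != "" {
			hosts[h] = true
		}
	}
	return hosts
}

// hostAllowed reports whether the upstream URL's hostname is on the list.
// It matches the whole hostname, never a prefix or suffix.
func hostAllowed(allow map[string]bool, upstream string) bool {
	u, err := url.Parse(upstream)
	if err != nil {
		return false
	}
	return allow[strings.ToLower(u.Hostname())]
}

func main() {
	allow := parseAllowlist("api.internal.company.com,payments.internal.company.com")
	fmt.Println(hostAllowed(allow, "https://api.internal.company.com:8443/v1"))
	fmt.Println(hostAllowed(allow, "https://other.internal.company.com/v1"))
}
```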

Path traversal protection:

The Gateway rejects upstream URLs containing path traversal sequences (..). This check runs before any upstream DNS resolution.
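One way such a check could look is sketched below. It is an assumption that the test works on whole path segments after percent-decoding; the Gateway may instead match the raw ".." substring. The function name hasTraversal is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hasTraversal is a hypothetical check that reports whether any path
// segment of the URL is exactly "..". url.Parse stores the decoded path,
// so an encoded segment such as %2e%2e is caught as well.
func hasTraversal(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	for _, seg := range strings.Split(u.Path, "/") {
		if seg == ".." {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, raw := range []string{
		"https://api.example.com/v1/../admin",
		"https://api.example.com/v1/%2e%2e/admin",
		"https://api.example.com/v1/items",
	} {
		found, err := hasTraversal(raw)
		fmt.Println(raw, found, err)
	}
}
```

Because the check needs only the URL string, it can run before the hostname is resolved, which matches the ordering described above.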


The Gateway tracks JWT IDs (jti claims) in Redis to prevent replay of intercepted per-call mandates. This check is always active. If Redis is unavailable:

  • JTI_FAIL_OPEN=false (the default): The Gateway rejects requests — safer but reduces availability during Redis outages.
  • JTI_FAIL_OPEN=true: The Gateway allows requests through without the JTI check — forbidden in production.
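The fail-closed versus fail-open behavior can be illustrated with a small sketch. Everything here is hypothetical: the jtiStore interface stands in for the Redis-backed store, and memStore is an in-memory substitute so the example runs without Redis.

```go
package main

import (
	"errors"
	"fmt"
)

// jtiStore is a hypothetical stand-in for the Redis-backed JTI store.
type jtiStore interface {
	// MarkSeen records jti and reports whether it was seen for the first time.
	MarkSeen(jti string) (fresh bool, err error)
}

// allowRequest applies the replay check. When the store fails, the
// outcome is decided by failOpen (JTI_FAIL_OPEN).
func allowRequest(store jtiStore, jti string, failOpen bool) bool {
	fresh, err := store.MarkSeen(jti)
	if err != nil {
		return failOpen // false rejects the request, true lets it through
	}
	return fresh // a repeated jti is a replay and is rejected
}

// memStore is an in-memory substitute used only for this example.
type memStore struct {
	seen map[string]bool
	down bool // simulates an outage
}

func (m *memStore) MarkSeen(jti string) (bool, error) {
	if m.down {
		return false, errors.New("store unavailable")
	}
	if m.seen[jti] {
		return false, nil
	}
	m.seen[jti] = true
	return true, nil
}

func main() {
	store := &memStore{seen: map[string]bool{}}
	fmt.Println(allowRequest(store, "mandate-1", false)) // first use
	fmt.Println(allowRequest(store, "mandate-1", false)) // replay
	store.down = true
	fmt.Println(allowRequest(store, "mandate-2", false)) // outage, fail closed
	fmt.Println(allowRequest(store, "mandate-2", true))  // outage, fail open
}
```

With a real Redis backend, the record-and-test step is commonly a single atomic command (for example SET with the NX option and an expiry), so that two concurrent requests carrying the same jti cannot both be treated as new.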

Request size limit:

| Variable | Default | Description |
| --- | --- | --- |
| MAX_REQUEST_BYTES | 10485760 | Maximum request body size in bytes (10 MiB). Larger requests are rejected with 413. |
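In Go, a body cap of this kind is typically built on http.MaxBytesReader. The sketch below is an assumption about how such a limit could be wired, not the Gateway's code; limitBody, drain, and statusFor are hypothetical names, and statusFor exists only to exercise the handler in-process.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxRequestBytes = 10 << 20 // 10485760, the documented default

// limitBody caps the request body before the next handler reads it.
func limitBody(limit int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// drain reads the whole body and maps an exceeded limit to 413.
func drain(w http.ResponseWriter, r *http.Request) {
	if _, err := io.Copy(io.Discard, r.Body); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusNoContent)
}

// statusFor sends a body of bodyLen bytes through the limited handler
// and returns the resulting status code.
func statusFor(bodyLen int, limit int64) int {
	body := strings.NewReader(strings.Repeat("x", bodyLen))
	req := httptest.NewRequest(http.MethodPost, "/", body)
	rec := httptest.NewRecorder()
	limitBody(limit, http.HandlerFunc(drain)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1024, maxRequestBytes)) // well under the limit
	fmt.Println(statusFor(32, 16))                // over a tiny 16-byte limit
}
```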

Timeouts:

| Variable | Default | Description |
| --- | --- | --- |
| STS_TIMEOUT | 5s | Timeout for STS token exchange calls |
| UPSTREAM_TIMEOUT | 30s | Timeout for proxied upstream requests |
| READ_HEADER_TIMEOUT | 5s | Time allowed to read HTTP request headers |
| READ_TIMEOUT | 30s | Time allowed to read the full request |
| WRITE_TIMEOUT | 60s | Time allowed to write the full response |
| IDLE_TIMEOUT | 120s | Keep-alive idle connection timeout |

Tune UPSTREAM_TIMEOUT based on your upstream services’ P99 response times. Setting it lower than your upstreams’ worst-case latency causes false timeouts.
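The four server-side timeouts share their names with fields on Go's http.Server, so a plausible mapping is sketched below. That mapping is an assumption, as are the helper names parseTimeout and newServer; the values are parsed with time.ParseDuration, which accepts strings such as 5s or 120s.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// parseTimeout parses raw as a Go duration, using fallback when raw is empty.
func parseTimeout(raw, fallback string) (time.Duration, error) {
	if raw == "" {
		raw = fallback
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("timeout must be positive, got %s", d)
	}
	return d, nil
}

// newServer is a hypothetical constructor that fills http.Server timeouts
// from environment-style lookups, applying the documented defaults.
func newServer(addr string, get func(string) string) (*http.Server, error) {
	srv := &http.Server{Addr: addr}
	fields := []struct {
		name     string
		fallback string
		dst      *time.Duration
	}{
		{"READ_HEADER_TIMEOUT", "5s", &srv.ReadHeaderTimeout},
		{"READ_TIMEOUT", "30s", &srv.ReadTimeout},
		{"WRITE_TIMEOUT", "60s", &srv.WriteTimeout},
		{"IDLE_TIMEOUT", "120s", &srv.IdleTimeout},
	}
	for _, f := range fields {
		d, err := parseTimeout(get(f.name), f.fallback)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", f.name, err)
		}
		*f.dst = d
	}
	return srv, nil
}

func main() {
	srv, err := newServer(":8081", os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(srv.ReadHeaderTimeout, srv.ReadTimeout, srv.WriteTimeout, srv.IdleTimeout)
}
```

STS_TIMEOUT and UPSTREAM_TIMEOUT apply to outbound calls rather than to the listener, so they are not part of this sketch.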


Production checklist:

  • CARACAL_ENV=production
  • TLS_CERT_FILE and TLS_KEY_FILE set to valid, non-expired certificate and key
  • INSECURE_HTTP not set (or explicitly false)
  • INSECURE_STS not set (or explicitly false)
  • JTI_FAIL_OPEN not set (or explicitly false)
  • STREAMS_HMAC_KEY set to a 32-byte hex secret shared with all publishers
  • ALLOW_PRIVATE_UPSTREAMS=false unless an internal allowlist is configured
  • UPSTREAM_HOST_ALLOWLIST set if ALLOW_PRIVATE_UPSTREAMS=true
  • Postgres and Redis ports used by the Gateway not reachable from external networks
  • Only port 8081 exposed externally (all other services on a private network)
  • TLS certificate renewed before expiry (set up automated renewal with Let’s Encrypt or your PKI)

In the Docker Compose stack, services communicate over Docker’s internal network using container names (http://sts:8080, http://api:3000, etc.). No inter-service traffic passes through the Gateway. The Gateway connects to STS via STS_URL and to Postgres/Redis directly.

In production container orchestration, inter-service traffic should use mTLS or be constrained to a private network segment. The current stack does not provide mTLS for inter-service communication, so enforce isolation at the network layer (Kubernetes NetworkPolicy, AWS VPC security groups, etc.).