Self-host
Production deployment: env variables, security posture, scaling notes, backups. The same code runs in dev and prod.
Environment variables
pnpm bootstrap generates these for dev. In production, store them in your secret manager and pass them to each service.
| Variable | Where | Purpose |
|---|---|---|
| DATABASE_URL | control-plane | Postgres with pgvector extension. e.g. postgres://user:pass@host:5432/relay. |
| RELAY_MASTER_KEY | control-plane | 32-byte hex. Encrypts provider credentials. Don't rotate it without re-encrypting the stored credentials. |
| RELAY_INTERNAL_SECRET | both | Shared secret for the runtime → control-plane callback. Set on both services. If unset, the callback is unauthenticated (dev only). |
| RUNTIME_URL | control-plane | Where the runtime listens. Default http://localhost:4100. |
| CONTROL_PLANE_URL | runtime | Where the control-plane listens. Default http://localhost:4000. |
| PORT | control-plane | HTTP port. Default 4000. |
| RELAY_TOOL_RESULT_TIMEOUT_MS | control-plane | Custom tool long-poll timeout. Default 30000. |
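
Because an unset RELAY_INTERNAL_SECRET leaves the callback unauthenticated, it can help to fail fast before a service starts. The following is a minimal sketch, not part of Relay: the function names gen_hex_key and check_relay_env are hypothetical, and it assumes that "32-byte hex" means 32 random bytes encoded as 64 hexadecimal characters.

```shell
# Hypothetical preflight helpers; not shipped with Relay.

# Print 32 random bytes as 64 lowercase hex characters.
gen_hex_key() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Return non-zero if a variable the control plane requires is unset or empty,
# or if RELAY_MASTER_KEY is not 64 hex characters (an assumed format).
check_relay_env() {
  _status=0
  for _name in DATABASE_URL RELAY_MASTER_KEY RELAY_INTERNAL_SECRET; do
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing: $_name" >&2
      _status=1
    fi
  done
  case "${RELAY_MASTER_KEY:-}" in
    *[!0-9a-fA-F]*)
      echo "RELAY_MASTER_KEY contains non-hex characters" >&2
      _status=1 ;;
  esac
  if [ -n "${RELAY_MASTER_KEY:-}" ] && [ "${#RELAY_MASTER_KEY}" -ne 64 ]; then
    echo "RELAY_MASTER_KEY should be 64 hex characters" >&2
    _status=1
  fi
  return "$_status"
}
```

One possible use is to call check_relay_env at the top of the control plane's entrypoint script and exit if it fails. gen_hex_key could produce the initial value you store in your secret manager.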
Postgres setup
You need PostgreSQL 16+ with pgvector. Managed options that include both:
- Supabase (pgvector built in)
- Neon (enable the vector extension)
- RDS for PostgreSQL with the vector extension
- Fly.io Postgres (with manual create extension vector)
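
On hosts where the extension is not enabled by default, it has to be created once per database before migrations run. A minimal sketch, assuming psql is installed and DATABASE_URL points at a role that is allowed to create extensions; the function name enable_pgvector is hypothetical.

```shell
# Hypothetical helper; assumes psql is on PATH and DATABASE_URL is set.
enable_pgvector() {
  # ON_ERROR_STOP stops at the first failing statement.
  # The second statement prints the installed pgvector version as a check.
  psql "$DATABASE_URL" -v ON_ERROR_STOP=1 \
    -c "CREATE EXTENSION IF NOT EXISTS vector;" \
    -c "SELECT extversion FROM pg_extension WHERE extname = 'vector';"
}
```

Because the statement uses IF NOT EXISTS, running it again on a database that already has the extension does no harm.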
Apply migrations from a checkout of the repo:
DATABASE_URL=postgres://… pnpm --filter @relayhq/db migrate
Running the services
Relay runs as three processes: the runtime, the control plane, and the dashboard. Any container orchestrator or process supervisor (Fly, Render, Railway, Kubernetes, Docker Swarm, plain systemd) works.
Runtime (Go)
# build a static binary
cd runtime
CGO_ENABLED=0 go build -o relay-runtime ./cmd/runtime
# run it
CONTROL_PLANE_URL=https://api.relay.your-domain.com \
RELAY_INTERNAL_SECRET=… \
PORT=4100 \
./relay-runtime
Control plane (Node)
pnpm --filter @relayhq/control-plane build
DATABASE_URL=… \
RELAY_MASTER_KEY=… \
RELAY_INTERNAL_SECRET=… \
RUNTIME_URL=http://runtime.internal:4100 \
PORT=4000 \
node packages/control-plane/dist/server.js
Dashboard (Next.js)
Deploy as a normal Next app (Vercel, Fly, Render, or just next start). Set RELAY_URL and RELAY_API_KEY in its environment.
pnpm --filter @relayhq/dashboard build
RELAY_URL=https://api.relay.your-domain.com \
RELAY_API_KEY=relay_live_… \
pnpm --filter @relayhq/dashboard start
Bootstrapping a tenant in production
The bootstrap script works the same way against a remote Postgres — just point DATABASE_URL at it.
DATABASE_URL=postgres://prod-host… \
RELAY_MASTER_KEY=$RELAY_MASTER_KEY \
ANTHROPIC_API_KEY=sk-ant-… \
OPENAI_API_KEY=sk-… \
pnpm bootstrap
Security posture
- Master key: load it from your secret manager (AWS KMS, GCP Secret Manager, 1Password, …). Never put it in source control or container env files.
- Internal secret: required in production. The control plane warns at boot when it's missing.
- TLS: terminate at your load balancer (Fly Proxy, ELB, Cloudflare). Nothing in Relay does TLS itself.
- Network boundaries: only the control plane needs public ingress. Runtime + Postgres stay private.
- Tenant isolation: every read query filters by tenant_id. There's no client-supplied tenant id anywhere in the API.
- The runtime never sees a Relay key and has no DB access. A compromised runtime instance leaks one in-flight run, not your tenant catalog.
Scaling notes
The runtime is stateless, so you can run as many replicas as you want behind a load balancer. The control plane is also horizontally scalable, but the pending-tools broker is in-memory, so every request for a given run must reach the same instance: use sticky sessions or pin each run_id to a single instance until the run completes. (A Redis-backed broker is on the roadmap.)
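
One way to picture per-run pinning is to hash the run id and use the result to pick a replica, so that every request for the same run lands on the same control-plane instance. The sketch below only illustrates the idea. In practice the load balancer would make this choice, and the function name pick_instance and the replica count are assumptions.

```shell
# Hypothetical illustration of hash-based pinning; not Relay code.
# Usage: pick_instance <run_id> <replica_count>
pick_instance() {
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  _crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  # Map the CRC onto a replica index in the range 0..replica_count-1.
  echo $(( _crc % $2 ))
}
```

Plain modulo hashing remaps runs whenever the replica count changes, so scaling the control plane while runs are in flight would break the pinning. Cookie-based sticky sessions avoid that problem at the cost of more load balancer state.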
Backups
Standard Postgres backups. pg_dump nightly plus point-in-time recovery if your hosting offers it. The run_events table grows fastest (one row per token); rotate/archive on a schedule if needed.
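
A nightly dump can be as small as the sketch below, run from cron or a scheduled job. It assumes pg_dump is installed and DATABASE_URL is set; the function names and the BACKUP_DIR variable (with its default path) are hypothetical.

```shell
# Hypothetical nightly backup helper; assumes pg_dump is on PATH.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/relay}"

# Build a dated file name of the form relay-YYYY-MM-DD.dump (UTC date).
backup_path() {
  echo "$BACKUP_DIR/relay-$(date -u +%Y-%m-%d).dump"
}

backup_db() {
  mkdir -p "$BACKUP_DIR" || return 1
  # Custom format is compressed and can be restored with pg_restore.
  pg_dump --format=custom --file="$(backup_path)" --dbname="$DATABASE_URL"
}
```

A dump like this complements point-in-time recovery but does not replace it: it restores the database only to the moment the dump was taken.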
Upgrades
New migrations are forward-compatible by convention. Standard flow:
- Pull new code on every service.
- pnpm --filter @relayhq/db migrate against prod DB.
- Roll runtime, then control plane, then dashboard.