The backend cloud for reliable AI agents.
Memory, retries, tools, traces, and durable execution — without building orchestration infrastructure yourself. Sign up free or self-host in three commands.
import { createAgent, builtin, tool } from "@relayhq/sdk";
const reviewPR = tool({
name: "review_pr",
description: "Read a GitHub PR and return a summary",
inputSchema: { type: "object", properties: { url: { type: "string" } } },
async handler({ url }) { return github.pulls.get(url); },
});
const agent = createAgent({
model: "claude-sonnet-4-6",
memory: { namespace: `user:${userId}` },
tools: [builtin.calculator, reviewPR],
});
for await (const e of agent.run("Review the last PR")) {
if (e.type === "token") process.stdout.write(e.text);
}
Built on boring, battle-tested infra
The problem
AI apps fail in production because orchestration is unreliable.
Every team building real agents ends up writing the same plumbing: queues, retries, state machines, tool dispatch, traces, memory, replay. Relay is that plumbing — already built, battle-shaped, and out of your way.
The change
Stop building infrastructure.
Same agent. One SDK call vs nine systems wired by hand.
- OpenAI / Anthropic SDK
- Redis
- Queues
- Retries + backoff
- State management
- Tool orchestration
- Tracing + replay
- Worker pool
- Memory + embeddings
const agent = createAgent({
model: "claude-sonnet-4-6",
tools: [github, slack],
memory: { namespace: "user:42" },
})
await agent.run("Review the last PR")
One SDK. Every provider. Memory, tools, traces, and retries are built in.
How it works
Three steps from `git clone` to a streaming, traced, memory-aware agent.
Self-host with Docker Compose. The whole stack on your laptop in two minutes.
Clone and bootstrap
git clone, pnpm bootstrap. Idempotent — generates keys, brings up Postgres, applies migrations, mints your API key.
Bring your own keys
Upload Anthropic / OpenAI / OpenAI-compatible credentials once. Encrypted with AES-256-GCM and resolved per request.
Run and observe
Every token, tool call, and memory hit is persisted as an ordered event log. Replay any run in the dashboard.
git clone https://github.com/KevinCorrea5103/relay
cd relay && pnpm install
pnpm bootstrap # mints keys, brings up Postgres, migrates
pnpm dev # runtime + control-plane + dashboard + web
What you get
Production-grade primitives, not a demo.
Everything an agent needs in production, exposed through a single SDK.
/ Multi-provider routing
Anthropic, OpenAI, and any OpenAI-compatible endpoint (Ollama, vLLM, Groq, Together, OpenRouter). Switch with one string.
/ Custom function tools
Tools run in your process, not ours. SDK ships the schema, runtime pauses on tool calls, you fulfill them locally over the same stream.
/ Semantic memory
pgvector + automatic indexing. Agents recall past interactions without you ever touching embeddings.
/ Persistent execution traces
Every run is an ordered event log: tokens, tool calls, memory retrievals, errors. Full replay for debugging.
/ BYOK encryption
AES-256-GCM per tenant. The runtime never sees a Relay key and has no database access. Costs flow direct to providers.
/ Streaming + tool calling
Native SSE end to end. Tokens stream as they're generated. Tool calls dispatch in parallel without breaking iteration.
/ Per-tenant isolation
API keys scope every read. Runs, memories, credentials — all tagged with their tenant. Multi-tenancy from day one.
/ 100% self-host
Three services and one Postgres. Run it on your laptop with Docker, or in your own cloud. No vendor in the loop.
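The "Custom function tools" and "Streaming + tool calling" items above describe a loop: the runtime pauses on tool calls, your handlers run locally, and the results go back over the stream. A minimal sketch of the local half might look like the following. The `ToolCall` and `ToolResult` shapes and the `fulfillToolCalls` name are hypothetical illustrations, not the SDK's actual types.

```typescript
// Hypothetical shapes: the real SDK's wire format may differ.
type ToolCall = { id: string; name: string; input: Record<string, unknown> };
type ToolResult = { id: string; ok: boolean; output: unknown };
type Handler = (input: Record<string, unknown>) => Promise<unknown> | unknown;

// Runs every pending tool call concurrently and never rejects:
// a failing or unknown tool becomes an { ok: false } result instead.
async function fulfillToolCalls(
  calls: ToolCall[],
  handlers: Map<string, Handler>,
): Promise<ToolResult[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      const handler = handlers.get(call.name);
      if (!handler) {
        return { id: call.id, ok: false, output: `unknown tool: ${call.name}` };
      }
      try {
        return { id: call.id, ok: true, output: await handler(call.input) };
      } catch (err) {
        return { id: call.id, ok: false, output: String(err) };
      }
    }),
  );
}
```

Turning failures into results rather than exceptions keeps one bad tool from aborting its siblings, which is one way to dispatch in parallel without breaking iteration.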
Use it from anywhere
TypeScript today. Python today. Anything else over plain HTTP.
Two official SDKs and a documented HTTP + SSE protocol — wire it into any stack.
import { createAgent } from "@relayhq/sdk";
const agent = createAgent({
apiKey: process.env.RELAY_API_KEY!,
baseUrl: "https://api.relaygh.dev",
model: "gpt-4o-mini",
});
for await (const event of agent.run("Say hi in three languages.")) {
if (event.type === "token") process.stdout.write(event.text);
}
No SDK for your language? The protocol is plain HTTP + SSE, about 30 lines in any language. See /docs/api.
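Most of those lines go to parsing the event stream. As a rough sketch, assuming each SSE event carries a JSON payload on its `data:` lines shaped like the SDK's `token` events, an incremental parser could look like this. The `createSseParser` name and the payload shape are assumptions; the documented protocol in /docs/api is the authority.

```typescript
// Hypothetical sketch: incremental parser for a text/event-stream body.
// The JSON payload shape is an assumption modeled on the SDK's events.
type RelayEvent = { type: string; [key: string]: unknown };

interface SseParser {
  // Feed one decoded chunk; returns the events completed by that chunk.
  push(chunk: string): RelayEvent[];
}

function createSseParser(): SseParser {
  let buffer = "";
  return {
    push(chunk: string): RelayEvent[] {
      // Normalize CRLF after appending so a CRLF split across chunks still matches.
      buffer = (buffer + chunk).replace(/\r\n/g, "\n");
      const events: RelayEvent[] = [];
      let boundary = buffer.indexOf("\n\n");
      while (boundary !== -1) {
        const block = buffer.slice(0, boundary);
        buffer = buffer.slice(boundary + 2);
        // Keep only data lines; comments and other fields are ignored here.
        const data = block
          .split("\n")
          .filter((line) => line.startsWith("data:"))
          .map((line) => line.slice(5).replace(/^ /, ""))
          .join("\n");
        if (data) events.push(JSON.parse(data) as RelayEvent);
        boundary = buffer.indexOf("\n\n");
      }
      return events;
    },
  };
}
```

Feed it the decoded chunks of a streaming HTTP response body; events that straddle a chunk boundary stay buffered until the blank line that terminates them arrives.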
Observability
Every run is a complete execution trace.
Tokens, tool calls, results, memory retrievals, errors — captured in order. Replay anything.


Built-in dashboard. See exactly what the model did — including the mistakes it self-corrected.
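To make "captured in order, replay anything" concrete, here is a small sketch of folding an ordered event log back into what a run produced. The `TraceEvent` fields (`seq`, `type`, and so on) and the `replayRun` function are hypothetical; Relay's stored schema may look different.

```typescript
// Hypothetical event shapes: illustrative only, not Relay's stored schema.
type TraceEvent =
  | { seq: number; type: "token"; text: string }
  | { seq: number; type: "tool_call"; name: string }
  | { seq: number; type: "error"; message: string };

interface ReplaySummary {
  output: string;      // concatenated model tokens
  toolCalls: string[]; // tool names in call order
  errors: string[];    // error messages in order
}

// Rebuilds a run summary by applying events in sequence order.
function replayRun(events: TraceEvent[]): ReplaySummary {
  const summary: ReplaySummary = { output: "", toolCalls: [], errors: [] };
  const ordered = [...events].sort((a, b) => a.seq - b.seq);
  for (const event of ordered) {
    switch (event.type) {
      case "token":
        summary.output += event.text;
        break;
      case "tool_call":
        summary.toolCalls.push(event.name);
        break;
      case "error":
        summary.errors.push(event.message);
        break;
    }
  }
  return summary;
}
```

Because the log is the source of truth, the same fold can be rerun at any time to reconstruct what the model did, including steps that later failed.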
FAQ
Questions
Is there a hosted version?
Yes. It is free during beta, with no credit card required. Sign up with your email and an OpenAI or Anthropic key, get a `relay_live_…` key back, and point the SDK at it. The hosted cloud and self-host run the exact same code; you can switch by changing one env var.
Do you take a cut of my tokens?
No. Relay is BYOK by design — you upload your own Anthropic / OpenAI keys (encrypted with AES-256-GCM the moment they arrive) and your tokens flow straight to the providers. We never proxy billing.
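For readers curious what AES-256-GCM encryption of a provider key involves, here is a minimal sketch using Node's built-in `crypto` module. The `sealCredential` and `openCredential` helpers and the `Sealed` shape are hypothetical, and how Relay actually derives, stores, and rotates per-tenant keys is not shown here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical container for one encrypted credential.
interface Sealed {
  iv: Buffer;         // 96-bit nonce, unique per encryption
  tag: Buffer;        // GCM authentication tag
  ciphertext: Buffer;
}

// Encrypts a provider key with a 32-byte tenant key.
function sealCredential(plaintext: string, key: Buffer): Sealed {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypts and authenticates; throws if the ciphertext or tag was altered.
function openCredential(sealed: Sealed, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag);
  const plaintext = Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]);
  return plaintext.toString("utf8");
}
```

GCM is an authenticated mode, so a tampered record fails to decrypt instead of yielding a corrupted key.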
Is this another LangChain wrapper?
No. Relay is the runtime under your agent, not a chain abstraction. You write plain functions; Relay handles state, retries, providers, tools, memory, and traces.
Where does my data live?
On the cloud: a per-tenant slice of Postgres on Supabase, credentials encrypted at rest. On self-host: wherever you run your Postgres. Either way, your tool handlers execute in your process — Relay never sees your business logic.
What's the lock-in?
Almost none. Your tools are TypeScript functions in your repo. Your prompts are strings. Your memory is a Postgres table you can export. The SDK protocol is plain HTTP + SSE. Moving cloud ↔ self-host is one `baseUrl` change.
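One way to keep that switch to a single setting is to resolve the SDK options from the environment. In this sketch, `RELAY_API_KEY` and the cloud URL come from the SDK example on this page, while `RELAY_BASE_URL` and `resolveRelayConfig` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: RELAY_BASE_URL is an assumed variable name.
interface RelayConfig {
  apiKey: string;
  baseUrl: string;
}

// Hosted endpoint as it appears in the SDK example on this page.
const CLOUD_BASE_URL = "https://api.relaygh.dev";

function resolveRelayConfig(env: Record<string, string | undefined>): RelayConfig {
  const apiKey = env.RELAY_API_KEY;
  if (!apiKey) throw new Error("RELAY_API_KEY is not set");
  // Unset means hosted cloud; set it to your own runtime's URL when self-hosting.
  return { apiKey, baseUrl: env.RELAY_BASE_URL ?? CLOUD_BASE_URL };
}
```

The result can be spread into `createAgent({ ...resolveRelayConfig(process.env), model: "gpt-4o-mini" })`, so moving between cloud and self-host touches no code.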
What's coming next?
Durable execution (resumable agents across crashes), human-in-the-loop checkpoints, multi-agent orchestration, voice agents. Track the roadmap on GitHub.
Get started
Get an API key. Ship an agent today.
Free cloud beta — no credit card. Or self-host the whole stack in three commands.