Jetty

One API for AI workflows, evaluation, and agentic execution.

Jetty gives your team a single /v1/chat/completions endpoint that does three things: proxies 100+ LLM providers, orchestrates multi-step evaluation pipelines, and runs autonomous agents in isolated sandboxes. Drop it into any OpenAI-compatible integration — existing SDKs work out of the box.

Start here

Chat Completions API

One endpoint, two modes. Without the jetty block it's a standard LLM proxy with automatic trajectory recording. With the jetty block it provisions a sandbox, runs an agent, and returns structured results.

Works with any OpenAI SDK. Switch providers by changing the model field.
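As a rough illustration of the two modes and the model switch described above, the sketch below builds request bodies in Python using only the standard library. The helper name `build_payload` and the model ID `example-other-model` are hypothetical. The field names mirror the curl examples on this page, and nothing here is sent over the network.

```python
import json


def build_payload(model, prompt, jetty=None):
    """Build an OpenAI-style chat completions body.

    With jetty=None the body is a plain passthrough request. Passing a dict
    adds the "jetty" block, which this page says selects runbook mode.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if jetty is not None:
        payload["jetty"] = dict(jetty)  # copy so the caller's dict is not shared
    return payload


# Same prompt, two providers: only the "model" field differs.
# "example-other-model" is a placeholder, not a confirmed model ID.
a = build_payload("claude-sonnet-4-6", "Summarize this quarter's metrics")
b = build_payload("example-other-model", "Summarize this quarter's metrics")

# Adding a jetty block turns the same request shape into a runbook request.
c = build_payload(
    "claude-sonnet-4-6",
    "Run the evaluation suite",
    jetty={
        "collection": "my-evals",
        "runbook_url": "https://raw.githubusercontent.com/org/repo/main/RUNBOOK.md",
    },
)

print(json.dumps(c, indent=2))
```

Keeping the `jetty` block optional means one helper can serve both modes, and switching providers stays a one-argument change.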

Read the API reference →

Architecture

Understand how Jetty's three engines — Passthrough, Workflow, and Runbook — connect through a single API layer to persistence, tracing, and object storage. See how Collections, Tasks, and Trajectories organize your work.

See the architecture →

Build with Jetty

Writing Runbooks

A runbook is a structured markdown file that tells a coding agent how to accomplish a complex task end-to-end, with evaluation loops, iteration, and quality gates. For tasks where the first attempt is rarely sufficient, a runbook encodes your domain expertise into a repeatable process.

The guide covers the canonical structure, frontmatter schema, evaluation patterns, common pitfalls, and the /create-runbook wizard.

Learn to write runbooks →

Agent Skill & MCP

Use Jetty directly from Claude Code, Cursor, Windsurf, VS Code Copilot, Zed, or Gemini CLI. Your agent can create workflows, kick off runs, inspect trajectories, run evaluations, and browse 40+ step templates — all without leaving your editor.

Three connection methods: Claude Code plugin (/jetty commands), MCP server, or raw REST API.

Set up agent integration →

Guides

Hands-on tutorials for real workflows:

Agentic quickstart — Upload a file, run an agent in a sandbox, retrieve artifacts. 5 minutes.
Evaluating LLMs — Build evaluation pipelines with LLM-as-Judge scoring.
Custom benchmarks — Upload agents and datasets to TerminalBench at runtime.
CI integration — Trigger workflows from GitHub Actions with quality gates.
Brand compliance — Automated content review against your guidelines.

Browse all guides →

Quick Start

Get running fast with progressive tutorials:

60 seconds — Generate an image in one API call.
Setup — Get your API token and configure model keys.
First flow — Create and run your first workflow.
Model comparison — Compare GPT-4, Claude, and Gemini side-by-side (5 min).
Agent benchmarking — Test coding agents with TerminalBench (10 min).

Start the quick start →

How it works

# Passthrough mode — standard LLM proxy, every call recorded as a trajectory
curl https://flows-api.jetty.io/v1/chat/completions \
  -H "Authorization: Bearer $JETTY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Summarize this quarter'\''s metrics"}],
    "stream": true
  }'

# Runbook mode — add a jetty block to run an agent in an isolated sandbox
curl https://flows-api.jetty.io/v1/chat/completions \
  -H "Authorization: Bearer $JETTY_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Run the evaluation suite"}],
    "jetty": {
      "collection": "my-evals",
      "runbook_url": "https://raw.githubusercontent.com/org/repo/main/RUNBOOK.md"
    }
  }'

Both modes return trajectories — full execution traces with inputs, outputs, and metadata — so you always have observability into what happened and why.
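The passthrough example sets `"stream": true`. A minimal sketch of consuming such a stream in Python follows, assuming the response uses the same server-sent event chunk format as the OpenAI API (`data: {json}` lines ending with `data: [DONE]`). This page states OpenAI compatibility but does not show the stream format, so treat that as an assumption. The helper name `iter_deltas` and the sample lines are illustrative only.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from OpenAI-style server-sent event lines.

    Assumes lines of the form "data: {json}" and a final "data: [DONE]"
    sentinel, as in the OpenAI streaming format.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured Jetty output.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Revenue "}}]}',
    'data: {"choices": [{"delta": {"content": "grew."}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints: Revenue grew.
```

Because `iter_deltas` accepts any iterable of strings, the same function could be fed decoded lines from an HTTP response body.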

More resources

Step Library

47+ pre-built activities: AI models, control flow, data processing, evaluation.

Browse steps →

Examples

Copy-paste workflow JSON for common tasks: chat, image gen, batch processing, translation.

View examples →

API Reference

Chat completions, webhooks, GitHub PR integration, and authentication.

Read the API docs →
