# Compare AI Models in 5 Minutes

Ask the same question to GPT-4, Claude, and Gemini - see how they differ.
## What You'll Build

```
Question → GPT-4  → Response 1
         → Claude → Response 2
         → Gemini → Response 3
```

All three models run in parallel!
## The Workflow

```json
{
  "init_params": {
    "question": "Explain quantum computing in one sentence.",
    "system_prompt": "You are a helpful teacher. Keep answers simple and accessible."
  },
  "step_configs": {
    "gpt4": {
      "model": "openai/gpt-4o",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    },
    "claude": {
      "model": "anthropic/claude-sonnet-4-20250514",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    },
    "gemini": {
      "model": "gemini/gemini-2.5-flash",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    }
  },
  "steps": ["gpt4", "claude", "gemini"]
}
```
## Try It

- Copy the workflow above
- Paste into Jetty UI or run via API
- Change the `question` to test different prompts
- Compare the three responses side-by-side
## What You'll Learn

### 1. `litellm_chat` - Universal LLM connector

LiteLLM connects to 100+ models through a unified interface:

```json
{
  "activity": "litellm_chat",
  "model": "openai/gpt-4o",
  "user_prompt_path": "init_params.question"
}
```
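The same unified interface is available directly from the `litellm` Python package. The sketch below shows the general call shape; it assumes `litellm` is installed (`pip install litellm`) and the relevant provider API key (e.g. `OPENAI_API_KEY`) is set in the environment. The `build_messages` helper is illustrative, not part of LiteLLM.

```python
def build_messages(question, system_prompt=None):
    """Assemble the chat messages list that litellm.completion() expects."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


def ask(model, question, system_prompt=None):
    """Send one question to any provider through LiteLLM's single entry point."""
    from litellm import completion  # pip install litellm

    response = completion(model=model, messages=build_messages(question, system_prompt))
    return response.choices[0].message.content


# Example (requires an API key in the environment):
# ask("openai/gpt-4o", "Explain quantum computing in one sentence.")
```

Swapping providers is just a matter of changing the `model` string; the messages format stays the same.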
### 2. Model naming conventions

| Provider | Format | Examples |
|---|---|---|
| OpenAI | `openai/model-name` | `openai/gpt-4o`, `openai/gpt-4o-mini` |
| Anthropic | `anthropic/model-name` | `anthropic/claude-sonnet-4-20250514` |
| Google | `gemini/model-name` | `gemini/gemini-2.5-flash`, `gemini/gemini-2.5-pro` |
### 3. Path expressions

`init_params.question` references your input parameter:

```json
"init_params": {
  "question": "Your question here"   ← This value
}
```
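Resolving a dotted path through nested parameters takes only a few lines. This is an illustrative sketch of the idea, not Jetty's actual implementation:

```python
from functools import reduce


def resolve_path(state, path):
    """Walk a dotted path like 'init_params.question' through nested dicts."""
    return reduce(lambda node, key: node[key], path.split("."), state)


state = {"init_params": {"question": "Explain quantum computing in one sentence."}}
value = resolve_path(state, "init_params.question")
```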
### 4. Parallel execution

All steps in the `steps` array run simultaneously when they don't depend on each other.
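Fanning independent steps out concurrently can be sketched with a thread pool; here `run_step` is a stub standing in for a real `litellm_chat` call, and the whole snippet is illustrative rather than Jetty's scheduler:

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(name):
    """Stand-in for one litellm_chat call; a real engine would hit the API here."""
    return f"response from {name}"


steps = ["gpt4", "claude", "gemini"]

# Independent steps are dispatched at the same time; total latency is roughly
# that of the slowest model, not the sum of all three.
with ThreadPoolExecutor() as pool:
    results = dict(zip(steps, pool.map(run_step, steps)))
```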
## The Output

Each model's response is stored in its step outputs:

```json
{
  "gpt4": {
    "outputs": {
      "content": "Quantum computing uses quantum bits that can exist in multiple states simultaneously..."
    }
  },
  "claude": {
    "outputs": {
      "content": "Quantum computing harnesses the principles of quantum mechanics..."
    }
  },
  "gemini": {
    "outputs": {
      "content": "Quantum computing leverages quantum phenomena like superposition..."
    }
  }
}
```
## Add a Judge to Pick the Best

Let an LLM decide which response is best:

```json
{
  "init_params": {
    "question": "Explain quantum computing in one sentence.",
    "system_prompt": "You are a helpful teacher. Keep answers simple."
  },
  "step_configs": {
    "gpt4": {
      "model": "openai/gpt-4o",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    },
    "claude": {
      "model": "anthropic/claude-sonnet-4-20250514",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    },
    "gemini": {
      "model": "gemini/gemini-2.5-flash",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question",
      "system_prompt_path": "init_params.system_prompt"
    },
    "judge": {
      "model": "openai/gpt-4o",
      "activity": "litellm_chat",
      "user_prompt": "Compare these three explanations of quantum computing and pick the best one for a beginner:\n\n1. GPT-4: {{ gpt4.outputs.content }}\n\n2. Claude: {{ claude.outputs.content }}\n\n3. Gemini: {{ gemini.outputs.content }}\n\nWhich is clearest and why?"
    }
  },
  "steps": ["gpt4", "claude", "gemini", "judge"]
}
```
**Note:** The `judge` step runs after the first three because it references their outputs.
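One way an engine can discover that ordering is to scan each step's prompt template for `{{ step.outputs... }}` references. The sketch below is purely illustrative of the idea, not Jetty's actual scheduler:

```python
import re


def referenced_steps(prompt):
    """Find step names mentioned as {{ name.outputs... }} in a prompt template."""
    return set(re.findall(r"\{\{\s*(\w+)\.outputs", prompt))


judge_prompt = (
    "1. GPT-4: {{ gpt4.outputs.content }}\n"
    "2. Claude: {{ claude.outputs.content }}\n"
    "3. Gemini: {{ gemini.outputs.content }}"
)

# Steps that must finish before the judge can start:
deps = referenced_steps(judge_prompt)
```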
## Compare Different Model Versions

Test the same provider's models:

```json
{
  "step_configs": {
    "gpt4": {
      "model": "openai/gpt-4o",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question"
    },
    "gpt4_mini": {
      "model": "openai/gpt-4o-mini",
      "activity": "litellm_chat",
      "user_prompt_path": "init_params.question"
    }
  },
  "steps": ["gpt4", "gpt4_mini"]
}
```
## Available Configuration Options

```json
{
  "activity": "litellm_chat",
  "model": "openai/gpt-4o",
  "user_prompt_path": "init_params.question",
  "system_prompt": "You are helpful.",
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 0.9
}
```

| Parameter | Description | Default |
|---|---|---|
| `temperature` | Creativity (0-2) | 1.0 |
| `max_tokens` | Maximum response length | Model default |
| `top_p` | Nucleus sampling | 1.0 |
| `system_prompt` | System message | None |
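These sampling parameters correspond directly to keyword arguments of `litellm.completion()`. As a hedged sketch, a step config like the one above could be translated into a completion call as follows; `completion_kwargs` is an illustrative helper, not part of Jetty or LiteLLM:

```python
def completion_kwargs(config, user_prompt):
    """Translate a step config dict into litellm.completion() keyword arguments."""
    messages = []
    if config.get("system_prompt"):
        messages.append({"role": "system", "content": config["system_prompt"]})
    messages.append({"role": "user", "content": user_prompt})

    kwargs = {"model": config["model"], "messages": messages}
    # Forward only the sampling parameters that are actually set.
    for param in ("temperature", "max_tokens", "top_p"):
        if param in config:
            kwargs[param] = config[param]
    return kwargs


config = {
    "model": "openai/gpt-4o",
    "system_prompt": "You are helpful.",
    "temperature": 0.7,
    "max_tokens": 1000,
    "top_p": 0.9,
}
kwargs = completion_kwargs(config, "Explain quantum computing in one sentence.")
# Then: litellm.completion(**kwargs)  (requires an API key in the environment)
```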
## Next Steps
- LLM Evaluation - Score and rank model outputs automatically
- Batch Processing - Compare models across many prompts
- Image Generation - Compare image generation models