AI Models Overview

The AI Models category provides 16 steps for integrating with modern AI services and models. It is the foundation for text generation, image creation, embeddings, and multi-modal AI interactions in Jetty workflows.

Category Structure

🤖 Google Gemini (4 steps)

Direct integration with Google's Gemini AI platform for text generation and file processing:

  • gemini_prompt - Generate text responses using Gemini models
  • gemini_text_reader - Process and analyze text content with Gemini
  • gemini_json_reader - Read and summarize JSON data using Gemini
  • gemini_file_reader - Upload and analyze files with Gemini models
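For instance, a gemini_file_reader step could be configured as follows. This is an illustrative sketch: the file_path value and model identifier are assumptions, not required values.

```json
{
  "activity": "gemini_file_reader",
  "model": "gemini-2.0-flash-001",
  "file_path": "reports/quarterly.pdf",
  "prompt": "Summarize the key findings in this document"
}
```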

🌐 LiteLLM Multi-Provider (6 steps)

Unified interface supporting 100+ AI providers including OpenAI, Anthropic, Azure, and local models:

  • litellm_chat - Chat completions with any supported provider
  • litellm_vision - Image analysis using vision-capable models
  • litellm_embeddings - Generate text embeddings from any provider
  • litellm_image_generation - Create images using DALL-E and compatible models
  • litellm_function_call - Function calling with tool-compatible models
  • litellm_batch - Process multiple requests efficiently in batches
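As a sketch, a litellm_embeddings step might look like this; the input field name and the model identifier are assumptions, so check the step's own documentation for exact options.

```json
{
  "activity": "litellm_embeddings",
  "model": "text-embedding-3-small",
  "input": "Text to generate an embedding for"
}
```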

🔄 Replicate Hosted Models (6 steps)

Access to specialized models hosted on Replicate's platform:

  • replicate_text2image - Generate images from text prompts (Flux, SDXL)
  • replicate_text_stream - Stream text generation from hosted models
  • replicate_extract_embeddings_url - Extract image embeddings using CLIP
  • replicate_segment - Image segmentation and object detection
  • replicate_brand_compliance - Brand guideline compliance checking
  • replicate_modify_image - Transform and edit images with AI

Common Patterns

Configuration Template

All AI model steps follow consistent configuration patterns:

{
  "activity": "step_name",
  "model": "model_identifier",
  "prompt": "your_prompt_here",
  "temperature": 0.7,
  "max_tokens": 1000
}

Secrets Management

API keys are managed through Jetty's unified secrets system:

{
  "api_key_secret": "PROVIDER_API_KEY",
  "organization": "your_org_id"
}

Supported Patterns:

  • Direct Configuration: "api_key": "sk-..."
  • Secrets Manager: "api_key_secret": "OPENAI_API_KEY"
  • Environment Variables: Falls back to standard env vars

Input/Output Patterns

Text Generation Steps:

  • Input: prompt, messages, or prompt_path
  • Output: text, response_length, model_used

Image Generation Steps:

  • Input: prompt, configuration parameters
  • Output: images[] with paths, metadata, and format info

Multi-Modal Steps:

  • Input: prompt + image_path or file references
  • Output: Analysis results, extracted data, or transformed content

Provider Comparison

| Provider  | Strengths                     | Best For                        | Model Examples           |
|-----------|-------------------------------|---------------------------------|--------------------------|
| Gemini    | File processing, long context | Document analysis, file uploads | gemini-2.0-flash-001     |
| LiteLLM   | Universal compatibility       | Multi-provider workflows        | gpt-4, claude-3.5-sonnet |
| Replicate | Specialized models            | Image generation, custom models | flux-schnell, sdxl       |

Use Case Examples

Text Generation Workflow

{
  "steps": [
    {
      "name": "generate_content",
      "activity": "litellm_chat",
      "config": {
        "model": "gpt-4",
        "prompt": "Write a product description for..."
      }
    }
  ]
}

Multi-Modal Analysis

{
  "steps": [
    {
      "name": "analyze_image",
      "activity": "litellm_vision",
      "config": {
        "model": "gpt-4-vision-preview",
        "image_path": "previous_step.outputs.images[0].path",
        "prompt": "Analyze this image for brand compliance"
      }
    }
  ]
}

Image Generation Pipeline

{
  "steps": [
    {
      "name": "create_image",
      "activity": "replicate_text2image",
      "config": {
        "model": "black-forest-labs/flux-schnell",
        "prompt": "A futuristic cityscape at sunset"
      }
    },
    {
      "name": "modify_image",
      "activity": "replicate_modify_image",
      "config": {
        "image_path": "create_image.outputs.images[0].path",
        "prompt": "Add flying cars to the scene"
      }
    }
  ]
}

Performance Considerations

Model Selection Guidelines

  • Speed Priority: Gemini Flash, GPT-3.5-turbo, Claude Haiku
  • Quality Priority: GPT-4, Claude Sonnet, Gemini Pro
  • Cost Optimization: Use LiteLLM for provider switching based on pricing
  • Specialized Tasks: Replicate for cutting-edge image generation models
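One way to act on these guidelines is a simple priority-to-model lookup. This is a hypothetical helper, not part of Jetty, and the candidate model identifiers are illustrative.

```python
# Hypothetical mapping from a workflow's priority to candidate models,
# following the guidelines above. Model names are illustrative.
MODEL_CHOICES = {
    "speed": ["gemini-2.0-flash-001", "gpt-3.5-turbo", "claude-3-haiku"],
    "quality": ["gpt-4", "claude-3.5-sonnet", "gemini-1.5-pro"],
}

def pick_model(priority: str) -> str:
    """Return the first candidate model for the given priority."""
    candidates = MODEL_CHOICES.get(priority)
    if not candidates:
        raise ValueError(f"unknown priority: {priority}")
    return candidates[0]
```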

Batch Processing

  • Use litellm_batch for processing multiple similar requests
  • Configure appropriate batch_size based on rate limits
  • Implement error handling for partial batch failures
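Handling partial batch failures might look like the following sketch. The per-item result shape (an "error" key marking failed items) is an assumption for illustration, not litellm_batch's documented output format.

```python
def split_batch_results(results: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate successful items from failed ones so a workflow can
    retry or log the failures without discarding the successes."""
    succeeded = [r for r in results if "error" not in r]
    failed = [r for r in results if "error" in r]
    return succeeded, failed
```

The failed list can then be fed back into a retry step while the successes continue downstream.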

Rate Limiting

All steps respect provider rate limits through:

  • Automatic retry with exponential backoff
  • Configurable timeout settings
  • Graceful error handling and recovery
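Exponential backoff with jitter is a generic pattern and can be sketched as follows; this is not Jetty's internal retry code.

```python
import random
import time

def call_with_backoff(fn, max_retries: int = 5, base_delay: float = 1.0):
    """Retry fn() with exponential backoff plus jitter, a standard
    way to respect provider rate limits."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus random jitter
            # so that concurrent callers do not retry in lockstep.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```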

Getting Help

  • Review individual step documentation for detailed configuration options
  • Check the Flow Library examples for complete workflows
  • See Step Library Overview for environment variable setup
  • Join the community for AI model integration best practices