Google Gemini Integration

Google Gemini steps provide direct integration with Google's advanced AI models for text generation, image generation, file processing, and document analysis. These steps are optimized for long-context understanding and multi-modal processing.

Available Steps (5)

gemini_prompt

Generate text responses using Gemini models with customizable prompts.

Activity Name: gemini_prompt

Use Cases: Content generation, Q&A, creative writing, code generation

gemini_text_reader

Process and analyze text content with customizable analysis prompts.

Activity Name: gemini_text_reader

Use Cases: Text summarization, content analysis, document processing

gemini_json_reader

Read and analyze JSON data or files with intelligent parsing.

Activity Name: gemini_json_reader

Use Cases: Data analysis, JSON summarization, structured data processing

gemini_file_reader

Upload and analyze files directly using Gemini's file processing capabilities.

Activity Name: gemini_file_reader

Use Cases: Document analysis, file content extraction, multi-format processing

gemini_image_generator

Generate images from prompts, optionally conditioned on input images.

Activity Name: gemini_image_generator

Use Cases: Image generation, image editing, visual content creation

Configuration

Authentication

All Gemini steps use unified API key management:

{
  "api_key_secret": "GEMINI_API_KEY"
}

Authentication Patterns:

  • Direct: "api_key": "your-api-key"
  • Secrets Manager: "api_key_secret": "GEMINI_API_KEY" (recommended)
  • Environment: Falls back to GEMINI_API_KEY environment variable
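The three patterns above can be read as a lookup order. The following is a minimal sketch of that order, not the integration's actual code: the function name `resolve_api_key`, the `secrets` mapping, and the assumption that a direct key takes precedence over a secret are all hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    config: Mapping[str, str],
    secrets: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a Gemini API key using an assumed precedence order (hypothetical helper)."""
    env = os.environ if env is None else env

    # 1. Direct: a literal key in the step config.
    if config.get("api_key"):
        return config["api_key"]

    # 2. Secrets Manager: look up the named secret (default name from this page).
    secret_name = config.get("api_key_secret", "GEMINI_API_KEY")
    if secrets.get(secret_name):
        return secrets[secret_name]

    # 3. Environment: fall back to the GEMINI_API_KEY variable.
    if env.get("GEMINI_API_KEY"):
        return env["GEMINI_API_KEY"]

    raise KeyError("No Gemini API key found in config, secrets, or environment")
```

Passing `secrets` and `env` as plain mappings keeps the sketch independent of any particular secrets backend.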

Model Selection

All steps support model configuration:

{
  "model": "gemini-2.0-flash-001"
}

Available Models:

  • gemini-2.0-flash-001 (default) - Fast model for simple tasks
  • gemini-1.5-pro - High-quality analysis model
  • gemini-1.5-flash - Balanced speed and quality

Step Documentation

gemini_prompt

Generate text using Gemini models with direct prompts.

Configuration

{
  "activity": "gemini_prompt",
  "model": "gemini-2.0-flash-001",
  "prompt": "Your prompt here"
}

Parameters

  • prompt (string, required) - The text prompt for generation
  • model (string, default: gemini-2.0-flash-001) - Gemini model to use
  • api_key_secret (string, default: GEMINI_API_KEY) - Secret containing API key

Example

{
  "name": "generate_content",
  "activity": "gemini_prompt",
  "config": {
    "prompt": "Write a professional email about project updates",
    "model": "gemini-2.0-flash-001"
  }
}
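If step definitions are generated from code rather than written by hand, a builder along these lines could produce the same JSON. This is a sketch under assumptions: `make_step` and `REQUIRED_PARAMS` are hypothetical names, and the required parameters are copied from the parameter lists documented for each step on this page.

```python
import json

# Required parameters per activity, as documented on this page.
REQUIRED_PARAMS = {
    "gemini_prompt": ["prompt"],
    "gemini_text_reader": ["text_path"],
    "gemini_json_reader": ["json_path"],
    "gemini_file_reader": ["asset_path"],
    "gemini_image_generator": ["prompt"],
}


def make_step(name: str, activity: str, **config: str) -> dict:
    """Build a step definition (hypothetical helper) and check required parameters."""
    if activity not in REQUIRED_PARAMS:
        raise ValueError(f"Unknown Gemini activity: {activity}")
    missing = [p for p in REQUIRED_PARAMS[activity] if p not in config]
    if missing:
        raise ValueError(f"{activity} is missing required parameters: {missing}")
    return {"name": name, "activity": activity, "config": config}


step = make_step(
    "generate_content",
    "gemini_prompt",
    prompt="Write a professional email about project updates",
    model="gemini-2.0-flash-001",
)
print(json.dumps(step, indent=2))
```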

gemini_text_reader

Analyze text content with customizable analysis prompts.

Configuration

{
  "activity": "gemini_text_reader",
  "text_path": "previous_step.outputs.text",
  "prompt": "Summarize this text in 100 words"
}

Parameters

  • text_path (string, required) - Path to text content from previous steps
  • prompt (string, default: summary prompt) - Analysis instruction
  • model (string, default: gemini-2.0-flash-001) - Gemini model to use

Example

{
  "name": "analyze_content",
  "activity": "gemini_text_reader",
  "config": {
    "text_path": "fetch_article.outputs.content",
    "prompt": "Extract the main themes and key points from this article"
  }
}
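A value such as `fetch_article.outputs.content` is a dotted path into the outputs of earlier steps. The sketch below shows one way such a path could be resolved; the nested-dictionary layout of the trajectory and the name `resolve_path` are assumptions for illustration, not the platform's documented internals.

```python
from collections.abc import Mapping
from typing import Any


def resolve_path(path: str, trajectory: Mapping) -> Any:
    """Walk a dotted path through nested mappings (hypothetical trajectory layout)."""
    current: Any = trajectory
    for part in path.split("."):
        if not isinstance(current, Mapping) or part not in current:
            raise KeyError(f"Cannot resolve '{path}': missing segment '{part}'")
        current = current[part]
    return current


# Assumed shape: step name -> "outputs" -> output key.
trajectory = {"fetch_article": {"outputs": {"content": "Full article text"}}}
text = resolve_path("fetch_article.outputs.content", trajectory)
```

A missing segment raises an error that names the failing part, which is the kind of detail that makes a path resolution error easier to diagnose.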

gemini_json_reader

Process JSON data with intelligent analysis and summarization.

Configuration

{
  "activity": "gemini_json_reader",
  "json_path": "previous_step.outputs.data",
  "prompt": "Analyze this JSON structure"
}

Parameters

  • json_path (string, required) - Path to JSON data from a previous step, or a path to a JSON file
  • prompt (string, default: summary prompt) - Analysis instruction
  • model (string, default: gemini-2.0-flash-001) - Gemini model to use

Features

  • Automatic JSON validation and parsing
  • File path resolution for JSON files
  • Intelligent structure analysis

Example

{
  "name": "analyze_data",
  "activity": "gemini_json_reader",
  "config": {
    "json_path": "api_response.outputs.json_data",
    "prompt": "Identify trends and anomalies in this dataset"
  }
}
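The validation and file path resolution features listed for this step could work roughly as follows. This is an illustrative sketch only: `load_json_input` is a hypothetical helper, and the rule that a string starting with `{` or `[` is inline JSON rather than a file path is an assumption.

```python
import json
from pathlib import Path
from typing import Any


def load_json_input(value: Any) -> Any:
    """Accept parsed data, a JSON string, or a path to a JSON file (hypothetical helper)."""
    # Already-parsed data passes through unchanged.
    if isinstance(value, (dict, list)):
        return value
    if not isinstance(value, str):
        raise TypeError(f"Unsupported JSON input type: {type(value).__name__}")

    text = value.strip()
    # Assumption: strings that start like a JSON document are parsed directly.
    if text.startswith(("{", "[")):
        return json.loads(text)  # raises json.JSONDecodeError on invalid JSON

    # Otherwise treat the string as a file path.
    path = Path(text)
    if not path.is_file():
        raise FileNotFoundError(f"JSON file not found: {path}")
    return json.loads(path.read_text(encoding="utf-8"))
```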

gemini_file_reader

Upload and analyze files directly using Gemini's native file processing.

Configuration

{
  "activity": "gemini_file_reader",
  "asset_path": "previous_step.outputs.file_path",
  "prompt": "Analyze this document"
}

Parameters

  • asset_path (string, required) - Path to file for upload and analysis
  • prompt (string, default: generic analysis) - Analysis instruction
  • model (string, default: gemini-2.0-flash-001) - Gemini model to use

Supported Formats

  • Documents: PDF, DOCX, TXT, MD
  • Images: PNG, JPG, GIF, WebP
  • Data: CSV, JSON, XML
  • Code: Most programming languages

Example

{
  "name": "process_document",
  "activity": "gemini_file_reader",
  "config": {
    "asset_path": "upload_file.outputs.file_path",
    "prompt": "Extract key information and create a summary report"
  }
}
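Before sending a file to this step, a pre-flight check against the supported formats can catch obvious mismatches early. The sketch below is an assumption-laden illustration: `classify_asset` is a hypothetical helper, the extension sets are derived from the format list above, and `.jpeg` is added as an assumed alias for JPG. Source code files are not enumerated here because the page only says "most programming languages".

```python
from pathlib import Path
from typing import Optional

# Extensions derived from the Supported Formats list; ".jpeg" is an assumed alias.
SUPPORTED_EXTENSIONS = {
    "document": {".pdf", ".docx", ".txt", ".md"},
    "image": {".png", ".jpg", ".jpeg", ".gif", ".webp"},
    "data": {".csv", ".json", ".xml"},
}


def classify_asset(path: str) -> Optional[str]:
    """Return the format category for a file, or None if it is not in the explicit list."""
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None  # may still be a supported code file; not checked in this sketch
```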

gemini_image_generator

Generate images from prompts using Gemini's native image generation capabilities.

Configuration

{
  "activity": "gemini_image_generator",
  "model": "gemini-2.5-flash-image-preview",
  "prompt": "A serene mountain landscape at sunset"
}

Parameters

  • prompt (string, required) - Text description of desired image
  • model (string, default: gemini-2.5-flash-image-preview) - Gemini image model
  • image_path (string, optional) - Input image path for image-to-image generation

Output

  • images (array) - Generated images with path, extension, and content_type
  • text (string, optional) - Any text response from the model
  • response_length (int, optional) - Length of text response

Example: Text-to-Image

{
  "name": "generate_illustration",
  "activity": "gemini_image_generator",
  "config": {
    "prompt": "A minimalist logo design for a tech startup, clean lines, blue and white color scheme",
    "model": "gemini-2.5-flash-image-preview"
  }
}

Example: Image-to-Image

{
  "name": "edit_image",
  "activity": "gemini_image_generator",
  "config": {
    "prompt": "Transform this photo into a watercolor painting style",
    "image_path": "upload_step.outputs.image_path",
    "model": "gemini-2.5-flash-image-preview"
  }
}

Output Structure

{
  "outputs": {
    "images": [
      {
        "path": "collection/flow/0001/gemini_image_generator_0.png",
        "extension": "png",
        "content_type": "image/png"
      }
    ],
    "text": "I've created a minimalist logo...",
    "response_length": 45
  }
}
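Code that consumes this output should treat `text` and `response_length` as optional, since the output list marks them that way. A minimal sketch of reading the structure above, with `image_paths` as a hypothetical helper name:

```python
from typing import Any, Dict, List


def image_paths(result: Dict[str, Any]) -> List[str]:
    """Collect the storage paths of all generated images from a step result."""
    outputs = result.get("outputs", {})
    return [image["path"] for image in outputs.get("images", [])]


# Sample result mirroring the Output Structure shown above.
result = {
    "outputs": {
        "images": [
            {
                "path": "collection/flow/0001/gemini_image_generator_0.png",
                "extension": "png",
                "content_type": "image/png",
            }
        ],
        "text": "I've created a minimalist logo...",
    }
}

paths = image_paths(result)
caption = result["outputs"].get("text", "")  # optional field: default to empty
```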

Advanced Usage

Chaining Gemini Steps

{
  "steps": [
    {
      "name": "read_document",
      "activity": "gemini_file_reader",
      "config": {
        "asset_path": "document.pdf",
        "prompt": "Extract main topics from this document"
      }
    },
    {
      "name": "expand_content",
      "activity": "gemini_prompt",
      "config": {
        "prompt": "Based on these topics: {{read_document.outputs.text}}, write detailed explanations"
      }
    }
  ]
}
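The `{{read_document.outputs.text}}` placeholder in the second step is filled with the first step's output before the prompt is sent. The sketch below shows one plausible way such substitution could work; the regular expression, the `render` helper, and the nested-dictionary trajectory are assumptions, not the platform's documented template engine.

```python
import re
from typing import Any, Mapping

# Matches {{ step.outputs.key }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, trajectory: Mapping[str, Any]) -> str:
    """Replace each placeholder with the value found at its dotted path."""

    def lookup(match: "re.Match[str]") -> str:
        value: Any = trajectory
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the path does not exist
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


trajectory = {"read_document": {"outputs": {"text": "pricing, onboarding"}}}
prompt = render(
    "Based on these topics: {{read_document.outputs.text}}, write detailed explanations",
    trajectory,
)
```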

Multi-Format Processing

{
  "steps": [
    {
      "name": "process_json",
      "activity": "gemini_json_reader",
      "config": {
        "json_path": "data.json",
        "prompt": "Create a data quality report"
      }
    },
    {
      "name": "process_image",
      "activity": "gemini_file_reader",
      "config": {
        "asset_path": "chart.png",
        "prompt": "Describe the trends shown in this chart"
      }
    }
  ]
}

Error Handling

Common Issues

  • Authentication Error: Verify GEMINI_API_KEY configuration
  • File Upload Error: Check file format support and size limits
  • Path Resolution Error: Ensure referenced paths exist in the trajectory
  • Model Not Found: Verify model name is supported

Best Practices

  • Use appropriate models for task complexity (Flash for speed, Pro for quality)
  • Implement retry logic for rate limiting scenarios
  • Cache file uploads when processing multiple times
  • Use specific prompts for better results
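For the retry recommendation above, a caller-side wrapper might look like the following. This is a generic sketch of exponential backoff with jitter, not the integration's own retry code: `RateLimitError` is a hypothetical stand-in for whatever error a rate-limited call raises in your environment, and `call_with_backoff` is an illustrative name.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised by a rate-limited call."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Base delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real waiting.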

Performance Tips

Model Selection

  • gemini-2.0-flash-001: Fast responses, good for simple tasks
  • gemini-1.5-pro: High quality analysis, longer processing time
  • gemini-1.5-flash: Balanced performance for most use cases
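When flows are generated programmatically, the guidance above can be encoded as a simple lookup. The priority labels and the `pick_model` helper are hypothetical; only the model names and their trade-offs come from this page.

```python
# Maps an assumed priority label to the model this page recommends for it.
MODEL_BY_PRIORITY = {
    "speed": "gemini-2.0-flash-001",
    "quality": "gemini-1.5-pro",
    "balanced": "gemini-1.5-flash",
}


def pick_model(priority: str = "speed") -> str:
    """Return the model suggested for a priority; defaults to the documented default model."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        raise ValueError(f"Unknown priority '{priority}'") from None
```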

Rate Limiting

  • Gemini rate limits depend on the model and the usage tier of your API key
  • Automatic retry with exponential backoff is handled internally
  • Consider batch processing for multiple documents

Related Steps

  • LiteLLM Chat - Alternative chat interface with multi-provider support
  • Replicate Text Stream - Streaming text generation
  • File Processing Tools - Complement Gemini's file analysis capabilities

Integration Examples

View complete workflow examples in the Flow Library:

  • Document analysis workflows
  • Content generation pipelines
  • Multi-modal processing examples