Vision

Vision is a tool that allows you to analyze images with vision models.

With Vision, you can:

Analyze images: Send an image and a prompt to a vision model and receive a text response
Extract text: Pull out the text that appears in an image
Identify objects: List the objects that appear in an image
Describe images: Produce a detailed description of an image's contents
Generate images: Generate images from text

The Gen6 Vision integration allows your agents to analyze images using vision models directly within their workflows.

This enables powerful, image-centric automations. Agents can extract text from images, identify objects, describe images in detail, and generate images from text.

By connecting Gen6 with Vision, you can build agents that answer questions about visual content without manual intervention or custom code.

Usage Instructions

Process visual content with customizable prompts to extract insights and information from images.

Where to Get API Keys for Vision Models

This guide shows you where to find the API keys required to use vision models from OpenAI (GPT-4o) and Anthropic (Claude 3). You only need the key for the provider whose model you select.

1. OpenAI API Key (for GPT-4o with Vision)

The OpenAI key is generated through the platform dashboard.

Sign up or log in to the OpenAI Platform: https://platform.openai.com/
Navigate to the Keys Page: Go directly to the API Keys management section: https://platform.openai.com/account/api-keys
Generate Key: Click “+ Create secret key” to generate and name your new key.
- Remember: The full key is displayed only once. Copy it immediately and save it securely.

2. Claude API Key (for Claude 3 with Vision)

The Claude key is managed within the Anthropic Console.

Sign up or log in to the Anthropic Console: https://console.anthropic.com/
Navigate to the Keys Page: Go directly to the API Keys settings: https://console.anthropic.com/settings/keys
Create Key: Create and copy your API key for use in the tool.
- Note: Ensure you purchase credits or have an active plan, as API access may be restricted until billing is set up.
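
Because the full key is shown only once, it is usually safer to keep it in an environment variable than to paste it into each workflow. The following is a minimal sketch of that idea, not part of Gen6 itself. The variable names `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`, the helper `api_key_for_model`, and the rule of inferring the provider from the model-name prefix are all assumptions made for illustration.

```python
import os

# Hypothetical mapping from model-name prefix to the environment variable
# that holds the matching provider key. The variable names are assumptions.
KEY_ENV_VARS = {
    "gpt-": "OPENAI_API_KEY",
    "claude-": "ANTHROPIC_API_KEY",
}


def api_key_for_model(model, env=None):
    """Return the API key for the provider implied by the model name."""
    # Allow a plain dict to be passed in place of os.environ for testing.
    env = os.environ if env is None else env
    for prefix, var_name in KEY_ENV_VARS.items():
        if model.startswith(prefix):
            key = env.get(var_name, "").strip()
            if not key:
                raise KeyError(f"{var_name} is not set")
            return key
    raise ValueError(f"Unrecognized model name: {model!r}")
```

The returned value can then be supplied as the `apiKey` input, so the key never appears in the workflow definition.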

Parameter	Type	Required	Description
`apiKey`	string	Yes	API key for the selected model provider
`imageUrl`	string	Yes	Publicly accessible image URL
`model`	string	No	Vision model to use (for example, gpt-4o or claude-3-opus-20240229)
`prompt`	string	No	Custom prompt for image analysis
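
The parameters above can be checked before they are handed to the tool. This is a minimal sketch under stated assumptions: the helper `build_vision_params` is hypothetical, and the idea that the inputs are passed as a dictionary keyed by the parameter names is an assumption, not a documented Gen6 interface. It only checks that `imageUrl` is an absolute http(s) URL. It cannot confirm that the image is publicly reachable.

```python
from urllib.parse import urlparse


def build_vision_params(api_key, image_url, model=None, prompt=None):
    """Assemble the tool inputs, enforcing the two required fields."""
    if not api_key:
        raise ValueError("apiKey is required")

    # Syntactic check only: scheme and host must be present.
    parsed = urlparse(image_url or "")
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("imageUrl must be an absolute http(s) URL")

    params = {"apiKey": api_key, "imageUrl": image_url}

    # Optional fields are left out entirely when not provided.
    if model is not None:
        params["model"] = model
    if prompt is not None:
        params["prompt"] = prompt
    return params


# Example usage with placeholder values.
example = build_vision_params(
    api_key="YOUR_API_KEY",
    image_url="https://example.com/receipt.png",
    model="gpt-4o",
    prompt="Extract all text from this receipt.",
)
```

Leaving `model` and `prompt` out of the dictionary when they are not set lets the tool apply whatever defaults it defines, rather than guessing at them here.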