Vision
Analyze images with vision models
Vision is a tool that allows you to analyze images with vision models.
With Vision, you can:
- Analyze images: Analyze images with vision models
- Extract text: Extract text from images
- Identify objects: Identify objects in images
- Describe images: Describe images in detail
- Generate images: Generate images from text
The Gen6 Vision integration allows your agents to analyze images using vision models directly within their workflows.
This enables powerful, image-centric automations. Agents can extract text from images, identify objects, describe images in detail, and generate images from text.
By connecting Gen6 with Vision, you can create sophisticated agents that provide accurate responses and deliver greater value without manual intervention or custom code.
Usage Instructions
Process visual content with customizable prompts to extract insights and information from images.
Where to Get API Keys for Vision Models
This guide shows you where to find the API keys required to use vision models from both OpenAI (GPT-4o) and Anthropic (Claude 3).
1. OpenAI API Key (for GPT-4o with Vision)
The OpenAI key is generated through the platform dashboard.
- Sign up or log in to the OpenAI Platform: https://platform.openai.com/
- Navigate to the Keys Page: Go directly to the API Keys management section: https://platform.openai.com/account/api-keys
- Generate Key: Click “+ Create secret key” to generate and name your new key.
- Remember: The full key is displayed only once. Copy it immediately and save it securely.
2. Claude API Key (for Claude 3 with Vision)
The Claude key is managed within the Anthropic Console.
- Sign up or log in to the Anthropic Console: https://console.anthropic.com/
- Navigate to the Keys Page: Go directly to the API Keys settings: https://console.anthropic.com/settings/keys
- Create Key: Create and copy your API key for use in the tool.
- Note: Ensure you purchase credits or have an active plan, as API access may be restricted until billing is set up.
Tools
vision_tool
Process and analyze images using advanced vision models. Capable of understanding image content, extracting text, identifying objects, and providing detailed visual descriptions.
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
apiKey | string | Yes | API key for the selected model provider |
imageUrl | string | Yes | Publicly accessible image URL |
model | string | No | Vision model to use (gpt-4o, claude-3-opus-20240229, etc) |
prompt | string | No | Custom prompt for image analysis |
Output
| Parameter | Type | Description |
|---|---|---|
content | string | Analysis result |
model | any | Model used |
tokens | any | Token usage |
Screenshot

Notes
- Category:
tools - Type:
vision