Browser Use

BrowserUse enables agents to run browser automation tasks with natural-language instructions. It simulates human-like browsing to navigate sites, click elements, fill forms, scrape data, and execute multi-step workflows with optional live monitoring and structured results.

Usage Instructions

Use this block when your agent needs to interact with the live web (research, form submission, scraping, testing).

Typical flow

Describe the task in plain English (the task).
(Optional) Provide variables (secrets/values the steps can use).
Choose whether to save_browser_data (persist cookies/session).
(Optional) Select model for reasoning (defaults to gpt-4o).
Provide your apiKey (available from https://cloud.browser-use.com/).
Run the block. It starts the task, polls until the run completes, and then returns the task id, success status, output, and steps taken.
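
The flow above can be sketched as assembling one input object. This is a minimal illustration, not the block's actual client code: `build_run_task_input` is a hypothetical helper, and only the field names and defaults come from this page.

```python
import json
from typing import Any, Optional


def build_run_task_input(
    task: str,
    api_key: str,
    variables: Optional[dict[str, Any]] = None,
    save_browser_data: bool = False,
    model: str = "gpt-4o",
) -> dict[str, Any]:
    """Assemble the input fields for browser_use_run_task (hypothetical helper)."""
    payload: dict[str, Any] = {
        "task": task,
        "save_browser_data": save_browser_data,
        "model": model,
        "apiKey": api_key,
    }
    if variables:  # optional field: include it only when values were provided
        payload["variables"] = variables
    return payload


payload = build_run_task_input(
    task="Open example.com/login, sign in, and confirm the dashboard loads.",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    variables={"username": "user", "password": "pass"},
    save_browser_data=True,
)
print(json.dumps(payload, indent=2))
```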

Great for

Automating repetitive web flows (login → search → click → extract).
Collecting structured data from pages.
Filing forms or tickets across internal tools.
End-to-end smoke tests of web apps.

Tools

`browser_use_run_task`

Run a single browser automation task.

Input

Parameter	Type	Required	Description
`task`	string	Yes	Natural-language instruction describing what the browser should do.
`variables`	json	No	Key–value pairs available to the task (e.g., credentials, query terms).
`save_browser_data`	boolean	No	Persist session/browser data (cookies, history). Default: `false`.
`model`	string	No	LLM to use for reasoning (e.g., `gpt-4o`, `gemini-2.0-flash`). Default: `gpt-4o`.
`apiKey`	string	Yes	BrowserUse API key (must be valid and funded).
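
A pre-flight check can catch missing or mistyped fields before a run is spent on them. The sketch below mirrors the Input table; `INPUT_SPEC` and `validate_input` are hypothetical names, and mapping the `json` type to a Python `dict` is an assumption.

```python
from typing import Any

# (expected Python type, required?) per field, mirroring the Input table.
INPUT_SPEC: dict[str, tuple[type, bool]] = {
    "task": (str, True),
    "variables": (dict, False),  # assumption: "json" means a key-value object
    "save_browser_data": (bool, False),
    "model": (str, False),
    "apiKey": (str, True),
}


def validate_input(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems: list[str] = []
    for name, (expected, required) in INPUT_SPEC.items():
        if name not in payload:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(payload[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    for name in payload:
        if name not in INPUT_SPEC:
            problems.append(f"unknown field: {name}")
    return problems


print(validate_input({"task": "Visit site.com/pricing"}))
```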

Output

Parameter	Type	Description
`id`	string	Execution identifier for the task (useful for logs/support).
`success`	boolean	Whether the task completed successfully.
`output`	any	Raw result (e.g., extracted data, confirmation text, structured payload).
`steps`	json	Detailed step list (visited URLs, DOM actions, extracted elements, timing).
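
For downstream code, it can help to wrap the result in a small typed container. This is a sketch under assumptions: `RunTaskResult` is a hypothetical class, `steps` is assumed to arrive as a list, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RunTaskResult:
    """Typed view of the four output fields listed in the Output table."""

    id: str
    success: bool
    output: Any = None
    steps: list[Any] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "RunTaskResult":
        return cls(
            id=str(raw["id"]),
            success=bool(raw["success"]),
            output=raw.get("output"),
            steps=raw.get("steps") or [],  # tolerate a missing or null step list
        )


# Sample values are invented for illustration only.
result = RunTaskResult.from_dict(
    {"id": "task_123", "success": True, "output": "Form submitted", "steps": [{"action": "click"}]}
)
print(result.success, len(result.steps))
```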

Examples

Example 1 — Research summary

Task: “Go to google.com, search ‘latest genAI eval frameworks’, open top 3 reputable results, summarize key differences.”
Variables: { "region": "US" } (optional)
Outcome: output contains a concise comparison; steps shows search → click → parse.

Example 2 — Form submission

Task: “Open example.com/login, sign in with provided credentials, go to /submit, fill and submit the form, confirm the success message.”
Variables: { "username": "user", "password": "pass" }
Outcome: success: true and output includes confirmation; steps logs DOM actions.

Example 3 — Data extraction

Task: “Visit site.com/pricing, extract the plan names, monthly prices, and feature lists into structured JSON.”
Outcome: output returns structured data for downstream use.
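
Because `output` is typed as `any`, an extraction task like Example 3 may hand back either parsed data or a JSON string. A defensive sketch, assuming both shapes are possible (`parse_structured_output` is a hypothetical helper and the sample plan data is invented):

```python
import json
from typing import Any


def parse_structured_output(output: Any) -> Any:
    """Return parsed JSON when output is a JSON string; otherwise return it unchanged."""
    if isinstance(output, str):
        try:
            return json.loads(output)
        except json.JSONDecodeError:
            return output  # plain text, e.g. a confirmation message
    return output


raw_output = '[{"plan": "Pro", "monthly_price": 20, "features": ["SSO"]}]'
plans = parse_structured_output(raw_output)
for plan in plans:
    print(plan["plan"], plan["monthly_price"])
```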

Best Practices

Keep instructions clear and bounded: Provide goals, constraints, and what to return (e.g., “return JSON with fields X, Y, Z”).
Use variables for sensitive data: Pass secrets via variables rather than writing them into the task text.
Persist sessions when needed: Enable save_browser_data for flows that benefit from cookies (e.g., staying logged in).
Ask for structure: If extracting data, tell the agent the exact fields/shape to return.
Fail fast with signal: Instruct the task to stop and return a descriptive error in output when an expected element or page is missing.
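
Several of these practices can be folded into the task text itself. The sketch below builds a bounded instruction with an explicit return shape and an error fallback; `bounded_task` and its wording are illustrative assumptions, not a format the service requires.

```python
def bounded_task(goal: str, constraints: list[str], fields: list[str]) -> str:
    """Compose a task string stating the goal, its limits, and the expected result shape."""
    lines = [goal.strip()]
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {c}" for c in constraints)
    lines.append("Return JSON with exactly these fields: " + ", ".join(fields) + ".")
    # Ask for an explicit error so a failed run still carries a useful signal.
    lines.append('If a step fails, return JSON {"error": "<what went wrong>"} instead.')
    return "\n".join(lines)


task = bounded_task(
    goal="Visit site.com/pricing and extract every plan.",
    constraints=["Do not log in", "Stay on the pricing page"],
    fields=["plan", "monthly_price", "features"],
)
print(task)
```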

Troubleshooting

Symptom	Likely Cause	What to Try
`success: false`, empty output	Site changed layout or selector	Refine the `task` with clearer steps; specify buttons, forms, or URLs.
Login keeps failing	Session not persisted	Set `save_browser_data: true`. Pass credentials via `variables`.
Rate-limit or bot detection	Aggressive navigation/scraping	Slow down steps; reduce frequency; add polite delays; target fewer pages.
Unexpected model behavior	Model too lightweight for complex flows	If you switched to a lighter model, set `model` back to `gpt-4o` (the default) or choose a more capable reasoning model.
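
For the rate-limit row, one way to add polite delays is to space out repeated runs with exponential backoff. This is a generic sketch: `run_task` stands in for however you invoke the block, and the attempt count and delays are arbitrary assumptions.

```python
import time
from typing import Any, Callable


def run_with_backoff(
    run_task: Callable[[], dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call run_task until it reports success, doubling the wait between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result: dict[str, Any] = {}
    for attempt in range(max_attempts):
        result = run_task()
        if result.get("success"):
            return result
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ... with the defaults
    return result  # last failed result, so the caller can inspect output and steps
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.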

Notes

Category: tools
Type: browser_use
The block executes tasks asynchronously and returns once the run completes (it polls for completion under the hood).
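
The polling behavior described above can be pictured with a generic loop. This is not the block's implementation: `get_status`, the `status` key, and the terminal status names are all hypothetical stand-ins.

```python
import time
from typing import Any, Callable

# Hypothetical terminal states; the real service may use different names.
TERMINAL_STATES = {"finished", "failed", "stopped"}


def wait_for_completion(
    get_status: Callable[[], dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Poll get_status until the run reaches a terminal state or the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)
```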
