Google has unveiled a new AI model called Gemini 2.5 Computer Use that can navigate the web using a virtual browser and even perform tasks like filling in forms. Built on Gemini 2.5 Pro, the AI model has “visual understanding and reasoning capabilities” and takes its cues from a user prompt.
“While AI models can interface with software through structured APIs, many digital tasks still require direct interaction with graphical user interfaces, for example, filling and submitting forms. To complete these tasks, agents must navigate web pages and applications just as humans do: by clicking, typing and scrolling,” the blog post making the announcement said.
Users can have the model operate interfaces directly, without going through an API.
They are required to provide inputs including a screenshot of the environment, a history of recent actions and any functions that they want to include. The AI model analyses these inputs and generates a response that specifies the action to take, such as clicking or typing.
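The loop described above can be illustrated with a minimal sketch. This is not the Gemini API: `Action`, `Request`, `FakeBrowser`, `scripted_model` and `run_agent` are all hypothetical names invented for illustration, and the "model" here is a fixed script rather than a real AI model. It only shows the shape of the exchange: a screenshot, the action history and the allowed functions go in, one action comes out, and the loop repeats until the task is done.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical structure)."""
    name: str                      # e.g. "type", "click", "done"
    args: dict = field(default_factory=dict)


@dataclass
class Request:
    """The inputs the article lists: goal, screenshot, history, functions."""
    goal: str
    screenshot: bytes
    history: list
    functions: list


class FakeBrowser:
    """Stand-in for a real browser: a single form with a submit button."""

    def __init__(self):
        self.fields = {}
        self.submitted = False

    def screenshot(self) -> bytes:
        # A real agent would capture pixels; here we serialise the state.
        return repr((self.fields, self.submitted)).encode()

    def execute(self, action: Action) -> None:
        if action.name == "type":
            self.fields[action.args["field"]] = action.args["text"]
        elif action.name == "click" and action.args.get("target") == "submit":
            self.submitted = True


def scripted_model(request: Request) -> Action:
    """Placeholder for the AI model: replays a fixed script of actions."""
    script = [
        Action("type", {"field": "name", "text": "Ada"}),
        Action("click", {"target": "submit"}),
    ]
    step = len(request.history)
    return script[step] if step < len(script) else Action("done")


def run_agent(goal, model, browser, functions, max_steps=10):
    """Send screenshot + history to the model, execute the action, repeat."""
    history = []
    for _ in range(max_steps):
        request = Request(goal, browser.screenshot(), list(history), functions)
        action = model(request)
        if action.name == "done":
            break
        browser.execute(action)
        history.append(action)
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    steps = run_agent("Fill in the form", scripted_model, browser, ["type", "click"])
    print([s.name for s in steps], browser.fields, browser.submitted)
```

In this sketch the code running the loop, not the model, carries out each action on the browser. The `max_steps` cap is an added safeguard so that a model that never signals completion cannot loop forever.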
Google also said that the AI model has access to a browser only and not the entire computer environment.
The Gemini 2.5 Computer Use model has also shown promise on mobile UI control tasks but is not yet “optimised for desktop OS-level control.”
Developers can access the Gemini 2.5 Computer Use model via the Gemini API in Google AI Studio and Vertex AI.
Other versions of the model have already been used for Project Mariner, a prototype that uses AI agents for tasks, and for some agentic capabilities in AI Mode in Search.
Published – October 08, 2025 01:32 pm IST

