Gemini API Managed Agents are hosted AI agents that run inside an isolated Linux sandbox on Google’s infrastructure. A single API call provisions the environment, executes multi-step reasoning, uses tools, and returns results with persistent state across turns. Announced at Google I/O 2026, they are powered by Gemini 3.5 Flash and built on the Antigravity 2.0 agent runtime.
Artificial intelligence is evolving beyond simple chatbots. Businesses now want AI systems that can handle tasks, use tools, analyze information, and automate workflows with minimal human input. That’s where Gemini API Managed Agents come in. They help developers build intelligent AI applications without managing the entire agent architecture manually.
Google’s Gemini API Managed Agents simplify AI development by handling reasoning, memory, tool usage, and workflow execution behind the scenes. Instead of building everything from scratch, developers can focus on creating better user experiences and smarter business solutions. From AI assistants to workflow automation platforms, managed agents make building scalable AI systems much faster.
In this guide, you’ll learn what Gemini API Managed Agents are, how they work, and how to build with them step by step. We’ll also explore real-world use cases, architecture basics, and best practices for creating reliable AI-powered applications using the Gemini ecosystem.
Gemini API Managed Agents are hosted AI agents that Google runs and manages on your behalf through the Gemini API. You define what the agent can do, what tools it has access to, and how it should behave. Google handles the infrastructure: memory, state persistence across turns, tool execution, and orchestration.
Before Managed Agents existed, building a multi-step agent required you to:
With Managed Agents, most of that disappears. The agent runs server-side, retains state across calls, and can reason across long workflows without you managing a single session object.
Simple definition: A Gemini Managed Agent is a persistent, server-hosted AI agent that you configure once and call via the Interactions API. It remembers context, uses tools, executes code, and handles complex multi-step tasks without you writing orchestration logic.
The context matters here. At Google I/O 2026, Sundar Pichai said Google is now processing over 3.2 quadrillion tokens per month, up from 480 trillion a year ago. More than 8.5 million developers build with Google models every month. At that scale, the bottleneck is no longer the model. It is the complexity of building reliable agents around the model.
Google described this moment as the shift into the “agentic era.” The company’s goal with Managed Agents is to let developers focus entirely on what the agent should do, not on the plumbing required to make it run.
Managed Agents are powered by Gemini 3.5 Flash, which was also announced at I/O 2026. According to Google, Gemini 3.5 Flash is four times faster than competing frontier models and scores 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas, two agentic coding benchmarks. That speed advantage is what makes server-side agent hosting practical at scale.
The core of Managed Agents is the Interactions API. This is a new API endpoint introduced at I/O 2026 that handles:
The Interactions API is Google’s equivalent of OpenAI’s server-side history management from the Responses API. State is managed on Google’s servers, so you do not need to pass conversation history on every call.
One of the defining features of Managed Agents is persistent state. When you send a message to a Managed Agent, the agent remembers everything from previous turns in that session. You can close your application, come back the next day, and the agent picks up exactly where it left off.
This is a major shift from stateless API calls where every request starts from scratch.
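To make the contrast concrete, here is a small conceptual simulation. It does not call the Gemini API: SessionServer is a hypothetical in-memory stand-in for server-side state, and the message strings are made up. It only shows how the request grows in the stateless style while staying constant in the stateful one.

```javascript
// Conceptual simulation only: this is NOT the Gemini API.
// SessionServer is a hypothetical in-memory stand-in for server-side state.
class SessionServer {
  constructor() {
    this.sessions = new Map(); // sessionId -> array of messages
  }

  // Stateful style: the caller sends only a session ID and the new message.
  send(sessionId, message) {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(message);
    this.sessions.set(sessionId, history);
    return { turnsSeen: history.length };
  }
}

// Stateless style: the caller must resend the full history on every request.
function statelessRequest(history, message) {
  return { contents: [...history, message] };
}

const server = new SessionServer();
const clientHistory = [];
const messagesSentPerTurn = { stateless: [], stateful: [] };

for (const message of ["Plan a trip", "Make it cheaper", "Book it"]) {
  const request = statelessRequest(clientHistory, message);
  messagesSentPerTurn.stateless.push(request.contents.length);
  clientHistory.push(message);

  server.send("session-1", message);
  messagesSentPerTurn.stateful.push(1); // always one message on the wire
}

console.log(messagesSentPerTurn);
```

In the stateless column the payload grows with every turn, while the stateful caller always sends a single message and lets the server keep the rest.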
Managed Agents are built on top of Antigravity 2.0, Google’s agent-first development platform that was upgraded at I/O 2026. The Antigravity agent itself is a Managed Agent running Gemini 3.5 Flash. This means the same infrastructure that powers Google’s own coding tool is available to developers through the API.
Antigravity 2.0 ships in two views:
Managed Agents can call tools you define. You register tools (functions, APIs, external services) when you create the agent. The agent decides when to call them and passes the results back into its reasoning chain.
Built-in tools available to Managed Agents include:
One of the most developer-friendly features is how you define agent behavior. Instead of writing complex orchestration code, you define everything in markdown files:
You register these files as a named agent in the Gemini API. This declarative approach means non-engineers can read and edit agent behavior without touching any code.
Managed Agents support multi-agent workflows. You can create several specialized agents and have them call each other. For example, a research agent might call a writing agent, which calls a fact-checking agent. Each is a separate Managed Agent with its own state and tools, communicating through the Gemini API.
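The hand-off between specialized agents can be sketched as a simple pipeline. This is an illustration under assumptions: researchAgent, writingAgent, and factCheckAgent are hypothetical plain functions standing in for separate agents, and the sketch uses a sequential hand-off rather than agents invoking each other directly. In a real system each stage would be an API call.

```javascript
// Hypothetical stand-ins: in a real system each function would call a
// separate agent over the API. Here they are plain functions for illustration.
function researchAgent(topic) {
  return {
    topic,
    findings: [`${topic} adoption is growing`, `${topic} tooling is maturing`]
  };
}

function writingAgent(research) {
  const body = research.findings.map((finding) => `- ${finding}`).join("\n");
  return { draft: `Report: ${research.topic}\n${body}`, claims: research.findings };
}

function factCheckAgent(written) {
  // A real checker would verify each claim; this stub only flags empty ones.
  const unverified = written.claims.filter((claim) => claim.trim() === "");
  return { draft: written.draft, approved: unverified.length === 0 };
}

// Chain the agents: each stage receives the previous stage's output.
function runPipeline(topic, stages) {
  return stages.reduce((input, stage) => stage(input), topic);
}

const result = runPipeline("AI agents", [researchAgent, writingAgent, factCheckAgent]);
console.log(result.approved, "\n" + result.draft);
```

Keeping each stage behind the same "input in, output out" shape makes it easy to swap a stub for a real agent call later.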
Before you start, you need:
Install the Gemini SDK in Node.js:

```shell
npm install @google/generative-ai
```

Or in Python:

```shell
pip install google-generativeai
```

Export your API key as an environment variable:

```shell
export GEMINI_API_KEY="your_api_key_here"
```
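A missing key tends to surface later as a confusing authentication error, so it can help to check for it at startup. The helper below is a minimal sketch; requireEnv is a hypothetical name, not part of any SDK.

```javascript
// Hypothetical helper: fail fast when a required environment variable is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}. Set it before starting the app.`);
  }
  return value.trim();
}

// Usage: const apiKey = requireEnv("GEMINI_API_KEY");
```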
Create a file called AGENTS.md in your project root. This file tells the agent who it is and how it should behave.
```markdown
# Agent Instructions

You are a research assistant that helps users find accurate information,
summarize documents, and draft structured reports.

## Behavior guidelines

- Always cite your sources when referencing web content
- Ask clarifying questions before starting a long research task
- Format output as structured markdown unless the user requests otherwise
- If you are unsure about a fact, say so rather than guessing

## Tools available

- Web search: use this when the user asks for current information
- Code execution: use this for data analysis, calculations, or generating charts
```
If your agent has a specialized capability, define it in a SKILL.md file:
```markdown
# Research Report Skill

When asked to produce a research report, follow this structure:

1. Executive summary (3 sentences maximum)
2. Key findings (bullet points)
3. Supporting evidence with sources
4. Recommended next steps

Always complete the report in a single response unless the topic requires
multiple research passes.
```
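This guide does not show how the hosted service registers these files, so the sketch below takes a local approach as an assumption: read whichever instruction files exist and join them into a single string you can pass as the system instruction. loadInstructionFiles and buildSystemInstruction are hypothetical helper names.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: read the instruction files that exist in a directory.
function loadInstructionFiles(dir, names = ["AGENTS.md", "SKILL.md"]) {
  const files = {};
  for (const name of names) {
    const filePath = path.join(dir, name);
    if (fs.existsSync(filePath)) {
      files[name] = fs.readFileSync(filePath, "utf8");
    }
  }
  return files;
}

// Join the file contents into one system instruction string,
// labelling each section with the file it came from.
function buildSystemInstruction(files) {
  return Object.entries(files)
    .map(([name, content]) => `<!-- ${name} -->\n${content.trim()}`)
    .join("\n\n");
}

// Usage: pass the result as systemInstruction when configuring the model.
// const systemInstruction = buildSystemInstruction(loadInstructionFiles(process.cwd()));
```

Inlining the file text this way means the model actually sees the guidelines, rather than only a mention of the file name.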
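Here is a JavaScript example. Note that the code goes through the SDK's chat interface (getGenerativeModel and startChat) and carries the session history in your own code between turns: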
```javascript
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

// Build the agent configuration: model, instructions, tools, generation settings.
async function createManagedAgent() {
  const agentConfig = {
    model: "gemini-3.5-flash",
    // The model receives only this string, not the AGENTS.md file itself.
    systemInstruction: `
      You are a research assistant.
      Follow the behavior guidelines defined in AGENTS.md.
      Use web search for current information.
      Format output as structured markdown.
    `,
    tools: [
      { googleSearch: {} },
      { codeExecution: {} }
    ],
    generationConfig: {
      maxOutputTokens: 8192,
      temperature: 0.2
    }
  };
  return agentConfig;
}

// Send one message, optionally continuing from an earlier history.
async function runAgentSession(userMessage, sessionHistory = []) {
  const agentConfig = await createManagedAgent();
  const model = genAI.getGenerativeModel(agentConfig);
  const chat = model.startChat({
    history: sessionHistory
  });
  const result = await chat.sendMessage(userMessage);
  const response = result.response;
  return {
    text: response.text(),
    updatedHistory: await chat.getHistory()
  };
}

async function main() {
  console.log("Starting Managed Agent session...");

  // First turn
  const firstTurn = await runAgentSession(
    "Research the latest developments in AI agents and give me a structured summary."
  );
  console.log("Agent response:", firstTurn.text);

  // Second turn passes the first turn's history back in, so the agent keeps context
  const secondTurn = await runAgentSession(
    "Now focus on Google's approach and how it compares to OpenAI.",
    firstTurn.updatedHistory
  );
  console.log("Follow-up response:", secondTurn.text);
}

main().catch(console.error);
```
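In the example above the history lives in memory, so it is lost when the process exits. If you want a later run to resume the conversation while carrying history yourself, one option is to write it to disk. This is a minimal sketch: saveHistory, loadHistory, and the session.json file name are hypothetical, and it assumes the history returned by the chat is plain JSON-serializable data.

```javascript
const fs = require("fs");

// Hypothetical helpers: store chat history as JSON so a later run can resume.
function saveHistory(filePath, history) {
  fs.writeFileSync(filePath, JSON.stringify(history, null, 2), "utf8");
}

function loadHistory(filePath) {
  if (!fs.existsSync(filePath)) {
    return []; // no saved session yet: start fresh
  }
  return JSON.parse(fs.readFileSync(filePath, "utf8"));
}

// Usage with runAgentSession from the example above:
// const history = loadHistory("session.json");
// const turn = await runAgentSession("Continue the report.", history);
// saveHistory("session.json", turn.updatedHistory);
```

A file is enough for a single-user script; a multi-user application would more likely key the stored history by user or session ID in a database.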
When the agent calls a custom tool, you handle the result and send it back:
```javascript
// Run the tool the agent asked for and return a JSON-serializable result.
// yourPriceAPI and yourEmailService are placeholders for your own clients.
async function handleToolCall(toolName, toolArgs) {
  if (toolName === "fetchPriceData") {
    const price = await yourPriceAPI.fetch(toolArgs.product);
    return { price, currency: "USD", source: "live" };
  }
  // Only reachable if you also add a sendEmail declaration to the tools list below.
  if (toolName === "sendEmail") {
    await yourEmailService.send(toolArgs);
    return { success: true };
  }
  throw new Error(`Unknown tool: ${toolName}`);
}

async function runAgentWithTools(userMessage) {
  const model = genAI.getGenerativeModel({
    model: "gemini-3.5-flash",
    tools: [
      {
        functionDeclarations: [
          {
            name: "fetchPriceData",
            description: "Fetches live price data for a given product",
            parameters: {
              type: "object",
              properties: {
                product: {
                  type: "string",
                  description: "Product name or SKU"
                }
              },
              required: ["product"]
            }
          }
        ]
      }
    ]
  });

  const chat = model.startChat();
  let result = await chat.sendMessage(userMessage);
  let response = result.response;

  // Keep looping until the agent stops calling tools
  while (response.functionCalls()?.length > 0) {
    const toolCalls = response.functionCalls();
    const toolResults = [];
    for (const call of toolCalls) {
      const toolResult = await handleToolCall(call.name, call.args);
      toolResults.push({
        functionResponse: {
          name: call.name,
          response: toolResult
        }
      });
    }
    // Send every tool result back so the agent can continue reasoning
    result = await chat.sendMessage(toolResults);
    response = result.response;
  }
  return response.text();
}
```
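Arguments in a tool call are generated by the model, so it can be worth checking them against the declaration before running the tool. The sketch below is a deliberately small validator, not a full JSON Schema implementation: validateToolArgs is a hypothetical helper that only covers required fields and primitive types.

```javascript
// Hypothetical helper: check model-supplied arguments against a function
// declaration's parameter schema before running the tool.
// Only "required" and primitive "type" checks are covered in this sketch.
function validateToolArgs(declaration, args) {
  const schema = declaration.parameters ?? {};
  const properties = schema.properties ?? {};
  const errors = [];

  for (const name of schema.required ?? []) {
    if (args?.[name] === undefined) {
      errors.push(`missing required argument "${name}"`);
    }
  }

  for (const [name, value] of Object.entries(args ?? {})) {
    const expected = properties[name]?.type;
    if (expected === undefined) {
      errors.push(`unexpected argument "${name}"`);
    } else if (["string", "number", "boolean"].includes(expected) && typeof value !== expected) {
      errors.push(`argument "${name}" should be ${expected}`);
    }
  }

  return errors;
}

const fetchPriceDeclaration = {
  name: "fetchPriceData",
  parameters: {
    type: "object",
    properties: { product: { type: "string" } },
    required: ["product"]
  }
};

console.log(validateToolArgs(fetchPriceDeclaration, { product: "SKU-123" })); // []
console.log(validateToolArgs(fetchPriceDeclaration, {})); // one "missing required" error
```

When validation fails, returning the error list to the agent as the tool result (instead of throwing) gives it a chance to correct the call on the next turn.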
Once your agent works locally, export it to Antigravity from AI Studio. The Antigravity 2.0 desktop app gives you:
| Feature | Managed Agents | Custom Orchestration |
|---|---|---|
| State management | Server-side, automatic | You manage it |
| Setup time | Minutes | Hours to days |
| Multi-turn memory | Built in | Custom database required |
| Tool execution | Handled by Google | You write the loop |
| Scaling | Google’s infrastructure | Your infrastructure |
| Customization | AGENTS.md and SKILL.md | Full code control |
| Cost | API call costs only | API plus your compute |
Managed Agents are the right choice for most use cases, especially teams that want to ship quickly. Custom orchestration still makes sense when you need deep control over how the agent reasons, or when you are building on a non-Google stack.
A Managed Agent can handle tier-1 support: reading tickets from your CRM, searching your knowledge base, drafting responses, and escalating to a human when confidence is low. The persistent state means the agent remembers a customer’s full history across sessions.
Define a research agent that accepts a topic, searches the web, pulls data, and returns a formatted report. Because the agent maintains state, users can ask follow-up questions without re-explaining context.
This is the primary use case Antigravity is built around. A Managed Agent can read your codebase, understand the task, write code, run tests in the sandbox, fix failures, and return a working diff. The full Antigravity agent is itself a Managed Agent running this loop.
Google demonstrated at I/O 2026 how agents can call businesses, check inventory, and complete purchases using the Agents Payment Protocol. The Universal Cart uses Managed Agent infrastructure. You can build similar workflows: a travel booking agent that searches flights, checks prices, and books on behalf of the user within parameters they set.
Managed Agents integrate natively with Google Workspace. An agent can read emails, pull data from Sheets, draft documents in Docs, and schedule Calendar events. For organizations already on Google Workspace, this is the fastest path to AI-powered internal automation.
Managed Agents are billed through standard Gemini API token pricing. You pay for:
There is no additional fee for using the Managed Agent infrastructure itself. State persistence and orchestration are included in the API price.
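Because billing is token-based, a rough estimate is simple arithmetic. The sketch below is illustrative only: estimateCostUsd is a hypothetical helper, and the rates are placeholders rather than real prices, so substitute the figures from the current pricing page.

```javascript
// Hypothetical helper: estimate cost from token counts.
// Rates are in USD per one million tokens and must come from the current
// pricing page; the numbers used below are placeholders, not real prices.
function estimateCostUsd(usage, rates) {
  const inputCost = (usage.inputTokens / 1_000_000) * rates.inputPerMillion;
  const outputCost = (usage.outputTokens / 1_000_000) * rates.outputPerMillion;
  return inputCost + outputCost;
}

const placeholderRates = { inputPerMillion: 1, outputPerMillion: 2 };
const sessionUsage = { inputTokens: 1_000_000, outputTokens: 500_000 };

console.log(estimateCostUsd(sessionUsage, placeholderRates)); // 2
```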
For high-volume agent workloads, Google dropped the AI Ultra plan to $200 per month at I/O 2026, down from $250. A new $100 developer tier was also introduced for teams building on Gemini professionally.
Yes. If you already use the Gemini API, you adopt Managed Agents by adding tools and system instructions to your model call. The Interactions API is additive and does not require you to rewrite existing integrations.
Gemini 3.5 Flash supports a large context window suited for long multi-turn conversations. For very long sessions, the Interactions API handles context compression automatically to keep costs manageable.
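If you carry the history in your own code, as the chat-based example earlier in this guide does, nothing compresses it for you. A simple approximation is to keep only the most recent entries. This is a sketch under assumptions: trimHistory is a hypothetical helper, and it assumes history entries have a role of "user" or "model".

```javascript
// Hypothetical helper: keep only the most recent entries of a chat history.
// Assumes entries look like { role: "user" | "model", parts: [...] }.
function trimHistory(history, maxEntries) {
  if (maxEntries <= 0) {
    return [];
  }
  const recent = history.slice(-maxEntries);
  // Drop leading non-user entries so the trimmed history opens with a user turn.
  const firstUser = recent.findIndex((entry) => entry.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

// Usage: const shorter = trimHistory(turn.updatedHistory, 20);
```

Dropping old turns loses information that a summary would preserve, and a history containing tool calls needs extra care so that a call is never separated from its result.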
Yes. You can define any external API as a function declaration and the agent will call it. The tool execution environment is flexible: your tools can call anything reachable from your backend.
Is Managed Agent state secure?
Session state is stored on Google Cloud infrastructure under your project’s data governance settings. You can delete session data through the API at any time.
Yes. You can create multiple sessions simultaneously, each running an independent Managed Agent. The Manager view in Antigravity is specifically designed to visualize and coordinate parallel agent workloads.
Gemini Spark is Google’s consumer-facing personal AI agent in the Gemini app. Managed Agents are the developer API that lets you build your own agent experiences. Spark is a product built on the same agent infrastructure you now have access to.
Google announced a $2 million Build with Gemini XPRIZE Hackathon at I/O 2026, asking developers to build real applications with Gemini that solve pressing global challenges. Managed Agents are the primary building block.
WebMCP, a proposed open web standard announced at I/O 2026, will allow agents to interact with web-based tools defined as JavaScript functions and HTML forms. This will significantly expand what Managed Agents can do in browser-based environments.
Gemini 3.5 Pro is arriving next month after the Flash release, bringing deeper reasoning to agentic workflows that require longer planning horizons. Managed Agents will automatically support the Pro model when it ships.
Use this checklist to ship your first Managed Agent:
Gemini API Managed Agents are Google’s answer to the growing complexity of building reliable AI agents at scale. Announced at Google I/O 2026 and powered by Gemini 3.5 Flash, they allow developers and AI agent development companies to deploy persistent, tool-using, stateful AI agents through a single API without building orchestration infrastructure from scratch. Businesses looking to automate workflows, build AI copilots, or launch enterprise AI solutions can also hire AI developers to create scalable agentic applications faster using Gemini Managed Agents.
The key points:
For most teams building AI-powered applications in 2026, Managed Agents are the fastest path from idea to production. The infrastructure is Google’s problem. Your job is to define what the agent should do.
Sources: Google I/O 2026 keynote, Google Developers Blog, Google AI Studio documentation