Lesson 12: getting started with Vertex AI and ADK
Introduction
Section titled “Introduction”Over the first eleven lessons, we built a strong foundation in how agents work - reasoning, tools, memory, planning, multi-agent systems, RAG, evaluation, safety, and production operations. All of those concepts are platform-agnostic.
Now we shift to practice. This lesson maps those concepts to the specific tools and services available on Google Cloud. The goal is to give you a clear picture of what exists, how the pieces fit together, and which service to reach for when you are building a particular type of agent.
ELI5: Think of the Google Cloud AI stack like a workshop
Section titled “ELI5: Think of the Google Cloud AI stack like a workshop”Imagine you are setting up a woodworking workshop. You need raw materials (wood, metal), power tools (saws, drills), a workbench (a stable surface to build on), safety equipment (goggles, gloves), and a space to display or sell your finished products.
The Google Cloud AI stack works similarly:
- Gemini models are your raw materials - the intelligence that powers everything
- ADK (Agent Development Kit) is your set of power tools - the framework you use to build agents
- Vertex AI platform is your workbench - model hosting, evaluation, and deployment infrastructure
- Agent Engine is a managed display room - it runs your agents in production without you managing servers
- Model Armor is your safety equipment - guardrails and content filtering
- Vertex AI Search and RAG Engine are your reference library - giving agents access to your data
You can use these pieces independently or together. Not every project needs every tool.
Key takeaway: Google Cloud provides a full stack for building agents, from models to managed runtime. Understanding which piece does what helps you pick the right tool for each job.
The Google Cloud AI stack for agents
Section titled “The Google Cloud AI stack for agents”Here is a map of the major components and how they relate to each other:
+------------------------------------------------------------------+| Your Agent Application |+------------------------------------------------------------------+| || +--------------------+ +----------------------------------+ || | Agent Development | | Agent Engine | || | Kit (ADK) | | (Managed Runtime) | || | | | | || | - Build agents | | - Deploy and run agents | || | - Define tools |--->| - Session management | || | - Orchestration | | - Scaling and monitoring | || +--------------------+ +----------------------------------+ || | | || v v || +----------------------------------------------------+ || | Vertex AI Platform | || | | || | - Model hosting (Gemini, partner models) | || | - Evaluation tools | || | - Context caching | || | - Grounding with Google Search | || +----------------------------------------------------+ || | | || v v || +----------------------+ +------------------------+ || | Vertex AI Search & | | Model Armor | || | RAG Engine | | (Safety & Guardrails) | || +----------------------+ +------------------------+ || |+------------------------------------------------------------------+| Gemini Models || Pro (complex reasoning) | Flash (balanced) | Flash-Lite (volume) |+------------------------------------------------------------------+Let’s walk through each component.
Gemini Models
Section titled “Gemini Models”Gemini is Google’s family of multimodal AI models. For agent development, you will primarily work with three tiers:
| Model | Best For | Characteristics |
|---|---|---|
| Gemini Pro | Complex reasoning, multi-step planning, nuanced decisions | Highest capability, higher latency, higher cost |
| Gemini Flash | Balanced tasks - tool use, summarization, conversation | Good capability, fast, moderate cost |
| Gemini Flash-Lite | High-volume, simpler tasks - classification, routing, extraction | Fast, lowest cost, good for high-throughput use cases |
Choosing the right model
Section titled “Choosing the right model”Think of model selection like choosing the right vehicle for a trip:
- Flash-Lite is a bicycle - fast, cheap, great for short trips (simple classification, intent detection, basic extraction)
- Flash is a car - versatile, good for most journeys (general agent tasks, tool use, RAG, conversation)
- Pro is a truck - powerful, handles heavy loads (complex multi-step reasoning, long documents, difficult planning)
Most production agents use multiple model tiers through model routing (covered in Lesson 11). Use Flash-Lite for the simple steps, Flash for the core logic, and Pro only when the task genuinely requires it.
Multimodal capabilities
Section titled “Multimodal capabilities”Gemini models can process text, images, audio, and video. This means your agents can:
- Analyze images uploaded by users (product photos, screenshots, documents)
- Process audio inputs (voice commands, meeting recordings)
- Understand video content (tutorials, demonstrations)
- Work with PDFs and other document formats natively
This is a significant advantage over text-only models because it lets you build agents that interact with the real world in richer ways.
For model details and capabilities, see the Gemini model documentation.
Vertex AI Platform
Section titled “Vertex AI Platform”Vertex AI is Google Cloud’s machine learning platform. For agent developers, the most relevant capabilities are:
Model hosting and API access
Section titled “Model hosting and API access”Vertex AI provides API access to Gemini models along with partner and open-source models. You get:
- Managed endpoints - no infrastructure to manage
- Context caching - cache long system prompts or documents to reduce cost and latency on repeated calls
- Grounding with Google Search - let the model verify and supplement its knowledge with web search results
- Batch prediction - process large volumes of requests efficiently
Evaluation tools
Section titled “Evaluation tools”Vertex AI includes evaluation capabilities that align with what we covered in Lesson 9:
- AutoSxS (Auto Side-by-Side) - compare two model versions automatically
- Pointwise evaluation - score individual responses on quality dimensions
- Custom metrics - define your own evaluation criteria
- Bulk evaluation - run evals across large datasets
These tools integrate with your CI/CD pipeline for evaluation-gated deployment (Lesson 11).
Model garden
Section titled “Model garden”The Vertex AI Model Garden provides access to a wide range of models beyond Gemini - including open-source models and models from partner companies. This is useful when you need specialized models for particular tasks or want to compare different options.
For the full platform overview, see the Vertex AI documentation.
Agent Development Kit (ADK)
Section titled “Agent Development Kit (ADK)”ADK is Google’s open-source, code-first framework for building AI agents. If Vertex AI is the workbench, ADK is the set of power tools you use to actually build things.
Key characteristics
Section titled “Key characteristics”| Feature | Detail |
|---|---|
| Open source | Available on GitHub, MIT licensed |
| Multi-language | Python, TypeScript/JavaScript, Go, Java |
| Model-agnostic | Works with Gemini, but also supports other LLMs |
| Deployment-agnostic | Run locally, on Agent Engine, on Cloud Run, on any container platform |
| Opinionated but flexible | Provides structure without locking you in |
Why a framework?
Section titled “Why a framework?”You could build agents by calling the Gemini API directly - writing your own tool-calling loop, managing conversation state, and handling orchestration. ADK saves you from reinventing these common patterns:
- Tool registration and execution
- Conversation state management
- Multi-turn orchestration loops
- Multi-agent coordination
- Session and memory management
- Callback hooks for guardrails and logging
Think of it like the difference between writing raw HTTP handlers and using a web framework. You can do either, but the framework handles the boilerplate so you can focus on your agent’s unique logic.
ADK core concepts
Section titled “ADK core concepts”ADK organizes agents into three categories:
1. LLM agents
Section titled “1. LLM agents”These are agents powered by a language model that can reason, plan, and decide which tools to call. This is the most common type and corresponds to the ReAct-style agents we covered in Lesson 4.
from google.adk.agents import Agentfrom google.adk.tools import FunctionTool
# Define a tooldef get_weather(city: str) -> str: """Get the current weather for a city.""" # In practice, this would call a weather API return f"The weather in {city} is sunny, 72F."
# Create an agentweather_agent = Agent( name="weather_agent", model="gemini-2.0-flash", instruction="You are a helpful weather assistant. Use the get_weather " "tool to answer questions about weather conditions.", tools=[get_weather],)2. workflow agents
Section titled “2. workflow agents”These agents follow predefined orchestration patterns rather than relying on the LLM to decide the flow. ADK provides three built-in workflow types:
| Workflow Type | How It Works | Use When |
|---|---|---|
| SequentialAgent | Runs sub-agents one after another in a fixed order | Steps must happen in sequence (e.g., validate -> process -> respond) |
| ParallelAgent | Runs sub-agents simultaneously | Steps are independent and can happen at the same time (e.g., search multiple sources) |
| LoopAgent | Runs a sub-agent repeatedly until a condition is met | Iterative refinement (e.g., generate -> evaluate -> improve) |
from google.adk.agents import SequentialAgent
# A pipeline that validates input, processes it, and formats the responsepipeline = SequentialAgent( name="order_pipeline", sub_agents=[ input_validator_agent, order_processor_agent, response_formatter_agent, ],)Workflow agents map directly to the agentic design patterns from Lesson 4. Sequential corresponds to pipeline patterns. Parallel corresponds to fan-out/fan-in. Loop corresponds to reflection and iterative refinement.
3. custom agents
Section titled “3. custom agents”For orchestration patterns that do not fit the built-in types, you can create custom agents by subclassing the base agent class and implementing your own control flow.
ADK tools ecosystem
Section titled “ADK tools ecosystem”One of ADK’s biggest strengths is its tools ecosystem. Tools are how agents interact with the outside world (Lesson 3), and ADK provides several ways to define them:
| Tool Type | What It Is | Example |
|---|---|---|
| Function Tools | Plain Python/JS functions decorated as tools | A function that queries your database |
| MCP Tools | Tools from Model Context Protocol servers | Connect to any MCP-compatible tool server |
| OpenAPI Tools | Auto-generated from OpenAPI/Swagger specs | Wrap any REST API as agent tools |
| Built-in Tools | Pre-built integrations provided by ADK | Google Search, code execution, RAG |
ADK includes 60+ pre-built tool integrations, covering common needs like:
- Google Search and web browsing
- Code execution (sandboxed)
- File operations
- Database queries
- API calls
- Google Workspace (Gmail, Calendar, Drive)
from google.adk.tools import FunctionTool
# A simple function tooldef search_products(query: str, max_results: int = 5) -> list[dict]: """Search the product catalog.
Args: query: The search query string. max_results: Maximum number of results to return.
Returns: A list of matching products with name, price, and description. """ # Your implementation here return product_database.search(query, limit=max_results)
# ADK automatically generates the tool schema from the function signature# and docstring, so the LLM knows how to call it correctly.ADK Skills
Section titled “ADK Skills”Skills are a newer ADK concept that packages agent capabilities as self-contained, reusable units. Think of them as “plugins” for agents.
Skills have three levels of increasing complexity:
| Level | What It Includes | Example |
|---|---|---|
| L1 - Metadata | Name, description, and tags that help the agent understand when to use the skill | ”This skill handles flight booking” |
| L2 - Instructions | Detailed instructions for how the agent should use the skill | Step-by-step guide for the booking flow |
| L3 - Resources | Tools, data sources, and sub-agents that the skill needs | Flight search API tool, airline database |
Skills make it easier to share and compose agent capabilities across teams and projects.
For complete ADK documentation, see the ADK docs site.
Agent engine
Section titled “Agent engine”Agent Engine is a managed runtime service on Google Cloud for deploying and running agents. If ADK is how you build agents, Agent Engine is how you run them in production without managing infrastructure.
What agent engine provides
Section titled “What agent engine provides”| Capability | What It Does |
|---|---|
| Managed hosting | Run your agent without provisioning servers or containers |
| Session management | Built-in conversation state persistence |
| Scaling | Automatic scaling based on traffic |
| Monitoring | Integration with Cloud Monitoring and Cloud Logging |
| Security | IAM-based access control, VPC Service Controls support |
When to use Agent Engine vs. other deployment options
Section titled “When to use Agent Engine vs. other deployment options”| Deployment Option | Best For |
|---|---|
| Agent Engine | Production agents where you want managed infrastructure and do not want to handle scaling, session management, or deployment yourself |
| Cloud Run | Agents that need custom runtime environments, specific dependencies, or more control over the container |
| GKE (Kubernetes) | Agents that are part of a larger microservices architecture already running on Kubernetes |
| Local / Self-hosted | Development, testing, or when you cannot use cloud services |
ADK agents can be deployed to any of these targets. Agent Engine is the most managed option - you give it your agent code and it handles the rest.
For deployment details, see the Agent Engine documentation.
Vertex AI Search And RAG engine
Section titled “Vertex AI Search And RAG engine”In Lesson 8, we covered how agents can use retrieval-augmented generation (RAG) to access your data. Vertex AI provides managed services for this:
Vertex AI Search
Section titled “Vertex AI Search”A managed search service that can index and search across:
- Websites
- Unstructured documents (PDFs, Word docs, HTML)
- Structured data (databases, spreadsheets)
It handles chunking, embedding, indexing, and retrieval - the full RAG pipeline we discussed in Lesson 8 - as a managed service.
RAG Engine
Section titled “RAG Engine”RAG Engine provides a managed retrieval pipeline specifically designed for grounding LLM responses in your data:
- Document ingestion - upload and process documents automatically
- Chunking strategies - configurable approaches to splitting documents
- Vector search - managed embedding and similarity search
- Integration with Gemini - built-in grounding for Gemini model calls
The advantage of these managed services is that you do not have to run your own vector database, manage embeddings, or build retrieval pipelines. The tradeoff is less control over the details.
For RAG capabilities, see the RAG Engine overview.
Model Armor
Section titled “Model Armor”Model Armor is Google Cloud’s managed guardrails service (we covered guardrails in depth in Lesson 10). It provides:
| Feature | What It Does |
|---|---|
| Prompt screening | Detect and block harmful or adversarial prompts before they reach the model |
| Response filtering | Screen model outputs for harmful, toxic, or inappropriate content |
| Prompt injection detection | Identify attempts to override model instructions |
| Configurable policies | Set your own thresholds for different content categories |
| Integration | Works with Vertex AI endpoints and can be added to any generative AI application |
Model Armor gives you a production-ready Layer 2 defense (from the defense-in-depth model in Lesson 10) without building content filtering from scratch.
Quick setup guide
Section titled “Quick setup guide”Here is how to get started with ADK and Google Cloud for agent development.
Prerequisites
Section titled “Prerequisites”- A Google Cloud account (you can start with the free tier)
- Python 3.9+ (for the Python SDK)
- A Google Cloud project with billing enabled
Step 1: install ADK
Section titled “Step 1: install ADK”pip install google-adkStep 2: configure authentication
Section titled “Step 2: configure authentication”You have two options for authenticating:
Option A: API Key (simplest for getting started)
export GOOGLE_API_KEY="your-api-key-here"You can get an API key from Google AI Studio.
Option B: Google Cloud project (for production and Vertex AI features)
# Install the Google Cloud CLI# https://cloud.google.com/sdk/docs/install
# Authenticategcloud auth application-default login
# Set your projectgcloud config set project YOUR_PROJECT_IDStep 3: Create your first agent
Section titled “Step 3: Create your first agent”Create a file called agent.py:
from google.adk.agents import Agent
root_agent = Agent( name="greeting_agent", model="gemini-2.0-flash", instruction="You are a friendly assistant that greets users " "and answers basic questions.",)Step 4: run it locally
Section titled “Step 4: run it locally”adk webThis starts a local web interface where you can chat with your agent and inspect its behavior. The ADK dev UI shows you the agent’s reasoning steps, tool calls, and state - which is invaluable for debugging.
Step 5: Add tools and complexity
Section titled “Step 5: Add tools and complexity”From here, you can add tools, create multi-agent systems, integrate RAG, and eventually deploy to Agent Engine or Cloud Run. The ADK getting started guide walks through these steps in detail.
Decision tree: which service do i use?
Section titled “Decision tree: which service do i use?”When building an agent, use this decision tree to pick the right Google Cloud services:
"I want to build..." | +-- "A simple chatbot with no tools" | --> Gemini API directly (no framework needed) | +-- "An agent with tools and reasoning" | --> ADK + Gemini Flash | | | +-- "...and I need it in production" | --> Deploy to Agent Engine or Cloud Run | +-- "An agent that searches my documents" | --> ADK + Vertex AI Search or RAG Engine | +-- "A multi-agent system" | --> ADK (multi-agent orchestration built in) | +-- "An agent with strict safety requirements" | --> ADK + Model Armor + custom guardrails | +-- "A high-volume, cost-sensitive application" | --> Model routing (Flash-Lite for simple tasks, | Flash for complex) + context caching | +-- "An agent that needs to use external APIs" --> ADK with OpenAPI tools or MCP toolsQuick reference table
Section titled “Quick reference table”| I Need… | Use… |
|---|---|
| An LLM to call | Gemini models via Vertex AI or AI Studio |
| A framework to build agents | Agent Development Kit (ADK) |
| Managed agent hosting | Agent Engine |
| Custom container hosting | Cloud Run or GKE |
| Document search / RAG | Vertex AI Search or RAG Engine |
| Content safety guardrails | Model Armor |
| Model evaluation | Vertex AI Evaluation tools |
| Prompt management | Vertex AI prompt management |
| Interop with external tools | MCP tools in ADK |
| Interop with other agents | A2A protocol (covered in Lesson 14) |
How the pieces connect: a full example
Section titled “How the pieces connect: a full example”Here is how a typical production agent uses multiple Google Cloud services together:
User asks: "What is the return policy for my recent order?"
1. [ADK Agent] receives the request |2. [Model Armor] screens the input for safety |3. [Gemini Flash] reasons about the request: "I need to look up the order and find the return policy" |4. [ADK Tool: Order Lookup] calls your order database |5. [RAG Engine] searches your policy documents for the relevant return policy |6. [Gemini Flash] synthesizes a response from the order details and policy documents |7. [Model Armor] screens the output for safety |8. [Agent Engine] manages the session state and returns the response to the userEach Google Cloud service handles one part of the puzzle. ADK orchestrates the flow. Gemini provides the reasoning. RAG Engine provides the knowledge. Model Armor provides the safety. Agent Engine provides the runtime.
Where each lesson concept maps to Google Cloud
Section titled “Where each lesson concept maps to Google Cloud”Here is a reference connecting the concepts from earlier lessons to specific Google Cloud services:
| Lesson | Concept | Google Cloud Service |
|---|---|---|
| 2 - How Agents Think | LLM reasoning | Gemini models |
| 3 - Tools | Function calling | ADK Function Tools, MCP Tools, OpenAPI Tools |
| 4 - Design Patterns | Orchestration | ADK Sequential/Parallel/Loop Agents |
| 5 - Memory | Session state, long-term memory | ADK session management, Agent Engine |
| 6 - Planning | Multi-step reasoning | Gemini Pro for complex planning |
| 7 - Multi-Agent | Agent coordination | ADK multi-agent support |
| 8 - RAG | Knowledge retrieval | Vertex AI Search, RAG Engine |
| 9 - Evaluation | Testing agents | Vertex AI Evaluation |
| 10 - Safety | Guardrails | Model Armor |
| 11 - Production | Deployment, CI/CD | Agent Engine, Cloud Run, Agent Starter Pack |
Key takeaways
Section titled “Key takeaways”-
Google Cloud provides a full stack for agents. From Gemini models at the base to Agent Engine at the top, you can build and deploy complete agent systems.
-
ADK is the code-first framework. It is open-source, multi-language, model-agnostic, and deployment-agnostic. It handles the common patterns (tool calling, orchestration, state management) so you can focus on your agent’s logic.
-
Pick the right model tier. Use Flash-Lite for simple tasks, Flash for most agent work, and Pro for complex reasoning. Model routing across tiers is a key cost optimization strategy.
-
Managed services reduce operational burden. Agent Engine, RAG Engine, and Model Armor handle infrastructure and operations so you can focus on building. The tradeoff is less control over implementation details.
-
Everything is composable. You can use ADK without Agent Engine, use Vertex AI without ADK, or use the full stack together. Start with what you need and add services as your requirements grow.
-
ADK agents map directly to the concepts in this course. LLM Agents for reasoning, Workflow Agents for orchestration patterns, tools for external interaction, and skills for reusable capabilities.
Further reading
Section titled “Further reading”- ADK Documentation - Complete guide to building agents with ADK
- Vertex AI Documentation - The full Vertex AI platform reference
- Agent Engine Overview - Managed runtime for agents
- RAG Engine Overview - Managed retrieval-augmented generation
- Agent Starter Pack - Production-ready templates with CI/CD and observability built in
- Google AI Studio - Get API keys and experiment with Gemini models
Next lesson: Building Your First Agent