  • Generative AI on Vertex AI