Vertex AI plugin
The Vertex AI plugin provides access to Google Cloud’s enterprise-grade AI platform, offering advanced features beyond basic model access. Use this for enterprise applications that need grounding, Vector Search, Model Garden, or evaluation capabilities.
Accessing Google GenAI Models via Vertex AI
All languages support accessing Google’s generative AI models (Gemini, Imagen, etc.) through Vertex AI with enterprise authentication and features.
The unified Google GenAI plugin provides access to these models via Vertex AI using the vertexAI initializer:
Installation
```
npm i --save @genkit-ai/google-genai
```
Configuration
```ts
import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [
    vertexAI({ location: 'us-central1' }), // Regional endpoint
    // vertexAI({ location: 'global' }), // Global endpoint
  ],
});
```
Authentication Methods:
- Application Default Credentials (ADC): The standard method for most Vertex AI use cases, especially in production. It uses the credentials from the environment (for example, the service account on GCP, or user credentials from gcloud auth application-default login locally). This method requires a Google Cloud project with billing enabled and the Vertex AI API enabled.
- Vertex AI Express Mode: A streamlined way to try out many Vertex AI features using just an API key, without needing to set up billing or full project configuration. This is ideal for quick experimentation and has generous free-tier quotas. Learn more about Express Mode.
```ts
// Using Vertex AI Express Mode (easy to start, some limitations).
// Get an API key from the Vertex AI Studio Express Mode setup.
vertexAI({ apiKey: process.env.VERTEX_EXPRESS_API_KEY }),
```
Note: When using Express Mode, you do not provide projectId and location in the plugin config.
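Putting those pieces together, a minimal Express Mode initialization might look like the following sketch (it assumes VERTEX_EXPRESS_API_KEY holds a key obtained from the Vertex AI Studio Express Mode setup):

```typescript
import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

// Express Mode: API key only; no projectId or location in the config.
const ai = genkit({
  plugins: [vertexAI({ apiKey: process.env.VERTEX_EXPRESS_API_KEY })],
});
```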
Basic Usage
```ts
import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [vertexAI({ location: 'us-central1' })],
});

const response = await ai.generate({
  model: vertexAI.model('gemini-2.5-pro'),
  prompt: 'Explain Vertex AI in simple terms.',
});

console.log(response.text);
```
Text Embedding
```ts
const embeddings = await ai.embed({
  embedder: vertexAI.embedder('text-embedding-005'),
  content: 'Embed this text.',
});
```
Image Generation (Imagen)
```ts
const response = await ai.generate({
  model: vertexAI.model('imagen-3.0-generate-002'),
  prompt: 'A beautiful watercolor painting of a castle in the mountains.',
});

const generatedImage = response.media;
```
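If the generated media comes back as a base64 data URL (common for inline image results), you can decode it to raw bytes before writing it to disk. This is a generic sketch, not part of the plugin API; dataUrlToBytes is a hypothetical helper:

```typescript
// Hypothetical helper: decode a "data:<mime>;base64,<payload>" URL into raw
// bytes, e.g. before writing a generated image to disk with fs.writeFileSync.
function dataUrlToBytes(dataUrl: string): Buffer {
  const comma = dataUrl.indexOf(',');
  if (comma < 0) throw new Error('not a data URL');
  return Buffer.from(dataUrl.slice(comma + 1), 'base64');
}

// 'aGVsbG8=' is base64 for 'hello'.
const bytes = dataUrlToBytes('data:text/plain;base64,aGVsbG8=');
console.log(bytes.toString('utf8')); // hello
```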
Enterprise Features (JavaScript Only)
The following advanced features are available only in JavaScript, using the dedicated @genkit-ai/vertexai plugin:
Installation for Advanced Features
```
npm install @genkit-ai/vertexai
```
If you want to locally run flows that use this plugin, you also need the Google Cloud CLI tool installed.
Configuration for Advanced Features
```ts
import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/vertexai';

const ai = genkit({
  plugins: [vertexAI({ location: 'us-central1' })],
});
```
The plugin requires you to specify your Google Cloud project ID, the region to which you want to make Vertex API requests, and your Google Cloud project credentials.
- You can specify your Google Cloud project ID either by setting projectId in the vertexAI() configuration or by setting the GCLOUD_PROJECT environment variable. If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), GCLOUD_PROJECT is automatically set to the project ID of the environment.

- You can specify the API location either by setting location in the vertexAI() configuration or by setting the GCLOUD_LOCATION environment variable.

- To provide API credentials, set up Google Cloud Application Default Credentials:

  - If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), these are set automatically.

  - On your local dev environment, run:

    ```
    gcloud auth application-default login --project YOUR_PROJECT_ID
    ```

  - For other environments, see the Application Default Credentials docs.

- In addition, make sure the account is granted the Vertex AI User IAM role (roles/aiplatform.user). See the Vertex AI access control docs.
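The option-or-environment-variable lookup described in the first two bullets can be sketched as a plain function. This is a hypothetical helper, not plugin code, and it assumes an explicitly passed value takes precedence over the environment variable:

```typescript
// Hypothetical sketch of the lookup order described above: an explicit
// config value wins; otherwise fall back to the named environment variable.
function resolveSetting(explicit: string | undefined, envVar: string): string | undefined {
  return explicit ?? process.env[envVar];
}

process.env.GCLOUD_PROJECT = 'env-project';
console.log(resolveSetting('my-project', 'GCLOUD_PROJECT')); // my-project
console.log(resolveSetting(undefined, 'GCLOUD_PROJECT')); // env-project
```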
Grounding
This plugin supports grounding Gemini text responses using Google Search or your own data.
Important: Vertex AI charges a fee for grounding requests in addition to the cost of making LLM requests. See the Vertex AI pricing page and be sure you understand grounding request pricing before you use this feature.
Example:
```ts
const ai = genkit({
  plugins: [vertexAI({ location: 'us-central1' })],
});

await ai.generate({
  model: vertexAI.model('gemini-2.5-flash'),
  prompt: '...',
  config: {
    googleSearchRetrieval: {
      disableAttribution: true,
    },
    vertexRetrieval: {
      datastore: {
        projectId: 'your-cloud-project',
        location: 'us-central1',
        collection: 'your-collection',
      },
      disableAttribution: true,
    },
  },
});
```
Context Caching
The Vertex AI Genkit plugin supports context caching, which allows models to reuse previously cached content to optimize token usage when dealing with large pieces of content. This feature is especially useful for conversational flows or scenarios where the model references a large piece of content consistently across multiple requests.
How to Use Context Caching
To enable context caching, ensure your model supports it. For example, gemini-2.5-flash and gemini-2.0-pro support context caching, and you must specify version 001 (for example, gemini-2.5-flash-001).
You can define a caching mechanism in your application like this:
```ts
const ai = genkit({
  plugins: [vertexAI({ location: 'us-central1' })],
});

const llmResponse = await ai.generate({
  messages: [
    {
      role: 'user',
      content: [{ text: 'Here is the relevant text from War and Peace.' }],
    },
    {
      role: 'model',
      content: [
        {
          text: "Based on War and Peace, here is some analysis of Pierre Bezukhov's character.",
        },
      ],
      metadata: {
        cache: {
          ttlSeconds: 300, // Cache this message for 5 minutes
        },
      },
    },
  ],
  model: vertexAI.model('gemini-2.5-flash'),
  prompt: "Describe Pierre's transformation throughout the novel.",
});
```
In this setup:
- messages: Allows you to pass conversation history.
- metadata.cache.ttlSeconds: Specifies the time-to-live (TTL) for caching a specific response.
Example: Leveraging Large Texts with Context
For applications referencing long documents, such as War and Peace or Lord of the Rings, you can structure your queries to reuse cached contexts:
```ts
import { promises as fs } from 'fs';

const textContent = await fs.readFile('path/to/war_and_peace.txt', 'utf-8');

const llmResponse = await ai.generate({
  messages: [
    {
      role: 'user',
      content: [{ text: textContent }], // Include the large text as context
    },
    {
      role: 'model',
      content: [
        {
          text: 'This analysis is based on the provided text from War and Peace.',
        },
      ],
      metadata: {
        cache: {
          ttlSeconds: 300, // Cache the response to avoid reloading the full text
        },
      },
    },
  ],
  model: vertexAI.model('gemini-2.5-flash'),
  prompt: 'Analyze the relationship between Pierre and Natasha.',
});
```
Supported models: gemini-2.5-flash-001, gemini-2.0-pro-001
Model Garden Integration
Access third-party models through Vertex AI Model Garden:
Claude 3 Models
```ts
import { vertexAIModelGarden } from '@genkit-ai/vertexai/modelgarden';

const ai = genkit({
  plugins: [
    vertexAIModelGarden({
      location: 'us-central1',
      models: ['claude-3-haiku', 'claude-3-sonnet', 'claude-3-opus'],
    }),
  ],
});

const response = await ai.generate({
  model: 'claude-3-sonnet',
  prompt: 'What should I do when I visit Melbourne?',
});
```
Llama 3.1 405b
```ts
const ai = genkit({
  plugins: [
    vertexAIModelGarden({
      location: 'us-central1',
      models: ['llama3-405b-instruct-maas'],
    }),
  ],
});

const response = await ai.generate({
  model: 'llama3-405b-instruct-maas',
  prompt: 'Write a function that adds two numbers together',
});
```
Mistral Models
```ts
const ai = genkit({
  plugins: [
    vertexAIModelGarden({
      location: 'us-central1',
      models: ['mistral-large', 'mistral-nemo', 'codestral'],
    }),
  ],
});

const response = await ai.generate({
  model: 'mistral-large',
  prompt: 'Write a function that adds two numbers together',
  config: {
    version: 'mistral-large-2411',
    temperature: 0.7,
    maxOutputTokens: 1024,
    topP: 0.9,
    stopSequences: ['###'],
  },
});
```
The models support:

- mistral-large: Latest Mistral large model with function calling capabilities
- mistral-nemo: Optimized for efficiency and speed
- codestral: Specialized for code generation tasks
Evaluation Metrics
Use Vertex AI Rapid Evaluation API for model evaluation:
```ts
import {
  vertexAIEvaluation,
  VertexAIEvaluationMetricType,
} from '@genkit-ai/vertexai/evaluation';

const ai = genkit({
  plugins: [
    vertexAIEvaluation({
      location: 'us-central1',
      metrics: [
        VertexAIEvaluationMetricType.SAFETY,
        {
          type: VertexAIEvaluationMetricType.ROUGE,
          metricSpec: {
            rougeType: 'rougeLsum',
          },
        },
      ],
    }),
  ],
});
```
Available metrics:
- BLEU: Translation quality
- ROUGE: Summarization quality
- Fluency: Text fluency
- Safety: Content safety
- Groundedness: Factual accuracy
- Summarization Quality/Helpfulness/Verbosity: Summary evaluation
Run evaluations:
```
genkit eval:run
genkit eval:flow -e vertexai/safety
```
Vector Search
Use Vertex AI Vector Search for enterprise-grade vector operations:
- Create a Vector Search index in the Google Cloud Console
- Configure dimensions based on your embedding model:
  - gemini-embedding-001: 768 dimensions
  - text-multilingual-embedding-002: 768 dimensions
  - multimodalEmbedding001: 128, 256, 512, or 1408 dimensions
- Deploy the index to a standard endpoint
Configuration
```ts
import {
  vertexAIVectorSearch,
  getFirestoreDocumentIndexer,
  getFirestoreDocumentRetriever,
} from '@genkit-ai/vertexai/vectorsearch';

const ai = genkit({
  plugins: [
    vertexAIVectorSearch({
      projectId: 'your-project-id',
      location: 'us-central1',
      vectorSearchOptions: [
        {
          indexId: 'your-index-id',
          indexEndpointId: 'your-endpoint-id',
          deployedIndexId: 'your-deployed-index-id',
          publicDomainName: 'your-domain-name',
          documentRetriever: firestoreDocumentRetriever,
          documentIndexer: firestoreDocumentIndexer,
          embedder: vertexAI.embedder('gemini-embedding-001'),
        },
      ],
    }),
  ],
});
```
```ts
import { vertexAiIndexerRef, vertexAiRetrieverRef } from '@genkit-ai/vertexai/vectorsearch';

// Index documents
await ai.index({
  indexer: vertexAiIndexerRef({
    indexId: 'your-index-id',
  }),
  documents,
});

// Retrieve similar documents
const results = await ai.retrieve({
  retriever: vertexAiRetrieverRef({
    indexId: 'your-index-id',
  }),
  query: queryDocument,
});
```
Next Steps
- Learn about generating content to understand how to use these models effectively
- Explore evaluation to leverage Vertex AI’s evaluation metrics
- See RAG to implement retrieval-augmented generation with Vector Search
- Check out creating flows to build structured AI workflows
- For simple API key access, see the Google AI plugin
The Google Generative AI plugin provides access to Google’s Gemini models through Vertex AI using Google Cloud authentication.
Configuration
To use this plugin, import the googlegenai package and pass googlegenai.VertexAI to WithPlugins() in the Genkit initializer:
```go
import "github.com/firebase/genkit/go/plugins/googlegenai"

g := genkit.Init(context.Background(), genkit.WithPlugins(&googlegenai.VertexAI{}))
```
The plugin requires you to specify your Google Cloud project ID, the region to which you want to make Vertex API requests, and your Google Cloud project credentials.
- By default, googlegenai.VertexAI gets your Google Cloud project ID from the GOOGLE_CLOUD_PROJECT environment variable. You can also pass this value directly:

  ```go
  genkit.WithPlugins(&googlegenai.VertexAI{ProjectID: "my-project-id"})
  ```

- By default, googlegenai.VertexAI gets the Vertex AI API location from the GOOGLE_CLOUD_LOCATION environment variable. You can also pass this value directly:

  ```go
  genkit.WithPlugins(&googlegenai.VertexAI{Location: "us-central1"})
  ```

- To provide API credentials, set up Google Cloud Application Default Credentials:

  - If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), these are set automatically.

  - On your local dev environment, run:

    ```
    gcloud auth application-default login
    ```

  - For other environments, see the Application Default Credentials docs.

- In addition, make sure the account is granted the Vertex AI User IAM role (roles/aiplatform.user). See the Vertex AI access control docs.
-
Generative models
To get a reference to a supported model, specify its identifier to googlegenai.VertexAIModel:
```go
model := googlegenai.VertexAIModel(g, "gemini-2.5-flash")
```
Alternatively, you may create a ModelRef which pairs the model name with its config:
```go
modelRef := googlegenai.VertexAIModelRef("gemini-2.5-flash", &genai.GenerateContentConfig{
	Temperature:     genai.Ptr[float32](0.5),
	MaxOutputTokens: genai.Ptr[int32](500),
	// Other configuration...
})
```
The following models are supported: gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-pro, gemini-2.5-flash, and other experimental models.
Model references have a Generate() method that calls the Vertex AI API:
```go
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(modelRef),
	ai.WithPrompt("Tell me a joke."))
if err != nil {
	return err
}

log.Println(resp.Text())
```
See Generating content with AI models for more information.
Embedding models
To get a reference to a supported embedding model, specify its identifier to googlegenai.VertexAIEmbedder:
```go
embeddingModel := googlegenai.VertexAIEmbedder(g, "text-embedding-004")
```
The following models are supported: textembedding-gecko@003, textembedding-gecko@002, textembedding-gecko@001, text-embedding-004, textembedding-gecko-multilingual@001, text-multilingual-embedding-002, and multimodalembedding.
Embedder references have an Embed() method that calls the Vertex AI API:

```go
resp, err := genkit.Embed(ctx, g,
	ai.WithEmbedder(embeddingModel),
	ai.WithTextDocs(userInput))
if err != nil {
	return err
}
```
You can retrieve documents by passing an input to a Retriever’s Retrieve() method:

```go
resp, err := genkit.Retrieve(ctx, g,
	ai.WithRetriever(myRetriever),
	ai.WithTextDocs(userInput))
if err != nil {
	return err
}
```
See Retrieval-augmented generation (RAG) for more information.
Advanced Features
Advanced Vertex AI features like Vector Search, Model Garden, and evaluation metrics require custom implementation using the Google Cloud SDK directly. See the Vertex AI documentation for implementation details.
Next Steps
- Learn about generating content to understand how to use these models effectively
- Explore evaluation to leverage Vertex AI’s evaluation metrics
- See RAG to implement retrieval-augmented generation with Vector Search
- Check out creating flows to build structured AI workflows
- For simple API key access, see the Google AI plugin
The genkit-plugin-google-genai package provides the VertexAI plugin for accessing Google’s generative AI models via the Gemini API within Google Cloud Vertex AI (uses standard Google Cloud authentication).
Installation
```
pip3 install genkit-plugin-google-genai
```
Configuration
To use models via Vertex AI, ensure you have authenticated with Google Cloud (e.g., via gcloud auth application-default login).
```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import VertexAI

ai = Genkit(
    plugins=[VertexAI()],
    model='vertexai/gemini-2.5-flash',  # optional
)
```
You can specify the location and project ID, among other configuration options available in the VertexAI constructor.
```python
ai = Genkit(
    plugins=[VertexAI(
        location='us-east1',
        project='my-project-id',
    )],
)
```
Text Generation
```python
response = await ai.generate('What should I do when I visit Melbourne?')
print(response.text)
```
Text Embedding
```python
embeddings = await ai.embed(
    embedder='vertexai/gemini-embedding-001',
    content='How many widgets do you have in stock?',
)
```
Image Generation
```python
response = await ai.generate(
    model='vertexai/imagen-3.0-generate-002',
    prompt='a banana riding a bicycle',
)
```
Advanced Features
Advanced Vertex AI features like Vector Search, Model Garden, and evaluation metrics require custom implementation using the Google Cloud SDK directly. See the Vertex AI documentation for implementation details.
Next Steps
- Learn about generating content to understand how to use these models effectively
- See RAG to implement retrieval-augmented generation with Vector Search
- Check out creating flows to build structured AI workflows
- For simple API key access, see the Google AI plugin