Generative AI on Vertex AI - Release notes

April 06, 2026

2026-04-06T00:00:00-07:00

Feature

Metadata search for RAG Engine

Use schema-based metadata search in Vertex AI RAG Engine. You can define a metadata schema for a corpus, attach metadata to files within that corpus, and use this metadata to filter contexts during retrieval. For more information, see Filter with metadata search.

April 03, 2026

2026-04-03T00:00:00-07:00

Feature

Gemma 4 26B A4B IT is available as an experimental launch in Model Garden. This is an open model built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on small models) and generating text output. Gemma 4 26B A4B IT is available as a managed API in Model Garden. To learn more, see Gemma 4 26B A4B IT.

Feature

Vertex AI RAG Engine Serverless mode

Vertex AI RAG Engine Serverless mode is now available in public preview. Serverless mode provides a fully managed database for storing RAG resources that abstracts away database provisioning and scaling. You can seamlessly switch between Serverless mode and Spanner mode, which provides dedicated, isolated database instances.

For more information, see the following:

April 02, 2026

2026-04-02T00:00:00-07:00

Feature

Veo 3.1 Lite

Veo 3.1 Lite is available in public preview. This release is our most cost-efficient Veo on Vertex AI model.

For more information, see 3.1 Lite Generate

Announcement

Gemini 2.5 model retirement dates updated

The retirement dates for Gemini 2.5 Pro, Gemini 2.5 Flash-Lite, and Gemini 2.5 Flash have been updated to October 16, 2026. For more information, see Model versions and lifecycle.

March 25, 2026

2026-03-25T00:00:00-07:00

Feature

Lyria 3

Lyria is available in public preview. You can use lyria-3-pro-preview to generate 184 seconds of audio, or lyria-3-clip-preview to generate 30 seconds of audio.

For more information, see the following:

March 24, 2026

2026-03-24T00:00:00-07:00

Deprecated

Imagen generation GA endpoints deprecation

The following table describes image generation endpoints that are deprecated and their replacements. We recommend updating your model endpoints before June 30, 2026, to avoid service disruption.

Discontinued endpoints	Recommended endpoint migration
`imagegeneration@002`	`gemini-2.5-flash-image`
`imagegeneration@003`	`gemini-2.5-flash-image`
`imagegeneration@004`	`gemini-2.5-flash-image`
`imagegeneration@005`	`gemini-2.5-flash-image`
`imagegeneration@006`	`gemini-2.5-flash-image`
`imagetext@001`	`gemini-2.5-flash-image`
`imagen-3.0-capability-001`	`gemini-2.5-flash-image`
`imagen-3.0-capability-002`	`gemini-2.5-flash-image`
`imagen-3.0-fast-generate-001`	`gemini-2.5-flash-image`
`imagen-3.0-generate-001`	`gemini-2.5-flash-image`
`imagen-3.0-generate-002`	`gemini-2.5-flash-image`
`imagen-4.0-fast-generate-001`	`gemini-2.5-flash-image`
`imagen-4.0-generate-001`	`gemini-2.5-flash-image`
`imagen-4.0-ultra-generate-001`	`gemini-2.5-flash-image`

Deprecated

Video generation GA endpoints deprecation

The following table describes video generation endpoints that are deprecated and their replacements. We recommend updating your model endpoints before June 30, 2026, to avoid service disruption.

Discontinued endpoints	Recommended endpoint migration
`veo-3.0-generate-001`	`veo-3.1-generate-001`
`veo-3.0-fast-generate-001`	`veo-3.1-fast-generate-001`
`veo-2.0-generate-001`	`veo-3.1-generate-001`

March 12, 2026

2026-03-12T00:00:00-07:00

Feature

Partner model evaluations

The Gen AI evaluation service supports evaluating partner models, such as Anthropic and Llama models. For more information, see Perform evaluation using the console.

March 03, 2026

2026-03-03T00:00:00-08:00

Deprecated

Video generation preview endpoints deprecation

The following table describes video generation endpoints that are deprecated and their replacements. We recommend updating your model endpoints before April 2, 2026, to avoid service disruption.

Discontinued endpoints	Recommended endpoint migration
`veo-3.0-generate-preview`	`veo-3.0-generate-001`
`veo-3.0-fast-generate-preview`	`veo-3.0-fast-generate-preview`
`veo-2.0-generate-preview`	`veo-2.0-generate-001`
`veo-2.0-generate-exp`	`veo-2.0-generate-001`
`veo-001-preview-0815`	`veo-2.0-generate-001`
`veo-001-preview`	`veo-2.0-generate-001`
`veo-3.1-generate-preview`	`veo-3.1-generate-001`
`veo-3.1-fast-generate-preview`	`veo-3.1-fast-generate-001`

Feature

Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite (gemini-3.1-flash-lite-preview) is available in public preview. This release is our most cost-efficient Gemini model and is optimized for low latency use cases for high-volume, cost-sensitive LLM traffic.

For more information, see Gemini 3.1 Flash-Lite.

February 26, 2026

2026-02-26T00:00:00-08:00

Feature

Gemini 3.1 Flash Image

Gemini 3.1 Flash Image (gemini-3.1-flash-image) is available in public preview. This release enables high-quality image generation with improved pricing and latency. We recommend using Gemini 3.1 Flash Image when generating images.

For more information, see Gemini 3.1 Flash Image.

February 23, 2026

2026-02-23T00:00:00-08:00

Deprecated

Anthropic's Claude 3 Haiku

Anthropic's Claude 3 Haiku is deprecated as of February 23, 2026 and will be shut down on August 23, 2026. For more information, see Partner model deprecations.

February 19, 2026

2026-02-19T00:00:00-08:00

Feature

Gemini 3.1 Pro Preview

Gemini 3.1 Pro is available in preview in Model Garden. Gemini 3.1 Pro is our most advanced reasoning Gemini model, capable of solving complex problems from different information sources, including text, audio, images, video, PDFs, and even entire code repositories with its 1M token context window.

February 17, 2026

2026-02-17T00:00:00-08:00

Feature

Anthropic's Claude Sonnet 4.6

Claude Sonnet 4.6 is available in Model Garden.

Deprecated

Image generation preview endpoints deprecation

The following table describes image generation endpoints that are deprecated and their replacements. We recommend updating your model endpoints before March 19, 2026, to avoid service disruption.

Discontinued endpoints	Recommended endpoint migration
`gemini-2.0-flash-image-generation-preview`	`gemini-2.5-flash-image`
`gemini-2.5-flash-image-generation-preview`	`imagen-4.0-generate-001` or `gemini-2.5-flash-image`
`imagen-4.0-generate-preview-05-20`	`imagen-4.0-generate-001` or `gemini-2.5-flash-image`
`imagen-4.0-generate-preview-06-06`	`imagen-4.0-generate-001` or `gemini-2.5-flash-image`
`imagen-4.0-ultra-generate-preview-06-06`	`imagen-4.0-generate-001` or `gemini-2.5-flash-image`
`imagen-4.0-fast-generate-preview-05-20`	`imagen-4.0-generate-001` or `gemini-2.5-flash-image`
`imagen-product-recontext-preview-06-30`	`gemini-2.5-flash-image`
`imagen-2.0-edit-preview-0627`	`gemini-2.5-flash-image`
`virtual-try-on-preview-08-04`	`virtual-try-on-001`
`imagen-4.0-ingredients-preview`	`gemini-2.5-flash-image`

February 10, 2026

2026-02-10T00:00:00-08:00

Feature

GLM 5 is available as an experimental launch in Model Garden. This model is targeting complex systems engineering and long-horizon agentic tasks. GLM 5 is available as a managed API in Model Garden. To learn more, see GLM 5.

February 04, 2026

2026-02-04T00:00:00-08:00

Feature

Anthropic's Claude Opus 4.6

Claude Opus 4.6 is available in Model Garden.

January 23, 2026

2026-01-23T00:00:00-08:00

Feature

Virtual Try-On

Virtual Try-On is now generally available (GA). The new endpoint, virtual-try-on-001, replaces the previous endpoint, virtual-try-on-preview-08-04. We recommend changing to the new endpoint as soon as possible.

For more information, see Generate Virtual Try-On Images.

January 22, 2026

2026-01-22T00:00:00-08:00

Announcement

Codestral (25.01) and Mistral Large (24.11) are retired as of January 23, 2026.

January 20, 2026

2026-01-20T00:00:00-08:00

Feature

GLM 4.7 GA is now available in Model Garden. This model is designed for core or vibe coding, tool use, and complex reasoning. GLM 4.7 is available as a managed API in Model Garden. To learn more, see GLM 4.7.

January 13, 2026

2026-01-13T00:00:00-08:00

Feature

Veo 3.1 reference-to-video update

Veo 3.1 Preview models now support the following features:

9:16 aspect ratio for reference-to-video.
Upsampling for videos generated at 1080p and 4k resolutions.

For more information, see the following:

January 06, 2026

2026-01-06T00:00:00-08:00

Feature

GLM 4.7 is available as an experimental launch in Model Garden. This model is designed for core or vibe coding, tool use, and complex reasoning. GLM 4.7 is available as a managed API in Model Garden. To learn more, see GLM 4.7.

January 05, 2026

2026-01-05T00:00:00-08:00

Deprecated

Anthropic's Claude 3.5 Haiku

Anthropic's Claude 3.5 Haiku is deprecated as of January 5, 2026 and will be shut down on July 5, 2026. For more information, see Partner model deprecations.

December 18, 2025

2025-12-18T00:00:00-08:00

Feature

Save and share prompts in Vertex AI Studio: The prompt sharing feature no longer needs to be enabled. You can share prompts without asking your administrator to first enable the prompt sharing feature. For more information, see Save and share prompts.

Feature

The following models are available through Model Garden:

December 17, 2025

2025-12-17T00:00:00-08:00

Feature

Cloud API Registry is available in the Google Cloud console in Preview. Use Cloud API Registry in the Google Cloud console to view and manage the MCP servers and tools your agent has access to.

Feature

Gemini 3 Flash

Gemini 3 Flash is now available in public preview. This model is designed to tackle the most challenging agentic problems with strong coding and state-of-the-art reasoning capabilities, and is our best model for complex multimodal understanding.

For more information, see Gemini 3 Flash.

December 16, 2025

2025-12-16T00:00:00-08:00

Change

Updated pricing for Vertex AI Agent Engine:

Pricing for Vertex AI Agent Engine Runtime was lowered.
On January 28, 2026, Sessions, Memory Bank, and Code Execution will begin charging for usage.

For more information, see Pricing.

Announcement

Vertex AI Agent Engine

Vertex AI Agent Engine Sessions and Memory Bank are now Generally Available.

Change

Vertex AI Agent Engine

Vertex AI Agent Engine is now available in the following regions:

europe-west6 (Zurich)
europe-west8 (Milan)
asia-east2 (Hong Kong)
asia-northeast3 (Seoul)
asia-southeast2 (Jakarta)
northamerica-northeast2 (Toronto)
southamerica-east1 (São Paulo)

For more information, see Vertex AI Agent Builder locations.

Change

Virtual Try-On

Our Virtual Try-On model, virtual-try-on-preview-08-04, is improved. Latency is significantly reduced and quality is improved for shoes, body shape preservation, and product fidelity.

December 12, 2025

2025-12-12T00:00:00-08:00

Feature

Gemini 2.5 Flash with Gemini Live API Native Audio

Gemini 2.5 Flash with Gemini Live API Native Audio (gemini-live-2.5-flash-native-audio) is Generally Available (GA). This model features cutting-edge native audio functionality for Gemini Live API, including enhanced voice quality and adaptability, Proactive Audio, and Affective Dialog.

December 10, 2025

2025-12-10T00:00:00-08:00

Feature

DeepSeek-V3.2 is available in Model Garden. DeepSeek-V3.2 is a state-of-the-art large language model from DeepSeek. DeepSeek-V3.2 is available as a managed API in Model Garden. To learn more, see DeepSeek-V3.2.

December 09, 2025

2025-12-09T00:00:00-08:00

Feature

The following models are available through Model Garden:

December 08, 2025

2025-12-08T00:00:00-08:00

Feature

Veo 3.1 video extension

Veo 3.1 supports video extension in Preview.

For more information, see the following:

December 02, 2025

2025-12-02T00:00:00-08:00

Feature

The Vertex AI Model Garden model co-hosting vLLM container is available to use with this sample notebook. You can use this container to serve multiple replicas of a model and serve multiple models with dynamic loading and unloading. This allows you to maximize resource utilization and serving efficiency, and flexibly adjust the models to serve.

Feature

The following models are available through Model Garden:

November 24, 2025

2025-11-24T00:00:00-08:00

Feature

Anthropic's Claude Opus 4.5

Claude Opus 4.5 is available in Model Garden.

November 17, 2025

2025-11-17T00:00:00-08:00

Feature

Veo video generation

Veo 3.1 is Generally Available, and introduces the following models:

For more information, see the following:

Announcement

LearnLM in Gemini

The LearnLM model is no longer a separate offering or listing on AI Studio as LearnLM capabilities have been integrated into the latest Gemini models (starting with Gemini 2.5).

Built in collaboration with experts in education, LearnLM represents our capabilities fine-tuned for learning informed by rigorous research. These advancements and improvements are available directly in Gemini, enhancing educational experiences and applications.

Pre-existing learnlm-2.0-flash-experimental projects will not remain functional past December 3, 2025 unless an alternative model is manually selected—we encourage developers to switch to the latest Gemini models and optimize their prompts by reviewing our LearnLM Partner Prompt Guide.

November 13, 2025

2025-11-13T00:00:00-08:00

Feature

Kimi K2 Thinking is available in Model Garden. This model is a thinking model that excels at complex problem-solving and deep reasoning. Kimi K2 Thinking is available as a managed API in Model Garden. To learn more, see Kimi K2 Thinking.

Feature

Updated Prompt Caching for Anthropic Claude Models

Prompt caching for Anthropic Claude models now supports a one-hour Time To Live (TTL).

For more information, see Prompt caching.