Monitor content security

This document describes how to view content security insights from Model Armor for supported AI agents.

Model Armor screens agent requests and responses for security risks, such as indirect prompt injection attacks, sensitive data leakage, and the generation or serving of harmful content. For more information, see Model Armor.

You can view the results of Model Armor operations at the following levels:

Top-level view: insights for all supported AI agents in the project
Agent-level view: insights for a single AI agent

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Enable the Model Armor API.

Roles required to enable APIs

To enable APIs, you need the serviceusage.services.enable permission. If you created the project, then you likely already have this permission through the Owner role (roles/owner). Otherwise, you can get this permission through the Service Usage Admin role (roles/serviceusage.serviceUsageAdmin). Learn how to grant roles.

Enable the API

Configure Model Armor on one or more gateways in your project.
To monitor agents that communicate with a Google Cloud MCP server, configure Model Armor with MCP servers.
Set up tracing for your agent.
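If you prefer to script the API enablement step instead of using the console, the following minimal sketch builds the equivalent gcloud command. It assumes that the gcloud CLI is installed and that the Model Armor service name is modelarmor.googleapis.com; the project ID is a placeholder. The sketch only prints the command so that you can review it before running it.

```python
import shlex

# Assumption: service name of the Model Armor API.
MODEL_ARMOR_SERVICE = "modelarmor.googleapis.com"


def enable_api_command(project_id: str) -> list[str]:
    """Return the gcloud invocation that enables the Model Armor API."""
    return [
        "gcloud", "services", "enable", MODEL_ARMOR_SERVICE,
        f"--project={project_id}",
    ]


if __name__ == "__main__":
    # "example-project" is a placeholder project ID.
    print(shlex.join(enable_api_command("example-project")))
```

Running the printed command requires the serviceusage.services.enable permission described earlier.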

Required role

To get the permissions that you need to monitor content security violations, ask your administrator to grant you the following IAM roles on the project:

Observability View Accessor (roles/observability.viewAccessor)
Observability Analytics User (roles/observability.analyticsUser)
Logs Viewer (roles/logging.viewer)
Logs View Accessor (roles/logging.viewAccessor)

For more information about granting roles, see Manage access to projects, folders, and organizations.
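An administrator who grants these roles from the command line might generate the commands with a sketch like the following. The role IDs are taken from the list above; the project ID and member are placeholders, and the script prints the gcloud projects add-iam-policy-binding commands rather than running them.

```python
import shlex

# Role IDs from the "Required role" list.
REQUIRED_ROLES = [
    "roles/observability.viewAccessor",
    "roles/observability.analyticsUser",
    "roles/logging.viewer",
    "roles/logging.viewAccessor",
]


def grant_role_commands(project_id: str, member: str) -> list[list[str]]:
    """Build one add-iam-policy-binding command per required role."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}", f"--role={role}",
        ]
        for role in REQUIRED_ROLES
    ]


if __name__ == "__main__":
    # Placeholder project ID and principal; replace with real values.
    for command in grant_role_commands("example-project", "user:alex@example.com"):
        print(shlex.join(command))
```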

These predefined roles contain the permissions required to monitor content security violations. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to monitor content security violations:

monitoring.monitoredResourceDescriptors.list
monitoring.metricDescriptors.list

You might also be able to get these permissions with custom roles or other predefined roles.

Supported agents

The Security tab is populated with Model Armor insights for the following agents only:

Agents deployed in Agent Runtime and governed by a gateway where Model Armor is configured.
Agents deployed in Agent Runtime and communicating with a Google Cloud MCP server.
Agents deployed in Agent Runtime in a project where Model Armor floor settings are configured.

View content insights for supported AI agents in a project (top-level view)

To view the content security insights for all supported AI agents in a project, follow these steps:

In the Google Cloud console, go to the Gemini Enterprise Agent Platform Security tab.
Go to Security
Select your project.

If you don't see content security insights on the Security tab and you have supported AI agents in your project, make sure you have set up tracing for your agents.

View content insights for an AI agent (agent-level view)

To view the content security insights for a single supported agent, follow these steps:

In the Google Cloud console, go to Agent Registry.
Go to Agent Registry
Select your project.
Click the name of the agent.
Click the Security tab.

View the number of flagged or blocked interactions

Go to the top-level or agent-level Security tab.

On the Security tab, view the number of interactions, including flagged and blocked interactions. The Security tab displays the following metrics:

Total interactions: The total number of prompts and responses that are analyzed by Model Armor.
Interactions flagged: The number of interactions that violated a configured policy in your Model Armor template or floor settings.
Interactions blocked: The number of interactions that Model Armor blocked because they violated floor settings or templates. Interactions are blocked only if you configured Model Armor in the INSPECT_AND_BLOCK mode.
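To make the relationship between the three metrics concrete, here is a small illustrative sketch. The Interaction record and its fields are hypothetical and are not a Model Armor data structure. The sketch also assumes that a blocked interaction is counted as flagged too, which this page doesn't state explicitly, and that the non-blocking mode is named INSPECT_ONLY.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record of one prompt or response screened by Model Armor."""
    violated_policy: bool  # True if a template or floor settings policy was violated
    enforcement_mode: str  # assumed values: "INSPECT_ONLY" or "INSPECT_AND_BLOCK"


def summarize(interactions: list[Interaction]) -> dict[str, int]:
    """Count total, flagged, and blocked interactions."""
    flagged = [i for i in interactions if i.violated_policy]
    # Only violations seen in INSPECT_AND_BLOCK mode count as blocked.
    blocked = [i for i in flagged if i.enforcement_mode == "INSPECT_AND_BLOCK"]
    return {
        "total": len(interactions),
        "flagged": len(flagged),
        "blocked": len(blocked),
    }


if __name__ == "__main__":
    sample = [
        Interaction(False, "INSPECT_AND_BLOCK"),  # clean, not flagged
        Interaction(True, "INSPECT_ONLY"),        # flagged, not blocked
        Interaction(True, "INSPECT_AND_BLOCK"),   # flagged and blocked
    ]
    print(summarize(sample))
```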

Monitor content security violations

Go to the top-level or agent-level Security tab.

In the Violations over time chart, monitor the number of detected violations over time.

The violations detected are categorized into the following areas:

Prompt injections and jailbreaks: Content violations indicating the presence of prompts that contain malicious commands or jailbreak attempts. For more information, see Prompt injection and jailbreak detection.
Malicious URL: Content violations indicating the presence of malicious URLs. For more information, see Malicious URL detection.
Responsible AI: Content violations that are detected by safety filters, such as harassment and hate speech. For a complete list of responsible AI categories, see Responsible AI safety filter.
Sensitive data: Content violations involving the presence of sensitive information types or custom information types that you define. For more information, see Sensitive Data Protection.
Note: Counts for sensitive data content violations are included in the total violations count but aren't displayed in a separate category.

For more information about these detectors, see Model Armor filters.
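The note about sensitive data means that the total can be larger than the sum of the categories shown. The following sketch illustrates that arithmetic on made-up data. The category labels come from the list above, but the input format and function name are hypothetical.

```python
from collections import Counter

# Categories that the note says are shown separately.
DISPLAYED_CATEGORIES = [
    "Prompt injections and jailbreaks",
    "Malicious URL",
    "Responsible AI",
]


def chart_breakdown(violations: list[str]) -> dict:
    """Return the overall total and the per-category counts that are displayed.

    Sensitive data violations contribute to the total but get no own entry.
    """
    counts = Counter(violations)
    return {
        "total": sum(counts.values()),
        "by_category": {name: counts.get(name, 0) for name in DISPLAYED_CATEGORIES},
    }


if __name__ == "__main__":
    # Made-up sample: one violation label per detected violation.
    sample = [
        "Malicious URL",
        "Sensitive data",
        "Responsible AI",
        "Sensitive data",
    ]
    print(chart_breakdown(sample))
```

In this sample the total is 4, while the displayed categories add up to 2; the difference is the two sensitive data violations.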

Identify the agents with the most violations

Go to the top-level Security tab.

The Security tab displays the top 10 agents with the most violations. The list shows the agent ID of each agent and the number of violations detected for that agent.

To view the Model Armor insights for a specific agent in the list, go to Agent Registry and search for the agent by its agent ID. Then, open the agent-level Security tab for that agent.

Go to Agent Registry
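If you have your own per-agent violation counts, for example from exported data, a ranking like the top 10 list can be reproduced with a few lines of Python. This is an illustrative sketch only: the agent IDs and counts are made up, and it doesn't describe how the Security tab computes its list.

```python
from collections import Counter


def top_agents(violations_by_agent: dict[str, int], limit: int = 10) -> list[tuple[str, int]]:
    """Return up to `limit` (agent ID, violation count) pairs, highest count first."""
    return Counter(violations_by_agent).most_common(limit)


if __name__ == "__main__":
    # Hypothetical agent IDs and violation counts.
    sample = {"agent-a": 12, "agent-b": 40, "agent-c": 7}
    for agent_id, count in top_agents(sample):
        print(f"{agent_id}: {count}")
```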

Query and analyze telemetry data using SQL

To query and analyze telemetry data from Model Armor, use Observability Analytics, which provides a SQL-based query interface.

Go to the top-level Security tab.
For the view that you want to query, click More chart options > Explore in Observability Analytics.

For general instructions on how to use Observability Analytics, see Query and analyze telemetry with Observability Analytics.

Download violations data to a PNG or CSV file

To download violations data to a PNG or CSV file, follow these steps:

In the Violations over time view on the Security tab, select the period for which you want to download data.
Click More chart options > Download.
Click Download PNG or Download CSV to download the data in your preferred format.
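After you download a CSV file, you can process it with standard tooling. The sketch below sums each violation column by using Python's csv module. The column names and sample rows are assumptions made for illustration; inspect the header row of your downloaded file and adjust the names accordingly.

```python
import csv
import io

# Assumed layout: one timestamp column plus one count column per category.
# The real column names in the downloaded file might differ.
SAMPLE_CSV = """timestamp,prompt_injection_and_jailbreak,malicious_url,responsible_ai
2025-01-01T00:00:00Z,2,0,1
2025-01-01T01:00:00Z,0,3,4
"""


def total_per_category(csv_text: str, time_column: str = "timestamp") -> dict[str, int]:
    """Sum every numeric column except the time column."""
    totals: dict[str, int] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column == time_column:
                continue
            totals[column] = totals.get(column, 0) + int(value)
    return totals


if __name__ == "__main__":
    print(total_per_category(SAMPLE_CSV))
```

To use a real file, read its contents with open(path, newline="") and pass the text to total_per_category.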

What's next

Model Armor audit logging

Configure logging for Model Armor

Troubleshoot Model Armor issues
