EvaluationMetricsThresholds

Threshold settings for metrics in an Evaluation.

JSON representation
{
  "goldenEvaluationMetricsThresholds": {
    object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds)
  },
  "hallucinationMetricBehavior": enum (EvaluationMetricsThresholds.HallucinationMetricBehavior),
  "goldenHallucinationMetricBehavior": enum (EvaluationMetricsThresholds.HallucinationMetricBehavior),
  "scenarioHallucinationMetricBehavior": enum (EvaluationMetricsThresholds.HallucinationMetricBehavior)
}
Fields
goldenEvaluationMetricsThresholds

object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds)

Optional. The golden evaluation metrics thresholds.

hallucinationMetricBehavior
(deprecated)

enum (EvaluationMetricsThresholds.HallucinationMetricBehavior)

Optional. Deprecated: use goldenHallucinationMetricBehavior instead. This field currently sets the hallucination metric behavior for golden evaluations.

goldenHallucinationMetricBehavior

enum (EvaluationMetricsThresholds.HallucinationMetricBehavior)

Optional. The hallucination metric behavior for golden evaluations.

scenarioHallucinationMetricBehavior

enum (EvaluationMetricsThresholds.HallucinationMetricBehavior)

Optional. The hallucination metric behavior for scenario evaluations.
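Because hallucinationMetricBehavior is deprecated in favor of goldenHallucinationMetricBehavior, existing configurations may need to be rewritten. The following is a minimal client-side sketch, not part of the API: the helper name migrate_hallucination_behavior is hypothetical, and letting an explicitly set goldenHallucinationMetricBehavior take precedence over the deprecated field is an assumption.

```python
import copy

DEPRECATED_KEY = "hallucinationMetricBehavior"
GOLDEN_KEY = "goldenHallucinationMetricBehavior"


def migrate_hallucination_behavior(thresholds: dict) -> dict:
    """Return a copy of `thresholds` that no longer uses the deprecated field.

    Hypothetical helper: the input dict is left unmodified.
    """
    migrated = copy.deepcopy(thresholds)
    legacy_value = migrated.pop(DEPRECATED_KEY, None)
    # Assumption: an explicitly set replacement field wins over the legacy one.
    if legacy_value is not None and GOLDEN_KEY not in migrated:
        migrated[GOLDEN_KEY] = legacy_value
    return migrated


legacy = {
    "hallucinationMetricBehavior": "ENABLED",
    "scenarioHallucinationMetricBehavior": "DISABLED",
}
migrated = migrate_hallucination_behavior(legacy)
print(migrated)
```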

EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds

Settings for golden evaluations.

JSON representation
{
  "turnLevelMetricsThresholds": {
    object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds)
  },
  "expectationLevelMetricsThresholds": {
    object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.ExpectationLevelMetricsThresholds)
  },
  "toolMatchingSettings": {
    object (EvaluationMetricsThresholds.ToolMatchingSettings)
  }
}
Fields
turnLevelMetricsThresholds

object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds)

Optional. The turn level metrics thresholds.

expectationLevelMetricsThresholds

object (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.ExpectationLevelMetricsThresholds)

Optional. The expectation level metrics thresholds.

toolMatchingSettings

object (EvaluationMetricsThresholds.ToolMatchingSettings)

Optional. The tool matching settings, which control how extra tool calls are handled. An extra tool call is a tool call that is present in the execution but does not match any tool call in the golden expectation.
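As an illustration of how the three nested objects fit together, the sketch below builds an EvaluationMetricsThresholds body as a plain Python dict and serializes it to JSON. The values shown are the defaults documented for each field on this page; the variable names are hypothetical, and how the body is sent to the service is outside the scope of this example.

```python
import json

# Golden evaluation settings, using the documented default for every field.
golden_thresholds = {
    "turnLevelMetricsThresholds": {
        "semanticSimilarityChannel": "TEXT",
        "semanticSimilaritySuccessThreshold": 3,
        "overallToolInvocationCorrectnessThreshold": 1.0,
    },
    "expectationLevelMetricsThresholds": {
        "toolInvocationParameterCorrectnessThreshold": 1.0,
    },
    "toolMatchingSettings": {
        "extraToolCallBehavior": "FAIL",
    },
}

# Top-level EvaluationMetricsThresholds message wrapping the golden settings.
evaluation_metrics_thresholds = {
    "goldenEvaluationMetricsThresholds": golden_thresholds,
}

payload = json.dumps(evaluation_metrics_thresholds, indent=2)
print(payload)
```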

EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds

Turn level metrics thresholds.

JSON representation
{
  "semanticSimilarityChannel": enum (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds.SemanticSimilarityChannel),
  "semanticSimilaritySuccessThreshold": integer,
  "overallToolInvocationCorrectnessThreshold": number
}
Fields
semanticSimilarityChannel

enum (EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds.SemanticSimilarityChannel)

Optional. The semantic similarity channel to use for evaluation.

semanticSimilaritySuccessThreshold

integer

Optional. The success threshold for semantic similarity. Must be an integer between 0 and 4. Defaults to 3, so a score of 3 or higher counts as a success.

overallToolInvocationCorrectnessThreshold

number

Optional. The success threshold for overall tool invocation correctness. Must be a float between 0 and 1. Default is 1.0.
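The documented ranges and defaults can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the function name resolve_turn_level_thresholds is hypothetical, both ranges are treated as inclusive, and the service may validate differently.

```python
DEFAULT_SEMANTIC_SIMILARITY_THRESHOLD = 3   # documented default
DEFAULT_TOOL_INVOCATION_THRESHOLD = 1.0     # documented default


def resolve_turn_level_thresholds(settings: dict) -> dict:
    """Fill in documented defaults and validate ranges (hypothetical helper)."""
    similarity = settings.get(
        "semanticSimilaritySuccessThreshold", DEFAULT_SEMANTIC_SIMILARITY_THRESHOLD
    )
    tool = settings.get(
        "overallToolInvocationCorrectnessThreshold", DEFAULT_TOOL_INVOCATION_THRESHOLD
    )

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(similarity, bool) or not isinstance(similarity, int):
        raise ValueError("semanticSimilaritySuccessThreshold must be an integer")
    # Assumption: the 0..4 range is inclusive at both ends.
    if not 0 <= similarity <= 4:
        raise ValueError("semanticSimilaritySuccessThreshold must be between 0 and 4")

    if isinstance(tool, bool) or not isinstance(tool, (int, float)):
        raise ValueError("overallToolInvocationCorrectnessThreshold must be a number")
    # Assumption: the 0..1 range is inclusive at both ends.
    if not 0.0 <= tool <= 1.0:
        raise ValueError("overallToolInvocationCorrectnessThreshold must be between 0 and 1")

    return {
        "semanticSimilaritySuccessThreshold": similarity,
        "overallToolInvocationCorrectnessThreshold": float(tool),
    }


print(resolve_turn_level_thresholds({"semanticSimilaritySuccessThreshold": 4}))
```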

EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.TurnLevelMetricsThresholds.SemanticSimilarityChannel

Semantic similarity channel to use.

Enums
SEMANTIC_SIMILARITY_CHANNEL_UNSPECIFIED Channel unspecified. Defaults to TEXT.
TEXT Use text semantic similarity.
AUDIO Use audio semantic similarity.

EvaluationMetricsThresholds.GoldenEvaluationMetricsThresholds.ExpectationLevelMetricsThresholds

Expectation level metrics thresholds.

JSON representation
{
  "toolInvocationParameterCorrectnessThreshold": number
}
Fields
toolInvocationParameterCorrectnessThreshold

number

Optional. The success threshold for individual tool invocation parameter correctness. Must be a float between 0 and 1. Default is 1.0.

EvaluationMetricsThresholds.ToolMatchingSettings

Settings for matching tool calls.

JSON representation
{
  "extraToolCallBehavior": enum (EvaluationMetricsThresholds.ToolMatchingSettings.ExtraToolCallBehavior)
}
Fields
extraToolCallBehavior

enum (EvaluationMetricsThresholds.ToolMatchingSettings.ExtraToolCallBehavior)

Optional. Behavior for extra tool calls. Defaults to FAIL.

EvaluationMetricsThresholds.ToolMatchingSettings.ExtraToolCallBehavior

Defines the behavior when an extra tool call is encountered. An extra tool call is a tool call that is present in the execution but does not match any tool call in the golden expectation.

Enums
EXTRA_TOOL_CALL_BEHAVIOR_UNSPECIFIED Unspecified behavior. Defaults to FAIL.
FAIL Fail the evaluation if an extra tool call is encountered.
ALLOW Allow the extra tool call.
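The effect of FAIL and ALLOW can be illustrated with a small sketch. It is not the service's matching algorithm: here a tool call is represented only by its name and two calls match when the names are equal, which is a simplifying assumption. The function names and the example tool names are hypothetical.

```python
from collections import Counter


def find_extra_tool_calls(executed: list[str], expected: list[str]) -> list[str]:
    """Return executed tool names with no remaining counterpart in `expected`.

    Assumption: calls match by name only, and each expected call can be
    matched by at most one executed call.
    """
    remaining = Counter(expected)
    extras = []
    for name in executed:
        if remaining[name] > 0:
            remaining[name] -= 1
        else:
            extras.append(name)
    return extras


def extra_tool_calls_pass(
    executed: list[str],
    expected: list[str],
    behavior: str = "EXTRA_TOOL_CALL_BEHAVIOR_UNSPECIFIED",
) -> bool:
    """Apply ExtraToolCallBehavior to the extra calls found above."""
    # Documented: the unspecified value defaults to FAIL.
    if behavior == "EXTRA_TOOL_CALL_BEHAVIOR_UNSPECIFIED":
        behavior = "FAIL"
    if behavior not in ("FAIL", "ALLOW"):
        raise ValueError(f"unknown ExtraToolCallBehavior: {behavior}")
    extras = find_extra_tool_calls(executed, expected)
    return behavior == "ALLOW" or not extras


executed_calls = ["lookup_order", "send_email"]   # hypothetical tool names
golden_calls = ["lookup_order"]
print(find_extra_tool_calls(executed_calls, golden_calls))
print(extra_tool_calls_pass(executed_calls, golden_calls, "ALLOW"))
```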

EvaluationMetricsThresholds.HallucinationMetricBehavior

The hallucination metric behavior. The metric is always calculated, regardless of this setting. When the behavior is DISABLED, the metric is not used to calculate the overall evaluation score.

Enums
HALLUCINATION_METRIC_BEHAVIOR_UNSPECIFIED Unspecified hallucination metric behavior.
DISABLED Disable hallucination metric.
ENABLED Enable hallucination metric.
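The distinction between calculating the metric and using it in the overall score can be sketched as follows. This is an illustration only: the metric names and the function metrics_for_overall_score are hypothetical, and how the service combines metrics into an overall score is not described here. Because this page does not state how HALLUCINATION_METRIC_BEHAVIOR_UNSPECIFIED is resolved, the sketch accepts only ENABLED and DISABLED.

```python
def metrics_for_overall_score(metrics: dict, behavior: str) -> dict:
    """Select the metrics that feed the overall score (hypothetical helper).

    `metrics` always contains the hallucination result, because the metric
    is calculated regardless of the configured behavior.
    """
    if behavior not in ("ENABLED", "DISABLED"):
        # The resolution of the unspecified value is not documented here.
        raise ValueError(f"unsupported HallucinationMetricBehavior: {behavior}")
    if behavior == "ENABLED":
        return dict(metrics)
    # DISABLED: keep every metric except hallucination.
    return {name: value for name, value in metrics.items() if name != "hallucination"}


# Hypothetical metric names and values, for illustration only.
calculated = {"semantic_similarity": 4, "hallucination": 0.2}
print(metrics_for_overall_score(calculated, "DISABLED"))
```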