Session turn relevancy

Definition

The session turn relevancy test evaluates whether each turn is relevant to the ongoing conversation. An LLM-as-a-judge reads the full session and scores it against four criteria:

Each response directly addresses the user’s question or request
Responses avoid irrelevant information or padding
Off-topic tangents are handled appropriately (redirected rather than indulged)
The conversation stays focused on the thread the user is pursuing
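
As an illustration only, the four criteria above could be assembled into a judge prompt along the following lines. This is a minimal sketch, not Openlayer's actual prompt or implementation: the function names `build_judge_prompt` and `clamp_score`, the prompt wording, and the assumption that the judge replies with a single number are all hypothetical.

```python
# Hypothetical sketch: the prompt wording and helper names are assumptions,
# not Openlayer's implementation. Only the four criteria come from the docs.
CRITERIA = [
    "Each response directly addresses the user's question or request",
    "Responses avoid irrelevant information or padding",
    "Off-topic tangents are handled appropriately (redirected rather than indulged)",
    "The conversation stays focused on the thread the user is pursuing",
]


def build_judge_prompt(turns):
    """Build a judge prompt from ordered (user_message, assistant_response) pairs."""
    criteria = "\n".join(f"{i}. {c}" for i, c in enumerate(CRITERIA, start=1))
    transcript = "\n\n".join(
        f"Turn {n}\nUser: {user}\nAssistant: {assistant}"
        for n, (user, assistant) in enumerate(turns, start=1)
    )
    return (
        "Score the session below for turn relevancy.\n"
        "Reply with one number between 0 (completely irrelevant turns) "
        "and 1 (all turns highly relevant).\n\n"
        f"Criteria:\n{criteria}\n\n"
        f"Session:\n{transcript}\n"
    )


def clamp_score(raw):
    """Parse the judge's reply as a float and clamp it to the [0, 1] range."""
    return max(0.0, min(1.0, float(raw)))
```

The resulting string would be sent to whichever LLM evaluator is configured, and the reply passed through `clamp_score` so that the score stays within the 0 to 1 range described under Taxonomy.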

Taxonomy

Task types: LLM.
Availability: and .
Evaluation level: session.
Polarity: higher score = better. 0 = completely irrelevant turns, 1 = all turns highly relevant.

Why it matters

An assistant can be factually correct and still answer a question adjacent to the one the user asked, which is a subtle form of quality degradation.
Aggregating turn-level relevancy across a session surfaces patterns where the assistant consistently half-misunderstands the user.

Required columns

Input: The user’s message in each turn.
Output: The assistant’s response in each turn.
Session ID: Groups turns belonging to the same conversation.
Timestamp: Used to reconstruct turn order within a session.
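
To show how these four columns fit together, here is a minimal sketch of rebuilding ordered sessions from flat rows. The key names (`session_id`, `timestamp`, `input`, `output`) and the function `reconstruct_sessions` are hypothetical; the actual column names depend on how your data is mapped.

```python
from collections import defaultdict


def reconstruct_sessions(rows):
    """Group rows by session ID and order each session's turns by timestamp.

    `rows` is a list of dicts with hypothetical keys:
    session_id, timestamp, input, output.
    Returns {session_id: [(input, output), ...]} in turn order.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["session_id"]].append(row)

    return {
        session_id: [
            (turn["input"], turn["output"])
            for turn in sorted(turns, key=lambda t: t["timestamp"])
        ]
        for session_id, turns in grouped.items()
    }
```

Each value in the returned dictionary is one full conversation in turn order, which is the unit the judge reads for a session-level metric.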

This metric relies on an LLM evaluator. On Openlayer you can configure the underlying LLM used to compute it. Check out the OpenAI or Anthropic integration guides for details.

Test configuration examples

[
  {
    "name": "Session turn relevancy above 0.7",
    "description": "Ensure each turn is relevant to the user's question",
    "type": "performance",
    "subtype": "sessionTurnRelevancy",
    "thresholds": [
      {
        "insightName": "sessionTurnRelevancy",
        "measurement": "meanScore",
        "operator": ">=",
        "value": 0.7
      }
    ],
    "subpopulationFilters": null,
    "mode": "monitoring",
    "usesProductionData": true,
    "evaluationWindow": 3600,
    "delayWindow": 0
  }
]
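
To make the threshold semantics concrete, the sketch below evaluates a threshold object like the one above against a list of per-session scores. It assumes that `meanScore` is the arithmetic mean of session scores within the evaluation window; that interpretation, the function name `threshold_passes`, and every operator other than `>=` are assumptions rather than documented behavior.

```python
import operator
from statistics import mean

# Only ">=" appears in the configuration above; the other operators are
# assumed here for illustration.
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def threshold_passes(threshold, session_scores):
    """Return True if the mean session score satisfies the threshold.

    Assumes "meanScore" is the arithmetic mean of the per-session scores.
    """
    if threshold["measurement"] != "meanScore":
        raise ValueError(f"unsupported measurement: {threshold['measurement']}")
    compare = OPERATORS[threshold["operator"]]
    return compare(mean(session_scores), threshold["value"])


threshold = {
    "insightName": "sessionTurnRelevancy",
    "measurement": "meanScore",
    "operator": ">=",
    "value": 0.7,
}
print(threshold_passes(threshold, [0.9, 0.8, 0.6]))
```

Under this reading, a single low-scoring session does not fail the test on its own; the test fails only when the average across the window drops below 0.7.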

Related

Answer relevancy: trace-level relevancy metric (Ragas).
