Session role adherence

Definition

The session role adherence test evaluates whether the assistant stays in its defined role across a multi-turn conversation. You supply a role description in plain English (for example, “a customer-support agent for an e-commerce platform who handles order tracking, returns, and product questions”), and an LLM-as-a-judge scores how well the assistant kept to that role throughout the session.

Taxonomy

Task types: LLM.
Availability: and .
Evaluation level: session.
Polarity: higher score = better (role adhered to).

Why it matters

Role drift is a leading indicator of prompt leakage, jailbreak success, or retrieval-augmented context bleeding into the agent’s persona.
Role adherence at the session level catches drift that builds across turns and is invisible to per-turn checks.

Required columns

Input: The user’s message in each turn.
Output: The assistant’s response in each turn.
Session ID: Groups turns belonging to the same conversation.
Timestamp: Used to reconstruct turn order within a session.

Insight parameters

role_definition (string, optional): A plain-English description of the assistant’s expected role. If omitted, the judge falls back to a generic “appropriate-for-context” role check (useful when you haven’t yet formalized the role, but lower-signal than the explicit version).

The judge evaluates four dimensions when a role is supplied:

Persona / expertise — does the assistant present itself as the specified role?
Scope — does it stay within the role’s topical boundaries?
Tone & style — does it match the communication style expected of the role?
Handling out-of-role requests — does it decline or redirect cleanly?

This metric relies on an LLM evaluator. On Openlayer you can configure the underlying LLM used to compute it. Check out the OpenAI or Anthropic integration guides for details.

Test configuration examples

[
  {
    "name": "Session role adherence above 0.7",
    "description": "Ensure the assistant stays in its defined customer-support role",
    "type": "performance",
    "subtype": "sessionRoleAdherence",
    "thresholds": [
      {
        "insightName": "sessionRoleAdherence",
        "insightParameters": [
          {
            "name": "role_definition",
            "value": "You are a customer-support agent for an e-commerce platform. You help with order tracking, returns, and product questions."
          }
        ],
        "measurement": "meanScore",
        "operator": ">=",
        "value": 0.7
      }
    ],
    "subpopulationFilters": null,
    "mode": "monitoring",
    "usesProductionData": true,
    "evaluationWindow": 3600,
    "delayWindow": 0
  }
]

Session guideline adherence — broader custom-behaviour check.

Session record count

Session task progression

⌘I

Get started

Workspace setup

Governance

Observability

Offline testing

Tests

Gateway

Data quality monitoring

Administration

Notifications

Other resources

Session role adherence

Definition

Taxonomy

Why it matters

Required columns

Insight parameters

Test configuration examples

​Definition

​Taxonomy

​Why it matters

​Required columns

​Insight parameters

​Test configuration examples

​Related

Definition

Taxonomy

Why it matters

Required columns

Insight parameters

Test configuration examples

Related