OpenAI & Azure OpenAI

Openlayer integrates with OpenAI in two different ways:

If you are building an AI system with OpenAI LLMs and want to evaluate it, you can use the SDKs to make Openlayer part of your workflow.
Some tests on Openlayer are based on a score produced by an LLM judge. You can set any of OpenAI’s LLMs as the LLM judge for these tests.

This integration guide explores each of these paths.

Using OpenAI Agents SDK? Check out the OpenAI Agents SDK integration page.

Evaluating OpenAI LLMs

You can set up Openlayer tests to evaluate your OpenAI LLMs in both monitoring and development modes.

Monitoring

To use monitoring mode, instrument your code so that the requests your AI system receives are published to the Openlayer platform. The code snippet below walks through the steps:

# 1. Set the environment variables
import os
import openai

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
os.environ["OPENLAYER_API_KEY"] = "YOUR_OPENLAYER_API_KEY_HERE"
os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "YOUR_OPENLAYER_INFERENCE_PIPELINE_ID_HERE"

# 2. Import the `trace_openai` function and wrap the OpenAI client with it
from openlayer.lib import trace_openai

openai_client = trace_openai(openai.OpenAI())

# 3. From now on, every chat completion/completion call with
# the `openai_client` is traced and published to Openlayer. E.g.,
completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "How are you doing today?"},
    ]
)

See full Python example

See full TypeScript example

For Azure OpenAI, check out this code example instead.
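Because the instrumentation depends on several environment variables, a small pre-flight check can catch a missing one before the first request is sent. The snippet below is a minimal sketch, not part of the Openlayer SDK: `missing_variables` is a hypothetical helper, and the Azure variable names are the ones this page lists for development mode, assumed here to apply to monitoring as well.

```python
import os
from typing import List, Mapping, Optional

# Variables each provider needs (names taken from this guide).
PROVIDER_VARIABLES = {
    "openai": ["OPENAI_API_KEY"],
    "azure": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
}
# Variables Openlayer needs to publish data in monitoring mode.
OPENLAYER_VARIABLES = ["OPENLAYER_API_KEY", "OPENLAYER_INFERENCE_PIPELINE_ID"]


def missing_variables(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variable names that are unset or empty.

    Hypothetical helper: `provider` is "openai" or "azure"; `env` defaults
    to the process environment and can be replaced with a dict in tests.
    """
    env = os.environ if env is None else env
    names = PROVIDER_VARIABLES[provider] + OPENLAYER_VARIABLES
    return [name for name in names if not env.get(name)]
```

Calling `missing_variables("openai")` before wrapping the client and raising an error if the returned list is non-empty gives a clearer failure than a rejected request later on.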

Once the code is instrumented, all your OpenAI calls are automatically published to Openlayer, along with metadata, such as latency, number of tokens, cost estimate, and more. If you navigate to the “Data” page of your Openlayer data source, you can see the traces for each request.

If the OpenAI LLM call is just one of the steps of your AI system, you can use the code snippets above together with tracing. In this case, your OpenAI LLM calls get added as a step of a larger trace.
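As a rough illustration of an OpenAI call nested inside a larger trace, the sketch below decorates two functions with the `trace` decorator from `openlayer.lib`. The functions `retrieve_context` and `answer_question` are hypothetical, the retrieval step is a placeholder, and the `ImportError` fallback exists only so the sketch runs where the SDK is not installed. The client passed in is assumed to be the one wrapped with `trace_openai`.

```python
try:
    from openlayer.lib import trace  # tracing decorator from the Openlayer SDK
except ImportError:
    # Fallback for environments without the SDK: a decorator that does nothing.
    def trace(*_args, **_kwargs):
        def decorator(func):
            return func
        return decorator


@trace()
def retrieve_context(question: str) -> str:
    """Hypothetical retrieval step; a real system would query a data store."""
    return "Openlayer runs tests on the requests your AI system receives."


@trace()
def answer_question(question: str, openai_client) -> str:
    """Top-level step. The OpenAI call made here is one step of the trace."""
    context = retrieve_context(question)
    completion = openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content
```

Passing the client as an argument keeps the function easy to exercise with a stub in unit tests; the Tracing guide remains the reference for how steps are recorded.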

Once your AI system’s requests are continuously published to and logged by Openlayer, you can create tests that run on them at a regular cadence. Refer to the Monitoring overview for details on Openlayer’s monitoring mode, to the Publishing data guide for more information on setting it up, or to the Tracing guide to understand how to trace more complex systems.

Development

In development mode, Openlayer becomes a step in your CI/CD pipeline, and your tests are evaluated automatically whenever a triggering event occurs. Openlayer tests often rely on your AI system’s outputs on a validation dataset. As discussed in the Configuring output generation guide, you have two options:

provide a way for Openlayer to run your AI system on your datasets, or
generate the model outputs yourself before pushing, and push them alongside your artifacts.
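For the second option, the outputs can be computed with a short script before pushing. The following is a minimal sketch under stated assumptions: the column names `question` and `output` are hypothetical, `generate` stands in for whatever function calls your OpenAI model, and the exact dataset format Openlayer expects is described in the Configuring output generation guide, not here.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def add_outputs(
    rows: Iterable[Dict[str, str]],
    generate: Callable[[str], str],
    input_column: str = "question",   # hypothetical column name
    output_column: str = "output",    # hypothetical column name
) -> List[Dict[str, str]]:
    """Return copies of the rows with the model output added as a new column."""
    return [{**row, output_column: generate(row[input_column])} for row in rows]


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In practice, `generate` would wrap a chat completion call, and the resulting CSV text would be written to the file you push alongside your other artifacts.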

For AI systems built with OpenAI LLMs, if you are not computing your system’s outputs yourself, you must provide your API credentials. To do so, navigate to “Workspace settings” -> “Environment variables,” and add the OPENAI_API_KEY variable.

For Azure OpenAI, add the AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT secrets instead.

If you don’t add the required OpenAI API key, you’ll encounter a “Missing API key” error when Openlayer tries to run your AI system to get its outputs.

You can use one of the OpenAI templates to see what a sample project fully set up with Openlayer looks like. We have templates in Python and TypeScript.

Using OpenAI LLMs as the LLM judge

Some tests on Openlayer rely on scores produced by an LLM judge. Examples include tests that use Ragas metrics and the custom LLM evaluator test. You can use any of OpenAI’s LLMs as the underlying LLM judge for these tests.

You can change the default LLM evaluator for a project on the project settings page. To do so, navigate to “Settings”, select your project in the left sidebar, and click “Metrics” to go to the metric settings page. Under “LLM evaluator,” choose the OpenAI LLM you want to use. Also make sure to add your OPENAI_API_KEY as an environment variable.
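To make the idea of an LLM judge concrete, the sketch below shows the general pattern: the judge model receives the criteria and the answer, replies with a number, and that number becomes the score. This is a generic illustration, not Openlayer’s implementation; the prompt wording, the 0 to 1 scale, and every function name here are assumptions, and `ask` stands in for a call to the chosen OpenAI LLM.

```python
import re
from typing import Callable, Dict, List

# Hypothetical judge instructions; real evaluators use their own prompts.
JUDGE_INSTRUCTIONS = (
    "You are grading an AI system's answer. Reply with a single number "
    "between 0 and 1, where 1 means the answer fully satisfies the criteria."
)


def build_judge_messages(criteria: str, question: str, answer: str) -> List[Dict[str, str]]:
    """Assemble the chat messages sent to the judge model."""
    user_content = f"Criteria: {criteria}\nQuestion: {question}\nAnswer: {answer}"
    return [
        {"role": "system", "content": JUDGE_INSTRUCTIONS},
        {"role": "user", "content": user_content},
    ]


def parse_score(reply: str) -> float:
    """Extract the first number in the reply and clamp it to [0, 1]."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"No numeric score found in judge reply: {reply!r}")
    return min(1.0, max(0.0, float(match.group())))


def judge(criteria: str, question: str, answer: str,
          ask: Callable[[List[Dict[str, str]]], str]) -> float:
    """Score an answer by asking the judge model and parsing its reply."""
    return parse_score(ask(build_judge_messages(criteria, question, answer)))
```

Because every score comes from a call to the judge model, the judge needs valid credentials, which is why the OPENAI_API_KEY environment variable is required.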
