May 3, 2024

Simple, dev-focused workflow for AI evals

Most of us now understand how crucial AI evals are. The problem is that almost all the eval platforms we’ve seen are clunky: they require too much manual setup and adaptation, which breaks developers’ workflows.

Last week, we released a radically simpler workflow.

You can now connect your GitHub repo to Openlayer, and every commit on GitHub will also commit to Openlayer, triggering your tests. You now have continuous evaluation without extra effort.

You can customize the workflow using our CLI and REST API. We also offer template repositories around common use cases to get you started quickly.

You can use the same setup to monitor your live AI systems after you deploy them. It’s just a matter of setting some variables, and your Openlayer tests will run on your live data and send alerts if they start failing.

We’re very excited for you to try out this new workflow, and as always, we’re here to help and all feedback is welcome.

Features

• Integrations: Developer workflow (GitHub integration, CLI and REST API, sample repositories for various workflows, ability to clone sample repositories in Openlayer UI)
• Evals: New test: column A grouped by column B
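
As a rough illustration of what a "column A grouped by column B" test might check, the sketch below averages one column per group and flags the groups that exceed a limit. This is not Openlayer's implementation or API; the function names, the column names (`latency_ms`, `model`), and the threshold are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from statistics import mean


def grouped_mean(rows, value_col, group_col):
    """Return {group: mean of value_col} for a list of dict rows."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_col]].append(row[value_col])
    return {group: mean(values) for group, values in buckets.items()}


def failing_groups(rows, value_col, group_col, max_mean):
    """Return the sorted groups whose mean exceeds max_mean (a hypothetical threshold)."""
    means = grouped_mean(rows, value_col, group_col)
    return sorted(group for group, m in means.items() if m > max_mean)


# Hypothetical sample data: column A is latency_ms, column B is model.
rows = [
    {"latency_ms": 120, "model": "a"},
    {"latency_ms": 180, "model": "a"},
    {"latency_ms": 400, "model": "b"},
    {"latency_ms": 600, "model": "b"},
]

print(failing_groups(rows, "latency_ms", "model", max_mean=300))  # ['b']
```

Grouping before aggregating surfaces a problem in one segment (here, model "b") that a single overall average could hide.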

Improvements

• UI/UX: Move test options to header bar in modals
• UI/UX: Improvements to test results modals
• UI/UX: Improve layout of workspace onboarding
• UI/UX: Ability to delete tests
• Evals: Relevant tests created automatically upon project creation in onboarding
• UI/UX: Polished design of in-app callouts
• UI/UX: Polish to activity log
• Documentation: Reorganization of docs
• API: Allow None values in token column

Fixes

• UI/UX: Row outputs in panel were injected into chat history format when they should not have been
• UI/UX: Row panel dropdowns did not appear when opened from a test modal
• UI/UX: Monitoring graphs showed no recent results even when there were some
• UI/UX: Opening create test modal for Group by Column test crashed the app
• UI/UX: Column parameters could not be changed for Group By tests
• Platform: Creating a commit without a model failed
• UI/UX: Project filtering did not work in overview page
• UI/UX: Creating Character Length tests ran into a client-side error when there were no input variables
• UI/UX: Client-side exception when opening requests
