A log of all the changes and improvements made to our app

October 24th, 2023

Evals for LLMs, real-time monitoring, Slack notifications and so much more!

Introducing support for testing LLMs & monitoring production data 🔍📊

It’s been a couple of months since we posted our last update, but not without good reason! Our team has been cranking away at our two most requested features: support for LLMs and real-time monitoring / observability. We’re so excited to share that they are both finally here! 🚀

We’ve also added a Slack integration, so you can receive all your Openlayer notifications right where you work. Additionally, you’ll find tons of improvements and bug fixes that should make your experience using the app much smoother.

All Sandbox accounts have been upgraded to a free Starter plan, which lets you create your own project in development and production mode. We hope you find this useful!

Join our Discord for more updates like this and get closer to our development journey!

New features

  • LLMs in development mode
    • Experiment with and version different prompts, model providers and chains
    • Create a new commit entirely in the UI with our prompt playground. Connects seamlessly with OpenAI, Anthropic and Cohere
    • Set up sophisticated tests around RAG (hallucination, harmfulness, etc.), regex validation, JSON schemas, and much more
  • LLMs in monitoring mode
    • Seamlessly evaluate responses in production with the same tests you used in development and measure token usage, latency, drift and data volume too
  • All existing tasks support monitoring mode as well
  • Toggle between development mode and monitoring mode for any project
  • Add a few lines of code to your model’s inference pipeline to start monitoring production data
  • Slack & email notifications
    • Set up personal and team notifications
    • Get alerted on goal status updates in development and production, team activity like comments, and other updates in your workspace
  • Several new tests across all AI task types
  • New sample project for tabular regression
  • Select and star the metrics you care about for each project
  • Add encrypted workspace secrets your models can rely on
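
To make the regex validation, JSON and latency items above more concrete, here is a minimal sketch of what such checks on an LLM response can look like. It is not Openlayer's implementation and does not use the Openlayer client: the helper names (`matches_regex`, `has_required_keys`, `timed_call`) are hypothetical, and the JSON check only verifies that required keys are present, which is a simplified stand-in for full JSON schema validation.

```python
import json
import re
import time
from typing import Callable, Iterable


def matches_regex(output: str, pattern: str) -> bool:
    """Regex validation: the entire output must match the pattern."""
    return re.fullmatch(pattern, output) is not None


def has_required_keys(output: str, required: Iterable[str]) -> bool:
    """Simplified JSON check (assumption: not a full JSON schema validator).

    Passes only if the output parses as a JSON object that contains
    every required key.
    """
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and set(required).issubset(parsed)


def timed_call(model: Callable[[str], str], prompt: str) -> dict:
    """Call a model function and record its latency next to a check result.

    `model` is any callable that maps a prompt to a response string;
    it stands in for a real LLM call.
    """
    start = time.perf_counter()
    output = model(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    return {
        "output": output,
        "latency_ms": latency_ms,
        "valid_json": has_required_keys(output, {"answer"}),
    }
```

In this sketch the same check functions can run on a development dataset or on each production response, which mirrors the idea of reusing development tests in monitoring mode.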

Improvements

  • Revamped onboarding for more guidance on how to get started quickly with Openlayer in development and production
  • Better names for suggested tests
  • Add a search bar to filter integrity and consistency goals on the create page
  • Reduce feature profile size for better app performance
  • Add test activity item for suggestion accepted
  • Improved commit history allows for better comparison of the changes in performance between versions of your model and data across chosen metrics and goals
  • Added indicators to the aggregate metrics on the project page that show how they have changed from the previous commit in development mode
  • Improved logic for skipping or failing tests that don’t apply
  • Updated design of the performance goal creation page for a more efficient and clear UX
  • Allow specifying MAPE as metric for the regression heatmap
  • Improvements to data tables throughout the app, including better performance and faster loading times
  • Improved UX for viewing performance insights across cohorts of your data in various distribution tables and graphs
  • Updated and added new tooltips throughout the app for better clarity of concepts
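
For readers unfamiliar with MAPE (mean absolute percentage error), mentioned above as a metric option for the regression heatmap, the standard definition can be sketched as follows. This is the textbook formula, expressed in percent; whether the app reports it as a percentage or a fraction, and how it handles zero targets, are assumptions not confirmed by this changelog. The function name `mape` is illustrative.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent.

    MAPE = 100 / n * sum(|y_true_i - y_pred_i| / |y_true_i|)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("MAPE needs at least one sample")
    if any(t == 0 for t in y_true):
        # Division by a zero target is undefined; this sketch refuses
        # rather than silently skipping or clipping those rows.
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)
```

For example, `mape([100, 200], [110, 180])` is 10 percent, since each prediction is off by 10% of its target.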

Bug fixes

  • Downloading commit artifacts triggered duplicate downloads
  • Fixed lagginess when browsing large amounts of data in tables throughout the app
  • Valid subpopulation filters sometimes rendered an empty data table
  • Fixed bugs affecting experience navigating through pages in the app
  • Fixed issues affecting the ability to download data and logs from the app
  • Filtering by tokens in token cloud insight would not always apply correctly
  • Fixed UI bugs affecting the layout of various pages throughout the app that caused content to be cut off
  • Fixed Python client commit upload issues