We’re introducing two exciting new features to our observability platform: automatic thresholds for tests and anomaly detection.
Openlayer now supports automatic thresholds for tests, which are data-driven and adapt to your AI system over time. Whether you're monitoring cost, data quality, or GPT eval scores, we'll suggest thresholds based on historical patterns to take the guesswork out of defining acceptable criteria for your system.
We’ve also introduced anomaly detection to flag test results that deviate from the norm. This means you’ll be alerted whenever a result falls outside the automatic thresholds we predict.
Both features are designed to take the pain out of manual setup and make your evaluations more proactive and intelligent. To get started, just create a new test in the Openlayer app and choose automatic when setting the threshold.
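If you prefer to script this (test creation from the API is also part of this release; see the Features list below), the rough sketch below shows what such a request could look like. The endpoint path, payload fields, and the automatic-threshold flag are illustrative assumptions, not the documented API shape; consult the Openlayer API reference for the real request format.

import os

import requests

# Hypothetical sketch: create a test with an automatic threshold via the API.
# The route and payload fields below are assumptions for illustration only.
OPENLAYER_API_KEY = os.environ["OPENLAYER_API_KEY"]
PROJECT_ID = "YOUR_PROJECT_ID"  # placeholder

payload = {
    "name": "Mean cost per request",
    "type": "metricThreshold",           # assumed test type identifier
    "metric": "cost",                     # assumed metric name
    "threshold": {"mode": "automatic"},   # assumed flag for data-driven thresholds
}

response = requests.post(
    f"https://api.openlayer.com/v1/projects/{PROJECT_ID}/tests",  # assumed route
    headers={"Authorization": f"Bearer {OPENLAYER_API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json())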
Features
• MCP: Release the Openlayer MCP server so you can use Openlayer tests in IDE workflows
• SDKs: Add an OpenLIT integration notebook
• SDKs: Add a convenience function that copies tests from one project to another
• SDKs: Add an option to the push function to wait for commit completion
• SDKs: Add an async OpenAI tracer (see the sketch after this list)
• API: Support creating tests from the API
• Evals: Support for automatic thresholds
• UI/UX: Daily feature distribution graphs for tabular data projects
• Evals: Add a column statistic test that supports mean, median, min, max, std, sum, count, and variance
• Evals: Add a raw SQL query test
• Integrations: Add support for directly integrating a project with BigQuery tables for continuous data quality monitoring
• Evals: Add an anomalous column detection test
• Platform: Add root cause analysis and segment distribution graphs to various tests’ diagnostic pages
• Evals: Add support for Gemini 2.0 models for LLM-as-a-judge tests
• Platform: Add a priority property to tests (critical, high, medium, low)
• Platform: Include or exclude inference pipelines when creating tests in a project
• Platform: Add record count and last-record-received date to inference pipelines
• Evals: Support running monitoring-mode tests on the entire history of data rather than moving windows
• Platform: On-premise deployment guides for OpenShift and AWS EKS
• Security: Project-level permissions through access groups
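To illustrate the async OpenAI tracer mentioned above, here is a minimal sketch. It assumes the async helper is exposed as trace_async_openai in openlayer.lib, mirroring the existing trace_openai wrapper, and that the API key and inference pipeline are configured through environment variables; check the SDK reference for the exact import path and configuration.

import asyncio

from openai import AsyncOpenAI
from openlayer.lib import trace_async_openai  # assumed name, mirroring trace_openai

# Wrapping the async client so each chat completion is traced to the
# inference pipeline configured via environment variables such as
# OPENLAYER_API_KEY and OPENLAYER_INFERENCE_PIPELINE_ID.
client = trace_async_openai(AsyncOpenAI())

async def main() -> None:
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Summarize yesterday's error logs."}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())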
Improvements
• Platform: Immediately execute tests in monitoring mode
• Platform: Parse OpenTelemetry traces from Semantic Kernel and Spring AI
• Platform: Test failures no longer cause the commit’s status to fail
• Evals: Tweak the LLM-as-a-judge base prompt to improve consistency
Fixes
• UI/UX: Fix a broken link in connected Git repo settings
• Evals: Increase the LLM-as-a-judge criteria character limit
• UI/UX: Enable sorting data tables by booleans
• Platform: Surface OpenAI refusals to the user in LLM-as-a-judge tests
• Platform: Add a notification when batch data uploads fail