This month, we’re excited to unveil our brand new UI! We’ve defined an improved design system, with updated, thoughtfully crafted styles and components that give the product a fresh, engaging look and feel. The new design system supports both light and dark modes, so be sure to check out both and find the one that suits you.
With the new look, we’ve made a number of other experience improvements, including adding priority levels for tests, improving the navigation and information hierarchy, and adding more data visualizations throughout the product.
We can’t wait for you to try it out, and we’d love to hear your thoughts.
Features
• UI/UX: Brand new UI that's faster, slicker and more enjoyable to use
• SDKs: Support tracing Bedrock models
• SDKs: Support tracing OpenAI Agents
• SDKs: Support tracing Pydantic AI systems
• SDKs: Support tracing LangGraph systems
• Security: New "Member restricted" role, which can perform member actions without viewing data source data
• Integrations: Directly connect Snowflake tables to projects
• UI/UX: View project, datasets and table dropdowns when connecting BigQuery tables
• Platform: Allow hosting Openlayer on subpaths in on-prem deployments
• Platform: Allow users to override LLM costs with custom costs
• Evals: Include standard deviation score in LLM-as-a-judge and Ragas test results
• Evals: New prompt injection test to detect adversarial attacks on LLM systems
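To illustrate why a standard deviation alongside the score can be useful, here is a minimal sketch of summarizing repeated judge scores for one test. It is not Openlayer's implementation: the function name `summarize_scores`, the example scores, and the choice of the sample (rather than population) standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def summarize_scores(scores: list[float]) -> dict[str, float]:
    """Return the mean and sample standard deviation of a list of scores.

    Hypothetical helper for illustration only; the platform may compute
    its statistics differently (for example, population standard deviation).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # statistics.stdev needs at least two data points, so report 0.0
    # when only a single score is available.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return {"mean": mean(scores), "std": spread}


# Made-up scores from four evaluations of the same criterion.
summary = summarize_scores([0.8, 0.9, 0.7, 1.0])
print(summary)
```

A low standard deviation suggests the scores agree with one another, while a high one is a hint that the mean alone may hide unstable results.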
Improvements
• Platform: Rename "inference pipelines" to "data sources" to capture broader scope
• Platform: Better skipped test messages for metrics that require ground truths
• API: Speed up endpoints that return record counts and last record date for data sources
• Evals: Show per-row scores for metrics like semantic similarity, exact match in data tables
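As a rough picture of what a per-row score looks like next to the aggregate, the sketch below computes exact match row by row and then averages it. This is an assumption-laden illustration, not the platform's code: the function name `exact_match_scores`, the sample data, and the whitespace stripping before comparison are all hypothetical choices.

```python
def exact_match_scores(outputs: list[str], ground_truths: list[str]) -> list[int]:
    """Return 1 for each row whose output equals its ground truth, else 0.

    Hypothetical helper for illustration only. Stripping surrounding
    whitespace before comparing is an assumption, not documented behavior.
    """
    if len(outputs) != len(ground_truths):
        raise ValueError("outputs and ground_truths must have the same length")
    return [int(o.strip() == g.strip()) for o, g in zip(outputs, ground_truths)]


# Made-up rows: the third output does not match its ground truth.
rows = exact_match_scores(["Paris", "4 ", "blue"], ["Paris", "4", "green"])
aggregate = sum(rows) / len(rows)
print(rows, aggregate)
```

Seeing the per-row values makes it easier to find which specific records pull the aggregate down.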
Fixes
• Evals: Tests that use both historical data and auto thresholds were erroring
• API: Speed up data source creation request
• Platform: Re-run tests that are stuck in running state
• API: Allow streaming data with numpy arrays in the body