We’re very excited to introduce test bundles to the Openlayer platform! Easily create a set of tests related to use cases or policies of interest, such as the EU AI Act, OWASP, agentic workflows, data quality, and more. Comprehensive test bundles ensure you catch errors across all your key use cases before they slip through the cracks.
Along with test bundles, we’ve added some new tests to our library: groundedness for LLMs, toxicity score, checks for prompt injections, and checks to see if the outputs recommend a competitor company. You can create these tests today in the Openlayer app!
Features
• Platform: Introduced new metrics and tests, including toxicity and groundedness
• Platform: Released test bundles for the EU AI Act, OWASP, agentic workflows, data quality, and more
• CLI: Added support for new Python runtimes in development mode
Improvements
• Platform: Improved data polling and exception handling for the BigQuery integration
• UI/UX: Enhanced navigation icons across the app for better visual clarity
• API: Improved handling of attributes from the latest version of the OpenTelemetry GenAI semantic conventions
• Platform: Enhanced the secret management interface
• Platform: Improved the explanations for LLM-based metrics
• Platform: Improved filtering, including filtering tests by priority, status, name, and more
• CLI: Improved date range parsing for the export command
• SDKs: Refactored trace functionality in the Openlayer TypeScript SDK, with improvements to various integrations, including the LangChain callback handler and Bedrock Agents
• Docs: Improved documentation for integrations such as BigQuery and Oracle OCI
Fixes
• SDKs: Fixed Python SDK tracing bugs that occurred when the traced function yields generators
• Platform: Sped up the PII detection test
• SDKs: Improved JSON serialization for platform data uploads