Evaluating Agents With Braintrust - Detailed Analysis & Overview


Evaluating Agents with Braintrust

How to evaluate AI agents with Braintrust

Evaluating Agents and Assistants: The AI Conference

Jason Lopatecki, Co-Founder and CEO of Arize AI, dives into the world of ...

Intro to Evals with Braintrust

In this video, we walk through the complete eval workflow, including creating datasets, prompts, and scorers.
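The three pieces named in this snippet (datasets, prompts/tasks, and scorers) can be sketched as a minimal eval harness in plain Python. This is an illustrative stand-in, not the Braintrust SDK; every name below is hypothetical, and the "model" is a toy function so the sketch runs offline.

```python
# Minimal eval-workflow sketch: a dataset of cases, a task under test,
# and scorers that grade each output. Illustrative only -- not the
# Braintrust SDK API; all names here are hypothetical.

def exact_match(output: str, expected: str) -> float:
    """Scorer: 1.0 if the output matches the expected answer exactly."""
    return 1.0 if output.strip() == expected.strip() else 0.0

def task(prompt: str) -> str:
    """Stand-in for the model call under evaluation."""
    return prompt.upper()  # toy "model" so the sketch runs offline

dataset = [
    {"input": "hello", "expected": "HELLO"},
    {"input": "world", "expected": "world"},
]

def run_eval(dataset, task, scorers):
    """Run every case through the task and grade it with every scorer."""
    results = []
    for case in dataset:
        output = task(case["input"])
        scores = {fn.__name__: fn(output, case["expected"]) for fn in scorers}
        results.append({"input": case["input"], "output": output, "scores": scores})
    return results

results = run_eval(dataset, task, [exact_match])
print(sum(r["scores"]["exact_match"] for r in results) / len(results))  # → 0.5
```

The real SDK adds persistence, diffing between runs, and a UI on top of this loop, but the data/task/scores triple is the core shape of an eval.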

Why building eval platforms is hard — Phil Hetzel, Braintrust

An eval platform is not just a test runner. You are building shared definitions of "good," reliable data pipelines, labelling workflows, ...
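One way to make the "shared definition of good" concrete is to store a human label next to each automated score and surface the cases where they disagree, since those are the examples that force the team to refine its rubric. The sketch below is a hypothetical illustration of that labelling-workflow idea, not any platform's actual schema.

```python
# Labelling-workflow sketch: each example carries an automated score and
# (eventually) a human label; disagreements between the two drive the
# shared definition of "good". All field names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LabelledExample:
    input: str
    output: str
    auto_score: float                  # from an automated scorer, in [0, 1]
    human_label: Optional[str] = None  # "good" / "bad", filled in by review

def disagreements(examples):
    """Return reviewed examples where the scorer and the human disagree."""
    flagged = []
    for ex in examples:
        if ex.human_label is None:
            continue  # still awaiting human review
        auto_says_good = ex.auto_score >= 0.5
        human_says_good = ex.human_label == "good"
        if auto_says_good != human_says_good:
            flagged.append(ex)
    return flagged

examples = [
    LabelledExample("q1", "a1", auto_score=0.9, human_label="good"),
    LabelledExample("q2", "a2", auto_score=0.8, human_label="bad"),  # disagreement
    LabelledExample("q3", "a3", auto_score=0.2),                     # unreviewed
]
print(len(disagreements(examples)))  # → 1
```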

Evals 101 — Doug Guthrie, Braintrust

This hands-on workshop guides participants through the full AI ...

Evaluating agents: how we built Loop, the AI assistant for evals

Join Doug Guthrie, Solutions Engineer at ...

Intro to Loop: the AI agent for evals and observability in Braintrust

Learn how you can use Loop in ...

Intro to Remote Agent Evals with Braintrust

In this video, we walk through how you can connect an ...

Intro to Braintrust: AI Observability and Evals

An end-to-end walkthrough of ...

Langfuse Launch Week Day 3: Agent Tracing and Evaluation

We're introducing a set of upgrades to make complex ...

Braintrust and Box on AI agents and the future of AI observability

Evaluating and Debugging Non-Deterministic AI Agents

How to evaluate agents in practice

LLM as a Judge: Scaling AI Evaluation Strategies

Ready to become a certified watsonx AI Assistant Engineer? Register now and use code IBMTechYT20 for 20% off of your exam ...
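LLM-as-a-judge, the technique this entry's title names, typically means prompting a second model to grade a candidate answer against a rubric and mapping its verdict to a score. The sketch below stubs the judge with a keyword check so it runs offline; in practice `stub_judge` would be an LLM API call, and all names here are hypothetical.

```python
# LLM-as-a-judge sketch: a "judge" grades a candidate answer against a
# rubric and its verdict is mapped to a numeric score. The judge is
# stubbed so the example runs offline; in practice it would be an LLM
# API call. All names are hypothetical.

RUBRIC = "Answer must mention both 'datasets' and 'scorers'."

def stub_judge(prompt: str) -> str:
    """Stand-in for a judge-model call: returns 'PASS' or 'FAIL'."""
    answer = prompt.split("ANSWER:", 1)[1]  # grade only the answer portion
    ok = "datasets" in answer and "scorers" in answer
    return "PASS" if ok else "FAIL"

def llm_judge_score(answer: str, rubric: str = RUBRIC) -> float:
    """Build the judge prompt, get a verdict, map it to a score."""
    prompt = f"RUBRIC: {rubric}\nANSWER: {answer}\nVerdict (PASS/FAIL):"
    verdict = stub_judge(prompt)
    return 1.0 if verdict == "PASS" else 0.0

print(llm_judge_score("Evals need datasets, prompts, and scorers."))  # → 1.0
print(llm_judge_score("Just vibe-check the outputs."))                # → 0.0
```

The key design point is that the rubric, not the judge model, carries the definition of "good": swapping judges should not silently change the criteria.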

[Evals Workshop] Mastering AI Evaluation: From Playground to Production

This hands-on workshop will guide participants through the complete AI

Agentic Evals by Shishir Patil

He cited "MLGym," a framework from Meta and collaborators, as a concrete example of a system for ...