AI Toolkit + GitHub Copilot - Pt. 6: Evaluate Agent Output

Uploaded: Dec 1, 2025
Duration: 1363 s (22:43)

Microsoft Developer (677K subscribers)

1.4K views

This video is Part 6 of the AI Toolkit + GitHub Copilot video series and part of the GitHub Copilot + AI Toolkit Pet Planner workshop. View the repo and instructions: https://aka.ms/AIToolkit/workshop

Join April as she demonstrates how to use Copilot in Agent mode to prepare for evaluating an agent’s output. Copilot uses AI Toolkit tools to help developers choose evaluators, create a dataset, and create an evaluation script to evaluate agent output.

Install the AI Toolkit: https://aka.ms/AIToolkit
Set up your Microsoft Foundry project: https://ai.azure.com
Learn more about Microsoft Foundry Model and Tools announcements: https://aka.ms/model-mondays
Join the Discord: https://aka.ms/insideMF/discord
Hop on the Forum: https://aka.ms/insideMF/forum

Chapter Markers
00:00 - 00:02 - Introduction
00:03 - 01:19 - Recap of current progress
01:20 - 02:57 - Choose evaluators with Copilot
02:58 - 07:00 - Create a dataset with Copilot
07:01 - 16:50 - Review evaluation plan and create evaluation script
16:51 - 18:50 - Review evaluation output
18:51 - 22:41 - Use Copilot to create an evaluation report with recommendations
