How to Evaluate (and Improve) Your LLM Apps

10.5K views
Mar 17, 2025
27:19

🤝 Work with me: https://aibuilder.academy/yt/-sL7QzDFW-4

Here, I discuss 3 types of evals and how to use them to improve LLM apps.

📰 Blog: https://medium.com/@shawhin/how-to-evaluate-and-improve-your-llm-apps-f7b08fb7493c?sk=f2fbcd3f16b958baa4734d4a39d5b237
💻 Example Code: https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/evals

References
[1] https://youtu.be/XGJNo8TpuVA
[2] arXiv:2501.12948 [cs.CL]
[3] arXiv:2402.01383 [cs.CL]
[4] https://hamel.dev/blog/posts/llm-judge/
[5] arXiv:2203.02155 [cs.CL]
[6] https://youtu.be/SnbGD677_u0

--

Intro - 0:00
Vibe Checks - 0:27
Evals - 3:26
Type 1: Code-based - 5:58
Type 2: Human-based - 9:34
Type 3: LLM-based - 13:34
Example: Improving y2b with LLM Judge - 15:28

