LLMeter - LLM Speed Test
Want to know how fast your LLM actually is? In this video, we dive into LLMeter, a brand-new Python library from AWS Labs designed to measure LLM performance metrics such as Time to First Token (TTFT), Time to Last Token (TTLT), and Tokens Per Second (TPS). We walk through a complete hands-on setup using Python 3.10+ and the uv package manager. You’ll learn how to configure endpoints for providers like OpenAI, Anthropic, and DeepSeek, and how to track costs in real time using the built-in cost models.

Timestamps
00:09 Intro to LLMeter by AWS Labs
01:10 Installation (UV & Pip) and API Configuration
01:38 Key Concepts: Endpoints, Payloads, and Load Testing
02:54 Configuring the Cost Model for Token Tracking
04:46 Running a Sequence of Clients Test
05:40 Analyzing Real-time Statistics in the Terminal
06:28 Plotly Graph Walkthrough: TTFT vs Number of Clients
07:30 Generating Histograms and Heatmaps
08:22 Multi-provider Comparison (GPT-4 vs GPT-4 Mini)
09:04 Building a Real-time Performance Dashboard

➡️ Subscribe to my blog https://qainsights.com
➡️ Join QAInsights Community at https://qain.si/community
➡️ Join QAInsights Academy at https://qain.si/academy
➡️ Buy me a tea 🍵 https://www.buymeacoffee.com/qainsights
➡️ Get Certified in CKAD https://qain.si/startk8s
➡️ Get Certified in CKA https://qain.si/cka
➡️ My preferred DNS is NextDNS https://qain.si/nextdns
➡️ Learn Linux https://qain.si/linux
➡️ Get performance testing jobs real quick using Indeed → https://goo.gl/XAfCcE
➡️ Hostinger Web Hosting → https://goo.gl/MfwDyU
➡️ My Productivity Tools → https://goo.gl/2DfC5d
➡️ App Sumo for your business → https://goo.gl/zj92SA
➡️ Amazon → https://amzn.to/2L0Jv2n
➡️ TubeBuddy → https://www.tubebuddy.com/qainsights
➡️ LoadRunner Playlist https://www.youtube.com/playlist?list=PLJ9A48W0kpRIiVf8W7jMvf6Ao-naX3Ari
➡️ My first Udemy course, `Performance Testing using DevWeb`, has been published https://qain.si/devweb
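
For readers who want a feel for the metrics named in the description before watching, here is a minimal Python sketch of how time to first token, time to last token, tokens per second, and a per-request cost could be derived from token arrival timestamps. It does not use LLMeter's API: `StreamMetrics`, `compute_stream_metrics`, and `estimate_cost` are hypothetical names made up for illustration. It also assumes tokens per second means output tokens divided by the time to last token, which is only one of several common definitions, and that prices are quoted per million tokens.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    ttft: float  # seconds from request start to the first token
    ttlt: float  # seconds from request start to the last token
    tps: float   # output tokens per second over the whole request


def compute_stream_metrics(
    request_start: float, token_times: Sequence[float]
) -> StreamMetrics:
    """Derive latency metrics from the arrival time of each streamed token.

    Hypothetical helper, not part of LLMeter. Times are in seconds on the
    same clock (for example, values returned by time.perf_counter()).
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    ttft = token_times[0] - request_start
    ttlt = token_times[-1] - request_start
    # Assumption: TPS = output tokens / time to last token.
    tps = len(token_times) / ttlt if ttlt > 0 else float("inf")
    return StreamMetrics(ttft=ttft, ttlt=ttlt, tps=tps)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Estimate the cost of one request from token counts.

    Assumption: prices are per one million tokens, in any one currency.
    """
    total = (
        input_tokens * price_in_per_million
        + output_tokens * price_out_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Four tokens arriving every half second after the request was sent.
    metrics = compute_stream_metrics(0.0, [0.5, 1.0, 1.5, 2.0])
    print(metrics)
    # Placeholder prices, not real provider pricing.
    print(estimate_cost(1000, 500, 2.0, 8.0))
```

Under load, a tool would collect these values per request across many concurrent clients and then summarize them (for example, as percentiles per client count), which is the kind of view the chapters on statistics and graphs cover.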