
Scheduling Impacts on LLM Inference

24 views
May 20, 2026
1:26:48

Our new book club series is about LLM inference. Ted has done a deep dive into how LLM inference works and the techniques for optimizing its performance. This week we will discuss chapter 5, on scheduling, including batching and prefill techniques. A free copy of the LLM Inference Illustrated book is available at https://tedkyi.github.io/llm-inference/

Come join us in person or online. Please make sure to read the instructions for joining the event below.

Agenda:
12:00 - 1:15 pm -- Presentation and discussion
Time permitting -- Additional Q&A, networking

Links to notes/slides and videos of prior meetups are available on the SDML GitHub repo: https://github.com/SanDiegoMachineLearning/bookclub

Location: We are meeting at Aquillius in Rancho Bernardo.

Please note: There are two steps required to join the online meetup:
1. You must go to our Slack community and ask for the password for the meeting. The link to join is below.
2. You must have a Zoom login in order to join the event. A free Zoom account will work.
If you get an error message when joining the Zoom, please log in to your account on the Zoom website, then try again.

Community: Join our Slack channel for questions and discussion about what's new in ML: https://join.slack.com/t/sdmachinelearning/shared_invite/zt-34vyls6jn-3cREuo8EoPmo6AKwTEgGgA

