
Webinar: "Quantization: unlocking scalability for LLMs"

Jun 17, 2024
1:01:24

Running large language models (LLMs) can be memory-intensive and slow. Join us for an exclusive webinar where we will explore the benefits of AI quantization. Discover how to cut memory costs and reduce latency while gaining enhanced reliability, security, and more from optimized models running on device. During this webinar, our experts from Qualcomm AI Research will delve into state-of-the-art LLM quantization techniques, such as vector quantization, designed to address the challenges of making AI work efficiently and accurately on device. Don't miss this opportunity to gain valuable insights and discover the tools provided by Qualcomm Innovation Center to help developers quantize their models.

Speakers:

Mart van Baalen, Senior Staff Engineer/Manager
Mart van Baalen is a Senior Staff Engineer/Manager at Qualcomm AI Research in Amsterdam. His work focuses on efficient deep learning inference, through neural network quantization as well as other methods to maximally exploit efficient inference hardware. Mart joined Qualcomm in 2017 as part of the acquisition of UvA spin-off Scyfer. Besides his work in AI research, Mart is a professional trombone player.

Abhijit Khobare, Director of Software Engineering
Abhijit Khobare is a director of software engineering at Qualcomm Technologies, Inc. (QTI) and currently leads development of model compression and model quantization software tools as part of the Qualcomm AI Research program at Qualcomm Research, a division of QTI. Prior to his current role, Khobare worked on cellular research for various 4G and 5G technologies as part of Qualcomm Research, helping prototype several pioneering features like LTE-enhanced and LTE-WLAN coexistence. He also contributed to QTI’s commercial cellular stack for 4G. Before joining Qualcomm, Khobare was at Intel, where he worked on high-speed TCP offload using network accelerator chips and on server load balancing solutions. Abhijit holds a Master of Science in Computer Science from Virginia Tech.

Moderated by Armina Stepan (Qualcomm Technologies Netherlands).

