LLMOps: OpenVINO Toolkit INT4 quantization of Llama 3.2 3B, CPU inference #datascience #machinelearning

Name: LLMOps: OpenVINO Toolkit INT4 quantization of Llama 3.2 3B, CPU inference #datascience #machinelearning
Uploaded: Oct 10, 2024
Duration: 1576 s
Description: In this video I show how to convert the Llama 3.2 3B model to the OpenVINO IR format and quantize it to INT4. We then run inference on CPU using chain-of-thought (CoT) prompts. Notebook: https://github.com/olonok69/LLM_Notebooks/blob/main/quantization/openvino/llama3.2_3B/llama_3.2_int4.ipynb
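
The workflow in the description (export to OpenVINO IR, compress weights to INT4, generate on CPU with a CoT prompt) can be sketched roughly as follows. This is a minimal sketch using the optimum-intel integration, not necessarily the code in the linked notebook. The function names, the quantization settings (asymmetric, group size 128, ratio 0.8) and the prompt wording are all illustrative assumptions.

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question in a zero-shot chain-of-thought prompt.

    The wording is an illustrative assumption, not taken from the video.
    """
    return f"Question: {question.strip()}\nAnswer: Let's think step by step."


def export_int4(model_id: str, out_dir: str) -> None:
    """Export a Hugging Face checkpoint to OpenVINO IR with 4-bit weights."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
    from transformers import AutoTokenizer

    # Assumed settings: asymmetric INT4, groups of 128 weights,
    # about 80% of layers in INT4 and the rest kept in INT8.
    q_config = OVWeightQuantizationConfig(bits=4, sym=False, group_size=128, ratio=0.8)

    # export=True converts the PyTorch checkpoint to OpenVINO IR on load.
    model = OVModelForCausalLM.from_pretrained(
        model_id, export=True, quantization_config=q_config
    )
    model.save_pretrained(out_dir)
    # Save the tokenizer next to the IR files so out_dir is self-contained.
    AutoTokenizer.from_pretrained(model_id).save_pretrained(out_dir)


def generate_on_cpu(model_dir: str, question: str, max_new_tokens: int = 256) -> str:
    """Load the exported INT4 model and answer a question on CPU."""
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = OVModelForCausalLM.from_pretrained(model_dir, device="CPU")

    inputs = tokenizer(build_cot_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be export_int4("meta-llama/Llama-3.2-3B-Instruct", "llama32_3b_int4") followed by generate_on_cpu("llama32_3b_int4", "..."). The model ID and output directory here are assumptions; the Llama 3.2 checkpoints on the Hugging Face Hub are gated and require accepting the license first.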

Channel: The Machine Learning Engineer (149K subscribers)

165 views


