
Localmaxxing with LLMs

283 views
May 8, 2026
12:39

Run LLMs locally with llama.cpp – Beginner-Friendly Guide

In this guide, I show you exactly how to get started with llama.cpp for fast, free, and intelligent local inference using Qwen-3.6-35b. You'll learn how to install llama.cpp, download a quantized GGUF model, run a local server, and integrate it with tools like Cline.

Timestamps:
00:00 - Introduction
02:49 - API vs Local Comparison
04:27 - What is Localmaxxing?
07:24 - Installing llama.cpp
10:06 - Running the Server + Using in Cline

Follow me for more local AI tips: https://x.com/edgedistiller
Download models: https://huggingface.co
llama.cpp repo: https://github.com/ggerganov/llama.cpp
Localmaxxing benchmarks: https://localmaxing.com

#llama.cpp #LocalLLM #RunLLMLocally #Qwen #AI #GGUF #LocalAI #OllamaAlternative
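If you want to sanity-check the local server step before wiring it into Cline, here is a minimal Python sketch that queries llama.cpp's OpenAI-compatible chat completions endpoint. It assumes you have already started llama-server with a GGUF model of your choice on the default port 8080; the model path, port, and prompt below are placeholders, not values from the video.

    # Minimal sketch: query a local llama.cpp server from Python.
    # Assumes the server was started first, for example:
    #   llama-server -m ./your-model.gguf --port 8080
    import json
    import urllib.request

    # Local endpoint exposed by llama-server (adjust host/port if you changed the flags).
    URL = "http://127.0.0.1:8080/v1/chat/completions"

    payload = {
        "messages": [
            {"role": "user", "content": "Explain what a quantized GGUF model is in one sentence."}
        ],
        "temperature": 0.7,
    }

    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    # The response follows the OpenAI chat completions shape.
    with urllib.request.urlopen(request) as response:
        body = json.load(response)

    print(body["choices"][0]["message"]["content"])

If this prints a sensible answer, the same base URL can be pointed at from Cline or any other OpenAI-compatible client.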

