Stop re-explaining yourself to your local LLMs! In this video, we solve two of the most frustrating problems with local AI: giving your models a permanent memory and expanding their context windows.
If you're running local large language models with Ollama, you've probably hit two walls: every session starts from a blank slate, and the default context window fills up fast. This guide walks through the step-by-step fixes for both, so you can unlock the true potential of your AI rig.
You will learn how to:
✅ Give both your cloud assistant (gemini-cli) and your local one (opencode) a shared, persistent long-term memory using an MCP server (config sketch after this list).
✅ Expand your Ollama models' context windows to 16k and beyond (Modelfile sketch below).
✅ Configure opencode to work seamlessly with your local Ollama server (provider config below).
✅ Optimize your hardware with a dual-GPU setup for maximum performance (GPU settings below).
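
For the memory step, gemini-cli reads MCP servers from its settings file under an mcpServers key. This is a minimal sketch: the npx package name is a placeholder for the Memento server, so substitute the exact launch command shown in the video.

```bash
# Register an MCP memory server with gemini-cli.
# Writes ~/.gemini/settings.json -- merge by hand if the file already exists.
# "memento-mcp" is a placeholder package name; use the exact command
# demonstrated in the video.
mkdir -p ~/.gemini
cat > ~/.gemini/settings.json <<'EOF'
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "memento-mcp"]
    }
  }
}
EOF
```

opencode has an equivalent MCP section in its own config, so both tools can point at the same server and share one memory store.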
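The context-window expansion comes down to Ollama's num_ctx parameter. A minimal example, where llama3.1 stands in for whatever model you actually run:

```bash
# Build a 16k-context variant of a model by overriding num_ctx.
# Replace llama3.1 with the model you use locally.
cat > Modelfile <<'EOF'
FROM llama3.1
PARAMETER num_ctx 16384
EOF

ollama create llama3.1-16k -f Modelfile
ollama show llama3.1-16k   # confirm the new context length took effect
```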
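For the opencode hookup, Ollama exposes an OpenAI-compatible endpoint, and the shape below follows opencode's custom-provider config. Treat the field names as a sketch and check the current opencode docs if a key has moved between releases.

```bash
# Point opencode at the local Ollama server via its OpenAI-compatible API.
# Writes a global config -- use a project-local opencode.json if you prefer.
mkdir -p ~/.config/opencode
cat > ~/.config/opencode/opencode.json <<'EOF'
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": { "llama3.1-16k": { "name": "Llama 3.1 16k" } }
    }
  }
}
EOF
```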
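On the hardware side, Ollama can spread a single model across every GPU it can see. The environment variables below are current Ollama ones, but double-check them against the docs for your version:

```bash
# Expose both cards and let Ollama split one model across them,
# then watch per-GPU VRAM usage while a model is loaded.
export CUDA_VISIBLE_DEVICES=0,1
export OLLAMA_SCHED_SPREAD=1
ollama serve &            # restart the server so the new env vars apply
watch -n 1 nvidia-smi
```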
00:00 Intro
01:35 Installing opencode
03:19 Ollama models in opencode
05:28 Local LLM context window
08:28 Maxing out the RTX 3060
10:44 Setting up persistent memory
Thanks for watching! If you're building your own AI setup or have any questions, drop them in the comments below.
#ollama #llm #aidevelopment #mcpserver