
MCP Memory Server Tutorial: Shared Memory for Gemini CLI + OpenCode (Memento)

1.9K views
Nov 11, 2025
18:36

Stop re-explaining yourself to your local LLMs! In this video, we solve two of the most frustrating problems with local AI: giving your models a permanent memory and expanding their context windows. If you're running local large language models with Ollama, you've probably hit a wall. This guide provides the step-by-step solution to break through those limits and unlock the true potential of your AI rig.

You will learn how to:
✅ Give both cloud (gemini-cli) and local (opencode) tools a persistent, long-term memory using an MCP server.
✅ Expand your Ollama models' context windows to 16k and beyond.
✅ Configure opencode to work seamlessly with your local Ollama server.
✅ Optimize your hardware with a dual-GPU setup for maximum performance.

00:00 Intro
01:35 Installing opencode
03:19 Ollama models in opencode
05:28 Local LLM context window
08:28 Maxing out the RTX 3060
10:44 Setting up persistent memory

Thanks for watching! If you're building your own AI setup or have any questions, drop them in the comments below.

#ollama #llm #aidevelopment #mcpserver
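The "16k and beyond" context expansion described above can be sketched with an Ollama Modelfile: `num_ctx` is Ollama's standard parameter for the context window size, while the base model (`llama3.1:8b`) and the custom tag are placeholders you'd swap for whatever you run locally. This is a minimal sketch of the approach, not the exact configuration used in the video.

```
# Modelfile — extend the context window of a local model (sketch)
# Base model is a placeholder; use any model you have pulled with `ollama pull`.
FROM llama3.1:8b

# Raise the context window from the default to 16k tokens.
PARAMETER num_ctx 16384
```

Build and use it with `ollama create llama3.1-16k -f Modelfile`, then select the new `llama3.1-16k` tag from your client (e.g. opencode). Note that a larger context window increases VRAM usage, which is where the dual-GPU tuning in the video comes in.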

