AI Nondeterminism Explained: Temperature & Top-p
During one of my talks, someone asked me about temperature and whether AI can ever be fully deterministic. I thought it was such a great question that I decided to turn it into a video.

In this episode, we’ll break down both the theory and the practice behind these concepts. Through hands-on experiments, you’ll see how the temperature and top-p parameters affect a model’s responses. We’ll also dive deeper into what’s really happening at the model’s input and output layers, and discuss whether large language models can ever be truly deterministic (and if that’s even a good idea).

Step by step, we’ll explore key concepts such as:
- tokens & tokenizers
- semantic vectors & embedding matrices
- truncation and sampling
- how the softmax function really works

By the end, you’ll have a much clearer picture of how randomness and determinism in AI are controlled, and how you can tune these parameters in practice.

TOC:
0:00 - Introduction
*** Practice
0:31 - Gemini 2.5 Flash Lite Demo
2:28 - GPT-OSS and Interesting Elias Case
5:07 - Gemma - Favorite Animals Test
6:30 - Temperature = 1.0 vs 1.5 - Test Results
*** Theory
6:58 - Neural Networks Basics
7:58 - How LLMs Work
6:27 - Tokens and Tokenizers
10:01 - Embedding Vectors & Matrix
11:54 - Final Hidden State Vector and Decoder Matrix
13:09 - Logits
14:11 - Softmax
15:37 - Sampling Strategies & Top-P Parameter Explanation
17:33 - Temperature Parameter Explained
19:30 - LLM Input and Output Summarized

#LLM #GenerativeAI #Temperature #TopP #TopK #GreedySearch #Determinism #AIEducation