Why AI Confidently Makes Stuff Up
This is part 6 of a beginner series on how LLMs actually work. Five videos in, the picture is complete. Tokens are locations. Attention shapes copies of those locations based on context. Prediction samples from a distribution. Training settled the geometry with millions of small nudges. This video uses all of that to answer the question everyone hits eventually: why does the model confidently say things that aren't true?

Every prompt produces a distribution over next tokens. Every single one. The architecture has no "I don't know" mode and no way for the model to stay silent. It will always produce something. Ask it who uploaded the first YouTube video in 2005 and the distribution peaks tightly on Jawed Karim (correct: he uploaded "Me at the zoo"). Ask it who uploaded the first YouTube video in 1945, a question with no answer because YouTube didn't exist, and the distribution still peaks, with about the same confidence, on something invented. The smaller local model we tested produced "Walter Jones." Same machinery, same kind of output. The model does not look up answers. It never sees a question as a question, only a sequence of tokens, and it predicts the token that comes next.

Why? As we learned in the previous video, training optimized the model for one thing: reducing loss, not telling the truth. Loss is a measurement. It tells you how closely a prediction matches the training data. Nothing in that process refers to truth: not the calculations, not anything the model learns from them. The math has no term for whether a statement is true.

And here's the deeper point. Even if every word of training data were verified true, hallucinations would still happen, because of how the geometry settles. Training is a multi-dimensional tug of war: many ropes pulling on every token, each rope a different training example. Where the pulls agree, the position settles tight: a clean answer the model can recall accurately.
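One way to picture the tug of war is a toy calculation. This is only an analogy, not how a real model is trained: it assumes each training example pulls a token toward a target point, and that the token settles at the plain average of those pulls. The names `settle` and `make_pulls`, the two dimensions, and the noise levels are all made up for illustration.

```python
import math
import random


def settle(pulls):
    """Return (position, spread) for a list of pull targets.

    position: the average of all pulls (where the tug of war settles).
    spread:   root-mean-square distance from the pulls to that average,
              used here as a stand-in for how "fuzzy" the position is.
    """
    dims = len(pulls[0])
    position = [sum(p[d] for p in pulls) / len(pulls) for d in range(dims)]
    squared = sum(
        sum((p[d] - position[d]) ** 2 for d in range(dims)) for p in pulls
    )
    return position, math.sqrt(squared / len(pulls))


rng = random.Random(0)  # fixed seed so the toy run is repeatable


def make_pulls(center, noise, n=200):
    """Hypothetical training pulls: n targets scattered around a center."""
    return [[c + rng.gauss(0, noise) for c in center] for _ in range(n)]


# Pulls that mostly agree versus pulls that disagree a lot.
agreeing = make_pulls([1.0, 2.0], noise=0.05)
disagreeing = make_pulls([1.0, 2.0], noise=2.0)

tight_position, tight_spread = settle(agreeing)
fuzzy_position, fuzzy_spread = settle(disagreeing)

print("agreeing pulls:    spread =", round(tight_spread, 3))
print("disagreeing pulls: spread =", round(fuzzy_spread, 3))
```

Both calls hand back a single position. Nothing downstream of `settle` can tell whether that position came from pulls that agreed or pulls that fought each other, unless the spread is carried along separately, and in this sketch nothing uses it.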
Where the pulls disagree, the position settles fuzzy: only an approximate sense of where the token lives. The token's position is a noisy average of all those pulls, not an exact match for any one of them. Tight or fuzzy, the model uses it the same way, with the same confidence.

The part most people miss: when the model produces a token, it does one forward pass. Context in, next token out. One direction, with no step that checks the result. It then appends that token to the context and runs the next forward pass for the next token. And a token can't be true or false; it's just a token. There is no backward step where the model goes back to check what it generated. From the inside, every answer is just a prediction of what comes next.

There are techniques to reduce hallucination, mostly by grounding the model with real data. Retrieval gives it real text to attend to. Tools let it look things up. Or you can train it to be more willing to say "I don't know." But remember: these are mitigations, not fixes. The mechanism doesn't change. The model still doesn't know what it knows.

Hallucination isn't a bug being patched out. It's the algorithm doing exactly what it was built to do: predict the next token. Whether that token is true or invented lives outside the model.

Next: sampling. How the distribution actually becomes a token.

Chapters:
0:00 Why LLMs hallucinate
0:07 A quick recap
0:27 The Mata v. Avianca case
0:55 Every prompt produces a distribution
1:12 Real vs anachronistic: same machinery
2:27 Loss measured matching, not truth
2:48 Even with perfect training, it would still happen
3:00 Training is a multi-dimensional tug of war
3:32 No introspection: one forward pass
4:08 What actually helps
4:35 The algorithm doing what it was built to do
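For anyone who wants to see the one-forward-pass loop as code, here is a minimal sketch. It is not a real model: `toy_forward` is a hypothetical stand-in that turns the context into arbitrary scores, the five-word `VOCAB` is invented, and picking the highest-probability token is a simplification of sampling. What it does show is that softmax always returns a full distribution, even for the 1945 prompt, and that the loop only ever appends.

```python
import math

# Tiny made-up vocabulary, purely for illustration.
VOCAB = ["Jawed", "Karim", "Walter", "Jones", "."]


def softmax(logits):
    """Turn any list of scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_forward(context):
    """Hypothetical stand-in for a model: context in, one score per token out.

    The scores are arbitrary numbers derived from the context. A real model
    computes them with learned weights, but the shape of the output is the
    same: one score for every token in the vocabulary, every time.
    """
    seed = sum(ord(ch) for tok in context for ch in tok)
    return [float((seed * (i + 3)) % 7) - 3.0 for i in range(len(VOCAB))]


def generate(prompt_tokens, steps):
    """Run `steps` forward passes, appending one token after each."""
    context = list(prompt_tokens)
    for _ in range(steps):
        probs = softmax(toy_forward(context))  # always a distribution
        best = max(range(len(VOCAB)), key=lambda i: probs[i])
        context.append(VOCAB[best])  # appended, never revisited or checked
    return context


real_prompt = ["first", "YouTube", "video", "2005"]
fake_prompt = ["first", "YouTube", "video", "1945"]

probs_real = softmax(toy_forward(real_prompt))
probs_fake = softmax(toy_forward(fake_prompt))

print("2005 prompt:", generate(real_prompt, 2))
print("1945 prompt:", generate(fake_prompt, 2))
```

There is no branch in `generate` for "this question has no answer." The 1945 prompt goes through exactly the same code path as the 2005 prompt, which is the point of the video.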