18:08 · THIS is why large language models can understand the world · Algorithmic Simplicity · 398.7K views · 1 year ago
31:51 · MAMBA from Scratch Neural Nets Better and Faster than Transformers · Algorithmic Simplicity · 306.0K views · 2 years ago
20:18 · Why Does Diffusion Work Better than Auto-Regression · Algorithmic Simplicity · 697.3K views · 2 years ago