
Zhuoran Yang: How Do Transformers Learn to Implement Algorithms

130 views
Feb 27, 2026
1:04:59

American Statistical Association (ASA), Section on Statistical Learning and Data Science (SLDS) February Webinar: How Do Transformers Learn to Implement Algorithms

Date: February 26, 2026

Abstract: Systematic, compositional generalization beyond the training distribution remains a core challenge in machine learning and a critical bottleneck for the emergent reasoning abilities of modern language models. This work investigates out-of-distribution (OOD) generalization in Transformer networks, using a GSM8K-style task (modular arithmetic on computational graphs) as a testbed. We introduce and explore a set of four architectural mechanisms aimed at enhancing OOD generalization: (i) input-adaptive recurrence; (ii) algorithmic supervision; (iii) anchored latent representations via a discrete bottleneck; and (iv) an explicit error-correction mechanism. Collectively, these mechanisms yield an architectural approach for native and scalable latent space reasoning in Transformer networks with robust algorithmic generalization capabilities. We complement these empirical results with a detailed mechanistic interpretability analysis that reveals how these mechanisms give rise to robust OOD generalization abilities. This is joint work with Awni Altabaa, Siyu Chen, and John Lafferty.

Speaker: Zhuoran Yang has been an Assistant Professor of Statistics and Data Science at Yale University since July 2022. His research interests lie at the interface of machine learning, statistics, and optimization. He is particularly interested in the foundations of reinforcement learning, representation learning, and deep learning. Before joining Yale, Zhuoran worked as a postdoctoral researcher at the University of California, Berkeley, advised by Michael I. Jordan. Prior to that, he obtained his Ph.D. from the Department of Operations Research and Financial Engineering at Princeton University, co-advised by Jianqing Fan and Han Liu. He received his bachelor’s degree in Mathematics from Tsinghua University in 2015.

