A talk on thinking models that I gave to my MATS 8.0 training program.
Thinking models seem like a really big deal! Why are they such an improvement? What does this mean for interpretability? What is the right intuitive model for thinking about them? I try to lay out my intuitions here.
The early draft of the paper I discuss, by Constantin & Ivan: https://openreview.net/pdf?id=OwhVWNOBcz