Steering LLM Behavior Without Fine-Tuning

Name: Steering LLM Behavior Without Fine-Tuning
Uploaded: Dec 17, 2025
Duration: 1064 s

Hugging Face · 127K subscribers

133.7K views

Modify the behavior or the personality of a model at inference time, without fine-tuning or prompt engineering.

Read the blog post 👉 https://huggingface.co/spaces/dlouapre/eiffel-tower-llama
Explore SAEs on the Hub 👉 https://huggingface.co/collections/dlouapre/sparse-auto-encoders-saes-for-mechanistic-interpretability
Neuronpedia https://www.neuronpedia.org

00:00 Introduction
00:25 Steering as Neurostimulation
02:18 Transformer architecture
04:25 Linear representation of concepts
09:04 Steering using 🤗 transformers
13:43 Finding steering vectors
14:36 Using Sparse AutoEncoders
16:28 Conclusion
