
Steering LLM Behavior Without Fine-Tuning

133.7K views
Dec 17, 2025
17:44

Modify the behavior or the personality of a model at inference time, without fine-tuning or prompt engineering.

Read the blog post 👉 https://huggingface.co/spaces/dlouapre/eiffel-tower-llama
Explore SAEs on the Hub 👉 https://huggingface.co/collections/dlouapre/sparse-auto-encoders-saes-for-mechanistic-interpretability
Neuronpedia https://www.neuronpedia.org

00:00 Introduction
00:25 Steering as Neurostimulation
02:18 Transformer architecture
04:25 Linear representation of concepts
09:04 Steering using 🤗 transformers
13:43 Finding steering vectors
14:36 Using Sparse AutoEncoders
16:28 Conclusion

