Tutorial 6: Transformers and MH Attention (Part 2)

2.9K views
Oct 9, 2021
12:05

In this tutorial, we discuss one of the most impactful architectures of the last 2 years: the Transformer model. Since the paper "Attention Is All You Need" by Vaswani et al. was published in 2017, the Transformer architecture has continued to beat benchmarks in many domains, most importantly in Natural Language Processing. Transformers with an enormous number of parameters can generate long, convincing essays and have opened up new application fields of AI. As the hype around the Transformer architecture does not seem likely to end in the next few years, it is important to understand how it works and to have implemented it yourself, which we will do in this notebook.

This notebook is part of a lecture series on Deep Learning at the University of Amsterdam. The full list of tutorials can be found at https://uvadlc-notebooks.rtfd.io.

Link to the notebook: https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial6/Transformers_and_MHAttention.html

00:00 Transformer Encoder
04:35 Positional Encoding
08:40 Learning rate warm-up
11:39 PyTorch Lightning Module

