
Don't Stop Pretraining!

4.9K views
Jul 21, 2020
15:10

This video explains a study on the benefits of continued pretraining with RoBERTa. Even though RoBERTa is pretrained on 160GB of uncompressed text from a massive range of sources, the authors show further gains from continuing masked-language-model pretraining on the domain of the downstream task (domain-adaptive pretraining, using large collections of Amazon reviews, news articles, computer science papers, or biomedical research papers), and additional gains from pretraining on the unlabeled data of the task itself (task-adaptive pretraining). This is especially helpful when the task's unlabeled data is better curated than the broader domain corpus. Thanks for watching! Please Subscribe!

Paper Links:
Don't Stop Pretraining: https://arxiv.org/pdf/2004.10964.pdf
RoBERTa: https://arxiv.org/abs/1907.11692
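The continued pretraining in the paper is ordinary masked language modeling, just run on domain- or task-specific text. A minimal sketch of the BERT/RoBERTa-style masking step is below: 15% of tokens are selected, and of those, 80% become the mask token, 10% become a random token, and 10% stay unchanged. The token ids (MASK_ID, VOCAB_SIZE) and the function name are illustrative assumptions, not the paper's code.

```python
import random

MASK_ID = 50264     # RoBERTa's <mask> token id (assumed here for illustration)
VOCAB_SIZE = 50265  # RoBERTa vocabulary size

def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """BERT/RoBERTa-style masking for masked language modeling.

    Each token is selected with probability `mask_prob`; of the selected
    tokens, 80% become <mask>, 10% become a random token, and 10% are left
    unchanged. Labels are -100 (ignored by the loss) everywhere except at
    selected positions, where they hold the original token id to predict.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover the original token
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK_ID)               # 80%: replace with <mask>
            elif r < 0.9:
                inputs.append(rng.randrange(VOCAB_SIZE))  # 10%: random token
            else:
                inputs.append(tok)                   # 10%: keep unchanged
        else:
            labels.append(-100)  # unselected positions are ignored by the loss
            inputs.append(tok)
    return inputs, labels

# Example: mask a short sequence of token ids with a fixed seed.
ids = list(range(100, 120))
inp, lab = mask_tokens(ids, rng=random.Random(0))
```

In practice you would run this masking (e.g. via a library's MLM data collator) over the domain or task corpus and continue training RoBERTa with the same masked-LM objective before fine-tuning.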

