[DMQA Open Seminar] Beyond BERT

1.9K views
Feb 5, 2021
1:01:40

The public release of BERT in 2018 completely changed the landscape of NLP. Much of the language-model research since then has been directly influenced by BERT, and a number of methods have been proposed to train language models more effectively. This seminar reviews BERT and the successful language-model studies that followed it. A brief sketch of loading several of these pre-trained models follows the reference list.

Keywords: Transformer, BERT, SpanBERT, XLNet, RoBERTa, ALBERT, BART, ELECTRA, T5, GPT-3, DeBERTa

References
[1] A. Vaswani et al., "Attention is all you need," in Advances in Neural Information Processing Systems, 2017, pp. 6000–6010.
[2] J. Devlin et al., "BERT: Pre-training of deep bidirectional transformers for language understanding," in Proc. NAACL-HLT, Minneapolis, MN, USA, June 2–7, 2019, pp. 4171–4186.
[3] M. Joshi et al., "SpanBERT: Improving pre-training by representing and predicting spans," arXiv preprint arXiv:1907.10529, 2019.
[4] Z. Yang et al., "XLNet: Generalized autoregressive pretraining for language understanding," in Advances in Neural Information Processing Systems, 2019, pp. 5754–5764.
[5] Y. Liu et al., "RoBERTa: A robustly optimized BERT pretraining approach," arXiv preprint arXiv:1907.11692, 2019.
[6] Z. Lan et al., "ALBERT: A lite BERT for self-supervised learning of language representations," in Int. Conf. Learning Representations, Addis Ababa, Ethiopia, May 2020.
[7] M. Lewis et al., "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension," arXiv preprint arXiv:1910.13461, 2019.
[8] K. Clark et al., "ELECTRA: Pre-training text encoders as discriminators rather than generators," in Int. Conf. Learning Representations, Addis Ababa, Ethiopia, May 2020.
[9] P. He et al., "DeBERTa: Decoding-enhanced BERT with disentangled attention," arXiv preprint arXiv:2006.03654, 2020.
[10] L. Dong et al., "Unified language model pre-training for natural language understanding and generation," in Advances in Neural Information Processing Systems, 2019, pp. 13042–13054.
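The seminar discusses these models conceptually; as a minimal, illustrative sketch only (not part of the seminar material), the sample below loads a few of the keyword models through the Hugging Face transformers library and prints the shape of their encoder outputs. The library choice and the checkpoint names are assumptions, not something the seminar prescribes.

# Minimal sketch (assumption: Hugging Face transformers and PyTorch are installed).
# Loads a few of the pre-trained models named in the keywords and runs one
# sentence through each encoder.
from transformers import AutoTokenizer, AutoModel

checkpoints = [
    "bert-base-uncased",                  # BERT [2]
    "roberta-base",                       # RoBERTa [5]
    "albert-base-v2",                     # ALBERT [6]
    "google/electra-base-discriminator",  # ELECTRA [8]
    "microsoft/deberta-base",             # DeBERTa [9]
]

text = "Beyond BERT: a survey of pre-trained language models."
for name in checkpoints:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs)
    # last_hidden_state has shape (batch, sequence_length, hidden_size)
    print(name, tuple(outputs.last_hidden_state.shape))

Behind this common encoder interface, the papers listed above differ mainly in their pre-training objectives and architectural refinements, which is the focus of the seminar.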

