ImageBERT

Name: ImageBERT
Uploaded: Feb 7, 2020
Duration: 700 s

Connor Shorten (52.3K subscribers)

4.9K views

11:40

This video explores the ImageBERT model from Microsoft Research, which combines vision and language tokens to achieve state-of-the-art results on the MSCOCO and Flickr30k image and text retrieval tasks. I hope this video helps you get a better sense of how image and text tokens can be combined in the transformer architecture, and how self-attention uses visual tokens to inform the text output of BERT's masked language modeling task.

Paper Links:
ImageBERT: https://arxiv.org/pdf/2001.07966.pdf
Conceptual Captions: https://ai.googleblog.com/2018/09/conceptual-captions-new-dataset-and.html

Thanks for watching! Please subscribe!
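For readers who want a concrete picture of the idea in the description, here is a minimal sketch of how text tokens and image tokens can share one transformer input sequence, so that a masked text position attends to visual tokens. It is a toy illustration, not the ImageBERT implementation: the embeddings are random stand-ins, the query, key, and value projections are assumed to be the identity, and every name here (`self_attention`, `text_tokens`, `num_regions`) is hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product attention.

    Assumption: Q, K and V are the inputs themselves (identity projections);
    a real model learns separate projection matrices.
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * seq[j][i] for j in range(len(seq))) for i in range(d)]
        )
    return outputs, weights


rng = random.Random(0)
dim = 4
text_tokens = ["a", "dog", "[MASK]", "grass"]  # toy caption with one masked word
num_regions = 3                                # toy number of image regions

# Random vectors stand in for learned word embeddings and image region features.
text_emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in text_tokens]
image_emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_regions)]

# Text and image tokens are concatenated into a single sequence.
sequence = text_emb + image_emb
outputs, weights = self_attention(sequence)

# Attention paid by the masked text position to the image tokens.
mask_pos = text_tokens.index("[MASK]")
mask_weights = weights[mask_pos]
visual_share = sum(mask_weights[len(text_tokens):])
print(f"attention from [MASK] to image regions: {visual_share:.3f}")
```

Because both modalities sit in the same sequence, the output vector at the `[MASK]` position is a weighted mix of text and image vectors, which is how visual information can reach the masked language modeling prediction.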
