Attention Is All You Need Paper Explained
In this video, I'll present a comprehensive study of the renowned paper "Attention Is All You Need" by Ashish Vaswani and his coauthors. This paper is a major turning point in deep learning research. The Transformer architecture, introduced in this paper, is now used in a variety of state-of-the-art models in natural language processing and beyond.

Chapters:
0:00 Abstract
0:39 Introduction
2:44 Model Details
3:20 Encoder
3:30 Input Embedding
5:22 Positional Encoding
11:05 Self-Attention
15:38 Multi-Head Attention
17:31 Add and Layer Normalization
20:38 Feed Forward NN
23:40 Decoder
23:44 Decoder in Training and Testing Phase
27:31 Masked Multi-Head Attention
30:03 Encoder-Decoder Self-Attention
33:19 Results
35:37 Conclusion

Link to the paper:

Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin