Perceiver: General Perception with Iterative Attention (Google DeepMind Research Paper Explained)
Inspired by the fact that biological creatures attend to multiple modalities at the same time, DeepMind releases its new Perceiver model. Based on the Transformer architecture, the Perceiver makes no assumptions about the modality of the input data and also solves the long-standing quadratic bottleneck problem. This is achieved by having a latent low-dimensional Transformer, into which the input data is fed multiple times via cross-attention. The Perceiver's weights can also be shared across layers, making it very similar to an RNN. Perceivers achieve competitive performance on ImageNet and state-of-the-art results on other modalities, all while making no architectural adjustments to the input data. A minimal code sketch of the core idea follows the outline below.

OUTLINE:
0:00 Intro & Overview
2:20 Built-In Assumptions of Computer Vision Models
5:10 The Quadratic Bottleneck of Transformers
8:00 Cross-Attention in Transformers
10:45 The Perceiver Model Architecture & Learned Queries
20:05 Positional Encodings