DeepMind open-sources data-agnostic deep-learning model

DeepMind has open-sourced Perceiver IO, a general-purpose deep-learning model architecture that can handle many different types of inputs and outputs. Most deep-learning models are based on architectures designed for a particular type of data; for example, computer vision models typically use convolutional neural networks, while natural language processing models are based on a sequence-learning architecture.

The Perceiver IO architecture uses cross-attention to project high-dimensional input arrays into a lower-dimensional latent space. This latent space is then processed with a standard self-attention stack. Because the latent space has a much smaller dimension than the input, the module that processes it can be much deeper than would be practical for a module that processes large input arrays directly. Finally, the latent representation is converted to an output by applying a query array that has the same number of elements as the desired output data.
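The three stages described above (encode, process, decode) can be illustrated with a minimal sketch. This is not DeepMind's released code: it assumes single-head attention with no learned projection weights, layer normalization, or feed-forward blocks, it uses plain Python lists to stay dependency-free, and names such as `attend` and `perceiver_io_sketch` are hypothetical. Its only purpose is to show how the array sizes flow through the model.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over one list of attention scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys_values):
    """Single-head scaled dot-product attention without learned weights.

    `keys_values` serves as both keys and values (a simplification).
    The result has one row per query, so the queries set the output length.
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(dim)])
    return out


def random_array(rows, dim, rng):
    # Stand-in for real data or learned parameters (assumption).
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rows)]


def perceiver_io_sketch(inputs, latents, output_queries, num_layers=4):
    # Encode: the small latent array queries the large input array.
    z = attend(latents, inputs)
    # Process: self-attention over the latents only; the cost of each
    # layer depends on the latent size, not on the input size.
    for _ in range(num_layers):
        z = attend(z, z)
    # Decode: one query row per desired output element.
    return attend(output_queries, z)


rng = random.Random(0)
DIM = 8
inputs = random_array(1000, DIM, rng)        # large input array
latents = random_array(16, DIM, rng)         # much smaller latent array
output_queries = random_array(5, DIM, rng)   # five output elements wanted

outputs = perceiver_io_sketch(inputs, latents, output_queries)
print(len(outputs), len(outputs[0]))  # prints: 5 8
```

In this sketch only the first step touches all 1,000 input rows. Every later layer works on the 16 latent rows, which is why the processing stack can be made deep, and the number of output rows is set entirely by the number of query rows.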

Editorial note: Implementing this model on a PANN network would make it even more practical.
