Replacing RNN with Self-Attention Mechanism #21

Open
rabaur opened this issue Nov 17, 2020 · 2 comments

Comments

@rabaur

rabaur commented Nov 17, 2020

Dear David Ha, dear Jürgen Schmidhuber,

Thank you for this inspirational blog post. I stumbled upon your paper while researching for my BSc thesis, which is concerned with training agents to navigate complex buildings. As you know, navigation is a complex task in which memory is of great importance.

Given the complexity of the task and the promising results of self-attention, I was wondering whether you have considered replacing the RNN with a self-attention mechanism. I reckon this would make the memory model more powerful while being computationally less expensive.
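To make the idea concrete, here is a rough, hypothetical sketch of what I have in mind (numpy only; the 32-dim z and 3-dim action sizes are borrowed loosely from the CarRacing setup in the post, everything else is illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(seq, Wq, Wk, Wv):
    """Single-head causal self-attention over a history of [z_t, a_t] vectors.

    seq: (T, d_in) concatenated latent codes and actions.
    Returns (T, d_model) context vectors; the last row would play the role
    the RNN hidden state h_t plays in the M model.
    """
    Q, K, V = seq @ Wq, seq @ Wk, seq @ Wv           # (T, d_model) each
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (T, T) pairwise scores
    mask = np.triu(np.full_like(scores, -1e9), k=1)  # block attention to future steps
    return softmax(scores + mask) @ V                # (T, d_model)

# Toy usage with World-Models-like sizes: 32-dim z, 3-dim action, T = 16 steps.
rng = np.random.default_rng(0)
T, d_in, d_model = 16, 32 + 3, 64
Wq, Wk, Wv = (rng.normal(0.0, 0.1, (d_in, d_model)) for _ in range(3))
history = rng.normal(size=(T, d_in))                 # stand-in for the [z_t, a_t] history
context = causal_self_attention(history, Wq, Wk, Wv)
h_t = context[-1]                                    # fed to the controller in place of the RNN state
```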

Thank you for your consideration,
Raphaël Baur, BSc Student ETH Zürich

@hardmaru
Contributor

Hi Raphaël,

In later work, I've generally kept the RNN, but replaced the latent space bottleneck with other types of bottlenecks related to self-attention.

For example:

  1. Inattentional Blindness bottleneck: https://attentionagent.github.io/

  2. Screen shuffling bottleneck: https://attentionneuron.github.io/
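For intuition, here is a minimal numpy sketch of the kind of patch-attention bottleneck the first link describes: image patches attend to each other, and only the most-attended patch locations are passed on to the controller. The patch size, top-k value, and weights below are made up for illustration and are not the actual AttentionAgent configuration.

```python
import numpy as np

def top_k_patches(patches, Wq, Wk, k=10):
    """Score image patches with self-attention and keep only the k most attended.

    patches: (N, d) flattened image patches.
    Returns indices of the k patches whose locations would be passed on;
    every other patch is simply dropped -- the 'inattentional blindness' bottleneck.
    """
    Q, K = patches @ Wq, patches @ Wk
    scores = Q @ K.T / np.sqrt(Q.shape[-1])           # (N, N) patch-to-patch scores
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    importance = attn.sum(axis=0)                     # how strongly each patch is attended to
    return np.argsort(importance)[::-1][:k]

# Toy usage: 144 patches of 7x7x3 pixels (sizes are illustrative only).
rng = np.random.default_rng(0)
N, d, d_model = 144, 7 * 7 * 3, 16
Wq, Wk = rng.normal(0.0, 0.1, (d, d_model)), rng.normal(0.0, 0.1, (d, d_model))
patches = rng.normal(size=(N, d))
keep = top_k_patches(patches, Wq, Wk, k=10)           # only these locations reach the controller
```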

Cheers.

@rabaur
Author

rabaur commented Sep 14, 2021

This is very insightful, thank you so much for your answer!

rabaur closed this as completed Sep 14, 2021
worldmodels reopened this Sep 16, 2021