🤖 Topic: Transformer Networks and Monty
🎯 Aims:
- Identify opportunities for cross-pollination between transformer networks and Monty.
- Improve transformer networks while developing Monty.
- Enhance communication between Monty and Mega teams.
🔍 Focus: The current implementation of Monty rather than the broader Thousand Brains Theory, to keep the discussion concrete.
📊 Similarities:
- Modular connectivity and a common representational format across processing units.
- Voting operations in Monty resemble self-attention.
- Reference frames in Monty parallel positional encodings in transformers (see the sketch after this list).
- Embodiment, central to Monty, is becoming a research topic for transformers as well.
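For a concrete point of comparison, below is the standard sinusoidal positional encoding from the original transformer paper. It is textbook NumPy code, not anything specific to Monty; it only illustrates how transformers attach a notion of "where" to token content, loosely analogous to Monty's reference frames.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding (Vaswani et al., 2017).

    Each position is mapped to a fixed vector of sines and cosines at
    geometrically spaced frequencies, which is added to the token's
    content embedding.
    """
    positions = np.arange(seq_len)[:, None]                    # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                         # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                           # (seq_len, d_model)
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])                # even dims: sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])                # odd dims: cosine
    return encoding

# A 10-token sequence with 16-dimensional embeddings.
pe = sinusoidal_positional_encoding(10, 16)
print(pe.shape)  # (10, 16)
```

Note the contrast: positional encodings index positions along a one-dimensional sequence, whereas Monty's reference frames are object-centric and three-dimensional.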
⚠️ Differences:
- Monty builds explicit object models; transformers represent objects implicitly, distributed across their weights.
- The learning processes differ significantly: Monty learns continually from sensorimotor experience, while transformers are trained offline with backpropagation on large datasets.
- Transformer tokens lack self-recurrence: a token does not repeatedly update its own state over time, whereas Monty's learning modules continually refine their hypotheses.
📚 Background:
- Monty uses learning modules that process sensory input and update evidence for object and pose hypotheses.
- Transformers use self-attention to update all token representations in parallel (minimal sketch below).
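To make the self-attention bullet concrete, here is a minimal single-head scaled dot-product self-attention in NumPy. It follows the standard formulation (queries, keys, values, softmax-weighted mixing) and is not tied to any particular transformer implementation.

```python
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray, w_k: np.ndarray,
                   w_v: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention.

    x:             (n_tokens, d_model) token representations
    w_q, w_k, w_v: (d_model, d_head) projection matrices

    Every token attends to every other token in parallel; each output is a
    weighted mix of value vectors, with weights given by query-key similarity.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])                  # (n_tokens, n_tokens)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # row-wise softmax
    return weights @ v                                        # (n_tokens, d_head)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                                   # 5 tokens, d_model = 8
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)                 # (5, 4)
```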
🔗 Voting vs. Self-Attention:
- Voting in Monty combines object and pose hypotheses across learning modules (toy illustration after this list).
- Self-attention in transformers uses queries, keys, and values to weight how much each token draws on every other token.
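The contrast can be made concrete with a toy voting step. The object names, array layout, and combination rule below are hypothetical and far simpler than Monty's actual evidence updates; the sketch only shows the idea of modules pooling evidence over a shared set of object hypotheses.

```python
import numpy as np

# Hypothetical illustration of Monty-style voting: each learning module (LM)
# holds evidence over a shared set of object hypotheses, and voting combines
# those scores across modules. Simplified sketch, not Monty's actual code.
object_ids = ["mug", "bowl", "spoon"]

# Evidence per learning module (rows) over object hypotheses (columns).
evidence = np.array([
    [2.0, 0.5, 0.1],   # LM 0: strongly favors "mug"
    [1.5, 1.2, 0.2],   # LM 1: leans "mug", some support for "bowl"
    [0.3, 2.1, 0.4],   # LM 2: favors "bowl"
])

# Each module broadcasts its normalized evidence; a simple vote sums the
# incoming distributions and renormalizes.
votes = evidence / evidence.sum(axis=1, keepdims=True)
combined = votes.sum(axis=0)
combined /= combined.sum()

for obj, score in zip(object_ids, combined):
    print(f"{obj}: {score:.2f}")
```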
🧩 Embodiment:
- Transformers are beginning to explore embodiment through models like Gato and PaLM-E.
- These models generate actions based on sensory input and language prompts.
🔄 Future Directions:
- Integrate self-updating mechanisms within transformer tokens.
- Explore voting mechanisms as a special case of self-attention (speculative sketch after this list).
- Consider top-down connections and explicit spatial models.
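One way to read the "voting as a special case of self-attention" direction: if each learning module's evidence vector is treated as a token and the projection matrices are fixed to the identity, attention reduces to an agreement-weighted vote. The sketch below is speculative and assumption-laden, not a proposal from the Monty codebase.

```python
import numpy as np

def vote_as_attention(module_states: np.ndarray) -> np.ndarray:
    """module_states: (n_modules, n_hypotheses) evidence per learning module.

    With W_q = W_k = W_v = I, attention weights depend only on how much two
    modules' hypotheses agree, so the update is a similarity-weighted vote.
    """
    scores = module_states @ module_states.T                 # agreement between modules
    scores /= np.sqrt(module_states.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over modules
    return weights @ module_states                            # updated evidence per module

states = np.array([
    [2.0, 0.5, 0.1],
    [1.5, 1.2, 0.2],
    [0.3, 2.1, 0.4],
])
print(vote_as_attention(states))
```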
💡 Conclusion:
- Understanding the similarities and differences between Monty and transformers can inform improvements to both systems.
- Further research is needed to explore the integration of these concepts.