Everything You Need to Know about Transformers: Architectures, Optimization, Applications, and Interpretation

Time and Location

Instructors: Andy Zeng, Boqing Gong, Chen Sun, Ellie Pavlick, and Neil Houlsby


  • Tuesday, February 7, 2023, 14:00 - 18:00 (Eastern Time)


The tutorial aims to share exciting recent developments in unified neural architectures that process different input modalities and learn to solve diverse tasks, viewed through the lens of Transformer architectures. The goal is to equip attendees with “everything they need to know about Transformers”. The tutorial covers the basic architecture and its recent variants (Neil Houlsby), effective optimization algorithms (Boqing Gong), representative and emerging applications in multimodal learning and robotics (Chen Sun, Andy Zeng), and the tools to probe and analyze what knowledge a trained network has captured (Ellie Pavlick).

We believe the principles underlying the success of Transformers are general, so the tutorial should benefit a wide range of AI researchers and practitioners. Finally, the tutorial will discuss the limitations of existing Transformer-based approaches and highlight some future research directions.

We expect participants to have general knowledge of machine learning and deep learning, including commonly used neural architectures and learning methods. Hands-on experience with computer vision, language understanding, or robotics research or applications is helpful but not required.


Lecture slides

The Transformer Architecture and its Variants
Neil Houlsby
How to Train Your (Vision) Transformer?
Boqing Gong
Does Multimodal Pretraining Learn Useful Representation for Reasoning?
Chen Sun
Language as Robot Middleware
Andy Zeng
Probing Knowledge and Structure in Transformers
Ellie Pavlick