6.7960 | Fall 2024 | Undergraduate, Graduate

Deep Learning

Lec 08. Architectures: Transformers

This video introduces transformers, focusing on three key ideas: tokens, attention, and positional codes. It also explores how transformers relate to MLPs, GNNs, and CNNs as variations on common principles.
