MIT EECS 6.7960 Deep Learning
Fall 2025
Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high dimensions, and applications to computer vision, natural language processing, and robotics.
Prerequisites: 18.05 and (6.3720, 6.3900, or 6.C01)
Note: This course is appropriate for advanced undergraduates and graduate students and is 3-0-9 units. Due to heavy enrollment, we unfortunately will not be able to accept cross-registrations this semester.
Any and all personal or logistical questions, such as those regarding absences, accommodations, etc., should be emailed to the course email, 6.7960-instructors-fl2025@mit.edu, and not to the instructors directly.
** class schedule is subject to change **
Date | Topics | Speaker | Course Materials | Assignments
Week 1
Thu 9/4 | Course overview, introduction to deep neural networks and their basic building blocks | Sara Beery | slides; optional readings: notation for this course, neural networks |
Week 2
Mon 9/8 | PyTorch Tutorial: 4-5 PM, 32-123 | | pytorch tutorial colab |
Tue 9/9 | How to train a neural net: SGD, backprop and autodiff, differentiable programming (see the training-loop sketch below the schedule) | Sara Beery | slides; required readings: gradient-based learning, backprop | pset 1 out
Tue 9/9 | PyTorch Tutorial: 3-4 PM, 2-190 | | |
Wed 9/10 | PyTorch Tutorial: 7-8 PM, 32-123 | | |
Thu 9/11 | Approximation theory: how well can a given function be approximated by a DNN? We will explore several facets of this question, from universal approximation to Barron's theorem, and ask whether increasing depth provably helps expressivity. | Omar Khattab | slides |
Fri 9/12 | PyTorch Tutorial: 11 AM-12 PM, 32-123 | | |
Week 3
Tue 9/16 | Architectures: Grids. This lecture focuses mostly on convolutional neural networks, presenting them as a good choice when your data lies on a grid. | Sara Beery | slides; required reading: CNNs |
Thu 9/18 | Architectures: Memory and Sequence Modeling. RNNs, LSTMs, memory, sequence models. | Kaiming He | slides |
Week 4
Tue 9/23 | Architectures: Transformers. Three key ideas: tokens, attention, positional codes. Transformers, MLPs, GNNs, and CNNs are all variations on the same themes (see the attention sketch below the schedule). | Sara Beery | slides; reading: Transformers (this reading focuses on examples from vision, but the same architecture applies to any kind of data) | pset 1 due; pset 2 out
Thu 9/25 | Generalization Theory | Omar Khattab | slides; optional readings: Understanding deep learning requires rethinking generalization; Deep Learning is Not So Mysterious or Different; Data Science at the Singularity |
Week 5
Tue 9/30 | Representation Learning: Reconstruction-based. Intro to representation learning, unsupervised and self-supervised learning, clustering, dimension reduction, autoencoders, and modern self-supervised learning with reconstruction losses (see the autoencoder sketch below the schedule). | Kaiming He | slides |
Thu 10/2 | Representation Learning: Similarity-based (aka Neural Information Retrieval). Information retrieval, contrastive learning (InfoNCE; hard negatives; KL distillation; self-supervised vs. supervised), and sub-linear search and scaling tradeoffs (cross-encoders; bi-encoders; late interaction); see the InfoNCE sketch below the schedule. | Omar Khattab | slides; optional readings: Contrastive Representation Learning; Contextualized Late Interaction over BERT; In Defense of Dual-Encoders |
Week 6
Tue 10/7 | Representation learning: theory | Kaiming He | | pset 2 due; pset 3 out
Thu 10/9 | Foundation models: pre-training and scaling laws | Omar Khattab | |
Week 7
Tue 10/14 | Foundation models: post-training and RL | Omar Khattab | |
Thu 10/16 | Generative models: basics. Density and energy models, samplers, GANs, autoregressive models, diffusion models. | Kaiming He | | pset 3 due; pset 4 out
Week 8
Tue 10/21 | Midterm: 7:30-9:30 PM | | |
Thu 10/23 | Generative models: representation learning meets generative modeling. VAEs, latent variables. | Kaiming He | |
Week 9
Tue 10/28 | Hacker's guide to DL. Practical tips mixed with opinionated anecdotes about how to get deep nets to actually do what you want. | Omar Khattab | |
Thu 10/30 | Generative models: Diffusion and Flows | Kaiming He | |
Week 10
Tue 11/4 | Generalization (OOD). Exploring model generalization out of distribution, with a focus on adversarial robustness and distribution shift. | Sara Beery | | pset 4 due; pset 5 out
Thu 11/6 | Transfer learning: Models and Data. Finetuning, linear probes, knowledge distillation, generative models as data, domain adaptation, prompting. | Sara Beery | |
Week 11
Tue 11/11 | No class: Veterans Day | | |
Thu 11/13 | Guest Lecture 1 | | | pset 5 due
Week 12
Tue 11/18 | Guest Lecture 2 | | |
Thu 11/20 | Evaluation. Designing benchmarks, selecting metrics, and using human input at inference. | Sara Beery | |
Week 13
Tue 11/25 | Applications | Kaiming He | |
Thu 11/27 | No class: Thanksgiving Day | | |
Week 14
Tue 12/2 | Inference methods for deep learning and systems. Everything beyond a simple forward pass: beam search, chain-of-thought, in-context learning, test-time training; also methods that use search to improve learning (see the beam-search sketch below the schedule). | Omar Khattab | |
Thu 12/4 | Project office hours | | |
Week 15
Tue 12/9 | Final project deadline | | |
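The sketches below are unofficial illustrations of a few lecture topics from the schedule, not course-provided code. First, a minimal PyTorch sketch of the training loop pattern from the 9/9 lecture (SGD, backprop and autodiff); the toy data, network sizes, and hyperparameters are assumptions made for illustration.

```python
# Minimal sketch of the SGD + backprop training loop (9/9 lecture).
# The toy regression data and all hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)           # toy inputs
y = X.sum(dim=1, keepdim=True)     # toy regression targets

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for step in range(100):
    opt.zero_grad()                # clear gradients from the previous step
    loss = loss_fn(model(X), y)    # forward pass
    loss.backward()                # backprop: autodiff populates p.grad
    opt.step()                     # SGD update: p <- p - lr * p.grad
```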
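Next, a sketch of scaled dot-product self-attention, the core operation named in the 9/23 transformers lecture. The single-head simplification, the shapes, and the random projection weights are assumptions for illustration.

```python
# Sketch of single-head scaled dot-product self-attention (9/23 lecture).
import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    """x: (tokens, d_model); Wq/Wk/Wv: (d_model, d_head) projections."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / K.shape[-1] ** 0.5   # similarity between every pair of tokens
    weights = F.softmax(scores, dim=-1)     # each token's attention distribution
    return weights @ V                      # weighted mixture of value vectors

d_model, d_head, n_tokens = 16, 8, 5
x = torch.randn(n_tokens, d_model)
Wq, Wk, Wv = (torch.randn(d_model, d_head) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)        # (5, 8): one mixed vector per token
```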
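For the 9/30 lecture on reconstruction-based representation learning, a sketch of an autoencoder trained to reproduce its input through a low-dimensional bottleneck. The architecture and stand-in data are assumptions, not the lecture's models.

```python
# Sketch of a reconstruction-based representation learner (9/30 lecture):
# an autoencoder compresses the input to 8 dims and reconstructs it.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 8))
decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 784))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(128, 784)                   # stand-in for flattened images
for step in range(50):
    opt.zero_grad()
    x_hat = decoder(encoder(x))            # compress, then reconstruct
    loss = ((x_hat - x) ** 2).mean()       # reconstruction loss
    loss.backward()
    opt.step()
# encoder(x) then serves as the learned representation for downstream tasks.
```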
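For the 10/2 lecture, a sketch of the InfoNCE contrastive loss: within a batch of paired embeddings, each row's positive is the matching row and every other row acts as a negative. The temperature value and embedding sizes are illustrative assumptions.

```python
# Sketch of the InfoNCE contrastive loss (10/2 lecture).
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two views of the same items."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature        # (batch, batch) similarity matrix
    targets = torch.arange(z1.shape[0])     # positives sit on the diagonal
    return F.cross_entropy(logits, targets) # in-batch negatives come for free

z1, z2 = torch.randn(32, 64), torch.randn(32, 64)
loss = info_nce(z1, z2)
```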
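Finally, a sketch of beam search, one of the inference-time methods listed for the 12/2 lecture. The `next_log_probs` function is a hypothetical stand-in for a trained model's next-token head; the beam width, step count, and vocabulary size are arbitrary.

```python
# Sketch of beam search over next-token log-probabilities (12/2 lecture).
import torch

def next_log_probs(prefix, vocab_size=5):
    torch.manual_seed(sum(prefix))          # deterministic dummy "model"
    return torch.log_softmax(torch.randn(vocab_size), dim=-1)

def beam_search(steps=4, beam_width=3):
    beams = [([0], 0.0)]                    # (token sequence, total log-prob)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            lp = next_log_probs(seq)
            for tok, tok_lp in enumerate(lp.tolist()):
                candidates.append((seq + [tok], score + tok_lp))
        # keep only the beam_width highest-scoring extensions
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

print(beam_search())
```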