MIT EECS 6.7960 Deep Learning
Fall 2025
Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high dimensions, and applications to computer vision, natural language processing, and robotics.
Prerequisites: 18.05 and (6.3720, 6.3900, or 6.C01)
Note: This course is appropriate for advanced undergraduates and graduate students, and is 3-0-9 units. Due to heavy enrollment, we unfortunately will not be able to take cross-registrations this semester.
Any personal or logistical questions, such as those regarding absences, accommodations, etc., should be emailed to the course email, 6.7960-instructors-fl2025@mit.edu, and not to the instructors directly.
**Class schedule is subject to change.**
| Date | Topics | Speaker | Course Materials | Assignments |
|---|---|---|---|---|
| Week 1 | | | | |
| Thu 9/4 | Course overview, introduction to deep neural networks and their basic building blocks | Sara Beery | slides; optional readings: notation for this course, neural networks | |
| Week 2 | | | | |
| Mon 9/8 | PyTorch Tutorial: 4-5 PM, 32-123 | | pytorch tutorial colab | |
| Tue 9/9 | How to train a neural net: SGD, backprop and autodiff, differentiable programming (see the minimal training-loop sketch below the schedule) | Sara Beery | slides; required readings: gradient-based learning, backprop | pset 1 out (solutions) |
| Tue 9/9 | PyTorch Tutorial: 3-4 PM, 2-190 | | | |
| Wed 9/10 | PyTorch Tutorial: 7-8 PM, 32-123 | | | |
| Thu 9/11 | Approximation theory: how well can you approximate a given function by a DNN? We will explore various facets of this question, from universal approximation to Barron's theorem, and whether increasing depth provably helps expressivity. | Omar Khattab | slides | |
| Fri 9/12 | PyTorch Tutorial: 11 AM-12 PM, 32-123 | | | |
| Week 3 | | | | |
| Tue 9/16 | Architectures: Grids. This lecture focuses mostly on convolutional neural networks, presenting them as a good choice when your data lies on a grid. | Sara Beery | slides; required reading: CNNs | |
| Thu 9/18 | Architectures: Memory and Sequence Modeling. RNNs, LSTMs, memory, sequence models. | Kaiming He | slides | |
| Week 4 | | | | |
| Tue 9/23 | Architectures: Transformers. Three key ideas: tokens, attention, positional codes. Relationship between transformers and MLPs, GNNs, and CNNs: they are all variations on the same themes! (see the attention sketch below the schedule) | Sara Beery | slides; reading: Transformers (this reading focuses on examples from vision, but the same architecture applies to any kind of data) | pset 1 due; pset 2 out (solutions) |
| Thu 9/25 | Generalization Theory | Omar Khattab | slides; optional readings: Deep learning requires rethinking generalization, Deep learning not so mysterious or different, Data science at the singularity | |
| Week 5 | | | | |
| Tue 9/30 | Representation Learning: Reconstruction-based. Intro to representation learning, unsupervised and self-supervised learning, clustering, dimension reduction, autoencoders, and modern self-supervised learning with reconstruction losses. | Kaiming He | slides | |
| Thu 10/2 | Representation Learning: Similarity-based (aka Neural Information Retrieval). Information retrieval, contrastive learning (InfoNCE; hard negatives; KL distillation; self-supervised vs. supervised), sub-linear search and scaling tradeoffs (cross-encoders; bi-encoders; late interaction). | Omar Khattab | slides; optional readings: Contrastive Representation Learning, Contextualized Late Interaction over BERT, In Defense of Dual-Encoders | |
| Week 6 | | | | |
| Tue 10/7 | Representation Learning and Information Theory | Kaiming He | slides | pset 2 due; pset 3 out |
| Thu 10/9 | Foundation models: pre-training | Omar Khattab | slides; optional readings: Language Models are Few-Shot Learners, SmolLM3, OLMo 2, Marin 8B | |
| Week 7 | | | | |
| Tue 10/14 | Foundation models: scaling laws | Omar Khattab | slides; optional readings: Scaling Laws for LLMs, Training Compute-Optimal LLMs, Emergent Abilities of LLMs, Are Emergent Abilities of LLMs a Mirage? | |
| Thu 10/16 | Generative models: basics | Kaiming He | slides | pset 3 due; pset 4 out |
| Week 8 | | | | |
| Tue 10/21 | Midterm: 7:30-9:30 PM | | | |
| Thu 10/23 | Generative models: VAE and GAN | Kaiming He | slides | |
| Week 9 | | | | |
| Tue 10/28 | Foundation models: post-training | Omar Khattab | slides | |
| Thu 10/30 | Generative models: Diffusion and Flows | Kaiming He | slides | |
| Week 10 | | | | |
| Tue 11/4 | Generalization (OOD). Exploring model generalization out of distribution, with a focus on adversarial robustness and distribution shift. | Sara Beery | slides | pset 4 due; pset 5 out |
| Thu 11/6 | Transfer learning: Models and Data. Finetuning, linear probes, knowledge distillation, generative models as data, domain adaptation, prompting. | Sara Beery | slides | |
| Week 11 | | | | |
| Tue 11/11 | No class: Veterans Day | | | |
| Thu 11/13 | Inference-time Algorithms | Omar Khattab | slides | pset 5 due |
| Week 12 | | | | |
| Tue 11/18 | Guest Lecture: Deep Learning that Improves Real-World Interactions | Rose E. Wang (OpenAI) | | |
| Thu 11/20 | Evaluation. Designing benchmarks, selecting metrics, and using human input at inference. | Sara Beery | slides | |
| Week 13 | | | | |
| Tue 11/25 | Applying Deep Learning to Your Problems | Kaiming He | slides | |
| Thu 11/27 | No class: Thanksgiving Day | | | |
| Week 14 | | | | |
| Tue 12/2 | Guest Lecture 2 | Zongyi Li (MIT/NYU) | | |
| Thu 12/4 | Project office hours | | | |
| Week 15 | | | | |
| Tue 12/9 | Guest Lecture 3 | Jiajun Wu (Stanford) | | project due |
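
For reference alongside the 9/9 lecture on training a neural net, here is a minimal sketch of the SGD + backprop loop in PyTorch. It assumes a toy regression task; the model, data, and hyperparameters are illustrative choices, not course-provided code.

```python
# Minimal SGD training loop: forward pass, backprop via autodiff, parameter
# update. The toy model and data below are illustrative assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Toy regression data: random inputs with noisy linear targets.
x = torch.randn(256, 8)
y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)

for step in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # backprop: autodiff fills each p.grad
    optimizer.step()             # SGD update: p <- p - lr * p.grad
```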
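
And for the 9/23 transformers lecture, a minimal sketch of its three key ideas (tokens, attention, positional codes) as single-head scaled dot-product attention. The sequence length, dimensions, and layer names are illustrative assumptions.

```python
# Single-head scaled dot-product attention over a short token sequence,
# with a learned positional code. All shapes and names are illustrative.
import math
import torch
import torch.nn as nn

n, d = 10, 64                          # sequence length, embedding dimension
tokens = torch.randn(n, d)             # token embeddings
pos_emb = nn.Embedding(n, d)           # learned positional codes
x = tokens + pos_emb(torch.arange(n))  # add position info to each token

Wq = nn.Linear(d, d, bias=False)       # query projection
Wk = nn.Linear(d, d, bias=False)       # key projection
Wv = nn.Linear(d, d, bias=False)       # value projection
q, k, v = Wq(x), Wk(x), Wv(x)

attn = torch.softmax(q @ k.T / math.sqrt(d), dim=-1)  # (n, n) attention weights
out = attn @ v                         # each output token mixes all values
```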