MIT EECS

6.7960 Deep Learning

Fall 2025

[ Schedule | Policies | Piazza | Canvas | Gradescope | Lecture Recordings | Previous years ]

Course Overview

Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high-dimensions, and applications to computer vision, natural language processing, and robotics.

Pre-requisites: 18.05 and (6.3720, 6.3900, or 6.C01)

Note: This course is appropriate for advanced undergraduates and graduate students, and is 3-0-9 units. Due to heavy enrollment, we unfortunately will not be able to accept cross-registrations this semester.

Any and all personal or logistical questions, such as those regarding absences, accommodations, etc., should be emailed to the course email address, 6.7960-instructors-fl2025@mit.edu, and not to the instructors directly.




Course Information

Instructor Sara Beery

beery at mit dot edu

OH: Thu 9-10 AM 45-741H

Instructor Kaiming He

kaiming at mit dot edu

OH: Mon 9-10 AM 45-701H

Instructor Omar Khattab

okhattab at mit dot edu

OH: Tue 9-10 AM 32-G818

Course Assistant Taylor Braun

tvbraun at mit dot edu

Head TA Victor Butoi

vbutoi at mit dot edu

OH: Wed 1-2 PM 36-155

Head TA Ishan Ganguly

iganguly at mit dot edu

OH: Tue 3-4 PM 36-112

TA Mahmoud Abdelmoneum

mabdel03 at mit dot edu

OH: Tue 4-5 PM 36-112

TA Abhay Bestrapalli

abhayb at mit dot edu

OH: Mon 3-4 PM 26-168

TA Riddhi Bhagwat

riddhib at mit dot edu

OH: Fri 10-11 AM 24-317

TA Russ Chua

russchua at mit dot edu

OH: Tue 10-11 AM 36-156

TA Kelly Cui

kellycui at mit dot edu

OH: Tue 3-4 PM 36-112

TA Ali Cy

califyn at mit dot edu

OH: Mon 5-6 PM 26-168

TA Gerardo Flores

gfm at mit dot edu

OH: Wed 1-2 PM 36-155

TA Orion Foo

ofoo at mit dot edu

OH: Mon 3-4 PM 26-168

TA Ishita Goluguri

ishi at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Egor Lifar

l1far at mit dot edu

OH: Mon 5-6 PM 26-168

TA Maggie Lin

maggiejl at mit dot edu

OH: Tue 3-4 PM 36-112

TA Edgar Morfin

emorfin at mit dot edu

OH: Wed 12-1 PM 24-319

TA Shreya Ravikumar

shreyark at mit dot edu

OH: Mon 4-5 PM 26-168

TA Tara Sarma

tssarma at mit dot edu

OH: Mon 4-5 PM 26-168

TA Gracie Sheng

grac at mit dot edu

OH: Fri 10-11 AM 24-317

TA Ashkan Soleymani

ashkanso at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Vinith Suriyakumar

vinithms at mit dot edu

OH: Wed 1-2 PM 36-155

TA Vanessa Xiao

vzxiao at mit dot edu

OH: Thu 4-5 PM 36-156

TA Lana Xu

ylanaxu at mit dot edu

OH: Thu 4-5 PM 36-156

TA William Yang

wyyang at mit dot edu

OH: Tue 10-11 AM 36-156

- Logistics

- Grading Policy

  • Problem sets (50%)
  • Midterm Exam (25%)
  • Final project (25%) - Research project focused on deeper understanding
  • Collaboration policy
  • AI assistants policy (ChatGPT, etc.)
  • Attendance policy
  • Late policy

- Materials


    Class Schedule


    ** class schedule is subject to change **

    Date | Topics | Speaker | Course Materials | Assignments
    Week 1
    Thu 9/4 Course overview, introduction to deep neural networks and their basic building blocks Sara Beery slides

    optional readings:
    notation for this course
    neural networks

    Week 2
    Mon 9/8 PyTorch Tutorial: 4-5 PM, 32-123 (pytorch tutorial colab)
    Tue 9/9 How to train a neural net
    Details: SGD, backprop and autodiff, differentiable programming
    Sara Beery slides

    required readings:
    gradient-based learning
    backprop
    pset 1 out (solutions)
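
    To make this concrete, a minimal PyTorch sketch of gradient-based training (the toy data and hyperparameters are illustrative, not from the pset):

        import torch

        # toy data: learn y = 3x + 1 from noisy samples
        x = torch.linspace(-1, 1, 64).unsqueeze(1)
        y = 3 * x + 1 + 0.1 * torch.randn_like(x)

        model = torch.nn.Linear(1, 1)                # a one-layer "network"
        opt = torch.optim.SGD(model.parameters(), lr=0.1)

        for step in range(200):
            loss = ((model(x) - y) ** 2).mean()      # mean squared error
            opt.zero_grad()                          # clear old gradients
            loss.backward()                          # backprop: autodiff fills p.grad
            opt.step()                               # SGD update: p <- p - lr * p.grad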
    Tue 9/9 PyTorch Tutorial: 3-4 PM, 2-190
    Wed 9/10 PyTorch Tutorial: 7-8 PM, 32-123
    Thu 9/11 Approximation theory
    Details: How well can you approximate a given function by a DNN? We will explore various facets of this question, from universal approximation to Barron's theorem. Does increasing depth provably help expressivity?
    Omar Khattab slides
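
    For reference, one classical statement of universal approximation (Cybenko 1989; Hornik 1991), written here in LaTeX:

        % Universal approximation, one hidden layer;
        % sigma is any nonconstant, bounded, continuous activation.
        \[
        \forall f \in C([0,1]^n),\ \forall \varepsilon > 0,\
        \exists N,\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n
        \ \text{ such that }\
        \sup_{x \in [0,1]^n} \Big| f(x) - \sum_{i=1}^{N} a_i\, \sigma(w_i^\top x + b_i) \Big| < \varepsilon.
        \]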
    Fri 9/12 PyTorch Tutorial: 11 AM -12 PM, 32-123
    Week 3
    Tue 9/16 Architectures: Grids
    Details: This lecture will focus mostly on convolutional neural networks, presenting them as a good choice when your data lies on a grid.
    Sara Beery slides

    required reading:
    CNNs
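
    A minimal PyTorch sketch of the grid intuition (shapes are arbitrary): the same small filter is applied at every spatial location, so parameters are shared across the grid.

        import torch

        imgs = torch.randn(8, 3, 32, 32)   # a batch of 32x32 RGB "images"
        conv = torch.nn.Conv2d(in_channels=3, out_channels=16,
                               kernel_size=3, padding=1)  # 3x3 filters, reused everywhere
        feats = conv(imgs)                 # -> shape (8, 16, 32, 32)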
    Thu 9/18 Architectures: Memory and Sequence Modeling
    Details: RNNs, LSTMs, memory, sequence models.
    Kaiming He slides
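
    A minimal sketch of the memory idea, assuming a vanilla RNN cell (dimensions are arbitrary):

        import torch

        d_in, d_h = 8, 16
        Wx, Wh = torch.randn(d_in, d_h), torch.randn(d_h, d_h)
        h = torch.zeros(d_h)                     # the "memory"
        for x_t in torch.randn(10, d_in):        # a length-10 input sequence
            h = torch.tanh(x_t @ Wx + h @ Wh)    # new state from input + old state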
    Week 4
    Tue 9/23 Architectures: Transformers
    Details: Transformers. Three key ideas: tokens, attention, positional codes. Relationship between transformers and MLPs, GNNs, and CNNs: they are all variations on the same themes!
    Sara Beery slides

    reading:
    Transformers (note that this reading focuses on examples from vision but you can apply the same architecture to any kind of data)
    pset 1 due
    pset 2 out (solutions)
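
    A minimal single-head self-attention sketch tying the three ideas together (dimensions and weights are arbitrary; in a real transformer, positional codes are added to the token embeddings):

        import torch
        import torch.nn.functional as F

        T, d = 5, 16                              # 5 tokens, 16-dim embeddings
        x = torch.randn(T, d)                     # token embeddings
        Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
        Q, K, V = x @ Wq, x @ Wk, x @ Wv

        attn = F.softmax(Q @ K.T / d**0.5, dim=-1)  # (T, T) attention weights
        out = attn @ V                              # each output mixes all tokens' values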
    Thu 9/25 Generalization Theory
    Omar Khattab slides

    optional readings:
    Deep learning requires rethinking generalization
    Deep learning not so mysterious or different
    Data science at the singularity
    Week 5
    Tue 9/30 Representation Learning: Reconstruction-based
    Details: Intro to representation learning, unsupervised and self-supervised learning, clustering, dimensionality reduction, autoencoders, and modern self-supervised learning with reconstruction losses.
    Kaiming He slides
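
    A minimal autoencoder sketch of the reconstruction-loss idea (layer sizes are arbitrary):

        import torch

        enc = torch.nn.Linear(784, 32)           # compress to a 32-dim code
        dec = torch.nn.Linear(32, 784)           # reconstruct from the code
        x = torch.randn(16, 784)                 # stand-in data batch
        x_hat = dec(torch.relu(enc(x)))
        loss = ((x_hat - x) ** 2).mean()         # reconstruction loss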
    Thu 10/2 Representation Learning: Similarity-based (aka Neural Information Retrieval)
    Details: Information retrieval, contrastive learning (InfoNCE; hard negatives; KL distillation; self-supervised vs. supervised), sub-linear search and scaling tradeoffs (cross-encoders; bi-encoders; late interaction).
    Omar Khattab slides

    optional readings:
    Contrastive Representation Learning
    Contextualized Late Interaction over BERT
    In Defense of Dual-Encoders
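
    A minimal sketch of the InfoNCE idea with in-batch negatives (batch size, dimension, and temperature are arbitrary):

        import torch
        import torch.nn.functional as F

        B, d = 32, 128
        q = F.normalize(torch.randn(B, d), dim=-1)   # query embeddings
        p = F.normalize(torch.randn(B, d), dim=-1)   # matching (positive) embeddings

        logits = q @ p.T / 0.07                      # cosine similarities / temperature
        labels = torch.arange(B)                     # positives lie on the diagonal
        loss = F.cross_entropy(logits, labels)       # other rows act as negatives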
    Week 6
    Tue 10/7 Representation Learning and Information Theory Kaiming He slides pset 2 due
    pset 3 out
    Thu 10/9 Foundation models: pre-training Omar Khattab slides

    optional readings:
    Language Models are Few-Shot Learners
    SmolLM3; OLMo 2; Marin 8B
    Week 7
    Tue 10/14 Foundation models: scaling laws Omar Khattab slides

    optional readings:
    Scaling Laws for LLMs
    Training Compute-Optimal LLMs
    Emergent Abilities of LLMs
    Are Emergent Abilities of LLMs a Mirage?
    Thu 10/16 Generative models: basics Kaiming He slides pset 3 due
    pset 4 out
    Week 8
    Tue 10/21 Midterm: 7:30 - 9:30 PM
    Thu 10/23 Generative models: VAE and GAN Kaiming He slides
    Week 9
    Tue 10/28 Foundation models: post-training Omar Khattab slides
    Thu 10/30 Generative models: Diffusion and Flows Kaiming He slides
    Week 10
    Tue 11/4 Generalization (OOD)
    Details: Exploring model generalization out of distribution, with a focus on adversarial robustness and distribution shift.
    Sara Beery slides pset 4 due
    pset 5 out
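
    A minimal sketch of one classic adversarial attack, FGSM (Goodfellow et al., 2015): perturb the input in the direction that most increases the loss (the model and epsilon here are stand-ins):

        import torch
        import torch.nn.functional as F

        model = torch.nn.Linear(10, 2)              # stand-in classifier
        x = torch.randn(1, 10, requires_grad=True)
        y = torch.tensor([1])

        F.cross_entropy(model(x), y).backward()     # gradient of loss w.r.t. the input
        x_adv = x + 0.1 * x.grad.sign()             # epsilon = 0.1 (arbitrary)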
    Thu 11/6 Transfer learning: Models and Data
    Details: Finetuning, linear probes, knowledge distillation, generative models as data, domain adaptation, prompting.
    Sara Beery slides
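
    A minimal sketch of a linear probe, one of the tools listed above (the backbone is a stand-in for a pretrained model):

        import torch

        backbone = torch.nn.Sequential(               # stand-in for a pretrained model
            torch.nn.Linear(32, 64), torch.nn.ReLU())
        for p in backbone.parameters():
            p.requires_grad = False                   # freeze the backbone

        probe = torch.nn.Linear(64, 10)               # only these weights are trained
        opt = torch.optim.SGD(probe.parameters(), lr=0.01)
        logits = probe(backbone(torch.randn(4, 32)))  # gradients flow only into the probe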
    Week 11
    Tue 11/11 No class: Veterans Day
    Thu 11/13 Inference-time Algorithms Omar Khattab slides pset 5 due
    Week 12
    Tue 11/18 Guest Lecture: Deep Learning that Improves Real-World Interactions Rose E Wang (OpenAI)
    Thu 11/20 Evaluation
    Details: Designing benchmarks, selecting metrics, and using human input at inference.
    Sara Beery slides
    Week 13
    Tue 11/25 Applying Deep Learning to Your Problems Kaiming He slides
    Thu 11/27 No class: Thanksgiving Day
    Week 14
    Tue 12/2 Guest Lecture 2 Zongyi Li (MIT/NYU)
    Thu 12/4 Project office hours
    Week 15
    Tue 12/9 Guest Lecture 3 Jiajun Wu (Stanford) project due


    Collaboration policy



    AI assistants policy



    Attendance policy

  • Attendance is at your discretion. Recordings will be released here right after each class.


    Late policy