MIT EECS

6.7960 Deep Learning

Fall 2025

[ Schedule | Policies | Piazza | Canvas | Gradescope | Lecture Recordings | Previous years ]

Course Overview

Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high-dimensions, and applications to computer vision, natural language processing, and robotics.

Pre-requisites: 18.05 and (6.3720, 6.3900, or 6.C01)

Note: This course is appropriate for advanced undergraduates and graduate students, and is 3-0-9 units. Due to heavy enrollment, we unfortunately will not be able to accept cross-registrations this semester.

Any and all personal or logistical questions, such as those regarding absences, accommodations, etc., should be emailed to the course email address, 6.7960-instructors-fl2025@mit.edu, and not to the instructors directly.




Course Information

Instructor Sara Beery

beery at mit dot edu

OH: Thu 9-10 AM 45-741H

Instructor Kaiming He

kaiming at mit dot edu

OH: Mon 9-10 AM 45-701H

Instructor Omar Khattab

okhattab at mit dot edu

OH: Tue 9-10 AM 32-G818

Course Assistant Taylor Braun

tvbraun at mit dot edu

Head TA Victor Butoi

vbutoi at mit dot edu

OH: Wed 1-2 PM 36-155

Head TA Ishan Ganguly

iganguly at mit dot edu

OH: Tue 3-4 PM 36-112

TA Mahmoud Abdelmoneum

mabdel03 at mit dot edu

OH: Tue 4-5 PM 36-112

TA Abhay Bestrapalli

abhayb at mit dot edu

OH: Mon 3-4 PM 26-168

TA Riddhi Bhagwat

riddhib at mit dot edu

OH: Fri 10-11 AM 24-317

TA Russ Chua

russchua at mit dot edu

OH: Tue 10-11 AM 36-156

TA Kelly Cui

kellycui at mit dot edu

OH: Tue 3-4 PM 36-112

TA Ali Cy

califyn at mit dot edu

OH: Mon 5-6 PM 26-168

TA Gerardo Flores

gfm at mit dot edu

OH: Wed 1-2 PM 36-155

TA Orion Foo

ofoo at mit dot edu

OH: Mon 3-4 PM 26-168

TA Ishita Goluguri

ishi at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Egor Lifar

l1far at mit dot edu

OH: Mon 5-6 PM 26-168

TA Maggie Lin

maggiejl at mit dot edu

OH: Tue 3-4 PM 36-112

TA Edgar Morfin

emorfin at mit dot edu

OH: Wed 12-1 PM 24-319

TA Shreya Ravikumar

shreyark at mit dot edu

OH: Mon 4-5 PM 26-168

TA Tara Sarma

tssarma at mit dot edu

OH: Mon 4-5 PM 26-168

TA Gracie Sheng

grac at mit dot edu

OH: Fri 10-11 AM 24-317

TA Ashkan Soleymani

ashkanso at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Vinith Suriyakumar

vinithms at mit dot edu

OH: Wed 1-2 PM 36-155

TA Vanessa Xiao

vzxiao at mit dot edu

OH: Thu 4-5 PM 36-156

TA Lana Xu

ylanaxu at mit dot edu

OH: Thu 4-5 PM 36-156

TA William Yang

wyyang at mit dot edu

OH: Tue 10-11 AM 36-156

- Logistics

- Grading Policy

  • Problem sets (50%)
  • Midterm Exam (25%)
  • Final project (25%) - Research project focused on deeper understanding
  • Collaboration policy
  • AI assistants policy (ChatGPT, etc.)
  • Attendance policy
  • Late policy

- Materials


    Class Schedule


    ** class schedule is subject to change **

    Date | Topics | Speaker | Course Materials | Assignments
    Week 1
    Thu 9/4 Course overview, introduction to deep neural networks and their basic building blocks Sara Beery slides

    optional readings:
    notation for this course
    neural networks

    Week 2
    Mon 9/8 PyTorch Tutorial: 4-5 PM, 32-123 (pytorch tutorial colab)
    Tue 9/9 How to train a neural net
    Details: SGD, backprop and autodiff, differentiable programming
    Sara Beery slides

    required readings:
    gradient-based learning
    backprop
    pset 1 out (solutions)
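
    To make this concrete, a minimal PyTorch sketch of gradient-based training (the toy data and hyperparameters are illustrative, not from the pset):

        import torch

        # toy data: learn y = 3x + 1 from noisy samples
        x = torch.linspace(-1, 1, 64).unsqueeze(1)
        y = 3 * x + 1 + 0.1 * torch.randn_like(x)

        model = torch.nn.Linear(1, 1)                # a one-layer "network"
        opt = torch.optim.SGD(model.parameters(), lr=0.1)

        for step in range(200):
            loss = ((model(x) - y) ** 2).mean()      # mean squared error
            opt.zero_grad()                          # clear old gradients
            loss.backward()                          # backprop: autodiff fills p.grad
            opt.step()                               # SGD update: p <- p - lr * p.grad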
    Tue 9/9 PyTorch Tutorial: 3-4 PM, 2-190
    Wed 9/10 PyTorch Tutorial: 7-8 PM, 32-123
    Thu 9/11 Approximation theory
    Details: How well can you approximate a given function by a DNN? We will explore various facets of this question, from universal approximation to Barron's theorem. Does increasing depth provably help expressivity?
    Omar Khattab slides
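
    For reference, one classical statement of universal approximation (Cybenko 1989; Hornik 1991), written here in LaTeX:

        % Universal approximation, one hidden layer;
        % sigma is any nonconstant, bounded, continuous activation.
        \[
        \forall f \in C([0,1]^n),\ \forall \varepsilon > 0,\
        \exists N,\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n
        \ \text{ such that }\
        \sup_{x \in [0,1]^n} \Big| f(x) - \sum_{i=1}^{N} a_i\, \sigma(w_i^\top x + b_i) \Big| < \varepsilon.
        \]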
    Fri 9/12 PyTorch Tutorial: 11 AM -12 PM, 32-123
    Week 3
    Tue 9/16 Architectures: Grids
    Details: This lecture will focus mostly on convolutional neural networks, presenting them as a good choice when your data lies on a grid.
    Sara Beery slides

    required reading:
    CNNs
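
    A minimal PyTorch sketch of the grid intuition (shapes are arbitrary): the same small filter is applied at every spatial location, so parameters are shared across the grid.

        import torch

        imgs = torch.randn(8, 3, 32, 32)   # a batch of 32x32 RGB "images"
        conv = torch.nn.Conv2d(in_channels=3, out_channels=16,
                               kernel_size=3, padding=1)  # 3x3 filters, reused everywhere
        feats = conv(imgs)                 # -> shape (8, 16, 32, 32)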
    Thu 9/18 Architectures: Memory and Sequence Modeling
    Details: RNNs, LSTMs, memory, sequence models.
    Kaiming He slides
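
    A minimal sketch of the memory idea, assuming a vanilla RNN cell (dimensions are arbitrary):

        import torch

        d_in, d_h = 8, 16
        Wx, Wh = torch.randn(d_in, d_h), torch.randn(d_h, d_h)
        h = torch.zeros(d_h)                     # the "memory"
        for x_t in torch.randn(10, d_in):        # a length-10 input sequence
            h = torch.tanh(x_t @ Wx + h @ Wh)    # new state from input + old state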
    Week 4
    Tue 9/23 Architectures: Transformers
    Details: Transformers. Three key ideas: tokens, attention, positional codes. Relationship between transformers and MLPs, GNNs, and CNNs: they are all variations on the same themes!
    Sara Beery slides

    reading:
    Transformers (note that this reading focuses on examples from vision but you can apply the same architecture to any kind of data)
    pset 1 due
    pset 2 out (solutions)
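
    A minimal single-head self-attention sketch tying the three ideas together (dimensions and weights are arbitrary; in a real transformer, positional codes are added to the token embeddings):

        import torch
        import torch.nn.functional as F

        T, d = 5, 16                              # 5 tokens, 16-dim embeddings
        x = torch.randn(T, d)                     # token embeddings
        Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
        Q, K, V = x @ Wq, x @ Wk, x @ Wv

        attn = F.softmax(Q @ K.T / d**0.5, dim=-1)  # (T, T) attention weights
        out = attn @ V                              # each output mixes all tokens' values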
    Thu 9/25 Generalization Theory
    Omar Khattab slides

    optional readings:
    Deep learning requires rethinking generalization
    Deep learning not so mysterious or different
    Data science at the singularity
    Week 5
    Tue 9/30 Representation Learning: Reconstruction-based
    Details: Intro to representation learning, unsupervised and self-supervised learning, clustering, dimensionality reduction, autoencoders, and modern self-supervised learning with reconstruction losses.
    Kaiming He slides
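
    A minimal autoencoder sketch of the reconstruction-loss idea (layer sizes are arbitrary):

        import torch

        enc = torch.nn.Linear(784, 32)           # compress to a 32-dim code
        dec = torch.nn.Linear(32, 784)           # reconstruct from the code
        x = torch.randn(16, 784)                 # stand-in data batch
        x_hat = dec(torch.relu(enc(x)))
        loss = ((x_hat - x) ** 2).mean()         # reconstruction loss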
    Thu 10/2 Representation Learning: Similarity-based (aka Neural Information Retrieval)
    Details: Information retrieval, contrastive learning (InfoNCE; hard negatives; KL distillation; self-supervised vs. supervised), sub-linear search and scaling tradeoffs (cross-encoders; bi-encoders; late interaction).
    Omar Khattab slides

    optional readings:
    Contrastive Representation Learning
    Contextualized Late Interaction over BERT
    In Defense of Dual-Encoders
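
    A minimal sketch of the InfoNCE idea with in-batch negatives (batch size, dimension, and temperature are arbitrary):

        import torch
        import torch.nn.functional as F

        B, d = 32, 128
        q = F.normalize(torch.randn(B, d), dim=-1)   # query embeddings
        p = F.normalize(torch.randn(B, d), dim=-1)   # matching (positive) embeddings

        logits = q @ p.T / 0.07                      # cosine similarities / temperature
        labels = torch.arange(B)                     # positives lie on the diagonal
        loss = F.cross_entropy(logits, labels)       # other rows act as negatives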
    Week 6
    Tue 10/7 Representation Learning and Information Theory Kaiming He slides pset 2 due
    pset 3 out
    Thu 10/9 Foundation models: pre-training Omar Khattab slides

    optional readings:
    Language Models are Few-Shot Learners
    SmolLM3; OLMo 2; Marin 8B
    Week 7
    Tue 10/14 Foundation models: scaling laws Omar Khattab slides

    optional readings:
    Scaling Laws for LLMs
    Training Compute-Optimal LLMs
    Emergent Abilities of LLMs
    Are Emergent Abilities of LLMs a Mirage?
    Thu 10/16 Generative models: basics Kaiming He slides pset 3 due
    pset 4 out
    Week 8
    Tue 10/21 Midterm: 7:30 - 9:30 PM
    Thu 10/23 Generative models: VAE and GAN Kaiming He slides
    Week 9
    Tue 10/28 Foundation models: post-training Omar Khattab slides
    Thu 10/30 Generative models: Diffusion and Flows Kaiming He slides
    Week 10
    Tue 11/4 Generalization (OOD)
    Details: Exploring model generalization out of distribution, with a focus on adversarial robustness and distribution shift.
    Sara Beery slides pset 4 due
    pset 5 out
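
    A minimal sketch of one classic adversarial attack, FGSM (Goodfellow et al., 2015): perturb the input in the direction that most increases the loss (the model and epsilon here are stand-ins):

        import torch
        import torch.nn.functional as F

        model = torch.nn.Linear(10, 2)              # stand-in classifier
        x = torch.randn(1, 10, requires_grad=True)
        y = torch.tensor([1])

        F.cross_entropy(model(x), y).backward()     # gradient of loss w.r.t. the input
        x_adv = x + 0.1 * x.grad.sign()             # epsilon = 0.1 (arbitrary)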
    Thu 11/6 Transfer learning: Models and Data
    Details: Finetuning, linear probes, knowledge distillation, generative models as data, domain adaptation, prompting.
    Sara Beery slides
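
    A minimal sketch of a linear probe, one of the tools listed above (the backbone is a stand-in for a pretrained model):

        import torch

        backbone = torch.nn.Sequential(               # stand-in for a pretrained model
            torch.nn.Linear(32, 64), torch.nn.ReLU())
        for p in backbone.parameters():
            p.requires_grad = False                   # freeze the backbone

        probe = torch.nn.Linear(64, 10)               # only these weights are trained
        opt = torch.optim.SGD(probe.parameters(), lr=0.01)
        logits = probe(backbone(torch.randn(4, 32)))  # gradients flow only into the probe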
    Week 11
    Tue 11/11 No class: Veterans Day
    Thu 11/13 Inference-time Algorithms Omar Khattab slides pset 5 due
    Week 12
    Tue 11/18 Guest Lecture: Deep Learning that Improves Real-World Interactions Rose E Wang (OpenAI)
    Thu 11/20 Evaluation
    Details: Designing benchmarks, selecting metrics, and using human input at inference.
    Sara Beery slides
    Week 13
    Tue 11/25 Applying Deep Learning to Your Problems Kaiming He slides
    Thu 11/27 No class: Thanksgiving Day
    Week 14
    Tue 12/2 Guest Lecture 2 Zongyi Li (MIT/NYU)
    Thu 12/4 Project office hours
    Week 15
    Tue 12/9 Guest Lecture 3 Jiajun Wu (Stanford) project due


    Collaboration policy



    AI assistants policy



    Attendance policy

  • Attendance is at your discretion. Recordings will be released here right after each class.


    Late policy