MIT EECS

6.7960 Deep Learning

Fall 2025

[ Schedule | Policies | Piazza | Canvas | Gradescope | Lecture Recordings | Previous years ]

Course Overview

Description: Fundamentals of deep learning, including both theory and applications. Topics include neural net architectures (MLPs, CNNs, RNNs, graph nets, transformers), geometry and invariances in deep learning, backpropagation and automatic differentiation, learning theory and generalization in high-dimensions, and applications to computer vision, natural language processing, and robotics.

Prerequisites: 18.05 and (6.3720, 6.3900, or 6.C01)

Note: This course is appropriate for advanced undergraduates and graduate students, and is 3-0-9 units. Due to heavy enrollment, we unfortunately will not be able to accept cross-registrations this semester.

Any personal or logistical questions (e.g., regarding absences, accommodations) should be emailed to the course email address, 6.7960-instructors-fl2025@mit.edu, and not to the instructors directly.




Course Information

Instructor Sara Beery

beery at mit dot edu

OH: Thu 9-10 AM 45-741H

Instructor Kaiming He

kaiming at mit dot edu

OH: Mon 9-10 AM 45-701H

Instructor Omar Khattab

okhattab at mit dot edu

OH: Tue 9-10 AM 32-G818

Course Assistant Taylor Braun

tvbraun at mit dot edu

Head TA Victor Butoi

vbutoi at mit dot edu

OH: Wed 1-2 PM 36-155

Head TA Ishan Ganguly

iganguly at mit dot edu

OH: Tue 3-4 PM 36-112

TA Mahmoud Abdelmoneum

mabdel03 at mit dot edu

OH: Tue 4-5 PM 36-112

TA Abhay Bestrapalli

abhayb at mit dot edu

OH: Mon 3-4 PM 26-168

TA Riddhi Bhagwat

riddhib at mit dot edu

OH: Fri 9-10 AM 24-317

TA Russ Chua

russchua at mit dot edu

OH: Tue 10-11 AM 36-156

TA Kelly Cui

kellycui at mit dot edu

OH: Tue 3-4 PM 36-112

TA Ali Cy

califyn at mit dot edu

OH: Mon 5-6 PM 26-168

TA Gerardo Flores

gfm at mit dot edu

OH: Wed 1-2 PM 36-155

TA Orion Foo

ofoo at mit dot edu

OH: Mon 3-4 PM 26-168

TA Ishita Goluguri

ishi at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Egor Lifar

l1far at mit dot edu

OH: Mon 5-6 PM 26-168

TA Maggie Lin

maggiejl at mit dot edu

OH: Tue 3-4 PM 36-112

TA Edgar Morfin

emorfin at mit dot edu

OH: Wed 12-1 PM 24-319

TA Shreya Ravikumar

shreyark at mit dot edu

OH: Mon 4-5 PM 26-168

TA Tara Sarma

tssarma at mit dot edu

OH: Mon 4-5 PM 26-168

TA Gracie Sheng

grac at mit dot edu

OH: Fri 9-10 AM 24-317

TA Ashkan Soleymani

ashkanso at mit dot edu

OH: Thu 11 AM - 12 PM 34-303

TA Vinith Suriyakumar

vinithms at mit dot edu

OH: Wed 1-2 PM 36-155

TA Vanessa Xiao

vzxiao at mit dot edu

OH: Thu 4-5 PM 36-156

TA Lana Xu

ylanaxu at mit dot edu

OH: Thu 4-5 PM 36-156

TA William Yang

wyyang at mit dot edu

OH: Tue 10-11 AM 36-156

- Logistics

- Grading Policy

  • Problem sets (50%)
  • Midterm Exam (25%)
  • Final project (25%): research project focused on deeper understanding

- Collaboration policy

- AI assistants policy (ChatGPT, etc.)

- Attendance policy

- Late policy

- Materials



    Class Schedule


    ** class schedule is subject to change **

    Date Topics Speaker Course Materials Assignments
    Week 1
    Thu 9/4 Course overview, introduction to deep neural networks and their basic building blocks Sara Beery slides

    optional readings:
    notation for this course
    neural networks

    Week 2
    Mon 9/8 PyTorch Tutorial: 4-5 PM, 32-123 pytorch tutorial colab
    Tue 9/9 How to train a neural net
    + details SGD, Backprop and autodiff, differentiable programming
    Sara Beery slides

    required readings:
    gradient-based learning
    backprop
    pset 1 out
    Tue 9/9 PyTorch Tutorial: 3-4 PM, 2-190
    Wed 9/10 PyTorch Tutorial: 7-8 PM, 32-123
    Thu 9/11 Approximation theory
    + details How well can you approximate a given function by a DNN? We will explore various facets of this issue, from universal approximation to Barron's theorem. And does increasing the depth provably help for expressivity?
    Omar Khattab slides
    Fri 9/12 PyTorch Tutorial: 11 AM -12 PM, 32-123
    Week 3
    Tue 9/16 Architectures: Grids
    + details This lecture will focus mostly on convolutional neural networks, presenting them as a good choice when your data lies on a grid.
    Sara Beery slides

    required reading:
    CNNs
    Thu 9/18 Architectures: Memory and Sequence Modeling
    + details RNNs, LSTMs, memory, sequence models.
    Kaiming He slides
    Week 4
    Tue 9/23 Architectures: Transformers
    + details Transformers. Three key ideas: tokens, attention, positional codes. Relationship between transformers and MLPs, GNNs, and CNNs -- they are all variations on the same themes!
    Sara Beery slides

    reading:
    Transformers (note that this reading focuses on examples from vision but you can apply the same architecture to any kind of data)
    pset 1 due
    pset 2 out
    Thu 9/25 Generalization Theory
    Omar Khattab slides

    optional readings:
    Understanding deep learning requires rethinking generalization
    Deep Learning is Not So Mysterious or Different
    Data Science at the Singularity
    Week 5
    Tue 9/30 Representation Learning: Reconstruction-based
    + details Intro to representation learning, unsupervised and self-supervised learning, clustering, dimension reduction, autoencoders, and modern self-supervised learning with reconstruction losses.
    Kaiming He slides
    Thu 10/2 Representation Learning: Similarity-based (aka Neural Information Retrieval)
    + details Information retrieval, contrastive learning (InfoNCE; hard negatives; KL distillation; self-supervised vs. supervised), sub-linear search and scaling tradeoffs (cross-encoders; bi-encoders; late interaction).
    Omar Khattab slides

    optional readings:
    Contrastive Representation Learning
    Contextualized Late Interaction over BERT
    In Defense of Dual-Encoders
    Week 6
    Tue 10/7 Representation learning -- theory Kaiming He pset 2 due
    pset 3 out
    Thu 10/9 Foundation models: pre-training and scaling laws Omar Khattab
    Week 7
    Tue 10/14 Foundation models: post-training and RL Omar Khattab
    Thu 10/16 Generative models: basics
    + details Density and energy models, samplers, GANs, autoregressive models, diffusion models
    Kaiming He pset 3 due
    pset 4 out
    Week 8
    Tue 10/21 Midterm: 7:30 - 9:30 PM
    Thu 10/23 Generative models: representation learning meets generative modeling
    + details VAEs, latent variables
    Kaiming He
    Week 9
    Tue 10/28 Hacker's guide to DL
    + details Practical tips mixed with opinionated anecdotes about how to get deep nets to actually do what you want.
    Omar Khattab
    Thu 10/30 Generative models: Diffusion and Flows Kaiming He
    Week 10
    Tue 11/4 Generalization (OOD)
    + details Exploring model generalization out of distribution, with a focus on adversarial robustness and distribution shift
    Sara Beery pset 4 due
    pset 5 out
    Thu 11/6 Transfer learning: Models and Data
    + details Finetuning, linear probes, knowledge distillation, generative models as data, domain adaptation, prompting
    Sara Beery
    Week 11
    Tue 11/11 No class: Veterans Day
    Thu 11/13 Guest Lecture 1 pset 5 due
    Week 12
    Tue 11/18 Guest Lecture 2
    Thu 11/20 Evaluation
    + details Designing benchmarks, selecting metrics, and using human input at inference
    Sara Beery
    Week 13
    Tue 11/25 Applications Kaiming He
    Thu 11/27 No class: Thanksgiving Day
    Week 14
    Tue 12/2 Inference methods for deep learning and systems
    + details Everything beyond a simple forward pass: beam search, chain-of-thought, in-context learning, test-time training. Also methods that use search to improve learning.
    Omar Khattab
    Thu 12/4 Project office hours
    Week 15
    Tue 12/9 Final project deadline
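
    The Week 2 material on gradient-based learning, SGD, and backprop can be illustrated with a toy example in plain Python (a sketch for intuition only, not course-provided code; the data, learning rate, and step count here are made up):

    ```python
    # Fit y = w * x by minimizing mean squared error
    # L(w) = (1/N) * sum_i (w * x_i - y_i)^2 with gradient descent.
    xs = [0.5, 1.0, 1.5, 2.0]
    ys = [3.0 * x for x in xs]  # noiseless targets, true slope 3

    w = 0.0   # initial parameter
    lr = 0.1  # learning rate

    for _ in range(100):
        # "Backward pass": hand-derived gradient
        # dL/dw = (2/N) * sum_i (w * x_i - y_i) * x_i
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # gradient-descent update

    print(round(w, 3))  # converges to the true slope, 3.0
    ```

    In the course itself, the gradient would come from PyTorch's autodiff (`loss.backward()`) rather than a hand-derived formula; the update rule is the same.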


    Collaboration policy



    AI assistants policy



    Attendance policy

  • Attendance is at your discretion. Recordings will be released here right after each class.


    Late policy