CS224d: Deep Learning for Natural Language Processing

Richard Socher, Stanford

Course Description

Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertisement, emails, customer service, language translation, radiology reports, etc. There is a large variety of underlying tasks and machine learning models powering NLP applications. Recently, deep learning approaches have obtained very high performance across many different NLP tasks. These approaches can often be trained end-to-end and do not require traditional, task-specific feature engineering. In this spring-quarter course, students will learn to implement, train, debug, visualize, and invent their own neural network models. The course provides a deep excursion into cutting-edge research in deep learning applied to NLP. The final project will involve training a complex recurrent neural network and applying it to a large-scale NLP problem. On the model side we will cover word vector representations, window-based neural networks, recurrent neural networks, long short-term memory (LSTM) models, recursive neural networks, and convolutional neural networks, as well as some very novel models involving a memory component. Through lectures and programming assignments, students will learn the necessary engineering tricks for making neural networks work on practical problems.


Prerequisites

  • Proficiency in Python
    All class assignments will be in Python (and will use numpy). There is a tutorial here for those who aren't as familiar with Python. If you have a lot of programming experience, but in a different language (e.g. C/C++/Matlab/Javascript), you will probably be fine.
  • College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51)
    You should be comfortable taking derivatives and understanding matrix vector operations and notation.
  • Basic Probability and Statistics (e.g. CS 109 or other stats course)
    You should know basics of probabilities, gaussian distributions, mean, standard deviation, etc.
  • Equivalent knowledge of CS229 (Machine Learning)
    We will be formulating cost functions, taking derivatives and performing optimization with gradient descent.
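The cost-function/gradient-descent workflow mentioned above can be sketched in a few lines of numpy. This is a minimal, hypothetical illustration (not taken from the course assignments): we fit a linear model y = w·x + b by repeatedly stepping against the gradient of a mean-squared-error cost.

```python
import numpy as np

# Synthetic data: y = 3x + 1 plus a little noise
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + 1.0 + 0.01 * rng.normal(size=100)

w, b = 0.0, 0.0
lr = 0.1  # learning rate
for _ in range(500):
    err = (w * x + b) - y
    # Analytic gradients of the cost J = mean(err**2) w.r.t. w and b
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should approach 3.0 and 1.0
```

Each iteration computes the gradient of the cost and moves the parameters a small step in the opposite direction; the same loop structure underlies training the neural networks in this course, with backpropagation supplying the gradients.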


The following are not prerequisites, but would be helpful:

  • Knowledge of natural language processing (CS224N or CS224U)
    We will discuss many different tasks, and you will appreciate the power of deep learning techniques even more if you know how much work has been done on these tasks and how related models have solved them.
  • Convex optimization
    You may find some of the optimization tricks more intuitive with this background.
  • Knowledge of convolutional neural networks (CS231n)
    The first problem set will probably be easier for you. We cannot assume you took this class so there will be ~3 lectures that overlap in content. You can use that time to dive deeper into some aspects.


FAQ

Is this the first time this class is offered?
Yes, this is an entirely new class designed to introduce students to deep learning for natural language processing. We will place a particular emphasis on Neural Networks, which are a class of deep learning models that have recently obtained improvements in many different NLP tasks.
Can I follow along from the outside?
We'd be happy to have you follow along! We plan to make the course materials widely available: the assignments, course notes, and slides will be available online. We may provide videos. We won't be able to give you course credit.
Can I take this course on credit/no cred basis?
Yes. Credit will be given to those who would have otherwise earned a C- or above.
Can I audit or sit in?
In general we are very open to guests sitting in if you are a member of the Stanford community (registered student, staff, and/or faculty). Out of courtesy, we would appreciate it if you first email us or talk to the instructor after the first class you attend.
Can I work in groups for the Final Project?
Yes, in groups of up to two people.
I have a question about the class. What is the best way to reach the course staff?
Stanford students please use an internal class forum on Piazza so that other students may benefit from your questions and our answers. If you have a personal matter, email us at the class mailing list cs224d-spr1415-staff@lists.stanford.edu.
Can I combine the Final Project with another course?
Yes, you may. There are a couple of courses concurrently offered with CS224d that are natural choices, such as CS224u (Natural Language Understanding, by Prof. Chris Potts and Bill MacCartney). If you are taking a related class, please speak to the instructors to receive permission to combine the Final Project assignments.

Schedule and Syllabus

Unless otherwise specified the course lectures and meeting times are:

Monday, Wednesday 1:00-2:15
Location: 320-105
Event | Date | Description | Course Materials
Lecture Mar 30 Intro to NLP and Deep Learning Suggested Readings:
  1. [Linear Algebra Review]
  2. [Probability Review]
  3. [Convex Optimization Review]
  4. [More Optimization (SGD) Review]
  5. [From Frequency to Meaning: Vector Space Models of Semantics]
[Lecture Notes 1]
[python tutorial] [slides] [video]
Lecture Apr 1 Simple Word Vector representations: word2vec, GloVe Suggested Readings:
  1. [Distributed Representations of Words and Phrases and their Compositionality]
  2. [Efficient Estimation of Word Representations in Vector Space]
Lecture Apr 6 Advanced word vector representations: language models, softmax, single layer networks Suggested Readings:
  1. [GloVe: Global Vectors for Word Representation]
  2. [Improving Word Representations via Global Context and Multiple Word Prototypes]
Lecture Apr 8 Neural Networks and backpropagation -- for named entity recognition Suggested Readings:
  1. [UFLDL tutorial]
  2. [Learning Representations by Back-Propagating Errors]
[slides] [video]
Lecture Apr 13 Project Advice, Neural Networks and Back-Prop (in full gory detail) Suggested Readings:
  1. [Natural Language Processing (almost) from Scratch]
  2. [A Neural Network for Factoid Question Answering over Paragraphs]
  3. [Grounded Compositional Semantics for Finding and Describing Images with Sentences]
  4. [Deep Visual-Semantic Alignments for Generating Image Descriptions]
  5. [Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank]
Lecture Apr 15 Practical tips: gradient checks, overfitting, regularization, activation functions, details Suggested Readings:
  1. [Practical recommendations for gradient-based training of deep architectures]
  2. [UFLDL page on gradient checking]
[slides] [video]
A1 Due Apr 16 Assignment #1 due [Pset 1]
Lecture Apr 20 Recurrent neural networks -- for language modeling and other tasks Suggested Readings:
  1. [Recurrent neural network based language model]
  2. [Extensions of recurrent neural network language model]
  3. [Opinion Mining with Deep Recurrent Neural Networks]
Proposal due Apr 21 Course Project Proposal due [proposal description]
Lecture Apr 22 GRUs and LSTMs -- for machine translation Suggested Readings:
  1. [Long Short-Term Memory]
  2. [Gated Feedback Recurrent Neural Networks]
  3. [Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling]
[slides] [video]
Lecture Apr 27 Recursive neural networks -- for parsing Suggested Readings:
  1. [Parsing with Compositional Vector Grammars]
  2. [Subgradient Methods for Structured Prediction]
  3. [Parsing Natural Scenes and Natural Language with Recursive Neural Networks]
[slides] [video]
Lecture Apr 29 Recursive neural networks -- for different tasks (e.g. sentiment analysis) Suggested Readings:
  1. [Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank]
  2. [Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection]
  3. [Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks]
[slides] [video]
A2 Due Apr 30 Pset #2 Due date [Pset #2]
Lecture May 4 Review Session for Midterm Suggested Readings: N/A
[slides] [video - see Piazza]
Midterm May 6 In-class midterm
Lecture May 11 Guest Lecture with Jason Weston from Facebook: Neural Models with Memory -- for question answering Suggested Readings:
  1. [Memory Networks]
  2. [Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks]
[slides] [video]
Milestone May 13 Course Project Milestone [milestone description]
Lecture May 13 Convolutional neural networks -- for sentence classification Suggested Readings:
  1. [A Convolutional Neural Network for Modelling Sentences]
[slides] [video]
Lecture May 18 Guest Lecture with Andrew Maas: Speech recognition Suggested Readings:
  1. [Deep Neural Networks for Acoustic Modeling in Speech Recognition]
[slides] [video]
Lecture May 20 Guest Lecture with Elliot English from MetaMind: Efficient implementations and GPUs Suggested Readings: N/A
[slides] [video]
A3 Due May 21 Pset #3 Due date [Pset #3]
Lecture May 27 Applications of Deep Learning to Natural Language Processing Suggested Readings: N/A
[slides] [video]
Lecture Jun 1 Future applications, open research problems, visualization Suggested Readings: N/A
[slides] [no video]
Poster Presentation Jun 3 Final project poster presentations: 2-5 pm, Gates patio
Final Project Due Jun 8 Final course project due date [project description]