This course is designed to introduce students to the fundamental concepts and ideas in natural language processing (NLP),
and to get them up to speed with current research in the area. It develops an in-depth understanding of both the algorithms
available for the processing of linguistic information and the underlying computational properties of natural languages.
Word-level, syntactic, and semantic processing are considered from both a linguistic and an algorithmic perspective.
The focus is on modern quantitative techniques in NLP: the use of large corpora and statistical models for acquisition,
disambiguation, and parsing. The course also examines and constructs representative systems.
Prerequisites:
• Adequate experience with programming and formal structures (e.g., CS106B/X and CS103B/X).
• Programming projects will be written in Java 1.5, so knowledge of Java (or a willingness to learn on your own) is
required.
• Knowledge of standard concepts in artificial intelligence and/or computational linguistics (e.g., CS121/221 or Ling
180).
• Basic familiarity with logic, vector spaces, and probability.
Intended Audience:
• Graduate students and advanced undergraduates specializing in computer science, linguistics, or symbolic systems.
Due to copyright issues, video downloads and lecture slides are not available for Natural Language Processing.