Disclaimer: This is still mostly the syllabus from previous years. While we will follow the general sequence of topics, you can expect the reading materials and specific lectures to change somewhat this year.
Required readings refer to chapters in Jurafsky and Martin (2008), Speech and Language Processing, 2nd edition, unless stated otherwise. Note that the 3rd edition is still in preparation, although its website currently offers a number of new and rewritten chapters as PDFs.
Optional readings are often more advanced. "MS" refers to chapters in Manning and Schütze (1999), Foundations of Statistical Natural Language Processing (you may need to use a campus machine to access these links), or to original research papers (you can find many more in the ACL Anthology). I also recommend the Handbook of Computational Linguistics and Natural Language Processing (you also need to be on the campus network to access this site).
Week | Date | Lecture | Topic | Slides | Notes
01 | 08/30 | 01 | Introduction | 1up 4up
What is NLP? What will you learn in this class?
Required reading: Ch. 1
Optional reading: Python tutorial (sec. 1-5), Jelinek (2009), Ferrucci et al. (2010)
Links: NLTK
01 | 09/01 | 02 | Finite-state methods for morphology | 1up 4up | MP0 out
What is the structure of words, and how can we model it? Review of finite-state automata; finite-state transducers.
Required reading: Ch. 3.1-7
Optional reading: Karttunen and Beesley (2005), Mohri (1997), the Porter stemmer, Sproat et al. (1996)
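To make the finite-state idea concrete, here is a minimal Python sketch (not part of the course materials) of a deterministic automaton that recognizes words ending in "ing"; the states and the toy suffix rule are invented for illustration. NLTK also ships an implementation of the Porter stemmer (nltk.stem.porter.PorterStemmer) if you want to experiment with rule-based suffix stripping.

```python
# Minimal DFA sketch: accept lowercase strings ending in "ing",
# a toy stand-in for recognizing one English suffix.
# The state records how much of "ing" we have just seen:
# 0 = nothing, 1 = "i", 2 = "in", 3 = "ing" (accepting).
def step(state, ch):
    if ch == "i":
        return 1
    if ch == "n" and state == 1:
        return 2
    if ch == "g" and state == 2:
        return 3
    return 0

def accepts(word):
    state = 0
    for ch in word:
        state = step(state, ch)
    return state == 3

print(accepts("running"))  # True
print(accepts("ran"))      # False
```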
02 | 09/06 | 03 | N-gram language models | 1up 4up
The most basic probabilistic models of language: n-gram models and their evaluation. Also: a review of basic probability.
Required reading: Ch. 4.1-4
Optional reading: MS, Ch. 2
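As a concrete preview, here is a minimal sketch of bigram maximum-likelihood estimation; the two-sentence toy corpus is invented for illustration.

```python
from collections import Counter

# Bigram language model with maximum-likelihood estimates;
# the toy corpus is invented for illustration.
corpus = [["<s>", "the", "cat", "sat", "</s>"],
          ["<s>", "the", "dog", "sat", "</s>"]]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(b for sent in corpus for b in zip(sent, sent[1:]))

def p_mle(v, u):
    """P(v | u) = count(u, v) / count(u)."""
    return bigrams[(u, v)] / unigrams[u]

print(p_mle("cat", "the"))  # 0.5: "the" is followed by "cat" in one of two sentences
print(p_mle("sat", "dog"))  # 1.0: "dog" is always followed by "sat"
```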
02 | 09/08 | 04 | Smoothing | 1up 4up
How can we predict what we haven't seen before?
Required reading: Ch. 4.5-7
Optional reading: MS, Ch. 6, Chen and Goodman (1998)
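A minimal sketch of the simplest method, add-k (Laplace) smoothing, with invented toy counts; the methods compared by Chen and Goodman (1998), such as Kneser-Ney, work far better in practice.

```python
from collections import Counter

# Add-k smoothing (Laplace smoothing when k = 1) over toy bigram counts;
# all counts and the vocabulary size are invented for illustration.
bigrams = Counter({("the", "cat"): 1, ("the", "dog"): 1})
unigrams = Counter({"the": 2, "cat": 1, "dog": 1, "sat": 2})
V = 4  # vocabulary size: {the, cat, dog, sat}

def p_addk(v, u, k=1.0):
    """P(v | u) = (count(u, v) + k) / (count(u) + k * V)."""
    return (bigrams[(u, v)] + k) / (unigrams[u] + k * V)

print(p_addk("cat", "the"))  # seen bigram: (1 + 1) / (2 + 4) = 0.33
print(p_addk("sat", "the"))  # unseen bigram gets nonzero mass: 1 / 6 = 0.17
```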
03 | 09/13 | 05 | Smoothing (continued) | 1up 4up
(Same slides as Lecture 4.)
Required reading: Ch. 4.5-7
Optional reading: MS, Ch. 6, Chen and Goodman (1998)
03 | 09/15 | 06 | Part-of-speech tagging | 1up 4up | MP1 out
What are parts of speech? How many are there? A basic introduction to hidden Markov models (HMMs).
Required reading: Ch. 5.1-5
Optional reading: Merialdo (1994), Christodoulopoulos et al. (2010), Roche & Schabes (1995)
04 | 09/20 | 07 | Part-of-speech tagging with Hidden Markov Models | 1up 4up
The Viterbi algorithm.
Required reading: Ch. 5.1-5
Optional reading: Merialdo (1994), Christodoulopoulos et al. (2010), Roche & Schabes (1995)
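A minimal sketch of Viterbi decoding over a two-tag HMM; all probabilities are invented toy numbers for illustration.

```python
import math

# Viterbi decoding for a toy two-tag HMM (D = determiner, N = noun).
tags = ["D", "N"]
start = {"D": 0.7, "N": 0.3}                      # P(tag_1)
trans = {"D": {"D": 0.1, "N": 0.9},               # P(tag_i | tag_{i-1})
         "N": {"D": 0.4, "N": 0.6}}
emit = {"D": {"the": 0.9, "dog": 0.1},            # P(word | tag)
        "N": {"the": 0.1, "dog": 0.9}}

def viterbi(words):
    # delta[t] = best log-prob of any tag sequence ending in tag t;
    # back[i][t] = best previous tag for tag t at position i.
    delta = {t: math.log(start[t] * emit[t][words[0]]) for t in tags}
    back = []
    for w in words[1:]:
        prev, delta, ptr = delta, {}, {}
        for t in tags:
            best = max(tags, key=lambda s: prev[s] + math.log(trans[s][t]))
            delta[t] = prev[best] + math.log(trans[best][t] * emit[t][w])
            ptr[t] = best
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    t = max(tags, key=lambda s: delta[s])
    path = [t]
    for ptr in reversed(back):
        t = ptr[t]
        path.append(t)
    return list(reversed(path))

print(viterbi(["the", "dog"]))  # ['D', 'N']
```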
04 | 09/22 | 08 | Learning Hidden Markov Models | 1up 4up
The Forward-Backward algorithm.
Required reading: Ch. 6.1-5
Optional reading: MS, Ch. 9
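For comparison with Viterbi, a minimal sketch of the forward algorithm, which sums over all tag sequences instead of maximizing; Forward-Backward (Baum-Welch) training is built on these quantities. The toy HMM is the same invented one as in the Viterbi sketch above.

```python
# Forward algorithm: P(words), summed over all tag sequences.
tags = ["D", "N"]
start = {"D": 0.7, "N": 0.3}
trans = {"D": {"D": 0.1, "N": 0.9}, "N": {"D": 0.4, "N": 0.6}}
emit = {"D": {"the": 0.9, "dog": 0.1}, "N": {"the": 0.1, "dog": 0.9}}

def forward(words):
    # alpha[t] = P(words so far, current tag = t)
    alpha = {t: start[t] * emit[t][words[0]] for t in tags}
    for w in words[1:]:
        alpha = {t: sum(alpha[s] * trans[s][t] for s in tags) * emit[t][w]
                 for t in tags}
    return sum(alpha.values())

print(forward(["the", "dog"]))  # marginal likelihood of the sentence
```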
05 | 09/27 | 09 | Sequence labeling tasks | 1up 4up
Chunking, shallow parsing, named entity recognition.
Required reading: Ch. 6.6-8
Optional reading: Sutton & McCallum (2008) (an introduction to conditional random fields), Berger et al. (1996), Etzioni et al. (2008) (web-scale information extraction)
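Sequence labelers for these tasks typically emit BIO tags; here is a minimal sketch (with an invented example) of turning a BIO-tagged sequence back into chunk spans.

```python
# Turn a BIO-tagged sequence into (start, end, label) chunk spans;
# the tagged toy sequence is invented for illustration.
def bio_to_spans(tags):
    spans, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel flushes the final span
        if tag == "O" or tag.startswith("B-"):
            if start is not None:
                spans.append((start, i, label))
                start = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

print(bio_to_spans(["B-NP", "I-NP", "O", "B-NP"]))  # [(0, 2, 'NP'), (3, 4, 'NP')]
```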
05 | 09/29 | 10 | PMI, Brown Clusters | 1up 4up
How can we learn to group words based on their context?
Required reading: Ch. 4.9-10
Optional reading: MS, Ch. 14.1, Brown et al. (1992b)
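A minimal sketch of pointwise mutual information over (word, context) pairs, with invented toy counts.

```python
import math
from collections import Counter

# PMI(w, c) = log2 [ P(w, c) / (P(w) P(c)) ]; toy pair counts are invented.
pairs = Counter({("dog", "barks"): 4, ("dog", "the"): 2,
                 ("cat", "meows"): 3, ("cat", "the"): 3})
N = sum(pairs.values())
w_counts, c_counts = Counter(), Counter()
for (w, c), n in pairs.items():
    w_counts[w] += n
    c_counts[c] += n

def pmi(w, c):
    return math.log2((pairs[(w, c)] / N) /
                     ((w_counts[w] / N) * (c_counts[c] / N)))

print(pmi("dog", "barks"))  # positive: "barks" co-occurs with "dog" more than chance
print(pmi("dog", "the"))    # near zero: "the" co-occurs with everything
```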
06 | 10/04 | 11 | Vector-space semantics | 1up 4up
"You shall know a word by the company it keeps" (Firth, 1957).
Required reading: 3rd edition, Chapter 15 (PDF)
Optional reading: Schütze (1998)
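A minimal sketch of the distributional idea: represent each word by a vector of context counts and compare words with cosine similarity. The context counts are invented for illustration.

```python
import math
from collections import Counter

# Each word is a sparse vector of context counts (invented toy values).
vectors = {"dog": Counter({"barks": 4, "the": 2, "pet": 3}),
           "cat": Counter({"meows": 3, "the": 3, "pet": 3}),
           "parse": Counter({"tree": 5, "the": 1})}

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv)

print(cosine(vectors["dog"], vectors["cat"]))    # higher: many shared contexts
print(cosine(vectors["dog"], vectors["parse"]))  # lower: few shared contexts
```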
06 | 10/06 | 12 | Review for Midterm | 1up 4up | MP1 due. MP2 out (PDF, code and data)
07 | 10/11 | 13 | Word Sense Disambiguation | 1up 4up
07 | 10/12 | | Midterm (6:30pm, DCL 1320) | Solutions
Good luck!
07 | 10/13 | 14 | Formal grammars for English | 1up 4up
What is the structure of sentences, and how can we model it? Phrase-structure grammar and dependency grammar; review of basic English grammar and context-free grammars.
Required reading: Ch. 12.1-3, Ch. 12.7
08 | 10/18 | 15 | Formal grammars for English | 1up 4up
(Continuation of Lecture 14.)
Required reading: Ch. 12.1-3, Ch. 12.7
Optional reading: MS, Ch. 3, Woods (2010)
08 | 10/20 | 16 | (Probabilistic) Context-Free Grammar parsing | 1up 4up
How can we represent and deal with syntactic ambiguity?
Required reading: Ch. 13.1-4, Ch. 14.1
Optional reading: Chi (1999)
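A minimal sketch of the CKY recognizer for a grammar in Chomsky normal form; the toy grammar is invented for illustration.

```python
from collections import defaultdict

# Toy CNF grammar: binary rules A -> B C and lexical rules A -> w.
binary = {("NP", "VP"): ["S"], ("Det", "N"): ["NP"]}
lexical = {"the": ["Det"], "dog": ["N"], "barks": ["VP"]}

def cky(words):
    n = len(words)
    chart = defaultdict(set)            # chart[i, j] = labels spanning words[i:j]
    for i, w in enumerate(words):
        chart[i, i + 1].update(lexical.get(w, []))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):   # split point
                for b in chart[i, k]:
                    for c in chart[k, j]:
                        chart[i, j].update(binary.get((b, c), []))
    return "S" in chart[0, n]

print(cky("the dog barks".split()))  # True
print(cky("dog the barks".split()))  # False
```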
09 | 10/25 | 17 | Probabilistic Context-Free Grammars | 1up 4up
Algorithms for learning and parsing with PCFGs; treebanks and statistical parsing; going beyond simple PCFGs; Penn Treebank parsing.
Required reading: Ch. 14.1-7, Ch. 12.4
Optional reading: Collins' notes, Chi & Geman (1998), Schabes et al. (1993), Schabes & Pereira (1992), Stolcke (1995), Marcus et al. (1993), Collins (1997), Johnson (1998), Klein & Manning (2003), Petrov & Klein (2007), Hindle & Rooth
09 | 10/27 | 18 | Dependency parsing | 1up 4up | MP2 due. MP3 out
Dependency treebanks and parsing.
Required reading: McDonald & Nivre (2007)
Optional reading: Nivre & Scholz (2004), Kübler et al. (2009), Nivre (2010), McDonald & Nivre (2011)
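A minimal sketch of the arc-standard transition system used in Nivre-style parsers; a real parser predicts the transitions with a classifier, while here we simply replay a given (invented) gold sequence to show the mechanics.

```python
# Arc-standard transitions: SHIFT moves a word onto the stack;
# LEFT-ARC / RIGHT-ARC attach the top two stack items and pop the dependent.
def parse(words, transitions):
    stack, buffer, arcs = [], list(range(len(words))), []
    for action in transitions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":           # top is head of second-from-top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))    # (head, dependent)
        elif action == "RIGHT-ARC":          # second-from-top is head of top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

words = ["the", "dog", "barks"]
actions = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "LEFT-ARC"]
print(parse(words, actions))  # [(1, 0), (2, 1)]: dog -> the, barks -> dog
```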
10 | 11/01 | 19 | Feature structure grammars | 1up 4up
Feature structures and unification.
Required reading: Ch. 15.1-4
Optional reading: Abney (1997), Miyao & Tsujii (2008)
10 | 11/03 | 20 | Expressive Grammars | 1up 4up
Mildly context-sensitive grammars: tree-adjoining grammar and combinatory categorial grammar.
Required reading: Ch. 16.1, Ch. 16.3
Optional reading: Joshi and Schabes (1997), Steedman & Baldridge (2011), Schabes & Shieber, Schabes & Waters (1995), Bangalore & Joshi (1999), Hockenmaier & Steedman (2007), Clark & Curran (2007)
11 | 11/08 | 21 | Introduction to machine translation | 1up 4up
Why is MT difficult? Non-statistical approaches to MT (the Vauquois triangle).
Required reading: Ch. 25.1-4
Optional reading: Brown et al. (1990), Lopez (2008)
11 | 11/10 | 22 | Word Alignment | 1up 4up
The prerequisite for building a translation model.
Required reading: Ch. 25.5-6
Optional reading: Brown et al. (1990), Brown et al. (1993), Lopez (2008)
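A minimal sketch of IBM Model 1 EM on an invented two-sentence parallel corpus; real aligners also model NULL alignments, run on millions of sentence pairs, and combine both alignment directions.

```python
from collections import defaultdict

# Toy English-French parallel corpus (invented for illustration).
corpus = [(["the", "dog"], ["le", "chien"]),
          (["the", "cat"], ["le", "chat"])]

t = defaultdict(lambda: 0.5)             # uniform initialization of t(f | e)

for _ in range(10):                      # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for es, fs in corpus:                # E-step: expected alignment counts
        for f in fs:
            z = sum(t[(f, e)] for e in es)
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    for (f, e), c in count.items():      # M-step: renormalize
        t[(f, e)] = c / total[e]

print(round(t[("chien", "dog")], 3))  # high: "chien" aligns with "dog"
print(round(t[("chien", "the")], 3))  # low: explained away by "le"
```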
12 | 11/15 | 23 | Phrase-based Machine Translation | 1up 4up
Required reading: Ch. 25.4, 25.7-9
Optional reading: Koehn et al., Och & Ney (2004), Wu (1997), Chiang (2007)
Links: www.statmt.org
12 | 11/17 | 24 | No lecture | 1up 4up | MP3 due. MP4 out
13 | 11/29 | 25 | Lexical Semantics | 1up 4up
What is the meaning of a word, and how can we represent it?
Required reading: Ch. 19.1-4
Optional reading: Palmer et al. (2005), Gildea & Jurafsky (2002), Punyakanok et al. (2008)
Links: WordNet
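A short example of browsing WordNet from NLTK (run nltk.download('wordnet') once first); the query word is arbitrary.

```python
# List a few senses of an ambiguous word with their glosses.
from nltk.corpus import wordnet as wn

for syn in wn.synsets("bank")[:3]:
    print(syn.name(), "-", syn.definition())
```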
13 | 12/01 | 26 | Compositional Semantics | 1up 4up
What is the meaning of a sentence, and how can we represent it? Basic predicate logic and lambda calculus.
Required reading: Ch. 17.2-3
Optional reading: Blackburn & Bos (2003)
Links: Penn Lambda Calculator
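A minimal sketch of the compositional idea, with Python lambdas standing in for lambda-calculus terms; the tiny model (entities and predicates) is invented for illustration.

```python
# A toy model: two entities, one of which is a dog that barks.
entities = {"rex", "felix"}
dogs = {"rex"}
barkers = {"rex"}

# [[dog]] and [[barks]] are predicates; [[every]] is a generalized quantifier.
dog = lambda x: x in dogs
barks = lambda x: x in barkers
every = lambda p: lambda q: all(q(x) for x in entities if p(x))

# "Every dog barks" is built purely by function application.
print(every(dog)(barks))                   # True: the only dog, rex, barks
print(every(dog)(lambda x: x == "felix"))  # False
```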
14 | 12/06 | 27 | Reference resolution | 1up 4up
Referring expressions, anaphora resolution, coreference.
Required reading: Ch. 21.3-6
Optional reading: Reiter & Belz (2012), Stoyanov et al. (2009), Ng (2010)
14 | 12/08 | 28 | Neural approaches to NLP | 1up 4up
What does it take for a text to "make sense"?
Required reading: Ch. 21.1-2
Optional reading: Grosz et al. (1995), Poesio et al. (2004), Barzilay and Lapata (2008)
15 | 12/13 | 29 | Review for final exam | 1up 4up | MP4 due