CS546 is a graduate-level introduction to the statistical and machine learning methods used in natural language processing. Our main learning goal this year is to understand state-of-the-art neural architectures for NLP: Why have they been so transformative? How can you use them in your own research? What are their shortcomings? What comes next? We will therefore focus largely on neural (deep learning) approaches this year, but may also cover other kinds of approaches to place more recent work in context. Prerequisites are a basic understanding of NLP, probability, statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to complete a research project, give class presentations, and write critical reviews of relevant papers.
35% paper presentation
50% research project
10% paper reviews
5% class participation
01/22 | Introduction | Overview, Policies | ||
01/24 | More Introduction | Why neural methods in NLP? | ||
01/29 | Foundations | From neural language models to static word embeddings | ||
01/31 | Foundations | Static Word Embeddings | ||
02/05 | Foundations | Recurrent Nets in NLP | ||
02/07 | Foundations | Recurrent Nets in NLP | ||
02/14 | Foundations | Transformers | ||
02/19 | Foundations | Transformers; ELMo | ||
Vaswani et al., Attention Is All You Need, NIPS 2017 pdf | ||||
Peters et al., Deep Contextualized Word Representations, NAACL 2018 pdf | ||||
02/21 | Context-Dependent Word Embeddings | BERT, GPT-2, etc. | pptx | feedback |
Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019 pdf, | ||||
Liu et al., RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019 pdf, | ||||
Radford et al., Language Models are Unsupervised Multitask Learners, 2019 pdf, | ||||
Dai et al., Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, ACL 2019 pdf, ||||
Yang et al., XLNet: Generalized Autoregressive Pretraining for Language Understanding, NeurIPS 2019 pdf ||||
02/26 | Advanced Neural Architectures | Deep Learning for Graphs | feedback | |
Grover and Leskovec, node2vec: Scalable Feature Learning for Networks, KDD 2016 pdf, ||||
Lin et al., Learning Entity and Relation Embeddings for Knowledge Graph Completion, AAAI 2015 pdf, | ||||
Yang et al., Embedding Entities and Relations for Learning and Inference in Knowledge Bases, ICLR 2015 pdf, | ||||
Li et al., Gated Graph Sequence Neural Networks, ICLR 2016 pdf, ||||
Marcheggiani and Titov, Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling, EMNLP 2017 pdf | ||||
02/28 | Project Proposal Presentations | |||
03/04 | NLP Applications | Natural Language Generation | feedback | |
Gatt and Krahmer, Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation, JAIR 2018 pdf, ||||
Wen et al., Multi-domain Neural Network Language Generation for Spoken Dialogue Systems, NAACL 2016 pdf | ||||
03/06 | NLP Applications | Neural Summarization | feedback | |
Rush et al., A Neural Attention Model for Abstractive Sentence Summarization, EMNLP 2015 pdf, ||||
Nallapati et al., Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond, CoNLL 2016 pdf, ||||
See et al., Get to the Point: Summarization with Pointer-Generator Networks, ACL 2017 pdf, ||||
Gehrmann et al., Bottom-Up Abstractive Summarization, EMNLP 2018 pdf ||||
03/11 | NLP Applications | Neural Machine Translation | feedback | |
Luong et al., Effective Approaches to Attention-based Neural Machine Translation, EMNLP 2015 pdf | ||||
Sennrich et al., Neural Machine Translation of Rare Words with Subword Units, ACL 2016 pdf | ||||
Gehring et al., Convolutional Sequence to Sequence Learning, ICML 2017 pdf | ||||
Lample et al., Phrase-Based & Neural Unsupervised Machine Translation, EMNLP 2018 pdf | ||||
Zhou et al., Synchronous Bidirectional Neural Machine Translation, TACL 2019 pdf | ||||
03/13 | NLP Applications | Neural Sequence Labeling | feedback | |
Collobert et al., Natural Language Processing (Almost) from Scratch, JMLR 2011 pdf | ||||
Ma and Hovy, End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, ACL 2016 pdf | ||||
Lample et al., Neural Architectures for Named Entity Recognition, NAACL 2016 pdf | ||||
Yang et al., Design Challenges and Misconceptions in Neural Sequence Labeling, COLING 2018 pdf | ||||
03/25 | NLP Applications | Neural Structure Prediction | feedback | |
Chen and Manning, A Fast and Accurate Dependency Parser using Neural Networks, EMNLP 2014 pdf | ||||
Durrett and Klein, Neural CRF Parsing, ACL 2015 pdf | ||||
Gaddy et al., What's Going On in Neural Constituency Parsers? An Analysis, NAACL 2018 pdf | ||||
Kuncoro et al., What Do Recurrent Neural Network Grammars Learn About Syntax? EACL 2017 pdf | ||||
He et al., Deep Semantic Role Labeling: What Works and What's Next, ACL 2017 pdf | ||||
03/27 | NLP Applications | Neural Question Answering | feedback | |
Chen et al., A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, ACL 2016 pdf | ||||
Rajpurkar et al., SQuAD: 100,000+ Questions for Machine Comprehension of Text, EMNLP 2016 pdf | ||||
Rajpurkar et al., Know What You Don't Know: Unanswerable Questions for SQuAD, ACL 2018 pdf | ||||
Yang et al., HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, EMNLP 2018 pdf | ||||
Welbl et al., Constructing Datasets for Multi-hop Reading Comprehension Across Documents, TACL 2018 pdf | ||||
Mihaylov et al., Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, EMNLP 2018 pdf | ||||
04/01 | NLP Applications | More on Neural Question Answering | feedback | |
Seo et al., Bidirectional Attention Flow for Machine Comprehension, ICLR 2017 pdf | ||||
Weissenborn et al., Making Neural QA as Simple as Possible but not Simpler, CoNLL 2017 pdf | ||||
Wang et al., Gated Self-Matching Networks for Reading Comprehension and Question Answering, ACL 2017 pdf | ||||
Weston et al., Memory Networks, ICLR 2015 pdf | ||||
Sukhbaatar et al., End-to-End Memory Networks, NIPS 2015 pdf | ||||
04/03 | NLP Applications | Multimodal NLP | feedback | |
04/08 | NLP Applications | Neural Dialogue | feedback | |
04/10 | NLP Applications | More on Neural Dialogue | feedback | |
04/15 | Project Update Presentations | |||
04/17 | Advanced Neural Architectures | Intro to GANs in NLP | feedback | |
04/22 | Advanced Neural Architectures | More on GANs in NLP | feedback | |
04/24 | Training Regimes | Reinforcement Learning in NLP | feedback | |
04/29 | (TBC) Training Regimes | (TBC) More on Reinforcement Learning in NLP | ||
05/01 | (TBC) Training Regimes | (TBC) Deep Learning in Low Resource Settings | feedback | |
05/07 | Final Project Presentations | |||