Overview

CS546 gives a graduate-level introduction to the statistical and machine learning methods used in natural language processing. Our main learning goal this year is to understand state-of-the-art neural architectures for NLP. Why have they been so transformative? How can you use them in your own research? What are their shortcomings? What comes next? We will therefore focus largely on neural (deep learning) approaches this year, but may also cover other kinds of approaches to place more recent work in context. Prerequisites are a basic understanding of NLP, probability, statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to complete a research project, give class presentations, and write critical reviews of relevant papers.

Grading

35% paper presentation
50% research project
10% paper reviews
5% class participation

Schedule

01/22 Introduction Overview, Policies pdf
01/24 More Introduction Why neural methods in NLP? pdf
01/29 Foundations From neural language models to static word embeddings pdf
01/31 Foundations Static Word Embeddings pdf
02/05 Foundations Recurrent Nets in NLP pdf
02/07 Foundations Recurrent Nets in NLP pdf
02/14 Foundations Transformers pdf
02/19 Foundations Transformers; ELMo pdf
Vaswani et al., Attention is all you need, NIPS 2017 pdf
Peters et al., Deep Contextualized Word Representations, NAACL 2018 pdf
02/21 Context-Dependent Word Embeddings BERT, GPT-2, etc. pptx feedback
Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019 pdf,
Liu et al., RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019 pdf,
Radford et al., Language Models are Unsupervised Multitask Learners, 2019 pdf,
Dai et al., Transformer-XL: Attentive Language Modeling beyond a Fixed-Length Context, ACL 2019 pdf,
Yang et al., XLNet: Generalized Autoregressive Pretraining for Language Understanding, NeurIPS 2019 pdf
02/26 Advanced Neural Architectures Deep Learning for Graphs pdf feedback
Grover and Leskovec, node2vec: Scalable Feature Learning for Networks, KDD 2016 pdf,
Lin et al., Learning Entity and Relation Embeddings for Knowledge Graph Completion, AAAI 2015 pdf,
Yang et al., Embedding Entities and Relations for Learning and Inference in Knowledge Bases, ICLR 2015 pdf,
Li et al., Gated Graph Sequence Neural Networks, ICLR 2016 pdf,
Marcheggiani and Titov, Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling, EMNLP 2017 pdf
02/28 Project Proposal Presentations pdf
03/04 NLP Applications Natural Language Generation pdf feedback
Gatt and Krahmer, Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation, JAIR 2018 pdf,
Wen et al., Multi-domain Neural Network Language Generation for Spoken Dialogue Systems, NAACL 2016 pdf
03/06 NLP Applications Neural Summarization pdf feedback
Rush et al., A Neural Attention Model for Abstractive Sentence Summarization, EMNLP 2015 pdf,
Nallapati et al., Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond, CoNLL 2016 pdf,
See et al., Get To The Point: Summarization with Pointer-Generator Networks, ACL 2017 pdf,
Gehrmann et al., Bottom-Up Abstractive Summarization, EMNLP 2018 pdf
03/11 NLP Applications Neural Machine Translation pdf feedback
Luong et al., Effective Approaches to Attention-based Neural Machine Translation, EMNLP 2015 pdf
Sennrich et al., Neural Machine Translation of Rare Words with Subword Units, ACL 2016 pdf
Gehring et al., Convolutional Sequence to Sequence Learning, ICML 2017 pdf
Lample et al., Phrase-Based & Neural Unsupervised Machine Translation, EMNLP 2018 pdf
Zhou et al., Synchronous Bidirectional Neural Machine Translation, TACL 2019 pdf
03/13 NLP Applications Neural Sequence Labeling pdf feedback
Collobert et al., Natural Language Processing (Almost) from Scratch, JMLR 2011 pdf
Ma and Hovy, End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, ACL 2016 pdf
Lample et al., Neural Architectures for Named Entity Recognition, NAACL 2016 pdf
Yang et al., Design Challenges and Misconceptions in Neural Sequence Labeling, COLING 2018 pdf
03/25 NLP Applications Neural Structure Prediction pdf feedback
Chen and Manning, A Fast and Accurate Dependency Parser using Neural Networks, EMNLP 2014 pdf
Durrett and Klein, Neural CRF Parsing, ACL 2015 pdf
Gaddy et al., What's Going On in Neural Constituency Parsers? An Analysis, NAACL 2018 pdf
Kuncoro et al., What Do Recurrent Neural Network Grammars Learn About Syntax? EACL 2017 pdf
He et al., Deep Semantic Role Labeling: What Works and What's Next, ACL 2017 pdf
03/27 NLP Applications Neural Question Answering pdf feedback
Chen et al., A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, ACL 2016 pdf
Rajpurkar et al., SQuAD: 100,000+ Questions for Machine Comprehension of Text, EMNLP 2016 pdf
Rajpurkar et al., Know What You Don't Know: Unanswerable Questions for SQuAD, ACL 2018 pdf
Yang et al., HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, EMNLP 2018 pdf
Welbl et al., Constructing Datasets for Multi-hop Reading Comprehension Across Documents, TACL 2018 pdf
Mihaylov et al., Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, EMNLP 2018 pdf
04/01 NLP Applications More on Neural Question Answering pdf feedback
Seo et al., Bidirectional Attention Flow for Machine Comprehension, ICLR 2017 pdf
Weissenborn et al., Making Neural QA as Simple as Possible but not Simpler, CoNLL 2017 pdf
Wang et al., Gated Self-Matching Networks for Reading Comprehension and Question Answering, ACL 2017 pdf
Weston et al., Memory Networks, 2015 pdf
Sukhbaatar et al., End-To-End Memory Networks, NIPS 2015 pdf
04/03 NLP Applications Multimodal NLP pdf feedback
04/08 NLP Applications Neural Dialogue pdf feedback
04/10 NLP Applications More on Neural Dialogue pdf feedback
04/15 Project Update Presentations pdf
04/17 Advanced Neural Architectures Intro to GANs in NLP pdf feedback
04/22 Advanced Neural Architectures More on GANs in NLP pdf feedback
04/24 Training Regimes Reinforcement Learning in NLP pdf feedback
04/29 (TBC) Training Regimes (TBC) More on Reinforcement Learning in NLP pdf
05/01 (TBC) Training Regimes (TBC) Deep Learning in Low Resource Settings feedback pdf
05/07 Final Project Presentations pdf