CS546 is a graduate-level introduction to the statistical and machine learning methods used in natural language processing. Our main learning goal this year is to understand state-of-the-art neural architectures for NLP: Why have they been so transformative? How can you use them in your own research? What are their shortcomings? What comes next? We will therefore focus largely on neural (deep learning) approaches this year, but may also cover other kinds of approaches to place more recent work in context. Prerequisites are a basic understanding of NLP, probability, statistics, linear algebra, and machine learning, as well as solid programming skills. Students will learn to read the current literature and to apply these models to NLP problems. They will be required to complete a research project, give class presentations, and write critical reviews of relevant papers.
35% paper presentation
50% research project
10% paper reviews
5% class participation
01/22 | Introduction | Overview, Policies | ||
01/24 | More Introduction | Why neural methods in NLP? | ||
01/29 | Foundations | From neural language models to static word embeddings | ||
01/31 | Foundations | Static Word Embeddings | ||
02/05 | Foundations | Recurrent Nets in NLP | ||
02/07 | Foundations | Recurrent Nets in NLP | ||
02/14 | Foundations | Transformers | ||
02/19 | Foundations | Transformers; ELMo | ||
Vaswani et al., Attention Is All You Need, NIPS 2017 pdf | ||||
Peters et al., Deep Contextualized Word Representations, NAACL 2018 pdf | ||||
02/21 | Context-Dependent Word Embeddings | BERT, GPT-2, etc. | pptx | feedback |
Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019 pdf, | ||||
Liu et al., RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019 pdf, | ||||
Radford et al., Language Models are Unsupervised Multitask Learners, 2019 pdf, | ||||
Dai et al., Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, ACL 2019 pdf, ||||
Yang et al., XLNet: Generalized Autoregressive Pretraining for Language Understanding, NeurIPS 2019 pdf ||||
02/26 | Advanced Neural Architectures | Deep Learning for Graphs | feedback | |
Grover and Leskovec, node2vec: Scalable Feature Learning for Networks, KDD 2016 pdf, ||||
Lin et al., Learning Entity and Relation Embeddings for Knowledge Graph Completion, AAAI 2015 pdf, | ||||
Yang et al., Embedding Entities and Relations for Learning and Inference in Knowledge Bases, ICLR 2015 pdf, | ||||
Li et al., Gated Graph Sequence Neural Networks, ICLR 2016 pdf, ||||
Marcheggiani and Titov, Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling, EMNLP 2017 pdf | ||||
02/28 | Project Proposal Presentations | |||
03/04 | NLP Applications | Natural Language Generation | feedback | |
Gatt and Krahmer, Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation, JAIR 2018 pdf, ||||
Wen et al., Multi-domain Neural Network Language Generation for Spoken Dialogue Systems, NAACL 2016 pdf | ||||
03/06 | NLP Applications | Neural Summarization | feedback | |
Rush et al., A Neural Attention Model for Abstractive Sentence Summarization, EMNLP 2015 pdf, ||||
Nallapati et al., Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond, CoNLL 2016 pdf, ||||
See et al., Get to the Point: Summarization with Pointer-Generator Networks, ACL 2017 pdf, ||||
Gehrmann et al., Bottom-Up Abstractive Summarization, EMNLP 2018 pdf ||||
03/11 | NLP Applications | Neural Machine Translation | feedback | |
Luong et al., Effective Approaches to Attention-based Neural Machine Translation, EMNLP 2015 pdf | ||||
Sennrich et al., Neural Machine Translation of Rare Words with Subword Units, ACL 2016 pdf | ||||
Gehring et al., Convolutional Sequence to Sequence Learning, ICML 2017 pdf | ||||
Lample et al., Phrase-Based & Neural Unsupervised Machine Translation, EMNLP 2018 pdf | ||||
Zhou et al., Synchronous Bidirectional Neural Machine Translation, TACL 2019 pdf | ||||
03/13 | NLP Applications | Neural Sequence Labeling | feedback | |
Collobert et al., Natural Language Processing (Almost) from Scratch, JMLR 2011 pdf | ||||
Ma and Hovy, End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, ACL 2016 pdf | ||||
Lample et al., Neural Architectures for Named Entity Recognition, NAACL 2016 pdf | ||||
Yang et al., Design Challenges and Misconceptions in Neural Sequence Labeling, COLING 2018 pdf | ||||
03/25 | NLP Applications | Neural Structure Prediction | feedback | |
Chen and Manning, A Fast and Accurate Dependency Parser using Neural Networks, EMNLP 2014 pdf | ||||
Durrett and Klein, Neural CRF Parsing, ACL 2015 pdf | ||||
Gaddy et al., What's Going On in Neural Constituency Parsers? An Analysis, NAACL 2018 pdf | ||||
Kuncoro et al., What Do Recurrent Neural Network Grammars Learn About Syntax? EACL 2017 pdf | ||||
He et al., Deep Semantic Role Labeling: What Works and What's Next, ACL 2017 pdf | ||||
03/27 | NLP Applications | Neural Question Answering | feedback | |
Chen et al., A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, ACL 2016 pdf | ||||
Rajpurkar et al., SQuAD: 100,000+ Questions for Machine Comprehension of Text, EMNLP 2016 pdf | ||||
Rajpurkar et al., Know What You Don't Know: Unanswerable Questions for SQuAD, ACL 2018 pdf | ||||
Yang et al., HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, EMNLP 2018 pdf | ||||
Welbl et al., Constructing Datasets for Multi-hop Reading Comprehension Across Documents, TACL 2018 pdf | ||||
Mihaylov et al., Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, EMNLP 2018 pdf | ||||
04/01 | NLP Applications | More on Neural Question Answering | feedback | |
Seo et al., Bidirectional Attention Flow for Machine Comprehension, ICLR 2017 pdf | ||||
Weissenborn et al., Making Neural QA as Simple as Possible but not Simpler, CoNLL 2017 pdf | ||||
Wang et al., Gated Self-Matching Networks for Reading Comprehension and Question Answering, ACL 2017 pdf | ||||
Weston et al., Memory Networks, ICLR 2015 pdf | ||||
Sukhbaatar et al., End-to-End Memory Networks, NIPS 2015 pdf | ||||
04/03 | NLP Applications | Multimodal NLP | feedback | |
04/08 | NLP Applications | Neural Dialogue | feedback | |
04/10 | NLP Applications | More on Neural Dialogue | feedback | |
04/15 | Project Update Presentations | |||
04/17 | Advanced Neural Architectures | Intro to GANs in NLP | feedback | |
04/22 | Advanced Neural Architectures | More on GANs in NLP | feedback | |
04/24 | Training Regimes | Reinforcement Learning in NLP | feedback | |
04/29 | (TBC) Training Regimes | (TBC) More on Reinforcement Learning in NLP | ||
05/01 | (TBC) Training Regimes | (TBC) Deep Learning in Low Resource Settings | feedback | |
05/07 | Final Project Presentations | |||