ECE 594: Mathematical Models of Language

Contact Information

Instructor: Professor Suma Bhat

Email: spbhat2 [at] illinois [dot] edu
Office Hours: By appointment through email 

Logistics

Textbook: There is no required textbook. Assigned readings for each week will be listed in the schedule.

Project: This graduate course can be taken for 3 or 4 credits. Students who have signed up for 4 credits will complete a semester-long project, due at the end of the semester. Some project ideas will be posted on the course page, but I encourage you to come up with ideas of your own. Feel free to schedule a time to meet with me to discuss your project proposal.

Grading:

Your grade in this course will be determined based on the following components.

1. Class participation (20%): lecture notes, in-lecture discussion, including sharing insights based on readings.

2. Presentation (40%): Each week will have the following structure.

  • Tuesday --- lecture (Prof. Bhat)
  • Thursday --- student presentation of a paper from a list of papers (beginning the second week of classes)
     
    • You are welcome to suggest other interesting papers that you think should be discussed.
       
    • The papers for each week will be finalized in the first week of classes, and a sign-up sheet will be created for students to pick a paper/presentation slot. 
       
    • Each student will present at least two papers during the course. If there are extra slots, you are welcome to use them, but they carry no extra credit.
       
    • Two students will present the assigned papers each week, each taking roughly half the class time.
       
    • The presenters will receive peer feedback (from reviewers self-selected via sign-up) on content, delivery, clarity, and completeness. A template will be provided.

3. Peer review (10%): Feedback to presenters, provided as a ~2-page document due by the end of Friday of that week.

4. Machine problems (30%): Three to five machine problems.

5. Project: A semester-long project (and a final paper as a report) is expected from students who have signed up for 4 credits. There are several possibilities for a project, some of which are listed below.

  • A well-crafted survey paper on a chosen topic (at least 10 papers).

  • Pick an NLP task, analyze existing techniques, and write about your findings. You can pick ~3 papers with published code (or implement the algorithms yourself) and carry out a thorough analysis.

  • Write a research paper, similar to a conference submission. You may work in a team of up to 2 students, with a clear division of tasks and equal contribution from each member. Depending on the nature of the problem and the results, there is potential to turn this into a conference/workshop submission (e.g., EMNLP 2022).

General References

We will be reading chapters from some of the books listed below; others are included for further reading on related topics.

  • [JM_2] Daniel Jurafsky and James Martin, Speech and Language Processing, 2nd Ed. [link] (Free)

  • [JM_3] Daniel Jurafsky and James Martin, Speech and Language Processing, 3rd Ed. [link] (Free)

  • [JE] Jacob Eisenstein, Natural Language Processing [link] (Free)

  • [YG] Yoav Goldberg, Neural Network Methods for Natural Language Processing [link] (Free)

  • [GBC] Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning [link]

  • [HTF] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Elements of Statistical Learning (2nd Edition). Springer Series in Statistics. Springer New York Inc., New York, NY, USA, 2008. [link] (Free)

  • The book Natural Language Processing with Python shows how to do hands-on work with Python's Natural Language Toolkit (NLTK), and also brings a strong linguistic perspective.
  • [EB] Emily Bender, Linguistic Fundamentals for Natural Language Processing