CS 410 - Text Information Systems

Last offered Fall 2024

Official Description

Theory, design, and implementation of text-based information systems. Text analysis, retrieval models (e.g., Boolean, vector space, probabilistic), text categorization, text filtering, clustering, retrieval system design and implementation, and applications to web information management. Course Information: 3 undergraduate hours. 3 or 4 graduate hours. Prerequisite: CS 225.

Text(s)

ChengXiang Zhai and Sean Massung. 2016. Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining. Association for Computing Machinery and Morgan & Claypool, New York, NY, USA. https://dl.acm.org/citation.cfm?id=2915031

Learning Goals

Be able to explain the basic concepts and principles of text information systems, such as push vs. pull information access, the probability ranking principle, and generative models. (1), (2), (6)
Be able to explain how key algorithms for information retrieval, Web search, and text data mining work, and compare them. (1), (2), (6)
Be able to evaluate a text information system using various metrics such as precision, recall, mean average precision, and nDCG. (2), (6)
Be able to modify a search engine to improve accuracy. (1), (2), (6)
Be able to apply knowledge of text information systems to solve a real-world problem with a course project. (1), (2), (3), (5), (6)

Topic List

Background & General Introduction
Information Retrieval Models
Evaluation
Web search
Recommender systems
Text data mining
Course Project
Assignments
| Title | Section | CRN | Type | Hours | Times | Days | Location | Instructor |
|---|---|---|---|---|---|---|---|---|
| Text Information Systems | CSP | 79900 | PKG | 3 | - | | | Pablo D Robles Granda |
| Text Information Systems | CSP | 79900 | PKG | 3 | 1400 - 1515 | F | ARR Illini Center | Pablo D Robles Granda |
| Text Information Systems | DSO | 67393 | ONL | 4 | - | | | ChengXiang Zhai |
| Text Information Systems | MC3 | 79328 | PKG | 3 | 1400 - 1515 | F | ARR Illini Center | |
| Text Information Systems | MC3 | 79328 | PKG | 3 | - | | | |
| Text Information Systems | MC4 | 71013 | PKG | 4 | - | | | Pablo D Robles Granda |
| Text Information Systems | MC4 | 71013 | PKG | 4 | 1400 - 1515 | F | ARR Illini Center | Pablo D Robles Granda |
| Text Information Systems | TGR | 78821 | LEC | 3 | 0930 - 1045 | T R | 100 Gregory Hall | Pablo D Robles Granda |
| Text Information Systems | TUG | 78820 | LEC | 3 | 0930 - 1045 | T R | 100 Gregory Hall | Pablo D Robles Granda |