Stat 428: Statistical Computing
Fall 2023
Instructor:
Shulei Wang (shuleiw at illinois dot edu)
TA:
Zhiyu Wang (zhiyuw6 at illinois dot edu)
Course Website: Canvas
Office Hours:
Monday 4:00pm-5:00pm and Tuesday 10:00am-11:00am (CST) by Zhiyu Wang
Monday and Wednesday 10:00am-11:00am (CST) by Shulei Wang
Course Overview
Computational data analysis is an essential part of modern statistics. This course helps students develop the programming skills and computational tools needed for real data analysis. Students will learn the core ideas of statistically oriented programming and how to debug and test their code. The course also covers the design of stochastic simulation experiments, Monte Carlo methods, resampling methods, statistical model fitting, manipulation of large data sets, and basic data visualization. It balances training in programming skills with real data analysis through worked examples. We will use the statistical computing software R in class and on homework assignments. An extensive R programming background is not required.
Textbook
The Art of R Programming: A Tour of Statistical Software Design, by Norman Matloff (Required)
Statistical Computing with R, by Maria L. Rizzo (Required)
R for Data Science, by Hadley Wickham & Garrett Grolemund (Required, link)
R Cookbook, by Paul Teetor (Optional, link)
R Graphics Cookbook, by Winston Chang (Optional, link)
R Markdown Cookbook, by Yihui Xie, Christophe Dervieux & Emily Riederer (Optional, link)
Advanced R, by Hadley Wickham (Optional, link)
ggplot2: Elegant Graphics for Data Analysis, by Hadley Wickham (Optional, link)
Fundamentals of Data Visualization, by Claus O. Wilke (Optional, link)
R Packages, by Hadley Wickham & Jenny Bryan (Optional, link)
Topic Outline
R and data structures
R Markdown, indexing, and iteration
Data frames and vectorization
Functions and object-oriented design
Testing and debugging
Plotting
Random variable generation
Monte Carlo integration
Simulation
Bootstrap and Jackknife
Permutation test
MCMC
Tidyverse
R package development
Software
We will use R and RStudio (preferred); both are free to download. (Getting started with R and RStudio)
Grading
Bi-weekly homework assignments (50%): The lowest score will be dropped.
Take-home Midterm (20%)
Final project (30%)