ECE 537 Speech Processing Fundamentals, Fall 2022

Development of an intuitive understanding of speech processing by the auditory system, in three parts. I) The theory of the acoustics of speech production, introductory acoustic phonetics, inhomogeneous transmission line theory (and reflectance), room acoustics, the short-time Fourier transform (and its inverse), and signal processing of speech (LPC, CELP, VQ). II) Psychoacoustics of speech perception, critical bands, masking (JNDs), and the physiology of the auditory pathway (cochlear modeling). III) Information-theoretic entropy, channel capacity, the confusion matrix, state models, EM algorithms, and Bayesian networks. Student groups present classic papers on speech processing and speech perception.

Pre-Requisites: ECE 310, 401, or equivalent; ECE 313, CS 361, STAT 410 or equivalent; PyTorch

Instructor: Mark Hasegawa-Johnson (jhasegaw)

Teaching Assistant: Dufei Wu (dufeiwu2)

On-line Tools

  • CampusWire will be used for on-line question answering. If you need the code to enter this site, send an e-mail to the course instructors.

  • Gradescope will be used to submit all homework and to grade exams. The code to enter this site will be posted on CampusWire.

  • MediaSpace will host lecture videos, usually by the day after each lecture.

  • Synchronous lectures are available on Zoom. The URL will be posted on CampusWire.