ECE 537 Speech Processing Fundamentals, Fall 2024

Development of an intuitive understanding of speech processing by the auditory system, in three parts. Part I: the theory of the acoustics of speech production, introductory acoustic phonetics, inhomogeneous transmission line theory (and reflectance), room acoustics, the short-time Fourier transform (and its inverse), and signal processing of speech (LPC, CELP, VQ). Part II: the psychoacoustics of speech perception, critical bands, masking (JNDs), and the physiology of the auditory pathway (cochlear modeling). Part III: information theory, entropy, channel capacity, the confusion matrix, state models, EM algorithms, and Bayesian networks. Student groups present classic papers on speech processing and speech perception.

Prerequisites: Signal Processing (ECE 310, 401, or equivalent) and Random Variables (ECE 313, CS 361, STAT 410, or equivalent)

Instructor: Mark Hasegawa-Johnson (jhasegaw)

Textbook: Theory and Applications of Digital Speech Processing, 1st Edition by Rabiner and Schafer

On-line Tools

  • CampusWire will be used for on-line question answering. If you need the code to join the site, send an e-mail to the course instructor.

  • Gradescope will be used for homework, exams, project proposal, and project writeup.

  • Mediaspace has videos of past lectures.

  • There will be two mid-terms. Instead of a final exam, there will be a final project.