ECE 537 Speech Processing Fundamentals, Fall 2024

Development of an intuitive understanding of speech processing by the auditory system, in three parts. (I) The theory of acoustics of speech production, introductory acoustic phonetics, inhomogeneous transmission line theory (and reflectance), room acoustics, the short-time Fourier transform (and its inverse), and signal processing of speech (LPC, CELP, VQ). (II) Psychoacoustics of speech perception, critical bands, masking (JNDs), and the physiology of the auditory pathway (cochlear modeling). (III) Information theory, entropy, channel capacity, the confusion matrix, state models, EM algorithms, and Bayesian networks. Student groups present classic papers on speech processing and speech perception.

Pre-Requisites: ECE 310, 401, or equivalent; ECE 313, CS 361, STAT 410 or equivalent; PyTorch

Instructor: Mark Hasegawa-Johnson (jhasegaw)

On-line Tools

  • CampusWire will be used for on-line question answering. If you need the code to enter this site, send an e-mail to the course instructor.

  • Homework will be submitted weekly, on paper in class, and returned to you one week later.

  • There will be two one-hour midterm exams, each containing questions similar in style to the homework questions.