ECE 417 - Multimedia Signal Processing
Last offered Fall 2023
Subject Area
- Signal Processing
Description
Lab Projects
Machine problems are the main educational tool in ECE 417; there are seven of them, each covering a different aspect of video and audio synthesis and understanding.
Lab Equipment
None
Topical Prerequisites
Prerequisite: ECE 313 and ECE 310.
Texts
Distributed on the course web page
References
Distributed on the course web page
Required, Elective, or Selected Elective
Elective
Course Goals
The goal of the course is to prepare students for industry positions in the emerging field of multimedia and for graduate study in signal processing. Through a set of carefully designed machine problems, students learn important tools in audio-visual signal processing, analysis, and synthesis, and their applications to biometrics, human-computer interaction, and multimedia indexing and search.
Instructional Objectives
After Machine Problem 1 (MP1), Week 3 of the semester, the students should be able to analyze and synthesize speech signals using a multiband excitation (a sub-band-filtered combination of pulse-train and white-noise signals) and a linear predictive coding (LPC) model of the spectrum (1,6).
After MP2, Week 5 of the semester, the students should be able to understand principal component analysis and linear discriminant analysis, and their applications to face recognition (1,6).
After MP3, Week 7 of the semester, the students should be able to understand maximum likelihood (ML) classifiers, Gaussian mixture models, and multimodal fusion, and their applications to audio-visual person identification (1,6).
After MP4, Week 9 of the semester, the students should be able to understand hidden Markov models (HMMs), including algorithms for learning, inference, and decoding, and their application to audio-visual speech recognition (1,6).
After MP5, Week 11 of the semester, the students should be able to train and test a convolutional neural network for face detection, and visualize the receptive fields learned by the network kernels (1,6).
After MP6, Week 13 of the semester, the students should be able to understand 3D face modeling and animation, and their applications to speech-driven lip movement in an audio-visual avatar (synthetic talking head) (1,6).
After MP7, Week 15 of the semester, the students should be able to understand deep canonical correlation analysis (DCCA) for audio-visual event detection (1,6).
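As an illustration of the linear predictive model named in the MP1 objective, the following is a minimal sketch of LPC analysis by the autocorrelation method, assuming NumPy is available. It is not taken from the course materials: the function names `lpc_coefficients` and `synthesize`, the test coefficients, and the white-noise excitation are all hypothetical choices made for the example.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate LPC predictor coefficients with the autocorrelation method."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Normal equations R a = r[1:], where R is the Toeplitz matrix of r[0..order-1].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def synthesize(excitation, a):
    """Run the all-pole filter s[n] = e[n] + sum_k a[k] * s[n-1-k]."""
    s = np.zeros(len(excitation))
    for n in range(len(excitation)):
        past = sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        s[n] = excitation[n] + past
    return s

# Hypothetical example: drive a known, stable 2nd-order all-pole filter
# with white noise, then recover its coefficients from the output.
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.6])
signal = synthesize(rng.standard_normal(4000), true_a)
est_a = lpc_coefficients(signal, order=2)
```

With a long enough signal, `est_a` should land close to `true_a`. Replacing the white-noise excitation with a pulse train would give a voiced-sounding source for the same filter.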
Title | Section | CRN | Type | Hours | Times | Days | Location | Instructor |
---|---|---|---|---|---|---|---|---|
Multimedia Signal Processing | A | 68112 | LEC | 4 | 0930 - 1050 | T R | 2013 Electrical & Computer Eng Bldg | Mark Hasegawa-Johnson |
Multimedia Signal Processing | ONL | 75645 | OLC | 4 | 0930 - 1050 | T R | | Mark Hasegawa-Johnson |