| Date | Lecture (and links to advance videos when applicable) | Links and Notes | Reading Suggestions | Comments | |
| Intro and Basics | 1/20 | Introduction and Logistics | Intro & logistics | Slide set | |
| 1/22 | Edge AI and Mitigating Resource Bottlenecks | Bottlenecks | Slide set | ||
| 1/27 | Overfitting and Self-supervised Learning | Group requests due | |||
| 1/29 | Edge AI and Mitigating the Data Bottleneck | All groups assigned | |||
| The Data Bottleneck: Self-Supervised Data-Efficient Learning for IoT | 2/3 | Class Project Ideas Introduction | Projects | Slide set | |
| 2/5 | Fundamentals of Self-Supervised Learning: Tokenization, Pre-training, Fine-tuning, Backbone Architectures (e.g., auto-encoders, transformers), and Issues with Scaling Laws for IoT Applications | Self-Supervised Learning (SSL) | Slide set | ||
| 2/10 | Self-supervised Learning Architectures for Time-Series Data: RNNs, LSTMs, and State Space Models | SSL Models for Time-Series Data | 1. Schmidt, Robin M. "Recurrent neural networks (rnns): A gentle introduction and overview." arXiv preprint arXiv:1912.05911 (2019). 2. Kexin Zhang, Qingsong Wen, Chaoli Zhang, Rongyao Cai, Ming Jin, Yong Liu, James Y. Zhang et al. "Self-supervised learning for time series analysis: Taxonomy, progress, and prospects." IEEE transactions on pattern analysis and machine intelligence 46, no. 10 (2024): 6775-6794. 3. Albert Gu, Karan Goel, and Christopher Ré. "Efficiently modeling long sequences with structured state spaces." arXiv preprint arXiv:2111.00396 (2021). 4. Albert Gu, and Tri Dao. "Mamba: Linear-time sequence modeling with selective state spaces." In First conference on language modeling. 2024. |
Note: Project title and abstract due | |
| 2/12 | Representation Learning from Multimodal Sensor Data |
Multimodal Intro HW1 Out |
1. Chao Zhang, Zichao Yang, Xiaodong He, and Li Deng. "Multimodal intelligence: Representation learning, information fusion, and applications." Journal of Selected Topics in Signal Processing, 2020. 2. Dave Vedant, Fotios Lygerakis, and Elmar Rueckert. "Multimodal visual-tactile representation learning through self-supervised contrastive pre-training." In 2024 IEEE ICRA, pp. 8013-8020. IEEE, 2024. 3. Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, and Cordelia Schmid. "Videobert: A joint model for video and language representation learning." In Proceedings of the IEEE/CVF international conference on computer vision, pp. 7464-7473. 2019. |
||
| 2/17 | Representation Learning from Multimodal Sensor Data (Student Led) |
G4 Multimodal Papers |
1. Shengzhong Liu, Tomoyoshi Kimura, Dongxin Liu, Ruijie Wang, Jinyang Li, Suhas Diggavi, Mani Srivastava, and Tarek Abdelzaher. "Focal: Contrastive learning for multimodal time-series sensing signals in factorized orthogonal latent space." Advances in Neural Information Processing Systems 36 (2023): 47309-47338. 2. Chen, Yatong, Chenzhi Hu, Tomoyoshi Kimura, Qinya Li, Shengzhong Liu, Fan Wu, and Guihai Chen. "SemiCMT: Contrastive cross-modal knowledge transfer for iot sensing with semi-paired multi-modal signals." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, no. 4 (2024): 1-30. 3. Xiaomin Ouyang, Jason Wu, Tomoyoshi Kimura, Yihan Lin, Gunjan Verma, Tarek Abdelzaher, and Mani Srivastava. "MMbind: Unleashing the potential of distributed and heterogeneous data for multimodal learning in iot." In Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems, pp. 491-503. 2025. 4. Tomoyoshi Kimura, Xinlin Li, Osama Hanna, Yatong Chen, Yizhuo Chen, Denizhan Kara, Tianshi Wang et al. "InfoMAE: Pair-efficient cross-modal alignment for multimodal time-series sensing signals." In Proceedings of the ACM on Web Conference 2025, pp. 3084-3095. 2025. 5. Li, Zechen, Shohreh Deldari, Linyao Chen, Hao Xue, and Flora D. Salim. "SensorLLM: Aligning large language models with motion sensors for human activity recognition." In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 354-379. 2025. 6. Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally et al. "SensorLM: Learning the Language of Wearable Sensors." arXiv preprint arXiv:2506.09108 (2025). |
Debate #1 (20 min) Student-led talk (45 min + 10 min Q&A) See note @48 on Piazza for debate concluding remarks. |
|
| 2/19 | Self-supervised Learning from Frequency Domain Data |
Frequency Domain Intro HW2 Out |
Slide set | Debate #2 (20 min) | |
| 2/24 | Self-supervised Learning from Frequency Domain Data (Student Led) | G3 | 1. Shuochao Yao, Ailing Piao, Wenjun Jiang, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Jinyang Li, Tianshi Wang, Shaohan Hu, Lu Su, Jiawei Han and Tarek Abdelzaher, "STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks," In Proc. The Web Conference (WWW), San Francisco, CA, May 2019.
2. Dongxin Liu, Tianshi Wang, Shengzhong Liu, Ruijie Wang, Shuochao Yao, and Tarek Abdelzaher. "Contrastive self-supervised representation learning for sensing signals from the time-frequency perspective." In 2021 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2021. 3. Yuan Gong, Cheng-I. Lai, Yu-An Chung, and James Glass. "Ssast: Self-supervised audio spectrogram transformer." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 10, pp. 10699-10709. 2022. 4. Setareh Rahimi Taghanaki, Michael Rainbow, and Ali Etemad. "Self-supervised human activity recognition with localized time-frequency contrastive representation learning." IEEE Transactions on Human-Machine Systems 53, no. 6 (2023): 1027-1037. 5. Denizhan Kara, Shengzhong Liu, Jinyang Li, Dongxin Liu, Tianshi Wang, Ruijie Wang, Yizhuo Chen, Yigong Hu, Tarek Abdelzaher, "FreqMAE: Frequency-Aware Masked Autoencoder for Multi-Modal IoT Sensing," In Proc. The Web Conference (WWW), May 2024. 6. Denizhan Kara, Tomoyoshi Kimura, Yatong Chen, Jinyang Li, Ruijie Wang, Yizhuo Chen, Tianshi Wang, Shengzhong Liu, Lance Kaplan, Joydeep Bhattacharyya, Tarek Abdelzaher, "PhyMask: An Adaptive Masking Paradigm for Efficient Self-Supervised Learning in IoT," In Proc. 22nd ACM Conference on Embedded Networked Sensor Systems (SenSys), Hangzhou, China, November 2024. |
HW2 Debate (20 min) Student-led talk (45 min + 10 min Q&A) |
|
| 2/26 | Handling Spatial-Temporal IoT Data | HW3 Out | 2-page project proposal due | ||
| 3/3 | Handling Spatial-Temporal IoT Data (Student Led) | G2 |
|
HW3 Debate (20 min) Student-led talk (45 min + 10 min Q&A) |
|
| Data Curation and "Faking" | 3/5 | Physical Data Curation and Augmentation | HW4 Out | ||
| 3/10 | Physical Data Curation and Augmentation (Student Led) | G6 |
HW4 Debate (20 min) Student-led talk (45 min + 10 min Q&A) |
||
| 3/12 | Project Elevator Talks | ||||
| Break | 3/17 | Spring Break |
|||
| 3/19 | |||||
| The Compute Bottleneck: Efficient Inference at the IoT Edge | 3/24 | Input Data Filtering |
|
||
| 3/26 | Model Reduction: Pruning, Quantization, Distillation |
HW5 Out G1 |
Instructor intro (20 min) Student-led talk (45 min + 10 min Q&A) |
||
| 3/31 | Neural Network Architecture Search | HW5 Debate (20 min) | |||
| 4/2 | Mixture of Experts Cascades |
HW6 Out G7 |
Instructor intro (20 min) Student-led talk (45 min + 10 min Q&A) |
||
| 4/7 | Timing Guarantees | HW6 Debate (20 min) | |||
| 4/9 | Energy Consumption and Thermal Issues |
HW7 Out |
|||
| 4/14 | Federated Learning, Distributed Fine-Tuning, and Test-Time Adaptation | G8 |
HW7 Debate (20 min) Student-led talk (45 min + 10 min Q&A) |
||
| 4/16 | Closed-loop control and related foundation models (RT-2, RT-X, etc.) | HW8 Out | |||
| Ethics | 4/21 | Ethical and Societal Considerations | HW8 Debate (20 min) | ||
| 4/23 | |||||
| Student Projects | 4/28 | Student-led Final Project Presentations | |||
| 4/30 | Student-led Final Project Presentations | ||||
| 5/5 | Recap |
|
|||