Date | Lecture (and links to advance videos when applicable) | Links and Notes | Reading Suggestions | Comments
Intro and Basics 1/20 Introduction and Logistics Intro & logistics Slide set
1/22 Edge AI and Mitigating Resource Bottlenecks
Bottlenecks Slide set  
1/27 Overfitting and Self-supervised Learning  Group requests due
1/29 Edge AI and Mitigating the Data Bottleneck All groups assigned
The Data Bottleneck: Self-Supervised Data-Efficient Learning for IoT 2/3 Class Project Ideas Introduction. Projects Slide set
2/5 Fundamentals of Self-Supervised Learning: Tokenization, Pre-training, Fine-tuning, Backbone Architectures (e.g., auto-encoders, transformers), and Issues with Scaling Laws for IoT Applications Self-Supervised Learning (SSL) Slide set
2/10 Self-supervised Learning Architectures for Time-Series Data: RNNs, LSTMs, and State Space Models SSL Models for Time-Series Data
1. Schmidt, Robin M. "Recurrent neural networks (rnns): A gentle introduction and overview." arXiv preprint arXiv:1912.05911 (2019).
2. Kexin Zhang, Qingsong Wen, Chaoli Zhang, Rongyao Cai, Ming Jin, Yong Liu, James Y. Zhang et al. "Self-supervised learning for time series analysis: Taxonomy, progress, and prospects." IEEE transactions on pattern analysis and machine intelligence 46, no. 10 (2024): 6775-6794.
3. Albert Gu, Karan Goel, and Christopher Ré. "Efficiently modeling long sequences with structured state spaces." arXiv preprint arXiv:2111.00396 (2021).
4. Albert Gu, and Tri Dao. "Mamba: Linear-time sequence modeling with selective state spaces." In First conference on language modeling. 2024.
Note: Project title and abstract due
2/12 Representation Learning from Multimodal Sensor Data Multimodal Intro

 HW1 Out
1. Chao Zhang, Zichao Yang, Xiaodong He, and Li Deng. "Multimodal intelligence: Representation learning, information fusion, and applications." Journal of Selected Topics in Signal Processing, 2020.
2. Vedant Dave, Fotios Lygerakis, and Elmar Rueckert. "Multimodal visual-tactile representation learning through self-supervised contrastive pre-training." In 2024 IEEE ICRA, pp. 8013-8020. IEEE, 2024.
3. Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, and Cordelia Schmid. "Videobert: A joint model for video and language representation learning." In Proceedings of the IEEE/CVF international conference on computer vision, pp. 7464-7473. 2019.
 
2/17 Representation Learning from Multimodal Sensor Data (Student Led) G4

Multimodal Papers
1. Shengzhong Liu, Tomoyoshi Kimura, Dongxin Liu, Ruijie Wang, Jinyang Li, Suhas Diggavi, Mani Srivastava, and Tarek Abdelzaher. "Focal: Contrastive learning for multimodal time-series sensing signals in factorized orthogonal latent space." Advances in Neural Information Processing Systems 36 (2023): 47309-47338.
2. Chen, Yatong, Chenzhi Hu, Tomoyoshi Kimura, Qinya Li, Shengzhong Liu, Fan Wu, and Guihai Chen. "SemiCMT: Contrastive cross-modal knowledge transfer for iot sensing with semi-paired multi-modal signals." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, no. 4 (2024): 1-30.
3. Xiaomin Ouyang, Jason Wu, Tomoyoshi Kimura, Yihan Lin, Gunjan Verma, Tarek Abdelzaher, and Mani Srivastava. "MMbind: Unleashing the potential of distributed and heterogeneous data for multimodal learning in iot." In Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems, pp. 491-503. 2025.
4. Tomoyoshi Kimura, Xinlin Li, Osama Hanna, Yatong Chen, Yizhuo Chen, Denizhan Kara, Tianshi Wang et al. "InfoMAE: Pair-efficient cross-modal alignment for multimodal time-series sensing signals." In Proceedings of the ACM on Web Conference 2025, pp. 3084-3095. 2025.
5. Li, Zechen, Shohreh Deldari, Linyao Chen, Hao Xue, and Flora D. Salim. "SensorLLM: Aligning large language models with motion sensors for human activity recognition." In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 354-379. 2025.
6. Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally et al. "SensorLM: Learning the Language of Wearable Sensors." arXiv preprint arXiv:2506.09108 (2025).
Debate #1 (20 min)

Student led talk (45 min + 10 min Q&A)

See note @48 for debate concluding remarks on Piazza.
2/19 Self-supervised Learning from Frequency Domain Data
Frequency Domain Intro

HW2 Out
Slide set Debate #2 (20 min)
2/24 Self-supervised Learning from Frequency Domain Data (Student Led) G3
1. Shuochao Yao, Ailing Piao, Wenjun Jiang, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Jinyang Li, Tianshi Wang, Shaohan Hu, Lu Su, Jiawei Han and Tarek Abdelzaher, "STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks," In Proc. The Web Conference (WWW), San Francisco, CA, May 2019.
2. Dongxin Liu, Tianshi Wang, Shengzhong Liu, Ruijie Wang, Shuochao Yao, and Tarek Abdelzaher. "Contrastive self-supervised representation learning for sensing signals from the time-frequency perspective." In 2021 International Conference on Computer Communications and Networks (ICCCN), pp. 1-10. IEEE, 2021.
3. Yuan Gong, Cheng-I. Lai, Yu-An Chung, and James Glass. "Ssast: Self-supervised audio spectrogram transformer." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 10, pp. 10699-10709. 2022.
4. Setareh Rahimi Taghanaki, Michael Rainbow, and Ali Etemad. "Self-supervised human activity recognition with localized time-frequency contrastive representation learning." IEEE Transactions on Human-Machine Systems 53, no. 6 (2023): 1027-1037.
5. Denizhan Kara, Shengzhong Liu, Jinyang Li, Dongxin Liu, Tianshi Wang, Ruijie Wang, Yizhuo Chen, Yigong Hu, Tarek Abdelzaher, "FreqMAE: Frequency-Aware Masked Autoencoder for Multi-Modal IoT Sensing," In Proc. The Web Conference (WWW), May 2024.
6. Denizhan Kara, Tomoyoshi Kimura, Yatong Chen, Jinyang Li, Ruijie Wang, Yizhuo Chen, Tianshi Wang, Shengzhong Liu, Lance Kaplan, Joydeep Bhattacharyya, Tarek Abdelzaher, "PhyMask: An Adaptive Masking Paradigm for Efficient Self-Supervised Learning in IoT," In Proc. 22nd ACM Conference on Embedded Networked Sensor Systems (SenSys), Hangzhou, China, November 2024.
HW2 Debate (20 min)

Student led talk (45 min + 10 min Q&A)
2/26 Handling Spatial-Temporal IoT Data HW3 Out 2-page project proposal due
3/3 Handling Spatial-Temporal IoT Data (Student Led) G2
HW3 Debate (20 min)

Student led talk (45 min + 10 min Q&A)
Data Curation and "Faking" 3/5 Physical Data Curation and Augmentation HW4 Out    
3/10 Physical Data Curation and Augmentation (Student Led) G6
HW4 Debate (20 min)

Student led talk (45 min + 10 min Q&A)
3/12 Project Elevator Talks      
Break 3/17  Spring Break
3/19
The Compute Bottleneck: Efficient Inference at the IoT Edge 3/24 Input Data Filtering  
 
3/26 Model Reduction: Pruning, Quantization, Distillation HW5 Out
G1
  Instructor intro (20 min)

Student led talk (45 min + 10 min Q&A)
3/31 Neural Network Architecture Search     HW5 Debate (20 min)
4/2 Mixture of Experts Cascades HW6 Out
G7
  Instructor intro (20 min) 

Student led talk (45 min + 10 min Q&A)
4/7 Timing Guarantees   HW6 Debate (20 min)
4/9 Energy Consumption and Thermal Issues HW7 Out
   
4/14 Federated Learning, Distributed Fine-Tuning, and Test-Time Adaptation G8   HW7 Debate (20 min)

Student led talk (45 min + 10 min Q&A)
4/16 Closed-loop control and related foundation models (RT-2, RT-X, etc.) HW8 Out
Ethics 4/21 Ethical and Societal Considerations HW8 Debate (20 min)
4/23  
Student Projects 4/28 Student-led Final Project Presentations      
4/30 Student-led Final Project Presentations      
5/5 Recap