Date |
Topic |
Presenter |
Notes |
8/23 |
Introduction
Introduction to compilers, architecture and logistics
|
Charith
Slides |
Todo: Class Statistics Survey |
8/25 |
Compilers
Quick overview of Compiler Construction + Optimizations
|
Charith
Slides |
|
8/30 |
Compiler Optimizations
Anatomy of a Compiler Optimization Pass, DSLs, Domain Specific Optimizations
|
Charith
Slides |
|
9/01 |
DSLs + ML in Architecture
Continuation of discussion on DSLs and examples of ML in architecture
|
Charith
Slides |
Background Reading: A Survey of Machine Learning for Computer Architecture and Systems
Todo: Paper selections (Due: Sept 1st)
|
9/06 |
Machine Learning Techniques
Quick overview of ML techniques: Neural Networks
|
Charith
Slides |
|
9/08 |
Machine Learning Techniques (Contd.) and Auto-tuning
Quick overview of ML techniques: Genetic Algorithms, Simulated Annealing, Sequential Decision Making; Introduction to Auto-tuning
|
Charith
Slides |
Background Reading:
A Survey on Compiler Autotuning using Machine Learning (ACM CSUR 2018)
A taxonomy of ML for Systems Problems (IEEE Micro Sept/Oct 2020)
|
9/13 |
Autotuning: Empirical Autotuning
Main Reading: Automatically Tuned Linear Algebra Software (SC 1998)
|
Russel |
Related Reading:
A Fast Fourier Transform Compiler (PLDI 1999)
Fast Automatic Generation of DSP Algorithms (ICCS 2001)
The Design and Implementation of FFTW3 (IEEE 2005)
|
9/15 |
Autotuning: Languages for exposing choices
Main Reading: PetaBricks: A Language and Compiler for Algorithmic Choice (PLDI 2009)
|
Frithjof |
Related Reading:
A framework for adaptive algorithm selection in STAPL (PPoPP 2005)
Halide: A language and compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines (PLDI 2013)
A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers (PACT 2021)
|
9/20 |
Autotuning: Techniques
Main Reading: Bliss: Auto-tuning Complex Applications using a Pool of Diverse Lightweight Learning Models (PLDI 2021)
|
Yufeng |
Related Reading:
Learning to Generate Fast Signal Processing Implementations (ICML 2001)
Towards Better Understanding of Black-box Auto-tuning: A Comparative Analysis for Storage Systems (ATC 2018) - A systems paper with a good overview of techniques
|
9/22 |
Autotuning: Frameworks
Main Reading: CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research (CGO 2022)
|
Vimarsh |
Related Reading:
AutoTVM: Learning to Optimize Tensor Programs (NeurIPS 2018)
OpenTuner: An extensible framework for Program Autotuning (PACT 2014)
|
9/27 |
Autotuning: Scaling Up
Main Reading: GPTune: Multitask Learning for Autotuning Exascale Applications (PPoPP 2021)
|
Shubham |
Related Reading:
Portable Performance on Heterogeneous Architectures (ASPLOS 2013) |
9/29 |
Autotuning: Diverging Workloads
Main Reading: A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs (PPoPP 2019)
|
Srinjoy |
Related Reading:
Autotuning Algorithmic Choice for Input Sensitivity (PLDI 2015) |
10/04 |
Autotuning: Increasing Efficiency
Main Reading: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020)
|
Joshua |
Related Reading:
SRTuner: effective compiler optimization customization by exposing synergistic relations (CGO 2022) |
10/06 |
Data-driven Cost Models: Part 1
Main Reading: Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks (ICML 2019)
|
Burak |
Related Reading:
Learning execution through neural code fusion (ICLR 2020) |
10/11 |
Data-driven Cost Models: Part 2
Main Reading: A Learned Performance Model for Tensor Processing Units (MLSys 2021)
|
Isha |
Related Reading:
A Deep Learning based cost model for automatic code optimization (MLSys 2021) |
10/13 |
Program Embeddings: Part 1
Main Reading: CodeBERT: A Pre-Trained Model for Programming and Natural Languages (EMNLP 2020)
|
Shubham |
Related Reading:
Blended, precise semantic program embeddings (PLDI 2020)
Learning and Evaluating Contextual Embedding of Source Code (ICML 2020)
|
10/18 |
Program Embeddings: Part 2
Main Reading: ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations (ICML 2021)
|
Frithjof |
Related Reading:
IR2Vec: LLVM IR Based Scalable Program Embeddings (TACO 2020) |
10/20 |
Learned Optimizations: Traditional Compiler Optimizations 1
Main Reading: Compiler Auto-Vectorization with Imitation Learning (NeurIPS 2019)
|
Srinjoy |
Related Reading:
NeuroVectorizer: End-to-end Vectorization with Deep Reinforcement Learning (CGO 2020)
Meta Optimization: improving compiler heuristics with machine learning (PLDI 2003)
|
10/25 |
Learned Optimizations: Traditional Compiler Optimizations 2
Main Reading: End-to-end Deep Learning of Optimization Heuristics (PACT 2017)
|
Joshua |
Related Reading:
AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning (MLSys 2020) |
10/27 |
Learned Optimizations: DSLs Part 1
Main Reading: Learning to Optimize Halide with Tree Search and Random Programs (SIGGRAPH 2019)
|
Yufeng |
Related Reading:
Ansor: Generating High-Performance Tensor Programs for Deep Learning (OSDI 2020) |
11/01 |
Learned Optimizations: DSLs Part 2
Main Reading: Value Learning for Throughput Optimization of Deep Neural Networks (MLSys 2021)
|
Isha |
Related Reading:
The case for learned index structures (SIGMOD 2018) - databases |
11/03 |
Learned Optimizations: Tensor Programs
Main Reading: Device Placement Optimization with reinforcement learning (ICML 2017)
|
Yao |
Related Reading:
Transferable Graph Optimizers for ML Compilers (NeurIPS 2020) |
11/08 |
NO CLASS: General Election Day
|
|
|
11/10 |
Guest Lecture (Google)
|
|
|
11/15 |
Architecture Design Space Exploration: Part 1
Main Reading: Timeloop: A Systematic Approach to DNN Accelerator Evaluation (ISPASS 2019)
|
Burak |
Related Reading:
A Full-stack Accelerator Search Technique for Vision Applications
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search (ASPLOS 2021) |
11/17 |
Architecture Design Space Exploration: Part 2
Main Reading: A graph placement methodology for fast chip design (Nature 2021)
|
Russel |
Related Reading:
Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning (MICRO 2021)
Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs (MICRO 2021)
|
11/22 |
Break
|
|
|
11/24 |
Break
|
|
|
11/29 |
Learned Architecture Simulation
Main Reading: DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates (MICRO 2020)
|
Yao |
Related Reading:
SimNet: Computer Architecture Simulation using Machine Learning |
12/01 |
Learned Systems: Caches
Main Reading: Applying Deep Learning to the Cache Replacement Problem (MICRO 2019)
|
Vimarsh |
Related Reading:
Learning Memory Access Patterns (ICML 2018)
|
12/06 |
Student Presentations
|
|
|