ECE 598 - MSC (Spring 2023)
Lecturer: Olgica Milenkovic (Office hours: Tuesday 12:30-2:00pm, 311 CSL or by appointment as needed)
Lectures: Tuesday and Thursday, 2:00-3:20pm, Electrical and Computer Engineering Building ECEB 3013
Course Objectives:
To learn about the latest developments at the frontier of molecular storage and computing, and to identify and solve new coding-theoretic, machine learning and mathematical problems that arise in ensuring data reliability, durability and replicability. To learn about end-to-end storage and computing system designs involving both DNA and synthetic polymers.
Catalog Description
The course will introduce a number of DNA- and polymer-based data storage platforms and the mathematical and biological concepts needed to understand their implementation. In the first part of the course we will describe modern synthesis and sequencing platforms and the problem of reconstructing sequences from evidence sets in the form of substrings, subsequences or substring/subsequence weights. Topics on the biological side include reviews of Sanger, shotgun and nanopore sequencing. Topics on the mathematical side include sequence alignment, de Bruijn graphs, deep learning methods for base calling, clustering methods for efficient synthesis, k-decks and trace reconstruction. In the second part of the course we will discuss random access and error-correction problems, with special emphasis on PCR-based random access techniques and primer design, coding for shotgun sequencing, coded trace reconstruction and Catalan-sequence-based encoding methods. In the third part of the course we will discuss DNA editing mechanisms and topological storage, along with molecular computing methods based on strand displacement. Special emphasis will be placed on emerging in-memory computational paradigms such as SIM||DNA.
Prerequisites: Some background in coding/information theory is desirable but not required.
Textbook: There is no textbook. Notes will be posted online one day before each lecture.
Grading: Homework (30%), Paper presentations (30%), Group projects (40%).
Project Presentations: Date and time to be announced.
Research Project Topics
DNA Sequencing and Assembly (Review paper)
Random access in DNA-based data storage via PCR and primer design problems (Paper)
Machine learning methods for basecalling (Research paper)
Capacity of DNA-Storage Channels (Research paper)
Course Topics
Introduction to basic concepts in molecular biology
Introduction to basic concepts in synthetic biology
Introduction to DNA-Based Data Storage Systems
DNA-Based Data Storage System Architecture
Random Access
Rewriting, Topological Storage and 2DDNA
Coding theory for DNA-Based Data Storage
DNA Computing: From the Traveling Salesman Problem to SIM||DNA
Homework
Lectures