ECE 598 - MSC (Spring 2023)

Lecturer: Olgica Milenkovic (Office hours: Tuesday 12:30-2:00pm, 311 CSL or by appointment as needed)

Lectures: Tuesday and Thursday, 2:00-3:20pm, Electrical and Computer Engineering Building ECEB 3013

Course Objectives:

To learn about the latest developments at the frontier of molecular storage and computing, and to identify and solve new coding theory, machine learning and mathematical problems that arise in ensuring data reliability, durability and replicability. Students will also learn about end-to-end storage and computing system designs involving both DNA and synthetic polymers.

Catalog Description

The course will introduce a number of DNA- and polymer-based data storage platforms and the mathematical and biological concepts needed to understand their implementation. In the first part of the course, we will describe modern synthesis and sequencing platforms and the problem of reconstructing sequences from evidence sets in the form of substrings, subsequences or substring/subsequence weights. Topics on the biological side include reviews of Sanger, shotgun and nanopore sequencing. Topics on the mathematical side include sequence alignment, de Bruijn graphs, deep learning methods for base calling, clustering methods for efficient synthesis, k-decks and trace reconstruction. In the second part of the course, we will discuss random access and error-correction problems, with special emphasis on PCR-based random access techniques and primer design, coding for shotgun sequencing, coded trace reconstruction and Catalan-sequence-based encoding methods. In the third part of the course, we will discuss DNA editing mechanisms and topological storage, along with molecular computing methods based on strand displacement. Special emphasis will be placed on emerging in-memory computational paradigms such as SIMD||DNA.

Prerequisites: Some background in coding/information theory is desirable but not required.

Textbook: There is no textbook. Notes will be posted online one day before each lecture.

Grading: Homework (30%), Paper presentations (30%), Group projects (40%).

Project Presentations: Date and time to be announced.


Research Project Topics

DNA Sequencing and Assembly (review paper)

Random access in DNA-based data storage via PCR and primer design problems (paper)

Machine learning methods for basecalling (research paper)

Capacity of DNA-Storage Channels (research paper)

Course Topics