#

Date

Topics

Slides

Matlab

Homework
(not graded)

Videos

1

Aug 22

Class logistics. 
Two reasons we need probability and statistics in computational bioengineering.
Lecture_1      
2

Aug 24

Random experiments.
Sample space.
Events.
Venn diagrams.
Definitions of probability: statistical

Lecture_2

coin_toss_template.m

coin_toss.m

   
3

Aug 29

Definitions of probability: inductive or logical.
Combinatorics.
Birthday problem.
Combinatorics: Polya Urn problem.

Lecture_3      
4

Aug 31

Axioms of probability.
Conditional probability.
Event independence.
Circuits

Lecture_4 circuit_template.m    
5

Sep 5

Circuits (continued)

Bayes theorem.

Secretary problem.

Simpson's paradox

Lecture_5      
6

Sep 7 

Simpson's paradox

Monty Hall problem

Discrete random variables:
PMF, CDF, CCDF,
average, variance, standard deviation, skewness, geometric mean

Lecture_6

monty_hall_template.m

monty_hall.m

hw1.pdf


 

7

Sep 12

Random variables: skewness, geometric mean

Discrete distributions: Uniform

The distribution of Ct values of a viral PCR test

Lecture_7

uniform_discrete_template.m

uniform_discrete.m

   
8

Sep 14

Binomial distribution

Poisson distribution

Genome assembly (start)

Lecture_8

poisson_template.m

poisson.m

   
9

Sep 19

Genome assembly (continued)

Geometric distribution

Lecture_9

geometric_template.m

geometric.m

   
10

Sep 21

Mitochondrial Eve

Negative Binomial Distribution

Cancer drivers and passengers

Lecture_10

negative_binomial_template.m

negative_binomial.m

hw1_with_solutions.pdf

hw2.pdf

 
11

Sep 26

Review of discrete distributions.

Continuous random variables: PDF, CDF, CCDF, mean, variance.

Constant rate process.

Exponential distribution

Lecture_11

 

   
12

Sep 28

Constant rate process (continued)

Memoryless property of the exponential distribution

Erlang and Gamma distributions

Lecture_12

exponential_template.m

exponential.m

erlang_template.m

erlang.m

hw2_with_solutions.pdf

Erlang and Gamma 

Log-normal

13

Oct 3

 

Work in class on

Group_Project_1

Videos relevant for this project:

Erlang and Gamma 

Log-normal

   

 

 

14

Oct 5

Gaussian distribution

Standardization

Work in class on

Group_Project_2

Videos relevant for this project:

Gaussian_and_Standardization

 

     
15

Oct 10

Multiple variables:
joint, marginal, conditional probabilities.

Statistical independence of variables

Covariance

Correlation coefficients:
Pearson linear, 
Spearman rank

Work in class on

Group_Project_3

Videos relevant for this project:

Joint, Marginal, Conditional probabilities, Statistical independence of variables

Covariance and correlations

 

cancer_wdbc.mat    
16

Oct 12

Linear Functions of Random Variables

Principal Component Analysis

Work in class on

Group_Project_4

Videos relevant for this project:

Principal Component Analysis

     
17

Oct 17

Descriptive statistics.

Samples, i.i.d. random variables, 

Histograms.
Median, quartiles, boxplot, probability plots

Lecture_17

  boxplot_template.m

  boxplot_exercise.m

   
18

Oct 19

Central Limit Theorem

Parameter estimators:
method of moments

Lecture_18

central_limit_theorem_template.m

central_limit_theorem.m

hw3.pdf  
19

Oct 24

Sample variance S^2 (unbiased estimator) 

Maximum Likelihood Estimator (MLE)

Confidence Intervals

- with known population variance, sigma^2

Lecture_19

moment_estimators_template.m

moment_estimators.m

   
20

Oct 26

Confidence Intervals

- for the population mean when the population variance is estimated by the sample variance, S^2. Student's t-distribution.

- for the population variance. Chi-squared distribution.

- for population fraction

Lecture_20

confidence_intervals_template.m

confidence_intervals.m

hw3_with_solutions.pdf  
21

 Oct 31

Hypothesis testing (one- and two-sided).

One-sample and two-sample hypothesis tests

Bonferroni correction for multiple hypotheses

Lecture_21 dark_vs_milk_chocolate_analysis_template.m    
22

Nov 2

Midterm review Lecture_22      
 

Nov 7

MIDTERM        
23

Nov 9

Linear regression 
(single variable)
Lecture_23

expression_table.mat

regression_template.m

 

   
24

Nov 14

Chi-squared Goodness of Fit Test. M&M colors exercise Lecture_24

m_and_m_analysis_template.m

hw4.pdf  
25

Nov 16

Multiple Linear Regression.

Adjusted R-squared

Use of testing and training samples to avoid overfitting. 

Double descent

Lecture_25

regression_multiple_template.m

   
26

Nov 28

Supervised and unsupervised machine learning.

Clustering

Lecture_26 clustering_template.m hw4_with_solutions.pdf  
27 Nov 30

Gene Set Enrichment Analysis (GSEA) of biological functions in gene expression clusters using NCI DAVID

Network analysis: hubs, PageRank, betweenness centrality

Network visualization using Gephi

Lecture_27

coexpression_network_random start.gephi

disease_disease_random_start.gephi

gephi_network_analysis_exercise.pdf

hw5.pdf  

28

Dec 5 Final exam review Lecture_28   hw5_with_solutions.pdf  
FINAL EXAM