#

Date

Topics

Slides

Matlab

Homework

Exams

1

Jan 21

Why probability and statistics in computational bioengineering? Lecture 1
2

Jan 23

Random experiments. Sample space, Events, Venn diagrams.

Definitions of probability:
Statistical (part 1)

Lecture 2      
3

Jan 28

Definitions of probability:
Statistical (continued)

Definitions of probability:
Inductive. 

Paradoxes of inductive definition of probability

Lecture 3 coin_toss_template.m    
4

Jan 30

Combinatorics 

Probability axioms

Conditional probability

Lecture 4      
5

Feb 4

Independence of events

Circuit diagrams 

Bayes' theorem

Specificity/Sensitivity of tests

Lecture 5      
6

Feb 6

Secretary problem

Simpson's paradox

Monty Hall problem

Lecture 6 monty_hall_template.m  


 

7

Feb 11

Discrete random variables,

PMF, CDF, CCDF,
Mean, Variance, Skewness

Uniform distribution

The distribution of the
Ct value in COVID-19
PCR tests (part 1)

Lecture 7

 

   
8

Feb 13

Bernoulli trial

Binomial distribution

Lecture 8

binomial_template.m

binomial.m

   
Online

Feb 18

Poisson distribution. Online lecture

Online_lecture_video

 

   
9

Feb 20

 

Applications of the Poisson distribution.

Genome Assembly.

de Bruijn graphs.

 

 Lecture 9

 

   
10

Feb 25

 

Geometric distribution.

Phylogenetic trees and time to the most recent common ancestor. 

Mitochondrial Eve, Y-chromosome Adam.

 Lecture 10

 geometric_and_negative_binomial_template.m

   
11

Feb 27

 

Mitochondrial Eve 
Y-chromosome Adam 
and nuclear DNA ancestors  (continued)

Negative Binomial Distribution

Cancer: Driver and Passenger genes/mutations

 Lecture 11

 

   
12

Mar 4

 

Continuous random variables.

Probability Density Function, CDF, CCDF, Mean, Variance, Std

Uniform continuous distribution. 

Constant rate (Poisson) process.

 Lecture 12      
13

Mar 6

Exponential, Erlang, and Gamma distributions. Gaussian distribution.
Normalization. Z-scores
 Lecture 13      
14

Mar 11

Class in session. Midterm review.

Midterm exam at CBTF 
March 11 - March 13
Lecture 14      
Midterm

Mar 13

 No class due to the Midterm exam        
15

Mar 25

Fitting a Gaussian distribution to data on binding energies of protein-protein interactions.

Multiple random variables. Joint, Marginal, and Conditional PMFs or PDFs.

Statistical independence of variables.

 Lecture 15   Data:
PINT_binding_energy.mat

  Analysis instructions:

PINT_binding_energy.m

 


 
   
16

Mar 27

 

Covariance

Correlation coefficients:
Pearson, Spearman

 Lecture 16  

correlation_template.m

cancer_wdbc_cc_analysis_template.m

cancer_wdbc.mat

   
17

Apr 1

Samples, histograms, 
median, quartiles, percentiles
Box-and-whisker plots

Sample mean. Its mean and variance (standard error). Central limit theorem.
 Lecture 17  boxplot_template.m

https://onlinestatbook.com/stat_sim/sampling_dist/
   
18

Apr 3

Matlab remedial session        
19

Apr 8

Parameter point estimation.

Sample variance.

Method of moments and Maximum Likelihood Estimator

Lecture 18 moment_estimators_template.m    
20

Apr 10

Confidence intervals of population mean and variance.

Student-t and chi-squared distributions

Lecture 19

confidence_intervals_template.m

https://demonstrations.wolfram.com/ComparingNormalAndStudentsTDistributions/

https://demonstrations.wolfram.com/ChiSquaredDistributionAndTheCentralLimitTheorem/

   
21

Apr 15

Hypothesis testing Lecture 20 dark_vs_milk_chocolate_analysis_template.m    
22

Apr 17

Pearson's chi-square Goodness of Fit (GOF) test

M&M candy experiment.
 

Batch effect. Meta-analysis.

Test of statistical independence

Lecture 21

m_and_m_analysis_group.m

   
23

Apr 22

Linear regression: two variables   Lecture 22

 

   
24

Apr 24

Multiple Linear Regression.

Overfitting the data.

Training and testing/validation datasets

Lecture 23

expression_table.mat

regression.m

 

   
25

Apr 29

Multiple Linear Regression
(Matlab exercise)

Clustering analysis

Gene Set Enrichment Analysis

Lecture 24

expression_table.mat

regression_multiple.m

clustering.m

https://davidbioinformatics.nih.gov/tools.jsp

 
 
26

May 1

Important nodes in networks: degree, PageRank, betweenness centrality.

Co-expression and disease-disease networks analyzed using Gephi software

Lecture 25

coexpression_network_random start.gephi

disease_disease_random_start.gephi

gephi_network_analysis_exercise.docx

 

   
27

May 6

Review for the final exam

Lecture 26

FINAL EXAM