#

Date

Topics

Slides       

Matlab

Homework
(not graded)

Exams

1

Aug 26

Why probability and statistics in computational bioengineering? Lecture 1
2

Aug 28

Random experiments. Sample space, events, Venn diagrams.

Definitions of probability:
Statistical (part 1)

Lecture 2

coin_toss_template.m

coin_toss.m

   
3

Sep 2

Definitions of probability:
Inductive. 

Paradoxes of the inductive definition of probability

Combinatorics

Lecture 3      
4

Sep 4

Combinatorics (continued)

Probability axioms

Conditional probability

Independence of events

Circuit diagrams

Bayes' theorem

Specificity/Sensitivity of tests

Lecture 4 circuit_template.m    
5

Sep 9

Secretary problem

Simpson's paradox

Monty Hall problem

Lecture 5 monty_hall_template.m    
6

Sep 11

Discrete random variables.

PMF, CDF, CCDF,
Mean, Variance, Skewness

Uniform distribution

The distribution of the
Ct value in COVID-19
PCR tests

Lecture 6 uniform_discrete_template.m hw1.pdf


 

7

Sep 16

Bernoulli trials

Binomial Distribution

Poisson Distribution

Lecture 7

binomial_template.m

   
8

Sep 18

Poisson distribution in genome assembly Lecture 8 poisson_template.m    
9

Sep 23

Geometric distribution.

Phylogenetic trees and time to the most recent common ancestor. 

Mitochondrial Eve &
Y-chromosome Adam

Lecture 9

 

   
10

Sep 25

Negative Binomial Distribution

Cancer: Driver and Passenger genes

Lecture 10

negative_binomial_template.m

hw1_with_solutions.pdf  
11

Sep 30

Probability Density Function, CDF, CCDF, Mean, Variance, Std

Uniform continuous distribution. 

Constant rate (Poisson) process.

Exponential distribution.

Lecture 11   hw2.pdf  
12

Oct 2

Erlang and Gamma distributions

Gaussian distribution

Standardizing and working with the CDF table

Lecture 12

exponential_gamma_template.m

   
13

Oct 7

Fitting Gaussian distribution to the data for binding energies of protein-protein interactions 

Multiple random variables. Joint, Marginal, and Conditional PMFs

Statistical independence of random variables

Lecture 13

PINT_binding_energy.m

PINT_binding_energy.mat

hw2_with_solutions.pdf  
14

Oct 9

Covariance

Correlation coefficients:
Pearson, Spearman

Lecture 14

correlation_template.m

cancer_wdbc_cc_analysis_template.m

cancer_wdbc.mat

   
15

Oct 14

Samples, histograms, 
median, quartiles, percentiles
Box-and-whisker plots
Lecture 15 boxplot_template.m    
16

Oct 16

Sample mean. Its mean and variance (standard error). Central limit theorem.
Parameter point estimation
Lecture 16

central_limit_theorem_template.m

Online simulation of the Central Limit Theorem:
https://onlinestatbook.com/stat_sim/sampling_dist/

   
17

Oct 21

Parameter point estimation. Method of moments and Maximum Likelihood Estimator.

Confidence intervals of population mean
Lecture 17

moment_estimators_template.m

confidence_intervals_template.m

   
18

Oct 23

Midterm review Lecture 18      
19

Oct 28

Midterm exam at CBTF        
20 Oct 30

 

Midterm exam at CBTF        
21

Nov 4

Confidence intervals of population mean and variance

Student-t and chi-squared distributions

Confidence interval of population proportion

Lecture 19

Group signup sheet

Student's t-Distribution in Mathematica

Chi-Squared Distribution in Mathematica

   
22

Nov 6

Hypothesis testing:
Type 1 and Type 2 errors. One- and two-sided hypotheses. One and two samples.
Lecture 20 dark_vs_milk_chocolate_analysis_template.m    
23

Nov 11

In-class group exercise #1  

Group exercise 1

cancer_wdbc.mat

   
24

Nov 13

In-class group exercise #2  

Group exercise 2

   
25

Nov 18

Pearson's chi-square Goodness of Fit (GOF) test Lecture 21

m_and_m_analysis_template.m

   
26

Nov 20

Linear regression: two variables.

Multiple linear regression (part one).

Nonlinear regression.

Training and testing/validation sets.

Overfitting.

Double descent

Lecture 22

 

   
27

Dec 2

Matlab exercise on single and multiple variable regression.

Clustering Analysis

Lecture 23

expression_table.mat

regression_single_and_multiple_template.m

clustering_template.m

https://davidbioinformatics.nih.gov/summary_new.jsp

   
28 Dec 4

Clustering Matlab exercise.
Concepts in Network Analysis: ranking nodes by degree, PageRank, or betweenness-centrality.

Using Gephi software for network analysis and visualization.

Lecture 24

coexpression_network_random start.gephi

disease_disease_random_start.gephi

gephi_network_analysis_exercise.pdf

   
29  Dec 9          

FINAL EXAM