Statistics (I) in Data Science & Machine Learning


Normal Distribution/Gaussian Distribution

Central Limit Theorem


Confidence Interval

z-score table
Sample means are approximately normally distributed when the sample size n is large enough, provided the population has finite variance.
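The statement above can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 50, the number of trials, and the helper names `sample_means` and `draw_exponential` are all assumptions chosen for the example, not part of the original article. It draws many samples from a skewed population and compares the spread of the sample means with the value the Central Limit Theorem predicts, σ/√n.

```python
import random
import statistics


def sample_means(draw, n, trials, seed=0):
    """Return `trials` sample means, each computed from `n` draws of `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


def draw_exponential(rng):
    """Skewed population (assumed for illustration): exponential, mean 1, sd 1."""
    return rng.expovariate(1.0)


n = 50
means = sample_means(draw_exponential, n=n, trials=5000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# CLT prediction for the standard deviation of the sample mean: sigma / sqrt(n)
predicted_sd = 1.0 / n ** 0.5

# Share of sample means within 1.96 predicted standard errors of the
# population mean; this should be close to 0.95 if the normal
# approximation holds.
within = sum(abs(m - 1.0) <= 1.96 * predicted_sd for m in means) / len(means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (predicted {predicted_sd:.3f})")
print(f"share within 1.96 SE: {within:.3f}")
```

Although the population is far from normal, the share of sample means falling within 1.96 standard errors should land near 95%, which is the property that z-based confidence intervals rely on.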
one-tailed and two-tailed

Student’s t-distribution

t-distribution with different degrees of freedom
one-tailed t-score table

Sample sizing

Hypothesis testing & statistical significance

z-test and t-test

one-tailed z-score table

Types of error

  • A Type I error is a false positive: we reject H₀ when H₀ is true.
  • A Type II error is a false negative: we fail to reject H₀ when H₁ is true.
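Both error rates can be estimated by simulation. The following is a minimal sketch under stated assumptions: a two-tailed z-test with known σ = 1, n = 30, α = 0.05, and a true mean of 0.5 under H₁. These values and the helper names `z_test_rejects` and `rejection_rate` are hypothetical choices for illustration, not taken from the article.

```python
import random
from statistics import NormalDist, fmean


def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-tailed z-test: return True if H0 (mean == mu0) is rejected at level alpha."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=30, trials=4000, seed=1):
    """Fraction of simulated samples for which the test rejects H0."""
    rng = random.Random(seed)
    rejections = sum(
        z_test_rejects([rng.gauss(true_mean, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(trials)
    )
    return rejections / trials


# H0 is true (true mean equals mu0): every rejection is a false positive.
type_1_rate = rejection_rate(true_mean=0.0)

# H1 is true (true mean is 0.5): every failure to reject is a false negative.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type I error rate:  {type_1_rate:.3f}")
print(f"estimated Type II error rate: {type_2_rate:.3f}")
```

The estimated Type I rate should sit close to the chosen α, while the Type II rate depends on the effect size and sample size assumed here; a larger n or a larger true difference would lower it.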


Probability Sampling

Study Design

  • An observational study measures or surveys subjects without intervening.
  • A controlled experiment introduces an intervention. For example, in a clinical study, subjects may be assigned to one group that receives treatment or to another group that does not.
  • A case-control study compares those with the disease or condition under study (cases) with a control group of similar people who do not have the disease or condition.




