Course Description

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies and nuance. This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.

Course Content

Articles

Articles are coming!

What I’ve learned

Statistical inference is the process of generating conclusions about a population from a noisy sample. With it, we can gain insight into our data and understand more about the world, which helps us make better decisions.

Knowledge and parsimony (using the simplest models to explain complex phenomena) go hand in hand. We use probability models to describe the world parsimoniously. Using probability models to connect our data to the population is the most effective way to obtain inference.

The courses are definitely getting more difficult. The concepts were familiar because I’ve taken an intro to stats course, but this course went into more depth. I got to understand the math behind them, and the R code as well. For some reason, R code made stats fun for me; maybe it’s because I don’t have to write that much on paper anymore :D.

Week 1:

Week 2:

Week 3:

Week 4:

Final project review

The final project was in two parts.

The first part was about simulating samples from an exponential distribution and comparing the distribution of the sample means to the theoretical one. What you will find (because of the Central Limit Theorem, CLT) is that the two are quite similar.
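A minimal sketch of this kind of simulation, written in Python rather than the R used in the course. The rate, sample size and number of simulations are hypothetical values picked for illustration, and `simulate_sample_means` is a name made up for this example.

```python
import random
import statistics


def simulate_sample_means(rate, sample_size, n_simulations, seed=0):
    """Draw many exponential samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(rate) for _ in range(sample_size)])
        for _ in range(n_simulations)
    ]


# Hypothetical settings, chosen for illustration only.
RATE = 0.2
SAMPLE_SIZE = 40
N_SIMULATIONS = 1000

means = simulate_sample_means(RATE, SAMPLE_SIZE, N_SIMULATIONS)

# Theory: an Exp(rate) variable has mean 1/rate and sd 1/rate,
# so the mean of n draws has sd (1/rate) / sqrt(n).
theoretical_mean = 1 / RATE
theoretical_se = (1 / RATE) / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {statistics.fmean(means):.3f} (theory {theoretical_mean:.3f})")
print(f"sd of sample means:   {statistics.stdev(means):.3f} (theory {theoretical_se:.3f})")
```

A histogram of `means` would look roughly bell shaped, even though a single exponential draw is strongly skewed.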

The second part was a basic inferential analysis of the ToothGrowth dataset. I had to find out whether tooth length differed between the supplement types. I ran a basic hypothesis test using R’s t.test function.
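For intuition, here is a rough Python sketch of the statistic behind a two-sample comparison like this one. It computes Welch's unequal-variance t statistic and its approximate degrees of freedom, which is what R's `t.test` uses by default for two samples. The numbers are made up for illustration and are not the ToothGrowth data, and `welch_t` is a hypothetical helper name.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    se2 = vx / nx + vy / ny  # squared standard error of the mean difference
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


# Made-up measurements for illustration; NOT the ToothGrowth data.
group_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
group_b = [8.9, 9.4, 10.1, 8.2, 9.0, 9.7]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The sketch stops at the statistic; turning it into a p-value needs the t distribution, which R's `t.test` handles for you.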

Overall the course was great. I learned a lot, and I just hope I understood enough to perform statistical analysis in R intuitively and well.

Book

Great summary notes from the book

These examples illustrate many of the difficulties of trying to use data to create general conclusions about a population.

Paramount among our concerns are:

Statistical inference requires navigating the set of assumptions and tools and subsequently thinking about how to draw conclusions from data.

The goals of inference

You should recognize the goals of inference. Here we list five examples of inferential goals.

The tools of the trade

Several tools are key to the use of statistical inference. We’ll only be able to cover a few in this class, but you should recognize them anyway.

Different thinking about probability leads to different styles of inference

We won’t spend too much time talking about this, but there are several different styles of inference. Two broad categories that get discussed a lot are:

  1. Frequency probability: the long run proportion of times an event occurs in independent, identically distributed repetitions.
  2. Frequency style inference: uses frequency interpretations of probabilities to control error rates. Answers questions like “What should I decide given my data, controlling the long run proportion of mistakes I make at a tolerable level?”
  3. Bayesian probability: the probability calculus of beliefs, given that beliefs follow certain rules.
  4. Bayesian style inference: the use of Bayesian probability representation of beliefs to perform inference. Answers questions like “Given my subjective beliefs and the objective information from the data, what should I believe now?”
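The two views of probability in the list above can be contrasted with a small Python sketch. The event probability, the number of flips and the Beta(1, 1) prior are all assumptions made for this illustration, and `beta_update` is a hypothetical helper name.

```python
import random

# Frequency probability: the proportion of times the event occurs
# settles down as independent, identical repetitions accumulate.
rng = random.Random(42)
TRUE_P = 0.3  # hypothetical probability of the event
flips = [rng.random() < TRUE_P for _ in range(100_000)]
long_run_proportion = sum(flips) / len(flips)
print(f"long run proportion: {long_run_proportion:.4f}")


# Bayesian probability: a Beta(a, b) prior encodes a belief about p,
# and observed successes and failures update that belief.
def beta_update(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing the data."""
    return prior_a + successes, prior_b + failures


first_20 = flips[:20]
successes = sum(first_20)
post_a, post_b = beta_update(1, 1, successes, len(first_20) - successes)
posterior_mean = post_a / (post_a + post_b)
print(f"posterior mean after 20 flips: {posterior_mean:.3f}")
```

The first number answers "how often does this happen in the long run", while the second answers "what should I believe about p now, given my prior and these 20 observations".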

Data scientists tend to fall within shades of gray of these and various other schools of inference. In fact, there are so many shades of gray between the styles of inference that it is hard to pin down most modern statisticians as either Bayesian or frequentist. In this class, we will primarily focus on basic sampling models, basic probability models and frequency style analyses to create standard inferences. This is the most popular style of inference by far.

Being data scientists, we will also consider some inferential strategies that rely heavily on the observed data, such as permutation testing and bootstrapping. Since probability modeling is our starting point, our first task is to build up basic probability.
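As a rough illustration of the two data-driven strategies just mentioned, the Python sketch below runs a permutation test for a difference in means and builds a percentile bootstrap interval for a mean. The function names, resample counts and data are hypothetical, chosen only for this example.

```python
import random
import statistics


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.fmean(pooled[:len(x)]) - statistics.fmean(pooled[len(x):]))
        if diff >= observed:
            hits += 1
    return hits / n_permutations


def bootstrap_ci(data, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_resamples)
    )
    lower = means[int(n_resamples * alpha / 2)]
    upper = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


# Made-up measurements for illustration only.
group_a = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
group_b = [8.9, 9.4, 10.1, 8.2, 9.0, 9.7]

p_value = permutation_p_value(group_a, group_b)
ci_low, ci_high = bootstrap_ci(group_a)
print(f"permutation p-value: {p_value:.4f}")
print(f"bootstrap 95% interval for the mean of group_a: ({ci_low:.2f}, {ci_high:.2f})")
```

Neither function assumes a particular probability model for the data; both lean on reshuffling or resampling the observed values.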

Read more in the book!

Proof of completion

Certificate for 6th course

View it online