Impossible or Improbable — A Gentle Introduction to Probability

  • What is probability?
  • The differences between the Frequentist approach and the Bayesian approach
  • How to visualize probability
  • How to utilize the rules of probability
  • Using confusion matrices to look at the basic metrics

Basic definitions

One of the most basic concepts in probability is the procedure. A procedure is an act that leads to a result, such as throwing a die or visiting a website. An event is a collection of outcomes of a procedure, such as getting heads on a coin flip or leaving a website after only 4 seconds. A simple event is an outcome of a procedure that cannot be broken down further. For example, rolling two dice can be broken down into two simple events: rolling die 1 and rolling die 2. The sample space of a procedure is the set of all possible simple events. For example, suppose an experiment is performed in which a coin is flipped three times in succession. What is the size of the sample space for this experiment? The answer is eight, because the result could be any one of the possibilities in the following sample space: {HHH, HHT, HTT, HTH, TTT, TTH, THH, THT}.
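As a small sketch of the coin example above, the sample space can be listed in Python instead of by hand. The variable names here are illustrative and not from the original text.

```python
from itertools import product

# Each single flip is either heads ("H") or tails ("T").
flip_outcomes = ["H", "T"]

# The sample space is every ordered sequence of three flips.
sample_space = ["".join(seq) for seq in product(flip_outcomes, repeat=3)]

print(sample_space)
print(len(sample_space))  # 8
```

Changing `repeat=3` to another number of flips lists the sample space for that experiment; its size doubles with every extra flip.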


The probability of an event represents the frequency, or chance, that the event will happen. For notation, if A is an event, P(A) is the probability that the event occurs. When every outcome in the sample space is equally likely, we can define the actual probability of an event, A, as follows:

P(A) = (number of ways A occurs) / (size of the sample space)
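One common way to compute such a probability, assuming every outcome in the sample space is equally likely, is to divide the number of outcomes in the event by the size of the sample space. The sketch below does this for the three-flip experiment. The event "at least two heads" is a hypothetical example chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips, e.g. "HHT".
sample_space = ["".join(seq) for seq in product("HT", repeat=3)]

# Hypothetical event A: at least two of the three flips are heads.
event_a = [outcome for outcome in sample_space if outcome.count("H") >= 2]

# Count the ways A can occur and divide by the size of the sample space.
p_a = Fraction(len(event_a), len(sample_space))

print(event_a)
print(p_a)  # 1/2
```

`Fraction` keeps the result exact; calling `float(p_a)` gives the decimal form.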

Bayesian versus Frequentist

The preceding example was almost too easy. In practice, we can hardly ever truly count the number of ways something can happen. For example, let’s say that we want to know the probability that a random person smokes cigarettes at least once a day. If we wanted to approach this problem using the classical way (the previous formula), we would need to figure out how many different ways a person can be a smoker (someone who smokes at least once a day), which is not possible! When faced with such a problem, there are two main schools of thought for calculating probabilities in practice: the Frequentist approach and the Bayesian approach. This chapter will focus heavily on the Frequentist approach, while the subsequent chapter will dive into Bayesian analysis.

Frequentist approach

In a Frequentist approach, the probability of an event is calculated through experimentation. It uses the past in order to predict the future chance of an event. The basic formula is as follows:

P(A) = (number of times A occurred) / (number of times the procedure was repeated)

Example — marketing stats

Let’s say that you are interested in ascertaining how often a person who visits your website is likely to return on a later date. This is sometimes called the rate of repeat visitors. Under the classical definition, we would define our event A as a visitor coming back to the site. We would then have to count the number of ways a person can come back, which doesn’t really make sense at all! In this case, many people would turn to a Bayesian approach; however, we can instead calculate what is known as relative frequency. We can take the visitor logs and calculate the relative frequency of event A (repeat visitors). Let’s say that, of the 1,458 unique visitors in the past week, 452 were repeat visitors. We can calculate this as follows:

Relative frequency of A = 452 / 1,458 ≈ 0.31

So, about 31% of your visitors are repeat visitors.
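The same relative frequency can be computed in a couple of lines of Python. This is only a sketch using the two counts quoted above; the variable names are illustrative.

```python
# Counts taken from the example: one week of visitor logs.
unique_visitors = 1458
repeat_visitors = 452

# Relative frequency = times the event occurred / times the procedure ran.
relative_frequency = repeat_visitors / unique_visitors

print(f"{relative_frequency:.3f}")  # 0.310
```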

The law of large numbers

The reason we can rely on the Frequentist approach here is the law of large numbers, which states that if we repeat a procedure over and over, the relative frequency will approach the actual probability. Let’s try to demonstrate this using Python.

  1. Pick a random number between 1 and 10 and find the average.
  2. Pick two random numbers between 1 and 10 and find their average.
  3. Pick three random numbers between 1 and 10 and find their average.
  4. Keep increasing the sample size in the same way, up to 10,000 random numbers between 1 and 10, and find their average.
  5. Graph the results.
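The steps above can be sketched as follows. This is a minimal version under a few assumptions that are not in the original text: "between 1 and 10" is read as whole numbers from 1 to 10 inclusive (so the true average is 5.5), only every 50th sample size is tried to keep the run short, and the function names are hypothetical. The plotting helper needs matplotlib and is defined but not called automatically.

```python
import random


def average_of_random_picks(sample_size, rng):
    """Average of `sample_size` integers drawn uniformly from 1 to 10."""
    picks = [rng.randint(1, 10) for _ in range(sample_size)]
    return sum(picks) / sample_size


def run_experiment(sample_sizes, seed=0):
    """Return the sample average for each requested sample size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [average_of_random_picks(n, rng) for n in sample_sizes]


def plot_averages(sample_sizes, averages):
    """Graph the averages (matplotlib is imported only when this is called)."""
    import matplotlib.pyplot as plt

    plt.plot(sample_sizes, averages)
    plt.axhline(5.5, linestyle="--")  # true mean of the integers 1 to 10
    plt.xlabel("Sample size")
    plt.ylabel("Average of the picks")
    plt.show()


# Assumption: every 50th sample size from 1 up to 10,000, plus 10,000 itself.
sample_sizes = list(range(1, 10001, 50)) + [10000]
averages = run_experiment(sample_sizes)

print(averages[0])   # average of a single pick, can be anywhere from 1 to 10
print(averages[-1])  # average of 10,000 picks, expected to be close to 5.5
```

Calling `plot_averages(sample_sizes, averages)` draws the graph from step 5. The averages should bounce around for small sample sizes and settle near 5.5 as the sample size grows.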

Compound events

Sometimes, we need to deal with two or more events. These are called compound events. A compound event is any event that combines two or more simple events. When this happens, we need some special notation.

  • The probability that A and B occur is P(A ∩ B) = P(A and B)
  • The probability that either A or B occurs is P(A ∪ B) = P(A or B)

For example, in a Venn diagram of a cancer test, where A and B are the two events "has cancer" and "had a positive test", the colored regions are:

  • Pink: This refers to the people who have cancer and had a negative test
  • Purple (A intersect B): These people have cancer and had a positive test
  • Blue: This refers to the people with no cancer and a positive test outcome
  • White: This refers to the people with no cancer and a negative test outcome
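The notation and the four regions above can be sketched with Python sets. The ten people below are made-up data for illustration only, and treating "has cancer" as event A and "positive test" as event B is an assumption.

```python
from fractions import Fraction

# Hypothetical data: ten people, identified by the numbers 0 to 9.
everyone = set(range(10))
has_cancer = {0, 1, 2}        # event A (assumed labeling)
positive_test = {1, 2, 3, 4}  # event B (assumed labeling)


def probability(event, sample_space):
    """Share of the sample space covered by the event."""
    return Fraction(len(event), len(sample_space))


# P(A and B): people in both sets. P(A or B): people in at least one set.
p_a_and_b = probability(has_cancer & positive_test, everyone)
p_a_or_b = probability(has_cancer | positive_test, everyone)

# The four regions of the diagram.
pink = has_cancer - positive_test               # cancer, negative test
purple = has_cancer & positive_test             # cancer, positive test
blue = positive_test - has_cancer               # no cancer, positive test
white = everyone - (has_cancer | positive_test) # no cancer, negative test

print(p_a_and_b)  # 1/5
print(p_a_or_b)   # 1/2
```

The four regions never overlap and together cover everyone, which is why their sizes add up to the size of the sample space.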



Desi Ratna Ningsih


Data Science Enthusiast, Remote Worker, Course Trainer, Archery Coach, Psychology and Philosophy Student