Bernoulli Distribution

09.08.2018

Episode #3 of the course Theory of probability by Polina Durneva


Hi there!

We started our course by talking about the probability of mutually exclusive and not mutually exclusive events, the probability of dependent and independent events, and conditional probability. Now, it is time to move on to more complicated topics in the theory of probability. From now on, we will mainly focus on different types of discrete and continuous probability distributions. First, we’ll discuss the difference between these two and then proceed to our first discrete probability distribution: the Bernoulli distribution.


Difference between Discrete and Continuous Probability Distributions

To understand the difference between discrete and continuous probability distributions, you first need to understand that each of these distributions is associated with discrete and continuous variables, respectively. So, the question is, what’s the difference between these two types of variables?

Discrete variables take values that can be counted, such as 0, 1, 2, and so on. For example, the number of coins in your pockets or the number of planets orbiting the Sun can be counted, and therefore, they are discrete variables.

On the other hand, continuous variables are measured rather than counted: they can take any value within a range. The most popular examples of these kinds of variables are time, weight, income, and age. For instance, a reported weight is typically rounded, meaning that a person recorded as weighing 100 pounds might actually weigh 100.01 pounds or 100.011 pounds or 100.0111 pounds, depending on how precisely we measure. A discrete probability distribution describes a discrete variable, and a continuous probability distribution describes a continuous one. Now, let’s proceed to our first discrete probability distribution: the Bernoulli distribution.


What Is the Bernoulli Distribution?

The Bernoulli distribution is used when a single trial has exactly two possible outcomes. These outcomes are called success and failure. Typically, success is denoted as 1 and failure is denoted as 0. If the probability of success is p, then the probability of failure is 1 – p.

For example, let’s say that we have a group of 10 people, 6 of whom love to sing and 4 of whom love to dance (and there is no one who loves both to sing and dance). Suppose we pick one person from the group at random. Assuming that picking a singer counts as success, the probability of success in our case is 6/10, or 60%. The probability of failure is 1 – 0.6 = 0.4 = 40%. We can check the last answer by counting how many failures we have (4 dancers) and dividing this number by the total number of people in the group: 4/10 = 0.4.
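The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the list `group` and the names `p_success` and `p_failure` are made up for this sketch, with 1 standing for a singer (success) and 0 for a dancer (failure).

```python
from fractions import Fraction

# Hypothetical encoding of the group from the example:
# 6 singers (success = 1) and 4 dancers (failure = 0).
group = [1] * 6 + [0] * 4

# Probability of success: number of successes divided by the group size.
p_success = Fraction(sum(group), len(group))

# Probability of failure, computed two ways: as 1 - p and by counting the failures.
p_failure = 1 - p_success
p_failure_by_counting = Fraction(group.count(0), len(group))

print(p_success, p_failure, p_failure_by_counting)
```

Using `Fraction` keeps the results exact (6/10 reduces to 3/5), so both ways of computing the probability of failure can be compared without rounding.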

Like most probability distributions, the Bernoulli distribution has an expected value and a variance (there, of course, exist other characteristics, such as the distribution function, the moment generating function, and the characteristic function, but this is a topic for another course).


Expected Value of Bernoulli Distribution

The expected value is the weighted mean of all possible values in a probability distribution, where each value is weighted by its probability. In the Bernoulli distribution, we can calculate the expected value by calculating the weighted mean of failure and success. Let’s use our previous example to derive the expected value.

We have two outcomes, failure and success, denoted as 0 and 1, respectively. The probability of failure is 0.4 and the probability of success is 0.6. The weighted mean would be 0 * 0.4 + 1 * 0.6 = 0.6, which is exactly the probability of success. This is true for every Bernoulli distribution: its expected value is 0 * (1 – p) + 1 * p = p, the probability of success.
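The weighted mean above can be written as a small Python function. This is a minimal sketch; the function name `bernoulli_expected_value` is invented for this illustration and does not come from any library.

```python
def bernoulli_expected_value(p):
    """Weighted mean of the outcomes 0 and 1, weighted by their probabilities."""
    # Map each outcome to its probability: failure (0) -> 1 - p, success (1) -> p.
    probabilities = {0: 1 - p, 1: p}
    return sum(value * prob for value, prob in probabilities.items())

# The singers-and-dancers example: p = 0.6.
expected = bernoulli_expected_value(0.6)
print(expected)
```

Because the failure outcome is 0, its term always vanishes, which is why the result equals p for any probability of success.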


Variance of Bernoulli Distribution

Variance describes how spread out the values of a distribution are. It measures the average squared distance of the values from the mean (or the expected value). The variance of the Bernoulli distribution can be calculated using the following formula: V[X] = p * (1 – p). Using our previous example with dancers and singers, we can calculate the variance: 0.6 * 0.4 = 0.24.
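To see where p * (1 – p) comes from, the sketch below computes the variance directly as the probability-weighted squared distance from the mean and compares it with the formula. It then simulates many random 0/1 outcomes as a rough sanity check. The function name `bernoulli_variance`, the seed, and the number of draws are arbitrary choices made for this illustration.

```python
import random
import statistics

def bernoulli_variance(p):
    """Variance as the probability-weighted squared distance from the mean."""
    mean = p  # expected value of a Bernoulli variable
    return (0 - mean) ** 2 * (1 - p) + (1 - mean) ** 2 * p

p = 0.6
from_definition = bernoulli_variance(p)
from_formula = p * (1 - p)

# Illustrative simulation: draw 100,000 outcomes, each 1 with probability p.
rng = random.Random(0)  # fixed seed so repeated runs give the same draws
draws = [1 if rng.random() < p else 0 for _ in range(100_000)]
sample_mean = statistics.fmean(draws)
sample_variance = statistics.pvariance(draws)

print(from_definition, from_formula)
print(sample_mean, sample_variance)
```

The first two numbers agree up to floating-point rounding, and the simulated mean and variance should land close to 0.6 and 0.24, although they will not match exactly.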

That’s it for today! Tomorrow, we will discuss the binomial distribution.

Take care,



Recommended book

An Introduction to Probability Theory and Its Applications, Vol. 1 by William Feller

