What is Binomial Distribution?
- Soham Shinde
- May 29, 2021
Updated: Jan 10, 2023

The prefix 'bi' means two. A distribution describes the probabilities of the events that can occur in an experiment, and it is often shown as a graph. The binomial distribution therefore gives the probabilities for an experiment in which every trial has exactly 2 outcomes, labelled "success" and "failure". It is defined by two parameters: the number of trials 'n' and the probability of success 'p' in a single trial. It gives the probability of getting a given number of successes in 'n' random samples.
The outcome we want to study is the one we call success, and it is the one whose probability we calculate. If we want to study the probability of students failing an exam, then success is a student failing and failure is a student passing. Defining what counts as success and what counts as failure is therefore an important first step.
Formula for binomial distribution:

The formula is:
P(x) = nCx * p^x * q^(n - x)
where:
P(x): Probability of exactly x successes
n: Number of random samples selected by us
p: Probability of success in a single trial (known to the user)
q: Probability of failure = (1 - p)
x: Number of samples for which the outcome is success (always 0 <= x <= n)
nCx denotes the number of possible combinations in which x successes can be arranged among the n samples.
nCx can be calculated as nCx = n! / (x! * (n - x)!).
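The definitions above translate directly into a few lines of Python. This is only a minimal sketch, not part of the Excel model: the function name `binomial_pmf` is my own choice, and the numbers (10 customers, 20% chance of delay, exactly 3 delayed) are illustrative values.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n samples.

    Implements P(x) = nCx * p^x * q^(n - x) with q = 1 - p.
    """
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative values: 10 customers, 20% chance of delay, exactly 3 delayed.
prob_three_delayed = binomial_pmf(3, 10, 0.2)
print(f"P(X = 3) = {prob_three_delayed:.4f}")  # prints P(X = 3) = 0.2013
```

Here `math.comb(n, x)` plays the role of nCx, so the factorials never have to be computed by hand.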

Illustrative Example:
Let's look at an example to understand the Binomial Distribution in detail. If we have data on shipment delivery times and we want to study shipment delays, then "Shipment Delayed" is the success and "Shipment Not Delayed" is the failure.
I have used MS Excel to calculate the binomial distribution for this shipment delay example.

The Excel function takes the following inputs:
Number of successes: This is the x value, the number of successes. For example, if we want the probability that shipments are delayed for exactly 3 customers, then x = Number of successes = 3.
Trials: This is the number of random samples taken. For example, if we select 10 random people for the experiment, then Number of random samples = n = 10.
Probability of Success: This is the probability of success in a single trial, which is usually known in advance. In our shipment delay example, it is the probability that a single shipment is delayed.
CDF (Cumulative Distribution Function) / Probability Mass Function (PMF):
This option is part of the Excel function rather than of the formula itself. We have to select whether we want the cumulative probability up to the number of successes or the probability mass function.
CDF: The cumulative option gives the probability of at most x successes, so it covers a range of values instead of a single value. For example, for the probability that fewer than 5 people get their shipments late, we select cumulative with x = 4.
PMF: The PMF gives the probability of exactly x successes, for example exactly 1, 2, 3, 4 or 10 people getting their shipments late.
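The difference between the two options can be sketched in Python as well. The function names below are hypothetical, and the values n = 10 and p = 20% are only illustrative. The cumulative version simply adds up the PMF from 0 to x, which matches the "at most x successes" meaning of Excel's cumulative option.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x): probability of exactly x successes."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x): probability of at most x successes."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.2  # illustrative values, not taken from the Excel model

exactly_three = binomial_pmf(3, n, p)
# "Fewer than 5" means 0, 1, 2, 3 or 4 late shipments, so we pass x = 4.
fewer_than_five = binomial_cdf(4, n, p)

print(f"P(X = 3) = {exactly_three:.4f}")
print(f"P(X < 5) = {fewer_than_five:.4f}")
```

Note the off-by-one trap: a question about "fewer than 5" needs x = 4, while "5 or fewer" needs x = 5.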
Mean and Standard Deviation of Binomial Distribution:
Now that we have discussed the binomial distribution formula, it is time to understand the Mean and Standard Deviation of the Binomial Distribution.
If we sample 'n' customers again and again, and 'p' is the probability of a delay in one trial, then the average number of customers with delayed shipments is n*p. If shipments are delayed 20% of the time and we sample 10 customers many times, then on average 2 out of the 10 customers will get their shipments late.
Similarly, the standard deviation of the binomial distribution is the square root of n*p*q, where q = 1-p. For the same example, that is the square root of 10*0.2*0.8 = 1.6, which is about 1.26 customers.
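Both quantities are one-liners in code. The sketch below uses my own function names and the 10 customers with a 20% delay rate from the paragraph above.

```python
from math import sqrt


def binomial_mean(n: int, p: float) -> float:
    """Average number of successes over many repeated samples: n * p."""
    return n * p


def binomial_std(n: int, p: float) -> float:
    """Standard deviation of the number of successes: sqrt(n * p * q)."""
    q = 1 - p  # probability of failure
    return sqrt(n * p * q)


mean_delayed = binomial_mean(10, 0.2)
std_delayed = binomial_std(10, 0.2)

print(f"Mean: {mean_delayed:.2f}")               # prints Mean: 2.00
print(f"Standard deviation: {std_delayed:.2f}")  # prints Standard deviation: 1.26
```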
Let's see in MS Excel how to calculate the binomial distribution.
Given: Probability of Package Not Delayed = q = 85%
Probability of Package Delayed = p = 15%
Number of customers = n = 55
Solution:
We sample 55 customers to see whether their shipments were delayed, so n=55. The probability of delay is 15%, that is p=15%. The probability of Shipment Not Delayed is therefore 85%, because the two probabilities must sum to 100%. That is p+q=1.
I have calculated the probability that a given number of customers get their shipments late. The x, n, and p values are my own assumptions.
Model in Excel:

Explanation of Excel model:
In the Excel model, I have calculated the probability of delay for 1, 2, 3, ..., 10 customers out of the 55 customers sampled. For example, the probability that exactly 8 customers get their shipments late is 15.03%, which is given in cell O20. That is the highest of all the values.
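For readers without Excel, the same table can be rebuilt with a short Python sketch. It is not the Excel model itself: the names `binomial_pmf` and `delay_table` are my own, and only n = 55 and p = 15% come from the example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 55, 0.15  # sample size and delay probability from the example

# Probability that exactly x customers get a late shipment, for x = 1..10.
delay_table = {x: binomial_pmf(x, n, p) for x in range(1, 11)}

for x, prob in delay_table.items():
    print(f"{x:>2} customers delayed: {prob:.2%}")

# The count with the highest probability in the table.
most_likely = max(delay_table, key=delay_table.get)
print("Most likely number of delayed customers:", most_likely)
```

The row for 8 customers should match the 15.03% from the Excel model, and 8 should come out as the most likely count.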
If we sample 55 customers many times (say 1000 times) and ask them whether they got their shipments late, then on average 8.25 customers, that is approximately 8 out of 55, will say that their shipment was delayed. This number is the Mean of the Binomial Distribution, n*p = 55*0.15 = 8.25, which is calculated in cell O24. Similarly, I have calculated the standard deviation of the distribution in cell O25.
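The idea of sampling 55 customers many times can be tried out with a small simulation. This is a sketch under my own assumptions: each customer is treated as an independent trial with a 15% chance of delay, the function name `simulate_delayed_counts` is hypothetical, and the seed is fixed only so that the run is repeatable.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def simulate_delayed_counts(n: int, p: float, repetitions: int, seed: int = 0) -> list:
    """Repeat the experiment and count delayed customers in each sample of n."""
    rng = random.Random(seed)  # fixed seed, so results are repeatable
    return [
        sum(rng.random() < p for _ in range(n))  # one sample of n customers
        for _ in range(repetitions)
    ]


n, p = 55, 0.15
counts = simulate_delayed_counts(n, p, repetitions=1000)

average_delayed = mean(counts)
spread = pstdev(counts)

print(f"Simulated mean: {average_delayed:.2f} (formula n*p = {n * p:.2f})")
print(f"Simulated std:  {spread:.2f} (formula sqrt(n*p*q) = {sqrt(n * p * (1 - p)):.2f})")
```

The simulated values will not match the formulas exactly, but with 1000 repetitions they should land close to them.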
After calculating the probabilities (probability mass function) of the binomial distribution, I have plotted the distribution in Excel as a bar chart. With a delay probability of 15%, the bars rise to a peak at 8 customers and then fall again. Over the plotted range of 1 to 10 customers the chart looks left-skewed, because the tail beyond 10 customers is not shown. The full distribution, with p below 50%, is actually slightly right-skewed.
Some inferences we get from the graph:
1. The probability that exactly 8 customers get their shipment late is 15.03%, the highest of all.
2. The probability that exactly 10 customers get their shipment late is 11.24%.
3. The probability that exactly 3 customers get their shipment late is 1.89%.
4. In almost every sample of 55 customers, at least one customer gets the shipment late. The probability of no delays at all is 0.85^55, which is about 0.01%.
5. In the long run, on average 8.25 customers, approximately 8 customers, get their shipment late.
6. The chart looks left-skewed over the plotted range of 1 to 10 customers, but the full distribution is slightly right-skewed because p is below 50%.

Hope you got to know about Binomial Distribution.
Thanks,
Soham Sanjay Shinde


