Note: I’ve been doing my best to explain the concepts of Data Science to management types lately, and I’ve had to resort to some interesting (and possibly non-HR friendly) analogies to make them memorable. At the urging of some who have heard these, I’m starting a series of posts called “Crazy Data Science Tutorials”.
Probability is one of the most useful tools in statistics, even if it is also one of the most easily misused (for those who may not know what I mean by that, a probability model called “Value at Risk” has been widely blamed for its role in the 2008 financial meltdown). Being able to take past events and estimate the probability of a similar future event has almost endless applications in business.
So here, in as interesting a way as I can tell it, is how probability works.
In a nutshell, the main question of probability is “How probable is A, given B?” For example, what’s the probability you’ll be sleeping alone if you forgot your wife’s birthday?
(And ladies, if you haven’t already stopped reading, stay with me. There will be an analogy for you sometime in this series, I promise.)
Let’s assume there are 10,000 men we are studying. There are two main groups:
- Men who forgot – 100
- Men who had the sense to put it in their calendar software – 9,900
But within these two groups, there are four subgroups:
- Men who forgot and are banished to the couch – 90
- Men who forgot but still get to sleep in the bed – 10
- Men who didn’t forget and are enjoying the company of their bride – 9,890
- Men who didn’t forget and yet are still sleeping alone. These geniuses probably thought a vacuum cleaner was a great gift idea. – 10
To figure this out (as if you need to):
Out of 10,000 men, 100 will sleep alone tonight; 90 of those 100 have forgotten. From the same 10,000 men, 9,900 will not sleep alone and of those 9,900 men, 10 also forgot. This makes the total number of men who forgot 90+10 or 100. Of those 100 men who forgot, 90 will sleep alone. Doing the math, this is 90/100 or 90%.
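For anyone who would rather see the head count as code, here is a minimal sketch of the same arithmetic in Python. The variable names are my own invention for illustration; the numbers are the four subgroups listed above.

```python
# Counts from the four subgroups above (10,000 men in total).
forgot_alone = 90        # forgot, banished to the couch
forgot_bed = 10          # forgot, still sleeping in the bed
remembered_bed = 9_890   # didn't forget, enjoying the company of their bride
remembered_alone = 10    # didn't forget, sleeping alone anyway

total = forgot_alone + forgot_bed + remembered_bed + remembered_alone
forgot = forgot_alone + forgot_bed        # everyone who forgot
alone = forgot_alone + remembered_alone   # everyone sleeping alone

# Of the men who forgot, what fraction sleep alone?
p_alone_given_forgot = forgot_alone / forgot

print(total, forgot, alone)
print(f"P(alone | forgot) = {p_alone_given_forgot:.0%}")
```

Note that the division is by the number of men who forgot, not by all 10,000. That restriction to one group is what the “given B” part of the question means.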
From this analysis of what happened to those 10,000 men, we can estimate the probability that you’ll be sleeping alone should you happen to forget: about 90%. Grab a pillow on your way to the couch, bro.
What I’ve described above is the heart of something called Bayes’ Theorem. If you’re curious, it looks like this:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Where P means “Probability”, A is “Sleeping alone” and B is “Forgetting your wife’s birthday.” P(A | B) is read as “the probability of A, given B.”
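To check that the theorem agrees with the head count, here is a small sketch that plugs the numbers from the 10,000 men into it. The function name `bayes` and the variable names are hypothetical, chosen just for this example; this is not how any particular analytics package does it.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) using Bayes' Theorem: P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b

total = 10_000
p_a = 100 / total        # P(A): sleeping alone (90 + 10 men)
p_b = 100 / total        # P(B): forgot the birthday (90 + 10 men)
p_b_given_a = 90 / 100   # P(B|A): of the 100 men sleeping alone, 90 forgot

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(f"P(A|B) = {p_a_given_b:.2f}")
```

In this example P(A) and P(B) happen to be equal (100 men each), so they cancel and P(A|B) comes out the same as P(B|A). That is a coincidence of these particular numbers, not a general rule.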
Now, you won’t need to remember this formula or how to solve for it because whatever analytics software you’re using will do that for you. But what fun is that?