This is FiveMinuteFriday, Understanding the P-Value.
Welcome back to the SuperDataScience podcast everybody, super pumped to have you back here on the show. If you have been following this podcast for the past couple of weeks, then you've probably heard me say that Hadelin is spending a ridiculous amount of time updating the Machine Learning A to Z course. And I have gotten dragged into it as well, which I'm very excited about. I've updated the support vector regression tutorials and now also the NLP tutorials, so we've added some tutorials there. And a few days ago, Hadelin asked me to look at the p-value part of the course. There's a section of the course on multiple linear regression, and people kept asking, "What about p-values? Can you explain p-values?" because we hadn't actually included a tutorial like that in there.
So what I did is I went to our Statistics for Business Analytics A to Z course, took a p-value tutorial out of there, watched it, made sure that it was appropriate, added a little bit of a video at the beginning, and added that borrowed, in a way, tutorial from the Statistics course to the Machine Learning A to Z course. So long story short, now we have a p-values tutorial in there. And I got quite inspired by that, because I watched the whole tutorial and refreshed my own knowledge on p-values, and I thought it'd be really cool to talk about it on the podcast. Especially because a few weeks ago, I think it would have been three weeks before you're listening to this, Sam Hinton was on the show and we talked about p-values and statistical significance and things like that. So things overlapped, and I think it's good timing to talk about p-values.
So we're going to quickly dive into the world of p-values and try to understand them on an intuitive level: what p-values are all about, what this whole 0.05 thing means, and why knowing the intuition behind them is valuable. And if you've already seen this tutorial in one of our other courses, then this will be a good refresher as well. So here we go.
What are p-values all about? We're going to imagine a very simple experiment that we've all done many times in our lives: a simple coin toss. We have a coin, and we're going to throw it and observe whether we get heads or tails. Now, we don't know if this is a fair coin or not a fair coin. By fair coin, we mean that it's not weighted and it's equally likely to land on heads or on tails. And that's going to be our null hypothesis. So we write H with a little zero at the bottom, with an index zero: that's our null hypothesis, and the null hypothesis is that this is a fair coin.
So our assumption there is that we live in a universe where this coin is a fair coin. Now, H1 is our alternative hypothesis, and the alternative hypothesis is going to be that this is not a fair coin. We don't know exactly how it's not fair. It might be weighted towards one side or the other, so it's more likely to fall on one side than the other, or it might have the same face on both sides, for example, both sides tails or both sides heads. So it's not a fair coin. That's our alternative hypothesis, and basically that's another universe. We don't know which universe we live in. Do we live in the H0 universe where the coin is fair, or do we live in the H1 universe where the coin is not fair?
And I like thinking about it in terms of universes, because that puts it into a whole different perspective: we're either in this universe or in that universe, and you don't have to guess about the coin itself. It's just, what is the nature of reality? So we have two hypotheses, and while discussing p-values and statistical significance today, we're also going to touch on hypothesis testing. So we flip the coin and imagine it lands on tails. How do you feel? There are two parts we're going to be looking at. First part: how do you feel about the result? Second part: the probability. So how do you feel about the coin landing on tails on the first throw? Does it feel like this could be a fair coin, or does it feel a bit dodgy?
Probably feels all right, because a fair coin is equally likely to land on heads or on tails. There's a 50% chance of it landing this way, a 50% chance of it landing that way, so it doesn't feel iffy, doesn't feel like there's a setup or anything dodgy in any way. Totally fine. And the probability there is 50%. Now we flip the coin again and it lands on tails again. How do you feel about a coin landing twice on tails in a row? Probably okay. I feel quite okay about it. That's totally normal. Could happen. No big deal. And the probability there is 25%. Now we flip it again and it lands on tails one more time. Third time it's landed on tails. How do you feel about that?
Maybe you're starting to get a little bit suspicious, but again, it could happen. It could totally happen. In the course of a day you flip a coin three times, if that's what you do, flip coins, and bam, it lands all three times on tails. Nothing critical there. Maybe a little bit suspicious, but overall I would feel okay about that. The probability there is 12.5%. There's a 12.5% chance of that happening. Now you flip it again, and again it lands on tails. Four times in a row. How do you feel about that? Now I would be starting to feel much more suspicious, a strong level of suspicion that something's not right here. That's quite a lot of tails. And that's normal, it's normal to feel that maybe something's going on here. The probability here is about 6%, a 6.25% chance of that happening, to be exact.
And then you flip it again, and yet again it lands on tails. So, fifth time in a row. Well, now you're feeling very suspicious. You're thinking, "This can't be happening. Something's wrong with this coin. Five times in a row it's landed on tails. What is going on here? This has to be a rigged coin, not a fair coin." So what's going on here? Well, what's going on is that the probability of this happening is about 3%, 3.125% to be exact. What does that mean? It means that if you treat five flips in a row as one experiment and you run that experiment a hundred times, then you'd expect all five flips to land on tails in only about three of those 100 experiments.
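By the way, if you want to check those numbers for yourself, here's a minimal sketch in Python (my own illustration, not something from the course) that computes the exact probabilities and confirms the "about three in a hundred" figure with a quick simulation:

```python
import random

# Probability of n tails in a row with a fair coin is 0.5 ** n.
for n in range(1, 6):
    print(f"{n} tail(s) in a row: {0.5 ** n:.5f}")
# 0.50000, 0.25000, 0.12500, 0.06250, 0.03125

# Monte Carlo check: treat five flips as one experiment and count how
# often all five land on tails.
trials = 100_000
all_tails = sum(
    all(random.random() < 0.5 for _ in range(5))
    for _ in range(trials)
)
print(f"estimated probability: {all_tails / trials:.4f}")  # roughly 0.031, i.e. ~3 in 100
```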
So that's if it's a fair coin. And this is where the universes come in handy. If we truly live in the null hypothesis universe, if we live in the universe where the null hypothesis is true and this is a fair coin, then the probability, or the likelihood, of tails happening five times in a row is so low that we feel uneasy about it. We feel that something is wrong here, and that's because we've assumed we live in the null hypothesis universe. Now, if we were to reverse our assumption and say, "Let's reject the null hypothesis." Let's say we don't live in a universe where the coin is fair; let's say we live in a universe where the coin is not fair, where, for example, both sides are tails. Then how would you feel about the coin landing five times in a row on tails?
You would feel totally fine. You would feel absolutely fine about it, because the coin has no other option. Whether you flip it one time, two times, five times, a million times, the chance of it landing on tails is a hundred percent every single time. So you would feel totally fine about it. And that's where p-values and this universes approach, the hypothesis-testing approach, are very powerful. If you assume a null hypothesis and you get to a point where you feel like this can't be happening, then that's the point where you reject the null hypothesis and you go for the alternative hypothesis, where you'll feel totally fine, where what you're observing is not statistically unlikely.
And that's in feeling terms. You can't really go up to your boss in the workplace and say, "Excuse me, dear manager, I feel that this result is not statistically significant because I have a bad feeling about it." So what is the mathematical side of it? Well, the mathematical side is that as soon as your probability drops below 0.05, below 5%, you have a case to reject the null hypothesis. Basically, the p-value is that probability we were calculating just now, from 50%, to 25%, to 12.5%. Every time we observed more and more tails, the p-value was dropping. So the p-value of five tails in a row is about 0.03. And as long as your p-value is below 0.05, you can reject the null hypothesis, and the correct way of phrasing it is that we have sufficient evidence to reject the null hypothesis. We have sufficient evidence to state that this coin is not a fair coin.
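If you'd rather not do the arithmetic by hand, one standard way to get this number is an exact binomial test. Here's a small sketch assuming you have SciPy installed (binomtest needs SciPy 1.7 or later):

```python
from scipy.stats import binomtest

# H0: the coin is fair (probability of tails is 0.5).
# We observed k = 5 tails in n = 5 flips; "greater" asks how surprising
# it is to see this many tails or more if H0 were true.
result = binomtest(k=5, n=5, p=0.5, alternative="greater")
print(f"p-value: {result.pvalue:.5f}")  # 0.03125

alpha = 0.05  # the 5% cutoff, i.e. a 95% confidence level
if result.pvalue < alpha:
    print("Sufficient evidence to reject H0: the coin may not be fair.")
else:
    print("Insufficient evidence to reject H0.")
```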
And of course it's not a hundred percent certain, we might be wrong, but if the coin really were fair, the chance of seeing a result this extreme is below 5%. And depending on the application or use case, you might want to set the cutoff at 5%, that's a 95% confidence level, or at 3%, 2%, or 1%, with 1% corresponding to a 99% confidence level. So it really depends how confident you want to be about rejecting the null hypothesis, in this case that it's a fair coin, and going for the alternative hypothesis. So that's what p-values are all about. And what we were discussing with Sam Hinton, for example, a few podcasts ago, was that p-values are at the foundation of most of the research that is happening in the world right now. For most research papers to get published, they basically need to show statistical significance, with, for instance, a p-value below 0.05.
What does that mean? Well, it means they need to be more than 95% confident that there is an effect, that they're observing an effect. So the null hypothesis is usually stated as "there is no effect." For example: there is no correlation between smoking and lung cancer. That would be your null hypothesis, and your alternative hypothesis, H1, would be that there is a correlation, that there is an effect, or a causal link between the two: the more a person smokes, the higher the chance that they will get lung cancer. So the null hypothesis is that there's no effect; the alternative hypothesis is that there is an effect.
So the way research is structured, we assume we live in the world of the null hypothesis. We assume that, hey, there is no effect, there is no correlation or causation between them. So we're just going to go and observe things about the world. We're going to collect samples, we're going to survey people, we're going to get medical records, and we're going to collect all this data and see if it fits the universe we've assumed we live in, if all of it nicely falls under this null hypothesis. And that will be fine. It basically means that maybe the null hypothesis is correct. We can't say that it's correct or incorrect at that stage; we just see that all the data falls under the null hypothesis and everything's great. But as soon as we're able to obtain enough data, sufficient evidence, to say, "Hold on a second. If we lived in the universe of the null hypothesis, what we're seeing wouldn't be possible, or what we're seeing would have less than a 5% chance of happening in the case of a 95% confidence level," then the picture changes.
So if we're able to gather evidence showing an effect that would have less than a 5% likelihood of happening in the world of our null hypothesis, then we can reject our null hypothesis. And then we can say the golden words: we now have sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, H1. And that's how you would, for instance, show that there is a correlation between smoking and getting lung cancer. So that, in a nutshell, is what it's all about. Of course, in most applications in data science and in business, you don't have to go through hypothesis testing and write it all out, but it would actually be a very useful exercise to do that.
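To make that exercise concrete, here's what it might look like in code for a correlation question shaped like the smoking example. The numbers are entirely synthetic, made up for illustration, not real medical data, and the sketch assumes NumPy and SciPy:

```python
import numpy as np
from scipy.stats import pearsonr

# Made-up illustrative data for 50 people (NOT real medical data):
# cigarettes smoked per day and a hypothetical lung-damage score.
rng = np.random.default_rng(42)
cigarettes_per_day = rng.uniform(0, 40, size=50)
damage_score = 2.0 * cigarettes_per_day + rng.normal(0, 15, size=50)

# H0: there is no correlation. H1: there is a correlation.
r, p_value = pearsonr(cigarettes_per_day, damage_score)
print(f"correlation r = {r:.2f}, p-value = {p_value:.2g}")

if p_value < 0.05:
    print("Sufficient evidence to reject H0 in favor of H1.")
else:
    print("Insufficient evidence to reject H0.")
```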
If you were to sit down and write out your null hypothesis and your alternative hypothesis, select your confidence level, whether it's 95%, 99%, whatever it is, then calculate the probability of whatever you observed happening, calculate your statistical significance, and actually write out why you're rejecting the null hypothesis and saying, "Oh, actually there is a correlation between our customers seeing the red banner instead of the blue one and the number of people clicking on the subscribe button," if you're doing some marketing analytics with data science, or whatever other application you're working on. If you actually write it out, write out your null hypothesis that there is no difference whether we use the red or the blue color, and your alternative hypothesis that there is a difference between the red and the blue color, and you calculate your p-value and you see that it is indeed below 0.05, and you do that a couple of times, it would be a great exercise for getting into this mindset of thinking about statistical significance.
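And here's one way that red-versus-blue banner test could look in code. The click counts are made up purely for illustration, and I'm using a chi-squared test of independence from SciPy as one reasonable choice (a two-proportion z-test would work just as well):

```python
from scipy.stats import chi2_contingency

# Hypothetical A/B test results (numbers made up for illustration):
# rows = banner color, columns = [clicked subscribe, did not click].
observed = [
    [120, 880],  # red banner:  12% click rate out of 1,000 visitors
    [90, 910],   # blue banner:  9% click rate out of 1,000 visitors
]

# H0: banner color makes no difference to the click rate.
# H1: banner color makes a difference.
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"p-value = {p_value:.4f}")

if p_value < 0.05:
    print("Sufficient evidence to reject H0: the color appears to matter.")
else:
    print("Insufficient evidence to reject H0.")
```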
Because a lot of data scientists, most data scientists from my conversations and from things I've observed, don't consider this. It's not a habit. They might know about it, they might be well versed in this space of statistical significance, but a lot of the analysis that goes out there doesn't actually consider statistical significance. And that's a very, very slippery slope. It's very dangerous to provide business insights, especially actionable business insights, to managers, to the executive level, when they're not statistically significant, because results like that shouldn't be relied upon. So knowing and using statistical significance, and also educating your audience about statistical significance, is an important skill and an advantage in the hands of a data scientist, something that can help take your career to the next level.
All right. So hopefully that was helpful. Check out our Machine Learning A to Z course if you haven't yet. If you have already, then do it again. Why not? Hadelin has updated all of the Python tutorials; they're now in Google Colab. I've updated some of the Intuition tutorials. It's a good opportunity to practice everything again and learn some new things along the way. You can find Machine Learning A to Z on Udemy if you just want to get that course by itself, or in the SuperDataScience membership along with all of our other courses. On that note, thank you so much for being here today. I look forward to seeing you back here next time. Until then, happy analyzing.