(00:05): This is FiveMinuteFriday, Cohort Analysis.
(00:14):
Welcome back to the SuperDataScience Podcast everybody, super pumped to have you back here on the show. Today, we’ve got a very cool topic called Cohort Analysis, and this can help you add another very powerful technique to your data science arsenal.
(00:29):
So to start off, I’d like to acknowledge that there are a lot of different ways to assess how a business is performing, especially in the B2C space, but also this will work in a B2B space as well. One way to assess how a business is performing is by looking at total metrics. So for instance, let’s say you have a membership site. Some people come to your page, then some people subscribe to the email newsletter. From there, some people take the trial version of your membership product, and then some people go and convert to the full paid membership product. So, you could look at how many people in total land on the page, or how many people in the past few months, or in the past month, or six months, purchased the full product. So, you could look at totals. That’s one way of analyzing how a business is performing, and are those numbers going up or down.
(01:20):
Another way is to look at the percentages of conversion in a funnel. So, you can imagine this is a funnel from top to bottom. How many people land on the page? What percentage of them converted to email subscribers? What percentage of them converted then to trial users? And then, what percentage of them converted to paid users? And you could look at those percentages and disregard the full total numbers, but try to optimize, increase those percentages and see, “Are our tweaks that we’re doing to the website, to the product, are they increasing those percentages?” Because ultimately, if you have a good product which you believe in, it’s helping people, you want to drive more people to it, and so that more people get benefit, and that’s how the business grows.
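[Editor’s illustration] The funnel percentages described above can be sketched in a few lines of Python. The stage names and counts are made up for illustration and do not come from the episode; each percentage is the share of the previous stage that moved on to the next one.

```python
# Hypothetical stage counts for one month (illustrative numbers only).
funnel = [
    ("visitors", 10000),
    ("email_subscribers", 1500),
    ("trial_users", 300),
    ("paid_members", 30),
]

def step_conversions(stages):
    """Return (from_stage, to_stage, percent) for each adjacent pair of stages."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty previous stage to avoid dividing by zero.
        pct = 100.0 * count / prev_count if prev_count else 0.0
        rates.append((prev_name, name, pct))
    return rates

conversions = step_conversions(funnel)
for prev_name, name, pct in conversions:
    print(f"{prev_name} -> {name}: {pct:.1f}%")
```

Comparing these step percentages month over month is the funnel view; it ignores when each person first showed up, which is the gap cohort analysis fills.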
(02:02):
But there’s a third way, at least a third way, but that’s the way we’re going to talk about today, which is very different. It also talks about percentages, but it’s very different. And that is cohort analysis. Now, I had heard of cohort analysis at least a couple times over the past six months or so, but then I was very surprised to see it in this book that I’m reading, it’s called The Lean Startup, by Eric Ries. I highly recommend this book to anybody who’s got a startup idea, or who just has an idea to start a new product within their existing organization, even if it’s a huge organization. It’s just all about experimentation, and trying things out, and running a business in a very lean way. And in one of the chapters called Measure, which is chapter seven, Eric talks about cohort analysis. I was so surprised to see this subchapter there. And so, I thought… If it’s in The Lean Startup, then definitely this technique is worthwhile exploring and sharing on the podcast.
(03:00):
So, what is this cohort analysis technique all about? Well, rather than looking at a funnel, rather than looking at your business as just a funnel by itself, you… in cohort analysis, you look at when did the customers first interact with your product? So, let’s say you have a year’s worth of data. You would look at… And let’s say you analyze, you want to analyze, how did people who first interacted with your product, in April, how did they go through your product funnel? So, it’s a slightly different approach to analyzing, “Okay, so in April, how did… how many people converted from trial users to paid users?” So, usually let’s say it’s 1% of people convert from trial users to paid users. You made some adjustments, and you want to see, “In May, did the number of people who went from trial to paid, did they increase?”
(03:51):
So, you just compare those two percentages. But in cohort analysis, you would actually take the people who joined in April, and you would follow them along the whole journey through the funnel. Even if it takes longer, even if it takes them three months to get through the funnel. So, they interacted with your product in April, though not joined, they interacted for the first time with your product in April. They found out about it in April, then they became… Some of them became subscribers right away, some of them became subscribers several months later, some of them became subscribers maybe several weeks later. It varies depending on the person. But then, out of those that became subscribers, some of them took a free trial. Again, might be a few weeks, might be a few days, might be a few months until they took it. And then, some of them became paid users.
(04:36):
So, you don’t mind when it happens that they converted, it might be in May, June, July, August, and so on. But looking back, you want to see that cohort, those customers that joined us in, or that found out about us, that first visited the website in April, how did they… what percentage of them? So, we’re going to take only those customers that joined in April. So, in the 30 days of April, we look at all those customers and then we’ll say, “All right, what percentage of those customers ever in the past…” Of course you have to have a time limit on this, so let’s say you’re doing this analysis three months down the track. So, “What percentage of those customers in the past three months have converted to email subscribers, then have converted to trial users, and have converted to paying members?” So, the analysis is different. The premise is not when the conversion happened, but rather when they first interacted with the product.
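[Editor’s illustration] A minimal sketch of the April-cohort funnel described above, assuming each customer record already carries the date of the first interaction and the date each stage was reached. The field names, the sample records, and the `as_of` cutoff (standing in for "three months down the track") are all hypothetical. Percentages are taken over the whole cohort, not over the previous stage.

```python
from datetime import date

# Hypothetical customer records; None means the stage was never reached.
customers = [
    {"id": 1, "first_visit": date(2020, 4, 3), "subscribed": date(2020, 4, 3),
     "trial": date(2020, 5, 10), "paid": date(2020, 6, 20)},
    {"id": 2, "first_visit": date(2020, 4, 15), "subscribed": date(2020, 6, 1),
     "trial": None, "paid": None},
    {"id": 3, "first_visit": date(2020, 4, 28), "subscribed": None,
     "trial": None, "paid": None},
    {"id": 4, "first_visit": date(2020, 4, 9), "subscribed": date(2020, 4, 20),
     "trial": date(2020, 4, 25), "paid": date(2020, 9, 1)},  # paid after the cutoff
    {"id": 5, "first_visit": date(2020, 5, 2), "subscribed": date(2020, 5, 2),
     "trial": date(2020, 5, 5), "paid": date(2020, 5, 9)},   # May cohort, excluded
]

def cohort_funnel(records, year, month, as_of):
    """Percent of the (year, month) cohort that reached each stage by `as_of`."""
    cohort = [r for r in records
              if (r["first_visit"].year, r["first_visit"].month) == (year, month)]
    result = {"cohort_size": len(cohort)}
    for stage in ("subscribed", "trial", "paid"):
        # It does not matter in which month the conversion happened,
        # only that it happened on or before the analysis date.
        reached = sum(1 for r in cohort
                      if r[stage] is not None and r[stage] <= as_of)
        result[stage] = 100.0 * reached / len(cohort) if cohort else 0.0
    return result

april = cohort_funnel(customers, 2020, 4, as_of=date(2020, 7, 31))
print(april)
```

Customer 5 converts quickly but first visited in May, so that customer never affects the April numbers; customer 4 would only count as paid if the analysis were rerun with a later cutoff.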
(05:32):
Another example of cohort analysis, rather than a membership site, could be an eCommerce store, like Amazon. So for instance, there, you could say, “Out of all the customers that joined us in 2015, how many of them made purchases again in 2016, and how many of them made purchases again in 2017?” So, it helps with analyzing retention. And that way, you can compare… You can look at the sales of, let’s say 2020, and you can, inside the sales of 2020… imagine it’s a big bar, and inside of the sales of 2020, you can break down the sales by the cohorts of the customer. So, how many of those customers joined in 2015, how many joined in 2016, ’17, ’18, ’19, and 2020? And then you can see the breakdown. So, do you mostly sell to repeat customers? Do you have high loyalty, high retention? Or is it mostly customers from 2020 that purchased, and then they left your website and are likely never to come back? So, that’s another way you can use cohort analysis to look at retention.
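[Editor’s illustration] The eCommerce example above can be sketched as follows, assuming a flat list of orders and treating the year of a customer’s first order as the cohort. The order data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical orders: (customer_id, order_year, amount).
orders = [
    ("a", 2015, 50.0),
    ("a", 2016, 30.0),
    ("a", 2020, 20.0),
    ("b", 2015, 40.0),
    ("c", 2019, 60.0),
    ("c", 2020, 40.0),
    ("d", 2020, 100.0),
    ("e", 2020, 40.0),
]

def first_order_year(orders):
    """Map each customer to the year of their earliest order (their cohort)."""
    first = {}
    for customer, year, _amount in orders:
        first[customer] = min(year, first.get(customer, year))
    return first

def sales_by_cohort(orders, target_year):
    """Split the sales of `target_year` by the cohort of the buying customer."""
    cohort = first_order_year(orders)
    totals = defaultdict(float)
    for customer, year, amount in orders:
        if year == target_year:
            totals[cohort[customer]] += amount
    return dict(sorted(totals.items()))

def repeat_rate(orders, cohort_year, year):
    """Percent of the `cohort_year` cohort that purchased again in `year`."""
    cohort = first_order_year(orders)
    members = {c for c, y in cohort.items() if y == cohort_year}
    active = {c for c, y, _ in orders if y == year and c in members}
    return 100.0 * len(active) / len(members) if members else 0.0

breakdown = sales_by_cohort(orders, 2020)
print(breakdown)                        # {2015: 20.0, 2019: 40.0, 2020: 140.0}
print(repeat_rate(orders, 2015, 2016))  # 50.0
```

In this made-up data, 70% of 2020 sales come from customers who first bought in 2020, which is the "mostly new customers" pattern the paragraph warns about.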
(06:36):
So, that’s a quick overview of what cohort analysis is. The basic… the premise is to look at when the customer first made a purchase, or that first data point that you have on the customer. It might be when they first saw your website, if you have some cookies or something else that you’re able to track that, or maybe when they first became an email subscriber, that might be a starting point, or maybe when they made a first purchase on your eCommerce store, or whatever else that you define as the first interaction, first measurable interaction. You measure that, and you look at, basically, in the equation, you extract the minimum of the date. So, every interaction will have a date. Your goal is just to pull out the minimum of all those dates, and that will be attached to the customer forever on. From then on, the customer is always going to belong to that cohort. So, if they’re 2015 cohort, they will be at 2015… they’ll be part of the 2015 cohort for the rest of their life. And then you can analyze that and build interesting visualizations.
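[Editor’s illustration] The "minimum of the date" step described above is small enough to show on its own. This sketch assumes an interaction log in no particular order and uses the calendar year as the cohort label; the log contents and the function name are hypothetical.

```python
from datetime import date

# Hypothetical interaction log: (customer_id, interaction_date, kind).
events = [
    ("u1", date(2016, 2, 1), "purchase"),
    ("u1", date(2015, 11, 20), "email_signup"),
    ("u2", date(2017, 6, 5), "visit"),
    ("u1", date(2015, 3, 14), "visit"),
    ("u2", date(2018, 1, 9), "purchase"),
]

def assign_cohorts(events):
    """Label each customer with the year of their earliest recorded interaction."""
    first_seen = {}
    for customer, when, _kind in events:
        # Keep only the minimum date per customer.
        if customer not in first_seen or when < first_seen[customer]:
            first_seen[customer] = when
    return {customer: d.year for customer, d in first_seen.items()}

cohorts = assign_cohorts(events)
print(cohorts)  # {'u1': 2015, 'u2': 2017}
```

Later interactions never change the label, which matches the point that a customer stays in the same cohort from then on. Which interaction counts as "first" (visit, signup, or purchase) is a choice made when building the log.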
(07:31):
Again, it can help understand how customers… how you’re improving the product or not improving the product, how customers are going through the funnel. But it’s not just that funnel metric, it’s a cohort metric, so since the part… since the time they first interacted. Also, it can help analyzing loyalty of customers, retention, and things like that. If you’re using Tableau for visualization, you need to do a level of detail expression to do this, and show… And of course it can be done in other tools, but other techniques are required for this.
(08:00):
So, that’s cohort analysis. If you’re interested, highly recommend checking out more online reading into it. It’s a very powerful tool. For instance, we have a membership site at SuperDataScience, and right now we’re hiring a data analyst. And one of the questions I will definitely be asking in the interviews is, “Can you do cohort analysis? Do you know what cohort analysis is? Do you know how it works?” Because that’s a great way for us to measure if we’re improving our product, making it better or not.
(08:25):
And finally, speaking of cohort analysis and Tableau, I would like to invite you to a free webinar. So, I’ll be running a webinar on Cohort Analysis in Tableau. It’s an advanced technique using level of detail expressions. And this webinar is going to be on the 3rd of August. On the 3rd of August, you can register at datasciencelabs.com. The registration is already open now. As of when this podcast goes live, registration is open, so you can go ahead and register for this webinar. It’s absolutely free. And we’ll be doing a… It will be a practical workshop on cohort analysis, so you’ll need to have at least Tableau Public, preferably if you can, Tableau Desktop, but doesn’t really matter. Tableau Public will work. There’ll be a dataset provided, and it will be a full-on, at least one hour webinar, where we’ll talk about the theory of cohort analysis, use cases, and then we’ll actually perform cohort analysis on a real dataset. That’ll be all happening live.
(09:25):
So, if you’re interested, make sure to register. You can grab your seat at datasciencelabs.com. And yeah, I’ll see you on the 3rd of August on the webinar. Can’t wait, see you then, and until next time, happy analyzing.