SDS 363: Intuition, Frameworks, and Unlocking the Power of Data

Podcast Guest: Piyanka Jain

May 6, 2020

This is a rollercoaster of knowledge about analytics and data science. We talked about important data science frameworks, putting courses into context, how to do lean data science, how data science projects fail, and more.
About Piyanka Jain
A highly regarded industry thought leader in data analytics, Piyanka Jain is an internationally acclaimed best-selling author and a frequent keynote speaker on using data-driven decision-making for competitive advantage at both corporate leadership summits and business conferences. At Aryng, she leads her SWAT Data Science team to solve complex business problems, develop enterprise-wide Data Literacy, and deliver rapid ROI using machine learning, deep learning, and AI. Her client list includes companies like Google, Box, Here, Applied Materials, Abbott Labs, and GE. She writes for publications including Forbes, Harvard Business Review, and InsideHR.
Overview

Piyanka is all about practical data science and the power of data across all different careers and life paths from marketing managers to someone looking to climb Mount Everest. Life and goals require data. Does data take away the chance and luck in life? Not exactly. Data-driven decision making is all about marrying data to the intuition we already have. An example Piyanka gives is someone deciding to jump into the ocean and hoping to come across a treasure. That’s intuition and you want to keep that but you need to inform it with data about your situation. Bring your whole self to data science. 

Piyanka’s BADIR framework, which stands for business question, analysis plan, data collection, insights, and recommendation, is the set of steps she developed to follow during a data science project. Notably, data is third on this list: you must start with the business question. You also need to lay out what kind of problem this is. These are all planning stages before you touch the data. This framework has been adopted in some fashion by Apple, PayPal, eBay, and others. Piyanka published BADIR in her book and made it open-source, which has made it accessible and attractive to companies, and many reach out to her for assistance. She compares it to a recipe. The framework also earned her team its reputation as a “SWAT team” of data scientists, meaning their results are fast and effective. They achieve this by being hypothesis-driven and by ensuring that stakeholders are aligned at the beginning of the process.
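As a rough illustration of the ordering idea, the sketch below models the five BADIR steps as a checklist that refuses to skip ahead, so data collection is only allowed once the business question and analysis plan are done. The class name BadirProject and its methods are hypothetical helpers invented for this example; they are not taken from Piyanka's book or from Aryng's materials.

```python
from dataclasses import dataclass, field

# The five BADIR steps, in the order named in the episode.
BADIR_STEPS = (
    "business question",
    "analysis plan",
    "data collection",
    "insights",
    "recommendation",
)


@dataclass
class BadirProject:
    """Hypothetical helper: tracks completed BADIR steps and enforces their order."""

    completed: list = field(default_factory=list)

    def next_step(self):
        """Return the next step to work on, or None when all five are done."""
        if len(self.completed) < len(BADIR_STEPS):
            return BADIR_STEPS[len(self.completed)]
        return None

    def complete(self, step):
        """Mark a step as done; raise ValueError if it is not the next step in order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"cannot do {step!r} yet; next step is {expected!r}")
        self.completed.append(step)


project = BadirProject()
try:
    project.complete("data collection")  # jumping straight to the data
except ValueError as err:
    print(err)

project.complete("business question")
project.complete("analysis plan")
project.complete("data collection")  # allowed now that the planning steps are done
```

The point of the sketch is only the gating: the first attempt to collect data is rejected, and the same call succeeds once the two planning steps have been completed.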
Piyanka points out some dismal stats: 85% of big data and data science projects fail, whether through abandonment halfway through or lack of use once they are done. Of the successful ones, fewer than 15% truly drive impact or change. Many factors go into this, including data literacy and data maturity. The latter is the foundation of establishing a good culture of robust data science in your company. Data literacy is important for the longevity of your work in a company. Does the marketing manager know what to do with data? Does the business manager? Also important are data-driven leadership and data-driven decision-making processes.
A final note from Piyanka is the difference between decision science and data science. Data science is the technical and algorithmic side of things. Decision science is everything you do to make sure those processes yield an impact. When you marry them, you get the true power of data.
In this episode you will learn:   
  • The power of data plus intuition [5:29] 
  • BADIR framework for data science [12:36] 
  • What can students pick up from Aryng’s courses? [24:58] 
  • SWAT data science teams [34:16] 
  • The rate of successful projects [39:38] 
  • Four D’s of Data Culture [45:27] 
  • Decision science vs data science [49:17] 
  • Piyanka’s inspiration for her book [51:23] 
Episode Transcript

Kirill Eremenko: This is episode number 363 with President and CEO at Aryng Analytics, Piyanka Jain. 

Kirill Eremenko: Welcome to the SuperDataScience podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur. And each week we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today and now, let’s make the complex simple. 
Kirill Eremenko: Welcome back to SuperDataScience podcast everybody. Super excited to have you back here on the show. Are you ready for a rollercoaster of knowledge? This is going to be a lot of fun. I just got off the phone with Piyanka Jain, and you will be overloaded with information about analytics and data science. Literally, I have so many notes and so much in my head. I probably need to sit down and process this for quite a bit of time. 
Kirill Eremenko: So Piyanka is the founder, president, and CEO of Aryng Analytics, an analytics consulting company where they provide services to enterprises and businesses on how to be better with data science, data driven decision making. Also Piyanka is an author of several books now, of multiple books, best selling books which you can find on Amazon. We’ll talk about one of the books during the podcast. Piyanka’s also a writer for publications like Forbes, Harvard Business Review, Inside HR. She has keynoted many conferences around the world, and also, Piyanka is an educator. So they have data science courses on Aryng.com. They have a whole academy of data science where they provide certifications and help people get into the space of data science. So as you can tell, Piyanka is involved in many aspects of data science. 
Kirill Eremenko: And what exactly are we going to be talking about in this episode? There was so much to choose from. There was so many questions I had, so many topics we could have gone into. There was virtually, or literally… Impossible. It was virtually impossible to cover everything. So what did we cover? Well, we talked about, among other things, a very important framework called BADIR, a framework that Piyanka developed herself. It’s B-A-D-I-R. And this is a framework that allows you to do data science in a very thought through way. According to Piyanka, with this framework, you can do lean data science. You can do data science much quicker than normal. You can deliver results faster because you’re thinking things through. Not often do you hear about data science frameworks. I found this one very interesting, especially how it uses hypothesis based data science. In this podcast, you’ll get an acquaintance with this framework, and if you’d like to learn more about it, you can always follow up and check out the book or other resources. 
Kirill Eremenko: In addition to that, we’ll talk about putting courses into context and what that means and how you can do that for yourself, and why you would need to do that. SWAT teams in data science and how to know if your team is a SWAT data science team, how to do lean data science and what percentage of data science projects fail and why. You’ll actually be very surprised at the number. In addition to that, we talked about the four components of data culture and how they come hand-in-hand, how do they enable each other, and we discuss the difference between decision science and data science. So there we go, a podcast full of value. Can’t wait to get started. So without further ado, I bring to you, president and CEO of Aryng, Piyanka Jain. 
Kirill Eremenko: Welcome back to the SuperDataScience podcast everybody. Super excited to have you back here on the show. And today’s guest is calling in from California. Welcome to the show Piyanka Jain. How are you today? 
Piyanka Jain: I’m great and excited to be here.
Kirill Eremenko: Very excited to have you. And this was a first for me because before the start of the episode, you asked me a ton of questions about the audience, about how you can help our listeners better, and all these other things. I normally don’t have that. So very excited. I can tell right away you have an inquisitive mind, and that probably serves you quite well in your career in data science, doesn’t it? 
Piyanka Jain: It does. A curious mind is a good data science mind.
Kirill Eremenko: For sure. That’s a great motto. How are things going in California these days? 
Piyanka Jain: Things are good. We’re all sheltered in and it’s a good thing, and hopefully, we are able to contain COVID soon. But yeah, no. Otherwise things are good. We’re all doing what we need to be doing with social distancing and so on.
Kirill Eremenko: Yeah. That’s right. That’s right. Yeah, hopefully it does go past quite quickly with these new measures. But I’ve had a look at your career background, and it’s extremely impressive, from having a published book to being a CEO of a company that does both consulting for companies like, as far as I understood, Google, Box, Apple, General Electric, and many others. Also you do education in the space of data science. You are everywhere in the space of data science. Tell us a bit about yourself. For somebody who hasn’t met you before, how would you describe what you do? 
Piyanka Jain: Thank you so much for your kind words. I feel like I’m just getting started, but for those who are listening in and want to know a little bit more about me, I am all about practical data science. I really believe in the power of data, and for me, data plus intuition, because we are all intuitive beings, and if we can marry data to that, we can really optimize our decisions. And that goes all the way from corporate decisions as a marketing manager, as a business manager to data scientists to all the way to our internal as personal human beings. You want to achieve something. You want to climb Mount Everest. You have to use data, and that’s how you’re going to be able to manage and optimize your progress and your decisions. So I really believe in that, and I think that’s what I evangelize, and that is what I hope to share… I hope to be infectious about my passion for data science today.
Kirill Eremenko: Love it. Love it. Interesting thing just ran through my mind when you were saying that. Indeed, if you’re going to climb Mount Everest, you’ve got to use data. It might sound strange at the start but when I think about it, you’ve got to use data on, okay, I’ve climbed this other mountain. Maybe you’ll be doing training. You’ll be measuring your pulse. You’ll be measuring how tired you get, how much endurance you have, how much water you consume, how much oxygen you breathe, and data will definitely get you there. The interesting thing that went through my mind was, some people might say that if you just use data in everything from business to personal life to sports, eventually, you’ll be like a robot. You won’t have any emotion, empathy, any kind of random chance that comes with life that is normal. What would you say to people who have that opinion?
Piyanka Jain: I have a lot to say about that. One is that when we talk about data, we don’t talk about data driven as in just believe in data. We always talk about hypothesis driven, data driven decision making. What does that mean? What that means is you want to bring your whole self, your intuition and the intuition of your colleagues, of your stakeholders to bear and then form what is needed from the data, and then prove your hypotheses. So for us, data science, or this aspect of being able to apply, putting data to work is all about marrying data to intuition that you already have.
Piyanka Jain: So for example, if you’re a marketing manager, you probably have some really good intuition about your audience, about what works for them, what products they like, and so on. Let’s use that intuition you have, the context you have as well as your team members, as well as your stakeholders, and then form a hypothesis driven… We teach a framework called BADIR, that’s also there in my book, Behind Every Good Decision, for those who are interested in knowing more about it. It’s called B-A-D-I-R is the name of the framework, and we talk about how… For you to be effective and efficient in data science and analytics, you basically lay out a hypothesis driven plan.
Piyanka Jain: So even before you touch data, you lay a hypothesis driven plan. You think what are the things… If you are solving a problem, going back to our personal goal, let’s say you were going to go look for treasure in the Pacific Ocean. There are two approaches to it, right? One is, I’m going to be just… just going to jump in because I want to experience the world so I’m just going to jump in and start swimming in the ocean, and hopefully, one day, I will run into a treasure. How likely is that, Kirill?
Kirill Eremenko: 0%. 
Piyanka Jain: Yeah, right? You have to be super lucky. You’re relying on luck and you’re relying on… And you’re actually giving up your power because for anybody who sets sail in the Pacific Ocean, you have limited resources. Maybe you have one-month’s supply. So you’re basically saying, “Oh, I have one-month’s supply but I’m going to set sail,” you’re risking that one-month supply because you don’t even know. Maybe if you had infinite supply and infinite time, maybe someday you will run into a treasure. But on the other hand, if you’re like Sherlock Holmes, and you are a detective, and you lay out hypotheses, what do hypotheses mean? Basically, you have good ideas of where the treasure could be.
Piyanka Jain: So you look up past shipwrecks, past routes, the depth of the ocean, all of that, and you figure out, you narrow down, these are the most likely spots and the news reports of where debris was collected and so on. And then you narrow down, these are the five most important, or 10 most important spots, most likely spots for treasure. And you go there and then you send your deep sea divers or your submarines down there, you’re more likely to find that, and at least you would fail faster as well. You would have looked at those 10 spots. You would know within 10 days or whatever else, “Okay. I don’t have it. Now what’s my next plan forward?” Versus just kind of going, right? So that is what we talk about.
Piyanka Jain: When we talk about data driven, we talk about hypothesis based, data driven. So going back to your question, will we become a robot? We are human beings and we are special beings. We can’t quite become robots. However, you also don’t want to just rely on data. And I have seen people who are so data driven that they leave their intuition behind, and they come up with these results sometimes, in the business as well as in personal, and we look at it, and you’re like, “This does not even make sense.” My entire being rejects what you are concluding, and that’s my intuition, right?
Piyanka Jain: So you never want to leave the intuition. Intuition is a big part of us. Intuition is what keeps us safe. Intuition is when you’re going up Mount Everest and you’re beginning to feel not so good. You look up your VOX meter and you say, “Oh, what’s my oxygen level? What I’m absorbing is dropping,” right? If you didn’t have intuition, if you’re not paying attention, you won’t even know. And before you know, you will have fainted before you even can look at your data, right? So you need to bring your whole self to this game. Data science is not about just data. It’s about bringing your entire self to the table.
Kirill Eremenko: Fantastic. Thank you for the rundown. It was a great way to see how data science can be combined with intuition. I think a lot of us would agree that both have a place. And I actually want to talk a bit more about your BADIR framework. So I read about it. So the B-A-D-I-R. What do those letters stand for and what is this framework all about? 
Piyanka Jain: Yeah. So the BADIR is an acronym for these five steps, and they stand for business question, analysis plan, data collection, insights, and recommendation. And if you notice, the part about data collection is step number three. So many people think, when they think about data science, they think about, “Oh, let’s start with data.” But it doesn’t start with data. Good analytics, a good data science project doesn’t start with data. It starts with business question, refining and fleshing out what you really want to find out. What is it your question… For example, a question could be why are our sales dropping, or why are our customers churning, or why is our conversion down, or can we optimize our conversion? Can we improve our user acquisition? In what ways can we improve our loss ratios, and so on? So those can be the questions.
Piyanka Jain: And the point about having a full step for it is basically, there’s an ask that comes in to the stakeholder, or the first thought if you’re a marketing manager and yourself, a citizen analyst, which means part of your work, you are doing data science which almost all of us are right now. In the world these days, in the business world, the language of business is data, and so everybody is speaking the language of data. And if they’re not, they’re being left behind. So everybody sort of is sort of having some access to data. And the first thought that comes to your mind as you think about, “Oh. I need to do my next campaign, or I need to figure out whether this feature works or not,” that’s an early question and even if you are a data scientist, if somebody asks you a question, it’s an early question. You need to define it through the business question framework to come to the real business question. And that refinement has many aspects about what actions somebody’s ready to take. If you find the insights with it, there is the, who are the stakeholders? And very many, many aspects both on the data science side as well as the decision science side. And so, that’s business question.
Piyanka Jain: Then you lay out a hypothesis driven analysis plan. You ask yourself and the stakeholders, what is the solution? So if the question was why are our customers churning, then many people will have some good ideas. Your stakeholders may have good ideas like, “Oh, we have recently increased our price and that’s causing some churn. Our ticketing system is not working that well or our policies have changed recently or the customers are churning because we post 90-days, these things that we do that are just not working well,” and so on and so forth. You have lots of good hypotheses. It’s like good spots that you would, going back to our Pacific Ocean [inaudible 00:15:44], it’s going back to where you’re going to look deeper. So these are good hypotheses. And then you lay out an analysis plan with hypotheses, with your methodology, with your assumptions, with your data and criteria to prove this. There’s a bunch of stuff.
Piyanka Jain: And from there comes out your data specification which means if this is the question, and these are my hypotheses and these are my assumptions, facts, and my methodology and so on… And by methodology, I mean your specific approach to data science. So are you going to use aggregate analysis? Are you going to go correlation analysis? Are you going to go deeper into using probably some predictive analytics like statistical methods? Or are you going to go even deeper and you’ll use machine learning, whatever else? So you’re laying out, is it a classification problem? Is it a regression problem? You’re laying it out right there and saying how far am I going to go. And that’s also a function of… It’s a function of data. It’s a function of time you have. It’s a function of precision you need and so on. So there’s a lot of things going on. These are all planning stages.
Piyanka Jain: Remember, you have not yet touched the data, right? And most people, most data scientists and others, when they think about data science, they think about, “Oh. Where’s the data? Let me pull that data in Excel, or let me [inaudible 00:16:54] in Python,” whatever else. But that’s not where data science starts from. Data science starts from the question, or fleshing out the question, laying out a hypothesis driven plan. And when you’re laying out a hypothesis driven plan, it also means you are aligning with your stakeholders to say, “This is what my plan is going to be. This is how much time it’ll take. This is blah, blah, blah. Are we in alignment?” When you have a handshake, that’s when you go to the most time consuming step of getting data and then validating it, and triangulating and cleaning it up. That’s all time consuming.
Piyanka Jain: Then start doing your analysis. So the insight step is also, if you have a recipe for doing insights versus what many people do which is they set sail in the ocean of data and they start looking for treasure, which is a pretty bad idea because it takes you a long time, and your likelihood of finding treasure is also really low. So what we recommend is now that you’ve done all this work, you have laid out a hypothesis, you have collected data, now… And collected data meaning you have collected only the data that you need versus saying, give me all the data you have. Now because you have hypothesis, you’ve used that to know where exactly you’re going to dive deeper. Then the next step is use recipes to derive insights.
Piyanka Jain: You know if you’re going to do correlation analysis, these are the steps. If you’re going to build a linear regression model, these are the steps. Or if you’re going to go into gradient boosting, these are the steps. This is what you’re going to do, and so on, right? So you know what the steps are. Follow those steps and follow the recipe in a structured manner and come to your insights. At the end of it, share your early insights with your stakeholder and see if that’s making sense.
Piyanka Jain: And the last step of this BADIR framework is recommendation. So you make recommendations or you instrument your model, you productionalize your model, you instrument your insights, whatever have you. And that’s also very important because I can’t tell you how many good models I’ve seen sitting on the shelf because people didn’t know how to align with stakeholders, how to communicate your findings to the right folks in the right way so that you can basically, inspire them into action. So that in a nutshell, is the BADIR framework, and for folks who are interested, they can learn more about that in my book, Behind Every Good Decision, as well as on our website. If they want to go and look at aryng.com, they can find a lot of use cases and case studies on why we believe this works, and many, many organizations, many Fortune 1000 have already adopted it, this framework as their common language.
Kirill Eremenko: And I actually wanted to ask you about that. So I’m seeing on your website that this framework is adopted by Apple, Google, GE, PayPal, Adobe, SAP, eBay, and many, many more companies. How did you get this framework into these companies?
Piyanka Jain: So not all companies that you spoke about and not 100% of them are adopting it, as you say, but many organizations are adopting it more widely and some organizations are adopting it within for example, customer support group or marketing group and so on. But the way the… So I mean, one is that we, after many years of being pestered by our students, I wrote this book, and basically put in the BADIR framework and made it open-source. So many data scientists and business users are picking up the book and it has a step-by-step guide, so they are picking it up and they’re adopting it. And then as and when they need further detail, more detailed help, they reach out to us. So even our non-customers are using it and we may not be even aware of them, so that’s-
Kirill Eremenko: Got you. Got you. 
Piyanka Jain: The other thing is, it’s a very… I mean, it’s a recipe-based approach. I don’t know, Kirill. Do you cook?
Kirill Eremenko: Yes, love to cook. 
Piyanka Jain: Loves to cook, okay. So do you know how to make falafel?
Kirill Eremenko: Falafel? No, I don’t know how to make falafel. 
Piyanka Jain: Okay. So that’s a tricky one. So let’s say you are thinking about, “Oh, I’m going to make falafel.” What would help you the most now that you have to make falafel?
Kirill Eremenko: A recipe on how to make falafel, I think.
Piyanka Jain: There you go, right? So a recipe. And then the first time you make, do you think you’ll get it perfect?
Kirill Eremenko: No, of course not. First time always not perfect. 
Piyanka Jain: Because you’re still understanding, “Hey. I’m going to salt chickpea. And I’m going to grind it, how fine I’d grind it. And then what would be the consistency of that as I drop it into… To fry in oil, as I drop those balls, how thick they need to be, how viscous or how liquidy they need to be.” So there’s a lot of details that you’re going to get. The first time you’re going to make, you’re going to get the detail and you’ll see the output. The same way, if you have to learn data science, what would help you the most? A recipe.
Kirill Eremenko: A course. A book. A guide. A learning path. 
Piyanka Jain: Yes, and a recipe. Whatever, a course is about recipe. A book is about recipe. A recipe. Something that tells me, do this and then do this and then do this, and these are the ingredients. And do this and then do this, right? And the first time I do that, I’ll get somewhat, and I get some understanding the second and the third time. So the way we have structured our courses, and for your listeners who are interested, they can go on academy.aryng.com and find these courses, we are all about how to bake a cake, how to make a falafel kind of recipe. So we start, we share this whole BADIR framework and by the time they’re done with even the level one course which is the business analytics course, they have done this framework. They have baked the cake, and they have cooked the falafel at least three times.
Piyanka Jain: And then following that, they work on a project which means, okay great, you have done this in simulated data, or you have done that in data which was fairly clean. Now do this in real world, in your real world. So for current data scientists who are currently employed, we tell them, “Okay, pick up a project within your own work flow.” Or for future data scientists who enroll with us, we give them one of our client projects. And thereby, they get to practice, again, the same framework. So they know exactly what they’re doing. As we put them in a client situation or they pick up a project, they know how to follow the BADIR framework, and we are there as their mentor at different points, at the analysis plan stage, at the insight stage, at the recommendation stage because they know what they’re doing. We know exactly what they should be following so we can course correct. And that’s the fastest way I have found to learn data science is using some kind of recipe, some kind of… a step-by-step method of this is how it works.
Piyanka Jain: Now, as you get advanced in it, you can start using shortcuts. You can start using iterations and so on and so forth, right? And so you can think about, you start with your common, simple vanilla cake, and then you can start adding some… Today I’m going to add some nuts and raisins, and today I’m going to make some icing, and I’m going to layer it up. And maybe one day I’m going to be able to make tiramisu and all of that, right? So you’re going to be able to advance your skills. And this step-by-step way of learning is recipe-based, and then step-by-step use case based approach is what I recommend for people who want to learn data science.
Kirill Eremenko: Got you. Wow. Thank you for the rundown. So let’s talk a bit more about your courses. So I noticed you have… For those by the way, for those interested, the website is Aryng, A-R-Y-N-G. And the courses are at academy.aryng.com. I noticed you have quite a few interesting courses, and what I wanted to find out is… These are high ticket items, so over $1,000 per course. What is your X-factor? So what is it that students can pick up from this course that will really make it worthwhile for them?
Kirill Eremenko: Are you subscribed to the Data Science Insider? Personally, I love the Data Science Insider. It is something that we’ve created so I’m biased, but I do get a lot of value out of it. Data Science Insider, if you don’t know, is an absolutely free newsletter which we send out into your inbox every Friday. Very easy to subscribe to. Go to SuperDataScience.com/DSI. And what do we put together there? Well, our team goes through the most important updates over the past week or maybe several weeks, and finds the news related to data science and artificial intelligence. You can get swamped with all the news, even if you filter it down to just AI and data science. And that’s why our team does this work for you.
Kirill Eremenko: Our team goes through all this news and finds the top five, simply five articles that you will find interesting for your personal and professional growth. They are then summarized, put into one email, and at a click of a button, you can access them, look through the summaries. You don’t even have to go and read the whole article. You can just read the summary and be up to speed with what’s going on in the world, and if you’re interested in what exactly is happening in detail, then you can click the link and read the original article itself. I do that almost every week myself. I go through the articles and sometimes, I find something interesting. I dig into it. So if you’d like to get the updates of the week in your inbox, subscribe to the Data Science Insider absolutely free at SuperDataScience.com/DSI. That’s SuperDataScience.com/DSI and now, let’s get it back to this amazing episode. 
Piyanka Jain: Yeah. So there are courses and there are certifications, and our certifications are… For example, let’s pick up one which is the future data scientist certification. And what it has is a complete [inaudible 00:27:16] of how you can transition your career to data science. And so, it’ll have the underlying courses, and it’s self-paced so you come in, and you log in and you… We recommend one section a week, or if you have more time, one section a day, and make progress. And then after you’re done with that… And while you’re doing that, we have communities so you’re posting questions in Facebook community. And you also have monthly mentoring sessions directly with us live on Zoom, and thereby, you are able to log in and ask your questions live, as well as post your questions non-live, 24 by 7 on Facebook community.
Piyanka Jain: So lots of interaction. Students are helping each other. So there’s a community that you have. There’s a learning that you’re doing of the fundamental framework, BADIR, and you’re learning it in a context of marketing of product. If something happens in this [inaudible 00:28:10] in hospital, how are you going to do it? If this is happening in winery, how are you going to think about optimizing and so on? Lots of different use cases. We are opening their blinders and we are giving them a toolkit of tools that they can use. Then the next part of it is-
Kirill Eremenko: Putting it into… Sorry, putting it into context, putting education into context. I’m just thinking of what can students take away that they can enact in their own learning, and it sounds like putting education, data science into context like you said, in a hospital, in winery or somewhere else. That helps probably retention. Also helps understanding the topic better. 
Piyanka Jain: Yeah. And then follow that up with a real project. So they all work, all of these certifications have a project at the end. So it’s all fine and dandy when you learn something. How many of us have gone and done this in the corporate world, really? We are taking classes all the time. You come in. You even do a half-day offsite for leadership, and you go out there and you say, “Wow. That was amazing. That’s so inspiring.” You come back and it’s business as usual, right?
Kirill Eremenko: That’s true. 
Piyanka Jane: So for the business to be not as usual, for you to interrupt that way of thinking and to really change manage, you need to bring it home with you, which means you need to tie it to a project. And I can’t tell you how many people… I mean, I’ve seen people just flower from, “Oh. I’m very, very nervous about data science,” to, “Okay. I’ve done the course. I’m still not sure,” to, “Now I’m doing a live project with a client and oh, I get this part. Oh, I can go review. I’m stuck in this part. I can go review this video,” or whatever else. And then when they’re done with the course, they have delivered a final model, final insights to the client and the client is really happy. I’ve seen people go from, I’m so nervous to all of like, “I get it. I can do it,” right? So that’s what it- 
Kirill Eremenko: Gotcha… So in the courses, they would have actual live projects with clients. Is that the case? 
Piyanka Jane: In the certifications. People have options of just taking courses a la carte and they can learn on their own time and do courses. What we recommend is the certifications which has projects at the end with us as live mentors and with live clients, again, all working remotely. We have students logging in from Nigeria to Australia to of course, big percentage of them are in US, as well as all throughout Europe. And so, they’re working remotely but with live client and with us live in the mentoring session. And once they’re done with that, they have the confidence, “Hey. I understand the fundamentals of data science and I can apply it to solve problem and I’ve seen the end-to-end of at least one project all by myself or with someone from my team.” 
Piyanka Jane: And then, for people who are looking to transition, we have mentoring sessions of step number one, do your targeting of your job. All the things that you’ve learned, now let’s apply it to job search. Targeting of your job. Making your resume to your target profile. Because a lot of times people think, “Oh. Now I have done the certification. Let me add this one line item to my resume, and now I’m an analyst.” If you’re looking to transition your career, your resume needs to transition as well. It needs to now tell a story of you as an analyst, you as a data scientist. So that’s the second mentoring session we have. And the third one is how do you interview and how do you ace that, right? So we have follow on, end-to-end process where we’re holding hand and making sure that the people cross over to the other side. And that basically increases the success rate. So for people who are looking to transition, that’s a huge success rate. 
Piyanka Jain: We also have a similar certification for current data scientists. Again, with the project and with this kind of learning and hand-holding, they get the confidence that they can do it and then they are able to do it, and then they see the stakeholder alignment. They see what happens to people once they deliver the kind of project the way we are talking about. And we have gotten so many letters, I can’t tell you, like, “Piyanka, you won’t believe. I got invited to this meeting where I would never be invited after this project.” And yes, if you’re going to align with stakeholders, if you’re going to use this framework, and make sure that you’re doing the decision science part, you start to appear as a partner versus as a downstream somebody who takes orders. So it changes the world altogether when you start doing things in the way that can engage people in the right way. And same for [inaudible 00:32:21] analysts.
Piyanka Jane: So our approach is sort of end-to-end. I’m all about results. So for me, when any algorithm, any math, any statistics is useless until it gets me results. And so for me, again, as I guided thousands of students with this transition… I also have another book. Sorry, I’m bombarding folks with another book, Acing Your Analytics Career Transition, which is right now because of COVID being made free. It’s on Amazon. It’s called Acing Your Analytics Career Transition. And it’s a very quick read on Kindle. So it’s like a 40-page read or something. And it lays out these steps, step-by-step, and whichever program they choose to go, whatever else, you need to follow a step-by-step method of really transitioning. You can’t follow a haphazard path and expect that your career would be of that of a data scientist by just taking courses here, courses there. I mean, take courses but in the context of identifying what your target is. Back calculate. Look at your own resume. Figure out the gaps. So all of that is there in the book, and hopefully, that will be a good guide for some of your listeners. 
Kirill Eremenko: Amazing. Thank you. Very cool that you made it free. That is very admirable and maybe, probably will help lots of people. 
Piyanka Jain: Yes, hopefully.
Kirill Eremenko: So talking about these courses. Very interesting. So the certification is something that you are actually organizing internally. I haven’t seen that before, so that is very cool. Tell me a little bit about your SWAT data science. So SWAT, I know the SWOT framework as S-W-O-T. Strengths, weaknesses, opportunities, threats for business, but you have another framework in addition to your BADIR framework called S-W-A-T. What does that stand for and how does it work?
Piyanka Jain: Sorry. So that’s not a framework. So this is going back to my days in PayPal. I was heading up business analytics there for North America, and before that, I was part of leading product analytics for merchant consumer on the product side at PayPal. And at that time, it was some series of projects that I did and the credibility I won. I and my team became like a SWAT team. You know the SWAT team who come in when things are not going… When there’s a complex situation, a SWAT team is parachuted into that situation and they can control it and they can get stuff done. So we came to be known as a SWAT team, and it was a pretty small team that I led. It was here as well as international. And yet, we came to be known as a SWAT team, and the reason was… And what that meant was, even if we were part of product analytics, if there’s a problem in Omaha in customer support, I would be called in. And they’d be saying, “Leave everything. Drop everything. This is urgent. Come into this meeting and take this over, and for the next one month, this is what you’re focusing on and I need results by Monday, April 22nd,” right? So that was how it used to be.
Piyanka Jane: And I recognized and I used to wonder, what is it that made us a SWAT team? What is it that we got… I mean, there were lots of data scientist team at that point. And what was it that got us that much credibility that got us… We didn’t have any extra or special tool. We used the same tool that most other data scientists had. And we recognized, the power was us, our hypothesis driven method. This BADIR framework that after I left PayPal, I sort of formalized, was what I was doing internally in my head, and my team was doing it because I was teaching them. As I onboarded my data scientists, I would teach them this method, not in this framework the way it is, but I would inherently teach them this framework. And what this does is, it gets results quicker. 
Piyanka Jain: So for example, today, for our clients, we can get a really high end, very good accuracy, highly functioning machine learning model in about eight to nine weeks, and no other consulting companies can. And that’s with a very lean consulting team of two to three people. We can produce machine learning models so quickly, same for AI or deep learning models. And the reason is because we are hypothesis driven, and we do a fair bit of the same BADIR framework. We do a fair bit of work upfront aligning stakeholders. So not only does our… We produce work fast, but the percentage of time our work gets used is also really high. And that was the same thing for me. When I was at PayPal, almost every model or every project we worked on went on to produce millions of dollars of impact, and had an amazing shelf life, meaning amazing… Some of the models were operational after three years or four years, and the reason was that-
Kirill Eremenko: You mean still operational after three or four- 
Piyanka Jane: They were operational and they still form the foundation of many things because we did a lot of work on the decision science aspect, the stakeholder alignment, really understood the question, and then we were hypothesis driven. So all this brain work that we followed, this gave us… One is it gave us acceleration. Second is because we were so success metric driven because that is inbuilt into our framework, we were almost always… The stakeholders could not wait to act on our insights instead of us having to influence and go after them and say, “This is what we need to do.” They could not wait. It’s like a relay race. They were jumping up and down, ready to take the baton from us which rarely happens. And the third thing was, because we did a whole lot of work… because we were hypothesis driven, we were really, really fast. 
Piyanka Jane: So that same analogy then I took over when I went to… Eight and a half years ago when I started Aryng, I took that same analogy and I basically framed the team, our whole philosophy is similar. It’s all about rapid ROI and also practical data science. We’re not about pie in the sky, fairytale data science. Give us all your data. We are going to help you monetize it and it will take over months and months and we’ll keep trickling some insights to you. For us, it’s all about practical data science. What data do you have right now? What are the decisions you’re looking to make? And how can we get you the fastest go-to-market with that? 
Kirill Eremenko: Got you. Well, Piyanka, this is one of the most saturated podcasts. I can’t keep up with you. You have so many ideas, so many things. I’m just going to jump to the next question I had for you. So in one of your videos you talked about analytics projects, and this ties in quite well with what we just discussed: having a hypothesis at the start of your analysis, before you even start your analysis, by asking those business questions and doing an analysis plan according to this BADIR framework, coming out with the hypothesis and only then moving to data collection, deriving insights and recommendations. So doing those first two steps, coming up with a hypothesis, helps your analytics projects be more relevant and creates success. In one of your videos you said that a huge percentage of data science and analytics projects actually fail. I could not believe how low the percentage of successful projects was. Could you walk us through that again, please?
Piyanka Jane: Yeah. And it’s dismal. Gartner published a report I think two years ago in 2017, or three years ago, which then they later corrected to even… It basically said that 85% of big data or data science projects… And by the way, this whole space is about $200 trillion investment that goes in, and of that- 
Kirill Eremenko: Trillion. 200 trillion. 
Piyanka Jain: Trillion dollar.
Kirill Eremenko: Into analytics? 
Piyanka Jain: Yeah. Meaning I’m talking about the big space of data science and big data. All the infrastructure investment and so on. Let me correct. There’s $200 billion-plus investment that goes in overall, world over, globally. And of that, Gartner predicted less than 15% of them actually drive an impact. So 85%-plus projects actually fail meaning they get instrumented or they sit on a shelf somewhere. Nobody uses them, or they get abandoned halfway through, whatever else. They just fail.
Kirill Eremenko: They could even look like a success. We derive the insights but nobody’s using those insights. 
Piyanka Jain: Nobody’s using it because you build the best of the model. And this by the way, is the biggest pet peeve. I keynote at many conferences and one of the conferences I keynoted at is Predictive Analytics World and some of their top data scientists come to these conferences. And their biggest… And when I ask, I often start my keynote with a question, “Oh, how many of you have worked on a data science project and it didn’t go anywhere?” And almost all hands go up.
Piyanka Jane: It’s one of the biggest pet peeves of data scientists. I built the most amazing, lowest misclassification, high accuracy model, but my stakeholders are not listening. They have moved onto something else, or whatever reason. So as for Gartner, it was 85%, right? And then some other experts came in and they did the… CIO.com for example, published some other reports which said of the ones which even get finished and get out, you would deem successful, less than 15% of them actually drive any significant impact. So by the time you do all this math, it’s looking like of the 200 billion-plus investment, we’re talking about 2%. 
Kirill Eremenko: 2%? 
Piyanka Jane: 2% is actually driving any impact. It’s abysmal. This is horrible. But it is real, and I have seen this live and that again, goes back to my world when I used to go back… going back to my PayPal world and now also in my role at Aryng and our client work. I mean, clients pay for data science work so you would think our probability of success would be higher, but I can’t tell you how many times we are coming to project halfway through or somewhere, even end where it’s going nowhere from some other consulting companies or whatever else, and the companies come and said, “This is going nowhere.” They’ve said, “We have already attempted it. It has failed. Can you correct it now?” And often, one project that has failed has taken months to fail also. So it’s not like you fail fast, and the stakeholders have found out you’ve failed. It’s like, “Oh, we were looking at using NLP. We were looking at improving our refund or return rates. This is automotive part. And we did this large scale analysis. It took us six to eight months, and we realized that because of this and this and this and this and this reason, we can’t get any lift and the model which was built is not operational. We asked them to recollect that.” All of that. There’s issues galore. And it took them eight months. 
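Editor’s note: the “2%” figure in the exchange above can be reproduced with a few lines of arithmetic. The sketch below assumes the two quoted rates compound, that is, roughly 15% of projects do not fail and roughly 15% of those go on to drive significant impact. Both rates are quoted in the conversation as “less than 15%”, so the result is best read as an approximate upper bound. The variable names are illustrative only.

```python
# Figures as quoted in the conversation; both rates were stated as "less than",
# so the results here are rough upper bounds rather than precise estimates.
total_investment_usd = 200e9            # "$200 billion-plus" invested globally
share_not_failed = 0.15                 # about 85% of projects fail outright
share_of_those_with_impact = 0.15       # of the rest, the share driving impact

# Assumption: the two rates multiply (the second applies only to the
# projects that survived the first filter).
share_with_impact = share_not_failed * share_of_those_with_impact
impactful_investment_usd = total_investment_usd * share_with_impact

print(f"Share of projects driving impact: {share_with_impact:.2%}")
print(f"Investment behind those projects: ${impactful_investment_usd / 1e9:.1f} billion")
```

Under these assumptions the share works out to 2.25%, which rounds to the 2% quoted, or about $4.5 billion of the $200 billion invested.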
Piyanka Jane: So there’s so much failure and so much money getting wasted, so much time getting wasted, all because… And there’s lot of main reasons. The main reasons for failure that I have seen is lack of data maturity, which means people are not even believing the data, or getting the data out is itself a… It’s like putting your hand in a lion’s mouth and getting something out of that. It’s really, really hard. Data science rigor is often low. People often have data scientists who start from the step D and do part of I, and they call it done, right? So they don’t do the end-to-end. Often organizations don’t have an engagement model between business and data science. Sometimes they don’t even have… Well, this is more common. I mean, lack of enterprise data literacy is a big one where the data science team is good. They are producing insights. But the organization, the marketing, the product, they don’t understand data science. They are wary of data. And so, you give them a machine learning model, they’re not believing it. And so- 
Kirill Eremenko: Could I just jump in here? This is an interesting topic on data literacy because according to you, data culture consists of three things. Data literacy is one of them. What are the two others? I was just curious. 
Piyanka Jain: So there are four Ds of data culture.
Kirill Eremenko: Oh, four actually. 
Piyanka Jane: Four Ds, yeah. So data literacy is one of them. But the most important or the foundation on which data culture sits is data maturity, right? The data maturity being do I have easy and appropriate access to single source of truth for all, right? So our data scientist needs a different access. A marketing manager needs a different access. But do I have appropriate access to single source of truth, or an easy access? Or does it take me forever like, “Oh, I click this button. I wait 10 minutes.” Do you think your marketing manager is going to wait 10 minutes to get that report? No. They’re going to start finding shortcuts. You know this excel report that comes in from the other system? Maybe I’m going to look at that, whatever else. So data maturity is the critical. It’s the foundation of if you’re looking to establish a culture of data, you need to get some degree of data maturity. And on a scale of 0 to 10, at least 7 and above and then you’ll be functional. 
Piyanka Jane: Second part is data literacy. Now that you have access to data, do the people know what to do with data? You’ve given access to the marketing managers, the product managers, the operations people, the customer support people. They have access to data. When the customer calls in, they have access to data, not only about what this customer’s history is, how much they have spent and all of that. Now, do they know what to do with it? Do they even know what… The customer support call center agent, they see their own call performance data, their average hold time, average speed of answers, whatever else. Do they know what to do with it? Do the supervisors know what to do with it? So that’s data literacy. 
Piyanka Jane: When appropriate level have appropriate… When people have appropriate level of data literacy at the right level for them, and they’re able to use the data effectively to make decision at their level to be able to use that in discussions to drive conclusion, then your organization has appropriate level of data literacy. And currently, data literacy’s really low in organizations. We have gone into organizations where data literacy… Less than 2%, less than 3% of people have the right level of data literacy. And in the best case scenario, maybe 10%, 15% of the people are going to be at the right level of data literacy. Still 80% don’t have the right skills at their level. 
Kirill Eremenko: Wow. Wow. That’s crazy. What are the other two Ds? 
Piyanka Jane: The other two Ds are data driven leadership. So if the leadership does not have a vision for a data driven organization, they don’t or they’re not holding their team accountable to use data to drive decision, they’re not using something like zero-based budgeting, you give me the money, that’s when you get the money and so on, then that organization cannot have a data driven culture because the leadership itself is not embodying it, and they don’t necessarily see it as an asset. 
Piyanka Jane: And the last one which ties all of this together is data driven decision making process. So if you don’t have data in a structured decision making process, if data is not part of the decision making process, then you can do data all you like. You can build models all you like. But the decisions are getting made in a parallel track almost independent of data. And of course, your organization will not be able to leverage the data, and will not be data driven. So these are the four Ds. I’m going to summarize it. Data maturity, data literacy, data driven leadership, and decision making process. 
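Editor’s note: the four Ds summarized above can be written down as a simple checklist. The sketch below is a hypothetical illustration, not a tool from Aryng or from the book. The only number taken from the conversation is the maturity threshold of 7 on a 0-to-10 scale. The other three Ds are reduced to yes/no answers purely as a simplifying assumption, and the class and field names are invented for this example.

```python
from dataclasses import dataclass

# "At least 7 and above" on a 0-to-10 scale, as stated in the conversation.
MATURITY_THRESHOLD = 7


@dataclass
class DataCultureAssessment:
    """Hypothetical self-assessment of the four Ds of data culture."""
    data_maturity_score: int         # 0-10: easy, appropriate access to a single source of truth
    data_literacy_adequate: bool     # people know what to do with data at their level
    data_driven_leadership: bool     # leaders hold teams accountable for using data
    data_in_decision_process: bool   # data is part of a structured decision process

    def gaps(self) -> list[str]:
        """Return the Ds that are not yet in place, in the order discussed."""
        found = []
        if self.data_maturity_score < MATURITY_THRESHOLD:
            found.append("data maturity")
        if not self.data_literacy_adequate:
            found.append("data literacy")
        if not self.data_driven_leadership:
            found.append("data driven leadership")
        if not self.data_in_decision_process:
            found.append("data driven decision making process")
        return found


# Example: reports are slow to access and most staff are unsure how to use them.
example = DataCultureAssessment(
    data_maturity_score=5,
    data_literacy_adequate=False,
    data_driven_leadership=True,
    data_in_decision_process=True,
)
print(example.gaps())
```

A real assessment would need far more nuance than a yes/no answer per dimension; the point of the sketch is only that all four Ds have to be in place before an organization can be called data driven.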
Kirill Eremenko: Wow. Very, very interesting. I know we’re going to have to wrap up soon, but I have one more question for you. 
Piyanka Jain: Sure.
Kirill Eremenko: You mentioned decision science. But what is the difference between decision science and data science? 
Piyanka Jane: That’s a great question. Again, I’m going back to the power of BADIR framework or power of why a SWAT team works wherever they work, is analytics or putting data to work has two components. There’s data science and decision science. Data science is all the algorithmic aspects of the things you need to do to… or technical aspects to do the technical analysis, collection of data, identification of the data type you need, setting up your null hypothesis and all of that. That is all data science. 
Piyanka Jane: Decision science is all the things that you need to do to make sure that those insights that you produce goes towards impacts, which means the business considerations, the stakeholder constraints and communications, timelines, the realities of the business, that is all decision science. So the science that addresses all of those and incorporates that into data science is decision science. And when you marry the data science with decision science, you get the power of data. 
Kirill Eremenko: Fantastic. Love it. All right, we’ll end on that. I think this was an amazing excursion into the world of data science. I have a lot to process after this. Before we go, Piyanka, could you please help our listeners, where can they find you, follow you, get to know more about Aryng, if they’d like to explore this space further?
Piyanka Jane: Sure. So they can connect with me on LinkedIn, or follow me on LinkedIn. And then my name, if they look up Piyanka Jain and Aryng. Aryng is the company name, A-R-Y-N-G. They can find us either through LinkedIn or Aryng. Or they can also follow me on Twitter. My hashtag is analyticsqueen. 
Kirill Eremenko: Fantastic. And of course, pick up the book. Sounds very exciting. Behind Every Good Decision. Great reviews on Amazon. Love it. What inspired you to write the book? 
Piyanka Jane: I wish it was an inspiration but it was more of a forcing factor. At that point, we were doing lots of public workshops. And every workshop that we would do or we would conclude, people would be by the time the… The day four, day five people, our students would be pestering us that, “Do you have a book? Do you have a book?” And I said, “We don’t have a book.” And I of course, for one reason, I always thought, “Well, who has time to write a book?” I mean, I wouldn’t even know where to start. And I’m a natural speaker but writing is not that easy for me. So I said, “Well, I don’t think… I’m not sure.” But then that kept repeating over and over, and somebody planted a seed and it starts… 
Piyanka Jane: And then, right around that time, Wiley have called us. Wiley reached out, out of the blue. And also [inaudible 00:52:12]. And they said, “Oh, we’re thinking of publishing a book. Would you like to write something along this line?” And I was like… It was all coming together. So I said, “Okay. Well, I don’t know what it takes but we can attempt it.” And by golly, I mean… Because I’m all about practical, it took us a while to get to the level that I wanted- 
Kirill Eremenko: It’s a big process, right, writing a book. It’s a big job. 
Piyanka Jain: It was a big process because… And I had a team. I had my co-author, Puneet Sharma, my colleague at PayPal, who’s now at Google. Great guy. We collaborated together. And he and I are very much in alignment with how we see the power of data, so that was great. But then, none of us are writers and so we had to find some really good editor who could edit out… really put content in perspective for users to understand because we were saying a lot of things but if we are technical, somebody has to call us out on it like, “Hey. It’s not making sense.”
Piyanka Jane: And so my dear friend, Laxmi, came about on a hike, one of the hikes we were doing up PG&E here. She started talking to me and by vocation, she is not a writer so I had never thought of her. But as we hiked that steep four-mile up which… It’s a very tricky hike because every turn, you think, “Okay. I’m almost to the top.” But it takes you about [inaudible 00:53:37] pretty steep hike. And she got the entire gist of what we were trying to do, and this one chapter… I was basically kind of whining to her that, “Hey, I hired this editor and they are correcting our English but they keep taking the content out. It’s not working well.” 
Piyanka Jain: And since she started talking to me, and then the way she sort of reframed what I was saying, I was like, “Oh. Do you have time to work on this project with us?” And she thankfully did. And that time, I was pregnant, and also, she was pregnant and it was so funny. And then Puneet was stuck in between two pregnant ladies who were like… Our hormones are high and we’re trying to collaborate on this project over phone, over live. And then we hired a graphic design team because we are both very-
Kirill Eremenko: I love the images in your book. They’re so good. 
Piyanka Jain: Isn’t it?
Kirill Eremenko: So the one with the sharks, hello data science. That is so funny. You got some really cool illustrations. 
Piyanka Jane: Yeah, thank you. And we hired one of the best teams because I am already visual. And I said, “I don’t want to write a dry book. I want to make it fun.” And so we got this team together, and finally what came out, I was happy with and then it got published. So I know you asked me a short question and I gave a long answer. 
Kirill Eremenko: No, no. Love it. Love it. I highly recommend. I’m a big believer in this. My own book is also about helping people to get into data science. This sounds like it’s a different perspective. You introduce the BADIR framework there. I think it’s a fantastic book for people to pick up. Definitely check it out. It’s available on Amazon. Yeah, looks like a great book. 
Piyanka Jain: Thank you. Thank you so much, Kirill. It was a pleasure talking to you, and it was such a joy having this conversation.
Kirill Eremenko: Thank you, Piyanka. Yeah, lots to process. I think we might need to do a second podcast sometime down the line. 
Piyanka Jain: Absolutely. Would love to.
Kirill Eremenko: So there you have it ladies and gentlemen. That was Piyanka Jain, president and CEO of Aryng. I hope you enjoyed this podcast and got a lot of value out of it. I know it probably felt like drinking out of a fire hose. Piyanka has so much knowledge, so much information on the space of analytics. That’s why I said at the end that probably, we need to do a second episode to dive deep into specific topics here. I had so many interesting favorite parts here. I loved the discussion about what data culture is, the four components, the difference between data science and decision science, always an interesting topic. Probably my biggest favorite out of all of them was the hypothesis based approach to data science. I think that is a very refreshing approach rather than just diving in and trying to solve everything, trying to boil the ocean. We all know that you need to ask the right questions, but this hypothesis based data science actually takes it to a whole new level. So if you’re interested in learning more, check out the BADIR framework.
Kirill Eremenko: As usual, you’ll find the show notes at SuperDataScience.com/363. That’s SuperDataScience.com/363. There you’ll find any links and materials we mentioned on the episode, including Piyanka’s book, or books I should say, one which you can purchase on Amazon. I think one she said is free. Then you can find Piyanka’s courses there. You can find Piyanka’s company if you want to do any consulting projects with her, and of course LinkedIn, Twitter, everywhere else where you can follow Piyanka. Piyanka does quite a bit of keynotes around the world, probably also in virtual events. So make sure to follow her and maybe you can attend one of the upcoming events with her as well.
Kirill Eremenko: And on that note, if you know anybody who would benefit from this podcast, make sure to send them the link, SuperDataScience.com/363. Very easy to share, and maybe you can help somebody become an even better data scientist by applying some of the methods that we spoke about today. Thank you so much for being here today. Really appreciate you spending this hour with us and taking the time to tune into the SuperDataScience podcast. Hope we delivered on bringing you an amazing guest once again, and I will see you back here next time. Until then, happy analyzing. 