SDS 185: The Pre-Requisites for Analytics to Happen


Welcome to episode #185 of the Super Data Science Podcast. Here we go!

“Company strategy is data strategy.”

This is what William McKnight, our guest for today, wants all business owners, leaders, and data scientists to take note of. Listen as we talk about how data is a big asset in a company, the five levels of data maturity inside a business, and the four quadrants or aspects that should be improved inside your business!

Subscribe on iTunes, Stitcher Radio, or TuneIn

About William McKnight

William McKnight is the CEO and President of McKnight Consulting Group, an Inc. 5000 Company which provides clients – from small enterprises to corporate giants – with plans, architectures, and strategies to improve data management and data warehousing inside their organization. Aside from being an award-winning consultant, he’s also an author and the host of the Data Decoded Podcast. 


William is ready to spill the secrets you need to be on top of the game. Considering the fast-paced age we’re living in, you should be panicking if your company has been plodding along the trail it’s on. Have you been looking at the companies alongside you? They have already completed ‘laps’ and are inching closer day by day to the finish line to win the race to success.

These ‘laps’ I’m referring to are the levels of data maturity. The levels of data maturity are important for getting an impression of how your company is doing in general. There are five levels according to William. In this episode, William explains each level point by point. Below is a guide that you can use while listening:

The Five Levels of Data Maturity

  • LEVEL 1 – (Nothing good here!) No consistent information match with architecture, no enterprise tools, no data management or governance, there are multiple overlapping sources, etc.
  • LEVEL 2 – Presence of Agile, the cloud, data warehouse, etc.
  • LEVEL 3 – Presence of data standards and data specialists; great balance between centralization and decentralization; data warehouse is not just a data outhouse; data quality information
  • LEVEL 4 – Predictive analytics, data retention, starting to integrate Artificial Intelligence in your initiative; the existence of not just a database but file systems, analytics stores, etc.; there’s specialization
  • LEVEL 5 – (Time to share your story here on the Super Data Science Podcast!) Data is now counted among your financial assets; it’s on the tip of the tongue of the executives; there’s development in the organization happening in architecture and artificial intelligence; data governance; mature organizational management program

And for every level, there are 4 different aspects you have to make sure are progressing so that you are sure to advance to the next level. Below are the 4 aspects of data maturity:

  • Technology – What we purchase and what we’re going to utilize
  • Architecture – How do we fit them all together? How do we work the seams of all the technologies we put in place? How does data move around between them? How do we decide?
  • Organization – What are the skills of your people? Who has which expertise? Are they dedicated? Are they organized?
  • Data Strategy – How does management feel about data? How they feel about it determines how it is going to be supported.
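If you like to think in code, here is a minimal Python sketch of how the five levels and four aspects could be recorded for a company. It is only an illustration, not William's actual assessment tool: the class name `MaturityAssessment`, its fields, and its methods are all hypothetical, and the rule that the weakest aspect caps the overall level is our own assumption, loosely based on his points that the four aspects move together and that you can't skip levels.

```python
from dataclasses import dataclass

# The four aspects named in the episode; the identifiers are our own.
ASPECTS = ("technology", "architecture", "organization", "data_strategy")


@dataclass(frozen=True)
class MaturityAssessment:
    """Hypothetical record of one company's score (1-5) on each aspect."""

    technology: int
    architecture: int
    organization: int
    data_strategy: int

    def __post_init__(self) -> None:
        # Every aspect is scored on the five-level scale.
        for name in ASPECTS:
            score = getattr(self, name)
            if not 1 <= score <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {score}")

    def overall_level(self) -> int:
        # Assumption: the weakest aspect caps the overall level.
        return min(getattr(self, name) for name in ASPECTS)

    def lagging_aspects(self) -> list[str]:
        # Aspects sitting at the overall level, i.e. the ones to work on first.
        lowest = self.overall_level()
        return [name for name in ASPECTS if getattr(self, name) == lowest]

    def next_target(self) -> int:
        # No skipping levels: aim one level up, never beyond level 5.
        return min(self.overall_level() + 1, 5)


company = MaturityAssessment(
    technology=4, architecture=2, organization=2, data_strategy=3
)
print(company.overall_level())    # 2
print(company.lagging_aspects())  # ['architecture', 'organization']
print(company.next_target())      # 3
```

In this made-up example the company has invested heavily in technology, but architecture and organization hold the overall level at 2, so those are the aspects to bring up before aiming for level 3.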

In this episode you will learn:

  • What has William been busy with lately? (05:30)
  • How can it be that there are still companies that don’t have data warehousing, data analysis, etc.? (10:20)
  • The concept of Data Maturity. (13:20)
  • The metric for how well an organization is doing. (17:20)
  • 4 Different Aspects of Data Maturity. (18:40)
    • Is there a particular component that is most important among the four?
    • Who do you talk to in the organization regarding these?
  • William’s advice to executives on how to incorporate the 4 different aspects of data maturity. (34:25)
  • Data Maturity Levels. (39:57)
    • How does this metric affect small to large industries?
  • What are the most important trends that companies should look into? (56:00)


Full Podcast Transcript

Kirill Eremenko: This is episode number 185 with President of McKnight Consulting Group, William McKnight. ...

Welcome to the Super Data Science Podcast. My name is Kirill Eremenko, data science coach and lifestyle entrepreneur. Each week we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today, and now let's make the complex simple. ...

Welcome back to the Super Data Science Podcast ladies and gentlemen. I literally just got off the phone with William McKnight, and I have to tell you, this conversation was amazing. William is so passionate about what he does that I felt myself just diving deep into the conversation and really getting a great feel, a great sense for what is going on in that world of analytics that he is working in.

What are we going to talk about today? That's the important question, right? William is a data science consultant. He does consulting in this space of data science, but in a very specific niche which is the back end of data science, which includes data warehousing, data lakes, Hadoop clusters, architecture, data strategy. All those things that an organization has to have in place in order for analytics to happen. We talk a lot about analytics on the podcast, but sometimes it's important to step back and understand what exactly are the right prerequisites for successful analytics in an organization. That's exactly what we're going to talk about.

In this podcast, you will find out specifically the details of what data maturity is. That's not to say that that's the only thing William focuses on. He's got a ton of knowledge. He's been in consulting for 20 years, and when you feel his passion when you listen to the podcast, you'll know what I mean. He's knowledgeable on many, many things, but we, ... because we were talking about this in such depth, we only covered this one topic but we covered it very well. If you've ever wondered about analytics maturity of organizations, this is the place where you will get all that knowledge. You will find out the four components of William's definition of the data maturity of an organization. You will find out the five different stages at which an organization can be, from one to two, three, four, and five. You'll also understand how to assess whether your organization is at level one, two, three, four, or five.

When I say 'your organization,' I don't just include executives and business owners. Without a doubt, this podcast will be beneficial to everybody of all walks of life and all levels of your career. If you're an executive or a director of an organization, you will definitely learn a lot about how to assess analytics maturity of your organization, and what to do about it. If you're working in an organization, you will also learn how to see where your organization is, where it's headed, and find those telltale signs of whether they will deliver on the promises that you're expecting, or maybe how you can help the organization deliver better on their analytics maturity, on their data maturity. Also, if you're not yet employed and you're looking to join an organization, this will help you make the informed decision of which organization to join. Do they have the right data maturity? Very important. I think it's very important for anybody in data science to know these things.

Can't wait to get started, very exciting episode. I was extremely engaged in this, and I can't wait for you to hear everything that William has to share, and also learn everything that I just learned about data maturity. Without further ado, I bring you William McKnight, President of McKnight Consulting Group. ...

Welcome back to the Super Data Science Podcast ladies and gentlemen. Today, we've got a very interesting guest on the show, William McKnight, joining us in from Dallas, Texas. William, welcome to the show. How are you today?

William McKnight: Thank you, Kirill. It's great to be here. I'm doing great.

Kirill Eremenko: Awesome. You mentioned that the weather in Dallas was pretty crazy just before, 100 degrees Fahrenheit, or 37 or so degrees Celsius. How did you survive that heat wave?

William McKnight: Not easy. Just by not going outside between, like, nine a.m. and eight p.m., you know? I'm a runner, too, so it was really crimping my running program.

Kirill Eremenko: Wow, that's crazy. Is that normal weather for this time of the year?

William McKnight: It's a little extra hot this summer, but yeah, the summers are like that. You have to be very cognizant of the weather each day.

Kirill Eremenko: Yeah, wow. Wow, that's crazy. As I understand, you're a consultant and you're running the McKnight Consulting Group, so you'd have to get out to reach your clients.

William McKnight: Oh yeah, can't hibernate. But yeah, definitely getting out to see clients. I enjoy that, of course. Still a bit of travel going on here this summer for me.

Kirill Eremenko: Very interesting, very interesting. It was very cool. I really like your approach. For our listeners, William is a listener on the podcast or has heard a couple episodes. Then, now he's decided to ... and you actually have a podcast of your own, is that correct? What is it called again?

William McKnight: I do. It's called Data Decoded. It's under the IBM Analytics Podcast, so yeah.

Kirill Eremenko: Okay, because I was checking on iTunes and I couldn't find it, but I see it on IBM now.

William McKnight: IBM, yeah. Yeah. We talk about non-IBM things, too.

Kirill Eremenko: Mm-hmm (affirmative), okay. All right, gotcha. Then you decided to reach out, and come and share your knowledge and expertise with our listeners, so thank you so much for that. I'm looking forward to hearing all about your world or your side of the world of data science. It will be exciting.

William McKnight: That's right. It's the side of the back end, the side of platforming data, getting data into their best ... its best platform to succeed so that data science can go, really.

Kirill Eremenko: Mm-hmm (affirmative), gotcha. We've had a few guests on the show who've spoken about data architecture, data warehouses, or ... yeah, so data engineering, and all those sorts of things. It'll be good to refresh on that and get a different perspective. Maybe to start us off, if I'm a person who just met you, which is indeed the case; but let's say we catch up at a bar or a coffee house, or in the street, and I ask you, 'William, what is it that you do for a living?' Could you give us a brief description, please?

William McKnight: Well, I make data into an asset for companies. I make it into their greatest asset. I get it into platforms to succeed so that they can do analysis, they can do data science, they can do their reports, and they can really run their business through their data. I treat data as the most important asset of a company, which it is for a lot of companies already. It really needs to be. I improve data maturity, and make it into a great asset for the company.

Kirill Eremenko: Gotcha, and I can already feel how you're raising a lot of eyebrows already because how can it be so, like in this day and age, that there are still companies that don't have the data warehouse, that don't view their data as a very important strategic asset, that are not using their data in their operations day-to-day? How is that the case?

William McKnight: Well, they're not doing well, but they're still ... You can still be hanging in there today, but in the next few years, if you don't get your data act together, if you don't get this part of it right, you will not be ... you will be very much more unwell, and perhaps even out of business. It's that important. A lot of companies that can't get out of their own way. They can't get beyond the tactical. They can't even consider things other than what they have considered for the past, you name it, 20, 30, 50 years. That just isn't going to work in this economy where things are changing so rapidly. There's so many innovations in data and around data that you really have to be availing yourself of what some of them are because your competitors surely will be. They're getting more efficient. They're getting more effective at bringing data into their daily operations, and helping make decisions for the company.

This is becoming really important for companies. Those laggards, and I do have a maturity spectrum, and there are obviously maturity level one companies. That's okay today. Actually, it's not but it is what it is. You've got to get that momentum moving forward. That is the only means of survival for a company today.

Kirill Eremenko: Gotcha, okay. Totally agree. There's actually an interesting quote that is often referenced by entrepreneurs such as Peter Diamandis, who say that 40% of the Fortune 500 companies will no longer exist in the next 10 years. That's actually quite an old quote. It's a few years old, so it's probably like eight years left that the 40% of the Fortune 500 companies will no longer exist. If you think about it, it's massive, right? So, it's only 10 years, and almost half of the biggest companies in the world will disappear. Most of it is due to technology and data as well.

As you correctly pointed out, it all starts at the beginning. It all starts ... That's kind of rhetorical, but like it all starts with the data warehouse, and where the data is stored, and how it's collected. We talk a lot on this podcast about data science, and artificial intelligence, and machine learning, and how the different algorithms can be applied, and so on. But if you don't have the data in the first place, if you don't have data prepared and ready for analysis, and handy in the first place, you won't be able to actually do those things. That's where you come in. That's the part that you take care of, making sure that everything is structured and working efficiently, and the way it should be, even before we proceed to the analysis stage. Is that about right?

William McKnight: That is about right. You touched on something that's very important for everybody to understand out there. I'm a data consultant. I've been doing data warehousing for many years; 20 years of consulting now. A lot of data warehousing, of course, in my background. In the past decade, it's transitioned to a lot more big data, and master data, and everything, all data, really. But no matter why I might come into a client, I might be there for big data, or master data, or what have you, data integration, data governance, data quality work, as I look around, it's that data warehouse that needs remediation to the point where that's almost the top priority no matter why I came in there. We tend to start to gravitate back towards that data warehouse and fortifying it, bringing its maturity up. That tends to be the place where a company can put its money and get into data, and get the biggest bang for the buck.

Still, still today, even though we need big data, we need master data management, we need streaming data, we need a lot of cloud based data platforms and so on, you better have your data warehouse act together, because that can create so many efficiencies for a company that it's almost impossible to move forward from a point that doesn't have a data warehouse into some of these other disciplines.

Kirill Eremenko: Gotcha. You mentioned a couple of times maturity, on data maturity of an organization. Could you tell us a bit more about that, because I find that that's a very interesting topic to be able to come in and tell, and do an assessment, and tell the organization, 'You, on the maturity scale that we use, you are a maturity level one,' or two, or three. It really puts it into perspective.

William McKnight: It does. It used to be that I used to ... push back on that question. Where I would get this question all the time. How are we compared to everybody else?

I used to say, "Well, you know it's not about that, it's about how good can you be? Let's focus on how good can you be, because we're not at the end of your maturity journey here at this company." But, I have found that having a maturity model actually helps me know what the next step should be on a journey. It does help put things in perspective and fortify some of the arguments that I make with a client about what those next steps should be.

What I did was I looked back over our clients for the past three years, and I did this earlier this year. For the past three years, this is some 40 odd clients, and plotted them across every data characteristic that I could think about. Then, I looked at company performance. How are those companies doing? There were natural lines to be drawn there, in terms of what they were doing with data across five maturity levels. That's great. I mean we're used to five levels of maturity. I can tell ... how good a company is doing by how they're treating their data from one to five. As you go from one to five, you're doing more with data. You're capturing more data. You're managing more data. You have more data science, more data strategy. You have more technology in place. The organization is more mobilized around the importance of data, and things like that.

It helps me plot a company into their maturity level and say, "Hey, here's where a lot of companies ... Here's what a lot of companies did to get to the next level. By the way, my motto is you can't skip levels. We've got to get to the next level before we can jump all the way to the next ... you know, to level five."

By the way, there's different aspects of data maturity. For me, there's data strategy. There's architecture, there's technology, and there's organization. I have not encountered companies that can take one of those all the way to five, and leave the rest behind. They go in sync. They go in concert. While we may be excited about improving our technology, if we don't improve our organization at the same time, you're just not going to be able to improve your technology and improve your overall maturity.

Kirill Eremenko: Very interesting. I really enjoyed that ... or very curious about that point you mentioned that you can tell how well a company is doing by the level of its maturity. What I was wondering as you were describing that is which way do you think that relationship works? Is that they get to a level of maturity, and then they do better, their financial results are better? Or, is it the other way around? Their financial results are better, and so usually they're invested more into their data technology, architecture, organization, and data strategy to get their maturity up to speed. Which way do you think? Which comes first?

William McKnight: Well, that's an interesting question. I never thought about it that way, but I think that the answer really is that it, as you invest in data wisely, not foolishly, not doing more of the same, because that's the only thing we know how to do; but if you invest in your data platforms, your data architecture, data governance, et cetera, wisely, then that's going to raise your company maturity and your company performance. However, we all know how it works out there, right? As company performance is better, they open up a little bit more budget. They do a little bit more. Success begets success, and so the other way works as well.

Kirill Eremenko: I totally agree, and I think they come hand in hand. Also, probably organizations that enhance their data and see that they're getting better financials out of it, they will be more inclined to invest further into data. Actually on that, I wanted to ask you what is the metric? ... We'll talk about maturity in a second, but what is your metric for how well an organization is doing? Is it like something specific that you're talking about? The profit, or the margins, or the market capitalization, or some other metrics that allow you to tell, or is it just like a general sense based on the profit and loss statement, and balance sheet, you can tell how well the organization is doing?

William McKnight: Well, there's the difference between a public company and private company in terms of that. For a public company, to a large degree, you can use the stock performance relative to the stock market. For a private company, it's not as simple. We certainly have the private companies as clients, as well. That gets back to a little bit more of a gut feel, and maybe a little bit more things that we know being on the inside that aren't public about performance. Then, it's things like sales, things like profit, things like expanding into product lines, and meeting the goals that you might have as a company.

Kirill Eremenko: Gotcha. Awesome, thanks for that. Now, talking a bit about maturity. So you mentioned you have the spectrum on one to five, and you also mentioned that you have at least four different components of maturity. Can you tell us a bit more? So the ones you mentioned are data strategy, architecture, organization, technology. Is there anything else? How do they all tie in to define this concept of data maturity?

William McKnight: Yeah, so there's data strategy, architecture and technology, as you mentioned. There's also organization. These are quadrants, right? These are not discrete. Certainly they don't operate completely independently, but you have to draw the line somewhere. I've always used these four things as measurements for my clients for a long time. I find that I want to draw a distinction for the client between technology and architecture, for example, because there is a difference. Technology is what we purchase, what we're going to utilize to get things done; but architecture is how we fit it all together and how we work the seams of all the technologies that we've put in place. So we might have cloud databases, and Hadoop clusters, and NoSQL clusters, and a master data management hub. Well, how does the data move around between them? How do we decide? How do we decide that we need a new database, a new data mart, for example, versus reusing an existing mart? How do we make these decisions?

There are many, many decisions of that nature, and I put them in architecture. I have organization because it's very important the skills of the people that we have aboard, the quantity of the people that are working on data. Are they dedicated? How do projects work? Are they organized? Is there a clear path to production? These are things that are independent of the technology and the architecture. Finally, I wanted to address the trickle down effect that there is on any form of company strategy. How about data strategy? How does management feel about data? Because how they feel about it is how it's going to be supported. Are they putting it on a pedestal? Are they putting it in their quarterly reports? Are they talking about it? Do they even know we have a data warehouse down here, and do they care? Things like this.

I wanted to just raise awareness with whomever I'm talking to that there's more to it than whatever that they're tightly focused on. These are the four things that fell out of that.

Kirill Eremenko: Okay. All right, gotcha. So technology and architecture ... or like work together, organization is how everything works, and data strategy is what the strategy coming from the top is. Interesting. So my question then would be when you go into an organization, who is it that you talk to about all these things? Is it the CEO? Is it the CIO? Is it the board? Is it all the executives together? Because this seems like quite a few things from different areas. Strategy is more the CEO. Architecture and technology is more the information side, the CIO, or different departments under the technology space. Organization is maybe the COO. How do you get them all to work together on this?

William McKnight: Well, that's true. As a matter of fact, Kirill, it's the same question about how do I ... who do I talk to anymore about the services that we provide to a company? This is true for the software vendors out there as well. This is true for IBM, all the way down to a start-up in the data area. What's true that I'm saying is that it's more complicated than it used to be, because it used to be somebody like me would go to the CIO. Everybody had one. That's where everything was ... everything emanated from that office, right? But, that is so not true anymore when it comes to what we do in data. It's been so disintermediated over the past five years that there are legitimate pockets of data technology people everywhere in the company. The tricky part is getting them all to work together, to the right degree, right? We don't want draconian centralization. That's not going to work anymore. But, we also need some centralization.

To answer your question, I get around. I love to have a free hand in the company. That's going to get you the best results in whatever it is that I do, to talk all the way from the CEO to everybody. But, every company is different. We get focused in different areas. The strategy, the maturity level, is only going to be as good as the input of the people that I talk to. But most companies, when they engage, they really want the full spectrum of service. They have no problem getting me in front of ... all the way from the CEO to everybody that is interested at all in this engagement.

Kirill Eremenko: Mm-hmm (affirmative), okay. All right, gotcha. Is there any one of these components that you would say is the most important out of the four?

William McKnight: Well see, that's where I do believe that some things are more important than other things, but what we've seen is that they tend to move together. They tend to move in lockstep. They tend to be like four people that are arm in arm that are moving forward. This is what I was saying before, that one of them can't get too far ahead of the others because it just won't work. You're not going to be able to get technology at a high maturity level when the data strategy is weak, when the organization is weak. You don't have the people to follow through on getting the technology, when you don't have an architecture to put that technology into, for example. And you can make arguments like this about any of these quadrants moving forward.

I just encourage everybody that works in data to be aware of the environment of data, and the fact that there's other aspects that have to grow as well as your own, in order for you to succeed. This is where, when I consult to a lot of technology people, I tell them, 'Yes, this is all great. We need to get a Hadoop cluster going in here. We need to get on S3. We need to move this to the cloud, et cetera, et cetera. Yes, yes, yes. Data streaming, yes, yes, yes.' But, we also at the same time, we have to be growing the data science of the company and growing the awareness of the importance of data, because that's the demand for what we supply. For many companies, they're so remedial, but their answer to me is, 'Well William, all this is great, but nobody's knocking on my door. Nobody's demanding that they get more data, they get it in real time, et cetera. They're just not demanding it.'

That's a symptom of a greater problem in this company. In that company, you really have to grow the data science, and you have to ... Your job then becomes to be an advocate for data, a champion for data within the organization, and show the company the possibilities. This is another big theme of mine, and that is that today, company strategy is data strategy. We who sit on the gold mine of the data within our organizations, we have to be bringing the initiatives forward to the company that it can consider and take up; because the rest of the company doesn't know what we know. If data is so important, ... I hope we all believe that, us who are working in data. We need to get the message out.

Kirill Eremenko: Wow, I love how passionate you are about that, and the quote, "Company strategy is data strategy." That's definitely going on my list of quotes. That is so, so true. Indeed, I believe there's two approaches to analyzing or running, or strategizing about an organization: top down and bottom up. Top down is kind of like the more old-fashioned approach where you look at what is going on, like the vision, the mission, and then you trickle down. Old-fashioned is probably not the right way to describe it. It's the one that's been around the longest.

Still very valid, still very relevant; but now we have a new tool, a new powerful tool when we can look in our organization bottom up. We can look at what is going on, to the minute. If you have the data collection points right, you can look at to the minute, to the second. What are your customers doing? Where are they purchasing? If it's not a B to C business, what are your partners doing? What is the factory doing? You can analyze all these things and understand where the bottlenecks are, where the inefficiencies are, and then as you mentioned, bring that to the business decision makers to inform their strategy to better understand how to run this organization, how to improve it. Competitive pressures are going to make that become the norm in the world. What are you seeing, by the way, in that space? How fast has this change been adopted over, or embraced over the past 10 or five years that organizations are more and more introducing data from a bottom up approach into their operations and strategy?

William McKnight: So Kirill, it's not ... Yes, it's not just about what is everything ... What's happening right now in the company and the concept? What's happening with our products? What's happening with our employees? What's happening with our customers? But, what's going to happen next? That is so possible today. Companies are starting to embrace that they can understand that. Not only can they understand it, they must understand it. For a lot of companies now, and I'm not saying that artificial intelligence, for example, is pervasive in all the companies in production. They're doing it everywhere. But, the ... type A companies that are going to succeed are going to embrace those disciplines that are going to stand the test of time.

That is really the game I think for leadership in companies. It's to embrace those things early that are going to withstand the test of time, and be a good enough company to be able to accept these technologies early, warts and all, and still make it happen within the company, ... and grow with the discipline so that when the laggards and the also-rans come aboard, they're way behind. Artificial intelligence is one of those things. There's artificial intelligence opportunities everywhere, so what I am going to say is that even if I'm not asked, I am laying the foundation for artificial intelligence within companies because data is that foundation. We must bring a lot of data to make those artificial intelligence algorithms make sense and be accurate.

So, we have to get our data act together. We have to start collecting all data, even if we don't see an immediate purpose for it. We've got to get it into the right platforms to succeed.

Kirill Eremenko: Mm-hmm (affirmative), gotcha. Thank you, thank you very much for that. On the whole concept of these four components of data maturity, you mentioned, and I can totally appreciate that, that you cannot be successful, more mature in one than the others. They have to come hand in hand, like you mentioned. They're four, what is it? Four people walking together. But what I wanted to ask is from your experience and from your observations of companies out there, industries, have you noticed in any case that companies tend to focus on one more than the other? Also, what I mean, for example, companies would invest a lot into technology in the hopes that they can get more data mature and just forget about ... not notice data strategy, data architecture, or the organizational component. ... Notwithstanding your comment that they cannot be successful in that way making it more mature than the other three, at the same time, have you seen companies trying to do that? Like investing more heavily into one or the other, and which one would it be out of the four?

William McKnight: Oh, totally. You hit the one that's going to get the most attention, and that's technology. This has been, and I've been consulting now 20 years, and this has been true for all those 20 years, and probably the next 20 years, which is that we tend to throw technology at problems, and think that it's going to solve the problem without architecture, data strategy, and probably other things; because it seems like the easy thing to do. It never is. It's never enough.

Yes, you hit on the one that many companies are trying to use as the one thing that's going to fix all data problems. But, in some other companies, ... it's senior leadership that is pushing data strategy. They just don't have the mechanism inside the company to do it with, so sometimes it's strategy that's leading the way. Which to me, that's good. That's something I can work with a lot better than if technology is leading the way. I can work with any situation, but if I got executive leadership behind a whole idea of data, ... Maybe it's fuzzy as to how it all happens, and what happens next, and what technologies do we need now, and what organization do we need? Those are great questions to be answering, but sometimes it's data strategy that's out there ahead of other things. It's seldom organization. It's seldom architecture.

Architecture, it tends to be in many companies, still a drag along. Meaning, they don't really do it. They don't have a plan for this what we want our data to look like in one year, three year, and five year. That's what I want for organizations. That's what I develop for organizations is here's what you need to be targeting in the next one, three, and five years. Now, you may deviate, but you have to have this true north so that you can make great decisions today that's going to fit into your plans, and that you're not continually making the same decisions that you've always made because ... I guess you're waiting for somebody else to make the big decision.

Kirill Eremenko: Mm-hmm (affirmative), wonderful. Very interesting. Very interesting how technology and data strategy sometimes ... Okay, and so in that case, what would you have ... Do you have anything to say to executives of companies who are focusing on one of these components or the other? Because ultimately, I think I can kind of appreciate perspective of leadership of an organization that's doing that. What I mean there is that it's probably expensive to focus on all four at the same time. It's a lot of time. It's a lot of effort, and a lot of research, and a lot of man power to get all these things up and going.

It might be like an idea of, 'Oh, I'm going to dip my toes in the water and try to push this technology side of things, or this data strategy, and see how that goes, and if that goes well, I'll invest into the other four.' I might even be aware of all four of them, but I want to start with one, and then go. What would your one biggest piece of advice be for executives listening to this podcast out there right now?

William McKnight: Well, I've seen many a half measure in my career. I can tell you that they don't work. I never said that this is all easy, becoming data driven, moving up the maturity cycle. What is extremely hard, no matter where you are, one, two, three, or four ... What's really hard is getting momentum going in the right direction, and getting everybody on board that needs to be on board to move in the right direction. Once you get that momentum going, the hard part is keeping it up. That is the hard part. It doesn't matter if you're at one, two, three, or four.

Now, here's what I say: I say everybody has to be at maturity level three today. That's my bottom line. If you're not there, you need to be sprinting to maturity level three. Some companies out there are in more leading edge companies, leading edge industries that are more fully exploiting data, like I would say telecommunications, healthcare; certainly there's a ton of possibilities there. Almost ... Software, you need to be at level four, ASAP. You need to be sprinting to what it takes to be level three and four. Everybody else needs to get marching towards level five, even though it's going to be hard. It gets exponentially harder to get to these levels, but it's so worth it.

So, to answer your question, Kirill, what I would say is that what we need is still a ton more leadership in companies around data. We need more people that understand the holistic nature of data, and the fact that you need to move all these things forward at once. We need people that aren't married to technology, and married to the way things have been going for the past decade or more, and are willing to say, 'Hey, this is a great opportunity,' for example, 'for this company to get into,' you name it, S3 for example, cloud storage for this particular work load. Maybe, and I've encountered a lot of companies that they have big data, but they're not capturing it. They're not storing it. They're not doing anything with it. Maybe there's an initiative coming up that we can start to target and say, 'Hey, that's the initiative that could use big data. We're going to platform it correctly on S3 or whatever we decide,' but something that's a mindful decision. There's plenty of things like this.

As a matter of fact, another thing I like to say is that all the projects that I've worked on in the past, I'd say five years, I might be brought in to build a data warehouse. I might be brought in to build a Hadoop cluster, or a data lake. I might be brought in to strategize and build master data management. Okay, great. Inevitably, that's not the end of the story. Inevitably, the client, they're not Agile yet, so we have to break them into Agile. They're not in the cloud yet, in a big way, so we have to break them into the cloud. They're not capturing big data yet, so they don't get that, so I have to help them get that. There's like 10 things like this.

Inevitably, every project is going to be burdened with one or more of those things. If you're looking for some discipline where things are going to be smooth sailing and it's going to be exactly what management tells you to do, and nothing else, and you're not going to have to provide any leadership to break that company into some of these bigger things. You're just going to do projects. This is not the space for you. Because if I'm tasked to do a data warehouse, for example, even today, it happens; I'm going to lean hard towards putting that in the cloud. I've been doing tons of benchmarks of all these database management systems out there in the cloud. That informs my decision.

I want to put that in the cloud. Then, we have various parties within the organization that are going to rebel against that, and it's so easy for a company to say, 'Oh well, John over there,' and I don't even have to say what department John is in. 'Somebody named John out there, he doesn't like what we're doing, so I guess we're not going to do it.' That is where leadership has to step in, and see the vision, see the future, help us move toward that future. I would say to senior management about data, I would say it's going to take some leadership to move forward.

Kirill Eremenko: Gotcha, okay. Make those tough decisions and hard calls to make things happen.

William McKnight: Yeah, yeah.

Kirill Eremenko: Like you said, it's about the momentum, right? Once you have the momentum and you see the results, things are much easier after that. Okay. An interesting thing I would love to get a bit ... extract a bit more information out of you on is these data maturity levels. It's a very interesting concept. Can you give us maybe some examples of what is a level one versus a level three versus a level five, for instance? Just so we can get a feel for what we're talking about here. Maybe some people listening to this that might get kind of a sense for where their organization might be on this maturity scale.

William McKnight: Okay. Yeah, sure. I'm glad you're interested in data maturity. A lot of people are out there, so I'm more than happy to talk about it. If all the listeners could get their hand ready and create a fist with their right hand-

Kirill Eremenko: I'm doing it right now.

William McKnight: I'm going to go ... Yeah. I'm going to go through one, two, three, four, and five; real quick, obviously. You can start holding up fingers as this makes sense for you. Now, maturity one, go ahead, hold up a finger, because I'm going to give that to you. I'm going to give that to everybody. I'm not going to fool around. Everybody is at maturity level one. Some maturity models start with zero. Mine starts with a one, so you got that. Now, there's nothing good about maturity level one except that two follows on.

In maturity level one, you're talking about there's no consistent information management architecture. You don't have true enterprise tools. You don't have data management quality or governance. You don't have multiple ... You do have multiple overlapping data stores, a lot of redundancy. Everything's behind schedule. You don't have well-done data modeling. You don't have a data dictionary. You don't have a good path to production. If that's you, you can stop right there. But, hopefully some of us can go forward now. In reality, when I do my classes, when I do ... this for organizations, obviously we get a whole lot more nuanced. We break it down, and everybody gets maturity levels for each of the four categories.

Maturity level two, I'm going to say some of the things there, you're starting to get into the cloud. You're starting to go Agile. You have something that somebody calls a data warehouse, or maybe you have many of them. It's not a dirty word. [inaudible 00:42:15] ... company. It's okay. It's okay to have a data warehouse. It's up and running, and it's doing some things, let's say. That's sort of your basic stuff for level two. Some of you got to hold up one more finger. How many are you holding up now, Kirill?

Kirill Eremenko: Well, two so far.

William McKnight: Two for you.

Kirill Eremenko: I'm getting a bit nervous about number three.

William McKnight: Yeah, number three. Here we go. With number three, you have data specialists. You have some data standards. You have a great balance between centralization and non-centralization. You are covered when it comes to data privacy and things like GDPR. You're on top of that. You have MDM starting. You have third party data that you're bringing in. You have a commitment to the cloud; not everything in the cloud yet, but you have a commitment to the cloud. You have Hadoop or some sort of distributed file storage in development; maybe not in production, but in development.

In your databases, you have ... you're using in-memory, or you're using columnar, or maybe both, but you're at least using one of those things to a great degree. You get what I'm talking about, first of all, and you're doing it. You have some data governance going on. Your data warehouse, it's not just a data outhouse that happens to have some data from your operational system. You've got some slowly changing dimensions in there. You're doing data quality transformations. Overall, you're actually doing this thing called data platform selection. It's not just same old, same old. You're doing platform selection and you're making heads up decisions about how you're platforming data. It's not just, 'We need more data. I guess we'll put it in yet another Oracle database, because that's what we do here.' That's level three in a nutshell. Some of you got to put up one more finger.

Now, it gets really hard. If you think it's hard so far, maturity level four, the data layer as acknowledged by the organization. You have data retention. You are doing predictive analytics. You're starting to break into artificial intelligence. Okay, so that is now being considered as part of all your initiatives. You might say, 'Well, what does artificial intelligence doing on a data maturity spectrum?' Well, like I said before, data is the foundation for artificial intelligence. That's been proven. That will continue to be true. It's on my maturity spectrum for data. It's actually part of data science, if you think about it. It's a way to exploit your data.

Technology, now you've got your file systems, not just database, but you've got file systems now in production. You have analytic stores, meaning it's not all vanilla Oracle, SQL Server, what have you. It's not OLTP databases all over the place. You've got some specialized analytic stores. By the way, that's an important part about maturity is that today, we're definitely in the world of specialization; which means that one size does not fit all, which means that as you make decisions, you have a lot to choose from. You should be choosing from a lot, and you should be able to absorb a lot of different technologies inside of your organization; not redundant technologies, but a lot of different ones, and making them all work together.

There's a lot in technology when it comes to level four: data visualization, not just reporting, data virtualization, that helps with this strategy of having so many technologies. You've got NoSQL or NewSQL in your operational area, not just OLTP databases. Cloud is now your default for all data. You don't have things like open source databases in production. You have streaming data, not just ETL. You have in-memory stores, and you have columnar stores, because there's a place for all of this in a modern organization; at least a mid-size company and up.

Another thing I do, Kirill, in this is if you're a mid-size company and you can define that different ways, but I give you an extra finger, an extra maturity level because this gets ... this is really geared towards a upper mid-size and enterprise level kind of global 2000 type of organization.

Kirill Eremenko: I was about to say, I was thinking that as well. But maybe let's go through level five, and then we'll get back to that, the size of the organization.

William McKnight: Okay. ... Just to finish off some of the high points of level four. You've got a chief data officer now. Okay, you're doing organizational change management; because all these data projects is bringing about a lot of organizational change. You got to do that. Oh, and you have a data lake now, which is another thing I think is really important today. Most data lakes out there are really data swamps. They're not doing anything.

Kirill Eremenko: I love your metaphors.

William McKnight: They're really not, you know? But if you do it right, it does a lot for the company. That's part of maturity level four. Okay, maturity level five now.

Kirill Eremenko: Yeah. Here we go, this is the pinnacle.

William McKnight: Here we go, okay. I mean if you're at maturity level five, you need to get on the Super Data Science Podcast with Kirill.

Kirill Eremenko: Oh yes, please. Please email me.

William McKnight: And give your story, because data is now an asset in your financial statements. It's on the tip of the tongue of all your executives. You've got all development in the organization happening within architecture. By the way, you're all in on artificial intelligence. Wherever artificial intelligence can help a new initiative, it is. Your data integration is streaming. You have graph databases. There are specialized analytical stores for workloads with requirements not suited for the EDW, so you're past this 1990s notion where the EDW is all things to all people. It's a lot of things to a lot of people, but it's ... they've become kind of ... cluttered a little bit now. I don't mean that in a bad way. I mean they're highly used by so many constituencies, but there's going to be room for specialized stores in the analytical arena. You've minimalized the use of cubes; remember cubes. They don't have a place in-

Kirill Eremenko: [crosstalk 00:48:45] Yeah, everybody loves cubes.

William McKnight: Well, I don't love cubes.

Kirill Eremenko: Yeah. No, that was sarcasm. Just for our listeners, cubes are like these formations in SQL that you can query and get ... it's like a way of querying SQL, extracting data from different sources of tables. It's really not the most efficient way.

William McKnight: That's right, that's right. Now you're starting to look at GPU databases, not CPU, but GPU. You have data governance across the board. You have a mature organizational change management program. You have true self-service business intelligence where you've disintermediated IT and other forms of people between people and their data that are going to use it for business purposes. You not only have a CDO, but you have a chief information architect, or something similar; because it's that important.

Kirill Eremenko: CIA.

William McKnight: You have ... Yeah, yeah. ... In your enterprise data warehouse, you're measuring data quality, you're scoring data quality, and it always hits that mark. It's always above the standard. You have those architecture plans I talked about before where you have one year, three year, five year; and you're actually doing something with it when you're platforming data. And anyway, Kirill, on and on, and on. But those are some of the high points of the maturity levels, so maybe there's somebody out there with five fingers up right now. I don't know.

Kirill Eremenko: Well, fantastic; and probably a criteria would be you called William and asked him to help you with services, and William declined your offer because he has nothing else to add. ... William always has something to add.

William McKnight: There's always something to add when it comes to data, because ... Now, here's another point to what you just said. Next year, the model's going to be different, right?

Kirill Eremenko: True.

William McKnight: Every year the model's going to be different. Who knows? That five I just mentioned, that's going to be four next year. Who knows?

Kirill Eremenko: Yeah.

William McKnight: There's going to be a lot of escalation in this. It's moving forward fast, so you have to keep going. This gets back to that momentum I talked about before.

Kirill Eremenko: Gotcha, gotcha. Thank you so much. I was really enjoying this ... feeling myself going deep into all these things. Very, very detailed description of maturity levels of data, and I'm sure a lot of us have got a lot of value out of it. What I wanted to touch on was what, like you mentioned, I think it was after level three or level ... Well, during level four where you give an extra finger to those who are like small or mid-size, or smaller organizations. That's what I felt, like when I was applying the same logic to ... I totally, like no questions asked, this works for enterprise level organizations, large organizations, 100%. I'm thrilled for our listeners who are from those types of organizations, and who got to hear this.

My question is for, like for instance, our company. Our company is like only 15 people strong. It's a start-up, and at the same time, we're servicing quite a lot of students. We have several hundred thousand students worldwide. We have necessities for certain parts of what you mentioned, like we're all in the cloud. We are constantly evaluating new tools and things like that. On the other hand, we don't have a need for a data lake. Well, not yet, anyway. We don't have a big data Hadoop cluster, or we don't even need these OLTP databases. We don't need Oracle and things like that.

In some areas, on the other hand, like you mentioned in maturity level four, there's artificial intelligence. On the other hand, we're exploring using artificial intelligence and enhancing our services that way. We kind of like ... I feel like we're penetrating your levels of maturity that you described in certain aspects, but very, very specifically targeted, laser focused. But mostly, we're sitting at level two. Should I consider the organization sitting at level two, and start trying to get to level three? Or, is there a different scale or a different approach that you have for younger organizations that are less mature in general, not just in the sense of the data?

William McKnight: Well, I'm not going to let you off the hook, Kirill. I'm not going to let smaller organizations off the hook when it comes to your data maturity. The best I can do, and I've thought about it, but the best I can do is just to add one. If you think you're at two, then you're at three; which means you're at standard. You're at level for right now, but I would encourage you and everybody out there to keep it moving forward.
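As a rough illustration of the scoring William describes, here is a minimal Python sketch. It assumes each of the four components (strategy, architecture, organization, technology) gets its own level from one to five, that the overall level is the weakest of the four (an assumption based on his point that the four have to move together, not a rule he states), and that a smaller organization gets the one extra level he mentions. The names `Assessment`, `overall_level`, `meets_baseline`, and `smaller_org` are hypothetical and do not come from his model.

```python
from dataclasses import dataclass

# The four components of data maturity discussed in the episode.
COMPONENTS = ("strategy", "architecture", "organization", "technology")

# William's stated bottom line: everybody should be at level three today.
BASELINE_LEVEL = 3


@dataclass
class Assessment:
    scores: dict               # component name -> self-assessed level, 1..5
    smaller_org: bool = False  # hypothetical flag for the "extra finger"


def overall_level(assessment: Assessment) -> int:
    """Return an overall maturity level from 1 to 5."""
    for name in COMPONENTS:
        level = assessment.scores.get(name)
        if level is None or not 1 <= level <= 5:
            raise ValueError(f"{name} needs a level from 1 to 5, got {level!r}")
    # Assumption: the weakest component sets the overall level.
    weakest = min(assessment.scores[name] for name in COMPONENTS)
    # Smaller organizations get one extra level, capped at five.
    bonus = 1 if assessment.smaller_org else 0
    return min(weakest + bonus, 5)


def meets_baseline(assessment: Assessment) -> bool:
    """True when the overall level reaches the level-three bottom line."""
    return overall_level(assessment) >= BASELINE_LEVEL


# A small company that rates itself at two across the board.
startup = Assessment(
    scores={"strategy": 2, "architecture": 2, "organization": 2, "technology": 2},
    smaller_org=True,
)
print(overall_level(startup), meets_baseline(startup))
```

With the extra level for a smaller organization, a company at two across the board comes out at three, which matches William's answer above. Under the weakest-component assumption, a large company that invests only in technology stays at the level of its least mature component.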

As a matter of fact, one thing that I like to do, because sometimes I'm kind of the rent a CIO guy for some companies, right? I'm doing performance incentives, and things like this, for the people. ... For example, if you're a data integration architect at the company, it's not good enough anymore to simply respond to requests and make the users happy. Of course we want to do that. That's very important, but we also want to grow the maturity of our discipline within the organization. I want to see signs of maturity growth within data integration from everybody that works in data integration.

I don't want them to just be sitting there with a pen and paper taking down orders. We're not order takers here anymore. We can't be. The time for that is way past. This is not the job for the order taker. This is the job for the leader. This is the job for the person who understands that they have to impose their will sometimes, and at the same time, not only be great technically, but also be great interpersonally and organizationally so that you can do that without ruffling too many feathers. Because here's another thing that I like to say a lot, Kirill, which is it doesn't take more time and money to do it right. What it takes is discipline, focus, and knowledge. It's incumbent upon all of us in data to go and get that.

Kirill Eremenko: I'm just going to repeat that. That's a good one, it doesn't take more money and time to do it right. It takes more discipline, focus and knowledge. Very, very true. Yeah, I just wish more organizations followed that motto, kind of like that philosophy, and totally appreciate what you're saying about that sometimes it takes the person that understands these things ... They just need to push through that initial phase, and then the momentum is going to keep going.

So, on that note, you mentioned this maturity scale changes with time, and quite fast. I wanted to get your feel for where is the world going? Because 20 years in consulting, dozens, I'm assuming, probably more, of companies that you've helped and worked with. What is your impression? Where are we going? What is coming in the next three to five years that organizations should prepare for, and in which direction should they be looking? What are the most important trends that you see now?

William McKnight: I think artificial intelligence is the most important trend that I see now. How you can start getting ready for that, again, is to get your data act in order and start to collect all data at a detailed level, in the right platform, at quality standards. Not perfect, necessarily, but at quality standards. Because what we need to be doing is looking for opportunities, not just with AI, but with all the possibilities that are out there. IOT is another one, streaming data, predictive analytics. These are the things that are going to set companies apart.

While it may be great that you have a great supply chain, you have great customer service, ... All of these things are tickets to entry anymore into business, they're expected. So beyond that, beyond that, what are you doing with your data? That's what's going to tell the tale of the future. Look for ways, number one, let's say you can't think outside the box. Okay, look for ways to be more efficient about what you do today.

Kirill Eremenko: Gotcha.

William McKnight: So maybe it's in your call center, because call centers are rapidly changing, for example. They're going hard to chat bots, whether we like it or not. They're going pretty hard in that direction. So by 2020, people are going to manage about 85% of business relationships without human interaction.

Kirill Eremenko: Wow.

William McKnight: That's just where things are going.

Kirill Eremenko: Yeah.

William McKnight: How about your shipping? The cost of handling misplaced items. How about automating paper based and human intensive processes? How about predicting small things like flight delays, for example, based on maintenance records, and getting in front of these things that are going to happen. Just look for efficiencies in your company. You will find them, and you need great leadership to be mapping modern technologies to these problems, but that's always been true and that always will be true.

Kirill Eremenko: Gotcha. Thank you so much, William, that is amazing ... some amazing insights. It makes them even more valuable that they're weighted with so much of your experience and knowledge. I just have one question for you: can you believe that it's been 55 minutes that we've been chatting? Like for me, it felt like five. How crazy.

William McKnight: [crosstalk 00:58:58] I'm just passionate about this. I feel like it's so important. I know that at the end of the day, it helps people. That really gets me going ... is helping people in their careers, and so on, and so whenever I feel like I'm doing even a tiny, tiny part of that, I'm all about it. The time has flown.

Kirill Eremenko: Yeah, yeah. Totally. I could totally feel your passion. It was a wonderful conversation. I have so many more questions, but unfortunately we've run out of time. I want to thank you deeply and kindly for coming onto the show. Before I let you go, I just have to ask: what are the best places for our listeners to get in touch, follow you, maybe read more about your insights and ideas, listen to, you have a podcast; just please mention all and any of the links that will be valuable to our listeners.

William McKnight: Well, everything's going to be at the website, which is McKnightCG, for 'consulting group,' dot com. You can also find me on LinkedIn. You can also find our company page on Facebook. I tweet a few times a day, so my Twitter handle is WilliamMcKnight. You can keep up with me at all of these places.

Kirill Eremenko: Wonderful. I just noticed on your website, ... Oh, let's not forget the podcast, Data Decoded.

William McKnight: Data Decoded, yeah.

Kirill Eremenko: On the IBM.

William McKnight: IBM Analytics Insights Podcast.

Kirill Eremenko: I just noticed on your website, you have the Inc 5,000 list, congratulations on that. That's a major achievement for a business.

William McKnight: Thank you.

Kirill Eremenko: That's really cool. Okay, and yeah ... All of those will be in the show notes. Anybody can get the links there. One final question I have for you today: what is a book that you would like to recommend to our listeners to help them enhance their careers.

William McKnight: A book. I'm a reader, so I've always got a few going. ... Okay, so one I'm reading now is called 'The Champion's Mind: How Great Athletes Think, Train, and Thrive.' That's for the sporty side of me, but there's also some knock-on over to the work side, right? Because I believe that what we need to be doing is forging champions' minds out there, and that's going to make all the difference in the world. This book talks about that, brings in some examples of great athletes of our time, and how they think, and how hard they work. That's really inspiring to me.

Kirill Eremenko: Awesome, and who's the author?

William McKnight: Jim ... I can't hardly say this, Afremow.

Kirill Eremenko: Jim Afremow.

William McKnight: Yeah.

Kirill Eremenko: The title of your book is 'The Champion's Mind.' All right, well there you have it, ladies and gentlemen. Thank you very much, William, for coming on the show. Really appreciated your time today and all the insights you shared.

William McKnight: You got it.

Kirill Eremenko: So there you have it, ladies and gentlemen. That was William McKnight of McKnight Consulting Group; and a very exciting podcast indeed. I hope you enjoyed it as much as I did. You could probably feel the passion that William has for what he does and the way he communicates. It makes it extremely simple to understand what data maturity is and how to assess it in your organization. I wonder how many fingers you've got to ... how many fingers you were awarded in his rating from one to five. It would be so interesting to understand what the average for our podcast listeners is.

In any case, my personal favorite was the quote that he mentioned, "Company strategy is data strategy." That is where the world is going, and that is what we're seeing. Data is so ubiquitous in every organization, and the organizations that do use it and do incorporate it in their strategies, those are the organizations that are going to survive and thrive.

Of course, as usual, you can get all of the links in the show notes, which are available at There, you can also find the transcript for the episode, and any other materials that were mentioned. Finally, if you enjoyed this podcast and you know somebody who has ever wondered about data maturity of their organization, or at which stage their organization is in terms of data science analytics, or maybe you know an executive, or a board member, or a director who might benefit from this, then forward on this episode to them and share this knowledge. Help them also understand, benefit from this wisdom, from these insights that William shared with us today.

On that note, I hope you enjoyed today's podcast. Can't wait to hear you and see you back here next time. Until then, happy analyzing.

Kirill Eremenko

I’m a Data Scientist and Entrepreneur. I also teach Data Science Online and host the SDS podcast where I interview some of the most inspiring Data Scientists from all around the world. I am passionate about bringing Data Science and Analytics to the world!
