Jon Krohn: 00:00
This is episode number 865 with Cal Al-Dhubaib, head of AI and data science at Further. Today’s episode is brought to you by ODSC, the Open Data Science Conference.
00:16
Welcome to the SuperDataScience Podcast, the most listened to podcast in the data science industry. Each week we bring you fun and inspiring people and ideas, exploring the cutting edge of machine learning, AI, and related technologies that are transforming our world for the better. I’m your host, Jon Krohn. Thanks for joining me today. Now let’s make the complex simple.
00:49
Welcome back to the SuperDataScience Podcast. I’m delighted to have my longtime friend, Cal Al-Dhubaib, a tremendously gifted communicator and data science entrepreneur as our guest on the show today. Cal is head of AI and data science at Further, a data and AI company based in Atlanta that has hundreds of employees. Previously, he was founder and CEO of Pandata, an Ohio-based AI and machine learning consultancy that he grew for over eight years until it was acquired by Further a year ago. He delivers terrific talks. Don’t miss him if you have the chance. He holds a degree in data science from Case Western Reserve University in Cleveland.
01:23
Today’s episode should appeal to any listener, particularly anyone that would like to drive revenue and profitability from data science or AI projects. In this episode, Cal details why his first startup was unsuccessful, but how the experience allowed him to discover an untapped market and build Pandata, a thriving data science consultancy. He talks about his unconventional strategy of requiring clients to make a sizable upfront commitment that initially scared away clients but ultimately attracted the best ones. He talks about the way core values, inspired by his “tin can to Mars” thought experiment, shaped his hiring and company culture, and how making data science boring, helping his clients trust AI systems, and delivering a clear return on investment became his formula for success.
02:06
All right, you ready for this invaluable episode? Let’s go.
02:14
Cal, welcome to the SuperDataScience Podcast. We’ve been friends for a long time, so it’s cool to have you now on the show. How are you doing, man? Where are you calling in from?
Cal Al-Dhubaib: 02:22
I’m doing well. I’m at home in Cleveland, Ohio today. It’s so great to be here.
Jon Krohn: 02:27
For people watching our YouTube version, you have a beautifully decorated office. You did that yourself?
Cal Al-Dhubaib: 02:33
I did. I did. This was just before the pandemic. I found myself working from home once a week and this was going to be my reward to myself. I get to work from home one day a week, and so I wanted a beautiful space to be inspired. Then this became my very fun prison cell during COVID and I started arranging it with all the things that brought me joy. So, that’s the explanation of what’s behind me right now.
Jon Krohn: 02:59
Do you remember how we met? Do you remember the time that we met?
Cal Al-Dhubaib: 03:03
It was at Open Data Science. I believe it was the very first time we hung out at ODSC West. Was it like 2021?
Jon Krohn: 03:10
ODSC West, that’s right.
Cal Al-Dhubaib: 03:10
Yeah.
Jon Krohn: 03:12
It was probably something like 2021. I think it was the first post-pandemic conference that I’d maybe been to, period. I was in an Uber van with you and you were so positive that I was like, “Who is this guy? What is he on, and what’s his deal?”
Cal Al-Dhubaib: 03:32
I’m just excited to be there.
Jon Krohn: 03:33
I know. Then I discovered over time that that is how you are all of the time. That is Cal, which is really cool. You’re maybe the most positive, happy person in data science. That can be the title of the episode.
Cal Al-Dhubaib: 03:47
I would love to have that be my brand.
Jon Krohn: 03:52
So I’m sure that having that attitude is helpful as an entrepreneur in data science. So, you have a fascinating story. I’d love you to fill our audience in on how you created a data science consultancy from scratch yourself, having the confidence to do that. I’m sure there’s tons of listeners out there who have thought, “You know, I wonder what it would be like if I left the comfort of my employer and tried doing data science on my own.” So yeah, fill us in on how that went, how it all came about.
Cal Al-Dhubaib: 04:25
So there’s lots of fun twists and turns that we can dive into later, but the story in a nutshell is I was actually studying computational neuroscience at Case Western and I was dealing with a lot of the application of mathematics and data. I had an internship with a health system where I was looking at public health data. That was my very first entry into the world of data science. I was encouraged after demonstrating some of my undergraduate research at a research showcase to consider commercializing. I had no idea that one could say I want to start a business. So, the entrepreneurship center, they showed me how to file an LLC. I’d take this big giant leap into this knowing nothing, raised some money, got into an accelerator, and this thing falls flat in about a year and a half.
05:18
I can talk about all the lessons I learned about not selling data science the right way. But an interesting outcome of this is we had some research pilots with hospital systems and I was trying to get them to sign these agreements to say they’ll eventually buy this software. In return, I was doing free research for them. So, when that first startup collapsed, some of these researchers were upset. They’re like, “We were depending on this work.” It was funny to me because I asked the question, I’m like, “Wait, you would pay for this?” So it turns out all along the thing I was giving away for free was actually in growing, booming demand. So, to give you a snapshot in time, this is the end of 2015, the start of 2016. That’s the birth of Pandata.
06:03
In Northeast Ohio where I’m based, there were fewer than 150 data scientists at that point in time. About a third of them worked for IBM, the Cleveland Clinic or Progressive Insurance, household names for many people. Then these massive enterprises based here had one or two. So, I started Pandata to address this talent gap. Over time, Pandata grew, bootstrapped painfully by focusing on what was hard to cultivate. So, as data analytics platforms and business intelligence became more commoditized, it became machine learning.
06:39
As machine learning started to get commoditized, it became machine learning in heavily regulated environments. That’s ultimately how we found our niche. After growing that business and surviving the pandemic for about eight years, completely bootstrapped, we were acquired almost a year ago this month by a company called Further. That’s where I’m now their head of AI and data science. Love to tell you about all of that, but that’s my story in a nutshell.
Jon Krohn: 07:06
Yeah, let’s dig into some of these things in more detail. That was a really great summary. It’s amazing that you deliver that so concisely. It’s like eight years of your life distilled into just a few minutes, which is nice.
Cal Al-Dhubaib: 07:22
There’s something to that. So, you talked about this whole, “Why am I so positive all the time?” I’m actually a very grumpy person when it comes to certain things. I’m like, “No, we’re not going to analyze the data that way.” But I found early on with Pandata that when I was selling data science, most people didn’t know what data science was. It was like a salt shaker. It’s like, “Hey, can we sprinkle some data science on it and maybe some money is going to come out on the other side?” It was this novelty and those people didn’t understand what this thing was. A lot of data scientists at the time were leaning so far into being smart, being very technical, and working on the algorithms that were the next greatest thing that they actually liked it when other people didn’t understand what they did.
08:03
On the other hand, I really wanted to find ways for people to get it because I found that when they got it, they were a lot more likely to work with us and stick with us longer. That ultimately became one of the core values that I built Pandata around and the type of people that I hired. So, my motto over the years evolved into “make it boring.” If I can use cats and dogs and puppies and weird little memes to explain a very complicated mathematical concept, you bet I will. People love it. It is fun. It can be easy, it can be funny. People go, “Wow, that was easy.” So that’s become a big part of how I do things.
Jon Krohn: 08:42
I like that you in the same breath described it as making it boring while simultaneously having cat memes. It seems like you’re making it more interesting, but I know what you mean that you’re making it less technical. The data scientist isn’t going to feel like, “Wow, this is…” They’re not going to be seeing all the equations that make them excited. It’s making it boring for data scientists and making it exciting for everyone else.
Cal Al-Dhubaib: 09:10
Exactly. Exactly. Actually, I had the same reaction from a class I gave recently, and they’re like, “That wasn’t boring at all.” I’m like, “But you now understand how boring AI is.” They go, “Yup.” I’m like, “That’s the point.”
Jon Krohn: 09:21
So let’s talk about the kinds of things that made Pandata so successful. We already have this “make it boring” idea: making it boring for data scientists and easy for your clients to understand the data science that you’re delivering. What are the other keys to scaling a successful data science consultancy?
Cal Al-Dhubaib: 09:41
So something that I didn’t quite know at my first startup that really stuck with me is this notion of product market fit. Anyone who’s in the space of entrepreneurship will hear this term bandied about. For those of you who haven’t been in the field of entrepreneurship, what that means is you found a pain point that someone is willing to spend something on solving. There’s enough of those people at enough scale, you know how to reach them and you can consistently deliver that thing that they’re willing to pay for. Clients vote with their money. I found early on, because I bootstrapped, that meant I didn’t raise any capital, the only source of growth I had was when a customer is willing to pay for it. So, it’s one thing when somebody says, “Hey, that’s a great idea.”
10:27
It’s another thing when they’re willing to sign a big check for you to solve that problem, and then they come back to you to solve that same problem or similar problem again and again and again. So, product market fit and listening to what people were willing to spend on was a really big part of Pandata. My first year, all I had to do was say, “Hey, we can do data science things.” I was able to land a few contracts here or there, but it was a rotating window. I’d work with one enterprise and then they’d go away. Another enterprise would come. That’s a very common story for consulting companies. There were maybe one or two clients that stuck around or kept coming back to us. I remember having a conversation with my stakeholder there.
11:07
I finally worked up the guts and I said, “Not that I want you to question the situation at all, but why are you coming back?” I was really trying to do some market research and understand, and it turns out that they really liked that we were approachable. That was one of our core values: hold back the jargon, always speak plainly. Then there were a couple of formulaic things that we accidentally ended up doing. We have this process called discovery and design that now is a mandatory requirement. Anybody that hires us to do any work, I say, “You have to do this upfront or I won’t work with you.” With those clients, we accidentally did it.
11:47
That’s where we spent just 30 days, six weeks diving into a problem, trying to figure out, “Where are the skeletons? Is this solvable? How can we approach this? What are the unknown unknowns?” Which is a really big part of solving problems that have not been solved before with pattern matching algorithms just to simplify it. So, I tried to recreate that magic. So, there were these attributes that we had that became our core values. We had five core values that I can talk about later. Then there are these processes, and one of these processes was discovery and design. Now, the funny thing is I decide, all right, I’m now no longer going to work with any client that doesn’t want to do this, and we’re going to charge an arbitrary amount of money. That engagement size is now $50,000.
12:36
At that time, that was a measly $12,000. I was really a first time entrepreneur and nervous about throwing that about. But I’d say, “Hey, you know what? Unless you’re willing to spend this, I don’t even want to work with you.” It helped me weed out two things. One, clients that weren’t serious. If they weren’t willing to pay that, they definitely weren’t willing to pay for the rest of the engagement. Two, if they didn’t philosophically agree with the importance of that step, then I knew that they were likely to be a client that was consistently disappointed by the results because they didn’t quite get the data science process.
13:08
So, I went from spending a lot of time talking to a lot of people that seemed interested at first in data science to getting no, no, no, no. My pipeline started to dry up, and this was one of three times that Pandata’s bank account reached less than a month’s worth of expenses. I was like, “This was the end. This was maybe the dumbest idea.” Within that same period of time, I landed three of the biggest clients I had ever engaged, two of which remained clients until Pandata’s exit, so over a period of about six years. That process became a part of how we were able to scale so much larger than most small solopreneur consulting shops.
Jon Krohn: 13:54
So the key was having this 30-day discovery and design initial engagement at the beginning of trying to consult with somebody and you’d say, “There’s going to be this $50,000 price point to do that initial 30-day engagement.” So that initially seemed to put you in peril where your pipeline dried up. Everyone was saying no. But then it did ultimately lead to discovering solid long-term clients that were with you for six plus years. Cool.
Cal Al-Dhubaib: 14:23
Well, and so I would use this tactic, and now I use this tactic to scare off non-serious people. It actually allows me to save them time. It allows me to save time. Then I find the companies and the groups that say, “Heck, yeah, that sounds amazing. I love how you think about this.” There’s a lot of fish in the sea. It’s all about this matchmaking process. One of the counterintuitive lessons I learned was the art of saying no or ruling others out by saying no to them. It really allows you to spend more time on the bigger things, the higher value things. This is a common tactic I see among most of my friends who are wildly successful.
Jon Krohn: 15:01
Excited to announce, my friends, that the 10th annual ODSC East (Open Data Science Conference East), the one conference you don’t want to miss in 2025, is returning to Boston from May 13-15! And I’ll be there leading a hands-on workshop on Agentic AI! ODSC East is three days packed with hands-on sessions, and deep dives into cutting-edge AI topics, all taught by world-class AI experts. Plus, there will be many great networking opportunities. No matter your skill level, ODSC East will help you gain the AI expertise to take your career to the next level. Don’t miss out — the Early bird discount ends soon! Learn more at odsc.com/boston.
15:46
Right, right, right. That is tricky. It’s very hard to say no to smaller or more challenging projects because you remember those times where you got to only a month’s worth of expenses left in your bank account. You’re like, “Well, I guess I better say yes to everything,” but then that ultimately slows you down. You have the death by a thousand cuts of just all of these low value touch points.
Cal Al-Dhubaib: 16:12
Well, it’s funny, when we were going through due diligence on this acquisition, there were about three points on the balance sheet and the financials that they had virtually circled. They’re like, “We want to talk about this, this, and this. We don’t like that.” I said, “I didn’t like those either.” Those were really bad moments for me too.
Jon Krohn: 16:30
Right, right, right. All right. So, let’s talk about… What did you call them, your five pillars?
Cal Al-Dhubaib: 16:37
Core values.
Jon Krohn: 16:38
Core values.
Cal Al-Dhubaib: 16:39
So along the way, I became a member of this group called the Entrepreneurs Organization. It’s a global network of 16,000 members worldwide. To qualify, you have to have a business that generates over $1 million in annual revenue where you’re a founder, majority shareholder, and they have an accelerator program for businesses between a quarter million and a million. So, before I really knew what I was doing, I got into this accelerator program and it helped equip me with a lot of mindsets around how to grow a scalable business, how to think about a business as an operating system. One of the concepts is core values, not as something that just sits on a wall or things that we say, “Hey, we do things with trust and integrity,” which is great.
17:24
We want everybody to do things with trust and integrity. That’s a gimme. Core values are really those unique attributes of character that, when individuals exhibit them within your organization, help you be even more successful. They’re usually patterned off of the strengths of the founders. So, the way in which you discover these core values is to ask: if you think you’re about to go on a tin can to Mars, who are the five people you would choose to take with you? What is it about those five people that you really want with you on that mission? And then on the flip side, who are the five people you’d never want to have in that tin can? What about them makes you not want to bring them along? It’s really this process of what about these attributes?
18:13
So I found for Pandata that there are these five values and we kept refining them over the years, but it was be approachable, win together, cultivate trust, pursue growth, and tame uncertainty. So, these are five traits that are somewhat related to data science, but also somewhat related to the way in which we approach data. So, the most important to me that was the hardest to find in data scientists was this idea of be approachable. In fact, in our interviewing process, we had interview questions aligned to each of the core values. We’d specifically look for individuals who naturally gravitated to explaining things without leaning on jargon. They looked for ways to help the other person understand. Taming uncertainty was also another really big data science indicator.
19:08
Individuals who get poorly framed problems, not a lot of assumptions, but naturally gravitate towards, okay, I’m going to unpack this, I’m going to list my assumptions out. If I don’t know what they are, I’m going to know how to get them. Cultivate trust was another really big one. Because there’s this pattern of a lot of unintended consequences that have unfolded in the world of machine learning and AI over the years, I look for individuals who didn’t need to be reminded to press pause and ask difficult questions.
19:41
I would look for individuals who, even if they didn’t know how to do it, had the capacity to engage in a difficult conversation. There was a period of time where it wasn’t widely known that machine learning algorithms discriminated, for example, against people of color. That’s a very difficult conversation to have if you’re not even willing to use the words to describe this is a person of color.
Jon Krohn: 20:08
Yeah, I haven’t figured all this joke out yet or how this would work, but I like how one of your values is tame uncertainty, and that makes me think about in data science the bias-uncertainty trade-off. Then now you’re all of a sudden talking about bias. Is that where you’re going with this?
Cal Al-Dhubaib: 20:23
Yeah. So, this all comes full circle. So, this package of traits when you put them all together helped me find individuals that would thrive in my environment and consequently would do the things that would make clients hire us again and again and again. So, these core values combined with our processes were the secret sauce behind Pandata.
Jon Krohn: 20:44
Very nice. So, the five values being approachable, winning together, being able to tame uncertainty, pursue growth, and cultivate trust, those five core values combined with these chunky initial engagements that get people to commit, you can do discovery and design solutions to the problems that you discover, get the skeletons out of their closets. Between those two facets, the core values and that big initial engagement, you were able to have huge success at Pandata. Thanks.
Cal Al-Dhubaib: 21:17
Well, it’s crazy, right? A lot of people think core values are these soft things or fluffy things, but I hope that this gives an example to those maybe considering going into entrepreneurship of how powerful it can be to have these simplifying things that are really unique. It doesn’t have to be a single word. It can be a phrase, it can be a motto, but it’s these concepts you rally around. If you play with it, it really turns into a powerful process. You use it in interviewing, in evaluating promotions and raises. You can talk about it in context of, “Hey, you really exercised A, B, C core value on this one client engagement,” or “You really didn’t. Let’s talk about that.”
Jon Krohn: 21:53
I like how you had interview questions related to them. I also like how you came up with your core values. Everybody needs to sing a little bit of David Bowie, here am I sitting in a tin can, and think about who you’re in that tin can with and who you want to be in that tin can with.
Cal Al-Dhubaib: 22:11
I mean, when I was talking to you, I saw your face light up. I’m like, “I know your five and I know you definitely know the five you don’t want on there.”
Jon Krohn: 22:19
I’d want to spend some time thinking about it. I’ll spend some time maybe after this podcast episode thinking about that. It’s hard while I’m in the middle of the episode trying to focus on what you’re saying and keep this interview on track. But yeah, I had some things that came to mind right away. I immediately thought of one or two people that I definitely want to be with and one or two that I definitely wouldn’t. Amazing. So, yeah, I’ll expand on that a bit after the episode personally, probably won’t share.
Cal Al-Dhubaib: 22:52
Yeah, I’d love to hear what those values are.
Jon Krohn: 22:55
I’ll tag them on LinkedIn.
Cal Al-Dhubaib: 22:56
Great. I love it. I love it.
Jon Krohn: 22:58
Five people I definitely want to be in a tin can with, five people I definitely don’t. Tag them on LinkedIn. That’ll be a fun post.
Cal Al-Dhubaib: 23:02
Tag all 10.
Jon Krohn: 23:05
Exactly, exactly. All right, so let’s move on a little bit. Something that Pandata ended up specializing in was highly regulated environments. So, doing data analytics, building machine learning, AI systems within highly regulated environments. Tell us about how Pandata ended up getting into these highly regulated environments, what those environments were, and the particular challenges that you face in those kinds of environments.
Cal Al-Dhubaib: 23:33
So we ended up working with hospitals, life sciences companies, higher education institutions, energy and utilities, some mild defense work, and then financial services.
Jon Krohn: 23:50
Mild defense.
Cal Al-Dhubaib: 23:51
Mild defense. So, not actually all the cool exciting stuff, but defense companies looking at operational efficiency and little use cases like that. But you still have these setups where you’re dealing with code that maybe lives in an environment that’s totally internet air gapped and you’ve got a VPN into super secure systems. It was crazy getting to see all the logistics. In fact, on my team, the joke was if you didn’t have at least three laptops, you weren’t busy enough with client work. So, we often got super secure devices shipped out to our team members and it was really cool. So, that’s what I mean when I talk about heavily regulated environments. Now, how we got into this was actually probably a touch of not stupidity, but naivete at the start of this journey.
24:40
As it turns out, you need big insurance bucks if you’re going to serve this type of a market. The companies that would typically go after these engagements were looking at minimum spend, multi-seven figure. We came in and we’d say, “Well, the small folks don’t know how to do this and they don’t want to engage the big folks. I think there’s an opportunity here to build something scalable.” I had my early background in healthcare research and working with hospitals, and that was the early phases of Pandata. It helped build this track record of saying, “Hey, we’ve worked with very sensitive data before.” As we built up that portfolio, it became a lot easier to navigate that work. As we did that, we had to be a lot more cautious going through data protection training, data privacy training.
25:33
It’s not to say that all of these different data privacy and data protection laws are the same, but you start to see patterns. You start to see generally acceptable patterns about how you think about privacy, how you hit stop and ask some questions, think about unintended consequences, the ways in which variables can reveal information unintentionally, et cetera, et cetera. You also think about your obligation, secure passwords. We ended up building processes that helped differentiate us for these mid-sized projects that the big guys really didn’t care about. It felt great.
26:07
Every so often we’d steal a project from McKinsey or Accenture and that just made my day. But as we started to build up that portfolio, we started to build up the expertise in, “How do we handle these tricky situations? How do we talk clients through it?” That ultimately became a big part of the reason why Further wanted to acquire Pandata.
Jon Krohn: 26:30
Very nice. Yeah, it’s a great story there and I can see how this is helpful for people who are thinking about their entrepreneurial journey and where could they carve out a particular niche. Where could they find product market fit? That was a great idea there where okay, there’s tons of data science projects out there that are six-figure contracts that McKinsey and Accenture don’t want to do, and that might be a pretty good engagement for somebody just getting started on some data science consulting.
Cal Al-Dhubaib: 27:01
Well, it’s interesting because those very same projects are the ones that their companies actually want to de-risk. Even if they have a data science team, they actually want somebody else outside to put a layer of protection between their team’s work and when this product is delivered. So, over time, we built up insurance, we built up practices, we built up these processes that were so airtight. We were way more secure than any of our clients were. We worked with some major enterprises and it allowed us to navigate those spaces, charge a premium, and those same clients would come back and work with us again and again and again.
Jon Krohn: 27:35
Very cool. Where did the Pandata name come from? Suddenly, I’m curious about that.
Cal Al-Dhubaib: 27:41
So funny story. I was racking my brain. I had just failed my first startup and I was doing these data science tutorials. Pandas was on my mind. I was like, “This is a data science company. Let’s go with Pandas, Pandata.” Then I used a very early form of an AI logo generator. So, the Pandata logo was actually generated by an algorithm that used simple A/B testing. It’s like this or this, this or this.
Jon Krohn: 28:18
Wow.
Cal Al-Dhubaib: 28:19
It eventually came up with a font, logo. The brand was pretty much AI generated back in 2016.
Jon Krohn: 28:25
Wow, that’s cool. It does make sense that you would have Pandas in there. For people who aren’t aware, super popular data frame, open source option.
Cal Al-Dhubaib: 28:37
If you’re listening to this podcast, I hope you know what Pandas are because it’s a part of the building blocks and history of data science.
Jon Krohn: 28:47
The lovable bears. So, when you’re coming into these kinds of organizations that you’re working with, higher education, defense, there’s probably a lot of people in those organizations that aren’t data literate or AI literate. How do you tackle that?
Cal Al-Dhubaib: 29:00
So literacy is I think one of the biggest barriers to AI adoption, and I love to share this with a story that is just one of my favorite examples of how AI or lack of AI literacy can backfire. We were working with a health tech company and they partner with insurance carriers and what they’re trying to do is they’re going through claims. They have about 50 million or so patient lives represented in their database and they’re trying to identify who might qualify for certain government assistance based off of recent diagnoses, medications. It’s not as clear cut as it sounds. There’s guidelines that social security puts out and they had a rules-based approach.
29:47
They would go through and try to identify who are the 10,000 or so members that we want to reach out to this month and help them through that process. Then when they get reimbursed, our client would get a small fee for helping with that increased coverage. So, it’d be a direct mail. Hey, we think you qualify, we can help you, we’ll help you with the application process. So, machine learning is actually a really useful approach here. We partnered with them early on and we built a model and it was an ensemble model. We showed that our model could actually help identify 30% more patients than they were identifying with their traditional approach. This was exciting, 30% more revenue, made a lot of sense. Let’s pilot this. We rolled it out.
30:39
If they reached out to 100 people, they’d get about 10 people that would go all the way through this process. In our pilot, out of 100, only two made it through this process. It was worse than the human-led approach and we almost got fired. This was really, really, really bad results. I was racking my brain. I’m like, “No, no, we validated this. The stats were great. We double ran the numbers.” It turns out that the team responsible for sending out or selecting the mailing pool had year-end bonuses that depended on the quality of their selection, and then it was also marked with “pilot.” They didn’t trust these new individuals, this new mix of individuals we were bringing in. We started to understand why. We had a slightly different diagnosis mix.
31:28
We were finding patients that ultimately would’ve gotten the subsidy, but they had different combinations of characteristics that this group just wasn’t used to looking at. So, it looked weird and they didn’t want to risk their year-end bonuses. It said AI pilot all over it and totally failed. So, we went back, we worked with this team, we sat down with them, we showed them this is how we train a model. This is how we know it works. We put up some of those cases that they thought were weird and we started to work through them. Why would you not trust this? What about this do you need? So we ended up ultimately doing this educational tour, built their trust in the process. Then two, we built a process where we could use explainability.
32:10
Anytime a patient was predicted, we could show the factors that were contributing most to that prediction. That also built more trust in the process. We relaunched the model with no changes to the model and we were able to help them grow their reach by 18%. So, the only thing we really changed is how humans interact with the model. That was one of the very first times I started to see the importance of cultivating AI literacy. There’s a lot of great courses out there that I think satisfy this. I love to do little workshops with cats and dogs, but we got folks like Andrew Ng who does AI for everybody, now gen AI for everybody, our dear friend Cassie Kozyrkov who does amazing work with decision intelligence and does it in such a fun and playful way.
33:00
So, I try to bring in materials and inspiration from folks like that to help build that intuition of, “Well, how does AI go wrong? When can we trust it?” I find that when you empower organizations with that skill, they’re able to do much more with the tools they already have.
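As a rough illustration of the explainability step Cal describes, showing the factors that contribute most to each prediction, here is a minimal sketch. It is not the model or tooling used in the project. It assumes a simple linear (logistic) risk model, and every feature name, weight, and patient value below is hypothetical.

```python
import math

# Hypothetical weights for an illustrative linear risk model.
# None of these features or numbers come from the project described above.
WEIGHTS = {
    "age_over_65": 0.8,
    "chronic_conditions": 0.5,
    "prior_year_er_visits": 0.3,
    "household_income_10k": -0.4,
}
BIAS = -1.0


def predict_with_factors(patient, top_k=3):
    """Return (probability, top_k factors ranked by absolute contribution)."""
    # In a linear model, each feature's contribution to the score is weight * value.
    contributions = {name: WEIGHTS[name] * patient[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Rank factors by how strongly they push the score in either direction.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked[:top_k]


# Hypothetical patient record.
patient = {
    "age_over_65": 1,
    "chronic_conditions": 3,
    "prior_year_er_visits": 2,
    "household_income_10k": 1,
}

probability, top_factors = predict_with_factors(patient)
print(f"Predicted probability: {probability:.2f}")
for name, contribution in top_factors:
    print(f"  {name}: {contribution:+.2f}")
```

For non-linear models, teams often use attribution methods such as SHAP values instead. The idea is the same: put the reasons behind a prediction next to the prediction so reviewers can judge whether an unusual-looking case makes sense.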
Jon Krohn: 33:18
Nicely said, and really great story there to bring to life. Actually, throughout this episode, it’s been great how you bring in specific case studies that make it easy to understand the principles that you’re describing. Another aspect of consulting that I suspect you might be able to provide a good analogy for, I guess we’ll find out now the pressure’s on, is that when you’re doing consulting, you have to be able to demonstrate that you’re delivering a return on the client’s investment, some ROI.
Cal Al-Dhubaib: 33:49
Oh, my gosh. Yeah.
Jon Krohn: 33:51
How have you achieved that in your engagements?
Cal Al-Dhubaib: 33:55
So this might sound super obvious, but one of the biggest missteps is not having the right value hypothesis at the onset of a project. I talk about this a lot in the context of AI because we get enamored with the model itself. Like the patient example I was just talking about, we get enamored with, “Can it accurately predict who’s going to qualify?” versus really stopping to think about, “What decisions are we influencing and what does the success of that decision look like?” In another example where we worked with cancer readmissions, we were working with a health system and we were trying to help them build models that could help them with solid state tumor cancer patients, who nationally have a readmit rate of about 25%. It’s really bad.
34:49
So, if you can use machine learning to identify who’s most at risk, you can maybe come up with a more effective intervention. In our very first iteration, we were actually modestly successful. It’s a very tricky population to work with because of the complexity of the patients. But the providers would say, “Okay, now what? This patient’s at risk, what do you want me to do?” It’s actually cognitively overwhelming and not helpful when you have this alert and you’re not actually offering a solution. So, we actually had to go back to interpretability again and think about what would be useful to know. Is this patient at risk because they’re on a certain type of medication or they have a certain comorbidity or some social attributes that we knew about the patient?
35:36
They come from a zip code that has low reliable access to transportation. So, this patient actually might be readmitted not because they’re super sick, but because they might use the ER in an inappropriate way. So, let’s get them some social assistance. So, it’s really interesting to think about when we talk about models, we get very excited as data scientists about the accuracy of the models. We don’t stop to think, “What levers, what decisions are we influencing? What’s the value of a successful intervention?” And then realistically, how often is that decision going to be influenced by whatever work we’re doing? So we now work with clients upfront to try to map that out and we try to challenge it. We try to stress test it.
36:20
In this readmission example, we actually would say, okay, we can identify 80% of these patients and we think realistically that there’s only going to be a 20% intervention rate. So, the value is based off of that. We stop and check with the client: hey, if this is the only value that we get, is it still worth X investment to try to figure out this process? And then you go through the project. As you learn more, as you validate your models, you keep going back to that early hypothesis.
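The value hypothesis Cal outlines can be sketched as back-of-the-envelope arithmetic. The 80% identification rate and the 20% intervention rate come from his example. The patient count, the value per successful intervention, and the investment figure (the "X" he mentions) are hypothetical placeholders, and the function names are illustrative only.

```python
def expected_value(at_risk_patients, identify_rate, intervention_rate, value_per_success):
    """Value delivered if only a fraction of at-risk patients are identified and acted on."""
    return at_risk_patients * identify_rate * intervention_rate * value_per_success


def worth_pursuing(value, investment):
    """The 'stop and check' question: does the realistic value still cover the investment?"""
    return value >= investment


# Rates from the example in the conversation.
IDENTIFY_RATE = 0.80      # share of at-risk patients the model can identify
INTERVENTION_RATE = 0.20  # share of identified patients where an intervention realistically happens

# Hypothetical placeholders, not figures from the project.
AT_RISK_PATIENTS = 1_000
VALUE_PER_SUCCESS = 10_000  # assumed value of one successful intervention
INVESTMENT = 500_000        # assumed project cost, the "X" in the conversation

value = expected_value(AT_RISK_PATIENTS, IDENTIFY_RATE, INTERVENTION_RATE, VALUE_PER_SUCCESS)
print(f"Expected value: {value:,.0f}")
print(f"Worth pursuing: {worth_pursuing(value, INVESTMENT)}")
```

Rerunning the same calculation as the model is validated and the real intervention rate becomes clearer is one way to keep going back to that early hypothesis.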
Jon Krohn: 36:49
Very nice, well delivered, concisely again and with an example, five stars, great work.
Cal Al-Dhubaib: 36:57
I hope I can keep delivering. I don’t know.
Jon Krohn: 37:00
All right. So, in all of these projects that you’ve done over all these years, the eight years that you had Pandata and then now continuing to do consulting work at Further, you’ve hired a lot of data scientists over the years and you’ve alluded a little bit earlier in the episode to the kinds of things you’re looking for. You have interview questions around your core values, for example, but what are the key skills or attributes that you look for in data science hires today? For example, it seems like AI engineering skills are a really big deal recently.
Cal Al-Dhubaib: 37:32
Oh, my gosh, and hard to find. The most challenging and frustrating thing about data science today really just over the last 10 years is what being a data scientist meant 10 years ago, 5 years ago, 3 years ago is drastically different than today. So, some of the technical skills that we’re looking for today are individuals who are really strong with evaluation and quality control, specifically when it comes to multimodal data.
38:03
So, text-based data, image-based data, the ability to formulate a statistical design and say, “Okay, we’re working with this unstructured problem and we need to come up with a way to measure is our approach working and do it in a way where we’re not creating statistical weirdness, where we’re basically predicting something we’ve artificially put into the data.” So we really look for individuals who have that skill. We’re also seeing the rise of machine learning engineers specifically with this AI engineering flavor.
38:37
Right now, we’re actually looking to hire folks who have more experience with deploying language models and running MLOps or whatever we want to call it today, AIOps around these language models and how we diagnose when they’re being prompted, when they’re drifting, when they’re behaving weirdly. But those are some of the technical skills we look for.
38:59
As far as the soft skills, I’d say that the core values that were really successful at Pandata for me are going to be evergreen and the traits that I can consistently look for in individuals. Regardless of the importance of the technical skill today, tomorrow, five years from now, we’re always going to want people who explain things plainly, who naturally are driven to learn the next thing, who can deal with ill-formed problems. These are things that if you really want to stand out as a data scientist, they’re worth cultivating.
Jon Krohn: 39:32
Nice. Thanks for those tips on what you’re looking for from data scientists that you hire both in terms of hard and soft skills. I’d love to learn more about the culmination of a lot of the things that we talked about today. So, we talked about core values, hiring great people obviously just now, delivering a return on investment to clients. Ultimately, that allowed you to be acquired by Further, a much larger consultancy. So, what was that process like? How did you start to think, “Wow, I’ve built this company over eight years. I might like to sell it,” and then you find a sell? I mean, how does all that work?
Cal Al-Dhubaib: 40:10
So over the years, different folks have reached out and they’d say, “Oh, I think we want a consultancy,” or “We like what you’re doing. We don’t have a consultancy, but we need a data science team.” It’s a lot easier to just buy a data science team. So, I flirted with the idea. The goal of building a business is ultimately you do want to exit, you want to find a home. You want to find a place where your business can then create more value consistently as a part of something bigger. So, I’ve always been open to it. A funny story is Mike Gustafson, who’s the CEO of Further or president of Further had seen Pandata story in the Case alumni.
40:49
So, I went to Case Western Reserve University and he started following and thinking about the future of Further. At the time, it was called Search Discovery, and they wanted to really move into cloud infrastructure, AI, and advanced machine learning projects. They had done great work in the analytics space and performance marketing space with these heavily regulated environments. So, it made sense for them to want to get into data science, machine learning. He reached out to me on LinkedIn and I was in the middle of conference season. I totally missed his message. So, some of his employees who I was connected to, they have an office here in Cleveland. So, I’d met them at local meetups. They said, “You really should talk to Mike.”
41:36
I answered his message. We set a time to meet. I go to his office. We do these pleasantries, fellow alumni, and then within five minutes, he says, “I think we want to buy Pandata.” I said, “Oh, okay.” As much as I wanted to say, heck yeah, this is the thing that I wanted the most, I had to stop. I asked a few questions and one of them was tell me about your core values. It wasn’t, again, because I’m going to this whole kumbaya warm, fuzzy place. It was actually I wanted to understand if I were to sell what I have to this organization, what is the compatibility of my client base that would now be moving over?
42:16
Would they stay? Would they have people taking care of them that are similar to the people taking care of them today? Would my team thrive? Would they be able to move over and seamlessly move into projects? And even though the core values are named different things, there was a lot of organizational compatibility. We had a lot of the same problems that we were trying to solve, same attitudes, similar pay scales.
42:38
So, when I saw that I could take my engine and put it into this machine and it would create even more value, that’s a win-win. That’s one of the biggest lessons learned I talked to other entrepreneurs about is find that home that maximizes the value for them and you. Not every organization is going to look like that. So, in the case of Further, it made a lot of sense. We went through diligence at a record pace. It went from this is cool, let’s explore this some more. Conversation got a little bit deeper in December. Then between January and March of last year, we went through so much paperwork and the deal came together in about three months.
Jon Krohn: 43:18
Wow. Really good story there. I love this idea of finding the right home where there’s a win-win for sure for both organizations, the one being acquired as well as the acquirer. Very cool.
Cal Al-Dhubaib: 43:29
Absolutely. I hear a lot of horror stories of similar exits and then it doesn’t work out. It’s because that upfront configuration matchmaking really didn’t happen. It’s very hard. We talked about it’s hard to say no to good things. So, I see a lot of entrepreneurs who end up pursuing it because it sounds great in theory, but then it falls flat. So, in this instance, and I can’t really end this episode without giving a plug to Further and saying, we love our new home. We’re getting to work on exciting things. A lot of that vision and excitement that we had then has come forward today because of that matchmaking process.
Jon Krohn: 44:06
Fantastic. That’s great, Cal. Congrats to you, the Pandata team and the Further team. Sounds like-
Cal Al-Dhubaib: 44:11
Thank you.
Jon Krohn: 44:12
… a great marriage there. One final question that I have for you before I get to my questions that I always ask at the end to the guest. So, my last question for you specifically is over all these years from the startup that you raised capital for right out of Case Western to founding Pandata to the acquisition by Further, what were your biggest lessons learned? What were your biggest failures that you learned from?
Cal Al-Dhubaib: 44:42
So it is interesting when you first start a business, you’re tempted to keep the idea a secret. We have a lot of pride over presumed intellectual capital or presumed something that’s valuable. Over the years, I’ve found that some of the most successful people are the most giving of their ideas, the most giving of their time, the most humble and willing to admit when they’re wrong. I think I learned it the hard way, but a lot of what defines how I approach talking with people, engaging with people is really shaped by that.
45:19
So, one of my biggest lessons learned is just be okay with being wrong and be okay with asking for help. That’s a really hard thing to do, especially when you’re a data scientist and you’re trained to be smart. You’re trained and taught that your smartness is your differentiator, and then you’re trained to become an entrepreneur where you’re trained and taught that you have to scale or go bust. The contrary advice is be okay with being vulnerable.
Jon Krohn: 45:42
Do you ever feel isolated, surrounded by people who don’t share your enthusiasm for data science and technology? Do you wish to connect with more like-minded individuals? Well, look no further, Super Data Science community is the perfect place to connect, interact, and exchange ideas with over 600 professionals in data science, machine learning, and AI. In addition to networking, you can get direct support for your career through the mentoring program, where experienced members help beginners navigate. Whether you’re looking to learn, collaborate, or advance your career, our community is here to help you succeed. Join Kirill, Hadelin and myself, and hundreds of other members who connect daily. Start your free 14-day trial today at www.superdatascience.com and become a part of the community.
46:27
Nice. That’s a great tip. I like that a lot. Really nice point to end this interview on, at least the unique questions that I have for you. I do, as I alluded to a moment ago, have questions that I end every episode with. So, we need a book recommendation from you, Cal.
Cal Al-Dhubaib: 46:44
I got one. So, I spent a lot of time on communicating. I keep trying to improve my ability to communicate. A recent book that I just finished is Super Communicators. It’s actually right here behind me and it’s by Charles Duhigg. It’s a classic in the space, but it really talks about the art of individuals who are able to… Maybe they don’t seem like the most charismatic person in the room, but they have this natural ability through conversation to get everyone to feel at ease around them and more willing to reach consensus.
47:22
So, it goes into the science of how people do that, stories and examples of it in practice. As somebody who constantly tries to improve communication, I got so much out of it. So, [inaudible] 10 recommend for anybody wanting to get more effective at just having better conversations with people.
Jon Krohn: 47:41
If people want a taste of what the Super Communicators book might be like before buying it, you could check out episode 805 of this podcast when Charles Duhigg came on and we talked just about Super Communicators.
Cal Al-Dhubaib: 47:54
I love that.
Jon Krohn: 47:55
Nice. Awesome. Then very last question before I let you go is how can people follow you after this episode?
Cal Al-Dhubaib: 48:02
Great question. So, I am the most active on LinkedIn. If any of these points resonate with you, if you’re thinking of starting a data science oriented business or you’re super excited about responsible AI and heavily regulated environments, reach out to me. I’d love to hear from you.
Jon Krohn: 48:17
Awesome, Cal. Thank you so much for that generous offer to our SuperDataScience listeners. Thank you so much for being on the show. Again, so amazing talking with you. I mean, yeah, we’ve been friends for years and so I always enjoy chatting with you. I’m not surprised that you give an amazing interview, but this was exceptional. Really well done. Thank you.
Cal Al-Dhubaib: 48:36
I was honored to finally be a part of the show, Jon. Thanks again.
Jon Krohn: 48:44
What a great episode that was with Cal Al-Dhubaib. In it, Cal covered how he built Pandata by focusing on mid-sized data science projects in regulated industries like healthcare, defense, and education that were too small for major consulting firms, but required specialized expertise like his firm had. He talked about how his company’s success was built on two pillars. First, a mandatory 30-day discovery and design phase costing about 50 grand to ensure client commitment and project clarity. Second, his core values: be approachable, win together, cultivate trust, pursue growth, and tame uncertainty.
49:16
He also talked about how making complex data science concepts accessible and boring through simple explanations and analogies proved more effective than emphasizing technical sophistication. He talked about how AI literacy among stakeholders proved crucial for project success. In one case, improving a model’s adoption from 2% to 18% simply by helping users understand and trust the system. He talked about how Pandata’s eventual acquisition by Further succeeded because of strong alignment in organizational values, client base, and vision for creating value at a larger scale.
49:47
As always, you can get all the show notes including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Cal’s social media profiles, as well as my own at www.superdatascience.com/865. If you’d like to connect with me in real life as opposed to online, I’ll be giving the opening keynote at the RVA Tech / Data + AI Summit in Richmond, Virginia on March 19th. Tickets are super reasonable and there’s a ton of great speakers. So, this could be a solid conference to check out, especially if you live in the Richmond area. It’d be awesome to meet you there.
50:19
Thanks of course to everyone on the SuperDataScience Podcast team, our podcast manager, Sonja Brajovic, media editor, Mario Pombo, our partnerships manager, Natalie Ziajski, researcher, Serg Masís, writer, Dr. Zara Karschay, and of course our founder, Kirill Eremenko. Thanks to all of them for producing another invaluable episode for us today. For enabling that super team to create this free podcast for you, we are deeply grateful to our sponsors. You can support the show by checking out our sponsors’ links, which are in the show notes. If you’d like to sponsor an episode of the show, you can get the details on how to do that by heading to jonkrohn.com/podcast.
50:54
Otherwise, share the show with people who would enjoy it. Review the show on your favorite podcasting app or on YouTube. I recently got a question about this. On platforms like Apple Podcasts, you can only review the whole podcast as opposed to individual episodes. So, that’s a limitation. I hear Spotify is going to allow you to have more sophisticated commenting soon. We’ll see. But anyway, yeah, so review the episode or the entire podcast on whatever podcasting app you use or YouTube.
51:26
Subscribe, obviously, if you’re not a subscriber. Feel free to edit our videos into shorts to your heart’s content. But most importantly, we just hope you’ll keep on tuning in. I’m so grateful to have you listening, and I hope I can continue to make episodes you love for years and years to come. Until next time. Keep on rocking it out there and I’m looking forward to enjoying another round of the SuperDataScience Podcast with you very soon.