SDS 539: Interpretable Machine Learning — with Serg Masís

Podcast Guest: Serg Masís

January 11, 2022

In this episode, Jon and Serg dive deep into interpretable machine learning, his journey into data science, and, of course, the ins and outs of his day-to-day as a climate and agronomic data scientist at Syngenta.

Thanks to our Sponsors
About Serg Masís
Serg Masís is a data scientist working in digital agriculture with a lengthy background in entrepreneurship and web/app development, and the author of the bestselling book “Interpretable Machine Learning with Python”. He is passionate about machine learning interpretability, responsible AI, behavioral economics, and causal inference.

Overview
As the author of the book Interpretable Machine Learning with Python, an epic hands-on guide to techniques that enable us to interpret, improve, and remove biases from machine learning models that might otherwise be opaque black boxes, Serg brings a wealth of knowledge on the topic.
When it comes to mastering interpretable machine learning, Serg emphasizes that feature importance is a vital component that one must absolutely master. Understanding the relative weight of individual inputs, how they contribute to your outcome, and in particular which ones matter and when, is vital to understanding your model and steering clear of potential ethical issues.
As far as what’s next in interpretable ML, Serg predicts that as methods progress, they will converge with causal inference. But that’s not all. He insists that more robust explanations are needed and adds that causal methods are a vital link in the natural progression of the field. He also forecasts that a legal framework will develop and that the rise of no-code and low-code tools will free up the time spent programming and cleaning data.
Next, Serg and Jon move on to a lighter subject – his book! With his first edition being such a hit, he’s already readying the release of his second edition later this year. What can you expect this time? Prepare to examine new examples, and read a new chapter devoted to NLP transformers. And if that wasn’t enough, he’s also co-authoring a book on Responsible AI, which he considers “one of the most important problems in machine learning.”
Now, when it comes to his role as a climate and agronomic data scientist, Serg dives into the components and problems he deals with on a day-to-day basis at Syngenta. Whether it’s working with weather data, using computer vision to examine drone images, or providing farmers with valuable insights that maximize the health and growth of their crops, the core of his work revolves around the field of agronomics, which is a branch of agriculture dealing with field-crop production and soil management.
Finally, as a busy author, data scientist, and speaker, Serg has also managed to magically fit in time to recently complete his Master’s in Data Science, which naturally led Jon to ask what compelled Serg to pursue another degree and become such a deep expert in a complex topic like interpretable ML. Despite having years of experience working with data, he had significant gaps in his toolkit regarding statistics and computer vision. He felt uncomfortable learning both topics independently and trusted the guidance that traditional education offers. Now with more confidence and knowledge than ever, Serg has officially kicked imposter syndrome to the curb!

In this episode you will learn:
  • What is interpretable machine learning? [8:41]
  • The social and financial ramifications of interpreting models incorrectly [10:23]
  • The challenges involved in interpretable ML [16:00]
  • The most important interpretable ML concepts to master [19:54]
  • The future of interpretable ML [32:41]
  • What it’s like to be a Climate & Agronomic Data Scientist [42:28]
  • Serg’s day-to-day tools [49:05]
  • Serg’s productivity tips [50:25]
  • Why Serg pursued a Master’s in Data Science [52:25]

Podcast Transcript

Jon Krohn: 00:00:00

This is episode number 539 with Serg Masís, agronomic data scientist at Syngenta and author of Interpretable Machine Learning with Python. 
Jon Krohn: 00:00:13
Welcome to the SuperDataScience Podcast, the most listened to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn. Thanks for joining me today. And now let’s make the complex simple. 
Jon Krohn: 00:00:44
Welcome back to the SuperDataScience Podcast. Today’s guest is the absolutely brilliant, Serg Masís. Serg is a climate and agronomic data scientist at Syngenta, one of the world’s leading agricultural companies. He’s author of the book Interpretable Machine Learning with Python, an epic hands-on guide to techniques that enable us to interpret, improve and remove biases from machine learning models that might otherwise be opaque black boxes. And he holds a master’s degree in data science from the Illinois Institute of Technology. 
Jon Krohn: 00:01:16
In this episode, Serg details what interpretable machine learning is, the key interpretable ML approaches we have today and when they’re useful, the social and financial ramifications of getting model interpretation wrong, what agronomy is and how it’s increasingly integral to being able to feed the growing population on our warming planet, what it’s like to be a climate and agronomic data scientist day-to-day, and why you might want to consider getting involved in this fascinating high impact field. He also covers his productivity tips for excelling when you have as many big commitments as he does. Today’s episode does get technical in parts, but Serg and I made an effort to explain many technical concepts at a high level where we could. So today’s episode should be equally appealing to both practicing data scientists and anyone who’s keen to understand the importance and impact of interpretable ML or agronomic data science. All right, you ready for this brilliant episode? Let’s go. 
Jon Krohn: 00:02:20
Serg, welcome to the podcast. It’s awesome to have you here. It’s so great to see you again. How are you doing? 
Serg Masís: 00:02:26
I’m good. Thank you. I am. 
Jon Krohn: 00:02:29
Nice. So- 
Serg Masís: 00:02:30
Can’t complain. 
Jon Krohn: 00:02:32
… we’re recording at Christmas time. People are on holidays and they often think of Christmas as kind of a snowy situation, but you, if people are watching the video version on YouTube, clearly are in a beautiful tropical location. 
Serg Masís: 00:02:47
Yes, that’s correct. I’m in my birthplace, Costa Rica. 
Jon Krohn: 00:02:52
Ah. Yeah, it looks wonderful. And I absolutely love Costa Rica. We actually talked about this a bit before we started recording, but it’s one of my favorite countries in the world. I went there several times in my early twenties on my own or with friends, and then more recently it’s become the favorite holiday destination for me and my family. We love spending time in Costa Rica. Everyone is so friendly. Everyone seems to be so happy. The food is incredible. The weather is incredible. There’s so many different ecosystems that you can explore. It is paradise. So what a wonderful place to be able to go back to on the holidays. 
Serg Masís: 00:03:35
Yes. Especially after a couple of years of not coming. It feels like- 
Jon Krohn: 00:03:40
Right. 
Serg Masís: 00:03:41
COVID kept me away, so… 
Jon Krohn: 00:03:45
Oh yeah. So we met for the first time just before COVID. Well, the year before COVID hit. So we met in spring 2019. You were the MC for a workshop that I did, an intro to deep learning at the Open Data Science Conference in New York. It was the inaugural Open Data Science Conference in New York. And the expectation was that it would happen every year, but then the pandemic hit the next year. And I’m hopeful to have ODSC in New York again soon. And then we reconnected at ODSC West 2019, which was in San Francisco. And that was the last big conference. That was the last conference that I went to before the pandemic hit. So it was nice to squeeze that in, but even since then I’ve felt connected to you. While we haven’t had a conversation and while I haven’t interacted with your face in a video, you have been supplying so many of the questions that I’ve been asking guests. 
Jon Krohn: 00:04:46
So sometimes when I have a guest coming up and I have a lot of time in advance before we’re going to record the episode, I’ll post on LinkedIn, “Hey. So and so is going to be on the show next week. They have expertise in this or that, and do you have any questions for them?” And Serg you then reply to those LinkedIn posts or those tweets with the unbelievably insightful questions for me to ask the guests. And so when yours come up, I pretty much always ask them. And so listeners may have already heard your name on the program if they’ve been listening a lot, when we get to those questions with guests, so it’s been great to continue to connect with you via that. And I knew that it would only be a matter of time before I could get you on the show and our listeners could get to know you really well. 
Serg Masís: 00:05:36
Thank you for that opportunity. I’m glad to be providing the questions. I, myself, am intrigued by those questions; that’s why I’ve asked them. I’m like, “Oh…” I’m not a podcaster myself, but I wonder about your guests. They’re often very illustrious people with amazing careers, and so I think to myself, “If I was still going to conferences in person, this is what I would ask this person.” Right? But I haven’t [crosstalk 00:06:09] had that chance since you’ve been doing the podcast and of course, since COVID, because they both kind of came at the same time a little bit. 
Jon Krohn: 00:06:15
Exactly. They did. I am very much looking forward to… In a completely post-pandemic world I plan on merging those two things by having SuperDataScience episodes filmed live at conferences with a live audience. And so, you could be there in the audience with your list of questions, and then you can say them there live to the guests and it’ll be on air. I’m really excited for that to happen. It might actually happen. We might do that as soon as March 2022, working out the details, but might do that at a conference in New York. So stay tuned for that. All right. So you, yourself, in addition to being an outstanding question-asker of illustrious data scientists, are an illustrious data scientist yourself. So you have a book called Interpretable Machine Learning. It’s an epic book, 700 pages in length. And your book… Everything that you do, Serg, I’ve been so impressed. 
Jon Krohn: 00:07:14
You put an exceptional amount of effort into every detail. I’ve rarely seen anything like it. Whether it’s your book, your website, recordings of presentations that I’ve seen you do, you put a huge amount of effort into getting every detail of preparation, execution, and then post-production. It’s incredible. Listeners, anything that Serg does, I highly recommend checking it out. [crosstalk 00:07:44]. Yeah. I don’t know if I’ve waxed so lyrically about kind of the polish that people put on things ever on the show before, like you do. So your book, exceptional. So we’re going to actually go over it. So this episode is going to focus primarily on interpretable machine learning, the topic of your book. Later on in the episode, we will talk a bit about what you do as a data scientist day-to-day on top of being an author. And maybe if there’s time, your journey into that. 
Jon Krohn: 00:08:16
But the piece that, yeah, that we’re really going to focus on is interpretable machine learning. So we’re going to talk about what it is. We’re going to talk about various approaches that are available for being able to interpret machine learning algorithms as well as to kind of fine tune them. And then what’s up next in the interpretable machine learning realm. So we’ll start off with the topic that’s part one of your book. It’s an introduction to interpretable machine learning. So tell us, why does interpretable machine learning matter? What is it? Why does it matter? And is it the same as explainable AI? 
Serg Masís: 00:09:00
Yes. Yeah, it is very important. It’s very important for the same reason that you ought to debug your code; you ought to understand it at an intricate level. For newcomers to the field, AI is almost like an extension of software because it’s often replacing software. It has software-like properties, because it’s automating things, but it’s not software. I mean, we don’t know it deterministically in the same way we can know about software. Right? So when an AI does something funky, we cannot just point to a line of code and say, “Oh, that’s why it did it.” It’s not that easy. So that’s the reason why I got into it, to begin with. I had a startup many years ago and that was my frustration with that project. 
Serg Masís: 00:09:55
I didn’t know the term for it, but once I knew the term for it… I mean there was a book back then. It was like a booklet. I think it was the very first one for practitioners on the subject. And once I found it, that was like… “Holy (beep). This exists. People are talking about this problem.” And then I found out, yeah, they’ve been talking about it for a long time in academia, but in industry it hardly made a difference. So the reason it should make a difference, of course, and I point it out in the first chapter, is, well, there’s ethical reasons. Very important ones. As they say, “Trust is mission-critical,” and as long as we’re making products with AI we have to understand how they’re impacting people. We could be impacting them correctly like 90% of the time, but 10% of the time we’re creating these ethical ramifications we’re not aware of. 
Jon Krohn: 00:10:55
Sending innocent people to jail. 
Serg Masís: 00:10:58
Exactly. That’s a possibility. 
Jon Krohn: 00:11:00
[crosstalk 00:11:00]. We’re only sending innocent people to jail 10% of the time, man. 
Serg Masís: 00:11:03
Yeah, exactly. There can be awful effects and there’s the kind of effects that just compound. By themselves they’re a small percentage, not very terrible, but they compound because they’re affecting millions of people. So that could be the effects of social media or something to that effect. Nobody thinks of it. It’s not life and death, but it certainly affects people psychologically in profound ways. 
Jon Krohn: 00:11:36
Totally. 
Serg Masís: 00:11:36
So you have that ethical reason. And then, from a business standpoint, there’s also the question of public relations. It can have huge ramifications there. It can have financial ramifications, as we saw with Zillow. So we have to understand our models. We have to understand what they’re for, what scope they’re useful for, because they can be very good at one thing, but then we can’t just repurpose them for everything. So that’s basically what the first chapter talks about. And to answer your other question about explainable AI, in my view, they are the same thing because they’re used interchangeably, at least in industry. There’s folks in academia that want to redefine them and give them specific meanings. Specifically, there’s a camp that says interpretable machine learning is what is used in industry for big complex models whereas explainable AI is… Or do I have it the other way around? Actually I do have it the other way around. They say explainable AI is for the big models, deep learning and so on, the complex ones, and then interpretable machine learning is for the simple ones. I don’t distinguish. I think they’re used interchangeably and there’s no point in redefining them right now. As far as the terms, explainable and interpretable, something I do caution about in the book, and the reason I prefer interpretable machine learning versus explainable AI, is that explainable just adds a level of hubris to the whole thing. Because if you think of it, semantically, interpretable is something you don’t know. You have to interpret, you have to understand, whereas explainable is something, “Oh, I can explain it.” You know? 
Jon Krohn: 00:13:38
Right. 
Serg Masís: 00:13:39
And it’s kind of semantically tricky because of course I’m explaining an interpretation, right? So it’s not easy to separate both terms, but I think people become over confident of their explanations if they call it explainable AI. 
Jon Krohn: 00:13:58
That is a beautiful explanation. And I totally agree with you that it doesn’t make sense to be saying, oh, XAI is for, whatever, models with more parameters and interpretable machine learning isn’t, because I know, and something we’re going to get into briefly is that… or sorry, that we’re going to get into shortly, it might not be brief, but we’re going to get into it shortly, is that a lot of these techniques for interpreting machine learning models are agnostic to the model. So it- 
Serg Masís: 00:14:27
Yeah. 
Jon Krohn: 00:14:28
… it doesn’t matter whether it’s a giant model with a billion parameters or a regression model with two parameters. It’s- 
Serg Masís: 00:14:34
Yeah. 
Jon Krohn: 00:14:35
Yeah. Every week I talk to leaders in data science about the techniques and approaches they’re using to make sense of data. The common thread, learning. They all make time for learning and encourage the same of their teams, but between meetings and the actual work that you’ve got to complete every day, all-day trainings are impossible for most of us. This is why an on-demand learning platform like Udemy Business makes sense. Accessible whenever you and your team need it by either your browser or an app. With Udemy Business, you can access over 500 cutting edge data science courses taught by industry experts and validated by other learners’ real time reviews. Amongst these 500 courses you’ll find my own Mathematical Foundations of Machine Learning course as well as dozens of mega popular courses from SuperDataScience instructors, like Machine Learning A-Z and Data Science A-Z. If you enjoy this free podcast, I encourage you to support us by visiting business.udemy.com/sds. That’s business.udemy.com/sds. From there, you can discover how to democratize data science learning in your teams through Udemy Business. 
Jon Krohn: 00:15:42
Yeah, so we’ll get into that shortly. And I think we’ve kind of covered the intro to the topic. So we’ve covered why it matters. We’ve talked about the kind of key concepts, I think, already. What are some of the big challenges in interpretable machine learning? 
Serg Masís: 00:16:04
Well, there’s a lot of challenges. I mean, as you mentioned, one of them has nothing to do with the model, even though a lot of people think the culprit is the model; it’s the data, right? 
Jon Krohn: 00:16:19
Right. 
Serg Masís: 00:16:21
I completely agree with the data-centric AI movement. Data is the center of everything, but a lot of people are saying, “Okay, it’s the model.” It’s a big, complex model and it’s true, it can amplify a lot of the issues with bias that are already embedded in the data, but it’s a fool’s errand to not look at the data and, specifically, the data generation process. And so, there is this idea that you can just take any data and just throw the most complex model in there and just make it work. And that’s also a bad way to look at it. So I think as long as we’re trying to understand our models actively, choosing the right model for the job covers, to a certain degree, that problem with the data. I can’t say completely, because we always have to look at other things. And on the level of reliability, we have to make sure the models are reliable; assessing the reliability and robustness of a model is also within the realm of interpretable machine learning. 
Serg Masís: 00:17:37
And I think more and more folks are coming to terms with that. One thing is having the model work well in the lab when you’re testing or evaluating it with a holdout data set, and another is doing it with real-world data as it’s coming in. So there’s a push towards that and that is a problem. And then there’s also another issue with the models that has nothing to do with the data, but precisely with decisions that are made during the process. And I already alluded to it a little bit: what model class you use, why you choose it, how you hyperparameter-tune it. Are you trying to mitigate something during that process or not? Is there some kind of domain knowledge that you have to inject into the model training? Often, and this is another thing that I cover in the book, people don’t think of things in terms of cost. And I think it’s an important concept, cost-sensitive training. Making sure that you’re aligning the model to the mission that it’s meant for, right? Making sure that… Like in my line of work, which we’re probably going to get into later, there’s also the cost. What’s the cost of treating a plot with chemicals versus not treating it, you know? 
Jon Krohn: 00:19:08
Right. 
Serg Masís: 00:19:09
Because that’s going to come in handy for misclassification. And you want to know, what is the mission? Yeah. You have a mission of maximizing profit, but at the same time, that can’t be all the mission. You want the plot to be sustainable in the long run. You don’t want to drown it with chemicals just because. 
Jon Krohn: 00:19:29
Yeah, yeah, yeah. 
Serg Masís: 00:19:30
So- 
Jon Krohn: 00:19:32
To be clear on that point of plots and chemicals, we’re talking about plots of land, not graphical plots. 
Serg Masís: 00:19:37
Yeah. Agriculture. Agriculture, yes. Plots of land. 
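Serg’s cost-sensitive point can be sketched in a few lines of Python. Everything below is hypothetical and purely illustrative (the `COSTS` figures and `best_action` helper are made up for this sketch, not anything from his book or from Syngenta): instead of picking the most probable class, you pick the action with the lowest expected cost.

```python
# Cost-sensitive decision-making: choose the action with the lowest expected
# cost rather than simply the most probable class. All cost figures here are
# hypothetical, illustrative numbers.

# COSTS[action][true_state]: cost of taking `action` when `true_state` holds
COSTS = {
    "treat":    {"infested": 10.0,  "healthy": 40.0},   # chemicals cost money and stress the soil
    "no_treat": {"infested": 200.0, "healthy": 0.0},    # losing the crop is far more expensive
}

def best_action(p_infested):
    """Choose the action minimizing expected cost, given P(plot is infested)."""
    def expected_cost(action):
        return (p_infested * COSTS[action]["infested"]
                + (1 - p_infested) * COSTS[action]["healthy"])
    return min(COSTS, key=expected_cost)

# With an asymmetric cost matrix, the decision threshold is no longer 0.5:
print(best_action(0.10))  # "no_treat": infestation is unlikely enough
print(best_action(0.25))  # "treat": expected crop loss already dominates
```

Note how the break-even probability here works out to about 0.17, not 0.5, which is exactly the "aligning the model to the mission" point: the misclassification costs, not raw accuracy, drive the decision.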
Jon Krohn: 00:19:41
That will become clear later when we get into your agricultural expertise. Beautifully said. So that, it was a great introduction to why interpretable machine learning matters, what it is, the challenges that we face in this space. Let’s move on to the second part of your book now on mastering interpretable machine learning. So there are some big sections in there. I kind of picked out the ones that I thought might be most interesting to introduce to listeners of this program. So the first big thing that you talked about in the mastering interpretable machine learning section of your book is on feature importance. So what is that? It sounds important. 
Serg Masís: 00:20:20
Yeah, it is. It is. I think it’s of utmost importance to understand what your model finds more important and [crosstalk 00:20:31] in general. So- 
Jon Krohn: 00:20:31
So this is figuring out… You could have some large number of inputs into, especially, a big [inaudible 00:20:41] network today. And so the idea here is that you’re figuring out the relative importance of individual inputs, how they contribute to the same outcome. 
Serg Masís: 00:20:50
Absolutely. Absolutely. And that’s one of the most basic methods you can use for interpretation of a model. So you want to understand not only which ones are important, but when they are important. What values trigger them? Are there any thresholds you should be aware of? Because that can help you in the long run implement some things. They could form part of a feedback loop towards the business as well, because they might not understand this themselves: that there’s a certain value within a certain feature that acts as a trigger, that it’s either used for discrimination in a classification model to say, okay, it goes to this class or another, or, for some reason, it makes a big difference in a regression model as well. 
Jon Krohn: 00:21:55
Yeah, yeah, yeah. If listeners are familiar with a simple linear regression model, in that case you have a beta weight for every single one of your inputs, and you can just say, “Okay. I know exactly how my input relates to my outcome,” but with a lot of modern models, you could have many, many, many inputs. And so these feature importance techniques for interpretable machine learning, they allow you to, in a way, reverse engineer that same kind of relationship to be able to say, “Okay, this feature has this much impact.” And it’s interesting, you talked about this idea of threshold, of maybe limits. And I wasn’t sure exactly what you meant there, but it gave me the idea that it could show you that for some particular inputs if you go outside of a range, if you go too low with that input or too high with that input, you will get unusual results with your model, perhaps. 
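One common model-agnostic way to do what Jon describes is permutation feature importance: shuffle one feature at a time and measure how much the model’s error grows. A minimal sketch, with a hand-written `model` standing in for a trained black box so no ML library is needed:

```python
import random

# Permutation feature importance: break the link between one feature and the
# target by shuffling that feature's column, then measure the error increase.
random.seed(0)

def model(x):
    # Stand-in for a trained black box: it really only uses features 0 and 1.
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]

# Tiny synthetic dataset: the target depends on the same two features plus noise.
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [3.0 * x[0] + 1.0 * x[1] + random.gauss(0, 0.05) for x in X]

def mse(X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)

baseline = mse(X, y)

importance = []
for j in range(3):
    col = [x[j] for x in X]
    random.shuffle(col)                                 # destroy feature j's signal
    X_perm = [x[:j] + [v] + x[j+1:] for x, v in zip(X, col)]
    importance.append(mse(X_perm, y) - baseline)        # error increase = importance

print([round(i, 3) for i in importance])
# Feature 0 (weight 3) dominates; feature 2 (weight 0) scores exactly zero.
```

In practice you would reach for a library implementation (e.g. scikit-learn ships one as `sklearn.inspection.permutation_importance`), but the mechanism is just this shuffle-and-remeasure loop.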
Serg Masís: 00:23:02
Yeah. 
Jon Krohn: 00:23:02
Maybe… Right. Right, right. And so that actually ties back to what you were saying with one of the challenges. So in the real world you end up with data that is out of sample. That is different from what you used to train or validate your model when you were developing the model. And so maybe these kind of feature importance techniques could allow you to say, “Okay, if we have inputs that are of this value we should be alerting somebody, because it suggests that we’re now dealing with inputs that are out of sample. Our features have drifted from what our model was designed for.” 
Serg Masís: 00:23:39
Exactly. And there’s also cases in which there’s a monotonic relationship between the input and the output. If that’s the case, you can also do something which I… I’m getting ahead of myself, but it’s in the third section of the book. It’s called monotonic constraints, in which you can tell the model, basically, even if you weren’t training with that data, that it’s monotonic and therefore it increases the odds of something or decreases it. Or if it’s a regression, it’ll just continue down the same slope. The data could tell it otherwise. The model could learn from some outlier. And as you’ve seen in a lot of regression problems, you’ll find a lot of outliers actually around the edges of your training data. 
Jon Krohn: 00:24:35
Right. Right, right, right, [crosstalk 00:24:36]. 
Serg Masís: 00:24:36
The model doesn’t know what to do with it, and especially… In a regression that won’t happen because you have a linear… It’s linear, so therefore it’s monotonic, but when you have something non-linear, as you do in many models, who’s to say what’s going to happen when it’s [inaudible 00:24:54]. 
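Gradient-boosting libraries such as XGBoost and LightGBM expose a `monotone_constraints` parameter for enforcing this at training time. A minimal after-the-fact version of the idea is simply to scan one input over a grid, holding the others fixed, and verify the prediction never decreases; the `model` below is an illustrative stand-in, not anything from the book:

```python
# A simple monotonicity check: sweep one input over a grid (others held fixed)
# and verify the prediction is non-decreasing along the sweep.

def model(x0, x1):
    return 2.0 * x0 + x1 ** 2          # monotonic in x0; NOT monotonic in x1

def is_monotone_increasing(predict, grid):
    preds = [predict(v) for v in grid]
    return all(a <= b for a, b in zip(preds, preds[1:]))

grid = [i / 10 for i in range(-20, 21)]                        # -2.0 .. 2.0

print(is_monotone_increasing(lambda v: model(v, 1.0), grid))   # True: linear in x0
print(is_monotone_increasing(lambda v: model(0.0, v), grid))   # False: x1**2 dips at 0
```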
Jon Krohn: 00:24:54
Right, right, right, right, right, right. And that idea of monotonicity, that’s the idea that if, say an input goes up, the output always goes up, or if an input goes up, the output always goes down. So that’s what this kind of monotonic means. There’s no curve in the relationship between, say the input and the output. All right. So these kinds of techniques that we’ve talked about so far, particularly something like a model-agnostic approach like a LIME or SHAP, it doesn’t necessarily depend on the inner workings of a model. That contrasts with a particular interpretable machine learning approach that I’d like to discuss next, which is one that is one of the most fun for me and that I’ve had a lot of fun experimenting with myself, is visualizing convolutional neural networks. So convolutional neural networks, they are some of the most visually gripping models that we have today. 
Jon Krohn: 00:25:57
I make use of a lot of them in my Deep Learning Illustrated book in illustrations because they can be so fun to illustrate. These convolutional neural networks have lots of layers, and as you go deeper into the layers you go from artificial neurons that detect very simple features like straight lines at the input end of the network, and then as we go closer to the outputs, the network can non-linearly recombine information. So straight lines found by the first layer of artificial neurons of the network can be recombined into curves and corners. And then a third layer can recombine those curves and corners into textures. And then as you go deeper and deeper and deeper, we can have increasingly abstract, increasingly complex visual representations handled by this neural network. 
Jon Krohn: 00:26:53
So you can have these really fun visualizations of these CNNs, these convolutional neural networks that allow you to see, “Okay, look at this specific artificial neuron in the sixth layer.” It’s clearly specialized to detect dog faces. And so you can kind of see the canonical dog face that it’s learned to detect. And then you’ll find some other neuron that’s specialized for horses and another one for cars. And so it’s really fun to look at visually. And one particular video that I often recommend is Jason Yosinski’s Deep Visualization Toolbox, if I’m remembering the name correctly. There’s a fun YouTube video that takes you on a tour through a convolutional neural network showing specific layers. He does things in real time with video where he shows a face-detecting neuron, a human-face-detecting neuron, and he shows how in real time he can move in the frame of his camera and that neuron, which is detecting his face, moves around. He brings a friend in, and then you have two faces that it’s detecting in real time in the video. And I don’t know, it’s such a fun area. And I know that you’re particularly interested in it. Are there particular tools or particular resources that you recommend checking out in this CNN visualization space? 
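The straight-line-detecting neurons Jon describes can be illustrated with a single hand-rolled 2-D convolution: a classic vertical-edge kernel produces its strongest activation exactly where a vertical line sits in a toy image. A minimal sketch, no deep-learning library required:

```python
# A tiny 2-D convolution showing how an early CNN "neuron" can act as a
# straight-line detector: the vertical-edge kernel responds most strongly
# at the vertical line in the middle of this toy image.

image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Vertical-edge kernel: a bright column flanked by penalized neighbors.
kernel = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

def conv2d(img, ker):
    """Valid (no-padding) 2-D convolution/cross-correlation."""
    kh, kw = len(ker), len(ker[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            acc = sum(img[i + di][j + dj] * ker[di][dj]
                      for di in range(kh) for dj in range(kw))
            row.append(acc)
        out.append(row)
    return out

fmap = conv2d(image, kernel)
for row in fmap:
    print(row)   # each row peaks in the center column, over the line
```

Stacking layers of exactly this operation (plus non-linearities) is what lets deeper neurons recombine such line detectors into corner, texture, and eventually dog-face detectors.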
Serg Masís: 00:28:16
There’s a ton. There’s a ton and it also… it’s not always model-agnostic because of course- 
Jon Krohn: 00:28:26
Exactly [crosstalk 00:28:27]. 
Serg Masís: 00:28:26
… deep learning it’s [inaudible 00:28:28]. The frustrating part of it is that the libraries, they pick a flavor. I know you’re PyTorch… more leaning towards PyTorch these days. 
Jon Krohn: 00:28:38
Well- 
Serg Masís: 00:28:40
There’s PyTorch tools. 
Jon Krohn: 00:28:41
Yeah. Although I would say I’m bilingual and I think there’s huge value in both TensorFlow and PyTorch. All of Deep Learning Illustrated was done in TensorFlow. If I was doing it today, maybe it would be PyTorch, and maybe a future edition of the book will… I’d probably do both because I think in… We’re going off on a bit of a tangent and I have YouTube talks specifically on the pros and cons of PyTorch and TensorFlow. And my conclusion at the end of the video is that you should really learn both because yeah, once you know one it’s very easy to learn the other. In a lot of ways they’ve kind of converged on each other. They used to be more different, but the TensorFlow people said, “Oh, look at what they’re doing at PyTorch. Some of that stuff is cool. We should do it.” And the PyTorch people said the same thing. So they converge in a lot of ways. And so, yeah, it’s pretty easy to learn both, but they have some slight different pros and cons. For example, PyTorch is a bit easier to use. It’s easier to interpret errors while TensorFlow still today has more complex deployment options for you. And [crosstalk 00:29:46], anyway, so I recommend learning both. If you want to be competitive in the deep learning job market, showing that you have both means that you’ll be applicable to basically any deep learning job. 
Jon Krohn: 00:29:57
Interested in deep learning and interested in learning live online from me? Well, then you may want to check out my deep learning certificate which I’ll be offering online starting February 2nd. Based on my book, Deep Learning Illustrated as well as years of experience teaching my deep learning curriculum in the classroom, this is the first time I’ve ever offered this online and I currently don’t have plans to offer it online again. So this could be a one-time opportunity. My deep learning certificate brings high level theory to life via TensorFlow, Keras and PyTorch, all three of the principle Python libraries for deep learning. 
Jon Krohn: 00:30:31
This foundational knowledge will empower you to build production-ready deep learning applications across all of the contemporary families, including CNNs, RNNs, GANs, and deep reinforcement learning. There will be six three-and-a-half-hour classes every other Wednesday, starting February 2nd. If you miss a class, no sweat, you’ll have access to the recording. Join me. We’ll get to run code together and you’ll be able to ask me questions in real time, if you have any. For all the details on my deep learning certificate head to jonkrohn.com/dlc. That’s jonkrohn.com/dlc. Anyway, I [crosstalk 00:31:11] completely derailed the conversation.
Serg Masís: 00:31:13
No, no, no, no, no. That’s good. In that chapter, I’m using strictly the one you speak on [inaudible 00:31:22] convolutional neural networks and specifically using TensorFlow. But the reason I chose TensorFlow is because at the moment when I started the writing, there was only one really good library for interpreting convolutional neural networks with PyTorch. And I’m also bilingual. I think my strengths are more on TF right now, but I’m using more and more PyTorch and I’m feeling more and more comfortable with it. So it took me a while, but it’s just one of those things. You become used to something, it becomes part of your process. Probably now that I’m doing the next version of this book, the next edition, I will switch it to PyTorch because the library now that I… I forget its name, but it has more methods than it had two years ago. So I’ll go for that. 
Jon Krohn: 00:32:29
Cool. All right. Well that was a fun, deep discussion on automatic differentiation libraries. So what’s next in the field? What’s the big up-and-coming topic in interpretable ML? 
Serg Masís: 00:32:48
Yeah. I think this is one of the questions I asked one of your guests, someone you were going to interview. But yeah, I think the future of this field… I mean, of course there’s always going to be a lot of work being done on advancing some of the methods. Every year there are new flavors of SHAP and LIME. There are new ways to improve Grad-CAM and integrated gradients, and to apply them to specific use cases, apply them to graph networks, apply them to transformers and so on. But I think there’s going to be a convergence, in the future, of these methods with causal inference. I think that’s where that’s heading- 
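The model-agnostic idea underlying methods like SHAP and LIME, perturb the inputs and watch how the predictions move, can be sketched in a few lines. Below is a minimal permutation-importance example in plain Python; the toy "price" model and its features are invented for illustration and stand in for any trained model with a predict interface.

```python
import random

# Toy model standing in for any trained model (hypothetical example):
# a house-price rule with one strong, one weak, and one near-irrelevant feature.
def predict(row):
    size, rooms, noise_level = row
    return 50 * size + 10 * rooms + 0.1 * noise_level

def permutation_importance(predict, rows, n_repeats=10, seed=0):
    """Score each feature by how much shuffling its column moves predictions."""
    rng = random.Random(seed)
    base = [predict(r) for r in rows]
    n_features = len(rows[0])
    importances = []
    for j in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            col = [r[j] for r in rows]
            rng.shuffle(col)  # break the feature's relationship to the output
            shuffled = [list(r) for r in rows]
            for i, v in enumerate(col):
                shuffled[i][j] = v
            preds = [predict(r) for r in shuffled]
            total += sum(abs(p - b) for p, b in zip(preds, base)) / len(rows)
        importances.append(total / n_repeats)
    return importances

rows = [(s, r, n) for s in (1, 2, 3) for r in (1, 2) for n in (10, 50)]
imp = permutation_importance(predict, rows)
# Expect size to matter most and noise_level least.
```

SHAP and LIME themselves add much more machinery (game-theoretic attribution, local surrogate models); this sketch only captures the shared perturb-and-observe intuition.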
Jon Krohn: 00:33:46
Oh.
Serg Masís: 00:33:50
… because we’re also… We need better, more robust explanations. It’s no longer good enough to say, “Look at this saliency map. It suggests such and such.” Right? How can we be certain that that saliency map that shows that this picture is indeed a dog is always going to work? How can we guarantee that? So I’m seeing more and more methods in scholarly journals looking into ways of making robust explanations. And they often have something to do with causal methods. So I’m excited about that. I think it’s going to become a bit more complicated than what people are used to in interpretable machine learning, but it’s also a natural progression. So that’s why, when I mentioned counterfactuals, I think it’s a very solid link between associational kinds of algorithms and causal ones. So- 
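The counterfactual idea Serg mentions can be made concrete with a toy sketch: given a model and a rejected input, search for the smallest change to a feature that flips the decision. The loan-style scoring rule, its weights, and the threshold below are all invented for illustration.

```python
# Hypothetical loan model: approve if a weighted score passes a threshold.
def approve(income, debt):
    return income * 0.4 - debt * 0.6 > 10

def counterfactual(income, debt, step=1.0, max_steps=1000):
    """Smallest income increase (in `step` units) that flips a rejection."""
    if approve(income, debt):
        return None  # already approved; no counterfactual needed
    for k in range(1, max_steps + 1):
        if approve(income + k * step, debt):
            return k * step
    return None  # no flip found within the search budget

# A rejected applicant: 30 * 0.4 - 10 * 0.6 = 6, which is not > 10.
delta = counterfactual(income=30, debt=10)
# delta is 11.0: income 41 gives 16.4 - 6.0 = 10.4 > 10.
```

Real counterfactual explainers search over many features at once and weigh plausibility and proximity, but the core question is the same: what minimal change would have produced a different outcome?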
Jon Krohn: 00:35:01
Super cool. 
Serg Masís: 00:35:03
… and another thing I think will also happen is, right now we don’t have a framework. I’m talking about a legal framework, a technical framework with standards and so on, precisely because it’s an evolving field, but I think a lot of those things will coalesce. And the reason they have to coalesce is not only because there’s a need, but also because there’s going to be an increase of no-code and low-code tools out there, and they’re going to want to incorporate a lot of these methods. And some of them already exist and they’re constantly being improved. PerceptiLabs, Fiddler. On the causal side, you have causaLens. And they’re incorporating a lot of these methods in their framework, which is completely drag and drop. So I see a lot of machine learning in the future is going to be drag and drop. That kind of frees our hands from all the programming stuff that we have to do day by day and can lead into more productive things. Not that I have anything against programming, but I think it takes up a lot of time. All the programming, all the manual tasks of sorting data and cleaning it and the data wrangling and so on. I think once that part… It doesn’t necessarily have to become completely automated, but certain bits and pieces, it becomes part of a pipeline we can [crosstalk 00:36:42]- 
Jon Krohn: 00:36:41
Of course. 
Serg Masís: 00:36:43
… and then we can test all these different hypotheses, which I think is at the core of what interpretable machine learning will become in the future, which is, you don’t necessarily know all the answers. You don’t assume them, you test them and you test them rigorously. And you have all the tools to do that with a click of a button. So it’s like you have things… It’s like flying a plane. You have all these different buttons and sure, it’s complicated, but the plane can fly itself. You don’t want it to all the time, but it can, right? So AutoML, interpretable machine learning, no-code, all going to connect and converge. And at the same time, a legal and technical framework which will kind of connect with all of this, blockchain to maintain, of course, some traceability of all the decisions made throughout the modeling process, as well as a lot of things that are commonplace already in [inaudible 00:37:53], drift detection and so on, to make sure that’s always on course. Because that’s an important part. You train the models and then you let them out. And that’s not the end of the story once they’re in production, of course. So I think there’s going to be a lot of that in the future, and new professions like AI auditors and so on. 
Jon Krohn: 00:38:19
Cool. You have one of the richest views of what I think is definitely coming over the next decade that anybody has expressed on the show. You touched on so many great topics there, causality, legal issues, AutoML, whole new professions coming about. Amazing answer, Serg. Thank you. So- 
Serg Masís: 00:38:38
Thank you. 
Jon Krohn: 00:38:39
Yeah. Great being able to see into the future of the data science industry in general. Let’s look into the future a little bit about your specific publishing. So I understand that this book was such a hit, you’ve got a second edition coming out soon. How soon is it coming out and what are the changes that there will be in that second edition? 
Serg Masís: 00:39:01
Okay. It’s coming out in August, I think. July or August. And it’s going to have new examples. It’s going to have an entire new chapter devoted to NLP transformers. It’s very exciting because that was a glaring omission from my current edition: only one chapter deals with NLP, and it does so through, I think, an SVM. It wasn’t even a deep learning example. So I think that’s… I asked, and hundreds of people answered my poll, and what they preferred me to write about was NLP transformers. 
Jon Krohn: 00:39:49
Nice. Yeah. So natural language processing with transformers, big deep neural networks. If you want to learn more about NLP transformers, we recently did a whole episode of the SuperDataScience podcast on that topic. It’s episode number 513 with Denis Rothman. And I think that that is actually the same person you asked about what would be happening in the future of interpretable machine learning. 
Serg Masís: 00:40:15
Yeah. 
Jon Krohn: 00:40:15
Very cool. And then, that isn’t your only book in the works. You also have a responsible AI book that you’re co-authoring. Is that right? 
Serg Masís: 00:40:24
That’s correct. Yeah. For all the bias mitigation and bias detection stuff, I have like two and a half chapters in my book, but this book is entirely about that. So I’m very excited to work on that because it’s, at its core, one of the most important problems in interpretable machine learning. I prefer to see interpretable machine learning as a pyramid. It’s often seen as having three layers. You have fairness, accountability, and transparency. Most people focus on the transparency. You say interpretable machine learning and they’re like, “Oh, what does a model think of? What is it thinking? How does it make decisions?” Right? But I argue that it’s important to look at it from the top of the pyramid. Fairness and accountability are of utmost importance, and if you cover those, transparency is the low-hanging fruit. 
Jon Krohn: 00:41:28
Cool. Can’t wait to check out that book as well. So we’ve got the second edition of interpretable ML coming out and responsible AI not too long after. So despite being such a prolific writer, churning out books at an incredible pace, and not just any books, but gigantic 700-page books with an extraordinary amount of detail and tons of hands-on examples, somehow on top of doing all of that, you also have a day job. So you work for Syngenta, which is a Swiss-based company. They’re huge. About 50,000 employees around the world. And they’re one of the world’s leading agricultural companies. So for example, they sell seeds and crop protection products. So at this leading agricultural company, you have the title of climate and agronomic data scientist. Agronomic. I’m not even sure that I’ve pronounced it correctly because it’s one of the first times I’ve seen that word. So what does it mean to be a climate and agronomic data scientist? And is there any interpretable element to what you do in that job? 
Serg Masís: 00:42:38
Absolutely. Yeah. Climate and agronomic. I mean, most people see that and say, “Okay, you work with climate.” And it’s true. I work with weather. A lot of my models, I would say all of them, or nearly all of them have some weather data in them, right? But the core of everything is agronomic and within Syngenta, which has the crop protection side, which is all the pesticides and herbicides and all that, and the seed side, it also has an ever-growing digital agronomy side. And digital agronomy is the department my department is embedded in and it has to do with giving farmers insights. So the insights can be, “Okay, you need to plant early because this is going to happen with the weather,” or, “you need to apply this pesticide right now because you have a high likelihood of having this disease come up.” 
Serg Masís: 00:43:41
And there are also things that are computer-vision based. So they have to do with the farmer pointing the camera at the plant, and it telling them the plant has reached this stage of growth, or has these signs of this disease, and so on. Also, there’s a lot of precision in it, which is why there’s also the field of precision agriculture. So you can also determine by looking at, I don’t know, satellite images or drone images, or even from within the tractor itself, what areas need to be sprayed and how much you should spray them. So there are those components as well. So I work mostly with time series, [inaudible 00:44:35] data and, to some degree, image data. Also, what does it have to do with interpretability? A lot. A lot, because actually, a lot of the data I have was made for a certain purpose, so it can be biased, so I have to find ways of debiasing it. 
Serg Masís: 00:44:58
I also have to check with agronomists. I have to check, “Does this look right? I mean, this can’t be right, right?” And so I have to show them what I can see the model sees, through interpretation methods, to kind of check things with them. Because one big concern with bringing machine learning into this field is that this field, for 30 years or so, has been using statistical methods. And by statistical methods, I mean linear regression, to some degree logistic regression, but mostly linear, and A/B testing, p-values, ANOVA, what have you. [crosstalk 00:45:42] A whole lot of that. And so a lot of agronomists, chemists, and so on, domain experts, they’re very knowledgeable and those are the tools they have been using. So when you come up to them and you say, “I’m going to use machine learning,” their biggest concern is, “Okay, well, how do I understand it?” You know? 
Jon Krohn: 00:46:02
Right. 
Serg Masís: 00:46:04
“Will it be biased?” 
Jon Krohn: 00:46:06
“Where are the P values?” 
Serg Masís: 00:46:07
Yeah, exactly. Exactly. So I have to give them something back and that’s how I do that. 
Jon Krohn: 00:46:14
Cool. So you’ve just opened my mind to a whole new area. So agronomy, I guess, this is like the science, the study of crop yields, of maximizing production, I guess? 
Serg Masís: 00:46:26
Agriculture. Yeah. 
Jon Krohn: 00:46:27
Yeah. And- 
Serg Masís: 00:46:28
And also there’s an ever-growing concern about making it sustainable, which is one of the biggest missions of agriculture right now. Because of climate change, there’s a big question mark of, can we still maximize yield given all the different unknowns that we have? Which is, well, we have El Niño this year. Well, we have a big hurricane. Will there be a drought? So we have to deal with those uncertainties as well as, of course, insects. Of course, a lot of insects are driven by climate patterns. So if it’s very moist, then you might have some kind of insect or some kind of fungus. If it gets really dry, they might not have food, so they’ll try to eat the food that we grow. So, I mean, there’s a lot of things. And then there are ways in which that affects the soil, and therefore our ability to plant. So it’s a very, very complicated causal chain. 
Jon Krohn: 00:47:35
Very cool. So it’s interesting to hear you say this. In the most recent guest episode that we had, episode 537, which was on data science trends for 2022, Sadie St. Lawrence opened my eyes to how complex tractors have become these days and all the kinds of sensors that they have. And you’ve just built on that even more talking about farmers being able to, in real time, be taking images and being able to do precision agriculture and know that they should be applying more water or fertilizer or pesticide to a particular area, and yeah, maximizing the amount of food that we have for this growing population on the planet that is still expected to continue to grow for several more decades. So I think we have something like about 8 billion people on the planet now, but we’re going to get to about 11 billion before we’re projected to have that come down. And so this kind of thing, using data and machine learning to maximize crop yield in a way that minimizes future climate change while simultaneously adapting to the climate change that is happening, wow. It sounds like a very cool area to be in. I’m sure you’re going to have some people looking up the jobs page for Syngenta looking for data science jobs there, because it sounds like really impactful work and really cool work that you’re doing. 
Serg Masís: 00:49:03
Yeah. Yeah. It is. 
Jon Krohn: 00:49:05
Are there particular tools that you use day-to-day? Interesting tools, I guess, other than interpretable machine learning tools, which we’ve already talked about. Are there particular tools that you use day-to-day that you think listeners might find particularly interesting? 
Serg Masís: 00:49:19
Well, I mean, we have tools that a lot of people are aware about, SageMaker. We recently also got [inaudible 00:49:30]. I don’t know if you’ve heard of them. 
Jon Krohn: 00:49:32
No. And even, SageMaker, you could explain. So that’s an AWS, an Amazon Web Services tool and it makes it easier to automate and deploy machine learning. Is that right? 
Serg Masís: 00:49:43
Yeah. Yeah. 
Jon Krohn: 00:49:45
Very cool. And so something that I’d like to ask you, I know that there’s a big story here, but maybe we can kind of condense it into the key points, but I know that you’ve had, maybe not the most direct path into being a data scientist, but the specific thing that I want to ask you is, you only recently completed a master’s in data science in 2019. 
Serg Masís: 00:50:14
That’s true. That’s true. 
Jon Krohn: 00:50:17
So you already have an amazing book that’s out. The second edition is coming out soon. You have another separate book in the works. So how did you become such a deep expert in particular domains, like interpretable machine learning, so quickly that you could author such dense, rich content on it for everyone to enjoy? Do you have any particular productivity or success tips that you use day-to-day? 
Serg Masís: 00:50:50
Well, I do have productivity tips regarding writing the book. I mean, I set aside some time for it and I made sure [crosstalk 00:51:01] that I was in the mood. I made sure that I was in the right mood also. So even if I set aside some time, if I wasn’t in the right mood for writing, I would simply switch it with some other time slot and do something else. It’s also good to always have some kind of physical activity. And often I would find myself getting inspired and I would have to come back from running and start typing right away, because I kind of… I don’t know what it does, but… You would know that better, but it kind of makes the blood flow, and all of a sudden all the neurons are firing and you’re like, “Goddammit, I wish I was in front of a computer, but I’m running.” So that would happen. And as far as my trajectory, yeah, I recently completed a master’s. What was it like? A couple years ago? But- 
Jon Krohn: 00:51:54
Yeah, 2019. So just yeah, just two, three years ago. Yeah. 
Serg Masís: 00:51:58
Yeah. But I’m not new to data or analytics in general. 
Jon Krohn: 00:52:05
I see. 
Serg Masís: 00:52:07
I mean, I’ve been doing- 
Jon Krohn: 00:52:12
That makes a lot more sense. 
Serg Masís: 00:52:14
… I’ve been doing SQL since 1999, ETL on big data since 2006… 
Jon Krohn: 00:52:19
I see. 
Serg Masís: 00:52:20
… Python since 2009, and machine learning since 2015. But the reason I got in to do my master’s was because I found I didn’t have the statistical grounding that I thought I needed to have. I mean, I had programming skills. I mean, you could say I was kind of a hacker, because I would find ways to solve things with the tools I had available, which at that time, many years ago, did not include machine learning, but I would find ways of solving problems, whether it was a fraud detection problem. It would be through a rule-based system. I would find patterns in the data and solve it. [crosstalk 00:53:04] And of course you could say, “Well, that’s not really data science,” but I mean, I had a strong connection- 
Jon Krohn: 00:53:09
Well, it’s AI. 
Serg Masís: 00:53:12
… I had a strong connection with the data, and I realized that I needed better tools for that, and I needed to improve the tools I already had. So even though I had used machine learning for my startup back in 2015, ’16 and ’17, I kind of felt like I had big gaps. There were the tools I knew, and then tools that I was kind of too afraid to learn on my own, and I wanted to learn them. So one of the coolest things when I did my master’s was that I took a computer vision course. Right? So I got to learn computer vision from a computer vision expert teaching me all the aspects of pattern recognition, pre-AI, how that worked, OpenCV, how to deal with those.
Serg Masís: 00:54:06
And then one of the things that kind of came at the end of the course was, “Okay, by the way, this is how a convolutional neural network works.” But I already had all the background, so that was super valuable: the mathematical background, the pattern recognition stuff, understanding how it works in the brain, obviously not to your extent of knowledge. But even though it was a short period, it worked for me. I’m not saying it’s everybody’s path, but the master’s worked for me in the sense that I covered all these gaps. I went strictly with an agenda and it was like, “I’m going to take this course. I’m going to take this course. I’m going to take this course.” And they were things I thought I needed, and it worked wonders, because I think I was kind of a data scientist before, but it kind of sealed all those gaps and helped propel me, in a way, where I felt more confident… Because I still do obviously have some, and I think it’s good to have some imposter syndrome, but before my master’s, my imposter syndrome was through the roof. “How could I dare to say I was a data scientist if I didn’t have that solid understanding of statistics I thought I needed?” 
Jon Krohn: 00:55:29
Yeah. So that’s a good tip for listeners, in general, I suppose, is that, if you feel like you have some of the pieces of the data scientist puzzle and you’d like to obtain certifiably broad exposure to all of the key aspects of the field so that you can shore up any particular gaps, any big gaps then, yeah, something like a master’s in data science could be just the right thing for you. Very cool. All right. So other than your own books, do you have a book recommendation for us? 
Serg Masís: 00:56:05
Yeah. It’s one I’m nearly done with right now. 
Jon Krohn: 00:56:08
Oh nice. 
Serg Masís: 00:56:09
It’s called Causal Inference: The Mixtape. 
Jon Krohn: 00:56:12
Nice. 
Serg Masís: 00:56:12
And it has a very good kind of reintroduction to probability and regression. And then it goes through every single topic. A lot of the stuff in it I wasn’t familiar with, because it comes from more obscure academic topics, but a lot of these things are being resuscitated. So he talks about things that came out in the ’70s that people ignored, because at the time they didn’t find applications for them, or there wasn’t enough computation to deal with these methods. So they kind of stayed dormant for 30 years and only came back in the last decade. And of course, I wasn’t aware of this research. So it’s super interesting to say, “Okay, I’ve had this problem and this fixes that.” So he talks about discontinuities in data and how they’re indicative of some kind of treatment effect by a human. So if you find in the data that something suddenly jumps after a threshold, there might be a reason, something in the data generation process. Some kind of causal explanation. And it’d be irresponsible not to include it in your model, whether it’s a machine learning model or a causal model. So I just find this book really revealing, and it’s been a wonderful, like, holiday [inaudible 00:57:51] to finish it. 
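The discontinuity idea described here, a jump in the outcome at a threshold suggesting a treatment effect, can be sketched on synthetic data: fit a line on each side of the cutoff and compare the two fits at the cutoff itself. The data-generating process and cutoff below are invented for illustration.

```python
def ols(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def rd_estimate(data, cutoff):
    """Estimated jump at the cutoff: gap between the two side-specific fits."""
    left = [(x, y) for x, y in data if x < cutoff]
    right = [(x, y) for x, y in data if x >= cutoff]
    bl, ml = ols([x for x, _ in left], [y for _, y in left])
    br, mr = ols([x for x, _ in right], [y for _, y in right])
    return (br + mr * cutoff) - (bl + ml * cutoff)

# Synthetic data: y = 2x, plus a treatment jump of +5 once x crosses 10.
data = [(x, 2 * x + (5 if x >= 10 else 0)) for x in range(20)]
effect = rd_estimate(data, cutoff=10)  # recovers the jump of about 5
```

Real regression discontinuity designs add local bandwidths, noise, and robustness checks, but the core move is exactly this comparison of the two fits at the threshold.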
Jon Krohn: 00:57:54
Awesome. Nice. That’s a really cool recommendation, Serg. So as I started off this show by saying, and as this become only more clear for our listeners as we’ve gone through this episode with you, you are a remarkable fount of knowledge on a broad range of topics associated with the data science field, especially interpretable machine learning. So no doubt, lots of listeners will want to follow you to keep up with you on your latest. How should people do that? 
Serg Masís: 00:58:25
They can follow me on LinkedIn, on Twitter. Just type in my name. I think I’m the only one, which is cool. 
Jon Krohn: 00:58:34
[crosstalk 00:58:34]. We’ll include links in the show notes. 
Serg Masís: 00:58:36
Yeah. And they can also find me through serg.ai. That’s my personal website. 
Jon Krohn: 00:58:42
You’ve got a really slick website, as I mentioned right at the top of the show. Yeah. You’ve done an amazing job. And I was particularly blown away by how, no matter where the listener is in the world, if there is a place that your book can possibly be purchased in their country, it is listed comprehensively on your website. Yeah. That, to me, exemplifies this thing that I’ve known about you for a long time. The level of detail that you go into with everything is extraordinary, and that’s just one more example. All right, Serg. It’s been so wonderful having you on the show. I’ve loved reconnecting. I’ve learned so much. We’ll have to have you on again some time to learn even more. 
Serg Masís: 00:59:25
Sure thing. Count on it. 
Jon Krohn: 00:59:33
What deep expertise Serg has in both interpretable ML and agronomic data science. And I love how clearly he illustrates everything he describes with plenty of examples. In today’s episode, Serg filled us in on the social and financial ramifications of interpreting models incorrectly, that data matter as much as your model for avoiding the misapplication of a model, how counterfactuals enable us to determine what a model can’t do and therefore also what it can, and how he sees more causal inference within future interpretable ML techniques. And he talked about how agronomy, the science of crop production, has gigantic, as-yet-unrealized opportunities for optimization with data science, enabling us to nourish everyone on our planet while minimizing the negative environmental impact of agriculture. As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Serg’s social media profiles, as well as my own social media profiles at www.superdatascience.com/539. That’s www.superdatascience.com/539. 
Jon Krohn: 01:00:40
If you enjoyed this episode, I’d greatly appreciate it if you left a review on your favorite podcasting app or on the SuperDataScience YouTube channel. I also encourage you to let me know your thoughts on this episode directly by adding me on LinkedIn or Twitter, and then tagging me in a post about it. Your feedback is invaluable for helping us shape future episodes of the show. Finally, here’s something fun and new for you. If you’d like to check out a detailed spreadsheet of all of the book recommendations we’ve had in the 500 plus episodes of this podcast, you can make your way to www.superdatascience.com/books. 
Jon Krohn: 01:01:13
All right. Thank you to Ivana, Mario, Jaime, JP and Kirill on the SuperDataScience team for managing and producing another fascinating episode for us today. Keep on rocking it out there folks. And I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon. 