82 minutes
Business, Data Science, Deep Learning

SDS 453: Big Global Problems Worth Solving with Machine Learning

Stephen talks to us about 10 global problems to which machine learning can be applied for solutions. We also dug into some technical topics such as the successful deployment of machine learning models into production systems, but we primarily stuck to the big picture about the important topics today in data science.

About Stephen Welch

Stephen Welch is VP of Data Science at Mariner, where he leads a team developing deep-learning-based solutions for manufacturing applications. Prior to working with Mariner, Stephen was VP of Machine Learning at Autonomous Fusion, an Atlanta-based autonomous driving startup, where he led the design, development, and deployment of machine learning algorithms for autonomous driving. Stephen has extensive experience training and deploying machine learning models across a wide variety of domains, including an on-board crash detection algorithm that is now deployed in over 1M vehicles as part of the Verizon Hum product. Stephen strives to not just develop strong technology, but to explain and communicate results in clear and accessible ways – as an adjunct professor at UNCC, he teaches a 60+ person graduate-level class in machine learning and computer vision. Finally, Stephen is the author of the educational YouTube channel Welch Labs, which has earned 200k+ subscribers and 10M+ views. Stephen holds 10+ US patents, and engineering degrees from Georgia Tech and UC Berkeley.

Overview

Welch Labs, Stephen’s YouTube channel, has over 20,000,000 views on highly technical content. It was inspired by Minute Physics, a physics channel Stephen was watching in 2013. As an early adopter of the YouTube format, he racked up views as he put up educational content. Since his last visit to SDS, Stephen has been updating his videos and working in the manufacturing industry on models that detect quality defects. After taking a moment during the 2020 holiday season to reflect, he wrote a blog post, published in February 2021, that looks back on his journey of getting his manufacturing model into production and lays out the 10 items he sees as most important at the intersection of his work and personal life.

Stephen, who has been lucky this year and last in his health and job, still faced challenges, as we all did. He found that being stuck at home for a year made it difficult to shift his perspective on problems and sources of stress. It had a big impact on his mentality and emotions. By giving himself some perspective, he was able to outline 10 things he wished he had focused on more over the past year. We went over that list.

Empowering non-technical people with deep learning: For this, Stephen notes a lot of core tools haven’t penetrated a wide range of industries. This is especially true for manufacturing, an industry of extremely varied needs and pain points, where he estimates less than 25% of the industry is utilizing deep learning tools.
Teams collaborating in software engineering: This is something Stephen thought about a lot this year as his team worked completely remotely to roll out their model. He had to learn this year that a team is not a group of individual contributors; you need full synergistic teamwork to take on huge projects.
Education: Stephen is involved in education both at the collegiate level and through his YouTube channel. But he’s found frustrations in remote learning and lecturing this year, missing the energy that comes with teaching in front of a live group. This got him thinking about the intersection of the internet and education and where the future of education is heading.
Unsupervised and semi-supervised learning: In manufacturing, the data work focuses on software engineering and unsupervised and semi-supervised learning. The bottleneck is the data collection. So, can we learn with less data?
Sales commercialization: Stephen has worked in several startups where he has had to try to sell technology to customers while also trying to quickly learn about their market. He returned to “The Four Steps to the Epiphany” for guidance.
Fundamentals of computer science: Not a computer scientist by training, Stephen wishes he had worked more in CS in college and better understood how to build the backend infrastructure that we take for granted.
Open source and why it matters: Stephen read “The Cathedral & the Bazaar”, which opened his eyes to issues of silos and the importance of open-source solutions. His current company leverages open-source materials to sell proprietary software. What is the intersection of open source and commercialization, and when is open source not the right route?
Social justice, information in politics, and climate change: Stephen’s final three points are all connected over truthfulness and opportunities in politics, social justice, and climate change through data and data science.

We closed out with a preview conversation on an upcoming climate change focused episode, something Stephen has a lot of interest in. Be on the lookout for that!

In this episode you will learn:

Welch Labs on YouTube [4:54]
What Stephen’s been up to [7:56]
Stephen’s 2020 year-end blog post [10:11]
Stephen’s reflections on 10 areas worth focusing on [16:25]

Items mentioned in this podcast:

Welch Labs
Coming Up for Air
Data Science Insider
The Four Steps to the Epiphany by Steve Blank
Site Reliability Engineering: How Google Runs Production Systems by Niall Richard Murphy, Betsy Beyer, Chris Jones and Jennifer Petoff
The Cathedral & the Bazaar by Eric S. Raymond

Episode Transcript

Podcast Transcript

Jon Krohn: 00:00:00

This is episode number 453 with Stephen Welch, Vice President of Data Science at Mariner.

Jon Krohn: 00:00:12

Welcome to the SuperDataScience podcast. My name is Jon Krohn, a chief data scientist and best-selling author on deep learning. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. Thanks for being here today. And now, let’s make the complex simple.

Jon Krohn: 00:00:42

Welcome to the SuperDataScience podcast. I’m your host, Jon Krohn, and I feel very lucky indeed to be joined today by the affable and intelligent Stephen Welch. Stephen leads all machine learning and data science efforts at Mariner, a firm that specializes in detecting industrial errors automatically with deep learning. He also teaches computer vision to graduate students at the University of North Carolina, Charlotte, and he has a machine learning focused YouTube channel with a staggering 20 million views.

Jon Krohn: 00:01:14

During this episode, Stephen fills us in on his epiphany about what’s really important in life. Fortunately, for this audience, what’s really important largely involves applying data science techniques, particularly machine learning models to 10 big global problems that are definitely worth solving. And that would make a massive difference to fulfillment and quality of life to people everywhere if they were.

Jon Krohn: 00:01:41

This episode does dig into technical topics a bit here and there, particularly with respect to successfully deploying machine learning models into production systems. That said, the episode is primarily a thoughtful, big picture reflection on what’s really important as a data professional today. Stephen has perspective widening ideas for practitioners of all stripes, no matter whether you’re just getting started with data, or you’re a seasoned pro.

Jon Krohn: 00:02:15

Stephen, welcome back to the SuperDataScience podcast. I’m so happy to have you here. Where are you calling from?

Stephen Welch: 00:02:24

Thanks for having me. I’m in Charlotte, North Carolina.

Jon Krohn: 00:02:27

Nice. I have never been to the Carolinas, but I have heard only good things. It’s supposed to be beautiful down there.

Stephen Welch: 00:02:33

Yeah, definitely. The weather has been. It’s been kind of a soggy winter, I would say, but summers, and springs, and falls are usually awesome.

Jon Krohn: 00:02:41

Beautiful. How is the lockdown situation down there? Are things opening up?

Stephen Welch: 00:02:48

Yeah, I’d say it’s okay. My company has been fully remote for, I guess, it’s almost a year now, which is pretty crazy to think about. As far as the city itself, the Mecklenburg County where Charlotte is, is a little more, I’d say, lockdown than the rest of the state. But things feel reasonably normal.

Stephen Welch: 00:03:05

And then vaccines are coming along the way. My wife’s parents both have, got their vaccines already, which is great. And so, they’re rolling out, which is good. Yeah, I’m in group five or something. So, it will probably be a little while. Yeah, things are still reasonably locked down, but life goes on.

Jon Krohn: 00:03:21

Certainly, there’s been a lot of people in my life, in my professional community in particular, who’ve been affected by COVID in New York in the spring of 2020. People didn’t even know, because there wasn’t widespread testing, but it’s estimated that a third of New Yorkers got COVID in spring of 2020.

Jon Krohn: 00:03:42

I had my accountants and my immigration attorney were both completely knocked out for months, and they’re young. They’re in their 30s, I think. And some even bigger impacts, actually, professionally, but I won’t go into those.

Jon Krohn: 00:04:03

However, I feel very lucky that family members, nobody has been sick. And so, things like the wife’s parents getting vaccinated now. I can’t wait until grandparents, parents in my family get vaccinated because I’m going to be like, “We did it.”

Stephen Welch: 00:04:20

Yeah, it’s awesome, right? A few months ago, it felt like vaccines were just forever away. And now, they’re really happening, which is that’s freaking awesome. Yeah, really cool.

Jon Krohn: 00:04:29

Yeah. All right. I have a budding YouTube channel. I just started it a year ago, and I’ve got a few dozen videos out there on machine learning, and I was excited. So, you were recommended as somebody that is an amazing podcast guest by Kirill, the outgoing host of the SuperDataScience podcast. When I was checking out your profile earlier, I was blown away by your YouTube channel. It’s called Welch Labs.

Stephen Welch: 00:05:03

Thank you.

Jon Krohn: 00:05:05

It’s amazing. You have over 300,000 subscribers. You have 20 million views of your videos. I think that that’s especially important, because if you think about … So, you have highly technical content on neural networks, computer vision, self-driving cars. There’s one playlist that I can’t wait to watch. It is beautifully named, it’s called, Imaginary Numbers Are Real. That’s so fun.

Jon Krohn: 00:05:33

This is technical content. Probably 1% of people are ever going to be interested in your content, and so you’ve captured a huge amount of that 1% of people.

Stephen Welch: 00:05:45

That’s a cool way to look at it. Yeah.

Jon Krohn: 00:05:48

So, hat tip to you. Do you want to tell us a little bit about Welch Labs?

Stephen Welch: 00:05:57

Yeah. I’d love to. Yeah, definitely. Kind of got into it haphazardly, like you were kind of mentioning before a little bit. I feel like part of the reason the channel has been successful is I had good timing at the beginning. So, I really kind of stumbled into it back in 2013.

Stephen Welch: 00:06:12

There was a channel at the time called Minute Physics by this creator, Henry Reich that I was watching, and that just blew my mind. And I was like, “Oh, man, why is no one doing this for machine learning?” So, yeah, I just made a quick video on neural networks. I shouldn’t say quick. I’ve labored over it, because I’m a perfectionist, but I put it out. And it did surprisingly well. After less than a week, it had 1,000 views, and I was like, “1,000 people care about this? That’s cool.”

Stephen Welch: 00:06:33

And now, that video has 700,000 or something crazy. Yeah, I kind of stumbled into it a little bit, I would say. I just really, really enjoy it. It has never been like a full time job profession for me. Maybe one day, I can have that as a piece of something, but I definitely … The process of making videos stresses me out sometimes for sure, but I also just really enjoy making that kind of stuff.

Stephen Welch: 00:06:55

So, yeah, it’s been a really cool journey. It’s kind of been on and off for the last six, seven years. Sometimes, I have more time for it. Sometimes, I have less. Recently, it’s been less unfortunately. But yeah, it’s been a cool journey.

Stephen Welch: 00:07:06

And yeah, you’re right, it is pretty technical content. I think something that I try to do, no matter what is no matter how technical your content is, I still want to tell a good story. It doesn’t matter if you’re explaining something super esoteric. Everyone is the same as far as how they’re going to engage with content. And if you can tell a story, then it doesn’t matter how technical your topic is, you’re going to get better engagement.

Jon Krohn: 00:07:28

Yeah. Well, the labor that you put into those videos has certainly paid off. I absolutely love the ones that I looked at. I thought that the production quality was really high, and you told the stories beautifully. If you’re one of the few people out there who’s listening to this podcast, who hasn’t already been at the Welch Labs YouTube channel, then I highly recommend checking it out.

Jon Krohn: 00:07:51

And so, you mentioned how you don’t have that much time for Welch Labs lately. So, what have you been doing in the last two years since you were last on the SuperDataScience show? When you were last here, most of your recent professional experience was in autonomous vehicles and self-driving cars. And that’s what you and Kirill primarily spoke about.

Jon Krohn: 00:08:11

However, at that time, you were transitioning into a role applying machine learning models to manufacturing at a company called Mariner, and now you’ve been doing that for two years. So, do you want to fill us in?

Stephen Welch: 00:08:22

Yeah, I’d love to. Yeah, I think the timing is really appropriate. It would be funny to go. I haven’t listened to that podcast, but I should go back and listen, because I feel like I was very bright-eyed and bushy-tailed about hopping into a new industry. It has been. It’s been an awesome learning experience.

Stephen Welch: 00:08:36

But I would say if you look at the differences between where I was before and where I am now, I think there’s some interesting … I certainly learned a lot, and I think talking about it, hopefully, has some lessons to it.

Stephen Welch: 00:08:46

Autonomous driving is just an awesome engineering problem, right? There’s so many cool problems wrapped up in there. It’s super interesting to work on. At the same time, when I was part of that company, I was like the ML lead basically. My whole life was training models and just making sure that they were as safe as I could make them. I did some work on simulation, things like that.

Stephen Welch: 00:09:05

But now, it’s a totally different industry, manufacturing. And in some ways, manufacturing is maybe less sexy, or maybe the problems are less glamorous. But at the same time, manufacturing just has this awesome scale to it. If you think about the carbon footprint of manufacturing, think about how much it impacts your life, it’s huge, and just the scale is mind boggling.

Stephen Welch: 00:09:25

So, it’s really interesting, because when I talked to Kirill back in 2019, the initial modeling had been done for a couple customer projects. We had the POC models in place. We could say, “Hey, customer, you’re going to get 98% accuracy at quality defects, for example.” And now, those models are in production. They’ve been in production for, in some cases, up to a year like doing real stuff. But getting to there and getting those models maintained and actually working, it’s been a journey for sure. So, definitely a lot of lessons along the way and a lot of the reasons that I’ve been a little too busy to make videos recently.

Jon Krohn: 00:09:59

I’m surprised to hear you say all this, because we have a quote from two years ago that says, “This is going to be easy. I’m going to load it in no time.” Who would say anything like that?

Jon Krohn: 00:10:11

I’ve recently read a blog post from you. It describes how you have a particular journey in 2020. A really challenging journey with getting a manufacturing model, a big manufacturing model into production. And it sounds like that was quite time consuming, quite laborious. Getting everything to work, getting all the pieces glued together and working perfect for the clients was a stressful experience.

Jon Krohn: 00:10:42

So, maybe you can fill us in a bit on that in a second. But the blog post starts off by saying that then when the holiday season approached, the end of 2020, you finally had a moment to breathe and reflect on the year. And so, tell us about last year a little bit. And then we’re going to get into.

Jon Krohn: 00:11:06

The blog post has 10 items that based on your reflection, these are the really important things at the intersection of your work life and just the meaning of your life and the purpose in life. And so, I can’t wait. We’re going to try to make time to talk about all 10 of those items. I love every single one. So, first, yeah, give us some context on the year leading up to that.

Stephen Welch: 00:11:30

Yeah, awesome. Yeah, I’d love to. Yeah. It happens on the stage of the pandemic. I kind of start the post just by saying, “Hey, this year has been different for everyone.” In the grand scheme of things, I feel incredibly lucky. My family is healthy, I’m healthy. I still have a job, that’s a huge blessing. I don’t have a lot of reasons to complain is the first thing.

Stephen Welch: 00:11:54

But at the same time, it does impact even people like me who are super blessed to have jobs and be healthy. It still is different. There’s changes that really do impact how you work, how you think, all this stuff. And just like you said, Jon, it was a really busy year anyway. So, if there had been no pandemic, it would have been pretty crazy.

Stephen Welch: 00:12:12

I think that by working at home, working remotely … I had done some remote work before, but never this much. Never a fully distributed team like this, and bringing on new people, and being in a little bit more of a leadership role. Definitely a new set of challenges.

Stephen Welch: 00:12:27

I really found that when I got a chance to reflect on the year that being stuck at home, really, the biggest thing for me was it was hard to change my perspective. Because being stuck at home, for me, is kind of the opposite of traveling. So, whenever I do get a chance to travel, or go on vacation, or whatever, when I come back, I have such better clarity on what’s important. That problem I was spending so much time stressing about doesn’t actually matter, all that stuff.

Stephen Welch: 00:12:52

That was really missing for 12 months. Just grinding away, trying to get this product shipped. And I never had that chance to reset. I never gave it to myself, because I didn’t have … In hindsight, it’s like, “Oh, just go for …” You don’t have to leave your house to feel different, but it just helps.

Stephen Welch: 00:13:08

I think over the holidays, I took a full week off and just relaxed. I read all my old journals from the whole year, and I was like, “Stephen, why are you just getting stuck in these problems?” I just would get so wound up on stuff. I think, for me, that was a big way the pandemic impacted me. Just being stuck in one house, it was just hard to get out of my own head sometimes.

Stephen Welch: 00:13:29

So, that’s one reason I wrote the post just because I wanted to kind of reflect on that, get my thinking clear about it, what did I learn, what happened. And then I reflected back and saw that I really did learn some cool stuff in 2020. I really did grow in some ways, but I wasn’t quite aware of it.

Stephen Welch: 00:13:44

So, I came up with 10 areas that I think, “Hey, if I wasn’t able to focus on the big important things in 2020, what did I miss?” And these are the 10 things that I kind of wish I had focused on more at the time, but at least I’m thinking about them now. So, that was kind of the impetus for getting this recorded.

Jon Krohn: 00:14:01

Beautiful. I’m so excited to talk about all these. And for listeners, we are going to provide, of course, the blog post link in the show notes. But if you’re just hopping up Google to look it up, it’s called Coming Up For Air, which is a great title given how Stephen has been talking about finally having a chance to breathe. And yes, it was published on Valentine’s Day 2021, so February 14.

Jon Krohn: 00:14:29

Eliminating unnecessary distractions is one of the central principles of my lifestyle. As such, I only subscribe to a handful of email newsletters, those that provide a massive signal to noise ratio. One of the very few that meet my strict criterion is the Data Science Insider. If you weren’t aware of it already, the Data Science Insider is a 100% free newsletter that the SuperDataScience team creates and sends out every Friday.

Jon Krohn: 00:14:59

We pore over all the news and identify the most important breakthroughs in the fields of data science, machine learning, and artificial intelligence. The top five, simply five, news items are handpicked: the items that we’re confident will be most relevant to your personal and professional growth.

Jon Krohn: 00:15:17

Each of the five articles is summarized into a standardized, easy to read format, and then packed gently into a single email. This means that you don’t have to go and read the whole article, you can read our summary, and be up to speed on the latest and greatest data innovations in no time at all.

Jon Krohn: 00:15:35

That said, if any items do particularly tickle your fancy, then you can click through and read the full article. This is what I do. I skim the Data Science Insider newsletter every week. Those items that are relevant to me, I read the summary in full. And if that signals to me that I should be digging into the full original piece, for example, to pore over figures, equations, code or experimental methodology, I click through and dig deep.

Jon Krohn: 00:15:59

So, if you’d like to get the best signal to noise ratio out there in data science, machine learning and AI news, subscribe to the Data Science Insider, which is completely free and no strings attached at SuperDataScience.com/DSI. That’s SuperDataScience.com/DSI. And now, let’s return to our amazing episode.

Jon Krohn: 00:16:25

All right. So, let’s do it. This is so great. So, let’s start off with number one. What’s number one of 10?

Stephen Welch: 00:16:32

Okay, great. Well, thanks so much. Yeah, so number one, just to kind of get going through the list here, it’s really empowering nontechnical people with deep learning. I think this is a huge opportunity. I think that there’s so much we can do with deep learning. I think that as far as the core research, maybe I’m just not reading enough papers, but I feel like some of the core, especially in computer vision, we’ve kind of gone over this core hump of improved performance. So, we’ve got these great new tools, but they really haven’t penetrated a lot of industries yet.

Stephen Welch: 00:17:04

Some industries are kind of a no-brainer. This is another interesting difference when going from autonomous driving to manufacturing. In autonomous driving, everyone and their [inaudible 00:17:11] is using deep learning. If you’re not using deep learning in autonomous driving, it’s going to be really hard to solve some of these computer vision problems. Manufacturing is the opposite. The penetration is, at best, 25%, but it’s probably worse. It’s crazy, right? So, you have all these systems out there.

Jon Krohn: 00:17:28

Yeah. Can I interrupt you with a real amazing story about computer vision and self-driving cars?

Stephen Welch: 00:17:32

Yeah, please.

Jon Krohn: 00:17:33

At a conference around 2015, maybe 2016, the International Conference on Machine Learning, ICML. I went to an amazing talk about the history of neural networks, and I can’t remember the speaker’s name now. He isn’t someone who comes up in deep learning conversations all the time, but the front row at this talk was a who’s who of big names in deep learning, like Yann LeCun. He was there sitting in the front row nodding his head.

Jon Krohn: 00:18:02

And the speaker would sometimes just say out to the front row. He’s like, “Did I get that right, Yann?” He was talking about the history of deep learning and the research communities around it since the ’80s, and he showed a video of a self-driving car with an end-to-end convolutional neural network driving a car in the ’80s. Anyway, I didn’t know that [crosstalk 00:18:33].

Stephen Welch: 00:18:33

Yeah, it’s super cool. Yeah. In one of my videos, the guy’s name who did that is Dean Pomerleau. Actually, I got him to be on a call on my video. Spoiler alert, but I get to talk to him on the video. He’s the nicest guy and he’s super smart. Yeah, he was doing end-to-end deep learning self-driving cars, like 1986-ish, something like that. I was negative one. That’s pretty cool. Yeah, it’s a really cool story. It’s so ahead of its time. He had a lot of the same challenges. It was super interesting.

Stephen Welch: 00:19:03

Actually, in the course I teach … This is a diversion, but in the course I teach, I have my students train the same style neural network that he did back in the ’80s. It’s [crosstalk 00:19:11]. Yeah.

Jon Krohn: 00:19:13

That’s really cool. Anyway, I interrupted you, but you ended up having some of the background information that I didn’t have. That was amazing. All right. So, anyway, you want to empower people with deep learning. And you talked to me about penetration of deep learning models. It varies by industry.

Stephen Welch: 00:19:27

Right. So, yeah, that was a very fun diversion. I thought we could have gone longer. Yeah. It definitely depends on the industry. And manufacturing is really interesting too, because the problems are so varied. If you take autonomous driving, every autonomous company needs a way to recognize cars. There’s kind of like global problems.

Stephen Welch: 00:19:44

In manufacturing, everyone is a freaking snowflake. You talk to people and they’re like, “Oh, I’m making this kind of widget. Every other Thursday, I get this kind of weird bump on my widget. Can you help me find it and not make these anymore?” Stuff like that.

Stephen Welch: 00:19:59

So, you’re not going to be able to go find an open source data set that has their kind of widget in it, so it’s very specific. And really, we found that as much as I like data science, the more we can get the data scientist out of the way, and really empower the domain expert who knows all about widgets to just use deep learning as another kind of tool in their quality tool belt, the better.

Stephen Welch: 00:20:19

I think there’s huge opportunity there. I think that further, maybe no one has quite figured out the right interface between non-data scientists and deep learning models. Clearly, deep learning models are being trained in the background when you scroll Instagram and stuff, but that’s not much of an interface, right? It’s mostly targeted at advertising, which is fine, of course. As far as having a tool that someone can use if you’re not a data scientist, I think there’s just a huge opportunity there.

Jon Krohn: 00:20:46

Yeah. I think we probably interact with at least dozens of deep learning algorithms a day. There’s some things like your phone recognizing your face, your Amazon device recognizing your voice or whatever. All those things use deep learning. But I get your point that there’s a big gap in terms of …

Jon Krohn: 00:21:10

Especially a domain expert, being able to say, “I run into this problem every Thursday. My widgets have a bump on them. And I know that there’s a pattern here. How come I don’t have a model or a tool?”

Stephen Welch: 00:21:22

Yeah. Exactly. Yep. And if they have to go do some big consulting agreement and bring in a data scientist for six months to study the problem, it’s too slow. Some companies can do that, and it makes sense, but for every one that can, there’s 100 that can’t. And if you can reach that market, then it’s awesome. A big opportunity, for sure.

Jon Krohn: 00:21:40

Nice. So, is that kind of the end of point one? Is it kind of like an open ended point one? Or do you have some ideas on how this can be resolved?

Stephen Welch: 00:21:47

Yeah. The fun part about this post is I have no answers. I only have questions.

Jon Krohn: 00:21:52

Beautiful. All right. So, that’s problem number one. I mean, we’re just [inaudible 00:21:56] for it.

Stephen Welch: 00:21:57

Yeah.

Jon Krohn: 00:21:58

[inaudible 00:21:58] with deep learning.

Stephen Welch: 00:21:59

Someone please solve that, and then email me, and then we’ll go from there.

Jon Krohn: 00:22:05

If you could create a GitHub repo with the solution.

Stephen Welch: 00:22:07

Yeah. Just send me the link. Yeah, please use the MIT license so I can borrow it, and then that would be great. Yeah.

Jon Krohn: 00:22:15

All right. So, item two.

Stephen Welch: 00:22:18

Yeah. Two is really about teams. I call this one teams. Really, collaborating in software engineering. This was, I’d say, an area that I’ve been aware of for a while, but it really started to come into focus more in 2020 for me, especially having to work on such a big project remotely.

Stephen Welch: 00:22:36

I think if I look back at the beginning of my career, a lot of the things I was most proud of are things that were kind of individual contributor kind of roles. Yeah, you’re on a team, but it’s you kind of own this thing. If I think about YouTube, that’s very much individual contributor role.

Stephen Welch: 00:22:52

And that’s great, but I think something that I really kind of picked up on and had to learn, I think, in 2020 is that when you’re building something really big that has to scale and be supported and maintainable, it’s not just a group of individual contributors anymore. You have to do this thing called teamwork.

Stephen Welch: 00:23:09

I thought I knew what that was, but I think I kind of learned it a layer deeper, like how to really get your team to … I mean, I’m basically in a tech lead role, so I do some management with some coding at the same time. So, it’s an interesting role to be in, and there’s this whole management leadership side that I hadn’t really thought about explicitly before. That really, I think, 2020 got me, kind of forced me to have to think about more deeply.

Stephen Welch: 00:23:36

Again, I think it’s a super interesting area. I’m honestly way at the bottom of the learning curve on this. Again, open ended questions. But it was something that definitely got snapped into focus, I would say, by remote working and by kind of being put in this area where I was building probably the biggest distributed system I had ever built, or I should say the team had ever built. So, that was a really interesting part of 2020, for sure.

Jon Krohn: 00:24:01

Nice. And there’s a beautiful table, which I guess you made. There’s this table, the blue and green table on traits, characteristics of individual thinking versus team thinking. Did you make that, Stephen?

Stephen Welch: 00:24:13

I did. Yeah, I was trying to think through what did each of those feel like, because I think a trap that I would fall into is like we got to go solve this problem as a team, basically. And sometimes, you want to just … Let’s go have an individual contributor go work on this, figure out what they think is best, and then kind of present it.

Stephen Welch: 00:24:31

But sometimes, you really want to do more team thinking from the beginning. And what this kind of feels like to me is probably three or four years ago, if I was kind of working on a tough problem, and I was looking at some academic papers and trying to figure out the right way to approach it, I would kind of do that more in isolation. And then when I felt comfortable and confident, then I would come to the team and say, “I’ve figured it out.”

Stephen Welch: 00:24:51

I think part of having a good team dynamic is really having a place where people are comfortable, they don’t feel judged and they can be way at the beginning of an idea and say, “Oh, I’m thinking about this. What do you think?” My team that I work on has become much more like that over the last 18 months. We’ve really had to, and I really enjoy it now.

Stephen Welch: 00:25:09

I think it’s just because culturally, places I’ve worked in the past have been less amenable to that. I hate to say it, but you feel a little more judged, or you have to have it perfect before you get it in front of the team. I’m just trying to kind of capture what does it feel like to be in that individual mode, and what does it feel like to be in that team mode?

Jon Krohn: 00:25:29

Yeah. I think it’s a great summary. I’ve been skimming in here as you’ve been speaking. I promise I’ve been listening. But the table beautifully summarizes the pros and cons of individual versus team thinking. There’s a place for both. I think that’s a big part of the point that you’re making here.

Jon Krohn: 00:25:43

So, for example, teams don’t write novels, but individuals don’t write operating systems. So, depending on the kind of task you’re facing, there’s a right mix. Some sub-tasks are going to be appropriate for an individual or team within a broader, maybe team task, like building a production system.

Jon Krohn: 00:26:02

And so, it summarizes nicely how individual thinking is highly nonlinear. And so, you can have maybe some creativity comes out of that, but it also is kind of scary.

Stephen Welch: 00:26:12

Right. Yep.

Jon Krohn: 00:26:14

Whereas with team thinking, there’s sometimes not a depth that you can get into, but at the same time, there’s that sharing and mixing of ideas and perspectives can lead to serendipitous opportunities that you couldn’t have thought of on your own. [crosstalk 00:26:32].

Stephen Welch: 00:26:32

Absolutely. Yeah. I think the results are better, and then I think also the thing to think about is, if you involve the team early on in an idea what you’re working on, you will get more buy-in basically. If I pick some technology to go use kind of by myself, and I’m like, “Oh, I looked at this problem, and hey, I want to use this kind of neural network or something.”

Stephen Welch: 00:26:52

If I decided that on my own, and I could be totally right, but then I kind of blindside everyone with it like a month later. I’m already way deep in it, then I’m not going to have as much buy-in. They might question and say, “Hey, why did you make this decision?” But now, it’s too late.

Stephen Welch: 00:27:07

Even if it’s just a sounding board kind of relationship, I feel like it’s tremendously valuable, because now the team kind of knows, “Oh, Stephen is doing this problem this way for that reason. I get it. I understand why.” Just that buy-in alone, I feel like, makes it worth trying to work on certain kinds of problems as a team early on.

Jon Krohn: 00:27:25

You and I live in similar kinds of worlds. We are involved in creating machine learning models, and putting those machine learning models into production. I think a lot of listeners are in a similar kind of situation or involved somewhere along that stack or interested in being involved in maybe many parts of that process, from model invention all the way through to putting things into production.

Jon Krohn: 00:27:53

A piece of advice that I’ve realized, actually, in the last year as well is that a lot of the more senior people that I work with directly, they tend towards the team thinking. I’m encouraging the less experienced people to move away from the individual thinking as much as they can and into the team thinking.

Jon Krohn: 00:28:19

I think it’s even more obvious, as you say, when we’re working remotely like this, where I’m like, “Some of the people on my team I’m hearing from several times a day.” It tends to be the somewhat less experienced people where I kind of have to prod to get that one update of the day. And I’m like, “All right, where did we get to today?”

Jon Krohn: 00:28:44

I guess, they don’t mind. I think they might not find it scary doing that and getting their head down and really digging through the problem and trying to get things perfect, but I do agree that generally speaking, of course, there is a place where getting your head down, being really focused on your own is important, especially in software, data science. It happens all the time.

Jon Krohn: 00:29:01

But in general, I would say if you can, talk to people on your team, talk to your manager every few hours. I think two or three, maybe upwards of four times a day is the right amount of time to be getting feedback even on problems that you think you should be tackling deeply on your own.

Stephen Welch: 00:29:23

Yeah, I totally agree. Yeah, absolutely. I think the pandemic makes it more obvious basically. Working remotely makes it more obvious, like what that cadence is. But yeah, I absolutely agree. Yeah. I think part of that is setting the right culture where people feel comfortable coming up with a half baked idea. I think that’s so important.

Jon Krohn: 00:29:41

Yeah. I guess I should stop being really, really hard on those more senior people.

Stephen Welch: 00:29:46

You’re an idiot. Why did you say this?

Jon Krohn: 00:29:47

Why did you bring that up? All right. All joking aside, what’s number three? Number three is education.

Stephen Welch: 00:29:58

Education. Yeah. So, just a little background. Obviously, I do YouTube, which I think is … That’s educational. That’s a piece of it for sure.

Jon Krohn: 00:30:08

No question.

Stephen Welch: 00:30:08

But I also teach a graduate level course at UNC Charlotte, in computer vision. I’ve taught it for three years now, and this year was the first time fully remote, for obvious reasons.

Stephen Welch: 00:30:20

And to be honest, it kind of sucked. Again, I had awesome TAs. The university has been really, really supportive, but I just really missed the in-person component. And we did some things online that I think helped. We tried to make it as interactive as we could with people asking questions over the chat or whatever made sense.

Stephen Welch: 00:30:42

There’s something about being in front of a group of people. I think part of it is kind of the bandwidth of the communication. So, as much as it’s nice to chat with you over video right now, if we were in the same room, there would be a whole set of cues that we would be picking up on that we’re not.

Jon Krohn: 00:30:56

100%.

Stephen Welch: 00:30:57

I would bet a lot of money we’d actually have a different conversation in person. Not that this conversation is bad, but it would be different. And especially, I noticed when I’m in front of a group, you get a little nervous, and you get a little energy, but it can be really positive energy too.

Stephen Welch: 00:31:11

And when I have like 30, or 40, or 50 students who are all hopefully engaged, but there’s so many cues out there. I can look at people’s eyes, and I can be like, “Oh, this person is a little behind. This person thinks I’m an idiot.” And it’s so helpful, because that lets me moderate my energy and just be a more effective communicator.

Jon Krohn: 00:31:30

Totally.

Stephen Welch: 00:31:31

And gosh, did I miss that. It’s funny, when I lectured in person for the first two years of teaching this course, I would leave lecture energized and excited for the rest of the day and pumped. And I would finish an online lecture and just be so exhausted, just drained. Just like, “What’s the point?” That kind of thing, which sucks.

Stephen Welch: 00:31:54

So, that was kind of the experience. That was another layer of 2020. And again, I hope students got a lot out of the course. I think I know some of them certainly did. And the TAs are great and everything. I’m not complaining about that. Remote is definitely a challenge.

Stephen Welch: 00:32:07

That got me thinking about something that I do think about from time to time, for sure, which is education in general and kind of where it’s going, and what the intersection of the internet and education is. Because the internet has done so much awesome stuff for education. There’s so many resources now that weren’t there. I remember right when this started, when I was an undergrad, it would have been 2005 to 2009. That’s when MIT OpenCourseWare was kind of getting going.

Jon Krohn: 00:32:34

Your undergrad is from Georgia Tech, right?

Stephen Welch: 00:32:36

It is. Yeah.

Jon Krohn: 00:32:36

And so now, the computer science master’s is offered completely online by Georgia Tech.

Stephen Welch: 00:32:42

It is.

Jon Krohn: 00:32:42

I think it’s absolutely brilliant. I have someone who goes to work for me who did that master’s, and he was so switched on about modern computer science, and he did it entirely remotely.

Stephen Welch: 00:32:52

That’s awesome.

Jon Krohn: 00:32:54

Yeah.

Stephen Welch: 00:32:55

Oh, yeah, that’s fantastic. So, that’s where I was back then. And I remember discovering OpenCourseWare from MIT, and I was like, “Holy crap, these lectures are amazing.” And my lectures are good at Georgia Tech, but there’s one professor at MIT named Gilbert Strang, who’s just [crosstalk 00:33:09]. He’s just so good.

Jon Krohn: 00:33:12

Yeah, I’ve heard of him.

Stephen Welch: 00:33:12

Yeah, definitely. There’s more content than ever, which is nice. At the same time, if you go fully virtual, I know there’s some good ways to do it. That’s awesome. I have heard a lot of good reports about the Georgia Tech fully virtual thing. I think that’s awesome.

Stephen Welch: 00:33:28

But at the same time, at least my experience has been teaching online, you just lose a lot. It makes me wonder about what’s going to happen in education. Will more things go virtual? Because the other thing that we’re seeing at UNCC, for example, is there was a big drop in enrollment, and it’s because people don’t want to pay that full tuition price for online courses, which makes sense, because they’re comparing it against all the other online courses.

Stephen Welch: 00:33:49

What is the role of in-person versus virtual? Clearly, the internet has a role to play. And how is this going to change in this century? Is college going to look different this century than last century? And as promised, I have no answers for you, only questions. I think it’s a question I like to think about, because I think it can really inform some interesting action, for sure.

Jon Krohn: 00:34:11

Yeah, this one in particular, when you read the blog post, there are a lot of questions in here about what’s the right balance of virtual and not. Yeah, there’s huge opportunity around virtual instruction. Like right now, the podcast. 10,000 people are going to listen to this podcast, at least, and we couldn’t have that kind of scale if we wanted to try to fly Stephen into New York and then we’ll book a giant conference [crosstalk 00:34:40]. It just wouldn’t happen.

Jon Krohn: 00:34:43

By having these kinds of virtual ways of engaging with content, you can get a scale that you couldn’t otherwise possibly get, but it certainly doesn’t feel like all of the situations you described about teaching in person.

Jon Krohn: 00:34:59

As an instructor, yes, absolutely. It’s much harder to read the room. It’s exhausting. You talked about when you’re in front of a room lecturing, it’s so interesting. There’s an energy that you can see and you’re like, “Okay, it’s time for a break.” Or like, “All right, people are really into this. I’m going to really hit a home run today.” But online, you just have no idea.

Stephen Welch: 00:35:28

You don’t. And even if you see someone’s face … For some reason, I think it’s body language, I guess, I don’t know, but the face is way better than just audio, I think, usually. Even that, there’s still so much missing. I don’t know what it is, but it’s something.

Jon Krohn: 00:35:42

Everybody smells great today.

Stephen Welch: 00:35:44

Yeah, it’s the smell. Exactly.

Jon Krohn: 00:35:50

As an instructor, it is definitely not as good of an experience, but I think for students, it’s even worse, because so much of what students gain, even if it’s like a weekend, Saturday course that I’m teaching or a whole university curriculum, I think, especially if it’s a whole university curriculum. To be able to come in person and hear what other people are working on, talk to them over lunch, go out and grab a coffee with them, and you build relationships for the future professionally, personally, that you can’t today simulate online. And so, I guess a lot of your questions are related to that.

Stephen Welch: 00:36:24

Yeah, absolutely. Yep. 100%.

Jon Krohn: 00:36:26

All right. We got to move on to number four. I love talking about [crosstalk 00:36:28]. We’re going to motor on. So, Number four, this is a bit more technical. It’s about algorithms.

Stephen Welch: 00:36:36

Yep. So, yeah, this is a fun one. This is probably the most technical one of the day, I think. So, yeah. This is really about unsupervised and semi-supervised learning. Just for a little bit of context about what I’ve been up to for the last two years. I’ve been working on really getting deep learning into factories, right into manufacturing. And like I mentioned earlier, one of the challenges is that you have very specific problems. Every manufacturer is like a snowflake, or they believe they are. And in some ways, they are as far as their images and as far as their data.

Stephen Welch: 00:37:12

The challenge is it’s very expensive, and in some cases, almost impossible to make really large datasets, right. And I guess when I say really large, like 10,000 examples or more. That’s going to be really hard. 1,000, maybe, but still kind of hard. And then down to 100, feasible, depending on what’s going on.

Stephen Welch: 00:37:31

Part of the challenge is: who is the final decider of quality at a manufacturer? So, there are people whose jobs are quality engineers, but you get three in a room, and they might disagree on certain examples, things like that.

Stephen Welch: 00:37:43

So, at the end of the day, when we look at … At Mariner, we look at where do we focus our time, what are the hard problems. I think there’s really two layers to it, and I kind of glossed over this at the beginning, but there’s this whole software engineering layer.

Stephen Welch: 00:37:56

We deploy deep learning for quality inspection in factories, so we have to have like an edge component. We run Linux servers in the factory that are making decisions. By the time I finish this sentence, we’ll process 10 more images, something like that. So, there’s that whole [crosstalk 00:38:10].

Jon Krohn: 00:38:10

A quick terminology thing for … Probably many audience members know this, but an edge component means like a sensor or some kind of compute happening. Not on a centralized server, but on-site in real time.

Stephen Welch: 00:38:24

Yeah, that’s a big challenge for what we’re doing, basically, because we can’t afford the latency, so we have to make real time quality decisions, basically. And we also have to be robust to internet outages. So, all of our compute happens on-premise.

Stephen Welch: 00:38:36

We deploy Linux, GPU-powered computers. In some cases, you have one per factory. In some cases, you have one per production line, which is an interesting set of challenges there.

Stephen Welch: 00:38:46

So, at the end of the day, it ends up being a large scale distributed deep learning system. I don’t know about you, but that’s a big complicated beast. A lot of things can go wrong shipping all this stuff. Because at the end of the day, we want to retrain these models, so we move data up to the cloud, retrain models, push models back down, all this infrastructure and stuff. Lots of interesting things can happen.

Jon Krohn: 00:39:07

I can only imagine. I can always assume access to any of my production models via the internet at any time.

Stephen Welch: 00:39:16

Yeah, totally.

Jon Krohn: 00:39:17

Yeah, it sounds really challenging. I wouldn’t even know where to start.

Stephen Welch: 00:39:21

It’s super interesting. And I kind of glossed over it, but at the beginning of the blog post, I kind of talked about just a short story from last year. This probably represents some of the challenges that we faced.

Stephen Welch: 00:39:30

We had this big go live. It’s back in August last year. And we’ve been working our butts off, getting ready for it. It was just one production line, so it was kind of a phased go live, and we did all our testing. We barely made this deadline. We got the code shipped and everything, and then we realized we had a memory leak.

Stephen Welch: 00:39:49

Literally, you would turn on the system monitor and the RAM would start creeping up, which is never a good thing to see. Our application is containerized. Most of the components, they’re not affected by an overall … They basically allocate their own RAM, so it’s okay. But our application also has a web-based front end, which does not have such nice partitioning for its RAM usage.

Stephen Welch: 00:40:13

Literally, the front end of our application, which was used for the operators in the factory to see what’s actually happening, it would crash every six hours, or eight hours, maybe. We were like, “Holy crap.” Of course, this happened on a Friday. All important bugs happen on Fridays. So, we literally …

Jon Krohn: 00:40:28

And releases, why do they always happen on Fridays?

Stephen Welch: 00:40:30

We actually moved our entire release schedule to never release on a Friday. Absolutely. Release on a Tuesday, not a Friday. Yeah. We literally made a support rotation for the weekend where one of us would wake up every six hours, basically, and restart. We’d remote into the machine, restart Chrome.

Stephen Welch: 00:40:52

Come on, that’s super embarrassing. We’ve been working on this technology for months now, and you have to log in every six hours manually, and then restart. That’s so depressing, right?

Stephen Welch: 00:41:01

We had a cron job do it pretty quickly. I think there was some problem with automating the Chrome restart, so we had to do it manually for the first few days. We had the fix shipped by the middle of the next week, something like that. The team really hustled and got it done, which is great. We got through it; we managed it.
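
As a rough illustration of the kind of stopgap described here (a scheduled job that restarts the leaking front end), the sketch below shows one way a watchdog check could be written. It is not Mariner's actual script: the names `WatchdogConfig`, `is_leaking`, `should_restart`, and `run_once`, the 4096 MB ceiling, and the three-sample window are all assumptions made for the example. The memory reader and the restart action are passed in as callables so the logic can be exercised without touching a real browser process.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class WatchdogConfig:
    limit_mb: float = 4096.0  # hypothetical ceiling before a forced restart
    min_samples: int = 3      # readings that must trend upward to count as a leak


def is_leaking(samples_mb: Sequence[float], min_samples: int = 3) -> bool:
    """True when the last `min_samples` readings are strictly increasing."""
    if len(samples_mb) < min_samples:
        return False
    recent = samples_mb[-min_samples:]
    return all(a < b for a, b in zip(recent, recent[1:]))


def should_restart(samples_mb: Sequence[float],
                   cfg: WatchdogConfig = WatchdogConfig()) -> bool:
    """Restart only if memory is both over the limit and still climbing."""
    if not samples_mb:
        return False
    over_limit = samples_mb[-1] >= cfg.limit_mb
    return over_limit and is_leaking(samples_mb, cfg.min_samples)


def run_once(read_memory_mb: Callable[[], float],
             restart: Callable[[], None],
             history: List[float],
             cfg: WatchdogConfig = WatchdogConfig()) -> bool:
    """One scheduled tick: record a reading, restart if needed, report what happened."""
    history.append(read_memory_mb())
    if should_restart(history, cfg):
        restart()
        history.clear()  # start a fresh trend after the restart
        return True
    return False
```

A scheduler such as cron would call something like `run_once` every few minutes; requiring both a rising trend and a ceiling is a design choice that avoids restarting on a single noisy reading.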

Stephen Welch: 00:41:19

I don’t know, there’s so much that can go wrong when you’re in that messy, noisy, dirty environment of manufacturing that you might not think about when you’re in a pure cloud kind of setup. Yeah. Machines overheating, all kinds of different things that can end up. It’s the hardware all the way up to the software.

Stephen Welch: 00:41:38

I think I have a bullet point coming up, actually. Yeah, when we get to number six, we can maybe talk more about databases and stuff, because that’s just such a cool world that I’ve been able to learn more about this year.

Stephen Welch: 00:41:50

But what I was trying to talk about was unsupervised learning, because what I was getting at was that I’d say in manufacturing, at least for us, there’s two big areas we see as areas that we spend our time focusing on, because they’re kind of the hard problems.

Stephen Welch: 00:42:02

So, one is certainly software engineering infrastructure kind of stuff. The other one is unsupervised learning and semi-supervised learning, because again, getting a quality labeled data set, as far as our data science pipeline, is by far the bottleneck. That’s the hardest thing. Because I can’t just sit down and label it. I don’t know what your bumps on your widgets look like exactly. For a couple customers, I’ve kind of learned …

Jon Krohn: 00:42:22

What do you know about the bumps on my widgets?

Stephen Welch: 00:42:25

I need a better metaphor I’d say.

Jon Krohn: 00:42:29

Who told you about the bumps on my widgets?

Stephen Welch: 00:42:31

Alright, let’s say scratches. No, it’s not better. Yeah, it’s not better.

Jon Krohn: 00:42:37

[crosstalk 00:42:37]. Just keep going.

Stephen Welch: 00:42:37

Yeah, it’s all bad. It’s all bad now. For a couple of industries, I’ve learned … I know a bunch about fabric now. I can tell you about 25 different kinds of fabric defects, but that’s not very scalable for me to go into every industry and learn about …

Jon Krohn: 00:42:52

That will be your next blog post list.

Stephen Welch: 00:42:54

Yeah. Here are the 25 types of … Do you know about yarn flashes? I don’t think anyone would read that, except maybe some fabric quality engineers, but I don’t think they …

Stephen Welch: 00:43:05

So, that’s a big challenge. The idea is can we learn with less data, which is obviously a big area of research. I’m not as up to date in the literature as I once was, but to me, as far as the deep learning technology that I feel comfortable, as a data scientist, deploying and getting to that 99-point-whatever percent we need for production, I really only deploy supervised deep learning.

Stephen Welch: 00:43:29

I would love to deploy some semi-supervised or unsupervised. I’m advising a student right now at UNCC working on unsupervised. It’s great, but I haven’t seen anything, at least, that would apply to manufacturing yet where I’m like, “Yes, let’s go. We’re going to have 10 times less data. It’s going to be awesome.”

Stephen Welch: 00:43:45

I’ve seen some interesting work out there, but nothing yet where I’ve been like, “Yeah, we can move away from supervised.” I think that’s a huge area for progress to be made. Maybe a paper was published last week that I don’t know about, and it’s going to change the world, but I haven’t seen it yet. I think it’s a really interesting area.

Jon Krohn: 00:44:05

Beautiful. Yeah. This has been a recurring topic, actually. Semi-supervised learning, the capacity to be able to label a much larger data set than you have formally labeled is a superpower. I think that there’s a lot of individual approaches people are always … I don’t think there’s a magic bullet. Yeah, it’s great. It’s definitely an open question, something to work on here.
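
To make the idea of labeling a much larger data set than you have formally labeled concrete, here is a toy sketch of pseudo-labeling, one common family of semi-supervised approaches. It is only an illustration, not something either speaker describes deploying: it swaps in a nearest-centroid classifier on plain feature vectors in place of a deep network, and the function names and the `margin` threshold are assumptions invented for the example. It also assumes at least two classes appear in the labeled set.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def centroids(labeled: List[Tuple[Point, str]]) -> Dict[str, List[float]]:
    """Mean feature vector per class, computed from the labeled examples."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def pseudo_label(labeled: List[Tuple[Point, str]],
                 unlabeled: List[Point],
                 margin: float = 2.0):
    """Assign a label only when the nearest centroid is at least
    `margin` times closer than the runner-up; otherwise leave the
    example for a human to review."""
    cents = centroids(labeled)
    accepted, rejected = [], []
    for x in unlabeled:
        ranked = sorted((dist(x, c), y) for y, c in cents.items())
        (d1, y1), (d2, _) = ranked[0], ranked[1]
        if d1 * margin <= d2:
            accepted.append((x, y1))
        else:
            rejected.append(x)
    return accepted, rejected
```

In a fuller pipeline the accepted examples would be added to the training set and the model retrained, while the rejected ones go to a person for labeling. Whether that loop reaches production-grade accuracy is exactly the open question raised in the conversation.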

Stephen Welch: 00:44:33

Yep, definitely.

Jon Krohn: 00:44:35

For sure. All right. So, number five is a step away from technical.

Stephen Welch: 00:44:39

Yeah. Cool. Yeah. So, this one is more kind of on the business side. And this side is really interesting to me. My problem at work is that I think everything is interesting, which is a blessing and a curse. This one is really about sales commercialization and startups.

Stephen Welch: 00:44:54

I’ve been involved in a few startups along the way, but I’d say that if you look at 2020 for me at least, by far, I was more actively involved in the sales process than I’ve been before. It was really interesting. And coming into this role, it’s one thing I wanted to learn more about. I’d say if I look at previous startups I’ve been a part of or been a founder of, sales is definitely a weak point. I’m definitely more technically strong than I am in sales.

Stephen Welch: 00:45:20

I kind of tell this story in the blog post. There’s a book called Four Steps to the Epiphany by Steve Blank. Here it is, just in case you’re interested, if you’re watching the video.

Jon Krohn: 00:45:33

[crosstalk 00:45:33] you can reach it from where you’re sitting.

Stephen Welch: 00:45:35

Oh, yeah, I got my whole bookshelf. Yeah. Right out of grad school, I went and founded a startup. I was going to revolutionize how acoustic guitar pickups work using machine learning. It was going to be life changing. We made some cool stuff. It was awesome, but not a commercial success.

Stephen Welch: 00:45:53

My advisor, I remember at the time, he said, “Hey, go read this book.” So, I read it and I thought it was terrible. I thought it was the most dry, stupidest book I had ever read.

Jon Krohn: 00:46:01

Steve Blank’s Four Steps to the Epiphany?

Stephen Welch: 00:46:05

Yeah. Hey, I was dumb. I read it readily. I wonder if in 10 years, I’ll think I am as dumb now as I think I was 10 years ago. I don’t know. Early 20s is bad as early 30s. But anyway, at the time, I mean, I thought it was fine. I was like, “This does not pertain to me.” But it did, I just didn’t know it.

Stephen Welch: 00:46:27

As part of this year, I’ve been part of … I was probably on 100 sales calls last year, something like that. In those experiences of talking to real customers doing real things, and trying to explain how our technology can hopefully help them right, and all the ways that can go wrong and go right.

Stephen Welch: 00:46:42

It’s a super interesting problem. And trying to learn about the market while you’re talking to customers, that’s a huge piece of that book. It’s not that you’re just building what customers tell you to, but at the same time, you got to pay attention to those market signals. They’re critical.

Stephen Welch: 00:46:58

After having these customer conversations, I kept thinking back to that book. I was like, “Didn’t Steve Blank explain this or something?” And I went back and read it last year, and yeah, he predicted exactly what was going to happen. It was uncanny.

Stephen Welch: 00:47:10

The second time I read it, oh my gosh, it was super relevant, lots of really interesting lessons. He has some recommendations for how you should think about a startup. And he really recommends four phases. The first phase is called customer discovery, and that’s where you really … You’re not selling anything yet, but you’re going to talk to customers, and you are really learning about what you could sell basically.

Stephen Welch: 00:47:30

And that is one thing I did get right in my first startup, thanks to that advisor. I was ready to go hack. I was like, “Oh, I’m ready to build. Let’s go make this prototype.” And he was like, “No, you got to go talk to every guitar store owner in Atlanta.” And I was like, “Oh, God, it sounds terrible.” But it was super valuable that I did it.

Stephen Welch: 00:47:46

I highly recommend the read. It is kind of boring and dry the way he writes, I think, but it’s very relevant. But there is a cheat sheet, though, actually.

Jon Krohn: 00:47:53

Yeah, I was just going to mention that.

Stephen Welch: 00:47:54

Here it is. Yep. There’s a much shorter version now that has things boiled down. If I was talking to myself 10 years ago when I founded my first startup, I would say read this. It wasn’t out yet, but it would be …

Jon Krohn: 00:48:05

The Entrepreneur’s Guide to Commercial Development? To customer development.

Stephen Welch: 00:48:07

Yeah, because it’s the cheat sheet to the four steps. It’s fairly small right there. It’s the cheat sheet to the full, kind of drier book, basically.

Jon Krohn: 00:48:16

When you say cheat sheet, I think a sheet of paper. It is many sheets of paper, but …

Stephen Welch: 00:48:20

I agree. It says cheat sheet on the cover. We should call up Brant Cooper and be like, “Hey, man cheat sheet is not what this is.”

Jon Krohn: 00:48:27

The cheat volume.

Stephen Welch: 00:48:29

Right. Cheat volume. Exactly.

Jon Krohn: 00:48:32

Nice. All right. Well, that sounds like very practical advice, and I agree 100%. It doesn’t matter how good your tech is. If you don’t know how to sell it, you’re never going to have a good startup.

Stephen Welch: 00:48:43

Totally. Yeah. I couldn’t agree more. Yep. Absolutely. And at the same time, it’s not all about sales, obviously. Especially for more tech focused people, you kind of think that sales is sleazy or something. That’s not really what we’re talking about. We’re talking about who are the people that are going to get value out of your thing and what is your relationship with them. That’s really what we’re talking about here.

Stephen Welch: 00:49:02

It’s not about spinning some crazy narrative or something. It’s really about building some of those relationships and learning from them and explaining how you’re set. Because you’re going to learn so much about how you’re communicating and how you’re building what you’re building.

Jon Krohn: 00:49:15

I’ve had the same journey over the last decade where I definitely, in my early 20s, would have thought sales. Who needs [crosstalk 00:49:22].

Stephen Welch: 00:49:22

Exactly. Yeah.

Jon Krohn: 00:49:23

Make the best tech ever. It’s going to sell like hotcakes. Everyone [crosstalk 00:49:26] how amazing it is. Absolutely not.

Stephen Welch: 00:49:30

No, no one cares. No one cares. Yep. As soon as I put this landing page up, the world is going to change.

Jon Krohn: 00:49:39

Acoustic pickups. No one is going to believe the sounds they hear.

Stephen Welch: 00:49:41

It’s going to blow their mind. Yep.

Jon Krohn: 00:49:45

Okay, sweet. So, that’s number five. Now, we’re getting to the second half of the list.

Stephen Welch: 00:49:51

We’re rocking. Yeah.

Jon Krohn: 00:49:52

[crosstalk 00:49:52] of computer science. I’m excited for this one.

Stephen Welch: 00:49:55

Yeah, me too. This is a cool area. I’m not a computer scientist by training. I kind of wish I had studied computer science. Undergrad, I studied electrical engineering. This is just my own opinion, but I feel like EE was kind of the pinnacle of engineering from 1940 to …

Jon Krohn: 00:50:13

EE is electrical engineering?

Stephen Welch: 00:50:14

Sorry. Yeah, electrical engineering. Yeah, I feel like it used to be the coolest engineering, and I feel like I joined the party a little late. It was a cool degree, but I don’t build circuits now or anything. I don’t really need to know how DSP works, as cool as it is. Did some math, all kinds of stuff like that.

Jon Krohn: 00:50:29

A booming space too. If you want to be employable, being able to make circuits is definitely … You’re not going to have a hard time finding work.

Stephen Welch: 00:50:37

I totally agree. Yeah. No regrets, but I do wish I had double majored and maybe done some more computer science. This is embarrassing, but when I was 18, I thought coding was lame. That’s super embarrassing. Yeah. I know. Podcast over. Get this guy out of here.

Stephen Welch: 00:50:56

I only took two computer science courses in undergrad and then I got to graduate school and everyone was doing machine learning. And I was like, “Oh, crap, I got to learn Python. I’m way behind.”

Stephen Welch: 00:51:08

No formal computer science training. I know some of these things. I kind of know how databases work. But in 2020, because I was leading not only the model creation part, but the infrastructure code part, I had to really learn and think about databases and global, like large scale databases and stuff like that.

Stephen Welch: 00:51:28

Of the things I bumped into, databases is definitely one of the big ones, but I also bumped into some other kind of fundamental computer science components, especially when thinking about micro services and things like that. But I’d say databases were the one that blew my mind.

Stephen Welch: 00:51:41

It’s a part of our stack, a part of our system. If an operator in Michigan, for example, sees that, “Hey, our deep learning model, it’s missing these defects,” she can actually go and tag it in our app. And part of our back end is we send that image up for retraining. Sounds simple. What could go wrong?
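
One thing that can go wrong is the network: the on-premise machines are meant to keep working through internet outages, so a tagged image cannot simply be uploaded on the spot. A minimal store-and-forward sketch, assuming a local SQLite table as the queue, might look like the following. The table name `pending`, the helper names, and the `upload` callable are hypothetical and stand in for whatever the real back end does.

```python
import sqlite3
from typing import Callable


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local queue of operator-tagged images."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "image_path TEXT NOT NULL, "
        "tag TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, image_path: str, tag: str) -> None:
    """Record a tagged image locally; this works even with no network."""
    conn.execute(
        "INSERT INTO pending (image_path, tag) VALUES (?, ?)",
        (image_path, tag),
    )
    conn.commit()


def pending_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM pending").fetchone()[0]


def flush(conn: sqlite3.Connection, upload: Callable[[str, str], None]) -> int:
    """Try to upload queued items oldest first; stop at the first network
    error and keep the remainder for the next attempt."""
    sent = 0
    rows = conn.execute(
        "SELECT id, image_path, tag FROM pending ORDER BY id"
    ).fetchall()
    for row_id, image_path, tag in rows:
        try:
            upload(image_path, tag)
        except OSError:  # ConnectionError and TimeoutError are subclasses
            break
        conn.execute("DELETE FROM pending WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent
```

Deleting a row only after its upload succeeds means an outage can cause a duplicate upload but not a lost tag, which is usually the safer trade-off for training data.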

Stephen Welch: 00:51:59

So, as part of building that infrastructure, that’s fairly simple, but I kept thinking about infrastructure like Instagram. We all just take for granted, “Hey, I can open my phone. I can take a picture. I can post it.” And all my friends, Jon is going to have it less than 30 seconds. He’s going to have it five seconds later.

Jon Krohn: 00:52:18

Instant data. [crosstalk 00:52:18].

Stephen Welch: 00:52:18

Yeah. We’re just like, “Oh, yeah, of course, it works. It’s the internet.”

Jon Krohn: 00:52:21

I think about that all the time.

Stephen Welch: 00:52:26

Especially, consider that it’s so reliable, and it’s built on hardware that is less reliable. So, the hardware can have … And the networks. You got unreliable networks, unreliable disks. The geniuses at Google and Facebook have figured out and other companies have built this layer of software that is just so reliable that we think it’s as reliable as breathing.

Stephen Welch: 00:52:50

There’s a great book I read last year called Site Reliability Engineering from Google, and they talk about how people check if the internet is working by going to Google.com. They assume that Google is so reliable that, of course, I can just check my internet access by going to Google.

Stephen Welch: 00:53:07

Just the level of engineering that goes into that, and making it actually work just blows my mind. And when I think about future things I’d like to make videos about or learn more about myself, I think doing something more about databases or programming languages or something more towards the fundamentals of computer science would be just fascinating.

Jon Krohn: 00:53:22

I do absolutely think it’s hugely important. I mentioned earlier, I had an employee who did the Georgia Tech computer science master’s, an AI specialization.

Jon Krohn: 00:53:36

And to people who are listening, if you already have a quantitative undergrad, and you would like to take your data science, machine learning, data engineering, back end engineering capacity to the next level. You really want to invest a couple of years in doing that, and it’s going to be a lot of hard work, but actually not that expensive. This Georgia Tech master’s, I couldn’t recommend it enough. I think it’s the best remote learning option you could have for that kind of background.

Jon Krohn: 00:54:11

Seeing him, when we were thinking about the same kinds of things you’re describing, getting code into production, having the machine learning models operate in a performant way that was efficient with respect to memory, and compute for any particular task, you’ve got to understand computer science.

Jon Krohn: 00:54:34

When you think about data science, a lot of people think about the models, and the models are important. That next step, how your model is going to surface in the real world, that matters typically a lot more. With my models, we could spend R&D for years as a large team and make a model that is slightly more accurate than the model that I have today. The user would never know.

Jon Krohn: 00:55:05

You might win a Kaggle competition or whatever, but it’s going to make no difference to your end user. What makes the difference to your end user is that, like you’re describing, that when they say something or do something, they get the results back in real time. And in order to do that, having the caching work well, having redundancy built in, there’s so many things that you need to think about, and it’s all computer science. I couldn’t agree with you more. Yeah. Vince is his name. And actually, he’s going to be on an upcoming [crosstalk 00:55:36].

Stephen Welch: 00:55:36

Oh, awesome. That would be cool. I’ll look out for it. Yeah. I don’t know if it’s just my own career is developing, or maybe the field of data science is developing, but I feel like either kind of the work you do as data scientists will probably serve …

Stephen Welch: 00:55:47

There’s two big areas that come to mind for me. Either you’re kind of more in the traditional BI kind of role where your deliverable is insight for your leadership team, for example. Your deliverable is like a Jupyter Notebook or a deck, for example, where you’re going to explain, “Oh, I found these things.” Like traditional data mining.

Stephen Welch: 00:56:03

I feel like there’s kind of a group of data scientists that are specialized in that way. And then there’s another group that I have been either forced, or pulled, or walked into, which is kind of the production side of things or like, “Hey, my deliverable is not insight necessarily. It’s actually this model, this living, breathing model. It’s going to run 24/7 and do a thing.”

Stephen Welch: 00:56:21

And if you’re in that second camp, then yeah, software engineering and computer science, they’re real things that you should probably know something about. And I say that fully to my previous self where maybe I was a little arrogant five years ago, and I was like, “Oh, I’m just going to train the model. Infrastructure is going to figure itself out. There’s the cloud for that. I’ll just throw it on AWS, Azure, whatever. It’s going to be fine.” But there’s a whole layer there.

Stephen Welch: 00:56:42

I think for bigger teams, depending on where you are, you may just have that kind of unified, you just focus on the models. I think as you think about your career going forward too, just for the listeners, it’s worth thinking about where you like to be.

Stephen Welch: 00:56:56

I definitely really enjoy both sides. For me, the modeling side is more academic, and it’s a little slower paced. No customer is ever going to yell at me, so that’s nice. But then on the deployment side, yeah, there’s some more stress for me, at least, and you got to kind of know more of the stack.

Stephen Welch: 00:57:10

But it can also be really rewarding, because when at the end of the day, you build something that a user is interacting with. There’s this own level of satisfaction to doing that, I would say, that I wouldn’t probably get if I was doing more like a static analysis.

Stephen Welch: 00:57:23

I think it’s a really interesting direction. I’m not sure if that’s kind of the field maturing, or just my own career. I can’t tell the difference sometimes, but that’s definitely been what I’ve seen as I’ve been in data science for longer.

Jon Krohn: 00:57:33

I think it’s both. I think that it is a common arc. I think a lot of people start in data science. A really common trajectory into data science is you kind of start, like you’re saying, with business intelligence or data analytics. Understanding distributions of data, and how to perform some relatively simple manipulations on them and show them in a nice way and understand what’s happening in the data. Correlations between data.

Jon Krohn: 00:58:05

That can lead quite naturally to data science where you’re modeling, you’re making predictions about the future based on incomplete data. It’s a more challenging step from data analytics typically. I think [crosstalk 00:58:20]. Typically, your career go the other way.

Stephen Welch: 00:58:24

Yeah, I agree. That’s a good point. Yeah.

Jon Krohn: 00:58:27

And then the next step, I think, in that journey is this kind of data engineering. [crosstalk 00:58:32] engineering, software engineering. I think that it is, in a lot of ways … You’re right that you can absolutely have these segregated teams. You have computer science specialists and the data scientists. In a very large company, they could be totally separate.

Jon Krohn: 00:58:46

But I think in a lot of scenarios, particularly in any small or mid sized machine learning company, the data scientist learns on the job that if they want their model to make the biggest impact for customers, in order to get involved with the software side of things is …

Stephen Welch: 00:59:01

I couldn’t agree more. Yep. Absolutely. I think it’s super interesting, too, because I’m not sure that there’s an obvious place to me today where people kind of learn that, because it’s not part of … There’s a bunch of data science education popping up, which is great, and most of that is focused on the modeling.

Stephen Welch: 00:59:16

And maybe it’s just my own bias for not having a computer science background. To me, it was like, “Holy crap, I’ve got a bunch to learn.” And maybe it is just kind of pure computer science, but I do like that the stuff is changing quickly enough for the things about like the way we’re using containerization and some of these kind of newer technologies. I feel like I wouldn’t have learned them in school anyway, probably. I mean, maybe some of these more up to date degrees for sure, but it …

Jon Krohn: 00:59:41

Yeah. It’s a tricky one. I think it underlines the importance of education was, I guess, point number three, I think.

Stephen Welch: 00:59:46

Yeah.

Jon Krohn: 00:59:47

And this is part of it is in this field and a field like data science, or data engineering, or even data analytics, the stack, what you need to know, is changing so quickly. Containers, like Docker containers, it’s something, five years ago, you could absolutely not know what that is. In fact, even [crosstalk 01:00:07] five years ago, but today …

Stephen Welch: 01:00:08

I don’t know. I definitely didn’t know about them five years ago.

Jon Krohn: 01:00:11

Yeah. Today, it’s a fundamental part of building a machine learning application using Kubernetes probably to be able to have scalability of your Docker containers. And yeah, these are the kinds of things.

Jon Krohn: 01:00:22

I think it’s kind of tricky. It’s hard to say like, “Oh, you definitely need to follow this particular educational path,” because I think it emphasizes the ongoing learning. You need to spend some time in your week, every week, just learning about some new things that are happening.

Stephen Welch: 01:00:37

Yeah, I couldn’t agree more. Yeah, just one more riff off just real quick is in the class I teach, my class is very much … It’s the theory, right? So, it’s how to train deep learning models for computer vision, and we do some analytical approaches and things like that, but we only have one lecture or half a lecture on deployment and infrastructure and stuff.

Stephen Welch: 01:00:53

And that was a really common request this year, and it should be. It might be out of date in two years, but it could easily be a whole class. All the stuff that goes into that infrastructure and deployment. [crosstalk 01:01:04].

Jon Krohn: 01:01:05

That’s exactly why we don’t have more classes on it, because it’s too hard to keep their [crosstalk 01:01:09] up to date.

Stephen Welch: 01:01:09

I know. Yeah.

Jon Krohn: 01:01:11

When you teach something, to have a curriculum on something, there needs to be some kind of stability. So, I created my own deep learning curriculum. When I offered that class, it’s inevitable. 10% or 20% of people are going to come up to me digitally and say, “This is all great. I’m glad we’ve learned the models, but we’re never going to cover deployment.” And I’m like, “I just can’t keep up to date.”

Stephen Welch: 01:01:39

I agree.

Jon Krohn: 01:01:40

You just have to learn on your own. It’s something that you can get on the job.

Stephen Welch: 01:01:44

Yeah, that’s been a battle I’ve had at YouTube. I wanted to cover some more. I’ve half written a series on generative adversarial networks, but I couldn’t keep up with the literature. Every time I’d have a decent script, some new paper would come out, and I’d be like, “Crap.” One thing I like about the Imaginary Number Series is I’m not covering anything that was invented past 1860 or something. I know that it’s figured out.

Stephen Welch: 01:02:06

That’s kind of depressing, because we need fresh content out there, but the delivery mechanism of YouTube is kind of challenging, because just like you said with your deep learning course, you invest a lot of time trying to get it right, and it’s changing so fast that it’s like, “Darn, it’s going to be out of date before I publish it.”

Jon Krohn: 01:02:22

Nice. Well, all right, let’s move on with the list.

Stephen Welch: 01:02:25

I think this will be a nice segue. I think this riffs right off it, I think.

Jon Krohn: 01:02:30

Yeah, it is.

Stephen Welch: 01:02:31

Sorry, I interrupted you. Sorry.

Jon Krohn: 01:02:31

No.

Stephen Welch: 01:02:31

All right.

Jon Krohn: 01:02:31

You didn’t interrupt me. I mean, you’re the guest. It’s really your show. I’m just here to make sure that it keeps running. I just got to make sure the camera is on. So, number seven is …

Stephen Welch: 01:02:44

Open source and why it matters. Like I said, part of my challenge and it’s becoming very obvious to me now in this podcast is I just have a bunch of questions. But let me get into these questions about open source.

Stephen Welch: 01:02:57

So, let me give a little bit of context, and then I think we’re going to tie it back to content going out of date, so stay tuned. In five minutes, we should be back to that, I think with an interesting layer on top of it, I think. So, here we go.

Stephen Welch: 01:03:10

So, the context. When I think about open source software, to me, and I think this is kind of reflecting on it, I think it has to do with when I grew up. To me, I very much remember like when I came of age, in high school, Wikipedia was just becoming a cool thing. In 2004, I was in high school, and I just had my mind blown by Wikipedia. I was like, “Holy crap, this is alive. It feels different than anything that was created by a company ever could.” It’s because of the open source model. People who give a damn about it are working on it all over the world at the same time, and that’s incredible.

Stephen Welch: 01:03:45

I think kind of as a result of some of those early experiences, I think there’s this kind of assumption in my head. I wrote it down. So, the assumption that is just part of, I think, just in my head, is that the most reliable and maintainable software in the world is created using an open source model.

Stephen Welch: 01:04:02

So, I feel that’s true, but I never really challenged it or thought about it until last year when I was kind of at work, I had to think, do some stuff. And I picked up a book. Maybe you’ve read it. It’s The Cathedral and the Bazaar. I happen to have all these books on my shelf right here, thankfully.

Jon Krohn: 01:04:15

Got it.

Stephen Welch: 01:04:15

Yeah, it’s highly recommended. Again, I’m still at the bottom of the learning curve on this, but one of the things that the author, Eric S. Raymond talks about is this idea of Linus’s law, basically for Linus Torvalds, Linux’s creator.

Stephen Welch: 01:04:30

Linus’s law is given enough eyeballs, all bugs are shallow, which I think is really cool. As soon as you’re in a silo, as soon as you’re stuck in one company, it’s very easy to have these bugs that seem really tough. But as soon as you’re in this open source model, someone out there is going to be an expert on this and they’re going to be able to solve it in no time.

Stephen Welch: 01:04:49

I think I’ve got kind of two riffs off that and I want to tie back to the original point, but my first riff is really just thinking about … Maybe I’m behind the ball here. Open source has been around for a while, but it does come up. My company sells proprietary software. We leverage open source stuff just like everyone else does.

Stephen Welch: 01:05:08

I think an important question to think about is like, “Okay, if open source, in some ways, is kind of a better model of software development, and not always right, but in some cases. When should you not do open source? How does it intersect with commercialization?”

Stephen Welch: 01:05:21

Throughout my career, I’ve worked with a few different lawyers along the way, and I’m always asking them questions like, “How does this work exactly?” Maybe I’m just not smart enough to understand it, but I feel like even the law that governs the intersection of open source and commercial software is super fuzzy.

Stephen Welch: 01:05:40

I think if I had one question about it, it would be like, “If you’re starting a startup tomorrow, and let’s say you’re making software, what are the big things you should think about as far as open source versus not? What is the place for proprietary code? Why do you do it? What’s the advantage?”

Stephen Welch: 01:05:56

Obviously, if you’re selling something, there’s a layer to it there. But if open source is so good, why don’t we do it all the time? Maybe that’s my question in a nutshell. And again, no answer, only questions.

Jon Krohn: 01:06:11

Yeah. And I have a lot of ideas. There’s a lot of places that we could go from here, but in the interest of time, and us still having three points left. Not trying to let the podcast go too long, although it’s been every minute, I hope, has been as interesting for our listeners, isn’t it?

Stephen Welch: 01:06:25

Right.

Jon Krohn: 01:06:26

Yeah. Open source is hugely important. Did that tie directly back to the last point that we’re talking about?

Stephen Welch: 01:06:33

Here it is. Ready? Here we go. That was Linus’s law. Given enough eyeballs, all bugs are shallow. So, I would like to present my corollary to Linus’s law. Here we go. This is something that drives me crazy as a YouTube creator.

Stephen Welch: 01:06:45

So, my corollary is basically, when you make something for enough people to see and you care about being right, you’re eventually going to screw up. That’s my corollary. And my thinking on it basically is … There’s a really good video from CGP Grey. I’ve linked it in the blog post, but he talks about this very real fear of being wrong on the internet.

Stephen Welch: 01:07:06

We kind of laugh about it, but when you’re trying to get something right, and I’m sure you know this, Jon, for making your course. When you’re trying to be right, there’s a certain level of anxiety that comes with that. In the worst case, it can paralyze your creation process, because you’re like, “I’m not the expert. I don’t know how to be right about this.”

Stephen Welch: 01:07:24

And it’s kind of scary, because if you do have a popular or even semi-popular channel, and people are going to look at it, someone out there is going to be more of an expert than me. That’s the reason open source software works, because someone … You have this open development model, but when you’re doing content creation, at least the way I’ve done in the past, it’s very much this step where I’m going to create it, and then I’m going to release it. And there’s not a dialogue with the community.

Stephen Welch: 01:07:47

I’m sure someone out there smarter than me has figured out or is figuring out how to do that dialogue with the community as part of creating content for YouTube, but I have not figured it out, I guess I would say. And it can really work against you in these creative situations where you’re trying to be right because Linus’s law will work against you when you’re trying to kind of be authoritative on a topic.

Jon Krohn: 01:08:07

Yeah, beautifully said. I think you’re exactly spot on. We’re going to move to the next point. Actually, unless this will completely derail the way you want to cover this, eight, nine, and 10, I think they’re actually related.

Stephen Welch: 01:08:22

Okay.

Jon Krohn: 01:08:25

Eight is about social justice. Nine is about information in politics. And 10 is about climate change. And so, all three of your final 10 points, final 10, all three. The final three of your 10 points are related in the sense that there are these big problems, also big opportunities in the world around social justice, politics, and truthfulness in politics, accurate information in politics and climate change.

Jon Krohn: 01:09:03

These are interrelated in many ways on these three concepts. A lot of polarizing views where one person often has the same polarized view on all three issues that is opposite. In the United States, we call them Democrats and Republicans. Yeah. So, fill us in on these three really big, important topics. And it may be [crosstalk 01:09:29] learning or data science relationships.

Stephen Welch: 01:09:32

Yeah. I’d love to. I’ll give you my opinion on these things, and they’re obviously huge topics. They could be their own podcast easily, but they are. When I think about the things that are important that I thought about a lot in 2020, what seem to me to be the important topics, I think they’ve got to be on the list.

Stephen Welch: 01:09:52

So, the first one, yeah, it’s social justice. I think the Black Lives Matter movement of 2020, you can’t watch that and think that … Yeah, you can’t ignore it, I don’t think. Not that I was trying to, but at the same time, it’s a shift that happened, I think, in 2020.

Stephen Welch: 01:10:12

The thing I talked about in the post a little bit is when I see something like that, I really want to think how can I make a difference here. The one thing that pops into my head right away is the course I teach.

Stephen Welch: 01:10:24

I’m in Charlotte, and Charlotte has the unfortunate reputation. There’s a study done, I think, in 2017 of social mobility among the top 50 American cities. I think it’s actually economic mobility, and Charlotte is 50 out of 50. So, we are the worst upwardly mobile city in the country out of those top 50. Yeah, not a great claim to fame.

Stephen Welch: 01:10:48

I think about my course, so I teach a graduate level computer vision course in Charlotte. There’s 60 students every year. It’s very internationally diverse, which is great. As far as kids from Charlotte, especially from underprivileged backgrounds who come to my class, there’s never been one. It’s 180 students, and it’s zero.

Stephen Welch: 01:11:06

I don’t want to be self-centered about it. Obviously, there’s other things that they can do. Maybe they don’t want to study computer vision, but I think about some of the salaries that come from these disciplines, and it seems like it’s not just a coincidence, I guess. That’s seeing a symptom of this problem, I think.

Stephen Welch: 01:11:22

I just can’t help thinking about what are the barriers in Charlotte. Let’s just take Charlotte as an example. Why are those kids not in my class? And maybe there’s a benign explanation, or do they go to college elsewhere, but I think there’s more going on. So, the questions I have in this section are, what are the barriers that keep them from pursuing that path? What can we do about them? That’s something I really want to learn more about this year.

Stephen Welch: 01:11:46

I do think that as far as people who are talking about this, especially in data science. I follow Jeremy Howard’s work pretty closely, and then Rachel Thomas, who founded fast.ai. They have some really great writing about fixing data science, which we could go really deep down.

Stephen Welch: 01:11:59

When I think about how can I make a difference, how does this impact my life, that’s an area that really popped up for me. And again, no answers, but it’s something that I think it’s a real signal, and something that I need to learn more about this year. I know there’s foundations in Charlotte that are working on this exact problem, so I would love to understand what their understanding of the problem is, and what they’re working on.

Jon Krohn: 01:12:23

Nice. If people are interested and haven’t already heard, in a recent episode of the SuperDataScience podcast, which aired March 4, we focus primarily on unethical or ethical AI, so both the industry as well as issues with models, and even hardware, actually, which is something new that I learned in that episode. I totally agree that this topic could be an entire podcast, and we did that.

Stephen Welch: 01:13:01

It’s coming on March 5. Yep. Awesome.

Jon Krohn: 01:13:03

Yeah. Well, from the perspective of the listener, that is in the past.

Stephen Welch: 01:13:05

That’s right. It’s already happened. Go back. Yep.

Jon Krohn: 01:13:09

Nice. I don’t know if you want to wrap up with politics and climate change.

Stephen Welch: 01:13:15

Yeah. I’d love to. Just have a little questioning around them, but yeah, absolutely. So obviously, 2020 was a crazy year politically. And I think for me, I obviously don’t work directly in that area, but I think as a technologist, it just makes me wonder how does the work that myself and my peers do, how does that impact what’s happening in politics?

Stephen Welch: 01:13:40

If I think about how old I am now, and when I kind of came of age and stuff, my youth was really as the internet was becoming a thing. When I got our first computer, I remember when we got it on the internet like in 1995, that was a big step.

Stephen Welch: 01:13:54

And there was a lot of optimism, I think, early on. The internet is going to make the world this more transparent place, because you’ve got more information. Everyone can get the information they need. But if you look at 2020, gosh, it seems like the exact opposite is happening. It seems like it’s making the world less transparent, and it’s harder to get true information.

Stephen Welch: 01:14:17

So, it’s one of those things I think about. Is there a technological solution to this? Can we get verifiable facts through a new kind of technology? Or is it more of a policy thing? Do we need more regulation or something? Just something that I couldn’t help but think about last year just because it was so crazy.

Jon Krohn: 01:14:37

Totally. Yeah. The US is really crazy, for sure. Yeah. I thought some of the same things. There are definitely people working on ways with models to try to surface more accurate information. I think it’s a huge disservice to society when a fictitious point of view is just completely … Yeah, it becomes so prominent. And ultimately, yeah, it makes lives worse.

Jon Krohn: 01:15:20

If there are technical solutions, I think also policy solutions, I think is going to be a mix of both. And hopefully, because of how crazy things had become that people are going to be making headway on these things. Definitely something to think about.

Stephen Welch: 01:15:32

Totally.

Jon Krohn: 01:15:33

And speaking of, issues that are already big issues, and no matter what we do is going to become a much bigger issue over the coming decades, but we have some green shoots of progress and data can play a role. We also, actually, in a forthcoming episode, I think it’s going to air in April. So, I mentioned him just quickly. I said his first name, Vince, is going to be on an upcoming episode of the SuperDataScience podcast, and we are going to spend the entire episode talking about how machine learning and data science can be used to help the environment, to help with climate change.

Stephen Welch: 01:16:09

Super cool.

Jon Krohn: 01:16:10

Yeah. But please, also Stephen, tell us your thoughts.

Stephen Welch: 01:16:14

This will be a preview or an encouragement to watch to listen to that episode. That sounds awesome. And again, this is just kind of reflecting on what I’ve been up to for the last 10 years. When I think back, about a decade ago is when I went to graduate school.

Stephen Welch: 01:16:27

I studied environmental engineering, actually, for graduate school, and wanted to go work on big problems. At the time, I was pretty fired up to like, “Let’s go work on climate change.” This was 2009. Obama was freshly in office. He just appointed Steven Chu from Berkeley to be Secretary of Energy. And I was like, “I’m going to Berkeley. This is going to be awesome.”

Stephen Welch: 01:16:48

When I got there, I looked at different research opportunities. And a lot of the research I found, or the people that were working on climate change, they were really working on it from a policy perspective. It makes sense. The thesis there, there’s a well known paper called Stabilization Wedges, where they talked about how … It’s from 2004, or something, but they say, “The technology is here. We can curb this. We just need the policies to do it.”

Stephen Welch: 01:17:10

So, a lot of the focus, at least of the labs that I kind of visited in graduate school, was in that, in the policy region. And at the time, I was like, “I’m not sure I can make an impact in politics.” I’m more technical, and that was my bias at least then. So, I went off and researched something else. It was really interesting. I used machine learning to actually predict when snow was going to melt, which is a really important resource in California. The melting snow is 60% of their water supply or something. So, really, really cool research.

Stephen Welch: 01:17:35

But anyway, I didn’t dive into climate change. And then, I was like, “Well, we’re making progress. This is great.” But I looked back on this 10 years later and I feel like it’s kind of one step forward, two steps backwards in a lot of ways. It’s kind of discouraging. Maybe I’m just being negative, but reflecting back now, I wonder at the time, I just …

Stephen Welch: 01:17:54

This was 10 years ago. I just did a cursory look around the university to see what was happening. And it didn’t seem like there were big, technical pushes on climate change. It just seemed like mostly policy. I’m excited to listen to this podcast, because I really want to learn what are the technical ways to make a difference too.

Stephen Welch: 01:18:10

Obviously, just like the information and politics thing, just like social justice, policy matters, of course, but technology matters too. So, just another area I’m excited to learn more about in 2021. I think it’s really an important area.

Stephen Welch: 01:18:24

I think the thing I’m most curious about is, is it the case that we just need to change our policies? How much should we be investing in policy versus technology? Obviously, it’s a both and strategy. I’m really curious to hear some other thinkers explain their thinking on those two trade offs.

Stephen Welch: 01:18:42

In some ways, the technology is really simple. It’s just about energy balance. If we’re burning more energy than we’re able to get out of the air sustainably, then it’s not going to work. I think a cursory physics look at this, you can say, “Oh, there’s no room for innovation because of the first law of thermodynamics.” I’m sure there’s more to it than that, and I’m excited to learn more.

Jon Krohn: 01:19:01

Nice. Well, thank you so much, Stephen, for taking us through your 10 wonderfully thoughtful points in your Coming Up For Air blog post. Tons of food for thought. Lots of jumping off points and many resources that you provided us with already with books and other blog posts, videos, all of which are linked to from your blog posts, but we’ll also capture those in the show notes for today’s episode.

Jon Krohn: 01:19:35

And then could you please let viewers know how they can get in touch with you, how they should be following you online? Obviously, the Welch Labs YouTube channel is a great choice.

Stephen Welch: 01:19:45

Yeah, that’s probably the best place. You can go to welchlabs.com as well. I do publish a blog there from time to time. But, yeah, YouTube and welchlabs.com are probably the best places to get in touch.

Jon Krohn: 01:19:56

Beautiful. All right, thank you so much, Stephen. This has been such a wonderful episode. I’ve learned so much and I can’t wait to catch up with you again. Maybe in two years we can have you on the show again.

Stephen Welch: 01:20:04

Oh, I’d love to. Yeah. Yeah, hopefully, all 10 problems will be solved in two years.

Jon Krohn: 01:20:10

No doubt about it.

Stephen Welch: 01:20:10

No doubt. Awesome. I really enjoyed it. Thanks for having me.

Jon Krohn: 01:20:13

You’re welcome. See you soon. Well, I’m sure you can tell, I’ve thoroughly enjoyed that episode. What a treat to have Stephen’s insightful thoughts on what’s really important in a data science career.

Jon Krohn: 01:20:30

We talked about empowering all people with deep learning power tools, the trade offs and opportunities of virtual education, widening the impact of machine learning with semi-supervised models. The critical importance of technology sales. The massive value of computer science and open source software, and the opportunity for machine learning to contribute to meaningful progress in social equality, political truthfulness, and avoiding climate change.

Jon Krohn: 01:21:02

As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, and the URLs for Stephen’s website and YouTube channel at SuperDataScience.com/453. That’s SuperDataScience.com/453.

Jon Krohn: 01:21:20

If you enjoyed this episode, I’d of course greatly appreciate it if you left a review on your favorite podcasting app or on YouTube. I also encourage you to tag me in a post on LinkedIn or Twitter where my Twitter handle is @JonKrohnLearns. To let me know your thoughts on this episode, I’d love to respond to your comments or questions in public and get a conversation going.

Jon Krohn: 01:21:39

All right, it’s been a great episode. I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon.
