SDS 444: Future-Proofing Your Career

Podcast Guest: Jon Krohn

February 11, 2021

Subscribe on Apple Podcasts, Spotify, Stitcher Radio or TuneIn

Welcome back to the FiveMinuteFriday episode of the SuperDataScience Podcast!

Today I’m answering some questions from Twitter on future-proofing your data science career.
 

At the start of 2021, I asked on Twitter what questions people had about data science. Today, I’m going to answer the questions I felt would bring the most value to everyone:

  • “Is a career in data science really future-proof? What are the odds of another AI winter and a crisis in this career?”
    – As a lifelong student of probability, I know nothing is 100% certain. But with the proliferation of sensors and the forthcoming 5G Internet of Things, the amount of stored data is going to increase at an incredibly fast rate. That makes data scientists important for sifting through the noise to uncover meaningful signals and drive commercial value with that data. To best future-proof yourself, keep on top of your software engineering skills. As for an AI winter: in my opinion, AI is a bit overhyped, but I don’t see an AI winter comparable to the 1980s on the horizon, thanks to global connectivity and the wealth of data available.
  • “Is AutoML the future of this field?”
    – AutoML, or automated machine learning, is the topic of an upcoming SDS episode, but the short answer is yes and no. AutoML works best on clean data, and in the real world we get extremely noisy data. AutoML may become more prevalent as it speeds up finding the optimal model or hyperparameters, but it is not a replacement for data scientists at this time.
     
  • Questions on model interpretability and bias
    – Open-source model interpretability tools are becoming more common, but data scientists themselves are responsible for ensuring that a model is sufficiently interpretable and free of unwanted biases.

I’ll be back next week to answer questions on machine learning! If you want to ask me your own questions, tag me in a post on LinkedIn or Twitter!

DID YOU ENJOY THE PODCAST?
  • What trends moving forward are most interesting to you and how can you ensure your core skills keep you up to speed with the shifting world of data science?

Podcast Transcript

(00:05):
This is Five-Minute Friday on Future-Proofing Your Career. 

(00:19):
At the beginning of 2021, I asked the following on Twitter: “What questions do you have about machine learning as a science or as a career?” 
(00:28):
In response, I was asked some terrific questions about data science, many of which are popular ones that I’ve been asked time and again. In today’s FiveMinuteFriday, I’ll answer the ones I thought would be most valuable for everyone to hear the answer to. 
(00:44):
Gabriel, who appears to be Brazilian but indicates his location is “Lost + Found”, asked me: “Is a career in data science really future-proof? What are the odds of another AI winter and a crisis in this career?” 
(01:01):
Well, as a lifelong student of probability, I’m never 100% sure of anything, especially if it’s a prediction of the future. But the proliferation of sensors everywhere is accelerating all the time, which will only be amplified further in the coming years by the 5G “Internet of Things”, enabling the amount of data stored on the planet to double about every 18 months, so I think a career in data science is about the safest bet out there. More and more of the data that gets stored is noise, and so data scientists should be ever more critical for devising and applying techniques to distill meaningful signals from the noise and drive commercial value with data. 
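To make the doubling estimate concrete, here is a minimal arithmetic sketch. It treats the “about every 18 months” figure as an assumption rather than a measured value, and the names `growth_factor` and `DOUBLING_PERIOD_MONTHS` are illustrative only.

```python
# Assumption: stored data doubles roughly every 18 months, per the estimate above.
DOUBLING_PERIOD_MONTHS = 18


def growth_factor(months: float, doubling_period: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Return how many times larger the data volume is after `months`."""
    return 2 ** (months / doubling_period)


if __name__ == "__main__":
    # Print the implied growth multiple for a few time horizons.
    for years in (1.5, 3, 5, 10):
        print(f"{years:>4} years: x{growth_factor(years * 12):.1f}")
```

Under that assumption, the volume grows roughly tenfold over five years and about a hundredfold over ten.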
(01:47):
Given this abundance of data, it will be more and more important to be able to engineer machine learning pipelines, so I do recommend data scientists develop an understanding of software engineering best practices, including algorithms and data structures. That should help you stay totally future-proofed. 
(02:06):
All of that said, I do think that AI is currently a bit overhyped. But I don’t think we’re going to have an AI winter like we had in, say, the 1980s. This time is different because there are so many more sensors, global connectivity, data, and cheap processors than ever before. So, I don’t think data science will become obsolete, but as investors and consumers realize some of their expectations of AI are driven by Hollywood and marketing hype, there may be an AI “autumn”. For more on that, check out The Economist’s special Technology Quarterly issue from June 13th on “Artificial Intelligence and Its Limits”. 
(02:45):
In a similar vein to Gabriel’s question about the future-proofness of data science, the French-Martinican cloud consultant Frédéric Anauld asked: “Is AutoML the future of this field?”
(02:58):
So, AutoML stands for Automated Machine Learning. In the next episode of SuperDataScience, #445, which will be released on Wednesday, the guest, Sinan Ozdemir, and I discuss AutoML in more detail, but the short answer is “yes” and “no”: AutoML is only really useful on clean data, and in the real world we are typically presented with only the dirtiest, noisiest of data.
(03:25):
AutoML may become more prevalent as it accelerates the identification of the optimal model choice or the optimal model hyperparameters, but it is not a replacement for the blood, sweat, and tears of data scientists.
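The kind of search that AutoML automates can be pictured with a toy example. The sketch below is not any particular AutoML library: it is a hand-rolled grid search over one hyperparameter (a ridge penalty, here called `lam`) for a one-dimensional regression through the origin, scored on held-out data. The synthetic data, the candidate grid, and all function names are assumptions made for illustration.

```python
import random


def fit_ridge_slope(xs, ys, lam):
    """Closed-form slope for 1-D ridge regression through the origin."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(xs, ys, slope):
    """Mean squared error of the line y = slope * x on the given points."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_lambda(train, valid, candidates):
    """Fit on `train` for each candidate penalty and score it on `valid`.

    Returns (best_lam, scores), where scores maps each penalty to its
    validation error.
    """
    scores = {}
    for lam in candidates:
        slope = fit_ridge_slope(train[0], train[1], lam)
        scores[lam] = mse(valid[0], valid[1], slope)
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic, deliberately clean data: y = 3x plus a little Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]
train = (xs[:20], ys[:20])
valid = (xs[20:], ys[20:])

candidates = [0.0, 0.1, 1.0, 10.0]
best_lam, scores = select_lambda(train, valid, candidates)

if __name__ == "__main__":
    for lam, err in scores.items():
        print(f"lam={lam:<5} validation MSE={err:.4f}")
    print("selected:", best_lam)
```

The loop is easy to automate precisely because the data here is tidy and the candidate models are already specified; cleaning real-world data and framing the problem are the parts this sketch leaves out.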
(03:42):
All right, next: Both a bioengineering PhD student named Zach and someone with the Twitter handle DoomscrollPro asked questions that are related to model interpretability and bias.
(03:55):
This is related to the previous questions because while open-source model interpretability tools are becoming more common — two particularly popular ones are LIME and SHAP — ultimately the data scientist themself is responsible for ensuring that a model is sufficiently interpretable and doesn’t include unwanted biases such as those against a particular demographic group.
(04:16):
AutoML may recommend an extravagant deep learning model as optimal for accurately solving a problem, but if interpretability is paramount to your application (say, because your model will approve people for a credit card or determine the length of their prison sentence), then a simple regression model with marginally less accuracy perhaps but completely interpretable model weights might be much more appropriate.
(04:44):
Cool, we’re out of time for this week. On next week’s FiveMinuteFriday, I’ll be back to answer popular questions about machine learning, ranging from the best learning paths for getting started in ML to the hardest concepts to understand in ML.
(04:59):
In the meantime, check out Wednesday’s guest episode with Sinan Ozdemir for more detail on AutoML as well as tons of other fascinating topics, such as how to make Conversational A.I. — also known as chatbots — effective for automating real-world business processes.
(05:15):
Finally, if you’d like to ask me your own data science or machine learning questions or anything at all really, feel free to tag me in a post on LinkedIn or Twitter — my handles are in this episode’s show notes — and I’ll aim to answer them via social media or perhaps on an upcoming SuperDataScience episode! 