Jon Krohn: 00:00:00
This is episode number 555 with Ken Jee, head of data science at Scouts Consulting Group. Today’s episode is kindly brought to you by Udemy Business, the platform that powers your business with learning and by “Unlocked”, Z by HP’s short film made specifically for data scientists.
Jon Krohn: 00:00:23
Welcome to the SuperDataScience podcast, the most listened to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn. Thanks for joining me today. Now, let’s make the complex simple.
Jon Krohn: 00:00:54
Welcome back to the SuperDataScience podcast. We’re fortunate to be joined today by the exceptionally popular data science content creator, Ken Jee. Ken is the head of data science at Scouts Consulting Group, an analytics consulting company that specializes in sports data. Ken improves the performance of athletes and teams by analyzing data collected on them. To the public, however, Ken is best known for his YouTube channel, which he uses to help his 190,000 subscribers with tutorial and commentary-focused videos that make the fields of data science and machine learning accessible to everyone. He’s also renowned for starting the ubiquitous #66DaysOfData social media hashtag, which has helped countless people create the habit of learning and working on data science projects every day. On top of all that, he’s the host of the Ken’s Nearest Neighbors podcast and is an adjunct professor at DePaul University in Chicago. He holds a Master’s in Computer Science with a concentration in AI and machine learning.
Jon Krohn: 00:01:55
Today’s episode should be broadly appealing, whether you’re already an expert data scientist or just getting started. In this episode, Ken details what sports analytics is and specific examples of how he’s made an impact on the performance of athletes and teams with it, where the big opportunities lie in sports analytics in the coming years, his four-step process for how someone should get started in data science today, and how the #66DaysOfData hashtag can supercharge your capacity as a data scientist, whether you’re just getting started or already an established practitioner. And, he talks about his favorite tools for software scripting, as well as for production code development. All right, you ready for this fun episode? Let’s go.
Jon Krohn: 00:02:42
Ken Jee on the SuperDataScience podcast. This has been a long time coming for the podcast. Where in the world are you calling in from?
Ken Jee: 00:02:52
I am on Oahu near Honolulu. Got a pretty nice day out here. I’m a couple hours behind you though. I think it’s your evening almost. I’m just starting my day. I’m about to eat lunch.
Jon Krohn: 00:03:07
It’s possible for people watching the YouTube version of the podcast that the sun will set while we’re recording. It will definitely not happen for you. Hawaii is beautiful. I’ve only been there once. I went with my family for two weeks. We had an amazing time. We were on the big island for a while. Then we were on-
Ken Jee: 00:03:24
Excellent.
Jon Krohn: 00:03:25
… Kauai. There was a historic thunderstorm while we were on Kauai that took out all the power. There were-
Ken Jee: 00:03:32
Oh, no.
Jon Krohn: 00:03:33
… thunder claps in the middle of the night so loud that not only did they wake me up, which is super unusual, but I was laughing out of sheer fear. It was such a loud sound, that I was laughing inappropriately. That I was just, “What’s going on?” But yeah, Hawaii, absolutely a beautiful place. I would love to go there all the time, but it is not close to mainland US. Especially for me on the East Coast flying from New York, it was quite a haul.
Ken Jee: 00:04:12
Well, there’s direct flights from a lot of East Coast cities, actually. Washington DC and Massachusetts. That’s not a city, but you get the idea. I can’t remember what city the airport’s in, but yeah, I mean, it’s not easy, but you get a nice seat, you sleep the whole time and you wake up in paradise, right?
Jon Krohn: 00:04:32
You do. It is paradise. You haven’t always been in paradise. Is that right?
Ken Jee: 00:04:39
No, I was in Chicago for about five years before this. I actually really liked it. I think for three months of the year, Chicago is a top-five city in the world. It’s just once it gets cold, it is not a top-five city in the world.
Jon Krohn: 00:04:58
How did you pick that? Did you do some kind of cluster analysis or any other kind of quantitative analysis of where would the most wonderful place in the entire world I could live be? Then you were like, “Okay, it turns out it’s Oahu and now I’m here.”
Ken Jee: 00:05:17
Well, actually it was an economic decision. When I was in Chicago, I lived by myself there for a little bit. I was looking to probably move somewhere else. With the nature of the world and the kind of pandemic world we live in now, I could do my work from anywhere. I realized that if I could do my work from anywhere, I might as well go somewhere where it’s very temperate and I can play golf every week and do those types of things. Around the same time as I was looking to move, my parents actually bought a retirement home in Hawaii. They weren’t going to live in it.
Jon Krohn: 00:05:56
Wow, that’s [crosstalk 00:05:58].
Ken Jee: 00:05:57
I told them, being a really good son, I would go look after the place until they wanted to move in. After they decided to move in, I started looking for houses. But to be honest, the real estate prices in Hawaii went up 30% during the pandemic because everyone was taking their San Francisco salary and essentially moving out here. It’s perfectly fine. I’ve been with my parents for a couple months. I’m now looking to buy a place either here or somewhere back on the mainland.
Jon Krohn: 00:06:28
No kidding?
Ken Jee: 00:06:28
Yeah, it was just a really nice situation. I’m super grateful to my parents that they would let me do something like this. It’s been an incredible year-and-a-half, two-year vacation for me. It might be time to get back to the real world. But yeah, I mean, that’s sort of how it came about. I was like, “Look, who knows what’s going to happen in the pandemic. I’d be foolish not to take my parents up on either very cheap or free rent, save up to buy a really nice house when I decide to make that jump.” I could not be more happy with the decision to pursue this little foray for the last kind of chapter of my life.
Jon Krohn: 00:07:12
Awesome, it sounds great. If anything, the pandemic, one of the lessons, for me, was that family and being able to see them, we can’t take that for granted. That you were actually able to spend a disproportionate amount of your time in the pandemic with them, I think that is special, and no doubt, a lot of wonderful memories from that time. All right. We hadn’t met before filming this show, but I feel like I know you from your YouTube channel. I’ve been watching so many videos. We were introduced by Harpreet Sahota, who was on the SuperDataScience podcast in episode number 457. He’s the host of The Artists of Data Science podcast. I don’t know how you guys connected, but you’re both very well known in the data science world, especially in the data science content creation world.
Ken Jee: 00:08:01
Well, I love shouting out Harpreet. I go on his happy hour, which is just actually right after we’re recording this. I don’t know if I’ll make it today, but it’s a great time. I highly recommend anyone go check that out. He’s also been on my podcast, Ken’s Nearest Neighbors and I’ve been on his podcast. In my mind, one of the things I love about data science and this space is that there are so many incredible just creators, people putting content out there. The community is incredible. He recommends me to go on your podcast. I recommend him for a bunch of stuff. It’s not like there’s a competitive aspect to it. It’s like if we’re building together, if we’re creating this community together, everyone wins. We all grow the whole pie. It’s not like we’re stealing slices from each other.
Jon Krohn: 00:08:50
Yeah, it’s definitely not a zero sum game. One of the really nice things about the data science field and the associated fields around it, is that they are all growing so quickly and interest in them is growing so quickly that there’s huge amounts of opportunity for carving out specific niches, like Harpreet’s podcast, Artists of Data Science, it’s such a unique and wonderful niche. Yes, it’s tailored to data scientists, but it’s this idea of data scientists as being creators. He has content not only from creative data scientists, but also just from creative writers in general, including some of the best known authors on the planet of just general books.
Ken Jee: 00:09:33
Books, yeah.
Jon Krohn: 00:09:34
What an interesting take on a podcast theme. Yeah, super cool guy. Harpreet, if you’re listening, clearly, you got two big fans here. Speaking of content creation, you have a few followers. You have 185,000 YouTube subscribers at the time of recording. I am not surprised that you have so many. You have incredible content. You have practical videos on what to do to get started in data science. You have this kind of annual series of How I Would Learn Data Science in 2022 or 2021. Things like how to go from a data analyst to a data scientist. You have detailed walkthroughs of the phases of a data science project. You have suggested data science projects. Something that you released shortly before we recorded this episode is a really fun video on can you analyze my data better than me? Where you have a starter project for people, you provide them with your results, and then you challenge your 185,000 subscribers to outperform you on that project. I think that’s a super cool, fun way to get people going on a project.
Jon Krohn: 00:10:55
You have suggested pathways. Should somebody do a Google data analyst certification or an IBM one? What are the differences? You go beyond data science. You also have general life guidance. Things like how I learn to learn, seven incredible books that transformed my health. On top of all that, you also have some very amusing videos. I loved, and I watched from end to end, I was riveted to this video of you doing data science job expectation versus reality. It’s this really humorous skit that you did with Luke Barousse. You play his boss on a data science job. I thought that was so much fun.
Ken Jee: 00:11:42
I have to give all the creative genius on that one to Luke. I think in another life, he might have been a cinematographer. He has a lot of fun with the storytelling of images. That’s something I’ve been working to improve. It’s funny. You said that you weren’t surprised that I have this following, but the funny thing is I am more surprised than anyone that the things that I put on the internet, the stuff that I was making effectively, that I wish I had had when I was starting out. I mean, if I really look at what I have done, it’s that I’ve created videos that I wish I could have given to myself three, four, five years ago. Back then, I thought the audience for that was really small. But it seems like with the domain growing and the interest and a lot of the conflicting information out there, I thought it’s fun for me to be able to tell my stories and give my perspective. I mean, something that I try not to do, and it might not sound like it from the titles, but I like to speak from my experience. There’s no ground truths necessarily. I’m trying to tell what I’ve seen. I reserve the right to be wrong because I find that I’m wrong a lot more than I even thought I would be. But that’s a beautiful thing about creating and storytelling, is I don’t look at myself as an authority. I only look at myself as a reflection of the conversations that I’ve had and the experiences that I’ve had. Hopefully that journey is something that resonates with people. I guess, apparently it has over time.
Jon Krohn: 00:13:14
As I was doing research for this episode, I read an interesting story that you got started with YouTube because you had a class project where you had the option of either, was it, you could present live in class or you could record a video? You went for the recorded video option. Then I don’t know how much later, months later or years later, you came back and it had 5,000 views. You were like, “Oh, wow.” People are just interested in stuff you could create. That kind of got you going.
Ken Jee: 00:13:46
I mean, that’s almost exactly how it happened. I did this project. I still remember it. It’s my first video that I have uploaded to the channel. It’s comparing LSTM and GRU … oh, my goodness.
Jon Krohn: 00:13:58
Recurrent neural networks.
Ken Jee: 00:14:00
In a recurrent neural network when predicting cryptocurrency prices. It was right, I think, during the first big crypto boom, when everyone was super stoked on it. It was maybe the second one. I guess there was a lot of traction there. I don’t think it was even 5,000 views. It was maybe 1,000, 2,000. I was like, “Oh, my goodness. This is crazy.” It’s funny. Now, one of my videos does that in the first couple hours. That idea though that, “Hey, this is a platform that I could communicate with other people. This is a platform where I can storytell. This is a platform also where I could improve some of the things that I needed to work on in myself.” I’ve told this story, I think, a couple times, but it’s something I love sharing is that when I started making videos outside of that one where I was going through a presentation, I did it in order to improve the way that I articulate, improve my diction and the way that I speak. I did an interview with a company where I had to read a prompt off of a screen and respond to it. There was no human that I was talking to. It was just talking into a camera.
Jon Krohn: 00:15:11
When I talk to leaders in data science, I notice they all make time for learning and encourage the same of their teams. But, with your actual everyday work to do, all-day trainings aren’t possible for most of us. That’s why an on-demand learning platform like Udemy Business makes sense. With Udemy Business, you can access over 500 cutting-edge data science courses taught by real-world subject matter experts and validated by other learners’ real-time reviews. Amongst these 500 courses, you’ll find my own Mathematical Foundations of Machine Learning course, as well as dozens of mega popular courses from other SuperDataScience instructors. To hear the latest on the state of data science in the workplace and discover how you can democratize data science learning in your teams through Udemy Business, check out the new video series called Insights On Demand: Diving Into Data Science. To watch this series and learn more, visit business.udemy.com/sds. That’s business.udemy.com/sds.
Ken Jee: 00:16:08
I watched it back. I remember thinking to myself, “Why would anyone ever hire this person of this-
Jon Krohn: 00:16:14
Whoa.
Ken Jee: 00:16:14
… video that I just saw?” I was a robot. I didn’t have any emotion to me. I couldn’t convey the information clearly. I thought, “Wow, I could convey information that I know about,” because I just had this experience moving from management consulting into data science. I can also work on improving this skill of mine, this storytelling, this ability to speak, this ability to put together a cohesive message. Then eventually, that led into improving my ability to make videos and use different camera angles and do all this other stuff. That one little thing sparked this entire sort of avalanche of me trying to improve myself and improve my content, but also convey value and tell stories and hopefully create some meaning or some understanding around this domain of data science that was quickly evolving on YouTube and on a platform. It was an awesome sort of fun project and experience. I never would have guessed it would turn into what it has now.
Jon Krohn: 00:17:18
It’s super exciting. It’s really nice to be able to hear that kind of origin story for people who are getting started, probably not only in content creation, but kind of any new pursuit out there. How at some point, you just start. There’s just some first step, you don’t know how it’s going to go. You might be doing it badly, because you apparently have a robotic voice in the videos that you’re reviewing. Then you’re like, “Well, there’s an opportunity here to improve.” You take those first steps and then, wow, now you’ve got 185,000 subscribers, probably far more by the time this episode actually airs. In terms of today, when you’re thinking about what content you’d like to create, what’s your process? What’s your inspiration behind the videos that you create? Do you follow a set process? Are you like, “I’m going to create something, no matter what, once a week. I’m going to create a video once a week”? Or do you just kind of do it when inspiration hits? What’s kind of your process around creativity as well as production?
Ken Jee: 00:18:26
That’s something that’s in motion right now. At the beginning of the year, I’ve been sort of reflecting on how often I want to create content. I used to make videos every week. Even before that when I was growing the most, I was making three videos a week. I was doing a sort of commentary video on Mondays. Wednesday, I would record an interview. Then Friday, I would review someone’s project. It was fun. I enjoyed it, but I also realized that I needed to make more cohesive content in order to grow and maximize what I was working on. I also wanted to just have as much fun with the content and not burn out. That’s always my goal. It’s very easy if you feel like you have to do something over a long period of time, to burn out and get fatigue. I slowed down to around one video made each week. This year, I started working with an editor, which has been unbelievable in terms of my ability to produce. But I also have felt like, to a certain extent, I want to do more in terms of creativity. I want to do more in terms of the storytelling aspects. I don’t know if I can do that on a week-to-week cadence. I’m thinking about stretching the time periods out a little bit more just to make it so I can feel like I’m producing the best possible things that I can put out there.
Ken Jee: 00:19:51
In terms of video ideation, I have a list of 150, 200 video ideas that I could make at any given point in time. More of that is about thinking about what I want to make, what I’m excited about creating and then using data to substantiate if that would be interesting to the audience. Over the last couple years, I’ve done significantly more polling. YouTube has a great community tab where you can run polls or you can ask questions. I’ve been using that quite a bit to make sure that the content is matching what the audience is interested in. I will say, for thinking about getting started with data science and that sort of niche, I don’t know if there are many more videos that I could make about it that would tell a different story. I’m sort of making this progression into telling more relevant stories to actual data scientists or people that aren’t necessarily related to data science.
Ken Jee: 00:20:50
I made a video recently about Zillow and what happened where, effectively, they got into the iBuying business and it failed absolutely miserably. Was that a machine learning problem or was that a management problem? That’s something that resonates with a lot of different audiences. I also did one on this incredible story of this guy named Bill Benter, who reportedly made almost a billion dollars betting on horses in Hong Kong in the ’80s, ’90s and early 2000s. He used a pretty simple machine learning model, but no one’s ever talked about the implications of the model or how he was able to make this. I’m trying to create more content that sort of stretches the pipeline. If someone watches my content to just get in to start and learn data science, if there isn’t a next step after they become a data scientist, after they land their job, what incentive is there for them to stick around and watch more introductory content? I’m trying to sort of expand the pipeline and make it interesting to broader audiences. That’s also really appealing to me because I am interested in helping people start, but I’m also really interested in what’s going on in the data science world. I’m really interested in the new problems that arise. It makes me really excited to be able to make content about that as well.
Jon Krohn: 00:22:03
I love those stories that you’re focusing on. The Zillow story is something that a lot of people, especially in the data science community, are kind of aware of. A lot of people in the tech community are aware of it. It would be interesting to be able to take a deep dive on that topic. I’ve seen that video come up on my feed. I’ve nearly watched it because I was like, “Oh, that looks really interesting. I would love to learn more about that.” Then same thing with the horse racing. That one’s also come up in my feed. Again, a human interest story isn’t quite the right way, but it’s kind of a way to describe it. But there’s this really compelling story to be told where data and models play a key role. I think that’s a really fun niche for you to be starting to dig more and more into.
Ken Jee: 00:22:54
It’s interesting, I think the news, and whenever they’re talking about these stories, there’s a lack of nuance about the actual machine-learning elements. It’s cool to be able to, hopefully, give some additional insight from a practitioner’s standpoint, to be able to describe what’s going on either internally or with the models or whatever it might be. There’s a fun art for me of trying to make that accessible and interesting to both someone who’s not a data scientist and someone who’s a senior data scientist at X, Y, Z company. I absolutely love that challenge. And I love storytelling. I think I have a long, long way to go in terms of growing as a storyteller, but to me, that’s as exciting as any of the data science challenges that I’m also working on.
Jon Krohn: 00:23:39
Awesome. It’s great to be specializing at that intersection in both of those arenas. As a data scientist, as well as a storyteller, there is literally an infinite amount of depth that you could go into and an infinite amount of expertise in so many different little pieces of it. By intersecting those, it sounds like a powerful combination. I know now that you’re not focusing so much on the how to get started in data science videos, but for our audience’s sake, and given that those are historically some of your most popular videos, how do you recommend someone today get started in data science?
Ken Jee: 00:24:23
I mean, I think that’s a really important question. I make a video on this every year because my perspective on it changes basically every year. Historically I would tell people, just learn enough coding to be able to start a project and get your hands dirty as quickly as possible. I think that can work. I think hands-on work and projects are one of the most important things you can do. They’re one of the things that allows you to make the most growth in a short period of time, but it also isn’t necessarily the best way to start because it can be unbelievably intimidating. I think in the past, I was maybe being a little too hard on people that they couldn’t get through that initial phase. I think the first thing that’s really important is some meta learning around what data science is, what does it entail? What tools do you use? What pathway do most people take to learn the discipline? Something as easy as going through four or five different online courses, not taking the courses, but just looking at the curriculum and seeing how they’re structured and what they teach, just understanding, “Hey, these are the domains. You’re going to need XYZ math. You’re going to need to learn these concepts. These algorithms seem to be popping up a lot. I can make sort of this meta learning roadmap for myself to be able to approach this domain.”
Ken Jee: 00:25:38
Next, I think it’s really important to learn how to program. I think coding allows you to apply a lot of the math that you’ll eventually need to learn as well. I look at it as if I’m trying to remember something, if I’m trying to take notes or whatever it is, it’s a lot easier if I have a pencil to write it down and to understand it and to solve problems, than if I don’t have any tools to apply the information that I’m learning. I recommend this. It’s not for everyone. Some people are like, “You have to learn statistics first,” yada, yada, yada. It’s just not how I see it. But if you can code, again, you can apply all this math. If you want to understand how randomness works, you can create randomness using code and look at how it creates distributions or whatever it does. To me, just the ability to use tools to be able to build things is something that’s been integral in my life as a data scientist and as a lifelong learner. That’s generally the next step that I would recommend.
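As a minimal sketch of the point above about creating randomness in code and watching a distribution emerge: the example below is not from the episode, and the two-dice setup and the function names `simulate_dice_sums` and `print_histogram` are assumptions chosen purely for illustration. It uses only the Python standard library.

```python
import random
from collections import Counter


def simulate_dice_sums(n_rolls, seed=0):
    """Roll two fair six-sided dice n_rolls times and count each total."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    totals = (rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))
    return Counter(totals)


def print_histogram(counts, width=50):
    """Print a text histogram, scaling the tallest bar to `width` characters."""
    tallest = max(counts.values())
    for total in sorted(counts):
        bar = "#" * round(width * counts[total] / tallest)
        print(f"{total:>2} | {bar} {counts[total]}")


if __name__ == "__main__":
    # Each individual roll is random, yet the totals pile up around 7.
    print_histogram(simulate_dice_sums(20_000))
```

Changing `n_rolls` from a few dozen to many thousands is one way to see the shape settle down as the sample grows, which is the kind of tinkering described here as making the math easier to absorb.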
Ken Jee: 00:26:42
After that, I would then historically say jump right into projects. This time, I learned my lesson. The next thing I would recommend doing is actually reviewing other people’s projects, looking through the code, understanding how they work. Something really cool you can do is you can just copy and paste from other people’s projects and tinker with it. You run this function and you tweak it a little bit and see what happens. That’s a really low-risk, low-effort way to understand the mechanisms that are going on and to be able to digest it and evaluate it further. If you really want, what you can do is take someone else’s project and expand on it. That could be your first project. It doesn’t have to be this completely new and novel thing. You could just be taking someone else’s work, giving them credit, of course, and then iterating it a little bit, asking additional questions of the data with this framework in place and then solving it in that way.
Ken Jee: 00:27:35
After that, is when I think you should really get your hands dirty with the data, start building projects, start exploring, start reading research and applying it into projects. Then that sort of iteration loop keeps going with the learning, meta learning, getting your hands dirty and reviewing other people’s work, just keep going through that path. I haven’t gotten off of that hamster wheel yet because this domain is something where you are constantly, effectively, always learning. There’s no place where you’re like, “Oh, I’m done. That’s it. I’ve learned everything there is to know,” because tomorrow there will be a new tool, tomorrow there will be an innovation in the technology and the algorithms that we use. If you establish those habits and if you get into that mindset, it can be very valuable for longevity in this career and for your overall success.
Jon Krohn: 00:28:24
I love that. Great tips. Meta learning about what the field is, learning how to program, reviewing others’ projects and then digging into your own projects.
Ken Jee: 00:28:33
Exactly, and repeat.
Jon Krohn: 00:28:36
And repeat, exactly. That makes so much sense. Yeah, I love that approach. It’s nice that because you do this every year, you get time to reflect probably through the year. You realize, “Ah, that advice I gave, maybe that isn’t the best.” It’d be interesting to see how that continues to evolve in years to come, but I think that this is an awesome process. I agree with everything about it. I think this idea of seeing what data science is about first … that’s actually Josh Starmer, whose episode aired last week, he described that same kind of thing. When he’s researching something that he wants to learn for a new video, he goes through the first few pages of Google results, opens every single one of those in a new tab and then just kind of skims them. Just like you described, what are these words that keep recurring? Those are probably important. I think that makes a lot of sense. Then also, I loved your point about learning to program as one of the first steps because then, probability, statistics, machine learning, it’s so much more fun and interesting when you can see it, when you’re not just reading about it in a book. I think programming, that kind of interactivity is huge. I think it’s very helpful for learning and very helpful for understanding how something works in practice.
Jon Krohn: 00:30:08
This episode is brought to you by “Unlocked” by Z by HP. “Unlocked” is an interactive short film made specifically for data scientists. The movie is broken up into four segments, each with a unique data science challenge covering data visualization, text analysis, audio signal processing and computer vision. Each challenge is beginner-friendly. If you submit your answers via the website, you are entered for a chance to win one of 10 ZBook Studio laptops and a free trip to the Kaggle World Championships. Watch the movie and take on the challenges today at hp.com/unlocked. Want to do it with someone? Stop by the Hackathon on March 12th, where we’ll work through the projects together. There will be speakers, data science panels, live tutorials and prizes. RSVP and details in the show notes.
Ken Jee: 00:30:57
Well, I get criticized. A lot of people are like, “Oh, you’re undermining the importance of math.” I’m like, “No, math, statistics, calculus, linear algebra, they’re all very important.” But if I were to go back and learn those subjects, I would probably understand them so much better if all of those concepts that I was doing I was implementing in code because, at least for me, that’s as hands-on as you’re, effectively, going to get with that rather than pen and paper. You have to understand how the code works to understand how the math works and then eventually vice versa. I don’t know. I have always looked at that as very obvious: if you can apply these things, you’re going to understand them better and you’re going to have more ways to describe what you’re doing and what you’re working with. But that is just to say, I don’t want to undermine the importance of math. It is very important, but I also don’t think it needs to be learned first. I actually think it makes significantly more sense to learn it second.
Jon Krohn: 00:31:58
As it happens, you probably don’t know this about me, Ken, but I am in the process of releasing a very, very long course on the mathematical foundations of machine learning. I’m releasing it all for free on YouTube. All of the linear algebra playlist is complete. In the next few weeks, the calculus playlist will be complete. Then I’ll move on to probability, statistics and computer science, algorithms and data structures, optimization, that kind of thing. It is a coding-forward approach. Every single linear algebra, calculus concept that we learned, you implement in code. I show you the equations. Then I’m like, “Here’s how we make the equation in code.” Then those functions that you use, can be used later in the curriculum. People would need to know a bit of Python to get started, but if you’re not familiar with PyTorch or TensorFlow or NumPy, I do kind of cover those libraries from scratch through the curriculum. I am very much in agreement with you.
Ken Jee: 00:33:03
Well, I will definitely be taking that. I need a brush up on a lot of those concepts anyway.
Jon Krohn: 00:33:11
I would love your critical feedback on how I could be making my videos better on YouTube and getting even more traction. I would love that. Awesome. YouTube, however, despite being a way that you make a huge impact on the world, isn’t, in fact, your day job. You are the head of data science at Scouts Consulting Group. What does Scouts Consulting Group do? What does it mean to be a head of data science there?
Ken Jee: 00:33:42
Scouts Consulting Group is an organization that helps athletes and teams improve their performance by analyzing the data that’s collected on them. Effectively, what I’m in charge of doing is helping scope projects to work with both of these athletes and teams. We’re analyzing data collected mostly by the large sports organizations. In essence, our goal is always to help improve the probability that a team or athlete wins. Everything is about winning. It’s about performance. A lot of other applications in sports analytics are about describing outcomes, creating ROI for organizations. Our focus has been, and probably will always be, about creating better opportunities on the court, on the golf course, on the field, whatever it might be. Honestly, it’s a lot of fun. I get to work with a lot of professional athletes. I get to work with a lot of really cool and unique sports data that, for the most part, is behind a paywall, which I think is fundamentally wrong, by the way. I think most organizations should completely open up their data sets. It would advance almost all sports tremendously in a very rapid period of time. The main sports that we’ve historically focused on are golf and basketball. We’ve, obviously, explored some other ones like football, baseball, et cetera, but our main clients are either athletes or teams in those two domains.
Jon Krohn: 00:35:13
What’s the idea there? The idea is to literally be analyzing information and helping people have a better golf swing or a better strategy on the course?
Ken Jee: 00:35:26
More a strategy on the course. Our biggest client is the US Ryder Cup team. If you’re not familiar with golf or what the Ryder Cup is, the Ryder Cup is an every-other-year event where, effectively, the best 12 players in the US play the best 12 players in Europe. One of the things that the team has to consider is that of those 12 players, at least in this past year, six of the players qualified through points, through how they performed over the course of the year. Then six were picked by the captain. We have to evaluate who would be a really good fit for the course based on the course conditions, past performance-
Jon Krohn: 00:36:05
Wow.
Ken Jee: 00:36:06
… recent performance and make those recommendations. The captain could choose to use those recommendations or not, but we are doing a quantitative analysis to evaluate those outcomes. That’s a fairly specific project. We also will go into organizations and help them build out a research branch or a team of data scientists. It might be surprising to people, but there are a lot of sports organizations like this. I think every single baseball team has an analytics program, but outside of baseball there are a lot of programs in other sports, basketball, football, et cetera, that do not have, effectively, the infrastructure to create research, data science, and analytics programs. We’ll go in. We’ll talk to them about what types of projects they should work on, how they should build out a team, the five, 10-year strategy of what they should look like and get all the pieces up and running. In that sense, the goal is to make ourselves obsolete over time. We have this one type where we go in and we build the organization and this other type where we’re going in and doing projects with the specific data.
Jon Krohn: 00:37:13
That is so cool. I am fascinated by that kind of work. Michael Lewis has popularized this-
Ken Jee: 00:37:24
Of course, Moneyball.
Jon Krohn: 00:37:24
… kind of approach in baseball with Moneyball. Then the subsequent feature film featuring Brad Pitt. This approach, as you mentioned in baseball, is relatively fleshed out. As you mentioned, baseball teams have caught on that there’s this huge opportunity to be using data to be making better player selection decisions. It’s cool to hear that is starting to spread out more and more into other sports. Amazing to hear that specifically for the Ryder Cup, you were able to make that kind of direct impact. That’s a really fascinating job. You must love it because you, I mean, you talked about playing golf on weekends now that you’re in Hawaii. You’re able to do that regularly. I think you have a lot of athletics in your background, right?
Ken Jee: 00:38:13
Yeah, I played golf in college. I tried to play professionally for a little bit. Then I realized that I was a little bit better at analyzing golf than I was at actually playing golf. It was one of those pursuit of passion things. I was wondering how I could stick with this profession or stick with this game and work in it. It was a really unique combination of the skills that I acquired, as well as my previous background playing in the domain. It created this nice energy that was appealing to a lot of people. I mean, particularly in sports, domain knowledge is really important, especially when you’re making the first level introductions or you’re selling business in the early stages. I don’t think we would’ve been able to get a lot of our work if my entire team hadn’t all been really involved with golf and playing at a high level to begin with.
Ken Jee: 00:39:10
Sports is a really unique domain because, for example, our clients, a lot of them are golfers. A lot of them are either former golfers or former basketball players or whatever it is. Some of them are unbelievably business-minded. They’ve taken classes. They’ve done whatever. Some of them, they’re incredible in their domain, but they just haven’t necessarily dedicated the time to understanding analytics or any of these things. It’s not a fault on them. It’s not that they’re dumb or anything, but it’s that they haven’t spent the time understanding and learning about what a specific type of bar graph is. You might show a business stakeholder a bar graph and they get it, but you might show an athlete, or a former athlete, something like this and it’s just that they haven’t indexed on these skills. We have to be really creative with how we present information.
Ken Jee: 00:40:06
It’s like in any domain, we want to use terminology and stories that are relevant to them to convey this information. That’s a huge part because nobody wants you to come in there and talk to them like they’re dumb. Nobody wants to feel like they’re dumb when they’re being presented information. Again, not a single player or team or athlete that we ever worked with would I consider unintelligent. They are geniuses in their sport. Effectively, all of them are really smart and personable and easy to talk to, but there’s just this information gap because they’ve never been trained on this specific domain of interpreting data. I love that challenge, and for me, it’s fun. It’s like how do I take this really complex analysis, how do I explain a simulation, like a Monte Carlo simulation, to someone who has never heard of one of these before? I just love that aspect of the work.
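For readers who have not met a Monte Carlo simulation before, the idea can be sketched in a few lines of Python: play out a random event many times and count how often the outcome of interest occurs. The example below is a deliberately simplified illustration, not the analysis Ken's team runs. It assumes each hole of a match is independent, with made-up fixed probabilities `p_a` and `p_b` that player A or player B wins the hole (otherwise the hole is halved), and all function names are hypothetical.

```python
import random


def simulate_match(p_a, p_b, holes, rng):
    """Play one match; return 1 if A wins more holes, -1 if B does, 0 if tied."""
    a_holes = b_holes = 0
    for _ in range(holes):
        r = rng.random()
        if r < p_a:
            a_holes += 1
        elif r < p_a + p_b:
            b_holes += 1
        # Otherwise the hole is halved and neither player scores.
    return (a_holes > b_holes) - (a_holes < b_holes)


def estimate_win_probability(p_a, p_b, holes=18, trials=20_000, seed=0):
    """Monte Carlo estimate of the chance that A wins the match outright."""
    rng = random.Random(seed)  # Seeded so repeated runs give the same estimate.
    wins = sum(simulate_match(p_a, p_b, holes, rng) == 1 for _ in range(trials))
    return wins / trials


# Assumed per-hole probabilities, chosen only for illustration.
print(estimate_win_probability(0.30, 0.25))
```

Explained to a non-technical audience, this is simply "we played the match twenty thousand times on a computer and counted how often player A came out ahead."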
Jon Krohn: 00:41:00
That is so cool. You’ve described, in a way, the applications and the fun parts of your job, communicating, for example. When you’re doing those analyses, when you’re doing those Monte Carlo simulations, what kinds of tools do you use? What are your kind of go-to tools day to day?
Ken Jee: 00:41:18
Mostly, I’m just using Python and a bunch of the regular libraries there. I am a weirdo in the sense that I prefer using Spyder as an IDE. A lot of people give me flak for that. I think it’s because a long time ago, the first exposure to coding I ever had was with R and RStudio. I liked being able to see a GUI interface of what my data looks like in tabular form. I’ve stuck with that. If I do actual development work, I’ll use VS Code. But if I’m doing just pure data science, I quite enjoy using Spyder. If you’re going to judge me for that, you’re welcome to tune off at this point in time.
Jon Krohn: 00:41:57
See you later, listener. That’s where they all switched off. No, I totally understand that. R wasn’t my first programming language, but it was the first that was a huge part of my life. In that time, RStudio was my IDE of choice for working with it. I do understand. I understand how you’d be into the Spyder IDE. Cool. Throughout this episode, there’s been a topic that’s permeated it. Whether we’re talking about the content that you create on your YouTube channel, whether we’re talking about your process for learning, whether we talk about having lots of different skills and tools that you need at your job, there’s this learning theme that’s permeated all the conversation on this episode. You are behind a hashtag, #66DaysOfData, which is perhaps the most ubiquitous hashtag used on social media associated with data science today. I see it everywhere. This is a hashtag that you created to support people in learning data science techniques. What is it? Tell us more about it. Where did it come from and why is it so effective?
Ken Jee: 00:43:15
I think we alluded to it before, but learning for me is a lifelong journey. I think learning has to be a career-long journey in data science as well. It’s not like you just learn all these techniques and then you’re ready to go and it stops. It’s a continuing process. It can be really overwhelming to learn a lot of data science concepts over any period of time. In my mind, the best way to approach that is to habitualize learning rather than thinking of it as like checking boxes. I have to learn this concept and this concept and this concept. If we create this daily learning process or practice, we can, effectively, have a really long and prosperous career because we’re not intimidated by how much there is to consume. #66DaysOfData is truly about creating great learning habits. 66 days comes from the book by James Clear, Atomic Habits. In it, he talks about how 66 days is the average amount of time that it takes to ingrain a habit. As a data scientist, yes, I realize there are problems with averages, especially related to this. It might take some people 30 days. It might take some people 90 days, whatever it might be.
Jon Krohn: 00:44:25
He acknowledges that in the book.
Ken Jee: 00:44:26
Exactly.
Jon Krohn: 00:44:27
He doesn’t try to dumb it down to, “There’s this magic number.” It’s, obviously, some things are harder to learn than others.
Ken Jee: 00:44:36
Exactly. But, I thought 66 was nice. It gets people asking about the numbers. There’s some marketing use in there, hopefully, as well. But, the idea behind the challenge is first, you create this habit. You learn data science for at least five minutes each day for 66 days. The next thing that you do is you share what you learned on your favorite social platform. This has a two-fold benefit. The first is you’re part of this community. Everyone’s helping to keep you accountable. If someone sees that you didn’t post or something like that, you feel it. You feel like you’re letting people down because you’re part of this thing that everyone is doing. The other part of that is you’re creating this awesome track record. Something that I think a lot of people struggle with is putting themselves out there. I know, and I think you know this, in 2021, 2022, going forward, wherever it is, creating a brand for yourself, creating content, creating a portfolio, whatever it is, it gives you a huge leg up in landing a job. It gives you a huge leg up in creating opportunities for yourself. By posting every day and having something really clear to post, it breaks through that fear of this process. It lets you see this track record. It lets you create this track record. Then afterward, 66 days later, you can go back and look at how far you’ve come.
Ken Jee: 00:45:56
I think that’s the other really motivating thing is that a lot of people over a week, over a couple days, they don’t feel like they’re moving anywhere. Compared to yesterday, I’m like, “What the heck did I do?” I did a podcast. I did whatever, but that’s not like, “Oh, I made this monumental leap in my learning or my career.” If you’re looking over the course of effectively two months, that’s a different story. You say, “Oh, my goodness. I learned all these topics. I have it really well documented what I did,” that’s going to make me want to continue this or to learn more or really dive in. Or you’ll learn that, “Hey, data science isn’t for me, man. I didn’t like this experience. I’m not going to do this anymore,” which is also okay, right?
Jon Krohn: 00:46:34
Yeah, I think that’s an important part of the process is to commit to something for a period of time. You don’t need to commit to anything forever, but 66 days, just a couple months of your life. That’s a good amount of time to get experience with something new, see what it’s really like. I have spoken about data science in the context of Atomic Habits. I did a Five-Minute Friday, episode number 442, on that. One of the things that I love that you’re doing with #66DaysOfData, that I talked about in that episode as well, is this idea of not breaking the chain of learning. If you want to be good at anything, well obviously, just continuing at it is the key. It’s like it’s so obvious that it’s amazing how people fall down on that all the time.
Jon Krohn: 00:47:28
I think an example, it might be in James’ book, if not, it was in his newsletter years ago before he started writing the book, was this technique came from Jerry Seinfeld. If you want to be a great comedian, then, obviously, the key is to write jokes. You need to spend a lot of time writing jokes. Jerry Seinfeld got into this habit of marking an X on a calendar every day that he did a joke, that he made at least an effort. Kind of like you’re saying with #66DaysOfData, maybe it’d be something like five minutes of effort writing a joke or actually completing a joke or just doing something. Having that visual of being able to look on a calendar without breaking a chain is kind of helpful. You’re bringing that idea into the digital world. With this #66DaysOfData hashtag, you have this continuity across whatever platform you’re using, LinkedIn, Twitter, whatever. You can look through, based on that hashtag, go back through days and see that there’s this continuous trend without breaking for 66 days. Any of your followers can see that too. I think it was a brilliant idea to create this. I love how much it’s taken off. I have no doubt that it’s made an enormous impact. Probably difficult to quantify, but alongside your YouTube channel, this #66DaysOfData initiative has no doubt made a huge impact on the data science community in the broader world. I have no doubt that will only amplify in the years to come.
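The "don't break the chain" idea is easy to check in code. The sketch below is a hypothetical helper, not part of any official #66DaysOfData tooling: given the calendar dates on which someone posted, it returns the longest run of consecutive days. The dates in the example are invented.

```python
from datetime import date, timedelta


def longest_streak(post_dates):
    """Length of the longest run of consecutive calendar days in post_dates."""
    days = set(post_dates)
    best = 0
    for day in days:
        # Only start counting from the first day of a run.
        if day - timedelta(days=1) in days:
            continue
        length = 1
        while day + timedelta(days=length) in days:
            length += 1
        best = max(best, length)
    return best


# Made-up posting history: ten days with the seventh day missed.
start = date(2022, 1, 4)
posts = [start + timedelta(days=i) for i in range(10) if i != 6]
print(longest_streak(posts))  # 6: the chain broke on the seventh day
```

A completed round of the challenge would show a streak of at least 66.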
Ken Jee: 00:49:03
Thank you. Well, I think one of the most important things, especially for me, is that I actually participate in it too. It wouldn’t be great if I were to start the challenge and tell other people to do it, but didn’t do as I said. I think that the fact that I’m engaging, the fact that … it’s also real. Sometimes it’s hard for me to get five minutes in. I have a bunch of meetings. I’m doing X, Y, Z, and it’s like, “Oh, shoot. I have to do this, but I’m still going to do it.” I let people know when it’s like, “Oh, this was a slippery slope day for me. It is what it is.” I also have to thank the community. For example, you had Josh Starmer on in the previous podcast. He did the #66DaysOfData challenge with me before. My friend, the Data Professor, has also done it. There have been a lot of people who have engaged and bought in. I mean, I am so grateful and completely surprised by the community growth. We have this Discord server where a lot of people, they can ask questions, they can form groups, they can collaborate with each other. That’s an externality of this that I didn’t even see as possible. I didn’t even see that we could be building community in this way. I thought of this more as an individual challenge, but oh, my goodness. That’s the best part of it now. You get to be in this learning community where everyone else, a lot of new beginners, a lot of people, are learning this together.
Ken Jee: 00:50:27
Tell me if that isn’t rewarding or doesn’t feel like you’re part of something? I mean, the scariest thing is you’re like, “Oh, everyone’s ahead of me. I don’t know what I’m doing. I could never break in because X, Y, Z has already done this. Look how smart they are. Look at their project.” No, everyone, you’re seeing like, “Oh, I learned this really basic concept today.” Even me. I went through the basics in the first session. I don’t know. I just think that’s so powerful. Again, I’m really grateful to the admin, all the staff, everyone that’s made it possible because it’s truly, truly one of the coolest things that’s happened around my content in the last year or two.
Jon Krohn: 00:51:06
Oh, yeah. It’s extraordinary. How often do you do it?
Ken Jee: 00:51:11
I’ve been trying to do it two to three times a year.
Jon Krohn: 00:51:13
Cool.
Ken Jee: 00:51:14
This year I started it on my birthday, January 4th. We’re almost exactly a month in now. I’ll probably do it, I think Josh committed to doing another round with me sometime in the middle of the year. I’ll probably join him or I’ll let him kind of have that one and I’ll do one a little bit later, a November, December-type of thing.
Jon Krohn: 00:51:33
Cool.
Ken Jee: 00:51:34
Yeah, it’s a great way to knock the rust off. I also found that I was stagnating a little bit in my learning. It’s really nice, and intimidating, to have this accountability with however many people are viewing my content, knowing that they are expecting me to post on a certain day.
Jon Krohn: 00:51:55
I think by the time your episode airs, this episode on March 8th, I think you will have pretty much wrapped up. You’ll have concluded your latest 66-day adventure.
Ken Jee: 00:52:07
It might be the last day, actually.
Jon Krohn: 00:52:09
Oh, that’s really interesting.
Ken Jee: 00:52:10
Or one of the last couple days.
Jon Krohn: 00:52:13
Then for a particular 66-day cadence, do you pick a particular theme at the outset or do you have recommendations for listeners in general, whether it’s something you do yourself or not? Is it I’m going to get through this book. I’m going to make some progress, at least a few pages every day kind of thing.
Ken Jee: 00:52:35
Yes and no. Admittedly, for me, I haven’t been able to put together a cohesive learning plan. That’s something I would very much like to do. I’m not following my own advice with the meta learning. This time around, I’m actually working on a machine-learning algorithms course with my friend, Jeff Lee. I think he might have been on the podcast. If not, incredible guest to bring out eventually. I’m slowly going through each of the algorithms and documenting it and learning more about it, about them and making them more cohesive. But there are days when I can’t work on that and I’ll watch a video or I’ll do X, Y, Z. A lot of other people, Josh in particular, made a learning plan for each day where people could follow along with him.
Jon Krohn: 00:53:20
It does not surprise me he did that.
Ken Jee: 00:53:23
I would love to do that. I just haven’t been able to put the time together to do it. I do encourage other creators if they want to use the platform or they want to do the challenge with me, to do something like that. I find that outsourcing, for me, is a lot easier, but also it can be really great for their brand if they’re part of something like this. I’ve gotten really long on the sort of community aspect and having community contributors. There have been many people that have contributed what their learning plans were and things like that. It doesn’t just necessarily have to be from me. It’s probably actually more useful if someone a little earlier in their career is sharing their experiences to people who are just a little bit behind them.
Jon Krohn: 00:54:08
Super cool. We’ve talked a lot about community. Something that we touched on right at the beginning of the episode, in discussion of Harpreet, was your Ken’s Nearest Neighbors podcast. You mentioned Josh Starmer just a moment ago. You’ve also had Ben Taylor, who was a guest on the SuperDataScience show. In fact, quite a few SuperDataScience guests, Karen Jean-Francois, Vin Vashishta. Obviously, you’ve had some amazing guests on the program. I’m not surprised, given how well known you are in the data science community. Why do you also, on top of everything else, on top of your YouTube channel, your job, the hashtag, that in itself creates a lot of work, at least for intervals of two months at a time, on top of all of that, why did you also decide to create a podcast?
Ken Jee: 00:55:04
Admittedly, the podcast, I think, is the child that I love the most. I started interviewing people on my YouTube channel because, admittedly, it’s the easiest form of content. I talk to someone for an hour. I really enjoy it and I just post it. There’s very little edits, whatever it might be. But I saw on my YouTube channel that it was actually hurting the analytics pretty bad because the length of the episode, the percent of watch time, the click-through rates were significantly lower, but I absolutely loved doing it and people saw the value in it. I decided to spin-off a podcast because I realized the podcast was an unbelievable mechanism to create meaningful, human connection in my life, effectively. In the times of the pandemic, during everything that’s going on when we’re very isolated, I found unimaginable catharsis in talking to people or having a structured time where I would talk to people for an hour plus at a time who wouldn’t have any distractions or interruptions, no phones. I’d get to learn their story. I’d get to learn what was meaningful to them in their life and in their career. To me, I mean, that’s sort of what, in a sense, life is all about is connecting with other people and gaining their perspectives and bouncing ideas off of them.
Ken Jee: 00:56:24
A podcast, for me, I was able to, again, structure that into my life and make it a regular thing. But I was also able to share that with thousands of people. To me, that is mind-blowing, that I can get this incredible benefit out of these conversations with people, but other people can listen in and get that same benefit that I had. I actually made a video a while ago called, “Why I think everyone should start a podcast.” That’s the reason why. I think that everyone should have more of these dedicated and meaningful conversations because in the world of social media, which I’m a part of, we’re sending memes every day to our best friend, but we haven’t called them up in two months and checked in with how they’re doing. I just got such an unbelievable, meaningful experience out of talking to people regularly that how could I not turn it into something more structured and try to get paid to do or to grow this thing that I love so much or make it a bigger part of my life?
Jon Krohn: 00:57:29
It will probably not surprise the listener to hear that you are preaching to the choir, Ken. Having a podcast can be an incredibly rewarding experience. If people are interested in trying it out, definitely go for it. As we said at the beginning of the episode, there are all kinds of niches out there. Like Artists of Data Science, you could have your own niche and have some really deep, meaningful conversations. I couldn’t agree more. It is often the best part of my week when I get to sit down with folks like you and get to hear the latest and greatest from all different kinds of perspectives, all different kinds of practitioners in the data science field or associated fields.
Jon Krohn: 00:58:09
Now, the name of the podcast is so incredibly clever. I love Ken’s Nearest Neighbors, named after the machine learning algorithm, k-nearest neighbors. I thought you were, oh, so clever, Ken, for having come up with it. I discovered in my research for this episode, that in fact, you crowdsourced that name.
Ken Jee: 00:58:31
Well, what’s more clever? Coming up with the name itself or outsourcing the work of coming up with a really good name? That’s the question I would ask you.
Jon Krohn: 00:58:40
You just mentioned that a few minutes ago, that outsourcing is a thing to do, that you’d love to outsource. That is smart. You’re right. What’s better, coming up with things or having everyone else come up with everything for you?
Ken Jee: 00:58:54
Well, I mean, the other benefit there too is that the people, the viewers, the listeners, they’re more a part of the channel. They’re more a part of the podcast because they had some input into it. I mean, that’s something you’d mentioned earlier. I released all of my YouTube data for people to analyze. There’s already been two or three analyses where I’m going to take those actual insights and apply them to my YouTube channel. They’re really good and interesting stuff. To me, it’s a no-brainer. It’s like, “Okay, I could sit down and do this,” but I get a twofold benefit. One, I don’t have the time to do the analysis, so other people can take it on. Two, I get to celebrate the incredible work that these people have done that helps me. It’s like the ultimate win-win situation, which I am all about.
Jon Krohn: 00:59:39
Sounds really good. It’d be fun to be able to do that with your day job.
Ken Jee: 00:59:44
We’re working on it. We’ll see.
Jon Krohn: 00:59:47
I mean, with anyone’s day job. I don’t just mean yours in particular, but just kind of, “Hey, here’s an interesting problem. Do you have any ideas out there on how to solve it?”
Ken Jee: 00:59:57
Well, that’s what Kaggle does in a sense, right?
Jon Krohn: 01:00:00
Right. Yeah, I mean, not even in a sense. I guess that is explicitly what they do.
Jon Krohn: 01:00:09
Amazing episode, Ken. We’ve been on quite a journey here already. We have talked about how someone should get started in data science, how you got started in YouTube, what you do as a sports analyst, fascinating work. Then more recently, talking about the podcast of yours, the #66DaysOfData. All right, Ken, I posted that you were going to be on the SuperDataScience podcast on LinkedIn. It had tons of engagement. One of the questions was from Ben Taylor. Ben just simply asked, “When are you going to go and visit him?” I think he’s kind of cutting into our serious audience questions here. Then Christina Stathopoulos asked the same thing. She said, “When are you going to go and visit her?” There you go, Ben and Christina, you’re going to have to chase up with them.
Ken Jee: 01:01:02
I could say this publicly. I won’t tell Ben to his face, he can only hear it here. You have one guaranteed listener, maybe. It would probably be in mid-April is the goal to travel a bunch of the mainland, get off my island over here. I don’t think I’ll make it out to New York, but I will try to book a conference or something in New York to see Christina.
Jon Krohn: 01:01:27
Nice.
Ken Jee: 01:01:27
There we go.
Jon Krohn: 01:01:28
That’d be great. Nice, all right. Well, looking forward to hopefully seeing you when you visit the mainland. Here’s a question from Luke Morris, who’s a healthcare data analyst. He says, “After spatial data proliferation, what’s the next frontier of sports data that could blow the dam off countless new developments?”
Ken Jee: 01:01:48
Ooh, that’s actually a very good question. The next frontier, in my mind, is going to be bioinformatic data. You’re looking at heart rate when people are playing. Some golfers are wearing WHOOPs now where they’re actually like linking that to a TV feed so you can see when they’re putting, whatever’s happening. I think something really relevant for longer-term sports, again, golf would be blood sugar data, figuring out when to eat, when to fuel your body based on a lot of these things. The big challenge with that though is the players’ unions. A lot of player organizations don’t want to have as much data collected on them because they think that’ll be used negatively in contract negotiations. If you can figure out how to get around the legal ramifications and get volunteers to do that, I think that could transform a lot of the games, how we view them and how we understand them.
Jon Krohn: 01:02:44
If your heart rate isn’t above 150 in the ninth inning, you’re not getting your bonus.
Ken Jee: 01:02:50
I mean, potentially. You never know.
Jon Krohn: 01:02:52
Yeah, you never know. Thank you for that insight. Here’s a couple of questions from Arafath Hossain, who is a data scientist at Illinois State University. He asks, “How would you rank these skills according to their order of importance for a data scientist?” We’ve got a few here. Maybe you could just pick one or two that are your favorite. He’s got coding and algorithms as his first one, creativity, networking and communication. How would you rank these skills according to their order of importance, coding and algorithms, creativity, networking, communication? I guess you could just pick your favorites if that’s easier.
Ken Jee: 01:03:32
I don’t think you can be a data scientist without coding.
Jon Krohn: 01:03:36
What’s important being able to type or being able to read?
Ken Jee: 01:03:40
I mean, if you’re looking at the core of the work, I think the one that’s most important to the core of the work, being able to be called a data scientist, would be the coding and kind of algorithms, not as much. The challenge for me with a lot of those is I think that they’re all important, but they don’t necessarily define what it means to be a data scientist. I think if you would throw problem solving or asking good questions or eliciting information into that, that would be as important, if not more important than any of those other things. I mean, to me, what is a data scientist? Someone who solves problems using advanced tools. If there isn’t a problem solving element as part of your equation, if that’s not accounted for somewhere, which I don’t think it is necessarily in the options that were given, then I don’t know if you can call yourself a data scientist on that front.
Jon Krohn: 01:04:30
That’s a good answer. Do you have advice for a five-year-younger Ken?
Ken Jee: 01:04:38
In all of my YouTube videos, that’s who it’s for. But, if anything, I would almost always encourage myself to not overthink it and just get started. You think about how you’re going to approach a marathon or climbing Mount Everest. You do it one step at a time. The more you think about the large things you want to accomplish, the more you think about the overwhelming nature of the task you have at hand, the more apprehension you have and the less likely you are to do it. Yes, you can dream and say you want to do those things, but focusing on the process and enjoying the process is really, really important for me.
Jon Krohn: 01:05:17
Nice, I love that. Then we’ve got one here from Serg Masís. It’s also a sports one. He’s asking, “What sport has the most sports analytics potential?”
Ken Jee: 01:05:31
Potential’s an interesting thing. I think all sports have, effectively, unlimited potential in terms of analytics capabilities. On the other hand, there are some sports that lend themselves more effectively to data analytics. If you look at baseball, there are discrete outcomes. Someone throws a pitch, something happens, stop. Someone throws another pitch, something happens, stop. That makes data relatively easy to use compared to other sports like basketball, for example, that are dynamic. There’s a lot of things that happen between when someone touches the basketball or when someone scores to the next score. There’s, in theory, infinite things that could happen in that time period. Sports with discrete outcomes, as of right now, are the most progressive in analytics because they lend themselves to analysis better. I think when we get new tools, when we get new ways to evaluate data, and we talked about some of the geospatial stuff, that is one really cool way to do that. All of the sports are sort of on a level playing field there. It’s about the creativity and the problem solving and what types of ways that you can optimize within the sport.
Ken Jee: 01:06:39
I think that, again, it’s not fair to say one is better than the other, but the differences in the data do lend themselves in the short term to more success with baseball. I think golf is lagging, for some reason. Maybe there’s just less interest, but there are discrete golf data points. Probably one of the reasons is that data is not public. Whereas baseball data has been tabulated. The most advanced baseball data is not public, but a lot of baseball data has been published over time. That’s something I would actually tell almost every sports organization: open up the data. Let people analyze it. It’ll make the game better. It will make it more interesting. It’ll have better story arcs. It’ll make it to play better. Why wouldn’t you do that?
Jon Krohn: 01:07:23
Cool. Great answer, Ken. Thank you. We’re reaching the end of the episode. I always ask at the end for a book recommendation. What do you got for us, Ken?
Ken Jee: 01:07:33
I am currently reading this one. It’s a Ray Dalio book, Principles for Dealing with the Changing World Order. Really interesting stuff. It’s a framework for understanding economic cycles and the rises and falls of economies. It’s very technical and academic, but I’m super into it right now. I think the way the world is looking, the US is exhibiting a lot of signs of minor decline. China is exhibiting a lot of signs of sort of an ascension. There’s a lot of rocky rubble between both of those outcomes, but it’s fascinating to see what’s happened with previous economies and how that could be representative of what’s happening now.
Ken Jee: 01:08:23
I also recently read this book that I’m recommending to everyone called Sandworm. It’s about, effectively, the Russian hacking programs that have been in existence for the last eight to 10 years. It’s really interesting when we talk about Russia’s relationship with the Ukraine right now because Russia has been testing cyber warfare techniques on the Ukraine since like 2013 or 2012. Now they’re practicing more aggressive, real-world techniques. But I highly recommend it if you’re interested in cybersecurity, if you’re interested in hacking. To me, that’s an awesome, awesome book to pursue. I have an entire library of self-help books that I enjoy, but I’ll save those for another time.
Jon Krohn: 01:09:18
Nice. In the meantime, if people can’t wait, then there are YouTube videos on that kind of content. Like How I Learn to Learn, which I suspect has some of those books, as well as Seven Incredible Books that Transformed My Habits. Oh, Transformed My Health, my apologies. Those sound like videos that people could refer to in order to get that content right away if they’re itching for it. Awesome. Thank you so much, Ken, for being on the show, having these laughs and sharing all this information with us. If people want to stay up on the latest of what you’re doing, obviously your YouTube channel is going to be something to subscribe to. Where else can people follow and stay up to date on what you’re doing?
Ken Jee: 01:10:02
LinkedIn is a really good place. YouTube is still the best. I respond to almost every comment that I get there.
Jon Krohn: 01:10:10
No way.
Ken Jee: 01:10:10
LinkedIn, I am accessible, but I get too many messages to really keep up with. Probably not the best place if you actually want to get a response. Twitter and Instagram are both KenJee_DS. Then I also will dabble in some writing on Medium, mostly just rehashing my YouTube videos and giving them a little search boost. But, those are the best places to learn more.
Jon Krohn: 01:10:41
Nice. We will be sure to include all of those, YouTube, LinkedIn, Twitter, Instagram, and Medium, in the show notes.
Ken Jee: 01:10:48
And the podcast, sorry.
Jon Krohn: 01:10:49
Oh, yeah, and the podcast.
Ken Jee: 01:10:50
On YouTube or any main podcasting platform.
Jon Krohn: 01:10:55
Awesome. All right, Ken. Thank you so much for being on the program. It’s been a delight. We’ll have to have you on again some time and see what’s going on in your world then.
Ken Jee: 01:11:04
Heck yes. Thank you so much for having me.
Jon Krohn: 01:11:12
What a great time I had chatting with Ken. In today’s episode, he filled us in on his four steps for getting started in data science, namely meta learning, learning to code, reviewing others’ projects and creating your own projects, then looping back to step one. He talked about his preference for the Spyder IDE when scripting Python and Visual Studio Code for production development work. He talked about how sports analytics is transforming sports like golf, for example, by guiding the selection of Ryder Cup teams. He covered how there’s a tremendous amount of potential in sports analytics that could most readily be realized by sports leagues open-sourcing more of the data they collect on athletes.
Jon Krohn: 01:11:54
As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Ken’s YouTube and social media profiles, as well as my own social media profiles at www.superdatascience.com/555. That’s www.superdatascience.com/555. If you’d like to ask questions of future guests of the show, like several audience members did of Ken during today’s episode, then consider following me on LinkedIn or Twitter because that’s where I post who upcoming guests are and ask for your thoughtful inquiries for them. On that note, if you live in the New York area and would like to experience a SuperDataScience episode filmed live and ask the guest questions in real time, then come to MLconf NYC, which will be held on March 31st. That’s MLconf, the Machine Learning Conference on Thursday, March 31st. In addition to filming a SuperDataScience episode live, I’ll also be doing a book signing for my book, Deep Learning Illustrated. The first 10 folks in line will get a free copy, generously donated by my publisher Pearson. After that, I’ll be signing them and giving them away at cost, as cheap as they come. This will be my first conference experience in over two years, and boy, am I ever excited about it. Hopefully, I’ll get to meet you in person then.
Jon Krohn: 01:13:16
All right. Thanks to Ivana, Mario, Jaime, JP and Kirill on the SuperDataScience team for managing and producing another awesome episode for us today. Keep on rocking it out there, folks. I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon.