15 minutes
Data Science

SDS 266: Exploration vs Exploitation

Subscribe on Apple Podcasts, Spotify, Stitcher Radio or TuneIn

Welcome to the FiveMinuteFriday episode of the SuperDataScience Podcast!

In this episode we start out on top of a mountain to discuss the best practices of exploration.

When we develop AI, we’re trying to mimic the way humans make decisions and learn. But I was thinking about what we can learn in the other direction, from artificial neural networks, and integrate into our own organic neural networks: the balance between exploration and exploitation. One example is a reinforcement learning algorithm that tries out several ads, explores which one performs best, and then exploits that insight to get a higher conversion rate. However, what if the result is a fluke? What if we didn’t do enough exploration? So what do we do: continue exploring, or exploit the option we think is the best? There’s a balance.

As humans we tend to fall into patterns. If we see something is working, we just repeat it and don’t want to deviate from it. Think of your route to work, where you go on holiday, or the groceries you buy each week. You think it’s the best option, but that may not be the case, and if you spend time on exploration, you might find better options. This can go for anything in life. However, it’s not about blindly jumping into exploration and just replacing your habits. You have to find the balance for yourself.

For example, I’m in Bali right now because I love the energy here and there are a lot of options if you want to try new things: surfing, yoga, food and drink, coworking, parties, networking, meditation. There’s so much variety. When I was here last year I predominantly worked at the coworking place and did yoga. But things didn’t go to plan this time: the studio no longer taught the yoga style I was a fan of, so it gave me a reason to explore. I found another gym called Nirvana Strength, a world-class gym for Olympic-style gymnastics. I haven’t found a place like it anywhere else in the world, and I would not have found it if I hadn’t been open to exploring. At the same time, however, I’m balancing that out with my usual habits.

Balance is important here. Change something small, be open, and be aware that not everything will work out the way you want. Any change can be hit or miss. It’s a process and a journey.

ITEMS MENTIONED IN THIS PODCAST:

SDS 197: How to be Happy and Successful with Carl Massy
The Guidebook to Happiness by Carl Massy
The Practice
Nirvana Strength
Shady Shack

DID YOU ENJOY THE PODCAST?

Where are you overexploiting patterns in your life and where can you be more open to some exploration?
Music Credit: Mesmerize by Tobu [NCS Release]

Podcast Transcript

This is FiveMinuteFriday, episode number 266, Exploration vs Exploitation.

Welcome back to the SuperDataScience podcast, ladies and gentlemen, super excited to have you back here on the show. It’s super windy, and that’s because I’m at the top of Mount Batur in Bali. I think it’s like 5:50 AM. Actually, it’s 5:43 AM. I’m here with some friends for the sunrise. You can barely see the sun on the horizon behind the mountains in Lombok, and it’s getting beautiful. It’s just amazing. It was almost a two-hour hike to get here. Very excited. You might know from previous episodes how much I like hiking. So I thought I’d just start this recording from here and probably continue it somewhere less windy. So yeah, I just wanted to share this bit with you, and we’ll continue in a second.

Alright. And we’re back. So I got off the mountain, and a few days later I’m now sitting in a studio. Well, a semi-studio: a Skype call phone booth at the Dojo, one of the co-working places in Canggu. And yeah, that was a really cool hike. Hope you got a bit of a feel for that energy at the start of the audio. And here we’re going to continue with the FiveMinuteFriday episode, and the topic is exploration versus exploitation.

So what does that mean? Well, what I’ve noticed is that when we’re building artificial intelligence, we try to mimic the human brain. We try to recreate a neural network, and kind of learn from the way that humans learn or from the way that humans make decisions and do things. Well, there’s at least one thing that I’ve noticed that we can actually learn the other way around. Something we can learn from neural networks and from artificial intelligence itself and integrate more into our life. And that is the concept of exploration and exploitation. So the whole field of reinforcement learning, especially online reinforcement learning is built around the concept of balancing out exploration and exploitation.

And what does it mean? Well, by the way, if you’ve done our Machine Learning A-Z course, we discuss this in quite a bit of detail with the upper confidence bound algorithm and Thompson sampling. But it also applies to other types of reinforcement learning algorithms, which we talk about in the Artificial Intelligence A-Z course. So basically, say you are building a reinforcement learning algorithm that works online, meaning that data keeps coming in while it runs. For instance, let’s say it’s a reinforcement learning algorithm that is optimizing advertising for a website. You have five different ads, and they need to be displayed to users. Every time a user lands on the page, the algorithm has to pick which ad to display to that user.

And what it can do is gather some data. It needs to build up some data because at the start it has no idea which of the five ads performs the best. So it gathers data by exploring, by trying out these ads, and then it finds that, for instance, ad number two performs quite well. At that point it can continue using ad number two and exploit the insight that ad number two seems to be the best. And it can continue exploiting that and getting the higher conversion rate.
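
The explore-first, exploit-afterwards idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the approach taught in the courses mentioned: the click-through rates in `TRUE_RATES`, the function name `explore_then_exploit`, and the round counts are all hypothetical values chosen for the example.

```python
import random


def explore_then_exploit(true_rates, explore_rounds_per_ad, total_rounds, seed=0):
    """Show every ad a fixed number of times, then commit to the best-looking one."""
    rng = random.Random(seed)
    n_ads = len(true_rates)
    shows = [0] * n_ads   # how often each ad was displayed
    clicks = [0] * n_ads  # how often each ad was clicked

    def show(ad):
        # Simulate one user: a click happens with the ad's (hidden) true rate.
        shows[ad] += 1
        if rng.random() < true_rates[ad]:
            clicks[ad] += 1

    # Exploration: try every ad the same number of times.
    for ad in range(n_ads):
        for _ in range(explore_rounds_per_ad):
            show(ad)

    # Pick the ad with the best observed rate. This may be a fluke.
    observed = [clicks[ad] / shows[ad] for ad in range(n_ads)]
    chosen = max(range(n_ads), key=lambda ad: observed[ad])

    # Exploitation: display only the chosen ad for the remaining rounds.
    for _ in range(total_rounds - n_ads * explore_rounds_per_ad):
        show(chosen)

    return chosen, shows, clicks


# Hypothetical true click-through rates for five ads (unknown to the algorithm).
TRUE_RATES = [0.03, 0.05, 0.04, 0.02, 0.06]

chosen, shows, clicks = explore_then_exploit(TRUE_RATES, 50, 5000)
print("committed to ad index", chosen, "shows per ad:", shows)
```

With only 50 displays per ad, the observed rates are noisy, so the ad the sketch commits to is not guaranteed to be the truly best one.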

But the thing is, it cannot know at that stage for sure. Because what if ad number two is just a fluke? What if we don’t have enough data yet to tell? What if that’s a sampling error, and ad number two actually isn’t the best ad out there? Maybe we haven’t done enough exploration and we need to check the other four ads more in order to determine which one is truly the best. In that case we need to do more exploration. And so in that sense, there’s a balance. What do we do? What do we focus on? Do we focus on exploring these options and therefore accept an opportunity cost? One of the ads is the best ad, and while we explore we’re not always displaying the best ad to our users, so there’s an opportunity cost of exploring. On the other hand, if we’re exploiting something that we think is the best, there’s a risk that it’s actually not the best and we’re going down the wrong path.
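
To make the opportunity cost concrete, here is a minimal sketch of the two algorithms named earlier, upper confidence bound (in its common UCB1 form) and Thompson sampling with a Beta prior, run on a simulated five-ad problem. It is an illustration under assumptions, not the course implementation: the click-through rates, the function names, and the number of rounds are hypothetical.

```python
import math
import random


def ucb1_choose(shows, clicks, t):
    """UCB1: observed rate plus a bonus that shrinks as an ad is shown more."""
    for ad, n in enumerate(shows):
        if n == 0:
            return ad  # show every ad once before comparing scores

    def score(ad):
        mean = clicks[ad] / shows[ad]
        bonus = math.sqrt(2 * math.log(t) / shows[ad])
        return mean + bonus

    return max(range(len(shows)), key=score)


def thompson_choose(shows, clicks, rng):
    """Thompson sampling: draw a plausible rate per ad from Beta(clicks+1, misses+1)."""
    samples = [
        rng.betavariate(clicks[ad] + 1, shows[ad] - clicks[ad] + 1)
        for ad in range(len(shows))
    ]
    return max(range(len(shows)), key=lambda ad: samples[ad])


def run(policy, true_rates, rounds, seed=0):
    """Simulate one policy and return (shows per ad, total opportunity cost)."""
    rng = random.Random(seed)
    n_ads = len(true_rates)
    shows = [0] * n_ads
    clicks = [0] * n_ads
    best_rate = max(true_rates)
    opportunity_cost = 0.0  # expected clicks lost by not showing the best ad

    for t in range(1, rounds + 1):
        if policy == "ucb1":
            ad = ucb1_choose(shows, clicks, t)
        else:
            ad = thompson_choose(shows, clicks, rng)
        shows[ad] += 1
        if rng.random() < true_rates[ad]:
            clicks[ad] += 1
        opportunity_cost += best_rate - true_rates[ad]

    return shows, opportunity_cost


# Hypothetical true click-through rates for five ads (unknown to the algorithms).
TRUE_RATES = [0.03, 0.05, 0.04, 0.02, 0.06]
ROUNDS = 5000

ucb_shows, ucb_cost = run("ucb1", TRUE_RATES, ROUNDS)
ts_shows, ts_cost = run("thompson", TRUE_RATES, ROUNDS)
print("UCB1     shows:", ucb_shows, "opportunity cost:", round(ucb_cost, 1))
print("Thompson shows:", ts_shows, "opportunity cost:", round(ts_cost, 1))
```

Both policies keep exploring while they exploit: UCB1 through a bonus for rarely shown ads, Thompson sampling through random draws that are wide when data is scarce. The opportunity cost here is measured in expected clicks lost compared with always showing the best ad.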

There are other things out there for us, in this case four other ads, that might be performing better, and we just haven’t spent enough time exploring them. So there is a balance, and it’s quite an interesting problem to solve. And the difference between reinforcement learning algorithms, for instance upper confidence bound versus Thompson sampling, is how they strike that balance, or which one is better at balancing those two things out and therefore gets better results. And of course, other learning algorithms also need to look into that problem and address it in their own ways. What I mean by us being able to copy that from artificial intelligence and integrate it more into life is that we as humans tend to fall into patterns. Once we see that something is working for us, we tend to just repeat that and not go out of our comfort zone in order to explore new things.

For example, you might be in a pattern of taking the same route to work, you know, driving the same way or walking through the same roads and so on. Or you might be in the pattern of going on holiday in the same places. You might be in the pattern of buying the same groceries to cook the same dishes. You might be in a pattern of where you have your coffee or tea during your lunch break or how you spend your lunch break. You might be in the pattern of the type of music that you listen to. Basically, what often happens is that as humans we find a local extremum, something that works really well. If you shift a little bit to the left or a little bit to the right, it doesn’t feel as good, and therefore we think this is the best option for us. Full stop.

But at the same time, that might not be the case. And if you spend a bit more time and effort on exploration, you might get better results. So for instance, you might feel that the job you’re in is the best option for you. You’ve tried a little bit to the left, a little bit to the right, and that doesn’t really work for you. But maybe there is more exploration that you can put into effect there. It goes pretty much for anything in life. But the interesting thing I find is that it’s not about just blindly jumping into exploration and replacing all of your habits with random tries of different new things. No, there’s actually a balance. And that’s what I like about it, that you’ve got to figure out, “All right, where’s the balance for me in this specific aspect of my life in terms of exploration versus exploitation?”

So I’ll give you an example. I’m in Canggu in Bali right now, a city which I really like. I don’t think you can even call it a city. It’s more of a town or, you know, a village / town. A very, very nice place, great energy here. And that’s why I come here, because of the energy. And there are lots of things to do here for any taste and habits and whatever you want. You know, you can go surfing, you can work, there’s a great co-working place here. You can go do yoga, there are lots of yoga places, you can do meditation. You can party. There are whole parts of the town where people are just partying. You can eat a lot of really great food. You can drink a lot if you want. You can network. There are lots of different people here, interesting entrepreneurs and people working away or people on holidays. You can live in an expensive hotel, you can live in an AirBnB, you can live in a homestay. There’s lots of variety here.

And when I was here last year, what I did was I predominantly worked at the Dojo, which is a co-working place. And then I went to the yoga studio called The Practice. Carl Massy, one of the founders of The Practice, was on the podcast. If you go to www.superdatascience.com/podcast you can search for his episode. I don’t remember the number off the top of my head, but it’s a really great episode about being happy. And he wrote the book The Guidebook to Happiness. Really cool. Really cool guy. Really cool place to practice.

And so when I came here, I already had this routine in mind. I already had in mind how I was going to exploit what I’d found last year. I was going to, you know, go to Shady Shack, which is a place where you can get really good vegan food, a really, really nice place for lunch or maybe dinner, have it by the beach, and then go to The Practice and go to the Dojo. But things didn’t turn out exactly that way. One of the reasons was that at The Practice, the yoga studio, they no longer have Yin Yoga, which I was a big fan of. And I was actually also open to exploring. So I was thinking to myself, what else can I do?

You know, let’s not get into this whole pattern. And I’m actually glad things worked out this way, because the other thing I was looking forward to doing is a bit of calisthenics, where you use your own body weight to train your muscles. And through searching for calisthenics, I found this really cool other gym, which is called Nirvana Strength. And that’s a world-class gym for Olympic gymnastics. So you’ve got the rings, you’ve got the stall bars, you’ve got mats, you’ve got everything there, and it’s in Canggu. It’s just so crazy. Even in Australia, I don’t know of even one gym like that, and there’s one massive one here in Canggu, and they’ve got a really cool sauna, cold pool and so on. And so I thought to myself, well, let me try that out.

And I liked it so much that instead of going to The Practice every day of the week as I was doing last year, this year I’m going to Nirvana Strength every day of the year. And there’s no right or wrong, I’m not saying one is better than the other, it’s just that for me right now, Nirvana Strength works better. And what I’m learning there, how to do pushups properly, how to do pull-ups and whatever else, that’s really exciting and interesting. And had I not been open to exploration, I would not have found that, I would have been in an old pattern. And my life would have been different. Of course I would have learned things and had a great time probably as well. But I’m glad I was open to the exploration because that added something new to my life.

At the same time, it’s a balance. Right? So at the same time I’m balancing out, I’m not exploring crazily around everything. For instance, I’m still, you know, having several meals a week at Shady Shack because I know that’s a great place. I’m exploring some other places to have food, but I am balancing that out with exploiting the places I already know. The same goes for the area of Canggu that I am mostly in, because it’s quite spread out. I could be in a different area. But I know I like this area, so I’m exploiting being in this area. So basically the moral here is that in life we can fall into patterns, and as humans we tend to fall into patterns of exploitation and we’re not open enough to exploration. It doesn’t mean that we need to go crazy about exploration and just completely forget about exploiting the useful things that we’ve found, the useful patterns that we’ve found. It’s just a matter of balance.

And so my call to you this weekend is to think about where you are in a pattern, where you are over-exploiting things in your life when you could actually be more open to exploration, and where it wouldn’t be too risky or too far out of your comfort zone. Something like moving to a new country is okay for some people; for others that might be a bit too much as a first step. Pick something small where you can swap exploitation for exploration. But you have to be aware and prepared that there might be short-term losses, not necessarily financial losses, but short-term sacrifices that you might have to make in order to have that exploration, because any kind of exploration can be hit or miss.

You might get better results, maybe worse results. You might enjoy things more, or enjoy things less. But you know, that’s a process. That’s a journey. So where in your life are you prepared to make a little bit of a sacrifice, to take a bit of risk, in order to facilitate some exploration and potentially find something new for yourself that you might enjoy as well, or might enjoy more?

So there you go. That’s the balance between exploration and exploitation and how it applies to your life. Thanks so much for being here today. I look forward to seeing you back here next time. Until then, happy analyzing.
