Jon Krohn: 00:00:00
This is episode number 611 with Dr. Ken Stanley, world-leading expert on open-ended AI. Today’s episode is brought to you by Zencastr, the easiest way to make high quality podcasts.
Welcome to the Super Data Science Podcast, the most listened to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn. Thanks for joining me today. And now let’s make the complex simple.
Welcome back to the Super Data Science Podcast. You are in for a seriously mind-blowing episode today with Dr. Ken Stanley. I don’t think I’ve ever said on air before that listening to a single Super Data Science episode could dramatically change how you view your entire life, but today’s episode could do just that. Ken co-authored the book Why Greatness Cannot Be Planned. It’s a genre-defying book that leverages his machine learning research to redefine how a human can optimally achieve extraordinary outcomes over the course of their lifetime.
Until recently, Ken was the open-endedness team leader at OpenAI, one of the world’s top AI research organizations. Prior to that, he led core AI research for Uber AI and was Professor of Computer Science at the University of Central Florida. He holds a dozen patents for machine learning innovations, including open-ended and evolutionary (especially neuroevolutionary) machine learning approaches. Today’s episode does get fairly deep into the weeds of machine learning theory at points, so it may be best suited to technical practitioners. That said, the broad strokes of the episode could be not only informative but, again, life-perspective-altering for any curious listener.
In this episode, Ken details what genetic machine learning algorithms are and how they work effectively in practice; how the objective paradox, whereby you fail to achieve the very objective you seek, is common across machine learning and human pursuits; and how an approach called novelty search can lead to superior outcomes relative to pursuing an explicit objective, again for machines and humans alike. He also talks about what open-ended AI is and its intimate relationship with artificial general intelligence, a machine with the same learning potential as a human. And he talks about his vision for how AI could transform life for humans in the coming decades. All right, you ready for this extraordinary episode? Let’s go.
Ken, welcome to the Super Data Science Podcast. I’m so excited to have you here. Where in the world are you calling in from?
Ken Stanley: 00:02:52
Thank you for having me. I am right now sitting in San Francisco where I live.
Jon Krohn: 00:02:57
Nice. And so we’re connected by Jeremy Harris, who is one of my favorite guests on the Super Data Science Podcast of all time. He was back in episode number 565, and we talked a lot about artificial general intelligence and the potential dangers that lie ahead as AGI arises. So for listeners who aren’t aware, artificial general intelligence is this concept of an AI algorithm that has all of the learning capability of a human brain, though Ken might be able to define that better than I can.
But Jeremy was on the show in episode number 565, talking about AGI, and it ended up being this incredible conversation. The edited footage is two hours long, but Jeremy and I also talked for about two hours before we even started filming, and we talked for two hours after. And in that conversation afterwards, I said to Jeremy, “Jeremy, you’re the host of the Towards Data Science Podcast, another very popular data science podcast. I’m sure you’ve had countless amazing guests over the years. Do you have anyone that I need to have on Super Data Science?” And Jeremy said, “You’ve got to get Ken Stanley on the show.” So now here you are.
Ken Stanley: 00:04:13
Great to be here. Nice of him to recommend me.
Jon Krohn: 00:04:17
Yeah. Ken, you wrote a book called Why Greatness Cannot Be Planned: The Myth of the Objective. So I love the idea of this book. I haven’t had a chance to read it yet, but my understanding is that even in the title there, you’re making a connection between the kind of planning for greatness that humans like to do and tying that to the objective function that we use in almost all machine learning algorithms in order for that machine learning algorithm to be optimized. Is that correct? Is that the kind of connection that you’re making in this book?
Ken Stanley: 00:04:49
That is correct, and there’s actually a pretty long story there, which I’ll try not to make long. It goes back to the fact that, previous to writing this book, I was really just an AI researcher. That’s basically all I was doing, AI research, and so I was using objective functions and things like that. And we discovered, through experiments that we did, some really interesting facets of objective functions that are very counterintuitive, which included the insight that sometimes you cannot do very well at getting the objective function maximized by actually trying to maximize it. Another way of saying it is that a better way to get, say, some algorithm performing the way you want it to perform might actually be not trying to maximize or optimize the objective.
That was very counterintuitive. We saw this through several experiments, including at first experiments that involved humans in the loop, one of which was called Picbreeder. And that observation originally was just an observation about AI and machine learning, but to me it was really profound. I thought, “Wow, that’s just totally counterintuitive and probably really important to know.” And I spoke about it a lot at AI conferences. Actually, we created a whole new algorithm called novelty search because of it, but in the course of doing that over a few years, I started to appreciate that it’s not just about AI and machine learning.
Because conversations would veer off, questions at the ends of talks would veer off to like, “Well, but what does that mean for my life or what I do? I also have objectives.” Does this apply more broadly? Or bigger questions, like how does society work? Because society is very objectively oriented, as are institutions, which decide what to do based on whether an objective is actually being maximized or whether it’s being satisfied. And it started to dawn on me that this is a really big, broad topic.
It also affects people personally, because one of the biggest, most impactful interactions I had early on, before the book, was with a bunch of artists when I was speaking at the Rhode Island School of Design. It brought into focus for me that this was an emotional, personal, psychological issue, not just an issue about practical “How do I get things done?” There was a very cathartic reaction when I described some of these things that I had observed in algorithms, because they were saying, “Oh, this finally justifies, in some way, something that I haven’t been able to justify, which is, ‘Why am I doing what I’m doing?’”
And you could see this would be more of a problem for artists than software engineers or something, because their parents are like, “What is this for? Where are you going with this? What is the point of this?” It’s like, well, there are some things where you don’t have to define an objective, and actually you might get to a better place if you don’t. And that really validated, I think, in a way, some of the life choices that people were making. So all of this combined in my mind to think, “Man, there’s a much bigger implication,” and I thought this was super exciting. I was like, “When has there ever been an algorithmic insight that leads to social critique or to understanding yourself better?” And then I thought, “Well, but it should be that way.” This is artificial intelligence. You’d think if we make any advances or have any deep insights in AI, that should actually lead us to understanding ourselves better, shouldn’t it? And so in some way you would expect that that is an implication that will naturally emerge from it. The more we learn about AI, the more we learn about ourselves.
And so I just thought at some point this has to be a book because you can’t really talk about the social implications, the larger implications for personal objectives, for institutional objectives, by writing an AI paper, because that just goes to the AI community. So the only thing I could think of doing was writing a book, which would be trying to trigger a broader social conversation across society about how we run things. Because I think the insights here suggest that we don’t run things very smartly, especially when we’re aiming for innovation or discovery.
Jon Krohn: 00:08:51
Wow. Okay. So this ties to an idea that I’ve had recurringly while I’m on runs or in the shower, that kind of thing. I had been thinking about this talk in my head where I wanted to relate maximizing a reward function, a kind of objective function common in deep reinforcement learning, to everyday life. So with deep reinforcement learning algorithms, we have an algorithm that can typically explore an environment, and then you have some objective, typically defined as a reward.
So as a simple example, if you’re training an algorithm to play a video game, say you’re training it to play Tetris, you want it to maximize its score in Tetris. So that’s your reward, and so you typically allow your deep reinforcement learning algorithm to explore various kinds of actions that it could take and try to learn which actions will lead to maximizing that score in Tetris, maximizing that reward.
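To make that loop concrete, here is a minimal sketch of tabular Q-learning, the kind of reward-maximizing procedure being described, on an invented toy environment standing in for a game; the corridor, rewards, and hyperparameters are all illustrative assumptions rather than anything from the episode.

```python
import random
from collections import defaultdict

# Toy stand-in for a game like Tetris: a corridor of 10 states where the
# agent earns +1 per step to the right and +10 for reaching the goal.
GOAL = 9
ACTIONS = (-1, +1)  # step left, step right

def step(state, action):
    next_state = max(0, min(GOAL, state + action))
    reward = 10.0 if next_state == GOAL else (1.0 if action == +1 else 0.0)
    return next_state, reward, next_state == GOAL

Q = defaultdict(float)                  # Q[(state, action)] -> value estimate
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # learning rate, discount, exploration

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action,
        # occasionally try a random one.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Nudge the value estimate toward reward plus discounted future value.
        best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

print("best action from the start state:", max(ACTIONS, key=lambda a: Q[(0, a)]))
```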
And so a few years ago, as I was learning a lot about deep reinforcement learning, it kept occurring to me that I was making this parallel to my own life. I was thinking about, what is the reward function that I am optimizing for? If I could distill it down to one thing, it would be something like contentment, or happiness, if you can define what happiness is. I’ve been kicking around in my head for years the thought of doing a presentation, similar to what you’re describing, a talk that is not necessarily designed for a data science audience or an AI audience, but maybe for a general lay audience, explaining, at a high level, this idea of reward functions and then saying, “We’re kind of like this as well. Whether we’re aware of it or not, we are probably spending some or a lot of our time trying to take actions to maximize some abstract reward that we may have in our subconscious.”
And so I haven’t actually created that presentation, and it sounds like, from what we’re hearing from you today, that maybe I should rethink doing it at all, if this idea of maximizing a reward function might not get me where I want to go anyway. So in terms of your social… We’ll get back to the machine learning stuff in a second, and maybe you’ll end up even referring to machine learning in your answer. But in terms of what I, as a person, should be doing to achieve my goals, what should I be doing instead of trying to climb some reward function?
Ken Stanley: 00:11:40
Right. And this gets to some subtle points. Because usually the first knee-jerk reaction, if you hear the general point that objectives actually can be self-defeating, or, as I like to call it, the objective paradox, that setting an objective actually causes you not to achieve the objective, the knee-jerk reaction is like, “Well, what are you suggesting, we just go around randomly or something? We need some kind of guidance. You can’t just be random.”
But I think it’s important to elaborate that clearly the lesson is not that you should just be random. That’s not the point. And I think especially for machine learning researchers or reinforcement learning researchers, the mind tends to go in that direction, because often the word exploration really is associated with just taking a random step. We tend to think of this exploitation-versus-exploration dichotomy in machine learning, and we think exploitation is the principled thing, following the gradient, while exploration is just doing whatever and hoping for something good to happen. That isn’t a good dichotomy.
And actually, that’s why I don’t really like to analogize this insight with exploration versus exploitation, because I think that framing is really underselling exploration. The real insight here is that exploration is a very principled and rich thing that we do as humans, intuitively and instinctively, and it is not just taking random actions. What we do is we tend to follow, and you tried to distill it down into a word like happiness or contentment, the word that I often use is interestingness. We tend to follow a path that we find interesting, and you have to understand that the word interesting is really just distilling a huge array of amazingly intelligent capacities that we have. It would currently be AI-complete, in some way, to actually formalize what interestingness is. And, by the way, it’s different for every individual human.
But we have a very good instinct for interestingness. It is kind of domain-dependent. If you’ve spent your life thinking about gardening, you’ll be very good at understanding what might be interesting in that space compared to someone who has not done that. But within the domains where you’re familiar, you have a very good nose for the interesting, and that’s the thing that I think we tend to discount way too much. You kind of say, “But it’s not principled,” because the thing about it is you can’t measure it. That’s what makes people feel uncomfortable, and that’s why we like objectives so much: they’re measurable. We can measure progress towards an objective. We call that assessment.
And so we love assessment. That’s a very popular idea in our culture, and I think it’s basically a security blanket, because we’re very insecure about the possibility that something bad might happen, and so we want to do everything we can to guard against that possibility. But that breaks down when you’re trying to do something that’s exploratory. Now, there are things where objectives do make sense. I want to make that caveat clear. I agree that in certain situations, which I usually characterize as having basically modest goals that are very realistic, then yeah, it makes sense to measure progress along an objective, and you should.
So the kind of argument I’m making really applies to situations where you’re going somewhere and you really don’t know how to get there. I call these more ambitious situations, or situations where you’re trying to innovate or be creative. In those situations, the objective can be very bad for you, because it basically stops you from seeing all the other options that you have. And the problem is the objective itself is deceptive. So the compass that you’re using, which is measuring progress towards the objective, is actually misleading you rather than leading you in the right direction.
And this happens all the time. Of course, deception is just pervasive in all complex problems. And if you think about it, the reason they’re called complex is because they are deceptive. In other words, it may look like, as you measure progress, things are going up, but actually you’re heading towards a dead end. There are some extreme examples you could think of. For example, if I were trying to get to the moon but I began by climbing a mountain, my objective function would go up for a while. In fact, when I get to the top of the mountain, I have reason to celebrate: I’ve achieved something substantial, but it has nothing to do with getting to the moon. And this is a really cautionary tale, because a lot of the things we’re doing, in some of the most complex problems that we have, are exactly like that. They’re highly deceptive. We see progress, and even rapid progress, and we celebrate and think we’re going to go all the way, but actually we’re just on a deceptive local optimum, and that’s obviously a problem.
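A toy illustration of that kind of deception (our own construction, not from the conversation): a greedy hill climber on a one-dimensional landscape with a small “mountain” near the start and the real prize far away. Following the objective’s local gradient ends at the mountaintop.

```python
# A deceptive 1-D landscape: a modest "mountain" near the start whose slope
# rewards climbing, and the true global optimum far away at x = 9.
def objective(x: float) -> float:
    mountain = 5.0 * max(0.0, 1.0 - abs(x - 1.0))    # local optimum at x = 1
    moon = 100.0 * max(0.0, 1.0 - abs(x - 9.0))      # global optimum at x = 9
    return mountain + moon

def hill_climb(x: float, step: float = 0.1, iters: int = 200) -> float:
    for _ in range(iters):
        # Greedy: move only to a neighbor that scores at least as high.
        x = max([x - step, x, x + step], key=objective)
    return x

x_final = hill_climb(0.0)
print(f"stuck at x = {x_final:.1f}, objective = {objective(x_final):.1f}")
# Prints x = 1.0: steady "progress" on the objective led to the mountaintop,
# nowhere near the moon at x = 9.
```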
So objectives can be very misleading in that way, when we don’t know what the stepping stones are that will lead us from where we are to where we want to be. So the alternative is: let’s go to places that we find interesting. Now, that’s different than random, right? Because interesting is based on information, tons of information, your entire life basically, plus everything that biology has endowed you with over the eons of evolution. All of it goes into what you find interesting today, right now, in this second. So it’s not by any means random, it’s highly rich information, and people tend to be very good at it. But the problem is, and this is the thing you have to understand, which is subtle and kind of counterintuitive: if you follow gradients of interestingness, you don’t know where you’re going. You have to accept that.
So there’s no guarantee that, by following what you find interesting, you will get to something that you a priori already thought was your objective. Of course it won’t guarantee something like that. What it does do, though, is increase the probability that you’ll actually encounter something that’s good. It may just not be the objective that you had. So the shift you have to make is: if I want to achieve something great, following interestingness can be a very good formula, but I can’t guarantee, and I can’t make it very likely, that I’ll achieve a particular great thing.
And there’s nothing we can do about that, which I think is really interesting. There’s no formula that exists on this planet, or in anybody’s mind, that will make it very, very likely that you will achieve something that no one has any clue how we’re ever going to accomplish. There is just no formula for that. What I’m proposing is a formula for making it more probable that we will, over time, uncover many of these things that we find interesting, but without knowing which one will happen or in what order.
Jon Krohn: 00:17:55
Trying to create studio-quality podcast episodes remotely used to be a big challenge for us with lots of separate applications involved. So when I took over as host of Super Data Science, I immediately switched us to recording with Zencastr. Zencastr not only dramatically simplified the recording process – we now just use one simple web app – it also dramatically increased the quality of our recordings. Zencastr records lossless audio and up to 4K video and then asynchronously uploads these flawless media files to the cloud. This means that internet hiccups have zero impact on the finished product that you enjoy. To have recordings as high quality as Super Data Science yourself, go to Zencastr.com/pricing and use the code SDS to get 30% off your first three months of Zencastr Professional. It’s time for you to share your story.
That is brilliant and so well articulated. I get exactly what you’re saying, and what you have been describing is immediately, incredibly reassuring to me personally about the way that I have been living my life, which is that I have periodically made drastic changes in my career path into things that I would never have imagined a year or two earlier would even be an option. And a lot of that, for me personally, I find, and I guess this wouldn’t be surprising for somebody who’s the host of a podcast, is that one of the things that I find most interesting in the world is having interesting conversations with people. That is the peak of the mountain that I’m often trying to reach in a given day or a given period of time.
And so I’ve dramatically changed my career at a number of key points as a result of meeting somebody that I find extremely interesting and following them at whatever they’re doing, even though it’s something that I hadn’t planned on doing before. So eight years ago I met a guy named Ed Donner who was founding a startup and I was in a very comfortable corporate job. I had not imagined at that time that I would go and join a startup, but I just found this guy that had founded this company, Ed Donner, just so interesting. And I was like, “I want to be having conversations with this guy every day. And so whatever you’re doing, I’m in, and sign me up.”
And the same kind of thing happened when choosing my PhD supervisor. It was in an area that I had very little experience in previously, but I’m so delighted that I chose to work with him, because that then led to machine learning expertise that I previously had not really developed formally in my academic career. So super cool to hear that. And I also like it as kind of… Do you agree, Ken, that this idea of following the interestingness gradient at any given moment applies to decisions big and small? Your examples so far have related to big questions, big ideas, and not being able to plan for greatness over the course of a lifetime. But do you think that it also works even just for small decisions, daily decisions that can come up?
Ken Stanley: 00:21:19
Yeah. And I want to acknowledge that what you were describing in your own life is something I would call maybe opportunism, or the ability to pivot. And that’s certainly a characteristic of people who are less objectively driven. It’s very hard to pivot in your life if you’re just completely transfixed by an objective. If I say, “Oh, something else interesting is over here,” it’s very hard to then say, “Oh, forget the objective.” But there are some people who are just more inclined towards that: “Well, I see an opportunity. It’s just interesting. Let’s just go with that.”
And that ability tends to lead to the most greatness and the most interesting outcomes. So for this question of whether it applies both to big and small, the one thing that we have to go back to, which is just important to remind ourselves, is: what do we mean by small? If we just mean that you’re trying to get to a modest objective, then I would say no. I think for modest objectives, you certainly don’t have to, or even want to, follow gradients of interestingness. I just think it’s an important caveat.
So take something like, I want to get in better shape. Well, millions of people have done that, and we know what to do. It’s true, there could be some innovative approach that we haven’t thought of, but generally speaking, going out for a jog, eating more healthily, these are well-known stepping stones that lead to better health. And so we wouldn’t want to read our book and then be like, “Ah, forget it. I’m just going to try to do something more interesting than that.” You don’t have to do that. So this really is about exploratory behavior. But within that context, then I would say yes. If it is a case where you’re just exploring, and it could be more daily-life type of stuff that’s not going to profoundly change the world (although who knows, because you’re exploring, but probably not), then yeah, why not actually take small opportunities, go down paths that other people would see as not necessarily exciting but that you just thought were interesting at the time. I think that’ll lead to having a more interesting day. That’s most likely to be the case.
And then maybe that’s a big enough win, and that’s all you really care about. So I do think that this can apply at the micro scale as well. But you just have to remember that, if you really are trying to achieve something and it’s modest, then I wouldn’t necessarily think like this. For something like driving to work, clearly, you don’t want to be exploratory. Just do what you know you need to do and get to work. But yeah, if you’re trying to have an interesting day, then yeah.
Jon Krohn: 00:23:47
All right. So we’ve talked now to a great extent about what humans can be doing to be exploring, to be following interestingness, and then achieving great things that we couldn’t have planned for. I think that that is awesome, and I’m sure that we will come back to this idea in human life recurringly throughout this conversation. But let’s jump now a bit and talk about it in machines.
So you mentioned earlier in the episode, briefly, that the initial research that led to all of this human-greatness insight was your discovery that having a machine learning algorithm try to follow an objective was not getting you there. It was not getting you to the objective that you were trying to reach. And so you came up with something called novelty search.
So, what is this novelty search algorithm? And how does it work differently? It sounds like it explores differently than other kinds of machine learning algorithms, and that allows it to achieve great outcomes that wouldn’t be possible if we were strictly following an objective function, one that could, say, lead us to the top of a mountain when we were trying to get into outer space.
Ken Stanley: 00:25:07
Yeah, that’s right. This all started with algorithmic insights. And you also have to keep in mind, this is stuff that happened around a decade ago, so you have to put it in the context of where the field was back then, because some of these ideas have percolated into the field since. So they may not be as novel now as they were 10 years ago, but I think they’re still provocative, so you’ll still probably find it interesting. So basically, I just want to set up what was going on at that time. I believed in objective functions. I wouldn’t have said it like that, because I wouldn’t even have been thinking about this question. Basically, I thought the objective function was a way that even-
Jon Krohn: 00:25:43
There’s nothing to believe in. It just is the way.
Ken Stanley: 00:25:43
Yeah, it’s just obvious. Yeah, that’s just what we do. And so I wasn’t particularly trying to question them or anything like that. But we did this experiment with… It was a very strange topic, which was breeding pictures online. We created this website called Picbreeder.
Jon Krohn: 00:26:04
What’s that?
Ken Stanley: 00:26:04
Okay. Okay. So what is Picbreeder? I probably need to give a little background on what that even is. Picbreeder was a website where you could breed pictures. Okay, so what does that actually mean?
Jon Krohn: 00:26:17
That’s it. That’s what we eventually get to understand here.
Ken Stanley: 00:26:20
Yeah, so basically, it had an archive of images that people had bred, and you could pick one of those images and breed it further. And what that would mean would be like-
Jon Krohn: 00:26:31
Well how do you breed an image? What does that mean?
Ken Stanley: 00:26:33
Right, right. So what it means, just at the surface level, is that I could say I want this image to have children, and the site, which is basically an online service, would generate children for that image.
Jon Krohn: 00:26:46
Okay.
Ken Stanley: 00:26:47
And basically they’re just mutations. The way that, if you have children, they look kind of like you but not exactly, the image would also have children that look kind of like it, but not exactly.
Jon Krohn: 00:26:56
I see. And so-
Ken Stanley: 00:26:57
This-
Jon Krohn: 00:26:58
Is there asexual breeding, where an image just multiplies into children, or do there always have to be two images that breed together, and-
Ken Stanley: 00:27:09
You could do either one. Sorry, you could do either one. But mostly I think people were doing asexual. Asexual is just simpler and easier. It isn’t really that important to what happened. But yeah, I know asexual reproduction is very exotic and weird from our perspective, but within this kind of system it doesn’t really matter. You can do lots of exploration with asexual reproduction.
Jon Krohn: 00:27:31
Right.
Ken Stanley: 00:27:32
And so, yeah, usually single parents would have children, and then you can just keep iterating. You’re basically just breeding. It’s like breeding horses or breeding dogs or something, except it’s very fast, of course, because you get instant gratification.
Jon Krohn: 00:27:44
Yes. And so there’s less to clean up.
Ken Stanley: 00:27:49
Yeah. Well, that’s true. Yeah. It’s much cleaner than real biology experiments.
Jon Krohn: 00:27:55
Yeah.
Ken Stanley: 00:27:56
And so, why did we… So basically what happened… One other interesting thing is where it all began: from a bunch of random blobs. So if you just said, start from scratch, where does all of evolution begin? You’d get a bunch of random blobs; they don’t look like anything. What happened, though, is that through all this breeding, and recall that part of what can happen is that I can branch from something that you had bred. So you would publish your discovery to the site, and I could go and look at that and say, actually, I want to breed that further. So we got people branching upon people branching upon people, which creates what biologists would call a phylogeny. This tree of life basically was growing inside the system. So why did we do this?
There’s actually a meta-story here which points back to why greatness cannot be planned, because it was one of those things that, from an AI perspective, is a little hard to justify. It’s like, what is the objective of all of that? Why are we creating a picture-breeding service? And basically, my theory was: I’m not exactly sure what’s going to happen, but it’s going to be really, really interesting. That’s pretty much what was motivating me at the time. Because I had seen these things that are sometimes called genetic art or evolutionary art, but to actually put it on an online service where it just goes forever, where we could just keep branching off each other’s discoveries, and with a very interesting representation under the hood… That’s the other thing: how does that work under the hood? There’s a special kind of neural network called a compositional pattern-producing network, or CPPN, which is generating these images, and it’s basically designed to generate patterns with regularities and symmetries.
And so it was made for exploring geometric patterns. And I thought, put all this together, put it out to people crowdsourced across the internet, and it’s going to be just super interesting. And in fact, it was profoundly interesting, because from that experiment, thousands of images were bred. And the first interesting thing that happened was that, remember, it starts with totally meaningless blobs, people were breeding things that looked incredibly like stuff that we know from the real world, like butterflies, and skulls, and cars. And at first you may say, well, they’re breeding, it’s not such a surprise. But actually it’s a huge surprise, because it turns out that it’s an extremely deceptive space and these are very, very rare needles in a haystack. If you think of the whole space of this compositional pattern-producing network representation, it’s like 99.99999999% complete garbage.
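For intuition, here is a heavily simplified sketch of that setup. Real CPPNs in Picbreeder evolve their network topology (via NEAT), whereas this toy fixes a tiny coordinate-to-brightness function with eight mutable weights; every function and constant here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(genome, size=64):
    """Query a tiny CPPN-like function at every (x, y) pixel coordinate."""
    xs = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(xs, xs)
    w = genome
    # Compositions of pattern-friendly functions give the regularities
    # and symmetries that make outputs look structured rather than noisy.
    h1 = np.sin(w[0] * x + w[1] * y)                    # periodic stripes
    h2 = np.exp(-((w[2] * x) ** 2 + (w[3] * y) ** 2))   # Gaussian bump
    h3 = np.sin(w[4] * np.sqrt(x ** 2 + y ** 2))        # radial symmetry
    return np.tanh(w[5] * h1 + w[6] * h2 + w[7] * h3)   # brightness in [-1, 1]

def children(genome, n=9, sigma=0.3):
    """Asexual breeding: each child is the parent plus small weight mutations."""
    return [genome + rng.normal(0.0, sigma, size=genome.shape) for _ in range(n)]

parent = rng.normal(0.0, 1.0, size=8)  # a random starting "blob"
brood = children(parent)               # a user picks a child and breeds onward
images = [render(g) for g in brood]
print(images[0].shape)                 # (64, 64)
```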
So over and over and over again, these infinitesimally rare needles in a haystack were being found consistently by our users, and they’re amazing likenesses of things. And you might think, well, breeding could get you that. But it’s actually not true. It turns out that if I told you to breed a horse, you’ll just never be able to do it. And this is starting to get to this point about the problem with objectives. You can see how it starts to fall out here, because I noticed that people can’t get to things when they want to, and yet they’re finding all these things. And so the question, which I thought was really profound, and whose answer was discovered inside of Picbreeder, was: how then are they finding things, if you can’t find things by looking for them? If you set an objective like, I want to get a butterfly, forget it, it’ll never happen. It’s impossible if you’re starting from scratch. But people did discover a butterfly, and, over and over again, all kinds of interesting things.
And it turned out that the answer was that people were finding things by not looking for them. Now, this is totally counterintuitive, and it goes against everything that I had believed. I basically believed in objective functions. I would’ve thought, okay, the way to get butterflies is to try to get a butterfly, and the closer the image gets to a butterfly, the better I’m doing. I mean, that’s maximizing the objective function, but it actually doesn’t work that way. The reason it works is because people weren’t trying to get butterflies, which means that the things that lead to butterflies, which I call the stepping stones, were being discovered precisely because people weren’t looking for butterflies. Because if they had been looking for butterflies, they wouldn’t have chosen those stepping stones.
See, it’s very counterintuitive, because they don’t look like butterflies. The things that lead to butterflies don’t look like butterflies. Just like the things that led to inventing computers, vacuum tubes, don’t look like computers to us; we wouldn’t think that those would lead to computers. And this is true of just about every interesting invention ever invented by humankind. And it happens in Picbreeder. But the original revelation, observing this in Picbreeder, was an epiphany for me. It blew my mind completely, because it goes against everything I’ve ever been taught, especially in the field of machine learning. We do things by setting them as objectives and then maximizing, or you could say minimizing the loss, or however you want to put it. That’s how we do things. In evolutionary computation it would be maximizing the fitness function.
And this… that doesn’t work at all. The only way to do these things is by not trying to do them, because the stepping stones will be lost if you do try to do them, because they’re deceptive. Deceptive in the sense that the things that lead to the things you want don’t look like the things you want. So if I’m maximizing for looking like what I want, then I won’t get it. It’s very paradoxical.
And I spent weeks just… My mind was consumed with this insight for weeks, because it was just so bizarre and counterintuitive. And I started thinking, what would an algorithm that respects this actually look like? Because it’s totally nuts. What I was thinking was: imagine a car going around a track. I’ve done this experiment before, where I basically try to maximize the distance that the car goes before it crashes. And eventually, according to the way we think about objective functions, the car will go all the way around the track. And that’s probably true, because driving a car around a track isn’t super hard in some simulator or something like that. But I thought, what if we-
Jon Krohn: 00:33:33
This is not something… You’re not demolishing a series of cars on a track every weekend.
Ken Stanley: 00:33:39
So I thought, yeah, it has to be a simulator. But I thought, okay, let’s think about it in the Picbreeder way. And this is pre-novelty-search; I’m just thinking, what would that be like? So I thought, okay, instead of trying to get around the track, which is the normal objective function, what if I just try to do something new that’s interesting? I don’t even think about what’s bad or good; there’s no concept like that. What would happen? Well, the car would basically crash into the side of the track immediately. It doesn’t know what it’s doing. But then I would say, okay, let’s have a rule: you have to do something different now. Because in my head I was like, that’s kind of what Picbreeder users are saying. They want to find something new that’s interesting.
So now let’s say the car tried to follow that heuristic. The car now can’t crash the same way; it has to try to crash a different way, or not crash, but do something different. Well, likely it would just crash again, but it would crash in a different way. So it’s going to crash over here, it’s going to crash over there, it’s going to crash all these different ways you can crash. But guess what? There’s going to be this amazing moment, this threshold that we cross, where it won’t crash, right? Because it’s going to run out of ways to crash; someday it’s going to happen that it’s tried all the boring, easy things to do, and it’ll be forced then to actually understand what a track is, to actually look at its visual system and respect it in some way. And this is a special, profound change that will happen, but it’s inevitable, it seemed to me. It has to eventually happen.
And then I was thinking, is that better or worse than the usual way of thinking about this? And the usual way of thinking is just maximize: every time you go, you try to go a little bit farther. But here I’m thinking, just do something new. And I started thinking this is actually better. Because if you think about it, I don’t actually know what leads to going farther down the track from a control perspective, from a neural network weights perspective. It’s not clear that just because I went a little bit of the way, I’m actually close to going a lot of the way. And it’s not even clear that’s going in the right direction in weight space, which is this abstract space of the multidimensional settings of weights in a neural network.
So if you think about that, if the actual stepping stones that lead to getting around the track could be deceptive and counterintuitive, then this is actually a principled thing to do: just try something new. We don’t even have a concept of getting around the track as something we’re trying to maximize; we don’t even know we’re trying to do that. Just try something new.
If you keep trying something new, you’ll run out of stupid things to do. Eventually, you’ll do something that’s not stupid. And eventually, you’ll do something that’s actually interesting, and so forth. And so that started going through my head: oh, this is an interesting way of thinking about progress in the world. It’s quite different than maximizing, or going down an objective gradient. And that led naturally to this novelty search idea. So together with Joel Lehman, who was one of my PhD students at the time and eventually became my co-author on the book, we thought, hey, we could write this as an algorithm. It’s basically just searching for novelty.
And it’s a really interesting algorithm, because it’s kind of counterintuitive and paradoxical, and crazy in a way: the algorithm doesn’t know what it’s trying to accomplish. All it’s trying to do is do something different than it’s done before. And so it’s completely anti-objective. But here’s the hypothesis: if we run this, it will end up doing things that are actually interesting and useful, especially in constrained domains, like a car on a track, or say a biped robot trying to learn to walk. I believed that it would actually learn to walk. I should add, and it’s very important to understand, that in a very unconstrained domain, this isn’t going to give you a single objective that you want. This relates back to the point I made earlier about life: if you go with the interestingness heuristic in your life, you can’t think about it as, I’m going to solve problem X.
What you think is, I’m going to do something interesting. I might solve some problem, but I don’t know which one. This is true of the novelty search algorithm as well. So you don’t want to think of it as a magic bullet, like, “Oh, well, this is going to solve all our problems now,” because we don’t know which problem it’s going to solve. But in a very constrained domain, it actually does have the property that sometimes it just solves the problem, because there’s not much else to do. For the biped robot, it’s like: it falls down, it falls down, it falls down in a new way, it falls down in a new way, but eventually it doesn’t fall down, because that’s the only way to be novel. And eventually it will go some distance, and that’s basically it, it’s learned how to walk. So in constrained domains like that, this actually can give you a solution.
But in the bigger picture, the novelty search algorithm is really about finding interesting things that are out there in the search space, and kind of plundering interestingness. At the time, though, we showed it in experiments in these narrow domains to make this very counterintuitive point that, look, by not having an objective, you get better solutions. And that was the fun part of it. If we tried to optimize the objective function, like telling the biped robot to walk as far as you can, it would actually get you worse walking gaits than novelty search, which doesn’t know what it’s trying to do. And the lesson, I think, is not that novelty search is the solution to all your problems. Rather, what I take from it is that objectives are an absolute embarrassment. They should not be losing to an algorithm that doesn’t know anything about what it’s trying to do.
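A minimal sketch of the novelty search loop as it is commonly described, with every domain detail invented for illustration: each individual’s behavior is summarized by a descriptor, novelty is the mean distance to the nearest previously seen behaviors, and selection rewards novelty alone, with no task objective anywhere.

```python
import math
import random

def novelty(behavior, archive, k=5):
    """Novelty = mean distance to the k nearest behaviors seen so far."""
    if not archive:
        return float("inf")
    dists = sorted(math.dist(behavior, b) for b in archive)
    return sum(dists[:k]) / len(dists[:k])

def evaluate(genome):
    """Hypothetical domain evaluation: run the individual and summarize its
    behavior, e.g., the (x, y) position where a robot or car ended up."""
    return (sum(genome[:2]), sum(genome[2:]))  # toy stand-in

def mutate(genome, sigma=0.2):
    return [g + random.gauss(0.0, sigma) for g in genome]

population = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
archive = []  # remembers where search has already been

for generation in range(50):
    behaviors = [evaluate(g) for g in population]
    scores = [novelty(b, archive) for b in behaviors]
    # Grow the archive with sufficiently novel behaviors so that
    # "new" keeps meaning new as the search proceeds.
    for b, s in zip(behaviors, scores):
        if s > 1.0:
            archive.append(b)
    # Select parents purely by novelty; no task objective anywhere.
    ranked = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
    parents = [g for _, g in ranked[:5]]
    population = [mutate(random.choice(parents)) for _ in range(20)]

print(f"archive holds {len(archive)} distinct behaviors")
```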
Jon Krohn: 00:38:45
Right.
Ken Stanley: 00:38:46
That just should be ringing alarm bells all across the field about what we think, what we believe and have faith in, as sort of intuitively the practical, common-sense approach to achieving something that’s hard. And that was how I tried to couch it in the early papers: this is very concerning for our way of thinking, but also could be useful. It’s liberating and concerning. If you think about it, if you don’t have to always be adherent to an objective, your life is a lot… I mean, I shouldn’t say your life, ‘cause we’re talking about algorithms here, but we’re a lot less constrained in the way that we can think about achieving things. So a lot of creativity becomes possible. But that was the early genesis of the novelty search algorithm and why it was kind of controversial. You can imagine, because it’s sort of annoying to people who really do like this kind of objective-maximization framework, but it was also liberating and opened up a lot of new ideas and led to entirely new fields, like quality diversity, which is now a new field that drew on those ideas.
Jon Krohn: 00:39:42
I think it’s really cool. I have an intuition that I’d like to share, and I’m curious whether you think it’s related to why exploring novelty works so well. So when you were talking earlier about blobs randomly forming a butterfly, even though you couldn’t have the objective of them forming a butterfly, that got me thinking that part of why this approach might work so well, and again, open to your feedback, maybe I’m completely off base here, but it just kind of struck me that there are a lot of interesting things that could happen, whether it’s in your life or to a machine learning algorithm.
You know, you could have a butterfly emerge randomly, or a horse, or a skull, or a flower, or the other kinds of examples that you gave. Any number of things that we find interesting could emerge randomly. And so there’s this huge space of possible interesting things that is, I think, again, open to your feedback, probably far more vast than the number of interesting objectives that we could enumerate. At any given time, I could say, for a machine learning algorithm, how we’d like it to perform, or for myself, what I would like to do in the future. I could get out a list and think about it for a while and say, oh, I’d like to write a book, or I’d like to host a podcast, or whatever. There’s this set of objectives that I could possibly think of.
But if instead, I just follow interesting things that happen, after following one or two or a few interesting things, I end up in a place where now more interesting things could happen, that I could never have even conceived of when I was originally sitting down doing the exercise, thinking about interesting objectives in my life. Does that kind of make sense?
Ken Stanley: 00:41:29
Yeah, I do think that sheds light on the underlying intuitions for why this makes sense. I would maybe broaden it and say the real key to a successful open-ended, innovative system is the proliferation of stepping stones. That’s what really gives power to innovation, because every stepping stone in the world is now an opportunity to do something new that builds on that stepping stone. So the more stepping stones you have in your repertoire, your archive, whatever you want to call it, the more powerful your system is. If you think about what Picbreeder is, it’s a stepping-stone proliferator. It’s basically collecting discoveries and then surfacing them to other people, who then can build on those discoveries. That’s also what society is. That’s what the patent database is. We create these archives of stepping stones. That’s what you’re doing in your life, more towards the spirit of the way you just articulated it: I do a lot of interesting things, I surface more stepping stones, effectively. That’s what’s happening.
We are crippled by the lack of stepping stones. If there isn’t a stepping stone, then we can’t do anything about that. And the thing is, following interestingness just uncovers stepping stone after stepping stone after stepping stone. And I agree that many of those may be things that you would not have thought of a priori, so they can only be uncovered by following interestingness. But now that we have them, they’re a jumping-off point. And what becomes the real issue over time, I think, is the surfacing of them: getting the right stepping stone to the right person, who can then take it forward. It’s like passing a baton. That becomes the real issue, because as we proliferate thousands and thousands, even millions, of stepping stones, I can’t show them all to you, so I need to expose you to only the ones that you might be able to build on.
But this is a society-wide benefit. All the stepping stones that we have really are just good for everybody, because they are things that we can all jump off from. And obviously that relates to where we are in history. Today, in 2022, we can go much farther than we could in 1822. It’s because there are more stepping stones, and things will be more powerful the longer the system runs.
Jon Krohn: 00:43:31
And we’re going to come back to that idea later on in the episode. We’re going to try to do some stepping-stone projections, which will be a fun exercise. All right, so I now understand, in an intuitive way, how trying to achieve an objective is not the way to be going about my life: that there’s this objective paradox, and that I should be focusing on following the interestingness gradient at any given point in time to achieve complex long-term goals that will ultimately be more impactful than if I tried to follow some relatively narrow, well-defined objective. And I also now have some understanding of how your novelty search arose, as well as how it’s useful. You already gave the example of novelty search allowing a biped to learn how to walk more effectively than if we used an objective function to do that. Do you have any other practical use cases that would help illustrate for us how novelty search is useful in the real world?
Ken Stanley: 00:44:39
Yeah, so maybe three things. The first is that you can imagine that novelty search, or let’s say advancements on novelty search, is clearly relevant to practical reinforcement learning problems, because they get stuck in local optima. I mean, it’s obviously true, and this is not a novel insight to the field. And the field has implemented exploration in different ways, sometimes called curiosity, which is very related to what novelty search does. And so novelty search clearly plays a role, and can play a role, in just being able to solve problems that we can’t solve right now.
You have to understand, though, that what we’re talking about now, when we say problems, is objectives; we are talking about objectives. And so that means that what we’re talking about is combinations of objective-driven processes with novelty-driven processes. But the novelty-driven process can still be very useful in a combination like that as a practical matter. And in fact, that’s why there’s now a field called quality diversity algorithms, which basically tries to combine them. I do want to say, though, that the most interesting stuff reduces the reliance on the objective. But still, it’s a practical thing, so it can be useful. And so that’s one thing. Another thing, which I think is maybe more interesting, is creative applications.
Let’s say, here’s a robot, and I want to see all the cool stuff it can do. I don’t have a specific behavior I’m going for; just uncover the space of possibilities. This kind of approach is very useful for something like that. Quality diversity has been used in that way to generate what we call repertoires, for example. So say what I’m trying to do is not generate a solution; I’m trying to generate an entire repertoire in a single run, everything this thing could do that’s interesting. Algorithms that built on the ideas in novelty search have been used to do that. MAP-Elites is one such algorithm, and novelty search with local competition is another; both are quality-diversity algorithms, generating lots of interesting stuff all in a single run.
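A rough sketch of the MAP-Elites idea under invented toy functions: discretize the behavior space into grid cells and keep the best solution found in each cell, so a single run accumulates a whole repertoire rather than one optimum. The descriptors, quality measure, and parameters here are illustrative assumptions.

```python
import random

def evaluate(genome):
    """Hypothetical domain: return (behavior descriptor, quality).
    For a walking robot this might be (gait height, stride length) and speed."""
    behavior = (genome[0], genome[1])
    quality = -sum(g * g for g in genome[2:])  # toy quality measure
    return behavior, quality

def cell_of(behavior, bins=10, low=-2.0, high=2.0):
    """Discretize a behavior descriptor into a grid cell."""
    def clip(v):
        return min(max(v, low), high - 1e-9)
    return tuple(int((clip(v) - low) / (high - low) * bins) for v in behavior)

elites = {}  # cell -> (quality, genome): the growing repertoire

for iteration in range(5000):
    if elites:
        # Mutate a randomly chosen elite: every cell is a stepping stone.
        _, parent = random.choice(list(elites.values()))
        genome = [g + random.gauss(0.0, 0.2) for g in parent]
    else:
        genome = [random.gauss(0.0, 1.0) for _ in range(4)]
    behavior, quality = evaluate(genome)
    cell = cell_of(behavior)
    # Keep the newcomer only if it beats the current elite in its cell.
    if cell not in elites or quality > elites[cell][0]:
        elites[cell] = (quality, genome)

print(f"repertoire covers {len(elites)} behavior cells from a single run")
```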
And if you think about it, there’s an analogy there to something like Picbreeder, or even natural evolution, where one of the most fascinating things about those kinds of processes is that it’s all one run. It’s totally crazy when you think about it: in evolution, all these discoveries, human intelligence, flight, photosynthesis, amazing inventions each in their own right, are not separate optimization attempts. It’s not like, “Okay, let’s figure out photosynthesis. We’ll start this big long thing.” No, it’s all part of the same run, totally different from the way we usually think about machine learning. So now this kind of algorithm can be used to do that. And I’d also add really creative domains, like art, music and things like that. It also makes sense to introduce this there, because those are things where we are trying to branch, branch, branch. That’s what we want to do. We want to get to novel genres and ideas all the time. And finally, that segues into the third thing, which I think is human-in-the-loop systems. A lot of the time, what we want to do is not necessarily just have an algorithm sitting in a box alone, but to facilitate human exploration and make it work better in domains that could be creative or could be practical.
And that’s a huge possible win that you can see on the horizon now, with all these image generators that are amazing right now and stuff like that. But the idea of an underlying algorithm basically working in tandem with the human to help them explore spaces, that I think could be extremely powerful. It goes back to Picbreeder, which is a very nascent version of that. We’re just scratching the surface, but novelty search, as a beginning step in that direction, I think has a future in that area.
Jon Krohn: 00:48:37
All right. So those are amazing applications. I love that. From augmenting reinforcement learning algorithms to better maximize an objective, to creativity, to especially human-in-the-loop systems, there are lots of great applications of novelty search. So, a question that I’m going to tie to novelty search in a moment, but that is also related to the idea of opportunities for humans to make better decisions. You’ve probably heard of the 10,000-hour rule, which was made popular by Malcolm Gladwell in his book Outliers.
And so this rule, to summarize it briefly for the audience in case you haven’t heard of it, is that you must practice with objectives and feedback for at least 10,000 hours to become an expert at something. So this could be playing the violin, or being a great athlete, figure skating or something, or even being a great software developer or a great machine learning algorithm developer. So first, in the human sense, do you think that this 10,000-hour rule holds any water? And then, beyond the human level, for an AI agent, is there a difference between intelligence and expertise?
Ken Stanley: 00:50:04
Right. I think that the 10,000-hour idea does hold some water. I would want to disentangle it from the idea of an objective, though, and that takes a little bit of thinking, I guess, because if you say, “Well, I want to be a really great software developer,” that does sound like an objective. But if you really were thinking about it as an objective, you might think of it as this incremental journey of getting better and better at some hard problems or something like that. It’s almost like a textbook, going through school with textbooks and things like that.
What I would be advocating instead is doing the exploration. That’s what you spend 10,000 hours doing. Apply your creativity and your creative vision over those 10,000 hours. What are you going to program next? Well, nobody should tell you. You should do whatever’s interesting based on what you did before. I think you’ll get to a good place, though it doesn’t mean you’ll be objectively, quantitatively, measurably better than someone else. I’m not sure how you would know that.
So in other words, if you take a test, maybe you won’t get the highest score in the class, but I do think you’ll be a much more interesting programmer at the end of that than somebody who went the very strictly objective route, because you’ve traversed all kinds of stepping stones they would never have traversed. And that makes you very unique and interesting in a different way. And if you’ve spent 10,000 hours doing that, yeah, you’re going to be in a place that makes you unique. And being unique is obviously becoming increasingly important these days, with AI and so on.
But just one little exception to that is, of course, that some degree of just doing your homework is important. I mean, that’s obviously true. Just getting the basic fundamentals, doing your exercises, graduating from high school, I do think that’s worth it. And that’s an objective endeavor, and it’s one of those that I would call modest, because we do know what the stepping stones are. And so when you branch out of that and are ready to do interestingness exploration is a personal question, but some of those preliminaries I do think are important.
Jon Krohn: 00:52:15
I guess that’s kind of what happens as we evolve through the education system. The way that it’s designed is that when you’re in primary school, there are these specific building blocks that everyone has to learn. And even in secondary school, that’s true to some extent, though in secondary school you can say, “Okay, I find chemistry more interesting, or I find English literature more interesting,” and you can specialize a little bit more. And then in undergrad there’s even more specialization. You might even have some courses, especially later-year courses as an undergrad, where instead of “you must do these questions in your calculus textbook and be able to answer related questions,” it becomes, “All right, find something interesting, define an interesting research project for yourself. You can ask me as the professor for some feedback on your idea, and maybe I can point you in the right direction.”
And then as you move on to a PhD and then onto a postdoc and then into a faculty position, you’re going more and more away from the guardrails of, “This is the homework that you must be doing,” into, “Explore something interesting and get back to us in a few years. I hope you publish some papers.”
Ken Stanley: 00:53:32
Yeah, that’s true. Actually, I think that the educational system, at least in its current form, is not very conducive to the kind of behavior that I’m describing. I think it’s not good for this. It’s very objective and very regimented.
Jon Krohn: 00:53:52
It seems to only start to really happen at the PhD level, which very few people get to.
Ken Stanley: 00:53:57
That’s exactly true. Yeah, that’s what I was going to say actually. It’s really the PhD level where the first time that I felt somebody really wanted me to do what I thought was interesting or would even be slightly interested in what I would do. Anytime before that, I’ve never gotten this signal in college or high school or even before that anybody really wants me to just go off and do whatever I find is interesting.
And it’s true that there is increasing independence. I agree a project in college is probably more interesting than a project in third grade, but still you don’t get the signal that just go explore stepping stones, do what you want, and the whole idea of test-driven. You’re always trying to get to that next score, just not at all compatible with the idea of pursuing interestingness. And so I think there’s a lot to think about with the educational system and the implications of this insight, not just for how to get people to be better at their field, but also just what is actually good for human nature and good for students as people.
Jon Krohn: 00:55:05
Mm-hmm. Yeah, I couldn’t agree more. All right. So you’ve answered the human side of that big question that I asked and in a really fascinating way. I loved where we took that, especially with the education thing, but then there was this follow up question that I had related to the 10,000 hours, which is for an AI agent, for some kind of machine learning algorithm. Is there a difference between these concepts of intelligence versus expertise or are they kind of the same thing?
Ken Stanley: 00:55:35
I do think there’s a difference. I mean, obviously it’s kind of a matter of definition, but I would think of them as different because I think expertise rides on top of intelligence. So I think…
Jon Krohn: 00:55:46
Yeah. So maybe in your case, with a novelty search algorithm, the ability to do novelty search, this capacity for learning, is a kind of intelligence. For example, and I’m oversimplifying here, you could consider a novelty search algorithm to be more intelligent than most of the common deep reinforcement learning algorithms, because there’s more capacity to learn; it could learn a broader set of things. In that sense it’s a more intelligent algorithm, but the expertise doesn’t happen until it does some learning.
Ken Stanley: 00:56:27
Yeah, the spirit of that point I totally agree with. You might get into some of the nuances of whether deep reinforcement learning today can rival novelty search from 10 years ago, and you might say, well, actually, it does now do novelty-search-like things, and we could get into kind of a hair-splitting argument about that. But the general point is that the intelligence is latent in the style of learning that you have, the ability to absorb things. That’s the real-
Jon Krohn: 00:56:57
How about novelty search versus a deep Q-network, a DQN?
Ken Stanley: 00:57:02
Sure, sure. Yeah. But this also goes to the point that a lot of the points I’m making are very hard to quantify; they’re almost subjective. If I have a style where I tend to move towards novelty and you have a style where you tend to maximize an objective, how do we measure the difference between me and you in a way that says I’m more intelligent than you? It’s not completely clear. It’s somewhat subjective, because what do you even mean by intelligence? I might say, well, I have a higher probability of getting to interesting things.
So if that’s how we think of intelligence, then I’m more intelligent, but someone could dispute that and say, “Well, that’s not really what I think of as intelligence,” but really what’s more important I think is to grapple with the fact that if some things actually are subjective, and that doesn’t mean that we have to ignore them. Maybe I can’t give you a quantifiable way to prove that this way is better than that way. But, look, it doesn’t mean we can’t discuss it. It’s still really important for us to understand and grapple with these questions.
And so I think it does get to this issue of expertise versus intelligence, because this is relevant to modern machine learning and some of the things being done with large language models, where we can inject a huge amount of expertise because we have the training data. But underneath the hood, the actual intelligence, if we think of it as learning capacity, is very alien. I’m not saying it’s worse or better. We could argue it’s worse than human intelligence, and it probably is, but it’s just very alien.
And the implications of that are unclear, because you’ve got this high-level understanding, which is beyond a lot of humans. The model could know more about organic chemistry than most humans, but that understanding sits on an edifice that’s completely different and slippery. That underlying edifice, I think, is the thing that really matters, because if I could get the brain of a five-year-old, or a two-year-old, or even an 18-year-old, that’s a foundation on which I could build any kind of expertise, and it would eventually be expertise as solid as human expertise. So I think we hope to eventually get that kind of intelligence, on top of which the expertise would rest.
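For technical listeners, a small code sketch may make the novelty-versus-objective contrast concrete. In the published formulation of novelty search, an individual is scored not by an objective but by how behaviorally novel it is: the mean distance to its k nearest neighbors in a behavior space, measured against the current population plus an archive of past behaviors. The toy domain, parameter values, and helper names below are illustrative assumptions, not code from any implementation discussed in the episode.

```python
import math
import random

# A minimal, illustrative sketch of the core loop of novelty search.
# The "domain" here is trivial (a genome is a 2-D point and its behavior
# is that point); in the classic maze experiments the behavior would be
# a robot's final position. All parameters are arbitrary toy choices.

K_NEAREST = 5            # neighbors used in the novelty score
ARCHIVE_THRESHOLD = 0.3  # how novel a behavior must be to be archived

def behavior(genome):
    """Behavior characterization: map a genome to behavior space.
    Trivial here; in practice this runs the individual in the domain."""
    return tuple(genome)

def novelty(b, population_behaviors, archive):
    """Mean distance to the k nearest neighbors among the current
    population and the archive -- the score that replaces the objective."""
    dists = sorted(math.dist(b, other) for other in population_behaviors + archive)
    nearest = dists[1:K_NEAREST + 1]  # dists[0] is the zero distance to itself
    return sum(nearest) / len(nearest)

def novelty_search(generations=30, pop_size=20):
    population = [[random.gauss(0, 0.1), random.gauss(0, 0.1)]
                  for _ in range(pop_size)]
    archive = []
    for _ in range(generations):
        behaviors = [behavior(g) for g in population]
        scores = [novelty(b, behaviors, archive) for b in behaviors]
        # Sufficiently novel behaviors are remembered forever, pushing
        # the search away from everywhere it has already been.
        archive.extend(b for b, s in zip(behaviors, scores)
                       if s > ARCHIVE_THRESHOLD)
        # Parents are selected purely for novelty -- no task objective.
        ranked = sorted(zip(scores, population),
                        key=lambda pair: pair[0], reverse=True)
        parents = [g for _, g in ranked[: pop_size // 2]]
        population = [[x + random.gauss(0, 0.3) for x in p]
                      for p in parents for _ in range(2)]
    return archive

if __name__ == "__main__":
    print(f"archived {len(novelty_search())} behaviors")
```

The design point the sketch illustrates is that the archive, not a reward, supplies the selection pressure: behaviors close to anything already seen score low, so the search is continually pushed toward unvisited regions of behavior space.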
Jon Krohn: 00:59:10
Nice. I love that. I am learning a ton from you in this episode, as I anticipated. This is one of those episodes where I know I’m going to be thinking about the thoughtful insights you’ve provided for days or weeks, where I’ll be doing something and find I can actually put into practice these things you’ve been saying. So this has been an amazing conversation so far, and we’re not even done. I don’t want that to sound like I’m starting to wrap up, because I’ve got another couple of really interesting questions for you.
And they are, like many of the questions I’ve already asked in this episode, from Serg Masis, a researcher on our team here who is a brilliant data scientist in his own right. So here’s another one from him. He says objectives inherently constrain our machine learning algorithms so that they don’t draw outside the lines of the training data the algorithm was trained on. Even with these constraints, these machine learning algorithms do unexpected things, but with an open-ended approach, like the novelty search algorithm you’ve been talking about throughout this episode, these algorithms are allowed to draw anywhere. So from a safety perspective, what kinds of safeguards can be put in place to stop open-ended algorithms like novelty search from doing something potentially dangerous?
Ken Stanley: 01:00:35
Well, it is true that as we start to train algorithms, or machine learning systems, or large models with huge amounts of data, they do start to do surprising things, because the dataset is just so vast. And then if you put some kind of novelty imperative on top of that, yeah, you’re going to go places you weren’t expecting. So I think it’s true that it raises safety issues; there’s no question about that. But I would point out that we are entering an era where this will happen even if we don’t explicitly try to make it happen.
If you think about it, what humans seem to want to do is create things that are incredibly intelligent and then release them into an ecosystem where humans and these machines are basically just doing stuff together. That ecosystem will be intrinsically open-ended. There’s nothing we can do about it, even if no explicitly open-ended algorithm is running around in that ecosystem, because the interaction of humans and the machines, and then back to the machines and back to the humans, is going to generate a phylogeny of ideas and discoveries that build off of each other, which is open-ended. It’s inevitable that it’s going to be open-ended, just as human civilization is intrinsically open-ended. It’s just going to add a new element to that civilization.
And the reason I mention that is because it’s not like you can just say, “Oh, well, if we want to be more safe, we just won’t be open-ended.” It’s eventually going to be open-ended. That’s why the question you’re asking, which is, given an open-ended algorithm, how can we try to control it, is incredibly important as a focus for research: to be able to anticipate the world that’s coming, which is going to be an open-ended world where AIs are making suggestions and putting things into the world that are not possible to anticipate right now. And we’re going to be reacting to those in unpredictable ways.
And so the open-ended algorithms are a microcosm for studying this. They allow us to ask these questions now, in the safer, enclosed environment of these kinds of experiments: how can we control these? If you think about it, it’s a very paradoxical question, because open-endedness itself is about not imposing control on something. That’s kind of what it’s about. So if you ask, “Well, how do we control something that we’re not trying to control?”, it’s basically a paradox, but nevertheless it has to be addressed.
And the truth is it’s about nuance and it’s about calibration because clearly you can put constraints on a system and it’s all about how those constraints are imposed that will ultimately determine the safety of the system. And so even novelty search is constrained. In fact, it should be noted that the only truly unconstrained system is one that’s completely random. If you just say the algorithm is allowed to create any variation it wants. And all of those variations will have a chance to see the light of day. Well then we’ll see everything over time. Of course, most of it will be absolute garbage, but it’s not constrained in any way.
The whole point of these algorithms, even if we forget about safety for a moment, is that they are constrained so that the things we look at are more likely to be interesting. If they were completely unconstrained, we wouldn’t get anything interesting, or at least most things wouldn’t be. And so the algorithms being developed, which go way beyond just novelty search, which is old now, are intrinsically about applying constraints to open-ended systems. That is what open-endedness is about. It’s basically asking: what is the constraint that defines interesting in a way that’s both useful and also not scary or dangerous to us? And we can do that. Constraints are all around open-ended systems. Evolution is highly constrained, and evolution is the canonical open-ended system. That’s why I like to go back to it: evolution on Earth, natural evolution. If you think about it, the constraint in natural evolution is basically that you have to be a walking Xerox machine or else your lineage is over. You’re basically walking around with copying apparatus in your belly that can make a copy of yourself, which is absolutely insane as a constraint. That’s a very high bar. If anything is created that doesn’t have that machinery working, it’s going nowhere from an evolutionary perspective.
And so if you think about that constraint, at first it may seem completely arbitrary. It’s like, what the heck? You have to be a walking Xerox machine? Why that, of all things? Well, obviously it has to do with the state of the physical natural world and what could or couldn’t be viable there, but there’s a more abstract way of thinking about it, which is that the fact that it’s a high bar is actually what causes things to be so interesting.
If the bar were low… Imagine as a thought experiment that instead everybody gets to have a child as long as they’re above a certain mass. This can’t happen, obviously, but imagine some god will intervene and give you a child. You may have no reproductive organs at all, but the god will give you a child. It’s just a thought experiment. Well, then of course things would be less interesting, wouldn’t they? All creatures would have to do is become big inert blobs sitting on the ground, and they’d be viable and have children. So the thing deciding where evolution goes further and where it doesn’t is the constraint, and it is actually what causes things to be interesting or not. Furthermore, it can also cause things to be safe or not, but you’re going to be dancing along a very delicate trade-off, because the more you try to make things safe through a constraint, the less creativity and exploration will happen. So how do we traverse that really fine line in a way that preserves the open-ended quality without completely destroying it, but still has safety?
And so that delicate balance is, I think, at the heart of the safety problem we’re facing. If we veer too far towards the safety side, we will lose the open-endedness and we will be in a stagnant system that does not produce creative output. And I’m not saying that’s good or bad. I’m not making a value judgment. That’s just the way things work. So we have to grapple with that and decide what tolerance we have.
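To make that constraint idea concrete in code, here is a toy sketch of a generate-and-filter loop in which the only selection pressure is a survival criterion, a stand-in for Ken’s “walking Xerox machine” bar, loosely in the spirit of minimal-criteria ideas from the evolutionary computation literature. The genome encoding, the criterion, and all parameters are invented for illustration, not an algorithm from the episode or from any specific paper.

```python
import random

def reproduce(genome):
    """Imperfect copying: each gene may drift a little."""
    return [g + random.gauss(0, 0.05) for g in genome]

def meets_criterion(genome, bar):
    """The 'walking Xerox machine' bar: a lineage continues only if the
    genome still encodes enough 'copying machinery' (here, a simple
    norm above a threshold stands in for that machinery)."""
    return sum(abs(g) for g in genome) >= bar

def step(population, bar, cap=200):
    """Generate variation freely, then apply the survival constraint.
    The constraint is the only selection pressure; no objective exists."""
    offspring = [reproduce(g) for g in population for _ in range(2)]
    survivors = [g for g in offspring if meets_criterion(g, bar)]
    random.shuffle(survivors)
    return survivors[:cap]

# Raising `bar` is the safety lever described above: too high and the
# system stagnates or dies out; too low and 'inert blobs' fill it up.
population = [[random.uniform(0.5, 1.5) for _ in range(4)] for _ in range(20)]
for t in range(100):
    population = step(population, bar=1.0)
    if not population:
        print(f"lineage ended at step {t} -- constraint set too high")
        break
else:
    print(f"{len(population)} lineages alive after 100 steps")
```

Tightening `bar` makes the run more predictable but risks extinguishing every lineage, while loosening it fills the population with inert blobs; the trade-off Jon and Ken discuss is exactly the choice of where that bar sits.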
Jon Krohn: 01:06:58
That was a really great answer. So basically the summary is that in order to have safety in open-ended AI systems, we do have to put constraints on them, but we need to be careful about how we do that, because creativity and constraints are inversely related. And that makes perfect sense to me, at least in an intuitive sense. So in your answer there, you spoke a fair bit about evolutionary algorithms. You are an evolutionary algorithm expert, and we’ve talked about them a number of times right from the beginning of the episode, talking about Picbreeder. So what do you think is in store for the future of open-endedness research in AI? Do you think it’s going to be related to genetic algorithms or related to reinforcement learning, or is it going to branch off of something else?
Ken Stanley: 01:07:47
Yeah, so that’s a great question. I do think that the field has a future. And so I would encourage people to learn more about open-endedness as a field. I think that the future of open-endedness has several facets. And the core issue in it is this theoretical question, which I find very fascinating, which is: Will we or can we or when will we figure out how to get a system to open-endedly perpetuate interesting artifacts indefinitely? That to me is the Holy Grail of open-endedness. It’s a system that never ends. And we are far from it right now and have not seen anything close.
I think this is a really interesting puzzle, because to me it’s commensurate with the problem of trying to achieve AGI, or artificial general intelligence. I think they’re both equally grand challenges, and yet one of them has way more attention than the other. The open-endedness challenge, this challenge of the never-ending system, is just as grandiose and important, and solving it would be an absolutely incredible breakthrough. Right now, so-called open-ended systems, with the algorithms that we have today, tend to produce new, interesting things for a while, which probably means a couple of days, maybe a couple of weeks at best. But if I came back in a year, it wouldn’t be worth it. If I came back in 10 years, it wouldn’t be worth it. And that’s just different from the kinds of open-ended systems that exist organically in nature. Evolution gave us more than a billion years of interesting stuff coming out over and over, and it’s continuing to this day. Civilization, which is the human-driven version of this, has been going for thousands of years and continues to produce interesting artifacts. And interesting is an understatement: human intelligence is a product of evolution.
It’s one of the most interesting things that exist in the universe, and it’s being produced by this particular open-ended process. Civilization is producing pretty much everything you see when you look outside your window; everything that people interact with and use and find useful is a product of civilization. So these are incredibly powerful processes that go, for all intents and purposes, forever. And the algorithms that we have are pale, pale shadows of that, which can only go for a couple of days, maybe a couple of weeks. So there is a remaining challenge to understand how to get them to be indefinite, and I think it’s achievable. That makes it really exciting, because I don’t think there’s any theoretical reason why we couldn’t do this. This is a more researchy answer, but I think from the perspective of the field, we are going to move in the direction of making breakthroughs in understanding how to keep these processes going for longer and longer and longer.
And the longer they go, the better they are, because of this stepping-stone perpetuation issue: the more things they discover, the more things they can discover. That should be possible to set up in an algorithmic framework, and then you’d have a truly never-ending algorithm, which we have never seen in the history of computer science and which would be completely fascinating. I also think this is instrumental to AI and the pursuit of AGI. That’s another part of it: to me, probably the most distinguishing aspect of human intelligence is that we are open-ended. Take that out and you’ve got just robots. We are not that interesting if you take open-endedness out of our intelligence, and that’s, I think, a serious concern for what we are doing right now in machine learning. We’re making incredible progress at getting machines to do things that we’ve already done, and that has huge economic value.
So it’s not to diminish the importance of what’s happening right now, but that is very, very different from getting machines to do things that we haven’t done or that we could do or will do. That’s a whole other thing, and that’s open-endedness. And so to truly grasp this superlative aspect of what it means to be human, I think involves conquering open-endedness. It’s a component of actually making something that’s truly AGI.
And it’s really interesting because it’s not just a component of the AGI, but it’s also a stepping stone to the AGI because I think it can be argued, and this is slightly weaker argument, but I think also important to consider that the only way to get to the AGI requires going through stepping stones that are also open-ended or in other words, an open-ended algorithm may be necessary to get to the point where you have open-ended brains. After all, that’s what happened. We are a product of evolution, which was open-ended and now we are open-ended with our own intelligence, and so, it can’t be proven that you have to go that way, but I think there’s a lot of arguments if we got into the nitty gritty details that actually, that might be the case. So open-endedness is important for all of these reasons and has, I think, a big future in all of them.
Jon Krohn: 01:12:49
I can see that. That is super exciting. And maybe that ties into what you are thinking about doing next with your career. So your open-ended brain has led to some open-ended opportunities that are building on themselves. You were a professor of computer science at the University of Central Florida, and then with Gary Marcus and others, you founded the AI startup Geometric Intelligence, which was acquired by Uber. Then you led the core AI research team for Uber AI on the back of that, and most recently, you were the open-endedness team leader at the revered AI research company, OpenAI. Certainly one of the top shops to be working at doing any kind of AI research. But at the time of filming, you have recently left OpenAI to follow the gradient of interestingness to something completely new. So do you want to tell us about what that is, Ken?
Ken Stanley: 01:13:49
Yeah, well said. I think I’m trying to follow the gradient of interestingness. So this is something I’ve not disclosed before so this is the first time that I’m going to say this publicly, but yeah, I’m now trying to start a company and I just want to tell you a little bit. Also, again, just for listeners, why. It’s really that I am inspired by the book that we were discussing earlier in the show. It’s the foundation of a lot of what we’ve been discussing. The book Why Greatness Cannot Be Planned. I’ve received a lot of reaction to this book over the years. This book is about what it means and how it can be achieved to really help people to achieve what I would call serendipity. We didn’t put it that way earlier, but when you think about it, when you’re talking about achievement without objectives, what you’re talking about is increasing the probability of serendipity.
And I have been just overwhelmed by the response from people to the book, talking to me about what they want to do to maximize serendipity in their lives or their companies, or about the ways they’ve behaved in their lives that have helped them get to serendipity in the past. All of that has made me come to the conclusion that I need to do something about this. This resonates with people. People really want to do something about this. And what I’ve concluded is that we should build a system that helps people achieve serendipity in a principled way that follows the algorithmic insights, but that is really about people, helping them get to that point in their own explorations and in how they interact with others, because after all, stepping stones are a communal property. People get stepping stones from other people and from themselves.
And that system is what we’ll build in this company. I’m not going to give away too many details, but that’s basically the goal: to create a big new serendipity network. Right now it’s obviously at a very nascent stage, but if people are interested, I did put up an email address, so you can inquire about jobs, or even about beta testing, or anything else you might want to ask about. I created an interim address, and that is newco, N-E-W-C-O, at kenstanley.net: newco@kenstanley.net if you want to inquire about it. At this point it’s super early days, but I’d still be happy to hear about anybody’s interest.
Jon Krohn: 01:16:26
That’s incredibly exciting. With what you’ve achieved already in your career so far, and then that big ambition of allowing people to achieve serendipity in a structured way that’s based on insights that we have from algorithms, it sounds amazing. And so, I hope that there are tons of listeners out there whose interest has been piqued by this and they reach out to you. Sounds like an amazing opportunity and I can’t wait to see how this develops, Ken. All right, so a question that I save only for a few guests. I haven’t asked this that many times on air, but it is my absolute favorite question to ask on air, which is, we sit at this point in history where thanks to exponentially cheaper data storage, exponentially cheaper compute, ever more abundant sensors collecting more and more types of data in more and more places all over the world.
There’s interconnectedness that we have through the internet and more and more people around the world, billions more people in recent decades and that trend will continue over coming decades, have access to all this information, including information on data modeling innovations and data sets, so archive papers are shared in real time. Code is shared in GitHub in real time and so anybody anywhere in the world, any of these open-ended brains can be getting access to all of these open-ended innovations in real time, and so technology is advancing at an exponentially faster pace each year, thanks to adding all these different effects together. And a lot of them have to do with data and machine learning, either in a direct or an indirect way, so with all of that context around not only how exciting things are today, but this trajectory that we’re on of things being more exciting in the future, what excites you about what could happen in our lifetimes? What are you maybe hoping to look back on when you retire?
Ken Stanley: 01:18:38
Yeah, great question. In what you’re describing, there’s really a dichotomy, because it’s incredibly exciting but also gravely concerning at the same time. And I think this is a moment where, as a society and especially in the field of AI, we’re really grappling with that dichotomy. It’s hard to triangulate between these two aspects of it. It’s like, wow, you can now generate in a few seconds imagery that you would’ve had to pay a professional prohibitive amounts for and that would’ve taken days or weeks to actually produce. This obviously opens up huge opportunities for all kinds of people, but it’s also, in some ways, crushingly worrisome for other people, the people who produce these things. And there are also subtler problems. For example, if something is trained on all the art that was ever created, it’s great at generating interpolations among all the art that’s ever created, but if that actually caused us human artists to stop producing art, there would be nothing new.
And it’s not just about art. Everything that we have pictures of, all the things in the world like offices, if you want to stock photo of an office, it wouldn’t be that great if all I got out of these AIs were stock photos from the 1970s. It would look weird, but if people don’t take pictures anymore because the AIs are generating everything for free, something doesn’t make sense eventually. But at the same time, that’s why you can bounce back and forth. You have to acknowledge, this is a great step for a lot of us. The creativity it opens up is absolutely incredible. So what I would hope for is that I think the guiding principle or the north star for where the future should go is to remember that it’s really ultimately should be about us humans and really allowing us to amplify and maximize our own self expression, I think, is what would be a good future.
A bad future is one where self-expression is muted and diminished because machines replace us, whatever that might mean exactly, and we lose that capacity for self-expression or the channels to actually enjoy it. You can instead see a more optimistic version of this, where the tools that are coming actually amplify our ability to express ourselves rather than diminish it. Again, it’s a very delicate line to traverse, but the idealistic view is that we’ll be able to use that amplification capacity to elevate the human condition and human self-expression, because I believe that, ultimately, the most satisfaction comes from the ability to express yourself. And that doesn’t just mean art. I think an engineer expresses themselves through their works as well, and a software engineer, and so on. We’re all, I think, most satisfied when we’re expressing ourselves in some way.
And so, we don’t want to diminish that. And so, I think that the world that I would imagine is where seeds of insight that we have as individual humans, where we wouldn’t have had necessarily the talent to realize those seeds suddenly become realizable for us. But this seed is still a human thing. It still comes ultimately from our understanding of the world, what we want in the world, and the experience we’ve had as individuals. And it’s the AI that helps us to actually amplify and make those things real that a lot wouldn’t have been possible to be real in the past.
So our talents are still being honored at some level, still integral to what we ultimately see. That’s one of the reasons that, when I thought about the direction I want to go with this company, it’s something with humans in the loop, because I want to put humans at the center of what the AI facilitates, and ultimately to elevate the human condition rather than just replace or subvert it. That is a high bar, and a future that can’t be guaranteed, but a good aspirational future, I think, is one that would look like that.
Jon Krohn: 01:22:49
Nice, I like it. And I realize now that when I ask this question of you or anyone, it is an extremely difficult question to answer, at least with specifics, because of the proliferation of stepping stones that we have no visibility into at this time. I’m asking you to project 30 years into the future, and even projecting five years is very difficult, because we cannot ourselves conceive of the stepping stones that will emerge in the intervening years. That was a great answer indeed. And I love that idea of humans being able to realize insights that wouldn’t be possible without machine assistance. I think you alluded there to the kind of image generation that DALL·E 2 is doing, something that we’ve actually just adopted for the podcast ourselves.
For our guest episodes on Tuesdays for the YouTube thumbnails, we use photos of your face and my face. A still image of us ideally laughing in conversation, having a good time, and that makes sense for the Tuesday episodes, the guest episodes, but on Fridays, we have these Five-Minute Friday episodes which are typically just me talking about some topic and we’ve historically gone to a stock photo library, but this week at the time of filming, for the first time, we’re using image generation based on prompts to create some of our Five-Minute Friday thumbnails.
We’re experimenting with it. Some interesting early successes. And anyway, so I love this human-in-the-loop idea and I love how, while no doubt, there are downsides to any new innovation, like in this case, I don’t know if I would be investing in a stock photo marketplace, but super, super exciting times ahead. So Kenneth, you have provided so many great insights over the course of this episode, but sadly, even all incredible insightful experiences must come to an end. And so my penultimate question that I ask all of our guests is, do you have a book recommendation for us?
Ken Stanley: 01:25:09
Sure. Yeah. It was actually recommended to me recently, and I’ve been reading John Cleese’s little short book. I think it’s just called Creativity; it’s on creativity.
Jon Krohn: 01:25:20
Oh yeah. John Cleese from Monty Python.
Ken Stanley: 01:25:22
Yeah, yeah, yeah, yeah. It’s just such a different field, comedy and acting and writing, from what I’m associated with, but it was really interesting to me how much a lot of his thoughts intersect with things that I’ve been thinking. There’s definitely a non-objective flavor to a lot of what he’s saying, but with other layers of insight that come from his background, which is totally different. So I think it’s very complementary: if you read our book and read his book, they actually go together well.
Jon Krohn: 01:25:55
Yeah. I think John Cleese is, from my perspective, a brilliant actor and producer. Things like Monty Python, A Fish Called Wanda, and Fawlty Towers are all series or films that have such an abundance of creativity, and I think the unexpected happening in these programs is part of what makes them so enjoyable to watch. So it’s a cool recommendation.
Ken Stanley: 01:26:26
I wish more creative people like him would talk about how they approach creativity. It’s so interesting to hear from people like that. You don’t usually get that look behind the scenes, so it’s really fascinating.
Jon Krohn: 01:26:41
Yeah, no doubt. I wonder if some of John Cleese’s insights will end up influencing machine learning models that you develop in the future.
Ken Stanley: 01:26:50
It’s not impossible.
Jon Krohn: 01:26:51
Yeah. It’s not impossible. All right, so earlier in the episode, you already provided us with an email address to get in touch. If people have any questions about your company, perhaps if they’d be interested in getting involved in your company, and so you can repeat that email address for us or any other ways that you recommend for people to stay in touch with the brilliant thoughts that you have, Ken. Maybe social media accounts, that kind of thing.
Ken Stanley: 01:27:17
Just to remind people, that address was newco@kenstanley.net. That’s newco@kenstanley.net. And just for clarity, that’s really just if you want to ask me about the company. If you want to get in touch with me about other stuff, like using normal email or Twitter or something like that, I would recommend going to kenstanley.net. Just go to kenstanley.net. That has different links for ways to contact me, and you can find me through normal channels there. The newco address is just for inquiries about the company.
Jon Krohn: 01:27:55
Nice. [inaudible 01:27:56]-
Ken Stanley: 01:27:56
And I’d be happy to hear from people too.
Jon Krohn: 01:27:59
Wonderful. Thank you for opening up that line of communication with our listeners. Ken, it’s been awesome having you on the show. I really did learn a tremendous amount and loved just hearing the way you speak about ideas. I found you and your ideas fascinating, and hopefully we can get you on the show again in the future, maybe after Newco is off the ground and you have some interesting applications to share from it. Looking forward to it.
Ken Stanley: 01:28:31
That’d be great. Thanks for having me. Yeah, this was really enjoyable. Great questions.
Jon Krohn: 01:28:40
Whoa. What a trip that conversation was. I’m blown away by the practical human decision-making insights Ken has been able to glean from his machine learning research. In the episode, Ken filled us in on how interestingness maximization, exploring whatever you find interesting in the moment, may lead to superior life outcomes relative to explicitly pursuing a specific objective, although we can’t perceive in advance what those outcomes will be, because the stepping stones toward them are revealed only through exploration. He also talked about how, with ramifications for reinforcement learning, AI creativity, and human-in-the-loop systems, novelty search approaches enable machines to learn something interesting by exploring all of the options, including the boring options, first. And he talked about how never-ending, open-ended exploration in machines would be a breakthrough on par with AGI, or a critical stepping stone en route to achieving AGI.
As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Ken’s social media profiles, as well as my own social media profiles at www.superdatascience.com/611. That’s www.superdatascience.com/611. If you enjoyed this episode, I’d greatly appreciate it if you left a review on your favorite podcasting app or on the Super Data Science YouTube channel. I also encourage you to let me know your thoughts on the episode directly by following me on LinkedIn or Twitter and then tagging me in a post about it. Your feedback is invaluable for helping us shape future episodes of the show.
Thanks to my colleagues at Nebula for supporting me while I create content like this Super Data Science episode for you. And thanks of course, to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the Super Data Science team for producing another mind blowing episode for us today. For details of everyone on the team, you can visit jonkrohn.com/podcast. If you’re interested in sponsoring the show, you can email natalie@jonkrohn.com. We’ve provided her contact details in the show notes, or again, you can find them at jonkrohn.com/podcast. All right then. Until next time, keep on rocking it out there folks, and I’m looking forward to enjoying another round of the Super Data Science podcast with you very soon.