56 minutes
Artificial Intelligence, Data Science

SDS 589: Narrative A.I. with Hilary Mason

Co-Founder and CEO of Hidden Door, Hilary Mason joins the show for a live talk that dives into the world of narrative AI, her life as the leader of an early-stage A.I. company and the emerging ML techniques that she’s excited to watch in the future.

About Hilary Mason

Hilary Mason is Co-Founder of Hidden Door, a start-up that leverages state-of-the-art narrative A.I. techniques to generate unique, customized dialog and graphics in real-time to deliver a groundbreakingly immersive video game experience. She was previously Founder and CEO of Fast Forward Labs, an emerging-technology research company that was acquired by Cloudera. She was also a Data-Scientist-in-Residence at Accel and co-founded several iconic tech communities in New York such as DataGotham and HackNY.

Overview

Hilary describes Hidden Door as “Roblox meets Dungeons and Dragons with an A.I. dungeon master.” Essentially, it’s a machine that plays any story in any world, allowing you to take existing worlds and co-create one of your own. Regarding the technology behind it all, Hilary and her team use a mix of emergent language model techniques, procedural techniques, and classic NLP techniques, all architected around creating a safe and controllable experience. “The core problem we’re solving is one of taking unstructured language of a novel or a story, structuring it, mediating it through a game engine, and then using that structured representation to generate both text and art dynamically,” Hilary explains.

In 2010, Hilary coined the term OSEMN to explain the then little-known data science lifecycle: obtain, scrub, explore, model, and interpret. Though her process may seem obvious these days, Hilary emphasizes that interpretation is the one step that data scientists still struggle with today.

While discussing the future of data science, Hilary acknowledges that most people are excited about generative models and about generating images from various inputs, but she points to a method that she says is overlooked these days: “few-shot learning and the ability to take a pre-trained model and turn it to a new task, without even having to fine-tune…I think that’s incredibly powerful and underappreciated.”

When it comes to life as a CEO of an early-stage start-up, Hilary says that her role is currently split into three parts. First, she remains responsible for the company’s overall vision and ensuring its growth, including hiring and fundraising. The second piece involves engineering leadership, including reviewing pull requests. And lastly, Hilary reveals that she still codes and completes technical contributions here and there.

And as far as hiring for engineering roles, Hilary looks out for people who can write production code, have good judgment, are open-minded, and have a good balance between pragmatism and optimism.

Tune in to this episode to hear Hilary answer audience questions and why she’s hopeful A.I. will transform our lives for the better in the decades to come.

In this episode you will learn:

How narrative A.I. can assist creativity [5:14]
How to build ML products that have no quantitative error function to optimize [10:31]
How to ensure creative A.I. systems do not output nonsense or explicit content [16:58]
Hilary’s OSEMN data science process [21:05]
The emerging ML technique Hilary is most excited about [24:58]
What it takes to be successful as CEO of an early-stage A.I. company [27:20]
What Hilary looks for in engineering hires [32:28]
Why Hilary is hopeful A.I. will transform our lives for the better in the decades to come [38:48]

Episode Transcript

Podcast Transcript

Jon Krohn: 00:00

This is episode number 589 with Hilary Mason, co-founder and CEO of Hidden Door.

Welcome to the SuperDataScience Podcast. The most listened-to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn, thanks for joining me today. And now, let’s make the complex, simple.

Welcome back for a special episode of the SuperDataScience Podcast, which we filmed live on stage at the New York R Conference. For this special occasion, we have an extra special guest, Hilary Mason, one of the most well-known and beloved data scientists in the world. Hilary is co-founder and CEO of Hidden Door, a startup that leverages state-of-the-art narrative AI techniques to generate unique, customized dialogue and graphics in real-time, to deliver a ground-breakingly immersive video game experience.

She was previously founder and CEO of Fast Forward Labs, an emerging technology research company that was acquired by the software giant, Cloudera. She was also data scientist in residence at Accel, a leading venture capital firm. She co-founded several iconic tech communities in New York, such as DataGotham and hackNY. She studied computer science at Brown University and Grinnell College. And she is known for sharing useful data science knowledge with the public. She has over 120,000 followers on Twitter and 160,000 followers on LinkedIn.

The first half of today’s episode contains some technical elements, but by and large, the episode should be appealing to anyone who’s keen to be on the cutting edge of machine learning application and commercialization. In today’s episode, Hilary details how narrative AI can assist creativity, how to build machine learning products that have no quantitative error function to optimize, how to ensure creative AI systems do not output nonsense or explicit content. The emerging machine learning technique she’s most excited about. What it takes to be successful as CEO of an early-stage AI company. And, how she’s hopeful AI will transform our lives for the better in the decades to come. All right, you ready for this exceptional episode? Let’s go.

Hilary Mason, obviously our esteemed guest here today. We know each other through Claudia Perlich, who was in episode number 437, one of my first episodes when I became host of SuperDataScience. And I saw Claudia and Hilary together in a super popular YouTube video called Computer Scientist Explains Machine Learning in Five Difficulty Levels. So, it was published by Wired and it has already 1.3 million views. And so, Claudia’s been a friend of mine for years, and I used that as leverage to trick Hilary into being on stage here.

In that video, Claudia was the expert of the five levels of difficulty that Hilary was explaining to. We also had a grad student, which was the fourth of five levels, that was Melanie Subbiah. She’s in episode number 559 of the podcast. Great one on GPT-3. And now we have Hilary, the host of that show and of that YouTube video, and countless other amazing achievements, many of which we’ll go over in this episode.

Currently, Hilary, you are the co-founder and CEO of Hidden Door. So, you’ve already raised $2 million in venture capital from the likes of Makers Fund, Betaworks, Brooklyn Bridge Ventures, Homebrew, and individuals like the CTO of Roblox. Hopefully I say that right.

Hilary Mason: 04:00

You did.

Jon Krohn: 04:02

And you’ve described Hidden Door as Roblox meets Dungeons and Dragons, with an AI dungeon master. So, do you want to fill us in on Hidden Door?

Hilary Mason: 04:14

I’m very happy to, and first thank you for inviting me and thanks to our audience for lingering and participating in this. It’s going to be a lot of fun. And yeah, so I’m the co-founder of Hidden Door and we are building a machine for playing any story in any world. So, imagine being able to take a novel or a TV show or a movie that you’re in love with, and imagine your own characters and the adventures they might have with your friends-

Jon Krohn: 04:41

Wow.

Hilary Mason: 04:41

… in those worlds, and to build those worlds together. So, it becomes a collaboration between whoever created that initial world, you and your friends, as you create characters and build your own stories and adventures, and then the machine that is facilitating and co-creating along with you.

Jon Krohn: 04:57

Wow.

Hilary Mason: 04:58

Yes.

Jon Krohn: 04:59

That sounds like a tricky machine learning problem. And so, you’re leveraging, what your marketing materials at least, describe as a new kind of narrative AI. So, how does machine learning assist with creativity in this Hidden Door platform? How does it allow your AI dungeon master to create new adventures every time?

Hilary Mason: 05:23

I’m actually going to take a step back in answering that question and talk a little bit about what the goal of the experience is, because I don’t know that everybody here is a big tabletop RPG fan. So, actually let me ask, for those of you who are here in the audience, how many of you have played tabletop RPGs, such as Dungeons and Dragons? We have a handful of folks.

Jon Krohn: 05:42

We’ve got, probably a third of hands went up.

Hilary Mason: 05:44

Yeah. And of the folks who have not played, how many of you are familiar with the idea of what it is? So, we’ve got … Okay. Pretty much everyone else, but I’m just going to say for those of you who are not familiar with the idea, you sit around a table, whether it’s real or virtual, with your friends, and tell a story in a world where you have a character, you make decisions for that character. And then another player, who’s usually called the game master, the game guide, mediates and enforces the rules, applies the laws of physics and tells you what happens next, and sets up the story arc for you.
So, we are trying to create this kind of experience, but in any world, whether it is the Bridgerton world, or whether it’s unnamed wizard boy, having wizard boy adventures, world, right? Whatever it is that you happen to be in love with, or it’s the classic fantasy setting. And so, when we think about what that experience is, it is one of being able to take a story world, that world-building, to take this set of tropes, to distinguish what makes that world unique from those tropes, and then to express it in a way that is consistent and that gives you the room to tell your own story within it.

Jon Krohn: 06:56

Wow.

Hilary Mason: 06:57

And so, that’s the sort of experience we’re trying to create. And it’s one that currently, you can really only have by playing with a person who plays the role of the storyteller. And every group is different, but this experience is one where it’s a social experience. So, you end up telling a story you could not tell alone, and you couldn’t tell without your particular set of friends and your particular game leader.
So, back to the technology. So, when we think about the technology to make this possible, it is only in the last few years that we have been able to use large language models in a way to facilitate this sort of storytelling. Though, we are probably not using them in the way you might think, because that would lead to bad things or problems. But when you think about what those systems are able to provide, conceptually, they are able to essentially encode the patterns of a large amount of communication.

So, essentially, they’re trope machines. And when you apply them to say, a classic fantasy genre or classic science fiction genre, you can start to pull out the tropes that distinguish that genre and that sub-genre.
We’ve done, actually, a ton of cool data visualization of the sub-genres and how they relate through language use and trope overlap, to try and understand these things. So, we are using an extremely pragmatic mix of emergent language model techniques, a bunch of procedural techniques, a bunch of classic NLP techniques, all architected around creating a controllable and safe experience.
We also have a simulation and a game engine inside our system, which means that you get things that you think are reasonable. It has memory. You have a character sheet and statistics behind the scenes as well, even if everything is expressed in language. So, said another way, the core problem we’re solving is one of taking unstructured language of a novel or a story, structuring it, mediating it through a simulation or a game engine, and then using that structured representation to generate both text and art dynamically.

Jon Krohn: 09:21

Wow. The art as well.

Hilary Mason: 09:22

The art as well. So, as you play, it comes to life as a graphic novel.

Jon Krohn: 09:26

Wow.

Hilary Mason: 09:27

That is created based on what you and your friends choose to do.

Jon Krohn: 09:32

Cool. So, the natural language generation has us thinking maybe about transformer architectures. Yeah.

Hilary Mason: 09:40

Tons of that stuff. But I also say, most of that is running offline. So, we are not doing that stuff in real-time, for a bunch of reasons, primarily controllability and safety.

Jon Krohn: 09:50

Right. That makes a lot of sense. And so, how do you conceive building these kinds of machine learning models or even these products? I imagine you’re taking advantage of a lot of natural language data, lots of stories. And then, how do you transform that into a playable product when there’s no quantitative error function to optimize? So, I know you’ve described that to me personally, the way that you’ve set up these machine learning models, there’s no cost function that you’re minimizing say, or an objective function that you’re maximizing. So, how do you frame your machine learning models and the product, without that cost function that probably most of us are used to?

Hilary Mason: 10:40

I’m going to make what may be a provocative statement and say that as data scientists, we tend to lean into working on things that have quantitative notions of correctness.

Jon Krohn: 10:53

Correct.

Hilary Mason: 10:53

And there are a large set of things we could build, but where we don’t have those error metrics. And there are a ton of products that are incredibly valuable, that exist today, where we are using machine learning or data in a way where we don’t have those things. And they are all around us. They are even things like web search, as a canonical example of something where … Yeah, right? You’re going to say, “Oh, of course, there are quantitative ways we do this.” But they’re kind of crappy. Right? They don’t really solve the problem of, given this global set of documents, which is the absolute right one for this query, for this person, in this context, at this time. Right?
And so, I’m going to say that our problem here is not unique and that if we’re thinking as a room full of data oriented people, about the kinds of problems we’ve been thinking about solving, I often find that there are unexplored opportunities in these problems that are messy, because they don’t have some pure notion of correctness. So, that’s my controversial statement. And I can’t tell, I think everyone’s very grumpy now. We’ll see how this one goes over.

Jon Krohn: 12:03

There’s some thumbs up there in the crowd.

Hilary Mason: 12:05

Maybe. Maybe. So, then we have to ask-

Jon Krohn: 12:08

From the bioinformaticians. I know that that’s the bioinformatician section of the audience.

Hilary Mason: 12:15

Well, they’re way in the back, so they must be cool. Right? Yeah. So, we have to ask questions, when we think about building products, about how we know that a product is good, like, what is the experience we’re trying to create? How do people react to that? Can we decompose our problems into things that do have notions of correctness? Can we test our approaches against subsets of our data, where we can look at it, we can rank things. We can know what we can expect. There are ways to approach that, but I think, as a data person, it requires accepting that there is no one right quantitative answer. And then, really thinking about the product experience you’re trying to create and how you’re going to have people, in our case, play test the thing.
And just to give some more concrete examples of what we’re doing at Hidden Door. We have a system that is telling stories, that is deciding what might happen next. And sometimes it comes up with stuff that is completely expected, but if it does too much of that, it’s boring. Sometimes it comes up with stuff that is bizarre as heck. Right? But if it does too much of that, it’s just a weird machine that does bizarre stuff. It’s not fun. And so, how do we come to this notion of fun? It’s a mix of the boring and the unexpected. And so, we do indeed have a formula for the mix we are trying to optimize for, but it is not something that is purely quantitative. And we just have to accept that and work it into the product development and design process.

Jon Krohn: 13:53

So, you can stop me if I’m on the wrong track here, but it sounds like sometimes, even though ultimately, the product, the experience that we’re trying to deliver, doesn’t have something quantitative, we could still break the problem into parts where we can have quantifiable objective functions that we can train with gradient descent?

Hilary Mason: 14:16

Sometimes. And then we can use things like A/B testing and multi-armed bandit testing, when you have a sufficiently large audience of people, which we do not, because we have not released our thing yet. So, that’s a bit of a trick.
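For readers unfamiliar with multi-armed bandit testing, here is a minimal sketch of one common variant, epsilon-greedy. It is not Hidden Door's method; the class name, the variant names, and the simulated feedback rates are all hypothetical and chosen only for illustration.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit over named variants (illustrative only)."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def choose(self):
        # Explore a random arm with probability epsilon,
        # otherwise exploit the arm with the best observed mean reward.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        # Incremental running mean, so past rewards need not be stored.
        self.pulls[arm] += 1
        n = self.pulls[arm]
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / n


def simulate(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Run the bandit against made-up feedback rates (hypothetical data)."""
    bandit = EpsilonGreedyBandit(true_rates.keys(), epsilon=epsilon, seed=seed)
    feedback_rng = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.choose()
        reward = 1.0 if feedback_rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    # Hypothetical story variants with invented "player enjoyed it" rates.
    result = simulate({"variant_a": 0.05, "variant_b": 0.12})
    print(result.pulls)
```

Unlike a fixed A/B split, the bandit shifts traffic toward whichever variant is performing better while the test is still running, which is why it needs a steady stream of real users to be useful.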

Jon Krohn: 14:28

Nice. Yeah, that’ll be fun.
Eliminating unnecessary distractions is one of the central principles of my lifestyle. As such, I only subscribe to a handful of email newsletters, those that provide a massive signal to noise ratio. One of the very few that meet my strict criterion is the Data Science Insider. If you weren’t aware of it already, the Data Science Insider is a 100% free newsletter that the SuperDataScience team creates and sends out every Friday.
We pore over all of the news and identify the most important breakthroughs in the fields of data science, machine learning, and artificial intelligence. Simply five news items. The top five items are handpicked. The items that we’re confident will be most relevant to your personal and professional growth. Each of the five articles is summarized into a standardized, easy to read format, and then packed gently into a single email. This means that you don’t have to go and read the whole article. You can read our summary and be up to speed on the latest and greatest data innovations in no time at all.

That said, if any items do particularly tickle your fancy, then you can click through and read the full article. This is what I do. I skim the Data Science Insider newsletter every week. Those items that are relevant to me, I read the summary in full. And if that signals to me that I should be digging into the full original piece, for example, to pore over figures, equations, code, or experimental methodology, I click through and dig deep. So, if you’d like to get the best signal to noise ratio out there in data science, machine learning, and AI news, subscribe to the Data Science Insider, which is completely free and no strings attached at www.superdatascience.com/dsi. That’s www.superdatascience.com/dsi. And now, let’s return to our amazing episode.
Awesome. And then, so you’ve mentioned a couple of times how you need to be careful to not have, not just boring outputs or too bizarre of outputs, but also just inappropriate outputs.

Hilary Mason: 16:41

Absolutely.

Jon Krohn: 16:41

So, ones either that don’t make sense or actually contain content that is explicit or otherwise undesirable. And so, there might be that kind of language in the training data set, or it might just happen by accident. So, what kinds of safeguards do you put into place with a product like this, to ensure that anyone who uses it, including the children who use it, don’t get exposed to these nonsense or inappropriate outputs?

Hilary Mason: 17:10

Yeah. And I think it’s worth saying explicitly to you, that all of the pre-trained large language models are trained off all the language they can find on the internet and then some.

Jon Krohn: 17:19

Reddit.

Hilary Mason: 17:19

Right. And Reddit. And so, it is full of crap. And that crap gets magnified and it is very easy to evoke that crap without trying. And that is not good.

Jon Krohn: 17:31

Correct.

Hilary Mason: 17:34

I’m fairly comfortable saying, there is no way to fix it. You can do a lot to try and prune the training data, but it doesn’t solve the problem. And so, then we think about, okay, if we have this thing that can, let’s say, generate a bunch of useful stuff, but we can never put that in front of a human being, much less a teenager or somebody who’s 10 years old, right?
What do we build around that system, so that we are always monitoring it? And then, how do we use it in an architecture such that we’re able to use it to create a product, but with a fairly high level of confidence in what the product can output? So, in our particular case, we do all of our generation of language offline, templatize it, and then only fill those templates with words that come out of a dictionary that we have-

Jon Krohn: 18:29

Vetted. Yeah.

Hilary Mason: 18:30

… vetted, curated. And it’s a dictionary that is itself derived from data, but at least there’s been a set of eyes on it. And now of course, templates can get filled with combinations of words that can be problematic in that combination. And so, you also can classify things against known problematic language.
Well, this also, I think at one point in the introduction, Jared mentioned, I’ve been around for a long time, right? So, we’ve actually done language generation stuff for a very long time. I worked on it extensively in 2014. This is how we did it then, before we had transformers, we had these other generative models, right? So, it is not a novel approach. It’s just going back to our roots and now combining things where we may end up with tens of thousands of candidate sentences generated, that are now templatized, that can then be used in a production environment.

It also has the side effects from a data practitioner point of view, of turning what is an open-ended generation problem with no notion of what correct or goodness looks like, into a ranking problem, where we can indeed have a notion of what correct and good can look like, which makes me as a data scientist, very happy. I like ranking problems. They’re a lot easier to solve than open-ended generation problems. And it also gives us a lot of control over what the output of the system is.
Now, that control comes at a sacrifice of generalizability. So, if you start to ask our system things like, “Who won the World Series in 1995?” We are not going to have a template pre-generated for that. That’s not something that exists in the worlds that we’ve built our content around, and so it won’t work. So, you do have some trade-offs there.
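The approach Hilary outlines (pre-generated templates, slots filled only from a vetted dictionary, a check for problematic combinations, then ranking the candidates) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Hidden Door's system: the templates, word lists, blocked pairs, and scoring rule are all invented, and a simple pair blocklist stands in for the classifier she mentions.

```python
from itertools import product

# Hypothetical templates, imagined as the output of an offline generation step.
TEMPLATES = [
    "The {creature} guards the {place}.",
    "A {creature} slips out of the {place} at dusk.",
]

# Hypothetical vetted dictionary: only reviewed words may fill a slot.
VETTED = {
    "creature": ["dragon", "fox", "ghost"],
    "place": ["library", "harbor"],
}

# Hypothetical word pairs that are fine alone but unwanted together.
BLOCKED_PAIRS = {("ghost", "library")}


def fill_templates(templates, vetted):
    """Yield (sentence, filling) for every template and vetted word combination."""
    slots = sorted(vetted)
    for template in templates:
        for values in product(*(vetted[slot] for slot in slots)):
            filling = dict(zip(slots, values))
            yield template.format(**filling), filling


def is_safe(filling, blocked_pairs):
    """Reject fillings that contain both words of any blocked pair."""
    words = set(filling.values())
    return not any(a in words and b in words for a, b in blocked_pairs)


def rank(candidates, preferred_words):
    """Toy ranking: candidates using more preferred words come first."""
    def score(item):
        _, filling = item
        return sum(word in preferred_words for word in filling.values())
    return sorted(candidates, key=score, reverse=True)


def generate(preferred_words):
    """Fill, filter, and rank; returns sentences only."""
    safe = [c for c in fill_templates(TEMPLATES, VETTED)
            if is_safe(c[1], BLOCKED_PAIRS)]
    return [sentence for sentence, _ in rank(safe, preferred_words)]


if __name__ == "__main__":
    for sentence in generate({"dragon", "harbor"})[:3]:
        print(sentence)
```

Because every output is drawn from a finite, inspectable candidate set, the open-ended generation problem becomes a filtering and ranking problem, at the cost that anything outside the templates simply cannot be said.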

Jon Krohn: 20:20

That makes sense. I wouldn’t expect that to work, as a user. So, I think that, that’s a fair trade-off for this use case. So, you mentioned that you’ve been around in data science for a long time.

Hilary Mason: 20:33

So I hear.

Jon Krohn: 20:35

And so in 2010, before I had heard of the term data science, you coined a term that I first learned from Jeroen Janssens, who’s right here in the audience. So, he was in episode number 531 of the SuperDataScience Podcast. And he pointed out to me in that episode, this awesome data science process that you coined with Chris Wiggins in 2010, before I, or probably most of the people in the audience here, had heard of data science. So, what is the OSEMN, O-S-E-M-N, data science process? Maybe let’s just start with that. And then I’ve got a follow-up question for you.

Hilary Mason: 21:13

Okay. I mean, as you’ve already said, it’s an abuse of language, because I like the word awesome a little bit too much, but so, obtain, scrub, explore-

Jon Krohn: 21:26

Explain.

Hilary Mason: 21:27

Explain, model, interpret.

Jon Krohn: 21:29

Oh, it’s explore?

Hilary Mason: 21:30

I thought it was explore.

Jon Krohn: 21:32

It’s explore, Jeroen Janssens has fact checked for me.

Hilary Mason: 21:33

Yes. Thank you. Thank you.

Jon Krohn: 21:35

I wrote it down wrong.

Hilary Mason: 21:39

It actually wasn’t that long ago, and I’m sure this seems completely obvious to everybody in this room, that you get some data, you clean it up, you look at it, you interpret it, model it, and then you visualize it, or communicate it in some way. That’s what it was. But in 2010, that was not obvious and we wrote it down, because we, or at least, I kept finding myself in a room where … And this is one of the things I love about the data community. People come from all kinds of different backgrounds to data. Some folks come from bio stats, and some folks, like me, come from computer science, and some people come from physics, and we probably have some economists in the room, and I don’t even know where everybody else comes from, but it’s probably incredibly awesome.
And we end up trying to solve a problem together, doing the same thing, but using different language or different processes and procedures, and this just kept happening. And so, finally we wrote it down and said, “We’re going to say, ‘This is the process. This is what you do when you are data science-ing.'” And it is really funny to talk about it here today, because now you look at it and you think, “Oh, this is so obvious. Like, of course.” But it wasn’t obvious in 2010.
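As a concrete illustration of the five OSEMN steps (obtain, scrub, explore, model, interpret), here is a minimal sketch using only the Python standard library. The dataset, column names, and the choice of a least-squares line as the model are all made up for the example.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data with one missing value and one malformed value.
RAW_CSV = """hours,score
1,52
2,58
,61
3,65
4,abc
5,75
"""


def obtain(text):
    """Obtain: read the raw rows (here from an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrub: keep only rows where both fields parse as numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["hours"]), float(row["score"])))
        except (TypeError, ValueError):
            continue
    return clean


def explore(points):
    """Explore: basic summary statistics before any modeling."""
    xs, ys = zip(*points)
    return {"n": len(points), "mean_hours": mean(xs), "mean_score": mean(ys)}


def model(points):
    """Model: ordinary least squares fit of score = slope * hours + intercept."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def interpret(slope, intercept):
    """Interpret: turn the fitted numbers into a statement a person can act on."""
    return (f"Each extra hour is associated with about {slope:.1f} more points "
            f"(baseline {intercept:.1f}).")


if __name__ == "__main__":
    points = scrub(obtain(RAW_CSV))
    print(explore(points))
    print(interpret(*model(points)))
```

The last function is deliberately the thinnest one here; deciding what the result means for a reader, a decision-maker, or a production system is the part a code sketch cannot capture.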

Jon Krohn: 23:01

Yeah. It’s super cool how things like that, writing that down 12 years ago, how that has been absorbed by the data science community, and it’s helped to provide a backbone for people to understand what they might be doing on a regular basis. And then it makes it into things like Jeroen’s book Data Science at the Command Line. And yeah, it’s such a wonderful, open community, that I love being a part of. So, we know now the five stages of the OSEMN process, obtain, scrub, explore, model, and interpret. That’s an N.

Hilary Mason: 23:40

Whole thing is an abuse.

Jon Krohn: 23:42

And so, 12 years later, is there a particular stage of those five stages, that data scientists struggle with the most still today?

Hilary Mason: 23:54

I’m going to say that, that last one ended up being a stand-in for a lot of things. And that data science work is sometimes work, so that you understand something. Sometimes it’s work to communicate to somebody else, so they understand something, they make better decisions. Sometimes we’re building systems that go into production for machines to make decisions or to automate processes. And we just elided all of that, the what happens next part. And I think that’s become one of the harder things, is doing the first four steps with all of those different contexts in mind. So, I’m going to say that.

Jon Krohn: 24:42

Great answer. Nice.

Hilary Mason: 24:44

And the tooling for all the rest of it has gotten so much better. It’s so much friendlier and so much easier to start to play with so many things and so many techniques that were very difficult even 12 years ago.

Jon Krohn: 24:58

What are the tools and techniques today, that you’re most excited about?

Hilary Mason: 25:04

That is a good question.

Jon Krohn: 25:06

It’s a big one.

Hilary Mason: 25:07

Yeah, it is a big-

Jon Krohn: 25:07

Or just one.

Hilary Mason: 25:08

One. Okay. I’m going to say that the thing I’m most excited about is probably the thing that I think gets overlooked in machine learning these days, but I think is tremendously powerful, which is few-shot learning.

Jon Krohn: 25:22

Oh, yeah.

Hilary Mason: 25:22

Everyone is excited about generative models or about being able to generate images from various input or from text prompts, or adversarial networks. But I think few-shot learning and the ability to take a pre-trained model and turn it to a new task, without even having to fine-tune, to at least validate whether it’s worth a deeper exploration. I think that’s incredibly powerful and underappreciated.

Jon Krohn: 25:53

And it’s something that these big natural language transformer models are trying to specialize more and more in, as well as deep reinforcement learning approaches. And it seems like these kinds of things, being able to do few-shot learning, is key to having machine learning be a lot more like human learning, because people can, even children can often learn what to do in a circumstance from one example, but a lot of the big deep learning models that we have today might require millions of examples in order to be able to understand something that a child could learn in one or two.

Hilary Mason: 26:28

That’s right. So, it allows us to approach problems that were previously inaccessible, because we just didn’t have these large, clean data sets. And I also think it allows us as data practitioners to play with using a model to approach a particular problem in a way that lets us understand if it’s worth doing something more in-depth in that domain. And I am a huge fan of tooling changes and algorithmic changes that take something that used to be really hard and make it really easy, because I think then we just start to play. And that’s where a lot of great work ends up coming from.
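One common form of few-shot learning with a pre-trained language model is in-context prompting: a handful of labeled examples are placed in the prompt and the model is asked to continue the pattern, with no fine-tuning. The sketch below only assembles such a prompt; the function name, the "Text:/Label:" layout, and the example data are assumptions for illustration, and sending the prompt to an actual model is left out.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an in-context prompt: instruction, labeled examples, unlabeled query.

    `examples` is a sequence of (text, label) pairs. The "Text:/Label:" layout
    is one arbitrary convention, not a requirement of any particular model.
    """
    lines = [instruction.strip(), ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The query is left unlabeled so the model's continuation is the prediction.
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical task: tagging story beats by genre from three examples.
    prompt = build_few_shot_prompt(
        "Classify the genre of each story beat.",
        [
            ("The starship lost power near the nebula.", "science fiction"),
            ("The knight knelt before the elven queen.", "fantasy"),
            ("The detective found a second set of footprints.", "mystery"),
        ],
        "A wizard bargained with the river spirit.",
    )
    print(prompt)
```

Trying a task this way takes minutes rather than a labeled dataset and a training run, which fits the use Hilary describes: checking whether a problem is worth a deeper exploration.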

Jon Krohn: 27:12

That’s a lot of what you were doing at Fast Forward Labs, I imagine.

Hilary Mason: 27:12

I built a whole company around the idea of playing with new things and emerging things. So, it is true.

Jon Krohn: 27:18

Yeah. So, now that you are the CEO of Hidden Door, do you get to spend much time getting into the code and into the data? What’s it like day to day, as the CEO of an emerging tech company?

Hilary Mason: 27:33

I will make an admission that I probably shouldn’t, which is that, yes, I still write code. And that is bad, and I should probably not be doing that.

Jon Krohn: 27:41

Extracurricular.

Hilary Mason: 27:42

Well, I’m leading our engineering team and do write some of the code myself. If you look at my GitHub profile, you’ll see exactly when we’ve been fundraising, because it is when I’m not writing code. But no, my job right now is really split in three chunks. One of which I’ll call CEO-ing, which is all of the operational work of having the vision for the company and making sure that we are moving towards it, that we are building the right team. So, hiring’s a big part of that. We are fundraising as necessary and have the resources to keep everything moving forward. Second is, engineering leadership. So, recruiting, managing, and reviewing a lot of pull requests.

Jon Krohn: 28:27

Wow.

Hilary Mason: 28:27

Making sure that the team is as productive as possible. And then the third piece is, doing some small technical contributions myself. And that will of course evolve as the company evolves too.

Jon Krohn: 28:41

That’s super cool.

Hilary Mason: 28:42

That’s my day to day. And we’re all remote. So, I’m based here in Brooklyn. So, a lot of spending time in interesting places in Brooklyn, too.

Jon Krohn: 28:53

So, you’re all remote, but because you’re based in Brooklyn, you get to sometimes meet up with other people who are based in Brooklyn?

Hilary Mason: 28:59

Well, yes. I have one team member who’s based in New York and then folks from the West Coast to Western Europe, but we do get our whole team together every three to four months, somewhere that is accessible to everybody, that is relatively cheap, where we can cook together and where there’s amazing food and something memorable to do, because if there’s not something memorable to do, I have to be memorable, and that is a problem. So, it is much better to go somewhere interesting. So, we just went to Lisbon. We went to New Orleans before that.

Jon Krohn: 29:31

Whoa.

Hilary Mason: 29:32

So, we’ve gone to some really nice spots.

Jon Krohn: 29:35

That sounds like a great management strategy tip for all of the founders or engineering leaders out there.

Hilary Mason: 29:42

Well, if you are in a world where you’re not paying for an office, then we have a little bit of leeway to pay, because you still need to build that trust. Right?

Jon Krohn: 29:53

Yeah.

Hilary Mason: 29:53

You still need to get to know people. And you just don’t have that casually, when you’re not in the same space or even in the same city. So, you have to create that time. And it is really fun to do it in nice places.

Jon Krohn: 30:08

Yeah, I bet. Those sound like great locations. Yeah. I personally have never been a fan of the Zoom coffees or the Zoom happy hours. I think that they’re a drag.

Hilary Mason: 30:14

They are a drag. And it’s very hard to get that insight into … Like, when I first met up with some of my colleagues, I didn’t know if they were vegetarian or not. You just don’t get that casual insight into what they’re like, other than around the work. So, it’s really important to create that space.

Jon Krohn: 30:37

Yeah, we do a similar thing to you, in getting the engineering team together. Though, flying everyone into New York, now that you mention it, I don’t know why we’re doing that. I guess it’s because we do have office space still in New York, but it sounds like we should be going to Lisbon next.

Hilary Mason: 30:50

Yes.

Jon Krohn: 30:52

And it is certainly-

Hilary Mason: 30:55

It is incredible.

Jon Krohn: 30:56

Yeah, I bet. And Sean, on my team, is in the audience here and he’ll probably agree that when we actually all get together, it is some of the most valuable time and it’s great productivity. So, we use it for planning tickets in the upcoming sprints and a lot of details, so being all together around a whiteboard, how are we going to architect these things? How really is this going to work? How’s it going to look? How are we going to do that efficiently? Figure that all out, and then separate it into tickets. And then everyone goes off to wherever they’re from and we can work on those tickets in misery, alone, until the next time we can all get together.

Hilary Mason: 31:31

I feel like I have to say, we do a lot of that. We try to carve out some of our hardest problems and then work as a group through how we want to approach them, and then get to the point where we can scope it out into individual work. But that’s actually less important than the rest of-

Jon Krohn: 31:46

Than the just hanging out.

Hilary Mason: 31:47

… just getting to see people and know them a bit.

Jon Krohn: 31:49

That sounds great. So, I noticed on the Hidden Door careers page, that you are hiring right now, you’re hiring software engineers.

Hilary Mason: 31:59

Yes.

Jon Krohn: 31:59

And something that I mention on the show constantly, regular listeners of SuperDataScience will be aware that I’m constantly saying, even if you are a data scientist, you should be picking up software engineering skills as often as you can, because every single guest on the show is always hiring software engineers. They’re not always hiring data scientists, but they’re always hiring software engineers. So, the more software engineering skills you have, the more hireable you will be, and the more desirable you’ll be. So, you’re hiring software engineers right now. What do you look for in the people that you hire?

Hilary Mason: 32:33

So, I’ll just say, we are hiring people who can help us build this product. So, people who can build production code, even if they come from a data science background, and we want to see them. And what I look for in the people I’m hiring into these engineering roles is first, good judgment, because again, we are not always in the same room. Nobody’s micromanaging you. I need to trust that you can take a piece of a product, you can understand the experience we’re trying to create with it, where it’s going to interact with other pieces of systems, the trade-offs that need to be made. And some of our stuff is quite quirky too, in terms of its infrastructure requirements, like sometimes GPUs or weird memory, or whatever it may be. And that you’re going to make good decisions quickly, that get us to something that works.
And I look for folks who are open-minded and collaborative, because again, this is about having a team where everyone brings whatever expertise they may be bringing to the team, but we are able to work together and move …

And again, we’re an early stage company. We have a product going into alpha in four weeks. We need to move quickly, but also build things that are robust, that are maintainable, where we can build systems, we can build on top of those systems, and we can trust them. And so, it is really, good judgment.
And then, being somebody who is collegial, who is open-minded, who is willing to not always be right, or has the agility to think through … One of the interview questions I like to ask, if I can give it away, and I ask this of data scientists too, is, “Let me give you a problem that we’re working on, something real, and then tell me how you might want to approach it. What do you think?” And usually, the answer I get is something that’s going to be a six month, two year long effort. I say, “Great. That seems like the right way to approach it. Now, tell me how you’re going to do it in four weeks.” And then they give that answer and I’m like, “Okay, cool. You have one week. What are you going to do?” Like, “Cool. We’ve got by the end of the day, the two of us, what are we going to do together, to get something in place?”

Jon Krohn: 34:42

Wow.

Hilary Mason: 34:42

And so, really look for that agility in thinking, because you should always build that stupid thing first, anyway, so you always have something to compare to as you explore different approaches.

Jon Krohn: 34:53

I love that. I do something similar with the beginning of that, and I don’t mind giving it away as the interview question that I ask, because the problem is always going to be different. It’s going to be whatever I’m banging my head against the wall most, at the time that you come to interview with me. But then I’ve never thought about this second part, this testing the agility of how you can shorten that time span to make it more challenging. That’s cool.

Hilary Mason: 35:12

Well, it’s also the kinds of people who thrive in a very early company and a product where you do have to think about a bunch of different systems at the same time, and things are moving quickly, and you might build something and then realize, “Oh, we have to evolve it in a certain way.” Versus the kinds of people who are coming into a mature product. Some people are much happier in that space, where they can go deep on one corner of it. That’s a different skill set than what we’re looking for. There’s nothing wrong with that. It’s just what makes you happy.

Jon Krohn: 35:42

Yeah. That kind of skill set could be great in a big tech company with tens of thousands, or hundreds of thousands of people. But yeah, you need agility in a small, early stage company.

Hilary Mason: 35:52

I also, since I have the luxury of thinking of building a team, I think a lot about how much optimism versus pragmatism we have on our team, because you need people who are optimistic enough to come up with those ideas and want to go after them. You need people who are pragmatic, to say, “Oh, and this is going to be a problem, for all of these reasons.” But you can’t just be all pragmatists, because then you’ll never do anything risky. And you can’t be all optimists, because you’ll never ship. So, it’s really trying to balance that sort of approach as well, across the team.
And then, I also really believe that everyone should bring some different background as well. So, I’m not looking for a bunch of folks who were software engineers at Google to all come work together. Maybe we have space for one of those, but people who grew up technically in different ways, who have seen a lot of different things, who can bring that wisdom.

Jon Krohn: 36:46

Cool. So, can write production code, come from different backgrounds, have good judgment, are open-minded and collaborative, and have a good balance of pragmatism and optimism.

Hilary Mason: 36:57

Yeah.

Jon Krohn: 36:57

Cool things to look for in software engineers, and how you could get hired at Hidden Door, which sounds like an amazing company to be working at, honestly.

Hilary Mason: 37:04

It’s a lot of fun. We have a lot of fun and we eat really good food.

Jon Krohn: 37:08

I bet. That sounds like the best perk. So, I’m going to have my last question before we go out to the audience to ask audience questions. Lots of sharp people out there, so I hope they have brilliant questions prepared. But my last question, going back to the beginning, I talked about how I knew I could leverage Claudia to try to get you to be on SuperDataScience, because of this super popular computer scientist explains machine learning in five difficulty levels, this Wired video with over a million views.
And I mentioned how Claudia was the level five person. She was the expert. Melanie Subbiah was level four. Level one was a child, and so you had to explain machine learning to a child. The reason why I watched the video initially, was so that I could have more ideas of how to better explain what I do for a living, to people that are non-technical. And then it was this bonus treat at the end, that I was like, “Oh, I know one of these people.”

And so, thinking about that child, the level one person. In the coming decades, these trends that have been happening over the past decades will continue. So, things like exponentially cheaper compute, exponentially cheaper data storage, more and more people working in data science and computer science, who can share information in real-time via arXiv papers and open source code. So, these are the kinds of driving forces behind the AI revolution that is just starting to take off and that is going to dramatically transform our lives in ways that probably we can’t even imagine today. So, Hilary, big question, but for a child today, what are you most excited about what AI could do for them in their lifetime?

Hilary Mason: 38:59

Yeah. I love this question and I always love getting to speculate about what the future may bring. I really do believe that we’re in a moment of incredible technical progress, certainly in some of the aspects of machine learning, we are just at the beginning of figuring out how to make it useful and what those products are actually going to look like, and how we build them in a way that actually supports us as people.
And the framing I like to think about is that these are things that are able to, in many cases, reduce cognitive drudgery, but they are not themselves creative or intelligent. And so, I like to think about ways we might use AI machine learning to support our creativity, to support us in accessing information at the right moment, when we have to make a decision, to augment what we’re able to know and understand, when we need to understand it.
And that is just understanding that these systems are really great at ingesting a huge amount of data, representing it some way. And then, we’re sort of getting better at having it come out the other end in a useful way. But I can imagine that, for that kid who was … She was so cute.

Jon Krohn: 40:19

She really is.

Hilary Mason: 40:19

By the way, she wants to be a spy, an AI spy. That’s just awesome. But as she’s growing up thinking about using AI and machine learning to ease a lot of the stuff that we end up doing by hand, and that might even be as mundane as scheduling, or if you think about right now, when I communicate with you, it might be in text, or in email, or in Slack, or on Twitter, or it could be in any of these places. And right now, I have to manually go and search each one if I don’t happen to remember. Right? So, just ways of organizing and accessing information we care about, so that we make better decisions.
And then, going back a bit to what we’re working on at Hidden Door, I’m very excited about these tools as one of helping us be more creative together. So, not being creative itself, because I don’t think these systems are, they’re just really great at representing patterns and tropes, but having those tropes made useful to us, at the moment when we want to play off them or we have a different idea, where we’re the ones in control and guiding, we can give it a little bit of input and see where it goes, and then say, “No, more like that.” I think we’ll see a ton of products making use of the tech in those ways, over the next decade or so.

Jon Krohn: 41:41

Super cool.

Hilary Mason: 41:42

I’m very excited about it.

Jon Krohn: 41:43

I love that answer and it’s nice to be able to tie it back to what you’re doing at Hidden Door today. All right. So, let’s open it up to some audience questions. Wow. Yeah. We’ve got a few already. All right. So, [inaudible 00:41:56] can be busy up here. Is that Tom?

Tom: 41:59

Yeah.

Jon Krohn: 42:00

Hey, Tom.

Tom: 42:00

Hi, Jon. How are you doing?

Jon Krohn: 42:01

Good.

Tom: 42:03

Yeah. So, first of all, thanks so much for coming. I’m not a person who has ever played Dungeons and Dragons or anything like that, but this entire idea is really fascinating to me. So, I think it’s really cool. So, my question for you is, obviously, the space of novels, books, TV shows, movies, is very large, is the thought that the user will be able to upload and pick whatever they want? And additionally, related to that, obviously, a 100-page novel is a different amount of data, it’s different than maybe a 100-episode TV show.

Hilary Mason: 42:41

Right.

Tom: 42:42

How do you work with such different data types?

Hilary Mason: 42:47

That’s a great question. Thank you for asking it. So, I’m going to say that our vision is that creators will bring their novel or TV script, or whatever it may be, and then the fans will be playing in that world. So, even if you don’t play these games yourself, you can think of it almost as a fan fic engine. And then, we do a ton of work to analyze text and pull out the characters, locations, the items, the kinds of plots, the personalities.
And so, when I speak with authors, one of the things they ask is like, “How much do you need to make it work?” The answer is, actually a very small amount, but whatever we get is augmented by … We ask our authors to give us a recipe of the mix of sub-genres they’re drawing from, and then use those tropes as a starting point. So, the more that exists, the more it may feel like a distinctive, unique world, but then as people play in that world, they start to develop it themselves and they create new locations along with the system that are of the world, consistent with it, but something that’s new to them and their friends. And so, that’s more or less how we’re approaching that. Of course, more is better, but also, you can actually get pretty far with even a short story, if it is detailed.

Jon Krohn: 44:14

Awesome. Great question, Tom Bliss. Thank you.

Gary: 44:17

Hi, Hilary. Hi, Jon.

Hilary Mason: 44:18

Hi.

Gary: 44:18

Hilary, big fan. Really, thank you for-

Jon Krohn: 44:21

What’s your name?

Gary: 44:22

Oh, my name is Gary.

Jon Krohn: 44:23

Hi, Gary.

Gary: 44:24

Hi. I really thank you for the video on YouTube, because I think I contributed at least 20 views to the 1.3 million views.

Hilary Mason: 44:31

Awesome.

Gary: 44:33

It really helped me to learn how to explain what is machine learning to my non-technical coworker, but still many people and maybe people who work in specific industries like in education or in public sectors, still think machine learning is a scary thing. So, how do you think we, as data people, can advocate for the trust in machine learning or leveraging AI?

Hilary Mason: 44:57

So, I think we have to make it boring again. Again, I’ve been around at least 12 years. Right? And when I started working in machine learning, which was more like 20 years ago, it was not cool. Okay. It was the kind of thing where the cute boys at grad school parties would turn and walk away. It did not work out so well. I really do think that we need to show what it does for us and give people an intuition for, this is the information that goes in, and this is what we’re doing to it. You don’t have to get into all the details. And this is what comes out the other end and how it’s useful. Such that we’re not saying it’s an AI, but rather, it’s a thing that’s useful. And we’re focusing on that versus focusing on the technology itself.
And I do think that we happen to be, hopefully coming out of a moment of extreme hype in both directions. So, AI is changing the world and everything is AI, but on the other hand, it’s incredibly dangerous. And that’s not to say that it’s not, and we shouldn’t be very wise about where we put automation. But I do think we are hopefully getting into a moment where we can start to focus on how the technology is useful and what it can let us understand and do, and not on the technology is a scary thing itself, or is something that’s leading to generalized AI, or actually intelligent machines, because that’s a bit of a boogeyman.

Jon Krohn: 46:32

Something that’s surprising to me is how almost everyone who isn’t in the industry, thinks that we either already have, or are really on the cusp of artificial general intelligence.

Hilary Mason: 46:45

Yes.

Jon Krohn: 46:46

It’s really interesting, that disconnect.

Hilary Mason: 46:49

I mean, there’s that, and then I’ll say this also, because we’re here in New York, that’s a bit of a West Coast thing too. A bit of a meme for some folks.

Frederick: 46:59

Hi, my name’s Frederick. I’m a big fan of tabletop RPGs, so this concept sounds really interesting. Of course, the story is just a part of playing these games, so I’m curious if you could talk a little bit about the underlying mechanics and game design that goes into Hidden Door?

Hilary Mason: 47:18

I can. And so, I’ll tell you also, since we’re talking a little bit about the experience of building the company too, that my co-founder and I built our first prototype of this idea, we raised a little bit of money, and the very first person we hired was a game designer, because getting those mechanics and that experience right is really important. And that is not my experience. I’ve been a lifelong player, not a designer.
And so, we happened to have someone on our team named Chris Foster, who worked in design on the Rock Band games, and on the Lord of the Rings Online, and brings that game design sensibility and expertise to what we’re able to do with machine learning in this kind of system. And so, I will say that we are not really innovating on the mechanics and that it is more or less what you expect. We have a very simple stats-based system. There are skill-based challenges in the game.

But what we are doing is keeping most of that behind a veil of language, so that you see when you attempt a challenge, whether it succeeds or fails, or critically succeeds, or critically fails, and you see what the consequences are and why, but this is not a stats optimization kind of RPG.
We’ve tried to be really thoughtful about giving as much robustness to the stats and the progression system as necessary to motivate through story and through whatever kind of story it may be, whether it’s a comedy or a more serious story, or a gritty one. But that’s actually been one of our biggest design questions, is how we want to approach that. And also, I’ll say, one of the fun things about this project has been building a team that comes half from game design, traditional game design, and half from machine learning, too. And trying to get everyone to speak the same language and actually work together to create that experience. So, thank you for the question. And we do a lot of play testing to see if we’re hitting it well or not. And it may change a lot this year.
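To make the idea of a simple stats-based challenge kept "behind a veil of language" concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the 20-sided roll, the rule that a 20 or a 1 is a critical result, and the names `resolve_challenge`, `narrate`, and `attempt` are hypothetical and not taken from Hidden Door's actual system.

```python
import random

def resolve_challenge(stat, difficulty, roll):
    """Turn a stat, a difficulty, and a 1-20 roll into one of four outcomes.

    The thresholds are illustrative assumptions, not the game's real rules.
    """
    if roll == 20:
        return "critical_success"
    if roll == 1:
        return "critical_failure"
    return "success" if roll + stat >= difficulty else "failure"

# Hypothetical narration: the player reads consequences, never the numbers.
NARRATION = {
    "critical_success": "{character} manages to {task} flawlessly, and gains an unexpected advantage.",
    "success": "{character} manages to {task}.",
    "failure": "{character} tries to {task}, but it does not work.",
    "critical_failure": "{character} tries to {task}, and it goes badly wrong.",
}

def narrate(outcome, character, task):
    """Render an outcome as plain language with no stats shown."""
    return NARRATION[outcome].format(character=character, task=task)

def attempt(character, task, stat, difficulty, rng=random):
    """Roll, resolve, and return only the narrated result."""
    roll = rng.randint(1, 20)
    return narrate(resolve_challenge(stat, difficulty, roll), character, task)

print(attempt("Ada", "pick the lock", stat=3, difficulty=15))
```

Separating `resolve_challenge` from `narrate` is one possible way to keep the numeric mechanics testable while the player only ever sees the story text.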

Jon Krohn: 49:26

Nice. I think we might have one more in-person question here, but we also have at least one virtual question, which Jared Lander can provide to us.

Jared Lander: 49:33

Jai Jeffries asks, “What about licensing and clearance of intellectual property?”

Hilary Mason: 49:39

Yes. So, we are working with some very nice IP lawyers and doing all of that above board. It is a good question.

Jon Krohn: 49:47

Cool. And a clear answer. All right. Sean Khosla.

Sean: 49:51

Hi, Jon and Hilary.

Hilary Mason: 49:53

Hi.

Sean: 49:54

Hilary, you mentioned earlier, you called language models, basically trope machines, I think, and you mentioned finding overlapping tropes in genres and sub-genres. I’m curious how you’ve fine-tuned models to basically parse out tropes from a piece of text?

Hilary Mason: 50:11

Yeah. We look at it in a few different ways. So, one is in the language overlap. So, really, just what words are being used and how are those words associated with each other? We also look at plot, at the level of subject, verb, object, on a sentence-by-sentence basis. So, this may not be interesting to anybody else, but we have a particular set of plot actions that our game simulation understands, and we model, let’s say, an arbitrary story in that vocabulary and look at the progressions. We can build sequence predictors on that sort of stuff. And then look for basically, clusters.
And we also cheat outrageously. So, we have a set of essentially, sub-genres, that we are interested in and have pulled a bunch of data for, where those sub-genres did not emerge from the data, where we and some authors we work with, wrote them down. And then we look at those things and how they’re different and whether it’s statistically interesting. And again, I’ll say this is all in the context of building a particular game, so it may or may not be useful to anyone, but I do have some cool visualizations I can put on Twitter later, if folks are interested in seeing it.
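As a rough illustration of the plot-level approach described in this answer, the Python sketch below maps subject-verb-object triples onto a small vocabulary of plot actions and then counts which action tends to follow which, a very simple stand-in for a sequence predictor. The triples are written by hand rather than parsed from text, and the vocabulary, the action names, and the functions `to_plot_actions`, `transition_counts`, and `predict_next` are all hypothetical; nothing here is Hidden Door's actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from verbs to plot actions a game simulation might understand.
VERB_TO_ACTION = {
    "finds": "DISCOVERY",
    "steals": "THEFT",
    "takes": "THEFT",
    "attacks": "CONFLICT",
    "fights": "CONFLICT",
    "flees": "ESCAPE",
}

def to_plot_actions(triples):
    """Map (subject, verb, object) triples to plot actions, skipping unknown verbs."""
    return [VERB_TO_ACTION[verb] for _, verb, _ in triples if verb in VERB_TO_ACTION]

def transition_counts(actions):
    """Count how often each plot action is followed by each other action."""
    counts = defaultdict(Counter)
    for current, following in zip(actions, actions[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, action):
    """Return the most frequent follower of an action, or None if unseen."""
    if action not in counts:
        return None
    return counts[action].most_common(1)[0][0]

# Hand-written triples standing in for the output of a sentence-level parser.
story = [
    ("thief", "finds", "map"),
    ("thief", "steals", "gem"),
    ("guard", "attacks", "thief"),
    ("thief", "flees", "city"),
    ("thief", "takes", "horse"),
    ("guard", "fights", "thief"),
]

actions = to_plot_actions(story)
counts = transition_counts(actions)
print(actions)
print(predict_next(counts, "THEFT"))
```

Transition counts like these could also serve as per-story feature vectors for comparing or clustering stories by sub-genre, though that step is not shown here.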

Jon Krohn: 51:25

Excellent. I’ll try to make sure that we get those into the podcast show notes as well.

Hilary Mason: 51:28

That’d be awesome.

Jon Krohn: 51:30

Wonderful. All right. Thank you for the great audience questions, the virtual ones, the in-person ones. Now we get to my final questions that I always ask. So, Hilary, do you have a book recommendation for us?

Hilary Mason: 51:41

Oh, I have so many. I think I’m going to go with Kaiju Preservation Society by John Scalzi. And I’m going to pick that one for a bunch of reasons. One is that I think it is the kind of book that should be a playable game. If you haven’t read it, it is very fun, very lighthearted. And in fact, in the end notes of the book, he talks a lot about going into the pandemic and getting very depressed and then writing this book just to have fun again. So, I had a lot of empathy for that.
And it is one of, taking also a bunch of tech tropes and then going in, without giving much away, to a different world and seeing how people who come out of our tech community do in that world, which I thought was hilarious. So, I’m going to say, we should all read something on the beach this summer that is a little bit fun and lighthearted.

Jon Krohn: 52:29

Nice. That sounds good, Hilary.

Hilary Mason: 52:30

And maybe it’s a good counterpoint to the tomes of data books Jared has been giving away at this event all day.

Jon Krohn: 52:38

Super recommendation. I’m looking forward to that fun beach time with that book. It sounds great. And so then, Hilary, my last question that I always ask-

Hilary Mason: 52:47

You’ve had like five last questions, I’m just going to mention.

Jon Krohn: 52:50

So, I had my last question before we went to the audience questions, but then my two standard ones that we always have at the end of the show, the book recommendation you’ve got, and then how to follow you. So, you have a massive Twitter following already. So, presumably, Twitter is the spot. Anywhere else that people should be?

Hilary Mason: 53:10

Yeah. So, HMason on Twitter. You can follow Hidden Door by signing up at hiddendoor.co, and we are going into a closed alpha in about four weeks. So, if that’s something you’d want to give us feedback on, we’d love to have you participate. And particularly, any teenagers who love stories as well. Yeah, I’m a little boring on Twitter these days, but that’s probably the best place.

Jon Krohn: 53:36

Great. All right. Thank you so much, Hilary, for-

Hilary Mason: 53:39

Thank you.

Jon Krohn: 53:41

… both experiencing the R Conference as an interviewee in this discussion, as well as being on the SuperDataScience Podcast. I really appreciate your time. It was amazing, as I would expect it to be, and hopefully we’ll catch up with you again sometime in the future.

Hilary Mason: 53:55

This was awesome. Thank you, and thank you all.

Jon Krohn: 54:03

For those of you listeners who were able to join us in New York for this live filmed episode of the SuperDataScience Podcast at the R Conference, it was awesome to meet you and chat about your favorite aspects of the show. A live audience certainly makes shooting an episode more fun and exhilarating. We’ll have to do it again soon.

While shooting live might be intimidating for some, it was a breeze for the data science legend, Hilary Mason, from whom we all learned a lot. In today’s episode, Hilary filled us in on how we can build products with no quantitative function to optimize, by breaking it into lots of separate machine learning problems, that themselves have differentiable cost functions. She talked about how we can ensure narrative AI systems do not output nonsense or inappropriate outputs, by creating output templates and carrying out offline vetting of dictionaries.
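The combination of output templates and vetted dictionaries mentioned here can be sketched in a few lines of Python. This is only an assumed, minimal interpretation: the template text, the word lists, and the function `fill_template` are hypothetical examples, and the episode does not describe how Hidden Door implements the idea.

```python
import string

# Hypothetical dictionaries, assumed to have been reviewed offline by a person.
VETTED_WORDS = {
    "creature": {"dragon", "owl", "fox"},
    "place": {"forest", "library", "harbor"},
}

# Hypothetical output template with named slots.
TEMPLATE = "The {creature} waits for you at the edge of the {place}."

def fill_template(template, **slots):
    """Fill a template, accepting only words found in the vetted dictionaries."""
    field_names = [
        name for _, name, _, _ in string.Formatter().parse(template) if name
    ]
    for name in field_names:
        if name not in slots:
            raise KeyError(f"missing value for slot '{name}'")
        if slots[name] not in VETTED_WORDS.get(name, set()):
            raise ValueError(f"'{slots[name]}' is not vetted for slot '{name}'")
    return template.format(**slots)

print(fill_template(TEMPLATE, creature="owl", place="library"))
```

Because every slot value must already be in a reviewed dictionary, a generated word that was never vetted is rejected before it reaches the player instead of being filtered afterwards.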

She talked about how few-shot learning is the machine learning technique she’s particularly excited about. She filled us in on her awesome data science process, obtain, scrub, explore, model, and interpret. And how the final step, interpretation, is the one that data scientists still struggle with today. And finally, Hilary told us how she hopes that AI will transform our lives in the coming decades by automating cognitive drudgery, thereby augmenting human intelligence.

As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Hilary’s Twitter profile, as well as my own social media profiles, at www.superdatascience.com/589. That’s www.superdatascience.com/589. If you enjoyed this episode, I’d greatly appreciate it if you left a review on your favorite podcasting app or on the SuperDataScience YouTube channel. I also encourage you to let me know your thoughts on this episode directly, by adding me on LinkedIn or Twitter, and then tagging me in a post about it. Your feedback is invaluable for helping us shape future episodes of the show.

Thanks to my colleagues at Nebula for supporting me while I create content like this SuperDataScience episode for you. And thanks of course, to Ivana Zibert, Mario Pombo, Serg Masis, Sylvia Ogweng, and Kirill Eremenko, on the SuperDataScience team, for managing, editing, researching, summarizing, and producing another pioneering episode for us today. Keep on rocking it out there folks, and I’m looking forward to enjoying another round of the SuperDataScience Podcast with you, very soon.
