SDS 617: Causal Modeling and Sequence Data

Podcast Guest: Sean Taylor

October 11, 2022

We can’t get enough of causal modeling! This week, we welcome Sean Taylor, Co-Founder and Chief Scientist of Motif Analytics, who offers yet another perspective on causal modeling. From large-scale causal experimentation to Information Systems and Bayesian parameter searches, it’s another jam-packed episode full of technical details that will appeal to both beginners and experts in the field.

Thanks to our sponsors: Datalore and Zencastr
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
About Sean Taylor
Sean J. Taylor is Chief Scientist at Motif Analytics. He was formerly a data scientist and manager at Lyft’s Rideshare Labs and Facebook’s Core Data Science team. Sean has a PhD in Information Systems from NYU and a BS in Economics from the University of Pennsylvania.
Overview

Continuing with the theme of causal modeling from episodes 607 and 613, Dr. Sean Taylor, Co-Founder and Chief Scientist of Motif Analytics, joins Jon Krohn for yet another discussion that dives into one of our favorite topics.
With years of experience at Facebook and Lyft, Sean delivers expert insights into the field of causality and discusses some real-world applications of causal modeling. Sean boils causality down to exploring the consequences of choices and taking appropriate action. And as far as its applications in business, at Lyft, Sean applied simulation and reinforcement learning to projects like matching riders to drivers and pricing.
When it comes to causal tools, Sean emphasizes that one should focus on the right tool for the job and start by learning the fundamentals. One tool in particular that Sean is known for is Facebook’s Prophet. He was instrumental in developing the automated time series forecasting package, which recently garnered criticism from the industry. In response to the critics, Sean clarified that one must always carefully interpret the results of any tool and that blindly applying methodologies is no way to gain confidence in your results. And while diagnostics and model validation may not be the most pleasant of processes, they remain an essential part of any experiment.
After interviewing over 400 people during his time at Facebook and Lyft, Sean also shares what he looks for when hiring new data science candidates. Intellectual curiosity sits at the top of his list of skills: people who are excited to learn new things stay motivated and can pick up new skills quickly, says Sean.
Tune in for more from Sean, including what it’s like working at Lyft, his thoughts on the field of Information Systems, and a few of his favorite book recommendations.
In this episode you will learn:    
  • Sean on his new venture, Motif Analytics [4:23]
  • The relationship between causality and sequence analytics [15:26]
  • Sean’s data science work at Lyft [22:21]
  • The key investments for large-scale causal experimentation [27:25]
  • Why and when is causal modeling helpful [32:34]
  • Causal modeling tools and recommendations [36:52]
  • Facebook’s Prophet automation tool for forecasting [40:02]
  • What Sean looks for in data science hires [50:57]
  • Sean on his PhD in Information Systems [53:34]
 

Podcast Transcript

Jon Krohn: 00:00:00

This is episode number 617 with Dr. Sean Taylor, co-founder and chief scientist of Motif Analytics. Today’s episode is brought to you by Datalore, the collaborative data science platform, and by Zencastr, the easiest way to make high quality podcasts.
00:00:20
Welcome to the Super Data Science Podcast, the most listened to podcast in the data science industry. Each week we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn. Thanks for joining me today. And now let’s make the complex simple.
00:00:51
Welcome back to the Super Data Science Podcast. We’ve got a great episode for you today with Dr. Sean Taylor. Sean is co-founder and chief scientist of an exciting new startup that blends his deep expertise in causal modeling with time series analytics. Previously, he worked as a data science manager at the rideshare company Lyft. He also worked as a research scientist manager at Facebook where he led the development of the renowned open source forecasting tool Prophet. He holds a PhD in information systems from New York University. Today’s episode gets deep into the weeds on occasion, particularly when discussing making causal inferences, but most of the episode will resonate with any curious listener. 
00:01:30
In this episode, Sean publicly unveils his new venture, filling us in on why now was the right time for him to co-found and lead data science at a machine learning startup. He details what causal modeling is, why every data scientist should be familiar with it, and how it can make a real-world impact, with many illustrative examples from his time at Lyft. He fills us in on the infrastructure and teams required for large-scale causal experimentation. He covers how causal modeling and forecasting can’t be fully automated today, as they require humans to make assumptions, but also how humans can make these assumptions in a more informed manner thanks to data visualizations. He explains what the field of information systems is, and, having conducted several hundred interviews, what he looks for in the data scientists he hires. All right. Are you ready for this terrific episode? Let’s go. 
 
00:02:26
Sean Taylor, welcome to the Super Data Science Podcast. Where in the world are you calling in from? 
Sean Taylor: 00:02:30
Thanks, Jon. Glad to be here. I am in my parents’ house in Philadelphia where I grew up. 
Jon Krohn: 00:02:35
Oh no kidding. We’re in the same time zone. I’m in New York. I was expecting you to be on the west coast. You’re usually out in Oakland, I guess. 
Sean Taylor: 00:02:44
I do live in Oakland. Visiting the east coast to see some family for the last couple of weeks. So it’s a little bit of a nostalgia tour, being in the parents’ house and everything. 
Jon Krohn: 00:02:53
Nice. Well, they’ve got a… The sound in whatever room you have in your parents’ house here is great. We lucked out. No echo at all, so perfect for podcasting. 
Sean Taylor: 00:03:03
Might be the one thing that still works in this house. 
Jon Krohn: 00:03:08
So we know each other through Sarah Catanzaro. She was in episode number 601. She had this brilliant episode about venture capital investing in data science, and she was just this incredible speaker, and she’s been super helpful with us also for lining up subsequent speakers. So a couple of weeks ago we had Dr. Emre Kiciman on the show, whom I think you also are aware of. 
Sean Taylor: 00:03:36
I know Emre. Yep. 
Jon Krohn: 00:03:37
Yeah. So that was in episode 613. And then you and Emre and another recent guest, Jennifer Hill in episode 607, you’re all causality experts. So we’ve got this nice string… I think you’re the third in the trilogy of causality experts. 
Sean Taylor: 00:03:56
I hope to say something novel after those two. Probably hard to follow those acts. 
Jon Krohn: 00:04:03
I have no doubt. Based on the questions that we have lined up for you, I know that you are going to be complementing all the causality discussions that we had in episodes 607 and 613. It’s going to be a lot of fun. But before we get into causality, let’s talk about what you’re up to. So your LinkedIn profile has been saying for the last few months that you’ve been working on this “new thing.” So Sean, can you spill the beans for us on air as to what this exciting new thing is? 
Sean Taylor: 00:04:38
I guess I set myself up to have to reveal this today. It’s been really, really tough not to tell people what I’m working on, because I always tell people the two joys in work are doing the fun thing and then telling people about it. And I haven’t gotten to do the second part for quite a while. So a couple months ago I joined a startup called Motif Analytics as a co-founder, and I am the chief scientist at Motif Analytics. Some people may have heard of us already, we have a website and a few blog posts, but we’re still building some awesome stuff and I’m really excited about a further reveal maybe later this year, and showing what we’ve actually been doing. 
Jon Krohn: 00:05:14
Nice. Yeah, congrats. The website looks great. The product idea looks great. We’ll talk about that in a second. The founding team looks great too. It seems like you’re on to something big. So how did you decide to work on this? You haven’t worked at a smaller company like this before, so why was now the right time for you? 
Sean Taylor: 00:05:33
I guess I’ve been thinking about joining a small company for a while, but never really quite pulled the trigger. I like big companies. I think it’s fun to work on a well-defined problem with a big team, and have a lot of resources. I think the thing I’ve always wondered about is how fast you could move if you had way fewer people, and you had really high alignment on what you were doing. And working on some greenfield stuff was always really exciting to me. So I always had this little startup FOMO, and then finally got to… Now it’s no longer FOMO, I’m not missing out any more. And it’s been every bit as exciting as I thought it would be. It’s just so much fun to have something that’s not done yet to work on. When I showed up at Lyft and Facebook, I think both of those companies had already been successful, and were likely to be successful whether I was there or not. But here, the work we’re doing is quite pivotal to what product we’ll actually build in the long run. 
Jon Krohn: 00:06:28
And so what does the chief scientist at Motif do? What are you responsible for? 
Sean Taylor: 00:06:33
That’s an interesting question. We debated what title I would take when I joined, and I think chief scientist is a good one. I kind of copied after… Jure Leskovec is the chief scientist at his new startup, and I was like, if he’s a chief scientist, I think I’m doing a similar role. We have a CTO and a CEO already, and so the product and engineering I think are really well covered by Misha and Theron. And I think that the question is, what does science add beyond engineering? And the product that we’re building, really the target audience is data scientists and analysts. And in some ways I think I’m the voice of the customer, so trying to make sure that I channel those requirements. I’ve been at these companies, I know how these people work, I really want to build the best tool possible for them.
00:07:17
But then also the tool itself is meant to impart a really strong opinion about how to do data science and analytics in a correct way, and in particular, infusing it with this idea that we’re trying to answer causal questions for our customers, which requires a lot of rigor and being careful about how we present results to people. And so in a lot of ways I think of it as, if we succeed at this tool, it will be as if you had me helping you do your data analysis at scale. So we’ll have lots of customers, they’ll be happy using the tool, and they’ll have correct results and be getting good causal inferences to the questions that they have. 
Jon Krohn: 00:07:56
Nice. So you’re kind of chief customer. 
Sean Taylor: 00:07:59
Chief customer. Yes. 
Jon Krohn: 00:08:01
And so did you ever have any debate, either in your own head or with the founding team, about whether you would be chief data scientist or chief scientist? That’s an interesting… It’s a subtle difference in a lot of cases, but I wonder if you spent any time thinking about it. 
Sean Taylor: 00:08:18
I guess I never really considered that, but I guess chief data scientist was maybe a little bit of a mouthful, and so we just went with chief scientist. But I think the interesting bit is, am I actually doing science? And maybe it is more like data science than science. 
Jon Krohn: 00:08:32
I don’t know. It seems like… I don’t know how we define these relatively nebulous concepts, but I think data science, you could consider a subset of science. So you could be doing data science, but you could also be doing chemistry in the future. It leaves that option open for you. 
Sean Taylor: 00:08:53
That’s right. Yeah. Physics, biology. Many, many options. 
Jon Krohn: 00:08:58
Nice. 
Sean Taylor: 00:08:59
Room for future pivots.
Jon Krohn: 00:09:00
Exactly. So you mentioned that you’ll be answering causal questions there a little bit, so that gives some insight into what Motif is doing. You were saying that that’s the kind of thing that you want your customers to be able to get, to basically have you in there doing their analysis for them in an automated way and answering causal questions. But just before we get to causal questions, there’s something about the kinds of analytics that Motif specializes in, and it’s actually related to the name of Motif, which is that the firm specializes in sequence analytics. So how is that different from analytics in general?
 
Sean Taylor: 00:09:42
I think the major insight that we’re trying to work around here is that, when people are using apps and services, they generate sequences of events that capture in a really rich way what they’re doing while they’re using sites and services, whether they’re succeeding at what they’re trying to do, where they might be encountering some friction or bad user experiences. And a lot of tools and techniques aren’t really able to work with sequence data in a native way, and frequently they’re compressed into metrics. So we’re very good at counting things and taking averages and making a time series plot of how many people are doing certain things. But then in this disaggregated form, actually there’s much richer information available, and most of it just gets discarded. 
 
00:10:27
So in a lot of ways, probably one of the biggest potential unlocks for a lot of these teams is being able to use more of the data that they’re already collecting. And this log event data is very common in practice, but frequently when people work with it, they have to flatten it into cross-sectional form first. And so I think we’re really working on defining what this space looks like, but it has a lot of potential, because it’s kind of a middle ground between time series data and cross-sectional data. Time series data is organized along the time dimension, capturing an aggregate over many people. And then cross-sectional data is aggregated over time into the user dimension, so we get this slice in time. And so, is there a middle ground between those two things that can get us the best of both worlds? And I think the answer is yes. 
Jon Krohn: 00:11:14
Cool. So one of the obvious kinds of sequence data to me is anything that unfolds over time. It seems like that’s a lot of what you’re going to be dealing with, and that’s really important in a lot of business applications. Obviously we have sales change over time, revenue changes over time. Pretty much anything you can measure, you can count over time, since time is always passing around us. But are there other kinds of sequences that we could also potentially analyze with your tools? Or is time the main idea? 
Sean Taylor: 00:11:47
You would think of a user, a time, and a tag as the simplest kind of data that we work with. So when a user starts using Google, they start typing, they might click search, then they choose to click on some results. So there’s a little story encoded by that set of actions, and that’s a powerful data format. It’s also very universal: any website or app that has logging enabled is something we should be able to help work with their data at some point. 
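To make that concrete, here is a minimal sketch in Python of the (user, time, tag) format Sean describes, alongside the two aggregated views he contrasts it with. The schema and events are purely illustrative, not Motif’s actual data model.

```python
import pandas as pd

# Hypothetical event log in (user, time, tag) form -- illustrative only,
# not Motif's actual schema.
events = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b"],
    "time": pd.to_datetime([
        "2022-10-01 09:00", "2022-10-01 09:01", "2022-10-01 09:03",
        "2022-10-01 12:00", "2022-10-02 08:30",
    ]),
    "tag": ["search", "click_result", "purchase", "search", "search"],
})

# Disaggregated view: one ordered event sequence per user.
sequences = events.sort_values("time").groupby("user")["tag"].agg(list)

# Time series view: aggregate over users, keeping only the time dimension.
daily_counts = events.set_index("time").resample("D")["tag"].count()

# Cross-sectional view: aggregate over time, keeping only the user dimension.
per_user_totals = events.groupby("user")["tag"].count()

print(sequences, daily_counts, per_user_totals, sep="\n\n")
```

Both aggregated views discard the ordering information that the per-user sequences retain, which is the point Sean makes above.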
Jon Krohn: 00:12:21
Nice. So for now, the tools might not apply to things like natural language sequences or an audio sequence or a genetic sequence. It’s specific to kinds of temporal events, right? 
Sean Taylor: 00:12:35
Right. Yeah. We’ve been calling it event sequences, but it is funny, when I first tell people I work on sequence analytics, they ask me about genomics a lot; it’s one of the first things that comes to mind. 
Jon Krohn: 00:12:46
That’s what popped into my mind first too, but I did a genomics PhD, so I was like, is that just me? 
Sean Taylor: 00:12:52
Yeah. Well, we’re trying to borrow some ideas from those fields, and I think there are some relationships. Probably the biggest distinction is that regular sequences, like text or genomics, have a fixed frequency of sampling, whereas events generated by a user interacting with a platform have a timestamp associated with them and can be dispersed over time. 
Jon Krohn: 00:13:15
I’m not sure how much this factors in with your product, but something with event analysis that can often be really tricky is that the events we’re most interested in often occur quite rarely. So if we think about a binary label in a lot of different domains, the positive label might occur very rarely, like 1 in 100 events or 1 in 1,000 events relative to the negative label. Does that end up factoring in with any of your analytics? 
Sean Taylor: 00:13:49
Certainly we’re very interested in drilling down into very specific situations that users might be in when they’re using a product or service. And those situations are often quite rare, and they’re defined by certain patterns of behavior, like clicking into a deeply nested menu or a deeply nested funnel. And so I think there is a rareness problem that we encounter, and that’s why we work with this very disaggregated form of the data. We’re assuming that the interestingness is happening in the corners of your app, and we’d like to help you find that. 
Jon Krohn: 00:14:21
Nice. Today’s show is brought to you by Datalore, the collaborative data science platform by JetBrains. Datalore brings together three big pieces of functionality. First, it offers data science with a first class Jupyter Notebook coding experience in all the key data science languages, Python, SQL, R, and Scala. Second, Datalore provides modern business intelligence with interactive data apps and easy ways to share your BI insights with stakeholders. And third, Datalore facilitates team productivity with live collaboration on notebooks and powerful no code automations. To boot, with Datalore you can do all this online, in your private cloud, or even on-prem. Register at datalore.online/sds and use the code SuperDS for a free month of Datalore Pro, and the code SuperDS5 for a 5% discount on the Datalore enterprise plan.
00:15:16
And so now it’s finally come time for a topic that we’re going to be talking about a lot throughout your interview, which is causality. So you have this impressive background in causality. What’s the relationship between this causality background and this sequence analytics that you’re now getting into at Motif Analytics? 
Sean Taylor: 00:15:38
Right. So probably the most important relationship is that I think I’m trying to work on casting a lot of the questions that we try to ask and answer in the analytics space as causal questions. And I think that we don’t often formalize them that way, and when you do, it’s very fruitful. It starts to shed some light on what you’re really trying to do when you’re measuring things. And so there’s what I would call forward and reverse causal questions, or causes of effects and effects of causes. And so one of them… The forward causal questions are things where you know what you changed about what you’re doing, and you want to answer what happened as a result of the change that we made. 
 
00:16:17
So say you launched a new feature or changed your design of your site. You might be interested in how that changed the patterns of user behavior. And so that’s a causal question. What happened when I changed something? And then there’s these causes of effects kinds of questions, where you notice something weird is happening, and you’d like to trace it back to some underlying thing that may have been the root cause of that thing. Root causing is another term that’s quite common for that.
00:16:45
And so when you start to cast those as causal questions, it really sheds some light on what you’re trying to do analytically. How are we analyzing the data in order to produce an answer to that question? And sequences are a very fruitful way to do this, because the temporal ordering of events actually implies something about, what are potential causes of effects that you’ve noticed, or what are potential consequences of things that you’ve changed? 
Jon Krohn: 00:17:09
You probably don’t have to account for time travel in your models, and things in the future impacting things in the past. [inaudible 00:17:14] 
Sean Taylor: 00:17:14
No, so far we’re not aware of any applications for that. 
Jon Krohn: 00:17:19
Gotcha. Okay, cool. That makes a lot of sense. And then we will later in the show dig into more causality questions, but I love that we already understand what Motif is about. These causal questions allow us to not just get reports on sequence data, so we don’t just have the what, we have the how or the why. I think that’s the main idea of how causality is being applied in Motif. 
Sean Taylor: 00:17:48
I love that summary. The why is really what we’re trying to get at. I think that’s the most important thing that people are often looking for. And we are trying to draw a pretty strong distinction: a lot of analytics tools are mainly focused on this reporting use case, while we’re focused more on the cause-effect relationships. And that’s what we think is more actionable for people, and likely to provide more benefits. Plus the reporting and counting things space is already really well covered, and I think we have lots of great tools there already. 
Jon Krohn: 00:18:15
Nice. Yeah, that makes perfect sense. So we will come back to causality again later in the show. We’ll just put a little bit of a pin in it for now to talk a bit about this specific transition that you made to Motif Analytics. So Sean, how are you finding it working at this small company like Motif, relative to the big companies like Lyft and Facebook that you’ve been at before? 
Sean Taylor: 00:18:42
There’s a lot of differences. We don’t have a product yet, so probably the urgent thing here is to build a great product and make sure that it makes our users happy. We also don’t have any formal customers yet. We have design partners that we’re working with, but that untethers the work in a lot of ways. You have to have a lot of focus on your problem, and creating alignment between people on the team is challenging, and you don’t have a lot of resources, so you’re working with a much smaller set of people. But the flip side of that is, you get to work very quickly and can do cheaper tests of things. So we’re really into a rapid prototyping culture where we can try new ideas really quickly, see if they work. If they do, we keep them. If they don’t, we throw them away. And I really enjoy that. I think you learn very quickly in an environment like that. 
 
00:19:32
At a big company, I think things are slow moving almost by default, because you have to be very careful and intentional about how you test things. Preparing for launching or testing something at Lyft might have taken several weeks or months, because we had a whole marketplace that needed to stay healthy, and we needed to very carefully introduce changes. Now we’re in this very high entropy state, and I personally find it really rewarding and fun right now. But I am looking forward to the state where we have some people giving us more feedback about the product. 
Jon Krohn: 00:20:00
Nice. No doubt. That’s an exciting early part of the journey in the start of adventure. So one final question related to Motif and the kind of work that you’re doing there. So previously, when you were at companies like Lyft and Facebook, primarily the questions that you were thinking about solving were related to specific company problems. So you would have been trying to maximize revenue or optimize profitability. And so the products and machine learning models that you’d be building would be related to tackling those kinds of company problems. But now, from my understanding of talking to you before we started recording, at Motif you have this perspective of thinking about it from the user’s perspective. So you’re wearing your chief customer hat, and you’re thinking about, what would my customer most need? So you’re building tools for people, instead of just tackling a specific company problem. 
Sean Taylor: 00:21:04
Right. Yeah. It’s been quite a mindset shift. There’s always a little bit of a tension between focusing on a very specific problem for a long time, versus trying to synthesize across many of them. And I did enjoy that. When I was at Lyft, I called myself a student of ridesharing. I spent the whole three years that I was there just really deep diving on one specific topic. Now I have to be a little bit more of a generalist and try to see what’s common between the different people that we talk to, what’s common about their problems, and what can we generalize about it? Which is a little bit of dabbling. I have to quickly understand and try to see what’s similar. And there’s a little bit of an unsatisfying aspect to that: we have these design partners, and I can’t go deep on their particular business. I have to try to figure out what it is about their business that we can help them solve. It definitely is a mindset shift, but it’s fun. It’s really fun to start to think about this diversity of problems and what’s similar about them, rather than sticking to one for quite a long time. 
Jon Krohn: 00:22:08
Nice. Sounds very cool. I’m excited to see how the Motif adventure plays out over the coming months. 
Sean Taylor: 00:22:16
So am I. 
Jon Krohn: 00:22:19
So before Motif, most recently you were a data science manager at Lyft, where you led Rideshare Labs and several data science initiatives ranging from experimentation to simulation and reinforcement learning. Very cool problems. What kinds of business problems were you tackling with these kinds of things? So how were you using simulation and reinforcement learning, which sounds like a really fun thing to be working on? It seems like the kind of thing that, if I was there, I might be like, I really want to be doing reinforcement learning simulations, how can I come up with some commercial objective to apply it to? 
Sean Taylor: 00:23:02
Yeah. Lyft has the benefit that the problem doesn’t change. It’s been very similar for many, many years. And the core problems of Lyft are quite simple: charge people the right price, and match riders to drivers in the most efficient way possible. So operating a marketplace has some core problems that persist over long periods of time, and so you can afford to spend many years trying to develop the best possible technology to solve those problems. And the benefits have high multipliers. Even a small improvement in efficiency, multiplied by all the different markets that Lyft operates in, and all the time that you get to benefit from the improvements, is quite high. So you get to do this top-of-market style research, where you really go into the most advanced techniques and iterate upward to the best-of-breed stuff. 
 
00:23:53
So the two big problems that I worked on most recently at Lyft are these marketplace problems. So dispatch is the part of Lyft’s system that decides which riders get matched to which drivers, and it’s a gigantic design space. Every single time you make a request with Lyft, there are many possible drivers that you could be matched to, and each one of those matches is a potential world that Lyft could live in. And some of those worlds are better for that rider, and some of them are worse for that rider, but they have consequences for the rest of the marketplace as well. And likewise with pricing. Increasing or lowering prices contributes to an individual rider’s experience, but then has these spillovers on other people by taking up drivers or relocating drivers into different places. 
00:24:39
So it’s such a complex system that simulating things and trying to understand things from that perspective is a very fruitful way to try to understand it, but also running lots of experiments and trying new things in a controlled way, and seeing what you can learn from running a different version of the Lyft algorithms in contained situations. 
Jon Krohn: 00:24:58
That is super cool. So something that maybe you were even involved in, something that is a relatively recent development, at least from my experience in New York on the Lyft app, is that for a very marginal price increase, like a dollar, I can get this priority. I can be prioritized. And they often write in the app, it’ll say what the difference is, and it’s often saying, I could get a car in four minutes with the priority, you pay an extra dollar, or get a car in five minutes without the priority. But in practice I find that, when I spend that extra dollar, I get a car right away. It’s like a minute. And so I tend to do it, because I’m like, is a dollar worth five minutes of my time, or that I get to be at this first date five minutes early as opposed to just on time? And that level of relaxation is definitely worth a dollar to me. So I guess it’s those kinds of problems, figuring out can we make this marketplace more efficient by allowing people to have more options, and experimenting with those kinds of things? 
Sean Taylor: 00:26:05
Yeah. What you’re describing is a classic Lyft product idea, which is to try to figure out who’s willing to be patient and wait, and who is in a hurry. And without having that option, we don’t really know. Lyft can’t decide. And so they can’t match the closest driver to the person who actually is the most impatient. And you, by being willing to pay a little bit of a higher price, are revealing that you’re in a hurry and you need to be somewhere more quickly, or that your time is more valuable to you than to other riders. And so there’s ways in which that’s helping the marketplace be more efficient, by letting somebody who’s willing to wait a little bit longer do so, and then there’s a little bit of slack created by that. So there’s lots of fascinating product ideas that can be tested, and they all have huge marketplace implications. And even deciding whether it should be $1 or $2 or $2.17 or whatever is also a policy decision that Lyft [inaudible 00:26:57]. 
Jon Krohn: 00:26:57
Yeah. And it’s dynamic. It depends, I guess, on rider and car availability. It doesn’t seem like it’s a fixed difference [inaudible 00:27:07]. 
Sean Taylor: 00:27:07
No, no. It’s probably not. And there’s an algorithm that decides it, and that algorithm has parameters, and choosing those parameters to make the market the most efficient that it could be is a pretty tough problem. And that’s what I spent a lot of time thinking about. 
Jon Krohn: 00:27:20
Cool. So as a data science manager there at Lyft, clearly with the kinds of sophisticated modeling, reinforcement learning simulations you’re doing, you need a solid foundation, both of the right people, clever data scientists that can do these kinds of advanced simulations for example, and you also need great systems. So in order to run these kinds of experiments efficiently, to be able to gather the data that you need, and then I imagine with things like a simulation, even having the reinforcement learning environments easily available and usable by the people who need them. I guess what key investments does an organization need to make in order to reach that great solid foundation that you had at Lyft? What kinds of things could listeners be doing at their own company to be building to that same kind of great foundation that Lyft has? 
Sean Taylor: 00:28:32
That’s an excellent question, and I wouldn’t say that we were totally succeeding in all those things at Lyft, but I’ll tell you what we were trying to do. Number one is an investment in platforms, and recognizing when you have a problem that’s not going away, and making the requisite investment to solve it in a way that’s good in the medium and long run. And you sacrifice something in the short run to do that. Building a platform means waiting months or years until something’s ready. And then you have some trust that, when you hit the point where your platform is mature, you’re going to get all the benefits from that. 
00:29:08
So we worked on this Bayesian optimization system for our experimentation platform, where we could introduce these continuous parameters and run experiments that would vary those in a rigorous way and converge toward the better values. And it took over a year until we were running experiments like that at a high bandwidth, and it was because we had to make a big long-term investment up front. And part of that isn’t just the technology, it’s the culture of acknowledging that these long-term projects are the future of the company, and they pay a big long-term dividend, and so you have to pay a lot upfront. And so you have all these people working on things that you’re not going to benefit from in the short run, but that leadership is okay with that, and you’re willing to reward people and keep them incentivized to do projects like that. 
 
00:29:55
It’s hard at tech companies, because I think people often want to have this six month review cycle where a project is successful within six months, and people get promotions and get bonuses and raises within six months or a year. And sometimes these problems are just fundamentally more difficult than that. And that’s culture. I think you just need to keep eyes on the long-term prize. And then doing that while you have a short-term business that is very volatile… Lyft is a company where there’s like… There was COVID while I was there, and so a lot of people stopped riding Lyfts. And that is something that you feel like you need to immediately drop everything and respond to. But if you keep your eyes on the long term, you say, three or four years from now COVID will be over, and we’ll really wish that we had built this platform. So the culture was a big part of it as well. 
 
00:30:42
Also, I guess the third thing would be assembling the right people, which is a really hard problem. It’s like, what complementary set of skills do you need to solve problems like that? And a reinforcement learning researcher from a top university would be a great person to add to Lyft, but they’re not going to be super productive unless you have somebody who really understands the engineering or the idiosyncrasies of the data that are being produced by a company like Lyft. So you need such a diverse set of skills to make projects like that work. 
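As a rough sketch of the continuous-parameter experimentation Sean describes, here is what a Bayesian parameter search might look like using scikit-optimize. The `run_experiment` objective, the parameter range, and the invented response surface are hypothetical stand-ins for a live marketplace experiment, not Lyft’s actual system.

```python
import numpy as np
from skopt import gp_minimize  # pip install scikit-optimize

rng = np.random.default_rng(0)

def run_experiment(params):
    """Stand-in for one experiment arm: set the parameter, read out a
    noisy efficiency metric. The quadratic response surface peaking at
    1.2 is invented purely for illustration."""
    price_multiplier = params[0]
    efficiency = -(price_multiplier - 1.2) ** 2 + rng.normal(scale=0.01)
    return -efficiency  # gp_minimize minimizes, so negate the metric

result = gp_minimize(
    run_experiment,
    dimensions=[(0.8, 1.6)],  # continuous range for the parameter
    n_calls=25,               # total experiment budget
    random_state=0,
)
print("best parameter found:", result.x, "objective:", result.fun)
```

Each evaluation stands in for running one experiment; the Gaussian-process model proposes the next parameter value to try, converging toward better values within the experiment budget, as Sean describes.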
Jon Krohn: 00:31:11
Nice. Trying to create studio quality podcast episodes remotely used to be a big challenge for us, with lots of separate applications involved. So when I took over as host of Super Data Science, I immediately switched us to recording with Zencastr. Zencastr not only dramatically simplified the recording process, we now just use one simple web app, it also dramatically increased the quality of our recordings. Zencastr records lossless audio and up to 4K video, and then asynchronously uploads these flawless media files to the cloud. This means that internet hiccups have zero impact on the finished product that you enjoy. To have recordings as high quality as Super Data Science yourself, go to Zencastr.com/pricing and use the code SDS to get 30% off your first three months of Zencastr Professional. It’s time for you to share your story. 
00:32:04
That was an awesome answer. I’m so glad that we had that question to ask you. And so sticking with Lyft, but going back to what will be our recurring theme of causality, as with all of the rest of your career it seems, a theme throughout your work at Lyft was the application of causal modeling. So several of your talks and podcast appearances detail this subject. Perhaps you can give our listeners a taste of why and when causal modeling is helpful, beyond just having an understanding of the correlation between two variables. 
Sean Taylor: 00:32:47
Yeah. Causal inference is a really important lens for every company, because ultimately we care about what will happen if we change something. If you don’t change things, then the business will continue to function as it did before you analyzed the data. And so there’s the completely passive view of data science as an activity where you’re a glorified accountant and all you’re doing is counting the events and telling people what happens. But really I think we believe that we’re having an impact on the business. And so the shortest path to having impact on the business is to think about how we estimate what would happen if we were to make certain changes, and then choose the one where the estimate looks the best. We want to live in the world where, if we ranked all the possible courses of action by their estimated causal effects and found the best one for the company, we choose that course of action.
00:33:35
So it’s really simple conceptually, but not until you put it into these causal terms of planning to make a change of some kind. So the simplest version of that is an A/B test, where you have two different ideas about what to do in some situation on your app or service, and you want to just understand what would play out differently if you chose one or the other. But in practice, the configuration space is the way I like to think about it. The number of choices you make that determine the user experience or how your service runs is actually quite large. It’s not just A versus B, it’s actually a large number of numbers that need to be decided on dynamically, in real time, all the time.
00:34:14
And that to me is causal modeling, is we’re going to change all these things about… So Lyft has lots of different policies besides pricing and dispatch. There’s coupons, there’s driver incentives, there’s support tickets, how we handle those, dealing with offboarding drivers that are dangerous, which creates supply problems but makes Lyft safer. And all those things are causal questions. If we did something differently, what would happen to our service? And the modeling becomes very important because you’re introducing changes on a number of different dimensions at the same time. It’s not just like everything gets better or worse. There’s actually three or four things that you’re trading off. At Lyft, there were things like rider happiness and retention versus driver earnings and driver retention. And safety and reliability are other aspects. And so you need to understand, when you change things, it changes multiple aspects of your business at the same time. And so that’s what causal modeling is really all about. It’s just understanding the consequences of your actions and making appropriate trade-offs. 
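To ground the multi-metric trade-off Sean describes, here is a toy sketch that scalarizes several estimated treatment effects into a single decision score. The metrics, effect sizes, and weights are entirely hypothetical, not Lyft’s.

```python
# Hypothetical estimated effects of one policy change on several metrics
# (positive = better), e.g. from an experiment. Numbers are invented.
effects = {
    "rider_retention": +0.8,
    "driver_earnings": -0.3,
    "reliability":     +0.2,
}

# Business-chosen weights encoding how much each dimension matters.
weights = {
    "rider_retention": 1.0,
    "driver_earnings": 1.5,
    "reliability":     2.0,
}

# One simple way to decide: scalarize the trade-off into a single score.
score = sum(weights[k] * effects[k] for k in effects)
print(f"weighted net effect: {score:+.2f}")  # positive -> worth shipping
```

The weights are the business judgment Sean alludes to: choosing them is itself a statement about how much rider happiness is worth relative to driver earnings or reliability.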
Jon Krohn: 00:35:19
Awesome. That was a really great answer. 
Sean Taylor: 00:35:23
Thank you. Thought a lot about it. 
Jon Krohn: 00:35:25
Do you think that all data scientists should be aware of causal methods? Do you think that there’s value for basically all of us? 
Sean Taylor: 00:35:33
I really do believe that. I know that it’s easy for me to say being the causal inference guy who’s always talking about it, but I do think that often what we’re doing implicitly is causal. I saw in your Jennifer Hill interview, I think she said something similar, that we’re implicitly making causal conclusions, so why don’t we just make them explicit and try to actually estimate the causal effect that we care about? And even if we can’t get it perfect, at least we’re putting things in the terms that we need them to be in later. So it will benefit almost anybody that works in data science to start thinking about, “How do I frame this problem that I’m working on as a causal question?” And if it doesn’t change the answer, then you were already doing causal inference correctly. If it does change the answer, then you weren’t doing it correctly, and you were probably making a mistake. And so it’s probably going to benefit you either way, either in terms of building confidence in formalizing what you’re already doing, or if it changes the answer then surely you’re going to do better by using a causal inference technique. 
Jon Krohn: 00:36:39
Nice. Do you have… So up next we’ll end up talking about Prophet, the famous Facebook forecasting tool that you were involved with. But beyond Prophet, are there specific causal modeling tools or resources that you highly recommend to listeners? 
Sean Taylor: 00:37:00
Yeah. This is a great question. People often ask me for book recommendations and stuff like that. So probably the one that I’m most excited about is my friend Robert Ness has a new book coming out from Manning, and I’ve only gotten to read a little bit of a preview, I don’t think the entire book is done yet, but that’s a book that I recommend checking out. I think that causal inference and causal modeling is less about a particular set of tools or methodologies than it is a holistic way of thinking about solving problems. We use a number of different tools to do causal models, but you can do a causal model with linear regression, or with just measuring two means and subtracting them; that’s a valid causal inference technique. So there’s a diverse set of tools that can be applied, but it’s the way you frame the problem and set up the problem that’s the important part. 
 
00:37:57
And obviously, I’d be remiss if I didn’t mention the DAGs, and Pearl’s work. Thinking about building these graphs of causality is a really powerful tool for trying to frame the models that you’re ultimately going to have to estimate. But also experimentation is a big aspect of causal modeling. The best way to create a causal estimate is to actually run an experiment and do some randomization, and that’s a whole other set of methodologies, what I would call design-based data science, where instead of analyzing data that’s already been produced by someone else, you’re designing the data that can answer the question that you need to answer. And so that’s another methodology that I recommend people become very familiar with. So it’s like ten different things and nothing, I guess. 
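As a tiny worked example of the “measure two means and subtract them” technique Sean mentions, here is a minimal sketch on simulated data from a randomized experiment. The numbers are made up, and the normal-approximation interval is just one reasonable choice.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated randomized experiment: treatment shifts the outcome by +0.5.
control = rng.normal(loc=10.0, scale=2.0, size=5000)
treatment = rng.normal(loc=10.5, scale=2.0, size=5000)

# With randomization, the difference in means is a valid causal estimate.
effect = treatment.mean() - control.mean()

# Standard error of the difference, and a ~95% normal-approximation interval.
se = np.sqrt(treatment.var(ddof=1) / len(treatment)
             + control.var(ddof=1) / len(control))
print(f"estimated effect: {effect:.3f} +/- {1.96 * se:.3f}")
```

The validity here comes entirely from the randomized design, not the arithmetic, which is Sean’s point about framing and setup being the important part.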
Jon Krohn: 00:38:42
That latter methodology where you’re really being thoughtful about how you design your experiments, that seems to be Professor Hill’s preferred way about thinking about causal problems. We talked about that a lot in her episode, in episode number 607. In contrast Emre, Dr. Kiciman in episode 613, it seems like he thinks a lot about the DAGs, so the directed acyclic graphs that allow us to explicitly identify how we think different variables are related to each other and then allow the data to bear out in that DAG. So two different kinds of approaches there for listeners to dig into in those separate episodes. It’s great to hear your take on it, Sean, and just the general overview that there’s lots of different approaches out there, including taking two means and subtracting them. 
Sean Taylor: 00:39:40
If it was an experiment, that’s a great approach. I think it’s fun to reflect on this. Everyone wants to be told there’s one tool that they should learn, but ultimately it’s the right tool for the job. That’s very generic advice, but it’s why understanding the fundamentals is so much more important than understanding how to use a particular tool. 
Jon Krohn: 00:40:00
Nice. So one tool that a lot of people who aren’t maybe deeply involved in causality might have thought would be the answer to all of their causal problems was Prophet. So we mentioned that a few minutes ago. So before Lyft, you worked at Facebook and helped develop their famous automatic time series forecasting package, Prophet. And more recently last year on Twitter, you addressed some of the criticism that Prophet received for automating a process that is often not one size fits all. And so here’s a quote. You said, “The lesson here is important and underrated. Models and packages are just tools. We attribute magical powers to them, as if their creators have somehow anticipated all the idiosyncrasies of your particular problem. It’s unlikely that those creators have anticipated all those idiosyncrasies, and there’s no substitute for evaluation.” 
 
00:41:04
And that kind of point is something that Professor Hill brought up a lot in episode 607 as well, which is that… I’m going to butcher the quote here, but the concept that she was describing is that there are no true causal tools. There are only packages for doing causal modeling where you as a human need to be comfortable with the assumptions that are being made. So it sounds like that’s the same kind of point that you’re making there. 
Sean Taylor: 00:41:36
Yeah. I think we always have to interpret any answer or number that a model or anything else spits out in the context of how it was created. And diagnostics are the way that we do that. We pay close attention to, does it produce answers that make sense in situations where I know the answer, or where it could have been known what the answer was? And then you build confidence through doing that over time. I think when people just blindly apply methodologies and hope that it’s going to work, that’s exactly when we tend to fail. And it highlights the problem: it’s much more fun and interesting and easy to fit a model than it is to assess whether it’s good or not. And diagnostics and model assessment and model validation aren’t very fun, but they’re probably far more important. 
 
00:42:32
And I often tell younger data scientists that you should probably start with writing down how you’re going to evaluate a model before you even think about fitting it. And it’s not the fun part, so we tend to… It’s like eating your vegetables. We tend to eat dessert first in data science a lot of the time. 
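In the evaluation-first spirit Sean describes, here is a short sketch using Prophet’s built-in diagnostics. It assumes a dataframe `df` with the `ds` (datestamp) and `y` (value) columns Prophet expects, and the cross-validation horizons are arbitrary choices for illustration.

```python
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics

# df is assumed to already exist, with 'ds' and 'y' columns.
m = Prophet(interval_width=0.95)  # wider intervals = more honest uncertainty
m.fit(df)

# Rolling-origin cross-validation: re-forecast from many historical cutoffs
# and score those forecasts, rather than trusting the in-sample fit.
cv = cross_validation(m, initial="365 days", period="30 days", horizon="60 days")
metrics = performance_metrics(cv)
print(metrics[["horizon", "mape", "coverage"]].head())
```

The `coverage` column checks whether the stated uncertainty intervals actually contain the truth at the advertised rate, which is exactly the kind of diagnostic Sean argues should come before trusting any fit.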
Jon Krohn: 00:42:48
Right. We come into this job for the fun, not the data cleaning. 
Sean Taylor: 00:42:54
No. 
Jon Krohn: 00:42:57
I want cool models. So do you have any thoughts on how the data science community can disrupt this tendency that you’re describing for conflating tools? Especially maybe seeing a single tool as a one-size-fits-all solution to their specific problem. How can we… 
Sean Taylor: 00:43:17
Yeah. One of the things I really have been enjoying seeing are these packages or projects that make accessible many different methodologies simultaneously for benchmarking or comparison, either on the same problem or on a set of problems. And academics sometimes produce these review papers or comparison type papers. I think that this is a very valuable thing for us to be doing, is saying, “We have a variety of tools that have all been developed for the same problems. Let’s just compare them in some fair environment, and try to do some benchmarking.” 
 
00:43:53
So the people who build benchmarking or tool comparison tools are doing great service to everybody, because they make it easier to facilitate learning about the properties of these algorithms across a wide variety of potential use cases. But then generalizing that to your specific application is also going to be a leap even from there. So maybe something works well on nine out of ten data sets, but your data set is the tenth one. So you still can’t trust other people to have done the work for you under any circumstances, if you really care about the quality of the result. And I think that things that help us with diagnostics and visualization of diagnostics are probably even the most important tools. 
 
00:44:37
So those are what I think we should be focusing on as a community, and rewarding people for finding problems and for fixing bugs and for helping to assess things, rather than… Anybody can develop a new methodology. I think that’s relatively easy. Whether it’s good or not is a really hard question to answer, and it takes a long time. So people that help us come to some consensus about what works and what doesn’t are probably doing the most important work. 
Jon Krohn: 00:45:04
So now working at Motif, do you end up spending time thinking about this… How can the Motif platform be providing causal insights, but also in a way that’s responsible? 
Sean Taylor: 00:45:21
Right. Yeah. It’s a great question, and it’s something that I’m grappling with right now, is how can we make sure that users get good results, even if we haven’t seen their data before? And my most honest answer is that I can’t guarantee that. And I think that the best way forward is to be honest about what you… Tools that degrade gracefully when they don’t know the answer are probably a good approach to that. So forecasting is a good example. If you make a forecast, it’s okay for your forecast to be wrong. Almost all forecasts are wrong. But you should be honest about how uncertain you are about forecasts that are likely to be wrong. So the uncertainty intervals, confidence intervals for those forecasts, could be quite large, and that’s a way for you to be intellectually honest even if you’re willing to make a prediction. 
 
00:46:06
So with Motif, I think part of what we can do is convey our uncertainty about things through how we visualize data, and how we show it in the tool. And then I guess our other secret weapon here is, we’re mainly focused on what I would call hypothesis generation rather than hypothesis testing. So the burden of proof is a little bit lower when you’re looking for opportunities or explanations. It’s not like you’re going to bet the farm on those particular inferences that you make. They’re just to help point you in the right direction. So for instance, finding a bug, it’s okay to have a really high false positive rate for finding bugs, because missing a bug is so expensive. So if you find 20 candidate bugs and one of them turns into a bug, that’s a perfectly reasonable outcome. Whereas if it were like, ship this thing 100% of the time is the conclusion that we’re making and we’re being really forceful about that, then that can lead to a lot of bad problems. 
 
00:47:06
So I think there’s this loss function piece to this, which is that it’s okay to generate hypotheses as long as you follow it up with a more rigorous study. And I think we’re often… Within Motif, we’re focused more on hypothesis generation than on directly telling people what they should do. 
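To put rough numbers on the asymmetric loss function Sean describes for bug-finding, here is a toy back-of-the-envelope sketch; every figure in it is hypothetical.

```python
# Toy asymmetric-loss arithmetic for hypothesis generation.
# All numbers are invented for illustration.
cost_to_investigate = 1    # hours to triage one candidate bug
cost_of_missed_bug = 200   # hours of damage if a real bug ships
candidates = 20            # flagged hypotheses, mostly false positives
real_bugs = 1              # only one candidate is a genuine bug

# Triaging all 20 candidates costs 20 hours; ignoring them risks a
# 200-hour miss, so a 95% false positive rate is still a good deal.
print("triage cost:", candidates * cost_to_investigate)       # 20
print("expected miss cost:", real_bugs * cost_of_missed_bug)  # 200
```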
Jon Krohn: 00:47:22
Nice. It’s interesting that data visualization was part of your answer as to how we can avoid making misleading causal assumptions, because when you were talking about how the community at large could be avoiding making these kinds of mistakes, when you talked about data visualization, I was like, that is something that Motif seems to be specializing in, from what I’ve read in the blog posts. 
Sean Taylor: 00:47:44
Right. We are thinking quite carefully about that, and I think visualization… And there’s a lot of great researchers working on this, how do we incorporate our uncertainty into the visualizations in an honest way that people actually perceive in the correct way? Matthew Kay is one, Jessica Hullman, both great researchers, and they are helping us as researchers who are producing insights and producing inferences be more honest and have the best available tools for communicating the results of those things to people so that they don’t make mistakes downstream. 
Jon Krohn: 00:48:20
Nice. So Sean, beyond just causal tools, it sounds like in your role as chief scientist at Motif, you’re still getting to be quite hands-on with the data. Are there particular tools that you love or that you’re excited about right now, that you think our listeners should know about, whether they’re causal tools or not? 
Sean Taylor: 00:48:40
Good question. I’m trying to think if I’ve had to use anything new. We’re building tools from scratch, and so there’s a little bit less using tools and more building them. 
Jon Krohn: 00:48:49
Well then, even just telling us a bit about that, if you’re comfortable with it, telling us what kinds of programming language decisions you make when you’re building a platform like Motif from scratch. 
Sean Taylor: 00:49:03
Right. I will reveal one thing that I think has been very interesting. We have spent a long time trying to figure out, could we write SQL queries that can answer the questions that we have, and that we want to answer with our tool? And the answer has been, after many times of trying, that we probably can’t, and that SQL is not a very expressive language for the kinds of questions that we have with sequences. And it’s for an interesting reason. The relational model assumes that you have this fixed-width output, whereas sequence data has an irregular shape to it. The answer to what sequence you get from a query could be any length, and so SQL’s just not equipped to return answers like that. 
 
00:49:46
You can hack it with lots of left joins on top of one another, but it gets very verbose. So we’ve spent a lot of time thinking about, how do we query sequence data, and what does a query language like that look like? And I think the answer will be very interesting to people that we show it to later. 
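To illustrate the gap Sean is pointing at: matching an ordered, variable-length pattern of events takes a few lines over native sequences, whereas in SQL it typically becomes the stack of self-joins he mentions. A minimal Python sketch, with hypothetical events and pattern:

```python
from collections import defaultdict

# Hypothetical (user, time, tag) events.
events = [
    ("a", 1, "search"), ("a", 2, "click"), ("a", 5, "purchase"),
    ("b", 1, "search"), ("b", 3, "search"), ("b", 4, "click"),
]

def matches(sequence, pattern):
    """True if `pattern` occurs as an ordered (not necessarily contiguous)
    subsequence of `sequence`, with arbitrary gaps between steps."""
    it = iter(sequence)
    return all(step in it for step in pattern)  # `in` advances the iterator

# Group tags into one time-ordered sequence per user.
seqs = defaultdict(list)
for user, _, tag in sorted(events):
    seqs[user].append(tag)

pattern = ["search", "click", "purchase"]
print([u for u, s in seqs.items() if matches(s, pattern)])  # ['a']
```

The SQL equivalent needs one join per pattern step with time-ordering conditions between them, and the number of joins grows with the pattern length, which is the verbosity problem Sean describes.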
Jon Krohn: 00:50:01
Cool. 
Sean Taylor: 00:50:01
So I think that’s been my primary takeaway, is when you really focus on the problem and how you’d solve it in the best way, and you look around and you say, “Has anybody built any tools that could do this already?” And the answer is no, it means you’re onto something interesting. And I’m the last person in the world that will try to invent something that already exists. I will try to find somebody else’s work and use it as much as I can. So I spent a long time doing that, and the answer was, I don’t think anybody can solve this problem right now, and so we’re going to have to do it. And I think that’s a great premise for a startup. I hope other people feel similarly when we show it to them. 
Jon Krohn: 00:50:35
Very cool. Yeah. We’re all looking forward to that. With you behind it, no doubt it will be innovative and useful. All right. So another question that I ask frequently on air, because I think it’s really useful for our listeners to have an understanding to this answer, which is what you look for in people you hire. So you’ve been at huge organizations like Facebook and Lyft in the past, I’m sure you had to do tons of interviewing at those organizations, and now as chief scientist at Motif, same kind of situation, you’re interviewing people for different kinds of roles probably, maybe a more diverse set of roles than in the past. And so what do you look for? What are the key things that you look for in hires? It could be hard skills, soft skills, maybe a taste of both. 
Sean Taylor: 00:51:26
Yeah. I’ve thought a lot about this, because I have interviewed… I think I counted at one point that I had interviewed 400 or 500 people in my career. It’s been quite a number at this point. And most of those are formal interviews that are meant to help other people hire, so I don’t think they directly tell me what I should be looking for. But I think that if I had to nail down one thing that is the most important thing, it’s intellectual curiosity, and people who are excited to learn new things and excited to learn in general, because they pick up new skills and they learn new things quickly, and they are interested enough in the problems that they figure out how to be interested in what you’re working on, and that keeps them motivated, and it keeps them finding problems and trying to identify solutions. 
 
00:52:12
And so intellectual curiosity is probably the best meta-skill. It’s how you pick up new skills, it’s how you figure out what to work on, develop tastes, and what’s interesting to you. And I think when I tell people what I look for on a resume, it’s like people who have been through the loop of trying and learning and solving a problem and then moving on to another one. So I do like seeing people who have some long term experience working on one thing, but I think bouncing around and learning new things is also a good sign. And those people have generalized what they’re learning into many different settings. It’s like cross-training or something like that. 
 
00:52:52
And then what you learn on one problem often comes in handy later, so if you learn how to solve ten problems, then you’ll find a little bit from every one of those ten problems that applies to what you’re working on currently. So that’s probably the most important thing. I have tended to hire people with PhDs, because I think that they have learned to pick up new skills, learned to learn, and also have a level of focus and very good communication skills, because they need to write about their work. But I have worked with plenty of people without PhDs who have been just as brilliant as many of the PhDs I’ve hired, so I wouldn’t say that’s a sure thing either. 
Jon Krohn: 00:53:30
Cool. Great answer. I am so glad that I asked that. So speaking of PhDs, you have a PhD. It's in quite a multidisciplinary subject, something called information systems. And you did it just a stone's throw from my apartment here, at New York University. This information systems PhD is part social science, part computer science. Can you elaborate on what that subject area is? 
Sean Taylor: 00:53:56
Yeah. Information systems is a fun one. I kind of fell into it. I don't think I knew in advance what the whole provenance of it was. But it comes out of the field of accounting. In the '60s and '70s, before computers were part of business in general, there were systems to keep track of information within a company, like filing systems, to make sure that you could answer questions even when you didn't have computer systems. And then as technology started to disrupt the way that businesses worked and computers started to become mainstream, information systems evolved into the study of how computers and information technology are used to improve businesses. 
 
00:54:43
So it has a lot of adjacency to management science and operations research, but because it's steeped in this organizational, business-school way of thinking, it has a lot of human elements to it. How are people going to use the technology? It's not like the technology solves the problem by itself. You need people to use it in a certain way. Will they adopt it? If you tell your company to start using some new information technology, are they even going to use it? Will they respond in a way that's counterintuitive? That's where the social science really gets woven in: you need to understand human behavior in addition to the technology itself. 
 
00:55:19
And the causal inference side of it basically comes down to this: you're going to introduce some new technology, or prescribe some new way of doing business, or change the way your business operates, and you should be able to say something about what the impact of that was, or will be in advance. So it does align with a lot of my interests, but you really do need to understand the technologies themselves in order to be able to talk about these things. 
 
00:55:42
So I really was forced to learn all these different perspectives, and I'm actually really glad that I did. First of all, I have a hard time committing to any one particular discipline, so it's been good for my attention span that I can bounce around a little bit. But I also found it tremendously effective to borrow ideas from all these different places over the course of my PhD. And later, when I went to work at Facebook, I ended up hanging out with a lot of people with all kinds of varied backgrounds, computer scientists, political scientists, economists, and learning from them and working with them became an extension of my PhD. It's like, how do I borrow ideas from all these other fields that I'm working with, and see what's fruitful about the way they think about problems? So it's been very formative for the way I approach the things I work on. 
Jon Krohn: 00:56:33
Sweet. That was a cool answer. I didn’t know much about information systems before, so it was nice to hear about what it is. It sounds like a great focus for learning how to apply computer science in the real world, which is what you’ve been doing in your career since, at Facebook, at Lyft, and now at Motif. So it sounds like you were doing the right thing. 
Sean Taylor: 00:56:55
Yeah. You never know whether you're just telling the story so that it sounds like the right thing in retrospect, or whether it really was the right thing. But I think for me it worked out great, and I'm very proud of having studied that topic. But I feel like it was just the start. Some people think a PhD is the end of your intellectual life, but for me it was really the start, seeding this idea of being a student of the world. I treated my first few years at Facebook as a new PhD, and I'm treating this startup life as a new PhD that I'm starting. How do I get immersed and really become an expert on the topic of building a successful tool that lots of people want to use? 
Jon Krohn: 00:57:38
Yeah. This is the thing that I love, and probably a lot of our listeners love, about the field of data science: it's endless. In our lifetime, you could never hope to capture 1% of the data science expertise out there, especially since more and more people are getting into it and more and more papers are being published. So you can dig really deep into specific kinds of problems, and, as you're saying, now be onto your third or fourth PhD and still just be getting started. [inaudible 00:58:10] 
Sean Taylor: 00:58:10
Yeah. And still being energized by it, I think, is the important part. It's like, do you still feel like that's motivating for you? And for me it still is, and I'm really thankful for that. But you're right. If you're doing data science your whole life, you never get to stop and say, "I'm fully trained." I think you never stop training, actually. 
Jon Krohn: 00:58:30
And it's cool how we have the wind at our backs with respect to the kinds of applications we can be excited about, because there are more and more sensors collecting richer and richer data all over the world every year, and it's cheaper to store those data than ever before. It's cheaper to compute with those data than ever before. People are sharing modeling approaches as open source, GitHub [inaudible 00:58:58] exponentially more all the time. And so it is this really cool career at a really cool time. 
 
00:59:05
So when we have big guests like you coming up, I post on social media that I will be hosting you. And so I made a post on LinkedIn and Twitter about a week ago, relative to when we're filming, and the post had a huge amount of engagement. People were really excited to hear from you: over 12,000 impressions on the LinkedIn post and a couple hundred reactions. And we had a great question for you from Douglas McLean. Doug is a lead data scientist at Tesco Bank, and he says, "It would be great to hear what views Sean has about Judea Pearl's work on causal inference and on his book Causality. It sounds great, but I'm not really sure how practical this book is in a business setting. I read something Andrew Gelman wrote about it along the lines of, 'I couldn't find anything wrong with the book, I just never found it was useful to me.'" And Doug acknowledges that he's probably misquoting Andrew Gelman a bit there, but that was the sentiment. 
 
01:00:18
So Andrew Gelman, for listeners who aren't aware of him, is one of the world's best known statisticians, and Jennifer Hill, whom we've talked about a number of times in this episode, has co-authored a lot with him, including a couple of iconic books. So Andrew Gelman is definitely somebody in the field to follow, and his opinions carry a lot of weight, as Doug is suggesting here. So that's one thing for you to address. And if I haven't already said too much for your working memory, Sean, there's one other little bit to it, which is that Doug found it interesting that Jennifer Hill never mentioned Judea Pearl in her entire episode of Super Data Science, and he says he found that confusing. 
Sean Taylor: 01:01:08
Yeah. I have a lot to say about this, but I’ll try to keep it more succinct. So first of all, to give full transparency, I’m a huge Andrew Gelman fan, and I’ve been reading his blog for a long time, and just a huge fan of his work and his perspective. And I have read Judea Pearl’s work, and I’ve gotten into many Twitter debates with Judea Pearl. 
Jon Krohn: 01:01:30
I didn’t know that. 
Sean Taylor: 01:01:32
And Judea Pearl's a brilliant researcher, his work has been foundational to the field, and it's extraordinarily valuable work. I think the problem with his work isn't that it isn't valuable, it's that it's just one piece of what you need in order to be a practitioner who solves causal inference problems. It's sold as something that solves all your problems for you, but it doesn't really have a lot of empirical content. People like Jennifer Hill and Andrew Gelman are used to working with data in the real world and trying to answer questions as practitioners. I think they've found some value in Pearl's work, and I would bet they draw DAGs and think about what identification criteria they're using on a regular basis. But it's just such a small piece… Not a small piece, it's an important piece, but there are so many other interesting problems in causal inference that all need to be solved in addition to it, and I think Pearl's work can seem a little bit trivial in comparison to those other problems. 
 
01:02:34
Pearl often ignores the problem of where you get the data from in the first place, and whether you have the ability to intervene or run an experiment. Ignoring human agency in the production of the data that we use to create insights is really silly to me. It's a big limitation of that perspective: living in a fixed world where your job is to model it and try to answer questions the best you can with the data that someone already gave you. So Pearl's great, Gelman's great. It just goes back to what I said earlier: I think you need to know ten things to do causal inference, not one. There's no one off-the-shelf tool that's going to be your causal inference tool. Pearl might really want you to use his tool, and you probably will, but there are lots of other ones that you're going to need as well.
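To make the DAG-and-identification idea Sean mentions concrete, here is a minimal Python sketch of backdoor adjustment on the simplest confounded DAG. It is not from the episode; the DAG, variable names, and effect sizes are all invented for illustration.

```python
# A minimal sketch of the backdoor-adjustment idea on a simple DAG:
#   Z -> T,  Z -> Y,  T -> Y
# Everything here (variables, effect sizes) is invented for illustration;
# it is not code from the episode or from Motif.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

z = rng.binomial(1, 0.5, n)                  # confounder
t = rng.binomial(1, 0.2 + 0.6 * z)           # treatment depends on Z
y = 2.0 * t + 3.0 * z + rng.normal(size=n)   # true effect of T on Y is 2.0

# Naive difference in means is biased upward: treated units are
# disproportionately the Z = 1 units, which have higher Y anyway.
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: estimate the effect within each stratum of Z,
# then average the per-stratum effects over the distribution of Z.
adjusted = sum(
    (y[(t == 1) & (z == v)].mean() - y[(t == 0) & (z == v)].mean())
    * (z == v).mean()
    for v in (0, 1)
)

print(f"naive:    {naive:.2f}")     # well above 2.0
print(f"adjusted: {adjusted:.2f}")  # close to 2.0
```

In this toy setup Z satisfies Pearl's backdoor criterion for the effect of T on Y, which is why stratifying on Z removes the bias; the point Sean is making is that knowing this criterion is only one of the many things a practitioner needs.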
Jon Krohn: 01:03:20
So that could also explain why Jennifer Hill might not have needed to talk about Judea Pearl. It doesn't surprise me that one causality practitioner wouldn't mention another specific practitioner; as you say, there are so many different ways to solve causal problems. And especially when you think about the way she does things… As we heard in Jennifer's episode, she has this focus on how we design experiments, and when you're designing experiments carefully from the very beginning, like you're saying, you could be using the difference between means to come up with a causal answer. You don't need complex modeling. 
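And to make Jon's difference-between-means point concrete: when treatment is randomized by design, the confounding path is broken, so the plain difference in means is itself the causal estimate. A minimal sketch under the same invented setup as above:

```python
# A quick illustration of Jon's point: once treatment is randomized,
# the simple difference in means is an unbiased causal estimate, with
# no DAG machinery needed. Same invented setup as the sketch above,
# except that T no longer depends on the confounder Z.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z = rng.binomial(1, 0.5, n)               # the would-be confounder
t = rng.binomial(1, 0.5, n)               # T is assigned at random
y = 2.0 * t + 3.0 * z + rng.normal(size=n)

effect = y[t == 1].mean() - y[t == 0].mean()
se = np.sqrt(y[t == 1].var() / (t == 1).sum() +
             y[t == 0].var() / (t == 0).sum())
print(f"difference in means: {effect:.2f} +/- {1.96 * se:.2f}")  # ~2.00
```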
Sean Taylor: 01:04:03
Yeah. We tend to focus on solving problems that are feasible, which means that the DAGs are often simple. And with a simple DAG, I think Pearl’s work isn’t particularly useful, because it’s something that can be solved through other ways of thinking about it that are easier for people to reason about. When you get to complicated DAGs, it’s like, are we going to trust the results of a causal inference procedure on top of a complicated DAG anyway? And often the answer to that is no. 
 
01:04:33
So Pearl's work is most useful in the setting where we're least likely to be able to make progress or proceed with an estimation task. Like I said, I love Pearl's work, I use it all the time. I just think you need to know so much more to proceed. And Jennifer Hill works in very pragmatic places, on real data with real problems, and she found it not to be the emphasis of her work. I think that's somewhat revealing. 
Jon Krohn: 01:04:59
Cool. Yeah. Great answer. I’m so glad that Doug asked that. Thank you, Doug. 
Sean Taylor: 01:05:03
Great question. 
Jon Krohn: 01:05:05
And so we've now talked about Judea's book. Do you have any particular book recommendations for us? It could be related to causality or about anything at all, but I always ask for a book recommendation from our guests.
Sean Taylor: 01:05:17
Yeah. I think one I've recommended before that I really like is The Art of Doing Science and Engineering by Richard Hamming. I just think that it's such a great perspective. First of all, it's one of those career-retrospective types of books. He's a brilliant researcher who made a lot of contributions, and he's trying to synthesize: how did he work on these problems, what was it like, and what was useful to him? And then it marries together science and engineering and thinks about the relationship between those two things. The building step and the learning step are so inextricably linked, and I think Richard Hamming does a great job of covering that.
01:05:58
And the foreword’s by this guy Bret Victor, who is one of… I don’t know what to call Bret Victor. I guess you’d call him a researcher, but Bret Victor has built some of the coolest technology demos that you’ll ever see, and has just such an amazing perspective on the relationship between building things and thinking. And so getting to read some new concept from Bret Victor is always a treat, so highly recommend that book. 
Jon Krohn: 01:06:22
Cool. Great recommendation. Unsurprising from someone like you. So Sean, this has been a great episode. I've loved talking to you. It's felt really easy. Listeners don't know this, because we edit the episode, but some episodes require more editing than others, and in this conversation any of the retakes we had to do were because of me. Sean did this entire episode completely flawlessly. It's just really impressive, being able to dig deep into causal problems and real-world problems, and getting to hear what you're up to right now at Motif. It's been really fun, Sean. No doubt listeners will be eager to follow what you're up to and to hear your thoughts. So what's the best way for them to follow you? 
Sean Taylor: 01:07:21
Probably just Twitter, SeanJTaylor on Twitter. Really appreciate all the people that follow me for all the crazy stuff that I say, but I hope they find it useful. 
Jon Krohn: 01:07:28
You have like 50,000 followers on Twitter, which is wild. 
Sean Taylor: 01:07:33
I always tell people it's just because I've been on Twitter for way too long and for way too many hours. But I do really appreciate that I have an audience like that. It's been so rewarding to be able to ask questions of that group of people, because I just get the best answers. So if I sound like I'm saying something smart, it's often because I learned it from people on Twitter, and I'm thankful that those people listen to me and what I have to say. 
Jon Krohn: 01:07:56
Nice. All right, Sean. Thank you so much for being on the program. It’s been a blast, and maybe in a few years we can catch up with you again and hear how the Motif journey’s been coming along.
01:08:10
Today's episode was a ton of fun for me. I hope you enjoyed it too. In the episode, Sean filled us in on his new machine learning startup, Motif Analytics, and how it aims to identify the root cause of patterns in sequential data. That is the how or the why, instead of just the what. He talked about how we can use causal modeling to understand the consequences of our actions and make better decisions, with specific references to how Lyft uses causal modeling to inform its dispatch and pricing models. He talked about how big, sophisticated experimentation projects like Bayesian parameter searches can require an upfront investment of several months and diverse teams that blend, say, academic reinforcement learning experts with data engineers. And he talked about how the field of information systems blends computer science with the social sciences to build tools that are effective as part of a broader human-in-the-loop system.
01:09:03
As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Sean's social media profiles, as well as my own social media profiles, at SuperDataScience.com/617. That's SuperDataScience.com/617. If you'd like to ask questions of future guests of the show, like audience member Douglas did during today's episode, then consider following me on LinkedIn or Twitter, as that's where I post about upcoming guests and ask you to provide your inquiries for them.
01:09:35
Thanks to my colleagues at Nebula for supporting me while I create content like this Super Data Science episode for you. And thanks of course to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the Super Data Science team for producing another invaluable episode for us today. For enabling this super team to create this free podcast for you, we are deeply grateful to our sponsors. Please consider supporting the show by checking out our sponsors’ links, which you can find in the show notes. And if you yourself are interested in sponsoring an episode, you can find our contact details in the show notes as well, or make your way to jonkrohn.com/podcast. Last but not least, thanks to you for listening all the way to the end of the show. Until next time, my friend, keep on rocking it out there, and I’m looking forward to enjoying another round of the Super Data Science podcast with you very soon. 