Kirill Eremenko: This is episode number 229 with Co-Founder at Cursor, Adam Weinstein.
Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name is Kirill Eremenko, Data Science Coach, and Lifestyle Entrepreneur. Each week we bring you inspiring people and ideas to help you build your successful career in Data Science. Thanks for being here today, and now let’s make the complex simple.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast ladies and gentlemen. Super excited to have you back here on the show. Today, we’ve got a very special guest joining us for this episode, Adam Weinstein, who is a Co-Founder at Cursor. Now, what you need to know about Cursor is it’s a company tool that helps organize Data Science assets. So, if in your company you’re working on many different Data Science projects, you have lots of different types of code, different dashboards, different metadata, different teams working on these projects, all that can be organized with Cursor.
Kirill Eremenko: In this podcast, you will find out quite a lot of interesting things. First of all, we’ll talk about Adam’s own journey, his background. How he went from working at Deloitte, all the way to working at LinkedIn, and then founding his own company. So, if you’re interested in actually being an entrepreneur in the space of Data Science, this podcast is definitely for you. Plus, we’ll talk about the concepts of Data Literacy and Citizen Data Scientist, and you will find out how Cursor can help you out in this journey. Of course, in general what it means for an organization to be data driven, data literate, and what Citizen Data Scientists are.
Kirill Eremenko: So, if you are a founder of an organization or, if you are an executive, this podcast is also for you. And, in general, if you are interested in becoming more data literate, and interested in the concept of Citizen Data Scientist, whatever your level is in the organization, once again this podcast is for you.
Kirill Eremenko: One more thing I wanted to mention before we get started is that this podcast is available in a video version. So, if you’d like to watch the video of us chatting with Adam, head on over to www.superdatascience.com/229. Then, you can enjoy the video experience there. However, if you’re listening to the audio while you’re running or in the car or something else, then feel free to continue with the audio because you will still get all the valuable insights from here.
Kirill Eremenko: And now, without further ado, I bring to you Co-Founder of Cursor, Adam Weinstein.
Kirill Eremenko: Welcome to the SuperDataScience Podcast ladies and gentlemen. Today we’ve got a very exciting guest on the show, Adam Weinstein. Adam, how are you doing today?
Adam Weinstein: Doing great. Doing great. How about yourself?
Kirill Eremenko: Very good as well, thanks. We were just chatting before how cool it is, the time difference. I’m in Brisbane, Australia. It’s almost 10:00 AM. What’s the time for you in San Francisco?
Adam Weinstein: Yeah, it’s like a little before 4:00 PM the day before.
Kirill Eremenko: We were talking about it-
Adam Weinstein: [crosstalk 00:03:21] the day almost.
Kirill Eremenko: Yeah, I can tell you all about like in the morning you’ll have some rain. Then it’ll get sunny. Whole bunch of [inaudible 00:03:28] your day.
Adam Weinstein: We could use some rain here. There’s about six years of drought that we’re trying to dig our way out of.
Kirill Eremenko: That is crazy. That is crazy. I heard about the fires that were happening in California. Is that still going on?
Adam Weinstein: Yeah. No, so they’re luckily all … they’ve all burned out, I guess, at this point. Unfortunately, they finally contained the fire about 12 hours before the rain came. So, it was poor timing but, yeah. The impact has been pretty massive. It’s fascinating, Vice just did an interesting special over the weekend on sort of what does this mean in the long run, because we’ve had now, two or even three years in a row, we’ve had tons and tons of acreage burn, and houses burn. People displaced, people killed, et cetera. Just because of these wildfires that have started. Interesting to think if that’s not necessarily an anomaly anymore, right? Is it becoming the normal?
Kirill Eremenko: Yeah.
Adam Weinstein: Maybe a topic for another time.
Kirill Eremenko: Yeah, I know what you mean. I originally saw a visualization of they had … what was it? I think they had the 50 states of the US, and they had how often abnormal weather events happened over the past like 60 years. And, all the same pictures, and it’s animated. You can see like, okay, if the start was here, something popped up here, here, here, here, here. Then, as you get into the 2000s it’s like everything is red every year like, something is popping up. It’s crazy. We might actually include that in the video version of this podcast so people can see.
Adam Weinstein: Yeah, no it’s interesting. I mean, the size of the area that burned, I think, was roughly equivalent to almost six San Franciscos or, you know …
Kirill Eremenko: Wow.
Adam Weinstein: About 12 New York Cities, at least. So, if you imagine like … now granted, these aren’t densely populated areas, but still, that … if you’ve been to one of these towns and you say, “Okay, that entire town burned. Multiply that by 12 or six [inaudible 00:05:27].” It’s a huge amount of land that’s just totally been destroyed.
Kirill Eremenko: Wow, that’s crazy. All right, well let’s move on to Data Science. Hopefully that situation will get better with the fires. Data Science, so Adam, very excited to have you on the show. You have an amazing journey through Data Science with lots of highlights, LinkedIn, Bright, and now your own company. I don’t even know where to start. Let’s …
Adam Weinstein: Yeah.
Kirill Eremenko: Let’s maybe talk about if somebody were to ask you off the street for the first time, and you were introducing your … how do you introduce yourself to people? Who is Adam Weinstein, and what do you do?
Adam Weinstein: Yeah. I guess I’m a data geek, right? No. You know, background is interesting, like you said. I started life out of school like every undecided undergraduate. I was a consultant for a couple of years. And, I always-
Kirill Eremenko: Sign me up for that as well. Yeah, same story.
Adam Weinstein: So, I’d always been interested in the hardware side of technology. So, I actually got into infrastructure consulting. We were helping really large companies figure out how to deploy data centers around the world. At the time, there was this big wave of virtualization that was occurring, right? So, back in the day you’d have one application on one server, and even if it only ran a job for 10 minutes a day, it would be an individual server that would be wasted for the other 23 hours and 50 minutes.
Adam Weinstein: So, virtualization was like, okay, can you compact multiple processes onto the same box. Now, it’s containerization, or Kubernetes or whatever. Fast forward 15 years. So, I happened to get a little bit of a focus in data infrastructure. So, after I had done my sort of tour in the consulting world I joined a company, a startup actually, called ExactTarget. We had an office in Sydney ironically but, never [inaudible 00:07:32].
Adam Weinstein: But, ExactTarget was-
Kirill Eremenko: You worked in Sydney?
Adam Weinstein: Say what?
Kirill Eremenko: You worked in Sydney?
Adam Weinstein: No, we had an office in Sydney. I [inaudible 00:07:41] in Sydney. Wish I worked there, but no. I worked in Indianapolis, which is … I lived in Chicago as a consultant and moved back to Indianapolis where I grew up after being a consultant for a while. ExactTarget was in the email marketing space. So, it’s now … it’s currently Salesforce Marketing Cloud. Salesforce bought the company about five years ago. It’s now the … I think it’s the largest email sender in the world still. Like, most large brands that send out any quantity of email, whether it’s Nike, or large banks, or anything in between, right? If you’re sending … if you need to send a few hundred emails, a few hundred million emails in a few minutes, you tend to use something like an ExactTarget, or today’s Salesforce Marketing Cloud.
Adam Weinstein: But, they didn’t really have a data team at the time. So, that was a role I kind of jumped into shortly after getting there. It was fascinating, right? We were running the world’s largest Microsoft SQL Server, which I’m not sure that’s something you want to brag about, but it was a fascinating time, right? The company had gone from sending a few million emails on behalf of a few small businesses, to you know, hundreds of millions of emails on behalf of Groupon, and Nike, Bank of America.
Kirill Eremenko: How crazy is that, that like a company that’s running the world’s biggest SQL Server, and working with so many users, and companies, they didn’t have a data team. Like, right now in this day and age, 10 years later, it’s unfathomable for a company [crosstalk 00:09:04] data team.
Adam Weinstein: Yeah, there was like an infrastructure team that kept the lights on, right? They literally were hiring the architects from Microsoft just to keep the thing running. Like, [inaudible 00:09:14] Microsoft SQL Server because it was such a kind of a fire … I’ll refrain from using other language, but yeah. It was a mess.
Adam Weinstein: But it’s interesting, right? Even data companies struggle sometimes I feel like, to step away from it and say, “Okay, how can we look at the data that we have and be more intelligent about how we use it?” So, yeah. We were really a data driven company. We helped companies identify, okay, you’ve got this list of emails that you’ve accumulated from your website, or from orders. There was a lot of retail at that time. How do you market to them in a more cost effective manner, as opposed to TV ads, or print mailers and things like that. The company was growing through the recession in 2008.
Adam Weinstein: So, I got there in 2007. Ironically, we probably went public at the end of the year. Although, we ended up pulling it because it was such a terrible time, but the business did phenomenal through the recession, mostly because it was a low-cost alternative. You know, the cost of sending an email is a fraction of a penny. The cost to send something in the mail or put up a TV ad is infinitely higher. So, yeah-
Kirill Eremenko: One of the first players there. One of the first companies-
Adam Weinstein: Yeah, first data player at least there. So, we did everything from how do you identify when a customer is like, you know, high risk and looking to churn, to hey, this customer just signed up yesterday and they’ve already blown through their utilization so, why don’t we talk to them and find a better product. That kind of thing. It was very early days of data, and I’m not even sure it was called Data Science. It was probably more business intelligence.
Kirill Eremenko: Yeah.
Adam Weinstein: But, it was a fun time. So, was there for a few years. We got to about 1200 employees and I decided I wanted to go do something small again. I actually took kind of a brief hiatus from data. So, I was always a sarcastic greeting card sender. I used to send cards to family and friends. I had this crazy idea that the challenge with cards is you can go down the street and you can buy some cards, but if they’re not the style that you like or in the language that you need you’re kind of … you have no options, right? It’s not like you can just go find another store. Card stores were kind of dying too.
Adam Weinstein: So, I came up with this idea like, okay, if you just print everything on demand you can have a selection of an infinite number of cards. Someone could come online in Australia, order a card, have it printed and mailed in New York today and get to the person tomorrow. You know, if you just have this distributed print infrastructure. So, that was a company called Engreet. We were a small team. Grew it, never raised any money ironically, but debated moving it out west to do so. Coincidentally it was bought by the printer, so the printer that was doing all this was like, hey we want to get in that business. We talked over drinks and they decided, okay, well what if we just brought you on board. So, ended up selling them Engreet, and now it’s called The Greeting Card Shop. So, it still exists. It’s funny, I still get their emails every holiday season.
Adam Weinstein: Then, I moved out west, and this is probably where my real start in data kind of occurs, at least modern day data. To work for a company called Bright. Bright was a company in the machine learning space that helped match jobs to people. So, if you think about the job search process going back a few years, you’d go to a CareerBuilder or Monster or Indeed. You’d type in a job title and say, hey, I want to be a software engineer or a product manager, or marketing coordinator or whatever it may be. Then, they’d show you all the lists of jobs that had those titles. But, companies were getting a little creative with titles. I think the joke we used to use was, what if the job was called ninja. Nowadays, I don’t think, hopefully, anybody uses that title, but maybe they do.
Adam Weinstein: So, we built an algorithm that would basically parse a resume and a job description. Calculate the normalized skills that were basically being used, right? So, Oracle, and Facebook, and SAP, and Microsoft and Google. They all recruit people that can write Java, but how they describe it is very different. So, we would come up with a way to normalize the skills being represented on both the job description and the resume. Then, score the fit between the two of them.
Adam Weinstein: So, instead of searching for a job title you would actually just [inaudible 00:13:21] show you all the jobs you were qualified for. So, that was a fun couple of years.
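To make the matching idea concrete, here is a minimal sketch of normalizing skills on both sides and scoring the overlap. It is not Bright’s or LinkedIn’s actual algorithm, which the conversation does not detail. The alias table, the function names `normalize_skills` and `fit_score`, and the coverage-ratio score are all assumptions made for illustration.

```python
import re

# Hypothetical alias table: maps different phrasings to one canonical skill.
SKILL_ALIASES = {
    "java": "java",
    "j2ee": "java",
    "core java": "java",
    "sql": "sql",
    "t-sql": "sql",
    "python": "python",
}


def normalize_skills(text):
    """Return the set of canonical skills mentioned in free text."""
    lowered = text.lower()
    found = set()
    for alias, canonical in SKILL_ALIASES.items():
        # Match the alias as a whole token, so "java" does not hit "javascript".
        pattern = r"(?<![a-z0-9])" + re.escape(alias) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.add(canonical)
    return found


def fit_score(resume, job_description):
    """Fraction of the job's normalized skills that the resume covers."""
    required = normalize_skills(job_description)
    if not required:
        return 0.0
    covered = required & normalize_skills(resume)
    return len(covered) / len(required)
```

With this sketch, a resume that mentions J2EE and T-SQL covers two of the three skills in a posting that asks for Java, SQL and Python, whatever the job title happens to be.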
Kirill Eremenko: Sorry, just kind of like a recommender engine, right?
Adam Weinstein: Yeah.
Kirill Eremenko: Like you upload your resume and then instantly you get jobs that you’re … is that still used online when I go on Indeed or Glassdoor, and stuff like that?
Adam Weinstein: Yeah. So, LinkedIn bought Bright in 2014. So, when you do … when you perform a job search on LinkedIn that scoring algorithm is actually being used behind the scenes to [inaudible 00:13:53] a job. So, you still do search for a title but, there’s sort of a marriage of the title, and then that score that helps recommend what jobs to see.
Kirill Eremenko: Oh okay.
Adam Weinstein: And then, [inaudible 00:14:03] LinkedIn emails on job recommendations. I’m sure there are still some mistakes in the algorithm but yeah, those recommendations are being informed by the same score.
Kirill Eremenko: Interesting.
Adam Weinstein: [crosstalk 00:14:14] owned by LinkedIn, but yeah.
Kirill Eremenko: So, it uses the information on your profile when you search for stuff? That’s so cool. That’s so cool. It’s important to get your profile right, not just for other people to see it, but also for your searches to be most relevant to you.
Adam Weinstein: Right. And, increasingly, so LinkedIn’s core business is this recruiter product that if you’re a recruiter inside of a large corporation you pay a substantial fee to be able to search across the entire LinkedIn network. You can’t … you know, you can then send an email to anybody. Increasingly, that search process is converting to a recommendation process. So, instead of the recruiter saying, “Hey, I’m looking for software engineers that have three years of job experience and have worked at these five companies.” LinkedIn is trying to push candidates to those recruiters. That’s being informed by the same recommendation algorithms. I think, you know, the downside of the hey, you have to put all your sort of life’s work into this profile, but the more you put there the better off people that might be looking to hire you will be. Although, as a Data Scientist you’re probably not hurting for inbound interest in being hired.
Kirill Eremenko: Yeah. [crosstalk 00:15:22]. And, just maybe a year or two ago LinkedIn started … when somebody endorses somebody for their skills, now it’s not as easy as before. Just like click, click, click. Now you have to explain. How do you know the person, what level of endorsement and so on. I think that probably has to do with that whole system as well.
Adam Weinstein: Yeah, it has to do with how knowledgeable, or what’s the quality of that recommendation. So, before anybody could go and endorse anybody for anything. I could endorse somebody for material science engineering and I don’t know the first thing about material science, and vice versa. Somebody could endorse me for something I might be good or might not be good at but, even if they don’t know anything.
Adam Weinstein: There was actually a time, maybe embarrassing in LinkedIn history, where you could create your own skills. We famously endorsed people for very inappropriate things like, you know, I don’t know what a good example would be. Like dropping things on the floor, or tripping in public places, or things like this.
Kirill Eremenko: Yeah.
Adam Weinstein: You know, not skills that anybody would want in their profile but, you could just endorse anybody for anything. So, why not?
Kirill Eremenko: You’d endorse them … you’d create a skill for them, right? You endorse them for something they don’t even have on the profile.
Adam Weinstein: Exactly. Exactly.
Kirill Eremenko: Good times, yeah.
Adam Weinstein: So, no. Yeah, they’re trying to get the quality aspect figured out because it’s … that really is what tells you whether somebody is good at something. If somebody that’s in a domain can endorse somebody else in that same domain for something, that should be a very valuable endorsement, and that’s what they’re trying to get to with skills.
Kirill Eremenko: Yeah, totally. Totally gotcha. What happened after LinkedIn? You were there … you were at Bright until they got acquired, so you were there for what? Two years. Then, at LinkedIn you worked for another three years?
Adam Weinstein: Yeah, exactly. So, here I was, this sort of startup data guy at Bright, right? We had a small team but it wasn’t LinkedIn size. When I got to LinkedIn, they actually … the joke was they didn’t necessarily know what to do with me. They’re like, okay, here’s a guy that knows how to build data teams in small organizations. We’ve already got 200 of those people. What do you want to do?
Adam Weinstein: But, it turned out that LinkedIn had just built a … sorry. An office in China, so in Beijing. And, the way doing business in China works it was technically a subsidiary. So, we had a couple of folks there but we were building it out as if it were an independent company. It had to be autonomous, right? LinkedIn in California didn’t know the first thing about succeeding in China. We hired a team that had been really successful previously, a couple from Google, from Apple, and elsewhere. We just wanted to give them the autonomy to do so.
Adam Weinstein: So, I became the data ops guy that was sent there to help build out all the tooling that they needed to be successful against most of the LinkedIn data that already existed. But, the recommendation was, don’t necessarily use what we already have. Think of things as if you were doing it from the ground up. So, I was there like two weeks when they asked me to do this.
Adam Weinstein: I started getting on a plane, going back and forth every six weeks. My first question was okay, how are things done around here? What are the metrics we care about? Where is the data? Show me models that are relevant. How do I get an understanding of this business that was about 5000 employees at the time. The challenge I found was that there was no one place to go. So, I had about 200 or so, it was like 180 coffees. Ironically I don’t drink coffee, but 180 meetings over the course of about 18 months where I just met leaders in different domains. And said, “Okay, you’re the marketer for this product. How do you measure new customers? What is the definition of a customer? What’s the definition of a customer at risk? What’s the definition of a successful customer?” Like, all these things that were not captured in one place. They were just in individual people’s heads or on their local machines.
Kirill Eremenko: Sorry, this is in the main LinkedIn? You’re getting that information on the main LinkedIn [crosstalk 00:19:17]-
Adam Weinstein: Then, I would take that, go to China and say, “Okay, here’s how we do it in the US. You could use this if you want.” Or, I should say the rest of the world. We were supporting a global business, kind of sort of being as one carve out. Yeah, that was sort of my 18 month window of life. Where literally, come back, I’d pick a new domain, a new product, new area, go learn as much as I could about it. Then, fly sort of like … show the team what I’d figured out. Then, go do the same thing all over again.
Adam Weinstein: What I ended up developing was this great corpus of knowledge of, hey, here’s data inside LinkedIn. Knowing every time I interviewed someone like chances are three weeks later it would change. But, you know, it was relatively up to date. A bunch of people around the business started using it as well, right? So, it was a collection of code, of terminology, nomenclature, like data definitions. A little bit of like, okay here’s where we keep … we had a bunch of different reporting systems because we were a software company. So, we built them every time we needed them.
Adam Weinstein: So, here’s the reporting system for this metric. Here’s the reporting system for that metric. Sometimes it was Tableau, sometimes it was homegrown. Sometimes it was something that had been there for 13 years that we didn’t know why it was there. Yeah, it was a fascinating journey, but I think it taught me that even in a really innovative company it can be hard to keep your arms around what’s going on where, and how to find answers to sort of even the most basic data questions, which I think as a Data Scientist is … before you can have fun with data as a Data Scientist, you have to know where things are. You have to know what it means. Can you trust the data? Is it of high quality? Is it being refreshed? Is this the source of truth, if you will, right?
Adam Weinstein: So, that drove me to start Cursor, which I can talk about but, that was sort of my journey at LinkedIn. I left last March officially to … you know, decided that a paycheck was no longer worth it.
Kirill Eremenko: Did you already know when you were leaving that you have this idea for Cursor, or did you leave and then, come up with the idea for Cursor later on?
Adam Weinstein: Yeah, so I had a good understanding of what I wanted to do. It wasn’t … I don’t think it was perfectly nailed down. What I wanted to do was spend a couple of months talking to other companies, particularly outside of the area of technology. Silicon Valley can be a little bit of a bubble in terms of how we look at problems, and how we solve them. So, I wanted to talk to banks and industrial companies, and retailers and understand what is data inside their organization and how do people interact with it.
Adam Weinstein: I had a good sense though of what I wanted to build in terms of somewhat of a data catalog but something that was more interactive for the average business user. So, that was the premise. I think we honed it for a few months before we actually started the company, and raised money and that kind of thing. I had a strong sense of what it was we were going to build at maybe [inaudible 00:22:14].
Kirill Eremenko: Okay. Awesome. So, that leads us to Cursor.
Adam Weinstein: Yeah.
Kirill Eremenko: First of all, why the name? Why Cursor?
Adam Weinstein: Yeah. So, I’ve liked the name Cursor since long before I came up with the idea. You know, I guess you could say the concept to me is like, in a knowledge management kind of problem, which I guess you could generically call Cursor a knowledge management solution, although it’s not generic knowledge management. It’s specific to data.
Adam Weinstein: I think that the notion of a cursor helping you seek, or find something is really I think powerful. Then, I think, you know, there’s also a database cursor concept, although if you talk to a DBA about a database cursor they’ll run you out of the room because they’re not very performant. But, I think the sort of marriage of those two, right, that it’s steeped in data. At least the concept of a cursor, or even code right? I mean, cursors have existed in [inaudible 00:23:11] for a long time.
Adam Weinstein: Then, the notion of like, okay, people can relate to a cursor helping to find something. Whether it’s on the screen, or potentially buried in a data lake somewhere, right? Like, I’ve always liked the name. So, it just so happened I found the dot com was available.
Kirill Eremenko: Oh wow.
Adam Weinstein: Between the two of them, I was like okay, well this is the right name. So, we had the name before we had the company and the idea.
Kirill Eremenko: Yeah. Gotcha. As in Cursor like the cursor you have on the screen, that type?
Adam Weinstein: Yeah, exactly.
Kirill Eremenko: That’s so cool. Rarely those domain names are available, short ones like that.
Adam Weinstein: Yeah. It’d be interesting to see if domain names last. You know, there’s sort of a real estate gold rush for domain names now that you’ve got so many other TLDs, right? Dot IO, dot app. You know? Other things, right, is the dot com valuable? I guess we’ll find-
Kirill Eremenko: Yeah, we’ll see. As soon as Google updates their search algorithms. Right now I think [crosstalk 00:24:10].
Adam Weinstein: Yeah, exactly. Exactly.
Kirill Eremenko: Easiest to find. Okay, well Cursor. Tell us about the company. What does it do? We’ve heard your story. Obviously, you’ve built up a lot of experience, knowledge and data, and then some pressing issues that you actually saw first hand. How does Cursor go about solving? And just in general. Give us an overview.
Adam Weinstein: Yeah. So Cursor, the challenge if I had to sort of boil it down that we had at LinkedIn is that we had a bunch of users across the organization that were creating content, right? Could be ad hoc SQL code, could be dashboards in Tableau. Could be an Excel spreadsheet. Could be a Python model, right? There was no one place to go find all of that. Mostly because everybody was using their own set of tools. So, you had people that had locally installed SQL editors. Tableau, I guess if you were looking for a Tableau dashboard, you could search if it had been published to the server. But, there was a lot of work that was being done on the local machine. Even Jupyter notebooks, right, for the most part were installed in local environments.
Adam Weinstein: So, Cursor is a tool that a user can start with, or a team, that if they work inside the product it has a built in SQL editor, it has a built in Python environment. It connects to all these places where data lives, and to BI solutions like Tableau and any database that you might use. It basically curates in sort of an intelligent way all the data that it’s seen. Or, I should say metadata that it’s seeing. It’s actually not looking at the raw data itself.
Adam Weinstein: So, if you connected to three databases, you’ve written some SQL, you’ve written some Python, you’ve connected to Tableau, it helps build a single corpus of knowledge that any user in that business can come search. And, helps … with the goal of finding things that have already been done, or answers that may already exist. So, an example might be I’m an analyst and I’m trying to figure out how many products have we sold today? Generically speaking. If somebody else has done that work, how do I find them? If they haven’t, how do I find what table has the product data in it? If I do find that table, how do I know that table is the right table? So, we help build a place where people can come find what they’re looking … you know, find an answer to a data question. Make use of data if they don’t necessarily know what the answer may be, and then understand what they’re seeing.
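The kind of search Adam describes, over metadata rather than raw data, can be sketched roughly as follows. This is an illustration only, not Cursor’s implementation. The `Asset` record, its fields, and the `search` function are hypothetical names, and the matching is a naive substring test.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Metadata about one piece of data work; no raw data is stored."""
    name: str
    kind: str         # e.g. "sql", "dashboard", "notebook"
    owner: str
    description: str


def search(assets, query):
    """Return assets whose metadata contains every query term (case-insensitive)."""
    terms = query.lower().split()
    hits = []
    for asset in assets:
        haystack = f"{asset.name} {asset.kind} {asset.description}".lower()
        if all(term in haystack for term in terms):
            hits.append(asset)
    return hits
```

An analyst asking how many products were sold today could search for "products sold" and find the query or dashboard a colleague already built, along with who owns it.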
Adam Weinstein: So, it’s you know, simply speaking you can think of it like an Evernote or Dropbox for an analyst, or for a data user. It could be also for a Data Scientist, but it’s designed to scale as wide as need be. So, data we know is siloed, right? As are the teams that use it. So, the solution’s kind of designed to fit that. You can start with one team. You can let another team come on later. This was a challenge that we saw at LinkedIn. We looked for a solution in the market to try and solve it for the business, and the problem was we couldn’t get everybody to agree. The perfect was the enemy of the good, right? We couldn’t have a solution for everybody, so nobody had anything.
Adam Weinstein: So, this is designed to solve that challenge with hey, one team can start using Cursor. They can at least start sharing with themselves, and then, typically what you’ll see is okay, another user gets jealous of this corpus of knowledge. They’ll come on, and that brings their team with them, and it kind of grows from there.
Kirill Eremenko: Wow. That is such a cool idea. It’s like, and I’m already hooked because I think of myself as a very organized person, and what you described it sounds like a tool to organize Data Science assets. You know, whether it’s code, whether it’s data, whether it’s like anything to do with the Data Science projects. Very cool.
Kirill Eremenko: So, basically I can not only search … as I understand it you are combining, first of all the tools. Sort of like if something was done in Python or in R, or in Tableau, I don’t know, right? I might only know Tableau or might only know Python. I don’t know what other people have done in other tools. Or, even I’ve worked in many tools. I can actually put those entries into Cursor, and that way I will know what I’ve done across different tools like keep track of it across different platforms.
Adam Weinstein: [crosstalk 00:28:25] right? We don’t want to have to have you pull in everything manually. So, in many cases we’ve built connectors. Like, if you’ve got Tableau already, you can just plug in your credentials once. We’ll automatically suck everything in. We’ll pull all the queries behind every dashboard, make it searchable, and same thing goes for other environments too. The goal, again, it’s like we don’t want to replace every tool. We just want to bring them all together into one sort of searchable interface.
Kirill Eremenko: Gotcha. So, that’s tools. Then, on the other hand you also organize across people and departments, right? So, in a bigger organization, or even if it’s like … even if it’s a small organization, but decentralized. Like, our business is across different countries. So, if somebody has worked on a project and I don’t know if they worked. So, again, you want to reduce double work, right?
Adam Weinstein: Yep. That’s exactly right. So, what we separate is that to know that somebody has worked on something versus, being able to see the results. So, we have teams where, let’s just say sales and finance. There may be certain things that finance produces that they’re comfortable knowing, like okay they worked on a quarterly sales pipeline. But, they may not be comfortable sharing the results of it. So, what that separates is like, okay the SQL query or the Python code that’s been written, you can see that but then, the only way you can actually see the results is if you have the credentials to actually execute it. So, we allow you to sort of separate those two, because the model or the code is often times less sensitive than the actual results.
Adam Weinstein: That’s a challenge we see time and time again, that you know, why have somebody start from scratch when they can reuse 80% of something that someone else produced just because you know, you’re on a different team.
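The separation Adam describes, where the code behind a piece of work is discoverable but its results require database credentials, can be sketched roughly as follows. This is only an illustration in Python, not Cursor’s actual implementation; `SharedQuery`, `view_code`, `run`, and the credential check are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class SharedQuery:
    """A catalog entry: the code is shared, the results are not."""
    title: str
    owner_team: str
    source: str  # name of the database the query runs against (hypothetical)
    code: str    # SQL or Python text, visible to anyone who finds the entry


def view_code(query: SharedQuery) -> str:
    """Anyone who can find the entry can read the code behind it."""
    return query.code


def run(query: SharedQuery, user_credentials: set[str]) -> str:
    """Executing the query requires the user's own credentials for its source."""
    if query.source not in user_credentials:
        raise PermissionError(f"no credentials for {query.source}")
    # A real system would connect to the database here; this sketch only
    # reports that execution would be allowed.
    return f"executing '{query.title}' against {query.source}"


pipeline = SharedQuery(
    title="Quarterly sales pipeline",
    owner_team="finance",
    source="finance_dw",
    code="SELECT region, SUM(amount) FROM pipeline GROUP BY region",
)

# A sales analyst without finance credentials can still read and reuse the code.
print(view_code(pipeline))
```

Calling `run(pipeline, {"sales_db"})` would raise `PermissionError`, while a user holding `finance_dw` credentials could execute the same shared code.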
Kirill Eremenko: Gotcha. Yeah, that’s a really cool idea. I’m surprised nobody has done it before. Were you shocked that it didn’t exist?
Adam Weinstein: Yeah. I think it’s … I think it would have been difficult to do it too many years ago. The reason being, like if you look at how fragmented … I mean, I would say it’s becoming more fragmented, but if you look at how fragmented the tool space was even just a few years ago and how few of those were web accessible. So, it’s really easy to build an integration to Tableau because they have rich APIs that you can connect to, and it allows you to extract a lot of the relevant information you want to add into some sort of search interface. But, if you go back to the world of SAP and Oracle, where that was commonly what you would see in big enterprise, there weren’t rich APIs. There weren’t great ways to stitch things together.
Adam Weinstein: So, it would have been harder to build a solution that was trying to do what we’re doing. To be fair, we get asked to plug into things and depending on the product, we can do good sometimes and less good others. It is a … it is something in this web era where things are built with, I don’t know if it’s collaboration, but certainly accessibility in mind, and the ability to come from third party platforms. You know, it’s getting easier to do.
Adam Weinstein: But yeah, I was surprised that there was nothing focused on this search problem. There were data catalogs, and data catalogs was sort of like a V1 of this problem set, which is like how do you at least provide just a dictionary. Think of like a telephone directory of data inside of a business. But, the problem I saw with those … we looked to deploy one at LinkedIn too. The problem I saw with those is that everybody has to go upload the dictionary manually, and by the time you’re done uploading it and surveying the entire business, it’s already out of date. It’s not, like, ingrained in the person’s workflow. So, if they’re not using it on a daily basis, and they have to take time to separately go document something, just like documentation in general, right? It’s not going to get done.
Adam Weinstein: So, we tried to build something thoughtfully that was part of a user’s daily workflow. That’s why I hope we can succeed.
Kirill Eremenko: Gotcha. What kind of integrations do you have at the moment? You mentioned Python, Tableau, R. Can you give us a quick overview?
Adam Weinstein: Yeah, yeah. So, we’ve sort of focused on three areas. Any data store that you’d want to plug into from … so, big data, like a Hive or a Spark, to you know, traditional data stores like an Oracle, or Teradata or Microsoft SQL, right? Any database we want to be able to plug into. On the BI front, so, we think of that as sort of layer two, you know, there’s Tableau, there’s Qlik, there’s Looker, there’s Power BI. We started with those just because they’re sort of the larger, more popular ones on the market.
Adam Weinstein: Then, you’re right, on the language front we have Python. R is in process. Not there just yet, but it’s on the horizon. And, SQL, lots of SQL and various flavors of SQL. If you’re writing P-SQL, Microsoft world, they support that of course. Then, support a number of different operating systems. So, we have a Mac client, a Windows client, and a Linux client too if you want that.
Adam Weinstein: The product is … you know, it’s cloud based in the sense that when you share something, like if you write some code and we’re on the same team, and you want to share it with me, that’s shared via the cloud. But, there is a client aspect to deal with the certain networks if you layer in between. It’s like oftentimes inside big companies, you know all the places where data live are not accessible to the cloud. We couldn’t directly connect to it from our cloud layer. We’d have some sort of place, or some place internal to be able to get into that.
Adam Weinstein: So, you can use the client as a means of doing that, or you can actually deploy it on a server internally if you want. It’s up to you. It’s much like an R server, or a Jupyter notebook environment, right? You need some place internally for it to live in order to connect to data.
Kirill Eremenko: Okay, gotcha. Let’s talk about actually people using this. How has it been received? Have you had people and companies try it out, because I can imagine it actually solves a lot of pain points, and for some of our listeners listening in, they’re probably already seeing this at the company they work in, or maybe have experienced this in the past, or maybe it’s their own business so they’re seeing it. So, tell us about how others have perceived this, and what kind of benefits has this been able to deliver?
Adam Weinstein: Yeah. So, I think it depends on the audience, right? So, there’s probably three or four audiences that have crept up. I don’t know if they were intentional or not. In no particular order, right? So, there’s an engineering audience that’s like more traditional software engineering. They may support a data organization, or an analytics team, but they’ll oftentimes have queries that they want people just to be able to see. They could be health checks, they could be just actually like business insight type information. Like hey, here’s a metric that we look at that we monitor. They’ve used the tool as a way to democratize that, make it easy for other people to come find it. You know, if they want to go on vacation not have to worry about they’re going to get a phone call just to get a snippet of code. You know, like Git …
Adam Weinstein: Or tools like that do a great job of documenting code, and sort of version control, but they may not have the business context. So, they’ll use our product as a means of sharing that. That’s sort of software engineering.
Adam Weinstein: On the Data Science front, I think it’s probably more in collaboration with a BI team or an analytics team, where too much of Data Science has become data prep. How do you get dirty data or the right data in a format that you can then actually start performing machine learning on, or for that matter, even just modeling. So, where Data Science teams, and BI teams have come out of the platform like, if the BI team comes first, which has been a common trend, they’ll get all their code in there. Then, a Data Scientist, they might want to go look at A, is there something predictive in this data set that we could use, or we could monetize? They’ll at least know, okay, I’ll pick the code the BI guy uploaded. I’ll get the result set, and then I can just go and I don’t have to waste time finding the data, prepping it, getting it ready for whatever I’m trying to do to it. So, that’s probably audience number two is this sort of joint BI/Data Science audience.
Adam Weinstein: Then, audience number three, coincidentally is like a business user. So, somebody who spends all day looking for a report or an answer to a question. They don’t know whether it’s in Salesforce, or in Tableau, or they just need to ask the analyst sitting next to them. They’re looking for a quicker way to not bug people over email or Slack or whatever it may be. So, they’re using the product sort of asking the team like, hey, can you start using something like this so that I can not bug you as much. That’s sort of one of our selling points. It’s like, hey, if you’re a business leader and you’re constantly bugging someone for answers to questions, for your sake and theirs, put it all in one place so that you can come find it.
Kirill Eremenko: Yeah, yeah. So, it’s kind of like you’re benefiting from this network effect. Yeah, it’s classic Silicon Valley start ups.
Adam Weinstein: Yeah, exactly. We didn’t invent that, right? Same thing as [inaudible 00:37:31], same thing … you know, even self service BI, the Tableau’s [inaudible 00:37:36]. Same thing, like hey, if you’ve got a dashboard come find it. Right? But, not everything is in a dashboard, and for that matter, not every dashboard is accurate.
Kirill Eremenko: Not every dashboard is Tableau.
Adam Weinstein: But yeah, those are probably the three audiences. Engineers, analysts and business leaders that use the product, or that are driving to push the adoption of the product.
Kirill Eremenko: You mentioned four audiences, no?
Adam Weinstein: Do what?
Kirill Eremenko: You said four.
Adam Weinstein: Data Scientist and business analyst.
Kirill Eremenko: Oh okay. Gotcha. Very cool. Very cool. I actually want to talk a bit more about the business audience, right? So, the way I see it is it’s not just like business data is for sure, executives and directors, but also I think this could be useful for really anybody in the business. Like, as an organization, and the world is moving on to [inaudible 00:38:26] more kind of data driven type of environment approach of doing business. Every business is starting to try to become data driven. You actually, you talk about this concept. The whole notion. Maybe it’s a good time to talk about this. The Citizen Data Scientist, right? So, let’s talk about that for a bit.
Adam Weinstein: Yeah. Data Science is fascinating right? I think it almost feels to me like the early days of BI, or should I say self service BI. Self service BI, I think the sales pitch was like oh, you build this cube, which was what it used to be, right? Excuse me. Then, anybody can come to this system and ask a question and it’ll give you the answer. How many [inaudible 00:39:09] did we sell yesterday? How many employees do we have in this country? How many of them graduated from this college? You can always come up with a question that a self service BI system may or may not be able to answer, right?
Adam Weinstein: Data Science sort of feels like a similar problem set in the sense that there are really hard Data Science problems that require someone with extensive statistical understanding, and math capabilities, and the ability to code, and all that. But, there’s also a set of Data Science problems that should be approachable to what I call like a technical business analyst or a Citizen Data Scientist. So, you know, I think helping those folks feel comfortable exploring data, and playing with it, and using tools, whether it’s Cursor or there’s sort of even a growing AutoML set of solutions, right? How do you automatically model … throw a number of different models against the data set and figure out what’s predictive, right? Someone should be able to feel comfortable using that if they’re comfortable writing SQL.
Kirill Eremenko: Something like DataRobot you mean.
Adam Weinstein: Yeah, DataRobot, or there’s a number of different … I mean, Azure has one that fits in the [inaudible 00:40:20]. Amazon is in the process of making one. I think that … there’s an audience for that type of use case where like maybe 60% of the ML problems might be solvable by that audience in the next five years. Not today, but at some point soon. Maybe it’s more than 60, I don’t know.
Adam Weinstein: I think that the challenge is sort of like how do you help breed these folks that they may be stuck in their current day job, and how do you help sort of encourage that type of exploration, and understanding? So, I think that’s a little bit of what Cursor can hopefully help with, but it’s not just Cursor, right? It’s how do you encourage people to take that leap. So, we saw that a lot at LinkedIn where somebody that was a technical analyst would just start playing with Python. They’d take a course, and sometimes on Udemy right? They’d figure out, hey, there’s something more than just pulling data that I can do that might be more valuable to the business, and just understanding that that opportunity is out there is … it’s the only thing stopping them.
Adam Weinstein: I don’t know if that answer is where you’re getting at, but yeah, there’s a growing audience there and I see it. It’s probably going to be the SQL user of today that’s the Citizen Data Scientist of tomorrow.
Kirill Eremenko: Yeah, yeah. And, to your point, I recently read a … I think it was like a study somewhere. It was not recently obviously, I’d remember it better. But, it was a while ago, and it … what they did … a bit of a different situation but, to illustrate the same concept, that they were developing certain, I think it was certain drugs, to fight some kind of diseases. With drugs, you need to put the chemical formula together in order to you know … and, they had the modeled environment prepared, so basically there’s this environment where all the tests can be run. But now, it’s just about iterating and trying out these millions of variations of the chemical compounds and formula.
Kirill Eremenko: So, instead of doing it internally or running brute force through it, and running simulations, what they did was they opened up an online place where people, anybody, could go, and just try it out for themselves. So, people, random people from all around the world, would log in. Not even log in, just go there and drag and drop these chemical compounds and, click run and see what comes up. In the end, they came up with the most non standard, and they solved all the problems. They found all the right compounds they needed. So, that just shows that even people who don’t understand chemicals and drugs-
Adam Weinstein: Sure.
Kirill Eremenko: Bacteria and all these diseases, and stuff, they still have creativity, right? People can still … you just provide a self serve drag and drop type of environment. They can solve probably like half, or like you say, 60% of your business problems can be solved by people just in their spare time. Like, oh you know, let me try this machine learning algorithm and things like that.
Kirill Eremenko: I think what you’re doing in Cursor is like a massive step towards that. I think that with time, businesses not only need to leverage their data more, but also the creativity of the people that work there in general.
Adam Weinstein: Yeah, no that’s a really good point. I think there’s … if you open the newspaper every morning, and you look at a headline of okay, this company had this much of a data breach, and the sort of repercussions, right? There’s sort of this desire to just crawl into a shell. We used to joke. The last role I was in at LinkedIn I actually helped work on the security side of the house. It was interesting. We’d walk into a meeting and sometimes you’d have some pessimist or, there’d be a negative tone to it. I’d say, “Well, okay. You can just turn off all the servers and go home. Then, there’s no security risks.” No business either, but you know.
Adam Weinstein: So, I think there needs to be a comfortable way to allow people, like you said, experiment, explore, learn because your employees are your biggest advocates. I mean, generally speaking. There’s always going to be bad actors, but you know, rarely are they internal. So, this balance of like, okay, how do you trust but then, excuse me, also have some security around how you do it is an important one to strike.
Adam Weinstein: So, yeah. I couldn’t agree more. How do you open things up as much as you can without putting yourself at risk? That’s a question I think people are grappling with, and even Cursor, we often live in a hybrid environment. So, companies have some data in the cloud, and some on-prem. I don’t see that mix changing. I don’t think it’s going to go 100% cloud any time soon. Yet, if it did, it would open up so many different opportunities from an infrastructure perspective, or a tool perspective and what they could use, and how it would actually benefit the company. But, because of this security fear they have, data is probably one of the last things to go to the cloud unfortunately.
Kirill Eremenko: Mm-hmm (affirmative). Yeah, gotcha. Also, big companies, like a lot of these large corporations have so much momentum that it’s going to take years before things change there. Okay. Obviously Cursor is solving a very interesting problem, and looking very forward, [inaudible 00:45:55] tool. What would you say to those listening who are … they see the value of Cursor, but they’re not ready to go ahead yet. They want to build a data driven culture with Citizen Data Scientists but, not yet there to invest in a tool like Cursor. What would you … any advice for business leaders, or even people in organizations that are of that mindset?
Adam Weinstein: Yeah. I mean, I think the key is just to always experiment. So, you know, whether it’s to try open source tools, and I know there’s some apprehension in large corporations around open source. Not because of cost, but because of support, and security and that kind of thing. But, I think if you’re a company that’s not always experimenting and looking for ways to use data to drive efficiency or productivity or even, you know, if you want to use the phrase like, monetary gain, right? Not doing that, then your competitors are.
Adam Weinstein: So, you know, we were joking the other day. It’s like okay, what companies have been displaced in the last 10 years by the Amazon’s, the Uber/Lyft’s, the … what industries had been turned upside down, and [inaudible 00:47:12] turned upside down. And, it’s all just data, right? Uber and Lyft are still using the same cars, but you know, they may reduce the number of needed cars on the road in the next 10 years because of, whether it’s self driving or just data and being able to put cars at the right place at the right time. Same thing with Amazon, right? They’re not selling any different products than all the retailers down the street. They’re just delivering it in a better fashion.
Adam Weinstein: It’s fascinating, I think to me, that companies would be afraid to experiment. I think, you know, oftentimes I see that coming from … this is actually something I’ve seen with Cursor and before Cursor as a consultant. You know, not listening to people that are actually in the trenches on a daily basis is usually where that sort of mindset will set in, and people that are actually interacting with and, you know, there are plenty of people in the world that are still looking at an Excel spreadsheet every day, spending hours a day manually cleaning data. Not helping them find a solution to get out of that is like, you’re wasting a very valuable and productive person’s time doing something that can be automated in an instant.
Adam Weinstein: So, you’re not helping anybody. The company, yourself, that person. What’s more likely to happen is that person will quit and then, go find a better job and your company will have to suffer the pain and consequences, right? But yeah. I think just always experiment, and find time to do it. Carve out 20% of the quarter, or the year to just … maybe it’s less. Maybe it’s 5%, who knows? Whatever it is, but some amount of time to look at ways to do things better.
Kirill Eremenko: Gotcha. So, experiment. Very valuable advice. What about spreading Data Literacy? Any thoughts on that? How does an executive inspire people in the company to become more … to want to become more Data Literate?
Adam Weinstein: Yeah. I think there’s always going to be a crowd that is literate, right? It may be a small analytics team. It may be a CIO’s organization or whatever. But, I think making … I don’t know if it’s a requirement, but inspiring them to teach the rest of the organization. So, we had these brown bags constantly at LinkedIn where we would invite almost anybody to come listen to a talk on a data topic. It was a big deal for the author to put together the content, and to be able to actually articulate it, and document it in a way that was easy to understand. But, it was also really exciting to go listen to it if it wasn’t a domain that you were a part of.
Adam Weinstein: So, having that kind of a conversation and, giving it a forum, I think is one way to start increasing Data Literacy. That’s not even doing it in a systematic way, right? That’s just hey, how do you have sort of a conversation about it. You know two, I think is, teams that work with data finding a way for them to share with those that might care what [inaudible 00:50:07]. You know, we made it a basic goal every quarter for all the teams to send out an email update of all the work they were doing. It’s like, what are the priorities? What got done this quarter? What’s going on next quarter? We actually emailed it to basically the entire company, even in sales you would get updates from data infrastructure that would say, “Hey, we’re adding 10,000 Hadoop nodes and, here’s what that’s going to do for us.”
Adam Weinstein: You know, they may not care, but the ones that do care you’ll quickly identify because they’ll raise their hand and say, “Hey, I want to know more.” And they want to help. So, that’s a great way to, I think, get started around literacy, and certainly collaboration products. Products that could help. It doesn’t have to be ours, right? There’s tons of tools in the market, whether it’s Jira, or Slack, or something like that, right? Just allowing people to have a conversation helps create empathy, and ultimately helps, I think, solve problems.
Kirill Eremenko: Yeah, gotcha. Why did you call them brown bags? I didn’t quite understand that reference.
Adam Weinstein: Oh, bring a lunch. Brown bag, like a brown paper bag.
Kirill Eremenko: Oh okay, gotcha.
Adam Weinstein: Yeah, people literally didn’t [inaudible 00:51:10]. We had a cafeteria. We were spoiled. But, in the older days, right, you’d bring a brown paper bag lunch. So, that was … yeah, you’d have your sandwich and your soda, and your chips, and that’s what you’d-
Kirill Eremenko: So, you’re enjoying lunchtime. I remember we had those at Deloitte as well. That was really …
Adam Weinstein: Yeah. It was back to a different time.
Kirill Eremenko: Yeah. I gotcha. So, experiment. Don’t be afraid to experiment, and empower people by good conversations about Data Literacy because, you’re right, that’s where the world is going. Organizations are going to be doing more and more of that, and people want that. That’s what I find. People are so fascinated with data these days that surprisingly there’s a very large segment of employees who actually want to be more involved in this business because they see the value and they see this as something that … inevitably it’s part of our lives more and more. Like, with social media and stuff. So, they’re like, oh cool, I can do this in business. Something exciting and interesting.
Adam Weinstein: There’s not a person in the organization, whether you’re hanging up the phone and realizing that okay, we need a prompt for people that have this question, or you know, making lunch and realizing oh, I got to refresh this paper food more often because … there’s always … everyone has a thought on data. So, giving them a forum to do that and … or an executive, right? That’s wondering why can’t I get a quicker answer to this question to take six weeks. There’s always someone that needs help, and yeah. I think it makes sense that making it easier to get to would be positive for everybody.
Kirill Eremenko: Yeah. Adam, I wanted to ask you another thing. I know you guys have, for Cursor, you have like a free version. How does that work? Because I understand, you would need … like an executive would need to approve it and install into the business. That’s a long process. How does the free version work?
Adam Weinstein: Yeah. So, the free version is actually pretty good. It does quite a few things, I think, out of the box. So, the free version uses our cloud. You would download a client on a local machine, much like you would a Python editor, or a SQL editor. And, just like you use other sort of cloud based tools, like Dropbox, or Evernote, or that kind of thing, the work that you do gets shared to the cloud, right? You can determine how you want that shared. You can determine if you want it visible to your team, or just to you. But, the idea being that like the data never leaves your network. So, if you’re running code the data lives in your local machine, but the code and the meta data, like hey, you worked with this table, or this was some query that you wrote, and this is what actually-
Kirill Eremenko: The columns and the rows of the table, that kind of stuff.
Adam Weinstein: Yeah, the names of the columns, that kind of thing. That gets shared to the cloud. So, if you wrote something that says, okay, how many laptops did we sell in Brisbane this year? There’s a guy that’s in New York that wants to know that same question. They can discover that code. They still need the credentials to be able to actually run it, it doesn’t share that. But, it does share anything that’s being done. So, they would be able to see the database you connected to. Be able to see the table names that you used. But again, if they don’t have the credentials to that database they can’t actually do anything with it.
Adam Weinstein: So, it’s sort of a light weight way to get started, and the idea is … you know, what we’ve seen is that even though often times IT or legal or security may need to get involved. Most companies will have a way or a user that’ll try it on their side time at home to be able to play with something, and if they see that hey, this is great, this is useful. It makes the process of getting it in the enterprise version a little bit easier. So, it’s a pretty fully featured … we call it the Cursor Core Product, which is just sort of like the lighter weight version of it, but it doesn’t have every integration. It doesn’t have every language, but it has most.
Adam Weinstein: So, you should be able to get a decent amount of value out of it.
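A rough sketch of the split Adam describes, where the code and meta data are shared but the rows themselves stay on the local machine. The `shareable_record` function and its field names are hypothetical, the sample figures are made up, and the table detection is a deliberately naive regular expression; none of this is meant to reflect how Cursor actually parses queries.

```python
import re


def shareable_record(query: str, database: str, local_result: dict) -> dict:
    """Build the record that would be shared: code and meta data, never the rows."""
    # Naive table detection: identifiers that directly follow FROM or JOIN.
    tables = re.findall(r"\b(?:FROM|JOIN)\s+([\w.]+)", query, flags=re.IGNORECASE)
    return {
        "query": query,
        "database": database,
        "tables": sorted(set(tables)),
        "columns": list(local_result["columns"]),
        # local_result["rows"] is intentionally left out of the shared record.
    }


query = (
    "SELECT city, COUNT(*) AS laptops_sold "
    "FROM sales.orders JOIN sales.products ON orders.product_id = products.id "
    "WHERE city = 'Brisbane'"
)

# Result of running the query locally; the numbers are invented for illustration.
local_result = {"columns": ["city", "laptops_sold"], "rows": [("Brisbane", 1200)]}

record = shareable_record(query, "retail_dw", local_result)
print(record["tables"])
print(record["columns"])
```

Someone in another office could discover `record`, see which database and tables were used, and reuse the query, but would still need their own credentials to produce any rows.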
Kirill Eremenko: That’s cool. That’s very nice of you to share that as well, because you know especially the start ups that don’t really have … like data is not being shared. So, they don’t really care about their intellectual property at this stage. They could use that, especially [inaudible 00:55:31] maybe I’ll sign up and use them for our company now.
Adam Weinstein: You should try it.
Kirill Eremenko: Because we’re decentralized and we have that problem a lot. Everybody is all over the world, and it’s different time zones. It’s so hard to get to the bottom of things sometimes. So yeah, the free version would work there as well.
Adam Weinstein: Yeah.
Kirill Eremenko: Yeah, very cool. Well, Adam, thanks so much. It’s been a pleasure. We’re coming close to the hour mark. Before I do let you go, I wanted to ask you what are some of the best ways that our listeners can get in touch, follow you, your career, or maybe get in contact to learn more about Cursor?
Adam Weinstein: Yeah. So, certainly feel free to reach out to me directly. I mean, my email is just adam@cursor.com. Our website is Cursor.com. Check it out. Feel free to download the product. Follow us on Twitter, Cursor Data. But yeah. We’d love to chat and hear what people think, right? Good, bad or indifferent.
Kirill Eremenko: Awesome. Awesome. Okay for people to connect with you on LinkedIn as well?
Adam Weinstein: Sure. Always.
Kirill Eremenko: Fantastic. Okay, Adam, thanks so much for coming on the show. It’s been a massive pleasure having you.
Adam Weinstein: Thanks for having me. It’s really been awesome talking to you as well.
Kirill Eremenko: So there you have it. That is Adam Weinstein, Co-Founder of Cursor. Hope you enjoyed this podcast. My personal favorite part was the whole notion of organizing Data Science assets. I’m very surprised that no company in the world has been doing this as actively as Cursor, and I think it’s a very apt problem that needs to be solved because more and more companies will want to become Data Literate, data driven, and will want to introduce Citizen Data Scientists. A tool like that can really help out with that.
Kirill Eremenko: So, on that note, if you’d like to get the show notes for this episode, head on over to SuperDataScience.com/229. You’ll find all the materials that we mentioned in this podcast, plus the URL to connect with Adam, and of course, the URL to Cursor, which is Cursor.com. If you are interested in building a Data Literate organization and, helping organize your Data Science assets then check out Cursor.com. Check out their product and see if it can help you. So, they have, as you know, they have the Core of Cursor, which is a paid product. It might be interesting to larger organizations that are ready to make the jump.
Kirill Eremenko: If you are not there yet, then they have a free version, which you can try out in the cloud and see how that works for you. On that note, thank you so much for being here, and spending this hour with us. Can’t wait to see you back here next time. Until then, happy analyzing.