Welcome to episode #171 of the Super Data Science Podcast. Here we go!
“When life changes on you, make the change quickly”
This is what our guest, Nathan Stephens, learned from the transitions he made in his data science career. In this episode, he touches on the difference between statistics and data science, what an analytics administrator does, the future of the R language and RStudio, and much more!
About Nathan Stephens
Nathan Stephens is currently the Director of Solutions Engineering at RStudio. He has been using SAS and R to create sustainable solutions for clients for over 15 years. He studied statistics at both the undergraduate and graduate level at Brigham Young University.
From studying statistics to advertising to analytic infrastructure, Nathan has experience across many different industries. We talked a lot about the odd but perfect combination of skills he built along the way.
First up, we discuss his background in statistics, which is where his interest in studying data began. He learned R and SAS at university. After his studies, he worked for Hallmark Cards, where he continued to use R. He remembers that data science was not yet a recognized term back then, and that statisticians had to fight for the few analytics jobs available.
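Nathan also discusses the difference between statistics and data science. Kirill recalls a previous guest’s view that statisticians come up with eloquent mathematical equations and solutions, while data scientists can often brute force their way through problems with machine learning algorithms. Nathan thinks that is fair, and adds that the strength of the term “data science” is how broad and inclusive it is.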
We also talked about leveraging different skills in data science, which draws on both technical and nontechnical skills. We covered advertising, consulting, and communication skills. Nathan advises weighing heavily who you will be reporting to when choosing a job, since that person dictates much of your quality of life and your future opportunities. He also talks about whether working in advertising is fulfilling and what he gained from applying his data science skills there. After that, he joined a client services company, where he learned about monetizing analytics.
His company, RStudio, aims to improve computational and scientific reasoning through data and programming. It builds tools for the R programming language. Nathan helps customers integrate R products into their enterprise systems (cloud, communication, etc.).
Nathan is also responsible for analytic infrastructure. Analytic infrastructure can mean a data laboratory (where analysts work, create, and build new tools) or an environment for running analytics in production. Companies need an analytic administrator, who plays an important part in creating new tools, training data scientists, overseeing data science labs, making sure solutions are aligned with executives, and so on. The role is very different from that of a typical data scientist.
Nathan sure has lots of skills to put on the table! Start tuning in to learn more!
In this episode you will learn:
- Nathan talks about his career background and why he chose Baltimore as the place to build his career. (03:40)
- Doing empirical work by studying data and applying your knowledge to actual problems is what he finds interesting in statistics. (06:30)
- The difference between statistics and data science. (11:01)
- Understand the people you’re working for – the executives and the visionaries of the company. (14:00)
- Advertising is the biggest application of data science. (20:30)
- Data Science is a bottom-up approach while Consulting is a top-down approach which makes them an interesting mix. (24:30)
- Analytics is just a part of an overall bundle of goods getting sold. (26:25)
- “When life changes on you, make the change quickly.” (29:16)
- A quick overview of the R language and RStudio. (33:09)
- The concept of Analytics infrastructure. (40:23)
- The role of an analytics administrator vs the role of a data scientist. (45:00)
- Why does it take so long to implement a tool inside an organization? (49:11)
- R works well with other programming languages. (54:07)
Items mentioned in this podcast:
- Analytics Administration for R by Nathan Stephens
- R for Data Science by Hadley Wickham and Garrett Grolemund
Kirill Eremenko: This is episode number 171 with director of solutions engineering at R Studio, Nathan Stephens.
Welcome to the Super Data Science Podcast. My name is Kirill Eremenko, data science coach and lifestyle entrepreneur. And each week, we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today, and now let's make the complex simple.
Welcome back to the Super Data Science Podcast, ladies and gentlemen; super excited to have you on this show. And today, all the way from R Studio, we have Nathan Stephens joining us. So, a lot of you already use R programming in your data science careers, or in your data science education. If you don't use R, then you probably have heard of it in one way or another. R programming is one of the two titans. R programming language is one of the two titans of data science, alongside Python. It's one of the two languages that we use, predominantly use to create models, do machine learning, do data science, build deep learning models, even create artificial intelligence.
And today, we have Nathan Stephens joining us. And he's a director at R Studio. And R Studio is, by far, the most popular program through which you program, or through which you code in R. And in this podcast, we had a great time. We had a blast. So, some of the things that we chatted about are Nathan's background. I deliberately went through all of his background, because he's got such an interesting story. Even before we got to R Studio, there was so many fun and exciting things that we talked about. And one of them being what an analytic admin does. Because Nathan is now in solutions engineering, he knows a lot about what goes into building the environment, building the infrastructure for a data scientist, or a data science team, or a data-driven company, so that is a very valuable part of our conversation.
If you're not familiar with things like data engineers, data architects, data analytics admins, servers, and all these other things that are components of a data science environment, highly recommend checking it out, listening to the podcast, because you will learn a lot about that. And after we meticulously went through Nathan's career, we finally got to R. So, you'll learn a lot about what the R language is all about, where R Studio is headed, what the recent updates are, who they just hired, how they compare to Python, and all these other cool and exciting topics.
So all in all, very exciting podcast; can't wait to get started. Let's dive straight into it. Without further ado, I bring to you Nathan Stephens, director of solutions engineering at R Studio.
Welcome, ladies and gentlemen, to the Super Data Science Podcast; super excited to have you back. And today, we've got a very exciting guest, Nathan Stephens, director of solutions engineering at R Studio. Nathan, welcome to the show. How are you doing today?
Nathan Stephens: I'm doing great. Thanks for having me on, Kirill.
Kirill Eremenko: Thank you for coming on. And where are you calling in from?
Nathan Stephens: The Baltimore-Washington area.
Kirill Eremenko: Baltimore, we were just talking about it. You're like, in between Baltimore and Washington, like can't decide which one it is?
Nathan Stephens: Yeah. I go back and forth. I'm a little closer to Baltimore.
Kirill Eremenko: And where is work? So home is in between, where's work?
Nathan Stephens: Well, the company is, you know, technically based in Boston, Massachusetts, but we all work from our homes. So, I work from my home office.
Kirill Eremenko: Okay, wow. Well, that's so cool. We'll get to that in a second. So, how's the weather in Baltimore?
Nathan Stephens: It's been very, very wet and cold, which has been great for my lawn. [crosstalk 00:04:26] The yard's doing great.
Kirill Eremenko: Wow. And we're in June. Why is it so wet? Like, does it get hot in summer?
Nathan Stephens: I don't know. Yeah, no, it's usually a lot warmer than this, but I haven't been to the pool yet. It's just been an especially cold June, but the kids are eager to get into the pool.
Kirill Eremenko: Wow, crazy. How many kids do you have?
Nathan Stephens: I've got two; two young boys.
Kirill Eremenko: Nice, very nice. It's pretty insane, what's happening with the weather, right? Like in California, you have these fires all the time. And then, you have the hurricanes down south. And then now, it's like wet and cold in summer in Maryland. Don't know what to expect.
Nathan Stephens: Yep. So, I'm actually from California, so I'm used to the earthquakes and fires. And then, I lived in Kansas, and I got used to the tornadoes. And now I'm out in the east, and we do hurricanes. So, pretty familiar with all of those things.
Kirill Eremenko: Interesting. And so, out of all those places, you found your home in Baltimore? You recommend that as like the nicest place to settle down?
Nathan Stephens: Well, yeah. I think I came out for a job, and the jobs out here are plentiful. And it's a great place to build a career. I think Washington DC attracts people from all over the world, especially in the United States. It brings a lot of people in. So you know, it's just a crossroads for a lot of people. And I find that really exciting, a lot of fun. [crosstalk 00:05:47] So, it's been good to build a career out here. It's a good place to work.
Kirill Eremenko: Gotcha. Okay, all right. Well, being the director of solutions engineering at R Studio, you warned me just before the podcast, before we started recording, that the podcast is going to be R-focused. And I wanted to pass on that message to our listeners, that this podcast is R-focused. And we're going to learn all about the lovely language of R, and what it's been up to in these past years, and where it is currently.
But before we jump into that, Nathan, could you give us a quick overview of your background? Like, coming from California, where did you study? And what took you on this journey into data science, because ultimately, our listeners are all very interested in following this journey from the start, how you went about getting into data science.
Nathan Stephens: Yeah, I'll do my best to keep my answers brief. I actually learned R and SAS at the exact same time when I was an undergrad in college, and that would have been 1999. So, I'm a very old R user, and a somewhat young SAS user. And I learned both of those through the statistics department at my university, and that was a really great experience. Statistics taught me how to think scientifically. You study hypothesis testing. You study science as a statistician. And then, there's this notion of making, doing empirical work by studying data and applying your knowledge to actual problems that I found very interesting in statistics.
So, I actually got off to a great start. I was very fortunate, very young in my studies to get some great programming languages, some great scientific thinking, and then exposure to applied science with data backing up those conclusions.
Kirill Eremenko: Gotcha.
Nathan Stephens: That set the foundation for everything that would come later.
Kirill Eremenko: And so that was in your undergraduate?
Nathan Stephens: That was my undergraduate, yeah, when I was in university.
Kirill Eremenko: Gotcha. And where did you go after that?
Nathan Stephens: After I graduated?
Kirill Eremenko: Yeah.
Nathan Stephens: Yeah. So, I made this interesting detour over into actuarial science, and that's a whole nother discussion entirely. That didn't last very long. I went back to grad school after I tried my hand at actuarial science. I didn't find that to be particularly satisfying. It didn't suit my interests. So I went back to graduate school, and I got a master's degree in statistics.
Kirill Eremenko: Gotcha. And just for maybe like our non-English speaking listeners, or for whom English is not their first language, actuarial science, because it took me a while when I [inaudible 00:08:47] to wrap my head around what that means. It's like statistics applied to population and demographics. Is that correct?
Nathan Stephens: Yeah, yeah. It is a broad field. Statistics is, actuarial science is actually a regulated practice in the United States. It's like being a lawyer in the United States. You have to actually have some sort of license to practice actuarial science. And so if you want to be an actuary, you have to go through this series of exams, and you have to comply with certain regulations in order to practice it.
Kirill Eremenko: Okay, gotcha. All right. And so, then you did a master's in statistics. And where did that take you, after that?
Nathan Stephens: So after leaving my master's program, I worked for a manufacturer of greeting cards in the Midwest. And I worked in their research department, and that was a really, really good experience. I got to cut my teeth on a lot of very interesting problems there. I also got to do more R there as well.
Kirill Eremenko: Okay. And-
Nathan Stephens: So just to characterize that, you have to keep in mind, this is back in like, 2005. So you know, Hadoop hasn't even really caught on yet, right? Big data's kind of on the ramp-up. Data science hasn't been coined as a term. There's no such thing as data science. It hasn't been, that term hasn't been invented at this point. And most analytic jobs are sprinkled throughout the United States. So as a statistician in 2005, when you're looking for a job, you're actually ... Actually, I got my job in 2004, so let's say 2004 ... you're actually looking for little pockets of analysts here or there. They didn't really clump together in large amounts, by and large.
And so, you're actually fighting for those jobs. We've come a long way, right? So it's like, back in 2004, you're actually fighting for a job where a statistician can work.
Kirill Eremenko: Yeah. And it's such a different world, right? Like back then, data science was pretty much statistics, right? It was called statistics. And I had a guest on the podcast like a few months ago, who put it very aptly; that the difference between statistics and data science is that in statistics, you still have to think through a lot of the mathematical components, come up with eloquent equations, and so on, and solutions; whereas in data science, a lot of the time you can just brute force your way through things, facilitated through different machine learning algorithms.
Nathan Stephens: Yeah, I think that's fair. I think data science is a term that's really grown on me over time, because I think statistician is a little too narrow to define what the world really needs. And the term data science is such a broad umbrella, you know, almost nebulous term; that it does a pretty good, that that's the strength of that term, that it actually just, it's all-inclusive of this idea that we're going to use data, we're going to be data-driven, we're going to be scientifically minded, and we're going to apply that information to problems.
So, I really like the idea that that's a general, nebulous term. I think that's the strength of the term.
Kirill Eremenko: Yeah. And also, that allows people from different backgrounds to come into data science, right? Like, it's not just statisticians or mathematicians. I know people who were in something very creative, like acting, and they leveraged their skills in data science through the component of communication of their results.
Nathan Stephens: Exactly, exactly. It's very inclusive. If you want to be in data science, we welcome you in. Please, be a data scientist. We need more data scientists. We want people to, yeah, think scientifically in their view of the world.
Kirill Eremenko: Yeah, gotcha. True. Okay, and so you worked with Hallmark Cards for a couple years.
Nathan Stephens: Yep.
Kirill Eremenko: And where'd you move on, after that?
Nathan Stephens: So after Hallmark Cards, I worked for an ad network. And at the ad network, I got to build ... This is where I start, well, my background's always been in big data. So even at Hallmark, I was working with massive datasets, mostly on Teradata. At the ad network, I got to work with large amounts of data on data sources like Netezza, Greenplum, and that's where I started learning Hadoop. We were early adopters of the Hadoop platform. And this is also at the same time when AWS was coming online. So, AWS was spinning up and doing all sorts of interesting things. And we got to jump on that platform.
Kirill Eremenko: So, is that around 2012?
Nathan Stephens: No, no. This is around like, 2008.
Kirill Eremenko: Ah, okay.
Nathan Stephens: Yeah, so we were early-on adopters of Hadoop at that point.
Kirill Eremenko: Okay. I mean, AWS, was it coming online around 2008 as well?
Nathan Stephens: Yeah, yeah.
Kirill Eremenko: Okay, cool.
Nathan Stephens: Yeah. So, part of my good fortune has been to work with really interesting, you know, managers and leaders in my career, so that's been a real fortunate thing. And I always encourage people to, you know, when they go to select their jobs, put a lot of emphasis and weight on the person that you're going to be reporting to, because that person's going to dictate a lot of things about the quality of life of the job, and also future opportunities that you'll have. And at the ad network, I had just a real great visionary, who was very passionate about cloud technology and distributed computing. And so, yeah, we went down that route. It was a very exciting time, actually.
Kirill Eremenko: That's really cool, because on the podcast, sometimes I mention that it's important to, during the interview, when you're applying for a job, important to understand what the job itself will be and will entail, and the company itself, because that shapes your future. But you're right, you have to also understand the person who you're going to be working for, who's your direct manager. What are they like?
Nathan Stephens: Yeah.
Kirill Eremenko: Yeah. Like you, I've been fortunate to have some very impactful direct managers in my life. What would you say was your one biggest takeaway that pops to mind from that person at the ad network?
Nathan Stephens: Oh, with that manager?
Kirill Eremenko: Yes.
Nathan Stephens: I think there's this notion of, you know, rejecting the status quo, right; thinking differently, accepting new ideas. I think there's also this, with him ... I'm struggling to explain it, but he was very interested in philosophy. And he was a much broader thinker, right? So, it's nice to work with somebody who has a broad world view, and can kind of articulate how the work that we do in technology fits into that world. So, I found that really interesting as well.
The other thing that was interesting about him, and a lot of my managers, is that I've had very few statistical managers, people that really, actually can do what I do, which has been a real blessing, because it allows me to differentiate myself and bring something valuable to the table; but it also allows me to pick up a lot of the skills that I hadn't acquired through my normal channels. For example, like, you know, the consultative work, and being successful, navigating up the political landscape of a corporation, right? But also, a lot of the engineering work, a lot of the ETL pipelines, a lot of these things, you know, my managers and other people that I've worked with have brought to me.
So, I think it's great to work with a manager who complements your skills as well. Or at least, that's one thing that has been really nice in my experiences. You know, learn from your manager [inaudible 00:17:04] doesn't have the exact same background you do.
Kirill Eremenko: Yeah, gotcha. And that's actually a sign of a good leader, when a person can hire somebody that's better than them at something. Because like, sometimes managers can be a bit intimidated if their reports are like, better than them at something. And therefore, that team won't work out; but like in your example, that worked perfectly fine. And that usually, for me, shows that the leader knows what they're doing, and is confident enough to lead a team of experts in different fields. They don't have to be an expert themselves in those same areas.
Nathan Stephens: Yeah. I think it was funny. I remember this time when this manager in particular, he did a statistical analysis and presented it to me and a few other people on the team. And we kind of shot it down. [inaudible 00:17:57] that he didn't do this right. He was so gracious about it. He's like, "Oh, okay. Okay, I see." It's like, "I'll leave it to you guys." That's good. It was all done in good humor, but we're like, "Yeah, yeah. That wasn't right." [inaudible 00:18:10]
Yeah, but you know, I think diversity is good, right? Diversity, I'm a big proponent of diversity and building diverse teams. And that's another thing that this guy did. He built a team. And it was kind of funny that we called it the data analytics team at the time, because data science, again, wasn't a term; but we had data experts, data engineers. We had machine learning engineers, system integrators, DevOps people, and statisticians, as well as domain experts. So, we had this nice crosscut of everything that you would need to build a singular data science team that can pretty much lay waste and devastation to the world. Like, we had all the capabilities that we needed in that team, because it was a cross-functional team. And that was great. That was just a wonderful experience.
Kirill Eremenko: Gotcha. And in terms of the work and tools that you used at the time, and techniques, would you say that advertising, data science and advertising now is different, is radically different to what it was back then, in the 2008 to 2011 period?
Nathan Stephens: Well, certainly the complexity has risen. I think the main objectives are pretty much similar when it comes to targeting and promotion. Advertising is still advertising. I think one thing that I found fascinating about going from a manufacturer of greeting cards to an ad network as a statistician was, I used all of the same skills in both places. So, when I went into my next gig, the skills carried over. So, I was still doing predictive models, segmentation, clustering, supervised and unsupervised learning techniques. I had to still scrub data. I had to understand the data.
So, the principles of doing the data didn't seem to matter so much with the application. I was still using those exact same principles, despite the fact that I was going from one domain to another domain.
Kirill Eremenko: Okay, interesting. And before we move onto your next role in your career journey, just a quick question on working with ads, because even today, or especially today, advertising is one of the biggest applications of data science. What would you say to people who are studying data science, and are considering a role in advertising, but have never had any exposure to using data science for advertising? I guess the core of the question is, is it a fulfilling experience? Is it something that you can build a career around, and at the same time, not feel like sometimes we see in the movies, where people just feel like all they do is sell, sell, sell all the time, and they have no meaning to their lives?
Nathan Stephens: Yeah. I actually have a lot of, I've actually had that same question in my own experience. And I think it's an existential question, right, to say like, what is fulfilling to you, and what is meaningful to you in your life. I mean, as a statistician, you aren't rushing into burning buildings and saving children from fire, right? And you're not saving people from cancer. You're not fighting world ... Well, you might fight world hunger as a data scientist. I mean, and you can work on these areas [crosstalk 00:21:36]. Yeah, you can do that.
And so, I think finding what's fulfilling to you is an individual question. What I will tell you about my experience in the ad world is that the technologies are amazing. And the sophistication is bottomless. And the complexities are high. It's also extremely challenging, so it's intellectually challenging. So if you're a person that, you know, really enjoys a challenge, that's good as well. I think if you're the type of person that says like, you know, "How do I do the most good with the skills and talents that I have in my life," I think that's a very thoughtful question. And I think there are probably more noble things that we can do than, you know, targeted advertisements, right?
And so, I always encourage people to follow those aspirations. And I think that's actually one reason I actually moved onto the next area of my life, which was to do client services. I wanted to learn a little bit more about that. And then, onto R Studio as well, because I, myself, have been trying to figure out what satisfies me in my life, and what things can I, what types of impact can I have to the world.
Kirill Eremenko: Gotcha.
Nathan Stephens: But for me, doing targeted advertising, it was one step in the journey. I made a lot of connections. I got to learn a lot of technology. I got to challenge myself. It was a time of intense, analytic effort. And I think all those things made me better, but it was just one step in that journey.
Kirill Eremenko: Mm-hmm (affirmative). All right, gotcha. And thank you, thank you for that overview. I'm sure that will be helpful for some of our listeners.
So, let's talk about your next role. You mentioned you went on to work in customer service. And from your LinkedIn, I see that that was quite a lengthy role that you had there.
Nathan Stephens: Yeah, yeah. So, I wanted to learn how to build a business around analytics. That's one reason I went to the client services company, because I worked for a data-driven organization, a company that was actually selling analytics as part of its solutions. And I was really impressed with the quality and the caliber of the people at that organization. And that was the next set of skills that I wanted to learn about.
So, yeah. I went over to client services, and that was another ... Anyway, I could talk forever about like, what I learned in client services. That was an amazing adventure, to be honest. Yeah, I don't know if you've ever worked in that background, but that's quite the field, working for clients.
Kirill Eremenko: No, I actually, like, I worked in consulting; you know, selling consulting solutions to clients, but I'm not sure if it's exactly the same as what you're describing. Maybe let's go through your experience a bit, and I'll pitch in a little bit if I can add value to the conversation.
Nathan Stephens: Yeah. We can call it consulting. It's a very human-driven endeavor, right? You're trying to help other people be successful with their work and their challenges. And some of those challenges are going to be technical. And then, a lot of those challenges are not going to be technical. And I think that's what I found interesting, was that balance of the technical and nontechnical requirements.
Kirill Eremenko: Yeah, no. That's definitely true, especially in consulting. It's like what we found is data science is more of a bottom-up approach, whereas consulting itself, at its core, is a top-down approach. And you start from the executive team, you define the strategy, and then it trickles down. And when you combine the two, you have both the technical and nontechnical aspects, and it's interesting to see where and how they meet; because data will be telling you the truth, from the point of view of data, but consulting or people will be telling you the truth from the point of view of their experience. And it's always interesting to see when there's conflict in that, and how to resolve that.
Nathan Stephens: Yeah, I think that's really insightful. I totally agree with that. I think what you see in consulting is, you see what is required to take action on the insights and the understanding that you glean from your data. So, just learning about the data, that bottom-up approach, you know, that's not necessarily enough to actually take action on those insights. There's a lot of other pieces in that chain, and you see that in the consulting, when you go to the top-down. You see, "Oh, I see how that information is combined with other pieces of information to lead to actions."
Kirill Eremenko: Yeah, yeah, definitely very interesting. Okay. So, what is your biggest takeaway from your time in client-facing data science?
Nathan Stephens: Yeah. My biggest takeaway, well, I'll circle back with what I said about monetizing analytics. That's why I wanted to go there, and I got a good idea of building a business with analytics. The answer that I came to was that analytics is one piece of a much larger pie for monetization. So, you don't build a predictive model and then make money on that predictive model. Even in the ad network that I worked for before, where we were putting models into production, that wasn't the whole story. The entire story is, how do you set that strategy? How do you influence the key players? How do you line up against the market? You know, yeah, so those, that broader ... So, what I learned was that the analytic piece is actually a part of an overall bundle of goods that ends up getting sold.
Sometimes, I kind of compare it to like, you know, maybe like your Siri on your phone, or you know, Google ... What's the Google Answer, Google Now, the Google Assistant ... Like, you don't usually buy, I don't know many people who buy their phone for Siri, or buy their phone for Google Assistant; but it is part of the overall value of that platform, right? And that's what I've seen with a lot of analytic work as well. It's like, you know, I have a great predictive model. Okay, that's great that you have a great predictive model, but that's one piece of an overall solution that you're trying to come up with.
Kirill Eremenko: Mm-hmm (affirmative), yeah. Okay, very, very interesting takeaway and recommendation, I guess, for the people listening, for the future, that it's not just about analytic solution. That is often just a component.
Okay, all right. And before we jump to your current role, which was the next step in your career, I know people are dying to hear about R Studio and what you're doing there, I just have one more question. So, you've moved through different roles. So, you were in a company that creates cards for about three and a half years or so. Then, three years in the ad network. And then, four years in the company that does the consulting services in data science. My question to you would be, what was always, was there a common trigger that prompted you to move onto the next role? So as we can see, the industries are quite varying, and it doesn't seem like a natural progression from one to the other, except for this last one, where you actually intended to find out how to build a business around data science.
So, what would you say, is there a trigger that, or like a point of saturation, why did you choose to move on and leave, not just the company, but the industry as a whole, to move onto the next thing?
Nathan Stephens: Yeah, I'm actually glad you brought that up. My personal experience is that jobs really change, and jobs definitely changed for me. So, I'll have a job where things are really great. And then something will change, and it will change the dynamic of that job. And in that situation, you can decide to stick it out and keep going, which is one option; and then, the other one is to tack and go a different direction, which has been the strategy I've taken.
So, what was that key change for me? In all three of those cases, it was a change in manager. Like, I moved from a manager that I really enjoyed working for, to a manager that was out of alignment with what I wanted to accomplish. And that's not going to be a trigger for everybody. I think you can be really successful in a lot of careers by staying around, and you know, working through a change of manager. But I think what is important is to know that jobs are highly in flux, and you can go from a great job to a not-great job in a day; because either the company acquires another company, or gets sold to another company, or you get a reorganization of leadership, or your manager leaves, and another manager comes in, which is kind of what I'm talking about. But those things actually do have big impacts on your day-to-day quality of life, and wellbeing, and your potential future.
So I think for me, personally, I think if anything, I spent too long trying to make a difficult situation work. I think, looking back on it, one of the lessons I've learned is like, you know, when things change, when life changes on you, make the change quickly. Like, say, "Okay. You know, this isn't what I used to have. Maybe I'll go do something else. That's going to change now." Or like, "I didn't really want this reorganization. I didn't want my company to be sold to some other company, but it is what it is. And so, what am I going to do about it," you know? And I think if I had actually moved faster in those switches, I probably would have been a lot happier. But you know, it worked out pretty well for me. I'm pretty happy with the journey. I've been really fortunate to have had good opportunities along the way.
Kirill Eremenko: Yeah. It's all a learning experience at the end of the day. It's not about the end destination. It's about the people we become on the journey, taking us to that end destination.
Nathan Stephens: Absolutely. I've learned a lot about data science in my life, but my career and experiences with other people have also taught me a lot about who I am and what I'm interested in.
Kirill Eremenko: Yeah. Very interesting you mention that, because I never thought of it in that way; but like, looking back now, the reason I left Deloitte was exactly the same, that the partner that was managing our division, he moved onto a more senior role, a more national-focused role, and a new partner came in. And while he was very talented, definitely, it didn't align. I didn't feel in the right place. I didn't see that I could learn as much as I could from the first one. And so, like after a few months, I handed in my resignation.
Nathan Stephens: I wish somebody would just like, have put their arm around me, and told me much younger that it's like, "Look, things are going to happen to you in your career, and they're not fair, and you're not going to like it. But that's okay. That's just the way it goes, and you're going to be okay."
Kirill Eremenko: Yeah, yeah. Well, there you go. You're passing on this message to all of our listeners now. And if anybody's feeling the same, then don't worry. Nathan is putting his hand around you right now and saying, "Everything will be okay."
Nathan Stephens: I am extremely empathetic to people who are under a lot of stress in their jobs. I understand that that happens. And yeah, I am saying it's going to be okay.
Kirill Eremenko: Yeah, awesome. Okay, well that nicely brings us to your current role at R Studio, where you're the director of solutions engineering. So to start off, maybe give us a quick overview of R Studio, because we will have some listeners on the podcast who haven't used R or R Studio before. Can you give us a quick overview of what R programming is all about, and what is R Studio?
Nathan Stephens: Okay. So, and those are two questions, so I'll answer them separately. So the R programming language is an opensource programming language, like Python, or C, or Java, or any other programming language that you might use to do data analytics. And it's been around for a long time, and it's run by a core group that's totally unrelated from R Studio. And it's primarily designed for statistical computing and visualization. And it turns out that it has some other really nice strengths that we can talk about, too.
R Studio is a company, right? So, R Studio was founded by JJ Allaire, along with Joe Cheng, who was one of the early employees, and Hadley Wickham, that you probably know about if you're in the R space, who works at R Studio as well. And the mission of R Studio is to improve computational and scientific reasoning through data, using programming. And we don't even necessarily limit ourselves to R, but we're very R-centric, right? We believe in APIs, that you know, you should be doing, connecting with other systems. And we also believe in reproducible research, that all of your work should be scripted out and programmed, so that you can communicate with other people, and collaborate with other people on your research.
So what R Studio does is, it builds tools that sit on top of the R programming language, that really take full advantage of the R programming language.
Kirill Eremenko: Okay, gotcha.
Nathan Stephens: Our most popular product, by far, is the R Studio IDE, and if you've used R, you've probably used the R Studio IDE. It's free, opensource software that you can download and use to interact with R.
Kirill Eremenko: Yep, and IDE stands for integrated development environment. That's like the window in which you program things.
Nathan Stephens: It's, yeah, the data scientist's lightsaber, right? That's the tool they're going to use to do their work.
Kirill Eremenko: Yeah. I tried programming, when I was learning R myself, I tried programming a little bit. And you know, you can program R in a text editor, and then just apply, it's like R is a compiled ... R is an interpreted language, not a compiled one. So, you apply the interpreter to the text editor. And you can still get the results, but it's so much easier and more efficient in an IDE.
Nathan Stephens: Mm-hmm (affirmative), exactly.
Kirill Eremenko: Gotcha, all right. So, that's a great overview of R Studio and R. And what about your role? What's your role in R Studio? Or in fact, you started at R Studio three years ago. Has your role evolved over time?
Nathan Stephens: Yeah, my job changes every six months. You know, I'm doing something new every six months, because it's a small company, and it's a growing company. And that's what happens at small, growing companies, is your roles change. So, I'm a solutions engineer now. And what we do in the solutions engineering group is, we help customers integrate our products into their systems. So if you buy our products, and you want to work with them, with databases, or with Hadoop, or Spark, or Kerberos authentication, or on the cloud, any of those types of problems, we get involved with those problems.
So, we're really there to help build enterprise systems, and help the architects and the IT groups manage these workflows.
Kirill Eremenko: Mm-hmm (affirmative). Okay, gotcha. And just before the podcast, you mentioned that, or in the email correspondence, you mentioned that you have moved on from being a data science practitioner, more to the role of a data science tool builder. And that gives you a unique perspective on career opportunities for data scientists. Could you tell us a bit more about what is, what does a role of a data science tool builder entail? And how does it compare to just a data science practitioner, a standard role? And what are those unique career opportunities that you mentioned?
Nathan Stephens: Yeah. So, let me be clear on like, what the shift is. So, I no longer analyze live data. So, data scientists are largely there to, a chief component of the data scientist's job is to get insights and understanding from their data, to influence decisions, actions, and results. I no longer do that. I don't have live data. I don't analyze live data, and I don't take any data insights to influence actions and results, not from ... By live data, what I mean is, you know, living data that's coming in through other data sources that I can analyze.
So, let me explain how I got here. So when I was at the client services, I always was very interested in this idea of systems, and architecture, and building data products. That's what I got to do at the ad network. And when I went into client services, I actually got a part of my time reserved to building analytic infrastructure, in addition to all the client services work I did. And as time went on, I found that I got more and more interested in that analytic infrastructure role, to the point where I was helping my other clients learn how they would implement their analytic infrastructure as well.
So, I was working heavily with IT at this point, and other architects. And I was like, you know, working with the CTO to expand out the use of R. And that's why R Studio got interested in me, was because that particular skillset was what they needed over at R Studio. What was interesting about that was like, that wasn't the primary core of the job. My job was actually to work with the clients, you know, as a data scientist; but it kind of morphed into this other interest of me doing analytics infrastructure.
Kirill Eremenko: Yeah, interesting how you can discover new things on the job, and find out new interests that you have, and passions.
Nathan Stephens: Well, it was actually a real struggle, to be honest, because you know, if you worked for Deloitte, right, you work on billable hours, right?
Kirill Eremenko: Yep.
Nathan Stephens: And so, you're under a lot of pressure to bill a lot of hours. And you had hourly targets, yearly targets, that you're supposed to hold up. And all the while, I'm doing this other thing that isn't tied to billable hours, and isn't necessarily aligned with the corporate strategy, but something I feel really passionate and really curious about, right?
So, there was a real tension there about like, how to spend my time.
Kirill Eremenko: Yeah. That's when you start working like, evenings, and weekends, and you lose any kind of personal life, or sports, and health. Everything goes down the drain.
Nathan Stephens: Yeah. That's kind of client services in a nutshell, actually.
Kirill Eremenko: Exactly. All right. Well, that is very interesting. Tell us a bit more about, before we move onto the other components about the career opportunities, tell us a bit more about the analytic infrastructure. So, I encountered that, like when I was at Deloitte, it was quite closed off to me. I was just doing the consulting work, just doing the data science side of things. But then, when I moved onto the superannuation fund, or the pension fund, in Australia, I was heavily involved with infrastructure, and data architects, solutions engineers, and all these other different roles that I didn't even know existed. And I found that to be a fascinating role. Could you give us like a short excursion to the world of analytic infrastructure? What is it all about?
Nathan Stephens: Yeah, so that is a great question. That is a fantastic question. So, analytic infrastructure has two, the way I view analytic infrastructure right now is kind of in two components. You have this notion of a data lab, right? You have this idea that you have a sandbox to play in, where analysts can work with their data, and learn, and discover, and create. And most analysts I know love that part of the job. They want to go create. They want to build applications. They want to generate reports. They want to try new technology. They want to blow things up, right? I always say like, [crosstalk 00:41:47] not a big difference between a data scientist and a mad scientist, right; just a few letters.
So, I think creating a data lab for people, a playground for people to play in, is really important. And then, there's this other notion of running analytics in a production environment. And the difference between those two is that, in the data lab, the data scientists are in charge; and in a production environment, the IT group, or the IT operations are in charge.
Kirill Eremenko: Yeah.
Nathan Stephens: And that handoff becomes ... Well, we could talk about the handoff, but spanning those two worlds is the part that I find very fascinating, so that's that fuzzy area where I've lived. It's like, how do you connect this data lab to this production world?
Kirill Eremenko: Gotcha. I totally agree. Like when I went to this company, you would always get slapped on your hands for trying to like, run a query without asking in advance. Like, data scientists didn't even have access to SQL before I came in. Then I requested the access, and finally, after certain hurdles, we got it. And like, every time you run a query, they're like, "Oh, you could have hung the whole server, and you know, the production environment." And then, they have ... Oh, what's it called? They have these time slots, like in the night, when all the queries are supposed to run. I forgot what the exact, technical term for it is, but like, they have allocated time slots for certain queries because they know how much time it's going to run and so on. And they need to get a certain amount done in 24 hours.
And one of the first things that we did after a couple of those incidents, where data scientists were like, slapped on the hand, what we did is, we set up this data lab. It was like, I think it was called the sandbox. Some people called it the data lab, playground-
Nathan Stephens: Yeah, sandbox.
Kirill Eremenko: ... yeah, or called it a sandbox. And that really solved the whole issue, because you can just experiment as much as you want. There is still that issue of handover, which you briefly touched on, but at least it's not as bad. Like, people are not constantly chasing you up about things that you're allegedly doing wrong, and you get the freedom to experiment at the same time.
Nathan Stephens: Yeah. That's so fun to hear you share that, because that's been my exact experience as well. I had two guys from IT come over to my desk when I was at the ad network, and they were not happy at all. They towered over my desk with very unhappy faces, and wanted me to account for myself, you know?
Kirill Eremenko: Yeah.
Nathan Stephens: And that didn't happen once, but it happened twice. And then I got a sandbox, also.
Kirill Eremenko: Yeah, yeah. True. It's interesting how they have these systems in place to track down who exactly is the culprit. They find you very quickly.
Nathan Stephens: Yeah, yeah. They will find you.
Kirill Eremenko: Yeah, okay. And so, in the case of analytic infrastructure, so it's not ... Like, one of the steps is setting up the sandbox or the playground, and then dealing with different servers in the production environment, and things like that. What else is part of a role, the role of somebody who is in analytic infrastructure? What does the day-to-day look like there?
Nathan Stephens: Okay, yeah. So, I'm trying to parse that question, because that also feels like two questions. So, let me do the role, and then the day-to-day. So, the role of like, let's call it an analytic administrator. I actually wrote an article on R Views, which is one of our corporate blogs about R, about an analytic administrator.
So those analytic administrators, they have to be pretty awesome, to be honest, right? Like, they have to be connected to their data scientists, to understand what the data scientist needs. They have to be aligned with the executive audience to know like, what matters to the company; like, where the value is going to be, like what types of solutions are going to produce business value. They have to get along well with IT. So, they have to bring them doughnuts, and make sure that their voice is being heard, and that they're complying with all of those rules. And they have to be really good evangelists in general, about promoting the need for data science in the organization. If you're fortunate enough to work for an organization where data-driven decisions are happening, then that will be easier. If you're working for an organization that's maybe still more like, politically oriented, or making decisions from their gut, then you're going to have a little bit more work to persuade them that data science is meaningful in your organization; but being a proper evangelist is a really important part of the role.
What does that mean, day-to-day? Well, part of the day-to-day is going to be managing that data science lab that we talked about, right? Like, somebody's got to be overseeing that architecture, making sure that that thing is running. And you can either have, in some cases, IT will manage that; but what I've seen, usually, is more effective is if the analytics admin has like, some nice levers that they can kind of pull to control those things. They're also, you know, teaching best practices. So, they're educating data scientists on how to do things properly. I sometimes call it like, shared infrastructures. Like airports, you can't have all of the planes landing on the runway at the same time. The data scientists have to know who else is flying around them in the space, and who's coming in for a landing. So, you have to be aware of those resources.
And they don't know. Like, here's the thing with data scientists. Data scientists, they're just not trained to do this. Like, you don't learn it in school. So, somebody has to teach them, and it's going to be the analytic admin that's going to teach them, right? Like, they're just going to do things, like you and I were just saying, like we're going to blow up stuff, right?
Kirill Eremenko: Mm-hmm (affirmative).
Nathan Stephens: They're just going to do it. They're not going to know. And that's okay, because nobody taught them. So, there's an opportunity there.
Kirill Eremenko: Yeah.
Nathan Stephens: Other day-to-day would be, you know, making sure that you're getting your architectural review board presentation ready to go, to make R, or whatever language you're using, an analytic standard, to make sure that you have resources dedicated to that; like, that people are actually funneling human and financial resources into that work. And then of course, the production work is a whole nother ball of wax, but you know, if you're in the production side, you can actually make even greater impacts.
So, it's like a big job. And I tell people like, analytic admin, you're not going to see that in Indeed, or on your job searches. Like, people aren't advertising for it, but it's an actual need in organizations. And I know that because I talk to a lot of organizations in my role. Like, I'm on the phone every day with customers and potential customers, and almost all of them have this need. So, I like to tell people that if this is something that they're interested in doing, I would definitely go for it, because the need is definitely there; even though the job description might not be written for it yet.
Kirill Eremenko: Gotcha, okay. I just have one burning question from that. That was a great description of this whole role, and I think I learned quite a bit of new things for myself, just now. My question is, could you let me know why does it sometimes ... And I'm sure other data scientists will have exactly the same question ... Why does it sometimes take so long to implement a tool in an organization, especially like, for instance, I'm in an organization, and I want an opensource tool, such as R; like, I can download it on my computer, and run it within 30 minutes. Why does it take several weeks for an organization to roll that out to me, and to allow me to use it for analyzing their data?
Nathan Stephens: Or months, or years, right?
Kirill Eremenko: Yeah.
Nathan Stephens: Yeah. So, there are barriers here. And I don't want to be like a downer, right, but when we talk about large corporations, it's important to know that there's this long journey, decades-long journey, on how they get here. And a lot of them aren't really geared towards data-driven decisions. And a lot of companies don't really know what to do with data scientists, is the problem. Like, there's this notion of like, yes, it's important. We need really smart guys. Let's go get some really smart guys, and boom, we'll have a bunch of financial success.
And that's not really the way it works. Like, you really have to be thoughtful about how you're going to align a data science team with the overall corporate strategy. And the reality is that most companies struggle with that. So when you're in an organization, and you say, "Hey, I need a data science lab," a lot of organizations are not even going to know what that means. Or if they do know what that means, they're not going to be geared to a way to fulfill that request.
So, it's an evolution. And I think a lot of the younger companies that are coming up, like if you work for a startup, that's not going to be as big of an issue. Like, they're just going to know like, we run on Amazon. We're going to [inaudible 00:51:12], you know, a VPN ... Or, I'm sorry, yeah, basically a new server infrastructure, right, or an existing server infrastructure inside of Amazon, and will serve your needs. But like, a larger organization is going to struggle with that.
Kirill Eremenko: Mm-hmm (affirmative), yeah. And do you think that's going to be the cause, why all large organizations are going to end very soon? Or [crosstalk 00:51:45].
Nathan Stephens: No, I'm not saying that that's going to happen. I think there is a tension between like, the large corporations and the smaller companies; but I think, I'm actually very optimistic. And there's, I've met very talented people in all sorts of groups. You know like, I work with large financial groups, insurance groups, consumer packaged goods groups. And I'm always impressed with the quality of talent that these different organizations can attract.
So, I'm actually very, very optimistic about the future of data science, and the direction. What I get more concerned about, frankly, is that the data scientists themselves don't always really understand what they bring to the table. So, I'll be more specific. Data scientists are responsible for understanding their data. And nobody else in the organization has that responsibility. And so if you're a data scientist, and you're spending 80% of your job like, scrubbing data, that's because that is, you're in the role that does that. Like, nobody else is doing that. And the power of that is that when you speak about something, you can speak authoritatively about that. You have ammunition to say, "I know this is true because I actually have been in the data, and I've seen it." And I think that's one of the under-leveraged skills that I see with data scientists, is that they take ... Not everyone, but some data scientists will take that for granted. Like, "Oh, yeah." It's like, "I just happen to know all this stuff."
It's like, no, you know all that stuff. Like, take advantage of that. Make sure other people know about that. Like, broadcast that information. Make sure you communicate what it is you're learning, because I guarantee you, your boss, and your boss's boss, they're not looking at that data. They don't know unless you tell them. So, getting that information out is extremely critical for the success of the data scientist, and for their overall happiness in the job.
Kirill Eremenko: Yeah, and for the success of the business as well.
Nathan Stephens: Yeah, great point. I left that one out, but that's probably the most important one.
Kirill Eremenko: Yeah, all right. Wow, fantastic. That's such a good excursion to that world. Thank you so much. How about we shift gears a little bit and jump into R? Let's talk about R, and what's going on in R these days. And you know, like some great things, I'm sure you have so many great things to say about R.
Nathan Stephens: I do. I think R is fantastic. We were talking, before the call, about R and Python. Could we just jump into that one? Why don't we just hit the elephant in the room?
Kirill Eremenko: Yes.
Nathan Stephens: Okay. So, I don't think there's a war between R and Python. I think the analytic space is plenty big to accommodate two programming languages. And it reminds me a little bit of the conversation back in the '90s, when people were like, "Oh, it's got to be Apple or Microsoft." Well, guess what; computation is big enough to handle two large companies, right? We still have both of these.
So, I don't think there's a war between R and Python. I think that what needs to happen is, you know, you can ... Well, what needs to happen is that those two things need to work really well together. And in case, I just want to mention that we recently made some progress in that area, if you missed the announcement. We actually brought Wes McKinney on-staff at R Studio, and he's one of the well known developers in the Python world. He's the father behind Pandas, and he's now in charge of working in this thing called Ursa Labs. And you can query that, if you haven't seen Ursa Labs. It's named after the bear, right; Ursa Major, Ursa Minor; the Big Dipper and the Little Dipper.
And the job that he is leading up is really around interoperability between datasets and programming languages. So, what do I mean by that? If you're familiar with, Apache Arrow is the project that's building datasets that can be loaded into memory, both in Python, and into R, and into other programming languages. And if you can load, if you can share data across programming languages, you can easily jump in between the programming languages. Like, you could say, "Okay, I've got this R data frame. I want to like, use some Python magic on this." I'd boot up my Python instance, and I suck that data over into Python. Right now, transferring data is an extremely painful process. And you know, Wes is trying to make that a much easier process. And it's a very foundational piece in the toolchain that I'm really excited about.
So basically, my point is that we brought on one of the key Python developers, who works for R Studio now. We've made R Studio much more Python-friendly. We're still R-centric, right? Like, we are still saying, "We like R." But if you're an R developer, it's getting easier and easier to work with the Python tools; to call Python functions, and modules, and interoperate between the two languages. And I think that's a huge advantage for data scientists. The next generation of data science development is to be multilingual, and to take advantage of the things that Python and R both offer; and Julia, and you know, whatever other languages you might be working with as well.
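Editor's note: the idea Nathan describes, one in-memory dataset that several runtimes can read without converting or copying it, can be illustrated loosely in plain Python. The sketch below is only an analogy and does not use Apache Arrow itself. It relies on the standard library's `array` and `memoryview`, and the function name `make_shared_views` is hypothetical. Two "consumers" get views onto the same column of numbers, and a write through one view is visible through the other because no copy was made.

```python
from array import array


def make_shared_views(values):
    """Store values in one contiguous buffer of 64-bit floats and
    return the buffer plus two zero-copy views onto it.

    Hypothetical helper for illustration only; this is an analogy
    to a shared columnar format, not the Arrow API.
    """
    column = array("d", values)      # one contiguous block of doubles
    view_a = memoryview(column)      # consumer A: no data is copied
    view_b = memoryview(column)      # consumer B: same underlying bytes
    return column, view_a, view_b


if __name__ == "__main__":
    column, view_a, view_b = make_shared_views([1.5, 2.5, 3.0, 4.0])

    # Consumer A updates the first element through its view.
    view_a[0] = 10.0

    # Consumer B sees the change immediately, because both views
    # point at the same memory rather than at separate copies.
    print(view_b.tolist())
    print(view_a.nbytes, "bytes shared, none copied")
```

A real cross-language setup needs more than this: both sides have to agree on the exact byte layout, types, and metadata, which is the standardization work the Arrow project is described as doing here.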
Kirill Eremenko: Yeah, wow. I didn't know that. That's a very ... That's a huge stride forward with getting the languages closer, and hiring-
Nathan Stephens: I think it'll take a couple of ... Yeah, it'll take a little while to play it all out, right? Like, it's definitely part of the long game, but if I look down the road, I see a future where you've got people who know R really well, that are also very comfortable, you know, taking advantage of Python. So, Python opens the door to TensorFlow, Spark; and those are things that we've already incorporated in the R stack, is good connectors to Python, and to Spark, and to TensorFlow, via Python. And I think there'll be more things like that coming in the future.
Kirill Eremenko: Yeah. And I like your comment about multilinguality. That's very important; or, it's a great selling point for any data scientist to have on their résumé, that I know both R and Python. I have experience with both. That's where the world's going, right? [inaudible 00:58:34]
Nathan Stephens: Yeah, right. If you're a hiring manager, and you've got one person who knows Python, and another person who knows R and Python, yeah.
Kirill Eremenko: Gotcha.
Nathan Stephens: Yeah, it's an easy call.
Kirill Eremenko: It's a no-brainer. And so, just to clarify, is your vision that in a couple years, we're going to have one, combined language, R-Python? I'm assuming not. I'm assuming we're still going to have separate, R and Python, but the interlink between them is going to be very efficient and very high. In that case, what would you say that R and Python are good for, separately? Like, which one would you use for certain things, and the other one for other things?
Nathan Stephens: Yeah, yeah. I think you could answer that in a lot of ways. I've asked a lot of people, "Why did you choose R? Or why did you choose Python?" And I get a lot of different answers from that, but one thing I hear frequently, one thing that doesn't surprise me is that it seems like, it's like it's not even a question in their mind. They just kind of went to the language that actually resonated with them. They're like, you know, and R users are very much this way; it's like, "I just love R." You know it's like, you talk to people, it's like, "I just love that experience. I love what it does, and it's just part of like, who I am, even," like the people that really, really love it. Or maybe you want to build Shiny applications, right? There are things that R does, that Python won't do.
You know Python, I've talked to a lot of people that use Python. And sometimes, the answer is back, like, "What is R? I don't even know what it is." So, it's like maybe they don't even know what it is. I think if it's an individual choice, I think that's fine. Like, I think that's great. If you're a Java guy, and you love Java, that's fantastic. Just use the language that you want.
But what's interesting about the R language is that R is so, I guess, forgiving, or just inclusive of other languages. R is a little, there's some humility in the language. And it kind of gives up a lot of its control and power to other languages. So when you run a model in R, you don't actually run it in R. You call a C, or a C++, or a Fortran library to run it, right? When you run a Spark job, you don't run a Spark job in R. You're calling into the Scala API, right?
So like, and that's totally fine with what R is about. R doesn't really want to do that, anyway. R's just like, "Let me just introduce you to these other things." And that's ... So anyway, not a lot of people look at R that way, but that's the way I see R, as more of a way to orchestrate, you know, a lot of power and goodness to work with other systems.
Kirill Eremenko: Yeah, gotcha. And it makes the best of many worlds, rather than just trying to introduce everything on its own. That's pretty good.
Nathan Stephens: Yeah. R's pretty slow, right? Like, if you run things inside of R, it's pretty slow.
Kirill Eremenko: Well, not with everything. Some things, like specific ... What's it called ... like vector operations. There, I think, R outperforms Python in some of those cases.
Nathan Stephens: Right, right. Yeah, yeah.
Kirill Eremenko: But like, loops and stuff [crosstalk 01:02:04], totally agree with you. Like, R-
Nathan Stephens: Loops are pretty slow. Yeah, yeah.
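Editor's note: the point about loops versus vector operations applies to interpreted languages in general, not only R. As a minimal sketch in Python (chosen here for illustration; the function names are hypothetical), the explicit loop below executes every addition in the interpreter, while the built-in `sum` hands the whole job to compiled code in a single call. Both return the same answer. The timings depend on the machine, so none are quoted.

```python
import timeit


def loop_sum(xs):
    """Add the numbers one at a time in interpreted code."""
    total = 0
    for x in xs:
        total += x
    return total


def builtin_sum(xs):
    """Delegate the whole reduction to the built-in, which runs in compiled code."""
    return sum(xs)


if __name__ == "__main__":
    values = list(range(100_000))

    # Same result either way; only where the work happens differs.
    assert loop_sum(values) == builtin_sum(values)

    # Time each approach over a few repetitions.
    t_loop = timeit.timeit(lambda: loop_sum(values), number=5)
    t_builtin = timeit.timeit(lambda: builtin_sum(values), number=5)
    print(f"explicit loop: {t_loop:.4f}s  built-in sum: {t_builtin:.4f}s")
```

This is the same pattern Nathan describes for R: the high-level language orchestrates, and the heavy lifting is pushed down to a faster layer.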
Kirill Eremenko: [crosstalk 01:02:10] Yeah, all right. And what would you say about R and deep learning? Like, with the recent developments in using Keras with R and things like that, those are pretty exciting.
Nathan Stephens: Yeah, yeah. So, just piggybacking on that, that R is slow, it's like the solution to "R is slow" is to push that information somewhere else. Like, don't do it in R. Do it somewhere else. So with Python, with Keras, and deep learning, all of those routines are also, that's a Python world, right? Like, those are all written in Python. And what JJ has done, JJ's our founder, and he's done a lot of the engineering around Python and TensorFlow, JJ has written a nice library of connectors that allows somebody who knows R to take advantage of all of the work that's being done in TensorFlow; and not only take advantage of it, but actually give them a really nice experience.
So, we put things into the IDE to help you debug your models. JJ's very good at documentation as well, so there's a really nice set of ... There's a book that you can read. There's a library. There's a website with examples to learn about this. So basically, that technology is like, there and available today. Like, that landed a few months ago. And we're trying to invite as many people as are interested, to come experience it, try it out, and learn from it. It's really cool stuff. I have to say, it opens up a whole new dimension into problems that we previously didn't have tools for.
Kirill Eremenko: Mm-hmm (affirmative), yeah. Definitely exciting, and very, very exciting, especially for those who are used to R, and are now interested in deep learning and AI. And this is finally going to be available.
Nathan Stephens: Right.
Kirill Eremenko: Yeah, all right. Well, we're kind of like, coming close to the end of this session. And time has flown by, and I still have so many questions that I would love to ask you, but I guess I'll hand it over to you. Like, is there anything you would like to share with our listeners, or with aspiring and professional data scientists who want to grow their careers?
Nathan Stephens: Yeah. I think I've shared a lot with the career advice. Can I just make a shameless plug for what we do at RStudio?
Kirill Eremenko: Sure, of course. Go for it.
Nathan Stephens: All right, because a lot of people don't realize that we actually do sell professional-grade products for the enterprise. And those are designed to work with all of our open-source packages and tools. So if you're in the enterprise world, you're typically looking at like, security, authentication. You're trying to figure out high-availability scaling. You have like, mission-critical applications and whatnot in there. And we sell products to bring R into the enterprise, and make it an analytic standard in there.
So if you, today, if you are using R on your desktop at your job, and you're downloading data from your SQL server database, onto your laptop, and then taking it home, you know, and leaving it at a café or something, I would encourage you to think about going to the website, seeing what we have to offer, because we actually have a really nice platform for scaling out R in the enterprise; a really nice toolchain for doing that. And it'll make your life better, and increase the capabilities of your tools. And not a lot of people know that like, that's all available. So, yeah. I just wanted to point that out. Thank you for letting me make a shameless plug.
Kirill Eremenko: That's all right [crosstalk 01:06:00]. I just, I will reiterate that. Like, there's a lot of organizations, like we have executives, and directors, and entrepreneurs listening to this. And just for their purpose, for their sake, there's a lot of organizations that still use large, corporate tools, such as SAS, and other tools that are just there, archaically. And it's time to change. And I'm not saying anything against SAS, but the world is going open source. The power of open source is incredible, and the communities behind open-source tools are really empowering very fast changes, very fast developments in the algorithms, in the speed, and in everything that the tool requires.
And so, if it's time for change for your organization, then RStudio is there to help. And also, if you are starting a new business, an enterprise, or taking an idea to execution, to actually building a company around an idea, then don't go, it's probably not the best idea to go for some enterprise-specific tool that is not open source. Why not go for an open-source tool, and get in touch with Nathan? He'll set everything up for you.
Nathan Stephens: I think that's fantastic. Can I add one thing on that?
Kirill Eremenko: Yeah, sure. Of course.
Nathan Stephens: Because I 100% agree with everything you just said. Things are changing rapidly, and when I talk to people who are in the hiring position, who are trying to build out their platforms, you know, and bring in the best, you know, to adapt to this new world, there's this idea of bringing in the best talent, as well. You're trying to capture the data scientists, and they're in high demand. They can be expensive, right? And it's a big investment.
And by and large, that new demand that's coming in from colleges, they're going to know R, and they're going to demand that there's R tools available to them in their job. And so, making an investment in R, I feel very, very strongly, obviously, because I work for RStudio, but feel very strongly that an investment in R is a good move in bringing in the best talent out there.
Kirill Eremenko: Gotcha, couldn't agree more. All right, Nathan, so thank you so much for sharing all the insights, and your wisdom, and your career journey. Where could our listeners get in touch with you and contact you, if they'd like to learn more, or maybe explore the opportunities with RStudio?
Nathan Stephens: Yeah, you're welcome to reach out. My email at RStudio is [email protected]. My Twitter handle is NWStephens; and also, everywhere else on the internet, it's going to be NWStephens.
Kirill Eremenko: Yep. And LinkedIn is a good place to get in touch with you?
Nathan Stephens: NWStephens, yeah.
Kirill Eremenko: Yeah, awesome.
Nathan Stephens: Yeah, LinkedIn is great.
Kirill Eremenko: Awesome, all right. We'll include those links in the show notes, and we'll try to find that article that you mentioned, that you wrote about the analytics admin. That was really interesting. I have one more question for you today. What is a book that you can recommend to our listeners, to empower their careers even more?
Nathan Stephens: Well, I'm going to make another shameless plug for Hadley Wickham's book, called R for Data Science. It is about R, but it also has some great foundational material, just about how to think about and approach data science. And so, that's why I recommend it.
Kirill Eremenko: Yeah. Does Hadley have a few books, because I'm sure I've read one of them, and I think it's this one.
Nathan Stephens: Hadley has, yeah, Hadley is amazing with the amount of content he pumps out. And yeah, he's got a few books. I neglected to mention that it's co-authored with Garrett Grolemund, as well, who also works at RStudio.
Kirill Eremenko: Okay, gotcha. It seems like you've got all the top analytics talents working for RStudio, and now you're poaching from Python as well.
Nathan Stephens: I have the great ... When I go to a meeting, I assure you, I'm the dumbest one in the meeting. It's really nice to work with such amazing people.
Kirill Eremenko: Exactly. Like, that's my, I always appreciate when I'm the dumbest person in the room. That means there's places I can grow, right? Like if you're the smartest person in the room, you should be in a different room.
Nathan Stephens: Yeah, yeah. I know, and I don't say that, yeah, just in false sincerity. I really mean it. I'm the dumbest one in the room. It's really a great experience, actually working with so many wonderful people. And they're not just smart at their jobs, but they're wonderful people to get to know as well. I'm just really impressed with the character of these people that I get to work with.
Kirill Eremenko: Yeah. Well, the character of this podcast has been amazing. Thank you so much, Nathan, for coming onto the show and sharing all these wonderful insights.
Nathan Stephens: Thank you so much for the opportunity. I really enjoyed it. I learned a lot.
Kirill Eremenko: All right. Talk to you soon. Bye.
Nathan Stephens: Bye.
Kirill Eremenko: So there you have it. That was Nathan Stephens from RStudio, sharing his career journey, and all the latest and greatest updates from RStudio; you hear it directly from the person who works there as a director. And what was your favorite part of this podcast? Mine, by far, was the analytic admin concept and description. Nathan obviously has a lot of experience in this space, and he described the idea behind what an analytic admin does, or what that role entails, very aptly, and it makes a lot of sense that companies should have a person like that on board if they are looking to build a lasting analytics culture, a sustainable approach to data science, where everybody's happy. The IT team is happy, and the data scientists are happy as well.
So there we go. That was Nathan Stephens. All of the show notes, and links, and all the things mentioned in this episode are available at www.superdatascience.com/171.
There, you will also find a transcript for this episode, and the URL to Nathan's LinkedIn. Make sure to connect with him, hit him up, and stay in touch. If you are looking to implement RStudio at an enterprise level, or a corporate level in your company, then make sure to get in touch with Nathan. He'll guide you through the process, and at least give you some tips.
And finally, if you know somebody who uses R as their programming language, who is a big fan of R, or who loves RStudio, why not send them this podcast? There's a lot of valuable information, a lot of updates on what's going on in the R space, and I think there's a lot to learn here. So, make sure to forward it on, and you might help somebody out; your friend, your colleague, your relative. Help them out in their career in data science.
And on that note, thank you so much for being here. Can't wait to see you next time. And until then, happy analyzing.