SDS 111: The Power of Soft Skills in Data Science

Welcome to episode #111 of the SDS Podcast. Here we go!

Today's guest is Senior Data Scientist at LinkedIn Learning, Eric Weber

In the digital age, people have grown accustomed to accessing entertainment and social media wherever and whenever. Imagine a situation where people are accustomed to reaching for their smart devices at every opportunity, primarily to learn a new skill or acquire knowledge. LinkedIn Learning, Udemy, Coursera and other learning platforms are preparing for a future where learning is a widespread habit.

In this episode, Eric Weber tells us about his career in teaching graduate students and how that evolved into a job where he can have a much more significant impact on education and learning.

We also talk about soft skills in data science, skills that are often under-rated in this space, and yet are essential especially in the leadership of data science teams.

Listen in to find out why rock star data scientists are those who master soft skills!

In this episode you will learn:

  • A teaching career at University of Minnesota (04:03)
  • Screening calls are a very important part of the recruitment process (08:12)
  • Everyone should contribute to spreading the culture of learning (14:52)
  • Eric’s role at LinkedIn Learning and why he enjoys what he does (18:05)
  • Ill-formed questions are a big challenge in finding solutions for business teams (26:10)
  • Tips on how to successfully manage ill-formed questions (31:00)
  • How to be effective in a project that has multiple stakeholders (35:01)
  • Use soft skills to develop context expertise (37:43)
  • In today’s business environment, executives need to understand the concepts, principles, and powers of data science (44:25)
  • Soft skills are crucial in presenting project insights and getting buy-in (51:16)
  • How Eric discovered his passion in teaching (55:31)
  • Developing a concept for looking at the ROI of data scientists (59:33)

Full Podcast Transcript

Kirill: This is episode 111 with Senior Data Scientist from LinkedIn, Eric Weber. Welcome to the SuperDataScience podcast. My name is Kirill Eremenko, data science coach and lifestyle entrepreneur, and each week we bring inspiring people and ideas to help you build your successful career in data science. Thanks for being here today and now let’s make the complex simple.

Welcome back to the podcast, ladies and gentlemen, I just got off the phone with Senior Data Scientist at LinkedIn, Eric Weber, and we had a great chat. I really enjoyed the chat, and what you will learn in this podcast is first of all, of course, how Eric got into LinkedIn. He just started his job a few months ago and we talked a bit about the interview process and what he really enjoys about LinkedIn and specifically the division of LinkedIn where he is, LinkedIn Learning. Then we unexpectedly delved straight into soft skills in data science. It was a very interesting turn in our conversation and we chatted for ages about soft skills in data science, a very important topic, I think a lot of data scientists miss out on this and the great, the rock star data scientists are those who master the soft skills. So, if you want to be a rock star data scientist, this podcast is for you, we’re going to be giving away lots and lots of tips and hacks from our personal experience. I can’t wait for you to check it out and without further ado, I bring to you Eric Weber, Senior Data Scientist at LinkedIn.

[Background music plays]

Kirill: Welcome everybody to the SuperDataScience podcast. Today I’ve got a special guest from LinkedIn, Eric Weber, data scientist at LinkedIn. Eric, welcome to the show. How are you going today?

Eric: I’m doing great and I’m excited to be here. How are you?

Kirill: I’m doing good as well. Tell us quickly. You work in the Bay area but right now you’re somewhere else. Where are you, Eric?

Eric: I’m actually back home in Minnesota right now, it’s a little bit colder.

Kirill: How cold is it?

Eric: A little bit windier, so the wind chill is around 10˚F today. It’s a little bit cold so I’m looking forward to getting back to the Bay area but spending time at home, family, is great as well.

Kirill: That’s awesome. Yeah, I just looked it up, it’s minus 12˚C for those of you who operate in Celsius, that’s crazy. I’m in San Diego right now and it feels like summer here and at the same time you’re like, what, a three-hour flight away, it’s already minus 12 there, it’s crazy.

Eric: It is stunning. To get off the plane and feel that temperature, it’s a good reminder of what home feels like.

Kirill: Are you looking forward to Thanksgiving? What do you guys normally do for the holiday?

Eric: I am. Yeah, it’s a big holiday for my family, so we get relatives from all sorts of areas around the country together and do a big dinner, spend time together. We definitely don’t participate in the crazy Black Friday shopping, I think I gave up on that years ago, so, but generally just spending time with family. It’s a good opportunity to take a break from work and have a chance to see everybody that I don’t get to see regularly.

Kirill: That’s awesome. Cool, let’s talk about how you got to the Bay area a little bit. What made you move there? You told me you moved just a few months ago.

Eric: I did, yeah. I actually moved to the Bay area at the end of June, so I haven’t been around LinkedIn for a long time though I feel like I’ve done a lot in a short time. What brought me there was twofold. One, my interest in data science which had been rapidly developing over the previous couple of years. I was, since 2015, teaching statistics and biostatistics at the University of Minnesota. That involved a lot of teaching programming courses, statistical experiments and design courses and in conjunction with that, I was doing a part-time Master’s program in Business Analytics at the Carlson School of Management. Those two things together, my interest in data, interest in statistics, really pushed me in the direction of data science. The other thing is my focus on love of education. Ever since I can remember, I’ve loved teaching and I’ve loved thinking about how to focus on learning.

Those two things, data science and learning, naturally turned into an opportunity at LinkedIn to work in their data science team, working specifically on the LinkedIn Learning product. For me it’s kind of a dream come true to be in a job where I get to not only like the techniques and the approaches that I use, but also to be passionate about the area in which I’m working because in data science it’s nice to have that passion, to feel like what you’re doing makes a difference and has a real impact on people. Those two things brought me to the Bay area. In addition, I think the weather is a little bit better than it is at home.

Kirill: Awesome, and I totally agree with the concept that you’ve got to be passionate about what you’re doing and that’s amazing that you found that. But you make it sound so easy, like, I was interested in learning and I was doing data science, and naturally that turned into an opportunity at LinkedIn. Walk us through a bit more, like did you apply for the job, did they find you, how did it happen?

Eric: I did actually apply for the job and so I should step back and reflect on exactly how not easy it was. I still have easily over 100 rejection letters that I just kept in an email inbox, during my job search. Because the reality is, data science, though there are a lot of jobs, it’s important to find the right fit. To me, why I mention naturally, is that once I saw the position open and once I had a chance to meet with the data science leaders and the managers and go through the interview process, it felt natural, it felt like the type of position I was supposed to be in. But I think it’s anything but when it comes to the application process. As anybody listening knows, it’s chaotic, it ends up working in ways you did not anticipate, it can be frustrating, all of those things. That’s actually why I started writing, I never had posted on LinkedIn until I started working there. What I wanted to share was sort of my journey about what it means to be in data science, find a job, learn on the job and be real about, there are frustrating aspects to it as much as we have an awesome career, there are a lot of things that are difficult about data science as well.

Kirill: Yeah, I totally understand, and we’ll get to that in a moment. When you applied for the job at LinkedIn, you sent in your résumé, you heard back from them and where did it go from there?

Eric: It’s a pretty typical interview process. Once you make it through the initial screen, typically a recruiter reaches out to you. I had a little bit of a unique process because at the time they were changing over from using more of a pipeline interview process, much more in line with what Google and Facebook do, where you don’t necessarily interview for a specific position, you interview to be part of the team and then placement happens later. I had an initial reach out from a manager in data science, that was followed by a conversation with a recruiter and something I want to emphasize because I think it can be overlooked, is the importance of those screens with recruiters. I think often that can be viewed as just an information-gathering step where you’re automatically taken in to the next round. I found in most cases it’s not that. They care a lot about your cultural fit, about how well you know the company, why you want to work there, so that was actually a huge step for me. I had a great positive impression coming out of that conversation with the recruiter. I went through a standard interview process after that, we have a couple of technical phone screens, I can’t get into too much detail about what exactly they involve. But certainly, we tried to evaluate things like product sense and then also the ability to do data manipulation. You can imagine standard ways of manipulating data with SQL and Python and R and those types of things.

The onsite interview process that follows the successful completion of those technical phone screens, the onsite interview process was a day, you spend a very intensive day going through a set of different interviews and depending on the position and the team, those can take on a little bit of a different flavour. You can imagine we cover things like statistics, we cover things like machine learning, and all of this is available to people so they know the general topics before they would ever come onsite. I think it’s helpful to just see that there’s not really a trick to it. Data science interviews I know if you read, prep for them, often they can say be ready for these 100 different questions, but in general we try to do a comprehensive evaluation of what data science knowledge means in a particular company. From that point, I went through a standard negotiation process with the company and I actually moved pretty quickly after that. From acceptance of the term offer to starting, I think, was less than three weeks. It was a pretty intensive move from Minnesota all the way to the Bay area but …

Kirill: Yep. Packed your bags, packed your skis.

Eric: Yeah, pretty much all those things. It’s been great ever since and I’m happy I made the move.

Kirill: That’s awesome. Sounds like a lengthy interview process and thanks for walking us through it. But also sounds like it was worth it, you’re really enjoying your time with LinkedIn right now.

Eric: I am, yeah. And I think it’s interesting because I get a lot of messages everyday on LinkedIn saying, I want to work there. I think it’s important in some cases to think about why because, yes, it’s certainly a brand name company and it’s helpful to have on a résumé. But more than that, when people ask about referrals and things like that, I often will respond and ask, why do you want this job in particular? And if it’s because of the brand name, I often don’t think that’s necessarily enough in data science right now.

Kirill: Of course, the correct answer is data, because of the LinkedIn data.

Eric: It is. It’s fascinating. Every day I find out more things that I don’t know and that I get a chance to learn, so no regrets. It’s been great.

Kirill: Nice, that’s really cool. It’s a very interesting subdivision of LinkedIn. Is this LinkedIn Learning or is this Lynda? Because LinkedIn acquired Lynda a few years ago, which one is it?

Eric: It is LinkedIn Learning. The content, depending on how you talk to our marketing team or our branding team, they might say different things. But really LinkedIn Learning and the content is generated through what’s available on Lynda. They did acquire Lynda two years ago, but LinkedIn Learning is the eventual product. As we move customers from Lynda to LinkedIn Learning, one day that will be what we want to house everything under, as LinkedIn Learning.

Kirill: Gotcha. This is like educational courses, similar to Udemy or Coursera, and other platforms that are out there. Can anybody take these courses or is it only available to companies as part of a corporate subscription?

Eric: Anybody can take the courses. There are consumer subscriptions, there are enterprise level subscriptions, free trials for anybody who wants to try to use it. Right now, we’re in a phase where we are looking to grow our customer base and in some cases, that comes from existing customers on Lynda but we’re also trying to generate courses and topics that are of great interest to industry professionals and trying to integrate that into the LinkedIn ecosystem. We have that advantage of having a large platform and being able to try to build courses in education within that platform, it’s really our ultimate goal.

Kirill: Okay, gotcha. Past couple of times I was flying especially the long flights like Emirates flights specifically, they have on the selection of movies and programs, they actually have LinkedIn Learning. That’s also something that you guys do, that’s really cool.

Eric: Yeah. I think one thing that I would say about LinkedIn Learning is we have some incredibly creative and amazing people that are both in data science but also trying to think about how we share this product and how we try to motivate learning across the world. I think there’s a lot of ground-breaking ideas and we’ve seen attempts and moves in this direction by Coursera and Udemy and other platforms, and I think there is still this central question that we’re all trying to solve, how do we make learning a habit? We want to be like Facebook, we want people to develop this habit of doing something except in this case it’s learning. I think it’s an important question to think about, how do you shape human behaviour to get them to repeatedly come back to something and continue to learn it. It’s a big challenge that I don’t think anyone has really solved well at this point.

Kirill: Yeah, interesting. The interesting thing about this podcast is people listening to it, most of them are already in the habit of learning. They’ve taken courses and they’re listening to this podcast to learn. What do you have to say to them, people who are already in this habit?

Eric: I’d say stay in the habit. I realize that sounds relatively trivial: keep doing what you’re doing. But as this field develops, I think one thing that I’ve come to see is even after a year of really being out of school, I have to go back and revisit a lot of the things that I knew or thought I knew at the end of my formal education, because things change really fast, so this attitude towards learning matters. And the other thing that I think maybe is overlooked is developing an individual attitude toward learning is awesome. A really powerful thing that you can do with that is, try to extend that to other people. Try to share your passion with the people around you that may not have the same outlook, maybe they don’t know about the same resources that you do. If you can think about how powerful it is to communicate that type of attitude to a larger organisation, I think that’s where a lot of the big gains come from. I would put that as a challenge to people listening to a podcast like this. That’s awesome that you want to learn, and you have that attitude, the next big challenge is how do you share that with other people?

Kirill: Amazing, I love that. It’s a great challenge. I would actually take it a step further to quantify it and say, guys listening, the challenge is get at least two other people who are not in the habit of learning. Introduce them to something that you’ve learned recently like a course or maybe an article or book or whatever that’s benefited you and changed your life. Once you’ve done that, shoot Eric a message on LinkedIn boasting about how you’ve accomplished this challenge.

Eric: I’d love to hear about it. That’s kind of my reason for posting the things that I do on LinkedIn. I’m trying to share the things that I learn every day, that I figure out at work. Even if those aren’t always amazing accomplishments, even the little stuff can be really powerful for people.

Kirill: Yeah, I totally agree. Speaking of work, can you without disclosing any classified or company secrets, walk us through some of the things that you do at work because people are definitely interested in what a data scientist does at LinkedIn and especially in the space of education.

Eric: Yeah, that’s awesome. To give you a sense of what I do, I think it’s helpful to contextualize data science at LinkedIn a little bit. I’ll put this in as a plug, our data science org is already large, we’re well over 150 in terms of headcount on the data science side of things. But we also have open headcount; at any given time, we probably have 20-25 open, because we’re growing. This is something that I want to emphasize, just because we’re big, doesn’t mean that we’re not going to continue to grow. We do a lot of different things. You can imagine with a group that size, we handle a lot of different data science tasks.

We’re split by product. I work in our LinkedIn Learning product, we have Talent Solutions which is what any recruiters hearing this will be familiar with when they’re trying to find talent, we have Sales Solutions for sales people, and we have Marketing Solutions. Those are really our four main verticals or products so to speak, within the organisation, so we organise our data science teams around those. Within each of those, we tend to focus on a product side and a go-to-market side. The product side is more what you would think of in actually design of the product, where you’re trying to evaluate and test features and things related to the product itself. The go-to-market side is rolling things out. You can think about go-to-market as sales, marketing, content, anything that falls within that realm. I specifically work on the go-to-market side within LinkedIn Learning and what that means is I kind of cover sales and I cover sales for our bigger customers. We ask questions, of course without disclosing too much, about how do we know what our best customers or the customers we’d like to go after, how do we know what they look like? How do we build profiles of them, how do we help, let’s say, sales people understand the best approach to sign to a particular customer? How do we make our sales org as efficient as possible with data? I think that’s the central question that everything I do falls under. Is, how do we make the sales side more efficient? At the end of the day, our value is really measured by how efficient we can make the sales organisation. We are not a revenue generating piece, I think that’s an important thing to remember. In some cases, I can say, okay I build profiles of what our best customers look like. 
In cases where we want to try, you can imagine how many different companies we have on LinkedIn, we can’t call all of the companies, so we want to understand which company should we be calling and if we call them, what kind of things should we be asking for, what kind of things should we be saying? You can imagine a lot of the customer segmentation type things and machine learning models that underlie a lot of that work and as you hinted at earlier, we certainly have no shortage of data and that’s one thing that we sit with a distinct advantage of, having a lot of information. We can measure things that may otherwise be difficult to measure or may otherwise require pulling together maybe information from 50 different third-party sources. We have it centralized and so to me that’s one of the most exciting things. We can generate ideas and then see exactly how powerful our data is to try to help us accomplish those.

Kirill: That’s really cool. With the amount of data that LinkedIn has, especially like you’ve got static data, information about the person’s background, about their age, about their previous job experiences and stuff like that, that changes but very slowly over time. And then you’ve got dynamic data which is constantly updating, like people posting stuff, people viewing stuff, companies posting updates and so on, data that updates every day, sometimes even every hour. Which of those two types of data would you say is the most valuable for your predictions?

Eric: I think that’s actually an open question. Part of what we try to do is we try to spend our time working on open problems, where we aren’t necessarily sure what the most valuable piece of data might be. To me when I answer what is most valuable, I think it depends on what the end user is looking for. In my case the end user is let’s say our sales intelligence side. If they’re asking for characteristics of what the best customers look like, most likely those are going to be slowly changing dimensions over time. But let’s say that in some cases we find something new that we want to be able to measure about a customer, that might require going to understand data that changes every hour like you mentioned. Of course, it can change a lot faster than that. One of the operational challenges is figuring out as we build models and as we try to do analysis, it’s much easier … I shouldn’t say easier. It’s more traditional to look at the things that slowly change rather than have to rebuild models over and over and over again as we have data come in every hour, every minute, every second. I think one of the things that I find fascinating about our work is it’s not just a one-off build the model and then generate insights from it. We need to be able to think at a production level. I’ve posted about that before on LinkedIn. We want to be able to scale things over time and I think a lot of companies are starting to encounter this challenge about how do you scale these models that you’re building. How do you do this? How do you monitor their health over time? Even as let’s say the data is changing. As data science orgs become larger and larger, within a lot of companies, that’s particularly true in The Valley, we’re having to answer these questions about scalability. It’s not just about one-off best model performance any longer.

Kirill: In an organisation like this, would you have people, specific data scientists who are designated to that question of scalability, or does everybody in their role have to take that into consideration? For example, does that affect you, that problem?

Eric: I would say everybody has a role in it. Though we certainly have hierarchies within the data science org, we also operate in a very flat structure type of way where we’re expected to really be working not only with other people in our team, but across teams and horizontally, and tackling these challenges together. I would say that one of the things that I’ve noticed in my first few months at LinkedIn is people take responsibility for getting things done. If they notice a problem, if they notice an issue, they take ownership, and that taking ownership of the data product that you create and its health over time, that in a lot of cases can be cradle to grave. Right from its creation all the way to the time that we phase it out. There’s not necessarily just a few people who monitor models or the health over time or focus on scalability, though certainly at some point we have to decide on smaller teams to be able to tackle individual issues. It’s a data science problem but it’s also an org-wide problem. Any company that’s growing really quickly has to deal with an engineering problem, data engineering, data ingestion. To me it’s a big grand experiment in how fast you can scale and how effectively you can scale as the data increases.

Kirill: That’s a very apt description of that. So, you create models and you mentioned there’s some things about your work which are difficult. Let’s dive into that for a few minutes. What do you find the most difficult in your work as a data scientist?

Eric: I think the thing that’s most difficult and also probably the thing I enjoy the most, kind of funny to put those two things together. We often receive problems that are relatively ill-formed. By that I mean our business partners don’t come to us with a problem that says, build this model in this particular way and tell us the exact result. We often get a question or a directive that is stated at a high level. We need this to be better. And so the question is, what does it mean for something to be better? That’s a whole discussion in and of itself. Then also deciding if we’re going to make this better or improve this model or find a new metric, what data do we need to do that? The assessment of A, do we have the data readily available, B, if we don’t have it readily available, do we have it, period, and can we get to it, and C, if we don’t have it, is it worth it to go pursue it? Is it worth it to get this new data? Overarching all of these issues to me is, if we are going to be able to answer a question, an ill-formed question, we need to be able to rely on the data that we have to be accurate. For me, it’s starting with this difficult-to-answer question because it’s not stated as something that’s nicely tied back to a model, trying to turn it into a data science problem and then as you turn it into a data science problem, you find all of these issues. Data availability, data quality, every organisation deals with these, but for me it’s this process of sort of unpacking the business problem to a data science problem, to then giving back and actually seeing what is the source of truth data that I need to answer this question? Then of course, going back up. Once you’re able to measure what you want to measure, build the model in a way that you want to, how do you repackage that for the business partner? Because on their end, they only want an answer to their question. 
They don’t necessarily need to know all these details about how things worked out, they just need to know that it works and that your response works. I find that to be the most challenging piece, is to at the end of the day, take what may have been a really complex process and distil it down to a relatively simple and actionable answer. For me that’s the hardest but also the most important and enjoyable thing that I do.

Kirill: That’s awesome. You’re acting like an investigator in that sense.

Eric: Yeah, absolutely.

Kirill: That’s really cool. Can you give us a few tips? Since this is your most challenging part and the one you most enjoy, I’m sure you have some best practices that you have that you might be able to share. For instance, like I’ve encountered similar situations as well, like back when I was in consulting, the problem was usually formulated already, not by the client but by the relationship manager in the business. Whoever sold the project to the client already formulated the problem or the issue of the business into a data science problem or question, and we were just solving it as analysts and data scientists. But then when I moved out of consulting into industry, I was hit smack-bang on with that exact same situation where your colleagues in the business come to you with issues, they’re ill-formed as you say, and then you have to turn them into problems. Once I got over that, got the hang of that, how to put them into problems and questions that are solvable with data science, I came across another issue that unless you actually put it into writing and get their confirmation that this is the rephrased version of their initial issue, people tend to do something that’s called scope creep to you. Where you’re working a project and then they’re like, can you do this as well, can you do this as well, can you do this … And they think that’s related to their original issue but it’s not any more related to that question that you formulated. One of my biggest takeaways from that was once you’ve done that formulation and once you’ve come up with a new question or objective that you can solve with data science, you’ve got to put it in writing, you’ve got to get their written confirmation that this is what you’re going to be working on, otherwise there’s not going to be an end to it. Have you encountered anything like that, and do you have any other tips that you might be able to share with us?

Eric: Basically, I feel like I could say exactly what you just said. I think it gets to a larger point, it’s that communication matters. My advice is to over-communicate. Some people might not like getting a ton of emails from you but in the end, over-communicating, making sure that everyone is looped in and everyone has a solid understanding of the problem that you’re solving, the time that you think it will take to solve it, and also what the end-result might look like, what is the data product or insight that you’re expected to deliver on the other side. Circling this, getting a ton of business partner buy-in, also making sure that you have a business partner supporter. I think that’s something that I would suggest. Make sure that you have … Because sometimes you mentioned scope creep, people pile things on and expectations, you have to have a business partner who is on your side, who’s willing to push back on other people because sometimes that can be difficult as a data scientist to say no. I often try to say over-communicate, get business partner buy-in, and find someone who’s a champion, someone who is going to be a support for you. Because without this step, you may go down a rabbit hole, you might find that you produce something in the end and people say, I thought you were going to do this. You always want to have that buy-in up front. I think what you mentioned before is the perfect example, communicate, communicate, communicate. Everything else comes from there.

Kirill: Gotcha. Just to sum those up for our listeners, make sure to put the agreed end-result expectation in writing, put your timeline in writing, get business partner buy-in, ideally get a business partner supporter who’s going to be on your side, and over-communicate. On that last one I’d like to add as well, I totally agree, and I think not just at the start or on the onset of the project. Over-communicate, the way I understand it, is even as the project goes along, you keep communicating. You don’t hold your cards close to your chest, instead you say, like you’ve run into an issue and you need more time, you expect delays, you tell them. As you, like you hit a checkpoint in your project, you show the results. Sometimes it’s tempting to keep all the results hidden and then surprise them at the end, like look what I did, this is amazing. Sometimes you’ll get everybody clapping and applauding, but sometimes you’ll also get people like, this is not what we wanted. It’s better to actually show them along the way and that way if they’re agreeing to your checkpoints like every week, that you’ve made progress, then it’s much harder for them to go back on their word and say, actually this is not what we wanted, because they were with you along the way the whole time.

Eric: Yeah. Try to include them, think about them as partners along the way. You very rarely are going to be able to do something completely independently and produce something that will make everybody happy, so you need to kind of carry them along with you and sometimes that seems like a bit of a burden to send let’s say detailed updates every week, but it pays for itself in terms of time and payoff in the end.

Kirill: Totally agree. Have you ever had a project where you have multiple stakeholders and there’s no one obvious main stakeholder? You might have like three, four people, talking to you about the project and then it’s not obvious who’s the main one. There might be like let’s say two teams and there’s two stakeholders and they kind of have a bit different questions or a bit different objectives. What do you do in those cases?

Eric: I think it’s important that everybody that’s involved in the project, that touches the project has a clear view of the different asks and requests that are being made to you. I often make sure that if I have multiple teams that are invested in a process and perhaps they each need something different, that they’re still well aware of what’s being asked at a global level. Because what’s being asked at a global level of you often impacts your ability to maybe be responsive to one team in a particular week, or maybe there was a fire on one project for a particular week, so you didn’t have as much time to catch up with the other one. But making sure that they are, again this goes back to over-communicating, make sure that you loop in everybody who has a stake in the project and before you commence on the project, make sure that there is an agreement and try to get this in writing, don’t just get an agreement in passing. Say these are the major priorities. Because when something takes a turn in the middle of the project and they try to change the priorities, it’s important to have something to go back to and say, this is really almost like a project charter, where you say these were the main things that were asked, any deviation from this really needs to be bumped up even higher. Outside of anybody in the team, maybe it’s a vice president that sits outside of everybody, but make sure that it’s clear about how changes can be made to that and they usually should not be able to be made by a stakeholder. It should really be somebody who can look at the project top-down and really see if there is a need for major changes.

Kirill: That’s golden advice. In consulting, that would be, if you want to make a change, you’ve got to pay more.
[Laughter]
The only difference is in a business you don’t say that but it’s still time, effort, it still delays other things. I love how this podcast has taken an unexpected turn into soft skills into data science. I’m really enjoying this. I’ve got another question for you. How important are people skills in that process that you find the most interesting and the most challenging in understanding how to change the problem? What I mean by that is not the people skills with the stakeholders that are asking you to do the project, but do you ever go out and walk around LinkedIn and talk to people and say, build some domain knowledge, build some expertise, not through the data but through talking to people and getting their insights into what’s going on into what they think about the issue?

Eric: Yeah. I would call this certainly context expertise and it’s something that is fundamental to delivering something of high quality in the end. Because in data science, we may be coming from totally different areas. I didn’t come from a sales background, yet I support sales intelligence, so what does that mean? That means I need to better understand exactly what it is that sales does, how they do it, and why they do it in a particular way. That requires going beyond the data science team. I need to go and seek out people who have a really good understanding of a particular context, in some cases maybe even observe sales people at work to really understand what that process is, what the constraints are, what the pain points are. That “people skills”, it’s so important because I would say if there’s one thing that can make or break your success as a data scientist, it’s that ability to seek out the right people to help you understand the problem fully. I think about it like this. When we have a data science problem, we are familiar with the measures and the quantifiable things related to the problem. But we also don’t know what we don’t know. Maybe there’s a metric or maybe there’s something that the business side or the sales people know is fundamentally important that isn’t captured in any of the metrics that you have. Your database is not going to scream and say, hey, go measure this. This needs to come from people who are on the business side and having that, not only the ability to go seek it out, but also just to feel like it’s an important part of every data science problem that you tackle. I think for me that’s something that I put on a checklist of whenever I’m starting a project. I better understand the context before I launch into trying to build any sort of model.

Kirill: Yeah. I agree with that as well. I heard somewhere, I might be getting the quote incorrect, but the meaning boils down to this: There are two types of knowledge. There’s things that we think we know and there’s things that we know that we don’t know. That’s it, and this is what we’re talking about here, that there’s going to be things that you know you don’t know and you need to find a way to get to them.

Eric: I love that. It’s well stated.

Kirill: Okay. Going out there, talking to people, it’s really fun. I like that part of the job, that you have this opportunity, reason, whatever you want to call it, to get out of your desk and actually walk around. Sometimes I feel like, it might be a bit of a stretch of a metaphor, but I feel like a doctor. You’re walking around the organisation, you look here, you talk to these people, you look at this process, you enquire about how this thing is working. In a previous role when I was helping customer support guys deal with their backlogs and stuff like that, I was actually allowed to participate in phone calls they were having with customers and kind of listen in on which questions they ask, what responses they get, and so on. It was so much fun, I really liked that part of being in the field kind of thing of data science.

Eric: You know what stands out to me about saying that is the qualities that you described about being excited about going out to understand the context and how the business works. That’s also an important part of developing leadership skills. I’m definitely of the belief that a lot of the best data scientists are eventually going to move up into high-level leadership positions and I think part of that comes from really understanding the business well. What you described is important for anybody not just in data science but anybody who really wants to take ownership and leadership in a business, is to take that initiative to understand what people’s experiences are like, what their pain points are.

Kirill: Exactly. And I’ll take your comment even further that I also read somewhere that previously in a previous century or the start of this one, it was important for an executive to be good at marketing, to understand marketing and how he’s going to sell the product, sell the business and so on. Because like long ago at the start of the previous, like in the 1900s and so on, it was more about established relationships and you had the radio and that’s all the media you had. Maybe some TV but not that much advertising, so you just go to the store and you get what you get and there’s not much competition going on. If you are the major player selling a certain product in a certain region, then the guys from across the country, even in the US, won’t be able to really compete with you. Then we moved into a very digital internet-driven age and all executives had to start thinking more about marketing, and it wasn’t just a given that their product would sell. Now all of a sudden, they’re competing with everybody, not just locally but nationally, even globally and there’s all sorts of different media so they have to have a good understanding of it. Now we’re moving into an era where the saying is no longer marketing is king, but data science is king. Now all executives have to upskill themselves in data science. It tallies with what you said that the most successful data scientists will move into executive roles and that’s a really cool way of putting it. I really like that. But even those who are now executives, if they want to stay where they are, they will have to get up to speed with data science and develop that side of things. They might have the people side of things, but they’ll have to develop the data science side of things. Very interesting future ahead, I think.

Eric: I agree. I think you make a great point in that a lot of the demand for data science is also coming from executives who want to better understand it. I think you’re seeing that as programs roll out nationwide for executive education, almost all of them have either specialties or some courses that focus on data science.

Kirill: Yep, exactly. Why do you think that is? Let’s say there’s for instance some executives, which there are, we know you’re listening. There are some executives listening or some business owners or entrepreneurs listening. Why would you say, in a nutshell, is it important for them to understand the concepts, principles and power of data science?

Eric: Data science is an investment. If you look at salaries of data scientists, you look at what it costs to go through and hire one and keep one over time, it’s a significant investment cost. In order to understand what you might get out of that, I think it requires two things. One, it requires a detailed understanding of your own business, what the pain points are of your business, what the long-term vision is for your business. That’s often housed within the mind of the leader, CEO, whatever it might be. But it’s also important to understand how data science might fill that gap. And to understand how data science might fill that gap or be part of that long-term vision, it might not be important that you understand every single algorithm, but it’s important that you understand what its affordances and constraints are for your business. Like, is there really going to be a value add from bringing in a data science team? Or do you need maybe an analyst who can just pull data? As organisations begin to scale their data science operations, having executives who can really understand the ROI so to speak, understand how to build a data science team, I think that might be one of the most challenging pieces. In the end, I think what’s most difficult for organisations is, how do you find leaders who are also visionaries to lead a data science team? They’re not just good at the data science side but they can see where things are going and so to me that demand and the importance for leadership to develop data science skills is to understand the affordances and constraints that it has for that particular organisation.

Kirill: I love that answer. I didn’t even think of it that way before, that executives need to understand data science because it’s an investment that they will inevitably have to make, and they need to understand how to best make it and to what extent it’s going to be worthwhile and what benefits they are going to get from it ROI-wise.

I wanted to touch on something else you mentioned just now. The leadership, not of the organisational role but finding the leaders that are going to lead the data science divisions, that are going to build the data science team. There’s an example that I’ve heard a couple of times now. It was of a big bank in the US, every time this example is mentioned I don’t even know what bank it is because you know, from these consultants. It was a big consulting firm that did the consulting job for them, for that bank, and then one of the presenters was talking about it. They built out this whole data science division, I think they hired like over 100 people. You can imagine the scale of the bank. They caught wind that data science is getting big, and this was like last year or the year before, and they instantly were like, hey we’re going to invest a couple of million dollars into this data science division, built it out to 100 people and they were doing projects. Then, literally, that lasted for one and a half years or somewhere around that, and they closed it down. They took the best of the best into different parts of the business and then they fired like 90% of the remaining data scientists. The reason for that was that the management that was running the data science division didn’t see how to integrate with the rest of the business in terms of showing the value that they bring. They were doing some projects and so on, but they weren’t demonstrating the bottom line, how they were changing it. As you can imagine in banks, it’s a cut-throat environment and they’ve got to compete with other banks and so on. And if you’re not delivering to the bottom line, if you’re not changing the profit of the business, then they don’t see that you’re bringing value. And that whole mismatch, even though they were doing stuff and probably valuable stuff, but it wasn’t conveyed in the terms of how much profit they were making per project and so on per month, and they just disbanded the whole team.
And so I completely stand by what you said that it’s also important to find the right people to build those data science teams so that they integrate with the business well and deliver on what the business objectives are.

Eric: Right. I think you’ve said it very well. I mentioned a little bit before, data science by its nature is not a revenue generating piece typically, and so your value is going to be measured through the people in the parts of the organisation you affect. It’s important to agree with those parts of the organisation as to how you’re going to measure that effect. If you try to measure, of course, just like we know in data science, trying to measure something retrospectively is much more difficult than trying to actually plan for and figure out a way to measure something proactively. To me, that’s where a lot of … and I see really good data science leaders and I’m fortunate to have a lot of them to look up to at LinkedIn, they think proactively, they think ahead as to how are we going to show our value to this organisation? And honestly the data science org at LinkedIn would not be as big as it is if they were not thinking about these issues. We wouldn’t get this many positions approved to hire if we weren’t thinking proactively about how to show our value. I think that’s an important thing for anybody listening to think about: how you establish ROI as a data scientist.

Kirill: Yeah, I totally agree. I’ve got one more question on soft skills and then we’ll move on. It feels like it’s taking up the bulk of the podcast but …

Eric: That’s totally fine, I’m enjoying this.

Kirill: Me too. We talked about soft skills at the start, when you’ve got to kick off the project, we talked about soft skills during, when you’ve got to go and develop context knowledge as you called it, context expertise. How about soft skills at the end? What tips can you give us in terms of presenting? Obviously, you have some projects that you’re presenting to your business partners and you’ve probably done quite a few of those. What kind of presentation skills or techniques do you use? Do you use slides, do you not use slides? Do you just send off the presentation? What’s the most important? Do data scientists in your view need to be personable in that sense, do they need to make it an engaging presentation? What are the best approaches there?

Eric: This is a really fascinating question for me and I think I’ll start by saying the temptation in summarizing what you’ve done is to overshare about its complexity. Because you did a lot of interesting stuff, you have this desire to share the cool ways that you approached the problem, how you define the problem, how you iterated through the problem, but in thinking about communicating the complexity, you really should only be communicating complexity if it’s necessary to understand your final recommendation, your final solution. I often start, when I’m thinking about these presentations, and they are important, they are extremely important. I start with what is the takeaway message going to be? Let’s say I have leadership in the room and I have 20 minutes, I’ll say 15 because that’s about as much as you’re going to get. What do you want them to walk out of that room with in mind? And typically, you need to have a clear succinct message, so I try to never go more than having three major points in any presentation, and that’s even in a long presentation. In 15 minutes, I might have two major takeaways, so I build those takeaway points and then I build the presentation from those takeaways. I go into the level of detail that I need to go into to help them understand how we can be sure that these takeaway points are solid, I also try to not only say what happened but also be forward looking and say, while these are the major takeaways, there are typically things that we still don’t understand, things that we need further support for. Because if you can communicate the value in those takeaways and also say, hey we may be asking for something down the road, I think it’s useful to at least put that idea in mind about what the next steps of this process are. 
And certainly, in some cases there might be just a one-off analysis but if you’re doing data science in the way I think it really should be done, you’re going to find things during your analysis and during your work that lead you to think in new directions, and I think it’s important to share those in an effective and responsible way with leadership. That’s really where I start, what are your takeaways, tie each slide, each whatever it’s going to be, maybe it’s a memo, everything should come back to those major takeaways, and resist the urge to dig into all the complexity unless you absolutely have to.

Kirill: Awesome. Start with the takeaway, build slides around that. That note about sharing some possible future work, it’s really good. I was just thinking that you’d make a really good consultant. Like on-selling in the process of the presentation, that’s genius.

Eric: Yeah. Selling, I mean as a data scientist you are selling yourself in a lot of ways. You’re selling the data product you create, the insights you generate, and I think it’s a great … Always think about yourself by the product that you’re selling, it’s important.

Kirill: Gotcha. In what we’ve got left of this podcast, I would like to rewind a bit back, back to you, now that we’ve talked so much about the soft skills which I think was very valuable and I hope a lot of listeners got something out of it. Tell us a bit more about your passion. You said you’re passionate about education, about teaching. How did this come to be? Like I have a very similar passion, but it took me so long to discover it. For years, I didn’t know. I knew I was good at it, I knew I enjoyed it but when I figured out that it’s my passion, that’s when things clicked, and everything lined up and I could exactly see which way I need to go and which direction. How did it happen for you?

Eric: I was fortunate to have it basically presented to me early on in life. I was actually in high school, so I was probably 16 or 17 at the time. One of my teachers said, you’d make a really good teacher, and I kind of sat on that for a while and I thought, is that true? Will I really be a good teacher? And I started thinking about it a lot more and I realized that basically everything I enjoyed came back to helping people better understand things that were difficult, so this started out in being interested in teaching math, and over the years it’s meandered to teaching statistics, teaching data science, teaching programming. Like we’ve just spent a little bit of time on soft skills, teaching people how to become more comfortable in communication. And so that passion was unlocked really early for me and I think it’s changed over time in terms of how I’ve seen myself contributing to education. It started with saying, I’m going to be in let’s say secondary education, like a high school classroom teaching mathematics, that’s what I initially envisioned myself doing. And then it was, I’m going to be teaching at a collegiate level, I’m going to teach undergrads. And then it was, I’m going to teach grad students when I was at the University of Minnesota. I’ve taught small classes, I’ve taught large lecture classes that number in the hundreds, and what really pushed me to think about making the change to LinkedIn was I saw a chance to do things at scale. It was a chance to continue to think critically about education, continue to think about learning, but really have this chance at a large, large impact. Something that you just can’t do when you’re let’s say teaching a couple of classes a semester. But instead thinking about how do we get people to develop learning as a habit? And for me that is where I’m at right now, when I wake up every day I’m excited to go to work and think about these problems.
I think that will always continue to be a motivating force for me when it comes to anything related to data science.

Kirill: That’s fantastic, I love that story. Hopefully that also inspires people. Everybody has their own different passions but there might also be people listening to this podcast that if you find that you enjoy teaching, then maybe it is your passion and like you’ve got two people here on the podcast who discovered it for themselves talking. Maybe give it a go, explore, like in this example like Eric did. Maybe do some seminars, maybe help some other people and see if you can cultivate that and if you start really enjoying that, maybe that is your future, maybe you can build a career on that or at least it would help hundreds or even dozens, even if you help two people, that’s already a huge contribution back to the world. It’s always admirable when people do that, I find.

Eric: Yeah. And starting small is great, you don’t have to start by trying to impact hundreds of people at once. If you turn to the person that you work with and you can help them better understand something, that’s as important as anything that we’re doing at scale. I think that’s a great point, I’m glad you were able to bring that up.

Kirill: Awesome. Eric, I’ve got one more question for you before we wrap up. From what you’ve seen in data science, in education, and from all the experiences you’ve had and where you stand now, where do you think the whole field of data science is going and what should our listeners prepare for to be ready for the future that’s coming?

Eric: This is a tough question. In terms of where it’s going, I think we’re going to see some radical changes as companies start reckoning with the investments that they’ve made in data science, and by that, I mean some companies are going to decide that they haven’t seen a return on investment and I think in a lot of cases it’s because they’re not measuring the right thing. But you’re going to see other companies that decide to double down and say, okay we really need to take seriously how we’re measuring the impact of our data scientists on this organisation. So, to me what the near future feels like, is really developing a concept for really looking at ROI of data scientists. Really turning the data science onto the actual data scientists. How are you going to measure your value for us? How are you going to test your value for us? That’s a question I think we’re going to have to answer. To this point I think data science being the sexy career and having high salaries and all of this stuff, that’s great. But companies at the end of the day are a business and they need to see the lift and the amount of improvement they get from adding data scientists. And to me that’s a near-term thing, that’s something that a lot of companies will be grappling with. To me, where we’re going, I mentioned scalability at the start of the conversation. Scalability is a big deal. I continue to not sometimes care if someone can eke out an extra 0.1 percent of performance in their model. I’d rather see them be able to scale and productionize their model so that we don’t have to rely on someone maintaining it over time, so that we can actually build it to deal with the size of the data that we have. This is an issue that we see at LinkedIn, but it’s not unique to LinkedIn. 
The volume of data is not going to decrease, the speed at which it comes in is not going to slow down all of a sudden, so I think really dealing with scalability, that is a big issue to grapple with here in the near future.

Kirill: Awesome. Return on investment from data science and scalability. How would you say listeners should prepare or can prepare for that first one, how to be prepared to have those conversations of what data science means for the business in terms of the bottom line?

Eric: I think in a lot of cases, and this is something that I am trying to go through now, is figuring out how does the business measure success? In some cases, of course it’s going to be maybe cash related, or revenue related. But there are probably also other metrics that the business cares a lot about that represent their true north or their success metrics. We have a lot of those internally at LinkedIn. So, you go into a business, it’s worth taking the time to better understand what those metrics are. Think about it like this, executives in almost every company now have an executive dashboard they look at. It’s really important that we understand what goes on that executive dashboard, because their knowledge and their assessment of how the company is doing is probably going to be guided by a few simple metrics and if we can try to think about how our work impacts those metrics, I think that’s where we can start to be able to show the value to the executives. Rather than just saying, well, hey if I leave, who’s going to build your model? I don’t think that’s an effective way to communicate your value. It might work in some cases but certainly thinking about what do they care about measuring and then how can you figure out how your work contributes to those measures.

Kirill: Love it. Great tip. Guys, look out for that executive dashboard. Executives measure the business based on that dashboard and a lot of the time the executives themselves have KPIs related to what the dashboard is showing so that’s definitely a good tip on how to portray business value.

Eric, thanks so much for coming on the show. I really appreciate you sharing all the insights. How can our listeners contact you or follow you to see how your career progresses from here?

Eric: You’ll be surprised to know that LinkedIn is probably the best place for that, but I really try to centralize most of my social media activity on LinkedIn so feel free to reach out to me there. Given the volume of messages I get, I will get back to you, though it may take a little bit. But interact with me there, ask questions, if you have questions that are really important, I often will not just respond to you individually, I’ll try to post about it more broadly so that I can open it up for discussion. Also, if you’re going to post about something or you have a question, tag me in your post. I’d love to see the kind of questions that you’re generating, the issues that you’re grappling with, because to me that’s where we sort of build a community around data science, that’s how we start to do it and that can be across a lot of social media platforms, I just happen, working for LinkedIn, I feel like it’s a way for me to learn about the platform and contribute to the conversation.

Kirill: Awesome. Thanks a lot. We’ll share that in the episode notes. One final question, is there a book that you can recommend to our listeners to help them better their careers?

Eric: Yeah. It’s a book I am reading right now, it’s really an update to a lot of the machine learning texts. Computer Age Statistical Inference: Algorithms, Evidence and Data Science by Efron and Hastie. That to me is something you should be reading. It really takes a lot of the things that we’ve discussed, machine learning and issues up to this point, and tries to update them based on the big data challenges that we’re encountering. It’s relatively advanced, some parts of it are advanced but you will know very quickly what those are, but I think generally speaking it’s a useful resource to have.

Kirill: Fantastic. Thank you so much, that’s Computer Age Statistical Inference and the rest of that very long name. In terms of our discussion of soft skills, I will probably add a book if you don’t mind, I’ll add the book I’m reading now. It’s been recommended to me by lots of people, I finally got to it. It’s called How to Win Friends and Influence People by Dale Carnegie. I’m through the first two chapters and it’s got some really good tips especially related to what we discussed today. There you go guys, two books. Thanks a lot, Eric, for coming on the show once again. Can’t thank you enough. I really enjoyed our discussion today.

Eric: I enjoyed it, too. Thank you so much.

Kirill: So, there you have it. That was Eric Weber, Senior Data Scientist at LinkedIn and I hope you enjoyed our conversation today. Personally, I really enjoyed how we talked about soft skills in data science, it’s a very under-rated topic. A lot of people glance over it because it seems so much easier than all the technical things that you need to know in data science. They forget about it, they don’t develop those soft skills, interpersonal skills, people skills and hopefully from this podcast you were able to see how important they are, in how many aspects and areas of work they come up and I really wish for you to see there’s always room for development. Regardless of how great or not great your soft skills are, there is always going to be room for development and I hope this podcast inspires you to see where your room for growth in that area is because ultimately, we all want to be a rock star data scientist, and this is one of the crucial skills. There’s a lot of talented people out there who can crunch numbers and who can get the insights, but there are much fewer data scientists out there who can actually convey their insights in a meaningful way and educate their stakeholders that are after the results in the first place.

There you go. Make sure to connect with Eric on LinkedIn. The URL to his LinkedIn will be on the show notes at www.superdatascience.com/111. There you will find the transcript for this episode and any other materials that we’ve mentioned in our conversation. If you know any data scientists who are great data scientists but not yet rock star data scientists, then share this episode with them and they will also hear about the power of soft skills in data science. I really hope for this podcast to help as many people as it can, and you can help us out by sharing it with your colleagues and friends. On that note, thank you so much for being here today, I look forward to seeing you here next time, and until then, happy analysing.

[Background music plays]

Kirill Eremenko

I’m a Data Scientist and Entrepreneur. I also teach Data Science Online and host the SDS podcast where I interview some of the most inspiring Data Scientists from all around the world. I am passionate about bringing Data Science and Analytics to the world!
