SDS 121: Building a Successful Data Science Practice and How to be an Effective Data Scientist

Podcast Guest: Dr Alex Antic

January 12, 2018

Welcome to episode #121 of the Super Data Science Podcast. Here we go!

Today’s guest is Senior Data Scientist at the Australian Federal Government, Dr Alex Antic.
What a treat I have for you today! Dr Alex Antic is here with us to share his wide-ranging experience and deep insights from his years as an applied mathematician and data scientist across research, finance, insurance, and now public service. You will hear us discuss how the field has changed over the years, and his passion for coordinating meetup groups in this space.
As an added bonus, in the second half, Alex will share his distilled advice on building a successful data science practice and how to be an effective data scientist.
I can’t wait for you to check this out!
In this episode you will learn:
  • Organising Meetup Groups and Why They Are a Good Idea (6:10)
  • Applied Mathematics and Data Science: Solutions and Applications (18:15)
  • Monte Carlo Simulations in Finance (24:32)
  • Life After Finance (as a Data Scientist) (32:52)
  • Building a Successful Data Science Practice from a Management Perspective (39:24)
  • How to be an Effective Data Scientist (51:35)

Podcast Transcript

Kirill: This is episode number 121 with Senior Data Scientist at the Australian Federal Government, Dr Alex Antic.

(background music plays)
Welcome to the SuperDataScience podcast. My name is Kirill Eremenko, data science coach and lifestyle entrepreneur. And each week we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today and now let’s make the complex simple.
(background music plays)
Welcome back to the SuperDataScience podcast, ladies and gentlemen. And today I’ve got a very interesting and insightful episode for you. On the show, I have Dr Alex Antic. Now, Alex started out into the space of data science with a PhD in Applied Mathematics. And then his career took him on an incredible whirlwind of journeys. He’s been a quantitative analyst, or a quant, in banks and investment organisations. He’s been in the space of customer analytics. He’s been a consultant at PriceWaterhouseCoopers, and he has also worked with the Australian Federal Government. So a very, very diverse background and in the first half of the podcast, we will walk through all of it and you will find some very interesting insights and applications that he’s seen in his career. And then in the second half of the podcast – well, in the second half of the podcast, Alex really surprised us with some special gifts that he shared on this podcast.
So Alex has a huge wealth of knowledge and experience in the space of data science, and he actually runs a meetup group in Canberra for data scientists, and he constantly helps and mentors other data scientists in this space. And so Alex was kind enough to actually prepare something for this podcast. He prepared two guides for data scientists, and he shared them with us on the podcast. In the second half, you will find them there. So the first guide is how to become an effective data scientist. And there, we don’t just talk about technical skills. We talk about technical skills, the business side of things, communication, and attitude. Well in fact, Dr Alex just shares all these things, all of his wealth of knowledge in that space.
And the second guide is for those who want to build a successful data science practice. So whether you are a person who wants to get into data science and be the most effective data scientist that you possibly can, or whether you’re looking to build a successful data science practice, in both cases you will get incredible value from what Alex shared on this podcast. In fact, the insights were so amazing that we couldn’t just leave them as audio, and together with Alex and the design team at SuperDataScience, we put together two infographics for you. So one for each of those guides, and you can get those infographics if you go to www.superdatascience.com/121. You can just download and keep them in order to help you remember what Alex mentioned on the podcast, the steps that he outlines, whether it is for becoming an effective data scientist or whether it is to build a successful career in data science.
So if you have an opportunity, then go and download these infographics before you listen so you can follow along. If you’re on the go, if you’re in the car or you’re running or you’re on a bicycle, or you’re on public transport, that’s ok, listen to the podcast and then still make sure to download those infographics so you can keep something tangible that you can always reference just to refresh on how to do either of those things.
You can already hear that I’m very excited about this podcast, so on that note, without further ado, I bring to you Dr Alex Antic.
(background music plays)
Welcome ladies and gentlemen to the SuperDataScience podcast. Today I’ve got a very special guest calling in from Australia, Dr Alex Antic. Welcome, Alex, to the show, how are you going today?
Alex: Thank you, Kirill. Yeah, great to be here. Looking forward to our discussion.
Kirill: Me too, very much so. And where are you right now?
Alex: Based in Canberra at the moment, so dialling in from home, doing some errands and speaking to you before heading off to work.
Kirill: Awesome, awesome. And how is the weather down there in Canberra?
Alex: It’s lovely. Nice, hot day today, quite warm. Nice change from the recent rain we’ve had. That should be good.
Kirill: And a lot of people don’t know this, but Canberra is actually the capital of Australia. When I was a kid, I used to think it was Sydney. Do you correct people often about that?
Alex: I haven’t for a while, given most people I deal with these days are Canberra-based, and hopefully they’ve figured that out by now. I have on occasion, when I’ve travelled around. Yes, that is a good point.
Kirill: For those listening, there’s a bit of geography there. Canberra is the capital of Australia. How big is Canberra?
Alex: I’m not sure size-wise, the population’s about 400,000. It’s quite small in terms of relative, physical size to the rest of the country. And as you know, it’s located almost halfway between Sydney and Melbourne.
Kirill: And there’s a lot of government facilities there?
Alex: Yes, it’s very much the heart of the country when it comes to politics and government departments and agencies.
Kirill: And just looking through your background, I think that information will be very relevant to our discussion. But let’s start with the beginning. You’ve got a very interesting and diverse background, a PhD in mathematics if I’m not mistaken, applied mathematics, and then you’ve done lots of different consulting work and in fact, for those who are listening, Alex was recommended for the podcast by one of our previous guests, by Ot Ratsaphong, who heard one of your talks, Alex, at I think the R User Group in Canberra, or the Data Science User Group in Canberra, and he found it really fascinating. So tell us a bit more about that. You run these user groups for data scientists in Canberra, is that one of your passions?
Alex: Yes, promoting analytics and data science overall is definitely a passion of mine. So when an opportunity came up a few years ago to host the Canberra R User Group and Data Science Canberra locally, I thought it would be a fantastic way to not only meet up with the large number of data scientists and analysts we have throughout the government space here in Canberra, but also to I guess spend a bit more time mentoring aspiring and junior data scientists who often come to me for advice, technical, career advice, whatever the case may be, I thought it would be a fantastic forum to actually get everyone together and just share ideas and speak about what we’re doing, which often wouldn’t be so easy to share those ideas and to see one another outside of conferences that may occur.
So Canberra is quite unique in the sense that we have a lot of really great people working in different departments and agencies, but sometimes they’re working on their own, or in small teams, so they have very little oversight on what others are doing, and sometimes I’m quite surprised to hear that someone else is working on a similar problem, or is working on something that’s exciting and that they’d like to get into and want to ask about.
So I tried to invite speakers who are doing something quite interesting, I think it will apply and appeal to most people, invite them along to have a chat, and share their ideas, and it normally works quite well. People seem very happy to attend and to reach out to one another to share stories, war stories.
Kirill: That’s awesome. And how often do you have these groups? How many times a month?
Alex: Yeah, it varies. I try and do one every month or two, depending on my own availability. In my previous role I was travelling quite a lot, so that made it difficult, but now that I’ll be spending a lot more time in Canberra I’ll try and do it every month or two, at least have one of them running every month, if not more frequently. That would be ideal actually.
Kirill: And so when these groups get together, you have a speaker or a couple of speakers who present or do you do some exercises? How do these groups run?
Alex: Normally it would involve one main speaker and then myself or someone else doing I guess a small introduction beforehand to just give a small oversight of what they’re currently working on and what may be of interest, and that would lead into the main speaker and then have a lot of questions. Question/answer session after that. And then people would tend to mingle before and after just to catch up with people they know or just to ask them informal questions on what they’re working on. It’s quite relaxed and casual in that sense. That tends to work well. A lot of us tend to be introverts and we prefer these more informal sessions to talk to one another and to get some advice or just to share our own views. And it’s often a lot of fun, yes.
Kirill: Oh, fantastic. I’m going to play the devil’s advocate here. In this day and age where everything is interconnected online and there is plenty of resources and plenty of forums where people can go online to find mentors or to connect with others, find out about their work, talk and so on, how is catching up in person better, how is it more beneficial, and why do people get more out of it than just online interactions that are readily available to them at any time of the day?
Alex: That’s a good question. I think in reality people use both, both methods of communications, to learn. They’re both fantastic and have a lot to offer. 24/7 access via the digital platform is incredible, you can’t knock that at all, but I guess being human and social creatures we love to actually be able to speak face to face with people, get that immediate response. And speaking to someone, asking them questions, you can read their body language, which sometimes is very helpful when someone is trying to answer a question about, “Should I take this particular job that I’ve been offered?” or “I’m having a problem with some technical issue. Do I have a chance to solve it?” I think people are a bit more receptive to the human elements than they may be on a forum where you can get a lot of negative feedback at times which isn’t always helpful, a lot of criticism depending on the forums, so I think there’s space for both and as humans, we appreciate both streams. I think it’s a good thing having both open to us.
Kirill: Okay. I totally agree with that. I think you’re right. And that human element, I don’t know, it has some magic to it that you just can’t get online sometimes.
Alex: That’s right. When you see someone speak about a topic that you’re interested in and passionate about, I think being there can excite you and inspire you a lot more than just reading about it online or hearing a recording sometimes. Feeding off the people around you and the vibe in the room can be quite powerful.
Kirill: Totally. So what would you then recommend to people who are not yet attending meetups? I’m certain there are people listening to this podcast who haven’t ever attended a data science meetup. They like their profession, they go to work, they do their job, they meet people at work, but they’ve never gone out of their way to actually connect and meet others through a meetup like this. What would your advice be for them?
Alex: I highly recommend that they give it a go, maybe start one themselves if there isn’t anything like that in their local area or community. I think it’s quite easy to set up a meetup site online, get a mailing list together, use your contacts and networks. Otherwise definitely attend. There’s a lot of specialty ones I’ve noticed within data science overall – there’s deep learning ones, ones on machine learning, R, Python, whatever the case may be. Pick one that you’d be interested in. You may have a lot to offer that you don’t realize. You may be able to learn a lot from your peers. They’re normally quite short and very informal sessions, so go along and you might be surprised by how much you enjoy them.
Kirill: Fantastic. Any recommendations on where to find these meetups if one is not arranging their own?
Alex: I think just look up the website, the Meetup website, and have a look in your local area, do a search, or reach out to your contacts and ask if they know of any as well.
Kirill: Meetup.com, yeah?
Alex: Yes.
Kirill: I was actually surprised at how very interesting and broad that website is. I was in San Diego a few months ago and I had nothing to do in the morning and I wanted to go do some yoga. I looked up ‘yoga meetup’ and literally the next morning I went to a yoga meetup and it was amazing.
Alex: It’s incredible.
Kirill: Yeah, it’s a really cool place. Okay, we jumped straight into the meetups discussion. But now let’s rewind a little bit and talk about your background. Walk us a little bit through it. You started with a Bachelor’s degree in Math and Computer Science. Let’s go from there.
Alex: I did a double degree, Mathematics and Computer Science, which was quite new at the time, there weren’t many universities in Australia offering an actual double degree versus a double major.
Kirill: What’s the difference?
Alex: Double degree is you walk out with two degrees. I guess they synthesize the six years of the two separate 3 year Bachelor degrees into one 4 year degree. That has its own challenges obviously, taking extra credits, but the reason I did that was I really enjoyed mathematics a lot, worked hard and did well at school, so I wanted to pursue that to learn more. I had no specific career aspirations in mind when I did that. And the computer science element, I was getting into the programming and I thought it would make a great mix. I thought math on its own wasn’t enough, I wanted to do something else so it was either maths and physics, or maths and computer science, and I thought, “Yeah, I’d love to learn a bit more about coding and that might come in handy one day,” which it very much has.
But throughout that, I have to be honest and say maths was more my passion. After that I did an honours degree in pure mathematics, which was very interesting, especially some of the advanced algebra theory I did, some of the more complicated stuff I’d studied in my life. And then after that I was considering a doctorate and some of the applied maths that I studied, I really enjoyed the element of applying maths to the real world using both the mathematics and computer science elements of my degree, and I ended up doing a doctorate in applied mathematics, which is actually with the CSIRO – Commonwealth Scientific and Industrial Research Organisation, Australia’s premier science organization, and that entailed looking at heat transfer in grain silos, which was a fascinating topic.
Kirill: Sorry, what was that in grain silos?
Alex: Heat transfer throughout grain silos.
Kirill: Wow. That’s very applied for sure.
Alex: Very much.
Kirill: Okay. Any interesting discoveries there?
Alex: The aim of the research was to look at regions within the grain silo where particular insect infestations were occurring. And given the insects are quite small, the thermal devices they had at the time to measure the heat in those areas, it was too large to pick up the heat distribution, so it wasn’t sensitive enough. So the only way we could actually try to determine what the heat was and [indecipherable 16:00] was to actually do some mathematical modelling, so hence the heat transfer component, and the idea was that we discovered the insects were localized within certain regions, which meant that you could, at a lower cost, only heat those regions to kill the insects using either microwave heating or just high heat methods, and that way you wouldn’t have to invest in heating up the whole grain bulk to disinfest because the chemical methods they were using were being phased out globally.
So we determined that you could use microwave heating or just use large heating elements to kill the insects on the outside periphery without damaging any other properties within the grain bulk. Destroy the insects, keep your costs down, and then go forth and export your grain throughout the world, which was the main driver in this case.
Kirill: Wow, that’s really cool. So was that research applied in the end?
Alex: Yes, it was. The government was using that to determine which regions of grain silos and grain bulk structures they could disinfest at a lower cost, which was great for them and for the farmers that were actually looking at disinfesting there in the grain holdings.
Kirill: Wow! Congratulations! There you go. People eating bread in Australia, you might have been influenced by Alex’s research.
Alex: Also around the world, because what was happening is we would be exporting to a country and we would disinfest with a chemical method which would only have say, a 99% or 99.5% rate of killing those insects, so by the time the grain has been shipped to another country, that bulk could be re-infested. So we needed better methods to kill a higher volume of those insects. And what the biologists were finding, because they were actually, depending on the ambient temperature, they would move into different portions of the grain bulk, which the chemicals weren’t always at a high enough dosage, so being able to just target those areas using heat was quite an efficient way to actually exterminate all the insects.
Kirill: That’s so cool. What I really like about this example is—so this was back in early 2000s?
Alex: Yeah. Quite a while ago, yes.
Kirill: So what I like about it is, right now I think most people would agree that we would call that data science, very data science kind of problem, solution and so on, but back then, it was applied mathematics. Don’t you find it interesting how the field of data science didn’t exist back then, but you were already doing data science?
Alex: That’s true. The label in some ways has changed, and also, as I’m sure you’re very well aware, the computing power we have at our disposal these days which has really shaped the world of data science and given us a lot more freedom and power as to how we tackle these problems. That was probably more mathematical in the sense that the equations I was solving, the methods were semi-analytical and numerical, whereas a lot of the work we’re doing these days in data science is very much numerical.
That’s the shift I’ve noticed as I’ve progressed throughout my career and tackled different problems, I’m doing less of the analytical and semi-analytical solutions to problems and much more now on the numerical side given the power we have, the beauty and incredible availability of libraries and functions through machine learning and deep learning. So that has been the big shift I’ve seen, and quite an interesting one for me too.
Kirill: Those are very interesting comments. I don’t often stop to think about that, that back in the day you had to come up with ingenious approaches to minimize your computational cost, whereas now you don’t really care, you just go for it.
Alex: Exactly. Distributed systems, parallel processing, it’s fascinating.
Kirill: And with the advancement of quantum computing, do you have any comments on that, on how we’re going to move even further into that space where we’re just going to throw machine learning at anything and just brute force the results out of it?
Alex: I think we really don’t know what we’re going to discover with that revolution. It’s going to be amazing to see. Hopefully it occurs in my lifetime. I think it will open up a lot of doors in terms of the problems we can tackle and how we can solve them, and more importantly I think it will allow the broader public or the broader industry to really see how they can apply the power of data science and analytics to their own problems to find innovative solutions. In the health space, I think there’s a lot more being done in that world, of course physics – the traditional areas where analytics was heavy. Computational power is being used in astronomy, theoretical and practical physics. Yeah, I think that will be quite interesting to see what happens there.
Kirill: Yeah. And do you think that we will have quantum computing laptops in the next decade or two decades?
Alex: Hopefully. You need to speak to a quantum computing expert on that. I would love to know. I’m hoping that does eventuate. That actually reminds me of one project I once worked on during my undergrad. It was in optical sciences, so we were looking at creating circuit boards using light effectively to transmit information on the circuit board rather than etching copper circuits to make them much faster. That was I think some of the early work being done heading towards quantum computing. So you would alter the refractive indices on these parts on a circuit board effectively, you’d use light to transmit the information, so I think that was a precursor to a lot of stuff that will happen in the future. That was fascinating.
Kirill: Yeah. And I think I’ve heard of similar approaches. They were maybe 10-15 years ago and now we’re heading into quantum computing space.
Alex: Yeah, almost 20 years ago. It brings back memories.
Kirill: That’s really cool. Okay, so then what happened after your PhD?
Alex: Sure. So, I spent a brief stint as an academic deciding will I use my powers for good or evil. Do I stay in academia or do I go into the real world? I had a couple of professors pull me aside and say, “The world of academia is changing. You may want to think about heading off and doing something different.”
Kirill: On to the dark side. (Laughs)
Alex: On to the dark side before coming back in the future. So I thought I did the right thing morally and went out to make some money. I guess there were two main reasons for that. One was I’ve been in academia for almost a decade doing undergrad and postgrad studies and some teaching. I was enjoying that, but I felt like I needed a change. And two, a guy who I’d done a PhD with, he was a couple of years ahead, he went over to the real dark side, he went over into investment banking, and he talked to me about these fascinating problems that they were solving and how you could use mathematics and computer science to actually do something meaningful and I thought, “Okay, that sounds really interesting.” I’d done a course in derivatives pricing and I thought that was quite cool, you know, I get to use my maths and computer science skills to do something interesting.
So off I went into the world of financial services. Initially I spent about a year as a lead quant in a fund of hedge funds when hedge funds were quite sexy and the rage, which was a great way using my skills to look at portfolio optimization and trying to understand how to actually make more money for the organization I work with, help them make more money by looking at the distribution of your own investments – in this case it was investing in hedge funds, so that posed some really interesting challenges.
And then after a year I was approached by an investment bank to go and actually do some front office quant work of derivatives pricing, share pricing, and I spent almost 6 years doing that, which was incredible. I learned a lot. That was probably the highlight of my career in many ways in terms of—from a technical viewpoint it was very challenging, but also very exciting.
Kirill: I’d like just to pause you for a second, because I’m looking at your LinkedIn and you mentioned that you used some modelling techniques including Monte Carlo simulations.
Alex: Yes.
Kirill: I’m not an expert on Monte Carlo simulations, but I’ve done some work with them, and I find that approach so interesting. Would you care to share some insights about Monte Carlo with us?
Alex: I guess in a simple way it’s very much like rolling the dice. I used to explain to people it’s looking at a brute force approach to try and solve an equation. So you might do a million or 10 million simulations on all possible results that you can have and you effectively average them out in the end. So at the time, before we had the power of machine learning that we do today, we had to try and solve some quite complex problems numerically, some of them we could solve analytically and semi-analytically, which was great, but the others we had to take a numerical approach and often in that space, in the derivatives pricing and the financial world, Monte Carlo was quite a popular way and an effective way to actually come up with those solutions.
I found it quite interesting to use because you were looking at some of the fundamental mathematics through a computational solution. And I think in some ways, if I can be so vague, it was kind of a precursor to a lot of the machine learning we’re doing today, especially the more brute force approaches. It’s something I haven’t touched since then, to be honest. I predominantly used it in that career and haven’t had to think about it for many years. It’s interesting that you bring it up. It’s good that you bring it up. Actually, I wonder how much it’s used these days given the power we have of machine learning.
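[Editor's illustration] The "simulate many outcomes and average them" idea Alex describes can be sketched in a few lines of Python. This is a minimal sketch only, not the models his team used: it assumes a European call option whose underlying follows geometric Brownian motion, and the function names and parameter values are hypothetical choices made for the example. The closed-form Black-Scholes price stands in for the "analytical" solution, and the Monte Carlo estimate for the "numerical" one.

```python
import math
import random


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call (analytical benchmark)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def norm_cdf(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def monte_carlo_call(spot, strike, rate, vol, maturity, n_paths=200_000, seed=0):
    """Estimate the same price by repeated random sampling and averaging."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # one "roll of the dice"
        terminal_price = spot * math.exp(drift + diffusion * z)
        payoff_sum += max(terminal_price - strike, 0.0)
    # Average the simulated payoffs, then discount back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths


if __name__ == "__main__":
    params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    print("analytical :", round(black_scholes_call(**params), 4))
    print("monte carlo:", round(monte_carlo_call(**params), 4))
```

With these assumed inputs the analytical price is about 10.45, and the simulated estimate should land close to it, with the gap shrinking roughly in proportion to one over the square root of the number of paths. The approach earns its keep on products where no closed-form price exists and only the simulation is available.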
Kirill: Yeah, it’s interesting. I’ve talked to a few people who’ve used Monte Carlo, but not as much. Still some use it in finance, but I’ve discovered that some biologists use it in modelling evolutionary—
Alex: [indecipherable 26:10]
Kirill: Yeah. So one TED Talk I was listening to, what they did I think is they were modelling—okay, so do we see the world as it is, the world around us? Is the table I’m sitting at, is it actually white, in reality does it exist the same way I see it? So what they were modelling, the theory they were trying to prove, was that this table or whatever we see is actually a mind projection and in reality these things might be completely different. Like, a tomato might not be a tomato, it might be something else, but our brain makes it look at it as it’s red, it’s this form, it’s this smell, whatever, because it’s good for us to eat it.
So they were modelling, like, “Let’s see if there’s a species that sees the world as it is versus a species that sees the world as the brain tells you to see it, and which one will outperform the other one.” And they used Monte Carlo to run those simulations to see on average who is going to win.
Alex: That’s fascinating. It reminds me of Schrödinger’s cat. That’s incredible.
Kirill: Yeah, very interesting examples of that. Any idea why it’s called Monte Carlo? I don’t think I’ve ever answered that question.
Alex: I once heard it came from being done in Monte Carlo itself. I’m not sure if that’s true or not. It could have been based on some of the techniques people were using. Yeah, it’s heavily based around repeated random sampling, the process itself, so maybe it does come from the gambling world, I’m not sure.
Kirill: Okay. Well, there we go. If anybody is interested, Monte Carlo is a pretty interesting averaging method. But let’s move on. You spent almost 6 years in the commodities space.
Alex: Fixed income, currencies and commodities, yeah. What was interesting about that is I was there during the GFC, so it was pre- and post-GFC.
Kirill: Did you notice any change?
Alex: I did. Pre-GFC, the appetite within the investment sector from our clients leading up to it was looking at more complex, more intricate, exotic options in derivatives that we were pricing, so that was very challenging for us in the quant space, for my team, actually looking at more and more complex problems to solve, so we were often having to reach out to the academic world and read journal articles to try and look at inventions being made in that space and how to turn those often theoretical solutions into practical problems.
That was really challenging and interesting, but the problem was we couldn’t publish any of these in the IP. That was a shame because we came up with some really interesting solutions to a lot of these problems that we were facing that others would have benefited from as well, but of course we didn’t want the competition to get ahead of us.
Kirill: That’s the price you pay for going to the dark side.
Alex: Very much so. And then post-GFC, that appetite waned. What we were pricing were more of the vanilla-based products, the simpler options, so that became I guess for me less challenging. Having spent quite a bit of time there, I felt like I needed a change. I was doing more mentorship, so the management side was interesting me a little bit more, but not just doing the daily grind of just hacking away at problems and coming up with intricate solutions. I wanted to share my knowledge and experience a bit more.
Yeah, that made me decide to take a short break and then move on. I felt in some ways burnt out after that. Even though I greatly enjoyed it, I worked with some fascinating people, some of the best quants in the country, I just really wanted a change so it was great to have a chance to have a short break and then move into another role.
Kirill: That’s very admirable, you know, for anyone to have the courage to say no to a senior quant position, because that’s such a sought-after position, and a lot of people even in the space of data science think that that’s a dream job, to be a quant or even a senior quant at a bank. The perception is, “Once I get that job, I am going to be set for life, I will be happy and so on.” But as you say, sometimes you just want a change. You only have one life, right? You want to grow all the time.
Alex: Yes. And I grappled with the moral issues as well. I wanted to use my powers for good. I wanted to be able to—a long time ago I wanted to share some of that knowledge and experience to move into a government space, which happened years later. But that was something that I was thinking about at the time, and it was a difficult decision to make. It’s a very difficult role to get into. Once you worked up and built that reputation and experience, it’s very hard to turn your back to that. And I still get asked to this day, “Why the hell did you leave? What were you thinking?” But as you said, life is about more than just one job or one career. I think it’s important for me in particular to move around, learn new things, meet new people.
Kirill: Exactly. And not to say there aren’t companies that provide that. You know, in one company you might grow and learn and do different things, but if you feel you’re stagnating, then why not?
Alex: Exactly.
Kirill: Awesome. We’re going to have 500 people quit their jobs after this. (Laughs)
Alex: I’ll be blamed for the next GFC!
Kirill: (Laughs) Yeah. By the way, with the GFC, I wanted to ask you, why do you think they made the shift from complex products to simpler products?
Alex: Risk appetite wanes. People weren’t willing to take as much risk given the heat on the banks at the time and what was happening in the banking sector, especially in the U.S. with the large banks folding and struggling. Investors wanted something a bit safer, a bit more secure. As you know, in that space, high risk is high return, so people wanted a bit more stability and safety so there was less focus on quick wins and a real large appetite for risk. And also, some of those products were based on short-selling, which a lot of issues occurred, so with less interest on that, there was in some ways less complexity with some of the models we were actually trying to price. At least in Australia, that’s what I saw happen for quite a while.
Kirill: It totally makes sense. I can see how people wouldn’t want to buy those—what are they called, credit default swaps?
Alex: Yeah.
Kirill: Yeah, the main cause of the whole crisis in the first place.
Alex: Yes, unfortunately.
Kirill: Okay, so what happened after that? How did you get out of the dark side? Where did you go?
Alex: I didn’t have a specific goal in mind. I just got to a point where I thought I’d like to take a short break, travel a bit, and just unwind from that. I hadn’t really taken any leave during that period. So I took a short break and had many offers, a lot of similar roles immediately come up, which I thought, “I better not be swayed into that, I really want to try something different.”
And I was told about an interesting role to move over into the insurance space, which is a sector I’ve been thinking about because I’ve worked with a couple of actuaries, or people who are former actuaries. I didn’t want to necessarily do something as technical initially. I wanted something that was a bit more varied, so the role that I ended up taking was a management position where I was managing a marketing analytics team.
There were two elements to it. There was managing that team, helping them with their marketing, looking at customer churn, acquisition, the usual things you look for in that space, but also what I guess is termed these days as a lead data scientist role and also helping actuaries with the more technical problems.
But what’s particularly interesting is that’s when I first became aware of a lot of these techniques that are these days known as machine learning techniques. That’s when I first heard about that. People approached me and said, “We’re looking into this. We’re trying to understand some of the statistics behind this. Can you just help us out and we can bounce some ideas and learn together?” And that’s how I got into what is formally machine learning and data science, moving away from the traditional analytics, from all the mathematical modelling, stats modelling, into the more computational side. So I did a little bit of that in that role along with looking at customer insights and marketing strategies, and then from there I transitioned into the government space where I did most of my machine learning and data science.
Kirill: It’s interesting to follow your career because on LinkedIn I can see the dates and it’s like this transition happened for you as data science started becoming more popular, around 2010/2011.
Alex: Yeah, that’s an interesting point. I guess I can be held responsible for that.
Kirill: (Laughs) There we go.
Alex: Yeah, it’s funny that it happened that way. I guess I could see what was happening and the different opportunities that were coming up, and it just sounded very interesting to me so I pursued it more from that point of view, as I’ve done anything in my career, out of interest and what I’d like to work on as opposed to having a specific career goal in mind.
Kirill: Interesting. So that brings us to Canberra, to the government work. As we mentioned at the start, Canberra is 400,000 people and 399,000 of them work in the government.
Alex: (Laughs) It does feel that way, doesn’t it?
Kirill: Yeah. So, what kind of work were you doing? Again, whatever you can disclose, because I know there is some probably sensitive topics there.
Alex: Sure. That’s understandable. So, the first role was effectively around risk profiling, looking at predictive modelling techniques primarily to try and find bad people, categorize them in some way, whatever the department agency categorized as bad. In that case it was with the Department of Immigration and Border Protection, so trying to find, pick out that small number of people that are trying to get into the country illegally or maybe they’re part of some drug syndicate or whatever the case may be. So looking at some advanced techniques to try and pick them out, moving away from more traditional approaches, looking at traditional techniques, looking at people using their own tacit knowledge or anecdotal evidence to try and profile a person.
They are looking at high-powered analytical methods to actually target them better, to minimize how many good people you actually catch on the border to interrogate and interview and then to increase the number of actual bad people you end up catching. So, yeah, that was fascinating. I spent over two years there and worked with some really interesting problems, helping develop systems that are used to this day at our borders to protect our country, which I’m very proud of.
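To make the trade-off Alex describes concrete (fewer good people stopped, more bad people caught), here is a minimal sketch in Python. It is not the department's system: the scores, labels and thresholds are invented toy data, and the function names are hypothetical. It only shows how moving a referral threshold on a risk score changes precision (how many of the people stopped really were of interest) and recall (how many of the people of interest were stopped).

```python
def confusion_counts(scores, labels, threshold):
    """Count outcomes when everyone scoring at or above `threshold` is referred."""
    tp = fp = fn = tn = 0
    for score, is_bad in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_bad:
            tp += 1      # correctly stopped
        elif flagged and not is_bad:
            fp += 1      # a good person stopped unnecessarily
        elif not flagged and is_bad:
            fn += 1      # a bad person missed
        else:
            tn += 1      # a good person waved through
    return tp, fp, fn, tn


def precision_recall(scores, labels, threshold):
    """Precision and recall of the referral rule at a given threshold."""
    tp, fp, fn, _ = confusion_counts(scores, labels, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: model risk scores and whether each person was truly of interest.
scores = [0.95, 0.80, 0.70, 0.40, 0.35, 0.20, 0.10, 0.05]
labels = [True, True, False, True, False, False, False, False]

for threshold in (0.9, 0.5, 0.3):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data a strict threshold stops only people of interest but misses most of them, while a loose one catches all of them at the cost of stopping more good people. A better model is one that improves both numbers at the same threshold.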
Kirill: That’s fantastic, very exciting to hear. I’m sure a lot of Australians listening to this will be excited to hear that data science is making our country safer.
Alex: Yeah, that’s right.
Kirill: I hope all other countries are following in the same way.
Alex: I’m sure they are, yeah. We did a lot of liaison work with our fellows in other Commonwealth governments and there was a lot of interesting work that was being shared and that was great to see.
Kirill: Fantastic. And then I noticed you moved into a different role with the government. What caused you to move?
Alex: Once again, that was looking at new opportunities and in some ways career progression, because the next move involved going into a department that was much more immature in terms of their data science and analytics capability. So I was brought in to try and help push that agenda forward and to help set up a platform, a cloud-based platform looking at Hadoop and R, integrating that to really increase their power of their analytics capability. They needed someone to help build that up, to promote that, to get people inspired as to what can be done with that, come up with some proof of concepts. That was really interesting.
A lot more management in that role, managing staff, projects, dealing with senior execs, and really helping spread the word of data science and the power of data science, which is something I do a lot of these days. It’s not just the technical work which I find interesting and challenging, it’s really promoting what can be done. There’s a long way to go, I think, especially in the government space. I’m sure in this country, like many others, it’s getting people to feel comfortable with what you can do with analytics and not to be scared of it. They see it as a black box often. It’s building trust so they can trust the methods, the systems. And that can often be a challenge, but quite a rewarding one when you see people have their aha moment, “I can see why you do it this way or why this works,” so a lot of my time is spent educating these days, which I really enjoy.
Kirill: That’s fantastic. You mentioned on this podcast that you would be happy to share about building a successful data science practice from a management perspective. I think maybe this is the right time to go a little bit into that discussion. What tips or advice can you share for people trying to build a successful data science practice?
Alex: Sure. I think there’s three key elements to that, which I’m happy to go into the detail. For me it revolves around people, the value you can add, and communication. On the people side, it’s most imperative to have the right people, the right mix of people in your team, to have the data scientists, quants, whatever the case may be with whatever industry you’re working in. They’d have to have the right technical skills, people that have proven themselves throughout their career to not just be able to do the technical work, but to communicate it to yourself, to the broader department and agency. In some roles it’s enough to have the people that are stronger in a technical way, that can just sit at their desk and hack away at code. We definitely need those people and they’re highly valuable, but sometimes you need people that can engage with the stakeholders to collect the functional requirements effectively and then translate them to technical requirements.
It’s having the right people with the skills to do the data wrangling, the modelling, the engineering side to embed the models into the enterprise-wide system. You’re having a mix of a person that can do all that. You can’t build a successful practice without the right people.
On the people side, you definitely need a willing coalition of support from within the department, agency, organization, whatever the case may be. You need support from your peers and from senior management. Without that, you really won’t be effective. Because the goal of data science and analytics is to initially develop insight from data, but more importantly it’s to make that actionable. Without that you are pretty much just doing academic research in many ways. Unless you have support from senior management to turn those insights into actions, I don’t think the actual data science practice will be effective or successful in the end. So it’s really important to have that.
I’ve noticed many cases where people just take on data scientists just to build teams more for the vanity reasons rather than actual need or actually wanting to support it, and that’s where it often fails and where a lot of challenges occur. So if you’re going into a job, building up a practice, make sure you have that support from higher up above. Otherwise your efforts may be wasted in the end.
On the people side, it’s important to be a data science evangelist to really show the benefits, to educate people about what it can actually provide to them, how they can personally benefit. I think that’s very important. People don’t always see that connection so I think it’s important to take on that inspirational role, which can be hard for some people. They don’t feel as comfortable talking about what they do, but it’s important to share your ideas, to communicate, to always become a marketer of analytics and data science, so inspire others and create meetups or informal groups within your own organization, attend meetups to see what other people are doing and get advice from them. I think that’s quite imperative to the success.
Also the hierarchy can be an issue. Areas where it’s very much hierarchical, I’ve found don’t work as effective in these technical teams as opposed to having a more flat structure where there’s more autonomy and flexibility. I think that tends to work much better and data scientists prefer to work in those environments. And also, you need to know how to manage—for management to know how to manage data scientists, because their career aspirations can be quite different to the rest or to more generalists, say.
So keeping them interested, engaged, giving them access to the right people, tools, software, whatever the case may be, is very important. Normally it comes around making sure there’s data for them to work on and challenging problems for them to solve. Without that, you’ll lose those people, you’ll have such a high churn rate, which I’ve seen many times where I’ve managed teams and tried to hold on to people or to bring people on. It really depends on the organization, the problems they’re working on, and that’s sometimes what drives me out as well. If I don’t have interesting problems or enough data, then I’ll move on myself. So that’s on the people side.
On the value side, the most important thing there is to be seen as a trusted professional and not just a technical genius. A lot of people that build these teams or work in these teams, and rightly so, they want to be seen as the technical gurus, that they can be approached to solve these problems. You don’t really get the practice off the ground unless senior management and your peers actually trust you as well. So part of that is to share knowledge, to educate those around you, inform them and be transparent about what you’re doing and why you’re doing it, don’t just be a black box, show them how you can help them and that you want to support them and how you’ll go about doing that. I think that’s very important.
And also what is very much key is to link the outcomes, the work you’re doing, to the strategic goals of the organization. Sometimes you have to make that connection quite clear, what you’re doing and how will that benefit the organization, how will it reduce customer churn, increase money, increase value to the public or private sector, whatever the case may be. Making that connection clear, always having that connection in the back of your mind so you can use that when you’re speaking to senior management, I think is paramount to actually having them trust you and believe in you and throw money and resources at you to actually try and solve their problem.
On the more technical side around adding value, I think it’s important for the analysts to develop software development practices, which I see less of these days with people coming into data science without having the more computational IT background. They’re not used to doing things like unit testing, peer code reviews. These days, with things like GitHub, that’s all becoming more popular, but for a while the source control was something no one had ever thought of. They’re just developing models on their own, systems stored where no one else can see the code or debug it or do anything. So I think those practices are very important to having the right team and for the team to add value to the organization.
And I think using prototypes to overcome doubt and resistance is very important because often we face a lot of doubt and resistance from people around us as to how we’re building this model and how it will improve what they’re doing. “We’ve been doing it for years.” A traditional one is, “Why is your method any better or any different?” So building a POC to show people, “Look, this is the insight we’re getting from the data. This is what we can do quickly. This is the actual value that we can add. How about we invest more time and money to actually develop a full-blown system?”
Another important thing is to put people before technology. It’s more important to have the right people rather than first investing in some software solution that a vendor is pushing and then getting people to adapt to that, which I’ve seen happen a lot in my career. Getting the right people, let them choose what is possible – it’s not always possible depending on what you’re working on for security reasons, funding, whatever, but ideally you want to get that first. [indecipherable 46:37], worry about technology if you’re building the system from ground up because the good people will tell you what they need, they want flexibility. These days it’s really moving towards open source, even in the government space, which is great to see. We’re using R, Python, Hadoop platforms rather than the traditional SAS-based systems, IBM, etc. So that’s very good to see. It gives you a lot more flexibility and it’s cheaper for the organization or department.
Another key thing is don’t be afraid to fail, or fail fast and cheap. I think that’s important as well, don’t be shy, try new ideas, developing the mindset like a hacker’s mindset of just giving something a go, see if it works, work in a more agile way, and then move on. Don’t just put all your eggs in one basket or just have this one solution at the end. Work in a more iterative fashion and make sure that the people around you are comfortable doing that, and that senior management understands and your stakeholders understand that you want to interact with them a lot more. That interaction, I think is very important, interacting with the business. You can’t be isolated. You need to be constantly engaging with them, understanding their business world.
That is key, to understand the business and then to go back and develop solutions. And what often helps in those cases where I’ve led those teams is to have some of my analysts embedded in some of those teams, either in IT, in the business unit we’re working with, in some of our stakeholder teams, just to make sure there is a constant flow of information back and forth. That’s what often really increases the chance of success.
And the final one is around communication. So, one key point there is to focus on the outcomes, not the methods and tools used. So when you’re communicating as a data scientist or a leader of the data scientists to other people, talking about what the real outcomes are, something that they can understand, use their terminology, understand their jargon, and don’t worry so much about the tools and methods you’ve used. That’s important to you, but may not be the main focus from their point of view.
Try to communicate those ideas very clearly, limit the jargon you’re using, but use their business lingo so they understand. You all want to be speaking one language. Visualization is often important with this. These days, having tools like Tableau and Qlik and SAS VA is a great way to show people some of the solutions you’re actually coming up with. Visualization, I’ve noticed, works very well, especially when you’re talking to people from a less technical background.
And building trust within them also helps sometimes when you’re doing roadshows. I used to do a lot of stakeholder roadshows within an organization, go around to the different teams and show them what my team can do for them, what we can do. Let them pose particular problems and we’d say to them, “Okay, give us a week or two, depending on our own timing, to try and come up with a simple solution or a roadmap that we can work together on doing a proof of concept for you.” And often that’s great, because people then are more open in those informal settings to discuss ideas, to come up with questions, you know, “We’ve thought about this. Is this possible? Does this fall into your realm?” You open up that dialogue, that communication. It shows them that you’re keen to learn about their passion about what you do, and it gives them a chance to ask questions face to face.
It goes back to what we were saying earlier about the meetups. We as humans are a lot more comfortable, especially with these technical issues, talking face to face. And what you find sometimes is people can be embarrassed about asking a question. They’re not sure if it’s a stupid question or does this technical question mean anything, but once you allay those fears and they see you as just another human as well, not just a technical genius or a geek, that dialogue opens up and then often it becomes quite successful from that point on. So hopefully that ramble answers those questions and provides some insight as to how I’ve managed to, at times, build quite successful teams.
Kirill: Wow. Alex, that was amazing. I was listening and at first I was writing down everything you’re saying, so much value. I was writing down from the people side and the value add side and then I just ran out of space on my paper. (Laughs) But I think it’s so valuable. If you don’t mind – the good news is this has all been recorded – I’ll ask someone in the team to put all of this into an infographic and then we’ll share it on the page. So, guys listening to this, you just go to the page of the podcast which you will hear at the end of the session or at the start of the session, and you can download the infographic absolutely free and we’ll share it on LinkedIn. I think this is super valuable for people building a data science team.
Alex: Yes, it is. If we have time, I can run through something similar on actually becoming an effective data scientist, which is something that I’ve often been asked by people. I’ve got a similar list that I tend to work through in my head.
Kirill: Please, let’s do that. We definitely have time and let’s do that again. I’m sure this is going to be super valuable on the flipside, for those who want to not just build the data science team but be the data scientist. So, here we go. How to be an effective data scientist?
Alex: In my view, anyway. So, there’s four particular areas I like to break it down to: that’s looking at skills, the business side, communication and attitude is quite an important one, I feel. So from the skills side, you need to have strong quantitative skills, of course, to become a great data scientist. The main point there is to build up those analytic capabilities, not so much on the tool side, but how to think about problems, the problem-solving logic involved. A key element of that is to ignore the math and stats at your own peril. I don’t expect people necessarily to go out there and get a PhD in mathematics and statistics and to understand all the fundamentals in great detail. An important thing, as I’m sure you’re going to agree, Kirill, is to understand at an intuitive level. I think that really makes or breaks a person as an analyst in general throughout their career.
I’ll give you one quick reason or an anecdote as to why that happens. I was once mentoring a junior staff member looking at solving a particular semi-analytical solution to a problem. And they had the answer and they said, “Look, I’ve got this answer now. How do I know if it’s right? How do I actually test this?” And I said, “Well, first of all, you should be using common sense, and I can tell you your answer is wrong.”
And he looked at me and he said, “How do you know? I’ve gone through the mathematics, I’ve done the computations and everything seems to make sense and I’ve done this many times before. How do you know the answer is wrong?” I said, “Well, first of all, if you understood the business problem, you will see that we’re out by an order of magnitude. The answer was 32 and I was expecting 320. So obviously there’s a problem there.” That’s one thing. That’s the business side. They weren’t really engaging with the business enough to understand. They just took on the problem and thought, “This is very much now an academic problem and I’ll go away and solve it,” which was something they’re comfortable with and they’ve done many times before. And I get that. We all fall into that trap at times.
But the important thing is, if they were also to look at the structure of the mathematical equation underneath, they would see why we were out by order of magnitude. So having that intuitive grasp of what the model is doing helps you then understand are you getting the right answer, are you in the right ballpark, which is often very important. And also, the structure tells you, “How do I actually go and debug it or how do I find out where a small change in my input is giving me this large change in the output which I’m not expecting?”
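The two checks described here, "am I in the right ballpark?" and "which input is my output most sensitive to?", can be sketched in a few lines of Python. This is an illustration only: `toy_model` and its parameters are hypothetical stand-ins, not the model from the anecdote. The only figures taken from the story are the result of 32 and the expected 320.

```python
import math


def same_order_of_magnitude(result, expected, tolerance_decades=0.5):
    """True when `result` is within `tolerance_decades` powers of ten of `expected`."""
    if result <= 0 or expected <= 0:
        raise ValueError("both values must be positive")
    return abs(math.log10(result / expected)) < tolerance_decades


def sensitivity(model, inputs, rel_step=0.01):
    """Relative change in the output per relative change in each input.

    A value near 1 means the output moves in proportion to that input;
    a large value marks an input where a small change is amplified.
    Assumes the model's baseline output is non-zero.
    """
    base = model(**inputs)
    result = {}
    for name, value in inputs.items():
        bumped = dict(inputs, **{name: value * (1 + rel_step)})
        result[name] = ((model(**bumped) - base) / base) / rel_step
    return result


# Hypothetical model: a quantity that compounds over ten periods.
def toy_model(units, price, growth):
    return units * price * (1 + growth) ** 10


# The ballpark check from the anecdote: 32 against an expected 320.
print(same_order_of_magnitude(32, 320))

# Which input is the output most sensitive to?
print(sensitivity(toy_model, {"units": 100.0, "price": 3.0, "growth": 0.25}))
```

In the toy model the output scales one-for-one with `units` and `price` but roughly two-for-one with `growth`, which is the kind of structural hint that tells you where to start debugging.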
So after we worked through that, we were quite quickly able to determine what was happening. And it gave them just a better insight for how to actually go about solving those problems. That’s one thing, I guess. I’m used to these days having less of an opportunity to work on those analytical and semi-analytical solutions, to get lost in the beauty of mathematics given that now most things I work on are so empirical and very much computationally-based. I’m used to that at times, but…
So, the intuitive grasp still holds true. If you’re working on – let’s say we’re looking at machine learning, artificial neural networks. You’re looking at forward propagation, backward propagation, there’s gradient descent methods, cross-entropy, cost functions, whatever the case may be. People are looking at all these complex-looking equations and then they’re thinking, “I don’t understand what’s going on here.”
Someone asked me recently a question about – they were looking at a derivation. They were trying to gauge from me what does this mean mathematically. One point was to explain it from an intuitive level, to try and explain what’s happening. But an element of that that was really important was going back to fundamental principles they would have studied, such as calculus, in this case the chain rule, you know, understanding how backward propagation and gradient descent work is all around the chain rule.
So once you understand the concept of the chain rule and what it’s doing, looking at the underlying derivatives, then you can quickly understand, “Okay, I can see that the rate at which the weight learns is controlled by the error in the output, so larger errors mean I’m getting faster learning in the neuron itself.” So, that really helped the person grasp the concept, even if they didn’t look at all the derivation and do it themselves – I did that for them – but the ability to look at it and say, “Okay, now I know what’s happening intuitively with this calculation,” means I have a better grasp not only of how does the method work, but how can I test them, my own solutions, is it the right method to use for a particular problem. That’s all paramount and really important to actually becoming a strong data scientist.
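As an illustration of that intuition, here is a minimal Python sketch of the chain rule for one weight. The transcript does not say which network or cost function was being derived, so this assumes the simplest textbook case: a single sigmoid neuron with a cross-entropy cost. All names are hypothetical. Under those assumptions the three chain-rule factors collapse to `x * (a - y)`, so the further the output `a` is from the target `y`, the faster the weight learns.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(a, y):
    """Cost for one example with output a and target y (assumed cost function)."""
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def cost(w, b, x, y):
    return cross_entropy(sigmoid(w * x + b), y)


def grad_w_chain_rule(w, b, x, y):
    """dC/dw = dC/da * da/dz * dz/dw for z = w*x + b and a = sigmoid(z)."""
    a = sigmoid(w * x + b)
    dC_da = (a - y) / (a * (1 - a))   # derivative of the cross-entropy cost
    da_dz = a * (1 - a)               # derivative of the sigmoid
    dz_dw = x                         # derivative of the weighted input
    return dC_da * da_dz * dz_dw      # the a*(1-a) terms cancel: x * (a - y)


def grad_w_numeric(w, b, x, y, h=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (cost(w + h, b, x, y) - cost(w - h, b, x, y)) / (2 * h)


x, y = 1.0, 0.0
badly_wrong = grad_w_chain_rule(2.0, 2.0, x, y)     # output near 1, target 0
nearly_right = grad_w_chain_rule(-2.0, -2.0, x, y)  # output near 0, target 0

print(badly_wrong, grad_w_numeric(2.0, 2.0, x, y))
print(nearly_right, grad_w_numeric(-2.0, -2.0, x, y))
```

The finite-difference check is also a small example of testing your own solution: if the analytic and numeric gradients disagree, the derivation or the code is wrong.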
That goes back to how do things work intuitively, which I think people forget sometimes, but it’s incredibly important to focus on that as you learn. It helps you learn and understand how any problem works in life, not just the mathematical ones.
On the quant side, often I’m asked, “Should I be doing a Master’s or a PhD in data science?” And my question to them is, “Why? Do you just like the idea of having those extra letters after your name, or do you think it’s important for your actual career?” What I think is more important is to do further studies, if any, in the fundamental sciences, in the maths, the stats, econometrics, physics, whatever. I think the maths and stats skills (or actuarial studies as well) that you gain from that give you a better grasp of what’s happening intuitively in the modelling sense than just doing say a Master’s in data science, which is becoming all the rage these days that I’ve seen. That might have merit as well, but I think understanding the fundamentals at a deeper level will take you further in your career, especially if you’re moving into a more technical area of machine learning, AI, whatever. I think that’s much more important.
If you just want to work more on the periphery and understand what’s happening, then a Master’s has a lot of benefit, but to go deeper, you need to go deeper with the fundamentals. There’s a lot of great courses online, SuperDataScience and others, and I try and tell people about some of that stuff. And going back to my earlier point, when you’re building an effective data science practice, I think trying to do some of that education internally is important. Like, if you were to run some introductory R courses or data science courses internally, you’ll find that there’s lots of analysts that are interested and that people want to build up their skills, so not only do you share your passion, but now you have extra people you can use within your organization and department to help spur your cause, and also to help you with your modelling. You know, you have extra staff now that you can use to help you with your workload. I think that’s great. I’ve seen that happen many times and there’s a lot of people that are very keen to learn. They may not all go and become data scientists, but having a greater understanding of R or Python or some simple predictive modelling really helps set them up for a new element of their career they never would have done before. So through you, they’re now going to learn more and share that passion, which is fantastic.
Another great way to learn and to build your skills is to do a lot of hands-on on-the-job training, which I think is a fantastic way to learn. Kaggle, of course, and things like that are great. At times, new areas that I’ve wanted to learn has happened in two ways: either putting myself in a situation where I’m trying to solve a problem in a technical field that I haven’t really done before, it’s a great way to learn. And the other one is to actually be asked to teach a course that I’ve never done before. So what better way to learn than through teaching?
So, there’s a lot of ways for people to learn these days, as we discussed earlier: go to meetups, ask informal questions, do a lot of online training, do formal courses… So many options these days, but on-the-job training, I think, can be quite important, especially if you’re working with strong analysts around you, people that are willing to share. And when you’re forced to work on a problem, that forces you to quickly learn a technique and try things out, not to be scared or just to sit back and think theoretically about a problem but actually get your hands dirty.
The coding side, on the skills side, is very important. You can’t get anywhere without SQL or the Hadoop equivalent these days. You need to be able to extract the data before you do any analysis on it, of course. R and Python are the go-to languages. I’ve seen a lot more growth in the government space in Australia, which has been great, not just moving towards open source, but all the libraries and functionality available through R and Python has been great and opening up opportunities on the types of problems that people can solve – predictive modelling, natural language processing, AI, ANNs. It’s just fantastic. Even if it doesn’t go anywhere, people are learning it and trying new things and they’re moving away from more traditional stuff like SAS and C#, C++, etc., so more towards the functional programming languages, which is good.
And I guess for people it’s important on the skill side to just be familiar with most techniques. Even if you don’t use them, just be aware of the different things that exist, you know, natural language processing space, convolutional neural networks and their implications, text mining, new advances in predictive modelling, other analytical techniques. Just be aware at least what exists and something you may turn to to help you solve a problem at some point in the future. Just be aware of it. I think it’s good going to meetups, reading stuff online. It’s a great way to broaden your knowledge base. That’s on the skill side.
On the business side, the fundamental thing you have to keep in the back of your mind and to constantly strive for is to understand the business problem. Understand the business you’re working with, assuming you’re working in an area that I’ve—many times when you work with a business unit, you need to understand their world. You need to feel their pain, know what their pain points are, where can you really add value, and then start thinking, “Okay, how can I use my skill and experience as an analyst to actually help these guys?”
It can be something as simple as helping them transition away from doing some sort of high-level analytics and reporting in Excel to a more advanced system, or do they actually have a problem that lends itself to a predictive modelling solution. You won’t know that until you really focus on their area, their business area, to understand what problems they face. So that should be at the forefront of your mind rather than “What techniques should I be using?” or “What do I want to work on today?” Understand the business. That adds value to you and it adds value to the organization.
You have to work with stakeholders, open up communication and engage with them. You don’t want to be isolated. As part of that, where often people get caught out is they don’t have clear objectives defined. You know, someone will talk about a problem at a very high-level and you think, “Okay, I have a potential solution to this.” You go away and develop something after a few weeks, you come back and they’ll say, “Well, that’s not really what we meant.” “Well, that’s what you told me it was. Hold on, where are the actual objectives? Nothing is being written down.” So it’s important to try from an early point to clearly stipulate what the problem is and what the approach is to actually getting a solution. And what will the solution look like? Will it be a one-off report? Will it be predictive modelling that goes into an enterprise-wide system that’s run continuously for risk scoring? Whatever the case may be, try to have that set up early. Sometimes it’s an iterative process. They don’t know what they want and you don’t know what you’re going to come up with until you see the data. So work towards that.
And part of that, what’s really fundamental, is make sure they have data. I’ve been in these situations where people call me in and they want me to solve their problems using the fancy world of data science, but they don’t really have much data available, or they don’t have a large historical database. I’m quite limited in what I can do in those cases when there isn’t much data. Or you have data, you have access to limited data, but you have to wait a month to go through the security clearance, get all the data sorted. Or one group gives you the data, but another group, because their data is always sitting in a disparate system, it’s always dirty, it’s always messy to work with – how are you going to join these different datasets?
So understanding the business, understanding where the data which is important to an organisation lies, who are the guardians of that data, how to win their confidence to share it with you? Because sometimes people aren’t happy to share the data with you, I’ve noticed. They want to hold onto it, they think the data is their property, as opposed to belonging to the organization. So often there’s a lot of these internal battles that you’ll have to convince people that they should relinquish the data, that you want to help them, and that you’re there for the greater good of the organization.
So often, when you’re working with a business to solve their problems, I think it’s important to validate the models and the analytics you’re doing with the people, not just by statistics and evaluation techniques. It goes back to this iterative process of engaging with your stakeholders. Take them on a journey, tell them a story of what you’re working on, why you’re doing something, show them interim results, does this make sense, educate them why it may not make sense, or what we’re expecting to get at this point. So that communication and education with your stakeholders is really important, it should never stop as you work on these projects. I think it’s very valuable.
That takes us on to the next point, which is on communication. So when you’re trying to communicate it with internal stakeholders, senior management, whatever the case may be, I think it’s important to try and excite these people about what you’re doing, share the passion, tell a story, show them how you can help remove the pain that they’re facing, increase efficiencies, whatever the case may be. Link it back to the strategic goals as I mentioned earlier about the organization and try to make it as clear as possible that what you’re doing will help them. As opposed to what you’re doing is cool or exciting or is the right thing to do, but how does it actually help them? I think that’s really important.
Often I found what works best is to use demos, not so much leverage the power of PowerPoint slides, which is great but it can be a bit boring and stale to people. So, try to show them a demo, get them involved, get them to interact with it, “Here, change this value,” you know, “Put in the value that you think is realistic. Let’s see what happens with the outcome. Let’s look at some ‘What if’ analysis or try and predict something three months down the line if we change this particular data point now.”
And as part of that, when you’re talking to them, as I mentioned earlier, try and adopt as much business jargon and terminology as possible rather than just revert back to jargon we tend to use in the data science space. If people are keen and want to learn about that, that’s great. But if not, it’s probably bad to try and force it on them.
Because what often happens is people get a bit more sceptical when they hear some of these big terms that we use, and they feel intimidated in some ways, they feel uncomfortable. So try and steer away from a lot of that technical jargon, use it when you have to, but speak more in business terms and in a way that really helps them understand how it’s going to benefit them. So that’s why I think visualization is often important in that space. Visualizing something as a human is sometimes easier to grasp rather than the words that we use, which aren’t common to everyone.
And that takes me to my final point, which is around attitude. And I think an important aspect is don’t be scared to fail, as I mentioned earlier, as a data scientist. Try new ideas, talk to people. If a particular method doesn’t work, it’s okay – try something else and no one is going to think any less of you. It’s a dynamic world, things are changing, there’s new techniques to try, data can be very difficult and complex to work with, understanding the business rules around it is very hard. Once again, having good communication with the business owners really helps in understanding those underlying business rules, which is something that can catch you out, especially in the government space, where there’s a lot of complexity in these old legacy systems, data sitting all over the place. How does it all hang together? What’s used and why, depending on particular legal policy implications that sit around it? You really need to understand that when you’re working with the data and coming up with a model, hence the business is your go-to for that.
Also along those similar lines with attitude is to adopt, as I call it, a hacker’s mindset, just try new ideas and don’t be scared if you’ve never done something in Python but you’ve done it in R, give it a go in Python, build up your own skills. You may find another library that you think is helpful, that may be better or faster, whatever the case may be. Don’t be scared to get your hands dirty and play around. And on that, don’t be scared to ask for help when you need it, either from a peer, from your manager, whatever. Don’t let it go too long and you think, “Oh, I’ll finally get the answer, it will be okay,” when this deadline is looming. If you need help with something, either a business or a technical problem, just yell out. I’ve noticed a lot of people, and myself at times, are afraid to ask, thinking people will think we’re stupid or we don’t know enough. You quickly realize people do want to help you, they’re there for the greater good, and everyone goes through these situations where you just don’t know the answer to something, so just ask.
And on that too, as I’ve also alluded to, always focus on the outcomes, not the methods that you’re using. So, it’s important to build a simple model first, not just go to something that’s complex and exciting, which is harder to understand, harder to debug, harder to explain to people. So just focus on those outcomes and then worry about the methods used to deliver on those outcomes. And part of that is really understanding the business problem to help you strive towards what the correct outcome is. On attitude, you need to be curious and a problem solver, you need to enjoy problems, and be an evangelist to really inspire others and to share the passion for analytics and data science. So that can be through your own work, informal gatherings, create your own meetup, write a blog, whatever the case may be. I think the attitude is an important one, yeah. So, yeah, hopefully that sums up some of my ideas that I’ve come up with over the past few years. I hope that helps someone with their own career.
Kirill: Fantastic. I’ve been listening to this and I’ve learned quite a few things from here and I really like how you broke it down into those different parts, about skills, the business side, communication, attitude. I think they’re all very valuable. Once again, with this one we’ll also aim to do an infographic and share, and that way for any listener, whether you’re trying to create a data science practice or you are trying to be the most effective data scientist that you can be, you’ll have something to follow along.
Thank you so much, Alex, for sharing. I’m just curious, how did you come up with these? You mentioned ‘over the years,’ but did you have a system that helped you develop these bullet points?
Alex: I guess when I gave a meetup a little while ago, I had to force myself to think about how to synthesize what I’ve learned in my career in a way that would help those aspiring to become better data scientists or to transition to the area, because I’ve been asked many times, you know, “How do I actually go about becoming a data scientist?” or “I’ve been doing a different type of analytics for a while. I want to move into it. Why should I, what skills do I need?”
And in terms of building effective teams, I guess as I was building more and more teams, I’d often think about “What do I need to do in the next role I move into to actually make sure it’s as successful as it was before or better than it was before?” So then I started thinking, “Well, I think I better start jotting down what’s worked and what hasn’t, and then of course bouncing ideas off peers, reflecting on what’s worked in the past with managers I’ve had and what hasn’t worked, what could have been done better.” And just try to quantify everything, put it down into a nice little framework that helps me revert back to it or just hand it to someone and say, “Try to focus on this and then see how you go. Come back to me if you have any questions. This should help you get started.”
Kirill: Yeah. That’s very admirable.
Alex: Thank you.
Kirill: It’s great to see how you’re giving back to people who are starting out and getting there. Hopefully we will help you spread the word through this podcast.
Alex: I hope so. That would be great.
Kirill: Awesome. Well, we’re out of time, but thank you for coming. What is the best way people can follow you? If there’s somebody maybe in Canberra who might want to get in touch, or somebody who wants to follow your career online, what are the best ways to do that?
Alex: LinkedIn would be great. It has links to my Meetup groups as well through there, so they can link to that and then register as members and then come along to the next meetups. But LinkedIn is a great way to stay in touch and ask any questions anyone may have. I’m always happy to offer advice and help out in any way I can.
Kirill: Okay, fantastic. Thank you so much. And I just have one last question for you. Is there a book that you can recommend to our listeners to help them in their careers?
Alex: One book I think would be great for people is written by someone I know, a friend of mine who is now Director of Data Science at Microsoft. His name is Graham Williams and the book is “Data Mining with Rattle and R.” It’s a great book in helping people understand how to actually go through the data mining, data science process. It particularly uses R and a package called Rattle, which Graham invented himself. It’s a really great GUI for doing predictive modelling within R rather than having to do all the modelling through a command line. It’s got a great interface, makes it easy to learn, and a great way to interrogate and understand the data, so I highly recommend working through his book and some examples for people who haven’t done much and maybe look at some techniques. That’s “Data Mining with Rattle and R” by Graham Williams. Graham is a guy who is always happy to help people as well.
Kirill: Yeah, it says a lot that now he’s working as a data scientist, is that right, at Microsoft?
Alex: Yeah, and for a long time he was the key data scientist at the Australian Tax Office and he did some fantastic work there, so he’s highly respected throughout the Australian data science sector, one of the leads there.
Kirill: Fantastic. Well, there you go. Graham Williams: “Data Mining with Rattle and R.” Once again, Alex, thank you so much for coming on the show. This has been invaluable and I’m sure lots and lots of people have and will get a lot of value out of the insights you shared today.
Alex: I hope so. It’s been great chatting to you, Kirill. I really enjoyed it.
Kirill: So there you have it. That was Dr Alex Antic, senior data scientist at the Australian Federal Government. I really hope you enjoyed today’s episode. There was lots and lots of information to share. First of all, make sure you go to www.superdatascience.com/121 and download those two infographics. We put in quite a bit of effort into those and, as you can see, Alex put in all of his life’s and career’s experience into those, so you definitely don’t want to miss out on those. And moreover, you might know somebody who they can help, so feel free to share them around. That’s our mission, that’s our goal, to help people into the space of data science, to help spread the word, and we’ll only be excited and happy if you can contribute to that as well.
The other question I wanted to ask you is, what was your favourite takeaway from this podcast? Again, there was lots and lots of information, but personally for me, I was most curious about the discussion that we had with the shift that’s happened in the world from analytical to numerical. I think that was a very philosophical conclusion from what we can observe in the world right now. Before, you had to be very smart and cunning about the mathematical equations you develop in order to solve problems. And now you can just be less like that and just throw things into machine learning algorithms and get them to churn the numbers and it’s still going to work. It’s just like a brute force approach instead of a very elegant mathematical approach.
That’s exactly what gave rise to the space of data science. If we didn’t have a machine that can churn that many numbers and brute force through things like that, it’d still be called just mathematics, just applied mathematics, and that’s what we’d be doing. But now we have the power of data, the power of analysing lots of data and that’s called data science. Interestingly, that trend is continuing. With the rise of quantum computing, we will be able to brute force even more, we’ll have to think even less about how to approach problems, just throw everything into the computer and let it spit out the results, and you, I guess, will be able to use more and more sophisticated algorithms like deep learning, which require a lot of computational power, simply because you will have that computational power.
So there we go, a very interesting way in which the world is going. And again, your takeaways from this episode might be different, but in any case, I hope you enjoyed this conversation today. If you did, make sure to share it around with others who might benefit from it as well. Don’t forget to connect with Dr Antic on LinkedIn. You can find his LinkedIn URL and the link to his Meetup group at www.superdatascience.com/121. You can also get all the show notes, including the two infographics there as well. On that note, thank you so much for being here. I really appreciate you taking the time to join us for our discussion. I can’t wait to see you back here next time. Until then, happy analysing.