Kirill Eremenko: 00:00:00
This is episode number 427 with VP of Data Science at Gojek, Syafri Bahar.
Kirill Eremenko: 00:00:12
Welcome to the SuperDataScience Podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur, and each week we bring you inspiring people and ideas to help you build your successful career in Data Science. Thanks for being here today, and now let’s make the complex simple.
Kirill Eremenko: 00:00:44
Welcome back to the SuperDataScience Podcast everybody, super excited to have you back here on the show. This episode is incredibly fun and cool. Today we had the VP of data science from Gojek join us on the episode. If you’re from Southeast Asia, you have probably heard of Gojek and actually very likely used it.
Kirill Eremenko: 00:01:10
But in case you are not from Southeast Asia or you haven’t heard about Gojek, this is a huge company. It is valued at $10 billion as of today. It’s had extreme rapid growth and it is a super app. It is one app inside which you can get 20 different services from ride-sharing, to shopping, to food delivery, to insurance, to cleaning, to even hair styling. How cool is that?
Kirill Eremenko: 00:01:42
The app serves millions of people across Indonesia, Vietnam, Singapore, and Thailand. And they’re growing extremely fast. They have been growing extremely fast, they continue to grow extremely fast.
Kirill Eremenko: 00:01:57
And today we had the pleasure of speaking with the VP of Data Science from there, Syafri Bahar. And before I continue onto what this episode is all about and what we talked about, I wanted to say why I keep saying, “we,” we spoke with Syafri. Because today we have a second host, Jon Krohn joined me as a co-host on this episode. You may remember Jon from episode 365 in May this year.
Kirill Eremenko: 00:02:25
And the reason why Jon is joining, there’s something super exciting coming up in 2021 as an exciting change. Jon is actually going to be… I’ll give you a heads up now without going into too much detail. We’ll talk about it. I’ll announce it more in the coming episodes, but Jon will be taking over as host of this show.
Kirill Eremenko: 00:02:47
I know that might come as a surprise. It’s the first time I’ve mentioned this publicly, but it’s going to be super fun, it’s going to be an amazing time. And we won’t talk about this too much right now and not detract from the episode, we’ll get into that in a future episode. But in this episode we decided to co-host and talk with Syafri together, and it turned out really fun. We had a lot of laughs and I’m sure you will join us with them, with those laughs.
Kirill Eremenko: 00:03:16
And so what did we speak about today with Syafri? Well, we talked about Gojek and the impact it’s having. We talked about decision science versus data science. They actually have three divisions under Gojek, decision science, data science and business intelligence, and we specifically discussed the difference between decision science and data science.
Kirill Eremenko: 00:03:34
We talked about CartoBERT and Turing, so some more technical things and some use cases around this. Some very interesting use cases. We talked about what it’s like to be a VP or vice president of data science, and what that role entails at a rapidly-growing company like Gojek.
Kirill Eremenko: 00:03:55
We talked about what it takes for a data science team to be a high performance data science team. We talked about mathematics in data science quite extensively. Both Jon and Syafri are experts on mathematics and data science. It was very interesting to have that conversation. And finally, we talked about what it takes to thrive as a data scientist in a company like Gojek.
Kirill Eremenko: 00:04:18
So lots of very cool insights coming up. Can’t wait for you to check out this episode. Without further ado, I bring to you Syafri Bahar, VP of data science at Gojek.
Kirill Eremenko: 00:04:34
Welcome back to the SuperDataScience Podcast, everybody. Super excited to have you back here on the show. Today we’ve got a very exciting episode. We’ve got two hosts and one guest. Our guest for today is Syafri Bahar calling in from Indonesia, from Bali. And we’ve also got Jon Krohn as our co-host calling in from New York. Hi, guys. How are you doing?
Syafri Bahar: 00:04:55
Hi, Kirill. Doing good, thanks.
Jon Krohn: 00:04:58
Hey, very well, Kirill. Yeah. Delighted to be here.
Kirill Eremenko: 00:05:01
Awesome. What’s the time for you, Syafri?
Syafri Bahar: 00:05:05
Now it’s 9:00 actually, so I’m calling from Bali.
Kirill Eremenko: 00:05:08
9:00 AM, right?
Syafri Bahar: 00:05:10
9:00 AM, yes.
Kirill Eremenko: 00:05:11
Awesome, awesome. And Jon, you?
Jon Krohn: 00:05:14
Yeah. 8:00 PM. Getting there.
Kirill Eremenko: 00:05:17
Yeah. Crazy. Across all the time zones.
Jon Krohn: 00:05:20
And how about you, Kirill?
Kirill Eremenko: 00:05:22
For me? It’s about 6:30 AM. About 6:00 AM.
Jon Krohn: 00:05:28
Oh, man.
Syafri Bahar: 00:05:29
Oh, wow. That’s very early.
Kirill Eremenko: 00:05:31
Yeah. That’s okay.
Jon Krohn: 00:05:32
Do you always get up that early?
Kirill Eremenko: 00:05:34
I do, my girlfriend doesn’t. She was so dazed, I had to go to another room to go and sleep there because this is the only room where I can record. Took her blanket and pillow and just went away.
Jon Krohn: 00:05:52
Our apologies.
Kirill Eremenko: 00:05:54
No, it’s okay. It’s okay. I’m glad we’re all here. We’ve I think met Jon. Our listeners have met Jon before from other podcasts, but just quickly, Jon, if you could give us a quick intro about your background.
Jon Krohn: 00:06:09
Sure. I’m the chief data scientist at a machine learning startup in New York. That’s my day job, but on the side I do lots of data science education. I have a book, Deep Learning Illustrated that was a number one best seller. Not been translated into Indonesian yet, but we do have a lot of translations around the world.
Jon Krohn: 00:06:34
And I’ve also been doing some work with SuperDataScience. We’ve got a Machine Learning Foundations course that just launched in the Udemy platform together. And Kirill and I met through the SuperDataScience podcast. I was a guest on the podcast early in 2019, and I asked Kirill if he would like to be a guest on my podcast, which I had just launched. At that point I’d only had two episodes, and we hit it off. We had a really great conversation and if you don’t mind me breaking it to your audience right now, Kirill.
Kirill Eremenko: 00:07:11
Yeah, sure.
Jon Krohn: 00:07:15
A couple of months ago, Kirill approached me to begin hosting the SuperDataScience podcast, so I’m absolutely blown away. I couldn’t believe that he asked me to do that. Now we’re getting me warmed up by co-hosting today, and I couldn’t be more excited.
Kirill Eremenko: 00:07:34
Me too. Super fun, super fun. It’s going to be an exciting time I think. I feel you’re the right person to carry the SDS podcast forward. Thanks for being here today, Jon.
Jon Krohn: 00:07:45
Yeah. An honor.
Kirill Eremenko: 00:07:46
Awesome. All right. Oh, and by the way, congrats on the Machine Learning Fundamentals or Foundations. 90,000 students, right? Last I checked.
Syafri Bahar: 00:07:56
Wow.
Jon Krohn: 00:07:57
Yeah. I think it’s 80,000, but that’s about the same in terms of the impact. And yeah, 80,000 students. It’s only been live for five or six weeks. And that’s the kind of thing that I couldn’t have possibly ever dreamed of that kind of thing. It’s by association with you guys, with the SuperDataScience podcast, and so thank you very much for that.
Jon Krohn: 00:08:22
And we’re only just getting started. There’s three and half hours live for the course right now, and I expect when the podcast is released it’ll still be about that three and a half hour mark. But by the end of 2020 it’ll be about six hours. We’ll have finished the first quarter or so of all the content. In 2021, there’ll be 25 hours of content in there, covering linear algebra, calculus, probability, statistics, computer science. Everything you need to know to be a great machine learning practitioner, or data scientist.
Kirill Eremenko: 00:08:58
Fantastic. That’s very cool. And that’s a very good segue to Syafri, because Syafri, you love mathematics, right?
Syafri Bahar: 00:09:04
Oh, yeah.
Kirill Eremenko: 00:09:04
Your whole story is mathematics.
Syafri Bahar: 00:09:08
Sure, yeah. Exactly.
Kirill Eremenko: 00:09:09
Please tell us a bit about that.
Syafri Bahar: 00:09:13
Yeah. I’ve actually been into mathematics since I was a child, actually. My father is actually a math teacher, so when I was a child-
Jon Krohn: 00:09:20
There you go.
Syafri Bahar: 00:09:20
Yeah. I remember a day where I was I think in elementary school and I start asking about this sequence problem. I just make a sequence problem with the three differential layer of arithmetic sequence to my teacher. And then I actually asked the problem to my father, but he just tossed me a book.
Syafri Bahar: 00:09:45
But later I found out that it’s actually in a university book. I’m kind of being crunching in order to find the solution of the problem, and since then I’ve actually grown my interest to math. In fact, I’m also lucky enough to represent Indonesia actually to a couple of math Olympiad competitions, so that’s a very nice experience.
Jon Krohn: 00:10:06
That’s huge because Indonesia is the fourth most populous country on the planet, so you’re representing a big population there.
Syafri Bahar: 00:10:15
Yeah. It’s quite surreal also for me back then because I was kind of from, how do you call it, the underdog regions of Indonesia, so to say. A lot of the representatives [inaudible 00:10:28] always come from the Jakarta area, and I was probably the first representative from that region, from that province actually, after let’s say eight to 10 years. It was quite a euphoria for me as well.
Kirill Eremenko: 00:10:42
What province?
Syafri Bahar: 00:10:42
Sorry?
Kirill Eremenko: 00:10:42
What province is that?
Syafri Bahar: 00:10:44
Sulawesi province. Sulawesi.
Kirill Eremenko: 00:10:45
Sulawesi.
Syafri Bahar: 00:10:46
Yeah.
Kirill Eremenko: 00:10:49
I know there was a few active volcanoes. I was doing a data science analysis of the active volcanoes of the past, I don’t know, centuries. And there’s quite a few in Sulawesi. I think four or five [inaudible 00:11:02] hundreds.
Syafri Bahar: 00:11:10
Yeah. It looks like a K actually on the map. It’s easily recognizable. Since then I’ve grown my interest and I’m actually still actively reading, learning about math book. I think I consider it as a hobby actually, because I find it beautiful as a discipline. So yeah, you’re right about it. I’m a big fan of math.
Kirill Eremenko: 00:11:26
That’s awesome. And when you don’t do math, what is it that you do? Because it sounds like you’re so into mathematics, sounds like your full time job, but you have a different full time job. Tell us a bit about.
Syafri Bahar: 00:11:39
Oh, yes. Yes. Actually it’s my day job. I am a VP of data science for Gojek. Gojek is actually an on-demand super app platform. We have around 20 products. Basically, starting from ride hailing, we have food delivery, we have entertainment, kind of like Netflix streaming services. We also have insurance, for example. It’s a super app.
Syafri Bahar: 00:12:09
We used to actually have even a service where you can actually order a masseuse coming to your house within 15 minutes actually, just with a click of a thumb. But unfortunately, we can’t get the service to sell. But yeah, it is quite a hyper growth product. We become the first unicorn of Indonesia and then two, three years after we became the first decacorn of Indonesia, which is surreal in terms of growth I would say.
Kirill Eremenko: 00:12:41
What’s a decacorn?
Syafri Bahar: 00:12:43
A decacorn is with a 10 billion valuation basically.
Kirill Eremenko: 00:12:45
10 billion valuation. Oh, my gosh. In 10 years, right you said?
Syafri Bahar: 00:12:49
Well, it’s eight years actually to be precise.
Kirill Eremenko: 00:12:52
Yeah. Wow. Wow. Very cool. Are you subscribed to the Data Science Insider? Personally, I love the Data Science Insider. It is something that we created, so I’m biased. But I do get a lot of value out of it. Data Science Insider if you don’t know is a free, absolutely free newsletter which we send out into your inbox every Friday. Very easy to subscribe to. Go to SuperDataScience.com/DSI.
Kirill Eremenko: 00:13:20
And what do we put together there? Well, our team goes through the most important updates over the past week or maybe several weeks and finds the news related to data science and artificial intelligence. You can get swamped with all the news, even if you filter it down to just AI and data science, and that’s why our team does this work for you.
Kirill Eremenko: 00:13:39
Our team goes through all this news and finds the top five, simply five articles that you will find interesting for your personal and professional growth. They are then summarized, put into one email, and at a click of a button you can access them, look through the summaries. You don’t even have to go and read the whole article, you can just read the summary and be up to speed of what’s going on in the world.
Kirill Eremenko: 00:14:01
And if you’re interested in what exactly is happening in detail, then you can click the link and read the original article itself. I do that almost every week myself. I go through the articles and sometimes I find something interesting, I dig into it. So if you’d like to get the updates of the week in your inbox, subscribe to the Data Science Insider absolutely free at SuperDataScience.com/DSI. That’s SuperDataScience.com/DSI. And now, let’s get back to this amazing episode.
Jon Krohn: 00:14:32
And you’re financed by some of the biggest possible financiers around. Sequoia Capital, Tencent, Google, Facebook. So it’s interesting that you would think that a lot of those companies would actually be competing companies, and so it’s interesting. I guess they see a lot of potential in Indonesia.
Jon Krohn: 00:14:50
Something that really interests me and may interest a lot of our listeners is what is a super app? In the West, I don’t think we have anything like that. It seems almost like in the West they deliberately fragment apps. So Facebook fragmented into Messenger, and as many different pieces as possible.
Jon Krohn: 00:15:11
When you have a super app, when you look on your phone it’s just one app that you click on, and then when you’re inside you navigate to all these? You get your massage and your insurance once you’re inside?
Syafri Bahar: 00:15:24
Exactly. No, exactly, Jon. Yeah. It is very interesting indeed, because if you think about it there’s not really a comparable I would say platform on there. But just the idea is we built the whole ecosystem within one app. And I think [inaudible 00:15:39] actually managed to create this network, and then you actually start to reap the benefits of having. Because anything that you put in that ecosystem scales very fast actually.
Syafri Bahar: 00:15:48
So we became, for example for the food delivery, the biggest in Asia excluding China. Logistics for example also became the biggest in Indonesia just from leveraging of this network effect actually that we have within the app. But you’re right, if you think about it the opportunities to implement data science, machine learning just meshed in terms of personalization.
Syafri Bahar: 00:16:17
It’s just amazing. For example, being able to know [inaudible 00:16:21] of food orders or massage appointments allows us to recommend what is the best service for that. You might think of music actually. [crosstalk 00:16:34]. There’s so many kind of information within the network which can actually be leveraged to build a very powerful personalization. It’s quite an exciting environment. It’s like having 20 companies within one umbrella, pretty much.
Jon Krohn: 00:16:52
Yeah. The data science perspective of it sounds absolutely amazing. And I guess we’ll spend most of today’s program talking about that, so it’s great. I love this idea of how you can be like, “Oh, yeah. If you like a deep tissue massage, then you’ll probably be interested in our athlete insurance.”
Syafri Bahar: 00:17:08
Exactly.
Kirill Eremenko: 00:17:12
It’s like recommender systems on Netflix or Amazon but on steroids. You get the network effect of the recommender systems. It’s exponential on exponential. No wonder it grows so fast.
Syafri Bahar: 00:17:25
Exactly, exactly.
Kirill Eremenko: 00:17:25
That’s so cool. As I understand, you’re operating in Thailand, Vietnam, Singapore, and Indonesia. Is that correct?
Syafri Bahar: 00:17:32
Correct. Yes.
Kirill Eremenko: 00:17:34
And how many people, just for those… Of course, for those people who are from those countries will know you well, but for those from the West who maybe haven’t heard of Gojek, how many users do you have on your platform? How many people do you work with on your platform?
Syafri Bahar: 00:17:51
Sure, sure. Maybe just to give the idea of the scale. The app itself has been downloaded 170 million times, actually. And I think one in every four Indonesians has the app installed. They have actually [inaudible 00:18:08]. And then we have already around-
Kirill Eremenko: 00:18:10
I have the app installed, too.
Syafri Bahar: 00:18:11
Oh, really?
Kirill Eremenko: 00:18:13
I’ve had at least.
Syafri Bahar: 00:18:14
[crosstalk 00:18:14].
Kirill Eremenko: 00:18:15
Yeah. When I was in Bali I asked for a ride. You get on the scooter behind this driver and you hold on for your life. It’s a really cool experience.
Syafri Bahar: 00:18:26
Yeah. And just to give you the scale because that’s very interesting, because it has around total of drivers and then also service providers, we have around two, two and a half million. So that’s almost 1% of the population of Indonesia, so it’s quite crazy. Basically that thing that a lot of people’s lives actually depend on us.
Syafri Bahar: 00:18:49
So it’s also quite a privilege I feel, because we need to do our jobs really well in order to be able to survive and really provide these people with the day-to-day livings as well. Maybe couple of other things [inaudible 00:19:05]. In terms of the economy also we’ve contributed immensely in Indonesia. I think if we total everything for all the incomes coming from the platform itself, it actually contributes to 1% of Indonesia’s GDP. So it’s pretty big.
Jon Krohn: 00:19:22
That’s incredible. Yeah.
Syafri Bahar: 00:19:24
Yeah. And actually, we also hit our two billion orders milestone last year, if I’m not mistaken. It’s actually quite a milestone also for us.
Kirill Eremenko: 00:19:33
Congrats. That’s really cool. I’m sure data science played a huge role in that.
Syafri Bahar: 00:19:40
Yes, yes.
Jon Krohn: 00:19:42
I had a question along a similar vein. I’ve been queued up for it perfectly, which is how big is the core company at Gojek? For example, how many data science people are there?
Syafri Bahar: 00:19:57
Yeah. Data science, there are around 60 to 80 people I think now. In total within data [inaudible 00:20:05] we have around 150-180 people in total. Now we actually have three different, I would call it, analytics professions within the company. We have data scientists, we have BI, business intelligence, we also have decision scientists.
Syafri Bahar: 00:20:22
Recently we introduced this basically to kind of help us making the right decisions for million-dollar decisions that we need to take. We need a really specialized knowledge to, how do you call it, to clear out all the ambiguities in terms of asking questions and being able to systematically taking decisions in a more rigorous way, basically. That’s about the size of the data team.
Jon Krohn: 00:20:49
Nice. The decision sciences team sounds like the holy grail in business. That’s what everybody wants to be doing, and maybe because you guys do it that’s why you’re having this incredible hyper growth, and you’ve become a decacorn. It could be a big part of it.
Syafri Bahar: 00:21:03
Yeah. We actually just recently started. We’ll see, but I think we’re probably one of the first that’s introduced the job ladder, the job family in Indonesia. I am really looking forward to what kind of impact actually it can. But if you look at already the use cases, there’s quite a lot of use cases already where we need to take, for example decisions about expansion, decisions about releasing certain features. Decisions about for example distributing [inaudible 00:21:30]. I think those are the typical questions that these job I’ll say architects will focus in on for that.
Kirill Eremenko: 00:21:40
What’s the difference in skill sets for a decision scientist versus a data scientist?
Syafri Bahar: 00:21:48
Yeah. Our definition, because again within the market especially Indonesia, every company has their own ways to define data science. I think our definition of data science versus decision science, if you look at the core skill set, data scientists within our companies are very strong in software engineering as well. So they’re trained to build scalable machine learning system.
Syafri Bahar: 00:22:14
A little bit more like the applied machine learning engineers actually. Very close to that, while our decision scientists they need to be very strong with the statistical analysis, like causal inference for example. Being able to do hypothesis test and they need to be good with experimentation. The focus are a little bit different.
Syafri Bahar: 00:22:34
Data scientists, they really build data products, scalable data products. And our decision scientists really help with decision making actually, by running some certain analysis. Statistical analysis that can help us making better decisions.
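As a rough illustration of the hypothesis testing mentioned here, the following is a minimal sketch of a two-proportion z-test, one common way to check whether an experiment changed a rate. It is not Gojek's method: the function name and the counts are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one rate."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that the groups do not differ.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: orders in which the customer called the driver.
z, p = two_proportion_z_test(
    success_a=200, total_a=1000,  # control group (assumed numbers)
    success_b=150, total_b=1000,  # group that saw the new feature (assumed)
)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in rates is unlikely to be chance alone. Whether the difference is large enough to matter for the business is a separate judgment.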
Kirill Eremenko: 00:22:48
Wow. That’s very cool. I guess in smaller companies or companies that are not as advanced in terms of data science, that is all combined in the analyst or the data scientist, those hypothesis testing and so on. But as you scale, I guess you made the call to separate those two and really specialize people. “All right. You are in hypothesis testing and you can run all these experiments, whereas you’re in machine learning and engineering of features,” and things like that. So people can actually focus and get really good at, not one thing but that group of things that are relevant to that profession.
Syafri Bahar: 00:23:31
Yeah. Indeed, indeed.
Kirill Eremenko: 00:23:32
That’s very cool.
Jon Krohn: 00:23:34
It seems like those data scientists... I’ve been reading about Gojek’s machine learning platform; there’s a series of articles on Medium. And some very cool specialized tools like CartoBERT, so using the BERT system, the transformers in natural language processing. So leveraging particular deep learning techniques to allow you in the ride hailing product to be able to create names for pickup points, right?
Syafri Bahar: 00:24:04
Correct. Indeed.
Jon Krohn: 00:24:06
And then I read about Turing, which is named after the great British computer scientist Alan Turing. And it’s a tool for evaluating machine learning models I guess before they go into production, or maybe after they’re also in production to make sure that they’re still performing as you’d expect?
Syafri Bahar: 00:24:24
Yeah. I’m actually very, very happy that you’ve basically spent some time in visiting our Medium blog, and there are like great articles over there. But you’re right about CartoBERT, I think the idea is one of the things that we would like to [inaudible 00:24:39]. This is also very interesting in terms of how we really bring data end to end. Just particularly if you don’t mind, I’ll tell a little bit stories about CartoBERT.
Jon Krohn: 00:24:48
Please.
Syafri Bahar: 00:24:50
Yeah. It used to be that we learned from the data [inaudible 00:24:54] people who have actually been picked up from a very crowded location, like a shopping mall, et cetera, et cetera. We basically look at the percentage of people who call the drivers. It’s actually two X compared to the other place.
Syafri Bahar: 00:25:09
Basically we have concluded that people [inaudible 00:25:12] around these areas. So what we did is that we run some clustering, we [inaudible 00:25:18] basically, and we found out that among these pickup points apparently we can actually find the center of those clusters, where people ask to be picked up.
Syafri Bahar: 00:25:30
And then what we do is that we also have the chat history of drivers and then our customers. So what we did is that with a clustering system we picked the center point, and we need to basically attach a label into that. And that’s where CartoBERT actually plays into the role, because it allows us to crunch millions of chat logs, and then summarize it into a pickup point.
Syafri Bahar: 00:25:54
Especially given the size of Indonesia, it’s just not possible to do it manually. So what we did is that we ran 100,000 pickup points in Indonesia for the shopping mall, and then it translates into product features that people love. And then we all see quite a significant reduction in the number of calls between drivers and customers. Just to illustrate how we really use data to improve the experience of our users.
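The pipeline described here (group nearby pickups, take the center of each group, then name it from the chat logs) can be sketched in a few lines. This is a toy under stated assumptions, not CartoBERT itself: a coarse grid stands in for the clustering step, the most frequent word pair in the chats stands in for the BERT-based naming, and the sample data, cell size, and stopword list are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (latitude, longitude, chat message) per pickup.
PICKUPS = [
    (-6.2251, 106.7993, "I am at the north lobby"),
    (-6.2252, 106.7994, "waiting at north lobby near the taxi stand"),
    (-6.2250, 106.7992, "north lobby please"),
    (-6.2301, 106.8050, "I am at gate 3"),
    (-6.2302, 106.8051, "pick me up at gate 3"),
]

CELL = 0.001  # grid cell size in degrees (roughly 100 m); an assumed value
STOPWORDS = {"i", "am", "at", "the", "me", "up", "please", "near", "pick", "waiting"}


def cell_of(lat, lon):
    """Map a coordinate to a grid cell; a crude stand-in for real clustering."""
    return (round(lat / CELL), round(lon / CELL))


def bigrams(message):
    """Return adjacent word pairs from a message after dropping stopwords."""
    words = [w for w in message.lower().split() if w not in STOPWORDS]
    return zip(words, words[1:])


def named_pickup_points(pickups):
    """Group pickups by cell, then return (centroid, label, count) per group."""
    groups = defaultdict(list)
    for lat, lon, msg in pickups:
        groups[cell_of(lat, lon)].append((lat, lon, msg))

    results = []
    for members in groups.values():
        lat = sum(m[0] for m in members) / len(members)
        lon = sum(m[1] for m in members) / len(members)
        # Label the group with its most frequent word pair across all chats.
        counts = Counter(bg for m in members for bg in bigrams(m[2]))
        label = " ".join(counts.most_common(1)[0][0]) if counts else "unnamed"
        results.append(((round(lat, 4), round(lon, 4)), label, len(members)))
    # Busiest pickup points first.
    return sorted(results, key=lambda r: -r[2])


for centroid, label, count in named_pickup_points(PICKUPS):
    print(centroid, label, count)
```

A frequency count like this breaks down quickly on messy, multilingual chat text, which is presumably part of why a language model was used for the naming step at real scale.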
Kirill Eremenko: 00:26:20
That’s a cool one.
Syafri Bahar: 00:26:25
In addition to that, actually a couple of months ago we released also together with Hong Kong University of Technology, we worked together and we released probably one of the bigger BERT… One of the biggest pre-trained BERT NLP models for the Indonesian language actually. And we have open sourced it. People here if you happen to be interested in Indonesian language, NLP for Indonesian language, you can actually go to www.IndoNLU.com. You actually can download the pre-trained model for Indonesian language.
Kirill Eremenko: 00:27:07
Beautiful. That’s awesome.
Jon Krohn: 00:27:07
Yeah. It’s great to be sharing your expertise with the world. Really wonderful. Seems like you guys are doing great things on your team. Maybe Kirill already knows this, I don’t know how much you know about each other’s backgrounds, but in your role are you… Who reports in to you? How big are the teams? What does being a VP of data science mean at Gojek?
Kirill Eremenko: 00:27:38
Yeah. I’m also very curious. That’s a great question.
Syafri Bahar: 00:27:41
Yeah. Thanks a lot, actually. Especially in the hyper growth startup, I’ve probably changed my role three to four times already within two years in terms of scope. I was originally hired to develop the data science capabilities [inaudible 00:27:56] originally, and then became the head accountant for data science basically.
Syafri Bahar: 00:28:00
At that time the teams were still around 40-50 people I think within the machine learning engineers. [inaudible 00:28:07] platform. And then recently, the portfolio has grown a little bit. Not only that, I actually have two other peers within Gojek, so we both report to the chief data officer of Gojek.
Syafri Bahar: 00:28:23
Together with my peers we basically split the portfolio, so I currently oversee around nine verticals. Our entertainment, third party platform, groceries, marketing, for example. It’s not all logistics. There are a couple of verticals and we oversee both the analytic and science part of the portfolio.
Syafri Bahar: 00:28:46
What I refer to analysts is the BI and analysts, data analysts. And then the science part is decision scientists and data scientists. Probably there are around 50-70 people, 60 people I think eventually reporting to me currently.
Syafri Bahar: 00:29:07
In terms of scope of work, it basically encompasses almost all spectrums. If I were to decide the cluster [inaudible 00:29:15] activities, it’s starting from the people itself. We’re taking care of the technology, what impact. Which technology that we need to for example invest in next year.
Syafri Bahar: 00:29:25
We also deal with building organizations. How do we organize ourself actually to prepare us to tackle the company strategic team, positions ourself. Basically all of these aspects from hiring and everything. Even the dirty one, like the financing, cleaning up the systems and stuff like that.
Syafri Bahar: 00:29:47
It encompasses almost everything, basically. I actually do see myself as a problem solver in a way that whatever, I try to fill the sack in terms of, “Hey, I don’t think that there is a, for example clear career path for some of our people,” so I’m going to immediately jump talking to HR and ensuring that for example that we’ve managed to create a good system that allows people to basically follow their aspirations.
Syafri Bahar: 00:30:15
But sometimes I’m also put in a very project-specific activity, like for example really understanding our customers. Creating a framework in order to be able to actively manage our customer portfolio for example, by properly [inaudible 00:30:31] customer lifetime, for example. I think those are different spectrums, just to get some flavor of what I’m doing on day-to-day basis. I hope that answers your question.
Jon Krohn: 00:30:38
Yeah. That was an amazing answer, and it sounds like a really interesting role. Wow.
Kirill Eremenko: 00:30:46
Do you still do much technical work?
Syafri Bahar: 00:30:50
Yes. I try to do so because I think, and especially in this field, things just evolve very rapidly so I try to spend a couple of hours still coding basically, and really pushing code as well to the repository. Being involved also in the technical discussion in the modeling, so I still try to do that.
Kirill Eremenko: 00:31:10
Yeah. That’s impressive.
Syafri Bahar: 00:31:12
[crosstalk 00:31:12]. Yeah, exactly.
Kirill Eremenko: 00:31:15
Absolutely. That’s very, very good to hear. I like what you said in one of your interviews about a high performing data science team that requires three main components. Do you mind telling us a bit about that? As a VP of data science, you have a unique position that not only you need to deliver the work, but you also need to evaluate the performance of your team.
Kirill Eremenko: 00:31:41
And report to higher up executives on, “We are delivering value. This is a very useful team to the company.” You have accountability and you have a responsibility to your team to do that, otherwise there’s stories of whole teams getting disbanded because executives didn’t see value. And I found your philosophy about what is a high performing data science team very structured.
Kirill Eremenko: 00:32:10
And I think not only managers listening to this podcast will find valuable, but also individual contributor data scientists will find it valuable to understand. To evaluate for themselves if they’re part of such a team, and what they can do in order to be part of such a team. If you could jump into that, that’d be great.
Syafri Bahar: 00:32:30
Sure. Yeah. Thanks a lot for that. Indeed, I think what I found actually to be very challenging is really to establish values. Especially data science itself as a discipline is a valued thing. It takes a lot of faith I would say from executives to kind of even invest in the team, because typically the investment will probably take, especially for our largest machine learning systems that we have that really move the needle, it probably took a year in the making.
Syafri Bahar: 00:33:01
Involves a lot of iterations, trials and error. So I think what I also highlighted in my interview that basically what we need to show to the company is that, first of all I think we need to measure everything. And that’s also the reason why we’re actually integrating all of our machine learning system. We integrate also the measurement system inside, just to ensure that we are actually able to quantify the impact even on a team level.
Syafri Bahar: 00:33:26
I’m able to know for example what is the dollar impact that a team of three people basically deliver for a particular project. That’s how rigorous we are in terms of measuring impact. And really, I think the case that we try to make is that we want to make a case that we’re not [inaudible 00:33:43], because there are very tangible dollar savings or dollar generating, actually that we do for the company.
Syafri Bahar: 00:33:50
And we are able to achieve that by really putting a very strong measurement in place, even before we engage in any of the machine learning projects actually. I think the first thing that we ask of our product engineering counterparts is to have a measurement system in place. We have an experimentation system in place actually, just to understand whether basically the things that we will build for them will actually lead to much impact.
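A minimal sketch of the kind of measurement described here, assuming a simple two-group experiment. The revenue figures, the rollout size, and the function name `estimate_uplift` are hypothetical and are not taken from Gojek's systems. It estimates the per-user uplift with a rough normal-approximation interval (crude for samples this small) and scales it to a dollar figure.

```python
import math
import statistics


def estimate_uplift(control, treatment, z=1.96):
    """Return the difference in means and an approximate 95% interval."""
    diff = statistics.mean(treatment) - statistics.mean(control)
    # Standard error of the difference of two independent sample means.
    se = math.sqrt(
        statistics.variance(control) / len(control)
        + statistics.variance(treatment) / len(treatment)
    )
    return diff, (diff - z * se, diff + z * se)


# Hypothetical revenue per user (in dollars) from a small experiment.
control = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5]
treatment = [12.0, 13.5, 12.5, 11.0, 12.0, 13.0]

uplift, (low, high) = estimate_uplift(control, treatment)
users_affected = 1_000_000  # hypothetical rollout size

print(f"uplift per user: {uplift:.2f} (approx. interval {low:.2f} to {high:.2f})")
print(f"projected dollar impact: {uplift * users_affected:,.0f}")
```

If the interval excludes zero, the team has at least some evidence that the change is worth shipping, and the projected figure gives a first estimate of the dollar impact attributable to the project.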
Syafri Bahar: 00:34:15
And the fact about being in the hyper growth startup, there’s hundreds of things that we can actually do for next year. So we need to have a very ruthless [inaudible 00:34:24]. Having a proper way to measure the impact or potential impact is very essential in order to establish a case for the company.
Syafri Bahar: 00:34:35
I think one of the characterizations of high performing team will be that they deliver impact, and how do they know whether they deliver any impact? It’s by really putting this measurement in place. And then by educating as well. I think what I seem to learn as well during my experience within Gojek is that a lot of these end to end, a lot of these projects have actually managed to deliver big impact.
Syafri Bahar: 00:35:01
A lot of the challenges, of course there are [inaudible 00:35:03] challenges as well, but I think not to be undermined as well is the non-technical challenge of really ensuring that we have created a good structure for our data scientists and product engineers. And engineers actually to work different pace, but they are able to integrate their solution.
Syafri Bahar: 00:35:27
This is one, and the second thing is also about constant education to stakeholders to try to convince them why it is okay actually for their millions of dollars of their money, actually being managed by a black box. I think that also requires a lot of convincing, I would say. So really establishing a good operation model is very essential I think for the high performing team, because by having a good operational model.
Syafri Bahar: 00:35:53
Just to give a little bit more flavor to that one. For example, we have recently basically declared that all of our solutions need to basically communicate with product engineering systems through APIs. Because that allows people basically to move at a different pace, and then meet up again like a couple of weeks later to integrate their solutions.
Syafri Bahar: 00:36:14
But as long as before the start of a project, the teams are very clear in terms of what they will deliver. And we only can achieve that by having a proper API contract. It allows team really to reiterate. And the thing about data science, I think what I found to be very interesting is that their iteration, the sprint cycles are very different with product engineering teams actually.
Syafri Bahar: 00:36:38
There are so many data dependencies when we look at it from a data science perspective, so we can’t really treat it as an engineering sprint. So they need to be able to have the flexibility to move at a different pace. But then eventually the solutions that they build need to match actually the API.
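The API-contract idea can be sketched as follows. This is an illustration under stated assumptions: the ETA use case and every name here (`EtaRequest`, `EtaResponse`, `EtaService`, `StubEtaService`, `render_eta`) are hypothetical and do not describe Gojek's actual interfaces. The request and response types are agreed before the project starts, the product team integrates against a stub, and the data science team can later swap in a real model as long as it honors the same contract.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class EtaRequest:
    """Hypothetical request shape agreed up front by both teams."""
    pickup_lat: float
    pickup_lon: float
    dropoff_lat: float
    dropoff_lon: float


@dataclass(frozen=True)
class EtaResponse:
    """Hypothetical response shape agreed up front by both teams."""
    eta_minutes: float
    model_version: str


class EtaService(Protocol):
    """The contract: any implementation must provide this method."""
    def predict(self, request: EtaRequest) -> EtaResponse: ...


class StubEtaService:
    """Placeholder the product team can integrate against from day one."""
    def predict(self, request: EtaRequest) -> EtaResponse:
        return EtaResponse(eta_minutes=15.0, model_version="stub-0")


def render_eta(service: EtaService, request: EtaRequest) -> str:
    """Product-side code that depends only on the contract, not the model."""
    response = service.predict(request)
    return f"Arriving in about {round(response.eta_minutes)} min"


print(render_eta(StubEtaService(), EtaRequest(-6.2, 106.8, -6.3, 106.9)))
```

Because `render_eta` only depends on the contract, the two teams can iterate at different speeds and still integrate cleanly a few weeks later.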
Syafri Bahar: 00:36:56
And what I also find to be very important is to ensure that the team is empowered to make decisions, by having a proper experimentation system, having a robust methodology to decide whether the team needs to go left or right. And empowering them to make decentralized decision making actually. I think that I found to be very important to ensure that the team can move very fast.
Syafri Bahar: 00:37:20
So we need to trust them with decision making in decentralized manner, as long as the methodology and the system that they’ve actually created to make those decisions are robust. I hope that answers your question.
Kirill Eremenko: 00:37:31
Yeah. And empower them to fail as well, right? You said that in one of your other interviews.
Syafri Bahar: 00:37:36
Yeah.
Kirill Eremenko: 00:37:37
Decisions sometimes will be wrong, and they should know it’s okay.
Syafri Bahar: 00:37:42
Exactly. And I’m actually very glad that we [inaudible 00:37:44] or CEOs or co-CEOs [inaudible 00:37:49] are very supportive of that with the cultures of it’s okay if they actually fail. It’s better to fail fast and learn from it, rather than moving very slow because especially the competitions is very fierce, the market also moves very fast. So agility is definitely something that we value very high within our company.
Kirill Eremenko: 00:38:14
Fantastic. Thank you. Thank you for that answer.
Jon Krohn: 00:38:17
It sounds like you guys are doing everything right. Yeah. If I was in Indonesia and listening, I’d be like, “Man, how can I get involved with this company?” Really amazing. You’re saying all of the things that I think are spot on from a quantitative data management perspective. How you are treating your data scientists and relating that into the broader operations of the organization and evaluating it. Brilliant.
Syafri Bahar: 00:38:52
Yeah. I feel also very privileged actually to work with these amazing people, and I think I learned a lot from my team. And the ability just to work with amazing people who are actually distributed. Our teams are actually well distributed, even our CEO is actually working from US. We have 31 nationalities working for the company, so we’re really chasing talent also globally. We have people across different continents actually working for us. Just as additional information.
Jon Krohn: 00:39:27
There you go. How did you find yourself here? What was your journey? I mean, I know that you’ve worked across the world, you studied in the Netherlands and then you worked there for a while at banks, asset management company. And so what was your journey from that world, so from a different continent?
Jon Krohn: 00:39:49
I expected when I was talking to you that you would have been involved in a lot of the finance applications at Gojek. I thought that that would be what you were working on. But it sounds like it’s much broader than that, so how did you end up making that journey from financial companies, really traditional financial companies? Big banks in the Netherlands to a hyper growth decacorn in Indonesia?
Syafri Bahar: 00:40:18
Yeah. Thanks a lot actually for asking the question. I think because especially the journey has been very intimate to me, and I think the reason of that because I’ve always had doubt whether a pure mathematician like me is actually able to make an impact for the society. I always see it as very remote.
Syafri Bahar: 00:40:40
I remember there was some certain time in my life that I say [inaudible 00:40:44]. Because my background is actually in pure mathematics, so my thesis back then is about topological structure basically, so I haven’t really seen data. I worked a lot with writing formulas [inaudible 00:40:58] formulas, and I had my bachelor education.
Syafri Bahar: 00:41:04
It just felt very remote and not [inaudible 00:41:07], and I actually switched a little bit to applied mathematics. I was actually taking education to be a quant, and there actually I got myself into a lot of high performance computing. A little stochastic courses, and then being able to actually see data. How do you say? It’s quite a spotty journey actually to, how do you call it, to come from pure mathematics-
Jon Krohn: 00:41:42
[inaudible 00:41:42]. Yeah.
Syafri Bahar: 00:41:42
… to applied. Exactly, right.
Jon Krohn: 00:41:47
You might not even have had numbers for many years.
Syafri Bahar: 00:41:50
No. No, no, no.
Jon Krohn: 00:41:50
It was just variables, right?
Syafri Bahar: 00:41:54
It’s just variables. Indeed, indeed.
Jon Krohn: 00:41:54
That’s so interesting.
Syafri Bahar: 00:41:55
Yeah, exactly. And then when I came to the bank and I started actually my education, I was trained I would say in a very classical environment. I remember with one of my mentors back then, I was requested to do analysis with only five basic statistics: mean, median, percentile, max, and min. And I really need to kind of-
Jon Krohn: 00:42:14
Oh, no.
Syafri Bahar: 00:42:15
Yeah, exactly. But I got to really rely a lot on my problem solving skills, and getting to know what these measurements are actually doing. Because actually, I was surprised that a lot of things can be done with these very basic statistics actually. A lot of insight can be uncovered by just playing around with the weighted average for example, and then being able to compare these different statistics.
Syafri Bahar: 00:42:38
And really make an educated guess in terms of what is the [inaudible 00:42:41] distribution, is there an anomaly or not. Actually with some basics, as long as one knows very well [inaudible 00:42:49] actually there are a lot of things that can be done. So I was actually trained in that environment and I was also lucky enough to work with different types of risk.
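A small sketch of how far the five basic statistics can go, using only Python's standard library. The data, the function names (`summarize`, `quick_read`), and the thresholds are illustrative assumptions, not rules from the conversation: a mean well above the median hints at right skew, and a maximum far beyond the 90th percentile hints at an outlier.

```python
import statistics


def summarize(values):
    """Compute mean, median, 90th percentile, min, and max."""
    ordered = sorted(values)
    # quantiles(n=100) returns the 99 cut points between percentiles;
    # index 89 is the 90th percentile.
    percentiles = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": percentiles[89],
        "min": ordered[0],
        "max": ordered[-1],
    }


def quick_read(summary):
    """Turn the summary into rough, heuristic observations."""
    notes = []
    if summary["mean"] > summary["median"]:
        notes.append("mean above median: possible right skew")
    if summary["max"] > 2 * summary["p90"]:  # arbitrary illustrative threshold
        notes.append("max far above p90: possible anomaly")
    return notes


# Hypothetical trip durations in minutes, with one suspicious value.
durations = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]
summary = summarize(durations)
notes = quick_read(summary)
print(summary)
print(notes)
```

Comparing the statistics against each other, rather than reading each one alone, is what gives the educated guess about the shape of the distribution.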
Syafri Bahar: 00:42:59
And actually, for the audience who’s not very familiar with risk management, there are actually different type of risk. And what’s very unique because for each different type of risk, it actually deploys different type of mathematical tool. Just as an example for credit risk, I used a lot of predictive funnels with [inaudible 00:43:19] risk.
Syafri Bahar: 00:43:19
For example, my last type of domain that I’ve worked with before I moved to Indonesia, I actually needed to do a lot of simulation kind of things. I actually maintained Monte Carlo engine for the bank itself. So basically, what we need to do, we have couple of hundreds of thousands of trades and we need to do simulations of thousands of risk factors.
Syafri Bahar: 00:43:44
And not only we need to simulate it for one day or two days later, but really 30 years ahead. So basically, I used a lot of the parametric simulation techniques in order to be able to do that.
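To make the simulation idea concrete, here is a toy sketch for a single risk factor over a 30-year horizon. It assumes geometric Brownian motion with yearly steps, which is a swapped-in textbook model chosen for illustration and not the bank's engine; the drift, volatility, starting level, and the name `simulate_paths` are all hypothetical. A real engine would cover thousands of correlated risk factors and reprice every trade along each path.

```python
import math
import random


def simulate_paths(start, drift, volatility, years, n_paths, seed=0):
    """Simulate yearly levels of one risk factor under geometric Brownian motion."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        level = start
        path = [level]
        for _ in range(years):
            shock = rng.gauss(0.0, 1.0)
            # Exact one-year step of geometric Brownian motion.
            level *= math.exp(drift - 0.5 * volatility ** 2 + volatility * shock)
            path.append(level)
        paths.append(path)
    return paths


paths = simulate_paths(start=100.0, drift=0.02, volatility=0.10, years=30, n_paths=2000)

# Summarize the distribution of the risk factor at the 30-year horizon.
terminal = sorted(path[-1] for path in paths)
mean_terminal = sum(terminal) / len(terminal)
p05 = terminal[int(0.05 * len(terminal))]
print(f"mean level after 30 years: {mean_terminal:.1f}")
print(f"5th percentile after 30 years: {p05:.1f}")
```

The parametric part is the assumed distribution of the yearly shocks; the Monte Carlo part is drawing many paths and reading risk measures off the simulated distribution.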
Syafri Bahar: 00:43:59
But basically, what I wanted to say is that I really built the required skill set in a very classical environment, really bit by bit. And then what I found to be very beautiful about mathematics is that it’s very, how do you call it, transferrable to other types of domains. Because the language is the same, especially the language of linear algebra I think is very useful in order for me to grasp new concepts as well.
Syafri Bahar: 00:44:29
When I came back to Indonesia, I started at a fintech company. And then by coincidence I gave a talk at Gojek actually, and then I got approached by what have now become the co-CEOs of Gojek itself and I got hired over a coffee.
Kirill Eremenko: 00:44:46
Nice.
Syafri Bahar: 00:44:47
I’m actually very glad that he took a bet on me until I basically managed to be where I am now.
Kirill Eremenko: 00:44:56
That’s interesting.
Syafri Bahar: 00:44:56
It was quite a series of coincidences actually.
Kirill Eremenko: 00:45:00
That’s very interesting. I’ve got a question, kind of I guess a question that will challenge me more, and I’d like to get your opinion on this. The way I teach data science in the courses is very different to the way Jon teaches, and the way I guess that you apply data science. I studied also mathematics, studied mathematics and physics in my bachelor, but it was a long time ago and I liked it a lot.
Kirill Eremenko: 00:45:30
But the way I applied data science when I was at Deloitte in industry, it required very little mathematics. And that’s how I teach it as well. I teach it more as like a plug and play type of instrument: “All right, machine learning. Here’s an algorithm. I don’t know, Naïve Bayes, clustering. This is intuitively how it works, this is what’s in the background. This is what’s going on and this is how you apply it.”
Kirill Eremenko: 00:45:57
And I avoid teaching the mathematics. For instance, the analogy I give is driving a car. To drive a car, you need to know where to put the petrol, how to steer, where to press the gas, where to press the brakes. And you need a lot of practice. That’s how you pass your driving test. You never need to know what a crank shaft is, how it’s different to a cam shaft, what’s under the hood.
Kirill Eremenko: 00:46:20
I don’t even sometimes know how to put the oil in the car, for crying out loud. So my question to you is, is there a right or wrong? Or if you think it’s important for people to learn mathematics in order to be data scientists, then why?
Syafri Bahar: 00:46:42
Understood. Okay. Yeah. I think it all very much depends on the type of domain that they will work on in the future, and what they’re interested in. I would say if we kind of look at the spectrum of applications of mathematics within data science, I think we can define it in a couple of clusters actually.
Syafri Bahar: 00:47:01
And particularly in Gojek why it is important for the people to understand the basics, because we dealt a lot with what I call green field projects. These are the type of projects which we can’t just Google and get the answer. We really need to exercise the first principles in order to understand what kind of mathematical apparatus that we basically need to deploy to solve the problem.
Syafri Bahar: 00:47:22
[inaudible 00:47:22] can just come to us and say, “Hey, we have this amount of budget. I want you to be able to distribute it in an optimal way.” Very, very vague and ambiguous, so one is really required to ask more, ask direct questions first of all to understand the real problem.
Syafri Bahar: 00:47:38
How do you define optimal, what are the different levers that we basically can use to distribute those things, and how can we basically use the right apparatus to model the problem itself? What I want to say, basically that that’s also the reason why we emphasize this a lot, the context.
Syafri Bahar: 00:47:56
Even for example [inaudible 00:47:58] linear regression during an interview. I think what we sometimes do, we try to tweak it a little bit, the problems. “Hey, what if we take these L1 penalty, what if we take L2 penalty? What if for example we shift the distribution of the target variables to become very highly imbalanced?”
Syafri Bahar: 00:48:15
Just to test the ability of the candidate to adapt to different reality that they might encounter while working on the problems within Gojek. And the reason why we do it is because we think that’s a relevant skill set to have.
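The L1 versus L2 interview question can be illustrated with a one-feature regression without an intercept, where both penalized fits have closed forms. This is a sketch under assumptions: the data, the penalty strength, and the function names are made up, and the objectives are taken as half the squared error plus `strength * |w|` for L1 and plus `0.5 * strength * w**2` for L2.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_slope(xs, ys, penalty=None, strength=0.0):
    """Closed-form slope for y ~ w * x with an optional L1 or L2 penalty."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    if penalty == "l2":
        # Ridge: the penalty inflates the denominator, shrinking w smoothly.
        return sxy / (sxx + strength)
    if penalty == "l1":
        # Lasso: the penalty subtracts from the numerator and can hit exactly zero.
        return soft_threshold(sxy, strength) / sxx
    return sxy / sxx


# Hypothetical data with a weak relationship between x and y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.2, 0.1, 0.4, 0.3]

ols = fit_slope(xs, ys)
ridge = fit_slope(xs, ys, penalty="l2", strength=5.0)
lasso = fit_slope(xs, ys, penalty="l1", strength=5.0)
print(f"no penalty: {ols:.4f}, L2: {ridge:.4f}, L1: {lasso:.4f}")
```

With the same strength, the L2 penalty only shrinks the coefficient while the L1 penalty drives it to exactly zero, which is the behavior a candidate would be expected to reason about from first principles.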
Syafri Bahar: 00:48:30
I can imagine for example when one will focus a lot on building the data science platform or engineering platform. [inaudible 00:48:41] to know the kind of two, three layer [inaudible 00:48:44]. Like what you said, more like a plug and play, but I think the emphasis will be how to design the right architecture that can be very scalable. And how do we use the mathematical concepts to cut some of the computational resources that basically go into that?
Syafri Bahar: 00:49:00
And maybe in that case, it will be less obligatory to know the two, three-layer depth. So maybe I apologize because there is no straight answer, but I think it all depends. And I think for the Gojek context it’s very important to understand those basics, because then the choice of apparatus to deal with the problem is just quite immense.
Syafri Bahar: 00:49:22
We employ the economic technique, we employ also the operation research technique for example in our problems. If we play around with logistics, sometimes also predictive models, supervised and unsupervised. And even to some certain extent also some [inaudible 00:49:38] type of algorithm. There’s just quite a lot of possibilities over there, so it’s really important to know the at least two, three layers deep from the mathematical perspective.
Syafri Bahar: 00:49:51
But I think for my personal opinion, I think it is also very important to basically, how do you call it, like a painter. Sometimes we need to be able to really bring people to the, how do you call it, to appreciate the painting itself. And I think sometimes the best way to do it is that by not starting with [inaudible 00:50:16] differential calculus.
Syafri Bahar: 00:50:17
But really starting with the stories and then, “Hey, why this is important. Why [inaudible 00:50:22] is important. Because hey, we can actually translate that fraction of this problem by bringing it to the [inaudible 00:50:28] for example.” Then they’re able to imagine the solutions of the problem.
Syafri Bahar: 00:50:33
I think it depends. And I think my personal preference is always to start very simple and then try to peel the layers one by one, bringing them to a bit more, how do you call it, depth. The required depth actually necessary. That’s just [inaudible 00:50:49] a lot of it in learning as well. [inaudible 00:50:52] language or the way you present your teaching.
Jon Krohn: 00:50:59
I love that answer, and I don’t think I have too much extra to add. I think that to kind of summarize the value of understanding the underlying mathematics is that I love the car driving analogy. But the beautiful thing about machine learning is it isn’t necessarily actually that complicated, what’s going on under the hood.
Jon Krohn: 00:51:29
And so I actually started teaching exactly the same kind of way that you described teaching, Kirill. And it’s only relatively recently that I was like, “Maybe it is worth getting into the partial derivative calculus, the linear algebra that’s happening under here.” And I was inspired to think that by colleagues of mine, people who work for me.
Jon Krohn: 00:51:54
I would see them doing matrix algebra or I would see them thinking about, “What’s the right data structure for this particular type of data in this model because of how we’re going to be scaling it, so that we can minimize computational resources.” I was seeing people use these underlying understandings to make on the science side, huge intuitive breakthroughs that by only understanding the [inaudible 00:52:29] API, there’s no way you could have had that breakthrough.
Jon Krohn: 00:52:33
And then on the engineering side, being able to think about, “Okay. What is the time complexity or the memory complexity of what I’m doing here? And then how can I maybe make adjustments there, trade offs between computational complexity versus memory complexity, so that I can use fewer resources, or maybe have a faster experience? Realtime experience for my users.”
Jon Krohn: 00:53:02
It’s a really recent thing for me that it seems so valuable, but the more and more I dig into it, the more and more I appreciate that, “Wow. There’s so many possibilities.” And there’s still absolutely a time and a place for using the high-level APIs.
Jon Krohn: 00:53:19
I mean, maybe more often than not. But to be making really cutting-edge algorithms, or to even be understanding and trying to deploy some of the latest things that you read that might only occur in papers or graduate-level textbooks. There might not be a high-level API for you to use yet. So if you wanted to make CartoBERT, you can’t just be able to use BERT. You have to understand what’s happening in BERT.
Syafri Bahar: 00:53:51
Yeah. Fascinating. If I can add a couple more sentences to that, I personally think that I’m on a personal mission to really spark interest from people. Especially in Indonesia, to really find this discipline to be fascinating. I really want the people in Indonesia in their job for example are being asked, “What do you want to do in the future?”
Syafri Bahar: 00:54:17
Instead of saying astronaut or doctor for example, they say, “I want to be a data scientist.” I think what I wanted to kind of emphasize, I think there’s so many beautiful things which you can actually put in more intuition, in order to just make the first bridge for people to cross that bridge to find it interesting.
Syafri Bahar: 00:54:34
And I found communications via wrapping up things in terms of intuitions, like what Kirill just mentioned. I think it’s very helpful to spark their interest and really for people to get interested and really to get motivated, and they will give energy in order to go even deeper, to a deeper level.
Syafri Bahar: 00:54:53
But I think I’m still learning. I try to also learn how can I actually present all these different complex concepts to actually make it very simple, intuitive, and exciting as well. I think that’s kind of my personal mission. I’m still learning of course, but I think they’re just so beautiful in terms of discipline. And I think a lot more people actually can benefit from that, and especially the society.
Kirill Eremenko: 00:55:22
Thanks. Thanks, guys. I asked to get challenged and I feel challenged. Yeah. I think it’s a good perspective that there’s room for both to get started, go down the intuition path, but then always keep in mind that you can go deeper and it’ll give you more superpowers with the mathematics.
Kirill Eremenko: 00:55:42
Syafri, you mentioned that Gojek is hiring, so where can people apply? And then I wanted to ask you a second question. What does it take to thrive as a data scientist at Gojek?
Syafri Bahar: 00:56:00
Sure. Thanks a lot for asking this question specifically. It helps us a lot. And I think that we can always find the open positions actually within the recruitment. And I think if people just type Gojek recruitment, they’ll pop up basically the website where they can see what are the available positions at Gojek.
Syafri Bahar: 00:56:26
And I think the second question is very interesting. I think the fact that the company itself is, how do you call it, we’re going to the next phase now. It used to be that we were in this very high growth phase, so to say where things are a little bit ambiguous, I would say sometimes. So people who can navigate in an ambiguous environment will thrive within Gojek. People who can actually systematically approach problems in general, they will thrive. And I think it also takes a lot of determination and grit to push things as well as a data scientist.
Syafri Bahar: 00:57:03
And I think this is also the type of data scientist who actually does not just stick with the conventional approach to things, but data scientists are required also to be able to exercise first principles. And I think those are the type of data scientists who can actually thrive within the Gojek environment.
Syafri Bahar: 00:57:23
So they need to be diverse enough to know what are the different apparatus available to solve a problem, and be skillful enough to use them. And I think also it’s being intellectually humble, to really acknowledge that we don’t know what we don’t know. Because sometimes it’s just really like asking.
Syafri Bahar: 00:57:43
Sometimes actually there are a lot of things that actually hidden behind all of these numbers and digits that we’re seeing on our screen, as a data scientist. So I think I often also ask my data scientists just to go to the field. Really talk to our drivers, really understanding their pain points, and then that way it actually allows them to understand and to basically rationalize what they see under there.
Syafri Bahar: 00:58:11
How do you say, [inaudible 00:58:11] in terms of all the figures and numbers. And then from those intellectual curiosities, they are able to frame the problems correctly, and then frame it as data scientist problem. And then again, the next level will be to find the right apparatus.
Syafri Bahar: 00:58:28
And I think another quality that I think also will help a lot will be to be very practical. If you look at the overall in the company, there are a lot of problems that require simple solutions. Because there are a lot of low hanging fruits in the company, these are the type of problems that we need basically effort 80%. We can achieve the standard solution.
Syafri Bahar: 00:58:51
But there are also rather mature problems where in order to go from 95% to 97%, then we will need the fundamental research. And I think what I always told my team is that we should be fine using a hammer to kind of hammer the problem, but we should not shy away from using a scalpel as well in really formalizing the solutions. I think this type of mindset will basically help people to thrive within the Gojek environment.
Kirill Eremenko: 00:59:22
Wow. Fantastic. Thank you. Thank you for sharing that.
Jon Krohn: 00:59:25
Yeah. I think something that if people get a chance to check out the video version of this podcast, you can see Syafri is so happy this whole time talking about modeling. And maybe that even comes through in the sound of his voice, but there’s so many points where he throws his head back with a big smile, because you’re so enlivened by these questions and these ideas. It’s wonderful to see.
Syafri Bahar: 01:00:01
Yeah. Thanks a lot, Jon and Kirill.
Kirill Eremenko: 01:00:01
Yeah. In one of the videos, you mentioned in one of your interviews that when you were working as a quant back in Europe I believe, you realized that the impact you make cannot extend much further beyond the company you work for. And that you just want to do more, you want to work… Your quote in quotation marks, “I just want to do more. I want to do work to benefit a lot of other people.” Do you feel that you’re doing that at Gojek now?
Syafri Bahar: 01:00:32
Yeah. As a matter of fact, I do. And I actually feel lucky myself, because when I wake up in the morning I still feel that day is my first day, to be honest. Because I’m really still very motivated to solve different problems. And the thing about Indonesia because there are just so many structural inefficiencies within the country, that I believe people like me and other…
Syafri Bahar: 01:00:58
I think there’s also an interview where I specifically call all the expats over there, like Indonesian people who live abroad, to just come and really contribute to the country. Because there’s just so many structural issues that we need to fix, and I think exploitations of natural resources is one way to extract values.
Syafri Bahar: 01:01:15
But I think solving structural inefficiency is also one way to create value for the system actually. And I feel actually blessed and privileged also to have the opportunity to really be able to serve the community. Because these are products that I can really relate to. My family will say, “Hey, I feel that this app actually has helped me to remove the daily frictions.”
Syafri Bahar: 01:01:43
Even if, for example, there is something bad happening, I’ll get immediate feedback. And even because I also sometimes, before pandemic of course, I go ride to office. I talk to the drivers as well, and then he mentioned how his life actually has changed since he became one of our partners. He was able to for example, adopt a couple of children because of the fact that he works as a partner, a driver partner within our platform.
Syafri Bahar: 01:02:14
I think those are all the stories that really keep me going through the day, and I feel blessed to be honest, to be able to have the opportunity to do that. Especially with my remote discipline, what’s considered to be very remote. Mathematics, computer science, and social impact.
Kirill Eremenko: 01:02:36
Wow. Thank you. That’s very inspiring to hear. I wish for as many people listening as possible to feel the same way at work. It’s clearly a very fulfilling place to be in.
Syafri Bahar: 01:02:57
Thanks.
Kirill Eremenko: 01:02:59
That’s awesome. Jon, do you have any questions to finish off?
Jon Krohn: 01:03:04
No. We’ve covered all of my questions and I love the ones that you asked as well, Kirill. I’ve learned so much today. I can’t help but notice that it seems like Gojek’s mission is to create impact at scale through technology. And so it sounds like you’re really living that as a data scientist at the firm, Syafri.
Jon Krohn: 01:03:28
I don’t have any other questions. I just felt like saying that one more time, kind of reinforcing this idea of with probably the vast majority of people listening to this podcast are data professionals, or aspiring data professionals. And to hear a story like this today, it made me feel inspired and so I hope you feel inspired, too, to be identifying places that you can be making a big positive socioeconomic impact with your skills. Even if you started with a pure math topology background, you too can make a difference.
Syafri Bahar: 01:04:08
Oh, that’s a nice [inaudible 01:04:09] over there, Jon. And thanks a lot. I really enjoyed the conversation actually. You have done a fantastic job in really controlling the flow, and just really participating as well. Genuinely ask questions I think. And I think to a lot of data professionals out there, I still fundamentally believe in the futures of our professions actually.
Syafri Bahar: 01:04:32
I think we can do a lot of things for the community, even for the world in general. I think we’ve just scratched the surface of what data actually can do and bring to the lives of millions of fellow people out there. I would really encourage people who are in their learning journey to keep going, find their energy and their motivation to keep going. Because it’s a beautiful thing and it’s worth it to really put investment in really enhancing the profession and also the knowledge of the industry itself.
Syafri Bahar: 01:05:07
Thanks a lot. And I think both of you also have inspired people with the podcast, and also especially for the aspired data scientist and data professionals out there. Thanks a lot for that contributing back to the community.
Kirill Eremenko: 01:05:25
Thank you, Syafri. It’s been a really cool podcast. And for those of our listeners who want to or would like to connect with you or maybe just follow how your career progresses, where are some of the best places to get in touch?
Syafri Bahar: 01:05:37
Yeah. I think the best way to get in touch with me is on my LinkedIn actually. I don’t have social media, like Instagram or Twitter, intentionally. But I think the best place to connect with me will be on my LinkedIn actually.
Kirill Eremenko: 01:05:53
Thank you. We’ll share.
Syafri Bahar: 01:05:54
And actually, I do [crosstalk 01:05:55]- yeah. Sorry.
Kirill Eremenko: 01:05:56
Sorry. You go ahead.
Jon Krohn: 01:05:57
Go ahead. Sorry. I didn’t realize who Kirill was speaking to. Syafri, you go ahead. You go ahead.
Syafri Bahar: 01:06:15
Thank you. Thank you. Thank you. I was actually thinking also sharing more materials and sharing some of more thoughts as well actually. I felt that I could have done it a bit better because, especially for a lot of aspiring data professionals in Indonesia. I think one of the things that I personally commit at least to 2021, so hopefully more content that I can share to the community as well in the future.
Kirill Eremenko: 01:06:29
Nice. Jon?
Jon Krohn: 01:06:33
I was just going to say that on the LinkedIn point that I don’t think, Syafri you shouldn’t feel ashamed that LinkedIn is your go to social medium, because I think Kirill and I feel exactly the same way.
Kirill Eremenko: 01:06:43
Yeah. Absolutely.
Syafri Bahar: 01:06:43
Okay. It makes me feel better at least that I’m not the only one there.
Kirill Eremenko: 01:06:50
Yeah. That’s the only one that I really use. I don’t think I use any other ones.
Jon Krohn: 01:06:57
Same.
Kirill Eremenko: 01:06:58
Yeah. Syafri, one final question for you. What’s a book that you would like to recommend to our listeners?
Syafri Bahar: 01:07:07
Yeah. In terms of books there’s actually quite a lot that I have in mind. But maybe just to select few of them, definitely Elements of Statistical Learning is a good start. I recently also get myself into more causal learning, basically because it happens to be that we’re in the space where we will need it a lot actually.
Jon Krohn: 01:07:26
Judea Pearl?
Syafri Bahar: 01:07:27
Yes, yes. That is for the mathematics. There’s also the title is What If? I forget the author again, but what I think is also a good mix of combinations of theory and practice as well. And Judea Pearl is definitely, if you’re into math itself, I think you will enjoy reading Judea Pearl’s book on that. And I also like-
Kirill Eremenko: 01:07:56
What’s it called? What is it called? The book.
Syafri Bahar: 01:08:00
The book of-
Kirill Eremenko: 01:08:01
Judea Pearl.
Jon Krohn: 01:08:01
Causality.
Kirill Eremenko: 01:08:04
Causality, okay.
Syafri Bahar: 01:08:05
Causality, yeah. Exactly. Yeah. And Elements of Statistical Learning is also a good book, as I mentioned earlier. And there’s also this 100-page machine learning book that I like to read from time to time as a refresher, because it condenses everything within one book.
Jon Krohn: 01:08:25
Nice.
Syafri Bahar: 01:08:25
Do you happen to recall again the name of the author, Jon? The 100.
Jon Krohn: 01:08:30
It’s Andriy. It’s so embarrassing, I can’t remember.
Syafri Bahar: 01:08:37
Burkov?
Jon Krohn: 01:08:38
Yeah, that’s right. Andriy Burkov. Exactly.
Kirill Eremenko: 01:08:42
Oh, yeah. Andriy Burkov. Jon, you might have him on the podcast sometime. We’ve been talking with him.
Jon Krohn: 01:08:47
Well, he’s been making quite a splash. I would love to have him on the podcast.
Kirill Eremenko: 01:08:52
He’s from Canada, right?
Jon Krohn: 01:08:53
I think he’s in Montreal.
Kirill Eremenko: 01:08:55
He’s Russian, ex-Russian but in Canada.
Jon Krohn: 01:08:59
Yeah.
Kirill Eremenko: 01:09:00
Awesome. Okay. Well Syafri, thank you so much. Jon, thank you a ton. It’s been a huge pleasure being part of this podcast. Been great.
Jon Krohn: 01:09:11
Same.
Syafri Bahar: 01:09:12
Sure. Yeah. Thanks a lot. Thanks, Kirill. Thanks, Jon.
Kirill Eremenko: 01:09:20
There you have it, everybody. Hope you enjoyed this episode and enjoyed the conversation we had with Syafri and Jon. I definitely had some great laughs, and there were lots of really cool insights shared in this episode.
Kirill Eremenko: 01:09:36
My favorite part was the use case that Syafri shared around CartoBERT: how they modified BERT and used it to analyze all those interactions between customers and drivers to figure out how best to optimize their logistics for pickups, and also how, as a result, it helped reduce the number of calls and basically improve efficiency.
Kirill Eremenko: 01:10:06
Also, I really enjoyed hearing what Syafri mentioned about meaning and purpose, that he is very excited to be helping people and contributing to improving people’s lives, as in that example that he shared of a driver who was able to adopt children. I think that’s very noble and I wish for all data science to ultimately result in great things for communities and people across the world.
Kirill Eremenko: 01:10:37
That would be very good, and if we all look out for that and try and strive to find jobs, and make our jobs about impact, I think that will help serve the world and also create more happiness around the world.
Kirill Eremenko: 01:10:55
As usual, you can find the show notes at SuperDataScience.com/427. That’s SuperDataScience.com/427. There you’ll find any materials that are mentioned on the show, all of the books that Syafri mentioned and Jon mentioned as well. Plus the URL to Syafri’s LinkedIn. We’ll also include the URL to where you can apply for a job at Gojek as a data scientist, if you would like to explore that further.
Kirill Eremenko: 01:11:24
Make sure to connect with Syafri, make sure to connect with Jon. They’re both open to connecting on LinkedIn. And yeah, you’ll hear more from Jon in the coming weeks as mentioned in the beginning. There’ll be this transition. I’ll talk more about that in the coming episodes.
Kirill Eremenko: 01:11:40
And yeah, on that note if you enjoyed today’s episode, make sure to share it with somebody. It’s very easy to share. Send them the link, SuperDataScience.com/427. And I look forward to seeing you back here next time. Until then, happy analyzing.