Kirill Eremenko: This is episode number 251 with CEO and Data Scientist at TypingDNA, Raul Popa.
Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur, and each week, we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today, and now, let’s make the complex, simple.
Kirill Eremenko: This episode is brought to you by our very own data science conference, DataScienceGO 2019. There are plenty of data science conferences out there. DataScienceGO is not your ordinary data science event. This is a conference dedicated to career advancement. We have three days of immersive talks, panels, and training sessions designed to teach, inspire and guide you. There’s three separate career tracks involved, so whether you’re a beginner, a practitioner, or a manager, you can find a career track for you and select the right talks to advance your career.
Kirill Eremenko: We’re expecting 40 speakers, that’s four, zero, 40 speakers to join us for DataScienceGO 2019. Just to give you a taste of what to expect, here are some of the speakers that we had in the previous years, Creator of Makeover Monday Andy Kriebel, AI Thought Leader Ben Taylor, Data Science Influencer Randy Lao, Data Science Mentor Kristen Kehrer, Founder of Visual Cinnamon Nadieh Bremer, Technology Futurist Pablos Holman, and many, many more.
Kirill Eremenko: This year, we will have over 800 attendees from beginners to data scientists to managers and leaders, so there will be plenty of networking opportunities with our attendees and speakers, and you don’t want to miss out on that. That’s the best way to grow your data science network and grow your career. As a bonus, there will be a track for executives. If you’re an executive listening to this, check this out. Last year at DataScienceGO X, which is our special track for executives, we had key business decision makers from Ellie Mae, Levi Strauss, Dell, Red Bull, and more.
Kirill Eremenko: Whether you’re a beginner, practitioner, manager, or executive, DataScienceGO is for you. DataScienceGO is happening on the 27th, 28th, 29th of September 2019 in San Diego. Don’t miss out. You can get your tickets at www.datasciencego.com. I would personally love to see you there, network with you and help inspire your career or progress your business into the space of data science. Once again, the website is www.datasciencego.com, and I’ll see you there.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast, ladies and gentlemen, super excited to have you on the show today because we’ve got a very exciting guest, Raul Popa, who is the CEO and Data Scientist at TypingDNA. What you’re about to experience on this podcast is very, very different to probably anything you’ve heard before because we’re talking about a brand new industry that is completely crushing it and disrupting everything we know about security, and this is the world of typing biometrics.
Kirill Eremenko: The idea behind typing biometrics is that based on how you type, whether it’s on your laptop or computer or on your mobile phone, it can be established that you are you. You can be identified just as that can be done with your fingerprint or with facial recognition, same thing can be done through the patterns that you use for typing. As you can imagine, that can completely revolutionize how we identify people, the whole world of two-factor identification. It’s also a very passive process, non-intrusive. You don’t have to get an SMS and type in a code. It just happens in the background.
Kirill Eremenko: Raul Popa is the CEO and the Data Scientist at a company called TypingDNA that is spearheading this whole industry, one of the leading companies in this space. Today, we’ll get to hear from him all about this world. You’ll learn about typing biometrics, what it is, how it works, how machine learning and data science enable and propel this industry forward. You’ll also hear about different applications ranging from making sure students don’t cheat on exams all the way to two-factor authentication for banks and other financial institutions.
Kirill Eremenko: The fact that Raul is both the CEO and the data scientist at the same time makes this conversation that much more interesting because we get to dive into both worlds. From the perspective of data science, we talked about pattern recognition, anomaly detection, one-shot learning, binary classification, data sampling, generative algorithms and more. From the world of business, we talked about what it’s like to run a data science startup and going from idea to research to making a business happen. As you can imagine, we’ve got lots of interesting things to cover.
Kirill Eremenko: Before we dive in, I want to give a shout out to the fan of the week. This one is from Andy who said, “The FiveMinuteFriday episodes always feature an insightful look into a unique topic, meditation, the importance of enjoying the moment, how to maximize efficiency and continuous inspiration to supercharge our own careers. Thank you, Kirill.” Thank you, Andy, so much. For the rest of you guys out there, if you haven’t left a review yet, make sure to head on over there on your podcast app or just go to iTunes, and you can leave a review on the SuperDataScience podcast. It would mean a ton to me. I would be very, very excited to read your review. All right, guys, I’m super excited about this one. Let’s dive straight into it. You will learn plenty about running data science startups and how the world of typing biometrics works. Without further ado, I bring to you Raul Popa, the CEO and Data Scientist at TypingDNA.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast, ladies and gentlemen, super excited to have you back here on the show. For today’s episode, we’ve got Raul Popa joining us from New York. Raul, how are you going today?
Raul Popa: Hi, I’m fine, everything is fine, welcome to your audience and nice to meet you.
Kirill Eremenko: That’s awesome. That’s awesome. Thank you for coming on the show. I like how we were chatting before the podcast that because you’re based between Romania and New York, we’re talking about the weather in New York, and it’s like 17 degrees, and usually, when I talk to somebody from the US, it’s in Fahrenheit. That’s very cool. How do you find adjusting to the weather in the US, when everything is in Fahrenheit and you are used to Celsius?
Raul Popa: I just set it on Celsius on my phone, and everything is fine. I don’t really know it’s Fahrenheit. I know that I have to set temperature in the apartment around 70, but I don’t really know what that means in Celsius.
Kirill Eremenko: Yeah. Yeah, it’s so interesting. It’s a non-trivial conversion from Fahrenheit to Celsius. I personally get confused all the time, and I just wish there was one system around the world. Nevertheless, how are you enjoying New York? Have you been there for a long time this time around?
Raul Popa: Yeah, only for two weeks. I used to spend more time. Last year, I’ve been here about five months. The rest of the time, I’ve been in, mostly, Europe and Romania.
Kirill Eremenko: Okay. Got you. You’ve done quite a few presentations and pitches through the nature of your business. You’ve even done a TEDx talk, but this is your first podcast, so congratulations. I think our listeners are going to be very excited to hear what you’re about to share. Are you excited about this?
Raul Popa: I’m really excited, yeah.
Kirill Eremenko: Awesome. Awesome.
Raul Popa: Maybe a bit nervous as well.
Kirill Eremenko: That’s totally normal. That’s very normal. All right, so Raul, tell us a bit about yourself. You’re running this very innovative, different business that many people haven’t even heard of this type of technology before, and by the looks of it, from what I’ve seen, what I’ve read, you guys are really crushing it. To get our listeners up to speed, please tell us what is TypingDNA and how did you come up with this idea.
Raul Popa: Yeah, so I’m CEO and Data Scientist at TypingDNA, and this is called typing biometrics, a behavioral biometrics company, basically. We look at how people type and build behavioral biometrics profiles that we use for authentication and fraud prevention. From an AI perspective, I’m more into pattern recognition, anomaly detection, one-shot learning, and binary classification. I know it sounds trivial, but being able to give a yes or no answer in a fraud problem is really tough, and the difference between 90 and 95% accuracy matters a lot.
Raul Popa: The techniques behind our technology are exponentially more complex than what you would typically think for solving a binary classification problem or a one-class classification problem if you want, so just to understand, from a machine learning point of view, how it looks. I think what we’re building looks innovative from the outside. From the inside, it’s just pattern recognition. It’s just applied to something that machine learning was not applied to before, or not to this extent.
Kirill Eremenko: Okay. Okay. Got you. That’s very exciting. For us to get a better understanding, like an intuitive understanding of how this works, let’s say you’re working … You partner up with financial institutions. You partner up with banks, other kind of companies that … Tell us a bit about that. What kind of companies use TypingDNA services?
Raul Popa: Yeah, so we’re at the beginning now. TypingDNA itself is at the beginning, but basically anywhere where you would need another factor for authentication or another security layer other than a simple password or anything like a push identification or one-time passwords sent via text message, you will probably want to use TypingDNA.
Kirill Eremenko: Okay.
Raul Popa: We started with proctoring companies or online assessment companies, companies like ProctorU or Mind Prov. Actually, they verify students when they take exams, and before using our technology, they were using real people. It’s more expensive like that. We helped them reduce people, and everybody is happy. Students can take exams faster, and they cannot cheat as much as they did before.
Kirill Eremenko: Okay. Got you. Very interesting, and so tell us a bit about how this would work in the background. Let’s say a financial institution, a bank is using TypingDNA to authenticate their users. I’m logging onto my online account. I get to my computer. I type in the address of the URL of the bank’s online portal. I type in my name. I type in my password. Then I click Log In, and this is the point where I’d normally get an SMS to verify that it’s me. Where in that process does TypingDNA come in? Where do your algorithms start to recognize me?
Raul Popa: You don’t really have to get the SMS if you’re using TypingDNA. That’s the point. Anywhere you type something, like your email password, credit card, anything that you previously typed, you can use … An application can use TypingDNA to record the timing between the keys, how long you keep each key pressed. These are the kind of information that we’re looking at. Based on that, we build a profile that I told you about, and we can do authentication right then and there and use that as a two-factor.
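As a rough illustration of the two timing signals Raul mentions here (how long each key is held, and the time between keys), the sketch below computes them from a list of key events. The `KeyEvent` type, the field names, and the sample timestamps are all hypothetical, made up for this example; nothing here reflects TypingDNA's actual recorder or API.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key: str
    press_ms: float    # time the key went down
    release_ms: float  # time the key came back up

def timing_features(events):
    """Return (dwell_times, flight_times) for a typed sequence."""
    # Dwell time: how long each key is held down.
    dwell = [e.release_ms - e.press_ms for e in events]
    # Flight time: gap between releasing one key and pressing the next
    # (it can be negative when keystrokes overlap).
    flight = [nxt.press_ms - cur.release_ms
              for cur, nxt in zip(events, events[1:])]
    return dwell, flight

# Hypothetical timestamps for someone typing "hi!"
sample = [
    KeyEvent("h", 0.0, 95.0),
    KeyEvent("i", 140.0, 220.0),
    KeyEvent("!", 300.0, 410.0),
]
dwell, flight = timing_features(sample)
print(dwell)   # [95.0, 80.0, 110.0]
print(flight)  # [45.0, 80.0]
```

Concatenating the two lists gives one numeric vector per typing sample, which is the kind of input a profile could be built from.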
Kirill Eremenko: Got you.
Raul Popa: Also, financial institutions can use our technology for employee-facing authentication. Nowadays, for 2FA, they typically use a hard token or a push identification system, where you’d have to install something on your mobile phone and use that. I think that’s really not something very user-friendly, so we offer more like a frictionless solution. We just look at how people type. If that fails, and sometimes it fails, we can always go and use the one-time password over SMS or the push identification or anything else.
Kirill Eremenko: Okay. Got you. How much typing is required to authenticate a user? If I just type in my email, is that enough, or do you need me to type in maybe 20 words or 50 words? What is the minimum amount of words required to authenticate somebody?
Raul Popa: I think this is where things get really interesting. For data scientists, I think this is where things get really, really interesting because to build a good model, you will need 15 samples at least, 15 previous typings, so a person typing 15 times. You asked for the length of the text, we can work on text of eight characters or 20 characters. We found that a lot of use cases have around 28. This is the average of use cases that we found, around 28 characters, like email plus password combination, or credit card name, or other things like that, but also username, password is a bit less, but still, 20, 20-something plus usually work really well.
Raul Popa: We can do authentication even with just one or two samples. I know this is something that machine learning was not supposed to be used for, but we actually do that. This field of one-shot learning, we’re using very few samples to do prediction, to predict whether somebody is the same person or not with the owner of the account. That’s really interesting.
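One simple way to picture verification from a single previous sample is a nearest-sample distance check, sketched below. This is only an assumption-laden toy: the Euclidean distance, the `threshold` value of 40.0, and the timing vectors are all invented for illustration, and Raul does not say which method TypingDNA actually uses.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length timing vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(enrolled, attempt, threshold):
    """Accept if the attempt is close enough to the nearest enrolled sample."""
    nearest = min(distance(sample, attempt) for sample in enrolled)
    return nearest <= threshold

# Hypothetical timing vectors in milliseconds (dwell times then flight times).
enrolled = [[95.0, 80.0, 110.0, 45.0, 80.0]]   # a single previous typing sample
same_user = [100.0, 78.0, 105.0, 50.0, 85.0]
other_user = [60.0, 130.0, 70.0, 120.0, 20.0]

print(verify(enrolled, same_user, threshold=40.0))   # True
print(verify(enrolled, other_user, threshold=40.0))  # False
```

With so little enrolment data the threshold carries most of the weight, which is one reason matching on one or two samples is harder than matching on 15.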
Kirill Eremenko: That’s very cool. How unique are these typing patterns? With fingerprints, they’re pretty reliable. With facial recognition, even more reliable. How about typing patterns? What are the chances of two people having the same typing pattern?
Raul Popa: This is a good question. I want to say it’s as … I think the technology is at the beginning, and we will see more improvements in this. Depending on how much text you’ve got from the user, if the user typed a sentence 10 times, then you can capture the uniqueness of his typing. It will be really hard to break. If you only want to do authentication after just one or two samples, then the authentication, the match will have some potential error, some potential false positives or false negatives exist there.
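The false positives and false negatives Raul refers to are usually reported as a false accept rate and a false reject rate, and both depend on where the decision threshold is set. The sketch below computes them for a few thresholds; the distance scores are invented for illustration and are not measurements from any real system.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Scores are distances: lower means more similar to the account owner.

    Returns (false_accept_rate, false_reject_rate) at the given threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    false_accepts = sum(1 for s in impostor_scores if s <= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return far, frr

# Hypothetical distances for attempts by the real owner and by other people.
genuine = [8.0, 12.0, 15.0, 44.0]
impostor = [35.0, 90.0, 120.0, 150.0]

for t in (20.0, 40.0, 60.0):
    far, frr = error_rates(genuine, impostor, t)
    print(t, far, frr)
```

A stricter threshold lets fewer impostors in but rejects the real owner more often, so the two error rates have to be traded off against each other.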
Kirill Eremenko: Yeah. Okay. You would say it’s almost as reliable as a fingerprint when you have enough data to authenticate the user, let’s say, 10 samples we have or 15 samples.
Raul Popa: Yeah, I’d say that, but I’d say that fingerprints and face recognition, for example, on the other side are so public. I mean, you leave your fingerprints everywhere, your face is everywhere, so somebody can snap a photo of you at high resolution and use that to authenticate pretty much anywhere other than face ID. With typing biometrics, it’s not the same. You can ask in your own application for that user to type a specific name that he will never type in other platforms, like random combinations of words, his password, or anything like that. To get how that person types that exact text in a different environment is almost impossible. We’re looking at something that has more reliable character like typing pattern. From a reliability point of view, I think that it can be even more reliable than fingerprint or face.
Kirill Eremenko: Wow, I’m still sitting here in amazement because I’ve never considered this to be an option for authenticating. Well, in that case, if you … You have somebody who types in their password, but what about tools that teach people how to type? There’s the whole fast speed typing learnings where you can use the QWERTY or the Dvorak keyboard layout to type faster. Let’s say somebody is using the standard QWERTY keyboard layout and they learned how to type faster, so they learned those techniques for typing. Now, you have, let’s say you have a thousand people who learned using the same typing program, and they’re very diligent. They got very good at it. Wouldn’t they have very similar typing patterns?
Raul Popa: I don’t think they are similar because their size of fingers and how they use the fingers and the muscles differs. If people would take the same running classes, would they run the same? Would their walking style be the same? I don’t think so. It could be very similar, but still, there is a lot of character in watching somebody typing or watching somebody walking or even speaking. Even if you sing at the opera and other 10 singers, you would not maybe distinguish the opera voice between those, but when you talk like normal people, your voice will be singular. Right?
Kirill Eremenko: Mm-hmm (affirmative).
Raul Popa: This is only intuition. I’m just talking about what my intuition says, but from our test, we adapt. For example, it’s like face recognition adapts when you grow a beard, or-
Kirill Eremenko: Or you’re wearing sunglasses.
Raul Popa: Yeah. Yeah. Yeah.
Kirill Eremenko: Okay. Got you.
Raul Popa: We adapt when people start typing faster or differently, but if it’s very different, then we have to fall back to a different method for authentication, for example, but not all use cases are about authentication. We can do all sorts of other things, like making sure somebody is not sharing accounts with other people, like all sorts of things. We have about 20 different use cases that we identify.
Kirill Eremenko: Got you. Okay. Well, that’s very cool already, so I hope our listeners are as excited as I am to dive into this. I really love that you are both the CEO and the data scientist, and you actually point that out in your LinkedIn that you perform both roles in your company. It’s very cool. That means we can dive into the whole notion of setting up a business about a really cool idea and about some research, a data-driven business, and on the other hand, the techniques and algorithms that allow for all this to happen, for this technology to work.
Kirill Eremenko: Probably, let’s start on the data science side of things. We already touched on a little bit on the pattern recognition, anomaly detection, one-shot learning, binary classification. What can you share? I know a lot of this would be proprietary information and that you can’t share freely on this podcast. Nevertheless, what can you share with our listeners that might be exciting for them? What kind of machine-learning algorithms are used in typing biometrics these days? What’s new? What’s hot, and what can they look into if they ever want to get into this field?
Raul Popa: It’s a tough question. For TypingDNA, I cannot go into details, unfortunately. I can say one of the most important things, greatly underestimated, not just in typing biometrics, but in any kind of machine-learning applied technology is data sampling. We had a lot of misfortunes at the beginning of building TypingDNA because we didn’t address that correctly. Whenever you deal with fraud prevention, anomaly detection, you’ll always find extremely unbalanced sets of data, also very noisy, so it’s easy to throw away 90% of your extra data, but it’s not always an option.
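To make the sampling point concrete, here are two common ways of handling an unbalanced fraud dataset: throwing away majority-class rows (undersampling), and keeping every row but weighting the classes. Both functions and the 5-versus-95 split are illustrative assumptions, not a description of how TypingDNA samples its data.

```python
import random

def undersample(samples, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [samples[i] for i in kept], [labels[i] for i in kept]

def class_weights(labels):
    """Keep every row, but weight each class inversely to its frequency."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * k) for c, k in counts.items()}

# Hypothetical dataset: 5 fraud attempts (label 1) among 100 logins.
labels = [1] * 5 + [0] * 95
samples = list(range(100))

_, balanced = undersample(samples, labels)
print(sum(balanced), len(balanced))   # 5 10
print(class_weights(labels)[1])       # 10.0
```

Undersampling here discards 90 of the 95 legitimate logins, which is the "throw away 90% of your data" option; class weights are one alternative when that loss is not acceptable.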
Raul Popa: Generative algorithms also seem to work really well on, mostly, all levels, I mean, regardless of what you’re building, I suggest people should try to generate data as much as possible. Make sure you don’t compromise your testing and cross-validation sets, however, but definitely generate data if you can. It can help you improve your general accuracy no matter what you’re building.
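The warning about not compromising the test set can be sketched as: split first, then generate synthetic samples from the training portion only. Note that the example below uses plain Gaussian noise on timing vectors as a stand-in; it is not a generative model, and the noise scale, number of copies, and sample values are assumptions chosen for illustration.

```python
import random

def split(samples, test_fraction=0.2, seed=0):
    """Shuffle and split into (train, test) before any data is generated."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def jitter(sample, rng, scale_ms=5.0):
    """Make a synthetic sample by adding small random noise to each timing."""
    return [t + rng.gauss(0.0, scale_ms) for t in sample]

def augment(train, copies=3, seed=1):
    """Return the real training samples plus `copies` noisy variants of each."""
    rng = random.Random(seed)
    synthetic = [jitter(s, rng) for s in train for _ in range(copies)]
    return train + synthetic

# Hypothetical timing vectors (milliseconds) from one user.
real = [[95.0, 80.0, 110.0], [90.0, 85.0, 100.0], [99.0, 79.0, 108.0],
        [93.0, 82.0, 104.0], [97.0, 77.0, 112.0]]

train, test = split(real)
train_aug = augment(train)
print(len(train), len(test), len(train_aug))   # 4 1 16
```

Because the test set only ever contains real, untouched samples, the accuracy measured on it is not inflated by near-duplicates of training data.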
Raul Popa: A big part of what we’re doing is called one-shot learning, being able to predict the class just by seeing one single sample or very few ones. Techniques like transfer learning might work well here, just there’s not enough information out there about what to do and how to do that. I was researching a lot about how to do one-shot learning because I really wanted to make the technology work for just one previous sample or two previous samples because, otherwise, it’s not efficient.
Raul Popa: Also, since we’re talking about security, so our technology is used to prevent fraud for security purposes. A good technique is to use multiple algorithms [inaudible 00:23:28], stacked generalization or blending, but unlike for Kaggle competitions, stacked generalization works well for security algorithms because it’s harder to break multiple algorithms at the same time. If you’re a hacker and you want to break these algorithms, it’s really hard.
Raul Popa: You probably know about adversarial examples used to trick traffic signs, probably use [crosstalk 00:23:57], stop signs that you stick some tape on it, and all of a sudden, self-driving cars will recognize those as 45 miles limit, speed limit instead of stop sign, and that’s a huge thing. Right?
Kirill Eremenko: Yeah.
Raul Popa: Adversarial glasses used to trick face recognition systems to believe you’re somebody else. Using completely different types of machine-learning algorithm in your production will help reduce the ability for a hacker to hack into your system with a, sort of, master key if you want, but this is something that I suggest people would do if they do anything related to security or to fraud prevention or to verify users in any way, anything related to authentication or identity. Yeah, so unlike Kaggle, where people do blending and stacking to get a better accuracy. Here, you’re not after the best accuracy. You’re after a better chance of succeeding, or a smaller chance of succeeding for hackers.
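The idea of making an attacker beat several different algorithms at once can be sketched with two deliberately different checks that must both agree. This is a simple all-must-accept rule rather than stacking with a trained meta-model, and the two checks, their thresholds, and the data are assumptions invented for the example.

```python
import math
from statistics import mean, stdev

def distance_check(enrolled, attempt, threshold=40.0):
    """Model 1: Euclidean distance to the mean enrolled vector."""
    centre = [mean(col) for col in zip(*enrolled)]
    d = math.sqrt(sum((a - c) ** 2 for a, c in zip(attempt, centre)))
    return d <= threshold

def zscore_check(enrolled, attempt, max_z=3.0):
    """Model 2: every feature must lie within max_z standard deviations."""
    for col, a in zip(zip(*enrolled), attempt):
        spread = max(stdev(col), 1.0)  # floor avoids dividing by a tiny spread
        if abs(a - mean(col)) / spread > max_z:
            return False
    return True

def accept(enrolled, attempt):
    """Accept only when every model agrees, so an attacker must fool all of them."""
    checks = (distance_check, zscore_check)
    return all(check(enrolled, attempt) for check in checks)

# Hypothetical enrolled timing vectors (milliseconds) for one user.
enrolled = [[95.0, 80.0, 110.0], [90.0, 85.0, 100.0], [99.0, 79.0, 108.0]]

print(accept(enrolled, [94.0, 82.0, 105.0]))   # True: both checks pass
print(accept(enrolled, [94.0, 82.0, 135.0]))   # False: the z-score check objects
```

In the second attempt the overall distance is still under the threshold, but one feature is far outside the user's usual range, so the second model catches what the first one lets through.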
Kirill Eremenko: Got you. That’s very interesting about the comparison to data science competitions versus the real world and different objectives. What would you say your experience with research has been in data science? We talked a little bit about, in competitions, you just want many models to get the best accuracy. In the real world, you want many models, and specifically in this use case, to get the best security. What about research? How was your process of researching this technology using data science?
Raul Popa: I learned a lot. I did some of the main data science online courses, tried to follow masters, Andrew Ng, Geoff Hinton, Yann LeCun, Yoshua Bengio, Ian Goodfellow, so forth. Basically, I read stuff that was new to data science or to machine learning and trying to understand what are all these things going, where are all these things will … Where it will connect without a purpose in mind, like doing typing biometrics or other pattern recognition problems. I just wanted to learn more.
Raul Popa: A few years ago, there were not a lot of resources, not a lot of frameworks and libraries, and you had to basically code everything yourself, and understanding the math, use the great apps. It was really important. For example, I never did Kaggle competitions. I really think people doing that are really crazy. I know a few guys who really scored on top of that. I know building hundreds and hundreds of models and stacking them and getting rid of half of your work or 90% of your work just to make a small minor improvement takes a lot of time, a lot of ambition. I could have never done that. Yeah, but one of the things that I really like is you have the Winner’s Interview on Kaggle blog, those I really like to read, and I still read them, really cool.
Kirill Eremenko: Okay. Got you. Got you. From your research, first of all, you mentioned before, back in the day, there weren’t so many frameworks available, and you had to go into the math and that was very helpful. Now that we have things like TensorFlow, PyTorch, and other tools that make it easier to create, let’s say, for instance, in this case, deep-learning algorithms, AI, would you say that learning the math is essential, or people can get started faster without having to learn the math?
Raul Popa: You don’t need to know the math anymore. I mean, you can if you really want or if you want to develop the field. If you want to advance the field, yes, you definitely have to understand the details to create that. Other than that, just to use machine learning for your standard problems like typical computer vision, recognize objects, classify things, definitely, you don’t need to know the math.
Kirill Eremenko: Got you. Okay. Then going from research to building a company, so tell us a bit about that process. How did the research you were doing turn into the idea of actually starting a company, and what are some of the challenges that you faced with building a startup out of research?
Raul Popa: Yeah, I spoke at a few events about this, and they keep inviting me to talk about this because it seems like a lot of data scientists want to make the move to start a business or do a startup, and it’s not really easy because they’re very, very good at what they do. We all know data scientists are paid really well, really hard to break from that lifestyle and make a company where you will work forever, 24/7. You will never see the light at the end of the tunnel. That’s really hard.
Kirill Eremenko: [Inaudible 00:29:49] Yeah.
Raul Popa: Yeah. Yeah.
Kirill Eremenko: Eat rice and beans, live on a friend’s couch, very difficult.
Raul Popa: Exactly, so it’s really, really hard. I wouldn’t advise a lot of people to do this unless they’re risk-takers. There is a case where you research something and you find that thing that nobody advanced it enough like I found typing biometrics. There are so many fields where the research didn’t go deep enough, and you just know that you have a … You can say it’s a breakthrough, but maybe it’s just something that you saw in a different domain. You just apply that and realize, “Hey, I can use this to advance this field.” At that point, you can actually use that to create a company or to start a product and actually solve the problem. I think that’s like a calling. If you have that, why not start a company? Right?
Kirill Eremenko: Okay. Do you think somebody can start a company while they have a normal nine-to-five job and see how it goes, test things out?
Raul Popa: I don’t know a lot of people who managed to do that. I think it’s really hard sometimes, and if you have a safety net to go back to like a normal job, you would just do that. You would have to let go of that job and live on whatever you have saved or raise some money, find co-founders and start a company. It takes a lot of courage, but the good thing is that you can always go back to work for Goldman Sachs or whatever company pays you.
Kirill Eremenko: Yeah. Yeah, so you kind of have to burn the bridge to force yourself. Let’s just say if you want to take the island, burn the boats, right, something like that, so there’s no way back.
Raul Popa: Yeah, definitely. Yeah. Yeah. Some of the challenges as data scientists starting their company were around creating production-level software, so basically, you start with the research, you have models. You have stuff like that. You have to turn that … We turned that into an API that is completely scalable, can do millions of transactions in a day. To do that is not complicated, but it’s not easy science. It’s a different type of science than data science. It takes a bit of ambition to learn it.
Raul Popa: Also, for what I did, I didn’t start with a lot of data that I got from university or something like that. Actually, so gathering data, lots of data was really, really hard. If you’re a startup or you’re building a startup, it’s almost impossible to do it with small data. For example, I started with 200 friends. I sent them a link and asked them to type some text and do a very quick survey. After that, I went to my Mensa group, the high IQ society?
Kirill Eremenko: Men-
Raul Popa: Those high IQ society called Mensa.
Kirill Eremenko: No, I haven’t heard of them.
Raul Popa: Yeah. I am a member of that.
Kirill Eremenko: Okay.
Raul Popa: I asked them to help. I told them that I can build a classifier based on the people typing patterns that will try to differentiate between regular people and high IQ ones. They loved it. I got about 400 people to take the test and complete the survey with things like age, gender, personality profile, stuff like that.
Kirill Eremenko: Yup.
Raul Popa: After that, I used the data to build a fun test. It wasn’t enough data to build authentication systems, with only about 400 people. I built a fun test titled Find Out What Your Typing Says About You. I put it out there. In two days or so, it went viral. People started sharing to their friends and forums and personality forums, and it got on Reddit at some point. One morning, I woke up with about 20,000 samples in my database. The server was down. I realized that “Hey, I have more data than needed to start my research,” but to get there, I had to trick people to help me.
Kirill Eremenko: Well, trick, I wouldn’t say trick. You just encouraged them in different ways. You gave them back something that they-
Raul Popa: Yeah, actually, I created some algorithms trying to classify things like gender or age or IQ based on how you type.
Kirill Eremenko: Yeah.
Raul Popa: There are some similar characteristics between people that shared the same attributes like age, for example, and you can see them typing in a different way. We got 60, 70% accuracy, and so not a lot, but for a fun test, was really fun. People were really intrigued. A lot of people like that, they didn’t get the right MBTI profile name or they’ve got different gender. They were questioning their gender now. I mean, it’s just a fun thing.
Kirill Eremenko: Nice. Yeah. Got you. You linked it up to the Myers-Briggs personality test, right?
Raul Popa: Yeah.
Kirill Eremenko: Yeah, that’s smart. That’s pretty cool. Basically, what you’re saying is that when somebody is moving from being a data scientist and having an idea, maybe even doing some preliminary research on that idea, seeing that they can break into a field and revolutionize something, and moving from there to actually building a company, there’s a lot of challenges along the way from productization to gathering data, and probably lots more other challenges. Maybe we’ll talk about a couple more.
Kirill Eremenko: You need to be prepared for them, and you need to also think outside the box this whole [inaudible 00:36:01]. In this case, with the data situation, like the fun test. I think that’s a genius idea and will get you data because data is value, right? Data is valuable. You can go scrape the web for data. You can go buy data. You can go do a fun test like that to get the data, but you need to consider all of these things before you dive into starting a data science business, right?
Raul Popa: Yeah, totally. Yeah. I think being able to think creatively about how you gather your first data or to think in steps, first, you do this, then you do that, but then you have real data to do your research. I think it’s really important.
Kirill Eremenko: Got you.
Raul Popa: You can partner with some other companies or universities. Sometimes, you can do that. Other times, really, really hard.
Kirill Eremenko: Okay. What are some of the other challenges that you face when starting your data science or typing biometrics business?
Raul Popa: Well, there are business problems, stuff like you need more people or you need people to help you with marketing and research, like research on the marketing side, market research, sales. You have to meet investors, extremely hard. You need investors. They want to see someone who knows everything, who’s able to sell, who’s able to do research, who’s able to manage people and everything like that. I’m not saying I’m not that person. I’m saying nobody is that person. It’s really hard to be a perfect dude to do a machine learning algorithm to a business, basically.
Kirill Eremenko: Got you. Okay. Okay.
Raul Popa: You have to read a lot of things about that. The internet has plenty of links.
Kirill Eremenko: Yeah. Yeah. I can totally attest to that, that you … Running a startup is not just about an idea. You need to also have a business mind or start learning to have a business mind, and that’s completely different, or maybe you can partner up with somebody right? You can stay the data scientist, and somebody else can be your sales director or your business development director, something like that, chief operations officer, somebody who’s going to help you grow the business. That’s also another thing to consider.
Raul Popa: That’s what I actually did. I asked two friends of mine that were helping me from the side to join the project, Christian and Adrian. One of the things that we did is apply to accelerators. We got into Techstars in New York City last year, and that really helped us with investors, and helped cover the gaps that we had at that point, really, really valuable for us. Also, we’re from Romania. We started in Romania, Eastern Europe. It’s really hard to build something from that side of the world and get global exposure, so we had to move to the US to do that.
Kirill Eremenko: Yeah. Yeah. Got you. Now, you’re half-half, US, Romania, right?
Raul Popa: Yeah. Yeah. Yeah. I’m half-half. Right.
Kirill Eremenko: The team as well, like you got some people in the US, some people in Romania.
Raul Popa: Yeah, but our R&D is still in Romania, for different reasons. I really think that Romania has a lot of talent.
Kirill Eremenko: Yeah, definitely, Romania has a lot of talent. I wanted to ask you though, how do you find combining your role as a data scientist and as a CEO because, previously, from our discussion, I think everybody got the gist that you are actually very involved in the algorithm and you do quite a lot of research. You’re very up to date with the technological part of things. That requires a lot of time from what I can imagine. At the same time, running a business requires a lot of time as well, meeting clients, doing marketing, making business decisions, scaling, growing, things like that. How do you find combining those two, and plus, as well, I know you’re a father, a husband. How do you combine all those things? When do you find the time?
Raul Popa: I don’t know. I don’t have a prepared answer for that. I think that the key is that I really like the data science part. It’s like a hobby for me. The CEO part or the founder part makes sense because I really think typing biometrics can make a difference. I think in the future people will type more than ever. Today, we’re communicating through voice or typing. I think if you look at young people, like my 11-year-old daughter that you mentioned, every time I talk to her, even if we’re in the same room, she would WhatsApp me, right?
Kirill Eremenko: Yeah. In the same room, she would WhatsApp you.
Raul Popa: Yeah, so it’s like an asynchronous type of communication in which you type whenever you want. The other person replies whenever they want. I think this is the future of communication. If we look at young people, we know that. It’s a no-brainer. I had to do this because I realized that we will type more than we used to, both with devices and with other people. I think having a layer of security based on typing is really a key to a better world in a way.
Raul Popa: I have sort of motivation to be a CEO and founder, and I have love for data science, so it’s like a hobby or something that I really like to do. I have to find time for everything, of course, family as well. For sure, of course, you can work full days on whatever you need to, but eventually, you have to find time for family and friends. I think that’s really important.
Kirill Eremenko: Okay. Got you. Just to clarify because I only realized this now when you’re talking about WhatsApp and typing on your phone, so TypingDNA and, in general, typing biometrics is not only designed for keyboards and computers. It can also work on your phone. Am I getting that right?
Raul Popa: Definitely.
Kirill Eremenko: Wow.
Raul Popa: Yeah, it’s a different thing. It’s a different thing. It’s not the same thing, but we’re not focused only on desktops. We also have algorithms for mobile phones. On mobile phones, we think we’re even better at some points. Mobile phones, you move them a lot. You have pressure and you have a lot of things that you can look at when people type. Yeah, we’re quite good on mobile phones, and we have a few really important projects on that.
Kirill Eremenko: You can also measure not just the typing speed and how, in some phones, how much force people applied to press the buttons, but also how the people are holding the phone, orientation in space, as you said, how they’re moving their phone as they’re typing, those types of things.
Raul Popa: Yeah. Yeah. Yeah. Yeah, so we can do those as well. As I said, whether people are using mobile phones or desktops differs a lot by use case. I think in enterprise, if you look at what people use to perform their tasks, it is probably always going to be their computer. You’ll rarely see people using mobile phones or tablets in their office for office business. It’s like for personal rather than business. On the other side, you have personal communication and entertainment, and typically, you use mobile phones for that. We have to understand both of these.
Raul Popa: When we’re talking about banks or financial apps, people are using both mobile and desktops. There is a lot of focus on mobile lately, and so people are checking their bank accounts over mobile phones. They’re typing in some PINs or some other sensitive information, so we can look at how they type this and use that for authentication and flagging suspicious users and so forth.
Kirill Eremenko: Got you. One of the fears that entrepreneurs or entrepreneurs-to-be have, so data scientists that may have even come up with an idea, a really genius idea that can revolutionize the whole industry, one of the fears that stops them from starting a business is the fear that as soon as it starts, and if it proves to be working, a large company can come in with lots of R&D, lots of funds, budget, presence, huge user base.
Kirill Eremenko: We’re talking likes of Google, Facebook, LinkedIn, Microsoft can come in and just copy the idea, not necessarily like create a solution for the same problem. You’ve identified the problem, but somebody else can come in and do the same thing. How did you feel about that when starting your business? Did that hold you back or not at all? How has that played out? Do you have any major competitors at the moment?
Raul Popa: What you’re saying is a real thing. It happens every time, right? Think about this, this is not a bad problem to have. It’s actually a very good problem to have. Google coming in the game, just saying, just dropping a name, right, means validation for our technology, means that this is mainstream now, means that anyone who started working on the technology like three or four or five years ago has a leg up, right? It means that other competitors of Google will want to buy a company, or investors will say, “Hey, we want to fund this company because this thing is mainstream, and they have a leg up.”
Raul Popa: I think, definitely, a good problem to have. Clients will want to use the technology that we’re building rather than other alternatives that are not typing biometrics. Of course, some of that competition will go to Google in that case, probably most of it, but when a big player like that comes into a new space, I think they create a large ocean that didn’t exist before that you can benefit from as well, and everyone in the space will benefit at some point [crosstalk 00:47:18].
Kirill Eremenko: Got you. Got you. Basically, you’re saying it’s better to have, I don’t know, 1% of a huge pie than 100% of a tiny pie.
Raul Popa: Yeah, I know it sounds less. 1% always sounds less than 100%. I think what you said is true.
Kirill Eremenko: Yeah. You imagine if you have 100,000 users and you have 100%, or you have 1% of 7 billion users. Better, 7 billion.
Raul Popa: Yeah. Yeah.
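To make the pie comparison concrete, here is the arithmetic as a small Python sketch. The user counts are only the hypothetical figures Kirill mentions (100,000 users and 7 billion users); they are not real market data for TypingDNA or anyone else.

```python
# Hypothetical figures taken from the conversation, not real market data.
small_market_users = 100_000        # the "tiny pie"
large_market_users = 7_000_000_000  # the "huge pie"

# 100% of the small market versus 1% of the large market,
# using integer arithmetic to avoid rounding.
all_of_small = small_market_users * 100 // 100
one_percent_of_large = large_market_users * 1 // 100

# How many times bigger the 1% slice is than the whole small market.
ratio = one_percent_of_large // all_of_small

print(all_of_small, one_percent_of_large, ratio)
```

Under these assumed numbers, the 1% slice is 70 million users, which is 700 times the entire small market.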
Kirill Eremenko: Got you. Got you. Okay, very, very cool. How is this industry right now? We didn’t talk about this. How long have you been working on TypingDNA and has this industry or this technology matured from your perspective? What’s the competitive space right now?
Raul Popa: Yeah, I’ve been working on this technology since 2014 as a side project. Actually, I researched for two years. I actually started working in 2016, but I already connected most of the dots, and I knew that I can do it, sort of. I had the conviction that I can pull this off. Regarding the competition space, a lot of people tried to create algorithms, recording typing patterns. There are patents that are almost 30 years old in the space. I mean, now, they are in the public space, right?
Kirill Eremenko: Yeah.
Raul Popa: People try all sorts of things, but without machine learning, statistical models are not really accurate. You can use them, but before I started, the most valuable technologies that were built around this thing needed you to type for a day or two for the technology to be able to recognize you with a 70-something percent, 80% accuracy. We can do zero false positives with three samples of you typing in your email and password, for example, I think, for your credit card. That’s like rare, right? We don’t want to replace passwords. We just want to have a second layer of security that can go with anything that you type.
Raul Popa: Also, we have an algorithm that works when somebody types anything. You can type in the chat window and just chat with somebody else, and we’ll look at what you’re typing statistically and we can say … I mean, we create a statistical profile of your typing. We don’t record exactly what you’re typing, only how you type letter A, letter B, stuff like that, right, the averages, standard deviations, that kind of thing, and we build a profile, and whenever you’re typing again, we can do matching and that’s the machine learning part.
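The profile idea described here (per-letter averages and standard deviations, then matching a new sample against them) can be sketched roughly as follows. This is a minimal illustration and not TypingDNA’s algorithm: the function names, the use of key hold times in milliseconds, and the averaged z-score distance are all assumptions made for the example. Raul says the real matching step is done with machine learning, which this sketch replaces with a simple distance.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profile(events):
    """Build {key: (mean_hold_ms, std_hold_ms)} from (key, hold_ms) pairs.

    Only timing statistics per key are kept, not the order of the keys,
    so the typed text itself cannot be read back from the profile.
    """
    by_key = defaultdict(list)
    for key, hold_ms in events:
        by_key[key].append(hold_ms)
    return {key: (mean(times), pstdev(times)) for key, times in by_key.items()}


def profile_distance(enrolled, sample, min_std=1.0):
    """Average absolute z-score of the sample's per-key means against the
    enrolled profile, over the keys both have in common (hypothetical metric)."""
    shared = enrolled.keys() & sample.keys()
    if not shared:
        return float("inf")  # nothing to compare
    total = 0.0
    for key in shared:
        enrolled_mean, enrolled_std = enrolled[key]
        sample_mean, _ = sample[key]
        # min_std guards against division by zero for very consistent typists.
        total += abs(sample_mean - enrolled_mean) / max(enrolled_std, min_std)
    return total / len(shared)


# Made-up hold times in milliseconds, purely for illustration.
enrolled = build_profile([("a", 100), ("a", 110), ("b", 80), ("b", 90)])
same_person = build_profile([("a", 104), ("b", 88)])
someone_else = build_profile([("a", 150), ("b", 40)])

print(profile_distance(enrolled, same_person))
print(profile_distance(enrolled, someone_else))
```

A smaller distance means the new sample looks more like the enrolled profile. A real system would need far more data per key, more features than hold time, and a tuned decision threshold.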
Raul Popa: We can do matching and we can say whether it is the same person or not on any text, even when you type completely different things. The downside of that is that you have to type about a tweet-length, but imagine you go to a country where you don’t live. It’s the first time you get to Vietnam and somebody steals your computer, your mobile phone, everything. You try to get online.
Kirill Eremenko: As it happens, yeah.
Raul Popa: Yeah, and you try to get online to connect to your people, to your friends, ask for some money to be able to go back to your home, whatever hotel, wherever you are. You will need a new computer, or you will try to log in to Gmail, let’s say, from a new computer, a new location, without your phone, without your YubiKey, whatever you have. Your password, you don’t remember it because it was in your 1Password, whatever, LastPass, whatever you’re using for password managing.
Raul Popa: All of a sudden, you’re in front of a computer unable to log into your account or to talk with your people, your friends and ask them for help. That’s, really, a situation when you are into that, you will just have no escape. One thing that Google can do, for example, or anyone, any big application like that, Facebook, anyone like that, they could ask you to type a sample, a sentence, whatever.
Raul Popa: They can use that along with, let’s say, a question that they can ask, stuff like, “When did you log in last time? From where?” They can combine pseudo-information with typing biometrics and other things like that and be able to reset your account or enter your account like that. That’s really-
Kirill Eremenko: Yeah. Yeah. Yeah. Yeah, fantastic, definitely a great … It’s actually bringing a lot of convenience, a lot of, even in this case, certainty to the world that if you go traveling and you lose all your things, you can still get in touch and log into your accounts because I think that’s, for me anyway, that’s always a concern, when I’m in third-world countries and so, exactly what do you do in that situation?
Kirill Eremenko: Moreover, as we discussed before, typing biometrics brings a whole new user experience where instead of waiting for that SMS or doing those CAPTCHA, when they’re showing you images of a bus and I find those so time-consuming when you have a grid of nine images or an image broken down into nine parts, and you have to point out where there’s a bus, or where there’s a car to authenticate yourself or to show that you’re not a robot. I think that could be also removed with typing biometrics, and definitely a whole new user experience that can be introduced. That’s a very cool example.
Kirill Eremenko: I wanted to ask you another question. What is the future, what future do you see for TypingDNA? I think it’s a very cool thing to think about and always wonder about. Do you see an IPO some time in the future? I know it’s probably very early stages right now. Do you see the company growing and, I don’t know, building teams around the world, getting clients in lots of different countries? What is your vision for your business right now?
Raul Popa: Well, I’m not thinking of IPOs at this point. We’re really small. We’re a startup. We raised that seed round two months ago. I believe that the tech can become mainstream and can save a lot of situations, can be used to protect all sorts of accounts and businesses and money and assets. We have a lot of crypto exchanges, crypto wallets, crypto projects that want to use our technology for making people safer when they do transactions like that.
Raul Popa: We are happy that we can play a role in this entire scheme, and we can make authentication and communication easier, so you don’t have to go through friction in order to authenticate or to send some important messages or to deal with private data. On how big the company can get, I think it really depends on whether the technology gets adopted or not on a wider scale. We believe there are clear use cases on which the company can grow to multi-billion-dollar size, but it all depends on how the audience and how the market will receive that. We do have good science though.
Kirill Eremenko: Got you. Got you. Also, well, hopefully, it all goes really well, and this will be the first interview that you did just before your company becoming a billion-dollar company. Before we finish up, I wanted to ask you, from everything that you’ve gone through because I think your journey is very exciting from data scientist to research to founder and CEO to growing the business and disrupting a whole new space.
Kirill Eremenko: What would you say has been your one biggest learning, something that you can share with our listeners out there who are data scientists who might be considering starting a business, who might be considering staying in their current positions or progressing with their careers in data science? Nonetheless, what is one learning, something that you’ve found very useful for yourself in your life that you can share with them?
Raul Popa: I think data scientists are typically very intelligent people, and so I think typical intelligent people have this problem of overthinking. By far, this is like the biggest problem that, collectively, data scientists have. All that, I know. Also, on the other side, entrepreneurs are rarely overthinkers. Usually, they are more opportunistic, kind of optimists, that think that things will somehow solve by themselves in the future.
Raul Popa: This is why they start companies, they do a lot of things like that, so making the switch between those two is really, really hard. The key is, first, you have to be less anxious, to overthink less. I think, to do that, you have to be more relaxed. I have a mentor, Kevin O’Brien from GreatHorn. Actually, this is his advice. Whenever you’re entering the ring to box with an opponent, you can be that stiff, kind of rigid opponent that you would fear at first, right, but you see that there’s a lot of fear on his face as well, or you can be that relaxed guy who just enters the ring knowing things will work out.
Raul Popa: By being laid back, you will be able to spot the other person’s mistakes because people make mistakes all the time. If you relax, you can do that. You can use those mistakes to get an advantage. I think that’s really important. Things that you can learn when you want to turn into an entrepreneur from a data scientist: relax. That’s really important. Become curious about things that are not really as important as you think they should be, do more things for fun, get out of your comfort zone, stuff like that. I think those are really important things.
Kirill Eremenko: How do you relax?
Raul Popa: Yeah, as I told you, I am a curious person, so I like to learn things about all sorts of things. Sometimes, when I’m really stressed about something, I find a topic on which I want to learn more. I always want to learn more about something. Trying to do that, you will focus on something new. Entering that learning mode is really, really valuable. After two, three, four hours learning something about something other than your normal life, you will just find yourself in a comfortable position where you don’t see the risk anymore, you don’t see the pressure anymore, and you can think clearly, really important.
Kirill Eremenko: Okay. Got you. Got you. Raul, that’s fantastic advice. I’m going to take it on board as well. I sometimes overthink a lot of things as well. Yeah, good, good advice. On that note, we’ve slowly approached the end of this amazing podcast. Raul, thank you so much. Before I let you go, can you tell us a bit about where our listeners can find you, connect with you, maybe for a career, maybe some business owners can get in touch about trying out TypingDNA, or maybe some data scientists that are looking for jobs might be interested in this space and in job opportunities that you might have. What are some of the best places to find you?
Raul Popa: Typingdna.com, we have a lot of demos there, a lot of information. Our recorders for typing patterns are actually open sourced, so if you have another project in typing biometrics, you want to use these for, I don’t know, detecting how people type, you know, use that, you can contact me on raul@typingdna.com or R-A-U-L@typingdna.com. That’s pretty much it.
Kirill Eremenko: Awesome. Is LinkedIn okay as well?
Raul Popa: Yeah, LinkedIn is fine, Raul Popa at LinkedIn, sure.
Kirill Eremenko: Got you. Okay. Awesome, and guys, make sure to get in touch. One final question before we finish up today. What’s a book that you can recommend to our listeners to help them in their careers and life journeys?
Raul Popa: I don’t recommend data science books, typically. I recommend online courses, but one book I recommend for everyone doing data science is Black Swan from Nassim Taleb, also the last books from this author like Antifragile and Skin in the Game touch the essence of the rare asymmetries that we find in the real world and how these may lead to positive outcome. I think these are extremely satisfying books for me as a data scientist.
Kirill Eremenko: Got you. Awesome. That’s Black Swan, Antifragile, Skin in the Game.
Raul Popa: Yeah.
Kirill Eremenko: Great, great advice, and on that note, thanks so much, Raul, for coming on the show, very, very exciting podcast. I’m sure our listeners enjoyed it as well, and best of luck with TypingDNA. I hope you change the world.
Raul Popa: Thank you very much. Thank you for having me.
Kirill Eremenko: There you have it. That was Raul Popa, CEO and Data Scientist at TypingDNA, what a discussion. I hope you enjoyed this conversation as much as I did. We went into, first of all, the world of typing biometrics. How crazy is that? It blows my mind. Just by typing in my login and password a couple of times, I can then be identified in the future as myself, and that is crazy. As you can imagine, or from what we discussed in the podcast, there are plenty of applications that can make our lives easier and safer in many, many ways.
Kirill Eremenko: Also, it’s very cool to learn both the data science aspect of things and the different approaches, techniques, algorithms that are used in the space as well as setting up a startup around a data science idea. If you have a data science idea, maybe now, you have some better ideas of what is coming up for you, what to expect. My personal favorite part of this podcast, something that stood out to me the most was the creative ways of collecting data, as Raul mentioned, on when they needed those patterns, how they created that fun tool for people to use to learn a bit about themselves, their personality type, and things like that based on their typing pattern. That allowed them to collect data. I think that’s a very out-of-the-box type of thinking, type of idea.
Kirill Eremenko: On that note, make sure to connect with Raul and you can find the URL to his LinkedIn as well as all other materials that we mentioned on this episode at www.superdatascience.com/251. That is www.superdatascience.com/251. Make sure to hit up Raul on LinkedIn. If you know anybody who is looking to create a startup in data science, who has a really cool idea that can be empowered with machine learning or data science, or any other data-driven technologies, then make sure to send them this episode, and maybe they can cut through their learning curve and get some really cool ideas from here. On that note, thanks so much for being here today. I look forward to seeing you back here next time. Until then, happy analyzing.