Kirill Eremenko: This is episode number 317 with aspiring data scientist, Edis Gonuler.
Kirill Eremenko: Welcome to the SuperDataScience podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur. Each week we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today, and now let’s make the complex simple.
Kirill Eremenko: This episode is brought to you by my very own book, Confident Data Skills. This is not your average data science book. This is a holistic view of data science with lots of practical applications.
Kirill Eremenko: The whole five steps of the data science process are covered from asking the question to data preparation, to analysis, to visualization, and presentation. Plus, you get career tips ranging from how to approach interviews, get mentors and master soft skills in the workplace.
Kirill Eremenko: This book contains over 18 case studies of real world applications of data science. It covers algorithms such as Random Forest, K-Nearest Neighbors, Naive Bayes, Logistic Regression, K-Means Clustering, Thompson Sampling, and more.
Kirill Eremenko: However, the best part is yet to come. The best part is that this book has absolutely zero code. So, how can a data science book have zero code? Well, easy. We focus on the intuition behind the data science algorithms, so you actually understand them, so you feel them through, and the practical applications. You get plenty of case studies, plenty of examples of them being applied.
Kirill Eremenko: And the code is something that you can pick up very easily once you understand how these things work. And the benefit of that is that you don’t have to sit in front of a computer to read this book. You can read this book on a train, on a plane, on a park bench, in your bed before going to sleep. It’s that simple even though it covers very interesting and sometimes advanced topics at the same time.
Kirill Eremenko: And check this out. I’m very proud to announce that we have dozens of five star reviews on Amazon and Goodreads. This book is even used at UCSD, University of California San Diego to teach one of their data science courses. So, if you pick up Confident Data Skills, you’ll be in good company.
Kirill Eremenko: So, to sum up, if you’re looking for an exciting and thought provoking book on data science, you can get your copy of Confident Data Skills today on Amazon. It’s a purple book. It’s hard to miss. And once you get your copy on Amazon, make sure to head on over to www.confidentdataskills.com where you can redeem some additional bonuses and goodies just for buying the book.
Kirill Eremenko: Make sure not to forget that step. It's absolutely free. It's included with your purchase of the book, but you do need to let us know that you bought it. So, once again, the book is called Confident Data Skills and the website is confidentdataskills.com. Thanks for checking it out, and I'm sure you'll enjoy it.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies and gentlemen. Super pumped to have you back here on the show. What an episode I have prepared for you today. I literally just got off the phone with Edis just under an hour ago, and Edis is a very inspiring young man who is breaking his way through into the field of data science. What you need to know about Edis is that he's already participated in a couple of Kaggle competitions. In one of them, he actually finished in the top 18%. He has deep knowledge of neural nets. He actually programs them for his Kaggle competitions, as you'll find out in the podcast.
Kirill Eremenko: Something else very impressive I know is that he creates neural networks from scratch. He doesn't use PyTorch or TensorFlow, just because he wants to learn how neural networks work. It takes him a week or so to program a neural network, but he really dives deep into it. He's started an artificial intelligence club in New York, and lots and lots of very cool things are happening in his life. He got into the field of data science just under two years ago.
Kirill Eremenko: But the most surprising thing you’ll find is that Edis is only 15 years old. That’s right. 15 years old, and he’s already doing these massive crazy cool things. In this podcast, he talks about Generation Z, and how they use artificial intelligence, and often don’t even realize that there is artificial intelligence behind the products that they’re using. We talked about Edis’s path, and how he got into the space of data science, why, and how he continued progressing to artificial intelligence.
Kirill Eremenko: We talked about his Kaggle competitions. Edis shared how he codes models from scratch, and why. We talked a lot about neural nets. We actually had a very deep dive into neural networks. If you don't yet understand how they work, or you want to refresh your knowledge of neural networks, we discussed that. We talked about the neurons, layers, weights, activation functions, normalization, standardization of data, min/max functions, quite a lot of things. That was a really cool discussion. For me personally, it was great to revise some of these things, and refresh them in my own knowledge. I'm sure you're going to enjoy all this.
Kirill Eremenko: Plus, in addition to that, we talked about feature engineering, and how he used this technique in one of his Kaggle competitions. Plus, we took it to the next level, and Edis told us how he built an ensemble of neural nets. That was a first for me, hearing that somebody's building an ensemble, not just of decision trees to build a random forest, but an ensemble of neural nets. That was a very exciting conversation, and you'll learn about how that works, and how you can build your ensembles of neural nets, and what the advantages of that are.
Kirill Eremenko: Finally, we talked about the motivation, what keeps Edis going. All in all, a very exciting inspiring episode. Can’t wait for you to check it out. Without further ado, I bring to you aspiring data scientist, Edis Gonuler.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies and gentlemen. Super excited to have you back here on the show. Today, we got a very special guest calling in from New York. Edis Gonuler. Edis, how’re you going?
Edis Gonuler: I’m going good. Thank you for having me here.
Kirill Eremenko: Super excited to have you. Super excited. Since we caught up at DataScienceGO in San Diego, how’s your life been? What’s been happening? It’s been a few months since then. Right?
Edis Gonuler: Yeah, yeah. It’s been a few months, and it’s been great. I’m really motivated to create data science projects to explore different fields. It’s just been awesome.
Kirill Eremenko: That’s really cool. What I was wondering about is what’s got you to come to DataScienceGO in the first place? How did you hear about the conference?
Edis Gonuler: Well, I listen to a lot of your courses on Udemy, and that’s basically a lot of the content that I studied. I saw it in the email that you sent me, or someone sent me, and I knew I had to be there. I knew it was going to be a great experience, and that’s how I went there. I told my dad, and I said, “Dad, I have to go here. I want to go,” and he said, “Sure.”
Kirill Eremenko: Wow.
Edis Gonuler: That’s how it all happened.
Kirill Eremenko: Interesting, because I thought your dad brought you there. It was the other way around. You encouraged your-
Edis Gonuler: [crosstalk 00:07:22].
Kirill Eremenko: That’s so cool. You live in New York, so you flew all the way from New York for the event.
Edis Gonuler: Yeah.
Kirill Eremenko: That’s so cool.
Edis Gonuler: I went to San Diego. It was a great experience.
Kirill Eremenko: That’s so cool. What fascinated me the most is, you probably noticed, you were by far the youngest person there. How old are you?
Edis Gonuler: I’m 15 years old.
Kirill Eremenko: 15 years old, and you’re at a data science conference. Well, I got to caveat that. I did see one kid. Did you see that kid? Like five or six years old.
Edis Gonuler: Yeah, I saw him. [inaudible 00:07:55]. Yeah.
Kirill Eremenko: He was there, and his mom was like, “Oh, you got to learn data science. You got to learn mathematics.” She was really pushing him, five years old. Apart from that kid, you were by far the youngest, and you were getting a lot of value out of it. You were asking the right questions. You even came to the workshops. Was that right?
Edis Gonuler: Yes, I did. I attended your workshop as well as creating R websites using Shiny, and they were really valuable experiences.
Kirill Eremenko: Fantastic. Well, so what I want to … Of course, I want to have a fantastic chat, and see how I can help you on this podcast, but I also want to see how you can inspire our listeners because at such a young age, you’ve already established for yourself that, “Hey, this is the path I want to take.” I personally have never met anybody this young who is so far advanced in the space of data science and artificial intelligence. Yeah, let’s kick things off. Why and how did you decide to get into data science?
Edis Gonuler: My dad's a software engineer, so he already got me into technology. Since I was very young, I've always had a computer. When I was really young, I was watching videos, exploring the internet. Then my dad, he was talking to me about creating websites, and how he'd create websites. One night, I remember we were going over something, how websites work, and went into AI, and he explained what AI is for me. That night, I just sat at my desk for 30 minutes, and I was like, "Wow, this is so cool." I was just like, "Oh, my God. Oh, my God."
Edis Gonuler: I started researching it, and I found a lot of great courses. I started watching them, and I started creating projects, and that’s how I learned. That’s how everything started.
Kirill Eremenko: Okay, but you don’t have AI at school, right?
Edis Gonuler: No. We do have computer science courses in school, where they teach you a little bit of HTML and CSS. But we don't have a strictly AI class.
Kirill Eremenko: Okay. Was it hard to learn how to create AI?
Edis Gonuler: Well, there were some challenges, of course. There's a lot of content online, which is really good. But it could also be a really bad thing, because you don't know which content to pick, and you could waste time trying to pick content. But I just started on Udemy, and listened to a few of your courses. They got me super pumped up, and I just started exploring, and creating projects, and learning from there.
Kirill Eremenko: Okay, got you. What is it that your dad said, do you remember, that got you so inspired?
Edis Gonuler: I think he was explaining how he can predict what movies users would like based on previous data, and, especially with that example, I was so interested in how it worked. Yeah.
Kirill Eremenko: Okay, okay. Have you used Netflix yourself? Was it cool to realize that that’s happening in the background?
Edis Gonuler: Definitely, definitely. When I learned what it is, I realized that a lot of things in life, a lot of different products like Netflix and self-driving cars, actually use AI. I've always heard the term AI, but I didn't really know what it meant. It was just like a light bulb going off in my head.
Kirill Eremenko: Fascinating. I've never thought about this, but from what you're saying, I'm gathering that we talk a lot about millennials. I'm a millennial. If you're 15, then you would probably be Generation Z, because a millennial is defined as someone who was, I think, at least 16 years old when the year 2000 came around. But you were born in 2003. So you're Gen Z. We talk a lot about Generation Z, and how there is, I'm not talking in particular about you, but in general, there's this addiction to social media and mobile phones, and Generation Z people are very good with technology and so on. But it's interesting to hear that until recently, you actually didn't know how artificial intelligence works, even though you use it to your advantage pretty much daily.
Edis Gonuler: Yeah, definitely. A lot of things that we use, and something as basic as Siri, we don’t realize that there’s actually AI behind it. That was just fascinating to me when I first learned AI. When I was first watching courses, I was like, “Oh, I use this every day, almost,” and it just made me more motivated to learn it, and to progress.
Kirill Eremenko: Okay. Well, tell me, how long ago was this conversation with your dad?
Edis Gonuler: This was about two years ago when I was in eighth grade, I believe.
Kirill Eremenko: Oh, my gosh. Wow, your dad’s a legend. That’s so cool. Huge props to your dad. Okay, so you were 14, you had this conversation. I’m assuming you didn’t know Python back then.
Edis Gonuler: I was actually 13.
Kirill Eremenko: 13. Wow.
Edis Gonuler: Yeah. Yeah, back then I didn’t know Python. I did have a little bit of experience with HTML, just basics. But after that conversation, after I started researching online, I realized that it was either Python or R. I looked at a few of the examples, and Python seemed more like pseudocode or readable code. I decided to just pick Python and go with it. From there, I started learning Python, the basics, variables, loops, and it just got me more and more fascinated as I went.
Kirill Eremenko: Cool. How long did it take you to learn Python?
Edis Gonuler: It took me about six months, I would say, because I was at school in the meantime, so I was trying to balance school and learning.
Kirill Eremenko: Six months to get to what level?
Edis Gonuler: Six months to get to an intermediate level. I knew most of the concepts in Python. That’s when I started learning AI.
Kirill Eremenko: Okay. All right. Which course in AI did you take?
Edis Gonuler: The first course that I actually took was your course on Udemy. It was Machine Learning A-Z.
Kirill Eremenko: That’s a big course.
Edis Gonuler: Yeah. I finished it in about a month, because I just-
Kirill Eremenko: No way.
Edis Gonuler: … countless amount of nights, countless amount of nights. I slept really, really late. I woke up early in the mornings before school just to watch a few lectures. Even on my commute to school, I used to watch the lectures, and I was always thinking about, even in school, how would I solve this problem? Or how would I do this?
Kirill Eremenko: Wow. That is crazy. That’s like a 40-hour long course with over 200, or even 300 tutorials. That’s insane to do it in a month.
Edis Gonuler: [crosstalk 00:14:45] a lot of content.
Kirill Eremenko: Well, the best ones I heard was people doing it in two months, and average is about three months, I think. You did it in one month. Crazy, man. Your knowledge of Python was sufficient to take that course by then.
Edis Gonuler: Yeah, definitely, it was sufficient enough. Also, that course is written half in Python and half in R, so I also explored R to learn some of it at that time.
Kirill Eremenko: Okay, okay. Machine Learning A-Z, cool. Machine learning is indeed a type or form of artificial intelligence. Did you ever go deeper? Did you go into reinforcement learning and those types of concepts?
Edis Gonuler: Yeah, definitely. Machine learning was like a gateway. It was really, really interesting. But when I saw deep learning, and reinforcement learning, I got even more interested and motivated to learn all this stuff. That’s when I started buying the books, the heavy math books, because I realized that a lot of math needs to be learned in order to actually grasp these concepts. I started reading book after book, learning all these concepts.
Kirill Eremenko: Okay, got you. You’re probably way ahead of your class in terms of mathematics.
Edis Gonuler: Well, the thing is, in school when they teach you math, you don’t really understand why they teach it to you. That’s why I really love data science and AI because all the things, all the math that you learn in school, you can apply it, and it’s just cool to see how it all clicks into place.
Kirill Eremenko: Okay, okay. Very cool. All right. You learned the Python, you did the course, you learned some mathematics. Then what? When did you start doing your first projects? When did you actually feel, I don’t know, a spark of creativity towards artificial intelligence?
Edis Gonuler: Yes, so that was about last year when I was in ninth grade. I started with Kaggle. I looked at a few competitions, and I decided to join one. My first attempt wasn't very successful. But I learned a lot of different cool things: feature engineering is very important, and analyzing the data before actually putting it inside a model and training it is really important. After that, I joined a few other competitions.
Edis Gonuler: The last competition, my most recent competition was the 1C competition. It's a Russian software company, where you had to predict how many of each item would be sold in the next month. It was a really good experience for me. I learned a lot of different cool things. At the end, I got into the top 18% using ensemble methods.
Kirill Eremenko: That’s very impressive. To give us a sense of how big your competition was, how many people, 18% out of how many?
Edis Gonuler: I don't want to lie, but about 5,000, I believe, 4,500. Somewhere around there.
Kirill Eremenko: That's very good. That's a very good result.
Edis Gonuler: Yeah. I try to improve my results, and I try to join more competitions because creating projects and experience is the most valuable thing, and it’s what makes you learn. At one point, after watching all these courses, and reading all these books, I realized that I was learning, but because I wasn’t applying it, I was forgetting the stuff that I already learned. I just decided I’m going to stop the books, and I’m going to actually start creating projects. That helped me a lot.
Kirill Eremenko: Okay, got you. Let's go into that a little bit. I love talking about Kaggle competitions. Top 18% in this 1C company competition to predict total sales for products in the stores. Plus, you did some competitions before that on Kaggle, a few. You mentioned ensemble methods. That got you quite far. Tell us, if you don't mind sharing, exactly what ensemble method did you use, and why, and how did you go about getting to this solution?
Edis Gonuler: Okay, yeah. I looked at my data, and before using ensemble methods, I tried neural networks, I tried SVMs. I realized that I wasn't getting the result that I expected, and the features that I engineered weren't as useful in these models. I was just like, "Let me just try an ensemble method." I think I used a few neural networks, and on top of that, put linear regression. I think that's what got me the top 18%.
Kirill Eremenko: Wow. You used an ensemble of neural networks.
Edis Gonuler: Yeah. Neural networks and I think I also used SVM. I’m not completely sure. But yeah, [crosstalk 00:19:43].
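For anyone curious, the approach described here, several base models with a linear regression fitted on top of their predictions, is often called stacking. Below is a minimal numpy sketch of just the blending step; the base-model predictions are simulated with noise, standing in for the outputs of actual neural nets or an SVM, and all the numbers are purely illustrative:

```python
import numpy as np

# Simulated predictions from three base models (e.g. a few neural nets
# and an SVM), one column per model. The noise levels are arbitrary.
rng = np.random.default_rng(0)
y_true = rng.uniform(0, 10, size=200)            # validation targets
base_preds = np.column_stack([
    y_true + rng.normal(0, 1.0, 200),            # model 1: least noisy
    y_true + rng.normal(0, 1.5, 200),
    y_true + rng.normal(0, 2.0, 200),            # model 3: noisiest
])

# Stacking: fit a linear regression on top of the base predictions.
X = np.column_stack([base_preds, np.ones(len(y_true))])   # add intercept
coef, *_ = np.linalg.lstsq(X, y_true, rcond=None)
blend = X @ coef

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# On the data the meta-model was fitted to, the blend cannot be worse
# than any single base model (or any fixed linear mix of them).
print(rmse(base_preds[:, 0], y_true), rmse(blend, y_true))
```

In a real competition you would fit the meta-model on out-of-fold predictions rather than on the same data it is evaluated on, to avoid leakage; this sketch only shows the blending mechanics.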
Kirill Eremenko: That’s crazy. When people say ensemble methods, it’s usually a random forest, or something more basic.
Edis Gonuler: Yeah.
Kirill Eremenko: But you used an ensemble of neural networks. That’s a good next level. Okay. Tell me then, you used this ensemble method, and how, because the way you say it is like, “Oh, I tried SVM, I tried neural nets, I tried linear regression, I tried anything else,” how difficult is it to put together a model? How long does it take you to code one of those from scratch?
Edis Gonuler: What I try to do in projects, the whole point of doing projects for me is to learn. It’s not really to get into the top 18%, or the top 5%, it’s to learn. I usually code them out from scratch, not using any libraries, just to understand the math behind it.
Kirill Eremenko: Wow
Edis Gonuler: Usually, some of them, like linear regression and multiple linear regression, take me maybe a day to build and train. But some of the more complex methods, like neural networks, could take me up to a week just to build from scratch, because I also have a lot of different things going on. I don't really have that much time to code.
Kirill Eremenko: Hold on, hold on. Did you build neural networks without TensorFlow or PyTorch?
Edis Gonuler: Yes. I did it just to understand the math behind it. [crosstalk 00:21:06].
Kirill Eremenko: Wow, that’s crazy, man. That is insane. You build a neural network without using existing libraries. That is definitely going to get you to learn. I see your point. You’re going to know them so well after that. How do you know that you didn’t make any mistakes along the way? There’s so much math involved.
Edis Gonuler: Well, there is so much math. Of course, I didn't do this off the top of my head. I researched online to see how it would work, and the formulas and the mathematical aspects. I just kept on trying and trying and trying, and then I compared it to the neural network from one of the libraries, I believe from TensorFlow. I imported that one alongside my own neural network, and it was about the same [inaudible 00:21:55]. It gave about the same loss. I realized that the one I created was pretty good, and it worked.
Kirill Eremenko: Okay. Wow. Very cool. All right, so you built your own neural network, and you said it took you about a week to build a neural network from scratch.
Edis Gonuler: Yeah, about a week.
Kirill Eremenko: Interesting, because when I asked you the question about from scratch, I actually thought to myself, "I shouldn't have said from scratch because you probably were using libraries, drag and drop, putting things together." Because if you use TensorFlow or PyTorch, it's going to take like two hours to build a neural network. Maximum. All right, it's more about the architecture that you decide upon.
Kirill Eremenko: Yeah, how do you go about that? How did you go about experimenting with the neural net architecture? What insights can you share? What did you learn for that specific challenge about sales prediction?
Edis Gonuler: Yes. For neural networks, online and in the books, they usually recommend putting a lot of neurons in the middle of the neural network, and not a lot towards the ends. I found that to be true. But what I did in the beginning was I just put a bunch of neurons, and I just tested it out. The result that I think I got from that was, if you put neurons in the first few layers, it's not as good as putting them in the last few layers. In the beginning of your neural network, try to keep it simple, and towards the end, try to amp up the number of neurons.
Kirill Eremenko: Okay. Why do you think that is, because you coded these from scratch, you know the math, you maybe have a better sense about what’s going on in there than most of us. What would you say are your comments? Why do you think you’re seeing this inside?
Edis Gonuler: Well, to be honest with you, I'm not 100% sure. But I think that in the beginning, since the data is completely fresh, nothing has been applied to it, it needs to be basic, and each neuron just has to have a specific weight. Then towards the end, as the numbers get all jumbled up, it doesn't really matter as much. I think you can use more weights there to get an output. This is my experience, and this is how it worked for me. I'm not sure how it would be for other types of neural networks. But in my experience, using this data set from this Kaggle competition, it just seemed to work pretty well.
Kirill Eremenko: Okay. What activation function did you use in your neural net?
Edis Gonuler: I think I used ReLU. I'm not sure if I'm pronouncing that right. But I also tried a lot of different activation functions. I believe I used Adam, and I realized that ReLU was the easiest one to use, and it got me the most effective results.
Kirill Eremenko: Okay. All right. You know what, I got an interesting idea. I love doing these things, to recap my own knowledge and solidify it, but also maybe somebody listening to this will find it useful. Let’s try together to explain neural nets in a couple of minutes. Obviously, we don’t have visual aids because this is a podcast, but it’s going to be tough. But let’s see if we can do it. Neural net, how would you summarize it up? What is a neural net?
Edis Gonuler: I would summarize it like this, I would say you’re trying to simulate how our brain works using a machine. In your brain, you would have neurons. You would get an input, and then that results in output. Famous example for this is when you touch a hot pot, you move your hand away. You’re getting input that it’s hot, and your neurons are saying, “Well, you’re going to burn, so move your hand away.” I think that neural network is very similar to this. You get an output … An input. Sorry. An input, and it goes through all these neurons, and then you get an output at the end. This is how I would probably sum it up. There’s a lot more to it, but in a few sentences.
Kirill Eremenko: That’s a good start, it’s a good summary. Now, let’s unwrap it a little bit. I’m going to describe a bit how it works, and then you can add your comments [inaudible 00:26:13]. Sounds good?
Edis Gonuler: Sure. Yeah.
Kirill Eremenko: Okay. Everybody can probably imagine, everybody's already seen, I'm going to assume here that everybody's already seen a basic image of a neural net. You have these circles, vertically aligned; that's your first input layer. It can be three, five, can be 100, can be 1,000. Those are where your inputs come in. Then, they're all interconnected to the next layer, the next vertically aligned layer of these circles, which are neurons. It can be more circles, or fewer than in the first layer, and so on.
Kirill Eremenko: All these layers are connected to the next one, to the next one, to the next one, and in the end, you have one output neuron, or maybe even several output neurons in the output layer. They're connected, and what happens is your information, just imagine, is coming in from the left into these input neurons. Obviously, any graphics or any text has to be changed into numbers, and the numbers have to be normalized, and so on. Basically, your inputs, which can be different columns in your data set, any variables, are put into the input layer. Then what happens is, every neuron in the next layer, let's pick a random neuron, is connected, let's say, to all of the neurons in the previous layer with these lines. These lines are called synapses. Every synapse has a weight assigned to it.
Kirill Eremenko: The neuron gets assigned a value, which is equal to the weighted sum of the input values multiplied by the weights of the synapses. It's basically as simple as that. When I learned this for the first time, I was so shocked that it's just as easy as a weighted sum of all of your input values in that neuron. Then, an activation function is applied on top of that, which we'll talk about in a bit. Then the same thing happens for all the neurons in that hidden layer. Then again, this process repeats for the next layer, and the next layer.
Kirill Eremenko: Basically, all the time, all that's happening is you're adding up the values that you have in your current layer, multiplied by the weights in the synapses, to get the value in the associated neuron. You're calculating that for all the neurons, and you're going through the whole neural network. That process is called forward propagation; it goes from left to right. Does that sound about right?
Edis Gonuler: Yeah. Yeah, that's a great way to sum it up. Like you said, it's as basic as that. It's nothing really complex. Linear regression is a really good foundation for that. That's basically what linear regression does, but a neural network just does it on a much larger scale.
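For readers who want to see this in code, the forward pass just described, a weighted sum at each neuron followed by an activation, is only a few lines of numpy. The layer sizes and weights below are arbitrary, purely illustrative:

```python
import numpy as np

def relu(z):
    # ReLU activation: zero below zero, the value itself above zero
    return np.maximum(0.0, z)

def forward(x, weights, biases):
    """One forward pass: at each layer, every neuron's value is the
    weighted sum over its synapses, with an activation applied."""
    a = x
    for W, b in zip(weights, biases):
        a = relu(a @ W + b)
    return a

rng = np.random.default_rng(42)
# Inputs must already be numeric and, as noted above, normalized.
x = rng.uniform(0, 1, size=(1, 4))          # 4 input neurons
weights = [rng.normal(0, 0.1, (4, 8)),      # input layer -> 8 hidden neurons
           rng.normal(0, 0.1, (8, 1))]      # hidden layer -> 1 output neuron
biases = [np.zeros(8), np.zeros(1)]
print(forward(x, weights, biases).shape)    # one output value: shape (1, 1)
```

A real network would typically use a different activation, or none at all, on the output layer; applying ReLU everywhere just keeps the sketch short.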
Kirill Eremenko: Yeah, that’s a good way of putting it as well. Okay, cool. That’s how they work. Once your neural network is trained, and the weights are defined, that’s how exactly it works. Then, a question for you, Edis, where do the weights come from?
Edis Gonuler: Okay, so the weights are first initialized. And then after that, [crosstalk 00:29:05].
Kirill Eremenko: What does that mean? Initial.
Edis Gonuler: They’re initialized. They have a value. You first set them all to zero or one. When the numbers pass through as they go through the neural network, they get multiplied by one, or they get multiplied by zero. Then you get an output from those inputs. After that-
Kirill Eremenko: I would add a small correction there. It’s usually a value close to zero. If they’re all zero, then it’s just going to be … not going to really work.
Edis Gonuler: Yeah. It's going to be zero. Then after that, after you get the output, it compares it to the output that it's supposed to get. When you're training the model, it compares to the output that you're supposed to get. It looks at the difference in the loss function. Based on that, it changes the weights. It either makes them more, or makes them less, and it tries to make the neural network's output as close as possible to the real value. This is just a summary. There's way more to this than I'm explaining. But this is basically just … Yeah.
Kirill Eremenko: Yeah, exactly. In forward propagation, the data goes from, or the information goes from left to right, feeds forward through the network. Then, as you said, in the training data set, you actually … because by definition, in the training data set, you already have the outputs that you're supposed to get. If you're training a neural network to distinguish between a dog and a cat, you already have like 10,000 images, which are labeled dog and cat. You have the answer. You compare it to the answer, and depending on your loss function, or your error, you do back propagation.
Kirill Eremenko: Then, you propagate the error back through the network. Basically, what that means is that the network automatically just slightly perturbs those weights, so changes them a little bit, to reduce the error. There are processes behind that such as Gradient Descent, Stochastic Gradient Descent, Batch Gradient Descent, and so on, which we're not going to go into. But basically, there are ways that the neural network can slightly adjust the weights to see if it can get the answer closer to the result. That's why you need a lot of data, right, so that you can do this thousands of times before you get those final weights that work well for your problem.
Edis Gonuler: Yeah, yeah, definitely. This was a good summary of it. It’s not very complex, but I feel like it could be, and there’s a lot of math behind it, which it’s good to understand because that’s a good foundation for it.
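The whole training loop just described, forward propagation, comparing with the known answers, then propagating the error back and nudging the weights with plain gradient descent, fits in a short numpy sketch. This is a toy example on made-up linear data, not anyone's actual competition model; every number in it is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up regression data: the target is 3*x0 - 2*x1.
X = rng.uniform(-1, 1, size=(200, 2))
y = (3 * X[:, 0] - 2 * X[:, 1]).reshape(-1, 1)

# Initialize the weights close to zero: small random values, not exactly
# zero, so the hidden neurons don't all stay identical.
W1 = rng.normal(0, 0.3, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.3, (8, 1)); b2 = np.zeros(1)

baseline = float(np.mean(y ** 2))   # loss of a network that outputs ~0
lr = 0.1
for _ in range(5000):
    # forward propagation
    z1 = X @ W1 + b1
    a1 = np.maximum(0.0, z1)        # ReLU hidden layer
    pred = a1 @ W2 + b2
    err = pred - y                  # compare with the known answers
    # back propagation: gradients of the squared error
    # (constant factors folded into the learning rate)
    dW2 = a1.T @ err / len(X); db2 = err.mean(axis=0)
    dz1 = (err @ W2.T) * (z1 > 0)
    dW1 = X.T @ dz1 / len(X); db1 = dz1.mean(axis=0)
    # slightly adjust each weight downhill: plain gradient descent
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

pred = np.maximum(0.0, X @ W1 + b1) @ W2 + b2
loss = float(np.mean((pred - y) ** 2))
print(baseline, loss)   # the loss drops well below the baseline
```

Stochastic or batch gradient descent would update on subsets of the data instead of the full set each step; the adjustment logic is the same.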
Kirill Eremenko: For sure. Look at us, we explained neural nets on the podcast. The question is how useful that is. I wonder if listeners got value out of that, but hopefully they did.
Edis Gonuler: Hopefully.
Kirill Eremenko: Hopefully. Activation functions are simply the functions that are applied to the output of the neurons. The input of a neuron is whatever the values were in the previous neurons, times the weights, so a weighted sum, and then the output of the neuron is that weighted sum, but with an activation function applied. A very simple one is the threshold function, where if the weighted sum is less than zero, then we turn that neuron off, and make it a zero. If it's more than zero, then we turn that neuron on, and make the output a one. That's a very simple activation function called the Threshold Function. The one that-
Edis Gonuler: Yeah, that’s [inaudible 00:32:23].
Kirill Eremenko: Yeah. Then the one that you use is called the ReLU, or rectifier. How would you explain that one?
Edis Gonuler: Yeah. If you look at a graph, basically below zero it's a flat line at zero, and after zero it's the line y=x. This is used for predicting numbers, and not probability, so not a percent. But it's used to predict numbers. It's very useful in a lot of different cases, like in the competition that I had, where you would predict the number of items sold. This was the best. This was the activation function that I found most useful.
Kirill Eremenko: That’s very accurate description. It’s basically, it’s kind of like the threshold function, the very simple one that we just talked about, but where below zero, the neuron’s off, above zero, the neuron is equal to one. Here, it’s like if your value is below zero, then the neuron’s off. But if your value is above zero, then your value, your output of the neuron is equal to the value. You maintain the value, don’t change it to a simple one. Okay. What other activation function can you think of?
Edis Gonuler: I can think of Adam. I haven’t really used it. I’ve just heard of it. I’ve heard a lot of good.
Kirill Eremenko: What about the Sigmoid Activation Function?
Edis Gonuler: Yeah, so the Sigmoid Activation Function is also useful, but it's useful in a different case. When you're predicting probability, you can use a Sigmoid Activation Function, and what you would do is set a threshold, so below 0.5, so below 50%, or above 50%. If you're, for example, predicting if a cell is a cancer cell or not, and you get a percent, you would want to change it to either yes or no. You would say if it's below 0.5, it's a no, and if it's above 0.5, it's a yes. That's basically what the Sigmoid Function is at its core.
Kirill Eremenko: Okay. It’s the same function that’s used in logistic regression.
Edis Gonuler: Yes. Yeah.
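[Editor's note: the sigmoid squashes any weighted sum into a probability between zero and one, and the 0.5 cut-off Edis mentions is applied on top of it. A sketch; the example inputs are made up:]

```python
import math

def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, cutoff=0.5):
    """Turn the probability into a yes/no answer, as in the cancer-cell example."""
    return "yes" if sigmoid(z) >= cutoff else "no"

print(round(sigmoid(0.0), 2))         # 0.5: a weighted sum of zero is a coin flip
print(classify(2.0), classify(-2.0))  # yes no
```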
Kirill Eremenko: Okay, fantastic. That’s just three activation functions. I checked the Wikipedia page recently for activation functions, there’s like 30 of them, at least.
Edis Gonuler: There’s a ton. Yeah.
Kirill Eremenko: It just shows that in neural nets, there’s a lot of room for creativity. As the architect of the neural network, you can decide, how many neurons am I going to have? How many layers am I going to have? And what activation functions am I going to use, and lots of other things, but those are the main things that you can tweak, and just design your own neural network. I wanted to ask you, what do you like more? Do you like the mathematics aspect, and coding the neural nets? Or do you like the architectural aspect of the neural networks?
Edis Gonuler: I think they tie in together. I personally like the math aspect a little bit more, because you understand how it works, and it's good to know the math to actually build the architecture, because when you know the math, you know how it would work. Based on that, you can predict even before training your neural network, because a neural network takes a lot of time to train. If you don't have a good architecture, you could waste hours of time, and not have a good model. I think they tie in, but personally, I like the math aspect a little bit more.
Kirill Eremenko: Okay. Got you. Got you. You also mentioned feature engineering, right? That’s a very important part for any kind of machine learning, or artificial intelligence for that matter.
Edis Gonuler: Definitely.
Kirill Eremenko: Tell us a bit about that. What is feature engineering? What kind of cool features did you come up with for this Kaggle challenge that you were in?
Edis Gonuler: Yeah. Feature engineering is creating a new feature based on the data that you already have. One of the features that I created for the competition I was in: there were a bunch of items, and I created a feature, "is CD". If the word CD was in the item name, then the "is CD" column would be yes or no. This helped me because I realized that CDs were sold more than other products. This helped me to just take them, put them on a different side, and say, "Okay, this is a CD, the weight is going to be a little bit more." It made me think, "How am I going to use this to build a better neural network, to build a better model?"
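[Editor's note: the "is CD" flag Edis describes is a one-line operation in pandas. The item names below are invented for illustration, standing in for the competition data:]

```python
import pandas as pd

# Hypothetical item names, standing in for the competition's item table.
items = pd.DataFrame({"item_name": ["Accounting CD vol. 1",
                                    "Office chair",
                                    "Music CD box set"]})

# Boolean feature: does the item name contain the word "CD"?
items["is_cd"] = items["item_name"].str.contains("CD")

print(items["is_cd"].tolist())  # [True, False, True]
```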
Kirill Eremenko: Very cool. Okay. That's an unusual way of engineering a feature, so you're pulling values from text data, creating a Boolean variable. Very cool. Indeed, maybe some domain knowledge about the company, the market that it's in, that 1C company indeed sells a lot of CDs, because I think a lot of them are used for corporate type of accounting software and things like that. That's one feature. Did you engineer any other features?
Edis Gonuler: The other features were just the means of the columns, and the mode, and other stuff. It wasn’t major. This one was the most interesting one, and the one that I realized was the most effective, I would say.
Kirill Eremenko: Okay. Okay, got you. What do you mean, mean of the column? You take a mean of the column, and you would create a feature that has the same value across all of the rows?
Edis Gonuler: Yes. What I would do is I would take the mean, and I would subtract it from the actual value, so the difference between the mean and the actual number. I realized that in some competitions, this helped, because in my previous competition, this was a useful feature. But in this competition, it wasn't as useful as it was in my previous ones. It varies from competition to competition.
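[Editor's note: the difference-from-the-mean feature is equally short. A pure-Python sketch on made-up numbers:]

```python
prices = [10.0, 20.0, 60.0]

mean_price = sum(prices) / len(prices)  # 30.0

# New feature: how far each value sits from the column mean.
diff_from_mean = [p - mean_price for p in prices]

print(diff_from_mean)  # [-20.0, -10.0, 30.0]
```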
Kirill Eremenko: Okay, okay. Got you. Interesting. Feature engineering, you’d think that you have all these inputs, you just use them. Why is feature engineering so valuable?
Edis Gonuler: Feature engineering is important to bring out the best in the data. Some of the features might not be useful in the way that they're presented, but you can extract something from that data. Most of the item names wouldn't be used because they're text. You could use NLP, but it would be harder to translate them to numbers. If I know that I can pull CD from the item name into a separate column, that would be way more useful than spending my time changing the text into numbers. It's just extracting the parts that you need from the data.
Kirill Eremenko: Okay. Okay, got you. Understood. How did you normalize your data before feeding it into the neural net? Did you use standardization, or did you use min/max? What did you use?
Edis Gonuler: I tried out a lot, and this was a few months ago, so I'm not completely sure, but I used min/max, I believe, in my final submission, because standardization didn't work as well for me as min/max. Min/max is basically taking the maximum of the column and the minimum of the column, and scaling by their difference. I believe that worked better for me. Like I said, all these things, feature engineering, the model, they all depend on the data. That's why exploring your data and visualizing it is a very important thing.
Kirill Eremenko: Yeah. Okay. Yeah. We'll talk about that in a second. Let's also talk for a second about standardization, or normalization of data. Why do you need to normalize variables before feeding them into a neural network? What would your one-paragraph summary, or comment on that, be?
Edis Gonuler: Yeah, so normalization or standardization is very important, because there can be a lot of outliers in data. Aside from outliers, in one column the numbers could be in the thousands, it could be 100,000, and in the other column, it could be a value of one. Let's say in this competition with the items, the amount sold last month could be two, but the price could be 100. When you put these numbers inside a neural network, two and 100, one is obviously bigger than the other, and the neural network could be biased, saying that the number that's bigger is stronger, that that feature is more important.
Kirill Eremenko: It’s overwhelming.
Edis Gonuler: Yeah. To not have this problem, you standardize, and you convert them into small numbers, usually between -1 and 1, or -2 and 2, and it solves this problem of having a biased model.
Kirill Eremenko: Fantastic. Fantastic. Well, I love the description. Indeed, you want to make sure the numbers are not only … You can't add kilometers and kilograms, for instance. You want them to have the same unit of measurement, or no unit of measurement for that matter, and you want them to be on the same scale, like you said. If you have a one and a 200 value, or a 100 value, then the bigger one will overwhelm the whole neural network. The process of bringing all values to a common scale is called normalization.
Kirill Eremenko: One process of doing that is called standardization, where you subtract the mean of that variable across the whole column. In every row, you subtract the mean of that variable, and divide by the standard deviation. That gives you a value, which is distributed with a mean of zero, and a standard deviation of one. Then that way, all of your columns become like that. Or the other one that you said you used in your Kaggle competition in the end was min/max, where you just take … Maybe you can describe this one better.
Edis Gonuler: Yeah. You take the value that you want to normalize, and you subtract the minimum from it, and then you divide it by the difference between the maximum and the minimum value in that column.
Kirill Eremenko: Yeah. That gives you a number between zero and one every time.
Edis Gonuler: Yeah.
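[Editor's note: both scalers described above fit in a few lines. A sketch on a toy column; in practice scikit-learn's StandardScaler and MinMaxScaler do the same job:]

```python
values = [2.0, 100.0, 50.0]

# Standardization: subtract the mean, divide by the standard deviation.
# The result has mean zero and standard deviation one.
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
standardized = [(v - mean) / std for v in values]

# Min/max scaling: subtract the minimum, divide by the range (max - min).
# The result always lands between zero and one.
lo, hi = min(values), max(values)
minmax = [(v - lo) / (hi - lo) for v in values]

print([round(v, 2) for v in minmax])  # [0.0, 1.0, 0.49]
```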
Kirill Eremenko: Yeah. Well, yeah, naturally. That's another way of normalizing variables. There we go, both described. I think that's a very important concept to understand in a lot of machine learning, including deep learning: how best to normalize. Depending on what you choose, standardization, min/max, or some other approach, you're going to get a different result, and that's what you saw in your competitions, right?
Edis Gonuler: Yeah. You might see like one or two decimal places off. It’s not a big deal. But at the end, when it goes through all these different models, those numbers, even if it’s a one, or a zero, or a -1, it makes a huge difference at the end.
Kirill Eremenko: How would you describe that? Why does it make such a big difference? Why does the type of normalization that you choose play such a big role?
Edis Gonuler: Well, because there’s a lot of these numbers, and data sets normally are really big. When it goes into these models, if a number is between zero and one, it would get a different result than if it was between -1 and one. When it goes through these models, something that is not negative wouldn’t make a huge effect.
Kirill Eremenko: Yeah, okay.
Edis Gonuler: It would make a different effect.
Kirill Eremenko: Yeah. Another point would probably be that some of these activation functions, like the ReLU, are not symmetrical around zero. If you have negative numbers there, a lot of them will get dropped off along the way. Interesting. Okay. You mentioned another thing about neural nets: analyzing, or exploring, your data before the model, and how that helps. Tell us a bit about that. Why can't I just take my challenge, and jump straight into modeling, creating these features, and building a neural net right away? What are the benefits of exploring your data beforehand?
Edis Gonuler: Well, you would know which features have the biggest correlation with the result you want to get, with the output. You want to explore your data, and you want to find these correlations, and you want to also see, what kind of data do you have? Do you have text? Do you have numbers? Do you have categorical features? Based on that, even when you're pre-processing your data in that stage, you want to know, what do you have to do? What do you have to apply? That makes all the difference, just looking at your data and seeing, what is it? What is there?
Kirill Eremenko: Yeah, okay. When you say explore, what do you mean by that?
Edis Gonuler: Just create graphs, create charts, create heat maps to visualize the correlation between two features. There are great libraries for these, like Matplotlib, Seaborn is a really good one that I use. It helps you a lot in the other steps in building your model, in pre-processing your data because you get to see all your data at once.
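[Editor's note: the correlation exploration Edis describes is a one-liner in pandas, and feeding the result to Seaborn's `sns.heatmap(corr, annot=True)` renders the heat map he mentions. The numbers below are invented, deliberately anti-correlated toy data:]

```python
import pandas as pd

# Toy data standing in for the competition features.
df = pd.DataFrame({"price": [10, 20, 30, 40],
                   "items_sold": [8, 6, 4, 2]})

# Pairwise correlation between every pair of numeric columns.
corr = df.corr()

print(round(corr.loc["price", "items_sold"], 2))  # -1.0: perfectly anti-correlated
```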
Kirill Eremenko: Okay. All right, so you got to look for these correlations beforehand, and then you know better which features will likely be most important for your neural net, or which features you can engineer that will be useful. Tell me then, you said you used an ensemble of neural nets. How does that work? What were the differences? Was it the same neural network, a typical ensemble method where you just feed different subsets of the data to the different neural nets, and then you average out their responses? Or were the neural nets themselves different within the ensemble?
Edis Gonuler: Yeah, like you said, there's two ways to do that. I picked the first way that you said. What I did was I created four or five different neural networks. I trained them all with the data, and then I averaged them, and used a linear regression to pick the output. I think that when you average them, it's really good because each model can have a different bias towards a feature. When you average them, it doesn't cancel out, but it gives a new perspective. I mean, these are just numbers, so "perspective" sounds wrong, but that's really what it is.
Kirill Eremenko: Interesting. Okay, so it was the same neural network, a copy/paste five times of the same neural network. Right?
Edis Gonuler: I did play around with some of the neurons, and the way they were placed. But overall, it was about the same type of neural network.
Kirill Eremenko: Okay. The same type neural network, but did you feed in different subsets of your training data to them?
Edis Gonuler: Yes, I split it into four or five different subsets, and I gave each neural network one of those subsets, trained it on that, and then validated it.
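[Editor's note: the bagging-style setup Edis describes, the same model trained on different subsets and the predictions averaged, can be sketched without any deep-learning library. The "models" below are hypothetical stand-ins for his trained networks, each carrying slightly different noise from its own subset:]

```python
import random

random.seed(0)

# Stand-ins for five copies of the same network, each trained on its own
# subset of the data and therefore slightly biased in its own direction.
def make_model(noise):
    return lambda x: 3.0 * x + noise  # hypothetical learned function

models = [make_model(random.uniform(-0.5, 0.5)) for _ in range(5)]

def ensemble_predict(x):
    """Average the five models' predictions, as in the competition."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# The individual biases partly cancel, so the average lands near the truth (30.0).
print(round(ensemble_predict(10.0), 2))
```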
Kirill Eremenko: Interesting. How big was the training dataset?
Edis Gonuler: I can’t remember off the top of my head, but it was about 100,000.
Kirill Eremenko: Okay. So about 20,000 each.
Edis Gonuler: Yeah.
Kirill Eremenko: Were the predictions of the neural nets drastically different?
Edis Gonuler: They weren't drastically different. For some of the items that I predicted, because these were item counts, they were in the one to 20 range, so they weren't drastically different. But I would say averaging them, even if it's by a little bit, improves the score, improves the loss function.
Kirill Eremenko: Yeah.
Edis Gonuler: It just helps a lot.
Kirill Eremenko: Ensemble methods are great in general. I find they're much more stable than if you use a single model, because, as you said, a single one can be biased towards one feature in a certain direction. Whereas if you have ensemble methods, you're in a way leveraging the law of large numbers. It's like a democracy, you are less likely to make an error. Even if one of them makes an error, on average, they average out.
Edis Gonuler: Yeah.
Kirill Eremenko: That’s very cool. Okay, so why do you choose that approach rather than having five radically different neural nets, and feeding all of them the same data?
Edis Gonuler: I thought that this would keep me more organized, and when I thought about it, I thought that this would be the best way. Probably back then I didn't know that there was another option, so I just decided to do a few neural networks with the data, and then just average them. That would be the output, that would be my prediction.
Kirill Eremenko: Yeah. Got you. That is the more classic approach, because that's the same approach you'd use with decision trees to create a random forest. Okay. What did you use? I'm just curious, because an ensemble of neural nets would require quite a bit of computing power. Did you use Amazon, or did you use your own computer? What did you use?
Edis Gonuler: I definitely did not use my own computer. That would have been almost impossible. I remember training a few on Kaggle's kernels, on their notebooks. I believe I also used Google's platform, Google Colab. I changed a few things here and there, and then at the end, I just wrote them down to one file on my computer. I actually averaged them out on my own computer.
Kirill Eremenko: Gotcha. Okay.
Edis Gonuler: [crosstalk 00:50:12].
Kirill Eremenko: Nice, very nice. Awesome. Well, a very creative and unconventional approach to this problem. I haven't heard of somebody using an ensemble of neural nets. But I guess we're entering the world of unlimited computing power, where it's going to be easier and easier to create crazy things like that. That's exciting.
Edis Gonuler: Yeah.
Kirill Eremenko: Thanks a lot for this discussion on neural nets. I think that was very useful.
Edis Gonuler: Yeah, thank you.
Kirill Eremenko: Yeah, I haven’t had this on the podcast yet where we dove so deep into it. That was, for me, even great to refresh on these things. I want to change gears a bit and talk a bit about motivation. Where do you get the motivation at 16 years old, right, to-
Edis Gonuler: 15.
Kirill Eremenko: 15. Sorry, man. At 15. It still doesn’t sound right in my head. At 15 years old, to go and keep plowing ahead, waking up early, watching videos, coming home from school, watching course videos again. Participating in Kaggle competitions, going to conferences, reading books about mathematics. Your friends are probably out there, I don’t know, playing sports, or partying, or other things. How do you keep yourself motivated?
Edis Gonuler: Well, when you have fun, it’s really easy to keep yourself motivated. When I’m doing data science, I find that playing sports doesn’t make me happy. Partying doesn’t either, but when I’m doing these data science projects, I see myself, I’m happy, I’m motivated, I’m just there, and I’m willing to put in the time, and effort, and I think that’s the biggest thing. Motivation defeats all the obstacles, and if you use motivation, and if you put in the time, the effort, then all the obstacles will turn into a valuable learning experience. Yeah.
Kirill Eremenko: Okay. Okay. Very, very true. You do go outside sometimes, right? Because it’s-
Edis Gonuler: Yeah, yeah. Definitely.
Kirill Eremenko: Okay.
Edis Gonuler: Definitely.
Kirill Eremenko: All right. Okay, so you're just having fun. Did you start having fun right away? How did you find that fun in artificial intelligence, and neural networks, and machine learning? Because, to be honest, a big part of it is quite tedious data preparation, or, even more, turning mathematics into code. How did you find the fun in it?
Edis Gonuler: Yes. When I first began, it wasn’t as fun as it is now. Like you said, the data preparation is a little bit sluggish, and it’s not as fun. But I always had the end goal in my mind that if I learned this, and I get to the models, and I get to the neural networks, then that’s when it starts to be cool. I was like, “Okay, it’s coming, it’s coming. The next thing that I’m going to do, it’s going to be fun, it’s going to be cool.” When you put that in your mind, you’re willing to just learn these things fast, and then get to the cool part, get to the exciting part, and just do projects.
Kirill Eremenko: Yeah. You promised yourself it would be fun, and it turned out to be fun.
Edis Gonuler: Yeah, and it turned out to be actually more fun than I expected. I was always saying, “This is the future. If I want to learn something, I have to learn this.” It just kept pushing me. I kept pushing myself, and my family was also very supportive of me. But I kept pushing myself to learn, and to stay motivated, and to put in the time and the effort. Now, I see that all the work that I put in pays off, because now I’m actually having fun. I’m creating really cool things.
Kirill Eremenko: That's very cool. At DataScienceGO, you asked me one question that put me on the spot. I couldn't come up with an answer for you. The question was, what is the next step for you? I was lost for words, because at this age, you've already progressed a lot, and at the same time, you still have to finish school.
Edis Gonuler: Thank you.
Kirill Eremenko: You're welcome. Thank you for coming to DataScienceGO, and chatting with me. It was really inspiring. But basically-
Edis Gonuler: [inaudible 00:54:25].
Kirill Eremenko: … you've progressed so much ahead, and at the same time, you still have to finish school, you have to finish uni. I can't give you the normal career advice that I give people, because those people are usually out of university, and they are looking for careers. In your case, as I understand, you probably still want to go to university, and I connected you with a few people, or I think you met a few people there at DataScienceGO, who maybe helped with some advice. Have you found an answer to that question over the past couple of months?
Edis Gonuler: Overall, I did. Thank you so much for helping, and everyone at DataScienceGO that talked to me and gave me advice. I realized that I learned a few things at DataScienceGO. The first one I've mentioned over this podcast: doing projects. Projects helped me learn, and I talked to a lot of people there. They said, "Just do projects, keep learning, keep doing cool things." That's really been my goal. Ever since I came home to New York from DataScienceGO, I started an AI club at my school.
Kirill Eremenko: Wow. Congrats. That’s awesome.
Edis Gonuler: Thank you. It’s twice a month, and it started a few weeks ago. Yesterday was our second meeting. We just talk about AI and a little bit of teaching how AI works, all the different things. Next week, hopefully, we’re going to start with our first algorithm, linear regression, starting to predict house prices.
Kirill Eremenko: Nice. Very cool.
Edis Gonuler: Yeah.
Kirill Eremenko: Okay. How many people attend your meetup?
Edis Gonuler: About 25 people.
Kirill Eremenko: Wow.
Edis Gonuler: It's after school. Yeah. It's a great turnout. I didn't expect that many. But a lot of people were interested, and I called it AI Club, not Data Science Club, because I realized if I called it Data Science Club, people wouldn't know what that is. But if I called it AI Club, then there would be more people.
Kirill Eremenko: Nice. That’s a great way to give back to your-
Edis Gonuler: [crosstalk 00:56:14].
Kirill Eremenko: … community, to your peers to help others get in the space. It’s awesome.
Edis Gonuler: When I went to DataScienceGO, there was a huge community of data scientists, and I realized that I missed that when I came home. There aren't a lot of people to talk to about all these things, like neural networks, all the problems, and everything. That was my motivation for starting AI Club. I realized that if there was a community of people that want to learn this, then there's always someone to talk to.
Kirill Eremenko: Yeah, wow. That's very true. For me, every time, DataScienceGO is a really cool place to see everybody get together, and see how inspired, and creative, and sharing everybody is with each other. I love that, so I totally get your feeling. You want to recreate that in your hometown, so you can have that not just once a year, but every week, or every second week. Huge congrats to you. That's a massive step forward.
Edis Gonuler: Thank you. Thank you.
Kirill Eremenko: Okay, so doing projects, and growing as part of a community, helping your community grow. Those are your next steps, and yeah, sounds really cool. Are you excited?
Edis Gonuler: I’m very excited. I’m very excited for the future in AI Club, progressing myself, doing projects. Yeah. I’m really motivated, and I’m just ready to work.
Kirill Eremenko: Fantastic. Well, Edis I wish you the best of luck with all these things. It sounds like such a great journey [inaudible 00:57:46]. Yeah, and thanks so much for coming on the podcast. This has been a great-
Edis Gonuler: Thanks for having me.
Kirill Eremenko: … great, great episode. Before I let you go, what’s the best way for listeners to get in touch, contact you, follow your career, and basically find you online?
Edis Gonuler: Yeah. My email, you can email me, my email’s edis@gonuler.com. I also have a LinkedIn. If you want to text me or message me, you can do it from there.
Kirill Eremenko: Got you. Well, we’ll include the LinkedIn in the show notes for sure, and people can find you there. Highly recommend to connect with Edis, and see how his career progresses. Okay, and one final question for today, you read a lot of books, so what’s a book that you can recommend to our listeners, something that’s inspired you to take your career further?
Edis Gonuler: I had two books in mind, and they cover two different things. My favorite book for the mathematical side of data science is The Hundred-Page Machine Learning Book. I think that it's really good, and from scratch, you can learn a lot of the mathematical theory behind it.
Kirill Eremenko: Very cool.
Edis Gonuler: Just an overall awesome book. The second book is more of a … It’s a really good book as well. I think you know it. It’s Confident Data Skills.
Kirill Eremenko: Thanks, man.
Edis Gonuler: I read it. I picked it up at DataScienceGO, and I started reading it. I'm almost done with it, and it's just a great overall book. There's a lot of great concepts, and a lot of great charts and graphs that I enjoyed looking at and researching.
Kirill Eremenko: Fantastic. Thanks for the shout out.
Edis Gonuler: [crosstalk 00:59:26].
Kirill Eremenko: That’s really cool. Actually, as we speak, I have on my computer open, the files for the second edition. I’m working on the second edition.
Edis Gonuler: That’s great.
Kirill Eremenko: Yeah.
Edis Gonuler: That’s great.
Kirill Eremenko: I think it’s already announced on Amazon, but it’s coming out mid next year, so very excited about that. I’m glad.
Edis Gonuler: I’m excited too. You got me excited.
Kirill Eremenko: Man, I’m glad you’re getting some value, and another book too, The Hundred-page Machine Learning Book. I haven’t read it myself, but I’m sure there’s a lot of value in that. Once again, Edis, thanks for coming on the show, and can’t wait-
Edis Gonuler: Thank you for having me.
Kirill Eremenko: Yeah. My pleasure. My pleasure. Can’t wait to catch up again sometime soon and see where your career, and your learning, your projects take you.
Edis Gonuler: Yeah, that would be great. Thank you.
Kirill Eremenko: All right. See you. There we have it, ladies and gentlemen. That was Edis Gonuler, aspiring data scientist. Somebody who is very young, but is already powering through into the field of data science and artificial intelligence. I really hope you found this episode as inspiring as I did. Make sure to connect with Edis. I'm going to leave his LinkedIn in the show notes. Personally, for me, everything was really cool, but I'll say two things. The first was that he codes models from scratch, which was very surprising. But the most inspiring was the fact that at the start, he didn't find data science and AI fun, but he promised himself they'd be fun, and they became fun, and became even more fun than he thought they would be.
Kirill Eremenko: I think that's a very cool psychological technique any one of us could use in order to get more into something that we really believe in, whether it's data science, AI, a hobby, an interest, or a profession. Promise yourself it will be fun, work for it to become fun, and then assess in a bit of time. I think that was a really cool tip. As usual, you can find the show notes for this episode at www.superdatascience.com/317. There, you will find a URL to Edis's LinkedIn, any materials that we mentioned on the show, plus the transcript for this episode.
Kirill Eremenko: As mentioned, make sure to connect with Edis and follow his career. I can already tell he's going to have a very bright future if he keeps going the way he is going now, and I'm personally very excited to see where it's going to take him. If you want to meet Edis and lots and lots of other inspiring and talented data scientists in person, make sure to get your tickets to DataScienceGO, and we'll see you there. Until next time, happy analyzing.