Kirill Eremenko: This is episode number 309 with the legendary data science instructor, Jose Portilla.
Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur. And each week, we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today, and now let’s make the complex simple.
Kirill Eremenko: This episode is brought to you by my very own book, Confident Data Skills. This is not your average data science book. This is a holistic view of data science with lots of practical applications.
Kirill Eremenko: The whole five steps of the data science process are covered from asking the question to data preparation, to analysis, to visualization, and presentation. Plus, you get career tips ranging from how to approach interviews, get mentors and master soft skills in the workplace.
Kirill Eremenko: This book contains over 18 case studies of real world applications of data science. It covers algorithms such as Random Forest, K Nearest Neighbors, Naive Bayes, Logistic Regression, K-means Clustering, Thompson sampling, and more.
Kirill Eremenko: However, the best part is yet to come. The best part is that this book has absolutely zero code. So, how can a data science book have zero code? Well, easy. We focus on the intuition behind the data science algorithms, so you actually understand them, so you feel them through, and the practical applications. You get plenty of case studies, plenty of examples of them being applied.
Kirill Eremenko: And the code is something that you can pick up very easily once you understand how these things work. And the benefit of that is that you don’t have to sit in front of a computer to read this book. You can read this book on a train, on a plane, on a park bench, in your bed before going to sleep. It’s that simple even though it covers very interesting and sometimes advanced topics at the same time.
Kirill Eremenko: And check this out. I’m very proud to announce that we have dozens of five star reviews on Amazon and Goodreads. This book is even used at UCSD, University of California San Diego to teach one of their data science courses. So, if you pick up Confident Data Skills, you’ll be in good company.
Kirill Eremenko: So, to sum up, if you’re looking for an exciting and thought provoking book on data science, you can get your copy of Confident Data Skills today on Amazon. It’s a purple book. It’s hard to miss. And once you get your copy on Amazon, make sure to head on over to www.confidentdataskills.com where you can redeem some additional bonuses and goodies just for buying the book.
Kirill Eremenko: Make sure not to forget that step. It’s absolutely free. It’s included with your purchase of the book, but you do need to let us know that you bought it. So, once again, the book is called Confident Data Skills and the website is confidentdataskills.com. Thanks for checking it out, and I’m sure you’ll enjoy.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast. Ladies and gentlemen, super pumped to have you back here on this very special episode of the SuperDataScience Podcast, because today, we have none other but the legendary data science instructor, Jose Portilla.
Kirill Eremenko: Very interesting episode. You’re probably wondering why we recorded it together since we’re direct competitors in the online education space in data science. Well, we’ll answer that question for you right at the start of the episode. And we thought you’d be interested to have us both in the same room talking about your favorite topics such as AI, data science, and the future of the world.
Kirill Eremenko: So, in this episode, you will hear about neural networks that create other neural networks, how that all works and what that means for data scientists. How to manage and lead a community of over a million students.
Kirill Eremenko: The question that Jose gets asked the most, as you can imagine with such large communities, we get hundreds, I think it’s like 500 or so questions per day that are asked in our courses. And here, you’ll find out what is the most asked question for Jose and how he answers it.
Kirill Eremenko: You’ll also hear about the pyramid of learning and what the pinnacle of learning is: what you need to do in order to know that you have indeed mastered a topic. And finally, we’re going to have a very interesting debate about artificial general intelligence.
Kirill Eremenko: I really enjoyed chatting to Jose and I can’t wait for you to hear this podcast. So without further ado, I bring to you the legendary data science instructor, Jose Portilla.
Kirill Eremenko: Welcome back to SuperDataScience Podcast ladies and gentlemen, super special guest on the show with me today, Jose Portilla. Jose, how are you going?
Jose Portilla: Good. Good to be here. You said it right in front of me as if there was an audience, but we’re in an empty room.
Kirill Eremenko: I know. You got to do it. Man, where are we, Jose?
Jose Portilla: We are at Udemy LIVE in Berlin, in Germany.
Kirill Eremenko: In Berlin. Out of all places …
Jose Portilla: I know, right?
Kirill Eremenko: … and we’re back in Berlin. What a great party last night.
Jose Portilla: Yeah. It’s fantastic. You had the Birddogs, was it?
Kirill Eremenko: Birddogs. Yeah, if anybody is interested in some cool cover band in Berlin, check out Birddogs. They were epic.
Jose Portilla: Yeah, they’re fantastic.
Kirill Eremenko: Yeah, Udemy knows how to throw a party.
Jose Portilla: That’s very true.
Kirill Eremenko: Yeah. Like a lot of food, a lot of drinks and the excursion on Friday was really cool.
Jose Portilla: Yeah, the boat tour and then the Boros Collection and then all that stuff.
Kirill Eremenko: In the bunker that’s above ground.
Jose Portilla: I was a little more interested in the building than the … I don’t know. What do you think of …
Kirill Eremenko: It really was cool.
Jose Portilla: This is getting off topic, but what did you think of the artwork?
Kirill Eremenko: The artwork, I never understood contemporary art. Like post modernism. But, what I really liked in this tour was that they explained it, and that allowed me … For example, that one of the trampoline and the arrow, and the horse. Compared to Picasso, that took five minutes to put together.
Kirill Eremenko: Maybe it took ages for the person, but it’s not … You can’t really compare that to classic art. It’s just different realms. But the way they explained it is that it’s not about what the artwork is, it’s about what it represents, what the person was thinking and kind of like the idea that they’re provoking you to think about.
Kirill Eremenko: And when you think about it that way, it’s like somebody writing down an idea with pen and paper. But here, they’re just doing it with like sketches or household goods or whatever else. And in that way that for me, that was much easier to accept. Yeah. So in that sense, I like the explanations. What about you?
Jose Portilla: That’s actually the thing I dislike the most about it. In my opinion, it’s like, if your artwork is so reliant on a third party having to explain your thesis behind the artwork, maybe the art is not the best manifestation of trying to get your message across. Maybe you should just be writing a paper on whatever topic and it may be clear to more people. But some of them were like crazy. Like the one of the images of the houses. Remember the 9×9?
Kirill Eremenko: Oh, yeah.
Jose Portilla: So, the thing was this … I guess to explain it to the listener a little bit. Apparently, there used to be this old German company that would fly around in a helicopter, take aerial photos of your home, then go door to door and try to sell you an aerial photo of your house. And apparently, it was very popular in the ’70s to have a little aerial photo of your house. And then Google Maps comes along and they go out of business.
Jose Portilla: So, they have 30,000 essentially stock images that they did not end up selling, because not everybody wanted to buy a picture of their house. And then they gave it to this artist and he manually, instead of using a convolutional neural network or some filter, just looked for patterns. So, then he gets like the nine images where everyone is washing their car, or the nine images where all the windows are boarded up in these houses.
Kirill Eremenko: Yeah. And puts them into one big frame or the entire collection near each other, and then you have to guess the name like car wash.
Jose Portilla: Yeah, you have to figure out what’s the same or similar track between all these paintings and images.
Kirill Eremenko: Yeah, definitely some interesting ideas. But fair point on maybe it’s not the best way if you need explanation. Speaking of the building. So, it’s a bunker above ground from World War II, with two or three meter thick ceilings and walls. Did they tell you in your group that the bunker, like, there’s actually a living residence above …
Jose Portilla: Yes. The whole building is insane. Because you look at it from the outside. Yeah, it’s like concrete, very industrial or brutalism looking. And I thought to myself like, “That’s weird, bunkers are usually … I thought they were underground.” And I was like, “I’m surprised this could have survived the Berlin bombings.”
Jose Portilla: And then you go inside and they show you how thick the walls are. You’re like, “Oh, this could survive anything, because they’re hugely thick.” Yeah. Then later in the tour, the owners of the collection live on the top floor of this bunker. It’s so weird.
Kirill Eremenko: And they explained to us how they managed to do that because in Berlin, you’re not … You want to tell that story?
Jose Portilla: You probably remember better than I do. Some weird legal thing, right?
Kirill Eremenko: Yeah. In Berlin, you’re not allowed to legally add an extra floor on top of a building that already exists. And this was a bunker. They don’t want to live in the bunker. They wanted to live, add a floor on top. But the legal loophole was that bunkers … This building doesn’t fall under the classification of a house.
Kirill Eremenko: It falls under the classification of a bunker, and bunkers are normally underground, so everything that we see above ground in this case is considered the basement. Basement one, basement two, basement three, basement four. So, they were like, “Oh, we got to add a top level. We kind of live in a basement.” Something like that. So, that was really fun. So, are you having a good time in Berlin overall?
Jose Portilla: Yeah, it’s been great to … I’ve just been traveling around Europe for this. Yeah. So, it’s been nice to get to see everybody.
Kirill Eremenko: Very nice. Yeah. Well, today’s podcast. First of all, some of our students who know us both will be …
Jose Portilla: Their minds blown that we’re talking to each other.
Kirill Eremenko: Yeah. It would be like thinking, “What? Did the world turn around?” Because we are apparently like … Well, we’re competitors. We compete. Fierce competitors at each other’s throats. So, how would you explain that? Why are we even talking, let alone recording this podcast, if we’re such fierce competitors?
Jose Portilla: It’s so funny. Well, we’ve had this conversation multiple times, but everyone from the outside thinks like one of us has to die in order for the other to survive.
Kirill Eremenko: Hunger Games. Yeah.
Jose Portilla: Yeah. Exactly. But if anything, it’s the opposite. Specifically like at Udemy where … I don’t know. Some people think like, “Oh, you probably wish that your competitors come up with really bad courses or something. That way your courses can reign supreme.”
Jose Portilla: When, in fact, the opposite is true, because the worst thing that can happen to me is that a popular competitor releases a bad course. Because then students think, “Oh, even just online education in general, it wouldn’t be that great.”
Jose Portilla: Suddenly, it becomes a reflection of not just one course but their entire online learning experience. So, one of the best things that can happen to me is to have a competitor like you with good content. And then it’s like I was telling you earlier, buying a course is not like buying a car where you buy one car and then you’re not buying a car until much further into the future.
Jose Portilla: It’s more like buying a book on a topic you like. You’re going to buy multiple books by multiple people. So, the best thing that can happen to me is to have a competitor with a good course, which engages a student who then says, like, “Oh, I can actually learn some of this complex stuff online. Let me go check out other courses, etc.” So, yeah, it’s not some Hunger Games situation.
Kirill Eremenko: Yeah. For sure. And also, engage one course, tell my friends about it. They’ll come and different people like different styles. You and I have different styles of teaching, inevitably.
Jose Portilla: Yeah.
Kirill Eremenko: Everybody is unique and somebody might prefer the way you explain something. Somebody might prefer the way I … Or somebody might benefit from both.
Jose Portilla: I was just about to say, different people like both styles. I would say that the Venn diagram of our crossover students is huge.
Kirill Eremenko: Yeah.
Jose Portilla: Yeah.
Kirill Eremenko: For sure. And also, what I like about the competition is it doesn’t let you slack off. I mean either of us, because we hold each other up to a standard. If there was just one of us, then the standard drops. You, first of all, might not notice that your standard of teaching is dropping. Students might not notice, because they have nothing to compare it to. And then you won’t be incentivized to improve. I like this, that I can’t let my standards drop, you can’t let your standards drop, because of the nature of the competitive market.
Jose Portilla: Yeah. The quality of courses is getting better. I don’t know if you remember your first course. Ever look back at it?
Kirill Eremenko: Yeah, I have. Oh, it was so [crosstalk 00:13:40].
Jose Portilla: I’m so embarrassed how bad my first courses are.
Kirill Eremenko: I know. It’s like night and day.
Jose Portilla: Yeah.
Kirill Eremenko: I do appreciate the effort I put in. Listening back to it, it took so much courage to start recording.
Jose Portilla: I know. I believe we might not even start doing it, right?
Kirill Eremenko: Yeah.
Jose Portilla: Because you’re by yourself in a room recording it not knowing if anyone will ever even view this. And I’m shocked that … I don’t know, it’s almost like a different person made that course, because it’s like, “I can’t believe I did this.”
Kirill Eremenko: Oh, I can be so grateful to that person who I was for making that leap. That was good. Okay. So now, move that out of the way. I don’t know, let’s maybe talk about what are some of the recent trends, some of the recent things that you’re seeing in the data science AI industry that you’re creating courses on that students are excited about?
Jose Portilla: Well, let’s see. Recent trends. There’s always new updates to the various deep learning libraries. So, like TensorFlow 2.0 just came out. Like just, just came out. Maybe like a couple weeks ago maybe.
Kirill Eremenko: No. I think it was in June.
Jose Portilla: Well, that was the beta or alpha, right?
Kirill Eremenko: Oh, yeah.
Jose Portilla: I mean, the official 2.0 release was pretty recent. And then PyTorch 1.0 also came out really recently.
Kirill Eremenko: Okay. Very cool.
Jose Portilla: So, those are some new things. New libraries are always being developed. Maybe this is not such a new trend, because I recently saw the publication date of this paper, but something I just recently found out about was neural architecture search, or NAS, by Google, or Google AI, where they’re basically using recurrent neural networks to create or search for optimized architectures for different problems. So, like the CIFAR-10 dataset, with the 32×32 colored images of 10 different topics like planes, frogs, whatever.
Kirill Eremenko: 60,000 images there.
Jose Portilla: Yeah. What they’re doing is they’re basically deciding that humans, since we design everything in a very structured way, like convolutional neural networks are very structured with the kernels, everything is still kind of squared, connected. That perhaps there is some more organic, more optimized connection.
Jose Portilla: So, they’re using a recurrent neural network to actually build the architecture of another network to solve for the CIFAR-10 dataset problem. And they were able to actually improve the performance quite a bit from whatever the state-of-the-art convolutional neural network could do.
Jose Portilla: And this is with a network of essentially what looks like to the human eye randomized connections, and they can even skip layers and stuff. And so that one really blew my mind, to the fact that I used to think like, “Oh, the future is like recurrent neural networks or the future is convolutional neural networks.” When probably the reality is the future is some unknown random network that another network has figured out. That’s almost like the … What is it? Like the I am Robot or …
Kirill Eremenko: I am Robot. Yeah.
Jose Portilla: Yeah, the [crosstalk 00:16:53].
Kirill Eremenko: I, Robot.
Jose Portilla: I, Robot? Where you have robots building robots. Now, we have neural networks building other neural networks.
Kirill Eremenko: That’s really cool. And then you can go deeper than neural networks building neural networks [crosstalk 00:17:05].
Jose Portilla: Yeah. Then the other thing then is like a loop. Almost like have a neural network build a neural network for finding neural networks. What’s the most optimized thing? Yeah, that one really blew my mind, because it really showed that the shape of the actual network seems to have quite a bit more importance than the weights.
Jose Portilla: And it’s not something I think … Well, this was published in 2017. Now, people, I’m sure, are really thinking about it. But definitely just five years ago, I don’t think that many people were thinking about if a randomized neural network would actually perform better than a structured one, given the same randomized initialization of weights.
Kirill Eremenko: Yeah. Interesting, because you sent me that paper. I had to look through it. First of all, I was surprised. I was like, “Yeah, this is 2017.” But still, it also, as you said, blew my mind that you have, from scratch, this neural network that they created to create new neural networks was building them from absolutely zero and outperforming by a small margin, like 0.09% better performance and 1.05x faster than the human ones, but still outperforming them on the CIFAR, right?
Jose Portilla: Yeah, CIFAR-10. Yeah.
Kirill Eremenko: CIFAR-10 dataset. That was really cool. The way I understood it, the way it works is it takes the neural network that it is building, or wants to build, and represents it as a variable length string. So, it puts it into a text string basically. The representation of the neural network. And then it iterates through that string through what was … Gradient descent, right?
Jose Portilla: Mm-hmm (affirmative).
Kirill Eremenko: To optimize for the accuracy of the image prediction. Is that about right?
Jose Portilla: Yeah, basically. Yeah. I think maybe the reason I found out about it so recently was I recently, this year for sure, even though maybe it was published two, three years ago, saw the pictures of the neural network architectures that the RNN was actually solving for.
Jose Portilla: And it was the weirdest looking. I mean, it looked like a little kid drew sloppy lines with random neurons everywhere. Nothing was even. You would expect maybe the RNN would find some sort of hidden structure, right? But it was just …
Kirill Eremenko: Unstructured.
Jose Portilla: Yeah, for better or for worse, it looks more like an organic brain. Like an actual biological brain, right?
Kirill Eremenko: That’s so cool. You got to send me those images, because [crosstalk 00:19:53].
Jose Portilla: Yeah, I’ll have to find them. You look at them and it’s like there’s no way this performs better than a structured network.
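A minimal sketch of the idea discussed above, under loud assumptions: the architecture is written out as a text string, a controller samples candidate strings, and the controller is nudged toward whatever scored well. The 2017 paper uses an RNN controller trained with a policy-gradient (REINFORCE) update and trains each candidate network on CIFAR-10; here the controller is just a table of independent logits, and `toy_reward`, the search space, and every name in the code are hypothetical stand-ins chosen for illustration.

```python
import math
import random

# Hypothetical search space: each layer picks a filter count and a kernel size.
FILTERS = [16, 32, 64]
KERNELS = [1, 3, 5]
NUM_LAYERS = 3


def encode(arch):
    """Serialize an architecture as a text string, e.g. '32x3-64x5-16x1'."""
    return "-".join(f"{f}x{k}" for f, k in arch)


def softmax(logits):
    """Turn a list of logits into probabilities."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(arch):
    """Stand-in for validation accuracy (a real search trains the child network).

    Assumption for illustration only: 64 filters and 3x3 kernels score best.
    """
    score = sum((0.6 if f == 64 else 0.2) + (0.4 if k == 3 else 0.1)
                for f, k in arch)
    return score / NUM_LAYERS


def search(steps=3000, lr=0.1, seed=0):
    """Sample architectures and reinforce the choices that beat the baseline."""
    rng = random.Random(seed)
    options = [FILTERS, KERNELS] * NUM_LAYERS   # one decision per slot
    logits = [[0.0] * len(opts) for opts in options]
    baseline = 0.0                              # running average of rewards
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        picks = [rng.choices(range(len(p)), weights=p)[0] for p in probs]
        values = [opts[i] for opts, i in zip(options, picks)]
        arch = list(zip(values[0::2], values[1::2]))
        advantage = toy_reward(arch) - baseline
        baseline += 0.1 * advantage
        # REINFORCE: d log p(chosen) / d logit_j = 1[j == chosen] - p_j
        for row, p, chosen in zip(logits, probs, picks):
            for j in range(len(row)):
                row[j] += lr * advantage * ((j == chosen) - p[j])
    # Read out the most likely choice for every decision.
    best = [opts[max(range(len(row)), key=row.__getitem__)]
            for opts, row in zip(options, logits)]
    return list(zip(best[0::2], best[1::2]))


if __name__ == "__main__":
    found = search()
    print(encode(found), toy_reward(found))
```

The string produced by `encode` plays the role of the variable length description mentioned in the conversation; swapping `toy_reward` for an actual train-and-validate step is what makes the real method so expensive.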
Kirill Eremenko: Have you ever seen those images of when certain parts of a building or an airplane instead of a human designing them, they get an AI to build it?
Jose Portilla: Oh, yeah.
Kirill Eremenko: Through reinforcement learning. And it’s completely weird, completely random. Like simple parts that hold … You know that part under a table that holds the legs of the table to the main part of the table?
Jose Portilla: Yeah.
Kirill Eremenko: Like 90 degree type of angle metal thing. Like if you get an AI to design, it looks completely randomly weird. And it’s like 30% lighter, 100% stronger, less material required. It looks very organic.
Jose Portilla: Yeah, I remember I was once in a museum. And one of the exhibits was an antenna that was designed by a … It wasn’t technically AI. It was like a genetic algorithm that kept trying to solve for what kind of antenna could get the strongest signal.
Jose Portilla: And the antenna looks so weird. It looked like a string of spaghetti like floating in space or something. And it was like, “Yeah, this is what the algorithm figured would get the strongest signal in this particular spot.”
Jose Portilla: And it just goes to show that it’s really hard to have intuition for some of this stuff. And it kind of makes sense. I don’t know. The more you study evolution and biology, certain animals are super weird. Like you see a platypus, or like a squid has a beak like a bird. It’s so bizarre, but I don’t know. Nature is essentially a really long reinforcement learning algorithm, where it’s like many, many generations, what works, what doesn’t work.
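To make the "many generations, what works, what doesn't" loop concrete, here is a small genetic algorithm sketch. It is not the antenna design system from the exhibit: the genome is a hypothetical list of six bend angles, and `fitness` is a made-up stand-in for a signal-strength simulation (it simply rewards closeness to an arbitrary hidden shape). All names and parameters are assumptions for illustration.

```python
import random

# Arbitrary "ideal" shape, unknown to the search; an assumption for this toy.
_HIDDEN_SHAPE = [0.3, -0.7, 0.5, 0.1, -0.2, 0.9]


def fitness(genome):
    """Hypothetical stand-in for simulated signal strength (higher is better)."""
    return -sum((g - t) ** 2 for g, t in zip(genome, _HIDDEN_SHAPE))


def evolve(generations=100, pop_size=30, genome_len=6,
           mutation_scale=0.1, seed=0):
    """Run selection, crossover, and mutation; return the best genome and history."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Selection: the fitter half survives unchanged (elitism).
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_len)       # one-point crossover
            child = mom[:cut] + dad[cut:]
            # Mutation: small random nudge to every bend angle.
            child = [g + rng.gauss(0, mutation_scale) for g in child]
            children.append(child)
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print("first generation best:", history[0])
    print("last generation best: ", history[-1])
```

Because the surviving half is carried over untouched, the best score can never get worse from one generation to the next; nothing in the loop asks the result to look tidy or symmetrical, which is one way to read why evolved designs look so odd.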
Kirill Eremenko: Yeah. But what I find interesting, I was also thinking about it just now, is that at the same time in nature, a lot of things are symmetrical.
Jose Portilla: Yeah, right?
Kirill Eremenko: As weird as they are, they’re symmetrical, but what AI designs most of the time is asymmetrical. There’s kind of a combination of both in nature.
Jose Portilla: Yeah. And then not to get too philosophical, but then you see certain numbers keep popping up in nature like pi or something.
Kirill Eremenko: Oh, the Fibonacci number.
Jose Portilla: Yeah. Or the fact that the definition of a normal distribution, the actual function for it, has pi in there. It blows my mind. How is this freaking number showing up everywhere and in things that you wouldn’t think it would show up? You wouldn’t think that a relationship of a circle would have much to do with a normal distribution. But then it happens [crosstalk 00:22:28].
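For reference, this is the function being described, along with the standard argument for where the circle sneaks in. The symbols $\mu$ (mean) and $\sigma$ (standard deviation) are the usual notation and are not from the conversation. The $\sqrt{2\pi}$ is the constant needed to make the area under the curve equal 1, and it falls out of squaring the integral and switching to polar coordinates, which is where the circle enters.

```latex
% Normal density with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Why \pi: let I be the area under the unnormalized standard curve
I = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx

% Square it and move to polar coordinates (x^2 + y^2 = r^2, dx\,dy = r\,dr\,d\theta)
I^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy
    = \int_{0}^{2\pi}\!\int_{0}^{\infty} e^{-r^2/2}\, r\,dr\,d\theta
    = 2\pi \cdot 1 = 2\pi

% Hence the normalizing constant
I = \sqrt{2\pi}
```

The $2\pi$ is literally the angle swept once around a circle in the $d\theta$ integral.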
Kirill Eremenko: And then everything follows. Like the heights of humans. I don’t know, populations of animals, bacteria. A lot of things are normally distributed in this world.
Jose Portilla: Yeah. I don’t know. There may be some deeper order to things that we’re just not getting, but yeah. Yeah, like I said, you see a platypus and you’re like, “There must be some random noise here.”
Kirill Eremenko: Crazy. All right. Well, shifting gears a little bit. You teach online. And by the way, congratulations on 1.2 million students.
Jose Portilla: Yeah. Well, congratulations to you too, to the both of us, I guess.
Kirill Eremenko: It’s crazy. How does it make you feel? 1.2 million students worldwide.
Jose Portilla: Oh, it feels bizarre. I remember thinking like a long time ago, like, “Man, when I hit 100,000 students, that will be it. I would have hit the ultimate goal.” And then you hit that, and then you’ve hit it too. And then you think, “Okay. 250,000 students, let’s really go for it.” Some crazy goal. Then you hit that and you’re like, “Oh, okay, half a million.” Yeah, it’s just been absolutely insane how fast everything has been growing just in a couple of years.
Kirill Eremenko: Yeah, it is very fast. We’re, I think, at 920,000 [crosstalk 00:23:45].
Jose Portilla: Yeah. I bet you, if we had this same conversation even just some weeks from now, you would have had a million as well.
Kirill Eremenko: Yeah. Probably.
Jose Portilla: Next time I see you, for sure, you’ll have at least a million, if not much more.
Kirill Eremenko: Yeah. That puts a lot of responsibility, right? You got to create the right content, the right guidance. It’s no longer just fun and games and just putting out there things that you’re passionate about. But you also got to think through what do people need? What do your 1.2 million students need? What are their requirements?
Kirill Eremenko: You got to think about the needs of the students. I guess my question to you is how do you go about that? How do you go about communicating with your audience and finding out what is it that you can help them with the most in this next stage of your journey?
Jose Portilla: That’s an interesting question. It’s almost like as we’ve been progressing through this online education world and this population of students, the analogies keep changing. So, at first, it was like, “Okay, I can structure myself as if I teach a course, like a classroom of 30 students.”
Jose Portilla: Then it starts getting too big. It’s like, “Okay. Well, now, I’ll structure myself like a seminar.” So, maybe I’ll have a set piece of notes for students like they would in a large seminar class. Less one on one interaction.
Jose Portilla: Then it starts getting bigger, it’s like, “Well, I guess now I’m structuring myself as a department in a college.” So now, I have TAs or something and much more standardized practice across multiple courses or something like that.
Jose Portilla: And then you’re structuring yourself as a university or something. So then now you have multiple departments of like, “Oh, Python topics, or R topics or Tableau topics, etc.” And then there’s some sort of structure within those, etc.
Jose Portilla: And now with our scale, it’s almost like the analogy becomes like a city or something. So, then you have to start thinking of … At this point, one-on-one interaction, as much as I love it, is kind of impossible. We can’t communicate with every citizen if we have a million people, right?
Jose Portilla: So, then you start trying to think of what does a city do. So, they may have meetups. So then we try to have different sources for students to interact with each other. That’s maybe a little more fluid. And this is something maybe you can have advice for me, because I know you’re probably better at this than I am.
Jose Portilla: But just trying to build that sense of community. Maybe off of Udemy, because the Q&A forums, for interaction purposes from one student to another, aren’t exactly optimized. First, we tried Slack. That quickly got unscalable, because we couldn’t pay for every student, and it deletes the history.
Jose Portilla: Then we tried Gitter, which is kind of like this Slack based off GitHub, but that was also starting to have scaling problems. And then we switched to Discord, which I hadn’t really heard of before until someone suggested it to me. And it’s like for online gaming. Do you know what I mean?
Kirill Eremenko: Yeah, I’ve heard of them.
Jose Portilla: Yeah. So, it’s a free version of essentially what Slack does. And so, so far, that’s what we’re using to try to help scale a sense of community. Yeah, and they can do things like … Well, like I said, you’re probably better at this than I am of things like a podcast or something, to build a sense of community or some sort of weekly updates, that kind of thing where …
Jose Portilla: You’re not going to be able to talk to each student, but at least you can try to encourage students talking to one another. So, I think as we scale larger, trying to encourage the student interaction is one of my priorities.
Kirill Eremenko: Yeah, I absolutely agree. I wouldn’t say much better than you at this. At SuperDataScience, we’re also exploring things. So, right now, we are trying out the Slack approach that you’ve already tried. We’re also considering an approach of forums, an approach of building our own system because our whole LMS at SuperDataScience, the learning management system is completely custom built by ourselves.
Jose Portilla: Yeah. [Murray 00:28:03] told me that.
Kirill Eremenko: And so we can add on whatever we want to just like … We just need to see that there’s a need for it and there’s time. But in general, I completely agree with you that as much as, well, I want to interact with everybody, I simply physically cannot do that. And therefore, putting people into groups to talk to each other, that’s the best.
Kirill Eremenko: I’ll give you an example. I was at DataScienceGO, the conference we run in San Diego. I was running a workshop on Tableau. And there’s, I think, like 60 people in the room. All different levels. And I said right away, “This is a workshop for beginners. If you’re advanced, there’s another workshop in this neighboring room about AI ethics. Go there. You get a lot of value out of that. This is a workshop for beginners.”
Kirill Eremenko: I think one person changed rooms. But still, there were a lot of different levels here. Very advanced people, beginner people. While we were going through these exercises on building this dashboard, some people were going really fast ahead and I thought, “What are you doing in this room? I told you go to the other one.” And they’re like, “Yeah. No, I just wanted to play around, see what the dashboard will be like, see what the dataset will be like.”
Kirill Eremenko: And so what I started doing is said, “All right, if you went ahead, like far ahead, why don’t you get up and help somebody who’s falling behind? There’s 60 people, which is not a million, but still, I can’t go help everybody.”
Kirill Eremenko: And so the more advanced people, like I remember specifically Jonathan and Ogo. If they’re listening to this, huge thank you for that. They just got up and helped out a lot of people. And there were others as well. And in that sense, nobody was bored. Everybody was keeping up.
Kirill Eremenko: And I think that sense of community is amazing in data science. Data scientists want to help each other. Our job is to facilitate that and find the best way. It looks like we’re both exploring to find like what is the right medium for this community to thrive.
Jose Portilla: Yeah. I don’t know. It almost sounds douchey to say this, but we really are pioneers in this space, because there’s no one else we can really talk to of like, “How do you deal with the community of students this large?” Where you don’t have some university or company level team to handle all of it. So, we have to explore these different methods.
Jose Portilla: And the other thing I was going to say about the students interacting with each other. I think students get a lot out of it as far as the … There’s some more official term for it, but like the pyramid of knowledge or the steps to really understanding a topic.
Jose Portilla: Like the very final step is teaching a topic. So, you know you understand something if you’re able to teach it. So, I think it helps the students to help other students because then they know that they really understand the topic if they’re able to help out another student in it.
Kirill Eremenko: That’s a great way of putting it.
Jose Portilla: Yeah. There’s some official term for this that someone will have to Google that there’s a hierarchy of understanding. And the very last or top level is the ability to teach it. It has some sort of proper noun name whoever discovered it.
Kirill Eremenko: Okay. Yeah, I think I’ve heard of this before, but it doesn’t come to the top of my head. But I agree with you. Yeah.
Jose Portilla: Although I teach stuff, I feel like I don’t understand crap. Even though I teach them.
Kirill Eremenko: Why did you love it?
Jose Portilla: Because it’s like a new thing. Every five seconds in this freaking field.
Kirill Eremenko: Oh, yeah.
Jose Portilla: But actually, I was going to say that might be one of the more positive aspects of the field we work in is that the libraries are so new sometimes. And because if you are the world’s expert in TensorFlow 2.0 and you are not a developer at Google that was actually working on it, the amount of total experience you can have at this moment in time is at most like one or two years, right?
Jose Portilla: Technically, it’s based on Keras, so you could kind of have more experience. But for something like PyTorch 1.0 as well, the most experience you could possibly have to be the world’s expert is just a few years versus like calculus or whatever. It was around since you were born, so you could have a lot more experience in it.
Jose Portilla: And I think because in this field, so many people remember what it’s like to be a beginner, because it was not that long ago that they were a beginner themselves just by the nature of the field. They don’t mind helping out, because it was not too long ago themselves that they knew nothing about like TensorFlow or PyTorch.
Jose Portilla: So, I think that definitely helps out. Just a sense of community that for whatever reason, data science and Python has, versus some other … Not to disparage other communities, but like …
Kirill Eremenko: Consulting.
Jose Portilla: Yeah, like consulting or some people in JavaScript or web development that’s been around much longer like HTML, CSS and JavaScript. There’s definitely an attitude of like, “Oh, you don’t get this? Whatever.”
Jose Portilla: Because they’ve had enough time with it like 15 or 20 years since web 1.0 that it’s probably faded from their mind of what it’s like to be a beginner versus Python and data science. That the libraries are constantly being updated, and there’s a new library every year so to speak. Everyone remembers what it’s like to be a beginner, so they don’t mind as much helping out.
Kirill Eremenko: Got you. Is your community mostly beginners? What did you …
Jose Portilla: That’s a great question of the general skill level, the community. It depends how you define beginner, because they come from all walks of life, right?
Kirill Eremenko: Yeah.
Jose Portilla: So, there’s people that, yeah, they’ve never programmed before. But they’ll have a PhD in …
Kirill Eremenko: Psychology.
Jose Portilla: Yeah, psychology or something. So, they’re not beginners in the sense that they’re beginners at learning, because this person is clearly able to be self motivated and teach themselves complex topics. It’s just that they didn’t take a Python class in university because it wasn’t taught there for them.
Jose Portilla: And then there’s other people that they already work at AWS or something or they’re already working at Google, and their boss just said, “Oh, I need you to learn this esoteric library in Python or R or whatever.”
Jose Portilla: And then they’re definitely not beginners and they … For them, it’s almost like they just need to pick and choose certain lectures from the courses of like, “Oh, let me quickly just learn this couple things my boss told me to learn.” I think, yeah, the majority of our “beginners” …
Kirill Eremenko: Like newcomers to data science.
Jose Portilla: Exactly, yeah. They’re not beginners in the sense that they don’t know anything. They usually have some sort of expertise in a field outside of data science or programming. And I think it kind of attracts that mind that you are already technically adept at something. It makes you interested in the possibilities of leveraging data science in Python with your current skillset.
Kirill Eremenko: Definitely. That’s something we’re also seeing, I think, because of all … Between 60% and 70% are newcomers to data science. Whether just college students or transitioning into data science. And then about 20 or so percent are more advanced practitioners. And then about 10% are managers, executives, entrepreneurs. But what I find interesting is that over time … Because we’ve been doing this for years. How long have you been teaching?
Jose Portilla: Since March 2015.
Kirill Eremenko: 2015. I started on Udemy in 2014, but in data science, it was, I think, June 2015. And so similar timeline, right?
Jose Portilla: Yeah.
Kirill Eremenko: And so over that … That was, what, four years. I’ve seen people grow from beginner to intermediate to almost advanced practitioner level. I’ve seen people get jobs and so on. And it’s really cool to see this growth and especially if you get to meet them in person. That is just fantastic.
Kirill Eremenko: They’re like, “Oh, I remember you three years ago, you were like asking these questions and you were just starting out into your journey or transitioning from whatever other career you had. And now, you’re a data scientist. You’re coaching others. People are asking you for advice.” That is so inspiring.
Jose Portilla: It blows my mind sometimes, like the careers that some of my students have been able to get. I was just talking to someone recently who ended up becoming a senior developer for AWS. I start thinking to myself, “Would I be able to get that job? I don’t think I would.” Given the interview process and how hard it is.
Jose Portilla: And they’re like, “Oh, thank you. Your course helped me with that so much.” I was like, “I don’t know if I could do your job.” It blows my mind when you see students getting jobs that like, “I don’t think, I would probably fail that interview if I wasn’t really practicing for it.” Yeah. So, it’s crazy the growth of the students and how fast everything has been going just in the past few years.
Kirill Eremenko: That’s absolutely true. What’s the most common question that students ask you?
Jose Portilla: Where do I find the notes?
Kirill Eremenko: You get like hundreds of questions. Like we both get hundreds of questions.
Jose Portilla: Yeah. Well, there’s certain questions that’s just like … It’s also a bit of a selection bias of the kind of person that asks a question on forums or something. It is usually a person who has not done a quick Stack Overflow search or something.
Jose Portilla: But beyond that, beyond little silly questions like that, maybe one of the most common questions I get is like, “How do I choose a machine learning model?” One thing I do is I point them straight to the … You know scikit-learn. They have their choosing an estimator diagram. It’s like this weird, ugly little bubble tree chart. That’s like, “Oh, if you have this many data samples, choose this. If you’re trying to do unsupervised or supervised, do that.”
Jose Portilla: So, I point at that chart, but then I also tell them that realistically, for some of these models, it’s difficult to have an intuition for them. Once you deal with them a lot, then you can be like, “Oh, I think you should do this. Blah-blah.” If you’re about to do an SVM, there’s not that many people that would have a strong intuition of what the exact C value or gamma value should be, right? They pretty much always just do a grid search. And the same for choosing a model. You usually run a couple and see what performs best or then make a combination of models.
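The grid search described here can be sketched with scikit-learn, the library mentioned in the conversation. This is only an illustration under stated assumptions: the iris dataset, the candidate values for C and gamma, and the 5-fold cross-validation are choices made for this example, not anything specified in the episode.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small built-in dataset, used purely for illustration (an assumption).
X, y = load_iris(return_X_y=True)

# Scale the features, then fit an RBF-kernel SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Candidate values are arbitrary; in practice the ranges are themselves guesses.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}

# Try every (C, gamma) pair with 5-fold cross-validation and keep the best.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

The point of the sketch is that nobody picks C and gamma from intuition: the search simply evaluates all 16 combinations and reports whichever scored best.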
Jose Portilla: And I think a lot of students sometimes go into it thinking like, “By the end of this, I will know exactly what model to choose in any situation.” When realistically, you’re still going to have to test out different models. And I think it’s hard to convey to students that even after you are extremely knowledgeable on this topic, when it comes to a new problem, you’re still just going to have to do what everyone else does, explore the unknown, not really know what’s the best model.
Jose Portilla: So, you can be the world’s top expert. At the end of the day, when it comes to a new problem, you’re still going to have to kind of guess and check almost. Which is kind of, to bring it back, exactly what that neural architecture search is doing, right? Keep guessing and checking until you find the good fit for the good model.
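The "guess and check" approach to choosing a model can be sketched as follows. This is a minimal example assuming scikit-learn is installed; the wine dataset and the three candidate models are arbitrary choices for illustration and do not come from the episode.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Built-in dataset chosen only to keep the example self-contained.
X, y = load_wine(return_X_y=True)

# A few candidate models; which ones to try is itself a judgment call.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score each candidate with the same 5-fold cross-validation.
scores = {}
for name, model in candidates.items():
    pipeline = make_pipeline(StandardScaler(), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()

# Keep whichever performed best on this particular dataset.
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Best on this dataset:", best)
```

A different dataset could easily produce a different winner, which is why the comparison has to be rerun for every new problem.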
Kirill Eremenko: Or what AutoML is designed to do.
Jose Portilla: Exactly. Yeah.
Kirill Eremenko: Do you think AutoML will replace data scientists?
Jose Portilla: That is such a good question, because I used to think like, “Oh, crap. Maybe we’re going to be out of a job.” Especially this robot building robots and models building models. What’s left for us?
Jose Portilla: I don’t know. I think what is defined as a simple problem keeps expanding as you go throughout time. Because something like a linear regression task many, many years ago, that’s [goaltending 00:39:37]. It’s just beginning to figure it out. That’s an extremely hard problem. How do I find the line of best fit?
Jose Portilla: Now, that’s an extremely easy problem. So, I don’t think it will replace data scientists or machine learning practitioners. It will just basically push them to harder problems and reclassify things as easily solvable problems or easier problems for something to be automated against.
Kirill Eremenko: Absolutely. And I think there’s always going to be room for human creativity in these aspects. At least for the next 10 years.
Jose Portilla: Yeah. Then you see the neural networks that are painting and the recurrent neural networks that are doing text generation, like character. I’m sure you’ve read that blog post of the unreasonable effectiveness of recurrent neural networks.
Kirill Eremenko: Oh, yeah.
Jose Portilla: [crosstalk 00:40:27]. And it’s writing out Shakespeare.
Kirill Eremenko: Yeah, that’s an old blog post book.
Jose Portilla: That’s a very old blog post.
Kirill Eremenko: Like 2015 or something. A really good one as well.
Jose Portilla: Fantastic. And it always blows my mind that the network is doing it character by character, not word by word. The fact that you can even read it blows my mind.
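To give a feel for what generating text "character by character" means, here is a small sketch. It is not a recurrent neural network: it swaps in a much simpler character-level Markov chain, which picks each next character based only on the previous few characters. The corpus, the context length of 3, and the function names are all assumptions made for this illustration.

```python
import random
from collections import defaultdict


def build_model(text, order=3):
    """Map each run of `order` characters to the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context].append(text[i + order])
    return model


def generate(model, seed, length=80, rng=None):
    """Extend `seed` one character at a time by sampling from the model."""
    rng = rng or random.Random(0)
    order = len(seed)
    out = seed
    for _ in range(length):
        choices = model.get(out[-order:])
        if not choices:  # context never seen in the corpus, so stop early
            break
        out += rng.choice(choices)
    return out


# Tiny illustrative corpus; a real model would train on far more text.
corpus = (
    "to be, or not to be, that is the question: "
    "whether 'tis nobler in the mind to suffer "
    "the slings and arrows of outrageous fortune"
)

model = build_model(corpus, order=3)
sample = generate(model, seed=corpus[:3], length=80, rng=random.Random(42))
print(sample)
```

Even this toy version never sees a "word" as a unit. Any words that appear in the output emerge purely from character-to-character statistics, which is the property that makes the RNN results so striking.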
Kirill Eremenko: Have you seen … There’s a movie that they filmed based on the script created by [crosstalk 00:40:51].
Jose Portilla: I have heard of it. I definitely have not seen it.
Kirill Eremenko: What is it called? I forgot. Solar something. I’ll link to it in the show notes and I’ll send it to you as well. It is ridiculous. They got Middleton. So, the actor from Silicon Valley. You know that TV show?
Jose Portilla: Oh, yeah.
Kirill Eremenko: Jeff Middleton or something.
Jose Portilla: I forget his name, but yeah, I know what you’re talking about.
Kirill Eremenko: Yeah. And then they got him to act the main role and it’s like a whole script written by this neural network that even gave itself … It’s been a while. I forgot. It called itself Barney. It called itself a name. It’s like a 30 or 15 minute long short movie. It was on the London Film Festival I think.
Kirill Eremenko: The sentences themselves make sense by what people say in the movie, but overall it’s complete nonsense, but they still acted it out in a way that you get like sure goosebumps down your spine like, “Wow. This is a space saga of a love story in it.” It’s pretty funny.
Jose Portilla: Yeah, it’s crazy, because it’s clear that the networks are able to easily conquer now like things like grammar. It will just take a deeper network to conquer something like plot, right? I don’t know if you’re a … This was maybe within the past year. OpenAI created basically a model to produce text articles.
Kirill Eremenko: No, I don’t know.
Jose Portilla: Yeah, that was really interesting because they did not release the full model because they thought it was too dangerous, because they basically … With a seed sentence of Syria blah-blah-blah.
Jose Portilla: Suddenly, this model could generate a full … It was essentially like fake news text article that read perfectly. That really read like someone had written it personally. And it was just completely made up by a network. And they decided it was so good at generating fake news style articles that they refused to release the full network.
Kirill Eremenko: That is crazy.
Jose Portilla: Yeah.
Kirill Eremenko: This kind of reminds me of the story of CRISPR. The lady that developed CRISPR for adjusting genes. As soon as it came out of the lab was like … If I’m not mistaken, she was like, “This is very dangerous for the world. What have we created?”
Jose Portilla: Yeah. It’s almost like a milestone of you know a technology is really good and really worth pursuing if it’s always like this double edged sword. Something like atomic science, right? Like you have this really interesting aspect of nuclear energy. And at a certain reactor, it’s like a thorium reactor, whatever, has the potential for very low nuclear waste and you’re conquering the atom itself, like what the universe is built out of.
Jose Portilla: On the other hand, you also have the ability to create a nuclear weapon. And I think it’s like that for anything. You have convolutional neural networks that can detect cancer or skin cancer better than any doctor could. But then, at the same time, you could abuse these networks to then begin racial profiling based off corrupt datasets.
Jose Portilla: Yeah, any technology I think has the ability to be exploited for good or bad. But at least it’s a good signal that you’re onto something. Like CRISPR, like you were saying, if you see a child with a birth defect or something, the fact that you could maybe fix it preemptively is fantastic. But then, should you be able to choose the color of your baby’s eyes? Maybe not so much? I don’t know.
Jose Portilla: Then there’s also the ethical questions. The ethics of it is something that … I don’t know. That will take a long time to catch up to the technology.
Kirill Eremenko: For sure. What do you think? How far are we from AGI?
Jose Portilla: It’s funny. I was just having a conversation with someone about this here at Udemy LIVE. Every time I get asked this question, my timeline becomes shorter. I remember when I was first asked the question, I was like, “Never.” And then I start building out networks myself.
Jose Portilla: The one that really convinced me was the very first couple years ago, when I really built up my first good text generation network. I was like, “Oh, this is way more effective than I thought it was going to be.” I felt like I don’t know what I’m doing, and I’m actually able to do something that could fool a person.
Kirill Eremenko: Imagine someone who knows what they’re doing and then …
Jose Portilla: That’s exactly what …
Kirill Eremenko: Just steal them.
Jose Portilla: Yeah. And there’s people way smarter than me working on stuff that’s way harder than this. Will it be in my lifetime? I don’t know. I definitely now believe that it will be reached … It’s inevitable now at a certain point in humanity there will be general AI, the singularity or whatever.
Jose Portilla: Will it happen in my lifetime? I don’t know. Hopefully an old man on my deathbed, maybe it will become more clear like, “Oh, yeah, in a couple years, we’ll [crosstalk 00:45:55].”
Kirill Eremenko: Oh, man, I think 100% in our lifetime. What’s his name? Ray Kurzweil. 2025 or 2029, that’s the year. And then 2050 is when AI becomes super intelligent, like surpasses humans, and so on. Petitions for its independent rights, and things like that. A classic example is I think why we mistake it is because we’re used to linear thinking, and this stuff is happening exponentially …
Jose Portilla: That’s exponential. That’s true. Yeah.
Kirill Eremenko: A great question I ask people. How far do you think you’ll get from here where we’re sitting in 30 linear steps? You know that one?
Jose Portilla: Yeah. Versus like the [crosstalk 00:46:37].
Kirill Eremenko: 30 exponential steps. Ridiculous. Ridiculous. I have the same feeling. As every year passes, my timeline gets closer, so like my expectation for this.
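The gap between 30 linear steps and 30 exponential steps can be worked out directly. The sketch below assumes each linear step covers 1 metre and each exponential step is twice as long as the one before it, starting at 1 metre. Those step sizes, and the Earth circumference figure of roughly 40,075 km, are assumptions for illustration rather than numbers quoted in the episode.

```python
def linear_distance(steps, step_length_m=1):
    """Total distance when every step is the same length."""
    return steps * step_length_m


def exponential_distance(steps, first_step_m=1):
    """Total distance when each step is double the previous one."""
    # Step lengths are 1, 2, 4, ..., 2**(steps - 1) times the first step.
    return sum(first_step_m * 2 ** k for k in range(steps))


EARTH_CIRCUMFERENCE_M = 40_075_000  # approximate equatorial value (assumption)

linear = linear_distance(30)
exponential = exponential_distance(30)

print(f"30 linear steps:      {linear:,} m")
print(f"30 exponential steps: {exponential:,} m")
print(f"Trips around the Earth: {exponential / EARTH_CIRCUMFERENCE_M:.1f}")
```

Under these assumptions, 30 linear steps cover 30 metres, while 30 doubling steps cover 2^30 - 1 metres, which is just over a million kilometres.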
Jose Portilla: I think it may also be out of selfishness that I hope it doesn’t happen in my lifetime where it’s like … Because then, I think something that’s going to happen is it’s going to be a real question of what does it mean to have consciousness? And what does it mean to actually be human?
Jose Portilla: Because when it’s replicated completely artificially, it’s going to be something that humans are going to have to grapple with, and that’s a very tough thing to think about of, “Now, what does it mean to be human, have a fulfilled life, have consciousness when this computer has essentially all the same things?” Right?
Kirill Eremenko: Yeah. What’s the difference? How can we discriminate against them? And now all of a sudden, they’re also conscious.
Jose Portilla: Yeah.
Kirill Eremenko: Why do we consider ourselves better?
Jose Portilla: Exactly. Will they be second tier citizens when they’re actually smarter than us only because we created them? Suddenly have some sort of power over them. Will they live with us at the same level? This is a question for someone much smarter than me to answer or think about.
Kirill Eremenko: Some of the AI scientists or futurists think that our generation and the next generation are the final generations of humans who are here. I think it was Elon Musk saying that we are biological … What is it called? What’s the thing that starts computer? Boot sequence or something. Like pre-boot, or whatever.
Kirill Eremenko: I forgot the word, but basically like this. When the computer starts, there’s a part that has to go first and then boot up the rest of the computer on the motherboard. That’s like a biological way to boot AI to get it started. And then as soon as it’s started, we’re no longer needed. We were just a phase in evolution that, “Okay, now we’ve created AI.” Boom, the end. And then from then on, this new species, artificial intelligence, robots, whatever, are going to take over the planet and so on.
Jose Portilla: Yeah. And it makes you think if any civilization across the cosmos, if it’s some sort of inevitable conclusion that once some organic system evolves enough, they create artificial intelligence as the next step.
Kirill Eremenko: That’s interesting. Quite possibly. What’s his name? And the interesting thing about AI, I was listening to a podcast with Ben Goertzel recently, is that it won’t be like us individually. We always are individual, and we try to … We strive for the sense of community. We want to be on our phones all the time, Instagram. We want to think as a collective mind, but it’s hard for us, because the way we do it is through phones and that’s very inefficient, very slow. Whereas AI is going to be hooked up to the Internet.
Kirill Eremenko: So, it’s not going to be individual AI. They’re all going to be in one big mega mind. It’s like whenever you see an AI, whether it’s robots or a program whatever else, they’re all going to be thinking the same thing and exchanging knowledge. And so therefore, for us, one day is going to be one day. For them, it’s going to be like one day is like 100,000 years in their collective mind. And so they’re going to evolve super fast.
Jose Portilla: Yeah. We’re so limited by our monkey brains of what consciousness even means, right? When in reality, once general AI is achieved, it’s … Maybe superior is not the right word, but whatever their consciousness is will be a higher level than what we are able to achieve as some organic system. I mean, it’ll almost be like godlike.
Kirill Eremenko: That’s the thing. There’s a great article on Wait But Why where he explains the ladder of consciousness. Try to explain to an ant in ant language what a monkey is. Like no freaking way.
Kirill Eremenko: Try to explain to a monkey in monkey language what a human is or like what these moving things in the sky are, which are airplanes. It’s going to think of them as stars. And same thing for us. Why do we think we’re the ultimate pinnacle of consciousness?
Kirill Eremenko: There is a level above us, which we’ll never be able to comprehend just simply because of the nature of how our brains work and limitations. There’s no way we can ever understand. I think AI, I really think that it will get to that level where it will be looking at us as ants.
Jose Portilla: As ants. Yeah. No, for sure. I don’t know. It’s a testament to our ignorance. When we think of AI, we think of it as a copy of a human. What it really will be is like we created some superior God that will hopefully be benevolent to us.
Kirill Eremenko: Benevolent. Yeah.
Jose Portilla: I’m very glad to be part of this generation though that still doesn’t have it. The questions that come up of when general AI does exist are things that I’m glad that I don’t have to think about.
Kirill Eremenko: A very interesting time to be alive.
Jose Portilla: For sure. Yeah.
Kirill Eremenko: All right. Well, Jose, thank you so much for coming on the show.
Jose Portilla: Thank you for having me.
Kirill Eremenko: What a pleasure. Where can our students or listeners find you, connect with you, take your courses, follow your career?
Jose Portilla: So, probably the easiest way is just if you Google search my name, Jose Portilla, the first thing that pops up is probably my Udemy page. So, you can always check that out, my profile page in Udemy for different courses. You can feel free to connect with me on LinkedIn. Again, that’s probably like the second link on Google.
Jose Portilla: Or you can check out pieriandata.com. That’s our little company for data science stuff, but just Google Jose Portilla and I’m maybe too accessible. You can easily contact me either on LinkedIn or messaging on Udemy.
Kirill Eremenko: Fantastic. And we’ll definitely include those links in the show notes. Pierian Data, by the way, we didn’t talk about this, but I want to give a shout out that you do corporate trainings. So, if anybody is interested in corporate trainings, check out Pierian Data. I heard fantastic things about you.
Jose Portilla: Thank you very much.
Kirill Eremenko: Definitely have a look at that. On that note, we probably better get back to the conference and great chatting in person. I’m glad we did this.
Jose Portilla: Yeah, likewise. We’ll have to do this again.
Kirill Eremenko: There you have it ladies and gentlemen. That was Jose Portilla. I really hope you enjoyed this conversation as much as I did, and we’re super grateful for you being part of this episode.
Kirill Eremenko: My favorite part by far was the conversation about neural networks creating neural networks. That indeed could be the future that we’re heading towards where AI builds AI, which builds AI, which builds AI, and so on. And then we will live in a world that we probably wouldn’t even recognize today.
Kirill Eremenko: As always, the show notes for this episode are available at www.superdatascience.com/309. That’s www.superdatascience.com/309. There, you can find the transcript for this episode, any materials, research papers, images that we mentioned on this episode. And, of course, the URLs for Jose’s LinkedIn, website, and Udemy profile where you can find all of his courses.
Kirill Eremenko: I highly encourage everybody to check out Jose’s courses on Udemy. And if you or your company are interested in in-person corporate trainings, Jose is doing a great job in that space. You can find him at his website, pieriandata.com.
Kirill Eremenko: On that note, if you enjoyed this episode, forward it to somebody you know, somebody who’s passionate about data science, analytics, AI, machine learning, somebody who’s learning these things online, or maybe somebody who’s already following Jose, and you know that they love him and would love to hear from him on this podcast.
Kirill Eremenko: It’s very easy to share the episode. Just send the link, www.superdatascience.com/309. On that note, thank you so much for being here today. I really appreciate your time, and I look forward to seeing you back here next time. Until then, happy analyzing.