Podcasts SDS 255: Diving Into Computer Vision

56 minutes
Artificial Intelligence, Data Science

A great chat with Adrian Rosebrock for anyone interested in computer vision or anyone who has no idea what it is and wants to learn.

About Adrian Rosebrock

Adrian Rosebrock is a Ph.D. and entrepreneur who has spent his entire adult life studying computer vision and machine learning. In early 2014 he started PyImageSearch.com, a blog dedicated to computer vision, deep learning, and OpenCV, where he’s authored over 300 free tutorials and blog posts. He’s also authored two books (one on OpenCV, another on computer vision + deep learning) and created the PyImageSearch Gurus course, an in-depth treatment of computer vision. He’s currently working on a brand new book on the Raspberry Pi and computer vision.

Overview

Adrian loves talking about computer vision. He describes himself as part author, part educator, part blogger, and many other things. Like many of us in the data science field, Adrian didn’t enter college thinking he would end up where he did. But when he was 17, sitting in a statistics class, his eyes were opened to the possibility of building an image search engine and to how algorithms might analyze an image.

OpenCV is an open-source library that bundles computer vision algorithms and lets you pick the ones you need to analyze an image (such as edge detection and contour finding). It is not meant for training neural networks; instead, it ships with pre-trained detectors and can load models trained in other frameworks to run inference. It’s a place to build a computer vision pipeline. For open source computer vision, OpenCV is the industry standard. For those looking to get started in OpenCV, Adrian says it is far easier to set up a first project now than it was five years ago.

One application of computer vision is prescription pill identification. Adrian says many people simply Google a pill’s imprint to find out what it is. The National Library of Medicine compiled information on what the physical pills look like (color, size, shape, imprint, etc.) into an online search form. The problem, however, is a high risk of human error, since the form relies on people entering information manually. Another example is anything relating to the medical field. Adrian previously worked on systems that assess breast cancer risk from histology images. Using computer vision and deep learning, such systems can predict cancer more accurately and say not only whether there is cancer in an image, but also construct a pixel-wise mask showing where it is. Outside the medical field, in-store analytics is a great fit as well. Companies are very interested in the layout of their stores and the way customers walk through them. Monitoring paths and person attention is much more efficient with computer vision. Will computer vision replace humans in medical imaging? Adrian thinks it’ll reduce the burden but won’t replace humans any time soon.

On PyImageSearch.com, Adrian wants to educate and inspire fellow data scientists to tackle their own projects. Over 300 free tutorials are available on the website. One of his biggest success stories is a student who won around $30,000 in a Kaggle competition thanks, in part, to Adrian’s books. His focus and hope is in education and helping fellow data scientists expand their work in computer vision and all its applications.

In this episode you will learn:

Who is Adrian Rosebrock & why computer vision [6:15]
What is OpenCV? [12:38]
Getting started in OpenCV [19:56]
Applications of computer vision [22:54]
PyImageSearch.com [34:30]
How to avoid common pitfalls [38:37]
Adrian’s Kickstarters [43:40]

Items mentioned in this podcast:

PyImageSearch.com
OpenCV
Raspberry Pi for Computer Vision [ eBook ] – Adrian’s Kickstarter Campaign
An interview with David Austin: 1st place and $25,000 in Kaggle’s most popular image classification competition
Python Machine Learning by Sebastian Raschka
Confident Data Skills by Kirill Eremenko

Follow Adrian

LinkedIn
Twitter
adrian@pyimagesearch.com

Podcast Transcript

Kirill Eremenko: This is episode number 255 with the Founder of PyImageSearch.com, Adrian Rosebrock.

Kirill Eremenko: Welcome to the SuperDataScience podcast. My name is Kirill Eremenko, Data Science Coach and Lifestyle Entrepreneur and each week we bring you inspiring people and ideas to help you build your successful career in data science. Thanks for being here today and now let’s make the complex simple.

Kirill Eremenko: This episode is brought to you by my very own book, Confident Data Skills. This is not your average data science book. This is a holistic view of data science with lots of practical applications. The whole five steps of the data science process are covered from asking the question to data preparation to analysis to visualization and presentation. Plus you get career tips ranging from how to approach interviews, get mentors, and master soft skills in the workplace. This book contains over 18 case studies of real world applications of data science. It covers off algorithms such as Random Forest, K-nearest Neighbors, Naive Bayes, Logistic Regression, K-means Clustering, Thompson Sampling, and more.

Kirill Eremenko: However the best part is yet to come. The best part is that this book has absolutely zero code. So, how can a data science book have zero code? Well, easy. We focus on the intuition behind the data science algorithms so you actually understand them. So, you feel them through and their practical applications. You get plenty of case studies. Plenty of examples of them being applied. And the code is something that you can pick up very easily once you understand how these things work. And the benefit of that is you don’t have to sit in front of a computer to read this book. You can read this book on a train, on a plane, on a park bench, in your bed before going to sleep. It’s that simple even though it covers very interesting and sometimes advanced topics at the same time.

Kirill Eremenko: And check this out. I’m very proud to announce that with dozens of five star reviews on Amazon and Goodreads, this book is even used at UCSD, University of California San Diego, to teach one of their data science courses. So, if you pick up Confident Data Skills, you’ll be in good company.

Kirill Eremenko: So, to sum up. If you’re looking for an exciting and thought provoking book on data science, you can get your copy of Confident Data Skills today on Amazon. It’s a purple book. It’s hard to miss and once you get your copy on Amazon, make sure to head on over to www.confidentdataskills.com where you can redeem some additional bonuses and goodies just for buying the book. Make sure not to forget that step. It’s absolutely free. It’s included with your purchase of the book, but you do need to let us know that you bought it. So, once again, the book is called Confident Data Skills and the website is confidentdataskills.com. Thanks for checking it out and I’m sure you’ll enjoy.

Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies and gentlemen. Super excited to have you back here on the show today. And today’s guest is Adrian Rosebrock who is a very experienced researcher in the space of computer vision. Adrian is also an educator in the space of computer vision. He is the Founder of pyimagesearch.com, a very popular website on computer vision and OpenCV specifically and you can find a lot of videos and blog posts and a lot of educational materials about computer vision there. And so, what did we talk about today? Well, in this podcast today you’ll get a great overview of the space of computer vision, what it was in the past, what it is now, and most importantly, what it will be in the future and what you need to prepare for if you’re interested in computer vision.

Kirill Eremenko: Also, we dove into quite a bit of depth on OpenCV specifically. So, this is a library, one of the most popular libraries and tools for computer vision in the world right now and in this podcast you’ll find out what it’s all about and how to quickly get started with it.

Kirill Eremenko: And also another thing we talked about is Adrian’s Kickstarter. Adrian’s third Kickstarter, which is actually running right now. So, something you might be interested in if you want to dive into the world of computer vision, but more on that in the show.

Kirill Eremenko: So, all in all you’ve got a very exciting podcast coming up and just before we dive straight into it I wanted to give a shout out to our Fan of the Week. And this one goes to Rose Gadea. I hope, Rose, I’m pronouncing your name correctly. And this is what Rose wrote, “Kirill invites a wide variety of guests in order to cover various topics in data science. In the podcast he makes a field that might seem frightening and impossible to some, an obtainable goal. Additionally, his courses are a great resource to learn the data science skills.” Thank you so much, Rose, for this feedback. Very inspiring to hear. And for those of you listening out there who haven’t yet left a review and some comments, then you can do so by heading over to iTunes or whatever other app you’re listening to your podcast on and leave us a review there. I’d be super thrilled to read it and possibly read it out as a Fan of the Week in one of the upcoming episodes.

Kirill Eremenko: And on that note, without further ado, let’s dive straight into it. I bring to you Adrian Rosebrock, Founder of pyimagesearch.com.

Kirill Eremenko: Welcome to the SuperDataScience podcast, ladies and gentlemen. And today I’ve got a very special guest coming here onto the show Adrian Rosebrock calling in from Philadelphia. Adrian, how are you going today?

Adrian Rosebrock: I’m doing good, Kirill. Thanks for having me on.

Kirill Eremenko: Man, it’s a huge honor. I’ve been following your work for I think since 2017 and to me you’re like one of the top experts in computer vision around the world, globally. So, like, I’m not kidding. I’m very excited to dive into this topic in this podcast. How are you feeling about today’s session?

Adrian Rosebrock: I am really stoked. I will talk to anyone about computer vision. I think my wife is the most stoked about this, because it’s just me and her. We don’t have any children and you know, I think she’s tired of hearing about the [crosstalk 00:06:35].

Kirill Eremenko: Awesome, man. Awesome. Well, to get our listeners up to speed of whom I think maybe there are some who haven’t heard of you before even though probably a lot of them, of our audience has heard of you. For those of our audience who haven’t heard of you before, can you give us a quick background? Who is Adrian Rosebrock and what do you do?

Adrian Rosebrock: All right. So, I have a Ph.D. in Computer Vision and Machine Learning. I’m part researcher, part author, part entrepreneur. I blog about computer vision, deep learning, and OpenCV over at my blog, pyimagesearch.com and I’ve even authored two books and a course on computer vision as well.

Kirill Eremenko: Awesome. Just for our listeners, Adrian has been very modest here. I literally just before this podcast I went on alexa.com, which is like a website ranking tool and Adrian’s website pyimagesearch, P-Y image search, all one word, dot com, because you know, from Python image search. It is the 16,227th most popular website in the world and that’s a huge accomplishment. Might sound like a big number, but that is actually really massive out of like the millions and millions of websites out there. So, man, congratulations. You’ve had some really steady and also rapid growth over the past 12 months or so.

Adrian Rosebrock: Oh, thank you Kirill. I really appreciate that.

Kirill Eremenko: Yeah, man. So, what’s your secret? Like, why computer vision? You know, you went into doing a Ph.D. in computer vision. Were you always interested in this topic? Because I’m assuming when you started your Ph.D., computer vision wasn’t as explosive as it is now. Why did you pick this area?

Adrian Rosebrock: It’s pretty interesting you know. I started coding when I was a freshman in high school and I immediately fell into it. I was like this really weird, awkward kid and I was very to myself, very introverted and I had trouble relating to everyone else in high school. So, coding gave me that escape and at the time I was super interested in building websites and I thought for a good many years that that was going to be my career. It wasn’t until my AP Statistics class that I really thought of studying machine learning and computer vision and that class opened my eyes to what is possible using strictly statistical models and tests. And it sounds super basic for me to say, but the first time I learned about what a mean and standard deviation was, that was truthfully eye opening to me. Like, I was 17 years old, sitting in this class, I was like, wow. You can construct all these predictive models based all on these two simple variables that are easy to calculate. Our final project for that class, the teacher said in lieu of an exam, let’s do a project. I want you to show me something that you can build using statistics.

Adrian Rosebrock: So, at that time, I was starting to ponder like how does software understand what’s in an image and how can I build an image search engine of sorts, you know? Using these statistical models. And the way I decided to do it was I was gonna create this super small data set of images. I think I wrote a script that was on like deviantart.com or something like that. Then downloaded maybe like a hundred images and of all languages I used Java. I don’t know why, maybe I was learning Java at the time and I wrote some routines to compute the mean and standard deviation of the pixel intensities in an image. So, like in a standard image that you see, there’s three color channels. There’s red, green, and blue. So, I computed the mean and standard deviation for each of them, so that’s six values, and then I treated that like a vector. Just computed the Euclidean distance between the mean and standard deviations for my dataset and a new query image. Now, that’s like nothing novel. I mean, these types of things have existed for as long as the computer vision field has been around, but for like a 17 year old kid that was like a holy- whoa. I can actually do something with these variables.
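The approach Adrian describes (six color statistics per image, compared by Euclidean distance) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard library rather than the Java he mentions, the function names `color_stats` and `rank_by_similarity` are hypothetical, and the tiny "images" are made-up lists of (R, G, B) pixels rather than downloaded files.

```python
import math
import statistics


def color_stats(pixels):
    """Return [mean_r, std_r, mean_g, std_g, mean_b, std_b] for (r, g, b) pixels."""
    features = []
    for channel in range(3):
        values = [pixel[channel] for pixel in pixels]
        features.append(statistics.fmean(values))
        features.append(statistics.pstdev(values))
    return features


def rank_by_similarity(query_pixels, dataset):
    """Sort image names from most to least similar to the query image."""
    query_vector = color_stats(query_pixels)
    distances = {
        name: math.dist(query_vector, color_stats(pixels))  # Euclidean distance
        for name, pixels in dataset.items()
    }
    return sorted(distances, key=distances.get)


# Made-up data: each "image" is a flat list of (R, G, B) pixels.
dataset = {
    "mostly_red": [(250, 10, 10), (240, 20, 15), (255, 5, 0), (245, 15, 5)],
    "mostly_blue": [(10, 10, 250), (20, 15, 240), (0, 5, 255), (5, 15, 245)],
}
query = [(235, 25, 20), (250, 10, 5), (245, 0, 10), (240, 20, 0)]

ranking = rank_by_similarity(query, dataset)
print(ranking)
```

The reddish query ranks the red image first because its six-value vector sits much closer to that image's vector. Real content-based image retrieval uses richer descriptors, but the ranking step has the same shape.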

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: So, that got me super interested in computer vision and actually kind of gave me my area of expertise at least for a while, which was content-based image retrieval, which again, it’s just building image search engines.

Kirill Eremenko: I love it, man. I find that that’s kind of the best way to get into technology. Where you explore it not using the most advanced tools. Like, you build stuff from scratch. You know, playing around with pixels. You kind of understand through that process, you understand the challenges that are faced in this space, what it is like to look at pixels and calculate means and different colors and locations on the image and things like that. And then when you do get your hands onto something like OpenCV or another library, another powerful tool, then it’s much more high level, but you’ve already done the nitty gritty low level things and you know what is happening in the background when this tool is working. It gives you much deeper appreciation for the actual tool, wouldn’t you say?

Adrian Rosebrock: I totally agree with you.

Kirill Eremenko: Yeah. Yep.

Adrian Rosebrock: When I just started out in computer vision and deep learning, like, we didn’t have libraries like NumPy or Keras or TensorFlow. Like, if you wanted to train your own network, you were implementing that yourself. So to be able to just understand that nitty gritty just allows you to appreciate it so much more once you get to that higher level.

Kirill Eremenko: Awesome, man. And speaking of OpenCV, can you tell us a bit about this tool, this extremely powerful tool that you use for computer vision? ‘Cause as I understand most of your work is built around OpenCV. What is OpenCV?

Adrian Rosebrock: So, OpenCV stands for Open Computer Vision. It’s an open-source library and it’s really been the de facto standard for computer vision and image processing. The goal of OpenCV is to facilitate realtime processing. So things like accessing your webcam in an efficient manner and extracting frames. Allowing you to build a pipeline of computer vision steps to achieve some sort of goal. So, like a good example is edge detection. In edge detection your goal is just to find the boundaries of an object in an image. So, like let’s say you wanted to build a document scanner for your smartphone where you would open your smartphone, open the camera, hold it over this document and this app would find the piece of paper and take a photo and convert it to a PDF. Well, that’s just a computer vision application. And building a simple document scanner, it’s not challenging. It’s either thresholding or edge detection. You find the outline of this document and use contours, which will give you the X, Y coordinates of the boundary of the edges, and then from there you can apply this perspective transform to actually give you this top-down bird’s eye view of the document. And OpenCV just facilitates all this. It gives you all the algorithms you need and then you just pick and choose which ones you need to actually create your application.

Kirill Eremenko: Gotcha. Is it correct that OpenCV is not actually deep learning?

Adrian Rosebrock: So, this has been a point of confusion I’ve seen around for sure. OpenCV is not meant to train neural networks or to train deep learning models. What OpenCV wants to do is be able to load those models in an efficient way. That way you can perform inference using those models. But OpenCV is so much more than just inference using deep learning models. It’s computer vision as a whole.

Kirill Eremenko: Uh-huh. (affirmative). Got you. So, OpenCV as I understand it really contains some of these pre-trained, pre-modeled, edges, contours, maybe people’s eyes, noses, and things like that that you can actually load into your model and you don’t have to do that training on your own. It’s already there. Now you can start applying it. Is that right?

Adrian Rosebrock: So, yes. That’s partially right. OpenCV definitely provides these pre-trained models that you can use for face detection or pedestrian detection or eye detection. All of that exists in OpenCV. But another goal of OpenCV is to say, “Okay. You train this model in TensorFlow or PyTorch or Caffe.” OpenCV wants you to be able to take that model and then load it via OpenCV’s built-in functions. That way you’re not mixing all this TensorFlow code used to load the model with OpenCV’s code. And furthermore that gives OpenCV room for a lot more optimizations. For example, using the OpenVINO toolkit if you want to use the Movidius NCS. That really helps OpenCV perform underlying optimizations that may be unavailable to you if you were using whatever the native library your model is trained in.

Kirill Eremenko: Uh-huh. (affirmative). Got you. Got you. And there’s quite a few other tools out there on the market that especially with the proliferation of computer vision they started popping up. For instance, SSD, Single Shot Detection, YOLO, You Only Look Once, and others. How does OpenCV compare to those tools in your experience?

Adrian Rosebrock: So, again with OpenCV you’re just taking those models that were already trained, those SSDs that were trained in Caffe, the YOLO models trained in Darknet. OpenCV is just taking those models and loading them. OpenCV isn’t retraining those algorithms. OpenCV isn’t providing their own implementations typically. Sometimes they do, usually not. But the goal is for OpenCV just to be able to suck in those models as existing models trained on different frameworks and be able to perform inference using them.

Kirill Eremenko: Gotcha. Wow. So, I’m learning so much already. So, OpenCV is kind of like the parent or the main tool that you can import different elements into and build up your computer vision model?

Adrian Rosebrock: Yeah, yeah. Definitely. And there’s still a lot of support or a lot of functionality that needs to be implemented. We didn’t see these implementations or the ability to load these models until OpenCV version 3.3. We’re on OpenCV 3.4 and OpenCV 4.0 right now. So, there’s still a lot of work and still a lot of layer support that needs to be implemented.

Adrian Rosebrock: One of the biggest gaps right now is we can’t use NVIDIA GPUs with OpenCV. I know for a fact that that’s on their roadmap for Google Summer of Code this year. So, I’m really hoping they’re going to be able to implement that and you know, you’ll be able to just load that model with OpenCV with your NVIDIA drivers installed and just have all these additional optimizations available to you.

Kirill Eremenko: Gotcha. Okay. What are some alternatives to OpenCV and how do they compare?

Adrian Rosebrock: So, you have a few libraries that I wouldn’t call like true alternatives or competitors to OpenCV, but they’re different choices. One of my favorites is scikit-image, similar to scikit-learn. This is for computer vision and image processing algorithms. One of the things that I like about scikit-image is that it has a really nice intuitive API. It’s much more Pythonic. OpenCV’s API isn’t the best per se from a strictly Python perspective. So, scikit-image definitely stands out in that regard.

Adrian Rosebrock: What’s also kind of cool is that you see state of the art implementations faster in scikit-image. Not necessarily deep learning algorithms, but just kind of your more standard computer vision algorithms.

Adrian Rosebrock: Then there’s also the Dlib library from Davis King and Dlib just started as Davis’ collection of tools and algorithms he was using at his job and it just kind of morphed into becoming more and more computer vision and deep learning. So, a lot of listeners on this podcast may have heard of the face recognition library, face_recognition. It’s from a guy named Adam Geitgey. He was actually a speaker at PyImageConf this last year and he’s a super cool guy, but what he did was he took dlib for face recognition and just made it super, super easy to use. Like, you could just pip install this library and start performing face recognition.

Adrian Rosebrock: So, again, those are some fun libraries you can play around with. Dlib especially has these obscure algorithm implementations you won’t find anywhere else. So, if you haven’t heard of that library, definitely, definitely check it out.

Kirill Eremenko: Gotcha. Gotcha. But you would still say that OpenCV is the industry standard?

Adrian Rosebrock: For open source computer vision, yeah. Absolutely.

Kirill Eremenko: Gotcha. Awesome. And how hard is it to get started in OpenCV? Let’s say there are many listeners on this podcast who are still deciding on where to progress their career in data science. They might be beginners or they might be expanding their careers. How difficult is it to get started and create your first sample project in OpenCV?

Adrian Rosebrock: Man, it’s so easy especially compared to five years ago. It was probably five or six years ago I took my first and only undergrad course in computer vision and our first project the professor said, “Okay. You’re going to go home and you’re going to install OpenCV on your machine.” And we’re all like, “Come on. Like, how long can that be? Like, it’s going to take a half hour.” Well, about 50% of the students in the class could not get OpenCV installed on their machine before the assignment was due. It was a bear. You know, you had to compile it from source. The documentation wasn’t as good. There quite frankly weren’t as many tutorials and books online as there are today.

Adrian Rosebrock: Luckily it’s gotten … we’re at the point now where we can do pip installs of OpenCV and it’s fairly reliable on most operating systems. Of course if you want like all optimizations and everything else, you can compile it from source, but even compiling from source is super easy as well. So, it’s just a testament to how quickly and how fast the computer vision field is growing. There’s just so much interest in this area right now and if anyone is looking for a niche area that they want to do within data science. Like, I cannot recommend computer vision enough.

Kirill Eremenko: I agree. Part of the reason for this is that we are moving from a world of how do we deal with structured data and how do we get a competitive advantage as a business from structured data to a world where, okay, all of our competitors are already using structured data for an advantage. Now, let’s look at our unstructured data. Do we have any cameras that are monitoring our products? Can we put some cameras above our conveyor belts to monitor for defects? Or to sort out items? Can we put some cameras in the car park to see how our customers are coming in and going out? Can we use Google satellite images and apply computer vision there to detect certain things that we’re looking for? And plenty and plenty of areas where there’s unstructured data that’s already been collected or we can start collecting it. The richest by far form of unstructured data is video and image data. Like, way richer than what you can get from any kind of text document. That’s where computer vision comes in.

Adrian Rosebrock: Oh, absolutely. The number of areas that I see computer vision being applied to especially medical fields is incredible. I’m just so happy to be part of this field right now.

Kirill Eremenko: Yeah. Agree. And speaking of applications, could you give us maybe three or five of your top examples that you’ve gathered over the years of applications of computer vision to inspire our listeners to maybe show some of these scenarios that we don’t even normally think of in our day to day lives?

Adrian Rosebrock: Yeah. So, one of the ones that was really eye-opening to me was prescription pill identification. I found out that over 1.2 million injuries and deaths happen each year due to a patient being given the wrong medication. You know, maybe the patient just mixes up their pills at home. Maybe the doctor prescribes the wrong pills. Or maybe the pharmacist doesn’t fill the prescription correctly. There’s a lot of ways that these systems can break down. Traditionally what people would do for prescription pill identification was they would just Google the imprint of the pill. Now, in the United States, there’s, what? About 30 to 40 thousand prescription pills on the market. Over 50% of them are round and/or white. So, we have an incredible number of similar pills, of similar embellishments or imprints with the company’s logo, and simply Googling the name of the pill or Googling what the pill looks like, that’s not going to get you very far.

Adrian Rosebrock: So, what the National Library of Medicine did was they took all this information of pills since every pill on the market is required by the government to have this insert fact sheet that not only describes side effects, but shows what the physical pill looks like. So, they ingested all this information and built this online form where people can kind of click and fill out information saying, “No. It’s a round pill with the imprint TD-33 and it has a size of maybe 8-millimeters.” And then this form will just kind of narrow down the search. The problem is that humans are very, very prone to error with manual entry of information, especially if this information is technical in nature or we’re not familiar with it. Like, I don’t know about you, but I don’t know where a ruler is in my house right now and I’m sure as hell not going to go find it and measure a pill and be like, “Okay. It’s 8-millimeters long.” Or plus or minus a couple millimeters ’cause my eyesight’s not that good. This is just introducing errors and people can’t find what the pill actually is. So, this is a great example of where computer vision can come in handy.

Adrian Rosebrock: Using computer vision, you know, we can detect a pill in an image. If it’s calibrated, we can get the size of it very accurately. We can compute the shape of it, the color, and the texture. That allows us to have this very accurate pill identification system and really just help reduce the injuries and deaths that happen each year just due to the wrong pill being taken.
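One common way to make "calibrated" concrete is to include a reference object of known size in the photo and convert pixels to millimeters from it. The sketch below shows only that arithmetic; the reference width, the pixel measurements, and the function names are hypothetical, and detecting the pill and the reference object in the image is assumed to have happened already.

```python
def pixels_per_mm(reference_width_px, reference_width_mm):
    """Scale factor derived from a reference object of known physical width."""
    return reference_width_px / reference_width_mm


def measure_mm(length_px, scale):
    """Convert a length measured in pixels to millimeters."""
    return length_px / scale


# Hypothetical numbers: a 20 mm wide reference marker spans 200 px in the
# photo, and the detected pill spans 80 px.
scale = pixels_per_mm(reference_width_px=200.0, reference_width_mm=20.0)
pill_length_mm = measure_mm(80.0, scale)
print(pill_length_mm)
```

The conversion only holds when the pill and the reference lie in the same plane at a similar distance from the camera, which is part of what calibration has to ensure.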

Kirill Eremenko: Wow. So it’d be like an app on the phone or something like that?

Adrian Rosebrock: Exactly. Exactly.

Kirill Eremenko: That’s a really cool application. All right. Anything else?

Adrian Rosebrock: So, another really good one is anything related to medical areas. I originally did some consulting for the National Cancer Institute back during my early graduate school years and we were developing systems to automatically determine the risk of breast cancer in breast histology images. These images are massive. If you ever load them up in RAM, they’d be multiple, multiple gigabytes. They’re just huge and very, very hard to process. So, we were just kind of coming up with ways to go and do this automatically. That way trained pathologists don’t have to spend hours doing this risk assessment. Just click a button and the computer would chew on it and give us the output. And back then we didn’t have deep learning. We just had these basic computer vision image processing algorithms. And we did some pretty cool work then. We were able to actually develop a system that could detect and predict certain risk factors.

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: Like, now we’re getting [inaudible 00:27:04] and we’re getting more accurate predictions. It was actually a really big deal for me, ’cause I kind of got away from medical computer vision for a few years as I was writing other content and writing other books and courses, but back in I guess October, one of my family members ended up telling me that they have cancer. That was like, “Wow.” This is all of a sudden I’m dropped back into the world of medical computer vision and deep learning now. I’m so much more motivated to study it and using computer vision and deep learning we’re able to do this analysis at a much higher rate that’s so much more accurate. We’re able to predict breast cancer and other types of cancer. We’re able to use instance segmentation. So, we’re actually able to not only say yes or no to is there cancer in this image, but we’re also able to localize like exactly where it is. So, we can draw a bounding box around it and point to it and be like, “Yep. That’s the cancer.” Or we can even go to a deeper level and construct a pixel-wise mask and say each of these pixels, that’s cancer and these pixels are not.
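The difference between a bounding box and a pixel-wise mask can be shown with a toy example. This is not a segmentation model: the mask below is hand-made, `mask_to_bounding_box` is a hypothetical helper, and NumPy is assumed to be installed. The sketch only illustrates that a mask labels individual pixels, while a box is the coarser rectangle that encloses them.

```python
import numpy as np


def mask_to_bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) enclosing all True pixels, or None."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Hand-made 6x8 mask: True marks pixels a model would label as the region of interest.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:6] = True
mask[2, 3] = False  # the region is not a perfect rectangle

box = mask_to_bounding_box(mask)
box_area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
print(box, int(mask.sum()), box_area)
```

The box covers more pixels than the mask marks, which is why a mask is the "deeper level" of localization: it says which pixels inside the box actually belong to the region.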

Adrian Rosebrock: That’s a [inaudible 00:28:14] of the problem, but these algorithms are being applied to just everywhere in medical right now and that’s super exciting for me to see.

Kirill Eremenko: Wow. That’s so cool. It sounds like you’re quite well versed in the medical space. Would you say that at some point in the future computer vision will completely replace human analysis of any kind of medical images, whether that be MRI, x-ray, or pretty much any imaging that is done in the medical field?

Adrian Rosebrock: I think it’ll greatly reduce the burden of people who have to perform those jobs and hopefully it will make their lives easier. But will it replace people altogether? I highly, highly doubt that unless there’s some sort of form that a patient checks that says I waive the right of a person to actually look at my information. I’ll trust the computer. You know, whether or not that happens in our lifetime, it’s truthfully hard to say. It’s like talking about Bitcoin for example. At my age even with Bitcoin exploding and dropping back down, even if it comes back up to millions of dollars per Bitcoin, most people my age might be like, yeah, I’d rather have the dollars, ’cause I know the dollar. But people who are much, much younger than me, they may say, “No. I see the value in that and I see that’s where things are going.” So, in some cases like some of the medical areas of computer vision and deep learning, it’s going to take time. It’s going to take time for people to adjust to it. Of course there’s government legislation that has to go through as well. So, that’s a long way of me saying, no. Not anytime soon.

Kirill Eremenko: Gotcha. Gotcha. Speaking of government legislation, what is your experience in this space? I’m assuming you’ve seen a few things or heard what’s going on in the space of computer vision. Is it keeping up to the advancements or is it like extremely far behind?

Adrian Rosebrock: I guess it really depends on the country. In my opinion the government legislation probably is not keeping up with the actual algorithms themselves and what they’re capable of. You kind of look at some of the self-driving car stuff and you can kind of see what could be coming down the road. You know, I imagine that’s going to be one of the areas that is most sweepingly reformed by legislation.

Kirill Eremenko: No pun intended, right?

Adrian Rosebrock: Yeah.

Kirill Eremenko: One could be coming down the road.

Adrian Rosebrock: Exactly. Self-driving cars are getting good, but there is this incredible liability on top of them and I don’t think we’re going to start to see that liability until we start approaching scale with self-driving cars. Then it’ll be unfortunate because that type of legislation can make or break businesses. They could go out of business overnight. It’s a little sad and unfortunate, but that’s just kind of how government can work in some cases. I think it naturally lags behind.

Kirill Eremenko: All right. Cool. So, we’ve had two examples from the medical field. That’s very, very interesting. Do you have any example from another industry just to mix it up little bit?

Adrian Rosebrock: Oh, definitely. I wish I could remember the name of the company and the person who runs it, but probably five or six years ago, there was this guy that I knew and he was very interested in in-store analytics. So, what I mean by that. People walk into a major shopping store and they start browsing around and looking at the various products. Believe it or not, companies are very interested in how their stores are laid out and you know, people seeing this product first and that product first. They want their stores laid out in a way that optimizes purchases and reduces the friction and the amount of time it takes from a potential customer walking to different areas of the store. So, like store layout is incredibly, incredibly important.

Adrian Rosebrock: And you also have a measure of like how long an average customer stands in front of like a various kiosk. Like say the new whatever the biggest, like a new video game is coming out at Game Stop. Game Stop would like to know how many people are stopping to check out this display? Where can we put this display in our store to optimize sales?

Adrian Rosebrock: So, I knew a guy and he created this company that originally started by him [inaudible 00:32:50] together Dance Dance Revolution mats and then putting them under the carpet of the floor and then being able to monitor these mats and track people as they walk around the store and like detect how long they stopped at this kiosk or that kiosk. So, that’s like an old-school way of doing things now, like with computer vision, just mount a camera up in the corner of your store and we can perform person detection, right? We can track them as they are walking around the store. If they stop at a kiosk, we can track down to the second how long they- and if you want to get really crazy, you can put a camera in front of the kiosk and monitor the person’s you know, let’s say facial expressions and their emotions. What are their emotions, if any, that they have while looking at this kiosk? Are they emotionally engaged in that product? That information is super, super valuable to these commerce companies, because they want to optimize their stores in any way that they possibly can.
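
As a rough sketch of the "how long did they stop at this kiosk" measurement: assume a person detector and tracker (not shown here) has already produced one centroid per person per frame. The tuple layout, the zone rectangle, and the function name below are assumptions made for illustration, not anything described in the interview.

```python
from collections import defaultdict


def dwell_times(track_points, zone):
    """Sum the seconds each tracked person spends inside a rectangular zone.

    track_points: iterable of (timestamp_seconds, person_id, x, y) centroids.
    zone: (x_min, y_min, x_max, y_max) in the same pixel coordinates.
    """
    x_min, y_min, x_max, y_max = zone
    last_seen = {}               # person_id -> (timestamp, was_inside)
    totals = defaultdict(float)  # person_id -> seconds inside the zone

    for t, pid, x, y in sorted(track_points, key=lambda p: p[0]):
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        if pid in last_seen:
            prev_t, prev_inside = last_seen[pid]
            # Count an interval only if the person was inside at both ends of it.
            if prev_inside and inside:
                totals[pid] += t - prev_t
        last_seen[pid] = (t, inside)
    return dict(totals)


# Hypothetical tracker output, sampled once per second.
points = [
    (0.0, "a", 10, 10), (0.0, "b", 300, 300),
    (1.0, "a", 50, 50), (1.0, "b", 310, 300),
    (2.0, "a", 55, 52),
    (3.0, "a", 60, 50),
    (4.0, "a", 200, 200),
]
kiosk_zone = (40, 40, 100, 100)

dwell = dwell_times(points, kiosk_zone)
print(dwell)
```

Person "a" is inside the zone at seconds 1, 2, and 3, so two full intervals are counted; person "b" never enters it and does not appear in the result.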

Kirill Eremenko: Uh-huh. (affirmative). Yeah. And that’s how the Amazon Go stores are working, right? You come in and then you just … they don’t even have shopping assistants these days in those stores. It’s something new, but that’s where the world’s going, right? Computer vision plays a huge role in that.

Adrian Rosebrock: Yeah, absolutely.

Kirill Eremenko: That’s fantastic. Well, thank you. Thank you for the three examples. Hopefully that’s very inspiring to our listeners. Of course there’s plenty, plenty more and just a quick Google search on applications of computer vision can put that into perspective.

Kirill Eremenko: So, I wanted to switch gears a little bit here and talk about what it is that you do on PyImageSearch. So, it’s a very popular website. There’s thousands upon thousands of people visiting and doing something there. Tell us a bit about your creation. What is it all about?

Adrian Rosebrock: So, what I want to do with PyImageSearch is I just want to teach you to be awesome at computer vision, deep learning, and OpenCV. At this point, I’ve authored over 300 free tutorials on these libraries and you can come there and you could actually learn something practical that you can go download and just launch and see it in action. And from there it can inspire you. It can inspire you to build your own projects. So, that’s what PyImageSearch is all about. It’s pure education that’s practical, that’s real world, and very hands on.

Kirill Eremenko: That’s awesome. So you mentioned 300 tutorials that are for free on your website.

Adrian Rosebrock: Yes, that’s correct.

Kirill Eremenko: That’s so cool. How’s the feedback been from students? Have you had many success stories?

Adrian Rosebrock: Oh, definitely. A few, I guess a few months ago now, we actually had a person who read one of my books and courses actually win Kaggle’s Most Popular Image Classification Doc Competition.

Kirill Eremenko: No. No way.

Adrian Rosebrock: Yeah. Yeah.

Kirill Eremenko: That’s so cool.

Adrian Rosebrock: It was about like 25 thousand maybe even 35 thousand dollars from this competition.

Kirill Eremenko: Wow. That’s awesome. Congratulations. Huge testament. And they wrote to you?

Adrian Rosebrock: Yeah, yeah. He reached out and he thanked me and we did a case study on the PyImageSearch blog. His interview where he shared his experience, his lessons learned, what algorithms worked well, and just his suggestions for people who want to start studying computer vision and deep learning.

Kirill Eremenko: Fantastic. We’ll definitely include that in the show notes for everybody listening, check it out. Okay. And what do you say is the most common question that your students ask you about computer vision and OpenCV?

Adrian Rosebrock: Well, I think the most common question is probably why “import cv2” is not working. I think that’s probably one of the most common ones. But another question I see get asked is the difference between image classification and object detection. You know, with standard data science algorithms or machine learning algorithms, we’re typically looking for this yes/no prediction or a prediction with certain confidence levels. With computer vision you can actually … you have this visual aspect where you’re not just interested in a yes/no prediction or a probability prediction. You know, what’s the probability that there’s a dog in this image or a cat in this image? You want to know where that cat and dog are. You want the X, Y coordinates in that image of where those objects are. And I’ve noticed people that are new to computer vision sometimes don’t understand the difference between object detection and image classification, ’cause it’s something that is really part of computer vision and you might not see it in other areas of data science.

Kirill Eremenko: Uh-huh. (affirmative). Uh-huh. (affirmative). Gotcha. So, like, a normal deep learning CNN, convolutional neural network once you train it up and the way you train it up, it will do classification for you, but it won’t allow you to tell in which part of the image you’ve got the dog or you’ve got the fluffy ear or you’ve got the cat’s nose or whiskers, or whatever else; whereas with computer vision you take it to the next level. You get an additional layer of information about the coordinates of certain objects that you might be interested in.

Adrian Rosebrock: Exactly.
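
One way to picture the difference is by the shape of the output. In the sketch below, a detection is assumed to be a (label, confidence, box) tuple with the box given as (x, y, width, height); that layout, the threshold, and the sample values are all hypothetical. Collapsing the detections gives a classification-style answer, while keeping the boxes answers the "where" question.

```python
def classify_from_detections(detections, min_confidence=0.5):
    """Image-level view: best confidence per label, with the coordinates thrown away."""
    scores = {}
    for label, confidence, _box in detections:
        if confidence >= min_confidence:
            scores[label] = max(scores.get(label, 0.0), confidence)
    return scores


def locate(detections, label, min_confidence=0.5):
    """Detection view: the boxes (x, y, width, height) for one label."""
    return [box for found, confidence, box in detections
            if found == label and confidence >= min_confidence]


# Hypothetical detector output for a single image.
detections = [
    ("dog", 0.92, (30, 40, 120, 160)),
    ("cat", 0.81, (200, 50, 90, 110)),
    ("dog", 0.35, (5, 5, 20, 20)),  # below the threshold, so it is ignored
]

labels = classify_from_detections(detections)
dog_boxes = locate(detections, "dog")
print(labels)
print(dog_boxes)
```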

Kirill Eremenko: Gotcha. Gotcha. Okay. Any other interesting question? This is really cool. I personally find like in my experience, ’cause with the courses that we create, from the questions, analyzing the questions, and comments from students, you can learn such insightful things about what actually matters and what are the challenges people face. So, is there anything else you can share with our listeners to help them maybe avoid those same pitfalls and already have a head start into this field?

Adrian Rosebrock: Yeah. So, I think one of the most important things to understand is computer vision is just like having tools in any toolbox. You’ve got to bring the right tool to the job. And this is something that I say all the time on the PyImageSearch blog. Just because you have a hammer, doesn’t mean that every project is a nail. You know? You would never use a hammer to like beat in a screw, right?

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: So, for example, deep learning may be all the rage right now, but depending on the problem and depending on the project, a simple application of computer vision will be just as successful and far easier to build both in terms of time, coding, and with actual time investment of yourself. Over my years in doing this, I’ve developed sort of a sixth sense just as most advanced practitioners will.

Adrian Rosebrock: But previously I would start with the basics first and say, “Will basic image processing algorithms work here?” And I would give it a try and just see how far it got me. And if that didn’t work, I’d take a step back and be like, “Okay. That didn’t work. What about some sort of feature extraction, image descriptors, and maybe a bit of machine learning? How far can I get with that?” And if that still didn’t work, I would take a step back and assess, you know, do I need other tools? What other tools in my toolbox can I pull out? Do I need deep learning? Do I need some sort of more advanced algorithm? And it’s so important to approach computer vision this way, because it’s a natural way of building things. You’re building on this knowledge that’s existed for years. You’ve gotta understand basic image processing, ’cause that will enable you to perform more advanced deep learning techniques. For example, data augmentation. You know, ways to generate more training data. That’s just, most of that’s just basic image processing, and it’s important for you to understand that. Continuing up the training you have to understand why’s feature extraction not enough in certain cases and when will it not work?

Adrian Rosebrock: It just builds this almost pyramid level of decisions and different topics that you can use and tools in your toolbox. So, that’s something I’m always recommending to PyImageSearch readers is, you know, just because deep learning is super popular right now, doesn’t mean that you can neglect basic computer vision algorithms.
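
To make the data augmentation point concrete, here is a minimal sketch in plain Python of three basic image operations that each turn one training image into another. The image is a toy grayscale grid of 0 to 255 values and the function names are illustrative; in practice a library such as OpenCV or a deep learning framework would do this on real arrays.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate by reversing the row order, then transposing."""
    return [list(column) for column in zip(*image[::-1])]


def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the valid 0..255 range."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


def augment(image):
    """Return extra training variants derived from a single image."""
    return [
        flip_horizontal(image),
        rotate_90_clockwise(image),
        adjust_brightness(image, 40),
    ]


# Toy 2x3 grayscale image.
image = [
    [0, 50, 100],
    [150, 200, 250],
]

variants = augment(image)
for variant in variants:
    print(variant)
```

None of this involves a neural network, which is the point being made: the extra training data comes from ordinary image processing.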

Kirill Eremenko: Uh-huh. (affirmative). Completely agree and that applies not just in computer vision. In any area of data science I also stand by that approach. The most successful data scientists have humility and what that means is a lot of the time ego can get in the way. You got a project you’re like, “Oh, I’m gonna go and put the most recent deep learning/artificial intelligence/computer vision algorithm I know that exists in the world to solve this problem.” Like, you feel that if you apply something simpler that might take you less time, less effort, that might be just copy pasting a few lines of code from an open source library or some open project, some example, that that is not enough effort on your behalf and that you want to actually build something really, really cool. Well, that’s not … in terms of business, that might be cool to explore and try out, but in terms of solving actual business problems, if you want to be an ultra successful data scientist that businesses really value and seek out, then you need to have that humility to put your ego aside and say, “All right. What’s the actual fastest, most efficient cost-effective way to get this problem solved?” And oftentimes, as Adrian you pointed out, it’s actually just the simple stuff works best.

Adrian Rosebrock: Yep. And I think it’s humility and I think it’s also the discipline in what you’re doing. You don’t want to be chasing the shiny object all the time. A new library comes out. It’s super shiny and super cool and you’re like, “Oh, I must have it. I must play with it.” And if you’re doing a hobby project or if this is something you’re doing on your nights and weekends for fun, yeah. Totally go do that and go explore, because I’m a big advocate of just throwing yourself in the deep end and learning how to swim; however if this is your job whether it’s data science or computer vision, you’re doing this on the job and helping your organization by writing this code or training this model, you have to have that discipline to take a step back and be like, “Okay. I know there’s this shiny thing over here. I’m going to write it down in my notebook and I’m gonna check it out when I get home, but while I’m here, while I’m on this job, I’m gonna do what I need to do to get it done.”

Kirill Eremenko: Fantastic. Fantastic advice. Thank you so much. At this point I want to again switch gears a bit. I want to talk about your Kickstarters. So, for our listeners who don’t know, Adrian’s had two Kickstarters and at the time that this podcast is live, Adrian’s got a third one running. So, let me get you up to speed. So, the first Kickstarter, “PyImageSearch Gurus: Become a Computer Vision + OpenCV Guru,” had 253 backers and raised a total of 34 thousand, almost 35 thousand dollars. The second Kickstarter was “Deep Learning for Computer Vision with Python,” the e-book, which had a thousand and fourteen backers and raised a whopping 262 thousand dollars. So, now, you’ve got a third Kickstarter running again at the time that this podcast is released. I’m very excited to see how that goes, because I’ve been following your work I think since the second Kickstarter that you launched. So, what would you say is the difference between the content that people get in your Kickstarters versus the free content on your website?

Adrian Rosebrock: So, there’s a lot of free content on the PyImageSearch blog. I have basic image processing examples. I have basic computer vision all the way up to more advanced deep learning tutorials. And there’s a lot of websites online today, you can find this content in a lot of different places just by doing a Google search, but what you don’t get through that is a nice linear cohesive way of starting as a basic … as someone who doesn’t have any knowledge or experience in computer vision all the way up to someone who is winning Kaggle competitions for example or applying deep learning and computer vision at their job or changing their careers.

Adrian Rosebrock: You know, having that linear path and that structure is so valuable and I believe a lot of people can really improve their careers just by studying it.

Kirill Eremenko: Fantastic. Totally, totally agree. That’s a wonderful thing that you’re doing to structure out that learning pathway, that journey for people. I think it’s very fair that for those who just want to explore computer vision on their own, there’s a lot of free content and definitely plenty of areas on YouTube and so on where that can be explored. But if you’re serious about computer vision, then you might be interested in considering a learning pathway. And what I like about your Kickstarters is that the pledge levels are not like … you can get in at a very low level if you just want to get started, and I think it’s as low as 50 dollars to get you into this Kickstarter. I think it’s totally worth it. So, guys, if you’re interested in computer vision, highly recommend checking this out. This Kickstarter’s going to end soon. So, if you’re on Kickstarter, head on over and search for Adrian’s latest creation.

Kirill Eremenko: And Adrian, I have another question for you. From what you’ve seen in the space of computer vision and how this field has been developing … you’re one of the top experts in this space … where would you say this field is going and what should our listeners look into to be prepared for the future that’s coming for computer vision?

Adrian Rosebrock: Man, that’s such a great question. You know, I can even start it off just by talking about basic data science and basic machine learning. You know, in the mid-nineties we had support vector machines and people thought they were just the best algorithms ever and they were just going to work on whatever project we applied them to and then in the 2000s we had ensemble methods with random forest specifically and then people were like, “Yeah. This is the future of machine learning.”

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: And then in 2012, we have AlexNet, we start to see the dawn of the latest incarnation of machine learning and deep learning. So, now we’re wrapping up the 2010s. We’re about to break into the 2020s. The big question is: What’s next? For me, I really think embedded computer vision and deep learning is going to be the next big thing in this niche. Just think of what’s happened in the past year. For example, the Raspberry Pi is more popular than ever due to the cheap board, the reasonable specs, and the small form factor. Intel –

Kirill Eremenko: Sorry. Sorry. I’ve got to interrupt you. Can you tell us what is the Raspberry Pi, because I only know very briefly about it. What is Raspberry Pi?

Adrian Rosebrock: So, the Raspberry Pi is a super cheap, 35 dollar, single board computer. It has like a gigabyte of RAM, a little over a gigahertz processor, but with four ports, and it’s great for both hobbyists who want to build their own applications and it’s great for teaching. For example, for kids in the STEM area. It’s also great for people who are building their own products, ’cause the board is so cheap, but it’s also really powerful in the sense that it can be deployed almost anywhere.

Kirill Eremenko: Gotcha. Gotcha. Okay. Thank you. Thank you. Please continue. I interrupted.

Adrian Rosebrock: Yeah, so the Pi is super popular. You have Intel releasing version two of their neural compute stick, which is just a plug and play. It looks like a thumb drive, but it actually has special hardware that allows inference and deep learning models to run really quickly.

Adrian Rosebrock: Google has just shipped Coral, their single-board and USB accelerator for artificial intelligence and deep learning and now just a couple weeks ago, NVIDIA announced the Jetson Nano, which many considered the 99 dollar version of the Raspberry Pi. It’s optimized for deep learning. It’s super powerful.

Adrian Rosebrock: This tells me three things. The first is that there’s a huge need for not only internet of things devices, but smart internet of things devices capable of leveraging computer vision and deep learning. The second thing it tells me is that this need is so strong that large tech giant companies such as Google and Intel and NVIDIA are starting to compete with each other through their own single board offerings. So, again, in 2010, we followed the rebirth and resurgence of neural networks and deep learning, but entering the 2020s, I really believe that we’ll start to see neural networks and deep learning brought together in small form factor systems and embedded devices and really just seeing more proliferation of computer vision and deep learning all the way around us.

Adrian Rosebrock: So, that’s what my new Kickstarter is all about. It’s going to focus on the Raspberry Pi for computer vision. We’re gonna cover more advanced stuff like Intel’s Movidius and Google Coral and the NVIDIA Jetson Nano and the goal of that is just to prepare you for this next big wave that’s going to be coming in the 2020s and that’s … I’ll say it again, it’s embedded computer vision and you’re going to start to see more and more products leveraging embedded systems.

Kirill Eremenko: Wow. Fantastic. Thank you. That’s a very, very clear description of the future. I don’t hear those that often. It seems like the way we’re going with these embedded computer vision and deep learning tools are really going to revolutionize just the world we live in, including as consumers the things that we operate with like our mobile phones and things like that. Sounds like a very exciting future and really cool to be part of it. Like, I’m very happy for you and your students that are in this space. Are you guys excited? How’s the community feel?

Adrian Rosebrock: Oh, the community is just so stoked on this, you know? It follows like a natural progression that I was talking about with my teaching. You know, the first book I ever wrote was the basics of computer vision. How this book can teach you that in less than a single weekend. It’s a short book that you can read quickly, build some cool stuff, and see it in action. And from that I created a much more in depth computer vision course similar to like a college level survey course, but much more practical and hands on. Then we did deep learning book specifically deep learning for computer vision in which we learned how to train models for image understanding.

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: And now, again, we’re going to continue building on that knowledge. Now the time is embedded computer vision. So we can take what we learned from the basics of computer vision. We can take our deep learning models and we can actually deploy them to these embedded systems.

Kirill Eremenko: Fantastic. Fantastic. Well, once again guys we’re going to be available on Adrian’s new Kickstarter, which you can find in the show notes or by checking out kickstarter.com. And that actually brings us to the end of today’s show. Adrian, thanks so much for coming on and being a pleasure. Before I let you go though, can you tell us what are some of the best ways for our listeners to contact you and follow your career and your work?

Adrian Rosebrock: Yeah, totally. Thank you, Kirill so much for letting me be on this podcast. It’s a privilege and you know, something I don’t take lightly. I appreciate everything you guys do. So, just let me start by saying thank you for that.

Kirill Eremenko: Thank you.

Adrian Rosebrock: So, if you want to reach out to me and if you want to talk to me, you can go to the pyimagesearch.com blog. That’s P-Y as in Python, pyimagesearch.com. There is a contact link and a form that you can fill out. You can email me directly at adrian@pyimagesearch.com. On Twitter @pyimagesearch or if you Google my name, you’ll find my LinkedIn profile as well.

Kirill Eremenko: Awesome. Fantastic. Of course we’ll include all those links in the show notes as well. One final question before we wrap up, what’s a book that you can recommend to our listeners, maybe on computer vision or anything else to help them in their careers?

Adrian Rosebrock: So, I was like thinking of a good answer to this question, ’cause I knew it was coming.

Kirill Eremenko: Uh-huh. (affirmative).

Adrian Rosebrock: One of my favorites is Sebastian Raschka’s Python Machine Learning. It’s so easy to follow and so easy to understand. I wanted to recommend it to you guys just as data scientists and data analysts. Just go read that book and you’ll get a lot of value out of it.

Kirill Eremenko: Gotcha. Thank you very much for your recommendation. Sebastian Raschka’s Python Machine Learning. And of course guys, Adrian has a few books of his own, so have a look at those as well.

Kirill Eremenko: On that note, Adrian, thanks so much for being here today. Really appreciate you taking the time and sharing this world of computer vision with us and our listeners.

Adrian Rosebrock: Oh, it’s my pleasure. Thank you for having me on.

Kirill Eremenko: So, there you have it. That was Adrian Rosebrock from pyimagesearch.com. I hope you enjoyed today’s chat as much as I did. And what was your favorite takeaway? For me, it was the comment that Adrian made about developing that sixth sense for understanding when to use which model from your toolkit and at the same time before you’ve developed that sixth sense, it’s about knowing how to decide which tool to use and not just throwing everything at the problem or just using one thing for all the problems, but having a toolkit and having a methodology about how you’re going to decide where to get started, with which tool to get started, which tool to try out next, and so on. I think that’s a very important concept to keep in mind in all areas of data science, not just in computer vision.

Kirill Eremenko: And on that note, of course, of course, check out Adrian’s Kickstarter, which is running right now. It has a limited number of days that it’s available for, so if you are interested in computer vision, then head on over to kickstarter.com and search for computer vision or Adrian Rosebrock or you can go to the show notes, which are at www.superdatascience.com/255. That’s www.superdatascience.com/255. There you will find all of the links to materials mentioned in the show including Adrian’s Kickstarter campaign so you can check it out from there as well.

Kirill Eremenko: On that note, thanks so much for being here today and if you know anybody who’s interested in computer vision, share the love. Send them the link to this episode so they can get these valuable insights as well. And I’ll see you back here next time. Until then, happy analyzing.
