SDS 564: Clem Delangue on Hugging Face and Transformers

Podcast Guest: Clem Delangue

April 7, 2022

Welcome back to the Five-Minute Friday series of the SuperDataScience podcast!

This week, Jon is joined by the CEO of Hugging Face, Clem Delangue, for an informative discussion on open-source machine learning and transformer architectures. The conversation took place at the ScaleUp:AI Conference.

 

About Clément Delangue
Clément Delangue is co-founder and CEO of Hugging Face, the leading machine learning platform, used by more than 10,000 companies today. His passion for building machine learning products started more than 15 years ago while working on Moodstocks, a computer vision startup that was acquired by Google.
Overview
Clem Delangue is the CEO of Hugging Face (which is cleverly named after the emoji of the same name), a New York-based start-up that has raised $61 million in venture capital. They’re a platform used by more than 10,000 companies to build a broad range of machine learning products, features, and workflows. They are also behind an open-source library called Transformers, which is the most widely-adopted software library for ML models to deal with written or spoken language — a big exciting area of ML called natural language processing or NLP for short.
The transformer concept is only a few years old, and yet every cutting-edge natural language application already seems to be incorporating them. Jon sat down with Clem to discuss why he and his team decided to make their Transformers library freely available through an open-source license and chatted about examples of companies that have used Hugging Face to scale up their impact.
Jon also asked Clem to share his advice for companies that are interested in using ML but don't know where to start, and discussed the biggest roadblock he tends to see en route to production deployment. Tune in for more insights from Clem in this special episode.

Podcast Transcript

Jon Krohn: 00:02

This is Five-Minute Friday with special guest Clem Delangue, the CEO of Hugging Face.
Jon Krohn: 00:11
Yesterday at the ScaleUp:AI conference in New York, I interviewed Clem Delangue, the CEO of Hugging Face, for a super informative discussion on open-source machine learning and Transformer architectures. It was such a great conversation, I was confident you'd love to hear it too. So here you go. Enjoy.
Jon Krohn: 00:39
Hi there. Welcome back from the break, and thank you, Nicki. I'm Jon Krohn, chief data scientist at Nebula and host of SuperDataScience, the most listened-to podcast in the data science industry. I'm thrilled to be here with Clem Delangue, co-founder and CEO of Hugging Face, from my perspective the coolest, most prestigious company for a data scientist to work for today. Wow! What an honor to be here with you. We're bummed that due to pandemic-related travel restrictions, we can't be with you in person for ScaleUp:AI, but fortunately, we can still bring this enlightening conversation to you virtually.
Jon Krohn: 01:15
To give you some context on Hugging Face, which is named after the friendly-looking Hugging Face emoji: they are a New York startup that has raised $61 million in venture capital. They are a platform used by more than 10,000 companies to build a broad range of machine learning products, features, and workflows. They are also behind an open-source library called Transformers, which is the most widely adopted software library for machine learning models that deal with written or spoken language, a big, exciting area of machine learning called natural language processing, or NLP for short. We've got lots more on all of that from Clem himself. Bonjour, Clem. So how are you doing?
Clem Delangue: 01:55
I’m doing good. Thanks so much for having me. 
Jon Krohn: 01:58
Nice. Yeah, our pleasure. Perhaps you can start us off by filling us in on what Transformer architectures are and why they have become so ubiquitous in NLP and machine learning today. The Transformer concept is only a few years old, and yet every cutting-edge natural language application already seems to be incorporating them.
Clem Delangue: 02:20
I think the first fundamental paper came out in 2017; it was called "Attention Is All You Need," introducing this fundamental new way of doing machine learning based on transfer learning. So the idea of transfer learning is that you're going to train a model on a very large dataset; for NLP, or for text, that's a large dump of the web, basically. You scrape the web and then you train a very large model on that. And then you are going to be able to transfer this learning from one simple task, which is usually mask filling or text completion, to any other machine learning task. So to go from there to being able to classify text, for example the sentiment or the topic of a text, being able to extract information from text, or being able to classify an image, for example, in computer vision: it really came to cover the whole machine learning domain. Unlike a lot of different approaches in the past, it started to beat the state of the art on every single science benchmark that you can look at, creating more accurate predictions while at the same time being very easy for companies to use. So that's what we've seen at Hugging Face, with Transformers becoming almost the default way of doing machine learning and, at the same time, machine learning becoming the default way of building technology for companies.
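The transfer-learning pattern Clem describes here can be sketched in a few lines with the Transformers library itself. This is a minimal illustration, not code from the episode; the model checkpoints are commonly used defaults chosen for the example.

```python
# Sketch of transfer learning with pre-trained Transformer models.
# A model pre-trained on the simple task of mask filling can be reused
# (after fine-tuning) for other tasks, like sentiment classification.
from transformers import pipeline

# The simple pre-training task: fill in a masked word.
fill = pipeline("fill-mask", model="distilbert-base-uncased")
top_guess = fill("Machine learning is the [MASK] way of building technology.")[0]
print(top_guess["token_str"])

# The same pre-trained knowledge, transferred to sentiment classification.
classify = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classify("This conversation was fantastic.")[0])
```

The second pipeline is the same pre-trained DistilBERT backbone, fine-tuned on a sentiment dataset, which is the "transfer" step Clem is referring to.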
Jon Krohn: 04:07
Nice. And so why did you decide to take that amazing technology, Transformers and transfer learning, and make it freely available through an open-source license? How is that an effective business model for you, to be giving some of your most valuable intellectual property away with no practical limitations on use?
Clem Delangue: 04:28
That's a good question. So at Hugging Face, we believe that machine learning is becoming the default way of building technology. That's the fundamental technology trend of the decade. And we think it won't happen with just one company, even if it's a very large company like Google or Microsoft. If we want to make this happen, we have to involve the whole field. We have to involve all the scientists working on the domain to be able to democratize machine learning. So that's why we're taking a very open-source-driven, community-driven approach to building things. And it has worked wonders for us as a company, a little bit like GitHub, which has seen this network effect of software engineers sharing their code on the platform and other people adopting it. The same thing applies to us, in the sense that we have thousands of researchers who are sharing more than 30,000 open models on the Hugging Face platform. And it's attracting companies that are using the platform and then asking the next researchers to share their models. So there's this network effect that you can build in your startup thanks to a model like that.
Jon Krohn: 05:51
Nice. It’s a powerful platform, no doubt. And made even more powerful by all of those open source contributions. So speaking of companies that use it, do you have one or two examples of companies that have used Hugging Face to scale up their impact? 
Clem Delangue: 06:05
Yeah, we have a lot. So we have over 10,000 companies that are using us, from Microsoft, Facebook, and Google all the way to new startups building with machine learning. Some that I'm really excited about are things like Grammarly, for example, for grammatical error detection. If you're not a native English speaker, like me (obviously I'm French, as you can hear from my accent), it helps you correct your mistakes and write better. I think it's a really, really cool use case. I'm super excited, for example, about the Bloomberg Terminal using us to do summarization of text. I'm excited about companies like Segment.ai that are doing automatic segmentation of images to be able to detect objects in images. So there are like a million of them. What we are seeing now is that for every single feature, workflow, or product that companies build, you can almost start by thinking, how do I do that with machine learning, and almost use that as a starting point. And it's almost like, if machine learning doesn't work, then you fall back into this old way of building technology, which is writing a million lines of code. So we really envision this world where, in a few years, machine learning is going to be the default way to build technology, to build features, to build a product.
Jon Krohn: 07:48
Yeah, no doubt. That's the idea of software 2.0, in Andrej Karpathy's words, as opposed to software 1.0: in this software 2.0 world, instead of trying to hard-code how everything's connected, we'll have machine learning algorithms that can do that automatically, to greater efficacy than if we tried to hard-code it manually ourselves, and certainly with a lot less human effort. So it's very cool that Hugging Face is playing a key role in that. Do you have any advice for companies that are interested in using machine learning but don't know where to start?
Clem Delangue: 08:24
Obviously, they should start from the Hugging Face Hub, where there are like 30,000 open models that they can use without any training; they can use what is called a pre-trained model. So even if they don't have machine learning engineers, software engineers are now really able to use that and to build machine learning features. And then I would really recommend starting with a small, simple feature and really building a muscle, a machine learning muscle. Sometimes companies think that their first machine learning project is going to be this big conversational AI that is going to talk with customers about every single subject and answer 100% of questions, things like that. The truth is, it's really hard. And that's maybe something that you'll get to a few years after you've started building this machine learning muscle. So I would recommend starting with something really simple. Like, you have customer support emails: how do you classify them? How do you extract information from them? How do you add a chat interface, how do you build auto-complete for this text? Go from these very simple initial features, then build on that, build on the learning in your organization, and progressively you'll be able to get to more and more ambitious machine learning features.
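The "start simple" advice above, classifying customer-support emails with a pre-trained model from the Hub and no training at all, can be sketched like this. The checkpoint, the example email, and the label set are illustrative assumptions, not ones named in the episode.

```python
# Sketch: classify a support email with a pre-trained model from the
# Hugging Face Hub, using zero-shot classification so no training is needed.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

email = "My invoice from last month shows the wrong amount. Can you fix it?"
labels = ["billing", "technical issue", "account access"]

# The pipeline scores each candidate label against the email text.
result = classifier(email, candidate_labels=labels)
print(result["labels"][0])  # the highest-scoring label
```

This is exactly the kind of first "machine learning muscle" feature Clem describes: a software engineer can ship it without training a model, then graduate to fine-tuning once the workflow proves useful.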
Jon Krohn: 10:02
Nice. That is really good guidance, for sure. And then once a company does get going with machine learning, starting with baby steps, building little machine learning muscles, to use your analogy there, what's the biggest roadblock that you see en route to production deployments? So the company gets going, they're doing a bit of machine learning, but then they want to productionize. Where do they tend to run into trouble?
Clem Delangue: 10:30
Interestingly, for me, most of the challenges are more human than technology-related, especially when it comes to changing your mindset. For example, machine learning, compared to traditional software, is much less deterministic, much less explainable. And so you have to, as a product manager or as a founder of a startup, be comfortable with not completely understanding, and not being able to exactly predict, all the outcomes of your machine learning features. And that's the main roadblock, because it's a big mindset switch, especially for people who've been building software for 20 or 30 years in a very deterministic way. So changing your mindset about the potential of the feature, but also the limitations of the feature, to me is the biggest thing. And then if you manage to overcome that, I think, technology-wise, we're now at a state where you'll be able to bring value to your company by using machine learning. But you have to overcome this mindset switch in a way.
Jon Krohn: 11:52
Nice. So then are there things that people can do to mitigate the concerns that they have? If they have to take this leap and not understand exactly what the outcome might be, are there ways that people can become comfortable with these deployments and know that the tools aren't acting in a way that, say, is nefarious or incorporating unwanted bias, that kind of thing?
Clem Delangue: 12:21
Yeah. There are a lot of strategies to do that. The first is to be very transparent in the organization about what your model can and can't do. So for example, at Hugging Face, we have someone called Dr. Margaret Mitchell, who was previously the co-founder and co-lead of the ML ethics team at Google, and who pioneered something called model cards, which are a standardized way of describing what your model can and can't do, both in terms of functionality and in terms of ethical considerations: where it's going to be biased, where it's not going to be biased, and things like that. So you can use tools like that to create transparency within the organization, but also with your users and with your customers; it's really important. You can use tools like Gradio, for example, which is an easy way to create demos for your machine learning models so that other people can try them with their own examples and see if they work or don't work.
Clem Delangue: 13:35
So these are some of the ways that you can help make this mindset change and make everyone feel more comfortable about integrating machine learning. And the last one is obvious, but I think a lot of people forget it: you should work on your use case to make sure that you use machine learning the right way. So if you're not confident, or if it's too critical a feature, let's say you're building a hiring tool and you want to use machine learning: obviously, you shouldn't use a machine learning model to filter résumés, because you know that the machine learning model can be biased against women, for example, or against minorities. So you should use it maybe in addition to team members, in addition to humans, to complement them, but never as the only filter. So making sure that in your workflow, in your product, it's used on the right use case with the right setup around it is also a way to mitigate the risks.
Jon Krohn: 14:51
Nice. That was an awesome series of ways of mitigating risk and getting people over that hurdle, getting comfortable with putting their machine learning models into production. You've provided us with tons of exciting context on what's possible with natural language processing today, and on how easy open-source tools like Hugging Face can make training or using these machine learning models. For you, what are the most exciting natural language applications coming up that are maybe impossible today, or tough to scale today, perhaps due to narrow applicability or impractically high error rates, that you think will be widespread five to ten years from now? What, Clem, is the future of machine learning?
Clem Delangue: 15:42
That's a billion-dollar question. It's a really tough question to answer. One of the things that I'm excited about, especially because we started Hugging Face on this topic, is open-domain conversational AI: this obvious sci-fi dream that we've all seen in Her and movies like that, being able to chat with an AI about a lot of different topics. This is really hard to do. It's impossible to do today, but hopefully in a few years it will be possible. What's also exciting to me is that the Transformer architecture that we were talking about is actually starting to make its way into computer vision, into time series, into audio. And so what you're starting to see is that the lines between the different machine learning domains are getting blurrier and blurrier now. And so you are starting to be able to use Transformer models, and to use Hugging Face, in things like time series, for example. Uber, for example, is now using Transformers to do ETA prediction. You start to be able to use them for recommender systems. You start to be able to use them for biology, chemistry.
Clem Delangue: 17:04
And what's interesting is not only that audio and other domains are going to get accelerated the same way NLP has been, but also that you can start to merge them and use similar models to create multimodal models. So for example, if you think about the fact that we are now using Transformers on audio, detecting audio, it's really interesting to think that you can detect the audio and then analyze the text to do more powerful stuff. Or if you think about time series, which are used a lot for fraud detection, for example: if you add NLP to time series, then you get something even more powerful, because you get the fraud prediction based not only on the number of interactions with your product, for example, but also on the kind of interactions. If someone is sending you an email that is weird, then the probability of it being fraud increases. So merging all these different domains, which is something very new, because before they were very siloed, to me is extremely exciting. And I hope we'll see more of that in the future.
Jon Krohn: 18:32
Yeah, I agree, Clem: there's a huge amount of opportunity in the multimodal space in the years to come. And it seems really likely that Transformers and transfer learning, and probably Hugging Face, will be at the forefront of that change. All right, Clem, merci, I greatly appreciate the time you spent with us today. I learned a ton and, no doubt, our ScaleUp:AI attendees did as well. Looking forward to catching up with you again sometime soon.
Clem Delangue: 19:01
Thanks so much, and have a good event, everyone.
Jon Krohn: 19:04
All right, that’s it for this Five-Minute Friday episode as well. Keep on rocking it out there folks. And I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon. 