SDS 712: Code Llama

Podcast Host: Jon Krohn

September 8, 2023

Code Llama might just be starting a revolution in how data scientists code. In this week’s Five-Minute Friday, Jon Krohn investigates the suite of free-to-use models released under the Code Llama banner and how to find the best fit for your project’s needs.

 

By further training Llama 2 on 500 billion tokens of code and code-related data, Meta created Code Llama, filling the gap left by the older model: it handles a huge range of coding tasks, making it simple for those who write software to speed through their coding. Just type in a text prompt, and you’ll get the code you need.
You can select from an impressive range of model sizes and variants for Code Llama, all while keeping your proprietary data on your own servers. This flexibility lets you pick the model that best matches your productivity and budget needs. That’s great news for the future of coding, and even better, Code Llama and its predecessor Llama 2 are both free to use for research and commercial applications.
Listen to the episode to hear Jon explore the various model sizes and variants available, and learn which model to use for your project. This episode is also a precursor to Episode 713, where guest Thomas Scialom discusses Code Llama and other world-leading projects at Meta.

Podcast Transcript

(00:05):
This is Five-Minute Friday on Code Llama. 

(00:19):
Welcome back to The Super Data Science Podcast. I’m your host, Jon Krohn. Today’s episode is all about Code Llama, a model whose weights were recently released to the public by Meta. Code Llama is a Large Language Model derived from Llama 2, which I detailed back in Episode #702. One of my few criticisms of Llama 2 in that episode was that it didn’t handle coding tasks nearly as well as it did tasks that involved natural language only. Well, that’s exactly the gap that Code Llama was designed to fill. 
(00:52):
This means that Code Llama has the potential to revolutionize how we code, by making workflows faster and more efficient for data scientists and other folks who write software. And, like Llama 2, Code Llama’s free for both research and commercial use. All you have to do is fill out a super-short access-request form, which I’ve provided in the show notes at www.superdatascience.com/712. 
(01:15):
All right, so what’s up with this Code Llama? How is it different from Llama 2? Starting with Llama 2, Meta performed additional training on 500 billion tokens of code and code-related data. They also performed something called “infilling code training” that specifically allows the LLM to excel at filling in gaps in code. The result is that you can provide a text prompt to Code Llama like “provide me with a neural network in PyTorch that’s appropriate for an object detection task” and Code Llama will generate appropriate code for you automatically. 
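To make that concrete, here’s a minimal sketch of that prompt-to-code workflow in Python, assuming the community Hugging Face checkpoints; the “codellama/CodeLlama-7b-Python-hf” model ID, generation settings and hardware setup are illustrative assumptions, not something from the episode itself:

```python
# A minimal sketch of text-prompt-to-code generation with Code Llama,
# assuming the Hugging Face transformers library and the community
# "codellama/CodeLlama-7b-Python-hf" checkpoint (an assumed model ID).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-Python-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Code models continue from code-flavored prompts, so phrase the request as a comment
prompt = "# A PyTorch neural network appropriate for an object-detection task\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```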
(01:52):
The code data are quite diverse, so Code Llama can be used for code completion, debugging and more in a wide range of programming languages including Python, C++, Java, JavaScript, Bash and a bunch of others.
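Code completion is where that infilling training really shines. Here’s a minimal sketch of fill-in-the-middle generation, again assuming the Hugging Face checkpoints; the tokenizer for the 7B and 13B base models expands a special <FILL_ME> marker into the prefix/suffix format the model was trained on (treat the exact marker and model ID as assumptions and check the model card):

```python
# A minimal sketch of Code Llama's infilling (fill-in-the-middle) mode,
# assuming the "codellama/CodeLlama-7b-hf" checkpoint; its tokenizer turns
# the <FILL_ME> marker into the infilling format the model expects.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"  # assumed; 7B/13B base models support infilling
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

code = '''def remove_non_ascii(s: str) -> str:
    """<FILL_ME>"""
    return "".join(c for c in s if ord(c) < 128)
'''
inputs = tokenizer(code, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens: the model's proposed docstring
new_tokens = outputs[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```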
(02:06):
Not only that, but like Llama 2, Meta have released a range of model sizes for Code Llama. Specifically, Code Llama comes in 7B, 13B and 34B parameter versions. The largest, the 34B model, provides the best coding assistance, while either of the two smaller models can fit on a single GPU and therefore offer lower latency in real-time applications. Being able to choose between these three model sizes gives you the flexibility to weigh the performance, cost and inference-time tradeoffs of your particular use case. 
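To see why the smaller models fit on a single GPU, a quick back-of-the-envelope calculation helps: in half precision, each parameter takes two bytes, so the weights alone come to roughly the following (activations and the KV cache add more on top):

```python
# Rough VRAM needed just to hold the weights in half precision (fp16/bf16).
BYTES_PER_PARAM = 2
for name, params in [("7B", 7e9), ("13B", 13e9), ("34B", 34e9)]:
    gb = params * BYTES_PER_PARAM / 1024**3
    print(f"Code Llama {name}: ~{gb:.0f} GB of weights")
# 7B: ~13 GB and 13B: ~24 GB can fit on one modern 24-48 GB GPU;
# 34B: ~63 GB typically needs multiple GPUs or quantization.
```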
(02:42):
There’s even more optionality than model size, however, because Meta also released specialized Code Llama versions. Because Python is so critical to A.I. research and development, Meta trained their aptly-named “Code Llama — Python” variants on an additional 100 billion tokens of Python code. Separately, their “Code Llama — Instruct” variants were instruction fine-tuned on 5 billion tokens of natural-language instructions and the expected outputs. 
(03:10):
Like the general Code Llama family, the specialized “Code Llama — Python” and “Code Llama — Instruct” variants come in 7B, 13B and 34B parameter sizes. All in all, that means this release involves a total of nine models: the general Code Llama family, the “Code Llama — Python” family and the “Code Llama — Instruct” family, each of which has three model sizes. 
(03:34):
So, in this episode, I’ve already covered why you might choose one model size over another, but what about a particular variant? Well, you might go with “Code Llama — Python” if you know your use case involves Python code exclusively. And Meta themselves recommend that for natural-language applications you use a “Code Llama — Instruct” variant, because these are better at understanding what humans expect out of prompts and are also aligned, meaning that the “Code Llama — Instruct” variants are fine-tuned to generate safe and helpful answers in natural language.
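If you do go the Instruct route, here’s a minimal sketch of how you might prompt it, assuming the “codellama/CodeLlama-7b-Instruct-hf” Hugging Face checkpoint and the Llama 2-style [INST] chat format the instruction-tuned variants were fine-tuned on (both assumptions worth verifying against the model card):

```python
# A minimal sketch of prompting a "Code Llama - Instruct" variant with the
# Llama 2-style [INST] format; model ID and settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-Instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "[INST] Write a Python function that deduplicates a list while preserving order. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```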
(04:07):
Cool, so as you can always expect with a big splashy release like this, the Code Llama authors have, of course, compared all nine Code Llama versions against the major proprietary models that handle code well, such as OpenAI’s Codex, GPT-3.5 and GPT-4 models, as well as against the major open-source models that handle code well, such as StarCoder and the 70B-parameter Llama 2. At a tenth of the size, even the 7B Code Llama outperforms the 70B Llama 2 on the majority of the coding benchmarks Meta published. The largest Code Llama, the 34B-parameter version, comfortably outperforms all the open-source alternatives and performs comparably to GPT-3.5. The only model Meta shows outperforming their Code Llama is the mighty GPT-4, which outdoes even Code Llama 34B by quite a margin. Super impressive, but of course all of these results should be taken with a grain of salt since Meta published them themselves. The best way to confirm Code Llama actually is a game-changer for your code-generation use case is to experiment with it yourself!
(05:21):
With Code Llama, you can now build powerful Copilot-like tools yourself without needing to share your firm’s proprietary code with a third party: you can keep all your proprietary data on your own servers. You could also fine-tune Code Llama to your own use cases, potentially for as little as a few hundred bucks if you use a parameter-efficient fine-tuning technique like the ones I covered back in Episode #674. And, since you’re listening to a data science podcast, there’s a good chance that a “Code Llama — Python” variant is the perfect starting point for you. Again, the request form link to gain access to the model weights is in the show notes at www.superdatascience.com/712.
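As a taste of what that parameter-efficient route looks like, here’s a minimal LoRA sketch using the Hugging Face peft library; the rank, scaling factor and target modules below are illustrative assumptions rather than recommended values:

```python
# A minimal sketch of parameter-efficient fine-tuning (LoRA) on Code Llama
# with the peft library; hyperparameters here are illustrative assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Python-hf",  # assumed model ID
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
# From here, train on your own code with, e.g., transformers' Trainer.
```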
(06:03):
All right, that’s it for today. To hear more about Llama 2, Code Llama and other open-source LLM projects from Meta such as Toolformer and Galactica, check out the very next episode of this podcast, #713, in which Dr. Thomas Scialom, the extraordinary Meta research scientist leading all of these world-class projects, is our very special guest on the show.
(06:27):
All right. Please consider supporting this show by sharing, reviewing, or subscribing, but most importantly, just keep listening! Until next time, keep on rockin’ it out there and I’m looking forward to enjoying another round of the Super Data Science podcast with you very soon. 