SDS 584: OpenAI Codex

Podcast Guest: Jon Krohn

June 9, 2022

Welcome back to the Five-Minute Friday episode of the SuperDataScience Podcast!

This week, Jon reviews OpenAI’s Codex algorithm, which acts as a natural-language interface for generating code. With a waitlist for its API in effect since the summer of 2021, Codex is particularly adept at Python and works in more than a dozen other popular languages, including JavaScript, Go, and Shell.

OpenAI Codex is less well-known than GPT and DALL-E, though no less impressive. It acts as a natural-language interface for generating code and is derived from the GPT-3 natural language model; in addition to being trained on human language, it is also trained on billions of lines of code.
Conveniently for data scientists like many of this show’s listeners, Codex is particularly adept at Python, though it also works in more than a dozen other popular languages including JavaScript, Go, and Shell.
There’s been a waitlist since the summer of 2021 to gain access to the Codex beta API, but you can head to the Codex blog page now to be bewildered and mesmerized by the demo videos that illustrate the algorithm’s staggering and wide-ranging capabilities.
Beyond the demo videos, you can also get a sense of Codex indirectly via applications that make use of the Codex API. If, for example, you’ve ever used GitHub’s popular Copilot functionality in Visual Studio to get real-time suggestions on lines of code or functions you write, Copilot leverages Codex under the hood. Indeed, according to a recent OpenAI blog post, Codex powers 70 different applications.
Tune in to learn more about Codex and hear Jon review just a few of its practical applications. 
DID YOU ENJOY THE PODCAST?
  • How will you be using Codex in your work? Which practical application are you most likely to use for yourself?

Podcast Transcript

(00:06):
This is Five-Minute Friday on OpenAI Codex.

(00:28):
OpenAI, based out of San Francisco, is one of the world’s leading A.I. research labs. They’re responsible for the iconic GPT-3 natural language model that we detailed in Episode #559 as well as the remarkably stunning text-to-image generator DALL-E 2 that I covered in Episode #570.
(00:46):
Less well-known, though no less impressive, relative to their GPT and DALL-E series of models is OpenAI’s Codex algorithm, which acts as a natural-language interface for generating code. Codex is actually derived from the GPT-3 natural-language model but, in addition to being trained on human language, it is also trained on billions of lines of code. Conveniently for data scientists like many of this show’s listeners, Codex is particularly adept at Python, though it also works in more than a dozen other popular languages including JavaScript, Go, and Shell.
(01:22):
There’s been a waitlist since the summer of 2021 to gain access to the Codex beta API, but you can head to the Codex blog page right now to be bewildered and mesmerized by the demo videos that illustrate the algorithm’s staggering and wide-ranging capabilities.
(01:37):
One video shows the creation, from scratch, of an interactive video game where you fly a rocketship around in two dimensions to evade an asteroid. Another shows Codex rewriting Python code in Ruby. Yet another shows Codex solving elementary-school math problems by automatically converting the problems into sensibly named Python variables, performing the appropriate arithmetic on those variables, and then printing the correct answer to the screen. Of particular interest to data scientists is the video showing Python code being generated by Codex to download and plot weather data over a specified date range.
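To make the math-problem demo concrete, here is a minimal sketch of the kind of Python described above. The word problem and every variable name are invented for illustration; this is not output from Codex or code from the demo video.

```python
# Hypothetical word problem (an assumption, not taken from the demo):
# "Sam has 12 apples, gives 5 away, then buys 3 boxes of 4 apples each.
#  How many apples does Sam have now?"

# Step 1: turn the quantities in the problem into sensibly named variables.
starting_apples = 12
apples_given_away = 5
boxes_bought = 3
apples_per_box = 4

# Step 2: perform the arithmetic the problem describes.
apples_after_giving = starting_apples - apples_given_away
apples_bought = boxes_bought * apples_per_box
total_apples = apples_after_giving + apples_bought

# Step 3: print the answer to the screen.
print(total_apples)
```

The point of the pattern is that each number in the problem gets a descriptive name before any arithmetic happens, which keeps the generated code easy to check against the original question.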
(02:09):
Beyond the demo videos, you can also get a sense of Codex indirectly via applications that make use of the Codex API. If, for example, you’ve ever used GitHub’s popular Copilot functionality in Visual Studio to get real-time suggestions on lines of code or functions you write, Copilot leverages Codex under the hood. Indeed, according to a recent OpenAI blog post, Codex powers 70 different applications, including Pygma, which takes user-interface designs made in Figma and converts them into JavaScript frontend-framework code. There’s also Machinet, which generates unit-test templates to help you ensure your code is bug-free in production. And Warp, a command-line terminal that lets you use natural language to find shell commands right in the terminal instead of needing to switch to a web browser to try to find what you’re looking for on Stack Overflow or via a Google search.
(03:00):
Pretty amazing stuff all around. I love in particular that idea of user-interface designs that you have in Figma being automatically converted into JavaScript code. So cool. Codex’s capabilities are remarkable on their own; it’s even cooler to see dozens of practical applications springing up in the past year to take advantage of these capabilities. If you’re as fascinated as I am by all this, check out the Codex blog page for the mind-bending demo videos and get yourself on the waitlist for access to the Codex API.
(03:28):
The show notes for today’s episode contain links to all of the blog posts, companies, and SuperDataScience episodes that I mentioned today. All right, that’s it for this Five-Minute Friday episode. Keep on rockin’ it out there, folks, and I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon.