SDS 660: Five Ways to Use ChatGPT for Data Science

Podcast Guest: Jon Krohn

March 10, 2023

The potential use of ChatGPT is more wide-ranging than we might think. In this episode, Jon Krohn lists five ways that the tool can be used for data science.

 
ChatGPT is best known as a tool that can generate intelligent strings of text in response to questions you ask of it. It is the stuff of Ask Jeeves’ dreams, and yet answering your most pressing concerns is not the limit of ChatGPT’s capabilities. The tool can also generate code in all of the primary software languages for data science (Python, R, SQL), a job that OpenAI’s Codex model was designed for explicitly. While we would never advise listeners to use this function blindly, ChatGPT can give you a quick start on extracting features from your data, implementing an algorithm, or creating a data visualization.
Another way that ChatGPT can help data scientists is by translating the code of one programming language into another. ChatGPT has access to training data across a whole roster of programming languages, and so that piece of unfamiliar code no longer has to remain unknown. This capability will come as a great boon for all data scientists—very few of us have an encyclopedic knowledge of every single programming language available to us, and with ChatGPT, we can now continue to “stay in our lane” while also augmenting our data science toolkit with several other languages.
Listen to the episode to hear Jon reveal the next three ways that data scientists can use ChatGPT to augment their work!


Podcast Transcript

(00:03):
This is Five-Minute Friday with Five Ways to Use ChatGPT for Data Science. 

(00:19):
Back in Episode #646, we focused on how anyone can extract commercial value from ChatGPT today, whether ye be a technical data science practitioner or not. In today’s episode, it’s exclusively the technical practitioners’ turn: I’ve got five specific ways that ChatGPT can be used for data science. 
(00:39):
Use case #1 is code generation. ChatGPT was designed primarily as a tool for generating natural language, whereas OpenAI’s Codex algorithm was designed explicitly for generating code (you can hear all about it in Episode #584). Nevertheless, the friendly, conversational UI of ChatGPT comes in handy for rapidly generating code. And it can do so in all of the primary software languages for data science, including Python, R, and SQL. ChatGPT’s code is not always going to be perfect, but for quick ideas on how you could be extracting features from your data, implementing an algorithm, or creating a data visualization, any of these kinds of things, ChatGPT is a great tool for getting started. All right, so that’s use case #1, code generation. 
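To make use case #1 concrete: a request such as "write a Python function that extracts simple time features from a timestamp" might produce something along the following lines. This is a hand-written sketch for illustration, not actual ChatGPT output; the function name `extract_time_features` and the choice of features are assumptions.

```python
from datetime import datetime


def extract_time_features(timestamp: str) -> dict:
    """Turn an ISO 8601 timestamp string into a few simple model features."""
    dt = datetime.fromisoformat(timestamp)
    return {
        "hour": dt.hour,
        "day_of_week": dt.weekday(),      # Monday is 0, Sunday is 6
        "is_weekend": dt.weekday() >= 5,  # Saturday or Sunday
        "month": dt.month,
    }


# Hypothetical input value, chosen only for the example
features = extract_time_features("2023-03-10T14:30:00")
print(features)
```

As the episode notes, generated code is a starting point: it is worth running it on a few known inputs before relying on it.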
(01:25):
Use case #2 is translating code between programming languages. Not only can ChatGPT convert your natural-language input into code, it can also translate between programming languages. So, if you, for example, are expert at Python but unfamiliar with an R code snippet you found online that you’d like to understand and implement in Python, you could ask ChatGPT to convert the R code into Python for you. Because ChatGPT has training data from many different programming languages, you can now convert perhaps any unfamiliar code you come across into a familiar target programming language of your choice. All right, so that was 2, translating code between programming languages. 
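As a small illustration of use case #2, here is a short R snippet (shown in the comments) alongside a Python translation of the kind one might ask for. Both the R snippet and the translation are hand-written for this example rather than taken from ChatGPT, and the input values are made up.

```python
import statistics

# Original R snippet (hypothetical example input):
#   x <- c(2, 4, 4, 5, 7)
#   z <- (x - mean(x)) / sd(x)

x = [2, 4, 4, 5, 7]
mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)  # sample standard deviation (n - 1), like R's sd()
z = [(value - mean_x) / sd_x for value in x]
print(z)
```

A translation is worth checking for subtle differences in defaults. Here, `statistics.stdev` matches R's `sd()` because both divide by n - 1, whereas `statistics.pstdev` would not.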
(02:07):
Use case #3 is code troubleshooting. Not only can ChatGPT help you with generating code, you can use it to explain errors that you’re coming across and provide suggestions as to how to fix them. You can even ask ChatGPT to rewrite your code for you so that it’s bug-free. All right, so use case #3 was code troubleshooting. 
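For use case #3, the typical workflow is to paste in the failing code together with the error message. The sketch below shows a common bug of that kind and one possible fix; it is a hand-written example with made-up data, not a recorded ChatGPT exchange.

```python
# Values read from a CSV file usually arrive as text (hypothetical data)
rows = [{"price": "19.99"}, {"price": "5.01"}, {"price": "10.00"}]

# Buggy version, which raises
# "TypeError: unsupported operand type(s) for +: 'int' and 'str'":
#   total = sum(row["price"] for row in rows)

# Fixed version: convert each text value to a number before summing
total = sum(float(row["price"]) for row in rows)
print(round(total, 2))
```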
(02:26):
Use case #4 is providing library suggestions. In Python or R, there are countless open-source libraries of code available to you. With ChatGPT, you can now quickly identify which library or libraries are best-suited to a particular task you’d like to perform with your code. 
(02:45):
And finally, use case #5 is article summarization. A seemingly endless number of fascinating articles on machine learning innovations are published on arXiv each week. Poring over each of the articles that interests you is likely to be impossible, but with ChatGPT you can instantly have articles summarized and key information extracted, making it much easier for you to stay on top of the latest data science developments. 
(03:10):
All right, I hope you found some of these ChatGPT-for-data-science use cases to be helpful. To recap, they are: 1. Code generation, 2. Translating code between languages, 3. Code troubleshooting, 4. Providing library suggestions, and 5. Article summarization. 
(03:28):
Thanks to Kirill Eremenko, founder and long-time host of the SuperDataScience Podcast, for suggesting these tips to me for an episode. He’s always full of great ideas. Ok, that’s it for Five-Minute Friday today. Until next time, keep on rockin’ it out there, folks, and I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon. 