
TL;DR: Automatically summarize any YouTube video using OpenAI’s Whisper and ChatGPT.


What it does

In a nutshell, this Python script will automatically:

  1. Download the audio from a YouTube video using youtube-dl
  2. Generate the transcript using Whisper, OpenAI’s speech-to-text model
  3. Summarize it using ChatGPT
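
The three steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script from the Colab: `build_download_command`, `download_audio`, and `run_pipeline` are hypothetical names, the mp3 format and file naming are my choices, and the transcription and summarization steps are passed in as plain callables so the wiring can be tried without Whisper or an API key. The only real tool invoked is the `youtube-dl` command line, with its `-x`, `--audio-format`, and `-o` options.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def build_download_command(youtube_url: str, outputs_dir: str) -> List[str]:
    """Build a youtube-dl command that extracts the audio track as mp3."""
    # Hypothetical naming scheme: one file per video, named after its id.
    output_template = str(Path(outputs_dir) / "%(id)s.%(ext)s")
    return [
        "youtube-dl",
        "-x",                     # keep only the audio
        "--audio-format", "mp3",  # convert to mp3 (requires ffmpeg)
        "-o", output_template,    # where to write the file
        youtube_url,
    ]


def download_audio(youtube_url: str, outputs_dir: str) -> None:
    """Step 1: run youtube-dl, which must be installed and on PATH."""
    Path(outputs_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_download_command(youtube_url, outputs_dir), check=True)


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Steps 2 and 3: transcribe the audio file, then summarize the text."""
    transcript = transcribe(audio_path)
    return summarize(transcript)
```

With the `openai-whisper` package installed, the `transcribe` callable could be something like `lambda path: whisper.load_model("base").transcribe(path)["text"]`; the `summarize` callable would wrap whatever call sends the transcript to ChatGPT.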

How it works

Here’s a visual schematic:


I explain it in detail in the YouTube video.

Using it on actual videos

To demonstrate how well this works in practice, let’s summarize the YouTube video I made about it, using the summarizer itself. I know, meta, right?!

Here’s the code:

youtube_url = "https://www.youtube.com/watch?v=WtMrp2hp94E"
outputs_dir = "outputs/"

long_summary, short_summary = summarize_youtube_video(youtube_url, outputs_dir)
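
The call above returns two summaries. The post does not spell out how they are produced, so what follows is only one plausible approach, assumed for illustration: split the transcript into chunks, summarize each chunk and join the results into the long summary, then summarize that again to get the short one. The names `split_into_chunks` and `summarize_transcript`, the 500-word chunk size, and the `summarize` callable (a stand-in for the ChatGPT call) are all hypothetical.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, max_words: int = 500) -> List[str]:
    """Split a transcript into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_transcript(
    transcript: str,
    summarize: Callable[[str], str],
    max_words: int = 500,
) -> Tuple[str, str]:
    """Return (long_summary, short_summary) for a transcript.

    `summarize` is a stand-in for whatever sends text to the chat model.
    """
    # Long summary: one summary per chunk, joined in order.
    chunk_summaries = [
        summarize(chunk) for chunk in split_into_chunks(transcript, max_words)
    ]
    long_summary = "\n".join(chunk_summaries)
    # Short summary: a summary of the long summary.
    short_summary = summarize(long_summary)
    return long_summary, short_summary
```

Chunking by word count is a crude proxy for the model’s token limit; it keeps the sketch dependency-free, at the cost of occasionally cutting a sentence in half.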

And here’s the short summary it made of my YouTube video:

Short Summary

The video demonstrates how to use open AI’s speech to text model, ChaiGPT, and Librosa to automate the process of summarizing YouTube videos. The tool can provide both long and short summaries of the videos and is best suited for videos with a lot of audio. The examples given in the video include summaries of a video on stretching and a conversation on AI alignment and the future of AI.

Pretty good overview of the video!

Notice the typo “ChaiGPT”: it comes directly from the Whisper transcription. Most likely, the audio data Whisper was trained on never contained any reference to ChatGPT.


I was also surprised at how well this worked at summarizing a 45-minute interview, even though it has no idea who is speaking. Here are the summaries if you’re interested:

Short Summary

The co-founder and chief scientist of OpenAI discusses a variety of topics including AI alignment, economic value, reliability, and the future of AI as a collaboration between humans and machines. They acknowledge the difficulty of achieving alignment with models smarter than us and the potential for illegal or malicious uses of AI. They also emphasize the importance of data for making predictions about the future of AI and the potential for AGI to help people become more enlightened.

I’ll be giving that video a listen soon!


All code is available on Colab!