YouTube Timestamp Search lets you search for short segments in videos across YouTube. After you enter a query, the app returns 5 relevant segments, which may come from different videos. Clicking any of them takes you directly to the timestamped portion of the video that contains the segment. A short summary of each segment, generated by ChatGPT, is also displayed so that you can make an informed choice.
- Install libraries with `pip install -r requirements.txt`
- Obtain `youtube-api-credentials.json`
- Create `.env` (environment file) with OpenAI API key and organization key, as well as YouTube Data API key
- Run the app: `flask run`
- User Query: The user enters a search query, which is then used to obtain relevant videos and their transcripts through the YouTube API.
- Transcript Segmentation: The transcripts are divided into 3-minute sections to improve the granularity of search results.
- Embeddings Generation: Using OpenAI's API, specifically the ada model, embeddings are generated for both the segmented transcripts and the user's search query.
- Cosine Similarity Calculation: The cosine similarity between the embeddings of the query and the segmented transcripts is calculated to determine the relevance of each transcript section to the user's search query.
- Ranking: The application returns the top 5 video segments based on the cosine similarity scores, ensuring that the most relevant content is displayed to the user.
- Summarization: For each of the top results, a one-line summary of the 3-minute transcript section is generated using ChatGPT via OpenAI's API. This summary provides users with a brief overview of the content before clicking on the video.
- Direct Access: The user can click on the video thumbnail to go directly to the video at the timestamp where the segment begins.
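
The segmentation, similarity, ranking, and direct-access steps above can be illustrated with a minimal sketch. This is not the app's actual code: the function names, the `Segment` class, and the transcript entry shape (dicts with `text` and `start` in seconds) are assumptions made for illustration, and the embeddings are passed in as plain lists of floats rather than fetched from OpenAI's API.

```python
import math
from dataclasses import dataclass

SEGMENT_SECONDS = 180  # 3-minute sections, as described above
TOP_K = 5              # number of results returned to the user


@dataclass
class Segment:  # hypothetical container, not taken from the project
    video_id: str
    start: int  # segment start time in seconds
    text: str


def segment_transcript(video_id, entries, window=SEGMENT_SECONDS):
    """Group transcript entries into fixed-length time windows.

    Assumes each entry is a dict with 'text' and 'start' (seconds).
    """
    buckets = {}
    for entry in entries:
        index = int(entry["start"] // window)
        buckets.setdefault(index, []).append(entry["text"])
    return [
        Segment(video_id, index * window, " ".join(texts))
        for index, texts in sorted(buckets.items())
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_segments(query_vec, segments, segment_vecs, k=TOP_K):
    """Return the k (score, segment) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), seg)
        for seg, vec in zip(segments, segment_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def timestamp_url(segment):
    """Build a watch URL that opens the video at the segment's start."""
    return f"https://www.youtube.com/watch?v={segment.video_id}&t={segment.start}s"
```

In a sketch like this, the vectors passed to `rank_segments` would come from the embeddings step, and each returned segment's `timestamp_url` would back the clickable thumbnail.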