
Spotify Web AI DJ - client side agentic smarts using Gemma 2, two billion parameter LLM, to play what a user wants via natural speech input


jasonmayes/Web-AI-Spotify-DJ


Spotify Web AI DJ - create the perfect mix of music using a natural voice interface, powered by Google's Gemma 2 model on device, to act as an agent capable of using the Spotify API

A Web AI Agent that runs entirely client side in the browser and generates a Spotify playlist using Google's Gemma 2 (2B) model in JavaScript via WebGPU, thanks to the MediaPipe Web LLM library. Some extra function calling logic on top of the model lets the agent carry out what the user asked for and enables more advanced user experiences.
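To make the "function calling logic" idea concrete, here is a minimal sketch of one common way to do it: ask the model to reply with a small JSON object, pull that object out of the generated text, and route it to a matching handler. Everything named here (`tools`, `searchTracks`, `parseToolCall`, `dispatch`, and the `{"name": ..., "args": ...}` shape) is a hypothetical illustration and not taken from this repository. The only external detail relied on is the Spotify Web API search endpoint at `https://api.spotify.com/v1/search`.

```javascript
// Hypothetical tool registry: names and argument shapes are illustrative only,
// not the ones this project actually uses.
const tools = {
  // Builds a Spotify Web API search URL for tracks matching the query.
  searchTracks({ query, limit = 5 }) {
    const params = new URLSearchParams({
      q: String(query),
      type: 'track',
      limit: String(limit),
    });
    return `https://api.spotify.com/v1/search?${params}`;
  },
};

// Extracts the outermost {...} span from free-form model output and parses it.
// Returns null when no usable tool call is found.
function parseToolCall(modelText) {
  const start = modelText.indexOf('{');
  const end = modelText.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    const call = JSON.parse(modelText.slice(start, end + 1));
    return typeof call.name === 'string' ? call : null;
  } catch {
    return null; // The model produced malformed JSON.
  }
}

// Routes a parsed call to its handler, ignoring tools that are not registered.
function dispatch(modelText) {
  const call = parseToolCall(modelText);
  if (!call || !Object.prototype.hasOwnProperty.call(tools, call.name)) return null;
  return tools[call.name](call.args ?? {});
}
```

For example, `dispatch('Sure! {"name":"searchTracks","args":{"query":"lo-fi beats","limit":3}}')` returns a search URL that could then be fetched with an `Authorization: Bearer <token>` header. Returning `null` for anything unrecognised keeps a small model's occasional malformed output from triggering unintended actions.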

Watch this Web AI Agent demo in action on YouTube.

Prerequisites

  • Chrome - the LLM is accelerated using WebGPU.
  • A computer with a GPU - most will work as long as 4.5GB of VRAM is available, but a dedicated GPU will usually be much faster.
  • Patience - the model file is a 2.5GB download.
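Before starting a 2.5GB download it can be worth confirming that WebGPU is actually usable. The sketch below uses the standard `navigator.gpu.requestAdapter()` call. The helper name `checkWebGPU` and its status strings are assumptions made for illustration, not code from this project. It cannot verify the 4.5GB VRAM figure, since the WebGPU API does not report total video memory.

```javascript
// Hypothetical helper: reports whether WebGPU looks usable in this browser.
// `nav` defaults to the global navigator; passing it in keeps the check testable.
async function checkWebGPU(nav = globalThis.navigator) {
  // Browsers without WebGPU do not expose navigator.gpu at all.
  if (!nav || !('gpu' in nav)) return 'unsupported';
  // requestAdapter() resolves to null when no suitable GPU adapter is found.
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? 'ready' : 'no-adapter';
}
```

A page could call `checkWebGPU()` on load and only begin fetching the model when the result is `'ready'`, showing a message about browser or GPU support otherwise.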

Demo time

A live demo is available on Glitch. Open the link and wait for the 2.5GB model download to finish - it will take time, so use a good WiFi connection when opening!

If you run into any issues, or manage to get the Agent into an odd state, open the blue side panel on the right by hovering over it and click the clear memory button at the bottom to start over. You can also specify your own private API keys in the right side panel if you want to listen to full songs instead of the previews you get when not using a Spotify Premium account's API key.

Learn More

Designed and coded by Jason Mayes, 2025.

Learn more about the project this code is based on:

Questions / Contact

I will be adding to this over time, but if you have any questions or feedback you can find me over on LinkedIn or Twitter:
