mejonyu/yale-wrapped

CS484 Project: What are Yalies Listening to?

Requirements

Install http-server, which is used to launch the application:

npm install -g http-server

Install the Google Generative AI package:

npm install @google/generative-ai

Running

From the root directory, run:

./run.sh

Problem Space

Many students like to listen to music throughout the day as they relax, study, do chores, play games, hang out with friends, and walk around campus. However, many play the same songs on repeat, grow bored of their current playlists, and find it difficult to discover new music they enjoy.

Tasks

  1. Users can add songs to a public, collaborative Spotify playlist
  2. Users can find new music added by their peers (other Yale students)
  3. Users can receive a personalized song recommendation based on the song they added to the playlist
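The first task relies on the Spotify Web API's "add items to playlist" endpoint. As a rough sketch of what that call looks like (the playlist ID, track URI, and token below are placeholders, not values from this project, and the helper name is our own):

```javascript
// Build the request for POST /v1/playlists/{playlist_id}/tracks,
// the Spotify Web API endpoint for adding tracks to a playlist.
// Requires an OAuth access token with the playlist-modify-public scope.
function buildAddTrackRequest(playlistId, trackUri, accessToken) {
  return {
    url: `https://api.spotify.com/v1/playlists/${playlistId}/tracks`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      // The endpoint accepts a JSON body with an array of track URIs.
      body: JSON.stringify({ uris: [trackUri] }),
    },
  };
}

// Usage (placeholder IDs; a real call needs a valid OAuth token):
// const { url, options } = buildAddTrackRequest(
//   "PLAYLIST_ID", "spotify:track:TRACK_ID", token);
// fetch(url, options).then((res) => res.json()).then(console.log);
```

Separating request construction from the `fetch` call keeps the playlist logic easy to test without hitting the network.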

Constraints

One important constraint of our deployment environment concerns the physical limitations of the microphone provided with the display system.

Because the microphone is wired, it must remain physically attached to the display system whenever it is used. The cable is short, which makes the microphone difficult to access and hold while speaking. Moreover, the cable plugs into the very back of the display system, so the user must stand within about 3 feet of the connection point to avoid pulling it out. In practice, the user has to stand directly next to the display screen to use the microphone, which not only inconveniences the user but also makes it difficult to seamlessly integrate speech with the motion features of our application.

Collaboration Record

Jonathan Yu - jwy2: Worked on routing and implemented the Spotify Web API call logic. Integrated motion-sensor functionality into the application, specifically timeout delays. Also styled the web pages.

Mike Masamvu - mam462: I worked on the individual web pages using HTML, CSS, and JavaScript. I also implemented the motion detection, integrated it with the web pages, and combined it with the speech-to-text functionality.

Michelle Zheng - mz539: I worked primarily on researching the usage policies of and implementing the Gemini API by Google to create and integrate a customized instance of the Gemini LLM that responds to user input. I used this in conjunction with the provided speech-to-text function (via OpenAI's Whisper) from the display system's microphone. I also researched different ways of wording and structuring the preamble prompt so that Gemini provides the best response. This involved looking into how large language models operate, including their transformer architecture, attention mechanisms, word-vector representations, and tokenization, as well as experimenting with prompting directly in the public Gemini interface.
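The Gemini integration described above can be sketched with the @google/generative-ai package installed in Requirements. The model name, prompt wording, and API-key handling here are illustrative assumptions, not the app's actual code:

```javascript
// Illustrative helper: build the prompt sent to Gemini after a user
// adds a song. The wording here is a placeholder, not the project's
// actual preamble.
function buildGeminiPrompt(songTitle) {
  return (
    `The user just added "${songTitle}" to the shared playlist. ` +
    "Recommend one similar song and explain the choice in one sentence."
  );
}

// Sketch of the API call using the @google/generative-ai SDK.
// The SDK is loaded lazily so the pure helper above can be used
// (and tested) without the package installed.
async function recommendSong(songTitle) {
  const { GoogleGenerativeAI } = require("@google/generative-ai");
  // Assumes the API key is supplied via an environment variable.
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" }); // placeholder model name
  const result = await model.generateContent(buildGeminiPrompt(songTitle));
  return result.response.text();
}
```

Keeping the prompt construction in a separate function makes it easy to iterate on preamble wording without touching the API plumbing.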

Modifications

References

Disclaimer

  1. This application uses the Gemini API by Google.
  2. Information generated by the Gemini API may be erroneous, and this can result in errors in the application.
  3. Information provided to the application may be sent to Google, and Google may use this data to improve and develop Google products.
