MovieLens Dataset
The version of the MovieLens dataset we are working with (265 MB) contains about 27,000,000 ratings applied to 58,000 movies by 280,000 users. It also contains metadata such as each movie’s genres, which is essential for calculating the similarity between two movies.
This project builds a content-based movie recommendation system with two uses: identifying potential audiences for a movie, for advertising purposes, and recommending movies to customers based on their preferences. The model is k-nearest neighbors (KNN), with cosine similarity measuring how close two movies are. Movie popularity is also included in the model.
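As a rough illustration of the idea, the sketch below finds a movie's nearest neighbors by cosine similarity over binary genre vectors. It is a minimal example under stated assumptions, not the project's actual implementation: the toy `MOVIES` table and the helper names (`genre_vector`, `cosine_similarity`, `nearest_movies`) are hypothetical, and the popularity component is left out because its weighting is not described here.

```python
import math

# Hypothetical toy data (not taken from MovieLens): title -> set of genres.
MOVIES = {
    "A": {"Action", "Sci-Fi"},
    "B": {"Action", "Sci-Fi", "Thriller"},
    "C": {"Romance", "Comedy"},
    "D": {"Comedy"},
}


def genre_vector(genres, vocabulary):
    """Encode a set of genres as a 0/1 vector over a fixed genre vocabulary."""
    return [1.0 if genre in genres else 0.0 for genre in vocabulary]


def cosine_similarity(u, v):
    """Return the cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_movies(title, movies, k=2):
    """Return the k movies most similar to `title` as (title, similarity) pairs."""
    # Build the vocabulary from every genre that appears in the data.
    vocabulary = sorted(set().union(*movies.values()))
    target = genre_vector(movies[title], vocabulary)
    scored = [
        (other, cosine_similarity(target, genre_vector(genres, vocabulary)))
        for other, genres in movies.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    for other, similarity in nearest_movies("A", MOVIES, k=2):
        print(f"{other}: {similarity:.3f}")
```

A brute-force scan like this is only practical for small catalogs. For tens of thousands of movies, a library nearest-neighbor implementation that supports a cosine metric would be the more realistic choice.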