sunningilyA7/SparrowRecSys

SparrowRecSys

SparrowRecSys (Sparrow Recommendation System) is a movie recommendation system. The name comes from the saying "a sparrow is small, but it has all the vital organs": the project is small, yet it covers every part of a recommender system. It is a mixed-language project built with Maven, and it includes modules based on TensorFlow, Spark, and Jetty Server.

Environment

  • Java 8
  • Scala 2.11
  • Python 3.6+
  • TensorFlow 2.0+

Project data

The project data comes from the open-source movie dataset MovieLens. The dataset bundled with the project is a trimmed-down version of MovieLens that keeps only 1,000 movies and the related ratings and user data. To get the full dataset, download it from the official MovieLens website. The MovieLens 20M Dataset is recommended.
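As a minimal sketch of reading the rating data: the example below assumes the `ratings.csv` layout of the MovieLens 20M Dataset (`userId,movieId,rating,timestamp`). The sample rows and the `load_ratings` helper are made up for illustration and are not part of the project.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the MovieLens ratings.csv layout (userId,movieId,rating,timestamp).
SAMPLE_RATINGS_CSV = """userId,movieId,rating,timestamp
1,2,3.5,1112486027
1,29,3.5,1112484676
2,2,4.0,974820598
"""


def load_ratings(csv_file):
    """Group (movieId, rating) pairs by userId from a ratings.csv-style file object."""
    ratings_by_user = defaultdict(list)
    for row in csv.DictReader(csv_file):
        ratings_by_user[int(row["userId"])].append(
            (int(row["movieId"]), float(row["rating"]))
        )
    return dict(ratings_by_user)


# Hypothetical helper applied to the in-memory sample above.
ratings = load_ratings(io.StringIO(SAMPLE_RATINGS_CSV))
print(ratings)
```

For a downloaded file, the same helper could be called on `open("ratings.csv", newline="")`, where the path is whatever location the dataset was extracted to.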

SparrowRecSys technical architecture

The SparrowRecSys architecture follows the classic industrial-grade design for deep learning recommender systems. It includes offline data processing, model training, near-line stream processing, online model serving, and a front end that displays the recommendation results. The architecture diagram of SparrowRecSys can be summarized as follows:

  • The architecture is divided into three main sections: data processing, the model part, and the frontend part.

  • Data Processing Section

  • User Information: User data includes user actions, social relationships, and attribute tags.

  • Item Information: Item data includes item attributes, tags, and third-party information.

  • Context Information: Contextual data includes time, location, and other contextual parameters.

  • Data Processing Platforms:

  • Flink: Used for real-time data processing.

  • Spark: Used for offline data processing.

  • Redis: Used for storing user, item, and context features.

  • Feature Engineering:

  • User Features: User actions, social relationships, attribute tags.

  • Item Features: Item attributes, tags, third-party information.

  • Context Features: Time, location, and other contextual parameters.

  • Techniques: Normalization, binarization, non-linear transformations, ID features, one-hot encoding, embedding, feature combination.

  • Model Part

  • Recommendation System Model and Online Serving:

  • Cold Start Strategy

  • Recall Layer: Embedding, collaborative filtering, multi-dimensional tags, social relationships, freshness update.

  • Ranking Layer: Temporal and sequential models, LR (Logistic Regression), FM (Factorization Machines), MLR (Mixed Logistic Regression), deep learning models.

  • Filling Strategy Algorithm: Diversity, novelty, hotness, flow control, freshness.

  • Exploration and Exploitation: Interaction with the candidate item database.

  • Model Serving:

  • MLeap: Model deployment.

  • TensorFlow Serving: Model serving.

  • Model Training:

  • Platforms: Spark MLlib, TensorFlow.

  • Offline evaluation: Metrics include AUC, Recall, RMSE.

  • Frontend Part

  • Implementation: Built with HTML and JavaScript, using AJAX.

  • Recommendation Item List: Displays the recommended items.
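Three of the feature engineering techniques named in the Data Processing Section (normalization, binarization, and one-hot encoding) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the sample ratings and genres, and the 3.5 threshold are hypothetical and are not taken from the project, which lists Spark for its offline processing.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize(values, threshold):
    """Map each value to 1 if it reaches the threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in values]


def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 vector over a fixed vocabulary."""
    return [1 if value == item else 0 for item in vocabulary]


# Made-up sample data for illustration.
ratings = [1.0, 3.0, 5.0]
genres = ["Action", "Comedy", "Drama"]

normalized = min_max_normalize(ratings)   # [0.0, 0.5, 1.0]
liked = binarize(ratings, threshold=3.5)  # [0, 0, 1]; the threshold is an assumption
encoded = one_hot("Comedy", genres)       # [0, 1, 0]
```

One-hot vectors grow with the vocabulary size, which is one reason the list above also names embeddings for ID-like features.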

Deep learning models implemented in SparrowRecSys

  • Word2vec (Item2vec)
  • DeepWalk (Random Walk based Graph Embedding)
  • Embedding MLP
  • Wide&Deep
  • Neural CF
  • Two Towers
  • DeepFM
  • DIN (Deep Interest Network)
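To illustrate the random-walk step behind DeepWalk-style graph embedding, here is a minimal sketch, not the project's implementation. It assumes that item sequences (for example, movies a user watched in order) are turned into a directed transition graph, and that walks over this graph are later fed to a Word2vec-style model as "sentences". All function names and the sample sequences are hypothetical. The original DeepWalk formulation samples neighbors uniformly; weighting by transition counts is a choice made in this sketch.

```python
import random
from collections import defaultdict


def build_transition_graph(sequences):
    """Count item-to-item transitions between consecutive items in each sequence."""
    graph = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            graph[src][dst] += 1
    return graph


def random_walk(graph, start, walk_length, rng):
    """Walk from start, picking each next item in proportion to its transition count."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = graph.get(walk[-1])
        if not neighbors:  # dead end: the current item has no outgoing transitions
            break
        items = list(neighbors)
        weights = [neighbors[item] for item in items]
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node, walk_length, seed=0):
    """Start walks_per_node walks from every item that has outgoing transitions."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in list(graph):
            walks.append(random_walk(graph, start, walk_length, rng))
    return walks


# Made-up watch sequences of movie IDs.
sequences = [[1, 2, 3], [2, 3, 4], [1, 3, 4]]
graph = build_transition_graph(sequences)
walks = generate_walks(graph, walks_per_node=2, walk_length=4)
```

Each walk can then be treated like a sentence of item IDs, which is how the Item2vec and DeepWalk entries in the list relate: both end in a Word2vec-style training step, and they differ in where the sequences come from.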
