The project encompasses these main stages:
- Web analytics data processing and preparation
- Feature engineering to establish an article engagement metric
- Training a recommendation algorithm using cloud-based ML tools
- Applying the model to generate content suggestions
- create_table.sql: SQL code for preparing and transforming raw analytics data
- train.sql: SQL code for building the recommendation algorithm
- predict.sql: SQL code for producing recommendations with the trained model
- bqml_ga360.ipynb: Jupyter notebook detailing the complete process with explanations
- Uses session duration as an indicator of article interest
- Implements data scaling and normalization methods
- Leverages cloud-based matrix factorization capabilities
- Demonstrates large-scale data handling in a cloud environment
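
The first two points can be illustrated with a small sketch. It is not taken from the project's SQL; it assumes one plausible scaling rule (divide each session duration by the median, so a typical session scores 0.5, then clip at 1.0). The function name `engagement_ratings` and the `(visitor_id, article_id, duration_ms)` tuple layout are hypothetical.

```python
from statistics import median


def engagement_ratings(sessions):
    """Turn raw session durations into ratings in [0, 1].

    `sessions` is a list of (visitor_id, article_id, duration_ms) tuples.
    Assumed rule: a session of median length scores 0.5, and very long
    sessions are clipped to 1.0 so outliers do not dominate training.
    """
    med = median(duration for _, _, duration in sessions)
    if med <= 0:
        raise ValueError("median session duration must be positive")
    return [
        (visitor, article, min(1.0, 0.5 * duration / med))
        for visitor, article, duration in sessions
    ]


# Illustrative data only, not real analytics output.
sessions = [
    ("v1", "a1", 10_000),
    ("v1", "a2", 20_000),
    ("v2", "a1", 100_000),
]
ratings = engagement_ratings(sessions)
```

Scaling by the median rather than the maximum keeps a single very long session from compressing every other rating toward zero.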
- Confirm access to a cloud-based dataset containing Google Analytics information
- Execute the SQL scripts in this sequence:
  1. create_table.sql
  2. train.sql
  3. predict.sql
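
If you prefer to run the scripts programmatically, the ordering above could be enforced with a small helper along these lines. This is a sketch, not part of the repository: `run_scripts` and `SCRIPT_ORDER` are hypothetical names, and `run_query` is whatever callable submits SQL text to your warehouse.

```python
from pathlib import Path

# Order matters: the table must exist before training,
# and the model must exist before prediction.
SCRIPT_ORDER = ["create_table.sql", "train.sql", "predict.sql"]


def run_scripts(run_query, script_dir=".", order=SCRIPT_ORDER):
    """Read each SQL script and pass its text to `run_query`, in order.

    Any exception raised by `run_query` propagates, so later scripts
    are not executed after a failure. Returns the names that ran.
    """
    executed = []
    for name in order:
        sql = Path(script_dir, name).read_text(encoding="utf-8")
        run_query(sql)
        executed.append(name)
    return executed
```

Assuming the dataset lives in BigQuery and the `google-cloud-bigquery` package is installed, `run_query` could be `lambda sql: client.query(sql).result()` with `client = bigquery.Client()`; calling `.result()` blocks until each job finishes, so the next script does not start early.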