An exploration of the strategy games on the Apple App Store, presenting some interesting findings and insights. (Please refer to the Jupyter notebook for complete details, models, and explanations.)
The project is split into two sections. In the first (parts 1 to 4), all features of the dataset are used to predict each game's 'Average User Rating'. This section covers data cleaning, pre-processing, exploratory data analysis, feature engineering, model selection, and model validation. The prediction is treated as a classification task, using logistic regression and random forest, with cross-validation for model selection. In the second section, some well-known natural language processing methods are used to predict the same 'Average User Rating' from the description of the game. The text representations applied include count vectorization, tf-idf vectorization, and Doc2Vec embeddings; the models used include Naive Bayes, Support Vector Machine, and Logistic Regression. We also used k-means for clustering and PCA/t-SNE for dimensionality reduction.
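To make the tf-idf step more concrete, here is a minimal, standard-library-only sketch of how tf-idf weights could be computed for game descriptions. It is not the notebook's code: the function name `tfidf`, the whitespace tokenizer, and the toy descriptions are all hypothetical, and the smoothed idf formula with L2 normalization is an assumed variant (it matches the documented default of scikit-learn's `TfidfVectorizer`, which a real pipeline would more likely use).

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times smoothed idf: ln((1 + n) / (1 + df)) + 1.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        # L2-normalize so long and short descriptions are comparable;
        # fall back to 1.0 to avoid dividing by zero for an empty document.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


# Toy descriptions, made up for illustration only (not from the dataset).
descriptions = [
    "build your empire and conquer",
    "solve puzzle levels",
    "build towers and defend your empire",
]
vectors = tfidf(descriptions)
for description, vector in zip(descriptions, vectors):
    top_term = max(vector, key=vector.get)
    print(f"{description!r}: highest-weighted term is {top_term!r}")
```

Terms that occur in fewer descriptions receive a larger weight than terms shared across many descriptions, which is what lets a downstream classifier such as Naive Bayes or an SVM focus on the more distinctive words.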
Here are some highlights of the findings from our analysis:
- Games larger in size are likely to receive higher average user ratings.
- Games that are well-maintained and regularly updated are likely to be rated higher than those without regular updates.
- The 4+ age rating is the audience most often targeted by game developers.
- A large majority of strategy games are still under 250 MB in size, which suggests that not all users like heavy games.
- In 2019, a very high number of strategy games were introduced on the App Store, which suggests higher demand for this sector.