Predicting Social Media Performance Using Machine Learning with a Personalized Dataset.
This project, part of the Machine Learning Engineering class (Spring 2024) at Rensselaer Polytechnic Institute, builds a supervised regression model that predicts the popularity (measured in "likes") of a proposed image, given a history of previous images (represented by extracted feature vectors) and their like counts as the target.
The problem I am investigating is pinning down the behavior of my personal followers on Instagram. Every post that I upload involves my face in some way, so my assumption was that all of my posts would receive a consistent number of likes and comments from the same individuals. However, even though all 826 of my followers have access to my account at all times, some pictures naturally receive more likes and camaraderie than others. I am not an Instagram influencer; however, I can see why others (especially those who depend financially on their Instagram performance) would want to track these patterns for their own audiences.
What makes this problem interesting is that, despite the inherent demand, I have not found an open-source project (on GitHub or elsewhere) that tackles it with a personal, individual dataset. Most “Instagram likes predictor” projects online rely on web scrapers to build massive Instagram engagement datasets, but those datasets defeat the purpose of predicting the behavior of followers within small, closed-off, private accounts like my own. In other words, these projects attempt to reverse engineer the Instagram algorithm to predict what type of content maximizes interactions between content and consumers on a constantly changing public “for you page”. On smaller, more intimate, private social media profiles, the user is not competing with others for likes, comments, or exposure, so the resulting “likes” data better reflects the dedicated interests of a loyal control group than an ever-changing flow of individuals who happen to stumble upon the content.
The purpose of this project is to model the relationship between my image data and my likes, in order to best predict how popular a post would be given my dedicated friends and family as a control group. Machine learning is well suited to this task: it represents data numerically and finds patterns in it, and predictions can then be made from those patterns with the trained model. My project focuses directly on the behavior of my personal followers and what they like or dislike, so data from my own account will be used to train and fit my machine learning model.
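As a rough illustration of the idea above (and not necessarily the model used in this project), the sketch below predicts the likes of a new image from its feature vector by averaging the likes of the most similar past posts. It assumes the feature vectors have already been extracted from the images. The technique shown is k-nearest-neighbor regression, chosen only because it is short enough to write with the Python standard library; the function name predict_likes and all of the numbers are hypothetical placeholders, not real data from my account.

```python
import math


def predict_likes(train_features, train_likes, query, k=3):
    """Estimate likes for `query` as the mean likes of its k nearest past posts.

    train_features: list of equal-length numeric feature vectors (one per past post)
    train_likes:    list of like counts, aligned with train_features
    query:          feature vector of the proposed image
    """
    if len(train_features) != len(train_likes):
        raise ValueError("train_features and train_likes must have the same length")
    if not train_features:
        raise ValueError("at least one past post is required")

    # Never ask for more neighbors than there are past posts.
    k = min(k, len(train_features))

    # Rank past posts by Euclidean distance between feature vectors.
    ranked = sorted(
        zip(train_features, train_likes),
        key=lambda pair: math.dist(pair[0], query),
    )

    # Average the likes of the k closest posts.
    nearest = ranked[:k]
    return sum(likes for _, likes in nearest) / k


# Toy, made-up data: four past posts with 2-dimensional feature vectors.
features = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
likes = [120, 110, 60, 70]

estimate = predict_likes(features, likes, query=(0.85, 0.15), k=2)
print(estimate)
```

With these placeholder values, the two nearest posts are the first two, so the estimate is the mean of their like counts. A real run would replace the toy vectors with features extracted from actual posts and would likely compare several regression models.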
Read "Social Performance Estimator.pdf" under "Literature Review" for more information on the project. Enjoy!