Topic: How does TikTok's data mining to improve its algorithm affect its users' addiction to the platform?
Why does this matter?
- The platform is huge and still growing
- Frontiers paper considers the algorithm the most advanced
- The target audience is young adults and adolescents
- 06-17 (31.59%)
- 18-24 (30.14%)
- 25-30 (20.85%)
- 31-35 (8.66%)
- 35+ (8.76%)
- The naive and easily absorbed nature of young people
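As a quick check on the age breakdown listed above, the shares can be summed to see how much of the user base falls into the younger brackets. This is only arithmetic on the figures in these notes; the variable names are illustrative.

```python
# Age brackets and shares exactly as listed in the notes above.
age_share = {
    "06-17": 31.59,
    "18-24": 30.14,
    "25-30": 20.85,
    "31-35": 8.66,
    "35+": 8.76,
}

# Round to two decimals to avoid floating-point noise in the sums.
total = round(sum(age_share.values()), 2)
under_25 = round(age_share["06-17"] + age_share["18-24"], 2)
thirty_and_under = round(under_25 + age_share["25-30"], 2)

print(f"total: {total}%")
print(f"under 25: {under_25}%")
print(f"30 and under: {thirty_and_under}%")
```

By these figures, roughly six in ten users are under 25, which supports the point that the audience skews toward adolescents and young adults.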
What does most research on the topic of social media addiction fail to account for?
The role of users' personal perception outside of psychological mechanisms (dopamine release)
What does this study do differently?
Examines the effects of TikTok's technical environment factors (information and system quality) and their impact on adolescents' inner perception
Adopts an extended version of the SOR model based on environmental psychology
What is the SOR model?
- SOR Model: Stimulus Organism Response model
- Proposes that environmental stimulants (S) work in conjunction to affect a user's internal state (O) and elicit some response (R)
- Stimulus: Information/system quality
- Information quality refers to content and is measured in conciseness, subscription and usefulness.
- System quality is the application/algorithm and is measured in flexibility, integration, ease of use, and response time
- Organism: The users' internal state of perception, feelings, and thinking composed of cognition and emotion
- Cognition: time distortion
- Emotion: enjoyment/concentration/boredom/etc.
- These were components of the online flow experience (The inner feeling of enjoyment that encourages further participation)
- The heaviest weight (seemingly) is concentration
- Response: Depicted as access/avoidance attitude or behavior generated from external environment (S) or internally from the user (O)
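To keep the three stages straight, the SOR breakdown above can be written down as a small data structure. This is only an organizational sketch of these notes, not anything taken from the paper: the class names and the `sor_chain` helper are hypothetical, and the emotion list is incomplete because the notes end it with "etc."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    """(S) Technical environment factors and how each is measured."""
    information_quality: tuple = ("conciseness", "subscription", "usefulness")
    system_quality: tuple = ("flexibility", "integration", "ease of use", "response time")


@dataclass(frozen=True)
class Organism:
    """(O) The user's internal state: cognition plus emotion."""
    cognition: tuple = ("time distortion",)
    # Not exhaustive: the notes list these and then "etc."
    emotion: tuple = ("enjoyment", "concentration", "boredom")


@dataclass(frozen=True)
class Response:
    """(R) The attitude or behavior that results."""
    outcomes: tuple = ("access", "avoidance")


def sor_chain():
    """Return the stages in S -> O -> R order as (label, instance) pairs."""
    return [("S", Stimulus()), ("O", Organism()), ("R", Response())]


for label, stage in sor_chain():
    print(label, stage)
```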
(According to the paper) What data was explicitly said to be mined?
- Login method
- Network-associated personal info
- content of likes
Keywords/phrases
- Entertainment spiral
- Closed Loop relationship: between the addiction and algorithm optimization
- Flow Experience: an inner feeling of enjoyment, concentration, time distortion
- Also found in the context of online gaming
- The (S) in SOR could refer to the layout or physical interface used (phone/tablet/desktop)
Conclusions: TikTok understands how advanced its algorithm is and should impose some form of intermission system to help adolescents manage their viewing time, given their limited self-control. Because TT weighs concentration heavily in its algorithm, which ultimately leads to addiction, it could easily implement this. Alternatively, it could recommend educational content. TT should be a platform that looks to inform and educate just as much as it entertains. Parents and schools should also take steps to reduce adolescents' viewing time.
- Labels it the fastest growing social network
- 100 automated bots used
- no gender
- different predefined interests
- varying DOB
- Uses secret algorithm to gather data (obvious)
- Spoke to company execs
- shares, likes, follows, what you watch
- determined watch time as most important (only one needed)
- app tracks watch time, skips, rewatch, etc.
- TikTok Experience
- suggested popular videos (wide variety shown)
- based on initial responses/exploration it'll recommend you similar videos
- more niche with less views
- Learned bots' interests in less than 2 hours (sometimes less than 40 minutes)
- Guillaume Chaslot (Data scientist | Founder of Algotransparency)
- On YT more than 70% of the views come from recommendation engine
- On TT it's 90+%
- Similar to what we see on YT we see on TT
- Users get rabbit-holed into a content spiral (echo chamber)
- Despite being given a "Not Interested" button on videos, the algorithm seems to follow its own understanding of the user over what the user says
- Investigation steps:
- kentucky_96 is a bot
- interest: sadness
- first sadness video found less than 3 min in (15th video); stops scrolling and watches this video twice
- things noted by TT (metadata)
- Audio
- Author of video
- Description
- Hashtags
- 23 videos later another sadness video (4 more minutes of watching)
- vid 57 repeated watch of heartbreak video
- vid 60 emotional pain video
- vid 80 (15 mins in) relationship video (scrolls away before finishing)
- Pauses on video with tag #mentalhealth
- swipes quickly past videos about relationships
- lingers over video with tag #depression and #anxiety
- Analysis:
- 278 videos (36 minutes of watch time total)
- 93% of videos being recommended are depression related
- Using the "Not Interested" button wasn't enough to get a different feed
- Changing viewing habits did
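The analysis figures above imply how little time the bot spent on each video. The arithmetic below is a rough sketch: it assumes the 93% figure applies to the same 278 videos, which these notes do not state explicitly, and the variable names are illustrative.

```python
videos_served = 278       # total videos in the session, from the notes
watch_minutes = 36        # total watch time, from the notes
depression_share = 0.93   # share of recommendations that were depression related

# Average dwell time per video, in seconds.
avg_seconds_per_video = watch_minutes * 60 / videos_served

# Approximate count of depression-related videos (assumes 93% of all 278).
depression_videos = round(videos_served * depression_share)

print(f"average time per video: {avg_seconds_per_video:.1f} s")
print(f"depression-related videos: about {depression_videos}")
```

Under that assumption, the feed narrowed after an average of under eight seconds of attention per video.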
TT's algorithm results in users regressing into the farthest corner of their niche.
A user who explores sad content through a single video could eventually be pushed into a rabbit hole of depression-related content.
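The investigation's main finding, that watch time alone is enough to reveal an interest, can be illustrated with a toy example. This is not TikTok's algorithm, which is not public. It is a minimal sketch that assumes interest is proxied by total seconds watched per hashtag; the `infer_interests` function and the viewing log are hypothetical.

```python
from collections import defaultdict


def infer_interests(events):
    """Rank hashtags by total watch time.

    events: iterable of (hashtags, seconds_watched) pairs, one per video.
    Returns (hashtag, total_seconds) pairs, highest total first.
    """
    totals = defaultdict(float)
    for hashtags, seconds in events:
        for tag in hashtags:
            totals[tag] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log: quick swipes past some videos, lingering on others.
log = [
    (["#relationships"], 2.0),
    (["#depression", "#anxiety"], 30.0),
    (["#mentalhealth"], 25.0),
    (["#cooking"], 1.5),
    (["#depression"], 40.0),
]

ranking = infer_interests(log)
print(ranking[0])
```

No likes, shares, or follows appear in the log. Lingering and rewatching are enough to push one tag to the top, which matches how the bot's feed narrowed.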
What is TikTok (TT)?
- Successor to Musical.ly app (Acquired by TT owner ByteDance)
- Approximately 1 Billion monthly active users
Recommendation Engine
- "For You" page is curated content selected using TT's algorithm
- Considers how both Instagram (IG) and TT weigh factors:
- previous video interactions
- accounts
- hashtags
- location
- language preferences
- user created content
- TT opens with video recommendations (What the platform thinks you like)
- IG opens with content from followed accounts (What the user thinks they like)
- Claims TT uses AI and data mining "practices" in its recommendation engine
- TT openly claims it uses these factors (categorized) in a blog:
- User Interactions:
- videos watched
- videos liked/shared/commented on
- content created
- Video Info/details:
- Hashtags
- Sounds
- Captions
- Device and account settings*:
- Device type
- Language Preference
- Country Setting
* While these are used, they carry less weight in the algorithm because they aren't factors the user actively expresses
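The categorized factors above can be pictured as a weighted score in which device and account settings count for less. TikTok does not publish its weights, so the numbers, the `WEIGHTS` table, and the `score` function below are assumptions made purely for illustration.

```python
# Hypothetical weights: only the ordering (settings weigh least) follows the notes.
WEIGHTS = {
    "user_interactions": 0.6,
    "video_info": 0.3,
    "device_and_account": 0.1,
}


def score(signals):
    """Weighted sum of per-category match scores, each expected in [0, 1]."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)


# Video A matches the user's past interactions; video B only matches settings.
video_a = {"user_interactions": 0.9, "video_info": 0.5, "device_and_account": 0.0}
video_b = {"user_interactions": 0.1, "video_info": 0.5, "device_and_account": 1.0}

print(f"A: {score(video_a):.2f}, B: {score(video_b):.2f}")
```

With these illustrative weights, a video that matches what the user has actively engaged with outranks one that only matches their language or country settings.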
Data collected, per TikTok's privacy policy:
- Location data
- IP Address
- Keystroke Patterns
- Device Type
- Search History (In-App Browser)
- Content of messages exchanged within app
- Phone Number *
- Phone book *
- Social-network contacts *
- GPS Data *
- User Age *
- User-generated content (photos/videos) *
- Stored payment info *
- Videos liked/shared/watch time *
* With permissions