# Trending Algorithm

Good trend detection depends on a few specific factors. The first is the _ratio of increase or decrease between two y-axis values over a given time span (x-axis)_. To detect a peak, you may use deltas _(Δf(x))_, which gives the following mathematical model:

_**ω** = (Δf(x)) / f(x) or **ω** = (Δy) / y_

Where **_y_** or **_f(x)_** represents the _social interaction score (SI)_ and **_ω_** represents the _social fluctuation ratio (SFR)_. Each SI is a _view, like, mention or comment_ on a specific aggregate. However, this ratio alone cannot detect peaks in a _high-mass environment_ such as the internet, which is both Exponential-Driven and high-mass: SFR (ω) is highly relative when aggregates of very different sizes are compared.

Consider the following scenario: aggregate "x" has 1 social interaction (SI) in the first hour (t) and goes from 1 to 10 SI in the next hour. Meanwhile, aggregate "y" goes from 123,980 to 125,980 SI in 1 hour (t). You may then calculate the increase or decrease ratio (social fluctuation ratio or ω) to determine whether there was a peak between the given hours (t).

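In this hypothetical scenario, taking the value at the end of the hour as the denominator, aggregate "x" has a social fluctuation ratio (ω) of 9 / 10 = 0.9 (90%), while aggregate "y" has an ω of 2,000 / 125,980 ≈ 0.0159 (1.59%). Consequently, if you used this mathematical model exclusively, _you would not get the result needed to calculate trends_: aggregate "x" looks like a far stronger peak even though aggregate "y" gained many more interactions.
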
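The ratio can be sketched in a few lines of Python. This is a minimal illustration, assuming (as the 0.9 figure for aggregate "x" implies) that the denominator is the SI value at the end of the interval. The function name `social_fluctuation_ratio` and the handling of a zero score are assumptions made for the example, not part of the original model.

```python
def social_fluctuation_ratio(previous_si: float, current_si: float) -> float:
    """Return ω = Δy / y, using the later value as the denominator."""
    if current_si == 0:
        # Assumption: an aggregate with no interactions has no fluctuation.
        return 0.0
    return (current_si - previous_si) / current_si


# The two aggregates from the scenario above.
omega_x = social_fluctuation_ratio(1, 10)
omega_y = social_fluctuation_ratio(123_980, 125_980)

print(f"x: {omega_x:.4f}, y: {omega_y:.4f}")  # x: 0.9000, y: 0.0159
```
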
Thus, you may consider **two factors** as key to discriminating between candidates. While ω (social fluctuation ratio) captures the SI increase or decrease ratio, it lacks a factor to _remove the high relativity_. This is where the **_valuable ratio (υ)_** comes in. Its mathematical model is:

_**υ** = f(x)/z_ or _**υ** = y/z_

Where **f(x)** represents the given SI (social interaction score) and **z** represents the _maximum SI score_ in the whole data set.

This **valuable ratio (υ)** gives you the _factor required to eliminate high relativity_ in a large data set. You can therefore calculate the score needed to detect a trend in an Exponential-Driven and high-mass environment.

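As a rough sketch, the valuable ratio can be computed by dividing each score by the largest score in the data set. The example below assumes, purely for illustration, that the data set contains only the two aggregates from the scenario. The names `valuable_ratio` and `current_scores` are hypothetical.

```python
def valuable_ratio(si: float, max_si: float) -> float:
    """Return υ = y / z, where z is the maximum SI score in the data set."""
    if max_si <= 0:
        raise ValueError("max_si must be positive")
    return si / max_si


# Assumption: the whole data set is just these two aggregates.
current_scores = {"x": 10, "y": 125_980}
z = max(current_scores.values())

upsilon = {name: valuable_ratio(si, z) for name, si in current_scores.items()}
print(upsilon)
```

Here the largest aggregate gets υ = 1 and every other aggregate gets a value between 0 and 1, which is what offsets the bias of ω toward very small aggregates.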
To conclude, _you must sum the two ratios_ to get an overall score called the **Trending Score** _(represented as Φ)_. It may be represented as the following mathematical model:

_**Φ** = ω + υ_

Where **υ** represents the _valuable ratio_, **ω** represents the _social fluctuation ratio_ and **Φ** represents the _trending score_. In addition, this lets you _store these results in a specialized database_ such as a data lake, query any aggregate by its high-cardinality fields, and output very specific results for each scenario _(such as trends for a specific user based on their consumed content and country, by denormalizing aggregate fields)_.
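
Putting the pieces together, a minimal Python sketch of the trending score might look like the following. It assumes ω uses the later SI value as its denominator and that z is the maximum current SI among the aggregates being compared. The `Aggregate` class and `trending_scores` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Aggregate:
    name: str
    previous_si: float  # SI at the start of the interval
    current_si: float   # SI at the end of the interval


def trending_scores(aggregates: list[Aggregate]) -> dict[str, float]:
    """Return Φ = ω + υ for each aggregate, keyed by name."""
    if not aggregates:
        return {}
    # z: maximum SI score in the data set being compared.
    z = max(a.current_si for a in aggregates)
    scores = {}
    for a in aggregates:
        # ω: social fluctuation ratio (assumed 0 when there are no interactions).
        omega = (a.current_si - a.previous_si) / a.current_si if a.current_si else 0.0
        # υ: valuable ratio (assumed 0 when the whole data set is empty of SI).
        upsilon = a.current_si / z if z else 0.0
        scores[a.name] = omega + upsilon
    return scores


aggregates = [Aggregate("x", 1, 10), Aggregate("y", 123_980, 125_980)]
scores = trending_scores(aggregates)

# Rank aggregates from highest to lowest trending score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['y', 'x']
```

With the scenario's numbers, aggregate "y" scores about 1.016 and aggregate "x" about 0.900, so "y" ranks first even though its ω alone is much smaller.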