The origin of bursts
In the famous paper The origin of bursts and heavy tails in human dynamics, Barabasi shows that the waiting times in human communication behavior follow a power-law distribution rather than the exponential distribution that a Poisson process would predict, and he argues that bursts originate from the queuing process of decision making.
In his book Bursts, Barabasi tells the story of how he came up with the idea: specifically, the secret of Poisson's success. A great scientist, Poisson made contributions in many fields. He had a habit of writing down each good research question he encountered and then returning to his ongoing work. Only after finishing the work at hand would he select the most interesting question from his list.
Heavy-tailed processes allow for very long periods of inactivity that separate bursts of intensive activity. I am interested in this model since:
Although I have illustrated the queuing process for e-mails, in general the model is better suited to capture the competition between different kinds of activities an individual is engaged in; that is, the switching between various work, entertainment and communication events. Indeed, most data sets displaying heavy-tailed inter-event times in a specific activity reflect the outcome of the competition between tasks of different nature.
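To make the queuing idea concrete, here is a minimal sketch of a priority-queue simulation. It assumes the setup as I understand it from Barabasi's paper: a task list of fixed length, a random priority for each task, and at every step the highest-priority task is executed with probability p (otherwise a random task is executed), after which a new task takes its place. The function name, the uniform priorities, and the default parameter values are my own illustrative choices, not the paper's code.

```python
import random


def simulate_priority_queue(list_length=2, p=0.99, steps=50_000, seed=0):
    """Return the waiting time (in steps) of every executed task.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    rng = random.Random(seed)
    # Each task is stored as (priority, step_at_which_it_arrived).
    tasks = [(rng.random(), 0) for _ in range(list_length)]
    waits = []
    for step in range(1, steps + 1):
        if rng.random() < p:
            # Priority-driven choice: execute the most urgent task.
            idx = max(range(list_length), key=lambda i: tasks[i][0])
        else:
            # Occasional random choice, regardless of priority.
            idx = rng.randrange(list_length)
        waits.append(step - tasks[idx][1])
        # The executed task is replaced by a fresh one with a new priority.
        tasks[idx] = (rng.random(), step)
    return waits


if __name__ == "__main__":
    bursty = simulate_priority_queue(p=0.99)
    memoryless = simulate_priority_queue(p=0.0)
    print("longest wait, priority-driven:", max(bursty))
    print("longest wait, random choice:  ", max(memoryless))
```

With p close to 1, most tasks are executed almost immediately while a few low-priority tasks wait a very long time, which is the heavy tail. With p = 0 the choice is purely random and the waiting times stay short.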
Bursts of video viewing activity
The metaphor of bursts is pretty good. It is even more obvious in video viewing activity.
In the paper Robust dynamic classes revealed by measuring the response function of a social system, Riley Crane and Didier Sornette argue that bursts of activity originate from endogenous and exogenous causes. In the endogenous case, an epidemic cascade of actions becomes the cause of future actions.
To fit this into Barabasi’s theory, we can understand an individual’s viewing behavior as the following process.
The action is for the individual to view the video in question after a time t since she was first subjected to the cause without any other influences between 0 and t, corresponding to a direct (or first-generation) effect.
However, there is a big problem: most audiences view a video immediately after they are exposed to it on YouTube.
Crane and Sornette describe an epidemic branching process that captures the cascade of influences on the social network. The model integrates both exogenous sources and interpersonal influence within the social network.
As we have discussed above, “by definition, the memory kernel φ(t) describes the distribution of waiting times between ‘cause’ and ‘action’ for an individual”.
μi is the number of potential viewers who will be influenced directly over all future times after ti by person i who viewed a video at time ti. Thus, the existence of well connected individuals can be accounted for with large values of μi. Lastly, V(t) is the exogenous source, which captures all spontaneous views that are not triggered by epidemic effects on the network.
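The three ingredients above combine into an instantaneous view rate, which in the paper's model is the exogenous source plus the sum of μi φ(t − ti) over all earlier views. Below is a rough sketch of that sum. The kernel form with a short-time cutoff c (so that it stays finite at t = 0 and integrates to 1), the value of theta, and all function names are my assumptions for illustration only.

```python
def memory_kernel(t, theta=0.4, c=1.0):
    """Power-law memory kernel, phi(t) ~ 1 / t^(1 + theta) for t >> c.

    The cutoff c is an assumption that keeps the kernel finite at t = 0
    and normalized to 1 on [0, infinity).
    """
    if t < 0:
        return 0.0
    return theta * c**theta / (t + c) ** (1.0 + theta)


def view_rate(t, past_views, exogenous=lambda t: 0.0, theta=0.4, c=1.0):
    """Rate at time t: V(t) + sum of mu_i * phi(t - t_i) over views with t_i <= t.

    past_views is a list of (t_i, mu_i) pairs (hypothetical input format).
    """
    triggered = sum(
        mu_i * memory_kernel(t - t_i, theta, c)
        for t_i, mu_i in past_views
        if t_i <= t
    )
    return exogenous(t) + triggered


if __name__ == "__main__":
    # One well-connected viewer at t = 0 and a weaker one at t = 3.
    views = [(0.0, 5.0), (3.0, 0.5)]
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, view_rate(t, views))
```

A large μi raises the rate for a long time after ti because the kernel decays slowly, which is how a single influential viewer can shape the later dynamics.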
Based on this model, they categorized the videos into four kinds: Endogenous-subcritical, Endogenous-critical, Exogenous-subcritical, and Exogenous-critical.
According to our model, the aggregated dynamics can be classified by a combination of the type of disturbance (endo/exo) and the ability of individuals to influence others to action (critical/subcritical).
Peak Fraction (F) is the fraction of views observed on the peak day compared with the total cumulative views. They calculate the fraction F and sort the time series into three classes:
Class 1 is defined by 80% ≤ F ≤ 100% ↔ Exogenous subcritical ↔ Spam videos ↔ decay exponent 1 + θ
Class 2 is defined by 20% < F < 80% ↔ Exogenous critical ↔ Quality videos ↔ decay exponent 1 − θ
Class 3 is defined by 0% ≤ F ≤ 20% ↔ Endogenous critical ↔ Viral videos ↔ decay exponent 1 − 2θ
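The peak fraction and the three thresholds above can be sketched in a few lines. This is only an illustration of the definition as stated here; the function names and the input format (a list of daily view counts for one video) are my own assumptions.

```python
def peak_fraction(daily_views):
    """Fraction of all views that fell on the single busiest day."""
    total = sum(daily_views)
    if total <= 0:
        raise ValueError("the series must contain at least one view")
    return max(daily_views) / total


def classify_by_peak_fraction(daily_views):
    """Return the class number (1, 2 or 3) using the thresholds on F."""
    f = peak_fraction(daily_views)
    if f >= 0.8:
        return 1  # 80% <= F <= 100%
    if f > 0.2:
        return 2  # 20% < F < 80%
    return 3      # 0% <= F <= 20%


if __name__ == "__main__":
    one_day_spike = [900, 50, 30, 20]
    slow_build_up = [10] * 10
    print(classify_by_peak_fraction(one_day_spike))
    print(classify_by_peak_fraction(slow_build_up))
```

Note that F depends only on the peak day and the total, so it says nothing about what the curve looks like before or after the peak; that is what the exponents are for.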
Demise of Bursts
I am analyzing the time series of video views and have found an interesting demise of bursts over time.
Interestingly, I find there is no burst for the most popular video, Charlie bit my finger – again! Its total view count is 456,651,832, and it has never stopped growing since it was uploaded to YouTube in 2007. Enjoy it http://t.cn/
The green line is the cumulative growth curve, and the red line is the normalized daily views. You can see that the red line grows steadily, without a burst. Nevertheless, the video outlives most of the other videos.