# mj1.at: Michael Jaros' Techblog

8 Apr 2012

## From SIR to AISI – adapting SIR for tweetflows

The length of the infectious period $t_I$ in a SIR model corresponds best to the time a service request stays visible to a follower on Twitter. This quantity is difficult to model, however, because it depends on the number of tweets in the follower's timeline at the moment of the service request. In other words, the infectious period is a parameter of each potentially receiving node, whereas in biological contagion it is a general parameter of the disease.

Instead of analyzing the infectious period for each individual node, I propose to heuristically classify the nodes in the originator's follower network as active (A) or inactive (I). Only active nodes can be susceptible or infectious. AISI (active/inactive, susceptible/infectious) is a tweetflow-specific simplification of the SIR model: instead of considering the infectious period for each vertex in the graph, a simple heuristic algorithm uses data available via the Twitter API to determine, at the outset, which nodes are likely to react to a service request.

AISI adapts a concept referred to as percolation [1], in which each vertex in the follower network is considered either open or closed. Instead of making this decision randomly, it is more accurate in the tweetflow scenario to classify vertices as open or closed based on the node classification (active/inactive).

Determining the active nodes among the followers is therefore the first sub-problem of the AISI model. It can be approached by analyzing how frequently a follower uses Twitter, passively or actively. The Twitter API currently does not provide any way to determine a user's last login date [2]. However, the date of the last status update is available, and it should allow for a better approximation than the last login date, because passive users are unlikely to retweet or answer a service request. The time window within which the last status update must have occurred for a node to be considered active should be similar to a typical $t_I$.
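The classification described above could be sketched roughly as follows. This is only a minimal illustration: the function name `classify_active`, the input mapping `last_status_at` and the 24-hour window are my assumptions, and the timestamps are expected to have been fetched from the Twitter API beforehand (no API call is shown here).

```python
from datetime import datetime, timedelta, timezone


def classify_active(last_status_at, now, window):
    """Return the set of followers considered active (A).

    last_status_at: hypothetical mapping follower -> datetime of the last
                    status update, or None if the follower never tweeted.
    now:            reference time, i.e. the moment of the service request.
    window:         timedelta that plays the role of a typical t_I.
    """
    active = set()
    for follower, last_update in last_status_at.items():
        # Followers without any status update are treated as inactive (I).
        if last_update is None:
            continue
        # Active if the last status update falls inside the time window.
        if now - last_update <= window:
            active.add(follower)
    return active


# Example with made-up data; the 24 h window is an arbitrary assumption.
now = datetime(2012, 4, 8, 12, 0, tzinfo=timezone.utc)
last_status_at = {
    "alice": now - timedelta(hours=2),
    "bob": now - timedelta(days=30),
    "carol": None,
}
print(sorted(classify_active(last_status_at, now, timedelta(hours=24))))
```

All followers that are not in the returned set are inactive, which corresponds to closed vertices in the percolation view.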

Last but not least, an open vertex (leading to an active node) is a necessary but not a sufficient condition for the infection of that node. The probability of infection for such an open vertex depends on several factors, both general and node-specific (e.g. skill, payoff).
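To illustrate how open vertices and the infection probability interact, here is a rough simulation sketch. It makes strong simplifying assumptions that are not part of the model itself: a single uniform infection probability `p_infect` stands in for the general and node-specific factors mentioned above, the originator is treated as infectious by definition, and the graph is a plain dictionary `followers` mapping each user to the users who follow them. All names are hypothetical.

```python
import random
from collections import deque


def simulate_aisi(followers, active, origin, p_infect, rng):
    """Spread a service request from origin through the follower network.

    followers: mapping user -> iterable of users following that user.
    active:    set of active nodes (open vertices); all others are closed.
    p_infect:  assumed uniform infection probability for an open vertex.
    rng:       random.Random instance, passed in for reproducibility.
    """
    infected = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for follower in followers.get(node, ()):
            # Closed vertices (inactive nodes) can never be infected.
            if follower in infected or follower not in active:
                continue
            # An open vertex is necessary but not sufficient for infection.
            if rng.random() < p_infect:
                infected.add(follower)
                queue.append(follower)
    return infected


followers = {"origin": ["a", "b"], "a": ["c"], "b": ["d"]}
active = {"a", "c", "d"}
print(sorted(simulate_aisi(followers, active, "origin", 0.5, random.Random(1))))
```

In this example `d` can never be reached, because its only path to the originator runs through the inactive node `b`. A more faithful version would replace `p_infect` with a per-node function of skill and payoff.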

References:
1. Easley and Kleinberg (2010): Networks, Crowds, and Markets, p. 572