The importance of allowing ties to decay

March 20, 2009

Many social network datasets have been, and still are, collected through surveys and interviews. For example, an advice network could be collected by asking each individual within a group to name the people they go to for advice. Another, more rigid, method is to give each individual a list of the other people in the group and let them select the people they go to for advice (roster surveys).

In addition to a number of biases (e.g., the informant inaccuracy bias; Bernard et al., 1984; see my thesis for a critique), survey instruments and direct observation methods are generally labour-intensive and difficult to administer. As a result, most networks collected using these methods are of a fairly limited size, often comprising only a few tens (e.g., Bernard et al., 1988) or hundreds (e.g., Fararo and Sunshine, 1964) of people.

This limitation has been overcome by using archival data sources instead of surveys. For example, the online social network of 1,899 people used in Patterns and Dynamics of Users’ Behaviour and Interaction: Network Analysis of an Online Community could only reasonably be collected since the social interactions were automatically recorded. Other social network papers using archival data sources include Kossinets and Watts (2006) and Uzzi and Spiro (2005).

Although archival data sources allow larger networks to be collected and, in turn, more robust statistical analysis to be applied, a bias might be introduced into the data if information about the severing of ties is not included: archival data sources have a much better memory than individuals.¹ For a social network, this could imply that social interactions that are no longer relevant to an individual are recorded as being relevant. Moreover, the weight of ties might be overestimated. These issues do not arise when data is collected through surveys, as each individual would only list current or relevant friends with the current tie strength (if they are honest, that is).

In the empirical analysis of the online social network, we studied the network in two ways. First, we assumed that social ties never decay (the cumulative perspective). This assumes that if a social interaction is recorded on, for example, day 12, it is included in the analysis from that point onwards and always remains included. Second, we followed Kossinets and Watts (2006) and imposed lifespans on the social relationships. This ensured that, if two people do not continue to communicate over time, their tie will be severed. This also applied to the weighted network: if the rate of messages sent from one person to another decreases, the tie is weakened.
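The two perspectives can be sketched in a few lines of base R. This is only an illustration, not the code used for the paper: the data frame `events`, its column names, and the helper `ties_at()` are hypothetical, and a message is assumed to count towards a tie for exactly `window` days after it was sent.

```r
# Hypothetical toy data: one row per message (sender i, receiver j, day sent)
events <- data.frame(
  i   = c(1, 1, 2, 1, 3),
  j   = c(2, 2, 3, 3, 1),
  day = c(1, 3, 5, 30, 31)
)

# Ties present on a given day, with the number of messages as the weight w.
# window = Inf gives the cumulative perspective (ties never decay);
# a finite window ignores messages that are `window` or more days old.
ties_at <- function(events, day, window = Inf) {
  keep <- events$day <= day & events$day > day - window
  active <- events[keep, c("i", "j")]
  if (nrow(active) == 0) {
    return(data.frame(i = numeric(0), j = numeric(0), w = numeric(0)))
  }
  active$w <- 1
  aggregate(w ~ i + j, data = active, FUN = sum)
}

cumulative <- ties_at(events, day = 31)               # every tie ever observed
windowed   <- ties_at(events, day = 31, window = 21)  # only ties renewed recently
```

On day 31, the cumulative network in this toy example still contains the ties from the first week, whereas the 21-day window keeps only the two ties created on days 30 and 31. The weight of the tie from person 1 to person 2 is likewise only retained in the cumulative version.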

The length of the lifespan is crucial in determining which past events are taken into account to generate the network structure at a given point in time. By analysing which past events are relevant to the current state of the network, the length of the lifespan can be defined (Kossinets and Watts, 2006). An ill-defined lifespan will either break a continuous social interaction into independent sets of interactions (if it is too short) or combine two separate interactions into a single one (if it is too long).
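A small base R sketch may make the effect of the lifespan length more concrete. It is an illustration under stated assumptions rather than an established procedure: the vector `days` and the helper `episodes()` are hypothetical, and a tie is assumed to have expired once the gap since the previous message reaches the lifespan.

```r
# Hypothetical days on which one person sent a message to another
days <- c(1, 4, 9, 40, 43, 90)

# Number of separate tie episodes for a given lifespan: a new episode starts
# whenever the gap since the previous message is at least the lifespan,
# because the tie would have been severed in the meantime
episodes <- function(days, lifespan) {
  days <- sort(days)
  1 + sum(diff(days) >= lifespan)
}

# Compare a very short, a moderate, and a very long lifespan (in days)
n_episodes <- sapply(c(2, 21, 60), function(l) episodes(days, l))
```

With these toy data, a 2-day lifespan splits the messages into six episodes, a 21-day lifespan gives three, and a 60-day lifespan merges everything into a single relationship.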

To illustrate the difference between imposing a lifespan and not imposing one, the following figure shows results from the online social network where networks are constructed both cumulatively and with smoothing windows of 2, 3, and 6 weeks. Both panels in the figure highlight the vulnerability of network measures to the use of a smoothing window. Panel a suggests that there is only a small core of users that actively use the virtual community at the end of the observation period. An analysis of the cumulative network at that point would be heavily influenced by the majority of users that only used the network in the first 6 weeks, and would not reflect the current activities that are occurring in the community. This could bias network measures and, ultimately, the analysis. Panel b shows the evolution of one possible measure, the clustering coefficient. In particular, the clustering coefficient measured on the active core is mostly below the value found in the cumulative network.

Use of windows: (a) active users; (b) clustering coefficient

The above figure also highlights the sensitivity to sampling time. By using shorter lifespans, the network measures become more unstable and dependent on the time at which the observation is taken. Kossinets and Watts (2006) argued that network measures would remain stable over time. As a result, the average of the measures in a given observation period can be generalised to a longer period of time. The figure, however, suggests that, when social relationships have a lifespan, network measures are not stable. Therefore, it is difficult to infer from network snapshots stable network measures that can reflect the network structure over a longer period of time.

By allowing for the severing of ties and sampling the network structure at various times over a longer period (e.g., each day in the observation period as we did for the online social network), the validity and robustness of a network analysis could be improved.
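Sampling a measure on each day of the observation period could look roughly like the following base R sketch. The toy data frame `events`, the helper `active_users()`, and the 14-day window are assumptions made for illustration; the measure tracked here is simply the number of people with at least one message in the window.

```r
# Hypothetical toy data: one row per message (sender i, receiver j, day sent)
events <- data.frame(
  i   = c(1, 2, 3, 1, 4, 1),
  j   = c(2, 3, 1, 3, 1, 4),
  day = c(1, 2, 3, 10, 20, 40)
)

# Number of people who sent or received at least one message in the window;
# window = Inf corresponds to the cumulative network
active_users <- function(events, day, window = Inf) {
  keep <- events$day <= day & events$day > day - window
  length(unique(c(events$i[keep], events$j[keep])))
}

# Sample the measure on every day of the observation period
days <- 1:40
series <- data.frame(
  day        = days,
  cumulative = sapply(days, active_users, events = events),
  window14   = sapply(days, active_users, events = events, window = 14)
)
```

The cumulative series can only grow, while the windowed series rises and falls with actual activity, so the value obtained depends on the day at which the network is sampled.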
_____________________
¹ A number of other limitations, notably validity issues, could also be introduced into the data when using archival data sources.

Want to test it with your data?

First, you need to ensure that your data conform to the tnet standard for longitudinal networks. Then you need to load them into an R session.

Second, you need to download, install, and load tnet.

Third, by combining the add_window_to_longitudinal_data-function and the longitudinal_data_to_edgelist-function, an instantaneous structure of the network at any point in time can be created.

# Add the severing of ties after 21 days
net <- add_window_to_longitudinal_data(net, window=21)

# Create the static network on February 20, 2009 at 7am
net0902200700 <- longitudinal_data_to_edgelist(net[net[,1]<="2009-02-20 07:00:00",])

You can then use the other tnet functions to study the network.

# Average degree
tmp <- degree_w(net0902200700)
sum(tmp[,"degree"])/length(which(tmp[,"degree"]!=0))

# The global clustering coefficient
clustering_w(dichotomise(net0902200700))

References

Bernard, H. R., Killworth, P. D., Kronenfeld, D., Sailer, L. D., 1984. The problem of informant accuracy: the validity of retrospective data. Annual Review of Anthropology 13, 495-517.

Bernard, H. R., Killworth, P. D., Evans, M. J., McCarty, C., Shelley, G. A., 1988. Studying social relations cross-culturally. Ethnology 27 (2), 155-179.

Fararo, T. J., Sunshine, M., 1964. A Study of a Biased Friendship Network. Syracuse University Press, Syracuse, NY.

Kossinets, G., Watts, D. J., 2006. Empirical analysis of an evolving social network. Science 311, 88-90.

Uzzi, B., Spiro, J., 2005. Collaboration and creativity: The small world problem. American Journal of Sociology 111, 447-504.

Please cite or link to this post if you use it.

Entry Filed under: Network thoughts.


Welcome

Tore Opsahl

My aim for this blog is to explore and throw out in the open some of the ideas about social network analysis that I have but have no time to implement. Many of my ideas stem from my interest in weighted networks and my belief that the weights are an enormous source of data. However, many social network measures require that the weights are discarded. In so doing, the richness of the data is considerably reduced. In turn, this limits the analysis.

Licensing

The information on this blog is published under the Creative Commons Attribution-Noncommercial 3.0 licence.

This means that you are free to:
· share (copy, distribute and transmit)
· remix (adapt)
under the following conditions:
· attribution (you must cite this blog)
· noncommercial (you may not use it for commercial purposes).
