A great number of social network datasets have been, and are, collected through surveys and interviews. For example, an advice network could be collected by asking each individual within a group to name the people they go to for advice. Another, more rigid, method is to give each individual a list of the other people in the group and let them select the people they go to for advice (roster surveys).
In addition to being subject to a number of biases (e.g., the informant inaccuracy bias; Bernard et al., 1984; see my thesis for a critique), survey instruments and direct observation methods are generally labour-intensive and difficult to administer. As a result, most networks collected using these methods are of a fairly limited size, often comprising only a few tens (e.g., Bernard et al., 1988) or hundreds (e.g., Fararo and Sunshine, 1964) of people.
This limitation has been overcome by using archival data sources instead of surveys. For example, the online social network of 1,899 people used in Patterns and Dynamics of Users’ Behaviour and Interaction: Network Analysis of an Online Community could only reasonably be collected because the social interactions were automatically recorded. Other social network papers that use archival data sources include Kossinets and Watts (2006) and Uzzi and Spiro (2005).
Although archival data sources allow larger networks to be collected and, in turn, more robust statistical analysis to be applied, a bias might be introduced into the data if information about the severing of ties is not included: archival data sources have a much better memory than individuals.¹ For a social network, this could imply that social interactions that are no longer relevant to an individual are recorded as being relevant. Moreover, the weight of ties might be overestimated. These issues do not arise when data is collected through surveys, as each individual would only list current or relevant friends with the current tie strength (if they are honest, that is).
In the empirical analysis of the online social network, we studied the network in two ways. First, we assumed that social ties never decay (the cumulative perspective). Under this assumption, a social interaction recorded on, for example, day 12 is included in the analysis from that point onwards and is never removed. Second, we followed Kossinets and Watts (2006) and imposed lifespans on the social relationships. This ensured that, if two people did not continue to communicate over time, their tie was severed. The same applied to the weighted network: if the rate of messages sent from one person to another decreased, the tie was weakened.
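The two perspectives can be illustrated with a minimal base-R sketch. The `events` data, the `snapshot` helper, and the 21-day lifespan are hypothetical choices made for illustration only; they are not the tnet implementation or the data used in the analysis.

```r
# Toy longitudinal edge list (hypothetical): day t, sender i, receiver j
events <- data.frame(
  t = c(1, 2, 3, 12, 20, 40, 41),
  i = c(1, 1, 2, 1, 3, 3, 1),
  j = c(2, 2, 3, 2, 1, 1, 3)
)

# Weighted edge list built from all messages sent in the interval (from, to]
snapshot <- function(events, to, from = -Inf) {
  kept <- events[events$t > from & events$t <= to, ]
  if (nrow(kept) == 0) {
    return(data.frame(i = numeric(0), j = numeric(0), w = numeric(0)))
  }
  # Tie weight = number of messages sent from i to j in the interval
  net <- aggregate(data.frame(w = rep(1, nrow(kept))),
                   by = list(i = kept$i, j = kept$j), FUN = sum)
  net[order(net$i, net$j), ]
}

day      <- 41   # observation day
lifespan <- 21   # assumed lifespan in days

cumulative <- snapshot(events, to = day)                         # ties never decay
windowed   <- snapshot(events, to = day, from = day - lifespan)  # ties expire

cumulative
windowed
```

In this toy example the cumulative network on day 41 still contains the ties from node 1 to node 2 and from node 2 to node 3, although neither pair has communicated within the last 21 days, and the tie from node 3 to node 1 carries a higher weight than in the windowed network.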
The length of the lifespan is crucial in determining which past events are taken into account to generate the network structure at a given point in time. By analysing which past events are relevant to the current state of the network, the length of the lifespan can be defined (Kossinets and Watts, 2006). An ill-defined lifespan will either break a continuous social interaction into independent sets of interactions (if it is too short) or combine two separate interactions into a single one (if it is too long).
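A rough way to see this effect is to count how many separate ties a single pair's message history is split into under different lifespans. The sketch below uses base R only; `msg_days` and `count_ties` are hypothetical, and the rule that a tie is severed once the gap between consecutive messages exceeds the lifespan is an assumption made for illustration.

```r
# Days on which one hypothetical pair exchanged messages
msg_days <- c(1, 3, 5, 9, 29, 31, 61)

# Number of separate ties produced for this pair by a given lifespan:
# a new tie starts whenever the gap to the previous message exceeds the lifespan
count_ties <- function(days, lifespan) {
  gaps <- diff(sort(days))
  1 + sum(gaps > lifespan)
}

lifespans <- c(3, 14, 21, 42)  # in days
ties_per_lifespan <- sapply(lifespans, function(l) count_ties(msg_days, l))
ties_per_lifespan
# → 4 3 2 1
```

With a 3-day lifespan the early burst of messages is already broken apart, whereas a 42-day lifespan merges everything into one tie, including exchanges separated by a month of silence.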
To illustrate the difference between imposing a lifespan and not imposing one, the following figure shows results from the online social network where networks are constructed both cumulatively and with sliding windows of 2, 3, and 6 weeks. Both panels in the figure highlight the vulnerability of network measures to the use of a sliding window. Panel a suggests that there is only a small core of users that actively use the virtual community at the end of the observation period. An analysis of the cumulative network at that point would be heavily influenced by the majority of users that only used the network in the first 6 weeks, and would not reflect the current activities that are occurring in the community. This could bias network measures and, ultimately, the analysis. Panel b shows the evolution of one possible measure, the clustering coefficient. In particular, the clustering coefficient measured on the active core is mostly below the value found in the cumulative network.
The above figure also highlights the sensitivity of network measures to sampling time. With shorter lifespans, the network measures become more unstable and dependent on the time at which the observation is taken. Kossinets and Watts (2006) argued that network measures would remain stable over time. As a result, the average of the measures in a given observation period could be generalised to a longer period of time. The figure, however, suggests that, when social relationships have a lifespan, network measures are not stable. Therefore, it is difficult to infer from network snapshots stable network measures that can reflect the network structure over a longer period of time.
By allowing for the severing of ties and sampling the network structure at various times over a longer period (e.g., each day in the observation period as we did for the online social network), the validity and robustness of a network analysis could be improved.
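As a minimal sketch of daily sampling, the base-R code below counts the number of distinct directed ties on each day, both cumulatively and with a 21-day lifespan. The `events` data and the `ties_on` helper are hypothetical, and the tie count merely stands in for whichever network measure is of interest.

```r
# Toy longitudinal edge list (hypothetical): day t, sender i, receiver j
events <- data.frame(
  t = c(1, 2, 3, 5, 8, 12, 20, 26, 40, 41),
  i = c(1, 1, 2, 4, 2, 1, 3, 3, 3, 1),
  j = c(2, 3, 3, 1, 4, 2, 1, 4, 1, 3)
)

# Number of distinct directed ties with at least one message
# in the interval (day - lifespan, day]; lifespan = Inf means ties never decay
ties_on <- function(day, lifespan = Inf) {
  kept <- events[events$t > day - lifespan & events$t <= day, c("i", "j")]
  nrow(unique(kept))
}

# Sample the measure on every day of the observation period
days <- 1:41
daily <- data.frame(
  day        = days,
  cumulative = sapply(days, ties_on),
  window21   = sapply(days, ties_on, lifespan = 21)
)

tail(daily)
range(daily$window21)
```

The cumulative count can only grow, while the windowed count rises and falls with activity, so a single snapshot of the windowed network depends on the day on which it is taken.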
¹ A number of other limitations, notably validity issues, could also be introduced into the data when using archival data sources.
Want to test it with your data?
Second, you need to download, install, and load tnet.
Third, by combining the add_window_to_longitudinal_data-function and the longitudinal_data_to_edgelist-function, an instantaneous structure of the network at any point in time can be created.
# Load tnet
library(tnet)

# Load the Facebook-like online social network
data(OnlineSocialNetwork.n1899)
lnet <- OnlineSocialNetwork.n1899.lnet

# Add the severing of ties after 21 days
lnet <- add_window_l(lnet, window=21)

# Create the static network on June 30, 2004, at 7am
net <- as.static.tnet(lnet[lnet[,"t"]<as.POSIXlt("2004-06-30 07:00:00"),])

# Calculate network measures, such as the global clustering coefficient
clustering_w(net)
Bernard, H. R., Killworth, P. D., Kronenfeld, D., Sailer, L. D., 1984. The problem of informant accuracy: the validity of retrospective data. Annual Review of Anthropology 13, 495-517.
Bernard, H. R., Killworth, P. D., Evans, M. J., McCarty, C., Shelley, G. A., 1988. Studying social relations cross-culturally. Ethnology 27 (2), 155-179.
Fararo, T. J., Sunshine, M., 1964. A Study of a Biased Friendship Network. Syracuse University Press, Syracuse, NY.
Kossinets, G., Watts, D. J., 2006. Empirical analysis of an evolving social network. Science 311, 88-90.
Uzzi, B., Spiro, J., 2005. Collaboration and creativity: The small world problem. American Journal of Sociology 111, 447-504.