Average shortest distance in weighted networks
January 9, 2009
Shortest distances has long been a key element of network research. First, it has been used a global measure: what is the average shortest distance among nodes in a network? The global perspective became famous following Milgram’s six-degrees of separation experiment in 1967 that found that people in the US were on average 6-steps from any other person (see the introduction of my thesis
for a more complete story and criticism).
Second, the shortest distances have been used as an underlying metric in a number of measures, such as centrality ones (Freeman, 1978). One of these measures is closeness, which is inversely proportional to the distance from a node to all other nodes in the network. A number of researchers has found an association between the closeness to others and performance – but this post is not about applications, but about the global average shortest distances.
The shortest distance among nodes in a network is quite easy to calculate if you only have present or absent ties: you simply count the ties along the shortest path. If two nodes are directly connected: distance=1; and if they are not directly connected, but are connected through intermediaries, then it is the lowest number of intermediary nodes +1. For example, if the ties in this network did not have weights, the distance between node A and F would be 3. Below is a matrix showing the shortest distances among the nodes in the sample network.
A B C D E F A NA 1 1 2 2 3 B 1 NA 1 1 1 2 C 1 1 NA 2 2 3 D 2 1 2 NA 2 3 E 2 1 2 2 NA 1 F 3 2 3 3 1 NA
The average shortest distance in this network is 1.8 (the mean of the matrix, diagonal excluded).
Difficulty occurs when ties are differentiated, as they are in a weighted network. For example, the tie between node A and B has twice the strength of the tie between node A and node C. This could mean that node A has more frequent contact with node B than with node C. In turn, this could imply that node A might give node B a piece of information (or a disease) twice as likely as node C. If we are looking at the diffusion of information
or diseases
in a network, then the speed that it travels, and routes that it takes, are clearly affected by the weights.
Let us look at the shortest path from node C to node B. The direct connection carries a weight of 1 only; however, the indirect connection through node A is composed of stronger ties. Therefore, a piece of information (or a disease) might be transmitted quicker through node A than directly.
Dijkstra (1959) proposed an algorithm that sum the cost of connections and find the path of least resistance. For example, GPS devices uses this algorithm by assigning a time-cost to each leg of the road, and then find the route that cost least in terms of time. This algorithm can also be used in social network analysis. Newman (2001) applied it to a collaboration network of scientists by inversing the tie weights (dividing 1 by the weight). This implies that a stronger tie gets a lower cost than a weaker tie. Below is the network from above with the inverted weights:

By applying Dijkstra’s algorithm, we find that the direct connection between node C and node B has a cost of 1, whereas the indirect connection via node A has a cost of 0.75 (1/2 + 1/4). Therefore, according to this algorithm, the information will travel faster through the indirect connection. Below is a matrix showing the cost of the least costly routes among the nodes in the sample network.
A B C D E F
A NA 0.25 0.50 0.50 0.75 1.75
B 0.25 NA 0.75 0.25 0.50 1.50
C 0.50 0.75 NA 1.00 1.25 2.25
D 0.50 0.25 1.00 NA 0.75 1.75
E 0.75 0.50 1.25 0.75 NA 1.00
F 1.75 1.50 2.25 1.75 1.00 NA
The average of this matrix is 0.9833.
This application of Dijkstra’s algorithm is fine to use if we wish to rank the nodes in terms of closeness to others (node A is closer to the others than node C because it has a stronger tie to node B); however, what does an average distance of 0.98 mean?
To this end, I suggest to “normalise” the weights by the average weight in the network. The weights would then be:

By calculating Dijkstra’s algorithm using this normalisation, we get the following shortest distances among nodes:
A B C D E F
A NA 0.5825 1.1650 1.1650 1.7475 4.0775
B 0.5825 NA 1.7475 0.5825 1.1650 3.4950
C 1.1650 1.7475 NA 2.3300 2.9125 5.2425
D 1.1650 0.5825 2.3300 NA 1.7475 4.0775
E 1.7475 1.1650 2.9125 1.7475 NA 2.3300
F 4.0775 3.4950 5.2425 4.0775 2.3300 NA
The average of this matrix is 2.29.
A unit of distance then refers to one step with the average weight in the network. The average of the matrix suggests that on average nodes are 2.29 steps with average tie weight away from each other. This measure would be comparable across networks with different ranges of tie weights.
If anyone has any suggestions for improvements, let me know!
Want to test it with your data?
First, you need to download tnet
and install it in R.
You need to create an edgelist of your network (see the tnet documentation
). The edgelist for the sample network here can be entered into R using the following command:
net <- cbind( i=c(1,1,2,2,2,2,3,3,4,5,5,6), j=c(2,3,1,3,4,5,1,2,2,2,6,5), w=c(4,2,4,1,4,2,2,1,4,2,1,1))
Then you can write the following commands:
# Load tnet library(tnet) # Create a binary version of the network bnet <- dichotomise(net) # Calculate the average distance in the binary network mean(distance_w(bnet),na.rm=T) # Calculate the average distance in the weighted network mean(distance_w(net)*mean(net[,3]),na.rm=T)
References
Dijkstra, E. W., 1959. A note on two problems in connexion with graphs. Numerische Mathematik 1, 269-271.
Freeman, L. C., 1978. Centrality in social networks: Conceptual clarification. Social Networks 1, 215-239.
Milgram, S., 1967. The small world problem. Psychology Today 2, 60-67.
Newman, M. E. J., 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E 64, 016132.
Entry Filed under: Network thoughts. Tags: centrality, closeness, complex networks, global, graphs, network, shortest distance, social network analysis, valued networks, weighted networks.
2 Comments Add your own
Leave a Comment
Some HTML allowed:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <pre> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>
Trackback this post | Subscribe to the comments via RSS Feed

1.
esagor | January 9, 2009 at 2:46 pm
Tore, this is great stuff. I’m also working on dissertation research (pretty early in the process) and love the way you’ve organized your site. I look forward to reading future posts and seeing how your work proceeds.
2.
kaikuo | May 5, 2009 at 12:36 pm
your posts are really helpful:) I am working on my dissertation too