## Shortest Paths in Two-mode Networks

Shortest paths among nodes has long been a key element of network research. While the shortest paths often are not of interest in themselves, they are the key component of a number of measures. First, a popular question has been what is the average distance among people? The question became famous following Milgram’s six-degrees of separation experiment in 1967 that found that people in the US were on average 6-steps from any other person (see the introduction of my thesis for a more complete story and criticism). Second, the shortest paths have been used as an underlying metric in a number of measures, such as centrality ones (Freeman, 1978). One of these measures is betweenness, which is roughly equal to the number of shortest paths between others that pass through a node.

The shortest distance among nodes in binary one-mode networks is straight forward to calculate: you simply count the ties along the shortest path. If two nodes are directly connected:*distance=1*; and if they are not directly connected, but are connected through intermediaries, then it is the lowest number of intermediary nodes +1. For more information as well as how to identify the shortest paths in weighted networks and calculate their length, see Shortest Paths in Weighted Networks. To exemplify how shortest distances are calculated, the distance between node A and F in the sample network to the right would be 3 as this is the minimum number of ties between them. The distances between all node pairs is shown in the matrix below. The average shortest distance (or network distances) is calculated by taking the mean of the matrix excluding the diagonal. In this cases, it is 1.8.

[A][B][C][D][E][F] [A] NA 1 1 2 2 3 [B] 1 NA 1 1 1 2 [C] 1 1 NA 2 2 3 [D] 2 1 2 NA 2 3 [E] 2 1 2 2 NA 1 [F] 3 2 3 3 1 NAIn two-mode networks, the shortest paths have been identified in a number of ways (Borgatti and Everett, 1997; Latapy et al., 2008, Newman, 2001). While the methods vary, they generally lead to the substantially the same outcome. The distance between two nodes is equal to the number of ties between the.

First, Borgatti and Everett (1997) suggested to represent two-mode networks in (n+p)x(n+p)-matrices where n and p are the numbers of primary and secondary nodes, respectively, instead of rectangular two-mode matrices (n x p). This matrix type is extremely inefficient as it is at least four times larger than the rectangular one without holding any additional information. Nevertheless, it has a key feature: it is square. As such, one-mode metrics, such as the one-mode shortest path algorithms, can directly be applied to it. This method has been the one generally advocated for users of UCINET. The two-mode sample network is represented in such a matrix below:

[A][B][C][D][E][F][1][2][3][4][5][6] [A] 0 0 0 0 0 0 1 1 0 0 0 0 [B] 0 0 0 0 0 0 1 1 1 1 1 0 [C] 0 0 0 0 0 0 0 1 0 0 0 0 [D] 0 0 0 0 0 0 0 0 1 0 0 0 [E] 0 0 0 0 0 0 0 0 0 1 1 1 [F] 0 0 0 0 0 0 0 0 0 0 0 1 [1] 1 1 0 0 0 0 0 0 0 0 0 0 [2] 1 1 1 0 0 0 0 0 0 0 0 0 [3] 0 1 0 1 0 0 0 0 0 0 0 0 [4] 0 1 0 0 1 0 0 0 0 0 0 0 [5] 0 1 0 0 1 0 0 0 0 0 0 0 [6] 0 0 0 0 1 1 0 0 0 0 0 0

A second method is to project two-mode networks to one-mode networks (n x n-matrix), and then applying the standard one-mode shortest path algorithm. Projection works connecting primary nodes that a connected to common nodes in the two-mode structure. To exemplify this structure, the matrix below is the binary projection of the two-mode network above.

[A][B][C][D][E][F] [A] 0 1 1 0 0 0 [B] 1 0 1 1 1 0 [C] 1 1 0 0 0 0 [D] 0 1 0 0 0 0 [E] 0 1 0 0 0 1 [F] 0 0 0 0 1 0

Using both the above procedures produces a similar outcome. The only difference is that all distances found using the first method are twice the distances found using the second method. For example, the distances found from node B to node F are 4 and 2 using the two methods, respectively.

There are a number of issues with these two methods:

- Two nodes connected through two common nodes (e.g., B and E) are assumed to have an identical connection as two connected through only one common node (e.g., B and D).
- All non-projected nodes are considered equal. This implies that nodes connecting many nodes are equal to the ones connecting only two nodes. Newman (2001) argued that secondary nodes connecting few primary nodes gave those nodes a higher level of interaction.
- There is no established way of calculating shortest paths in weighted two-mode network.

Building on the second method, projection, it is possible to deal with some of these limitations. By projecting two-mode networks (both binary and weighted) onto *weighted one-mode networks*, and then apply the weighted shortest path algorithm, this procedure has the possibility of capturing more information than simply saying that weights do not matter. It is important to use an appropriate projecting method as this will influence the outcome of the measure. To exemplify this method, the distances in the two-mode sample network using the sum-projection method would be:

[A] [B] [C] [D] [E] [F] [A] NA 0.67 1.33 2.00 1.33 2.67 [B] 0.67 NA 1.33 1.33 0.67 2.00 [C] 1.33 1.33 NA 2.67 2.00 3.33 [D] 2.00 1.33 2.67 NA 2.00 3.33 [E] 1.33 0.67 2.00 2.00 NA 1.33 [F] 2.67 2.00 3.33 3.33 1.33 NA

As you can see from this matrix, the distance between B and E (0.67) is half of the distance from B to D (1.67). This is because they have two common nodes instead of only one, and thus, have a stronger connection.

**Want to try it with your data?**

The distance matrix can be calculated using tnet. First, you need to download and install tnet in R. Then, you need to create an edgelist of your network (see the data structures in tnet for two-mode networks). The commands below show how the edgelist for the sample network here can manually be entered, and how to calculate the distance matrix.

# Load network net <- cbind( i=c(1,1,2,2,2,2,2,3,4,5,5,5,6), p=c(1,2,1,2,3,4,5,2,3,4,5,6,6)) # Load tnet library(tnet) # Compute distances distance_tm(net, projection.method="sum")

**References**

Borgatti, S. P., Everett, M. G., 1997. Network analysis of 2-mode data. Social networks 19, 243-269.

Latapy, M., Magnien, C., Del Vecchio, N., 2008. Basic notions for the analysis of large two-mode networks. Social Networks 30(1), 31-48.

Newman, M. E. J., 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E 64, 016132.

Opsahl, T., 2013. Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks 35, doi:10.1016/j.socnet.2011.07.001.

Opsahl, T., Agneessens, F., Skvoretz, J., 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32 (3), 245-251.

1.Abdul waheed mahesar | June 24, 2015 at 7:48 pmDear Dr.tore,

Hi,

Very informative blog for the researcher working in complex networks. I’m working on weighted two-mode network, where links have weights. Kindly if possible please tell me the command in r to compute the weighted distance.

In this blog only the command based on the simple two-mode network is given but not on the actual weighted two-mode network.

I’ll remain thankful to you.

Kind regards,

2.Tore Opsahl | June 28, 2015 at 7:48 pmGlad you find this site informative. All the functions can handle weighted two-mode networks; however, it is key to use as.tnet with weighted two-mode networks. See https://toreopsahl.com/tnet/software/two-mode-data-structure/