Node Centrality in Weighted NetworksThe centrality of nodes, or the identification of which nodes are more “central” than others, has been a key issue in network analysis (Freeman, 1978; Bonacich, 1987; Borgatti, 2005; Borgatti et al., 2006). Freeman (1978) argued that central nodes were those “in the thick of things” or focal points. To exemplify his idea, he used a network consisting of 5 nodes. The middle node has three advantages over the other nodes: it has more ties, it can reach all the others more quickly, and it controls the flow between the others. Based on these three features, Freeman (1978) formalized three different measures of node centrality: degree, closeness, and betweenness. Degree is the number of nodes that a focal node is connected to, and measures the involvement of the node in the network. Its simplicity is an advantage: only the local structure around a node must be known for it to be calculated (e.g., when using data from the General Social Survey; McPherson et al., 2001). However, there are limitations: the measure does not take into consideration the global structure of the network. For example, although a node might be connected to many others, it might not be in a position to reach others quickly to access resources, such as information or knowledge (Borgatti, 2005; Brass, 1984). To capture this feature, closeness centrality was defined as the inverse sum of shortest distances to all other nodes from a focal node. A main limitation of closeness is the lack of applicability to networks with disconnected components (see Closeness Centrality in Networks with Disconnected Components). The last of the three measures, betweenness, assess the degree to which a node lies on the shortest path between two other nodes, and are able to funnel the flow in the network. In so doing, a node can assert control over the flow. Although this measure takes the global network structure into consideration and can be applied to networks with disconnected components, it is not without limitations. For example, a great proportion of nodes in a network generally does not lie on a shortest path between any two other nodes, and therefore receives the same score of 0.
The three measures have been generalised to weighted networks. In a first set of generalisations, Barrat et al. (2004) generalised degree by taking the sum of weights instead of the number ties, while Newman (2001) and Brandes (2001) utilised Dijkstra’s (1959) algorithm of shortest paths for generalising closeness and betweenness to weighted networks, respectiviely (see Shortest Paths in Weighted Networks for details). These generalisations focused solely on tie weights and ignored the original feature of the measures: the number of ties. As such, a second set of generalisation were proposed by Opsahl et al. (2010) that incorporates both the number of ties and the tie weights by using a tuning parameter.
Degree is the simplest of the node centrality measures by using the local structure around nodes only. In a binary network, the degree is the number of ties a node has. In a directed network, a node may have a different number of outgoing and incoming ties, and therefore, degree is split into out-degree and in-degree, respectively.
Degree has generally been extended to the sum of weights when analysing weighted networks (Barrat et al., 2004; Newman, 2004; Opsahl et al., 2008), and labeled node strength. It is equal to the traditional definition of degree if the network is binary (i.e., each tie has a weight of 1). Conversely, in weighted networks, the outcomes of these two measures are different. Since node strength takes into consideration the weights of ties, this has been the preferred measure for analyzing weighted networks (e.g., Barrat et al., 2004; Opsahl et al., 2008). However, node strength is a blunt measure as it only takes into consideration a node’s total level of involvement in the network, and fail to take into account the main feature of the original measures formalised by Freeman (1978): the number of ties. This limitation is highlighted for degree centrality by the three ego networks from Freeman’s third EIES network. The three nodes have roughly sent the same amount of messages; however, to a quite different number of others. If Freeman’s (1978) original measure was applied, the centrality score of the node in panel A is almost five times as high as the node in panel C attains. However, when using Barrat et al.’s generalisation, they get roughly the same score.
In an attempt to combine both degree and strength, Opsahl et al. (2010) used a tuning parameter to set the relative importance of the number of ties compared to tie weights. Specifically, the proposed degree centrality measure was the product of the number of nodes that a focal node is connected to, and the average weight to these nodes adjusted by the tuning parameter. There are two benchmark values for the tuning parameter (0 and 1), and if the parameter is set to either of these values, the existing measures are reproduced (Barrat et al., 2004; Freeman, 1978). If the parameter is set to the benchmark value of 0, the outcomes of the measures are solely based on the number of ties, and are equal to the one found when applying Freeman’s (1978) measure to a binary version of a network where all the ties with a weight greater than 0 are set to present. In so doing, the tie weights are completely ignored. Conversely, if the value of the parameter is 1, the measure is based on tie weights only, and are identical to the already proposed generalisation (Barrat et al., 2004). This implies that the number of ties is disregarded. The table below highlights the differences between the degree-measures.
|Node||Degree measure from|
|Freeman (1978)||Barrat et al. (2004)||Opsahl et al. (2010; alpha=0.5)||Opsahl et al. (2010; alpha=1.5)|
|Phipps Arabie (A)||28||155||66||365|
|John Boyd (B)||11||188||45||777|
|Maureen Hallinan (C)||6||227||37||1396|
To calculate the degree scores of nodes, below is a sample code for calculating the degree scores of the neurons of the c.elegans worm (Watts and Strogatz, 1998) using the R-package tnet.
# Load tnet library(tnet) # Load the neural network of the c.elegans network data(tnet) # Calculate the out-degree of neurons and the generalised measures (alpha=0.5) degree_w(net=celegans.n306.net, measure=c("degree","output","alpha"), alpha=0.5) # Calculate the in-degree of neurons and the generalised measures (alpha=0.5) degree_w(net=celegans.n306.net, measure=c("degree","output","alpha"), alpha=0.5, type="in")
Closeness is defined as the inverse of farness, which in turn, is the sum of distances to all other nodes (Freeman, 1978). The intent behind this measure was to identify the nodes which could reach others quickly. A main limitation of closeness is the lack of applicability to networks with disconnected components: two nodes that belong to different components do not have a finite distance between them. Thus, closeness is generally restricted to nodes within the largest component of a network. The blog post Closeness Centrality in Networks with Disconnected Components suggests a method for overcoming this limitation,
Closeness has been generalised to weighted networks by Newman (2001) who used Dijkstra’s (1959) algorithm (see Shortest Paths in Weighted Networks for details). To quickly reiterate Dijkstra’s (1959) and Newman’s (2001) work here:
- Dijkstra (1959) proposed an algorithm to find the shortest paths in a network where the weights could be considered costs. The least costly path connecting two nodes was the shortest path between them (e.g. a network of roads where each leg of road has a time-cost assign to it).
- Newman (2001) transformed the positive weights in a collaboration network into costs by inverting them (dividing 1 by the weight).
- Based on the inverted weights, Newman (2001) applied Dijkstra’s algorithm and found the least-costly paths among all nodes.
- The total cost of the paths from a node to all others was a measure of farness: the higher the number, the more it cost a node to reach all other nodes. To create a closeness measure, Newman (2001) followed Freeman (1978) and inverted the numbers (1 divided by the farness). Thus, a high farness was transformed into a low closeness, and a low farness was transformed into a high closeness.
Similarily to Barrat et al.’s (2004) generalisation of degree, Newman’s (2001) generalised algorithm solely focuses on the sum of tie weights, and fails to consider the number of ties on paths. Opsahl et al. (2010) generalisation of shortest paths can be applied to determining the length of them.
To calculate the closeness scores of nodes, below is a sample code for calculating the closeness scores of the neurons of the c.elegans worm (Watts and Strogatz, 1998) using the R-package tnet.
# Load tnet library(tnet) # Load the neural network of the c.elegans network data(tnet) # Calculate the binary closeness scores closeness_w(net=celegans.n306.net, alpha=0) # Calculate the first generation weighted closeness scores closeness_w(net=celegans.n306.net, alpha=1) # Calculate the second generation weighted closeness scores (alpha=0.5) closeness_w(net=celegans.n306.net, alpha=0.5)
The extent to which a node is part of transactions among other nodes can be studied using Freeman’s (1978) betweenness measure. In the sample network on the right, if the ties did not have a weight assigned to them, the flashing grey lines represent the 9 shortest paths in the network that pass through intermediate nodes. The highlighted node is an intermediate on 8 of these paths. This will give this node a betweenness score of 8.
Brandes (2001) proposed a new algorithm for calculating betweenness faster. In addition to reducing the time, this algorithm also relaxed the assumption that ties had to be either present or absent (i.e. a binary network), and allowed betweenness to be calculated on weighted networks (note that this generalisation is separate from the flow measure proposed by Freeman et al., 1991, which might be more appropriate in certain settings). This generalisation takes into account, that in weighted networks, the transaction between two nodes might be quicker along paths with more intermediate nodes that are strongly connected than paths with fewer weakly-connected intermediate nodes. This is due to the fact that the strongly connected intermediate nodes have, for example, more frequent contact than the weakly connected ones. For example, the tie between the top-left node and the focal node in the sample network above has four times the strength of the tie between the bottom-left node and the focal node. This could mean that top-left node has more frequent contact with the focal node than the bottom-left node has. In turn, this could imply that top-left node might give the focal node a piece of information (or a disease) four times quicker than the bottom-left node. If we are studying the nodes that are most likely to be funnelling information or diseases in a network, then the speed at which it travels, and routes that it takes, are clearly affected by the weights. The identification of the Shortest Paths in Weighted Networks can also be used when identifying the nodes which funnel transactions among other nodes in weighted networks. If we assume that transactions in a weighted network follow the shortest paths identified by Dijkstra’s algorithm instead of the one with the least number of intermediate nodes, then the number of shortest paths that pass through a node might change.
|Node||Betweenness measure from|
|Freeman (1978)||Brandes (2001)||Opsahl et al. (2010; alpha=0.5)|
Now, node 1 (A) has gotten betweenness score of 4 as well. This is because the indirect path from node B to node C through A is used instead of the direct connection.
Similarily to Newman’s (2001) generalisation of closeness, Brandes’ (2001) generalised algorithm solely focuses on the sum of tie weights, and fails to consider the number of ties on paths. Opsahl et al. (2010) generalisation of shortest paths can also applied to identifying them.
To calculate the betweenness scores of nodes, below is a sample code for producing the three tables above using the R-package tnet.
# Manually enter the example network net <- cbind( i=c(1,1,2,2,2,2,3,3,4,5,5,6), j=c(2,3,1,3,4,5,1,2,2,2,6,5), w=c(4,2,4,1,4,2,2,1,4,2,1,1)) # Calculate the binary betweenness measure betweenness_w(net, alpha=0) # Calculate the first generation weighted betweenness measure betweenness_w(net, alpha=1) # Calculate the first generation weighted betweenness measure betweenness_w(net, alpha=0.5)
Note: The implementation of Brandes’ (2001) algorithm finds multiple paths if they have exactly the same distance. For example, if one path is found over the direct tie with a weight of 1 (distance = 1/1 = 1) and a second path is through an intermediary node with two ties with weights of 2 (distance = 1/2 + 1/2 = 1), the two paths have exactly the same distance. However, if there is a third path through two intermediaries with three ties with weights of 3 (distance = 1/3 + 1/3 + 1/3), it does not exactly equal 1 as computers read these values as 0.3333333 and the sum of these values is 0.9999999. Therefore, this path is considered shorter than the other two paths (distance = 1).
Barrat, A., Barthelemy, M., Pastor-Satorras, R., Vespignani, A., 2004. The architecture of complex weighted networks. Proceedings of the National Academy of Sciences 101 (11), 3747-3752. arXiv:cond-mat/0311416
Brandes, U., 2001. A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology 25, 163-177.
Dijkstra, E. W., 1959. A note on two problems in connexion with graphs. Numerische Mathematik 1, 269-271.
Freeman, L. C., 1978. Centrality in social networks: Conceptual clarification. Social Networks 1, 215-239.
Freeman, L. C., Borgatti, S. P., White, D. R., 1991. Centrality in valued graphs: A measure of betweenness based on network flow. Social Networks 13 (2), 141-154.
Newman, M. E. J., 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E 64, 016132.
Opsahl, T., Agneessens, F., Skvoretz, J. (2010). Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32, 245-251.