Local clustering coefficient for two-mode networks
January 6, 2010
In a similar vein to the global clustering coefficient that I proposed in Clustering in two-mode networks, the local clustering coefficient can be redefined for two-mode networks. Originally, Watts and Strogatz (1998) defined the local clustering coefficient of a focal node as the number of ties present among the node’s neighbors divided by the number of ties possible among them. It can be formalized for a focal node $i$ as follows:

$$C(i) = \frac{\tau_{i,\Delta}}{\tau_i}$$

where $\tau_i$ is the number of 2-paths centered on node $i$, and $\tau_{i,\Delta}$ is the number of these that are closed. While the global clustering coefficient aggregates over all 2-paths in the network, the local one can be seen as an intermediate level of aggregation, since it too can be conceptualized in terms of 2-paths.
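The ratio of closed 2-paths to all 2-paths centered on a node can be sketched in base R as follows. This is a minimal illustration, not the tnet implementation: the function name `local_cc_one_mode` and the four-node toy network are made up for this example, and the network is assumed to be binary and undirected with no self-loops.

```r
# Hypothetical helper (not part of tnet): local clustering coefficient
# for a binary, undirected one-mode network given as an adjacency matrix.
local_cc_one_mode <- function(adj) {
  sapply(seq_len(nrow(adj)), function(i) {
    nb <- which(adj[i, ] > 0)               # neighbors of the focal node
    k <- length(nb)
    two_paths <- k * (k - 1) / 2            # 2-paths centered on i
    closed <- sum(adj[nb, nb, drop = FALSE]) / 2  # ties among the neighbors
    closed / two_paths                      # NaN when there are no 2-paths
  })
}

# Made-up toy network: a triangle (1, 2, 3) with a pendant node 4 tied to 1
adj <- matrix(0, 4, 4)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(1, 4))
adj[ties] <- 1
adj[ties[, 2:1]] <- 1                       # make the matrix symmetric

cc <- local_cc_one_mode(adj)
cc  # 1/3 for node 1, 1 for nodes 2 and 3, NaN for node 4
```

Node 1 has three neighbors and therefore three 2-paths, of which only the one between nodes 2 and 3 is closed. Node 4 has a single neighbor, so the coefficient is undefined.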
When the traditional local clustering coefficient is applied to the projection of a two-mode network, cliques are created among nodes that are connected to a common node in the two-mode network. These cliques contain a high number of triangles. This has an impact on measures that rely on ego network density, such as the clustering coefficients and structural holes measures (Burt, 1992). The average of the local clustering coefficients is over-estimated for these networks, as even projections of random two-mode networks have a clustering coefficient above the level expected by chance in a one-mode network. Therefore, a new measure that does not over-estimate the level of clustering is needed.
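The clique effect can be seen in a small base R sketch. The five-person, two-event network below is made up for illustration, and the projection is assumed to be binary (two people are tied if they share at least one event).

```r
# Made-up two-mode network: rows are people, columns are events.
# Persons 1-4 all attend event 1; person 4 also attends event 2 with person 5.
inc <- matrix(0, nrow = 5, ncol = 2)
inc[cbind(c(1, 2, 3, 4, 4, 5), c(1, 1, 1, 1, 2, 2))] <- 1

# Binary one-mode projection onto the people
proj <- (inc %*% t(inc) > 0) * 1
diag(proj) <- 0

# Number of ties among persons 1-4: all choose(4, 2) possible ties are present
clique_ties <- sum(proj[1:4, 1:4]) / 2
clique_ties  # 6

# Traditional local clustering coefficient of person 1 in the projection
nb <- which(proj[1, ] > 0)
cc_person1 <- (sum(proj[nb, nb]) / 2) / choose(length(nb), 2)
cc_person1   # 1
```

A single shared event is enough to give person 1 a coefficient of 1, even though no second event links any of her contacts.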
Given the extension of 2-paths in one-mode networks to 4-paths in two-mode networks for the global clustering coefficient, the numerator and denominator of the local clustering coefficient can also be redefined in terms of 4-paths. While the original local coefficient was based on 2-paths centered on the focal node, this can be extended to 4-paths centered on a focal node in two-mode networks. This implies that the first and last nodes of the path are of the same mode as the focal node. Formally, I propose:

$$C^*(i) = \frac{\tau^*_{i,\Delta}}{\tau^*_i}$$

where $\tau^*_i$ is the number of 4-paths with ego $i$ as the middle node, and $\tau^*_{i,\Delta}$ is the subset of these in which the first and the last nodes of the path share a common node that is not part of the 4-path.
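The definition can be checked with a brute-force sketch in base R. This is not the tnet implementation: `local_cc_two_mode` is a hypothetical helper, it handles only the binary case, and it assumes that the first and last nodes of a 4-path must be distinct. The edge list is the binary part of the sample network used in the tnet example at the end of this post.

```r
# Hypothetical helper (not part of tnet): binary two-mode local clustering
# coefficient by enumerating 4-paths centered on each primary node.
local_cc_two_mode <- function(edges) {
  n_primary <- max(edges[, 1])
  n_secondary <- max(edges[, 2])
  A <- matrix(FALSE, n_primary, n_secondary)   # incidence matrix
  A[cbind(edges[, 1], edges[, 2])] <- TRUE
  sapply(seq_len(n_primary), function(i) {
    ps <- which(A[i, ])                        # secondary nodes of ego
    if (length(ps) < 2) return(NaN)            # no 4-path can pass through ego
    total <- 0
    closed <- 0
    pairs <- combn(ps, 2)
    for (idx in seq_len(ncol(pairs))) {
      a <- pairs[1, idx]
      b <- pairs[2, idx]
      first <- setdiff(which(A[, a]), i)       # candidates for the first node
      last <- setdiff(which(A[, b]), i)        # candidates for the last node
      for (j in first) for (k in last) {
        if (j == k) next                       # assumption: path nodes are distinct
        total <- total + 1
        shared <- A[j, ] & A[k, ]
        shared[c(a, b)] <- FALSE               # closing node must be off the path
        if (any(shared)) closed <- closed + 1
      }
    }
    closed / total
  })
}

# Binary part of the sample network (i = primary node, p = secondary node)
net <- cbind(
  i = c(1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6),
  p = c(1, 2, 1, 3, 4, 2, 3, 4, 3, 5, 5))

res <- local_cc_two_mode(net)
res  # 1.0 0.2 0.5 NaN 0.0 NaN
```

These values agree with the binary output of clustering_tm_local for the sample network. For node 2, for example, five 4-paths are centered on it and only the one between nodes 1 and 3 is closed (through secondary node 2), giving 1/5.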
This coefficient has properties similar to those of the traditional local clustering coefficient. First, for each node, the coefficient varies between 0 and 1, as the numerator and denominator are non-negative counts and the paths counted in the numerator are a subset of those counted in the denominator. Second, all 4-paths are closed in a fully connected network, and therefore, the coefficient is equal to 1. Third, if ties are randomly placed in the network, the expected value of the local clustering coefficient is the same as the one for the global coefficient, which depends on the density of the network.
Empirical test
To empirically test the proposed local clustering coefficient for two-mode networks, I again use the Davis Southern Women dataset, as it has a limited number of nodes. The table below shows the local clustering coefficients obtained from the two-mode network and the projected one-mode network, as well as the two-mode and one-mode degree scores (i.e., the number of events attended and the number of other women attending the same events, respectively).
| Node | Events attended | Other women attending same events | One-mode LCC | Two-mode LCC |
|---|---|---|---|---|
| EVELYN | 8 | 17 | 0.8971 | 0.7667 |
| LAURA | 7 | 15 | 0.9619 | 0.8422 |
| THERESA | 8 | 17 | 0.8971 | 0.7523 |
| BRENDA | 7 | 15 | 0.9619 | 0.8388 |
| CHARLOTTE | 4 | 11 | 1 | 1 |
| FRANCES | 4 | 15 | 0.9619 | 0.869 |
| ELEANOR | 4 | 15 | 0.9619 | 0.7959 |
| PEARL | 3 | 16 | 0.9333 | 0.6463 |
| RUTH | 4 | 17 | 0.8971 | 0.6703 |
| VERNE | 4 | 17 | 0.8971 | 0.6741 |
| MYRNA | 4 | 16 | 0.9333 | 0.7139 |
| KATHERINE | 6 | 16 | 0.9333 | 0.7696 |
| SYLVIA | 7 | 17 | 0.8971 | 0.7462 |
| NORA | 8 | 17 | 0.8971 | 0.838 |
| HELEN | 5 | 17 | 0.8971 | 0.8159 |
| DOROTHY | 2 | 16 | 0.9333 | 0.5407 |
| OLIVIA | 2 | 12 | 1 | 0.5806 |
| FLORA | 2 | 12 | 1 | 0.5806 |
The two-mode and one-mode degree scores and the traditional and proposed local clustering coefficients (LCC) of the women in Davis’ (1941) Southern Women dataset. The randomly expected one-mode clustering coefficient is 0.9085, while the one for two-mode networks is 0.7978.
There are a number of observations. First, for all the nodes that did not have the maximum value, the two-mode coefficient is smaller than the coefficient attained on the projected network. This is not guaranteed by construction: multiple 4-paths might exist among three primary nodes, and therefore, the two-mode coefficient might be higher than the one attained on the projected one-mode network. It gives, however, an indication of the bias that is created when three or more primary nodes are connected to a common node. Second, the difference between the two coefficients is greater for the women attending fewer events (the pair-wise correlation between the number of events and the difference is -0.69, with a p-value of 0.001). This might suggest that the bias is greatest for nodes that attend few events. This is not unexpected, as a woman attending a single event with at least two others would automatically attain a coefficient of 1 in the binary projected network.
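The reported correlation can be recomputed from the table with base R. The sketch below copies the table values by hand and assumes that the difference is the one-mode coefficient minus the two-mode coefficient and that a Pearson correlation was used. The variable names are made up for this example.

```r
# Values copied from the table above, in row order (EVELYN to FLORA)
events <- c(8, 7, 8, 7, 4, 4, 4, 3, 4, 4, 4, 6, 7, 8, 5, 2, 2, 2)
one_mode <- c(0.8971, 0.9619, 0.8971, 0.9619, 1, 0.9619, 0.9619, 0.9333,
              0.8971, 0.8971, 0.9333, 0.9333, 0.8971, 0.8971, 0.8971,
              0.9333, 1, 1)
two_mode <- c(0.7667, 0.8422, 0.7523, 0.8388, 1, 0.869, 0.7959, 0.6463,
              0.6703, 0.6741, 0.7139, 0.7696, 0.7462, 0.838, 0.8159,
              0.5407, 0.5806, 0.5806)

# Assumption: difference = one-mode LCC minus two-mode LCC
difference <- one_mode - two_mode

# Pearson correlation between events attended and the difference
ct <- cor.test(events, difference)
r <- unname(ct$estimate)
p <- ct$p.value
round(r, 2)  # about -0.69, in line with the value reported above
```

The negative sign means that the gap between the two coefficients shrinks as the number of events attended grows.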
To further highlight some of the features of the redefined local clustering coefficient, Flora’s local network up to three steps is shown below. In a one-mode projection, all the possible ties among Flora’s contacts are present. This is because eleven of her twelve contacts attended event 9. The twelfth contact, Helen, did not attend event 9 but is connected to all the others through other events. The redefined clustering coefficient is less than 1 for Flora. This is because events 9 and 11 cannot be used to form closing ties among the women attending them (i.e., to close 4-paths). More specifically, 4-paths exist from the nodes attached to event 11 to the nodes connected to event 9 (excluding themselves). In total, there are 31 4-paths, out of which 18 are closed by events 6, 7, 8, and 10.

Flora’s local network up to three steps. Only non-redundant ties are shown between the second and third steps.
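As a quick arithmetic check, the path counts for Flora can be turned into her two-mode coefficient and compared with the table. The variable names are made up for this example.

```r
# 4-path counts for Flora, as stated in the text
total_paths <- 31
closed_paths <- 18

flora_lcc <- closed_paths / total_paths
round(flora_lcc, 4)  # 0.5806, the two-mode LCC listed for FLORA in the table
```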
References
Burt, R.S., 1992. Structural Holes. Harvard University Press, Cambridge, MA.
Davis, A., Gardner, B. B., Gardner, M. R., 1941. Deep South. University of Chicago Press, Chicago, IL.
Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of small-world networks. Nature 393, 440-442.
Want to try it with your data?
The redefined local clustering coefficient is implemented in tnet as clustering_tm_local. The code for analysing a sample network and Davis’ (1941) Southern Women dataset is shown below.
# Load tnet
library(tnet)
# Load networks
net <- cbind(
i=c(1,1,2,2,2,3,3,4,5,5,6),
p=c(1,2,1,3,4,2,3,4,3,5,5),
w=c(3,5,6,1,2,6,2,1,3,1,2))
# Obtain the binary local clustering coefficients of the nodes in the sample network
clustering_tm_local(net[,1:2])
# Obtain the weighted local clustering coefficients of the nodes in the sample network
clustering_tm_local(net)
# Obtain the binary local clustering coefficients of the women in Davis' (1941) Southern Women dataset
data("Davis.Southern.women")
clustering_tm_local(Davis.Southern.women.2mode)
The output from the binary and weighted analyses of the sample network is:
node lc
1 1.0
2 0.2
3 0.5
4 NaN
5 0.0
6 NaN

node lc lc.am lc.gm lc.ma lc.mi
1 1.0 1.0000000 1.0000000 1.0000000 1.0000000
2 0.2 0.2400000 0.2313222 0.2608696 0.2000000
3 0.5 0.4666667 0.4317651 0.5000000 0.3333333
4 NaN NaN NaN NaN NaN
5 0.0 0.0000000 0.0000000 0.0000000 0.0000000
6 NaN NaN NaN NaN NaN