One-mode Data Structure
Since most networks are sparse (i.e., the number of ties is much lower than the squared number of nodes), I opted for an edgelist format instead of a matrix one. A binary edgelist consists of two columns that represent the pairs of nodes that are tied together in a network (e.g., the edgelist1-format in UCINET’s dl files; Borgatti et al., 2002). When a directed network is represented, the first column holds the nodes that create the ties, whereas the second column holds the target nodes. This type of list has been extended to cover weighted networks by adding a third column representing the weight of the ties. While the matrix format records the weight of all possible ties (a non-established tie gets a weight of 0), the edgelist format records the sender, receiver, and weight of established ties only. The main advantage of this format is that it scales to networks with many nodes, as it is the number of ties, not the number of nodes, that determines the size of the data object. Although many programmes can read edgelists, most network analysis programmes rely on an internal matrix representation (e.g., UCINET and the network-package in R; Butts, 2006). Conversely, Pajek, which was designed to analyse large-scale sparse networks, uses an internal edgelist representation (Batagelj and Mrvar, 2007). By following Pajek, tnet can be applied efficiently to large-scale sparse networks.

A directed and weighted one-mode network.
[,1] [,2] [,3]
[1,] 1 2 2
[2,] 1 3 2
[3,] 2 1 4
[4,] 2 3 4
[5,] 2 4 1
[6,] 2 5 2
[7,] 3 1 2
[8,] 3 2 4
[9,] 5 2 2
[10,] 5 6 1
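To make the contrast between the two formats concrete, here is a minimal base R sketch (it does not use tnet) that stores the sample network as an edgelist and then expands it into the equivalent weight matrix. The object names net, n and mat, and the column names i, j and w, are chosen for this illustration only.

```r
# The sample network as a three-column edgelist: sender, receiver, weight
net <- cbind(
  i = c(1, 1, 2, 2, 2, 2, 3, 3, 5, 5),
  j = c(2, 3, 1, 3, 4, 5, 1, 2, 2, 6),
  w = c(2, 2, 4, 4, 1, 2, 2, 4, 2, 1)
)

# Equivalent matrix format: one cell per possible tie, 0 for absent ties
n <- max(net[, c("i", "j")])
mat <- matrix(0, nrow = n, ncol = n)
mat[net[, c("i", "j")]] <- net[, "w"]

# The edgelist grows with the number of ties, the matrix with the squared number of nodes
nrow(net)
length(mat)
```

The edgelist has one row per established tie, whereas the matrix has one cell for every possible pair of nodes, most of which hold 0 in a sparse network.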
Undirected Networks
All networks are assumed to be directed in tnet. To represent an undirected network, each tie must be listed twice, once in each direction. Currently, the local clustering coefficient is the only measure defined solely for undirected networks. If the sample network above were undirected (symmetrised using the average tie weight), it should be represented as follows:
[,1] [,2] [,3]
[1,] 1 2 3.0
[2,] 1 3 2.0
[3,] 2 1 3.0
[4,] 2 3 4.0
[5,] 2 4 0.5
[6,] 2 5 2.0
[7,] 3 1 2.0
[8,] 3 2 4.0
[9,] 4 2 0.5
[10,] 5 2 2.0
[11,] 5 6 0.5
[12,] 6 5 0.5
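The table above can be reproduced with a short base R sketch. This is only an illustration of symmetrising by the average tie weight, not necessarily how tnet does it internally; the object names directed, both and undirected are assumptions made for this example.

```r
# The directed sample network: sender, receiver, weight
directed <- data.frame(
  i = c(1, 1, 2, 2, 2, 2, 3, 3, 5, 5),
  j = c(2, 3, 1, 3, 4, 5, 1, 2, 2, 6),
  w = c(2, 2, 4, 4, 1, 2, 2, 4, 2, 1)
)

# Stack each tie with its reverse, so every pair appears in both directions
both <- rbind(directed,
              data.frame(i = directed$j, j = directed$i, w = directed$w))

# Sum the weights per ordered pair and halve: a missing direction counts as 0
undirected <- aggregate(w ~ i + j, data = both, FUN = sum)
undirected$w <- undirected$w / 2

# Sort by sender, then receiver, to match the layout of the table
undirected <- undirected[order(undirected$i, undirected$j), ]
rownames(undirected) <- NULL
undirected
```

A tie that exists in one direction only, such as the one from node 2 to node 4 with a weight of 1, ends up with a weight of 0.5 in both directions because the absent direction is treated as a weight of 0.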
Loading Your Network
The most common way of loading a network is to read it from a text file. The read.table-function is the standard method for reading text files. It takes a filename or link, and a character that separates the values into columns (e.g., a tab). It is important to assign the result to an object rather than just read the file. To illustrate this procedure, the directed and undirected networks can be loaded into the objects directed.net and undirected.net using these commands (the files are on the web, hence the link instead of a filename).
# Read the directed network
directed.net <- read.table("http://opsahl.co.uk/tnet/datasets/one-mode-directed-network.txt", sep="\t")
# Read the undirected network
undirected.net <- read.table("http://opsahl.co.uk/tnet/datasets/one-mode-undirected-network.txt", sep="\t")
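If you want to try the procedure without an internet connection, read.table can also parse a character string through its text argument. The sketch below uses made-up tab-separated content in place of a file; the object names raw and example.net are hypothetical. Checking the column classes afterwards is worthwhile because the node ids should come out as integer or numeric.

```r
# Hypothetical tab-separated content, standing in for a file on disk
raw <- "1\t2\t2\n1\t3\t2\n2\t1\t4"

# text= makes read.table parse the string instead of a file or link
example.net <- read.table(text = raw, sep = "\t")

# Inspect the result: dimensions and the class of each column
dim(example.net)
sapply(example.net, class)
```

If a column is reported as character or factor, the file probably contains non-numeric node ids or a header row.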
Ensure that the network conforms to the tnet standard
To ensure that the network conforms to the tnet standard, the as.tnet-function can be used. The other tnet functions run it automatically if it has not been run on the network manually. It takes two parameters: the network and a character string specifying the type of network. If the type parameter is not set, an object will be assumed to be a one-mode network if it is an edgelist with three columns or a square matrix with more than 4 nodes. Below is the code for checking the directed network above.
# Load tnet
library(tnet)
# Read the directed network
directed.net <- read.table("http://opsahl.co.uk/tnet/datasets/one-mode-directed-network.txt", sep="\t")
# Check that it conforms to the tnet standard for weighted one-mode networks
directed.net <- as.tnet(directed.net, type="weighted one-mode tnet")
There are a number of functions that help users convert data in other formats into the weighted edgelist format. For example, if a dataset is undirected but contains only one entry for each tie, the symmetrise_w-function adds a second entry for each tie with the identification numbers of the creator and target nodes reversed. Moreover, if a dataset is a two-column edgelist (holding the identification numbers of the creator and target nodes) in which the number of times a tie is listed gives its weight (e.g., a tie with a weight of 3 is included three times), the shrink_to_weighted_network-function converts it into the correct format. To allow for a comparison between weighted and binary network measures, the dichotomise_w-function creates a binary network from a weighted one. It does so by removing the ties in a weighted edgelist that fall below a certain cut-off and setting the weight of the remaining ones to 1.
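The logic behind the last two conversions can be sketched in base R. This is an illustration of the idea only and does not call shrink_to_weighted_network or dichotomise_w. The data, the object names ties, weighted, binary and cutoff, and the choice to keep ties whose weight equals the cut-off are all assumptions made for this example.

```r
# Hypothetical two-column edgelist in which repeated rows encode the tie weight
ties <- data.frame(
  i = c(1, 1, 1, 2, 2, 3),
  j = c(2, 2, 2, 1, 3, 1)
)

# Count how often each ordered pair occurs; the count becomes the weight
ties$w <- 1
weighted <- aggregate(w ~ i + j, data = ties, FUN = sum)
weighted <- weighted[order(weighted$i, weighted$j), ]
rownames(weighted) <- NULL

# Dichotomise: drop ties below the cut-off, then set the remaining weights to 1
# (keeping ties exactly at the cut-off is an assumption of this sketch)
cutoff <- 2
binary <- weighted[weighted$w >= cutoff, ]
binary$w <- 1
```

Here the tie from node 1 to node 2 is listed three times and so receives a weight of 3. It is the only tie that survives a cut-off of 2.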
References
Batagelj, V., Mrvar, A., 2007. Pajek: Program for Large Network Analysis: version 1.20. http://pajek.imfm.si/.
Borgatti, S.P., Everett, M.G., Freeman, L.C., 2002. UCINET for Windows: Software for Social Network Analysis. Analytic Technologies, Harvard, USA.
Butts, C. T., 2006. sna-package: Package for Social Network Analysis. R package version 1.4.
Opsahl, T., 2009. Structure and Evolution of Weighted Networks. University of London (Queen Mary College), London, UK, pp. 104-122. Available at http://toreopsahl.com/publications/thesis/.
1.
David Fisher | November 15, 2012 at 3:33 pm
Hi, love the blog. Unfortunately, when trying to shrink my 2-column edgelist (with duplicate interactions) into a weighted edgelist using the shrink_to_weighted_network-function, I get the following error from R:
Error in Ops.factor(net[, "i"], net[, "j"]) :
level sets of factors are different
Any ideas as to why this might be? Apologies if this query is in the wrong place.
David
2.
Tore Opsahl | November 15, 2012 at 3:53 pm
Hi David,
Thanks for your comment. Are you using integer values as node ids? If you write class(net[,"i"]) or class(net[,"j"]), the output should be integer or numeric. If you have more issues, send me an email with your code and data, and then I will have a look at it.
Best,
Tore
3.
David Fisher | November 15, 2012 at 4:00 pm
The node IDs are all unique but are a mixture of numbers and letters (e.g. “LA”, “A”, “S1”, “U2”), could this be causing the problem?
Entering class(binters08[,"i"]) gives me this error: Error in `[.data.frame`(binters08, , "i") : undefined columns selected
Cheers
David
4.
Carol Xu | February 28, 2013 at 5:12 am
Hi,
Don’t know if this issue ever got resolved, but I’m having the same problem, and some of my node IDs are also a mixture of numbers and letters separated by underscores, e.g. “GOOD_3” and “MISS_2”. Is there a way to work around this or would I have to change the names?
Thanks,
Carol
5.
Tore Opsahl | March 1, 2013 at 4:44 pm
Hi Carol,
The node IDs must be in the integer data class (in other words, just numbers). You could use the as.factor-function for this. See lines 18-22 of the code on this page: http://toreopsahl.com/2011/08/12/why-anchorage-is-not-that-important-binary-ties-and-sample-selection/
Best,
Tore
6.
xulace | March 1, 2013 at 7:30 pm
Thank you!