As we move toward a society where no person or firm acts in isolation, it is vital to understand the systems in which people and firms interact. These systems can be represented as networks in which the entities are called nodes and the interactions among them are represented as ties. More generally, a node can be a neuron, an individual, a group, an organisation, or even a country, whereas ties can take the form of friendship, communication, collaboration, alliance, or trade, to name only a few. Most network studies focus solely on a single type of node and on whether or not two nodes are connected. While these studies have uncovered numerous organising principles, the crudeness of this measurement might lead to inaccurate conclusions. Researchers often possess richer data, but are unable to analyse them due to a lack of methods or tools. The aim of this site and software is to highlight areas where tie strength, multiple types of nodes, and the evolution of networks can be considered.

I have structured the site in five main sections, with the first three devoted to weighted, two-mode, and longitudinal network methods, respectively, and the remaining two to the R-package tnet and network datasets. Admittedly, this site is mostly based on my own work, but do drop me an email or leave a comment if you have a question or remark, or if you feel I’m missing a point somewhere.

Weighted Networks

Which is the shortest path from the lower-left node to the central node? Directly or via the top-left node?

A major limitation of many methods used for studying networks stems from the fact that the strength of ties is not taken into account. Granovetter (1973) argued that the strength of a social tie is a function of its duration, emotional intensity, intimacy, and exchange of services. For non-social networks, the strength often reflects the function performed by the ties, e.g. carbon flow (mg/m²/day) between species in food webs (Luczkowich et al., 2003) or the number of synapses and gap junctions in a neural network (Watts and Strogatz, 1998). In infrastructure and information networks, variations in the strength of a tie depend on the flow of information, energy, people, and goods along that tie (Barrat et al., 2004). The strength of a tie is generally operationalised as a weight that is attached to the tie, thereby creating a weighted network. There are clear advantages to incorporating tie weights in network analysis. For example, the transmission probability of a disease between two people is related to their level of interaction. This section highlights both generalisations of binary measures and novel measures for weighted networks as well as various random network types. Read more…
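The shortest-path question posed above can be made concrete with a minimal sketch in base R. The three-node network and its weights are hypothetical (they are not taken from the figure), and treating the inverse of a tie's weight as the cost of using it is one possible convention, assumed here purely for illustration.

```r
# Hypothetical undirected weighted network: A is the lower-left node,
# B the top-left node, and C the central node (all weights are made up).
nodes <- c("A", "B", "C")
w <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
w["A", "C"] <- w["C", "A"] <- 1  # weak direct tie
w["A", "B"] <- w["B", "A"] <- 4  # strong tie
w["B", "C"] <- w["C", "B"] <- 2  # moderately strong tie

# Assumption: the cost of using a tie is the inverse of its weight;
# absent ties get an infinite cost.
cost <- ifelse(w > 0, 1 / w, Inf)
diag(cost) <- 0

# Floyd-Warshall: least-cost distance between every pair of nodes.
d <- cost
for (k in nodes) for (i in nodes) for (j in nodes)
  d[i, j] <- min(d[i, j], d[i, k] + d[k, j])

d["A", "C"]     # cost of the cheapest route from A to C
cost["A", "C"]  # cost of the direct tie
```

In a binary analysis the direct tie is always the shortest path (one step instead of two). With these made-up weights the indirect route via B is cheaper, which is the kind of difference that taking tie weights into account can reveal.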

Two-mode Networks

Network with two types of nodes

Another limitation of network analysis has been the focus on a single type of node (e.g., people) and the direct ties between them (i.e., one-mode networks). Nodes are rarely directly connected, and are instead connected through various media or projects. This type of network is often referred to as a two-mode, affiliation, or bipartite network. One type of two-mode network that has received great attention is the scientific collaboration network, consisting of authors and publications, where the authors are connected to papers and not directly to each other (Newman, 2001). However, given the lack of methods for two-mode networks, these networks are often analysed as one-mode networks by, for example, simply connecting authors with their co-authors. While this procedure allows the networks to be analysed, the assumptions underpinning two-mode networks differ from those of prototypical one-mode networks. In one-mode networks, a person is connected to one other person when a tie is formed. Conversely, in a two-mode network, ties are formed to all the other authors if a new author joins an existing publication. Moreover, these networks tend to have many fully connected cliques. In fact, the level of clustering tends to be much higher in two-mode networks than in other networks. This section of the site outlines various methods for analysing two-mode networks. Read more…
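The effect of connecting authors with their co-authors can be sketched in a few lines of base R. The edge list is hypothetical, and the matrix product used for the projection is a generic approach chosen for illustration, not necessarily the procedure implemented in tnet.

```r
# Hypothetical two-mode edge list: authors 1-3 wrote paper 1,
# authors 3-4 wrote paper 2.
tm <- data.frame(author = c(1, 2, 3, 3, 4),
                 paper  = c(1, 1, 1, 2, 2))

# Incidence matrix: rows are authors, columns are papers.
B <- matrix(0, nrow = max(tm$author), ncol = max(tm$paper))
B[cbind(tm$author, tm$paper)] <- 1

# One-mode projection: number of papers shared by each pair of authors.
proj <- B %*% t(B)
diag(proj) <- 0
proj

# A new author (5) joins paper 1 and gains a tie to every existing co-author.
B2 <- rbind(B, c(1, 0))
proj2 <- B2 %*% t(B2)
diag(proj2) <- 0
sum(proj2[5, ] > 0)  # number of co-authors of author 5
```

Authors 1, 2, and 3 form a fully connected clique in the projection simply because they share one paper, which illustrates why clustering tends to be high in projected two-mode networks.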

Longitudinal Networks

May 11, 2012: This is the last part of the website to be completed. The code is mostly there, but I am still working on the text etc.

Networks evolve.

Network analysis is mainly static. Most studies record the nodes and ties at a specific time (e.g., on the day of interviews). However, networks are the result of many actions (e.g., nodes joining and leaving as well as forming, reinforcing, weakening, and severing ties). Analysing aggregations of these actions is a limitation as the actions are not independent of each other, and the dependency structure among them is unknown. This limitation is especially relevant for studies trying to investigate why certain ties are formed (see the literature on ERGMs and SIENA). While collecting static data is often the only feasible method, the exact evolution of the network is sometimes available. For example, the time of communication is recorded on online social network sites, such as Facebook, or in phone call logs. In fact, by having the time of each tie, it is possible to understand the dependency structure in the network. Nevertheless, there is also a lack of methods for networks where the time of each tie is known. As such, these networks are often aggregated to a static network, or multiple static networks (e.g., daily snapshots; Kossinets and Watts, 2006; Panzarasa et al., 2009). This section presents some of the methods and tools to study these networks as well as random network models. Read more…
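The aggregation step mentioned above can be sketched in base R. The event list, its column names, and the dates are all made up for illustration; the point is only to show how time-stamped ties collapse into a static weighted network or into daily snapshots.

```r
# Hypothetical time-stamped ties: sender i, receiver j, time of contact.
events <- data.frame(
  time = as.POSIXct(c("2012-05-10 09:00:00", "2012-05-10 17:30:00",
                      "2012-05-11 08:15:00", "2012-05-11 12:00:00"),
                    tz = "UTC"),
  i = c(1, 1, 2, 1),
  j = c(2, 2, 3, 2)
)
events$w <- 1  # each event counts once

# Static weighted network: total number of events per dyad.
static <- aggregate(w ~ i + j, data = events, FUN = sum)

# Daily snapshots: number of events per dyad and day.
events$day <- as.Date(events$time, tz = "UTC")
daily <- aggregate(w ~ day + i + j, data = events, FUN = sum)

static
daily
```

Both aggregated forms discard the order of events within the aggregation window, which is exactly the information needed to study the dependency structure among ties.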

Software

The usefulness of methodological advances is lessened if they are not implemented in an easily accessible programme. To walk the walk, each section contains the specific details for applying the measures in the free open-source statistical programme R. I have chosen R to ensure that everyone can easily access it (even those on Mac and Linux). The example code relies mainly on the R-package tnet, which can be downloaded directly from within R using the Comprehensive R Archive Network (CRAN) servers. The software section outlines how to install the package and prepare your data as well as how to import from and export to other network analysis programmes. Read more…

Datasets

To further help researchers in the field, I have collected a number of weighted, two-mode, and longitudinal networks. These are generally publicly available networks or networks that I have collected. The networks are accompanied by an in-depth description of how they were collected and defined. Read more…

44 Comments

  • 1. John McCreery  |  November 3, 2011 at 8:43 am

    Tore, I have just discovered your site. A truly amazing resource. I am wondering if I might consult with you occasionally re my current project. A Google search for “John McCreery SlideShare” will bring you to biographical information about myself and a series of presentations describing the project. Briefly, as things now stand, I have data on six moderately large networks based on the credits data in the 1981, 1986, 1991, 1996, 2001, and 2006 Tokyo Copywriters Club Advertising Copy Annual: 30,000 role relationships linking 8000 creators to 4000 prize-winning ads. I also have a small library of published material about the Japanese advertising industry, its star creators, and what they say about what they do and, having spent nearly three decades working in and around the industry, personal contacts that enable me to secure interviews with individuals who are central figures in the networks I am looking at. That said, I am, when it comes to network analysis, a self-taught independent scholar with what remains only a sketchy knowledge of the field—which is, for example, why I have just discovered you. In any case, I am truly delighted to discover this site. I anticipate learning much from you.

    • 2. Tore Opsahl  |  November 3, 2011 at 5:04 pm

      Hi John,

      Thank you for taking an interest in my work. The dataset sounds really interesting, and a good source of understanding team formation and success. Also, the mixed methods approach is great for guiding the research. Please send me an email so we can connect privately.

      Best,
      Tore

  • 3. Vaclav  |  November 24, 2011 at 2:19 pm

    Hi, thank you for developing nice tools for NA. It seems to me, however, that your network format ignores zero-degree nodes, right?

    • 4. Tore Opsahl  |  November 24, 2011 at 10:39 pm

      Hi Vaclav,

      Thanks! The edgelist format does indeed not include isolates when they are at the end of the node id sequence. If you have a look at the example in the blog post Closeness centrality in networks with disconnected components, you can see how to include isolates by making sure a non-isolate has the highest node id.

      Best,
      Tore
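A minimal base-R sketch of the workaround described in this reply. The edge list, the column names, and the relabelling scheme are hypothetical; the only idea taken from the reply is that the number of nodes is inferred from the highest node id, so a non-isolate should carry that id.

```r
# Hypothetical edge list (i, j, w); nodes 4 and 5 exist but have no ties.
net <- data.frame(i = c(1, 2), j = c(2, 3), w = c(1, 1))
n_true <- 5

# Number of nodes implied by the edge list: the highest id present.
max(net$i, net$j)  # 3, so isolates 4 and 5 are lost

# Relabel so that isolates take the lowest ids and connected nodes the highest.
connected <- sort(unique(c(net$i, net$j)))
isolates  <- setdiff(seq_len(n_true), connected)
new_id <- integer(n_true)
new_id[isolates]  <- seq_along(isolates)
new_id[connected] <- length(isolates) + seq_along(connected)

net2 <- transform(net, i = new_id[i], j = new_id[j])
max(net2$i, net2$j)  # 5, so all nodes are now implied
```

Keeping the `new_id` vector makes it possible to map any node-level results back to the original ids afterwards.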

      • 5. Vaclav  |  December 8, 2011 at 9:04 pm

        Hi again,
        you are absolutely right. It’s only a problem for those last nodes. Thanks again, your package is very helpful to my research!

        All best,
        Vaclav

  • 6. Alexander Smit  |  May 17, 2012 at 9:26 pm

    Hi Tore,

    thank you very much for all the good work you are doing with tnet! Since the data I use for my PhD is either 2-mode or (when projected to one of the modes) weighted, you can imagine I make grateful use of the package.

    What I was wondering: do you think there will be a day where we can calculate network-level measures for weighted networks, such as density or degree centralization? In the work on interorganizational networks my experience is that usually the values get dichotomized right away so they can be tackled with the traditional network-level measures. But somehow that does not feel completely right.

    I am curious what your vision is on this subject!

    Best,

    Alexander Smit

    • 7. Tore Opsahl  |  May 18, 2012 at 2:18 pm

      Hi Alexander,

      Great question and one that is partly unanswered. When I say partly, it is because degree centralization metrics have yet to be generalized to two-mode networks and weighted one-mode networks, and density to weighted networks; however, there is a host of network-level or global metrics for two-mode and weighted one-mode networks. For example, average degree is a better metric than network centralization (in my mind) as degree has been shown not to scale quadratically with the number of nodes (i.e., n*(n-1); see Node Centrality in Two-mode Networks). Additionally, there are global clustering metrics for both two-mode networks and weighted one-mode networks that might be applicable to you.

      Hope this gives you some new ideas!

      Tore

  • 8. Basov Nikita  |  August 1, 2013 at 11:28 am

    Hi Tore,

    Thank you for the tnet tool and this site, which I find very interesting.
    Have you heard of any methods of group detection in 2-mode networks (weighted or not)?

    Best wishes,
    Nikita

    • 9. Tore Opsahl  |  August 14, 2013 at 11:03 pm

      Hi Nikita,

      Glad you found it interesting. If you are looking to detect the level of clustering, you can use the two-mode clustering coefficient in tnet. However, if you are looking for two-mode community detection, there are a number of efforts out there, such as Michael Barber’s work (http://arxiv.org/a/barber_m_1).

      Let me know what you find!

      Tore

  • 10. Rui  |  August 14, 2013 at 5:04 pm

    Hi Tore,

    I am very pleased to discover your blog, which is really helpful to my research. Thank you for your effort! One question I would like to ask is how do you draw your two-mode (bipartite) graphs, with each mode using respective shape or color? Do you use R for visualization as well, or other software?

    Best regards
    Rui

    • 11. Tore Opsahl  |  August 14, 2013 at 10:39 pm

      Hi Rui,

      Glad you find it useful! The graphics on this site are “hand-made”; however, you can use Gephi or NetDraw with node attributes, which allow you to give the nodes separate sizes and shapes.

      Tore

      • 12. Rui  |  August 15, 2013 at 10:16 pm

        I have no idea about NetDraw, but it seems that Gephi cannot do that in batches, say for thousands of nodes. I have another question: how do you draw graphs with community structures in different colors?

      • 13. Tore Opsahl  |  August 19, 2013 at 3:44 am

        Hi Rui,

        Have a look at the tutorials for Gephi and NetDraw. I believe both of them have this capability. Do check with their user forums if you have questions regarding those programs.

        Best,
        Tore

      • 14. Rui  |  August 19, 2013 at 7:53 am

        Hi Tore,

        I have found it, thank you very much.

        Best
        Rui

  • 15. Giovanni Paganini  |  August 17, 2013 at 2:58 pm

    Hi Tore, I had a look at your web site and found it really interesting!
    I have a question: I have to generate random networks where the attributes of the nodes and edges should be random as well.

    The area is auto fraud analysis and the attributes are, for instance: “Car parked when collision happened (value 1 or 2)”, “Police Called”,.. Basically these are binary attributes that should be randomly generated together with the randomly generated networks.

    The purpose is to find a “normal” value and compare it with the originally observed network data.

    Any suggestion on how to do this with TNET or other packages?

    Thanks a lot!
    Giovanni

    • 16. Tore Opsahl  |  August 19, 2013 at 3:45 am

      Hi Giovanni,

      Thanks!

      I would randomize the ties in the network using the functions in tnet, and then use the function sample in R to randomly assign attributes to the nodes and ties.

      Hope this helps,
      Tore
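A minimal sketch of the second half of this suggestion, using only base R's sample function. The edge list stands in for a network whose ties have already been randomised, and the attribute names, the meaning of the values 1 and 2, and the probabilities are assumptions made for illustration.

```r
set.seed(1)  # for reproducibility

# Hypothetical (already randomised) edge list of drivers i and collisions p.
net <- data.frame(i = c(1, 2, 3, 3, 4), p = c(1, 1, 2, 3, 3))

# Node attribute: was the police called to each collision? (assumed 50/50)
collisions <- sort(unique(net$p))
police_called <- sample(c(TRUE, FALSE), length(collisions), replace = TRUE)
names(police_called) <- collisions

# Tie attribute: assumed coding 1 = parked, 2 = moving, drawn 30/70.
net$parked <- sample(1:2, nrow(net), replace = TRUE, prob = c(0.3, 0.7))

# Alternative that keeps the observed attribute frequencies fixed:
# permute an observed attribute vector instead of drawing new values.
observed <- c(1, 2, 2, 2, 1)  # made-up observed values
net$parked_perm <- sample(observed)

net
police_called
```

Drawing fresh values tests against assumed probabilities, whereas permuting the observed values tests against the frequencies actually seen in the data; which one is appropriate depends on the null hypothesis.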

  • 17. Giovanni Paganini  |  August 19, 2013 at 12:16 pm

    Thanks Tore!!
    Will try to implement your advice!

    G

  • 18. Giovanni Paganini  |  August 20, 2013 at 6:14 pm

    Hi Tore,
    how are you?
    I have another question. My network is probably a two-mode network, as drivers in an accident are linked only through the collision, in a similar way to an affiliation.
    Can you give me a quick hint on how to create such a network with TNET?
    I read all the material on the package; I have only one doubt.

    If I use AXX for collisions and numbers for drivers, I may expect a structure like this
    1 A3
    2 A3

    meaning that 1 and 2 are involved in the collision A3. How can I create the network with TNET? Do I have just to load it from an external file with the structure below?
    Driver Collisions
    1 A3
    2 A3

    Let’s imagine now I want to associate different attributes to Drivers and Collisions (e.g. “Seriously Injured (Y/N)” to drivers and “Collision Happened at Night (Y/N)” to Collisions): how can I do it?

    Apologies for the trivial questions! I hope to raise more intelligent ones in the future.

    Thanks!
    Giovanni

    • 19. Tore Opsahl  |  August 21, 2013 at 2:18 am

      Hi Giovanni,

      Have a look at tnet » Two-mode Networks » Random Networks (https://toreopsahl.com/tnet/two-mode-networks/random-networks/). You can create a classic random two-mode network with the rg_tm-function, and various reshufflings using the rg_reshuffling_tm-function.

      In essence, if you want to create a network of 10 crashes involving 20 drivers with 30 relationships, type the following:
      net <- rg_tm(ni=20,np=10, ties=30)

      Then you can assign the edge attribute of whether the driver got seriously injured in that specific crash by typing (probability 20%):
      net[,"serious.injured"] <- runif(nrow(net))<0.2

      The node attribute of the crash can be generated separately by typing (assuming a 60% probability):
      node.attribute.night <- runif(length(unique(net[,"p"])))<0.6

      However, these functions do not enforce constraints such as "at least one driver must be involved in each crash". To do this, you can randomly reshuffle an existing network (see the rg_reshuffling_tm-function).

      Tore

  • 20. Giovanni Paganini  |  August 21, 2013 at 8:35 am

    Thanks a lot Tore!!
    That’s really helpful!
    Giovanni

    • 21. Giovanni  |  September 12, 2013 at 11:42 am

      Hi Tore,
      how are you?
      Let me abuse your patience and ask you a further question.
      Let’s imagine I have to simulate, say, 10,000 random networks with the technique you described, that is, with the rg_tm function.

      In other words, I need to embed the rg_tm function in a loop, save the results, and then get the empirical distribution of some attribute values (e.g. number of seriously injured, …).

      Do you have any example code for this?

      Thanks a lot!!!!
      Ciao
      Giovanni

      • 22. Tore Opsahl  |  September 24, 2013 at 10:12 pm

        Hi Giovanni,

        Have a look at the for-loop help pages in R. This should allow you to figure it out.

        Best,
        Tore
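For readers with the same question, here is a minimal sketch of such a loop in base R. The generator `random_two_mode` is a hypothetical stand-in written for this example so that it runs without any package; in practice it would be replaced by a call such as the rg_tm one shown earlier in this thread. The 20% injury probability is taken from that earlier reply, while the number of simulations and the observed value of 10 are made up.

```r
# Hypothetical base-R stand-in for a random two-mode generator:
# draws `ties` distinct driver-collision pairs uniformly at random.
random_two_mode <- function(ni, np, ties) {
  picked <- sample(ni * np, ties)          # cells of an ni x np grid
  data.frame(i = (picked - 1) %% ni + 1,   # driver id
             p = (picked - 1) %/% ni + 1)  # collision id
}

set.seed(42)
n_sim <- 1000  # use 10000 for the case described above
n_injured <- numeric(n_sim)

for (s in seq_len(n_sim)) {
  net <- random_two_mode(ni = 20, np = 10, ties = 30)
  # Tie attribute: seriously injured with probability 20%.
  injured <- runif(nrow(net)) < 0.2
  n_injured[s] <- sum(injured)
}

# Empirical distribution of the statistic across the simulated networks.
summary(n_injured)
quantile(n_injured, c(0.025, 0.975))

# Share of simulations with at least as many injuries as a made-up
# observed value of 10.
mean(n_injured >= 10)
```

Pre-allocating the result vector and storing one summary number per iteration keeps memory use low; storing every simulated network is only needed if several statistics are to be computed later.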

  • 23. Giovanni  |  September 25, 2013 at 9:44 am

    Thanks!
    G

  • 24. Beatriz  |  February 7, 2014 at 6:44 pm

    Thank you very much for this site and the package.

  • 25. Julie  |  August 2, 2014 at 6:15 pm

    Hi Tore,

    Thank you very much for the package. It has been extremely useful for my PhD analyses. I work on association data on wild elephants. I consider elephants associated if they are found in the same group (same location in date and time). The function rg_reshuffling_tm worked fine for one of my sites with about 300 observations and I got 1000 random networks in less than 10 minutes. However, in another site with about 1000 observations (where each observation corresponds to an individual’s occurrence at a specific) one reshuffling was still running after 8 hours. Is that something to be expected or is there a bug that can be fixed?

    Thanks for your help,

    Julie

    • 26. Tore Opsahl  |  August 21, 2014 at 5:10 pm

      Hi Julie,

      Glad you’re finding it useful. It is worth noting that these functions are written in R, so they won’t be the most efficient ones. You might want to compile the functions.

      Also, the running time is linked to the density of the network. If you have specific issues, email me your code and data. Then I’ll have a look.

      Tore

  • 27. Mouna  |  May 6, 2015 at 2:52 pm

    Hi Tore,

    Thank you very much for the package, the website, forum, paper and datasets.

    I want to convert a dataset, for example “Davis.Southern.women.2mode”, which is a tnet object, to a web object for the package “bipartite”.

    Do you have any idea how I can do it?

    Thanks for your help.

    Mouna :)

    • 28. Tore Opsahl  |  May 9, 2015 at 7:39 pm

      Hi Mouna,

      I believe you need to create an adjacency matrix for bipartite. See below.

      Hope this helps,
      Tore

      # Load tnet and bipartite
      library(tnet)
      library(bipartite)
      
      # Get network and name it net for simplicity
      data(tnet)
      net <- Davis.Southern.women.2mode
      
      # Create matrix for bipartite
      g <- matrix(data=0, nrow=max(net[,1]), ncol=max(net[,2]))
      for(i in 1:nrow(net))
        g[net[i,1],net[i,2]] <- 1
      
      # Run a metric
      togetherness(g)
      
  • 29. economicurtis  |  May 10, 2015 at 8:05 pm

    Howdy, I am using tnet for my research (if you’re curious, implementing weighted betweenness as a measure of “moneyness” in a barter marketplace). Curious if you have a preference for how we might cite you? Great resource by the way!

    • 30. Tore Opsahl  |  May 11, 2015 at 11:21 pm

      Curtis,

      Betweenness is often poorly applied in my view; however, a barter network is perhaps one of the few appropriate settings for applying it. How do you measure the tie weights? Also, have you considered the flow metric (Freeman et al., 1991)?

      Best,
      Tore

  • 31. Pavel  |  June 17, 2015 at 11:10 am

    Dear Tore,
    I would like to calculate network centralization in a weighted network. Can I simply average centrality estimates of the network members, calculated by tnet ?

    • 32. Tore Opsahl  |  June 17, 2015 at 12:02 pm

      Hi Pavel,

      If you wish to compute a centrality score for a whole network, I would suggest taking the average value across nodes.

      The weighted centrality scores do not have a theoretical maximum as weights are generally not bounded by an upper value. As such, centralization scores, which are often bounded between 0 and 1, are hard to define, and I have stayed clear of them. Also, I have issues with the normalization applied by these methods. For example, see comment #28 on https://toreopsahl.com/2010/04/21/article-node-centrality-in-weighted-networks-generalizing-degree-and-shortest-paths/

      Best,
      Tore
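A minimal base-R sketch of the averaging suggested in this reply. The edge list is hypothetical, and degree and strength (the sum of tie weights) are computed by hand here purely for illustration; node-level scores from tnet could be averaged in the same way.

```r
# Hypothetical undirected weighted edge list (each tie listed once).
net <- data.frame(i = c(1, 1, 2, 3), j = c(2, 3, 3, 4), w = c(4, 1, 2, 5))
n <- max(net$i, net$j)

# Degree: number of ties per node.
degree <- tabulate(c(net$i, net$j), nbins = n)

# Strength: sum of the weights of the ties attached to each node.
strength <- sapply(seq_len(n), function(v) sum(net$w[net$i == v | net$j == v]))

# Network-level summaries: plain averages across nodes.
mean(degree)
mean(strength)
```

Unlike a centralization index, these averages are not bounded between 0 and 1, so they are best compared across networks measured on the same weight scale.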

      • 33. Pavel  |  June 17, 2015 at 12:17 pm

        Great, thank you!

      • 34. Carmine  |  June 13, 2016 at 5:16 pm

        Hi Tore, can I ask what would be the difference between an aggregated centrality score for the whole network (average value across nodes) and the measure of closeness you also offer with tnet?

        Thank you very much for your great work

        Best

        Carmine

      • 35. Tore Opsahl  |  June 14, 2016 at 5:51 pm

        Hi Carmine,

        The overall score would be a single number for the entire network whereas the closeness metric supplied is for individual nodes.

        My argument against using centralization (overall) scores is that these are often normalized by n(n-1), which might not be appropriate for networks as the number of connections often does not grow quadratically as more nodes are included (see Dunbar’s research).

        Best,
        Tore

  • 36. Giovanni Paganini  |  February 17, 2016 at 2:05 pm

    Hi Tore,
    how are you?

    I have a kind of ‘methodological question’ I’d like to discuss with you. I’m currently working on a network of auto accidents (still on it!!) where suspicious (connected) components of the network are assessed through the values of node attributes. Let me explain it better with an example.
    Let’s imagine the network is made up of accidents, and each accident node has a binary attribute that tells if the accident happened at night or not.
    How can I assess the value of the characteristic measured by this indicator for the whole set of nodes, that is, for the whole connected component? Just summing up all the values and dividing by the number of nodes? It seems a bit crude, but I haven’t got a better idea….
    Did you face similar problems in your activity, so you can recommend some approach/strategy?
    Many thanks in advance!
    G

    • 37. Tore Opsahl  |  February 21, 2016 at 5:00 pm

      Hi Giovanni,

      I am not entirely sure I follow the problem you are trying to solve. If you would like, send me an email and we can discuss it in more detail.

      Tore

      • 38. Giovanni  |  February 22, 2016 at 7:48 am

        Will do it!
        What is your e-mail ?
        Thanks!
        G

  • 39. Antonia Strazzella  |  June 4, 2017 at 10:33 am

    Dear Tore,

    thank you very much for your webpage and your great work!

    I’m trying to use tnet for psychiatric research in which nodes are symptoms.
    I have converted my correlation matrix to a matrix with 3 columns (i, j, and their weight) with as.tnet, but when I try to compute degree, betweenness and closeness (degree_w, betweenness_w, closeness_w) with different values of alpha, I get the same results in the output.
    Hope you can help me to figure out what went wrong.

    Again, thank you so much!

    Antonia

  • 41. Justin Schon  |  July 2, 2019 at 9:42 pm

    Dear Tore,

    Thank you for your work on this package.

    I wanted to ask about how the package symmetrizes networks. I am using the betweenness_w function, and I realized that it calculates different betweenness values when I set directed=FALSE than when I first symmetrise the network and then use the betweenness_w function. I see that symmetrise_w by default uses method="MAX". If I do not first symmetrise but I use betweenness_w(x, directed=FALSE), then what is the method used for deriving the undirected network?

    Thanks for any info you can share here,

    Justin

    • 42. Tore Opsahl  |  August 20, 2019 at 9:16 pm

      Hi Justin,

      If you set directed=FALSE, the network will be analyzed as a directed network. You might also choose to let the directed-parameter be NULL. Then the function will auto detect whether or not the network is directed.

      Best,
      Tore

      • 43. Justin Schon  |  August 20, 2019 at 9:56 pm

        Thank you for your reply. Did you mean that when directed=FALSE, the network will be analyzed as an undirected network? Also, I was confused about why I get different results if I use the symmetrise_w function and then let the directed parameter be NULL, versus just skipping the symmetrise step and setting directed=FALSE. Does this question make sense?

  • 44. Tore Opsahl  |  August 27, 2019 at 8:06 pm

    Correct: when directed=FALSE, the network will be analyzed as an undirected network. Please send me an email with your data and I will have a look.
