December 16, 2013

Fireside Science: Inspired by a visit to the Network's Frontier....

This post has been cross-posted to Fireside Science.

Recently, I attended the Network Frontiers Workshop at Northwestern University in Evanston, IL. This was a three-day session in which researchers engaged in network science from around the world gathered to present their work. They also came from many home disciplines, including computational biology, applied math and physics, economics and finance, neuroscience, and more.

The schedule (all researcher names and talk titles) can be found here. I was one of the first presenters on the first day, presenting “From Switches to Convolution to Tangled Webs” [1], which approaches network science from an evolutionary systems biology perspective.

One Field, Many Antecedents
For many people who have only a passing familiarity with network science, it may not be clear how people from so many disciplines can come together around a single theme. Unlike more conventional (e.g. causal) approaches to science, network (or hairball) science is all about finding the interactions between the objects of analysis. Network science is the large-scale application of graph theory to complex systems and ever-bigger datasets. These data can come from social media platforms, high-throughput biological experiments, and observations of statistical mechanics.

The visual definition of a scientific "hairball". This is not causal at all.....

25,000 foot View of Network Science
But what does a network science analysis look like? To illustrate, I will use an example familiar to many internet users. Think of a social network with many contacts. The network consists of nodes (e.g. friends) and edges (e.g. connections) [2]. Although there may be causal phenomena in the network (e.g. influence, transmission), the structure of the network is determined by correlative factors. If two individuals interact in some way, this increases the correlation between the nodes they represent. This gives us a web of connections in which the connectivity can range from random to highly-ordered, and the structure can range from homogeneous to heterogeneous.
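As a minimal sketch of the node-and-edge picture described above, the snippet below stores a tiny friendship network as an adjacency map and counts each person's connections. The names, the edge list, and the `build_adjacency` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical friendship data: each pair is one undirected edge.
edges = [
    ("Ann", "Bob"), ("Ann", "Cat"), ("Ann", "Dan"),
    ("Bob", "Cat"), ("Dan", "Eve"),
]

def build_adjacency(edge_list):
    """Map each node to the set of nodes it shares an edge with."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)

adjacency = build_adjacency(edges)

# Degree = number of edges touching a node (here, number of friends).
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
print(degree)
```

Even in this toy case the connectivity is uneven: one node has three connections while another has only one.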

Friend data from my Facebook account, represented as a sizable (N=64) heterogeneous network. COURTESY: Wolfram|Alpha Facebook app.

Continuing with the social network example, you may be familiar with the notion of “six degrees of separation” [3]. This describes one aspect (e.g. something that enables nth-order connectivity) of the structure inherent in complex networks. Again consider the social network: if there are no preferences for who contacts whom, a randomly-connected network results. The path between any two individuals in such a network is generally long, as there are no reliable short-cuts. The longest of these shortest paths across the network is known as the network diameter, and is an important feature of a network's topology.
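To make "degrees of separation" concrete, here is a small sketch that counts the number of steps between individuals using breadth-first search, and reports the diameter as the longest of those shortest paths. The five-person chain and the added short-cut are hypothetical data, and the function names are my own.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def diameter(adjacency):
    """Longest shortest path over all pairs (assumes a connected network)."""
    return max(
        max(shortest_path_lengths(adjacency, s).values()) for s in adjacency
    )

# Hypothetical chain of acquaintances: A - B - C - D - E.
chain = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
    "D": {"C", "E"}, "E": {"D"},
}

# Same chain with one extra tie between A and D acting as a short-cut.
with_shortcut = {node: set(nbrs) for node, nbrs in chain.items()}
with_shortcut["A"].add("D")
with_shortcut["D"].add("A")

print(diameter(chain), diameter(with_shortcut))
```

Adding a single tie shortens the longest separation in this toy network, which is the intuition behind short-cuts.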

Example of a social network. This example is homogeneous, but with highly-regular structure (e.g. non-random). 

Let us further assume that in the same network, there happen to be strong preferences for inter-node communication, which leads to changes in connectivity. In such cases, we get connectivity patterns that range from scale-free [4] to small-world [5]. In social networks, small-world networks have been implicated in the “six degrees” phenomenon, as the path between any two individuals is much shorter than in the random case. Scale-free and especially small-world networks have a heterogeneous structure, which can include local subnetworks (e.g. modules or communities) and small subpopulations of nodes with many more connections than other nodes (e.g. network hubs). Statistically, heterogeneity can be determined using a number of measures, including betweenness centrality and network diameter.
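Betweenness centrality, mentioned above, counts how many shortest paths between other pairs of nodes pass through a given node. The sketch below uses Brandes' algorithm for unweighted, undirected graphs (my choice of method, not something taken from the workshop talks) on a hypothetical network of two tightly-knit triangles joined by a single bridging node.

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalized betweenness centrality for an unweighted, undirected graph."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        order = []                              # nodes in order of discovery
        preds = {v: [] for v in adjacency}      # predecessors on shortest paths
        n_paths = {v: 0 for v in adjacency}     # number of shortest paths from s
        dist = {v: -1 for v in adjacency}
        n_paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        dep = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    # Each unordered pair was counted from both endpoints, so halve.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical network: triangles {A, B, C} and {E, F, G} bridged by D.
graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D", "F", "G"}, "F": {"E", "G"}, "G": {"E", "F"},
}
centrality = betweenness(graph)
print(centrality)
```

The bridging node scores highest even though it has only two connections, which illustrates how a heterogeneity measure can pick out structurally important nodes that raw connection counts would miss.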

Example of a small-world network, in the scheme of things. 

Emerging Themes
While this example was made using a social network, the basic methodological and statistical approach can be applied to any system of strongly-interacting agents that can provide a correlation structure [6]. For example, high-throughput measurements of gene expression can be used to form a gene-gene interaction network. Genes that correlate with each other (above a pre-determined threshold) are considered connected in a first-order manner. The connections, while indirectly observed, can be statistically robust and validated via experimentation. And since all assayed genes (on the order of 10^3 genes) are likewise connected, second and third-order connections are also possible. The topology of a given gene-gene interaction network may be informative about the general effects of knockout experiments, environmental perturbations, and more [7].
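The thresholding step described above can be sketched in a few lines. The expression values, gene names, and the 0.9 cutoff are all made up for illustration, and treating strong negative correlations as connections (by taking the absolute value) is my assumption rather than a fixed convention.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical expression levels for four genes across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related to the rest
}

THRESHOLD = 0.9  # assumed cutoff; a real analysis would justify this choice

def correlation_network(data, threshold):
    """Return (gene1, gene2, r) for every pair with |r| at or above threshold."""
    network = []
    for g1, g2 in combinations(data, 2):
        r = pearson(data[g1], data[g2])
        if abs(r) >= threshold:  # assumption: sign is ignored
            network.append((g1, g2, round(r, 3)))
    return network

network = correlation_network(expression, THRESHOLD)
for g1, g2, r in network:
    print(g1, g2, r)
```

With these toy numbers, three of the four genes end up mutually connected while the fourth remains isolated. Raising or lowering the cutoff changes the topology, which is why the threshold is fixed before the analysis.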

This combination of exploratory and predictive power is just one reason why the network approach has been applied to many disciplines, and has even formed a discipline in and of itself [8]. At the Network Frontiers Workshop, the talks tended to coalesce around several themes that define potential future directions for this new field. These include:

A) general mechanisms: there are a number of mechanisms that allow for the network to adaptively change, stay the same in the face of pressure to change, or function in some way. These mechanisms include robustness, the identification of switches and oscillators, and the emergence of self-organized criticality among the interacting nodes. Papers representing this theme may be found in [9].

The anatomy of a forest fire's spread, from a network perspective.

B) nestedness, community detection, and clustering: Along with the concept of core-periphery organization, these properties may or may not exist in a heterogeneous network. But such techniques allow us to partition a network into subnetworks (modules) that may operate with a certain degree of independence. Papers representing this theme may be found in [10].

C) multilevel networks: even in the case of social networks, each "node" can represent a number of parallel processes. For example, while a single organism possesses both a genotype and a phenotype, the correlational structure for genotypic and phenotypic interactions may not always be identical. To solve this problem, a bipartite graph structure (two distinct sets of nodes) may be used to represent different properties of the population of interest. While this is just a simple example, multilevel networks have been used creatively to attack a number of problems [11].

D) cascades, contagions: the diffusion of information in a network can be described in a number of ways. While the common metaphor of "spreading" may be sufficient in homogeneous networks, it may be insufficient to describe more complex processes. Cascades occur when transmission is sustained beyond first-order interactions. In a social network, messages that get passed to a friend of a friend of a friend (e.g. third-order interactions) illustrate the potential of the network topology to enable cascades. Papers representing this theme may be found in [12].

E) hybrid models: as my talk demonstrates, the power and potential of complex networks can be extended to other models. For example, the theoretical "nodes" in a complex network can be represented as dynamic entities. Aside from real-world data, this can be achieved using point processes, genetic algorithms, or cellular automata. One theme I detected in some of the talks was the potential for a game-theoretic approach, while others involved using Google searches and social media activity to predict markets and disease outbreaks [13].
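To illustrate the cascade theme (D) from the list above, the sketch below tracks who receives a message at each order of interaction: friends (first order), friends of friends (second order), and so on. It assumes, unrealistically, that every recipient forwards the message to all of their contacts. The friendship data and the `reach_by_order` helper are hypothetical.

```python
def reach_by_order(adjacency, seed, max_order):
    """Return {order: set of nodes first reached after exactly that many hand-offs}."""
    seen = {seed}
    frontier = {seed}
    layers = {}
    for order in range(1, max_order + 1):
        # Everyone adjacent to the current frontier who has not yet been reached.
        frontier = {nbr for node in frontier for nbr in adjacency[node]} - seen
        if not frontier:
            break  # the message has nowhere new to go
        layers[order] = frontier
        seen |= frontier
    return layers

# Hypothetical friendship network.
friends = {
    "me": {"ann", "bob"},
    "ann": {"me", "cat"},
    "bob": {"me", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat"},
}

layers = reach_by_order(friends, "me", max_order=3)
for order, people in sorted(layers.items()):
    print(order, sorted(people))
```

A more realistic model would let each hand-off succeed only with some probability, in which case whether the message ever reaches third-order contacts depends on both that probability and the topology.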

Here is a map of connectivity across three social media platforms: Facebook, Twitter, and Mashable. COURTESY: Figure 13 in [14].

[1] Here is the abstract and presentation. The talk centered around a convolution architecture, my term for a small-scale physical flow diagram that can be evolved to yield not-so-efficient (e.g. sub-optimal) biological processes. These architectures can be embedded into larger, more complex networks as subnetworks (in a manner analogous to functional modules in gene-gene interaction or gene regulatory networks).

One person at the conference noted that this had strong parallels with the book “Plausibility of Life” (excerpts here) by Marc Kirschner and John Gerhart. Indeed, this book served as inspiration for the original paper and current talk.

[2] In practice, "nodes" can represent anything discrete, from people to cities to genes and proteins. For an example from brain science, please see: Stanley, M.L., Moussa, M.N., Paolini, B.M., Lyday, R.G., Burdette, J.H. and Laurienti, P.J.   Defining nodes in complex brain networks. Frontiers in Computational Neuroscience, doi:10.3389/fncom.2013.00169 (2013).

[3] The "six degrees" idea is based on an experiment conducted by Stanley Milgram, in which he sent out and tracked the progression of a series of chain letters through the US Mail system (a social network).

The potential power of this phenomenon (the opportunity to identify and exploit weak ties in a network) was advanced by the sociologist Mark Granovetter: Granovetter, M.   The Strength of Weak Ties: A Network Theory Revisited. Sociological Theory, 1, 201–233 (1983).

The small-world network topology (the Watts-Strogatz model), which embodies the "six degrees" principle, was proposed in the following paper: Watts, D. J. and Strogatz, S. H.   Collective dynamics of 'small-world' networks. Nature, 393(6684), 440–442 (1998).

[4] A scale-free network can be defined as a network with no characteristic number of connections across all nodes. Connectivity tends to scale with growth in the number of nodes and/or edges. Whereas connectivity in a random network can be characterized using a Gaussian (e.g. normal) distribution, connectivity in a scale-free network can be characterized using a power-law (e.g. heavy-tailed) distribution.
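For readers who prefer a formula, the contrast in [4] can be written in terms of the degree distribution P(k), the probability that a node has k connections. These are the standard textbook forms rather than anything specific to the workshop talks. The symbols \lambda (the mean number of connections) and \gamma (the power-law exponent) are introduced here for illustration. For a classical random graph, P(k) is approximately Poisson, which looks bell-shaped around \lambda when \lambda is large:

```latex
\[
  P_{\text{random}}(k) \approx \frac{\lambda^{k} e^{-\lambda}}{k!},
  \qquad
  P_{\text{scale-free}}(k) \propto k^{-\gamma}
\]
```

The first form falls off very quickly away from \lambda, so nearly every node has a typical number of connections. The second decays slowly, so a few nodes with very many connections (hubs) are expected.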

[5] Small-world networks are defined by their hierarchical (e.g. strongly heterogeneous) structure and a short path length across the network. This is a special case of the more general scale-free pattern, and can be characterized with a strong power law (e.g. the distribution has a thicker tail). Because any one node can reach any other node in a relatively small number of steps, there are a number of organizational consequences to this type of configuration.

[6] Here are two foundational papers on network science [a, b] and two enlightening primers on complexity and network science [c, d]:
[a] Albert, R. and Barabasi, A-L.   Statistical mechanics of complex networks. Reviews of Modern Physics, 74, 47–97 (2002).

[b] Newman, M.E.J.   The structure and function of complex networks. SIAM Review, 45, 167–256 (2003).

[c] Shalizi, C.   Community Discovery Methods for Complex Networks. Cosma Shalizi's Notebooks - Center for the Study of Complex Systems, July 12 (2013).

[d] Voytek, B.   Non-linear Systems. Oscillatory Thoughts blog, June 28 (2013).

[7] For an example, please see: Cornelius, S.P., Kath, W.L., and Motter, A.E.   Controlling complex networks with compensatory perturbations. arXiv:1105.3726 (2011).

[8] Guimera, R., Uzzi, B., Spiro, J., and Amaral, L.A.N   Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance. Science, 308, 697 (2005).

[9] References for general mechanisms (e.g. switches and oscillators):
[a] Taylor, D., Fertig, E.J., and Restrepo, J.G.   Dynamics in hybrid complex systems of switches and oscillators. Chaos, 23, 033142 (2013).

[b] Malamud, B.D., Morein, G., and Turcotte, D.L.   Forest Fires: an example of self-organized critical behavior. Science, 281, 1840-1842 (1998).

[c] Ellens, W. and Kooij, R.E.   Graph measures and network robustness. arXiv: 1311.5064 (2013).

[d] Francis, M.R. and Fertig, E.J.   Quantifying the dynamics of coupled networks of switches and oscillators. PLoS One, 7(1), e29497 (2012).

[10] References for clustering [a], community detection [b-e], core-periphery structure detection [f], and nestedness [g]:
[a] Malik, N. and Mucha, P.J.   Role of social environment and social clustering in spread of opinions in co-evolving networks. Chaos, 23, 043123 (2013).

[b] Rosvall, M. and Bergstrom, C.T.   Maps of random walks on complex networks reveal community structure. PNAS, 105(4), 1118-1123 (2008).

* the image above was taken from Figure 3 of [b]. In [b], an information-theoretic approach to discovering network communities (or subgroups) is introduced.

[c] Colizza, V., Pastor-Satorras, R. and Vespignani, A.   Reaction–diffusion processes and metapopulation models in heterogeneous networks. Nature Physics, 3, 276-282 (2007).

[d] Bassett, D.S., Porter, M.A., Wymbs, N.F., Grafton, S.T., Carlson, J.M., and Mucha, P.J.   Robust detection of dynamic community structure in networks. Chaos, 23, 013142 (2013).

* the authors characterize the dynamic properties of temporal networks using methods such as optimization variance and randomization variance.

[e] Nishikawa, T. and Motter, A.E.   Discovering network structure beyond communities, Scientific Reports, 1, 151 (2011).

[f] Bassett, D.S., Wymbs, N.F., Rombach, M.P., Porter, M.A., Mucha, P.J., and Grafton, S.T.   Task-Based Core-Periphery Organization of Human Brain Dynamics. PLoS Computational Biology, 9(9), e1003171 (2013).

* a good example of how core-periphery structure is extracted from brain networks constructed from fMRI data.

[g] Staniczenko, P.P.A., Kopp, J.C., and Allesina, S.   The ghost of nestedness in ecological networks. Nature Communications, doi:10.1038/ncomms2422 (2012).

[11] References for multilevel networks:
[a] Szell, M., Lambiotte, R., and Thurner, S.   Multirelational organization of large-scale social networks in an online world. PNAS, doi:10.1073/pnas.1004008107 (2010).

[b] Ahn, Y-Y., Bagrow, J.P., and Lehmann, S.   Link communities reveal multiscale complexity in networks. Nature, 466, 761-764 (2010).

[12] References for cascades and contagions:
[a] Centola, D.   The Spread of Behavior in an Online Social Network Experiment. Science, 329, 1194-1197 (2010).

[b] Brummitt, C.D., D’Souza, R.M., and Leicht, E.A.   Suppressing cascades of load in interdependent networks. PNAS, doi:10.1073/pnas.1110586109 (2011).

[c] Brockmann, D. and Helbing, D.   The Hidden Geometry of Complex, Network-Driven Contagion Phenomena. Science, 342(6164), 1337-1342 (2013).

[d] Glasserman, P. and Young, H.P.   How Likely is Contagion in Financial Networks? Oxford University Department of Economics Discussion Papers, #642 (2013).

[13] Reference for hybrid networks and other themes, including network evolution [a,b] and the use of big data in network analysis [c,d]:
[a] Pang, T.Y. and Maslov, S.   Universal distribution of component frequencies in biological and technological systems. PNAS, doi:10.1073/pnas.1217795110 (2012).

[b] Bassett, D.S., Wymbs, N.F., Porter, M.A., Mucha, P.J., and Grafton, S.T.   Cross-Linked Structure of Network Evolution. arXiv: 1306.5479 (2013).

[c] Ginsberg, J., Mohebbi, M.H., Patel, R.S., Brammer, L., Smolinski, M.S., and Brilliant, L.   Detecting influenza epidemics using search engine query data. Nature, 457, 1012–1014 (2008).

[d] Michel, J-B., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Google Books Team, Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M.A., Aiden, E.L.   Quantitative Analysis of Culture Using Millions of Digitized Books. Science, 331(6014), 176-182 (2011).

[14] Ferrara, E.   A large-scale community structure analysis in Facebook. EPJ Data Science, 1:9 (2012).
