Here are some nice technological and mathematical non-sequiturs for the holiday season. Full of modern Western traditions. Thanks to Craig Ferguson, Benoit Mandelbrot, and natural forms for the conceptual help.
One area of evolutionary science that has always fascinated me involves subtle evolutionary mechanisms. Aside from having an interest in evolutionary modeling and developmental biology, I am also particularly interested in evolutionary mechanisms that are nonlinear and provide a path towards complex evolutionary dynamics. This perspective is somewhat different from a traditional phylogenetic model, and requires a significant departure from standard population genetics thinking as well.
Whether this belongs to the extended evolutionary synthesis or not is not clear, although a mechanism-first approach is inclusive of development and other life-history considerations. We will begin by looking at a new paper on the evolution of Mexican cavefish (Astyanax mexicanus) populations. A. mexicanus had been previously identified as a prime example of developmental processes playing a role in morphological divergence between species. Namely, the cave-dwelling morph has lost its eyes, which are not needed in the cave environment. Figure 1 shows the latest version of this story.
Overview of Hsp90 phenotypic capacitance mechanism, based on account from .
In the latest paper, an inducible system is tested which depletes the available amount of Hsp90 (a chaperone molecule which aids in protein folding). A few notes on the changes that have been linked to the absence of Hsp90:
1) the relationship between Hsp90 (chaperone) and proteins is one of a metastable signal transducer. For example, one folding state results when the chaperone is present, while another state results when the chaperone is not. This results in a sigmoidal response function. As the chaperone is depleted, some deleterious traits become unmasked. But for large-scale changes to occur, a complete depletion of the chaperone is required.
Schematic demonstrating the shape of a sigmoidal function.
2) Hsp90 is intentionally overproduced in the sense that enough of the chaperone is available when unpredictable environmental stresses occur, requiring a greater amount of chaperone to achieve the proper folding. This baseline is a conserved mechanism for morphological robustness (sometimes called phenotypic buffering).
3) in general, the more environmental stress that exists during development, the more Hsp90 is needed and used. When Hsp90 is exhausted, deleterious and large-scale changes can be unmasked (sometimes called phenotypic capacitance).
4) since selection does not act directly upon masked variation, multiple variants can be unmasked at the same time, revealing large changes in phenotype (similar to the notion of hopeful monster).
A hopeful monster represented using the Fisherian model of evolutionary mutation. Taken from .
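To make the sigmoidal response in point 1 concrete, here is a minimal sketch. It assumes a logistic curve relating a normalized Hsp90 level (0 to 1) to the fraction of latent variation that is unmasked. The function name and the `midpoint` and `steepness` values are illustrative assumptions, not quantities taken from the cavefish paper.

```python
import math

def unmasked_fraction(hsp90_level, midpoint=0.3, steepness=20.0):
    """Fraction of latent variation expressed at a given Hsp90 level (0..1).

    A logistic (sigmoidal) curve: close to 0 while the chaperone is
    plentiful, close to 1 once it is depleted. The midpoint and steepness
    are hypothetical parameters chosen only to show the shape.
    """
    return 1.0 / (1.0 + math.exp(steepness * (hsp90_level - midpoint)))

# Walk from a full reservoir down to complete depletion.
for level in (1.0, 0.5, 0.3, 0.1, 0.0):
    print(f"Hsp90 = {level:.1f} -> unmasked fraction = {unmasked_fraction(level):.3f}")
```

The flat upper region of the curve corresponds to phenotypic buffering, while the steep transition near the midpoint corresponds to the sudden unmasking described above.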
The bottom line is that while an eyeless phenotype would not have a high fitness in an above-ground environment, having eyes would be quite costly in a cave environment. Thus, eyeless phenotypes would suddenly have a high fitness, but only in the context of this niche. But how do you get from point A to point B, particularly when most existing theoretical models assume gradual and/or genotypic-driven change?
This is where statistical models of extreme events come into play. While the biological model suggests that eyeless phenotypes are the consequence of a failed protective mechanism, we can also understand these changes as extreme events that have a statistical distribution in any evolutionary context. Fortunately, we can turn to statistical physics for two candidate models: the Abelian sandpile model, and the Dragon King model.
In both cases, we must make the assumption that extreme events are not only possible but inevitable as evolutionary outcomes. In both "sandpile" and "dragon" evolution, extreme events can drive processes like speciation, niche specialization, and evolutionary diversification. The difference between sandpile evolution and dragon evolution involves whether or not extreme events are due to the same process as other evolutionary outcomes which are smaller in magnitude. This should not be interpreted as a verdict on so-called "gene-centered" evolution  -- while sandpile evolution is more likely to be dominated by changes in the genetic architecture, dragon evolution simply provides ways to organize the expression of these genotypic changes.
Distribution in time (top) and probability distribution (bottom) of Dragon King events (notice that they deviate from a conventional power law in the tail region). Images taken from .
The sandpile model demonstrates that the same underlying process (in this case, the growth and avalanche dynamics of a sandpile) is responsible for observed events of every magnitude. While this process is stochastic and unpredictable, it can be characterized using a power law distribution . While you can predict the existence of avalanches (and perhaps at a certain frequency), you cannot predict when they will happen or the chain of events that lead up to them. In "sandpile evolution", the mutational structure serves as the driving force for evolutionary change. Even when the genotypic mutation rate is constant, cumulative changes (driven by delayed feedback) could sometimes lead to large-scale and sudden changes in phenotype.
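The sandpile dynamics described above can be sketched in a few lines. This is a minimal version of the standard sandpile rule (a site holding four or more grains topples, passing one grain to each neighbour, with grains lost at the edges). The grid size, number of grains, and seed are arbitrary assumptions, and the mapping onto mutation and phenotype is left as an analogy.

```python
import random

def drop_grain(grid, row, col):
    """Add one grain at (row, col), relax the pile, and return the
    avalanche size (the number of topplings that one grain triggered)."""
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= 4
        topplings += 1
        if grid[r][c] >= 4:
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:  # grains fall off the edge
                grid[nr][nc] += 1
                if grid[nr][nc] >= 4:
                    unstable.append((nr, nc))
    return topplings

def simulate(size=15, grains=5000, seed=1):
    """Drop grains at random sites; return all avalanche sizes and the grid."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = [drop_grain(grid, rng.randrange(size), rng.randrange(size))
             for _ in range(grains)]
    return sizes, grid

sizes, _ = simulate()
quiet = sum(1 for s in sizes if s == 0) / len(sizes)
print(f"drops causing no avalanche: {quiet:.0%}, largest avalanche: {max(sizes)}")
```

Every grain is added by the same rule, yet the resulting avalanches range from nothing at all to events that sweep across much of the grid, which is the sense in which one process accounts for events of every magnitude.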
A dynamical phase space representation of the Dragon King event, taken from . In this case, the dynamical behavior of a coupled chaotic oscillator sporadically wanders far outside of its attractor orbit, resulting in an extreme event.
The Dragon King model, by contrast, assumes that events of large magnitude are not due to the same processes as events of small magnitude. Dragon King events, such as financial crises, coherent structures in turbulent fluids, and the behavior of coupled chaotic oscillators, cannot be characterized well by a typical power law distribution, with exceptional differences in the tail region. While "dragon evolution" relies upon two or more concurrent processes, there is a historical contingency that allows for one of these processes to be sporadically amplified. This amplification is accomplished through cumulative negative feedback from some mediating factor (perhaps chaperones). Much as in the case of sandpile evolution, this generates large-magnitude events at a low frequency.
From what I can tell, the model of phenotypic capacitance for the cavefish matches the Dragon King criteria quite well. In this case, you have a dynamical system -- a variable concentration of Hsp90 that changes deterministically with respect to stochastic environmental fluctuations. When the Hsp90 concentration reaches zero (which happens rarely and represents a lower-bound), the phenotypic system sojourns far from equilibrium. Crucially, the depletion of Hsp90 and the absence of Hsp90 behave as independent systems: the depletion of Hsp90 merely allows for deleterious phenotypes to be expressed.
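As a toy illustration of this dynamical picture, the sketch below treats Hsp90 as a bounded reservoir that is replenished at a constant rate and drawn down by random environmental stress. Everything here is an assumption made for illustration: the function name, the exponentially distributed stress, and the parameter values are hypothetical and are not fitted to any data.

```python
import random

def simulate_hsp90(steps=10000, capacity=10.0, production=1.0,
                   mean_stress=0.8, seed=0):
    """Track a bounded Hsp90 reservoir under random environmental stress.

    Each step adds a fixed amount (production) and removes a random amount
    (stress, drawn from an exponential distribution). The level is clamped
    to [0, capacity]; steps where it sits at zero are recorded as the rare
    'unmasking' events.
    """
    rng = random.Random(seed)
    level = capacity
    trace, unmasked_steps = [], []
    for t in range(steps):
        stress = rng.expovariate(1.0 / mean_stress)  # mean equals mean_stress
        level = min(capacity, max(0.0, level + production - stress))
        trace.append(level)
        if level == 0.0:
            unmasked_steps.append(t)
    return trace, unmasked_steps

trace, unmasked = simulate_hsp90()
print(f"lowest level reached: {min(trace):.2f}, steps at zero: {len(unmasked)}")
```

The update rule itself is deterministic given the stress input, and the lower bound at zero is where the behavior changes qualitatively: above it, fluctuations are buffered, while at it the system is in the unmasked regime.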
It is of note that in the original Hsp90 experiments in Drosophila, most of these phenotypes turned out to be embryonic lethal. But, using a different mechanism, the absence of Hsp90 allows for suites of mutations (representing latent variation) to be expressed, resulting in a coherent, non-lethal embryonic phenotype that can have high fitness in a narrow range of environmental contexts. Perhaps this is the beginning of a mathematical model for evo-devo!
Reconciling the Dragon King in phase space with phenotypic evolution. Images taken from  and .
Recently, I attended the Network Frontiers Workshop at Northwestern University in Evanston, IL. This was a three-day session in which researchers engaged in network science from around the world gathered to present their work. They also came from many home disciplines, including computational biology, applied math and physics, economics and finance, neuroscience, and more.
For many people who have a passing familiarity with network science, it may not be clear as to how people from so many disciplines can come together around a single theme. Unlike more conventional (e.g. causal) approaches to science, network (or hairball) science is all about finding the interactions between the objects of analysis. Network science is the large-scale application of graph theory to complex systems and ever-bigger datasets. These data can come from social media platforms, high-throughput biological experiments, and observations of statistical mechanics.
The visual definition of a scientific "hairball". This is not causal at all.....
25,000 foot View of Network Science
But what does a network science analysis look like? To illustrate, I will use an example familiar to many internet users. Think of a social network with many contacts. The network consists of nodes (e.g. friends) and edges (e.g. connections) . Although there may be causal phenomena in the network (e.g. influence, transmission), the structure of the network is determined by correlative factors. If two individuals interact in some way, this increases the correlation between the nodes they represent. This gives us a web of connections in which the connectivity can range from random to highly-ordered, and the structure can range from homogeneous to heterogeneous.
Friend data from my Facebook account, represented as a sizable (N=64) heterogeneous network. COURTESY: Wolfram|Alpha Facebook app.
Continuing with the social network example, you may be familiar with the notion of “six degrees of separation”. This describes one aspect (e.g. something that enables nth-order connectivity) of the structure inherent in complex networks. Again consider the social network: if there are no preferences for who contacts whom, a randomly-connected network results. The path between any two individuals in such a network is generally long, as there are no reliable short-cuts. The longest of the shortest paths between pairs of nodes is known as the network diameter, and is an important feature of a network's topology.
Example of a social network. This example is homogeneous, but with highly-regular structure (e.g. non-random).
Let us further assume that in the same network, there happen to be strong preferences for inter-node communication, which leads to changes in connectivity. In such cases, we get connectivity patterns that range from scale-free  to small-world . In social networks, small-world networks have been implicated in the “six degrees” phenomenon, as the path between any two individuals is much shorter than in the random case. Scale-free and especially small-world networks have a heterogeneous structure, which can include local subnetworks (e.g. modules or communities) and small subpopulations of nodes with many more connections than other nodes (e.g. network hubs). Statistically, heterogeneity can be determined using a number of measures, including betweenness centrality and network diameter.
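The effect of short-cuts on path length can be checked directly. The sketch below uses a ring lattice with only local connections as a stand-in for a network without short-cuts, then moves a small fraction of edges to random endpoints, in the spirit of the Watts-Strogatz procedure. The network size, neighbourhood size, and rewiring probability are arbitrary assumptions, and the rewiring routine is a simplified version rather than a reference implementation.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, rng):
    """With probability p, move one end of each edge to a random new node."""
    nodes = sorted(adj)
    edges = [(i, j) for i in nodes for j in sorted(adj[i]) if i < j]
    for i, j in edges:
        if rng.random() < p:
            candidates = [v for v in nodes if v != i and v not in adj[i]]
            if candidates:
                new = rng.choice(candidates)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj

def path_stats(adj):
    """Mean shortest-path length over reachable pairs, and the diameter."""
    total, pairs, diameter = 0, 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from this source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
        diameter = max(diameter, max(dist.values()))
    return total / pairs, diameter

lattice = ring_lattice(200, 2)
rewired = rewire(ring_lattice(200, 2), 0.1, random.Random(0))
print("local links only: mean path %.1f, diameter %d" % path_stats(lattice))
print("10%% rewired:      mean path %.1f, diameter %d" % path_stats(rewired))
```

Only a small fraction of edges needs to become long-range for the typical separation between nodes to fall sharply, which is the intuition behind the "six degrees" observation.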
Example of a small-world network, in the scheme of things.
While this example was made using a social network, the basic methodological and statistical approach can be applied to any system of strongly-interacting agents that can provide a correlation structure. For example, high-throughput measurements of gene expression can be used to form a gene-gene interaction network. Genes that correlate with each other (above a pre-determined threshold) are considered connected in a first-order manner. The connections, while indirectly observed, can be statistically robust and validated via experimentation. And since all assayed genes (on the order of 10^3 genes) are likewise connected, second and third-order connections are also possible. The topology of a given gene-gene interaction network may be informative about the general effects of knockout experiments, environmental perturbations, and more.
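A minimal sketch of the thresholding step described above is given here. The gene names and expression values are made up for illustration, the 0.85 cutoff is an arbitrary assumption, and Pearson correlation is only one of several measures that could be used.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length lists of measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_network(expression, threshold=0.85):
    """Link two genes when |correlation| meets the pre-determined threshold."""
    adj = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj

def second_order(adj, gene):
    """Genes reachable in exactly two steps (neighbours of neighbours)."""
    first = adj[gene]
    return {n2 for n1 in first for n2 in adj[n1]} - first - {gene}

# Hypothetical expression levels for four genes across four samples.
expression = {
    "geneA": [1, 3, 5, 7],
    "geneB": [2, 2, 4, 8],
    "geneC": [4, 0, 2, 10],
    "geneD": [5, 1, 7, 3],
}
network = correlation_network(expression)
for gene in sorted(network):
    print(gene, "->", sorted(network[gene]))
print("second-order partners of geneA:", sorted(second_order(network, "geneA")))
```

In this toy data set geneA and geneC fall below the cutoff with each other but both correlate with geneB, so they are linked only through a second-order connection.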
This combination of exploratory and predictive power is just one reason why the network approach has been applied to many disciplines, and has even formed a discipline in and of itself . At the Network Frontiers Workshop, the talks tended to coalesce around several themes that define potential future directions for this new field. These include:
A) general mechanisms: there are a number of mechanisms that allow for the network to adaptively change, stay the same in the face of pressure to change, or function in some way. These mechanisms include robustness, the identification of switches and oscillators, and the emergence of self-organized criticality among the interacting nodes. Papers representing this theme may be found in .
The anatomy of a forest fire's spread, from a network perspective.
B) nestedness, community detection, and clustering: Along with the concept of core-periphery organization, these properties may or may not exist in a heterogeneous network. But such techniques allow us to partition a network into subnetworks (modules) that may operate with a certain degree of independence. Papers representing this theme may be found in .
C) multilevel networks: even in the case of social networks, each "node" can represent a number of parallel processes. For example, while a single organism possesses both a genotype and a phenotype, the correlational structure for genotypic and phenotypic interactions may not always be identical. To solve this problem, a bipartite (two independent) graph structure may be used to represent different properties of the population of interest. While this is just a simple example, multilevel networks have been used creatively to attack a number of problems .
D) cascades, contagions: the diffusion of information in a network can be described in a number of ways. While the common metaphor of "spreading" may be sufficient in homogeneous networks, it may be insufficient to describe more complex processes. Cascades occur when transmission is sustained beyond first-order interactions. In a social network, messages that get passed to a friend of a friend of a friend (e.g. third-order interactions) illustrate the potential of the network topology to enable cascades. Papers representing this theme may be found in .
E) hybrid models: as my talk demonstrates, the power and potential of complex networks can be extended to other models. For example, the theoretical "nodes" in a complex network can be represented as dynamic entities. Aside from real-world data, this can be achieved using point processes, genetic algorithms, or cellular automata. One theme I detected in some of the talks was the potential for a game-theoretic approach, while others involved using Google searches and social media activity to predict markets and disease outbreaks .
Here is a map of connectivity across three social media platforms: Facebook, Twitter, and Mashable. COURTESY: Figure 13 in .
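The cascade idea in theme D can be sketched with a simple transmission rule. This is a minimal sketch assuming an independent-cascade style process, in which each newly informed node gets one chance to pass the message to each neighbour with probability `p`. The toy friendship network, the names in it, and the value of `p` are hypothetical.

```python
import random
from collections import deque

def cascade(adj, seed_node, p, rng):
    """Spread a message outward from seed_node.

    Returns a dict mapping each reached node to its order of interaction
    (1 = friend, 2 = friend of a friend, and so on).
    """
    order = {seed_node: 0}
    queue = deque([seed_node])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in order and rng.random() < p:
                order[v] = order[u] + 1
                queue.append(v)
    return order

# Hypothetical friendship network.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}
reached = cascade(friends, "ann", p=0.7, rng=random.Random(2))
depth = max(reached.values())
print(f"reached {len(reached)} of {len(friends)} people, deepest order: {depth}")
```

A run counts as a cascade in the sense used above when the deepest order exceeds one, meaning that the message travelled past the sender's immediate contacts.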
Here is the abstract and presentation. The talk centered around a convolution architecture, my term for a small-scale physical flow diagram that can be evolved to yield not-so-efficient (e.g. sub-optimal) biological processes. These architectures can be embedded into larger, more complex networks as subnetworks (in a manner analogous to functional modules in gene-gene interaction or gene regulatory networks).
One person at the conference noted that this had strong parallels with the book “Plausibility of Life” (excerpts here) by Marc Kirschner and John Gerhart. Indeed, this book served as inspiration for the original paper and current talk.
 In practice, "nodes" can represent anything discrete, from people to cities to genes and proteins. For an example from brain science, please see: Stanley, M.L., Moussa, M.N., Paolini, B.M., Lyday, R.G., Burdette, J.H. and Laurienti, P.J. Defining nodes in complex brain networks. Frontiers in Computational Neuroscience, doi:10.3389/fncom.2013.00169 (2013).
The potential power of this phenomenon (the opportunity to identify and exploit weak ties in a network) was advanced by the sociologist Mark Granovetter: Granovetter, M. The Strength of Weak Ties: A Network Theory Revisited. Sociological Theory, 1, 201–233 (1983).
The small-world network topology (the Watts-Strogatz model), which embodies the "six degrees" principle, was proposed in the following paper: Watts, D. J. and Strogatz, S. H. Collective dynamics of 'small-world' networks. Nature, 393(6684), 440–442 (1998).
A scale-free network can be defined as a network with no characteristic number of connections across all nodes. Connectivity tends to scale with growth in the number of nodes and/or edges. Whereas connectivity in a random network can be characterized using a Gaussian (e.g. normal) distribution, connectivity in a scale-free network can be characterized using a power law (e.g. heavy-tailed) distribution.
Small-world networks are defined by their hierarchical (e.g. strongly heterogeneous) structure and a short path length across the network. This is a special case of the more general scale-free pattern, and can be characterized with a strong power law (e.g. the distribution has a thicker tail). Because any one node can reach any other node in a relatively small number of steps, there are a number of organizational consequences to this type of configuration.
Here are two foundational papers on network science [a, b] and two enlightening primers on complexity and network science [c, d]:
[a] Albert, R. and Barabasi, A-L. Statistical mechanics of complex networks. Reviews of Modern Physics, 74, 47–97 (2002).
Here are the latest features cross-posted to Tumbld Thoughts. A cornucopia of themes, from a model of pure speculation (I), to a new paper and reflections on the diversity of life-history strategies to aging across 46 species (II), and the human actions and reactionary tendencies that result from massive cultural change (III). Also featured is an update on biases inherent in the peer-review process (IV). So let's get started.
I. Pure (or Applied) Speculation
Here is an interesting model of predicting the future, courtesy of Anthony Dunne and Stuart Candy. In the book "Speculative Everything", Dunne and co-author present a design-centered vision for predicting the future .
Using the prismatic spectrum metaphor (my coinage), the future is understood as an extension of the present, with progressively more and less likely outcomes. The "preferred futures" fall between the most likely and the most promising potentials.
II. What happens when you combine phylogeny, demography, meta-analysis, and life-history?
Apparently, yes it does.....
The top picture is from a new paper  that combines a phylogenetic perspective with demography (quasi-phylodemography) to look at variation in aging across the life-history of 46 species. A summary and set of insights from Phenomenon blog can be found in .
By compiling data from multiple sources and conducting a meta-analysis, the authors of  found that life history trends for fertility, mortality, and survivorship vary widely both cross-culturally (in humans) and across the tree of life.
To make sense of this diversity, the authors of  propose a fast-slow continuum of senescence: from populations with a short-lived, early reproductive period to populations with a long-lived, extended reproductive period. These results can be compared with the review in , which presents the standard view of why we age, circa 2000.
These notions of progress and reaction are largely based on human value systems, as Scott Alexander points out. While the reactionary would argue that turning away from traditional cultural value systems leads to economic ruin and rampant crime, the data show the opposite.
To get right to the point of this argument (so to speak), go to Section 3.3 (then where does progress come from). There you will find data from the World Values Survey, where the so-called "vanguard" countries (in terms of growth and safety) possess high levels of both secular-rational and self-expression values .
IV. Herding in peer-review
With significant apologies to Gary Larson and the scientific community. Please read on....
Now there is a new paper in Nature  that discusses the phenomenon of herding in peer-review and evaluation. Here, the authors use a Bayesian (as opposed to a signal detection) statistical model to describe what happens when reviewers converge upon a misclassification (e.g. rejecting a paper with solid conclusions and methods). They call for the inclusion of subjectivity in the decision-making criterion: subjective decisions are those that include assessing both the strength of a reviewer's agreement with the conclusions and more conventional features of the manuscript (e.g. strength of the premises and methods employed).
In the graphs above, two scenarios are compared: M1 (which is the subjective strategy) and M2 (which is a purely objective strategy). The authors claim that the M1 strategy prevents so-called herding and promotes a more unbiased outcome.
Here is the latest installment of assorted features from my micro-blog, Tumbld Thoughts. These include heuristic-based prediction of the future, system shock and recovery (human edition), and exposure vs. prestige. More from the intersection of human culture, technology, and complexity theory.
Heuristic-based Prediction of the Future
How do you predict the future? How does anyone predict the future? Perhaps they use heuristics such as the extrapolation of current trends, gradualistic change, or stasis in human value systems (see the "future prediction heuristics GUI", top picture). Here are two attempts at future approximation from academia and the technology industry, respectively.
"We have tended to see the professor as a single figure, but he is now a multiple being of many types, tasks, and positions". Circa 2013.
The article in  is a counter to the common argument that academia has undergone a period of "deskilling". Here, the author thinks sociological differentiation rather than deskilling is at the root of institutional change, and that trend will continue into the future.
"Do our computer pundits lack all common sense? The truth in no online database will replace your daily newspaper, no CD-ROM can take the place of a competent teacher and no computer network will change the way government works". Circa 1995.
The article in  is a retro look at critiques of the internet, circa 1995. The context for this critique was set against the unbridled optimism of what the internet would change in society. And even though many of the changes deemed too unrealistic actually came to pass, not all of them unfolded in the same way people expected them to in 1995 .
Systemic Shock and Recovery: human edition
For the end of the 2013 Hurricane season, I provide some storm-related free association. The first picture is about what tends to happen socio-economically in the aftermath of a hurricane. This was inspired not only by the aftermath of Typhoon Haiyan , but also by the response of fault-tolerant computer systems.
The picture above shows a summary  of all Hurricane tracks in the Atlantic during the 2013 season. This season was fairly quiet, with no major storms and a relatively small number of landfalls.
Exposure vs. Prestige
Randomly sampling Wikipedia entries and then using them to predict h-index scores may mean nothing. This is a play on the title of , but serves as a good, one-sentence critique of .
The authors of  suggest that personal profiles of scientists on Wikipedia should correspond with scientific impact (measured using the h-index). If they do not, then it suggests that Wikipedia is the source of distortion, artificially giving attention to lesser mortals (as it were).
However, this assumes two things: that the properties of Wikipedia entries should reflect the scoring of citation indices, and that random samples of Wikipedia entries will correspond to the distributions of h-index values.
The first assumption is only valid if h-indices capture all possible information about scientific impact. Clearly, this is not the case, as many different indices have been developed  to characterize the various nuances inherent in scientific output and influence.
The authors of  present a systematic review of various citation indices. Importantly, none of them produces a normal distribution centered around a mean. So when the mean h-index value of the Wikipedia sample is compared to the h-indices of different scientific fields, it does not mean as much as one would assume at first glance.
This brings us to the second assumption, which regards the underlying distribution of scientific impact. While this is not clearly discussed in , we know from other studies that scientific impact can be explained using Lotka's Law (which can be characterized using a Pareto distribution).
While this long tail can be mitigated using specialized metrics such as the x-index, this was not considered in . In fact, one could argue that Wikipedia profiles and citation indices are statistically independent of one another.
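To see why a mean is a poor summary of a Pareto-like quantity, here is a minimal sketch that draws hypothetical "impact" scores from a Pareto distribution and compares the mean with the median. The shape parameter `alpha=1.5` and the sample size are assumptions chosen for illustration, not estimates from any citation data.

```python
import random
from statistics import mean, median

def sample_impact(n, alpha=1.5, seed=0):
    """Draw n hypothetical impact scores from a Pareto distribution."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]

scores = sample_impact(10000)
average = mean(scores)
middle = median(scores)
below_average = sum(1 for s in scores if s < average) / len(scores)

print(f"mean = {average:.2f}, median = {middle:.2f}")
print(f"share of scores below the mean: {below_average:.0%}")
```

Because a handful of very large values pull the mean upward, most of the sample sits below it, so comparing group means says little about the typical member of either group.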