August 27, 2013

Evolutionary Models from the Reading Queue

As my reading queue is always bigger than my attentional throughput, I have decided that when I hit upon a theme, I will blog about the papers involved. In this post, I offer a guided tour, aimed at a general audience, of three instances of evolutionary modeling drawn from recent papers in PLoS Computational Biology and PLoS One.

What is an evolutionary model? Since evolution is hard to observe in most cases, we require models to fully appreciate what it means to evolve. While we could merely observe extant organisms and ruminate on the adaptive significance of specific traits, a more complete picture can be gathered from fossils [1] and comparative anatomy. Yet these are static evolutionary models -- static in the sense that all change (i.e. the dynamics) is inferred from the observed data.

Can we improve upon this? The evolutionary analysis of genomes and phylogenetic simulations are quasi-static models -- models that work with more information but still rely significantly on dynamic inference. Not that there's anything wrong with that. But we could also simulate the organisms and conditions under which evolution operates. This can provide us with general principles that can corroborate (or in some cases reinterpret) our inferential endeavors.

What is an evolutionary model? Here are some static evolutionary models (a.k.a. fossils). Pictures from the Florida Museum of Natural History (FLMNH) Hall of Fossils.

Using mathematical and computational techniques, models can be used to support hypotheses and ideas related to evolutionary acceleration, complexity, and tradeoffs. While the models may look nothing like the traditional forms of evidence, they may nevertheless provide insight into the fundamental mechanisms of evolution.
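To make the idea of simulating "the organisms and conditions under which evolution operates" a little more concrete, here is a minimal sketch of a mutation-and-selection loop. It is a toy of my own construction, not taken from any of the papers below: the bit-string genomes, the count-the-ones fitness function, and all parameter values are assumptions chosen purely for illustration.

```python
import random

def fitness(genome):
    """Toy fitness (an assumption): the number of 1s in a binary genome."""
    return sum(genome)

def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]

def evolve(pop_size=50, genome_len=20, generations=100, rate=0.02, seed=0):
    """Return the best fitness observed in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Selection: keep the fitter half as parents (truncation selection).
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        history.append(fitness(parents[0]))
        # Reproduction: parents survive unchanged; offspring are mutated copies.
        offspring = [mutate(rng.choice(parents), rate, rng)
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
    return history

history = evolve()
print("best fitness, first vs. last generation:", history[0], history[-1])
```

Because the parents are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next. Real evolutionary models relax that kind of convenience in various ways.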

1) Evolutionary "acceleration": In "Epigenetic Feedback Regulation Accelerates Adaptation and Evolution" [2], the authors use simplified gene regulatory network models with epigenetic inputs to model a phenomenon called epigenetic feedback regulation (EFR). This is done by modeling three scenarios using differential equations: noise-driven adaptation with EFR, EFR without growth dependency, and EFR under evolution.

Gene expression dynamics as represented in [2]. A dynamic model, as opposed to a fossil.

In the first case (noise-driven adaptation with EFR), a property called growth is tied explicitly to gene expression patterns (a direct result of EFR). The population of networks initially exhibits itinerant growth dynamics (e.g. fast, then slow, then fast again). Over time, the population settles into a high growth regime.

When growth and expression patterns are decoupled (made independent of one another), active growth can be achieved without gene expression being distributed across too many attractor basins. Put another way, gene expression will tend to be more coherent and can be synchronized across the network.

Finally, in the case of EFR under evolution, populations of networks are evolved and compared with networks that are generated randomly. Evolution allows a greater proportion of networks to exhibit a steady growth rate (Figure 1), which suggests that the effects of EFR seen in the non-evolutionary cases also play a role in evolution with natural selection.

Figure 1. Figure 6 from [2]. Green function represents the effects of evolution over randomly-assembled networks.
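For readers who want a feel for what "gene expression dynamics modeled with differential equations" looks like in practice, the following is a generic sketch. To be clear, it is not the model from [2]: the equation dx_i/dt = tanh(sum_j J_ij x_j) - x_i, the random weight matrix J, the simple Euler integration, and every parameter value are assumptions of mine, and the sketch includes neither noise, growth, nor epigenetic feedback.

```python
import math
import random

def simulate_network(n_genes=5, steps=2000, dt=0.05, seed=1):
    """Euler-integrate a toy regulatory network; return the list of states."""
    rng = random.Random(seed)
    # Hypothetical regulatory weights: J[i][j] is the effect of gene j on gene i.
    J = [[rng.uniform(-2.0, 2.0) for _ in range(n_genes)]
         for _ in range(n_genes)]
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
    trajectory = [x]
    for _ in range(steps):
        # Total regulatory input received by each gene.
        inputs = [sum(J[i][j] * x[j] for j in range(n_genes))
                  for i in range(n_genes)]
        # Euler step for dx_i/dt = tanh(input_i) - x_i.
        x = [xi + dt * (math.tanh(u) - xi) for xi, u in zip(x, inputs)]
        trajectory.append(x)
    return trajectory

trajectory = simulate_network()
print("final expression levels:", [round(v, 3) for v in trajectory[-1]])
```

Plotting each gene's value over time shows whether a given random network settles into a steady expression pattern or keeps wandering, which is roughly the sense in which the text above speaks of attractors and itinerant dynamics.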

2) Evolutionary increases in complexity: In "The Minimal Complexity of Adapting Agents Increases with Fitness" [3], the authors use an animat (artificial life agent) brain to model the hypothesis that the complexity of a population increases over evolutionary time. This study is based on the premise that adaptation is a multivariate process which occurs at multiple time-scales. So during the course of evolution we should expect some traits in a lineage to evolve faster than others.

We should also expect a generalized increase in complexity, as this differential adaptation results in more moving parts (so to speak). In this case, the investigation is restricted to an animat population's neural complement (Figure 2, top), which is represented using twelve binary variables (similar in composition to phylogenetic character states). The animat population is evolved over 60,000 generations. The resulting complexity is evaluated statistically using mutual information and an integrated information measure (Figure 2, bottom) related to Giulio Tononi's Phi parameter.

Figure 2. Figures 2 (top) and 3 (bottom) from [3]. Animat architecture (top) and fitness measured against complexity (as characterized by mutual information) over evolutionary time (bottom).
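Mutual information, one of the two complexity measures mentioned above, is simple enough to compute by hand for binary variables. The sketch below estimates it from two observed sequences; the "sensor" and "motor" sequences are made-up data for illustration, and this is plain pairwise mutual information, not the integrated information measure used in [3].

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    px = Counter(xs)            # marginal counts for the first variable
    py = Counter(ys)            # marginal counts for the second variable
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical binary states recorded over eight time steps.
sensor = [0, 1, 0, 1, 1, 0, 0, 1]
motor_copy = list(sensor)                   # perfectly informative
motor_unrelated = [0, 0, 1, 1, 0, 0, 1, 1]  # statistically independent here

mi_copy = mutual_information(sensor, motor_copy)
mi_unrelated = mutual_information(sensor, motor_unrelated)
print(mi_copy, mi_unrelated)
```

A motor unit that simply copies the sensor shares one full bit with it, while the unrelated sequence shares none.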

3) Evolutionary tradeoffs: In "Evolutionary Tradeoffs between Economy and Effectiveness in Biological Homeostasis Systems" [4], a multi-task (e.g. Pareto) optimization approach is used to bridge the effectiveness and economy of an evolving physiological system. Effectiveness refers to functional coherence during the performance of a task, while economy refers to doing things like repair or investment without incurring a high fitness cost.

For the uninitiated, Pareto optimality is a situation in which performance on one task cannot be improved without sacrificing performance on the other. Assuming Pareto optimality is a possible condition, this leads to a set of best compromises between the two tasks (Figure 3). This has relevance to the function of homeostatic (e.g. regulatory) mechanisms, and potentially the evolvability and adaptability of these mechanisms.

Figure 3. Figure 4 from [4]. The relationship between effectiveness (a) and economy (b), expressed as points along a Pareto front (black function, c).
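The "set of best compromises" can be picked out of a collection of candidates with a few lines of code. In the sketch below, each candidate is scored on two tasks where higher is better; the (effectiveness, economy) scores are invented for illustration and do not come from [4].

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on every task and strictly better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates (higher is better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (effectiveness, economy) scores for six candidate phenotypes.
phenotypes = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8),
              (0.5, 0.5), (0.3, 0.3), (0.1, 0.95)]

front = pareto_front(phenotypes)
print(front)  # [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.1, 0.95)]
```

The two candidates left out, (0.5, 0.5) and (0.3, 0.3), are each beaten on both tasks by (0.7, 0.6), so there is no reason to prefer them. Every point on the front, by contrast, can only gain on one task by giving something up on the other.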

For the skeptics who don't see the relevance of these papers to evolutionary science, I should point out that these models are not intended to mimic real biology or actual organisms. In this sense, the models above might be viewed as useless curiosities. However, modeling is not about fully replicating biology. Rather, good models should approximate key parameters (e.g. those that explain the most variance) in a process.

Perhaps ironically, the best biological models might actually be considered "false" models. The intentional use of false models [5] has a significant history in the modeling of biological complexity. These false models (according to [5] there are seven types) include those with the following attributes:

1) Models that have very local applicability. While they lack generalizability, they do describe local phenomena well. This type of model might be employed to understand unique phenomena, and can be contrasted with models that overly idealize nature (next point).

2) Models that are an idealization of nature. Neural Network models fall into this category. The only properties of the brain that matter are neurons, their connections, and a mechanism for excitability. Ignoring all other complexity in the brain still gives us a somewhat-useful model of cognition.

3) Models that are incomplete but causally relevant. If a tree falls in the woods because it has been struck by lightning and its innards consumed by termites, and if you only observe the tree falling down in a slight breeze, you would conclude that the slight breeze caused the tree to collapse. One causal factor (and in some cases an important one), but not the entire story [6].

4) Models that intentionally mis-describe interactions between variables (e.g. spurious context independence, reductionist bias). In the service of seeking causality, important (and often critical) interactions between variables are overlooked. In reductionist science, the focus on one or two variables (e.g. finding a gene responsible for x) in the face of great complexity is another version of this point. While these few variables may describe much of the variance, oftentimes they do not.

5) Models that are fundamentally wrong-headed descriptions of nature. This becomes an issue when models adopted for the first four points are greatly successful, and their adoption/use becomes self-perpetuating. Intuitive or naive models (e.g. models that sound consistent with intuition but are not supported by evidence) also fall into this category.

6) Models that are purely phenomenological in nature (e.g. genome annotation). While this type of model is useful for understanding the structure of a problem, it is hard to elucidate function using the same model. This is of course true when a model lacks predictive power. However, since our knowledge of most complex systems is incomplete, purely phenomenological models are in essence false (but often useful).

7) Models that fail to describe or predict the data correctly. The use of curve-fitting techniques and characteristic functions falls into this category. While characteristic functions are useful approximations of mean behavior, they do not describe the natural variation well. While much effort is put into outlier detection, less effort is put into understanding the relative significance of outlier data points.

In conclusion, these examples also reveal two things about evolutionary models:

* Any single model cannot be an all-purpose tool. As shown in the examples above, a single model might be very good at modeling a specific phenomenon (e.g. the relationship between gene regulation and adaptation), but not at all relevant to other aspects of evolution (e.g. evolutionary divergence).

* Dynamic models, like fossils (i.e. static models), are incomplete. This does not imply a fault in one line of evidence or another. Rather, it is suggestive of their cooperative role in our understanding of evolutionary processes.


[1] For more information on how fossils can be used as evolutionary models, please see the GB3D Fossil Database.

[2] Furusawa, C. and Kaneko, K. Epigenetic Feedback Regulation Accelerates Adaptation and Evolution. PLoS One, 8(5), e61251 (2013).

[3] Joshi, N.J., Tononi, G., and Koch, C. The Minimal Complexity of Adapting Agents Increases with Fitness. PLoS Computational Biology, 9(7), e1003111 (2013).

[4] Szekely, P., Sheftel, H., Mayo, A., and Alon, U. Evolutionary Tradeoffs between Economy and Effectiveness in Biological Homeostasis Systems. PLoS Computational Biology, 9(8), e1003163 (2013).

[5] Wimsatt, W. False Models as a Means to Truer Theories. Chapter 2 in Nitecki, M.H. and Hoffman, A. Neutral Models in Biology. Oxford, New York (1987).

[6] Nielsen, M. If correlation doesn't imply causation, then what does? Data-driven Intelligence (DDI) blog, January 23 (2012).

1 comment:

  1. Awesome post. I've attempted a similar classification of mathematical models, but it looks like I should read [5] more closely for inspiration. What bothers me about evolutionary models is how willing they are to rely on non-rigorous conclusions and simulations. I feel like this is preventing the field from developing these models into a rigorous dynamic theory instead of just an eclectic bouquet. Maybe theoretical computer science can help.