August 27, 2012

Degeneracy: a central mechanism in evolution

In this post, I will provide an overview of a concept in evolutionary biology called degeneracy. Degeneracy has been defined by [1, 2] as the capacity of structurally different entities to perform identical functions or yield identical outputs. From a complex systems perspective, degeneracy is closely related to redundancy and robustness [1]. Although it is not cited as often as the other two terms [3], degeneracy is still an important feature of evolved biological systems (see Figure 1). For example, degeneracy may explain the absence of key proteins in up to 30% of healthy patients, or the absence of growth defects in yeast with deleted genes of known functional consequence [1]. It is a prominent property of genetic regulatory networks, as most examples characterized in the literature are tied to gene regulatory function in one way or another. For a wider survey, Edelman and Gally [1] list 22 examples from a range of biological systems (see Figure 2 for how degeneracy fits into the broader context of biological complexity).

Figure 1. A comparison of the number of times each term ("degeneracy", "redundancy", and "robustness") appears in the scientific literature.

Evidence for degeneracy can be found in the existence of multiple routes to a specific physiological function. According to [2], members of the adhesin gene family in yeast can interchangeably perform functional roles when expression is elevated. This is in contrast to lock-and-key systems (e.g. receptor-ligand binding), which interact with a high degree of specificity [4]. Another signature of degeneracy is cross-talk, which is common in gene regulatory and neural pathways. The presence of degenerate pathways (or at least the name for them) suggests a "degeneration" from some previous state. This is at least partially correct: it appears that degenerate relationships involve a generalization of function over evolutionary time. However, this need not result from a mechanism that was originally functionally specialized.

To understand this in context, let us return to the "lock-and-key" model. Lock-and-key systems are highly specialized with regard to function. In fact, if we were to consider only the end product, we might be tempted to conclude a purposeful design. However, if we consider that both the lock AND key have shared evolutionary histories, it becomes more plausible that this arrangement is the product not only of mutation-selection dynamics but of historical contingency as well [5]. Lock-and-key phenomena result from selection for extreme specialization [1]. While this might have a fitness advantage in some contexts, in highly variable environments it is not particularly advantageous. Thus, the historical lock-in [6] that inadvertently results from selection can result in evolutionary dead-ends. What degeneracy provides, then, is a means to either rescue a phenotype from such instances or circumvent them entirely.

There are also potential evolutionary tradeoffs between network functionality and the function of the individual components of this network. One example of this is the role of an individual gene in a genetic regulatory network. Whether degeneracy results from a true evolutionary tradeoff or is a signature of a complex system's emergent properties [8] is not clear. However, Whitacre and Bender [9] modeled biological networks as a complex adaptive system (CAS). Using this approach, robustness was found to result from both diversity and degeneracy. In this case, invariance to perturbation (i.e. robustness) results from there being many possible ways to achieve a common function. Diversity allows for the biological system to move away from the extreme specificity required of the "lock-and-key" model [10], while a degenerate architecture incorporates this diversity into a functional mechanism.
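The idea that robustness follows from having many ways to achieve a common function can be illustrated with a toy knockout experiment. The sketch below is not the model used in [9]; the component names, function labels, and the single-knockout robustness score are all hypothetical choices made for illustration.

```python
def covered_functions(components, knocked_out=frozenset()):
    """Return the set of functions performed by components that are still present."""
    covered = set()
    for name, functions in components.items():
        if name not in knocked_out:
            covered |= functions
    return covered


def single_knockout_robustness(components, required):
    """Fraction of single-component knockouts after which every
    required function is still performed by some remaining component."""
    names = list(components)
    survived = sum(
        required <= covered_functions(components, {name}) for name in names
    )
    return survived / len(names)


# Hypothetical "lock-and-key" system: one specialized component per function.
specialized = {"A": {"f1"}, "B": {"f2"}, "C": {"f3"}}

# Hypothetical degenerate system: structurally different components
# with partially overlapping functions.
degenerate = {"A": {"f1", "f2"}, "B": {"f2", "f3"}, "C": {"f3", "f1"}}

required = {"f1", "f2", "f3"}

print(single_knockout_robustness(specialized, required))  # every knockout loses a function
print(single_knockout_robustness(degenerate, required))   # every knockout is tolerated
```

In this toy setting, the specialized system fails after any knockout, while the degenerate system tolerates all of them because each function can be reached through more than one component.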

Figure 2. A schematic showing the relationship between degeneracy and the related concepts of complexity, robustness, and evolvability. Adapted from Figure 1 in [7].

Tononi, Sporns, and Edelman [11] have proposed ways to quantify degeneracy in biological networks. The quantification is based on the notion that redundancy and degeneracy stand in contrast to all outputs of a system (e.g. gene network) being statistically independent of one another. The concept of mutual information [12] is used to quantify the degree to which outputs share a functional role. If two or more outputs share information, the related functions are said to exhibit degeneracy. Another quantitative approach is to think of degenerate biological systems as degenerate sets of overlapping functions [1, 13]. Given its mathematical similarity to phylogenetic theory, such an approach might reveal new insights into convergent evolution.
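As a rough illustration of the building block involved, here is a minimal sketch of mutual information between two discrete output sequences, estimated from observed frequencies. This is plain pairwise mutual information only, not the full degeneracy measure of [11]; the function name and the binarized example outputs are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired observations of two discrete variables."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical binarized outputs of two network elements across four conditions.
shared = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])       # outputs move together
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # outputs are unrelated

print(shared, independent)
```

Outputs that are statistically independent give a value of zero, while outputs that share information give a positive value, which is the contrast the paragraph above relies on.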

What can degeneracy teach us about complex biological systems? One lesson is that in some cases there may be a fitness benefit for maintaining a parallel architecture. While not clearly beneficial in every context, parallelism can perform critical functions such as buffering against mutations or acting as a noise filter [14]. A second lesson is that degeneracy is not always degenerate: far from being a failure of optimization, degeneracy provides a means to incorporate the stuff of evolutionary time (mutation) into a system that does not become reliant on any single pathway (specificity). In this way, degenerate biological systems are often the most adaptable, which means that some outcomes of the evolutionary process can truly be described as "survival of the most degenerate".


[1] Edelman, G.M. and Gally, J.A.   Degeneracy and complexity in biological systems. PNAS, 98(24) 13763–13768 (2001).

[2] Whitacre, J. and Bender, A.   Degeneracy: A design principle for achieving robustness and evolvability. Journal of Theoretical Biology 263 (2010) 143–153.

[3] Data for graph courtesy of PubMed, search data August 27, 2012.

[4] Adami, C.   Reducible Complexity. Science, 312(5770), 61-63 (2006)  AND  Brouat, C., Garcia, N., Andary, C., and McKey, D.   Plant lock and ant key: pairwise coevolution of an exclusion filter in an ant-plant mutualism. Proceedings of the Royal Society of London B, 268, 2131-2141 (2001).

[5] For concept of historical contingency, please see: Swartz, B.A.   On the “Duel” Nature of History: Revisiting Contingency versus Determinism. PLoS Biology, 7(12), e1000259 (2009). AND Fontana, W. and Schuster, P.   Continuity in Evolution: On the Nature of Transitions. Science, 280(5368), 1451-1455 (1998).

[6] References to historical lock-in can be found in: Nelson, R. and Winter, S.   An evolutionary theory of economic change, Harvard University Press, Cambridge, MA (1982). This term is often used in the economics and business literature, but also has relevance to the biological world.

[7] Whitacre, J.M.   Degeneracy: a link between evolvability, robustness and complexity in biological systems. Theoretical Biology and Medical Modeling, 7(6), 6 (2010).

[8] A system with emergent properties produces an output which is greater than the sum of its parts. Weak emergence is a case where the collective effects can be reduced to its individual components, while strong emergence results in collective effects that are irreducible. In many cases, biological evolution can be thought of as resulting from strong emergence.

For information specific to evolution, please see: Blitz, D.   Emergent Evolution: Qualitative Novelty and the Levels of Reality. Kluwer Academic, Dordrecht (1992) AND Bedau, M.A.   Downward causation and autonomy in weak emergence. Principia, 6, 5-50 (2003).

[9] Whitacre, J.M. and Bender, A.   Networked buffering: a basic mechanism for distributed robustness in complex adaptive systems. Theoretical Biology and Medical Modeling, 7, 20 (2010).

* the complex adaptive systems (CAS) approach is a way to model systems of high complexity using a series of interacting agents that originated out of the Santa Fe Institute.

For a general overview, please see: Holland, J.H.   Studying Complex Adaptive Systems. Journal of Systems Science and Complexity, 19, 1–8 (2006) AND Miller, J.H. and Page, S.E.   Complex Adaptive Systems: An Introduction to Computational Models of Social Life. Princeton University Press, Princeton, NJ.

[10] Gomez-Gardenes, J., Moreno, Y., and Floria, L.M.   On the robustness of complex heterogeneous gene expression networks. Biophysical Chemistry, 115, 225-228 (2005).

[11] Tononi, G., Sporns, O., and Edelman, G.   Measures of degeneracy and redundancy in biological networks. PNAS, 96, 3257-3262 (1999). This work was developed for studying brain networks, but theoretically can be applied to a wide range of biological systems, including genetic regulatory networks.

[12] For a general introduction, please see: Latham, P.E. and Roudi, Y.   Mutual information. Scholarpedia, 4(1), 1658 (2009). Link.

[13] I could find no references to this outside of the original paper (Edelman and Gally). I imagine it combines the mathematical concept of degeneracy (when objects change their set membership over time) and conventional set theory.

[14] For mutational and phenotypic buffering, please see: Braendle, C. and Felix, M.A.   Plasticity and Errors of a Robust Developmental System in Different Environments. Developmental Cell, 15(5), 714-724 (2008) AND Rutherford, S.L. and Lindquist, S.   Hsp90 as a capacitor for morphological evolution. Nature, 336-342 (1998).

For noise filtering in a genetic regulatory network, please see: Orrell, D. and Bolouri, H.   Control of internal and external noise in genetic regulatory networks. Journal of Theoretical Biology, 230(3), 301-312 (2004) AND Lestas, I., Paulsson, J., Ross, N.E., and Vinnicombe, G.   Noise in gene regulatory networks. IEEE Transactions on Automatic Control, 53, 189-200 (2008).

** For the latest information on degeneracy research, please see the work of James Whitacre, admin of the Degeneracy and Selection online community. Also, the Wikipedia page for Degeneracy (biology) features a comprehensive bibliography featuring papers from many areas of biology.

August 22, 2012

On Rats (cardiomyocytes) and Jellyfish (bodies)

Here is a recent Nature Biotechnology paper from the researchers at the Wyss Institute (Harvard) and the Dabiri Lab (Caltech) entitled "A tissue-engineered jellyfish with biomimetic propulsion" [1]. The authors of this paper reverse-engineered the essential mechanisms of a muscular pump to create an "artificial" form of jellyfish (Aurelia sp.) called a medusoid [2]. A medusoid (see Figure 1) consists of only a stripped-down version of the jellyfish morphology, replicating only the components needed to approximate jellyfish swimming kinematics [3].

Figure 1. Two examples of a free-swimming medusoid in solution. COURTESY: YouTube video [2].

Once these kinematics were understood, neonatal rat cardiomyocytes [4] were allowed to self-assemble into the desired structure. Cardiomyocytes will spontaneously contract in culture, which enabled a cell population to approximate a nerve net. How did they do it? In this post, we will superficially step through the design process and show how the functional morphology of an organism can be engineered. Figure 2 shows the design process.

Figure 2. Steps in the Medusoid design process. COURTESY: Figure 1 (top) in [1].

The first step was to abstract design principles from observed jellyfish propulsion. This biomimetic approach revealed that motor neurons, striated muscles, and radially-symmetrical appendages are primarily responsible for production of the stroke cycle [5]. The propulsion stroke in Aurelia is produced by two things: a radially symmetric and complete (i.e. with both power and recovery phases) "bell" contraction [6], and the synchronous activity of a distributed set of pacemakers [7]. In addition, muscle fibers in the jellyfish propulsion mechanism were found to be aligned end-to-end, which provides a mechanism for power production. Once these features were understood, the cellular architecture of the muscles and limbs was mapped using a chemical staining technique. This allowed for millimeter-scale organisms to be created. Morphogenesis was guided using structural (extracellular matrix scaffolding) and chemical (microenvironmental) cues, the results of which can be seen in Figure 3.

Figure 3. Results from the design and bioengineering efforts featured in [1]. COURTESY: Figure 1 (bottom) in [1].

To produce a medusoid body, cardiomyocytes were grown on a PDMS (polymer) scaffold. Because of this, there were constraints in terms of morphological compliance (e.g. bending capacity) [8], which is essential for the organism to initiate and complete its stroke. In the jellyfish, cells assemble around a material called mesoglea, which is a soft substrate supported by stiff ribs. This allows for selective rigidity and the signature bell-shaped contraction (see Figure 4 for comparison of contraction dynamics between Aurelia and the engineered organism).

Figure 4. Comparisons of kinematic performance between the jellyfish and medusoid. COURTESY: Figure 2 in [1].

To solve this design constraint, a lobed design was used. This balances stress generation by a cardiomyocyte population with the bending capacity of the substrate. Since reproducing a stroke-related movement identical to a jellyfish was not possible, a movement that involved a quasi-closed bell being formed at maximal contraction was used instead. These kinematics allowed for muscle fibers to be aligned with respect to the main axis of deformation, which allowed both stress production and substrate bending to be simultaneously maximized.

Figure 5. Evaluating the morphology of jellyfish and medusoids using a vortex flow field. COURTESY: Figure 3 in [1].

Finally, fluid-body interactions were characterized in order to fully optimize the medusoid morphology. These interactions are summarized in Figure 5. According to the authors of this study, the method presented here can be used to design any generalized biomechanical pump. Due to the use of cardiomyocytes, there is no ability to produce multi-stage movement behaviors [9]. However, the use of heterogeneous skeletal muscle fiber populations or transgenic muscle fibers engineered with respect to control of contraction speed may allow for more complicated movement behaviors to be reproduced. It will be interesting to see what types of "hybrid" species (part soft robot, part animal) these and other researchers are able to engineer in the future.


[1] Nawroth, J.C., Lee, H., Feinberg, A.W., Ripplinger, C.M., McCain, M.L., Grosberg, A., Dabiri, J.O., and Parker, K.K.   A tissue-engineered jellyfish with biomimetic propulsion. Nature Biotechnology, 30(8), 792-797 (2012).

[2] YouTube video of "Artificial jellyfish made from rat heart", Nature News. Full article at Nature News.

[3] a strategy similar to that used for designing the PETMAN robot from Boston Dynamics (see picture below). YouTube video here.

[4] for use of this cell type as an experimental model, please see: Chlopcikova, S., Psotova, J., and Miketova, P.   Neonatal Rat Cardiomyocytes: a model for the study of morphological, biochemical, and electrophysiological characteristics of the heart. Biomedical Papers, 145(2), 49–55 (2001).

Picture of rat cardiomyocytes stained for tropomyosin. COURTESY: Lonza website.

[5] For information on a thermodynamic cycle, please see this. For information on swimming stroke (in humans), please see this.

[6] a "bell" contraction (where all appendages expand outward in the shape of a bell during muscle contraction) can be seen in the far left-hand panel of Figure 1.

[7] For information on cardiac pacemakers, please see this. For in silico simulation of pacemaker neuron dynamics, please see this tutorial from AnimatLab.

Example of a pacemaker cell from the SA node in the human heart. COURTESY: University of Utah Genetic Science Learning Center.

[8] For more information on how compliant substrates are used in soft robotics, please see: Trivedi, D., Rahn, C.D., Kier, W.M., and Walker, I.D.   Soft robotics: Biological inspiration, state of the art, and future research. Applied Bionics and Biomechanics, 5(3), 99–117 (2008).

[9] While these types of movements (e.g. those associated with feeding or fighting) may require a central nervous system, they could be approximated using a jellyfish-like nerve net model. For more information on multi-stage movements, please see: Tanji, J.   Sequential Organization of Multiple Movements: involvement of cortical motor areas. Annual Review of Neuroscience, 24, 631–651 (2001).

August 15, 2012

Unique states and complex systems

I am reposting these two short, recent paper profiles from my microblog, Tumbld Thoughts.

The first is an interesting set of papers on transient, totipotent-like states in stem cells. Stem cells are defined by their ability to exhibit stable pluripotency, which is the ability to take on the identity of any somatic cell phenotype. In development, totipotent cells, or cells that can form a viable organism on their own, tend to give rise to pluripotent cells. However, these descendants of totipotent cells are also able to take on an intermittent (i.e. less stable) totipotent identity from time to time. The figure above is Figure 1 from [1]. Or you can read the full research report [2].

The second is a new paper on a newly-verified quantum state [3]. The authors of this paper call it "our state", a special type of three-body bound state (also known as Efimov three-body states) [4]. The "our state" involves pair interactions that are not only too weak to bind interacting atoms together, but also try to push these atoms apart. See the paper and Machines Like Us story for more information.


[1] Tischler, J. and Surani, M.A.   A sporadic super state. Nature, 487, 43-45 (2012).

[2] MacFarlan, T.S.   Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature, 487, 57-63 (2012).

[3] For more information, please see: Guevara, N.L., Wang, Y., and Esry, B.D.   New Class of Three-Body States. Phys. Rev. Lett. 108, 213202 (2012).

[4] the authors in question are from Brett Esry's group at Kansas State University, and their current main area of research is 3-body recombination. Visualization of three-body problem is from

August 10, 2012

An epistemology fan walks into a bar, and watches an online course in machine learning.....

Earlier this summer, I viewed the lectures for Andrew Ng's Machine Learning (ML) [1] course offered live as CS229 at Stanford [2]. As a set of introductory lectures, the content was quite dense and for the most part accessible. Depending on who you talk to, there also seems to be quite a bit of hype surrounding the concept of online course offerings. I will come back to this later.

This course is a very useful experience provided you have a fair amount of background in computer science theory and statistics. Because the lectures can be completed at your own pace, you can review key and/or interesting concepts several times over (which I would recommend). My background is in modeling and adaptive computing, with an interest in statistical analysis. So while I found the content to be educational, I also saw opportunities to extend some of the theory to my own interests.

One thing I noticed was how Andrew kept stressing that the industry people (e.g. Silicon Valley types) he visits on occasion quite often misapply the tools and concepts that make ML such a potentially powerful technique. This is especially interesting in light of two recent blog posts by Cosma Shalizi and Cathy O'Neil [3] on the unintentional misapplication and lack of appreciation for the shortcomings of ML models by many data analysts. In this case, the goal is to apply a specific model to a specific problem. However, this is often done without a formal consideration of experimental design or how well the model fits the phenomenon being observed.

The Stanford ML course also addressed the philosophical implications of ML (as opposed to simply learning the practical aspects). There was a subtle emphasis on why ML techniques are implemented in the way that they are [4]. In particular, the lectures on gradient descent and statistical learning were the most enlightening in this regard. However, I believe there is still a niche for a course on the philosophical implications of machine learning techniques, something that teaches "why" rather than "how" we decide to apply a model to a given problem.

The course also featured many demos which provided examples of how ML can be applied to statistical analysis and control problems. Autonomous control seemed to be a favorite topic [5]. One assumption I had going into the course was that normally-distributed (i.e. Gaussian) statistical models are required for training and deploying a predictive ML model. However, it was suggested that various classes of Lagrangian model [6] could also be deployed with reasonable rates of learning. This is an area deserving further investigation......

The whole notion of online learning has been the subject of myriad commentary, blog posts, and media speculation [7]. It is currently in fashion to think of online learning as a highly disruptive technology with regard to higher education: ideally, online learning will eliminate the market inefficiencies of the current higher education pricing model. It is of note that many of the most popular online courses (such as Sebastian Thrun's AI course at Udacity/Stanford) are hosted by "elite" universities, and taught by the same people who write authoritative textbooks about the field they teach. What is the role of online college-level courses and services such as the Khan Academy? I am a tempered optimist, but it is worth noting that hype always surrounds the emergence of new technologies (or, in this case, new ways of delivering a service).

To cut through the hype, a little perspective is in order. The exposure one gets to the field of ML in a class like the Stanford offering is cursory. The courses in the Coursera catalog (Coursera being an online course clearinghouse sponsored by major US universities) are likewise meant to be introductory offerings [8]. Courses such as these are most useful for continuing education, particularly in a fast-moving field like computational science. I think of the ML course (and others like it) as a distributed digital textbook. These courses can certainly open up professional opportunities and expand the mind, but they are not intended to and indeed cannot replace traditional college degree programs.

Rather than putting non-elite CS departments out of business, courses such as this may well provide opportunities for niche course offerings. If basic courses could be provided by online services, the resources of local faculty could be spent on more specialized and esoteric courses geared towards the specific strengths of the institution and faculty.


[1] Machine Learning is a set of techniques and tools that used to fall under the name Artificial Intelligence (AI). While Artificial Intelligence is generally associated by most people with GOFAI, Machine Learning, a subfield of AI, is a more limited attempt to apply advanced statistical techniques to classification and inferential problems.

[2] here is a link to Andrew's course courtesy of Academic Earth.

[3] links to Three-Toed Sloth article and Naked Capitalism article.

[4] since Machine Learning is largely about learning categorization schemes, the "epistemology" of machine learning is a matter of understanding the bases of categorization and learning itself. This might take inspiration from animal/human models, or perhaps models of collective behavior, neither of which are stressed in modern approaches to machine learning.

[5] the Stanford group has built a proof-of-concept autonomous helicopter that has learned to self-operate using reinforcement learning: YouTube video 1, YouTube video 2. For a more general review article (from 2001), please see: "The Roles of Machine Learning in Robust Autonomous Systems" (David Kortenkamp, Proceedings of the AAAI).

[6] there is no wiki page for "Lagrangian Probability Distributions", but suffice it to say they include various non-uniform distributions such as the Poisson and the Exponential. As a reference for understanding the formalisms and minutiae of Lagrangian distributions, I used the book by Consul and Famoye.

[7] here is an article about the potential of Coursera, here is a skeptical take on the quality of online education from Larry Moran at Sandwalk, here is an article about the role of the Khan Academy in a global society, here is an account of John Hawks' d.i.y. experience in online teaching, and here is an article about Peter Thiel's solution to the higher education pricing bubble.

[8] the Coursera catalog can be found here. It includes courses from faculty at Michigan, Stanford, UC Berkeley, Princeton, and other top-tier institutions.


To what does the following picture refer? Hint can be found here.

August 6, 2012

Geohashing overview

This is being reposted from my microblog Tumbld Thoughts.

Concept of the Day: geohashing [1], a method for classifying geographic locations (based on latitude-longitude inputs) from random data [2]. This is used extensively in the context of Google Maps and other mobile applications [3].

Using the geohashing algorithm to find a location in San Francisco, USA. Courtesy [3].
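To make the latitude-longitude classification step concrete, here is a minimal sketch of the widely used Geohash string encoding, which repeatedly halves the longitude and latitude ranges and packs the resulting bits into base-32 characters. Whether the application in [3] uses exactly this variant is an assumption, and this encoding is distinct from the random-data scheme in [2]. The function name and default precision are illustrative.

```python
# Base-32 alphabet used by the Geohash encoding (omits a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, nbits = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)


# Nearby points share a common prefix, which is what makes the
# string useful as a coarse location "bucket".
print(geohash_encode(57.64911, 10.40744, precision=5))
```

Truncating the string gives a larger cell that contains the original one, so comparing prefixes is a cheap way to group locations by proximity.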


[1] picture above is the schematic for a geohashing algorithm.

[2] special thanks to Randall Munroe and his brilliant xkcd comic series.

[3] for more information, please see the following reference: Niccolai, M. and Slatkin, B. 24 hours in SF: A Geolocation App. Google Developers App, February 04, 2009. See also a Stack Overflow discussion thread on geohashing.