August 31, 2014

Godel's Revenge: All-Encompassing Formalisms vs. Incomplete Formalisms

This content is cross-posted to Tumbld Thoughts. A loosely-formed story in two parts about the pros and cons of predicting the outcome of and otherwise controlling complex sociocultural systems. Kurt Godel is sitting in the afterlife cafe right now, scoffing but also watching with great interest.

I. It's an All-encompassing, Self-regulation, Charlie Brown!

Here is a video [1] by the complexity theorist Dirk Helbing about the possibility of a self-regulating society. Essentially, combining big data with the principles of complexity would allow us to solve previously intractable problems [2]. This includes more effective management of everything from massively parallel collective behaviors to very rare events.

But controlling how big data is used can keep us from getting into trouble as well. Writing at the Gigaom blog, Derrick Harris argues that the potentially catastrophic effects of AI taking over society (the downside of the singularity) can be avoided by keeping key data away from such systems [3]. In this case, even hyper-complex AI systems based on deep learning can become positively self-regulating.


[2] For a cursory review of algorithmic regulation, please see: Morozov, E.   The rise of data and the death of politics. The Guardian, July 19 (2014).

For a discussion as to why governmental regulation is a wicked problem and how algorithmic approaches might be inherently unworkable, please see: McCormick, T.   A brief exchange with Tim O’Reilly about “algorithmic regulation”. Tim McCormick blog, February 15 (2014).

[3] Harris, D.   When data become dangerous: why Elon Musk is right and wrong about AI. Gigaom blog, August 4 (2014).

II. Arguing Past Each Other Using Mathematical Formalisms

Here are a few papers on argumentation, game theory, and culture. My notes are below each set of citations. A good reading list (short but dense) nonetheless.

Brandenburger, A. and Keisler, H.J.   An Impossibility Theorem on Beliefs in Games. Studia Logica, 84(2), 211-240 (2006).

* shows that any two-player game is embedded in a system of reflexive, meta-cognitive beliefs. Players not only model payoffs that maximize their utility, but also model the beliefs of the other player. The resulting "belief model" cannot be completely self-consistent: beliefs about beliefs have holes which serve as sources of logical incompleteness.

What is Russell's Paradox? Scientific American, August 17 (1998).

* introduction to a logical paradox which can be resolved by distinguishing between sets and sets that describe sets using a hierarchical classification method. This paradox is the basis for the Brandenburger and Keisler paper.
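For readers who want the paradox in symbols, here is the standard textbook statement (my notation, not taken from the Scientific American piece). Define R as the set of all sets that are not members of themselves; asking whether R belongs to itself then yields a contradiction:

```latex
R = \{\, x \mid x \notin x \,\}
\quad\Longrightarrow\quad
\bigl( R \in R \iff R \notin R \bigr)
```

A hierarchical (typed) classification avoids this by only allowing membership statements between a set and objects of a lower level, so the expression "x is a member of x" can never be formed.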

Mercier, H. and Sperber, D.   Why do humans reason? Arguments for an argumentative theory. Behavioral and Brain Sciences, 34, 57-111 (2011).

Oaksford, M.   Normativity, interpretation, and Bayesian models. Frontiers in Psychology, 5, 332 (2014).

* a new-ish take on culture and cognition called argumentation theory. Rather than reasoning to maximize individual utility, reasoning is done to maximize argumentative context. This includes decision-making that optimizes ideational consistency. This theory predicts phenomena such as epistemic closure, and might be thought of as a postmodern version of rational agent theory.

There also seems to be an underlying connection between the "holes" in a culturally-specific argument and the phenomenon of conceptual blending, but that is a topic for a future post.

August 26, 2014

Fireside Science: Fun with F1000: publish it and the peers will come

This content is cross-posted to Fireside Science. Please also see the update before the notes section.

For the last several months, I have been working on a paper called "Animal-oriented Virtual Environments: illusion, dilation, and discovery" [1] that is now published at F1000 Research (also available as a pre-print at PeerJ). This is a paper that has gone through several iterations, from a short 1800-word piece (first draft) to a full-length article. This includes several stages of editor-driven peer review [2], and took approximately nine months. Because of its speculative nature, this paper could be an excellent candidate for testing out this review method.

The paper is now live at F1000 Research.

Evolution of a research paper. The manuscript has been hosted at PeerJ Preprints since Draft 2.

F1000 Research uses a method of peer-review called post-publication peer review. For those who are not aware, F1000 approaches peer-review in two steps: the submission and approval by an editor stage, and the publication and review by selected peer stage. Let's walk through these.

The first step is to submit an article. Some articles (data-driven) are published to the website immediately. However, for position pieces and theoretically-driven articles such as this one, a developmental editor is consulted to provide pre-publication feedback. This helps to tighten the arguments for the next stage: post-publication peer review.

The next stage is to garner comments and reviews from other academics and the public (likely unsolicited academics). While this might take some time, the reviews (edited for relevance and brevity) will appear alongside the paper. The paper's "success" will then be judged on those comments. No matter what the peer reviewers have to say, however, the paper will be citable in perpetuity and might well have a very different life in terms of its citation index.

Why would we want to have such an alternative available to us? Such alternative forms of peer review and evaluation can both open up the scope of the scientific debate and resolve some of the vagaries of conventional peer review [3]. This is not to say that we should strive towards the "fair-and-balanced" approach of journalistic myth. Rather, it is a recognition that scientists do a lot of work (e.g. peer review, negative results, conceptual formulation) that either falls through the cracks or does not get made public. Alternative approaches such as post-publication peer review are an attempt to remedy that, and as a consequence also serve to enhance the scientific approach.

COURTESY: Figure from [5].

The rise of social media and digital technologies has also changed the need for new scientific dissemination tools. While traditional scientific discovery operates at a relatively long time-scale [6], science communication and inspiration do not. Using an open science approach will effectively open up the scientific process, both in terms of new perspectives from the community and insights that arise purely from interactions with colleagues [7].

One proposed model of multi-staged peer review. COURTESY: Figure 1 in [8].

UPDATE: 9/2/2014:
I received an e-mail from the staff at F1000Research in appreciation of this post. They also wanted me to make the following points about their version of post-publication peer review a bit more clear. So, to make sure this process is not misrepresented, here are the major features of the F1000 approach in bullet-point form:

* input from the developmental editors is usually fairly brief. This involves checking for coherence and sentence structure. The developmental process is substantial only when a paper requires additional feedback before publication.

* most papers, regardless of article type, are published within a week to 10 days of initial submission.  

* the peer reviewing process is strictly by invitation only, and only reports from the invited reviewers contribute to what is indexed along with the article. 

* commenting from scientists with institutional email addresses is also allowed. However, these comments do not affect whether or not the article passes the peer review threshold (e.g. two "acceptable" or "positive" reviews).  

[1] Alicea B.   Animal-oriented virtual environments: illusion, dilation, and discovery [v1; ref status: awaiting peer review,] F1000Research 2014, 3:202 (doi: 10.12688/f1000research.3557.1).

This paper was the derivative of a Nature Reviews Neuroscience paper and several popular press interviews [a, b] that resulted.

[2] Aside from an in-house editor at F1000, Corey Bohil (a colleague from my time at the MIND Lab) was also gracious enough to read through and offer commentary.

[3] Hunter, J.   Post-publication peer review: opening up scientific conversation. Frontiers in Computational Neuroscience, doi: 10.3389/fncom.2012.00063 (2012) AND Tscheke, T.   New Frontiers in Open Access Publishing. SlideShare, October 22 (2013) AND Torkar, M.   Whose decision is it anyway? F1000 Research blog, August 4 (2014).

[4] By opening up peer review and manuscript publication, scientific discovery might become more piecemeal, with smaller discoveries and curiosities (and even negative results) getting their due. This will produce a richer and more nuanced picture of any given research endeavor.

[5] Mandavilli, A.   Trial by Twitter. Nature, 469, 286-287 (2011).

[6] One high-profile "discovery" (even based on flashes of brilliance) can take anywhere from years to decades, with a substantial period of interpersonal peer-review. Most scientists keep a lab notebook (or some other set of records) that documents many of these "pers.comm." interactions.

[7] Sometimes, venues like F1000 can be used to feature attempts at replicating high-profile studies (such as the Stimulus-triggered Acquisition of Pluripotency (STAP) paper, which was published and retracted at Nature within a span of five months).

August 22, 2014

Six Degrees of the Alpha Male: breeding networks to understand population structure

This post is part of a continuing series on ways to think more deeply about human biological diversity. In last month's post (One Evolutionary Trajectory, Many Processes), I discussed how dual-process models (such as the DIT model) might be used to add a new dimension to more traditional studies of population genetics. That example did not spend much time on the specifics of what such a model would look like. Nevertheless, a dual-process model provides a broader view of the evolutionary process, particularly for highly social (and cultural) species like humans.

In this post, I will lay out another idea briefly mentioned in the "Long Game of Human Biological Variation" post. This involves the use of complex network theory to model the nature of structure in populations. To review, the null hypothesis (e.g. no structure) is generally modeled using an assumption of panmixia [1]. In this conception, structure emerges from interactions between individuals and demes (semi-isolated breeding populations). Thus, a deviation from the null model involves the generation of structure via selective breeding, reproductive isolation, or some other mechanism.
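To make the null-versus-structure contrast concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a model from the cited literature: two equal-sized demes, no distinction between sexes, and a hypothetical parameter p_within that sets how often a mate is drawn from the same deme. The summary statistic is simply the fraction of breeding pairs that cross deme boundaries.

```python
import random

def cross_deme_fraction(pairs, deme_of):
    """Fraction of breeding pairs whose members come from different demes."""
    cross = sum(1 for a, b in pairs if deme_of[a] != deme_of[b])
    return cross / len(pairs)

def panmictic_pairs(individuals, n_pairs, rng):
    """Null model: every pair of distinct individuals is equally likely to breed."""
    return [tuple(rng.sample(individuals, 2)) for _ in range(n_pairs)]

def structured_pairs(individuals, deme_of, n_pairs, p_within, rng):
    """Alternative: with probability p_within the mate comes from the same deme."""
    pairs = []
    for _ in range(n_pairs):
        a = rng.choice(individuals)
        same = [x for x in individuals if x != a and deme_of[x] == deme_of[a]]
        other = [x for x in individuals if deme_of[x] != deme_of[a]]
        pool = same if rng.random() < p_within else other
        pairs.append((a, rng.choice(pool)))
    return pairs

rng = random.Random(0)
individuals = list(range(100))
deme_of = {i: i % 2 for i in individuals}  # two equal-sized demes (hypothetical)

null_fraction = cross_deme_fraction(panmictic_pairs(individuals, 2000, rng), deme_of)
structured_fraction = cross_deme_fraction(
    structured_pairs(individuals, deme_of, 2000, 0.9, rng), deme_of
)
print(null_fraction, structured_fraction)
```

Under panmixia roughly half of all pairings should cross the deme boundary, while strong within-deme preference pushes that fraction towards zero. The gap between the two numbers is one crude way to express "deviation from the null model".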

One way to view these types of population dynamics is to use a population genetics model such as the one I just described. However, we can also use complex network theory to better understand how populations evolve, particularly when populations are suspected to deviate from the null expectation [2]. Complex networks provide us with a means to statistically characterize the interactions between individual organisms, in addition to rigorously characterizing sexual selection and the long-range effects of mating patterns.

An example of a small-world network with extensive weak ties. Importantly, this network topology is not random, but instead features shortcuts and extensive structure. Data represents the human brain. COURTESY: Reference [3].

Since attending the Network Frontiers Workshop (Northwestern University) last December, I have been toying around with a new approach called "breeding networks". The breeding network concept [2] involves using multilayered, dynamical networks to characterize breeding events, the creation of offspring, the subsequent breeding events for those offspring, and macro-scale population patterns that result. This allows us to characterize a number of parameters in one model, such as the effects of animal social networks on population dynamics [4]. This includes traditional network statistics (e.g. connectivity and modularity parameters) that translate into theoretical measures of fecundity, the diffusion of genotypic markers within a population, and structural independence between demic populations. But these statistics are determined by a meta-process, one that is explicitly social and behavioral.
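As a rough illustration of what such a multilayered structure could look like in code, here is a toy sketch. The event format, the individual labels, and the two derived statistics are my own hypothetical choices, not an established data model: each generation is one layer of mating edges, and parent-to-offspring links connect consecutive layers.

```python
from collections import defaultdict

# Hypothetical toy data: (generation, parent_a, parent_b, offspring ids).
events = [
    (0, "A", "B", ["E", "F"]),
    (0, "A", "C", ["G"]),
    (0, "D", "C", ["H"]),
    (1, "E", "H", ["I"]),
    (1, "F", "H", ["J", "K"]),
]

def build_layers(events):
    """Return mating edges per generation and parent-to-offspring links across layers."""
    mating = defaultdict(lambda: defaultdict(set))  # generation -> node -> set of mates
    descent = defaultdict(set)                      # parent -> set of offspring
    for generation, a, b, offspring in events:
        mating[generation][a].add(b)
        mating[generation][b].add(a)
        descent[a].update(offspring)
        descent[b].update(offspring)
    return mating, descent

mating, descent = build_layers(events)

# Within-layer degree: number of distinct mates an individual has in a generation.
mate_degree = {
    (generation, node): len(mates)
    for generation, layer in mating.items()
    for node, mates in layer.items()
}

# A simple fecundity proxy: number of offspring linked to each parent.
fecundity = {parent: len(offspring) for parent, offspring in descent.items()}

print(mate_degree[(0, "A")], fecundity["A"])
```

Standard network statistics (connectivity, modularity) could then be computed per layer, with the descent links carrying genotypic markers from one layer to the next.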

Before we continue, it is worth asking why complex networks are relevant. No doubt you have heard of the "small-world network" phenomenon, which postulates that given a certain type of network topology, networks with many nodes and connections can be traversed in a very small number of steps [5]. This is the famous "six degrees" phenomenon in action. But complex networks can range from random connectivity to various degrees of concentration. This approach, which comes with its own mathematical formalisms, allows us to neatly characterize the behavior, physiology, and other non-genetic factors that result in the population dynamics that produce structured genetic variation.

An example of regular, small-world, and random networks, ordered by the extent to which their connectivity is determined by random processes [6]. In breeding networks, non-random connectivity is determined by sexual selection (e.g. selective breeding). As sexual selection increases or decreases, it can change the connectivity of a population.
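The regular-to-random ordering can be reproduced with a few lines of Python. This is a simplified sketch in the spirit of the Watts-Strogatz construction [6], written with the standard library only; the network size (200 nodes), the neighbourhood size (4), and the rewiring probabilities are arbitrary values chosen for illustration.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Regular ring: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, k, p, rng):
    """Move each lattice edge to a randomly chosen new endpoint with probability p."""
    n = len(adj)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                choices = [t for t in range(n) if t != i and t not in adj[i]]
                if choices:
                    new = rng.choice(choices)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def mean_path_length(adj):
    """Average shortest-path length over reachable pairs, via breadth-first search."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(1)
regular = mean_path_length(ring_lattice(200, 4))
small_world = mean_path_length(rewire(ring_lattice(200, 4), 4, 0.1, rng))
fully_random = mean_path_length(rewire(ring_lattice(200, 4), 4, 1.0, rng))
print(regular, small_world, fully_random)
```

Even a small amount of rewiring creates shortcuts that sharply reduce the average number of steps between nodes. In a breeding network, one could loosely read the rewiring probability as the inverse of how strongly mate choice is constrained to near neighbours, though that mapping is my own interpretation.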

As complex networks are made up of nodes and connections, the connections themselves are subject to connection rules. In some networks, these rules can be observed as laws of preferential attachment [7]. But in general, each node or class of nodes can have simple rules for preferring (or ignoring) association with one node over another. If this sounds like an informal selection rule, this is no accident. While complex network theory does not approach connectivity rules in such a way, breeding networks are expected to be influenced by sexual selection at a very fundamental (e.g. dyadic interaction) level.
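One simple connection rule of this kind is degree-proportional attachment, as in the Barabasi-Albert model [7]. The sketch below grows a network in which each new node links to existing nodes with probability proportional to their current degree. Reading "degree" as number of mates is my own analogy for breeding networks, and the sizes used (500 nodes, 2 links per new node) are arbitrary.

```python
import random
from collections import Counter

def preferential_attachment(n, m, rng):
    """Grow a graph: each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    # Start from a small, fully connected seed of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # 'stubs' lists each node once per edge end, so uniform sampling
    # from it is equivalent to degree-proportional sampling.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((new, target))
            stubs.extend((new, target))
    return edges

edges = preferential_attachment(500, 2, random.Random(2))
degree = Counter(node for edge in edges for node in edge)
print(min(degree.values()), max(degree.values()))
```

Most nodes keep the minimum number of connections while a few early nodes accumulate many, which is the kind of heterogeneity one would expect if a small number of individuals were strongly preferred as partners.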

The complex network zoo, and the three parameters (heterogeneity, randomness, and modularity) that define the connectivity of a network topology. Examples of specific network types are given in the three-dimensional example above, but breeding networks could fall anywhere within this space. COURTESY: Reference [8].

Another feature of breeding networks involves connectivity trends over time. For example, a founder population with a small effective population size might indeed be panmictic (in this case represented by a random network topology). However, as the population size increases and connectivity rules change, this topology can evolve to one with scale-free or even small-world properties. This is due not only to the selective nature of producing offspring, but also to differences in the fecundity of individual nodes.

Once you start playing around with this basic model, a number of alternative network structures [9] can be used to represent the null model. Types of configuration such as star topologies, hyperbolic trees, and cactus graphs can approximate inherent geographic structure in a population's distribution. These alternative graph topologies are the product of factors such as geography or migration, and may have pre-existing structure. The key is to use these features as the null hypothesis as appropriate. This will provide us with a better accounting of the true complexity involved in shaping the structural features of an evolving population.

A map showing the seasonal migration of shark populations in the Pacific, including aggregation points. COURTESY: The Fisheries' Blog.

[1] One model organism for understanding local and global panmixia is the aquatic parasite Lecithochirium fusiforme. For more, please see: Criscione, C.D., Vilas, R., Paniagua, E., and Blouin, M.S.   More than meets the eye: detecting cryptic microgeographic population structure in a parasite with a complex life cycle. Molecular Ecology, 20(12), 2510-2524 (2011).

[2] The idea of breeding networks is similar to the idea of sexual networks, except that breeding networks are more explicitly tied to population genetics. This paper gives good insight into how sexual selection factors into the formation of structured, complex networks: McDonald, G.C., James, R., Krause, J., and Pizzari, T.   Sexual networks: measuring sexual selection in structured, polyandrous populations. Philosophical Transactions of the Royal Society B, 368, 20120356 (2013).

[3] Gallos, L.K., Makse, H.A., and Sigman, M.   A small world of weak ties provides optimal global integration of self-similar modules in functional brain networks. PNAS, 109(8), 2825-2830 (2012).

[4] For more on animal social networks and their relationship to evolution, please see the following references:

a) Oh, K.P. and Badyaev, A.V.   Structure of social networks in a passerine bird: consequences for sexual selection and the evolution of mating strategies. American Naturalist, 176(3), E80-89 (2010).

b) Kurvers, R.H.J.M., Krause, J., Croft, D.P., Wilson, A.D.M., Wolf, M.   The evolutionary and ecological consequences of animal social networks: emerging issues. Trends in Ecology and Evolution, 29(6), 326–335 (2014).

[5] For a definition of network diameter in context, please see: Porter, M.A.   Small-world Network. Scholarpedia, 7(2), 1739 (2012).

[6] This classification of idealized graph models is based on the Watts-Strogatz model of complex networks. For more information, please see: Watts, D.J. and Strogatz, S.H.   Collective dynamics of 'small-world' networks. Nature, 393, 440-442 (1998).

[7] This property of idealized graph models is based on the Barabasi-Albert model of complex networks. For more information, please see: Barabasi, A-L. and Albert, R.   Emergence of scaling in random networks. Science, 286(5439), 509–512 (1999).

[8] Sole, R.V. and Valverde, S.   Information Theory of Complex Networks: On Evolution and Architectural Constraints, Lecture Notes in Physics, 650, 189–207 (2004).

[9] Oikonomou, P. and Cluzel, P.   Effects of topology on network evolution. Nature Physics 2, 532-536 (2006).

August 18, 2014

Maps, Models, and Concepts, August edition

Welcome back, Maps, Models, and Concepts series! In this edition, with content cross-posted to Tumbld Thoughts, we take a tour of Artificial Intelligence reconsidered (I) and the visualization of Economic History (II). Enjoy!

I. Can you haz intelligent behavior, internet bot?

Here are a few recent readings on the modeling and simulation of intelligence, broadly defined. The first two [1, 2] are part of a series by Beau Cronin on alternative ways to model intelligence. How do we produce "better" (e.g. more intuitive, or more human) artificial intelligence? Perhaps it is the model that counts, or perhaps it is the definition of intelligence itself. 

COURTESY: Figure 3 in [3].

The authors of [3] take the former view, and present a review on how various computational architectures can produce intelligent outputs. One example demonstrates how hierarchical Bayesian models (HBMs) can be used to acquire intuitive theories for various knowledge domains. But one can also use biologically-based architectural models to produce intelligent behavior. In [4], it is shown that fabrication and cell culture techniques can produce outputs similar to purely computational connectionist models.

COURTESY: Figure 2 in [4].

II. Did it begin with a bang, a boom, or a bust?

Aha! The moment of economic creation was not at 1650 after all! Conventional economic theory sometimes gives the impression that economists are creationists in spirit. Many historical graphs [5] only offer useful information back to the year 1650. Around 1650 or so, most economic indicators enter their exponential phase, which renders graphical information about previous eras incomparable.

But economist and modeler Max Roser [6] offers a historical view of global GDP going back 2,000 years. His "Our World in Data" website is an attempt to characterize global economics and other social phenomena as a series of visualizations. This includes maps (spatial distributions) and charts that make long-term comparisons more than a series of bad graphs. If John Maynard Keynes were to look at these data, he might say: in the long run, we are all wealthier [7].

[1] Cronin, B.   In search of a model for modeling intelligence. O'Reilly Radar blog, July 24 (2014).

[2] Cronin, B.   AI's dueling definitions. O'Reilly Radar blog, July 17 (2014).

[3] Tenenbaum, J.B., Kemp, C., Griffiths, T.L., and Goodman, N.D.   How to Grow a Mind: Statistics, Structure, and Abstraction. Science, 331, 1279-1285 (2011).

[4] Tang-Schomer, M.D., White, J.D., Tien, L.W., Schmitt, L.I., Valentin, T.M., Graziano, D.J., Hopkins, A.M., Omenetto, F.G., Haydon, P.G., and Kaplan, D.L.   Bioengineered functional brain-like cortical tissue. PNAS, doi:10.1073/pnas.1324214111 (2014).

[5] The bottom three pictures are courtesy of: Roser, M.   GDP Growth Over the Very Long Run. Our World in Data (2014).

[6] Matthews, D.   The world economy since 1 AD, in a single chart. Vox blog, August 15 (2014).

[7] Based on the quote "in the long run, we are all dead".

August 14, 2014

Dynamic Digital Diversity (in two parts)

This content is cross-posted to Tumbld Thoughts. A series of readings (in two parts) on trends in digital technology and the nature of internet use. 

I. The Digital Monoculture and its Discontents, the Digital Hellscape and its Malcontents

The first reading list is on alternatives for consumption and use of the internet [1] and virtual worlds [2]. Where one type of person sees corporatist monoculture as the norm, others see new opportunities. How do we achieve a more mindful computing environment? In the case of reading [1], mindful means greater balance between the deluge of information and the ability to reflect upon it. Discover the possibilities courtesy of an insightful techno-buzzword salad on topics such as ubiquitous information and disconnectionists.

The second reading [2] confuses the "post-apocalyptic" with "eschewing the corporate". Today's Second Life is like what would happen to Burning Man if all of the hipsters and Silicon Valley types stopped going. People are doing a lot of interesting things under the radar of the hype machine. An interesting article nonetheless.

II. What do People of the Internet and the Sciences Want?

Here are some interesting readings and visualizations related to science and technology. The first [3] is a network analysis of comments received by the FCC in response to preserving net neutrality. Interestingly, this analysis allows us to assess the uniqueness of each major argument (and how one side of the argument tended to be suspiciously more homogeneous). The second visualization [4] is a survey of how scientists use social media to advance their research. This includes not only how these tools are used, but which tools are most popular.

[1] McFedries, P.    Mindful computing. IEEE Spectrum, July 25 (2014).

One version of a "post-apocalyptic hellscape".

[3] Hu, E.   A Fascinating Look Inside Those 1.1 Million Open-Internet Comments. All Tech Considered blog, August 12 (2014).

[4] Van Noorden, R.   Online collaboration: Scientists and the social network. Nature News, August 13 (2014).

August 4, 2014

The Ukraine is Strong (for Synthetic Daisies)!

This post was written using Ubuntu 13.10 (Saucy Salamander), GIMP 2.6 and Blogilo. No Bitcoins (or their open-source alternatives) were transacted in its creation.

According to the game of Risk, the Ukraine is weak. But for Synthetic Daisies blog and as an example of viral content, the Ukraine is strong! In the past few days, a Synthetic Daisies (and Fireside Science) blog post called "Bitcoin Angst with an Annotated Blogroll" has gone viral in the Ukraine.

The associated pictures demonstrate how one can simply but effectively triangulate viral content from basic analytic data. This is also confirmed by the number of Pageviews made by users with alternative browsers and operating systems, which is either a Ukraine thing, a Bitcoin community thing, or both. In any case, keep up the diffusion!