December 22, 2017

Fault-tolerant Christmas Trees (not the live kind)

It's an interconnected Christmas scene, but that's not a Christmas Tree! (?) COURTESY: Andrew P. Wheeler.

This year's holiday season post brings a bit of graph-theoretic cheer. That's right, there is a type of network called a Christmas tree [1,2]! It is a class of fault-tolerant Hamiltonian graphs [2,3]. So far, Christmas trees have been applied to computer and communications networks, but they may be found to have a wider range of applications, particularly as we move into the New Year.

An example of a Christmas Tree directed graph as shown in [2]. The top two graphs are slim trees of order 3 (left) and 4 (right). A Christmas tree (bottom) includes selected long-range connections (longer than the immediate connection to mother, daughter, or sister nodes).

This tree could have used a bit more fault-tolerance!

[1] Hsu, L-H and Lin, C-K (2008). Graph Theory and Interconnection Networks. CRC Press, New York.

[2] Hung, C-N, Hsu, L-H, and Sung, T-Y (1999). Christmas tree: A versatile 1-fault-tolerant design for token rings. Information Processing Letters, 72(1–2), 55-63.

[3] Wang, J-J, Hung, C-N, Tan, J.J-N, Hsu, L-H, and Sung, T-Y (2000). Construction schemes for fault-tolerant Hamiltonian graphs. Networks, 35(3), 233-245.

December 15, 2017

Work With Me, the Orthogonal Laboratory, and the OpenWorm Foundation This Summer!

The Google Summer of Code (GSoC) is once again accepting applications from students to work on a range of programming-oriented projects over the Summer of 2018. Orthogonal Laboratory and the OpenWorm Foundation have contributed a number of projects to the list. Here are links to the project descriptions (login required):

Orthogonal Laboratory:

DevoWorm Group:

OpenWorm Foundation:

I am the contact person for the Orthogonal Laboratory and DevoWorm Group projects, and Matteo Cantarelli is the contact person for the other projects. If you have any questions about the application process, or if you would like me to review your application before submission, please feel free to reach out. The deadline for application submission is tentatively in late March/early April. Stay tuned!

Join us on "The Road to GSoC"!

December 1, 2017

Coherence and Relevance in Scientific Communities

          During the past year, Synthetic Daisies featured a series of posts on relevance theory and intellectual coherence within research communities [1]. In this post, I would like to use a set of small datasets to demonstrate how relevance plays a role in shaping scientific practice [2]. We are using a syntactic approach (or word frequency) to infer changes over time in specific scientific fields. 

          This is done using a list of words extracted from the titles of accepted papers at the NIPS (Neural Information Processing Systems) conference from various years past. The annual NIPS conference represents a set of fields (Neural Modeling, Artificial Intelligence, Machine Learning) that have experienced rapid innovation and vast methodological change over the past 20 years [3]. To get a handle on patterns representative of a more stable field, we also examine data from GECCO (the Genetic and Evolutionary Computation Conference). While there is plenty of innovation in this area of CS research, the short-term wholesale turnover of methods is much less prominent.

          Our approach involves ranking the words used in paper titles by frequency, and then comparing these rankings across different time intervals. Title words are in many ways more precise than keywords, in that titles tend to describe specific methods and approaches. Each list shows the top 15 words for a given year. Changes in rank are represented by lines connecting a word's position in each pairwise list, and words that newly appear in or disappear from the list project to a black dot underneath the ranked lists. 

          The working hypothesis is that periods of rapid change are characterized by very little carry-over between two neighboring time-points. Basic descriptive terms specific to the field should remain, but all other terms in the earlier list will be replaced by a new set of terms. 
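For readers who want to try this on their own conference data, the ranking-and-comparison procedure can be sketched in a few lines of Python. The titles, stopword list, and function names below are illustrative assumptions, not the actual NIPS/GECCO pipeline:

```python
from collections import Counter

# A minimal stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def top_terms(titles, n=15):
    """Rank words from a list of paper titles by frequency."""
    counts = Counter(
        word
        for title in titles
        for word in title.lower().split()
        if word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(n)]

def turnover(terms_earlier, terms_later):
    """Terms carried over between two ranked lists, plus the carried-over fraction."""
    carried = set(terms_earlier) & set(terms_later)
    return carried, len(carried) / len(terms_earlier)

# Hypothetical title sets standing in for two sampled years.
titles_1994 = ["Learning in neural networks", "A model of reinforcement learning"]
titles_2007 = ["Kernel methods for structured learning", "A Bayesian model of inference"]

carried, frac = turnover(top_terms(titles_1994), top_terms(titles_2007))
```

With real title lists, `frac` is the continuity statistic quoted in the figures (e.g. 4/15 terms carried over between intervals).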

NIPS Conference Accepted Papers for 10-year intervals.

          The first graph shows the change in top terms (relevance) across 10-year intervals. As expected for such a fast-moving field, the terms exhibit an almost complete turnover for each interval (4/15 terms are continuous between 1994 and 2007, and 5/15 terms are continuous between 2007 and 2016). The only three terms that are present in both 1994 and 2016 are "learning", "model", and "neural". These are consistent with the basic descriptive terms in our working hypothesis.

NIPS Conference Accepted Papers for 3-year intervals.

          The second graph demonstrates changes in top terms (relevance) between 2010 and 2016, using intervals of three years. As expected, there is more continuity between terms (8/15 terms are continuous between 2010 and 2013, and 11/15 terms are continuous between 2013 and 2016). The 2013-2016 interval is interesting in that two of the terms new to the 2016 list ("optimal" and "gradient") are descriptors of a word that was lost from the 2013 list ("algorithm"). This suggests that there was much coherence in research topics within this interval as compared to the 2010-2013 interval.

GECCO Conference Accepted Papers for 1-year intervals.

          For both one-year intervals, 11/15 terms are preserved from one interval to the next. The terms that exhibit this continuity are consistent with the idea of basic descriptive terms. This might be seen as the signature of stability within communities, as it matches what is observed between 2013 and 2016 for the NIPS data.

          In keeping with the idea of scientific revolutions [4], we might adjust our view of paradigm shifts as revolutions in relevance. This serves as an alternative to the "big person" view of history, where luminaries such as Newton or Einstein singularly make big discoveries that change the course of their field and upend prevailing views. In this case, revolutions occur when communities shift their discourse, sometimes quite rapidly, to new sets of topics. This seems to be the case with various samplings of the NIPS data.

          For papers presented at NIPS and GECCO, what is relevant in a particular year is made salient to the audience of people who attend the conference. Whether or not this results in a closed feedback loop (people perpetually revisiting a focused set of topics) is dependent on other social dynamics.

UPDATE (12/7):
A preprint is now available! Check it out here: How to find a scientific revolution: intellectual field formation and the analysis of terms. PsyArXiv, doi:10.17605/OSF.IO/RHS9G (2017).

[1] For more information, please see the following posts: Excellence vs. Relevance (July 2), Breaking Out From the Tyranny of the PPT (April 17), and Loose Ends Tied, Interdisciplinarity, and Consilience (June 18).

[2] Lenoir, R. (2006). Scientific Habitus: Pierre Bourdieu and the Collective Individual. Theory, Culture, and Society, 23(6), 25-43.

[3] For more about the experience and history of NIPS, please see: Surmenok, P. (2017). NIPS 2016 Reviews. Medium.

[4] Kuhn, T.S. (1962). The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL.

November 18, 2017

New Badges (Microcredentials) for Fall 2017

I have some new badges to advertise, one set from the OpenWorm Badge System, and one set from the Orthogonal Lab Badge System. As discussed previously on this blog, badges are microcredentials we are using to encourage participation in our research ecosystems at an introductory level.

An education-centric sketch of the OpenWorm and Orthogonal Laboratory research ecosystems.

The first new badge series is an introduction to what is going on in the DevoWorm group, but also gives biologists and computationalists unfamiliar with Caenorhabditis elegans developmental biology a chance to get their feet wet by taking a multidisciplinary approach to the topic.

Worm Development I focuses on embryonic development and associated pattern formation. Worm Development I is a prerequisite to II, so be sure to try this one first.

Worm Development II focuses on larval development, including the postembryonic lineage tree and what characterizes each life-history stage.

The second badge series is hosted on the Orthogonal Lab Badge System, and provides an overview of Peer Review issues and techniques. This series is meant to give young scholars a working familiarity with the process of peer review. It is notable that Publons Academy now offers a course on Peer Review, to which this badge might serve as an abbreviated complement.

Peer Review I covers the history of peer review and the basics of pre-publication peer review. Be aware that Peer Review I is a prerequisite for Peer Review II (but not Peer Review for Data).

Peer Review II delves into how to decompose an article for purposes of peer review. An evaluation strategy for post-publication peer review is also covered.

Peer Review for Data contains a brief how-to for conducting peer review for open datasets.

November 15, 2017

Deep Reading Brings New Things to Life (Science)

Here is an interesting Twitter thread from Jacquelyn Gill on 'deep reading':

The basic idea is that exploring older literature can lead to new insights, which in turn lead to new research directions. The new research of our era tends to focus on the most relevant and cutting-edge literature [1]. This recency bias excludes many similarly relevant articles, including articles that perhaps inspired the more recent citations to begin with [2]. 

I have my own list of deep reads that have influenced some of my research in a similar fashion. These references can be either foundational or so-called "sleeping beauties" [3]. Regardless, I am doing my part to maintain connectivity [4] amongst academic citation networks:

1) Woodger, J.H. The Axiomatic Method in Biology. 1937.

An argument for biological rules, an influence on cladistics (developed in the 1960s), and a natural bridge to geometric approaches to data analysis and modeling. While there is a strong argument to be made against the axiomatic approach [5], this directly inspired much of my thinking in the biological modeling area. 

2) Davis R.L., Weintraub H., and Lassar A.B. Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000. 1987.

This was the first proof-of-concept for direct cellular reprogramming, and it predates the late-2000s Nobel-winning work in stem cells by decades. In this case, a single transcription factor (MyoD) was used to convert a cell from one phenotype to another without a strict regard for function. More generally, this paper helped inspire my thinking in the area of cellular reprogramming to go beyond a biological optimization or algorithmic approach [6].

3) Ashby, W.R. Design for a Brain. 1960.

"Design for a Brain" serves as a stand-in for the entirety of Ashby's bibliography, but it is the best example of how Ashby successfully merged explanations of adaptive behavior [7] with systems models (cybernetics). In fact, Ashby originally coined the phrase "Intelligence Augmentation" [8]. I first discovered Ashby's work while working in the area of Augmented Cognition, and it has been more generally useful as inspiration for complex systems thinking.

Not so much a couple of sleeping beauties as easy-reading technical reference guides for all things complexity theory.

5) Bourdieu, P. Outline of a Theory of Practice. Cambridge University Press. 1977 AND Alexander, C., Ishikawa, S., and Silverstein, M. A Pattern Language: towns, buildings, construction. Oxford University Press. 1977.

This is a bonus, not because the references are particularly obscure or even from the same academic field, but because they partially influenced my own view of cultural evolution. This is yet another piece of advice to young researchers: take things that appear to be disparate on their surface and incorporate them into your mental model. If nothing else, you will gain valuable skills in intellectual synthesis.

UPDATE (11/17):
Here is another example of old (classic, not outdated) work influencing new scholarship.

[1] Evans, J.A. (2008). Electronic Publication and the Narrowing of Science and Scholarship. Science, 321(5887), 395-399 AND Scheffer, M. (2014). The forgotten half of scientific thinking. PNAS, 111(17), 6119.

[2] related topics discussed on this blog include distributions of citation ages and most-cited papers.

[3] van Raan, A.F.J. (2004). Sleeping Beauties in Science. Scientometrics, 59(3), 467–472.

[4] Editors (2010). On citing well. Nature Chemical Biology, 6, 79.

[5] For the semantic approach (which had been influential to my more recent work), please see: Lloyd, E.A. (1994). The Structure and Confirmation of Evolutionary Theory. Princeton University Press, Princeton, NJ.

[6] Ronquist, S. (2017). Algorithm for cellular reprogramming. PNAS, 114(45), 11832–11837.

[7] Sterling, P. and Eyer, J. (1988). Allostasis: A new paradigm to explain arousal pathology. In "Handbook of life stress, cognition, and health". Fisher, S. and Reason, J.T. eds. Wiley, New York. 

[8] Ashby, W.R. (1956). An Introduction to Cybernetics. Springer, Berlin.

October 26, 2017

Open Access Week 2017: Version-Controlled Papers

The subject of a recent workshop [1], the next-generation scientific paper will include digital tools that formalize things such as version control and data sharing/access. Orthogonal Laboratory is developing a method for version-controlled documents that integrates formatting, bibliographic aspects, and content management. While this is not a novel approach to writing and composition [2], this post will cover how to apply a version-controlled strategy to presenting a scientific workflow. Below are brief sketches of our system for generating next-generation papers.

The first element is the process through which a document is generated, styled, and published (assigned a unique digital identifier or doi):

The key element of our system is a version control repository. We are using Bitbucket, but Github or a more specialized platform such as Authorea or Penflip might also be sufficient. The idea is to build documents using the Markdown language [3], then incorporate stylistic elements using CSS and HTML. VScode is used to manage spellcheck and grammar in the Markdown documents (containing the authored content). Reference management is done via Zotero, but again, any open source alternative will do.

The diffs function [4] of version control can be used to operate on final versions of Markdown files for the purpose of alternating between document versions. The idea is to not only find a consensus between collaborators, but to use branches strategically to push alternative versions of content to the doi as desired. This combinatorial editing framework could be desirable in appealing to different audiences or stressing specific aspects of the work at different points in time. Note that this is distinct from the editorial function of pulls and merges, which are meant to be more "under the hood".
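The branch-level comparison described above can be illustrated with Python's difflib, which emits the same unified diff format that git reports. The file names and Markdown content here are hypothetical:

```python
import difflib

# Two versions of the same Markdown passage, as they might appear
# on two branches of the document repository (illustrative content).
version_a = """## Results
The terms exhibit almost complete turnover across each interval.
""".splitlines(keepends=True)

version_b = """## Results
The terms exhibit almost complete turnover across each 10-year interval.
""".splitlines(keepends=True)

# A unified diff, analogous to `git diff main..alternate -- paper.md`.
diff = list(difflib.unified_diff(version_a, version_b,
                                 fromfile="main/paper.md",
                                 tofile="alternate/paper.md"))
```

Lines prefixed with `-` and `+` mark the content that differs between the two branches, which is exactly the information used when deciding which version to push to the doi.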

Pandoc serves as a conversion tool, and can style documents according to particular specifications. This includes conventions such as APA style, or document formats such as LaTeX or pdf [5]. Additional components include code and data repositories, supplemental materials, and post-publication peer review.
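As a rough sketch of how the conversion step might be scripted (assuming Pandoc 2.11+ with its built-in --citeproc flag; the file names and the apa.csl stylesheet are placeholders):

```python
import os
import shutil
import subprocess

def pandoc_command(source_md, target, csl=None):
    """Build a Pandoc invocation that converts a Markdown source to a
    target format inferred from the output extension (.pdf, .tex, .docx)."""
    cmd = ["pandoc", source_md, "-o", target, "--citeproc"]
    if csl:  # e.g. an APA CSL stylesheet for bibliography formatting
        cmd += ["--csl", csl]
    return cmd

cmd = pandoc_command("paper.md", "paper.pdf", csl="apa.csl")

# Only invoke Pandoc when it is installed and the source file exists.
if shutil.which("pandoc") and os.path.exists("paper.md"):
    subprocess.run(cmd, check=True)
```

The same command pattern covers the conventions mentioned above: swapping the output extension yields LaTeX or pdf, and swapping the CSL file changes the citation style.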

Orthogonal Lab generally uses a host such as Figshare to generate dois for such content, but there are other hosts that generate version-specific dois as well. It is worth noting that Github-hosted academic journals are beginning to appear. Two examples are ReScience and Journal of Open Science Software. What we are providing (for our community and yours) is a means to generate styled documents (technical papers, blogposts, formal publications) in a version-controlled format. This also means papers can be dynamic rather than static: content at a given doi can be updated as desired.

[1] Perkel, J. (2017). C. Titus Brown: Predicting the paper of the future. Nature TechBlog, June 1.

[2] Eve, M.P. (2013). Using git in my writing workflow. August 18. Also, much of this functionality is accessible in Overleaf using TeX and a GUI interface.

[3] Cifuentes-Goodbody, N. (2016). Academic Writing in Markdown. YouTube. AND Sparks, D. and Smith, E. Markdown Field Guide, MacSparky.

[4] Diffs are also useful in comparing different versions of a published document as events unfold. Newsdiffs performs this function quite nicely on documents containing unfolding news.

[5] A few references for further reading:

a) Building your own Document Processor Tools:
Building Publishing Workflows with Pandoc and Git. Simon Fraser University Publishing.

b) Git + Diffs = Word Diffs:
Diff (and collaborate on) Microsoft Word documents using GitHub. Ben Balter blog.

c) Using Microsoft Word with Git. Martin Fenner blog.

October 24, 2017

Open Access Week 2017: Open Project Management

To kick off the open fun for this year, we will start with a short discussion of open project management. Although these principles should be thought of in a tool-agnostic manner, we will address them using Slack and the Open Science Framework (OSF).

Welcome to the Orthogonal Lab Slack space! Contact me if you are interested in joining.

Slack as a laboratory group tool: I began using Slack several years ago when the OpenWorm Foundation started using it to facilitate shared communication and manage new members. Since then, it has become increasingly popular as a laboratory personnel and collaborative management tool [1]. I started the Orthogonal Lab Slack about a year ago, and it has been useful for disseminating intragroup messages, news, media, and short presentations. This is especially good for academic collaborations, particularly when the group members are not co-located [2].

Once your group has a Slack space (with its own URL), you must a) create channels, and b) recruit members. Whether your group is large or small, Slack seems to scale well in most cases. Each channel is thematic, and allows for parallel communication between channel members. Media (files, images, links) can be shared with ease, and private messages are also possible. Additional functionality is possible through the use of bots (e.g. time-management tools such as todobot or slackodoro). In many ways, Slack is an alternative to the e-mail chain. However, integration with other platforms (such as Twitter or Skype) is also possible.

An infographic on Slack productivity in the academic workplace, courtesy of Paperpile.

COURTESY: Using OSF at the University of Notre Dame. YouTube.

Open Science Framework (OSF) as project pipeline and showcase: I have been using OSF for storing work at the project level for exposition to potential funders and other interested parties. More generally, OSF is used to promulgate both the progression and replicability of research projects [3]. From a technical perspective, OSF features version control (using Git), doi creation, and storage space for papers, presentations, and data. It also offers an API and an open dataset on research activities, as well as a portal called Thesis Commons for theses and dissertations. You can also store datasets and digital notebooks, and link to Github-hosted code using the OSF project structure.

Potential destinations for objects of the OSF workflow. COURTESY: Ref [4].

The OSF offers a means to manage all scales of research output. Artem Kaznatcheev has provided an informal taxonomy of research output types as well as their scale of importance. According to this view, examples of these scales include the following: standard (blog), kilo (conference pubs), giga (journal pubs), and tera (book/thesis) scales. Although arbitrary in terms of content, these scales might more closely reflect the number of hours invested in creating a particular type of research document. OSF projects can include combinations of research output types to provide a richer window into the research process.

Steps in the developing research (or, how to get to research outputs). COURTESY:

[1] Some examples include:
a) Slack inside the MacArthur Lab. SlackHQ blog, April 27 (2015).

b) Washietl, S. (2016). Six ways to streamline communication in your research group using Slack. Paperpile blog, April 12.

c) Perkel, J.M. (2016). How Scientists Use Slack. Nature News, 541, 123. Managing organizational to-do lists in Slack.

[2] OpenWorm Slack has a bi-weekly event called Office Hours where people meet and have topical conversations. Join us via Slack Pass if you are interested.

[3] Foster, E. and Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association: JMLA, 105(2), 203-206.

[4] Anonymous (2016). Response from COS. Medium, April 2.

October 23, 2017

Open Access Week 2017!

Welcome to Open Access Week 2017! Synthetic Daisies participated in Open Access Week 2016 with two instructional posts on Altmetrics and Secondary Datasets. On Twitter, several hashtags (#oaweek, #OpenAccess, #OpenScience, and #OpenData) will be full of related content over the next few days. And we will have longer posts on Tuesday and Thursday on the topics of Open Project Management and Version-Controlled Papers that will be worth reading.

Over the last year, the OpenWorm Foundation and Orthogonal Laboratory made a commitment to open access instruction in a series of microcredentials (digital badges). The OpenWorm badge system offers a series of badges on Literature Mining specifically and Open Science more generally. The Orthogonal Lab badge system offers a series of badges on Peer Review. Have a productive week!

October 2, 2017

Pseudo-Heliocentric Readership Information in Gravitationally Bound Form

Or, how to get 300,000 reads by being persistent [1] and getting results in unexpected places. Let's review our milestones in three cartoons.

The made-up planetary orbits featured here [2] may violate the physics of actual solar system orbits, at least as simulated by Super Planet Crash [3].

[1] Candy, A. (2011). The 8 Habits of Highly Effective Bloggers. Copyblogger, October 25.

[2] Previous readership milestones, in order of distance from central star: 20000, 50000, 100000 (first image), 120000, 150000 (second image), 200000, 250000 (third image).

[3] Featured in the Scientific Bytes and Pieces, August 2015 post.

September 21, 2017

An Infographical Survey of the Bitcoin Landscape

Josh Wardini sent me information on a new Bitcoin infographic that serves as a survey of events over the last 10 years in the world of Bitcoin development and legal regulation. The graphic contains many interesting factoids, some of which were previously unknown to me. In the next few paragraphs, I will discuss my impressions of each subset of factoids.

The relationship between blockchain and mining is an interesting one, and underscores the power of blockchain as both a data structure and a secure transaction system. Bitcoin is also its own economic system, complete with social interactions. In particular, the competitive and cooperative aspects of cryptocurrency can serve as a model for understanding the social structure of markets.

This is another interesting feature of bitcoin: the network has the computational power both to unlock the value of the existing blockchain and to create new currency. Bitcoin mining has always been a bit of a black box to me [1], but it seems to have potentially two roles in the bitcoin economy. In a Synthetic Daisies post from 2014, I mentioned that the supply of bitcoin is fixed (in the manner of a precious-metals supply), but it turns out that it is not that simple. Of course, since then blockchain technology has become the latest hot emerging technology in a number of areas unrelated to Bitcoin and even the digital economy [2].

It turns out that computational systems (unlike people) are not all that hard to understand. However, digital currency, which is based on human systems, is much harder to understand (or at least fully appreciate). In 2013, I gave a brief Synthetic Daisies mention to a flash crash on one of the main Bitcoin exchanges. There is a lot of opportunity to use blockchain, and perhaps even cryptocurrency, in the world of research. If ways are found to make these technologies more easily scalable, then they might be applied to many research problems involving human social systems [3].

[1] So I sought out a few introductory materials on Bitcoin mining to clarify what I did not know: 

a) startbitcoin (2016). Beginner's Guide to Mining Bitcoins. 99 Bitcoins blog, July 1.

* mining consists of discovering blocks in the blockchain data structure, a discovery that is rewarded through a "bounty" of x bitcoins. From there, inequality emerges (or not).

b) Mining page. Bitcoin Wiki.

* the total number of blocks is agreed to by the community, as is the total amount of computational power of the network. This makes the monetary supply nominally fixed, but this is not required by the technology.

c) Hashcash Algorithm page. Bitcoin Wiki.

Despite the clear metaphoric overtones, Bitcoin mining is essentially like breaking encryption in that it requires a massive amount of computing power thrown at a computationally hard problem, but it also has elements of an artificial life model (e.g. competition for blockchain elements).
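To make the "massive computing power on a hard problem" point concrete, here is a toy Hashcash-style proof-of-work in Python. Real Bitcoin mining hashes an 80-byte block header with double SHA-256 against a numeric target; this sketch simplifies to leading hex zeros:

```python
import hashlib

def mine(block_header: str, difficulty: int) -> int:
    """Find a nonce such that SHA-256(header + nonce) starts with
    `difficulty` hex zeros. Each extra zero makes the search ~16x harder,
    which is why mining rewards raw computational power."""
    nonce = 0
    prefix = "0" * difficulty
    while True:
        digest = hashlib.sha256(f"{block_header}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1

# A toy "block"; at difficulty 2 this takes a few hundred hashes on average.
nonce = mine("block: toy transactions", difficulty=2)
```

Verifying a solution takes one hash, while finding it takes many: that asymmetry is the core of the puzzle, independent of the competitive, artificial-life-like dynamics among miners.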

Water-cooled rigs probably maximize your investment margin....

[2] Of course, there has been innovation in the use of blockchain for Bitcoin and more general cryptocurrency transactions. For more, please see:

Portegys, T.E. (2017). Coinspermia: a cryptocurrency unchained. Working Paper, ResearchGate, doi:10.13140/RG.2.2.33317.91360.

Brock, A. (2016). Beyond Blockchain: simple scalable cryptocurrencies. Metacurrency project blog, March 31.

[3] A few potential examples:

a) Data Management. 1  2

September 11, 2017

This Concludes This Version of Google Summer of Code

I am happy to announce that the DevoWorm group's first Google Summer of Code student has successfully completed his project! Congrats to Siddharth Yadav from IIT-New Delhi, who completed his project "Image Processing with ImageJ (segmentation of high-resolution images)".

Our intrepid student intern

His project completion video is located on the DevoWorm YouTube channel. This serves as a counterpart to his "Hello World" video at the beginning of the project. The official project repo is located here. Not only did Siddharth contribute to the data science efforts of DevoWorm, but he also contributed to the OpenWorm Foundation's public relations committee.

Screenshot from project completion video

As you will see from the video, a successful project proceeds by organizing work around a timeline, and then modifying that timeline as roadblocks and practical considerations are taken into account. This approach resulted in a tool that can be used immediately by a diverse research community for data extraction, or built upon in future projects. 

In terms of general advice for future students: communicate potential problems early and often. If you get hung up on a problem, put it aside for a while and work on another part of the project. As a mentor, I encourage students to follow up on the methods and areas of research that are most successful in their hands [1]. In this way, students can find and build upon their strengths, while also achieving some level of immediate success. 

[1] This seems like a good place to plug the Orthogonal Research Lab's Open Career Development project. In particular, check out our laboratory contribution philosophy.

August 25, 2017

Live streaming of Orthogonal Lab content

Research live-streaming: an experiment in content [1].

The Orthogonal Research Laboratory, in conjunction with the OpenWorm Foundation, is starting to experiment with live video content. We are using YouTube Live, and live streams (composed in Xsplit Broadcaster) will be archived on the Orthogonal Lab YouTube channel. The initial forays into content will focus on research advances and collaborative meetings, but ideas for content are welcome. 


[1] obscure reference of the post: a shot of Felix the Cat, whose likeness was used to calibrate early experimental television broadcasts.

August 3, 2017

War of the Inputs and Outputs

Earlier this Summer, I presented a talk on sub-optimal cybernetic systems at NetSci 2017. While the talk was a high-level mix of representational modeling and computational biology, there were a few loose ends for further discussion.

One of these loose ends involves how to model a biological system with boxes and arrows when biology is a multiscale, continuous process in both space and time [1]. While one solution is to add as much detail as possible, and perhaps even move to hybrid multiscale models, another solution involves the application of philosophy.

In the NetSci talk, I mentioned in passing a representational technique called metabiology. Our group has recently put out a preprint on the cybernetic embryo in which the level of analysis is termed metabiological. In a metabiological representation, the system components do not need to map isomorphically to the biological substrate [2]. Rather, the metabiological representation is a model of higher-order processes that result from the underlying biology.

From a predictive standpoint, this method is imprecise. It does, however, get around a major limitation of black box models -- namely what a specific black box is representative of. It makes more sense to black box an overarching feature or measurement construct than to constrain biological details to artificial boundaries.

A traditional cybernetic representation of a nonlinear loop. Notice that the boxes represent both specific (sensor) and general (state of affairs) phenomena.

The black box also changes through the history of a given scientific field or concept. In biology, for example, the black box is usually thought of as something to ultimately break apart and understand. This is opposed to understanding how the black box serves as a variable that interacts with a larger system. So this usage might seem odd to readers who assume that a sort of conceptual impermanence is implied by the term "black box".

A somewhat presumptuous biological example: in the time of Darwin, heredity was considered to be a black box. In the time of Hunt Morgan, a formal mechanism of heredity was beginning to be understood (chromosomes), but the structure was a black box. By the 1960s, we were beginning to understand the basic function and structure of genetic transmission (DNA and gene expression). Yet at each stage in history, the "black box" contained something entirely different. In a fast moving field like cell biology, this becomes a bit more of an issue.

A related cultural problem in biology involves coming to terms with generic categories. This goes back to Linnean classification, but more generally it applies to theoretical constructs. For example, Alan Turing's morphogen concept does not represent any one biological agent, but a host of candidate molecules that fit the functional description. Modern empirical biology involves specification rather than generalization, or precisely the opposite goal of theoretical abstraction [3].

The relationship between collective morphogen action and a spatial distribution of cells. COURTESY: Figure 4 in [4]. 

A related part of the black box conundrum is what the arcs and arrows (inputs and outputs) represent. Both inputs and outputs can be quite diverse. Inputs bring things like raw materials, reactants, free energy, sources of variation, and components, while outputs include things like products, transformations, statistical effects, biological diversity, waste products, and bond energy. While inputs and outputs can be broadly considered, the former (input signals) provide information to the black box, while the latter (output signals) provide samples of the processes unfolding within the black box. Inputs also constrain the state space representing the black box.

Within the black box itself, processes can create, destroy, or transform. They can synthesize new things out of component parts, which hints at black box processes as sources of emergent properties [5]. Black boxes can also serve to destroy old relationships, particularly when a given black box has multiple inputs. Putting a little more detail into the notion of emergent black boxes involves watching how the black box transforms one thing into another [6]. This leads us to ask: do these generic transformational processes contribute to increases in global complexity?
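This input/output framing can be sketched in a few lines of code. The names below (black_box, transform) are invented for illustration; the point is simply that an observer sees only the input and output signals, never the hidden process.

```python
# A minimal sketch of a cybernetic black box. The hidden process is
# modeled as a function the observer cannot inspect directly.
def black_box(inputs, transform):
    """Apply a hidden transformation to a set of input signals.

    Observers see only `inputs` and the returned outputs; `transform`
    stands in for the unobserved internal process.
    """
    return [transform(x) for x in inputs]

# Two boxes with identical inputs but different hidden processes yield
# different output samples -- all an observer has to go on.
signals = [1, 2, 3]
doubled = black_box(signals, lambda x: 2 * x)  # -> [2, 4, 6]
squared = black_box(signals, lambda x: x * x)  # -> [1, 4, 9]
print(doubled, squared)
```

Comparing the two outputs is the observer's only route to characterizing what each box does, which is exactly the epistemic situation the black-box concept describes.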

Perhaps they do. The last insight about inputs and outputs comes from John Searle and his Chinese Room problem [7]. In his model of a simple message-passing AI, an input (in this case, a phrase in Chinese) is passed into a black box. The black box processes the input either by mere recognition or by more in-depth understanding. These are referred to as weak and strong artificial intelligence, respectively [7]. And so it is with a cybernetic black box -- the process within the unit can be qualitatively variable, leading to greater complexity and potentially a richer representation of biological and social processes.

[1] certainly, systems involving social phenomena operate in a similar manner. We did not discuss social systems in the NetSci talk, but things discussed in this post apply to those systems as well.

[2] for those who are familiar, this is quite similar to the mind-brain problem from the philosophy of mind literature, in which the mind is a model of thought and the brain is the mechanism for executing thought.

[3] this might be why robust biological "rules" are hard to come by.

[4] Lander, A. (2011). Pattern, Growth, and Control. Cell, 144(6), 955-969.

[5] while it is not what I intend with this statement, a good coda is the Sidney Harris classic "then a miracle happens..."


[7] Searle, J. (1980). Minds, Brains, and Programs. Behavioral and Brain Sciences, 3(3), 417-424.

July 26, 2017

Battle of the (Worm) Bots

There is a fledgling initiative at the OpenWorm Foundation to build a worm robot. This post highlights some of the first steps towards that goal. OpenWorm robots will be useful for educational purposes and for movement experiments/validation. No soft robotics (or miniaturization) yet, but progress is continually being made on the brains behind the bot. More videos are forthcoming, so stay tuned. Thanks go to Dr. Tom Portegys and Shane Gingell for their efforts.

This is the latest version of the OpenWorm bot, as shown on the OpenWorm YouTube channel.

This is Shane's first iteration, called RoboWorm. Note the sinusoidal movement, and then compare to a biological C. elegans.

July 16, 2017

Wandering Towards an Essay of Laws

The winners of the FQXi "Wandering Towards a Goal" essay contest have been announced. I entered the contest (my first FQXi entry) and did not win, but had a good time creating a number of interesting threads for future exploration.

The essay itself, "Inverting the Phenomenology of Mathematical Lawfulness to Establish the Conditions of Intention-like Goal-oriented Behavior" [1], is the product of my work in an area I call Physical Intelligence in addition to intellectual discussions with colleagues (acknowledged in the essay). 

I did not have much time to polish and reflect upon the essay at the time it was submitted, but since then I have come up with a few additional points. So here are a few more focused observations extracted from the more exploratory essay form:

1) there is an underappreciated connection between biological physics, evolution, and psychophysics. There is a subtle but important research question here: why did some biological systems evolve in accordance with "law-like" behavior, while many others did not?

2) the question of whether mathematical laws are discovered or invented (Mathematical Platonism) may be highly relevant to the application of mathematical models in the biological and social sciences [2]. While mathematicians commonly answer that laws are discovered and notation is invented, an approach based on discovering laws from empirically-driven observations will likely yield a different answer.

3) how exactly do we define laws in the context of empirical science? While laws can be demonstrated in the biological sciences [3], biology itself is not thought of as particularly lawful. According to [4], "laws" fall somewhere in-between hypotheses and theories. In this sense, laws are both exercises in prediction and part of theory-building. Historically, biologists have tended to employ statistical models without reference to theory, while physicists and chemists often use statistical models to demonstrate theoretical principles [5]. In fields such as biology or the social sciences, the use of different or novel analytical or symbolic paradigms might facilitate the discovery of lawlike invariants.

4) the inclusion of cybernetic principles (Ashby's Law of Requisite Variety) may also bring new insights into how laws occur in biological and social systems, and whether such laws are based on deep structural regularities in nature (as argued in the FQXi essay) or on the mode of representing empirical observations (an idea to be explored in another post).
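As a small, hedged illustration of Ashby's Law: if variety is measured as the log of the number of distinguishable states, the law bounds the variety left in outcomes from below by H(disturbance) - H(regulator). The state counts below are invented purely for illustration.

```python
import math

# Variety of a system, in bits, given its number of distinguishable states.
def variety(n_states):
    return math.log2(n_states)

disturbances = 8  # distinct perturbations the environment can produce
regulator = 4     # distinct compensating responses the regulator can select

# Ashby's Law of Requisite Variety (log form):
# H(outcome) >= H(disturbance) - H(regulator)
residual = variety(disturbances) - variety(regulator)
print(residual)  # 1.0 bit the regulator cannot absorb
```

Only added variety in the regulator can reduce this residual, which is one way to read "only variety can destroy variety" in biological and social regulation.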

5) aneural cognition is something that might guide information processing in a number of contexts. This has been explored further in another paper from the DevoWorm group [6] on the potential role of aneural cognition in embryos. It has also been explored in the form of the free-energy principle leading to information processing in plants [7]. Is cognition a unified theory of adaptive information processing? Now that's something to explore.

[1] A printable version can be downloaded from Figshare (doi:10.6084/m9.figshare.4725235).

[2] I experienced a nice discussion of this issue during a recent NSF-sponsored workshop. The bottom line is that while the variation typical of biology often makes the discovery of universal principles intractable, perhaps law discovery in biology simply requires a several-hundred-year investment in research (h/t Dr. Rob Phillips). For more, please see:

Phillips, R. (2015). Theory in Biology: Figure 1 or Figure 7? Trends in Cell Biology, 25(12), 1-7.

[3] Trevors, J.T. and Saier, M.H. (2010). Three Laws of Biology. Water Air and Soil Pollution, 205(S1), S87-S89.

[4] el-Showk, S. (2014). Does Biology Have Laws? Accumulating Glitches blog, Nature Scitable.

[5] Ruse, M.E. (1970). Are there laws in biology? Australasian Journal of Philosophy, 48(2), 234-246. doi:10.1080/00048407012341201.

[6] Stone, R., Portegys, T.E., Mikhailovsky, G., and Alicea, B. (2017). Origins of the Embryo: self-organization through cybernetic regulation. Figshare, doi:10.6084/m9.figshare.5089558.

[7] Calvo, P. and Friston, K. (2017). Predicting green: really radical (plant) predictive processing. Journal of the Royal Society Interface, 14, 20170096.