March 18, 2017

Almost time for GSoC Applications!

Your chance to join the DevoWorm group is almost upon us. If you are a student, the Google Summer of Code (GSoC) is a good opportunity to gain programming experience. Applications are being accepted from March 20 to April 3. If selected, you will join the DevoWorm group, and also have the chance to network with people from the OpenWorm Foundation and the INCF.

The best approach to a successful application is to discuss your skills, provide an outline of what you plan to do (which should resemble the project description), and then discuss your approach to solving the problems at hand. We are particularly interested in a demonstration of your problem-solving abilities, since many people will apply with a similar level of skill. You can find an application template in outline form here.

You can apply to work on one of two DevoWorm projects: "Physics-based Modeling of the Mosaic Embryo in CompuCell3D" or "Image processing with ImageJ (segmentation of high-resolution images)". If you have any questions, comment in the discussions or contact me directly.

March 15, 2017

A Tree of Deeper Experiences -- the Authorship Tree

One of the most difficult aspects of academic publishing with multiple authors is determining the order of authorship. In many fields, authorship order is the key to job promotion. Unfortunately, these conventions vary by field, while the criteria for authorship slots often vary by research group. Since a responsible accounting of contributions is key to determining authorship and authorship order [1], it is worth considering multiple possibilities for conveying this information.

Example of an Authorship list (with affiliations)

A mathematics or computer science researcher might also see the problem as one of choosing the proper representational data structure. The authorship order, no matter how determined, is a 1-dimensional queue (ordered list). Even though some publishers (such as PLoS) allow for footnotes (an inventory of author contributions), there is still little room for nuance.

Example from "The Academic Family Tree"

But is there a better way? Academic genealogies provide one potential answer. A typical genealogy can be thought of as a 1-dimensional order, from mentor to student. In reality, however, an academic may have multiple mentors and be influenced by a number of predecessors. The construction of academic family trees [2] is one step in this direction, turning the 1-dimensional graph into a 2-dimensional one.
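To make the difference between the two representations concrete, here is a minimal Python sketch. The names are hypothetical placeholders rather than entries from any real genealogy, and the `ancestors` helper is an illustrative assumption, not something taken from [2]. A single line of descent fits in an ordered list, while allowing several mentors per person calls for a mapping from each person to a list of mentors.

```python
# Hypothetical names, for illustration only.
# A 1-dimensional genealogy: each person has exactly one mentor.
lineage = ["Mentor A", "Mentor B", "Student"]

# A family-tree (graph) view: each person maps to any number of mentors.
mentors = {
    "Student": ["Mentor B", "Mentor C"],
    "Mentor B": ["Mentor A"],
    "Mentor C": ["Mentor A", "Mentor D"],
    "Mentor A": [],
    "Mentor D": [],
}


def ancestors(person, graph):
    """Return every predecessor reachable from `person`, as a sorted list."""
    seen = set()
    stack = list(graph.get(person, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return sorted(seen)


print(ancestors("Student", mentors))
```

The ordered list can only record one path to the student, whereas the mapping also recovers influences that arrive through a second mentor.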

Picture of the Authorship tree cover. COURTESY: "The Giving Tree" by Shel Silverstein

This is why Orthogonal Lab has just published a hybrid infographic/paper called The Authorship Tree [3]. This is a working document, so suggestions are welcome. The idea is to not only determine the relative scope of each contribution, but also to graphically represent the interrelationships between authors, ideas, and scope of the contributions.

As we can see from the example below, this includes not only our authors, but also people from the acknowledgements, reviewers, authors of important papers/methods, and funders. While the ordering of branches along the stem suggests an authorship order, the branches are actually ranked according to their degree of contribution [4]. As a result, the tree can show equivalent amounts of contribution, and can include minor contributors not normally included in an authorship list.

Example of an authorship tree (derived from original 1-D author list).
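For readers who think in data structures, the ranking idea can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the `Branch` fields, the point values, and the `rank_branches` function are hypothetical and are not the scheme used in The Authorship Tree [3] or the point system described in [4]. The sketch shows how branches with equal scores can share a rank, which a 1-dimensional author list cannot express.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    name: str
    role: str      # e.g. "author", "reviewer", "funder"
    points: int    # hypothetical contribution score
    ideas: list = field(default_factory=list)


def rank_branches(branches):
    """Group contributors by points, highest first; ties share a rank."""
    ranked = {}
    for branch in sorted(branches, key=lambda b: -b.points):
        ranked.setdefault(branch.points, []).append(branch.name)
    return list(ranked.values())


# Hypothetical contributors and scores, for illustration only.
tree = [
    Branch("Author 1", "author", 50, ["study design"]),
    Branch("Author 2", "author", 30, ["analysis"]),
    Branch("Author 3", "author", 30, ["writing"]),
    Branch("Reviewer 1", "reviewer", 5),
    Branch("Funder 1", "funder", 5),
]

print(rank_branches(tree))
```

Each inner list in the result is one level of the tree, so two authors with the same score sit side by side instead of being forced into second and third position.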

[1] Cozzarelli, N.R. (2004). Responsible authorship of papers in PNAS. PNAS, 101(29), 10495.

[2] David, S.V. and Hayden, B.Y. (2012). Neurotree: A Collaborative, Graphical Database of the Academic Genealogy of Neuroscience. PLoS One, 7(10), e46608. doi:10.1371/journal.pone.0046608.

[3] Orthogonal Lab (2017). The Authorship Tree. Figshare, doi:10.6084/m9.figshare.4731913.

[4] For more on the point system convention, please see: Venkatraman, V. (2010). Conventions of Scientific Authorship. Science Issues and Perspectives, doi:10.1126/science.caredit.a1000039.

March 4, 2017

Open Data Day Activities

Today is International Open Data Day, which was first proposed in 2010. To do my part, I will discuss a few open data-related items. Namely, what can you do to make this day a success?

Logo of the Open Knowledge Foundation (based in London), which offers a host of Open Data Day activities.

1) You can host some of your unpublished data (whether they are linked to publications or not) at an open data repository. You can do this through a general repository such as Dryad or Figshare, or a specialized repository such as Open fMRI [1].

* Another part of publishing data is the need for annotation and other metadata. This is a barrier to opening up datasets, but the benefits of doing so may outweigh the initial investments [2].

2) You can join an open access community, such as a new social media network that allows people to share datasets of all types and sizes.

3) You can commit to creating more systematic descriptions of your research methods (i.e. the things you do to create data). This can be done by creating a set of digital notes or protocol descriptions [3], and making them open through JupyterHub and a protocol-sharing platform [4], respectively.

4) You can host your own virtual Hackathon. Unsure as to how you might do this? Then you can earn any (or all) of a series of three badges (Hackathon I, Hackathon II, Hackathon III) created in conjunction with the OpenWorm Foundation.

5) You can petition or get involved with municipal and state/provincial governments to ensure their commitment to open public data.
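As a small illustration of the annotation step mentioned under item 1, here is a hedged Python sketch that assembles a metadata record to accompany a dataset. The field names, the `build_metadata` function, and the example values are assumptions chosen for illustration. They do not follow the required schema of Dryad, Figshare, or any other repository, so check the repository's own guidelines before depositing.

```python
import json


def build_metadata(title, creators, description, license_name, keywords):
    """Assemble a simple metadata record; field names are illustrative only."""
    record = {
        "title": title,
        "creators": creators,
        "description": description,
        "license": license_name,
        "keywords": sorted(set(keywords)),
    }
    # Refuse to produce a record with empty fields, since incomplete
    # annotation is what makes a dataset hard to reuse.
    missing = [key for key, value in record.items() if not value]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    return record


# Hypothetical example values.
metadata = build_metadata(
    title="Example embryo imaging dataset",
    creators=["Researcher One", "Researcher Two"],
    description="Raw images and segmentation masks from a pilot study.",
    license_name="CC-BY-4.0",
    keywords=["open data", "imaging", "open data"],
)

print(json.dumps(metadata, indent=2))
```

Writing the record at the same time as the data keeps the initial investment small, since the details are still fresh.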

Of course, there are other things you can do, and more innovation is needed in this area. Have some ideas, or planning an event of your own? Let me know, and I will invite you to the Orthogonal Lab's new Slack channel on Open Science.

[1] This choice, of course, depends on the field in which you are working. I used this example because fMRI data seems to have good community support for data sharing. Consult the Open Access Directory to learn more about the specifics for various disciplines.

For more information about data sharing in the field of neuroimaging, please see: Iyengar, S. (2016). Case for fMRI Data Repositories. PNAS, 113(28), 7699-7700.

[2] Based on a paper recently posted to bioRxiv and on some material from a recent talk. For more information, please see: Alicea, B. (2016). Data Reuse as a Prisoner's Dilemma: the social capital of open science. bioRxiv, doi:10.1101/093518.

[3] Olson, R. (2012). A short demo on how to use IPython Notebook as a research notebook. Randal S. Olson blog, May 12.

[4] For guidance on writing better and more accessible protocols, please see the following examples: (2017). How to make your protocol more reproducible, discoverable, and user-friendly. February 25.

Daudi, A. How to Write an Easily Reproducible Protocol. American Journal Experts, http://www.aje.com/en/arc/how-to-write-an-easily-reproducible-protocol/, Accessed February 27, 2017.