October 31, 2018

October: Geppetto month

Here is a recap of Geppetto Month at the OpenWorm Foundation. This content has been cross-posted from the OpenWorm Foundation blog (h/t Giovanni Idili).

 

OpenWorm is made up of many sub-projects, and “project of the month” is an effort to highlight a different OpenWorm sub-project every month. This month is Geppetto’s turn!


What is Geppetto?
Geppetto is a web-based visualisation and simulation platform for building neuroscience applications. Its first use case was OpenWorm itself (some lore: the virtual worm being Pinocchio, a Geppetto was needed to “make it”), but since then many groups have adopted it as their platform of choice. It is essentially a set of reusable components for simulation, visualisation and data aggregation that make it easier to develop a neuroscience application, be it a data portal or an entry point to external simulation engines.

Projects that currently make use of Geppetto as a platform:
OpenWorm uses Geppetto as an integration platform for the output of several of its subprojects, from connectome browsing to replaying integrated electrophysiology and fluid dynamics simulations.


Open Source Brain uses Geppetto to share, visualize and simulate neuronal models, both for individual neurons and networks.

Virtual Fly Brain is an ontology and 3D/2D morphology browser for Drosophila resources, built using Geppetto.

NetPyNE-ui is a user-friendly interface for creating and running neuronal models using the NetPyNE library.

Open Development
Geppetto development is entirely open source, like everything else that happens under the OpenWorm umbrella. There are open sprint meetings every two weeks that anybody can join, and we keep a public development board showing development activities and progress. You can browse the issues and see if there is anything you might want to try your hand at!

Resources
Here are some links if you want to learn more about Geppetto:

          Open access paper (Philosophical Transactions of the Royal Society B, 2018)

          Geppetto docs

          Geppetto live demo

          Development board

          Geppetto source code (GitHub)

          Geppetto Blog

          Geppetto on Twitter

Get involved!
Getting involved is easy: simply fill out the OpenWorm volunteer application form and we will invite you to the OpenWorm Foundation Slack. From there you can interact with the community and join the #geppetto channel if you are interested in learning more about Geppetto or getting involved as a contributor.

October 26, 2018

OAWeek 2018: Barriers to Practice

In our final OAWeek post, I will present the current barriers to "open" practice. While there are many potential barriers to living up to the principles of complete openness, four major factors shape whether people or institutions decide to be open. These include (but are not limited to): technology, finances, formal conventions, and the learning curve.



Technological. The past few years have seen a boom in innovations and digital tools that enable open access, open science, and open source. As the accompanying figure shows, all areas of the conventional scientific process have been touched by this revolution. Distribution, publishing, notetaking, bibliographies, and engaging the broader community have all been impacted by new tools and (more importantly) their adoption by a critical mass of scientists. The development of formal pipelines for organizing this proliferation of tools into actionable steps [1] has also been a technological advance. Despite this convergence, there is no single "killer app" that will solve the problem of openness. Nor should there be, as killer apps are often concentrated in the hands of single entities that are vulnerable to profiteering. Importantly, open-enabling technologies must be available to smaller research groups, particularly generators of smaller datasets [2], to get the most out of the scientific community's efforts.

101 Innovations in Scholarly Communication. ORIGINAL SOURCE: https://innoscholcomm.silk.co/  License: CC-BY.

Financial. While many tools are relatively cheap to use, other aspects of open science can be quite costly to individual scientists or even laboratories. In Wednesday's post on the three "opens", the various models of open access were discussed. Depending on which route to open access and/or open science is chosen, there are costs associated with manuscripts, data archiving, curation, and annotation. A successful "open" strategy should include a consideration of these costs to ensure sustainability over the long term. There are also issues with the cost and public funding of large-scale community resources such as open access journals, preprint servers, and data repositories, which must be solved without making their use unaffordable or (by extension) unavailable. One open question is the incentive structure for sharing resources and making them accessible. This is particularly true for datasets, which require incentives related to research efficiency, social prestige, and intellectual growth [3]. Such incentives can also help to reinforce higher reproducibility standards and overall levels of scientific integrity [4].

An example of a set of formal conventions chosen from a large number of potential tools. COURTESY: Nate Angell, Joint Roadmap for Open Science Tools. License: CC-0.

Formal Conventions. Another barrier to "open" is cultural practice. In moving from concept to finished product, we follow a set of internalized practices. While science requires much formal training, many scientific practices are taught implicitly during the course of laboratory and scholarly research. Several recent studies characterize openness as a matter of evolving norms [5, 6], which define openness in terms of collegiality and do not punish non-open endeavors. One critical aspect of encouraging open practices is education, and there does seem to be a generational shift in attitudes and educational opportunities surrounding open practices. This has occurred at the same time that information and computational technologies have emerged that encourage sharing and transparency. Whether this will change standards and expectations in a decade is unclear, although governments and funding agencies are now embracing open access and open science in ways they previously had not.

Learning curve as compared to the diffusion of innovations [7]. COURTESY: Wikimedia.

Learning Curve. With all of the potential tools and steps in making research open, there is a learning curve for both individual scientists and small organizations (e.g. laboratories). While the learning curve for some practices (e.g. preprint posting) is trivial, other "open" practices (e.g. transparent protocols and methods) require more commitment and formal training. The learning curve is one major factor in the difference between merely "making things open" and making things accessible. In the domain of open datasets, accessibility can be hampered by the fragmentation of resources across many obscure locations rather than a highly discoverable set of repositories with fixed identifiers [8]. There are two additional barriers to accessibility and/or practice adoption: difficulty of learning and cultural learning. Difficulty in learning a specific tool or programming language does affect whether open practices are adopted: the harder or more time-consuming a certain task is, the less likely the associated practice will be adopted. Cultural learning involves being exposed to a specific practice and then adopting that practice. This generally has little relation to difficulty, and depends more on personal and institutional preference. It is important to keep both of these in mind, both when adopting an "open" strategy and when setting expectations for members of the broader community.


NOTES:
[1] Toelch, U. and Ostwald, D. (2018). Digital open science: Teaching digital tools for reproducible and transparent research. PLoS Biology, 16(7), e2006022. doi:10.1371/journal.pbio.2006022.

[2] Ferguson, A.R., Nielson, J.L., Cragin, M.H., Bandrowski, A.E., and Martone, M.E. (2014). Big Data from Small Data: Data-sharing in the ‘long tail’ of neuroscience. Nature Neuroscience, 17(11), 1442-1448. doi:10.1038/nn.3838.

[3] Gardner, D. et al. (2003). Towards Effective and Rewarding Data Sharing. Neuroinformatics, 1(3), 289-295. AND Piwowar, H.A., Becich, M.J., Bilofsky, H., Crowley, R.S. (2008). Towards a Data Sharing Culture: Recommendations for Leadership from Academic Health Centers. PLoS Medicine, 5(9), e183. doi:10.1371/journal.pmed.0050183.

[4] Gall, T., Ioannidis, J.P.A., Maniadis, Z. (2017). The credibility crisis in research: Can economics tools help? PLoS Biology, 15(4), e2001846. doi:10.1371/journal.pbio.2001846.

[5] Pham-Kanter, G., Zinner, D.E., and Campbell, E.G. (2014). Codifying Collegiality: recent developments in data sharing policy in the life sciences. PLoS One, 9(9), e108451. doi:10.1371/journal.pone.0108451.

[6] Fecher, B., Friesike, S., and Hebing, M. (2015). What Drives Academic Data Sharing? PLoS One, 10(2), e0118053. doi:10.1371/journal.pone.0118053.

[7] Rogers, E. (1962). Diffusion of Innovations. Free Press of Glencoe, New York.

[8] Culina, A., Woutersen-Windhouwer, S., Manghi, P., Baglioni, M., Crowther, T.W., Visser, M.E. (2018). Navigating the unfolding open data landscape in ecology and evolution. Nature Ecology and Evolution, 2, 420–426. doi:10.1038/s41559-017-0458-2.

October 24, 2018

OAWeek 2018: Open Access, Open Science, Open Source

For this OAWeek post, we will discuss the connections between open access, open science, and open source. As an organizing principle, I will introduce each concept with a working definition, and then discuss relationships with other "open" concepts.


Open Access: availability to the general public, such that research output can be distributed freely without restrictions.

A typology of different forms of Open Access publishing.

As a publishing phenomenon, open access can take a number of forms [1, 2]. Aside from a distinction between peer-reviewed and non-peer-reviewed materials, Open Access publishing is color-coded as green (self-archiving) or golden (archival at the publisher's site for a fee) [3]. There is also a version of golden open access called diamond open access, the difference being that diamond open access does not require the author to pay a fee to the publisher [4]. Self-archival can be done through a personal server (website), a preprint site such as bioRxiv, or a site that allows for public hosting of documents (ResearchGate, Figshare). Golden open access usually requires an article processing charge (APC), the funds for which go to the publisher. While cheaper, self-archival requires adherence to a set of practices that ensure ease of access.

In a narrow sense then, open access is a publishing issue seemingly unconnected to open science and particularly open source. Yet in fact, open access is both critical to and an enabling factor in open science and open source. Aside from being made open (free or affordable), materials must also be made accessible. There are many other benefits to open access [5], but the most important is that it enables access to many different components of a set of scientific results.


Open Science: make research and data (scholarly outputs) publicly accessible. This requires efforts to make scholarly outputs transparent and accessible, which should enable reproducibility.


Open Science is an extension of open access in that not only is the manuscript made public, but the research products are made public as well [6, 7]. An open pipeline (or system) might include any number of the following: version-controlled manuscript editing, preprints, preregistration of study design, open datasets, demonstrable analyses, open source code, social media engagement, post-publication review, and open manuscript review. While it is up to the scientist or scientific organization what components to utilize, each component has value to both the scientist [8] and the scientific audience.

One way to make the benefits of being open explicit without violating the rights of scientists to their original work is to adopt an open license. While there are a number of options for both open science and open source, one popular type of license is Creative Commons (CC) [9]. There are many types of CC license, but one commonly used in open science is CC-BY (or alternatively CC-BY-NC). The BY license allows others to distribute and/or recombine your work with acknowledgement of the original author (you). BY-NC licenses explicitly disallow commercial derivatives.


A successful open science strategy is more than simply the production of science and the least publishable unit. Open science also includes access to educational materials, such as screencasts, lecture notes, and even course development [10]. As a suitable example, Open Science MOOC provides all of its course modules both as consumable lessons and as a GitHub repository of sharable lesson plans.


Open Source: make source code publicly available and editable. Software architecture is licensed so that it can be modified in collaborative fashion.

In many ways, open source (OS) can be considered a crucial component of open science, as the ability to collaboratively and transparently solve problems is a key part of the ethos. Yet open source has its own set of concerns surrounding project-building and the management of contributors. The development of open source software is not simply the production of free software, as there are significant version control and human resource issues that go into OS [11]. Open source projects (such as Wikimedia Foundation or Linux Foundation) tend to operate at a much larger scale than open science collaborations. In the case of hybrid open science/open source organizations (such as the OpenWorm Foundation), there are a number of management concerns that also draw from making research methods and data transparent.

Open Source provides not only an avenue to transparency, but also a tool for collaboration. An open source infrastructure that provides version control [12] and source code annotation in the public domain can serve to enable public discussion and encourage future development outside of a specific project or set of experiments. The ability to open up code used in analysis and simulation aids in the peer review process. For published methods, open source provides a means for people to improve upon and use the code base. Open source efforts such as the open hardware movement allow labs to share standardized plans for DIY lab equipment, lowering the costs of science.


NOTES:
[1] Jeffery, K.G. (2006). Open Access: an introduction. ERCIM News. https://www.ercim.eu/publication/Ercim_News/enw64/jeffery.html.

[2] Suber, P. (2012). Open Access. MIT Press, Cambridge, MA.

[3] Kienc, W. (2015). Green OA vs. Gold OA. Which one to choose? Open Science blog, June 3.

[4] Kelly, J.M. (2013). Green, Gold, and Diamond?: A Short Primer on Open Access. Jason M. Kelly blog, January 27.

[5] PLoS. Why Open Access? https://www.plos.org/open-access.

[6] Guide to Open Science Publishing. F1000Research.

[7] McKiernan, E.C., Bourne, P.E., Brown, C.T., Buck, S., Kenall, A., Lin, J., McDougall, D., Nosek, B.A., Ram, K., Soderberg, C.K., Spies, J.R., Thaney, K., Updegrove, A., Woo, K.H., and Yarkoni, T. (2016). How open science helps researchers succeed. eLife, 5, e16800. doi:10.7554/eLife.16800.001.

[8] Ali-Khan, S.E., Jean, A., MacDonald, E., Gold, E.R. (2018). Defining Success in Open Science. MNI Open Research, 2, 2. doi:10.12688/mniopenres.12780.

[9] Creative Commons. About the licenses. https://creativecommons.org/licenses/

[10] Jhangiani, R. and Biswas-Diener, R. (2017). Open: the philosophy and practices that are revolutionizing education and science. Ubiquity Press. doi:10.5334/bbc.

[11] Fogel, K. (2017). Producing Open Source Software: how to run a successful free software project. Version 2.3088. http://producingoss.com/

[12] Blischak, J.D., Davenport, E.R., and Wilson, G. (2016). A Quick Introduction to Version Control with Git and GitHub. PLoS Computational Biology, 12(1), e1004668. doi:10.1371/journal.pcbi.1004668.

October 22, 2018

Welcome to Open Access Week 2018!

Welcome to Open Access Week! Orthogonal Research and Education Laboratory is contributing to the week's activities through three blogposts: in this post, we will briefly discuss Open Annotation, while Wednesday will feature "Open Access, Open Science, and Open Source" and Friday will feature "Barriers to Practice".


Synthetic Daisies blog celebrated Open Access Week in 2016 (Working with Secondary Datasets, How Am I Doing, Altmetrics?) and 2017 (Version-Controlled Papers, Open Project Management). All posts will be tagged with #OAweek for easy retrieval.

To kick off the discussion, we will now quickly discuss Open Annotation and the role it can play in enabling literature searches, peer-review, and collaboration. Two of the most well-known open annotation tools are Hypothes.is and Fermat's Library. A few posts from the Hypothes.is blog serve to establish the benefits and potential of open annotation and how it is currently being implemented on the web.

According to [1], open annotation can serve as a framework for new practices such as collective document review. This is a common function of collaborative document systems such as Overleaf and Authorea. However, the Hypothes.is vision seems to be building a so-called "ecosystem" for commenting that can be used for peer review, reader notes, or links to relevant additional readings [1, 2]. In such a system, comments can be transferred across versions of a document, from draft to preprint to published manuscript [1].

Under the hood, open annotation relies upon standards such as the W3C Open Annotation data model. Once implemented, this allows for a separation of the discussion (annotations) from the main page [2]. This provides opportunities for meta-browsing [3] and distributed discussion threads that can be centralized in a common repository. There are also many opportunities for novel uses of open annotation, ranging from collaborative note-taking to adding references and data to an existing paper.
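To make the separation between discussion and page concrete, here is a minimal sketch of what a standalone annotation record might look like. It assumes the vocabulary of the W3C Web Annotation Data Model (`Annotation`, `TextualBody`, `TextQuoteSelector`); the URLs, the comment text, and the helper names `make_annotation` and `retarget` are hypothetical and chosen only for illustration. It is not a description of how Hypothes.is stores annotations internally.

```python
import copy
import json

def make_annotation(target_url, exact_text, comment):
    """Build a minimal annotation that lives apart from the page it discusses."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body holds the discussion itself.
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        # The target points at the document and the quoted passage.
        "target": {
            "source": target_url,
            "selector": {"type": "TextQuoteSelector", "exact": exact_text},
        },
    }

def retarget(annotation, new_url):
    """Return a copy of the annotation pointed at another version of the document."""
    moved = copy.deepcopy(annotation)
    moved["target"]["source"] = new_url
    return moved

# Hypothetical URLs for a preprint and its published version.
note = make_annotation(
    "https://example.org/preprint-v1",
    "open annotation",
    "Could this claim cite a source?",
)
published_note = retarget(note, "https://example.org/published")
print(json.dumps(published_note, indent=2))
```

Because the record only references the page by URL and quoted text, the same comment can in principle follow a document from draft to preprint to published manuscript, which is the kind of transfer described in [1].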

NOTES:
[1] Staines, H. (2017). Making Peer Review Transparent with Open Annotation. Hypothes.is blog, http://web.hypothes.is/blog/transparent-peer-review.

[2] Gerben (2014). Supporting Open Annotation. Hypothes.is blog, https://web.hypothes.is/blog/supporting-open-annotation/.

[3] Wiesman, F., van den Herik, H.J., and Hasman, A. (2004). Information retrieval by metabrowsing. Journal of the American Society for Information Science and Technology, 55(7), 565-578.

September 30, 2018

Finding Your Inner Modeler (Part II)

Last year, I attended a workshop at the University of Illinois-Chicago called "Finding Your Inner Modeler". Sponsored by the NSF, FYIM is meant to bring together biologists and modelers and to foster collaborations between the two. There were many interesting talks over the course of two days, covering plant biology, biochemical kinetics, and (of course) various types of computational and statistical models [1].

This year was the second installment of FYIM, and this time I was chosen for a platform presentation. The platform presentation (Process as Connectivity: models of interaction in cellular systems) involves a 40-minute discussion between the principal investigator and an expert modeler. For my talk, this expert modeler was Dr. Eric Deeds from the University of Kansas.



The talk featured work with several collaborators, including work from the DevoWorm group. In the talk, I described the DevoWorm group as an example of data science biology [2]. As an affiliate of the OpenWorm Foundation [3], the DevoWorm group works with primary and secondary data, and produces secondary and tertiary open datasets that serve as material for publications, student projects, and the wider development/computational biology communities.



The core innovation introduced in this talk is the use of graph theory and complex networks to analyze the organizational structure of the embryonic phenotype. This work is now showcased in a new paper [4] and GitHub repository.
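As a rough illustration of what treating an embryo as a network can mean, the sketch below connects cells whose centroids lie within a distance threshold and then counts each cell's connections. The cell names, coordinates, threshold, and the functions `spatial_network` and `degree` are all hypothetical; this is one simple way to build a spatial graph and should not be read as the method used in [4].

```python
from itertools import combinations
from math import dist

# Hypothetical 3D centroids (arbitrary units) for a handful of cells.
cells = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0),
    "D": (5.0, 5.0, 5.0),
}

def spatial_network(positions, threshold):
    """Connect every pair of cells whose centroids lie within `threshold`."""
    edges = set()
    for a, b in combinations(sorted(positions), 2):
        if dist(positions[a], positions[b]) <= threshold:
            edges.add((a, b))
    return edges

def degree(edges, nodes):
    """Count how many edges touch each node."""
    counts = {n: 0 for n in nodes}
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

edges = spatial_network(cells, threshold=1.5)
degrees = degree(edges, cells)
print(sorted(edges))
print(degrees)
```

In this toy example the three nearby cells form a fully connected cluster while the distant cell is isolated, so simple graph measures such as degree already separate the two.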




NOTES:
[1] One example is the Virtual Cell software project, which allows one to model and analyze representations of kinematics, kinetics, geometry, and network interactions at the cellular level.

[2] Alicea, B., Gordon, R., and Portegys, T.E. (2018). DevoWorm: data-theoretical synthesis of C. elegans development. bioRxiv, doi:10.1101/282004.
 
[3] Sarma, G.P., Lee, C-W., Portegys, T., Ghayoomie, V., Jacobs, T., Alicea, B., Cantarelli, M., Currie, M., Gerkin, R.C., Gingell, S., Gleeson, P., Gordon, R., Hasani, R.M., Idili, G., Khayrulin, S., Lung, D., Palyanov, A., Watts, M., Larson, S.D. (2018). OpenWorm: overview and recent advances in integrative biological simulation of Caenorhabditis elegans. Philosophical Transactions of the Royal Society B, 373, 20170382. doi:10.1098/rstb.2017.0382.

[4] Alicea, B. and Gordon, R. (2018). Cell Differentiation Processes as Spatial Networks: identifying four-dimensional structure in embryogenesis. BioSystems, doi:10.1016/j.biosystems.2018.09.009.


