October 26, 2017

Open Access Week 2017: Version-Controlled Papers

The subject of a recent workshop [1], the next-generation scientific paper will include digital tools that formalize things such as version control and data sharing/access. Orthogonal Laboratory is developing a method for version-controlled documents that integrates formatting, bibliographic aspects, and content management. While this is not a novel approach to writing and composition [2], this post will cover how to apply a version-controlled strategy to presenting a scientific workflow. Below are brief sketches of our system for generating next-generation papers.

The first element is the process through which a document is generated, styled, and published (assigned a unique digital object identifier, or doi):


The key element of our system is a version control repository. We are using Bitbucket, but Github or more specialized platforms such as Authorea or Penflip might also be sufficient. The idea is to build documents using the Markdown language [3], then incorporate stylistic elements using CSS and HTML. VScode is used to manage spellcheck and grammar in the Markdown documents (containing the authored content). Reference management is done via Zotero, but again, any open source alternative will do.

The diffs function [4] of version control can be used to operate on final versions of Markdown files for the purpose of alternating between document versions. The idea is to not only find a consensus between collaborators, but to use branches strategically to push alternative versions of content to the doi as desired. This combinatorial editing framework could be desirable in appealing to different audiences or stressing specific aspects of the work at different points in time. Note that this is distinct from the editorial function of pulls and merges, which are meant to be more "under the hood".
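As a minimal sketch of what a diff between two document versions looks like, the following uses Python's standard difflib module rather than the repository's own diff tool. The file names and sample text are hypothetical and are not taken from our repository.

```python
import difflib


def markdown_diff(old_text, new_text, old_name="paper_v1.md", new_name="paper_v2.md"):
    """Return a unified diff (as a list of lines) between two Markdown versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


# Two hypothetical versions of the same abstract, e.g. taken from two branches.
version_a = "# Abstract\nWe present a workflow.\nResults are preliminary."
version_b = "# Abstract\nWe present a version-controlled workflow.\nResults are preliminary."

diff_lines = markdown_diff(version_a, version_b)
print("\n".join(diff_lines))
```

Lines prefixed with "-" appear only in the first version and lines prefixed with "+" only in the second, which is the information needed to choose between alternative versions of a passage.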


Pandoc serves as a conversion tool, and can style documents according to particular specifications. This includes conventions such as APA style, or document formats such as LaTeX or pdf [5]. Additional components include code and data repositories, supplemental materials, and post-publication peer review.
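The conversion step can be sketched as a small Python wrapper around the Pandoc command line. This is an illustration under stated assumptions, not our exact build script: the file names (paper.md, refs.bib, apa.csl, lab-style.css) are hypothetical, the bibliography is assumed to be a BibTeX export from Zotero, and PDF output additionally requires a LaTeX engine to be installed.

```python
import os
import shutil
import subprocess


def pandoc_command(source, output, bibliography=None, csl=None, css=None):
    """Build a Pandoc command line as a list of arguments."""
    cmd = ["pandoc", source, "--standalone", "-o", output]
    if bibliography:
        # --citeproc needs Pandoc 2.11 or later; older releases used
        # "--filter pandoc-citeproc" instead.
        cmd += ["--citeproc", "--bibliography", bibliography]
    if csl:
        # A CSL file selects the citation style (e.g. APA).
        cmd += ["--csl", csl]
    if css:
        # A stylesheet only applies to HTML output.
        cmd += ["--css", css]
    return cmd


# Hypothetical file names, for illustration only.
html_cmd = pandoc_command("paper.md", "paper.html",
                          bibliography="refs.bib", csl="apa.csl", css="lab-style.css")
pdf_cmd = pandoc_command("paper.md", "paper.pdf",
                         bibliography="refs.bib", csl="apa.csl")
print(" ".join(html_cmd))
print(" ".join(pdf_cmd))

# Only run the conversion when Pandoc and the input file are actually present.
if shutil.which("pandoc") and os.path.exists("paper.md"):
    subprocess.run(html_cmd, check=True)
```

Keeping the command in a function means the same Markdown source can be rendered to several styled outputs from whichever branch is checked out.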

Orthogonal Lab generally uses a host such as Figshare to generate dois for such content, but there are other hosts that generate version-specific dois as well. It is worth noting that Github-hosted academic journals are beginning to appear. Two examples are ReScience and the Journal of Open Source Software. What we are providing (for our community and yours) is a means to generate styled documents (technical papers, blogposts, formal publications) in a version-controlled format. This also means papers can be dynamic rather than static: content at a given doi can be updated as desired.


NOTES:
[1] Perkel, J. (2017). C. Titus Brown: Predicting the paper of the future. Nature TechBlog, June 1.

[2] Eve, M.P. (2013). Using git in my writing workflow. August 18. Also, much of this functionality is accessible in Overleaf using TeX and a GUI interface.

[3] Cifuentes-Goodbody, N. (2016). Academic Writing in Markdown. YouTube. AND Sparks, D. and Smith, E. Markdown Field Guide, MacSparky.

[4] Diffs are also useful in comparing different versions of a published document as events unfold. Newsdiffs performs this function quite nicely on documents containing unfolding news.

[5] A few references for further reading:

a) Building your own Document Processor Tools:
Building Publishing Workflows with Pandoc and Git. Simon Fraser University Publishing.

b) Git + Diffs = Word Diffs:
Diff (and collaborate on) Microsoft Word documents using GitHub. Ben Balter blog.

c) Using Microsoft Word with Git. Martin Fenner blog.

October 24, 2017

Open Access Week 2017: Open Project Management

To kick off the open fun for this year, we will start off with a short discussion on open project management. Although people should think of this in a tool-free manner, we will address broad principles using Slack and Open Science Framework (OSF).


Welcome to the Orthogonal Lab Slack space! Contact if you are interested in joining.

Slack as a laboratory group tool: I began using Slack several years ago when the OpenWorm Foundation started using it to facilitate shared communication and manage new members. Since then, it has become increasingly popular as a laboratory personnel and collaborative management tool [1]. I started the Orthogonal Lab Slack about a year ago, and it has been useful for disseminating intragroup messages, news, media, and short presentations. This is especially good for academic collaborations, particularly when the group members are not co-located [2].

Once your group has a Slack space (with a URL such as your-group.slack.com), you must a) create channels, and b) recruit members. Whether your group is large or small, Slack seems to scale well in most cases. Each channel is thematic, and allows for parallel communication between channel members. Media (files, images, links) can be shared with ease, and private messages are also possible. Additional functionality is possible through the use of bots (e.g. time-management tools such as todobot or slackodoro). In many ways, Slack is an alternative to the e-mail chain. However, integration with other platforms (such as Twitter or Skype) is also possible.
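As a rough sketch of how a simple bot-like integration might post to a channel, the following builds a request for a Slack incoming webhook, which accepts a JSON payload with a "text" field. The webhook URL and the message are placeholders (a real URL is issued by Slack when a webhook is created), and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL and message, for illustration only.
WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"
request = build_slack_request(WEBHOOK_URL, "Lab meeting notes have been posted.")
print(request.get_method(), request.full_url)

# To actually post the message, one would call:
# urllib.request.urlopen(request)
```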

An infographic on Slack productivity in the academic workplace, courtesy of Paperpile.


COURTESY: Using OSF at the University of Notre Dame. YouTube.

Open Science Framework (OSF) as project pipeline and showcase: I have been using OSF for storing work at the project level for exposition to potential funders and other interested parties. More generally, OSF is used to promulgate both the progression and replicability of research projects [3]. From a technical perspective, OSF features version control (using Git), doi creation, and storage space for papers, presentations, and data. It also offers an API, an open dataset on research activities, and a portal called Thesis Commons for theses and dissertations. Using the OSF project structure, you can also store datasets and digital notebooks, and link to Github-hosted code.
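To give a flavor of the API, here is a hedged sketch of reading project metadata. It assumes the public OSF API (version 2) at api.osf.io, which returns JSON:API-style documents. The node id "abc12" and the sample response are made up for illustration, and no network call is made.

```python
import json

OSF_API_BASE = "https://api.osf.io/v2"


def node_url(node_id):
    """Return the API address for a project (a "node" in OSF terms)."""
    return f"{OSF_API_BASE}/nodes/{node_id}/"


def summarize_node(response_text):
    """Extract a few fields from a JSON:API-style node response."""
    data = json.loads(response_text)["data"]
    attributes = data["attributes"]
    return {
        "id": data["id"],
        "title": attributes.get("title"),
        "public": attributes.get("public"),
    }


# A made-up response standing in for what the API would return.
sample_response = json.dumps({
    "data": {
        "id": "abc12",
        "type": "nodes",
        "attributes": {"title": "Example Project", "public": True},
    }
})

print(node_url("abc12"))
summary = summarize_node(sample_response)
print(summary)
```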

Potential destinations for objects of the OSF workflow. COURTESY: Ref [4].

The OSF offers a means to manage all scales of research output. Artem Kaznatcheev has provided an informal taxonomy of research output types as well as their scale of importance. According to this view, examples of these scales include the following: standard (blog), kilo (conference pubs), giga (journal pubs), and tera (book/thesis) scales. Although arbitrary in terms of content, these scales might more closely define the number of hours invested in creating a particular type of research document. OSF projects can include combinations of research output types to provide a richer window into the research process.

Steps in developing research (or, how to get to research outputs). COURTESY: Visual.ly


NOTES:
[1] Some examples include:
a) Slack inside the MacArthur Lab. SlackHQ blog, April 27 (2015).

b) Washietl, S. (2016). Six ways to streamline communication in your research group using Slack. Paperpile blog, April 12.

c) Perkel, J.M. (2016). How Scientists Use Slack. Nature News, 541, 123. Managing organizational to-do lists in Slack.

[2] OpenWorm Slack has a bi-weekly event called Office Hours where people meet and have topical conversations. Join us via Slack Pass if you are interested.

[3] Foster, E. and Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association: JMLA, 105(2), 203-206.

[4] Anonymous (2016). Response from COS. Medium, April 2.

October 23, 2017

Open Access Week 2017!

Welcome to Open Access Week 2017! Synthetic Daisies participated in Open Access Week 2016 with two instructional posts on Altmetrics and Secondary Datasets. On Twitter, several hashtags (#oaweek, #OpenAccess, #OpenScience, and #OpenData) will be full of related content over the next few days. And we will have longer posts on Tuesday and Thursday on the topics of Open Project Management and Version-Controlled Papers that will be worth reading.


Over the last year, the OpenWorm Foundation and Orthogonal Laboratory made a commitment to open access instruction in a series of microcredentials (digital badges). The OpenWorm badge system offers a series of badges on Literature Mining specifically and Open Science more generally. The Orthogonal Lab badge system offers a series of badges on Peer Review. Have a productive week!

October 2, 2017

Pseudo-Heliocentric Readership Information in Gravitationally Bound Form

Or, how to get 300,000 reads by being persistent [1] and getting results in unexpected places. Let's review our milestones in three cartoons.






The made-up planetary orbits featured here [2] may violate the physics of actual solar system orbits, at least as simulated by Super Planet Crash [3].


NOTES:
[1] Candy, A. (2011). The 8 Habits of Highly Effective Bloggers. Copyblogger, October 25.

[2] Previous readership milestones, in order of distance from central star: 20000, 50000, 100000 (first image), 120000, 150000 (second image), 200000, 250000 (third image).

[3] Featured in the Scientific Bytes and Pieces, August 2015 post.


September 21, 2017

An Infographical Survey of the Bitcoin Landscape


Josh Wardini sent me information on a new Bitcoin infographic that serves as a survey of events over the last 10 years in the world of Bitcoin development and legal regulation. There are many interesting factoids in this graphic, some of which were unknown to me. In the next few paragraphs, I will discuss the impressions that each subset of factoids left on me.




The relationship between blockchain and mining is an interesting one, and underscores the power of blockchain as both a data structure and a secure transaction system. Bitcoin is also its own economic system, complete with social interactions. In particular, the competitive and cooperative aspects of cryptocurrency can serve as a model for understanding the social structure of markets.







This is another interesting feature of bitcoin: the network has computational power to both unlock the value of existing blockchain as well as to create new currency. Bitcoin mining has always been a bit of a black box to me [1], but it seems as though it has potentially two roles in the bitcoin economy. In a Synthetic Daisies post from 2014, I mentioned that the supply of bitcoin is fixed (in the manner of a precious metals supply), but it turns out that it is not that simple. Of course, since then blockchain technology has become the latest hot emerging technology in a number of areas unrelated to Bitcoin and even the digital economy [2].
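The nominally fixed supply can be illustrated with a little arithmetic. The sketch below assumes the commonly documented Bitcoin parameters, none of which come from the infographic: an initial block reward of 50 bitcoins, a halving of that reward every 210,000 blocks, and a smallest unit of one hundred-millionth of a bitcoin.

```python
SATOSHI_PER_BTC = 100_000_000           # assumed smallest unit per bitcoin
INITIAL_REWARD = 50 * SATOSHI_PER_BTC   # assumed initial block reward
HALVING_INTERVAL = 210_000              # assumed number of blocks between halvings


def total_supply_satoshi():
    """Sum every block reward until integer halving drives the reward to zero."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += reward * HALVING_INTERVAL
        reward //= 2  # integer halving; fractions of the smallest unit are dropped
    return total


supply_btc = total_supply_satoshi() / SATOSHI_PER_BTC
print(f"Total supply under these assumptions: {supply_btc:,.4f} bitcoins")
```

Under these assumptions the total lands just below 21 million. The cap follows from the reward schedule, which is a convention of the network rather than a property of the blockchain data structure itself.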



It turns out that computational systems (unlike people) are not all that hard to understand. However, digital currency, which is based on human systems, is much harder to understand (or at least fully appreciate). In 2013, I did a brief Synthetic Daisies mention of a flash crash on one of the main Bitcoin exchanges. There is a lot of opportunity to use blockchain and even perhaps cryptocurrency in the world of research. If ways are found to make these technologies more easily scalable, then they might be applied to many research problems involving human social systems [3].


NOTES:
[1] So I sought out a few introductory materials on Bitcoin mining to clarify what I did not know: 

a) startbitcoin (2016). Beginner's Guide to Mining Bitcoins. 99 Bitcoins blog, July 1.

* mining consists of discovering blocks in the blockchain data structure, the discovery of which is rewarded through a "bounty" of x bitcoins. From there, inequality emerges (or not).

b) Mining page. Bitcoin Wiki.

* the total number of blocks is agreed to by the community, as is the total amount of computational power of the network. This makes the monetary supply nominally fixed, but this is not required by the technology.

c) Hashcash Algorithm page. Bitcoin Wiki.

Despite the clear metaphoric overtones, Bitcoin mining is essentially like breaking encryption in that it requires a massive amount of computing power thrown at a computationally hard problem, but it also has elements of an artificial life model (e.g. competition for blockchain elements).
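The "computationally hard problem" can be made concrete with a toy, Hashcash-style proof of work. This is a simplified sketch and not Bitcoin's actual procedure (which, among other differences, hashes a structured block header twice and compares the result to a numeric target). The block text and difficulty value are hypothetical.

```python
import hashlib


def mine(block_data, difficulty=3):
    """Find a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Hypothetical block contents; each extra zero digit multiplies the expected work by 16.
nonce, digest = mine("hypothetical block of transactions", difficulty=3)
print(nonce, digest)
```

Finding the nonce takes many attempts, but checking it takes a single hash, which is why miners compete on raw computing power while everyone else can cheaply verify the result.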

Water-cooled rigs probably maximize your investment margin....

[2] Of course, there has been innovation in the use of blockchain for Bitcoin and more general cryptocurrency transactions. For more, please see:

Portegys, T.E. (2017). Coinspermia: a cryptocurrency unchained. Working Paper, ResearchGate, doi:10.13140/RG.2.2.33317.91360.

Brock, A. (2016). Beyond Blockchain: simple scalable cryptocurrencies. Metacurrency project blog, March 31.

[3] A few potential examples:

a) Data Management. 1  2



