Contributing to open science – a computational biologist’s perspective

The landscape of how we communicate science is changing rapidly. Microbiome prizewinner Dr Jacqueline Meisel attended the week-long FORCE11 Scholarly Communications Institute this summer to learn more about reproducibility, pre- and post-publication peer review, and open data visualization. Her aim was to discover what practices she needed to implement in her own research to contribute to open science.

FORCE11 (the Future of Research Communications and e-Scholarship) is a community dedicated to improving scientific and scholarly communication practices. Recently, it organized the FORCE11 Scholarly Communications Institute (FSCI 2018), a training course held at the University of California San Diego from July 30 to August 3. FSCI 2018 was attended by researchers, librarians, publishers, and others from multiple disciplines across the sciences and humanities.

Participants signed up for three courses on topics including research reproducibility, detection of questionable publishing practices, peer review, the cost of open access, and more. As a computational biologist, I attended the workshop to learn more about the Scholarly Communications field and what practices I can implement in my own research to help contribute to open science.

Reproducibility

My first class focused on “Reproducible Research Reporting and Dynamic Documents with Open Authoring Tools: Toward the Paper of the Future”. Even though scientific communication has evolved significantly since the first journal publications roughly 400 years ago, authors are still writing static documents to fit within the restrictions of traditional article formats. This is often not the best way to present the large datasets we generate today.

We discussed best practices of literate programming and the benefits of having authors combine code and text into interactive documents, or research compendiums. We examined tools like RStudio that help researchers perform analyses in one centralized location and update reports as they collect data. We also used Author Carpentry resources from Caltech that extend Software and Data Carpentry pipelines to cover open, reproducible science writing and publishing options.
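As a minimal illustration of the idea (not taken from the course materials), an R Markdown file like the hypothetical sketch below interleaves narrative text with executable R chunks; the file and column names are invented for the example.

````
---
title: "Sample diversity report"
output: html_document
---

Read counts are summarized below; the figure is rebuilt from the current
version of the data every time the document is knit.

```{r read-counts}
# Hypothetical sample table with one row per sequenced sample
samples <- read.csv("sample_counts.csv")
hist(samples$reads,
     main = "Reads per sample",
     xlab = "Read count")
```
````

Knitting the file in RStudio re-runs every chunk, so the figures and numbers in the report cannot drift out of sync with the underlying data.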

Some current obstacles to the production and publication of dynamic documents include the need to better define what compendiums in each field should contain, how to cite the datasets, software, and articles within them, and how to archive data and maintain stable versions for publication and copyright. Another concern is that the half-life of computational tools is relatively short, and resources utilized today may be obsolete tomorrow. Containerized solutions that preserve runtime environments were noted as one potential solution for the future replication of analyses.
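As one hedged sketch of what such a containerized solution might look like (the base image version, package choice, and file name are my own assumptions, not something prescribed at the workshop), a Dockerfile can pin the R toolchain used to render a report:

```dockerfile
# Sketch of a containerized runtime environment for a dynamic document.
# Pinning the rocker/verse image fixes the R version, pandoc, and the
# core publishing packages so the analysis can be re-run later unchanged.
FROM rocker/verse:4.3.1

# Install any additional analysis packages at build time (hypothetical choice).
RUN R -e "install.packages('vegan')"

# Copy the report source into the image and render it by default.
COPY diversity_report.Rmd /home/analysis/
WORKDIR /home/analysis
CMD ["R", "-e", "rmarkdown::render('diversity_report.Rmd')"]
```

Anyone with the image can then reproduce the rendered report with a single docker run, even after the packages on their own machine have moved on.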

Pre- and Post-Publication Peer Review

In the afternoon “Pre- and Post-Publication Peer Review: Perspectives and Platforms” sessions, we discussed the strengths and weaknesses of common peer review practices. Introduced in the late 1960s, peer review was initially motivated by the need for more manpower to process a backlog of submissions and to improve the quality of publications. Today, peer review helps establish the validity of papers, provides a venue for feedback, and enables editors to select research findings for publication, but it is also slow, time-consuming, expensive, and prone to bias.

One problem we observed in peer review is the discrepancy between how it is described and how it is actually carried out, which varies depending on the field or journal conducting the review, who writes the reviews, what is assessed, and when and how it is done. I most often think of peer review as the formal process pursued after submitting a journal article, but we also considered the importance of informal peer review, including posts on Twitter and other social media and feedback from colleagues.

There were many suggestions for how we can improve peer review. For instance, the quality of peer review might increase if we provide more training and clearer guidelines to all reviewers. Additionally, journals could use professional groups or software to perform pre-checks on readability and completeness before sending articles to reviewers to assess scientific merit. We could also increase the number of registered reports, where studies are approved for publication before they are conducted, based on the soundness of their methodology. To incentivize peer review, we discussed ways of giving reviewers credit, including through sites like Publons.

Open Data Visualization

In my last class, “Open Data Visualization: Tools and Techniques to Better Report Data”, we outlined the pros and cons of different data visualization tools. Among the most popular tools, people appreciated open source options with large communities of users for support and the ability to make specialized plots. However, these more dynamic tools often come with steep learning curves. We examined the different steps involved in generating, visualizing, interpreting, and disseminating data. Finally, we used the open source tool Apache Superset to visualize some temporal data.
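Superset itself is configured through its web interface rather than code, so as a stand-in illustration of open-source temporal plotting, here is a minimal R/ggplot2 sketch with invented data:

```r
# Minimal sketch of a temporal visualization in ggplot2 (not Apache
# Superset); the monthly counts below are purely illustrative.
library(ggplot2)

readings <- data.frame(
  month = seq(as.Date("2018-01-01"), by = "month", length.out = 12),
  count = c(12, 18, 25, 30, 28, 35, 40, 38, 42, 45, 50, 48)
)

ggplot(readings, aes(x = month, y = count)) +
  geom_line() +
  geom_point() +
  labs(title = "Samples processed per month (hypothetical)",
       x = "Month", y = "Samples")
```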

My week-long sabbatical in Scholarly Communications encouraged me to be more aware of how I share my own research. This includes thinking about how I store and analyze my data, how I provide access to the raw data and source code, what types of formal and informal peer review I seek out to improve my work, and which journals I submit my work to (and whether they are open access!). Open science is the way of the future, and programs like FSCI 2018 are working to make it a reality.
