The discussion was led by a panel including open data advocate Ross Mounce, who acted as moderator; Alan Hyndman from figshare; Michael Markie from F1000 Research; Andrew Hufton from Scientific Data; Tom Pollard, a PhD student at University College London; and our own Amye Kenall. The event was accompanied by lots of Twitter activity under the dedicated hashtag #OpenDataSpotlight, nicely summarized by Laura Wheeler in a Storify.
“The best thing to do with your data will be thought of by someone else.”
This thought by Rufus Pollock may be inspiring to some, but scary to others. Many scientists fear that by sharing their data they might get ‘scooped’, or be perceived as less smart than a researcher who makes a novel finding from their data that they themselves missed.
Research has shown that those who share data tend to get more citations for their articles. Alan Hyndman presented some inspiring examples of unforeseeable benefits of sharing data that included creating a map of Mythical Creatures of Europe and 3D printing of dinosaur bones.
Open data is a new frontier in open science
While open access publication of research results is now widely accepted, there are still many challenges to making data truly open. A very good question raised was: do we value data as a research product? It seems that for many, data becomes valuable only when associated with a published paper.
Instead of mandating open data and hoping that scientists will comply, we need to focus on the benefits of sharing data, and make sure that the right incentives are in place. At the forefront is proper citation and credit for generating data. Tom Pollard noted that this alone could prevent many young scientists from leaving the academic research environment due to a lack of article publications on their CVs.
Tom also identified some of the challenges in working with data: collecting it is hard, it is often held in silos, and it can be difficult to find and reuse.
What can publishers do to support open data?
Andrew Hufton pointed out that there is a role for publishers to show that data is a valuable resource. There are an increasing number of data journals, including GigaScience, Scientific Data, F1000Research and others. Authors are in a good position to make an informed choice of the best home for their data and research based on their publishing needs.
Both Andrew Hufton and Amye Kenall emphasized the importance of quality in data sharing, and the role of dedicated data editors and data curators in preserving the data, enabling its reuse and ensuring the authors get credit via data citations.
BioMed Central, PLoS, ORCiD, Digital Science and the Wellcome Trust have partnered with Mozilla Science Lab to introduce a digital credentialing system, or badges, to recognize the contributorship of those who curated the data, designed the methodology, developed software, performed the statistical analysis and so on.
This year we are celebrating the 350th anniversary of scientific publishing, but journals seem to be essentially unchanged. As Michael Markie commented, the time has come for the article to fit the science, not the other way round. Open data, and how it is used, gives us the chance to make the most of this opportunity.
“I have begun to think that no one ought to publish biometric results, without lodging a well-arranged and well-bound copy of his data in some place where it should be accessible, under reasonable restrictions, to those who desire to verify his work.” Galton F. Biometry. Biometrika, 1901; 1(1): 7-10.
Galton’s suggestion of a data store was later revived by Professor Julian Huxley, and a suggestion was made to store measurements in the British Museum of Natural History.