Introduction
“One reason that the worldwide web worked was because people reused each other’s content in ways never imagined or achieved by those who created it. The same will be true of open data.”
– Tim Berners-Lee and Nigel Shadbolt, The Times, New Year’s Eve 2011
As part of our commitment to reproducible research and transparency, BioMed Central has partnered with LabArchives towards the shared goal of making datasets that support peer-reviewed publications available, permanently linked to the online publications, and released under terms which permit free reuse, as Open Data.
A growing number of repositories for scientific datasets – which with persistent identification can be cited in and linked from published articles – are available but many fields still lack an obvious repository. There has been debate about whether institutional or subject-specific repositories (and journals) are the best solution for data archiving and publishing. But what is absolutely clear is that, for data, one size does not fit all – literally and metaphorically.
When available, data repositories are usually the best place for larger datasets which cannot be included as additional files, and we have been working to increase awareness of repositories of interest to our authors. But some scientists who are willing to share data may, understandably, be reluctant to deposit data in a repository with which they are not familiar, which cannot guarantee permanence, or which has suboptimal or ambiguous licensing terms.
LabArchives, an Electronic Laboratory Notebook, enables individual scientists to manage, share and publish data files, providing an accessible platform for sharing and publication which is controlled by authors themselves. LabArchives is web-based, and may also be installed on a local server, enabling a user to access their laboratory documents, protocols, notes and data from any location (you can read more about the features of LabArchives on their website).
As part of this partnership, all BioMed Central authors are entitled to an enhanced free version of LabArchives. This ‘BioMed Central Edition’ of the software offers more storage capacity than the standard free edition, integrated manuscript submission to BioMed Central journals, and important open data publishing features.
Key data publishing features of LabArchives – BioMed Central Edition
Permanence, citation and linking of datasets
In 2011 LabArchives introduced the ability to assign digital object identifiers (DOIs) to datasets stored and shared with the software. DOIs facilitate data citation, discovery and earning of academic credit for data publication, and datasets assigned DOIs through LabArchives will remain available in perpetuity. The DOI system is used by journal and data publishers, such as DataCite members, to ensure online permanence of published articles. DOIs are indexed permanently by the International DOI Foundation and are much more favorable than URLs for permanently linking content online.
DOIs are assigned in LabArchives through the ‘DOI Management’ tab in the software’s share settings (pictured). DOIs should only be assigned in instances in which data are to be permanently shared with the public.
A LabArchives user can choose to share a data set as it exists at the time of publication, or they may enable users to continue to view changes as they are made, while, importantly, maintaining the version which supports a peer-reviewed publication. So, a DOI can be assigned to data as of the time the article was published and authors or re-users of the data may then continue with their research.
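For readers who want to see what "permanently linking" via a DOI looks like in practice, here is a minimal Python sketch. It assumes the public resolver at https://doi.org/, and the helper name doi_to_url is our own illustration, not part of LabArchives or any BioMed Central system. The example DOI is the one for the Evolution article cited in the comments on this post.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver; a DOI appended to this address
# redirects to wherever the identified object currently lives.
DOI_RESOLVER = "https://doi.org/"

# Common ways a DOI is written in reference lists and web pages.
_KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Turn a DOI (bare, 'doi:'-prefixed, or already a URL) into a resolver link.

    Hypothetical helper for illustration only.
    """
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in _KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10." and has a
    # registrant prefix and a suffix separated by a slash.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL, keeping the slash.
    return DOI_RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.1111/j.1558-5646.2010.01182.x"))
```

The point of the sketch is that the citation carries only the identifier; the resolver, not the citing article, keeps track of where the dataset or article is hosted, which is why a DOI link can outlive a change of web address.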
Data which are available for integration and reuse
Datasets published via the LabArchives platform and assigned DOIs are available under a Creative Commons CC0 waiver. CC0 helps dispel legal uncertainties about what a person or machine can do with data – or any other content – they discover on the web. CC0 enables cultural (scholarly) norms of citation to take precedence over legal conditions, such as requirements for attribution, for ensuring scientists receive appropriate credit for their contributions. CC0 furthermore complies with the Panton Principles for Open Data in Science, which hold that for society to gain the full benefits of scientific endeavors, data must be free to reuse, integrate and build upon without legal or other barriers. In short, data published through the LabArchives – BioMed Central Edition are open data.
Anyone publishing data through LabArchives should ensure that CC0 is appropriate for their data and that they are in the position to apply this waiver to the data.
Complimentary additional storage
The enhanced free version of LabArchives has an increased allotment of 100MB of storage (the standard free edition includes 25MB), enabling publication of larger datasets which cannot be published as additional files with journal articles. BioMed Central authors can continue to submit virtually unlimited numbers of additional files to our journals, up to 20MB per file – twice as much as some publishers – but in the age of ‘big data’ this can still sometimes be limiting. Users who choose to upgrade to the full version of LabArchives can store up to 100GB of data (see footnotes).
Integrated file viewers
LabArchives includes viewing software for a variety of file types. This feature enables those who discover your information to be able to see the data, even when stored in certain proprietary formats. Viewers are currently included for Microsoft Office files, PDFs, and all standard image formats. The list of viewers is regularly expanded by LabArchives.
View data in context
Readers (and reusers) of data published and shared through LabArchives can view files in context. LabArchives’ hierarchical file structure enables meaning to be conveyed to the reader through logical organization of data files.
Integrated manuscript submission to BioMed Central journals
Publishing data permanently online – especially when well-labelled, conforming to community standards, and in open file formats – increases potential for data reuse and collaboration. But peer-reviewed journals undoubtedly add value to data, such as detailed methods, context and discussion. For publishers to continue adding value to science communication, to speed publication and reduce barriers to data sharing it’s important to better integrate with scientists’ workflows and tools, upstream of journal submission and publication. The LabArchives – BioMed Central Edition includes integrated manuscript submission to BioMed Central journals. Authors submitting research manuscripts can, directly from LabArchives, choose the most appropriate of any BioMed Central journal, and authors preparing data notes can link directly to BMC Research Notes’ submission system. Our manuscript templates for research and data notes are incorporated in LabArchives’ integrated Office documents feature, to help speed the process of manuscript preparation.
With transparency comes responsibility
We encourage authors to comply with available field-specific standards for the preparation and recording of data. We recommend authors review the BioSharing website, and a special article series published in BMC Research Notes, for information on best practice in their field for sharing of data, with particular attention to maintaining patient confidentiality.
All journals in the BMC Series, including BMC Research Notes, now include information about LabArchives’ BioMed Central Edition in their instructions for authors, and the feature will be added to other journals as, when and if the Editors feel it is a service that will be valued by their authors. The first articles which describe and link to data hosted in LabArchives are currently undergoing peer review.
As John Wilbanks, Senior Fellow at the Kauffman Foundation and Open Network Biology Editorial Board member said on the BMC Blog in November 2011: “[M]aking data available will serve as a strong attractor for the smartest people in the world to come and begin building things that utterly surprise and shock us.”
We are looking forward to working with our authors and LabArchives to make more data openly available for integration and reuse in 2012 – and beyond.
Footnotes
Use of LabArchives’ software will have no influence on the editorial decision to accept or reject a manuscript, and use of LabArchives or similar data publishing services does not replace preexisting community data deposition requirements set out in individual journals’ instructions for authors.
The full version of LabArchives including 100GB of storage requires payment. BioMed Central does not receive any commission from LabArchives.
BioMed Central remains committed to working with all data repositories which enable linking of data to publications, particularly where specific journals and communities endorse them – for example the Dryad repository, with which we are working towards submission system integration with BMC Ecology and BMC Evolutionary Biology. More information on data deposition requirements relevant to BioMed Central’s journals can be found on our supporting data resources page.
Great news.
Could I ask a couple of further questions?
1.) When is this coming in? For all submissions today onwards?
2.) I take it data archiving is still optional for BMC authors, for the time being? I hope some (more) journals in your stable will consider making it mandatory to deposit underlying research data, where possible, in a public data repository (if not necessarily LabArchive’s one), c.f. Fairbairn, D. J. 2011. The advent of mandatory data archiving. Evolution 65:1-2. https://dx.doi.org/10.1111/j.1558-5646.2010.01182.x
Perhaps this may encourage a general shift in this direction…
Dear Ross,
Thanks for your comment. The LabArchives – BioMed Central Edition is available now, from the link in the blog above and in the information for authors of journals that have so far opted to include information about the service. These include the BMC Series journals (https://www.biomedcentral.com/authors/bmcseries) and BMC Research Notes (e.g. https://www.biomedcentral.com/bmcresnotes/authors/instructions/researcharticle).
While not superseding or replacing established data deposition policies and repositories (e.g. for microarray data) we hope LabArchives will enable authors who have data to publish, which is too large for additional files, and have no obvious repository for it, to be able to publish their data; link it to and cite it in their article; and ideally get some credit for it. Other features of an online lab notebook may also be attractive.
Different journals have different data archiving requirements, depending on the subject areas they cover and these are in individual journal instructions for authors. But for convenience we’ve collated policies on data archiving for different data types here: https://www.biomedcentral.com/about/supportingdata
At the bottom of this list you’ll find a table of journals that encourage or require data archiving for every article submitted.
Best regards,
Iain