“Hell is a place where nothing connects to nothing.”
T. S. Eliot, Introduction to The Inferno
This past Monday ORCID released their first Public Data File under CC0. The data file includes a snapshot of all public data in the ORCID registry at the time of release and is part of their continued effort to transparently link researchers with their outputs. The release of this dataset is only one of ORCID’s many activities this month. Today ORCID held an Outreach Meeting in Washington, DC, to focus on developing best practices to help researchers integrate ORCID identifiers into their workflows (see #orcid11 on Twitter), and earlier this month they were at CERN for the first-year conference of the ODIN (ORCID and DataCite Interoperability Network) Project. The conference, which followed a 1 ½ day codesprint and took place in CERN’s Globe building, marked the half-way point in the 2-year project. (Many of the presentations have been posted.)
Previously called the Palace of Equilibrium, the Globe is 27 meters in height and 40 meters in diameter. The room was last used to award the Nobel Prize in Physics (no pressure). The finding behind that prize cropped up in several of the talks given at the meeting. For example, the Higgs dataset does indeed have a DOI and has been cited. Great! But the publisher of the journal article related to the dataset still needs to do a better job of linking the dataset with the article. Not so great. It raises the question: if even this dataset is failing to be linked up, so to speak, how are we treating the majority of datasets?
Funded by the European Commission, the ODIN Project aims to understand how gaps in the support for persistent identifiers for data (DOIs) and contributors (ORCID iDs) can be bridged, and how we can enable a community and culture of consistent linking between authors and their “products”, including data. (You can read more about what they’ve done through their reports.) Laurel Haak, Executive Director of ORCID, explained this linking in the context of the greater picture: “ORCID iDs are not about researchers’ relationship to one publication or a set of publications but to a research community.” Exactly. Think of a web of connections between researchers, their “products”, and other researchers, not a list of publications. This is the future: persistent identifiers (DOI and ORCID iD) linked to source code, to data, to journal articles, to curated metadata (indeed, ISA Creator has integrated ORCID iDs), to reviewer reports. Since its launch in October 2012, ORCID has issued over 330,000 iDs. The Wellcome Trust has integrated ORCID iDs into its grants application system and, if you are awarded a grant, will automatically update your ORCID account to acknowledge it. If you don’t have an ORCID iD, you can register here. It takes 30 seconds.
Running through all the talks was a sobering truth that tempers all this development and progress: researchers aren’t getting the message. They’re not citing the data they use, especially not in the reference list. They’re not sharing their own data. Not all of them are even registering for ORCID iDs. And finally, they’re not linking their iD and dataset DOI. Maybe we’re not, as Eliot describes hell, in a “place where nothing connects to nothing”, but what about a place where only the article connects to the DOI? Perhaps.
It’s clear why researchers should link up their research products (data, source code, articles, etc.) with their own persistent identifier (ORCID iD) and make their data open, so why aren’t they? Are researchers not getting the message? Or are the incentives not yet real enough to them? If most of this message has been getting out by “shouting at researchers with white papers”, as Mark Hahnel commented, then my guess is they’re just not getting the message. (I don’t like reading white papers and it’s my job.)
One speaker, Louise Corti of UK Data Service, mentioned the success they’d had with case studies of how datasets were re-used. Blame it on my background in literature, but I sense there is real potential in showing researchers the stories behind data re-use. Moving away from the current white-paper strategy, I’d like to start collecting some narratives that might enable researchers to rethink the way they link up with the products of their research and others’ research. There are convincing stories out there showing how we might build on research through the re-use of data in ways the originator of that data might not have imagined. Such stories help make tangible the reasons for sharing and citing data.
We at BioMed Central would like to begin collecting these stories. We’ll retell these stories at open science sessions at research conferences, on blog posts, on Twitter, in Biome, in promotional material, and to libraries and department heads. Eventually, I think the incentives for sharing and linking will become clear.
What was your experience of sharing your data? How was it used? Or have you re-used someone else’s data? This is an invitation to share your story. Comment here, or email me at firstname.lastname@example.org.