WebCite links provide access to archived copies of linked web pages

Anyone who has tried to follow web links in a scientific article published several years ago will be familiar with the problem. You click a link, only to get a ‘Server not responding’ message or a ‘Page not found’ error.

This lack of permanence of web links (sometimes known as link rot) is a general phenomenon across the web, but it is a particular problem for published scientific research. On the one hand, the coherence of the published scientific record depends on being able to refer back to articles, including the online material that they cite. On the other hand, the character of scientific research projects (which tend to be funded for a few years at a time) and of scientific careers (which tend to involve frequent shifts between institutions) means that scientific web pages become inaccessible with worrying regularity.

In this electronic age, it is not realistic to expect authors to refrain entirely from mentioning web pages in their articles, ephemeral as those pages may be. So, since late 2005, BioMed Central has been working in partnership with the WebCite initiative, based at the Centre for Global eHealth Innovation at Toronto General Hospital, to preserve archival copies of all web pages linked to from BioMed Central articles.

Wherever you see a WebCite logo, whether in the body of an article or in the reference section, you can click on it to view a version of the linked page that has been archived at WebCite.

For papers published since 2006, this archived copy will have been harvested immediately after publication. As well as providing some degree of digital permanence, the process therefore allows you to view the web page as it was at the time of publication.

For example, this Journal of Biology article links to the BioGRID database. The WebCite copy provides a snapshot of the BioGRID home page, including stats on the database, as it was at the time of publication.

WebCite is not, by itself, a perfect solution. Snapshots of web pages such as those preserved by WebCite cannot fully replicate the functionality of a complex database-driven web site. Even single web pages may in some cases cause problems for the WebCite archiving robot, but this is improving all the time (please let us know if you spot any problems). Lastly, in order to provide long-term digital permanence, it is important that the WebCite project itself should have sustainable long-term support. To this end, we encourage other publishers to participate in the initiative, and to consider ways of supporting it, perhaps via a collective model similar to that used for the CrossRef linking initiative.

These caveats notwithstanding, a basic principle of digital archiving is that the sooner you start, the less you lose (as the Internet Archive has demonstrated). So we are very pleased to be working with WebCite to ensure that as much as possible of the web material linked to by BioMed Central authors is preserved for the long term.
