Human genomics comes of age: ENCODE, Open Access and BioMed Central


The 21st century began with a milestone for the human genome: a summer fanfare on Pennsylvania Avenue, two scientists and a jazz enthusiast, feverish with talk of a "wondrous map". President Clinton, in harmony with Prime Minister Blair in London, announced the completion of a draft human genome sequence, and the science of biology was forever changed.

Only a few years earlier, many would not have believed that such a feat was possible. Just three years after the draft, a complete sequence was released; but as the Human Genome Project reached its summit, a new ascent loomed. What to make of the sequence? How could the function of each section be explained? Why were so many stretches of DNA seemingly desolate wastelands, devoid of any genes?

ENCODE annotation methods

And so the ENCODE (Encyclopedia of DNA Elements) project was born, a mission to annotate the human genome in all its functional glory. A decade later, BioMed Central is very proud to publish a selection of research articles unveiling the project's findings, to accompany the flagship paper in Nature. Six articles appear in Genome Biology, and these include extensive analyses of transcription factor function and binding behavior, modeling of gene expression, and a study of zombie pseudogenes. An article in BMC Genetics employs ChIP-seq to identify inter-individual variation in genome sequence and function.

While these publications showcase the computational analysis efforts of ENCODE, the stupendous volume of data generated by the Consortium has been available for some time, and has become a popular resource used to support the work of many labs (as we observed at the Biology of Genomes conference). ENCODE's contribution to human biology is therefore already established as one of real significance, thanks to the open availability of generated data.

That this openness is now taken for granted stands in stark contrast to the tension over data ownership that preceded the East Room press conference, where the competing human genome sequencing projects of J. Craig Venter and Francis Collins were brought together. The two had previously been fierce rivals, but the race had cooled that spring when a patent ban forced the hand of Venter's private venture, which had in any case been equaled by the publicly funded effort.

The ENCODE Explorer website

Happily, ENCODE was a story of collaboration: not only between the 400-plus scientists involved, but also between multiple publishing houses. BioMed Central journals have joined together with Nature and Genome Research to co-publish more than thirty articles. Our co-operation has yielded a dedicated website and iPad app, where readers can navigate a cross-publisher collection.

Collaboration is only one of many areas in which Big Science has progressed. For one, genomics has become far more affordable. The Human Genome Project, conducted under the Ancien Régime of Sanger sequencing, cost $3 bn, a sum that could today fund the sequencing of the genomes of a large city's inhabitants. Publishing has changed too: Genome Biology had published its first article only a few weeks prior to the draft genome announcement, pioneering Open Access when the concept was new and untested. Importantly, as is now common practice for large genomics projects, all of the official ENCODE articles are fully Open Access publications. The ENCODE era is one in which science is being democratized, in which the wondrous maps of genomes are no longer the preserve of a privileged few.

Update: BioMed Central’s ENCODE articles are now also available as ePubs.

