Biological adventures into the world of cloud computing

Cloud computing is becoming increasingly popular in the world of computing. Analysing data in ‘the cloud’ involves using a self-service internet infrastructure, where you pay as you go and use only what you need, all managed by a third-party provider, typically Amazon or Google.

This technology has so far not been utilised for biological computing applications. The potential uses of cloud computing to analyse the mass of data pouring out of next-generation sequencing projects have, however, been a topic of hot debate in recent months amongst biologists and computational biologists alike.

In this month’s issue of Genome Biology, Ben Langmead and colleagues, from the University of Maryland and the Johns Hopkins Bloomberg School of Public Health, present the Crossbow algorithm for the alignment of whole-genome sequence data and the mapping of genetic variation information, such as single nucleotide polymorphisms (SNPs). Crossbow combines the alignment speed of the Bowtie algorithm with the SNP-calling accuracy of the SOAPsnp algorithm, and allows them to be run on any publicly available cloud computing cluster.

In the article presenting Crossbow, Langmead and colleagues demonstrate how data from a 38-fold coverage of the human genome can be aligned, and the SNPs mapped, in less than 3 hours and for only US$85 using the Amazon cloud.

The authors believe that this first demonstration of software for biological applications running on cloud computing technology will revolutionise the way we analyse biological data.

Crossbow is open source and freely available to download here, as well as being provided online with the Genome Biology article.

Liz Gaskell, Senior Assistant Editor, Genome Biology.
