Sequencing multiple individuals from a single species allows researchers to ask key biological questions regarding, for example, the diversity and population genetics of that species. The answers to many of these questions rely on the accurate mapping of polymorphisms between individual genomes.
The human 1000 Genomes Project and the Arabidopsis 1001 Genomes Project are taking advantage of rapid and increasingly affordable next-generation sequencing technologies to do just that. Millions of sequence reads can be generated in a relatively short period of time, and once this information is pieced together, multiple genomes from individuals of a single species can be compared.
These ambitious sequencing projects require concurrent advances in the software needed to analyse the data. Until now, software for aligning short reads from next-generation sequencing technologies has relied on aligning new sequences to a single reference genome.
In Genome Biology this month, Detlef Weigel, the Director of the Department of Molecular Biology at the Max Planck Institute for Developmental Biology in Tübingen, and colleagues introduce GenomeMapper, a new open-source tool for aligning short-read sequence data against multiple reference genomes at the same time. They demonstrate the power of this tool by applying it to new sequence data from the Arabidopsis 1001 Genomes Project. GenomeMapper greatly increases our ability to compare genomes, making it possible to detect polymorphisms among large numbers of individuals on a scale that was previously unimaginable.
You can find out more about the Arabidopsis 1001 Genomes Project on its website, http://1001genomes.org/index.html, and read an opinion article highlighting the plans for this project, published in the May 2009 issue of Genome Biology.