GigaScience at #ICG6: announcing the release of GigaDB and new datasets

Another busy week for the GigaScience team, with the release of a new-look database, more datasets, and a number of talks and announcements at BGI’s annual International Conference on Genomics in Shenzhen. It was a great (if exhausting) meeting this year, with the state of the art in genomics on display, announcements of three exciting “Million Genomes” projects to come from BGI and its many collaborators, and a chance to catch up with many members of our editorial board and friends.

GigaDB – a new-look website
The biggest news at the meeting was the launch of our new-look GigaDB.org website and additional datasets at the pre-conference data release workshop and press conference. The site is still very much in beta (comments and feedback are greatly appreciated at editorial@gigasciencejournal.com), but it builds upon our original release of datasets in July and presents them together in a single portal. Following the success of the outbreak E. coli O104 and Macaque genome datasets in demonstrating the practicalities of data citation, we have released another 20 datasets with citable DOIs. These span most of the tree of life, and include previously unsupported data types.

New Data from across the Tree of Life
Following on from the release of seven vertebrate genomes from the Genome10K project in July, we have now added genomic data from the Sheep, Tibetan Antelope and Naked Mole Rat. Genome, transcriptome and methylome data are provided from an Asian individual, and we are currently uploading data from ancient DNA studies on an Eskimo and an Aboriginal Australian. We now have plant genomes from the Potato, Foxtail Millet, Sorghum, Cucumber, Chinese Cabbage and Pigeon Pea, and invertebrate genomes from three species of ants, many strains of silkworm and a pathogenic pig roundworm. Many of these datasets (including the Sheep, Tibetan Antelope, Millet, Sorghum and transcriptome data) are previously unpublished, and this novel and more rapid release of data has the potential to speed up research in these important model and commercial species, and in human health.

For more coverage of the meeting, check out the #icg6 hashtag on Twitter and the reporting on the software and data release in Bio-IT World. Laurie’s slides are available here, and slides from Scott’s talk on data issues in the Bioinformatics session are also available here. A video of Laurie’s talk can also be seen in the following clip on YouTube.