New in GigaScience: the Squishome

Insect goo aids biodiversity research
Apologies to Jonathan Eisen (see Badomics in the journal), but today in GigaScience we publish a new “squishomics” approach for assessing and understanding biodiversity, using the slightly wacky-sounding method of combining a DNA soup made from crushed-up insects with the latest sequencing technology. This bulk-collected insect goo has the potential to rapidly and cheaply reveal the diversity and make-up of both known and unknown species collected in a particular time and place.

Creepy crawlies are important indicators of diversity: arthropods make up 80% of described species, yet only an estimated 20% of insects have been characterized to date. The new method, devised by Xin Zhou and colleagues at BGI, is a more accurate and quantitative version of a new biodiversity analysis technique called metabarcoding. Initial validation of the analyses has already revealed how diverse and poorly characterized insect communities can be, even at two small sites within the researchers’ own backyard, literally.

Metabarcoding combines DNA barcoding, which uses a standard gene fragment for species identification, with next-generation sequencing technologies. Previous metabarcoding methods, however, have required a PCR step to amplify the amount of DNA collected, and this step can introduce problematic errors into the analysis. The authors of this study have found a way to carry out the method without this step, giving it the potential to be more accurate. In addition to assessing species diversity, it allows the researcher to determine the total quantity of mitochondrial DNA present for each species, making it possible to reveal the relative abundance and biomass of each species. By allowing more consistent and rapid sampling, this may simplify the study of changes in biodiversity over space and time and transform the way we study ecosystems.
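To make the relative-abundance idea concrete, here is a minimal sketch of how per-species shares might be estimated from mitochondrial read counts. It is not the authors' pipeline: the function name, the species labels, the read counts and the reference lengths are all hypothetical, and dividing by reference length is simply one common way to avoid favouring species with longer reference sequences.

```python
def relative_abundance(read_counts, ref_lengths):
    """Return each species' share of length-normalised mitochondrial reads.

    read_counts: reads assigned to each species (hypothetical input)
    ref_lengths: length in bases of each species' reference (hypothetical input)
    """
    # Reads per base, so a longer reference does not inflate a species' share
    density = {sp: read_counts[sp] / ref_lengths[sp] for sp in read_counts}
    total = sum(density.values())
    if total == 0:
        raise ValueError("no reads were assigned to any species")
    return {sp: d / total for sp, d in density.items()}


# Made-up numbers for illustration only
counts = {"species_A": 600, "species_B": 300, "species_C": 100}
lengths = {"species_A": 1000, "species_B": 1000, "species_C": 1000}
shares = relative_abundance(counts, lengths)

for species, share in sorted(shares.items()):
    print(f"{species}: {share:.1%}")
```

Whether such read shares track biomass in a real sample depends on factors this sketch ignores, such as how much mitochondrial DNA each species carries per unit of tissue.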

What really lies at the bottom of your garden
Testing the technique on species collected on a hillside behind their laboratory, the authors were very surprised by what they found in their own neighborhood. BGI, the world’s largest genomics organization, is situated on the edge of Shenzhen, a city of 12 million people in the densely urbanized Pearl River delta. Setting up two traps close to each other not only revealed how much diversity there was, but also detected species not currently present in online databases. The findings demonstrate how little is known about insect diversity in China, and by opening up the ability to carry out these types of systematic and high-throughput analyses, the method makes it possible to test whether this is the case everywhere else in the world. Unfortunately, this hillside is currently being leveled for new building projects as the relentless urban and industrial development in China continues (see picture), so this surprisingly rich environment is not likely to remain as diverse for much longer.

Of the study, Dr. Zhou said: “The 2 sampling sites were very close to each other, yet there were only around 10% of the total species being shared between them. The fact that only very few of our barcoded specimens received a sequence match from the Barcode of Life Data Systems, the world’s largest barcode reference database, suggests that much of China’s arthropod fauna still remains as a mystery, at least from a molecular aspect.” Noting that the technique can detect and discover tiny organisms, stomach contents and partial samples without the usual visual cues, he adds: “In some sense, the contribution of NGS technology to biodiversity research is equivalent to what microscopes did to microbiology.”
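For readers wondering how a figure like "around 10% of the total species shared" could be computed, the sketch below treats each site as a set of species labels and divides the number of species found at both sites by the number found at either. This is an assumed reading of the quote, not the paper's stated method, and the species labels are invented.

```python
def shared_fraction(site_a, site_b):
    """Fraction of all species observed that occur at both sites."""
    union = site_a | site_b
    if not union:
        return 0.0
    return len(site_a & site_b) / len(union)


# Invented labels: six species at one site, five at the other, one in common
site_1 = {f"sp{i:02d}" for i in range(1, 7)}    # sp01 .. sp06
site_2 = {f"sp{i:02d}" for i in range(6, 11)}   # sp06 .. sp10

overlap = shared_fraction(site_1, site_2)
print(f"shared: {overlap:.0%}")
```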

Open Science: the best way of proving insect data isn’t “buggy”
Following on from our recent SOAPdenovo2 paper, this study is the second time we have integrated separate DOIs into the paper to make all of the supporting data and pipelines available. These are hosted in our GigaDB database, and the ability to cite them independently rewards the authors for making them available while also boosting the transparency, reproducibility and utility of this work. As the pipeline is adapted from the open-source SOAPdenovo2 application, all of the code has to be released under a similar license, and we have hosted it in our GitHub repository. To further boost the utility, the authors have worked hard to follow best practices for metadata laid out by the Genomic Standards Consortium (GSC). Contextual information is essential for environmental studies such as this, and while there are currently no modules for this new data type, the authors have built upon the GSC MIMS checklist and provided all of the information they thought relevant as a starting point for building new standards. With a transparent and open-review process (see the pre-publication history here), and with work currently underway to implement the workflows in our Giga-Galaxy platform, we hope that this paper presents another good example of what we are attempting to do at GigaScience to make our papers more transparent and reproducible. As this is a really exciting study that potentially opens up huge new areas of research, we hope this additional work pays dividends by allowing future users to recreate and adopt the technique much more quickly and easily.

Further reading:
1. Zhou, X; et al. (2013): Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification. GigaScience 2:4 http://dx.doi.org/10.1186/2047-217X-2-4

2. Zhou, X; Li, Y; Liu, S; Yang, Q; Su, X; Zhou, L; Tang, M; Fu, R; Li, J (2013): Raw data, assembly and annotation results for: “Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification”. GigaScience Database http://dx.doi.org/10.5524/100045

3. Zhou, X; Li, Y; Liu, S; Yang, Q; Su, X; Zhou, L; Tang, M; Fu, R; Li, J; Huang, Q (2013): Software and supporting material for: “Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification”. GigaScience Database http://dx.doi.org/10.5524/100046

UPDATE 2/4/13: check out the related Q&A with the lead author in a follow-up posting.