A GigaGathering at ICG8 in the era of big data and crowdsourcing

The Vanke International Conference Centre

As the GigaScience journal goes from strength to strength, its editorial and data management teams have expanded and now span three continents – and what better way to meet than at the 8th International Conference on Genomics (ICG8) in Shenzhen, China, co-organised by BGI and GigaScience. Held at the Thunderbirdsesque Vanke International Conference Centre in the popular seaside resort of Dameisha, this year’s meeting covered a range of topics including new technologies, big data management, crowdsourcing, animal and crop genomics, informatics and metagenomics, to name a few.

George Church was the highlight of the first day and, not surprisingly, gave a fascinating keynote on his technology developments for reading, writing and interpreting omes. He highlighted his work on CRISPR, an RNA-guided method for editing the genome that has huge potential for gene therapy (and has already been adopted by a large proportion of this year’s iGEM teams), and the development of organoids (organs-on-chips), in which human induced pluripotent stem cells (hiPSCs) are differentiated into different tissue types and grown on chips, providing a new way to test hypotheses that could supersede animal models. Church also highlighted efforts on the PGP’s genomes, environments and traits (G+E=T) – the world’s only open access human datasets.

GigaScience presents open source biology
Scott in his finest Halloween attire
Our Executive Editor, Scott Edmunds (pictured at the Halloween banquet), chaired an eye-opening session covering crowdsourcing, crowdfunding and open science that shed light on non-traditional methods of raising funds for research and disseminating data. Dan MacLean (Sainsbury Lab, UK) presented his group’s effort (OpenAshDieBack) to combat ash dieback, the disease caused by the fungal pathogen Chalara fraxinea. A majority of the UK’s ash trees are currently under threat and Dan is calling for the community to help combat this disease. To encourage and harness more public participation, MacLean and colleagues developed an online game-with-a-purpose: help save the trees by playing their Facebook game, Fraxinus. Being a proponent of GitHub science, Dan “eats his own dogfood” and even has his slides on the platform (see here). Taras Oleksyk (University of Puerto Rico at Mayaguez) presented his efforts to raise funds to sequence the endangered Puerto Rican parrot genome through art exhibitions and fashion shows (slides here). A Data Note on “the people’s parrot” was published in GigaScience (with data hosted in GigaDB), and the project got fantastic coverage. Follow the ongoing adventures on Facebook. Taras is organising SMBE 2014 in Puerto Rico next summer – this will be a great meeting that we encourage you all to attend for a chance to see the “people’s parrot” up close and in its natural environment.

Zachary Apte (uBiome, USA) presented his efforts to crowdsource the human microbiome. uBiome provides a 16S microbiome sequencing service for $90 USD; participants can choose which site they want sequenced (nose, mouth, skin, gut or genitals) and how openly they would like their data to be made available. uBiome is the world’s first crowdfunded, direct-to-consumer biotech company, and has so far raised $350,000 USD, with 3,500 participants and 6,500 samples from over 40 countries, and stores 15 times more data than the Human Microbiome Project. Jacob Shiach (Brightwork CoResearch) presented the concept of “indie science”, which aims to get the crowd participating in research. DIYbiology (another term for indie science) aims to reinvent the lab so that anyone with an interest in learning biology can get involved without needing to attend an academic institution. The concept also involves open hardware, saving independent researchers huge amounts of money through open-source centrifuges, agarose gel boxes, PCR machines, and even cell printers. The growing number of “indie science” labs around the world includes GenSpace and La Paillasse.

Personal genomes and open platforms for science
The GigaScience team had a strong presence at the meeting; our Editor-in-Chief, Laurie Goodman, presented in two sessions addressing data reproducibility and how the scientific community can communicate scientific information to the public more effectively. Our Lead Data Manager, Peter Li, chaired a session on open platforms for biological data, in which our NERC metabolomics collaboration was presented by Rob Davidson (University of Birmingham, UK). Rob explained how Galaxy can be used for metabolomic data analyses and how they created the first end-to-end metabolomic pipeline integrated in Galaxy – you can get a nice overview of current workflows from Rob’s slides on SlideShare. Rob will be joining our team in January 2014 to further integrate metabolomic and other workflows into our GigaGalaxy platform. Early work will be to make it compliant with the ISA-TAB format, increasing usability for the wider metabolomics community and improving “trans-omic” analyses. This was not the only Galaxy talk in the session, as Ramil Mauleon (IRRI, Philippines) presented his work at IRRI to produce a platform for rice researchers and breeders (slides here).

Pavel Stoev (Pensoft, Bulgaria) presented further work we enjoyed collaborating on, with his talk on the “cyber-centipede” and their efforts on turbo taxonomy: trying to overcome the growing taxonomic impediment by publishing lots of data as quickly as possible. This study took a holistic approach to taxonomy that combined several different methods, such as transcriptome sequencing, SEM, micro-CT scanning and digital barcoding. It is the first big data management pilot study in taxonomy to integrate (using ISA-TAB metadata) all of its data (6Gb in total), which are openly available in several databases and mirrored in our GigaDB database. The research is published in the newly launched Biodiversity Data Journal, and you can see more in our recent blog posting and editorial.

Bi Cheng, Manuel Corpas and Laurie Goodman
The high quality of talks was maintained right until the last day. Ravi Madduri (University of Chicago and Argonne National Lab, US) presented Globus Online, a cost- and time-effective tool for high-speed data transfer between multiple networks worldwide without the need to upload data to a cloud. Manuel Corpas (TGAC, UK) gave an impressive talk on his own efforts to understand his family’s genome, where privacy issues are not a concern. This endeavour began after Corpas received a 23andMe kit for his birthday, the results of which led him to launch a crowdfunding campaign for a family genome project that he jokingly termed “The Corpasome”.

It was great to end the meeting with an enlightening presentation from Ting Wu (Harvard University, US), who presented her efforts to engage the world in genetics, and do it justice, through the Personal Genetics Education Project. The project aims to encourage the community to start talking about the potential benefits and implications of personal genetics, by visiting schools from a wide range of economic backgrounds and working closely with Hollywood, Health & Society, which provides entertainment industry professionals with timely and accurate information for storylines on health and climate change.