8 and great: the best Chinese research of 2015 in Genome Biology

To mark the launch of BioMed Central’s China Gateway, the team at Genome Biology have picked out eight fascinating articles from Chinese researchers published so far this year. Dominique Morneau from the journal’s editorial team gives us the rundown.


Research output from China is continuing to grow rapidly, and here at Genome Biology we’ve been fortunate to receive some great submissions from the country. We’ve picked a lucky eight articles that showcase some of the best of our Chinese research. So without further ado, let’s take a look…

Improving cotton genomics with a high-resolution genetic map

The cotton species Gossypium hirsutum and G. barbadense, known respectively as upland cotton and sea island cotton, have large genomes with four sets of each chromosome, making them tetraploids.

Large genomes are difficult to sequence, and so the field of plant genomics lags behind other areas of genomic research.

Finding sequence variations that could be useful for breeding-based crop improvement, like single nucleotide polymorphisms (SNPs), is also quite difficult when there are multiple copies of each chromosome.

A group from Nanjing Agricultural University saw an opportunity to construct a genetic map of SNPs in the genomes of both tetraploid cotton species. They found and mapped 5 million SNPs – the largest amount of sequence variation found between the two species to date!

Their genetic map was also used to update and improve the upland cotton genome assembly, and to identify the notoriously elusive central regions of the chromosomes. They hope that this high-precision, high-resolution genetic map will provide valuable information to improve cotton breeding, and contribute to our understanding of the genome structure and evolution of species with large genomes and multiple chromosome sets.

Sequencing the goose

China produces 94% of global goose meat, and so this poultry species plays a highly important role in both Chinese agriculture and the Chinese economy. Geese are also a good model for studying liver metabolism in humans.

To determine the unique characteristics of geese, a collaboration between Jinjun Li, Jun Wang, and their colleagues sequenced and analyzed the complete goose genome.

The Anser cygnoides genome has over 16,000 genes (compared to 20,000-25,000 in humans), and contains a number of significant differences compared to other terrestrial birds, particularly in genes involved in the immune system. Geese also have more copies of the genes for key enzymes involved in lipid production and transport than other poultry species.

Determining the impact of genomic perturbations in cancer

Tumors often carry a number of mutations, gene copy number variations, and epigenetic changes. Mutations can include anything from single nucleotide polymorphisms (SNPs) to sequence insertions and deletions (indels), while epigenetic changes alter how genes are regulated without changing the DNA sequence itself. Each of these types of change has the potential to perturb a gene’s expression.

Despite our knowledge of this connection, it is still difficult to estimate the impact of these changes on a given pathway, and therefore to use this knowledge to improve therapy. A study conducted by Andrew Teschendorff and colleagues from the Chinese Academy of Sciences created a statistical framework that integrates known cancer and tumor perturbation signatures with matched drug sensitivity data.

Notably, they identified a novel, clinically relevant subtype of breast cancer, as well as a potential treatment to target it. They hope that this approach will be used to identify drug treatments that can benefit certain groups of cancer patients.

Gene expression and aging

Aging causes drastic changes to gene expression at both the mRNA and protein levels.

Rong Zeng (Chinese Academy of Sciences), Philipp Khaitovich (CAS-MPG Partner Institute for Computational Biology), and their colleagues recently surveyed the changes in mRNA and protein expression in the prefrontal cortex of humans and rhesus macaques over various developmental stages.

They found that mRNA and protein levels become increasingly decoupled during aging. This is likely caused by modifications to the mRNA that alter its stability or prevent its translation into protein. Many of the genes that were predicted to be targeted for these modifications are known to be involved in lifespan extension, mitochondrial function, and Alzheimer’s disease.

Identifying circular RNAs

Since their discovery 30-40 years ago, circular RNAs have been found in all domains of life. These RNA molecules form a covalently bound continuous closed loop, which gives them unique properties. Their function remains largely unknown, but their high levels of expression and their conservation across species suggest that they have important physiological functions.

Circular RNAs are typically studied using transcriptome sequencing, and an important technical challenge is distinguishing them from other RNAs in the resulting data.

Fangqing Zhao and his colleagues from the Chinese Academy of Sciences recently developed a software tool to accurately detect circular RNAs from transcriptome data, which they have called CIRI (CircRNA Identifier). The authors are confident that this method will be useful in detecting novel circular RNAs, enabling researchers to determine their true physiological roles.

Genomic recombination and honeybee behavior

Honeybees (Apis mellifera) and other social Hymenoptera have ultra-high chromosomal crossover rates – higher than any other animal or plant on the planet! Recombination is quite variable across the genome, with some areas having high recombination rates and other areas having very few recombination events.

Honeybee males only have one set of chromosomes (haploids) while the females have two (diploids), making honeybees great candidates for studying the causes and consequences of crossing over in the production of sex cells.

By studying crossing over in honeybee brains, Laurence Hurst (University of Bath) and Sihai Yang (Nanjing University), along with their colleagues, constructed a high-resolution recombination map in the honeybee. They found that highly expressed behavior-associated genes in worker (female) brains have unusually high crossover rates, but that immune-related genes do not. This is the first evidence linking crossover events with social behavior in honeybees.

Building bridges for better transcriptome assemblies

RNA sequencing is a powerful tool allowing us to catch a glimpse into gene expression with unprecedented accuracy and sensitivity. But the sequence reads from this tool are quite short, often so short that it can be difficult to reconstruct full-length RNA transcripts properly.

There are several methods available that help with the assembly of RNA reads, but many require a reference genome to reconstruct the transcriptome. So what do we do when there is not a reference genome to base our assemblies on?

Guojun Li, Xiuzhen Huang, and their colleagues in China and the US recently described their new transcriptome assembler, Bridger, which takes advantage of the best features of the most popular methods and ‘bridges’ them in a single tool.

When tested on three real datasets (human, dog, and mouse RNA), this new method proved to be significantly faster, to require less memory space, and to be overall better at assembling transcriptomes while introducing fewer false positives than other methods.

ALLMAPS leads to high-quality de novo genome assembly

Assembling a genome from scratch, or de novo assembly, occurs in a series of steps: the assembly of overlapping reads into contigs, building contigs into scaffolds, and accurately ordering and orienting scaffolds into chromosomes. This is often done using a variety of types of genomic mapping information.
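As a rough illustration of the first of these steps, the sketch below merges overlapping reads into contigs using a simple greedy strategy. It is a toy example only, and real assemblers are far more sophisticated: the function names, the `min_len` overlap threshold, and the example reads are all made up here for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the overlap is shorter than `min_len` (a hypothetical threshold).
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_contigs(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            # No remaining pair overlaps enough: what is left are the contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that tile a short sequence.
print(greedy_contigs(["ATGGCG", "GCGTAC", "TACCTT"]))
```

Reads that share no sufficiently long overlap with anything else are simply left as separate contigs, which is where the later scaffolding steps come in.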

There are a lot of methods available for most of the steps involved in de novo genome assembly, but those used to order and orient scaffolds are under-developed. Haibao Tang and colleagues have developed a method to fill this gap, which they call ALLMAPS.

Their algorithm optimizes agreement among multiple maps to order and orient scaffolds accurately, while avoiding mapping errors. ALLMAPS incorporates data from physical maps, optical maps, and comparative maps, offering a useful and robust tool for building high-quality genome assemblies.
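To make the idea of "agreement among multiple maps" a little more concrete, here is a deliberately simplified sketch. It is not the ALLMAPS algorithm: it just orders scaffolds by the median position of their markers and lets each map cast one vote on a scaffold's orientation. The function name, the input layout, and the assumption that every map reports positions as a fraction (0 to 1) of the same chromosome are all hypothetical choices made for this example.

```python
from statistics import median


def order_and_orient(markers):
    """Order and orient scaffolds from marker placements on several maps.

    `markers` is a list of (scaffold, scaffold_pos, map_name, map_pos) tuples,
    where map_pos is assumed to be normalized to the range 0..1.
    """
    by_scaffold = {}
    for scaffold, scaffold_pos, map_name, map_pos in markers:
        by_scaffold.setdefault(scaffold, []).append((scaffold_pos, map_name, map_pos))

    placed = []
    for scaffold, hits in by_scaffold.items():
        # Where the scaffold sits: median map position over all of its markers.
        centre = median(map_pos for _, _, map_pos in hits)

        # Orientation: each map votes +1 (forward) or -1 (reverse), depending on
        # whether more marker pairs agree or disagree with the scaffold's order.
        votes = 0
        for name in {map_name for _, map_name, _ in hits}:
            pts = sorted((s, p) for s, m, p in hits if m == name)
            score = 0
            for i in range(len(pts)):
                for j in range(i + 1, len(pts)):
                    product = (pts[j][0] - pts[i][0]) * (pts[j][1] - pts[i][1])
                    score += (product > 0) - (product < 0)
            votes += (score > 0) - (score < 0)

        # Ties default to the forward orientation.
        placed.append((centre, scaffold, "+" if votes >= 0 else "-"))

    placed.sort()
    return [(scaffold, strand) for _, scaffold, strand in placed]


# Made-up markers: scafA runs forward near 0.7, scafB runs in reverse near 0.2.
example = [
    ("scafA", 100, "genetic", 0.60), ("scafA", 200, "genetic", 0.70),
    ("scafA", 300, "genetic", 0.80), ("scafA", 100, "physical", 0.62),
    ("scafA", 200, "physical", 0.71), ("scafA", 300, "physical", 0.79),
    ("scafB", 100, "genetic", 0.30), ("scafB", 200, "genetic", 0.20),
    ("scafB", 300, "genetic", 0.10), ("scafB", 100, "physical", 0.28),
    ("scafB", 200, "physical", 0.21), ("scafB", 300, "physical", 0.12),
]
print(order_and_orient(example))
```

Giving each map a single vote means one noisy marker cannot flip a scaffold by itself, which is the intuition behind combining several independent sources of mapping evidence.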

Looking ahead

Genome Biology is thrilled to have published a number of high-quality research articles from Chinese authors so far in 2015. The year’s not over yet though, and we have no doubt that there will be a lot of great research to come throughout 2015 and beyond.

Stay tuned to the BioMed Central blog all week, where we will be celebrating the re-launch of our China Gateway by highlighting the work of our Chinese editors and more content from Chinese researchers.

Dominique Morneau

Dominique has a PhD in plant enzymology from Carleton University. She joined BioMed Central in 2015 as a member of the Genome Biology editorial team.