A Year in Bioinformatics

As we begin a new year, it’s time to look back over the past 12 months and take stock of some of the fascinating articles published in BMC Bioinformatics. It’s been a year in which our authors have looked at many different aspects of the bioinformatics field.

Sounding out the properties of DNA

While many researchers are used to inspecting DNA sequences by eye, in April Mark D Temple presented a new way of identifying mutations by converting the code into musical notes. The sonification algorithms that convert the DNA code into music could represent a whole new way of finding mutations that would otherwise be missed. These algorithms represent DNA features using different sounds so that, for example, binding sites, restriction endonuclease sites and SNPs are each highlighted in their own distinct way.
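
To give a flavour of the idea, here is a minimal sketch of sonification in Python. It is not the algorithm from the paper: the base-to-pitch table (`BASE_TO_MIDI`) and the function names are made up for illustration. Each base becomes a MIDI note number, so a single-base change shows up as one changed note in the melody.

```python
# Hypothetical base-to-pitch table (MIDI note numbers); not taken from the paper.
BASE_TO_MIDI = {"A": 57, "C": 60, "G": 67, "T": 64}


def sonify(sequence):
    """Return one MIDI note per base; unknown symbols become None (a rest)."""
    return [BASE_TO_MIDI.get(base) for base in sequence.upper()]


def changed_notes(reference, sample):
    """List (position, reference_note, sample_note) wherever the melodies differ."""
    pairs = zip(sonify(reference), sonify(sample))
    return [(i, ref, alt) for i, (ref, alt) in enumerate(pairs) if ref != alt]


# Toy sequences invented for this example; they differ at a single position.
reference = "ATGGCGTACGTT"
variant = "ATGGCGTACATT"
print(changed_notes(reference, variant))
```

Feeding the note list to any MIDI library would turn it into audio; the point here is only that a substitution in the sequence becomes a "wrong note" a listener could pick out.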

A song made from DNA may not be topping the charts any time soon but it could lead to new ways of exploring sequences and discovering features that would otherwise be overlooked.

ImageJ2: The next generation

ImageJ has long been an immensely popular bioinformatics tool, used in a wide variety of fields in both the biological and physical sciences. A huge, engaged community has grown up to support and develop the software over the years, but there comes a time when even such successful software needs to be updated for the trials ahead. In November, Curtis T. Rueden and his colleagues published ImageJ2, designed to meet the challenges posed by increasingly complex new datasets and to ensure that the software can adapt to the future needs of the community.

Examples of image processing algorithms in ImageJ
Figure 1, Rueden et al, https://doi.org/10.1186/s12859-017-1934-z

The new software has been re-built from the ground up and expands upon the original ImageJ. It aims to continue to support the community while making sure it is free to develop and grow, exploring new avenues to work with other existing image processing tools. With the support and feedback of their enthusiastic community there’s no reason to think that ImageJ2 won’t continue to go from strength to strength.

Decontaminating DNA

In recent years, improving techniques and technology have allowed researchers to assemble a vast number of genomes. The first eukaryotic genome was published in 2000 and since then the speed with which genomes are assembled and published has greatly increased. However, such speed calls for some caution: how sure can we be that the target DNA is not contaminated with foreign DNA? In December 2017, Janna L. Fierst and Duncan A. Murdock presented a novel methodology that uses machine learning to help ensure that de novo assembled genome sequences are free of unwanted extraneous sequences.

Guaranteeing a sterile DNA sample is a major challenge in modern sequencing efforts. Foreign DNA can be acquired from microbiota, endosymbionts or even from laboratory tools or reagents. To draw solid conclusions from sequencing data, we have to be sure that the data is what we think it is. A number of contamination errors have already been found in assembled sequences. Methods to reduce contamination errors already exist, but they can either be too aggressive, removing target sequences along with the contaminants, or remove only some of the contaminating sequences. The machine learning approach demonstrated in this research, which uses a decision tree, showed great potential for removing foreign sequences whilst retaining the target DNA.
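
For readers who have not met decision trees before, the sketch below shows the general technique in plain Python. It is an illustration only, not the authors' pipeline: the two features (GC content and read coverage), the toy numbers and every function name are assumptions made for this example. The tree repeatedly picks the feature threshold that best separates "target" contigs from "contaminant" contigs, measured by Gini impurity.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (impurity, feature_index, threshold) of the best split, or None."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        # Candidate thresholds are midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Leaves are label strings; internal nodes are (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:  # all rows identical, so nothing left to split on
        return majority
    _, f, t = split
    left = [(row, y) for row, y in zip(rows, labels) if row[f] <= t]
    right = [(row, y) for row, y in zip(rows, labels) if row[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def classify(tree, row):
    """Walk from the root to a leaf and return that leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Toy training contigs as (GC content, read coverage); values invented for illustration.
rows = [(0.35, 52.0), (0.37, 48.0), (0.36, 55.0), (0.38, 50.0),
        (0.61, 9.0), (0.64, 7.0), (0.60, 12.0), (0.63, 8.0)]
labels = ["target"] * 4 + ["contaminant"] * 4

tree = build_tree(rows, labels)
print(classify(tree, (0.36, 51.0)), classify(tree, (0.62, 10.0)))
```

On real assemblies the classes overlap far more than in this toy data, which is why a learned tree over several features can do better than a single hand-picked cut-off.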

Visualizing microbial dark matter

ICoVeR is a software tool that can visualise genomic fragment “bins” that can be reassembled to create draft microbial genomes

High-throughput sequencing has come to the fore in recent years, allowing further exploration of so-called “microbial dark matter”: microbial communities that have proved resistant to attempts to cultivate them. Analyses of these communities have produced large numbers of genomic fragments that need to be grouped together, or binned, in order to reassemble draft microbial genomes.

ICoVeR (Interactive Contig-bin Verification and Refinement tool) is a new interactive visualisation tool that can display these bins and perform further clustering as necessary. Its open design also means that it is easily updated with new algorithms and solutions to improve performance.

Bioinformatics: indispensable, yet hidden in plain sight?

Lastly, we highlight some correspondence by Bartlett et al. exploring the relationship between bioinformatics and the life sciences. Bioinformatics is seen as increasingly important in the life sciences, but the work is often not given due credit.

Bioinformatics is multidisciplinary in nature: it can be considered a service, a collection of tools and/or methods, and a field of study in its own right. It has also become integral to the life sciences, complementing the work of the wet lab. Having studied the bioinformatics community and the work bioinformaticians perform, Bartlett and colleagues discuss whether bioinformatics is now a victim of “black boxing”: whether it has become so successful that the focus has switched from the process itself, for example the creation of an algorithm or software, to only the results of that process.