Until recently, microarrays were the method of choice for transcriptional profiling. The advent of next-generation sequencing technologies, however, has seen the rise of direct sequencing of mRNA (RNA-seq) as a new method for such profiling. In a recent publication in Genome Biology, Alicia Oshlack and colleagues at the Walter and Eliza Hall Institute in Melbourne, Australia, describe a new method for performing Gene Ontology analysis of RNA-seq data, called GOseq.
GOseq tests whether transcripts associated with specific biological processes are over-represented among the differentially expressed genes in a given transcriptional profile. Until now, statistical methods of this kind used for analysing RNA-seq data have generally been modifications of methods developed for microarray data. Oshlack and colleagues show, however, that statistical methods are not interchangeable between the two techniques; in particular, there is a bias inherent in RNA-seq data whereby long or highly expressed transcripts are more likely to be called as differentially expressed than short or lowly expressed ones. The GOseq algorithm takes this into account, correcting for the bias and providing a more reliable readout. As well as providing a useful new tool, this paper highlights the need for new statistical analysis techniques tailored specifically to the new technology of RNA-seq.
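To make the effect of such a bias concrete, the toy sketch below compares a standard hypergeometric over-representation test with a resampling test in which each gene's chance of being called differentially expressed is proportional to a weight. This is only an illustration of the general idea, not the published GOseq algorithm: the function names, the gene counts and the weights (3 for "long" category genes, 1 for all others) are assumptions made up for the example.

```python
import math
import random


def hypergeom_upper_tail(k, total_genes, category_size, n_de):
    """P(X >= k) when n_de genes are drawn uniformly without replacement
    from total_genes, of which category_size belong to the category."""
    denom = math.comb(total_genes, n_de)
    upper = min(category_size, n_de)
    numer = sum(
        math.comb(category_size, i) * math.comb(total_genes - category_size, n_de - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


def weighted_resampling_pvalue(weights, in_category, n_de, k_observed,
                               n_iter=2000, seed=0):
    """Estimate P(X >= k_observed) when each gene is drawn with probability
    proportional to its (positive) weight, without replacement."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        # Weighted sampling without replacement: give each gene the key
        # u ** (1 / w) and keep the n_de largest keys (Efraimidis-Spirakis).
        keys = sorted(
            ((rng.random() ** (1.0 / w), g) for g, w in enumerate(weights)),
            reverse=True,
        )
        count = sum(1 for _, g in keys[:n_de] if in_category[g])
        if count >= k_observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Hypothetical data: 200 genes, the first 20 belong to one category and are
# "long", so they are assumed to be three times as likely to be called DE.
in_category = [g < 20 for g in range(200)]
weights = [3.0 if member else 1.0 for member in in_category]

# Suppose 40 genes are called DE and 8 of them fall in the category.
naive_p = hypergeom_upper_tail(8, 200, 20, 40)
weighted_p = weighted_resampling_pvalue(weights, in_category, 40, 8)
print(f"unweighted p = {naive_p:.3f}, weighted p = {weighted_p:.3f}")
```

In this made-up setting the unweighted test flags the category as enriched, whereas the weighted test does not, because a category of long genes is expected to contribute more differentially expressed genes by chance alone.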
Given the extent to which RNA-seq is being embraced by the genomics community, for example in defining alternative transcripts, this method is a welcome addition to a growing field.