Whether you call it precision medicine, personalized medicine, or one of several other terms, a major goal of medical research is to predict or diagnose disease, and to choose therapeutic options, based on simple and non-invasive tests. Hundreds of molecular biomarkers have been identified with this goal in mind, yet few are reproducible and even fewer have been brought to the clinic. In a research article recently published in Genome Medicine, Paul Boutros and a multidisciplinary team from Canada, Australia and the Netherlands examine the technical reasons behind this problem of reproducibility.
Boutros and colleagues used publicly available non-small cell lung cancer datasets known collectively as the Director’s Challenge cohort. They found that, contrary to earlier studies, previously published multi-gene biomarkers were validated by the Director’s Challenge data. However, this success was a double-edged sword. Comparisons of different techniques showed that even minor changes in data pre-processing can shift the statistical support for a biomarker from significant to indistinguishable from random chance. Boutros and colleagues were able to “exploit the noise”, developing an algorithm that uses this sensitivity as an indicator of biomarker robustness. Understanding the technical and statistical effects on biomarker validation (or the lack of it) is fundamental to biomarker development and potential clinical translation.
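The idea of treating pre-processing sensitivity as a robustness indicator can be illustrated with a toy simulation. This is a minimal sketch, not the authors' actual algorithm: the cohort size, the weak five-gene signature, the two normalization schemes (z-score vs. rank transform) and the permutation test are all invented for illustration. The point is simply that the same signature, scored on the same data, can receive different p-values under different pre-processing choices, and the spread of those p-values is one crude measure of robustness.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy cohort: 100 patients, 50 genes, binary outcome,
# with a deliberately weak signal in the first 5 genes.
n_patients, n_genes = 100, 50
labels = rng.integers(0, 2, n_patients)
X = rng.normal(size=(n_patients, n_genes))
X[labels == 1, :5] += 0.3  # modest effect, easily masked by pre-processing

def preprocess(X, method):
    """Two plausible pre-processing choices (illustrative, not the paper's pipelines)."""
    if method == "zscore":
        # standardize each gene across samples
        return (X - X.mean(axis=0)) / X.std(axis=0)
    if method == "rank":
        # rank-transform each gene across samples
        return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    raise ValueError(method)

def signature_pvalue(X, labels, genes, n_perm=2000, seed=0):
    """Permutation p-value for a mean-expression signature score."""
    perm_rng = np.random.default_rng(seed)
    score = X[:, genes].mean(axis=1)
    obs = abs(score[labels == 1].mean() - score[labels == 0].mean())
    count = 0
    for _ in range(n_perm):
        perm = perm_rng.permutation(labels)
        diff = abs(score[perm == 1].mean() - score[perm == 0].mean())
        count += diff >= obs
    return (count + 1) / (n_perm + 1)

signature = list(range(5))
pvals = {m: signature_pvalue(preprocess(X, m), labels, signature)
         for m in ("zscore", "rank")}

# Crude robustness indicator: how far the p-value moves across pipelines.
# A robust biomarker should give similar support regardless of pre-processing.
spread = max(pvals.values()) - min(pvals.values())
print(pvals, spread)
```

A biomarker whose significance survives every reasonable pipeline is a stronger candidate for clinical translation than one whose p-value swings wildly with normalization choices, which is the intuition behind using this sensitivity as a screening signal.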