Precision cancer medicine is improved using novel statistical methods

A new set of statistical tools improves the ability to discover, from cell lines, cancer drug biomarkers that are effective for tailoring drug treatments in cancer patients. Paul Geeleher and Stephanie Huang tell us more about these statistical tools and their recently published paper in Genome Biology.

The arrival of high-throughput genome sequencing and targeted cancer drugs promised to usher in an era of “precision medicine”, whereby each cancer patient would be treated based on the sequence of their tumor.

This paradigm has yielded a few notable successes. Examples include trastuzumab, which interrupts ERBB2 in specific subtypes of breast cancer, and imatinib, which targets the BCR-ABL1 fusion gene in chronic myeloid leukemia. However, these are among a list of fewer than 30 cancer drugs with known actionable targets that can be identified using a genetic test. Notably, almost all of these drugs are a product of drug design. Drug repurposing and biomarker discovery strategies, aimed at finding new targets for existing drugs, have almost universally failed.

One of the reasons for this is that the genetics of drug response in cancer is intrinsically difficult to study. Cancer patients are generally treated with complex regimens, and collecting cleanly measured data on drug response in large patient cohorts is difficult, expensive and has rarely led to new discoveries. Thus, most research is limited to preclinical disease models, such as cancer cell lines and mice. However, there is a long history of biomarkers that appear robust in these types of disease models proving ineffective when applied in clinical trials.

We believe that one of the problems is that biomarkers discovered in preclinical models are generally not tested in clinical trials in a way that is consistent with how they were originally discovered. To elaborate, in a preclinical disease model such as a large panel of cell lines, a biomarker is discovered by comparing the molecular features of tumors to a cleanly measured drug response phenotype, collected by treating the panel of cell lines with a single drug. However, when the biomarker is tested in a clinical setting, the patient is generally not being treated with a single drug, but more likely with a cocktail of drugs and perhaps other treatments such as radiation therapy.

Another scenario might see a biomarker-based trial being run on a set of relapsed patients who have already developed multi-drug resistance. These are clearly different circumstances from those under which the biomarker was initially discovered. One obvious way to address this issue would be to re-evaluate how biomarkers are tested on patients; however, in most circumstances this would lead to serious ethical issues, because the patient would likely no longer be receiving the optimal treatment.

Our findings address this issue in a different way. Specifically, we have developed a set of statistical techniques that attempt to improve the biomarker discovery process by estimating how cell lines tend to respond to all drugs, then using statistical methods to control for this variable.

We refer to this latent variable as “general levels of drug sensitivity”, or GLDS. Our hypothesis was that doing so should more closely model how these biomarkers would eventually be tested clinically and thus produce results that are more likely to effectively translate to the clinic.
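The idea of controlling for general drug sensitivity can be illustrated with a small simulation. The following is only a minimal sketch on made-up data, not the implementation in the “glds” package: it assumes GLDS can be approximated by each cell line's average standardized response to all other drugs, and that the biomarker association is tested with ordinary least squares. The function names `estimate_glds` and `biomarker_effect`, and all simulation settings, are hypothetical.

```python
import numpy as np

def estimate_glds(responses, drug_idx):
    """Approximate GLDS as the mean z-scored response to every drug except drug_idx.

    Assumption for illustration only; the published method may estimate GLDS differently.
    """
    others = np.delete(responses, drug_idx, axis=1)
    z = (others - others.mean(axis=0)) / others.std(axis=0)
    return z.mean(axis=1)

def biomarker_effect(response, biomarker, covariate=None):
    """Least-squares coefficient of the biomarker, optionally adjusted for a covariate."""
    cols = [np.ones_like(response), biomarker]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    return coef[1]

# Simulated panel: 500 cell lines, 30 drugs (all values are synthetic).
rng = np.random.default_rng(0)
n_lines, n_drugs = 500, 30

# Latent general sensitivity shared across all drugs.
g = rng.normal(size=n_lines)

# A binary biomarker that tracks general sensitivity, with no
# drug-specific effect on any individual drug.
biomarker = (g + rng.normal(size=n_lines) > 0).astype(float)

# Each drug's response = general sensitivity + drug-specific noise.
responses = g[:, None] + 0.5 * rng.normal(size=(n_lines, n_drugs))

drug_idx = 0
glds = estimate_glds(responses, drug_idx)

naive = biomarker_effect(responses[:, drug_idx], biomarker)
adjusted = biomarker_effect(responses[:, drug_idx], biomarker, covariate=glds)

print(f"naive biomarker coefficient:    {naive:.3f}")
print(f"GLDS-adjusted coefficient:      {adjusted:.3f}")
```

In this toy setting the unadjusted analysis reports a sizeable association between the biomarker and the drug of interest, even though the biomarker only reflects sensitivity to drugs in general. Including the GLDS estimate as a covariate shrinks that coefficient toward zero, which is the behaviour one would want from a biomarker intended to guide the choice of a specific drug.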

Overall, our results were very encouraging. We found that accounting for how cell lines tended to respond to all drugs gave us results that were far more consistent with clinical observations. One limitation is that these are retrospective observations, and we hope that our findings will be tested in prospective trials in future. Nevertheless, our findings provide a strong rationale for the failure of several high-profile biomarker-based clinical trials and drug repurposing efforts.

Moving forward, we hope to apply similar methods to larger datasets and perhaps to other preclinical disease models, such as mouse studies. Other researchers can also apply our methods using an R package, “glds”, which we have released with our study.
