Dynamic prediction models: Learning to forget

Healthcare outcomes, patient populations, and treatment options change constantly. Shouldn’t the data used to create our clinical prediction models change too? The authors of a new Diagnostic and Prognostic Research article discuss dynamic prediction models and their current challenges and limitations.

Imagine being told that the chance of your surgery being successful was 70%, or that there is a 10% chance you will have a heart attack in the following year. What would your reaction be if you were then made aware that the data used to obtain these estimates were many years old? You would probably ask for estimates derived from the most recent data rather than from the distant past.

Our evidence production pipelines are built on the flawed assumption that once produced, evidence can be translated into practice ad infinitum. The models and tools that are used to aid clinical decision-making are static and assume that nothing about the underlying patient population changes through time. However, healthcare systems and patient populations are constantly evolving.

Over time, medical practice improves, procedures and policies change, new treatments become available, and the characteristics of patient populations (such as their lifestyles) shift. Each of these changes can erode the accuracy of evidence that was produced under earlier conditions.

Clinical prediction models (CPMs) are typically produced by analyzing historical, routine healthcare data to calculate the risk of an outcome, based on an individual’s characteristics (e.g. age, sex, ethnicity, and blood pressure). For example, QRISK3 uses data collected from patient records between 1998 and 2016 to estimate an individual’s 10-year risk of cardiovascular disease. Because the model is static, it essentially assumes that cardiovascular risk management has not changed since the late 1990s. In reality, the last two decades have witnessed a massive increase in the use of statins, a smoking ban in public places, and the introduction of health checks in primary care.

Another example of a CPM used in clinical practice is the EuroSCORE model. This model predicts short-term mortality risk following cardiac surgery and was developed using data collected in 1995; because it is static, it has become less accurate over time. Indeed, the majority of CPMs ignore the huge amount of data that are constantly being collected. These data could be used to update the models and so ensure that they reflect the latest clinical practice.

Inspired by the data revolution, health informaticians have proposed the concept of a learning health system: a health system that improves itself by learning from the data that it generates, continuously and in real time. This takes place through cyclical processes that mobilize health data, analyze it to create new knowledge, and apply that new knowledge to improve the health of individuals and populations.

Learning health systems are supported by infrastructures that enable these processes to take place routinely and with efficiency of scale and scope. A key metric of a learning health system is data-action latency: the time lag between evidence being available and corresponding action being taken in clinical practice. Minimizing the data-action latency requires concerted data capture, aggregation, and analysis, followed swiftly by interpretation of results, assignment of responsibility for any actions, and recording of actions.

Our article reviews methods that are available for developing and validating dynamic CPMs: that is, CPMs that embrace the concept of a learning health system by having the ability to adapt to changes in the underlying data over time. The article discusses the current challenges and limitations that need to be addressed before these methods could be used routinely in clinical practice.

Although several methods exist that allow CPMs to evolve through time (e.g. model updating, Bayesian updating, and varying coefficient models), the review highlights that these are currently underutilized. One potential reason is that it is not yet clear how dynamic prediction models should be validated. Indeed, the lack of validation methods for such models is identified as the most pressing issue that needs to be investigated. Additionally, few software packages and user-friendly tutorials exist for developing dynamic CPMs, which may also hamper the adoption of these models.
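
To make the idea of Bayesian updating with "forgetting" more concrete, here is a minimal Python sketch. It is not taken from the article. It assumes the simplest possible case: tracking a single event rate with a Beta-Binomial model, and down-weighting older evidence with a forgetting factor before each new batch of data is added. The class name, the `forget` parameter, and the batch counts are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ForgetfulBetaRate:
    """Beta-Binomial estimate of an event rate with exponential forgetting."""

    alpha: float = 1.0   # prior pseudo-count of events (assumed uniform prior)
    beta: float = 1.0    # prior pseudo-count of non-events
    forget: float = 0.9  # hypothetical forgetting factor in (0, 1]; 1.0 = never forget

    def update(self, events: int, non_events: int) -> None:
        # Shrink the weight of everything learned so far, then add the new batch.
        self.alpha = self.forget * self.alpha + events
        self.beta = self.forget * self.beta + non_events

    @property
    def mean(self) -> float:
        # Posterior mean of the event rate.
        return self.alpha / (self.alpha + self.beta)


def run(batches, forget):
    """Feed every batch to a fresh model and return the final estimated rate."""
    model = ForgetfulBetaRate(forget=forget)
    for events, non_events in batches:
        model.update(events, non_events)
    return model.mean


# Made-up data: the true event rate falls from 10% to 5% halfway through.
batches = [(10, 90)] * 5 + [(5, 95)] * 5

static_like = run(batches, forget=1.0)  # every batch counts equally, forever
dynamic = run(batches, forget=0.5)      # older batches fade quickly

print(f"no forgetting:   {static_like:.3f}")
print(f"with forgetting: {dynamic:.3f}")
```

With `forget=1.0` every batch counts equally, so the estimate stays anchored to the older, higher rate. With a smaller value the estimate follows the recent rate more closely, at the cost of relying on less data. How to choose that trade-off is exactly the kind of question that validation methods would need to answer.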

Dynamic CPMs and the concept of learning health systems can be used to benefit patients and to help reduce the data-action latency in preventative medicine. However, we first need to change the way in which we view data and decision tools. Firstly, real-world data do not appear in sets at discrete time points but are a continuous stream of information. Secondly, a dynamic CPM is not a model that generates predictions, but a system that generates models that in turn produce predictions. The problem of validating these models can only be solved by fully appreciating both of these points.
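
One way to picture "a system that generates models" is the short Python sketch below. It is an illustration under strong simplifying assumptions, not a method from the article: each incoming record is a pair of a baseline predicted risk and an observed outcome, and each generated "model" simply rescales the baseline risk by the observed-to-expected event ratio in a sliding window of recent records. The names `model_stream` and `make_model`, the window size, and the data are all hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Tuple

Model = Callable[[float], float]  # maps a baseline risk to an adjusted risk


def make_model(ratio: float) -> Model:
    # A "model" here is just the baseline risk rescaled by a fixed ratio.
    return lambda risk: min(1.0, risk * ratio)


def model_stream(
    records: Iterable[Tuple[float, int]], window: int = 200
) -> Iterator[Model]:
    """Yield a new model after every record, based on the last `window` records."""
    recent = deque(maxlen=window)  # older records drop out automatically
    for risk, outcome in records:
        recent.append((risk, outcome))
        expected = sum(r for r, _ in recent)  # events the baseline predicts
        observed = sum(y for _, y in recent)  # events that actually happened
        ratio = observed / expected if expected > 0 else 1.0
        yield make_model(ratio)


# Made-up stream: everyone has a baseline risk of 0.2. The first 100 patients
# have 20 events (as predicted); the next 100 have only 10.
early = [(0.2, 1 if i % 5 == 0 else 0) for i in range(100)]
late = [(0.2, 1 if i % 10 == 0 else 0) for i in range(100)]

models = list(model_stream(early + late, window=100))

print(f"model after early period: {models[99](0.2):.3f}")
print(f"model after late period:  {models[-1](0.2):.3f}")
```

The point of the sketch is that the object to be validated is `model_stream`, the procedure that turns a data stream into a sequence of models, rather than any single model it emits along the way.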
