More than two years into the worldwide COVID-19 pandemic, there is no agreement on the best way to predict outcomes or measure disease severity. A 2021 review of over 100 prognostic modeling approaches recommended that none be used because of concerns over validity and biases in their development.
A recent BMC article, “Seek COVER: using a disease proxy to rapidly develop and validate a personalized risk calculator for COVID-19 outcomes in an international network”, introduces a novel approach to this challenge: the use of influenza data. Substantial data on influenza outcomes already existed at the beginning of the pandemic. With SARS-CoV-2 also being a respiratory pathogen, could there be insights from influenza cases that would prove useful for COVID-19?
The study by Williams and co-authors uses an eclectic combination of claims and clinical databases from multiple countries, collected from 2006 to 2020. Over 7 million primarily outpatient influenza cases were assembled in a common dataset. The quality and consistency of these electronic data are recognized to be highly variable. Some of the databases lacked any variables except age and sex, whilst others did not consistently record outcomes.
The cases selected for initial model development, however, were a random sample of 150,000 drawn from 2,082,077 adult outpatients with a new influenza diagnosis and a follow-up of 365 days. A smaller cohort of 44,507 patients with a diagnostic code for COVID-19, treated from January 1 to April 27, 2020, was used for one of multiple external validations.
The modeling effort began as a data-driven regression effort resulting in 31,917 potential predictors. Acknowledging the complexity of using a model with so many predictors, the investigators had clinicians examine the distribution of variables for cases with and without the three outcomes they sought to predict. These were being hospitalized with pneumonia (COVER-H), being hospitalized with pneumonia and requiring intensive care services or dying (COVER-I), and death within 30 days after initial diagnosis regardless of the intensity of treatment (COVER-F). The reduced regression models still had hundreds of predictors, so two simplified models were created. One used only age and sex, whilst the other combined age and sex with seven common co-morbid conditions such as cancer or heart disease. The latter was converted into an easy-to-use calculator for the probability of outcomes.
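To make the idea of such a calculator concrete, here is a minimal sketch of how a logistic model with age, sex, and a few co-morbid conditions turns patient characteristics into a probability. Every number in it is a hypothetical placeholder chosen for illustration, not a published COVER weight, and only the two conditions named above are included because the remaining five are not listed here. The function name `predicted_risk` is likewise an assumption.

```python
import math

# Hypothetical coefficients for illustration only.
# These are NOT the published COVER weights.
INTERCEPT = -8.0
COEF_PER_DECADE_OF_AGE = 0.5
COEF_MALE = 0.3
COEF_CONDITION = {
    "cancer": 0.4,         # placeholder value
    "heart_disease": 0.5,  # placeholder value
}


def predicted_risk(age_years, is_male, conditions=()):
    """Return a probability between 0 and 1 from a logistic model."""
    unknown = set(conditions) - set(COEF_CONDITION)
    if unknown:
        raise ValueError(f"no coefficient for: {sorted(unknown)}")

    # Linear predictor on the log-odds scale.
    log_odds = INTERCEPT
    log_odds += COEF_PER_DECADE_OF_AGE * (age_years / 10.0)
    log_odds += COEF_MALE if is_male else 0.0
    log_odds += sum(COEF_CONDITION[c] for c in set(conditions))

    # The logistic function maps log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    risk = predicted_risk(70, True, ["heart_disease"])
    print(f"Illustrative predicted risk: {risk:.2%}")
```

The appeal of this form is that a clinician needs only a handful of inputs. The cost, as the validation results show, is that the output is only as good as the coefficients and the population they were fitted on.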
What can we learn from this extraordinary compilation of cases and ambitious modeling efforts? First, the results confirm that SARS-CoV-2 is much more dangerous than influenza. When the COVER models were validated, discrimination varied greatly across datasets and countries, and most models were not well calibrated. This was true of their COVID-19 database, and especially of the one dataset containing hospitalized COVID-19 patients.
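For readers less familiar with the two validation terms, discrimination asks whether patients who had the outcome were ranked above those who did not, while calibration asks whether the predicted probabilities match the observed event rate. The sketch below computes one common summary of each: the c-statistic (AUC) and the observed-to-expected ratio. The outcomes and predictions are made up for illustration and are not data from the COVER study, and the function names are assumptions.

```python
def c_statistic(outcomes, predictions):
    """Share of (event, non-event) pairs in which the event got the higher score.

    Ties count as half. 1.0 is perfect ranking, 0.5 is no better than chance.
    """
    events = [p for y, p in zip(outcomes, predictions) if y == 1]
    non_events = [p for y, p in zip(outcomes, predictions) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def observed_to_expected(outcomes, predictions):
    """Observed event count divided by the sum of predicted probabilities.

    A ratio above 1 means the model underestimates risk on average.
    """
    return sum(outcomes) / sum(predictions)


if __name__ == "__main__":
    # Made-up example: ranking is perfect, but the risks are too low.
    outcomes = [0, 0, 1, 1]
    predictions = [0.1, 0.2, 0.3, 0.4]
    print("c-statistic:", c_statistic(outcomes, predictions))
    print("O/E ratio:", observed_to_expected(outcomes, predictions))
```

The made-up example shows why both measures matter: a model can rank patients well and still understate their risk, which is the pattern one would expect when an influenza-trained model is applied to a more lethal disease.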
An editorial by Peterson, “COVID-19 is Not Influenza”, discussed a comparison of nearly 90,000 confirmed COVID-19 patients with approximately 45,000 influenza patients. The comparison found dramatic differences in outcomes, with COVID-19 being substantially more lethal. As the COVER authors acknowledge, this means their models would need to be recalibrated by country to account for the increased risk that COVID-19 presents over influenza.
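One simple form of recalibration, sometimes called recalibration-in-the-large, keeps the model's ranking of patients but shifts every prediction by a constant on the log-odds scale until the average predicted risk matches the event rate observed locally. The sketch below illustrates that idea only; it is an assumption about how recalibration could be done, not the procedure the COVER authors describe, and the probabilities and target rate are invented.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def shift_predictions(predictions, delta):
    """Add a constant to every prediction on the log-odds scale."""
    return [expit(logit(p) + delta) for p in predictions]


def intercept_shift(predictions, observed_rate, lo=-20.0, hi=20.0):
    """Find the shift that makes the mean prediction equal observed_rate.

    Uses bisection; the mean shifted prediction increases with the shift.
    Predictions must lie strictly between 0 and 1.
    """
    for _ in range(200):
        mid = (lo + hi) / 2.0
        shifted = shift_predictions(predictions, mid)
        if sum(shifted) / len(shifted) < observed_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Invented influenza-scale predictions and a higher local event rate.
    original = [0.001, 0.005, 0.02]
    delta = intercept_shift(original, observed_rate=0.03)
    print("shift on log-odds scale:", round(delta, 3))
    print("recalibrated:", [round(p, 4) for p in shift_predictions(original, delta)])
```

Because the shift is estimated from local outcome data, it has to be repeated for each country or health system, which is the practical burden the need for recalibration implies.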
Second, the well-acknowledged major vulnerabilities of age and existing comorbidities that are prominent in most COVID-19 models were also identified as important risk factors in COVER. In the well-received QCOVID 1 model, based on the U.K.’s QResearch population registry of 10.5 million people, age and co-morbid conditions, along with a measure of economic deprivation, produced accurate risk stratification for the likelihood of catching COVID-19 and then having a severe outcome (hospital admission or death) in the initial two stages of the pandemic within that country.
Both models also illustrate that when a very large outpatient dataset (COVER) or population-based registry (QCOVID 1) is used for modeling, the absolute risk of severe outcomes for an individual is low. The COVER models frequently produce probabilities of serious complications between 0.01% and 1%. QCOVID 1 model risk levels for contracting COVID-19 and then having a severe outcome are higher, at 0.1% to 2%. Unfortunately, as has been tragically demonstrated, when even these small risks apply to hundreds of millions of people, hospitalizations and deaths also number in the millions.
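The arithmetic behind that last point is simple multiplication, sketched below. The population of 300 million is a hypothetical round figure chosen only to stand in for "hundreds of millions"; the risk levels are the ends of the COVER range quoted above.

```python
def expected_events(population, risk_percent):
    """Expected number of people with the outcome, rounded to whole people."""
    return round(population * risk_percent / 100.0)


if __name__ == "__main__":
    population = 300_000_000  # hypothetical round figure, not a real estimate
    for risk_percent in (0.01, 1.0):
        events = expected_events(population, risk_percent)
        print(f"{risk_percent}% of {population:,} people = {events:,} events")
```

At the low end of the range the count is in the tens of thousands; at the high end it reaches the millions.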
The developers of the COVER models emphasize that their overall goal was to illustrate a methods development process rather than to produce a model for immediate clinical impact. At the beginning of the pandemic this was an appropriate effort. They should be commended for assembling a very large common dataset, for convening a worldwide collection of investigators, and for creating a clinically plausible influenza model.
But a large, well-maintained population data registry like QResearch, together with modeling conducted by an established body such as the New and Emerging Respiratory Virus Threats Advisory Group (NERVTAG) in the U.K., offers an alternative approach. We may be coming to the realization that replicating these two resources now in other countries could be the most appropriate response to this pandemic, and a path to being much better prepared for the next one.