Our understanding of hereditary disease risks has blossomed during the past decade, largely due to thousands of genome-wide association studies (GWAS). These studies involve scanning through genomes to identify single-nucleotide polymorphisms (SNPs) that are associated with common diseases and traits. For example, individuals who have a “G” allele at rs7329174 (a SNP located on chromosome 13) are more likely to have Crohn’s disease than individuals who have an “A” allele at this SNP.
Unfortunately, results from GWAS paint a biased picture of human health and disease, because the vast majority of GWAS have used study cohorts of European ancestry. Disease associations found in European populations do not always generalize well to other populations. This problem is magnified for pairs of populations that have experienced different evolutionary histories (e.g. populations from Northern Europe and sub-Saharan Africa).
Genotyping technologies used in GWAS can also cause problems. Most GWAS use genotyping arrays, as opposed to whole-genome sequencing, and these arrays tend to contain SNPs that were originally ascertained in European populations. As a result, genetic variants that are common elsewhere but rare in Europe may be missed.
In a recently published Genome Biology paper, we found evidence of systematic biases in GWAS results. These biases hinder our ability to generalize results from one part of the world to other parts of the world. Specifically, we found that genetic predictions can grossly misestimate disease risks. This problem is particularly acute for African individuals, and it stems from the biases described above.
Using computer simulations, we also found that African GWAS results generalize better across populations than non-African GWAS results. This asymmetry arises because non-African populations have experienced a loss of genetic diversity following the out-of-Africa migration (evolutionary history matters!).
Applications of genome-wide association study data
Our findings are timely because genomic information is beginning to be applied in clinical settings. Using GWAS results, physician-scientists can count the number of risk-increasing alleles in each person’s genome to generate genetic risk scores for specific diseases. For example, genetic risk scores can help identify women who are most at risk for breast cancer and determine the optimal age to begin mammogram screening.
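To make the allele-counting idea concrete, here is a minimal sketch in Python of an unweighted genetic risk score. It is an illustration under stated assumptions, not the method used in any particular study: the function name, the genotype format (a two-letter string per SNP), and the second SNP identifier are hypothetical. Only rs7329174 and its "G" risk allele come from the example earlier in this post.

```python
from typing import Dict

# Hypothetical table of risk alleles: SNP ID -> allele assumed to raise risk.
# rs7329174 / "G" is the example from the text; the second entry is a
# made-up placeholder, not a real SNP identifier.
RISK_ALLELES: Dict[str, str] = {
    "rs7329174": "G",
    "rs_example_2": "T",
}


def unweighted_risk_score(
    genotypes: Dict[str, str],
    risk_alleles: Dict[str, str] = RISK_ALLELES,
) -> int:
    """Count how many risk-increasing alleles a person carries.

    `genotypes` maps each SNP ID to a two-letter genotype such as "AG".
    Each SNP contributes 0, 1, or 2 to the score.
    """
    score = 0
    for snp, risk_allele in risk_alleles.items():
        # SNPs missing from the person's data contribute nothing.
        genotype = genotypes.get(snp, "")
        score += genotype.count(risk_allele)
    return score


# One copy of "G" at rs7329174 plus two copies of "T" at the placeholder SNP.
person = {"rs7329174": "AG", "rs_example_2": "TT"}
print(unweighted_risk_score(person))  # 3
```

Scores used in practice often weight each allele by its estimated effect size rather than counting every allele equally. This sketch leaves that out to stay close to the counting description above.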
However, it is important to keep in mind that the utility of genetic risk scores depends on the quality of the source data. Genetic risk scores that work well for one population need not work well for other populations.
How can we improve genetic predictions of health and disease? Given unlimited funds, one option would be to repeat every GWAS in every population. Needless to say, this isn’t very feasible. A more practical solution is to generate genetic risk scores that correct for existing biases. For example, GWAS results may generalize well from the United Kingdom to Denmark, but substantial corrections may be needed to generalize results from the United Kingdom to Ghana.
Another way that genetic risk scores can be improved is by correcting for the biases of different genotyping technologies. Corrected genetic risk scores can also incorporate evolutionary information, as demonstrated in our recent Genome Biology paper. Only by incorporating these details can the benefits of genomic medicine be extended to individuals from diverse global populations.