Admixed Populations: How To Correct For Ancestry Bias In Genetic Association Studies

Admixed populations pose a special challenge in genetic association studies because individuals carry varying proportions of two or more ancestries. The classic case-control design compares allele frequencies between affected and healthy subjects; however, unmeasured population substructure can produce spurious associations unrelated to causative loci. This unobserved confounding arises when the study population comprises several ancestral subpopulations that differ in allele frequencies and disease risk and are not equally represented among cases and controls. The resulting bias in allele frequencies can lead to false-positive associations in statistical analyses.
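The confounding described above can be made concrete with a small arithmetic sketch. The allele counts below are hypothetical and chosen only for illustration: allele A is equally frequent in cases and controls within each subpopulation, yet the two subpopulations differ in allele frequency and are unevenly represented among cases and controls.

```python
def allelic_odds_ratio(case_a, case_other, control_a, control_other):
    """Odds ratio from a 2x2 table of allele counts (cases vs. controls)."""
    return (case_a * control_other) / (case_other * control_a)

# Hypothetical allele counts: (case_a, case_other, control_a, control_other).
# Subpopulation 1: 800 cases, 200 controls, allele A frequency 0.8 in both.
# Subpopulation 2: 200 cases, 800 controls, allele A frequency 0.2 in both.
strata = {
    "subpopulation_1": (1280, 320, 320, 80),
    "subpopulation_2": (80, 320, 320, 1280),
}

# Within each subpopulation the allele is unrelated to disease status.
stratum_or = {name: allelic_odds_ratio(*counts) for name, counts in strata.items()}

# Pooling the strata while ignoring ancestry creates an apparent association.
pooled_counts = [sum(column) for column in zip(*strata.values())]
pooled_or = allelic_odds_ratio(*pooled_counts)

print(stratum_or)  # {'subpopulation_1': 1.0, 'subpopulation_2': 1.0}
print(pooled_or)   # 4.515625
```

The odds ratio is 1 within each subpopulation but about 4.5 in the pooled sample, even though the allele has no effect on disease in this toy example.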

The most common solution is to genotype a set of ancestry informative markers (AIMs), i.e., markers whose allele frequencies differ by more than 30% (the differential, d) in any pair of parental populations, use them to infer ancestry in both the case and control groups, and then adjust the analysis for population stratification.
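As a minimal sketch of the d > 30% criterion, the snippet below filters markers by their largest pairwise allele-frequency difference. The marker names and frequencies are invented for illustration, and reading "any pair" as "at least one pair of parental populations" is an assumption rather than the study's stated procedure.

```python
from itertools import combinations

def max_pairwise_delta(freqs):
    """Largest absolute allele-frequency difference between any two populations."""
    return max(abs(a - b) for a, b in combinations(freqs.values(), 2))

def select_aims(marker_freqs, threshold=0.30):
    """Keep markers whose largest pairwise difference exceeds the threshold."""
    return [marker for marker, freqs in marker_freqs.items()
            if max_pairwise_delta(freqs) > threshold]

# Hypothetical allele frequencies in three parental populations.
marker_freqs = {
    "snp_A": {"European": 0.10, "African": 0.85, "Amerindian": 0.20},
    "snp_B": {"European": 0.45, "African": 0.50, "Amerindian": 0.40},
    "snp_C": {"European": 0.30, "African": 0.35, "Amerindian": 0.75},
}

aims = select_aims(marker_freqs)
print(aims)  # ['snp_A', 'snp_C']
```

Here snp_B is dropped because its frequencies are nearly identical across the three populations, so it carries little information about ancestry.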

Using a genome-wide DNA microarray (CytoScan HD Array, Affymetrix) that encompasses 750,000 single nucleotide polymorphisms (SNPs), we identified 345 markers that accurately estimate the ancestral composition of the Brazilian population in case-control studies using data from the array itself.

We performed a two-step validation of the 345 SNP-AIMs panel by also estimating the ancestral contributions with another panel of AIMs and with ~70K SNPs from the array. The 345 SNP-AIMs panel has the potential to be widely used in cytogenetic research and molecular genetics to study diseases whose incidence is affected by ancestry. Another noteworthy advantage of using a panel of AIMs to infer ancestry is that each individual's proportion of each parental component can be included as an independent variable in logistic regression models. This is especially suitable for correcting for ancestry in case-control studies.
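To illustrate the idea of entering an individual ancestry proportion as a covariate, here is a rough, self-contained sketch on simulated data. Everything in it is an assumption made for illustration: the simulated effect sizes, the helper names `solve` and `fit_logistic`, the hand-written Newton-Raphson fit, and the use of a single ancestry proportion. With three parental components, two of the three proportions would typically be entered, since they sum to one. It is not the analysis code used in the study.

```python
import math
import random

def solve(A, b):
    """Solve a small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_logistic(X, y, iterations=25):
    """Newton-Raphson logistic fit; rows of X must include the intercept."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (yi - p) * row[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * row[i] * row[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Simulated (hypothetical) cohort: disease risk depends on ancestry only,
# while the tested allele is more frequent at higher African ancestry.
random.seed(1)
X_unadjusted, X_adjusted, y = [], [], []
for _ in range(3000):
    african = random.random()                  # individual ancestry proportion
    allele_freq = 0.1 + 0.7 * african
    genotype = sum(random.random() < allele_freq for _ in range(2))
    risk = 1.0 / (1.0 + math.exp(-(-2.0 + 3.0 * african)))
    y.append(1 if random.random() < risk else 0)
    X_unadjusted.append([1.0, genotype])
    X_adjusted.append([1.0, genotype, african])

beta_unadjusted = fit_logistic(X_unadjusted, y)   # [intercept, genotype]
beta_adjusted = fit_logistic(X_adjusted, y)       # [intercept, genotype, ancestry]
print("genotype log-odds, unadjusted:", round(beta_unadjusted[1], 3))
print("genotype log-odds, adjusted:  ", round(beta_adjusted[1], 3))
```

In this simulation the genotype has no causal effect, so its unadjusted coefficient reflects confounding by ancestry; once the ancestry proportion enters the model, the genotype coefficient should shrink toward zero. In practice an established statistics package would be preferable to a hand-written fit.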

In our study, we demonstrated the application of the panel by comparing ancestry in a group of patients with systemic lupus erythematosus (SLE) and a group of healthy controls. The ancestry estimates based on the set of 345 SNP-AIMs showed that the European contribution was the largest in both groups, followed by African and Amerindian. Nevertheless, the European and African ancestries differed significantly between the two groups: SLE patients had a higher African contribution (22%) than healthy subjects (13%), while controls showed a European contribution 12% higher than that of SLE patients. This difference sheds light on the moderate population substructure detected between the case and control groups.

We also highlighted the divergence between self-declared ancestry and individual genetic background. Comparisons between genetic and self-declared ancestry in SLE patients showed at least 30% of undeclared ancestry, including African/Amerindian in Whites and European/Amerindian in Blacks. This finding underscores how crucial it is to evaluate ancestry with genetic markers in addition to individuals' perception of their own phenotypic characteristics.

In brief, spurious associations resulting from population stratification may be circumvented by using the 345 SNP-AIMs panel as a straightforward and efficient method to infer ancestry in case-control studies.

This study, Ancestry informative marker panel to estimate population stratification using genome-wide human array, was recently published in the journal Annals of Human Genetics.

About The Authors

Fernanda B Barbosa

Fernanda is a research scientist at the University of São Paulo.

Aguinaldo L Simões

Aguinaldo is a research scientist at the University of São Paulo, Departamento de Genética (Ribeirão Preto).
