To analyze if genetically determined Amerindian ancestry predicts the increased presence of risk alleles of known susceptibility genes for systemic lupus erythematosus.
Single nucleotide polymorphisms within 16 confirmed genetic susceptibility loci for SLE were genotyped in a set of 804 Mestizo lupus patients and 667 Mestizo normal healthy controls. In addition, 347 admixture informative markers were genotyped. Individual ancestry proportions were determined using STRUCTURE. Association analysis was performed using PLINK, and correlation of the presence of risk alleles with ancestry was done using linear regression.
A meta-analysis of the genetic association of the 16 SNPs across populations showed that TNFSF4, STAT4, PDCD1, ITGAM, and IRF5 were associated with lupus in a Hispanic-Mestizo cohort enriched for European and Amerindian ancestry. In addition, two SNPs within the MHC region, previously associated in a genome-wide association study in Europeans, were also associated in Mestizos. Using linear regression we predict an average increase of 2.34 risk alleles when comparing an SLE patient with 100% Amerindian ancestry to an SLE patient with 0% Amerindian ancestry (p<0.0001). SLE patients with 43% more Amerindian ancestry are predicted to carry one additional risk allele.
Amerindian ancestry increased the number of risk alleles for lupus.
Differences in the prevalence and severity of systemic lupus erythematosus (SLE) between various ethnicities are well documented. In particular, individuals of self-reported Hispanic (or Mestizo), Asian or African ancestry in the United States and Europe have been shown to have an earlier age of onset, a higher frequency of severe renal disease and a higher frequency of relapses than individuals of European ancestry (1–8). While socioeconomic factors do play a role in the increased morbidity and mortality of Hispanic individuals, it has never been analyzed whether genetically defined ancestry correlates with an increased frequency of risk alleles for lupus. We have previously shown that an increased proportion of Amerindian genome increases the risk for SLE (9). This observation has been confirmed (10). Further, we also described the strong genetic association between IRF5 and SLE in Mexican individuals combined with an increased frequency of homozygosity for the risk haplotype (11).
In the present work we have analyzed 804 Mestizo individuals with lupus for genetic association with polymorphisms within 16 confirmed SLE susceptibility loci (12–31) and queried if the frequency of risk alleles correlates with a higher proportion of genetically determined Amerindian ancestry defined using a set of admixture informative markers.
Herein, we describe that Amerindian ancestry increases the odds of having more lupus risk alleles as compared to European ancestry in Mestizo lupus patients.
A total of 804 patients with SLE and 667 healthy controls were drawn from three sources. The first was the OMRF collection from the Lupus Family Registry and Repository (LFRR; http://lupus.omrf.org), comprising 373 cases with SLE and 272 controls. The great majority of these individuals are of Mexican ancestry, born or living in the United States. The second source was a multicenter collaboration in Argentina: 242 cases and 240 controls that have been previously reported and were used in the analysis of genetic associations described previously for STAT4 (12), IRF5 (13), BANK1 (19), and TNFSF4 (20). The remaining samples are individuals reported here for the first time from an ongoing collection of SLE patients from Latin America known as GENLES. These comprise 101 SLE cases and 64 controls collected throughout Mexico (specifically from the cities of Guadalajara, Morelia, Culiacán and Mexico City) and 88 cases and 91 controls from the city of Lima, Peru, in South America.
All cases fulfilled the American College of Rheumatology classification criteria for lupus in its latest version (32).
Genotyping was performed using the Illumina Custom Bead system on the iSCAN instrument. Genotypes for the following SNPs within 16 confirmed susceptibility genes for SLE were used: rs2476601 (PTPN22), rs1801274 (FCGR2A), rs2205960 (TNFSF4), rs7574865 (STAT4), rs231775 (CTLA4), rs11568821 (PDCD1), rs6445975 (PXK), rs10516487 (BANK1), rs907715 (IL21), rs3131379 (MSH5 within the MHC class III region), rs1270942 (CFB, within the MHC class III), rs2070197 (IRF5), rs13277113 (C8ORF13-BLK region), rs1800450 (MBL2), rs4963128 (KIAA1542) and rs1143679 (ITGAM) (12–31).
In addition, 347 admixture informative markers (AIMs) were also used to genotype all individuals (Supplementary table 1) (33–35). We have selected a panel of AIMs that had large frequency differences between European populations and Amerindian populations. In addition, the inter-marker distance between two adjacent AIMs was at least 1Mb to ensure that the AIMs were not in linkage disequilibrium in the parental populations.
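The 1 Mb spacing rule can be illustrated with a short sketch. This is not the procedure used to build the panel: the greedy left-to-right pass, the helper name `thin_markers`, and the marker IDs and coordinates are all assumptions made for illustration, and the selection on allele-frequency differences is not modeled.

```python
from collections import defaultdict

MIN_SPACING_BP = 1_000_000  # 1 Mb, the minimum inter-marker distance stated in the text


def thin_markers(markers, min_spacing=MIN_SPACING_BP):
    """Keep markers so that adjacent kept markers on a chromosome are >= min_spacing apart.

    markers: iterable of (marker_id, chromosome, position_bp) tuples.
    Hypothetical helper using a greedy pass along each chromosome.
    """
    by_chrom = defaultdict(list)
    for marker_id, chrom, pos in markers:
        by_chrom[chrom].append((pos, marker_id))

    kept = []
    for chrom in sorted(by_chrom):
        last_pos = None
        for pos, marker_id in sorted(by_chrom[chrom]):
            # Keep the first marker, then any marker far enough from the last kept one.
            if last_pos is None or pos - last_pos >= min_spacing:
                kept.append((marker_id, chrom, pos))
                last_pos = pos
    return kept


# Made-up marker IDs and coordinates, for illustration only.
example = [
    ("aim_a", "1", 1_500_000),
    ("aim_b", "1", 2_000_000),  # 0.5 Mb after aim_a: dropped
    ("aim_c", "1", 2_600_000),  # 1.1 Mb after aim_a: kept
    ("aim_d", "2", 750_000),
]
selected = thin_markers(example)
```

A real panel would also rank candidate markers by the European versus Amerindian frequency difference before thinning, so that the most informative marker in each window is the one retained.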
Population structure was analyzed with STRUCTURE v2.3.1 (36), which implements a model-based clustering method for inferring population substructure using AIMs. We set most parameters to their default values as advised in the user’s manual. Specifically, we chose the admixture model and the option of correlated allele frequencies between populations as suggested by Falush et al. (36). The range of possible populations we tested was from K = 3 to 5 as described (37). The best fitting K was 4, corresponding to a mixture of four populations: African, European, Asiatic and Amerindian. We selected genotypes from European (CEU), Amerindian (MEX), Asiatic (CHB) and African (YRI) individuals from the HapMap version 3 dataset as potential ancestral populations (38). Individuals were excluded as outliers when they showed more than 10% African or Asian ancestry, in order to enrich for two ancestral populations, European and Amerindian. In total, 45 individuals were excluded from further analyses.
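The exclusion rule can be sketched as a filter over per-individual ancestry proportions of the kind STRUCTURE reports at K = 4. The sketch assumes the 10% threshold is applied to the African and Asian components separately, which the text does not state explicitly. The sample IDs, the proportions and the helper name `is_ancestry_outlier` are hypothetical.

```python
OUTLIER_THRESHOLD = 0.10  # "more than 10% African or Asian ancestry"


def is_ancestry_outlier(q, threshold=OUTLIER_THRESHOLD):
    """Return True when either the African or the Asian proportion exceeds the threshold.

    q: dict of ancestry proportions for one individual.
    Assumption: the threshold applies to each component on its own, not to their sum.
    """
    return q["african"] > threshold or q["asian"] > threshold


# Hypothetical individuals with made-up proportions (each row sums to 1).
samples = {
    "s1": {"european": 0.60, "amerindian": 0.35, "african": 0.03, "asian": 0.02},
    "s2": {"european": 0.40, "amerindian": 0.35, "african": 0.20, "asian": 0.05},
    "s3": {"european": 0.25, "amerindian": 0.60, "african": 0.02, "asian": 0.13},
}

retained = [sid for sid, q in samples.items() if not is_ancestry_outlier(q)]
excluded = [sid for sid, q in samples.items() if is_ancestry_outlier(q)]
```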
To account for confounding population substructure or admixture in the studied population we used principal component analysis (PCA) (39–42) as implemented in HelixTree using genotype data from the 347 AIMs. The first three PCs explained 71.7% of the variance among the first 10 PCs and had the following eigenvalues 42.1, 21.3 and 8.3. The eigenvalues for PC4-PC10 showed a plateau, suggesting that the first three PCs account for most of the populations’ substructure in this analysis. All individuals who were not clustering with the main Amerindian cluster (deviation of more than 4 SD from cluster centroids) were excluded from subsequent analysis. Using this method we identified 23 outlier individuals (15 healthy controls and 8 SLE patients).
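A minimal sketch of the 4 SD exclusion rule follows. It assumes PC scores are already available per individual and treats the per-PC mean as the cluster centroid, flagging anyone more than 4 sample standard deviations away on any retained PC. HelixTree's own procedure may differ. The function name `pca_outliers` and the scores are synthetic.

```python
from statistics import mean, stdev

SD_CUTOFF = 4.0  # deviation threshold, in standard deviations, taken from the text


def pca_outliers(scores, cutoff=SD_CUTOFF):
    """Return the set of sample IDs lying more than `cutoff` SD from the centroid on any PC.

    scores: dict mapping sample_id -> tuple of PC scores (same length for all samples).
    Assumption: the centroid is the per-PC mean over all samples supplied.
    """
    ids = list(scores)
    n_pcs = len(scores[ids[0]])
    flagged = set()
    for k in range(n_pcs):
        column = [scores[i][k] for i in ids]
        centre, spread = mean(column), stdev(column)
        if spread == 0:
            continue  # a constant PC cannot identify outliers
        for i in ids:
            if abs(scores[i][k] - centre) > cutoff * spread:
                flagged.add(i)
    return flagged


# Synthetic scores on three PCs: 30 tightly clustered samples plus one distant sample.
scores = {
    f"s{j}": ((-1.0 if j % 2 else 1.0), 0.1 * (j % 5), 0.0) for j in range(30)
}
scores["far"] = (30.0, 0.2, 0.0)

outliers = pca_outliers(scores)
```

With very small samples no point can reach 4 SD, because the outlier itself inflates the standard deviation, so a rule of this kind is only meaningful on cohorts of reasonable size.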
The genetic association analysis was done using PLINK v1.0.7 (43). First, quality control filters were applied to remove SNPs with a differential missing rate between cases and controls (P<0.05), significant deviation from Hardy-Weinberg equilibrium in controls (P<0.001) or a minor allele frequency < 1%. Allele frequencies of the remaining SNPs (16 of 16) were tested for significant association by a χ² test within each study population. The meta-analysis of all populations was conducted using standard methods based on the Cochran-Mantel-Haenszel test (44). The Breslow-Day test (45) was performed for all SNPs to assess heterogeneity of the odds ratios in different populations. The pooled OR was calculated according to a fixed-effects model (Mantel-Haenszel meta-analysis) for SNPs with homogeneity between populations and according to a random-effects model (DerSimonian-Laird) when heterogeneity was present, using the StatsDirect v2.4.6 software.
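For orientation, the fixed-effects Mantel-Haenszel pooled odds ratio can be computed directly from per-population 2×2 allele-count tables. The sketch below is not the PLINK or StatsDirect implementation. The function name `mantel_haenszel_or` and the allele counts are made up, and the Breslow-Day heterogeneity test and the random-effects alternative are not shown.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata using the Mantel-Haenszel estimator.

    strata: iterable of (a, b, c, d) allele counts per population, where
        a = risk alleles in cases,     b = other alleles in cases,
        c = risk alleles in controls,  d = other alleles in controls.
    OR_MH = sum(a*d/n) / sum(b*c/n), with n = a + b + c + d in each stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Made-up allele counts for two populations; each stratum has an odds ratio of 4.
strata = [
    (20, 10, 10, 20),
    (40, 10, 20, 20),
]
pooled_or = mantel_haenszel_or(strata)
```

When every stratum shares the same odds ratio, as in this toy input, the pooled estimate equals that common value. With heterogeneous strata it is a weighted average, which is why the heterogeneity test matters before relying on the fixed-effects estimate.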
Alternatively, we also derived principal components on a population-specific basis using HelixTree software v7.2.3, and applied an adjustment for the five first principal components.
We used linear regression to model the relationship between the proportion of Amerindian ancestry and the number of SLE risk alleles. Our initial model included the proportion of Amerindian ancestry, gender, and the interaction between gender and Amerindian ancestry as predictor variables for the number of SLE risk alleles. There was no evidence of interaction, so we refit the model with the two remaining predictor variables. Since we were interested in the association between the number of risk alleles and the proportion of Amerindian ancestry, we removed gender from the model, as neither predictor variable was significant when both were fit. Our final model included the proportion of Amerindian ancestry as the only predictor for the number of SLE risk alleles. All linear modeling assumptions were assessed and met.
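The final single-predictor model can be sketched as an ordinary least-squares fit of risk-allele count on Amerindian ancestry proportion. The data points below are synthetic and chosen only to make the arithmetic easy to follow, and `fit_simple_ols` is a hypothetical helper. The earlier model-selection steps (gender and the interaction term) are not reproduced.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor.

    Returns (intercept, slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data: ancestry proportion (0 to 1) and number of risk alleles carried.
ancestry = [0.0, 0.25, 0.5, 0.75, 1.0]
risk_alleles = [8, 9, 9, 10, 11]

intercept, slope = fit_simple_ols(ancestry, risk_alleles)
# The slope is the predicted difference in risk alleles between 0% and 100% ancestry.
```

Because ancestry is coded as a proportion between 0 and 1, the slope reads directly as the predicted difference in risk-allele count between an individual with 0% and one with 100% Amerindian ancestry.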
Population structure analyses showed the following mean proportions of Amerindian ancestry for each of the sets included (Table 1): 30.7% for OMRF Hispanics, 24.7% for Argentines, in agreement with what we had described previously (46), 52.3% for Mexicans and 72.6% for Peruvians. OMRF Hispanics differed from the Latin American sets by having higher proportions of North European ancestry, suggesting that some of these samples may include second or third generation Mexican-Americans in whom admixture with the European-American gene pool, mainly of North European ancestry, has occurred. The Latin American sets, on the other hand, clearly showed a South European component, as expected from the known history of these populations (Table 1).
For individual ancestry proportions, there were no differences between cases and controls in the four clusters. In addition, we did not observe any differences after comparing the clusters with and without population priors.
Using all Hispanic cases and controls, we first tested the 16 SNPs for genetic association with SLE. Association was observed for TNFSF4, STAT4, IRF5, MSH5, CFB, and ITGAM, and a trend of association was observed for PDCD1 (Table 2). The SNPs for C8orf13-BLK, BANK1 and PXK showed a significant degree of heterogeneity (P<0.0001, P = 0.023 and P = 0.001, respectively) across the different country sets, and this could have contributed to the fact that the final meta-analysis did not show a genetic association for these variants. This is particularly true for the C8orf13-BLK SNP; for BANK1 and PXK, however, heterogeneity might not explain the results, which could instead relate to insufficient power to detect the genetic association.
We have previously shown that Amerindian ancestry increases the risk for lupus (9), and this was later confirmed (10). Therefore we analyzed if the individual proportion of Amerindian ancestry had any effect on the number of risk alleles.
Linear regression (Figure 1) predicted an average increase of 2.34 risk alleles for a subject with 100% Amerindian ancestry as compared to a subject with 0% of such ancestry. In other words, an individual with 43 percentage points more Amerindian ancestry will on average carry one additional risk allele.
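The second figure follows from the first by inverting the slope. Writing the fitted model with $x$ as the proportion of Amerindian ancestry (0 to 1) and $\hat{y}$ as the predicted number of risk alleles (the symbols $\beta_0$, $\beta_1$ and $\Delta x$ are notation introduced here, not taken from the original analysis):

```latex
\hat{y} = \beta_0 + \beta_1 x, \qquad \beta_1 = 2.34

% Change in ancestry proportion needed for one additional predicted risk allele:
\Delta x = \frac{1}{\beta_1} = \frac{1}{2.34} \approx 0.427 \approx 43\%
```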
It has been consistently shown that individuals of Mestizo (Hispanic) descent have a more severe clinical lupus disease accompanied with earlier age of onset and severe renal disease. Mestizos are a very heterogeneous group of individuals with different cultural backgrounds but in general a common mother tongue, Spanish. The complexity of the Mestizo population does not allow for appropriate genetics studies unless such complexity is taken into consideration (1). With the aim to investigate if genes identified in lupus in Europeans also play a role in the disease in Mestizos, we have selected a group from Latin American countries with an enrichment of Amerindian and European ancestries based on population history and a set of Hispanics from the United States primarily originating from Mexico.
In general, the populations of Mexico, Peru and Argentina have lower African ancestry and primarily European and Amerindian ancestry. Our collection also includes samples from Southern Europe (Spain and Portugal) as reference, so we were able to discern between North and South Europeans. In this regard, OMRF Hispanics showed a high proportion of North European ancestry, in line with recent admixture with the European-American gene pool.
Testing the 16 SNPs representing risk variants of lupus susceptibility genes described in Europeans, we confirmed the genetic association previously found for IRF5, STAT4, TNFSF4, ITGAM and, to a lesser degree, the two SNPs within the MHC region and PDCD1. Interestingly, the two SNPs used here for the MHC were the same ones included in the GWAS that detected the highest genetic association in Europeans. Here, the genetic associations of the non-MHC variants were stronger than those of the MHC, suggesting two possibilities: either the MHC effect originates from the European admixture in the Amerindian background and is “diluted”, and/or other Amerindian genes play a very important role in disease susceptibility in Hispanics and in some way substitute for the strong effect that the MHC has in Europeans. However, these two SNPs do not tag MHC haplotypes and cannot be seen as representing the main effect of the MHC region in this population. For this, dense coverage of the region would be required. Such studies are underway: we are at present performing a GWAS in Hispanic-Mestizo individuals to answer this question. Regarding the remaining genetic associations, it is important to point out that this replication is not completely independent: the Argentine samples have been used previously in our work on BANK1, IRF5, TNFSF4 and STAT4 (12, 13, 19, 20). In fact, our previous work (9) showed an increased frequency of Amerindian genome in patients with SLE in this same set of Argentine patients. Here we observe a very similar average proportion of Amerindian genome in cases and controls, but we have also included new samples. The previous work used a completely different set of AIMs, although the number was smaller. At this point we are unable to explain the reason for this discrepancy.
As the individual sets of Mexican and Peruvian samples are new but each is relatively small, the associations were not discernible at the individual cohort level. The Peru sample showed a weak association for FCGR2A (P = 0.02), IRF5 (P = 0.004) and ITGAM (P = 0.01), while the Mexico set showed association with BANK1 (P = 0.0002) and ITGAM (P = 0.001). Most of the contribution to the genetic associations observed in the meta-analysis was provided by the Argentine and the OMRF Hispanic sets.
PDCD1 deserves further discussion. We identified PDCD1 as a susceptibility gene for lupus after linkage analysis in Icelandic and Swedish multiplex families, and we described a polymorphism in intron 4 associated with SLE, with replication in European-American, Swedish and Mexican cases with SLE (31). A second independent report replicated this genetic association in Mexican pediatric SLE patients (47), and we recently described a correlation between surface levels of PDCD1 protein (PD-1) in CD4+CD25+ T cells and the associated variants (known as PD1.3) (48). Here, the association was only observed in the Argentine SLE cases and controls (P = 0.013), a set never before analyzed for this polymorphism. Importantly, the Argentine set is the most European, which may affect our results and may be why the association is detectable in that set. Finally, no association was observed for CTLA4, IL21, MBL2 and KIAA1542, while BLK showed, as mentioned, extensive heterogeneity. The negative results of the meta-analysis for BLK should be viewed with caution.
What is the significance of the increased risk for individuals with Amerindian genome to carry risk alleles of lupus susceptibility genes identified in Europeans? First, it is possible that in Hispanics/Mestizos, the “European” risk alleles interact with genes important in the Amerindian background. This is somewhat reminiscent of what happens in the New Zealand mouse strains, where the New Zealand White background interacts with genes found in the New Zealand Black background leading to a strong and florid lupus-like disease in the resultant F1 strain (49, 50). To some degree Mestizo individuals from Latin America behave as a sort of genetic F1 where unknown genetic interactions might occur leading to an increased risk to develop severe SLE in the admixed population. On the other hand, our results might also be explained by an enrichment of European risk alleles due to positive selection.
From the data presented here we can suggest that the admixture may in part be responsible for the increased susceptibility for the disease and that the Amerindian background genome contributes to this increased risk. The identification of genes of Amerindian origin contributing to the increased risk for the disease is clearly justified.
The authors would like to thank Dr. Carl Langefeld and Dr. Jasmin Divers for the selection of admixture informative markers for the LLAS2 project, and Maria Luisa Ordoñez-Sanchez, Rosario Rodriguez-Guillen and Farideh Movafagh for technical assistance.
This work was supported by the following grants: the Swedish International Development Agency grant (SIDA), the Swedish Research Council for Medicine, the Instituto de Salud Carlos III and the NIH grants CA141700, AI083194 and the ARRA (NIAMS) grant AR058621 to M.E.A.R. The NIH grant R03AI076729 from the National Institute of Allergy and Infectious Diseases, and NIH Grants Number P20-RR020143, P30-AR053483, the Lupus Foundation of America, the University of Oklahoma Health Sciences Center, the Oklahoma City VA Medical Center, and the Oklahoma Medical Research Foundation. The NIH grants AR062277, AR042460, AI024717, AI083194, and RR020143 to J.B.H.
The members of the Argentine Lupus collaboration are: Hugo R. Scherbarth MD, Jorge A. Lopez MD, Estela L. Motta MD Servicio de Reumatología, Hospital Interzonal General de Agudos “Dr. Oscar Alende”, Mar del Plata, Argentina; Susana Gamron MD, Sandra Buliubasich MD, Emilia Menso MD Servicio de Reumatología de la UHMI 1, Hospital Nacional de Clínicas, Universidad Nacional de Córdoba, Córdoba, Argentina; Alberto Allievi MD, Jose L. Presas MD Hospital General de Agudos Dr. Juán A. Fernandez, Buenos Aires, Argentina; Guillermo A. Tate MD Organización Médica de Investigación, Buenos Aires, Argentina; Simon A. Palatnik MD, Mariela Bearzotti PhD Facultad de Ciencias Médicas, Universidad Nacional de Rosario y Hospital Provincial del Centenario, Rosario, Argentina; Alejandro Alvarellos MD, Francisco Caeiro MD, Ana Bertoli MD Servicio de Reumatología, Hospital Privado, Centro Médico de Córdoba, Córdoba, Argentina; Sergio Paira MD, Susana Roverano MD, Carlos Louteiro MD Hospital José M. Cullen, Santa Fe, Argentina; Cesar E. Graf MD, Estela Bertero PhD Hospital San Martín, Paraná; Cesar Caprarulo MD, Griselda Buchanan PhD Hospital Felipe Heras, Concordia, Entre Ríos, Argentina; Carolina Guillerón MD, Sebastian Grimaudo PhD, Jorge Manni MD Departamento de Inmunología, Instituto de Investigaciones Médicas “Alfredo Lanari”, Buenos Aires, Argentina; Luis J. Catoggio MD, Enrique R. Soriano MD, Carlos D. Santos MD Sección Reumatología, Servicio de Clínica Médica, Hospital Italiano de Buenos Aires y Fundación Dr. Pedro M. Catoggio para el Progreso de la Reumatología, Buenos Aires, Argentina; Cristina Prigione MD, Fernando A. Ramos MD, Sandra M. Navarro MD Servicio de Reumatología, Hospital Provincial de Rosario, Rosario, Argentina; Guillermo A. Berbotto MD, Marisa Jorfen MD, Elisa J. Romero PhD Servicio de Reumatología Hospital Escuela Eva Perón, Granadero Baigorria, Argentina; Mercedes A. Garcia MD, Juan C. Marcos MD, Ana I. Marcos MD Servicio de Reumatología, Hospital Interzonal General de Agudos General San Martín, La Plata; Carlos E. Perandones MD, Alicia Eimon MD Centro de Educación Médica e Investigaciones Clínicas (CEMIC), Buenos Aires, Argentina; Cristina G. Battagliotti MD Hospital de Niños Dr. Orlando Alassia, Santa Fe, Argentina.