Beta-2 microglobulin (B2M) is a component of the major histocompatibility complex (MHC) class I molecule and has been studied as a biomarker of kidney function, cardiovascular diseases and mortality. Little is known about the genes influencing its levels directly or through glomerular filtration rate (GFR). We conducted a genome-wide association study of plasma B2M levels in 6738 European Americans from the Atherosclerosis Risk in Communities (ARIC) study to identify novel loci for B2M and assessed its association with known estimated GFR (eGFR) loci. We identified 2 genome-wide significant loci. One was in the human leukocyte antigen (HLA) region on chromosome 6 (lowest p-value=1.8×10−23 for rs9264638). At this locus, 6 index SNPs accounted for 3.2% of log(B2M) variance, and their association with B2M could largely be explained by imputed classical alleles of the MHC class I genes: HLA-A, HLA-B, or HLA-C. The index SNPs at this locus were not associated with eGFR based on serum creatinine (eGFRcr). The other locus of B2M was on chromosome 12 (rs3184504 at SH2B3, beta=0.02, p-value=3.1×10−8), which was previously implicated as an eGFR locus. In conclusion, although B2M is known to be a component of MHC class I molecule, the association between HLA class I alleles and plasma B2M levels in a community-based population is novel. The identification of the two novel loci for B2M extends our understanding of its metabolism and informs its use as a kidney filtration biomarker.
Reduced kidney function is associated with higher risk of mortality, cardiovascular disease, and kidney failure (Bash et al. 2010; Matsushita et al. 2010). Beta-2-microglobulin (B2M) has been studied as a biomarker of kidney function and was found to be more sensitive than serum creatinine in detecting a modest decline in glomerular filtration rate (GFR) (Aksun et al. 2004; Bianchi et al. 2001; Woitas et al. 2001). In addition B2M has also been identified as a biomarker of cardiovascular outcomes (Amighi et al. 2011; Astor et al. 2012; Liabeuf et al. 2012; Prentice et al. 2010) and mortality (Astor et al. 2012; Hoke et al. 2012; Shinkai et al. 2008). Compared to more widely used kidney function biomarkers, serum creatinine and cystatin C, B2M has been shown to have stronger associations with cardiovascular outcomes and mortality in a community-based cohort (Astor et al. 2012) and has been found to improve risk stratification over cystatin C with respect to mortality in an elderly population (Shinkai et al. 2008). B2M has also been found to be associated with peripheral arterial disease (Wilson et al. 2007), multiple myeloma (Rossi et al. 2010), and inflammatory diseases (Bianchi et al. 2001; Zissis et al. 2001).
B2M (11.8kDa) is a component of the major histocompatibility complex (MHC) class I molecule (Cresswell et al. 1974) and was first isolated from the urine of patients with proteinuria (Berggard and Bearn 1968). When the MHC class I molecule is degraded, B2M is released into the bloodstream (Cresswell et al. 1974). In individuals with normal kidney function, B2M is almost entirely eliminated by the kidney: it is filtered at the glomerulus, then reabsorbed and metabolized by the proximal tubular cells. B2M levels are elevated in individuals with kidney disease (Bianchi et al. 2001; Karlsson et al. 1980). Therefore, investigation into the genetic factors influencing B2M levels in a community-based sample could potentially discover loci associated with B2M through kidney function. We conducted a genome-wide association study (GWAS) to identify genetic loci associated with plasma B2M levels in the Atherosclerosis Risk in Communities (ARIC) study. To assess whether the genome-wide significant loci of B2M were related to kidney function, we tested for the associations between these loci and estimated GFR based on serum creatinine (eGFRcr). Conversely, we also tested for the associations between B2M levels and index SNPs of known eGFR loci (Kottgen et al. 2010) to assess B2M as a kidney function biomarker.
Table 1 presents the study population characteristics. A total of 6738 adults with a mean age of 63 years and a mean eGFRcr of 82.8mL/min/1.73m2 were included in this study.
The GWAS of B2M identified two genome-wide significant loci (Table 2). Figure 1 presents a plot of the −log10(p-values) by genomic position for over two million SNPs tested. Supplementary Figure 1 presents the quantile-quantile plot of the GWAS results, which had a genomic inflation factor of 1.03, indicating very little test statistic inflation. One of the genome-wide significant loci was in the human leukocyte antigen (HLA) region on chromosome 6 (lowest p-value=1.8×10−23 for rs9264638 with minor allele frequency [MAF] of 0.42, Supplementary Figure 2). The other was on chromosome 12 (rs3184504 in SH2B3, beta=0.02, p=3.1×10−8, MAF=0.49, Supplementary Figure 3) and was previously implicated as a kidney function locus (Kottgen et al. 2010).
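As an illustration of how a genomic inflation factor such as the 1.03 reported here is commonly computed, the sketch below converts each p-value to a 1-degree-of-freedom chi-square statistic and divides the observed median by the median expected under the null. This is a minimal sketch of the standard definition, not the authors' code; the function name and the evenly spaced toy p-values are assumptions made for illustration.

```python
from statistics import NormalDist, median


def genomic_inflation_factor(p_values):
    """Median observed 1-df chi-square statistic divided by its null median."""
    std_normal = NormalDist()
    # A two-sided p-value p corresponds to z = inv_cdf(p / 2); chi-square = z ** 2.
    chi2 = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution under the null (about 0.455).
    null_median = std_normal.inv_cdf(0.75) ** 2
    return median(chi2) / null_median


# Toy input (hypothetical): evenly spaced p-values mimic a null distribution,
# so the resulting factor should be very close to 1.
n = 1001
toy_p = [(i - 0.5) / n for i in range(1, n + 1)]
lambda_gc = genomic_inflation_factor(toy_p)
```

Values close to 1, as in the toy input, indicate that the bulk of the test statistics follow the null distribution.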
Of the over 1000 genome-wide significant SNPs on chromosome 6, pruning by pairwise r2 < 0.3 left 42 SNPs. Then stepwise regression identified six SNPs that were statistically independent in their associations with B2M (Table 2 and Supplementary Figure 2). Two of the six SNPs (rs9260489 and rs2023472) were 6.7kb and 162kb from HLA-A, respectively. The other four SNPs were from 2kb to 166kb from HLA-B or HLA-C. Supplementary Table 1 presents the pairwise linkage disequilibrium measures (r2 and D′) of the six index SNPs calculated from the most likely genotype in our sample.
Since B2M is a component of the major histocompatibility complex (MHC) class I molecule and binds with the class I heavy chain, and the six index SNPs are within or near the MHC class I genes, we imputed classical HLA alleles of the MHC class I genes (HLA-A, HLA-B, and HLA-C) and performed association analyses to determine whether the classical HLA alleles could account for the GWAS signals. The imputed classical HLA alleles in the three genes all had median posterior probability > 0.99 with the first quartile > 0.98. Linear regression analyses identified the following imputed classical alleles to be associated with log(B2M) (p<3.8×10−4 for HLA-A, p<2.6×10−4 for HLA-B, and p<3.3×10−4 for HLA-C): A*01:01, A*02:01, A*03:01, A*23:01, B*08:01, B*15:01, B*37:01, B*40:01, B*44:03, B*57:01, C*03:03, C*03:04, C*07:01. Table 3 presents the regression results and the allele frequencies of these significant imputed classical alleles. Supplementary Table 2 presents the regression results of all imputed classical alleles with frequency > 1%. In conditional regression analyses controlling for the imputed genotypes at HLA-A, HLA-B, or HLA-C, the associations between B2M and the six GWAS index SNPs were substantially weakened (Supplementary Table 3). Linear regression analysis showed that the imputed HLA genotypes explained 3.6% of log(B2M) variance, similar to that accounted for by the six GWAS index SNPs. Adding the GWAS index SNPs to this linear regression model increased the percentage of variance explained by only 0.56%, suggesting that the classical HLA alleles could explain the GWAS signals.
To examine whether this B2M HLA locus was a kidney function locus, we tested for the associations between the six B2M HLA index SNPs and eGFRcr in the ARIC samples; no significant associations were found (p>0.0083=0.05/6, Supplementary Table 4). To increase power, we also interrogated the associations between these six SNPs and eGFRcr in the much larger sample (N=67,093) of the CKDGen consortium, which is a collaboration of over 30 studies (Kottgen et al. 2010), and did not find evidence of associations (p>8.3×10−3=0.05/6, Supplementary Table 4), suggesting this locus is a non-kidney-related locus of B2M.
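The significance threshold quoted here (0.05/6) is a Bonferroni correction for six tests. A minimal sketch of that arithmetic follows; the function names are ours and the snippet is illustrative rather than the authors' code.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def any_significant(p_values, alpha=0.05):
    """True if at least one p-value falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return any(p < threshold for p in p_values)


# Six HLA index SNPs tested against eGFRcr: 0.05 / 6, about 0.0083.
six_snp_threshold = bonferroni_threshold(0.05, 6)
```

The same function gives 0.05/16, about 0.003, for a set of 16 tests.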
Given that the HLA region has been implicated in many complex diseases (Alekseyenko et al. 2011; Fernando et al. 2012; International Multiple Sclerosis Genetics et al. 2011; Lucena et al. 2011; McLachlan et al. 2011; Othman et al. 2011; Ramos et al. 2011; Skinningsrud et al. 2011), we tested for the associations between the six B2M index SNPs at the HLA locus and all-cause mortality in the ARIC study and found no evidence of associations (p>8.3×10−3=0.05/6, Supplementary Table 5).
Finally, to assess B2M as a kidney function biomarker, we tested for the associations between scaled B2M (see Methods section for definition) and the index SNPs in 16 known eGFR loci (Kottgen et al. 2010). These 16 eGFR loci were identified in a total sample size of over 90,000 individuals of European ancestry, including the ARIC European American (EA) cohort (Kottgen et al. 2010). Within the ARIC EA cohort, compared with eGFRcr and eGFR estimated using cystatin C (eGFRcys), the effect estimates of scaled B2M were in the same direction for all eGFR index SNPs. The probability of having effect estimates in the same direction for all 16 index SNPs by chance alone was 3.1×10−5. This confirmed B2M as a kidney function biomarker. The associations between scaled B2M and the index SNPs at two eGFR loci (SHROOM3 and ATXN2/SH2B3) were statistically significant (p=1.9×10−4 for rs17319721 at SHROOM3 and p=2.0×10−8 for rs653178 at ATXN2/SH2B3, Supplementary Table 6), compared with seven loci for eGFRcr and four for eGFRcys at the same p-value threshold. The eGFR index SNP at the ATXN2/SH2B3 locus (rs653178) was in high linkage disequilibrium with rs3184504, the B2M index SNP at this locus (r2 and D′ of 1 in HapMap release 22 CEU).
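The chance probability of 3.1×10−5 can be checked with a short calculation. The sketch below assumes that, under the null, each locus independently has a 50% chance of agreeing in direction, and that the quoted figure counts both fully concordant outcomes (all 16 effects agreeing or all 16 opposing). That two-sided reading is our assumption; it is one interpretation that reproduces the reported value, and the function name is ours.

```python
def all_concordant_probability(n_loci, two_sided=True):
    """Chance that n independent 50/50 direction calls all come out the same."""
    one_sided = 0.5 ** n_loci
    # Two-sided: every call agrees, or every call disagrees.
    return 2.0 * one_sided if two_sided else one_sided


p_concordant = all_concordant_probability(16)
```

Under the one-sided reading (`two_sided=False`) the probability would be half as large, so the qualitative conclusion does not depend on the choice.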
In a genome-wide association study of plasma B2M levels in 6738 European Americans, we identified two novel loci, one in the HLA region on chromosome 6 and another at SH2B3 on chromosome 12, a known eGFR locus. Together, these two loci explained 3.4% of the variance of log(B2M). We localized the signals to SNPs close to the MHC class I genes (HLA-A, HLA-B, and HLA-C) and showed the B2M GWAS signals in this region could largely be explained by imputed classical HLA alleles. The B2M index SNPs at the HLA locus were not associated with eGFR or all-cause mortality. Of the 16 known eGFR loci, all showed directionally consistent association with B2M, but only two were statistically significant.
Our study is the first genome-wide association study of B2M levels. As a component of the MHC class I molecule, B2M binds with the class I heavy chain and is essential for the expression of the class I molecule on the cell surface (Nieto et al. 1989; Vitiello et al. 1990). This provides a clear biological rationale for our discovery of the association between plasma B2M levels and variants at the HLA class I genes. This study established a highly significant statistical association between B2M levels and HLA class I alleles. The class I heavy chain binds with B2M non-covalently (Bianchi et al. 2001). Using human cell lines, Hochman et al. showed that MHC class I heavy chains had different affinities for B2M on the cell surface (Hochman et al. 1988). This provides a potential mechanism for the influence of HLA class I variants on plasma B2M levels. This study identified HLA class I variants that may have different binding affinities for B2M on the cell surface. Research suggests that, after the expression of the MHC class I molecule on the cell surface, the binding of B2M may still have implications for the immune response (Bodnar et al. 2003; Rock et al. 1991).
On the other hand, some classical HLA alleles associated with B2M (HLA-A*01, HLA-B*08, and HLA-C*07) are on the 8.1 ancestral haplotype, which has long been associated with immunopathological diseases (Price et al. 1999; Raychaudhuri et al. 2012). To further examine the possible disease association of this B2M HLA class I locus, we searched the GWAS catalog (Hindorff et al.) for possible association between the B2M GWAS signals and inflammatory diseases. We queried index SNPs in the HLA class I region (30 Mb to 33 Mb) that were reported to be associated with autoimmune disorders in individuals of European ancestry. Of the 17 index SNPs identified from the GWAS catalog, four had p-values < 5×10−8 in the GWAS of B2M (Supplementary Table 7). One of the reported index SNPs for multiple sclerosis (rs9260489) was also a B2M index SNP. The three other index SNPs and their reported disease associations were rs2523393 with multiple sclerosis (De Jager et al. 2009), rs3134792 with psoriasis (Capon et al. 2008), and rs3131379 with systemic lupus erythematosus (International Consortium for Systemic Lupus Erythematosus et al. 2008), which is a risk factor for lupus nephritis. The prevalence estimates of these autoimmune diseases in the U.S. population are low: 46 per 100,000 for multiple sclerosis (Rosati 2001), 3.15% for psoriasis (Kurd and Gelfand 2009), and 0.12% for systemic lupus erythematosus (Uramoto et al. 1999). It is unlikely that these diseases could explain the associations between B2M and the HLA class I variants in our community-based cohort.
B2M is a major component of dialysis-related amyloidosis (Drueke and Massy 2009). Since HLA class I variants may influence B2M levels independent of kidney function, whether the B2M-associated HLA class I variants are associated with the rate of B2M amyloid deposit accumulation may be worth further investigation. However, the lack of association of the B2M GWAS index SNPs in the HLA region with eGFR or mortality suggests that small variation in B2M levels is not directly related to decreased kidney function. Thus, elevated B2M levels mark low GFR, and we have no evidence of the opposite causal direction. If B2M is used for estimating GFR, information on the B2M-associated variants that do not influence GFR may be useful for removing variability unrelated to kidney function and thus improving the precision of GFR estimation.
The other B2M locus was at the gene SH2B3 on chromosome 12. This locus was previously identified as an eGFR locus mainly due to its association with eGFRcys (Kottgen et al. 2010) and was also associated with other traits or diseases, including blood pressure (Newton-Cheh et al. 2009), lipid level (Talmud et al. 2009), hematocrit (Ganesh et al. 2009), microcirculation (Ikram et al. 2010), type 1 diabetes (Reddy et al. 2011), celiac disease (Amundsen et al. 2010), and multiple sclerosis (Alcina et al. 2010). Further research is needed to determine whether the association between this locus and B2M is mediated mainly by kidney function or other conditions are also involved.
Several strengths of this study are noteworthy. Our study was the first GWAS of B2M and had a large representative sample of European Americans. Confounding by population stratification was controlled using principal components. B2M levels were measured centrally with a high reliability coefficient (0.98), thus reducing heterogeneity of the phenotype. Nonetheless, the results should be interpreted in light of a few limitations. First, the two genome-wide significant loci of B2M have not been replicated in other population-based cohorts. However, B2M's role as a component of the MHC class I molecule supports the statistical association of B2M levels and HLA class I variants in a population-based cohort. The statistical association was highly robust (p<10−21). The SH2B3 locus was previously shown to be associated with eGFRcr and eGFRcys, suggesting a common mechanism for the association of a third filtration marker, B2M. Second, the association analyses of the imputed HLA classical alleles were based on most likely genotype calls. However, these imputed HLA classical alleles had high posterior probabilities with a median of over 0.99. Finally, the associations of B2M with known eGFR loci were assessed without a GFR estimation equation for B2M, which could have improved the ability to control for non-kidney determinants of B2M levels. In addition, the pathological implications of the HLA variants associated with higher B2M remain unclear. Additional studies examining the relation between the B2M-associated HLA variants and phenotypes known to be associated with B2M levels, such as peripheral arterial disease (Wilson et al. 2007), multiple myeloma (Rossi et al. 2010), and dialysis-related amyloidosis (Drueke and Massy 2009), will be important for determining whether the variants are causal for these conditions.
For B2M-associated HLA variants that are known to be associated with autoimmune diseases, studies can be conducted to investigate whether B2M levels act as a mediator between the HLA variants and the phenotypes.
This study identified two genetic loci of plasma B2M. The HLA class I locus influences up to 3.2% of the log(B2M) variance and is unrelated to eGFR, while the SH2B3 locus is associated with eGFR. In addition to SH2B3, we also showed that the associations between the index SNPs at 15 other known eGFR loci and B2M are directionally consistent with their associations with eGFR based on serum creatinine. The sharing of at least a subset of genetic loci between B2M and eGFRcr offers additional supporting evidence of B2M as a kidney function biomarker.
The Atherosclerosis Risk in Communities (ARIC) Study is a prospective observational cohort study of middle-aged adults (baseline age between 45 and 64) in four US communities. Details of the study design were reported previously (ARIC 1989). Briefly, four examinations, each about three years apart, were conducted between 1987 and 1998. The baseline sample included 15,792 participants (approximately 12,000 European Americans and 4,000 African Americans). Only the European American cohort was included in this analysis.
B2M, serum creatinine, and cystatin C were measured from blood samples obtained at visit 4. B2M was measured from plasma with nephelometric technology run on the Dade Behring Nephelometer II (BNII) system (reliability coefficient: 0.98 in 390 replicates after removing 9 outliers > 3 standard deviations). The number of B2M measures available from the European American cohort was 8,269. After filtering by the availability of genotype data (see Supplementary Methods section), the final sample size was 6738. Serum creatinine levels were measured using the modified kinetic Jaffe method and calibrated to the age-, sex-, and race-specific means in the Third National Health and Nutrition Examination Survey (NHANES III). Cystatin C levels were measured using the BNII system. All-cause mortality was ascertained through active surveillance (annual telephone calls, screening of all known hospitalizations, and local obituaries) up to January 1, 2008.
The genotyping was supported by the National Institutes of Health Gene Environment Association Studies (GENEVA) project (Cornelis et al. 2010) and used the Affymetrix 6.0 platform. After the genotype data were received from GENEVA, additional quality filters were applied to remove individuals who were genetic outliers or closely related. Imputation of over 2.5 million SNPs was performed using MACH v1.0.16 with the HapMap phase II CEU reference panel. Principal components for capturing sub-population stratification were generated using EIGENSTRAT (Price et al. 2006). Details of individual exclusion criteria for imputation and principal component generation are reported in the Supplementary Methods section.
B2M was transformed to the natural log scale. Genome-wide association analyses were conducted using the imputed allele dosage as the predictor, adjusted for age, gender, center, and principal components associated with the outcome at p-value < 0.05 (the second and sixth principal components) to control for population stratification. The genome-wide significance threshold was set at 5×10−8. To search for independent signals among the over 1000 genome-wide significant SNPs in the HLA region, we included significant SNPs with minor allele frequency > 5% and imputation quality (ratio of allele dosage variance to expected binomial variance) > 0.9, then pruned SNP pairs with pairwise r2>0.3. After pruning, we conducted stepwise regression with backward selection based on the Schwarz Bayesian information criterion. To determine whether other independent signals existed in the region in addition to the selected SNPs from stepwise regression, we conducted conditional regression analyses of the entire region controlling for the imputed dosage of the selected SNPs. The pairwise r2 and D′ of the selected index SNPs were calculated based on most likely genotype using PLINK 1.07.
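The SNP filtering and pruning step can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It makes three assumptions that are not stated in the text: pairwise r2 is taken as the squared Pearson correlation of allele dosages, pruning is greedy with the most significant SNP kept first, and the expected binomial variance is 2p(1−p) with p estimated from the mean dosage. All function names, SNP labels, and data are hypothetical.

```python
from statistics import fmean, pvariance


def minor_allele_frequency(dosages):
    """MAF estimated from allele dosages on the 0..2 scale."""
    p = fmean(dosages) / 2.0
    return min(p, 1.0 - p)


def dosage_quality(dosages):
    """Observed dosage variance divided by the expected binomial variance."""
    p = fmean(dosages) / 2.0
    expected = 2.0 * p * (1.0 - p)
    return pvariance(dosages) / expected if expected > 0.0 else 0.0


def r_squared(x, y):
    """Squared Pearson correlation between two dosage vectors."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0.0 or syy == 0.0 else sxy * sxy / (sxx * syy)


def filter_and_prune(snps, p_values, min_maf=0.05, min_quality=0.9, max_r2=0.3):
    """Apply MAF and quality filters, then greedily prune correlated SNPs."""
    eligible = [
        name for name, d in snps.items()
        if minor_allele_frequency(d) > min_maf and dosage_quality(d) > min_quality
    ]
    kept = []
    # Assumed ordering: visit SNPs from most to least significant.
    for name in sorted(eligible, key=lambda s: p_values[s]):
        if all(r_squared(snps[name], snps[k]) <= max_r2 for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: rsB duplicates rsA, rsD is poorly imputed.
toy_snps = {
    "rsA": [0, 1, 1, 2, 0, 2],
    "rsB": [0, 1, 1, 2, 0, 2],
    "rsC": [1, 2, 0, 1, 2, 0],
    "rsD": [0.9, 1.0, 1.0, 1.1, 0.9, 1.1],
}
toy_p = {"rsA": 1e-10, "rsB": 1e-8, "rsC": 1e-9, "rsD": 1e-12}
kept = filter_and_prune(toy_snps, toy_p)
```

In the toy data, the poorly imputed SNP is removed by the quality filter even though it has the smallest p-value, and the duplicate SNP is removed by pruning. The backward selection on the Bayesian information criterion would then run on the SNPs that remain.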
Imputation of classical HLA alleles at the HLA-A, HLA-B, and HLA-C loci was conducted using HLA*IMP (Dilthey et al. 2011). To test for the associations between the imputed HLA classical alleles and B2M, we coded each allele with frequency > 1% as biallelic and conducted linear regression using log(B2M) as the outcome, controlling for age, gender, and center. The significance thresholds were set at 0.05 divided by the number of alleles with frequency > 1% in each HLA gene. To determine the percentage of log(B2M) variance explained by significant imputed HLA classical alleles, we constructed HLA genotypes by grouping together all non-significant alleles as one variant and included the genotypes as categorical predictors in linear regression.
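The biallelic coding described here can be illustrated with a short sketch. It assumes each individual's most likely genotype at one HLA gene is available as a pair of allele names; the function names and the three toy genotypes are hypothetical, and the regression step itself is omitted.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Frequency of each classical allele among all called chromosomes."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * len(genotypes)
    return {allele: n / total for allele, n in counts.items()}


def biallelic_dosage(genotypes, allele):
    """Copies (0, 1, or 2) of one allele versus all other alleles combined."""
    return [pair.count(allele) for pair in genotypes]


def common_allele_dosages(genotypes, min_freq=0.01):
    """Biallelic coding for every allele above the frequency cutoff."""
    freqs = allele_frequencies(genotypes)
    return {
        allele: biallelic_dosage(genotypes, allele)
        for allele, freq in freqs.items()
        if freq > min_freq
    }


# Hypothetical most likely HLA-A genotypes for three individuals.
toy_genotypes = [
    ("A*01:01", "A*02:01"),
    ("A*02:01", "A*02:01"),
    ("A*03:01", "A*01:01"),
]
dosages = common_allele_dosages(toy_genotypes)
# Per-gene threshold: 0.05 divided by the number of alleles tested in the gene.
per_gene_threshold = 0.05 / len(dosages)
```

Each resulting dosage vector would serve as the predictor in a separate linear regression of log(B2M), with the per-gene threshold applied to the resulting p-values.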
To assess whether the HLA alleles could explain the GWAS signals, we conducted regression analyses of each index SNP controlling for each HLA genotype. To estimate the percentage of the variance of log(B2M) explained by the GWAS SNPs independent of the imputed classical HLA alleles, we conducted linear regression analyses of the imputed classical HLA allele genotypes with and without the GWAS SNPs and calculated the difference in multiple R2.
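The difference in multiple R2 between the two nested models can be sketched as below. The snippet assumes the fitted values from each regression have already been obtained from any regression routine; the function names and toy numbers are hypothetical and bear no relation to the study data.

```python
from statistics import fmean


def multiple_r_squared(observed, fitted):
    """Multiple R-squared: 1 minus residual over total sum of squares."""
    mean_obs = fmean(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - ss_resid / ss_total


def incremental_r_squared(observed, fitted_reduced, fitted_full):
    """Extra variance explained when predictors are added to a nested model."""
    return (multiple_r_squared(observed, fitted_full)
            - multiple_r_squared(observed, fitted_reduced))


# Hypothetical values: the reduced model stands in for HLA genotypes only,
# the full model for HLA genotypes plus the GWAS index SNPs.
toy_outcome = [1.0, 2.0, 3.0, 4.0]
toy_fit_reduced = [2.0, 2.0, 3.0, 3.0]
toy_fit_full = [1.5, 2.0, 3.0, 3.5]
delta_r2 = incremental_r_squared(toy_outcome, toy_fit_reduced, toy_fit_full)
```

A small increment, as with the 0.56% reported in the Results, indicates that the added SNPs carry little information beyond the HLA genotypes already in the model.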
Next, to assess whether the B2M HLA locus was a kidney function locus, we tested for the associations between the B2M index SNPs at the HLA class I region and eGFR in ARIC. eGFR was calculated from serum creatinine using the Modification of Diet in Renal Disease (MDRD) equation (eGFRcr) (Levey et al. 1999). To assess the possible disease risk of the B2M HLA locus, we tested for the associations between the B2M index SNPs at the HLA class I region with all-cause mortality using proportional hazard regression with baseline set at visit 4 and follow-up data up to 2008, adjusted for age, sex, center and principal components associated with the outcome with p-value < 0.05.
Finally, to assess the associations of B2M with known eGFR loci, we tested for the associations of the index SNPs in 16 known eGFR loci (Kottgen et al. 2010) with B2M and compared the B2M associations with these SNPs with two eGFR measures: eGFRcr and eGFR calculated with cystatin C (eGFRcys) using the equation in Stevens et al. with age and sex as covariates (Stevens et al. 2008). In this analysis, B2M was transformed by taking the reciprocal and scaled to have the same mean as the average of eGFRcr and eGFRcys, so the effect size estimates of the eGFR index SNP against B2M could be compared with those against eGFRcr and eGFRcys. We named this outcome scaled B2M. The significant p-value threshold was set at 0.003 (=0.05/16). The software tools were ProbABEL (Aulchenko et al. 2010) for GWAS and R and SAS 9.2 for other analyses.
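The scaled B2M outcome can be sketched as follows. The text specifies a reciprocal transformation and a matching mean; the sketch assumes that the scaling is a single multiplicative constant and that the target is the sample mean of each person's average of eGFRcr and eGFRcys. Both are our reading of the description, and the function name and toy values are hypothetical.

```python
from statistics import fmean


def scaled_b2m(b2m, egfr_cr, egfr_cys):
    """Reciprocal of B2M, rescaled so its mean matches the mean eGFR."""
    reciprocal = [1.0 / value for value in b2m]
    # Target: sample mean of each person's average of the two eGFR measures.
    per_person = [(cr + cys) / 2.0 for cr, cys in zip(egfr_cr, egfr_cys)]
    factor = fmean(per_person) / fmean(reciprocal)
    return [value * factor for value in reciprocal]


# Hypothetical toy values (B2M in mg/L, eGFR in mL/min/1.73m2).
toy_b2m = [1.0, 2.0, 4.0]
toy_egfr_cr = [90.0, 80.0, 70.0]
toy_egfr_cys = [100.0, 80.0, 60.0]
toy_scaled = scaled_b2m(toy_b2m, toy_egfr_cr, toy_egfr_cys)
```

Taking the reciprocal makes higher values correspond to better filtration, as with eGFR, so SNP effect estimates on scaled B2M have the same sign convention and a comparable scale.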
Adrienne Tin, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, U.S.A. 443-287-4740.
Brad C. Astor, Department of Medicine, School of Medicine and Public Health, University of Wisconsin, Madison, WI 53705, U.S.A. 608-626-0361.
Eric Boerwinkle, Human Genetics Center, The University of Texas Health Science Center at Houston, Houston, TX 77225, U.S.A. 713-500-9816.
Ron C. Hoogeveen, The Margaret M. and Albert B. Alkek Department of Medicine, Division of Atherosclerosis and Vascular Medicine, Baylor College of Medicine, Houston, TX 77030, U.S.A. 713-798-3407.
Josef Coresh, Welch Center for Prevention, Epidemiology and Clinical Research, The Johns Hopkins Medical Institutions, 2024 E. Monument Street, Suite 2-600, Baltimore, MD, U.S.A. 410-955-0495.
Wen Hong Linda Kao, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD, U.S.A. 410-614-0945.