Chronic kidney disease (CKD) is a significant public health problem, and recent genetic studies have identified common CKD susceptibility variants. The CKDGen consortium performed a meta-analysis of genome-wide association data in 67,093 Caucasian individuals from 20 population-based studies to identify new susceptibility loci for reduced renal function, estimated by serum creatinine (eGFRcrea), cystatin C (eGFRcys), and CKD (eGFRcrea <60 ml/min/1.73m2; n = 5,807 CKD cases). Follow-up of the 23 genome-wide significant loci (p<5×10−8) in 22,982 replication samples identified 13 novel loci for renal function and CKD (in or near LASS2, GCKR, ALMS1, TFDP2, DAB2, SLC34A1, VEGFA, PRKAG2, PIP5K1B, ATXN2, DACH1, UBE2Q2, and SLC7A9) and 7 creatinine production and secretion loci (CPS1, SLC22A2, TMEM60, WDR37, SLC6A13, WDR72, BCAS3). These results further our understanding of biologic mechanisms of kidney function by identifying loci potentially influencing nephrogenesis, podocyte function, angiogenesis, solute transport, and metabolic functions of the kidney.
Chronic kidney disease (CKD) is estimated to affect over 13% of adults1 and is increasing in prevalence.1;2 This poses a significant global disease burden as the risk for end stage renal disease (ESRD), cardiovascular morbidity, and mortality increases with declining glomerular filtration rate (GFR),3 the most commonly used measure of kidney function. In addition, CKD incurs substantial expenditures in the US,4 with similar trends expected globally.5
Despite the increasing prevalence of CKD, our understanding of the underlying risk factors and pathophysiologic mechanisms remains incomplete.5 Hypertension and diabetes are major risk factors for CKD.6 However, the marked variability in the development of CKD in the setting of hypertension and diabetes demonstrates that additional underlying factors contribute to its etiology.7 In particular, studies have consistently demonstrated important genetic contributions to estimated GFR (eGFR), CKD and ESRD.8;9 Using genome-wide association, we have recently identified susceptibility variants for renal function and CKD at the UMOD, SHROOM3, and STC1 loci in nearly 20,000 individuals.10 Together, single nucleotide polymorphisms (SNPs) at these loci explain only 0.43% of the variance in eGFR,10 suggesting that additional loci remain to be identified.
Thus, we have now performed a genome-wide association meta-analysis in 67,093 Caucasian participants from 20 general population-based cohorts within the CKDGen consortium, followed by independent replication of our findings in 22,982 Caucasian individuals. We analyzed GFR estimated from serum creatinine by the Modification of Diet in Renal Disease (MDRD) Study equation (eGFRcrea) as well as CKD (eGFRcrea <60 ml/min/1.73m2). To discriminate true susceptibility loci for renal function from those related to creatinine production and secretion, we used GFR estimated from a second serum marker of kidney function, cystatin C (eGFRcys).
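As an illustration of the two creatinine-based traits defined above, the following is a minimal sketch of the 4-variable MDRD Study equation and the CKD cut-off of 60 ml/min/1.73m2. The function names are hypothetical, and the leading coefficient of 175 (for standardized creatinine assays; 186 is used with non-standardized assays) is an assumption, since the text here does not state which version was applied.

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, black=False,
              coefficient=175.0):
    """Estimate GFR (ml/min/1.73m2) with the 4-variable MDRD Study equation.

    coefficient=175.0 is an assumption (standardized creatinine);
    pass 186.0 for the original, non-standardized form.
    """
    if serum_creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = coefficient * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742  # sex term of the MDRD equation
    if black:
        egfr *= 1.212  # race term of the MDRD equation
    return egfr


def has_ckd(egfr):
    """CKD as defined in the text: eGFRcrea below 60 ml/min/1.73m2."""
    return egfr < 60.0


# Illustrative input only, not study data.
example = egfr_mdrd(1.0, 50, female=False)
print(round(example, 1), has_ckd(example))
```

Because eGFRcrea is a deterministic function of serum creatinine, any variant that changes creatinine production or secretion shifts eGFRcrea without changing filtration, which is why a second marker is needed.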
Overall, 90,075 individuals (67,093 in Stage 1 discovery and 22,982 Stage 2 replication) contributed information to the analysis of eGFRcrea, 84,740 individuals (62,237 Stage 1 discovery and 22,503 Stage 2 replication) to the analysis of CKD, and 26,071 to the analysis of eGFRcys (20,957 Stage 1 discovery and 5,114 Stage 2 replication; Table 1).
Table 2 summarizes information for the 28 genomic loci that contained at least one genome-wide significant SNP association (p<5×10−8) for any of the three discovery traits; the SNP with the lowest p-value at each locus is presented. In addition to confirming 5 known loci,10 we identified 23 novel loci containing genome-wide significant SNPs (p-values between 4.5×10−8 and 3.8×10−12): 20 SNPs were identified in association with eGFRcrea, 2 SNPs with CKD, and 1 SNP with eGFRcys. Of note, rs7805747 in the PRKAG2 gene was identified in discovery analyses for both eGFRcrea and CKD, as was the known lead SNP at the UMOD locus.
Figure 1A shows the genome-wide −log10 p-value plot from Stage 1 discovery association analyses with eGFRcrea, Figure 1B with CKD, and Figure 1C shows the eGFRcys results. The respective quantile-quantile plots are presented in Supplementary Figure 1. Study-specific and median imputation quality for the lead SNPs can be found in Supplementary Table 2.
Because serum creatinine concentration is influenced both by renal function as well as by creatinine production or secretion, we used eGFRcys as a second measure of renal function to help distinguish between true renal function loci and creatinine production or secretion loci. Thus, of the 23 newly discovered loci, 16 were classified as renal function loci based on a direction-consistent association with eGFRcys with p-value of <0.05 (Table 2 and Supplementary Table 3), and 7 were classified as loci related to creatinine metabolism. One SNP, rs653178 at the ATXN2 locus, was identified primarily in association with eGFRcys.
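The classification rule described above can be sketched as follows. This is a minimal illustration, assuming per-SNP effect estimates on both traits for the same allele; the function and argument names are hypothetical, the example values are invented, and whether the eGFRcys p-value is one- or two-sided is not specified in this passage.

```python
def classify_locus(beta_egfrcrea, beta_egfrcys, p_egfrcys, alpha=0.05):
    """Label a locus using the direction-consistency rule from the text.

    A locus counts as a renal function locus when the eGFRcys effect has
    the same sign as the eGFRcrea effect and its p-value is below alpha;
    otherwise it is treated as related to creatinine metabolism.
    """
    same_direction = beta_egfrcrea * beta_egfrcys > 0
    if same_direction and p_egfrcys < alpha:
        return "renal function"
    return "creatinine production or secretion"


# Invented effect sizes, for illustration only.
print(classify_locus(-0.010, -0.008, 0.001))  # same sign, significant
print(classify_locus(-0.010, 0.002, 0.40))    # opposite sign, not significant
```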
The lead SNP at each of the 20 novel loci for eGFRcrea, the 2 loci for CKD, and the novel locus for eGFRcys was evaluated for independent replication in Stage 2 analyses with the respective discovery trait. After meta-analysis of Stage 1 discovery and Stage 2 replication results, the associations for all but 3 SNPs (rs16864170 at SOX11, rs1933182 at SYPL2, rs4014195 at RNASEH2C) became more significant (Table 2). Additionally, 16 of these 20 SNPs also showed a significant association in the replication samples alone (one-sided p-value<0.0025 for eGFRcrea, p<0.025 for CKD, and p<0.05 for eGFRcys, Table 2). Thus, after integrating evidence for replication with association results for eGFRcys, 13 replicated loci for renal function and 7 replicated loci likely related to creatinine metabolism were identified. Of note, rs653178 at the ATXN2 locus, which was identified primarily in association with eGFRcys, was also associated with eGFRcrea in the combined discovery and replication results. Regional association plots for all replicated loci related to renal function are shown in Supplementary Figure 2.
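The replication thresholds quoted above are numerically consistent with dividing 0.05 by the number of lead SNPs tested per trait (20, 2, and 1). The sketch below shows that arithmetic together with an inverse-variance fixed-effect combination of two stages. Both are assumptions made for illustration: this passage does not name the correction or the meta-analysis model, and all function names and input values are hypothetical.

```python
import math


def replication_threshold(n_snps_tested, alpha=0.05):
    """Per-trait threshold, assuming alpha is split across the SNPs tested."""
    return alpha / n_snps_tested


def fixed_effect_meta(estimates):
    """Combine (beta, standard_error) pairs by inverse-variance weighting.

    Returns the pooled beta, its standard error, and a two-sided p-value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


thresholds = {trait: replication_threshold(n)
              for trait, n in {"eGFRcrea": 20, "CKD": 2, "eGFRcys": 1}.items()}
print(thresholds)

# Invented stage-level estimates, for illustration only.
print(fixed_effect_meta([(0.10, 0.02), (0.10, 0.02)]))
```

When both stages point in the same direction, the pooled standard error shrinks and the combined p-value falls, which is the pattern described for all but three of the SNPs.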
Of the SNPs that were validated by Stage 2 replication and were also associated with eGFRcys, ALMS1, DAB2, SLC34A1, PRKAG2, VEGFA, DACH1, and SLC7A9 can be linked to renal function and/or disease and are highlighted together with LASS2 and GCKR in Box 1. The remaining novel renal function loci were located in or close to TFDP2, PIP5K1B, UBE2Q2, and ATXN2. In spite of a lack of clear biological connection to renal function at these loci, findings were consistent across eGFRcrea, CKD, and eGFRcys analyses. More information on these genes is presented in Supplementary Box 1.
SLC7A9 encodes an amino acid transporter in renal proximal tubule cells; mutations in SLC7A9 cause cystinuria type B (OMIM #220100).21 Patients with cystinuria excrete elevated amounts of amino acids, resulting in the formation of stones in the urinary tract. SLC7A9-deficient mice display tubular and pelvic dilatation, tubular necrosis, and chronic interstitial nephritis.22
Mutations in SLC34A1 cause hypophosphatemic nephrolithiasis/osteoporosis (OMIM #612286).23 SLC34A1 encodes the type IIa Na/Pi cotransporter, which is exclusively expressed in kidney and located in the brush border of renal proximal tubular cells, where it mediates reuptake of inorganic phosphate.24
Mutations in ALMS1 cause the autosomal recessive Alström syndrome (OMIM #203800), characterized by retinal degeneration, hearing loss, obesity, diabetes, and commonly renal insufficiency.25;26 Mutations in this gene are associated with age-dependent ciliopathies in the kidney.27
DAB2 is a cytoplasmic adaptor protein expressed in renal proximal tubular cells,28 where it is reported to represent the physical link between megalin and non-muscle myosin heavy polypeptide 9, encoded by MYH9.29 Common variants in MYH9 were recently identified as important susceptibility alleles for non-diabetic kidney disease in African Americans.30;31
VEGFA encodes vascular endothelial growth factor A, with roles in angiogenesis and vascular permeability.32 Renal podocytes produce large amounts of VEGFA, which is essential for glomerulogenesis and glomerular filtration barrier formation in animal models.32 VEGFA has been reported to affect ureteric bud growth during embryogenesis and hence may impact the number of nephrons.33
The product of GCKR inhibits hepatic glucokinase.34 Common variants in GCKR, and specifically the missense SNP rs1260326 (P446L), are associated with a variety of human traits in genetic association studies, including serum triglycerides, fasting glucose, C-reactive protein, and uric acid as well as susceptibility to type 2 diabetes (http://www.genome.gov/GWAstudies/), highlighting the pleiotropy of this locus. eSNP analyses point to the role of the neighboring IFT172 gene, which has a role in the formation of primary cilia.35
Rare PRKAG2 variants cause a form of heart disease, featuring hypertrophic cardiomyopathy and the Wolff-Parkinson-White syndrome36;37 and sometimes enlarged kidneys.38 Studies in transgenic mice indicate that these mutations cause a glycogen storage disease of the heart.37 Several other hereditary glycogen storage diseases present with renal pathology such as renal tubular dysfunction.39
DACH1 (Dachshund homologue 1) is a transcription factor with a role in organogenesis; our data implicate a 100kb region within this gene. It is expressed in adult human kidney, as well as murine developing kidney, specifically glomerular podocytes and tubular epithelial cells.40 DACH1 may have a role in the development of the Müllerian duct.41 DACH1 is part of the genetic network including SIX and EYA,42 mutations in which cause branchio-oto-renal syndrome.43
LASS2 is highly expressed in the kidney and may be involved in cell growth.44 A non-synonymous coding SNP in LASS2, rs267738 (E115A), was in perfect LD with the lead SNP in the region and was predicted to be damaging.45 LASS2 has been implicated in the synthesis of specific ceramides.46 Ceramides and their product sphingolipids are important in genetic diseases of the kidney,47 and have a role in aging mechanisms.
Of the 20 SNPs that replicated, the remaining 7 SNPs were not associated with eGFRcys and hence were considered as likely creatinine production or secretion loci (CPS1, SLC22A2, TMEM60, WDR37, SLC6A13, WDR72, and BCAS3). More information on these loci is presented in Supplementary Box 1.
The majority of the 13 validated renal function loci were nominally associated with CKD (Table 3), underscoring how the use of intermediate phenotypes can provide insight into disease-based traits. The odds ratios associated with CKD for each additional copy of the minor allele ranged from 0.93 to 1.19, and minor allele frequencies ranged from 0.20 to 0.50.
Since diabetes mellitus and hypertension are major risk factors for kidney disease, we investigated the association of the replicated renal function loci with eGFRcrea stratified by diabetes or hypertension status in the discovery cohorts. None of the SNP associations reported in Table 2 differed significantly across strata of diabetes or hypertension (significance threshold p<0.008, corresponding to a Bonferroni-corrected alpha of 0.1 for 13 tests).
The 13 confirmed and the three previously identified renal function loci account for 1.4% of the variation in eGFRcrea. A genetic risk score was computed using all 16 validated renal loci (13 novel, 3 known) and data from the ARIC study. Across categories of the genetic risk score, mean eGFRcrea ranged from 86.9 (SD 18.7) to 71.1 (SD 14.7) ml/min/1.73m2 from the lowest (10) to the highest (25) risk score category, and CKD prevalence ranged from 3.9 to 23.6%, respectively.
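One common way to build such a score is an unweighted count of risk alleles across the loci, which for 16 biallelic SNPs ranges from 0 to 32; the reported categories of 10 and 25 are consistent with counts of this kind. The sketch below assumes that unweighted form, which this passage does not state explicitly, and uses hypothetical names and invented genotypes.

```python
def genetic_risk_score(risk_allele_counts, n_loci=16):
    """Sum risk-allele counts (0, 1, or 2 per SNP) across the scored loci.

    Assumes an unweighted score; a weighted variant would multiply each
    count by that SNP's effect size before summing.
    """
    if len(risk_allele_counts) != n_loci:
        raise ValueError(f"expected {n_loci} genotypes")
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each genotype must carry 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


# Invented genotypes for one individual, for illustration only.
genotypes = [1, 2, 0, 1, 1, 2, 1, 0, 2, 1, 1, 1, 0, 2, 1, 1]
print(genetic_risk_score(genotypes))
```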
To obtain evidence for the presence of functional variants at the identified genomic risk loci and to prioritize genes in the associated regions, we focused on SNPs previously identified in genome-wide studies as significantly related to gene expression in liver (n=3,322),11 lymphocytes (n=29,094),12 or lymphoblastoid cell lines (n=10,823).13 These expression SNPs (eSNPs) were then evaluated for their association with eGFRcrea, CKD, and eGFRcys from the discovery analysis. Table 4 shows that 9 of the 20 novel susceptibility loci identified for eGFRcrea (7/13 renal function loci) contained one or more significant eSNPs in one or more of the expression tissues queried. In addition, three of the previously identified loci (SHROOM3, GATM, and CST3) also contained at least one eSNP. At each locus, the correlation (r2) between the lead SNP and the significant eSNP in strongest LD with it ranged from 0.01 (FBXO22) to >0.9 (DAB2, GCKR; Table 4). The lead SNP in GCKR (rs1260326), a non-synonymous coding variant, was significantly associated with gene expression of the neighboring IFT172 gene. Further, the eSNP data support SLC7A9 as the important gene at the chromosome 19 susceptibility locus, as well as ALMS1 at the susceptibility locus on chromosome 2p13, since an eSNP in perfect LD (r2=1) with rs13538 in the HapMap CEU population is significantly associated with ALMS1 transcript expression in lymphocytes. All eSNPs with significant association with at least one renal trait are listed in Supplementary Table 4.
In order to further identify genomic loci for kidney function and disease, secondary analyses were conducted to identify SNPs that did not reach genome-wide significance but were associated with eGFRcrea or CKD at an FDR of 0.05 (p-value <4.8×10−6). After exclusion of all SNPs within 1 Mb of a genome-wide significant SNP, 9 additional loci for eGFRcrea were identified, 4 of which were also associated with eGFRcys (Supplementary Table 5). We also compared the identified FDR-loci with the eSNP analysis; 3 regions of overlap were identified. The r2 between each FDR SNP and the eSNP in highest LD with that FDR SNP ranged from 0.18 (ARL15) to 1.0 (CASP9, CRKRS), further supporting these genomic regions as loci of interest. Based on known biology, PARD3B and CASP9 are particularly interesting genes emerging from FDR analyses. PARD3B is important in establishing cell polarity and localizes to tight junctions of epithelial cells;14 it is most highly expressed in fetal and adult kidney. CASP9 is involved in the growth of metanephroi in the developing kidney.15
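An FDR of 0.05 translates into a data-dependent p-value cut-off, which is how a threshold such as the one quoted above arises. The sketch below uses the Benjamini-Hochberg step-up procedure to show the idea; the passage does not name the FDR method actually used, so this choice is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg_threshold(p_values, fdr=0.05):
    """Return the largest p-value declared significant at the given FDR.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= k * fdr / m, and accept every p-value up to p_(k).
    Returns 0.0 when nothing passes.
    """
    m = len(p_values)
    threshold = 0.0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * fdr / m:
            threshold = p
    return threshold


# Invented p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
            0.060, 0.074, 0.205, 0.212, 0.216]
cutoff = benjamini_hochberg_threshold(p_values)
print(cutoff, [p for p in p_values if p <= cutoff])
```

With millions of tested SNPs, the resulting cut-off is far stricter than 0.05 but more permissive than the genome-wide threshold of 5×10−8, which is why additional suggestive loci emerge.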
Our principal findings are four-fold. We have identified 20 novel replicated loci in association with eGFR and CKD. Of these, 13 are likely to be involved in renal function and susceptibility to CKD, whereas 7 likely represent creatinine production or secretion loci. In aggregate, the 13 new renal function loci plus the three previously identified renal function loci account for 1.4% of the variation in eGFRcrea. We demonstrate altered transcript expression with SNPs at several of the identified loci, providing potential functional insight. Lastly, we provide suggestive evidence for an additional 9 eGFRcrea-associated loci using a false discovery rate metric, of which 4 loci are suspected to be related to renal function.
These findings extend previous knowledge of common genetic variation related to renal function indices. We have confirmed our prior identification of common risk variants at the UMOD, SHROOM3, and STC1 loci, as well as at two positive control loci (GATM, CST3).10 We now highlight 13 novel loci not previously known to be associated with renal function in population-based studies. In the course of our work, we have also uncovered 7 loci likely influencing creatinine production or secretion. This underscores the importance of separating genetic loci that affect concentrations of a biomarker independent of underlying disease processes from those that truly reflect disease association. Similar to what we have previously reported,10 we identified many more robust associations for eGFRcrea as compared to CKD. Nonetheless, nominal associations with CKD were identified for most of the renal function SNPs, showing that genetic variants associated with normal variation in eGFRcrea are also associated with the clinically important entity CKD.
Our findings highlight in several important ways how GWAS can aid in uncovering the genetic underpinnings of complex human traits and diseases and can represent a first step towards a better understanding of physiological mechanisms and pathways. First, they provide novel information about the allelic architecture of known risk loci for genetic diseases of the kidney. Rare mutations in SLC7A9, SLC34A1, ALMS1, and UMOD are known to cause monogenic diseases that feature a renal phenotype, underscoring the additional importance of common genetic susceptibility variants in these genes. This phenomenon is also observed for other complex traits and diseases; approximately 20% of loci discovered in GWAS of a variety of complex traits are known to also harbor mutations that cause monogenic diseases.16
Second, our findings provide information about the genomic location of genetic variants associated with renal indices. Over 65% of the SNPs identified in our discovery analyses are located in or within 3.7 kb upstream of genes, and three of the variants are non-synonymous coding. This is in agreement with a recent study that reported an enrichment of trait-associated variants identified in GWAS at non-synonymous coding sites and a depletion of trait-associated variants in intergenic regions when compared to a random selection of variants on genotyping arrays.17;18
Third, our replicated findings highlight the role of several pathways and mechanisms of importance in renal development and function. We identified common genetic variants in genes related to nephrogenesis (ALMS1, VEGFA, potentially DACH1), glomerular filtration barrier formation and podocyte function (DAB2, PARD3B, VEGFA), angiogenesis (VEGFA), solute transport (SLC7A9, SLC34A1), and metabolic functions of the kidney (PRKAG2, potentially GCKR and LASS2). Several of the genes we identified can be linked to the role of primary cilia (ALMS1, GCKR/IFT172, PARD3B); mutations in genes with a role in development and function of primary cilia are known to cause hereditary genetic diseases of the kidney such as polycystic kidney disease and nephronophthisis.19 The ability to uncover genetic variation in these genes in unselected individuals from population-based studies emphasizes their contribution to important mechanisms related to renal function in humans under physiological conditions and should provide interesting candidates for follow up in functional studies.
Finally, our data provide novel information about components of creatinine metabolism in humans and specifically about how creatinine is handled by epithelial cells of the proximal renal tubule. This may be of interest not only to physiologists but may also have consequences on the precision of GFR estimation from serum creatinine in the clinical setting.
Similar to what has been observed previously, we identified modest effects of the risk alleles on eGFR and CKD. Taken together, these renal function loci are associated with 1.4% of the variation in eGFRcrea. We observed substantial gradation of CKD prevalence across the genetic risk score, indicating the potential clinical relevance of these risk alleles. Most importantly, our findings point toward novel mechanisms that may lead to a better understanding of both renal development and the pathogenesis of CKD.
The strength of our analysis lies in the large sample size of 67,093 used for discovery, allowing us to uncover multiple loci despite the small effect size on eGFR and CKD. We restricted our analyses to population-based studies thereby avoiding potential bias from using case-control samples20 or from potential counter-regulating disease processes. We employed several additional methods to enhance our ability to discover novel kidney disease susceptibility loci in this screen free of prior biological hypotheses including an FDR and eSNP approach. To enable discrimination of renal function loci from creatinine production and secretion loci, we used a separate complementary measure of glomerular filtration obtained from cystatin C.
Some limitations warrant mention. Our sample consists of white participants only, and it is uncertain whether these results would replicate in other ethnic groups. We used an indirect measure of GFR as estimated by the MDRD equation; gold-standard measures of glomerular filtration are not feasible in a large population-based setting. We used eGFRcys to provide confirmatory evidence of which novel loci were indeed renal function loci. Loci that were not associated with eGFRcys were designated as presumptive creatinine production or secretion loci, but the smaller sample size with available cystatin C measures limits the power to confirm renal function loci. We may have falsely labeled some loci as unrelated to renal function based on an absent association with eGFRcys, particularly WDR37 and WDR72, which showed association of borderline significance with eGFRcys. Lastly, several genes exist in the regions of interest. For many of the reported loci, we are unable to identify which is the most likely gene related to the SNP association based on statistical evidence, although we could address this question to some extent using the eSNP data.
Multiple common genetic variants are associated with indices of renal function and highlight the role of specific genes in nephrogenesis, podocyte function, angiogenesis, solute transport, and metabolic functions of the kidney.