Genetic studies in Turkish, Native American, European American, and African American (AA) families have linked chromosome 18q21.1-23 to susceptibility for diabetes associated nephropathy. In this study we have carried out fine linkage mapping in the 18q region previously linked to diabetic nephropathy in AAs by genotyping both microsatellite and single nucleotide polymorphisms (SNPs) for linkage analysis in an expanded set of 223 AA families multiplexed for type 2 diabetes associated ESRD (T2DM-ESRD). Several approaches were used to evaluate evidence of linkage with the strongest evidence for linkage in ordered subset analysis with an earlier age of T2DM diagnosis compared to the remaining pedigrees (LOD 3.9 at 90.1cM, ΔP=0.0161, NPL P value = 0.00002). Overall, the maximum LODs and LOD-1 intervals vary in magnitude and location depending upon analysis. The linkage mapping was followed up by performing a dense SNP map, genotyping 2,814 SNPs in the refined LOD-1 region in 1,029 AA T2DM-ESRD cases and 1,027 AA controls. Of the top 25 most associated SNPs, 10 resided within genic regions. Two candidate genes stood out: NEDD4L and SERPINB7. SNP rs512099, located in intron 1 of NEDD4L, was associated under a dominant model of inheritance (P value = 0.0006; Odds ratio (95% Confidence Interval) (OR (95%CI)) = 0.70 (0.57-0.86)). SNP rs1720843, located in intron 2 of SERPINB7, was associated under a recessive model of inheritance (P value = 0.0017; OR (95% CI) = 0.65 (0.50-0.85)). Collectively, these results suggest that multiple genes in this region may influence diabetic nephropathy susceptibility in AAs.
African Americans have a 3.6-fold increased risk of developing end-stage renal disease (ESRD) compared to Caucasian Americans and at least a 1.8-fold increased risk of developing ESRD compared to other racial/ethnic minority groups in the United States (U.S. Renal Data System 2008). Diabetes-associated nephropathy is the most common source of ESRD, accounting for approximately 45% of cases in the U.S., and the incidence rate continues to rise, 2.5% in 2006 (U.S. Renal Data System 2008). Many studies, using a variety of methods, have shown that there is a genetic component to ESRD in the general population and in African Americans, as summarized previously (Bowden 2003). When looking at familial aggregation, African Americans who have a close relative with ESRD have a 9-fold increased risk of developing ESRD (Freedman et al. 1995), while Caucasian Americans have only a 2.7-fold increased risk (Spray et al. 1995). Taking these results together, the African American population has a disproportionately higher incidence of ESRD along with a stronger familial component. Diabetes is the single most common contributor to ESRD in the United States, yet the origins of diabetes-associated nephropathy and ESRD are poorly understood. It remains unclear why some diabetes-affected individuals will progress to nephropathy and ESRD, and why others will not. This suggests that environmental factors, as well as genetic susceptibility, contribute to nephropathy risk.
We previously reported evidence for linkage of diabetic nephropathy to chromosome 18q21.1-23 in 166 African American families (Bowden et al. 2004). Linkage and association analyses in other populations have provided consistent evidence for a diabetic nephropathy (DN) susceptibility locus/loci on 18q: 18q22.3-23 in Turkish and Pima Indian families (Vardarli et al. 2002), 18q22.3 in European American and American Indian families (Iyengar et al. 2007), and European American type 1 diabetes patients (Ewens et al. 2005). Here we report results from an expanded genetic linkage analysis and association analysis of a high density SNP map across the 18q21.1-23 interval.
Recruitment and sample collection procedures were approved by the Institutional Review Board at Wake Forest University. DNA samples were collected from self-described African American families with multiple type 2 diabetes mellitus (T2DM) affected subjects with either end-stage renal disease (ESRD) or chronic renal failure (CRF). For the purposes of this report these affected individuals will be referred to collectively as diabetic nephropathy (DN) cases and are treated the same with the exception of analyses that incorporate age at diagnosis of ESRD or duration of diabetes to ESRD. In these analyses only ESRD cases (and not CRF cases) were incorporated into the models, since a clear definition of age at onset of CRF was not possible.
Briefly, families were originally identified through a proband with T2DM associated ESRD. T2DM was diagnosed in probands developing diabetes and treated with diet and exercise or oral hypoglycemic agents during at least part of their disease history. Medical records were reviewed to verify the etiology of the nephropathy. Renal failure was attributed to diabetes when serum creatinine ≥ 2.0 mg/dl with either diabetes duration for >10 years, or proliferative diabetic retinopathy in the absence of other known causes of renal failure. When proteinuria data was available, all subjects had CRF defined as proteinuria ≥ 500 mg/24 hours, a urine protein:creatinine ratio ≥ 0.5 mg/g or ≥ 100 mg/dl proteinuria on urine dipstick. Diabetic nephropathy (DN) affected siblings and, when possible, other available family members were recruited also. Selection criteria and recruitment strategies have been previously described in detail (Freedman et al. 1997; Yu et al. 1996; Yu et al. 1998; Yu et al. 2000; Freedman et al. 2000; Freedman et al. 2002). The family set for the linkage fine mapping comprised 223 African American families with 270 DN affected sibling pairs, made up primarily of 233 full-sibling pairs and 37 half-sibling pairs, from a total of 476 DN affected individuals. One hundred seventy-one of the families contained 2 affected siblings, 17 families had 3 affected siblings, and 2 families had 4 affected siblings with a total of 796 individual subjects. In general, the family data consisted primarily of individuals from a single generation, with both parents available in none of the families and one parent for 6 families. Of the DN affected individuals, 406 had T2DM with ESRD and 70 had diabetes with chronic renal failure (CRF). Seventy-one individuals in the families had T2DM without a diagnosis of ESRD or CRF, of which 33 were unaffected and 38 had unknown renal status. DNA extraction was performed using the PureGene system (Gentra Systems, Minneapolis, MN).
DNAs from a total of 1,029 DN cases and 1,027 non-diabetic controls without nephropathy were genotyped for association analysis. DN cases were ascertained, recruited, and diagnosed using the same manner as outlined above for the family collection. Controls were healthy, self-reported African Americans born in North Carolina, age ≥ 18 years, and denying a personal or family history of kidney disease in 1st degree relatives. Controls were recruited from community resources including health fairs, churches, and shopping malls. Recruitment and sample collection procedures were approved by the Institutional Review Board at Wake Forest University. DNA extraction was performed using the PureGene system (Gentra Systems, Minneapolis, MN).
There were 64 additional DN-affected sibling pairs added to the African American family population for linkage mapping. These subjects were part of a genome-wide scan completed by the Center for Inherited Disease Research (CIDR), through the National Institute of Diabetes and Digestive and Kidney Diseases–funded Family Investigation of Nephropathy and Diabetes. The marker set was based on Marshfield Panel 8, with ~10% of the markers changed from the previous Marshfield panel. It was composed of mainly tetra-, tri-, and di- nucleotide repeats, including 385 primer pairs with an average spacing of 9.0cM and no intermarker gaps greater than 20cM.
SNPs for fine mapping were chosen using the marker list from Illumina's HumanLinkage-12 Marker Panel downloaded from the Center for Inherited Disease Research (CIDR) website (www.cidr.jhmi.edu/snp_marker.html) using genome build version 34. SNPs were picked from the LOD-1 region (bld34: 49,229,785-75,887,528) of the previously published linkage peak (Bowden et al. 2004). SNPs were selected with an average spacing of 0.97cM, and no interSNP gaps greater than 2.6cM. When there was more than one SNP in the region, the SNP with the highest minor allele frequency in the African American population was chosen. Map distances were based on the Rutgers genetic map (Matise et al. 2007).
Forty-nine SNPs were genotyped using the MassARRAY system from Sequenom, Inc. (Sequenom, San Diego, CA) (Buetow et al. 2001) in 796 AA subjects. Primer sequences are available on request. SNP rs872994 failed to genotype in the population. Genotyping success rates among the other 48 SNPs were >90.5%. For quality control (QC) purposes, 19 samples were run in duplicate. Concordance among blind duplicates was 100%.
One microsatellite marker, D18S880, was also added to the study. D18S880 is in the CNDP1 gene and has been shown to be associated with DN in European-derived populations (Janssen et al. 2005; Freedman et al. 2007). This marker was genotyped by fragment length analysis on an ABI Prism DNA Analyzer 3700 (Applied Biosystems, Inc., Foster City, CA) using a method similar to that previously described (Janssen et al. 2005). Fragment length was analyzed using ABI Prism GeneMapper software v3.0 (Applied Biosystems, Inc.). There were 19 duplicate samples run in order to ensure quality control. Of the replicate pairs, 100% were concordant.
Each pedigree was examined for consistency of familial relationships using PREST (Pedigree RElationship Statistical Test) (McPeek et al. 2000). When the self-reported familial relationships were inconsistent with that determined from the observed genotypic data for that pedigree, then 1) the pedigree was modified when the identity by descent (IBD) statistics suggested a very clear alternative, or 2) a minimal set of genotypic data was converted to missing. Each genetic marker was also examined for Mendelian inconsistencies using PedCheck (O'Connell et al. 1998), and sporadic problem genotypes were converted to missing.
A dense SNP map was completed by the Center for Inherited Disease Research (CIDR), and was funded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). SNPs were identified through SNP databases, HapMap resources, and Illumina design scores. Initially, all Yoruban (YRI) SNP genotype, allele frequency, and LD data from HapMap (public data release #21; includes phase I and II data) for SNPs in the fine mapped LOD-1 region (bld35: 48,263,676-65,774,799) were selected. SNPs were then filtered based on minor allele frequency (MAF), retaining only SNPs with MAF ≥ 0.05. Next, the resulting SNP set was scored by Illumina using their proprietary assay designability scoring algorithm. In order to maximize probability of successful genotyping, the minimum threshold Illumina design score was set to 0.8. Tag SNPs were identified from these SNPs using an algorithm that selects an Illumina-designable SNP to tag each bin of SNPs (designable+non-designable) using a threshold LD score (r² ≥ 0.8). These SNPs were added to the final SNP list.
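The bin-tagging step can be illustrated with a minimal sketch. The source does not describe the algorithm beyond the r² ≥ 0.8 threshold and the restriction to designable tags, so the greedy strategy below (repeatedly pick the designable SNP that covers the most untagged SNPs) is an assumption, as are the function name `greedy_tags` and the input layout with precomputed pairwise r² values.

```python
def greedy_tags(snps, designable, r2, threshold=0.8):
    """Greedy LD-bin tagging sketch (hypothetical helper, not the CIDR pipeline).

    snps:       list of SNP names in the region
    designable: set of SNP names that can be assayed (eligible as tags)
    r2:         dict mapping frozenset({a, b}) -> pairwise r^2
    Returns {tag SNP: sorted list of SNPs in its bin}.
    """
    def linked(a, b):
        # A SNP always tags itself; otherwise require r^2 at or above threshold.
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    untagged = set(snps)
    bins = {}
    while True:
        best, best_bin = None, set()
        for cand in sorted(designable):  # sorted for deterministic tie-breaking
            covered = {s for s in untagged if linked(cand, s)}
            if len(covered) > len(best_bin):
                best, best_bin = cand, covered
        if best is None:  # nothing left that a designable SNP can tag
            break
        bins[best] = sorted(best_bin)
        untagged -= best_bin
    return bins
```

Non-designable SNPs that are not in LD with any designable SNP simply remain untagged in this sketch.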
Next, haplotype tagging SNPs were identified by evaluating the HapMap genotype data using the Haploview program (Barrett et al. 2005) to generate the set of Gabriel-algorithm (Gabriel et al. 2002) defined haplotype blocks and compute haplotype frequencies for haplotypes of frequency 1% or greater. The tagsnps program (D.O. Stram, USC) was used to identify haplotype tagging SNPs (htSNPs) for these Gabriel-algorithm derived haplotype blocks in the region. tagsnps (beta version 2) generates htSNP tag sets using a greedy stepwise algorithm to reduce the block-tagging SNP sets until the minimum haplotype variance (Rh²) drops below a threshold (in this case 0.8). These SNPs were added to the final SNP list which was filtered for duplicate SNPs.
Residual SNP map slots were backfilled using designable bin tag SNPs identified by Perlegen in an independent study and using physical map coverage from HapMap data. A search was conducted for any adjacent inter-SNP region >20 kb in length in the SNP map and these regions were checked for published Perlegen tag SNPs with MAF≥0.05 in AA populations (Perlegen ethnicity samples are distinct from HapMap samples). After merging the Perlegen tag SNPs with the map and uniqueness filtering, the map was checked for SNPs that were ≤ 60 bases apart since assayed SNPs closer than 60 bases cannot be included in the same Illumina multiplex oligo pool. Where necessary, SNPs were deleted from the map to meet the 60 bp separation requirement, with deletion preference in the order: physical map coverage SNP > Perlegen tag SNP > htSNP > HapMap tag SNP. This step was followed by a final manual check and editing to arrive at 2×1536 SNPs (actually 2,962 genotyping SNPs with 40 SNPs reserved for QC purposes and 70 SNPs reserved for admixture adjustment purposes).
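The 60 bp separation rule and its deletion preference can be sketched as a single pass over the position-sorted map. This is an illustrative sketch only: the function name, the source labels, and the tuple layout are assumptions, and the greedy left-to-right resolution is one simple way to apply the stated preference order.

```python
# Deletion preference from the text: a lower rank is deleted first.
# physical map coverage SNP > Perlegen tag SNP > htSNP > HapMap tag SNP
DELETE_RANK = {"coverage": 0, "perlegen_tag": 1, "htSNP": 2, "hapmap_tag": 3}
MIN_GAP_BP = 60  # SNPs <= 60 bases apart cannot share a multiplex oligo pool


def enforce_spacing(snps):
    """snps: list of (name, position_bp, source) tuples (labels are hypothetical).

    Returns the retained SNPs sorted by position, every adjacent pair
    being more than MIN_GAP_BP apart.
    """
    kept = []
    for snp in sorted(snps, key=lambda s: s[1]):
        if kept and snp[1] - kept[-1][1] <= MIN_GAP_BP:
            # Conflict: delete whichever SNP comes first in the preference order.
            if DELETE_RANK[snp[2]] > DELETE_RANK[kept[-1][2]]:
                kept[-1] = snp  # previous SNP is deleted, current one retained
            # otherwise the current SNP is deleted (ties keep the earlier SNP)
        else:
            kept.append(snp)
    return kept
```

Replacing the last retained SNP cannot create a new conflict with the one before it, because the replacement lies further to the right.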
3,072 (2×1536) SNPs were genotyped in 1,029 AA DN cases and 1,027 AA healthy controls using Illumina's Custom Genotyping Service. Genotyping success rates were >98.3%. For quality control (QC) purposes, 48 samples were run in duplicate. Concordance among blind duplicates was >99.9%.
Seventy biallelic ancestral informative markers (AIMs) were genotyped in 1,029 AA case subjects, 1,027 AA control subjects, 44 Yoruba Nigerians, and 39 European Americans using Illumina's Custom Genotyping Service or the MassARRAY system from Sequenom, Inc., as the protocol specified.
Multipoint linkage analyses were carried out using NPL regression analyses based on the NPLpairs statistics output by a modified version of Genehunter (Langefeld et al. 1999; Langefeld et al. 2001; Davis et al. 2001). The NPL regression approach is a conditional logistic regression analysis in which the family-specific NPL statistic (e.g., NPLpairs) at one or more loci is the predictor variable. This methodology was used in the initial genome scan and is described in detail (Bowden et al. 2004).
Ordered subset analyses (OSA) (Hauser et al. 2004) were calculated to investigate the influence of a pedigree's mean age of diagnosis of diabetes, mean age of diagnosis of ESRD, and mean duration of diabetes before diagnosis of ESRD (similar to NPL regression analysis above). OSA ranks each family by the family-level value of a covariate of interest and identifies the contiguous subset of families that maximizes the evidence for linkage. In the OSA with the mean age at diabetes diagnosis, each pedigree was ranked from lowest to highest for age at diabetes diagnosis. The family with the lowest mean age at diabetes diagnosis entered into the analysis and the corresponding LOD score was computed on the target chromosome (e.g., chromosome 18) for that family. Next, a second linkage analysis on the target chromosome 18 was computed combining the two families with the two lowest mean ages at diabetes diagnosis values. The ith OSA analysis proceeds by computing a linkage analysis on the target chromosome using the subset of the i families with the lowest mean ages at diabetes diagnosis. This process is repeated until all families have been added to the linkage analysis. The largest LOD score obtained on the target chromosome across these subsets is taken as the LOD score of interest. The location that maximizes the LOD score on a chromosome will vary as the subset of families analyzed changes. The statistical significance of the change in the LOD score was evaluated by a permutation test under the null hypothesis that the ranking of the covariate is independent of the family's LOD score on the target chromosome. Thus, the families were randomly permuted with respect to the covariate ranking and an analysis proceeded as above for each permutation of these data. The resulting empirical distribution of the change in the LOD scores yielded a chromosome-specific P-value.
In this example, the family-level means were ranked in ascending order; however, we repeated the analysis ranking in descending order.
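The ordered subset procedure and its permutation test can be sketched as follows. This is a simplified illustration, not the OSA software: it assumes each family contributes a precomputed LOD curve on a shared grid of map positions and that these curves add across families. The function names, the add-one P-value estimate, and the default number of permutations are likewise assumptions.

```python
import random


def max_subset_lod(ranked_lods):
    """ranked_lods: per-family LOD curves (equal-length lists), in rank order.

    Adds families one at a time and returns
    (best LOD, number of families in that subset, position index).
    """
    running = [0.0] * len(ranked_lods[0])
    best = (float("-inf"), 0, 0)
    for i, curve in enumerate(ranked_lods, start=1):
        running = [r + c for r, c in zip(running, curve)]
        peak = max(running)
        if peak > best[0]:
            best = (peak, i, running.index(peak))
    return best


def osa(family_lods, covariate, n_perm=1000, seed=0):
    """Rank families by ascending covariate and test the change in max LOD."""
    order = sorted(range(len(covariate)), key=lambda k: covariate[k])
    ranked = [family_lods[k] for k in order]
    # Maximum LOD using all families (does not depend on the ordering).
    overall = max(sum(col) for col in zip(*family_lods))
    best_lod, n_fam, pos = max_subset_lod(ranked)
    observed_delta = best_lod - overall

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = ranked[:]
        rng.shuffle(shuffled)  # break the link between covariate rank and LOD
        if max_subset_lod(shuffled)[0] - overall >= observed_delta:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)  # add-one empirical estimate
    return best_lod, n_fam, pos, observed_delta, p_value
```

Ranking in descending order, as in the repeated analysis, amounts to passing the negated covariate values.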
Tests for genotypic association were performed on each SNP individually using SNPADDMIX, a component of the SNPGWA program (Harley et al. 2008) which includes the capability to perform association calculations adjusting for covariates. Genotypic association reported here is for analyses incorporating adjustment for African ancestry proportions. The primary inference is based on the 2 degree of freedom global test of genotypic association. If significant, then the individual genetic models (dominant, additive and recessive) were examined for context. This is consistent with Fisher's protected least significant difference (LSD) multiple comparisons procedure.
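To make the four tests concrete, the sketch below computes them from raw genotype counts. It is not SNPADDMIX and, unlike the analyses reported here, it does not adjust for ancestry or any other covariate. It assumes a Pearson chi-square test for the 2 df, dominant, and recessive comparisons and a Cochran-Armitage trend test for the additive model; all function names are hypothetical.

```python
import math


def chi2_p(stat, df):
    """Upper-tail chi-square P value; closed forms exist for df = 1 and 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch supports df = 1 or 2 only")


def pearson(table):
    """Pearson chi-square statistic for a 2 x k table (no empty columns)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum(
        (table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(len(cols))
    )


def genotype_tests(cases, controls):
    """cases, controls: genotype counts (AA, Aa, aa), 'a' being the minor allele."""
    def dominant(g):   # carriers of 'a' versus non-carriers
        return [g[0], g[1] + g[2]]

    def recessive(g):  # 'aa' homozygotes versus everyone else
        return [g[0] + g[1], g[2]]

    out = {
        "2df": chi2_p(pearson([list(cases), list(controls)]), 2),
        "dominant": chi2_p(pearson([dominant(cases), dominant(controls)]), 1),
        "recessive": chi2_p(pearson([recessive(cases), recessive(controls)]), 1),
    }
    # Additive model: Cochran-Armitage trend test with weights 0, 1, 2.
    weights = (0, 1, 2)
    totals = [a + b for a, b in zip(cases, controls)]
    n, r = sum(totals), sum(cases)
    s = n - r
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    t_stat = n * sum(w * c for w, c in zip(weights, cases)) - r * sum_wn
    spread = n * sum(w * w * t for w, t in zip(weights, totals)) - sum_wn ** 2
    out["additive"] = chi2_p(n * t_stat ** 2 / (r * s * spread), 1)
    return out
```

Under the protected procedure described above, the three model-specific P values would be inspected only when the `"2df"` entry is significant.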
Ancestral allele frequencies were estimated from the results of the AIMs genotyped in the Yoruba Nigerians and the European Americans. Individual ancestral proportions were generated for each subject using FRAPPE (Tang et al. 2005), an EM algorithm, under a two-population model. The influence of other possible covariates: age, BMI and gender, on evidence of association was tested using SNPADDMIX. Two SNP haplotype analysis was completed using the program Dandelion (www.phs.wfubmc.edu). Linkage disequilibrium was calculated as defined by Gabriel (Gabriel et al. 2002) with the program Haploview (Barrett et al. 2005).
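The idea behind estimating an individual's ancestry proportion by EM under a two-population model can be sketched as below. This is not FRAPPE's implementation. It assumes the ancestral allele frequencies are fixed and known (here they would come from the Yoruba and European American samples), that allele copies are independent across markers, and that every frequency lies strictly between 0 and 1. The function name and arguments are hypothetical.

```python
def ancestry_proportion(genotypes, freq_a, freq_b, n_iter=200, q=0.5):
    """Estimate the proportion of an individual's ancestry from population A.

    genotypes: per-marker count (0, 1, 2) of the reference allele, None if missing
    freq_a, freq_b: reference-allele frequencies in ancestral populations A and B
                    (assumed strictly between 0 and 1)
    """
    for _ in range(n_iter):
        total, copies = 0.0, 0
        for g, pa, pb in zip(genotypes, freq_a, freq_b):
            if g is None:
                continue  # skip missing genotypes
            # E-step: posterior that one allele copy came from A, given its state.
            post_ref = q * pa / (q * pa + (1 - q) * pb)
            post_alt = q * (1 - pa) / (q * (1 - pa) + (1 - q) * (1 - pb))
            total += g * post_ref + (2 - g) * post_alt
            copies += 2
        # M-step: new proportion is the mean posterior over all allele copies.
        q = total / copies
    return q
```

With markers whose frequencies differ strongly between the two ancestral populations, as AIMs are chosen to do, the estimate moves quickly toward the maximum-likelihood value.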
The clinical and phenotypic characteristics for the genotyped samples, both the family samples for linkage analysis and the case-control samples for the dense SNP map, are summarized in Table 1. The genotyped samples have a higher proportion of females, probably reflecting the increased prevalence of T2DM among African American women (Centers for Disease Control and Prevention 2008), survival, and participation bias. The cases in the family sample and the cases in the case-control sample have an overlap of 189 subjects, and thus are not entirely independent. The two sets of cases are broadly comparable across all characteristics. While the age of enrollment is higher in the cases compared to the controls, note that the age of diagnosis of diabetes in the cases is lower than the age of recruitment for the non-diabetic controls. As we have reported previously (Bowden et al. 2004), the age at diabetes diagnosis and age at ESRD onset are strongly correlated. There were 53 DN subjects with age of onset of diabetes less than 25 years of age. The prevalence of type 1 diabetes in African Americans is low, and the average BMI at enrollment into the study of this early diabetes onset group was 31.4 (range 21.7-44.0) suggesting that many of these individuals do have T2DM. We cannot, however, exclude the possibility that some subjects have type 1 diabetes. The great majority of the ESRD affected subjects were enrolled within 5 years of developing ESRD. Overall the DN affected individuals were overweight or obese on average at the time of their enrollment in the study (median BMI 30.6 for linkage set and 29.6 for association sample). Controls for the association study were also overweight. On average, the diabetes affected individuals in the linkage set have poorer glucose control than cases in the association data set.
Linkage mapping was performed on the LOD-1 region on chromosome 18q (18q21.1-23) that was previously identified (Bowden et al. 2004). This dataset encompassed 13 microsatellite markers (including D18S880) and 48 SNPs genotyped in the enlarged family population. The average spacing between microsatellite markers was 8.7cM. The average spacing between SNP markers was 0.97cM, and there were no interSNP gaps greater than 2.6cM. The average spacing across all markers in the LOD-1 region was 0.89cM. The marker list and the Rutgers genetic map locations (Matise et al. 2007) are detailed in Supplementary Table 1.
With the inclusion of additional subjects and additional markers from fine mapping to the original scan, a multipoint linkage analysis was completed on chromosome 18. The single locus, NPL regression analysis results are detailed in Table 2. When including all markers: microsatellites and SNPs (Micro+SNP; Combined), there was a reduction in the LOD score (LOD 0.36, 110.6 cM, near D18S1371; Table 2). However, when the analysis was completed using just the microsatellites alone (Micro Only), the LOD score remained at the same magnitude (LOD 1.08, 121.6 cM, near D18S1390; Table 2) that was shown previously (Bowden et al. 2004). These results are also shown in Figure 1 (Micro+SNP, black line; Micro Only, gray line).
Ordered subsets analysis (OSA), which assesses linkage under the premise that it can be uncovered more readily in subgroups of families within a population, distinguished by specific phenotypic traits, was also performed. This method was used to look for differential evidence of linkage based on age at diagnosis of ESRD and age at diagnosis of diabetes. The pedigree subsets that were identified from the OSA were also analyzed using NPL regression. These results are shown in Table 3 and depicted in Figure 1.
When subsetting by age at diagnosis of ESRD, both the microsatellite+SNP marker set and the microsatellite only marker set showed evidence of significant change in P value (ΔP) when looking at an optimal subset compared to the complete population (Table 3). The microsatellite+SNP marker set displayed an OSA maximum LOD score of 3.36 at 80.6cM using 34 pedigrees (15%) with an earlier age of diagnosis, compared to the entire sample LOD score of 0.04 (ΔP = 0.0148, NPL P value = 0.00008; Table 3) (Yellow line; Figure 1). In the microsatellite only marker set, the OSA maximum LOD score was 3.34 at 84.1cM in the optimum subset of 31 pedigrees (14%), whereas the maximum LOD score in the whole population was 0.52 (ΔP = 0.0222; NPL P value = 0.00009; Table 3) (Red line; Figure 1).
The OSA was also performed subsetting by age at diagnosis of diabetes. Using the microsatellite only marker set, there were two subsets, early onset and optimal slice, that showed a significant P value for change. The early onset subset exhibited an OSA maximum LOD score of 3.18 at 90.1cM (Table 3). This was obtained using 136 pedigrees (61%) with the earliest age at diagnosis of diabetes. The entire sample LOD was 1.23 (ΔP = 0.0168, NPL P value = 0.00013; Table 3) (Blue line; Figure 1). The optimal slice subset, which also had an earlier age of diabetes diagnosis compared to the remaining families, showed a max LOD score of 3.90 at 90.1cM using 111 pedigrees (50%) compared to a LOD score of 1.23 in the whole population (ΔP = 0.0161, NPL P value = 0.00002; Table 3) (Green line; Figure 1). There were no statistically significant differences (ΔP) observed between the entire sample LOD score and the OSA maximized LOD score in the microsatellite+SNP marker set. Instead, the same subsets of pedigrees identified from the microsatellite only OSA analysis were used for analysis in the microsatellite+SNP marker set. The early onset subset displayed a trimodal peak, with OSA maximized LOD scores of 1.20, 1.82, and 2.12 at 73.1cM, 96.1cM, and 113.6cM respectively (NPL P values = 0.01854, 0.00383, and 0.0018 respectively; Table 3) (Pink line; Figure 1). This was done using the same 136 pedigrees that were identified in the microsatellite only early onset subset. The optimal slice subset also showed a trimodal peak in the microsatellite+SNP marker set. The OSA maximized LOD scores were 1.36, 2.47, and 1.95 at 74.6cM, 90.6cM, and 113.6cM respectively (NPL P values = 0.0123, 0.00075, and 0.00275 respectively; Table 3) (Maroon line; Figure 1). This was performed using the 111 pedigrees that were in the optimal slice subset in the microsatellite only marker set.
A dense SNP map was performed on the LOD-1 region, 18q21.1-18q22.2. There were 2,814 bi-allelic variants (95.0%) successfully genotyped in this region in 1,029 AA T2DM-ESRD case subjects, and 1,027 AA non-diabetic, non-nephropathy controls. The average marker density was 6 kb with a range of 55 bp to 97 kb. Sixty-three SNPs (2.2%) were nominally inconsistent with Hardy Weinberg Equilibrium (HWE) proportions (P-value<0.01) in the combined population (Supplementary Table 2A). These SNPs were noted as inconsistent with HWE, but still included for analysis. Minor allele frequencies (MAF) are also shown in Supplementary Table 2A. Genotype frequencies and counts for each SNP are shown in Supplementary Table 2B. There were 55 SNPs (1.9%) with MAFs of less than 0.05 in both the case and control populations and 3 SNPs that were monomorphic in our AA case-control population (Supplementary Table 2A).
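A check of Hardy-Weinberg proportions from genotype counts can be sketched as follows. The source does not state which test was used, so the 1 df chi-square goodness-of-fit test below is an assumption (an exact test is a common alternative), and the function name is hypothetical.

```python
import math


def hwe_chi2_p(n_AA, n_Aa, n_aa):
    """P value of a 1 df chi-square goodness-of-fit test for HWE.

    Assumes a polymorphic SNP; a monomorphic SNP yields a zero expected
    count and is not handled by this sketch.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
```

A SNP would be flagged at the threshold used here when `hwe_chi2_p(...) < 0.01`.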
A plot of the admixture-adjusted genotypic results is shown in Figure 2. These results are further detailed in Supplementary Tables 3A (2DF and dominant genetic model) and 3B (additive and recessive genetic models). There were 342 SNPs (12.2%) that showed nominal evidence of association (P<0.05) with T2DM-ESRD under one or more tests of association. One-hundred and forty-six SNPs showed evidence of association under the 2 df test, 180 were associated under a dominant genetic model, 129 were associated under an additive genetic model, and 150 were associated under a recessive genetic model (Figure 2; Supplementary Tables 3A and 3B).
The 25 SNPs most associated under the 2 df test are shown in Table 4. SNPs were excluded from this list if they were out of HWE in the case, control, or combined population with a P-value<0.001, or if they were associated under the recessive genetic model and had a minor allele homozygous genotype frequency of less than 0.10. Ten of the top 25 SNPs fell within a gene; 13 SNPs were within ~300 kb or less of a gene or hypothetical protein; and 2 SNPs were in intergenic regions (no genes within 500 kb). The 2 df P-values ranged from 0.001 – 0.0114 (Table 4). When looking at genetic models, 13 of the SNPs were most associated under a dominant genetic model (P= 0.0003–0.0210; Table 4), 8 were most associated under an additive genetic model (P=0.0017–0.0052; Table 4), and 4 were most associated under a recessive genetic model (P= 0.0015–0.0050; Table 4).
In addition to looking at the most associated SNPs, SNPs were also prioritized by genes and/or regions that contained multiple associated SNPs. These results are shown in Table 5. The region contains 97 known genes. After prioritization, there were 23 genes that contained one or more SNPs nominally associated (P=0.0006-0.0441; Table 5). There were also 12 genes with nearby SNPs (within 500 kb) associated (P=0.0025-0.0390; Table 5).
We have previously shown evidence of linkage to diabetic nephropathy in African American families on chromosome 18q21.1-23 (Bowden et al. 2004). This region has also been identified in genome wide linkage scans for diabetic nephropathy in other populations: Turkish, Pima Indian, European American, and American Indian families (Vardarli et al. 2002; Iyengar et al. 2007). Subsequent studies have excluded the candidate gene ZNF236, located at 18q22-23, from playing a role in diabetic nephropathy (Halama et al. 2003), but have discovered that the carnosinase genes, CNDP1 and CNDP2, located at 18q22.3, influence diabetic nephropathy susceptibility (Janssen et al. 2005; Freedman et al. 2007; McDonough et al. 2009). D18S880, a trinucleotide repeat in exon 2 of the CNDP1 gene, which encodes a leucine repeat in the signal peptide of the preprotein, was associated with diabetic nephropathy in a small European population (Janssen et al. 2005) and a European-American population (Freedman et al. 2007). The association was seen in individuals who were homozygous for the shortest allelic form of the repeat (5L-5L); this genotype was significantly more common in the absence of diabetic nephropathy (Janssen et al. 2005; Freedman et al. 2007). However, there was no association seen with D18S880 in African Americans (McDonough et al. 2009). There were associations seen with other variants and haplotypes in the CNDP2 gene in African Americans. These results were associated with both risk and protection, and some were observed at low frequencies in the population (McDonough et al. 2009). This suggests that there are other variants at the 18q21.1-18q23 locus that are contributing to DN risk in African Americans.
In order to refine the results of our previously published linkage peak (Bowden et al. 2004), we increased our AA family population from 206 DN-affected sibling pairs to 270. We also increased the marker density in the region by adding an additional microsatellite marker, D18S880, and 48 SNPs. We examined the data using multiple methods of analysis. First, we used NPL analysis with two different marker sets (Micro Only and Micro+SNP). We saw a reduction in the LOD score using the combined marker set; however, the LOD score remained at the same magnitude we had previously published (Bowden et al. 2004) using just the microsatellite markers. In order to maximize our LOD scores, we performed OSA with age of diagnosis of ESRD and age of diagnosis of diabetes. This provided us with maximized LOD scores in pedigrees with earlier ages of onset of ESRD and earlier ages of onset of diabetes. Our significant maximized LOD scores ranged from 3.18 to 3.90 and from 80.6cM to 90.1cM.
We have looked at the data numerous ways in order to see if one model or one approach is better. Overall, we observe a wide range of results depending on the analysis method and the combination of markers used. This is consistent with the commonly accepted explanation of results from linkage mapping at this time: that there are multiple loci at the same region that are shared among and between families (Altshuler et al. 2008). These results are clearly complex; however, there is continued evidence for linkage in this region after increasing the number of families and the number of markers.
We further investigated this region in African Americans by performing a dense SNP map. The dense SNP map increased our coverage to an average marker density of 6kb. These results were evaluated in two ways. First, we ranked SNPs by order of magnitude of association. Ten of the top 25 SNPs were in a gene. Second, we looked at genes that contained multiple associated SNPs. Using this method, we compiled a list of 23 genes that contained one or more associated SNPs and 12 genes that had one or more associated SNPs nearby (within 500kb).
After comparing the results between the two different SNP prioritization methods, two candidate genes stood out: NEDD4L and SERPINB7. NEDD4L is a ubiquitin ligase located at 18q21. NEDD4L has been previously shown to influence the regulation of renal sodium reabsorption (Manunta et al. 2008; Sile et al. 2008; Dunn et al. 2002) and has been associated with essential hypertension in Caucasians and African Americans (Russo et al. 2005), and in Han Chinese (Wen et al. 2008). The Nedd4L protein is expressed in the distal nephron (Araki et al. 2008; Umemura et al. 2006). SERPINB7, also known as MEGSIN, is located at 18q21.33 (Scott et al. 1999), is predominantly expressed in human mesangial cells, and is upregulated in IgA nephropathy (Miyata et al. 1998). Two polymorphisms in the 3′UTR of SERPINB7, 2093C and 2180T, were associated with IgA nephropathy in a Chinese population (Li et al. 2004). These markers form the 2093C-2180T haplotype that was associated with a more severe form of IgA nephropathy in a Chinese population (Xia et al. 2006), and poor renal survival in Korean IgA nephropathy patients (Lim et al. 2008). Inagi et al. (2006) created a severe diabetic nephropathy mouse model by overexpressing serpinb7, RAGE, and iNOS. There is also evidence that serpinb7 is upregulated in diabetic nephropathy, in turn inhibiting plasmin and MMP activities, leading to mesangial matrix accumulation (Ohtomo et al. 2008).
We have looked at both of these genes in reference to diabetic nephropathy in our African American population. As we saw the most significant results in intron 1 of NEDD4L in the dense SNP map, we examined this gene from exon 1 to exon 3. We successfully genotyped 91 SNPs in this region; however, we only saw nominal association (P values = 0.025-0.0459) with five SNPs (data not shown). In addition, we successfully genotyped 26 SNPs across the SERPINB7 gene. We saw association with 6 SNPs throughout the gene (data not shown); however, these results require further validation. While we did not examine the entire genic region of NEDD4L, these results suggest that NEDD4L does not play a major role in diabetic nephropathy susceptibility in African Americans. Mutations in SERPINB7 may be involved in the mesangial matrix accumulation observed in diabetic nephropathy; however, this gene must be further investigated before any conclusion can be drawn on its relationship to diabetic nephropathy in African Americans.
Our study is not without limitations. One advantage of genome-wide linkage scans is the ability to identify regions where multiple rare variants that lead to disease are shared among and/or between families. These variants may not have been identified in our population-based case-control dense SNP map follow-up study. In addition, due to our recruitment design, we are unable to distinguish whether the linkage and associations we observed were with diabetes, nephropathy, or both.
Overall, we have performed a comprehensive evaluation of the 18q21-18q23 genomic region in African Americans. The results of the fine mapping performed in our African American T2DM-DN family population demonstrated continued evidence for linkage with the inclusion of additional subjects and additional markers. The dense SNP map produced several candidate genes, including NEDD4L and SERPINB7, which warrant further investigation. Taken together, these results suggest that there may be multiple loci in this region that affect diabetic nephropathy susceptibility in African Americans.
We wish to thank the patients, their relatives, and staff of the Southeastern Kidney Council, Inc./ESRD Network 6 for their participation. This work was supported by NIH grants R01 DK066358 (DWB), R01 DK053591 (DWB), R01 HL56266 (BIF), R01 DK070941 (BIF) and in part by the General Clinical Research Center of the Wake Forest University School of Medicine grant M01 RR007122. Genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University; contract number N01-HG-65403.