1.  Detailed Investigation of the Role of Common and Low-Frequency WFS1 Variants in Type 2 Diabetes Risk 
Diabetes  2009;59(3):741-746.
Wolfram syndrome 1 (WFS1) single nucleotide polymorphisms (SNPs) are associated with risk of type 2 diabetes. In this study we aimed to refine this association and investigate the role of low-frequency WFS1 variants in type 2 diabetes risk.
For fine-mapping, we sequenced WFS1 exons, splice junctions, and conserved noncoding sequences in samples from 24 type 2 diabetic case and 68 control subjects, selected tagging SNPs, and genotyped these in 959 U.K. type 2 diabetic case and 1,386 control subjects. The same genomic regions were sequenced in samples from 1,235 type 2 diabetic case and 1,668 control subjects to compare the frequency of rarer variants between case and control subjects.
Of 31 tagging SNPs, the strongest associated was the previously untested 3′ untranslated region rs1046320 (P = 0.008); odds ratio 0.84 and P = 6.59 × 10−7 on further replication in 3,753 case and 4,198 control subjects. High correlation between rs1046320 and the original strongest SNP (rs10010131) (r2 = 0.92) meant that we could not differentiate between their effects in our samples. There was no difference in the cumulative frequency of 82 rare (minor allele frequency [MAF] <0.01) nonsynonymous variants between type 2 diabetic case and control subjects (P = 0.79). Two intermediate frequency (MAF 0.01–0.05) nonsynonymous changes also showed no statistical association with type 2 diabetes.
We identified six highly correlated SNPs that show strong and comparable associations with risk of type 2 diabetes, but further refinement of these associations will require large sample sizes (>100,000) or studies in ethnically diverse populations. Low frequency variants in WFS1 are unlikely to have a large impact on type 2 diabetes risk in white U.K. populations, highlighting the complexities of undertaking association studies with low-frequency variants identified by resequencing.
PMCID: PMC2828659  PMID: 20028947
2.  Candidate Gene Association Study of Esophageal Squamous Cell Carcinoma in a High-Risk Region in Iran 
Cancer research  2009;69(20):7994-8000.
A region with a high risk for esophageal squamous cell carcinoma (ESCC) in the northeast of Iran was identified more than three decades ago. Previous studies suggest that hereditary factors play a role in the high incidence of cancer in the region. Polymorphisms of several genes have been associated with susceptibility to esophageal cancer in various populations, but these have not been studied in Iran. We selected 22 functional variants (and 130 related tagSNPs) from 15 genes which previously have been suggested to be associated with an increased risk of ESCC. We genotyped a primary set of samples from 451 Turkmen (197 cases and 254 controls). Seven of 152 variants were associated with ESCC at the P = 0.05 level; these SNPs were then studied in a validation set of 1668 cases and controls (Turkmen and non-Turkmen) under dominant and recessive models. In the joint sample set, five variants, from five different genes, showed significant associations with ESCC at the P = 0.05 level. For one variant, in ADH1B, the association was strong and was present in both Turkmen and non-Turkmen. The histidine allele at codon 48 of the ADH1B gene was associated with a significantly decreased risk of ESCC under a recessive model (OR = 0.41, 95% CI = 0.19 to 0.49; P = 4×10−4). For four additional variants, an association was present in the Turkmen subgroup, but the statistical significance of these was less compelling than for ADH1B. Two variants showed deleterious effects and two were protective. The G allele of the c.870A>G variant of the CCND1 gene was associated with a 1.5-fold increased risk of ESCC under the recessive model (OR = 1.50, 95% CI = 1.14 to 2.16, P = 0.02) and the A allele of the rs1625895 variant of the TP53 gene was associated with a 1.5-fold increased risk of ESCC under a dominant model (OR = 1.54, 95% CI = 1.21 to 4.07, P = 0.005). The C allele of the rs886205 variant of ALDH2 was associated with a decreased risk of ESCC under a recessive model (OR = 0.58, 95% CI = 0.34 to 0.87, P = 0.02) and the A allele of the rs7087131 variant of MGMT was associated with a decreased risk of ESCC under the recessive model (OR = 0.26, 95% CI = 0.05 to 0.49, P = 0.01). These results confirm that genetic predisposition to ESCC plays a role in the high incidence of this cancer among Turkmens who live in the northeast of Iran.
PMCID: PMC3505030  PMID: 19826048
Esophageal squamous cell carcinoma; Turkmen population; ADH1B; ALDH2; MGMT; TP53; CCND1
3.  Sequencing PDX1 (insulin promoter factor 1) in 1788 UK individuals found 5% had a low frequency coding variant, but these variants are not associated with Type 2 diabetes 
Diabetic Medicine  2011;28(6):681-684.
Genome-wide association studies have identified > 30 common variants associated with Type 2 diabetes (> 5% minor allele frequency). These variants have small effects on individual risk and do not account for a large proportion of the heritable component of the disease. Monogenic forms of diabetes are caused by mutations that occur in < 1:2000 individuals and follow strict patterns of inheritance. In contrast, the role of low frequency genetic variants (minor allele frequency 0.1–5%) in Type 2 diabetes is not known. The aim of this study was to assess the role of low frequency PDX1 (also called IPF1) variants in Type 2 diabetes.
We sequenced the coding and flanking intronic regions of PDX1 in 910 patients with Type 2 diabetes and 878 control subjects.
We identified a total of 26 variants that occurred in 5.3% of individuals, 14 of which occurred once. Only D76N occurred in > 1%. We found no difference in carrier frequency between patients (5.7%) and control subjects (5.0%) (P = 0.46). There were also no differences between patients and control subjects when analyses were limited to subsets of variants. The strongest subset were those variants in the DNA binding domain where all five variants identified were only found in patients (P = 0.06).
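The carrier-frequency comparison above can be illustrated with a simple 2×2 test. The following is a minimal sketch, not the authors' analysis: the abstract does not say which test was used, and the carrier counts (52 of 910 patients, 44 of 878 control subjects) are back-calculated here from the rounded percentages, so they are assumptions. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for a 2x2 table.

    a, b: carriers / non-carriers among cases
    c, d: carriers / non-carriers among controls
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed counts, reconstructed from 5.7% of 910 and 5.0% of 878.
chi2, p = chi_square_2x2(52, 910 - 52, 44, 878 - 44)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

With these reconstructed counts the sketch gives a P value near 0.5, in line with the reported lack of difference; it is not expected to reproduce P = 0.46 exactly because the inputs are rounded.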
Approximately 5% of UK individuals carry a PDX1 variant, but there is no evidence that these variants, either individually or cumulatively, predispose to Type 2 diabetes. Further studies will need to consider strategies to assess the role of multiple variants that occur in < 1 in 1000 individuals.
PMCID: PMC3586655  PMID: 21569088
diabetes; genetics; polygenic; variants
4.  Associations of common polymorphisms in GCKR with type 2 diabetes and related traits in a Han Chinese population: a case-control study 
BMC Medical Genetics  2011;12:66.
Several studies have shown that variants in the glucokinase regulatory protein gene (GCKR) were associated with type 2 diabetes and dyslipidemia. The purpose of this study was to examine whether tag single nucleotide polymorphisms (SNPs) in the GCKR region were associated with type 2 diabetes and related traits in a Han Chinese population and to identify the potential mechanisms underlying these associations.
We investigated the association of polymorphisms in the GCKR gene with type 2 diabetes by employing a case-control study design (1118 cases and 1161 controls). Four tag SNPs (rs8179206, rs2293572, rs3817588 and rs780094) with pairwise r2 > 0.8 and minor allele frequency > 0.05 across the GCKR gene and its flanking regions were studied and haplotypes were constructed. Genotyping was performed by matrix-assisted laser desorption/ionization time-of-flight mass spectroscopy using a MassARRAY platform.
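Tag SNP selection relies on pairwise linkage disequilibrium measured as r2. As a minimal sketch of that quantity (the haplotype and allele frequencies below are invented for illustration and are not taken from the study, and `ld_r_squared` is a hypothetical helper name), r2 can be computed from the frequency of one two-SNP haplotype and the two allele frequencies:

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies: perfectly correlated SNPs versus partially correlated SNPs.
perfect = ld_r_squared(0.5, 0.5, 0.5)
partial = ld_r_squared(0.4, 0.5, 0.5)
print(perfect >= 0.8, partial >= 0.8)
```

Under the r2 > 0.8 threshold quoted above, the first hypothetical pair would be treated as mutually tagging and the second would not.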
The G alleles of GCKR rs3817588 and rs780094 were associated with an increased risk of type 2 diabetes after adjustment for year of birth, sex and BMI (OR = 1.24, 95% CI 1.08-1.43, p = 0.002 and OR = 1.22, 95% CI 1.07-1.38, p = 0.002, respectively). In the non-diabetic controls, the GG carriers of rs3817588 and rs780094 were nominally associated with a lower plasma triglyceride level compared to the AA carriers after adjustment for year of birth, sex and BMI (p for trend = 0.00004 and 0.03, respectively). Furthermore, the association of rs3817588 with plasma triglyceride level was still significant after correcting for multiple testing.
The rs3817588 A/G polymorphism of the GCKR gene was associated with type 2 diabetes and plasma triglyceride level in the Han Chinese population.
PMCID: PMC3112072  PMID: 21569451
5.  Genetic association analysis of LARS2 with type 2 diabetes 
Diabetologia  2009;53(1):103-110.
LARS2 has been previously identified as a potential type 2 diabetes susceptibility gene through the low-frequency H324Q (rs71645922) variant (minor allele frequency [MAF] 3.0%). However, this association did not achieve genome-wide levels of significance. The aim of this study was to establish the true contribution of this variant and common variants in LARS2 (MAF > 5%) to type 2 diabetes risk.
We combined genome-wide association data (n = 10,128) from the DIAGRAM consortium with independent data derived from a tagging single nucleotide polymorphism (SNP) approach in Dutch individuals (n = 999) and took forward two SNPs of interest to replication in up to 11,163 Dutch participants (rs17637703 and rs952621). In addition, because inspection of genome-wide association study data identified a cluster of low-frequency variants with evidence of type 2 diabetes association, we attempted replication of rs9825041 (a proxy for this group) and the previously identified H324Q variant in up to 35,715 participants of European descent.
No association between the common SNPs in LARS2 and type 2 diabetes was found. Our replication studies for the two low-frequency variants, rs9825041 and H324Q, failed to confirm an association with type 2 diabetes in Dutch, Scandinavian and UK samples (OR 1.03 [95% CI 0.95–1.12], p = 0.45, n = 31,962 and OR 0.99 [0.90–1.08], p = 0.78, n = 35,715 respectively).
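The reported odds ratios and confidence intervals can be checked roughly against the reported p values. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state, and works from the rounded published figures; `p_from_or_ci` is a hypothetical helper name.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p value from an odds ratio and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded values quoted in the abstract.
print(round(p_from_or_ci(1.03, 0.95, 1.12), 2))  # rs9825041
print(round(p_from_or_ci(0.99, 0.90, 1.08), 2))  # H324Q
```

Because the inputs are rounded to two decimals, the results should only be expected to land near the published p values (0.45 and 0.78), not to match them exactly.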
In this study, the largest study examining the role of sequence variants in LARS2 in type 2 diabetes susceptibility, we found no evidence to support previous data indicating a role in type 2 diabetes susceptibility.
Electronic supplementary material
The online version of this article (doi:10.1007/s00125-009-1557-7) contains supplementary material, which is available to authorised users.
PMCID: PMC2789927  PMID: 19847392
Genetics; LARS2; Mitochondria; SNP; Type 2 diabetes
6.  A common polymorphism rs3781637 in MTNR1B is associated with type 2 diabetes and lipids levels in Han Chinese individuals 
Several studies have shown that common variants in the MTNR1B gene were associated with fasting glucose level and type 2 diabetes. The purpose of this study was to examine whether tagging single nucleotide polymorphisms (SNPs) in the MTNR1B region were associated with type 2 diabetes and related traits in a Han Chinese population.
We investigated the association of polymorphisms in the MTNR1B gene with type 2 diabetes by employing a case-control study design (1118 cases and 1161 controls). Three tagging SNPs (rs10830963, rs3781637, and rs1562444) with R2>0.8 and minor allele frequency>0.05 across the region of the MTNR1B gene were studied. Genotyping was performed by matrix-assisted laser desorption/ionization time-of-flight mass spectroscopy using a MassARRAY platform.
The polymorphism rs3781637 was associated with type 2 diabetes adjusted for age, sex and body mass index (BMI) in the additive model and recessive model (OR = 1.22, 95% CI 1.01-1.46, p = 0.038 and OR = 2.81, 95% CI 1.28-6.17, p = 0.01, respectively). In the non-diabetic controls, rs3781637 was nominally associated with plasma triglyceride, total cholesterol and low density lipoprotein cholesterol (LDL-C) levels in the recessive model (p = 0.018, 0.008 and 0.038, respectively). After adjustment for multiple comparisons, the associations of rs3781637 with total cholesterol and LDL-C remained significant in the recessive model (the empirical p = 0.024 and 0.045, respectively), but the association between rs3781637 and triglyceride became non-significant (the empirical p = 0.095). The associations of rs10830963 and rs1562444 with type 2 diabetes and related traits were not significant in the additive, dominant and recessive models.
The rs3781637 A/G polymorphism of the MTNR1B gene is associated with type 2 diabetes and with plasma total cholesterol and LDL-C levels in the Han Chinese population.
PMCID: PMC3079619  PMID: 21470412
7.  Genetic Variability in CLU and Its Association with Alzheimer's Disease 
PLoS ONE  2010;5(3):e9510.
Recently, two large genome wide association studies in Alzheimer disease (AD) have identified variants in three different genes (CLU, PICALM and CR1) as being associated with the risk of developing AD. The strongest association was reported for an intronic single nucleotide polymorphism (SNP) in CLU.
Methodology/Principal Findings
To further characterize this association we have sequenced the coding region of this gene in a total of 495 AD cases and 330 healthy controls. A total of twenty-four variants were found in both cases and controls. For the changes found in more than one individual, the genotypic frequencies were compared between cases and controls. Coding variants were found in both groups (including a nonsense mutation in a healthy subject), indicating that the pathogenicity of variants found in this gene must be carefully evaluated. We found no common coding variant associated with disease. In order to determine if common variants at the CLU locus affect expression of nearby (cis) mRNA transcripts, an expression quantitative trait loci (eQTL) analysis was performed. No significant eQTL associations were observed for the SNPs previously associated with AD.
We conclude that common coding variability at this locus does not explain the association, and that there is no large effect of common genetic variability on expression in brain tissue. We surmise that the most likely mechanism underpinning the association is either small effects of genetic variability on resting gene expression, or effects on damage induced expression of the protein.
PMCID: PMC2831070  PMID: 20209083
8.  How Many Genetic Variants Remain to Be Discovered? 
PLoS ONE  2009;4(12):e7969.
A great majority of genetic markers discovered in recent genome-wide association studies have small effect sizes, and they explain only a small fraction of the genetic contribution to the diseases. How many more variants can we expect to discover and what study sizes are needed? We derive the connection between the cumulative risk of the SNP variants to the latent genetic risk model and heritability of the disease. We determine the sample size required for case-control studies in order to achieve a certain expected number of discoveries in a collection of most significant SNPs. Assuming similar allele frequencies and effect sizes of the currently validated SNPs, complex phenotypes such as type-2 diabetes would need approximately 800 variants to explain its 40% heritability. Much smaller numbers of variants are needed if we assume rare-variants but higher penetrance models. We estimate that up to 50,000 cases and an equal number of controls are needed to discover 800 common low-penetrant variants among the top 5000 SNPs. Under common and rare low-penetrance models, the very large studies required to discover the numerous variants are probably at the limit of practical feasibility. Under rare-variant with medium- to high-penetrance models (odds-ratios between 1.6 and 4.0), studies comparable in size to many existing studies are adequate provided the genotyping technology can interrogate more and rarer variants.
PMCID: PMC2780697  PMID: 19956539
9.  Single-nucleotide polymorphisms in the RB1 gene and association with breast cancer in the British population 
British Journal of Cancer  2006;94(12):1921-1926.
A substantial proportion of the familial risk of breast cancer may be attributable to genetic variants each contributing a small effect. pRb controls the cell cycle, and polymorphisms within the RB1 gene that encodes it are candidates for such low penetrance susceptibility alleles, since the gene has been implicated in several human tumours, particularly breast cancer. The purpose of this study was to determine whether common variants in the RB1 gene are associated with breast cancer risk. We assessed 15 tagging single-nucleotide polymorphisms (SNPs) using a case–control study design (n⩽4474 cases and n⩽4560 controls). A difference in genotype frequencies was found between cases and controls for rs2854344 in intron 17 (P-trend=0.007) and rs198580 in intron 19 (P-trend=0.018). Carrying the minor allele of these SNPs appears to confer a protective effect on breast cancer risk (odds ratio (OR)=0.86 (0.76–0.96) for rs2854344 and OR=0.80 (0.66–0.96) for rs198580). However, after adjusting for multiple testing these associations were borderline with an adjusted P-trend=0.068 for the most significant SNP (rs2854344). The RB1 gene is not known to contain any coding SNPs with allele frequencies ⩾5% but several intronic variants are in perfect linkage disequilibrium with the associated SNPs. Replication studies are needed to confirm the associations with breast cancer.
PMCID: PMC2361346  PMID: 16685266
RB1; single-nucleotide polymorphisms; breast cancer
10.  Genetic polymorphisms of nerve growth factor receptor (NGFR) and the risk of Alzheimer's disease 
Loss of basal forebrain cholinergic neurons is attributable to the proapoptotic signaling induced by nerve growth factor receptor (NGFR) and may link to Alzheimer's disease (AD) risk. Only one study has investigated the association between NGFR polymorphisms and the risk of AD in an Italian population. Type 2 diabetes mellitus (DM) may modify this association based on previous animal and epidemiologic studies.
This was a case-control study in a Chinese population. A total of 264 AD patients were recruited from three teaching hospitals between 2007 and 2010; 389 controls were recruited from elderly health checkup participants and hospital volunteers during the same period. Five common (frequency≥5%) haplotype-tagging single nucleotide polymorphisms (htSNPs) were selected from NGFR to test the association between NGFR htSNPs and the risk of AD.
Variant NGFR rs734194 was significantly associated with a decreased risk of AD [GG vs. TT copies: adjusted odds ratio (OR) = 0.43, 95% confidence interval (CI) = 0.20-0.95]. Seven common haplotypes were identified. Minor haplotype GCGCG was significantly associated with a decreased risk of AD (2 vs. 0 copies: adjusted OR = 0.39, 95% CI = 0.17-0.91). Type 2 DM significantly modified the associations of rs2072446, rs741072, and haplotypes GCTTG and GTTCG with the risk of AD among ApoE ε4 non-carriers (Pinteraction < 0.05).
Inherited polymorphisms of NGFR were associated with the risk of AD; results were not significant after correction for multiple tests. This association was further modified by the status of type 2 DM.
PMCID: PMC3362783  PMID: 22236693
NGFR; Alzheimer's disease; htSNP; haplotype
11.  Association Analysis of Variation in/Near FTO, CDKAL1, SLC30A8, HHEX, EXT2, IGF2BP2, LOC387761, and CDKN2B With Type 2 Diabetes and Related Quantitative Traits in Pima Indians 
Diabetes  2009;58(2):478-488.
OBJECTIVE—In recent genome-wide association studies, variants in CDKAL1, SLC30A8, HHEX, EXT2, IGF2BP2, CDKN2B, LOC387761, and FTO were associated with risk for type 2 diabetes in Caucasians. We investigated the association of these single nucleotide polymorphisms (SNPs) and some additional tag SNPs with type 2 diabetes and related quantitative traits in Pima Indians.
RESEARCH DESIGN AND METHODS—Forty-seven SNPs were genotyped in 3,501 Pima Indians informative for type 2 diabetes and BMI, among whom 370 had measures of quantitative traits.
RESULTS—FTO provided the strongest evidence for replication, where SNPs were associated with type 2 diabetes (odds ratio = 1.20 per copy of the risk allele, P = 0.03) and BMI (P = 0.002). None of the other previously reported SNPs were associated with type 2 diabetes; however, associations were found between CDKAL1 and HHEX variants and acute insulin response (AIR), where the Caucasian risk alleles for type 2 diabetes were associated with reduced insulin secretion in normoglycemic Pima Indians. Multiallelic analyses of carrying risk alleles for multiple genes showed correlations between number of risk alleles and type 2 diabetes and impaired insulin secretion in normoglycemic subjects (P = 0.006 and 0.0001 for type 2 diabetes and AIR, respectively), supporting the hypothesis that many of these genes influence diabetes risk by affecting insulin secretion.
CONCLUSIONS—Variation in FTO impacts BMI, but the implicated common variants in the other genes did not confer a significant risk for type 2 diabetes in Pima Indians. However, confidence intervals for their estimated effects were consistent with the small effects reported in Caucasians, and the multiallelic “genetic risk profile” identified in Caucasians is associated with diminished early insulin secretion in Pima Indians.
PMCID: PMC2628623  PMID: 19008344
12.  Tagging Single Nucleotide Polymorphisms in the BRIP1 Gene and Susceptibility to Breast and Ovarian Cancer 
PLoS ONE  2007;2(3):e268.
BRIP1 interacts with BRCA1 and functions in regulating DNA double strand break repair pathways. Germline BRIP1 mutations are associated with breast cancer and Fanconi anemia. Thus, common variants in the BRIP1 are candidates for breast and ovarian cancer susceptibility.
We used a SNP tagging approach to evaluate the association between common variants (minor allele frequency≥0.05) in BRIP1 and the risks of breast cancer and invasive ovarian cancer. 12 tagging SNPs (tSNPs) in the gene were identified and genotyped in up to 2,270 breast cancer cases and 2,280 controls from the UK and up to 1,513 invasive ovarian cancer cases and 2,515 controls from the UK, Denmark and USA. Genotype frequencies in cases and controls were compared using logistic regression.
Two tSNPs showed a marginally significant association with ovarian cancer: Carriers of the minor allele of rs2191249 were at reduced risk compared with the common homozygotes (Odds Ratio (OR) = 0.90 (95% CI, 0.82–1.0), P-trend = 0.045) and the minor allele of rs4988344 was associated with increased risk (OR = 1.15 (95%CI, 1.02–1.30), P-trend = 0.02). When the analyses were restricted to serous ovarian cancers, these effects became slightly stronger. These results were not significant at the 5% level after adjusting for multiple testing. None of the tSNPs was associated with breast cancer.
It is unlikely that common variants in BRIP1 contribute significantly to breast cancer susceptibility. The possible association of rs2191249 and rs4988344 with ovarian cancer risk warrants confirmation in independent case-control studies.
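To see why nominal P-trend values of 0.045 and 0.02 do not survive adjustment across 12 tSNPs, a Bonferroni correction gives a quick, conservative check. This is only a sketch: the abstract does not say which adjustment method the authors used, and Bonferroni is chosen here purely for illustration. The function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values, n_tests, alpha=0.05):
    """Return (adjusted_p, significant) pairs under a Bonferroni correction.

    Each p value is multiplied by the number of tests (capped at 1.0) and
    compared with the family-wise error rate alpha.
    """
    results = []
    for p in p_values:
        adjusted = min(1.0, p * n_tests)
        results.append((adjusted, adjusted < alpha))
    return results


# Nominal P-trend values quoted above, with 12 tagging SNPs tested.
for snp, (adj, sig) in zip(["rs2191249", "rs4988344"],
                           bonferroni([0.045, 0.02], n_tests=12)):
    print(snp, round(adj, 3), sig)
```

Equivalently, the per-test threshold is 0.05 / 12, roughly 0.004, which neither nominal value reaches.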
PMCID: PMC1800910  PMID: 17342202
13.  Low-frequency intermediate penetrance variants in the ROCK1 gene predispose to Tetralogy of Fallot 
BMC Genetics  2013;14:57.
Epidemiological studies indicate a substantial excess familial recurrence of non-syndromic Tetralogy of Fallot (TOF), implicating genetic factors that remain largely unknown. The Rho induced kinase 1 gene (ROCK1) is a key component of the planar cell polarity signalling pathway, which plays an important role in normal cardiac development. The aim of this study was to investigate the role of genetic variation in ROCK1 on the risk of TOF.
ROCK1 was sequenced in a discovery cohort of 93 non-syndromic TOF probands to identify rare variants. TagSNPs were selected to capture commoner variation in ROCK1. Novel variants and TagSNPs were genotyped in a discovery cohort of 458 TOF cases and 1331 healthy controls, and positive findings were replicated in a further 209 TOF cases and 1290 healthy controls. Association between genotypes and TOF was assessed using LAMP.
A rare SNP (c.807C > T; rs56085230) discovered by sequencing was associated with TOF risk (p = 0.006) in the discovery cohort. The variant was also significantly associated with the risk of TOF in the replication cohort (p = 0.018). In the combined cohorts the odds ratio for TOF was 2.61 (95% CI 1.58-4.30); p < 0.0001. The minor allele frequency of rs56085230 in the cases was 0.02, and in the controls it was 0.007. The variant accounted for 1% of the population attributable risk (PAR) of TOF. We also found significant association with TOF for an uncommon TagSNP in ROCK1, rs288979 (OR 1.64 [95% CI 1.15-2.30]; p = 1.5 × 10−5). The minor allele frequency of rs288979 in the controls was 0.043, and the variant accounted for 11% of the PAR of TOF. These association signals were independent of each other, providing additional internal validation of our result.
Low frequency intermediate penetrance (LFIP) variants in the ROCK1 gene predispose to the risk of TOF.
PMCID: PMC3734041  PMID: 23782575
Congenital heart disease; Tetralogy of fallot; Genetics; Planar cell polarity pathway
14.  Fine mapping the TAGAP risk locus in rheumatoid arthritis 
Genes and Immunity  2011;12(4):314-318.
A common allele at the TAGAP gene locus demonstrates a suggestive, but not conclusive association with risk of rheumatoid arthritis (RA). To fine map the locus, we conducted comprehensive imputation of CEU HapMap single-nucleotide polymorphisms (SNPs) in a genome-wide association study (GWAS) of 5500 RA cases and 22 621 controls (all of European ancestry). After controlling for population stratification with principal components analysis, the strongest signal of association was to an imputed SNP, rs212389 (P=3.9 × 10−8, odds ratio=0.87). This SNP remained highly significant upon conditioning on the previous RA risk variant (rs394581, P=2.2 × 10−5) or on a SNP previously associated with celiac disease and type I diabetes (rs1738074, P=1.7 × 10−4). Our study has refined the TAGAP signal of association to a single haplotype in RA, and in doing so provides conclusive statistical evidence that the TAGAP locus is associated with RA risk. Our study also underscores the utility of comprehensive imputation in large GWAS data sets to fine map disease risk alleles.
PMCID: PMC3114196  PMID: 21390051
TAGAP; genetics; rheumatoid arthritis
15.  Genetic Risk Assessment of Type 2 Diabetes–Associated Polymorphisms in African Americans 
Diabetes Care  2012;35(2):287-292.
Multiple single nucleotide polymorphisms (SNPs) associated with type 2 diabetes (T2D) susceptibility have been identified in predominantly European-derived populations. These SNPs have not been extensively investigated for individual and cumulative effects on T2D risk in African Americans.
Seventeen index T2D risk variants were genotyped in 2,652 African American case subjects with T2D and 1,393 nondiabetic control subjects. Individual SNPs and cumulative risk allele loads were assessed for association with risk for T2D. Cumulative risk was assessed by counting risk alleles and evaluating the difference in cumulative risk scores between case subjects and control subjects. A second analysis weighted risk scores (ln [OR]) based on previously reported European-derived effect sizes.
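The two scoring schemes described above can be sketched as follows. The genotypes and odds ratios in the example are invented for illustration and are not the study's data; the function names are hypothetical. The unweighted score counts risk alleles, and the weighted score sums ln(OR) per risk allele carried.

```python
import math


def unweighted_risk_score(risk_allele_counts):
    """Sum of risk alleles carried (0, 1, or 2 per SNP)."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must contribute 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def weighted_risk_score(risk_allele_counts, odds_ratios):
    """Sum of ln(OR) weights, one weight per risk allele carried."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(count * math.log(odds)
               for count, odds in zip(risk_allele_counts, odds_ratios))


# Hypothetical individual typed at three SNPs, with hypothetical per-allele ORs.
genotype = [2, 1, 0]
per_allele_or = [1.2, 1.1, 1.3]
print(unweighted_risk_score(genotype))
print(round(weighted_risk_score(genotype, per_allele_or), 3))
```

Either score can then be entered as a predictor in a case-control regression, which is how a per-allele OR for risk allele load, such as those reported in the results, would typically be obtained.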
Frequencies of risk alleles ranged from 8.6 to 99.9%. Eleven SNPs had ORs >1, and 5 from ADAMTS9, WFS1, CDKAL1, JAZF1, and TCF7L2 trended or had nominally significant evidence of T2D association (P < 0.05). Individuals carried between 13 and 29 risk alleles. Association was observed between T2D and increase in risk allele load (unweighted OR 1.04 [95% CI 1.01–1.08], P = 0.010; weighted 1.06 [1.03–1.10], P = 8.10 × 10−5). When TCF7L2 SNP rs7903146 was included as a covariate, the risk score was no longer associated with T2D in either model (unweighted 1.02 [0.98–1.05], P = 0.33; weighted 1.02 [0.98–1.06], P = 0.40).
The trend of increase in risk for T2D with increasing risk allele load is similar to observations in European-derived populations; however, these analyses indicate that T2D genetic risk is primarily mediated through the effect of TCF7L2 in African Americans.
PMCID: PMC3263882  PMID: 22275441
16.  Mapping of the IRF8 gene identifies a 3’ UTR variant associated with risk of chronic lymphocytic leukemia but not other common non-Hodgkin lymphoma subtypes 
Our genome-wide association study (GWAS) of chronic lymphocytic leukemia (CLL) identified 4 highly-correlated intronic variants within the IRF8 gene that were associated with CLL. These results were further supported by a recent meta-analysis of our GWAS with two other GWAS of CLL, supporting the IRF8 gene as a strong candidate for CLL risk.
To refine the genetic association of CLL risk, we performed Sanger sequencing of IRF8 in 94 CLL cases and 96 controls. We then performed fine-mapping by genotyping 39 variants (of which 10 were identified from sequencing) in 745 CLL cases and 1521 controls. We also assessed these associations with risk of other non-Hodgkin lymphoma (NHL) subtypes.
The strongest association with CLL risk was observed with a common SNP located within the 3’ UTR of IRF8 (rs1044873, log additive odds ratio = 0.7, P=1.81×10−6). This SNP was not associated with the other NHL subtypes (all P>0.05).
We provide evidence that rs1044873 in the IRF8 gene accounts for the initial GWAS signal for CLL risk. This association appears to be unique to CLL with little support for association with other common NHL subtypes. Future work is needed to assess the functional role of IRF8 in CLL etiology.
These data provide support that a functional variant within the 3’ UTR of IRF8 may be driving the GWAS signal seen on 16q24.1 for CLL risk.
PMCID: PMC3596428  PMID: 23307532
CLL; NHL; SNPs; IRF8; risk locus
17.  Genetic variants in the TIRAP gene are associated with increased risk of sepsis-associated acute lung injury 
BMC Medical Genetics  2010;11:168.
Toll-like receptor (TLR) signaling pathways, including the adaptor protein Mal encoded by the TIRAP gene, play a central role in the development of acute lung injury (ALI). Recently, TIRAP variants have been reported to be associated with susceptibility to inflammatory diseases. The aim of this study was to investigate whether genetic variants in TIRAP are associated with the development of ALI.
A case-control collection of Han Chinese subjects was studied, comprising 298 healthy subjects, 278 patients with sepsis-associated ALI and 288 patients with sepsis alone. Three tag single nucleotide polymorphisms (SNPs) of the TIRAP gene and two additional SNPs that have previously shown association with susceptibility to other inflammatory diseases were genotyped by direct sequencing. The differences in allele, genotype and haplotype frequencies were evaluated between the three groups.
The minor allele frequencies of both rs595209 and rs8177375 were significantly increased in ALI patients compared with both healthy subjects (odds ratio (OR) = 1.47, 95% confidence interval (CI):1.15-1.88, P = 0.0027 and OR = 1.97, 95% CI: (1.38-2.80), P = 0.0001, respectively) and sepsis alone patients (OR = 1.44, 95% CI: 1.12-1.85, P = 0.0041 and OR = 1.82, 95% CI: 1.28-2.57, P = 0.00079, respectively). Haplotype consisting of these two associated SNPs strengthened the association with ALI susceptibility. The frequency of haplotype AG (rs595209A, rs8177375G) in the ALI samples was significantly higher than that in the healthy control group (OR = 2.13, 95% CI: 1.46-3.09, P = 0.00006) and the sepsis alone group (OR = 2.24, 95% CI: 1.52-3.29, P = 0.00003). Carriers of the haplotype CA (rs595209C, rs8177375A) had a lower risk for ALI compared with healthy control group (OR = 0.69, 95% CI: 0.54-0.88, P = 0.0003) and sepsis alone group (OR = 0.71, 95% CI: 0.55-0.91, P = 0.0006). These associations remained significant after adjustment for covariates in multiple logistic regression analysis and for multiple comparisons.
These results indicate that genetic variants in the TIRAP gene might be associated with susceptibility to sepsis-associated ALI in the Han Chinese population. However, the association needs to be replicated in independent studies.
PMCID: PMC3001691  PMID: 21118491
18.  Genetic analysis of the clusterin gene in pseudoexfoliation syndrome 
Molecular Vision  2008;14:1727-1736.
Pseudoexfoliation syndrome is a major risk factor for the development of glaucoma. Following recent reports of a strong association of coding variants in the lysyl oxidase-like 1 (LOXL1) gene with this syndrome but low penetrance and variable disease frequency between different populations, we aimed to identify additional genetic factors contributing to the disease. The clusterin (CLU) gene has been proposed as a candidate because of the presence of clusterin protein in pseudoexfoliation deposits, its varied levels in aqueous humor of cases compared to controls, and the role of the protein as a molecular chaperone. We investigated the association of genetic variants across CLU in pseudoexfoliation syndrome and analyzed molecular characteristics of the encoded protein in ocular tissues.
The expression of clusterin in relevant ocular tissues was assessed using western blotting. Nine tag single nucleotide polymorphisms (SNPs) across CLU were genotyped in 86 cases of pseudoexfoliation syndrome and 2422 controls from the Australian Blue Mountains Eye Study cohort. Each SNP and haplotype was assessed for association with the syndrome.
Clusterin was identified in normal human iris, the ciliary body, lens capsule, optic nerve, and aqueous humor. Post-translational modification gives rise to a 100 kDa precursor protein in ocular tissues, larger than that reported in non-ocular tissues. One CLU SNP (rs3087554) was nominally associated with pseudoexfoliation syndrome at the genotypic level (p=0.044), although not when the age of controls was restricted to those over 73 years. Only age and the LOXL1 diplotype were significant factors in the logistic regression. One haplotype of all nine CLU SNPs was also associated (p=0.005), but the significance decreased slightly with the use of the age-restricted controls (p=0.011).
Clusterin is present in ocular anterior segment tissues involved in pseudoexfoliation syndrome. Although one haplotype may contribute in a minor way to genetic risk of pseudoexfoliation syndrome, common variation in this gene is not a major contributor to the risk of pseudoexfoliation syndrome.
PMCID: PMC2542387  PMID: 18806885
19.  Genetic variants in XRCC2: new insights into colorectal cancer tumorigenesis 
Polymorphisms in the double-strand DNA repair gene XRCC2 may play an important role in colorectal cancer (CRC) etiology, specifically in disease subtypes. Associations of XRCC2 variants and CRC were investigated by tumor site and tumor instability status in a four-center collaboration including three U.K. case-control studies (Sheffield, Leeds, Dundee) and a U.S. case-control study of cases from high-risk Utah pedigrees (total: 1,252 cases, 1,422 controls). The 14 variants studied were tagging-SNPs selected from HapMap/NIEHS data, supplemented with SNPs identified from sequencing of 125 cases chosen to represent multiple CRC groups (familial, metastatic disease, and tumor subsite). Monte Carlo significance testing using Genie software provided valid meta-analyses of the total resource, which includes family-based data. Similar to reports of CRC and other cancer sites, the rs3218536 R188H allele was not associated with increased risk. However, we observed a novel, highly significant association of a common SNP, rs3218499G>C, with increased risk of rectal tumors (OR 2.1, 95%CI 1.3-3.3; pchisq. =0.0006) versus controls, with the largest risk found for female rectal cases (OR 3.1, 95%CI 1.6-6.1; pchisq. =0.0006). This association differed significantly from that for proximal and distal colon cancers (pchisq. =0.02). Our investigation supports a role for XRCC2 in CRC tumorigenesis, conferring susceptibility to rectal tumors.
PMCID: PMC2742634  PMID: 19690184
XRCC2; colorectal cancer; DNA double-strand break repair; chromosomal instability; microsatellite instability
20.  Common variants in WFS1 confer risk of type 2 diabetes 
Nature genetics  2007;39(8):951-953.
We studied genes involved in pancreatic β cell function and survival, identifying associations between SNPs in WFS1 and diabetes risk in UK populations that we replicated in an Ashkenazi population and in additional UK studies. In a pooled analysis comprising 9,533 cases and 11,389 controls, SNPs in WFS1 were strongly associated with diabetes risk. Rare mutations in WFS1 cause Wolfram syndrome; using a gene-centric approach, we show that variation in WFS1 also predisposes to common type 2 diabetes.
PMCID: PMC2672152  PMID: 17603484
21.  Cystic Fibrosis Transmembrane Conductance Regulator Gene Mutation and Lung Cancer Risk 
The cystic fibrosis transmembrane conductance regulator (CFTR) plays an important role in maintaining lung function, but its association with lung cancer is unclear. A case-control study was conducted to determine the possible associations of genetic variants in the CFTR gene with lung cancer risk. Genotypes of the most common deletion, ΔF508, one functional SNP, and eight tag SNPs in the CFTR gene were determined in 574 lung cancer patients and 679 controls. A logistic regression model, adjusting for known risk factors, was used to evaluate the association of each variant with lung cancer risk; as confirmation, haplotype and sub-haplotype analyses were performed. The ΔF508 deletion and genotypes with minor alleles in one tag SNP, rs10487372, and one functional SNP, rs213950, were inversely associated with lung cancer risk. The results of haplotype and sub-haplotype analyses were consistent with single variant analysis, all pointing to deletion ΔF508 being the key variant for significant haplotypes and sub-haplotypes. Individuals with the ‘deletion-T’ (ΔF508/rs10487372) haplotype had a 68% reduced risk for lung cancer compared to the common haplotype ‘no-deletion-C’ (OR=0.32; 95% CI=0.15–0.68; p=0.01). Genetic variations in the CFTR gene might modulate the risk of lung cancer. This study, for the first time, provides evidence of a protective role of CFTR deletion carrier status in the etiology of lung cancer.
PMCID: PMC2895007  PMID: 20116881
Cystic fibrosis transmembrane conductance regulator; lung cancer; genetic variation
22.  Worldwide population differentiation at disease-associated SNPs 
BMC Medical Genomics  2008;1:22.
Recent genome-wide association (GWA) studies have provided compelling evidence of association between genetic variants and common complex diseases. These studies have made use of cases and controls almost exclusively from populations of European ancestry and little is known about the frequency of risk alleles in other populations. The present study addresses the transferability of disease associations across human populations by examining levels of population differentiation at disease-associated single nucleotide polymorphisms (SNPs).
We genotyped ~1000 individuals from 53 populations worldwide at 25 SNPs which show robust association with 6 complex human diseases (Crohn's disease, type 1 diabetes, type 2 diabetes, rheumatoid arthritis, coronary artery disease and obesity). Allele frequency differences between populations for these SNPs were measured using Fst. The Fst values for the disease-associated SNPs were compared to Fst values from 2750 random SNPs typed in the same set of individuals.
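To make the Fst measure concrete, the following minimal Python sketch computes Wright's Fst for a single biallelic SNP from per-population allele frequencies, as (H_T - H_S) / H_T with equal weight given to each population. This is an assumption for illustration: the abstract does not say which Fst estimator was used, and sample-size-corrected estimators (for example Weir and Cockerham's) would give somewhat different values. The function name `fst_biallelic` and the frequencies in the usage lines are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's Fst for one biallelic SNP.

    freqs: allele frequencies of the same allele, one per population.

    Assumptions: equal population weights and no sample-size correction;
    the study may have used a different estimator.
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies from at least two populations")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity if all populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_total - h_within) / h_total


# Hypothetical risk-allele frequencies in three populations.
print(fst_biallelic([0.10, 0.35, 0.60]))
```

Under this definition, Fst is 0 when all populations share the same allele frequency and 1 when populations are fixed for different alleles, which is why an allele that is absent or fixed in some populations can yield a high value.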
On average, disease SNPs are not significantly more differentiated between populations than random SNPs in the genome. Risk allele frequencies, however, do show substantial variation across human populations and may contribute to differences in disease prevalence between populations. We demonstrate that, in some cases, risk allele frequency differences are unusually high compared to random SNPs and may be due to the action of local (i.e. geographically-restricted) positive natural selection. Moreover, some risk alleles were absent or fixed in a population, which implies that risk alleles identified in one population do not necessarily account for disease prevalence in all human populations.
Although differences in risk allele frequencies between human populations are not unusually large and are thus likely not due to positive local selection, there is substantial variation in risk allele frequencies between populations which may account for differences in disease prevalence between human populations.
PMCID: PMC2440747  PMID: 18533027
23.  Common genetic variants of the ion channel transient receptor potential membrane melastatin 6 and 7 (TRPM6 and TRPM7), magnesium intake, and risk of type 2 diabetes in women 
BMC Medical Genetics  2009;10:4.
Ion channel transient receptor potential membrane melastatin 6 and 7 (TRPM6 and TRPM7) play a central role in magnesium homeostasis, which is critical for maintaining glucose and insulin metabolism. However, it is unclear whether common genetic variation in TRPM6 and TRPM7 contributes to risk of type 2 diabetes.
We conducted a nested case-control study in the Women's Health Study. During a median of 10 years of follow-up, 359 incident diabetes cases were diagnosed and matched by age and ethnicity with 359 controls. We analyzed 20 haplotype-tagging single nucleotide polymorphisms (SNPs) in TRPM6 and 5 common SNPs in TRPM7 for their association with diabetes risk.
Overall, there was no robust and significant association between any single SNP and diabetes risk. Neither was there any evidence of association between common TRPM6 and TRPM7 haplotypes and diabetes risk. Our haplotype analyses suggested a significantly increased risk of type 2 diabetes among carriers of both rare alleles of two non-synonymous SNPs in TRPM6 (Val1393Ile in exon 26 [rs3750425] and Lys1584Glu in exon 27 [rs2274924]) when their magnesium intake was lower than 250 mg per day. Compared with non-carriers, women who were carriers of the haplotype 1393Ile-1584Glu had an increased risk of type 2 diabetes (OR, 4.92; 95% CI, 1.05–23.0) only when they had low magnesium intake (<250 mg/day).
Our results provide suggestive evidence that two common non-synonymous TRPM6 coding region variants, the Val1393Ile and Lys1584Glu polymorphisms, might confer susceptibility to type 2 diabetes in women with low magnesium intake. Further replication in large-scale studies is warranted.
PMCID: PMC2637850  PMID: 19149903
24.  Association of a novel functional promoter variant (rs2075533 C>T) in the apoptosis gene TNFSF8 with risk of lung cancer—a finding from Texas lung cancer genome-wide association study 
Carcinogenesis  2011;32(4):507-515.
Published genome-wide association studies (GWASs) have identified few variants in the known biological pathways involved in lung cancer etiology. To mine the possibly hidden causal single nucleotide polymorphisms (SNPs), we explored all SNPs in the extrinsic apoptosis pathway from our published GWAS dataset for 1154 lung cancer cases and 1137 cancer-free controls. In an initial association analysis of 611 tagSNPs in 41 apoptosis-related genes, we identified only 10 tagSNPs associated with lung cancer risk with a P value <10−2, including four tagSNPs in DAPK1 and three tagSNPs in TNFSF8. Unlike the DAPK1 SNPs, TNFSF8 rs2181033 tagged four other predicted functional but untyped SNPs (rs776576, rs776577, rs31813148 and rs2075533) in the promoter region. Therefore, we further tested the binding affinity of these four SNPs by performing the electrophoretic mobility shift assay. We found that only the rs2075533T allele modified levels of nuclear proteins bound to DNA, leading to significantly decreased expression of luciferase reporter constructs, by 5- to 10-fold, in H1299, HeLa and HCT116 cell lines compared with the C allele. We also performed a replication study of the untyped rs2075533 in an independent Texas population but did not confirm the protective effect. We further performed a mini meta-analysis for SNPs of TNFSF8 obtained from four other published lung cancer GWASs with 12,214 cases and 47,721 controls, and we found that only rs3181366 (r2 = 0.69 with the untyped rs2075533) was associated with lung cancer risk (P = 0.008). Our findings suggest a possible role of novel TNFSF8 variants in susceptibility to lung cancer.
PMCID: PMC3066422  PMID: 21292647
25.  Fine mapping of variants associated with endometriosis in the WNT4 region on chromosome 1p36 
Genome-wide association studies show strong evidence of association with endometriosis for markers on chromosome 1p36 spanning the potential candidate genes WNT4, CDC42 and LINC00339. WNT4 is involved in development of the uterus, and the expression of CDC42 and LINC00339 is altered in women with endometriosis. We conducted fine mapping to examine the role of coding variants in WNT4 and CDC42 and to determine the key SNPs with the strongest evidence of association in this region. We identified rare coding variants in WNT4 and CDC42 present only in endometriosis cases. Their frequencies were low and cannot account for the common signal associated with increased risk of endometriosis. Genotypes for five common SNPs in the region of chromosome 1p36 show stronger association signals than rs7521902, the SNP reported in published genome scans. Of these, three SNPs (rs12404660, rs3820282, and rs55938609) were located in DNA sequences with potential functional roles, including overlap with transcription factor binding sites for FOXA1, FOXA2, ESR1, and ESR2. Functional studies will be required to identify the gene or genes implicated in endometriosis risk.
PMCID: PMC3852639  PMID: 24319535
Endometriosis; WNT4; CDC42; chromosome 1p36; rare variants; common variants

