1.  Polymorphisms influencing prostate specific antigen concentration may bias genome-wide association studies on prostate cancer 
Genome-wide association studies (GWAS) have produced weak (OR=1.1–1.5), but significant associations between single nucleotide polymorphisms (SNPs) and prostate cancer. However, these associations may be explained by detection bias caused by SNPs influencing prostate specific antigen (PSA) concentration. Thus, in a simulation study, we quantified the extent of bias in the association between a SNP and prostate cancer when the SNP influences PSA concentration.
We generated 2,000 replicate cohorts of 20,000 men using real-world estimates of prostate cancer risk, prevalence of carrying ≥1 minor allele, PSA concentration, and the influence of a SNP on PSA concentration. We modeled risk ratios (RR) of 1.00, 1.25, and 1.50 for the association between carrying ≥1 minor allele and prostate cancer. We calculated mean betas from the replicate cohorts and quantified bias under each scenario.
Assuming no association between a SNP and prostate cancer, the estimated mean bias in betas ranged from 0.02 to 0.10 for ln PSA being 0.05 to 0.20 ng/mL higher in minor allele carriers; the mean biased RRs ranged from 1.03 to 1.11. Assuming true RRs=1.25 and 1.50, the biased RRs were as large as 1.39 and 1.67, respectively.
Estimates of the association between SNPs and prostate cancer can be biased to the magnitude observed in published GWAS, possibly resulting in Type I error. However, large associations (RR >1.10) may not fully be explained by this bias.
The influence of SNPs on PSA concentration should be considered when interpreting results from GWAS on prostate cancer.
PMCID: PMC4294961  PMID: 25352524
polymorphism; prostate-specific antigen; prostate cancer; bias
2.  Impact of methods used to express levels of circulating fatty acids on the degree and direction of associations with blood lipids in humans 
The British Journal of Nutrition  2016;115(2):251-261.
Numerous studies have examined relationships between disease biomarkers (such as blood lipids) and levels of circulating or cellular fatty acids. In such association studies, fatty acids have typically been expressed as the percentage of a particular fatty acid relative to the total fatty acids in a sample. Using two human cohorts, this study examined relationships between blood lipids (TAG, and LDL, HDL or total cholesterol) and circulating fatty acids expressed either as a percentage of total or as concentration in serum. The direction of the correlation between stearic acid, linoleic acid, dihomo-γ-linolenic acid, arachidonic acid and DHA and circulating TAG reversed when fatty acids were expressed as concentrations v. a percentage of total. Similar reversals were observed for these fatty acids when examining their associations with the ratio of total cholesterol:HDL-cholesterol. This reversal pattern was replicated in serum samples from both human cohorts. The correlations between blood lipids and fatty acids expressed as a percentage of total could be mathematically modelled from the concentration data. These data reveal that the different methods of expressing fatty acids lead to dissimilar correlations between blood lipids and certain fatty acids. This study raises important questions about how such reversals in association patterns impact the interpretation of numerous association studies evaluating fatty acids and their relationships with disease biomarkers or risk.
PMCID: PMC4697295  PMID: 26615716
PUFA; Lipid biomarkers; Linoleic acid; Arachidonic acid
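The reversal described in this abstract can be illustrated with a toy calculation. The sketch below uses synthetic numbers invented purely for illustration (they are not data from either cohort) and a hypothetical helper `pearson`. It shows how a fatty acid whose concentration rises with TAG can still fall as a percentage of total fatty acids when the total rises faster.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic values, invented for illustration only (not from the study).
tag = [1.0, 2.0, 3.0, 4.0]                # circulating TAG, arbitrary units
fa_conc = [10.0, 12.0, 14.0, 16.0]        # one fatty acid, as a concentration
total_fa = [100.0, 150.0, 200.0, 250.0]   # total fatty acids, as a concentration

# The same fatty acid expressed as a percentage of total fatty acids.
fa_pct = [100.0 * c / t for c, t in zip(fa_conc, total_fa)]

r_conc = pearson(fa_conc, tag)  # concentration rises with TAG
r_pct = pearson(fa_pct, tag)    # percentage falls as the total grows faster

print(f"r (concentration) = {r_conc:.2f}, r (percentage) = {r_pct:.2f}")
```

In this toy setting the sign flips only because the denominator (total fatty acids) is itself correlated with TAG, which is one plausible reading of why the two ways of expressing fatty acids can disagree.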
3.  Plasma Proteome Biomarkers of Inflammation in School Aged Children in Nepal 
PLoS ONE  2015;10(12):e0144279.
Inflammation is a condition stemming from complex host defense and tissue repair mechanisms, often simply characterized by plasma levels of a single acute reactant. We attempted to identify candidate biomarkers of systemic inflammation within the plasma proteome. We applied quantitative proteomics using isobaric mass tags (iTRAQ) tandem mass spectrometry to quantify proteins in plasma of 500 Nepalese children 6–8 years of age. We evaluated those that co-vary with inflammation, indexed by α-1-acid glycoprotein (AGP), a conventional biomarker of inflammation in population studies. Among 982 proteins quantified in >10% of samples, 99 were strongly associated with AGP at a family-wise error rate of 0.1%. Magnitude and significance of association varied more among proteins positively (n = 41) than negatively associated (n = 58) with AGP. The former included known positive acute phase proteins including C-reactive protein, serum amyloid A, complement components, protease inhibitors, transport proteins with anti-oxidative activity, and numerous unexpected intracellular signaling molecules. Negatively associated proteins exhibited distinct differences in abundance between secretory hepatic proteins involved in transporting or binding lipids, micronutrients (vitamin A and calcium), growth factors and sex hormones, and proteins of largely extra-hepatic origin involved in the formation and metabolic regulation of extracellular matrix. With the same analytical approach and the significance threshold, seventy-two out of the 99 proteins were commonly associated with CRP, an established biomarker of inflammation, suggesting the validity of the identified proteins. 
Our findings have revealed a vast plasma proteome within a free-living population of children that comprises functional biomarkers of homeostatic and induced host defense, nutrient metabolism and tissue repair, representing a set of plasma proteins that may be used to assess dynamics and extent of inflammation for future clinical and public health application.
PMCID: PMC4670104  PMID: 26636573
4.  Analytic power and sample size calculation for the genotypic transmission/disequilibrium test in case-parent trio studies 
Case-parent trio studies considering genotype data from children affected by a disease and from their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests in this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, e.g., procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests have the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance as well as the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time that the power of these tests is compared based on equations, yielding instant results and omitting the need for time-consuming simulation studies. This comparison reveals that the tests have almost the same power, with the score test being slightly more powerful.
PMCID: PMC4206700  PMID: 25123830
Case-parent trio design; Conditional logistic regression; Genome-wide association studies; Power calculation; Wald test
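For readers unfamiliar with the allelic TDT that this abstract uses as a comparator, the following is a minimal sketch of the standard McNemar-type statistic computed from transmissions by heterozygous parents. It is not the genotypic TDT or the power formulas derived in the article. The function name `allelic_tdt` and the example counts are hypothetical.

```python
from math import erfc, sqrt


def allelic_tdt(transmitted, untransmitted):
    """McNemar-type allelic TDT.

    transmitted:   heterozygous parents who passed the minor allele to the
                   affected child
    untransmitted: heterozygous parents who passed the other allele
    Returns the chi-square statistic (1 df) and its p-value.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = allelic_tdt(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of any association with disease.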
5.  Oesophageal squamous cell carcinoma in high-risk Chinese populations: Possible role for vascular epithelial growth factor A 
Mechanisms involved in wound healing play some role in carcinogenesis in multiple organs, likely by creating a chronic inflammatory milieu. This study sought to assess the role of genetic markers in selected inflammation-related genes involved in wound healing (interleukin (IL)-1a, IL-1b, IL-1 Receptor type I (IL-1Ra), IL-1 Receptor type II (IL-1Rb), tumour necrosis factor (TNF)-α, tumour necrosis factor receptor superfamily member (TNFRSF)1A, nuclear factor kappa beta (NF-kB)1, NF-kB2, inducible nitric oxide synthase (iNOS), cyclooxygenase (COX)-2, hypoxia induced factor (HIF)-1α, vascular endothelial growth factor (VEGF)A and P-53) in risk to oesophageal squamous cell carcinoma (OSCC).
We genotyped 125 tag single nucleotide polymorphisms (SNPs) in 410 cases and 377 age and sex matched disease-free individuals from the Nutritional Intervention Trial (NIT) cohort, and 546 cases and 556 controls individually matched for age, sex and neighbourhood from the Shanxi case–control study, both conducted in high-risk areas of north-central China (1985–2007). Cox proportional-hazard models and conditional logistic regression models were used for SNP analyses for NIT and Shanxi, respectively. Fisher's inverse test statistics were used to obtain gene-level significance.
Multiple SNPs were significantly associated with OSCC in both studies; however, none retained their significance after a conservative Bonferroni adjustment. Empiric p-values for tag SNPs in VEGFA in NIT were highly concentrated in the lower tail of the distribution, suggesting this gene may be influencing risk. Permutation tests confirmed the significance of this pattern. At the gene level, VEGFA yielded an empiric significance (P = 0.027) in NIT. We also observed some evidence for interaction between environmental factors and some VEGFA tag SNPs.
Our finding adds further evidence for a potential role for markers in the VEGFA gene in the development and progression of early precancerous lesions of the oesophagus.
PMCID: PMC4363989  PMID: 25172294
Oesophageal squamous cell carcinoma; Inflammation; Wound-healing; Genetic marker; Genetics; Inflammation-related events; Vascular endothelial growth factor A; VEGFA
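The gene-level step in this abstract relies on Fisher's inverse (combined probability) test. A minimal sketch of the textbook version follows, assuming independent p-values. The function name `fisher_combined` and the example p-values are hypothetical. Tag SNPs within one gene are typically correlated, so the chi-square reference distribution is only approximate there, which is consistent with the abstract's use of empiric and permutation-based significance.

```python
from math import exp, factorial, log


def fisher_combined(pvalues):
    """Fisher's combined probability test for k independent p-values.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the joint null hypothesis.
    Returns (X, combined p-value).
    """
    if not pvalues or any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    stat = -2.0 * sum(log(p) for p in pvalues)
    half = stat / 2.0
    # Closed-form upper tail of a chi-square with an even number (2k) of df.
    tail = exp(-half) * sum(half ** i / factorial(i) for i in range(k))
    return stat, tail


# Hypothetical SNP-level p-values within one gene, for illustration only.
stat, gene_p = fisher_combined([0.05, 0.05])
print(f"X = {stat:.2f}, gene-level p = {gene_p:.4f}")
```

Because the degrees of freedom are always even, the tail probability has a finite closed form and no statistics library is needed.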
6.  An IL-13 Promoter Polymorphism Associated with Liver Fibrosis in Patients with Schistosoma japonicum 
PLoS ONE  2015;10(8):e0135360.
The aim of this study was to determine whether two polymorphisms in the gene encoding IL13 previously associated with Schistosoma hematobium (S. hematobium) and S. mansoni infection are associated with S. japonicum infection. Single nucleotide polymorphisms (SNPs) rs1800925 (IL13/-1112C>T) and rs20541 (IL13R130Q) were genotyped in 947 unrelated individuals (307 chronically infected, 339 late-stage with liver fibrosis, 301 uninfected controls) from a schistosomiasis-endemic area of Hubei province in China. Regression models were used to evaluate allelic and haplotypic associations with chronic and late-stage schistosomiasis adjusted for non-genetic covariates. Expression of IL-13 was measured in S. japonicum-infected liver fibrosis tissue and normal liver tissue from uninfected controls by immunohistochemistry (IHC). The role of rs1800925 in IL-13 transcription was further determined by luciferase reporter assay using the recombinant PGL4.17-rs1800925 plasmid. We found SNP rs1800925T was associated with late-stage schistosomiasis caused by S. japonicum but not chronic schistosomiasis (OR = 1.39, 95%CI = 1.02–1.91, p = 0.03) and uninfected controls (OR = 1.49, 95%CI = 1.03–2.13, p = 0.03). Moreover, the haplotype rs1800925T-rs20541C increased the risk of disease progression to late-stage schistosomiasis (OR = 1.46, p = 0.035), whereas haplotype rs1800925C-rs20541A showed a protective role against development of late-stage schistosomiasis (F = 0.188, OR = 0.61, p = 0.002). Furthermore, S. japonicum-induced fibrotic liver tissue had higher IL13 expression than normal liver tissue. Plasmid PGL4.17-rs1800925T showed a stronger relative luciferase activity than Plasmid PGL4.17-rs1800925C in 293FT, QSG-7701 and HL-7702 cell lines. In conclusion, the functional IL13 polymorphism, rs1800925T, previously associated with risk of schistosomiasis, also contributes to risk of late-stage schistosomiasis caused by S. japonicum.
PMCID: PMC4530950  PMID: 26258681
7.  Inferring rare disease risk variants based on exact probabilities of sharing by multiple affected relatives 
Bioinformatics  2014;30(15):2189-2196.
Motivation: Family-based designs are regaining popularity for genomic sequencing studies because they provide a way to test cosegregation with disease of variants that are too rare in the population to be tested individually in a conventional case–control study.
Results: Where only a few affected subjects per family are sequenced, the probability that any variant would be shared by all affected relatives, given it occurred in any one family member, provides evidence against the null hypothesis of a complete absence of linkage and association. A P-value can be obtained as the sum of the probabilities of sharing events as (or more) extreme in one or more families. We generalize an existing closed-form expression for exact sharing probabilities to more than two relatives per family. When pedigree founders are related, we show that an approximation of sharing probabilities based on empirical estimates of kinship among founders obtained from genome-wide marker data is accurate for low levels of kinship. We also propose a more generally applicable approach based on Monte Carlo simulations. We applied this method to a study of 55 multiplex families with apparent non-syndromic forms of oral clefts from four distinct populations, with whole exome sequences available for two or three affected members per family. The rare single nucleotide variant rs149253049 in ADAMTS9 shared by affected relatives in three Indian families achieved significance after correcting for multiple comparisons (p = 2×10⁻⁶).
Availability and implementation: Source code and binaries of the R package RVsharing are freely available for download at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4103601  PMID: 24740360
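The P-value construction described in the Results of this abstract (summing the probabilities of sharing events as or more extreme across families) can be sketched by brute-force enumeration. This is an illustrative sketch only, not the RVsharing package's code or API. It assumes families are independent, takes each family's sharing probability as a given input, and defines "as or more extreme" as any sharing pattern whose probability does not exceed that of the observed pattern. The function name and the input values are hypothetical.

```python
from itertools import product


def sharing_pvalue(share_probs, observed):
    """Exact p-value over independent families by enumeration.

    share_probs: for each family, the probability that all sequenced affected
                 relatives share the variant, given it was seen in the family
    observed:    for each family, True if full sharing was observed
    """
    def pattern_prob(pattern):
        prob = 1.0
        for p, shared in zip(share_probs, pattern):
            prob *= p if shared else (1.0 - p)
        return prob

    obs_prob = pattern_prob(observed)
    tolerance = 1.0 + 1e-12  # guard against floating-point ties
    total = 0.0
    # Enumerate every possible shared/not-shared pattern across families.
    for pattern in product([True, False], repeat=len(share_probs)):
        prob = pattern_prob(pattern)
        if prob <= obs_prob * tolerance:
            total += prob
    return total


# Hypothetical sharing probabilities for three families.
probs = [1 / 15, 1 / 15, 1 / 15]
p_all_shared = sharing_pvalue(probs, [True, True, True])
print(f"p = {p_all_shared:.2e}")
```

Enumeration grows as 2 to the power of the number of families, so this form is only practical for a handful of families carrying the same variant.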
8.  Infection and Inflammation in Schizophrenia and Bipolar Disorder: A Genome Wide Study for Interactions with Genetic Variation 
PLoS ONE  2015;10(3):e0116696.
Inflammation and maternal or fetal infections have been suggested as risk factors for schizophrenia (SZ) and bipolar disorder (BP). It is likely that such environmental effects are contingent on genetic background. Here, in a genome-wide approach, we test the hypothesis that such exposures increase the risk for SZ and BP and that the increase is dependent on genetic variants. We use genome-wide genotype data, plasma IgG antibody measurements against Toxoplasma gondii, Herpes simplex virus type 1, Cytomegalovirus, Human Herpes Virus 6 and the food antigen gliadin as well as measurements of C-reactive protein (CRP), a peripheral marker of inflammation. The subjects are SZ cases, BP cases, parents of cases and screened controls. We look for higher levels of our immunity/infection variables and interactions between them and common genetic variation genome-wide. We find many of the antibody measurements higher in both disorders. While individual tests do not withstand correction for multiple comparisons, the number of nominally significant tests and the comparisons showing the expected direction are in significant excess (permutation p=0.019 and 0.004 respectively). We also find CRP levels highly elevated in SZ, BP and the mothers of BP cases, in agreement with existing literature, but possibly confounded by our inability to correct for smoking or body mass index. In our genome-wide interaction analysis no signal reached genome-wide significance, yet many plausible candidate genes emerged. In a hypothesis driven test, we found multiple interactions among SZ-associated SNPs in the HLA region on chromosome 6 and replicated an interaction between CMV infection and genotypes near the CTNNA3 gene reported by a recent GWAS. Our results support that inflammatory processes and infection may modify the risk for psychosis and suggest that the genotype at SZ-associated HLA loci modifies the effect of these variables on the risk to develop SZ.
PMCID: PMC4363491  PMID: 25781172
9.  Fetal polymorphisms at the ABCB1-transporter gene locus are associated with susceptibility to non-syndromic oral cleft malformations 
European Journal of Human Genetics  2013;21(12):1436-1441.
ATP-binding cassette (ABC) proteins in the placenta regulate fetal exposure to xenobiotics. We hypothesized that functional polymorphisms in ABC genes influence risk for non-syndromic oral clefts (NSOC). Both family-based and case–control studies were undertaken to evaluate the association of nine potentially functional single-nucleotide polymorphisms within four ABC genes with risk of NSOC. Peripheral blood DNA from a total of 150 NSOC case-parent trios from Singapore and Taiwan were genotyped, as was cord blood DNA from 189 normal Chinese neonates used as controls. In trios, significant association was observed between the ABCB1 single-nucleotide polymorphisms and NSOC (P<0.05). Only ABCB1 rs1128503 retained significant association after Bonferroni correction (odds ratio (OR)=2.04; 95% confidence interval (CI)=1.42–2.98), while rs2032582 and rs1045642 showed nominal significance. Association with rs1128503 was replicated in a case–control analysis comparing NSOC probands with controls (OR=1.58; 95% CI=1.12–2.23). A comparison between the mothers of probands and controls showed no evidence of association, suggesting NSOC risk is determined by fetal and not maternal ABCB1 genotype. The two studies produced a combined OR of 1.79 (95% CI=1.38–2.30). The T-allele at rs1128503 was associated with higher risk. This study thus provides evidence that potentially functional polymorphisms in fetal ABCB1 modulate risk for NSOC, presumably through suboptimal exclusion of xenobiotics at the fetal–maternal interface.
PMCID: PMC3831066  PMID: 23443032
cleft palate; cleft lip; ATP-binding cassette transporters; single-nucleotide polymorphism; disease susceptibility; placenta
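The abstract reports a combined odds ratio from the trio and case–control analyses without naming the pooling method. As a sketch under that caveat, the code below applies a standard fixed-effect, inverse-variance meta-analysis to the two reported estimates, recovering each standard error from its 95% confidence interval. The helper names are hypothetical, and the choice of method is an assumption rather than the authors' stated procedure.

```python
from math import exp, log, sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Convert an OR and its 95% CI to (log OR, standard error)."""
    return log(odds_ratio), (log(ci_high) - log(ci_low)) / (2.0 * Z95)


def fixed_effect(estimates):
    """Inverse-variance pooled OR with 95% CI from (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return (exp(pooled),
            exp(pooled - Z95 * pooled_se),
            exp(pooled + Z95 * pooled_se))


# Estimates for rs1128503 as reported in the abstract.
trio = log_or_and_se(2.04, 1.42, 2.98)
case_control = log_or_and_se(1.58, 1.12, 2.23)

pooled_or, ci_low, ci_high = fixed_effect([trio, case_control])
print(f"pooled OR = {pooled_or:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the pooled estimate lands close to the combined OR of 1.79 (95% CI 1.38–2.30) given in the abstract; small differences would be expected from rounding of the published intervals.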
10.  Statistical Inference from Multiple iTRAQ Experiments without Using Common Reference Standards 
Journal of proteome research  2013;12(2):594-604.
Isobaric tags for relative and absolute quantitation (iTRAQ) is a prominent mass spectrometry technology for protein identification and quantification that is capable of analyzing multiple samples in a single experiment. Frequently, iTRAQ experiments are carried out using an aliquot from a pool of all samples, or “masterpool”, in one of the channels as a reference sample standard to estimate protein relative abundances in the biological samples and to combine abundance estimates from multiple experiments. In this manuscript, we show that using a masterpool is counterproductive. We obtain more precise estimates of protein relative abundance by using the available biological data instead of the masterpool and do not need to occupy a channel that could otherwise be used for another biological sample. In addition, we introduce a simple statistical method to associate proteomic data from multiple iTRAQ experiments with a numeric response and show that this approach is more powerful than the conventionally employed masterpool-based approach. We illustrate our methods using data from four replicate iTRAQ experiments on aliquots of the same pool of plasma samples and from a 406-sample project designed to identify plasma proteins that covary with nutrient concentrations in chronically undernourished children from South Asia.
PMCID: PMC4223774  PMID: 23270375
Mass spectrometry; iTRAQ; statistical analysis; experimental design
11.  Metabolomic Profiling of Urine: Response to a Randomized, Controlled Feeding Study of Select Fruits and Vegetables, and Application to an Observational Study 
The British journal of nutrition  2013;110(10):10.1017/S000711451300127X.
Metabolomic profiles were used to characterize the effects of consuming a high-phytochemical diet compared to a diet devoid of fruits and vegetables in a randomized trial and cross-sectional study. In the trial, 8 h fasting urine from healthy men (n=5) and women (n=5) was collected after a 2-week randomized, controlled trial of 2 diet periods: a diet rich in cruciferous vegetables, citrus and soy (F&V), and a fruit- and vegetable-free (basal) diet. Among the ions found to differentiate the diets, 176 were putatively annotated with compound identifications, with 46 supported by MS/MS fragment evidence. Metabolites more abundant in the F&V diet included markers of dietary intervention (e.g., crucifers, citrus and soy), fatty acids and niacin metabolites. Ions more abundant in the basal diet included riboflavin, several acylcarnitines, and amino acid metabolites. In the cross-sectional study, we compared participants based on tertiles of crucifers, citrus and soy from 3 d food records (3DFR; n=36) and food frequency questionnaires (FFQ; n=57); intake was separately divided into tertiles of total fruit and vegetable intake for FFQ. As a group, ions individually differential between the experimental diets differentiated the observational study participants. However, only 4 ions were significant individually, differentiating the third vs. first tertile of crucifer, citrus and soy intake based on 3DFR. One of these was putatively annotated: proline betaine, a marker of citrus consumption. There were no ions significantly distinguishing tertiles by FFQ. Metabolomics assessment of controlled dietary interventions provides a more accurate and stronger characterization of diet than observational data.
PMCID: PMC3818452  PMID: 23657156
12.  Copy number polymorphisms near SLC2A9 are associated with serum uric acid concentrations 
BMC Genetics  2014;15:81.
Hyperuricemia is associated with multiple diseases, including gout, cardiovascular disease, and renal disease. Serum urate is highly heritable, yet association studies of single nucleotide polymorphisms (SNPs) and serum uric acid explain a small fraction of the heritability. Whether copy number polymorphisms (CNPs) contribute to uric acid levels is unknown.
We assessed copy number on a genome-wide scale among 8,411 individuals of European ancestry (EA) who participated in the Atherosclerosis Risk in Communities (ARIC) study. CNPs upstream of the urate transporter SLC2A9 on chromosome 4p16.1 are associated with uric acid (χ², 2 df = 3545, p = 3.19×10⁻²³). Effect sizes, expressed as the percentage change in uric acid per deleted copy, are most pronounced among women (50th percentile 4.93, with 2.5th and 97.5th percentiles of 3.97 and 5.87; p = 4.57×10⁻²³) and independent of previously reported SNPs in SLC2A9 as assessed by SNP and CNP regression models and the phasing SNP and CNP haplotypes (χ², 2 df = 3190, p = 7.23×10⁻⁸). Our finding is replicated in the Framingham Heart Study (FHS), where the effect size estimated from 4,089 women is comparable to ARIC in direction and magnitude (50th percentile 4.70, with 2.5th and 97.5th percentiles of 1.41 and 7.88; p = 5.46×10⁻³).
This is the first study to characterize CNPs in ARIC and the first genome-wide analysis of CNPs and uric acid. Our findings suggest a novel, non-coding regulatory mechanism for SLC2A9-mediated modulation of serum uric acid, and detail a bioinformatic approach for assessing the contribution of CNPs to heritable traits in large population-based studies where technical sources of variation are substantial.
PMCID: PMC4118309  PMID: 25007794
Copy number polymorphism; Hyperuricemia; Genomewide association study
13.  Large-Scale Genome-Wide Association Studies and Meta-Analyses of Longitudinal Change in Adult Lung Function 
Tang, Wenbo | Kowgier, Matthew | Loth, Daan W. | Soler Artigas, María | Joubert, Bonnie R. | Hodge, Emily | Gharib, Sina A. | Smith, Albert V. | Ruczinski, Ingo | Gudnason, Vilmundur | Mathias, Rasika A. | Harris, Tamara B. | Hansel, Nadia N. | Launer, Lenore J. | Barnes, Kathleen C. | Hansen, Joyanna G. | Albrecht, Eva | Aldrich, Melinda C. | Allerhand, Michael | Barr, R. Graham | Brusselle, Guy G. | Couper, David J. | Curjuric, Ivan | Davies, Gail | Deary, Ian J. | Dupuis, Josée | Fall, Tove | Foy, Millennia | Franceschini, Nora | Gao, Wei | Gläser, Sven | Gu, Xiangjun | Hancock, Dana B. | Heinrich, Joachim | Hofman, Albert | Imboden, Medea | Ingelsson, Erik | James, Alan | Karrasch, Stefan | Koch, Beate | Kritchevsky, Stephen B. | Kumar, Ashish | Lahousse, Lies | Li, Guo | Lind, Lars | Lindgren, Cecilia | Liu, Yongmei | Lohman, Kurt | Lumley, Thomas | McArdle, Wendy L. | Meibohm, Bernd | Morris, Andrew P. | Morrison, Alanna C. | Musk, Bill | North, Kari E. | Palmer, Lyle J. | Probst-Hensch, Nicole M. | Psaty, Bruce M. | Rivadeneira, Fernando | Rotter, Jerome I. | Schulz, Holger | Smith, Lewis J. | Sood, Akshay | Starr, John M. | Strachan, David P. | Teumer, Alexander | Uitterlinden, André G. | Völzke, Henry | Voorman, Arend | Wain, Louise V. | Wells, Martin T. | Wilk, Jemma B. | Williams, O. Dale | Heckbert, Susan R. | Stricker, Bruno H. | London, Stephanie J. | Fornage, Myriam | Tobin, Martin D. | O′Connor, George T. | Hall, Ian P. | Cassano, Patricia A.
PLoS ONE  2014;9(7):e100776.
Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function.
We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects models and combined cohort-specific results using fixed effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis.
The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10⁻⁷). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10⁻⁸) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively.
In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.
PMCID: PMC4077649  PMID: 24983941
14.  African Ancestry is a Risk Factor for Asthma and High Total IgE Levels in African Admixed Populations 
Genetic epidemiology  2013;37(4):393-401.
Characterization of genetic admixture of populations in the Americas and the Caribbean is of interest for anthropological, epidemiological, and historical reasons. Asthma has a higher prevalence and is more severe in populations with a high African component. Association of African ancestry with asthma has been demonstrated. We estimated admixture proportions of samples from six trihybrid populations of African descent and determined the relationship between African ancestry and asthma and total serum IgE levels (tIgE). We genotyped 237 ancestry informative markers in asthmatics and nonasthmatic controls from Barbados (190/277), Jamaica (177/529), Brazil (40/220), Colombia (508/625), African Americans from New York (207/171), and African Americans from Baltimore/Washington, D.C. (625/757). We estimated individual ancestries and evaluated genetic stratification using Structure and principal component analysis. Association of African ancestry and asthma and tIgE was evaluated by regression analysis. Mean ± SD African ancestry ranged from 0.76 ± 0.10 among Barbadians to 0.33 ± 0.13 in Colombians. The European component varied from 0.14 ± 0.05 among Jamaicans and Barbadians to 0.26 ± 0.08 among Colombians. African ancestry was associated with risk for asthma in Colombians (odds ratio (OR) = 4.5, P = 0.001), Brazilians (OR = 136.5, P = 0.003), and African Americans of New York (OR = 4.7, P = 0.040). African ancestry was also associated with higher tIgE levels among Colombians (β = 1.3, P = 0.04), Barbadians (β = 3.8, P = 0.03), and Brazilians (β = 1.6, P = 0.03). Our findings indicate that African ancestry can account for, at least in part, the association between asthma and its associated trait, tIgE levels.
PMCID: PMC4051322  PMID: 23554133
African; asthma; ancestry
15.  X-linked markers in DMD associated with oral clefts 
As part of an international consortium, case-parent trios were collected for a genome-wide association study of isolated, non-syndromic oral clefts, including cleft lip (CL), cleft palate (CP) and cleft lip and palate (CLP). Non-syndromic oral clefts have a complex and heterogeneous etiology. Risk is influenced by genes and environmental factors, and differs markedly by gender. Family based association tests (FBAT) were used on 14,486 SNPs spanning the X chromosome, stratified by type of cleft and racial group. Results that remained significant after correction for multiple comparisons were obtained for the Duchenne muscular dystrophy (DMD) gene, the largest single gene in the human genome, among CL/P trios (both CL and CLP combined). When stratified into groups of European and Asian ancestry, stronger signals were obtained for Asians. Although conventional sliding window haplotype analysis showed no increase in significance, analysis of selected combinations of the 25 most significant SNPs in DMD identified four SNPs together that attained genome-wide significance among Asian CL/P trios, raising the possibility of interaction between distant SNPs within DMD.
PMCID: PMC3600648  PMID: 23489894
oral clefts; case-parent trios; X-linked; family-based association; DMD
16.  Efficient simulation of epistatic interactions in case-parent trios 
Human heredity  2013;75(1):12-22.
Statistical approaches to evaluate interactions between single nucleotide polymorphisms (SNPs) and SNP-environment interactions are of great importance in genetic association studies, as susceptibility to complex disease might be related to the interaction of multiple SNPs and/or environmental factors. With these methods under active development, algorithms to simulate genomic data sets are needed, to ensure proper type I error control of newly proposed methods, and to compare power with existing methods. In this manuscript we propose an efficient method for a haplotype-based simulation of case-parent trios, when the disease risk is thought to depend on possibly higher order epistatic interactions, or gene-environment interactions with binary exposures.
PMCID: PMC3800020  PMID: 23548797
Case-parent trios; interactions; single nucleotide polymorphisms; haplotypes
17.  Distinct Loci in the CHRNA5/CHRNA3/CHRNB4 Gene Cluster Are Associated With Onset of Regular Smoking 
Stephens, Sarah H. | Hartz, Sarah M. | Hoft, Nicole R. | Saccone, Nancy L. | Corley, Robin C. | Hewitt, John K. | Hopfer, Christian J. | Breslau, Naomi | Coon, Hilary | Chen, Xiangning | Ducci, Francesca | Dueker, Nicole | Franceschini, Nora | Frank, Josef | Han, Younghun | Hansel, Nadia N. | Jiang, Chenhui | Korhonen, Tellervo | Lind, Penelope A. | Liu, Jason | Lyytikäinen, Leo-Pekka | Michel, Martha | Shaffer, John R. | Short, Susan E. | Sun, Juzhong | Teumer, Alexander | Thompson, John R. | Vogelzangs, Nicole | Vink, Jacqueline M. | Wenzlaff, Angela | Wheeler, William | Yang, Bao-Zhu | Aggen, Steven H. | Balmforth, Anthony J. | Baumeister, Sebastian E. | Beaty, Terri H. | Benjamin, Daniel J. | Bergen, Andrew W. | Broms, Ulla | Cesarini, David | Chatterjee, Nilanjan | Chen, Jingchun | Cheng, Yu-Ching | Cichon, Sven | Couper, David | Cucca, Francesco | Dick, Danielle | Foroud, Tatiana | Furberg, Helena | Giegling, Ina | Gillespie, Nathan A. | Gu, Fangyi | Hall, Alistair S. | Hällfors, Jenni | Han, Shizhong | Hartmann, Annette M. | Heikkilä, Kauko | Hickie, Ian B. | Hottenga, Jouke Jan | Jousilahti, Pekka | Kaakinen, Marika | Kähönen, Mika | Koellinger, Philipp D. | Kittner, Stephen | Konte, Bettina | Landi, Maria-Teresa | Laatikainen, Tiina | Leppert, Mark | Levy, Steven M. | Mathias, Rasika A. | McNeil, Daniel W. | Medland, Sarah E. | Montgomery, Grant W. | Murray, Tanda | Nauck, Matthias | North, Kari E. | Paré, Peter D. | Pergadia, Michele | Ruczinski, Ingo | Salomaa, Veikko | Viikari, Jorma | Willemsen, Gonneke | Barnes, Kathleen C. | Boerwinkle, Eric | Boomsma, Dorret I. | Caporaso, Neil | Edenberg, Howard J. | Francks, Clyde | Gelernter, Joel | Grabe, Hans Jörgen | Hops, Hyman | Jarvelin, Marjo-Riitta | Johannesson, Magnus | Kendler, Kenneth S. | Lehtimäki, Terho | Magnusson, Patrik K.E. | Marazita, Mary L. | Marchini, Jonathan | Mitchell, Braxton D. | Nöthen, Markus M. | Penninx, Brenda W. 
| Raitakari, Olli | Rietschel, Marcella | Rujescu, Dan | Samani, Nilesh J. | Schwartz, Ann G. | Shete, Sanjay | Spitz, Margaret | Swan, Gary E. | Völzke, Henry | Veijola, Juha | Wei, Qingyi | Amos, Chris | Cannon, Dale S. | Grucza, Richard | Hatsukami, Dorothy | Heath, Andrew | Johnson, Eric O. | Kaprio, Jaakko | Madden, Pamela | Martin, Nicholas G. | Stevens, Victoria L. | Weiss, Robert B. | Kraft, Peter | Bierut, Laura J. | Ehringer, Marissa A.
Genetic epidemiology  2013;37(8):846-859.
Neuronal nicotinic acetylcholine receptor (nAChR) genes (CHRNA5/CHRNA3/CHRNB4) have been reproducibly associated with nicotine dependence, smoking behaviors, and lung cancer risk. Of the few reports that have focused on early smoking behaviors, association results have been mixed. This meta-analysis examines early smoking phenotypes and SNPs in the gene cluster to determine: (1) whether the most robust association signal in this region (rs16969968) for other smoking behaviors is also associated with early behaviors, and/or (2) if additional statistically independent signals are important in early smoking. We focused on two phenotypes: age of tobacco initiation (AOI) and age of first regular tobacco use (AOS). This study included 56,034 subjects (41 groups) spanning nine countries and evaluated five SNPs including rs1948, rs16969968, rs578776, rs588765, and rs684513. Each dataset was analyzed using a centrally generated script. Meta-analyses were conducted from summary statistics. AOS yielded significant associations with SNPs rs578776 (beta = 0.02, P = 0.004), rs1948 (beta = 0.023, P = 0.018), and rs684513 (beta = 0.032, P = 0.017), indicating protective effects. There were no significant associations for the AOI phenotype. Importantly, rs16969968, the most replicated signal in this region for nicotine dependence, cigarettes per day, and cotinine levels, was not associated with AOI (P = 0.59) or AOS (P = 0.92). These results provide important insight into the complexity of smoking behavior phenotypes, and suggest that association signals in the CHRNA5/A3/B4 gene cluster affecting early smoking behaviors may be different from those affecting the mature nicotine dependence phenotype.
PMCID: PMC3947535  PMID: 24186853
CHRNA5; CHRNA3; CHRNB4; meta-analysis; nicotine; smoke
18.  A genome-wide study of de novo deletions identifies a candidate locus for non-syndromic isolated cleft lip/palate risk 
BMC Genetics  2014;15:24.
Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios.
We identified a genome-wide significant 62 kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than among cleft palate (CP) and cleft lip (CL) cases.
This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.
PMCID: PMC3929298  PMID: 24528994
Oral clefts; DNA copy numbers; de novo deletions; Case-parent trios
19.  Evidence of Gene−Environment Interaction for Two Genes on Chromosome 4 and Environmental Tobacco Smoke in Controlling the Risk of Nonsyndromic Cleft Palate 
PLoS ONE  2014;9(2):e88088.
Nonsyndromic cleft palate (CP) is one of the most common human birth defects and both genetic and environmental risk factors contribute to its etiology. We conducted a genome-wide association study (GWAS) using 550 CP case-parent trios ascertained in an international consortium. Stratified analysis among trios with different ancestries was performed to test for GxE interactions with common maternal exposures using conditional logistic regression models. While no single nucleotide polymorphism (SNP) achieved genome-wide significance when considered alone, markers in SLC2A9 and the neighboring WDR1 on chromosome 4p16.1 gave suggestive evidence of gene-environment interaction with environmental tobacco smoke (ETS) among 259 Asian trios when the models included a term for GxE interaction. Multiple SNPs in these two genes were associated with increased risk of nonsyndromic CP if the mother was exposed to ETS during the peri-conceptual period (3 months prior to conception through the first trimester). When maternal ETS was considered, 15 of 135 SNPs mapping to SLC2A9 and 9 of 59 SNPs in WDR1 gave P values approaching genome-wide significance (10^−6
PMCID: PMC3916361  PMID: 24516586
Isolated, non-syndromic cleft lip with or without cleft palate (iCL±P) is a common human congenital malformation with a complex and heterogeneous etiology. Genes coding for fibroblast growth factors and their receptors (FGF/FGFR genes) are excellent candidate genes.
We tested single nucleotide polymorphic (SNP) markers in 10 FGF/FGFR genes (including FGFBP1, FGF2, FGF10, FGF18, FGFR1, FGFR2, FGF19, FGF4, FGF3, and FGF9) for genotypic effects, interactions with one another, and with common maternal environmental exposures in 221 Asian and 76 Maryland case-parent trios ascertained through a child with iCL±P.
Both FGFR1 and FGF19 yielded evidence of linkage and association in the transmission disequilibrium test, confirming previous evidence. Haplotypes of three SNPs in FGFR1 were nominally significant among Asian trios. Estimated ORs for individual SNPs and haplotypes of multiple markers in FGF19 ranged from 1.31 to 1.87. We also found suggestive evidence of maternal genotypic effects for markers in FGF2 and FGF10 among Asian trios. Separate tests for gene-environment (GxE) interaction between markers in FGFR2 and maternal smoking or multivitamin supplementation each yielded significant evidence of interaction. Tests of gene-gene (GxG) interaction using Cordell's method yielded significant evidence of interaction between SNPs in FGF9 and FGF18, which was confirmed in an independent sample of trios from an international consortium.
Our results suggest several genes in the FGF/FGFR family may influence risk to iCL±P through distinct biological mechanisms.
PMCID: PMC3387510  PMID: 22074045
FGF/FGFR; oral clefts; maternal effects; gene-environment interaction; gene-gene interaction
Carcinogenesis  2012;34(1):86-92.
The hypothesis that germ-line polymorphisms in DNA repair genes influence cancer risk has previously been tested primarily on a cancer site-specific basis. The purpose of this study was to test the hypothesis that DNA repair gene allelic variants contribute to globally elevated cancer risk by measuring associations with risk of all cancers that occurred within a population-based cohort. Within the CLUE II cohort study, established in 1989 in Washington County, MD, this study compared all 3619 cancer cases ascertained through 2007 with a sample of 2296 participants with no cancer. Associations were measured between 759 DNA repair gene single nucleotide polymorphisms (SNPs) and risk of all cancers. A SNP in O6-methylguanine-DNA methyltransferase, MGMT, (rs2296675) was significantly associated with overall cancer risk [per minor allele odds ratio (OR) 1.30, 95% confidence interval (CI) 1.19–1.43 and P-value: 4.1 × 10^−8]. The association between rs2296675 and cancer risk was stronger among those aged ≤54 years than among those aged ≥55 years at baseline (P-for-interaction = 0.021). ORs were in the direction of increased risk for all 15 categories of malignancies studied (P < 0.0001), ranging from 1.22 (P = 0.42) for ovarian cancer to 2.01 (P = 0.008) for urinary tract cancers; the smallest P-value was for breast cancer (OR 1.45, P = 0.0002). The results indicate that the minor allele of MGMT SNP rs2296675, a common genetic marker with 37% carriers, was significantly associated with increased risk of cancer across multiple tissues. Replication is needed to more definitively determine the scientific and public health significance of this observed association.
PMCID: PMC3534189  PMID: 23027618
Human genetics  2012;132(1):79-90.
Accelerated lung function decline is a key COPD phenotype; however, its genetic control remains largely unknown.
We performed a genome-wide association study using the Illumina Human660W-Quad v.1_A BeadChip. Generalized estimation equations were used to assess genetic contributions to lung function decline over a 5-year period in 4,048 European-American Lung Health Study participants with largely mild COPD. Genotype imputation was performed using reference HapMap II data. To validate regions meeting genome-wide significance, replication of top SNPs was attempted in independent cohorts. Three genes (TMEM26, ANK3 and FOXA1) within the regions of interest were selected for tissue expression studies using immunohistochemistry.
Measurements and Main Results
Two intergenic SNPs (rs10761570, rs7911302) on chromosome 10 and one SNP on chromosome 14 (rs177852) met genome-wide significance after Bonferroni correction. Further support for the chromosome 10 region was obtained by imputation, the most significantly associated imputed SNPs (rs10761571, rs7896712) being flanked by observed markers rs10761570 and rs7911302. Results were not replicated in four general population cohorts or a smaller cohort of subjects with moderate to severe COPD; however, we show novel expression of genes near regions of significantly associated SNPs, including TMEM26 and FOXA1 in airway epithelium and lung parenchyma, and ANK3 in alveolar macrophages. Levels of expression were associated with lung function and COPD status.
We identified two novel regions associated with lung function decline in mild COPD. Genes within these regions were expressed in relevant lung cells and their expression was related to airflow limitation, suggesting they may represent novel candidate genes for COPD susceptibility.
PMCID: PMC3536920  PMID: 22986903
COPD; lung function decline; GWAS; genome wide association; genes; polymorphisms
Frontiers in Genetics  2013;4:252.
Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic arrays interrogate hundreds of thousands or even millions of loci simultaneously, many causal yet undetected loci are believed to exist because the conditional power to achieve a genome-wide significance level can be low, in particular for markers with small effect sizes and low minor allele frequencies and in studies with modest sample sizes. However, the correlation between neighboring markers in the human genome due to linkage disequilibrium (LD), which results in correlated marker test statistics, can be incorporated into multi-marker hypothesis tests, thereby increasing power to detect association. Herein, we establish a theoretical benchmark by quantifying the maximum power achievable for multi-marker tests of association in case-control studies, achievable only when the causal marker is known. Using the fact that genotype correlations within an LD block translate into an asymptotically multivariate normal distribution for score test statistics, we develop a set of weights for the markers that maximize the non-centrality parameter, and assess the relative loss of power for other approaches. We find that the method of Conneely and Boehnke (2007) based on the maximum absolute test statistic observed in an LD block is a practical and powerful method in a variety of settings. We also explore the effect on power of prior biological or functional knowledge used to narrow down the locus of the causal marker, and conclude that this prior knowledge has to be very strong and specific for the power to approach the maximum achievable level, or even to beat the power observed for methods such as the one proposed by Conneely and Boehnke (2007).
PMCID: PMC3863805  PMID: 24379823
genome-wide association studies; linkage disequilibrium; multi-marker tests; multiplicity adjustment; single nucleotide polymorphisms
Genetic epidemiology  2010;34(6):10.1002/gepi.20512.
Admixture is a potential source of confounding in genetic association studies, so it becomes important to detect and estimate admixture in a sample of unrelated individuals. Populations of African descent in the US and the Caribbean share similar historical backgrounds but the distributions of African admixture may differ. We selected 416 ancestry informative markers (AIMs) to estimate and compare admixture proportions using STRUCTURE in 906 unrelated African Americans (AAs) and 294 Barbadians (ACs) from a study of asthma. This analysis showed AAs on average were 72.5% African, 19.6% European and 8% Asian, while ACs were 77.4% African, 15.9% European, and 6.7% Asian; these proportions were significantly different. A principal components analysis based on these AIMs yielded one primary eigenvector that explained 54.04% of the variation and captured a gradient from West African to European admixture. This principal component was highly correlated with African vs. European ancestry as estimated by STRUCTURE (r^2 = 0.992, r^2 = 0.912, respectively). To investigate other African contributions to African American and Barbadian admixture, we performed PCA on ~14,000 (14k) genome-wide SNPs in AAs, ACs, Yorubans, Luhya and Maasai African groups, and estimated genetic distances (FST). We found AAs and ACs were closest genetically (FST = 0.008), and both were closer to the Yorubans than to the East African populations (Luhya and Maasai). In our sample of individuals of African descent, ~400 well-defined AIMs were just as good for detecting substructure as ~14,000 random SNPs drawn from a genome-wide panel of markers.
PMCID: PMC3837693  PMID: 20717976
admixture; African Americans; African Caribbeans; African ancestry; genetic distance
Cancer epidemiology  2012;36(5):e288-e293.
A personal history of basal cell carcinoma (BCC) is associated with increased risk of other malignancies, but the reason is unknown. The hedgehog pathway is critical to the etiology of BCC, and is also believed to contribute to susceptibility to other cancers. This study tested the hypothesis that hedgehog pathway and pathway-related gene variants contribute to the increased risk of subsequent cancers among those with a history of BCC.
The study was nested within the ongoing CLUE II cohort study, established in 1989 in Washington County, Maryland, USA. The study consisted of a cancer-free control group (n=2,296) compared to three different groups of cancer cases ascertained through 2007, those diagnosed with: 1) Other (non-BCC) cancer only (n=2,349); 2) BCC only (n=534); and 3) BCC plus other cancer (n=446). The frequencies of variant alleles were compared among these four groups for 20 single nucleotide polymorphisms (SNPs) in 6 hedgehog pathway genes (SHH, IHH, PTCH2, SMO, GLI1, SUFU), and also 22 SNPs in VDR and 8 SNPs in FAS, which have cross-talk with the hedgehog pathway.
Comparing those with both BCC and other cancer versus those with no cancer, no significant associations were observed for any of the hedgehog pathway SNPs, or for the FAS SNPs. One VDR SNP was nominally significantly associated with the BCC cancer-prone phenotype, rs11574085 [per minor allele odds ratio (OR) 1.38, 95% confidence interval (CI) 1.05–1.82; p-value=0.02].
The hedgehog pathway gene SNPs studied, along with the VDR and FAS SNPs studied, are not strongly associated with the BCC cancer-prone phenotype.
PMCID: PMC3438291  PMID: 22677152
skin cancer; genetics; polymorphisms; hedgehog; vitamin D receptor; fas
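As a rough cross-check of the rs11574085 result reported above (OR 1.38, 95% CI 1.05–1.82, p-value=0.02), the standard error of ln(OR) can be recovered from the interval width and used for a Wald test. This is only a sketch: it assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state, and the function name p_from_or_ci is hypothetical.

```python
import math

def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z, and two-sided Wald p-value from an OR and its 95% CI.

    Assumes the CI was built as exp(ln(OR) +/- z_crit * SE); this is an
    assumption for illustration, not something stated in the abstract.
    """
    log_or = math.log(or_est)
    # A symmetric log-scale interval has width 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the abstract for rs11574085.
se, z, p = p_from_or_ci(1.38, 1.05, 1.82)
print(f"SE(ln OR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the recomputed p-value should land close to the reported 0.02; small differences are expected because the published OR and interval bounds are rounded.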
