A single mutation can alter cellular and global homeostatic mechanisms and give rise to multiple clinical diseases. We hypothesized that these disease mechanisms could be identified using low minor allele frequency (MAF<0.1) non-synonymous SNPs (nsSNPs) associated with “mechanistic phenotypes”, composed of collections of related diagnoses. We studied two mechanistic phenotypes: (1) thrombosis, evaluated in a population of 1,655 African Americans; and (2) four groupings of cancer diagnoses, evaluated in 3,009 white European Americans. We tested associations between nsSNPs represented on GWAS platforms and mechanistic phenotypes ascertained from electronic medical records (EMRs), and sought enrichment in functional ontologies across the top-ranked associations. We used a two-step analytic approach whereby nsSNPs were first sorted by the strength of their association with a phenotype. We tested associations using two reverse genetic models and standard additive and recessive models. In the second step, we employed a hypothesis-free ontological enrichment analysis using the sorted nsSNPs to identify functional mechanisms underlying the diagnoses comprising the mechanistic phenotypes. The thrombosis phenotype was solely associated with ontologies related to blood coagulation (Fisher's p = 0.0001, FDR p = 0.03), driven by the F5, P2RY12 and F2RL2 genes. For the cancer phenotypes, the reverse genetic models were enriched in DNA repair functions (p = 2×10⁻⁵, FDR p = 0.03) (POLG/FANCI, SLX4/FANCP, XRCC1, BRCA1, FANCA, CHD1L) while the additive model showed enrichment related to chromatid segregation (p = 4×10⁻⁶, FDR p = 0.005) (KIF25, PINX1). We were able to replicate nsSNP associations for POLG/FANCI, BRCA1, FANCA and CHD1L in independent data sets. Mechanism-oriented phenotyping using collections of EMR-derived diagnoses can elucidate fundamental disease mechanisms.
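The two-step approach described above (rank by association strength, then test the top of the ranking for ontology enrichment) can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test computed from the hypergeometric distribution; the p-values, the placeholder genes GENE_A to GENE_G, the ontology membership, and the cutoff of four top-ranked genes are all hypothetical and are not taken from the study.

```python
from math import comb


def fisher_enrichment_p(k, n_top, n_annotated, n_total):
    """One-sided Fisher's exact p-value, P(overlap >= k), computed from
    the hypergeometric distribution."""
    denom = comb(n_total, n_top)
    upper = min(n_top, n_annotated)
    tail = sum(
        comb(n_annotated, i) * comb(n_total - n_annotated, n_top - i)
        for i in range(k, upper + 1)
    )
    return tail / denom


# Hypothetical per-gene association p-values (illustration only).
assoc_p = {
    "F5": 1e-4, "P2RY12": 3e-4, "F2RL2": 8e-4, "GENE_A": 0.02,
    "GENE_B": 0.20, "GENE_C": 0.35, "GENE_D": 0.50, "GENE_E": 0.60,
    "GENE_F": 0.80, "GENE_G": 0.90,
}
# Hypothetical ontology annotation (illustration only).
coagulation = {"F5", "P2RY12", "F2RL2", "GENE_F"}

# Step 1: sort genes by strength of association (smallest p-value first).
ranked = sorted(assoc_p, key=assoc_p.get)
top = ranked[:4]  # hypothetical cutoff for "top-ranked"

# Step 2: test whether the ontology is over-represented among the top genes.
overlap = len(coagulation.intersection(top))
p_enrich = fisher_enrichment_p(overlap, len(top), len(coagulation), len(assoc_p))
print(f"overlap={overlap}, enrichment p={p_enrich:.4f}")
```

In practice the enrichment p-values would be computed for every ontology and then corrected for multiple testing, as the FDR values reported above indicate.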
Rationale: Analysis of the age of onset in heritable pulmonary arterial hypertension (HPAH) has led to the hypothesis that genetic anticipation causes younger age of onset and death in subsequent generations. With accrual of pedigree data over multiple decades, we retested this hypothesis using analyses that eliminate the truncation of data that exists with shorter duration of follow-up.
Objectives: To analyze the pedigrees of families with mutations in bone morphogenetic protein receptor type 2 (BMPR2), afflicted in two or more generations with HPAH, eliminating time truncation bias by including families for whom we have at least 57 years of data.
Methods: We analyzed 355 individuals with BMPR2 mutations from 53 families in the Vanderbilt Pulmonary Hypertension Registry. We compared age at diagnosis or death in affected individuals (n = 249) by generation within families with multigenerational disease. We performed linear mixed effects models and we limited time-truncation bias by restricting date of birth to before 1955. This allowed for 57 years of follow-up (1955–2012) for mutation carriers to develop disease. We also conducted Kaplan-Meier analysis to include currently unaffected mutation carriers (n = 106).
Measurements and Main Results: Differences in age at diagnosis by generation were found in a biased analysis that included all birth years to the present, but this finding was eliminated when the 57-year observation limit was imposed. By Kaplan-Meier analysis, inclusion of currently unaffected mutation carriers strengthens the observation that bias of ascertainment exists when recent generations are included.
Conclusions: Genetic anticipation is likely an artifact of incomplete time of observation of kindreds with HPAH due to BMPR2 mutations.
hereditary; pulmonary hypertension; genetics
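The analysis described in this abstract combines a birth-year restriction with a Kaplan-Meier analysis that keeps currently unaffected carriers as censored observations. The sketch below is a minimal product-limit estimator under those two ideas; the carrier records are hypothetical, and the tuple layout (birth year, age at diagnosis or last follow-up, affected flag) is an assumption made for illustration rather than the registry's format.

```python
def kaplan_meier(ages, affected):
    """Product-limit estimate of disease-free survival.

    ages: age at diagnosis (affected) or at last follow-up (unaffected).
    affected: True for an event, False for a censored observation.
    Returns a list of (age, survival) pairs at each distinct event age.
    """
    data = sorted(zip(ages, affected))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        age = data[i][0]
        events = censored = 0
        # Group all observations that share the same age.
        while i < len(data) and data[i][0] == age:
            if data[i][1]:
                events += 1
            else:
                censored += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((age, survival))
        at_risk -= events + censored
    return curve


# Hypothetical mutation carriers: (birth_year, age, affected).
carriers = [
    (1930, 30, True), (1940, 35, False), (1942, 35, True),
    (1948, 60, True), (1950, 40, True), (1951, 50, False),
    (1980, 22, True), (1990, 20, False),
]

# Restrict to births before 1955 so that every carrier has had the
# same long window in which to develop disease.
restricted = [c for c in carriers if c[0] < 1955]
curve = kaplan_meier([c[1] for c in restricted], [c[2] for c in restricted])
for age, s in curve:
    print(f"age {age}: disease-free fraction {s:.3f}")
```

Without the birth-year filter, carriers from recent generations can only appear as early-onset cases or as short censored records, which is the truncation the abstract describes.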
To investigate the association of maternal vitamin D with maternal asthma and with infant respiratory infection severity.
Cross-sectional analyses of 340 mother-infant dyads enrolled September-May 2004-2008 during an infant viral respiratory infection. Maternal vitamin D levels were determined from enrollment blood specimens. At enrollment, we determined self-reported maternal asthma and infant respiratory infection severity using a bronchiolitis score. We assessed the association of maternal vitamin D levels and maternal asthma and infant bronchiolitis score in race-stratified multivariable regression models.
The cohort was 70% White and 19% African-American; 21% had asthma. Overall, the median maternal vitamin D level was 20 ng/ml (interquartile range 14–28). Among White women, a 14 ng/ml increase in vitamin D was associated with decreased odds of asthma (AOR 0.54, 95% CI 0.33-0.86). Maternal vitamin D was not associated with infant bronchiolitis score.
Higher maternal vitamin D levels were associated with decreased odds of asthma.
asthma; infant viral respiratory infections; vitamin D
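The adjusted odds ratio above is reported per 14 ng/ml increase in vitamin D. As a worked illustration, an odds ratio can be re-expressed per a different increment by scaling it on the log-odds scale. This sketch assumes that vitamin D entered the regression as a single linear term, which the abstract does not state; the function name is hypothetical.

```python
from math import exp, log


def rescale_odds_ratio(odds_ratio, from_units, to_units):
    """Re-express an odds ratio reported per `from_units` as an odds
    ratio per `to_units`, assuming a linear effect on the log-odds."""
    return exp(log(odds_ratio) * to_units / from_units)


# Reported estimate and confidence limits per 14 ng/ml (from the abstract).
reported = {"AOR": 0.54, "lower": 0.33, "upper": 0.86}

# The same effect per 1 ng/ml, under the linearity assumption.
per_1 = {name: rescale_odds_ratio(v, 14, 1) for name, v in reported.items()}
for name, value in per_1.items():
    print(f"{name} per 1 ng/ml: {value:.3f}")
```

The same scaling applies to the confidence limits because they are symmetric on the log-odds scale.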
Short duration and poor quality of sleep have been associated with increased risks of obesity, cardiovascular disease, diabetes mellitus, and total mortality. However, few studies have investigated their associations with risk of colorectal neoplasia.
In a screening colonoscopy-based case-control study, the Pittsburgh Sleep Quality Index (PSQI) was administered to 1,240 study participants prior to their colonoscopy.
Three hundred thirty-eight (27.3%) of the participants were diagnosed with incident colorectal adenomas. Although there was no appreciable difference in the overall PSQI score between cases and adenoma-free controls (5.32 vs. 5.11; p=0.37), we found a statistically significant association of colorectal adenoma with PSQI component 3, which corresponds to sleep duration (p=0.02). Cases were more likely to average less than 6 hours of sleep per night (28.9% vs. 22.1% in controls, p=0.01). In multivariate regression analysis adjusted for age, gender, race, smoking, family history of colorectal cancer, and waist-to-hip ratio, individuals averaging less than 6 hours per night had an almost 50% increase in risk of colorectal adenomas (OR=1.47, CI=1.05-2.06, p for trend=0.02) as compared with individuals sleeping at least 7 hours per night. Cases were also more likely to report having been diagnosed with sleep apnea (9.8% vs. 6.5%, p=0.05) and more likely to have worked alternate shifts (54.0% vs. 46.1%, p=0.01), although these differences were not significant in multivariate models.
Shorter duration of sleep significantly increases risk of colorectal adenomas. Our results suggest sleep duration as a novel risk factor for colorectal neoplasia.
sleep duration; colorectal adenoma; Pittsburgh Sleep Quality Index
Because obstructive sleep apnea (OSA) is associated with increased levels of inflammatory cytokines, we examined the relationship between OSA and polymorphisms for interleukin-6 (IL-6).
Six single nucleotide polymorphisms (SNPs) within IL-6 were genotyped in 259 African-Americans from the Cleveland Family Study, with replication conducted in the Cardiovascular Health Study (n=124). OSA was dichotomized as present (apnea hypopnea index [AHI] >15 or on treatment) vs. absent (AHI <5). Logistic regression was conducted, adjusting for age and sex in models with and without body mass index (BMI).
SNP IL6-6021 was associated with a decreased risk of OSA after adjusting for BMI (Odds Ratio for T allele 0.24; 95%CI [0.09–0.67]; p=0.006; q=0.07) under an additive model. This same allele was associated with increased BMI. The results from the replication sample were consistent in direction though not statistically significant (p=0.23). The SNPs were studied in European-Americans, although the minor allele frequency in IL6-6021 was too low (4%) for meaningful comparisons.
A synonymous SNP within the IL-6 coding region was protective against OSA in African-Americans, with qualitatively similar findings observed in another cohort. This suggests that variants in IL-6 may influence the risk of OSA in a pathway that is not explained by obesity.
Rationale: Obstructive sleep apnea (OSA) is hypothesized to be influenced by genes within pathways involved with obesity, craniofacial development, inflammation, and ventilatory control.
Objectives: We conducted the first candidate gene study of OSA using family data from European Americans and African Americans, selecting biologically plausible genes from within these pathways.
Methods: A total of 1,080 single nucleotide polymorphisms (SNPs) were genotyped in 729 African Americans and 505 SNPs were genotyped in 694 European Americans. With SNPs coded additively, association testing on the apnea-hypopnea index (AHI) as a continuous trait and on OSA as a dichotomous trait (AHI ≥15) was conducted using methods that account for familial correlations, in models adjusted for age, age-squared, and sex, with and without body mass index.
Measurements and Main Results: In European Americans, variants within C-reactive protein (CRP) and glial cell line–derived neurotrophic factor (GDNF) were associated with AHI (CRP: β = 4.6; SE = 1.1; P = 0.0000402) (GDNF: β = 4.3; SE = 1; P = 0.0000201) and with the dichotomous OSA trait (CRP: odds ratio = 2.4; 95% confidence interval, 1.5–3.9; P = 0.000170) (GDNF: odds ratio = 2; 95% confidence interval, 1.4–2.89; P = 0.0000433). In African Americans, rs9526240 within serotonin receptor 2a (HTR2A: odds ratio = 2.1; 95% confidence interval, 1.5–2.9; P = 0.00005233) was associated with OSA.
Conclusions: This candidate gene analysis identified the potential role of genes operating through intermediate disease pathways to influence sleep apnea phenotypes, providing a framework for focusing future replication studies.
sleep apnea; body mass index; genetics; candidate gene study
Although the importance of selecting cases and controls from the same population has been recognized for decades, the recent advent of genome-wide association studies has heightened awareness of this issue. Because these studies typically deal with large samples, small differences in allele frequencies between cases and controls can easily reach statistical significance. When, unbeknownst to a researcher, cases and controls have different substructures, the number of false-positive findings is inflated. There have been three recent developments of purely statistical approaches to assessing the ancestral comparability of case and control samples: genomic control, structured association, and multivariate reduction analyses. The widespread use of high-throughput technology has allowed the quick and accurate genotyping of the large number of markers required by these methods.
Group 13 dealt with four population stratification issues: single-nucleotide polymorphism marker selection, association testing, non-standard methods, and linkage disequilibrium calculations in stratified or mixed ethnicity samples. We demonstrated that there are continuous axes of ethnic variation in both datasets of Genetic Analysis Workshop 16. Furthermore, ignoring this structure created p-value inflation for a variety of phenotypes. Principal components (or multidimensional scaling coordinates) can control this inflation when included as covariates in a logistic regression. One can weight for local ancestry estimation and allow the use of related individuals. Problems arise in the presence of extremely high association or unusually strong linkage disequilibrium (e.g., in chromosomal inversions). Our group also reported a method for performing an association test controlling for substructure when genome-wide markers are not available to explicitly compute stratification.
genetic association; genome-wide association study; principal components; multidimensional scaling; ethnic substructure
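Genomic control, the first of the three approaches named above, can be sketched in a few lines. This is a minimal illustration that assumes 1-degree-of-freedom chi-square association statistics and uses the median-based inflation factor; the test statistics are hypothetical, and the function names are not from any workshop contribution.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic inflation factor: observed median statistic divided by
    the median expected under the null hypothesis."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Deflate every statistic by the inflation factor. The factor is
    floored at 1 so that statistics are never inflated."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [x / lam for x in chi2_stats]


# Hypothetical association statistics from a stratified sample.
observed = [0.10, 0.30, 0.60, 0.91, 1.50, 2.40, 5.00]

lam = inflation_factor(observed)
corrected = genomic_control(observed)
print(f"lambda = {lam:.2f}")
print("corrected:", [round(x, 2) for x in corrected])
```

A factor near 1 suggests little inflation. Because the correction rescales all markers uniformly, it cannot address structure that varies along the genome, which is one motivation for the principal-components covariates discussed above.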
The ability to measure 25-hydroxyvitamin D (25OHD) levels from blood spot cards can simplify sample collection versus samples obtained by venipuncture, particularly in populations in whom it is difficult to draw blood. We sought to validate the use of blood spot samples for the measurement of 25OHD compared to serum or whole blood samples and correlate the measured levels with intake estimated from dietary recall.
Utilizing 109 biological mothers of infants enrolled in the Tennessee Children's Respiratory Initiative cohort, we measured 25OHD levels through highly selective liquid chromatography–tandem mass spectrometry on samples from blood spot cards, serum, and whole blood collected at enrollment. Dietary questionnaires (n = 65) were used to assess 25OHD intake by dietary recall. Sample collection measures were assessed for agreement and 25OHD levels for association with dietary 25OHD intake.
The mean absolute differences in 25OHD levels measured between whole blood and blood spot (n = 50 pairs) and between serum and blood spot (n = 20) were 3.2 (95% CI: 1.6, 4.8) ng/mL and 1.5 (95% CI: −0.5, 3.4) ng/mL, respectively. Intake by dietary recall was marginally associated with 25OHD levels after adjustment for current smoking and race in linear regression.
25OHD levels determined by mass spectrometry from blood spot cards, serum and whole blood show relatively good agreement, although 25OHD levels are slightly lower when measured by blood spot cards. Blood spot samples are a less invasive means of obtaining 25OHD measurements, particularly in large population-based samples, or among children when venipuncture may decrease study participation.
Metabolic syndrome, by definition, is the manifestation of multiple, correlated metabolic impairments. It is known to have both strong environmental and genetic contributions. However, isolating genetic variants predisposing to such a complex trait has limitations. Using pedigree data, when available, may well lead to increased ability to detect variants associated with such complex traits. The ability to incorporate multiple correlated traits into a joint analysis may also allow increased detection of associated genes. Therefore, to demonstrate the utility of both univariate and multivariate family-based association analysis and to identify possible genetic variants associated with metabolic syndrome, we performed a scan of the Affymetrix 50 k Human Gene Panel data using 1) each of the traits comprising metabolic syndrome: triglycerides, high-density lipoprotein, systolic blood pressure, diastolic blood pressure, blood glucose, and body mass index, and 2) a composite trait including all of the above, jointly. Two single-nucleotide polymorphisms within the cholesterol ester transfer protein (CETP) gene remained significant even after correcting for multiple testing in both the univariate (p < 5 × 10⁻⁷) and multivariate (p < 5 × 10⁻⁹) association analysis. Three genes met significance for multiple traits after correction for multiple testing in the univariate analysis, while five genes remained significant in the multivariate association. We conclude that while both univariate and multivariate family-based association analysis can identify genes of interest, our multivariate approach is less affected by multiple testing correction and yields more significant results.
To account for population stratification in association studies, principal-components analysis is often performed on single-nucleotide polymorphisms (SNPs) across the genome. Here, we use Framingham Heart Study (FHS) Genetic Analysis Workshop 16 data to compare the performance of local ancestry adjustment for population stratification based on principal components (PCs) estimated from SNPs in a local chromosomal region with global ancestry adjustment based on PCs estimated from genome-wide SNPs.
Standardized height residuals from unrelated adults from the FHS Offspring Cohort were averaged from longitudinal data. PCs of SNP genotype data were calculated to represent individual's ancestry either 1) globally using all SNPs across the genome or 2) locally using SNPs in adjacent 20-Mbp regions within each chromosome. We assessed the extent to which there were differences in association studies of height depending on whether PCs for global, local, or both global and local ancestry were included as covariates.
The correlations between local and global PCs were low (r < 0.12), suggesting variability between local and global ancestry estimates. Genome-wide association tests without any ancestry adjustment demonstrated an inflated type I error rate that decreased with adjustment for local ancestry, global ancestry, or both. A known spurious association was replicated for SNPs within the lactase gene, and this false-positive association was abolished by adjustment with local or global ancestry PCs.
Population stratification is a potential source of bias in this seemingly homogeneous FHS population. However, local and global PCs derived from SNPs appear to provide adequate information about ancestry.
Acoustic pharyngometry represents a simple, quick, non-invasive method for measuring upper airway dimensions which are predictive of sleep apnea risk. In this study we sought to assess the genetic basis for upper airway size as obtained by pharyngometry.
Participants over age 14 y in the Cleveland Family Study underwent three acoustic pharyngometry measurements. Variance component models adjusted for age and sex were used to estimate heritability of pharyngometry-derived airway measures.
A total of 568 of 655 subjects (87%) provided quality pharyngometry curves. Although African-Americans tended to have narrower airways compared to Caucasians, heritability patterns were similar in these two groups. Minimum cross-sectional area had a heritability of 0.34 (p=0.004) in Caucasians and 0.39 (p<0.001) in African-Americans, suggesting that 30-40% of the total variance in this measure is explained by shared familial factors. Estimates were unchanged after adjustment for body mass index or neck circumference. In contrast, oropharyngeal length did not have significant heritability in either ethnic group.
The minimum cross-sectional area in the oropharynx is a highly heritable trait suggesting the presence of an underlying genetic basis. These findings demonstrate the potential utility of acoustic pharyngometry in dissecting the genetic basis of sleep apnea.
sleep apnea; upper airway; oropharynx; pharyngometry; genetic epidemiology; heritability
Obesity and obstructive sleep apnea each have a substantial genetic basis and commonly co-exist in individuals. The degree to which the genetic underpinnings for these disorders overlap has not been previously quantified.
A total of 1802 individuals from 310 families in the Cleveland Family Study underwent home sleep studies as well as standardized assessment of body mass index and circumferences at the waist, hip, and neck. In 713 participants with laboratory sleep studies, fasting blood samples were assayed for leptin, adiponectin, and resistin. Variance component models were used to estimate heritability and genetic correlations.
The heritability of the apnea hypopnea index was 0.37 ± 0.04 and 0.33 ± 0.07 for home and laboratory sleep studies, respectively. The genetic correlations between apnea hypopnea index and anthropometric adiposity measures ranged from 0.57 to 0.61, suggesting obesity can explain nearly 40% of the genetic variance in sleep apnea. The magnitude of the genetic correlations between apnea severity and adipokine levels was substantially less than those with anthropometric measures, ranging from 0.11 to 0.46. After adjusting for body mass index, no significant genetic correlation with apnea severity was observed for any of the other adiposity measures.
Substantial but not complete overlap in genetic bases exists between sleep apnea and anthropometric indices of adiposity, and this overlap accounts for more than one third of the genetic variance in apnea severity. These findings suggest that genetic polymorphisms exist that importantly influence sleep apnea susceptibility through both obesity-dependent and obesity-independent pathways.
obesity; sleep apnea; genetics; heritability; genetic correlation
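The step from genetic correlations of 0.57 to 0.61 to "nearly 40%" of genetic variance follows from squaring the correlation. The short sketch below reproduces that arithmetic; it assumes the usual interpretation of a squared genetic correlation as the proportion of genetic variance shared between two traits, and the function name is hypothetical.

```python
def shared_genetic_variance(genetic_correlation):
    """Proportion of genetic variance shared between two traits,
    taken as the squared genetic correlation."""
    return genetic_correlation ** 2


# Range of genetic correlations reported in the abstract.
low = shared_genetic_variance(0.57)
high = shared_genetic_variance(0.61)
print(f"shared genetic variance: {low:.1%} to {high:.1%}")
```

The upper end of the range, about 37%, is the figure that corresponds to "nearly 40%" and "more than one third" in the text.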
Stature (adult body height) and body mass index (BMI) have a strong genetic component explaining observed variation in human populations; however, identifying those genetic components has been extremely challenging. It seems obvious that sample size is a critical determinant for successful identification of quantitative trait loci (QTL) that underlie the genetic architecture of these polygenic traits. The inherent shared environment and known genetic relationships in family studies provide clear advantages for gene mapping over studies utilizing unrelated individuals. To these ends, we combined the genotype and phenotype data from four previously performed family-based genome-wide screens resulting in a sample of 9,371 individuals from 3,032 African-American and European-American families and performed variance-components linkage analyses for stature and BMI. To our knowledge, this study represents the single largest family-based genome-wide linkage scan published for stature and BMI to date. This large study sample allowed us to pursue population- and sex-specific analyses as well. For stature, we found evidence for linkage in previously reported loci on 11q23, 12q12, 15q25 and 18q23, as well as 15q26 and 19q13, which have not been linked to stature previously. For BMI, we found evidence for two loci: one on 7q35 and another on 11q22, both of which have been previously linked to BMI in multiple populations. Our results show both the benefit of (1) combining data to maximize the sample size and (2) minimizing heterogeneity by analyzing subgroups where within-group variation can be reduced and suggest that the latter may be a more successful approach in genetic mapping.
body height; body mass index; linkage mapping; quantitative trait loci
Non-parametric linkage methods have had limited success in detecting gene by gene interactions. Using affected sibling-pair (ASP) data from all replicates of the simulated data from Problem 3, we assessed the statistical power of three approaches to identify the gene × gene interaction between two loci on different chromosomes. The first method conditioned on linkage at the primary disease susceptibility locus (DR), to find linkage to a simulated effect modifier at Locus A with a mean allele sharing test. The second approach used a regression-based mean test to identify either the presence of interaction between the two loci or linkage to the A locus in the presence of linkage to DR. The third method applied a conditional logistic model designed to test for the presence of interacting loci. The first approach had decreased power over an unconditional linkage analysis, supporting the idea that gene × gene interaction cannot be detected with ASP data. The regression-based mean test and the conditional logistic model had the lowest power to detect gene × gene interaction, possibly because of the complex recoding of the tri-allelic DR locus for use as a covariate. We conclude that the ASP approaches tested have low power to successfully identify the interaction between the DR and A loci despite the large sample size, which may be due to the low prevalence of the high-risk DR genotypes. Additionally, the lack of data on discordant sibships may have decreased the power to identify gene × gene interactions.
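The mean allele sharing test used in the first approach can be sketched as follows. This is a minimal version that assumes identity-by-descent (IBD) sharing is known without ambiguity for every affected sibling pair, so that the null proportion of alleles shared has mean 1/2 and variance 1/8. The sharing counts are hypothetical, and the conditioning on linkage at DR described above is not shown.

```python
from math import erfc, sqrt


def asp_mean_test(ibd_counts):
    """Mean test for affected sibling pairs.

    ibd_counts: number of alleles shared IBD (0, 1, or 2) per pair.
    Returns (mean sharing proportion, z statistic, one-sided p-value).
    """
    n = len(ibd_counts)
    mean_prop = sum(ibd_counts) / (2 * n)
    # Null: sharing proportion per pair has mean 1/2 and variance 1/8.
    z = (mean_prop - 0.5) / sqrt(1 / (8 * n))
    # Upper-tail p-value, since linkage implies excess sharing.
    p = 0.5 * erfc(z / sqrt(2))
    return mean_prop, z, p


# Hypothetical data: 100 pairs sharing 0, 1, or 2 alleles IBD.
ibd = [0] * 20 + [1] * 45 + [2] * 35
mean_prop, z, p = asp_mean_test(ibd)
print(f"mean sharing = {mean_prop:.3f}, z = {z:.2f}, one-sided p = {p:.4f}")
```

A conditional version would apply the same statistic to the subset of pairs selected for their sharing at the primary locus, which reduces the number of pairs and is one plausible reason for a loss of power.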
Rheumatoid arthritis is a complex disease that appears to involve multiple genetic and environmental factors. Using the Genetic Analysis Workshop 15 simulated rheumatoid arthritis data and the structural equation modeling framework, we tested hypothesized "causal" rheumatoid arthritis model(s) by employing a novel latent gene construct approach that models individual genes as latent variables defined by multiple dense and non-dense single-nucleotide polymorphisms (SNPs). Our approach produced valid latent gene constructs, particularly with dense SNPs, which when coupled with other factors involved in rheumatoid arthritis, were able to generate good fitting models by certain goodness of fit indices. We observed that Genes F, C, and DR, sex, and smoking were significant predictors of rheumatoid arthritis but Genes A and E were not, which was generally, but not entirely, consistent with how the data were simulated. Our approach holds promise in unravelling complex diseases and improves upon current "one SNP (haplotype)-at-a-time" regression approaches by decreasing the number of statistical tests while minimizing problems with multicollinearity and haplotype estimation algorithm error. Furthermore, when genes are modeled as latent constructs simultaneously with other key cofactors, the approach provides enhanced control of confounding that should lead to less biased effect estimates among genes as well as between gene(s) and the complex disease. However, further study is needed to quantify bias, evaluate fit index disparity, and resolve multiplicative latent gene interactions. Moreover, because some a priori biological information is needed to form an initial substantive model, our approach may be most appropriate for candidate gene SNP panel applications.
In this analysis we applied a regression-based transmission disequilibrium test to the binary trait (presence or absence of Kofendred Personality Disorder) in the Genetic Analysis Workshop 14 (GAW14) simulated dataset and determined the power and type I error rate of the method at varying map densities and sample sizes. To conduct this transmission disequilibrium test, the logit transformation was applied to a binary outcome and regressed on an indicator variable for the transmitted allele from informative matings. All 100 replicates from chromosomes 1, 3, 5, and 9 for the Aipotu and the combined Aipotu, Karangar, and Danacaa populations were used at densities of 3, 1, and 0.3 cM. Power and type I error were determined by the number of replicates significant at the 0.05 level.
The maximum power to detect linkage and association with the Aipotu population was 93% for chromosome 3 using a 0.3-cM map. For chromosomes 1, 5, and 9 the power was less than 10% at the 3-cM scan and less than 22% for the 0.3-cM map. With the larger sample size, power increased to 38% for chromosome 1, 100% for chromosome 3, 31% for chromosome 5, and 23% for chromosome 9. Type I error was approximately 7%.
The power of this method is highly dependent on the amount of information in a region. This study suggests that single-point methods are not particularly effective in narrowing a fine-mapping region, particularly when using single-nucleotide polymorphism data and when linkage disequilibrium in the region is variable.
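Estimating power as the fraction of replicates significant at the 0.05 level can be sketched as below. For simplicity the sketch substitutes the classical McNemar-type transmission disequilibrium test for the regression-based test used in the study, so it illustrates the replicate-counting logic rather than the authors' method. The transmission counts are hypothetical.

```python
from math import erfc, sqrt


def tdt(transmitted, untransmitted):
    """Classical TDT from heterozygous parents: counts of times the
    candidate allele was transmitted vs. not transmitted.
    Returns (chi-square statistic with 1 df, p-value)."""
    total = transmitted + untransmitted
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper-tail probability of a 1-df chi-square variable.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


def empirical_rate(p_values, alpha=0.05):
    """Fraction of replicates significant at level alpha. This is power
    at a true disease locus and type I error at a null locus."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical (transmitted, untransmitted) counts, one per replicate.
replicates = [(60, 40), (55, 45), (70, 30), (50, 50)]
p_values = [tdt(b, c)[1] for b, c in replicates]
power = empirical_rate(p_values)
print(f"estimated power = {power:.2f}")
```

With 100 replicates, as in the simulated dataset, each significant replicate contributes one percentage point to the estimated power or type I error.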
Although obstructive sleep apnea (OSA) is known to have a strong familial basis, no genetic polymorphisms influencing apnea risk have been identified in cross-cohort analyses. We utilized the National Heart, Lung, and Blood Institute (NHLBI) Candidate Gene Association Resource (CARe) to identify sleep apnea susceptibility loci. Using a panel of 46,449 polymorphisms from roughly 2,100 candidate genes on a customized Illumina iSelect chip, we tested for association with the apnea hypopnea index (AHI) as well as moderate to severe OSA (AHI≥15) in 3,551 participants of the Cleveland Family Study and two cohorts participating in the Sleep Heart Health Study.
Among 647 African-Americans, rs11126184 in the pleckstrin (PLEK) gene was associated with OSA while rs7030789 in the lysophosphatidic acid receptor 1 (LPAR1) gene was associated with AHI using a chip-wide significance threshold of p-value<2×10⁻⁶. Among 2,904 individuals of European ancestry, rs1409986 in the prostaglandin E2 receptor (PTGER3) gene was significantly associated with OSA. Consistency of effects for rs7030789 in LPAR1 and rs1409986 in PTGER3 on apnea phenotypes was observed in independent clinic-based cohorts.
Novel genetic loci for apnea phenotypes were identified through the use of customized gene chips and meta-analyses of cohort data with replication in clinic-based samples. The identified SNPs all lie in genes associated with inflammation suggesting inflammation may play a role in OSA pathogenesis.