Results 1-14 (14)

1.  Association of 25-hydroxyvitamin D with Blood Pressure in Predominantly 25-hydroxyvitamin D Deficient Hispanic and African Americans 
American journal of hypertension  2009;22(8):867-870.
Several observational studies have recently suggested an inverse association of circulating levels of vitamin D with blood pressure. These findings have been based mainly on Caucasian populations; whether this association also exists among Hispanic and African Americans has yet to be definitively determined. This study investigates the association of 25-hydroxyvitamin D (25[OH]D) with blood pressure in Hispanic and African Americans.
The data source for this study is the Insulin Resistance Atherosclerosis Family Study (IRASFS), which consists of Hispanic- and African-American families from three U.S. recruitment centers (n=1334). A variance components model was used to analyze the association of plasma 25[OH]D levels with blood pressure.
An inverse association was found between 25[OH]D and both systolic (β for 10 ng/mL difference= −2.05; p<0.01) and diastolic (β for 10 ng/mL difference= −1.35; p<0.001) blood pressure in all populations combined, after adjusting for age, sex, ethnicity and season of blood draw. Further adjustment for body mass index (BMI) weakened this association (β for 10 ng/mL difference= −0.94; p=0.14 and β for 10 ng/mL difference = −0.64; p=0.09, respectively).
25[OH]D levels are significantly inversely associated with blood pressure in Hispanic and African Americans from the IRASFS. However, this association was not significant after adjustment for BMI. Further research is needed to determine the role of BMI in this association. Large, well-designed prospective studies of the effect of vitamin D supplementation on blood pressure may be warranted.
PMCID: PMC2865679  PMID: 19444222
Vitamin D; 25-hydroxyvitamin D; blood pressure; hypertension; race; ethnic groups; Hispanic; African American
2.  The associations between a polygenic score, reproductive and menstrual risk factors and breast cancer risk 
We evaluated whether 13 single nucleotide polymorphisms (SNPs) identified in genome-wide association studies interact with one another and with reproductive and menstrual risk factors in association with breast cancer risk. DNA samples and information on parity, breastfeeding, age at menarche, age at first birth, and age at menopause were collected through structured interviews from 1484 breast cancer cases and 1307 controls who participated in a population-based case-control study conducted in three U.S. states. A polygenic score was created as the sum of risk allele copies multiplied by the corresponding log odds estimate. Logistic regression was used to test associations between SNPs, the score, reproductive and menstrual factors and breast cancer risk. Nonlinearity of the score was assessed by the inclusion of a quadratic term for polygenic score. Interactions between the aforementioned variables were tested by including a cross-product term in models. We confirmed associations between rs13387042 (2q35), rs4973768 (SLC4A7), rs10941679 (5p12), rs2981582 (FGFR2), rs3817198 (LSP1), rs3803662 (TOX3) and rs6504950 (STXBP4) with breast cancer. Women in the score’s highest quintile had 2.2-fold increased risk when compared to women in the lowest quintile (95% confidence interval:1.67–2.88). The quadratic polygenic score term was not significant in the model (p=0.85), suggesting established breast cancer loci are not associated with increased risk more than the sum of risk alleles. Modifications of menstrual and reproductive risk factors associations with breast cancer risk by polygenic score were not observed. Our results suggest interactions between breast cancer susceptibility loci and reproductive factors are not strong contributors to breast cancer risk.
PMCID: PMC3799826  PMID: 23893088
Epidemiology; reproductive and menstrual factors; breast cancer; breast cancer susceptibility loci
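The polygenic score in entry 2 is described as the sum of risk allele copies multiplied by the corresponding log odds estimate. A minimal sketch of that arithmetic follows; the function name and the example allele counts and odds ratios are hypothetical illustrations, not values from the study.

```python
import math


def polygenic_score(risk_allele_counts, odds_ratios):
    """Sum of risk-allele copies (0, 1, or 2 per SNP) weighted by each SNP's log odds ratio."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(risk_allele_counts, odds_ratios)
    )


# Hypothetical individual: three SNPs with made-up per-allele odds ratios.
example_counts = [2, 1, 0]
example_odds_ratios = [1.2, 1.1, 1.3]
example_score = polygenic_score(example_counts, example_odds_ratios)
```

Weighting by the log odds ratio means a SNP with a larger estimated effect contributes more to the score per risk allele than a SNP with a smaller one.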
3.  Prediction of genetic contributions to complex traits using whole genome sequencing data 
BMC Proceedings  2014;8(Suppl 1):S68.
Although markers identified by genome-wide association studies have individually strong statistical significance, their performance in prediction remains limited. Our goal was to use animal breeding genomic prediction models to predict additive genetic contributions for systolic blood pressure (SBP) using whole genome sequencing data with different validation designs.
The additive genetic contributions of SBP were estimated via a linear mixed model. Rare variants (MAF<0.05) were collapsed through the k-means method to create "collapsed single-nucleotide polymorphisms." Prediction of the additive genomic contributions of SBP was conducted using genomic Best Linear Unbiased Predictor (GBLUP) and BayesCπ. Estimates of predictive accuracy were compared using common single-nucleotide polymorphisms (SNPs) versus common and collapsed SNPs, and for prediction within and across families.
The additive genetic variance of SBP accounted for 18% of the phenotypic variance (h2 = 0.18). BayesCπ had slightly better prediction accuracies than GBLUP. In both models, within-family predictions had higher accuracies in both the training and testing sets than did across-family predictions. Collapsing rare variants via the k-means method and adding them to the common SNPs did not improve prediction accuracies. The prediction model that included both pedigree and genomic information achieved a slightly higher accuracy than using either source of information alone.
Prediction of genetic contributions to complex traits is feasible using whole genome sequencing and statistical methods borrowed from animal breeding. The relatedness of individuals between the training and testing set strongly affected the performance of prediction models. Methods for inclusion of rare variants in these models need more development.
PMCID: PMC4143683  PMID: 25519339
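For reference, the h2 reported in entry 3 is the narrow-sense heritability, the share of phenotypic variance attributed to additive genetic variance. The symbols below are the conventional ones and are assumed here rather than taken from the abstract.

```latex
h^2 = \frac{\sigma_A^2}{\sigma_P^2} = 0.18
```

Here \sigma_A^2 denotes the additive genetic variance of SBP and \sigma_P^2 the total phenotypic variance.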
4.  Genetic Analysis Workshop 18: Methods and strategies for analyzing human sequence and phenotype data in members of extended pedigrees 
BMC Proceedings  2014;8(Suppl 1):S1.
Genetic Analysis Workshop 18 provided a platform for developing and evaluating statistical methods to analyze whole-genome sequence data from a pedigree-based sample. In this article we present an overview of the data sets and the contributions that analyzed these data. The family data, donated by the Type 2 Diabetes Genetic Exploration by Next-Generation Sequencing in Ethnic Samples Consortium, included sequence-level genotypes based on sequencing and imputation, genome-wide association genotypes from prior genotyping arrays, and phenotypes from longitudinal assessments. The contributions from individual research groups were extensively discussed before, during, and after the workshop in theme-based discussion groups before being submitted for publication.
PMCID: PMC4143625  PMID: 25519310
5.  The Association of Telomere Length with Colorectal Cancer Differs by the Age of Cancer Onset 
Telomeres are nucleoprotein structures that cap the end of chromosomes and shorten with sequential cell divisions in normal aging. Short telomeres are also implicated in the incidence of many cancers, but the evidence is not conclusive for colorectal cancer (CRC). Therefore, the aim of this study was to assess the association of CRC and telomere length.
In this case–control study, we measured relative telomere length from peripheral blood leukocytes (PBLs) DNA with quantitative PCR in 598 CRC patients and 2,212 healthy controls.
Multivariate analysis indicated that telomere length was associated with risk for CRC, and this association varied in an age-related manner; younger individuals (≤50 years of age) with longer telomeres (80–99 percentiles) had a 2–6 times higher risk of CRC, while older individuals (>50 years of age) with shortened telomeres (1–10 percentiles) had 2–12 times the risk for CRC. The risk for CRC varies with extremes in telomere length in an age-associated manner.
Younger individuals with longer telomeres or older individuals with shorter telomeres are at higher risk for CRC. These findings indicate that the association of PBL telomere length varies according to the age of cancer onset and that CRC is likely associated with at minimum two different mechanisms of telomere dynamics.
PMCID: PMC3972691  PMID: 24598784
6.  The Challenge of Detecting Epistasis (G×G Interactions): Genetic Analysis Workshop 16 
Genetic epidemiology  2009;33(0 1):S58-S67.
Interest is increasing in epistasis as a possible source of the unexplained variance missed by genome-wide association studies. The Genetic Analysis Workshop 16 Group 9 participants evaluated a wide variety of classical and novel analytical methods for detecting epistasis, in both the statistical and machine learning paradigms, applied to both real and simulated data. Because the magnitude of epistasis is clearly relative to scale of penetrance, and therefore to some extent, to the choice of model framework, it is not surprising that strong interactions under one model might be minimized or even disappear entirely under a different modeling framework.
PMCID: PMC3692280  PMID: 19924703
generalized linear model; machine learning methods
7.  Analysis of human mini-exome sequencing data from Genetic Analysis Workshop 17 using a Bayesian hierarchical mixture model 
BMC Proceedings  2011;5(Suppl 9):S93.
Next-generation sequencing technologies are rapidly changing the field of genetic epidemiology and enabling exploration of the full allele frequency spectrum underlying complex diseases. Although sequencing technologies have shifted our focus toward rare genetic variants, statistical methods traditionally used in genetic association studies are inadequate for estimating effects of low minor allele frequency variants. For our study we use the Genetic Analysis Workshop 17 data from 697 unrelated individuals (genotypes for 24,487 autosomal variants from 3,205 genes). We apply a Bayesian hierarchical mixture model to identify genes associated with a simulated binary phenotype using a transformed genotype design matrix weighted by allele frequencies. A Metropolis-Hastings algorithm is used to jointly sample each indicator variable and additive genetic effect pair from its conditional posterior distribution, and remaining parameters are sampled by Gibbs sampling. This method identified 58 genes with a posterior probability greater than 0.8 for being associated with the phenotype. One of these 58 genes, PIK3C2B, was correctly identified as being associated with affected status based on the simulation process. This project demonstrates the utility of Bayesian hierarchical mixture models using a transformed genotype matrix to detect genes containing rare and common variants associated with a binary phenotype.
PMCID: PMC3287935  PMID: 22373180
8.  The Survey of the Health of Wisconsin (SHOW), a novel infrastructure for population health research: rationale and methods 
BMC Public Health  2010;10:785.
Evidence-based public health requires the existence of reliable information systems for priority setting and evaluation of interventions. Existing data systems in the United States are either too crude (e.g., vital statistics), rely on administrative data (e.g., Medicare) or, because of their national scope (e.g., NHANES), lack the discriminatory power to assess specific needs and to evaluate community health activities at the state and local level. This manuscript describes the rationale and methods of the Survey of the Health of Wisconsin (SHOW), a novel infrastructure for population health research.
The program consists of a series of independent annual surveys gathering health-related data on representative samples of state residents and communities. Two-stage cluster sampling is used to select households and recruit approximately 800-1,000 adult participants (21-74 years old) each year. Recruitment and initial interviews are done at the household; additional interviews and physical exams are conducted at permanent or mobile examination centers. Individual survey data include physical, mental, and oral health history, health literacy, demographics, behavioral, lifestyle, occupational, and household characteristics as well as health care access and utilization. The physical exam includes blood pressure, anthropometry, bioimpedance, spirometry, urine collection and blood draws. Serum, plasma, and buffy coats (for DNA extraction) are stored in a biorepository for future studies. Every household is geocoded for linkage with existing contextual data including community level measures of the social and physical environment; local neighborhood characteristics are also recorded using an audit tool. Participants are re-contacted bi-annually by phone for health history updates.
SHOW generates data to assess health disparities across state communities as well as trends on prevalence of health outcomes and determinants. SHOW also serves as a platform for ancillary epidemiologic studies and for studies to evaluate the effect of community-specific interventions. It addresses key gaps in our current data resources and increases capacity for etiologic, applied and translational population health research. It is hoped that this program will serve as a model to better support evidence-based public health, facilitate intervention evaluation research, and ultimately help improve health throughout the state and nation.
PMCID: PMC3022857  PMID: 21182792
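The two-stage cluster sampling mentioned in entry 8 can be sketched as follows. The abstract does not say what the first-stage units are or how selection probabilities are set, so this sketch assumes simple random sampling at both stages; the cluster labels, household identifiers, and function name are hypothetical.

```python
import random


def two_stage_sample(households_by_cluster, n_clusters, n_households, seed=0):
    """Stage 1: draw clusters at random. Stage 2: draw households within each chosen cluster."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    chosen_clusters = rng.sample(sorted(households_by_cluster), n_clusters)
    sample = {}
    for cluster in chosen_clusters:
        households = households_by_cluster[cluster]
        # Take every household when a cluster holds fewer than requested.
        k = min(n_households, len(households))
        sample[cluster] = rng.sample(households, k)
    return sample


# Hypothetical sampling frame: four clusters of five households each.
frame = {f"cluster_{c}": [f"hh_{c}_{h}" for h in range(5)] for c in "ABCD"}
selected = two_stage_sample(frame, n_clusters=2, n_households=3)
```

Real survey designs often sample first-stage units with probability proportional to size and apply weights at analysis time; neither step is shown here.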
9.  Detecting gene-by-smoking interactions in a genome-wide association study of early-onset coronary heart disease using random forests 
BMC Proceedings  2009;3(Suppl 7):S88.
Genome-wide association studies are often limited in their ability to attain their full potential due to the sheer volume of information created. We sought to use the random forest algorithm to identify single-nucleotide polymorphisms (SNPs) that may be involved in gene-by-smoking interactions related to the early-onset of coronary heart disease.
Using data from the Framingham Heart Study, our analysis used a case-only design in which the outcome of interest was age of onset of early coronary heart disease.
Smoking status was dichotomized as ever versus never. The single SNP with the highest importance score assigned by random forests was rs2011345. This SNP was not associated with age alone in the control subjects. Using generalized estimating equations to adjust for sex and account for familial correlation, there was evidence of an interaction between rs2011345 and smoking status.
The results of this analysis suggest that random forests may be a useful tool for identifying SNPs taking part in gene-by-environment interactions in genome-wide association studies.
PMCID: PMC2795991  PMID: 20018084
10.  Classification tree for detection of single-nucleotide polymorphism (SNP)-by-SNP interactions related to heart disease: Framingham Heart Study 
BMC Proceedings  2009;3(Suppl 7):S83.
The aim of this study was to detect the effect of interactions between single-nucleotide polymorphisms (SNPs) on incidence of heart diseases. For this purpose, 2912 subjects with 350,160 SNPs from the Framingham Heart Study (FHS) were analyzed. PLINK was used to control quality and to select the 10,000 most significant SNPs. A classification tree algorithm, Generalized, Unbiased, Interaction Detection and Estimation (GUIDE), was employed to build a classification tree to detect SNP-by-SNP interactions for the selected 10 k SNPs. The classes generated by GUIDE were reexamined by a generalized estimating equations (GEE) model with the empirical variance after accounting for potential familial correlation. Overall, 17 classes were generated based on the splitting criteria in GUIDE. The prevalence of coronary heart disease (CHD) in class 16 (determined by SNPs rs1894035, rs7955732, rs2212596, and rs1417507) was the lowest (0.23%). Compared to class 16, all other classes except for class 288 (prevalence of 1.2%) had a significantly greater risk when analyzed using GEE model. This suggests the interactions of SNPs on these node paths are significant.
PMCID: PMC2795986  PMID: 20018079
11.  Detecting single-nucleotide polymorphism by single-nucleotide polymorphism interactions in rheumatoid arthritis using a two-step approach with machine learning and a Bayesian threshold least absolute shrinkage and selection operator (LASSO) model 
BMC Proceedings  2009;3(Suppl 7):S63.
The objective of this study was to detect interactions between relevant single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). Data from Problem 1 of the Genetic Analysis Workshop 16 were used. These data consisted of 868 cases and 1,194 controls genotyped with the 500 k Illumina chip. First, machine learning methods were applied for preselecting SNPs. One hundred SNPs outside the HLA region and 1,500 SNPs in the HLA region were preselected using information-gain theory. The software Weka was used to reduce collinearity and redundancy in the HLA region, resulting in a subset of 6 SNPs out of 1,500. In a second step, a parametric approach to account for interactions between SNPs in the HLA region, as well as HLA-nonHLA interactions, was conducted using a Bayesian threshold least absolute shrinkage and selection operator (LASSO) model incorporating 2,560 covariates. This approach detected some main and interaction effects for SNPs in genes that have previously been associated with RA (e.g., rs2395175, rs660895, rs10484560, and rs2476601). Further, some other SNPs detected in this study may be considered in candidate gene studies.
PMCID: PMC2795964  PMID: 20018057
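Entry 11 preselects SNPs by information gain. As a rough illustration of that criterion, and not of the authors' Weka pipeline, the sketch below computes the information gain of one SNP's genotype with respect to case/control status. The function names and the toy genotype data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(genotypes, statuses):
    """Reduction in entropy of case/control status after splitting on genotype."""
    total = len(statuses)
    groups = {}
    for genotype, status in zip(genotypes, statuses):
        groups.setdefault(genotype, []).append(status)
    conditional = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(statuses) - conditional


# Toy data: four subjects, genotype coded as the number of minor alleles.
statuses = ["case", "case", "control", "control"]
informative_snp = [0, 0, 1, 1]    # genotype separates cases from controls perfectly
uninformative_snp = [0, 1, 0, 1]  # genotype carries no information about status
gain_informative = information_gain(informative_snp, statuses)
gain_uninformative = information_gain(uninformative_snp, statuses)
```

Ranking SNPs by this quantity and keeping the top-scoring ones is one simple way to preselect candidates before fitting a model with interaction terms.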
12.  Genome-wide association studies using single-nucleotide polymorphisms versus haplotypes: an empirical comparison with data from the North American Rheumatoid Arthritis Consortium 
BMC Proceedings  2009;3(Suppl 7):S35.
The high genomic density of the single-nucleotide polymorphism (SNP) sets that are typically surveyed in genome-wide association studies (GWAS) now allows the application of haplotype-based methods. Although the choice of haplotype-based vs. individual-SNP approaches is expected to affect the results of association studies, few empirical comparisons of method performance have been reported on the genome-wide scale in the same set of individuals. To measure the relative ability of the two strategies to detect associations, we used a large dataset from the North American Rheumatoid Arthritis Consortium to: 1) partition the genome into haplotype blocks, 2) associate haplotypes with disease, and 3) compare the results with individual-SNP association mapping. Although some associations were shared across methods, each approach uniquely identified several strong candidate regions. Our results suggest that the application of both haplotype-based and individual-SNP testing to GWAS should be adopted as a routine procedure.
PMCID: PMC2795933  PMID: 20018026
13.  Comparison between two analytic strategies to detect linkage to obesity with genetically determined age of onset: the Framingham Heart Study 
BMC Genetics  2003;4(Suppl 1):S90.
Genes have been found to influence the age of onset of several diseases and traits. The occurrence of many chronic diseases, obesity included, appears to be strongly age-dependent. However, an analysis of potential age of onset genes for obesity has yet to be reported. There are at least two analytic methods for determining an age of onset gene. The first is to consider a person affected if they possess the trait before a certain age (an early age of onset phenotype). The second is to define the phenotype based on the residual from a survival analysis.
No regions provided evidence for linkage at the more stringent level of p < 0.001. However, five regions showed consistent suggestive evidence for linkage (one marker with p < 0.01 and a second contiguous marker at p < 0.05). These regions were chromosome 1 (280–294 cM) and chromosome 16 (56–64 cM) for overweight using the survival analysis residual method and chromosome 13 (102–122 cM), chromosome 17 (127–138 cM), and chromosome 19 (23–47 cM) for obese before age 35.
Only one region (chromosome 19 at 23–47 cM) showed somewhat consistent results between the two analytic methods. Potential reasons for inconsistent results between the two methods, as well as their strengths and weaknesses, are discussed. The use of both methods together to explore the genetics of the age of onset of a trait may prove to be beneficial in determining a gene that is linked only to an early age of onset phenotype versus one that determines age of onset through all age groups.
PMCID: PMC1866531  PMID: 14975158
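Entry 13 defines suggestive evidence for linkage as one marker with p < 0.01 and a second contiguous marker at p < 0.05. A minimal sketch of that rule follows, assuming the p-values are listed in map order along one chromosome and that "contiguous" means adjacent in that list; the function name and the example p-values are hypothetical.

```python
def suggestive_pairs(p_values, strong=0.01, weak=0.05):
    """Return index pairs (i, i + 1) of adjacent markers meeting the suggestive rule.

    A pair qualifies when one marker has p < strong and its neighbor has p < weak.
    """
    hits = []
    for i in range(len(p_values) - 1):
        first, second = p_values[i], p_values[i + 1]
        if (first < strong and second < weak) or (second < strong and first < weak):
            hits.append((i, i + 1))
    return hits


# Made-up p-values for six markers in map order.
example_p_values = [0.20, 0.008, 0.03, 0.40, 0.04, 0.04]
example_hits = suggestive_pairs(example_p_values)
```

In this made-up example only the second and third markers qualify; the last two markers both have p < 0.05 but neither reaches p < 0.01, so they are not flagged.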
14.  Genome scan linkage results for longitudinal systolic blood pressure phenotypes in subjects from the Framingham Heart Study 
BMC Genetics  2003;4(Suppl 1):S83.
The relationship between elevated blood pressure and cardiovascular and cerebrovascular disease risk is well accepted. Both systolic and diastolic hypertension are associated with this risk increase, but systolic blood pressure appears to be a more important determinant of cardiovascular risk than diastolic blood pressure. Subjects for this study are derived from the Framingham Heart Study data set. Each subject had five records of clinical data of which systolic blood pressure, age, height, gender, weight, and hypertension treatment were selected to characterize the phenotype in this analysis.
We modeled systolic blood pressure as a function of age using a mixed modeling methodology that enabled us to characterize the phenotype for each individual as the individual's deviation from the population average rate of change in systolic blood pressure for each year of age while controlling for gender, body mass index, and hypertension treatment. Significant (p = 0.00002) evidence for linkage was found between this normalized phenotype and a region on chromosome 1. Similar linkage results were obtained when we estimated the phenotype while excluding values obtained during hypertension treatment. The use of linear mixed models to define phenotypes is a methodology that allows for the adjustment of the main factor by covariates. Future work should be done in the area of combining this phenotype estimation directly with the linkage analysis so that the error in estimating the phenotype can be properly incorporated into the genetic analysis, which, at present, assumes that the phenotype is measured (or estimated) without error.
PMCID: PMC1866523  PMID: 14975151
