The pulmonary function measures of forced expiratory volume in one second (FEV1) and its ratio to forced vital capacity (FVC) are used in the diagnosis and monitoring of lung diseases and predict cardiovascular mortality in the general population. Genome wide association studies (GWAS) have identified numerous loci associated with FEV1 and FEV1/FVC but the causal variants remain uncertain. We hypothesized that novel or rare variants poorly tagged by GWAS may explain the significant associations between FEV1/FVC and two genes: ADAM19 and HTR4.
Methods and Results
We sequenced ADAM19 and its promoter region along with the approximately 21 kb portion of HTR4 harboring GWAS SNPs for pulmonary function and analyzed associations with FEV1/FVC among 3,983 participants of European ancestry from Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE). Meta-analysis of common variants in each region identified statistically significant associations (316 tests, P < 1.58×10−4) with FEV1/FVC for 14 ADAM19 SNPs and 24 HTR4 SNPs. After conditioning on the sentinel GWAS hit in each gene [ADAM19 rs1422795, minor allele frequency (MAF)=0.33 and HTR4 rs11168048, MAF=0.40] one SNP remained statistically significant (ADAM19 rs13155908, MAF = 0.12, P = 1.56×10−4). Analysis of rare variants (MAF < 1%) using Sequence Kernel Association Test did not identify associations with either region.
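The significance threshold quoted above (P < 1.58×10−4 for 316 tests) matches a Bonferroni correction of a 0.05 family-wise error rate. The 0.05 level is an assumption here, since the text states only the number of tests and the resulting threshold. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
# Assumption: the reported threshold is a Bonferroni correction of
# alpha = 0.05 across the 316 tests mentioned in the text.
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 316)
print(f"{threshold:.2e}")  # 1.58e-04

# The conditional P value reported for rs13155908 falls just under it.
print(1.56e-4 < threshold)  # True
```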
Sequencing identified one common variant associated with FEV1/FVC independently of the sentinel ADAM19 GWAS hit and supports the original HTR4 GWAS findings. Rare variants do not appear to underlie GWAS associations with pulmonary function for common variants in ADAM19 and HTR4.
genetic polymorphism; lung; population studies; DNA sequencing; Genome Wide Association Study
Genome-wide association studies (GWAS) have identified thousands of genetic variants that influence a variety of diseases and health-related quantitative traits. However, the causal variants underlying the majority of genetic associations remain unknown. The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Targeted Sequencing Study aims to follow up GWAS signals and identify novel associations of the allelic spectrum of identified variants with cardiovascular related traits.
Methods and Results
The study included 4,231 participants from three CHARGE cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, and the Framingham Heart Study. We used a case-cohort design in which we selected both a random sample of participants and participants with extreme phenotypes for each of 14 traits. We sequenced and analyzed 77 genomic loci, which had previously been associated with one or more of 14 phenotypes. A total of 52,736 variants were characterized by sequencing and passed our stringent quality control criteria. For common variants (minor allele frequency ≥1%), we performed unweighted regression analyses to obtain p-values for associations and weighted regression analyses to obtain effect estimates that accounted for the sampling design. For rare variants, we applied two approaches: collapsed aggregate statistics and joint analysis of variants using the Sequence Kernel Association Test.
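The difference between the unweighted and weighted regressions described above can be illustrated with a small weighted least-squares fit. This is a sketch only: it assumes the weights are inverse sampling probabilities, which the text does not specify, and the function name and toy numbers are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_linear_fit(x: Sequence[float],
                        y: Sequence[float],
                        w: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope). Equal weights give the ordinary fit.
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("x has no weighted variance")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: minor-allele counts and a quantitative trait. The last person
# is assumed to be oversampled for an extreme phenotype, so that person
# gets a small weight; the random-sample members get a large one.
genotype = [0, 1, 2, 2]
trait = [1.0, 2.0, 3.0, 6.0]
equal_weights = [1.0, 1.0, 1.0, 1.0]
design_weights = [10.0, 10.0, 10.0, 1.0]

_, slope_unweighted = weighted_linear_fit(genotype, trait, equal_weights)
_, slope_weighted = weighted_linear_fit(genotype, trait, design_weights)
print(slope_unweighted, slope_weighted)
```

In this toy example, down-weighting the oversampled extreme observation pulls the slope toward the value seen in the random-sample members, which is the purpose of a design-weighted effect estimate.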
We sequenced 77 genomic loci in participants from three cohorts. We established a set of filters to identify high-quality variants, and implemented statistical and bioinformatics strategies to analyze the sequence data, and identify potentially functional variants within GWAS loci.
genetics; epidemiology; CHARGE; sampling; targeted sequencing
Small area estimation (SAE) is an important endeavor in many fields and is used for resource allocation by both public health and government organizations. Often, complex surveys are carried out within areas, in which case it is common for the data to consist only of the response of interest and an associated sampling weight, reflecting the design. While it is appealing to use spatial smoothing models, and many approaches have been suggested for this endeavor, it is rare for spatial models to incorporate the weighting scheme, leaving the analysis potentially subject to bias. To examine the properties of various approaches to estimation we carry out a simulation study, looking at bias due to both non-response and non-random sampling. We also carry out SAE of smoking prevalence in Washington State, at the zip code level, using data from the 2006 Behavioral Risk Factor Surveillance System. The computation times for the methods we compare are short, and all approaches are implemented in R using currently available packages.
Complex surveys; Design-based inference; Intrinsic CAR models; Random effects models; Weighting
Rare variant tests have been of great interest in testing genetic associations with diseases and disease-related quantitative traits in recent years. Among these tests, the sequence kernel association test (SKAT) is an omnibus test for effects of rare genetic variants, in a linear or logistic regression framework. It is often described as a variance component test treating the genotypic effects as random. When the linear kernel is used, its test statistic can be expressed as a weighted sum of single-marker score test statistics. In this paper, we extend the test to survival phenotypes in a Cox regression framework. Because of the anticonservative small-sample performance of the score test in a Cox model, we substitute signed square-root likelihood ratio statistics for the score statistics, and confirm that the small-sample control of type I error is greatly improved. This test can also be applied in meta-analysis. We show in our simulation studies that this test has superior statistical power except in a few specific scenarios, as compared to burden tests in a Cox model. We also present results in an application to time-to-obesity using genotypes from Framingham Heart Study SNP Health Association Resource.
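The linear-kernel form mentioned above, a weighted sum of single-marker score statistics, can be sketched for a continuous trait. This illustrates the original linear-regression setting only, not the Cox extension proposed in the paper. It assumes an intercept-only null model and the convention Q = Σ_j w_j S_j², where S_j is the score for variant j. Some software applies the weights before squaring, so the weight convention is an assumption to check. The function name and toy data are hypothetical.

```python
from typing import Sequence


def skat_linear_q(genotypes: Sequence[Sequence[float]],
                  phenotype: Sequence[float],
                  weights: Sequence[float]) -> float:
    """Linear-kernel statistic Q = sum_j w_j * S_j**2.

    genotypes[i][j] is the minor-allele count of person i at variant j.
    S_j = sum_i g_ij * (y_i - mean(y)) is the score for variant j under
    an intercept-only null model (no covariates; an assumption here).
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score_j ** 2
    return q


# Toy data: 4 people, 2 variants.
toy_genotypes = [[0, 1], [1, 0], [0, 0], [2, 0]]
toy_phenotype = [1.0, 2.0, 3.0, 6.0]
toy_weights = [1.0, 0.5]
print(skat_linear_q(toy_genotypes, toy_phenotype, toy_weights))  # 27.0
```

Obtaining a P value additionally requires the null distribution of Q (a mixture of chi-square variables), which is not shown here.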
Cox proportional hazard model; likelihood ratio test; rare variant analysis; variance component test
Forced vital capacity (FVC), a spirometric measure of pulmonary function, reflects lung volume and is used to diagnose and monitor lung diseases. We performed genome-wide association study meta-analysis of FVC in 52,253 individuals from 26 studies and followed up the top associations in 32,917 additional individuals of European ancestry. We found six new regions associated at genome-wide significance (P < 5 × 10−8) with FVC in or near EFEMP1, BMP6, MIR-129-2/HSD17B12, PRDM11, WWOX, and KCNJ2. Two loci previously associated with spirometric measures (GSTCD and PTCH1) were related to FVC. Newly implicated regions were followed up in samples of African American, Korean, Chinese, and Hispanic individuals. We detected transcripts for all six newly implicated genes in human lung tissue. The new loci may inform mechanisms involved in lung development and pathogenesis of restrictive lung disease.
Plasma fibrinogen is an acute phase protein that plays an important role in the blood coagulation cascade and is strongly associated with smoking, alcohol consumption and body mass index (BMI). Genome-wide association studies (GWAS) have identified a variety of gene regions associated with elevated plasma fibrinogen concentrations. However, little is yet known about how associations between environmental factors and fibrinogen might be modified by genetic variation. Therefore, we conducted large-scale meta-analyses of genome-wide interaction studies to identify possible interactions of genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentration. The present study included 80,607 subjects of European ancestry from 22 studies. Genome-wide interaction analyses were performed separately in each study for about 2.6 million single nucleotide polymorphisms (SNPs) across the 22 autosomal chromosomes. For each SNP and risk factor, we performed a linear regression under an additive genetic model including an interaction term between SNP and risk factor. Interaction estimates were meta-analysed using a fixed-effects model. No genome-wide significant interaction with smoking status, alcohol consumption or BMI was observed in the meta-analyses. The most suggestive interaction was found for smoking and rs10519203, located in the LOC123688 region on chromosome 15, with a p value of 6.2×10−8. This large genome-wide interaction study including 80,607 participants found no strong evidence of interaction between genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentrations. Further studies are needed to yield deeper insight into the interplay between environmental factors and gene variants in the regulation of fibrinogen concentrations.
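The pooling of study-specific interaction estimates under a fixed-effects model can be sketched with standard inverse-variance weighting. Whether the authors used exactly this weighting is an assumption, and the estimates below are invented numbers, not results from the study.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effects meta-analysis.

    Returns (pooled estimate, pooled standard error, two-sided P value
    from a normal approximation).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical SNP-by-smoking interaction estimates from two studies.
pooled, pooled_se, p_value = fixed_effect_meta([0.10, 0.30], [0.05, 0.10])
print(pooled, pooled_se, p_value)
```

The more precise study (smaller standard error) dominates the pooled estimate, which is why large cohorts drive fixed-effects results.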
It is well established, both by population genetics theory and by direct observation in many organisms, that increased genetic diversity provides a survival advantage. However, given the limitations of both sample size and genome-wide metrics, this hypothesis has not been comprehensively tested in human populations. Moreover, the presence of numerous segregating small-effect alleles that influence traits with a direct impact on health raises the question of whether global measures of genomic variation are themselves associated with human health and disease.
We performed a meta-analysis of 17 cohorts followed prospectively, with a combined sample size of 46,716 individuals, including a total of 15,234 deaths. We find a significant association between increased heterozygosity and survival (P = 0.03). We estimate that within a single population, every standard deviation of heterozygosity an individual has over the mean decreases that person’s risk of death by 1.57%.
This effect was consistent between European and African ancestry cohorts, men and women, and major causes of death (cancer and cardiovascular disease), demonstrating the broad positive impact of genomic diversity on human survival.
Electronic supplementary material
The online version of this article (doi:10.1186/s12863-014-0159-7) contains supplementary material, which is available to authorized users.
Heterozygosity; Human; Survival; GWAS
Statins effectively lower LDL cholesterol levels in large studies and the observed interindividual response variability may be partially explained by genetic variation. Here we perform a pharmacogenetic meta-analysis of genome-wide association studies (GWAS) in studies addressing the LDL cholesterol response to statins, including up to 18,596 statin-treated subjects. We validate the most promising signals in a further 22,318 statin recipients and identify two loci, SORT1/CELSR2/PSRC1 and SLCO1B1, not previously identified in GWAS. Moreover, we confirm the previously described associations with APOE and LPA. Our findings advance the understanding of the pharmacogenetic architecture of statin response.
Statins are effectively used to prevent and manage cardiovascular disease, but patient response to these drugs is highly variable. Here, the authors identify two new genes associated with the response of LDL cholesterol to statins and advance our understanding of the genetic basis of drug response.
Genome-wide association studies (GWAS) identified multiple loci for blood pressure (BP) and hypertension. Six genes – ATP2B1, CACNB2, CYP17A1, JAG1, PLEKHA7, and SH2B3 – were evaluated for sequence variation with large effects on systolic blood pressure (SBP), diastolic blood pressure (DBP), pulse pressure (PP), and mean arterial pressure (MAP).
Methods and Results
Targeted genomic sequence was determined in 4,178 European ancestry participants from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. Common variants (≥50 minor allele copies) were evaluated individually and rare variants (minor allele frequency, MAF≤1%) were aggregated by locus. 464 common variants were identified across the 6 genes. An upstream CYP17A1 variant, rs11191416 (MAF = 0.09), was the most significant association for SBP (P = 0.0005); however the association was attenuated (P = 0.0469) after conditioning on the GWAS index single nucleotide polymorphism (SNP). A PLEKHA7 intronic variant was the strongest DBP association (rs12806040, MAF = 0.007, P = 0.0006) and was not in LD (r2 = 0.01) with the GWAS SNP. A CACNB2 intronic SNP, rs1571787, was the most significant association with PP (MAF = 0.27, P = 0.0003), but was not independent from the GWAS SNP (r2 = 0.34). Three variants (rs6163 and rs743572 in the CYP17A1 region and rs112467382 in PLEKHA7) were associated with BP traits (P<0.001). Rare variation, aggregately assessed in the 6 regions, was not significantly associated with BP measures.
Six targeted gene regions, previously identified by GWAS, did not harbor novel variation with large effects on BP in this sample.
Studies demonstrate associations between changes in obesity-related phenotypes and cardiovascular risk. While maternal pre-pregnancy BMI (mppBMI) and gestational weight gain (GWG) may be associated with adult offspring adiposity, no study has examined associations with obesity changes.
We examined associations of mppBMI and GWG with longitudinal change in offspring's BMI (ΔBMI), and assessed whether associations are explained by offspring genetics.
Design and Methods
We used a birth cohort of 1400 adults, with data at birth, age 17 and 32. After genotyping offspring, we created genetic scores, predictive of exposures and outcome, and fit linear regression models with and without scores to examine the associations of mppBMI and GWG with ΔBMI.
A one SD change in mppBMI and GWG was associated with a 0.83 and a 0.75 kg/m2 increase in ΔBMI, respectively. The association between mppBMI and offspring ΔBMI was slightly attenuated (12%) with the addition of genetic scores. In the GWG model, a substantial and statistically significant 28.2% decrease in the coefficient was observed.
This study points to an association between maternal excess weight in pregnancy and offspring BMI change from adolescence to adulthood. Genetic factors may account, in part, for the GWG/ΔBMI association. These findings broaden observations that maternal obesity-related phenotypes have long-term consequences for offspring health.
Adiposity; Body-Mass Index; BMI; Cardiovascular Risk; Weight Change; Genetic Epidemiology
The cardiac sodium channel SCN5A regulates atrioventricular and ventricular conduction. Genetic variants in this gene are associated with PR and QRS intervals. We sought to further characterize the contribution of rare and common coding variation in SCN5A to cardiac conduction.
Methods and Results
In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study (CHARGE), we performed targeted exonic sequencing of SCN5A (n=3699, European-ancestry individuals) and identified 4 common (minor allele frequency >1%) and 157 rare variants. Common and rare SCN5A coding variants were examined for association with PR and QRS intervals through meta-analysis of European-ancestry participants from CHARGE, NHLBI’s Exome Sequencing Project (ESP, n=607) and the UK10K (n=1275) and by examining ESP African-ancestry participants (n=972). Rare coding SCN5A variants in aggregate were associated with PR interval in European and African-ancestry participants (P=1.3×10−3). Three common variants were associated with PR and/or QRS interval duration among European-ancestry participants and one among African-ancestry participants. These included two well-known missense variants: rs1805124 (H558R) was associated with PR and QRS shortening in European-ancestry participants (P=6.25×10−4 and P=5.2×10−3, respectively) and rs7626962 (S1102Y) was associated with PR shortening in those of African ancestry (P=2.82×10−3). Among European-ancestry participants, two novel synonymous variants, rs1805126 and rs6599230, were associated with cardiac conduction. Our top signal, rs1805126, was associated with PR and QRS lengthening (P=3.35×10−7 and P=2.69×10−4, respectively), and rs6599230 was associated with PR shortening.
By sequencing SCN5A, we identified novel common and rare coding variants associated with cardiac conduction.
PR interval; QRS interval; genetics; sequencing; cohort
Estimates of the heritability of plasma fibrinogen concentration, an established predictor of cardiovascular disease (CVD), range from 34 to 50%. Genetic variants so far identified by genome-wide association (GWA) studies only explain a small proportion (< 2%) of its variation.
Methods and Results
We conducted a meta-analysis of 28 GWA studies, including more than 90,000 subjects of European ancestry, the first GWA meta-analysis of fibrinogen levels in 7 African American studies totaling 8,289 samples, and a GWA study in Hispanic-Americans totaling 1,366 samples. Evaluation for association of SNPs with clinical outcomes included a total of 40,695 cases and 85,582 controls for coronary artery disease (CAD), 4,752 cases and 24,030 controls for stroke, and 3,208 cases and 46,167 controls for venous thromboembolism (VTE). Overall, we identified 24 genome-wide significant (P<5×10−8) independent signals in 23 loci, including 15 novel associations, together accounting for 3.7% of plasma fibrinogen variation. Gene-set enrichment analysis highlighted key roles in fibrinogen regulation for the three structural fibrinogen genes and pathways related to inflammation, adipocytokines and thyrotrophin-releasing hormone signaling. Whereas lead SNPs in a few loci were significantly associated with CAD, the combined effect of all 24 fibrinogen-associated lead SNPs was not significant for CAD, stroke or VTE.
We identify 23 robustly associated fibrinogen loci, 15 of which are new. Clinical outcome analysis of these loci does not support a causal relationship between circulating levels of fibrinogen and CAD, stroke or VTE.
Fibrinogen; cardiovascular disease; genome-wide association study
Transforming growth factor-beta1 (TGF-B1) is a highly pleiotropic cytokine whose functions include a central role in the induction of fibrosis.
To investigate the hypothesis that elevated plasma levels of TGF-B1 are positively associated with incident heart failure (HF).
Participants and Methods
The hypothesis was tested using a two-phase case-control study design, ancillary to the Cardiovascular Health Study, a longitudinal, population-based cohort study. Cases were defined as having an incident HF event after their 1992-93 exam and controls were free of HF at follow-up. TGF-B1 was measured using plasma collected in 1992-93 and data from 89 cases and 128 controls were used for analysis. The association between TGF-B1 and risk of HF was evaluated using the weighted likelihood method, and odds ratios (OR) for risk of HF were calculated for TGF-B1 as a continuous linear variable and across quartiles of TGF-B1.
The OR for HF was 1.88 (95% confidence interval [CI] 1.26 to 2.81) for each nanogram increase in TGF-B1, and the OR for the highest quartile (compared to the lowest) of TGF-B1 was 5.79 (95% CI 1.65 to 20.34), after adjustment for age, sex, C-reactive protein, platelet count and digoxin use. Further adjustment for other covariates did not change the results.
Higher levels of plasma TGF-B1 were associated with an increased risk of incident heart failure among older adults. However, further study is needed in larger samples to confirm these findings.
transforming growth factor-beta; heart failure; fibrosis; growth factors; cardiac remodeling
This paper considers approaches to the question “Which long-term care facilities have residents with high use of acute hospitalisations?” It demonstrates and compares four methods of identifying such facilities, identifies key factors to be resolved when deciding which method to employ, and discusses the appropriateness of each method for different research questions.
OPAL was a census-type survey of aged care facilities and residents in Auckland, New Zealand, in 2008. It collected information about facility management and resident demographics, needs and care. Survey records (149 aged care facilities, 6271 residents) were linked to hospital and mortality records routinely assembled by health authorities. The main ranking endpoint was acute hospitalisations for diagnoses that were classified as potentially avoidable. Facilities were ranked using 1) simple event counts per person, 2) event rates per year of resident follow-up, 3) statistical model of rates using four predictors, and 4) change in ranks between methods 2) and 3). A generalized mixed model was used for Method 3 to handle the clustered nature of the data.
3048 potentially avoidable hospitalisations were observed during 22 months’ follow-up. The same “top ten” facilities were selected by Methods 1 and 2. The statistical model (Method 3), predicting rates from resident and facility characteristics, ranked facilities differently than these two simple methods. The change-in-ranks method identified a very different set of “top ten” facilities. All methods showed a continuum of use, with no clear distinction between facilities with higher use.
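Methods 1 and 2 differ only in the denominator: events per resident versus events per year of resident follow-up. The sketch below uses invented numbers for three hypothetical facilities. Unlike in the survey data, where the two methods selected the same “top ten”, the toy values are chosen so that the rankings differ, to show that agreement is not guaranteed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Facility:
    name: str
    events: int          # potentially avoidable acute hospitalisations
    residents: int       # residents surveyed
    person_years: float  # total resident follow-up time


def rank_by(facilities: List[Facility],
            key: Callable[[Facility], float]) -> List[str]:
    """Facility names ordered from highest to lowest value of `key`."""
    return [f.name for f in sorted(facilities, key=key, reverse=True)]


# Hypothetical data; not taken from the OPAL survey.
facilities = [
    Facility("A", events=30, residents=60, person_years=100.0),
    Facility("B", events=20, residents=50, person_years=40.0),
    Facility("C", events=10, residents=40, person_years=50.0),
]

# Method 1: simple event count per person.
by_count = rank_by(facilities, lambda f: f.events / f.residents)
# Method 2: event rate per year of resident follow-up.
by_rate = rank_by(facilities, lambda f: f.events / f.person_years)

print(by_count)  # ['A', 'B', 'C']
print(by_rate)   # ['B', 'A', 'C']
```

Facility B moves to the top under Method 2 because its residents contributed little follow-up time, for example through high turnover, which a per-person count does not capture.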
Choice of selection method should depend upon the purpose of selection. To monitor performance during a period of change, a recent simple rate, count per resident, or even count per bed, may suffice. To find high-use facilities regardless of resident needs, recent history of admissions is highly predictive. To target a few high-use facilities that have high rates after considering facility and resident characteristics, model residuals or a large increase in rank may be preferable.
Long-term care; Risk assessment; Hospitalization; Health services for the aged; facility selection; Research design
Venous thromboembolism (VTE) is a common, heritable disease resulting in high rates of hospitalization and mortality. Yet few associations between VTE and genetic variants, all in the coagulation pathway, have been established. To identify additional genetic determinants of VTE, we conducted a 2-stage genome-wide association study (GWAS) among individuals of European ancestry in the extended CHARGE VTE consortium. The discovery GWAS comprised 1,618 incident VTE cases out of 44,499 participants from six community-based studies. Genotypes for genome-wide single-nucleotide polymorphisms (SNPs) were imputed to ~2.5 million SNPs in HapMap and association with VTE was assessed using study-design-appropriate regression methods. Meta-analysis of these results identified two known loci, in F5 and ABO. The top 1,047 tag SNPs (p≤0.0016) from the discovery GWAS were tested for association in an additional 3,231 cases and 3,536 controls from three case-control studies. In the combined data from these two stages, additional genome-wide significant associations were observed on 4q35 at F11 (top SNP rs4253399, intronic to F11) and on 4q28 at FGG (rs6536024, 9.7 kb from FGG) (p<5.0×10−13 for both). The associations at the FGG locus were not completely explained by previously reported variants. Loci at or near SUSD1 and OTUD7A showed borderline yet novel associations (p<5.0×10−6) and constitute new candidate genes. In conclusion, this large GWAS replicated key genetic associations in F5 and ABO, and confirmed the importance of F11 and FGG loci for VTE. Future studies are warranted to better characterize the associations with F11 and FGG and to replicate the new candidate associations.
venous thrombosis; genetics; genome-wide association; genetic epidemiology
Genome-wide association studies (GWAS) have identified numerous loci influencing cross-sectional lung function, but less is known about genes influencing longitudinal change in lung function.
We performed GWAS of the rate of change in forced expiratory volume in the first second (FEV1) in 14 longitudinal, population-based cohort studies comprising 27,249 adults of European ancestry using linear mixed effects models, and combined cohort-specific results using fixed-effect meta-analysis to identify novel genetic loci associated with longitudinal change in lung function. Gene expression analyses were subsequently performed for identified genetic loci. As a secondary aim, we estimated the mean rate of decline in FEV1 by smoking pattern, irrespective of genotypes, across these 14 studies using meta-analysis.
The overall meta-analysis produced suggestive evidence for association at the novel IL16/STARD5/TMC3 locus on chromosome 15 (P = 5.71 × 10−7). In addition, meta-analysis using the five cohorts with ≥3 FEV1 measurements per participant identified the novel ME3 locus on chromosome 11 (P = 2.18 × 10−8) at genome-wide significance. Neither locus was associated with FEV1 decline in two additional cohort studies. We confirmed gene expression of IL16, STARD5, and ME3 in multiple lung tissues. Publicly available microarray data confirmed differential expression of all three genes in lung samples from COPD patients compared with controls. Irrespective of genotypes, the combined estimate for FEV1 decline was 26.9, 29.2 and 35.7 mL/year in never, former, and persistent smokers, respectively.
In this large-scale GWAS, we identified two novel genetic loci in association with the rate of change in FEV1 that harbor candidate genes with biologically plausible functional links to lung function.
Stroke, the leading neurologic cause of death and disability, has a substantial genetic component. We previously conducted a genome-wide association study (GWAS) in four prospective studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and demonstrated that sequence variants near the NINJ2 gene are associated with incident ischemic stroke. Here, we sought to fine-map functional variants in the region and evaluate the contribution of rare variants to ischemic stroke risk.
Methods and Results
We sequenced 196 kb around NINJ2 on chromosome 12p13 among 3,986 European ancestry participants, including 475 ischemic stroke cases, from the Atherosclerosis Risk in Communities Study, Cardiovascular Health Study, and Framingham Heart Study. Meta-analyses of single-variant tests for 425 common variants (minor allele frequency [MAF] ≥ 1%) confirmed the original GWAS results and identified an independent intronic variant, rs34166160 (MAF = 0.012), most significantly associated with incident ischemic stroke (HR = 1.80, p = 0.0003). Aggregating 278 putatively-functional variants with MAF≤ 1% using count statistics, we observed a nominally statistically significant association, with the burden of rare NINJ2 variants contributing to decreased ischemic stroke incidence (HR = 0.81; p = 0.026).
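The aggregation of rare variants "using count statistics" can be sketched as a per-person burden score: the number of minor alleles carried across all variants below a frequency cutoff. This is a simplified illustration. The exact count statistic, the functional filtering, and the downstream survival model used by the authors are not given in the text, and the function names and toy genotypes are hypothetical.

```python
from typing import List, Sequence


def minor_allele_freq(allele_counts: Sequence[int]) -> float:
    """MAF for one variant, given each person's minor-allele count (0, 1, 2)."""
    return sum(allele_counts) / (2 * len(allele_counts))


def burden_counts(genotypes: Sequence[Sequence[int]],
                  maf_cutoff: float = 0.01) -> List[int]:
    """Per-person count of minor alleles over variants with MAF <= cutoff.

    genotypes[i][j] is the minor-allele count of person i at variant j.
    """
    n_variants = len(genotypes[0])
    rare = [
        j for j in range(n_variants)
        if minor_allele_freq([row[j] for row in genotypes]) <= maf_cutoff
    ]
    return [sum(row[j] for j in rare) for row in genotypes]


# Toy data: 4 people, 3 variants. With so few people a 1% cutoff cannot be
# met by any observed variant, so a 20% cutoff is used for illustration.
toy = [[0, 1, 0], [1, 1, 0], [0, 2, 0], [0, 0, 1]]
print(burden_counts(toy, maf_cutoff=0.2))  # [0, 1, 0, 1]
```

The resulting score would then enter a regression model as a single predictor, so its coefficient summarizes the average direction of effect across the aggregated variants.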
Common and rare variants in the NINJ2 region were nominally associated with incident ischemic stroke among a subset of CHARGE participants. Allelic heterogeneity at this locus, caused by multiple rare, low frequency, and common variants with disparate effects on risk, may explain the difficulties in replicating the original GWAS results. Additional studies that take into account the complex allelic architecture at this locus are needed to confirm these findings.
We describe a methodology for assigning individual estimates of long-term average air pollution concentrations that accounts for a complex spatio-temporal correlation structure and can accommodate spatio-temporally misaligned observations. This methodology has been developed as part of the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air), a prospective cohort study funded by the U.S. EPA to investigate the relationship between chronic exposure to air pollution and cardiovascular disease. Our hierarchical model decomposes the space-time field into a “mean” that includes dependence on covariates and spatially varying seasonal and long-term trends and a “residual” that accounts for spatially correlated deviations from the mean model. The model accommodates complex spatio-temporal patterns by characterizing the temporal trend at each location as a linear combination of empirically derived temporal basis functions, and embedding the spatial fields of coefficients for the basis functions in separate linear regression models with spatially correlated residuals (universal kriging). This approach allows us to implement a scalable single-stage estimation procedure that easily accommodates a significant number of missing observations at some monitoring locations. We apply the model to predict long-term average concentrations of oxides of nitrogen (NOx) from 2005–2007 in the Los Angeles area, based on data from 18 EPA Air Quality System regulatory monitors. The cross-validated R2 is 0.67. The MESA Air study is also collecting additional concentration data as part of a supplementary monitoring campaign. We describe the sampling plan and demonstrate in a simulation study that the additional data will contribute to improved predictions of long-term average concentrations.
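The decomposition described above can be written schematically as follows. The notation is an assumption introduced here for illustration, since the abstract gives no symbols: $y(s,t)$ is the concentration at location $s$ and time $t$, $f_i(t)$ are the empirically derived temporal basis functions, $\beta_i$ is the spatial field of coefficients for the $i$-th basis function, $X_i$ holds the spatial covariates for that field, and $\Sigma_i(\theta_i)$ and $\Sigma_\nu(\theta_\nu)$ are covariance matrices that encode spatial correlation.

```latex
\begin{align}
  y(s,t)   &= \mu(s,t) + \nu(s,t), \\
  \mu(s,t) &= \sum_{i=1}^{m} \beta_i(s)\, f_i(t), \\
  \beta_i  &\sim \mathcal{N}\bigl(X_i \alpha_i,\; \Sigma_i(\theta_i)\bigr),
             \qquad i = 1, \dots, m, \\
  \nu(\cdot, t) &\sim \mathcal{N}\bigl(0,\; \Sigma_\nu(\theta_\nu)\bigr).
\end{align}
```

Under this sketch, the third line is the universal kriging step: each coefficient field has a regression mean $X_i \alpha_i$ and spatially correlated residuals. Because the model is specified for the full space-time field, a monitor with missing observations simply contributes fewer terms to the likelihood.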
Air Pollution; Exposure Assessment; Hierarchical Modeling; Spatio-Temporal Modeling; Maximum Likelihood; Universal Kriging
With white blood cell count emerging as an important risk factor for chronic inflammatory diseases, genetic associations of differential leukocyte types, specifically monocyte count, are providing novel candidate genes and pathways to further investigate. Circulating monocytes play a critical role in vascular diseases, for example in the formation of atherosclerotic plaque. We performed joint and ancestry-stratified genome-wide association analyses to identify variants specifically associated with monocyte count in 11 014 subjects in the electronic Medical Records and Genomics Network. In the joint and European ancestry samples, we identified novel associations in the chromosome 16 interferon regulatory factor 8 (IRF8) gene (P-value = 2.78×10−16, β = −0.22). Other monocyte associations include novel missense variants in the chemokine-binding protein 2 (CCBP2) gene (P-value = 1.88×10−7, β = 0.30) and a region of replication found in ribophorin I (RPN1) (P-value = 2.63×10−16, β = −0.23) on chromosome 3. The CCBP2 and RPN1 region is located near the GATA binding protein 2 gene, which has previously been shown to be associated with coronary heart disease. On chromosome 9, we found a novel association in the prostaglandin reductase 1 gene (P-value = 2.29×10−7, β = 0.16), which is downstream from lysophosphatidic acid receptor 1. This region has previously been shown to be associated with monocyte count. We also replicated monocyte associations of genome-wide significance (P-value = 5.68×10−17, β = −0.23) at the integrin, alpha 4 gene on chromosome 2. The novel IRF8 results and further replications provide supporting evidence of genetic regions associated with monocyte count.
Identifying the downstream effects of disease-associated single nucleotide polymorphisms (SNPs) is challenging: the causal gene is often unknown or it is unclear how the SNP affects the causal gene, making it difficult to design experiments that reveal functional consequences. To help overcome this problem, we performed the largest expression quantitative trait locus (eQTL) meta-analysis so far reported in non-transformed peripheral blood samples of 5,311 individuals, with replication in 2,775 individuals. We identified and replicated trans-eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Although we did not study specific patient cohorts, we identified trait-associated SNPs that affect multiple trans-genes that are known to be markedly altered in patients: for example, systemic lupus erythematosus (SLE) SNP rs4917014 [1] altered C1QB and five type 1 interferon response genes, both hallmarks of SLE [2-4]. Subsequent ChIP-seq data analysis on these trans-genes implicated transcription factor IKZF1 as the causal gene at this locus, with DeepSAGE RNA-sequencing revealing that rs4917014 strongly alters 3’ UTR levels of IKZF1. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.
Maternal pre-pregnancy body-mass index (ppBMI) and gestational weight gain (GWG) are associated with cardiometabolic risk (CMR) traits in the offspring. The extent to which maternal genetic variation accounts for these associations is unknown.
In 1249 mother-offspring pairs recruited from the Jerusalem Perinatal Study, we used archival data to characterize ppBMI and GWG and follow-up data from offspring to assess CMR, including body mass index (BMI), waist circumference (WC), glucose, insulin, blood pressure, and lipid levels, at an average age of 32 years. Maternal genetic risk scores (GRS) were created using a subset of SNPs most predictive of ppBMI, GWG, and each CMR trait, selected among 1384 single-nucleotide polymorphisms (SNPs) characterizing variation in 170 candidate genes potentially related to fetal development and/or metabolic risk. We fit linear regression models to examine the associations of ppBMI and GWG with CMR traits with and without adjustment for GRS. Compared to unadjusted models, the coefficient for the association of a one-standard-deviation (SD) difference in GWG with offspring BMI decreased by 41% (95% CI −81%, −11%) from 0.847 to 0.503, and the coefficient for a 1-SD difference in GWG with WC decreased by 63% (95% CI −318%, −11%) from 1.196 to 0.443. For other traits, there were no statistically significant changes in the coefficients for GWG with adjustment for GRS. None of the associations of ppBMI with CMR traits were significantly altered by adjustment for GRS.
Maternal genetic variation may account in part for associations of GWG with offspring BMI and WC in young adults.
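The attenuation percentages reported for GWG can be checked from the coefficients alone. The following is a minimal sketch of that arithmetic; the function name `percent_change` is illustrative and not taken from the study, and the confidence intervals are not reproduced because they depend on data not given here.

```python
def percent_change(unadjusted: float, adjusted: float) -> float:
    """Relative change, in percent, of a coefficient after adjustment."""
    return (adjusted - unadjusted) / unadjusted * 100.0


# Coefficients per 1-SD difference in GWG, as reported in the abstract:
# unadjusted model first, GRS-adjusted model second.
bmi_change = percent_change(0.847, 0.503)
wc_change = percent_change(1.196, 0.443)

print(f"Offspring BMI coefficient change: {bmi_change:.0f}%")  # -41%
print(f"Offspring WC coefficient change: {wc_change:.0f}%")    # -63%
```

A negative value indicates attenuation of the association after adjustment for the maternal genetic risk score.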
Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased due to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias.
Study Design and Setting
We applied two such methods, imputation and reweighting, to Group Health administrative data (full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (validation sample). We used influenza vaccination effectiveness (with an unexposed comparator group) as an example and evaluated each method’s ability to reduce bias using the control time period prior to influenza circulation.
Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not utilize the validation sample confounders.
Although these results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from healthcare database studies, they also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and large enough validation sample, the comparability of the full sample and validation sample, and the accuracy with which data can be imputed or reweighted using the additional validation sample information.
aged; bias (epidemiologic); comparative effectiveness research; confounding factors (epidemiology); influenza vaccines; propensity score
Summary: GWASTools is an R/Bioconductor package for quality control and analysis of genome-wide association studies (GWAS). GWASTools brings the interactive capability and extensive statistical libraries of R to GWAS. Data are stored in NetCDF format to accommodate extremely large datasets that cannot fit within R’s memory limits. The documentation includes instructions for converting data from multiple formats, including variants called from sequencing. GWASTools provides a convenient interface for linking genotypes and intensity data with sample and single nucleotide polymorphism annotation.
Availability and implementation: GWASTools is implemented in R and is available from Bioconductor (http://www.bioconductor.org). An extensive vignette detailing a recommended workflow is included.
An analysis of a case-control study of rhabdomyolysis was conducted to screen for previously unrecognized CYP2C8 inhibitors that may cause other clinically important drug-drug interactions. Cases of rhabdomyolysis using cerivastatin (n=72) were compared with controls using atorvastatin (n=287) between 1998 and 2001. The use of clopidogrel (OR 29.6; 95% CI, 6.1–143) was strongly associated with rhabdomyolysis. In a replication effort that used the FDA Adverse Event Reporting System (AERS), clopidogrel was used more commonly by rhabdomyolysis cases using cerivastatin (17%) than by rhabdomyolysis cases using atorvastatin (0%; OR infinity; 95% CI, 5.2–infinity). Several medications were tested in vitro for their potential to cause drug-drug interactions. Clopidogrel, rosiglitazone and montelukast were the most potent inhibitors of cerivastatin metabolism. Clopidogrel and its metabolites also inhibited cerivastatin metabolism in human hepatocytes. These epidemiological and in vitro findings suggest that clopidogrel may cause clinically important, dose-dependent drug-drug interactions with other medications metabolized by CYP2C8.
rhabdomyolysis; statins; clopidogrel; adverse drug reaction; drug-drug interaction prediction; 2-oxo-clopidogrel; acyl glucuronide
The Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) was initiated in 2004 to investigate the relation between individual-level estimates of long-term air pollution exposure and the progression of subclinical atherosclerosis and the incidence of cardiovascular disease (CVD). MESA Air builds on a multicenter, community-based US study of CVD, supplementing that study with additional participants, outcome measurements, and state-of-the-art air pollution exposure assessments of fine particulate matter, oxides of nitrogen, and black carbon. More than 7,000 participants aged 45–84 years are being followed for over 10 years for the identification and characterization of CVD events, including acute myocardial infarction and other coronary artery disease, stroke, peripheral artery disease, and congestive heart failure; cardiac procedures; and mortality. Subcohorts undergo baseline and follow-up measurements of coronary artery calcium using computed tomography and carotid artery intima-medial wall thickness using ultrasonography. This cohort provides vast exposure heterogeneity in ranges currently experienced and permitted in most developed nations, and the air monitoring and modeling methods employed will provide individual estimates of exposure that incorporate residence-specific infiltration characteristics and participant-specific time-activity patterns. The overarching study aim is to understand and reduce uncertainty in health effect estimation regarding long-term exposure to air pollution and CVD.
air pollution; atherosclerosis; cardiovascular diseases; environmental exposure; epidemiologic methods; particulate matter