Results 1-15 (15)
1.  Gene expression network analyses in response to air pollution exposures in the trucking industry 
Environmental Health  2016;15:101.
Exposure to air pollution, including traffic-related pollutants, has been associated with a variety of adverse health outcomes, including increased cardiopulmonary morbidity and mortality, and increased lung cancer risk.
To better understand the cellular responses induced by air pollution exposures, we performed genome-wide gene expression microarray analysis using whole blood RNA sampled at three time-points across the work weeks of 63 non-smoking employees at 10 trucking terminals in the northeastern US. We defined genes and gene networks that were differentially activated in response to PM2.5 (particulate matter ≤ 2.5 microns in diameter) and elemental carbon (EC) and organic carbon (OC).
Multiple transcripts were strongly associated (padj < 0.001) with pollutant levels (48, 260, and 49 transcripts for EC, OC, and PM2.5, respectively), including 63 that were statistically significantly correlated with at least two out of the three exposures. These genes included many that have been implicated in ischemic heart disease, chronic obstructive pulmonary disease (COPD), lung cancer, and other pollution-related illnesses. Through the combination of Gene Set Enrichment Analysis and network analysis (using GeneMANIA), we identified a core set of 25 interrelated genes that were common to all three exposure measures and were differentially expressed in two previous studies assessing gene expression attributable to air pollution. Many of these are members of fundamental cancer-related pathways, including those related to DNA and metal binding and regulation of apoptosis, but they also include genes implicated in chronic heart and lung diseases.
These data provide a molecular link between the associations of air pollution exposures with health effects.
Electronic supplementary material
The online version of this article (doi:10.1186/s12940-016-0187-z) contains supplementary material, which is available to authorized users.
PMCID: PMC5093980  PMID: 27809917
Air pollution; Trucking industry; Gene expression; Network analysis
2.  Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma 
Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma.
Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease.
Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes.
Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10−6) and hospitalization (P = 0.01), respectively.
Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.
PMCID: PMC4451618  PMID: 25763605
molecular endotyping; genomic; RNA; severe asthma; pathway analysis
Nature communications  2014;5:4753.
Circadian rhythms are known to regulate immune responses in healthy animals, but it is unclear whether they persist during acute illnesses where clock gene expression is disrupted by systemic inflammation. Here, we use a genome-wide approach to investigate circadian gene and metabolite expression in the lungs of endotoxemic mice and find that novel cellular and molecular circadian rhythms are elicited in this setting. The endotoxin-specific circadian program exhibits unique features, including a divergent group of rhythmic genes and metabolites compared to the basal state and a distinct periodicity and phase distribution. At the cellular level endotoxin treatment also alters circadian rhythms of leukocyte counts within the lung in a bmal1-dependent manner, such that granulocytes rather than lymphocytes become the dominant oscillating cell type. Our results show that inflammation produces a complex reorganization of cellular and molecular circadian rhythms that are relevant to early events in lung injury.
PMCID: PMC4162491  PMID: 25208554
4.  Analyzing networks of phenotypes in complex diseases: methodology and applications in COPD 
BMC Systems Biology  2014;8:78.
The investigation of complex disease heterogeneity has been challenging. Here, we introduce a network-based approach, using partial correlations, that analyzes the relationships among multiple disease-related phenotypes.
We applied this method to two large, well-characterized studies of chronic obstructive pulmonary disease (COPD). We also examined the associations between these COPD phenotypic networks and other factors, including case-control status, disease severity, and genetic variants. Using these phenotypic networks, we have detected novel relationships between phenotypes that would not have been observed using traditional epidemiological approaches.
Phenotypic network analysis of complex diseases could provide novel insights into disease susceptibility, disease severity, and genetic mechanisms.
PMCID: PMC4105829  PMID: 24964944
Network medicine; Phenotypic networks; COPD; Genetic association analysis
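The partial-correlation idea behind the phenotypic networks in entry 4 can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the textbook first-order formula for three variables, and the data and the names `pearson` and `partial_corr` are invented for illustration. Two phenotypes `x` and `y` are both driven by a third, `z`, so they correlate strongly overall, yet the link vanishes once `z` is accounted for.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical toy data: z is a shared driver; e1 and e2 are noise terms
# chosen to be uncorrelated with z and with each other.
z = [1, 2, 3, 4, 5, 6, 7, 8]
e1 = [1, -1, -1, 1, 1, -1, -1, 1]
e2 = [1, 1, -1, -1, -1, -1, 1, 1]
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]

marginal = pearson(x, y)            # strong marginal correlation
conditional = partial_corr(x, y, z)  # essentially zero given z
```

In a network built this way, an edge would be drawn between `x` and `y` only if the partial correlation, not the marginal one, is distinguishable from zero.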
5.  Germline Variants and Advanced Colorectal Adenomas: Adenoma Prevention with Celecoxib Trial Genomewide Association Study 
Identification of single nucleotide polymorphisms (SNPs) associated with development of advanced colorectal adenomas.
Experimental Design
Discovery Phase: 1,406 Caucasian patients (139 advanced adenoma cases and 1,267 controls) from the Adenoma Prevention with Celecoxib (APC) trial were included in a genome-wide association study (GWAS) to identify variants associated with post-polypectomy disease recurrence. Genome-wide significance was defined as false discovery rate < 0.05, unadjusted p=7.4×10−7. Validation Phase: Results were further evaluated using 4,175 familial colorectal adenoma or CRC cases and 5,036 controls from patients of European ancestry (COloRectal Gene Identification consortium, Scotland, Australia and VQ58).
Our study identified eight SNPs associated with advanced adenoma risk in the APC trial (rs2837156, rs7278863, rs2837237, rs2837241, rs2837254, rs741864 at 21q22.2, and rs1381392 and rs17651822 at 3p24.1, at the p<10−7 level with odds ratio [OR] > 2). Five variants in strong pairwise linkage disequilibrium (rs7278863, rs2837237, rs741864, rs741864 and rs2837241, r2=0.8–1) are in or near the coding region for the tight junction adhesion protein, IGSF5. An additional variant associated with advanced adenomas, rs1535989 (minor allele frequency 0.11; OR 2.09; 95% confidence interval 1.50–2.91), also predicted CRC development in a validation analysis (p=0.019) using a series of adenoma cases or CRC (CORGI study) and 3 sets of CRC cases and controls (Scotland, VQ58 and Australia, N=9,211).
Our results suggest that common polymorphisms contribute to the risk of developing advanced adenomas and might also contribute to the risk of developing CRC. The variant at rs1535989 may identify patients whose risk for neoplasia warrants increased colonoscopic surveillance.
PMCID: PMC4037290  PMID: 24084763
Colorectal adenomas; colorectal cancer screening; genetic predisposition
6.  Copy number variation prevalence in known asthma genes and their impact on asthma susceptibility 
Genetic studies have identified numerous genes reproducibly associated with asthma, yet these studies have focused almost entirely on single nucleotide polymorphisms (SNPs), and virtually ignored another highly prevalent form of genetic variation: Copy Number Variants (CNVs).
To survey the prevalence of CNVs in genes previously associated with asthma, and to assess whether CNVs represent the functional asthma-susceptibility variants at these loci.
We genotyped 383 asthmatic trios participating in the Childhood Asthma Management Program (CAMP) using a comparative genomic hybridization (CGH) array designed to interrogate 20,092 CNVs. To ensure comprehensive assessment of all potential asthma candidate genes, we purposely used liberal asthma gene inclusion criteria, resulting in consideration of 270 candidate genes previously implicated in asthma. We performed statistical testing using FBAT-CNV.
Copy number variation in asthma candidate genes was prevalent, with 21% of tested genes residing near or within one of 69 CNVs. In 6 instances, the complete candidate gene sequence resides within the CNV boundaries. On average, asthmatic probands carried 6 asthma-candidate CNVs (range 1–29). However, the vast majority of identified CNVs were of rare frequency (< 5%), and were not statistically associated with asthma. Modest evidence for association with asthma was observed for 2 CNVs near NOS1 and SERPINA3. Linkage disequilibrium analysis suggests that CNV effects are unlikely to explain previously detected SNP associations with asthma.
Although a substantial proportion of asthma-susceptibility genes harbor polymorphic CNVs, the majority of these variants do not confer increased asthma risk. The lack of linkage disequilibrium (LD) between CNVs and asthma-associated SNPs suggests that these CNVs are unlikely to represent the functional variant responsible for most known asthma associations.
PMCID: PMC3609036  PMID: 23517041
7.  The Impact of Self-Identified Race on Epidemiologic Studies of Gene Expression 
Genetic epidemiology  2011;35(2):93-101.
Although population differences in gene expression have been established, the impact on differential gene expression studies in large populations is not well understood. We describe the effect of self-reported race on a gene expression study of lung function in asthma. We generated gene expression profiles for 254 young adults (205 non-Hispanic whites and 49 African Americans) with asthma on whom concurrent total RNA derived from peripheral blood CD4+ lymphocytes and lung function measurements were obtained. We identified four principal components that explained 62% of the variance in gene expression. The dominant principal component, which explained 29% of the total variance in gene expression, was strongly associated with self-identified race (P<10−16). The impact of these racial differences was observed when we performed differential gene expression analysis of lung function. Using multivariate linear models, we tested whether gene expression was associated with a quantitative measure of lung function: pre-bronchodilator forced expiratory volume in one second (FEV1). Though unadjusted linear models of FEV1 identified several genes strongly correlated with lung function, these correlations were due to racial differences in the distribution of both FEV1 and gene expression, and were no longer statistically significant following adjustment for self-identified race. These results suggest that self-identified race is a critical confounding covariate in epidemiologic studies of gene expression and that, similar to genetic studies, careful consideration of self-identified race in gene expression profiling studies is needed to avoid spurious association.
PMCID: PMC3718033  PMID: 21254216
ancestry; gene expression; population stratification; self-identified race
8.  Copy number variation genotyping using family information 
BMC Bioinformatics  2013;14:157.
In recent years there has been a growing interest in the role of copy number variations (CNVs) in genetic diseases. Though there has been rapid development of technologies and statistical methods devoted to the detection of CNVs from array data, the data quality issues inherent in most hybridization techniques remain a challenging problem in CNV association studies.
To help address these data quality issues in the context of family-based association studies, we introduce a statistical framework for intensity-based array data that takes family information into account for copy-number assignment. The method is an adaptation of traditional methods for modeling SNP genotype data that assume a Gaussian mixture model, whereby CNV calling is performed for all family members simultaneously, leveraging within-family data to reduce CNV calls that are incompatible with Mendelian inheritance while still allowing de novo CNVs. Applying this method to simulation studies and a genome-wide association study in asthma, we find that our approach significantly improves the accuracy of CNV calls and reduces Mendelian inconsistency rates and false positive genotype calls. The results were validated using qPCR experiments.
In conclusion, we have demonstrated that the use of family information can improve the quality of CNV calling and may yield more powerful association tests of CNVs.
PMCID: PMC3668900  PMID: 23656838
9.  On the genome-wide analysis of copy number variants in family-based designs: Methods for combining family-based and population based information for testing dichotomous or quantitative traits, or completely ascertained samples 
Genetic Epidemiology  2010;34(6):582-590.
We propose a new approach for the analysis of copy number variants (CNVs) for genome-wide association studies in family-based designs. Our new overall association test combines the between-family component and the within-family component of the data so that the new test statistic is fully efficient and, at the same time, achieves the same complete robustness against population admixture and stratification as classical family-based association tests that are based only on the within-family component. Although all data are incorporated into the test statistic, an adjustment for genetic confounding is not needed, not even for the between-family component. The new test statistic is valid for testing either quantitative or dichotomous phenotypes. If external CNV data are available, the approach can also be used in completely ascertained samples. Similar to the approach by Ionita-Laza et al. (1), the proposed test statistic does not require a CNV-calling algorithm and is based directly on the CNV probe intensity data. We show, via simulation studies, that our methodology increases the power of the FBAT statistic to levels comparable to those of population-based designs. The advantages of the approach in practice are demonstrated by an application to a genome-wide association study for body mass index (BMI).
PMCID: PMC3349936  PMID: 20718041
10.  Mapping of numerous disease-associated expression polymorphisms in primary peripheral blood CD4+ lymphocytes 
Human Molecular Genetics  2010;19(23):4745-4757.
Genome-wide association studies of human gene expression promise to identify functional regulatory genetic variation that contributes to phenotypic diversity. However, it is unclear how useful this approach will be for the identification of disease-susceptibility variants. We generated gene expression profiles for 22 184 mRNA transcripts using RNA derived from peripheral blood CD4+ lymphocytes, and genome-wide genotype data for 516 512 autosomal markers in 200 subjects. We screened for cis-acting variants by testing variants mapping within 50 kb of expressed transcripts for association with transcript abundance using generalized linear models. Significant associations were identified for 1585 genes at a false discovery rate of 0.05 (corresponding to P-values ranging from 1 × 10−91 to 7 × 10−4). Importantly, we identified evidence of regulatory variation for 119 previously mapped disease genes, including 24 examples where the variant with the strongest evidence of disease-association demonstrates strong association with specific transcript abundance. The prevalence of cis-acting variants among disease-associated genes was 63% higher than the genome-wide rate in our data set (P = 6.41 × 10−6), and although many of the implicated loci were associated with immune-related diseases (including asthma, connective tissue disorders and inflammatory bowel disease), associations with genes implicated in non-immune-related diseases including lipid profiles, anthropometric measurements, cancer and neurologic disease were also observed. Genetic variants that confer inter-individual differences in gene expression represent an important subset of variants that contribute to disease susceptibility. Population-based integrative genetic approaches can help identify such variation and enhance our understanding of the genetic basis of complex traits.
PMCID: PMC2972694  PMID: 20833654
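Entry 10 reports associations "at a false discovery rate of 0.05" without naming the procedure. As a sketch only, and assuming the widely used Benjamini-Hochberg step-up procedure (an assumption, not something the abstract states), thresholding a list of P-values at a given FDR could look like this. The function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the (sorted) indices of hypotheses rejected at FDR level alpha.

    Step-up rule: find the largest rank k such that the k-th smallest
    P-value is <= alpha * k / m, then reject the k smallest P-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes
    return sorted(order[:k])

# Hypothetical P-values for five transcript-variant tests.
example = [0.039, 0.001, 0.216, 0.008, 0.06]
rejected = benjamini_hochberg(example, alpha=0.05)
```

Because the rule is step-up, a P-value that fails its own threshold can still be rejected when a larger P-value further down the sorted list passes its threshold.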
11.  Genetic Influences on Asthma Susceptibility in the Developing Lung 
Asthma is the leading serious pediatric chronic illness in the United States, affecting 7.1 million children. The prevalence of asthma in children under 4 years of age has increased dramatically in the last 2 decades. Existing evidence suggests that this increase in prevalence derives from early environmental exposures acting on a pre-existing asthma-susceptible genotype. We studied the origins of asthma susceptibility in developing lung in rat strains that model the distinct phenotypes of airway hyperresponsiveness (Fisher rats) and atopy (brown Norway [BN] rats). Postnatal BN rat lungs showed increased epithelial proliferation and tracheal goblet cell hyperplasia. Fisher pups showed increased lung resistance at age 2 weeks, with elevated neutrophils throughout the postnatal period. Diverse transcriptomic signatures characterized the distinct respiratory phenotypes of developing lung in both rat models. Linear regression across age and strain identified developmental variation in expression of 1,376 genes, and confirmed both strain and temporal regulation of lung gene expression. Biological processes that were heavily represented included growth and development (including the T Box 5 transcription factor [Tbx5], the epidermal growth factor receptor [Egfr], the transforming growth factor beta-1-induced transcript 1 [Tgfbr1i1]), extracellular matrix and cell adhesion (including collagen and integrin genes), and immune function (including lymphocyte antigen 6 [Ly6] subunits, IL-17b, Toll-interacting protein, and Ficolin B). Genes validated by quantitative RT-PCR and protein analysis included collagen III alpha 1 (Col3a1), Ly6b, glucocorticoid receptor and Importin-13 (specific to the BN rat lung), and Serpina1 and Ficolin B (specific to the Fisher lung). Innate differences in patterns of gene expression in developing lung that contribute to individual variation in respiratory phenotype are likely to contribute to the pathogenesis of asthma.
PMCID: PMC3159089  PMID: 20118217
asthma susceptibility; lung development; developmental gene expression
12.  Genome Wide Association Study to predict severe asthma exacerbations in children using random forests classifiers 
BMC Medical Genetics  2011;12:90.
Personalized health-care promises tailored health-care solutions to individual patients based on their genetic background and/or environmental exposure history. To date, disease prediction has been based on a few environmental factors and/or single nucleotide polymorphisms (SNPs), while complex diseases are usually affected by many genetic and environmental factors with each factor contributing a small portion to the outcome. We hypothesized that the use of random forests classifiers to select SNPs would result in an improved predictive model of asthma exacerbations. We tested this hypothesis in a population of childhood asthmatics.
In this study, using emergency room visits or hospitalizations as the definition of a severe asthma exacerbation, we first identified a list of top Genome Wide Association Study (GWAS) SNPs ranked by Random Forests (RF) importance score for the CAMP (Childhood Asthma Management Program) population of 127 exacerbation cases and 290 non-exacerbation controls. We predict severe asthma exacerbations using the top 10 to 320 SNPs together with age, sex, pre-bronchodilator FEV1 percentage predicted, and treatment group.
Testing in an independent set of the CAMP population shows that severe asthma exacerbations can be predicted with an Area Under the Curve (AUC) = 0.66 with 160-320 SNPs in comparison to an AUC score of 0.57 with 10 SNPs. Using the clinical traits alone yielded an AUC score of 0.54, suggesting the phenotype is affected by genetic as well as environmental factors.
Our study shows that a random forests algorithm can effectively extract and use the information contained in a small number of samples. Random forests, and other machine learning tools, can be used with GWAS studies to integrate large numbers of predictors simultaneously.
PMCID: PMC3148549  PMID: 21718536
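The AUC values quoted in entry 12 (0.54, 0.57, 0.66) have a simple interpretation: the probability that a randomly chosen exacerbation case receives a higher predicted score than a randomly chosen control. The following sketch computes AUC directly from that pairwise definition. It is illustrative only; the scores and labels are made up, and the study's own classifier and evaluation code are not shown in the abstract.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cases, 0 for controls. Ties between a case and a
    control count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for three cases and three controls.
scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]
example_auc = auc(scores, labels)  # 7 of 9 case-control pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking, which is why 0.54 for clinical traits alone indicates little discrimination and 0.66 a modest improvement.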
13.  Quantifying differential gene connectivity between disease states for objective identification of disease-relevant genes 
BMC Systems Biology  2011;5:89.
Network modeling of whole transcriptome expression data enables characterization of complex epistatic (gene-gene) interactions that underlie cellular functions. Though numerous methods have been proposed and successfully implemented to develop these networks, there are no formal methods for comparing differences in network connectivity patterns as a function of phenotypic trait.
Here we describe a novel approach for quantifying the differences in gene-gene connectivity patterns across disease states based on Graphical Gaussian Models (GGMs). We compare the posterior probabilities of connectivity for each gene pair across two disease states, expressed as a posterior odds-ratio (postOR) for each pair, which can be used to identify network components most relevant to disease status. The method can also be generalized to model differential gene connectivity patterns within previously defined gene sets, gene networks and pathways. We demonstrate that the GGM method reliably detects differences in network connectivity patterns in datasets of varying sample size. Applying this method to two independent breast cancer expression data sets, we identified numerous reproducible differences in network connectivity across histological grades of breast cancer, including several published gene sets and pathways. Most notably, our model identified two gene hubs (MMP12 and CXCL13) that each exhibited differential connectivity to more than 30 transcripts in both datasets. Both genes have been previously implicated in breast cancer pathobiology, but are not themselves differentially expressed by histologic grade in either dataset, and would thus not have been identified using traditional differential gene expression testing approaches. In addition, 16 curated gene sets demonstrated significant differential connectivity in both data sets, including the matrix metalloproteinases, PPAR alpha sequence targets, and the PUFA synthesis pathway.
Our results suggest that GGM can be used to formally evaluate differences in global interactome connectivity across disease states, and can serve as a powerful tool for exploring the molecular events that contribute to disease at a systems level.
PMCID: PMC3128864  PMID: 21627793
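The posterior odds-ratio (postOR) described in entry 13 compares, for one gene pair, the posterior probability of an edge in each of two disease states. The abstract does not give the exact formula, so the sketch below assumes the conventional odds ratio of two probabilities; the clipping constant `eps`, the function name, and the gene-pair labels are all hypothetical.

```python
import math

def posterior_odds_ratio(p_state_a, p_state_b, eps=1e-6):
    """Odds ratio of edge posterior probabilities between two states.

    Probabilities are clipped to [eps, 1 - eps] so the odds stay finite.
    Values far from 1 (in either direction) flag differential connectivity.
    """
    pa = min(max(p_state_a, eps), 1 - eps)
    pb = min(max(p_state_b, eps), 1 - eps)
    return (pa / (1 - pa)) / (pb / (1 - pb))

# Hypothetical edge posteriors (state A, state B) for three gene pairs.
edges = {
    ("geneA", "geneB"): (0.9, 0.2),
    ("geneA", "geneC"): (0.5, 0.5),
    ("geneB", "geneC"): (0.1, 0.6),
}

# Rank pairs by how far the log odds ratio is from zero.
ranked = sorted(
    edges,
    key=lambda pair: abs(math.log(posterior_odds_ratio(*edges[pair]))),
    reverse=True,
)
```

Ranking on the absolute log scale treats gains and losses of connectivity symmetrically, so a pair that is connected only in state B scores as highly as one connected only in state A.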
14.  The CD4+ T-cell transcriptome and serum IgE in asthma: IL17RB and the role of sex 
The relationships between total serum IgE levels and gene expression patterns in peripheral blood CD4+ T cells (in all subjects and within each sex specifically) are not known.
Peripheral blood CD4+ T cells were obtained from 223 participants from the Childhood Asthma Management Program (CAMP), with simultaneous measurement of IgE. Total RNA was isolated, and expression profiles were generated with Illumina HumanRef8 v2 BeadChip arrays. Modeling of the relationship between genome-wide gene transcript levels and IgE levels was performed in all subjects, and stratified by sex.
Among all subjects, significant evidence for association between gene transcript abundance and IgE was identified for a single gene, the interleukin 17 receptor B (IL17RB), explaining 12% of the variance (r2) in IgE measurement (p value = 7 × 10−7, 9 × 10−3 after adjustment for multiple testing). Sex-stratified analyses revealed that the correlation between IL17RB and IgE was restricted to males (r2 = 0.19, p value = 8 × 10−8; test for sex-interaction p < 0.05). Significant correlation between gene transcript abundance and IgE level was not found in females. Additionally, we demonstrated substantial sex-specific differences in IgE when considering multi-gene models, and in canonical pathway analyses of IgE level.
Our results indicate that IL17RB may be the only gene expressed in CD4+ T cells whose transcript measurement is correlated with the variation in IgE level in asthmatics. These results provide further evidence that sex may play a role in the genomic regulation of IgE.
PMCID: PMC3080837  PMID: 21473777
15.  A graphical model approach for inferring large-scale networks integrating gene expression and genetic polymorphism 
BMC Systems Biology  2009;3:55.
Graphical models (e.g., Bayesian networks) have been used frequently to describe complex interaction patterns and dependent structures among genes and other phenotypes. Estimation of such networks has been a challenging problem when the genes considered greatly outnumber the samples, and the situation is exacerbated when one wishes to consider the impact of polymorphisms (SNPs) in genes.
Here we describe a multistep approach to infer a gene-SNP network from gene expression and genotyped SNP data. Our approach is based on 1) construction of a graphical Gaussian model (GGM) based on small sample estimation of partial correlation and false-discovery rate multiple testing; 2) extraction of a subnetwork of genes directly linked to a target candidate gene of interest; 3) identification of cis-acting regulatory variants for the genes composing the subnetwork; and 4) evaluation of the identified cis-acting variants for trans-acting regulatory effects of the target candidate gene. This approach identifies significant gene-gene and gene-SNP associations not solely on the basis of gene co-expression but rather through whole-network modeling. We demonstrate the method by building two complex gene-SNP networks around Interleukin 12 Receptor Beta 2 (IL12RB2) and Interleukin 1B (IL1B), two biologic candidates in asthma pathogenesis, using 534,290 genotyped variants and gene expression data on 22,177 genes from total RNA derived from peripheral blood CD4+ lymphocytes from 154 asthmatics.
Our results suggest that graphical models based on integrative genomic data are computationally efficient, work well with small samples, and can describe complex interactions among genes and polymorphisms that could not be identified by pair-wise association testing.
PMCID: PMC2694152  PMID: 19473523