Results 1-7 (7)

1.  On the genome-wide analysis of copy number variants in family-based designs: Methods for combining family-based and population based information for testing dichotomous or quantitative traits, or completely ascertained samples 
Genetic Epidemiology  2010;34(6):582-590.
We propose a new approach for the analysis of copy number variants (CNVs) for genome-wide association studies in family-based designs. Our new overall association test combines the between-family component and the within-family component of the data so that the new test statistic is fully efficient and, at the same time, achieves the same complete robustness against population admixture and stratification as classical family-based association tests, which are based only on the within-family component. Although all data are incorporated into the test statistic, an adjustment for genetic confounding is not needed, not even for the between-family component. The new test statistic is valid for testing either quantitative or dichotomous phenotypes. If external CNV data are available, the approach can also be used in completely ascertained samples. Similar to the approach by Ionita-Laza et al. (1), the proposed test statistic does not require a CNV-calling algorithm and is based directly on the CNV probe intensity data. We show, via simulation studies, that our methodology increases the power of the FBAT statistic to levels comparable to those of population-based designs. The advantages of the approach in practice are demonstrated by an application to a genome-wide association study for body mass index (BMI).
doi:10.1002/gepi.20515
PMCID: PMC3349936  PMID: 20718041
2.  Mapping of numerous disease-associated expression polymorphisms in primary peripheral blood CD4+ lymphocytes 
Human Molecular Genetics  2010;19(23):4745-4757.
Genome-wide association studies of human gene expression promise to identify functional regulatory genetic variation that contributes to phenotypic diversity. However, it is unclear how useful this approach will be for the identification of disease-susceptibility variants. We generated gene expression profiles for 22 184 mRNA transcripts using RNA derived from peripheral blood CD4+ lymphocytes, and genome-wide genotype data for 516 512 autosomal markers in 200 subjects. We screened for cis-acting variants by testing variants mapping within 50 kb of expressed transcripts for association with transcript abundance using generalized linear models. Significant associations were identified for 1585 genes at a false discovery rate of 0.05 (corresponding to P-values ranging from 1 × 10⁻⁹¹ to 7 × 10⁻⁴). Importantly, we identified evidence of regulatory variation for 119 previously mapped disease genes, including 24 examples where the variant with the strongest evidence of disease association demonstrates strong association with specific transcript abundance. The prevalence of cis-acting variants among disease-associated genes was 63% higher than the genome-wide rate in our data set (P = 6.41 × 10⁻⁶). Although many of the implicated loci were associated with immune-related diseases (including asthma, connective tissue disorders and inflammatory bowel disease), associations with genes implicated in non-immune-related traits and diseases, including lipid profiles, anthropometric measurements, cancer and neurologic disease, were also observed. Genetic variants that confer inter-individual differences in gene expression represent an important subset of variants that contribute to disease susceptibility. Population-based integrative genetic approaches can help identify such variation and enhance our understanding of the genetic basis of complex traits.
doi:10.1093/hmg/ddq392
PMCID: PMC2972694  PMID: 20833654
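The abstract of entry 2 reports associations at a false discovery rate of 0.05 but does not say which FDR procedure was used. As a minimal sketch, assuming the common Benjamini-Hochberg step-up procedure, the thresholding step could look like the following. The function name and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per test: True if it passes at the given FDR.

    Assumption: Benjamini-Hochberg step-up; the abstract does not name
    the procedure that was actually used.
    """
    m = len(p_values)
    # Indices of the tests, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Toy p-values for five hypothetical SNP-transcript pairs.
toy_p = [1e-10, 0.012, 0.04, 0.2, 0.8]
flags = benjamini_hochberg(toy_p, fdr=0.05)
print(flags)
```

With these toy values only the two smallest p-values fall under their rank-specific thresholds (0.01 and 0.02), so only those two pairs are flagged.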
3.  Genetic Influences on Asthma Susceptibility in the Developing Lung 
Asthma is the leading serious pediatric chronic illness in the United States, affecting 7.1 million children. The prevalence of asthma in children under 4 years of age has increased dramatically in the last 2 decades. Existing evidence suggests that this increase in prevalence derives from early environmental exposures acting on a pre-existing asthma-susceptible genotype. We studied the origins of asthma susceptibility in developing lung in rat strains that model the distinct phenotypes of airway hyperresponsiveness (Fisher rats) and atopy (brown Norway [BN] rats). Postnatal BN rat lungs showed increased epithelial proliferation and tracheal goblet cell hyperplasia. Fisher pups showed increased lung resistance at age 2 weeks, with elevated neutrophils throughout the postnatal period. Diverse transcriptomic signatures characterized the distinct respiratory phenotypes of developing lung in both rat models. Linear regression across age and strain identified developmental variation in expression of 1,376 genes, and confirmed both strain and temporal regulation of lung gene expression. Biological processes that were heavily represented included growth and development (including the T Box 1 transcription factor [Tbx5], the epidermal growth factor receptor [Egfr], the transforming growth factor beta-1-induced transcript 1 [Tgfbr1i1]), extracellular matrix and cell adhesion (including collagen and integrin genes), and immune function (including lymphocyte antigen 6 [Ly6] subunits, IL-17b, Toll-interacting protein, and Ficolin B). Genes validated by quantitative RT-PCR and protein analysis included collagen III alpha 1 (Col3a1), Ly6b, glucocorticoid receptor and Importin-13 (specific to the BN rat lung), and Serpina1 and Ficolin B (specific to the Fisher lung). Innate differences in patterns of gene expression in developing lung that contribute to individual variation in respiratory phenotype are likely to contribute to the pathogenesis of asthma.
doi:10.1165/rcmb.2009-0412OC
PMCID: PMC3159089  PMID: 20118217
asthma susceptibility; lung development; developmental gene expression
4.  Genome Wide Association Study to predict severe asthma exacerbations in children using random forests classifiers 
BMC Medical Genetics  2011;12:90.
Background
Personalized health-care promises tailored health-care solutions to individual patients based on their genetic background and/or environmental exposure history. To date, disease prediction has been based on a few environmental factors and/or single nucleotide polymorphisms (SNPs), while complex diseases are usually affected by many genetic and environmental factors with each factor contributing a small portion to the outcome. We hypothesized that the use of random forests classifiers to select SNPs would result in an improved predictive model of asthma exacerbations. We tested this hypothesis in a population of childhood asthmatics.
Methods
In this study, using emergency room visits or hospitalizations as the definition of a severe asthma exacerbation, we first identified a list of top Genome Wide Association Study (GWAS) SNPs ranked by Random Forests (RF) importance score for the CAMP (Childhood Asthma Management Program) population of 127 exacerbation cases and 290 non-exacerbation controls. We predict severe asthma exacerbations using the top 10 to 320 SNPs together with age, sex, pre-bronchodilator FEV1 percentage predicted, and treatment group.
Results
Testing in an independent set of the CAMP population shows that severe asthma exacerbations can be predicted with an Area Under the Curve (AUC) of 0.66 using 160-320 SNPs, compared with an AUC of 0.57 using 10 SNPs. Using the clinical traits alone yielded an AUC of 0.54, suggesting the phenotype is affected by genetic as well as environmental factors.
Conclusions
Our study shows that a random forests algorithm can effectively extract and use the information contained in a small number of samples. Random forests, and other machine learning tools, can be used with GWAS studies to integrate large numbers of predictors simultaneously.
doi:10.1186/1471-2350-12-90
PMCID: PMC3148549  PMID: 21718536
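Entry 4 compares models by Area Under the Curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The function name, labels and scores are hypothetical toy values, not CAMP data, and this is not the authors' evaluation code.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for a case (exacerbation), 0 for a control.
    scores: predicted risk, higher meaning more likely to be a case.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(cases) * len(controls))


# Toy example: two cases and three controls.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported values of 0.54, 0.57 and 0.66 in context.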
5.  Quantifying differential gene connectivity between disease states for objective identification of disease-relevant genes 
BMC Systems Biology  2011;5:89.
Background
Network modeling of whole transcriptome expression data enables characterization of complex epistatic (gene-gene) interactions that underlie cellular functions. Though numerous methods have been proposed and successfully implemented to develop these networks, there are no formal methods for comparing differences in network connectivity patterns as a function of phenotypic trait.
Results
Here we describe a novel approach for quantifying the differences in gene-gene connectivity patterns across disease states based on Graphical Gaussian Models (GGMs). We compare the posterior probabilities of connectivity for each gene pair across two disease states, expressed as a posterior odds-ratio (postOR) for each pair, which can be used to identify network components most relevant to disease status. The method can also be generalized to model differential gene connectivity patterns within previously defined gene sets, gene networks and pathways. We demonstrate that the GGM method reliably detects differences in network connectivity patterns in datasets of varying sample size. Applying this method to two independent breast cancer expression data sets, we identified numerous reproducible differences in network connectivity across histological grades of breast cancer, including several published gene sets and pathways. Most notably, our model identified two gene hubs (MMP12 and CXCL13) that each exhibited differential connectivity to more than 30 transcripts in both datasets. Both genes have been previously implicated in breast cancer pathobiology, but neither is differentially expressed by histologic grade in either dataset, so they would not have been identified using traditional differential gene expression testing approaches. In addition, 16 curated gene sets demonstrated significant differential connectivity in both data sets, including the matrix metalloproteinases, PPAR alpha sequence targets, and the PUFA synthesis pathway.
Conclusions
Our results suggest that GGM can be used to formally evaluate differences in global interactome connectivity across disease states, and can serve as a powerful tool for exploring the molecular events that contribute to disease at a systems level.
doi:10.1186/1752-0509-5-89
PMCID: PMC3128864  PMID: 21627793
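Entry 5 summarizes each gene pair by a posterior odds-ratio (postOR) that compares posterior probabilities of connectivity across two disease states. The abstract does not give the formula, so the sketch below assumes the conventional definition of an odds ratio, odds in state A divided by odds in state B. The function name, the clipping constant `eps` and the example probabilities are all hypothetical.

```python
def posterior_odds_ratio(p_state_a, p_state_b, eps=1e-12):
    """Ratio of the posterior odds of an edge in state A to those in state B.

    Assumption: postOR = [pA / (1 - pA)] / [pB / (1 - pB)]. The abstract
    names the quantity but does not spell out this formula.
    """
    # Clip probabilities away from 0 and 1 so the odds stay finite.
    p_a = min(max(p_state_a, eps), 1.0 - eps)
    p_b = min(max(p_state_b, eps), 1.0 - eps)
    odds_a = p_a / (1.0 - p_a)
    odds_b = p_b / (1.0 - p_b)
    return odds_a / odds_b


# Hypothetical gene pair: likely connected in one state, unlikely in the other.
example = posterior_odds_ratio(0.9, 0.1)
print(round(example, 6))
```

Under this definition, values far from 1 in either direction mark gene pairs whose connectivity differs between the two states, and a value of 1 means no difference.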
6.  The CD4+ T-cell transcriptome and serum IgE in asthma: IL17RB and the role of sex 
Background
The relationships between total serum IgE levels and gene expression patterns in peripheral blood CD4+ T cells (in all subjects and within each sex specifically) are not known.
Methods
Peripheral blood CD4+ T cells were obtained from 223 participants in the Childhood Asthma Management Program (CAMP), with simultaneous measurement of IgE. Total RNA was isolated, and expression profiles were generated with Illumina HumanRef8 v2 BeadChip arrays. Modeling of the relationship between genome-wide gene transcript levels and IgE levels was performed in all subjects, and stratified by sex.
Results
Among all subjects, significant evidence for association between gene transcript abundance and IgE was identified for a single gene, the interleukin 17 receptor B (IL17RB), explaining 12% of the variance (r²) in IgE measurement (p value = 7 × 10⁻⁷; 9 × 10⁻³ after adjustment for multiple testing). Sex-stratified analyses revealed that the correlation between IL17RB and IgE was restricted to males only (r² = 0.19, p value = 8 × 10⁻⁸; test for sex-interaction p < 0.05). Significant correlation between gene transcript abundance and IgE level was not found in females. Additionally, we demonstrated substantial sex-specific differences in IgE when considering multi-gene models, and in canonical pathway analyses of IgE level.
Conclusions
Our results indicate that IL17RB may be the only gene expressed in CD4+ T cells whose transcript measurement is correlated with the variation in IgE level in asthmatics. These results provide further evidence that sex may play a role in the genomic regulation of IgE.
doi:10.1186/1471-2466-11-17
PMCID: PMC3080837  PMID: 21473777
7.  A graphical model approach for inferring large-scale networks integrating gene expression and genetic polymorphism 
BMC Systems Biology  2009;3:55.
Background
Graphical models (e.g., Bayesian networks) have been used frequently to describe complex interaction patterns and dependent structures among genes and other phenotypes. Estimation of such networks has been a challenging problem when the genes considered greatly outnumber the samples, and the situation is exacerbated when one wishes to consider the impact of polymorphisms (SNPs) in genes.
Results
Here we describe a multistep approach to infer a gene-SNP network from gene expression and genotyped SNP data. Our approach is based on 1) construction of a graphical Gaussian model (GGM) based on small sample estimation of partial correlation and false-discovery rate multiple testing; 2) extraction of a subnetwork of genes directly linked to a target candidate gene of interest; 3) identification of cis-acting regulatory variants for the genes composing the subnetwork; and 4) evaluation of the identified cis-acting variants for trans-acting regulatory effects on the target candidate gene. This approach identifies significant gene-gene and gene-SNP associations not solely on the basis of gene co-expression but rather through whole-network modeling. We demonstrate the method by building two complex gene-SNP networks around Interleukin 12 Receptor Beta 2 (IL12RB2) and Interleukin 1B (IL1B), two biologic candidates in asthma pathogenesis, using 534,290 genotyped variants and gene expression data on 22,177 genes from total RNA derived from peripheral blood CD4+ lymphocytes from 154 asthmatics.
Conclusion
Our results suggest that graphical models based on integrative genomic data are computationally efficient, work well with small samples, and can describe complex interactions among genes and polymorphisms that could not be identified by pair-wise association testing.
doi:10.1186/1752-0509-3-55
PMCID: PMC2694152  PMID: 19473523
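Step 1 of the approach in entry 7 builds a graphical Gaussian model from partial correlations. As a minimal sketch of the underlying idea only, the code below computes the partial correlation between two variables after adjusting for a third, using the standard first-order formula. It does not implement the small-sample estimator or the FDR testing mentioned in the abstract, and the function names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z.

    Standard first-order formula:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical case: genes x and y correlate at 0.64 only because both
# track a third gene z at 0.8, so no direct x-y edge should remain.
direct = partial_correlation(0.64, 0.8, 0.8)
print(abs(direct) < 1e-9)
```

This illustrates why a GGM can separate direct links from co-expression driven by a shared neighbor: a strong marginal correlation can shrink to zero once the third variable is accounted for.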
