Genome-wide association studies can potentially unravel the mechanisms behind complex traits and common genetic diseases. Despite the valuable results produced thus far, many questions remain unanswered. For instance, which specific genetic components are linked to the risk of the disease under investigation, what biological mechanisms do they act through, and how do they interact with environmental and other external factors? The driving force of computational biology is the constantly growing amount of data generated by high-throughput technologies. Networks provide a practical framework that can deal with this abundance of information and that allows genetic associations and interactions to be discovered. Unfortunately, high dimensionality, the presence of noise and the geometry of the data can make this problem extremely challenging. We propose a penalised linear regression approach that can deal with these issues. We analyse the gene expression profiles of individuals with a common trait to infer the network structure of interactions among genes. The permutation-based approach leads to more stable and reliable networks inferred from synthetic microarray data. We show that a higher number of permutations determines the number of predicted edges, improves the overall sensitivity and controls the number of false positives.
Joint association analysis of multiple traits in a genome-wide association study (GWAS), i.e. a multivariate GWAS, offers several advantages over analyzing each trait in a separate GWAS. In this study we directly compared a number of multivariate GWAS methods using simulated data. We focused on six methods that are implemented in the software packages PLINK, SNPTEST, MultiPhen, BIMBAM, PCHAT and TATES, and also compared them to standard univariate GWAS, analysis of the first principal component of the traits, and meta-analysis of univariate results. We simulated data (N = 1000) for three quantitative traits and one bi-allelic quantitative trait locus (QTL), and varied the number of traits associated with the QTL (explained variance 0.1%), minor allele frequency of the QTL, residual correlation between the traits, and the sign of the correlation induced by the QTL relative to the residual correlation. We compared the power of the methods using empirically fixed significance thresholds (α = 0.05). Our results showed that the multivariate methods implemented in PLINK, SNPTEST, MultiPhen and BIMBAM performed best for the majority of the tested scenarios, with a notable increase in power for scenarios with an opposite sign of genetic and residual correlation. All multivariate analyses resulted in a higher power than univariate analyses, even when only one of the traits was associated with the QTL. Hence, use of multivariate GWAS methods can be recommended, even when genetic correlations between traits are weak.
Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptibility loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification, so that detected clusters are not strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. Without correction for population stratification, clusters appear to be influenced by population stratification, whereas with correction clusters are unrelated to the continental origin of individuals.
Interest in gene-environment interaction studies has increased significantly with the rise of advanced molecular genetics techniques. In practice, it has become possible to investigate the role of environmental factors in disease risk and hence their role as genetic effect modifiers. The understanding that genetics is important in the uptake and metabolism of toxic substances is an example of how genetic profiles can modify important environmental risk factors for disease. Several rationales exist for setting up gene-environment interaction studies, and the technical challenges related to these studies, when the number of environmental or genetic risk factors is relatively small, have been described before.
In the post-genomic era, it is now possible to study thousands of genes and their interaction with the environment. This brings along a whole range of new challenges and opportunities. Despite a continuing effort in developing efficient methods and optimal bioinformatics infrastructures to deal with the available wealth of data, the challenge remains how best to present and analyze Genome-Wide Environmental Interaction (GWEI) studies involving multiple genetic and environmental factors. Since GWEIs are performed at the intersection of statistical genetics, bioinformatics and epidemiology, they usually face problems similar to those of Genome-Wide Association gene-gene Interaction (GWAI) studies. However, additional complexities need to be considered that are typical for large-scale epidemiological studies, but that also relate to “joining” two heterogeneous types of data to explain complex disease trait variation or for prediction purposes.
Genome-wide association studies; gene-environment interaction; post-GWAS analysis; association tests; exploratory methods
For many complex diseases, quantitative traits contain more information than dichotomous traits. One of the approaches used to analyse these traits in family-based association studies is the quantitative transmission disequilibrium test (QTDT). The QTDT is a regression-based approach that simultaneously models linkage and association. It splits the association effect into a between- and a within-family genetic component to adjust and test for population stratification and includes a variance components method to model linkage. We extend this approach to detect gene–gene interactions between two unlinked QTLs by adjusting the definition of the between- and within-family component and the variance components included in the model. We simulate data to investigate the influence of the epistasis model, linkage disequilibrium patterns between the markers and the QTLs, and allele frequencies on the power and type I error rates of the approach. Results show that for some of the investigated settings, power gains are obtained in comparison with FAM-MDR. We conclude that our approach shows promising results for candidate-gene studies where too few markers are available to correct for population stratification using standard methods (for example EIGENSTRAT). The proposed method is applied to real-life data on hypertension from the FLEMENGHO study.
QTDT; epistasis; association; linkage
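The between- and within-family split mentioned above can be illustrated for a single marker. This is a minimal sketch of the standard single-locus decomposition only, not of the gene–gene extension proposed in the abstract. The function name `between_within`, the 0/1/2 genotype coding, and the fallback to the sibship mean when parents are not genotyped are assumptions made for illustration.

```python
def between_within(offspring, father=None, mother=None):
    """Split offspring genotype scores into between- and within-family parts.

    Genotypes are assumed to be coded as allele counts (0, 1 or 2).
    """
    if father is not None and mother is not None:
        # Between-family component: expected offspring score given the parents.
        between = (father + mother) / 2
    else:
        # Assumed fallback when parents are not genotyped: the sibship mean.
        between = sum(offspring) / len(offspring)
    # Within-family component: each child's deviation from the family value.
    within = [g - between for g in offspring]
    return between, within
```

Because the within-family deviations are taken relative to the family's own expectation, they are not affected by allele frequency differences between families, which is why a test on this component is robust to population stratification.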
Chronically relapsing inflammation, tissue remodeling and fibrosis are hallmarks of inflammatory bowel diseases. The aim of this study was to investigate changes in connective tissue in a chronic murine model resulting from repeated cycles of dextran sodium sulphate (DSS) ingestion, to mimic the relapsing nature of the human disease.
Materials and Methods
C57BL/6 mice were exposed to DSS in drinking water for 1 week, followed by a recovery phase of 2 weeks. This cycle of exposure was repeated up to 3 times (9 weeks in total). Colonic inflammation, fibrosis, extracellular matrix proteins and colonic gene expression were studied. In vivo MRI T2 relaxometry was studied as a potential non-invasive imaging tool to evaluate bowel wall inflammation and fibrosis.
Repeated cycles of DSS resulted in a relapsing and remitting disease course, which induced a chronic segmental, transmural colitis after 2 and 3 cycles of DSS with clear induction of fibrosis and remodeling of the muscular layer. Tenascin expression mirrored its expression in Crohn’s colitis. Microarray data identified a gene expression profile different in chronic colitis from that in acute colitis. Additional recovery was associated with upregulation of unique genes, in particular keratins, pointing to activation of molecular pathways for healing and repair. In vivo MRI T2 relaxometry of the colon showed a clear shift towards higher T2 values in the acute stage and a gradual regression of T2 values with increasing cycles of DSS.
Repeated cycles of DSS exposure induce fibrosis and connective tissue changes with typical features, as occurring in Crohn’s disease. Colonic gene expression analysis revealed unique expression profiles in chronic colitis compared to acute colitis and after additional recovery, pointing to potential new targets to intervene with the induction of fibrosis. In vivo T2 relaxometry is a promising non-invasive assessment of inflammation and fibrosis.
Applying a statistical method implies identifying underlying (model) assumptions and checking their validity in the particular context. One of these contexts is association modeling for epistasis detection. Here, depending on the technique used, violation of model assumptions may result in increased type I error, power loss, or biased parameter estimates. Remedial measures for violated underlying conditions or assumptions include data transformation or selecting a more relaxed modeling or testing strategy. Model-Based Multifactor Dimensionality Reduction (MB-MDR) for epistasis detection relies on association testing between a trait and a factor consisting of multilocus genotype information. For quantitative traits, the framework is essentially an Analysis of Variance (ANOVA) that decomposes the variability in the trait amongst the different factors. In this study, we assess through simulations the cumulative effect of deviations from normality and homoscedasticity on the overall performance of quantitative MB-MDR to detect 2-locus epistasis signals in the absence of main effects.
Our simulation study focuses on pure epistasis models with varying degrees of genetic influence on a quantitative trait. Conditional on a multilocus genotype, we consider quantitative trait distributions that are normal, chi-square or Student’s t with constant or non-constant phenotypic variances. All data are analyzed with MB-MDR using the built-in Student’s t-test for association, as well as a novel MB-MDR implementation based on Welch’s t-test. Traits are either left untransformed or are transformed into new traits via logarithmic, standardization or rank-based transformations, prior to MB-MDR modeling.
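The difference between the two association tests compared above can be made concrete with a small sketch. It computes the pooled-variance Student statistic and the Welch statistic for two groups of trait values, for example one multilocus genotype cell against the remaining individuals. The function names and the grouping are illustrative assumptions. The actual MB-MDR implementation is not described in this text.

```python
import math
from statistics import mean, variance

def student_t(x, y):
    """Pooled-variance (Student) two-sample t statistic and degrees of freedom."""
    nx, ny = len(x), len(y)
    # Pooled variance assumes both groups share one common variance.
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, nx + ny - 2

def welch_t(x, y):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    # Each group keeps its own variance estimate (no homoscedasticity assumed).
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ. With unequal group sizes and unequal variances the statistics themselves diverge, which is the situation the heteroscedastic simulation settings are designed to probe.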
Our simulation results show that MB-MDR controls type I error and false positive rates irrespective of the association test considered. Empirically-based MB-MDR power estimates for MB-MDR with Welch’s t-tests are generally lower than those for MB-MDR with Student’s t-tests. Trait transformations involving ranks tend to lead to increased power compared to the other considered data transformations.
When performing MB-MDR screening for gene-gene interactions with quantitative traits, we recommend first rank-transforming traits to normality and then applying MB-MDR modeling with Student’s t-tests as internal tests for association.
Model-based multifactor dimensionality reduction; Epistasis; Model violations; Data transformation
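One common way to rank-transform a trait to normality is the rank-based inverse normal transformation, sketched below. The abstract does not state which variant was used, so the Blom offset of 3/8, the averaging of tied ranks, and the function name `rank_to_normal` are assumptions made for illustration.

```python
from statistics import NormalDist

def rank_to_normal(values, offset=0.375):
    """Map values to normal scores via their ranks (ties get the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Convert each rank to a probability strictly inside (0, 1), then to a quantile.
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]
```

The transformed trait keeps the ordering of the original values but discards their spacing, so heavy tails or skewness in the raw trait no longer influence the subsequent t-tests.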
To examine the associations between pet keeping in early childhood and asthma and allergies in children aged 6–10 years.
Pooled analysis of individual participant data of 11 prospective European birth cohorts that recruited a total of over 22,000 children in the 1990s.
Ownership of only cats, dogs, birds, rodents, or cats/dogs combined during the first 2 years of life.
Current asthma (primary outcome), allergic asthma, allergic rhinitis and allergic sensitization during 6–10 years of age.
Three-step approach: (i) Common definition of outcome and exposure variables across cohorts; (ii) calculation of adjusted effect estimates for each cohort; (iii) pooling of effect estimates by using random effects meta-analysis models.
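Step (iii) can be sketched as follows. The abstract does not name the random-effects estimator, so the DerSimonian-Laird method used here is an assumption, as is the function name `random_effects_pool`. Inputs are per-cohort effect estimates on an additive scale (for odds ratios, the log odds ratio) with their standard errors.

```python
import math

def random_effects_pool(estimates, std_errors):
    """DerSimonian-Laird pooling of per-cohort effect estimates."""
    w = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-cohort variance, truncated at zero
    # Random-effects weights add tau2 to each cohort's sampling variance.
    w_re = [1 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    # I-squared: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se_pooled, tau2, i2
```

A pooled odds ratio and its 95% confidence interval would then be obtained by exponentiating `pooled` and `pooled ± 1.96 * se_pooled`.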
We found no association between furry and feathered pet keeping early in life and asthma in school age. For example, the odds ratio for asthma comparing cat ownership with “no pets” (10 studies, 11,489 participants) was 1.00 (95% confidence interval 0.78 to 1.28) (I² = 9%, p = 0.36). The odds ratio for asthma comparing dog ownership with “no pets” (9 studies, 11,433 participants) was 0.77 (0.58 to 1.03) (I² = 0%, p = 0.89). Owning both cat(s) and dog(s) compared to “no pets” resulted in an odds ratio of 1.04 (0.59 to 1.84) (I² = 33%, p = 0.18). Similarly, for allergic asthma and for allergic rhinitis we did not find associations regarding any type of pet ownership early in life. However, we found some evidence for an association between ownership of furry pets during the first 2 years of life and reduced likelihood of becoming sensitized to aero-allergens.
Pet ownership in early life did not appear to either increase or reduce the risk of asthma or allergic rhinitis symptoms in children aged 6–10. Advice from health care practitioners to avoid or to specifically acquire pets for primary prevention of asthma or allergic rhinitis in children should not be given.
Detecting gene–gene interactions or epistasis in studies of human complex diseases is a big challenge in the area of epidemiology. To address this problem, several methods have been developed, mainly in the context of data dimensionality reduction. One of these methods, Model-Based Multifactor Dimensionality Reduction, has so far mainly been applied to case–control studies. In this study, we evaluate the power of Model-Based Multifactor Dimensionality Reduction for quantitative traits to detect gene–gene interactions (epistasis) in the presence of error-free and noisy data. Considered sources of error are genotyping errors, missing genotypes, phenotypic mixtures and genetic heterogeneity. Our simulation study encompasses a variety of settings with varying minor allele frequencies and genetic variance for different epistasis models. On each simulated data set, we have performed Model-Based Multifactor Dimensionality Reduction in two ways: with and without adjustment for main effects of (known) functional SNPs. In line with binary trait counterparts, our simulations show that the power is lowest in the presence of phenotypic mixtures or genetic heterogeneity compared to scenarios with missing genotypes or genotyping errors. In addition, empirical power estimates are reduced even further with main effects corrections, but at the same time, false-positive percentages are reduced as well. In conclusion, phenotypic mixtures and genetic heterogeneity remain challenging for epistasis detection, and careful thought must be given to the way important lower-order effects are accounted for in the analysis.
Model-Based Multifactor Dimensionality Reduction; gene–gene interactions; quantitative traits; complex diseases; noisy data
Identifying gene-gene interactions or gene-environment interactions in studies of human complex diseases remains a big challenge in genetic epidemiology. An additional challenge, often forgotten, is to account for important lower-order genetic effects. These may hamper the identification of genuine epistasis. If lower-order genetic effects contribute to the genetic variance of a trait, identified statistical interactions may simply be due to a signal boost of these effects. In this study, we restrict attention to quantitative traits and bi-allelic SNPs as genetic markers. Moreover, our interaction study focuses on 2-way SNP-SNP interactions. Via simulations, we assess the performance of different corrective measures for lower-order genetic effects in Model-Based Multifactor Dimensionality Reduction epistasis detection, using additive and co-dominant coding schemes. Performance is evaluated in terms of power and familywise error rate. Our simulations indicate that empirical power estimates are reduced with correction of lower-order effects, as are familywise error rates. Easy-to-use automatic SNP selection procedures, SNP selection based on “top” findings, or SNP selection based on a p-value criterion for interesting main effects result in reduced power but also almost zero false positive rates. Always accounting for main effects in the SNP-SNP pair under investigation during Model-Based Multifactor Dimensionality Reduction analysis adequately controls false positive epistasis findings. This is particularly true when adopting a co-dominant corrective coding scheme. In conclusion, automatic search procedures to identify lower-order effects to correct for during epistasis screening should be avoided. The same is true for procedures that adjust for lower-order effects prior to Model-Based Multifactor Dimensionality Reduction and involve using residuals as the new trait.
We advocate using “on-the-fly” lower-order effects adjusting when screening for SNP-SNP interactions using Model-Based Multifactor Dimensionality Reduction analysis.
Analyzing the combined effects of genes and/or environmental factors on the development of complex diseases is a great challenge from both the statistical and computational perspective, even using a relatively small number of genetic and non-genetic exposures. Several data mining methods have been proposed for interaction analysis, among them, the Multifactor Dimensionality Reduction Method (MDR), which has proven its utility in a variety of theoretical and practical settings. Model-Based Multifactor Dimensionality Reduction (MB-MDR), a relatively new MDR-based technique that is able to unify the best of both non-parametric and parametric worlds, was developed to address some of the remaining concerns that go along with an MDR-analysis. These include the restriction to univariate, dichotomous traits, the absence of flexible ways to adjust for lower-order effects and important confounders, and the difficulty to highlight epistasis effects when too many multi-locus genotype cells are pooled into two new genotype groups. Whereas the true value of MB-MDR can only reveal itself by extensive applications of the method in a variety of real-life scenarios, here we investigate the empirical power of MB-MDR to detect gene-gene interactions in the absence of any noise and in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity. For the considered simulation settings, we show that the power is generally higher for MB-MDR than for MDR, in particular in the presence of genetic heterogeneity, phenocopy, or low minor allele frequencies.
population-based genetic association studies; complex diseases; case-control design; gene-gene Interactions; Multifactor Dimensionality Reduction
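The pooling of multi-locus genotype cells into two groups, which the abstract above raises as a concern with MDR, can be sketched for a pair of SNPs in a case-control sample. This is a minimal illustration of the classic MDR labeling step only. The function name `mdr_high_risk_cells` and the use of the overall case:control ratio as the threshold are assumptions, and the sketch omits cross-validation and model selection.

```python
from collections import Counter

def mdr_high_risk_cells(snp_a, snp_b, is_case):
    """Label each two-locus genotype cell high risk (True) or low risk (False).

    Assumes the sample contains at least one control.
    """
    cases, controls = Counter(), Counter()
    for a, b, case in zip(snp_a, snp_b, is_case):
        (cases if case else controls)[(a, b)] += 1
    # Threshold: the case:control ratio in the whole sample.
    threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        # Cross-multiplied comparison avoids dividing by zero in empty cells.
        labels[cell] = cases[cell] > threshold * controls[cell]
    return labels
```

Every observed cell is forced into one of the two groups, however weak the evidence for that cell, which is the behaviour that can obscure epistasis effects when many cells are pooled.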
In the quest for the missing heritability of most complex diseases, rare variants have received increased attention. Advances in large-scale sequencing have led to a shift from the common disease/common variant hypothesis to the common disease/rare variant hypothesis or have at least reopened the debate about the relevance and importance of rare variants for gene discoveries. The investigation of modeling and testing approaches to identify significant disease/rare variant associations is in full motion. New methods to better deal with parameter estimation instabilities, convergence problems, or multiple testing corrections in the presence of rare variants or effect modifiers of rare variants are in their infancy. Using a recently developed semiparametric strategy to detect causal variants, we investigate the performance of the model-based multifactor dimensionality reduction (MB-MDR) technique in terms of power and family-wise error rate (FWER) control in the presence of rare variants, using population-based and family-based data (FAM-MDR). We compare family-based results obtained from MB-MDR analyses to screening findings from a quantitative trait Pedigree-based association test (PBAT). Population-based data were further examined using penalized regression models. We restrict attention to all available single-nucleotide polymorphisms on chromosome 4 and consider Q1 as the outcome of interest. The considered family-based methods identified marker C4S4935 in the VEGFC gene with estimated power not exceeding 0.35 (FAM-MDR), when FWER was kept under control. The considered population-based methods gave rise to highly inflated FWERs (up to 90% for PBAT screening).
The search for susceptibility loci in gene–gene interactions poses a methodological and computational challenge for statisticians because of the large dimensionality inherent to the modelling of gene–gene interactions or epistasis. In an era in which genome-wide scans have become relatively common, new powerful methods are required to handle the huge number of feasible gene–gene interactions and to weed out false positives and negatives from these results. One solution to the dimensionality problem is to reduce data by preliminary screening of markers to select the best candidates for further analysis. Ideally, this screening step is statistically independent of the testing phase. Initially developed for small numbers of markers, the Multifactor Dimensionality Reduction (MDR) method is a nonparametric, model-free data reduction technique to associate sets of markers with optimal predictive properties to disease. In this study, we examine the power of MDR in larger data sets and compare it with other approaches that are able to identify gene–gene interactions. Under various interaction models (purely and not purely epistatic), we use a Random Forest (RF)-based prescreening method, before executing MDR, to improve its performance. We find that the power of MDR increases when noisy SNPs are first removed, by creating a collection of candidate markers with RFs. We validate our technique by extensive simulation studies and by application to asthma data from the European Committee of Respiratory Health Study II.
gene–gene interactions; prescreening; Random Forests; Multifactor Dimensionality Reduction
Few studies have examined the effects of in utero smoke exposure (IUS) on lung function in children with asthma, and there are no published data on the impact of IUS on treatment outcomes in asthmatic children.
To explore whether IUS exposure is associated with increased airway responsiveness among children with asthma, and whether IUS modifies the response to treatment with inhaled corticosteroids (ICS).
To assess the impact of parent-reported IUS exposure on airway responsiveness in childhood asthma, we performed a repeated-measures analysis of methacholine PC20 data from the Childhood Asthma Management Program (CAMP), a four-year, multicenter, randomized, double-masked, placebo-controlled trial of 1041 children aged 5–12 comparing the long-term efficacy of ICS with mast cell stabilizing agents or placebo.
Although improvement was seen in both groups, asthmatic children with IUS exposure had on average 26% less of an improvement in airway responsiveness over time compared to unexposed children (p=.01). Moreover, while children who were not exposed to IUS who received budesonide experienced substantial improvement in PC20 compared to untreated children (1.25 fold-increase, 95% CI 1.03, 1.50, p=.02) the beneficial effects of budesonide were attenuated among children with a history of IUS exposure (1.04 fold-increase, 95% CI 0.65, 1.68, p=.88).
IUS reduces age-related improvements in airway responsiveness among asthmatic children. Moreover, IUS appears to blunt the beneficial effects of ICS use on airways responsiveness. These results emphasize the importance of preventing this exposure through smoking cessation counseling efforts with pregnant women.
asthma; in utero smoke exposure; airway responsiveness; inhaled corticosteroids
In recent years, DISC1 has emerged as one of the most credible and best supported candidate genes for schizophrenia and related neuropsychiatric disorders. Furthermore, increasing evidence – both genetic and functional – indicates that many of its protein interaction partners are also involved in the development of these diseases. In this study, we applied a pooled-sample 454 sequencing strategy to explore the contribution of genetic variation in DISC1 and 10 of its interaction partners (ATF5, Grb2, FEZ1, LIS-1, PDE4B, NDE1, NDEL1, TRAF3IP1, YWHAE, and ZNF365) to schizophrenia susceptibility in an isolated northern Swedish population. Mutation burden analysis of the identified variants in a population of 486 schizophrenia (SZ) patients and 514 control individuals revealed that non-synonymous rare variants with a MAF<0.01 were significantly more frequent in patients than in controls (8.64% versus 4.7%, P = 0.018), providing further evidence for the involvement of DISC1 and some of its interaction partners in psychiatric disorders. This increased burden of rare missense variants was even more striking in a subgroup of early onset patients (12.9% versus 4.7%, P = 0.0004), highlighting the importance of studying subgroups of patients and identifying endophenotypes. Upon investigation of the potential functional effects associated with the identified missense variants, we found that ∼90% of these variants reside in intrinsically disordered protein regions. The observed increase in mutation burden in patients provides further support for the role of the DISC1 pathway in schizophrenia. Furthermore, this study presents the first evidence supporting the involvement of mutations within intrinsically disordered protein regions in the pathogenesis of psychiatric disorders.
As many important biological functions depend directly on the disordered state, alteration of this disorder in key pathways may represent an intriguing new disease mechanism for schizophrenia and related neuropsychiatric diseases. Further research into this unexplored domain will be required to elucidate the role of the identified variants in schizophrenia etiology.
Background and aims
Several antibodies have been associated with Crohn's disease and with distinct clinical phenotypes. The aim of this study was to determine whether a panel of new antibodies against bacterial peptides and glycans could help in differentiating inflammatory bowel disease (IBD), and whether they were associated with particular clinical manifestations.
Antibodies against a mannan epitope of Saccharomyces cerevisiae (gASCA), laminaribioside (ALCA), chitobioside (ACCA), mannobioside (AMCA), outer membrane porins (Omp) and the atypical perinuclear antineutrophilic cytoplasmic antibody (pANCA) were tested in serum samples of 1225 IBD patients, 200 healthy controls and 113 patients with non‐IBD gastrointestinal inflammation. Antibody responses were correlated with the type of disease and clinical characteristics.
76% of Crohn's disease patients had at least one of the tested antibodies. For differentiation between Crohn's disease and ulcerative colitis, the combination of gASCA and pANCA was most accurate. For differentiation between IBD, healthy controls and non‐IBD gastrointestinal inflammation, the combination of gASCA, pANCA and ALCA had the best accuracy. Increasing amounts and levels of antibody responses against gASCA, ALCA, ACCA, AMCA and Omp were associated with more complicated disease behaviour (44.7% versus 53.6% versus 71.1% versus 82.0%, p < 0.001), and a higher frequency of Crohn's disease‐related abdominal surgery (38.5% versus 48.8% versus 60.7% versus 75.4%, p < 0.001).
Using this new panel of serological markers, the number and magnitude of immune responses to different microbial antigens were shown to be associated with the severity of the disease. With regard to the predictive role of serological markers, further prospective longitudinal studies are necessary.
Crohn's Disease (CD) has a heterogeneous presentation, and is typically classified according to extent and location of disease. The genetic susceptibility to CD is well known and genome-wide association scans (GWAS) and meta-analysis thereof have identified over 30 susceptibility loci. Except for the association between ileal CD and NOD2 mutations, efforts in trying to link CD genetics to clinical subphenotypes have not been very successful. We hypothesized that the large number of confirmed genetic variants enables (better) classification of CD patients.
To look for genetic-based subgroups, genotyping results of 46 SNPs identified from CD GWAS were analyzed by Latent Class Analysis (LCA) in CD patients and in healthy controls. Six genetic-based subgroups were identified in CD patients, which were significantly different from the five subgroups found in healthy controls. The identified CD-specific clusters are therefore likely to contribute to disease behavior. We then looked at whether we could relate the genetic-based subgroups to the currently used clinical parameters. Although modest differences in prevalence of disease location and behavior could be observed among the CD clusters, Random Forest analysis showed that patients could not be allocated to one of the 6 genetic-based subgroups based on the typically used clinical parameters alone. This points to a poor relationship between the genetic-based subgroups and the used clinical subphenotypes.
This approach serves as a first step to reclassify Crohn's disease. The technique used can be applied to other common complex diseases as well, and will help to complete patient characterization, in order to evolve towards personalized medicine.
We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.
Antimicrobial peptides (AMPs) protect the host intestinal mucosa against microorganisms. Abnormal expression of defensins was shown in inflammatory bowel disease (IBD), but it is not clear whether this is a primary defect. We investigated the impact of anti-inflammatory therapy with infliximab on the mucosal gene expression of AMPs in IBD.
Mucosal gene expression of 81 AMPs was assessed in 61 IBD patients before and 4–6 weeks after their first infliximab infusion and in 12 control patients, using Affymetrix arrays. Quantitative real-time reverse-transcription PCR and immunohistochemistry were used to confirm microarray data. The dysregulation of many AMPs in colonic IBD in comparison with control colons was widely restored by infliximab therapy, and only DEFB1 expression remained significantly decreased after therapy in the colonic mucosa of IBD responders to infliximab. In ileal Crohn's disease (CD), expression of two neuropeptides with antimicrobial activity, PYY and CHGB, was significantly decreased before therapy compared to control ileums, and ileal PYY expression remained significantly decreased after therapy in CD responders. Expression of the downregulated AMPs before and after treatment (DEFB1 and PYY) correlated with villin 1 expression, a gut epithelial cell marker, indicating that the decrease is a consequence of epithelial damage.
Our study shows that the dysregulation of AMPs in IBD mucosa is the consequence of inflammation, but may be responsible for perpetuation of inflammation due to ineffective clearance of microorganisms.
Glucocorticoid function is dependent on efficient translocation of the glucocorticoid receptor (GR) from the cytoplasm to the nucleus of cells. Importin-13 (IPO13) is a nuclear transport receptor that mediates nuclear entry of GR. In airway epithelial cells, inhibition of IPO13 expression prevents nuclear entry of GR and abrogates anti-inflammatory effects of glucocorticoids. Impaired nuclear entry of GR has been documented in steroid-non-responsive asthmatics. We hypothesize that common IPO13 genetic variation influences the anti-inflammatory effects of inhaled corticosteroids for the treatment of asthma, as measured by change in methacholine airway hyperresponsiveness (AHR-PC20).
Ten polymorphisms were evaluated in 654 children with mild-to-moderate asthma participating in the Childhood Asthma Management Program (CAMP), a clinical trial of inhaled anti-inflammatory medications (budesonide and nedocromil). Population-based association tests with repeated measures of PC20 were performed using mixed models and confirmed using family-based tests of association.
Among participants randomized to placebo or nedocromil, IPO13 polymorphisms were associated with improved PC20 (i.e. less AHR), with subjects harboring minor alleles demonstrating an average 1.51- to 2.17-fold increase in mean PC20 at 8 months post-randomization that persisted over four years of observation (p = 0.01–0.005). This improvement was similar to that among children treated with long-term inhaled corticosteroids. There was no additional improvement in PC20 by IPO13 variants among children treated with inhaled corticosteroids.
IPO13 variation is associated with improved AHR in asthmatic children. The degree of this improvement is similar to that observed with long-term inhaled corticosteroid treatment, suggesting that IPO13 variation may improve nuclear bioavailability of endogenous glucocorticoids.
We present the rationale, background and structure for version 2.0 of the GENESTAT information portal (www.genestat.org) for statistical genetics. Fast methodological advances, coupled with a range of standalone software, make it difficult for expert and non-expert users alike to orient themselves when designing and analysing their genetic studies. The ultimate ambition of GENESTAT is to provide guidance on statistical methodology across the broad spectrum of research in genetic epidemiology. GENESTAT 2.0 focuses on genetic association studies. Each entry provides a summary of a topic and gives links to key papers, websites and software. The flexibility of the internet is utilised for cross-referencing and for open editing. This paper gives an overview of GENESTAT and short introductions to its current main topics, with additional entries on the website. Methods and software developers are invited to contribute to the portal, which is powered by a Wikipedia-type engine and allows easy additions and editing.
statistical genetics; genetic software; internet
Rationale: Little is known regarding the relationship between parental history of asthma and subsequent airway hyperresponsiveness (AHR) in children with asthma. Objectives: We evaluated this relationship in 1,041 children with asthma participating in a randomized trial of antiinflammatory medications (the Childhood Asthma Management Program [CAMP]). Methods: Methacholine challenge testing was performed before treatment randomization and once per year over an average of 4.5 years postrandomization. Cross-sectional and longitudinal repeated measures analyses were performed to model the relationship between PC20 (the methacholine concentration causing a 20% fall in FEV1) and maternal, paternal, and joint parental histories of asthma. Models were adjusted for potential confounders. Measurements and Main Results: At baseline, AHR was strongly associated with a paternal history of asthma. Children with a paternal history of asthma demonstrated significantly greater AHR than those without such history (median log_e PC20, 0.84 vs. 1.13; p = 0.006). Although maternal history of asthma was not associated with AHR, children with two parents with asthma had greater AHR than those with no parents with asthma (median log_e PC20, 0.52 vs. 1.17; p = 0.0008). Longitudinal multivariate analysis of the relation between paternal history of asthma and AHR using repeated PC20 measurements over 44 months postrandomization confirmed a significant association between paternal history of asthma and AHR among children in CAMP. Conclusions: Our findings suggest that the genetic contribution of the father is associated with AHR, an important determinant of disease severity among children with asthma.
airway responsiveness; asthma; genetics; longitudinal analysis; parent of origin
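As an illustration of the PC20 measure defined in the abstract above (the methacholine concentration causing a 20% fall in FEV1), the following minimal sketch shows the log-linear interpolation commonly used to estimate PC20 from the two challenge doses that bracket the 20% fall. The abstract does not state how PC20 was computed in CAMP, so the formula choice, the function name `pc20` and the example values are assumptions made for illustration only.

```python
import math


def pc20(c1, r1, c2, r2):
    """Estimate PC20 by interpolating on a log10 concentration scale.

    c1, r1: second-to-last methacholine concentration (mg/mL) and the
            percent fall in FEV1 it produced (must be below 20).
    c2, r2: last concentration (mg/mL) and the percent fall in FEV1 it
            produced (must be 20 or more).
    All argument names are hypothetical, chosen for this sketch.
    """
    if not (0 < c1 < c2):
        raise ValueError("concentrations must be positive and increasing")
    if not (r1 < 20 <= r2):
        raise ValueError("responses must bracket a 20% fall in FEV1")
    # Position of the 20% fall between the two observed responses (0..1).
    fraction = (20 - r1) / (r2 - r1)
    log_pc20 = math.log10(c1) + (math.log10(c2) - math.log10(c1)) * fraction
    return 10 ** log_pc20


# Illustrative values only: a 10% fall at 1 mg/mL and a 30% fall at 4 mg/mL.
estimate = pc20(1.0, 10.0, 4.0, 30.0)
log_e_pc20 = math.log(estimate)  # natural-log scale, as reported in the abstract
```

A lower PC20 (and hence a lower log_e PC20) means that less methacholine was needed to provoke the 20% fall, i.e. greater airway hyperresponsiveness.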
Due to the recent gains in the availability of single-nucleotide polymorphism data, genome-wide association testing has become feasible. It is hoped that these additional data may confirm the presence of disease susceptibility loci, and identify new genetic determinants of disease. However, the problem of multiple comparisons threatens to diminish any potential gains from these newly available data. To circumvent the multiple comparisons issue, we utilize a recently developed screening technique using family-based association testing. This screening methodology allows for the identification of the most promising single-nucleotide polymorphisms for testing without biasing the nominal significance level of our test statistic. We compare the results of our screening technique across univariate and multivariate family-based association tests. From our analyses, we observe that the screening technique, applied to different settings, is fairly consistent in identifying optimal markers for testing. One of the identified markers, TSC0047225, was significantly associated with both the ttth1 (p = 0.004) and ttth1-ttth4 (p = 0.004) phenotypes. We find that both univariate- and multivariate-based screening techniques are powerful tools for detecting an association.
Genome scans using dense single-nucleotide polymorphism (SNP) data have recently become a reality. It is thought that the increase in information content for linkage analysis as a result of the denser scans will help refine previously identified linkage regions and possibly identify new regions not identifiable using the sparser, microsatellite scans. In the context of the dense SNP scans, it is also possible to consider association strategies to provide even more information about potential regions of interest. To circumvent the multiple-testing issues inherent in association analysis, we use a recently developed strategy, implemented in PBAT, which screens the data to identify the optimal SNPs for testing, without biasing the nominal significance level. We compare the results from the PBAT analysis with those of quantitative linkage analysis on chromosome 4 using the Collaborative Study on the Genetics of Alcoholism data, as released through Genetic Analysis Workshop 14.
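To make the multiple comparisons issue described above concrete, the following minimal sketch compares the per-test significance threshold under a Bonferroni correction when every marker is tested with the threshold when only a small screened subset is tested. Bonferroni is used here as a generic illustration; the abstract does not specify the correction applied, and the marker counts (100,000 genome-wide, 10 after screening) and the function name `bonferroni_threshold` are assumptions, not values from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05

# Hypothetical scenario 1: test every marker in a genome-wide scan.
all_markers = bonferroni_threshold(ALPHA, 100_000)

# Hypothetical scenario 2: test only the markers retained by a screening
# step that does not bias the subsequent test statistic.
screened_markers = bonferroni_threshold(ALPHA, 10)

print(f"threshold, all markers:      {all_markers:.1e}")
print(f"threshold, screened markers: {screened_markers:.1e}")
```

Under these assumed counts, the screened analysis can use a threshold four orders of magnitude less stringent, which is why a screening step that leaves the nominal significance level intact can preserve power.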
The PBAT software package (v2.5) provides a unique set of tools for complex family-based association analysis at a genome-wide level. PBAT can handle nuclear families with missing parental genotypes, extended pedigrees with missing genotypic information, analysis of single nucleotide polymorphisms (SNPs), haplotype analysis, quantitative traits, multivariate/longitudinal data and time-to-onset phenotypes. The data analysis can be adjusted for covariates and gene/environment interactions. Haplotype-based features include sliding windows and the reconstruction of the haplotypes of the probands. PBAT's screening tools allow the user to handle the multiple comparisons problem successfully at a genome-wide level, even for 100,000 SNPs and more. Moreover, PBAT is computationally fast. A genome scan of 300,000 SNPs in 2,000 trios takes 4 central processing unit (CPU)-days. PBAT is available for Linux, Sun Solaris and Windows XP.
association analysis; extended pedigrees; genome-wide screening; quantitative and qualitative traits; haplotypes