Psoriatic skin differs distinctly from normal skin by its thickened epidermis. Most gene expression comparisons utilize full-thickness biopsies, which contain a substantial amount of dermis. We assayed the transcriptomes of normal, lesional, and non-lesional psoriatic epidermis, sampled as split-thickness skin grafts, with 5′-end RNA sequencing. We found that psoriatic epidermis contains more mRNA per total RNA than controls, and took this into account in the bioinformatic analysis. The approach highlighted innate immunity-related pathways in psoriasis, including NOD-like receptor (NLR) signaling and inflammasome activation. We demonstrated that the NLR signaling genes NOD2, PYCARD, CARD6, and IFI16 are upregulated in psoriatic epidermis, and corroborated these findings by protein expression analysis. Interestingly, PYCARD, the key component of the inflammasome, showed an altered expression pattern in the lesional epidermis. The profiling of non-lesional skin highlighted PSORS4 and mitochondrially encoded transcripts, suggesting that their gene expression is altered already before the development of lesions. Our data suggest that all components needed for the active inflammasome are present in the keratinocytes of psoriatic skin. The characterization of inflammasome pathways provides further opportunities for therapy. Complementing previous transcriptome studies, our approach gives deeper insight into gene regulation in psoriatic epidermis.
The DYX5 locus for developmental dyslexia was mapped to chromosome 3 by a linkage study of a large Finnish family, and later, roundabout guidance receptor 1 (ROBO1) was implicated as a candidate gene at DYX5, with suppressed expression from the segregating rare haplotype. A functional magnetoencephalographic study of several family members revealed abnormal auditory processing of interaural interaction, supporting a defect in midline crossing of auditory pathways. In the current study, we have characterized genetic variation in the broad ROBO1 gene region in the DYX5-linked family, aiming to identify variants that would increase our understanding of the altered expression of ROBO1.
We have used a whole genome sequencing strategy on a pooled sample of 19 individuals in combination with two individually sequenced genomes. The discovered genetic variants were annotated and filtered. Subsequently, the most interesting variants were functionally tested using relevant methods, including electrophoretic mobility shift assay (EMSA), luciferase assay, and gene knockdown by lentiviral small hairpin RNA (shRNA) in lymphoblasts.
We found one novel intronic single nucleotide variant (SNV) and three novel intergenic SNVs in the broad region of ROBO1 that were specific to the dyslexia susceptibility haplotype. Functional testing by EMSA did not support the binding of transcription factors to three of the SNVs, but one of the SNVs was bound by the LIM homeobox 2 (LHX2) protein, with increased binding affinity for the non-reference allele. Knockdown of LHX2 in lymphoblast cell lines derived from subjects from the DYX5-linked family decreased the expression of ROBO1, supporting the idea that LHX2 regulates ROBO1 in humans as well.
The discovered variants may explain the segregation of dyslexia in this family, but the effect appears subtle in the experimental settings. Their impact on the developing human brain remains suggestive based on the association and subtle experimental support.
Electronic supplementary material
The online version of this article (doi:10.1186/s11689-016-9136-y) contains supplementary material, which is available to authorized users.
Dyslexia; ROBO1; Whole genome sequencing
The transcriptional program that drives human preimplantation development is largely unknown. Here, by using single-cell RNA sequencing of 348 oocytes, zygotes and single blastomeres from 2- to 3-day-old embryos, we provide a detailed analysis of the human preimplantation transcriptome. By quantifying transcript far 5′-ends (TFEs), we include in our analysis transcripts that derive from alternative promoters. We show that 32 and 129 genes are transcribed during the transition from oocyte to four-cell stage and from four- to eight-cell stage, respectively. A number of the identified transcripts originate from previously unannotated genes that include the PRD-like homeobox genes ARGFX, CPHX1, CPHX2, DPRX, DUXA, DUXB and LEUTX. Employing de novo promoter motif extraction on sequences surrounding TFEs, we identify significantly enriched gene regulatory motifs that often overlap with Alu elements. Our high-resolution analysis of the human transcriptome during preimplantation development may have important implications for future studies of human pluripotent stem cells and cell reprogramming.
Understanding human preimplantation development is invaluable for human reproduction and stem cell research. By employing single-cell RNA sequencing in oocytes, zygotes and single blastomeres, Töhönen et al. identify new regulatory factors and sequences that drive early human preimplantation development.
Recently developed high-throughput sequencing technology shows power to detect low-frequency disease-causing variants by deep sequencing of all known exons. We used exome sequencing to identify variants associated with morbid obesity. DNA samples from 100 morbidly obese adult subjects and 100 controls were pooled (n=10/pool) and subjected to exome capture and subsequent sequencing. At least 100 million sequencing reads were obtained from each pool. After several filtering steps and comparisons of observed frequencies of variants between obese and non-obese control pools, we systematically selected 144 obesity-enriched non-synonymous, splice-site or 5′ upstream single-nucleotide variants for validation. We first genotyped 494 adult subjects with morbid obesity and 496 controls. Five obesity-associated variants (nominal P-value<0.05) were subsequently genotyped in 1425 morbidly obese subjects and 782 controls. Out of the five variants, only rs62623713:A>G (NM_001040709:c.A296G:p.E99G) was confirmed. rs62623713 showed strong association with body mass index (beta=2.13 (1.09, 3.18), P=6.28 × 10^−5) in a joint analysis of all 3197 genotyped subjects and had an odds ratio of 1.32 for obesity association. rs62623713 is a low-frequency (2.9% minor allele frequency) non-synonymous variant (E99G) in exon 4 of the synaptophysin-like 2 (SYPL2) gene. rs62623713 was not covered by Illumina or Affymetrix genotyping arrays used in previous genome-wide association studies. Mice lacking Sypl2 have been reported to display reduced body weight. In conclusion, using exome sequencing we identified a low-frequency coding variant in the SYPL2 gene that was associated with morbid obesity. This gene may be involved in the development of excess body fat.
Genetic studies of complex traits have become increasingly successful as progress is made in next-generation sequencing. We aimed at discovering single nucleotide variation present in known and new candidate genes for developmental dyslexia: CYP19A1, DCDC2, DIP2A, DYX1C1, GCFC2 (also known as C2orf3), KIAA0319, MRPL19, PCNT, PRMT2, ROBO1 and S100B. We used next-generation sequencing to identify single-nucleotide polymorphisms in the exons of these 11 genes in pools of 100 DNA samples of Finnish individuals with developmental dyslexia. Subsequent individual genotyping of those 100 individuals, and additional cases and controls from the Finnish and German populations, validated 92 out of 111 different single-nucleotide variants. A nonsynonymous polymorphism in DCDC2 (corrected P=0.002) and a noncoding variant in S100B (corrected P=0.016) showed a significant association with spelling performance in families of German origin. No significant association was found for the variants in either the Finnish case-control sample set or the Finnish family sample set. Our findings further strengthen the role of DCDC2 and implicate S100B in the biology of reading and spelling.
Predisposition to childhood otitis media (OM) has a strong genetic component, with polymorphisms in innate immunity genes suspected to contribute to risk. Studies on several genes have been conducted, but most associations have failed to replicate in independent cohorts.
We investigated 53 gene polymorphisms in a Finnish cohort of 624 cases and 778 controls. A positive association signal was followed up in a tagging approach and tested in an independent Finnish cohort of 205 cases, in a British cohort of 1269 trios, as well as in two cohorts from the United States (US); one with 403 families and the other with 100 cases and 104 controls.
In the initial Finnish cohort, the SNP rs5030717 in the TLR4 gene region showed significant association (OR 1.33, P = .003) with OM. Tagging SNP analysis of the gene found rs1329060 (OR 1.33, P = .002) and rs1329057 (OR 1.29, P = .003) also to be associated. The association was stronger in the more severe phenotype. This finding was supported by an independent Finnish case cohort, but the associations failed to replicate in the British and US cohorts. In studies on TLR4 signaling in 20 study subjects, the three-marker risk haplotype correlated with decreased TNFα secretion in myeloid dendritic cells.
The TLR4 gene locus, regulating the innate immune response, influences the genetic predisposition to childhood OM in a subpopulation of patients. Environmental factors likely modulate the genetic components contributing to the risk of OM.
Keratinocytes (KCs) are the most abundant cells in the epidermis, and they are often isolated and cultured in vitro to study the molecular biology of the skin. Cultured primary cells and various immortalized cells have been frequently used as skin models, but their comparability to intact skin has been questioned. Moreover, when analyzing KC transcriptomes, fluctuation of polyA+ RNA content during the KCs’ lifecycle has been overlooked.
We performed STRT RNA sequencing on 10 ng samples of total RNA from three different sample types: i) epidermal tissue (split-thickness skin grafts), ii) cultured primary KCs, and iii) the HaCaT cell line. We observed significant variation in cellular polyA+ RNA content between tissue and cell culture samples of KCs. The use of synthetic RNAs and SAMstrt in normalization enabled comparison of gene expression levels in the highly heterogeneous samples and facilitated discovery of differences between the tissue samples and cultured cells. The transcriptome analysis sensitively revealed genes involved in KC differentiation in skin grafts and cell cycle regulation-related genes in cultured KCs, and emphasized the fluctuation of transcription factors and non-coding RNAs associated with sample types.
The epidermal keratinocytes derived from tissue and cell culture samples showed highly different polyA+ RNA contents. The use of SAMstrt and synthetic RNA-based normalization allowed the comparison between tissue and cell culture samples and thus proved to be valuable tools for RNA-seq analysis with a translational approach. Transcriptomics revealed clear differences both between tissue and cell culture samples and between primary KCs and immortalized HaCaT cells.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1671-5) contains supplementary material, which is available to authorized users.
GTPase of the immunity-associated protein (GIMAP) family members are differentially regulated during human Th cell differentiation and have been previously connected to immune-mediated disorders in animal studies. GIMAP4 is believed to contribute to the Th cell subtype–driven immunological balance via its role in T cell survival. GIMAP5 has a key role in BB-DR rat and NOD mouse lymphopenia. To elucidate GIMAP4 and GIMAP5 function and role in human immunity, we conducted a study combining genetic association in different immunological diseases and complementing functional analyses. Single nucleotide polymorphisms tagging the GIMAP haplotype variation were genotyped in Finnish type 1 diabetes (T1D) families and in a prospective Swedish asthma and allergic sensitization birth cohort. Initially, GIMAP5 rs6965571 was associated with risk for asthma and allergic sensitization (odds ratio [OR] 3.74, p = 0.00072, and OR 2.70, p = 0.0063, respectively) and protection from T1D (OR 0.64, p = 0.0058); GIMAP4 rs13222905 was associated with asthma (OR 1.28, p = 0.035) and allergic sensitization (OR 1.27, p = 0.0068). However, after false discovery rate correction for multiple testing, only the associations of GIMAP4 with allergic sensitization and GIMAP5 with asthma remained significant. In addition, transcription factor binding sites surrounding the associated loci were predicted. A gene–gene interaction in the T1D data was observed between IL2RA rs2104286 and GIMAP4 rs9640279 (OR 1.52, p = 0.0064) and indicated between INS rs689 and GIMAP5 rs2286899. The follow-up functional analyses revealed lower IL-2RA expression upon GIMAP4 knockdown and an effect of GIMAP5 rs2286899 genotype on protein expression. Thus, the potential role of GIMAP4 and GIMAP5 as modifiers of immune-mediated diseases cannot be discarded.
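The false discovery rate correction mentioned above can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not state which FDR method was used or how many tests were corrected, so the procedure, the function name `benjamini_hochberg`, and the p-values below are all assumptions for illustration and do not reproduce the study's analysis.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, not taken from the study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p))
```

With these made-up inputs only the two smallest p-values pass, which shows how nominally significant results (p < 0.05) can lose significance once the number of tests is taken into account.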
Dyslexia is one of the most common childhood disorders with a prevalence of around 5–10% in school-age children. Although an important genetic component is known to have a role in the aetiology of dyslexia, we are far from understanding the molecular mechanisms leading to the disorder. Several candidate genes have been implicated in dyslexia, including DYX1C1, DCDC2, KIAA0319, and the MRPL19/C2ORF3 locus, each with reports of both positive and no replications. We generated a European cross-linguistic sample of school-age children – the NeuroDys cohort – that includes more than 900 individuals with dyslexia, sampled with homogeneous inclusion criteria across eight European countries, and a comparable number of controls. Here, we describe association analysis of the dyslexia candidate genes/locus in the NeuroDys cohort. We performed both case–control and quantitative association analyses of single markers and haplotypes previously reported to be dyslexia-associated. Although we observed association signals in samples from single countries, we did not find any marker or haplotype that was significantly associated with either case–control status or quantitative measurements of word-reading or spelling in the meta-analysis of all eight countries combined. As in other neurocognitive disorders, our findings underline the need for larger sample sizes to validate possibly weak genetic effects.
dyslexia; word-reading; spelling; association study; candidate genes
Inherited neurodegenerative disorders are debilitating diseases that occur across different species. We have performed clinical, pathological and genetic studies to characterize a novel canine neurodegenerative disease present in the Lagotto Romagnolo dog breed. Affected dogs suffer from progressive cerebellar ataxia, sometimes accompanied by episodic nystagmus and behavioral changes. Histological examination revealed unique pathological changes, including profound neuronal cytoplasmic vacuolization in the nervous system, as well as spheroid formation and cytoplasmic aggregation of vacuoles in secretory epithelial tissues and mesenchymal cells. Genetic analyses uncovered a missense change, c.1288G>A; p.A430T, in the autophagy-related ATG4D gene on canine chromosome 20 with a highly significant disease association (p = 3.8 × 10^-136) in a cohort of more than 2300 Lagotto Romagnolo dogs. ATG4D encodes a poorly characterized cysteine protease belonging to the macroautophagy pathway. Accordingly, our histological analyses indicated altered autophagic flux in affected tissues. The knockdown of the zebrafish homologue atg4da resulted in a widespread developmental disturbance and neurodegeneration in the central nervous system. Our study describes a previously unknown canine neurological disease with particular pathological features and implicates the ATG4D protein as an important autophagy mediator in neuronal homeostasis. The canine phenotype serves as a model to delineate the disease-causing pathological mechanism(s) and ATG4D function, and can also be used to explore treatment options. Furthermore, our results reveal a novel candidate gene for human neurodegeneration and enable the development of a genetic test for veterinary diagnostic and breeding purposes.
Neurodegenerative disorders affect millions of people worldwide. We describe a novel neurodegenerative disease in a canine model, characterized by progressive cerebellar ataxia and cellular vacuolization. Our genetic analyses identified a single nucleotide change in the autophagy-related ATG4D gene in affected dogs. The ATG4D gene has not been linked to inherited diseases before. The autophagy-lysosome pathway plays an important role in degrading and recycling different cellular components. Disturbed autophagy has been reported in several different diseases but mutations in core autophagy components are rare. Histological analyses of affected canine brain tissues revealed altered autophagic flux, and a knockdown of the gene in the zebrafish model caused marked neurodevelopmental alterations and neurodegeneration. Our findings identify a new disease-causing pathway and implicate the ATG4D protease as an important mediator for neuronal homeostasis. Furthermore, our study establishes a large animal model to investigate the role of ATG4D in autophagy and to test possible treatment options.
Age-related changes in DNA methylation occurring in blood leukocytes during early childhood may reflect epigenetic maturation. We hypothesized that some of these changes involve gene networks of critical relevance in leukocyte biology and conducted a prospective study to elucidate the dynamics of DNA methylation. Serial blood samples were collected at 3, 6, 12, 24, 36, 48 and 60 months after birth in ten healthy girls born in Finland and participating in the Type 1 Diabetes Prediction and Prevention Study. DNA methylation was measured using the HumanMethylation450 BeadChip.
After filtering for the presence of polymorphisms and cell-lineage-specific signatures, 794 CpG sites showed significant DNA methylation differences as a function of age in all children (41.6% age-methylated and 58.4% age-demethylated, Bonferroni-corrected P value <0.01). Age-methylated CpGs were more frequently located in gene bodies and within +5 to +50 kilobases (kb) of transcription start sites (TSS) and enriched in developmental, neuronal and plasma membrane genes. Age-demethylated CpGs were associated with promoters and DNase I hypersensitivity sites, located within −5 to +5 kb of the nearest TSS and enriched in genes related to immunity, antigen presentation, the polycomb-group protein complex and cytoplasm.
This study reveals that susceptibility loci for complex inflammatory diseases (for example, IRF5, NOD2, and PTGER4) and genes encoding histone modifiers and chromatin remodeling factors (for example, HDAC4, KDM2A, KDM2B, JARID2, ARID3A, and SMARCD3) undergo DNA methylation changes in leukocytes during early childhood. These results open new perspectives to understand leukocyte maturation and provide a catalogue of CpG sites that may need to be corrected for age effects when performing DNA methylation studies in children.
Electronic supplementary material
The online version of this article (doi:10.1186/s13148-015-0064-6) contains supplementary material, which is available to authorized users.
Age-modified CpG; Childhood; DNA methylation; Genes; Leukocytes; Longitudinal
DNA methylation is a hallmark of genomic imprinting, and differentially methylated regions (DMRs) are found near and in imprinted genes. Imprinted genes are expressed only from the maternal or paternal allele, and their normal balance can be disrupted by uniparental disomy (UPD), the inheritance of both chromosomes of a chromosome pair exclusively from either the mother or the father. Maternal UPD for chromosome 7 (matUPD7) results in Silver-Russell syndrome (SRS) with typical features and growth retardation, but no gene has been conclusively implicated in SRS. In order to identify novel DMRs and putative imprinted genes on chromosome 7, we analyzed eight matUPD7 patients, a segmental matUPD7q31-qter, a rare patUPD7 case and ten controls on the Infinium HumanMethylation450K BeadChip with 30 017 CpG methylation probes for chromosome 7. Genome-scale analysis showed highly significant clustering of DMRs only on chromosome 7, including the known imprinted loci GRB10, SGCE/PEG10, and PEG1/MEST. We found ten novel DMRs on chromosome 7, two DMRs for the predicted imprinted genes HOXA4 and GLI3 and one for the disputed imprinted gene PON1. Quantitative RT-PCR on blood RNA samples comparing matUPD7, patUPD7, and controls showed differential expression for three genes with novel DMRs, HOXA4, GLI3, and SVOPL. Allele-specific expression analysis confirmed maternal-only expression of SVOPL, and imprinting of HOXA4 was supported by monoallelic expression. These results present the first comprehensive map of parent-of-origin-specific DMRs on human chromosome 7, suggesting many new imprinted sites.
differentially methylated regions; imprinting; uniparental disomy; chromosome 7; Silver-Russell syndrome; methylation; genome-scale analysis
Monogenic causes of autoimmunity give key insights to the complex regulation of the immune system. We report a new monogenic cause of autoimmunity resulting from de novo germline activating STAT3 mutations in 5 individuals with a spectrum of early-onset autoimmune disease including type 1 diabetes. These findings emphasise the critical role of STAT3 in autoimmune disease and contrast with the germline inactivating STAT3 mutations that result in Hyper IgE syndrome.
Pre-eclampsia is a common vascular disorder of pregnancy. It originates in the placenta and targets the maternal endothelium. According to epidemiological research, >50% of the liability to this disorder can be accounted for by genetic factors. Both maternal and fetal genes contribute to the risk, but the fetal genetic risk profile in particular is still poorly understood. We have previously detected linkage signals in multiplex Finnish families on chromosomes 2p25, 4q32, and 9p13 using maternal phenotypes. We performed a linkage analysis using updated maternal phenotypes and an unprecedented linkage analysis using fetal phenotypes. Genotyped markers were available from 237 individuals in 15 Finnish families, including 72 affected mothers and 49 affected fetuses. The MERLIN software was used for sample and marker quality control and linkage analysis. The results were compared against the original ones obtained by using the GENEHUNTER 2.1 software. The previous identification of the maternal susceptibility locus to a genetic location at 21.70 cM near marker D2S168 on chromosome 2 was confirmed by using both maternal and fetal phenotypes (maternal non-parametric linkage (NPL) score 3.79, P=0.00008, LOD (logarithm (base 10) of odds)=2.20 and fetal NPL score 2.95, P=0.002, LOD=1.71). As a novel finding, we present a suggestive linkage to chromosome 18 at 86.80 cM near marker D18S64 (NPL score 2.51, P=0.006, LOD=1.20) using the fetal phenotype. We propose that chromosome 18 may harbor a new fetal susceptibility locus for pre-eclampsia.
pre-eclampsia; linkage; maternal phenotype; fetal phenotype; family study
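The relationship between the NPL scores and P-values reported above can be sketched as follows, assuming the NPL score is treated as a standard normal deviate and the P-value is its one-sided upper tail. This is an illustrative approximation, and the helper name `npl_to_p` is hypothetical; it is not a description of how MERLIN computed the reported values.

```python
import math


def npl_to_p(z):
    """One-sided upper-tail probability of a standard normal deviate z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# NPL scores quoted in the abstract above.
reported = [
    ("maternal phenotype, chromosome 2", 3.79),
    ("fetal phenotype, chromosome 2", 2.95),
    ("fetal phenotype, chromosome 18", 2.51),
]

for label, z in reported:
    print(f"{label}: NPL {z:.2f} -> approximate p = {npl_to_p(z):.5f}")
```

Under this assumption the tail probabilities come out close to the reported P-values of 0.00008, 0.002, and 0.006.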
Infertility is a worldwide concern that can be treated with in vitro fertilization (IVF). Improvements in IVF and infertility treatment depend largely on better understanding of the molecular mechanisms of human preimplantation development. Several large-scale studies have been conducted to identify gene expression patterns for the first five days of human development, and many functional studies use the mouse as a model system. We have identified genes of possible importance for this time period by analyzing human microarray data and available data from online databases. We selected 70 candidate genes for human preimplantation development and investigated their expression in early mouse development from oocyte to the 8-cell stage. As expected, maternally loaded genes decreased in expression during development in both human and mouse. We discovered that the 25 genes significantly upregulated after fertilization in human included 13 genes whose orthologs in mouse behaved differently and mimicked the expression profile of maternally expressed genes. Our findings highlight many significant differences in gene expression patterns during mouse and human preimplantation development. We also describe four cancer-testis antigen families that are highly expressed in human embryos: PRAME, SSX, GAGE and MAGEA.
The cancer stem cell model implies a hierarchical organization within breast tumors maintained by cancer stem-like cells (CSCs). Accordingly, CSCs are a subpopulation of cancer cells with capacity for self-renewal, differentiation and tumor initiation. These cells can be isolated through the phenotypic markers CD44+/CD24-, expression of ALDH1 and an ability to form nonadherent, multicellular spheres in vitro. However, the stem cell model remains controversial; it is unclear whether the tumorigenicity of CSCs in vivo is solely a proxy for a certain genotype. Moreover, in vivo evidence is lacking to fully define the reversibility of CSC differentiation.
In order to answer these questions, we undertook exome sequencing of CSCs from 12 breast cancer patients, along with paired primary tumor samples. As suggested by classical stem cell biology, we assumed that the mutations in the CSC subpopulation should be fewer than, and distinct from, those in the differentiated tumor cells with higher proliferation.
Our analysis revealed that the majority of somatic mutations are shared between CSCs and bulk primary tumor, with similar frequencies in the two.
The data presented here exclude the possibility that CSCs are only a phenotypic consequence of certain somatic mutations, that is, a distinct and non-reversible population of cells. In addition, our results imply that CSCs must be a population of cells that can dynamically switch from differentiated tumor cells, and vice versa. This finding increases our understanding of CSC function in tumor heterogeneity and the importance of identifying drugs to counter de-differentiation rather than targeting CSCs.
Pre-eclampsia is an idiopathic pregnancy disorder causing morbidity and mortality in both mother and child. Delivery of the fetus is the only means to resolve severe symptoms. Women with pre-eclamptic pregnancies demonstrate increased risk for later life cardiovascular disease (CVD), and good evidence suggests these two syndromes share several risk factors and pathophysiological mechanisms. To elucidate the genetic architecture of pre-eclampsia we have dissected our chromosome 2q22 susceptibility locus in an extended Australian and New Zealand familial cohort. Positional candidate genes were prioritized for exon-centric sequencing using bioinformatics, SNPing, transcriptional profiling and QTL-walking. In total, we interrogated 1598 variants from 52 genes. Four independent SNP associations satisfied our gene-centric multiple testing correction criteria: a missense LCT SNP (rs2322659, P = 0.0027), a synonymous LRP1B SNP (rs35821928, P = 0.0001), a 3′-UTR RND3 SNP (rs115015150, P = 0.0024) and a missense GCA SNP (rs17783344, P = 0.0020). We replicated the LCT SNP association (P = 0.02) and observed a borderline association for the GCA SNP (P = 0.07) in an independent Australian case–control population. The LRP1B and RND3 SNP associations were not replicated in this same Australian singleton cohort. Moreover, these four SNP associations could not be replicated in two additional case–control populations from Norway and Finland. These four SNPs, however, exhibit pleiotropic effects with several quantitative CVD-related traits. Our results underscore the genetic complexity of pre-eclampsia and present novel empirical evidence of possible shared genetic mechanisms underlying both pre-eclampsia and other CVD-related risk factors.
2q22; cardiovascular disease risk trait; genetic association; pleiotropy; pre-eclampsia
Both genetic and environmental factors are important for the development of allergic diseases. However, a detailed understanding of how such factors act together is lacking. To elucidate the interplay between genetic and environmental factors in allergic diseases, we used a novel bioinformatics approach that combines feature selection and machine learning. In two study populations, PARSIFAL (a European cross-sectional study of 3113 children) and BAMSE (a Swedish birth cohort including 2033 children), genetic variants as well as environmental and lifestyle factors were evaluated for their contribution to allergic phenotypes. Monte Carlo feature selection and rule-based models were used to identify and rank rules describing how combinations of genetic and environmental factors affect the risk of allergic diseases. Novel interactions between genes were suggested and replicated, such as between ORMDL3 and RORA, where certain genotype combinations gave odds ratios for current asthma of 2.1 (95% CI 1.2-3.6) and 3.2 (95% CI 2.0-5.0) in the BAMSE and PARSIFAL children, respectively. Several combinations of environmental factors appeared to be important for the development of allergic disease in children. For example, use of baby formula and antibiotics early in life was associated with an odds ratio of 7.4 (95% CI 4.5-12.0) of developing asthma. Furthermore, genetic variants together with environmental factors seemed to play a role for allergic diseases, such as the use of antibiotics early in life and COL29A1 variants for asthma, and farm living and NPSR1 variants for allergic eczema. Overall, combinations of environmental and lifestyle factors appeared more frequently in the models than combinations solely involving genes. In conclusion, a new bioinformatics approach is described for analyzing complex data, including extensive genetic and environmental information. Interactions identified with this approach could provide useful hints for further in-depth studies of etiological mechanisms and may also strengthen the basis for risk assessment and prevention.
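Odds ratios with 95% confidence intervals, such as those quoted above, are commonly derived from a 2×2 exposure-by-outcome table. The following is a minimal sketch using the standard Wald interval on the log odds ratio. The abstract does not state how its intervals were computed, so the method is an assumption, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    All four counts must be positive. z=1.96 gives a 95% interval.
    """
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from PARSIFAL or BAMSE.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Because the Wald interval is symmetric on the log scale, the odds ratio equals the geometric mean of the interval bounds. The reported intervals are consistent with this: for example, the square root of 1.2 × 3.6 is about 2.1.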
Recent genome-wide association studies (GWASs) conducted in Asian populations have identified novel risk loci for systemic lupus erythematosus (SLE). Here, we genotyped 10 single-nucleotide polymorphisms (SNPs) in eight such loci and investigated their disease associations in three independent Caucasian SLE case–control cohorts recruited from Sweden, Finland and the United States. The disease associations of the SNPs in ETS1, IKZF1, LRRC18-WDFY4, RASGRP3, SLC15A4, TNIP1 and 16p11.2 were replicated, whereas no solid evidence of association was observed for the 7q11.23 locus in the Caucasian cohorts. SLC15A4 was significantly associated with renal involvement in SLE. The association of TNIP1 was more pronounced in SLE patients with renal and immunological disorder, which is corroborated by two previous studies in Asian cohorts. The effects of all the associated SNPs, either conferring risk for or being protective against SLE, were in the same direction in Caucasians and Asians. The magnitudes of the allelic effects for most of the SNPs were also comparable across different ethnic groups. In contrast, remarkable differences in allele frequencies between Caucasian and Asian populations were observed for all associated SNPs. In conclusion, most of the novel SLE risk loci identified by GWASs in Asian populations were also associated with SLE in Caucasian populations. We observed both similarities and differences with respect to the effect sizes and risk allele frequencies across ethnicities.
systemic lupus erythematosus; genetic-association study; Asian; Caucasian
Motivation: Recent transcriptome studies have revealed that total transcript numbers vary by cell type and condition; therefore, the statistical assumptions for single-cell transcriptome studies must be revisited. SAMstrt extends SAMseq, a statistical method for differential expression analysis, to enable spike-in normalization and statistical testing based on the estimated absolute number of transcripts per cell for single-cell RNA-seq methods.
Availability and Implementation: SAMstrt is implemented in R and available on GitHub (https://github.com/shka/R-SAMstrt).
Supplementary data are available at Bioinformatics online.
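The idea behind spike-in normalization can be sketched as follows, assuming the same amount of synthetic spike-in RNA is added to every sample, so that endogenous read counts can be expressed relative to spike-in reads instead of relative to library size. This is a simplified Python illustration; it is not the SAMstrt implementation or its R interface, and the function name, gene names, and counts are hypothetical.

```python
def spike_in_normalize(counts, spike_ids):
    """Scale each sample's endogenous counts by its total spike-in reads.

    counts: dict mapping sample name -> dict mapping feature name -> read count.
    spike_ids: set of feature names that are synthetic spike-in RNAs.
    Returns endogenous features only, as reads per spike-in read.
    """
    normalized = {}
    for sample, features in counts.items():
        spike_total = sum(n for name, n in features.items() if name in spike_ids)
        if spike_total == 0:
            raise ValueError(f"no spike-in reads in sample {sample!r}")
        normalized[sample] = {
            name: n / spike_total
            for name, n in features.items()
            if name not in spike_ids
        }
    return normalized


# Hypothetical counts: both samples have 50 reads for GENE_A, but the second
# sample yields fewer spike-in reads, implying more endogenous RNA per spike-in.
raw = {
    "sample_1": {"SPIKE_1": 60, "SPIKE_2": 40, "GENE_A": 50},
    "sample_2": {"SPIKE_1": 30, "SPIKE_2": 20, "GENE_A": 50},
}
print(spike_in_normalize(raw, {"SPIKE_1", "SPIKE_2"}))
```

In this made-up example the raw counts for GENE_A are identical, yet the spike-in-scaled values differ twofold, which is the kind of difference in total transcript content that library-size normalization would hide.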
Genetic variation in the transcription factor Interferon Regulatory Factor 6 (IRF6) causes and contributes risk for oral clefting disorders. We hypothesized that genes regulated by IRF6 are also involved in oral clefting disorders. We used five criteria to identify potential IRF6 target genes: differential gene expression in skin taken from wild type and Irf6-deficient murine embryos, localization to the Van der Woude syndrome 2 (VWS2) locus at 1p36–1p32, overlapping expression with Irf6, presence of a conserved predicted binding site in the promoter region, and a mutant murine phenotype that was similar to the Irf6 mutant mouse. Previously, we observed altered expression for 573 genes; 13 were located in the murine region syntenic to the VWS2 locus. Two of these genes, Wdr65 and Stratifin, met four of five criteria. Wdr65 was a novel gene that encoded a predicted protein of 1250 amino acids with two WD domains. Because these genes are potential targets for Irf6 regulation, we hypothesized that disease-causing mutations will be found in WDR65 and Stratifin in individuals with VWS or VWS-like syndromes. We identified a potentially etiologic missense mutation in WDR65 in a person with VWS who does not have an exonic mutation in IRF6. The expression and mutation data were consistent with the hypothesis that WDR65 was a novel gene involved in oral clefting.
cleft lip and palate; mutation; gene expression; syndrome; genomic; microvilli; WD domain; transcription factor
Epigenetic mechanisms integrate genetic and environmental causes of disease. Comprehensive genome-wide analyses of epigenetic modifications have not demonstrated robust association with common diseases. Using Illumina HumanMethylation450 arrays on 354 ACPA-positive rheumatoid arthritis (RA) cases and 337 controls, we identified two clusters within the MHC region whose differential methylation potentially mediates genetic risk for RA. To reduce the confounding that has hampered previous epigenome-wide studies, we corrected for cellular heterogeneity by estimating and adjusting for cell-type proportions and used mediation analysis to filter out associations likely consequential to disease. Four CpGs also showed association between genotype and variance of methylation in addition to mean. For both clusters, the association replicated for at least one CpG (p<0.01), with the rest showing suggestive association, in monocytes from an independent set of 12 cases and 12 controls. Thus, DNA methylation is a potential mediator of genetic risk.