Results 1-12 (12)

1.  Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses 
Nucleic Acids Research  2015;43(15):e97.
Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package.
PMCID: PMC4551905  PMID: 25925576
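The weighting idea in this abstract can be illustrated with a minimal Python sketch. It is not the limma implementation: the abstract describes a fitted log-linear variance model, whereas this sketch substitutes a simple moment estimate (mean squared residual per sample, centred on the log scale). The function names and the residual matrix are hypothetical and exist only for illustration.

```python
import math

def sample_variance_factors(residuals):
    """Estimate one relative variance factor per sample.

    residuals: list of gene rows, each a list of per-sample residuals
    from some fitted model (hypothetical input, assumed non-zero
    mean square in every sample).
    This is a crude moment estimate, NOT the log-linear variance
    model described in the abstract.
    """
    n_genes = len(residuals)
    n_samples = len(residuals[0])
    # Mean squared residual per sample, shared across all genes.
    mean_sq = [sum(row[j] ** 2 for row in residuals) / n_genes
               for j in range(n_samples)]
    # Centre on the log scale so the factors multiply to 1.
    log_ms = [math.log(v) for v in mean_sq]
    centre = sum(log_ms) / n_samples
    return [math.exp(v - centre) for v in log_ms]

def combine_weights(obs_weights, var_factors):
    """Down-weight each observation by its sample's variance factor.

    obs_weights: genes x samples matrix of observation-level weights
    (for example, precision weights from a mean-variance trend).
    """
    return [[w / f for w, f in zip(row, var_factors)]
            for row in obs_weights]

# Toy example: the third sample is noisier than the other two.
toy_residuals = [[1.0, 1.0, 2.0], [-1.0, 1.0, -2.0]]
toy_factors = sample_variance_factors(toy_residuals)
toy_weights = combine_weights([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], toy_factors)
```

In the toy example the noisier third sample receives a larger variance factor and therefore a smaller combined weight, so it still contributes to the analysis but with less influence than the other samples.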
2.  Repression of Igf1 expression by Ezh2 prevents basal cell differentiation in the developing lung 
Development (Cambridge, England)  2015;142(8):1458-1469.
Epigenetic mechanisms involved in the establishment of lung epithelial cell lineage identities during development are largely unknown. Here, we explored the role of the histone methyltransferase Ezh2 during lung lineage determination. Loss of Ezh2 in the lung epithelium leads to defective lung formation and perinatal mortality. We show that Ezh2 is crucial for airway lineage specification and alveolarization. Using optical projection tomography imaging, we found that branching morphogenesis is affected in Ezh2 conditional knockout mice and the remaining bronchioles are abnormal, lacking terminally differentiated secretory club cells. Remarkably, RNA-seq analysis revealed the upregulation of basal genes in Ezh2-deficient epithelium. Three-dimensional imaging for keratin 5 further showed the unexpected presence of a layer of basal cells from the proximal airways to the distal bronchioles in E16.5 embryos. ChIP-seq analysis indicated the presence of Ezh2-mediated repressive marks on the genomic loci of some but not all basal genes, suggesting an indirect mechanism of action of Ezh2. We found that loss of Ezh2 de-represses insulin-like growth factor 1 (Igf1) expression and that modulation of IGF1 signaling ex vivo in wild-type lungs could induce basal cell differentiation. Altogether, our work reveals an unexpected role for Ezh2 in controlling basal cell fate determination in the embryonic lung endoderm, mediated in part by repression of Igf1 expression.
SUMMARY: The histone methyltransferase Ezh2 inhibits basal cell differentiation in the mouse lung by depositing repressive marks on the promoter region of basal cell genes and by repressing Igf1 expression.
PMCID: PMC4392602  PMID: 25790853
Polycomb repressive complex 2; Ezh2; Lung development; Basal cells; IGF1; Mouse
3.  edgeR: a versatile tool for the analysis of shRNA-seq and CRISPR-Cas9 genetic screens 
F1000Research  2014;3:95.
Pooled library sequencing screens that perturb gene function in a high-throughput manner are becoming increasingly popular in functional genomics research. Irrespective of the mechanism by which loss of function is achieved, via either RNA interference using short hairpin RNAs (shRNAs) or genetic mutation using single guide RNAs (sgRNAs) with the CRISPR-Cas9 system, there is a need to establish optimal analysis tools to handle such data. Our open-source processing pipeline in edgeR provides a complete analysis solution for screen data, that begins with the raw sequence reads and ends with a ranked list of candidate genes for downstream biological validation. We first summarize the raw data contained in a fastq file into a matrix of counts (samples in the columns, genes in the rows) with options for allowing mismatches and small shifts in sequence position. Diagnostic plots, normalization and differential representation analysis can then be performed using established methods to prioritize results in a statistically rigorous way, with the choice of either the classic exact testing methodology or generalized linear modeling that can handle complex experimental designs. A detailed users’ guide that demonstrates how to analyze screen data in edgeR along with a point-and-click implementation of this workflow in Galaxy are also provided. The edgeR package is freely available from
PMCID: PMC4023662  PMID: 24860646
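The read-summarization step described in this abstract (matching reads against a library while allowing mismatches and small position shifts) can be sketched in Python as follows. This is an illustrative sketch only, not the edgeR code: the function names, parameters, tie-breaking rule and toy sequences are all assumptions made for the example.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def best_match(read, library, start, max_mismatch, max_shift):
    """Return the library entry that best matches the read, or None.

    The expected sequence position is `start`; windows shifted by up to
    `max_shift` bases either way are also tried. Ties are resolved in
    favour of the first candidate found (an arbitrary choice here).
    """
    best_name, best_dist = None, max_mismatch + 1
    for shift in range(-max_shift, max_shift + 1):
        pos = start + shift
        if pos < 0:
            continue
        for name, seq in library.items():
            window = read[pos:pos + len(seq)]
            if len(window) < len(seq):
                continue  # read too short at this offset
            dist = hamming(window, seq)
            if dist < best_dist:
                best_name, best_dist = name, dist
    return best_name

def count_matches(reads_by_sample, library, start,
                  max_mismatch=0, max_shift=0):
    """Build a count table: library entries in rows, samples in columns."""
    counts = {name: {sample: 0 for sample in reads_by_sample}
              for name in library}
    for sample, reads in reads_by_sample.items():
        for read in reads:
            hit = best_match(read, library, start, max_mismatch, max_shift)
            if hit is not None:
                counts[hit][sample] += 1
    return counts

# Toy library and reads (hypothetical sequences).
toy_library = {"g1": "ACGT", "g2": "TTTT"}
toy_reads = {"A": ["GGACGTCC", "GGACGACC", "GGTTTTCC"]}
exact = count_matches(toy_reads, toy_library, start=2)
relaxed = count_matches(toy_reads, toy_library, start=2, max_mismatch=1)
shifted = count_matches({"B": ["GACGTCCC"]}, toy_library, start=2, max_shift=1)
```

With exact matching, the second toy read is discarded; allowing one mismatch assigns it to `g1`. The last call shows a read whose target sits one base earlier than expected, which is recovered only when a shift is allowed.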
4.  A comparison of control samples for ChIP-seq of histone modifications 
Frontiers in Genetics  2014;5:329.
The advent of high-throughput sequencing has allowed genome wide profiling of histone modifications by Chromatin ImmunoPrecipitation (ChIP) followed by sequencing (ChIP-seq). In this assay the histone mark of interest is enriched through a chromatin pull-down assay using an antibody for the mark. Due to imperfect antibodies and other factors, many of the sequenced fragments do not originate from the histone mark of interest, and are referred to as background reads. Background reads are not uniformly distributed and therefore control samples are usually used to estimate the background distribution at any given genomic position. The Encyclopedia of DNA Elements (ENCODE) Consortium guidelines suggest sequencing a whole cell extract (WCE, or “input”) sample, or a mock ChIP reaction such as an IgG control, as a background sample. However, for a histone modification ChIP-seq investigation it is also possible to use a Histone H3 (H3) pull-down to map the underlying distribution of histones. In this paper we generated data from a hematopoietic stem and progenitor cell population isolated from mouse fetal liver to compare WCE and H3 ChIP-seq as control samples. The quality of the control samples is estimated by a comparison to pull-downs of histone modifications and to expression data. We find minor differences between WCE and H3 ChIP-seq, such as coverage in mitochondria and behavior close to transcription start sites. Where the two controls differ, the H3 pull-down is generally more similar to the ChIP-seq of histone modifications. However, the differences between H3 and WCE have a negligible impact on the quality of a standard analysis.
PMCID: PMC4174756  PMID: 25309581
ChIP-seq; histone modifications; control sample; whole cell extract; input; H3; quality control
5.  shRNA-seq data analysis with edgeR  
F1000Research  2014;3:95.
Pooled short hairpin RNA sequencing (shRNA-seq) screens are becoming increasingly popular in functional genomics research, and there is a need to establish optimal analysis tools to handle such data. Our open-source shRNA processing pipeline in edgeR provides a complete analysis solution for shRNA-seq screen data that begins with the raw sequence reads and ends with a ranked list of candidate shRNAs for downstream biological validation. We first summarize the raw data contained in a fastq file into a matrix of counts (samples in the columns, hairpins in the rows) with options for allowing mismatches and small shifts in hairpin position. Diagnostic plots, normalization and differential representation analysis can then be performed using established methods to prioritize results in a statistically rigorous way, with the choice of either the classic exact testing methodology or generalized linear modelling that can handle complex experimental designs. A detailed users’ guide that demonstrates how to analyze screen data in edgeR along with a point-and-click implementation of this workflow in Galaxy are also provided. The edgeR package is freely available from
PMCID: PMC4023662  PMID: 24860646
6.  An ENU mutagenesis screen identifies novel and known genes involved in epigenetic processes in the mouse 
Genome Biology  2013;14(9):R96.
We have used a sensitized ENU mutagenesis screen to produce mouse lines that carry mutations in genes required for epigenetic regulation. We call these lines Modifiers of murine metastable epialleles (Mommes).
We report a basic molecular and phenotypic characterization for twenty of the Momme mouse lines, and in each case we also identify the causative mutation. Three of the lines carry a mutation in a novel epigenetic modifier, Rearranged L-myc fusion (Rlf), and one gene, Rap-interacting factor 1 (Rif1), has not previously been reported to be involved in transcriptional regulation in mammals. Many of the other lines are novel alleles of known epigenetic regulators. For two genes, Rlf and Widely-interspaced zinc finger (Wiz), we describe the first mouse mutants. All of the Momme mutants show some degree of homozygous embryonic lethality, emphasizing the importance of epigenetic processes. The penetrance of lethality is incomplete in a number of cases. Similarly, abnormalities in phenotype seen in the heterozygous individuals of some lines occur with incomplete penetrance.
Recent advances in sequencing enhance the power of sensitized mutagenesis screens to identify the function of previously uncharacterized factors and to discover additional functions for previously characterized proteins. The observation of incomplete penetrance of phenotypes in these inbred mutant mice, at various stages of development, is of interest. Overall, the Momme collection of mouse mutants provides a valuable resource for researchers across many disciplines.
PMCID: PMC4053835  PMID: 24025402
7.  Smchd1 regulates a subset of autosomal genes subject to monoallelic expression in addition to being critical for X inactivation 
Smchd1 is an epigenetic modifier essential for X chromosome inactivation: female embryos lacking Smchd1 fail during midgestational development. Male mice are less affected by Smchd1-loss, with some (but not all) surviving to become fertile adults on the FVB/n genetic background. On other genetic backgrounds, all males lacking Smchd1 die perinatally. This suggests that, in addition to being critical for X inactivation, Smchd1 functions to control the expression of essential autosomal genes.
Using genome-wide microarray expression profiling and RNA-seq, we have identified additional genes that fail X inactivation in female Smchd1 mutants and have identified autosomal genes in male mice where the normal expression pattern depends upon Smchd1. A subset of genes in the Snrpn imprinted gene cluster show an epigenetic signature and biallelic expression consistent with loss of imprinting in the absence of Smchd1. In addition, single nucleotide polymorphism analysis of expressed genes in the placenta shows that the Igf2r imprinted gene cluster is also disrupted, with Slc22a3 showing biallelic expression in the absence of Smchd1. In both cases, the disruption was not due to loss of the differential methylation that marks the imprint control region, but affected genes remote from this primary imprint controlling element. The clustered protocadherins (Pcdhα, Pcdhβ, and Pcdhγ) also show altered expression levels, suggesting that their unique pattern of random combinatorial monoallelic expression might also be disrupted.
Smchd1 has a role in the expression of several autosomal gene clusters that are subject to monoallelic expression, rather than being restricted to functioning uniquely in X inactivation. Our findings, combined with the recent report implicating heterozygous mutations of SMCHD1 as a causal factor in the digenically inherited muscular weakness syndrome facioscapulohumeral muscular dystrophy-2, highlight the potential importance of Smchd1 in the etiology of diverse human diseases.
PMCID: PMC3707822  PMID: 23819640
Clustered protocadherins; Genomic imprinting; Monoallelic expression; Smchd1; X inactivation
8.  ChIP-seq analysis reveals distinct H3K27me3 profiles that correlate with transcriptional activity 
Nucleic Acids Research  2011;39(17):7415-7427.
Transcriptional control is dependent on a vast network of epigenetic modifications. One epigenetic mark of particular interest is tri-methylation of lysine 27 on histone H3 (H3K27me3), which is catalysed and maintained by Polycomb Repressive Complex 2 (PRC2). Although this histone mark is studied widely, the precise relationship between its local pattern of enrichment and regulation of gene expression is currently unclear. We have used ChIP-seq to generate genome-wide maps of H3K27me3 enrichment, and have identified three enrichment profiles with distinct regulatory consequences. First, a broad domain of H3K27me3 enrichment across the body of genes corresponds to the canonical view of H3K27me3 as inhibitory to transcription. Second, a peak of enrichment around the transcription start site (TSS) is commonly associated with ‘bivalent’ genes, where H3K4me3 also marks the TSS. Finally and most surprisingly, we identified an enrichment profile with a peak in the promoter of genes that is associated with active transcription. Genes with each of these three profiles were found in different proportions in each of the cell types studied. The data analysis techniques developed here will be useful for the identification of common enrichment profiles for other histone modifications that have important consequences for transcriptional regulation.
PMCID: PMC3177187  PMID: 21652639
9.  Reduced dosage of the modifiers of epigenetic reprogramming Dnmt1, Dnmt3L, SmcHD1 and Foxo3a has no detectable effect on mouse telomere length in vivo 
Chromosoma  2011;120(4):377-385.
Studies carried out in cultured cells have implicated modifiers of epigenetic reprogramming in the regulation of telomere length, reporting elongation in cells that were null for DNA methyltransferase 1 (Dnmt1), for both de novo DNA methyltransferases (Dnmt3a and Dnmt3b), or for various histone methyltransferases. To investigate this further, we assayed telomere length in whole embryos or adult tissue from mice carrying mutations in four different modifiers of epigenetic reprogramming: Dnmt1, DNA methyltransferase 3-like, structural maintenance of chromosomes hinge domain containing 1, and forkhead box O3a. Terminal restriction fragment analysis was used to compare telomere length in homozygous mutants, heterozygous mutants and wild-type littermates. Contrary to expectation, we did not detect overall lengthening in the mutants, raising questions about the role of epigenetic processes in telomere length in vivo.
Electronic supplementary material
The online version of this article (doi:10.1007/s00412-011-0318-9) contains supplementary material, which is available to authorized users.
PMCID: PMC3140923  PMID: 21553025
10.  A genome-wide screen for modifiers of transgene variegation identifies genes with critical roles in development 
Genome Biology  2008;9(12):R182.
An extended ENU screen for modifiers of transgene variegation identified four new modifiers, MommeD7-D10.
Some years ago we established an N-ethyl-N-nitrosourea screen for modifiers of transgene variegation in the mouse, and a preliminary description of the first six mutant lines, named MommeD1-D6, has been published. We have reported the underlying genes in three cases: MommeD1 is a mutation in SMC hinge domain containing 1 (Smchd1), a novel modifier of epigenetic gene silencing; MommeD2 is a mutation in DNA methyltransferase 1 (Dnmt1); and MommeD4 is a mutation in Smarca5 (Snf2h), a known chromatin remodeler. The identification of Dnmt1 and Smarca5 attests to the effectiveness of the screen design.
We have now extended the screen and have identified four new modifiers, MommeD7-D10. Here we show that all ten MommeDs link to unique sites in the genome, that homozygosity for the mutations is associated with severe developmental abnormalities and that heterozygosity results in phenotypic abnormalities and reduced reproductive fitness in some cases. In addition, we have now identified the underlying genes for MommeD5 and MommeD10. MommeD5 is a mutation in Hdac1, which encodes histone deacetylase 1, and MommeD10 is a mutation in Baz1b (also known as Williams syndrome transcription factor), which encodes a transcription factor containing a PHD-type zinc finger and a bromodomain. We show that reduction in the level of Baz1b in the mouse results in craniofacial features reminiscent of Williams syndrome.
These results demonstrate the importance of dosage-dependent epigenetic reprogramming in the development of the embryo and the power of the screen to provide mouse models to study this process.
PMCID: PMC2646286  PMID: 19099580
11.  Polycomb Repressive Complex 2 (PRC2) Restricts Hematopoietic Stem Cell Activity 
PLoS Biology  2008;6(4):e93.
Polycomb group proteins are transcriptional repressors that play a central role in the establishment and maintenance of gene expression patterns during development. Using mice with an N-ethyl-N-nitrosourea (ENU)-induced mutation in Suppressor of Zeste 12 (Suz12), a core component of Polycomb Repressive Complex 2 (PRC2), we show here that loss of Suz12 function enhances hematopoietic stem cell (HSC) activity. In addition to these effects on a wild-type genetic background, mutations in Suz12 are sufficient to ameliorate the stem cell defect and thrombocytopenia present in mice that lack the thrombopoietin receptor (c-Mpl). To investigate the molecular targets of the PRC2 complex in the HSC compartment, we examined changes in global patterns of gene expression in cells deficient in Suz12. We identified a distinct set of genes that are regulated by Suz12 in hematopoietic cells, including eight genes that appear to be highly responsive to PRC2 function within this compartment. These data suggest that PRC2 is required to maintain a specific gene expression pattern in hematopoiesis that is indispensable to normal stem cell function.
Author Summary
The chromatin environment that surrounds a gene heavily influences the gene's transcriptional activity. Specific modifications on histone tails serve as signposts for the basal transcriptional machinery, reflecting a cell's developmental history and identifying genes that should be actively transcribed and those that must be repressed. Polycomb group proteins are involved in large, multiprotein complexes that catalyse the post-translational modification of histones. The disruption of these complexes induces wholesale changes in gene expression, a scenario commonly seen in diseases such as cancer. We have investigated the role of Polycomb group proteins during blood cell formation: in stem cells, progenitor cells, and mature blood cells. Using a variety of functional assays, we demonstrate an important role for Polycomb group proteins in restricting the activity of hematopoietic stem cells. To define the molecular targets of the complex, we examined gene expression profiles in cells with impaired expression of Polycomb group proteins. This analysis identified a set of target genes within the hematopoietic compartment that was distinct from those defined in embryonic stem cells and fibroblasts. This study provides new insights into the role of these proteins during hematopoiesis, and suggests a novel mechanism by which they might contribute to leukaemia.
Epigenetic modifications are central to the maintenance of cellular identity and are dynamically regulated during differentiation. We addressed the role of Polycomb group proteins during hematopoiesis and define a series of genes that are highly responsive to Polycomb dysfunction.
PMCID: PMC2292752  PMID: 18416604
12.  Dynamic Reprogramming of DNA Methylation at an Epigenetically Sensitive Allele in Mice 
PLoS Genetics  2006;2(4):e49.
There is increasing evidence in both plants and animals that epigenetic marks are not always cleared between generations. Incomplete erasure at genes associated with a measurable phenotype results in unusual patterns of inheritance from one generation to the next, termed transgenerational epigenetic inheritance. The Agouti viable yellow (Avy) allele is the best-studied example of this phenomenon in mice. The Avy allele is the result of a retrotransposon insertion upstream of the Agouti gene. Expression at this locus is controlled by the long terminal repeat (LTR) of the retrotransposon, and expression results in a yellow coat and correlates with hypomethylation of the LTR. Isogenic mice display variable expressivity, resulting in mice with a range of coat colours, from yellow through to agouti. Agouti mice have a methylated LTR. The locus displays epigenetic inheritance following maternal but not paternal transmission; yellow mothers produce more yellow offspring than agouti mothers. We have analysed the DNA methylation in mature gametes, zygotes, and blastocysts and found that the paternally and maternally inherited alleles are treated differently. The paternally inherited allele is demethylated rapidly, and the maternal allele is demethylated more slowly, in a manner similar to that of nonimprinted single-copy genes. Interestingly, following maternal transmission of the allele, there is no DNA methylation in the blastocyst, suggesting that DNA methylation is not the inherited mark. We have independent support for this conclusion from studies that do not involve direct analysis of DNA methylation. Haplo-insufficiency for Mel18, a polycomb group protein, introduces epigenetic inheritance at a paternally derived Avy allele, and the pedigrees reveal that this occurs after zygotic genome activation and, therefore, despite the rapid demethylation of the locus.
There is now a reasonable amount of evidence, from both epidemiological studies in humans and genetic studies in animals and plants, that information in addition to the primary DNA sequence is inherited across generations and can influence the phenotype of the offspring. Researchers refer to this information as epigenetic, and there is much interest in discovering the molecular basis for this epigenetic information. They now know a great deal about the various types of epigenetic marks that regulate the expression of the genome within the life of an organism. These include modifications to the DNA molecule itself, specifically DNA methylation, and modifications to the proteins that package the DNA into chromosomes, termed chromatin. DNA methylation appears to be one of the most stable epigenetic modifications and has been the primary candidate for the molecule responsible for transgenerational epigenetic inheritance. The results presented here suggest that DNA methylation is not the inherited epigenetic mark, at least in the mouse model used in this study.
PMCID: PMC1428789  PMID: 16604157
