The cohesin complex, discovered for its role in sister chromatid cohesion, also plays roles in gene expression and development in organisms from yeast to man. This review highlights what has been learned about the gene control and developmental functions of cohesin and the Nipped-B (NIPBL/Scc2) cohesin loading factor in Drosophila. The Drosophila studies have provided unique insights into the etiology of Cornelia de Lange syndrome (CdLS), which is caused by mutations affecting sister chromatid cohesion proteins in humans. In vivo experiments with Drosophila show that cohesin and Nipped-B have dosage-sensitive effects on the functions of many evolutionarily conserved genes and developmental pathways. Genome-wide studies with Drosophila cultured cells show that Nipped-B and cohesin co-localize on chromosomes, and bind preferentially, but not exclusively, to many actively-transcribed genes and their regulatory sequences, including many of the proposed in vivo target genes. In contrast, the cohesion factors are largely excluded from genes silenced by Polycomb group (PcG) proteins. Combined, the in vivo genetic data and the binding patterns of cohesin and Nipped-B in cultured cells are consistent with the hypothesis that they control the action of gene regulatory sequences, including transcriptional enhancers and insulators, and suggest that they might also help define active chromatin domains and influence transcriptional elongation.
The subunits of the cohesin complex (Figure 1) were discovered in genetic screens in fungi for mutations that cause defects in sister chromatid cohesion (Guacci et al. 1997; Michaelis et al. 1997). The structure, regulation and diverse functions of cohesin are reviewed in other articles in this issue (Austin et al., Gartenberg, MacNairn and Gerton, Sakuno and Watanabe, Wendt and Peters). This article focuses on the roles of cohesin and its regulatory factors in gene expression and development in the fruitfly, Drosophila melanogaster.
Similar to other SMC (Structural Maintenance of Chromosome) complexes, cohesin contains a heterodimer of two SMC proteins and other factors. In cohesin these are the Smc1–Smc3 heterodimer, the Rad21 kleisin protein, and Stromalin (Figure 1). Cohesin is structurally distinct from other SMC complexes in that a kink in the arm of Smc3 creates an open ring-like structure, with Rad21 linking the Smc1 and Smc3 ATPase head domains (Figure 1). There is substantial evidence that the cohesin ring encircles chromosomes (Ivanov and Nasmyth 2005, Ivanov and Nasmyth 2007, Haering et al. 2008), supporting the ideas that cohesin mediates sister chromatid cohesion by encircling both sisters, or by interactions between two cohesin rings encircling different sisters.
The Nipped-B protein and its Mau-2 partner (Figure 1) are required for cohesin to associate with chromosomes, and the Eco1 and Pds5 proteins are needed to establish and/or maintain sister chromatid cohesion. Cohesin is removed from chromosome arms during prophase, and from centromeres at the metaphase-to-anaphase transition, to permit sister separation and cell division.
In most organisms, cohesin rebinds to chromosomes immediately after cell division in telophase, and therefore binds extensively along chromosomes throughout interphase. It is thus not altogether surprising that cohesin regulates gene expression. As described in other articles in this issue, cohesin influences the function of silencing proteins in yeast (Gartenberg), and helps mediate CTCF protein insulator activity in mouse and human cells (Wendt and Peters).
Studies in Drosophila have revealed that cohesin, Nipped-B, and Pds5 regulate gene expression, and that this regulation is important for organismal development. Based on the Drosophila studies described below, it seems likely that multiple changes in gene expression underlie the many developmental deficits, such as slow growth, mental retardation, autism, limb and organ structural defects, that occur in Cornelia de Lange syndrome (CdLS, Dorsett 2007, Liu and Krantz 2008, Dorsett and Krantz 2009). CdLS is caused by heterozygous loss-of-function mutations in the human NIPBL ortholog of Nipped-B and Scc2 in at least half the cases (Krantz et al. 2004, Tonkin et al. 2004). In some 5% of cases, CdLS is caused by amino acid changes in the Smc1 or Smc3 cohesin subunits, indicating that the role of NIPBL in human development involves regulation of cohesin function (Musio et al. 2006, Deardorff et al. 2007). This article first reviews several examples of genetic evidence showing that sister chromatid cohesion factors control segmental identity, and development of limbs and nervous system in Drosophila. This is followed by examination of the molecular data supporting the idea that the diverse in vivo effects on development are through direct regulation of gene expression, and discussion of some of the potential mechanisms by which such regulation might occur.
The first evidence that cohesion factors regulate gene expression and development in eumetazoa arose from studies on the Drosophila cut gene (Figure 2, Rollins et al. 1999). cut encodes a homeobox protein involved in many developmental processes, and has human orthologs with similar functions (Nepveu 2001, Sansregret and Nepveu 2008). During early wing development, the Notch receptor signaling pathway activates cut in a thin strip of cells located between the cells that will eventually form the dorsal and ventral surfaces of the adult wing. The cut-expressing cells give rise to the bristle-forming cells around the adult wing margin, and loss of cut expression results in loss of these margin cells (Figure 2).
Activation of cut in the developing wing margin in response to Notch receptor signaling is mediated by a transcriptional enhancer located more than 80 kbp upstream of the cut transcription start site (Figure 2, Jack et al. 1991). In one of the first descriptions of insulators, it was shown that the ability of this distant enhancer to activate cut is impeded by several independent natural insertions of the gypsy retrovirus into the region between the enhancer and promoter (Figure 2, Jack et al. 1991). Gypsy binds the Suppressor of Hairy-wing [Su(Hw)] insulator protein, and long-range enhancer-promoter communication and the wing margin phenotype of cut gypsy insertion alleles vary quantitatively with the in vivo activity of Su(Hw) over a several hundred-fold range (Dorsett 1993).
The quantitative nature of Su(Hw) insulation was exploited to identify genes that control long-range transcriptional activation of cut. A gypsy insertion allele of cut was partially suppressed by a weak su(Hw) mutation, and mutations that dominantly decrease cut activation (or increase insulator activity) were isolated by virtue of the increased loss of adult wing margin (Figure 2, Morcillo et al. 1996, Rollins et al. 1999). These screens identified several loss-of-function alleles of Nipped-B, named for the gaps in the adult wing margin (Figure 2). Although the Nipped-B mutations were isolated in a screen using a gypsy insertion, they also reduce expression of wild-type cut alleles, indicating that Nipped-B normally regulates cut. Based on several lines of genetic evidence, it was postulated that Nipped-B is particularly critical for long-range enhancer-promoter communication (Rollins et al. 1999).
Nipped-B encodes a homolog of the S. cerevisiae Scc2 and S. pombe Mis4 proteins, which are required for sister chromatid cohesion and binding of cohesin to chromosomes (Ciosk et al. 2000, Tomonaga et al. 2000). Heterozygous Nipped-B mutants, which have reduced cut expression in the wing margin, are viable and do not show chromatid cohesion defects. In contrast, homozygous Nipped-B mutants survive only a few days into development and display significantly reduced sister chromatid cohesion prior to death, confirming that Nipped-B is a true Scc2/Mis4 ortholog (Rollins et al. 2004, Gause et al. 2008b).
Identification of Nipped-B as a sister chromatid cohesion protein raised the question of whether cohesin also participates in long-range activation of cut. Because Nipped-B facilitates both long-range activation and cohesin binding to chromosomes, it was expected that cohesin would have the same effect as Nipped-B. In contrast to this expectation, however, non-lethal in vivo reduction of the Rad21, SA or Smc1 cohesin subunit dosage by RNAi or mutation increases expression of cut in the developing wing (Rollins et al. 2004, Dorsett et al. 2005).
Based on these results, it was hypothesized that cohesin functions as an insulator, inhibiting enhancer-promoter communication in cut, and that Nipped-B facilitates enhancer-promoter interaction by dynamically altering cohesin binding (Dorsett 2004). As described by Wendt and Peters in this issue, this hypothesis is supported by recent studies showing that cohesin contributes to the insulator activity of the CTCF (CCCTC-binding factor) zinc finger protein in mouse and human cells (Parelho et al. 2008, Wendt et al. 2008). It is also consistent with the findings described later, that Nipped-B and cohesin bind to the regulatory and transcribed regions of cut in cultured cells, and that the binding pattern varies with transcription (Misulovin et al. 2008).
Although the eco and separation anxiety (san) Drosophila homologs of Eco1 are required for centromeric cohesion (Williams et al. 2003), unlike the cohesin subunits, they do not have dosage-sensitive effects on cut expression (Dorsett et al. 2005). More recently, it has been suggested that the Scc2 yeast ortholog of Nipped-B is involved in the chromosomal loading and/or positioning of the condensin (D'Ambrosio et al. 2008) and Smc5/6 DNA repair (Lindroos et al. 2006) SMC complexes. Condensin-like complexes regulate gene expression in C. elegans (Ercan and Lieb, this issue; Meyer 2005), and the Barren condensin subunit has been implicated in gene silencing in Drosophila (Lupo et al. 2001). Mutations affecting the Smc5/6 proteins are not available, but mutations in the Drosophila barren and gluon genes, which encode condensin subunits, do not have dominant effects on cut expression (R.A. Rollins and D.D., unpublished).
Other genetic evidence supports the idea that Nipped-B regulates cut expression through control of cohesin. The Pds5 protein was identified in genetic screens in S. cerevisiae for defects in sister chromatid cohesion (Hartman et al. 2000, Panizza et al. 2000), and like cohesin and Nipped-B, it is also conserved from yeast to man. Similar to its orthologs in other organisms, Drosophila Pds5 is required for sister chromatid cohesion, but not for binding of cohesin to chromosomes, and it co-localizes with cohesin on chromosomes (Dorsett et al. 2005, Gause et al. 2008b). A null pds5 allele dominantly decreases cut expression in the developing wing margin to a slight extent (Dorsett et al. 2005). In contrast to the null allele, an allele predicted to encode a Pds5 protein lacking the N terminus increases cut expression. This truncation allele also diminishes cohesin binding to salivary gland polytene chromosomes when homozygous, suggesting that the mutant protein may destabilize or block binding of cohesin to chromosomes. The Pds5 data thus further suggest that cohesin interferes with cut expression in the developing wing margin.
Recently it was shown that homozygous mutations in Pds5B, one of the two mouse Pds5 genes, cause perinatal lethality and several developmental abnormalities reminiscent of some seen in Cornelia de Lange syndrome (Zhang et al. 2007). No defects in sister chromatid cohesion were detected, and thus these findings indicate that Pds5 proteins also likely regulate genes in mammals.
The bithorax complex (BX-C) contains three homeobox (HOX) genes, Ubx, abdominal-A (abd-A) and Abdominal-B (Abd-B) that control segmental identity, and limb and organ development (Maeda and Karch 2006). The BX-C genes, and the transcriptional enhancers that activate them, occupy nearly 320 kbp and are arranged along the chromosome in the anterior-to-posterior order of the thoracic and abdominal segments in which they are expressed, and whose identity they determine (Figure 3). This gene order is conserved in the mammalian HOX gene complexes.
Segmentation genes control expression of the BX-C genes during early embryogenesis, and beginning in mid-embryogenesis, they are regulated positively by trithorax group (trxG) genes, and negatively by Polycomb group (PcG) genes. PcG proteins, and some trxG proteins, bind to Polycomb Response Elements (PRE), or “maintenance elements” located in multiple locations in the BX-C (Figure 3).
In addition to the enhancers and PREs, several insulators, or boundary elements, are required for proper regulation of the BX-C, and some are tightly linked to PREs (Figure 3). The insulators in the BX-C require different proteins, including GAGA factor encoded by the Trithorax-like (Trl) gene and CTCF. In general, these insulators prevent inappropriate activation of individual BX-C genes by enhancers for the neighboring gene. Some of the enhancers that activate Abd-B are separated from the Abd-B transcription start sites by insulators, but in these cases, the insulators attenuate activation, and there are Promoter Targeting Sequences (PTS) that enable the enhancers to bypass the insulators and target the promoters in an epigenetic manner (Figure 3, Zhou and Levine 1999, Lin et al. 2004, Chen et al. 2005, Lin et al. 2007).
Reduced Ubx expression in parasegment 5 causes thoracic segmental transformations. Ubx expression in this parasegment is regulated by the abx/bx enhancers located several kilobases downstream of the transcription start site (Simon et al. 1990, Qian et al. 1991), and the bx34e gypsy retrovirus insertion (Peifer and Bender 1986) reduces activation of Ubx by these enhancers (Figure 3). The segmental transformation caused by the bx34e gypsy insertion is reversed by reduced Su(Hw) insulator protein activity. Reducing Nipped-B dosage or activity, however, strongly magnifies the effect of the weakened insulator, and increases the segmental transformation (Figure 4, Rollins et al. 1999, Gause et al. 2008b). Thus Nipped-B also facilitates long-range gene activation in the BX-C. As described below, Nipped-B and cohesin bind to the Abd-B gene and its 3' regulatory domain in cultured cells in which Abd-B is highly transcribed, but are absent from Abd-B and other BX-C genes, including Ubx, when they are silenced by PcG proteins (Misulovin et al. 2008).
Mutations in the PcG genes that repress BX-C gene expression often cause dominant mild segmental transformations in adults. For example, Polycomb (Pc) loss-of-function alleles cause extra sets of sex comb bristles, which are normally specific to the anterior set of legs, to form on the middle and posterior legs in males. Loss-of-function mutations in the verthandi (vtd) gene were isolated in a screen for trxG mutations that dominantly suppress the dominant extra sex comb phenotypes of Pc mutations (Kennison and Tamkun 1988). Recently it was shown that the vtd trxG gene encodes the Rad21 subunit of the cohesin complex, and that Nipped-B mutations also dominantly suppress Pc dominant phenotypes (Hallson et al. 2008). These data indicate that Nipped-B and Rad21 both positively regulate BX-C expression in developing legs. This contrasts with cut, where Nipped-B and Rad21 have opposing effects, and suggests that the effect of cohesin on gene expression is context-dependent.
Rad21 (vtd) mutations even have opposite effects on the different phenotypes displayed by a single PcG gene mutation. The Enhancer of zeste [E(z)] PcG gene encodes the methyltransferase responsible for trimethylation of the lysine 27 residue of histone H3 (H3K27Me3) that coats PcG-silenced loci (Müller et al. 2002, Kahn et al. 2006, Schwartz et al. 2006). The E(z) trithorax mimic allele [E(z)Trm] is a gain-of-function missense mutation that causes E(z) to target BX-C genes when they should not be silenced (Bajusz et al. 2001). The phenotypes caused by E(z)Trm thus include thoracic and abdominal segmental transformations, recognized by bristle and pigmentation changes. The vtd2 loss-of-function allele of Rad21 dominantly increases the penetrance of the abdominal transformation, as do several other trxG mutations, but unlike most trxG mutations, reduces the frequency of the thoracic transformations (Bajusz et al. 2001). These results further indicate that cohesin regulates the BX-C and segmental identity. If, as the evidence suggests, cohesin directly regulates the BX-C, then its effects on expression are more context-dependent than those of most trxG proteins.
Rad21 also has context-dependent effects on gene expression in vertebrates. In early zebrafish embryogenesis, the runx1 gene is involved in haematopoietic and neural development. Homozygous rad21 mutations reduce runx1 expression in haematopoietic precursor cells, but not in most of the mechanosensory neurons (Horsfield et al. 2007). Expression of the runx3 gene, which is also expressed in the mechanosensory neurons, is lost in rad21 mutants, indicating that Rad21 still regulates other genes in these cells. Thus, runx1 expression is regulated by cohesin in some cells and not others. Expression of runx1 is also reduced in heterozygous rad21 mutants, indicating that expression of some genes is sensitive to cohesin dosage as in Drosophila.
The hedgehog protein (Hh) is a highly conserved signaling molecule that regulates several developmental processes. Hh is normally expressed in the posterior compartment of the developing wing, but the Moonrat gain-of-function hh mutation (hhMrt) causes expression in the anterior compartment (Felsenfeld and Kennison 1995). This results in variable overgrowth of the anterior compartment in heterozygous hhMrt mutants (Figure 4). Both Rad21 (vtd) and Nipped-B loss-of-function mutations dominantly suppress the hhMrt mutant wing phenotype (Figure 4, Schulze et al. 2001, Hallson et al. 2008), indicating that hh, or other genes in the hh signaling pathway, are regulated by cohesin and Nipped-B.
Nipped-B also affects Drosophila eye development, likely through regulation of the Enhancer of split gene complex [E(spl)-C]. The Notch receptor plays early and late roles in the development of photoreceptors in the eye (Baker et al. 1996, Baker and Yu 1997). First, Notch activates basic helix-loop-helix (bHLH) DNA-binding proneural genes, creating groups of neural-competent cells that have the potential to become photoreceptors. Later, similar to its role in “lateral inhibition” during bristle development, Notch directly activates repressor-encoding genes in the E(spl)-C (Nellesen et al. 1999), which repress proneural genes, such that only one cell in the cluster retains the neural fate. Many of the E(spl)-C genes encode bHLH proteins that can interact directly with proneural proteins.
The Nspl-1 mutation in the Notch receptor gene causes small rough adult eyes by reducing the number of photoreceptors (Nagel and Preiss 1999). E(spl)D is a dominant gain-of-function mutation in one of the E(spl)-C genes that strongly enhances the eye phenotype, making Nspl-1 dominant (Welshons 1956, Nagel and Preiss 1999). Reducing E(spl)-C dosage diminishes the effect of E(spl)D, and reducing proneural gene dosage increases its effect (Nagel and Preiss 1999). E(spl)D encodes a hyperactive protein that over-represses proneural activity and reduces photoreceptors, possibly through direct interactions with proneural proteins (Nagel et al. 1999, Nagel and Preiss 1999). Thus the leading ideas are that Nspl-1 reduces eye size through reduced early activation of proneural genes, and/or by hyperactivation of the E(spl)-C.
Nipped-B mutations dominantly suppress Nspl-1, increasing the size of the adult eye (Figure 4, Rollins et al. 1999). The simplest explanation is that Nipped-B mutations reduce expression of genes in the E(spl)-C, although it is also possible that they increase proneural gene expression. It is unlikely that Nipped-B mutations alter expression of Notch, because they do not affect other sensitive Notch mutant phenotypes (Rollins et al. 1999). Rad21 (vtd) mutations have the opposite effect to that of Nipped-B mutations, and reduce the size of Nspl-1 mutant eyes (C.A. Schaaf and D.D., unpublished), providing another case where Nipped-B and cohesin may have opposing effects on gene expression. As described below, Nipped-B and cohesin coat the entire E(spl)-C in cells derived from larval nervous system (Figure 5, Misulovin et al. 2008), supporting the idea that they directly regulate the E(spl)-C.
The Krüppel (Kr) zinc finger protein regulates development of many tissues, beginning with its role as a gap gene during early embryogenesis. KrIf-1 is a dominant gain-of-function allele that causes ectopic expression in the developing eye, reducing its size (Carrera et al. 1998, Abrell et al. 2000). Loss-of-function mutations in certain trxG genes, including Rad21 (vtd), cause ectopic outgrowths in KrIf-1 mutant eyes (Sollars et al. 2003). Of all the mutations tested, Rad21 (vtd3) has by far the strongest effect, causing eye outgrowths in over half the progeny. The most intriguing observation is that the vtd3 (Rad21) mutation has an epigenetic effect on KrIf-1 expression (Sollars et al. 2003). The eye outgrowths are seen in the next generation, even in many of the progeny that do not inherit the vtd3 mutation. By breeding progeny that have outgrowths but lack the vtd3 mutation, it was shown that this effect continues for at least five generations. Although these experiments need to be performed with mutations affecting other cohesin subunits, they raise the intriguing possibility that cohesin has epigenetic effects on gene expression and development.
The mushroom body is a structure in the Drosophila brain that mediates olfactory learning and memory. Like many other neurons, the γ neuron of the mushroom body extends excess axons, and then prunes them back. A genetic screen for mutations that block axon pruning in the γ neuron recovered loss-of-function mutations in the Smc1 and Stromalin (SA) cohesin subunit genes (Schuldiner et al. 2008). Neurons homozygous mutant for the cohesin subunit genes generated by mitotic recombination fail to prune their axons. Axon pruning was rescued by expressing Smc1 specifically in postmitotic neurons, demonstrating that the loss of pruning does not result from reduced cell proliferation (Schuldiner et al. 2008).
The effects of cohesin on axon pruning in the mushroom body γ neuron were confirmed using an engineered Tobacco Etch Virus (TEV) protease-sensitive Rad21 cohesin subunit (Rad21TEV, Pauli et al. 2008). A Rad21 (vtd) mutation was fully rescued by a transgene encoding Rad21TEV. Ubiquitous expression of TEV protease, which has no obvious effect on wild-type flies, is lethal and causes release of cohesin from chromosomes in Rad21TEV-rescued flies. Expression of TEV protease specifically in post-mitotic neurons is also lethal in Rad21TEV flies, and blocks pruning of the γ neuron axons (Pauli et al. 2008). Combined, the studies on the mushroom body γ neuron show that cohesin plays an important role in the morphogenesis of a non-dividing cell, and that this role is therefore unlikely to involve its function in chromosome segregation.
The role of cohesin in axon pruning in the mushroom body γ neuron likely involves regulation of the EcR (ecdysone receptor) gene, which encodes a homolog of mammalian nuclear steroid hormone and retinoic acid receptors (Schuldiner et al. 2008). EcR is expressed in the mushroom body γ neuron, and is required for axon pruning (Lee et al. 2000). Strikingly, EcR levels are reduced in Smc1 mutant γ neurons, and targeted expression of EcR in post-mitotic neurons rescues the pruning defect (Schuldiner et al. 2008). As described below, cohesin and Nipped-B bind to the active EcR gene in cultured cells, and thus it is predicted that the effect of Smc1 mutations on EcR-B1 protein levels in the γ neurons results from reduced transcription.
Developmental effects of cohesin are also seen in other Drosophila neurons, in addition to the mushroom body γ neurons and eye photoreceptors, although candidate target genes responsible for these effects have yet to be identified. Smc1 and SA mutations were also isolated in screens for mutations that cause defects in dendrite targeting by olfactory projection neurons (Schuldiner et al. 2008), and targeted expression of TEV protease in cholinergic neurons of Rad21TEV-rescued flies causes defects in larval locomotion (Pauli et al. 2008).
Several lines of evidence argue that the in vivo effects of cohesin, Nipped-B, and Pds5 on Drosophila development detailed above result from effects on gene expression, and that the effects on gene expression are unlikely to result from defects in sister cohesion or cell proliferation. In the cases of EcR expression, axon pruning, and dendrite formation in the mushroom body, although the cohesin subunit mutations are homozygous, the cells are non-dividing, and thus the chromosome segregation function is not involved. Most convincingly, expression of the missing cohesin subunit specifically in postmitotic neurons reverses the pruning defect (Pauli et al. 2008, Schuldiner et al. 2008). It is unknown, however, if the postmitotic neurons go through S phase before they differentiate (Liqun Luo, personal communication), and thus it is possible that sister chromatid interactions could be important for gene expression.
In the cases of cut, Ubx, Pc, E(z)Trm, hhMrt, KrIf-1 and Nspl-1, changes in Nipped-B and/or cohesin dosage that are too small to cause overt defects in sister chromatid cohesion (Rollins et al. 2004, Hallson et al. 2008) alter expression. Many of these changes in dosage are even smaller than might be expected: heterozygous Nipped-B null mutations cause only a 25 to 30% reduction in Nipped-B mRNA (Rollins et al. 2004). Reduction of Drosophila Nipped-B mRNA by 50% using in vivo RNAi is lethal, also without causing visible cohesion defects (Rollins et al. 2004).
Intriguingly, the partial dosage compensation of Nipped-B seen in Drosophila is also seen in tissues from heterozygous NIPBL mutant mice (Arthur Lander and Anne Calof, personal communication) and cell lines derived from CdLS patients with NIPBL mutations (Jinglan Liu and Ian Krantz, personal communication). All cases show a maximum reduction of 30% in NIPBL mRNA. This suggests that there is a conserved compensation mechanism to maintain Nipped-B/NIPBL levels.
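The arithmetic behind this partial compensation can be made explicit with a minimal sketch. It assumes that both alleles contribute equally in wild type and that a null allele contributes no mRNA; the function name `per_allele_output` is hypothetical and the calculation is illustrative, not a method taken from the cited studies.

```python
def per_allele_output(mrna_reduction, functional_alleles=1, wild_type_alleles=2):
    """Relative mRNA output of each remaining functional allele.

    Wild-type per-allele output is defined as 1.0. Assumes (illustratively)
    that alleles contribute equally and that null alleles contribute nothing.
    """
    if not 0.0 <= mrna_reduction < 1.0:
        raise ValueError("mrna_reduction must be in [0, 1)")
    remaining_total = 1.0 - mrna_reduction            # fraction of wild-type mRNA left
    uncompensated = functional_alleles / wild_type_alleles  # 0.5 for a heterozygous null
    return remaining_total / uncompensated


# 50% is the reduction expected without compensation; 25 to 30% is the
# range reported in the text for heterozygous Nipped-B null mutants.
for reduction in (0.50, 0.30, 0.25):
    print(f"{reduction:.0%} reduction: {per_allele_output(reduction):.2f}x per allele")
```

Under these assumptions, a 25 to 30% reduction in total mRNA corresponds to the single functional allele producing roughly 1.4 to 1.5 times its normal output.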
Although it is possible that there are subtle effects on cohesion that might alter gene expression and development with reduced Nipped-B and cohesin subunit dosage, certain observations argue against the notion that minor changes in sister cohesion underlie the effects on gene expression. If they have a subtle effect, reducing Nipped-B or cohesin dosage would both be expected to slightly reduce sister cohesion, but as described above, they have opposing effects on development in some cases. Also, two pds5 gene mutations that cause equivalent recessive defects in sister chromatid cohesion have opposite dominant effects on cut expression (Dorsett et al. 2005). Thus it seems likely that the effects of cohesion factors on gene expression also occur in G1, and not just in the S or G2 phases of the cell cycle, when cohesion occurs.
Although the effects of Nipped-B and cohesin dosage on gene expression are unlikely to involve changes in sister chromatid cohesion, the role of Nipped-B in gene expression appears to involve the same molecular activity that is required for its role in cohesion. This idea is based upon the finding that missense Nipped-B mutations show both milder dominant effects on gene expression and weaker homozygous effects on cohesion compared to null or truncation alleles (Gause et al. 2008b).
The in vivo data described above indicate that Nipped-B and cohesin regulate diverse and conserved developmental pathways in Drosophila, and that most of these developmental roles likely involve regulation of gene expression. The finding that small reductions in Nipped-B or cohesin dosage have particularly strong effects on specific mutant alleles of proposed target genes, such as those documented with various cut alleles, argues strongly that Nipped-B and cohesin directly regulate some of the proposed gene targets. It is essential, however, to obtain molecular data to determine if the regulation is direct, and also to gain insights into the mechanisms by which cohesion factors regulate genes.
A first step towards this goal was the genome-wide mapping of Nipped-B and cohesin in Drosophila cell lines by chromatin immunoprecipitation using tiled microarrays (ChIP-chip), and comparison of their binding patterns to those of RNA polymerase II (PolII) and PcG silencing (Misulovin et al. 2008). One of the cell lines, Sg4, is embryonic in origin, and another, ML-DmBG3 (BG3), is derived from larval central nervous system. There are differences in gene expression between these lines, which, as described below, provide key insights.
Contrary to what is seen in yeast, where cohesin loads at Scc2 (Nipped-B) binding sites and translocates away (Lengronne et al. 2004), Nipped-B and cohesin co-localize virtually completely throughout the non-repetitive genome (Misulovin et al. 2008), consistent with the notion that Nipped-B could dynamically regulate cohesin chromosome-binding. Also contrary to what occurs in yeast, where cohesin binds mostly between convergently transcribed genes (Lengronne et al. 2004, Glynn et al. 2004), Nipped-B and cohesin bind to transcribed regions (Misulovin et al. 2008). This is similar, however, to what is seen in mouse and human cells, where cohesin also binds transcribed regions (Parelho et al. 2008, Wendt et al. 2008).
The reasons for the differences between yeast and higher eukaryotes, in the co-localization of the Nipped-B/Scc2 cohesin loader, and the binding of cohesin to transcribed genes, are unknown. The answers may lie in differences in chromatin structure, or genome organization. Transcription might facilitate cohesin binding by unwinding higher order chromatin structures in higher organisms, while this is probably unnecessary in yeast, where chromosomes likely have a more open chromatin configuration. Yeast also have little repetitive DNA or intergenic sequence, and few long-range enhancers or introns. Thus the positioning of cohesin binding regions between genes in yeast might have arisen by selection to reduce transcriptional interference.
Drosophila Nipped-B and cohesin prefer transcription units, binding to intron sequences about 30 to 45% more than expected if binding were random, and to 5' UTRs some 6 to 8-fold more than expected (Misulovin et al. 2008). Although Nipped-B and cohesin bind preferentially to transcription units, they also bind intergenic regions, but about 20 to 30% less than expected if binding were random. This is similar to human cells, where cohesin binds some 14% less to intergenic regions than expected on a random basis (Wendt et al. 2008). Although ChIP-chip studies in mouse suggest that binding might be equally distributed between genes and intergenic regions, these studies covered only 3% of the genome, and many intergenic cohesin binding regions are close to genes (Parelho et al. 2008).
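One common way to express "more (or less) than expected if binding were random" is an observed-to-expected ratio: the share of bound sequence that falls in a feature class, divided by the share of the genome that the class occupies. The sketch below illustrates that idea only; the function name and every number in it are hypothetical, chosen to fall within the ranges quoted above, and it is not presented as the procedure used in the cited studies.

```python
def observed_over_expected(bound_in_class_bp, bound_total_bp, class_bp, genome_bp):
    """Ratio of observed to randomly expected binding for one feature class.

    observed = fraction of all bound sequence lying in the class
    expected = fraction of the genome occupied by the class
    A ratio of 1.35 means 35% more binding than expected at random.
    """
    if min(bound_total_bp, class_bp, genome_bp) <= 0:
        raise ValueError("totals must be positive")
    observed = bound_in_class_bp / bound_total_bp
    expected = class_bp / genome_bp
    return observed / expected


# Hypothetical genome and binding totals, for illustration only.
GENOME_BP = 100_000_000
BOUND_BP = 10_000_000

intron_ratio = observed_over_expected(4_050_000, BOUND_BP, 30_000_000, GENOME_BP)
intergenic_ratio = observed_over_expected(3_000_000, BOUND_BP, 40_000_000, GENOME_BP)

print(f"introns: {intron_ratio:.2f} (about {intron_ratio - 1:+.0%} vs. random)")
print(f"intergenic: {intergenic_ratio:.2f} (about {intergenic_ratio - 1:+.0%} vs. random)")
```

With these made-up inputs the intron ratio is 1.35 and the intergenic ratio is 0.75, matching the sense of "30 to 45% more" and "20 to 30% less" than random.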
Drosophila Nipped-B and cohesin bind preferentially to actively-transcribed genes. There is a strong genome-wide correlation and overlap between Nipped-B/cohesin and RNA polymerase II (PolII) binding, although most of the actual peaks of binding for PolII and Nipped-B/cohesin differ (Misulovin et al. 2008). When Nipped-B/cohesin binding in the region spanning from 10 kbp upstream to 10 kbp downstream of genes is averaged genome-wide, the peak is at the transcription start site (Misulovin et al. 2008), similar to what is seen with cohesin at start sites in mouse cells (Parelho et al. 2008).
Importantly, differences in gene expression between Drosophila cell lines also correlate with differences in cohesin binding. In the vast majority of the hundred or so genes that were found to bind PolII in one cell type and not another, cohesin binds to the gene in those cells in which it also binds PolII (Misulovin et al. 2008).
Although Nipped-B and cohesin associate preferentially with active genes, they do not bind all active genes. Only some 20% to 35% of transcription units that bind PolII in BG3 and Sg4 cells also bind cohesin (Gurmukh Sahota, Gary Stormo, DD, unpublished). The factors that determine which active genes bind Nipped-B and cohesin are currently unknown.
In addition to associating preferentially with a subset of active genes, cohesin and Nipped-B are virtually absent from genes silenced by PcG proteins. PcG-silenced genes were identified in Sg4 cells by ChIP-chip for PcG proteins, and for the histone H3 lysine 27 trimethylation (H3K27Me3) associated with PcG silencing (Kahn et al. 2006, Schwartz et al. 2006). Genome-wide there is a negative correlation between H3K27Me3 and Nipped-B/cohesin binding, indicating that PcG silencing is generally incompatible with cohesin binding (Misulovin et al. 2008).
Some of the key examples of cohesin binding to active genes, and absence from PcG-silenced genes, involve genes for which there is in vivo evidence for regulation by cohesin. For example, Nipped-B and cohesin bind to the upstream and transcribed regions of the cut locus in Sg4 and BG3 cells (Figure 2, Misulovin et al. 2008). Importantly, there is substantially more binding in BG3 cells. In Sg4 cells, cut is a PcG target, resulting in the H3K27Me3 modification over most of the regulatory region and transcription unit (Figure 2, Schwartz et al. 2006). In these cells, Nipped-B and cohesin are largely restricted to regions with reduced histone methylation (Figure 2, Misulovin et al. 2008). In contrast, cut is not a PcG target in BG3 cells (Y. Schwartz and V. Pirrotta, personal communication), and it is transcribed at least 200-fold more than in Sg4 cells (Z. Misulovin, C.A. Schaaf, D.D., unpublished). The lack of PcG repression and the increased transcription correlate with an expansion of Nipped-B and cohesin binding over a 150 kbp region starting upstream of the wing margin enhancer, and extending past the 3' end of the cut gene (Figure 2, Misulovin et al. 2008).
As described above, the effects of Nipped-B and Rad21 (vtd) mutations on Nspl-1 mutant phenotypes suggest that they likely regulate the E(spl)-C, and the effects of cohesin mutations on the levels of ecdysone receptor protein and axon pruning in the mushroom body γ neuron suggest that cohesin regulates the EcR gene. In BG3 cells, the entire 50 kbp E(spl)-C is bound by Nipped-B and cohesin (Figure 4, Misulovin et al. 2008), and most of the genes in the complex are transcribed (Z. Misulovin, C.A. Schaaf, D.D., unpublished). The EcR gene is transcribed in both Sg4 and BG3 cell lines (Z. Misulovin, C.A. Schaaf, D.D., unpublished) and the entire transcribed region shows significant Nipped-B and cohesin binding in both cell lines (Misulovin et al. 2008, Dorsett 2008). Thus it is probable that Nipped-B and cohesin directly regulate the E(spl)-C and EcR in vivo.
The BX-C provides a particularly instructive example. The Ubx gene, which genetic evidence indicates is regulated by Nipped-B in vivo (Figure 4, Rollins et al. 1999), does not bind Nipped-B or cohesin in either Sg4 or BG3 cells, but is PcG-silenced in both lines (Figure 3, Misulovin et al. 2008). The entire BX-C is silenced in BG3 cells, and there is no detectable Nipped-B or cohesin binding to any of the three genes (Figure 3). Abd-B, however, is highly expressed in Sg4 cells, and Nipped-B and cohesin cover a 75 kbp region extending from the distal promoter region through much of the downstream 3' regulatory region (Figure 3). This region corresponds precisely with a region that has high PolII binding and low H3K27Me3 (Schwartz et al. 2006, Misulovin et al. 2008).
The borders of the Abd-B Nipped-B/cohesin binding region are also intriguing. The upstream border near the distal transcription start site corresponds to a CTCF binding site (Figure 3, Holohan et al. 2007) and the downstream border coincides with a well-characterized insulator (Fab-7), which requires the GAGA (Trithorax-like) protein for activity (Schweinsberg et al. 2004). Combined with the finding that CTCF positions cohesin at CTCF-dependent insulators in mammalian cells (Parelho et al. 2008, Rubio et al. 2008, Stedman et al. 2008, Wendt et al. 2008), this raises the possibilities that diverse types of insulators can help determine cohesin-binding domains, and that cohesin might contribute to the activity of different chromatin boundaries. In particular, the cohesin binding pattern in Abd-B combined with the genetic interactions between the cohesion factors and Ubx, Pc and E(z) methyltransferase mutations described above, suggests that cohesin might help define active chromatin domains and determine the boundaries between PcG-silenced and active chromatin. This is related to a function proposed for cohesin at the boundaries of the HMR silent mating type locus in yeast, where it helps prevent spread of the SIR silencing protein complexes (Donze et al. 1999).
One idea that arises from the preferential association of cohesin with transcribed genes, and absence from PcG-silenced genes, is that unwinding of higher order chromatin structures by transcription might facilitate the binding of cohesin. The internal opening of cohesin, which is some 35 by 50 nm, may be too small to encircle higher order chromatin structures of 30 nm fiber and above, but could easily encircle a 10 nm nucleosomal fiber.
Also favoring the idea that unwound chromatin facilitates cohesin binding is the finding that many cohesin binding domains correlate in position with early replication origins, which are also likely to have an unwound structure (MacAlpine et al. 2004, D. MacAlpine, D.D., unpublished). Indeed, many replication origins also correlate with transcribed regions (MacAlpine et al. 2004), but the correlation between cohesin binding and replication origins extends to large intergenic regions with little or no transcription (D. MacAlpine, Z. Misulovin, D.D., unpublished). These findings fit nicely with the requirement for replication origin licensing for Nipped-B and cohesin binding in Xenopus egg extracts (Gillespie and Hirano 2004, Takahashi et al. 2004), and interactions of the Nipped-B/Mau-2 complex with origin licensing factors (Takahashi et al. 2008). However, the findings that many transcribed genes do not bind Nipped-B and cohesin, and that many genes that bind cohesin do not overlap replication origins, indicate that there are factors besides open chromatin structure that determine cohesin binding.
The genetic evidence that Nipped-B and cohesin regulate enhancer-promoter interactions in the cut and Ubx genes, and the ChIP-chip data showing that they bind to the large regulatory domains of the cut and Abd-B genes when they are active, argue that they play a role in controlling gene activation through effects on enhancer-promoter communication and chromatin domain definition. This is consistent with the finding that cohesin contributes to insulation by the CTCF protein in mammalian cells (Parelho et al. 2008, Wendt et al. 2008, Wendt and Peters, this issue). It is tempting to propose that cohesin regulates enhancer-promoter interactions and insulator activity by mediating intra-chromosomal cohesion, similar to the way it mediates sister chromatid cohesion (Gause et al. 2008a). One of the current leading ideas, supported by multiple lines of evidence, is that insulators pair with other insulators through protein-mediated interactions to form both intra- and interchromosomal loops (Dorman et al. 2007, Maeda and Karch 2007, Wallace and Felsenfeld 2007). Other long-range looping interactions mediated by cohesin might stabilize long-range enhancer-promoter interactions, and could, for example, contribute to the function of promoter-targeting sequences (PTS) that help enhancers bypass insulators in the BX-C (Figure 3).
Data from Drosophila and yeast also raise the possibility that cohesin could influence transcriptional elongation. Nipped-B and cohesin bind to the long transcribed regions of the cut, Abd-B, and EcR genes, among others. Cohesin binding to transcribed regions could potentially influence transcriptional elongation by acting as a roadblock for PolII. It seems likely that cohesin rings have to be moved to permit passage of RNA polymerase, similar to what occurs in the rare cases where cohesin binds a transcription unit in S. cerevisiae, and the gene is activated (Bausch et al. 2007). This idea is also consistent with the finding that cohesin contributes to transcriptional termination between convergent genes in S. pombe during G2 (Gullerova and Proudfoot 2008). In addition to these negative effects, however, it can also be imagined that cohesin could have positive effects by holding the gene in an unwound state, maintaining an open chromatin conformation more conducive to both transcriptional initiation and elongation. The direction and extent of the effect of reducing Nipped-B or cohesin dosage on gene expression would therefore depend on multiple factors, including the dependence on long-range activation, the length of the transcription unit, and the extent to which cohesin might counteract negative effects of chromatin folding.
One question to be answered is why Nipped-B and cohesin have opposite effects on gene expression in some cases, such as with cut. This puzzle is compounded by the fact that heterozygous Nipped-B null mutations only reduce Nipped-B mRNA levels by 25 to 30%. Thus, gene expression is exquisitely sensitive to Nipped-B dosage. This is also seen for NIPBL in cell lines from CdLS patients, and some CdLS individuals have only a 15% reduction in NIPBL mRNA (Borck et al. 2006). The actual mechanisms by which Nipped-B loads cohesin onto chromosomes and/or stabilizes its binding are unknown, but likely involve direct Nipped-B-cohesin interactions (Arumugam et al. 2003, Gause et al. 2008b, Takahashi et al. 2008). It is possible that Nipped-B not only loads cohesin onto chromosomes, but also determines the binding stability and might, in collaboration with other factors such as Pds5 or Wapl (Gandhi et al. 2006, Kueng et al. 2006), dynamically unload and reload cohesin. Whether or not binding dynamics proves to be the explanation, knowing why small changes in Nipped-B and cohesin dosage have significant effects will be important for understanding the molecular mechanisms by which they regulate gene expression.
Genetic and molecular studies in Drosophila have uncovered diverse roles for cohesin and Nipped-B in gene expression and development, and have framed key questions regarding the potential mechanisms by which the sister chromatid cohesion apparatus regulates gene transcription. As described in other articles in this issue, some of these proposed mechanisms, such as chromatin domain boundary and insulator functions, also likely occur in yeast and mammalian cells. The findings in Drosophila have already provided useful insights into the molecular etiology of Cornelia de Lange syndrome, which is caused by reduced NIPBL (Nipped-B) activity, or amino acid changes in cohesin subunits. In particular, the discoveries showing that cohesin and Nipped-B regulate multiple conserved developmental pathways, and that cohesin and Nipped-B bind to many of the likely target genes, already help explain why CdLS individuals display diverse developmental deficits, and have important implications for potential therapeutic strategies. Future work in multiple experimental systems is certain to focus on determining the molecular mechanisms by which cohesin and associated factors regulate gene expression, and the genetic and molecular tools available in Drosophila will continue to make important contributions to these efforts.
The author thanks Kirsten Wendt, Katsuhiko Shirahige, Jan-Michael Peters, Matthias Merkenschlager, Brian Calvi, David MacAlpine, Yuri Schwartz, Vince Pirrotta, Arthur Lander, Anne Calof, Ian Krantz, Jinglan Liu, Jim Kennison, Barry Honda, Oren Schuldiner, Liqun Luo, Gurmukh Sahota, Gary Stormo, Cheri Schaaf, Maria Gause, and Ziva Misulovin for sharing data and/or helpful discussions. The author also thanks Christian Haering and two anonymous reviewers for helpful comments on the manuscript. Research in the author's laboratory is supported by grants from the NIH (R01 GM055683, P01 HD052860) and the March of Dimes (#1FY05-103).