Scaffold/matrix attachment regions (S/MARs) are essential for structural organization of the chromatin within the nucleus and serve as anchors of chromatin loop domains. A significant fraction of genes in Arabidopsis thaliana contains intragenic S/MAR elements and a significant correlation of S/MAR presence and overall expression strength has been demonstrated. In this study, we undertook a genome scale analysis of expression level and spatiotemporal expression differences in correlation with the presence or absence of genic S/MAR elements. We demonstrate that genes containing intragenic S/MARs are prone to pronounced spatiotemporal expression regulation. This characteristic is found to be even more pronounced for transcription factor genes. Our observations illustrate the importance of S/MARs in transcriptional regulation and the role of chromatin structural characteristics for gene regulation. Our findings open new perspectives for the understanding of tissue- and organ-specific regulation of gene expression.
Scaffold/matrix attachment regions (S/MARs) are AT-rich DNA sequences that mediate structural organization of the chromatin within the nucleus. These elements constitute anchor points of the DNA for the chromatin scaffold and serve to organize the chromatin into structural domains. Studies on individual genes led to the conclusion that the dynamic and complex organization of the chromatin mediated by S/MAR elements plays an important role in the regulation of gene expression. In addition to intergenic S/MARs, which likely exert important insulator effects, more than 2,000 intragenic S/MARs have been shown to be present within the Arabidopsis genome. In this study, the authors set out to analyze the effects of these intragenic S/MAR elements on the regulation of the genes affected. Making use of exhaustive and multidimensional expression datasets available for Arabidopsis, the authors analyzed overall expression differences and correlation of intragenic S/MARs with spatiotemporal expression of genes. On a genome scale, pronounced tissue- and organ-specific and developmental expression patterns of S/MAR-containing genes have been detected. Notably, transcription factor genes contain a significantly higher proportion of S/MARs. The pronounced difference in expression characteristics of S/MAR-containing genes emphasizes their functional importance and the importance of structural chromosomal characteristics for gene regulation in plants as well as within other eukaryotes.
Genomic DNA in higher eucaryotic cells is organized into a series of loops, each of which may be affixed at its base to the nuclear matrix via a specific matrix attachment region (MAR). In this report, we describe the distribution of MARs within the amplified dihydrofolate reductase (DHFR) domain (amplicon) in the methotrexate-resistant CHO cell line CHOC 400. In one experimental protocol, matrix-attached and loop DNA fractions were prepared from matrix-halo structures by restriction digestion and were analyzed for the distribution of amplicon sequences between the two fractions. A second, in vitro method involved the specific binding to the matrix of cloned DNA fragments from the amplicon. Both methods of analysis detected a MAR in the replication initiation locus that we have previously defined in the DHFR amplicon, as well as in the 5'-flanking region of the DHFR gene. The first of these methods also suggests the presence of a MAR in a region mapping approximately 120 kilobases upstream from the DHFR gene. Each of these MARs was detected regardless of whether the matrix-halo structures were prepared by the high-salt or the lithium 3,5-diiodosalicylate extraction protocols, arguing against their artifactual association with the proteinaceous scaffolding of the nucleus during isolation procedures. However, the in vitro binding assay did not detect the MAR located 120 kilobases upstream from the DHFR gene but did detect specific matrix attachment of a sequence near the junction between amplicons. The results of these experiments suggest that (i) MARs can occur next to different functional elements in the genome, with the result that a DNA loop formed between two MARs can be smaller than a replicon; and (ii) different methods of analysis detect a somewhat different spectrum of matrix-attached DNA fragments.
The gene functions, transcriptional regulation, and genome replication of human papillomaviruses (HPVs) have been extensively studied. Thus far, however, there has been little research on the organization of HPV genomes in the nuclei of infected cells. As a first step to understand how chromatin and suprachromatin structures may modulate the life cycles of these viruses, we have identified and mapped interactions of HPV DNAs with the nuclear matrix. The endogenous genomes of HPV type 16 (HPV-16) which are present in SiHa, HPKI, and HPKII cells, adhere in vivo to the nuclear matrixes of these cell lines. A tight association with the nuclear matrix in vivo may be common to all genital HPV types, as the genomes of HPV-11, HPV-16, HPV-18, and HPV-33 showed high affinity in vitro to preparations of the nuclear matrix of C33A cells, as did the well-known nuclear matrix attachment region (MAR) of the cellular beta interferon gene. Affinity to the nuclear matrix is not evenly spread over the HPV-16 genome. Five genomic segments have strong MAR properties, while the other parts of the genome have low or no affinity. Some of the five MARs correlate with known cis-responsive elements: a strong MAR lies in the 5′ segment of the long control region (LCR), and another one lies in the E6 gene, flanking the HPV enhancer, the replication origin, and the E6 promoter. The strongest MAR coincides with the E5 gene and the early-late intergenic region. Weak MAR activity is present in the E1 and E2 genes and in the 3′ part of L2. The in vitro map of MAR activity appears to reflect MAR properties in vivo, as we found for two selected fragments with and without MAR activity. As is typical for many MARs, the two segments with highest affinity, namely, the 5′ LCR and the early-late intergenic region, have an extraordinarily high A-T content (up to 85%). 
It is likely that these MARs have specific functions in the viral life cycle, as MARs predicted by nucleotide sequence analysis, patterns of A-T content, transcription factor YY1 binding sites, and likely topoisomerase II cleavage sites are conserved in similar positions throughout all genital HPVs.
The genome of eukaryotes is organized into structural units of chromatin loops. This higher order organization is supported by a nuclear skeleton called the nuclear matrix. The genomic DNA associated with the nuclear matrix is called the matrix associated region (MAR). Only a few genome-wide screens have been attempted, although many studies have characterized locus-specific MAR DNA sequences. In this study, a MAR DNA library was prepared from the Drosophila melanogaster Meigen (Diptera: Drosophilidae) genome. One of the sequences identified as a MAR was from a long terminal repeat region of the ‘roo’ retrotransposon (roo MAR). Sequence analysis of roo MAR showed its distribution across the D. melanogaster genome. roo MAR also showed high sequence similarity with a previously identified MAR in Drosophila, namely the ‘gypsy’ retrotransposon. Analysis of the genes flanking roo MAR insertions in the Drosophila genome showed that these genes were co-ordinately expressed. The results from the present study in D. melanogaster suggest this sequence plays an important role in genome organization and function. The findings point to an evolutionary role of retrotransposons in shaping the genomic architecture of eukaryotes.
genome organization; MAR DNA; retrotransposon
Matrix attachment regions (MAR) are the sites on genomic DNA that interact with the nuclear matrix. There is increasing evidence for the involvement of MAR in regulation of gene expression. The unsuitability of experimental detection of MAR for genome-wide analyses has led to the development of computational methods of detecting MAR. The MAR recognition signature (MRS) has been reported to be associated with a significant fraction of MAR in C. elegans and has also been found in MAR from a wide range of other eukaryotes. However, the effectiveness of the MRS in specifically and sensitively identifying MAR remains unresolved.
Using custom software, we have mapped the occurrence of MRS across the entire C. elegans genome. We find that MRS have a distinctive chromosomal distribution, in which they appear more frequently in the gene-rich chromosome centres than in arms. Comparison to distributions of MRS estimated from chromosomal sequences randomised using mono-, di-, tri- and tetra-nucleotide frequency patterns showed that, while MRS are less common in real sequence than would be expected from nucleotide content alone, they are more frequent than would be predicted from short-range nucleotide structure. In comparison to the rest of the genome, MRS frequency was elevated in 5' and 3' UTRs, and striking peaks of average MRS frequency flanked C. elegans coding sequence (CDS). Genes associated with MRS were significantly enriched for receptor activity annotations, but not for expression level or other features.
Through a genome-wide analysis of the distribution of MRS in C. elegans we have shown that they have a distinctive distribution, particularly in relation to genes. Due to their association with untranslated regions, it is possible that MRS could have a post-transcriptional role in the control of gene expression. A role for MRS in nuclear scaffold attachment is not supported by these analyses.
Two core microRNA (miRNA) pathway proteins, Dicer and Argonaute, are found in Giardia lamblia, a deeply branching parasitic protozoan. There are, however, no apparent homologues of Drosha or Exportin5 in the genome. Here, we report a 26 nucleotide (nt) RNA derived from a 106 nt Box C/D snoRNA, GlsR2. This small RNA, designated miR5, localizes to the 3′ end of GlsR2 and has a 75 nt hairpin precursor. GlsR2 is processed by the Dicer from Giardia (GlDcr) to generate miR5. Immunoprecipitation of the Argonaute from Giardia (GlAgo) brought down miR5. When a Renilla Luciferase transcript with a 26 nt miR5 antisense sequence at the 3′-untranslated region (3′ UTR) was introduced into Giardia trophozoites, Luciferase expression was reduced ∼25% when synthetic miR5 was also introduced. The Luciferase mRNA level remained, however, unchanged, suggesting translation repression by miR5. This inhibition was fully reversed by also introducing a 2′-O-methylated antisense inhibitor of miR5, suggesting that miR5 acts by interacting specifically with the antisense sequence in the mRNA. A partial antisense knock down of GlDcr or GlAgo in Giardia indicated that the former is needed for miR5 biogenesis whereas the latter is required for miR5-mediated translational repression. Potential targets for miR5 with canonical seed sequences were predicted bioinformatically near the stop codon of Giardia mRNAs. Four out of the 21 most likely targets were tested in the Luciferase reporter assay. miR5 was found to inhibit Luciferase expression (∼20%) of transcripts carrying these potential target sites, indicating that snoRNA-derived miRNA can regulate the expression of multiple genes in Giardia.
Giardia lamblia is a deeply branched parasitic protozoan and the pathogen causing the diarrheal disorder giardiasis. The mechanism of gene regulation in this organism is largely unknown. Here, we identified a 26 nucleotide (nt) small RNA from the 3′-end of a 106 nt small nucleolar RNA (GlsR2) in Giardia. GlsR2 is processed through the action of a Dicer protein in Giardia to generate the 26 nt RNA. The latter becomes associated with the Argonaute protein. The protein-RNA complex can repress the translation of messenger RNAs carrying the antisense sequence of the 26 nt RNA at the 3′-untranslated region. This small RNA, designated microRNA5 (miR5), has several potential targets identified in Giardia, among which four were further tested in Giardia and found to have their translation repressed by miR5. This is the second functioning microRNA we have identified in Giardia. The microRNAs could thus be important regulators of gene expression in this ancient single-celled organism.
Matrix-attachment regions (MARs) are DNA elements that are defined by their abilities to bind to isolated nuclear matrices in vitro. The DNA sequences of different matrix-binding elements vary widely. The locations of some MARs at the ends of chromatin loops suggest that they may represent boundaries of individual chromatin domains. As such, MARs may play important roles in regulating transcription and chromatin structure. As a first step towards assessing the roles of MARs in these processes, we assayed DNA sequences from the human serine protease inhibitor (serpin) gene cluster at 14q32.1 for matrix-binding activity in vitro. This approximately 150 kb region contains the cell-specific genes encoding alpha1-anti-trypsin (alpha1AT) and corticosteroid-binding globulin (CBG), as well as an antitrypsin-related sequence termed ATR. A DNase I-hypersensitive site (DHS) map of the locus has recently been described. We report here that the alpha1AT-ATR-CBG region contains five distinct MARs. There is a strong matrix-binding element approximately 16 kb upstream of alpha1AT; three MARs are between ATR and CBG and one MAR is within the CBG gene itself. These MARs were matrix-associated in all cell types examined. DNA sequencing indicated that the serpin MARs contained predominantly repetitive DNA, although the types of DNA repeats differed among the MARs.
We have identified a MAR/SAR recognition signature (MRS) which is common to a large group of matrix and scaffold attachment regions. The MRS is composed of two degenerate sequences (AATAAYAA and AWWRTAANNWWGNNNC) within close proximity. Analysis of >300 kb of genomic sequence from a variety of eukaryotic organisms shows that the MRS faithfully predicts 80% of MARs and SARs. In each case where we find a MRS, the corresponding DNA region binds specifically to the nuclear scaffold. Although all MRSs are associated with a SAR, not all known SARs and MARs contain a MRS, suggesting that at least two classes exist, one containing a MRS, the other not. Evidence is presented that the two sequence elements of the bipartite MRS occupy a position on the nucleosome near the dyad axis, together creating a putative protein binding site. The identification of a MAR- and SAR-associated DNA element is an important step forward towards understanding the molecular mechanisms of these elements. It will allow: (i) analysis of the genomic location of SARs, e.g. in relationship to genes, based on sequence information alone, rather than on the basis of an elaborate biochemical assay; (ii) identification and analysis of proteins that specifically bind to the MRS.
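The bipartite signature described above lends itself to a sequence-only scan. The following is a minimal sketch, not the authors' software: it expands the two degenerate motifs from IUPAC codes into regular expressions and reports pairs of sites lying close together. The function names, the `max_gap` parameter and its default of 200 bp, the use of start-to-start distance, and the restriction to the forward strand are all assumptions made for illustration, since the abstract says only "within close proximity".

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# The two degenerate motifs quoted in the abstract.
SHORT_MOTIF = "AATAAYAA"
LONG_MOTIF = "AWWRTAANNWWGNNNC"


def iupac_to_regex(motif):
    """Translate a degenerate IUPAC motif into a regex string."""
    return "".join(IUPAC[base] for base in motif)


def find_sites(motif, seq):
    """Return start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    lookahead = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in lookahead.finditer(seq)]


def find_mrs(seq, max_gap=200):
    """Return (short_start, long_start) pairs whose starts lie within max_gap bp.

    max_gap is a hypothetical proximity cutoff; only the forward strand is scanned.
    """
    seq = seq.upper()
    short_sites = find_sites(SHORT_MOTIF, seq)
    long_sites = find_sites(LONG_MOTIF, seq)
    return [(s, l) for s in short_sites for l in long_sites if abs(s - l) <= max_gap]


if __name__ == "__main__":
    demo = "GGGG" + "AATAACAA" + "C" * 10 + "AATGTAACCATGAAAC" + "GGGG"
    print(find_mrs(demo))
```

A fuller scan would also search the reverse complement, because either motif may lie on the opposite strand.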
An Argonaute homolog and a functional Dicer have been identified in the ancient eukaryote Giardia lamblia, which apparently lacks the ability to perform RNA interference (RNAi). The Giardia Argonaute plays an essential role in growth and is capable of binding specifically to the m7G-cap, suggesting a potential involvement in microRNA (miRNA)-mediated translational repression. To test such a possibility, small RNAs were isolated from Giardia trophozoites, cloned, and sequenced. A 26-nucleotide (nt) small RNA (miR2) was identified as a product of Dicer-processed snoRNA GlsR17 and localized to the cytoplasm by fluorescence in situ hybridization, whereas GlsR17 was found primarily in the nucleolus of only one of the two nuclei in Giardia. Three other small RNAs were also identified as products of snoRNAs, suggesting that the latter could be novel precursors of miRNAs in Giardia. Putative miR2 target sites were identified at the 3′-untranslated regions (UTR) of 22 variant surface protein mRNAs using the miRanda program. In vivo expression of Renilla luciferase mRNA containing six identical miR2 target sites in the 3′-UTR was reduced by 40% when co-transfected with synthetic miR2, while the level of luciferase mRNA remained unaffected. Thus, miR2 likely affects translation but not mRNA stability. This repression, however, was not observed when Argonaute was knocked down in Giardia using a ribozyme-antisense RNA. Instead, an enhancement of luciferase expression was observed, suggesting a loss of endogenous miR2-mediated repression when this protein is depleted. Additionally, the level of miR2 was significantly reduced when Dicer was knocked down. In all, the evidence indicates the presence of a snoRNA-derived miRNA-mediated translational repression in Giardia.
Gene regulation in Giardia lamblia, a primitive parasitic protozoan responsible for the diarrheal disease giardiasis, is poorly understood. There is no consensus promoter sequence. A simple eight–base pair AT-rich region is sufficient to initiate gene transcription in this organism. Thus, the main control of gene expression may occur after the stage of transcription. The presence of Dicer and Argonaute homologs in Giardia suggested that microRNA (miRNA)-mediated translational repression could be one mechanism of gene regulation. In this work, we characterized the presence of the miRNA pathway in Giardia as well as identified the novel use of small nucleolar RNA (snoRNA) as miRNA precursors. Potential target sites for one small RNA (miR2) were identified with the miRanda program. In vivo reporter assays confirmed the specific interaction between the target sites and miR2. A ribozyme-mediated reduction of Dicer and Argonaute in Giardia showed that the former is required for miR2 production whereas the latter functions in mediating the inhibition of reporter expression, which agrees with the roles of these two proteins. This is the first evidence of miRNA-mediated gene regulation in Giardia and the first demonstration of the use of snoRNAs as miRNA precursors.
In order to gain insights into the relationship between spatial organization of the genome and genome function we have initiated studies of the co-linear Sh2/A1-homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the general organizational patterns of MARs relative to the neighboring genes were preserved. All identified genes were placed in individual loops that were of comparable size for homologous genes. Hence, gene composition, gene orientation, gene order and the placement of genes into structural units have been evolutionarily conserved in this region. Our analysis demonstrated that the occurrence of various 'MAR motifs' is not indicative of MAR location. However, most of the MARs discovered in the two genomic regions were found to co-localize with miniature inverted repeat transposable elements (MITEs), suggesting that MITEs preferentially insert near MARs and/or that they can serve as MARs.
Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones.
We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding a H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia.
In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome.
microRNAs (miRNA) have been detected in the deeply branched protist, Giardia lamblia, and shown to repress expression of the family of variant-specific surface proteins (VSPs), only one of which is expressed in Giardia trophozoite at a given time. Three next-generation sequencing libraries of Giardia Argonaute-associated small RNAs were constructed and analyzed. Analysis of the libraries identified a total of 99 new putative miRNAs with a size primarily in the 26 nt range similar to the size previously predicted by the Giardia Dicer crystal structure and identified by our own studies. Bioinformatic analysis identified multiple putative miRNA target sites in the mRNAs of all 73 VSPs. The effects of miRNA target sites within a defined 3′-region were tested on two vsp mRNAs. All the miRNAs showed partial repression of the corresponding vsp expression and were additive when the targeting sites were separately located, but the combined repression still fell short of 100%. Two other relatively short vsp mRNAs with 15 and 11 putative miRNA target sites identified throughout their ORFs were tested with their corresponding miRNAs. The results indicate that: (1) near 100% repression of vsp mRNA expression can be achieved through the combined action of multiple miRNAs on target sites located throughout the ORF; (2) the miRNA machinery could be instrumental in repressing the expression of vsp genes in Giardia; (3) this is the first time that all the miRNA target sites in the entire ORF of a mRNA have been tested and shown to be functional.
Giardia lamblia is a protozoan parasite causing the diarrheal disease giardiasis. Variant-specific surface proteins (VSP) in Giardia are likely involved in its evasion of host immune response. Their expression is regulated by microRNAs (miRNA). To determine the full complement of miRNAs in Giardia, three cDNA libraries of Giardia Argonaute associated small RNAs were constructed and analyzed to identify a total of 105 miRNAs. Bioinformatic target identification showed that 102 of the 105 miRNAs find their putative target sites in vsp mRNAs. When only the target sites within the 3′ region, 100 nt upstream of the stop codon, were tested against their corresponding miRNAs, however, only partial repression of VSP expression was observed. When all the miRNA target sites in the open reading frames of vsp mRNAs were examined, however, they all turned out to be functional. A saturation of them with the corresponding miRNAs resulted in a full repression of VSP expression, suggesting that this is the mechanism of miRNA repression of VSP expression in Giardia. The ability of miRNAs to regulate target sites throughout the entire open reading frame also provides the first indication that all the miRNA target sites in an mRNA are functional.
Controlled secretion of a protective extracellular matrix is required for transmission of the infective stage of a large number of protozoan and metazoan parasites. Differentiating trophozoites of the highly minimized protozoan parasite Giardia lamblia secrete the proteinaceous portion of the cyst wall material (CWM) consisting of three paralogous cyst wall proteins (CWP1–3) via organelles termed encystation-specific vesicles (ESVs). Phylogenetic and molecular data indicate that Diplomonads have lost a classical Golgi during reductive evolution. However, neogenesis of ESVs in encysting Giardia trophozoites transiently provides basic Golgi functions by accumulating presorted CWM exported from the ER for maturation. Based on this “minimal Golgi” hypothesis we predicted maturation of ESVs to a trans Golgi-like stage, which would manifest as a sorting event before regulated secretion of the CWM. Here we show that proteolytic processing of pro-CWP2 in maturing ESVs coincides with partitioning of CWM into two fractions, which are sorted and secreted sequentially with different kinetics. This novel sorting function leads to rapid assembly of a structurally defined outer cyst wall, followed by slow secretion of the remaining components. Using live cell microscopy we find direct evidence for condensed core formation in maturing ESVs. Core formation suggests that a mechanism controlled by phase transitions of the CWM from fluid to condensed and back likely drives CWM partitioning and makes sorting and sequential secretion possible. Blocking of CWP2 processing by a protease inhibitor leads to mis-sorting of a CWP2 reporter. Nevertheless, partitioning and sequential secretion of two portions of the CWM are unaffected in these cells. Although these cysts have a normal appearance they are not water resistant and therefore not infective. 
Our findings suggest that sequential assembly is a basic architectural principle of protective wall formation and requires minimal Golgi sorting functions.
The protozoan Giardia lamblia is the leading cause for parasite-induced diarrhea with significant morbidity in humans and animals world-wide, and is transmitted by water-resistant cysts. Giardia has undergone substantial reductive evolution to a simpler organization than the last common eukaryotic ancestor, which makes it an interesting model to investigate basic cellular mechanisms. Its secretory system lacks a Golgi, but trophozoites induced to differentiate to cysts generate organelles termed encystation-specific vesicles (ESVs). Previous work shows that ESVs are most likely minimal pulsed Golgi-like compartments for exporting pre-sorted cyst wall material. We tested whether the sorting function associated with classical trans Golgi networks was also conserved in these organelles. By tracking immature and processed forms of the three cyst wall proteins during differentiation we discovered a novel sorting function which results in partitioning of ESV cargo and sequential secretion of the cyst wall material. Using live cell imaging we identified reversible formation of condensed cores as a mechanism for cargo partitioning. These observations suggest that the requirement for sequential secretion of extracellular matrix components protecting Giardia during transmission has prevented the complete secondary loss of the machinery to generate Golgi cisterna-like maturation compartments; indeed, the preserved functions have been placed under stage-specific control.
S/MARs are regions of the DNA that are attached to the nuclear matrix. These regions are known to affect substantially the expression of genes. The computer prediction of S/MARs is a highly significant task which could contribute to our understanding of chromatin organisation in eukaryotic cells, the number and distribution of boundary elements, and the understanding of gene regulation in eukaryotic cells. However, while a number of S/MAR predictors have been proposed, their accuracy has so far not come under scrutiny.
We have selected S/MARs with sufficient experimental evidence and used these to evaluate existing methods of S/MAR prediction. Our main results are: 1.) all existing methods have little predictive power, 2.) a simple rule based on AT-percentage is generally competitive with other methods, 3.) in practice, the different methods will usually identify different sub-sequences as S/MARs, 4.) more research on the H-Rule would be valuable.
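The "simple rule based on AT-percentage" mentioned in result 2 can be illustrated with a sliding-window scan. This is a minimal sketch under stated assumptions, not the rule as evaluated in the study: the window length, step size, and threshold below are hypothetical values chosen for illustration, and the function names are invented here.

```python
def at_fraction(window):
    """Fraction of bases in the window that are A or T."""
    return sum(base in "AT" for base in window) / len(window)


def at_rich_windows(seq, window=300, step=100, threshold=0.70):
    """Return (start, end, AT fraction) for windows at or above the threshold.

    window, step and threshold are illustrative assumptions, not published values.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, start + window, frac))
    return hits


if __name__ == "__main__":
    demo = "GC" * 50 + "AT" * 50 + "GC" * 50
    for start, end, frac in at_rich_windows(demo, window=100, step=100):
        print(start, end, round(frac, 2))
```

Overlapping hits could be merged into longer candidate regions before comparison with experimentally verified S/MARs.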
A new insight is needed to design a method which will predict S/MARs well. Our data, including the control data, has been deposited as additional material and this may help later researchers test new predictors.
Entamoeba histolytica is an amitochondriate protozoan parasite with numerous bacterium-like fermentation enzymes including the pyruvate:ferredoxin oxidoreductase (POR), ferredoxin (FD), and alcohol dehydrogenase E (ADHE). The goal of this study was to determine whether the genes encoding these cytosolic E. histolytica fermentation enzymes might derive from a bacterium by horizontal transfer, as has previously been suggested for E. histolytica genes encoding heat shock protein 60, nicotinamide nucleotide transhydrogenase, and superoxide dismutase. In this study, the E. histolytica por gene and the adhE gene of a second amitochondriate protozoan parasite, Giardia lamblia, were sequenced, and their phylogenetic positions were estimated in relation to POR, ADHE, and FD cloned from eukaryotic and eubacterial organisms. The E. histolytica por gene encodes a 1,620-amino-acid peptide that contained conserved iron-sulfur- and thiamine pyrophosphate-binding sites. The predicted E. histolytica POR showed fewer positional identities to the POR of G. lamblia (34%) than to the POR of the enterobacterium Klebsiella pneumoniae (49%), the cyanobacterium Anabaena sp. (44%), and the protozoan Trichomonas vaginalis (46%), which targets its POR to anaerobic organelles called hydrogenosomes. Maximum-likelihood, neighbor-joining, and parsimony analyses also suggested that E. histolytica POR is less likely to share more recent common ancestry with G. lamblia POR than with the POR of bacteria and the T. vaginalis hydrogenosome. The G. lamblia adhE encodes an 888-amino-acid fusion peptide with an aldehyde dehydrogenase at its amino half and an iron-dependent (class 3) ADH at its carboxy half. The predicted G. lamblia ADHE showed extensive positional identities to ADHE of Escherichia coli (49%), Clostridium acetobutylicum (44%), and E. histolytica (43%) and lesser identities to the class 3 ADH of eubacteria and yeast (19 to 36%). Phylogenetic analyses inferred a closer relationship of the E. histolytica ADHE to bacterial ADHE than to the G. lamblia ADHE. The 6-kDa FD of E. histolytica and G. lamblia were most similar to those of the archaebacterium Methanosarcina barkeri and the delta-purple bacterium Desulfovibrio desulfuricans, respectively, while the 12-kDa FD of the T. vaginalis hydrogenosome was most similar to the 12-kDa FD of gamma-purple bacterium Pseudomonas putida. E. histolytica genes (and probably G. lamblia genes) encoding fermentation enzymes therefore likely derive from bacteria by horizontal transfer, although it is not clear from which bacteria these amebic genes derive. These are the first nonorganellar fermentation enzymes of eukaryotes implicated to have derived from bacteria.
The potentiation and subsequent initiation of transcription are complex biological phenomena. The regions of attachment of the chromatin fiber to the nuclear matrix, known as matrix attachment regions or scaffold attachment regions (MARs or SARs), are thought to be requisite for the transcriptional regulation of the eukaryotic genome. As expressed sequences should be contained in these regions, it becomes significant to answer the following question: can these regions be identified from the primary sequence data alone and subsequently used as markers for expressed sequences? This paper represents an effort toward achieving this goal and describes a mathematical model for the detection of MARs. The location of matrix associated regions has been linked to a variety of sequence patterns. Consequently, a list of these patterns is compiled and represented as a set of decision rules using an AND-OR formulation. The DNA sequence was then searched for the presence of these patterns and a statistical significance was associated with the frequency of occurrence of the various patterns. Subsequently, a mathematical potential value, MAR-Potential, was assigned to a sequence region in inverse proportion to the probability that the observed pattern population occurred at random. Such a MAR detection process was applied to the analysis of a variety of known MAR-containing sequences. Regions of matrix association predicted by the software essentially correspond to those determined experimentally. The human T-cell receptor and the DNA sequence from the Drosophila bithorax region were also analyzed. This demonstrates the usefulness of the approach described as a means to direct experimental resources.
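The idea of scoring a region by how unlikely its pattern counts are under chance can be sketched as follows. This is an illustrative assumption, not the published model: it treats each pattern count in a window as Poisson distributed with a known expected count, treats patterns as independent, and uses the negative logarithm of the tail probability in place of a literal reciprocal. The function names and the example pattern labels are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the complementary sum."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    # Clamp to a tiny positive value so the logarithm below stays defined.
    return max(1.0 - below, 1e-300)


def mar_potential(observed, expected):
    """Sum of -log tail probabilities over patterns (independence assumed).

    observed: pattern name -> count seen in the window
    expected: pattern name -> count expected at random in a window of that size
    """
    return sum(
        -math.log(poisson_tail(observed[name], expected[name]))
        for name in observed
    )


if __name__ == "__main__":
    expected = {"AT_rich": 1.0, "topo_II": 0.5}   # hypothetical pattern labels
    sparse = {"AT_rich": 1, "topo_II": 0}
    dense = {"AT_rich": 5, "topo_II": 3}
    print(mar_potential(sparse, expected), mar_potential(dense, expected))
```

Under this sketch a window with no pattern occurrences scores zero, and the score rises as counts exceed their random expectation.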
Interphase chromatin is arranged into topologically separated domains comprising gene expression and replication units through genomic sequence elements, so-called MAR or SAR regions (for matrix- or scaffold-associating regions). S/MAR regions are located near the boundaries of actively transcribed genes and were shown to influence their activity. We show that scaffold attachment factor B (SAF-B), which specifically binds to S/MAR regions, interacts with RNA polymerase II (RNA pol II) and a subset of serine-/arginine-rich RNA processing factors (SR proteins). SAF-B localized to the nucleus in a speckled pattern that coincided with the distribution of the SR protein SC35. Furthermore, we show that overexpressed SAF-B induced an increase of the 10S splice product using an E1A reporter gene and repressed the activity of an S/MAR-flanked CAT reporter gene construct in vivo. This indicates an association of SAF-B with SR proteins and components of the transcription machinery. Our results describe the coupling of a chromatin organizing S/MAR element with transcription and pre-mRNA processing components and we propose that SAF-B serves as a molecular base to assemble a 'transcriptosome complex' in the vicinity of actively transcribed genes.
RNA degradation is critical to the survival of all cells. With increasing evidence for pervasive transcription in cells, RNA degradation has gained recognition as a means of regulating gene expression. Yet, RNA degradation machinery has been studied extensively in only a few eukaryotic organisms, including Saccharomyces cerevisiae and humans. Giardia lamblia is a parasitic protist with unusual genomic traits: it is binucleated and tetraploid, has a very compact genome, displays a theme of genomic minimalism with cellular machinery commonly comprised of a reduced number of protein components, and has a remarkably large population of long, stable, noncoding, antisense RNAs.
Here we use in silico approaches to investigate the major RNA degradation machinery in Giardia lamblia and compare it to a broad array of other parasitic protists. We have found key constituents of the deadenylation and decapping machinery and of the 5'-3' RNA degradation pathway. We have similarly found that all of the major 3'-5' RNA degradation pathways are present in Giardia, including both exosome-dependent and exosome-independent machinery. However, we observe significant loss of RNA degradation machinery genes that will result in important differences in the protein composition, and potentially functionality, of the various RNA degradation pathways. This is most apparent in the exosome, the central mediator of 3'-5' degradation, which apparently contains an altered core configuration in both Giardia and Plasmodium, with only four, instead of the canonical six, distinct subunits. Additionally, the exosome in Giardia is missing the Rrp6, Nab3, and Nrd1 proteins, known to be key regulators of noncoding transcript stability in other cells.
These findings suggest that although the full complement of the major RNA degradation mechanisms was present - and likely functional - early in eukaryotic evolution, the composition and function of the complexes are more variable than previously appreciated. We suggest that the missing components of the exosome complex provide an explanation for the stable abundance of sterile RNA species in Giardia.
The marRAB operon is a regulatory locus that controls multiple drug resistance in Escherichia coli. marA encodes a positive regulator of the antibiotic resistance response, acting by altering the expression of unlinked genes. marR encodes a repressor of marRAB transcription and controls the production of MarA in response to environmental signals. A molecular and genetic study of the homologous operon in Salmonella typhimurium was undertaken, and the role of marA in virulence in a murine model was assessed. Expression of E. coli marA (marAEc) present on a multicopy plasmid in S. typhimurium resulted in a multiple antibiotic resistance (Mar) phenotype, suggesting that a similar regulon exists in this organism. A genomic plasmid library containing S. typhimurium chromosomal sequences was introduced into an E. coli strain that was deleted for the mar locus and contained a single-copy marR'-'lacZ translational fusion. Plasmid clones that contained both S. typhimurium marR (marRSt) and marA (marASt) genes were identified as those that were capable of repressing expression of the fusion and which resulted in a Mar phenotype. The predicted amino acid sequences of MarRSt, MarASt, and MarBSt were 91, 86, and 42% identical, respectively, to their E. coli counterparts, while the operator/promoter region of the operon was 86% identical to the same 98-nucleotide-upstream region in E. coli. The marRAB transcriptional start sites for both organisms were determined by primer extension, and a marRABSt transcript of approximately 1.1 kb was identified by Northern blot analysis. Its accumulation was shown to be inducible by sodium salicylate. Open reading frames flanking the marRAB operon were also conserved. An S. typhimurium marA disruption strain was constructed by an allelic exchange method and compared to the wild-type strain for virulence in a murine BALB/c infection model. No effect on virulence was noted. The endogenous S. typhimurium plasmid that is associated with virulence played no role in marA-mediated multiple antibiotic resistance. Taken together, the data show that the S. typhimurium mar locus is structurally and functionally similar to marRABEc and that a lesion in marASt has no effect on S. typhimurium virulence for BALB/c mice.
Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (EST) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus).
The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus poorly sampled genes, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes – mostly encoding metabolic proteins – that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified, including 149 that putatively encode variant-surface cysteine-rich proteins, which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified, such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage, and distinct polyadenylation signals.
Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly-related protist lineages may be occurring at an appreciable rate in eukaryote genome evolution.
Mitochondrial processing peptidases are heterodimeric enzymes (α/βMPP) that play an essential role in mitochondrial biogenesis by recognizing and cleaving the targeting presequences of nuclear-encoded mitochondrial proteins. The two subunits are paralogues that probably evolved by duplication of a gene for a monomeric metallopeptidase from the endosymbiotic ancestor of mitochondria. Here, we characterize the MPP-like proteins from two important human parasites that contain highly reduced versions of mitochondria, the mitosomes of Giardia intestinalis and the hydrogenosomes of Trichomonas vaginalis. Our biochemical characterization of recombinant proteins showed that, contrary to a recent report, the Trichomonas processing peptidase functions efficiently as an α/β heterodimer. By contrast, and so far uniquely among eukaryotes, the Giardia processing peptidase functions as a monomer comprising a single βMPP-like catalytic subunit. The structure and surface charge distribution of the Giardia processing peptidase predicted from a 3-D protein model appear to have co-evolved with the properties of Giardia mitosomal targeting sequences, which, unlike classic mitochondrial targeting signals, are typically short and impoverished in positively charged residues. The majority of hydrogenosomal presequences resemble those of mitosomes, but longer, positively charged mitochondrial-type presequences were also identified, consistent with the retention of the Trichomonas αMPP-like subunit. Our computational and experimental/functional analyses reveal that the divergent processing peptidases of Giardia mitosomes and Trichomonas hydrogenosomes evolved from the same ancestral heterodimeric α/βMPP metallopeptidase as did the classic mitochondrial enzyme. The unique monomeric structure of the Giardia enzyme, and the co-evolving properties of the Giardia enzyme and substrate, provide a compelling example of the power of reductive evolution to shape parasite biology.
In classic model organisms, cleavage of signals that are required to deliver nuclear-encoded proteins to mitochondria is mediated by an enzyme comprising two different subunits, called α or β, neither of which is functional by itself. Here, we have characterized a novel enzyme that functions in the mitosome, a highly reduced mitochondrion, of the pathogenic protist Giardia intestinalis. The Giardia enzyme is unique among eukaryotes because it has undergone reductive evolution to function efficiently as a single β-subunit monomer. We also show that the recent claim that the equivalent enzyme in the hydrogenosome, another type of reduced mitochondrion of the human parasite Trichomonas vaginalis, functions as a homodimer of two β-subunits, is not supported. The Trichomonas enzyme requires both an α- and a β-subunit to function most efficiently. Computational analysis of the Giardia and Trichomonas enzymes reveals that their structures and surface charge distributions have co-evolved to match the peculiar properties of the targeting signals that they process. The Giardia mitosome is an ideal model for studying the limits of mitochondrial reductive evolution and, because it makes cofactors that are essential for Giardia survival, is a potential therapeutic target for this important human parasite.
The Arabidopsis thaliana genome is currently being sequenced, eventually leading towards the unravelling of all potential genes. We wanted to gain more insight into the way this genome might be organized at the ultrastructural level. To this end we identified matrix attachment regions demarcating potential chromatin domains in a 16 kb region around the plastocyanin gene. The region was cloned and sequenced, revealing six genes in addition to the plastocyanin gene. Using a heterologous in vitro nuclear matrix binding assay to search for evolutionarily conserved matrix attachment regions (MARs), we identified three such MARs. These three MARs divide the region into two small chromatin domains of 5 kb, each containing two genes. Comparison of the sequence of the three MARs revealed a degenerate 21 bp sequence that is shared between these MARs and that is not found elsewhere in the region. A similar sequence element is also present in four other MARs of Arabidopsis. Therefore, this sequence may constitute a landmark for the position of MARs in the genome of this plant. In a genomic sequence database of Arabidopsis the 21 bp element is found approximately once every 10 kb. The compactness of the Arabidopsis genome could account for the high incidence of MARs and MRSs we observed.
The negative regulatory element (NRE) of human immunodeficiency virus type-1 (HIV-1) long terminal repeat (LTR) is a defined region that has been reported to downregulate LTR-directed HIV gene expression. However, information on the precise role of this region in regulating HIV gene transcription is lacking. We have investigated the possibility that these NRE sequences regulate HIV transcription by a mechanism mediated through a nuclear matrix-specific DNA-protein interaction. We find a nuclear matrix attachment region (MAR) present within the NRE of the HIV-1 LTR that recognizes a sequence-specific DNA-binding protein present in the nuclear matrix of HIV-infected cells. Moreover, we also show that the purified DNA-binding nuclear matrix protein (NMP) specifically represses the DNA-binding activity of NF-kappaB. It is likely that the MAR and MAR-enriched specific DNA-binding NMP are brought into juxtaposition by the non-chromatin scaffolding of the nucleus, thus influencing NF-kappaB (and other nuclear proteins) DNA-binding activity through protein-protein and protein-DNA interactions. Our data suggest that one possible role of the NRE could be to act as a matrix attachment site in the nuclear matrix, thus allowing interaction with a sequence-specific trans-acting factor. The negative effect on NF-kappaB activity due to this MAR-NMP-specific interaction provides a mechanism by which the NRE downregulates HIV gene expression.
The nuclear matrix has been implicated in several cellular processes, including DNA replication, transcription, and RNA processing. In particular, transcriptional regulation is believed to be accomplished by binding of chromatin loops to the nuclear matrix and by the concentration of specific transcription factors near these matrix attachment regions (MARs). A number of MAR-binding proteins have been identified, but few have been directly linked to tissue-specific transcription. Recently, we have identified two cellular protein complexes (NBP and UBP) that bind to a region of the mouse mammary tumor virus (MMTV) long terminal repeat (LTR) previously shown to contain at least two negative regulatory elements (NREs) termed the promoter-proximal and promoter-distal NREs. These NREs are absent from MMTV strains that cause T-cell lymphomas instead of mammary carcinomas. We show here that NBP binds to a 22-bp sequence containing an imperfect inverted repeat in the promoter-proximal NRE. Previous data showed that a mutation (p924) within the inverted repeat elevated basal transcription from the MMTV promoter and destabilized the binding of NBP, but not UBP, to the proximal NRE. By using conventional and affinity methods to purify NBP from rat thymic nuclear extracts, we obtained a single major protein of 115 kDa that was identified by protease digestion and partial sequencing analysis as the nuclear matrix-binding protein special AT-rich sequence-binding protein 1 (SATB1). Antibody ablation, distamycin inhibition of binding, renaturation and competition experiments, and tissue distribution data all confirmed that the NBP complex contained SATB1. Similar types of experiments were used to show that the UBP complex contained the homeodomain protein Cux/CDP that binds the MAR of the intronic heavy-chain immunoglobulin enhancer. 
By using the p924 mutation within the MMTV LTR upstream of the chloramphenicol acetyltransferase gene, we generated two strains of transgenic mice that had a dramatic elevation of reporter gene expression in lymphoid tissues compared with reporter gene expression in mice expressing wild-type LTR constructs. Thus, the 924 mutation in the SATB1-binding site dramatically elevated MMTV transcription in lymphoid tissues. These results and the ability of the proximal NRE in the MMTV LTR to bind to the nuclear matrix clearly demonstrate the role of MAR-binding proteins in tissue-specific gene regulation and in MMTV-induced oncogenesis.
Chromatin in eukaryotic nuclei is thought to be partitioned into functional loop domains that are generated by the binding of defined DNA sequences, named MARs (matrix attachment regions), to the nuclear matrix. We have previously identified B-type lamins as MAR-binding matrix components (M. E. E. Ludérus, A. de Graaf, E. Mattia, J. L. den Blaauwen, M. A. Grande, L. de Jong, and R. van Driel, Cell 70:949-959, 1992). Here we show that A-type lamins and the structurally related proteins desmin and NuMA also specifically bind MARs in vitro. We studied the interaction between MARs and lamin polymers in molecular detail and found that the interaction is saturable, of high affinity, and evolutionarily conserved. Competition studies revealed the existence of two different types of interaction related to different structural features of MARs: one involving the minor groove of double-stranded MAR DNA and one involving single-stranded regions. We obtained similar results for the interaction of MARs with intact nuclear matrices from rat liver. A model in which the interaction of nuclear matrix proteins with single-stranded MAR regions serves to stabilize the transcriptionally active state of chromatin is discussed.