Related Articles

1.  Separate base usages of genes located on the leading and lagging strands in Chlamydia muridarum revealed by the Z curve method 
BMC Genomics  2007;8:366.
The nucleotide compositional asymmetry between the leading and lagging strands in bacterial genomes has been the subject of intensive study in the past few years. It is interesting to mention that almost all bacterial genomes exhibit the same kind of base asymmetry. This work aims to investigate the strand biases in Chlamydia muridarum genome and show the potential of the Z curve method for quantitatively differentiating genes on the leading and lagging strands.
The occurrence frequencies of bases of protein-coding genes in the C. muridarum genome were analyzed by the Z curve method. It was found that genes located on the two strands of replication have distinct base usages in the C. muridarum genome. According to their positions in the 9-D space spanned by the variables u1 – u9 of the Z curve method, the K-means clustering algorithm can assign about 94% of genes to the correct strands, which is a few percent higher than the proportion correctly classified by K-means based on the RSCU. The base usage and codon usage analyses show that genes on the leading strand have more G than C and more T than A, particularly at the third codon position. For genes on the lagging strand the biases are reversed. The y component of the Z curves for the complete chromosome sequences shows that the excesses of G over C and T over A are more pronounced in the C. muridarum genome than in other bacterial genomes without separate base and/or codon usages. Furthermore, for the genomes of Borrelia burgdorferi, Treponema pallidum, Chlamydia muridarum and Chlamydia trachomatis, in which distinct base and/or codon usages have been observed, closer phylogenetic distance is found compared with other bacterial genomes.
The nature of the strand biases of base composition in C. muridarum is similar to that in most other bacterial genomes. However, the base composition asymmetry between the leading and lagging strands in C. muridarum is more significant than that in other bacteria. It's supposed that the remarkable strand biases of G/C and T/A are responsible for the appearance of separate base or codon usages in C. muridarum. On the other hand, the closer phylogenetic distance among the four bacterial genomes with separate base and/or codon usages is necessary rather than occasional. It's also shown that the Z curve method may be more sensitive than RSCU when being used to quantitatively analyze DNA sequences.
PMCID: PMC2089121  PMID: 17925038
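As a rough illustration of the 9-D representation mentioned in the abstract above, the following minimal sketch computes three Z curve components for each codon position of a coding sequence. The function name `z_curve_9` and the ordering of the nine values are assumptions made for this example; the abstract does not say how u1 – u9 are ordered or normalized.

```python
from collections import Counter


def z_curve_9(cds: str) -> list[float]:
    """Return [x1, y1, z1, x2, y2, z2, x3, y3, z3] for a coding sequence.

    Hypothetical helper: for each codon position, base frequencies
    a, c, g, t are turned into the three Z curve components.
    """
    cds = cds.upper()
    features: list[float] = []
    for phase in range(3):
        # Every third base, starting at this codon position.
        counts = Counter(b for b in cds[phase::3] if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            raise ValueError("no A/C/G/T bases at codon position %d" % (phase + 1))
        a, c, g, t = (counts[b] / total for b in "ACGT")
        features += [
            (a + g) - (c + t),  # x: purines vs pyrimidines
            (a + c) - (g + t),  # y: amino vs keto (negative when G > C and T > A)
            (a + t) - (g + c),  # z: weak vs strong hydrogen bonding
        ]
    return features
```

Vectors of this kind, one per gene, could then be passed to any K-means implementation with two clusters; the clustering step itself is not shown here.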
2.  Chromosome Structuring Limits Genome Plasticity in Escherichia coli 
PLoS Genetics  2007;3(12):e226.
Chromosome organizations of related bacterial genera are well conserved despite a very long divergence period. We have assessed the forces limiting bacterial genome plasticity in Escherichia coli by measuring the respective effect of altering different parameters, including DNA replication, compositional skew of replichores, coordination of gene expression with DNA replication, replication-associated gene dosage, and chromosome organization into macrodomains. Chromosomes were rearranged by large inversions. Changes in the compositional skew of replichores, in the coordination of gene expression with DNA replication or in the replication-associated gene dosage have only a moderate effect on cell physiology because large rearrangements inverting the orientation of several hundred genes inside a replichore are only slightly detrimental. By contrast, changing the balance between the two replication arms has a more drastic effect, and the recombinational rescue of replication forks is required for cell viability when one of the chromosome arms is less than half the length of the other one. Macrodomain organization also appears to be a major factor restricting chromosome plasticity, and two types of inverted configurations severely affect the cell cycle. First, the disruption of the Ter macrodomain with replication forks merging far from the normal replichore junction provoked chromosome segregation defects. The second major problematic configuration resulted from inversions between Ori and Right macrodomains, which perturb nucleoid distribution and early steps of cytokinesis. Consequences for the control of the bacterial cell cycle and for the evolution of bacterial chromosome configuration are discussed.
Author Summary
Genomic analyses have revealed that bacterial genomes are dynamic entities that evolve through various processes including intrachromosome genetic rearrangements, gene duplication, and gene loss or acquisition by gene transfer. Nevertheless, comparison of bacterial chromosomes from related genera revealed a conservation of genetic organization. Most bacterial genomes are circular molecules, and DNA replication proceeds bidirectionally from a single origin to an opposite region where replication forks meet. The replication process imprints the bacterial chromosome because initiation and termination at defined loci result in strand biases due to the mutational differences occurring during leading and lagging strands synthesis. We analyze the strength of different parameters that may limit genome plasticity. We show that the preferential positioning of essential genes on the leading strand, the proximity of genes involved in transcription and translation to the origin of replication on the leading strand, and the presence of biased motifs along the replichores operate only as long-term positive selection determinants. By contrast, selection operates to maintain replication arms of similar lengths. Finally, we demonstrate that spatial structuring of the chromosome impedes strongly genome plasticity. Genetic evidence supports the presence of two steps in the cell cycle controlled by the spatial organization of the chromosome.
PMCID: PMC2134941  PMID: 18085828
3.  Codon usage in Chlamydia trachomatis is the result of strand-specific mutational biases and a complex pattern of selective forces 
Nucleic Acids Research  2000;28(10):2084-2090.
The patterns of synonymous codon choices of the completely sequenced genome of the bacterium Chlamydia trachomatis were analysed. We found that the most important source of variation among the genes results from whether the sequence is located on the leading or lagging strand of replication, resulting in an over representation of G or C, respectively. This can be explained by different mutational biases associated with the different enzymes that replicate each strand. Next we found that most highly expressed sequences are located on the leading strand of replication. From this result, replicational-transcriptional selection can be invoked. Then, when the genes located on the leading strand are studied separately, the correspondence analysis detects a principal trend which discriminates between lowly and highly expressed sequences, the latter displaying a different codon usage pattern than the former, suggesting selection for translation, which is reinforced by the fact that Ks values between orthologous sequences from C. trachomatis and Chlamydia pneumoniae are much smaller in highly expressed genes. Finally, synonymous codon choices appear to be influenced by the hydropathy of each encoded protein and by the degree of amino acid conservation. Therefore, synonymous codon usage in C. trachomatis seems to be the result of a very complex balance among different factors, which raises the question of whether the forces driving codon usage patterns among microorganisms are rather more complex than generally accepted.
PMCID: PMC105376  PMID: 10773076
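The abstract above does not state which codon usage measure was passed to correspondence analysis. One commonly used per-gene measure is relative synonymous codon usage (RSCU): the observed count of a codon divided by the mean count of all codons encoding the same amino acid. The sketch below is an illustration under that assumption; the function name `rscu` and the use of the standard genetic code are choices made for this example, not details from the paper.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU values for codons of amino acids present in the sequence.

    Hypothetical helper: stop codons and single-codon amino acids
    (Met, Trp) are skipped because they have no synonymous choice.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values: dict[str, float] = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue
        for c in family:
            # Observed count relative to the mean count within the family.
            values[c] = counts[c] * len(family) / total
    return values
```

A value of 1.0 means a codon is used exactly as often as expected under equal synonymous usage; values above or below 1.0 indicate over- or under-representation.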
4.  Strand-Specific RNA-Seq Reveals Ordered Patterns of Sense and Antisense Transcription in Bacillus anthracis 
PLoS ONE  2012;7(8):e43350.
Although genome-wide transcriptional analysis has been used for many years to study bacterial gene expression, many aspects of the bacterial transcriptome remain undefined. One example is antisense transcription, which has been observed in a number of bacteria, though the function of antisense transcripts, and their distribution across the bacterial genome, is still unclear.
Methodology/Principal Findings
Single-stranded RNA-seq results revealed a widespread and non-random pattern of antisense transcription covering more than two thirds of the B. anthracis genome. Our analysis revealed a variety of antisense structural patterns, suggesting multiple mechanisms of antisense transcription. The data revealed several instances of sense and antisense expression changes in different growth conditions, suggesting that antisense transcription may play a role in the ways in which B. anthracis responds to its environment. Significantly, genome-wide antisense expression occurred at consistently higher levels on the lagging strand, while the leading strand showed very little antisense activity. Intrasample gene expression comparisons revealed a gene dosage effect in all growth conditions, where genes farthest from the origin showed the lowest overall range of expression for both sense and antisense directed transcription. Additionally, transcription from both strands was verified using a novel strand-specific assay. The variety of structural patterns we observed in antisense transcription suggests multiple mechanisms for this phenomenon, suggesting that some antisense transcription may play a role in regulating the expression of key genes, while some may be due to chromosome replication dynamics and transcriptional noise.
Although the variety of structural patterns we observed in antisense transcription suggests multiple mechanisms for antisense expression, our data also clearly indicate that antisense transcription may play a genome-wide role in regulating the expression of key genes in Bacillus species. This study illustrates the surprising complexity of prokaryotic RNA abundance for both strands of a bacterial chromosome.
PMCID: PMC3425587  PMID: 22937038
5.  Proteome composition and codon usage in spirochaetes: species-specific and DNA strand-specific mutational biases. 
Nucleic Acids Research  1999;27(7):1642-1649.
The genomes of the spirochaetes Borrelia burgdorferi and Treponema pallidum show strong strand-specific skews in nucleotide composition, with the leading strand in replication being richer in G and T than the lagging strand in both species. This mutation bias results in codon usage and amino acid composition patterns that are significantly different between genes encoded on the two strands, in both species. There are also substantial differences between the species, with T. pallidum having a much higher G+C content than B. burgdorferi. These changes in amino acid and codon compositions represent neutral sequence change that has been caused by strong strand- and species-specific mutation pressures. Genes that have been relocated between the leading and lagging strands since B. burgdorferi and T. pallidum diverged from a common ancestor now show codon and amino acid compositions typical of their current locations. There is no evidence that translational selection operates on codon usage in highly expressed genes in these species, and the primary influence on codon usage is whether a gene is transcribed in the same direction as replication, or opposite to it. The dnaA gene in both species has codon usage patterns distinctive of a lagging strand gene, indicating that the origin of replication lies downstream of this gene, possibly within dnaN. Our findings strongly suggest that gene-finding algorithms that ignore variability within the genome may be flawed.
PMCID: PMC148367  PMID: 10075995
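The criterion named in the abstract above, whether a gene is transcribed in the same direction as replication or opposite to it, can be sketched for a circular chromosome as follows. This is a minimal illustration, not the authors' procedure: the function name, the 0-based forward-strand coordinates, and the assumption that the gene does not span coordinate 0 are all choices made for this example.

```python
def replication_strand(gene_start: int, gene_end: int, gene_strand: str,
                       origin: int, terminus: int, genome_length: int) -> str:
    """Classify a gene as 'leading' or 'lagging' on a circular chromosome.

    Hypothetical helper. Coordinates are positions on the forward strand,
    gene_strand is '+' or '-', and the gene must not wrap around position 0.
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    midpoint = ((gene_start + gene_end) // 2) % genome_length

    # Distances measured in the direction of increasing coordinates,
    # starting from the origin of replication.
    to_gene = (midpoint - origin) % genome_length
    to_terminus = (terminus - origin) % genome_length

    # On the arm between origin and terminus (increasing coordinates) the
    # fork moves in the '+' direction; on the other arm it moves in '-'.
    fork_direction = "+" if to_gene < to_terminus else "-"

    # A gene transcribed in the fork direction is co-oriented with
    # replication, i.e. a leading strand gene.
    return "leading" if gene_strand == fork_direction else "lagging"
```

For example, with the origin at 0 and the terminus at 500 on a 1000 bp circle, a '+' gene at 100-200 is classified as leading and a '+' gene at 700-800 as lagging.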
6.  Accelerated gene evolution via replication-transcription conflicts 
Nature  2013;495(7442):10.1038/nature11989.
Several mechanisms that increase the rate of mutagenesis across the entire genome have been identified; however, how the rate of evolution might be promoted in individual genes is unclear. A majority of the genes in bacteria are encoded on the leading strand of replication [1–4]. This presumably avoids the potentially detrimental head-on collisions that occur between the replication and transcription machineries when genes are encoded on the lagging strand [1–4]. We identified the ubiquitous (core) genes in Bacillus subtilis and determined that 17% of them are on the lagging strand. We found a higher rate of point mutations in the core genes on the lagging strand compared to those on the leading strand, with this difference being primarily in the amino acid changing (nonsynonymous) mutations. We determined that overall, the genes under strong negative selection against amino acid changing mutations tend to be on the leading strand, co-oriented with replication. In contrast, based on the rate of convergent mutations, genes under positive selection for amino acid changing mutations are more commonly found on the lagging strand, indicating faster adaptive evolution in many genes in the head-on orientation. Increased gene length and gene expression levels are positively correlated with the rate of accumulation of nonsynonymous mutations in the head-on genes, suggesting that the conflict between replication and transcription could be a driving force behind these mutations. Indeed, using reversion assays, we show that the difference in the rate of mutagenesis of genes in the two orientations is transcription-dependent. Altogether, our findings indicate that head-on replication-transcription conflicts are more mutagenic than co-directional conflicts and that these encounters can significantly increase adaptive structural variation in the coded proteins.
We propose that bacteria, and potentially other organisms, promote faster evolution of specific genes through orientation-dependent encounters between DNA replication and transcription.
PMCID: PMC3807732  PMID: 23538833
7.  Mismatch Repair Balances Leading and Lagging Strand DNA Replication Fidelity 
PLoS Genetics  2012;8(10):e1003016.
The two DNA strands of the nuclear genome are replicated asymmetrically using three DNA polymerases, α, δ, and ε. Current evidence suggests that DNA polymerase ε (Pol ε) is the primary leading strand replicase, whereas Pols α and δ primarily perform lagging strand replication. The fact that these polymerases differ in fidelity and error specificity is interesting in light of the fact that the stability of the nuclear genome depends in part on the ability of mismatch repair (MMR) to correct different mismatches generated in different contexts during replication. Here we provide the first comparison, to our knowledge, of the efficiency of MMR of leading and lagging strand replication errors. We first use the strand-biased ribonucleotide incorporation propensity of a Pol ε mutator variant to confirm that Pol ε is the primary leading strand replicase in Saccharomyces cerevisiae. We then use polymerase-specific error signatures to show that MMR efficiency in vivo strongly depends on the polymerase, the mismatch composition, and the location of the mismatch. An extreme case of variation by location is a T-T mismatch that is refractory to MMR. This mismatch is flanked by an AT-rich triplet repeat sequence that, when interrupted, restores MMR to >95% efficiency. Thus this natural DNA sequence suppresses MMR, placing a nearby base pair at high risk of mutation due to leading strand replication infidelity. We find that, overall, MMR most efficiently corrects the most potentially deleterious errors (indels) and then the most common substitution mismatches. In combination with earlier studies, the results suggest that significant differences exist in the generation and repair of Pol α, δ, and ε replication errors, but in a generally complementary manner that results in high-fidelity replication of both DNA strands of the yeast nuclear genome.
Author Summary
The stability of complex and highly organized nuclear genomes partly depends on the ability of mismatch repair (MMR) to correct a variety of different mismatches generated as the leading and lagging strand templates are copied by three polymerases, each with different fidelity. Here we provide the first comparison, to our knowledge, of the efficiency of MMR of leading and lagging strand replication errors. We first confirm that Pol ε is the primary leading strand replicase, complementing earlier assignment of Pols α and δ as the primary lagging strand replicases. We then show that MMR efficiency in vivo strongly depends on the polymerase that generates the mismatch and on the composition and location of mismatches. In one extreme case, a flanking triplet repeat sequence eliminates MMR altogether. Overall, MMR is most efficient for mismatches generated at the highest rates and having the most deleterious potential, thereby ultimately achieving high-fidelity replication of both DNA strands.
PMCID: PMC3469411  PMID: 23071460
8.  Codon Usage Domains over Bacterial Chromosomes 
PLoS Computational Biology  2006;2(4):e37.
The geography of codon bias distributions over prokaryotic genomes and its impact upon chromosomal organization are analyzed. To this aim, we introduce a clustering method based on information theory, specifically designed to cluster genes according to their codon usage and apply it to the coding sequences of Escherichia coli and Bacillus subtilis. One of the clusters identified in each of the organisms is found to be related to expression levels, as expected, but other groups feature an over-representation of genes belonging to different functional groups, namely horizontally transferred genes, motility, and intermediary metabolism. Furthermore, we show that genes with a similar bias tend to be close to each other on the chromosome and organized in coherent domains, more extended than operons, demonstrating a role of translation in structuring bacterial chromosomes. It is argued that a sizeable contribution to this effect comes from the dynamical compartmentalization induced by the recycling of tRNAs, leading to gene expression rates dependent on their genomic and expression context.
Genomic sequencing projects are clearly showing that cellular components are not randomly encoded over bacterial chromosomes. Order arises for a variety of reasons. Bailly-Bechet and colleagues focused here on the role of translation in shaping bacterial chromosomes. Due to degeneracy of the genetic code, each amino acid can be encoded by multiple codons. Gene encoding is not random, though, and, depending on the genes, some codons are preferred to their synonyms. This is the so-called codon bias phenomenon. The authors analyzed the usage of synonymous codons for protein encoding and its geography over bacterial chromosomes. They found that genes sharing similar codon bias tend to be close to each other on the chromosome, in coherent patches more extended than transcriptional units. Their hypothesis is that those correlations in codon bias enable the cell to locally recycle tRNAs employed during translation, reducing stalling of the ribosomes due to rare tRNAs. This also entails a dependence of expression rates of a gene on its chromosomal context. Furthermore, their analysis made clear that genes involved in anabolic pathways, mainly active when the cell is starving, have a similar codon usage, and that they are encoded on the lagging strand of DNA. They hypothesize that this is due to relative translation efficiency of the lagging strand as compared with the leading one, illustrating the role of translation in creating structural evolutionary constraints.
PMCID: PMC1447655  PMID: 16683018
9.  Asymmetric directional mutation pressures in bacteria 
Genome Biology  2002;3(10):research0058.1-research0058.14.
When there are no strand-specific biases in mutation and selection rates between the two strands of DNA, the average nucleotide composition is theoretically expected to be A = T and G = C within each strand. By focusing on weakly selected regions that could be oriented with respect to replication in 43 out of 51 completely sequenced bacterial chromosomes, asymmetric directional mutation pressures have been detected.
When there are no strand-specific biases in mutation and selection rates (that is, in the substitution rates) between the two strands of DNA, the average nucleotide composition is theoretically expected to be A = T and G = C within each strand. Deviations from these equalities are therefore evidence for an asymmetry in selection and/or mutation between the two strands. By focusing on weakly selected regions that could be oriented with respect to replication in 43 out of 51 completely sequenced bacterial chromosomes, we have been able to detect asymmetric directional mutation pressures.
Most of the 43 chromosomes were found to be relatively enriched in G over C and T over A, and slightly depleted in G+C, in their weakly selected positions (intergenic regions and third codon positions) in the leading strand compared with the lagging strand. Deviations from A = T and G = C were highly correlated between third codon positions and intergenic regions, with a lower degree of deviation in intergenic regions, and were not correlated with overall genomic G+C content.
During the course of bacterial chromosome evolution, the effects of asymmetric directional mutation pressures are commonly observed in weakly selected positions. The degree of deviation from equality is highly variable among species, and within species is higher in third codon positions than in intergenic regions. The orientation of these effects is almost universal and is compatible in most cases with the hypothesis of an excess of cytosine deamination in the single-stranded state during DNA replication. However, the variation in G+C content between species is influenced by factors other than asymmetric mutation pressure.
PMCID: PMC134625  PMID: 12372146
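Deviations from A = T and G = C within one strand, as discussed in the abstract above, are often summarized as normalized skews. The sketch below computes a GC skew and a TA skew for a sequence and extracts third codon positions as one example of weakly selected sites. The function names are hypothetical and the normalized-difference form is an assumption for this example; the abstract does not specify the statistic the authors used.

```python
def strand_skews(seq: str) -> tuple[float, float]:
    """Return (GC skew, TA skew) for one strand.

    GC skew = (G - C) / (G + C) and TA skew = (T - A) / (T + A).
    Both are 0 when the parity expectation A = T, G = C holds exactly,
    and positive when G exceeds C or T exceeds A.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(b) for b in "ACGT")
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    ta_skew = (t - a) / (t + a) if (t + a) else 0.0
    return gc_skew, ta_skew


def third_positions(cds: str) -> str:
    """Return the bases at third codon positions of an in-frame coding sequence."""
    return cds[2::3]
```

Applied to leading strand sequence, positive values of both skews would correspond to the enrichment in G over C and T over A that the abstract reports for most of the 43 chromosomes.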
10.  A Blueprint for a Mutationist Theory of Replicative Strand Asymmetries Formation 
Current Genomics  2012;13(1):55-64.
In the present review, we summarized current knowledge on replicative strand asymmetries in prokaryotic genomes. A cornerstone for the creation of a theory of their formation has been overviewed. According to our recent works, the probability of nonsense mutation caused by replication-associated mutational pressure is higher for genes from lagging strands than for genes from leading strands of both bacterial and archaeal genomes. Lower density of open reading frames in lagging strands can be explained by faster rates of nonsense mutations in genes situated on them. According to the asymmetries in nucleotide usage in fourfold and twofold degenerate sites, the direction of replication-associated mutational pressure for genes from lagging strands is usually the same as the direction of transcription-associated mutational pressure. It means that lagging strands should accumulate more 8-oxo-G, uracil and 5-formyl-uracil, respectively. In our opinion, consequences of cytosine deamination (C to T transitions) do not lead to the decrease of cytosine usage in genes from lagging strands because of the consequences of thymine oxidation (T to C transitions), while guanine oxidation (causing G to T transversions) makes the main contribution into the decrease of guanine usage in fourfold degenerate sites of genes from lagging strands. Nucleotide usage asymmetries and bias in density of coding regions can be found in archaeal genomes, although, the percent of “inversed” asymmetries is much higher for them than for bacterial genomes. “Homogenized” and “inversed” replicative strand asymmetries in archaeal genomes can be used as retrospective indexes for detection of OriC translocations and large inversions.
PMCID: PMC3269017  PMID: 22942675
Chirochore; GC-content; isochore; mutational pressure; nonsense mutation; replichore.
11.  APOBEC3A deaminates transiently exposed single-strand DNA during LINE-1 retrotransposition 
eLife  2014;3:e02008.
Long INterspersed Element-1 (LINE-1 or L1) retrotransposition poses a mutagenic threat to human genomes. Human cells have therefore evolved strategies to regulate L1 retrotransposition. The APOBEC3 (A3) gene family consists of seven enzymes that catalyze deamination of cytidine nucleotides to uridine nucleotides (C-to-U) in single-strand DNA substrates. Among these enzymes, APOBEC3A (A3A) is the most potent inhibitor of L1 retrotransposition in cultured cell assays. However, previous characterization of L1 retrotransposition events generated in the presence of A3A did not yield evidence of deamination. Thus, the molecular mechanism by which A3A inhibits L1 retrotransposition has remained enigmatic. Here, we have used in vitro and in vivo assays to demonstrate that A3A can inhibit L1 retrotransposition by deaminating transiently exposed single-strand DNA that arises during the process of L1 integration. These data provide a mechanistic explanation of how the A3A cytidine deaminase protein can inhibit L1 retrotransposition.
eLife digest
Transposable elements are often referred to as ‘jumping genes’ because they can move between different locations within a genome. These sequences of DNA are found in many organisms and can make up a significant proportion of the genetic material: almost 50% of the DNA in the case of the human genome.
Transposable elements are grouped by how they move to new locations in a genome. Some move by a cut-and-paste mechanism—whereby the transposable element DNA is removed from one location and inserted back at a new genomic location. Others, termed retrotransposons, move by a copy-and-paste mechanism: the DNA sequence is transcribed into an RNA intermediate, and then copied back into DNA before being inserted into a new location. Retrotransposons can accumulate to great numbers in genomes: and one retrotransposon, called LINE-1, is present at an estimated 500,000 copies in the human genome.
Although most copies of LINE-1 are inactive, the average human genome contains about 80–100 that are predicted to be able to ‘jump’ to new locations. Given that these retrotransposons could insert into, and disrupt, vital genes, it follows that our cells would have evolved ways to limit their movement. An enzyme named APOBEC3A is known to limit the movement of LINE-1 retrotransposons in cells. APOBEC3A can alter the letters, or bases, that make up the genetic code. This enzyme acts on single-strand DNA to change ‘C’ bases to ‘U’ bases, which could explain how APOBEC3A combats LINE-1. However, no evidence for such mutation of LINE-1 sequences by APOBEC3A had been found to date.
Now, Richardson et al. recreate the copying of LINE-1 RNA back into DNA in a test tube—and reveal that APOBEC3A can mutate single-strand LINE-1 DNA. Critically, as long as the RNA intermediate and DNA copy remain together, the LINE-1 DNA is protected. However, when LINE-1 inserts into a new location the temporarily exposed single strand of LINE-1 DNA becomes susceptible to mutation by APOBEC3A. Human cells can detect and destroy ‘U’ bases in DNA—and only by inhibiting this ability were Richardson et al. able to observe APOBEC3A mutations in new LINE-1 copies within the genomes of living cells.
Richardson et al. speculate that the activity of APOBEC3 enzymes must strike a balance between limiting the spread of retrotransposons and minimizing the mutation of the cell's own DNA. Future work could address important questions, such as: do APOBEC enzymes affect the ‘jumping’ of LINE-1 retrotransposons in human reproductive cells and the early embryo, where new LINE-1 insertions could be passed on to subsequent generations? Also, does a loss of APOBEC3 activity lead to new LINE-1 insertions in cancerous cells? And does this affect how tumors form and/or progress? Since APOBEC3 enzymes can cause mutations in cancers, they have been proposed as new targets for anti-cancer drugs; therefore, it is crucial to uncover any harmful effects of inhibiting APOBEC3 enzymes that might limit the effectiveness of such treatments.
PMCID: PMC4003774  PMID: 24843014
LINE-1; APOBEC3A; cytidine deaminase; retrotransposition; human
12.  The Effect of Multiple Evolutionary Selections on Synonymous Codon Usage of Genes in the Mycoplasma bovis Genome 
PLoS ONE  2014;9(10):e108949.
Mycoplasma bovis is a major pathogen causing arthritis, respiratory disease and mastitis in cattle. A better understanding of its genetic features and evolution might provide evidence of how it survives in host environments. In this study, multiple factors influencing synonymous codon usage patterns in M. bovis (three strains’ genomes) were analyzed. The overall nucleotide content of genes in the M. bovis genome is AT-rich. Although the G and C contents at the third codon position of genes in the leading strand differ from those in the lagging strand (p<0.05), the 59 synonymous codon usage patterns of genes in the leading strand are highly similar to those in the lagging strand. The over-represented codons and the under-represented codons were identified. A comparison of the synonymous codon usage pattern of M. bovis and cattle (susceptible host) indicated the independent formation of synonymous codon usage of M. bovis. Principal component analysis revealed that (i) strand-specific mutational bias fails to affect the synonymous codon usage pattern in the leading and lagging strands, (ii) mutation pressure from nucleotide content plays a role in shaping the overall codon usage, and (iii) the major trend of synonymous codon usage has a significant correlation with the gene expression level that is estimated by the codon adaptation index. The plot of the effective number of codons against the G+C content at the third codon position also reveals that mutation pressure undoubtedly contributes to the synonymous codon usage pattern of M. bovis. Additionally, the formation of the overall codon usage is determined by certain evolutionary selections for gene function classification (30S protein, 50S protein, transposase, membrane protein, and lipoprotein) and translation elongation region of genes in M. bovis.
The information could be helpful in further investigations of evolutionary mechanisms of the Mycoplasma family and heterologous expression of its functionally important proteins.
PMCID: PMC4211681  PMID: 25350396
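The plot of the effective number of codons (ENC) against third-position G+C content mentioned in the abstract above is usually read against a reference curve for codon usage shaped by G+C composition alone. The sketch below computes a simple GC3 fraction and that expected ENC value using Wright's (1990) approximation; whether the authors used exactly this curve, or restricted GC3 to synonymous sites, is not stated in the abstract, so both are assumptions, and the function names are hypothetical.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C among third codon positions of an in-frame sequence.

    Simplified helper: all third positions are counted, including those of
    Met, Trp and stop codons, which stricter GC3s definitions exclude.
    """
    third = [b for b in cds.upper()[2::3] if b in "ACGT"]
    if not third:
        raise ValueError("sequence has no third codon positions")
    return sum(b in "GC" for b in third) / len(third)


def expected_enc(gc3_fraction: float) -> float:
    """Expected ENC when codon choice depends only on GC3 (Wright 1990).

    ENC = 2 + s + 29 / (s^2 + (1 - s)^2), with s the GC3 fraction.
    """
    s = gc3_fraction
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)
```

Genes whose observed ENC falls well below `expected_enc(gc3(cds))` are commonly interpreted as being influenced by factors other than G+C mutation pressure; computing the observed ENC itself is not shown here.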
13.  Codon Usages of Genes on Chromosome, and Surprisingly, Genes in Plasmid are Primarily Affected by Strand-specific Mutational Biases in Lawsonia intracellularis 
In this study, the factors driving genome-wide patterns of codon usages in Lawsonia intracellularis genome are determined. For genes on the chromosome of the bacterium, it is found that the most important source of variation results from strand-specific mutational biases. A lesser trend of variation is attributable to genes that are presumed to be horizontally transferred. These putative alien genes are unusually GC richer than the other genes, whereas horizontally transferred genes have been observed to be AT rich in bacteria with medium and relatively low G + C contents. Hydropathy of encoded protein and expression level are also found to influence codon usage. Therefore, codon usage in L. intracellularis chromosome is the result of a complex balance among the different mutational and selectional factors. When analyzing genes in the largest plasmid, for the first time it is found that the strand-specific mutational biases are responsible for the primary variation of codon usages in plasmid. Genes, particularly highly expressed genes of this plasmid, are mainly located on the leading strands and this is supposed to be the effect exerted by replicational–transcriptional selection. These facts suggest that this plasmid adopts a similar mechanism of replication as the chromosome in L. intracellularis. Common characters among the 10 bacteria in whose genomes the strand-specific mutational biases are the primary source of variation of codon usage are also investigated. For example, it is found that genes dnaT and fis that are involved in DNA replication initiation and re-initiation pathways are absent in all of the 10 bacteria.
PMCID: PMC2671203  PMID: 19221094
Lawsonia intracellularis; codon usage; strand-specific mutational bias; plasmid; replication mechanism
14.  Atypical AT Skew in Firmicute Genomes Results from Selection and Not from Mutation 
PLoS Genetics  2011;7(9):e1002283.
The second parity rule states that, if there is no bias in mutation or selection, then within each strand of DNA complementary bases are present at approximately equal frequencies. In bacteria, however, there is commonly an excess of G (over C) and, to a lesser extent, T (over A) in the replicatory leading strand. The low G+C Firmicutes, such as Staphylococcus aureus, are unusual in displaying an excess of A over T on the leading strand. As mutation has been established as a major force in the generation of such skews across various bacterial taxa, this anomaly has been assumed to reflect unusual mutation biases in Firmicute genomes. Here we show that this is not the case and that mutation bias does not explain the atypical AT skew seen in S. aureus. First, recently arisen intergenic SNPs predict the classical replication-derived equilibrium enrichment of T relative to A, contrary to what is observed. Second, sites predicted to be under weak purifying selection display only weak AT skew. Third, AT skew is primarily associated with largely non-synonymous first and second codon sites and is seen with respect to their sense direction, not which replicating strand they lie on. The atypical AT skew we show to be a consequence of the strong bias for genes to be co-oriented with the replicating fork, coupled with the selective avoidance of both stop codons and costly amino acids, which tend to have T-rich codons. That intergenic sequence has more A than T, while at mutational equilibrium a preponderance of T is expected, points to a possible further unresolved selective source of skew.
Author Summary
When considering a single strand of DNA, it is not necessarily the case that the frequency of each base should equal its complementary partner, such that A = T and G = C. For the leading strand, it is typically the case that Gs are more common than Cs, and Ts more common than As. This bias is widely thought to arise due to different mutational biases during replication. The Firmicutes exhibit an atypical preference for A over T on the leading strand, and here we show that selection, rather than mutation, can explain this exception. For those bases within coding regions, selection acts to inflate the frequency of A over T in order to avoid stop codons and to use metabolically cheap amino acids. Because genes are not orientated randomly, this manifests as an overall enrichment of A on the leading strand. Furthermore, a direct examination of mutational patterns is inconsistent with the observed enrichment of As. Curiously, our data also point to an unresolved source of selection on synonymous and intergenic sites, which are widely assumed to be neutral.
PMCID: PMC3174206  PMID: 21935355
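The skews discussed in this abstract are usually measured with the standard normalized differences, GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), computed on one strand. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `strand_skews` and the toy sequence are made up for illustration.

```python
from collections import Counter

def strand_skews(seq):
    """Return (GC skew, AT skew) for one strand, read 5'->3'.

    GC skew = (G - C) / (G + C); AT skew = (A - T) / (A + T).
    Both are 0 when complementary bases are equally frequent,
    which is what the second parity rule predicts.
    """
    counts = Counter(seq.upper())
    g, c, a, t = (counts[b] for b in "GCAT")
    gc = (g - c) / (g + c) if g + c else 0.0
    at = (a - t) / (a + t) if a + t else 0.0
    return gc, at

# Toy sequence (hypothetical): 4 G, 2 C, 1 A, 3 T.
gc_skew, at_skew = strand_skews("GGGGCCATTT")
print(gc_skew, at_skew)
```

On this toy strand the GC skew is positive and the AT skew negative, the pattern the abstract describes as typical of a bacterial leading strand; the Firmicute anomaly corresponds to a positive AT skew instead.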
15.  Overlapping of Genes in the Human Genome 
Overlapping genes are relatively common in DNA and RNA viruses. There are several examples in bacterial and eukaryotic genomes, but, in general, overlapping genes are quite rare in organisms other than viruses. There have been a few reports of overlapping genes in mammalian genomes. The present study identified all of the overlapping loci and overlapping exons in every chromosome of the human genome using a public database. The total number of overlapping loci on the same and opposite strands was 949 and 743, respectively. Similarly, in every chromosome except chromosome 5, the number of instances in which two loci were located on the same strand was similar to the number in which two loci were located on opposite strands. The number of 2 exons located on the same strand was higher than that for 2 exons located on opposite strands, indicating the presence of many comprehensive-type overlaps. The mean percentage of overlapping exons on opposite strands in each chromosome was 3.3%, suggesting that parts of the nucleotide sequences of 26,501 exons are used to produce 2 transcribed products from each strand. The ratio of the number of overlapping regions to chromosomal length revealed that, on chromosomes 22, 17 and 19, ratios were high for both types of 2 loci, with exons located on the same and opposite strands. Ratios were low on chromosomes Y, 13 and 18. These results show that all overlapping types are distributed throughout the human genome, but that distributions differ for each chromosome.
PMCID: PMC3614620  PMID: 23675016
overlapping genes; human genome; locus; exon; chromosome
16.  The Genomic Pattern of tDNA Operon Expression in E. coli 
PLoS Computational Biology  2005;1(1):e12.
In fast-growing microorganisms, a tRNA concentration profile enriched in major isoacceptors selects for the biased usage of cognate codons. This optimizes translational rate for the least mass invested in the translational apparatus. Such translational streamlining is thought to be growth-regulated, but its genetic basis is poorly understood. First, we found in reanalysis of the E. coli tRNA profile that the degree to which it is translationally streamlined is nearly invariant with growth rate. Then, using least squares multiple regression, we partitioned tRNA isoacceptor pools to predicted tDNA operons from the E. coli K12 genome. Co-expression of tDNAs in operons explains the tRNA profile significantly better than tDNA gene dosage alone. Also, operon expression increases significantly with proximity to the origin of replication, oriC, at all growth rates. Genome location explains about 15% of expression variation in a form, at a given growth rate, that is consistent with replication-dependent gene concentration effects. Yet the change in the tRNA profile with growth rate is less than would be expected from such effects. We estimated per-copy expression rates for all tDNA operons that were consistent with independent estimates for rDNA operons. We also found that tDNA operon location, and the location dependence of expression, were significantly different in the leading and lagging strands. The operonic organization and genomic location of tDNA operons are significant factors influencing their expression. Nonrandom patterns of location and strandedness shown by tDNA operons in E. coli suggest that their genomic architecture may be under selection to satisfy physiological demand for tRNA expression at high growth rates.
The concentrations of tRNAs are co-adapted to codon usage frequencies in the transcriptomes of E. coli and other diverse organisms. But how are tRNA concentrations determined? Here, the researchers analyzed the E. coli tRNA concentration profile in its genomic context, using clustering and regression methods to partition tRNA concentration data to tDNA operons that were defined semi-automatically. They found that co-expression in operons explains the tRNA profile much better than tDNA gene dosage alone. Furthermore, they could significantly explain the total expression from tDNA operons by their distance from the genomic origin of replication. Per-copy transcription initiation rates from tDNA operons were also estimated. Although there is some evidence for replication-dependent effects on tDNA operon expression, this cannot explain how constant the tRNA profile is with growth rate. As a consequence, tDNA promoters are predicted to compensate for the location of their operons. Finally, the researchers found pronounced asymmetries between the leading and lagging genomic strands in the locations of tDNA operons, and on the effect of location on their expression. These nonrandom patterns suggest that the genomic location and strandedness of tDNA operons may be under some selection in E. coli to satisfy physiological demand for tRNAs at high growth rates.
PMCID: PMC1183518  PMID: 16103901
17.  Bacterial phylogenetic tree construction based on genomic translation stop signals 
The efficiencies of the stop codons TAA, TAG, and TGA in protein synthesis termination are not the same. These variations could allow many genes to be regulated. There are many similar nucleotide trimers found on the second and third reading-frames of a gene. They are called premature stop codons (PSC). Like stop codons, the PSC in bacterial genomes are also highly biased in terms of their quantities and qualities on the genes. Phylogenetically related species often share a similar PSC profile. We want to know whether the selective forces that influence the stop codons and the PSC usage biases in a genome are related. We also wish to know how strongly these trimers in a genome are related to the natural history of the bacterium. Knowing these relations may provide better knowledge in the phylogeny of bacteria.
A 16SrRNA-alignment tree of 19 well-studied α-, β- and γ-Proteobacteria Type species is used as standard reference for bacterial phylogeny. The genomes of sixty-one bacteria, belonging to the α-, β- and γ-Proteobacteria subphyla, are used for this study. The stop codons and PSC are collectively termed “Translation Stop Signals” (TSS). A gene is represented by nine scalars corresponding to the counts of TAA, TAG, and TGA on each of the three reading-frames of that gene. “Translation Stop Signals Ratio” (TSSR) is the ratio between the TSS counts. Four types of TSSR are investigated. The TSSR-1, TSSR-2 and TSSR-3 are each a 3-scalar series corresponding respectively to the average ratio of TAA: TAG: TGA on the first, second, and third reading-frames of all genes in a genome. The Genomic-TSSR is a 9-scalar series representing the ratio of distribution of all TSS on the three reading-frames of all genes in a genome. Results show that bacteria grouped by their similarities based on TSSR-1, TSSR-2, or TSSR-3 values could only partially resolve the phylogeny of the species. However, grouping bacteria based on their Genomic-TSSR values resulted in clusters of bacteria identical to those bacterial clusters of the reference tree. Unlike the 16SrRNA method, the Genomic-TSSR tree is also able to separate closely related species/strains at high resolution. Species and strains separated by the Genomic-TSSR grouping method are often in good agreement with those classified by other taxonomic methods. Correspondence analysis of individual genes shows that most genes in a bacterial genome share a similar TSSR value. However, within a chromosome, the Genic-TSSR values of genes near the replication origin region (Ori) are more similar to each other than those genes near the terminus region (Ter).
The translation stop signals on the three reading-frames of the genes on a bacterial genome are interrelated, possibly due to frequent off-frame recombination facilitated by translational-associated recombination (TSR). However, TSR may not occur randomly in a bacterial chromosome. Genes near the Ori region are often highly expressed and a bacterium always maintains multiple copies of Ori. Frequent collisions between DNA polymerase and RNA polymerase would create many DNA strand-breaks on the genes, whereas DNA strand-break induced homologous recombination is more likely to take place between genes with similar sequence. Thus, localized recombination could explain why the TSSR values of genes near the Ori region are more similar to each other. The quantity and quality of these TSS in a genome strongly reflect the natural history of a bacterium. We propose that the Genomic-TSSR can be used as a subjective biomarker to represent the phyletic status of a bacterium.
PMCID: PMC3466146  PMID: 22651236
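The nine-scalar gene representation described in this abstract (counts of TAA, TAG and TGA in each of the three reading frames) can be sketched as follows. This is only an illustration under stated assumptions: the function name `stop_signal_counts`, the frame numbering (frame 1 starts at the first base of the gene) and the toy sequence are hypothetical, and the paper's exact ratio normalization is not reproduced here.

```python
STOP_TRIMERS = ("TAA", "TAG", "TGA")

def stop_signal_counts(gene):
    """Count TAA, TAG and TGA in each of the three reading frames.

    Returns a dict keyed by (frame, trimer) with frame in {1, 2, 3};
    frame 1 is assumed to start at the first base of the gene.
    """
    gene = gene.upper()
    counts = {(f, s): 0 for f in (1, 2, 3) for s in STOP_TRIMERS}
    for frame in (1, 2, 3):
        # Step through non-overlapping trimers of this frame.
        for i in range(frame - 1, len(gene) - 2, 3):
            trimer = gene[i:i + 3]
            if trimer in STOP_TRIMERS:
                counts[(frame, trimer)] += 1
    return counts

# Toy sequence, not a real gene.
counts = stop_signal_counts("ATGTAACTGATAG")
nine_scalars = [counts[(f, s)] for f in (1, 2, 3) for s in STOP_TRIMERS]
print(nine_scalars)
```

Summing these nine-element vectors over all genes of a genome and dividing by the total would give one plausible reading of a genome-level ratio, but that normalization is an assumption rather than something stated in the abstract.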
18.  Lagging-Strand, Early-Labelling, and Two-Dimensional Gel Assays Suggest Multiple Potential Initiation Sites in the Chinese Hamster Dihydrofolate Reductase Origin 
Molecular and Cellular Biology  1998;18(1):39-50.
There is general agreement that DNA synthesis in the single-copy and amplified dihydrofolate reductase (DHFR) loci of CHO cells initiates somewhere within the 55-kb spacer region between the DHFR and 2BE2121 genes. However, results of lagging-strand, early-labelling fragment hybridization (ELFH), and PCR-based nascent-strand abundance assays have been interpreted to suggest a very narrow zone of initiation centered at a single locus known as ori-β, while two-dimensional (2-D) gel analyses suggest that initiation can occur at any of a large number of potential sites scattered throughout the intergenic region. The results of a leading-strand assay and two intrinsic labelling techniques are compatible with a broad initiation zone in which ori-β and a second locus (ori-γ) are somewhat preferred. To determine how these differing views are shaped by differences in experimental manipulations unrelated to the biology itself, we have applied the lagging-strand, ELFH, neutral-neutral, and/or neutral-alkaline 2-D gel assays to CHOC 400 cell populations synchronized and manipulated in the same way. In our experiments, the lagging-strand assay failed to identify a template strand switch at ori-β; rather, we observed a gradual, undulating change in hybridization bias throughout the intergenic spacer, with hybridization to the two templates being approximately equal near a centered matrix attachment region. In the ELFH assay, all of the fragments in the 55-kb intergenic region were labelled in the first few minutes of the S phase, with the regions encompassing ori-β and ori-γ being somewhat preferred. Under the same conditions, neutral-neutral and neutral-alkaline 2-D gel analyses detected initiation sites at multiple locations in the intergenic spacer. 
Thus, the results of all existing replicon-mapping methods that have been applied to the amplified DHFR locus in CHOC 400 cells are consistent with a model in which two somewhat preferred subzones reside in a larger zone of multiple potential initiation sites in the intergenic region.
PMCID: PMC121447  PMID: 9418851
19.  ssb Gene Duplication Restores the Viability of ΔholC and ΔholD Escherichia coli Mutants 
PLoS Genetics  2014;10(10):e1004719.
The HolC-HolD (χψ) complex is part of the DNA polymerase III holoenzyme (Pol III HE) clamp-loader. Several lines of evidence indicate that both leading- and lagging-strand synthesis are affected in the absence of this complex. The Escherichia coli ΔholD mutant grows poorly and suppressor mutations that restore growth appear spontaneously. Here we show that duplication of the ssb gene, encoding the single-stranded DNA binding protein (SSB), restores ΔholD mutant growth at all temperatures on both minimal and rich medium. RecFOR-dependent SOS induction, previously shown to occur in the ΔholD mutant, is unaffected by ssb gene duplication, suggesting that lagging-strand synthesis remains perturbed. The C-terminal SSB disordered tail, which interacts with several E. coli repair, recombination and replication proteins, must be intact in both copies of the gene in order to restore normal growth. This suggests that SSB-mediated ΔholD suppression involves interaction with one or more partner proteins. ssb gene duplication also suppresses ΔholC single mutant and ΔholC ΔholD double mutant growth defects, indicating that it bypasses the need for the entire χψ complex. We propose that doubling the amount of SSB stabilizes HolCD-less Pol III HE DNA binding through interactions between SSB and a replisome component, possibly DnaE. Given that SSB binds DNA in vitro via different binding modes depending on experimental conditions, including SSB protein concentration and SSB interactions with partner proteins, our results support the idea that controlling the balance between SSB binding modes is critical for DNA Pol III HE stability in vivo, with important implications for DNA replication and genome stability.
Author Summary
Both replication polymerases and single-stranded DNA binding proteins (SSB, which associate with single-stranded DNA exposed transiently during replication) are ubiquitous and show high levels of functional and structural conservation across all species. Among the nine different polypeptides that compose the bacterial replicative polymerase, the HolC-HolD (χψ) complex interacts with SSB, and is crucial for normal growth in the model bacterium Escherichia coli. Interestingly, many bacterial species lack this complex, where its function is presumably carried out by other polymerase components. With the aim of better understanding HolC-HolD (χψ) complex function in E. coli, we isolated growth defect suppressor mutations of the holD mutant. We found that ssb gene duplication, and the consequent doubling of SSB protein expression, renders the entire χψ complex dispensable for growth. We also show that growth-defect suppression requires the presence of the SSB C-terminal amino acids in both ssb gene copies. This C-terminal tail promotes interaction between SSB and its partner proteins. Thus, our results indicate that in vivo SSB concentration plays a key role in maintaining polymerase stability and replication efficiency, in a reaction that involves SSB interactions with protein partner(s) other than χψ.
PMCID: PMC4199511  PMID: 25329071
20.  Increased and Imbalanced dNTP Pools Symmetrically Promote Both Leading and Lagging Strand Replication Infidelity 
PLoS Genetics  2014;10(12):e1004846.
The fidelity of DNA replication requires an appropriate balance of dNTPs, yet the nascent leading and lagging strands of the nuclear genome are primarily synthesized by replicases that differ in subunit composition, protein partnerships and biochemical properties, including fidelity. These facts pose the question of whether imbalanced dNTP pools differentially influence leading and lagging strand replication fidelity. Here we test this possibility by examining strand-specific replication infidelity driven by a mutation in yeast ribonucleotide reductase, rnr1-Y285A, that leads to elevated dTTP and dCTP concentrations. The results for the CAN1 mutational reporter gene present in opposite orientations in the genome reveal that the rates, and surprisingly even the sequence contexts, of replication errors are remarkably similar for leading and lagging strand synthesis. Moreover, while many mismatches driven by the dNTP pool imbalance are efficiently corrected by mismatch repair, others are repaired less efficiently, especially those in sequence contexts suggesting reduced proofreading due to increased mismatch extension driven by the high dTTP and dCTP concentrations. Thus the two DNA strands of the nuclear genome are at similar risk of mutations resulting from this dNTP pool imbalance, and this risk is not completely suppressed even when both major replication error correction mechanisms are genetically intact.
Author Summary
The building blocks of DNA, dNTPs, are vital to life, and thus their production is carefully controlled within each cell. Under certain conditions, such as cancer, infection, or drugs, the overall dNTP level or dNTP balance can change. Using yeast genetics we manipulated the dNTP pool balance in unicellular baker's yeast and analysed the effects upon fidelity of DNA replication. We also disrupted mismatch repair, an internal safety system that corrects replication errors and is mutated in many cancers. By sequencing DNA from yeast cells with these alterations we gain insights into the mechanisms of mutation formation that contribute to genome instability. We find that the leading and lagging strand replication fidelity is affected similarly by the dNTP pool imbalance and that the mismatch repair machinery corrects replication errors driven by a dNTP pool imbalance with highly variable efficiencies.
PMCID: PMC4256292  PMID: 25474551
21.  Global features of sequences of bacterial chromosomes, plasmids and phages revealed by analysis of oligonucleotide usage patterns 
BMC Bioinformatics  2004;5:90.
Oligonucleotide frequencies were shown to be conserved signatures for bacterial genomes; however, the underlying constraints have not yet been resolved in detail. In this paper we analyzed oligonucleotide usage (OU) biases in a comprehensive collection of 155 completely sequenced bacterial chromosomes, 316 plasmids and 104 phages.
Two global features were analyzed: pattern skew (PS) and variance of OU deviations normalized by mononucleotide content of the sequence (OUV). OUV reflects the strength of OU biases and taxonomic signals. PS denotes asymmetry of OU in direct and reverse DNA strands. A trend towards minimal PS was observed for almost all complete sequences of bacterial chromosomes and plasmids, however, PS was substantially higher in separate genomic loci and several types of plasmids and phages characterized by long stretches of non-coding DNA and/or asymmetric gene distribution on the two DNA strands. Five of the 155 bacterial chromosomes have anomalously high PS, of which the chromosomes of Xylella fastidiosa 9a5c and Prochlorococcus marinus MIT9313 exhibit extreme PS values suggesting an intermediate unstable state of these two genomes.
Strand symmetry as indicated by minimal PS is a universally conserved feature of complete bacterial genomes that results from the matching mutual compensation of local OU biases on both replichors while OUV is more a taxon specific feature. Local events such as inversions or the incorporation of genome islands are balanced by global changes in genome organization to minimize PS that may represent one of the leading evolutionary forces driving bacterial genome diversification.
PMCID: PMC487896  PMID: 15239845
22.  Coordinated Leading and Lagging Strand DNA Synthesis by Using the Herpes Simplex Virus 1 Replication Complex and Minicircle DNA Templates
Journal of Virology  2010;85(2):957-967.
The origin-specific replication of the herpes simplex virus 1 genome requires seven proteins: the helicase-primase (UL5-UL8-UL52), the DNA polymerase (UL30-UL42), the single-strand DNA binding protein (ICP8), and the origin-binding protein (UL9). We reconstituted these proteins, excluding UL9, on synthetic minicircular DNA templates and monitored leading and lagging strand DNA synthesis using the strand-specific incorporation of dTMP and dAMP. Critical features of the assays that led to efficient leading and lagging strand synthesis included high helicase-primase concentrations and a lagging strand template whose sequence resembled that of the viral DNA. Depending on the nature of the minicircle template, the replication complex synthesized leading and lagging strand products at molar ratios varying between 1:1 and 3:1. Lagging strand products (∼0.2 to 0.6 kb) were significantly shorter than leading strand products (∼2 to 10 kb), and conditions that stimulated primer synthesis led to shorter lagging strand products. ICP8 was not essential; however, its presence stimulated DNA synthesis and increased the length of both leading and lagging strand products. Curiously, human DNA polymerase α (p70-p180 or p49-p58-p70-p180), which improves the utilization of RNA primers synthesized by herpesvirus primase on linear DNA templates, had no effect on the replication of the minicircles. The lack of stimulation by polymerase α suggests the existence of a macromolecular assembly that enhances the utilization of RNA primers and may functionally couple leading and lagging strand synthesis.
Evidence for functional coupling is further provided by our observations that (i) leading and lagging strand synthesis produce equal amounts of DNA, (ii) leading strand synthesis proceeds faster under conditions that disable primer synthesis on the lagging strand, and (iii) conditions that accelerate helicase-catalyzed DNA unwinding stimulate decoupled leading strand synthesis but not coordinated leading and lagging strand synthesis.
PMCID: PMC3020029  PMID: 21068232
23.  Separating the effects of mutation and selection in producing DNA skew in bacterial chromosomes 
BMC Genomics  2007;8:369.
Many bacterial chromosomes display nucleotide asymmetry, or skew, between the leading and lagging strands of replication. Mutational differences between these strands result in an overall pattern of skew that is centered about the origin of replication. Such a pattern could also arise from selection coupled with a bias for genes coded on the leading strand. The relative contributions of selection and mutation in producing compositional skew are largely unknown.
We describe a model to quantify the contribution of mutational differences between the leading and lagging strands in producing replication-induced skew. When the origin and terminus of replication are known, the model can be used to estimate the relative accumulation of G over C and of A over T on the leading strand due to replication effects in a chromosome with bidirectional replication arms. The model may also be implemented in a maximum likelihood framework to estimate the locations of origin and terminus. We find that our estimations for the origin and terminus agree very well with the location of genes that are thought to be associated with the replication origin. This indicates that our model provides an accurate, objective method of determining the replication arms and also provides support for the hypothesis that these genes represent an ancestral cluster of origin-associated genes.
The model has several advantages over other methods of analyzing genome skew. First, it quantifies the role of mutation in generating skew so that its effect on composition, for example codon bias, can be assessed. Second, it provides an objective method for locating origin and terminus, one that is based on chromosome-wide accumulation of leading vs lagging strand nucleotide differences. Finally, the model has the potential to be utilized in a maximum likelihood framework in order to analyze the effect of chromosome rearrangements on nucleotide composition.
PMCID: PMC2099444  PMID: 17935620
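For orientation, a much simpler heuristic than the model in this abstract is the cumulative GC skew walk, whose minimum and maximum are commonly taken as rough estimates of the replication origin and terminus. The sketch below illustrates that swapped-in heuristic only; it is not the authors' maximum likelihood model, it treats the sequence as linear rather than circular, and the function names and toy sequence are hypothetical.

```python
def cumulative_gc_skew(seq):
    """Running total of +1 for each G and -1 for each C along seq."""
    total = 0
    walk = []
    for base in seq.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        walk.append(total)
    return walk

def skew_extremes(seq):
    """Return (index of minimum, index of maximum) of the skew walk.

    Expects a non-empty sequence; ties resolve to the first position.
    """
    walk = cumulative_gc_skew(seq)
    lo = min(range(len(walk)), key=walk.__getitem__)
    hi = max(range(len(walk)), key=walk.__getitem__)
    return lo, hi

# Toy sequence: a C-rich stretch followed by a G-rich stretch.
toy = "CCCCAGGGGGGTCC"
origin_guess, terminus_guess = skew_extremes(toy)
print(cumulative_gc_skew(toy), origin_guess, terminus_guess)
```

On real chromosomes the walk is usually computed over windows rather than single bases, and the heuristic gives no estimate of how much of the skew is mutational, which is the quantity the abstract's model is designed to isolate.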
24.  Why Are Genes Encoded on the Lagging Strand of the Bacterial Genome? 
Genome Biology and Evolution  2013;5(12):2436-2439.
Genomic DNA is used as the template for both replication and transcription, whose machineries may collide and result in mutagenesis, among other damages. Because head-on collisions are more deleterious than codirectional collisions, genes should be preferentially encoded on the leading strand to avoid head-on collisions, as is observed in most bacterial genomes examined. However, why are there still lagging strand encoded genes? Paul et al. recently proposed that these genes take advantage of the increased mutagenesis resulting from head-on collisions and are thus adaptively encoded on the lagging strand. We show that the evidence they provided is invalid and that the existence of lagging strand encoded genes is explainable by a balance between deleterious mutations that bring genes from the leading to the lagging strand and purifying selection purging such mutants. Therefore, the adaptive hypothesis is neither theoretically needed nor empirically supported.
PMCID: PMC3879979  PMID: 24273314
evolution; mutation-selection balance; convergence
25.  Condensin II Subunit dCAP-D3 Restricts Retrotransposon Mobilization in Drosophila Somatic Cells 
PLoS Genetics  2013;9(10):e1003879.
Retrotransposon sequences are positioned throughout the genome of almost every eukaryote that has been sequenced. As mobilization of these elements can have detrimental effects on the transcriptional regulation and stability of an organism's genome, most organisms have evolved mechanisms to repress their movement. Here, we identify a novel role for the Drosophila melanogaster Condensin II subunit, dCAP-D3 in preventing the mobilization of retrotransposons located in somatic cell euchromatin. dCAP-D3 regulates transcription of euchromatic gene clusters which contain or are proximal to retrotransposon sequence. ChIP experiments demonstrate that dCAP-D3 binds to these loci and is important for maintaining a repressed chromatin structure within the boundaries of the retrotransposon and for repressing retrotransposon transcription. We show that dCAP-D3 prevents accumulation of double stranded DNA breaks within retrotransposon sequence, and decreased dCAP-D3 levels lead to a precise loss of retrotransposon sequence at some dCAP-D3 regulated gene clusters and a gain of sequence elsewhere in the genome. Homologous chromosomes exhibit high levels of pairing in Drosophila somatic cells, and our FISH analyses demonstrate that retrotransposon-containing euchromatic loci are regions which are actually less paired than euchromatic regions devoid of retrotransposon sequences. Decreased dCAP-D3 expression increases pairing of homologous retrotransposon-containing loci in tissue culture cells. We propose that the combined effects of dCAP-D3 deficiency on double strand break levels, chromatin structure, transcription and pairing at retrotransposon-containing loci may lead to 1) higher levels of homologous recombination between repeats flanking retrotransposons in dCAP-D3 deficient cells and 2) increased retrotransposition.
These findings identify a novel role for the anti-pairing activities of dCAP-D3/Condensin II and uncover a new way in which dCAP-D3/Condensin II influences local chromatin structure to help maintain genome stability.
Author Summary
Condensins are conserved complexes that are well known for their roles in promoting the efficient condensation of chromosomes during early mitosis. Previously, we have shown that the Drosophila Condensin II subunit, dCAP-D3, also functions to regulate transcription in somatic cells during the later stages of development. A significant number of dCAP-D3 regulated genes were found to be positioned very close to one another in clusters. In this study, we report that some of the most strongly regulated dCAP-D3 gene clusters are positioned near retrotransposons. Unexpectedly, we find that decreased dCAP-D3 expression results in a precise loss of retrotransposon sequence at these loci. Additionally, dCAP-D3 knockdown causes increased levels of double strand breaks within retrotransposon sequence, an opening of the chromatin in the region, increased retrotransposon transcription and a very significant increase in homologous pairing at the locus. Taken together, these results suggest that dCAP-D3/Condensin II functions to prevent recombination of retrotransposons between homologous chromosomes and possibly retrotransposition as well. This report identifies a novel function for Condensin II that may contribute to its role in genome organization.
PMCID: PMC3814330  PMID: 24204294
