Related Articles

1.  Live Hot, Die Young: Transmission Distortion in Recombination Hotspots 
PLoS Genetics  2007;3(3):e35.
There is strong evidence that hotspots of meiotic recombination in humans are transient features of the genome. For example, hotspot locations are not shared between human and chimpanzee. Biased gene conversion in favor of alleles that locally disrupt hotspots is a possible explanation of the short lifespan of hotspots. We investigate the implications of such a bias on human hotspots and their evolution. Our results demonstrate that gene conversion bias is a sufficiently strong force to produce the observed lack of sharing of intense hotspots between species, although sharing may be much more common for weaker hotspots. We investigate models of how hotspots arise, and find that only models in which hotspot alleles do not initially experience drive are consistent with observations of rather hot hotspots in the human genome. Mutations acting against drive cannot successfully introduce such hotspots into the population, even if there is direct selection for higher recombination rates, such as to ensure correct segregation during meiosis. We explore the impact of hotspot alleles on patterns of haplotype variation, and show that such alleles mask their presence in population genetic data, making them difficult to detect.
Author Summary
Recombination is a fundamental component of mammalian meiosis, required to help ensure that daughter cells receive the correct complement of chromosomes. This is highly important, as incorrect segregation causes miscarriage and disorders such as Down syndrome. In addition to its mechanistic function, recombination is also crucial in generating the genetic diversity on which natural selection acts. In humans and many other species, recombination events cluster into narrow hotspots within the genome. Given the vital role recombination plays in meiosis, we might expect that the positions of these hotspots would be tightly conserved over evolutionary time. However, there is now considerable evidence to the contrary; hotspots are not frozen in place, but instead evolve rapidly. For example, humans and chimpanzees do not share hotspot locations, despite their genomic sequences being almost 99% identical. The explanation for this may be, remarkably, that hotspots are the architects of their own destruction. The biological mechanism of recombination dooms them to rapid extinction by favoring the spread of hotspot-disrupting mutations. By mathematically modeling human hotspot evolution, we find that this mechanism can account for fast hotspot turnover, and in fact makes it very difficult for active hotspots to arise at all. Given that active hotspots do exist in our genome, newly arising hotspots must somehow be able to bypass their self-destructive tendency. Despite their importance, it is difficult to identify mutations that disrupt hotspots, as they hide their tracks in genetic data.
PMCID: PMC1817654  PMID: 17352536
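The drive argument in entry 1 can be illustrated with a deterministic one-locus recursion. This is a minimal sketch under stated assumptions, not the model used in the paper: random mating, no drift, no direct selection, and heterozygotes transmitting the hotspot-disrupting allele with probability 0.5 + d. The function names and the parameter d are hypothetical and chosen for illustration only.

```python
def next_frequency(p: float, d: float) -> float:
    """Frequency of the hotspot-disrupting allele after one generation.

    Assumes random mating: homozygotes (frequency p**2) always transmit the
    allele, heterozygotes (frequency 2*p*(1-p)) transmit it with probability
    0.5 + d. This gives p' = p**2 + 2*p*(1-p)*(0.5 + d) = p + 2*d*p*(1-p).
    """
    return p + 2.0 * d * p * (1.0 - p)


def generations_to_reach(p0: float, d: float, target: float,
                         max_generations: int = 10_000_000):
    """Count generations until the allele frequency reaches `target`.

    Returns None if the target is not reached within `max_generations`
    (for example when d == 0 and there is no transmission bias).
    """
    p = p0
    for generation in range(max_generations):
        if p >= target:
            return generation
        p = next_frequency(p, d)
    return None
```

With d = 0 the frequency never changes, while any positive d pushes the disrupting allele toward fixation. Because the recursion ignores drift, it says nothing about how likely a new mutation is to survive while rare.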
2.  Genomic Hypomethylation in the Human Germline Associates with Selective Structural Mutability in the Human Genome 
PLoS Genetics  2012;8(5):e1002692.
The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR–mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.
Author Summary
The human genome contains many loci with high incidence of structural mutations, including insertions and deletions of chromosomal segments. This excessive mutability has accelerated evolution and contributed to human disease but has yet to be explained. Segments of DNA repeated in low-copy numbers (LCRs) have been previously implicated in promoting structural mutability in specific disease-associated loci. Lack of methylation (hypomethylation) of genomic DNA has been previously associated with high structural mutability in gibbons and in human cancer cells, but the association with structural mutability in the human germline has not been explored prior to this study. Our analyses confirm the role of LCRs in promoting structural mutability on the genome scale but also reveal a surprisingly strong association of genomic instability with hypomethylation. Specifically, evolutionary analyses reveal that methylation deserts, the ∼1% fraction of the human genome with the lowest methylation in human sperm, harbor a tenfold higher number of structural mutations than genome-wide average. Moreover, the structural mutations in individuals diagnosed with schizophrenia, bipolar disorder, developmental delay, and autism are significantly more concentrated within hypomethylated regions. Our findings suggest a new connection between methylation of genomic DNA, selective structural mutability, evolution, and human disease.
PMCID: PMC3355074  PMID: 22615578
3.  Slow repair of lipid peroxidation-induced DNA damage at p53 mutation hotspots in human cells caused by low turnover of a DNA glycosylase 
Nucleic Acids Research  2014;42(14):9033-9046.
Repair of oxidative stress- and inflammation-induced DNA lesions by the base excision repair (BER) pathway prevents mutation, a form of genomic instability which is often observed in cancer as ‘mutation hotspots’. This suggests that some sequences have inherent mutability, possibly due to sequence-related differences in repair. This study has explored intrinsic mutability as a consequence of sequence-specific repair of the lipid peroxidation-induced DNA adduct 1,N6-ethenoadenine (εA). For the first time, we observed significant delay in repair of εA at mutation hotspots in the tumor suppressor gene p53 compared to non-hotspots in live human hepatocytes and endothelial cells using an in-cell real time PCR-based method. In-cell and in vitro mechanism studies revealed that this delay in repair was due to inefficient turnover of N-methylpurine-DNA glycosylase (MPG), which initiates BER of εA. We determined that the product dissociation rate of MPG at the hotspot codons was ≈5–12-fold lower than the non-hotspots, suggesting a previously unknown mechanism for slower repair at mutation hotspots and implicating sequence-related variability of DNA repair efficiency to be responsible for mutation hotspot signatures.
PMCID: PMC4132702  PMID: 25081213
4.  The Impact of Recombination on Nucleotide Substitutions in the Human Genome 
PLoS Genetics  2008;4(5):e1000071.
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.
Author Summary
Mammalian genomes show a very strong heterogeneity of base composition along chromosomes (the so-called isochores). The functional significance of these peculiar genomic landscapes is highly debated: do isochores confer some selective advantage, or are they simply the by-product of neutral evolutionary processes? To resolve this issue, we analyzed the pattern of substitution in the human genome by comparison with chimpanzee and macaque. We show that the evolution of base composition (GC-content) is essentially determined by the rate of recombination. This effect appears to be much stronger in male than in female germline, which rules out selective explanations for the evolution of isochores. We show that this impact of recombination is most probably a consequence of the process of biased gene conversion (BGC). This neutral process mimics the action of selection and can induce strong substitution hotspots within recombination hotspots, sometimes leading to the fixation of deleterious mutations. BGC appears to be one of the major factors driving genome evolution. It is therefore essential to take this process into account if we want to be able to interpret genome sequences.
PMCID: PMC2346554  PMID: 18464896
5.  Protein Domain-Level Landscape of Cancer-Type-Specific Somatic Mutations 
PLoS Computational Biology  2015;11(3):e1004147.
Identifying driver mutations and their functional consequences is critical to our understanding of cancer. Towards this goal, and because domains are the functional units of a protein, we explored the protein domain-level landscape of cancer-type-specific somatic mutations. Specifically, we systematically examined tumor genomes from 21 cancer types to identify domains with high mutational density in specific tissues, the positions of mutational hotspots within these domains, and the functional and structural context where possible. While hotspots corresponding to specific gain-of-function mutations are expected for oncoproteins, we found that tumor suppressor proteins also exhibit strong biases toward being mutated in particular domains. Within domains, however, we observed the expected patterns of mutation, with recurrently mutated positions for oncogenes and evenly distributed mutations for tumor suppressors. For example, we identified both known and new endometrial cancer hotspots in the tyrosine kinase domain of the FGFR2 protein, one of which is also a hotspot in breast cancer, and found two new hotspots in the Immunoglobulin I-set domain in colon cancer. Thus, to prioritize cancer mutations for further functional studies aimed at more precise cancer treatments, we have systematically correlated mutations and cancer types at the protein domain level.
Author Summary
Extensive tumor genome sequencing has provided raw material to understand mutational processes and identify cancer-associated somatic variants. However, fundamental problems remain to: i) separate ‘driver’ from ‘passenger’ mutations, ii) further understand the functional mechanisms and consequences of driver mutations, and iii) identify the cancer types in which each driver mutation is relevant. Here we analyze whole-genome and exome tumor sequencing data from the perspective of protein domains—the basic structural and functional units of proteins. Exploring the cancer-type-specific landscape of domain mutations across 21 cancer types, we identify both cancer-type-specific mutated domains and mutational hotspots. Frequently-mutated domains were identified for oncoproteins for which the ‘mutational hotspot’ phenomenon owing to the relative rarity of gain-of-function mutations is well known, and also for tumor suppressor proteins, for which more uniformly distributed loss-of-function driver mutations are expected. A given gene product may be perturbed differently in different cancers. Indeed, we observed systematic shifts between cancer types of the positions at which mutations occur within a given protein. Both known and novel candidate driver mutations were retrieved. Novel cancer gene candidates significantly overlapped with orthogonal systematic cancer screen hits, supporting the power of this approach to identify cancer genes.
PMCID: PMC4368709  PMID: 25794154
6.  Drosophila Duplication Hotspots Are Associated with Late-Replicating Regions of the Genome 
PLoS Genetics  2011;7(11):e1002340.
Duplications play a significant role in both extremes of the phenotypic spectrum of newly arising mutations: they can have severe deleterious effects (e.g. duplications underlie a variety of diseases) but can also be highly advantageous. The phenotypic potential of newly arisen duplications has stimulated wide interest in both the mutational and selective processes shaping these variants in the genome. Here we take advantage of the Drosophila simulans–Drosophila melanogaster genetic system to further our understanding of both processes. Regarding mutational processes, the study of two closely related species allows investigation of the potential existence of shared duplication hotspots, and the similarities and differences between the two genomes can be used to dissect its underlying causes. Regarding selection, the difference in the effective population size between the two species can be leveraged to ask questions about the strength of selection acting on different classes of duplications. In this study, we conducted a survey of duplication polymorphisms in 14 different lines of D. simulans using tiling microarrays and combined it with an analogous survey for the D. melanogaster genome. By integrating the two datasets, we identified duplication hotspots conserved between the two species. However, unlike the duplication hotspots identified in mammalian genomes, Drosophila duplication hotspots are not associated with sequences of high sequence identity capable of mediating non-allelic homologous recombination. Instead, Drosophila duplication hotspots are associated with late-replicating regions of the genome, suggesting a link between DNA replication and duplication rates. We also found evidence supporting a higher effectiveness of selection on duplications in D. simulans than in D. melanogaster. This is also true for duplications segregating at high frequency, where we find evidence in D. simulans that a sizeable fraction of these mutations is being driven to fixation by positive selection.
Author Summary
DNA duplications are important contributors to the phenotypic differences observed between individuals. These mutations can disrupt the normal functioning of genes and so are often associated with disease. But because they can add genetic information they can also lead to evolutionary change. Understanding how selection and non-random mutation processes shape the distribution of duplications throughout the genome is important to elucidate both the medical and evolutionary impacts of these mutations. Here, we examined the roles of selection and mutation in shaping patterns of duplication polymorphisms across the genomes of the fruit fly Drosophila melanogaster and its sister species, D. simulans. We found that selection is pervasive in both genomes but is more efficient in D. simulans than in D. melanogaster. We also found that these two species have shared duplication hotspots, i.e. orthologous regions experiencing high rates of duplication in the two genomes. After excluding the hypothesis that Drosophila duplication hotspots are associated with regions of the genome rich in segmental duplications (as observed for mammalian genomes), we show that they are associated with late-replicating regions of the genome. Our work therefore proposes a link between DNA replication and rates of duplication across the genome.
PMCID: PMC3207856  PMID: 22072977
7.  Hypermutable Non-Synonymous Sites Are under Stronger Negative Selection 
PLoS Genetics  2008;4(11):e1000281.
Mutation rate varies greatly between nucleotide sites of the human genome and depends both on the global genomic location and the local sequence context of a site. In particular, CpG context elevates the mutation rate by an order of magnitude. Mutations also vary widely in their effect on the molecular function, phenotype, and fitness. Independence of the probability of occurrence of a new mutation from its effect has been a fundamental premise in genetics. However, highly mutable contexts may be preserved by negative selection at important sites but destroyed by mutation at sites under no selection. Thus, there may be a positive correlation between the rate of mutations at a nucleotide site and the magnitude of their effect on fitness. We studied the impact of CpG context on the rate of human–chimpanzee divergence and on intrahuman nucleotide diversity at non-synonymous coding sites. We compared nucleotides that occupy identical positions within codons of identical amino acids and only differ by being within versus outside CpG context. Nucleotides within CpG context are under a stronger negative selection, as revealed by their lower, proportionally to the mutation rate, rate of evolution and nucleotide diversity. In particular, the probability of fixation of a non-synonymous transition at a CpG site is two times lower than at a non-CpG site. Thus, sites with different mutation rates are not necessarily selectively equivalent. This suggests that the mutation rate may complement sequence conservation as a characteristic predictive of functional importance of nucleotide sites.
Author Summary
Mutations occur in some sites in the genome more frequently than in others. Similarly, mutations in some sites have greater consequences than in others. The effect of mutations might not be independent of the frequency with which mutations occur. Indeed, sites where mutations happen frequently will be preserved if the effects of these mutations are severe or will otherwise be allowed to mutate if there are no consequences for the organism. We compared both human–chimpanzee differences and sequence variation among humans in protein coding genes. We found that highly mutable nucleotide sites, such as the dinucleotide CpG, are on average more important and more frequently preserved by natural selection. Using this information, together with other features such as sequence conservation, opens a new perspective to predict the effect of human mutations, including their potential involvement in diseases.
PMCID: PMC2583910  PMID: 19043566
8.  Mouse PRDM9 DNA-Binding Specificity Determines Sites of Histone H3 Lysine 4 Trimethylation for Initiation of Meiotic Recombination 
PLoS Biology  2011;9(10):e1001176.
The nature of the PRDM9 zinc finger domain determines the location of hotspots for meiotic recombination in the genome and promotes local histone H3K4 trimethylation.
Meiotic recombination generates reciprocal exchanges between homologous chromosomes (also called crossovers, COs) that are essential for proper chromosome segregation during meiosis and are a major source of genome diversity by generating new allele combinations. COs have two striking properties: they occur at specific sites, called hotspots, and these sites evolve rapidly. In mammals, the Prdm9 gene, which encodes a meiosis-specific histone H3 methyltransferase, has recently been identified as a determinant of CO hotspots. Here, using transgenic mice, we show that the sole modification of PRDM9 zinc fingers leads to changes in hotspot activity, histone H3 lysine 4 trimethylation (H3K4me3) levels, and chromosome-wide distribution of COs. We further demonstrate by an in vitro assay that the PRDM9 variant associated with hotspot activity binds specifically to DNA sequences located at the center of the three hotspots tested. Remarkably, we show that mutations in cis located at hotspot centers and associated with a decrease of hotspot activity affect PRDM9 binding. Taken together, these results provide the direct demonstration that Prdm9 is a master regulator of hotspot localization through the DNA binding specificity of its zinc finger array and that binding of PRDM9 at hotspots promotes local H3K4me3 enrichment.
Author Summary
Meiosis is the process of cell division that reduces the number of chromosome sets from two to one, so producing gametes for sexual reproduction. During meiosis in many organisms, there is reciprocal exchange of genetic material between homologous chromosomes by the formation of “crossovers,” which promote genetic diversity by creating new combinations of gene variants and play an important mechanical role in the segregation of chromosomes. Crossovers do not occur randomly throughout the genome, but in small regions called hotspots. Recent work showed that hotspots have specific structural features and that the protein PRDM9 is important in specifying their location. PRDM9 contains a so-called zinc finger domain that is predicted to bind specific DNA sequences, suggesting that hotspots might be sites where PRDM9 binds. By using transgenic mice expressing PRDM9 with modified zinc fingers, here we show directly that the nature of the zinc fingers in PRDM9 determines crossover hotspot localization. We show that PRDM9 binds DNA sequences at the center of hotspots. Furthermore, we identify DNA sequence polymorphisms that affect its binding and the extent of crossover activity. Overall, our work shows that PRDM9, through its zinc finger domain, is a master regulator of hotspot location in the mouse genome.
PMCID: PMC3196474  PMID: 22028627
9.  Estimated Comparative Integration Hotspots Identify Different Behaviors of Retroviral Gene Transfer Vectors 
PLoS Computational Biology  2011;7(12):e1002292.
Integration of retroviral vectors in the human genome follows non-random patterns that favor insertional deregulation of gene expression and may cause risks of insertional mutagenesis when used in clinical gene therapy. Understanding how viral vectors integrate into the human genome is a key issue in predicting these risks. We provide a new statistical method to compare retroviral integration patterns. We identified the positions where vectors derived from the Human Immunodeficiency Virus (HIV) and the Moloney Murine Leukemia Virus (MLV) show different integration behaviors in human hematopoietic progenitor cells. Non-parametric density estimation was used to identify candidate comparative hotspots, which were then tested and ranked. We found 100 significant comparative hotspots, distributed throughout the chromosomes. HIV hotspots were wider and contained more genes than MLV ones. A Gene Ontology analysis of HIV targets showed enrichment of genes involved in antigen processing and presentation, reflecting the high HIV integration frequency observed at the MHC locus on chromosome 6. Four histone modifications/variants had a different mean density in comparative hotspots (H2AZ, H3K4me1, H3K4me3, H3K9me1), while gene expression within the comparative hotspots did not differ from background. These findings suggest the existence of epigenetic or nuclear three-dimensional topology contexts guiding retroviral integration to specific chromosome areas.
Author Summary
Understanding how retroviral vectors integrate in the human genome is a major safety issue in gene therapy, since a concrete risk of developing tumors associated with the integration process has been observed in several clinical trials. Statistical analyses confirmed the non-randomness of the integration. Where and why do virus-specific integrations tend to accumulate in the genome? We compared integration preferences of two retroviral vectors derived from HIV and MLV, which are used in most gene therapy trials for hematological disorders, in their actual clinical targets, i.e., human hematopoietic stem/progenitor cells. We developed a new statistical method to find areas of the genome, called comparative hotspots, where integration preferences are significantly different. We modeled the integration process as a stochastic process, so that integration sites are seen as samples from an unknown virus-specific probability density function. Thus, the problem became one of identifying areas where two empirical density functions differ significantly. Comparing nonparametric variability bands around the estimated integration densities allowed us to identify and rank candidate comparative hotspots. Results indicated clear differential patterns of integration between HIV and MLV, leading to new hypotheses on the mechanisms governing retroviral integration.
PMCID: PMC3228801  PMID: 22144885
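The density-comparison idea in entry 9 can be sketched as follows. This is a simplified illustration, not the published method: it uses a plain Gaussian kernel density estimate and a fixed threshold on the difference between the two estimated densities, which stands in for the nonparametric variability bands described in the summary. All names, the bandwidth, and the threshold are hypothetical.

```python
import math


def gaussian_kde(sites, bandwidth):
    """Return a function estimating the density of `sites` at a position x.

    Uses a Gaussian kernel with a fixed bandwidth (an assumption; the source
    does not state which kernel or bandwidth was used).
    """
    n = len(sites)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sites
        )

    return density


def candidate_regions(sites_a, sites_b, grid, bandwidth, min_diff):
    """List grid positions where the two estimated densities differ.

    Returns (position, density_a - density_b) pairs whose absolute
    difference is at least `min_diff`. A positive difference means vector A
    integrates more densely there, a negative one means vector B does.
    """
    f_a = gaussian_kde(sites_a, bandwidth)
    f_b = gaussian_kde(sites_b, bandwidth)
    hits = []
    for x in grid:
        diff = f_a(x) - f_b(x)
        if abs(diff) >= min_diff:
            hits.append((x, diff))
    return hits
```

A fixed threshold ignores sampling variability, so it cannot assign significance to a region. It only shows how two empirical integration densities can be compared position by position.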
10.  Two Mechanisms Produce Mutation Hotspots at DNA Breaks in Escherichia coli 
Cell Reports  2012;2(4):714-721.
Mutation hotspots and showers occur across phylogeny and profoundly influence genome evolution, yet the mechanisms that produce hotspots remain obscure. We report that DNA double-strand breaks (DSBs) provoke mutation hotspots via stress-induced mutation in Escherichia coli. With tet reporters placed 2 kb to 2 Mb (half the genome) away from an I-SceI site, RpoS/DinB-dependent mutations occur maximally within the first 2 kb and decrease logarithmically to ∼60 kb. A weak mutation tail extends to 1 Mb. Hotspotting occurs independently of I-site/tet-reporter-pair position in the genome, upstream and downstream in the replication path. RecD, which allows RecBCD DSB-exonuclease activity, is required for strong local but not long-distance hotspotting, indicating that double-strand resection and gap-filling synthesis underlie local hotspotting, and newly illuminating DSB resection in vivo. Hotspotting near DSBs opens the possibility that specific genomic regions could be targeted for mutagenesis, and could also promote concerted evolution (coincident mutations) within genes/gene clusters, an important issue in the evolution of protein functions.
Graphical Abstract
► Spontaneous mutation pathway in Escherichia coli causes hotspots at double-strand breaks
► Strong local (2–60 kb) hotspot mechanism: double-strand resection and gap-fill
► Weak long-distance (1 Mb) mutagenesis by break-induced replication
► Break-induced replication and length of DNA-end resection in natural repair with sister chromosomes
Mutation hotspots promote cancer and genome evolution, yet how they occur remains obscure. Rosenberg and colleagues used targeted endonucleolytic cleavages in the Escherichia coli chromosome to show that double-strand breaks cause mutation hotspots. Strong local and weak distant hotspots are caused by two mutation mechanisms that accelerate evolution in stressed cells. Hotspotting at breaks raises the possibility that specific genomic regions can be targeted for mutagenesis and can promote concerted evolution within genes, an important issue in protein evolution.
PMCID: PMC3607216  PMID: 23041320
11.  Trans-Regulation of Mouse Meiotic Recombination Hotspots by Rcr1 
PLoS Biology  2009;7(2):e1000036.
Meiotic recombination is required for the orderly segregation of chromosomes during meiosis and for providing genetic diversity among offspring. Among mammals, as well as yeast and higher plants, recombination preferentially occurs at highly delimited chromosomal sites 1–2 kb long known as hotspots. Although considerable progress has been made in understanding the roles various proteins play in carrying out the molecular events of the recombination process, relatively little is understood about the factors controlling the location and relative activity of mammalian recombination hotspots. To search for trans-acting factors controlling the positioning of recombination events, we compared the locations of crossovers arising in an 8-Mb segment of a 100-Mb region of mouse Chromosome 1 (Chr 1) when the longer region was heterozygous C57BL/6J (B6) × CAST/EiJ (CAST) and the remainder of the genome was either similarly heterozygous or entirely homozygous B6. The lack of CAST alleles in the remainder of the genome resulted in profound changes in hotspot activity in both females and males. Recombination activity was lost at several hotspots; new, previously undetected hotspots appeared; and still other hotspots remained unaffected, indicating the presence of distant trans-acting gene(s) whose CAST allele(s) activate or suppress the activity of specific hotspots. Testing the activity of three activated hotspots in sperm samples from individual male progeny of two genetic crosses, we identified a single trans-acting regulator of hotspot activity, designated Rcr1, that is located in a 5.30-Mb interval (11.74–17.04 Mb) on Chr 17. Using an Escherichia coli cloning assay to characterize the molecular products of recombination at two of these hotspots, we found that Rcr1 controls the appearance of both crossover and noncrossover gene conversion events, indicating that it likely controls the sites of the double-strand DNA breaks that initiate the recombination process.
Author Summary
Recombination is an essential aspect of meiosis, ensuring proper contact and exchange of genetic material between homologous parental chromosomes, as well as their subsequent segregation to produce haploid gametes. In humans and mice, recombination events are located at preferential sites termed hotspots, whose placement and activity are tightly regulated. We have now identified a hotspot-regulating locus in mammals, Rcr1, that simultaneously controls the locations of multiple hotspots. The discovery of Rcr1 indicates the existence of a newly emerging class of genes important in the recombination processes. Gaining further insights into their function may contribute to a better understanding of genetic factors underlying human fertility and evolution.
Rcr1 is identified as a trans-regulator of meiotic recombination hotspots that appears to act at the initiation of recombination and that maps to a 5.3-megabase region on mouse Chromosome 17.
PMCID: PMC2642880  PMID: 19226189
12.  Signs of positive selection of somatic mutations in human cancers detected by EST sequence analysis 
BMC Cancer  2006;6:36.
Carcinogenesis typically involves multiple somatic mutations in caretaker (DNA repair) and gatekeeper (tumor suppressors and oncogenes) genes. Analysis of mutation spectra of the tumor suppressor that is most commonly mutated in human cancers, p53, unexpectedly suggested that somatic evolution of the p53 gene during tumorigenesis is dominated by positive selection for gain of function. This conclusion is supported by accumulating experimental evidence of evolution of new functions of p53 in tumors. These findings prompted a genome-wide analysis of possible positive selection during tumor evolution.
A comprehensive analysis of probable somatic mutations in the sequences of Expressed Sequence Tags (ESTs) from malignant tumors and normal tissues was performed in order to assess the prevalence of positive selection in cancer evolution. For each EST, the numbers of synonymous and non-synonymous substitutions were calculated. In order to identify genes with a signature of positive selection in cancers, these numbers were compared to: i) expected numbers and ii) the numbers for the respective genes in the ESTs from normal tissues.
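The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `count_substitutions` and `binomial_upper_tail` are hypothetical, the sequences are assumed to be aligned and in frame, codons that differ at more than one position are skipped, and the expected non-synonymous fraction passed to the test is a user-supplied assumption.

```python
from math import comb

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_substitutions(ref, alt):
    """Return (non_synonymous, synonymous) counts for two aligned, in-frame
    coding sequences. Codons with unknown bases or with more than one
    differing position are skipped (a simplification)."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned and of equal length")
    nonsyn = syn = 0
    for pos in range(0, len(ref) - len(ref) % 3, 3):
        r = ref[pos:pos + 3].upper()
        a = alt[pos:pos + 3].upper()
        if r == a or r not in CODON_TABLE or a not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(r, a)) != 1:
            continue
        if CODON_TABLE[r] == CODON_TABLE[a]:
            syn += 1
        else:
            nonsyn += 1
    return nonsyn, syn


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    non-synonymous changes among n if the expected fraction is p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Toy example (invented sequences): GCT->GCC is synonymous, GAA->GAT is not.
nonsyn, syn = count_substitutions("ATGGCTGAA", "ATGGCCGAT")
print(nonsyn, syn)
```

A gene would be flagged only when the upper-tail probability is small after correcting for the number of genes tested; the caveats about sequencing errors and undetected polymorphisms apply to any such count.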
We identified 112 genes with a signature of positive selection in cancers, i.e., a significantly elevated ratio of non-synonymous to synonymous substitutions, in tumors as compared to 37 such genes in an approximately equal-sized EST collection from normal tissues. A substantial fraction of the tumor-specific positive-selection candidates have experimentally demonstrated or strongly predicted links to cancer.
The results of EST analysis should be interpreted with extreme caution given the noise introduced by sequencing errors and undetected polymorphisms. Furthermore, an inherent limitation of EST analysis is that multiple mutations amenable to statistical analysis can be detected only in relatively highly expressed genes. Nevertheless, the present results suggest that positive selection might affect a substantial number of genes during tumorigenic somatic evolution.
PMCID: PMC1431556  PMID: 16469093
13.  BF Integrase Genes of HIV-1 Circulating in São Paulo, Brazil, with a Recurrent Recombination Region 
PLoS ONE  2012;7(4):e34324.
Although some studies have shown diversity in HIV integrase (IN) genes, none has focused particularly on the gene evolving in epidemics in the context of recombination. The IN genes from 157 HIV-1 integrase inhibitor-naïve patients from São Paulo State, Brazil, were sequenced, tallying 128 of subtype B (23 of which were found in non-B genomes), 17 of subtype F (8 of which were found in recombinant genomes), 11 BF recombinant integrases, and 1 of subtype C. Crucially, we found that 4 BF recombinant viruses shared a recurrent recombination breakpoint region between positions 4900 and 4924 (relative to HXB2) that includes 2 gRNA loops, where the RT may stutter. Since these recombinants had independent phylogenetic origins, we argue that these results suggest a possible recombination hotspot not observed so far in BF CRF in particular, or in any other HIV-1 CRF in general. Additionally, 40% of the drug-naïve and 45% of the drug-treated patients had at least 1 raltegravir (RAL) or elvitegravir (EVG) resistance-associated amino acid change, but no major resistance mutations were found, in line with other studies. Importantly, V151I was the most common minor resistance mutation among B, F, and BF IN genes. Most codon sites of the IN genes had higher rates of synonymous substitutions (dS), indicative of strong negative selection. Nevertheless, several codon sites, mainly in subtype B, were found to be under positive selection. Consequently, we observed a higher genetic diversity in the B portions of the mosaics, possibly due to the more recent introduction of subtype F on top of an ongoing subtype B epidemic and a fast spread of subtype F alleles among the B population.
PMCID: PMC3317518  PMID: 22485165
14.  The roles of transcription and genotoxins underlying p53 mutagenesis in vivo 
Carcinogenesis  2011;32(10):1559-1567.
Transcription drives supercoiling which forms and stabilizes single-stranded (ss) DNA secondary structures with loops exposing G and C bases that are intrinsically mutable and vulnerable to non-enzymatic hydrolytic reactions. Since many studies in prokaryotes have shown direct correlations between the frequencies of transcription and mutation, we conducted in silico analyses using the computer program, mfg, which simulates transcription and predicts the location of known mutable bases in loops of high-stability secondary structures. Mfg analyses of the p53 tumor suppressor gene predicted the location of mutable bases and mutation frequencies correlated with the extent to which these mutable bases were exposed in secondary structures. In vitro analyses have now confirmed that the 12 most mutable bases in p53 are in fact located in predicted ssDNA loops of these structures. Data show that genotoxins have two independent effects on mutagenesis and the incidence of cancer: Firstly, they activate p53 transcription, which increases the number of exposed mutable bases and also increases mutation frequency. Secondly, genotoxins increase the frequency of G-to-T transversions resulting in a decrease in G-to-A and C mutations. This precise compensatory shift in the ‘fate’ of G mutations has no impact on mutation frequency. Moreover, it is consistent with our proposed mechanism of mutagenesis in which the frequency of G exposure in ssDNA via transcription is rate limiting for mutation frequency in vivo.
PMCID: PMC3179427  PMID: 21803733
15.  Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer 
Background and objective
With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants.
Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions.
Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, and hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision.
By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies.
PMCID: PMC3277632  PMID: 22319177
Translational bioinformatics; protein networks; text-mining
16.  The Recombinational Anatomy of a Mouse Chromosome 
PLoS Genetics  2008;4(7):e1000119.
Among mammals, genetic recombination occurs at highly delimited sites known as recombination hotspots. They are typically 1–2 kb long and vary as much as a 1,000-fold or more in recombination activity. Although much is known about the molecular details of the recombination process itself, the factors determining the location and relative activity of hotspots are poorly understood. To further our understanding, we have collected and mapped the locations of 5,472 crossover events along mouse Chromosome 1 arising in 6,028 meioses of male and female reciprocal F1 hybrids of C57BL/6J and CAST/EiJ mice. Crossovers were mapped to a minimum resolution of 225 kb, and those in the telomere-proximal 24.7 Mb were further mapped to resolve individual hotspots. Recombination rates were evolutionarily conserved on a regional scale, but not at the local level. There was a clear negative-exponential relationship between the relative activity and abundance of hotspot activity classes, such that a small number of the most active hotspots account for the majority of recombination. Females had 1.2× higher overall recombination than males did, although the sex ratio showed considerable regional variation. Locally, entirely sex-specific hotspots were rare. The initiation of recombination at the most active hotspot was regulated independently on the two parental chromatids, and analysis of reciprocal crosses indicated that parental imprinting has subtle effects on recombination rates. It appears that the regulation of mammalian recombination is a complex, dynamic process involving multiple factors reflecting species, sex, individual variation within species, and the properties of individual hotspots.
Author Summary
In most eukaryotic organisms, recombination—the exchange of genetic information between homologous chromosomes—ensures the proper recognition and segregation of chromosomes during meiosis. Recombination events in mammals are not randomly positioned along the chromosomes but occur in preferential 1–2-kilobase sequences termed hotspots. Different species such as humans and mice do not share hotspots, although the same principles almost certainly regulate their placement in the genome. Hotspot positions and activities depend on genetic background and show sex-specific differences. In this study, we present a detailed analysis of recombination activity along the largest mouse chromosome, finding that recombination is regulated on multiple levels, including regional positioning relative to the chromosomal ends, local gene content, sex-specific mechanisms of hotspot recognition, and parental origin. Our results will contribute to further understanding of one of the most fundamental biological processes and are likely to cast light on several aspects of population genetics and evolutionary biology, as well as enhance our practical ability to define the genetic components of human disease.
PMCID: PMC2440539  PMID: 18617997
17.  Targeted deep sequencing of mucinous ovarian tumors reveals multiple overlapping RAS-pathway activating mutations in borderline and cancerous neoplasms 
BMC Cancer  2015;15:415.
Mucinous ovarian tumors represent a distinct histotype of epithelial ovarian cancer. They are the rarest (2–4% of ovarian carcinomas) of the five major histotypes, and their genomic landscape remains poorly described. We undertook hotspot sequencing of 50 genes commonly mutated in human cancer across 69 mucinous ovarian tumors. Our goals were to establish the overall frequency of cancer-hotspot mutations across a large cohort, especially in those tumors previously thought to be “RAS-pathway alteration negative”, using highly sensitive next-generation sequencing, and to further explore a small number of cases with apparent heterogeneity in RAS-pathway activating alterations.
Using the Ion Torrent PGM platform, we performed next generation sequencing analysis using the v2 Cancer Hotspot Panel. Regions of disparate ERBB2-amplification status were sequenced independently for two mucinous carcinoma (MC) cases, previously established as showing ERBB2 amplification/overexpression heterogeneity, to assess the hypothesis of subclonal populations containing either KRAS mutation or ERBB2 amplification independently or simultaneously.
We detected mutations in KRAS, TP53, CDKN2A, PIK3CA, PTEN, BRAF, FGFR2, STK11, CTNNB1, SRC, SMAD4, GNA11 and ERBB2. KRAS mutations remain the most frequently observed alteration among MC (64.9%) and mucinous borderline tumors (MBOT) (92.3%). TP53 mutation occurred more frequently in carcinomas than borderline tumors (56.8% and 11.5%, respectively), and combined IHC and mutation data suggest alterations occur in approximately 68% of MC and as many as 20% of MBOT. Proven and potential RAS-pathway activating changes were observed in all but one MC. Concurrent ERBB2 amplification and KRAS mutation were observed in a substantial number of cases (7/63 total), as was co-occurrence of KRAS and BRAF mutations (one case). Microdissection of ERBB2-amplified regions of tumors harboring KRAS mutation suggests these alterations are occurring in the same cell populations, while consistency of KRAS allelic frequency in both ERBB2 amplified and non-amplified regions suggests this mutation occurred in advance of the amplification event.
Overall, the prevalence of RAS-alteration and the striking co-occurrence of pathway “double-hits” support a critical role in tumor progression in this ovarian malignancy. Given the spectrum of RAS-activating mutations, it is clear that targeting this pathway may be a viable therapeutic option for patients with recurrent or advanced stage mucinous ovarian carcinoma; however, caution should be exercised in selecting one or more personalized therapeutics given the frequency of non-redundant RAS-activating alterations.
Electronic supplementary material
The online version of this article (doi:10.1186/s12885-015-1421-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4494777  PMID: 25986173
Next-generation sequencing; Mucinous; Ovarian; BRAF; KRAS; TP53; Heterogeneity
18.  Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules 
BMC Genomics  2013;14:196.
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant.
While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, but in the case of the Ribi group, we also provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso.
Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
PMCID: PMC3616858  PMID: 23514438
19.  Stable and Unstable Malaria Hotspots in Longitudinal Cohort Studies in Kenya 
PLoS Medicine  2010;7(7):e1000304.
Philip Bejon and colleagues document the clustering of malaria episodes and malarial parasite infection. These patterns may enable future prediction of hotspots of malaria infection and targeting of treatment or preventive interventions.
Infectious diseases often demonstrate heterogeneity of transmission among host populations. This heterogeneity reduces the efficacy of control strategies, but also implies that focusing control strategies on “hotspots” of transmission could be highly effective.
Methods and Findings
In order to identify hotspots of malaria transmission, we analysed longitudinal data on febrile malaria episodes, asymptomatic parasitaemia, and antibody titres over 12 y from 256 homesteads in three study areas in Kilifi District on the Kenyan coast. We examined heterogeneity by homestead, and identified groups of homesteads that formed hotspots using a spatial scan statistic. Two types of statistically significant hotspots were detected; stable hotspots of asymptomatic parasitaemia and unstable hotspots of febrile malaria. The stable hotspots were associated with higher average AMA-1 antibody titres than the unstable clusters (optical density [OD] = 1.24, 95% confidence interval [CI] 1.02–1.47 versus OD = 1.1, 95% CI 0.88–1.33) and lower mean ages of febrile malaria episodes (5.8 y, 95% CI 5.6–6.0 versus 5.91 y, 95% CI 5.7–6.1). A falling gradient of febrile malaria incidence was identified in the penumbrae of both hotspots. Hotspots were associated with AMA-1 titres, but not seroconversion rates. In order to target control measures, homesteads at risk of febrile malaria could be predicted by identifying the 20% of homesteads that experienced an episode of febrile malaria during one month in the dry season. That 20% subsequently experienced 65% of all febrile malaria episodes during the following year. A definition based on remote sensing data was 81% sensitive and 63% specific for the stable hotspots of asymptomatic malaria.
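The two summary measures quoted above (the share of next-year episodes falling in the flagged homesteads, and the sensitivity and specificity of a hotspot definition) can be computed as in the following sketch. The function names and the ten-homestead data set are invented for illustration; the toy numbers were chosen only so that the flagged 20% of homesteads carry 65% of episodes, and they are not data from the study.

```python
def episode_share(episodes_next_year, flagged):
    """Fraction of all next-year episodes that occur in flagged homesteads."""
    total = sum(episodes_next_year)
    if total == 0:
        raise ValueError("no episodes recorded")
    in_flagged = sum(e for e, f in zip(episodes_next_year, flagged) if f)
    return in_flagged / total


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) of a binary hotspot definition
    (predicted) against the reference hotspot status (actual)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative")
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: 10 homesteads, the first two flagged in the dry season.
flagged = [True, True] + [False] * 8
episodes = [7, 6, 1, 1, 1, 1, 1, 1, 1, 0]
print(episode_share(episodes, flagged))

# Invented toy comparison of a remote-sensing rule against reference status.
predicted = [True, True, True, False, False, False]
actual = [True, True, False, False, False, True]
print(sensitivity_specificity(predicted, actual))
```

Detecting the hotspots themselves requires a spatial scan statistic, which this sketch does not attempt.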
Hotspots of asymptomatic parasitaemia are stable over time, but hotspots of febrile malaria are unstable. This finding may be because immunity offsets the high rate of febrile malaria that might otherwise result in stable hotspots, whereas unstable hotspots necessarily affect a population with less prior exposure to malaria.
Editors' Summary
Malaria, a mosquito-borne parasitic disease, is a major global public-health problem. About half the world's population is at risk of malaria and about one million people (mainly children living in sub-Saharan Africa) die each year from the disease. Malaria is transmitted to people through the bite of an infected mosquito. Initially, the parasite replicates inside human liver cells but, about a week after infection, these cells release “merozoites” (one of the life-stages of the parasite), which invade red blood cells. Here, the merozoites replicate rapidly before bursting out after 2–3 days and infecting more red blood cells. The cyclical and massive increase in parasitemia (parasites in the bloodstream) that results from this pattern of replication is responsible for malaria's recurring fevers and can cause life-threatening organ damage and anemia (a lack of red blood cells). Malaria can be prevented by controlling the mosquitoes that spread the parasite and by avoiding mosquito bites. Effective treatment with antimalarial drugs can also reduce malaria transmission.
Why Was This Study Done?
Like many other infectious diseases, the transmission of malaria is heterogeneous. That is, even in places where malaria is always present, there are “hotspots” of transmission, areas where the risk of catching malaria is particularly high. The existence of these hotspots, which are caused by a combination of genetic factors (for example, host susceptibility to infection) and environmental factors (for example, distance from mosquito breeding sites), reduces the efficacy of control strategies. However, mathematical models suggest that focusing control strategies on transmission hotspots might be an effective way to reduce overall malaria transmission. Efforts have been made to identify such hotspots using environmental data collected by satellites but with limited success. In this study, therefore, the researchers investigate the heterogeneity of malaria transmission in the Kilifi District of Kenya over time by analyzing data collected over up to 12 years (“longitudinal” data) on malaria episodes and parasitemia in three groups (cohorts) of children living in 256 homesteads.
What Did the Researchers Do and Find?
The researchers identified febrile malaria episodes in the homesteads by taking blood from children with fever (febrile children) to analyze for parasitemia. They took blood once a year from all the study participants just before the rainy season (when malaria peaks) to look for symptom-free parasitemia and they also looked for antibodies (proteins made by the immune system that fight disease) against malaria parasites in the blood of the participants. They then used a “spatial scan statistic” to look for heterogeneity of transmission and to identify transmission hotspots (groups of homesteads where the observed incidence of malaria or parasitemia was higher than would be expected if cases were evenly distributed). The researchers identified two types of hotspots—stable hotspots of symptom-free parasitemia that were still hotspots several years later and unstable hotspots of febrile malaria that rarely stayed in the same place for more than a year or two. Children living in the stable hotspots had slightly higher average amounts of antimalaria antibodies and developed malaria at a slightly lower average age than children living in the unstable hotspots.
What Do These Findings Mean?
These findings show that in Kilifi District, Kenya, hotspots of symptom-free parasitemia are stable over time but hotspots of febrile malaria are unstable. The researchers suggest that rapid acquisition of immunity in the stable hotspots reduces the occurrence of febrile malaria whereas in the unstable hotspots there is a high incidence of febrile malaria because lack of previous exposure to the parasite means there is a low level of immunity. Targeted strategies for malaria control should target both types of hotspots, suggest the researchers. Stable hotspots of symptom-free parasitemia (which can be identified by parasite or antibody surveys or by remote environmental sensing) should be targeted because mosquito dispersion probably increases malaria transmission rates near these hotspots. Unstable hotspots of febrile disease should be targeted to reduce both the burden of disease and transmission in the wider community. Unstable hotspots of febrile malaria, the researchers suggest, could be efficiently identified in Kilifi District (and maybe elsewhere) by determining which homesteads had malaria outbreaks during September (part of the dry season) one year and then focusing control interventions on these homesteads the next year.
PMCID: PMC2897769  PMID: 20625549
20.  GT198 Splice Variants Display Dominant-Negative Activities and Are Induced by Inactivating Mutations 
Genes & Cancer  2013;4(1-2):26-38.
Alternative pre-mRNA splicing yields functionally distinct splice variants in regulating normal cell differentiation as well as cancer development. The putative tumor suppressor gene GT198 (PSMC3IP), encoding a protein also known as TBPIP and Hop2, has been shown to regulate steroid hormone receptor–mediated transcription and to stimulate homologous recombination in DNA repair. Here, we have identified 6 distinct GT198 splice variant transcripts generated by alternative promoter usage or alternative splicing. Various splice variant transcripts preserve a common open reading frame, which encodes the DNA binding domain of GT198. The splice variants act as dominant negatives to counteract wild-type GT198 activity in transcription and to abolish Rad51 foci formation during radiation-induced DNA damage. In fallopian tube cancer, we have identified 44 point mutations in GT198 clustered in 2 mutation hotspot sequences. The mutation hotspots coincide with the regulatory sequences responsible for alternative splicing, strongly supporting that imbalanced alternative splicing is a selected consequence in cancer. In addition, splice variant–associated cytoplasmic expression is found in tumors carrying germline or somatic GT198 mutations. An altered alternative splicing pattern with increased variants is also present in lymphoblastoid cells derived from familial breast cancer patients carrying GT198 germline mutations. Furthermore, GT198 and its variant are reciprocally expressed during mouse stem cell differentiation. The constitutive expression of the GT198 variant but not the wild type induces tumor growth in nude mice. Our results collectively suggest that mutations in the GT198 gene deregulate alternative splicing. Defective alternative splicing promotes antagonizing variants and in turn induces a loss of the wild type in tumorigenesis. The study highlights the role of alternative splicing in tumor suppressor gene inactivation.
PMCID: PMC3743156  PMID: 23946869
alternative splicing; GT198; tumor suppressor gene; somatic mutation; DNA repair
21.  The Plant Pathogen Pseudomonas syringae pv. tomato Is Genetically Monomorphic and under Strong Selection to Evade Tomato Immunity 
PLoS Pathogens  2011;7(8):e1002130.
Recently, genome sequencing of many isolates of genetically monomorphic bacterial human pathogens has given new insights into pathogen microevolution and phylogeography. Here, we report a genome-based micro-evolutionary study of a bacterial plant pathogen, Pseudomonas syringae pv. tomato. Only 267 mutations were identified between five sequenced isolates in 3,543,009 nt of analyzed genome sequence, which suggests a recent evolutionary origin of this pathogen. Further analysis with genome-derived markers of 89 worldwide isolates showed that several genotypes exist in North America and in Europe, indicating frequent pathogen movement between these world regions. Genome-derived markers and molecular analyses of key pathogen loci important for virulence and motility both suggest ongoing adaptation to the tomato host. A mutational hotspot was found in the type III-secreted effector gene hopM1. These mutations abolish the cell death-triggering activity of the full-length protein, indicating strong selection for loss of function of this effector, which was previously considered a virulence factor. Two non-synonymous mutations in the flagellin-encoding gene fliC allowed the identification of a new microbe-associated molecular pattern (MAMP) in a region distinct from the known MAMP flg22. Interestingly, the ancestral allele of this MAMP induces a stronger tomato immune response than the derived alleles. The ancestral allele has largely disappeared from today's Pto populations, suggesting that flagellin-triggered immunity limits pathogen fitness even in highly virulent pathogens. An additional non-synonymous mutation was identified in flg22 in South American isolates. Therefore, MAMPs are more variable than expected, differing even between otherwise almost identical isolates of the same pathogen strain.
Author Summary
Our knowledge of the recent evolution of bacterial human pathogens has increased dramatically over the last five years. By comparison, relatively little is known about recent evolution of bacterial plant pathogens. Here, we analyze a large collection of isolates of the economically important plant pathogen Pseudomonas syringae pv. tomato with markers derived from the comparison of five genomes of this pathogen. We find that this pathogen likely evolved on a relatively recent time scale and continues to adapt to tomato by minimizing its recognition by the tomato immune system. We find that an allele of the flagellin subunit fliC that appeared in the pathogen population for the first time in the 1980s, and which is the most common allele of this gene in North America and Europe today, triggers a weaker tomato immune response than the fliC allele found in the 1960s and 1970s. These results not only impact our understanding of pathogen–plant interactions and pathogen evolution but also have important ramifications for disease prevention. Given the speed with which new pathogen strains spread and replace existing strains, limiting the movement of specific strains between geographic regions is critically important, even for pathogens known to have worldwide distribution.
PMCID: PMC3161960  PMID: 21901088
22.  Multiple genetic switches spontaneously modulating bacterial mutability 
All life forms need both high genetic stability to survive as species and a degree of mutability to evolve for adaptation, but little is known about how organisms balance these two seemingly conflicting aspects of life. The DNA mismatch repair (MMR) system is essential for maintaining genetic stability, and defects in MMR lead to high mutability. Evolution is driven by genetic novelty, such as point mutation and lateral gene transfer, both of which require genetic mutability. However, a functional MMR system would normally strongly inhibit such genomic changes. Our previous work indicated that MMR gene allele conversion between functional and non-functional states through copy number changes of small tandem repeats could occur spontaneously via slipped-strand mis-pairing during DNA replication and may therefore act as genetic switches that modulate bacterial mutability at the population level. The open question was: when the conversion from functional to defective MMR is prohibited, will bacteria still be able to evolve by accepting laterally transferred DNA or accumulating mutations?
To prohibit allele conversion, we "locked" the MMR genes through nucleotide replacements. We then scored changes in bacterial mutability and found that Salmonella strains with MMR locked at the functional state had significantly decreased mutability. To determine the generalizability of this kind of mutability 'switching' among a wider range of bacteria, we examined the distribution of tandem repeats within MMR genes in over 100 bacterial species and found that multiple genetic switches might exist in these bacteria and may spontaneously modulate bacterial mutability during evolution.
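Scanning a gene sequence for short tandem repeats, as in the survey described above, can be approximated with a regular expression. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function name `find_tandem_repeats`, the unit-length limit of 6 nt, the minimum of 3 copies, and the example sequence are all invented for illustration.

```python
import re


def find_tandem_repeats(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat whose
    unit is 1..max_unit nt long and occurs at least min_copies times in a row.
    The lazy quantifier makes the shortest possible unit win at each start."""
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Invented example: an ATG repeat (4 copies) and a T homopolymer (7 copies).
print(find_tandem_repeats("GGCATGATGATGATGCCTTTTTTTAC"))
```

Only perfect, non-overlapping repeats are reported; imperfect repeats would need an alignment-based tool.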
MMR allele conversion through repeats-mediated slipped-strand mis-pairing may function as a spontaneous mechanism to switch between high genetic stability and mutability during bacterial evolution.
PMCID: PMC2955026  PMID: 20836863
23.  Impact of Alu repeats on the evolution of human p53 binding sites 
Biology Direct  2011;6:2.
The p53 tumor suppressor protein is involved in a complicated regulatory network, mediating expression of ~1000 human genes. Recent studies have shown that many p53 in vivo binding sites (BSs) reside in transposable repeats. The relationship between these BSs and functional p53 response elements (REs) remains unknown, however. We sought to understand whether the p53 REs also reside in transposable elements and particularly in the most-abundant Alu repeats.
We have analyzed ~160 functional p53 REs identified so far and found that 24 of them occur in repeats. More than half of these repeat-associated REs reside in Alu elements. In addition, using a position weight matrix approach, we found ~400,000 potential p53 BSs in Alu elements genome-wide. Importantly, these putative BSs are located in the same regions of Alu repeats as the functional p53 REs - namely, in the vicinity of Boxes A/A' and B of the internal RNA polymerase III promoter. Earlier nucleosome-mapping experiments showed that the Boxes A/A' and B have a different chromatin environment, which is critical for the binding of p53 to DNA. Here, we compare the Alu-residing p53 sites with the corresponding Alu consensus sequences and conclude that the p53 sites likely evolved through two different mechanisms - the sites overlapping with the Boxes A/A' were generated by CG → TG mutations; the other sites apparently pre-existed in the progenitors of several Alu subfamilies, such as AluJo and AluSq. The binding affinity of p53 to the Alu-residing sites generally correlates with the age of Alu subfamilies, so that the strongest sites are embedded in the 'relatively young' Alu repeats.
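The position weight matrix approach mentioned above can be sketched as follows. This is a toy example under stated assumptions: the 4-position count matrix (centred on a CATG-like core), the uniform background, the pseudocount and the score threshold are all hypothetical and are not the matrix or parameters used in the paper.

```python
import math

# Hypothetical count matrix, one dict per motif position.
# Illustrative only; NOT the p53 matrix used by the authors.
COUNTS = [
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 0, "C": 0, "G": 9, "T": 1},
]


def build_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert counts to log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (position, window, score) for windows scoring >= threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[j][window[j]] for j in range(width))
        if score >= threshold:
            hits.append((i, window, score))
    return hits


pwm = build_pwm(COUNTS)
for position, window, score in scan("GGCATGTT", pwm, threshold=4.0):
    print(position, window, round(score, 2))
```

A genome-wide scan would typically also score the reverse complement strand and calibrate the threshold against known sites; both steps are omitted here for brevity.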
The primate-specific Alu repeats play an important role in shaping the p53 regulatory network in the context of chromatin. One of the selective factors responsible for the frequent occurrence of Alu repeats in introns may be related to the p53-mediated regulation of Alu transcription, which, in turn, influences expression of the host genes.
This paper was reviewed by Igor B. Rogozin (nominated by Pavel A. Pevzner), Sandor Pongor, and I. King Jordan.
PMCID: PMC3032802  PMID: 21208455
24.  Selection Acts on DNA Secondary Structures to Decrease Transcriptional Mutagenesis 
PLoS Genetics  2006;2(11):e176.
Single-stranded DNA is more subject to mutation than double-stranded DNA. During transcription, DNA is transiently single stranded and therefore subject to higher mutagenesis. However, if local intra-strand secondary structures are formed, some bases will be paired and therefore less sensitive to mutation than unpaired bases. Using complete genome sequences of Escherichia coli, we show that local intra-strand secondary structures can, as a consequence, be used to define an index of transcription-driven mutability. At gene level, we show that natural selection has favoured a reduced transcription-driven mutagenesis via the higher than expected frequency of occurrence of intra-strand secondary structures. Such selection is stronger in highly expressed genes and suggests a sequence-dependent way to control mutation rates and a novel form of selection affecting the evolution of synonymous mutations.
Genome sequence evolution results from the interplay between mutagenesis and natural selection. Mutations occur as the result of biochemical or physical alteration of DNA and/or from the errors made by polymerases while replicating DNA. As many mutations tend to be detrimental to the organism's fitness, natural selection favours a decrease in mutation rate. Hence, many mechanisms have evolved to control mutation rate. The mechanisms described to date have relied on (i) the existence of enzymes repairing the damaged DNA or correcting mismatched bases, which are mechanisms having an effect on whole genome mutation rate, and (ii) the avoidance in the sequence of repetition that could be misread by the polymerases, which is a sequence-dependent local control of mutation rate. In the present paper, the authors suggest that another sequence-dependent control of mutation exists and shapes the overall evolution of the genome. Using a comparative analysis of Escherichia coli genomes, they show that local secondary structures that are formed during the transcription of genes into RNA can modulate the base-to-base mutation rate. Moreover, the authors show that natural selection seems to have favoured the occurrence of such structures to minimise mutability, especially in the most expressed genes. This paper proposes a new way in which gene sequences can be constrained by natural selection.
PMCID: PMC1630709  PMID: 17083275
25.  Increased cell survival by inhibition of BRCA1 using an antisense approach in an estrogen responsive ovarian carcinoma cell line 
Breast Cancer Research  2000;2(2):139-148.
We tested the hypothesis that BRCA1 may play a role in the regulation of ovarian tumor cell death as well as the inhibition of ovarian cell proliferation. Introduction of BRCA1 antisense retroviral constructs into BG-1 estrogen-dependent ovarian adenocarcinoma cells resulted in reduced BRCA1 expression. BRCA1 antisense pooled populations and derived subclones were able to proliferate in monolayer culture without estrogen, whereas control cells began to die after 10 days of estrogen deprivation. In addition, both populations and subclones of BRCA1 antisense infected cells demonstrated a growth advantage in monolayer culture in the presence of estrogen and were able to proliferate in monolayer culture without estrogen, while control cells did not. Furthermore, clonal studies demonstrated that reduced levels of BRCA1 protein correlated with growth in soft agar and greater tumor formation in nude mice in the absence of estrogen. These data suggest that reduction of BRCA1 protein in BG-1 ovarian adenocarcinoma cells may have an effect on cell survival during estrogen deprivation both in vitro and in vivo.
Germline mutations in the breast and ovarian cancer susceptibility gene BRCA1, which is located on chromosome 17q21, are associated with a predisposition to the development of cancer in these organs [1,2]. No mutations in the BRCA1 gene have been detected in sporadic breast cancer cases, but mutations have been detected in sporadic cases of ovarian cancer [3,4]. Although there is debate regarding the level of cancer risk associated with mutations in BRCA1 and the significance of the lack of mutations in sporadic tumors, it is possible that alterations in the function of BRCA1 may occur by mechanisms other than mutation, leading to an underestimation of risk when it is calculated solely on the basis of mutational analysis. Such alterations cannot be identified until the function and regulation of BRCA1 are better understood.
The BRCA1 gene encodes a 220-kDa nuclear phosphoprotein that is regulated in response to DNA damaging agents [5,6,7] and in response to estrogen-induced growth [8,9,10,11]. Germline mutations that cause breast and ovarian cancer predisposition frequently result in truncated and presumably inactive BRCA1 protein [12].
BG-1 cells were derived from a patient with stage III, poorly differentiated ovarian adenocarcinoma [13]. This cell line, which expresses wild-type BRCA1, is estrogen responsive and withdrawal of estrogen results in eventual cell death. Previous studies suggest that BRCA1 is stimulated as a result of estrogen treatment [8,9,10,11], and also that BRCA1 may be involved in the cell death process [14]. Therefore, we examined the effect of reduction of BRCA1 levels in BG-1 cells on the cellular response to hormone depletion as well as estrogen stimulation. The results suggest that reduced levels of BRCA1 correlate with a survival advantage when BG-1 cells are placed under growth-restrictive and hormone-depleted conditions. In optimum growth conditions, significantly reduced levels of BRCA1 correlate with enhanced growth both in vitro and in vivo.
The objective was to test the hypothesis that BRCA1 may play a role in the regulation of ovarian tumor cell death as well as in the inhibition of ovarian cell proliferation.
Materials and methods:
The estrogen receptor-positive BG-1 cell line [13], which contains abundant estrogen receptors (600 fmoles/100 μg DNA), was infected using a pLXSN retroviral vector (provided by AD Miller) containing an inverted partial human cDNA 900-base-pair sequence of BRCA1 (from nucleotide 121 in exon 1 to nucleotide 1025 in exon 11, accession #U14680). After 2 weeks of selection in 800 μg/ml of geneticin-G418 (Gibco/Life Technologies, Gaithersburg, MD, USA), BG-1 G418-resistant colonies were pooled, or individually isolated, and assayed for growth in the presence or absence of supplemented estrogen. Virally infected pooled populations of BG-1 cells were examined for BRCA1 message levels by ribonuclease protection assay (Fig. 1a). The BRCA1 ribonuclease protection probe was made using an in vitro transcription kit (Ambion, Inc, Austin, TX, USA) as previously described [10], and derived clones were tested for protein levels by Western blot analysis using an anti-BRCA1 (Oncogene Research, Ab-1, Cambridge, MA, USA) antibody. For growth curve analysis, infected populations were pretreated for 5 days in phenol red-free Dulbecco's modified Eagle medium (DMEM)/F-12 medium (Gibco/Life Technologies) supplemented with 10% charcoal/dextran-treated serum (Hyclone, Logan, UT, USA), then plated at 2.5 × 10⁶ cells per 100-mm dish in triplicate in the absence or presence of estrogen (10⁻⁸ mol/l; 17β-estradiol; 1,3,5(10)-estratriene-3,17β-diol; Sigma, St Louis, MO, USA). For the soft agar assay, clones were plated into 10 60-mm dishes at 1 × 10⁵ cells/dish containing 0.3% bactopeptone agar with or without added estrogen (10⁻⁸ mol/l) in phenol red-free medium with 10% stripped serum in order to test for anchorage-independent growth.
BG-1 infected clones were tested for tumorigenicity by injection of cells (10⁶ cells in 0.1 cm² 50% matrigel; Collaborative Biomedical Products, Bedford, MA, USA) into subcutaneous sites in 6-week-old athymic Ncr-nude mice (NCI Animal Program, Bethesda, MD, USA) that were ovariectomized at approximately 4 weeks of age. Half of the ovariectomized mice received an implanted 0.18-mg estrogen 60-day pellet (Innovative Research of America, Sarasota, FL, USA).
Antisense technology was effective in decreasing both RNA and protein levels of BRCA1 in the BG-1 human ovarian adenocarcinoma cells. BRCA1 antisense-infected populations contained significantly less BRCA1 message than control LXSN-infected pools and selected clones contained varying reduced levels of BRCA1 protein compared with control clones (Figs 1a and 1b).
Three independent BRCA1 antisense-infected cultures demonstrated a resistance to cell death induced by withdrawal from estrogen over a 6- to 20-day period (Fig. 2a). The BRCA1 antisense population also exhibited a threefold to sixfold increase in cell growth compared with control cells in the presence of estrogen treatment. BG-1 BRCA1 antisense clones responded similarly to the pooled populations, showing enhanced growth with estrogen and failure to die upon estrogen depletion (Fig. 2b).
The BRCA1 antisense clones were further examined for other associated tumorigenic properties. All of the antisense clones were able to form colonies in soft agar (2-23 colonies per 10⁴ cells plated; data not shown), whereas control clones were deficient in their ability to form colonies (0-0.8 colonies per 10⁴ cells plated). As Table 1 shows, in the presence of estrogen the clone with the lowest levels of BRCA1 (AS-4) produced significantly more colonies (133 ± 17.9 colonies per 10⁴ cells plated) than the control clone (NEO; 6 ± 3.1 colonies per 10⁴ cells plated). Clones AS-4 and NEO were also injected with matrigel subcutaneously into ovariectomized athymic mice. Almost twice as many sites were positive for the AS-4 clone (14 out of 14) as for the NEO clone (eight out of 14) 42 days after injection. In addition, BRCA1 antisense tumors averaged twice the size of control tumors. The BRCA1-reduced cells also formed tumors with half the latency of control cells in the presence of implanted estrogen (11 days versus 21 days until tumor formation).
The present studies show that reduction in BRCA1 levels, using an antisense retroviral vector in the estrogen-dependent BG-1 ovarian carcinoma cell line, supports the hypothesis that BRCA1 plays a pivotal role in the balance between cell death and cell proliferation. BRCA1 RNA and protein levels were successfully reduced in populations and isolated clones of antisense-infected BG-1 cells. Decreased BRCA1 levels rescued the BG-1 cells from growth arrest or cell death in adverse growth conditions in monolayer or soft agar conditions. Furthermore, a BRCA1 antisense clone that had significantly low levels of BRCA1 protein was able to form twice as many tumors in ovariectomized nude mice with a decreased latency compared with a control clone.
In multicellular mammalian organisms, a balance between cell proliferation and cell death is extremely important for the maintenance of normal healthy tissues. In support of this hypothesis, it has been shown that p53 and BRCA1 can form stable complexes, and can coactivate p21 and bax genes, which may lead to the activation of the apoptosis pathway [15]. The present data, which show that cells with a reduction of BRCA1 have a survival advantage in conditions where control cells fail to thrive, also support this hypothesis. BRCA1 levels appear to affect the ability of cells to arrest growth or die in the absence of estrogenic growth-inducing conditions. Although mutations in this gene are uncommon in sporadic breast and ovarian tumors, BRCA1 expression levels and protein levels have been found to be reduced in sporadic human breast carcinomas [16,17,18,19]. In addition, it has been demonstrated [20] that hormone-dependent tumors such as breast and ovarian cancers have a decreased ability to undergo apoptosis. Other mechanisms involving gene regulation may allow for decreased expression of BRCA1 in sporadic tumors. The response of BRCA1 mRNA and protein levels to mitogens and hormones in vitro suggests that BRCA1 may play a role in regulation of cell growth or maintenance [21]. The BRCA1 gene product may be involved in the regulation of hormone response pathways, and the present results demonstrate that loss of BRCA1 may result in loss of inhibitory control of these mitogenic pathways. These studies show that reduction in BRCA1 mRNA and protein can result in increased proliferation of BG-1 ovarian cancer cells in both in vitro and in vivo conditions, suggesting that BRCA1 may normally be acting as a growth inhibitor. Low BRCA1 levels found in sporadic cancers may be an important factor in tumorigenesis.
The present data suggest that diminished levels of BRCA1 not only accelerate proliferation in the BG-1 ovarian carcinoma cell line, but also appear to promote tumorigenesis. We propose that the loss or reduction of BRCA1 may predispose a cell population to neoplastic transformation by altering the balance between cell death and proliferation/survival, rendering it more sensitive to secondary genetic changes.
PMCID: PMC13916  PMID: 11056686
antisense; BRCA1; cell death; estrogen; ovarian cancer; proliferation
