1.  Whole-Genome Sequences of 26 Vibrio cholerae Isolates 
Genome Announcements  2016;4(6):e01396-16.
The human pathogen Vibrio cholerae employs several adaptive mechanisms for environmental persistence, including natural transformation and type VI secretion, creating a reservoir for the spread of disease. Here, we report whole-genome sequences of 26 diverse V. cholerae isolates, significantly increasing the sequence diversity of publicly available V. cholerae genomes.
doi:10.1128/genomeA.01396-16
PMCID: PMC5180380  PMID: 28007852
2.  Chocó, Colombia: a hotspot of human biodiversity 
Objective
Chocó is a state located on the Pacific coast of Colombia that has a majority Afro-Colombian population. The objective of this study was to characterize the genetic ancestry, admixture and diversity of the population of Chocó, Colombia.
Methodology
Genetic variation was characterized for a sample of 101 donors (61 female and 40 male) from the state of Chocó. Genotypes were determined for each individual via the characterization of 610,545 single nucleotide polymorphisms genome-wide. Haplotypes for the uniparental mitochondrial DNA (female) and Y-DNA (male) chromosomes were also determined. These data were used for comparative analyses with a number of worldwide populations, including putative ancestral populations from Africa, the Americas and Europe, along with several admixed American populations.
Results
The population of Chocó has predominantly African genetic ancestry (75.8%) with approximately equal parts European (13.4%) and Native American (11.1%) ancestry. Chocó shows relatively high levels of three-way genetic admixture, and far higher levels of Native American ancestry, compared to other New World African populations from the Caribbean and the United States. There is a striking pattern of sex-specific ancestry in Chocó, with Native American admixture along the female lineage and European admixture along the male lineage. The population of Chocó is also characterized by relatively high levels of overall genetic diversity compared to both putative ancestral populations and other admixed American populations.
Conclusion
These results suggest a unique genetic heritage for the population of Chocó and underscore the profound human genetic diversity that can be found in the region.
doi:10.18636/bioneotropical.v6i1.341
PMCID: PMC5033504  PMID: 27668076
Admixture; Afro-Colombian; Colombia; Genetic ancestry; Genetic diversity; Human genome
3.  Population Genomics of Reduced Vancomycin Susceptibility in Staphylococcus aureus 
mSphere  2016;1(4):e00094-16.
ABSTRACT
The increased prevalence of vancomycin-intermediate Staphylococcus aureus (VISA) is an emerging health care threat. Genome-based comparative methods hold great promise to uncover the genetic basis of the VISA phenotype, which remains obscure. S. aureus isolates were collected from a single individual that presented with recurrent staphylococcal bacteremia at three time points, and the isolates showed successively reduced levels of vancomycin susceptibility. A population genomic approach was taken to compare patient S. aureus isolates with decreasing vancomycin susceptibility across the three time points. To do this, patient isolates were sequenced to high coverage (~500×), and sequence reads were used to model site-specific allelic variation within and between isolate populations. Population genetic methods were then applied to evaluate the overall levels of variation across the three time points and to identify individual variants that show anomalous levels of allelic change between populations. A successive reduction in the overall levels of population genomic variation was observed across the three time points, consistent with a population bottleneck resulting from antibiotic treatment. Despite this overall reduction in variation, a number of individual mutations were swept to high frequency in the VISA population. These mutations were implicated as potentially involved in the VISA phenotype and interrogated with respect to their functional roles. This approach allowed us to identify a number of mutations previously implicated in VISA along with allelic changes within a novel class of genes, encoding LPXTG motif-containing cell-wall-anchoring proteins, which shed light on a novel mechanistic aspect of vancomycin resistance.
IMPORTANCE The emergence and spread of antibiotic resistance among bacterial pathogens are two of the gravest threats to public health facing the world today. We report the development and application of a novel population genomic technique aimed at uncovering the evolutionary dynamics and genetic determinants of antibiotic resistance in Staphylococcus aureus. This method was applied to S. aureus cultures isolated from a single patient who showed decreased susceptibility to the vancomycin antibiotic over time. Our approach relies on the increased resolution afforded by next-generation genome-sequencing technology, and it allowed us to discover a number of S. aureus mutations, in both known and novel gene targets, which appear to have evolved under adaptive pressure to evade vancomycin mechanisms of action. The approach we lay out in this work can be applied to resistance to any number of antibiotics across numerous species of bacterial pathogens.
doi:10.1128/mSphere.00094-16
PMCID: PMC4954867  PMID: 27446992
Staphylococcus aureus; antibiotic resistance; genomics; population genetics; vancomycin
4.  The Columbian Exchange as a source of adaptive introgression in human populations 
Biology Direct  2016;11:17.
Background
The term “Columbian Exchange” refers to the massive transfer of life between the Afro-Eurasian and American hemispheres that was precipitated by Columbus’ voyage to the New World. The Columbian Exchange is widely appreciated by historians, social scientists and economists as a major turning point that had profound and lasting effects on the trajectory of human history and development.
Presentation of the hypothesis
I propose that the Columbian Exchange should also be appreciated by biologists for its role in the creation of novel human genomes that have been shaped by rapid adaptive evolution. Specifically, I hypothesize that the process of human genome evolution stimulated by the Columbian Exchange was based in part on selective sweeps of introgressed haplotypes from ancestral populations, many of which possessed pre-evolved adaptive utility based on regional-specific fitness and health effects.
Testing the hypothesis
Testing of this hypothesis will require comparative analysis of genome sequences from putative ancestral source populations, with genomes from modern admixed populations, in order to identify ancestry-specific introgressed haplotypes that exist at higher frequencies in admixed populations than can be expected by chance alone. Investigation of such ancestry-enriched genomic regions can be used to provide clues as to the functional roles of the genes therein and the selective forces that have acted to increase their frequency in the population.
Implications of the hypothesis
Critical interrogation of this hypothesis could serve to underscore the important role of introgression as a source of adaptive alleles and as a driver of evolutionary change, and it would highlight the role of admixture in facilitating rapid human evolution.
Reviewers
This article was reviewed by Frank Eisenhaber, Lakshminarayan Iyer and Igor B. Rogozin
doi:10.1186/s13062-016-0121-x
PMCID: PMC4818900  PMID: 27038633
Columbian exchange; Human evolution; Adaptive evolution; Natural selection; Selective sweep; Genetic admixture; Introgression; Haplotype; Allele
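The test proposed in entry 4, asking whether an ancestry-specific haplotype is more frequent in an admixed population than chance would predict, can be illustrated with a simple binomial tail probability. This is only a minimal sketch: the article does not name a statistic, and the function name, the sample counts and the 0.13 genome-wide ancestry fraction below are hypothetical.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of seeing k or more successes in n draws with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical locus: 60 of 200 sampled chromosomes carry the ancestry of interest,
# while the assumed genome-wide fraction of that ancestry is 0.13.
p_value = binomial_upper_tail(60, 200, 0.13)
print(f"P(X >= 60) = {p_value:.3g}")
```

A small tail probability would flag the locus as ancestry-enriched. A real analysis would also need to correct for the number of loci tested and for drift since admixture.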
5.  Patterns of Transposable Element Expression and Insertion in Cancer 
Human transposable element (TE) activity in somatic tissues causes mutations that can contribute to tumorigenesis. Indeed, TE insertion mutations have been implicated in the etiology of a number of different cancer types. Nevertheless, the full extent of somatic TE activity, along with its relationship to tumorigenesis, has yet to be fully explored. Recent developments in bioinformatics software make it possible to analyze TE expression levels and TE insertional activity directly from transcriptome (RNA-seq) and whole genome (DNA-seq) next-generation sequence data. We applied these new sequence analysis techniques to matched normal and primary tumor patient samples from the Cancer Genome Atlas (TCGA) in order to analyze the patterns of TE expression and insertion for three cancer types: breast invasive carcinoma, head and neck squamous cell carcinoma, and lung adenocarcinoma. Our analysis focused on the three most abundant families of active human TEs: Alu, SVA, and L1. We found evidence for high levels of somatic TE activity for these three families in normal and cancer samples across diverse tissue types. Abundant transcripts for all three TE families were detected in both normal and cancer tissues along with an average of ~80 unique TE insertions per individual patient/tissue. We observed an increase in L1 transcript expression and L1 insertional activity in primary tumor samples for all three cancer types. Tumor-specific TE insertions are enriched for private mutations, consistent with a potentially causal role in tumorigenesis. We used genome feature analysis to investigate two specific cases of putative cancer-causing TE mutations in further detail. An Alu insertion in an upstream enhancer of the CBL tumor suppressor gene is associated with down-regulation of the gene in a single breast cancer patient, and an L1 insertion in the first exon of the BAALC gene also disrupts its expression in head and neck squamous cell carcinoma.
Our results are consistent with widespread somatic activity of human TEs leading to numerous insertion mutations that can contribute to tumorigenesis in a variety of tissues.
doi:10.3389/fmolb.2016.00076
PMCID: PMC5110550  PMID: 27900322
LINE-1; L1; Alu; SVA; retrotransposons; bioinformatics; mutation; tumorigenesis
6.  Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community 
mBio  2016;7(2):e02234-15.
ABSTRACT
Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome.
IMPORTANCE
Lateral gene transfer (LGT), along with positive selection and gene duplication, are the three main mechanisms that drive adaptive evolution of microbial genomes and communities, but their relative importance is unclear. Some recent studies suggested that LGT is a major adaptive mechanism for microbial populations in response to changing environments, and hence, it could also be critical in shaping microbial community structure. However, direct evidence of LGT and its rates in extant natural microbial communities in response to changing environments is still lacking. Our results presented in this study provide explicit evidence that LGT played a crucial role in driving the evolution of a groundwater microbial community in response to extreme heavy metal contamination. It appears that acquisition of genes critical for survival, growth, and reproduction via LGT is the most rapid and effective way to enable microorganisms and associated microbial communities to quickly adapt to abrupt harsh environmental stresses.
doi:10.1128/mBio.02234-15
PMCID: PMC4817265  PMID: 27048805
7.  Transposable element polymorphisms recapitulate human evolution 
Mobile DNA  2015;6:21.
Background
The human genome contains several active families of transposable elements (TE): Alu, L1 and SVA. Germline transposition of these elements can lead to polymorphic TE (polyTE) loci that differ between individuals with respect to the presence/absence of TE insertions. Limited sets of such polyTE loci have proven to be useful as markers of ancestry in human population genetic studies, but until this time it has not been possible to analyze the full genomic complement of TE polymorphisms in this way.
Results
For the first time here, we have performed a human population genetic analysis based on a genome-wide polyTE data set consisting of 16,192 loci genotyped in 2,504 individuals across 26 human populations. PolyTEs are found at very low frequencies: >93% of loci show <5% allele frequency, consistent with the deleteriousness of TE insertions. Nevertheless, polyTEs do show substantial geographic differentiation, with numerous group-specific polymorphic insertions. African populations have the highest numbers of polyTEs and show the highest levels of polyTE genetic diversity; Alu is the most numerous and the most diverse polyTE family. PolyTE genotypes were used to compute allele sharing distances between individuals and to relate them within and between human populations. Populations and continental groups show high coherence based on individuals’ polyTE genotypes, and human evolutionary relationships revealed by these genotypes are consistent with those seen for SNP-based genetic distances. The patterns of genetic diversity encoded by TE polymorphisms recapitulate broad patterns of human evolution and migration over the last 60,000–100,000 years. The utility of polyTEs as ancestry informative markers is further underscored by their ability to accurately predict both ancestry and admixture at the continental level. A genome-wide list of polyTE loci, along with their population group-specific allele frequencies and FST values, is provided as a resource for investigators who wish to develop panels of TE-based ancestry markers.
Conclusions
The genetic diversity represented by TE polymorphisms reflects known patterns of human evolution, and ensembles of polyTE loci are suitable for both ancestry and admixture analyses. The patterns of polyTE allelic diversity suggest the possibility that there may be a connection between TE-based genetic divergence and population-specific phenotypic differences.
Electronic supplementary material
The online version of this article (doi:10.1186/s13100-015-0052-6) contains supplementary material, which is available to authorized users.
doi:10.1186/s13100-015-0052-6
PMCID: PMC4647816  PMID: 26579215
Transposable elements; Polymorphism; Population genetics; Human ancestry; Admixture; Ancestry informative markers; Phylogenetics; Alu; L1; SVA
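Entry 7 relates individuals through allele sharing distances computed from polyTE genotypes. The abstract does not give the exact formula, so the following is a minimal sketch of one common definition, assuming biallelic loci coded as the number of insertion alleles (0, 1 or 2); the function name and the example genotypes are hypothetical.

```python
from typing import Sequence

def allele_sharing_distance(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Mean allele sharing distance over loci, for insertion-allele counts 0, 1 or 2."""
    if len(g1) != len(g2) or len(g1) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # At a biallelic locus two individuals share 2 - |a - b| alleles (identity by state),
    # so the per-locus distance is |a - b| / 2, ranging from 0 to 1.
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))

# Hypothetical genotypes for two individuals at four polyTE loci.
individual_a = [0, 1, 2, 0]
individual_b = [0, 1, 0, 1]
print(allele_sharing_distance(individual_a, individual_b))
```

Computing this value for every pair of individuals yields a distance matrix that could then be passed to clustering or tree-building methods.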
8.  Genome Sequences of Vibrio navarrensis, a Potential Human Pathogen 
Genome Announcements  2014;2(6):e01188-14.
Vibrio navarrensis is an aquatic bacterium recently shown to be associated with human illness. We report the first genome sequences of three V. navarrensis strains obtained from clinical and environmental sources. Preliminary analyses of the sequences reveal that V. navarrensis contains genes commonly associated with virulence in other human pathogens.
doi:10.1128/genomeA.01188-14
PMCID: PMC4239357  PMID: 25414502
9.  Genome Sequence-Based Discriminator for Vancomycin-Intermediate Staphylococcus aureus 
Journal of Bacteriology  2014;196(5):940-948.
Vancomycin is the mainstay of treatment for patients with Staphylococcus aureus infections, and reduced susceptibility to vancomycin is becoming increasingly common. Accordingly, the development of rapid and accurate assays for the diagnosis of vancomycin-intermediate S. aureus (VISA) will be critical. We developed and applied a genome-based machine-learning approach for discrimination between VISA and vancomycin-susceptible S. aureus (VSSA) using 25 whole-genome sequences. The resulting machine-learning model, based on 14 gene parameters, including 3 molecular typing markers and 11 genes implicated in reduced vancomycin susceptibility, is able to unambiguously distinguish between the VISA and VSSA isolates analyzed here despite the fact that they do not form evolutionarily distinct groups. As such, the model is able to discriminate based on specific genomic markers of antibiotic susceptibility rather than overall sequence relatedness. Subsequent evaluation of the model using leave-one-out validation yielded a classification accuracy of 84%. The machine-learning approach described here provides a generalized framework for the application of genome sequence analysis to the classification of bacteria that differ with respect to clinically relevant phenotypes and should be particularly useful in defining the genomic features that underlie antibiotic resistance.
doi:10.1128/JB.01410-13
PMCID: PMC3957707  PMID: 24363339
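The leave-one-out validation mentioned in entry 9 holds out each isolate in turn, trains on the rest, and scores the held-out prediction. The sketch below shows that procedure only; the abstract does not specify the machine-learning model, so a nearest-centroid classifier is used here as a stand-in, and the two-feature data set is invented for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]

def nearest_centroid(train: List[Sample], x: Sequence[float]) -> str:
    """Stand-in classifier: return the label whose mean feature vector is closest to x."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label: str) -> float:
        return sum((s / counts[label] - xi) ** 2 for s, xi in zip(sums[label], x))

    return min(sums, key=squared_distance)

def leave_one_out_accuracy(
    data: List[Sample],
    classify: Callable[[List[Sample], Sequence[float]], str] = nearest_centroid,
) -> float:
    """Fraction of samples predicted correctly when each is held out of training in turn."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the held-out one
        if classify(train, features) == label:
            correct += 1
    return correct / len(data)

# Hypothetical two-feature data set; real inputs would be the gene-based parameters.
toy = [
    ([0.0, 0.1], "VSSA"), ([0.2, 0.0], "VSSA"), ([0.1, 0.2], "VSSA"),
    ([1.0, 0.9], "VISA"), ([0.9, 1.1], "VISA"), ([1.1, 1.0], "VISA"),
]
print(leave_one_out_accuracy(toy))
```

With 25 isolates, an accuracy of 84% would correspond to 21 correct held-out predictions.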
10.  Transcriptional profiling of interleukin-2-primed human adipose derived mesenchymal stem cells revealed dramatic changes in stem cells response imposed by replicative senescence 
Oncotarget  2015;6(20):17938-17957.
Inflammation is a double-edged sword with both detrimental and beneficial consequences. Understanding of the mechanisms of crosstalk between the inflammatory milieu and human adult mesenchymal stem cells is an important basis for clinical efforts. Here, we investigate changes in the transcriptional response of human adipose-derived stem cells to physiologically relevant levels of IL-2 (IL-2 priming) upon replicative senescence. Our data suggest that replicative senescence might dramatically impede human mesenchymal stem cell (MSC) function via global transcriptional deregulation in response to IL-2. We uncovered a novel senescence-associated transcriptional signature in human adipose-derived MSCs (hADSCs) after exposure to a pro-inflammatory environment: significant enhancement of the expression of the genes encoding potent growth factors and cytokines with anti-inflammatory and migration-promoting properties, as well as genes encoding angiogenic and anti-apoptotic promoting factors, all of which could participate in the establishment of a unique microenvironment. We observed transcriptional up-regulation of critical components of the nitric oxide synthase pathway (iNOS) in hADSCs upon replicative senescence, suggesting that senescent stem cells can acquire metastasis-promoting properties via stem cell-mediated immunosuppression. Our study highlights the importance of age as a factor when designing cell-based or pharmacological therapies for older patients and predicts measurable biomarkers characteristic of an environment that is conducive to cancer cell invasiveness and metastasis.
PMCID: PMC4627227  PMID: 26255627
mesenchymal stem cells; IL-2; aging; cancer; immunomodulation
11.  Genomic Basis of a Polyagglutinating Isolate of Neisseria meningitidis 
Journal of Bacteriology  2012;194(20):5649-5656.
Containment strategies for outbreaks of invasive Neisseria meningitidis disease are informed by serogroup assays that characterize the polysaccharide capsule. We sought to uncover the genomic basis of conflicting serogroup assay results for an isolate (M16917) from a patient with acute meningococcal disease. To this end, we characterized the complete genome sequence of the M16917 isolate and performed a variety of comparative sequence analyses against N. meningitidis reference genome sequences of known serogroups. Multilocus sequence typing and whole-genome sequence comparison revealed that M16917 is a member of the ST-11 sequence group, which is most often associated with serogroup C. However, sequence similarity comparisons and phylogenetic analysis showed that the serogroup diagnostic capsule polymerase gene (synD) of M16917 belongs to serogroup B. These results suggest that a capsule-switching event occurred based on homologous recombination at or around the capsule locus of M16917. Detailed analysis of this locus uncovered the locations of recombination breakpoints in the M16917 genome sequence, which led to the introduction of an ∼2-kb serogroup B sequence cassette into the serogroup C genomic background. Since there is no currently available vaccine for serogroup B strains of N. meningitidis, this kind of capsule-switching event could have public health relevance as a vaccine escape mutant.
doi:10.1128/JB.06604-11
PMCID: PMC3458693  PMID: 22904290
12.  Inhibition of activated pericentromeric SINE/Alu repeat transcription in senescent human adult stem cells reinstates self-renewal 
Cell Cycle  2011;10(17):3016-3030.
Cellular aging is linked to deficiencies in efficient repair of DNA double strand breaks and authentic genome maintenance at the chromatin level. Aging poses a significant threat to adult stem cell function by triggering persistent DNA damage and ultimately cellular senescence. Senescence is often considered to be an irreversible process. Moreover, critical genomic regions engaged in persistent DNA damage accumulation are unknown. Here we report that 65% of naturally occurring repairable DNA damage in self-renewing adult stem cells occurs within transposable elements. Upregulation of Alu retrotransposon transcription upon ex vivo aging causes nuclear cytotoxicity associated with the formation of persistent DNA damage foci and loss of efficient DNA repair in pericentric chromatin. This occurs due to a failure to recruit condensin I and cohesin complexes. Our results demonstrate that the cytotoxicity of induced Alu repeats is functionally relevant for human adult stem cell aging. Stable suppression of Alu transcription can reverse the senescent phenotype, reinstating the cells' self-renewing properties and increasing their plasticity by altering so-called “master” pluripotency regulators.
doi:10.4161/cc.10.17.17543
PMCID: PMC3218602  PMID: 21862875
adult stem cells; senescence; SINE/Alu transposons; DNA damage; H2AX; ChIP-seq; cohesin; condensin; PML body; induced pluripotency
13.  Epigenetics Components of Aging in the Central Nervous System 
Neurotherapeutics  2013;10(4):647-663.
This review highlights recent discoveries that have shaped the emerging viewpoints in the field of epigenetic influences in the central nervous system (CNS), focusing on the following questions: i) How is the CNS shaped during development when precursor cells transition into morphologically and molecularly distinct cell types, and is this event driven by epigenetic alterations?; ii) How do epigenetic pathways control CNS function?; iii) What happens to “epigenetic memory” during aging processes, and do these alterations cause CNS dysfunction?; iv) Can one restore normal CNS function by manipulating the epigenome using pharmacologic agents, and will this ameliorate aging-related neurodegeneration? These and other still unanswered questions remain critical to understanding the impact of multifaceted epigenetic machinery on the age-related dysfunction of the CNS.
Electronic supplementary material
The online version of this article (doi:10.1007/s13311-013-0229-y) contains supplementary material, which is available to authorized users.
doi:10.1007/s13311-013-0229-y
PMCID: PMC3805869  PMID: 24132650
Epigenetics; CNS; chromatin; neurodegeneration; aging; histone code; HDAC; DNA methylation
14.  On the presence and role of human gene-body DNA methylation 
Oncotarget  2012;3(4):462-474.
DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes.
PMCID: PMC3380580  PMID: 22577155
genome-wide methylation; epigenetic mark; intragenic transcription; methylating enzyme complexes
15.  Do human transposable element small RNAs serve primarily as genome defenders or genome regulators? 
Mobile Genetic Elements  2012;2(1):19-25.
It is currently thought that small RNA (sRNA) based repression mechanisms are primarily employed to mitigate the mutagenic threat posed by the activity of transposable elements (TEs). This can be achieved by the sRNA guided processing of TE transcripts via Dicer-dependent (e.g., siRNA) or Dicer-independent (e.g., piRNA) mechanisms. For example, potentially active human L1 elements are silenced by mRNA cleavage induced by element encoded siRNAs, leading to a negative correlation between element mRNA and siRNA levels. On the other hand, there is emerging evidence that TE derived sRNAs can also be used to regulate the host genome. Here, we evaluated these two hypotheses for human TEs by comparing the levels of TE derived mRNA and TE sRNA across six tissues. The genome defense hypothesis predicts a negative correlation between TE mRNA and TE sRNA levels, whereas the genome regulatory hypothesis predicts a positive correlation. On average, TE mRNA and TE sRNA levels are positively correlated across human tissues. These correlations are higher than seen for human genes or for randomly permuted control data sets. Overall, Alu subfamilies show the highest positive correlations of element mRNA and sRNA levels across tissues, although a few of the youngest, and potentially most active, Alu subfamilies do show negative correlations. Thus, Alu derived sRNAs may be related to both genome regulation and genome defense. These results are inconsistent with a simple model whereby TE derived sRNAs reduce levels of standing TE mRNA via transcript cleavage, and suggest that human cells efficiently process TE transcripts into sRNA based on the available message levels. This may point to a widespread role for processed TE transcripts in genome regulation or to alternative roles of TE-to-sRNA processing including the mitigation of TE transcript cytotoxicity.
doi:10.4161/mge.19031
PMCID: PMC3383446  PMID: 22754749
RNA interference; RNA processing; gene expression; genome regulation; small RNA
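The comparison in entry 15 comes down to the sign of the correlation between TE mRNA and TE sRNA levels across six tissues, judged against permuted controls. The sketch below illustrates that logic under stated assumptions: the abstract does not say which correlation coefficient or permutation scheme was used, so Pearson correlation and an exhaustive permutation of tissue labels are choices made here, and the expression values are invented.

```python
from itertools import permutations
from math import sqrt
from typing import Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def permutation_p_value(x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of all orderings of y that correlate with x at least as strongly as observed."""
    observed = pearson(x, y)
    orderings = list(permutations(y))  # 720 orderings for six tissues
    hits = sum(1 for p in orderings if pearson(x, p) >= observed - 1e-12)
    return hits / len(orderings)

# Hypothetical expression levels for one TE subfamily in six tissues.
te_mrna = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
te_srna = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]

r = pearson(te_mrna, te_srna)
p = permutation_p_value(te_mrna, te_srna)
# A positive r fits the genome regulation prediction, a negative r the genome defense one.
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

With only six tissues the smallest attainable one-sided p-value is 1/720, which limits how much any single subfamily can show on its own.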
16.  Origin and evolution of the cystic fibrosis transmembrane regulator protein R domain 
Gene  2013;523(2):137-146.
The Cystic Fibrosis Transmembrane Conductance Regulator protein (CFTR) is a member of the ABC transporter superfamily. CFTR is distinguished from all other members of this superfamily by its status as an ion channel as well as the presence of its unique regulatory (R) domain. We investigated the origin and subsequent evolution of the R domain along the CFTR evolutionary lineage. The R domain protein coding sequence originated via the loss of a splice donor site at the 3′ end of exon 14, leading to the subsequent read-through and capture of formerly intronic sequence as novel coding sequence. Inclusion of the remaining part of the R domain coding sequence in the CFTR transcript involved a lineage-specific gain of exonic sequence with no homology to protein coding sequences outside of CFTR and loss of two exons conserved among ABC family members. These events occurred at the base of the Gnathostome evolutionary lineage ~550–650 million years ago. The apparent origination of the R domain de novo from previously non-coding sequence is consistent with its lack of sequence similarity to other domains as well as its intrinsically disordered structure, which has important implications for its function. In particular, this lack of structure may provide for a dynamic and inducible regulatory activity based on transient physical interactions with more structured domains of the protein. Since its acquisition along the CFTR evolutionary lineage, the R domain has evolved more rapidly than any other CFTR domain; however, there is no evidence for positive (adaptive) selection in the evolution of the domain. The R domain does show a distinct pattern of relative evolutionary rates compared to other CFTR domains, which sheds additional light on the connection between its function and evolution. 
The regulatory function of the R domain is dependent upon a fairly small number of sites that are subject to phosphorylation, and these sites were fixed very early in R domain evolution and have remained largely invariant since that time. In contrast, the rest of the R domain has been free to drift in sequence space leading to a more star-like phylogeny than seen for the other CFTR domains. The case of the R domain suggests that domain acquisition via the de novo creation of coding sequence, and the novel functional utility that such an event would seemingly entail, can be one route by which neo-functionalization is favored to occur.
doi:10.1016/j.gene.2013.02.050
PMCID: PMC3793851  PMID: 23578801
Cystic fibrosis; R domain; Molecular evolution; Coding sequence; Neo-functionalization
17.  Flow-dependent epigenetic DNA methylation regulates endothelial gene expression and atherosclerosis 
The Journal of Clinical Investigation  2014;124(7):3187-3199.
In atherosclerosis, plaques preferentially develop in arterial regions of disturbed blood flow (d-flow), which alters endothelial gene expression and function. Here, we determined that d-flow regulates genome-wide DNA methylation patterns in a DNA methyltransferase–dependent (DNMT-dependent) manner. Induction of d-flow by partial carotid ligation surgery in a murine model induced DNMT1 in arterial endothelium. In cultured endothelial cells, DNMT1 was enhanced by oscillatory shear stress (OS), and reduction of DNMT with either the inhibitor 5-aza-2′-deoxycytidine (5Aza) or siRNA markedly reduced OS-induced endothelial inflammation. Moreover, administration of 5Aza reduced lesion formation in 2 mouse models of atherosclerosis. Using both reduced representation bisulfite sequencing (RRBS) and microarray, we determined that d-flow in the carotid artery resulted in hypermethylation within the promoters of 11 mechanosensitive genes and that 5Aza treatment restored normal methylation patterns. Of the identified genes, HoxA5 and Klf3 encode transcription factors that contain cAMP response elements, suggesting that the methylation status of these loci could serve as a mechanosensitive master switch in gene expression. Together, our results demonstrate that d-flow controls epigenomic DNA methylation patterns in a DNMT-dependent manner, which in turn alters endothelial gene expression and induces atherosclerosis.
doi:10.1172/JCI74792
PMCID: PMC4071393  PMID: 24865430
18.  Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression 
Mobile DNA  2014;5:14.
Background
Mammalian-wide interspersed repeats (MIRs) are the most ancient family of transposable elements (TEs) in the human genome. The deep conservation of MIRs initially suggested the possibility that they had been exapted to play functional roles for their host genomes. MIRs also happen to be the only TEs whose presence in and around human genes is positively correlated with tissue-specific gene expression. Similar associations of enhancer prevalence within genes and tissue-specific expression, along with MIRs’ previous implication as providing regulatory sequences, suggested a possible link between MIRs and enhancers.
Results
To test the possibility that MIRs contribute functional enhancers to the human genome, we evaluated the relationship between MIRs and human tissue-specific enhancers in terms of genomic location, chromatin environment, regulatory function, and mechanistic attributes. This analysis revealed MIRs to be highly concentrated in enhancers of the K562 and HeLa human cell-types. Significantly more enhancers were found to be linked to MIRs than would be expected by chance, and putative MIR-derived enhancers are characterized by a chromatin environment highly similar to that of canonical enhancers. MIR-derived enhancers show strong associations with gene expression levels, tissue-specific gene expression and tissue-specific cellular functions, including a number of biological processes related to erythropoiesis. MIR-derived enhancers were found to be a rich source of transcription factor binding sites, underscoring one possible mechanistic route for the element sequences’ co-option as enhancers. There is also tentative evidence to suggest that MIR-enhancer function is related to the transcriptional activity of non-coding RNAs.
Conclusions
Taken together, these data reveal enhancers to be an important cis-regulatory platform from which MIRs can exercise a regulatory function in the human genome and help to resolve a long-standing conundrum as to the reason for MIRs’ deep evolutionary conservation.
doi:10.1186/1759-8753-5-14
PMCID: PMC4090950  PMID: 25018785
19.  Repetitive DNA elements, nucleosome binding and human gene expression 
Gene  2009;436(1-2):12-22.
We evaluated the epigenetic contributions of repetitive DNA elements to human gene regulation. Human proximal promoter sequences show distinct distributions of transposable elements (TEs) and simple sequence repeats (SSRs). TEs are enriched distal from transcriptional start sites (TSSs), and their frequency decreases closer to TSSs, being largely absent from the core promoter region. SSRs, on the other hand, are found at low frequency distal to the TSS and then increase in frequency starting ∼150 bp upstream of the TSS. The peak of SSR density is centered around the -35 bp position where the basal transcriptional machinery assembles. These trends in repetitive sequence distribution are strongly correlated, positively for TEs and negatively for SSRs, with relative nucleosome binding affinities along the promoters. Nucleosomes bind with highest probability distal from the TSS, and the nucleosome binding affinity steadily decreases, reaching its nadir just upstream of the TSS at the same point where SSR frequency is at its highest. Promoters that are enriched for TEs are more highly and broadly expressed, on average, than promoters that are devoid of TEs. In addition, promoters that have similar repetitive DNA profiles regulate genes that have more similar expression patterns and encode proteins with more similar functions than promoters that differ with respect to their repetitive DNA. Furthermore, distinct repetitive DNA promoter profiles are correlated with tissue-specific patterns of expression. These observations indicate that repetitive DNA elements mediate chromatin accessibility in proximal promoter regions and that the repeat content of promoters is relevant to both gene expression and function.
doi:10.1016/j.gene.2009.01.013
PMCID: PMC2921533  PMID: 19393174
20.  Transcriptional Activity, Chromosomal Distribution and Expression Effects of Transposable Elements in Coffea Genomes 
PLoS ONE  2013;8(11):e78931.
Plant genomes are massively invaded by transposable elements (TEs), many of which are located near host genes and can thus impact gene expression. In flowering plants, TE expression can be activated (de-repressed) under certain stressful conditions, both biotic and abiotic, as well as by genome stress caused by hybridization. In this study, we examined the effects of these stress agents on TE expression in two diploid species of coffee, Coffea canephora and C. eugenioides, and their allotetraploid hybrid C. arabica. We also explored the relationship of TE repression mechanisms to host gene regulation via the effects of exonized TE sequences. Similar to what has been seen for other plants, overall TE expression levels are low in Coffea plant cultivars, consistent with the existence of effective TE repression mechanisms. TE expression patterns are highly dynamic across the species and conditions assayed here and are unrelated to their classification at the level of TE class or family. In contrast to previous results, cell culture conditions per se do not lead to the de-repression of TE expression in C. arabica. Results obtained here indicate that differing plant drought stress levels relate strongly to TE repression mechanisms. TEs tend to be expressed at significantly higher levels in non-irrigated samples for the drought-tolerant cultivars, but in drought-sensitive cultivars the opposite pattern was shown, with irrigated samples showing significantly higher TE expression. Thus, TE genome repression mechanisms may be finely tuned to the ideal growth and/or regulatory conditions of the specific plant cultivars in which they are active. Analysis of TE expression levels in cell culture conditions underscored the importance of nonsense-mediated mRNA decay (NMD) pathways in the repression of Coffea TEs. These same NMD mechanisms can also regulate plant host gene expression via the repression of genes that bear exonized TE sequences.
doi:10.1371/journal.pone.0078931
PMCID: PMC3823963  PMID: 24244387
21.  Co-evolutionary Rates of Functionally Related Yeast Genes 
Evolutionary knowledge is often used to facilitate computational attempts at gene function prediction. One rich source of evolutionary information is the relative rates of gene sequence divergence, and in this report we explore the connection between gene evolutionary rates and function. We performed a genome-scale evaluation of the relationship between evolutionary rates and functional annotations for the yeast Saccharomyces cerevisiae. Non-synonymous (dN) and synonymous (dS) substitution rates were calculated for 1,095 orthologous gene sets common to S. cerevisiae and six other closely related yeast species. Differences in evolutionary rates between pairs of genes (ΔdN & ΔdS) were then compared to their functional similarities (sGO), which were measured using Gene Ontology (GO) annotations. Substantial and statistically significant correlations were found between ΔdN and sGO, whereas there is no apparent relationship between ΔdS and sGO. These results are consistent with a mode of action for natural selection that is based on similar rates of elimination of deleterious protein coding sequence variants for functionally related genes. The connection between gene evolutionary rates and function was stronger than seen for phylogenetic profiles, which have previously been employed to inform functional inference. The co-evolution of functionally related yeast genes points to the relevance of specific function for the efficacy of natural selection and underscores the utility of gene evolutionary rates for functional predictions.
PMCID: PMC2674680  PMID: 18345352
Functional inference; Co-evolution; natural selection; genome evolution; gene ontology
23.  Genome Sequences for Six Rhodanobacter Strains, Isolated from Soils and the Terrestrial Subsurface, with Variable Denitrification Capabilities 
Journal of Bacteriology  2012;194(16):4461-4462.
We report the first genome sequences for six strains of Rhodanobacter species isolated from a variety of soil and subsurface environments. Three of these strains are capable of complete denitrification and three others are not. However, all six strains contain most of the genes required for the respiration of nitrate to gaseous nitrogen. The nondenitrifying members of the genus lack only the gene for nitrate reduction, the first step in the full denitrification pathway. The data suggest that the environmental role of bacteria from the genus Rhodanobacter should be reevaluated.
doi:10.1128/JB.00871-12
PMCID: PMC3416251  PMID: 22843592
24.  Depletion of nuclear histone H2A variants is associated with chronic DNA damage signaling upon drug-evoked senescence of human somatic cells 
Aging (Albany NY)  2012;4(11):823-842.
Cellular senescence is associated with global chromatin changes, altered gene expression, and activation of chronic DNA damage signaling. These events ultimately lead to morphological and physiological transformations in primary cells. In this study, we show that chronic DNA damage signals caused by genotoxic stress impact the expression of histone H2A family members and lead to their depletion in the nuclei of senescent human fibroblasts. Our data reinforce the hypothesis that progressive chromatin destabilization may lead to the loss of epigenetic information and impaired cellular function associated with chronic DNA damage upon drug-evoked senescence. We propose that changes in histone biosynthesis and chromatin assembly may directly contribute to cellular aging. In addition, we outline a method that allows for quantitative and unbiased measurement of these changes.
PMCID: PMC3560435  PMID: 23235539
γH2A.X; DNA damage; senescence; LS-MS analysis; quantitative proteomic; SRM; histone H2A family; chromatin; DNA repair; HCA2 primary fibroblasts; epigenetics
25.  Cell type-specific termination of transcription by transposable element sequences 
Mobile DNA  2012;3:15.
Background
Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question.
Results
Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3′ UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS.
Conclusions
TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
doi:10.1186/1759-8753-3-15
PMCID: PMC3517506  PMID: 23020800
Polyadenylation; Transcription termination; Orientation bias; Gene regulation
