Results 1-25 (35)
1.  Deep sequencing of HPV16 genomes: A new high-throughput tool for exploring the carcinogenicity and natural history of HPV16 infection 
Papillomavirus Research  2015;1:3-11.
For unknown reasons, there is huge variability in risk conferred by different HPV types and, remarkably, strong differences even between closely related variant lineages within each type. HPV16 is a uniquely powerful carcinogenic type, causing approximately half of cervical cancer and most other HPV-related cancers. To permit the large-scale study of HPV genome variability and precancer/cancer, starting with HPV16 and cervical cancer, we developed a high-throughput next-generation sequencing (NGS) whole-genome method. We designed a custom HPV16 AmpliSeq™ panel that generated 47 overlapping amplicons covering 99% of the genome sequenced on the Ion Torrent Proton platform. After validating with Sanger, the current “gold standard” of sequencing, in 89 specimens with concordance of 99.9%, we used our NGS method and custom annotation pipeline to sequence 796 HPV16-positive exfoliated cervical cell specimens. The median completion rate per sample was 98.0%.
Our method enabled us to discover novel SNPs, large contiguous deletions suggestive of viral integration (OR of 27.3, 95% CI 3.3–222, P=0.002), and the sensitive detection of variant lineage coinfections. This method represents an innovative high-throughput, ultra-deep coverage technique for HPV genomic sequencing, which, in turn, enables the investigation of the role of genetic variation in HPV epidemiology and carcinogenesis.
PMCID: PMC4669577  PMID: 26645052
HPV16; HPV epidemiology; HPV genomics
2.  Role of the ESCRT Complexes in Telomere Biology 
mBio  2016;7(6):e01793-16.
Eukaryotic chromosomal ends are protected by telomeres from fusion, degradation, and unwanted double-strand break repair events. Therefore, telomeres preserve genome stability and integrity. Telomere length can be maintained by telomerase, which is expressed in most human primary tumors but is not expressed in the majority of somatic cells. Thus, telomerase may be a highly relevant anticancer drug target. Genome-wide studies in the yeast Saccharomyces cerevisiae identified a set of genes associated with telomere length maintenance (TLM genes). Among the tlm mutants with short telomeres, we found a strong enrichment for those affecting vacuolar and endosomal traffic (particularly the endosomal sorting complex required for transport [ESCRT] pathway). Here, we present our results from investigating the surprising link between telomere shortening and the ESCRT machinery. Our data show that the whole ESCRT system is required to safeguard proper telomere length maintenance. We propose a model of impaired end resection resulting in too little telomeric overhang, such that Cdc13 binding is prevented, precluding either telomerase recruitment or telomeric overhang protection.
Telomeres are the ends of eukaryotic chromosomes. They are necessary for the proper replication of the genome and protect the chromosomes from degradation. In a large-scale systematic screen for mutants that affect telomere length in yeast, we found that mutations in any of the genes encoding the ESCRT complexes, required for the formation of transport vesicles within the cell, cause telomere shortening. We carried out an analysis of the mechanisms disrupted in these mutants and found that they are defective for the ability to elongate short telomeres, probably due to faulty end processing. We discuss the significance of these findings and how they could be relevant to anticancer therapies.
PMCID: PMC5101353  PMID: 27834202
3.  AVIA v2.0: annotation, visualization and impact analysis of genomic variants and genes 
Bioinformatics  2015;31(16):2748-2750.
Summary: As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA’s improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants.
Availability and implementation: AVIA is freely available through the web at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4528632  PMID: 25861966
4.  Retrovirus Integration Database (RID): a public database for retroviral insertion sites into host genomes 
Retrovirology  2016;13:47.
The NCI Retrovirus Integration Database is a MySQL-based relational database created for storing and retrieving comprehensive information about retroviral integration sites, primarily, but not exclusively, HIV-1. The database is accessible to the public for submission or extraction of data originating from experiments aimed at collecting information related to retroviral integration sites including: the site of integration into the host genome, the virus family and subtype, the origin of the sample, gene exons/introns associated with integration, and proviral orientation. Information about the references from which the data were collected is also stored in the database. Tools are built into the website that can be used to map the integration sites to the UCSC Genome Browser, to plot the integration site patterns on a chromosome, and to display provirus LTRs in their inserted genome sequence. The website is robust, user friendly, and allows users to query the database and analyze the data dynamically. Availability:; or
PMCID: PMC4932684  PMID: 27377064
Retrovirus; HIV; Integration site; Database; Integration site assay; ISA; Expanded clones
5.  Can Structural Features of Kinase Receptors Provide Clues on Selectivity and Inhibition?: A Molecular Modeling Study 
Cancer is a complex disease resulting from the uncontrolled proliferation of cell signaling events. Protein kinases have been identified as central molecules that participate overwhelmingly in oncogenic events, thus becoming key targets for anticancer drugs. A majority of studies converged on the idea that ligand-binding pockets of kinases retain clues to the inhibiting abilities and cross-reacting tendencies of inhibitor drugs. Even though these ideas are critical for drug discovery, validating them using experiments is not only difficult, but in some cases infeasible. To overcome these limitations and to test these ideas at the molecular level, we present here the results of receptor-focused in-silico docking of nine marketed drugs to 19 different wild-type and mutated kinases chosen from a wide range of families. This investigation highlights the need for using relevant models to explain the correct inhibition trends, and the results are used to make predictions that might be able to influence future experiments. Our simulation studies correctly predict the primary targets for each drug studied in the majority of cases, and our results agree with the existing findings. Our study shows that the conformations a given receptor acquires during kinase activation, and their micro-environment, define the ligand partners. Type II drugs display high compatibility and selectivity for DFG-out kinase conformations. On the other hand, Type I drugs are less selective and show binding preferences for both the open and closed forms of selected kinases. Using this receptor-focused approach, it is possible to capture the observed fold change in binding affinities between the wild-type and disease-centric mutations in ABL kinase for Imatinib and the second-generation ABL drugs. The effects of mutation are also investigated for two other systems, EGFR and B-Raf.
Finally, by including pathway information in the design it is possible to model kinase inhibitors with potentially fewer side-effects.
PMCID: PMC4361267  PMID: 25635590
Kinase drugs; Docking; Activity; Selectivity; Type I/II drugs; Mutation; Cross-reactivity; Active site; Biological pathways
6.  Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast 
Nucleic Acids Research  2016;44(10):e93.
Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments.
PMCID: PMC4889924  PMID: 26908654
7.  The Replisome-Coupled E3 Ubiquitin Ligase Rtt101Mms22 Counteracts Mrc1 Function to Tolerate Genotoxic Stress 
PLoS Genetics  2016;12(2):e1005843.
Faithful DNA replication and repair requires the activity of cullin 4-based E3 ubiquitin ligases (CRL4), but the underlying mechanisms remain poorly understood. The budding yeast Cul4 homologue, Rtt101, in complex with the linker Mms1 and the putative substrate adaptor Mms22 promotes progression of replication forks through damaged DNA. Here we characterized the interactome of Mms22 and found that the Rtt101Mms22 ligase associates with the replisome progression complex during S-phase via the amino-terminal WD40 domain of Ctf4. Moreover, genetic screening for suppressors of the genotoxic sensitivity of rtt101Δ cells identified a cluster of replication proteins, among them a component of the fork protection complex, Mrc1. In contrast to rtt101Δ and mms22Δ cells, mrc1Δ rtt101Δ and mrc1Δ mms22Δ double mutants complete DNA replication upon replication stress by facilitating the repair/restart of stalled replication forks using a Rad52-dependent mechanism. Our results suggest that the Rtt101Mms22 E3 ligase does not induce Mrc1 degradation, but specifically counteracts Mrc1’s replicative function, possibly by modulating its interaction with the CMG (Cdc45-MCM-GINS) complex at stalled forks.
Author Summary
Post-translational protein modifications, such as ubiquitylation, are essential for cells to respond to environmental cues. In order to understand how eukaryotes cope with DNA damage, we have investigated a conserved E3 ubiquitin ligase complex required for the resistance to carcinogenic chemicals. This complex, composed of Rtt101, Mms1 and Mms22 in budding yeast, plays a critical role in regulating the fate of stalled DNA replication. Here, we found that the Rtt101Mms22 E3 ubiquitin ligase complex interacts with the replisome during S-phase, and orchestrates the repair/restart of DNA synthesis after stalling by activating a Rad52-dependent homologous recombination pathway. Our findings indicate that Rtt101Mms22 specifically counteracts the replicative activity of Mrc1, a subunit of the fork protection complex, possibly by modulating its interaction with the CMG (Cdc45-MCM-GINS) helicase complex upon fork stalling. Altogether, our study unravels a functional protein cluster that is essential to understand how eukaryotic cells cope with DNA damage during replication and, thus deepens our knowledge of the biology that underlies carcinogenesis.
PMCID: PMC4743919  PMID: 26849847
9.  A novel autosomal recessive TERT T1129P mutation in a dyskeratosis congenita family leads to cellular senescence and loss of CD34+ hematopoietic stem cells not reversible by mTOR-inhibition 
Aging (Albany NY)  2015;7(11):911-927.
The TERT gene encodes the reverse transcriptase component of the telomerase complex, and mutations in TERT can lead to dysfunctional telomerase activity resulting in diseases such as dyskeratosis congenita (DKC). Here, we describe a novel TERT mutation, T1129P, leading to DKC with progressive bone marrow (BM) failure in homozygous members of a consanguineous family. BM hematopoietic stem cells (HSCs) of an affected family member were reduced 300-fold, which was associated with a significantly impaired colony forming capacity in vitro and impaired repopulation activity in mouse xenografts. Recent data in yeast suggested improved cellular checkpoint controls by mTOR inhibition, preventing cells with short telomeres or DNA damage from dividing. To evaluate a potential therapeutic option for the patient, we treated her primary skin fibroblasts and BM HSCs with the mTOR inhibitor rapamycin. This led to prolonged survival and decreased levels of senescence in T1129P mutant fibroblasts. In contrast, the impaired HSC function could not be improved by mTOR inhibition, as colony forming capacity and multilineage engraftment potential in xenotransplanted mice remained severely impaired. Thus, rapamycin treatment did not rescue the compromised stem cell function of TERT T1129P mutant patient HSCs, which outlines the limitations of a potential DKC therapy based on rapamycin.
PMCID: PMC4694062  PMID: 26546739
TERT; TERC; mTOR; rapamycin; sirolimus; senescence
10.  The somatic autosomal mutation matrix in cancer genomes 
Human Genetics  2015;134(8):851-864.
DNA damage in somatic cells originates from both environmental and endogenous sources, giving rise to mutations through multiple mechanisms. When these mutations affect the function of critical genes, cancer may ensue. Although identifying genomic subsets of mutated genes may inform therapeutic options, a systematic survey of tumor mutational spectra is required to improve our understanding of the underlying mechanisms of mutagenesis involved in cancer etiology. Recent studies have presented genome-wide sets of somatic mutations as a 96-element vector, a procedure that only captures the immediate neighbors of the mutated nucleotide. Herein, we present a 32 × 12 mutation matrix that captures the nucleotide pattern two nucleotides upstream and downstream of the mutation. A somatic autosomal mutation matrix (SAMM) was constructed from tumor-specific mutations derived from each of 909 individual cancer genomes harboring a total of 10,681,843 single-base substitutions. In addition, mechanistic template mutation matrices (MTMMs) representing oxidative DNA damage, ultraviolet-induced DNA damage, 5mCpG deamination, and APOBEC-mediated cytosine mutation, are presented. MTMMs were mapped to the individual tumor SAMMs to determine the maximum contribution of each mutational mechanism to the overall mutation pattern. A Manhattan distance across all SAMM elements between any two tumor genomes was used to determine their relative distance. Employing this metric, 89.5 % of all tumor genomes were found to have a nearest neighbor from the same tissue of origin. When a distance-dependent 6-nearest neighbor classifier was used, 86.9 % of all SAMMs were assigned to the correct tissue of origin. Thus, although tumors from different tissues may have similar mutation patterns, their SAMMs often display signatures that are characteristic of specific tissues.
Electronic supplementary material
The online version of this article (doi:10.1007/s00439-015-1566-1) contains supplementary material, which is available to authorized users.
PMCID: PMC4495249  PMID: 26001532
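The tumor-to-tumor comparison described in the abstract above (a Manhattan distance summed over all SAMM elements, followed by a nearest-neighbor assignment of tissue of origin) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy input and the plain majority vote are assumptions, and the distance-dependent weighting of the paper's 6-nearest neighbor classifier is not reproduced.

```python
from collections import Counter

import numpy as np


def manhattan(a, b):
    """Sum of absolute element-wise differences between two equally shaped matrices."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("matrices must have the same shape")
    return float(np.abs(a - b).sum())


def knn_tissue(query, references, labels, k=1):
    """Assign the most common tissue label among the k closest reference matrices.

    Hypothetical helper: unweighted majority vote, assumed for illustration.
    """
    dists = [manhattan(query, ref) for ref in references]
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With `k=1` the function returns the tissue of the single nearest reference genome, which corresponds to the nearest-neighbor comparison the abstract reports first.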
11.  Pharmacophore model of the quercetin binding site of the SIRT6 protein 
SIRT6 is a histone deacetylase that has been proposed as a potential therapeutic target for metabolic disorders and the prevention of age-associated diseases. We have previously reported on the identification of quercetin and vitexin as SIRT6 inhibitors, and studied structurally related flavonoids including luteolin, kaempferol, apigenin and naringenin. It was determined that the SIRT6 protein remained active after immobilization and that a single frontal displacement could correctly predict the functional activity of the immobilized enzyme. The previous study generated a preliminary pharmacophore for the quercetin binding site on SIRT6, containing three hydrogen bond donors and one hydrogen bond acceptor. In this study, we have generated a refined pharmacophore with an additional twelve quercetin analogs. The resulting model showed a positive linear relationship between the experimental elution times and the fit values obtained from the model, with a correlation coefficient of 0.8456.
PMCID: PMC3980043  PMID: 24491483
SIRT6; Pharmacophore Modeling; Frontal Displacement Chromatography; HAT; HDAC
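The abstract above summarizes the agreement between experimental elution time and model fit value with a single correlation coefficient. A minimal sketch of how such a coefficient is computed is given below, assuming a Pearson correlation (the abstract does not name the statistic); the function name is hypothetical and no data from the study are used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

Here `xs` would hold the elution times and `ys` the pharmacophore fit values, one pair per compound.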
12.  Acute and Chronic Plasma Metabolomic and Liver Transcriptomic Stress Effects in a Mouse Model with Features of Post-Traumatic Stress Disorder 
PLoS ONE  2015;10(1):e0117092.
Acute responses to intense stressors can give rise to post-traumatic stress disorder (PTSD). PTSD diagnostic criteria include trauma exposure history and self-reported symptoms. Individuals who meet PTSD diagnostic criteria often meet criteria for additional psychiatric diagnoses. Biomarkers promise to contribute to reliable phenotypes of PTSD and comorbidities by linking biological system alterations to behavioral symptoms. Here we have analyzed unbiased plasma metabolomics and other stress effects in a mouse model with behavioral features of PTSD. In this model, C57BL/6 mice are repeatedly exposed to a trained aggressor mouse (albino SJL) using a modified, resident-intruder, social defeat paradigm. Our recent studies using this model found that aggressor-exposed mice exhibited acute stress effects including changed behaviors, body weight gain, increased body temperature, as well as inflammatory and fibrotic histopathologies and transcriptomic changes of heart tissue. Some of these acute stress effects persisted, reminiscent of PTSD. Here we report elevated proteins in plasma that function in inflammation and responses to oxidative stress and damaged tissue at 24 hrs post-stressor. Additionally at this acute time point, transcriptomic analysis indicated liver inflammation. The unbiased metabolomics analysis showed altered metabolites in plasma at 24 hrs that only partially normalized toward control levels after stress-withdrawal for 1.5 or 4 wks. In particular, gut-derived metabolites were altered at 24 hrs post-stressor and remained altered up to 4 wks after stress-withdrawal. Also at the 4 wk time point, hyperlipidemia and suppressed metabolites of amino acids and carbohydrates in plasma coincided with transcriptomic indicators of altered liver metabolism (activated xenobiotic and lipid metabolism). 
Collectively, these system-wide sequelae to repeated intense stress suggest that the simultaneous perturbed functioning of multiple organ systems (e.g., brain, heart, intestine and liver) can interact to produce injuries that lead to chronic metabolic changes and disorders that have been associated with PTSD.
PMCID: PMC4309402  PMID: 25629821
13.  Identification of Gene Signatures Used to Recognize Biological Characteristics of Gastric Cancer Upon Gene Expression Data 
Biomarker Insights  2014;9:67-76.
High-throughput gene expression microarrays can be examined by machine-learning algorithms to identify gene signatures that recognize the biological characteristics of specific human diseases, including cancer, with high sensitivity and specificity. A previous study compared 20 gastric cancer (GC) samples against 20 normal tissue (NT) samples and identified 1,519 differentially expressed genes (DEGs). In this study, Classification Information Index (CII), Information Gain Index (IGI), and RELIEF algorithms are used to mine the previously reported gene expression profiling data. In all, 29 of these genes are identified by all three algorithms and are treated as GC candidate biomarkers. Three biomarkers, COL1A2, ATP4B, and HADHSC, are selected and further examined using quantitative real-time polymerase chain reaction (qRT-PCR) and immunohistochemistry (IHC) staining in two independent sets of GC and normal adjacent tissue (NAT) samples. Our study shows that COL1A2 and HADHSC are the two best biomarkers from the microarray data, distinguishing all GC from the NT, whereas ATP4B is diagnostically significant in lab tests because of its wider range of fold-changes in expression. Herein, a data-mining model applicable to small sample sizes is presented and discussed. Our results suggest that this mining model may be useful in small sample-size studies to identify putative biomarkers and potential biological features of GC.
PMCID: PMC4149392  PMID: 25210421
gastric cancer; gene signature; microarray; machine-learning algorithm
14.  The differential processing of telomeres in response to increased telomeric transcription and RNA–DNA hybrid accumulation 
RNA Biology  2014;11(2):95-100.
Telomeres are protective nucleoprotein structures at the ends of eukaryotic chromosomes. Despite the heterochromatic state of telomeres they are transcribed, generating non-coding telomeric repeat-containing RNA (TERRA). Strongly induced TERRA transcription has been shown to cause telomere shortening and accelerated senescence in the absence of both telomerase and homology-directed repair (HDR). Moreover, it has recently been demonstrated that TERRA forms RNA–DNA hybrids at chromosome ends. The accumulation of RNA–DNA hybrids at telomeres also leads to rapid senescence and telomere loss in the absence of telomerase and HDR. Conversely, in the presence of HDR, telomeric RNA–DNA hybrid accumulation and increased telomere transcription promote telomere recombination, and hence, delayed senescence. Here, we demonstrate that despite these similar phenotypic outcomes, telomeres that are highly transcribed are not processed in the same manner as those that accumulate RNA–DNA hybrids.
PMCID: PMC3973735  PMID: 24525824
TERRA; telomere; senescence; Exo1; RNA-DNA hybrid; R-loop; RNase H
15.  Knowledge and Theme Discovery across Very Large Biological Data Sets Using Distributed Queries: A Prototype Combining Unstructured and Structured Data 
PLoS ONE  2013;8(12):e80503.
As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.
PMCID: PMC3846626  PMID: 24312478
16.  Guanine Holes Are Prominent Targets for Mutation in Cancer and Inherited Disease 
PLoS Genetics  2013;9(9):e1003816.
Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5′-NGNN-3′ motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
Author Summary
A large number of DNA mutations identified in cells from patients with cancer or human inherited disease were analyzed to address a fundamental issue in human pathology, viz, the mutational mechanisms that cause irreversible changes to DNA. By using bioinformatics and computational methods, we found that mutations do not occur randomly, but instead affect specific bases, most often guanines flanked by other guanines or adenines. We attribute this effect to electron transfer, a chemical reaction known to underlie basic biological processes such as cellular respiration and photosynthesis. Certain types of carcinogens, oxidants or radiation can interact with DNA and abstract an electron. Our results imply that the ensuing sites of electron loss can migrate from their original position in the DNA to neighboring guanines where they become trapped, leading to further chemical modifications that may eventually result in mutations. Many of the mutations known to be important for tumor growth (driver mutations), as well as passenger mutations and mutations associated with inherited disease, appear to be caused by electron transfer. Beyond pathological mutations, electron transfer may represent a universal mechanism by which genetic changes occur in all life forms to drive population fitness over evolutionary time.
PMCID: PMC3784513  PMID: 24086153
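The 64 5′-NGNN-3′ sequence contexts referred to in the abstract above (any base, then the mutated G, then any two bases) can be enumerated and tallied with a short sketch. The names below are hypothetical, and the sketch makes a simplifying assumption: it only handles mutated guanines on the strand supplied, so mutations at C would first need to be reverse-complemented, which is not shown.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# All 4 * 4 * 4 = 64 motifs with G fixed at the second position
MOTIFS = [f"{first}G{third}{fourth}" for first, third, fourth in product(BASES, repeat=3)]


def count_ngnn_contexts(seq, mutated_positions):
    """Tally the NGNN motif around each mutated G (0-based positions in seq)."""
    seq = seq.upper()
    counts = Counter()
    for pos in mutated_positions:
        # Need one base upstream and two bases downstream of the mutated site
        if pos < 1 or pos + 3 > len(seq):
            continue
        motif = seq[pos - 1:pos + 3]
        if motif[1] == "G" and set(motif) <= set(BASES):
            counts[motif] += 1
    return counts
```

Dividing each tally by the number of times the motif occurs in the surveyed sequence would give a per-motif mutation frequency; that normalization is left out here.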
17.  Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools 
Nucleic Acids Research  2012;41(Database issue):D94-D100.
The non-B DB, available at, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
PMCID: PMC3531222  PMID: 23125372
18.  Genes affected by mouse mammary tumor virus (MMTV) proviral insertions in mouse mammary tumors are deregulated or mutated in primary human mammary tumors 
Oncotarget  2012;3(11):1320-1334.
The accumulation of mutations is a contributing factor in the initiation of premalignant mammary lesions and their progression to malignancy and metastasis. We have used a mouse model in which the carcinogen is the mouse mammary tumor virus (MMTV), which induces clonal premalignant mammary lesions and malignant mammary tumors by insertional mutagenesis. Identification of the genes and signaling pathways affected in MMTV-induced mouse mammary lesions provides a rationale for determining whether genetic alteration of the human orthologues of these genes/pathways may contribute to human breast carcinogenesis. A high-throughput platform for inverse PCR to identify MMTV-host junction fragments and their nucleotide sequences in a large panel of MMTV-induced lesions was developed. Validation of the genes affected by MMTV insertion was carried out by microarray analysis. A common integration site (CIS) is defined as a gene altered by an MMTV proviral insertion in at least two independent lesions arising in different hosts. Three of the new genes identified as CIS for MMTV were assayed for their capability to confer on HC11 mouse mammary epithelial cells the ability for invasion, anchorage-independent growth and tumor development in nude mice. Analysis of MMTV-induced mammary premalignant hyperplastic outgrowth (HOG) lines and mammary tumors led to the identification of CIS restricted to 35 loci. Within these loci, members of the Wnt, Fgf and Rspo gene families plus two linked genes (Npm3 and Ddn) were frequently activated in tumors induced by MMTV. A second group of 15 CIS occur at a low frequency (2-5 observations) in mammary HOGs or tumors. In this latter group, the expression of either Phf19 or Sdc2 was shown to increase the invasion capability of HC11 cells. Foxl1 expression conferred on HC11 cells the capability for anchorage-independent colony formation in soft agar and tumor development in nude mice.
The published transcriptome and nucleotide sequence analysis of gene expression in primary human breast tumors was interrogated. Twenty of the human orthologues of MMTV CIS associated genes are deregulated and/or mutated in human breast tumors.
PMCID: PMC3717796  PMID: 23131872
mouse mammary tumor virus; premalignant lesions; mammary tumors; genes; human breast carcinomas; metastases
19.  An Optimized Method for Computing 18O/16O Ratios of Differentially Stable-isotope Labeled Peptides in the Context of Post-digestion 18O Exchange/Labeling 
Analytical Chemistry  2010;82(13):5878-5886.
Differential 18O/16O stable isotope labeling of peptides that relies on enzyme-catalyzed oxygen exchange at their carboxyl termini in the presence of H2 18O has been widely used for relative quantitation of peptides/proteins. The role of tryptic proteolysis in bottom-up shotgun proteomics and low reagent costs have made trypsin-catalyzed 18O post-digestion exchange a convenient and affordable stable isotope labeling approach. However, it is known that trypsin-catalyzed 18O exchange at the carboxyl terminus is in many instances inhomogeneous/incomplete. The extent of the 18O exchange/incorporation fluctuates from peptide to peptide, mostly due to variable enzyme-substrate affinity. Thus, accurate calculation and interpretation of peptide ratios are analytically complicated and in some regard deficient. Therefore, a computational approach capable of improved measurement of actual 18O incorporation for each differentially labeled peptide pair is needed. In this regard, we have developed an algorithmic method that relies on the trapezoidal rule to integrate peak intensities of all detected isotopic species across a particular peptide ion over the retention time, and that fits the isotopic manifold to Poisson distributions. Optimal values for manifold fitting were calculated and then 18O/16O ratios derived via evolutionary programming. The algorithm is tested using trypsin-catalyzed 18O post-digestion exchange to differentially label bovine serum albumin (BSA) at a priori determined ratios. Both accuracy and precision are improved utilizing this rigorous mathematical approach. Utilizing this algorithmic technique, we demonstrate the effectiveness of this method to accurately calculate 18O/16O ratios for differentially labeled BSA peptides, by accounting for artifacts caused by a variable degree of post-digestion 18O exchange.
We further demonstrate the effectiveness of this method to accurately calculate 18O/16O ratios in a large scale proteomic quantitation of detergent resistant membrane microdomains (DRMMs) isolated from cells expressing wild-type HIV-1 Gag and its non myristylated mutant.
PMCID: PMC3479679  PMID: 20540505
quantitation; 18O/16O stable isotope labeling; variable/incomplete 18O exchange
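The approach summarized in entry 19 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the Poisson parameter `lam` is assumed to be known for the peptide, and a coarse grid search stands in for the evolutionary programming step named in the abstract. The sketch also assumes that every labeled peptide took up at least one 18O atom, so that the 18O/16O ratio can be read as (one-18O fraction + two-18O fraction) / (zero-18O fraction).

```python
import math

def trapezoid_area(times, intensities):
    """Integrate one isotopic peak's intensity over retention time (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (intensities[i] + intensities[i - 1]) * (times[i] - times[i - 1])
    return area

def poisson_envelope(lam, n_peaks):
    """Poisson approximation to the relative abundances of peaks M, M+1, M+2, ..."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(n_peaks)]

def model_manifold(fractions, lam, n_peaks):
    """Mix three envelopes shifted by 0, 2 and 4 mass units (zero, one, two 18O atoms)."""
    base = poisson_envelope(lam, n_peaks)
    model = [0.0] * n_peaks
    for n_label, frac in enumerate(fractions):
        shift = 2 * n_label
        for k in range(n_peaks - shift):
            model[k + shift] += frac * base[k]
    return model

def fit_fractions(areas, lam, step=0.01):
    """Grid search for label fractions (assumed stand-in for evolutionary programming).

    `areas` holds the integrated areas of peaks M, M+1, ... (at least 5 peaks).
    Returns (f0, f1, f2): fractions carrying zero, one and two 18O atoms.
    """
    total = sum(areas)
    observed = [a / total for a in areas]
    n_peaks = len(observed)
    steps = round(1 / step)
    best, best_err = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            fractions = (i * step, j * step, (steps - i - j) * step)
            model = model_manifold(fractions, lam, n_peaks)
            scale = sum(model)
            err = sum((o - m / scale) ** 2 for o, m in zip(observed, model))
            if err < best_err:
                best, best_err = fractions, err
    return best

def ratio_18o_16o(fractions):
    """18O/16O ratio under the assumption stated above (no fully unlabeled heavy peptide)."""
    f0, f1, f2 = fractions
    return (f1 + f2) / f0
```

In use, each isotopic peak's chromatogram would first be reduced to one area with `trapezoid_area`, the list of areas would be passed to `fit_fractions`, and the fitted fractions to `ratio_18o_16o`. Fitting all three labeling states together is what lets a partially exchanged (single 18O) population be counted with the heavy sample rather than being lost between the light and heavy peaks.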
20.  Rif2 Promotes a Telomere Fold-Back Structure through Rpd3L Recruitment in Budding Yeast 
PLoS Genetics  2012;8(9):e1002960.
Using a genome-wide screening approach, we have established the genetic requirements for proper telomere structure in Saccharomyces cerevisiae. We uncovered 112 genes, many of which have not previously been implicated in telomere function, that are required to form a fold-back structure at chromosome ends. Among other biological processes, lysine deacetylation, through the Rpd3L, Rpd3S, and Hda1 complexes, emerged as being a critical regulator of telomere structure. The telomeric-bound protein, Rif2, was also found to promote a telomere fold-back through the recruitment of Rpd3L to telomeres. In the absence of Rpd3 function, telomeres have an increased susceptibility to nucleolytic degradation, telomere loss, and the initiation of premature senescence, suggesting that an Rpd3-mediated structure may have protective functions. Together these data reveal that multiple genetic pathways may directly or indirectly impinge on telomere structure, thus broadening the potential targets available to manipulate telomere function.
Author Summary
Impaired telomere elongation eventually results in telomere dysfunction and can lead to diseases such as dyskeratosis congenita, which is associated with bone-marrow failure and pulmonary fibrosis. Cancer cells require continuous telomere maintenance to ensure continued cellular proliferation. Therefore the regulation of telomere function, both positively (in the case of dyskeratosis congenita) and negatively (for cancer), may be of therapeutic benefit. In this study we have used yeast to determine which genetic factors are important for a certain telomeric structure (the loop structure), which may help to maintain chromosome ends in a protected state. We found that multiple genetic factors and pathways affect telomere structure, ranging from metabolic signaling to specific telomere-binding proteins. We found that proper chromatin structure at the telomere is essential to maintain a telomere fold-back structure. Importantly, there was a strong correlation between telomere structure and function, as the mutants found in our screen (looping defective) were often associated with rapid senescence and telomere dysfunction phenotypes. We believe that, through the regulation of the various genetic pathways uncovered in our screen, one may be able to both positively and negatively influence telomere function.
PMCID: PMC3447961  PMID: 23028367
21.  The Mph1 Helicase Can Promote Telomere Uncapping and Premature Senescence in Budding Yeast 
PLoS ONE  2012;7(7):e42028.
Double strand breaks (DSBs) can be repaired via either Non-Homologous End Joining (NHEJ) or Homology-Directed Repair (HR). Telomeres, which resemble DSBs, are refractory to repair events in order to prevent chromosome end fusions and genomic instability. In some rare instances telomeres engage in Break-Induced Replication (BIR), a type of HR, in order to maintain telomere length in the absence of the enzyme telomerase. Here we have investigated how the yeast helicase, Mph1, affects DNA repair at both DSBs and telomeres. We have found that overexpressed Mph1 strongly inhibits BIR at internal DSBs but allows it to proceed at telomeres. Furthermore, while overexpressed Mph1 potently inhibits NHEJ at telomeres, it has no effect on NHEJ at DSBs within the chromosome. At telomeres Mph1 is able to promote telomere uncapping and the accumulation of ssDNA, which results in premature senescence in the absence of telomerase. We propose that Mph1 is able to direct repair towards HR (thereby inhibiting NHEJ) at telomeres by remodeling them into a nuclease-sensitive structure, which promotes the accumulation of a recombinogenic ssDNA intermediate. We thus put forward that Mph1 is a double-edged sword at the telomere: it prevents NHEJ, but promotes senescence in cells with dysfunctional telomeres by increasing the levels of ssDNA.
PMCID: PMC3407055  PMID: 22848695
22.  Deregulated telomere transcription causes replication-dependent telomere shortening and promotes cellular senescence 
Nucleic Acids Research  2012;40(14):6649-6659.
Telomeres are transcribed into non-coding TElomeric Repeat containing RNAs (TERRA). We have employed a transcriptionally inducible telomere to investigate how telomere transcription affects telomere function in Saccharomyces cerevisiae. We report that telomere shortening resulting from high levels of telomere transcription stems from a DNA replication-dependent loss of telomere tracts, which can occur independently of both telomerase inhibition and homologous recombination. We show that in order for telomere loss to occur, transcription must pass through the telomere tract itself, producing a TERRA molecule. We demonstrate that increased transcription of a single telomere leads to premature cellular senescence in the absence of a telomere maintenance mechanism (telomerase and homology-directed repair). Similar rapid senescence and telomere shortening are also seen in sir2Δ cells with compromised telomere maintenance, where TERRA levels are increased at natural telomeres. These data suggest that telomere transcription must be tightly controlled to prevent telomere loss and early onset senescence.
PMCID: PMC3413150  PMID: 22553368
23.  The Role of Methylation in the Intrinsic Dynamics of B- and Z-DNA 
PLoS ONE  2012;7(4):e35558.
Methylation of cytosine at the 5-carbon position (5mC) is observed in both prokaryotes and eukaryotes. In humans, DNA methylation at CpG sites plays an important role in gene regulation and has been implicated in development, gene silencing, and cancer. In addition, the CpG dinucleotide is a known hot spot for pathologic mutations genome-wide. CpG tracts may adopt left-handed Z-DNA conformations, which have also been implicated in gene regulation and genomic instability. Methylation facilitates this B-Z transition but the underlying mechanism remains unclear. Herein, four structural models of the dinucleotide d(GC)5 repeat sequence in B-, methylated B-, Z-, and methylated Z-DNA forms were constructed, and an aggregate of 100 nanoseconds of molecular dynamics simulation in explicit solvent under physiological conditions was performed for each model. Both unmethylated and methylated B-DNA were found to be more flexible than Z-DNA. However, methylation significantly destabilized the BII, relative to the BI, state through the Gp5mC steps. In addition, methylation decreased the free energy difference between B- and Z-DNA. Comparisons of α/γ backbone torsional angles showed that torsional states changed only marginally upon methylation for both B-DNA and Z-DNA. Methylation-induced conformational changes and lower energy differences may contribute to the transition to Z-DNA by methylated, over unmethylated, B-DNA and may be a contributing factor to biological function.
PMCID: PMC3328458  PMID: 22530050
24.  Getting in (and out of) the loop: regulating higher order telomere structures 
Frontiers in Oncology  2012;2:180.
The DNA at the ends of linear chromosomes (the telomere) folds back onto itself and forms an intramolecular lariat-like structure. Although the telomere loop has been implicated in the protection of chromosome ends from nuclease-mediated resection and unscheduled DNA repair activities, it potentially poses an obstacle to the DNA replication machinery during S-phase. Therefore, the coordinated regulation of telomere loop formation, maintenance, and resolution is required in order to establish a balance between protecting the chromosome ends and promoting their duplication prior to cell division. Until recently, the only factor known to influence telomere looping in human cells was TRF2, a component of the shelterin complex. Recent work in yeast and mouse cells has uncovered additional regulatory factors that affect the loop structure at telomeres. In the following “perspective” we outline what is known about telomere looping and highlight the latest results regarding the regulation of this chromosome end structure. We speculate about how the manipulation of the telomere loop may have therapeutic implications in terms of diseases associated with telomere dysfunction and uncontrolled proliferation.
PMCID: PMC3510458  PMID: 23226680
t-loop; telomere; RTEL1; end protection; cancer; Mph1
