Related Articles

1.  Whole-Organ Isolation Approach as a Basis for Tissue-Specific Analyses in Schistosoma mansoni 
Background
Schistosomiasis is one of the most important parasitic diseases worldwide, second only to malaria. Schistosomes exhibit an exceptional reproductive biology since the sexual maturation of the female, which includes the differentiation of the reproductive organs, is controlled by pairing. Pathogenicity originates from eggs, which cause severe inflammation in their hosts. Elucidation of processes contributing to female maturation is not only of interest to basic science but also relevant to novel concepts for combating schistosomiasis.
Methodology/Principal Findings
To get direct access to the reproductive organs, we established a novel protocol using a combined detergent/protease-treatment removing the tegument and the musculature of adult Schistosoma mansoni. All steps were monitored by scanning electron microscopy (SEM) and bright-field microscopy (BF). We focused on the gonads of adult schistosomes and demonstrated that isolated and purified testes and ovaries can be used for morphological and structural studies as well as sources for RNA and protein of sufficient amounts for subsequent analyses such as RT-PCR and immunoblotting. To this end, first exemplary evidence was obtained for tissue-specific transcription within the gonads (axonemal dynein intermediate chain gene SmAxDynIC; aquaporin gene SmAQP) as well as for post-transcriptional regulation (SmAQP).
Conclusions/Significance
The presented method provides a new way of getting access to tissue-specific material of S. mansoni. With regard to many still unanswered questions of schistosome biology, such as elucidating the molecular processes involved in schistosome reproduction, this protocol provides opportunities for, e.g., sub-transcriptomics and sub-proteomics at the organ level. This will promote the characterisation of gene-expression profiles or, more specifically, help to complete our knowledge of signalling pathways contributing to differentiation processes, thereby revealing molecules that may represent potential targets for novel intervention strategies. Furthermore, gonads and other tissues are a basis for cell isolation, opening new perspectives for establishing cell lines, one of the tools desperately needed in the post-genomic era.
Author Summary
As a neglected disease, schistosomiasis is still an enormous problem in the tropics and subtropics. Since the 1980s, Praziquantel (PZQ) has been the drug of choice but can be anticipated to lose efficacy in the future due to emerging resistance. Alternative drugs or efficient vaccines are still lacking, strengthening the need for the discovery of novel strategies and targets for combating schistosomiasis. One avenue is to understand the unique reproductive biology of this trematode in more detail. Sexual maturation of the adult female depends on a constant pairing with the male. This is a crucial prerequisite for the differentiation of the female reproductive organs such as the vitellarium and ovary, and consequently for the production of mature eggs. These are needed for life-cycle maintenance, but they also cause pathogenesis. With respect to adult males, the production of mature sperm is essential for fertilisation and life-cycle progression. In our study we present a convenient and inexpensive method to isolate reproductive tissues from adult schistosomes in high amounts and purity, representing a source for gonad-specific RNA and protein, which will serve for future sub-transcriptome and -proteome studies helping to characterise genes, or to unravel differentiation programs in schistosome gonads. Beyond that, isolated organs may be useful for approaches to establish cell cultures, desperately needed in the post-genomic era.
doi:10.1371/journal.pntd.0002336
PMCID: PMC3723596  PMID: 23936567
2.  Integrating Computational Biology and Forward Genetics in Drosophila 
PLoS Genetics  2009;5(1):e1000351.
Genetic screens are powerful methods for the discovery of gene–phenotype associations. However, a systems biology approach to genetics must leverage the massive amount of “omics” data to enhance the power and speed of functional gene discovery in vivo. Thus far, few computational methods for gene function prediction have been rigorously tested for their performance on a genome-wide scale in vivo. In this work, we demonstrate that integrating genome-wide computational gene prioritization with large-scale genetic screening is a powerful tool for functional gene discovery. To discover genes involved in neural development in Drosophila, we extend our strategy for the prioritization of human candidate disease genes to functional prioritization in Drosophila. We then integrate this prioritization strategy with a large-scale genetic screen for interactors of the proneural transcription factor Atonal using genomic deficiencies and mutant and RNAi collections. Using the prioritized genes validated in our genetic screen, we describe a novel genetic interaction network for Atonal. Lastly, we prioritize the whole Drosophila genome and identify candidate gene associations for ten receptor-signaling pathways. This novel database of prioritized pathway candidates, as well as a web application for functional prioritization in Drosophila, called Endeavour-HighFly, and the Atonal network, are publicly available resources. A systems genetics approach that combines the power of computational predictions with in vivo genetic screens strongly enhances the process of gene function and gene–gene association discovery.
Author Summary
Genome sequencing and annotation, combined with large-scale molecular experiments to query gene expression and molecular interactions, collectively known as Systems Biology, have resulted in an enormous wealth in biological databases. Yet, it remains a daunting task to use these data to decipher the rules that govern biological systems. One of the most trusted approaches in biology is genetic analysis because of its emphasis on gene function in living organisms. Genetics, however, proceeds slowly and unravels small-scale interactions. Turning genetics into an effective tool of Systems Biology requires harnessing the large-scale molecular data for the design and execution of genetic screens. In this work, we test the idea of exploiting a computational approach known as gene prioritization to pre-rank genes for the likelihood of their involvement in a process of interest. By carrying out a gene prioritization–supported genetic screen, we greatly enhance the speed and output of in vivo genetic screens without compromising their sensitivity. These results mean that future genetic screens can be custom-catered for any process of interest and carried out with a speed and efficiency that is comparable to other large-scale molecular experiments. We refer to this combined approach as Systems Genetics.
doi:10.1371/journal.pgen.1000351
PMCID: PMC2628282  PMID: 19165344
3.  Analysis of multiple compound–protein interactions reveals novel bioactive molecules 
The authors use machine learning of compound-protein interactions to explore drug polypharmacology and to efficiently identify bioactive ligands, including novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein coupled receptors and protein kinases.
We have demonstrated that machine learning of multiple compound–protein interactions is useful for efficient ligand screening and for assessing drug polypharmacology. This approach successfully identified novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein-coupled receptors and protein kinases. These bioactive compounds were not detected by existing computational ligand-screening methods in comparative studies. The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. Perturbations of biological systems by chemical probes provide broader applications not only for analysis of complex systems but also for intentional manipulations of these systems. Nevertheless, the lack of well-characterized chemical modulators has limited their use. Recently, chemical genomics has emerged as a promising area of research applicable to the exploration of novel bioactive molecules, and researchers are currently striving toward the identification of all possible ligands for all target protein families (Wang et al, 2009). Chemical genomics studies have shown that patterns of compound–protein interactions (CPIs) are too diverse to be understood as simple one-to-one events. There is an urgent need to develop appropriate data mining methods for characterizing and visualizing the full complexity of interactions between chemical space and biological systems. However, no existing screening approach has so far succeeded in identifying novel bioactive compounds using multiple interactions among compounds and target proteins.
High-throughput screening (HTS) and computational screening have greatly aided in the identification of early lead compounds for drug discovery. However, the large number of assays required for HTS to identify drugs that target multiple proteins renders this process very costly and time-consuming. Therefore, interest in using in silico strategies for screening has increased. The most common computational approaches, ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS; Oprea and Matter, 2004; Muegge and Oloff, 2006; McInnes, 2007; Figure 1A), have been used for practical drug development. LBVS aims to identify molecules that are very similar to known active molecules and generally has difficulty identifying compounds with novel structural scaffolds that differ from reference molecules. The other popular strategy, SBVS, is constrained by the number of three-dimensional crystallographic structures available. To circumvent these limitations, we have shown that a new computational screening strategy, chemical genomics-based virtual screening (CGBVS), has the potential to identify novel, scaffold-hopping compounds and assess their polypharmacology by using a machine-learning method to recognize conserved molecular patterns in comprehensive CPI data sets.
The CGBVS strategy used in this study was made up of five steps: CPI data collection, descriptor calculation, representation of interaction vectors, predictive model construction using training data sets, and predictions from test data (Figure 1A). Importantly, step 1, the construction of a data set of chemical structures and protein sequences for known CPIs, did not require the three-dimensional protein structures needed for SBVS. In step 2, compound structures and protein sequences were converted into numerical descriptors. These descriptors were used to construct chemical or biological spaces in which decreasing distance between vectors corresponded to increasing similarity of compound structures or protein sequences. In step 3, we represented multiple CPI patterns by concatenating these chemical and protein descriptors. Using these interaction vectors, we could quantify the similarity of molecular interactions for compound–protein pairs, despite the fact that the ligand and protein similarity maps differed substantially. In step 4, concatenated vectors for CPI pairs (positive samples) and non-interacting pairs (negative samples) were input into an established machine-learning method. In the final step, the classifier constructed using training sets was applied to test data.
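The five steps above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the compound and protein names, the two-element descriptor values, and the use of a small logistic regression as a stand-in for the "established machine-learning method", which the text does not name. The only part taken directly from the description is the idea of concatenating a compound descriptor with a protein descriptor into one interaction vector and training a classifier on interacting (positive) and non-interacting (negative) pairs.

```python
import math

# Steps 1-2 (hypothetical toy data): numerical descriptors for compounds and proteins.
compounds = {"cmpd_a": [0.9, 0.1], "cmpd_b": [0.2, 0.8]}
proteins = {"prot_x": [0.8, 0.3], "prot_y": [0.1, 0.9]}

# Known pairs: label 1 = interacting, label 0 = non-interacting (invented labels).
pairs = [
    ("cmpd_a", "prot_x", 1),
    ("cmpd_a", "prot_y", 0),
    ("cmpd_b", "prot_x", 0),
    ("cmpd_b", "prot_y", 0),
]


def interaction_vector(compound_desc, protein_desc):
    """Step 3: represent a compound-protein pair by concatenating both descriptors."""
    return list(compound_desc) + list(protein_desc)


def train_logistic(X, y, epochs=2000, lr=0.5):
    """Step 4 (stand-in classifier): logistic regression fitted by per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log-loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_score(model, x):
    """Step 5: score a pair; higher means 'more likely to interact'."""
    w, b = model
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


X = [interaction_vector(compounds[c], proteins[p]) for c, p, _ in pairs]
y = [label for _, _, label in pairs]
model = train_logistic(X, y)
scores = {(c, p): predict_score(model, interaction_vector(compounds[c], proteins[p]))
          for c, p, _ in pairs}
```

In this toy setting the single positive pair ends up with the highest score. A linear model on concatenated descriptors cannot capture every interaction pattern, so the classifier choice matters in practice; the sketch only shows how the interaction vectors are assembled and consumed.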
To evaluate the predictive value of CGBVS, we first compared its performance with that of LBVS by fivefold cross-validation. CGBVS performed with considerably higher accuracy (91.9%) than did LBVS (84.4%; Figure 1B). We next compared CGBVS and SBVS in a retrospective virtual screening based on the human β2-adrenergic receptor (ADRB2). Figure 1C shows that CGBVS provided higher hit rates than did SBVS. These results suggest that CGBVS is more successful than conventional approaches for prediction of CPIs.
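Fivefold cross-validation, as used for the CGBVS versus LBVS comparison, can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the helper names, the one-dimensional toy scores, and the midpoint-threshold classifier are all hypothetical. The point is only the procedure: split the data into five folds, train on four, test on the held-out fold, and pool the correct predictions into one accuracy figure.

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(X, y, fit, predict, k=5):
    """Train on k-1 folds, test on the held-out fold, and pool accuracy over all folds."""
    correct = 0
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train_X = [x for i, x in enumerate(X) if i not in held]
        train_y = [t for i, t in enumerate(y) if i not in held]
        model = fit(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(X)


# Hypothetical stand-in classifier: threshold halfway between the two class means.
def fit_threshold(train_X, train_y):
    mean0 = sum(x for x, t in zip(train_X, train_y) if t == 0) / train_y.count(0)
    mean1 = sum(x for x, t in zip(train_X, train_y) if t == 1) / train_y.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Invented, cleanly separable toy scores; classes are interleaved so every fold holds both.
X = [0.10, 0.90, 0.20, 0.80, 0.15, 0.85, 0.30, 0.70, 0.25, 0.75]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
accuracy = cross_validated_accuracy(X, y, fit_threshold, predict_threshold, k=5)
```

Real data would normally be shuffled (and often stratified by class) before being split; the contiguous folds here keep the sketch deterministic.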
We then evaluated the ability of the CGBVS method to predict the polypharmacology of ADRB2 by attempting to identify novel ADRB2 ligands from a group of G-protein-coupled receptor (GPCR) ligands. We ranked the prediction scores for the interactions of 826 reported GPCR ligands with ADRB2 and then analyzed the 50 highest-ranked compounds in greater detail. Of 21 commercially available compounds, 11 showed ADRB2-binding activity and were not previously reported to be ADRB2 ligands. These compounds included ligands not only for aminergic receptors but also for neuropeptide Y-type 1 receptors (NPY1R), which have low protein homology to ADRB2. Most ligands we identified were not detected by LBVS and SBVS, which suggests that only CGBVS could identify this unexpected cross-reaction for a ligand developed to target a peptidergic receptor.
The true value of CGBVS in drug discovery must be tested by assessing whether this method can identify scaffold-hopping lead compounds from a set of compounds that is structurally more diverse. To assess this ability, we analyzed 11 500 commercially available compounds to predict compounds likely to bind to two GPCRs and two protein kinases. Functional assays revealed that nine ADRB2 ligands, three NPY1R ligands, five epidermal growth factor receptor (EGFR) inhibitors, and two cyclin-dependent kinase 2 (CDK2) inhibitors were concentrated in the top-ranked compounds (hit rate=30, 15, 25, and 10%, respectively). We also evaluated the extent of scaffold hopping achieved in the identification of these novel ligands. One ADRB2 ligand, two NPY1R ligands, and one CDK2 inhibitor exhibited scaffold hopping (Figure 4), indicating that CGBVS can use this characteristic to rationally predict novel lead compounds, a crucial and very difficult step in drug discovery. This feature of CGBVS is critically different from existing predictive methods, such as LBVS, which depend on similarities between test and reference ligands, and focus on a single protein or highly homologous proteins. In particular, CGBVS is useful for targets with undefined ligands because this method can use CPIs with target proteins that exhibit lower levels of homology.
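The hit rates quoted above follow the usual definition, confirmed hits divided by compounds assayed. The text gives the hit counts and the percentages but not the number of compounds assayed per target, so the short calculation below back-computes that number. The resulting counts are an inference from the reported figures, not values stated in the source, and the function name is hypothetical.

```python
# Reported per target: (confirmed hits, hit rate in percent), as quoted in the text.
reported = {
    "ADRB2": (9, 30),
    "NPY1R": (3, 15),
    "EGFR": (5, 25),
    "CDK2": (2, 10),
}


def implied_compounds_assayed(hits, hit_rate_percent):
    """Invert hit_rate = 100 * hits / assayed to recover the implied number assayed."""
    return round(hits * 100 / hit_rate_percent)


implied = {target: implied_compounds_assayed(hits, rate)
           for target, (hits, rate) in reported.items()}
```

Under this reading, the figures would correspond to roughly 30 top-ranked compounds assayed for ADRB2 and 20 for each of the other three targets.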
In summary, we have demonstrated that data mining of multiple CPIs is of great practical value for exploration of chemical space. As a predictive model, CGBVS could provide an important step in the discovery of such multi-target drugs by identifying the group of proteins targeted by a particular ligand, leading to innovation in pharmaceutical research.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. For this purpose, the emerging field of chemical genomics is currently focused on accumulating large assay data sets describing compound–protein interactions (CPIs). Although new target proteins for known drugs have recently been identified through mining of CPI databases, using these resources to identify novel ligands remains unexplored. Herein, we demonstrate that machine learning of multiple CPIs can not only assess drug polypharmacology but can also efficiently identify novel bioactive scaffold-hopping compounds. Through a machine-learning technique that uses multiple CPIs, we have successfully identified novel lead compounds for two pharmaceutically important protein families, G-protein-coupled receptors and protein kinases. These novel compounds were not identified by existing computational ligand-screening methods in comparative studies. The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
doi:10.1038/msb.2011.5
PMCID: PMC3094066  PMID: 21364574
chemical genomics; data mining; drug discovery; ligand screening; systems chemical biology
4.  The Genome of Spraguea lophii and the Basis of Host-Microsporidian Interactions 
PLoS Genetics  2013;9(8):e1003676.
Microsporidia are obligate intracellular parasites with the smallest known eukaryotic genomes. Although they are increasingly recognized as economically and medically important parasites, the molecular basis of microsporidian pathogenicity is almost completely unknown and no genetic manipulation system is currently available. The fish-infecting microsporidian Spraguea lophii shows one of the most striking host cell manipulations known for these parasites, converting host nervous tissue into swollen spore factories known as xenomas. In order to investigate the basis of these interactions between microsporidian and host, we sequenced and analyzed the S. lophii genome. Although, like other microsporidia, S. lophii has lost many of the protein families typical of model eukaryotes, we identified a number of gene family expansions including a family of leucine-rich repeat proteins that may represent pathogenicity factors. Building on our comparative genomic analyses, we exploited the large numbers of spores that can be obtained from xenomas to identify potential effector proteins experimentally. We used complex-mix proteomics to identify proteins released by the parasite upon germination, resulting in the first experimental isolation of putative secreted effector proteins in a microsporidian. Many of these proteins are not related to characterized pathogenicity factors or indeed any other sequences from outside the Microsporidia. However, two of the secreted proteins are members of a family of RICIN B-lectin-like proteins broadly conserved across the phylum. These proteins form syntenic clusters arising from tandem duplications in several microsporidian genomes and may represent a novel family of conserved effector proteins. These computational and experimental analyses establish S. lophii as an attractive model system for understanding the evolution of host-parasite interactions in microsporidia and suggest an important role for lineage-specific innovations and fast evolving proteins in the evolution of the parasitic microsporidian lifecycle.
Author Summary
Microsporidia are unusual intracellular parasites that infect a broad range of animal cells. In comparison to their fungal relatives, microsporidian genomes have shrunk during evolution, encoding as few as 2000 proteins. This minimal molecular repertoire makes them a reduced model system for understanding host-parasite interactions. A number of microsporidian genomes have now been sequenced, but the lack of a system for genetic manipulation makes it difficult to translate these data into a better understanding of microsporidian biology. Here we present a deep sequencing project of Spraguea lophii, a fish-infecting microsporidian that is abundantly available from environmental samples. We use our sequence data combined with germination protocols and complex-mix proteomics to identify proteins released by the cell at the earliest stage of germination, representing potential pathogenicity factors. We profile the RNA expression pattern of germinating cells and identify a set of highly transcribed hypothetical genes. Our study provides new insight into the importance of uncharacterized, lineage-specific and/or fast evolving proteins in microsporidia and provides new leads for the investigation of virulence factors in these enigmatic parasites.
doi:10.1371/journal.pgen.1003676
PMCID: PMC3749934  PMID: 23990793
5.  Biology of Metastatic Renal Cell Carcinoma 
Journal of Cancer  2011;2:369-373.
In the past ten years we have made exceptional progress in the understanding of RCC biology, particularly by recognizing the crucial pathogenetic role of activation of the HIF/VEGF and mTOR pathways. This has resulted in the successful clinical development of anti-angiogenic and mTOR-targeted drugs, which have profoundly affected the natural history of the disease and have improved the duration and quality of RCC patient lives. However, further improvements are still greatly needed: 1) even in patients who obtain striking clinical responses early in the course of treatment, disease will ultimately escape control and progress to a treatment-resistant state, leading to therapeutic failure; 2) prolonged disease control usually requires 'continuous' treatment, even across different treatment lines, making the impact of chronic, low-grade toxicities on quality of life greater and precluding, for most patients, the possibility of experiencing 'drug-free holidays'; 3) although we have successfully identified classes of drugs (or molecular mechanisms of action) that are effective in a substantial proportion of patients, we still lack molecular predictive factors that identify individual patients who will (or will not) benefit from a specific intervention and still proceed on a trial-and-error basis, far from a truly 'personalized' therapeutic approach; 4) finally (and perhaps most importantly), even in the best-case scenario, currently available treatments inevitably fail to definitively 'cure' metastatic RCC patients. In this review we briefly summarize recent developments in the understanding of the molecular pathogenesis of RCC, the development of resistance/escape mechanisms, the rationale for sequencing agents with different mechanisms of action, and the importance of host-related factors. Unraveling the complex mechanisms by which RCC shapes the host microenvironment and immune response, and by which therapeutic treatments, in turn, shape both cancer cell biology and tumor-host interactions, may hold the key to future advances in such a complex and challenging disease.
PMCID: PMC3157018  PMID: 21850209
RCC; Biology; Signal transduction; HIF; mTOR; Angiogenesis
6.  Gender-Associated Genes in Filarial Nematodes Are Important for Reproduction and Potential Intervention Targets 
Background
A better understanding of reproductive processes in parasitic nematodes may lead to development of new anthelmintics and control strategies for combating disabling and disfiguring neglected tropical diseases such as lymphatic filariasis and onchocerciasis. Transcriptomic analysis has provided important new insights into mechanisms of reproduction and development in other invertebrates. We have performed the first genome-wide analysis of gender-associated (GA) gene expression in a filarial nematode to improve understanding of key reproductive processes in these parasites.
Methodology/Principal Findings
The Version 2 Filarial Microarray with 18,104 elements representing ∼85% of the filarial genome was used to identify GA gene transcripts in adult Brugia malayi worms. Approximately 19% of 14,293 genes were identified as GA genes. Many GA genes have potential Caenorhabditis elegans homologues annotated as germline-, oogenesis-, spermatogenesis-, and early embryogenesis-enriched. The potential C. elegans homologues of the filarial GA genes have a higher frequency of severe RNAi phenotypes (such as lethality and sterility) than other C. elegans genes. Molecular functions and biological processes associated with GA genes were gender-segregated. Peptidase, ligase, transferase, regulator activity for kinase and transcription, and rRNA and lipid binding were associated with female GA genes. In contrast, catalytic activity from kinase, ATP, and carbohydrate binding were associated with male GA genes. Cell cycle, transcription, translation, and biological regulation were increased in females, whereas metabolic processes of phosphate and carbohydrate metabolism, energy generation, and cell communication were increased in males. Significantly enriched pathways in females were associated with cell growth and protein synthesis, whereas metabolic pathways such as pentose phosphate and energy production pathways were enriched in males. There were also striking gender differences in environmental information processing and cell communication pathways. Many proteins encoded by GA genes are secreted by Brugia malayi, and these include immunomodulatory molecules such as antioxidants and host cytokine mimics. Expression of many GA genes has been recently reported to be suppressed by tetracycline, which blocks reproduction in female Brugia malayi. Our localization of GA transcripts in filarial reproductive organs supports the hypothesis that these genes encode proteins involved in reproduction.
Conclusions/Significance
Genome-wide expression profiling coupled with a robust bioinformatics analysis has greatly expanded our understanding of the molecular biology of reproduction in filarial nematodes. This study has highlighted key molecules and pathways associated with reproductive and other biological processes and identified numerous potential candidates for rational drug design to target reproductive processes.
Author Summary
Lymphatic filariasis is a neglected tropical disease that is caused by thread-like parasitic worms that live and reproduce in lymphatic vessels of the human host. There are no vaccines to prevent filariasis, and available drugs are not effective against all stages of the parasite. In addition, recent reports suggest that the filarial nematodes may be developing resistance to key medications. Therefore, there is an urgent need to identify new drug targets in filarial worms. The purpose of this study was to perform a genome-wide analysis of gender-associated gene transcription to improve understanding of key reproductive processes in filarial nematodes. Our results indicate that thousands of genes are differentially expressed in male and female adult worms. Many of those genes are involved in specific reproductive processes such as embryogenesis and spermatogenesis. In addition, expression of some of those genes is suppressed by tetracycline, a drug that leads to sterilization of adult female worms in many filarial species. Thus, gender-associated genes represent priority targets for design of vaccines and drugs that interfere with reproduction of filarial nematodes. Additional work with this type of integrated systems biology approach should lead to important new tools for controlling filarial diseases.
doi:10.1371/journal.pntd.0000947
PMCID: PMC3026763  PMID: 21283610
7.  Production of transmitochondrial cybrids containing naturally occurring pathogenic mtDNA variants 
Nucleic Acids Research  2006;34(13):e95.
The human mitochondrial genome (mtDNA) encodes polypeptides that are critical for coupling oxidative phosphorylation. Our detailed understanding of the molecular processes that mediate mitochondrial gene expression and the structure–function relationships of the OXPHOS components could be greatly improved if we were able to transfect mitochondria and manipulate mtDNA in vivo. Increasing our knowledge of this process is not merely of fundamental importance, as mutations of the mitochondrial genome are known to cause a spectrum of clinical disorders and have been implicated in more common neurodegenerative disease and the ageing process. In organello or in vitro reconstitution studies have identified many factors central to the mechanisms of mitochondrial gene expression, but being able to investigate the molecular aetiology of a limited number of cell lines from patients harbouring mutated mtDNA has been enormously beneficial. In the absence of a mechanism for manipulating mtDNA, a much larger pool of pathogenic mtDNA mutations would increase our knowledge of mitochondrial gene expression. Colonic crypts from ageing individuals harbour mutated mtDNA. Here we show that by generating cytoplasts from colonocytes, standard fusion techniques can be used to transfer mtDNA into rapidly dividing immortalized cells and, thereby, respiratory-deficient transmitochondrial cybrids can be isolated. A simple screen identified clones that carried putative pathogenic mutations in MTRNR1, MTRNR2, MTCOI and MTND2, MTND4 and MTND6. This method can therefore be exploited to produce a library of cell lines carrying pathogenic human mtDNA for further study.
doi:10.1093/nar/gkl516
PMCID: PMC1540737  PMID: 16885236
8.  A decade of molecular pathogenomic analysis of group A Streptococcus  
The Journal of Clinical Investigation  2009;119(9):2455-2463.
Molecular pathogenomic analysis of the human bacterial pathogen group A Streptococcus has been conducted for a decade. Much has been learned as a consequence of the confluence of low-cost DNA sequencing, microarray technology, high-throughput proteomics, and enhanced bioinformatics. These technical advances, coupled with the availability of unique bacterial strain collections, have facilitated a systems biology investigative strategy designed to enhance and accelerate our understanding of disease processes. Here, we provide examples of the progress made by exploiting an integrated genome-wide research platform to gain new insight into molecular pathogenesis. The studies have provided many new avenues for basic and translational research.
doi:10.1172/JCI38095
PMCID: PMC2735924  PMID: 19729843
9.  Discovery of NSAID and anticancer drugs enhancing reprogramming and iPS cell generation 
Stem cells (Dayton, Ohio)  2011;29(10):1528-1536.
Recent breakthroughs in creating induced pluripotent stem cells (iPSCs) provide alternative means to obtain ES-like cells without destroying embryos by introducing four reprogramming factors (Oct3/4, Sox2, and Klf4/c-Myc or Nanog/Lin28) into somatic cells. iPSCs are versatile tools for investigating early developmental processes and could become sources of tissues or cells for regenerative therapies. Here, for the first time, we describe a strategy to analyze genomics datasets of mouse embryonic fibroblasts (MEFs) and embryonic stem (ES) cells to identify genes constituting barriers to iPSC reprogramming. We further show that computational chemical biology combined with genomics analysis can be used to identify small molecules regulating reprogramming. Specific down-regulation by small interfering RNAs (siRNAs) of several key MEF-specific genes encoding proteins with catalytic or regulatory functions, including WISP1, PRRX1, HMGA2, NFIX, PRKG2, COX2, and TGFβ3, greatly increased reprogramming efficiency. Based on this rationale, we screened only 17 small molecules in reprogramming assays and discovered that the NSAID Nabumetone and the anti-cancer drug OHTM can generate iPS cells without Sox2. Nabumetone could also produce iPS cells in the absence of c-Myc or Sox2 without compromising self-renewal and pluripotency of derived iPS cells. In summary, we report a new concept of combining genomics and computational chemical biology to identify new drugs useful for iPSC generation. This hypothesis-driven approach provides an alternative to shot-gun screening and accelerates understanding of molecular mechanisms underlying iPS cell induction.
doi:10.1002/stem.717
PMCID: PMC3419601  PMID: 21898684
NSAIDS; OHTM; iPSC; Sox2; c-Myc
10.  RNAi in Arthropods: Insight into the Machinery and Applications for Understanding the Pathogen-Vector Interface 
Genes  2012;3(4):702-741.
The availability of genome sequencing data in combination with knowledge of expressed genes via transcriptome and proteome data has greatly advanced our understanding of arthropod vectors of disease. Not only have we gained insight into vector biology, but also into their respective vector-pathogen interactions. By combining the strengths of postgenomic databases and reverse genetic approaches such as RNAi, the numbers of available drug and vaccine targets, as well as the number of transgenes for subsequent transgenic or paratransgenic approaches, have expanded. These are now paving the way for in-field control strategies of vectors and their pathogens. Answering basic scientific questions, such as identifying the components of the vector RNAi machinery, is vital, as this allows for the transfer of basic RNAi machinery components into RNAi-deficient vectors, thereby expanding the genetic toolbox of these RNAi-deficient vectors and pathogens. In this review, we focus on the current knowledge of arthropod vector RNAi machinery and the impact of RNAi on understanding vector biology and vector-pathogen interactions for which vector genomic data is available on VectorBase.
doi:10.3390/genes3040702
PMCID: PMC3899984  PMID: 24705082
RNA interference; vector; disease; mosquito; ixodid ticks; body louse; kissing bug; tsetse fly; transgenesis; vaccine; drug target
11.  EBV Chronic Infections 
Infection with Epstein-Barr virus (EBV), the virus of infectious mononucleosis, together with infections by other herpesviruses, is a prototype of persistent viral infection characterized by latency. Although reactivations of the latent infection are associated with the resumption of viral replication and eventually with “shedding”, it is still not clear whether this virus can cause chronic infectious diseases, more or less progressive. These diseases could include some pathological conditions currently defined as “idiopathic” and characterized by “viral persistence” as the most credible pathogenetic factor. Among the so-called idiopathic syndromes, “chronic fatigue syndrome” (CFS) aroused great interest in the 1980s when, because of its relationship with EBV, it was called “chronic mononucleosis” or “chronic EBV infection”.
Today CFS, as defined in 1994 by the CDC of Atlanta (USA), is regarded as a multifactorial syndrome characterized by a chronic course, in which reactivation and remission phases alternate, and by a good prognosis. An etiopathogenetic role of EBV has been demonstrated only in a well-examined subgroup of patients, while in most of the remaining cases this role may be played by other infectious agents able to remain latent or persistent in the host, or even by non-infectious agents (toxic, neuroendocrine, metabolic, etc.). However, the pathogenetic substrate of the different etiologic forms seems to be the same, most probably oxidative damage due to the release of pro-inflammatory cytokines in response to the triggering event (infectious or non-infectious).
Recently, scientists have turned their attention to the genetic predisposition of subjects affected by the syndrome, so that in recent years genetic studies, together with those of molecular biology, have received great impetus. Thanks to both kinds of study it was possible to confirm the etiologic links between the syndrome and EBV, other herpesviruses or other persistent infectious agents.
The mechanisms of EBV latency have been carefully examined both because they represent the virus's strategy for eluding the host immune response, and because they are correlated with the oncologic conditions associated with viral persistence, particularly lymphomas and lymphoproliferative disorders. These malignancies, for which a pathogenetic role of EBV is clearly documented, should represent the main clinical expression of a first group of chronic EBV infections characterized by a natural history in which the neoplastic event arises from lifelong viral persistence in resting B cells, from the genetic predisposition of the host and from the oncogenic potential of a virus that chronically persists and undergoes reactivations.
In reality, these oncological diseases should be considered complications rather than chronic forms of the illness, as should other malignancies for which a viral, or more generally infectious, etiology is well recognized. Chronic diseases, in fact, should be linked pathogenetically and temporally to the acute infection, from which the natural history of the subsequent disease starts. So, as for the chronic liver diseases caused by HBV and HCV, the acronym CAEBV (Chronic Active EBV infection) was coined, distinguishing within these pathologies the more severe forms (SCAEBV), mostly reported in the Far East and among children or adolescents. Probably only these forms should be considered expressions of a chronic EBV infection “sensu stricto”, together with those forms of CFS in which the etiopathogenetic and temporal link with the acute EBV infection is well documented. As for CFS, criteria for a case definition have also been defined for CAEBV, based in part on serological and virological findings. However, the lymphoproliferative disorders are excluded from these forms and maintain their nosographic (e.g. T-cell, B-cell or NK-type lymphomas) and pathogenetic classification, even when they occur within chronic forms of EBV infection. In their pathogenesis, alongside the latency programs of the virus, genetic and environmental factors, independent of the actual natural history of EBV infection, play a crucial role.
Finally, a review was made of the few cases in the literature of chronic EBV infection associated with chronic liver and neurological diseases, where modern techniques of molecular biology should be useful for obtaining a more exact etiologic definition, which was not always possible in the past.
The wide variety of clinical forms associated with chronic EBV infection makes it difficult to find a single pathogenetic link. There is no doubt, however, that a careful examination of the different clinical forms described in this review should help to open new horizons for the study of persistent viral infections and of the still poorly understood pathologies that they can induce in the human host.
doi:10.4084/MJHID.2010.022
PMCID: PMC3033110  PMID: 21415952
12.  The Role of the Toxicologic Pathologist in the Post-Genomic Era 
Journal of Toxicologic Pathology  2013;26(2):105-110.
An era can be defined as a period in time identified by distinctive character, events, or practices. We are now in the genomic era. The pre-genomic era: There was a pre-genomic era. It started many years ago with novel and seminal animal experiments, primarily directed at studying cancer. It is marked by the development of the two-year rodent cancer bioassay and the ultimate realization that alternative approaches and short-term animal models were needed to replace this resource-intensive and time-consuming method for predicting human health risk. Many alternative approaches and short-term animal models were proposed and tried but, to date, none have completely replaced our dependence upon the two-year rodent bioassay. However, the alternative approaches and models themselves have made tangible contributions to basic research, clinical medicine and to our understanding of cancer and they remain useful tools to address hypothesis-driven research questions. The pre-genomic era was a time when toxicologic pathologists played a major role in drug development, evaluating the cancer bioassay and the associated dose-setting toxicity studies, and exploring the utility of proposed alternative animal models. It was a time when there was a shortage of qualified toxicologic pathologists. The genomic era: We are in the genomic era. It is a time when the genetic underpinnings of normal biological and pathologic processes are being discovered and documented. It is a time for sequencing entire genomes and deliberately silencing relevant segments of the mouse genome to see what each segment controls and if that silencing leads to increased susceptibility to disease. What remains to be charted in this genomic era is the complex interaction of genes, gene segments, post-translational modifications of encoded proteins, and environmental factors that affect genomic expression. 
In this current genomic era, the toxicologic pathologist has had to make room for a growing population of molecular biologists. In this present era newly emerging DVM and MD scientists enter the work arena with a PhD in pathology often based on some aspect of molecular biology or molecular pathology research. In molecular biology, the almost daily technological advances require one’s complete dedication to remain at the cutting edge of the science. Similarly, the practice of toxicologic pathology, like other morphological disciplines, is based largely on experience and requires dedicated daily examination of pathology material to maintain a well-trained eye capable of distilling specific information from stained tissue slides - a dedicated effort that cannot be well done as an intermezzo between other tasks. It is a rare individual that has true expertise in both molecular biology and pathology. In this genomic era, the newly emerging DVM-PhD or MD-PhD pathologist enters a marketplace without many job opportunities in contrast to the pre-genomic era. Many face an identity crisis needing to decide to become a competent pathologist or, alternatively, to become a competent molecular biologist. At the same time, more PhD molecular biologists without training in pathology are members of the research teams working in drug development and toxicology. How best can the toxicologic pathologist interact in the contemporary team approach in drug development, toxicology research and safety testing? Based on their biomedical training, toxicologic pathologists are in an ideal position to link data from the emerging technologies with their knowledge of pathobiology and toxicology. To enable this linkage and obtain the synergy it provides, the bench-level, slide-reading expert pathologist will need to have some basic understanding and appreciation of molecular biology methods and tools. 
On the other hand, it is not likely that the typical molecular biologist could competently evaluate and diagnose stained tissue slides from a toxicology study or a cancer bioassay. The post-genomic era: The post-genomic era will likely arrive around 2050, at which time entire genomes from multiple species will exist in massive databases, data from thousands of robotic high throughput chemical screenings will exist in other databases, and genetic toxicity and chemical structure-activity relationships will reside in yet other databases. All databases will be linked and relevant information will be extracted and analyzed by appropriate algorithms following input of the latest molecular, submolecular, genetic, experimental, pathology and clinical data. Knowledge gained will permit the genetic components of many diseases to be amenable to therapeutic prevention and/or intervention. Much like computerized algorithms are currently used to forecast weather or to predict political elections, sophisticated computerized algorithms based largely on scientific data mining will categorize new drugs and chemicals relative to their health benefits versus their health risks for defined human populations and subpopulations. However, this form of a virtual toxicity study or cancer bioassay will only identify probabilities of adverse consequences from interaction of particular environmental and/or chemical/drug exposure(s) with specific genomic variables. Proof in many situations will require confirmation in intact in vivo mammalian animal models. The toxicologic pathologist in the post-genomic era will be the best suited scientist to confirm the data mining and its probability predictions for safety or adverse consequences with the actual tissue morphological features in test species that define specific test agent pathobiology and human health risk.
doi:10.1293/tox.26.105
PMCID: PMC3695332  PMID: 23914052
genomic era; history of toxicologic pathology; molecular biology
13.  Molecular biology research in neuropsychiatry: India’s contribution 
Indian Journal of Psychiatry  2010;52(Suppl1):S120-S127.
Neuropsychiatric disorders represent the second largest cause of morbidity worldwide. These disorders have complex etiology and pathophysiology. The major lacunae in the biology of psychiatric disorders include genomics, biomarkers and drug discovery, which are needed for early detection and have great application in the clinical management of disease. Indian psychiatrists and scientists played a significant role in filling the gaps. The present annotation provides in-depth information on research contributions to the molecular biology of neuropsychiatric disorders in India. There is a great need for further research in this direction to understand the genetic associations of neuropsychiatric disorders, and molecular biology has a tremendous role to play. Alterations in gene expression are implicated in the pathogenesis of several neuropsychiatric disorders, including drug addiction and depression. The development of transgenic animal models of neuropsychiatric disorders is a major thrust area, yet there are no studies from India in this direction. Biomarkers for neuropsychiatric disorders are of great help to clinicians in the early diagnosis of these disorders. Gene-environment interactions, DNA instability and oxidative stress are less studied in neuropsychiatric disorders, and efforts in this direction could make researchers in India pioneers in these areas. In conclusion, we provide insight into future research directions in the molecular understanding of neuropsychiatric disorders.
doi:10.4103/0019-5545.69223
PMCID: PMC3146196  PMID: 21836667
Depression; bipolar disorders; sexual dysfunction; autism; dementia; trace metals; DNA conformation; DNA stability; cell death; D1 receptors; genes; pedigree; enzymes; diet; mutations
14.  Whole Genome Sequencing versus Traditional Genotyping for Investigation of a Mycobacterium tuberculosis Outbreak: A Longitudinal Molecular Epidemiological Study 
PLoS Medicine  2013;10(2):e1001387.
In an outbreak investigation of Mycobacterium tuberculosis comparing whole genome sequencing (WGS) with traditional genotyping, Stefan Niemann and colleagues found that classical genotyping falsely clustered some strains, and WGS better reflected contact tracing.
Background
Understanding Mycobacterium tuberculosis (Mtb) transmission is essential to guide efficient tuberculosis control strategies. Traditional strain typing lacks sufficient discriminatory power to resolve large outbreaks. Here, we tested the potential of using next generation genome sequencing for identification of outbreak-related transmission chains.
Methods and Findings
During long-term (1997 to 2010) prospective population-based molecular epidemiological surveillance comprising a total of 2,301 patients, we identified a large outbreak caused by an Mtb strain of the Haarlem lineage. The main performance outcome measure of whole genome sequencing (WGS) analyses was the degree of correlation of the WGS analyses with contact tracing data and the spatio-temporal distribution of the outbreak cases. WGS analyses of the 86 isolates revealed 85 single nucleotide polymorphisms (SNPs), subdividing the outbreak into seven genome clusters (two to 24 isolates each), plus 36 unique SNP profiles. WGS results showed that the first outbreak isolates detected in 1997 were falsely clustered by classical genotyping. In 1998, one clone (termed “Hamburg clone”) started expanding, apparently independently from differences in the social environment of early cases. Genome-based clustering patterns were in better accordance with contact tracing data and the geographical distribution of the cases than clustering patterns based on classical genotyping. A maximum of three SNPs were identified in eight confirmed human-to-human transmission chains, involving 31 patients. We estimated the Mtb genome evolutionary rate at 0.4 mutations per genome per year. This rate suggests that Mtb grows in its natural host with a doubling time of approximately 22 h (400 generations per year). Based on the genome variation discovered, emergence of the Hamburg clone was dated back to a period between 1993 and 1997, hence shortly before the discovery of the outbreak through epidemiological surveillance.
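The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not taken from the article: it assumes a 365-day year, and the per-generation value it derives is an illustrative consequence of the two quoted numbers rather than a figure reported in the abstract.

```python
# Consistency check of the quoted figures: 0.4 mutations per genome per year
# and a doubling time of approximately 22 h (about 400 generations per year).
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 h


def generations_per_year(doubling_time_h):
    """Number of doublings that fit into one year at a fixed doubling time."""
    return HOURS_PER_YEAR / doubling_time_h


def mutations_per_generation(rate_per_year, doubling_time_h):
    """Per-generation mutation rate implied by a yearly rate and a doubling time."""
    return rate_per_year / generations_per_year(doubling_time_h)


gens = generations_per_year(22.0)
per_gen = mutations_per_generation(0.4, 22.0)
years_per_snp = 1 / 0.4  # expected waiting time for one mutation per genome

print(f"generations per year: {gens:.0f}")
print(f"implied mutations per genome per generation: {per_gen:.4f}")
print(f"expected years per mutation: {years_per_snp:.1f}")
```

At this rate a single new SNP is expected roughly every 2.5 years, which is in line with the report that confirmed transmission chains differed by at most three SNPs.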
Conclusions
Our findings suggest that WGS is superior to conventional genotyping for Mtb pathogen tracing and investigating micro-epidemics. WGS provides a measure of Mtb genome evolution over time in its natural host context.
Editors' Summary
Background
Tuberculosis—a contagious bacterial disease that usually infects the lungs—is a major public health problem, particularly in low- and middle-income countries. In 2011, an estimated 8.7 million people developed tuberculosis globally, and 1.4 million people died from the disease. Tuberculosis is second only to HIV/AIDS in terms of global deaths from a single infectious agent. Mycobacterium tuberculosis, the bacterium that causes tuberculosis, is readily spread in airborne droplets when people with active disease cough or sneeze. The characteristic symptoms of tuberculosis include persistent cough, weight loss, fever, and night sweats. Diagnostic tests for the disease include sputum smear analysis (examination of mucus coughed up from the lungs for the presence of M. tuberculosis), mycobacterial culture (growth of M. tuberculosis from sputum), and chest X-rays. Tuberculosis can be cured by taking several antibiotics daily for at least six months, although the recent emergence of multidrug-resistant M. tuberculosis is making tuberculosis harder to treat.
Why Was This Study Done?
Although efforts to reduce the global burden of tuberculosis are showing some improvements, the annual decline in the number of people developing tuberculosis continues to be slow. To develop optimized control strategies, experts need to be able to accurately track M. tuberculosis transmission within human populations. Because M. tuberculosis, like all bacteria, accumulates genetic changes over time, there are many different strains (genetic variants) of M. tuberculosis. Genotyping methods have been developed that identify different bacterial strains by examining specific regions of the bacterial genome (blueprint), but because these methods examine only a small part of the genome, they may not distinguish between related transmission chains. That is, traditional strain genotyping methods may not be able to determine accurately where a tuberculosis outbreak started or how it spread through a population. In this longitudinal cohort study, the researchers compare the ability of whole genome sequencing (WGS), which is rapidly becoming widely available, and traditional genotyping to provide information about a recent German tuberculosis outbreak. In a longitudinal cohort study, a population is followed over time to analyze the occurrence of a specific disease.
What Did the Researchers Do and Find?
During long-term (1997–2010) population-based molecular epidemiological surveillance (disease surveillance that uses molecular techniques rather than reports of illness) in Hamburg and Schleswig-Holstein, the researchers identified a large tuberculosis outbreak caused by M. tuberculosis isolates of the Haarlem lineage using classical strain typing. The researchers examined each of the 86 isolates from this outbreak using WGS and classical genotyping and asked whether the results of these two approaches correlated with contact tracing data (information is routinely collected about the people a patient with tuberculosis has recently met so that these contacts can be tested for tuberculosis and treated if necessary) and with the spatio-temporal distribution of outbreak cases. WGS of the isolates identified 85 single nucleotide polymorphisms (SNPs; genomic sequence variants in which single building blocks, or nucleotides, are altered) that subdivided the outbreak into seven clusters of isolates and 36 unique isolates. The WGS results showed that the first isolates of the outbreak were incorrectly clustered by classical genotyping and that one strain—the “Hamburg clone”—started expanding in 1998. Notably, the genome-based clustering patterns were in better accordance with contact tracing data and with the geographical distribution of cases than clustering patterns based on classical genotyping, and they identified eight confirmed human-to-human transmission chains that involved 31 patients and a maximum of three SNPs. Finally, the researchers used their WGS results to estimate that the Hamburg clone emerged between 1993 and 1997, shortly before the discovery of the tuberculosis outbreak through epidemiological surveillance.
What Do These Findings Mean?
These findings show that WGS can be used to identify specific strains within large tuberculosis outbreaks more accurately than classical genotyping. They also provide new information about the evolution of M. tuberculosis during outbreaks and indicate how WGS data should be interpreted in future genome-based molecular epidemiology studies. WGS has the potential to improve the molecular epidemiological surveillance and control of tuberculosis and of other infectious diseases. Importantly, note the researchers, ongoing reductions in the cost of WGS, the increased availability of “bench top” genome sequencers, and bioinformatics developments should all accelerate the implementation of WGS as a standard method for the identification of transmission chains in infectious disease outbreaks.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001387.
The World Health Organization provides information (in several languages) on all aspects of tuberculosis, including the Global Tuberculosis Report 2012
The Stop TB Partnership is working towards tuberculosis elimination; patient stories about tuberculosis are available (in English and Spanish)
The US Centers for Disease Control and Prevention has information about tuberculosis, including information on tuberculosis genotyping (some information in English and Spanish)
The US National Institute of Allergy and Infectious Diseases also has detailed information on all aspects of tuberculosis
The Tuberculosis Survival Project, which aims to raise awareness of tuberculosis and provide support for people with tuberculosis, provides personal stories about treatment for tuberculosis; the Tuberculosis Vaccine Initiative also provides personal stories about dealing with tuberculosis
MedlinePlus has links to further information about tuberculosis (in English and Spanish)
Wikipedia has a page on whole-genome sequencing (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001387
PMCID: PMC3570532  PMID: 23424287
15.  An Atypical Kinase under Balancing Selection Confers Broad-Spectrum Disease Resistance in Arabidopsis 
PLoS Genetics  2013;9(9):e1003766.
The failure of gene-for-gene resistance traits to provide durable and broad-spectrum resistance in an agricultural context has led to the search for genes underlying quantitative resistance in plants. Such genes have been identified in only a few cases, all for fungal or nematode resistance, and encode diverse molecular functions. However, the molecular mechanisms of quantitative resistance variation to other enemies and the associated evolutionary forces shaping this variation remain largely unknown. We report the identification, map-based cloning and functional validation of QRX3 (RKS1, Resistance related KinaSe 1), conferring broad-spectrum resistance to Xanthomonas campestris (Xc), a devastating worldwide bacterial vascular pathogen of crucifers. RKS1 encodes an atypical kinase that mediates a quantitative resistance mechanism in plants by restricting bacterial spread from the infection site. Nested Genome-Wide Association mapping revealed a major locus corresponding to an allelic series at RKS1 at the species level. An association between variation in resistance and RKS1 transcription was found using various transgenic lines as well as in natural accessions, suggesting that regulation of RKS1 expression is a major component of quantitative resistance to Xc. The co-existence of long-lived RKS1 haplotypes in A. thaliana is shared with a variety of genes involved in pathogen recognition, suggesting common selective pressures. The identification of RKS1 constitutes a starting point for deciphering the mechanisms underlying broad-spectrum quantitative disease resistance that is effective against a devastating vascular crop pathogen. Because putative RKS1 orthologues have been found in other Brassica species, RKS1 provides an exciting opportunity for plant breeders to improve resistance to black rot in crops.
Author Summary
During the evolution of plant-pathogen interactions, plants have evolved the capability to defend themselves from pathogen infection by different overlapping mechanisms. Disease resistance is constituted by an elaborate, multilayered system of defense. Among these responses, quantitative resistance is a prevalent form of resistance in crops and natural plant populations, for which the genetic and molecular bases remain largely unknown. Thus, identification of the genes underlying quantitative resistance constitutes a major challenge in plant breeding and evolutionary biology, and might have enormous practical implications for human health by increasing crop yield and quality. Our work contributes to understanding the molecular bases of quantitative resistance to the vascular pathogen Xanthomonas campestris (Xc), which is responsible for black rot, an important disease of crucifers worldwide. By multiple approaches, we demonstrate that RKS1 is a quantitative resistance gene in Arabidopsis thaliana conferring broad-spectrum resistance to Xc and that this resistance mechanism in plants is associated with regulation of RKS1 expression. We also provide evidence that RKS1 allelic variation is a major component of quantitative resistance to Xc at the species level. Finally, the long-lived polymorphism associated with RKS1 suggests that evolutionarily stable broad-spectrum resistance to Xc may be achieved in natural populations of A. thaliana.
doi:10.1371/journal.pgen.1003766
PMCID: PMC3772041  PMID: 24068949
16.  Contribution of Genome-Wide Association Studies to Scientific Research: A Pragmatic Approach to Evaluate Their Impact 
PLoS ONE  2013;8(8):e71198.
The factual value of genome-wide association studies (GWAS) for the understanding of multifactorial diseases is a matter of intense debate. Practical consequences for the development of more effective therapies do not seem to be around the corner. Here we propose a pragmatic and objective evaluation of how much new biology is arising from these studies, with particular attention to the information that can help prioritize therapeutic targets. We chose multiple sclerosis (MS) as a paradigm disease and assumed that, in pre-GWAS candidate-gene studies, the knowledge behind the choice of each gene reflected the understanding of the disease prior to the advent of GWAS. Importantly, this knowledge was based mainly on non-genetic, phenotypic grounds. We performed single-gene and pathway-oriented comparisons of old and new knowledge in MS by comparing an unbiased list of candidate genes in pre-GWAS association studies with those genes exceeding the genome-wide significance threshold in GWAS published from 2007 on. At the single-gene level, the majority (94 out of 125) of GWAS-discovered variants had never been contemplated as plausible candidates in pre-GWAS association studies. The 31 genes that were present in both pre- and post-GWAS lists may be of particular interest in that they represent disease-associated variants whose pathogenetic relevance is supported at the phenotypic level (i.e. the phenotypic information that steered their selection as candidate genes in pre-GWAS association studies). As such they represent attractive therapeutic targets. Interestingly, our analysis shows that some of these variants are targets of pharmacologically active compounds, including drugs that are already registered for human use. Compared with the above single-gene analysis, at the pathway level GWAS results appear more consistent with previous knowledge, reinforcing some of the current views on MS pathogenesis and related therapeutic research. 
This study presents a pragmatic approach that helps interpret and exploit GWAS knowledge.
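The single-gene comparison described in this abstract amounts to a set intersection and difference between two gene lists. The sketch below is illustrative only: the function name and the gene identifiers are hypothetical placeholders, and the size of the pre-GWAS list (81 here) is an arbitrary assumption because the abstract does not report it. Only the counts 125, 94 and 31 come from the text.

```python
def compare_gene_lists(pre_gwas, gwas):
    """Split GWAS hits into previously studied candidates and novel genes."""
    shared = gwas & pre_gwas   # genes present in both lists
    novel = gwas - pre_gwas    # GWAS hits never studied as candidates
    return {
        "shared": shared,
        "novel": novel,
        "novel_fraction": len(novel) / len(gwas),
    }


# Placeholder identifiers, not real gene names.
gwas_genes = {f"gwas_gene_{i}" for i in range(125)}
pre_gwas_genes = (
    {f"gwas_gene_{i}" for i in range(31)}          # overlap quoted in the abstract
    | {f"candidate_only_{i}" for i in range(50)}   # hypothetical candidates not confirmed by GWAS
)

result = compare_gene_lists(pre_gwas_genes, gwas_genes)
print(len(result["shared"]), len(result["novel"]), round(result["novel_fraction"], 3))
```

With these counts, about three quarters (94 of 125) of the GWAS hits are novel relative to the candidate-gene literature.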
doi:10.1371/journal.pone.0071198
PMCID: PMC3743868  PMID: 23967165
17.  Metagenomic Assay for Identification of Microbial Pathogens in Tumor Tissues 
mBio  2014;5(5):e01714-14.
ABSTRACT
Screening for thousands of viruses and other pathogenic microorganisms, including bacteria, fungi, and parasites, in human tumor tissues will provide a better understanding of the contributory role of the microbiome in the predisposition for, causes of, and therapeutic responses to the associated cancer. Metagenomic assays designed to perform these tasks will have to include rapid and economical processing of large numbers of samples, supported by a straightforward data analysis pipeline and flexible sample preparation options for multiple input tissue types from individual patients, mammals, or environmental samples. To meet these requirements, the PathoChip platform was developed by targeting viral, prokaryotic, and eukaryotic genomes with multiple DNA probes in a microarray format that can be combined with a variety of upstream sample preparation protocols and downstream data analysis. PathoChip screening of DNA plus RNA from formalin-fixed, paraffin-embedded tumor tissues demonstrated the utility of this platform, and the detection of oncogenic viruses was validated using independent PCR and deep sequencing methods. These studies demonstrate the use of the PathoChip technology combined with PCR and deep sequencing as a valuable strategy for detecting the presence of pathogens in human cancers and other diseases.
IMPORTANCE
This work describes the design and testing of a PathoChip array containing probes with the ability to detect all known publicly available virus sequences as well as hundreds of pathogenic bacteria, fungi, parasites, and helminths. PathoChip provides wide coverage of microbial pathogens in an economical format. PathoChip screening of DNA plus RNA from formalin-fixed, paraffin-embedded tumor tissues demonstrated the utility of this platform, and the detection of oncogenic viruses was validated using independent PCR and sequencing methods. These studies demonstrate that the PathoChip technology is a valuable strategy for detecting the presence of pathogens in human cancers and other diseases.
doi:10.1128/mBio.01714-14
PMCID: PMC4172075  PMID: 25227467
18.  Genotype-Environment Interactions Reveal Causal Pathways That Mediate Genetic Effects on Phenotype 
PLoS Genetics  2013;9(9):e1003803.
Unraveling the molecular processes that lead from genotype to phenotype is crucial for the understanding and effective treatment of genetic diseases. Knowledge of the causative genetic defect most often does not enable treatment; therefore, causal intermediates between genotype and phenotype constitute valuable candidates for molecular intervention points that can be therapeutically targeted. Mapping genetic determinants of gene expression levels (also known as expression quantitative trait loci or eQTL studies) is frequently used for this purpose, yet distinguishing causation from correlation remains a significant challenge. Here, we address this challenge using extensive, multi-environment gene expression and fitness profiling of hundreds of genetically diverse yeast strains, in order to identify truly causal intermediate genes that condition fitness in a given environment. Using functional genomics assays, we show that the predictive power of eQTL studies for inferring causal intermediate genes is poor unless performed across multiple environments. Surprisingly, although the effects of genotype on fitness depended strongly on environment, causal intermediates could be most reliably predicted from genetic effects on expression present in all environments. Our results indicate a mechanism explaining this apparent paradox, whereby immediate molecular consequences of genetic variation are shared across environments, and environment-dependent phenotypic effects result from downstream integration of environmental signals. We developed a statistical model to predict causal intermediates that leverages this insight, yielding over 400 transcripts, for the majority of which we experimentally validated their role in conditioning fitness. 
Our findings have implications for the design and analysis of clinical omics studies aimed at discovering personalized targets for molecular intervention, suggesting that inferring causation in a single cellular context can benefit from molecular profiling in multiple contexts.
Author Summary
A long-standing challenge in biology is to unravel the chain of molecular events linking genetic variation to phenotypes like disease. Identifying the genes that act as intermediates between the underlying genetic variation and the disease can offer new ways to intervene in its progression. While large-scale molecular profiles are an important starting point, it is difficult to distinguish causal relationships from correlative associations. In this study, our goal was to develop strategies to identify these causal intermediates. We studied the effects of genetic differences in baker's yeast on fitness in multiple environmental conditions. While genetic effects on fitness depended strongly on the environment, genetic effects on the expression of truly causal intermediate genes tended to persist despite environmental changes. This indicates that causal intermediates can be found among genes whose expression is affected by genetic variation independently of environment. We thus developed a statistical method to predict causal intermediates based on genetics, gene expression, and fitness in multiple environments. Our study has implications for the design and analysis of clinical molecular profiling efforts towards understanding how genetic variation causes disease, suggesting that multiple contexts (e.g., cell types) can be informative even if they are not afflicted by the disease.
doi:10.1371/journal.pgen.1003803
PMCID: PMC3778020  PMID: 24068968
19.  Heterotachy in Mammalian Promoter Evolution 
PLoS Genetics  2006;2(4):e30.
We have surveyed the evolutionary trends of mammalian promoters and upstream sequences, utilising large sets of experimentally supported transcription start sites (TSSs). With 30,969 well-defined TSSs from mouse and 26,341 from human, there are sufficient numbers to draw statistically meaningful conclusions and to consider differences between promoter types. Unlike previous smaller studies, we have considered the effects of insertions, deletions, and transposable elements as well as nucleotide substitutions. The rate of promoter evolution relative to that of control sequences has not been consistent between lineages nor within lineages over time. The most pronounced manifestation of this heterotachy is the increased rate of evolution in primate promoters. This increase is seen across different classes of mutation, including substitutions and micro-indel events. We investigated the relationship between promoter and coding sequence selective constraint and suggest that they are generally uncorrelated. This analysis also identified a small number of mouse promoters associated with the immune response that are under positive selection in rodents. We demonstrate significant differences in divergence between functional promoter categories and identify a category of promoters, not associated with conventional protein-coding genes, that has the highest rates of divergence across mammals. We find that evolutionary rates vary both on a fine scale within mammalian promoters and also between different functional classes of promoters. The discovery of heterotachy in promoter evolution, in particular the accelerated evolution of primate promoters, has important implications for our understanding of human evolution and for strategies to detect primate-specific regulatory elements.
Synopsis
Promoters are crucial to the regulation of gene expression. They are of considerable interest to molecular biologists from a functional perspective and to a much wider audience, as sequence changes within promoters are likely to be a substantial contributor to disease predisposition and the divergence of species. In mammals, promoters have been extensively studied in a case-by-case manner, but the more general mechanisms of promoter evolution are little understood. The authors have undertaken an extensive study of evolutionary trends across experimentally defined promoters. They have discovered that the relative rate of promoter evolution varies between lineages and is substantially accelerated in primates. The authors conclude that the predominant cause is variation in the mutation rate specifically within promoter regions. This finding has important implications for comparative genomics, in particular the identification of functional sites within promoters. The large datasets in this study have also allowed the pattern of evolution to be considered between different types of promoter, giving new insight into their distinct biology.
doi:10.1371/journal.pgen.0020030
PMCID: PMC1449885  PMID: 16683025
20.  Genomic Profiling Identifies GATA6 as a Candidate Oncogene Amplified in Pancreatobiliary Cancer 
PLoS Genetics  2008;4(5):e1000081.
Pancreatobiliary cancers have among the highest mortality rates of any cancer type. Discovering the full spectrum of molecular genetic alterations may suggest new avenues for therapy. To catalogue genomic alterations, we carried out array-based genomic profiling of 31 exocrine pancreatic cancers and 6 distal bile duct cancers, expanded as xenografts to enrich the tumor cell fraction. We identified numerous focal DNA amplifications and deletions, including in 19% of pancreatobiliary cases gain at cytoband 18q11.2, a locus uncommonly amplified in other tumor types. The smallest shared amplification at 18q11.2 included GATA6, a transcriptional regulator previously linked to normal pancreas development. When amplified, GATA6 was overexpressed at both the mRNA and protein levels, and strong immunostaining was observed in 25 of 54 (46%) primary pancreatic cancers compared to 0 of 33 normal pancreas specimens surveyed. GATA6 expression in xenografts was associated with specific microarray gene-expression patterns, enriched for GATA binding sites and mitochondrial oxidative phosphorylation activity. siRNA mediated knockdown of GATA6 in pancreatic cancer cell lines with amplification led to reduced cell proliferation, cell cycle progression, and colony formation. Our findings indicate that GATA6 amplification and overexpression contribute to the oncogenic phenotypes of pancreatic cancer cells, and identify GATA6 as a candidate lineage-specific oncogene in pancreatobiliary cancer, with implications for novel treatment strategies.
Author Summary
Pancreatic cancer is a devastating disease, having among the lowest survival rates of any cancer. A better understanding of the molecular basis of pancreatic cancer may lead to improved rational therapies. We report here the discovery of amplification (i.e. extra copies) of the GATA6 gene in many human pancreatic cancers. GATA6 is a regulator of gene expression and functions in the development of the normal pancreas. Our findings indicate that its amplification and aberrant overexpression contribute to pancreatic cancer development. GATA6 joins a growing list of cancer genes with key roles in normal human development but pathogenic roles in cancer when aberrantly expressed. Our discovery of GATA6 amplification provides a new foothold into understanding the pathogenic mechanisms underlying pancreatic cancer, and suggests new strategies for therapy by targeting GATA6 or the genes it regulates.
doi:10.1371/journal.pgen.1000081
PMCID: PMC2413204  PMID: 18535672
21.  Engineering microbes to sense and eradicate Pseudomonas aeruginosa, a human pathogen 
A synthetic genetic system is designed and characterized that allows Escherichia coli to sense and eradicate Pseudomonas aeruginosa, providing a novel antimicrobial strategy that could potentially be applied to fighting infectious pathogens.
We have engineered and demonstrated a novel genetic circuit that enables Escherichia coli to produce and release pyocin upon quorum sensing detection of Pseudomonas aeruginosa, which in turn kills P. aeruginosa. The quorum sensing device, which comprises an LasR transcription factor constitutively expressed by a pTetR promoter and a downstream pLuxR inducible promoter, has a switch point of 1.2E-7 M 3OC12HSL and is able to sense 3OC12HSL natively produced by P. aeruginosa. The E7 lysis device, when coupled downstream of the quorum sensing device, enhances pyocin release eight-fold. The engineered E. coli, which carries the sensing, lysing, and killing devices, effectively inhibits the growth of planktonic and biofilm P. aeruginosa by 99 and 90%, respectively.
In this study, we have made progress toward developing a novel antimicrobial strategy, based on an engineered microbial system, using the synthetic biology framework. Our final system was designed to (i) detect AHLs produced by P. aeruginosa; (ii) produce pyocin S5 upon the detection; and (iii) lyse the E. coli cells by E7 lysis protein so that the produced pyocin S5 is released from the cells, leading to the killing of P. aeruginosa.
Figure 1 shows a schematic of our sensing and killing genetic system. The sensing device was designed based on the Type I quorum sensing mechanism of P. aeruginosa. The tetR promoter, which is constitutively on, produces a transcription factor, LasR, that binds to the AHL 3OC12HSL. The luxR promoter, to which the LasR-3OC12HSL activator complex reportedly binds, was adopted as the inducible promoter in our sensing device (Gray et al, 1994). Next, the formation of the LasR-3OC12HSL complex, which binds to the luxR promoter, activates the killing and lysing devices, leading to the production of pyocin S5 and E7 lysis proteins within the E. coli chassis. Upon reaching a threshold concentration, the E7 lysis protein perforates the membrane of the E. coli host and releases the accumulated pyocin S5. Pyocin S5, which is a soluble protein, then diffuses toward the target pathogen and damages its cellular integrity, thereby killing it.
To evaluate and characterize the sensing device, the gene encoding the green fluorescent protein (GFP) was fused to the sensing device and the GFP expression was monitored at a range of concentrations of 3OC12HSL. From the measured GFP synthesis rates, we observed a basal expression level of 0.216 RFU per OD per minute without induction, followed by a sharp increase in GFP production rate as the concentration of 3OC12HSL was increased beyond 1.0E-7 M. A transfer function that describes the static relationship between the input (3OC12HSL) and output (GFP production rate) of the sensing device was determined by fitting an empirical mathematical model (Hill equation) to the experimental data where the input 3OC12HSL concentration is <1.0E-6 M. The resulting best fit model demonstrated that the static performance of the sensing device follows a Hill equation below the input concentration of 1.0E-6 M 3OC12HSL. The model showed that the sensing device saturated at a maximum output of 1.96 RFU per OD per minute at input concentration >3.3E-7 M but <1.0E-6 M 3OC12HSL, and the switch point for the sensing device was 1.2E-7 M 3OC12HSL, the input concentration at which output is at half-maximal. Since this switch point concentration is smaller than the concentration of 3OC12HSL present (1.0E-6 to 1.0E-4 M) within proximity to the site of P. aeruginosa infection as earlier reported in the literature (Pearson et al, 1995; Charlton et al, 2000), the sensing device would be sensitive enough to detect the amount of 3OC12HSL natively produced by P. aeruginosa.
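The transfer function described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' fitted model: the maximum output (1.96 RFU per OD per minute) and the switch point (1.2E-7 M) are taken from the text, but the Hill coefficient is not reported in this summary, so the value `n=2.0` is a placeholder assumption. The basal rate of 0.216 is left out so that the output at the switch point is exactly half-maximal, as the text defines it. The function name and parameter names are hypothetical.

```python
def hill_output(conc, v_max=1.96, k_half=1.2e-7, n=2.0):
    """Approximate GFP production rate (RFU per OD per minute).

    conc   : 3OC12HSL concentration in M (the fit is stated to hold below 1.0E-6 M)
    v_max  : saturating output reported in the text
    k_half : switch point reported in the text (input giving half-maximal output)
    n      : Hill coefficient; ASSUMED placeholder, not given in the source
    """
    if conc <= 0:
        return 0.0
    ratio = (conc / k_half) ** n
    return v_max * ratio / (1.0 + ratio)


# Evaluate the sketch at a few inducer concentrations (M).
for c in (1.0e-8, 1.2e-7, 3.3e-7, 1.0e-6):
    print(f"{c:.1e} M -> {hill_output(c):.3f} RFU/OD/min")
```

Whatever value is chosen for `n`, the output at the switch point is half of `v_max`; the coefficient only changes how steeply the response rises around that point.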
In line with the objective of the E7 lysis device in mediating the export of pyocin, we studied the efficiency of the lysis device in the final system by measuring the amount of the released protein. While distinct bands that corresponded to pyocin S5 were observed on the SDS–PAGE of the final system, no bands were seen in lanes without the lysis device. We further validated the results by estimating the protein concentrations in the supernatant with Bradford assay and showed that the amount of pyocin released by our final system was eight times higher than the system without the lysis device.
To verify that our engineered E. coli can inhibit P. aeruginosa in a mixed culture, we monitored the growth of P. aeruginosa co-cultured with the engineered E. coli in the ratio 1:4 by CFU count. The result shows that our engineered E. coli with the final system effectively inhibited the growth of P. aeruginosa by 99% while continuous growths were apparent in P. aeruginosa co-cultured with incomplete E. coli systems missing either the pyocin S5 or E7 lysis devices.
To examine the potential application of our engineered system against a pseudo disease state of Pseudomonas, a static biofilm inhibition assay was performed. Figure 6A shows that our engineered E. coli inhibited the formation of P. aeruginosa biofilm by close to 90%. This observation is in stark contrast to the pyocin-resistant control strain PAO1 and pyocin-sensitive clinical isolate ln7 subjected to treatment with E. coli having the systems missing either the pyocin S5 or E7 lysis devices. To visualize the extent of biofilm inhibition, biofilm cells with green fluorescence were grown in the presence of engineered E. coli on glass slide substrate and examined with confocal laser scanning microscopy. Figure 6B shows that the morphology of Pseudomonas biofilm treated with the engineered E. coli appeared sparse, while elaborated honey-combed structures were apparent in the control experiments. Collectively, our results suggest that our engineered E. coli carrying the final system, which contains the sensing, killing, and lysing devices, can effectively inhibit the growth of P. aeruginosa in both planktonic and sessile states.
In summary, we engineered a novel biological system, which comprises sensing, killing, and lysing devices, that enables E. coli to sense and eradicate pathogenic P. aeruginosa strains by exploiting the synthetic biology framework. More importantly, our study presents the possibility of engineering potentially beneficial microbiota into therapeutic bioagents to arrest Pseudomonas infection. Given the stalled development of new antibiotics and the increasing emergence of multidrug-resistant pathogens, this study provides the foundational basis for a novel synthetic biology-driven antimicrobial strategy that could be extended to include other pathogens such as Vibrio cholerae and Helicobacter pylori.
Synthetic biology aims to systematically design and construct novel biological systems that address energy, environment, and health issues. Herein, we describe the development of a synthetic genetic system, which comprises quorum sensing, killing, and lysing devices, that enables Escherichia coli to sense and kill a pathogenic Pseudomonas aeruginosa strain through the production and release of pyocin. The sensing, killing, and lysing devices were characterized to elucidate their detection, antimicrobial and pyocin release functionalities, which subsequently aided in the construction of the final system and the verification of its designed behavior. We demonstrated that our engineered E. coli sensed and killed planktonic P. aeruginosa, evidenced by 99% reduction in the viable cells. Moreover, we showed that our engineered E. coli inhibited the formation of P. aeruginosa biofilm by close to 90%, leading to much sparser and thinner biofilm matrices. These results suggest that E. coli carrying our synthetic genetic system may provide a novel synthetic biology-driven antimicrobial strategy that could potentially be applied to fighting P. aeruginosa and other infectious pathogens.
doi:10.1038/msb.2011.55
PMCID: PMC3202794  PMID: 21847113
genetic circuits; Pseudomonas aeruginosa; pyocin; quorum sensing; synthetic biology
22.  Mapping the Genetic Architecture of Gene Expression in Human Liver 
PLoS Biology  2008;6(5):e107.
Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.
Author Summary
Genome-wide association studies seek to identify regions of the genome in which changes in DNA in a given population are correlated with disease, drug response, or other phenotypes of interest. However, changes in DNA that associate with traits like common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in the higher-order disease traits. Therefore, identifying molecular phenotypes that vary in response to changes in DNA that also associate with changes in disease traits can provide the functional information necessary to not only identify and validate the susceptibility genes directly affected by changes in DNA, but to understand as well the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. To enable this type of approach we profiled the expression levels of 39,280 transcripts and genotyped 782,476 SNPs in 427 human liver samples, identifying thousands of DNA variants that strongly associated with liver gene expression. These relationships were then leveraged by integrating them with genotypic and expression data from other human and mouse populations, leading to the direct identification of candidate susceptibility genes corresponding to genetic loci identified as key drivers of disease. Our analysis is able to provide much needed functional support for these candidate susceptibility genes.
Identifying changes in DNA that associate with changes in gene expression in human tissues elucidates the genetic architecture of gene expression in human populations and enables the direct identification of functionally supported candidate susceptibility genes in genomic regions associated with disease.
doi:10.1371/journal.pbio.0060107
PMCID: PMC2365981  PMID: 18462017
23.  Lung eQTLs to Help Reveal the Molecular Underpinnings of Asthma 
PLoS Genetics  2012;8(11):e1003029.
Genome-wide association studies (GWAS) have identified loci reproducibly associated with pulmonary diseases; however, the molecular mechanisms underlying these associations are largely unknown. The objectives of this study were to discover genetic variants affecting gene expression in human lung tissue, to refine susceptibility loci for asthma identified in GWAS, and to use the genetics of gene expression and network analyses to find key molecular drivers of asthma. We performed a genome-wide search for expression quantitative trait loci (eQTL) in 1,111 human lung samples. The lung eQTL dataset was then used to inform asthma genetic studies reported in the literature. The top ranked lung eQTLs were integrated with the GWAS on asthma reported by the GABRIEL consortium to generate a Bayesian gene expression network for discovery of novel molecular pathways underpinning asthma. We detected 17,178 cis- and 593 trans- lung eQTLs, which can be used to explore the functional consequences of loci associated with lung diseases and traits. Some strong eQTLs are also asthma susceptibility loci. For example, rs3859192 on chr17q21 is robustly associated with the mRNA levels of GSDMA (P = 3.55E-151). The genetic-gene expression network identified the SOCS3 pathway as one of the key drivers of asthma. The eQTLs and gene networks identified in this study are powerful tools for elucidating the causal mechanisms underlying pulmonary disease. This data resource offers much-needed support to pinpoint the causal genes and characterize the molecular function of gene variants associated with lung diseases.
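For readers unfamiliar with eQTL mapping, the core of a single SNP-to-transcript test can be illustrated as a regression of expression level on genotype dosage. The sketch below is a generic textbook form, not the pipeline used in the study (which is not described in this abstract); the function name and the toy data are hypothetical, and a real analysis would add covariates, a significance test, and multiple-testing correction.

```python
from statistics import mean


def eqtl_slope(genotypes, expression):
    """Least-squares slope and Pearson r of expression on allele dosage (0, 1, 2)."""
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((e - e_bar) ** 2 for e in expression)
    slope = sxy / sxx                # change in expression per extra allele copy
    r = sxy / (sxx * syy) ** 0.5     # strength of the linear association
    return slope, r


# Hypothetical toy data: six individuals, allele dosage and expression level.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, r = eqtl_slope(genotypes, expression)
print(f"slope per allele: {slope:.2f}, Pearson r: {r:.3f}")
```

A genome-wide search repeats a test of this kind for every SNP and transcript pair, which is why stringent significance thresholds are needed.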
Author Summary
Recent genome-wide association studies (GWAS) have identified genetic variants associated with lung diseases. The challenge now is to find the causal genes in GWAS–nominated chromosomal regions and to characterize the molecular function of disease-associated genetic variants. In this paper, we describe an international effort to systematically capture the genetic architecture of gene expression regulation in human lung. By studying lung specimens from 1,111 individuals of European ancestry, we found a large number of genetic variants affecting gene expression in the lung, or lung expression quantitative trait loci (eQTL). These lung eQTLs will serve as an important resource to aid in the understanding of the molecular underpinnings of lung biology and its disruption in disease. To demonstrate the utility of this lung eQTL dataset, we integrated our data with previous genetic studies on asthma. Through integrative techniques, we identified causal variants and genes in GWAS–nominated loci and found key molecular drivers for asthma. We feel that sharing our lung eQTLs dataset with the scientific community will leverage the impact of previous large-scale GWAS on lung diseases and function by providing much needed functional information to understand the molecular changes introduced by the susceptibility genetic variants.
doi:10.1371/journal.pgen.1003029
PMCID: PMC3510026  PMID: 23209423
24.  Systems Cancer Medicine: Towards Realization of Predictive, Preventive, Personalized, and Participatory (P4) Medicine 
Journal of Internal Medicine  2012;271(2):111-121.
A grand challenge impeding optimal treatment outcomes for cancer patients arises from the complex nature of the disease: the cellular heterogeneity and the myriad of dysfunctional molecular and genetic networks that result from genetic (somatic) and environmental perturbations. Systems biology, with its holistic approach to understanding fundamental principles in biology, and the empowering technologies in genomics, proteomics, single-cell analysis, microfluidics, and computational strategies, enables a comprehensive approach to medicine, which strives to unveil the pathogenic mechanisms of diseases, identify disease biomarkers and begin thinking about new strategies for drug target discovery. The integration of multi-dimensional high throughput “omics” measurements from tumor tissues and corresponding blood specimens, together with new systems strategies for diagnostics, enables the identification of cancer biomarkers that will enable presymptomatic diagnosis, stratification of disease, assessment of disease progression, evaluation of patient response to therapy, and the identification of recurrences. While some aspects of systems medicine are being adopted in clinical oncology practice through companion molecular diagnostics for personalized therapy, the mounting influx of global quantitative data from both wellness and disease is shaping a transformational paradigm in medicine that we have termed predictive, preventive, personalized, and participatory (P4) medicine, which requires new strategies, both scientific and organizational, to bring this revolution in medicine to patients and to the healthcare system. P4 medicine will have a profound impact on society, transforming the healthcare system, turning around the ever escalating costs of healthcare, digitizing the practice of medicine and creating enormous economic opportunities for those organizations and nations that embrace this revolution.
doi:10.1111/j.1365-2796.2011.02498.x
PMCID: PMC3978383  PMID: 22142401
Systems medicine; cancer complexity; quantized cell populations; blood biomarkers; molecular diagnostics; P4 medicine
25.  The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode 
Genome Biology  2014;15(3):R43.
Background
Globodera pallida is a devastating pathogen of potato crops, making it one of the most economically important plant parasitic nematodes. It is also an important model for the biology of cyst nematodes. Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security.
Results
We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life cycle, particularly focusing on the life cycle stages involved in root invasion and establishment of the biotrophic feeding site. Despite the relatively close phylogenetic relationship with root-knot nematodes, we describe a very different gene family content between the two groups and in particular extensive differences in the repertoire of effectors, including an enormous expansion of the SPRY domain protein family in G. pallida, which includes the SPRYSEC family of effectors. This highlights the distinct biology of cyst nematodes compared to the root-knot nematodes that were, until now, the only sedentary plant parasitic nematodes for which genome information was available. We also present in-depth descriptions of the repertoires of other genes likely to be important in understanding the unique biology of cyst nematodes and of potential drug targets and other targets for their control.
Conclusions
The data and analyses we present will be central in exploiting post-genomic approaches in the development of much-needed novel strategies for the control of G. pallida and related pathogens.
doi:10.1186/gb-2014-15-3-r43
PMCID: PMC4054857  PMID: 24580726
