Related Articles

1.  Whole-Organ Isolation Approach as a Basis for Tissue-Specific Analyses in Schistosoma mansoni 
Background
Schistosomiasis is one of the most important parasitic diseases worldwide, second only to malaria. Schistosomes exhibit an exceptional reproductive biology since the sexual maturation of the female, which includes the differentiation of the reproductive organs, is controlled by pairing. Pathogenicity originates from eggs, which cause severe inflammation in their hosts. Elucidation of processes contributing to female maturation is of interest not only to basic science but also to the development of novel concepts for combating schistosomiasis.
Methodology/Principal Findings
To get direct access to the reproductive organs, we established a novel protocol using a combined detergent/protease-treatment removing the tegument and the musculature of adult Schistosoma mansoni. All steps were monitored by scanning electron microscopy (SEM) and bright-field microscopy (BF). We focused on the gonads of adult schistosomes and demonstrated that isolated and purified testes and ovaries can be used for morphological and structural studies as well as sources for RNA and protein of sufficient amounts for subsequent analyses such as RT-PCR and immunoblotting. To this end, first exemplary evidence was obtained for tissue-specific transcription within the gonads (axonemal dynein intermediate chain gene SmAxDynIC; aquaporin gene SmAQP) as well as for post-transcriptional regulation (SmAQP).
Conclusions/Significance
The presented method provides a new way of getting access to tissue-specific material of S. mansoni. With regard to many still unanswered questions of schistosome biology, such as elucidating the molecular processes involved in schistosome reproduction, this protocol provides opportunities for, e.g., sub-transcriptomics and sub-proteomics at the organ level. This will promote the characterisation of gene-expression profiles and, more specifically, help to complete our knowledge of signalling pathways contributing to differentiation processes, thereby revealing molecules that may represent potential targets for novel intervention strategies. Furthermore, gonads and other tissues are a basis for cell isolation, opening new perspectives for establishing cell lines, one of the tools desperately needed in the post-genomic era.
Author Summary
As a neglected disease, schistosomiasis is still an enormous problem in the tropics and subtropics. Since the 1980s, Praziquantel (PZQ) has been the drug of choice but can be anticipated to lose efficacy in the future due to emerging resistance. Alternative drugs or efficient vaccines are still lacking, strengthening the need for the discovery of novel strategies and targets for combating schistosomiasis. One avenue is to understand the unique reproductive biology of this trematode in more detail. Sexual maturation of the adult female depends on a constant pairing with the male. This is a crucial prerequisite for the differentiation of the female reproductive organs such as the vitellarium and ovary, and consequently for the production of mature eggs. These are needed for life-cycle maintenance, but they also cause pathogenesis. With respect to adult males, the production of mature sperm is essential for fertilisation and life-cycle progression. In our study we present a convenient and inexpensive method to isolate reproductive tissues from adult schistosomes in high amounts and purity, representing a source for gonad-specific RNA and protein, which will serve for future sub-transcriptome and -proteome studies helping to characterise genes, or to unravel differentiation programs in schistosome gonads. Beyond that, isolated organs may be useful for approaches to establish cell cultures, desperately needed in the post-genomic era.
doi:10.1371/journal.pntd.0002336
PMCID: PMC3723596  PMID: 23936567
2.  Integrating Computational Biology and Forward Genetics in Drosophila 
PLoS Genetics  2009;5(1):e1000351.
Genetic screens are powerful methods for the discovery of gene–phenotype associations. However, a systems biology approach to genetics must leverage the massive amount of “omics” data to enhance the power and speed of functional gene discovery in vivo. Thus far, few computational methods for gene function prediction have been rigorously tested for their performance on a genome-wide scale in vivo. In this work, we demonstrate that integrating genome-wide computational gene prioritization with large-scale genetic screening is a powerful tool for functional gene discovery. To discover genes involved in neural development in Drosophila, we extend our strategy for the prioritization of human candidate disease genes to functional prioritization in Drosophila. We then integrate this prioritization strategy with a large-scale genetic screen for interactors of the proneural transcription factor Atonal using genomic deficiencies and mutant and RNAi collections. Using the prioritized genes validated in our genetic screen, we describe a novel genetic interaction network for Atonal. Lastly, we prioritize the whole Drosophila genome and identify candidate gene associations for ten receptor-signaling pathways. This novel database of prioritized pathway candidates, as well as a web application for functional prioritization in Drosophila, called Endeavour-HighFly, and the Atonal network, are publicly available resources. A systems genetics approach that combines the power of computational predictions with in vivo genetic screens strongly enhances the process of gene function and gene–gene association discovery.
Author Summary
Genome sequencing and annotation, combined with large-scale molecular experiments to query gene expression and molecular interactions, collectively known as Systems Biology, have resulted in an enormous wealth in biological databases. Yet, it remains a daunting task to use these data to decipher the rules that govern biological systems. One of the most trusted approaches in biology is genetic analysis because of its emphasis on gene function in living organisms. Genetics, however, proceeds slowly and unravels small-scale interactions. Turning genetics into an effective tool of Systems Biology requires harnessing the large-scale molecular data for the design and execution of genetic screens. In this work, we test the idea of exploiting a computational approach known as gene prioritization to pre-rank genes for the likelihood of their involvement in a process of interest. By carrying out a gene prioritization–supported genetic screen, we greatly enhance the speed and output of in vivo genetic screens without compromising their sensitivity. These results mean that future genetic screens can be custom-catered for any process of interest and carried out with a speed and efficiency that is comparable to other large-scale molecular experiments. We refer to this combined approach as Systems Genetics.
doi:10.1371/journal.pgen.1000351
PMCID: PMC2628282  PMID: 19165344
3.  The Genome of Spraguea lophii and the Basis of Host-Microsporidian Interactions 
PLoS Genetics  2013;9(8):e1003676.
Microsporidia are obligate intracellular parasites with the smallest known eukaryotic genomes. Although they are increasingly recognized as economically and medically important parasites, the molecular basis of microsporidian pathogenicity is almost completely unknown and no genetic manipulation system is currently available. The fish-infecting microsporidian Spraguea lophii shows one of the most striking host cell manipulations known for these parasites, converting host nervous tissue into swollen spore factories known as xenomas. In order to investigate the basis of these interactions between microsporidian and host, we sequenced and analyzed the S. lophii genome. Although, like other microsporidia, S. lophii has lost many of the protein families typical of model eukaryotes, we identified a number of gene family expansions including a family of leucine-rich repeat proteins that may represent pathogenicity factors. Building on our comparative genomic analyses, we exploited the large numbers of spores that can be obtained from xenomas to identify potential effector proteins experimentally. We used complex-mix proteomics to identify proteins released by the parasite upon germination, resulting in the first experimental isolation of putative secreted effector proteins in a microsporidian. Many of these proteins are not related to characterized pathogenicity factors or indeed any other sequences from outside the Microsporidia. However, two of the secreted proteins are members of a family of RICIN B-lectin-like proteins broadly conserved across the phylum. These proteins form syntenic clusters arising from tandem duplications in several microsporidian genomes and may represent a novel family of conserved effector proteins. These computational and experimental analyses establish S. lophii as an attractive model system for understanding the evolution of host-parasite interactions in microsporidia and suggest an important role for lineage-specific innovations and fast evolving proteins in the evolution of the parasitic microsporidian lifecycle.
Author Summary
Microsporidia are unusual intracellular parasites that infect a broad range of animal cells. In comparison to their fungal relatives, microsporidian genomes have shrunk during evolution, encoding as few as 2000 proteins. This minimal molecular repertoire makes them a reduced model system for understanding host-parasite interactions. A number of microsporidian genomes have now been sequenced, but the lack of a system for genetic manipulation makes it difficult to translate these data into a better understanding of microsporidian biology. Here we present a deep sequencing project of Spraguea lophii, a fish-infecting microsporidian that is abundantly available from environmental samples. We use our sequence data combined with germination protocols and complex-mix proteomics to identify proteins released by the cell at the earliest stage of germination, representing potential pathogenicity factors. We profile the RNA expression pattern of germinating cells and identify a set of highly transcribed hypothetical genes. Our study provides new insight into the importance of uncharacterized, lineage-specific and/or fast evolving proteins in microsporidia and provides new leads for the investigation of virulence factors in these enigmatic parasites.
doi:10.1371/journal.pgen.1003676
PMCID: PMC3749934  PMID: 23990793
4.  Gender-Associated Genes in Filarial Nematodes Are Important for Reproduction and Potential Intervention Targets 
Background
A better understanding of reproductive processes in parasitic nematodes may lead to development of new anthelmintics and control strategies for combating disabling and disfiguring neglected tropical diseases such as lymphatic filariasis and onchocerciasis. Transcriptomic analysis has provided important new insights into mechanisms of reproduction and development in other invertebrates. We have performed the first genome-wide analysis of gender-associated (GA) gene expression in a filarial nematode to improve understanding of key reproductive processes in these parasites.
Methodology/Principal Findings
The Version 2 Filarial Microarray with 18,104 elements representing ∼85% of the filarial genome was used to identify GA gene transcripts in adult Brugia malayi worms. Approximately 19% of 14,293 genes were identified as GA genes. Many GA genes have potential Caenorhabditis elegans homologues annotated as germline-, oogenesis-, spermatogenesis-, and early embryogenesis-enriched. The potential C. elegans homologues of the filarial GA genes have a higher frequency of severe RNAi phenotypes (such as lethality and sterility) than other C. elegans genes. Molecular functions and biological processes associated with GA genes were gender-segregated. Peptidase, ligase, transferase, regulator activity for kinase and transcription, and rRNA and lipid binding were associated with female GA genes. In contrast, catalytic activity from kinase, ATP, and carbohydrate binding were associated with male GA genes. Cell cycle, transcription, translation, and biological regulation were increased in females, whereas metabolic processes of phosphate and carbohydrate metabolism, energy generation, and cell communication were increased in males. Significantly enriched pathways in females were associated with cell growth and protein synthesis, whereas metabolic pathways such as pentose phosphate and energy production pathways were enriched in males. There were also striking gender differences in environmental information processing and cell communication pathways. Many proteins encoded by GA genes are secreted by Brugia malayi, and these include immunomodulatory molecules such as antioxidants and host cytokine mimics. Expression of many GA genes has been recently reported to be suppressed by tetracycline, which blocks reproduction in female Brugia malayi. Our localization of GA transcripts in filarial reproductive organs supports the hypothesis that these genes encode proteins involved in reproduction.
Conclusions/Significance
Genome-wide expression profiling coupled with a robust bioinformatics analysis has greatly expanded our understanding of the molecular biology of reproduction in filarial nematodes. This study has highlighted key molecules and pathways associated with reproductive and other biological processes and identified numerous potential candidates for rational drug design to target reproductive processes.
Author Summary
Lymphatic filariasis is a neglected tropical disease that is caused by thread-like parasitic worms that live and reproduce in lymphatic vessels of the human host. There are no vaccines to prevent filariasis, and available drugs are not effective against all stages of the parasite. In addition, recent reports suggest that the filarial nematodes may be developing resistance to key medications. Therefore, there is an urgent need to identify new drug targets in filarial worms. The purpose of this study was to perform a genome-wide analysis of gender-associated gene transcription to improve understanding of key reproductive processes in filarial nematodes. Our results indicate that thousands of genes are differentially expressed in male and female adult worms. Many of those genes are involved in specific reproductive processes such as embryogenesis and spermatogenesis. In addition, expression of some of those genes is suppressed by tetracycline, a drug that leads to sterilization of adult female worms in many filarial species. Thus, gender-associated genes represent priority targets for design of vaccines and drugs that interfere with reproduction of filarial nematodes. Additional work with this type of integrated systems biology approach should lead to important new tools for controlling filarial diseases.
doi:10.1371/journal.pntd.0000947
PMCID: PMC3026763  PMID: 21283610
5.  Production of transmitochondrial cybrids containing naturally occurring pathogenic mtDNA variants 
Nucleic Acids Research  2006;34(13):e95.
The human mitochondrial genome (mtDNA) encodes polypeptides that are critical for coupling oxidative phosphorylation. Our detailed understanding of the molecular processes that mediate mitochondrial gene expression and the structure–function relationships of the OXPHOS components could be greatly improved if we were able to transfect mitochondria and manipulate mtDNA in vivo. Increasing our knowledge of this process is not merely of fundamental importance, as mutations of the mitochondrial genome are known to cause a spectrum of clinical disorders and have been implicated in more common neurodegenerative disease and the ageing process. In organello or in vitro reconstitution studies have identified many factors central to the mechanisms of mitochondrial gene expression, but being able to investigate the molecular aetiology of a limited number of cell lines from patients harbouring mutated mtDNA has been enormously beneficial. In the absence of a mechanism for manipulating mtDNA, a much larger pool of pathogenic mtDNA mutations would increase our knowledge of mitochondrial gene expression. Colonic crypts from ageing individuals harbour mutated mtDNA. Here we show that by generating cytoplasts from colonocytes, standard fusion techniques can be used to transfer mtDNA into rapidly dividing immortalized cells and, thereby, respiratory-deficient transmitochondrial cybrids can be isolated. A simple screen identified clones that carried putative pathogenic mutations in MTRNR1, MTRNR2, MTCOI, MTND2, MTND4 and MTND6. This method can therefore be exploited to produce a library of cell lines carrying pathogenic human mtDNA for further study.
doi:10.1093/nar/gkl516
PMCID: PMC1540737  PMID: 16885236
6.  A decade of molecular pathogenomic analysis of group A Streptococcus  
The Journal of Clinical Investigation  2009;119(9):2455-2463.
Molecular pathogenomic analysis of the human bacterial pathogen group A Streptococcus has been conducted for a decade. Much has been learned as a consequence of the confluence of low-cost DNA sequencing, microarray technology, high-throughput proteomics, and enhanced bioinformatics. These technical advances, coupled with the availability of unique bacterial strain collections, have facilitated a systems biology investigative strategy designed to enhance and accelerate our understanding of disease processes. Here, we provide examples of the progress made by exploiting an integrated genome-wide research platform to gain new insight into molecular pathogenesis. The studies have provided many new avenues for basic and translational research.
doi:10.1172/JCI38095
PMCID: PMC2735924  PMID: 19729843
7.  Discovery of NSAID and anticancer drugs enhancing reprogramming and iPS cell generation 
Stem cells (Dayton, Ohio)  2011;29(10):1528-1536.
Recent breakthroughs in creating induced pluripotent stem cells (iPSCs) provide alternative means to obtain ES-like cells without destroying embryos by introducing four reprogramming factors (Oct3/4, Sox2, and Klf4/c-Myc or Nanog/ Lin28) into somatic cells. iPSCs are versatile tools for investigating early developmental processes and could become sources of tissues or cells for regenerative therapies. Here, for the first time, we describe a strategy to analyze genomics datasets of mouse embryonic fibroblasts (MEFs) and embryonic stem (ES) cells to identify genes constituting barriers to iPSC reprogramming. We further show that computational chemical biology combined with genomics analysis can be used to identify small molecules regulating reprogramming. Specific down-regulation by small interfering RNAs (siRNAs) of several key MEF-specific genes encoding proteins with catalytic or regulatory functions, including WISP1, PRRX1, HMGA2, NFIX, PRKG2, COX2, and TGFβ3, greatly increased reprogramming efficiency. Based on this rationale, we screened only 17 small molecules in reprogramming assays and discovered that the NSAID Nabumetone and the anti-cancer drug OHTM can generate iPS cells without Sox2. Nabumetone could also produce iPS cells in the absence of c-Myc or Sox2 without compromising self-renewal and pluripotency of derived iPS cells. In summary, we report a new concept of combining genomics and computational chemical biology to identify new drugs useful for iPSC generation. This hypothesis-driven approach provides an alternative to shot-gun screening and accelerates understanding of molecular mechanisms underlying iPS cell induction.
doi:10.1002/stem.717
PMCID: PMC3419601  PMID: 21898684
NSAIDS; OHTM; iPSC; Sox2; c-Myc
8.  RNAi in Arthropods: Insight into the Machinery and Applications for Understanding the Pathogen-Vector Interface 
Genes  2012;3(4):702-741.
The availability of genome sequencing data in combination with knowledge of expressed genes via transcriptome and proteome data has greatly advanced our understanding of arthropod vectors of disease. Not only have we gained insight into vector biology, but also into their respective vector-pathogen interactions. By combining the strengths of postgenomic databases and reverse genetic approaches such as RNAi, the numbers of available drug and vaccine targets, as well as the number of transgenes for subsequent transgenic or paratransgenic approaches, have expanded. These are now paving the way for in-field control strategies of vectors and their pathogens. Addressing basic scientific questions, such as understanding the basic components of the vector RNAi machinery, is vital, as this allows for the transfer of basic RNAi machinery components into RNAi-deficient vectors, thereby expanding the genetic toolbox of these RNAi-deficient vectors and pathogens. In this review, we focus on the current knowledge of arthropod vector RNAi machinery and the impact of RNAi on understanding vector biology and vector-pathogen interactions for which vector genomic data is available on VectorBase.
doi:10.3390/genes3040702
PMCID: PMC3899984  PMID: 24705082
RNA interference; vector; disease; mosquito; ixodid ticks; body louse; kissing bug; tsetse fly; transgenesis; vaccine; drug target
9.  The Role of the Toxicologic Pathologist in the Post-Genomic Era 
Journal of Toxicologic Pathology  2013;26(2):105-110.
An era can be defined as a period in time identified by distinctive character, events, or practices. We are now in the genomic era.
The pre-genomic era: There was a pre-genomic era. It started many years ago with novel and seminal animal experiments, primarily directed at studying cancer. It is marked by the development of the two-year rodent cancer bioassay and the ultimate realization that alternative approaches and short-term animal models were needed to replace this resource-intensive and time-consuming method for predicting human health risk. Many alternative approaches and short-term animal models were proposed and tried but, to date, none have completely replaced our dependence upon the two-year rodent bioassay. However, the alternative approaches and models themselves have made tangible contributions to basic research, clinical medicine and to our understanding of cancer and they remain useful tools to address hypothesis-driven research questions. The pre-genomic era was a time when toxicologic pathologists played a major role in drug development, evaluating the cancer bioassay and the associated dose-setting toxicity studies, and exploring the utility of proposed alternative animal models. It was a time when there was a shortage of qualified toxicologic pathologists.
The genomic era: We are in the genomic era. It is a time when the genetic underpinnings of normal biological and pathologic processes are being discovered and documented. It is a time for sequencing entire genomes and deliberately silencing relevant segments of the mouse genome to see what each segment controls and if that silencing leads to increased susceptibility to disease. What remains to be charted in this genomic era is the complex interaction of genes, gene segments, post-translational modifications of encoded proteins, and environmental factors that affect genomic expression. In this current genomic era, the toxicologic pathologist has had to make room for a growing population of molecular biologists. In this present era newly emerging DVM and MD scientists enter the work arena with a PhD in pathology often based on some aspect of molecular biology or molecular pathology research. In molecular biology, the almost daily technological advances require one’s complete dedication to remain at the cutting edge of the science. Similarly, the practice of toxicologic pathology, like other morphological disciplines, is based largely on experience and requires dedicated daily examination of pathology material to maintain a well-trained eye capable of distilling specific information from stained tissue slides - a dedicated effort that cannot be well done as an intermezzo between other tasks. It is a rare individual that has true expertise in both molecular biology and pathology. In this genomic era, the newly emerging DVM-PhD or MD-PhD pathologist enters a marketplace without many job opportunities in contrast to the pre-genomic era. Many face an identity crisis needing to decide to become a competent pathologist or, alternatively, to become a competent molecular biologist. At the same time, more PhD molecular biologists without training in pathology are members of the research teams working in drug development and toxicology. How best can the toxicologic pathologist interact in the contemporary team approach in drug development, toxicology research and safety testing? Based on their biomedical training, toxicologic pathologists are in an ideal position to link data from the emerging technologies with their knowledge of pathobiology and toxicology. To enable this linkage and obtain the synergy it provides, the bench-level, slide-reading expert pathologist will need to have some basic understanding and appreciation of molecular biology methods and tools. On the other hand, it is not likely that the typical molecular biologist could competently evaluate and diagnose stained tissue slides from a toxicology study or a cancer bioassay.
The post-genomic era: The post-genomic era will likely arrive around 2050, at which time entire genomes from multiple species will exist in massive databases, data from thousands of robotic high throughput chemical screenings will exist in other databases, and genetic toxicity and chemical structure-activity-relationships will reside in yet other databases. All databases will be linked and relevant information will be extracted and analyzed by appropriate algorithms following input of the latest molecular, submolecular, genetic, experimental, pathology and clinical data. Knowledge gained will permit the genetic components of many diseases to be amenable to therapeutic prevention and/or intervention. Much like computerized algorithms are currently used to forecast weather or to predict political elections, computerized sophisticated algorithms based largely on scientific data mining will categorize new drugs and chemicals relative to their health benefits versus their health risks for defined human populations and subpopulations. However, this form of a virtual toxicity study or cancer bioassay will only identify probabilities of adverse consequences from interaction of particular environmental and/or chemical/drug exposure(s) with specific genomic variables. Proof in many situations will require confirmation in intact in vivo mammalian animal models. The toxicologic pathologist in the post-genomic era will be the best suited scientist to confirm the data mining and its probability predictions for safety or adverse consequences with the actual tissue morphological features in test species that define specific test agent pathobiology and human health risk.
doi:10.1293/tox.26.105
PMCID: PMC3695332  PMID: 23914052
genomic era; history of toxicologic pathology; molecular biology
10.  Whole Genome Sequencing versus Traditional Genotyping for Investigation of a Mycobacterium tuberculosis Outbreak: A Longitudinal Molecular Epidemiological Study 
PLoS Medicine  2013;10(2):e1001387.
In an outbreak investigation of Mycobacterium tuberculosis comparing whole genome sequencing (WGS) with traditional genotyping, Stefan Niemann and colleagues found that classical genotyping falsely clustered some strains, and WGS better reflected contact tracing.
Background
Understanding Mycobacterium tuberculosis (Mtb) transmission is essential to guide efficient tuberculosis control strategies. Traditional strain typing lacks sufficient discriminatory power to resolve large outbreaks. Here, we tested the potential of using next generation genome sequencing for identification of outbreak-related transmission chains.
Methods and Findings
During long-term (1997 to 2010) prospective population-based molecular epidemiological surveillance comprising a total of 2,301 patients, we identified a large outbreak caused by an Mtb strain of the Haarlem lineage. The main performance outcome measure of whole genome sequencing (WGS) analyses was the degree of correlation of the WGS analyses with contact tracing data and the spatio-temporal distribution of the outbreak cases. WGS analyses of the 86 isolates revealed 85 single nucleotide polymorphisms (SNPs), subdividing the outbreak into seven genome clusters (two to 24 isolates each), plus 36 unique SNP profiles. WGS results showed that the first outbreak isolates detected in 1997 were falsely clustered by classical genotyping. In 1998, one clone (termed “Hamburg clone”) started expanding, apparently independently from differences in the social environment of early cases. Genome-based clustering patterns were in better accordance with contact tracing data and the geographical distribution of the cases than clustering patterns based on classical genotyping. A maximum of three SNPs were identified in eight confirmed human-to-human transmission chains, involving 31 patients. We estimated the Mtb genome evolutionary rate at 0.4 mutations per genome per year. This rate suggests that Mtb grows in its natural host with a doubling time of approximately 22 h (400 generations per year). Based on the genome variation discovered, emergence of the Hamburg clone was dated back to a period between 1993 and 1997, hence shortly before the discovery of the outbreak through epidemiological surveillance.
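The link between the quoted doubling time and the number of generations per year is simple arithmetic, and a short check may help readers follow it. The Python sketch below is illustrative only: the function names are hypothetical, a 365-day year is assumed, and the per-generation figure is derived here for illustration rather than reported in the abstract.

```python
# Illustrative arithmetic only; not the authors' analysis.
# Assumption: a year has 365 days (leap years ignored).
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def generations_per_year(doubling_time_h: float) -> float:
    """Number of cell doublings that fit into one year."""
    return HOURS_PER_YEAR / doubling_time_h


def mutations_per_generation(rate_per_year: float, doubling_time_h: float) -> float:
    """Spread a yearly per-genome mutation rate over the yearly generations."""
    return rate_per_year / generations_per_year(doubling_time_h)


if __name__ == "__main__":
    # Values quoted in the abstract: 22 h doubling time, 0.4 mutations/genome/year.
    gens = generations_per_year(22.0)
    per_gen = mutations_per_generation(0.4, 22.0)
    print(f"{gens:.0f} generations per year")
    print(f"{per_gen:.4f} mutations per genome per generation")
```

With a 22 h doubling time this gives roughly 398 generations per year, consistent with the rounded figure of 400 in the abstract, and about 0.001 mutations per genome per generation under these assumptions.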
Conclusions
Our findings suggest that WGS is superior to conventional genotyping for Mtb pathogen tracing and investigating micro-epidemics. WGS provides a measure of Mtb genome evolution over time in its natural host context.
Editors' Summary
Background
Tuberculosis—a contagious bacterial disease that usually infects the lungs—is a major public health problem, particularly in low- and middle-income countries. In 2011, an estimated 8.7 million people developed tuberculosis globally, and 1.4 million people died from the disease. Tuberculosis is second only to HIV/AIDS in terms of global deaths from a single infectious agent. Mycobacterium tuberculosis, the bacterium that causes tuberculosis, is readily spread in airborne droplets when people with active disease cough or sneeze. The characteristic symptoms of tuberculosis include persistent cough, weight loss, fever, and night sweats. Diagnostic tests for the disease include sputum smear analysis (examination of mucus coughed up from the lungs for the presence of M. tuberculosis), mycobacterial culture (growth of M. tuberculosis from sputum), and chest X-rays. Tuberculosis can be cured by taking several antibiotics daily for at least six months, although the recent emergence of multidrug-resistant M. tuberculosis is making tuberculosis harder to treat.
Why Was This Study Done?
Although efforts to reduce the global burden of tuberculosis are showing some improvements, the annual decline in the number of people developing tuberculosis continues to be slow. To develop optimized control strategies, experts need to be able to accurately track M. tuberculosis transmission within human populations. Because M. tuberculosis, like all bacteria, accumulates genetic changes over time, there are many different strains (genetic variants) of M. tuberculosis. Genotyping methods have been developed that identify different bacterial strains by examining specific regions of the bacterial genome (blueprint), but because these methods examine only a small part of the genome, they may not distinguish between related transmission chains. That is, traditional strain genotyping methods may not be able to determine accurately where a tuberculosis outbreak started or how it spread through a population. In this longitudinal cohort study, the researchers compare the ability of whole genome sequencing (WGS), which is rapidly becoming widely available, and traditional genotyping to provide information about a recent German tuberculosis outbreak. In a longitudinal cohort study, a population is followed over time to analyze the occurrence of a specific disease.
What Did the Researchers Do and Find?
During long-term (1997–2010) population-based molecular epidemiological surveillance (disease surveillance that uses molecular techniques rather than reports of illness) in Hamburg and Schleswig-Holstein, the researchers identified a large tuberculosis outbreak caused by M. tuberculosis isolates of the Haarlem lineage using classical strain typing. The researchers examined each of the 86 isolates from this outbreak using WGS and classical genotyping and asked whether the results of these two approaches correlated with contact tracing data (information is routinely collected about the people a patient with tuberculosis has recently met so that these contacts can be tested for tuberculosis and treated if necessary) and with the spatio-temporal distribution of outbreak cases. WGS of the isolates identified 85 single nucleotide polymorphisms (SNPs; genomic sequence variants in which single building blocks, or nucleotides, are altered) that subdivided the outbreak into seven clusters of isolates and 36 unique isolates. The WGS results showed that the first isolates of the outbreak were incorrectly clustered by classical genotyping and that one strain—the “Hamburg clone”—started expanding in 1998. Notably, the genome-based clustering patterns were in better accordance with contact tracing data and with the geographical distribution of cases than clustering patterns based on classical genotyping, and they identified eight confirmed human-to-human transmission chains that involved 31 patients and a maximum of three SNPs. Finally, the researchers used their WGS results to estimate that the Hamburg clone emerged between 1993 and 1997, shortly before the discovery of the tuberculosis outbreak through epidemiological surveillance.
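The genome-based clustering described above can be illustrated with a minimal sketch. It assumes each isolate is represented by its alleles at a shared set of variable positions, and it links isolates whose pairwise distance is at most a threshold. The default threshold of three SNPs echoes the maximum reported for the confirmed transmission chains. The isolate names, the toy allele strings, and the single-linkage rule are hypothetical illustrations, not the pipeline used in the study.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length allele strings differ."""
    if len(a) != len(b):
        raise ValueError("allele strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_isolates(alleles: dict[str, str], max_snps: int = 3) -> list[set[str]]:
    """Single-linkage clustering: link isolates within max_snps of each other."""
    parent = {name: name for name in alleles}

    def find(name: str) -> str:
        # Follow parent links to the cluster representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(alleles, 2):
        if snp_distance(alleles[a], alleles[b]) <= max_snps:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in alleles:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy data: alleles at ten variable positions per isolate.
toy = {
    "iso_A": "AAAAAAAAAA",
    "iso_B": "AAAAAAAAAC",  # 1 SNP from iso_A
    "iso_C": "AAAAAAAGGC",  # 2 SNPs from iso_B, 3 SNPs from iso_A
    "iso_D": "CCCCCGGGGG",  # more than 3 SNPs from every other isolate
}
clusters = cluster_isolates(toy)
print(clusters)
```

Single linkage is only one possible grouping rule; a stricter rule (for example, requiring every pair in a cluster to be within the threshold) would split chains in which early and late isolates have drifted apart.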
What Do These Findings Mean?
These findings show that WGS can be used to identify specific strains within large tuberculosis outbreaks more accurately than classical genotyping. They also provide new information about the evolution of M. tuberculosis during outbreaks and indicate how WGS data should be interpreted in future genome-based molecular epidemiology studies. WGS has the potential to improve the molecular epidemiological surveillance and control of tuberculosis and of other infectious diseases. Importantly, note the researchers, ongoing reductions in the cost of WGS, the increased availability of “bench top” genome sequencers, and bioinformatics developments should all accelerate the implementation of WGS as a standard method for the identification of transmission chains in infectious disease outbreaks.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001387.
The World Health Organization provides information (in several languages) on all aspects of tuberculosis, including the Global Tuberculosis Report 2012
The Stop TB Partnership is working towards tuberculosis elimination; patient stories about tuberculosis are available (in English and Spanish)
The US Centers for Disease Control and Prevention has information about tuberculosis, including information on tuberculosis genotyping (some information in English and Spanish)
The US National Institute of Allergy and Infectious Diseases also has detailed information on all aspects of tuberculosis
The Tuberculosis Survival Project, which aims to raise awareness of tuberculosis and provide support for people with tuberculosis, provides personal stories about treatment for tuberculosis; the Tuberculosis Vaccine Initiative also provides personal stories about dealing with tuberculosis
MedlinePlus has links to further information about tuberculosis (in English and Spanish)
Wikipedia has a page on whole-genome sequencing (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001387
PMCID: PMC3570532  PMID: 23424287
11.  An Atypical Kinase under Balancing Selection Confers Broad-Spectrum Disease Resistance in Arabidopsis 
PLoS Genetics  2013;9(9):e1003766.
The failure of gene-for-gene resistance traits to provide durable and broad-spectrum resistance in an agricultural context has led to the search for genes underlying quantitative resistance in plants. Such genes have been identified in only a few cases, all for fungal or nematode resistance, and encode diverse molecular functions. However, the molecular mechanisms of quantitative resistance variation to other enemies, and the evolutionary forces shaping this variation, remain largely unknown. We report the identification, map-based cloning and functional validation of QRX3 (RKS1, Resistance related KinaSe 1), conferring broad-spectrum resistance to Xanthomonas campestris (Xc), a devastating worldwide bacterial vascular pathogen of crucifers. RKS1 encodes an atypical kinase that mediates a quantitative resistance mechanism in plants by restricting bacterial spread from the infection site. Nested Genome-Wide Association mapping revealed a major locus corresponding to an allelic series at RKS1 at the species level. An association between variation in resistance and RKS1 transcription was found using various transgenic lines as well as in natural accessions, suggesting that regulation of RKS1 expression is a major component of quantitative resistance to Xc. The co-existence of long-lived RKS1 haplotypes in A. thaliana is shared with a variety of genes involved in pathogen recognition, suggesting common selective pressures. The identification of RKS1 constitutes a starting point for deciphering the mechanisms underlying broad-spectrum quantitative disease resistance that is effective against a devastating vascular crop pathogen. Because putative RKS1 orthologues have been found in other Brassica species, RKS1 provides an exciting opportunity for plant breeders to improve resistance to black rot in crops.
Author Summary
During the evolution of plant-pathogen interactions, plants have evolved the capability to defend themselves from pathogen infection by different overlapping mechanisms. Disease resistance is constituted by an elaborate, multilayered system of defense. Among these responses, quantitative resistance is a prevalent form of resistance in crops and natural plant populations, for which the genetic and molecular bases remain largely unknown. Thus, identification of the genes underlying quantitative resistance constitutes a major challenge in plant breeding and evolutionary biology, and might have enormous practical implications for human health by increasing crop yield and quality. Our work contributes to understanding the molecular bases of quantitative resistance to the vascular pathogen Xanthomonas campestris (Xc), which is responsible for black rot, an important disease of crucifers worldwide. By multiple approaches, we demonstrate that RKS1 is a quantitative resistance gene in Arabidopsis thaliana conferring broad-spectrum resistance to Xc and that this resistance mechanism in plants is associated with regulation of RKS1 expression. We also provide evidence that RKS1 allelic variation is a major component of quantitative resistance to Xc at the species level. Finally, the long-lived polymorphism associated with RKS1 suggests that evolutionarily stable broad-spectrum resistance to Xc may be achieved in natural populations of A. thaliana.
doi:10.1371/journal.pgen.1003766
PMCID: PMC3772041  PMID: 24068949
12.  Contribution of Genome-Wide Association Studies to Scientific Research: A Pragmatic Approach to Evaluate Their Impact 
PLoS ONE  2013;8(8):e71198.
The factual value of genome-wide association studies (GWAS) for the understanding of multifactorial diseases is a matter of intense debate. Practical consequences for the development of more effective therapies do not seem to be around the corner. Here we propose a pragmatic and objective evaluation of how much new biology is arising from these studies, with particular attention to the information that can help prioritize therapeutic targets. We chose multiple sclerosis (MS) as a paradigm disease and assumed that, in pre-GWAS candidate-gene studies, the knowledge behind the choice of each gene reflected the understanding of the disease prior to the advent of GWAS. Importantly, this knowledge was based mainly on non-genetic, phenotypic grounds. We performed single-gene and pathway-oriented comparisons of old and new knowledge in MS by confronting an unbiased list of candidate genes in pre-GWAS association studies with those genes exceeding the genome-wide significance threshold in GWAS published from 2007 on. At the single gene level, the majority (94 out of 125) of GWAS-discovered variants had never been contemplated as plausible candidates in pre-GWAS association studies. The 31 genes that were present in both pre- and post-GWAS lists may be of particular interest in that they represent disease-associated variants whose pathogenetic relevance is supported at the phenotypic level (i.e. the phenotypic information that steered their selection as candidate genes in pre-GWAS association studies). As such they represent attractive therapeutic targets. Interestingly, our analysis shows that some of these variants are targets of pharmacologically active compounds, including drugs that are already registered for human use. Compared with the above single-gene analysis, at the pathway level GWAS results appear more coherent with previous knowledge, reinforcing some of the current views on MS pathogenesis and related therapeutic research. 
This study presents a pragmatic approach that helps interpret and exploit GWAS knowledge.
doi:10.1371/journal.pone.0071198
PMCID: PMC3743868  PMID: 23967165
13.  Molecular biology research in neuropsychiatry: India’s contribution 
Indian Journal of Psychiatry  2010;52(Suppl1):S120-S127.
Neuropsychiatric disorders represent the second largest cause of morbidity worldwide. These disorders have complex etiology and pathophysiology. The major lacunae in the biology of psychiatric disorders lie in genomics, biomarkers, and drug discovery, areas that are important for early detection and have great application in the clinical management of disease. Indian psychiatrists and scientists have played a significant role in filling these gaps. The present annotation provides in-depth information on research contributions from India to the molecular biology of neuropsychiatric disorders. There is a great need for further research in this direction to understand the genetic associations of neuropsychiatric disorders; molecular biology has a tremendous role to play. Alterations in gene expression are implicated in the pathogenesis of several neuropsychiatric disorders, including drug addiction and depression. The development of transgenic animal models of neuropsychiatric disorders is a major thrust area, but there are no studies from India in this direction. Biomarkers are of great help to clinicians in the early diagnosis of these disorders. Gene-environment interactions, DNA instability, and oxidative stress are less studied in neuropsychiatric disorders, and efforts in this direction could make India a pioneer in these areas of research. In conclusion, we provide insight into future research directions in the molecular understanding of neuropsychiatric disorders.
doi:10.4103/0019-5545.69223
PMCID: PMC3146196  PMID: 21836667
Depression; bipolar disorders; sexual dysfunction; autism; dementia; trace metals; DNA conformation; DNA stability; cell death; D1 receptors; genes; pedigree; enzymes; diet; mutations
14.  Mapping the Genetic Architecture of Gene Expression in Human Liver 
PLoS Biology  2008;6(5):e107.
Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.
Author Summary
Genome-wide association studies seek to identify regions of the genome in which changes in DNA in a given population are correlated with disease, drug response, or other phenotypes of interest. However, changes in DNA that associate with traits like common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in the higher-order disease traits. Therefore, identifying molecular phenotypes that vary in response to changes in DNA that also associate with changes in disease traits can provide the functional information necessary to not only identify and validate the susceptibility genes directly affected by changes in DNA, but to understand as well the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. To enable this type of approach we profiled the expression levels of 39,280 transcripts and genotyped 782,476 SNPs in 427 human liver samples, identifying thousands of DNA variants that strongly associated with liver gene expression. These relationships were then leveraged by integrating them with genotypic and expression data from other human and mouse populations, leading to the direct identification of candidate susceptibility genes corresponding to genetic loci identified as key drivers of disease. Our analysis is able to provide much needed functional support for these candidate susceptibility genes.
Identifying changes in DNA that associate with changes in gene expression in human tissues elucidates the genetic architecture of gene expression in human populations and enables the direct identification of functionally supported candidate susceptibility genes in genomic regions associated with disease.
doi:10.1371/journal.pbio.0060107
PMCID: PMC2365981  PMID: 18462017
15.  Genomic Profiling Identifies GATA6 as a Candidate Oncogene Amplified in Pancreatobiliary Cancer 
PLoS Genetics  2008;4(5):e1000081.
Pancreatobiliary cancers have among the highest mortality rates of any cancer type. Discovering the full spectrum of molecular genetic alterations may suggest new avenues for therapy. To catalogue genomic alterations, we carried out array-based genomic profiling of 31 exocrine pancreatic cancers and 6 distal bile duct cancers, expanded as xenografts to enrich the tumor cell fraction. We identified numerous focal DNA amplifications and deletions, including gain at cytoband 18q11.2, a locus uncommonly amplified in other tumor types, in 19% of pancreatobiliary cases. The smallest shared amplification at 18q11.2 included GATA6, a transcriptional regulator previously linked to normal pancreas development. When amplified, GATA6 was overexpressed at both the mRNA and protein levels, and strong immunostaining was observed in 25 of 54 (46%) primary pancreatic cancers compared to 0 of 33 normal pancreas specimens surveyed. GATA6 expression in xenografts was associated with specific microarray gene-expression patterns, enriched for GATA binding sites and mitochondrial oxidative phosphorylation activity. siRNA-mediated knockdown of GATA6 in pancreatic cancer cell lines with amplification led to reduced cell proliferation, cell cycle progression, and colony formation. Our findings indicate that GATA6 amplification and overexpression contribute to the oncogenic phenotypes of pancreatic cancer cells, and identify GATA6 as a candidate lineage-specific oncogene in pancreatobiliary cancer, with implications for novel treatment strategies.
Author Summary
Pancreatic cancer is a devastating disease, having among the lowest survival rates of any cancer. A better understanding of the molecular basis of pancreatic cancer may lead to improved rational therapies. We report here the discovery of amplification (i.e. extra copies) of the GATA6 gene in many human pancreatic cancers. GATA6 is a regulator of gene expression and functions in the development of the normal pancreas. Our findings indicate that its amplification and aberrant overexpression contribute to pancreatic cancer development. GATA6 joins a growing list of cancer genes with key roles in normal human development but pathogenic roles in cancer when aberrantly expressed. Our discovery of GATA6 amplification provides a new foothold into understanding the pathogenic mechanisms underlying pancreatic cancer, and suggests new strategies for therapy by targeting GATA6 or the genes it regulates.
doi:10.1371/journal.pgen.1000081
PMCID: PMC2413204  PMID: 18535672
16.  Lung eQTLs to Help Reveal the Molecular Underpinnings of Asthma 
PLoS Genetics  2012;8(11):e1003029.
Genome-wide association studies (GWAS) have identified loci reproducibly associated with pulmonary diseases; however, the molecular mechanisms underlying these associations are largely unknown. The objectives of this study were to discover genetic variants affecting gene expression in human lung tissue, to refine susceptibility loci for asthma identified in GWAS, and to use the genetics of gene expression and network analyses to find key molecular drivers of asthma. We performed a genome-wide search for expression quantitative trait loci (eQTL) in 1,111 human lung samples. The lung eQTL dataset was then used to inform asthma genetic studies reported in the literature. The top ranked lung eQTLs were integrated with the GWAS on asthma reported by the GABRIEL consortium to generate a Bayesian gene expression network for discovery of novel molecular pathways underpinning asthma. We detected 17,178 cis- and 593 trans- lung eQTLs, which can be used to explore the functional consequences of loci associated with lung diseases and traits. Some strong eQTLs are also asthma susceptibility loci. For example, rs3859192 on chr17q21 is robustly associated with the mRNA levels of GSDMA (P = 3.55×10^−151). The genetic-gene expression network identified the SOCS3 pathway as one of the key drivers of asthma. The eQTLs and gene networks identified in this study are powerful tools for elucidating the causal mechanisms underlying pulmonary disease. This data resource offers much-needed support to pinpoint the causal genes and characterize the molecular function of gene variants associated with lung diseases.
Author Summary
Recent genome-wide association studies (GWAS) have identified genetic variants associated with lung diseases. The challenge now is to find the causal genes in GWAS-nominated chromosomal regions and to characterize the molecular function of disease-associated genetic variants. In this paper, we describe an international effort to systematically capture the genetic architecture of gene expression regulation in human lung. By studying lung specimens from 1,111 individuals of European ancestry, we found a large number of genetic variants affecting gene expression in the lung, or lung expression quantitative trait loci (eQTL). These lung eQTLs will serve as an important resource to aid in the understanding of the molecular underpinnings of lung biology and its disruption in disease. To demonstrate the utility of this lung eQTL dataset, we integrated our data with previous genetic studies on asthma. Through integrative techniques, we identified causal variants and genes in GWAS-nominated loci and found key molecular drivers for asthma. We feel that sharing our lung eQTL dataset with the scientific community will leverage the impact of previous large-scale GWAS on lung diseases and function by providing much-needed functional information to understand the molecular changes introduced by the susceptibility genetic variants.
doi:10.1371/journal.pgen.1003029
PMCID: PMC3510026  PMID: 23209423
17.  Metagenomic Assay for Identification of Microbial Pathogens in Tumor Tissues 
mBio  2014;5(5):e01714-14.
ABSTRACT
Screening for thousands of viruses and other pathogenic microorganisms, including bacteria, fungi, and parasites, in human tumor tissues will provide a better understanding of the contributory role of the microbiome in the predisposition for, causes of, and therapeutic responses to the associated cancer. Metagenomic assays designed to perform these tasks will have to include rapid and economical processing of large numbers of samples, supported by a straightforward data analysis pipeline and flexible sample preparation options for multiple input tissue types from individual patients, mammals, or environmental samples. To meet these requirements, the PathoChip platform was developed by targeting viral, prokaryotic, and eukaryotic genomes with multiple DNA probes in a microarray format that can be combined with a variety of upstream sample preparation protocols and downstream data analysis. PathoChip screening of DNA plus RNA from formalin-fixed, paraffin-embedded tumor tissues demonstrated the utility of this platform, and the detection of oncogenic viruses was validated using independent PCR and deep sequencing methods. These studies demonstrate the use of the PathoChip technology combined with PCR and deep sequencing as a valuable strategy for detecting the presence of pathogens in human cancers and other diseases.
IMPORTANCE
This work describes the design and testing of a PathoChip array containing probes with the ability to detect all known publicly available virus sequences as well as hundreds of pathogenic bacteria, fungi, parasites, and helminths. PathoChip provides wide coverage of microbial pathogens in an economical format. PathoChip screening of DNA plus RNA from formalin-fixed, paraffin-embedded tumor tissues demonstrated the utility of this platform, and the detection of oncogenic viruses was validated using independent PCR and sequencing methods. These studies demonstrate that the PathoChip technology is a valuable strategy for detecting the presence of pathogens in human cancers and other diseases.
doi:10.1128/mBio.01714-14
PMCID: PMC4172075  PMID: 25227467
18.  Engineering microbes to sense and eradicate Pseudomonas aeruginosa, a human pathogen 
A synthetic genetic system is designed and characterized that allows Escherichia coli to sense and eradicate Pseudomonas aeruginosa, providing a novel antimicrobial strategy that could potentially be applied to fighting infectious pathogens.
We have engineered and demonstrated a novel genetic circuit that enables Escherichia coli to produce and release pyocin upon quorum sensing detection of Pseudomonas aeruginosa, which in turn kills P. aeruginosa. The quorum sensing device, which comprises a LasR transcription factor constitutively expressed by a pTetR promoter and a downstream pLuxR inducible promoter, has a switch point of 1.2 × 10^-7 M 3OC12HSL and is able to sense 3OC12HSL natively produced by P. aeruginosa. The E7 lysis device, when coupled downstream of the quorum sensing device, enhances pyocin release eight-fold. The engineered E. coli, which carries the sensing, lysing, and killing devices, effectively inhibits the growth of planktonic and biofilm P. aeruginosa by 99 and 90%, respectively.
In this study, we have made progress toward developing a novel antimicrobial strategy, based on an engineered microbial system, using the synthetic biology framework. Our final system was designed to (i) detect AHLs produced by P. aeruginosa; (ii) produce pyocin S5 upon the detection; and (iii) lyse the E. coli cells by E7 lysis protein so that the produced pyocin S5 is released from the cells, leading to the killing of P. aeruginosa.
Figure 1 shows a schematic of our sensing and killing genetic system. The sensing device was designed based on the Type I quorum sensing mechanism of P. aeruginosa. The tetR promoter, which is constitutively on, drives production of a transcription factor, LasR, that binds to the AHL 3OC12HSL. The luxR promoter, to which the LasR-3OC12HSL activator complex reportedly binds, was adopted as the inducible promoter in our sensing device (Gray et al, 1994). Next, the formation of the LasR-3OC12HSL complex, which binds to the luxR promoter, activates the killing and lysing devices, leading to the production of pyocin S5 and E7 lysis proteins within the E. coli chassis. Upon reaching a threshold concentration, the E7 lysis protein perforates the membrane of the E. coli host and releases the accumulated pyocin S5. Pyocin S5, which is a soluble protein, then diffuses toward the target pathogen and damages its cellular integrity, thereby killing it.
To evaluate and characterize the sensing device, the gene encoding the green fluorescent protein (GFP) was fused to the sensing device and the GFP expression was monitored at a range of concentrations of 3OC12HSL. From the measured GFP synthesis rates, we observed a basal expression level of 0.216 RFU per OD per minute without induction, followed by a sharp increase in GFP production rate as the concentration of 3OC12HSL was increased beyond 1.0E-7 M. A transfer function that describes the static relationship between the input (3OC12HSL) and output (GFP production rate) of the sensing device was determined by fitting an empirical mathematical model (Hill equation) to the experimental data where the input 3OC12HSL concentration is <1.0E-6 M. The resulting best fit model demonstrated that the static performance of the sensing device follows a Hill equation below the input concentration of 1.0E-6 M 3OC12HSL. The model showed that the sensing device saturated at a maximum output of 1.96 RFU per OD per minute at input concentrations >3.3E-7 M but <1.0E-6 M 3OC12HSL, and the switch point for the sensing device, the input concentration at which the output is half-maximal, was 1.2E-7 M 3OC12HSL. Since this switch point concentration is smaller than the concentration of 3OC12HSL present (1.0E-6 to 1.0E-4 M) in proximity to the site of P. aeruginosa infection as reported earlier in the literature (Pearson et al, 1995; Charlton et al, 2000), the sensing device would be sensitive enough to detect the amount of 3OC12HSL natively produced by P. aeruginosa.
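The transfer function described above can be sketched numerically. The basal rate, saturated rate, and switch point below are the values quoted in the text; the exact functional form (basal level plus a Hill term) and the Hill coefficient are assumptions made for illustration, since neither is stated here, and the sketch reads the switch point as the concentration at which the output lies midway between the basal and saturated levels.

```python
BASAL = 0.216          # RFU per OD per minute without induction (from the text)
MAX_OUTPUT = 1.96      # RFU per OD per minute at saturation (from the text)
SWITCH_POINT = 1.2e-7  # M 3OC12HSL (from the text)
HILL_N = 2.0           # ASSUMPTION: Hill coefficient is not given in the text


def gfp_rate(conc_m: float, n: float = HILL_N) -> float:
    """Assumed Hill-type transfer function: GFP production rate vs. 3OC12HSL.

    Intended only for inputs below 1.0E-6 M, the range the text says the
    fitted model covers.
    """
    if conc_m < 0:
        raise ValueError("concentration must be non-negative")
    ratio = (conc_m / SWITCH_POINT) ** n
    return BASAL + (MAX_OUTPUT - BASAL) * ratio / (1.0 + ratio)


for conc in (0.0, 1.0e-8, 1.2e-7, 3.3e-7, 9.0e-7):
    print(f"{conc:.1e} M -> {gfp_rate(conc):.3f} RFU per OD per minute")
```

With a fitted Hill coefficient in place of the assumed one, the same function could be used to check how far above the switch point a given infection-site concentration sits.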
In line with the objective of the E7 lysis device in mediating the export of pyocin, we studied the efficiency of the lysis device in the final system by measuring the amount of the released protein. While distinct bands that corresponded to pyocin S5 were observed on the SDS–PAGE of the final system, no bands were seen in lanes without the lysis device. We further validated the results by estimating the protein concentrations in the supernatant with a Bradford assay and showed that the amount of pyocin released by our final system was eight times higher than that released by the system without the lysis device.
To verify that our engineered E. coli can inhibit P. aeruginosa in a mixed culture, we monitored the growth of P. aeruginosa co-cultured with the engineered E. coli in the ratio 1:4 by CFU count. The result shows that our engineered E. coli with the final system effectively inhibited the growth of P. aeruginosa by 99%, while continuous growth was apparent in P. aeruginosa co-cultured with incomplete E. coli systems missing either the pyocin S5 or E7 lysis devices.
To examine the potential application of our engineered system against a pseudo disease state of Pseudomonas, a static biofilm inhibition assay was performed. Figure 6A shows that our engineered E. coli inhibited the formation of P. aeruginosa biofilm by close to 90%. This observation is in stark contrast to the pyocin-resistant control strain PAO1 and pyocin-sensitive clinical isolate ln7 subjected to treatment with E. coli having the systems missing either the pyocin S5 or E7 lysis devices. To visualize the extent of biofilm inhibition, biofilm cells with green fluorescence were grown in the presence of engineered E. coli on a glass slide substrate and examined with confocal laser scanning microscopy. Figure 6B shows that the morphology of Pseudomonas biofilm treated with the engineered E. coli appeared sparse, while elaborate honeycombed structures were apparent in the control experiments. Collectively, our results suggest that our engineered E. coli carrying the final system, which contains the sensing, killing, and lysing devices, can effectively inhibit the growth of P. aeruginosa in both planktonic and sessile states.
In summary, we engineered a novel biological system, which comprises sensing, killing, and lysing devices, that enables E. coli to sense and eradicate pathogenic P. aeruginosa strains by exploiting the synthetic biology framework. More importantly, our study presents the possibility of engineering potentially beneficial microbiota into therapeutic bioagents to arrest Pseudomonas infection. Given the stalled development of new antibiotics and the increasing emergence of multidrug-resistant pathogens, this study provides the foundational basis for a novel synthetic biology-driven antimicrobial strategy that could be extended to include other pathogens such as Vibrio cholerae and Helicobacter pylori.
Synthetic biology aims to systematically design and construct novel biological systems that address energy, environment, and health issues. Herein, we describe the development of a synthetic genetic system, which comprises quorum sensing, killing, and lysing devices, that enables Escherichia coli to sense and kill a pathogenic Pseudomonas aeruginosa strain through the production and release of pyocin. The sensing, killing, and lysing devices were characterized to elucidate their detection, antimicrobial and pyocin release functionalities, which subsequently aided in the construction of the final system and the verification of its designed behavior. We demonstrated that our engineered E. coli sensed and killed planktonic P. aeruginosa, evidenced by 99% reduction in the viable cells. Moreover, we showed that our engineered E. coli inhibited the formation of P. aeruginosa biofilm by close to 90%, leading to much sparser and thinner biofilm matrices. These results suggest that E. coli carrying our synthetic genetic system may provide a novel synthetic biology-driven antimicrobial strategy that could potentially be applied to fighting P. aeruginosa and other infectious pathogens.
doi:10.1038/msb.2011.55
PMCID: PMC3202794  PMID: 21847113
genetic circuits; Pseudomonas aeruginosa; pyocin; quorum sensing; synthetic biology
19.  Systems Cancer Medicine: Towards Realization of Predictive, Preventive, Personalized, and Participatory (P4) Medicine 
Journal of internal medicine  2012;271(2):111-121.
A grand challenge impeding optimal treatment outcomes for cancer patients arises from the complex nature of the disease: the cellular heterogeneity and the myriad of dysfunctional molecular and genetic networks that result from genetic (somatic) and environmental perturbations. Systems biology, with its holistic approach to understanding fundamental principles in biology, and the empowering technologies in genomics, proteomics, single-cell analysis, microfluidics, and computational strategies, enables a comprehensive approach to medicine, which strives to unveil the pathogenic mechanisms of diseases, identify disease biomarkers and begin thinking about new strategies for drug target discovery. The integration of multi-dimensional high throughput “omics” measurements from tumor tissues and corresponding blood specimens, together with new systems strategies for diagnostics, enables the identification of cancer biomarkers that will enable presymptomatic diagnosis, stratification of disease, assessment of disease progression, evaluation of patient response to therapy, and the identification of reoccurrences. While some aspects of systems medicine are being adopted in clinical oncology practice through companion molecular diagnostics for personalized therapy, the mounting influx of global quantitative data from both wellness and diseases is shaping up a transformational paradigm in medicine we termed predictive, preventive, personalized, and participatory (P4) medicine, which requires new strategies, both scientific and organizational, to enable bringing this revolution in medicine to patients and to the healthcare system. P4 medicine will have a profound impact on society—transforming the healthcare system, turning around the ever escalating costs of healthcare, digitizing the practice of medicine and creating enormous economic opportunities for those organizations and nations that embrace this revolution.
doi:10.1111/j.1365-2796.2011.02498.x
PMCID: PMC3978383  PMID: 22142401
Systems medicine; cancer complexity; quantized cell populations; blood biomarkers; molecular diagnostics; P4 medicine
20.  The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode 
Genome Biology  2014;15(3):R43.
Background
Globodera pallida is a devastating pathogen of potato crops, making it one of the most economically important plant parasitic nematodes. It is also an important model for the biology of cyst nematodes. Cyst nematodes and root-knot nematodes are the two most important plant parasitic nematode groups and together represent a global threat to food security.
Results
We present the complete genome sequence of G. pallida, together with transcriptomic data from most of the nematode life cycle, particularly focusing on the life cycle stages involved in root invasion and establishment of the biotrophic feeding site. Despite the relatively close phylogenetic relationship with root-knot nematodes, we describe a very different gene family content between the two groups and in particular extensive differences in the repertoire of effectors, including an enormous expansion of the SPRY domain protein family in G. pallida, which includes the SPRYSEC family of effectors. This highlights the distinct biology of cyst nematodes compared to the root-knot nematodes that were, until now, the only sedentary plant parasitic nematodes for which genome information was available. We also present in-depth descriptions of the repertoires of other genes likely to be important in understanding the unique biology of cyst nematodes and of potential drug targets and other targets for their control.
Conclusions
The data and analyses we present will be central in exploiting post-genomic approaches in the development of much-needed novel strategies for the control of G. pallida and related pathogens.
doi:10.1186/gb-2014-15-3-r43
PMCID: PMC4054857  PMID: 24580726
21.  Analysis of multiple compound–protein interactions reveals novel bioactive molecules 
The authors use machine learning of compound-protein interactions to explore drug polypharmacology and to efficiently identify bioactive ligands, including novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein coupled receptors and protein kinases.
We have demonstrated that machine learning of multiple compound–protein interactions is useful for efficient ligand screening and for assessing drug polypharmacology. This approach successfully identified novel scaffold-hopping compounds for two pharmaceutically important protein families: G-protein-coupled receptors and protein kinases. These bioactive compounds were not detected by existing computational ligand-screening methods in comparative studies. The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. Perturbations of biological systems by chemical probes provide broader applications not only for analysis of complex systems but also for intentional manipulations of these systems. Nevertheless, the lack of well-characterized chemical modulators has limited their use. Recently, chemical genomics has emerged as a promising area of research applicable to the exploration of novel bioactive molecules, and researchers are currently striving toward the identification of all possible ligands for all target protein families (Wang et al, 2009). Chemical genomics studies have shown that patterns of compound–protein interactions (CPIs) are too diverse to be understood as simple one-to-one events. There is an urgent need to develop appropriate data mining methods for characterizing and visualizing the full complexity of interactions between chemical space and biological systems. However, no existing screening approach has so far succeeded in identifying novel bioactive compounds using multiple interactions among compounds and target proteins.
High-throughput screening (HTS) and computational screening have greatly aided in the identification of early lead compounds for drug discovery. However, the large number of assays required for HTS to identify drugs that target multiple proteins render this process very costly and time-consuming. Therefore, interest in using in silico strategies for screening has increased. The most common computational approaches, ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS; Oprea and Matter, 2004; Muegge and Oloff, 2006; McInnes, 2007; Figure 1A), have been used for practical drug development. LBVS aims to identify molecules that are very similar to known active molecules and generally has difficulty identifying compounds with novel structural scaffolds that differ from reference molecules. The other popular strategy, SBVS, is constrained by the number of three-dimensional crystallographic structures available. To circumvent these limitations, we have shown that a new computational screening strategy, chemical genomics-based virtual screening (CGBVS), has the potential to identify novel, scaffold-hopping compounds and assess their polypharmacology by using a machine-learning method to recognize conserved molecular patterns in comprehensive CPI data sets.
The CGBVS strategy used in this study was made up of five steps: CPI data collection, descriptor calculation, representation of interaction vectors, predictive model construction using training data sets, and predictions from test data (Figure 1A). Importantly, step 1, the construction of a data set of chemical structures and protein sequences for known CPIs, did not require the three-dimensional protein structures needed for SBVS. In step 2, compound structures and protein sequences were converted into numerical descriptors. These descriptors were used to construct chemical or biological spaces in which decreasing distance between vectors corresponded to increasing similarity of compound structures or protein sequences. In step 3, we represented multiple CPI patterns by concatenating these chemical and protein descriptors. Using these interaction vectors, we could quantify the similarity of molecular interactions for compound–protein pairs, despite the fact that the ligand and protein similarity maps differed substantially. In step 4, concatenated vectors for CPI pairs (positive samples) and non-interacting pairs (negative samples) were input into an established machine-learning method. In the final step, the classifier constructed using training sets was applied to test data.
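Steps 3 to 5 above can be illustrated with a minimal sketch in Python. Everything concrete here is an assumption made for illustration: the descriptor values are toy numbers, the function names are hypothetical, and a nearest-centroid rule stands in for the "established machine-learning method", which the text does not name.

```python
import math

def interaction_vector(compound_desc, protein_desc):
    """Step 3: concatenate compound and protein descriptors into one vector."""
    return list(compound_desc) + list(protein_desc)

def train_centroids(vectors, labels):
    """Step 4 (stand-in classifier): compute the mean vector of each class."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def predict(centroids, vec):
    """Step 5: return the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))

# Toy training set (hypothetical descriptors): 1 = interacting pair, 0 = non-interacting pair.
training_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1),
    ([0.9, 0.1], [1.0, 0.0], 1),
    ([0.0, 1.0], [1.0, 0.0], 0),
    ([0.0, 1.0], [0.0, 1.0], 0),
]
vectors = [interaction_vector(c, p) for c, p, _ in training_pairs]
labels = [y for _, _, y in training_pairs]
model = train_centroids(vectors, labels)

# Score an unseen compound against the first protein.
prediction = predict(model, interaction_vector([0.8, 0.2], [1.0, 0.0]))
```

A plain concatenation with a linear or centroid-based rule cannot capture every compound-protein dependency, so a real application would presumably use a more expressive classifier; the sketch only shows how the interaction vectors are assembled and passed through training and prediction.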
To evaluate the predictive value of CGBVS, we first compared its performance with that of LBVS by fivefold cross-validation. CGBVS performed with considerably higher accuracy (91.9%) than did LBVS (84.4%; Figure 1B). We next compared CGBVS and SBVS in a retrospective virtual screening based on the human β2-adrenergic receptor (ADRB2). Figure 1C shows that CGBVS provided higher hit rates than did SBVS. These results suggest that CGBVS is more successful than conventional approaches for prediction of CPIs.
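The fivefold cross-validation mentioned above can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the helper names are hypothetical, the folds are contiguous rather than shuffled or stratified, and a majority-class rule is used as a placeholder model on toy labels.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(samples, labels, fit, predict, k=5):
    """Mean held-out accuracy over k folds (assumes len(samples) >= k)."""
    scores = []
    for test_idx in k_fold_indices(len(samples), k):
        held_out = set(test_idx)
        train_x = [s for i, s in enumerate(samples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        model = fit(train_x, train_y)
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Placeholder model: always predict the most common training label.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)

def predict_majority(model, sample):
    return model

toy_samples = list(range(10))
toy_labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
accuracy = cross_val_accuracy(toy_samples, toy_labels, fit_majority, predict_majority)
```

Each sample is held out exactly once, so the averaged score reflects performance on data the model did not see during training. In practice the samples would normally be shuffled before splitting.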
We then evaluated the ability of the CGBVS method to predict the polypharmacology of ADRB2 by attempting to identify novel ADRB2 ligands from a group of G-protein-coupled receptor (GPCR) ligands. We ranked the prediction scores for the interactions of 826 reported GPCR ligands with ADRB2 and then analyzed the 50 highest-ranked compounds in greater detail. Of 21 commercially available compounds, 11 showed ADRB2-binding activity and were not previously reported to be ADRB2 ligands. These compounds included ligands not only for aminergic receptors but also for neuropeptide Y-type 1 receptors (NPY1R), which have low protein homology to ADRB2. Most ligands we identified were not detected by LBVS and SBVS, which suggests that only CGBVS could identify this unexpected cross-reaction for a ligand developed as a target to a peptidergic receptor.
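The proportion of confirmed binders among the tested compounds follows directly from the counts given above. The short sketch below computes it, assuming a hit rate is defined as confirmed hits divided by compounds assayed; the function name is hypothetical.

```python
def hit_rate_percent(hits, assayed):
    """Confirmed hits as a percentage of compounds assayed."""
    if assayed <= 0:
        raise ValueError("assayed must be positive")
    return 100.0 * hits / assayed

# Counts taken from the paragraph above: 11 binders among 21 compounds tested.
adrb2_rate = hit_rate_percent(11, 21)
adrb2_rate_rounded = round(adrb2_rate, 1)
```

Under that definition, roughly half of the commercially available top-ranked compounds bound ADRB2.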
The true value of CGBVS in drug discovery must be tested by assessing whether this method can identify scaffold-hopping lead compounds from a set of compounds that is structurally more diverse. To assess this ability, we analyzed 11 500 commercially available compounds to predict compounds likely to bind to two GPCRs and two protein kinases. Functional assays revealed that nine ADRB2 ligands, three NPY1R ligands, five epidermal growth factor receptor (EGFR) inhibitors, and two cyclin-dependent kinase 2 (CDK2) inhibitors were concentrated in the top-ranked compounds (hit rate=30, 15, 25, and 10%, respectively). We also evaluated the extent of scaffold hopping achieved in the identification of these novel ligands. One ADRB2 ligand, two NPY1R ligands, and one CDK2 inhibitor exhibited scaffold hopping (Figure 4), indicating that CGBVS can use this characteristic to rationally predict novel lead compounds, a crucial and very difficult step in drug discovery. This feature of CGBVS is critically different from existing predictive methods, such as LBVS, which depend on similarities between test and reference ligands, and focus on a single protein or highly homologous proteins. In particular, CGBVS is useful for targets with undefined ligands because this method can use CPIs with target proteins that exhibit lower levels of homology.
In summary, we have demonstrated that data mining of multiple CPIs is of great practical value for exploration of chemical space. As a predictive model, CGBVS could provide an important step in the discovery of such multi-target drugs by identifying the group of proteins targeted by a particular ligand, leading to innovation in pharmaceutical research.
The discovery of novel bioactive molecules advances our systems-level understanding of biological processes and is crucial for innovation in drug development. For this purpose, the emerging field of chemical genomics is currently focused on accumulating large assay data sets describing compound–protein interactions (CPIs). Although new target proteins for known drugs have recently been identified through mining of CPI databases, using these resources to identify novel ligands remains unexplored. Herein, we demonstrate that machine learning of multiple CPIs can not only assess drug polypharmacology but can also efficiently identify novel bioactive scaffold-hopping compounds. Through a machine-learning technique that uses multiple CPIs, we have successfully identified novel lead compounds for two pharmaceutically important protein families, G-protein-coupled receptors and protein kinases. These novel compounds were not identified by existing computational ligand-screening methods in comparative studies. The results of this study indicate that data derived from chemical genomics can be highly useful for exploring chemical space, and this systems biology perspective could accelerate drug discovery processes.
doi:10.1038/msb.2011.5
PMCID: PMC3094066  PMID: 21364574
chemical genomics; data mining; drug discovery; ligand screening; systems chemical biology
22.  Biology of Metastatic Renal Cell Carcinoma 
Journal of Cancer  2011;2:369-373.
In the past ten years we have made exceptional progress in the understanding of RCC biology, particularly by recognizing the crucial pathogenetic role of activation of the HIF/VEGF and mTOR pathways. This has resulted in the successful clinical development of anti-angiogenic and mTOR-targeted drugs, which have profoundly impacted the natural history of the disease and have improved the duration and quality of RCC patient lives. However, further improvements are still greatly needed: 1) even in patients who obtain striking clinical responses early in the course of treatment, disease will ultimately escape control and progress to a treatment-resistant state, leading to therapeutic failure; 2) prolonged disease control usually requires 'continuous' treatment, even across different treatment lines, making the impact of chronic, low-grade toxicities on quality of life greater and precluding, for most patients, the possibility of experiencing 'drug-free holidays'; 3) although we have successfully identified classes of drugs (or molecular mechanisms of action) that are effective in a substantial proportion of patients, we still fall short of molecular predictive factors that identify individual patients who will (or will not) benefit from a specific intervention and still proceed on a trial-and-error basis, far from a truly 'personalized' therapeutic approach; 4) finally (and perhaps most importantly), even in the best case scenario, currently available treatments inevitably fail to definitively 'cure' metastatic RCC patients. In this review we briefly summarize recent developments in the understanding of the molecular pathogenesis of RCC, the development of resistance/escape mechanisms, the rationale for sequencing agents with different mechanisms of action, and the importance of host-related factors.
Unraveling the complex mechanisms by which RCC shapes host microenvironment and immune response and therapeutic treatments, in turn, shape both cancer cell biology and tumor-host interactions may hold the key to future advances in such a complex and challenging disease.
PMCID: PMC3157018  PMID: 21850209
RCC; Biology; Signal transduction; HIF; mTOR; Angiogenesis
23.  A Genome-Wide Screen for Genetic Variants That Modify the Recruitment of REST to Its Target Genes 
PLoS Genetics  2012;8(4):e1002624.
Increasing numbers of human diseases are being linked to genetic variants, but our understanding of the mechanistic links leading from DNA sequence to disease phenotype is limited. The majority of disease-causing nucleotide variants fall within the non-protein-coding portion of the genome, making it likely that they act by altering gene regulatory sequences. We hypothesised that SNPs within the binding sites of the transcriptional repressor REST alter the degree of repression of target genes. Given that changes in the effective concentration of REST contribute to several pathologies—various cancers, Huntington's disease, cardiac hypertrophy, vascular smooth muscle proliferation—these SNPs should alter disease-susceptibility in carriers. We devised a strategy to identify SNPs that affect the recruitment of REST to target genes through the alteration of its DNA recognition element, the RE1. A multi-step screen combining genetic, genomic, and experimental filters yielded 56 polymorphic RE1 sequences with robust and statistically significant differences of affinity between alleles. These SNPs have a considerable effect on the functional recruitment of REST to DNA in a range of in vitro, reporter gene, and in vivo analyses. Furthermore, we observe allele-specific biases in deeply sequenced chromatin immunoprecipitation data, consistent with predicted differences in RE1 affinity. Amongst the targets of polymorphic RE1 elements are important disease genes including NPPA, PTPRT, and CDH4. Thus, considerable genetic variation exists in the DNA motifs that connect gene regulatory networks. Recently available ChIP–seq data allow the annotation of human genetic polymorphisms with regulatory information to generate prior hypotheses about their disease-causing mechanism.
Author Summary
Common human diseases such as cancer, heart disease, or epilepsy have a genetic component that predisposes particular individuals to suffer from them. Huge sums have been invested to map the regions of the human genome where small DNA variations, or SNPs (“single-nucleotide polymorphisms”), determine the probability of developing these diseases. A major problem with this approach, however, is that, once the culprit SNPs are discovered, we know very little about how they cause disease—which is critical if we are to use this information to develop drugs and therapies. In this study, we demonstrate a new approach, employing functional maps of the human genome that have recently been published. We begin with regions of the genome recognised by a gene repressor protein—REST—that is involved in a number of important human diseases. Using information on where REST binds in the human genome, we predict and validate common DNA variations that increase or decrease this binding. By affecting how much REST is recruited to important genes, these variations may predispose or protect individuals from a number of diseases. Studies like this show how we can use genomic information to gain a deeper understanding of the genetics behind human disease.
doi:10.1371/journal.pgen.1002624
PMCID: PMC3320604  PMID: 22496669
24.  Genomic Insights into the Origin of Parasitism in the Emerging Plant Pathogen Bursaphelenchus xylophilus 
PLoS Pathogens  2011;7(9):e1002219.
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
Author Summary
Bursaphelenchus xylophilus is an important plant pathogen, responsible for an epidemic of pine wilt disease in Asia and Europe. B. xylophilus has acquired the ability to parasitise plants independently from other economically important nematodes and has a complex life cycle that includes fungal feeding and a stage associated with an insect, as well as plant parasitism. We have sequenced the genome of B. xylophilus and used it as a resource to understand disease mechanisms and the biological basis of its complex ecology. The ability to break down cellulose, the major component of the plant cell wall, is a major problem for plant parasitic nematodes as few animals can produce the required enzymes (cellulases). Previous work has shown that other plant parasitic nematodes have acquired cellulases from bacteria but we show that all Bursaphelenchus cellulases were most likely acquired independently from fungi. We also describe a complex set of genes encoding enzymes that can break down proteins and other molecules, perhaps reflecting the range of organisms with which B. xylophilus interacts during its life cycle. The genome sequence of Bursaphelenchus represents an important step forward in understanding its biology, and will contribute to efforts to control the devastating disease it causes.
doi:10.1371/journal.ppat.1002219
PMCID: PMC3164644  PMID: 21909270
25.  Sub-Telomere Directed Gene Expression during Initiation of Invasive Aspergillosis 
PLoS Pathogens  2008;4(9):e1000154.
Aspergillus fumigatus is a common mould whose spores are a component of the normal airborne flora. Immune dysfunction permits developmental growth of inhaled spores in the human lung causing aspergillosis, a significant threat to human health in the form of allergic, and life-threatening invasive infections. The success of A. fumigatus as a pathogen is unique among close phylogenetic relatives and is poorly characterised at the molecular level. Recent genome sequencing of several Aspergillus species provides an exceptional opportunity to analyse fungal virulence attributes within a genomic and evolutionary context. To identify genes preferentially expressed during adaptation to the mammalian host niche, we generated multiple gene expression profiles from minute samplings of A. fumigatus germlings during initiation of murine infection. They reveal a highly co-ordinated A. fumigatus gene expression programme, governing metabolic and physiological adaptation, which allows the organism to prosper within the mammalian niche. As functions of phylogenetic conservation and genetic locus, 28% and 30%, respectively, of the A. fumigatus subtelomeric and lineage-specific gene repertoires are induced relative to laboratory culture, and physically clustered genes including loci directing pseurotin, gliotoxin and siderophore biosyntheses are a prominent feature. Locationally biased A. fumigatus gene expression is not prompted by in vitro iron limitation, acid, alkaline, anaerobic or oxidative stress. However, subtelomeric gene expression is favoured following ex vivo neutrophil exposure and in comparative analyses of richly and poorly nourished laboratory cultured germlings. We found remarkable concordance between the A. fumigatus host-adaptation transcriptome and those resulting from in vitro iron depletion, alkaline shift, nitrogen starvation and loss of the methyltransferase LaeA. 
This first transcriptional snapshot of a fungal genome during initiation of mammalian infection provides the global perspective required to direct much-needed diagnostic and therapeutic strategies and reveals genome organisation and subtelomeric diversity as potential driving forces in the evolution of pathogenicity in the genus Aspergillus.
Author Summary
Airborne spores of the fungus Aspergillus fumigatus are present in significant quantities worldwide and are responsible for a range of illnesses from allergy to deadly invasive lung infection. A number of fungal properties are likely required for germination and growth of the fungus in the host, and now that the genome sequence of A. fumigatus is available it is possible to address which genes become important during initiation of infection. Understanding this might lead to new therapeutics and diagnostic tools. We have compared A. fumigatus gene activation during infection in a murine model to that in a laboratory culture to identify fungal attributes preferentially employed during disease. Our analysis entailed measurement of activity from most of the >9000 A. fumigatus genes, identifying iron limitation, alkaline stress, and nitrogen starvation as prominent stresses imposed by the host environment. We also found that genes preferentially employed for infection occur in clusters and are more likely to reside near the end of chromosomes, otherwise known as telomeres.
doi:10.1371/journal.ppat.1000154
PMCID: PMC2526178  PMID: 18787699
