Related Articles

1.  Weighted Interaction SNP Hub (WISH) network method for building genetic networks for complex diseases and traits using whole genome genotype data 
BMC Systems Biology  2014;8(Suppl 2):S5.
High-throughput genotype (HTG) data has been used primarily in genome-wide association (GWA) studies; however, GWA results explain only a limited part of the complete genetic variation of traits. In systems genetics, network approaches have been shown to be able to identify pathways and their underlying causal genes to unravel the biological and genetic background of complex diseases and traits, e.g., the Weighted Gene Co-expression Network Analysis (WGCNA) method based on microarray gene expression data. The main objective of this study was to develop a scale-free weighted genetic interaction network method using whole genome HTG data in order to detect biologically relevant pathways and potential genetic biomarkers for complex diseases and traits.
We developed the Weighted Interaction SNP Hub (WISH) network method that uses HTG data to detect genome-wide interactions between single nucleotide polymorphisms (SNPs) and their relationship with complex traits. Data dimensionality reduction was achieved by selecting SNPs based on their: 1) degree of genome-wide significance and 2) degree of genetic variation in a population. Network construction was based on pairwise Pearson's correlation between SNP genotypes or the epistatic interaction effect between SNP pairs. To identify modules, the Topological Overlap Measure (TOM) was calculated, reflecting the degree of overlap in shared neighbours between SNP pairs. Modules, clusters of highly interconnected SNPs, were defined using a tree-cutting algorithm on the SNP dendrogram created from the dissimilarity TOM (1-TOM). Modules were selected for functional annotation based on their association with the trait of interest, defined by the Genome-wide Module Association Test (GMAT). We successfully tested the established WISH network method using simulated and real SNP interaction data and GWA study results for carcass weight in a pig resource population; this resulted in detecting modules and key functional and biological pathways related to carcass weight.
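The network-construction steps in this abstract (pairwise Pearson correlation, topological overlap, and the 1-TOM dissimilarity) can be illustrated with a minimal sketch. It is not the authors' implementation: the soft-threshold power `beta`, the WGCNA-style TOM formula, the function names, and the toy genotypes (coded 0/1/2) are all assumptions made for illustration, and the tree-cutting and GMAT steps are not shown.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def adjacency(genotypes, beta=6):
    """Soft-thresholded adjacency |r|^beta (beta is an assumed, WGCNA-style power)."""
    m = len(genotypes)
    a = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            a[i][j] = a[j][i] = abs(pearson(genotypes[i], genotypes[j])) ** beta
    return a

def topological_overlap(a):
    """TOM_ij = (shared_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), assumed WGCNA-style form."""
    m = len(a)
    k = [sum(row) for row in a]  # connectivity of each SNP (diagonal is 0)
    t = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            shared = sum(a[i][u] * a[u][j] for u in range(m) if u not in (i, j))
            t[i][j] = t[j][i] = (shared + a[i][j]) / (min(k[i], k[j]) + 1 - a[i][j])
    return t

# Hypothetical genotypes: rows are SNPs, columns are individuals.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to the first SNP
    [0, 1, 2, 1, 1, 2],
    [2, 0, 1, 1, 2, 0],
]
tom = topological_overlap(adjacency(snps))
# Dissimilarity (1 - TOM) is what a hierarchical clustering step would consume.
dissimilarity = [[1.0 - v for v in row] for row in tom]
```

The resulting `dissimilarity` matrix could then be passed to any hierarchical clustering routine to obtain the dendrogram on which modules are cut.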
We developed the WISH network method which is a novel 'systems genetics' approach to study genetic networks underlying complex trait variation. The WISH network method reduces data dimensionality and statistical complexity in associating genotypes with phenotypes in GWA studies and enables researchers to identify biologically relevant pathways and potential genetic biomarkers for any complex trait of interest.
PMCID: PMC4101698  PMID: 25032480
2.  Integrating Computational Biology and Forward Genetics in Drosophila 
PLoS Genetics  2009;5(1):e1000351.
Genetic screens are powerful methods for the discovery of gene–phenotype associations. However, a systems biology approach to genetics must leverage the massive amount of “omics” data to enhance the power and speed of functional gene discovery in vivo. Thus far, few computational methods for gene function prediction have been rigorously tested for their performance on a genome-wide scale in vivo. In this work, we demonstrate that integrating genome-wide computational gene prioritization with large-scale genetic screening is a powerful tool for functional gene discovery. To discover genes involved in neural development in Drosophila, we extend our strategy for the prioritization of human candidate disease genes to functional prioritization in Drosophila. We then integrate this prioritization strategy with a large-scale genetic screen for interactors of the proneural transcription factor Atonal using genomic deficiencies and mutant and RNAi collections. Using the prioritized genes validated in our genetic screen, we describe a novel genetic interaction network for Atonal. Lastly, we prioritize the whole Drosophila genome and identify candidate gene associations for ten receptor-signaling pathways. This novel database of prioritized pathway candidates, as well as a web application for functional prioritization in Drosophila, called Endeavour-HighFly, and the Atonal network, are publicly available resources. A systems genetics approach that combines the power of computational predictions with in vivo genetic screens strongly enhances the process of gene function and gene–gene association discovery.
Author Summary
Genome sequencing and annotation, combined with large-scale molecular experiments to query gene expression and molecular interactions, collectively known as Systems Biology, have resulted in an enormous wealth in biological databases. Yet, it remains a daunting task to use these data to decipher the rules that govern biological systems. One of the most trusted approaches in biology is genetic analysis because of its emphasis on gene function in living organisms. Genetics, however, proceeds slowly and unravels small-scale interactions. Turning genetics into an effective tool of Systems Biology requires harnessing the large-scale molecular data for the design and execution of genetic screens. In this work, we test the idea of exploiting a computational approach known as gene prioritization to pre-rank genes for the likelihood of their involvement in a process of interest. By carrying out a gene prioritization–supported genetic screen, we greatly enhance the speed and output of in vivo genetic screens without compromising their sensitivity. These results mean that future genetic screens can be custom-catered for any process of interest and carried out with a speed and efficiency that is comparable to other large-scale molecular experiments. We refer to this combined approach as Systems Genetics.
PMCID: PMC2628282  PMID: 19165344
3.  Using Microarrays to Facilitate Positional Cloning: Identification of Tomosyn as an Inhibitor of Neurosecretion 
PLoS Genetics  2005;1(1):e2.
Forward genetic screens have been used as a powerful strategy to dissect complex biological pathways in many model systems. A significant limitation of this approach has been the time-consuming and costly process of positional cloning and molecular characterization of the mutations isolated in these screens. Here, the authors describe a strategy using microarray hybridizations to facilitate positional cloning. This method relies on the fact that premature stop codons (i.e., nonsense mutations) constitute a frequent class of mutations isolated in screens and that nonsense mutant messenger RNAs are efficiently degraded by the conserved nonsense-mediated decay pathway. They validate this strategy by identifying two previously uncharacterized mutations: (1) tom-1, a mutation found in a forward genetic screen for enhanced acetylcholine secretion in Caenorhabditis elegans, and (2) an apparently spontaneous mutation in the hif-1 transcription factor gene. They further demonstrate the broad applicability of this strategy using other known mutants in C. elegans, Arabidopsis, and mouse. Characterization of tom-1 mutants suggests that TOM-1, the C. elegans ortholog of mammalian tomosyn, functions as an endogenous inhibitor of neurotransmitter secretion. These results also suggest that microarray hybridizations have the potential to significantly reduce the time and effort required for positional cloning.
Genetic screens are commonly used to figure out which genes are involved in a biological process. The first step in a genetic screen is to isolate mutant animals that are defective in the process being studied. The next step is to find which of the thousands of genes has the mutation that causes the observed defect. Positional cloning, the tried-and-true method for locating mutations, is slow and expensive. The authors propose using microarray hybridizations to speed the process. Their approach relies on the fact that a large fraction of the mutations found in screens are the results of premature stop codons, a particularly severe type of mutation. In cells, messages containing premature stop codons are rapidly destroyed by a protective pathway, called nonsense-mediated decay, thus making them directly detectable by microarray hybridization.
The authors apply this strategy retrospectively to known mutants in Caenorhabditis elegans, Arabidopsis, and mouse. They identify two uncharacterized mutations in C. elegans, including one, tom-1, found in a forward genetic screen for enhancers of neurotransmission. Interestingly, their characterization of tom-1 mutants suggests that the highly conserved protein tomosyn inhibits neurotransmission in neurons. This study shows that microarray hybridizations will help reduce the time and effort required for positional cloning.
PMCID: PMC1183521  PMID: 16103915
4.  Role of Tomato Lipoxygenase D in Wound-Induced Jasmonate Biosynthesis and Plant Immunity to Insect Herbivores 
PLoS Genetics  2013;9(12):e1003964.
In response to insect attack and mechanical wounding, plants activate the expression of genes involved in various defense-related processes. A fascinating feature of these inducible defenses is their occurrence both locally at the wounding site and systemically in undamaged leaves throughout the plant. Wound-inducible proteinase inhibitors (PIs) in tomato (Solanum lycopersicum) provide an attractive model to understand the signal transduction events leading from localized injury to the systemic expression of defense-related genes. Among the intercellular signals identified as regulating the systemic wound response of tomato are the peptide signal systemin and the oxylipin signal jasmonic acid (JA). The systemin/JA signaling pathway provides a unique opportunity to investigate, in a single experimental system, the mechanism by which peptide and oxylipin signals interact to coordinate plant systemic immunity. Here we describe the characterization of the tomato suppressor of prosystemin-mediated responses8 (spr8) mutant, which was isolated as a suppressor of (pro)systemin-mediated signaling. spr8 plants exhibit a series of JA-dependent immune deficiencies, including the inability to express wound-responsive genes, abnormal development of glandular trichomes, and severely compromised resistance to cotton bollworm (Helicoverpa armigera) and Botrytis cinerea. Map-based cloning studies demonstrate that the spr8 mutant phenotype results from a point mutation in the catalytic domain of TomLoxD, a chloroplast-localized lipoxygenase involved in JA biosynthesis. We present evidence that overexpression of TomLoxD leads to elevated wound-induced JA biosynthesis, increased expression of wound-responsive genes and, therefore, enhanced resistance to insect herbivory attack and necrotrophic pathogen infection.
These results indicate that TomLoxD is involved in wound-induced JA biosynthesis and highlight the application potential of this gene for crop protection against insects and pathogens.
Author Summary
Plants have evolved sophisticated strategies to defend themselves against insect attack. Wound-inducible proteinase inhibitors (PIs) in tomato (Solanum lycopersicum) provide an attractive model to understand the signal transduction events leading from localized injury to the systemic expression of defense-related genes. A wealth of evidence indicates that the peptide signal systemin and the phytohormone jasmonic acid (JA) work together in the same signaling pathway to activate the expression of PIs and other defense-related genes. We have been using a genetic approach to dissect the systemin/JA signaling pathway and to discover important genes that can be used for crop protection. Here we report the characterization of the suppressor of prosystemin-mediated responses8 (spr8) mutant, which is defective in wound-induced defense gene expression and therefore is more susceptible to insect attack. We demonstrate that spr8 defines the TomLoxD gene, which encodes a chloroplast-localized lipoxygenase involved in wound-induced JA biosynthesis. Further, we demonstrate that genetic manipulation of Spr8/TomLoxD leads to increased plant resistance against insect attack and pathogen infection.
PMCID: PMC3861047  PMID: 24348260
5.  A Host Small GTP-binding Protein ARL8 Plays Crucial Roles in Tobamovirus RNA Replication 
PLoS Pathogens  2011;7(12):e1002409.
Tomato mosaic virus (ToMV), like other eukaryotic positive-strand RNA viruses, replicates its genomic RNA in replication complexes formed on intracellular membranes. Previous studies showed that a host seven-pass transmembrane protein TOM1 is necessary for efficient ToMV multiplication. Here, we show that a small GTP-binding protein ARL8, along with TOM1, is co-purified with a FLAG epitope-tagged ToMV 180K replication protein from solubilized membranes of ToMV-infected tobacco (Nicotiana tabacum) cells. When solubilized membranes of ToMV-infected tobacco cells that expressed FLAG-tagged ARL8 were subjected to immunopurification with anti-FLAG antibody, ToMV 130K and 180K replication proteins and TOM1 were co-purified and the purified fraction showed RNA-dependent RNA polymerase activity that transcribed ToMV RNA. From uninfected cells, TOM1 co-purified with FLAG-tagged ARL8 less efficiently, suggesting that a complex containing ToMV replication proteins, TOM1, and ARL8 is formed on membranes in infected cells. In Arabidopsis thaliana, ARL8 consists of four family members. Simultaneous mutations in two specific ARL8 genes completely inhibited tobamovirus multiplication. In an in vitro ToMV RNA translation-replication system, the lack of either TOM1 or ARL8 proteins inhibited the production of replicative-form RNA, indicating that TOM1 and ARL8 are required for efficient negative-strand RNA synthesis. When ToMV 130K protein was co-expressed with TOM1 and ARL8 in yeast, RNA 5′-capping activity was detected in the membrane fraction. This activity was undetectable or very weak when the 130K protein was expressed alone or with either TOM1 or ARL8. Taken together, these results suggest that TOM1 and ARL8 are components of ToMV RNA replication complexes and play crucial roles in a process toward activation of the replication proteins' RNA synthesizing and capping functions.
Author Summary
Many important pathogens of plants, animals, and humans are positive-strand RNA viruses. They replicate via complementary RNA in replication complexes formed on host intracellular membranes. In the replication process, not only viral replication proteins but also host factors play important roles. Although many host factors whose knockdown affects the multiplication of positive-strand RNA viruses have been identified, the function of each host factor in virus multiplication is only poorly understood in most instances. In this paper, we show that a host small GTP-binding protein ARL8 is required for the multiplication of Tomato mosaic virus (ToMV), and that it forms a complex with ToMV replication proteins and another essential host factor TOM1 that is a seven-pass transmembrane protein. We further demonstrate that the replication proteins acquire the ability to synthesize negative-strand ToMV RNA and RNA 5′ cap only in the presence of both TOM1 and ARL8. The replication proteins of ToMV are multifunctional proteins that participate in RNA replication on membranes and RNA silencing suppression in the cytosol. Our results suggest that ToMV replication proteins are programmed to express their replication-related activities only on membranes through interactions with these host membrane proteins.
PMCID: PMC3234234  PMID: 22174675
6.  Arabidopsis Gene Family Profiler (aGFP) – user-oriented transcriptomic database with easy-to-use graphic interface 
BMC Plant Biology  2007;7:39.
Microarray technologies now belong to the standard functional genomics toolbox and have undergone massive development leading to increased genome coverage, accuracy and reliability. The number of experiments exploiting microarray technology has markedly increased in recent years. In parallel with the rapid accumulation of transcriptomic data, on-line analysis tools are being introduced to simplify their use. Global statistical data analysis methods contribute to the development of overall concepts about gene expression patterns and to query and compose working hypotheses. More recently, these applications are being supplemented with more specialized products offering visualization and specific data mining tools. We present a curated gene family-oriented gene expression database, Arabidopsis Gene Family Profiler (aGFP), which gives the user access to a large collection of normalised Affymetrix ATH1 microarray datasets. The database currently contains NASC Array and AtGenExpress transcriptomic datasets for various tissues at different developmental stages of wild type plants gathered from nearly 350 gene chips.
The Arabidopsis GFP database has been designed as an easy-to-use tool for users needing an easily accessible resource for expression data of single genes, pre-defined gene families or custom gene sets, with the further possibility of keyword search. Arabidopsis Gene Family Profiler presents a user-friendly web interface using both graphic and text output. Data are stored on a MySQL server and individual queries are generated by PHP scripts. The most distinguishing features of the Arabidopsis Gene Family Profiler database are: 1) the presentation of normalized datasets (Affymetrix MAS algorithm and calculation of model-based gene-expression values based on the Perfect Match-only model); 2) the choice between two different normalization algorithms (Affymetrix MAS4 or MAS5 algorithms); 3) an intuitive interface; 4) an interactive "virtual plant" visualizing the spatial and developmental expression profiles of both gene families and individual genes.
Arabidopsis GFP gives users the possibility to analyze current Arabidopsis developmental transcriptomic data starting with simple global queries that can be expanded and further refined to visualize comparative and highly selective gene expression profiles.
PMCID: PMC1963329  PMID: 17645793
7.  Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis 
PLoS Computational Biology  2008;4(3):e1000043.
Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates.
Methodology/Principal Findings
We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
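The idea of keeping only coexpression relationships that hold in both species can be sketched as a set intersection. This is an illustrative simplification, not the published pipeline: the fixed correlation `threshold`, the function names, and the toy expression values (keyed by a shared ortholog label) are assumptions, and the phenotype-similarity integration is not shown.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def coexpression_links(expr, threshold):
    """Return gene pairs whose expression profiles correlate at or above threshold."""
    links = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            links.add(frozenset((g1, g2)))
    return links

def conserved_links(human, mouse, threshold=0.8):
    """Keep only links present in both species (threshold is an assumed cutoff)."""
    return coexpression_links(human, threshold) & coexpression_links(mouse, threshold)

# Hypothetical expression profiles, keyed by ortholog label.
human = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10.5], "C": [1, 2, 3, 4, 6]}
mouse = {"A": [1, 2, 3, 4], "B": [1, 3, 5, 8], "C": [4, 1, 3, 2]}

conserved = conserved_links(human, mouse)
```

In this toy data all three gene pairs are coexpressed in human, but only the A-B pair is also coexpressed in mouse, so only that link survives the intersection.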
Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.
Author Summary
One of the most limiting aspects of biological research in the post-genomic era is the capability to integrate massive datasets on gene structure and function for producing useful biological knowledge. In this report we have applied an integrative approach to address the problem of identifying likely candidate genes within loci associated with human genetic diseases. Despite the recent progress in sequencing technologies, approaching this problem from an experimental perspective still represents a very demanding task, because the critical region may typically contain hundreds of positional candidates. We found that by concentrating only on genes sharing similar expression profiles in both human and mouse, massive microarray datasets can be used to reliably identify disease-relevant relationships among genes. Moreover, we found that integrating the coexpression criterion with systematic phenome analysis allows efficient identification of disease genes in large genomic regions. Using this approach on 850 OMIM loci characterized by unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.
PMCID: PMC2268251  PMID: 18369433
8.  Disease Gene Characterization through Large-Scale Co-Expression Analysis 
PLoS ONE  2009;4(12):e8491.
In the post genome era, a major goal of biology is the identification of specific roles for individual genes. We report a new genomic tool for gene characterization, the UCLA Gene Expression Tool (UGET).
Celsius, the largest co-normalized microarray dataset of Affymetrix based gene expression, was used to calculate the correlation between all possible gene pairs on all platforms, and generate stored indexes in a web searchable format. The size of Celsius makes UGET a powerful gene characterization tool. Using a small seed list of known cartilage-selective genes, UGET extended the list of known genes by identifying 32 new highly cartilage-selective genes. Of these, 7 of 10 tested were validated by qPCR including the novel cartilage-specific genes SDK2 and FLJ41170. In addition, we retrospectively tested UGET and other gene expression based prioritization tools to identify disease-causing genes within known linkage intervals. We first demonstrated this utility with UGET using genetically heterogeneous disorders such as Joubert syndrome, microcephaly, neuropsychiatric disorders and type 2 limb girdle muscular dystrophy (LGMD2) and then compared UGET to other gene expression based prioritization programs which use small but discrete and well annotated datasets. Finally, we observed a significantly higher gene correlation shared between genes in disease networks associated with similar complex or Mendelian disorders.
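Extending a seed list by expression correlation can be sketched as a ranking of candidate genes by their mean correlation with the seeds. This is only a minimal illustration and not UGET's actual scoring: the averaging rule, the function names, and the gene labels and expression values are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def rank_by_seed_correlation(expr, seeds, candidates):
    """Order candidates by mean correlation with the seed genes, highest first."""
    scores = {
        c: sum(pearson(expr[c], expr[s]) for s in seeds) / len(seeds)
        for c in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical expression values across five arrays.
expr = {
    "SEED1": [1, 2, 3, 4, 5],
    "SEED2": [2, 3, 5, 6, 8],
    "G1": [1, 2, 3, 5, 6],   # tracks the seeds closely
    "G2": [5, 1, 4, 2, 3],   # weakly anti-correlated
    "G3": [5, 4, 3, 2, 1],   # strongly anti-correlated
}

ranking = rank_by_seed_correlation(expr, ["SEED1", "SEED2"], ["G1", "G2", "G3"])
```

The same ranking could be restricted to genes inside a linkage interval to prioritize positional candidates, which is the second use the abstract describes.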
UGET is an invaluable resource for a geneticist that permits the rapid inclusion of expression criteria from one to hundreds of genes in genomic intervals linked to disease. By using thousands of arrays UGET annotates and prioritizes genes better than other tools especially with rare tissue disorders or complex multi-tissue biological processes. This information can be critical in prioritization of candidate genes for sequence analysis.
PMCID: PMC2797297  PMID: 20046828
9.  Ectopic Lymphoid Structures Support Ongoing Production of Class-Switched Autoantibodies in Rheumatoid Synovium 
PLoS Medicine  2009;6(1):e1.
Follicular structures resembling germinal centres (GCs) that are characterized by follicular dendritic cell (FDC) networks have long been recognized in chronically inflamed tissues in autoimmune diseases, including the synovium of rheumatoid arthritis (RA). However, it is debated whether these ectopic structures promote autoimmunity and chronic inflammation driving the production of pathogenic autoantibodies. Anti-citrullinated protein/peptide antibodies (ACPA) are highly specific markers of RA, predict a poor prognosis, and have been suggested to be pathogenic. Therefore, the main study objectives were to determine whether ectopic lymphoid structures in RA synovium: (i) express activation-induced cytidine deaminase (AID), the enzyme required for somatic hypermutation and class-switch recombination (CSR) of Ig genes; (ii) support ongoing CSR and ACPA production; and (iii) remain functional in a RA/severe combined immunodeficiency (SCID) chimera model devoid of new immune cell influx into the synovium.
Methods and Findings
Using immunohistochemistry (IHC) and quantitative Taqman real-time PCR (QT-PCR) in synovial tissue from 55 patients with RA, we demonstrated that FDC+ structures invariably expressed AID with a distribution resembling secondary lymphoid organs. Further, AID+/CD21+ follicular structures were surrounded by ACPA+/CD138+ plasma cells, as demonstrated by immune reactivity to citrullinated fibrinogen. Moreover, we identified a novel subset of synovial AID+/CD20+ B cells outside GCs resembling interfollicular large B cells. In order to gain direct functional evidence that AID+ structures support CSR and in situ manufacturing of class-switched ACPA, 34 SCID mice were transplanted with RA synovium and humanely killed at 4 wk for harvesting of transplants and sera. Persistent expression of AID and Iγ-Cμ circular transcripts (identifying ongoing IgM-IgG class-switching) was observed in synovial grafts expressing FDCs/CD21L. Furthermore, synovial mRNA levels of AID were closely associated with circulating human IgG ACPA in mouse sera. Finally, the survival and proliferation of functional B cell niches was associated with persistent overexpression of genes regulating ectopic lymphoneogenesis.
Our demonstration that FDC+ follicular units invariably express AID and are surrounded by ACPA-producing plasma cells provides strong evidence that ectopic lymphoid structures in the RA synovium are functional and support autoantibody production. This concept is further confirmed by evidence of sustained AID expression, B cell proliferation, ongoing CSR, and production of human IgG ACPA from GC+ synovial tissue transplanted into SCID mice, independently of new B cell influx from the systemic circulation. These data identify AID as a potential therapeutic target in RA and suggest that survival of functional synovial B cell niches may profoundly influence chronic inflammation, autoimmunity, and response to B cell–depleting therapies.
Costantino Pitzalis and colleagues show that lymphoid structures in synovial tissue of patients with rheumatoid arthritis support production of anti-citrullinated peptide antibodies, which continues following transplantation into SCID mice.
Editors' Summary
More than 1 million people in the United States have rheumatoid arthritis, an “autoimmune” condition that affects the joints. Normally, the immune system provides protection against infection by responding to foreign antigens (molecules that are unique to invading organisms) while ignoring self-antigens present in the body's own tissues. In autoimmune diseases, this ability to discriminate between self and non-self fails for unknown reasons and the immune system begins to attack human tissues. In rheumatoid arthritis, the lining of the joints (the synovium) is attacked, it becomes inflamed and thickened, and chemicals are released that damage all the tissues in the joint. Eventually, the joint may become so scarred that movement is no longer possible. Rheumatoid arthritis usually starts in the small joints in the hands and feet, but larger joints and other tissues (including the heart and blood vessels) can be affected. Its symptoms, which tend to fluctuate, include early morning joint pain, swelling, and stiffness, and feeling generally unwell. Although the disease is not always easy to diagnose, the immune systems of many people with rheumatoid arthritis make “anti-citrullinated protein/peptide antibodies” (ACPA). These “autoantibodies” (which some experts believe can contribute to the joint damage in rheumatoid arthritis) recognize self-proteins that contain the unusual amino acid citrulline, and their detection on blood tests can help make the diagnosis. Although there is no cure for rheumatoid arthritis, the recently developed biologic drugs, often used together with the more traditional disease-modifying therapies, are able to halt its progression by specifically blocking the chemicals that cause joint damage. Painkillers and nonsteroidal anti-inflammatory drugs can reduce its symptoms, and badly damaged joints can sometimes be surgically replaced.
Why Was This Study Done?
Before scientists can develop a cure for rheumatoid arthritis, they need to know how and why autoantibodies are made that attack the joints in this common and disabling disease. B cells, the immune system cells that make antibodies, mature in structures known as “germinal centers” in the spleen and lymph nodes. In the germinal centers, immature B cells are exposed to antigens and undergo two genetic processes called “somatic hypermutation” and “class-switch recombination” that ensure that each B cell makes an antibody that sticks as tightly as possible to just one antigen. The B cells then multiply and enter the bloodstream where they help to deal with infections. Interestingly, the inflamed synovium of many patients with rheumatoid arthritis contains structures that resemble germinal centers. Could these ectopic (misplaced) lymphoid structures, which are characterized by networks of immune system cells called follicular dendritic cells (FDCs), promote autoimmunity and long-term inflammation by driving the production of autoantibodies within the joint itself? In this study, the researchers investigate this possibility.
What Did the Researchers Do and Find?
The researchers collected synovial tissue from 55 patients with rheumatoid arthritis and used two approaches, called immunohistochemistry and real-time PCR, to investigate whether FDC-containing structures in synovium expressed an enzyme called activation-induced cytidine deaminase (AID), which is needed for both somatic hypermutation and class-switch recombination. All the FDC-containing structures that the researchers found in their samples expressed AID. Furthermore, these AID-containing structures were surrounded by mature B cells making ACPAs. To test whether these B cells were derived from AID-expressing cells resident in the synovium rather than ACPA-expressing immune system cells coming into the synovium from elsewhere in the body, the researchers transplanted synovium from patients with rheumatoid arthritis under the skin of a special sort of mouse that largely lacks its own immune system. Four weeks later, the researchers found that the transplanted human lymphoid tissue was still making AID, that the level of AID expression correlated with the amount of human ACPA in the blood of the mice, and that the B cells in the transplant were proliferating.
What Do These Findings Mean?
These findings show that the ectopic lymphoid structures present in the synovium of some patients with rheumatoid arthritis are functional and are able to make ACPA. Because ACPA may be responsible for joint damage, the survival of these structures could, therefore, be involved in the development and progression of rheumatoid arthritis. More experiments are needed to confirm this idea, but these findings may explain why drugs that effectively clear B cells from the bloodstream do not always produce a marked clinical improvement in rheumatoid arthritis. Finally, they suggest that AID might provide a new target for the development of drugs to treat rheumatoid arthritis.
Additional Information.
This study is further discussed in a PLoS Medicine Perspective by Rene Toes and Tom Huizinga
The MedlinePlus Encyclopedia has a page on rheumatoid arthritis (in English and Spanish). MedlinePlus provides links to other information on rheumatoid arthritis (in English and Spanish)
The UK National Health Service Choices information service has detailed information on rheumatoid arthritis
The US National Institute of Arthritis and Musculoskeletal and Skin Diseases provides Fast Facts, an easy to read publication for the public, and a more detailed Handbook on rheumatoid arthritis
The US Centers for Disease Control and Prevention has an overview on rheumatoid arthritis that includes statistics about this disease and its impact on daily life
PMCID: PMC2621263  PMID: 19143467
10.  Tomosyn Negatively Regulates CAPS-Dependent Peptide Release at Caenorhabditis elegans Synapses 
The syntaxin-interacting protein tomosyn is thought to be a key regulator of exocytosis, although its precise mechanism of action has yet to be elucidated. Here we examined the role of tomosyn in peptide secretion in Caenorhabditis elegans tomosyn (tom-1) mutants. Ultrastructural analysis of tom-1 mutants revealed a 50% reduction in presynaptic dense-core vesicles (DCVs) corresponding to enhanced neuropeptide release. Conversely, overexpression of TOM-1 led to an accumulation of DCVs. Together, these data provide the first in vivo evidence that TOM-1 negatively regulates DCV exocytosis. In C. elegans, neuropeptide release is promoted by the calcium-dependent activator protein for secretion (CAPS) homolog UNC-31. To test for a genetic interaction between tomosyn and CAPS, we generated tom-1;unc-31 double mutants. Loss of TOM-1 suppressed the behavioral, electrophysiological, and DCV ultrastructural phenotypes of unc-31 mutants, indicating that TOM-1 antagonizes UNC-31-dependent DCV release. Because unc-31 mutants exhibit synaptic transmission defects, we postulated that loss of DCV release in these mutants and the subsequent suppression by tom-1 mutants could simply reflect alterations in synaptic activity, rather than direct regulation of DCV release. To distinguish between these two possibilities, we analyzed C. elegans Rim mutants (unc-10), which have a comparable reduction in synaptic transmission to unc-31 mutants, specifically attributed to defects in synaptic vesicle (SV) exocytosis. Based on this analysis, we conclude that the changes in DCV release in tom-1 and unc-31 mutants reflect direct effects of TOM-1 and UNC-31 on DCV exocytosis, rather than altered SV release.
PMCID: PMC3874420  PMID: 17881523
tomosyn; CAPS; C. elegans; peptidergic transmission; neuromuscular junction; UNC-31; TOM-1
11.  A Candidate Gene Approach Identifies the TRAF1/C5 Region as a Risk Factor for Rheumatoid Arthritis 
PLoS Medicine  2007;4(9):e278.
Rheumatoid arthritis (RA) is a chronic autoimmune disorder affecting ∼1% of the population. The disease results from the interplay between an individual's genetic background and unknown environmental triggers. Although human leukocyte antigens (HLAs) account for ∼30% of the heritable risk, the identities of non-HLA genes explaining the remainder of the genetic component are largely unknown. Based on functional data in mice, we hypothesized that the immune-related genes complement component 5 (C5) and/or TNF receptor-associated factor 1 (TRAF1), located on Chromosome 9q33–34, would represent relevant candidate genes for RA. We therefore aimed to investigate whether this locus would play a role in RA.
Methods and Findings
We performed a multitiered case-control study using 40 single-nucleotide polymorphisms (SNPs) from the TRAF1 and C5 (TRAF1/C5) region in a set of 290 RA patients and 254 unaffected participants (controls) of Dutch origin. Stepwise replication of significant SNPs was performed in three independent sample sets from the Netherlands (n cases/controls = 454/270), Sweden (n cases/controls = 1,500/1,000) and the US (n cases/controls = 475/475). We observed a significant association (p < 0.05) of SNPs located in a haplotype block that encompasses a 65 kb region including the 3′ end of C5 as well as TRAF1. A sliding window analysis revealed an association peak at an intergenic region located ∼10 kb from both C5 and TRAF1. This peak, defined by SNP14/rs10818488, was confirmed in a total of 2,719 RA patients and 1,999 controls (common odds ratio = 1.28, 95% confidence interval 1.17–1.39, combined p = 1.40 × 10^−8) with a population-attributable risk of 6.1%. The A (minor susceptibility) allele of this SNP also significantly correlates with increased disease progression as determined by radiographic damage over time in RA patients (p = 0.008).
Using a candidate-gene approach we have identified a novel genetic risk factor for RA. Our findings indicate that a polymorphism in the TRAF1/C5 region increases the susceptibility to and severity of RA, possibly by influencing the structure, function, and/or expression levels of TRAF1 and/or C5.
Using a candidate-gene approach, Rene Toes and colleagues identified a novel genetic risk factor for rheumatoid arthritis in the TRAF1/C5 region.
Editors' Summary
Rheumatoid arthritis is a very common chronic illness that affects around 1% of people in developed countries. It is caused by an abnormal immune reaction to various tissues within the body; as well as affecting joints and causing an inflammatory arthritis, it can also affect many other organs of the body. Severe rheumatoid arthritis can be life-threatening, but even mild forms of the disease cause substantial illness and disability. Current treatments aim to give symptomatic relief with the use of simple analgesics, or anti-inflammatory drugs. In addition, most patients are also treated with what are known as disease-modifying agents, which aim to prevent joint damage. Rheumatoid arthritis is known to have a genetic component. For example, an association has been shown with the part of the genome that contains the human leukocyte antigens (HLAs), which are involved in the immune response. Information on other genes involved would be helpful both for understanding the underlying cause of the disease and possibly for the discovery of new treatments.
Why Was This Study Done?
Previous work in mice that have a disease similar to human rheumatoid arthritis has identified a number of possible candidate genes. One of these genes, complement component 5 (C5), is involved in the complement system, a primitive system within the body that is involved in the defense against foreign molecules. In humans the gene for C5 is located on Chromosome 9 close to another gene involved in the inflammatory response, TNF receptor-associated factor 1 (TRAF1). A preliminary study of this region in humans had shown some evidence, albeit weak, that it might be associated with rheumatoid arthritis. The authors set out to look in more detail, and in a larger group of individuals, to see if they could prove this association.
What Did the Researchers Do and Find?
The researchers took 40 genetic markers, known as single-nucleotide polymorphisms (SNPs), from across the region that included the C5 and TRAF1 genes. SNPs have each been assigned a unique reference number that specifies a point in the human genome, and each is present in alternate forms so can be differentiated. They compared which of the alternate forms were present in 290 patients with rheumatoid arthritis and 254 unaffected participants of Dutch origin. They then repeated the study in three other groups of patients and controls of Dutch, Swedish, and US origin. They found a consistent association with rheumatoid arthritis of one region of 65 kilobases (a small distance in genetic terms) that included one end of the C5 gene as well as the TRAF1 gene. They could refine the area of interest to a piece marked by one particular SNP that lay between the genes. They went on to show that the genetic region in which these genes are located may be involved in the binding of a protein that modifies the transcription of genes, thus providing a possible explanation for the association. Furthermore, they showed that one of the alternate versions of the marker in this region was associated with more aggressive disease.
What Do These Findings Mean?
The finding of a genetic association is the first step in identifying a genetic component of a disease. The strength of this study is that a novel genetic susceptibility factor for RA has been identified and that the overall result is consistent in four different populations as well as being associated with disease severity. Further work will need to be done to confirm the association in other populations and then to identify the precise genetic change involved. Hopefully this work will lead to new avenues of investigation for therapy.
Additional Information.
Please access these Web sites via the online version of this summary at
• Medline Plus, the health information site for patients from the US National Library of Medicine, has a page of resources on rheumatoid arthritis
• The UK's National Health Service online information site has information on rheumatoid arthritis
• The Arthritis Research Campaign, a UK charity that funds research on all types of arthritis, has a booklet with information for patients on rheumatoid arthritis
• Reumafonds, a Dutch arthritis foundation, gives information on rheumatoid arthritis (in Dutch)
• Autocure is an initiative whose objective is to transform knowledge obtained from molecular research into a cure for an increasing number of patients suffering from inflammatory rheumatic diseases
• The European League Against Rheumatism is an organisation that represents the patient, health professionals, and scientific societies of rheumatology of all European nations
PMCID: PMC1976626  PMID: 17880261
12.  An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence 
Journal of biomedical informatics  2008;41(5):752-765.
This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base.
We use an ontology-driven approach to integrate two gene resources (Entrez Gene and HomoloGene) and three pathway resources (KEGG, Reactome and BioCyc), for five organisms, including humans. We created the Entrez Knowledge Model (EKoM), an information model in OWL for the gene resources, and integrated it with the extant BioPAX ontology designed for pathway resources. The integrated schema is populated with data from the pathway resources, publicly available in BioPAX-compatible format, and gene resources for which a population procedure was created. The SPARQL query language is used to formulate queries over the integrated knowledge base to answer the three biological queries.
Simple SPARQL queries could easily identify hub genes, i.e., those genes whose gene products participate in many pathways or interact with many other gene products. The identification of the genes expressed in the brain turned out to be more difficult, due to the lack of a common identification scheme for proteins.
Semantic Web technologies provide a valid framework for information integration in the life sciences. Ontology-driven integration represents a flexible, sustainable and extensible solution to the integration of large volumes of information. Additional resources, which enable the creation of mappings between information sources, are required to compensate for heterogeneity across namespaces.
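The hub-gene idea described above (genes whose products participate in many pathways) can be illustrated without a triple store. The following is a minimal Python sketch, not the authors' SPARQL queries: the gene and pathway names are hypothetical placeholders, and the `min_pathways` cut-off is an assumption chosen only for illustration.

```python
from collections import defaultdict

def pathway_counts(memberships):
    """Map each gene to the number of distinct pathways it participates in."""
    pathways_by_gene = defaultdict(set)
    for gene, pathway in memberships:
        pathways_by_gene[gene].add(pathway)
    return {gene: len(pathways) for gene, pathways in pathways_by_gene.items()}

def hub_genes(memberships, min_pathways):
    """Return genes found in at least `min_pathways` pathways, most connected first."""
    counts = pathway_counts(memberships)
    hubs = [gene for gene, count in counts.items() if count >= min_pathways]
    return sorted(hubs, key=lambda gene: (-counts[gene], gene))

# Hypothetical (gene, pathway) pairs standing in for the integrated knowledge base.
memberships = [
    ("geneA", "pathway1"), ("geneA", "pathway2"), ("geneA", "pathway3"),
    ("geneB", "pathway1"), ("geneB", "pathway2"),
    ("geneC", "pathway3"),
    ("geneA", "pathway1"),  # duplicate rows are counted once
]

print(hub_genes(memberships, min_pathways=2))
```

In an RDF setting the same grouping would typically be expressed as an aggregate query over gene-to-pathway links; the counting logic is the same.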
Resource page
PMCID: PMC2766186  PMID: 18395495
Semantic Web; Semantic mashup; Nicotine dependence; Information integration; Ontologies
13.  Vaccine Efficacy in Senescent Mice Challenged with Recombinant SARS-CoV Bearing Epidemic and Zoonotic Spike Variants  
PLoS Medicine  2006;3(12):e525.
In 2003, severe acute respiratory syndrome coronavirus (SARS-CoV) was identified as the etiological agent of severe acute respiratory syndrome, a disease characterized by severe pneumonia that sometimes results in death. SARS-CoV is a zoonotic virus that crossed the species barrier, most likely originating from bats or from other species including civets, raccoon dogs, domestic cats, swine, and rodents. A SARS-CoV vaccine should confer long-term protection, especially in vulnerable senescent populations, against both the 2003 epidemic strains and zoonotic strains that may yet emerge from animal reservoirs. We report the comprehensive investigation of SARS vaccine efficacy in young and senescent mice following homologous and heterologous challenge.
Methods and Findings
Using Venezuelan equine encephalitis virus replicon particles (VRP) expressing the 2003 epidemic Urbani SARS-CoV strain spike (S) glycoprotein (VRP-S) or the nucleocapsid (N) protein from the same strain (VRP-N), we demonstrate that VRP-S, but not VRP-N, vaccines provide complete short- and long-term protection against homologous strain challenge in young and senescent mice. To test VRP vaccine efficacy against a heterologous SARS-CoV, we used phylogenetic analyses, synthetic biology, and reverse genetics to construct a chimeric virus (icGD03-S) encoding a synthetic S glycoprotein gene of the most genetically divergent human strain, GD03, which clusters among the zoonotic SARS-CoV. icGD03-S replicated efficiently in human airway epithelial cells and in the lungs of young and senescent mice, and was highly resistant to neutralization with antisera directed against the Urbani strain. Although VRP-S vaccines provided complete short-term protection against heterologous icGD03-S challenge in young mice, only limited protection was seen in vaccinated senescent animals. VRP-N vaccines not only failed to protect from homologous or heterologous challenge, but resulted in enhanced immunopathology with eosinophilic infiltrates within the lungs of SARS-CoV–challenged mice. VRP-N–induced pathology presented at day 4, peaked around day 7, and persisted through day 14, and was likely mediated by cellular immune responses.
This study identifies gaps and challenges in vaccine design for controlling future SARS-CoV zoonosis, especially in vulnerable elderly populations. The availability of a SARS-CoV virus bearing heterologous S glycoproteins provides a robust challenge inoculum for evaluating vaccine efficacy against zoonotic strains, the most likely source of future outbreaks.
Experiments in mice suggest challenges in vaccine design for controlling future SARS-CoV zoonosis, especially in vulnerable elderly populations.
Editors' Summary
Severe acute respiratory syndrome (SARS) is a flu-like illness and was first recognized in China in 2002, after which the disease rapidly spread around the world. SARS was associated with high death rates, much higher than those for flu. Around 10% of people recognized as being infected with SARS died, and the death rate approached 50% among elderly people. The virus causing SARS was identified as a member of the coronavirus family; it is generally thought that this virus “jumped” to humans from bats, which harbor related viruses. Although SARS was declared eradicated by the World Health Organization in May 2005, there is still the possibility that similar viruses will again cross the species barrier and infect humans, with potentially serious consequences. As a result, many groups are working to develop vaccines that will protect against SARS infection.
Why Was This Study Done?
A SARS vaccine should be effective in people of all ages, including the elderly who are more likely to get seriously ill or die if they become infected. In addition, potential vaccines should protect against different variants of the virus, because there are different types of the virus that could potentially cross the species barrier from animals to humans. Of the different proteins that make up the SARS coronavirus, the spike glycoprotein is thought to elicit an immune response in humans that can protect against future infection. The researchers therefore examined vaccine candidates based on this particular protein (termed SARS-CoV S), as well as a second one called SARS-CoV N, in mice. Specifically, they tested whether the vaccines would protect against SARS infection in both young and older mice, and whether they would protect against infection by different strains of the SARS virus.
What Did the Researchers Do and Find?
The researchers created vaccines based on SARS-CoV S and SARS-CoV N by taking the genes coding for those proteins and inserting them into another type of virus particle that acted as a delivery vehicle. They injected mice with these vaccines and then tested whether the mice generated an immune response against the specific SARS proteins, which they did. The next step was to work out whether mice injected with the vaccines would be protected against later infection with SARS-CoV. The researchers found that mice injected with vaccine based on SARS-CoV S were protected against later infection with a standard SARS-CoV strain, both in the short term (eight weeks after vaccination) and the long term (54 weeks after vaccination). However, the vaccine based on SARS-CoV N did not seem to result in protection, and, worryingly, caused pathological changes in the lungs of mice following virus challenge. To find out if their candidate vaccines would protect against different strains of SARS, the researchers made a synthetic test virus that contained a mixture of genetic material from different natural variants of the virus. This test virus was used to “challenge” mice that had been immunized with the two different vaccines. The researchers found that the vaccine based on SARS-CoV S protected against infection by the test virus when mice were vaccinated young, but it failed to efficiently protect when administered to older mice.
What Do These Findings Mean?
The findings confirm others suggesting that vaccines based on the SARS-CoV S protein are more effective than those based on SARS-CoV N. They also suggest that the former can provide long-term protection in animals vaccinated young against closely related viruses. However, protection against more distantly related viruses remains a challenge, especially when vaccinating older animals. The differences seen between young and older mice suggest that older mice might provide a useful model for animal testing of candidate vaccines for diseases like SARS, flu, and West Nile virus that pose a particular threat to elderly people. Overall, these results provide useful lessons toward future SARS vaccine development in animals. The synthetic virus strain generated here, and others like it, are likely to be useful tools for such future studies.
Additional Information.
Please access these Web sites via the online version of this summary at
• The World Health Organization provides guidance, archives, and other information resources on SARS
• Information from the US Centers for Disease Control on SARS
• Wikipedia (an internet encyclopedia anyone can edit) has an entry on SARS
• Collected resources from MedLinePlus about SARS
PMCID: PMC1716185  PMID: 17194199
14.  Genotator: A disease-agnostic tool for genetic annotation of disease 
BMC Medical Genomics  2010;3:50.
Disease-specific genetic information has been increasing at rapid rates as a consequence of recent improvements and massive cost reductions in sequencing technologies. Numerous systems designed to capture and organize this mounting sea of genetic data have emerged, but these resources differ dramatically in their disease coverage and genetic depth. With few exceptions, researchers must manually search a variety of sites to assemble a complete set of genetic evidence for a particular disease of interest, a process that is both time-consuming and error-prone.
We designed a real-time aggregation tool that provides both comprehensive coverage and reliable gene-to-disease rankings for any disease. Our tool, called Genotator, automatically integrates data from 11 externally accessible clinical genetics resources and uses these data in a straightforward formula to rank genes in order of disease relevance. We tested the accuracy of coverage of Genotator in three separate diseases for which there exist specialty curated databases, Autism Spectrum Disorder, Parkinson's Disease, and Alzheimer Disease. Genotator is freely available at
Genotator demonstrated that most of the 11 selected databases contain unique information about the genetic composition of disease, with 2514 genes found in only one of the 11 databases. These findings confirm that the integration of these databases provides a more complete picture than would be possible from any one database alone. Genotator successfully identified at least 75% of the top ranked genes for all three of our use cases, including a 90% concordance with the top 40 ranked candidates for Alzheimer Disease.
As a meta-query engine, Genotator provides high coverage of both historical genetic research and recent advances in the genetic understanding of specific diseases. As such, Genotator provides a real-time aggregation of ranked data that remains current with the pace of research in the disease fields. Genotator's algorithm appropriately transforms query terms to match the input requirements of each targeted database and accurately resolves named synonyms to ensure full coverage of the genetic results with official nomenclature. Genotator generates an Excel-style output that is consistent across disease queries and readily importable to other applications.
PMCID: PMC2990725  PMID: 21034472
15.  Finding gene regulatory network candidates using the gene expression knowledge base 
BMC Bioinformatics  2014;15(1):386.
Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.
We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.
Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.
PMCID: PMC4279962  PMID: 25490885
Knowledge management; Knowledge representation; Semantic Systems Biology; Semantic Web; RDF; SPARQL; Network extension; Gene expression; Transcription regulation; Protein-protein interaction; Transcription factor; Target gene interaction; Hypothesis assessment; Gastrin biology
16.  In Silico Detection of Sequence Variations Modifying Transcriptional Regulation 
Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at for all researchers interested in the detection and characterization of regulatory sequence variation.
Author Summary
DNA sequence variations (polymorphisms) that affect the expression levels of genes play important roles in the pathogenesis of many complex diseases. Compared with genetic variations that alter the amino acid sequences of encoded proteins, which are relatively easy to identify, sequence variants that affect the regulation of genes are difficult to pinpoint among the large amount of nonfunctional polymorphisms located in the vicinity of genes. Computational methods to distinguish functional from neutral variations could therefore prove useful to direct limited laboratory resources to sites most likely to exhibit a phenotypic effect. In this paper we present a Web-based tool for the identification of genetic variation in potential transcription factor binding sites. This tool can be used by any scientist interested in the characterization of regulatory polymorphisms. Using experimentally verified regulatory polymorphisms and background data collected from the literature, we evaluate the method's capacity to identify regulatory genetic variation, and we discuss the limitations of its application.
PMCID: PMC2211530  PMID: 18208319
17.  Eleven Candidate Susceptibility Genes for Common Familial Colorectal Cancer 
PLoS Genetics  2013;9(10):e1003876.
Hereditary factors are presumed to play a role in one third of colorectal cancer (CRC) cases. However, in the majority of familial CRC cases the genetic basis of predisposition remains unexplained. This is particularly true for families with few affected individuals. To identify susceptibility genes for this common phenotype, we examined familial cases derived from a consecutive series of 1514 Finnish CRC patients. Ninety-six familial CRC patients with no previous diagnosis of a hereditary CRC syndrome were included in the analysis. Eighty-six patients had one affected first-degree relative, and ten patients had two or more. Exome sequencing was utilized to search for genes harboring putative loss-of-function variants, because such alterations are likely candidates for disease-causing mutations. Eleven genes with rare truncating variants in two or three familial CRC cases were identified: UACA, SFXN4, TWSG1, PSPH, NUDT7, ZNF490, PRSS37, CCDC18, PRADC1, MRPL3, and AKR1C4. Loss of heterozygosity was examined in all respective cancer samples, and was detected on seven occasions involving four of the candidate genes. On all seven occasions the wild-type allele was lost (P = 0.0078), providing additional evidence that these eleven genes are likely to include true culprits. The study provides a set of candidate predisposition genes which may explain a subset of common familial CRC. Additional genetic validation in other populations is required to provide firm evidence for causality, as well as to characterize the natural history of the respective phenotypes.
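The reported P = 0.0078 for seven wild-type allele losses out of seven is consistent with a one-sided binomial calculation. The sketch below assumes a null hypothesis in which either allele is equally likely to be lost (probability 0.5 each); the abstract does not state which test the authors used, so this is an illustration of where such a value could come from, and the function name `binomial_tail` is introduced here only for the example.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Seven losses of heterozygosity, all seven affecting the wild-type allele.
# Assumed null: each loss hits the wild-type allele with probability 0.5.
p_value = binomial_tail(7, 7)
print(round(p_value, 4))  # 0.0078, i.e. (1/2)**7 = 1/128
```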
Author Summary
Many individuals with a family history of colorectal cancer have no detectable germline mutation in the known cancer predisposing genes. We aimed to identify novel susceptibility genes for this common phenotype by performing exome sequencing on 96 independent cases with familial colorectal cancer. Eighty-six patients had one affected first-degree relative, and ten patients had two or more. None of the patients had a previous diagnosis of a hereditary syndrome. We focused our search on genes with rare variants, predicted to truncate the protein product, since these are likely candidates for disease predisposition. Using this approach we identified truncating germline variants in eleven genes, present in two or three independent familial colorectal cancer cases. We analyzed the respective tumor DNAs and found loss of the wild-type allele in seven out of seven occasions, involving four genes. No tumor showed loss of the mutant allele which provides us with additional evidence for disease causality. Further studies are required to provide firm evidence for pathogenicity. Genetic knowledge on confirmed predisposing genes can ultimately be translated into tools for cancer prevention and early diagnosis in individuals carrying predisposition alleles.
PMCID: PMC3798264  PMID: 24146633
18.  Control of the proinflammatory state in cystic fibrosis lung epithelial cells by genes from the TNF-alphaR/NFkappaB pathway. 
Molecular Medicine  2001;7(8):523-534.
BACKGROUND: Cystic fibrosis (CF) is the most common, lethal autosomal recessive disease affecting children in the United States and Europe. Extensive work is being performed to develop both gene and drug therapies. The principal mutation causing CF is in the CFTR gene ([Delta F508]CFTR). This mutation causes the mutant protein to traffic poorly to the plasma membrane, and degrades CFTR chloride channel activity. CPX, a candidate drug for CF, binds to mutant CFTR and corrects the trafficking deficit. CPX also activates mutant CFTR chloride channel activity. CF airways are phenotypically inundated by inflammatory signals, primarily contributed by sustained secretion of the proinflammatory cytokine interleukin 8 (IL-8) from mutant CFTR airway epithelial cells. IL-8 production is controlled by genes from the TNF-alphaR/NFkappaB pathway, and it is possible that the CF phenotype is due to dysfunction of genes from this pathway. In addition, because drug therapy with CPX and gene therapy with CFTR have the same common endpoint of raising the levels of CFTR, we have hypothesized that either approach should have a common genomic endpoint. MATERIALS AND METHODS: To test this hypothesis, we studied IL-8 secretion and global gene expression in IB-3 CF lung epithelial cells. The cells were treated by either gene therapy with wild-type CFTR, or by pharmacotherapy with the CFTR-surrogate drug CPX. CF cells, treated with either CFTR or CPX, were also exposed to Pseudomonas aeruginosa, a common chronic pathogen in CF patients. cDNA microarrays were used to assess global gene expression under the different conditions. A novel bioinformatic algorithm (GENESAVER) was developed to identify genes whose expression paralleled secretion of IL-8. RESULTS: We report here that IB3 CF cells secrete massive levels of IL-8. However, both gene therapy with CFTR and drug therapy with CPX substantially suppress IL-8 secretion. 
Nonetheless, both gene and drug therapy allow the CF cells to respond with physiologic secretion of IL-8 when the cells are exposed to P. aeruginosa. Thus, neither CFTR nor CPX acts as a nonspecific suppressor of IL-8 secretion from CF cells. Consistently, pharmacogenomic analysis indicates that CF cells treated with CPX greatly resemble CF cells treated with CFTR by gene therapy. Additionally, the same result obtains in the presence of P. aeruginosa. Classical hierarchical cluster analysis, based on similarity of global gene expression, also supports this conclusion. The GENESAVER algorithm, using the IL-8 secretion level as a physiologic variable, identifies a subset of genes from the TNF-alphaR/NFkappaB pathway that is expressed in phase with IL-8 secretion from CF epithelial cells. Certain other genes, previously known to be positively associated with CF, also fall into this category. Identified genes known to code for known inhibitors are expressed inversely, out of phase with IL-8 secretion. CONCLUSIONS: Wild-type CFTR and CPX both suppress proinflammatory IL-8 secretion from CF epithelial cells. The mechanism, as defined by pharmacogenomic analysis, involves identified genes from the TNF-alphaR/NFkappaB pathway. The close relationship between IL-8 secretion and genes from the TNF-alphaR/NFkappaB pathway suggests that molecular or pharmaceutical targeting of these novel genes may have strategic use in the development of new therapies for CF. From the perspective of global gene expression, both gene and drug therapy have similar genomic consequences. This is the first example showing equivalence of gene and drug therapy in CF, and suggests that a gene therapy-defined endpoint may prove to be a powerful paradigm for CF drug discovery. 
Finally, because the GENESAVER algorithm is capable of isolating disease-relevant genes in a hypothesis-driven manner without recourse to any a priori knowledge about the system, this new algorithm may also prove useful in applications to other genetic diseases.
PMCID: PMC1950060  PMID: 11591888
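The abstract above says GENESAVER finds genes expressed in phase, or inversely out of phase, with a physiologic variable (IL-8 secretion), but it does not give the algorithm itself. The sketch below is therefore not GENESAVER: it swaps in plain Pearson correlation as one simple way to illustrate the idea. The function names, the 0.9 cutoff, and the gene labels are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no phase information
    return sxy / sqrt(sxx * syy)

def classify_genes(expression, phenotype, cutoff=0.9):
    """Split genes into those tracking the phenotype and those opposing it.

    expression: dict mapping gene name -> expression values, one per condition
    phenotype:  physiologic variable (e.g. IL-8 level) across the same conditions
    cutoff:     hypothetical correlation threshold, not taken from the paper
    """
    in_phase, out_of_phase = [], []
    for gene, profile in expression.items():
        r = pearson(profile, phenotype)
        if r >= cutoff:
            in_phase.append(gene)
        elif r <= -cutoff:
            out_of_phase.append(gene)
    return sorted(in_phase), sorted(out_of_phase)

# Made-up values for four conditions (illustration only)
il8 = [10.0, 2.0, 3.0, 8.0]
profiles = {
    "geneA": [5.0, 1.0, 1.5, 4.0],   # rises and falls with IL-8
    "geneB": [1.0, 9.0, 8.0, 3.0],   # mirrors IL-8
    "geneC": [1.0, 1.0, 1.0, 1.0],   # flat
}
in_phase, out_of_phase = classify_genes(profiles, il8)
```

With these illustrative inputs, geneA would be reported as in phase and geneB as out of phase, mirroring the abstract's distinction between pathway genes and their inhibitors.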
19.  Lentiviral Gene Transfer of Rpe65 Rescues Survival and Function of Cones in a Mouse Model of Leber Congenital Amaurosis 
PLoS Medicine  2006;3(10):e347.
RPE65 is specifically expressed in the retinal pigment epithelium and is essential for the recycling of 11-cis-retinal, the chromophore of rod and cone opsins. In humans, mutations in RPE65 lead to Leber congenital amaurosis or early-onset retinal dystrophy, a severe form of retinitis pigmentosa. The proof of feasibility of gene therapy for RPE65 deficiency has already been established in a dog model of Leber congenital amaurosis, but rescue of the cone function, although crucial for human high-acuity vision, has never been strictly proven. In Rpe65 knockout mice, photoreceptors show a drastically reduced light sensitivity and are subject to degeneration, the cone photoreceptors being lost at early stages of the disease. In the present study, we address the question of whether application of a lentiviral vector expressing the Rpe65 mouse cDNA prevents cone degeneration and restores cone function in Rpe65 knockout mice.
Methods and Findings
Subretinal injection of the vector in Rpe65-deficient mice led to sustained expression of Rpe65 in the retinal pigment epithelium. Electroretinogram recordings showed that Rpe65 gene transfer restored retinal function to a near-normal pattern. We performed histological analyses using cone-specific markers and demonstrated that Rpe65 gene transfer completely prevented cone degeneration until at least four months, an age at which almost all cones have degenerated in the untreated Rpe65-deficient mouse. We established an algorithm that allows prediction of the cone-rescue area as a function of transgene expression, which should be a useful tool for future clinical trials. Finally, in mice deficient for both RPE65 and rod transducin, Rpe65 gene transfer restored cone function when applied at an early stage of the disease.
By demonstrating that lentivirus-mediated Rpe65 gene transfer protects and restores the function of cones in the Rpe65−/− mouse, this study reinforces the therapeutic value of gene therapy for RPE65 deficiencies, suggests a cone-preserving treatment for the retina, and evaluates a potentially effective viral vector for this purpose.
In the Rpe65−/− mouse model of Leber congenital amaurosis, injection of a lentiviral vector expressing the Rpe65 mouse cDNA was able to prevent cone degeneration and restore cone function.
Editors' Summary
Leber congenital amaurosis (LCA) is the name of a group of hereditary diseases that cause blindness in infants and children. Changes in any one of a number of different genes can cause the blindness, which affects vision starting at birth or soon after. The condition was first described by a German doctor, Theodore Leber, in the 19th century, hence the first part of the name; “amaurosis” is another word for blindness. Mutations in one gene called retinal pigment epithelium-specific protein, 65 kDa (RPE65)—so called because it is expressed in the pigment epithelium, a cell layer adjacent to the light-sensitive cells, and is 65 kilodaltons in size—cause about 10% of cases of LCA. The product of this gene is essential for the recycling of a substance called 11-cis-retinal, which is necessary for the light-sensitive rods and cones of the retina to capture light. If the gene is abnormal, the sensitivity of the retina to light is drastically reduced, but it also leads to damage to the light-sensitive cells themselves.
Why Was This Study Done?
Potentially, eye diseases such as this one could be treated by gene therapy, which works by replacing a defective gene with a normal functional one, usually by putting a copy of the normal gene into a harmless virus and injecting it into the affected tissue (in this case, the eye). The researchers here wanted to see whether expressing wild-type RPE65 using a lentiviral vector, a particular type of gene vector that can carry large pieces of DNA transcript, could prevent degeneration of cone cells and restore cone function in a mouse model of this type of LCA: mice in which the Rpe65 gene had been genetically removed.
What Did the Researchers Do and Find?
Injection of the normal gene into the retina of Rpe65-deficient mice led to sustained expression of the protein RPE65 in the retinal pigment epithelium. Electrical recordings of the activity of the eyes in these mice showed that Rpe65 gene transfer restored retinal function to a near-normal level. In addition, Rpe65 gene transfer completely prevented cone degeneration until at least four months, an age at which almost all cones have degenerated in the untreated Rpe65-deficient mice.
What Do These Findings Mean?
These findings suggest that it is theoretically possible to treat this type of blindness by gene therapy. However, because this study was done in mice, many other steps need to be taken before it will be clear whether the treatment could work in humans. These steps include a demonstration that the virus is safe in humans, and experiments to determine what dose of virus would be needed and how long the effects of the treatment would last. Another question is whether it would be necessary (or even possible) to treat affected children during early childhood or when children start losing vision.
Additional Information.
Please access these Web sites via the online version of this summary at
The Foundation for Retinal Research has detailed information on Leber's congenital amaurosis
Contact a Family is a UK organization that aims to put families of children with illnesses in touch with each other
The Foundation for Fighting Blindness funds research into, and provides information about many types of blindness, including Leber's congenital amaurosis
This Web site provides information on gene therapy clinical trials, including those dedicated to cure eye diseases
This foundation provides information on diseases leading to blindness, including Leber's congenital amaurosis
PMCID: PMC1592340  PMID: 17032058
20.  GEM-TREND: a web tool for gene expression data mining toward relevant network discovery 
BMC Genomics  2009;10:411.
DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database.
GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories.
GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at .
PMCID: PMC2748096  PMID: 19728865
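The GEM-TREND abstract above cites the nonparametric, rank-based pattern matching of Lamb et al. (Science 2006). A minimal sketch of that style of scoring follows, assuming a Kolmogorov-Smirnov-like statistic computed separately for the up- and down-regulated tags of a query signature. It is not GEM-TREND's code, it omits the statistical-significance step the abstract mentions, and all function and gene names are hypothetical.

```python
def ks_score(ranked_genes, tag_genes):
    """KS-style enrichment of tag_genes in ranked_genes (most up-regulated first)."""
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    ranks = sorted(position[g] for g in tag_genes if g in position)
    t = len(ranks)
    if t == 0:
        return 0.0
    # a: how far the tags run ahead of a uniform spread; b: how far behind
    a = max((j + 1) / t - r / n for j, r in enumerate(ranks))
    b = max(r / n - j / t for j, r in enumerate(ranks))
    return a if a > b else -b

def connectivity_score(ranked_genes, up_genes, down_genes):
    """Positive when up tags sit near the top and down tags near the bottom."""
    ks_up = ks_score(ranked_genes, up_genes)
    ks_down = ks_score(ranked_genes, down_genes)
    if ks_up * ks_down > 0:
        return 0.0  # both tag sets lean the same way: no consistent match
    return ks_up - ks_down

# Hypothetical ten-gene expression profile, ordered from most up to most down
profile = ["g%d" % i for i in range(1, 11)]
similar = connectivity_score(profile, ["g1", "g2"], ["g9", "g10"])
reversed_match = connectivity_score(profile, ["g9", "g10"], ["g1", "g2"])
```

A score near the positive extreme suggests the stored profile resembles the query signature, and a score near the negative extreme suggests a reversed pattern. Ranking database entries by this score is one plausible way to retrieve related data sets.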
21.  Investigating the Genetic Basis of Theory of Mind (ToM): The Role of Catechol-O-Methyltransferase (COMT) Gene Polymorphisms 
PLoS ONE  2012;7(11):e49768.
The ability to deduce other persons' mental states and emotions, which has been termed ‘theory of mind (ToM)’, is highly heritable. The first molecular genetic studies focused on some dopamine-related genes, while the genetic basis underlying different components of ToM (affective ToM and cognitive ToM) remains unknown. The current study tested 7 candidate polymorphisms (rs4680, rs4633, rs2020917, rs2239393, rs737865, rs174699 and rs5993883) on the catechol-O-methyltransferase (COMT) gene. We investigated how these polymorphisms relate to different components of ToM. A total of 101 adults participated in our study; all were genetically unrelated, non-clinical and healthy Chinese subjects. Different ToM tasks were applied to assess their theory of mind ability. The results showed that the COMT gene rs2020917 and rs737865 SNPs were associated with cognitive ToM performance, while the COMT gene rs5993883 SNP was related to affective ToM, in which a significant gender-genotype interaction was found (p = 0.039). Our results highlight the contribution of the dopamine (DA)-related COMT gene to ToM performance. Moreover, we found that different SNPs in the same gene relate to distinct aspects of ToM. Our research provides some preliminary evidence for the genetic basis of theory of mind, which still awaits further study.
PMCID: PMC3507837  PMID: 23209597
22.  solQTL: a tool for QTL analysis, visualization and linking to genomes at SGN database 
BMC Bioinformatics  2010;11:525.
A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases.
The Sol Genomics Network (SGN) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterids species and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-links with other relevant data in internal and external databases. The new QTL module, solQTL, employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application.
solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to help facilitate identification of candidate genes underlying phenotypic variation and markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
PMCID: PMC2984588  PMID: 20964836
23.  Convergence of Mutation and Epigenetic Alterations Identifies Common Genes in Cancer That Predict for Poor Prognosis  
PLoS Medicine  2008;5(5):e114.
The identification and characterization of tumor suppressor genes has enhanced our understanding of the biology of cancer and enabled the development of new diagnostic and therapeutic modalities. Whereas in past decades, a handful of tumor suppressors have been slowly identified using techniques such as linkage analysis, large-scale sequencing of the cancer genome has enabled the rapid identification of a large number of genes that are mutated in cancer. However, determining which of these many genes play key roles in cancer development has proven challenging. Specifically, recent sequencing of human breast and colon cancers has revealed a large number of somatic gene mutations, but virtually all are heterozygous, occur at low frequency, and are tumor-type specific. We hypothesize that key tumor suppressor genes in cancer may be subject to mutation or hypermethylation.
Methods and Findings
Here, we show that combined genetic and epigenetic analysis of these genes reveals many with a higher putative tumor suppressor status than would otherwise be appreciated. At least 36 of the 189 genes newly recognized to be mutated are targets of promoter CpG island hypermethylation, often in both colon and breast cancer cell lines. Analyses of primary tumors show that 18 of these genes are hypermethylated strictly in primary cancers and often with an incidence that is much higher than for the mutations and which is not restricted to a single tumor-type. In the identical breast cancer cell lines in which the mutations were identified, hypermethylation is usually, but not always, mutually exclusive from genetic changes for a given tumor, and there is a high incidence of concomitant loss of expression. Sixteen out of 18 (89%) of these genes map to loci deleted in human cancers. Lastly, and most importantly, the reduced expression of a subset of these genes strongly correlates with poor clinical outcome.
Using an unbiased genome-wide approach, our analysis has enabled the discovery of a number of clinically significant genes targeted by multiple modes of inactivation in breast and colon cancer. Importantly, we demonstrate that a subset of these genes predict strongly for poor clinical outcome. Our data define a set of genes that are targeted by both genetic and epigenetic events, predict for clinical prognosis, and are likely fundamentally important for cancer initiation or progression.
Stephen Baylin and colleagues show that a combined genetic and epigenetic analysis of breast and colon cancers identifies a number of clinically significant genes targeted by multiple modes of inactivation.
Editors' Summary
Cancer is one of the developed world's biggest killers—over half a million Americans die of cancer each year, for instance. As a result, there is great interest in understanding the genetic and environmental causes of cancer in order to improve cancer prevention, diagnosis, and treatment.
Cancer begins when cells begin to multiply out of control. DNA is the sequence of coded instructions—genes—for how to build and maintain the body. Certain “tumor suppressor” genes, for instance, help to prevent cancer by preventing tumors from developing, but changes that alter the DNA code sequence—mutations—can profoundly affect how a gene works. Modern techniques of genetic analysis have identified genes such as tumor suppressors that, when mutated, are linked to the development of certain cancers.
Why Was This Study Done?
However, in recent years, it has become increasingly apparent that mutations are neither necessary nor sufficient to explain every case of cancer. This has led researchers to look at so-called epigenetic factors, which also alter how a gene works without altering its DNA sequence. An example of this is “methylation,” which prevents a gene from being expressed—deactivates it—by a chemical tag. Methylation of genes is part of the normal functioning of DNA, but abnormal methylation has been linked with cancer, aging, and some rare birth abnormalities.
Previous analysis of DNA from breast and colon cancer cells had revealed 189 “candidate cancer genes”—mutated genes that were linked to the development of breast and colon cancer. However, it was not clear how those mutations gave rise to cancer, and individual mutations were present in only 5% to 15% of specific tumors. The authors of this study wanted to know whether epigenetic factors such as methylation contributed to causing the cancers.
What Did the Researchers Do and Find?
The researchers first identified 56 of the 189 candidate cancer genes as likely tumor suppressors and then determined that 36 of these genes were methylated and deactivated, often in both breast and colon (laboratory-grown) cancer cells. In nearly all cases, the methylated genes were not active but could be reactivated by being demethylated. They further showed that, in normal colon and breast tissue samples, 18 of the 36 genes were unmethylated and functioned normally, but in cells taken from breast and colon cancer tumors they were methylated.
In contrast to the genetic mutations, the 18 genes were frequently methylated across a range of tumor types, and eight genes were methylated in both the breast and colon cancers. The authors found by reviewing the genetics and epigenetics of those 18 genes in breast and colon cancer that they were either mutated, methylated, or both. A literature review showed that at least six of the 18 genes were known to have tumor suppressor properties, and the authors determined that 16 were located in parts of DNA known to be missing from cells taken from a range of cancer tumors.
Finally, the researchers analyzed data on cancer cases to show that methylation of these 18 genes was correlated with reduced function of these genes in tumors and with a greater likelihood that a cancer will be terminal or spread to other parts of the body.
What Do These Findings Mean?
The researchers considered only the 189 candidate cancer genes found in one previous study and not other genes identified elsewhere. They also did not consider the biological effects of the individual mutations found in those genes. Despite this, they have demonstrated that methylation of specific genes is likely to play a role in the development of breast and/or colon cancer cells either together with mutations or independently, most likely by turning off their tumor suppression function.
More broadly, however, the study adds to the evidence that future analysis of the role of genes in cancer should include epigenetic as well as genetic factors. In addition, the authors have also shown that a number of these genes may be useful for predicting clinical outcomes for a range of tumor types.
Additional Information.
Please access these Web sites via the online version of this summary at
A December 2006 PLoS Medicine Perspective article reviews the value of examining methylation as a factor in common cancers and its use for early detection
The Web site of the American Cancer Society has a wealth of information and resources on a variety of cancers, including breast and colon cancer
A nonprofit organization provides information about breast cancer on the Web, including research news
Cancer Research UK provides information on cancer research
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins publishes background information on the authors' research on methylation, setting out its potential for earlier diagnosis and better treatment of cancer
PMCID: PMC2429944  PMID: 18507500
24.  Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection 
Next generation sequencing provides clinical research scientists with direct read out of innumerable variants, including personal, pathological and common benign variants. The aim of resequencing studies is to determine the candidate pathogenic variants from individual genomes, or from family-based or tumor/normal genome comparisons. Whilst the use of appropriate controls within the experimental design will minimize the number of false positive variations selected, this number can be reduced further with the use of high quality whole genome reference data to minimize false positive variants prior to candidate gene selection. In addition, the use of platform related sequencing error models can help in the recovery of ambiguous genotypes from lower coverage data.
We have developed a whole genome database of human genetic variations, Huvariome, determined by whole genome deep sequencing data with high coverage and low error rates. The database was designed to be sequencing technology independent but is currently populated with 165 individual whole genomes consisting of small pedigrees and matched tumor/normal samples sequenced with the Complete Genomics sequencing platform. Common variants have been determined for a Benelux population cohort and represented as genotypes alongside the results of two sets of control data (73 of the 165 genomes), Huvariome Core which comprises 31 healthy individuals from the Benelux region, and Diversity Panel consisting of 46 healthy individuals representing 10 different populations and 21 samples in three Pedigrees. Users can query the database by gene or position via a web interface and the results are displayed as the frequency of the variations as detected in the datasets. We demonstrate that Huvariome can provide accurate reference allele frequencies to disambiguate sequencing inconsistencies produced in resequencing experiments. Huvariome has been used to support the selection of candidate cardiomyopathy related genes which have a homozygous genotype in the reference cohorts. This database allows the users to see which selected variants are common variants (> 5% minor allele frequency) in the Huvariome core samples, thus aiding in the selection of potentially pathogenic variants by filtering out common variants that are not listed in one of the other public genomic variation databases. The no-call rate and the accuracy of allele calling in Huvariome provides the user with the possibility of identifying platform dependent errors associated with specific regions of the human genome.
Huvariome is a simple to use resource for validation of resequencing results obtained by NGS experiments. The high sequence coverage and low error rates provide scientists with the ability to remove false positive results from pedigree studies. Results are returned via a web interface that displays location-based genetic variation frequency, impact on protein function, association with known genetic variations and a quality score of the variation base derived from Huvariome Core and the Diversity Panel data. These results may be used to identify and prioritize rare variants that, for example, might be disease relevant. In testing the accuracy of the Huvariome database, alleles of a selection of ambiguously called coding single nucleotide variants were successfully predicted in all cases. Data protection of individuals is ensured by restricted access to patient derived genomes from the host institution which is relevant for future molecular diagnostics.
PMCID: PMC3549785  PMID: 23164068
Medical genetics; Medical genomics; Whole genome sequencing; Allele frequency; Cardiomyopathy
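The Huvariome abstract above describes filtering out variants that are common (> 5% minor allele frequency) in healthy reference genomes before selecting candidate pathogenic variants. The sketch below illustrates that filtering step under simple assumptions: diploid genotype calls stored as allele pairs, no-calls stored as None, and variants absent from the cohort kept for review. The data layout and function names are hypothetical and are not Huvariome's interface.

```python
def allele_frequency(genotypes, allele):
    """Frequency of `allele` among called diploid genotypes; None if nothing was called."""
    called = [g for g in genotypes if g is not None]  # skip no-calls
    if not called:
        return None
    return sum(g.count(allele) for g in called) / (2 * len(called))

def filter_rare(candidates, cohort, threshold=0.05):
    """Keep candidate variants that are not common in the reference cohort.

    candidates: list of (variant_id, allele) pairs from a resequencing experiment
    cohort:     dict mapping variant_id -> list of genotype tuples from healthy controls
    threshold:  frequency above which a variant counts as common (5% in the abstract)
    """
    kept = []
    for variant_id, allele in candidates:
        freq = allele_frequency(cohort.get(variant_id, []), allele)
        # Assumption: a variant with no reference data stays in the candidate list
        if freq is None or freq <= threshold:
            kept.append(variant_id)
    return kept

# Made-up reference cohort (illustration only)
reference = {
    "var1": [("A", "G")] * 3 + [("A", "A")] * 7,           # G at 3/20 = 15%
    "var2": [("C", "C")] * 19 + [("C", "T")] + [None],     # T at 1/40 = 2.5%
}
candidates = [("var1", "G"), ("var2", "T"), ("var3", "A")]
rare = filter_rare(candidates, reference)
```

Treating unseen variants as candidates is a design choice made here for illustration; a stricter pipeline might instead require a minimum number of called reference genotypes before trusting any frequency.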
25.  The neural basis of theory of mind and its relationship to social functioning and social anhedonia in individuals with schizophrenia 
NeuroImage: Clinical  2013;4:154-163.
Theory of mind (ToM), the ability to attribute and reason about the mental states of others, is a strong determinant of social functioning among individuals with schizophrenia. Identifying the neural bases of ToM and their relationship to social functioning may elucidate functionally relevant neurobiological targets for intervention. ToM ability may additionally account for other social phenomena that affect social functioning, such as social anhedonia (SocAnh). Given recent research in schizophrenia demonstrating improved neural functioning in response to increased use of cognitive skills, it is possible that SocAnh, which decreases one's opportunity to engage in ToM, could compromise social functioning through its deleterious effect on ToM-related neural circuitry. Here, 20 individuals with schizophrenia and 18 healthy controls underwent fMRI while performing the False-Belief Task. Aspects of social functioning were assessed using multiple methods including self-report (Interpersonal Reactivity Index, Social Adjustment Scale), clinician-ratings (Global Functioning Social Scale), and performance-based tasks (MSCEIT Managing Emotions). SocAnh was measured with the Revised Social Anhedonia Scale. Region-of-interest and whole-brain analyses revealed reduced recruitment of medial prefrontal cortex (MPFC) for ToM in individuals with schizophrenia. Across all participants, activity in this region correlated with most social variables. Mediation analysis revealed that neural activity for ToM in MPFC accounted for the relationship between SocAnh and social functioning. These findings demonstrate that reduced recruitment of MPFC for ToM is an important neurobiological determinant of social functioning. Furthermore, SocAnh may affect social functioning through its impact on ToM-related neural circuitry. Together, these findings suggest ToM ability as an important locus for intervention.
• Individuals with schizophrenia exhibited reduced recruitment of MPFC for ToM.
• MPFC and RTPJ activities correlate with measures of social functioning and ability.
• MPFC activity mediates the relationship between social anhedonia and functioning.
• Neural circuitry supporting ToM may represent an important area for remediation.
PMCID: PMC3871293  PMID: 24371798
Schizophrenia; Theory of mind; Social functioning; Social anhedonia; fMRI
