Related Articles

1.  Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks 
BMC Bioinformatics  2005;6:227.
Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain.
We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and conversely how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO)? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use Receiver Operating Curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories.
We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing whether pairs of functions operate together.
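The score-then-ROC evaluation described in this abstract can be illustrated with a minimal sketch. It assumes a simple neighbour-voting score (the share of a gene's coexpression weight that goes to genes already annotated with the category), which is a stand-in and not the probabilistic score used in the paper. The toy network, the gene names, and the function names are all hypothetical.

```python
from itertools import product

def neighbor_vote_scores(weights, members):
    """Score each gene by the fraction of its coexpression weight that
    goes to genes annotated with the category (the gene itself excluded)."""
    scores = {}
    for gene, nbrs in weights.items():
        total = sum(w for g, w in nbrs.items() if g != gene)
        to_members = sum(w for g, w in nbrs.items()
                         if g != gene and g in members)
        scores[gene] = to_members / total if total > 0 else 0.0
    return scores

def auroc(scores, members):
    """Area under the ROC curve: the probability that a random category
    member outscores a random non-member (ties count as 0.5)."""
    pos = [s for g, s in scores.items() if g in members]
    neg = [s for g, s in scores.items() if g not in members]
    if not pos or not neg:
        raise ValueError("need both members and non-members")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Hypothetical toy network: symmetric coexpression weights in [0, 1].
toy = {
    "g1": {"g2": 0.9, "g3": 0.8, "g4": 0.1, "g5": 0.2},
    "g2": {"g1": 0.9, "g3": 0.7, "g4": 0.2, "g5": 0.1},
    "g3": {"g1": 0.8, "g2": 0.7, "g4": 0.1, "g5": 0.3},
    "g4": {"g1": 0.1, "g2": 0.2, "g3": 0.1, "g5": 0.9},
    "g5": {"g1": 0.2, "g2": 0.1, "g3": 0.3, "g4": 0.9},
}
category = {"g1", "g2", "g3"}  # hypothetical GO category membership
scores = neighbor_vote_scores(toy, category)
print(auroc(scores, category))
```

An AUROC near 0.5 would mean coexpression carries no information about membership in the category, while values near 1 indicate a strong guilt-by-association signal for that category.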
PMCID: PMC1239911  PMID: 16162296
2.  Functional and genetic characterization of the non-lysosomal glucosylceramidase 2 as a modifier for Gaucher disease 
Gaucher disease (GD) is the most common inherited lysosomal storage disorder in humans, caused by mutations in the gene encoding the lysosomal enzyme glucocerebrosidase (GBA1). GD is clinically heterogeneous and although the type of GBA1 mutation plays a role in determining the type of GD, it does not explain the clinical variability seen among patients. Cumulative evidence from recent studies suggests that GBA2 could play a role in the pathogenesis of GD and potentially interacts with GBA1.
We used a framework of functional and genetic approaches in order to further characterize a potential role of GBA2 in GD. Glucosylceramide (GlcCer) levels in spleen, liver and brain of GBA2-deficient mice and mRNA and protein expression of GBA2 in GBA1-deficient murine fibroblasts were analyzed. Furthermore we crossed GBA2-deficient mice with conditional Gba1 knockout mice in order to quantify the interaction between GBA1 and GBA2. Finally, a genetic approach was used to test whether genetic variation in GBA2 is associated with GD and/or acts as a modifier in Gaucher patients. We tested 22 SNPs in the GBA2 and GBA1 genes in 98 type 1 and 60 type 2/3 Gaucher patients for single- and multi-marker association with GD.
We found a significant accumulation of GlcCer compared to wild-type controls in all three organs studied. In addition, a significant increase of Gba2-protein and Gba2-mRNA levels in GBA1-deficient murine fibroblasts was observed. GlcCer levels in the spleen from Gba1/Gba2 knockout mice were much higher than the sum of the single knockouts, indicating cross-talk between the two glucosylceramidases and suggesting a partial compensation of the loss of one enzyme by the other. In the genetic approach, no significant association with severity of GD was found for SNPs at the GBA2 locus. However, in the multi-marker analyses a significant result was detected for p.L444P (GBA1) and rs4878628 (GBA2), using a model that does not take marginal effects into account.
Altogether, our observations make GBA2 a likely candidate to be involved in GD etiology. Furthermore, they point to GBA2 as a plausible modifier for GBA1 in patients with GD.
PMCID: PMC3850879  PMID: 24070122
3.  “Guilt by Association” Is the Exception Rather Than the Rule in Gene Networks 
PLoS Computational Biology  2012;8(3):e1002444.
Gene networks are commonly interpreted as encoding functional information in their connections. An extensively validated principle called guilt by association states that genes which are associated or interacting are more likely to share function. Guilt by association provides the central top-down principle for analyzing gene networks in functional terms or assessing their quality in encoding functional information. In this work, we show that functional information within gene networks is typically concentrated in only a very few interactions whose properties cannot be reliably related to the rest of the network. In effect, the apparent encoding of function within networks has been largely driven by outliers whose behaviour cannot even be generalized to individual genes, let alone to the network at large. While experimentalist-driven analysis of interactions may use prior expert knowledge to focus on the small fraction of critically important data, large-scale computational analyses have typically assumed that high-performance cross-validation in a network is due to a generalizable encoding of function. Because we find that gene function is not systemically encoded in networks, but dependent on specific and critical interactions, we conclude it is necessary to focus on the details of how networks encode function and what information computational analyses use to extract functional meaning. We explore a number of consequences of this and find that network structure itself provides clues as to which connections are critical and that systemic properties, such as scale-free-like behaviour, do not map onto the functional connectivity within networks.
Author Summary
The analysis of gene function and gene networks is a major theme of post-genome biomedical research. Historically, many attempts to understand gene function leverage a biological principle known as “guilt by association” (GBA). GBA states that genes with related functions tend to share properties such as genetic or physical interactions. In the past ten years, GBA has been scaled up for application to large gene networks, becoming a favored way to grapple with the complex interdependencies of gene functions in the face of floods of genomics and proteomics data. However, there is a growing realization that scaled-up GBA is not a panacea. In this study, we report a precise identification of the limits of GBA and show that it cannot provide a way to understand gene networks in a way that is simultaneously general and useful. Our findings indicate that the assumptions underlying the high-throughput use of gene networks to interpret function are fundamentally flawed, with wide-ranging implications for the interpretation of genome-wide data.
PMCID: PMC3315453  PMID: 22479173
4.  Progress and challenges in the computational prediction of gene function using networks: 2012-2013 update 
F1000Research  2013;2:230.
In an opinion published in 2012, we reviewed and discussed our studies of how gene network-based guilt-by-association (GBA) is impacted by confounds related to gene multifunctionality. We found such confounds account for a significant part of the GBA signal, and as a result meaningfully evaluating and applying computationally-guided GBA is more challenging than generally appreciated. We proposed that effort currently spent on incrementally improving algorithms would be better spent in identifying the features of data that do yield novel functional insights. We also suggested that part of the problem is the reliance by computational biologists on gold standard annotations such as the Gene Ontology. In the year since, there has been continued heavy activity in GBA-based research, including work that contributes to our understanding of the issues we raised. Here we review some of the most relevant recent work, including studies that point to new areas of progress and challenges.
PMCID: PMC3962002  PMID: 24715959
5.  Assessing functional annotation transfers with inter-species conserved coexpression: application to Plasmodium falciparum 
BMC Genomics  2010;11:35.
Plasmodium falciparum is the main causative agent of malaria. Of the 5 484 predicted genes of P. falciparum, about 57% do not have sufficient sequence similarity to characterized genes in other species to warrant functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. Gene expression data have been widely used in recent years to help functional annotation in an intra-species way via the so-called Guilt By Association (GBA) principle.
We propose a new method that uses gene expression data to assess inter-species annotation transfers. Our approach starts from a set of likely orthologs between a reference species (here S. cerevisiae and D. melanogaster) and a query species (P. falciparum). It aims at identifying clusters of coexpressed genes in the query species whose coexpression has been conserved in the reference species. These conserved clusters of coexpressed genes are then used to assess annotation transfers between genes with low sequence similarity, enabling reliable transfers of annotations from the reference to the query species. The approach was used with transcriptomic data sets of P. falciparum, S. cerevisiae and D. melanogaster, and enabled us to propose with high confidence new/refined annotations for several dozen hypothetical/putative P. falciparum genes. Notably, we revised the annotation of genes involved in ribosomal proteins and ribosome biogenesis and assembly, thus highlighting several potential drug targets.
Our approach uses both sequence similarity and gene expression data to help inter-species gene annotation transfers. Experiments show that this strategy improves the accuracy achieved when using solely sequence similarity and outperforms the accuracy of the GBA approach. In addition, our experiments with P. falciparum show that it can infer a function for numerous hypothetical genes.
PMCID: PMC2826313  PMID: 20078859
6.  Visual short-term memory deficits associated with GBA mutation and Parkinson’s disease 
Brain  2014;137(8):2303-2311.
Individuals with mutation in the lysosomal enzyme glucocerebrosidase (GBA) gene are at significantly high risk of developing Parkinson’s disease with cognitive deficit. We examined whether visual short-term memory impairments, long associated with patients with Parkinson’s disease, are also present in GBA-positive individuals, both with and without Parkinson’s disease. Precision of visual working memory was measured using a serial order task in which participants observed four bars, each of a different colour and orientation, presented sequentially at screen centre. Afterwards, they were asked to adjust a coloured probe bar’s orientation to match the orientation of the bar of the same colour in the sequence. An additional attentional ‘filtering’ condition tested patients’ ability to selectively encode one of the four bars while ignoring the others. A sensorimotor task using the same stimuli controlled for perceptual and motor factors. There was a significant deficit in memory precision in GBA-positive individuals (with or without Parkinson’s disease), as well as GBA-negative patients with Parkinson’s disease, compared to healthy controls. Worst recall was observed in GBA-positive cases with Parkinson’s disease. Although all groups were impaired in visual short-term memory, there was a double dissociation between sources of error associated with GBA mutation and Parkinson’s disease. The deficit observed in GBA-positive individuals, regardless of whether they had Parkinson’s disease, was explained by a systematic increase in interference from features of other items in memory: misbinding errors. In contrast, impairments in patients with Parkinson’s disease, regardless of GBA status, were explained by increased random responses. Individuals who were GBA-positive and also had Parkinson’s disease suffered from both types of error, demonstrating the worst performance.
These findings provide evidence for dissociable signature deficits within the domain of visual short-term memory associated with GBA mutation and with Parkinson’s disease. Identification of the specific pattern of cognitive impairment in GBA mutation versus Parkinson’s disease is potentially important as it might help to identify individuals at risk of developing Parkinson’s disease.
PMCID: PMC4107740  PMID: 24919969
visual short-term memory; working memory; Gaucher’s disease; glucocerebrosidase
7.  Glucocerebrosidase mutations in primary parkinsonism 
Parkinsonism & Related Disorders  2014;20(11):1215-1220.
Mutations in the lysosomal glucocerebrosidase (GBA) gene increase the risk of Parkinson's Disease (PD). We determined the frequency and relative risk of major GBA mutations in a large series of Italian patients with primary parkinsonism.
We studied 2766 unrelated consecutive patients with clinical diagnosis of primary degenerative parkinsonism (including 2350 PD), and 1111 controls. The entire cohort was screened for mutations in GBA exons 9 and 10, covering approximately 70% of mutations, including the two most frequent defects, p.N370S and p.L444P.
Four known mutations were identified in heterozygous state: 3 missense mutations (p.N370S, p.L444P, and p.D443N), and the splicing mutation IVS10+1G>T, which results in the in-frame exon-10 skipping. Molecular characterization of 2 additional rare variants, potentially interfering with splicing, suggested a neutral effect. GBA mutations were more frequent in PD (4.5%, RR = 7.2, CI = 3.3–15.3) and in Dementia with Lewy Bodies (DLB) (13.8%, RR = 21.9, CI = 6.8–70.7) than in controls (0.63%), but not in the other forms of parkinsonism such as Progressive Supranuclear Palsy (PSP, 2%), and Corticobasal Degeneration (CBD, 0%). Considering only the PD group, GBA-carriers were younger at onset (52 ± 10 vs. 57 ± 10 years, P < 0.0001) and were more likely to have a positive family history of PD (34% vs. 20%, P < 0.001).
GBA dysfunction is relevant for synucleinopathies, such as PD and DLB, except for MSA, in which pathology involves oligodendrocytes, and the tauopathies PSP and CBD. The risk of developing DLB is three-fold higher than PD, suggesting a more aggressive phenotype.
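As a worked illustration of the RR figures quoted in this abstract, the sketch below computes a ratio of carrier frequencies with a log-scale (Katz) 95% interval. The carrier counts (106 of 2350 PD patients, 7 of 1111 controls) are back-calculated from the reported percentages and are an assumption, as is the choice of interval method; the abstract does not state either.

```python
import math

def relative_risk(cases_carriers, cases_total, ctrl_carriers, ctrl_total, z=1.96):
    """Ratio of carrier proportions with a log-scale (Katz) interval.

    Returns (rr, lower, upper). z=1.96 gives an approximate 95% interval.
    """
    p_cases = cases_carriers / cases_total
    p_ctrl = ctrl_carriers / ctrl_total
    rr = p_cases / p_ctrl
    # Standard error of log(RR) for two independent binomial proportions.
    se = math.sqrt(1 / cases_carriers - 1 / cases_total
                   + 1 / ctrl_carriers - 1 / ctrl_total)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Assumed counts, inferred from 4.5% of 2350 PD patients and 0.63% of 1111 controls.
rr, lower, upper = relative_risk(106, 2350, 7, 1111)
print(f"RR = {rr:.1f}, 95% CI = {lower:.1f}-{upper:.1f}")
```

Under these assumed counts the result lands close to the reported RR = 7.2 (CI = 3.3–15.3). The statement that DLB risk is about three-fold higher than PD risk corresponds to the ratio of the two reported RRs, 21.9 / 7.2 ≈ 3.0.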
• We screened a large case–control cohort with parkinsonism for common GBA mutations.
• GBA mutations in the Italian population are a risk factor for Lewy Bodies Diseases (PD and DLB).
• GBA mutations were not increased in the other forms of parkinsonism: PSP, CBD and MSA.
• GBA dysfunction does not seem to be involved in MSA and tauopathies.
PMCID: PMC4228056  PMID: 25249066
Parkinson's disease; GBA; Parkinsonism; Association analysis; Splicing mutation; Functional characterization
8.  GBA server: EST-based digital gene expression profiling 
Nucleic Acids Research  2005;33(Web Server issue):W673-W676.
Expressed Sequence Tag-based gene expression profiling can be used to discover functionally associated genes on a large scale. Currently available web servers and tools focus on finding differentially expressed genes in different samples or tissues rather than finding co-expressed genes. To fill this gap, we have developed a web server that implements the GBA (Guilt-by-Association) co-expression algorithm, which has been successfully used in finding disease-related genes. We have also annotated UniGene clusters with links to several important databases such as GO, KEGG, OMIM, Gene, IPI and HomoloGene. The GBA server can be accessed and downloaded at .
PMCID: PMC1160240  PMID: 15980560
9.  The link between the GBA gene and parkinsonism 
The Lancet. Neurology  2012;11(11):986-998.
Mutations in the glucocerebrosidase (GBA) gene, which encodes the lysosomal enzyme that is deficient in Gaucher's disease, are important and common risk factors for Parkinson’s disease and related disorders. This association was first recognised in the clinic, where parkinsonism was noted, albeit rarely, in patients with Gaucher's disease and more frequently in relatives who were obligate carriers. Subsequently, findings from large studies showed that patients with Parkinson’s disease and associated Lewy body disorders had an increased frequency of GBA mutations when compared with control individuals. Patients with GBA-associated parkinsonism exhibit varying parkinsonian phenotypes but tend to have an earlier age of onset and more associated cognitive changes than patients with parkinsonism without GBA mutations. Hypotheses proposed to explain this association include a gain-of-function due to mutations in glucocerebrosidase that promotes α-synuclein aggregation; substrate accumulation due to enzymatic loss-of-function, which affects α-synuclein processing and clearance; and a bidirectional feedback loop. Identification of the pathological mechanisms underlying GBA-associated parkinsonism will improve our understanding of the genetics, pathophysiology, and treatment for both rare and common neurological diseases.
PMCID: PMC4141416  PMID: 23079555
10.  The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes 
Nucleic Acids Research  2004;32(18):5539-5545.
In this paper, we present the Functional Catalogue (FunCat), a hierarchically structured, organism-independent, flexible and scalable controlled classification system enabling the functional description of proteins from any organism. FunCat has been applied for the manual annotation of prokaryotes, fungi, plants and animals. We describe how FunCat is implemented as a highly efficient and robust tool for the manual and automatic annotation of genomic sequences. Owing to its hierarchical architecture, FunCat has also proved to be useful for many subsequent downstream bioinformatic applications. This is illustrated by the analysis of large-scale experiments from various investigations in transcriptomics and proteomics, where FunCat was used to project experimental data into functional units, as ‘gold standard’ for functional classification methods, and also served to compare the significance of different experimental methods. Over the last decade, the FunCat has been established as a robust and stable annotation scheme that offers both meaningful and manageable functional classification and ease of perception.
PMCID: PMC524302  PMID: 15486203
11.  The association between ß-glucocerebrosidase mutations and parkinsonism 
Current neurology and neuroscience reports  2013;13(8):10.1007/s11910-013-0368-x.
Mutations in the ß-glucocerebrosidase gene (GBA), which encodes the lysosomal enzyme ß-glucocerebrosidase, have traditionally been implicated in Gaucher disease, an autosomal-recessive lysosomal storage disorder. Yet the past two decades have yielded an explosion of epidemiological and basic-science evidence linking mutations in GBA with the development of Parkinson disease as well. Although the specific contribution of mutant GBA to the pathogenesis of parkinsonism remains unknown, evidence suggests that both loss of function and toxic gain of function by abnormal ß-glucocerebrosidase may be important, and points to a close relationship between ß-glucocerebrosidase and α-synuclein. Furthermore, multiple lines of evidence suggest that while GBA-associated PD closely mimics idiopathic PD (IPD), it may present at a younger age, and is more frequently complicated by cognitive dysfunction. Understanding the clinical association between GBA and PD, and the relationship between ß-glucocerebrosidase and α-synuclein, may enhance understanding of the pathogenesis of IPD, improve prognostication and treatment of GBA carriers with parkinsonism, and may furthermore inform therapies for IPD not due to GBA mutations.
PMCID: PMC3816495  PMID: 23812893
Parkinson disease; parkinsonism; dementia with Lewy bodies; Gaucher disease; GBA; ß-glucocerebrosidase
12.  Mutations in GBA are associated with familial Parkinson disease susceptibility and age at onset 
Neurology  2009;72(4):310-316.
To characterize sequence variation within the glucocerebrosidase (GBA) gene in a select subset of our sample of patients with familial Parkinson disease (PD) and then to test in our full sample whether these sequence variants increased the risk for PD and were associated with an earlier onset of disease.
We performed a comprehensive study of all GBA exons in one patient with PD from each of 96 PD families, selected based on the family-specific lod scores at the GBA locus. Identified GBA variants were subsequently screened in all 1325 PD cases from 566 multiplex PD families and in 359 controls.
Nine different GBA variants, five previously reported, were identified in 21 of the 96 PD cases sequenced. Screening for these variants in the full sample identified 161 variant carriers (12.2%) in 99 different PD families. An unbiased estimate of the frequency of the five previously reported GBA variants in the familial PD sample was 12.6% and in the control sample was 5.3% (odds ratio 2.6; 95% confidence interval 1.5–4.4). Presence of a GBA variant was associated with an earlier age at onset (p = 0.0001). On average, those patients carrying a GBA variant had onset with PD 6.04 years earlier than those without a GBA variant.
This study suggests that GBA is a susceptibility gene for familial Parkinson disease (PD) and patients with GBA variants have an earlier age at onset than patients with PD without GBA variants.
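The reported odds ratio can be checked directly against the two carrier frequencies given in this abstract (12.6% in familial PD, 5.3% in controls). The sketch below is only that arithmetic; the function name is hypothetical, and no confidence interval is computed because the underlying counts are not stated here.

```python
def odds_ratio_from_frequencies(p_cases, p_controls):
    """Odds ratio from carrier frequencies in cases and controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls

# Frequencies as reported in the abstract.
print(round(odds_ratio_from_frequencies(0.126, 0.053), 1))
```

The value rounds to the reported odds ratio of 2.6.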
= confidence interval;
= Gaucher disease;
= Geriatric Depression Scale;
= Mini-Mental State Examination;
= National Cell Repository for Alzheimer’s Disease;
= nonparametric lod;
= odds ratio;
= Parkinson disease;
= Unified Parkinson’s Disease Rating Scale.
PMCID: PMC2677501  PMID: 18987351
13.  The relationship between the visual evoked potential and the gamma band investigated by blind and semi-blind methods 
Neuroimage  2011;56(3-4):1059-1071.
Gamma Band Activity (GBA) is increasingly studied for its relation with attention, change detection, maintenance of working memory and the processing of sensory stimuli. Activity around the gamma range has also been linked with early visual processing, although the relationship between this activity and the low frequency visual evoked potential (VEP) remains unclear. This study examined the ability of blind and semi-blind source separation techniques to extract sources specifically related to the VEP and GBA in order to shed light on the relationship between them. Blind (Independent Component Analysis—ICA) and semi-Blind (Functional Source Separation—FSS) methods were applied to dense array EEG data recorded during checkerboard stimulation. FSS was performed with both temporal and spectral constraints to identify specifically the generators of the main peak of the VEP (P100) and of the GBA. Source localisation and time-frequency analyses were then used to investigate the properties and co-dependencies between VEP/P100 and GBA. Analysis of the VEP extracted using the different methods demonstrated very similar morphology and localisation of the generators. Single trial time frequency analysis showed higher GBA when a larger amplitude VEP/P100 occurred. Further examination indicated that the evoked (phase-locked) component of the GBA was more related to the P100, whilst the induced component correlated with the VEP as a whole. The results suggest that the VEP and GBA may be generated by the same neuronal populations, and implicate this relationship as a potential mediator of the correlation between the VEP and the Blood Oxygenation Level Dependent (BOLD) effect measured with fMRI.
Research Highlights
► ICA and FSS are able to extract sources specifically related to the VEP/P100 and GBA.
► Localisation and frequency analyses show co-dependencies between VEP/P100 and GBA.
► Trial by trial induced GBA covaries with VEP amplitude.
► VEP and GBA may be generated by the same neuronal populations.
► VEP and GBA relationship may underlie the relationship between VEP and BOLD.
PMCID: PMC3095074  PMID: 21396460
Visual Evoked Potential (VEP); Electroencephalography (EEG); Independent Component Analysis (ICA); Functional Source Separation (FSS); Induced Visual Gamma (IVG); Gamma Band Activity (GBA)
14.  Network Assessor: an automated method for quantitative assessment of a network's potential for gene function prediction 
Frontiers in Genetics  2014;5:123.
Significant effort has been invested in network-based gene function prediction algorithms based on the guilt by association (GBA) principle. Existing approaches for assessing prediction performance typically compute evaluation metrics, either averaged across all functions being considered, or strictly from properties of the network. Since the success of GBA algorithms depends on the specific function being predicted, evaluation metrics should instead be computed for each function. We describe a novel method for computing the usefulness of a network by measuring its impact on gene function cross validation prediction performance across all gene functions. We have implemented this in software called Network Assessor, and describe its use in the GeneMANIA (GM) quality control system. Network Assessor is part of the GM command line tools.
PMCID: PMC4032932  PMID: 24904632
network inference; function prediction; cross validation; network biology; machine learning
15.  Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites 
PLoS Computational Biology  2007;3(5):e90.
Computational prediction of nucleotide binding specificity for transcription factors remains a fundamental and largely unsolved problem. Determination of binding positions is a prerequisite for research in gene regulation, a major mechanism controlling phenotypic diversity. Furthermore, an accurate determination of binding specificities from high-throughput data sources is necessary to realize the full potential of systems biology. Unfortunately, a recently performed independent evaluation showed that more than half the predictions from the most widely used algorithms are false. We introduce a graph-theoretical framework to describe local sequence similarity as the pair-wise distances between nucleotides in promoter sequences, and hypothesize that densely connected subgraphs are indicative of transcription factor binding sites. Using a well-established sampling algorithm coupled with simple clustering and scoring schemes, we identify sets of closely related nucleotides and test those for known TF binding activity. Using an independent benchmark, we find our algorithm predicts yeast binding motifs considerably better than currently available techniques and without manual curation. Importantly, we reduce the number of false positive predictions in yeast to less than 30%. We also develop a framework to evaluate the statistical significance of our motif predictions. We show that our approach is robust to the choice of input promoters, and thus can be used in the context of predicting binding positions from noisy experimental data. We apply our method to identify binding sites using data from genome scale ChIP–chip experiments. Results from these experiments are publicly available at The graphical framework developed here may be useful when combining predictions from numerous computational and experimental measures. Finally, we discuss how our algorithm can be used to improve the sensitivity of computational predictions of transcription factor binding specificities.
Author Summary
A historically difficult problem in computational biology is the identification of transcription factor binding sites (TFBS) in the promoters of co-regulated genes. With increasing emphasis on research in transcriptional regulation, this problem is also uniquely relevant to emerging results from recent experiments in high-throughput and systems biology. Despite extensive research in the area, recent evaluations of previously published techniques show much room for improvement. In this paper, we introduce a fundamentally new approach to the identification of TFBS. First, we start by representing nucleotides in promoters as an undirected, weighted graph. Given this representation of a binding site graph (BSG), we employ relatively simple graph clustering techniques to identify functional TFBS. We show that BSG predictions significantly outperform all previously evaluated methods in nearly every performance measure using a standardized assessment benchmark. We also find that this approach is more robust than traditional Gibbs sampling to selection of input promoters, and thus more likely to perform well under noisy experimental conditions. Finally, BSGs are very good at predicting specificity determining nucleotides. Using BSG predictions, we were able to confirm recent experimental results on binding specificity of E-box TFs CBF1 and PHO4 and predict novel specificity determining nucleotides for TYE7.
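The graph representation described in this summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the published BSG method: nodes are fixed-length windows (k-mers) in promoter sequences, edge weights are the number of matching positions between windows from different promoters, and connected components stand in for the "relatively simple graph clustering" step. The sequences, the parameters `k` and `min_matches`, and the function names are all hypothetical.

```python
from itertools import combinations

def kmers(promoters, k):
    """Yield (promoter_index, offset, kmer) for every window of length k."""
    for i, seq in enumerate(promoters):
        for j in range(len(seq) - k + 1):
            yield (i, j, seq[j:j + k])

def build_graph(promoters, k, min_matches):
    """Undirected weighted graph; nodes are (promoter_index, offset)."""
    nodes = list(kmers(promoters, k))
    edges = {}
    for a, b in combinations(nodes, 2):
        if a[0] == b[0]:
            continue  # only link windows from different promoters
        matches = sum(x == y for x, y in zip(a[2], b[2]))
        if matches >= min_matches:
            edges[(a[:2], b[:2])] = matches  # edge weight = matching positions
    return nodes, edges

def components(nodes, edges):
    """Connected components (size > 1) via union-find."""
    parent = {n[:2]: n[:2] for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in parent:
        groups.setdefault(find(n), []).append(n)
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Made-up promoters that share the E-box hexamer CACGTG at offset 2.
promoters = ["TTCACGTGAA", "GGCACGTGCC", "AACACGTGTT"]
nodes, edges = build_graph(promoters, k=6, min_matches=6)
print(components(nodes, edges))
```

With exact matching (`min_matches=6`) the only multi-node component is the shared hexamer; lowering the threshold admits near-matches, which is where a real clustering and scoring scheme would be needed to separate true sites from background similarity.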
PMCID: PMC1866359  PMID: 17500587
16.  Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities 
BMC Bioinformatics  2011;12:233.
Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element.
This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network.
The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork on E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.
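The general idea of scoring regulator-target links against TFA profiles rather than regulator expression can be sketched as below. This is a minimal illustration, not the GTRNetwork implementation: it assumes Pearson correlation as the relevance score (the abstract says several score functions were tried without naming them here), it takes the TFA profiles as given rather than predicting them, and the data, threshold, and names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def relevance_network(tfa, expr, threshold):
    """Return (tf, gene, score) links whose |score| meets the threshold.

    tfa:  {tf_name: activity profile across conditions}   (assumed given)
    expr: {gene_name: expression profile across the same conditions}
    """
    links = []
    for tf, activity in tfa.items():
        for gene, profile in expr.items():
            r = pearson(activity, profile)
            if abs(r) >= threshold:
                links.append((tf, gene, round(r, 3)))
    return links

# Hypothetical profiles over four conditions.
tfa = {"tfA": [0.1, 0.5, 0.9, 0.4]}
expr = {
    "gene1": [1.0, 2.0, 3.0, 1.8],  # tracks tfA activity closely
    "gene2": [2.0, 2.1, 1.9, 2.0],  # nearly flat, weakly related
}
print(relevance_network(tfa, expr, threshold=0.8))
```

The threshold controls the trade-off the abstract points to: a lower cut-off recovers more candidate links at the cost of more false positives.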
PMCID: PMC3224099  PMID: 21668997
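The relevance-score idea in the abstract above can be illustrated with a minimal sketch. It assumes Pearson correlation as the relevance score and a fixed cutoff; the abstract does not say which score functions or TFA prediction algorithms GTRNetwork combines, so the function names, the threshold, and the toy data are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no relevance signal
    return cov / math.sqrt(vx * vy)

def relevance_links(tfa, expression, threshold=0.9):
    """Return (tf, gene, score) triples whose |score| meets the threshold.

    Assumption: the relevance score is the correlation between an inferred
    TF activity profile and a gene expression profile.
    """
    links = []
    for tf, activity in tfa.items():
        for gene, profile in expression.items():
            score = pearson(activity, profile)
            if abs(score) >= threshold:
                links.append((tf, gene, round(score, 3)))
    return links

# Hypothetical toy data: one TF activity profile and three genes, five conditions.
tfa = {"tfA": [0.1, 0.5, 0.9, 1.3, 1.7]}
expression = {
    "gene1": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises with tfA activity
    "gene2": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls with tfA activity
    "gene3": [2.0, 5.0, 1.0, 4.0, 3.0],  # no clear relation
}
links = relevance_links(tfa, expression)
for tf, gene, score in links:
    print(tf, gene, score)
```

Scoring against an activity profile rather than the TF's own mRNA level is the point of the hidden TFA layer: a TF can change activity without changing expression, so the two profiles can give different links.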
17.  Comprehensive Research Synopsis and Systematic Meta-Analyses in Parkinson's Disease Genetics: The PDGene Database 
PLoS Genetics  2012;8(3):e1002548.
More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ∼27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P<5×10⁻⁸) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3×10⁻⁸). All meta-analysis results are freely available on a dedicated online database, which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies.
Author Summary
The genetic basis of Parkinson's disease is complex, i.e. it is determined by a number of different disease-causing and disease-predisposing genes. Especially the latter have proven difficult to find, evidenced by more than 800 published genetic association studies, typically showing discrepant results. To facilitate the interpretation of this large and continuously increasing body of data, we have created a freely available online database (“PDGene”), which provides an exhaustive account of all published genetic association studies in PD. One particularly useful feature is the calculation and display of up-to-date summary statistics of published data for overlapping DNA sequence variants (polymorphisms). These meta-analyses revealed eleven gene loci that showed a statistically very significant (P<5×10⁻⁸; a.k.a. genome-wide significance) association with risk for PD: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, SYT11/RAB25. In addition and purely by data-mining, we identified one novel PD susceptibility locus in a gene called ITGA8 (rs7077361, P = 1.3×10⁻⁸). We note that our continuously updated database represents the most comprehensive research synopsis of genetic association studies in PD to date. In addition to vastly facilitating the work of other PD geneticists, our approach may serve as a valuable example for other complex diseases.
PMCID: PMC3305333  PMID: 22438815
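As an illustration of how per-study results are combined in a meta-analysis like the one summarized above, the following sketch pools odds ratios with fixed-effect inverse-variance weighting. This is an assumption made for simplicity: the abstract does not state which meta-analysis model PDGene uses, and the three study results below are invented for the example.

```python
import math

def pool_fixed_effect(studies, z=1.96):
    """Inverse-variance pooling of odds ratios on the log scale.

    studies: list of (odds_ratio, ci_low, ci_high) with 95% intervals.
    Each standard error is recovered from the width of the log interval.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Hypothetical per-study odds ratios for one polymorphism (not PDGene data).
studies = [
    (0.85, 0.70, 1.03),
    (0.90, 0.80, 1.01),
    (0.88, 0.75, 1.03),
]
pooled, low, high = pool_fixed_effect(studies)
print(f"pooled OR {pooled:.2f} (95% CI {low:.2f}-{high:.2f})")
```

In this toy input every individual interval includes 1, yet the pooled interval is narrower than any of them. That is how a meta-analysis can reach significance for a locus that no single study could establish.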
18.  An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods 
In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization.
Materials and methods
We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions.
The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation.
Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network.
PMCID: PMC4070077  PMID: 24726035
Gene disease prioritization; Network integration; Heterogeneous data fusion; MeSH descriptors; Node label ranking
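One of the algorithms named in the abstract above, random walk with restart, can be sketched on a toy network. The graph, the seed gene, and the restart probability are hypothetical, and the sketch uses a single unweighted network rather than the integrated networks or kernelized score functions evaluated in the paper.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adjacency: dict mapping each node to a list of its neighbours (undirected).
    seeds: set of genes already annotated to the disease.
    Assumption: every node has at least one neighbour, so probability mass
    is conserved at each step.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path-shaped network: g1 - g2 - g3 - g4, with g1 a known disease gene.
adjacency = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3"],
}
seeds = {"g1"}
scores = random_walk_with_restart(adjacency, seeds)
# Rank the unannotated genes by their steady-state score.
ranking = sorted((g for g in scores if g not in seeds), key=lambda g: -scores[g])
print(ranking)
```

Classical guilt-by-association would score only direct neighbours of the seed. The restart walk also assigns non-zero scores to genes further away, which is the "global" use of network topology that the abstract refers to.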
19.  Beta-glucosidase 1 (GBA1) is a second bile acid β-glucosidase in addition to β-glucosidase 2 (GBA2). Study in β-glucosidase deficient mice and humans 
Beta-glucosidase 1 (GBA1; lysosomal glucocerebrosidase) and β-glucosidase 2 (GBA2, non-lysosomal glucocerebrosidase) both have glucosylceramide as a main natural substrate. The enzyme-deficient conditions with glucosylceramide accumulation are Gaucher disease (GBA1–/– in humans), modelled by the Gba1–/– mouse, and the syndrome with male infertility in the Gba2–/– mouse, respectively. Before the leading role of glucosylceramide was recognised for both deficient conditions, bile acid-3-O-β-glucoside (BG), another natural substrate, was viewed as the main substrate of GBA2. Given that GBA2 hydrolyses both BG and glucosylceramide, it was asked whether, vice versa, GBA1 hydrolyses both glucosylceramide and BG. Here we show that GBA1 also hydrolyses BG. We compared the residual BG hydrolysing activities in the GBA1–/–, Gba1–/– conditions (where GBA2 is almost the only active β-glucosidase) and those in the Gba2–/– condition (GBA1 active) with wild-type activities, and we also used the GBA1 inhibitor isofagomine. GBA1 and GBA2 activities had characteristic differences between the studied fibroblast, liver and brain samples. Independently, the hydrolysis of BG by pure recombinant GBA1 was shown. The fact that both GBA1 and GBA2 are glucocerebrosidases as well as bile acid β-glucosidases raises the question of why lysosomal accumulation of glucosylceramide in GBA1 deficiency, and extra-lysosomal accumulation in GBA2 deficiency, are not associated with an accumulation of BG in either condition.
PMCID: PMC3529407  PMID: 22659419
β-Glucosidase 1 (GBA1); β-Glucosidase 2 (GBA2); Bile acid β-glucosidases; Glucosylceramide lipidosis; β-Glucosidase null mice; Isofagomine
20.  Guilt-By-Association Feature Selection Applied to Simulated Proteomic Data 
We propose a new feature selection algorithm, Guilt-By-Association (GBA), which uses hierarchical clustering based on feature correlations to eliminate redundant features. GBA can be used in conjunction with other algorithms to produce a feature selection routine that explicitly considers both the similarities between features and their individual discriminatory powers. In this preliminary study, a simple form of GBA was investigated on simulated proteomic data.
PMCID: PMC1560499  PMID: 16779401
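The feature selection idea described above can be sketched in a few lines. The abstract does not specify the linkage rule, the correlation cutoff, or the measure of discriminatory power, so this sketch assumes single-linkage clustering cut at |r| >= 0.9 (equivalent to taking connected components of the thresholded correlation graph) and uses the absolute difference in class means as a stand-in score. The feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)

def correlation_clusters(features, min_corr=0.9):
    """Single-linkage clusters: two features merge when |r| >= min_corr."""
    names = sorted(features)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) >= min_corr:
                parent[find(a)] = find(b)
    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(clusters.values())

def class_separation(values, labels):
    """Assumed discriminatory score: absolute difference of the class means."""
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

def select_features(features, labels, min_corr=0.9):
    """Keep the most discriminatory feature from each correlation cluster."""
    return [
        max(cluster, key=lambda n: class_separation(features[n], labels))
        for cluster in correlation_clusters(features, min_corr)
    ]

# Hypothetical simulated intensities for six samples (three per class).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "peak_a": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # strong class difference
    "peak_b": [1.1, 1.3, 1.0, 2.0, 2.1, 1.9],  # tracks peak_a, weaker difference
    "peak_c": [2.0, 1.0, 2.0, 1.0, 2.0, 1.0],  # unrelated to either
}
selected = select_features(features, labels)
print(selected)
```

Here peak_b is dropped as redundant with peak_a, while peak_c survives because it is not correlated with anything, even though it barely separates the classes. Removing such uninformative features is left to whichever selector GBA is paired with, in line with the abstract's note that it is used in conjunction with other algorithms.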
21.  A Multicenter Study of Glucocerebrosidase Mutations in Dementia With Lewy Bodies 
JAMA neurology  2013;70(6):10.1001/jamaneurol.2013.1925.
While mutations in glucocerebrosidase (GBA1) are associated with an increased risk for Parkinson disease (PD), it is important to establish whether such mutations are also a common risk factor for other Lewy body disorders.
To establish whether GBA1 mutations are a risk factor for dementia with Lewy bodies (DLB).
We compared genotype data on patients and controls from 11 centers. Data concerning demographics, age at onset, disease duration, and clinical and pathological features were collected when available. We conducted pooled analyses using logistic regression to investigate GBA1 mutation carrier status as predicting DLB or PD with dementia status, using common control subjects as a reference group. Random-effects meta-analyses were conducted to account for additional heterogeneity.
Eleven centers from sites around the world performing genotyping.
Seven hundred twenty-one cases met diagnostic criteria for DLB and 151 had PD with dementia. We compared these cases with 1962 controls from the same centers matched for age, sex, and ethnicity.
Main Outcome Measures
Frequency of GBA1 mutations in cases and controls.
We found a significant association between GBA1 mutation carrier status and DLB, with an odds ratio of 8.28 (95% CI, 4.78–14.88). The odds ratio for PD with dementia was 6.48 (95% CI, 2.53–15.37). The mean age at diagnosis of DLB was earlier in GBA1 mutation carriers than in noncarriers (63.5 vs 68.9 years; P<.001), with higher disease severity scores.
Conclusions and Relevance
Mutations in GBA1 are a significant risk factor for DLB. GBA1 mutations likely play an even larger role in the genetic etiology of DLB than in PD, providing insight into the role of glucocerebrosidase in Lewy body disease.
PMCID: PMC3841974  PMID: 23588557
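For readers unfamiliar with how an odds ratio and its confidence interval are obtained from carrier counts, here is a minimal sketch using the unadjusted 2×2 calculation with a Woolf (log-scale) interval. This is not the study's method: the authors report pooled logistic regression and random-effects meta-analyses, and the counts below are hypothetical, chosen only to make the arithmetic visible.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Unadjusted odds ratio with a Woolf confidence interval.

    Returns (odds_ratio, ci_low, ci_high). All four counts must be non-zero.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts (not taken from the study):
# 40 mutation carriers among 500 cases, 10 carriers among 1000 controls.
or_value, ci_low, ci_high = odds_ratio_ci(40, 460, 10, 990)
print(f"OR {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval whose lower bound lies above 1 indicates that carriers are over-represented among cases beyond what sampling error would explain at the chosen confidence level.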
22.  Computational reconstruction of tissue-specific metabolic models: application to human liver metabolism 
The first computational approach for the rapid generation of genome-scale tissue-specific models from a generic species model. A genome scale model of human liver metabolism, which is comprehensively tested and validated using cross-validation and the ability to carry out complex hepatic metabolic functions. The model's flux predictions are shown to correlate with flux measurements across a variety of hormonal and dietary conditions, and are successfully used to predict biomarker changes in genetic metabolic disorders, both with higher accuracy than the generic human model.
The study of normal human metabolism and its alterations is central to the understanding and treatment of a variety of human diseases, including diabetes, metabolic syndrome, neurodegenerative disorders, and cancer. A promising systems biology approach for studying human metabolism is through the development and analysis of large-scale stoichiometric network models of human metabolism. The reconstruction of these network models has followed two main paths: the first is the reconstruction of generic (non-tissue specific) models, characterizing the complete metabolic potential of human cells, based mostly on genomic data to trace enzyme-coding genes (Duarte et al, 2007; Ma et al, 2007), and the second is the reconstruction of cell type- and tissue-specific models (Wiback and Palsson, 2002; Chatziioannou et al, 2003; Vo et al, 2004), based on a similar methodology to that described above, with the extra complexity of manual curation of literature evidence for the cell/system specificity of metabolic enzymes and pathways.
On this background, we present in this study, to the best of our knowledge, the first computational approach for a rapid generation of genome-scale tissue-specific models. The method relies on integrating the previously reconstructed generic human models with a variety of high-throughput molecular ‘omics' data, including transcriptomic, proteomic, metabolomic, and phenotypic data, as well as literature-based knowledge, characterizing the tissue in hand (Figure 1). Hence, it can be readily used to quite rapidly build and use a large array of human tissue-specific models. The resulting model satisfies stoichiometric, mass-balance, and thermodynamic constraints. It serves as a functional metabolic network that can then be used to explore the metabolic state of a tissue under various genetic and physiological conditions, simulating enzymatic inhibition or drug applications through standard constraint-based modeling methods, without requiring additional context-specific molecular data.
We applied this approach to build a genome scale model of liver metabolism, which is then comprehensively tested and validated. The model is shown to be able to simulate complex hepatic metabolic functions, as well as to depict the pathological alterations caused by urea cycle deficiencies. The liver model was applied to predict measured intra-cellular metabolic fluxes given measured metabolite uptake and secretion rates at different hepatic metabolic conditions. The predictions were tested using a comprehensive set of flux measurements performed by Chan et al (2003), showing that the liver model obtained more accurate predictions compared to those obtained by the original, generic human model (an overall prediction accuracy of 0.67 versus 0.46). Furthermore, it was applied to identify metabolic biomarkers for liver in-born errors of metabolism, once again displaying superiority over the predictions generated by the generic human model (accuracy of 0.67 versus 0.59).
From a biotechnological standpoint, the liver model generated here can serve as a basis for future studies aiming to optimize the functioning of bio artificial liver devices. The application of the method to rapidly construct metabolic models of other human tissues can obviously lead to many other important clinical insights, e.g., concerning means for metabolic salvage of ischemic heart and brain tissues. Last but not least, the application of the new method is not limited to the realm of human modeling; it can be used to generate tissue models for any multi-tissue organism for which a generic model exists, such as the Mus musculus (Quek and Nielsen, 2008; Sheikh et al, 2005) and the model plant Arabidopsis thaliana (Poolman et al, 2009).
The computational study of human metabolism has been advanced with the advent of the first generic (non-tissue specific) stoichiometric model of human metabolism. In this study, we present a new algorithm for rapid reconstruction of tissue-specific genome-scale models of human metabolism. The algorithm generates a tissue-specific model from the generic human model by integrating a variety of tissue-specific molecular data sources, including literature-based knowledge, transcriptomic, proteomic, metabolomic and phenotypic data. Applying the algorithm, we constructed the first genome-scale stoichiometric model of hepatic metabolism. The model is verified using standard cross-validation procedures, and through its ability to carry out hepatic metabolic functions. The model's flux predictions correlate with flux measurements across a variety of hormonal and dietary conditions, and improve upon the predictive performance obtained using the original, generic human model (prediction accuracy of 0.67 versus 0.46). Finally, the model better predicts biomarker changes in genetic metabolic disorders than the generic human model (accuracy of 0.67 versus 0.59). The approach presented can be used to construct other human tissue-specific models, and be applied to other organisms.
PMCID: PMC2964116  PMID: 20823844
constraint based; hepatic; liver; metabolism
23.  Gaucher Disease: Transcriptome Analyses Using Microarray or mRNA Sequencing in a Gba1 Mutant Mouse Model Treated with Velaglucerase alfa or Imiglucerase 
PLoS ONE  2013;8(10):e74912.
Gaucher disease type 1, an inherited lysosomal storage disorder, is caused by mutations in GBA1 leading to defective glucocerebrosidase (GCase) function and consequent excess accumulation of glucosylceramide/glucosylsphingosine in visceral organs. Enzyme replacement therapy (ERT) with the biosimilars, imiglucerase (imig) or velaglucerase alfa (vela) improves/reverses the visceral disease. Comparative transcriptomic effects (microarray and mRNA-Seq) of no ERT and ERT (imig or vela) were done with liver, lung, and spleen from mice having Gba1 mutant alleles, termed D409V/null. Disease-related molecular effects, dynamic ranges, and sensitivities were compared between mRNA-Seq and microarrays and their respective analytic tools, i.e. Mixed Model ANOVA (microarray), and DESeq and edgeR (mRNA-Seq). While similar gene expression patterns were observed with both platforms, mRNA-Seq identified more differentially expressed genes (DEGs) (∼3-fold) than the microarrays. Among the three analytic tools, DESeq identified the maximum number of DEGs for all tissues and treatments. DESeq and edgeR comparisons revealed differences in DEGs identified. In 9V/null liver, spleen and lung, post-therapy transcriptomes approximated WT, were partially reverted, and had little change, respectively, and were concordant with the corresponding histological and biochemical findings. DEG overlaps were only 8–20% between mRNA-Seq and microarray, but the biological pathways were similar. Cell growth and proliferation, cell cycle, heme metabolism, and mitochondrial dysfunction were most altered with the Gaucher disease process. Imig and vela differentially affected specific disease pathways. Differential molecular responses were observed in direct transcriptome comparisons from imig- and vela-treated tissues. These results provide cross-validation for the mRNA-Seq and microarray platforms, and show differences between the molecular effects of two highly structurally similar ERT biopharmaceuticals.
PMCID: PMC3790783  PMID: 24124461
24.  The neurobiology of glucocerebrosidase-associated parkinsonism: a positron emission tomography study of dopamine synthesis and regional cerebral blood flow 
Brain  2012;135(8):2440-2448.
Mutations in GBA, the gene encoding glucocerebrosidase, the enzyme deficient in Gaucher disease, are common risk factors for Parkinson disease, as patients with Parkinson disease are over five times more likely to carry GBA mutations than healthy controls. Patients with GBA mutations generally have an earlier onset of Parkinson disease and more cognitive impairment than those without GBA mutations. We investigated whether GBA mutations alter the neurobiology of Parkinson disease, studying brain dopamine synthesis and resting regional cerebral blood flow in 107 subjects (38 women, 69 men). We measured dopamine synthesis with ¹⁸F-fluorodopa positron emission tomography, and resting regional cerebral blood flow with H₂¹⁵O positron emission tomography in the wakeful, resting state in four study groups: (i) patients with Parkinson disease and Gaucher disease (n = 7, average age = 56.6 ± 9.2 years); (ii) patients with Parkinson disease without GBA mutations (n = 11, 62.1 ± 7.1 years); (iii) patients with Gaucher disease without parkinsonism, but with a family history of Parkinson disease (n = 14, 52.6 ± 12.4 years); and (iv) healthy GBA-mutation carriers with a family history of Parkinson disease (n = 7, 50.1 ± 18 years). We compared each study group with a matched control group. Data were analysed with region of interest and voxel-based methods. Disease duration and Parkinson disease functional and staging scores were similar in the two groups with parkinsonism, as was striatal dopamine synthesis: both had greatest loss in the caudal striatum (putamen Ki loss: 44 and 42%, respectively), with less reduction in the caudate (20 and 18% loss). However, the group with both Parkinson and Gaucher diseases showed decreased resting regional cerebral blood flow in the lateral parieto-occipital association cortex and precuneus bilaterally. Furthermore, two subjects with Gaucher disease without parkinsonian manifestations showed diminished striatal dopamine.
In conclusion, the pattern of dopamine loss in patients with both Parkinson and Gaucher disease was similar to sporadic Parkinson disease, indicating comparable damage in midbrain neurons. However, H₂¹⁵O positron emission tomography studies indicated that these subjects have decreased resting activity in a pattern characteristic of diffuse Lewy body disease. These findings provide insight into the pathophysiology of GBA-associated parkinsonism.
PMCID: PMC3407426  PMID: 22843412
brain imaging; genetic risk; positron emission tomography (PET); Parkinson disease; lysosomal storage disorders
25.  Isotype antibody response in cows to Streptococcus agalactiae group B polysaccharide-ovalbumin conjugate. 
Journal of Clinical Microbiology  1992;30(7):1856-1862.
Adult dairy cows were immunized with group B antigen (GBA) of Streptococcus agalactiae or GBA coupled to ovalbumin, both emulsified in incomplete Freund adjuvant, and their sera were examined by an enzyme-linked immunosorbent assay measuring bovine immunoglobulin isotypes (immunoglobulin G1 [IgG1], IgG2, and IgM) specific for GBA. All of the cows possessed naturally acquired antibodies against GBA, which implied that primary antibody responses could not be studied. At the highest dose tested (200 micrograms), free GBA elicited a slight increase in antibody titers only in the IgM isotype, to which most of the naturally acquired antibodies to GBA belonged. A second administration of antigen was not more effective. The conjugate was able to induce a strong humoral response against GBA, particularly in the IgG1 and IgG2 subisotypes, and a second injection of the conjugate induced a doubling of the peak antibody titers. Therefore, conjugation of GBA to a protein carrier markedly improved the antibody response, which showed the main characteristics of T-cell dependency. The opsonic activity of serum against an unencapsulated strain of S. agalactiae was reinforced by the immunization with the conjugate.
PMCID: PMC265393  PMID: 1629343
