Related Articles

1.  Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks 
BMC Bioinformatics  2005;6:227.
Background
Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain.
Results
We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and, conversely, how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO)? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use receiver operating characteristic (ROC) curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories.
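The per-category evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' probabilistic score: as a simplifying assumption, each gene is scored by its mean coexpression with the members of one category (excluding itself), and the area under the ROC curve is computed as the probability that a member outscores a non-member. The gene names, coexpression values, and function names are all hypothetical.

```python
from itertools import product

def category_scores(coexpr, genes, members):
    """Score each gene by its mean coexpression with category members, excluding itself."""
    scores = {}
    for g in genes:
        others = [m for m in members if m != g]
        scores[g] = sum(coexpr[frozenset((g, m))] for m in others) / len(others)
    return scores

def auroc(scores, members):
    """Probability that a random member outscores a random non-member (ties count 0.5)."""
    pos = [s for g, s in scores.items() if g in members]
    neg = [s for g, s in scores.items() if g not in members]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Toy data (assumed): g1-g3 belong to one hypothetical GO category.
genes = ["g1", "g2", "g3", "g4", "g5"]
members = {"g1", "g2", "g3"}
coexpr = {
    frozenset(("g1", "g2")): 0.9, frozenset(("g1", "g3")): 0.8,
    frozenset(("g2", "g3")): 0.7, frozenset(("g1", "g4")): 0.2,
    frozenset(("g1", "g5")): 0.1, frozenset(("g2", "g4")): 0.3,
    frozenset(("g2", "g5")): 0.1, frozenset(("g3", "g4")): 0.2,
    frozenset(("g3", "g5")): 0.0, frozenset(("g4", "g5")): 0.6,
}

scores = category_scores(coexpr, genes, members)
auc = auroc(scores, members)
print(auc)  # 1.0 for this toy network: every member outscores every non-member
```

An AUC near 0.5 for a category would indicate no guilt-by-association signal under this scoring, while values near 1.0 indicate that category members are strongly coexpressed with one another.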
Conclusion
We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing whether pairs of functions operate together.
doi:10.1186/1471-2105-6-227
PMCID: PMC1239911  PMID: 16162296
2.  Functional and genetic characterization of the non-lysosomal glucosylceramidase 2 as a modifier for Gaucher disease 
Background
Gaucher disease (GD) is the most common inherited lysosomal storage disorder in humans, caused by mutations in the gene encoding the lysosomal enzyme glucocerebrosidase (GBA1). GD is clinically heterogeneous and although the type of GBA1 mutation plays a role in determining the type of GD, it does not explain the clinical variability seen among patients. Cumulative evidence from recent studies suggests that GBA2 could play a role in the pathogenesis of GD and potentially interacts with GBA1.
Methods
We used a framework of functional and genetic approaches in order to further characterize a potential role of GBA2 in GD. Glucosylceramide (GlcCer) levels in spleen, liver and brain of GBA2-deficient mice and mRNA and protein expression of GBA2 in GBA1-deficient murine fibroblasts were analyzed. Furthermore, we crossed GBA2-deficient mice with conditional Gba1 knockout mice in order to quantify the interaction between GBA1 and GBA2. Finally, a genetic approach was used to test whether genetic variation in GBA2 is associated with GD and/or acts as a modifier in Gaucher patients. We tested 22 SNPs in the GBA2 and GBA1 genes in 98 type 1 and 60 type 2/3 Gaucher patients for single- and multi-marker association with GD.
Results
We found a significant accumulation of GlcCer compared to wild-type controls in all three organs studied. In addition, a significant increase of Gba2-protein and Gba2-mRNA levels in GBA1-deficient murine fibroblasts was observed. GlcCer levels in the spleen from Gba1/Gba2 knockout mice were much higher than the sum of the single knockouts, indicating a cross-talk between the two glucosylceramidases and suggesting a partial compensation of the loss of one enzyme by the other. In the genetic approach, no significant association with severity of GD was found for SNPs at the GBA2 locus. However, in the multi-marker analyses a significant result was detected for p.L444P (GBA1) and rs4878628 (GBA2), using a model that does not take marginal effects into account.
Conclusions
Altogether, our observations make GBA2 a likely candidate to be involved in GD etiology. Furthermore, they point to GBA2 as a plausible modifier for GBA1 in patients with GD.
doi:10.1186/1750-1172-8-151
PMCID: PMC3850879  PMID: 24070122
3.  Progress and challenges in the computational prediction of gene function using networks: 2012-2013 update 
F1000Research  2013;2:230.
In an opinion published in 2012, we reviewed and discussed our studies of how gene network-based guilt-by-association (GBA) is impacted by confounds related to gene multifunctionality. We found such confounds account for a significant part of the GBA signal, and as a result meaningfully evaluating and applying computationally-guided GBA is more challenging than generally appreciated. We proposed that effort currently spent on incrementally improving algorithms would be better spent in identifying the features of data that do yield novel functional insights. We also suggested that part of the problem is the reliance by computational biologists on gold standard annotations such as the Gene Ontology. In the year since, there has been continued heavy activity in GBA-based research, including work that contributes to our understanding of the issues we raised. Here we review some of the most relevant recent work, including studies that point to new areas of progress and challenges.
doi:10.12688/f1000research.2-230.v1
PMCID: PMC3962002  PMID: 24715959
4.  “Guilt by Association” Is the Exception Rather Than the Rule in Gene Networks 
PLoS Computational Biology  2012;8(3):e1002444.
Gene networks are commonly interpreted as encoding functional information in their connections. An extensively validated principle called guilt by association states that genes which are associated or interacting are more likely to share function. Guilt by association provides the central top-down principle for analyzing gene networks in functional terms or assessing their quality in encoding functional information. In this work, we show that functional information within gene networks is typically concentrated in only a very few interactions whose properties cannot be reliably related to the rest of the network. In effect, the apparent encoding of function within networks has been largely driven by outliers whose behaviour cannot even be generalized to individual genes, let alone to the network at large. While experimentalist-driven analysis of interactions may use prior expert knowledge to focus on the small fraction of critically important data, large-scale computational analyses have typically assumed that high-performance cross-validation in a network is due to a generalizable encoding of function. Because we find that gene function is not systemically encoded in networks, but dependent on specific and critical interactions, we conclude it is necessary to focus on the details of how networks encode function and what information computational analyses use to extract functional meaning. We explore a number of consequences of this and find that network structure itself provides clues as to which connections are critical and that systemic properties, such as scale-free-like behaviour, do not map onto the functional connectivity within networks.
Author Summary
The analysis of gene function and gene networks is a major theme of post-genome biomedical research. Historically, many attempts to understand gene function leverage a biological principle known as “guilt by association” (GBA). GBA states that genes with related functions tend to share properties such as genetic or physical interactions. In the past ten years, GBA has been scaled up for application to large gene networks, becoming a favored way to grapple with the complex interdependencies of gene functions in the face of floods of genomics and proteomics data. However, there is a growing realization that scaled-up GBA is not a panacea. In this study, we report a precise identification of the limits of GBA and show that it cannot provide a way to understand gene networks in a way that is simultaneously general and useful. Our findings indicate that the assumptions underlying the high-throughput use of gene networks to interpret function are fundamentally flawed, with wide-ranging implications for the interpretation of genome-wide data.
doi:10.1371/journal.pcbi.1002444
PMCID: PMC3315453  PMID: 22479173
5.  Mutations in GBA are associated with familial Parkinson disease susceptibility and age at onset 
Neurology  2009;72(4):310-316.
Objective:
To characterize sequence variation within the glucocerebrosidase (GBA) gene in a select subset of our sample of patients with familial Parkinson disease (PD) and then to test in our full sample whether these sequence variants increased the risk for PD and were associated with an earlier onset of disease.
Methods:
We performed a comprehensive study of all GBA exons in one patient with PD from each of 96 PD families, selected based on the family-specific lod scores at the GBA locus. Identified GBA variants were subsequently screened in all 1325 PD cases from 566 multiplex PD families and in 359 controls.
Results:
Nine different GBA variants, five previously reported, were identified in 21 of the 96 PD cases sequenced. Screening for these variants in the full sample identified 161 variant carriers (12.2%) in 99 different PD families. An unbiased estimate of the frequency of the five previously reported GBA variants in the familial PD sample was 12.6% and in the control sample was 5.3% (odds ratio 2.6; 95% confidence interval 1.5–4.4). Presence of a GBA variant was associated with an earlier age at onset (p = 0.0001). On average, those patients carrying a GBA variant had onset with PD 6.04 years earlier than those without a GBA variant.
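The reported odds ratio can be checked roughly from the two carrier frequencies quoted above. The Python sketch below is only a back-of-the-envelope illustration: it uses the rounded percentages from the abstract rather than the underlying counts, so it cannot reproduce the confidence interval, and the function name is a hypothetical helper.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the carrier frequency in cases and in controls."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls

# Frequencies quoted in the abstract: 12.6% in familial PD, 5.3% in controls.
or_estimate = odds_ratio(0.126, 0.053)
print(round(or_estimate, 1))  # 2.6, matching the reported odds ratio
```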
Conclusions:
This study suggests that GBA is a susceptibility gene for familial Parkinson disease (PD) and patients with GBA variants have an earlier age at onset than patients with PD without GBA variants.
GLOSSARY
CI = confidence interval;
GD = Gaucher disease;
GDS = Geriatric Depression Scale;
MMSE = Mini-Mental State Examination;
NCRAD = National Cell Repository for Alzheimer’s Disease;
NPL = nonparametric lod;
OR = odds ratio;
PD = Parkinson disease;
UPDRS = Unified Parkinson’s Disease Rating Scale.
doi:10.1212/01.wnl.0000327823.81237.d1
PMCID: PMC2677501  PMID: 18987351
6.  Assessing functional annotation transfers with inter-species conserved coexpression: application to Plasmodium falciparum 
BMC Genomics  2010;11:35.
Background
Plasmodium falciparum is the main causative agent of malaria. Of the 5 484 predicted genes of P. falciparum, about 57% do not have sufficient sequence similarity to characterized genes in other species to warrant functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. Gene expression data have been widely used in recent years to help functional annotation in an intra-species way via the so-called Guilt By Association (GBA) principle.
Results
We propose a new method that uses gene expression data to assess inter-species annotation transfers. Our approach starts from a set of likely orthologs between a reference species (here S. cerevisiae and D. melanogaster) and a query species (P. falciparum). It aims at identifying clusters of coexpressed genes in the query species whose coexpression has been conserved in the reference species. These conserved clusters of coexpressed genes are then used to assess annotation transfers between genes with low sequence similarity, enabling reliable transfers of annotations from the reference to the query species. The approach was used with transcriptomic data sets of P. falciparum, S. cerevisiae and D. melanogaster, and enabled us to propose with high confidence new/refined annotations for several dozen hypothetical/putative P. falciparum genes. Notably, we revised the annotation of genes involved in ribosomal proteins and ribosome biogenesis and assembly, thus highlighting several potential drug targets.
Conclusions
Our approach uses both sequence similarity and gene expression data to help inter-species gene annotation transfers. Experiments show that this strategy improves the accuracy achieved when using solely sequence similarity and outperforms the accuracy of the GBA approach. In addition, our experiments with P. falciparum show that it can infer a function for numerous hypothetical genes.
doi:10.1186/1471-2164-11-35
PMCID: PMC2826313  PMID: 20078859
7.  GBA server: EST-based digital gene expression profiling 
Nucleic Acids Research  2005;33(Web Server issue):W673-W676.
Expressed Sequence Tag-based gene expression profiling can be used to discover functionally associated genes on a large scale. Currently available web servers and tools focus on finding differentially expressed genes in different samples or tissues rather than finding co-expressed genes. To fill this gap, we have developed a web server that implements the GBA (Guilt-by-Association) co-expression algorithm, which has been successfully used in finding disease-related genes. We have also annotated UniGene clusters with links to several important databases such as GO, KEGG, OMIM, Gene, IPI and HomoloGene. The GBA server can be accessed and downloaded at .
doi:10.1093/nar/gki480
PMCID: PMC1160240  PMID: 15980560
8.  The link between the GBA gene and parkinsonism 
The Lancet. Neurology  2012;11(11):986-998.
Mutations in the glucocerebrosidase (GBA) gene, which encodes the lysosomal enzyme that is deficient in Gaucher's disease, are important and common risk factors for Parkinson’s disease and related disorders. This association was first recognised in the clinic, where parkinsonism was noted, albeit rarely, in patients with Gaucher's disease and more frequently in relatives who were obligate carriers. Subsequently, findings from large studies showed that patients with Parkinson’s disease and associated Lewy body disorders had an increased frequency of GBA mutations when compared with control individuals. Patients with GBA-associated parkinsonism exhibit varying parkinsonian phenotypes but tend to have an earlier age of onset and more associated cognitive changes than patients with parkinsonism without GBA mutations. Hypotheses proposed to explain this association include a gain-of-function due to mutations in glucocerebrosidase that promotes α-synuclein aggregation; substrate accumulation due to enzymatic loss-of-function, which affects α-synuclein processing and clearance; and a bidirectional feedback loop. Identification of the pathological mechanisms underlying GBA-associated parkinsonism will improve our understanding of the genetics, pathophysiology, and treatment for both rare and common neurological diseases.
doi:10.1016/S1474-4422(12)70190-4
PMCID: PMC4141416  PMID: 23079555
9.  Visual short-term memory deficits associated with GBA mutation and Parkinson’s disease 
Brain  2014;137(8):2303-2311.
Individuals with mutation in the lysosomal enzyme glucocerebrosidase (GBA) gene are at significantly high risk of developing Parkinson’s disease with cognitive deficit. We examined whether visual short-term memory impairments, long associated with patients with Parkinson’s disease, are also present in GBA-positive individuals—both with and without Parkinson’s disease. Precision of visual working memory was measured using a serial order task in which participants observed four bars, each of a different colour and orientation, presented sequentially at screen centre. Afterwards, they were asked to adjust a coloured probe bar’s orientation to match the orientation of the bar of the same colour in the sequence. An additional attentional ‘filtering’ condition tested patients’ ability to selectively encode one of the four bars while ignoring the others. A sensorimotor task using the same stimuli controlled for perceptual and motor factors. There was a significant deficit in memory precision in GBA-positive individuals—with or without Parkinson’s disease—as well as GBA-negative patients with Parkinson’s disease, compared to healthy controls. Worst recall was observed in GBA-positive cases with Parkinson’s disease. Although all groups were impaired in visual short-term memory, there was a double dissociation between sources of error associated with GBA mutation and Parkinson’s disease. The deficit observed in GBA-positive individuals, regardless of whether they had Parkinson’s disease, was explained by a systematic increase in interference from features of other items in memory: misbinding errors. In contrast, impairments in patients with Parkinson’s disease, regardless of GBA status, were explained by increased random responses. Individuals who were GBA-positive and also had Parkinson’s disease suffered from both types of error, demonstrating the worst performance.
These findings provide evidence for dissociable signature deficits within the domain of visual short-term memory associated with GBA mutation and with Parkinson’s disease. Identification of the specific pattern of cognitive impairment in GBA mutation versus Parkinson’s disease is potentially important as it might help to identify individuals at risk of developing Parkinson’s disease.
doi:10.1093/brain/awu143
PMCID: PMC4107740  PMID: 24919969
visual short-term memory; working memory; Gaucher’s disease; glucocerebrosidase
10.  Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities 
BMC Bioinformatics  2011;12:233.
Background
Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element.
Results
This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network.
Conclusions
The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork on E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.
doi:10.1186/1471-2105-12-233
PMCID: PMC3224099  PMID: 21668997
11.  Glucocerebrosidase mutations in primary parkinsonism 
Parkinsonism & Related Disorders  2014;20(11):1215-1220.
Introduction
Mutations in the lysosomal glucocerebrosidase (GBA) gene increase the risk of Parkinson's Disease (PD). We determined the frequency and relative risk of major GBA mutations in a large series of Italian patients with primary parkinsonism.
Methods
We studied 2766 unrelated consecutive patients with clinical diagnosis of primary degenerative parkinsonism (including 2350 PD), and 1111 controls. The entire cohort was screened for mutations in GBA exons 9 and 10, covering approximately 70% of mutations, including the two most frequent defects, p.N370S and p.L444P.
Results
Four known mutations were identified in heterozygous state: 3 missense mutations (p.N370S, p.L444P, and p.D443N), and the splicing mutation IVS10+1G>T, which results in the in-frame exon-10 skipping. Molecular characterization of 2 additional rare variants, potentially interfering with splicing, suggested a neutral effect. GBA mutations were more frequent in PD (4.5%, RR = 7.2, CI = 3.3–15.3) and in Dementia with Lewy Bodies (DLB) (13.8%, RR = 21.9, CI = 6.8–70.7) than in controls (0.63%), but not in the other forms of parkinsonism such as Progressive Supranuclear Palsy (PSP, 2%), and Corticobasal Degeneration (CBD, 0%). Considering only the PD group, GBA-carriers were younger at onset (52 ± 10 vs. 57 ± 10 years, P < 0.0001) and were more likely to have a positive family history of PD (34% vs. 20%, P < 0.001).
Conclusion
GBA dysfunction is relevant for synucleinopathies, such as PD and DLB, except for MSA, in which pathology involves oligodendrocytes, and the tauopathies PSP and CBD. The risk of developing DLB is three-fold higher than PD, suggesting a more aggressive phenotype.
Highlights
• We screened a large case–control cohort with parkinsonism for common GBA mutations.
• GBA mutations in the Italian population are a risk factor for Lewy Bodies Diseases (PD and DLB).
• GBA mutations were not increased in the other forms of parkinsonism: PSP, CBD and MSA.
• GBA dysfunction does not seem to be involved in MSA and tauopathies.
doi:10.1016/j.parkreldis.2014.09.003
PMCID: PMC4228056  PMID: 25249066
Parkinson's disease; GBA; Parkinsonism; Association analysis; Splicing mutation; Functional characterization
12.  Network Assessor: an automated method for quantitative assessment of a network's potential for gene function prediction 
Frontiers in Genetics  2014;5:123.
Significant effort has been invested in network-based gene function prediction algorithms based on the guilt by association (GBA) principle. Existing approaches for assessing prediction performance typically compute evaluation metrics, either averaged across all functions being considered, or strictly from properties of the network. Since the success of GBA algorithms depends on the specific function being predicted, evaluation metrics should instead be computed for each function. We describe a novel method for computing the usefulness of a network by measuring its impact on gene function cross validation prediction performance across all gene functions. We have implemented this in software called Network Assessor, and describe its use in the GeneMANIA (GM) quality control system. Network Assessor is part of the GM command line tools.
doi:10.3389/fgene.2014.00123
PMCID: PMC4032932  PMID: 24904632
network inference; function prediction; cross validation; network biology; machine learning
13.  An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods 
Objective
In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization.
Materials and methods
We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions.
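One of the prioritization algorithms named above, random walk with restart, can be sketched in a few lines of Python. This is a generic textbook formulation on a tiny unweighted graph, not the implementation used in the study; the node names, the restart probability of 0.3, and the function name are assumptions, and the sketch assumes every node has at least one neighbour.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until the L1 change is below tol."""
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node passes its walking mass equally to its neighbours.
            share = (1 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Toy functional network (assumed): a simple path A - B - C - D - E.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
seeds = {"A"}  # hypothetical gene already annotated to the disease

scores = random_walk_with_restart(adj, seeds)
ranking = sorted((n for n in scores if n not in seeds), key=scores.get, reverse=True)
print(ranking)  # ['B', 'C', 'D', 'E']: candidates closer to the seed rank higher
```

Weighted or integrated networks would replace the equal split among neighbours with a split proportional to edge weights.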
Results
The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation.
Conclusions
Network integration is necessary to boost the performance of gene prioritization methods. Moreover, the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network.
doi:10.1016/j.artmed.2014.03.003
PMCID: PMC4070077  PMID: 24726035
Gene disease prioritization; Network integration; Heterogeneous data fusion; MeSH descriptors; Node label ranking
14.  Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites 
PLoS Computational Biology  2007;3(5):e90.
Computational prediction of nucleotide binding specificity for transcription factors remains a fundamental and largely unsolved problem. Determination of binding positions is a prerequisite for research in gene regulation, a major mechanism controlling phenotypic diversity. Furthermore, an accurate determination of binding specificities from high-throughput data sources is necessary to realize the full potential of systems biology. Unfortunately, a recently performed independent evaluation showed that more than half the predictions from most widely used algorithms are false. We introduce a graph-theoretical framework to describe local sequence similarity as the pair-wise distances between nucleotides in promoter sequences, and hypothesize that densely connected subgraphs are indicative of transcription factor binding sites. Using a well-established sampling algorithm coupled with simple clustering and scoring schemes, we identify sets of closely related nucleotides and test those for known TF binding activity. Using an independent benchmark, we find our algorithm predicts yeast binding motifs considerably better than currently available techniques and without manual curation. Importantly, we reduce the number of false positive predictions in yeast to less than 30%. We also develop a framework to evaluate the statistical significance of our motif predictions. We show that our approach is robust to the choice of input promoters, and thus can be used in the context of predicting binding positions from noisy experimental data. We apply our method to identify binding sites using data from genome scale ChIP–chip experiments. Results from these experiments are publicly available at http://cagt10.bu.edu/BSG. The graphical framework developed here may be useful when combining predictions from numerous computational and experimental measures.
Finally, we discuss how our algorithm can be used to improve the sensitivity of computational predictions of transcription factor binding specificities.
Author Summary
A historically difficult problem in computational biology is the identification of transcription factor binding sites (TFBS) in the promoters of co-regulated genes. With increasing emphasis on research in transcriptional regulation, this problem is also uniquely relevant to emerging results from recent experiments in high-throughput and systems biology. Despite extensive research in the area, recent evaluations of previously published techniques show much room for improvement. In this paper, we introduce a fundamentally new approach to the identification of TFBS. First, we start by representing nucleotides in promoters as an undirected, weighted graph. Given this representation of a binding site graph (BSG), we employ relatively simple graph clustering techniques to identify functional TFBS. We show that BSG predictions significantly outperform all previously evaluated methods in nearly every performance measure using a standardized assessment benchmark. We also find that this approach is more robust than traditional Gibbs sampling to selection of input promoters, and thus more likely to perform well under noisy experimental conditions. Finally, BSGs are very good at predicting specificity determining nucleotides. Using BSG predictions, we were able to confirm recent experimental results on binding specificity of E-box TFs CBF1 and PHO4 and predict novel specificity determining nucleotides for TYE7.
doi:10.1371/journal.pcbi.0030090
PMCID: PMC1866359  PMID: 17500587
15.  Guilt-By-Association Feature Selection Applied to Simulated Proteomic Data 
We propose a new feature selection algorithm, Guilt-By-Association (GBA), which uses hierarchical clustering based on feature correlations to eliminate redundant features. GBA can be used in conjunction with other algorithms to produce a feature selection routine that explicitly considers both the similarities between features and their individual discriminatory powers. In this preliminary study, a simple form of GBA was investigated on simulated proteomic data.
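The idea of removing redundant features by clustering on correlations can be sketched as follows. This is a simplified stand-in rather than the published GBA algorithm: it assumes single-linkage clustering cut at a fixed absolute-correlation threshold (equivalent to taking connected components of the thresholded correlation graph) and keeps, from each cluster, the feature with the highest discriminatory score. The feature names, values, scores, and threshold are hypothetical, and the sketch assumes no feature is constant.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_features(features, power, threshold=0.9):
    """Merge features with |r| >= threshold, then keep the strongest feature per cluster."""
    names = list(features)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) >= threshold:
                parent[find(a)] = find(b)

    best = {}
    for n in names:
        root = find(n)
        if root not in best or power[n] > power[best[root]]:
            best[root] = n
    return sorted(best.values())

# Toy data (assumed): peak_b is nearly proportional to peak_a, peak_c is unrelated.
features = {
    "peak_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "peak_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "peak_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
power = {"peak_a": 0.6, "peak_b": 0.8, "peak_c": 0.7}  # hypothetical discriminatory scores

selected = select_features(features, power)
print(selected)  # ['peak_b', 'peak_c']
```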
PMCID: PMC1560499  PMID: 16779401
16.  Comprehensive Research Synopsis and Systematic Meta-Analyses in Parkinson's Disease Genetics: The PDGene Database 
PLoS Genetics  2012;8(3):e1002548.
More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ∼27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P<5×10⁻⁸) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3×10⁻⁸). All meta-analysis results are freely available on a dedicated online database (www.pdgene.org), which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies.
Author Summary
The genetic basis of Parkinson's disease is complex, i.e. it is determined by a number of different disease-causing and disease-predisposing genes. The latter in particular have proven difficult to find, as evidenced by more than 800 published genetic association studies, which typically show discrepant results. To facilitate the interpretation of this large and continuously increasing body of data, we have created a freely available online database (“PDGene”: http://www.pdgene.org) which provides an exhaustive account of all published genetic association studies in PD. One particularly useful feature is the calculation and display of up-to-date summary statistics of published data for overlapping DNA sequence variants (polymorphisms). These meta-analyses revealed eleven gene loci that showed a statistically very significant (P < 5×10⁻⁸; a.k.a. genome-wide significance) association with risk for PD: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, and purely by data-mining, we identified one novel PD susceptibility locus in a gene called ITGA8 (rs7077361, P = 1.3×10⁻⁸). We note that our continuously updated database represents the most comprehensive research synopsis of genetic association studies in PD to date. In addition to vastly facilitating the work of other PD geneticists, our approach may serve as a valuable example for other complex diseases.
doi:10.1371/journal.pgen.1002548
PMCID: PMC3305333  PMID: 22438815
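As a rough illustration of what pooling odds ratios across association studies involves, the sketch below uses a fixed-effect, inverse-variance model on the log odds ratio scale. This is an assumption made for illustration: the abstract does not state which meta-analysis model PDGene uses. The function name and the input numbers are hypothetical and are not taken from the database.

```python
from math import exp, log, sqrt

Z95 = 1.96  # normal quantile for a two-sided 95% confidence interval

def pool_odds_ratios(studies):
    """Fixed-effect inverse-variance pooling (assumed model).

    studies: list of (odds_ratio, ci_low, ci_high) with 95% CIs.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        # Recover the standard error of log(OR) from the CI width.
        se = (log(ci_high) - log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log(odds_ratio)
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    pooled_se = sqrt(1.0 / weight_total)
    return (exp(pooled_log_or),
            exp(pooled_log_or - Z95 * pooled_se),
            exp(pooled_log_or + Z95 * pooled_se))

# Invented example: two studies reporting the same effect and interval.
toy_studies = [(0.90, 0.80, 1.01), (0.90, 0.80, 1.01)]
pooled_or, pooled_low, pooled_high = pool_odds_ratios(toy_studies)
```

With two identical studies the pooled estimate is unchanged while the interval narrows, which is the basic reason pooled analyses can reach significance thresholds that individual studies miss.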
17.  The neurobiology of glucocerebrosidase-associated parkinsonism: a positron emission tomography study of dopamine synthesis and regional cerebral blood flow 
Brain  2012;135(8):2440-2448.
Mutations in GBA, the gene encoding glucocerebrosidase, the enzyme deficient in Gaucher disease, are common risk factors for Parkinson disease, as patients with Parkinson disease are over five times more likely to carry GBA mutations than healthy controls. Patients with GBA mutations generally have an earlier onset of Parkinson disease and more cognitive impairment than those without GBA mutations. We investigated whether GBA mutations alter the neurobiology of Parkinson disease, studying brain dopamine synthesis and resting regional cerebral blood flow in 107 subjects (38 women, 69 men). We measured dopamine synthesis with ¹⁸F-fluorodopa positron emission tomography, and resting regional cerebral blood flow with H₂¹⁵O positron emission tomography in the wakeful, resting state in four study groups: (i) patients with Parkinson disease and Gaucher disease (n = 7, average age = 56.6 ± 9.2 years); (ii) patients with Parkinson disease without GBA mutations (n = 11, 62.1 ± 7.1 years); (iii) patients with Gaucher disease without parkinsonism, but with a family history of Parkinson disease (n = 14, 52.6 ± 12.4 years); and (iv) healthy GBA-mutation carriers with a family history of Parkinson disease (n = 7, 50.1 ± 18 years). We compared each study group with a matched control group. Data were analysed with region of interest and voxel-based methods. Disease duration and Parkinson disease functional and staging scores were similar in the two groups with parkinsonism, as was striatal dopamine synthesis: both had greatest loss in the caudal striatum (putamen Ki loss: 44 and 42%, respectively), with less reduction in the caudate (20 and 18% loss). However, the group with both Parkinson and Gaucher diseases showed decreased resting regional cerebral blood flow in the lateral parieto-occipital association cortex and precuneus bilaterally. Furthermore, two subjects with Gaucher disease without parkinsonian manifestations showed diminished striatal dopamine.
In conclusion, the pattern of dopamine loss in patients with both Parkinson and Gaucher disease was similar to sporadic Parkinson disease, indicating comparable damage in midbrain neurons. However, H₂¹⁵O positron emission tomography studies indicated that these subjects have decreased resting activity in a pattern characteristic of diffuse Lewy body disease. These findings provide insight into the pathophysiology of GBA-associated parkinsonism.
doi:10.1093/brain/aws174
PMCID: PMC3407426  PMID: 22843412
brain imaging; genetic risk; positron emission tomography (PET); Parkinson disease; lysosomal storage disorders
18.  The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes 
Nucleic Acids Research  2004;32(18):5539-5545.
In this paper, we present the Functional Catalogue (FunCat), a hierarchically structured, organism-independent, flexible and scalable controlled classification system enabling the functional description of proteins from any organism. FunCat has been applied for the manual annotation of prokaryotes, fungi, plants and animals. We describe how FunCat is implemented as a highly efficient and robust tool for the manual and automatic annotation of genomic sequences. Owing to its hierarchical architecture, FunCat has also proved to be useful for many subsequent downstream bioinformatic applications. This is illustrated by the analysis of large-scale experiments from various investigations in transcriptomics and proteomics, where FunCat was used to project experimental data into functional units, served as a ‘gold standard’ for functional classification methods, and was also used to compare the significance of different experimental methods. Over the last decade, the FunCat has been established as a robust and stable annotation scheme that offers both meaningful and manageable functional classification and ease of perception.
doi:10.1093/nar/gkh894
PMCID: PMC524302  PMID: 15486203
19.  Computational reconstruction of tissue-specific metabolic models: application to human liver metabolism 
The first computational approach for the rapid generation of genome-scale tissue-specific models from a generic species model. A genome-scale model of human liver metabolism, which is comprehensively tested and validated using cross-validation and the ability to carry out complex hepatic metabolic functions. The model's flux predictions are shown to correlate with flux measurements across a variety of hormonal and dietary conditions, and are successfully used to predict biomarker changes in genetic metabolic disorders, both with higher accuracy than the generic human model.
The study of normal human metabolism and its alterations is central to the understanding and treatment of a variety of human diseases, including diabetes, metabolic syndrome, neurodegenerative disorders, and cancer. A promising systems biology approach for studying human metabolism is through the development and analysis of large-scale stoichiometric network models of human metabolism. The reconstruction of these network models has followed two main paths. The first is the reconstruction of generic (non-tissue specific) models, characterizing the complete metabolic potential of human cells, based mostly on genomic data to trace enzyme-coding genes (Duarte et al, 2007; Ma et al, 2007). The second is the reconstruction of cell type- and tissue-specific models (Wiback and Palsson, 2002; Chatziioannou et al, 2003; Vo et al, 2004), based on a similar methodology to that described above, with the extra complexity of manual curation of literature evidence for the cell/system specificity of metabolic enzymes and pathways.
Against this background, we present in this study, to the best of our knowledge, the first computational approach for the rapid generation of genome-scale tissue-specific models. The method relies on integrating the previously reconstructed generic human models with a variety of high-throughput molecular ‘omics' data, including transcriptomic, proteomic, metabolomic, and phenotypic data, as well as literature-based knowledge, characterizing the tissue at hand (Figure 1). Hence, it can be readily used to rapidly build and use a large array of human tissue-specific models. The resulting model satisfies stoichiometric, mass-balance, and thermodynamic constraints. It serves as a functional metabolic network that can then be used to explore the metabolic state of a tissue under various genetic and physiological conditions, simulating enzymatic inhibition or drug applications through standard constraint-based modeling methods, without requiring additional context-specific molecular data.
We applied this approach to build a genome-scale model of liver metabolism, which is then comprehensively tested and validated. The model is shown to be able to simulate complex hepatic metabolic functions, as well as to depict the pathological alterations caused by urea cycle deficiencies. The liver model was applied to predict measured intracellular metabolic fluxes given measured metabolite uptake and secretion rates at different hepatic metabolic conditions. The predictions were tested using a comprehensive set of flux measurements performed by Chan et al (2003), showing that the liver model obtained more accurate predictions than those obtained by the original, generic human model (an overall prediction accuracy of 0.67 versus 0.46). Furthermore, it was applied to identify metabolic biomarkers for liver inborn errors of metabolism, again outperforming the predictions generated by the generic human model (accuracy of 0.67 versus 0.59).
From a biotechnological standpoint, the liver model generated here can serve as a basis for future studies aiming to optimize the functioning of bioartificial liver devices. The application of the method to rapidly construct metabolic models of other human tissues may lead to many other important clinical insights, e.g., concerning means for metabolic salvage of ischemic heart and brain tissues. Last but not least, the application of the new method is not limited to the realm of human modeling; it can be used to generate tissue models for any multi-tissue organism for which a generic model exists, such as Mus musculus (Quek and Nielsen, 2008; Sheikh et al, 2005) and the model plant Arabidopsis thaliana (Poolman et al, 2009).
The computational study of human metabolism has been advanced with the advent of the first generic (non-tissue specific) stoichiometric model of human metabolism. In this study, we present a new algorithm for rapid reconstruction of tissue-specific genome-scale models of human metabolism. The algorithm generates a tissue-specific model from the generic human model by integrating a variety of tissue-specific molecular data sources, including literature-based knowledge, transcriptomic, proteomic, metabolomic and phenotypic data. Applying the algorithm, we constructed the first genome-scale stoichiometric model of hepatic metabolism. The model is verified using standard cross-validation procedures, and through its ability to carry out hepatic metabolic functions. The model's flux predictions correlate with flux measurements across a variety of hormonal and dietary conditions, and improve upon the predictive performance obtained using the original, generic human model (prediction accuracy of 0.67 versus 0.46). Finally, the model better predicts biomarker changes in genetic metabolic disorders than the generic human model (accuracy of 0.67 versus 0.59). The approach presented can be used to construct other human tissue-specific models, and be applied to other organisms.
doi:10.1038/msb.2010.56
PMCID: PMC2964116  PMID: 20823844
constraint based; hepatic; liver; metabolism
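The stoichiometric and mass-balance constraints mentioned in the entry above can be made concrete with a toy example. The sketch below is not the liver model or the authors' algorithm; it assumes the standard steady-state condition S·v = 0 together with per-reaction flux bounds, and uses a hypothetical three-reaction network (uptake of A, conversion A → B, secretion of B). Enzymatic inhibition is mimicked by forcing a reaction's upper bound to zero.

```python
def is_feasible_flux(S, v, lower, upper, tol=1e-9):
    """Check steady-state mass balance (S.v = 0) and flux bounds.

    S: stoichiometric matrix as a list of rows (one row per metabolite,
       one column per reaction); v: candidate flux vector.
    """
    balanced = all(abs(sum(s * f for s, f in zip(row, v))) <= tol
                   for row in S)
    bounded = all(lo - tol <= f <= hi + tol
                  for f, lo, hi in zip(v, lower, upper))
    return balanced and bounded

# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->
S_toy = [
    [1, -1, 0],   # metabolite A: produced by R1, consumed by R2
    [0, 1, -1],   # metabolite B: produced by R2, consumed by R3
]
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]

wild_type_ok = is_feasible_flux(S_toy, [2.0, 2.0, 2.0], lower, upper)

# Simulated inhibition of R2: its flux is forced to zero.
knockout_upper = [10.0, 0.0, 10.0]
knockout_same_flux = is_feasible_flux(S_toy, [2.0, 2.0, 2.0],
                                      lower, knockout_upper)
knockout_zero_flux = is_feasible_flux(S_toy, [0.0, 0.0, 0.0],
                                      lower, knockout_upper)
```

In a genome-scale setting the same constraints define a linear program that a solver optimizes over; this check only verifies whether a given flux vector satisfies them.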
20.  Detection of Putatively Thermophilic Anaerobic Methanotrophs in Diffuse Hydrothermal Vent Fluids 
The anaerobic oxidation of methane (AOM) is carried out by a globally distributed group of uncultivated Euryarchaeota, the anaerobic methanotrophic archaea (ANME). In this work, we used G+C analysis of 16S rRNA genes to identify a putatively thermophilic ANME group and applied newly designed primers to study its distribution in low-temperature diffuse vent fluids from deep-sea hydrothermal vents. We found that the G+C content of the 16S rRNA genes (PGC) is significantly higher in the ANME-1GBa group than in other ANME groups. Based on the positive correlation between the PGC and optimal growth temperatures (Topt) of archaea, we hypothesize that the ANME-1GBa group is adapted to thrive at high temperatures. We designed specific 16S rRNA gene-targeted primers for the ANME-1 cluster to detect all phylogenetic groups within this cluster, including the deeply branching ANME-1GBa group. The primers were successfully tested both in silico and in experiments with sediment samples where ANME-1 phylotypes had previously been detected. The primers were further used to screen for the ANME-1 microorganisms in diffuse vent fluid samples from deep-sea hydrothermal vents in the Pacific Ocean, and sequences belonging to the ANME-1 cluster were detected in four individual vents. Phylotypes belonging to the ANME-1GBa group dominated in clone libraries from three of these vents. Our findings provide evidence for the existence of a putatively extremely thermophilic group of methanotrophic archaea that occur in geographically and geologically distinct marine hydrothermal habitats.
doi:10.1128/AEM.03034-12
PMCID: PMC3568577  PMID: 23183981
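The G+C content used in the entry above is a simple sequence statistic. A minimal sketch follows, assuming it is expressed as the percentage of G and C among unambiguous bases; the function name and the handling of ambiguous symbols (ignored here) are assumptions, and the example sequences are invented.

```python
def gc_percent(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T, U).

    Ambiguity codes such as N and gap characters are ignored
    (an assumed convention for this sketch).
    """
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)

# Invented fragments, for illustration only.
high_gc = gc_percent("GGCCGCGA")
balanced = gc_percent("acgu")
```

Comparing this value across 16S rRNA gene sequences from different groups is the kind of analysis the abstract describes; linking it to growth temperature relies on the correlation the authors cite, not on anything computed here.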
21.  The association between β-glucocerebrosidase mutations and parkinsonism 
Current neurology and neuroscience reports  2013;13(8):10.1007/s11910-013-0368-x.
Mutations in the β-glucocerebrosidase gene (GBA), which encodes the lysosomal enzyme β-glucocerebrosidase, have traditionally been implicated in Gaucher disease, an autosomal-recessive lysosomal storage disorder. Yet the past two decades have yielded an explosion of epidemiological and basic-science evidence linking mutations in GBA with the development of Parkinson disease as well. Although the specific contribution of mutant GBA to the pathogenesis of parkinsonism remains unknown, evidence suggests that both loss of function and toxic gain of function by abnormal β-glucocerebrosidase may be important, and points to a close relationship between β-glucocerebrosidase and α-synuclein. Furthermore, multiple lines of evidence suggest that while GBA-associated PD closely mimics idiopathic PD (IPD), it may present at a younger age, and is more frequently complicated by cognitive dysfunction. Understanding the clinical association between GBA and PD, and the relationship between β-glucocerebrosidase and α-synuclein, may enhance understanding of the pathogenesis of IPD, improve prognostication and treatment of GBA carriers with parkinsonism, and may furthermore inform therapies for IPD not due to GBA mutations.
doi:10.1007/s11910-013-0368-x
PMCID: PMC3816495  PMID: 23812893
Parkinson disease; parkinsonism; dementia with Lewy bodies; Gaucher disease; GBA; β-glucocerebrosidase
22.  The relationship between the visual evoked potential and the gamma band investigated by blind and semi-blind methods 
Neuroimage  2011;56(3-4):1059-1071.
Gamma Band Activity (GBA) is increasingly studied for its relation with attention, change detection, maintenance of working memory and the processing of sensory stimuli. Activity around the gamma range has also been linked with early visual processing, although the relationship between this activity and the low frequency visual evoked potential (VEP) remains unclear. This study examined the ability of blind and semi-blind source separation techniques to extract sources specifically related to the VEP and GBA in order to shed light on the relationship between them. Blind (Independent Component Analysis—ICA) and semi-Blind (Functional Source Separation—FSS) methods were applied to dense array EEG data recorded during checkerboard stimulation. FSS was performed with both temporal and spectral constraints to identify specifically the generators of the main peak of the VEP (P100) and of the GBA. Source localisation and time-frequency analyses were then used to investigate the properties and co-dependencies between VEP/P100 and GBA. Analysis of the VEP extracted using the different methods demonstrated very similar morphology and localisation of the generators. Single trial time frequency analysis showed higher GBA when a larger amplitude VEP/P100 occurred. Further examination indicated that the evoked (phase-locked) component of the GBA was more related to the P100, whilst the induced component correlated with the VEP as a whole. The results suggest that the VEP and GBA may be generated by the same neuronal populations, and implicate this relationship as a potential mediator of the correlation between the VEP and the Blood Oxygenation Level Dependent (BOLD) effect measured with fMRI.
Research Highlights
► ICA and FSS are able to extract sources specifically related to the VEP/P100 and GBA. ► Localisation and frequency analyses show co-dependencies between VEP/P100 and GBA. ► Trial by trial induced GBA covaries with VEP amplitude. ► VEP and GBA may be generated by the same neuronal populations. ► VEP and GBA relationship may underlie the relationship between VEP and BOLD.
doi:10.1016/j.neuroimage.2011.03.008
PMCID: PMC3095074  PMID: 21396460
Visual Evoked Potential (VEP); Electroencephalography (EEG); Independent Component Analysis (ICA); Functional Source Separation (FSS); Induced Visual Gamma (IVG); Gamma Band Activity (GBA)
23.  A systematic mapping review of effective interventions for communicating with, supporting and providing information to parents of preterm infants 
BMJ Open  2011;1(1):e000023.
Background and objective
The birth of a preterm infant can be an overwhelming experience of guilt, fear and helplessness for parents. Provision of interventions to support and engage parents in the care of their infant may improve outcomes for both the parents and the infant. The objective of this systematic review is to identify and map out effective interventions for communication with, supporting and providing information for parents of preterm infants.
Design
Systematic searches were conducted in the electronic databases Medline, Embase, PsycINFO, the Cochrane library, the Cumulative Index to Nursing and Allied Health Literature, Midwives Information and Resource Service, Health Management Information Consortium, and Health Management and Information Service. Hand-searching of reference lists and journals was conducted. Studies were included if they provided parent-reported outcomes of interventions relating to information, communication and/or support for parents of preterm infants prior to the birth, during care at the neonatal intensive care unit and after going home with their preterm infant. Titles and abstracts were read for relevance, and papers judged to meet inclusion criteria were included. Papers were data-extracted, their quality was assessed, and a narrative summary was conducted in line with the York Centre for Reviews and Dissemination guidelines.
Studies reviewed
Of the 72 papers identified, 19 papers were randomised controlled trials, 16 were cohort or quasi-experimental studies, and 37 were non-intervention studies.
Results
Interventions for supporting, communicating with, and providing information to parents that have had a premature infant are reported. Parents report feeling supported through individualised developmental and behavioural care programmes, through being taught behavioural assessment scales, and through breastfeeding, kangaroo-care and baby-massage programmes. Parents also felt supported through organised support groups and through provision of an environment where parents can meet and support each other. Parental stress may be reduced through individual developmental care programmes, psychotherapy, interventions that teach emotional coping skills and active problem-solving, and journal writing. Evidence reports the importance of preparing parents for the neonatal unit through the neonatal tour, and the importance of good communication throughout the infant admission phase and after discharge home. Providing individual web-based information about the infant, recording doctor–patient consultations and provision of an information binder may also improve communication with parents. The importance of thorough discharge planning throughout the infant's admission phase and the importance of home-support programmes are also reported.
Conclusion
The paper reports evidence of interventions that help support, communicate with and inform parents who have had a premature infant throughout the admission phase of the infant, discharge and return home. The level of evidence reported is mixed, and this should be taken into account when developing policy. A summary of interventions from the available evidence is reported.
Article summary
Article focus
A systematic mapping review to identify and synthesise evidence of effective interventions for communicating with, supporting and providing information for parents of preterm infants.
Key messages
The review highlights the importance of encouraging and involving parents in the care of their preterm infant at the neonatal unit to enhance their ability to cope with and improve their confidence in caring for the infant, which may also lead to improved infant outcomes and reduced length of stay at the neonatal unit.
Interventions for supporting parents included: (1) involving parents in individualised developmental and behavioural care programmes (eg, Creating Opportunities for Parent Empowerment (COPE), Neonatal Individualised Developmental Care and Assessment Programme, Mother–Infant Transaction Programme (MITP)) and behavioural assessment programmes; (2) breastfeeding, kangaroo-care and infant-massage programmes; (3) support forums for parents; (4) interventions to alleviate parental stress; (5) preparation of parents for various stages—for example, seeing their infant for the first time, preparing to go home; (6) home-support programmes.
Involving parents in the exchange of information with and between health professionals is important, with various modes of providing this information reported—for example, ward rounds with doctors, discussion around infant notes, websites and hard-copy information.
Strengths and limitations of this study
Strengths
This is the first review to synthesise the evidence of interventions to support parents of preterm infants through improved provision of information, improved communications between parents and health professionals, and alleviation of stress at all stages of a parent's journey through the neonatal unit. It highlights relatively inexpensive interventions that can be integrated into their pathway through the neonatal unit and return home, enhancing parental coping and potentially improving infant outcomes and reducing the infant's length of stay at the neonatal unit.
Limitations
The quality of the evidence that this review reports is variable, and includes all types of study designs. It has been difficult to evaluate one piece of evidence over another because of the nature of the evidence. For example, whether randomised controlled trials (RCTs) are an appropriate method of evaluating the parents' experiences of interventions over and above, say, a qualitative study is debatable. While the RCT studies are more objective, they often fail to provide a more in-depth empirical reality of parents' experiences of having a premature infant. A well-conducted RCT may not provide a true reflection of improved self-esteem or empowerment, for example, whereas a qualitative study provides an understanding of the experiences. Furthermore, evaluation of such complex interventions is challenging because of the various interconnecting parts of the pathway reported in figure 2.
It is therefore very difficult to evaluate the results to say that one study method is better than another. For this reason, we have been inclusive in our selection of studies, resulting in a large number of studies selected for the review. Being inclusive of studies benefits the evidence base by bringing together ‘experience’ studies in a systematic way gaining a greater breadth of perspectives and a deeper understanding of issues from the point of view of those targeted by the interventions. However, if studies were fatally flawed, they were excluded from the review.
doi:10.1136/bmjopen-2010-000023
PMCID: PMC3191395  PMID: 22021730
Social Health; community child health; paediatric intensive and critical care; education and training; quality in healthcare
24.  GFam: a platform for automatic annotation of gene families 
Nucleic Acids Research  2012;40(19):e152.
We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families and derive meaningful functional labels, and offers a seamless approach to propagating functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources, followed by a sequence-based connected component analysis of un-annotated sequence regions, to derive a consensus domain architecture for each sequence and subsequently generate families based on common architectures. For the proteome of Arabidopsis, our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam’s capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.
doi:10.1093/nar/gks631
PMCID: PMC3479161  PMID: 22790981
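The residue coverage figure quoted in the entry above can be illustrated with a small calculation. The sketch below assumes residue coverage means the fraction of a sequence's residues that fall inside at least one annotated domain, with overlapping domains counted once; this definition, the function name, and the coordinates are assumptions for illustration and are not taken from the GFam software.

```python
def residue_coverage(sequence_length, domains):
    """Fraction of residues covered by at least one domain.

    domains: list of (start, end) positions, 1-based and inclusive.
    Overlapping or nested domains are counted only once.
    """
    covered = 0
    last_end = 0  # last residue already counted as covered
    for start, end in sorted(domains):
        start = max(start, last_end + 1)   # skip residues already counted
        end = min(end, sequence_length)    # clip to the sequence
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / sequence_length

# Invented example: two overlapping domains and one separate domain
# on a 100-residue sequence.
toy_coverage = residue_coverage(100, [(1, 30), (20, 50), (80, 90)])
```

Combining assignments from several domain databases can only add covered residues under this definition, which is consistent with the gain over the best single database reported in the abstract.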
25.  Prediction of Gene Expression in Embryonic Structures of Drosophila melanogaster 
PLoS Computational Biology  2007;3(7):e144.
Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.
Author Summary
The task of deciphering the complex transcriptional regulatory networks controlling development is one of the major current challenges for molecular biology. The problem is difficult, if not impossible, to solve without a detailed knowledge of the spatiotemporal dynamics of gene expression. Thus, to understand development, we need to identify and functionally characterize all players in regulatory networks. Data on gene expression dynamics obtained from whole transcriptome microarray experiments, combined with in situ hybridization mRNA localisation patterns for a subset of genes, may provide a route for predicting the localisation of gene expression for those genes for which in situ data has not been generated, as well as suggesting functional information for uncharacterised genes. Here, we report the development of one of the first methods for predicting the localisation of gene expression during Drosophila embryogenesis from microarray data. Pooling the subset of genes in the fly genome with in situ data to form functional units, localised in space and time for relevant developmental processes, facilitates the statement of a classification problem, which we address with machine-learning methods. Our approach promotes a richer annotation of biological function for genes in the absence of costly and time-consuming experimental analysis.
doi:10.1371/journal.pcbi.0030144
PMCID: PMC1924873  PMID: 17658945
