1.  The Software Ontology (SWO): a resource for reproducibility in biomedical data analysis, curation and digital preservation 
Motivation
Biomedical ontologists to date have concentrated on ontological descriptions of biomedical entities such as gene products and their attributes, phenotypes and so on. Recently, effort has diversified to descriptions of the laboratory investigations by which these entities were produced. However, much biological insight is gained from the analysis of the data produced from these investigations, and there is a lack of adequate descriptions of the wide range of software that is central to bioinformatics. We need to describe how data are analyzed for discovery, audit trails, provenance and reproducibility.
Results
The Software Ontology (SWO) is a description of software used to store, manage and analyze data. Input to the SWO has come from beyond the life sciences, but its main focus is the life sciences. We used agile techniques to gather input for the SWO and keep engagement with our users. The result is an ontology that meets the needs of a broad range of users by describing software, its information processing tasks, data inputs and outputs, data formats, versions and so on. Recently, the SWO has incorporated EDAM, a vocabulary for describing data and related concepts in bioinformatics. The SWO is currently being used to describe software used in multiple biomedical applications.
Conclusion
The SWO is another element of the biomedical ontology landscape that is necessary for the description of biomedical entities and how they were discovered. An ontology of software used to analyze data produced by investigations in the life sciences can be made in such a way that it covers the important features requested and prioritized by its users. The SWO thus fits into the landscape of biomedical ontologies and is produced using techniques designed to keep it in line with users’ needs.
Availability
The Software Ontology is available under an Apache 2.0 license at http://theswo.sourceforge.net/; the Software Ontology blog can be read at http://softwareontology.wordpress.com.
doi:10.1186/2041-1480-5-25
PMCID: PMC4098953  PMID: 25068035
2.  Comparative Proteomic Analysis Identifies Age-Dependent Increases in the Abundance of Specific Proteins after Deletion of the Small Heat Shock Proteins αA- and αB-Crystallin 
Biochemistry  2013;52(17):2933-2948.
Mice with deletion of genes for small heat shock proteins αA- and αB-crystallin (αA/αB−/−) develop cataracts. We used proteomic analysis to identify lens proteins that change in abundance after deletion of these α-crystallin genes. Wild-type (WT) and αA/αB−/− knockout (DKO) mice were compared using two-dimensional difference gel electrophoresis and mass spectrometric analysis, and protein identifications were validated by Mascot proteomic software. The abundance of histones H2A, H4, and H2B fragment, and a low molecular weight β1-catenin increased 2- to 3-fold in postnatal day 2 lenses of DKO mice compared with WT lenses. Additional major increases were observed in abundance of βB2-crystallin and vimentin in 30-day-old lenses of DKO animals compared with WT animals. Lenses of DKO mice contained 9 protein spots containing βB2-crystallin at 10- to 40-fold higher abundance and 3 protein spots containing vimentin at ≥ 2-fold higher abundance than in WT lenses. Gel permeation chromatography identified a unique 328 kDa protein in DKO lenses, containing β-crystallin, demonstrating aggregation of β-crystallin in the absence of α-crystallins. Together, these changes provide biochemical evidence for possible functions of specific cell adhesion proteins, cytoskeletal proteins, and crystallins in lens opacities caused by the absence of the major chaperones, αA- and αB-crystallins.
doi:10.1021/bi400180d
PMCID: PMC3690595  PMID: 23590631
Crystallin; knockout; proteomics; substrate; alpha-crystallin; chaperone
3.  In Vivo Substrates of the Lens Molecular Chaperones αA-Crystallin and αB-Crystallin 
PLoS ONE  2014;9(4):e95507.
αA-crystallin and αB-crystallin are members of the small heat shock protein family and function as molecular chaperones and major lens structural proteins. Although numerous studies have examined their chaperone-like activities in vitro, little is known about the proteins they protect in vivo. To elucidate the relationships between chaperone function, substrate binding, and human cataract formation, we used proteomic and mass spectrometric methods to analyze the effect of mutations associated with hereditary human cataract formation on protein abundance in αA-R49C and αB-R120G knock-in mutant lenses. Compared with age-matched wild type lenses, 2-day-old αA-R49C heterozygous lenses demonstrated the following: increased crosslinking (15-fold) and degradation (2.6-fold) of αA-crystallin; increased association between αA-crystallin and filensin, actin, or creatine kinase B; increased acidification of βB1-crystallin; increased levels of grifin; and an association between βA3/A1-crystallin and αA-crystallin. Homozygous αA-R49C mutant lenses exhibited increased associations between αA-crystallin and βB3-, βA4-, βA2-crystallins, and grifin, whereas levels of βB1-crystallin, gelsolin, and calpain 3 decreased. The amount of degraded glutamate dehydrogenase, α-enolase, and cytochrome c increased more than 50-fold in homozygous αA-R49C mutant lenses. In αB-R120G mouse lenses, our analyses identified decreased abundance of phosphoglycerate mutase, several β- and γ-crystallins, and degradation of αA- and αB-crystallin early in cataract development. Changes in the abundance of hemoglobin and histones with the loss of normal α-crystallin chaperone function suggest that these proteins also play important roles in the biochemical mechanisms of hereditary cataracts. Together, these studies offer a novel insight into the putative in vivo substrates of αA- and αB-crystallin.
doi:10.1371/journal.pone.0095507
PMCID: PMC3997384  PMID: 24760011
4.  The EBI RDF platform: linked open data for the life sciences 
Bioinformatics  2014;30(9):1338-1339.
Motivation: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.
Availability: http://www.ebi.ac.uk/rdf
Contact: jupp@ebi.ac.uk
doi:10.1093/bioinformatics/btt765
PMCID: PMC3998127  PMID: 24413672
5.  Expression Atlas update—a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments 
Nucleic Acids Research  2013;42(Database issue):D926-D932.
Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of ‘baseline’ expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful ‘contrasts’, i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets as well as genes, and a more powerful search interface designed to maximize the biological value provided to the user.
doi:10.1093/nar/gkt1270
PMCID: PMC3964963  PMID: 24304889
6.  Identification of Potential Mediators of Retinotopic Mapping: A Comparative Proteomic Analysis of Optic Nerve from WT and Phr1 Retinal Knockout Mice 
Journal of proteome research  2012;11(11):5515-5526.
Retinal ganglion cells (RGCs) transmit visual information topographically from the eye to the brain, creating a map of visual space in retino-recipient nuclei (retinotopy). This process is affected by retinal activity and by activity-independent molecular cues. Phr1, which encodes a presumed E3 ubiquitin ligase (PHR1), is required presynaptically for proper placement of RGC axons in the lateral geniculate nucleus and the superior colliculus, suggesting that increased levels of PHR1 target proteins may be instructive for retinotopic mapping of retinofugal projections. To identify potential target proteins, we conducted a proteomic analysis of optic nerve to identify differentially abundant proteins in the presence or absence of Phr1 in RGCs. 1D gel electrophoresis identified a specific band in controls that was absent in mutants. Targeted proteomic analysis of this band demonstrated the presence of PHR1. Additionally, we conducted an unbiased proteomic analysis that identified 30 proteins as being significantly different between the two genotypes. One of these, heterogeneous nuclear ribonucleoprotein M (hnRNP-M), regulates antero-posterior patterning in invertebrates and can function as a cell surface adhesion receptor in vertebrates. Thus we have demonstrated that network analysis of quantitative proteomic data is a useful approach for hypothesis generation and for identifying biologically relevant targets in genetically altered biological models.
doi:10.1021/pr300767a
PMCID: PMC3510777  PMID: 22985349
Phr1; Mycbp2; retinal ganglion cell; proteomics; hnRNP-M; retinotopy; ubiquitin ligase; label-free quantitative proteomics; LC-MS; network analysis
7.  Identification of a unique TLR2-interacting peptide motif in a microbial leucine-rich-repeat protein 
Pathogenesis of many bacterially-induced inflammatory diseases is driven by Toll-like receptor (TLR) mediated immune responses following recognition of bacterial factors by different TLRs. Periodontitis is a chronic inflammation of the tooth supporting apparatus often leading to tooth loss, and is caused by a Gram-negative bacterial consortium that includes Tannerella forsythia. This bacterium expresses a virulence factor, BspA, which drives periodontal inflammation by activating TLR2. The N-terminal portion of the BspA protein comprises a leucine-rich repeat (LRR) domain previously shown to be involved in the binding and activation of TLR2. The objective of the current study was to identify specific epitopes in the LRR domain of BspA that interact with TLR2. Our results demonstrate that a sequence motif GC(S/T)GLXSIT is involved in mediating the interaction of BspA with TLR2. Thus, our study has identified a peptide motif that mediates the binding of a bacterial protein to TLR2 and highlights the promiscuous nature of TLR2 with respect to ligand binding. This work could provide a structural basis for designing peptidomimetics to modulate the activity of TLR2 in order to block bacterially-induced inflammation.
doi:10.1016/j.bbrc.2012.06.008
PMCID: PMC3405494  PMID: 22695115
leucine-rich repeat protein; BspA; TLR-2; Tannerella forsythia
8.  Quantitative Label-Free Proteomics for Discovery of Biomarkers in Cerebrospinal Fluid: Assessment of Technical and Inter-Individual Variation 
PLoS ONE  2013;8(5):e64314.
Background
Biomarkers are required for pre-symptomatic diagnosis, treatment, and monitoring of neurodegenerative diseases such as Alzheimer's disease. Cerebrospinal fluid (CSF) is a favored source because its proteome reflects the composition of the brain. Ideal biomarkers have low technical and inter-individual variability (subject variance) among control subjects to minimize overlaps between clinical groups. This study evaluates a process of multi-affinity fractionation (MAF) and quantitative label-free liquid chromatography tandem mass spectrometry (LC-MS/MS) for CSF biomarker discovery by (1) identifying reparable sources of technical variability, (2) assessing subject variance and residual technical variability for numerous CSF proteins, and (3) testing its ability to segregate samples on the basis of desired biomarker characteristics.
Methods/Results
Fourteen aliquots of pooled CSF and two aliquots from six cognitively normal individuals were randomized, enriched for low-abundance proteins by MAF, digested endoproteolytically, randomized again, and analyzed by nano-LC-MS. Nano-LC-MS data were time and m/z aligned across samples for relative peptide quantification. Among 11,433 aligned charge groups, 1360 relatively abundant ones were annotated by MS2, yielding 823 unique peptides. Analyses, including Pearson correlations of annotated LC-MS ion chromatograms, performed for all pairwise sample comparisons, identified several sources of technical variability: i) incomplete MAF and keratins; ii) globally- or segmentally-decreased ion current in isolated LC-MS analyses; and iii) oxidized methionine-containing peptides. Exclusion of these sources yielded 609 peptides representing 81 proteins. Most of these proteins showed very low coefficients of variation (CV<5%) whether they were quantified from the mean of all or only the 2 most-abundant peptides. Unsupervised clustering, using only 24 proteins selected for high subject variance, yielded perfect segregation of pooled and individual samples.
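The two variability measures named above, the coefficient of variation (CV) and the Pearson correlation between pairs of runs, can be sketched as follows. This is a minimal illustration only: the intensity values and function names are hypothetical and are not taken from the study, and the CV here uses the sample standard deviation, which is an assumption about the convention.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length intensity vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical intensities for one protein across four technical replicates.
replicates = [100.0, 102.0, 98.0, 101.0]
print(f"CV = {coefficient_of_variation(replicates):.2f}%")

# Hypothetical peptide intensities from two LC-MS runs of the same sample.
run_a = [10.0, 20.0, 30.0, 40.0]
run_b = [11.0, 19.0, 31.0, 39.0]
print(f"Pearson r = {pearson_r(run_a, run_b):.4f}")
```

A run whose correlation with every other run is unusually low would be a candidate for the kind of isolated decreased-ion-current analysis described above.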
Conclusions
Quantitative label-free LC-MS/MS can measure scores of CSF proteins with low technical variability and can segregate samples according to desired criteria. Thus, this technique shows potential for biomarker discovery for neurological diseases.
doi:10.1371/journal.pone.0064314
PMCID: PMC3659127  PMID: 23700471
9.  EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats 
Bioinformatics  2013;29(10):1325-1332.
Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required.
Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations.
Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl.
Contact: jison@ebi.ac.uk
doi:10.1093/bioinformatics/btt113
PMCID: PMC3654706  PMID: 23479348
10.  ArrayExpress update—trends in database growth and links to data analysis tools 
Nucleic Acids Research  2012;41(Database issue):D987-D990.
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
doi:10.1093/nar/gks1174
PMCID: PMC3531147  PMID: 23193272
11.  The role of molecular chaperonins in warm ischemia and reperfusion injury in the steatotic liver: A proteomic study 
BMC Biochemistry  2012;13:17.
Background
The molecular basis of the increased susceptibility of steatotic livers to warm ischemia/reperfusion (I/R) injury during transplantation remains undefined. An animal model for warm I/R injury was induced in obese Zucker rats. Lean Zucker rats provided controls. Two-dimensional differential gel electrophoresis was performed with liver protein extracts. Protein features with significant abundance ratios (p < 0.01) between the two cohorts were selected and analyzed with HPLC/MS. Proteins were identified using the UniProt database. Interactive protein networks were generated using Ingenuity Pathway Analysis and GRANITE software.
Results
The relative abundance of 105 proteins was observed in warm I/R injury. Functional grouping revealed four categories of importance: molecular chaperones/endoplasmic reticulum (ER) stress, oxidative stress, metabolism, and cell structure. Hypoxia up-regulated 1, calcium binding protein 1, calreticulin, heat shock protein (HSP) 60, HSP-90, and protein disulfide isomerase 3 were chaperonins significantly (p < 0.01) down-regulated, and only one chaperonin, HSP-1, was significantly up-regulated in steatotic liver following I/R.
Conclusion
Down-regulation of the chaperones identified in this analysis may contribute to the increased ER stress and, consequently, apoptosis and necrosis. This study provides an initial platform for future investigation of the role of chaperones and therapeutic targets for increasing the viability of steatotic liver allografts.
doi:10.1186/1471-2091-13-17
PMCID: PMC3445822  PMID: 22962947
Ischemia reperfusion injury; Two dimensional gel electrophoresis; Mass spectrometry; Liver transplantation; Chaperonins; Endoplasmic reticulum (ER) stress
12.  Inhibition of Lens Photodamage by UV-Absorbing Contact Lenses 
Using a proteomics approach, the authors examined whether class 1 UV-blocking contact lenses protect against UVB radiation–induced damage in a human lens epithelial cell line (HLE B-3) and postmortem human lenses.
Purpose.
To determine whether class 1 UV-blocking contact lenses protect against UVB radiation–induced damage in a human lens epithelial cell line (HLE B-3) and postmortem human lenses using a proteomics approach.
Methods.
HLE B-3 cells were exposed to 6.4 mW/cm² UVB radiation at 302 nm for 2 minutes (768 mJ/cm²) with or without covering by senofilcon A class 1 UV-blocking contact lenses or lotrafilcon A non–UV-blocking contact lenses (lotrafilcon A has some UV-blocking ability, albeit minimal). Control cells were not exposed to UVB radiation. Four hours after treatment, cells were analyzed by two-dimensional difference gel electrophoresis and tandem mass spectrometry, and changes in protein abundance were quantified. F-actin and microtubule cytoskeletons were examined by fluorescence staining. In addition, human donor lenses were exposed to UVB radiation at 302 nm for 4 minutes (1536 mJ/cm²). Cortical and epithelial cell proteins were scraped from lens surfaces and subjected to the same protein analyses.
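The doses quoted above follow from multiplying irradiance by exposure time. The short check below is a sketch with a hypothetical function name. It assumes the same 6.4 mW/cm2 irradiance for the 4-minute donor-lens exposure, which the abstract does not restate but which is consistent with the reported dose.

```python
def uvb_dose_mj_per_cm2(irradiance_mw_per_cm2, exposure_minutes):
    """Radiant exposure (mJ/cm^2) = irradiance (mW/cm^2) * exposure time (s)."""
    return irradiance_mw_per_cm2 * exposure_minutes * 60.0


# Cell-line exposure (2 min) and donor-lens exposure (4 min, irradiance assumed equal).
for minutes in (2, 4):
    dose = uvb_dose_mj_per_cm2(6.4, minutes)
    print(f"{minutes} min -> {round(dose)} mJ/cm^2")
```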
Results.
Senofilcon A lenses were beneficial for protecting HLE B-3 cells against UVB radiation–induced changes in caldesmon 1 isoform, lamin A/C transcript variant 1, DEAD (Asp-Glu-Ala-Asp) box polypeptide, β-actin, glyceraldehyde 3-phosphate dehydrogenase (G3PDH), annexin A2, triose phosphate isomerase, and ubiquitin B precursor. These contact lenses also prevented actin and microtubule cytoskeleton changes typically induced by UVB radiation. Conversely, non–UV-blocking contact lenses were not protective. UVB-irradiated human lenses showed marked reductions in αA-crystallin, αB-crystallin, aldehyde dehydrogenase 1, βS-crystallin, βB2-crystallin, and G3PDH, and UV-absorbing contact lenses significantly prevented these alterations.
Conclusions.
Senofilcon A class 1 UV-blocking contact lenses largely prevented UVB-induced changes in protein abundance in lens epithelial cells and in human lenses.
doi:10.1167/iovs.11-7633
PMCID: PMC3208141  PMID: 21873653
13.  Cetuximab in Refractory Skin Cancer Treatment 
Journal of Cancer  2012;3:257-261.
Objectives: Non-melanoma skin cancer is the most common malignancy in the US, with an annual incidence in excess of 1.5 million cases. In the majority of cases, locoregional treatment is curative and systemic therapy is not indicated. Platinum-based chemotherapy regimens have been used most commonly in refractory cases. The use of cetuximab, a monoclonal antibody targeting epidermal growth factor receptor [EGFR], has been reported for skin cancer treatment. The current study evaluated eight cases of locally advanced and refractory basal cell or squamous cell cancers that were treated with cetuximab.
Methods: This is a retrospective study on eight patients who had received cetuximab for treatment of cutaneous carcinoma since 2007 at Southern Illinois University School of Medicine (SIU-SOM) Medical Oncology clinic.
Results: Three of the four patients with basal cell carcinoma and two of the four patients with squamous cell carcinoma maintained remission on treatment. The main side effect was acneiform rash, which required termination of treatment for one patient and dose reduction in another.
Conclusion: The study indicates that cetuximab may have a beneficial role for patients with non-melanoma cutaneous carcinomas that are refractory to standard therapy.
doi:10.7150/jca.3491
PMCID: PMC3376776  PMID: 22712026
cetuximab; non-melanoma; skin cancer
14.  YKL-40: A Novel Prognostic Fluid Biomarker for Preclinical Alzheimer’s Disease 
Biological psychiatry  2010;68(10):903-912.
Background
Disease-modifying therapies for Alzheimer’s disease (AD) would be most beneficial if applied during the ‘preclinical’ stage (pathology present with cognition intact) before significant neuronal loss occurs. Therefore, biomarkers that can detect AD pathology in its early stages and predict dementia onset and progression will be invaluable for patient care and efficient clinical trial design.
Methods
2D–difference gel electrophoresis and liquid chromatography tandem mass spectrometry were used to measure AD-associated changes in cerebrospinal fluid (CSF). Concentrations of CSF YKL-40 were further evaluated by enzyme-linked immunosorbent assay in the discovery cohort (N=47), an independent sample set (N=292) with paired plasma samples (N=237), frontotemporal lobar degeneration (N=9), and progressive supranuclear palsy (PSP, N=6). Human AD brain was studied immunohistochemically to identify potential source(s) of YKL-40.
Results
In the discovery and validation cohorts, mean CSF YKL-40 was higher in very mild and mild AD-type dementia (Clinical Dementia Rating [CDR] 0.5 and 1) vs. controls (CDR 0) and PSP. Importantly, CSF YKL-40/Aβ42 ratio predicted risk of developing cognitive impairment (CDR 0 to CDR>0 conversion) as well as the best CSF biomarkers identified to date, tau/Aβ42 and p-tau181/Aβ42. Mean plasma YKL-40 was higher in CDR 0.5 and 1 vs. CDR 0 groups, and correlated with CSF levels. YKL-40 immunoreactivity was observed within astrocytes near a subset of amyloid plaques, implicating YKL-40 in the neuroinflammatory response to Aβ deposition.
Conclusions
These data demonstrate that YKL-40, a putative indicator of neuroinflammation, is elevated in AD, and that, together with Aβ42, has potential prognostic utility as a biomarker for preclinical AD.
doi:10.1016/j.biopsych.2010.08.025
PMCID: PMC3011944  PMID: 21035623
YKL-40; Alzheimer’s disease; biomarkers; cerebrospinal fluid; chitinase-3 like-1; inflammation
15.  Gene Expression Atlas update—a value-added database of microarray and sequencing-based functional genomics experiments 
Nucleic Acids Research  2011;40(Database issue):D1077-D1081.
Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we have made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species and contains expression data measured for 19 014 biological conditions in 136 551 assays from 5598 independent studies.
doi:10.1093/nar/gkr913
PMCID: PMC3245177  PMID: 22064864
16.  Anatomy ontologies and potential users: bridging the gap 
Journal of Biomedical Semantics  2011;2(Suppl 4):S3.
Motivation
To evaluate how well current anatomical ontologies fit the way real-world users apply anatomy terms in their data annotations.
Methods
Annotations from three diverse multi-species public-domain datasets provided a set of use cases for matching anatomical terms in two major anatomical ontologies (the Foundational Model of Anatomy and Uberon), using two lexical-matching applications (Zooma and Ontology Mapper).
Results
Approximately 1500 terms were identified; Uberon/Zooma mappings provided 286 matches compared to the control, and Ontology Mapper returned 319 matches. For the Foundational Model of Anatomy, Zooma returned 312 matches, and Ontology Mapper returned 397.
Conclusions
Our results indicate that for our datasets the anatomical entities or concepts are embedded in user-generated complex terms, and while lexical mapping works, anatomy ontologies do not provide the majority of terms users supply when annotating data. Provision of searchable cross-products for compositional terms is a key requirement for using ontologies.
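The distinction between a direct lexical match and an anatomical concept embedded in a longer user-supplied term can be illustrated with a toy matcher. This is a hedged sketch and does not reproduce how Zooma or Ontology Mapper work; the labels, annotations and function names are hypothetical.

```python
import re


def normalise(term):
    """Lower-case a term and reduce it to space-separated alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", term.lower()))


def match_terms(annotations, ontology_labels):
    """Classify each annotation as an exact match, an embedded match, or unmatched."""
    labels = {normalise(label): label for label in ontology_labels}
    results = {}
    for annotation in annotations:
        norm = normalise(annotation)
        if norm in labels:
            results[annotation] = ("exact", labels[norm])
            continue
        # Look for an ontology label that appears as whole words inside the term.
        embedded = [label for key, label in labels.items()
                    if f" {key} " in f" {norm} "]
        if embedded:
            results[annotation] = ("embedded", max(embedded, key=len))
        else:
            results[annotation] = ("none", None)
    return results


# Hypothetical ontology labels and user-supplied annotation terms.
ontology_labels = ["liver", "left ventricle", "heart"]
annotations = ["Liver", "left ventricle of failing heart", "whole blood"]
results = match_terms(annotations, ontology_labels)
for term, outcome in results.items():
    print(term, "->", outcome)
```

In this toy setting only the first term would count as a lexical match; the second carries anatomy inside a compositional term, which is the case the conclusions identify as needing searchable cross-products.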
doi:10.1186/2041-1480-2-S4-S3
PMCID: PMC3194170  PMID: 21995944
17.  Automating generation of textual class definitions from OWL to English 
Journal of Biomedical Semantics  2011;2(Suppl 2):S5.
Background
Text definitions for entities within bio-ontologies are a cornerstone of the effort to gain a consensus in understanding and usage of those ontologies. Writing these definitions is, however, a considerable effort and there is often a lag between specification of the main part of an ontology (logical descriptions and definitions of entities) and the development of the text-based definitions. The goal of natural language generation (NLG) from ontologies is to take the logical description of entities and generate fluent natural language. The application described here uses NLG to automatically provide text-based definitions from an ontology that has logical descriptions of its entities, so avoiding the bottleneck of authoring these definitions by hand.
Results
To produce the descriptions, the program collects all the axioms relating to a given entity, groups them according to common structure, realises each group through an English sentence, and assembles the resulting sentences into a paragraph, to form as ‘coherent’ a text as possible without human intervention. Sentence generation is accomplished using a generic grammar based on logical patterns in OWL, together with a lexicon for realising atomic entities. We have tested our output for the Experimental Factor Ontology (EFO) using a simple survey strategy to explore the fluency of the generated text and how well it conveys the underlying axiomatisation. Two rounds of survey and improvement show that overall the generated English definitions are found to convey the intended meaning of the axiomatisation in a satisfactory manner. The surveys also suggested that one form of generated English will not be universally liked; that intrusion of too much ‘formal ontology’ was not liked; and that too much explicit exposure of OWL semantics was also not liked.
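The pipeline described above (collect the axioms for an entity, group them by common structure, realise each group as a sentence, assemble a paragraph) can be sketched in a few lines. This is an illustrative toy and not the authors' tool: the axiom tuples, templates, lexicon entries and identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lexicon mapping identifiers to English phrases.
LEXICON = {
    "ex:Hepatocyte": "a hepatocyte",
    "ex:EpithelialCell": "an epithelial cell",
    "ex:Liver": "a liver",
    "ex:DigestiveSystem": "a digestive system",
    "ex:partOf": "is part of",
}

# Hypothetical sentence templates keyed by axiom pattern.
TEMPLATES = {
    "subclass_of": "{subject} is {objects}",
    "some_values_from": "{subject} {prop} {objects}",
}


def join_english(items):
    """Join phrases as 'a', 'a and b', or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def define(entity, axioms):
    """Build a paragraph from (subject, pattern, property, object) axiom tuples."""
    groups = defaultdict(list)
    for subject, pattern, prop, obj in axioms:
        if subject == entity:  # collect only the axioms about this entity
            groups[(pattern, prop)].append(LEXICON[obj])
    sentences = []
    for (pattern, prop), objects in groups.items():  # one sentence per group
        text = TEMPLATES[pattern].format(
            subject=LEXICON[entity],
            prop=LEXICON.get(prop, ""),
            objects=join_english(objects),
        )
        sentences.append(text[0].upper() + text[1:] + ".")
    return " ".join(sentences)


axioms = [
    ("ex:Hepatocyte", "subclass_of", None, "ex:EpithelialCell"),
    ("ex:Hepatocyte", "some_values_from", "ex:partOf", "ex:Liver"),
    ("ex:Hepatocyte", "some_values_from", "ex:partOf", "ex:DigestiveSystem"),
]
print(define("ex:Hepatocyte", axioms))
```

Grouping before realisation is what lets the two part-of axioms share one sentence instead of producing two near-identical ones.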
Conclusions
Our prototype tools can generate reasonable paragraphs of English text that can act as definitions. The definitions were found acceptable by our survey and, as a result, the developers of EFO are sufficiently satisfied with the output that the generated definitions have been incorporated into EFO. Whilst not a substitute for hand-written textual definitions, our generated definitions are a useful starting point.
Availability
An on-line version of the NLG text definition tool can be found at http://swat.open.ac.uk/tools/. The questionnaire and sample generated text definitions may be found at http://mcs.open.ac.uk/nlg/SWAT/bio-ontologies.html.
doi:10.1186/2041-1480-2-S2-S5
PMCID: PMC3102894  PMID: 21624160
18.  Serum markers may distinguish biliary atresia from other forms of neonatal cholestasis 
Background
Biliary atresia (BA) is the most serious liver disease in infants. Diagnosis currently depends on surgical exploration of the biliary tree. Non-invasive tests that distinguish BA from other types of neonatal liver disease are not available.
Methods
To identify potential serum biomarkers that classify children with neonatal cholestasis, we performed 2-dimensional difference gel electrophoresis, statistical analysis, and tandem mass spectrometry using serum samples from 19 infants with BA and 19 infants with non-BA neonatal cholestasis.
Results
Eleven potential serum biomarkers were found that could, in combination, classify children with neonatal cholestasis.
Conclusions
Although no single biomarker or imaging test adequately distinguishes BA from other types of neonatal cholestasis, combinations of biomarkers, imaging tests and non-invasive clinical criteria should be further explored as potential tests for rapid and accurate diagnosis of BA.
doi:10.1097/MPG.0b013e3181cb42ee
PMCID: PMC2881691  PMID: 20216099
Proteomics; biliary atresia; neonatal cholestasis; biomarker
19.  A unified sample preparation protocol for proteomic and genomic profiling of cervical swabs to identify biomarkers for cervical cancer screening 
Proteomics. Clinical applications  2008;2(12):1658-1669.
Cervical cancer screening is ideally suited for the development of biomarkers due to the ease of tissue acquisition and the well-established histological transitions. Furthermore, cell and biologic fluid obtained from cervix samples undergo specific molecular changes that can be profiled. However, the ideal manner and techniques for preparing cervical samples remain to be determined. To address this critical issue, a patient screening protein and nucleic acid collection protocol was established. RNAlater was used to collect the samples, followed by proteomic methods to identify proteins that were differentially expressed in normal cervical epithelial cells versus cervical cancer cells. Three hundred ninety spots were identified via two-dimensional difference gel electrophoresis (2-D DIGE) that were expressed at either higher or lower levels (>3-fold) in cervical cancer samples. These proteomic results were compared to genes in a cDNA microarray analysis of microdissected neoplastic cervical specimens to identify overlapping patterns of expression. The most frequent pathways represented by the combined dataset were: cell cycle: G2/M DNA damage checkpoint regulation; aryl hydrocarbon receptor signaling; p53 signaling; cell cycle: G1/S checkpoint regulation; and the endoplasmic reticulum stress pathway. HNRPA2B1 was identified as a biomarker candidate with increased expression in cancer compared to normal cervix and validated by Western blot.
doi:10.1002/prca.200780146
PMCID: PMC3042129  PMID: 21136816
2-D DIGE; biomarkers; cervical cancer; cDNA microarray; RNAlater
20.  Identification and Validation of Novel Cerebrospinal Fluid Biomarkers for Staging Early Alzheimer's Disease 
PLoS ONE  2011;6(1):e16032.
Background
Ideally, disease modifying therapies for Alzheimer disease (AD) will be applied during the ‘preclinical’ stage (pathology present with cognition intact) before severe neuronal damage occurs, or upon recognizing very mild cognitive impairment. Developing and judiciously administering such therapies will require biomarker panels to identify early AD pathology, classify disease stage, monitor pathological progression, and predict cognitive decline. To discover such biomarkers, we measured AD-associated changes in the cerebrospinal fluid (CSF) proteome.
Methods and Findings
CSF samples from individuals with mild AD (Clinical Dementia Rating [CDR] 1) (n = 24) and cognitively normal controls (CDR 0) (n = 24) were subjected to two-dimensional difference-in-gel electrophoresis. Within 119 differentially-abundant gel features, mass spectrometry (LC-MS/MS) identified 47 proteins. For validation, eleven proteins were re-evaluated by enzyme-linked immunosorbent assays (ELISA). Six of these assays (NrCAM, YKL-40, chromogranin A, carnosinase I, transthyretin, cystatin C) distinguished CDR 1 and CDR 0 groups and were subsequently applied (with tau, p-tau181 and Aβ42 ELISAs) to a larger independent cohort (n = 292) that included individuals with very mild dementia (CDR 0.5). Receiver-operating characteristic curve analyses using stepwise logistic regression yielded optimal biomarker combinations to distinguish CDR 0 from CDR>0 (tau, YKL-40, NrCAM) and CDR 1 from CDR<1 (tau, chromogranin A, carnosinase I) with areas under the curve of 0.90 (0.85–0.94 95% confidence interval [CI]) and 0.88 (0.81–0.94 CI), respectively.
Conclusions
Four novel CSF biomarkers for AD (NrCAM, YKL-40, chromogranin A, carnosinase I) can improve the diagnostic accuracy of Aβ42 and tau. Together, these six markers describe six clinicopathological stages from cognitive normalcy to mild dementia, including stages defined by increased risk of cognitive decline. Such a panel might improve clinical trial efficiency by guiding subject enrollment and monitoring disease progression. Further studies will be required to validate this panel and evaluate its potential for distinguishing AD from other dementing conditions.
doi:10.1371/journal.pone.0016032
PMCID: PMC3020224  PMID: 21264269
21.  ArrayExpress update—an archive of microarray and high-throughput sequencing-based functional genomics experiments 
Nucleic Acids Research  2010;39(Database issue):D1002-D1004.
The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archive is closely integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Advanced queries provided via ontology-enabled interfaces include queries based on technology and sample attributes such as disease, cell types and anatomy.
doi:10.1093/nar/gkq1040
PMCID: PMC3013660  PMID: 21071405
22.  Cutting Edge: Identification of a Pre-Ligand Assembly Domain (PLAD) and Ligand Binding Site in the IL-17 Receptor 
IL-17 is the hallmark cytokine of the newly described “Th17” lymphocyte population. The composition, subunit dynamics, and ligand contacts of the IL-17 receptor are poorly defined. We previously demonstrated that the IL-17RA subunit oligomerizes in the membrane without a ligand. In this study, computational modeling identified two fibronectin-III-like (FN) domains in IL-17RA connected by a nonstructured linker, which we predicted to mediate homotypic interactions. In yeast two-hybrid, the membrane-proximal FN domain (FN2), but not the membrane-distal domain (FN1), formed homomeric interactions. The ability of FN2 to drive ligand-independent multimerization was verified by coimmunoprecipitation and fluorescence resonance energy transfer microscopy. Thus, FN2 constitutes a “pre-ligand assembly domain” (PLAD). Further studies indicated that the FN2 linker domain contains the IL-17 binding site, which was never mapped. However, the FN1 domain is also required for high affinity interactions with IL-17. Therefore, although the PLAD is located entirely within FN2, effective ligand binding also involves contributions from the linker and FN1.
PMCID: PMC2973996  PMID: 17982023
23.  Cross-species comparison of orthologous gene expression in human bladder cancer and carcinogen-induced rodent models 
Genes differentially expressed by tumor cells represent promising drug targets for anti-cancer therapy. Such candidate genes need to be validated in appropriate animal models. This study examined the suitability of rodent models of bladder cancer in B6D2F1 mice and Fischer-344 rats to model clinical bladder cancer specimens in humans. Using a global gene expression approach, cross-species analysis showed that 13-34% of total genes in the genome were differentially expressed between tumor and normal tissues in each of five datasets from humans, rats, and mice. About 20% of these differentially expressed genes overlapped among species, corresponding to 2.6 to 4.8% of total genes in the genome. Several genes were consistently dysregulated in bladder tumors in both humans and rodents. Notably, CNN1, MYL9, PDLIM3, ITIH5, MYH11, PCP4 and FMO5 were found to be commonly down-regulated, while TOP2A, CCNB2, KIF20A and RRM2 were up-regulated. These genes are likely to have conserved functions contributing to bladder carcinogenesis. Gene set enrichment analysis detected a number of molecular pathways commonly activated in both human and rodent bladder cancer. These pathways affect the cell cycle, HIF-1 and MYC expression, and regulation of apoptosis. We also compared expression changes at mRNA and protein levels in the rat model and identified several genes/proteins exhibiting concordant changes in bladder tumors, including ANXA1, ANXA2, CA2, KRT14, LDHA, LGALS4, SERPINA1, KRT18 and LDHB. In general, rodent models of bladder cancer represent the clinical disease to an extent that will allow successful mining of target genes and permit studies on the molecular mechanisms of bladder carcinogenesis.
PMCID: PMC2981423  PMID: 21139803
Human bladder cancer; rodent models; gene expression; proteomics; cross-species comparison
24.  Modeling biomedical experimental processes with OBI 
Journal of Biomedical Semantics  2010;1(Suppl 1):S7.
Background
Experimental descriptions are typically stored as free text without using standardized terminology, creating challenges in comparison, reproduction and analysis. These difficulties impose limitations on data exchange and information retrieval.
Results
The Ontology for Biomedical Investigations (OBI), developed as a global, cross-community effort, provides a resource that represents biomedical investigations in an explicit and integrative framework. Here we detail three real-world applications of OBI, provide detailed modeling information and explain how to use OBI.
Conclusion
We demonstrate how OBI can be applied to different biomedical investigations to both facilitate interpretation of the experimental process and increase the computational processing and integration within the Semantic Web. The logical definitions of the entities involved allow computers to unambiguously understand and integrate different biological experimental processes and their relevant components.
Availability
OBI is available at http://purl.obolibrary.org/obo/obi/2009-11-02/obi.owl
doi:10.1186/2041-1480-1-S1-S7
PMCID: PMC2903726  PMID: 20626927
25.  Proteomic Analysis of Anoxia Tolerance in the Developing Zebrafish Embryo 
While some species and tissue types are injured by oxygen deprivation, anoxia tolerant organisms display a protective response that has not been fully elucidated and is well-suited to genomic and proteomic analysis. However, such methodologies have focused on transcriptional responses, prolonged anoxia, or have used cultured cells or isolated tissues. In this study of intact zebrafish embryos, a species capable of >24 h survival in anoxia, we have utilized 2D difference in gel electrophoresis to identify changes in the proteomic profile caused by near-lethal anoxic durations as well as acute anoxia (1 h), a timeframe relevant to ischemic events in human disease when response mechanisms are largely limited to post-transcriptional and post-translational processes. We observed a general stabilization of the proteome in anoxia. Proteins involved in oxidative phosphorylation, antioxidant defense, transcription, and translation changed over this time period. Among the largest proteomic alterations was that of muscle cofilin 2, implicating the regulation of the cytoskeleton and actin assembly in the adaptation to acute anoxia. These studies in an intact embryo highlight proteomic components of an adaptive response to anoxia in a model organism amenable to genetic analysis to permit further mechanistic insight into the phenomenon of anoxia tolerance.
doi:10.1016/j.cbd.2008.09.003
PMCID: PMC2858231  PMID: 20403745
Anoxia; proteomic; zebrafish
