1.  The Drosophila T-box transcription factor Midline functions within the Notch–Delta signaling pathway to specify sensory organ precursor cell fates and regulates cell survival within the eye imaginal disc 
Mechanisms of development  2013;130(0):577-601.
We report that the T-box transcription factor Midline (Mid), an evolutionarily conserved homolog of the vertebrate Tbx20 protein, functions within the Notch–Delta signaling pathway essential for specifying the fates of sensory organ precursor (SOP) cells. This complements an established history of research showing that Mid regulates the cell-fate specification of diverse cell types within the developing heart, epidermis and central nervous system. Tbx20 has been detected in diverse neuronal and epithelial cells of embryonic eye tissues in both mice and humans. However, the mechanisms by which either Mid or Tbx20 function to regulate cell-fate specification or other critical aspects of eye development, including cell survival, have not yet been elucidated. We have also gathered preliminary evidence suggesting that Mid may play an indirect but vital role in selecting SOP cells within the third-instar larval eye disc by regulating the expression of the proneural gene atonal. During subsequent pupal stages, Mid specifies SOP cell fates as a member of the Notch–Delta signaling hierarchy and is essential for maintaining cell viability by inhibiting apoptotic pathways. We present several new hypotheses that seek to understand the role of Mid in regulating developmental processes downstream of the Notch receptor that are critical for specifying unique cell fates, patterning the adult eye and maintaining cellular homeostasis during eye disc morphogenesis.
PMCID: PMC4500660  PMID: 23962751
Midline; Tbx20; Notch; Delta; Extramacrochaetae; Atonal; Senseless; Sensory Organ Precursor Cell; Baculovirus p35; Apoptosis
2.  Semantic Web repositories for genomics data using the eXframe platform 
Journal of Biomedical Semantics  2014;5(Suppl 1):S3.
With the advent of inexpensive assay technologies, there has been an unprecedented growth in genomics data as well as the number of databases in which it is stored. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a machine-readable format, or for standardized queries using SPARQL. This makes large-scale reuse or integration with other knowledge bases very difficult.
To address this challenge, we have developed the second generation of our eXframe platform, a reusable framework for creating online repositories of genomics experiments. This second generation model now publishes Semantic Web data. To accomplish this, we created an experiment model that covers provenance, citations, external links, assays, biomaterials used in the experiment, and the data collected during the process. The elements of our model are mapped to classes and properties from various established biomedical ontologies. Resource Description Framework (RDF) data is automatically produced using these mappings and indexed in an RDF store with a built-in SPARQL Protocol and RDF Query Language (SPARQL) endpoint.
Using the open-source eXframe software, institutions and laboratories can create Semantic Web repositories of their experiments, integrate them with heterogeneous resources and make them interoperable with the vast Semantic Web of biomedical knowledge.
PMCID: PMC4108874  PMID: 25093072
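The mapping step described in the eXframe entry above (model elements mapped to ontology properties, then serialized as RDF) can be illustrated with a minimal sketch. It assumes a flat experiment record and emits N-Triples lines using only the Python standard library. The field names, the `example.org` property URIs, and the function names are hypothetical; the abstract does not list eXframe's actual mappings or implementation.

```python
# Hypothetical field-to-property mappings. Only the Dublin Core "title"
# URI is a real vocabulary term; the example.org URIs are placeholders.
FIELD_TO_PROPERTY = {
    "title": "http://purl.org/dc/terms/title",
    "assay_type": "http://example.org/vocab/assayType",
    "biomaterial": "http://example.org/vocab/biomaterial",
}


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def experiment_to_ntriples(subject_uri: str, record: dict) -> list:
    """Return one N-Triples line per mapped field; unmapped fields are skipped."""
    lines = []
    for field, value in sorted(record.items()):
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # no ontology mapping defined for this field
        lines.append(f'<{subject_uri}> <{prop}> "{escape_literal(str(value))}" .')
    return lines


if __name__ == "__main__":
    record = {"title": "Example expression study", "assay_type": "microarray"}
    for line in experiment_to_ntriples("http://example.org/experiment/1", record):
        print(line)
```

Lines produced this way could then be loaded into any RDF store that accepts N-Triples and queried through its SPARQL endpoint.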
3.  Pain Research Forum: application of scientific social media frameworks in neuroscience 
Background: Social media has the potential to accelerate the pace of biomedical research through online collaboration, discussions, and faster sharing of information. Focused web-based scientific social collaboratories such as the Alzheimer Research Forum have been successful in engaging scientists in open discussions of the latest research and identifying gaps in knowledge. However, until recently, tools to rapidly create such communities and provide high-bandwidth information exchange between collaboratories in related fields did not exist.
Methods: We have addressed this need by constructing a reusable framework to build online biomedical communities, based on Drupal, an open-source content management system. The framework incorporates elements of Semantic Web technology combined with social media. Here we present, as an exemplar of a web community built on our framework, the Pain Research Forum (PRF). PRF is a community of chronic pain researchers, established with the goal of fostering collaboration and communication among pain researchers.
Results: Launched in 2011, PRF has over 1300 registered members with permission to submit content. It currently hosts over 150 topical news articles on research; more than 30 active or archived forum discussions and journal club features; a webinar series; an editor-curated weekly updated listing of relevant papers; and several other resources for the pain research community. All content is licensed for reuse under a Creative Commons license; the software is freely available. The framework was reused to develop other sites, notably the Multiple Sclerosis Discovery Forum and StemBook.
Discussion: Web-based collaboratories are a crucial integrative tool supporting rapid information transmission and translation in several important research areas. In this article, we discuss the success factors, lessons learned, and ongoing challenges in using PRF as a driving force to develop tools for online collaboration in neuroscience. We also indicate ways these tools can be applied to other areas and uses.
PMCID: PMC3949323  PMID: 24653693
social media; neuropathic pain; content management systems; Drupal
4.  The Stem Cell Commons: an exemplar for data integration in the biomedical domain driven by the ISA framework 
Comparisons of stem cell experiments at both molecular and semantic levels remain challenging due to inconsistencies in results, data formats, and descriptions among biomedical research discoveries. The Harvard Stem Cell Institute (HSCI) has created the Stem Cell Commons, an open, community-based approach to data sharing. Experimental information is integrated using the Investigation-Study-Assay tabular format (ISA-Tab) used by over 30 organizations (ISA Commons). The early adoption of this format permitted the novel integration of three independent systems to facilitate stem cell data storage, exchange and analysis: the Blood Genomics Repository, the Stem Cell Discovery Engine, and the new Refinery platform that links the Galaxy analytical engine to data repositories.
PMCID: PMC3814497  PMID: 24303302
5.  Toward interoperable bioscience data 
Nature genetics  2012;44(2):121-126.
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open ‘data commoning’ culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared ‘Investigation-Study-Assay’ framework to support that vision.
PMCID: PMC3428019  PMID: 22281772
6.  Genome-Wide Histone Acetylation Is Altered in a Transgenic Mouse Model of Huntington's Disease 
PLoS ONE  2012;7(7):e41423.
In Huntington's disease (HD; MIM ID #143100), a fatal neurodegenerative disorder, transcriptional dysregulation is a key pathogenic feature. Histone modifications are altered in multiple cellular and animal models of HD, suggesting a potential mechanism for the observed changes in transcriptional levels. In particular, previous work has suggested an important link between decreased histone acetylation, particularly acetylated histone H3 (AcH3; H3K9K14ac), and downregulated gene expression. However, the question remains whether changes in histone modifications correlate with transcriptional abnormalities across the entire transcriptome. Using chromatin immunoprecipitation paired with microarray hybridization (ChIP-chip), we interrogated AcH3-gene interactions genome-wide in striata of 12-week-old wild-type (WT) and transgenic (TG) R6/2 mice, an HD mouse model, and correlated these interactions with gene expression levels. At the level of the individual gene, we found decreases in the number of sites occupied by AcH3 in the TG striatum. In addition, the total number of genes bound by AcH3 was decreased. Surprisingly, the loss of AcH3 binding sites occurred within the coding regions of the genes rather than at the promoter region. We also found that the presence of AcH3 at any location within a gene strongly correlated with the presence of its transcript in both WT and TG striatum. In the TG striatum, treatment with histone deacetylase (HDAC) inhibitors increased global AcH3 levels with concomitant increases in transcript levels; however, AcH3 binding at select gene loci increased only slightly. This study demonstrates that histone H3 acetylation at lysine residues 9 and 14 and active gene expression are intimately tied in the rodent brain, and that this fundamental relationship remains unchanged in an HD mouse model despite genome-wide decreases in histone H3 acetylation.
PMCID: PMC3407195  PMID: 22848491
7.  eXframe: reusable framework for storage, analysis and visualization of genomics experiments 
BMC Bioinformatics  2011;12:452.
Genome-wide experiments are routinely conducted to measure gene expression, DNA-protein interactions and epigenetic status. Structured metadata for these experiments is imperative for a complete understanding of experimental conditions, to enable consistent data processing and to allow retrieval, comparison, and integration of experimental results. Even though several repositories have been developed for genomics data, only a few provide annotation of samples and assays using controlled vocabularies. Moreover, many of them are tailored for a single type of technology or measurement and do not support the integration of multiple data types.
We have developed eXframe, a reusable web-based framework for genomics experiments that provides 1) the ability to publish structured data compliant with accepted standards, 2) support for multiple data types including microarrays and next generation sequencing, and 3) query, analysis and visualization integration tools (enabled by consistent processing of the raw data and annotation of samples). It is available as open-source software. We present two case studies where this software is currently being used to build repositories of genomics experiments: one contains data from hematopoietic stem cells and another from Parkinson's disease patients.
The web-based framework eXframe offers structured annotation of experiments as well as uniform processing and storage of molecular data from microarray and next generation sequencing platforms. The framework allows users to query and integrate information across species, technologies, measurement types and experimental conditions. Our framework is reusable and freely modifiable - other groups or institutions can deploy their own custom web-based repositories based on this software. It is interoperable with the most important data formats in this domain. We hope that other groups will not only use eXframe, but also contribute their own useful modifications.
PMCID: PMC3235155  PMID: 22103807
8.  An open annotation ontology for science on web 3.0 
Journal of Biomedical Semantics  2011;2(Suppl 2):S4.
There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges.
Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work.
This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under an open source license, and extensive documentation including screencasts is available on AO’s Google Code page.
The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors.
PMCID: PMC3102893  PMID: 21624159
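The idea of stand-off annotation in the AO entry above (metadata stored separately from the document and anchored to positions in it) can be sketched as follows. This is an illustrative data structure only, assuming a simple character-offset anchor, which is one of several possible anchoring methods. The class, field names and URIs are hypothetical and are not AO's actual OWL classes or properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffAnnotation:
    """Annotation kept outside the document it describes (hypothetical model)."""
    document_uri: str  # the annotated document, which the annotator need not control
    start: int         # character offset where the annotated span begins
    end: int           # character offset one past the end of the span
    exact: str         # the text expected at [start, end), used to detect drift
    term_uri: str      # ontology term attached to the span
    created_by: str    # minimal provenance: who or what produced the annotation


def resolves(annotation: StandoffAnnotation, document_text: str) -> bool:
    """Return True if the anchored offsets still select the expected text."""
    return document_text[annotation.start:annotation.end] == annotation.exact


if __name__ == "__main__":
    text = "Mutations in HTT cause Huntington's disease."
    start = text.index("HTT")
    annotation = StandoffAnnotation(
        document_uri="http://example.org/paper/1",
        start=start,
        end=start + len("HTT"),
        exact="HTT",
        term_uri="http://example.org/ontology/HTT",
        created_by="text-mining-pipeline",
    )
    print(resolves(annotation, text))
```

Storing the exact text alongside the offsets lets a consumer notice when the document has changed underneath the annotation, which matters because the document is not under the annotator's update control.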

