1.  Community-driven development for computational biology at Sprints, Hackathons and Codefests 
BMC Bioinformatics  2014;15(Suppl 14):S7.
Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches.
Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights into a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large.
Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together.
This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities.
Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.
PMCID: PMC4255748  PMID: 25472764
2.  Implementation of Cloud based Next Generation Sequencing data analysis in a clinical laboratory 
BMC Research Notes  2014;7:314.
The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges remain limiting the widespread adoption of NGS testing into clinical practice. One such difficulty includes the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive to most clinical diagnostics laboratories.
To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides additional flexibility, needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the usage of EBS disk to run a sample.
We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.
PMCID: PMC4036707  PMID: 24885806
Next generation sequencing; Cloud computing; Variant detection; Molecular diagnostics
3.  Using iRT, a normalized retention time for more targeted measurement of peptides 
Proteomics  2012;12(8):1111-1121.
Multiple reaction monitoring (MRM) has recently become the method of choice for targeted quantitative measurement of proteins using mass spectrometry. The method, however, is limited in the number of peptides that can be measured in one run. This number can be markedly increased by scheduling the acquisition if the accurate retention time (RT) of each peptide is known.
Here we present iRT, an empirically derived dimensionless peptide-specific value that allows for highly accurate RT prediction. The iRT of a peptide is a fixed number relative to a standard set of reference iRT-peptides that can be transferred across laboratories and chromatographic systems.
We show that iRT facilitates the setup of multiplexed experiments with acquisition windows more than 4 times smaller than those based on in silico RT predictions, resulting in improved quantification accuracy. iRTs can be determined by any laboratory and shared transparently. The iRT concept has been implemented in Skyline, the most widely used software for MRM experiments.
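The idea of transferring a fixed iRT value to a particular chromatographic system can be sketched as a simple linear calibration. This is a minimal illustration only, assuming that measured retention times of the reference peptides relate linearly to their iRT values; the function names, the reference values and the window width are hypothetical and are not taken from the paper or from Skyline.

```python
def fit_line(irt_values, measured_rts):
    """Least-squares fit of measured RT (minutes) against iRT for reference peptides."""
    n = len(irt_values)
    mean_x = sum(irt_values) / n
    mean_y = sum(measured_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in irt_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(irt_values, measured_rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rt(irt, slope, intercept):
    """Convert a peptide's dimensionless iRT into an expected RT on this system."""
    return slope * irt + intercept


def scheduling_window(irt, slope, intercept, width):
    """Return (start, end) of an acquisition window centred on the predicted RT."""
    center = predict_rt(irt, slope, intercept)
    return center - width / 2, center + width / 2


# Hypothetical reference peptides: fixed iRT values and RTs measured in one run.
reference_irt = [0.0, 25.0, 50.0, 75.0, 100.0]
reference_rt = [10.0, 15.0, 20.0, 25.0, 30.0]

slope, intercept = fit_line(reference_irt, reference_rt)
window = scheduling_window(60.0, slope, intercept, width=2.0)
```

Because only the calibration line changes between systems, the same iRT value can be reused wherever the reference peptides have been measured.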
PMCID: PMC3918884  PMID: 22577012
Mass spectrometry; multiplexing; proteomics methods; optimization; quantitative analysis
4.  Workflow for analysis of high mass accuracy salivary data set using MaxQuant and ProteinPilot search algorithm 
Proteomics  2012;12(11):1726-1730.
LTQ Orbitrap data analyzed with ProteinPilot can be further improved by MaxQuant raw data processing, which utilizes precursor-level high mass accuracy data for peak processing and MGF creation. In particular, ProteinPilot results from MaxQuant-processed peaklists for Orbitrap data sets resulted in improved spectral utilization due to an improved peaklist quality with higher precision and high precursor mass accuracy (HPMA). The output and postsearch analysis tools of both workflows were utilized for previously unexplored features of a three-dimensional fractionated and hexapeptide library (ProteoMiner) treated whole saliva data set comprising 200 fractions. ProteinPilot’s ability to simultaneously predict multiple modifications showed an advantage from ProteoMiner treatment for modified peptide identification. We demonstrate that complementary approaches in the analysis pipeline provide comprehensive results for the whole saliva data set acquired on an LTQ Orbitrap. Overall our results establish a workflow for improved protein identification from high mass accuracy data.
PMCID: PMC3618284  PMID: 22623410
Bioinformatics; Combined workflows; Descriptive statistics; High precursor mass accuracy and peaklist quality
6.  Drebrin controls neuronal migration through the formation and alignment of the leading process 
Formation of a functional nervous system requires neurons to migrate to the correct place within the developing brain. Tangentially migrating neurons are guided by a leading process which extends towards the target and is followed by the cell body. How environmental cues are coupled to specific cytoskeletal changes to produce and guide leading process growth is unknown. One such cytoskeletal modulator is drebrin, an actin-binding protein known to induce protrusions in many cell types and be important for regulating neuronal morphology.
Using the migration of oculomotor neurons as a model, we have shown that drebrin is necessary for the generation and guidance of the leading process. In the absence of drebrin, leading processes are not formed and cells fail to migrate although axon growth and pathfinding appear grossly unaffected. Conversely, when levels of drebrin are elevated the leading processes turn away from their target and as a result the motor neuron cell bodies move along abnormal paths within the brain. The aberrant trajectories were highly reproducible suggesting that drebrin is required to interpret specific guidance cues. The axons and growth cones of these neurons display morphological changes, particularly increased branching and filopodial number but despite this they extend along normal developmental pathways.
Collectively these results show that drebrin is initially necessary for the formation of a leading process and subsequently for this to respond to navigational signals and grow in the correct direction. Furthermore, we have shown that the actions of drebrin can be segregated within individual motor neurons to direct their migration independently of axon guidance.
PMCID: PMC3356577  PMID: 22306864
OMN, oculomotor nucleus; PCN, precerebellar nuclei; YFP, yellow fluorescent protein; Drebrin; Actin-binding; Migration; Leading process; Oculomotor
7.  LTQ-iQuant: A freely-available software pipeline for automated and accurate protein quantification of isobaric tagged peptide data from LTQ instruments 
Proteomics  2010;10(19):3533-3538.
Pulsed Q dissociation enables combining LTQ ion trap instruments with isobaric peptide tagging. Unfortunately, this combination lacks a technique which accurately reports protein abundance ratios and is implemented in a freely-available, flexible software pipeline. We developed and implemented a technique assigning collective reporter ion intensity-based weights to each peptide abundance ratio and calculating a protein’s weighted average abundance ratio and P value. Using an iTRAQ-labeled standard mixture, we compared our technique’s performance to the commercial software Mascot, finding that it performed better than Mascot’s non-weighted averaging and median peptide ratio techniques, and equal to its weighted averaging technique. We also compared performance of the LTQ-Orbitrap plus our technique to 4800 MALDI TOF/TOF plus Protein Pilot, by analyzing an iTRAQ-labeled stem cell lysate. We found highly correlated protein abundance ratios, indicating that the LTQ-Orbitrap plus our technique yields results comparable to the current standard. We implemented our technique in a freely available, automated software pipeline, called LTQ-iQuant, which: is mzXML-compatible; supports iTRAQ 4-plex and 8-plex LTQ data; and can be modified for and have weights trained to a user’s LTQ and other isobaric peptide tagging methods. LTQ-iQuant should make LTQ instruments and isobaric peptide tagging accessible to more proteomics researchers.
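The weighted averaging described above can be sketched as follows. This is an illustration under stated assumptions, not the LTQ-iQuant implementation: here each peptide's weight is simply its summed reporter ion intensity and ratios are averaged in log2 space, whereas the abstract only states that weights are based on collective reporter ion intensity. The function name and input layout are hypothetical, and the P value calculation is omitted.

```python
import math


def weighted_protein_ratio(peptides):
    """Weighted average abundance ratio for one protein.

    peptides: list of (reporter_a, reporter_b) intensity pairs, one per peptide.
    Assumption: weight = summed reporter intensity of the peptide.
    """
    total_weight = 0.0
    weighted_log_sum = 0.0
    for reporter_a, reporter_b in peptides:
        if reporter_a <= 0 or reporter_b <= 0:
            continue  # skip peptides with a missing reporter signal
        weight = reporter_a + reporter_b
        weighted_log_sum += weight * math.log2(reporter_a / reporter_b)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no peptide with usable reporter ion intensities")
    return 2 ** (weighted_log_sum / total_weight)


# Hypothetical reporter intensities for two peptides of the same protein.
consistent = weighted_protein_ratio([(200.0, 100.0), (400.0, 200.0)])
mixed = weighted_protein_ratio([(100.0, 100.0), (300.0, 100.0)])
```

In the second example the more intense peptide pulls the protein ratio towards its own ratio of 3, which is the intended effect of intensity-based weighting compared with a plain average.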
PMCID: PMC3025484  PMID: 20821806
isobaric tags; LTQ; protein quantification; open source software; weighted average
8.  Human CHN1 mutations hyperactivate α2-chimaerin and cause Duane’s retraction syndrome 
Science (New York, N.Y.)  2008;321(5890):839-843.
The RacGAP molecule α2-chimaerin is implicated in neuronal signaling pathways required for precise guidance of developing corticospinal axons. We now demonstrate that a variant of Duane’s retraction syndrome, a congenital eye movement disorder in which affected individuals show aberrant development of axon projections to the extraocular muscles, can result from gain-of-function heterozygous missense mutations in CHN1 that increase α2-chimaerin RacGAP activity in vitro. A subset of mutations enhances α2-chimaerin membrane translocation and/or α2-chimaerin’s previously unrecognized ability to form a complex with itself. In ovo expression of mutant CHN1 alters the development of ocular motor axons. These data demonstrate that human CHN1 mutations can hyperactivate α2-chimaerin and result in aberrant cranial motor neuron development.
PMCID: PMC2593867  PMID: 18653847
9.  Dimerization of Protein Tyrosine Phosphatase σ Governs both Ligand Binding and Isoform Specificity
Molecular and Cellular Biology  2006;27(5):1795-1808.
Signaling through receptor protein tyrosine phosphatases (RPTPs) can influence diverse processes, including axon development, lymphocyte activation, and cell motility. The molecular regulation of these enzymes, however, is still poorly understood. In particular, it is not known if, or how, the dimerization state of RPTPs is related to the binding of extracellular ligands. Protein tyrosine phosphatase σ (PTPσ) is an RPTP with major isoforms that differ in their complements of fibronectin type III domains and in their ligand-binding specificities. In this study, we show that PTPσ forms homodimers in the cell, interacting at least in part through the transmembrane region. Using this knowledge, we provide the first evidence that PTPσ ectodomains must be presented as dimers in order to bind heterophilic ligands. We also provide evidence of how alternative use of fibronectin type III domain complements in two major isoforms of PTPσ can alter the ligand binding specificities of PTPσ ectodomains. The data suggest that the alternative domains function largely to change the rotational conformations of the amino-terminal ligand binding sites of the ectodomain dimers, thus imparting novel ligand binding properties. These findings have important implications for our understanding of how heterophilic ligands interact with, and potentially regulate, RPTPs.
PMCID: PMC1820468  PMID: 17178832
10.  Tubulin tyrosination is a major factor affecting the recruitment of CAP-Gly proteins at microtubule plus ends 
The Journal of Cell Biology  2006;174(6):839-849.
Tubulin-tyrosine ligase (TTL), the enzyme that catalyzes the addition of a C-terminal tyrosine residue to α-tubulin in the tubulin tyrosination cycle, is involved in tumor progression and has a vital role in neuronal organization. We show that in mammalian fibroblasts, cytoplasmic linker protein (CLIP) 170 and other microtubule plus-end tracking proteins comprising a cytoskeleton-associated protein glycine-rich (CAP-Gly) microtubule binding domain such as CLIP-115 and p150 Glued, localize to the ends of tyrosinated microtubules but not to the ends of detyrosinated microtubules. In vitro, the head domains of CLIP-170 and of p150 Glued bind more efficiently to tyrosinated microtubules than to detyrosinated polymers. In TTL-null fibroblasts, tubulin detyrosination and CAP-Gly protein mislocalization correlate with defects in both spindle positioning during mitosis and cell morphology during interphase. These results indicate that tubulin tyrosination regulates microtubule interactions with CAP-Gly microtubule plus-end tracking proteins and provide explanations for the involvement of TTL in tumor progression and in neuronal organization.
PMCID: PMC2064338  PMID: 16954346

