
Results 1-7 (7)

1.  EpiToolKit—a web-based workbench for vaccine design 
Bioinformatics  2015;31(13):2211-2213.
Summary: EpiToolKit is a virtual workbench for immunological questions with a focus on vaccine design. It offers an array of immunoinformatics tools covering MHC genotyping, epitope and neo-epitope prediction, epitope selection for vaccine design, and epitope assembly. In its recently re-implemented version 2.0, EpiToolKit provides a range of new functionality and for the first time allows combining tools into complex workflows. For inexperienced users, it offers simplified interfaces that guide them through the analysis of complex immunological data sets.
Availability and implementation:
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4481845  PMID: 25712691
2.  OptiType: precision HLA typing from next-generation sequencing data 
Bioinformatics  2014;30(23):3310-3316.
Motivation: The human leukocyte antigen (HLA) gene cluster plays a crucial role in adaptive immunity and is thus relevant in many biomedical applications. While next-generation sequencing data are often available for a patient, deducing the HLA genotype is difficult because of substantial sequence similarity within the cluster and exceptionally high variability of the loci. Established approaches, therefore, rely on specific HLA enrichment and sequencing techniques, coming at an additional cost and extra turnaround time.
Results: We present OptiType, a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate predictions from NGS data not specifically enriched for the HLA cluster. We also present a comprehensive benchmark dataset consisting of RNA, exome and whole-genome sequencing data. OptiType significantly outperformed previously published in silico approaches with an overall accuracy of 97%, enabling its use in a broad range of applications.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4441069  PMID: 25143287
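The abstract of entry 2 does not spell out the integer linear program, so the following is only a rough illustration of the underlying idea: choosing the allele pair that explains the most reads. It is a minimal sketch, not OptiType's method; exhaustive search stands in for an ILP solver, and the function name `best_allele_pair`, the read IDs and the read-to-allele hit table are invented for the example.

```python
from itertools import combinations_with_replacement


def best_allele_pair(hits):
    """Return the allele pair explaining the most reads, plus that read count.

    hits: dict mapping an allele name to the set of read ids compatible with it
    (hypothetical input format, assumed for this sketch).
    """
    alleles = sorted(hits)
    best_pair, best_count = None, -1
    # Homozygous calls are allowed, hence combinations *with* replacement.
    for a, b in combinations_with_replacement(alleles, 2):
        explained = len(hits[a] | hits[b])
        if explained > best_count:
            best_pair, best_count = (a, b), explained
    return best_pair, best_count


# Toy data: seven reads mapped against three candidate alleles of one locus.
toy_hits = {
    "A*01:01": {1, 2, 3, 4},
    "A*02:01": {3, 4, 5, 6, 7},
    "A*03:01": {1, 2},
}
pair, count = best_allele_pair(toy_hits)
print(pair, count)
```

Exhaustive enumeration is only feasible for a handful of candidate alleles; an optimization formulation such as the one the abstract mentions is what makes the problem tractable at realistic scale.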
3.  Going from where to why—interpretable prediction of protein subcellular localization 
Bioinformatics  2010;26(9):1232-1238.
Motivation: Protein subcellular localization is pivotal in understanding a protein's function. Computational prediction of subcellular localization has become a viable alternative to experimental approaches. While current machine learning-based methods yield good prediction accuracy, most of them suffer from two key problems: lack of interpretability and dealing with multiple locations.
Results: We present YLoc, a novel method for predicting protein subcellular localization that addresses these issues. Due to its simple architecture, YLoc can identify the relevant features of a protein sequence contributing to its subcellular localization, e.g. localization signals or motifs relevant to protein sorting. We present several example applications where YLoc identifies the sequence features responsible for protein localization, and thus reveals not only to which location a protein is transported, but also why it is transported there. YLoc also provides a confidence estimate for the prediction. Thus, the user can decide what level of error is acceptable for a prediction. Due to a probabilistic approach and the use of several thousands of dual-targeted proteins, YLoc is able to predict multiple locations per protein. YLoc was benchmarked using several independent datasets for protein subcellular localization and performs on par with other state-of-the-art predictors. Disregarding low-confidence predictions, YLoc can achieve prediction accuracies of over 90%. Moreover, we show that YLoc is able to reliably predict multiple locations and outperforms the best predictors in this area.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2859129  PMID: 20299325
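Entry 3 notes that accuracy rises once low-confidence predictions are disregarded. The sketch below shows how such a confidence cut-off could be applied to a list of predictions. It is an assumption-laden illustration: the function `accuracy_above`, the tuple layout and the toy values are hypothetical and are not taken from YLoc or its benchmarks.

```python
def accuracy_above(predictions, threshold):
    """Accuracy over predictions whose confidence is at least `threshold`.

    predictions: iterable of (predicted_location, true_location, confidence)
    tuples (hypothetical layout, assumed for this sketch).
    Returns (accuracy, number_of_predictions_kept); accuracy is None if
    nothing passes the threshold.
    """
    kept = [(pred, true) for pred, true, conf in predictions if conf >= threshold]
    if not kept:
        return None, 0
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), len(kept)


# Toy data, invented for illustration only.
toy_predictions = [
    ("nucleus", "nucleus", 0.95),
    ("cytoplasm", "cytoplasm", 0.90),
    ("mitochondrion", "cytoplasm", 0.55),
    ("nucleus", "nucleus", 0.85),
    ("cytoplasm", "nucleus", 0.40),
]
all_acc, all_n = accuracy_above(toy_predictions, 0.0)
high_acc, high_n = accuracy_above(toy_predictions, 0.8)
print(all_acc, all_n, high_acc, high_n)
```

Raising the threshold trades coverage for accuracy: fewer proteins receive a prediction, but those that do are more likely to be correct.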
4.  A novel algorithm for detecting differentially regulated paths based on gene set enrichment analysis 
Bioinformatics  2009;25(21):2787-2794.
Motivation: Deregulated signaling cascades are known to play a crucial role in many pathogenic processes, among them tumor initiation and progression. In the recent past, modern experimental techniques that allow for measuring the amount of mRNA transcripts of almost all known human genes in a tissue or even in a single cell have opened new avenues for studying the activity of the signaling cascades and for understanding the information flow in the networks.
Results: We present a novel dynamic programming algorithm for detecting deregulated signaling cascades. The so-called FiDePa (Finding Deregulated Paths) algorithm interprets differences in the expression profiles of tumor and normal tissues. It relies on the well-known gene set enrichment analysis (GSEA) and efficiently detects all paths in a given regulatory or signaling network that are significantly enriched with differentially expressed genes or proteins. Since our algorithm allows for comparing a single tumor expression profile with the control group, it facilitates the detection of specific regulatory features of a tumor that may help to optimize tumor therapy. To demonstrate the capabilities of our algorithm, we analyzed a glioma expression dataset with respect to a directed graph that combined the regulatory networks of the KEGG and TRANSPATH databases. The resulting glioma consensus network that encompasses all detected deregulated paths contained many genes and pathways that are known to be key players in glioma or cancer-related pathogenic processes. Moreover, we were able to correlate clinically relevant features like necrosis or metastasis with the detected paths.
Availability: C++ source code is freely available, BiNA can be downloaded from
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2781748  PMID: 19713416
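Entry 4 builds on gene set enrichment analysis. As background, the sketch below computes a simple unweighted GSEA-style running-sum statistic for one gene set over a ranked gene list. It does not reproduce the FiDePa dynamic program; the function `running_sum_statistic`, the step sizes (+1/n per hit, -1/(N-n) per miss) and the gene names are assumptions chosen for illustration.

```python
from fractions import Fraction


def running_sum_statistic(ranked_genes, gene_set):
    """Maximum deviation from zero of an unweighted running sum.

    Walk down the ranked list, stepping up by 1/n for each gene in the set
    and down by 1/(N - n) for each gene outside it, where N is the list
    length and n the number of set members present in the list.
    """
    members = [g for g in ranked_genes if g in gene_set]
    n, total = len(members), len(ranked_genes)
    if n == 0 or n == total:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    up, down = Fraction(1, n), Fraction(1, total - n)
    running, best = Fraction(0), Fraction(0)
    for gene in ranked_genes:
        running += up if gene in gene_set else -down
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking, most differentially expressed gene first (invented names).
ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
score = running_sum_statistic(ranking, {"g1", "g2", "g4"})
print(score)
```

A large positive value indicates that the set members cluster near the top of the ranking. Significance would normally be assessed separately, which this sketch omits.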
5.  FRED—a framework for T-cell epitope detection 
Bioinformatics  2009;25(20):2758-2759.
Summary: Over the last decade, immunoinformatics has made significant progress. Computational approaches, in particular the prediction of T-cell epitopes using machine learning methods, are at the core of modern vaccine design. Large-scale analyses and the integration or comparison of different methods become increasingly important. We have developed FRED, an extendable, open source software framework for key tasks in immunoinformatics. In this, its first version, FRED offers easily accessible prediction methods for MHC binding and antigen processing as well as general infrastructure for the handling of antigen sequence data and epitopes. FRED is implemented in Python in a modular way and allows the integration of external methods.
Availability: FRED is freely available for download at
PMCID: PMC2759545  PMID: 19578173
6.  KIRMES: kernel-based identification of regulatory modules in euchromatic sequences 
Bioinformatics  2009;25(16):2126-2133.
Motivation: Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, such as those often found in cis-regulatory modules.
Results: We propose a new algorithm that combines the benefits of existing motif finding with those of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets.
Availability: The Python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2722996  PMID: 19389732
7.  MetaRoute: fast search for relevant metabolic routes for interactive network navigation and visualization 
Bioinformatics  2008;24(18):2108-2109.
Summary: We present MetaRoute, an efficient search algorithm based on atom mapping rules and path weighting schemes that returns relevant or textbook-like routes between a source and a product metabolite within seconds for genome-scale networks. Its speed allows the algorithm to be used interactively through a web interface to visualize relevant routes and local networks for one or multiple organisms based on data from KEGG.
Supplementary information: Supplementary details are available at
PMCID: PMC2530881  PMID: 18635573
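Entry 7 describes route search driven by path weighting schemes. The sketch below illustrates the general idea of weighted route search with a standard Dijkstra search over a toy metabolite graph. This swaps in a textbook algorithm and is not MetaRoute's method: atom mapping rules are not modeled, and the function `cheapest_route`, the graph and all edge weights are hypothetical.

```python
import heapq


def cheapest_route(graph, source, target):
    """Dijkstra search returning (total_weight, path), or None if unreachable.

    graph: dict mapping a metabolite to {neighbor: edge_weight}; weights must
    be non-negative (hypothetical input format, assumed for this sketch).
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


# Toy network: the detour through the hub metabolite ADP carries a high
# weight, so the search prefers the textbook-like route (invented weights).
toy_graph = {
    "glucose": {"G6P": 1.0, "ADP": 4.0},
    "G6P": {"F6P": 1.0},
    "F6P": {"pyruvate": 2.0},
    "ADP": {"pyruvate": 4.0},
}
route = cheapest_route(toy_graph, "glucose", "pyruvate")
print(route)
```

Penalizing edges through highly connected metabolites is one simple way to steer a search away from biologically meaningless shortcuts.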
