The extraordinary diversity characterizing the antibody repertoire is generated by both evolution and lymphocyte development. Much of this diversity is due to the existence of immunoglobulin (Ig) variable region gene segment libraries, which were diversified during evolution and, in higher vertebrates, are used in generating the combinatorial diversity of antibody genes. The aim of the present study was to address the following questions: What evolutionary parameters affect the size and structure of gene libraries? Are the number of genes in libraries of contemporary species, and the corresponding gene locus structure, a random result of evolutionary history, or have these properties been optimized with respect to individual or population fitness? If a larger number of genes or different genome structures do not increase the fitness, then the current structure is probably optimized.
We used a simulation of variable region gene library evolution. We measured the effect of different parameters on gene library size and diversity, and the corresponding fitness. We found compensating relationships between parameters, which optimized Ig library size and diversity.
We conclude that contemporary species' Ig libraries have been optimized by evolution in terms of Ig sequence lengths, the number and diversity of Ig genes, and antibody-antigen affinities.
Toxoplasmosis causes loss of life, cognitive and motor function, and sight. A vaccine is greatly needed to prevent this disease. The purpose of this study was to use an immunosense approach to develop a foundation for development of vaccines to protect humans with the HLA-A03 supertype. Three peptides had been identified with high binding scores for HLA-A03 supertypes using bioinformatic algorithms, high measured binding affinity for HLA-A03 supertype molecules, and ability to elicit IFN-γ production by human HLA-A03 supertype peripheral blood CD8+ T cells from seropositive but not seronegative persons.
Herein, when these peptides were administered with the universal CD4+ T cell epitope PADRE (AKFVAAWTLKAAA) and formulated as lipopeptides, or administered with GLA-SE either alone or with Pam2Cys added, the resulting preparations induced IFN-γ and reduced parasite burden in HLA-A*1101 (an HLA-A03 supertype allele) transgenic mice. GLA-SE is a novel emulsified synthetic TLR4 ligand that is known to facilitate development of T Helper 1 cell (TH1) responses. Then, so that our peptides would include those expressed in tachyzoites, bradyzoites and sporozoites from both Type I and II parasites, we applied the approaches that had identified the initial peptides and identified additional peptides using bioinformatics, binding affinity assays, and study of responses of HLA-A03 human cells. Lastly, we found that immunization of HLA-A*1101 transgenic mice with all the pooled peptides administered with PADRE, GLA-SE, and Pam2Cys is an effective way to elicit IFN-γ producing CD8+ splenic T cells and protection. Immunizations included the following peptides together: KSFKDILPK (SAG1 224-232); AMLTAFFLR (GRA6 164-172); RSFKDLLKK (GRA7 134-142); STFWPCLLR (SAG2C 13-21); SSAYVFSVK (SPA 250-258); and AVVSLLRLLK (SPA 89-98). This immunization elicited robust protection, measured as reduced parasite burden using a luciferase transfected parasite, luciferin, this novel HLA transgenic mouse model, and imaging with a Xenogen camera.
Toxoplasma gondii peptides elicit HLA-A03 restricted, IFN-γ producing, CD8+ T cells in humans and mice. These peptides administered with adjuvants reduce parasite burden in HLA-A*1101 transgenic mice. This work provides a foundation for immunosense based vaccines. It also defines novel adjuvants for newly identified peptides for vaccines to prevent toxoplasmosis in those with HLA-A03 supertype alleles.
The advent of Systems Biology has been accompanied by the blooming of pathway databases. Currently pathways are defined generically with respect to the organ or cell type where a reaction takes place. The cell type specificity of the reactions is the foundation of immunological research, and capturing this specificity is of paramount importance when using pathway-based analyses to decipher complex immunological datasets. Here, we present DC-ATLAS, a novel and versatile resource for the interpretation of high-throughput data generated perturbing the signaling network of dendritic cells (DCs).
Pathways are annotated using a novel data model, the Biological Connection Markup Language (BCML), a SBGN-compliant data format developed to store the large amount of information collected. The application of DC-ATLAS to pathway-based analysis of the transcriptional program of DCs stimulated with agonists of the toll-like receptor family allows an integrated description of the flow of information from the cellular sensors to the functional outcome, capturing the temporal series of activation events by grouping sets of reactions that occur at different time points in well-defined functional modules.
The initiative significantly improves our understanding of DC biology and regulatory networks. Developing a systems biology approach for the immune system holds the promise of translating knowledge on the immune system into more successful immunotherapy strategies.
Macrophages represent the front lines of our immune system; they recognize and engulf pathogens or foreign particles thus initiating the immune response. Imaging macrophages presents unique challenges, as most optical techniques require labeling or staining of the cellular compartments in order to resolve organelles, and such stains or labels have the potential to perturb the cell, particularly in cases where incomplete information exists regarding the precise cellular reaction under observation. Label-free imaging techniques such as Raman microscopy are thus valuable tools for studying the transformations that occur in immune cells upon activation, both on the molecular and organelle levels. Due to extremely low signal levels, however, Raman microscopy requires sophisticated image processing techniques for noise reduction and signal extraction. To date, efficient, automated algorithms for resolving sub-cellular features in noisy, multi-dimensional image sets have not been explored extensively.
We show that hybrid z-score normalization and standard regression (Z-LSR) can highlight the spectral differences within the cell and provide image contrast dependent on spectral content. In contrast to typical Raman imaging processing methods using multivariate analysis, such as singular value decomposition (SVD), our implementation of the Z-LSR method can operate nearly in real-time. In spite of its computational simplicity, Z-LSR can automatically remove background and bias in the signal, improve the resolution of spatially distributed spectral differences and enable sub-cellular features to be resolved in Raman microscopy images of mouse macrophage cells. Significantly, the Z-LSR processed images automatically exhibited subcellular architectures whereas SVD, in general, requires human assistance in selecting the components of interest.
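As a rough illustration of how z-score normalization combined with least-squares regression can yield contrast that depends on spectral shape rather than on offset or gain, the sketch below z-scores each pixel spectrum and regresses it on a z-scored reference spectrum. This is not the published Z-LSR implementation: the choice of reference spectrum, the use of the regression slope as the contrast value, and the function names (`zscore`, `ls_slope`, `contrast_map`) are assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore(spectrum):
    """Shift a spectrum to zero mean and scale it to unit variance."""
    mu = mean(spectrum)
    sigma = pstdev(spectrum)
    if sigma == 0:
        # A flat spectrum carries no shape information.
        return [0.0 for _ in spectrum]
    return [(x - mu) / sigma for x in spectrum]


def ls_slope(reference, target):
    """Ordinary least-squares slope of target regressed on reference."""
    mr, mt = mean(reference), mean(target)
    sxx = sum((r - mr) ** 2 for r in reference)
    if sxx == 0:
        return 0.0
    sxy = sum((r - mr) * (t - mt) for r, t in zip(reference, target))
    return sxy / sxx


def contrast_map(pixel_spectra, reference):
    """One contrast value per pixel (assumed here: slope against the reference).

    Because both vectors are z-scored, the slope equals their Pearson
    correlation, so constant background and multiplicative gain drop out.
    """
    ref_z = zscore(reference)
    return [ls_slope(ref_z, zscore(s)) for s in pixel_spectra]
```

In this sketch, a pixel whose spectrum is a scaled and offset copy of the reference scores close to 1, an inverted spectrum scores close to -1, and a flat spectrum scores 0. Each pixel costs only a few passes over its spectrum, which is consistent with the near real-time operation described above.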
The computational efficiency of Z-LSR enables automated resolution of sub-cellular features in large Raman microscopy data sets without compromise in image quality or information loss in associated spectra. These results motivate further use of label free microscopy techniques in real-time imaging of live immune cells.
Binding of peptides to Major Histocompatibility class II (MHC-II) molecules plays a central role in governing responses of the adaptive immune system. MHC-II molecules sample peptides from the extracellular space, allowing the immune system to detect the presence of foreign microbes from this compartment. Predicting which peptides bind to an MHC-II molecule is therefore of pivotal importance for understanding the immune response and its effect on host-pathogen interactions. The experimental cost associated with characterizing the binding motif of an MHC-II molecule is significant, and considerable effort has therefore been invested in developing accurate computer methods capable of predicting this binding event. Prediction of peptide binding to MHC-II is complicated by the open binding cleft of the MHC-II molecule, allowing binding of peptides extending out of the binding groove. Moreover, the genes encoding the MHC molecules are immensely diverse, leading to a large set of different MHC molecules each potentially binding a unique set of peptides. Characterizing each MHC-II molecule using peptide-screening binding assays is hence not a viable option.
Here, we present an MHC-II binding prediction algorithm aiming at dealing with these challenges. The method is a pan-specific version of the earlier published allele-specific NN-align algorithm and does not require any pre-alignment of the input data. This allows the method to benefit also from information from alleles covered by limited binding data. The method is evaluated on a large and diverse set of benchmark data, and is shown to significantly out-perform state-of-the-art MHC-II prediction methods. In particular, the method is found to boost the performance for alleles characterized by limited binding data where conventional allele-specific methods tend to achieve poor prediction accuracy.
The method thus shows great potential for efficiently boosting the accuracy of MHC-II binding prediction, as accurate predictions can be obtained for novel alleles at highly reduced experimental costs. Pan-specific binding predictions can be obtained for all alleles with known protein sequence, and the method can benefit from including training data from alleles for which only a few binders are known. The method and benchmark data are available at http://www.cbs.dtu.dk/services/NetMHCIIpan-2.0
To properly characterize protective polyclonal antibody responses, it is necessary to examine epitope specificity. Most antibody epitopes are conformational in nature and, thus, cannot be identified using synthetic linear peptides. Cyclic peptides can function as mimetics of conformational epitopes (termed mimotopes), thereby providing targets, which can be selected by immunoaffinity purification. However, the management of large collections of random cyclic peptides is cumbersome. Filamentous bacteriophage provides a useful scaffold for the expression of random peptides (termed phage display) facilitating both the production and manipulation of complex peptide libraries. Immunoaffinity selection of phage displaying random cyclic peptides is an effective strategy for isolating mimotopes with specificity for a given antiserum. Further epitope prediction based on mimotope sequence is not trivial since mimotopes generally display only small homologies with the target protein. Large numbers of unique mimotopes are required to provide sufficient sequence coverage to elucidate the target epitope. We have developed a method based on pattern recognition theory to deal with the complexity of large collections of conformational mimotopes. The analysis consists of two phases: 1) The learning phase where a large collection of epitope-specific mimotopes is analyzed to identify epitope specific “signs” and 2) The identification phase where immunoaffinity-selected mimotopes are interrogated for the presence of the epitope specific “signs” and assigned to specific epitopes. We are currently using computational methods to define epitope “signs” without the need for prior knowledge of specific mimotopes. This technology provides an important tool for characterizing the breadth of antibody specificities within polyclonal antisera.
Atomistic Molecular Dynamics provides powerful and flexible tools for the prediction and analysis of molecular and macromolecular systems. Specifically, it provides a means by which we can measure theoretically that which cannot be measured experimentally: the dynamic time-evolution of complex systems comprising atoms and molecules. It is particularly suitable for the simulation and analysis of the otherwise inaccessible details of MHC-peptide interaction and, on a larger scale, the simulation of the immune synapse. Progress has been relatively tentative yet the emergence of truly high-performance computing and the development of coarse-grained simulation now offers us the hope of accurately predicting thermodynamic parameters and of simulating not merely a handful of proteins but larger, longer simulations comprising thousands of protein molecules and the cellular scale structures they form. We exemplify this within the context of immunoinformatics.
Identification of epitopes that invoke strong responses from B-cells is one of the key steps in designing effective vaccines against pathogens. Because experimental determination of epitopes is expensive in terms of cost, time, and effort involved, there is an urgent need for computational methods for reliable identification of B-cell epitopes. Although several computational tools for predicting B-cell epitopes have become available in recent years, the predictive performance of existing tools remains far from ideal. We review recent advances in computational methods for B-cell epitope prediction, identify some gaps in the current state of the art, and outline some promising directions for improving the reliability of such methods.
Immunoinformatics is an emergent branch of informatics science that long ago pullulated from the tree of knowledge that is bioinformatics. It is a discipline which applies informatic techniques to problems of the immune system. To a great extent, immunoinformatics is typified by epitope prediction methods. It has found disappointingly limited use in the design and discovery of new vaccines, which is an area where proper computational support is generally lacking. Most extant vaccines are not based around isolated epitopes but rather correspond to chemically-treated or attenuated whole pathogens, to individual proteins extracted from whole pathogens, or to complex carbohydrates. In this chapter we attempt to review what progress there has been in an as-yet-underexplored area of immunoinformatics: the computational discovery of whole protein antigens. The effective development of antigen prediction methods would significantly reduce the laboratory resource required to identify pathogenic proteins as candidate subunit vaccines. We begin our review by placing antigen prediction firmly into context, exploring the role of reverse vaccinology in the design and discovery of vaccines. We also highlight several competing yet ultimately complementary methodological approaches: sub-cellular location prediction, identifying antigens using sequence similarity, and the use of sophisticated statistical approaches for predicting the probability of antigen characteristics. We end by exploring how a systems immunomics approach to the prediction of immunogenicity would prove helpful in the prediction of antigens.
Sequence based T-cell epitope predictions have improved immensely in the last decade. From predictions of peptide binding to major histocompatibility complex molecules with moderate accuracy, limited allele coverage, and no good estimates of the other events in the antigen-processing pathway, the field has evolved significantly. Methods have now been developed that produce highly accurate binding predictions for many alleles and integrate both proteasomal cleavage and transport events. Moreover, so-called pan-specific methods have been developed, which allow for prediction of peptide binding to MHC alleles characterized by limited or no peptide binding data. Most of the developed methods are publicly available, and have proven to be very useful as a shortcut in epitope discovery. Here, we will go through some of the history of sequence-based predictions of helper as well as cytotoxic T cell epitopes. We will focus on some of the most accurate methods and their basic background.
Recent years have seen a renaissance of the vaccine area, driven by clinical needs in infectious diseases but also chronic diseases such as cancer and autoimmune disorders. Equally important are technological improvements involving nano-scale delivery platforms as well as third generation adjuvants. In parallel, immunoinformatics routines have reached essential maturity for supporting central aspects in vaccinology going beyond prediction of antigenic determinants. On this basis, computational vaccinology has emerged as a discipline aimed at ab-initio rational vaccine design.
Here we present a computational workflow for implementing computational vaccinology covering aspects from vaccine target identification to functional characterization and epitope selection supported by a Systems Biology assessment of central aspects in host-pathogen interaction. We exemplify the procedures for Epstein Barr Virus (EBV), a clinically relevant pathogen causing chronic infection and suspected of triggering malignancies and autoimmune disorders.
We introduce pBone/pView as a computational workflow supporting design and execution of immunoinformatics workflow modules, additionally involving aspects of results visualization, knowledge sharing and re-use. Specific elements of the workflow involve identification of vaccine targets in the realm of a Systems Biology assessment of host-pathogen interaction for identifying functionally relevant targets, as well as various methodologies for delineating B- and T-cell epitopes with particular emphasis on broad coverage of viral isolates as well as MHC alleles.
Applying the workflow on EBV specifically proposes sequences from the viral proteins LMP2, EBNA2 and BALF4 as vaccine targets holding specific B- and T-cell epitopes promising broad strain and allele coverage.
Based on advancements in the experimental assessment of genomes, transcriptomes and proteomes for both pathogen and (human) host, the foundations for rational design of vaccines have been laid. In parallel, immunoinformatics modules have been designed and successfully applied for supporting specific aspects in vaccine design. Joining these advancements, further complemented by novel vaccine formulation and delivery aspects, has paved the way for implementing computational vaccinology for rational vaccine design tackling presently unmet vaccine challenges.
Viruses are fast evolving pathogens that continuously adapt to the highly variable environments they live and reproduce in. Strategies devoted to inhibit virus replication and to control their spread among hosts need to cope with these extremely heterogeneous populations and with their potential to avoid medical interventions. Computational techniques such as phylogenetic methods have broadened our picture of viral evolution both in time and space, and mathematical modeling has contributed substantially to our progress in unraveling the dynamics of virus replication, fitness, and virulence. Integration of multiple computational and mathematical approaches with experimental data can help to predict the behavior of viral pathogens and to anticipate their escape dynamics. This piece of information plays a critical role in some aspects of vaccine development, such as viral strain selection for vaccinations or rational attenuation of viruses. Here we review several aspects of viral evolution that can be addressed quantitatively, and we discuss computational methods that have the potential to improve vaccine design.
Improving our understanding of the immune response is fundamental to developing strategies to combat a wide range of diseases. We describe an integrated epitope analysis system which is based on principal component analysis of sequences of amino acids, using a multilayer perceptron neural net to conduct QSAR regression predictions for peptide binding affinities to 35 MHC-I and 14 MHC-II alleles.
The approach described allows rapid processing of single proteins, entire proteomes or subsets thereof, as well as multiple strains of the same organism. It enables consideration of the interface of diversity of both microorganisms and of host immunogenetics. Patterns of binding affinity are linked to topological features, such as extracellular or intramembrane location, and integrated into a graphical display which facilitates conceptual understanding of the interplay of B-cell and T-cell mediated immunity.
Patterns which emerge from application of this approach include the correlations between peptides showing high affinity binding to MHC-I and to MHC-II, and also with predicted B-cell epitopes. These are characterized as coincident epitope groups (CEGs). Also evident are long range patterns across proteins which identify regions of high affinity binding for a permuted population of diverse and heterozygous HLA alleles, as well as subtle differences in reactions with MHCs of individual HLA alleles, which may be important in disease susceptibility, and in vaccine and clinical trial design. Comparisons are shown of predicted epitope mapping derived from application of the QSAR approach with experimentally derived epitope maps from a diverse multi-species dataset, from Staphylococcus aureus, and from vaccinia virus.
A desktop application with interactive graphic capability is shown to be a useful platform for development of prediction and visualization tools for epitope mapping at scales ranging from individual proteins to proteomes from multiple strains of an organism. The possible functional implications of the patterns of peptide epitopes observed are discussed, including their implications for B-cell and T-cell cooperation and cross presentation.
Operation of the immune system is multivariate. Reduction of the dimensionality is essential to facilitate understanding of this complex biological system. One multi-dimensional facet of the immune system is the binding of epitopes to the MHC-I and MHC-II molecules by diverse populations of individuals. Prediction of such epitope binding is critical and several immunoinformatic strategies utilizing amino acid substitution matrices have been designed to develop predictive algorithms. Contemporaneously, computational and statistical tools have evolved to handle multivariate and megavariate analysis, but these have not been systematically deployed in prediction of MHC binding. Partial least squares analysis, principal component analysis, and associated regression techniques have become the norm in handling complex datasets in many fields. Over two decades ago Wold and colleagues showed that principal components of amino acids could be used to predict peptide binding to cellular receptors. We have applied this observation to the analysis of MHC binding, and to derivation of predictive methods applicable on a whole proteome scale.
We show that amino acid principal components and partial least squares approaches can be utilized to visualize the underlying physicochemical properties of the MHC binding domain by using commercially available software. We further show the application of amino acid principal components to develop both linear partial least squares and non-linear neural network regression prediction algorithms for MHC-I and MHC-II molecules. Several visualization options for the output aid in understanding the underlying physicochemical properties, enable confirmation of earlier work on the relative importance of certain peptide residues to MHC binding, and also provide new insights into differences among MHC molecules. We compared both the linear and non-linear MHC binding prediction tools to several predictive tools currently available on the Internet.
As opposed to the highly constrained user-interaction paradigms of web-server approaches, local computational approaches enable interactive analysis and visualization of complex multidimensional data using robust mathematical tools. Our work shows that prediction tools such as these can be constructed on the widely available JMP® platform, can operate in a spreadsheet environment on a desktop computer, and are capable of handling proteome-scale analysis with high throughput.
One of the major challenges in the field of vaccine design is to predict conformational B-cell epitopes in an antigen. In the past, several methods have been developed for predicting conformational B-cell epitopes in an antigen from its tertiary structure. This is the first attempt in this area to predict conformational B-cell epitope in an antigen from its amino acid sequence.
All support vector machine (SVM) models were trained and tested on 187 non-redundant protein chains consisting of 2261 antibody interacting residues of B-cell epitopes. Models were developed using binary profile of patterns (BPP) and physicochemical profile of patterns (PPP) and achieved a maximum MCC of 0.22 and 0.17 respectively. In this study, for the first time, an SVM model has been developed using composition profile of patterns (CPP), achieving a maximum MCC of 0.73 with accuracy 86.59%. We compared our CPP based model with existing structure based methods and observed that our sequence based model is as good as structure based methods.
This study demonstrates that prediction of conformational B-cell epitopes in an antigen is possible from its primary sequence. This study will be very useful in predicting conformational B-cell epitopes in antigens whose tertiary structures are not available. A web server, CBTOPE, has been developed for predicting B-cell epitopes: http://www.imtech.res.in/raghava/cbtope/.
The enrichment and importance of some aromatic residues, such as Tyr and Trp, have been widely noticed at the binding interfaces of antibodies from many experimental and statistical results, and some of these residues were even identified as “hot spots” contributing significantly more to the binding affinity than other amino acids. However, how these aromatic residues influence the immune binding still deserves further investigation. A large-scale examination was done regarding the local spatial environment around the interfacial Tyr or Trp residues. Energetic contribution of these Tyr and Trp residues to the binding affinity was then studied regarding 82 representative antibody interfaces covering 509 immune complexes from the PDB database and IMGT/3Dstructure-DB.
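For readers unfamiliar with the two performance measures quoted above, the following sketch computes the Matthews correlation coefficient (MCC) and accuracy from the four cells of a binary confusion matrix (epitope versus non-epitope residues). The function names are chosen here for illustration, and the counts in the usage note are made up; they are not the study's data.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all residues that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, `mcc(25, 25, 25, 25)` is 0.0 even though `accuracy(25, 25, 25, 25)` is 0.5, which is why MCC is often preferred to accuracy when judging a residue-level predictor.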
The connectivity analysis of interfacial residues showed that Tyr and Trp tended to cluster into spatial Aromatic Islands (AI) rather than being distributed randomly at the antibody interfaces. Of the 82 antibody-antigen interfaces, 59 (72%) were found to contain AI with more than 3 aromatic residues. The statistical test against an empirical distribution indicated that the existence of AI was significant in about 60% of the representative antibody interfaces. Secondly, the loss of solvent accessible surface area (SASA) for side chains of aromatic residues between the actually crowded state and the independent state increased approximately linearly with AI size, which indicated that the aromatic side chains in AI tended to take a compact and ordered stacking conformation at the interfaces. Interestingly, the SASA loss of AI was also correlated roughly with the averaged gap of binding free energy between the theoretical and experimental data for immune complexes.
The results of our study revealed the wide existence and statistical significance of “Aromatic Island” (AI) composed of the spatially clustered Tyr and Trp residues at the antibody interfaces. The regular arrangement and stacking of aromatic side chains in AI could probably produce extra cooperative effects on the binding affinity, which was observed here for the first time through large-scale data analysis. The finding in this work not only provides insights into the functional role of aromatic residues in the antibody-antigen interaction, but also may facilitate antibody engineering and potential clinical applications.
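One simple way to express the connectivity analysis is as a connected-components search: two aromatic residues are linked when they lie within a distance cutoff, and each connected group of sufficient size is reported as an island. The sketch below follows that idea. The abstract does not state the distance criterion or the reference point used per residue, so the 6.0 Å cutoff, the use of a single coordinate per residue, the `min_size` default, and the function name `aromatic_islands` are all assumptions.

```python
import math
from collections import deque


def aromatic_islands(residues, cutoff=6.0, min_size=3):
    """Group interfacial Tyr/Trp residues into spatially connected clusters.

    residues: list of (label, (x, y, z)) pairs, one point per residue
              (for example a side-chain centroid; an assumption here).
    cutoff:   hypothetical linking distance in angstroms.
    min_size: smallest cluster reported as an island (hypothetical default).
    """
    n = len(residues)
    seen = [False] * n
    islands = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        component = []
        # Breadth-first search over the implicit contact graph.
        while queue:
            i = queue.popleft()
            component.append(residues[i][0])
            for j in range(n):
                if not seen[j] and math.dist(residues[i][1], residues[j][1]) <= cutoff:
                    seen[j] = True
                    queue.append(j)
        if len(component) >= min_size:
            islands.append(sorted(component))
    return islands
```

Note that linkage is transitive in this sketch: two residues farther apart than the cutoff still belong to the same island if a chain of closer neighbours connects them.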
Identification of antigenic peptide epitopes is an essential prerequisite in T cell-based molecular vaccine design. Computational (sequence-based and structure-based) methods are inexpensive and efficient compared to experimental approaches in screening numerous peptides against their cognate MHC alleles. In structure-based protocols, suited to alleles with limited epitope data, the first step is to identify high-binding peptides using docking techniques, which need improvement in speed and efficiency to be useful in large-scale screening studies. We present pDOCK: a new computational technique for rapid and accurate docking of flexible peptides to MHC receptors and primarily apply it on a non-redundant dataset of 186 pMHC (MHC-I and MHC-II) complexes with X-ray crystal structures.
We have compared our docked structures with experimental crystallographic structures for the immunologically relevant nonameric core of the bound peptide for MHC-I and MHC-II complexes. Primary testing for re-docking of peptides into their respective MHC grooves generated 159 out of 186 peptides with Cα RMSD of less than 1.00 Å, with a mean of 0.56 Å. Amongst the 25 peptides used for single and variant template docking, the Cα RMSD values were below 1.00 Å for 23 peptides. Compared to our earlier docking methodology, pDOCK shows up to 2.5 fold improvement in the accuracy and is ~60% faster. Results of validation against previously published studies represent a seven-fold increase in pDOCK accuracy.
The limitations of our previous methodology have been addressed in the new docking protocol, making it a rapid and accurate method to evaluate pMHC binding. pDOCK is a generic method and, although benchmarked against experimental structures, it can be applied to alleles with no structural data using sequence information. Our outcomes establish the efficacy of our procedure to predict highly accurate peptide structures permitting conformational sampling of the peptide in the MHC binding groove. Our results also support the applicability of pDOCK for in silico identification of promiscuous peptide epitopes that are relevant to higher proportions of the human population, with greater propensity to activate T cells, making them key targets for the design of vaccines and immunotherapies.
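The Cα RMSD used as the accuracy measure above can be computed as in the following sketch. It assumes that the docked and crystallographic peptides are already expressed in the same coordinate frame (for example, because the MHC receptor is held fixed during re-docking), so no superposition step is performed; that assumption, the function names, and the 1.00 Å default threshold argument are illustrative rather than a description of the pDOCK code.

```python
import math


def ca_rmsd(docked, crystal):
    """Root-mean-square deviation between matched C-alpha coordinates.

    docked, crystal: equal-length lists of (x, y, z) tuples in angstroms,
    assumed to share one coordinate frame (no fitting is done here).
    """
    if not docked or len(docked) != len(crystal):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(docked, crystal))
    return math.sqrt(squared / len(docked))


def fraction_below(rmsd_values, threshold=1.0):
    """Share of docking runs whose RMSD falls under the threshold."""
    return sum(1 for r in rmsd_values if r < threshold) / len(rmsd_values)
```

On the figures reported above, 159 of 186 peptides under 1.00 Å corresponds to a success fraction of about 0.85.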
Recent advances in Immunology highlighted the importance of local properties on the overall progression of HIV infection. In particular, the gastrointestinal tract is seen as a key area during early infection, and the massive cell depletion associated with it may influence subsequent disease progression. This motivated the development of a large-scale agent-based model.
Lymph nodes are explicitly implemented, and considerations on parallel computing permit large simulations and the inclusion of local features. The results obtained show that GI tract inclusion in the model leads to an accelerated disease progression, during both the early stages and the long-term evolution, compared to a theoretical, uniform model.
These results confirm the potential of treatment policies currently under investigation, which focus on this region. They also highlight the potential of this modelling framework, incorporating both agent-based and network-based components, in the context of complex systems where scaling-up alone does not result in models providing additional insights.
Clonal expansion of B lymphocytes coupled with somatic mutation and antigen selection allows the mammalian humoral immune system to generate highly specific immunoglobulins (IG) or antibodies against invading bacteria, viruses and toxins. The availability of high-throughput DNA sequencing methods is providing new avenues for studying this clonal expansion and identifying the factors guiding the generation of antibodies. The identification of groups of rearranged immunoglobulin gene sequences descended from the same rearrangement (clonally-related sets) in very large sets of sequences is facilitated by the availability of immunoglobulin gene sequence alignment and partitioning software that can accurately predict component germline genes, but has required painstaking visual inspection and analysis of sequences.
We have developed and implemented an algorithm for identifying sets of clonally-related sequences in large human immunoglobulin heavy chain gene variable region sequence sets. The program processes sequences that have been partitioned using iHMMune-align, and uses pairwise comparisons of CDR3 sequences and similarity in IGHV and IGHJ germline gene assignments to construct a distance matrix. Agglomerative hierarchical clustering is then used to identify likely groups of clonally-related sequences. The program is available for download from http://www.cse.unsw.edu.au/~ihmmune/ClonalRelate/ClonalRelate.zip.
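The clustering step described above can be illustrated with a minimal sketch. It is not the ClonalRelate implementation: the normalised edit distance, the fixed penalty for mismatched IGHV or IGHJ assignments, the average-linkage rule, and the `threshold` value are all assumptions chosen for illustration, and the record fields (`cdr3`, `ighv`, `ighj`) are hypothetical names for values that a partitioning tool such as iHMMune-align would supply.

```python
from itertools import combinations


def cdr3_distance(a, b):
    """Levenshtein edit distance between two CDR3 strings, scaled to [0, 1]."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / max(len(a), len(b), 1)


def sequence_distance(s, t, gene_penalty=1.0):
    """CDR3 distance plus an assumed penalty per mismatched germline gene call."""
    d = cdr3_distance(s["cdr3"], t["cdr3"])
    d += gene_penalty * (s["ighv"] != t["ighv"])
    d += gene_penalty * (s["ighj"] != t["ighj"])
    return d


def cluster_clones(seqs, threshold=0.2):
    """Average-linkage agglomerative clustering, stopped once the closest
    pair of clusters is further apart than `threshold` (an assumed cut-off)."""
    n = len(seqs)
    dist = {(i, j): sequence_distance(seqs[i], seqs[j])
            for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        (a, b), best = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        # a < b is guaranteed by combinations(), so deleting b keeps a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Each returned list holds the indices of sequences assigned to one putative clone. With a gene penalty larger than the threshold, sequences with different IGHV or IGHJ assignments can never be merged, which is one simple way to encode the germline-similarity requirement.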
The method was evaluated on several benchmark datasets and provided a more accurate and considerably faster identification of clonally-related immunoglobulin gene sequences than visual inspection by domain experts.
Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets.
VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system.
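The conservation and exclusion criteria described above amount to set operations over per-strain protein inventories. The sketch below is only an illustration of that logic and is not how Vaxign works internally: it assumes that ortholog assignment has already mapped each strain's proteins to shared family names, and the function names and input layout are hypothetical.

```python
def conserved_proteins(strain_proteomes):
    """Return the protein families present in every supplied strain.

    `strain_proteomes` maps a strain name to an iterable of protein
    family names (an assumed, pre-computed ortholog mapping).
    """
    sets = [set(proteins) for proteins in strain_proteomes.values()]
    return set.intersection(*sets) if sets else set()


def absent_from(candidates, excluded_proteome):
    """Return the candidates that have no counterpart in the excluded proteome."""
    return set(candidates) - set(excluded_proteome)
```

Applying `conserved_proteins` to the virulent strains and then `absent_from` against a non-pathogenic species would yield the subset of conserved candidates missing from that species.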
Bioinformatics curation and ontological representation of Brucella vaccines promote classification and analysis of existing Brucella vaccines and vaccine candidates. Computational prediction of Brucella vaccine targets provides more candidates for rational vaccine development. The use of VIOLIN provides a general approach that can be applied for analyses of vaccines against other pathogens and infectious diseases.
Selective peptide transport by the transporter associated with antigen processing (TAP) represents one of the main candidate mechanisms that may regulate the presentation of antigenic peptides to HLA class I molecules. Because TAP-binding preferences may significantly impact T-cell epitope selection, there is great interest in applying computational techniques to systematically discover these elements.
We describe TAP Hunter, a web-based computational system for predicting TAP-binding peptides. A novel encoding scheme, based on representations of TAP peptide fragments and composition effects, allows the identification of variable-length TAP ligands using SVM as the prediction engine. The system was rigorously trained and tested using 613 experimentally verified peptide sequences. The results showed that the system has good predictive ability, with area under the receiver operating characteristic curve (AROC) ≥0.88. In addition, TAP Hunter was compared against several existing publicly available TAP predictors and showed either superior or comparable performance.
TAP Hunter provides a reliable platform for predicting variable-length peptides binding to the TAP transporter. To facilitate the use of TAP Hunter by the scientific community, a simple, flexible and user-friendly web server has been developed and is freely available at http://datam.i2r.a-star.edu.sg/taphunter/.
Innate immunity is the first line of defence offered by host cells against infections. Macrophage cells involved in innate immunity are stimulated by lipopolysaccharide (LPS), found on the bacterial cell surface, to express a complex array of gene products. Persistent LPS stimulation makes a macrophage tolerant to LPS, with downregulation of inflammatory genes ("pro-inflammatory") while genes that fight the bacterial infection ("antibacterial") continue to be expressed. Interactions of transcription factors (TF) at their cognate TF binding sites (TFBS) on the expressed genes are important in transcriptional regulatory networks that control these pro-inflammatory and antibacterial expression paradigms involved in LPS stimulation.
We used differential expression patterns in a public domain microarray data set from LPS-stimulated macrophages to identify 228 pro-inflammatory and 18 antibacterial genes. Employing three different motif search tools, we predicted four and one statistically significant TF-TFBS interactions from the pro-inflammatory and antibacterial gene sets, respectively. The biological literature was utilized to identify target genes for the four pro-inflammatory profile TFs predicted from the three tools, and 18 of these target genes were observed to follow the pro-inflammatory expression pattern in the original microarray data.
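For readers unfamiliar with the AROC metric reported above, the following minimal sketch computes it from binary labels and predictor scores using the rank-based (Mann-Whitney) formulation. It is a generic illustration and not part of TAP Hunter; the function name and the 1/0 label convention (binder/non-binder) are assumptions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive example scores higher than a randomly chosen negative one,
    counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of binders from non-binders. The pairwise loop is quadratic in the number of examples, which is acceptable for data sets of a few hundred peptides.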
Our analysis distinguished pro-inflammatory vs. antibacterial transcriptomic signatures that classified their respective gene expression patterns and the corresponding TF-TFBS interactions in LPS-stimulated macrophages. By doing so, this study has attempted to characterize the temporal differences in gene expression associated with LPS tolerance, a major immune phenomenon implicated in various pathological disorders.
Several arenaviruses cause severe hemorrhagic fever and aseptic meningitis in humans for which no licensed vaccines are available. A major obstacle for vaccine development is pathogen heterogeneity within the Arenaviridae family. Evidence in animal models and humans indicates that T cell and antibody-mediated immunity play important roles in controlling arenavirus infection and replication. Because CD4+ T cells are needed for optimal CD8+ T cell responses and to provide cognate help for B cells, knowledge of epitopes recognized by CD4+ T cells is critical to the development of an effective vaccine strategy against arenaviruses. Thus, the goal of the present study was to define and characterize CD4+ T cell responses from a broad repertoire of pathogenic arenaviruses (including lymphocytic choriomeningitis, Lassa, Guanarito, Junin, Machupo, Sabia, and Whitewater Arroyo viruses) and to provide determinants with the potential to be incorporated into a multivalent vaccine strategy.
By inoculating HLA-DRB1*0101 transgenic mice with a panel of recombinant vaccinia viruses, each expressing a single arenavirus antigen, we identified 37 human HLA-DRB1*0101-restricted CD4+ T cell epitopes from the 7 antigenically distinct arenaviruses. We showed that the arenavirus-specific CD4+ T cell epitopes are capable of eliciting T cells with a propensity to provide help and protection through CD40L and polyfunctional cytokine expression. Importantly, we demonstrated that the set of identified CD4+ T cell epitopes provides broad, non-ethnically biased population coverage of all 7 arenavirus species targeted by our studies.
The identification of CD4+ T cell epitopes, with promiscuous binding properties, derived from 7 different arenavirus species will aid in the development of a T cell-based vaccine strategy with the potential to target a broad range of ethnicities within the general population and to protect against both Old and New World arenavirus infection.
Avian β-defensins (AvBDs) represent a group of innate immune genes with broad antimicrobial activity. Within the chicken genome, previous work identified 14 AvBDs in a cluster on chromosome three. The release of a second bird genome, the zebra finch, allows us to study the comparative evolutionary history of these gene clusters between two species that shared a common ancestor about 100 million years ago.
A phylogenetic analysis of the β-defensin gene clusters in the chicken and the zebra finch identified several cases of gene duplication and gene loss along their ancestral lines. In the zebra finch genome a cluster of 22 AvBD genes was identified, all located within 125 Kbp on chromosome three. Ten of the 22 genes were found to be highly conserved with orthologous genes in the chicken genome. The remaining 12 genes were all located within a cluster of 58 Kbp and are suggested to be a result of recent gene duplication events that occurred after the galliformes-passeriformes split (G-P split). Within the chicken genome, AvBD6 was found to be a duplication of AvBD7, whereas the gene AvBD14 seems to have been lost along the ancestral line of the zebra finch. The duplicated β-defensin genes have had a significantly higher accumulation of non-synonymous over synonymous substitutions compared to the genes that have not undergone duplication since the G-P split. The expression patterns of avian β-defensin genes seem to be well conserved between chicken and zebra finch.
The genomic comparisons of the β-defensin gene clusters of the chicken and zebra finch illuminate the evolutionary history of this gene complex. Along their ancestral lines, several gene duplication events have occurred in the passerine line after the galliformes-passeriformes split, giving rise to 12 novel genes, compared to a single duplication event in the galliformes line. After the duplication events, the duplicated genes have been subject to a relaxed selection pressure compared to the non-duplicated genes, thus supporting models of evolution by gene duplication.
Gene coregulation across a population is an important aspect of the considerable variability of the human immune response to virus infection. Methodology to investigate it must rely on a number of ingredients ranging from gene clustering to transcription factor enrichment analysis.
We have developed a methodology to investigate the gene-to-gene correlations for the expression of 34 genes linked to the immune response of Newcastle disease virus (NDV)-infected conventional dendritic cells (DCs) from 145 human donors. The levels of gene expression showed a large variation across individuals. We generated a map of gene co-expression using pairwise correlation and multidimensional scaling (MDS). The analysis of these data showed that among the 13 genes left after filtering for statistically significant variations, two clusters are formed. We investigated to what extent the observed correlation patterns can be explained by the sharing of transcription factors (TFs) controlling these genes. Our analysis showed that there was a significant positive correlation between MDS distances and TF sharing across all pairs of genes. We applied enrichment analysis to the TFs having binding sites in the promoter regions of those genes. This analysis, after Gene Ontology filtering, indicated the existence of two clusters of genes (CCL5, IFNA1, IFNA2, IFNB1) and (IKBKE, IL6, IRF7, MX1) that were transcriptionally co-regulated. In order to facilitate the use of our methodology by other researchers, we have also developed an interactive, web-based coregulation explorer tool called CorEx. It permits the study of MDS and hierarchical clustering of data combined with TF enrichment analysis. We also offer web services that provide programmatic access to MDS, hierarchical clustering and TF enrichment analysis.
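The first stage of such an analysis, turning per-donor expression levels into a gene-by-gene dissimilarity matrix suitable as input to MDS or hierarchical clustering, can be sketched as follows. This is not the CorEx code: the choice of Pearson correlation, the conversion d = 1 - r, and the function names are assumptions made for illustration, and the MDS embedding itself is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlation_distances(expr):
    """Build a dissimilarity matrix from a mapping of gene name to
    expression values (one value per donor, same donor order for all genes).

    Returns the sorted gene names and a square matrix with entries 1 - r,
    so perfectly co-expressed genes are at distance 0 and perfectly
    anti-correlated genes at distance 2.
    """
    genes = sorted(expr)
    matrix = [[1.0 - pearson(expr[g], expr[h]) for h in genes] for g in genes]
    return genes, matrix
```

The resulting matrix is symmetric with a zero diagonal. Genes whose expression is constant across donors would cause a division by zero here, so they need to be filtered out beforehand, in the same spirit as the variance filtering mentioned above.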
MDS mapping based on correlation in conjunction with TF enrichment analysis represents a useful computational method to generate predictions underlying gene coregulation across a population.