Microbial disease is a major cause of human death and morbidity, and for many infectious diseases no preventive vaccines are available [1]. Where therapies do exist, escalating resistance to antimicrobials hinders treatment of common bacterial infections and accentuates the need for new approaches [2]. It is therefore imperative to identify appropriate targets for medical countermeasures such as antimicrobial drugs or cross-protective vaccines active against several pathogenic strains or species. An alternative to "killing" bacteria, which exacerbates the selection of antimicrobial resistance, is to "disarm" them by interfering with their capacity to be virulent, thus enabling the bacterium to survive and evoke appropriate immune protection [4]. The development of antivirulence (as opposed to antimicrobial) compounds has indicated that it is possible to target common virulence genes [5].
Virulence is typically described as the damage a pathogen causes to the host during infection [7]. Gene products that contribute to virulence can therefore be described as virulence factors. Traditionally, a gene has been classified as encoding a virulence factor by experimentally introducing a mutation into the gene of interest and determining whether the virulence of the resulting mutant is reduced. Genome-wide screens for novel virulence factors have traditionally employed transposon mutagenesis to inactivate genes in a selected bacterial strain, followed by screening of the resulting insertion mutants for attenuation in an appropriate animal infection model. An adaptation of this method is signature-tagged mutagenesis (STM), which incorporates unique DNA tags so that mutants can be screened en masse in animal infection models [8]. However, this approach is limited to a single strain of a particular bacterial species and a particular infection model. It therefore typically identifies highly specific virulence factors that extrapolate poorly to generic virulence determinants of other pathogens. Another approach is to identify genes up-regulated in vivo, for example by in vivo expression technology (IVET) [9]. However, this approach also identifies genes other than those required for virulence, and both STM and IVET identify genes in a pathogen that are also present in non-pathogens.
Computational approaches to identifying virulence factors have often relied on whole-genome comparisons of two or more bacteria, where the presence or absence of genes in closely related pathogenic and non-pathogenic strains can suggest genes that potentially play a role in virulence [10]. For example, Garbom et al. [12] identified novel virulence-associated genes in Yersinia pseudotuberculosis by examining the hypothetical genes (genes of unknown function) conserved in six human microbial pathogens. Expanding on this work, we have used whole-proteome searches to identify virulence-associated proteins that are common to diverse pathogenic bacteria and absent in non-pathogenic species. The identified factors can then be exploited for the development of medical countermeasures such as antimicrobials, vaccines or diagnostics.
In contrast to computational approaches based on similarity, genomic context methods comprise several non-similarity-based approaches to predicting protein functions and interactions [13]. Of these, phylogenetic profiles particularly suit our objective of identifying virulence factors conserved across multiple human pathogenic species. The method was originally designed to identify functionally related proteins that evolve in a correlated fashion by characterizing each protein with a binary string that encodes its presence or absence in every known genome [18]. The method has been improved and expanded in numerous ways, including new approaches that characterize profile patterns by domains [19] and protein families [20], and that integrate phylogenetic information to compute probabilities of observing different profile strings [21]. Phylogenetic profiles have been used to identify virulence factors related to bacterial food poisoning [23] and intracellular pathogenesis [24]. In this work, we have used a similar approach to identify potential virulence factors present in a group of extreme human pathogens: bacteria from the Centers for Disease Control category A and B pathogen lists (http://www.bt.cdc.gov/agent/agentlist-category.asp). We impose two primary criteria. First, the putative target genes must be broadly present in the diverse pathogens within these two groups. Second, the putative target genes should be absent or highly divergent in non-pathogens. These criteria enhance the likelihood that candidates are implicated in virulence and minimize the potential activity of future countermeasures against the host commensal flora.
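To make the profile idea and the two criteria concrete, the following minimal Python sketch builds a binary presence/absence string for each protein and keeps those that are broadly present in pathogens and absent from non-pathogens. It is an illustration only, not the pipeline used in this work: the genome names, the protein identifiers, the toy presence data, and both thresholds (`min_pathogen_fraction`, `max_non_pathogen_hits`) are hypothetical, and "presence" is assumed to have been decided beforehand by some similarity search.

```python
from typing import Dict, List, Set

# Hypothetical genome sets; all names are placeholders, not real organisms.
PATHOGENS: List[str] = ["pathogen_A", "pathogen_B", "pathogen_C", "pathogen_D"]
NON_PATHOGENS: List[str] = ["commensal_X", "commensal_Y", "commensal_Z"]
GENOMES: List[str] = PATHOGENS + NON_PATHOGENS


def profile(present_in: Set[str]) -> str:
    """Return a binary string with one character per genome, in GENOMES order.

    '1' means the protein was detected in that genome, '0' means it was not.
    """
    return "".join("1" if genome in present_in else "0" for genome in GENOMES)


def is_candidate(
    prof: str,
    min_pathogen_fraction: float = 0.75,  # assumed cutoff for "broadly present"
    max_non_pathogen_hits: int = 0,       # assumed cutoff for "absent"
) -> bool:
    """Apply the two criteria to a single phylogenetic profile string."""
    n_pathogens = len(PATHOGENS)
    pathogen_hits = prof[:n_pathogens].count("1")
    non_pathogen_hits = prof[n_pathogens:].count("1")
    broadly_present = pathogen_hits / n_pathogens >= min_pathogen_fraction
    absent_elsewhere = non_pathogen_hits <= max_non_pathogen_hits
    return broadly_present and absent_elsewhere


# Invented toy data: protein identifier -> genomes in which a homolog was found.
presence: Dict[str, Set[str]] = {
    "prot1": set(PATHOGENS),            # in every pathogen, no non-pathogen
    "prot2": set(GENOMES),              # ubiquitous, e.g. a housekeeping protein
    "prot3": {"pathogen_A"},            # specific to a single pathogen
    "prot4": {"pathogen_A", "pathogen_B", "pathogen_C"},
}

profiles: Dict[str, str] = {pid: profile(hits) for pid, hits in presence.items()}
candidates: List[str] = sorted(pid for pid, p in profiles.items() if is_candidate(p))
```

In this toy example `prot2` is rejected because it also occurs in non-pathogens, and `prot3` is rejected because it is present in too few pathogens. In practice the second criterion also admits proteins that are highly divergent in non-pathogens, which would require a similarity score rather than a strict present/absent call.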