Within proteobacteria, the 98% overlap between VirB4-based and specific T4SS-based searches suggests that our knowledge of the conjugative pilus is now good enough to be used to classify plasmid mobility automatically. Our analysis suggests that the classical division of T4SSs into two groups, T4SSa and T4SSb (31), should be revised into a classification into four groups, one of which (MPFG) is rare and another of which (MPFT) is by far the most frequent. However, one must emphasize that such a classification scheme is applicable only to plasmids of proteobacteria. Outside this clade, our analyses suggest that new relaxases and T4SSs remain to be found. Some clades, notably in archaea, have known mobilizable plasmids containing a putative T4SS but no relaxase or T4CP. Some plasmids contain a small but coherent set of mobility genes with low similarity to known relaxases, such as the above-mentioned new family represented by Bacteroides thetaiotaomicron VPI-5482 plasmid p5482. Other clades, notably cyanobacteria, contain many plasmids with relaxases and T4CPs but no close VirB4 homologues. Given that cyanobacterial cells are among the most abundant on Earth and that cyanobacterial genomes have many plasmids, uncovering how plasmids spread in this clade should become a priority. Indeed, the 408-kb Anabaena plasmid pCC7120α was reported to be transmissible (103), although the exact mechanism of transmission was not investigated. As mentioned above, the VirB4 analogue of MPFI is less similar to VirB4 than to VirD4. It is thus likely that other unknown types of proteins energize unknown types of T4SSs. Mining for relaxases and T4SSs in large genomes is therefore bound to produce novel families. This should be done in connection with the identification of ICEs in genomes, which remain largely unexplored by comparative genomics.
It is tempting to relate the current gaps in our knowledge of relaxases and T4SSs to our results showing that known broad-host-range proteobacterial plasmids are in fact found only in proteobacteria. This reinforces previous observations that the conjugation range of plasmids is larger than the range of hosts that they typically occupy (34) and that plasmids tend to have nucleotide compositions close to that of the host in which they are often found (131). While some plasmids can conjugate between very different clades, they are not naturally found there (at least not significantly often), and our phylogenetic analysis suggests that the diversification of mobility-associated protein families takes place in narrower clades. Plasmids do not shuffle modules freely; they tend to cluster within given clades, and this preference is presumably related to specific features of a given plasmid design and to the host physiology (50). As a case in point, the T4SSs of firmicutes seem smaller than those of proteobacteria, possibly because of the lack of an outer membrane in these cells. Elements of the T4SS with homologues between proteobacteria and firmicutes are found to interact in equivalent ways (1), but the remaining machinery may work quite differently; e.g., T4SSs of firmicutes are not known to form a mating pilus (1). The T4SSs of mollicutes, which lack both an outer membrane and a cell wall, seem even simpler. While conjugation systems have certainly adapted to the peculiarities of bacterial membranes, it makes little sense to oppose Gram-positive and Gram-negative plasmids. For instance, T4SSs of plasmids of proteobacteria seem to have more in common with those of firmicutes than with those of cyanobacteria, which also have two membranes. Thus, the emerging picture is that conjugation systems are adapted to taxonomic clades, and more research on the differences between plasmids of proteobacteria, firmicutes, actinobacteria, cyanobacteria, and archaea should be carried out. As a result of these adaptive processes, few elements of the conjugation machinery are common between firmicutes and proteobacteria, and even fewer are common between cyanobacteria and archaea.
Plasmid design or host adaptation is also likely to account for phylogenetic specificities in MOB and MPF modules. This is analogous to the evolution of prokaryotic chromosomes. Although rampant recombination could have taken place and eroded phylogenetic lineages, this has not happened, because some core genes are rarely successfully exchanged (88). What makes prokaryotic classification useful and meaningful appears to do the same job in plasmid classification with respect to mobility systems, most likely because of the adaptive coevolution of the different elements of the mobility machinery with the host.
According to our interpretation of the sequence analysis carried out in this work, 15% of the plasmids in prokaryotes are conjugative, 24% are mobilizable, and 61% are nontransmissible. In proteobacteria, for which our predictions are more accurate, the percentages are not so different: 28%, 23%, and 49%, respectively. Thus, about half of the plasmids are nontransmissible, and the remaining ones are divided more or less evenly between conjugative and mobilizable plasmids. This finding suggests that many models of plasmid evolution requiring high rates of horizontal spread for plasmid survival might have to be revised to account for the fact that at least half of the plasmids probably have low transfer rates. Our phylogenetic analysis shows rare transfer between distant phyla and describes the evolutionary history of T4SSs. MPFT is by far the most abundant T4SS, and MPFG might derive from an ancestral MPFT. However, MPFI and T4SSs from other clades have not derived from MPFT, which seems to be a proteobacterial invention. The high frequency of MPFT occurrence might reflect a particularly successful design, even though such plasmids are notorious for mating poorly in liquid media (16). It might also result from some bias toward sequencing host-associated proteobacteria. Forthcoming metagenomic data should be able to provide more-accurate and less-biased estimates of the diversity and frequency of the different types of MPF and MOB. In any case, a full account of the evolutionary history of conjugation will require a parallel study of ICEs and the uncovering of conjugation mechanisms in prokaryotes lacking identifiable T4SSs and/or relaxases, that is, the vast majority of prokaryotes.
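The three-way tally discussed above (conjugative, mobilizable, and nontransmissible) can be pictured as a simple rule applied to per-plasmid marker hits. The following is only a minimal sketch under stated assumptions, not the procedure used in this work: it assumes that a plasmid is called conjugative when a relaxase, a T4CP, and a VirB4 homologue are all detected, mobilizable when a relaxase is detected without that full set, and nontransmissible otherwise. The names `PlasmidHits`, `classify`, and `tally`, and the toy plasmids, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

CATEGORIES = ("conjugative", "mobilizable", "nontransmissible")


@dataclass(frozen=True)
class PlasmidHits:
    """Presence/absence of marker proteins on one plasmid (hypothetical record)."""
    name: str
    relaxase: bool
    t4cp: bool
    virb4: bool


def classify(p: PlasmidHits) -> str:
    """Assumed illustrative rule, not necessarily the authors' exact criteria."""
    if p.relaxase and p.t4cp and p.virb4:
        return "conjugative"
    if p.relaxase:
        return "mobilizable"
    # Includes plasmids with a putative T4SS but no detectable relaxase.
    return "nontransmissible"


def tally(plasmids) -> dict:
    """Return the percentage of plasmids falling in each category."""
    counts = Counter(classify(p) for p in plasmids)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = [
        PlasmidHits("pA", relaxase=True, t4cp=True, virb4=True),
        PlasmidHits("pB", relaxase=True, t4cp=False, virb4=False),
        PlasmidHits("pC", relaxase=False, t4cp=False, virb4=True),
        PlasmidHits("pD", relaxase=False, t4cp=False, virb4=False),
    ]
    for category, pct in tally(toy).items():
        print(f"{category}: {pct:.1f}%")
```

Under such a rule, a plasmid carrying a putative T4SS but no recognizable relaxase (like the hypothetical `pC`) is counted as nontransmissible, which illustrates why predictions of this kind are expected to be less reliable outside proteobacteria, where relaxases and T4SSs remain to be found.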