The integron/gene cassette system is now known to be widely distributed amongst the Bacteria and is a major contributor to the dispersal of genes by Lateral Gene Transfer (LGT). The system can facilitate LGT because the integron encodes a site-specific recombination (SSR) system [1,2]. The targets for SSR are gene cassettes [3]. These are independently mobilizable units of DNA, usually composed of a single gene bounded by a recombination site, designated a 59-base element (59-be) or alternatively attC. The DNA integrase catalysing SSR, IntI, is encoded by the integron. This integrase recognises two families of recombination sites. The first of these is the cassette-associated 59-be [4] and the second is a site contained within the integron, designated attI. IntI can catalyse a reversible reaction by which gene cassettes are inserted into, or excised from, an integron via recombination between the attI site and a 59-be or between two 59-be sites [5,6]. Multiple recombination events are common. As a consequence, integrons are normally associated with arrays comprising multiple cassettes, which in some species of Vibrio can number well in excess of a hundred [7]. Gene cassettes do not normally include a promoter. Instead, where it has been examined, transcription of cassette genes is driven by a promoter (Pc) located in the integron itself [1]. Thus, the integron and gene cassettes together comprise both a gene capture and a gene expression system.
Integrons were first identified in the context of multi-drug resistance. In this context, pathogenic bacteria are often resistant to many antibiotics as a consequence of possessing an integron that has captured several gene cassettes containing resistance genes [2]. These integrons recovered from clinical environments are often embedded in other types of mobile elements such as plasmids and transposons [8]. It is now clear, however, that integrons are also a common feature of the chromosomes of various bacteria, being found in the gamma-proteobacteria (vibrios, xanthomonads, pseudomonads, Shewanella), beta-proteobacteria (Nitrosomonas europaea) and spirochaetes (Treponema denticola) (Figure ). Another notable feature of chromosomal integrons is the wide diversity of functions encoded by the genes found in their gene cassettes [9]. This, coupled with the observation that cassette arrays can be quite large, clearly hints at a system that can greatly affect the adaptive potential of bacteria [10]. Given that cassette-associated genes are part of the gene pool that is LGT-associated and mobilizable, they also represent a community resource that can be shared between individuals [11].
Bacteria of the Vibrio genus are rapidly becoming a model system for the study of the chromosomal integron/gene cassette system [12-14]. As noted above, their cassette arrays are characteristically large, encompassing several percent of genomic coding capacity in many cases. Representatives of this genus display such a high level of variation in terms of genome size and content that "species" units can rarely be identified by the sequencing of one or a few genetic markers [15]. Many complete integron arrays have now been assembled from various species of Vibrio as a result of whole genome sequencing initiatives. These include V. cholerae, V. parahaemolyticus, V. vulnificus (two strains) and V. fischeri [16-19]. These collective sequencing efforts have helped to highlight the enormous amount of novel genetic diversity contained within integron arrays. However, it is also becoming clear that whole genome sequencing is a relatively blunt instrument for assessing gene cassette diversity. This is because, although individual strains of Vibrio can contain large arrays, the total number of cassettes harboured by any single individual is very small compared to the overall size of this community resource [11]. As an alternative approach to recovering gene cassettes, the cassette PCR technique has been developed [20]. This method selectively amplifies gene cassettes to the exclusion of other genomic sequences and can be applied both to metagenomic (environmental) DNA and to the DNA of defined strains. However, cassette PCR only recovers cassettes in isolation (i.e. outside their genetic context) and cannot aid in the assembly of contiguous arrays. Also, although gene cassette PCR is selective, there is always the possibility of false positives (non-cassette DNA fragments being amplified) [20]. By contrast, when a gene cassette is found within an array, there is no doubt about its nature.
To access and analyse the mobile gene cassette pool more rapidly, we adopted a genomics approach that allows us to specifically isolate DNA fragments containing large integron gene cassette arrays and to exclude the vast majority of the genome that is common to most members of a species. No such large cassette array has previously been isolated and fully sequenced without completely sequencing the genome of its host. Using our streamlined approach, we isolate and sequence the integron from a close relative of the widespread marine bacterium Vibrio harveyi, a well-known member of the core vibrio group to which V. parahaemolyticus also belongs (Figure ). To place it in an evolutionary context, we compare it to known vibrio integron arrays and perform a phylogenetic analysis of all of its components (integrase, 59-be sites, gene cassette encoded genes). This reveals strong interaction between vibrio integrons through LGT, high variability of array contents and wide functional diversity of gene cassettes. These analyses also yield insights into the processes of gene cassette recruitment from non-mobile genes and the possibility of cassette de-recruitment. Differences in the genomic context of integrons from various vibrios suggest several events of intragenomic translocation for these genetic elements.