The Vibrio sp. DAT722 gene cassette array contained 116 cassettes in 90 different size classes. Of these 90 size classes, 75 represented only one type of cassette, and so these size classes uniquely defined cassettes and hence their positions within the array. The remaining 15 size classes represented either multiple copies of the same cassette or different cassette sequences of the same length. Consequently, these 15 size classes ambiguously defined positions within the array. The methodology used in this work identified 62 of the 90 possible size classes in genomic DNA (gDNA). Of the 62 identified size classes, 50 represented unique cassettes within the array and 12 represented cassettes from amongst the 15 ambiguously located size classes. In short, 74 of the 116 cassette positions within the array were readily accessible with the techniques used in this work.
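The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the two derived quantities (the number of array positions falling in ambiguous size classes, and the number of those positions covered by detected classes) are inferred from the quoted figures and are not stated in the text.

```python
# Counts quoted in the text for the Vibrio sp. DAT722 cassette array.
TOTAL_CASSETTES = 116       # cassette positions in the array
SIZE_CLASSES = 90           # distinct cassette size classes
UNIQUE_CLASSES = 75         # size classes holding exactly one cassette
DETECTED_CLASSES = 62       # size classes identified in gDNA
DETECTED_UNIQUE = 50        # detected classes that map to a single position
ACCESSIBLE_POSITIONS = 74   # positions reported as readily accessible

# Size classes that cannot be assigned to a single array position.
ambiguous_classes = SIZE_CLASSES - UNIQUE_CLASSES

# Detected size classes that are ambiguous.
detected_ambiguous_classes = DETECTED_CLASSES - DETECTED_UNIQUE

# Derived (not stated in the text): positions inside ambiguous classes.
ambiguous_positions = TOTAL_CASSETTES - UNIQUE_CLASSES

# Derived (not stated in the text): positions covered by the detected
# ambiguous classes, implied by the reported total of accessible positions.
detected_ambiguous_positions = ACCESSIBLE_POSITIONS - DETECTED_UNIQUE

print("ambiguous size classes:", ambiguous_classes)
print("detected ambiguous size classes:", detected_ambiguous_classes)
print("positions in ambiguous classes:", ambiguous_positions)
print("accessible positions in ambiguous classes:", detected_ambiguous_positions)
```

The first two values reproduce the 15 and 12 given in the text, which is a quick consistency check on the reported totals.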
The Vibrio sp. DAT722 array contained 91 coding and 25 non-coding (ORF-less) cassette positions. Additionally, eight of the coding cassettes were oriented so that their genes were located on the complementary strand. In this work, both coding cassettes and non-coding cassettes were detectably expressed. However, no expression could be specifically attributed to those cassettes with genes on the complementary strand.
Expression of cassette-associated genes under optimal growth conditions
Cultures grown overnight in Vibrio media at 28°C showed that the majority of detectable cassette size classes (39 of 62) were detectably expressed. That is, copy DNA (cDNA) species corresponding to 39 of the 62 detectable cassettes within the array were detected in the cDNA samples. These expressed cassettes were distributed along the length of the array and were interspersed with cassettes that, while detectable in gDNA, could not be detected amongst cDNA species (Figure ).
The technique used provided relative quantitation of expression amongst the individual gene cassettes in the DAT722 array by using the stoichiometric relationship between cassette species in gDNA as a quantitative standard. This quantitation showed at least a 100-fold difference in the intensity of expression between the most and least detectably expressed cassettes (cassettes 95 and 11 within the array respectively). Additionally, 'blocs' of adjacent expressed cassettes were expressed at similar intensities, with less than three-fold difference amongst cassettes within the bloc (for example, cassettes 31-35 in Figure ).
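The idea of using gDNA as a quantitative standard can be illustrated with a minimal sketch. It assumes that each size class yields a measurable signal (for example a peak area) in both the gDNA and cDNA samples, and that dividing the cDNA signal by the gDNA signal for the same size class cancels differences in amplification and detection efficiency. The function names, the size-class labels and the signal values are hypothetical and serve only to show the calculation; they are not data from this work.

```python
from typing import Dict, Optional


def relative_expression(
    cdna: Dict[str, float], gdna: Dict[str, float]
) -> Dict[str, Optional[float]]:
    """Return cDNA/gDNA signal ratios for every size class seen in gDNA.

    A size class that is detectable in gDNA but gives no cDNA signal is
    reported as None (detectable but not detectably expressed).
    """
    ratios: Dict[str, Optional[float]] = {}
    for size_class, g_signal in gdna.items():
        if g_signal <= 0:
            raise ValueError(f"gDNA signal for {size_class} must be positive")
        c_signal = cdna.get(size_class, 0.0)
        ratios[size_class] = c_signal / g_signal if c_signal > 0 else None
    return ratios


def fold_range(ratios: Dict[str, Optional[float]]) -> float:
    """Fold difference between the most and least expressed detected classes."""
    detected = [r for r in ratios.values() if r is not None]
    if not detected:
        raise ValueError("no detectably expressed size classes")
    return max(detected) / min(detected)


# Hypothetical signals, for illustration only.
gdna_signal = {"class_a": 1000.0, "class_b": 500.0, "class_c": 800.0}
cdna_signal = {"class_a": 2000.0, "class_b": 10.0}

rel = relative_expression(cdna_signal, gdna_signal)
print(rel)
print("fold range:", fold_range(rel))
```

Because the gDNA signal of a size class scales with the number of cassette copies it contains, the same ratio also accounts for size classes that hold more than one cassette, although it cannot separate the contributions of the individual copies.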
The hypothesis that the integron-associated promoter, Pc, mediates expression of the entire array implies the presence of large contiguous transcripts of the array amongst cDNA species. In this work, we detected cassettes whose expression was undetectable interspersed amongst detectably expressed cassettes, indicating that not all of the expression seen within the array was due to Pc. The presence of additional promoters within the array was therefore implied. Additionally, different 'blocs' of similarly expressed adjacent cassettes were expressed at significantly different levels from one another (for example, cassettes 21-23 were significantly more expressed than the bloc containing cassettes 31-35 (Figure )), which suggested that these additional promoters differed in their ability to drive transcription. The presence of detectable but unexpressed cassettes between 'blocs' indicated areas of the array where some of these 'intra-array' promoters might be located (i.e. adjacent to the blocs comprising cassettes 1-18, 20-25, 31-44, 52-59 and 88-95).
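The notion of an expression 'bloc' can be made concrete with a short sketch. It is a simple greedy grouping written for illustration, not the procedure used in this work: cassettes are taken in array order, an undetected cassette (None) closes the current bloc, and a detected cassette is added to the current bloc only if the bloc's highest and lowest expression values would still differ by less than three-fold. The input values and the function name are hypothetical.

```python
from typing import List, Optional, Tuple


def find_blocs(
    expression: List[Optional[float]], max_fold: float = 3.0
) -> List[Tuple[int, int]]:
    """Group adjacent expressed cassettes into blocs.

    `expression` lists relative expression in array order; None marks a
    cassette that is detectable in gDNA but not in cDNA. Returns inclusive
    (start, end) index pairs, zero-based.
    """
    blocs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    lo = hi = 0.0
    for i, value in enumerate(expression):
        if value is None:
            # An unexpressed cassette ends any bloc in progress.
            if start is not None:
                blocs.append((start, i - 1))
                start = None
            continue
        if value <= 0:
            raise ValueError("expression values must be positive")
        if start is None:
            start, lo, hi = i, value, value
            continue
        new_lo, new_hi = min(lo, value), max(hi, value)
        if new_hi / new_lo < max_fold:
            lo, hi = new_lo, new_hi
        else:
            # Expression differs too much: close the bloc, start a new one.
            blocs.append((start, i - 1))
            start, lo, hi = i, value, value
    if start is not None:
        blocs.append((start, len(expression) - 1))
    return blocs


# Hypothetical relative expression values, for illustration only.
example = [1.0, 1.5, 2.0, None, 10.0, 12.0, 40.0]
blocs = find_blocs(example)
print(blocs)
```

Under this reading, the unexpressed cassettes that separate blocs, and the points where expression jumps by three-fold or more, are the places where an intra-array promoter would be sought.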
Conditional expression of gene cassette associated genes
We then examined conditional expression of cassette-associated genes in the Vibrio sp. DAT722 array in three experimental series. These series examined thermal stress at three levels (4°C, 14°C and 28°C) and oxidative stress at three levels of imposed stress plus a control (0, 0.9 mM, 1.8 mM and 3.6 mM hydrogen peroxide), applied for two time periods, 30 minutes and 18 hours. Replicate studies showed that a two-fold change in expression, both amongst cassettes within a treatment and for a single cassette between treatments in an experimental series, was significant with greater than 95% confidence. Consequently, a two-fold change in gene cassette expression was adopted in this study as the minimum change in expression deemed to be conditional. A summary of all gene cassette conditional expression data is shown in Figure .
A number of gene cassettes that were detectable in gDNA were not detectably expressed under any stressor (arrowed in red in Figure ). These cassettes were the same as those nominated as potential promoter locations in the previous section. Of the detectably expressed cassettes, all but two (cassettes 23 and 70) were conditionally expressed under at least one stressor at the two-fold level of significance. The largest measured increase in expression was 11.4-fold, seen in cassette 57 under 18-hour oxidative stress, with other cassettes (e.g. cassettes 10, 20 and 104 in Figure ) showing similar levels of increase, though not always at the same level of applied stress, or even under the same stressor. Additionally, in many cases cassettes were not detectably expressed in one or more of the experimental treatments (e.g. cassette 11 under both 30-minute and 18-hour oxidative stress). Consequently, the actual increase in expression of these cassettes from undetectable levels may have been larger than that measured.
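The two-fold criterion can be expressed as a small classification rule. The sketch below assumes that expression in a treatment is compared with expression in a reference treatment of the same series, that a change of exactly two-fold counts as conditional, and that an undetected cassette is represented by None. These choices, and the function name, are assumptions made for illustration. The undetected cases are kept separate because, as noted above, no finite fold change can be computed for them.

```python
from typing import Optional


def classify_change(
    reference: Optional[float],
    treated: Optional[float],
    threshold: float = 2.0,
) -> str:
    """Classify a change in expression against a fold-change threshold."""
    if reference is None and treated is None:
        return "not detected"
    if reference is None:
        # Expressed only in the treatment: true fold change is unknown.
        return "induced from undetectable"
    if treated is None:
        # Expressed only in the reference: true fold change is unknown.
        return "reduced to undetectable"
    fold = treated / reference
    if fold >= threshold:
        return "increased"
    if fold <= 1.0 / threshold:
        return "decreased"
    return "unchanged"


# An 11.4-fold increase, the largest reported in the text, passes the test.
print(classify_change(1.0, 11.4))
# A 1.5-fold change is below the two-fold threshold.
print(classify_change(1.0, 1.5))
```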
Amongst this widespread conditional expression, the following patterns were noted:
-Gene cassettes were similarly expressed within blocs. That is, within a bloc, the level of expression was largely consistent irrespective of stressor. This observation supported the suggestion that individual promoters were associated with these 'expression blocs'.
-The particular expression response to a stressor varied in both direction and extent amongst expression blocs. That is, the expression of some blocs was increased under a particular stressor whilst that of other blocs was not. This suggested that different types of promoter were responsible for the expression of the various blocs. For example, it was noted that cassettes 21-24 were similarly expressed under both 30-minute oxidative stress and thermal stress whilst the expression of cassettes 1-15 differed markedly under these same stressors (Figure ).
Localising a promoter within the Vibrio sp. DAT722 cassette array
It was expected that a possible location for an intra-array promoter would be indicated by cassettes that were not detectably expressed, as adjacent expressed cassettes necessarily required the presence of a promoter. Cassette 19 was not detectably expressed under any stressor, while adjacent cassettes 18 and 21 were both strongly expressed, though under differing stressors. In order to identify the promoter in this region of the array, the 5' end of the cDNA transcript was localised by PCR with nested primers bracketing the target area. An initial examination of the area between cassettes 16 and 21 (Figure ) narrowed the target area to the region between cassettes 20 and 21. The location of the promoter was further localised as shown in Figure . These PCRs showed that the majority of cDNA transcripts containing cassette 21 did not include cassette 19 and, further, that most cassette 21 transcripts commenced between the cassette 20 3' attC site and the cassette 21 ORF. The region of sequence adjacent to the cassette 21 Shine-Dalgarno sequence was examined for possible promoter sequences in an appropriate position, and a sigma 70-type promoter was found bridging the cassette 20-21 attC site (Figure ). The activity of this putative cassette 20-21 promoter was tested by measuring relative amounts of transcript on either side of the promoter using QPCR. This work showed an approximately 40-fold increase in transcript on the cassette 21 side of the promoter, indicating that this promoter was indeed functional. It was also noted in the QPCR work that low levels of YB4-MazGr transcript were present, indicating the presence of an additional promoter in the region of cassette 19, responsible for a portion of the expression of cassettes 20-21.
Consequently, the cassette 21 expression seen in the attC PCR was due to the presence of a transcript containing cassettes 20 and 21, and the additional expression of cassette 21 due to the cassette 21 promoter was not measured in this assay.
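To give a sense of scale for the QPCR comparison, the sketch below relates a fold difference in transcript to a difference in threshold cycle (Ct) between amplicons on either side of the promoter. It assumes the common simplification that template doubles every cycle (amplification efficiency of 2) and that the two amplicons amplify with equal efficiency. The text does not state how the QPCR data were quantified, so this is an illustrative calculation rather than the method used, and the Ct values shown are hypothetical.

```python
import math


def fold_from_ct(
    ct_upstream: float, ct_downstream: float, efficiency: float = 2.0
) -> float:
    """Fold excess of downstream over upstream transcript.

    A lower Ct means more template, so a positive (upstream - downstream)
    difference corresponds to more transcript downstream of the promoter.
    """
    return efficiency ** (ct_upstream - ct_downstream)


def delta_ct_for_fold(fold: float, efficiency: float = 2.0) -> float:
    """Ct difference that corresponds to a given fold difference."""
    if fold <= 0 or efficiency <= 1:
        raise ValueError("fold must be positive and efficiency greater than 1")
    return math.log(fold) / math.log(efficiency)


# Hypothetical Ct values: a 5-cycle difference is a 32-fold difference.
print(fold_from_ct(25.0, 20.0))

# Ct difference implied by the approximately 40-fold increase in the text.
print(round(delta_ct_for_fold(40.0), 2))
```

Under these assumptions, a 40-fold difference corresponds to a little over five cycles, which is well outside the cycle-to-cycle variation usually seen between technical replicates.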
Comparison of attC sites for promoter sequences
The cassette 21 promoter appeared to bridge the attC junction between cassettes 20 and 21 (Figure ). The putative -35 site was contained within the cassette 20 side of the attC site, while the -10 site was immediately adjacent to the cassette 21 section of attC. Sequence examination of other attC junctions within the DAT722 array showed that the -35 site was present, in the same position and with the same sequence, in a number of other attC sites. However, the corresponding -10 site within cassette 21 (Figure ) was not present in any of the other cassettes in the DAT722 array. This indicated firstly that this potential promoter could remain functional if cassette 21 were mobilised to a location with an attC site containing the appropriately located -35 sequence. Secondly, the -10 site, being unique to cassette 21, indicated that this promoter was unique within the array. These observations indicated that the remainder of the expression seen within the DAT722 array was due to other types of intra-array promoter. The observation of detectable but unexpressed gene cassettes adjacent to expressed cassettes in other areas of the Vibrio sp. DAT722 array suggested that additional intra-array promoters might also be located in the vicinities of cassettes 19, 35, 60, 96, 99, 106 and 108-109.
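A search of this kind, looking for a -35 element and a -10 element separated by a suitable spacer, can be sketched as follows. The hexamers used are the textbook Escherichia coli sigma 70 consensus sequences (TTGACA and TATAAT), and the 15 to 19 bp spacer range and one-mismatch tolerance are generic choices. None of these are the sequences or criteria used for DAT722, which are given in the figure and not reproduced here. The test sequence is synthetic.

```python
from typing import List, Tuple


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoters(
    seq: str,
    minus35: str = "TTGACA",   # textbook sigma 70 consensus, an assumption
    minus10: str = "TATAAT",   # textbook sigma 70 consensus, an assumption
    spacers: range = range(15, 20),
    max_mismatch: int = 1,
) -> List[Tuple[int, int, int]]:
    """Return (pos35, pos10, spacer) for each candidate promoter in seq.

    Positions are zero-based start indices of the two hexamers.
    """
    seq = seq.upper()
    n35, n10 = len(minus35), len(minus10)
    hits: List[Tuple[int, int, int]] = []
    for i in range(len(seq) - n35 + 1):
        if mismatches(seq[i:i + n35], minus35) > max_mismatch:
            continue
        for spacer in spacers:
            j = i + n35 + spacer
            if j + n10 > len(seq):
                break
            if mismatches(seq[j:j + n10], minus10) <= max_mismatch:
                hits.append((i, j, spacer))
    return hits


# Synthetic sequence: consensus -35, a 17 bp spacer, consensus -10.
synthetic = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
hits = scan_promoters(synthetic)
print(hits)
```

Applied to each attC junction in turn, a scan of this type would report junctions that carry a -35 element but lack a matching -10 element as incomplete, which mirrors the distinction drawn above between the shared -35 site and the cassette 21-specific -10 site.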
Some implications of widespread gene cassette expression in large arrays
We have found that, in the Vibrio sp. DAT722 gene cassette array, the majority of gene cassette-associated genes were expressed, that this expression was largely conditional, and that the expression was facilitated by multiple, different, intra-array promoters. These findings have a significant impact on our understanding of the utility of the integron/gene cassette system in prokaryotes:
Firstly, the widespread expression of cassette-associated genes within the 116-cassette array indicated that a wide range of the phenotypes implied by the cassette array was available to the Vibrio sp. DAT722 host. So, rather than being restricted to only those phenotypes that may be provided by cassettes proximal to the integron, this prokaryote lineage has the potential to benefit from all cassettes present, irrespective of their location within the array. Further, because the widespread expression in DAT722 was due to cassette-borne promoters that are themselves mobile genetic elements, it is likely that promoter-containing cassettes are ubiquitous in the gene cassette metagenome. Therefore, we concluded that cassette-associated genes within all large arrays may be routinely expressed and so cassette arrays in general are able to confer phenotypes in proportion to their size. Consequently, the presence of larger cassette arrays can provide distinct selective advantages to the host organism, and this may well account for the observed prevalence of large arrays in the environment [21].
Secondly, the presence of cassette-borne promoters indicates that these promoters, as well as cassette-borne ORFs, may be rearranged within the array by the action of the IntI integrase. Consequently, with the observation of polycistronic cDNA transcripts in this work and elsewhere [11], repeated rounds of rearrangement may result in the assembly of a number of tandem genes of related function within a gene cassette array, in association with an appropriate cassette-borne promoter. Such 'gene cassette operons' could result in the co-ordinated expression of multiple cassette-associated genes to produce complex phenotypes [22]. The existence of such hypothetical 'gene cassette operons' is supported by observations that differences amongst the cassette arrays of the vibrio pandemic strains were largely confined to contiguous multi-cassette indels rather than single cassette indels [20]. Similarly, the observation that a large proportion of environmental integrons have an inactive integrase gene may also reflect the fact that advantageous gene cassette operons require the preservation of not only the gene cassette complement but intra-array cassette order as well [23]. Further, where a functional integron-associated integrase gene is associated with a cassette array, it has been observed that the integrase gene may be induced by cellular stress [13]. This induction, enabling the recruitment of novel cassettes or groups of cassettes to the array, further underscores the adaptive role of cassette arrays.
Further research
We have established here a link between environmental stress and the differential expression of cassette-associated genes. It has also been established that lateral gene transfer (LGT) involving gene cassettes can rapidly and randomly produce new phenotypes in prokaryote communities [24]. However, because of the random nature of the new arrangements of cassette-borne genes and promoters produced by LGT, the resulting novel phenotypes may not necessarily be 'finely-tuned' to the stressor that causes them to be produced. Similarly, evidence for markedly decreased translation of widely spaced genes on polycistronic cassette transcripts [25] may indicate that the expression of individual cassettes shown in this work does not necessarily result in an advantageous phenotype. Consequently, it remains to be demonstrated that the conditional expression of gene cassettes, as seen in this work, produces phenotypes that appropriately address the applied stressor.