In this work, we have identified seven groups of Ty1/copia-like LTR-retrotransposons in diatom genomes. Four groups (CoDi1, CoDi2, CoDi3, and CoDi7) were found only in the P. tricornutum genome, whereas elements belonging to the CoDi4, CoDi5, and CoDi6 groups were detected in both diatom genomes. This distribution suggests either that these groups were present in the diatom common ancestor and that the CoDi1-3 groups became extinct in the lineage leading to the centric species T. pseudonana, or that representatives of each group have been separately introduced horizontally in pennate and centric diatoms. The topology of the tree presented in Figure shows that CoDi3 and CoDi4 are bootstrap-supported sister groups that share a common ancestor after the separation from CoDi1 and CoDi2. This, together with the fact that we could not detect traces of diverged remnant copies from the CoDi1-3 groups in the T. pseudonana genome by BLAST searches (data not shown), favors the horizontal transfer hypothesis to explain the presence of CoDi4 elements in the T. pseudonana genome.
Ty3/gypsy-like elements were found in the T. pseudonana genome but not in the P. tricornutum genome. We also identified RT sequences corresponding to Ty3/gypsy-like elements from the pennate diatoms P. multiseries and P. multistriata, which clearly cluster with the GyDi elements (Figure ). Although the number of diatom species for which data are available is low, this suggests that Ty3/gypsy-like elements were likely present in the diatom common ancestor and that these elements have been lost in P. tricornutum.
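The presence/absence pattern described in the two paragraphs above can be tabulated with a minimal sketch such as the following. It only restates the distribution reported in the text; the abbreviations "Pt" and "Tp", the function name, and the use of "GyDi" as a single label for the Ty3/gypsy-like elements of T. pseudonana are assumptions made for illustration, not part of the original analysis.

```python
# Presence/absence of retrotransposon groups as reported in the text.
# "Pt" = P. tricornutum, "Tp" = T. pseudonana (abbreviations assumed here).
# "GyDi" is used as one label for the Ty3/gypsy-like elements (assumption).
PRESENCE = {
    "CoDi1": {"Pt"},
    "CoDi2": {"Pt"},
    "CoDi3": {"Pt"},
    "CoDi4": {"Pt", "Tp"},
    "CoDi5": {"Pt", "Tp"},
    "CoDi6": {"Pt", "Tp"},
    "CoDi7": {"Pt"},
    "GyDi": {"Tp"},
}


def groups_by_distribution(presence):
    """Partition groups into shared, Pt-only, and Tp-only lists (hypothetical helper)."""
    result = {"shared": [], "Pt_only": [], "Tp_only": []}
    for group, genomes in sorted(presence.items()):
        if genomes == {"Pt", "Tp"}:
            result["shared"].append(group)
        elif genomes == {"Pt"}:
            result["Pt_only"].append(group)
        elif genomes == {"Tp"}:
            result["Tp_only"].append(group)
    return result


if __name__ == "__main__":
    for label, groups in groups_by_distribution(PRESENCE).items():
        print(label, ", ".join(groups))
```

Such a table says nothing by itself about ancestry or horizontal transfer; those inferences rest on the tree topology and the BLAST searches discussed above.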
Figure shows the retrotransposon sequences found in the CAMERA dataset. Although the vast majority of the sequences derived from these environmental genomic surveys are of bacterial and archaeal origin [30], the authors counted 69 18S rRNA sequences in the analysis of the Sargasso Sea data [29] and 98 in the GOS sequence collection (Doug Rusch, personal communication). Thus, some small eukaryotes were also present in these datasets. The observed higher abundance of RT domains in the fractions containing the larger cells is consistent with a higher relative eukaryote/prokaryote abundance in these samples. The RT sequences studied display a huge diversity, including some that cluster in the CoDiI and CoDiII lineages, which likely attests to the presence of diatoms in the samples. The other RT sequences may reflect the diversity of LTR retrotransposons populating the genomes of diverse tiny marine eukaryotes such as green, red, or brown algae, dinoflagellates, haptophytes, or euglenoids. For example, the abundance of RAS-clustering sequences in the Sargasso Sea fractions may be indicative of the presence of red algae, although analysis of these eukaryotic fractions did not reveal a particular abundance of red algae [33]. It will therefore be important to determine which eukaryotic branch or branches the collected RAS-like sequences come from. In addition to the CoDiI, CoDiII, and RAS sequences, other discrete clusters shown in Figure are exclusively composed of RT sequences from the CAMERA database and are likely to represent RT domains from organisms for which we have little or no genomic knowledge.
The mutagenic potential of LTR retrotransposons [34] and the effects of their accumulation [35] and recombination [36] together suggest that active retrotransposons may be major contributors to genome diversification. Accumulated data indicate that retrotransposons in plants [37], animals, and fungi respond to various forms of stress. It has also been shown in natural wild barley populations living on each side of a canyon that LTR retrotransposon dynamics contribute to genome diversity in response to sharp microclimatic divergence [38]. LTR-RTs are hence thought to play a key role in the long-term adaptation of natural populations exposed to stress by generating genetic diversity within populations. Evidence presented here suggests that this may also be the case in diatoms. For example, Blackbeard is one of the most highly expressed genes in the EST library derived from P. tricornutum cells grown under nitrate starvation, and Surcouf is highly expressed in response to DD treatment (Table , Figure ). If these expression levels correlate with completion of the retrotransposition cycle, which ends with de novo insertions, then nitrate starvation, DD exposure, and perhaps other environmental stressors could lead to an increase in genetic diversity in P. tricornutum. LTR-RTs may therefore be major drivers of genetic diversity in P. tricornutum populations. Although we have not been able to observe de novo insertion of Bkb elements following stress, this claim is supported by the different insertions that have been observed in P. tricornutum accessions isolated from different locations around the world (Figure ).
The significance of these findings is strengthened by the ecological relevance and common occurrence of stress in marine environments. Nitrogen is the most widespread limiting nutrient for marine phytoplankton [39], and transitions between nitrate-starved stratified waters and nitrate-replete upwelling conditions are a major influence governing marine diatom population oscillations [40]. Conversely, diatom-derived unsaturated aldehydes can regulate intercellular signalling, stress surveillance, and defence against grazers [41]. Diatoms can sense these aldehydes accurately, whereby subthreshold levels serve as an early-warning protective mechanism, and lethal doses initiate a cascade leading to autocatalytic cell death. Activation of Surcouf only after exposure to high levels of aldehydes supports a threshold-dependent response in which activation only occurs under acute stress conditions. Furthermore, the fact that significant aldehyde concentrations are only produced by nutrient-stressed and wounded diatoms suggests a possible role in long-term adaptation to abiotic and biotic stress [44