Like ecological communities, which vary in species composition, eukaryote genomes differ in the amount and diversity of transposable elements (TEs) that they harbor. Because TEs have a considerable impact on the biology of their host species, we need to better understand whether their dynamics reflects some form of organization or is primarily driven by stochastic processes. Here we borrow ecological concepts on species diversity to explore how interactions between TEs can contribute to structuring TE communities within their genomic ecosystem. Whereas the niche theory predicts a stable diversity of TEs because of their divergent characteristics, the neutral theory of biodiversity predicts the assembly of TE communities from stochastic processes acting at the level of individual TEs. Contrary to ecological communities, however, TE communities are shaped by selection at the level of their ecosystem, i.e., the host individual. Developing ecological models specific to the genome will thus be a prerequisite for modeling the dynamics of TEs.
Transposable elements (TEs) constitute a large proportion of many eukaryote genomes, from 4% in the yeast Saccharomyces cerevisiae to more than 70% in some plants and amphibians, and 45% in humans. The mobility and amplification of TEs represent a major source of genomic variation either by virtue of their insertion or by triggering a variety of small- and large-scale chromosomal rearrangements. Once inserted, most TE copies serve no immediate function and thus their sequences progressively decay by accumulating mutations at the neutral rate of the species and eventually disappear. Occasionally, some TE copies may be co-opted by the genome to function either as coding sequences or as regulatory elements. Whereas TEs can be said to contribute genetic variation and therefore innovation [1–6], their uncontrolled movement and proliferation pose a threat to genome integrity. Indeed, TEs are an important cause of deleterious mutations and illnesses, including in humans [7,8], and therefore several host-encoded mechanisms exist to silence or restrict their activity [9–12].
TEs are classified into different classes and subclasses on the basis of their structural organization and of their mechanisms of transposition (DNA transposons, Long Terminal Repeat (LTR) retrotransposons, non-LTR retrotransposons…), and further divided into families and sub-families (see Box 1). Genomes of different organisms contain widely differing numbers of TE families and of TE copy numbers per family (Box 1). The main challenge we are facing is therefore to understand to what extent the contrasting patterns and variations that we see across genomes in the amount and diversity of TEs reflect some sort of organization, or whether they are largely idiosyncratic, i.e., the result of stochastic processes. The dynamics of TEs has been modeled on the basis of their transposition and excision rates and their fitness effects on the host. Theory and simulations have shown that various evolutionary forces acting at the level of the host species can influence TE distribution and maintenance [14–16]. However, it is still not clear whether these forces are actually responsible for most of the variations that we observe between host species. We need to understand how much TE variation is driven by selection and drift at the level of the host, and how much is driven at the level of the elements themselves, more or less independently from the host species. The host-level component has been extensively investigated and reviewed previously [17–22], and will not be the focus of this review. Instead we will examine how interactions between TEs in the context of the genome can be likened to interactions between species in ecological ecosystems and explain some of the trends in TE composition patterns observed in eukaryotic genomes.
Ecological communities are organized in a wide range of structures. Some communities comprise just a few species at the same trophic level, with large numbers of individuals of each of the species. At the other extreme, in ecosystems such as tropical forests or coral reefs, communities may be composed of a wide array of species, each of which is represented by just a few individuals.
In each class TEs are divided into families, with a family being defined as a set of phylogenetically close TE copies that share more than 80% sequence identity. Families are sometimes themselves subdivided into sub-families corresponding to groups of sequences that share specific insertions, deletions, or substitutions. Genomes of different organisms contain widely differing numbers of TE families and of TE copy numbers per family [1,6,71]. For example, the Drosophila and pufferfish genomes have many recently active TE families with only a few copies of each, while mammalian genomes have generally much higher copy number families but only a few families concomitantly or recently active [42,71,72].
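The 80% identity criterion used in this box can be illustrated with a minimal sketch. It assumes that the two TE copies have already been aligned to the same length without gaps; the function names `percent_identity` and `same_family` are hypothetical, and real family assignment relies on dedicated alignment and clustering tools rather than on this simple comparison.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned, gap-free, and of equal length.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and pre-aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def same_family(seq_a, seq_b, threshold=80.0):
    """True if the two copies share more than `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) > threshold
```

For example, `same_family("ACGTACGTAC", "ACGTACGTAA")` compares two toy copies that differ at one position out of ten.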
If a copy of a TE is considered as an individual, one TE species comprises closely genetically related TE copies that share the same interactions with their environment. TE copies belonging to the same family or sub-family are thus considered to constitute one TE "species" (Table 1). From the ecological niche concept (Box 2), the niche for each TE species (its “genomic niche”) may be inferred both from its ‘limiting factors’ and its ‘genomic traits’, that is, any biological characteristics involved in its relationship with the genome. Limiting factors include, for example, the cellular resources required for transposition (e.g., molecular machineries and metabolisms needed to express transposition enzymes, DNA repair activities…), the number of potential insertion sites in the genome, and the multiple host mechanisms that control or influence transposition activity. TE traits could include the transposition strategy (e.g., RNA- or DNA-based transposition), the ability to preferentially insert into different genomic regions (e.g., targeted integration), the propensity to undergo horizontal transfers, and the capacity to counter host defence mechanisms.
The TE community of a genome comprises all the copies of TEs, irrespective of their sub-families, families or classes, and is analogous to the biotic portion of an ecosystem, the "abiotic" component being the genes and various kinds of non-coding sequences, repetitive (e.g., satellite DNA) and non-repetitive, as well as the intra-cellular environment. We can thus define the TE species richness as the number of TE species within a genome, and the relative abundance of TE species as the number of copies of each TE species relative to the total number of copies of all TE species (Table 1).
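The two quantities defined above can be computed directly from a list giving the TE species of each annotated copy. The following is a minimal sketch; the function name `te_community_summary` and the family labels are hypothetical and serve only as an illustration.

```python
from collections import Counter


def te_community_summary(copy_labels):
    """Return (species richness, relative abundance of each TE species).

    `copy_labels` holds one TE species label per TE copy in the genome.
    """
    counts = Counter(copy_labels)
    total = sum(counts.values())
    richness = len(counts)  # number of distinct TE species
    relative_abundance = {sp: n / total for sp, n in counts.items()}
    return richness, relative_abundance


# Hypothetical genome with 10 TE copies from three TE species
copies = ["famA"] * 6 + ["famB"] * 3 + ["famC"] * 1
richness, rel = te_community_summary(copies)
```

Here the richness is the number of distinct labels, and the relative abundances sum to one by construction.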
Because the relationships between TEs can be assimilated to those between species, an analogy between genomes and ecosystems has been initiated [23–25]. Studies that introduce the terminology of the “ecology of the genome” have thus focused on interactions, primarily between two TE types, which can be of a host-parasite, competitive or cooperative nature [26–28]. Moreover, the study of ecological communities reveals that they are organized in a wide range of structures that mirror some of the characteristics and range in TE composition observed across genomes (see Box 1). The mechanisms underlying species diversity at the same trophic level in a given community have been revisited in the light of recent advances in the ecological niche theory (Box 2), and the development of the neutral theory of biodiversity (Box 3). To date, however, no study on genome ecology has presented the fundamentals of ecological theories relative to the species diversity [29–32] (i.e., niche and neutral theories of biodiversity, see Boxes 2 and 3) or some key concepts developed on the evolution of communities and ecosystems [33–37].
The concept of the ecological niche infers interactions between species and their environment, and is defined by two components: (i) the requirement for an organism of a given species to live in a given environment (the extent to which a limiting factor, such as a resource, a predator or a parasite, influences the birth and death rate of that species) and (ii) the impact of the species on its environment (the extent to which the growth of a population alters the limiting factor, i.e., the availability of a resource or the density of a predator or parasite).
The ecological niche of a species is generally inferred from its limiting factors (food resource, predator, stress), and from either its ecological traits (traits involved in its interactions with the environment) or the performance associated with these traits (e.g., competitive or dispersal ability, resistance to pathogens or predators, potential fecundity…). Conflicting predictions can be expected with regard to the ecological niches depending on the putative mechanisms underlying the community structure.
Several theories on species diversity have been developed around the notion of interspecific competition. All species are to some extent limited by their resources or natural enemies. In communities where species exploit the same resources (competition for resources) or share the same natural enemies (apparent competition), the species that is able to maintain a positive per capita growth rate at the lowest resource level or highest natural enemy pressure will drive all other competing species to extinction (competitive exclusion principle).
Niche-partitioning hypothesis: according to this hypothesis the long-term, stable coexistence of competing species is possible because of niche partitioning. Stable competitive coexistence requires that any competing species that is relatively rare compared to the other species in the community has a higher growth potential than the other, more abundant, competing species (invasibility criterion). This criterion thus requires that the competing species have distinct ecological traits. For example, niche partitioning can occur if the competing species specialize on distinct resources (resource partitioning), or differ in terms of when (time partitioning) or where (spatial partitioning) they exploit the limiting resource. Niche partitioning may also occur as a result of interspecific trade-offs if one of the competing species outperforms the others for a given activity (i.e., has higher fecundity or faster growth), whereas others are better at other activities (e.g., competitive ability or dispersal…).
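The invasibility criterion can be made concrete with a standard two-species Lotka-Volterra competition model, which is used here only as an illustrative assumption and is not a model taken from the works cited in this box. In that model, a rare species i facing a resident species j at its carrying capacity K_j grows at the per capita rate r_i (1 - a_ij K_j / K_i), where a_ij is the competitive effect of j on i. Stable coexistence requires this rate to be positive for each species when rare. All function and parameter names below are hypothetical.

```python
def invasion_growth_rate(r_inv, k_inv, k_res, alpha):
    """Per capita growth rate of a rare invader under Lotka-Volterra competition.

    r_inv: intrinsic growth rate of the invader
    k_inv: carrying capacity of the invader
    k_res: carrying capacity of the resident (assumed to be at equilibrium)
    alpha: per capita competitive effect of the resident on the invader
    """
    return r_inv * (1.0 - alpha * k_res / k_inv)


def mutual_invasibility(r1, k1, r2, k2, a12, a21):
    """True if each species has a positive growth rate when rare."""
    return (invasion_growth_rate(r1, k1, k2, a12) > 0
            and invasion_growth_rate(r2, k2, k1, a21) > 0)
```

With equal carrying capacities, weak interspecific competition (a12 = a21 = 0.5) satisfies the criterion, whereas strong interspecific competition (a12 = a21 = 1.5) does not, which corresponds to insufficient niche partitioning.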
According to the neutral theory of biodiversity [30,31] competitive exclusion often takes so long to occur that other processes, notably demographic stochasticity (variation in population size associated with random differences among individuals with regard to survival and reproduction), speciation, and migration predominantly account for the structure of communities of competing species, making niche differences between species irrelevant. Neutrality is defined at the individual level: in a neutral model, all individuals of all species have equivalent per capita probabilities of giving birth, dying, migrating, and speciating, such events occurring randomly for any given individual. The resultant structure of a community (the relative abundance of species) is therefore not governed by ecological trait differences between species, but results solely from stochastic drift acting on the density of the species in competition. The apparent stability of species diversity in the community (or in the meta-community, when several communities are connected to each other through dispersal) may be attributed to a balance between speciation or immigration processes and the gradual loss of competing species diversity caused by demographic stochasticity (ecological drift) and competitive exclusion. Neutral models are powerful because of their minimalist aspects (individuals of all species are equivalent with respect to key processes, and the species in a community all occupy the same ecological niche), and because they predict a surprising number of complex patterns of competing species communities and metacommunities that might accurately describe species abundance and species relationships in the field.
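Ecological drift and its balance with speciation can be sketched with a simple zero-sum simulation in which every individual, whatever its species, has the same probability of dying and of leaving a descendant. This is a simplified illustration under stated assumptions (fixed community size, one death per time step, point speciation with probability `nu`), not the full models of references [30,31]; all names and parameter values are hypothetical.

```python
import random


def neutral_step(community, nu, rng, next_label):
    """One zero-sum step: a random individual dies and is replaced."""
    dead = rng.randrange(len(community))
    if rng.random() < nu:
        # Speciation: the vacant slot is filled by a brand-new species
        community[dead] = next_label
        return next_label + 1
    # Otherwise the slot is filled by the offspring of a random individual
    parent = rng.randrange(len(community))
    community[dead] = community[parent]
    return next_label


def simulate_neutral(n_individuals=200, n_species=10, nu=0.0, steps=5000, seed=1):
    """Return the final community and the species richness after each step."""
    rng = random.Random(seed)
    community = [i % n_species for i in range(n_individuals)]
    next_label = n_species
    richness = [len(set(community))]
    for _ in range(steps):
        next_label = neutral_step(community, nu, rng, next_label)
        richness.append(len(set(community)))
    return community, richness
```

With `nu = 0`, drift alone can only erode diversity, so richness never increases over time; a positive `nu` supplies new species and allows a balance between loss and gain.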
Recent theoretical investigations have attempted to develop a unified theory to explain the competitive coexistence of species based on both the niche and neutral theories [54,55]. From this perspective, niche partitioning and demographic stochasticity are both involved in structuring communities. Such combined approaches might offer an explanation for the diversity, composition, and relative abundance patterns of species observed in ecological communities. This stochastic niche theory overcomes the shortcomings of the classical niche-partitioning hypothesis, which does not predict any limit to diversity, and of the neutral theory, which does not predict any link between the traits of species and their relative abundance in communities.
In this review, we argue that several key recent discoveries in ecology may be readily applicable to TE communities and used to further develop the field of genome ecology. First, we present a brief history of "genome ecology" and summarize the analogies made previously between the genome and the ecosystem. We then review the fundamentals of ecological theories that may be relevant to understand the principles influencing TE diversity in genomes. We show that the niche theory predicts stable TE composition as long as the TEs exhibit divergent biological characteristics allowing them to occupy different genomic niches and to limit competition. The neutral theory of biodiversity provides an alternative explanation for the establishment of complex TE communities, solely based on random processes acting at the level of individual TE copies, even if one theorizes that all TEs have identical characteristics. Finally, we point out that TE communities differ from traditional ecological communities in several ways, and in particular in the way selection can shape the evolution of the community by acting at the level of the ecosystem itself, i.e., the individuals hosting the TEs. These important differences prevent us from drawing over-simplistic parallels between ecosystems and genomes, and highlight the need for developing ecological models specific to the genome.
Kidwell and Lisch were the first to use the term “ecology of the genome”, to illustrate the complexity of the interactions occurring between TEs and their host from an evolutionary perspective. They hypothesized the co-existence of “two types of elements that occupy two very different niches” within the genome (essentially heterochromatin vs. euchromatin), but did not expand the analogy to the concepts developed in community ecology. Further attempts to apply concepts borrowed from ecology to explain the population dynamics of TEs mainly focused on modeling the interdependent oscillations of two distinct entities, represented either by the host and one TE family, by two different TE families, or by two members of the same TE family competing for the same resource. Some studies [23,25,26] suggested that interactions between TEs potentially reduce or increase the probability that these elements will transpose, and therefore influence TE diversity and numbers independently of the host genome. As for species in ecological communities, the interactions between TEs could be of three types: (i) a host-parasite relationship, in which one element benefits at the expense of another, (ii) competition, with each element being negatively affected by the presence of the other, or (iii) cooperation, with both elements benefiting from the presence of the other. By extrapolating from the interactions between autonomous elements (which can supply the enzymes required for their mobility) and non-autonomous elements (which rely on the transposition machinery of autonomous copies), Le Rouzic et al. [25,39] observed a cyclical dynamic in population size between the two types of elements, closely mimicking host-parasite relationships.
Competitive interactions among TEs were also investigated using a predator-prey model, in which the diversity of TE families (which represent the prey) is driven by the strength and specificity of silencing mechanisms such as RNA interference (which acts as the predator). Le Rouzic et al. propose a parallel between concepts used in population genetic models of TE dynamics and those used in ecology. We modify and expand this approach below, incorporating recent concepts developed in community ecology that deal with species diversity (see Table 1).
Several theories on species diversity have been developed around the notion of interspecific competition (Glossary Box, Boxes 2 and 3). Hence, as suggested by Leonardo and Nuzhdin, TEs could compete for the metabolic components needed for transposition, for space in the genome and available target sequences, or for avoiding the defense system of the host. In such a competitive context, TE species are expected to be mutually exclusive. According to the niche partitioning hypothesis (see Box 2), however, competing TEs could coexist stably (same number of TE species, but number of copies within species changing through time) if the TE species with low copy number grows faster than the more abundant TE species. This ability to invade the genome (see Box 2, "invasibility" criterion) requires partitioning of the “genomic” niches of TE species, which must differ in their biological traits (Box 1). A spatiotemporal heterogeneity of the environment, by increasing the number of possible distinct ecological niches, has been shown to favor the stable coexistence of competing species. Accordingly, a more heterogeneous genomic environment is expected to increase both the diversity of genomic niches and the number of TE species coexisting in the genome. For example, the environment of the TE copies could be "heterogeneous with time" in an ecological sense, as a result of variation through host generations of the TEs targeted by the defence mechanisms of the host. At a given generation, the defence should primarily target the most abundant and active TE species, whereas it should be weak against TE species with a small number of copies. TE species are thus expected to have a high growth potential when they are rare, in agreement with the invasibility criterion of the niche theory (Box 2). Some copies of an inactive TE species could be dormant entities that could allow this TE species to persist in the genome as long as the environment remains unfavorable for its development.
For example, TE copies inactivated by DNA methylation or other epigenetic processes can be reactivated when methylation is removed. In an environment fluctuating with time, distinct TE species should bear different genomic characteristics that should result in each TE species experiencing different periods of high growth rate and different periods of dormancy. Such asynchronous TE dynamics were reported in several studies, notably in Drosophila, and for the LINE-1 elements in humans, in which competition for a host-encoded factor and/or “changing genomic environments” are assumed to drive the peculiar evolution pattern of these elements [41–43].
Heterogeneity of the TE environment might also be expected when host populations differ in the composition and amount of their TE species, as documented for example in flies. It has been observed in Arabidopsis that different copies within the same TE family can be controlled by different silencing pathways. Thus it could be that distinct host populations develop different defence mechanisms against their TEs. These TEs will encounter a new genomic environment as soon as they enter a new population following crosses between populations. Through the dispersal of their host, TEs could therefore escape host defence mechanisms and invade the genome of a new host population. This scenario could explain why bats, which (as the only flying mammals) have tremendous dispersal capacity compared to other mammals, display a dramatically different TE composition from that of other mammals examined so far, with an abundance of recently active DNA transposons. Interestingly, there is evidence that these elements have amplified sequentially with little or no overlap between families, as would be predicted under the niche theory. Models of competition between different TE species within host populations differing in their genomic environment but spatially connected through dispersal should allow testing of this hypothesis.
In contrast to the niche-partitioning hypothesis, the neutral theory of biodiversity [30,31] (Box 3) states that differences in biological traits between species do not explain the dynamics and structures of communities. According to this neutral theory, TE communities in genomes might not depend upon differences in genomic traits between TE species. In other words, the composition of TE communities may still vary even when the TE species have similar characteristics of transposition, deletion, selection intensity… In this model, stochastic variations at the level of each TE copy are sufficient to account for the complex structure of TE communities and their differentiation among genomes. Because TE species are assumed to share common biological characteristics, extinction is more likely to occur when a TE is rare than when it is abundant in the genome (caused by “genomic drift”, see Table 1 and Box 3). The apparent stability of the structure of the TE community could therefore result from a balance between the random extinction of TE species and the emergence of new families as a result of immigration or speciation. Within a genome, TE immigration could involve the migration of TE copies either between populations of the same host species, or between sexually isolated species via horizontal transfer [6,39]. Speciation would correspond to the emergence of new TE species as a result of ‘vertical diversification’ [48–50]. Clear examples of vertical diversification remain scarce in the TE literature, but it is the favored mechanism to explain the persistence and gradual diversification of L1 elements in mammals and mariner-like elements in grasses over relatively long periods (>80 Myr), with no evidence for the contribution of horizontal transfer.
Interestingly, there is also evidence that individual full-length L1 copies differ in their level of transposition activity, and that this variation is caused by essentially random processes, e.g., the genomic environment unique to each copy [52,53]. These observations fit the predictions made by the neutral theory of biodiversity that random processes at the individual level will strongly impact the structure of the entire community.
Recent ecological studies, which attempt to develop a unified theory of species biodiversity, suggest that the niche and the neutral theories of species diversity are not mutually exclusive [54,56, Box 3 "Stochastic niche theory"]. The dynamics and structure of the whole TE community could thus be explained by a combination of niche partitioning of TE species and random processes acting at the TE copy level.
Because transposition produces raw DNA, TE communities shape their own ecosystem much more directly than ecological communities do, and the accumulation of dead, inactivated copies of TE species is one major difference between ecological and TE communities. The old, dead elements can contribute to the dynamics of newer elements by forming the major constituent of the genomic landscape in which the new elements insert, whereas the genic (coding) space is essentially unavailable for TEs, as their insertions there are generally removed from the genome due to their deleterious effects. TEs therefore continuously create and shape their own environment. Advantageously, the remnant TE copies make it possible to investigate the detailed history of the dynamics of former TE communities, allowing a "molecular paleontology" of the genome [43,56–61].
According to Dawkins’ definition of the extended phenotype, genes have predictable effects that extend beyond the individual. As a result of interactions with individuals of other species, these genes may also influence their phenotype and, at a higher level, the structure and dynamics of the entire community (community phenotype) and the ecosystem processes (ecosystem phenotype). For selection at the level of the ecosystem to occur, heritable phenotypic variability must exist between multiple communities or ecosystems that are in competition with each other, and which must maintain some minimum level of integrity for long enough for selection to occur. Because of these restrictive criteria, selection at such high levels is not generally considered to be a major evolutionary force in ecology. Even so, selection at the ecosystem level has been explored empirically and theoretically using micro-ecosystems that included thousands of species of micro-organisms numbering several million individuals. The results have shown that the properties of the whole ecosystem appeared to be shaped by artificial selection at the ecosystem level. Whenever selection at the ecosystem level was omitted, the evolutionary trajectory of ecosystems changed dramatically due to the predominant effect of selection at the individual level. Selection at the ecosystem level may therefore theoretically favor traits of the organisms that confer an advantage on the ecosystem as a whole, even if these same traits constitute a selective disadvantage at the individual level.
Following this conceptual framework, TEs influence the phenotype of their ecosystem, i.e., the genome and its intracellular phenotype, referred to here as the “genome ecosystem phenotype”. An individual host comprises a set of genomes derived from a single nucleus (the zygote) with a given community of TEs. By extension, the host-individual phenotype might thus be considered as the "genome ecosystem phenotype". We can thus view the host-individual as an “integrated” genome ecosystem within a population of “integrated” ecosystems (i.e., the set of individuals within the host-population). Whenever differences in the dynamics and structure of TE communities confer unequal fitness on host-individuals, selection can operate at the genome ecosystem level to favor the TE communities that confer a selective advantage on their integrated ecosystem. This type of selection might therefore favor ecosystem phenotypic traits even if these traits result in a selective disadvantage at the TE level, as has been shown in the theoretical investigation of micro-ecosystem selection. Selection at the ecosystem level, something that is controversial in ecology, therefore seems to be widespread in genomes, where it corresponds to a process known as "selection at the (host) individual level".
The impact of selection at various levels on TE community structure and on the characteristics of TEs could be examined by relaxing the selective pressure at the “genome ecosystem” level. Based on the results of experiments performed on microecosystems, one can predict that a marked shift in TE community dynamics will occur if selection operates at a level that favors TEs with a high transposition rate. A recent study has shown that a shift in the L1 sequence increased the retrotransposition level of the new element 200-fold relative to the unmodified element. In the same way, hyperactive transposases can result from a single or a few point mutations in their coding sequence [66–68]. Relaxing selection at the genome ecosystem level may also have occurred spontaneously during the domestication of plants or animals, as suggested by recent bursts of transposition in some rice cultivars. These examples are consistent with the idea that selection at the ecosystem level (i.e., the host-individual) favors TEs that have a low transposition rate.
The fundamental rules underlying biodiversity within ecosystems and within genomes seem to be very similar, and so genomes could advantageously be explored in the light of theories developed in ecology. We thus point out the need for investigating the impact of characteristic differences between TE species (niche theory) as well as of stochastic processes at the level of TE copies (neutral theory) on the structure of TE communities. Neutral models could be inferred from models inherited from the neutral theory of biodiversity. These new models should be individual-based, and all the copies of the TE species should be equivalent with regard to transposition activity, sequence divergence, death rate, and rate of horizontal transfer, all events occurring randomly at the level of each TE copy. Such models would help test the hypothesis that random processes at the TE copy level are sufficient to account for the structures of TE communities observed in genomes. Differences in the biological characteristics of the TE species within the same community could be progressively added to the models to explore the impact of each of these variables on the probability that each TE species persists and on their relative abundance.
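As a starting point, such an individual-based neutral model might look like the following minimal sketch. It assumes discrete host generations, a per-copy transposition probability `u` and deletion probability `v` that are identical for all copies of all TE species, and a probability `h` per generation that a new TE species enters as a single copy by horizontal transfer. Sequence divergence and host selection are deliberately left out, and all names and parameters are hypothetical.

```python
import random


def simulate_te_neutral(init_counts, u, v, h, generations, seed=0):
    """Neutral copy-level dynamics of a TE community.

    init_counts: dict mapping TE species label -> initial copy number
    u, v: per-copy, per-generation transposition and deletion probabilities
    h: per-generation probability of one horizontal transfer event
    Returns the final copy numbers and the richness after each generation.
    """
    rng = random.Random(seed)
    counts = dict(init_counts)
    next_id = 0
    richness = []
    for _ in range(generations):
        new_counts = {}
        for species, n in counts.items():
            # Every copy is equivalent: same chance to transpose or be deleted
            births = sum(rng.random() < u for _ in range(n))
            deaths = sum(rng.random() < v for _ in range(n))
            remaining = n + births - deaths
            if remaining > 0:  # species with no copy left are extinct
                new_counts[species] = remaining
        if rng.random() < h:
            # Immigration: a new TE species arrives as a single copy
            new_counts[f"HT{next_id}"] = 1
            next_id += 1
        counts = new_counts
        richness.append(len(counts))
    return counts, richness
```

Differences between TE species (for example, species-specific values of `u`) could then be introduced one at a time and their effect on persistence and relative abundance compared with this neutral baseline.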
The genome and ecological ecosystems present notable differences, especially in the levels at which selection operates to shape and alter the species communities over time. Developing genome ecology will therefore not just consist of merely extrapolating concepts derived from ecological studies, but will necessitate the development of ecological models specific to the genome, requiring closer collaboration between genome biologists and ecologists. While we have limited our analogy to the portion of the genome that is constituted by TEs, future investigations might incorporate other components of the genome as well: genes, cis-regulatory elements and other repeated sequences such as satellites, microsatellites, simple sequences, and retrosequences, some of which also result from TE activity [1–6].
TEs as a source of genetic innovation: in addition to their genes, genomes are composed of various repeated sequences, including TEs, which often make up a healthy fraction of the genomic space. TEs can insert within or near genes, thereby inactivating them or modifying their expression. Because TEs can transpose at high frequency (relative to other sources of mutations), at a rate ranging from 10⁻³ to 10⁻⁵ per element per generation (even 10⁻² in some specific crosses) [78–80], they are powerful mutators. The estimated rate of visible DNA mutations due to TE insertions differs between species. In Drosophila fruit flies, 50–80% of spontaneous visible mutations can be attributed to TE insertions. In contrast, in the human genome, in which the numerous copies of non-LTR retrotransposons are less active and mostly fixed in position, this value is between 0.1 and 1%. In addition to acting as insertional mutagens, TEs can also induce various chromosomal rearrangements and, consequently, they can have a major impact on the host phenotype.
Among the various TE copies inserted within genomes, some are deleterious and are eliminated by natural selection, others are neutral with regard to selection, and others again have been co-opted into genes or cis-regulatory elements and maintained by natural selection for host function. There are dozens of examples of TEs that have been co-opted to serve essential functions for their host, including telomere maintenance in Drosophila, antibody diversification in the adaptive immune system of jawed vertebrates, nervous system development in mammals, light-sensing in flowering plants, and rearrangements of the germline genome during the life cycle of ciliated protozoans.
TEs and epigenetic control: Epigenetic control of gene expression occurs when the gene's regulatory instructions are laid out in terms of modifications of the DNA or of proteins associated with it (mainly histones) that do not alter the actual DNA sequence. These controls include methylation of the DNA itself (on cytosines), and post-translational modifications (e.g., methylation, acetylation…) of the histone proteins around which the DNA is wrapped. Upon insertion in the genome, TEs are frequently marked by repressive epigenetic modifications that suppress their expression. These modifications induce local chromatin alterations that may spread to the surrounding DNA and modulate (generally repress) the expression of adjacent genes in a tissue-specific or developmentally specific manner [9,11].
We would like to thank E. Lerat, F. Menu, and C. Vieira for their comments. This work was supported by the “Centre National de la Recherche Scientifique”, the GDR 2157 on "Transposable Elements" and the GDR “ComEvol”.