Background and Aims Banana genomes harbour numerous copies of viral sequences derived from banana streak viruses (BSVs) – dsDNA viruses belonging to the family Caulimoviridae. These viral integrants (eBSVs) are mostly defective, probably as a result of ‘pseudogenization’ driven by host genome evolution. However, some can give rise to infection by releasing a functional viral genome following abiotic stresses. These distinct infective eBSVs correspond to the three main widespread BSV species (BSOLV, BSGFV and BSIMV), fully described within the Musa balbisiana B genomes of the seedy diploid ‘Pisang Klutuk Wulung’ (PKW).
Methods We characterized eBSV distribution in a Musa sample including seedy BB diploids and interspecific hybrids with Musa acuminata exhibiting different levels of ploidy for the B genome (ABB, AAB, AB). We used representative samples of the two areas of sympatry between M. acuminata and M. balbisiana species representing the native area of the most widely cultivated AAB cultivars (in India and in East Asia, ranging from the Philippines to New Guinea). Seventy-seven accessions were characterized using eBSV-related PCR markers and Southern hybridization approaches. We coded both sets of results to create a common dissimilarity matrix with which to interpret eBSV distribution.
Key Results We propose a Musa phylogeny driven by the M. balbisiana genome based on a dendrogram resulting from a joint neighbour-joining analysis of the three BSV species, showing for the first time lineages between BB and ABB/AAB hybrids. eBSVs appear to be relevant phylogenetic markers that can illustrate the M. balbisiana phylogeography story.
Conclusion The theoretical implications of this study for further elucidation of the historical and geographical process of Musa domestication are numerous. Discovery of banana plants with B genome non-infective for eBSV opens the way to the introduction of new genitors in programmes of genetic banana improvement.
Modern sequencing technologies have revealed the frequent integration of viruses into their host genomes regardless of whether their life cycle includes an obligatory integration step. Such endogenous viral elements (EVEs) have been reported in many genomes in the animal kingdom and can represent a large proportion of some of them (Horie and Tomonaga, 2011; Feschotte and Gilbert, 2012). In palaeovirology studies, these fossils of initial integrations can provide snapshots of past viral diversity useful in tracing viral–host co-evolution in the animal kingdom (Patel et al., 2011; Etienne and Emerman, 2013). Similarly, numerous viruses have invaded plant genomes extensively (Bejarano et al., 1996; Jakowitsch et al., 1999; Harper et al., 1999; Ndowora et al., 1999; Richert-Pöggeler et al., 2003; Becher et al., 2014), in most cases DNA viruses belonging to at least five genera of the family Caulimoviridae (Hohn et al., 2008). Caulimoviridae all comprise double-stranded (ds) circular DNA viruses using a virus-encoded reverse transcriptase (RT) to replicate. They are classed as pararetroviruses (PRVs), and as endogenous pararetroviruses (EPRVs) when present within their host genome after an illegitimate recombination (Harper et al., 2002). Some EPRVs contribute a significant proportion to the host plant genome. Tobacco vein clearing virus (TVCV) is present in the tobacco genome at more than 10³ copies (Jakowitsch et al., 1999) whereas others, such as Petunia vein clearing virus (PVCV) and banana streak virus (BSV) (Gayral et al., 2008), show 50–100 copies in the haploid Petunia hybrida genome (Richert-Pöggeler and Shepherd, 1997; Richert-Pöggeler et al., 2003) and fewer than ten in the Musa balbisiana genome, respectively. Although integration events are probably frequent during viral infection, few become fixed within the host genome, and probably only if they provide a benefit (Iskra-Caruana et al., 2014a, b).
In specific cases, EPRVs can still be infective by producing functional viral genomes and infectious viral particles. Such infective EPRVs are rare; only three cases have been reported to date in the plant kingdom: TVCV EPRV in tobacco (Lockhart et al., 2000), PVCV EPRV in petunia (Richert-Pöggeler et al., 2003) and BSV EPRV in banana (Ndowora et al., 1999). In banana, BSV EPRVs are named endogenous BSVs (eBSVs).
BSV belongs to the Caulimoviridae and, like almost all other members of the genus Badnavirus, is a bacilliform circular dsDNA virus whose genome contains three consecutive open reading frames (ORFs) (King et al., 2012; Hohn and Rothnie, 2013). BSV is a complex of different viruses that all induce the same disease in banana plants: banana streak disease (BSD) (Lockhart and Olszewski, 1993). To date, 15 full-length BSV species have been described and sequenced completely (Harper and Hull, 1998; Geering et al., 2005; Lheureux et al., 2007; Gayral and Iskra-Caruana, 2009; Geering et al., 2011; King et al., 2012).
In the past 20 years, major problems due to BSV have not been epidemics caused by natural transmission via mealybugs or the high incidence of BSV in east African AAA bananas because of the use of infected suckers, but rather spontaneous outbreaks arising from infective eBSVs, which can release functional viral genomes producing infectious episomal viruses (Iskra-Caruana et al., 2010). So far, such eBSVs have been restricted to Musa balbisiana genomes (denoted B) only (Lheureux et al., 2003; Gayral et al., 2008; Iskra-Caruana et al., 2014a). Consequently, BSV has become the main constraint worldwide for breeders using either diploid M. balbisiana, or genotypes harbouring at least one B genome as progenitors to introgress desirable traits of agronomic interest as well as tolerance to the severe black sigatoka disease. Genomic and abiotic stresses such as those experienced during interspecific crossing and micro-propagation by in vitro culture play a major role in the occurrence of BSV infection in newly created interspecific hybrids and natural hybrids harbouring the B genome (Harper et al., 1999; Ndowora et al., 1999; Dallot et al., 2001; Lheureux et al., 2003; Côte et al., 2010). Recently, integrants of the three most widely distributed BSV species (BSOLV, BSGFV and BSIMV) have been fully characterized (genomic, genetic, cytogenetic and infective capacity) for the seedy M. balbisiana diploid ‘Pisang Klutuk Wulung’ (PKW) (Gayral et al., 2008, 2010; Chabannes et al., 2013). The results revealed that each BSV species is present as a complex insertion at a single locus resulting from a single integration event. BSOLV and BSGFV exist as two alleles called eBSOLV allele-1 (eBSOLV-1)/eBSOLV allele-2 (eBSOLV-2), and eBSGFV-7/eBSGFV-9 alleles, respectively. BSIMV showed only one allele, eBSIMV (Fig. 1).
The eBSOLV-1 and eBSGFV-7 alleles are identified as infective alleles producing viral particles following homologous recombination in banana genotypes having one B genome (Iskra-Caruana et al., 2010; Chabannes and Iskra-Caruana, 2013).
Gayral et al. (2010) developed molecular markers specific to each allele for BSGFV and BSIMV to genotype PKW-related eBSVs in M. balbisiana species. They first established a microsatellite-based phylogeny of M. balbisiana diploids (BB), revealing few polymorphisms between the six identified groups. Preliminary genotyping results indicated that PKW-related eBSVs are limited strictly to the B genome. They are strongly conserved; all BB genotypes tested positive for eBSGFV, while eBSIMV was absent or mutated in some BB accessions. These results suggest an integration event shortly after speciation of M. balbisiana in the genus Musa and before intraspecies diversification (approx. 27·9 Mya) (Christelová et al., 2011). However, the absence of geographical data and the limited number (20) of available M. balbisiana accessions preclude a complete description of eBSV polymorphism and evolution on Musa species.
Following the precepts of palaeovirology developed in the animal kingdom, in this study we developed and combined markers and techniques to further characterize PKW-related eBSV distribution in Musa for BSOLV, BSGFV and BSIMV species. We investigated their distribution and insertion polymorphisms as well as pattern evolution among as large a sampling of BB seedy diploids as possible. Moreover, the sampling was extended with interspecific banana genotypes (ABB, AAB, AB) to catch new B genome resources absent – not collected or extinct – from the available M. balbisiana diploid diversity (De Jesus et al., 2013). These hybrids also represent the most widely cultivated AB, AAB or ABB cultivars native to the two areas where regions of sympatry between M. acuminata and M. balbisiana have been reported. One is in India and the other in East Asia, stretching from the Philippines to New Guinea (Perrier et al., 2009). Musa interspecific hybrid creation is still not well understood, and different processes have been proposed. Carreel et al. (2002), using cytoplasmic markers, showed that several AAB and ABB derived from a preliminary AB hybrid providing non-reduced AB gametes combined later to A or B gametes, whereas Perrier et al. (2009) using simple sequence repeat (SSR) nuclear markers showed non-reduced AA gametes to be not only at the origin of several AAA cultivars but also associated with B gametes providing AAB hybrids such as the Indian ‘Pome’ cultivar.
Finally, we tested whether eBSV markers can help describe M. balbisiana phylogeny. We show how the known phylogeny of banana accessions can help elucidate eBSV integration pattern diversity as well as how eBSV polymorphism can help understand the particularly unresolved question of M. balbisiana diversity.
Sampling of Musa diversity was based on genotyping of 22 SSR nuclear markers in a population of more than 500 accessions (Hippolyte et al., 2012). To increase the diversity of B genomes, this sampling was complemented with 23 new accessions including several recently collected M. balbisiana and interspecific AAB and ABB hybrids of interest. A sample of 77 accessions was thus defined [24 BB, 11 ABB, 26 AAB, three AB, six AA, seven outgroups (OG)] (Table 1) representative of (1) the two species at the origin of all cultivated bananas (M. acuminata and M. balbisiana), (2) diploid and triploid hybrids of these two species and (3) some other Musa species as outgroups. This sample was characterized by nuclear genome as well as eBSV insertions. Fresh leaf samples were kindly supplied by the in vivo germplasm collections of CIRAD in Guadeloupe and the International Institute of Tropical Agriculture (IITA) in Nigeria; the INIBAP Transit Center (ITC) in Leuven (Belgium) supplied plantlets from in vitro germplasm collection. Each genotype was documented with its genome constitution and subgroup classification according to the current agro-morphological classification [IPGRI-INIBAP (Bioversity International, 2016)], and ploidy levels were estimated by flow cytometry (Dolezel et al., 1997).
Total genomic DNA was extracted from banana leaf tissue according to the method of Gawel and Jarret (1991). The quality of DNA was assessed visually under UV light after migration of 5 μL of DNA sample in a 1 % agarose gel in 0·5× TBE buffer [45 mM Tris-borate, 1 mM EDTA (pH 8)], stained with ethidium bromide, with a Nanodrop 2000 (Thermo Scientific, Wilmington, DE, USA) and by PCR amplification of the housekeeping Musa actin gene (see below).
Among the 22 SSRs of the previous analysis (developed from M. acuminata cv. ‘Gobusik’ and M. balbisiana cv. PKW), 17 were selected for the present study (Supplementary Table S1). Microsatellite analysis followed the protocol developed by Hippolyte et al. (2012). The 17 SSRs were shown to be independent and to be distributed among ten of the 11 linkage groups (Hippolyte et al., 2010).
Regardless of the method of construction, the accuracy of a diversity tree relies on the representativeness of the sample analysed. To increase this representativeness, the results for the 17 SSR markers on the 77 sampled accessions were concatenated with the results of the analysis developed by Hippolyte et al. (2012). The band level notations were adjusted on the subset of 54 accessions common to both analyses. The resulting data matrix on 567 accessions was used to calculate dissimilarities between pairs of accessions. The dissimilarity estimated from co-dominant SSR markers was the proportion of shared alleles, which has proven effective in reconstructing correct genealogical relationships. However, this measure cannot be applied directly to our data, which mixes diploids and triploids. Therefore, an extended index as defined by Hippolyte et al. (2012) was used to apply to two diploids and two triploids as well as a diploid and a triploid.
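The shared-allele measure mentioned above can be illustrated with a minimal sketch for the diploid case only. It assumes the classical form of the index (one minus the proportion of alleles shared, averaged over loci scored in both accessions); the extended diploid/triploid index of Hippolyte et al. (2012) is not reproduced here, and the locus names and allele sizes are hypothetical.

```python
from collections import Counter


def shared_allele_dissimilarity(geno_a, geno_b):
    """Return 1 - proportion of shared alleles for two DIPLOID genotypes.

    Each genotype maps a locus name to a pair of allele sizes.
    Only loci scored in both genotypes are compared (an assumption).
    """
    shared = 0
    total = 0
    for locus in geno_a.keys() & geno_b.keys():
        alleles_a = Counter(geno_a[locus])
        alleles_b = Counter(geno_b[locus])
        # Multiset intersection counts alleles present in both genotypes.
        shared += sum((alleles_a & alleles_b).values())
        total += 2  # two alleles per locus in a diploid
    if total == 0:
        raise ValueError("the two genotypes have no locus in common")
    return 1.0 - shared / total


# Hypothetical SSR genotypes (allele sizes in bp).
acc_x = {"ssr1": (120, 124), "ssr2": (200, 200)}
acc_y = {"ssr1": (120, 126), "ssr2": (200, 204)}
print(shared_allele_dissimilarity(acc_x, acc_y))
```

Mixing ploidy levels requires a different denominator per pair, which is why the plain form above cannot be applied directly to a diploid/triploid comparison.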
A diversity tree was built from the dissimilarity matrix on 567 accessions, using the neighbour-joining (NJ) algorithm (Saitou and Nei, 1987) implemented in DARwin v5·0·155 software (Perrier and Jacquemoud-Collet, 2006; http://darwin.cirad.fr/darwin). The tree is too large to be shown here, and thus a sub-tree gathering our 77 accessions was extracted from the whole tree. We used PowerMarker software to calculate the heterozygosity of diploids (Liu and Muse, 2005).
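For readers unfamiliar with neighbour joining, the following is a minimal sketch of the Saitou and Nei (1987) agglomeration step applied to a small dissimilarity matrix. It is not DARwin's implementation; the function name, the output format and the five-taxon example matrix are illustrative assumptions.

```python
def neighbour_joining(names, dist):
    """Plain neighbour joining on a symmetric dissimilarity matrix.

    Returns (joins, last_edge): each join is
    (node_a, node_b, branch_a, branch_b, new_node) and last_edge is
    (node_a, node_b, length) connecting the two remaining nodes.
    """
    if len(names) < 2:
        raise ValueError("at least two taxa are required")
    nodes = list(names)
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    joins = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        best = None
        for x in range(n):
            for y in range(x + 1, n):
                a, b = nodes[x], nodes[y]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        u = f"node{next_id}"
        next_id += 1
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
        joins.append((a, b, len_a, len_b, u))
    a, b = nodes
    return joins, (a, b, d[a][b])


# Hypothetical five-taxon dissimilarity matrix.
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins, last_edge = neighbour_joining(taxa, matrix)
for join in joins:
    print(join)
print(last_edge)
```

Ties between candidate pairs are broken here by input order; production software may resolve them differently, which can change the topology on poorly resolved data.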
A Southern blot protocol was developed specifically for eBSV banana analysis. The BSOLV, BSGFV and BSIMV genomes previously cloned separately in plasmids were used as probes for hybridization on digested genomic DNA of the different banana accessions. Restriction enzymes were chosen based on the molecular sequence of eBSGFV, eBSOLV and eBSIMV alleles present in PKW. From the restriction map of these sequences, we chose enzymes that would cleave eBSV into at least five fragments of different sizes (between 1000 and 8000 bp to facilitate observation on gels), allowing the eBSVs to be discriminated from each other and permitting allelic differentiation.
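The enzyme-selection criterion can be sketched as a simple in silico double digest. The recognition sites and cut offsets below are the standard ones for HindIII, BamHI, SpeI and DraI; the function names, the treatment of the sequence as linear, and the reading of the size criterion (at least five distinct fragment sizes within 1000–8000 bp) are assumptions made for illustration.

```python
# Recognition site and top-strand cut offset for each enzyme.
SITES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "SpeI": ("ACTAGT", 1),     # A^CTAGT
    "DraI": ("TTTAAA", 3),     # TTT^AAA (blunt)
}


def digest(sequence, enzymes):
    """Return fragment lengths (5'->3' order) of a linear sequence."""
    seq = sequence.upper()
    cuts = set()
    for enzyme in enzymes:
        site, offset = SITES[enzyme]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def is_informative(fragments, min_bp=1000, max_bp=8000, min_count=5):
    """Check for enough gel-resolvable fragments, all of distinct sizes."""
    sized = [f for f in fragments if min_bp <= f <= max_bp]
    return len(sized) >= min_count and len(set(sized)) == len(sized)


# Toy sequence with one HindIII and one BamHI site.
toy = "CC" + "AAGCTT" + "GGGG" + "GGATCC" + "TT"
print(digest(toy, ["HindIII", "BamHI"]))
```

Because all four sites are palindromic, scanning one strand is sufficient. Fragments of near-identical size would co-migrate on a gel, so a real selection would use a size tolerance rather than strict inequality.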
Total plant genomic DNA (40 μg) and PKW BAC DNA clones carrying eBSV (1·5 μg) were digested separately with 1 U μg⁻¹ of each enzyme of the pair HindIII/BamHI for BSOLV and SpeI/DraI for BSGFV and BSIMV. Digested DNA was then purified by dialysis on a cellulose membrane (Millipore, Molsheim, France), separated by electrophoresis on 1 % agarose gels run for 16 h at 40 V in 0·5× TBE and transferred to positively charged Hybond N+ nylon membrane (GE Healthcare, Vélizy-Villacoublay, France) according to the manufacturer’s instructions. Membranes were hybridized overnight at 65 °C with radiolabelled probes containing the complete genome of BSOLV, BSGFV or BSIMV (Chabannes et al., 2013), and then treated according to the manufacturer’s instructions. Autoradiography was performed using a phosphorimager (Typhoon FLA 9000, GE Healthcare) following 24 h of exposure.
We conducted PCR and derived cleaved amplified polymorphic sequences (dCAPS) analysis for the three BSVs studied. The PCR and dCAPS data were scored according to the presence or absence of fragments for each accession. We scored as present only those dCAPS markers showing the results expected for PKW.
Hybridization patterns resulting from Southern blots were analysed using the software ImageQuant TL (GE Healthcare). This software permits automatic fragment detection on the membrane image and calculates their size by referring to the ladder and positive controls present on each membrane. A visual control of final hybridization patterns was conducted to validate data obtained by means of the software. We used two pairs of enzymes to digest DNA for all samples as described above. Each sample pattern was recorded according to the presence or absence of expected bands/fragments obtained following separate hybridization with the three BSV probes by comparison with both the PKW and the allelic BAC clone patterns.
For each banana accession, we obtained a final picture of the eBSV structure based on the presence/absence of fragments for each BSV (Fig. 2A–C).
We represented each eBSV as a separate zone referring to PKW eBSV structures. We coded analysis from PCR and Southern blot as follows: both markers in the same zone as presence 1, absence 0, Southern-blot fragments only 2. Other markers, such as dCAPS markers and eBSIMV-Junction PCR, were coded separately as follows: presence 1 and absence 2. This scoring method is presented in Fig. 2 and the data are shown in Supplementary Tables S3–S5.
The dissimilarity between two accessions was calculated as the proportion of cases where the two accessions were not in agreement (presence/absence). It was considered that Southern blot data were more informative than PCR data regarding the final eBSV structure. Therefore, when estimating the dissimilarity between two accessions, disagreements in Southern blot results were considered as full differences, each contributing a value of 1 to the dissimilarity, while disagreements in PCR data contributed a weight lower than 1. After testing several values, a weight of 0·2 was retained. Indeed, this weight allows us to resolve the known phylogenetic links already observed by Hippolyte et al. (2012); for example, it clearly illustrates the lineage between the BB Eti Kehel accession and the Indian ABB and AAB hybrids described in this latter publication.
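The double-weight scheme can be sketched as follows. The weights 1 and 0·2 come from the text; dividing by the total weight to obtain a proportion, the data layout and the example scores are assumptions, since the exact normalization used is not stated.

```python
SOUTHERN_WEIGHT = 1.0  # full difference, as stated in the text
PCR_WEIGHT = 0.2       # retained weight for PCR disagreements


def weighted_dissimilarity(acc_a, acc_b):
    """Weighted proportion of disagreeing scores between two accessions.

    Each accession is a dict with two equal-length score lists:
    'southern' and 'pcr'. Normalizing by total weight is an assumption.
    """
    mismatch = 0.0
    total = 0.0
    for key, weight in (("southern", SOUTHERN_WEIGHT), ("pcr", PCR_WEIGHT)):
        scores_a, scores_b = acc_a[key], acc_b[key]
        if len(scores_a) != len(scores_b):
            raise ValueError(f"'{key}' score lists differ in length")
        for score_a, score_b in zip(scores_a, scores_b):
            total += weight
            if score_a != score_b:
                mismatch += weight
    if total == 0:
        raise ValueError("no markers to compare")
    return mismatch / total


# Hypothetical scores for two accessions.
acc_1 = {"southern": [1, 1, 0, 1], "pcr": [1, 0]}
acc_2 = {"southern": [1, 0, 0, 1], "pcr": [1, 1]}
print(weighted_dissimilarity(acc_1, acc_2))
```

With these scores, one Southern and one PCR disagreement give a weighted mismatch of 1·2 out of a total weight of 4·4, so a PCR disagreement moves the dissimilarity five times less than a Southern one.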
A specific procedure was developed to calculate the dissimilarity matrices according to our double weight system. A matrix was calculated separately for each eBSV and used to build a diversity tree using the NJ algorithm (Saitou and Nei, 1987) with 1000 bootstrap replicates (DARwin v5.0.155 software, Perrier and Jacquemoud-Collet, 2006). The joint analysis of the three eBSVs was realized by a synthetic dissimilarity matrix calculated as the sum of the three eBSV dissimilarities, each with a specific weight to compensate for the unequal number of observed fragments. An NJ tree was built from this overall dissimilarity.
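The synthetic matrix can be sketched as a weighted element-wise sum of the three per-eBSV matrices. The weights actually used are not given in the text, so the values below are hypothetical placeholders, as are the function name and the example matrices.

```python
def combine_dissimilarities(matrices, weights):
    """Weighted element-wise sum of square dissimilarity matrices."""
    if not matrices or len(matrices) != len(weights):
        raise ValueError("one weight is required per matrix")
    size = len(matrices[0])
    for matrix in matrices:
        if len(matrix) != size or any(len(row) != size for row in matrix):
            raise ValueError("all matrices must be square and of equal size")
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights))
         for j in range(size)]
        for i in range(size)
    ]


# Hypothetical 2 x 2 matrices for eBSOLV, eBSGFV and eBSIMV.
d_olv = [[0.0, 0.4], [0.4, 0.0]]
d_gfv = [[0.0, 0.1], [0.1, 0.0]]
d_imv = [[0.0, 1.0], [1.0, 0.0]]
# Placeholder weights; the study's actual values are not reported here.
overall = combine_dissimilarities([d_olv, d_gfv, d_imv], [0.5, 0.3, 0.2])
print(overall)
```

Choosing weights that shrink the contribution of the eBSV with the most scored fragments keeps any single species from dominating the joint tree, which is the stated purpose of the weighting.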
A Musa phylogeny representative of the diversity of both wild and cultivated M. acuminata (denoted A)- and M. balbisiana (denoted B)-derived genotypes was proposed by Perrier et al. (2009, 2011) and Hippolyte et al. (2012) based on SSR marker analysis. As this phylogeny was not representative of M. balbisiana diversity, we genotyped 23 seedy M. balbisiana diploids as well as interspecific hybrids, diploid (ABB) or haploid (AAB) for the B genome. Combining the two datasets, we performed a novel microsatellite-based analysis on a total of 567 accessions. From this overall phylogeny (data not shown), we extracted a sub-tree corresponding to our 77 accessions constituting the sample and rooted on the outgroup accessions (OG) (Fig. 3). The main structure of the phylogeny revealed a contrast between M. acuminata and M. balbisiana pure genomes.
All M. acuminata diploids (AA), as well as Musa laterita, which is a Musa species genetically close to M. acuminata Long Tavoy (Carreel et al., 1994, 2002), and the AAB hybrid Pisang Nangka, are present at the root of the dendrogram. All seedy M. balbisiana diploids (BB) formed a distinct monophyletic group structured into sub-groups (called microsatellite groups, Msat 1–7) showing little diversity, as first described by Gayral et al. (2010). We observed an additional Msat group, named 7, grouping accessions absent from the previous analysis by Gayral et al. (2010). It included three M. balbisiana diploids, recently collected in South China (Chi 1–3) and Lep Chang Kut originating from China. These accessions are very close and probably belong to the same genetic population (Fig. 3).
All interspecific hybrids and AA diploids are organized as successive groups apart from the BB diploid group; two main forces (genotype and geography) explained the observed distribution. We observed that accessions in the same group had the same genotype (AAB/AB, ABB) (Fig. 3). Several of these groups corresponded to triploid subgroups, such as Pome, Plantains, Awak or Silk, defined based on morphological characters. Next, we observed a second level of structure based on the native geographical origin of the accessions, grouping the interspecific hybrids schematically into accessions from India, and accessions from a large South-East Asia region ranging from Papua New Guinea to the Philippines and Indonesia. The heterozygosity of diploid samples was estimated from SSR alleles (Supplementary Table S1). AA and AB diploids showed a heterozygosity twice that recorded for seedy BB diploids (Supplementary Table S1), in agreement with the reduced diversity observed.
The full-length eBSOLV, eBSGFV and eBSIMV allelic structures of the seedy diploid M. balbisiana PKW were established definitively by Gayral et al. (2008) for BSGFV and by Chabannes et al. (2013) for BSOLV and BSIMV. They developed specific PCR and dCAPS markers to differentiate eBSV from BSV (Supplementary Fig. S1). Thus, the PCR and dCAPS results provide an eBSV ID for each BSV species. We used these markers to check for the presence of PKW-related eBSV in the B genome on the 77 Musa accessions and established an eBSV ID for each BSV species (Supplementary Tables S3–S5).
No eBSOLV, eBSGFV or eBSIMV IDs were identified in either the outgroup or the accessions of M. acuminata, confirming their B genome restriction (supplementary data). eBSV IDs were recorded for most of the other accessions concerning BSGFV as well as BSOLV and BSIMV, indicating the wide BSV colonization of M. balbisiana genomes by these three BSV species. Most accessions yielded systematic PCR amplifications with the two Musa-junction PCR markers, indicating a common locus of integration into the B genome for each BSV species. eBSGFV IDs were strongly conserved whereas eBSOLV and eBSIMV IDs appeared more diverse and rearranged.
eBSV IDs gave a preliminary picture of eBSV structure when PCR results were positive. However, a negative PCR result due to either no fragment or a fragment not recognized by the PCR primers remains difficult to interpret. To complete the picture of eBSV structure, we developed a Southern blot approach (see Materials and methods) generating specific restriction fragments to interpret patterns and differentiate between alleles (Fig. 2A–C). To maximize the information gleaned, we cross-hybridized each Southern blot membrane with the two other viral probes (data not shown).
We analysed the Southern blot patterns by referring to those obtained with PKW for the three BSV species (Fig. 2D–F). No hybridization occurred in either the outgroup or the AA accessions according to eBSV ID results. Among the other accessions, patterns recorded for eBSGFV appeared strongly conserved (Fig. 2E). This could reflect the similarity of eBSV alleles (Fig. 2B), showing only one difference for the 5-GF fragment. In comparison, eBSOLV Southern blot analysis showed a larger diversity of patterns, probably reflecting eBSOLV allele differences. We observed one over-sized fragment (named 9-OL) in several accessions (Fig. 2D). Because of its size, and because it was always correlated to the absence of both the 4-OL and the 2-OL fragments when the genotype was haploid for the B genome, we considered that the 9-OL fragment corresponded to non-digestion due to a single nucleotide polymorphism at this restriction site. Thus, we included this fragment in our final analysis because it represented a permanent difference between accessions. Next, we counted the two close 1-OL and 6-OL fragments without the occasional extra fragment (noted 6-OL+1-OL) resulting from partial digestion because they are always present. eBSIMV patterns then ranged from similar to PKW, to totally different or absent (Fig. 2C). The differences probably corresponded to various alleles as in the case of eBSOLV.
PCR IDs and Southern blots give complementary data that are both useful and relevant to propose a representation of PKW-related eBSV structure for each accession. Indeed, all restriction enzyme fragments are associated with PCR markers for both eBSGFV and eBSOLV, except fragments 2-OL and 5-OL for eBSOLV (Fig. 2A–C) and the main part of the eBSIMV structure, which forms a tandem full-length linear viral genome precluding the design of specific PCR markers (Supplementary Fig. S1). Nevertheless, to propose an accurate picture of PKW eBSV allele diversity in M. balbisiana species, we coded our Southern blot data together with PCR-based results to interpret eBSV distribution (see Materials and methods).
Three separate dendrograms of PKW-related eBSV distribution were inferred by the NJ method for each BSV species from the total results of 77 accessions.
The PKW-related eBSOLV dendrogram was divided into six subsets, ranging from PKW eBSOLV, to modified PKW eBSOLV to no eBSOLV (Fig. 4). This reflects strong allelic changes in internal organization as observed for seven accessions rather than in the flanking eBSOLV fragments, which are always conserved. Indeed, a large number of fragments are missing for AAB Slen, AAB Luba, AAB PRB, AAB Mur, BB LCK, BB Cam and AAB Tig accessions, all ABB hybrids having an Asian origin. The other accessions are distributed into two allelic-based divergent groups. The first grouped into three sub-groups ranging from PKW eBSOLV alleles (five accessions), to slightly modified PKW eBSOLV alleles (13 and 17 accessions, respectively). The second group gathered accessions having strong integration pattern changes in fragments making allele differences: 4-OL and 5-OL fragments for eBSOLV-1, and 8-OL fragments for eBSOLV-2. The 7-OL fragment was always lacking when the integration pattern changes were large. We assumed that all PKW eBSOLV modifications corresponded to novel alleles showing large eBSOLV diversity as at least 21 additional alleles are reported. Almost exact PKW eBSOLV alleles were found in only six accessions: three of the Msat-4 group, BB 211 and two ABB hybrids from the Asian group. eBSOLV-1 is massively conserved in BB, AAB and ABB accessions from the Asian group and in AAB Ceylan from the Indian group. Newly inserted BB diploid accessions (Chi 1, 2 and 3) forming a new microsatellite group presented specific eBSOLV alleles and grouped with the accessions gathering ten other AAB, ABB and AB hybrids, all from the Indian group.
None of the AA and outgroup accessions, nor two AAB hybrids (Nangka and Kunaimp), presented any eBSOLV. The Nangka accession already grouped with AA diploid accessions in the Musa phylogeny (Fig. 3) and was thought to carry an M. acuminata genotype only.
The PKW-related eBSGFV dendrogram clearly separates into two parts: with and without eBSGFV (Fig. 5). The eBSGFV-forming group showed a remarkable conservation of PKW eBSGFV alleles as, except for the AAB Luba accession, all M. balbisiana diploid and M. balbisiana haploid accessions are distributed in one well-supported group (strong bootstrap support, 96). The Luba accession lacked the 2-GF fragment, and had different mutations resulting in no PCR amplification with three sets of primers. All other accessions presented a similar PKW eBSGFV allelic structure. These were structured into six close allelic-based groups (indicated by short branches). The groups are named according to their allelic information (Fig. 5). Both alleles are present in most BB diploid accessions with the exception of four that lack eBSGFV-7, and another three lacking eBSGFV-9. Among Indian accessions, the two eBSGFV alleles occur in ABB hybrids, whereas the eBSGFV-9 allele alone is reported in AAB hybrids. Two AB hybrids (Ekona and Safet Velchi) harboured both alleles (eBSGFV-7 and -9) in their genomes, despite reportedly having only one B genome. Consequently, they grouped with the main BB diploid accessions harbouring both PKW eBSGFV alleles. Curiously, among Asian accessions, both eBSGFV alleles occurred in AAB hybrids whereas ABB hybrids never harboured the eBSGFV-9 allele alone.
All AA and outgroup accessions grouped with eight AAB hybrids and one AB hybrid that did not present any eBSGFV. Except for Nangka and Slendang accessions, the other AAB and AB hybrids originated from India.
The dendrogram reconstructed using data obtained for PKW eBSIMV proposes three groups (Fig. 6). A large group aggregated the entire AA and outgroup accessions including 15 AAB, one ABB and two AB as well as one BB diploid (Honduras accession) that lacked any PKW eBSIMV. A second set grouped nine accessions showing flanking fragment conservation and generally lacking most of the others. As for BSOLV, these insertions may be considered as lost; four BB diploids have lost a large part of eBSIMV. The third group may be separated into two sub-groups: a larger one of 27 accessions harbouring the PKW eBSIMV allele including all ABB hybrids, and a smaller one containing eight accessions only. Regarding the other banana hybrids, we observed either the complete absence of eBSIMV in the plantains group or the almost complete absence for the Indian group. The presence of small parts of eBSIMV focused at the junction zone for certain accessions from the Indian group may correspond to pseudogenization within the Musa genome, which leads to a complete loss of eBSIMV. We also observed a pseudogenization for accessions in the Mai’a/ Popoulou group.
An eBSV dendrogram built on the weighted sum of dissimilarities was inferred from all eBSV data (Fig. 7). It appears structured by the presence/absence of the different eBSVs (Fig. 7), showing four main groups according to banana genotype and geographical origin. Interestingly, even if similar SSR marker-based sub-groups are maintained (Fig. 3), the overall structure shows new lineages between accessions such as those observed between BB and both ABB and AAB genotypes. The tree seems structured by the banana genotypes because all AAB hybrids were present with few BB diploids in one part, and the other part grouped all the other BB and ABB hybrids. This resulted in all eBSVs being slightly modified for the BB/ABB group (represented by short branches) with strong modification or total absence of PKW-related eBSVs (represented by longer branches) being observed for AAB hybrids. The two geographical hybridization areas pointed out by Perrier et al. (2009) do not form two separate groups as observed in Musa phylogeny (Fig. 3) but appear distributed among the tree. This illustrates the diversity as well as the similarity of eBSV structures for these two separate interspecies hybridization areas. Interestingly, apart from the AAB Silk subgroups, BB accessions are distributed among the main sub-groups of the tree, resulting from possible lineages between BB parents and hybrids.
The aim of this study was to characterize the polymorphism of integration of three BSV species among Musa balbisiana genomes using seedy diploids as well as natural interspecific hybrids to investigate the evolutionary history of BSVs. We also wanted to test whether eBSV can be used as a phylogenetic marker to help resolve B genome phylogeny.
Our results indicate the systematic presence of at least one of the three BSV species in all available B genomes among diverse Musa accessions, BB diploids and natural interspecific hybrids, whatever their native area. The only exception is the Pisang Nangka accession (AAB), which is thought to be an M. acuminata triploid hybrid (AAA). We observed conservation of the locus of eBSV integration even when the alleles are strongly degenerated, indicating that the alleles resulted from modifications following BSV integration/fixation, and we observed no integration in either M. acuminata or the outgroup genotypes. These results confirm that BSOLV, BSGFV and BSIMV integrations within the B genome occurred after the speciation of M. acuminata/M. balbisiana and before M. balbisiana diversification as proposed by Gayral et al. (2010). Global analysis also revealed that eBSOLV is the most prevalent eBSV in the sample followed by eBSGFV and then eBSIMV. This observation, related to differences in the number of alleles recorded for each BSV species among BB diploids as well as banana hybrids, attests to various eBSV evolutions encompassing probable sequential BSV integrations where BSOLV integrated first, followed by BSGFV and BSIMV, as proposed by Gayral et al. (2010) in BB diploids and Chabannes et al. (2013) in PKW (BB). Curiously, studying PKW-related eBSVs in banana hybrids did not give access to wider M. balbisiana diversity as expected. Indeed, in most cases, we recorded the same allelic diversity as observed in the BB seedy diploids. This is particularly well illustrated for eBSOLV (Fig. 4), where allelic diversity is assigned to BB diploids, and it holds even for BSIMV, where large changes including eBSV loss or absence already exist.
eBSOLV alleles (eBSOLV-1/-2) are distributed in all Musa phylogenetic groups (Fig. 4). This may indicate an early appearance of these alleles and of their conservation. Indeed, some correspond to a mix between alleles eBSOLV-1 and -2, making it difficult to discern whether these novel alleles are infective or not. Others show changes on the same part of eBSV. Fragments 1-OL, 2-OL, 3-OL and 6-OL are conserved in almost all plants (Fig. 2A), while fragments 4-OL, 5-OL and 8-OL are modified. Interestingly, this part of eBSV allows eBSOLV-1 and -2 alleles to be discriminated and was also proposed as the zone releasing functional viral genomes by homologous recombination for the eBSOLV-1 allele (Chabannes and Iskra-Caruana, 2013). In addition, the eBSOLV haplotype is located within a transposable-element-rich genomic area (Chabannes et al., 2013) known to have a high tendency to recombine (Mézard, 2006; D’Hont et al., 2012). Two BSOLV strains [discovered recently by Baranwal et al. (2013) based on their shorter genomes (6950bp) compared with the reference (7389bp)] in cultivar ‘Safet-Velchi’ from India were thought to originate from eBSOLV not characterized in this Indian banana genotype. A ‘Safet-Velchi’ cultivar exists in our sample as an AB Indian reference only and shows strong modifications of the eBSOLV allele in the part of the genome absent in the two BSOLV Indian strains. Unfortunately, no infection occurred under our greenhouse conditions in the banana plant, impeding any verification of Baranwal’s hypothesis. For these reasons, we propose this zone in the BSV structure as a possible hot spot for recombination during Musa evolution, explaining the different alleles recorded. Finally, fossil traces of eBSOLV indicated an ongoing process of pseudogenization for some AAB and ABB hybrids.
eBSGFV distribution analysis shows that PKW alleles (eBSGFV-7/-9) are highly conserved among diverse B genomes, particularly the BB and ABB accessions, compared with those of eBSOLV and eBSIMV. As for eBSOLV, most accessions harbour both alleles in their genome, as established by Gayral et al. (2010). Most of the differences between the eBSGFV alleles are pinpoint mutations. The site of integration in a gene-rich region of chromosome 1 can explain its relative conservation in different genotypes (Chabannes et al., 2013). Besides, Gayral et al. (2010) observed lower selection pressure for this integration, suggesting this area is less permissive for rearrangement.
eBSIMV analysis appeared simple because PKW harbours only one allele composed of a partial tandem repeat of a full-length viral genome showing little rearrangement compared with eBSGFV and eBSOLV. However, we observed several alleles ranging from a similar or slightly modified PKW allele to extensively deleted with PCR markers corresponding only to flanking regions. One explanation for these various allele traces may be the presence of inverted repeat sequences near the Musa-junction zones forming hairpins facilitating eBSIMV elimination. Among M. balbisiana species, eBSIMV is less rearranged than other eBSVs, and the only one with this high degree of fragment absence. The PKW allele was also widely present in BB diploids and ABB hybrids, but we observed for the first time five BB accessions with no or only a few traces of eBSIMV. In addition, we noted the total absence of eBSIMV in 20 hybrids (AB, AAB, ABB). All these data reinforce a recent integration of BSIMV into B genomes; Chabannes et al. (2013) assumed sequential BSV integrations in PKW where eBSIMV would be most recently integrated. Accordingly, we cannot exclude a partial fixation among the BB diploid population in a period close to the M. balbisiana diversification as an alternative explanation.
Surprisingly, we identified both eBSGFV-7 and -9 alleles within two hybrids having only one B genome (AB). Striking observations were also made for Asian AAB hybrids, where Kunaimp and Pisang Nangka are the only accessions lacking both eBSOLV and eBSGFV. The hypothesis of an incomplete M. balbisiana genome was proposed because cases of unbalanced chromosome representation of the two genomes have already been observed in triploid interspecific hybrids (Jeridi et al., 2012; G. B. Noumbissié, pers. commun.). Although no data exist for diploids, we assume that the AB genotypes have a supernumerary presence of the B chromosome and the Asian AAB hybrids lack chromosome 1 containing both eBSVs.
All eBSV alleles have resulted from distinct evolution mechanisms after fixation into seedy banana diploid populations. We assumed that they represent many snapshots of intra- and inter-eBSV diversity illustrating the BSV/M. balbisiana interactions during Musa evolution, all focusing now on the same final fate of pseudogenization, as evidence that the BSV story is intimately connected to that of Musa (Gayral et al., 2010; D’Hont et al., 2012).
Historically, divergence between M. acuminata and M. balbisiana occurred around 27·9 Mya (Christelová et al., 2011) in continental South Asia [also reported as the native area of seedy BB diploids (De Langhe et al., 2009)]. This represents a very long period of sexual diversification, natural selection, drift and diffusion of BB diploids that has led to the allelic eBSV diversity observed today. This is supported by our data (Figs 4–6) and reinforces our hypothesis that eBSV allelic diversification occurred in the face of epidemic environmental changes during BB diploid evolution (Chabannes et al., 2013; Iskra-Caruana et al., 2014b). The presence of the three eBSVs in all BB diploids except Honduras addresses the question of selective advantage at the beginning of eBSV fixation. A hypothesis of innate defence to resist environmental BSV pressure could be formulated, and the resistance of BB plants to BSV, whether episomal or infective eBSV in origin (Lheureux, 2002), reinforces this hypothesis.
The appearance of interspecific hybrids selected by domestication processes/human activities, dated to not earlier than 7000 years ago (Perrier et al., 2009), allowed the selection of hybrids having only one B genome. Changes in ploidy of the B genome probably impacted the virus/banana equilibrium, releasing ‘endogenous’ BSV infections from infective eBSV alleles. We observed that all ABB hybrids grouped closely with BB diploids because they harbour the same eBSV allele composition. AAB hybrids harbour mainly either deleted or no eBSV alleles except for the plantain groups, which still have infective eBSV for both BSGFV and BSOLV. Natural selection during hybrid creation promoted virus-free AAB hybrids harbouring deleted or no eBSV alleles rather than hybrids hosting infective eBSV. Interestingly, our data on eBSV alleles of the AAB hybrids coming from the two main centres of diversification located in India and South-East Asia seem to indicate a convergent evolution process of pseudogenization, whereas the BB diploids present in each native area probably differ.
Most AAB hybrids from the Indian group showed either non-infective allelic rearrangements or complete absence of eBSVs whereas ABB hybrids still have all allelic eBSV diversity including the infective alleles. We observed one exception for Lal velchi BB diploid suspected to be the parent of AAB sub-group Pome or AB Kunnan accession from the Indian Group (Hippolyte et al., 2012). All these observations favour a systematic selective post-hybridization process aiming to select AAB hybrids having no infective alleles.
The native hybridization area in South-East Asia, like the native Indian hybridization area, showed more rearranged eBSV alleles in AAB hybrids than in ABB hybrids or BB diploids except for the subgroup African plantain, which harbours both eBSOLV-1 and eBSGFV-7 infective alleles. Nevertheless, in the Mai’a Maoli/Popoulou AAB sub-group of Oceania genetically close to African plantains, the eBSGFV-9 allele is dominant. E. De Langhe (pers. commun.) claimed that different routes of diversification exist, with the plantains of Africa separated from Mai’a Maoli/Popoulou AAB earlier than previously proposed (Perrier et al., 2011). We observed through our eBSV analysis that the B genome differs between those two sub-groups. At this time, having an infective eBSV appeared not to affect the AAB African plantain fitness, raising interesting issues regarding epidemiological context as well as host/BSV interactions on this continent.
Comparison of eBSV allele diversity among AAB hybrids from different native areas supports two distinct evolutionary contexts for AAB hybrids, whilst a similar BB diploid diversity should exist in each area. However, selection of AAB hybrids having no eBSV except eBSOLV appears to be more effective in the Indian than in the Asian area. The exhaustive presence of BSOLV in the B genomes could be explained by its older integration and fixation time relative to other eBSVs. Differences in epidemic context, where BSV strongly affected the Asian area, appear as a possible explanation for the selection of AAB hybrids having infective alleles useful for stimulating innate plant defences against episomal BSV.
This study provides evidence that eBSVs, which are distributed specifically in B genomes, can help resolve B genome phylogeny. The tree mixing data obtained for the three eBSVs (Fig. 7) gives a picture of Musa relationships based on viral integrations within the B genome and their evolution with Musa. The structure of this tree is broadly coherent with the previous phylogenetic tree based on Musa SSR markers, but two differences exist. The first concerns the global distribution of banana plant samples over the tree. Due to the greater number of SSR markers on the A than the B genome, the general structure of the SSR-based tree (Fig. 3) is coordinated by the quantity of the A genome in each plant, depending on the genotype origin of banana samples, opposing M. acuminata to M. balbisiana diploids (Fig. 3). The eBSV-based tree was inferred using only B genome markers (Fig. 7) and shows an organization ranging from no eBSV in M. acuminata and outgroups to full eBSVs represented by the M. balbisiana Msat-4 group, which has exactly the same integrations as PKW. The groups gathered accessions with similar eBSVs resulting from natural selection of independent convergent events occurring in different native geographical areas. Thus, AAB African plantains of South-East Asian origin, which differ genetically from Indian AAB Pome, are close because there is no eBSIMV in their genomes.
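How marker scores can feed a tree such as the one in Fig. 7 may be illustrated with a minimal sketch. It assumes that each accession is coded as a presence/absence (1/0) profile over eBSV markers and that pairwise dissimilarities are computed with the Dice index; both the coefficient and the toy profiles below are assumptions made for illustration, not the coding scheme or the data of this study. The resulting matrix is the kind of input that a neighbour-joining analysis takes.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) scores for six eBSV markers.
# Accession labels and values are invented for illustration only.
profiles = {
    "acc_full":    [1, 1, 1, 1, 1, 1],
    "acc_partial": [1, 1, 1, 0, 0, 1],
    "acc_none":    [0, 0, 0, 0, 0, 0],
}


def dice_dissimilarity(a, b):
    """Dice dissimilarity between two equal-length 0/1 marker profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of markers")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    if total == 0:
        # Two profiles without any marker are treated as identical.
        return 0.0
    return 1.0 - (2.0 * shared) / total


def dissimilarity_matrix(profiles):
    """Symmetric matrix of pairwise dissimilarities, as nested dicts."""
    names = list(profiles)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = dice_dissimilarity(profiles[n], profiles[m])
        matrix[n][m] = matrix[m][n] = d
    return matrix


matrix = dissimilarity_matrix(profiles)
```

In this toy example an accession without any eBSV marker is maximally distant from every accession carrying markers, which mirrors the gradient from no eBSV to full eBSVs described above.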
The second difference is of particular interest. Whilst in the SSR-based tree (Fig. 3) BB diploids form a monophyletic group, on the eBSV-based tree (Fig. 7) a large number of BB diploids group with interspecific ABB and AAB hybrids. These BBs probably shared the same ancestor BB plants at the origin of these triploid hybrids because BSV integration occurred before M. balbisiana diversification and no specific eBSV allele diversity exists in hybrids. For example, ABB Pelipita has similar integrations to PKW and other BB accessions of the Msat-4 group. Saba, Burro Cemsa and Daru accessions forming the ABB Bluggoe group are also linked to this subset. BB 63-80 is linked closely to AAB plantains; BB Honduras to Indian AAB Pome; BB Butuhan to ABB Auko, Bengani and Pisang Kepok Bung; and BB 211 to two Indian ABB Awak. The hybrid component is sometimes restricted to just one accession, such as ABB Dole with BB 342, BB LBA and BB MON. Moreover, such grouping leads to the merging of several hybrid groups sharing the same BB ancestor. For example, AAB Kapas, which belongs to the group Laknao from the Philippines, is very close to AAB Plantains, which are thought to originate from the same region. The AAB group Pome includes AB Ekona and AB Safet Velchi, but also ABB Blue Java, which was collected in Fiji and characterized as belonging to the Ney Mannan ABB group. All these accessions shared the same origin in India, supporting the hypothesis of an identical M. balbisiana ancestor. To the best of our knowledge, this is the first time that M. balbisiana ancestors of the main triploid cultivars can be assumed.
Some BB groups remain isolated, however; they were not recruited in hybridizations with M. acuminata genomes. One possibility is that the hybrids generated are absent from our sample. Conversely, a single hybrid group, the AAB Silk from India, is not linked to any particular BB diploid. A first hypothesis is that the ancestral BB is extinct or absent from the sample. Another possibility might be a specific modification within the triploid genome. The intense agricultural selection in this area has already been mentioned (Perrier et al., 2009) and could explain this specificity. Several accessions classified as indeterminate for SSR markers (Pisang Slendang, Luba, Kunaimp, Tigua) were also confirmed; indeed, these plants, which exhibit very specific eBSV structures, are positioned on specific intermediate branches. The integrity of their M. balbisiana genome has already been questioned.
To conclude, the existing Musa phylogeny focused mainly on M. acuminata data does not provide any evidence of relationships between AA or BB diploids and their interspecific hybrids. In this paper, we tested whether eBSVs could be relevant genetic markers for M. balbisiana to infer the current Musa phylogeny. eBSV markers appear to be efficient tools with which to elucidate the phylogeny of M. balbisiana based on two clues shared between the three BSV species tested. First, we observed a systematic conservation of the integration locus. Secondly, the evolving process of eBSV appears to be due to rearrangement rather than nucleotide sequence divergence (Gayral et al., 2010; Chabannes et al., 2013). Why both infective and degraded eBSV alleles have been conserved after such a long period of coevolution in the BB diploids remains unclear. Based on the different results now at our disposal, two explanations emerge for this remarkable BSV/banana interaction, according to either the virus or the plant point of view. The dormant BSV within the B genomes as a conserved eBSV allele could be a viral reservoir that could preserve viral populations from extinction. Indeed, the main BSV species observed worldwide in outbreaks correspond to eBSVs (Gayral et al., 2010), as no BSV epidemics have been reported that would explain a viral evolutionary origin for BSOLV, BSGFV and BSIMV. Conversely, M. balbisiana diploids and interspecific hybrids appeared resistant and tolerant, respectively, to endogenous infection from eBSV. Staginnus and Richert-Pöggeler (2006) proposed a sequence homology-based resistance through the silencing mechanism requiring a conserved viral sequence. Thus, M. balbisiana diploids with early BSV integrations could be the best adapted to resist high viral environmental pressure. Their selection within diploid populations could result in the decrease and extinction of the viral epidemic context.
Because elimination of eBSV probably involves large evolutionary cost, M. balbisiana diploids maintained eBSV to control endogenous infection by silencing and genome duplication. Unfortunately, interspecific hybrids having one B genome have no such strong regulation, resulting in spontaneous infection. However, such infections rarely result in epidemics as virus rates are probably maintained under a vector-transmitted level also by gene silencing mechanisms given that we observed that BSV multiplication in M. acuminata hybrids was under silencing regulation (Rajeswaran et al., 2014).
These specific regulation mechanisms are supported by high sequence homology between eBSV and episomal BSV (Gayral and Iskra-Caruana, 2009), allowing M. balbisiana diploids to be healthy carriers of BSD via eBSV, but also new potential transmitters of the disease, particularly to hybrids.
The theoretical implications of this study for further elucidation of the historical and geographical process of Musa domestication, as well as practical implications for genetic improvement programmes, are numerous.
Supplementary Data are available online at www.aob.oxfordjournals.org and consist of the following. Table S1: Microsatellite loci used in this study. Table S2: PCR markers used to genotype PKW eBSOLV–eBSGFV–eBSIMV. Table S3: Data for PKW-related eBSGFV genotypes. Table S4: Data for PKW-related eBSOLV genotypes. Table S5: Data for PKW-related eBSIMV genotypes. Table S6: Numbers of the restriction enzyme fragments in PKW eBSGFV–eBSOLV–eBSIMV. Fig. S1: Overview of PCR and dCAPS loci within PKW eBSGFV–eBSOLV–eBSIMV.
We thank Serge Galzi for technical help, greenhouse banana plant cultivation and design of Fig. 1; Pierre-Yves Techeney for preparing and sending the banana samples; and Christophe Jenny and Marie Umber for helpful discussions during this work. P.-O.D. was supported by a CIRAD PhD grant.