A multilocus sequence typing (MLST) analysis was used to examine the genetic structure and diversity within the two large extrachromosomal replicons in Medicago-nodulating rhizobia (Sinorhizobium meliloti and Sinorhizobium medicae). The allelic diversity within these replicons was high compared to the reported diversity within the corresponding chromosomes of the same strains (P. van Berkum et al., J. Bacteriol. 188:5570-5577, 2006). Also, there was strong localized linkage disequilibrium (LD) between certain pSymA loci: e.g., nodC and nifD. Although both of these observations could be explained by positive (or diversifying) selection by plant hosts, results of tests for positive selection did not provide consistent support for this hypothesis. The strong LD observed between the nodC and nifD genes could also be explained by their close proximity on the pSymA replicon. Evidence was obtained that some nodC alleles had a history of intragenic recombination, while other alleles of this locus had a history of intergenic recombination. Both types of recombination were associated with a decline in symbiotic competence with Medicago sativa as the host plant. The combined observations of LD between the nodC and nifD genes and intragenic recombination within one of these loci indicate that the symbiotic gene region on the pSymA plasmid has evolved as a clonal segment, which has been laterally transferred within the natural populations.
Plants of the genus Medicago are legumes that often benefit from a mutualistic symbiosis with rhizobia. The most agriculturally significant species of rhizobia that nodulate these plants are Sinorhizobium meliloti (9) and Sinorhizobium medicae (22). Previously reported population genetic analyses of these bacteria have focused on the study of how allelic variants at multiple loci are distributed within and among natural populations (2, 3, 10, 26, 31, 32). This was also the focus of the present study, but it was extended by examining more loci in many more strains of both species of Sinorhizobium coupled with an analysis having a range of symbiotic genotypes. One goal was to determine if there were any obvious correlations between the megaplasmid genotypes observed and their symbiotic competence. A second goal was to determine if selection by their host plants may have influenced the evolution of their symbiotic relationships.
The genes for symbiosis reside on the extrachromosomal replicons pSymA (1,354,226 nucleotides [nt]) and pSMED02 (1,245,408 nt) in the genomes of S. meliloti Rm1021 and S. medicae WSM419, respectively (GenBank accession no. AE006469 and CP000740, respectively). Besides these two plasmids, these two strains each harbor one other large extrachromosomal replicon, pSymB (1,683,333 nt) and pSMED01 (1,570,951 nt), respectively (GenBank accession no. AL591985 and CP000739, respectively).
Multilocus sequence typing (MLST) (16) is a form of genomic indexing that is commonly used to study the population genetic structure and phylogenetic relatedness within diverse groups of bacteria. In this method, nucleotide sequences of a fixed set of common loci are obtained from a collection of strains, and polymorphic sites among these sequences are used to derive an allelic profile or sequence type (ST) for each genome. Comparisons of the resulting data can be used to infer phylogenetic relationships among the organisms in the sample population, and they also can be used to infer how evolutionary processes, such as recombination and selection, have shaped the genetic structure of the population. For example, levels of intergenic recombination among chromosomal genes in natural populations of Neisseria meningitidis reportedly are relatively high, while corresponding levels within populations of Staphylococcus aureus were low (28). Depending on the specific pairs of loci examined, the levels of linkage disequilibrium (LD) (a lack of intergenic recombination) among several chromosomally carried core genes of S. meliloti were reported to be generally moderate to high (26).
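The derivation of allelic profiles and sequence types described above can be illustrated with a minimal Python sketch. The function names (`number_alleles`, `sequence_types`) and the toy sequences are hypothetical and are not taken from the database used in this study; the sketch only assumes that each distinct sequence at a locus receives an allele number and that each distinct combination of allele numbers receives an ST number.

```python
from typing import Dict, Tuple


def number_alleles(seqs_by_strain: Dict[str, str]) -> Dict[str, int]:
    """Give each distinct sequence at one locus an allele number (1, 2, ...)."""
    allele_ids: Dict[str, int] = {}
    alleles: Dict[str, int] = {}
    for strain in sorted(seqs_by_strain):
        seq = seqs_by_strain[strain].upper()
        # A sequence seen for the first time gets the next free allele number.
        allele_ids.setdefault(seq, len(allele_ids) + 1)
        alleles[strain] = allele_ids[seq]
    return alleles


def sequence_types(
    loci: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, Tuple[int, ...]], Dict[str, int]]:
    """Return (allelic profile per strain, ST number per strain).

    `loci` maps locus name -> {strain name -> nucleotide sequence}.
    Only strains typed at every locus are given a profile.
    """
    locus_names = sorted(loci)
    per_locus = {name: number_alleles(loci[name]) for name in locus_names}
    strains = sorted(set.intersection(*(set(loci[n]) for n in locus_names)))
    profiles = {s: tuple(per_locus[n][s] for n in locus_names) for s in strains}
    st_ids: Dict[Tuple[int, ...], int] = {}
    sts: Dict[str, int] = {}
    for s in strains:
        # Each distinct allelic profile defines one sequence type.
        st_ids.setdefault(profiles[s], len(st_ids) + 1)
        sts[s] = st_ids[profiles[s]]
    return profiles, sts


# Toy data (hypothetical sequences, two loci, three strains).
toy = {
    "nifD": {"s1": "GG", "s2": "GG", "s3": "GG"},
    "nodC": {"s1": "ACGT", "s2": "ACGA", "s3": "ACGT"},
}
profiles, sts = sequence_types(toy)
```

In this toy case strains s1 and s3 share the same profile and therefore the same ST, while s2 differs at one locus and receives a separate ST.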
The MLST approach has been used to confirm that the chromosomes of S. meliloti and S. medicae are sexually isolated (2, 3, 31) and to provide evidence that horizontal gene transfer (HGT) does occur between the symbiotic megaplasmids of these species (3, 32). It has also been used to demonstrate that levels of intergenic recombination, as indicated by linkage disequilibrium, differ between the three replicons of S. meliloti (26). Levels of intergenic recombination within the pSymB replicons of these strains are generally high, unlike the chromosomes and pSymA replicons within the same strains (26). Bailly et al. (3) hypothesized that the region of the pSymA plasmid that contains the nodulation (nod) genes is frequently transferred in natural populations. They also suggested that selective pressures from the host plant may have influenced both nod gene diversity and patterns of polymorphism across the entire nod gene region.
In the present study, multilocus allelic variation of the two megaplasmids was examined among 231 Medicago-nodulating rhizobia that originated primarily from southwest Asia (10). Previously, 91 different chromosomal sequence types (STs) were identified among the same strains from sequence variation in 10 loci (31). This collection of strains had earlier been divided into two closely related groups based on results of multilocus enzyme electrophoresis (10), and this result was subsequently cited in support of separating the Medicago-nodulating rhizobia into the two species S. meliloti and S. medicae (22).
The objectives of this study were (i) to use MLST to examine the genetic relationships within and among the large extrachromosomal replicons in S. meliloti and S. medicae, (ii) to estimate levels of intergenic and intragenic recombination in these replicons, (iii) to evaluate the nitrogen-fixing competence of representative symbiotic genotypes with Medicago sativa, and (iv) to determine whether positive (or diversifying) selection may have influenced the genetic structure of the megaplasmids.
The Medicago-nodulating rhizobia included in this analysis were those previously described by Eardly et al. (10) and van Berkum et al. (31). DNA samples from a previous study (31) were used as templates in PCRs. Candidate loci for MLST analysis were chosen by referring to the complete genome sequence of strain Rm1021 (12). Based upon the success of both the PCR amplification and sequence analysis with the largest number of strains, six genes on each of the replicons pSymB (Table 1) and pSymA (Table 2) were selected for MLST analysis. Of the original 231 strains, 229 were chosen for analysis of pSymB loci (Table 1). Strain 12 and a duplicate strain of 128A10 were omitted. Of the original 231 strains, 225 were chosen for analysis of pSymA loci (Table 2). The strains 128A14, 15A5, 17A8, M158, and M75 and a duplicate strain of 128A10 were omitted. The presence of each locus on pSMED02 or pSMED01 within the genome of S. medicae strain WSM419 was verified (GenBank accession no. CP000738.1). The entire open reading frames of both rhizobial species were aligned in GeneDoc (version 2.6.001; K. B. Nicholas and H. B. Nicholas [http://www.nrbsc.org/gfx/genedoc/index.html]). Primers (Tables 1 and 2) were selected that would amplify a portion of each gene between 200 and 500 bp in size by using the primer design software package Oligo Primer Analysis Software, version 6.65 (Molecular Biology Insights, Inc., Cascade, CO). Primer synthesis and PCR protocols were described elsewhere (31).
The purified PCR products were used in two sequence reactions with primers nested to the PCR primers. An Applied Biosystems 3130 genetic analyzer in combination with a dye deoxy terminator cycle sequencing kit (Applied Biosystems, Foster City, CA) was used for sequencing the purified PCR products as described previously (30). Each of the partial nodC and nifD sequences (349 and 457 bp, respectively) was used to select one single example of each allele to obtain the corresponding full-length sequences (1,292 and 1,503 bp, respectively).
A Microsoft Access database was created to compile the data. The strategy for design and manipulation of the database was used as described for the chromosomal database of the same strains (31).
Distance matrices among the multilocus ST profiles for each of the replicons were estimated using the START software package (14). These matrices were then used to derive neighbor-joining trees (23) with the software SplitsTree version 4.10 (13). Groups of multilocus STs that differed at one or two loci were identified by the eBURST program (11), and these are referred to as single- or double-locus variants (SLVs or DLVs, respectively). The clonal complexes that were identified by this method were coded alphabetically, starting with the largest complex for each replicon, which was assigned the letter “A.” Population snapshots were drawn only with these groups; the singletons (STs that differed at three or more alleles) were not shown.
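The distinction between SLVs, DLVs, and singletons can be sketched in a few lines of Python. This is not the eBURST program itself, only an illustration of the counting rule stated above, under the assumption that a singleton is an ST with no partner differing at fewer than three loci; the function names and the toy profiles are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def locus_differences(p: Profile, q: Profile) -> int:
    """Number of loci at which two allelic profiles carry different alleles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(p, q))


def classify_pairs(profiles: Dict[int, Profile]):
    """Return (SLV pairs, DLV pairs, singleton STs) for a set of ST profiles."""
    slv: List[Tuple[int, int]] = []
    dlv: List[Tuple[int, int]] = []
    linked = set()
    for (st_a, prof_a), (st_b, prof_b) in combinations(sorted(profiles.items()), 2):
        d = locus_differences(prof_a, prof_b)
        if d == 1:
            slv.append((st_a, st_b))
        elif d == 2:
            dlv.append((st_a, st_b))
        if d in (1, 2):
            linked.update((st_a, st_b))
    # STs that differ from every other ST at three or more loci.
    singletons = sorted(set(profiles) - linked)
    return slv, dlv, singletons


# Toy six-locus profiles (hypothetical allele numbers).
toy_profiles = {
    1: (1, 1, 1, 1, 1, 1),
    2: (1, 1, 1, 1, 1, 2),
    3: (1, 1, 1, 1, 2, 3),
    4: (9, 9, 9, 9, 9, 9),
}
slv_pairs, dlv_pairs, singleton_sts = classify_pairs(toy_profiles)
```

Here ST 2 is an SLV of ST 1, ST 3 is a DLV of both ST 1 and ST 2, and ST 4 is a singleton.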
Linkage disequilibria among the loci within each extrachromosomal replicon were examined to estimate the frequency of intergenic recombination among the STs. This was done using phylogenetically informative polymorphic nucleotide sites as described by Sun et al. (26) with the program MULTILOCUS version 1.2 (http://www.agapow.net/software/multilocus/1.3b/view) (1). This was accomplished by using a standardized index of association called “Rd” to test the null hypothesis that polymorphic sites among the different loci were randomly associating. This program also provided an index of the proportion of phylogenetically compatible polymorphic nucleotide sites (PrC) as an alternative test for random recombination within each replicon.
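The idea behind a compatibility index such as PrC can be illustrated with a short Python sketch. It assumes the standard four-gamete criterion for biallelic sites (two sites are compatible when at most three of the four possible two-site combinations are observed); whether MULTILOCUS computes PrC in exactly this way is not stated here, and the significance test by randomization is not reproduced. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def sites_compatible(site_a: str, site_b: str) -> bool:
    """Four-gamete test for two biallelic sites scored across the same strains.

    Each argument is a string with one nucleotide per strain, in the same
    strain order. Observing all four combinations implies recombination or
    homoplasy, so the pair is called incompatible.
    """
    if len(site_a) != len(site_b):
        raise ValueError("sites must be scored for the same strains")
    return len(set(zip(site_a, site_b))) <= 3


def proportion_compatible(sites: Sequence[str]) -> float:
    """Fraction of site pairs that pass the four-gamete test."""
    pairs = list(combinations(sites, 2))
    if not pairs:
        raise ValueError("at least two sites are required")
    return sum(sites_compatible(a, b) for a, b in pairs) / len(pairs)


# Toy data: three polymorphic sites scored in four strains (hypothetical).
toy_sites = ["AAGG", "CCTT", "ACAC"]
prc = proportion_compatible(toy_sites)
```

In the toy data only the first two sites are compatible with each other, so one of the three site pairs passes the test.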
Full-length sequences of both nodC and nifD alleles were obtained containing the full range of nodC and nifD allelic diversity observed using partial sequences of both loci. The nodC gene was chosen for this more extensive analysis because certain allelic variants (alleles 32, 35, 36, 37, and 38) are associated with the ability to nodulate highly selective Medicago species, e.g., M. laciniata (5, 32). The nifD genes in these strains were included in the full-length analysis, because they are also essential for the symbiosis, but unlike nodC, they do not encode a host range determinant. In addition, these two genetic regions are conserved across genomes, and this facilitated the selection of primers for PCR and sequence analyses in adjacent loci.
The full-length nodC and nifD sequences were analyzed for evidence of intragenic recombination. The sequences were imported into GeneDoc, and the alignments were exported as FASTA files. These files were then imported into SplitsTree, version 4.10 (13), to display evolutionary relationships of the genes using the network algorithm NeighborNet (7). They were also used to test for recombination using the pairwise homoplasy index (Φw statistic) of Bruen et al. (6). Additional screening for intragenic recombination within the two loci was done using the software package RDP2 (17).
In this study, the Nei-Gojobori method (20), as implemented in the Z test in the program MEGA 4.0 (27), was used to test all pairwise allelic combinations within each locus for positive selection. The program estimates the number of synonymous mutations per synonymous site (dS), the number of nonsynonymous mutations per nonsynonymous site (dN), and also the variances [var(dS) and var(dN)] of the respective estimates. This information is then used to test the null hypothesis that the gene's product provides no contribution to fitness, in other words that the gene is evolving neutrally (H0: dN = dS). Alternatively, if positive selection has operated since the divergence of the alleles, it is expected that dN > dS (HA: dN > dS). These hypotheses were evaluated by a one-tailed codon-based Z test (for large samples), where Z = (dN − dS)/√[var(dS) + var(dN)]. In the analyses, a bootstrap method (1,000 replicates) was used to estimate variances. The extent of nucleotide and amino acid sequence divergence was estimated by means of the uncorrected differences (p distance) because this distance is known to give better results than more complicated distance estimates when the number of sequences is large and the number of positions used is relatively small, because of its smaller variance (19). For allelic combinations where the Z test provided a significant value, Fisher's exact test was also used. This method, which is also implemented in MEGA 4.0 (27), is more conservative and provides a more reliable test of the alternate hypothesis (HA: dN > dS) when the total numbers of substitutions between sequences are small (19).
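The one-tailed Z test described above can be sketched as follows. This is an illustration of the formula only, not the MEGA 4.0 implementation: it assumes that dN, dS, and their bootstrap variances have already been estimated, and it uses the standard normal distribution for the upper-tail probability. The function name and the example values are hypothetical.

```python
import math
from typing import Tuple


def z_test_positive_selection(
    dn: float, ds: float, var_dn: float, var_ds: float
) -> Tuple[float, float]:
    """Return (Z, one-tailed P) for H0: dN = dS against HA: dN > dS.

    Z = (dN - dS) / sqrt(var(dS) + var(dN)); P is the upper-tail
    probability of Z under the standard normal distribution.
    """
    total_var = var_dn + var_ds
    if total_var <= 0:
        raise ValueError("the summed variance must be positive")
    z = (dn - ds) / math.sqrt(total_var)
    # Upper tail of the standard normal: P(X >= z) = 0.5 * erfc(z / sqrt(2)).
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_one_tailed


# Hypothetical estimates for one pair of alleles.
z_value, p_value = z_test_positive_selection(
    dn=0.05, ds=0.02, var_dn=0.0001, var_ds=0.0001
)
```

With these illustrative inputs Z is a little above 2 and the one-tailed P value falls below 0.05; when dN equals dS, Z is 0 and P is 0.5.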
Seeds of Medicago sativa cultivar Sutter were surface sterilized with concentrated H2SO4 for 3 min and were washed five times with sterile distilled water. The treated seeds were germinated on sterile water agar, and seedlings were sown in sterile Leonard jars (15) filled with a 50:50 (wt/wt) mixture of sand and vermiculite. Each jar was inoculated with 2 ml of a late-log-phase broth culture grown in modified arabinose-gluconate medium (31). The cultures tested for symbiosis were the type strains for the species S. meliloti (USDA 1002) and S. medicae (A321) and a single strain each harboring one of the remaining nodC alleles. Each treatment was prepared in duplicate, with 15 seedlings in each jar, and 18 jars without inoculated bacteria served as controls. The plants were grown in a greenhouse without supplemental lighting for 32 days. The plants were visually scored for color of the plant tops and nodulation. The plant tops were dried at 60°C for 2 days to determine dry matter (8).
Sequences of the alleles for each locus have been deposited in GenBank under the following accession numbers: adeC3, GQ507024 through GQ507047; catC, GQ507048 through GQ507066; dak, GQ507067 through GQ507084; dgoA, GQ507085 through GQ507132; fixK, GQ507133 through GQ507161; gabT, GQ507162 through GQ507185; idhA, GQ507186 through GQ507212; napB, GQ507213 through GQ507236; nifD, GQ507237 through GQ507267; pdh, GQ507268 through GQ507301; locus tag SMa0198, GQ507302 through GQ507334; and nodC, GQ507335 through GQ507367 (see Table S1 in the supplemental material). Sequences of additional nodC alleles (32) used in the analyses had the following GenBank accession numbers: EF428921 (allele 2 of A321); EF428922 (allele 3 of USDA 1002); for rhizobia isolated from Medicago laciniata, EF428923 (allele 35), EF428924 (allele 36), EF428925 (allele 37), and EF428926 (allele 38); and for rhizobia isolated from Medicago truncatula, EF428927 (allele 39), EF428928 (allele 40), and EF428930 (allele 41).
Two different methods were used to illustrate relationships between the STs, because each method was found to have certain limitations. The eBURST analysis resulted in a large number of unaffiliated multilocus singletons; however, it was informative in that it was able to link many single-locus variants (SLVs). The neighbor-joining algorithm failed to link all SLVs into distinct clusters, but it did link all unaffiliated singletons to similar clusters of STs. Consequently, the combination of methods did make it possible to visualize the degree of relatedness among all of the STs.
Of the original 231 strains included in the chromosomal study (31), 229 were examined for allelic variation in the six pSymB loci (Table 1). Strain 12 was not included in the analyses because the PCR for several of the loci repeatedly failed to produce a product. A duplicate of strain 128A10 was omitted. The numbers of alleles revealed were 19, 18, 48, 24, 26, and 34 for catC, dak, dgoA, gabT, idhA, and pdh, respectively. A combination of the observed alleles for each locus was then used to derive 177 unique multilocus STs (see Table S2 in the supplemental material). The 177 STs were placed into 19 different groups (Fig. 1A) and 94 singletons (not shown) by eBURST analysis. The number of strains in each group ranged from 20 to 2 for the largest to smallest, respectively. The proportion of STs in the largest eBURST group (group A in Fig. 1A) was 9%, and this group consisted of several chains of linked STs. All of the STs, including the 94 singletons, were clustered using the neighbor-joining algorithm (Fig. 1B). This figure was intended to provide a convenient method to associate the 94 singletons with similar STs and was not intended to represent a phylogeny. From the neighbor-joining tree, it was evident, however, that the pSymB multilocus genotypes of strains having chromosomal ST profiles characteristic of S. medicae (31) were all associated with the eBURST group C. The large number of deep branches in the figure reflects the extensive multilocus diversity observed among the pSymB (and pSMED01) replicons.
Of the 231 strains originally included in the study (31), 225 were examined for allelic variation in the six pSymA loci (Table 2). The strains 128A14, 15A5, 17A8, M158, and M75 were not included in the analyses because the PCR for several of the loci repeatedly failed to produce products. The duplicate strain of 128A10 was omitted. The numbers of alleles revealed were 24, 29, 24, 31, 34, and 33 for the loci adeC3, fixK, napB, nifD, nodC, and SMa0198, respectively. A combination of the observed alleles for each locus was used to derive 161 unique STs (see Table S3 in the supplemental material). The 161 STs were placed into 19 different groups (Fig. 2A) and 82 singletons (not shown) by eBURST analysis. The number of strains in each group ranged from 45 to 2 for the largest to smallest, respectively. The proportion of STs in the largest eBURST group (group A in Fig. 2A) was 10.6%, and this group consisted of several chains of linked STs. All of the STs, including the 82 singletons, were clustered using the neighbor-joining algorithm (Fig. 2B). This figure was intended to provide a convenient method to associate the 82 singletons with similar STs and was not intended to represent a phylogeny. As was the case with pSymB, the pSymA multilocus genotypes of strains having chromosomal ST profiles characteristic of S. medicae (31) were all associated with a single branch, which supported eBURST groups E and G (Fig. 2B). The large number of deep branches in the figure reflects the extensive multilocus diversity among the pSymA (and the pSMED02) replicons.
No distinct correlation was found between a particular chromosomal ST and the pSymA STs and the pSymB STs in each of the strains (see Table S4 in the supplemental material).
The standardized index of association (Rd) statistics for the combinations of loci on the pSymB replicons were significant (P < 0.05) in only two combinations (Table 3), while the corresponding index for the proportion of phylogenetically compatible polymorphic sites (PrC) was not significant for any of the pSymB combinations. Since the amount of genetic polymorphism within the strains of S. medicae was limited, a linkage disequilibrium (LD) analysis of pSMED01 was not completed.
The Rd values for the combinations of loci on the pSymA replicons were significant at the P < 0.01 and P < 0.05 levels, in two and one combinations, respectively (Table 4). The corresponding PrC values for the various combinations were significant at the P < 0.01 and P < 0.05 levels in five and in three cases, respectively. Since the amount of genetic polymorphism within the strains of S. medicae was limited, a linkage disequilibrium (LD) analysis of pSMED02 was not completed. Overall, when considering both indices, consistently strong support for pairwise LD (P < 0.01) was detected only between the nodC and nifD gene combinations on the pSymA replicon.
Intragenic recombination among the 31 alleles of nifD was not significant in either the results of the pairwise homoplasy index or the results obtained using the suite of programs implemented in the software package RDP2. Nevertheless, the evolutionary relationships among the nifD alleles (with the NGR234 sequence “NGR_a01120” as the outgroup) formed a network, when analyzed with SplitsTree (Fig. 3). A network rather than a bifurcating tree is formed when the phylogenetic signals conflict or if alternate evolutionary histories are present, even when evidence for recombination is absent. The nifD alleles could be roughly divided into four clusters. Three of those clusters (IX, XI, and XII) were predominantly associated with chromosomal STs of S. meliloti. The remaining central cluster of three alleles (X) was detected in strains with chromosomal STs characteristic of S. medicae. Allele 17 in the lower cluster was unusual in that it was present in three strains with chromosomal STs that share characteristics with S. medicae rather than S. meliloti.
By using the pairwise homoplasy index (phi) test, statistically significant evidence for recombination (P = 0.0095) among the full-length alleles of nodC was revealed (Fig. 4). Further support for a specific intragenic recombination event (P < 0.05) was obtained with the applications MaxChi, GeneConv, and Chimaera within the software suite RDP2. The recombination event identified the S. medicae allele 14 (cluster II) as the major parent in the transfer of an 874-bp central fragment into the S. meliloti allele 16 (minor parent, cluster IV) present in strain M163. The resulting chimera was either allele 15 or allele 12 (cluster IV; Fig. 4). It was noted that the nodC allele of the major parent (allele 14) is associated with nodC alleles (and strains) that predominantly are symbiotically effective (clusters I, II, and III; Fig. 4) with M. sativa (Table 5). In contrast, the nodC alleles of both the minor parent (allele 16) and the putative recombinants (alleles 12 and 15) are associated with symbiotically ineffective strains (cluster IV; Fig. 4) with M. sativa (Table 5). The split graph for the nodC genes further indicated a history of recombination among alleles of nodC. However, conflicting phylogenetic signals and/or alternate evolutionary histories in nodC may have contributed to the formation of the network.
S. meliloti nifD and nodC alleles were detected in strains with an S. medicae chromosomal background (Fig. 3, cluster XI; Fig. 4, cluster IV). By coincidence, both of these alleles were assigned the identical number “17” and both were present in strains M161, M173, and M278, which have an S. medicae chromosomal background. In contrast to nifD and nodC, the four other pSymA alleles within these three strains were more characteristic of S. medicae than S. meliloti.
A codon-based Z test was initially used to investigate all pairwise allelic combinations within each locus for positive selection. This was done by testing the hypothesis that positive selection (HA: dN > dS) had been operating since the divergence of the alleles in each of the pairwise comparisons. Because the number of pairwise comparisons was large, only those with significant Z statistics were presented (Table 6). On the pSymA replicon, evidence for positive selection since divergence was obtained for two allelic combinations at the adeC3 locus, six combinations at the nodC locus, and five combinations at the SMa0198 locus. On the pSymB replicon, positive selection was indicated for 12 allelic combinations at the idhA locus and three combinations at the pdh locus (Table 6). Subsequent analyses of the same combinations using Fisher's exact test did not provide significant evidence of positive selection for any of the loci above (Table 6).
Because specific nodC alleles have previously been shown to code for an important determinant of both host specificity and symbiotic effectiveness (5, 32), a plant test was done to examine the symbiotic competence of 34 rhizobia harboring the range of the different nodC alleles that were observed (Table 5). The corresponding nifD allele and the chromosomal ST (31) for each of the strains that were selected for the plant test are also indicated. Effective nitrogen-fixing symbioses were established with 18 strains, while the remaining 17 were ineffective with various numbers of white and small nodules (Table 5), with the exception of CC2003 (nodC allele 32, cluster V), an isolate of M. laciniata that failed to nodulate M. sativa. With but one exception, symbiotic competence was associated with certain nifD alleles and not with others. For example, strains Rm1021, USDA 1002, 102F82, and N6B1 all harbored nifD allele 1 and all were symbiotically effective, while strains M105, M124, M202, M5, and M197 all harbored nifD allele 12 and all were symbiotically ineffective (Table 5). The single exception was nifD allele 18 detected in strains M11, M193, and RF22, which were symbiotically effective, while strain M47 was ineffective.
With six loci per replicon, 161 and 177 unique allelic profiles (STs) were identified for pSymA and pSymB (pSMED02 and pSMED01), respectively. Previously, for the much larger chromosome, only 91 different STs were identified in the same strains with the analysis of 10 loci (31). Therefore, the allelic diversity within the megaplasmids of S. meliloti and S. medicae appears much greater than the diversity within their corresponding chromosomes. Furthermore, no obvious relationship was detected between the chromosomal and megaplasmid ST within each strain.
The disparity between chromosomal and megaplasmid diversity could be explained by positive (or diversifying) selection, which was indicated by the codon-based Z tests. However, the results of the more conservative Fisher's exact test indicated that the dN/dS ratios observed could also be explained by chance alone. These contrasting results may have arisen because the Z test is too liberal in rejecting the null hypothesis when the total number of codons or the total number of synonymous and/or nonsynonymous substitutions is small (27).
The program eBURST was used in this study in an attempt to identify linked SLVs among the megaplasmid multilocus genotypes. The largest eBURST groups in the pSymA and pSymB populations contained 10.6% and 9.0% of the STs in the respective populations. Based on data with other microbial species, Turner et al. (29) suggested that these proportions would represent realistic ancestor-descendant relationships. In the same study, they suggested that large numbers of singletons within population snapshots, such as those obtained with the megaplasmids, were associated with elevated levels of mutation and recombination.
Most of the previous studies examining recombination in Sinorhizobium have focused on either horizontal gene transfer (HGT) or intergenic recombination by linkage disequilibrium analyses. Several reports presented evidence that the chromosomes of S. meliloti and S. medicae are sexually isolated (2, 10, 17a, 31). This evidence was used in support of the circumscription of these two species (22). However, this degree of sexual isolation does not appear to be present in the megaplasmids within these two species. For example, in a recent phylogenetic analysis, Bailly et al. (3) concluded that the nod genes in one biovar of S. meliloti (bv. meliloti) were more similar to the nod genes in another species (S. medicae) than they were to the nod genes in a second biovar of S. meliloti (bv. medicaginis). They hypothesized that this nod sequence similarity between S. meliloti and S. medicae might be explained by either a history of HGT between these species or as a result of selection by host plants, preventing the divergence of the alleles. However, they concluded that HGT was a more likely explanation based on a comparative analysis of dS/dN ratios within these two species. Different nodC alleles of strains representing the highly host-specific S. meliloti bv. medicaginis were included in the current study (Fig. 4, cluster V), and the failure to detect positive selection associated with any of these alleles (alleles 32, 35, 36, 37, and 38) would support the HGT hypothesis (3). In further support of the HGT hypothesis was the observation of intragenic recombination between the nodC alleles of S. medicae (allele 14) and S. meliloti (allele 16).
In bacteria, intergenic recombination of genes within a species is usually evaluated by linkage disequilibrium (LD) analysis. Generally the level of linkage disequilibrium between two loci is inversely correlated with their map distance. Furthermore, complete LD is expected between loci that have been replicating clonally: i.e., that have not been involved in a genetic recombination event (24). In a previous multilocus sequence analysis of 49 S. meliloti strains that were included in the present study, Sun et al. (26) sequenced partial fragments within four other genes on the pSymA replicon and within six other genes on the pSymB replicon. They observed significant LD among a few gene pairs encoded by the pSymA replicon, while finding little evidence for LD between genes encoded by the pSymB replicon. Similar results were obtained in the present study for strong LD between certain loci on the pSymA replicon (Table 4) and no strong support for LD between genes on the pSymB replicon (Table 3).
The two loci (nifD and nodC) for which consistently strong evidence of LD was obtained both encode critical components needed for the nitrogen-fixing symbiosis. These two genes map to a 275-kb region of the 1.35-Mb pSymA replicon of S. meliloti strain Rm1021 (4). The strong LD observed is more likely due to the proximity of these loci on the pSymA replicon than to host selection for fitness (21). However, selection for other linked loci (hitchhiking) is a possibility that remains to be examined.
Intragenic recombination in the nodC locus was detected using the RDP2 suite of programs (17). From this analysis, it was apparent that certain nodC alleles do have a history of intragenic recombination and it was shown in one specific case that this may have resulted in a loss of symbiotic competence on M. sativa. There was no corresponding support for recombination among the nifD alleles that were examined. These contrasting results for these two genes were supported by pairwise homoplasy indices.
The presence of a small (<1,000 bp) recombinant fragment within a gene that is otherwise in LD with nearby loci, such as was found in the nodC locus of strain M163, is consistent with the clonal frame model proposed for the Escherichia coli chromosome by Milkman and Bridges (18). In this model, the E. coli chromosome was described as a mosaic of clonal segments each bounded by recombinational borders. They suggested that in natural populations, such segments that contain favorable alleles may increase in number. As this is occurring, the segments may also undergo changes in clonality as recombination introduces short replacement segments from outside the clone.
Clearly the genomic relationships of S. meliloti and S. medicae are complex. The combined observations of sexual isolation between the chromosomes of these species and the lateral transfer of certain genes and gene regions in their symbiotic megaplasmids indicate that recombinational boundaries do exist. An awareness of the nature of these boundaries and how they may have shaped the population genetic structure in these species will aid in the development of strategies to maximize the contribution of symbiotic nitrogen fixation to sustainable agriculture.
We thank K. Lee Nash for excellent technical assistance.
Published ahead of print on 23 April 2010.
†Supplemental material for this article may be found at http://aem.asm.org/.