|Home | About | Journals | Submit | Contact Us | Français|
Cotton fibers (produced by Gossypium species) are the premier natural fibers for textile production. The two tetraploid species, G. barbadense (Gb) and G. hirsutum (Gh), differ significantly in their fiber properties, the former having much longer, finer and stronger fibers that are highly prized. A better understanding of the genetics and underlying biological causes of these differences will aid further improvement of cotton quality through breeding and biotechnology. We evaluated an inter-specific Gh × Gb recombinant inbred line (RIL) population for fiber characteristics in 11 independent experiments under field and glasshouse conditions. Sites were located on 4 continents and 5 countries and some locations were analyzed over multiple years.
The RIL population displayed a large variability for all major fiber traits. QTL analyses were performed on a per-site basis by composite interval mapping. Among the 651 putative QTLs (LOD > 2), 167 had a LOD exceeding permutation based thresholds. Coincidence in QTL location across data sets was assessed for the fiber trait categories strength, elongation, length, length uniformity, fineness/maturity, and color. A meta-analysis of more than a thousand putative QTLs was conducted with MetaQTL software to integrate QTL data from the RIL and 3 backcross populations (from the same parents) and to compare them with the literature. Although the global level of congruence across experiments and populations was generally moderate, the QTL clustering was possible for 30 trait x chromosome combinations (5 traits in 19 different chromosomes) where an effective co-localization of unidirectional (similar sign of additivity) QTLs from at least 5 different data sets was observed. Most consistent meta-clusters were identified for fiber color on chromosomes c6, c8 and c25, fineness on c15, and fiber length on c3.
Meta-analysis provided a reliable means of integrating phenotypic and genetic mapping data across multiple populations and environments for complex fiber traits. The consistent chromosomal regions contributing to fiber quality traits constitute good candidates for the further dissection of the genetic and genomic factors underlying important fiber characteristics, and for marker-assisted selection.
There are two economically important tetraploid cultivated species of cotton, Gossypium hirsutum (also referred to as "Upland" cotton) and G. barbadense (Caribbean "Sea-Island", Extra Long Staple "ELS" and modern "Pima" and "Egyptian" cultivars). They display many complementary agronomic features. G. hirsutum (hereafter Gh), the most widely cultivated species, has higher yield potential than G. barbadense (Gb) in most environments; however, Gb cultivars are superior to Gh in most aspects of fiber quality, such as fiber length, strength and fineness. The two species are derived from a recent polyploidization event that occurred ~1-2 MYA . They are inter-fertile, but inter-specific crosses aimed at recombination of of genes underlying their complementary agronomic performance have generally resulted in difficulties such as reduced fertility, cytological abnormalities and distorted segregation in the F 2 generation. Although inter-specific breeding by conventional approaches has had positive impacts on cultivar development , the potential for using molecular markers to facilitate more rapid and effective selection and transfer of G. barbadense fiber properties to G. hirsutum  is clear.
Cotton fibers are highly elongated single cells of the epidermal layer of the seed. Fiber development spans four discrete, yet overlapping stages: initiation (-3 to 5 days post anthesis, dpa), elongation (3 to 21 dpa), secondary cell wall deposition (14 to 45 dpa) and maturation/dehydration (40 to 55 dpa) . The final mature spinnable fibers are dried flattened cylinders ~35-50 mm long and made of ~96% cellulose. Their commercial value is determined by their overall physical dimensions and the extent of thickening of the internal walls. Various physical properties of cotton fibers are measured ranging from fiber length and length uniformity, strength, elongation (degree of extensibility), maturity (extent of cell wall thickening), micronaire (resistance to air flow across a plug of fibers) and fineness (linear density, a function of diameter and thickness), to color indices (reflectance and yellowness). The most commonly used equipment, High Volume Instrument (HVI), is used for commercial and research applications for high-throughput measurements of most of these parameters, while research programs also rely on specialized instruments like Fibronaire, Maturimeter or the Advanced Fiber Information System (AFIS) . An optimal fiber quality results from the composite association of numerous partially correlated quantitative traits all impacting on the final performance of the fiber in spinning and weaving and relatively small differences in fiber parameters can attract premiums or penalties during marketing.
Earlier genetic studies demonstrated fairly high heritabilities of fiber characteristics, in excess of 0.40, except for length uniformity [5-7], and moderate environment and GxE effects (also reviewed in ). More than 30 reports have been published on genetic mapping and QTL mapping in inter-specific Gh × Gb populations, of which 13, originating from 14 different populations, relate to QTL data for fiber traits. These studies represent a total of over 455 QTLs [6-18]. A common feature of most of these QTL studies is that positive effects on all fiber traits are derived from the presence of alleles from both Gh and Gb parents [6,7,17] commonly resulting in transgressive segregation [7,19].
Comparison between different QTL studies in cotton is generally complicated by insufficient marker synteny between maps and the lack of sufficient bridge markers. However, an overall observation is that the level of congruence of QTL localization is generally low. Only two QTL reports have included a comparison across populations. The study by Lacape et al. , focusing on 3 backcross generations (BC 1, BC 2 and BC 2S 1, derived from the Gh × Gb cross of Guazuncho-2 (Gh) × VH8-4602 (Gb)), reported 50 significant fiber QTLs (80 putative QTLs at LOD > 2.5). Only 20% of the QTLs were common between at least 2 of the 3 BC data sets and only 30% putatively agreed with at least one QTL report in the literature for both chromosomal location and parental species origin. Rong et al.  aligned 212 fiber QTLs from 5 different inter-specific Gh × Gb populations using the palmeri x K101-F2 (Gh × Gb) map as a reference. These authors concluded that there was a poor level of correspondence of fiber QTLs between experiments and populations. Inter-specific chromosome substitution lines (having a single Gh chromosome pair substituted by Gb chromosomes), so called CS-B lines, have been employed to assign genetic effects (additivity, dominance) on fiber traits to specific chromosomes or chromosome arms [20,21], although correlations with other QTL studies were not undertaken.
Although the number of reports on QTL mapping of agronomically important traits in plants has increased tremendously in the past 20 years, several authors have also emphasized limitations and biases of QTL mapping that have limited their broad application to marker-assisted breeding in many crops. It appears that numerous "resource-limited" QTL studies may be questionable in terms of their reliability and accuracy, major limitations being the number of genotypes and number of environments under study [22-25]. Such limitations will be inflated for traits of low to moderate heritability or for traits with significant QTL x environment interactions . A way of improving the power and accuracy in detection of true QTLs is by increasing population sizes (often not practical), or by multiplying the number of environments in which the population is evaluated (reviewed in ). One option would be to reanalyze raw data in a pooled analysis , but this approach is generally prohibitive because of different data structures and requirement for a common set of markers across populations.
Comparative QTL mapping and meta-analysis, however, provides another means to unify, and thereby simplify molecular analysis of complex phenotypes across multiple data sets. Meta-analysis aims to study QTL congruency. Results are pooled across studies in order to combine them in a single result, thus improving estimate in QTL detection . Meta-analysis of QTLs has mainly been reported in medical and animal sciences . Examples in plants include nematode resistance in soybean , disease resistance traits in cocoa , drought-related traits in rice  and ear emergence in wheat . The report on cotton by Rong et al.  concerned yield and fiber traits gathered from diverse QTL mapping reports that were collectively projected onto a reference map. These studies essentially used BioMercator software , that has recently been followed by a new computational and statistical package, MetaQTL . MetaQTL has been used for the meta-analysis of QTLs for flowering time  and N-remobilization in maize , FHB resistance in wheat , and root architecture in rice .
In this study, we evaluated the fiber characteristics of a population of inter-specific Gh × Gb RILs in eleven independent experiments involving both field and glasshouse conditions, different locations and multiple years. The two parents used for the RIL population were the same as those being currently used in marker-assisted backcross selection underway at CIRAD to enhance the fiber quality of Gh breeding material . A meta-analysis was undertaken with MetaQTL using the fiber QTLs derived from the eleven experimental sites of the RIL population, fiber QTLs previously reported in backcross populations, and fiber QTLs from the literature, in order to identify highly congruent genomic regions contributing to key fiber quality traits.
Summary data of the 11 experiments and the fiber trait values for the parents and RILs are given in Table Table11 and Table Table2,2, respectively. Among the 11 fiber data sets, 4 were collected from glasshouse grown plants and 7 from field-grown plants (Table (Table1).1). Although field-grown samples as compared to glasshouse samples may be expected to be more representative of real commercial growing conditions, they may also be more prone to environmental variation. The 2 types of growing conditions were therefore not specifically separated in all further analyses. The two parents, Guazuncho-2 (Gh) and VH8-4602 (Gb) display a broad range of phenotypes in many aspects of plant growth and development (not shown) as well as fiber quality parameters (Table (Table2),2), making them suitable for discriminating the underlying genetic contributions to fiber quality in genetic crosses. The fibers of the Gb parent were intrinsically very fine (favorable), but their maturity and micronaire measures were usually very low (unfavorable). As expected, the RILs displayed a high degree of variability at all sites and for every descriptor of plant phenology (such as earliness), morphology (eg., leaf shape and hairiness), and production components (eg., boll size or number), but also showed frequent segregation beyond parental variability, or transgression, for those parameters (to be presented elsewhere). Variation among the RILs was also considerable for all fiber quality parameters (Table (Table2).2). Although the mean RIL values (intermediate between the 2 parents) were always closer to the Gh parent value than to the Gb parent, there were always some individual RILs displaying transgression on both sides of the distribution for most traits, as shown by the maximum and minimum values in Table Table2.2. The ranking of individual RILs for fiber characteristics showed good consistency among sites, indicative of moderate GxE interaction effects and high heritabilities for fiber traits, as described below.
Fiber fineness components of the RILs in particular, displayed a wide range of variation. Fiber maturity was considerably improved in the RIL material, relative to the mid-parent value, with an average level close to that of Guazuncho-2 (MR = 0.86 for the RILs, as compared to 0.89 for Guazuncho-2 and 0.72 for VH8-4602 over 10 data sets). In addition to their large variation, fiber length traits (ML and UHML) were the parameters for which the mean RIL values were the most biased toward the Gh parent. Fiber of the RILs was on average 1 mm longer (UHML) than that of the Gh parent, although the 2 parents differed by more than 8 mm. Fiber strength displayed transgression in some RILs, as exemplified by the extreme high 44.8 g/tex (+ 5.3 g/tex over best parent) for one RIL in experiment Garoua 2007 (Ga7). Regarding the two color parameters, the parents only displayed moderate differences (Gb slightly more brilliant) and the variation over the RILs largely exceeded the parental range in both yellowness and reflectance.
A separate analysis of variance of the fiber data collected from the two Brazilian experiments of 2007 and 2008 was conducted. The 2 randomized blocks design used in each year constituted a balanced data set with a sufficiently high number of individuals (109 RILS in common) to allow a statistical estimate of GxE compared with some of the other sites. The environmental (E) component corresponded to compounded effects of blocks and year. In the case of the global data set, there was a variable number of RILs (Table (Table1)1) for which 7 to 10 individual fiber data were collected from the 11 experiments depending on the trait measured. The year and location effects were confounded in the analysis of variance and the GxE effect was not testable. The results of the 2 analyses of variance of fiber data are presented in Table Table3.3. In the Brazil-only analysis, all fiber traits displayed strong (P < 0.0001) genotype effects and high heritabilities (all superior or equal to 0.78). Brazilian data also indicated that half of the fiber quality measures were associated with non significant GxE (RIL × year) effects. When significant (MR, IM, ML, UHML, UI, and elongation), this interaction component was always strongly inferior to the genotypic (RIL) effect (Table (Table3),3), confirming data from the literature and indicating that the genetic influence on major fiber characteristics was generally greater than non-genetic influences (reviewed in ). As compared to Brazil data, it could be assumed that a larger environmental variation had been introduced from the testing over 11 sites × years; global estimation of trait heritabilities based on the complete data set resulted in slightly lower values, between 0.37 for maturity ratio and 0.69 for yellowness index (Table (Table3).3). These values were generally superior to the narrow sense heritabilities estimated from the BC 2/BC 2S 1 regression  and fell within the range of reported values from the literature . Highest heritabilities associated with a non-significant RIL × year interaction component for the Brazilian data were related to color indices (Rd and + b) and to fiber fineness.
Within a trait category, the strongly significant correlations between mean length, ML, and upper half mean length, UHML (r = 0.95) or between the 2 color parameters, Rd and + b (r = -0.76) are inherent to the measurement equipment and are well described . Within the fiber fineness/maturity category, 3 parameters were also well correlated: H, MR and IM (r = 0.68 for H/MR, r = 0.85 for H/IM and r = 0.85 for MR/IM), while the 4th parameter, standard fineness, Hs, was the least correlated with the 3 others. This can be explained by the fact that Hs is calculated as the ratio of H to MR, and is used to compare cottons of very different maturities. There was also a fairly high and positive correlation between mean fiber length and strength (mean r = 0.57), but no significant correlation between fineness and length or fineness and strength, although the 3 fiber traits, strength, length and fineness, are phenotypically associated in the 2 parents.
One hundred and sixty seven significant (LOD > permutation-based threshold) QTLs were detected by composite interval mapping over the 11 RIL experiments, involving 93 individual series of data (see Additional file 1, Table S1 and summary in Table Table4).4). The 167 significant QTLs individually explained 8.4% to 48.4% of the phenotypic variation, with LOD as high as LOD = 10.6 for color QTLs on c8. The 167 significant QTLs were a subset of the larger number of 651 putative QTLs (including all QTLs of LOD > 2) which were considered for meta-analysis.
The number of RILs sampled and analyzed for their fiber parameters per location/year ranged from as low as 38 for Ge6 to 128 for the Br8 data set, and the number of fiber traits measured at the different sites also varied (Table (Table1).1). The lowest number of QTLs (5) was detected for Cs7 (Table (Table4)4) as only fineness components and fiber length were measured at this site because the small fiber quantities generated were insufficient for most fiber measuring instruments. For other data sets (Table (Table4),4), where more fiber parameters were measured, the number of significant QTLs per data set varied between 11 (Cs8, Lu8 and Mp7) and 23 (Mp8).
Globally, the largest group of QTLs (66) was related to the fiber fineness/maturity category as this category integrates more variables than any of the others (Table (Table4)4) and includes H, MR, Hs, and micronaire. Fiber length (UHML, ML and UQLw) and fiber length uniformity (UI, and SFI) detected 27 and 14 QTLs, respectively. Strength and elongation were accounted for by 12 and 16 QTLs, respectively, but were not detected at all sites. Fiber color, represented by 2 parameters, Rd and + b, revealed 32 QTLs in total.
The number of fiber QTLs per chromosome (Table (Table4)4) varied. The lowest numbers of QTLs were detected on c10 (1 QTL) and c11, c13, c20, c22 and c23 (2 QTLs each), and the highest numbers were detected on c12 (17 QTLs), c15 (16), c21 (13) and c19 (10). A similar number of significant QTLs were detected on chromosomes of the A t sub-genome (c1-c13) and the D t sub-genome (c14-c26), with 80 and 87 significant QTLs, respectively (Table (Table44).
The parental contribution (additivity) for the 167 significant QTLs was also analyzed. In a commercial context, high values of length, strength, elongation, maturity or reflectance are sought by fiber merchants and spinners, while low values of fineness and yellowness are favored. In our inter-specific cross, the Gh parent, Guazuncho-2 is expected to be the donor for slightly better fiber color properties as well as for fiber elongation, while the Gb parent, VH8-4602, is expected to be the donor for all other parameters, including the most commercially important ones, length, strength and fineness. From our QTL data, we observed that the relative contribution of the 2 parents was sometimes "as expected" with more QTLs having a positive contribution from the predicted donor parent, as was the case for fiber elongation (100%, 16/16, by the Gh parent), fiber color (72%, 23/32, by the Gh parent) and fineness (73%, 19/26, by the Gb parent). However, in the case of fiber strength and length the situation was reversed, as the donor parent VH8-4602 (Gb) contributed positively at less QTLs than Guazuncho-2 (33% for fiber strength and length, i.e. 4/12, and 9/27, respectively).
In total, 67 significant QTL (LOD > permutation-based threshold) were detected after the re-analysis of the 3 BC data sets, BC 1, BC 2 and BC 2S 1 (see Additional file 1, Table S1, also summarized in Additional file 2, Table S2). This is higher than the 50 significant QTLs of the original report , because the three generations were considered separately here, while the QTLs reported earlier were derived from pooled data. Details of these QTLs (additivity, distribution) were, however, only moderately altered as compared to the initial report . Using a relaxed threshold there were 328 putative QTLs (LOD > 2) detected over the three BC generations and these were used in the meta-analysis.
QTL analyses of fiber data from the 11 RIL experiments and 3 backcross generations altogether generated 234 significant QTLs (exceeding LOD threshold) and 979 QTLs (including putative QTLs of LOD > 2). Visual inspection indicated a moderate level of transferability between RIL and BC data sets. The highest frequency of conserved QTLs from RIL and/or BC data sets mapping at close distance were encountered for fiber color on c25 involving 8 (5 RIL and 3 BC data sets) of the 10 data sets where it was measured, for length on c4 and fineness on c12 and c15 (6 RIL data sets), for fiber color on c8 (5 RIL data sets), for fiber length on c3 (3 RIL and 3 BC data sets). These regions of conserved QTLs were only identified at the lower detection threshold (LOD2) and therefore mainly comprise putative QTLs. Conversely, some strong QTLs were found to be specifically detected in only a limited number of situations. For example, strength QTLs on c3 with strong effects were only detected in the BC populations, but none were detected in the RIL experiments. Two possible factors may account for these differences between BC and RIL: (1) population structure and heterozygosity (50% in the BC 1, 25% in the BC 2 and < 5% for the RILs), and (2) environmental interactions (glasshouse/field, year effect) including intrinsic measurement variability (like high CV).
In addition to QTLs from the RIL and BC data, fiber QTLs reported by  were also projected onto the RIL-BC consensus map. In all, the QTLProj module of MetaQTL was used to project over 1100 QTLs onto the consensus map, including all 328 QTLs of the BC data, all except 10 of the 651 QTLs of the RIL data and 140 of the 212 QTLs from Rong et al. . The 10 RIL QTLs that could not be projected (6 on c4, 3 on c6 and 1 on c24) may be due to minor remaining map inconsistencies in some terminal regions between the RIL map and the consensus map.
The one-LOD confidence interval (CI) was used as the primary screen to identify a limited number of chromosome regions with consistent localization for meta-analysis . For a given chromosome and for a given fiber trait category, the presence of QTLs from at least 4 (most often 5) independent data sets mapping anywhere on the same chromosome was considered as the cutoff for clustering with the QTLClust module of MetaQTL. Thirty chromosome regions were chosen, and in each case the best clustering model based on AIC criterion was then implemented and clusters generated by MetaQTL. Nearly half (471 out of 979) of the QTLs served as input data for the meta-analysis, resulting in 135 clusters on 19 different chromosomes. The seven other chromosomes (c1, c7, c11, c13, c14, c20 and c22) did not show sufficient QTL co-localization for any fiber trait to warrant clustering. An example of the clustering output is shown in Figure Figure22 for the 26 QTLs related to fiber length detected on chromosome 3. Detailed clustering results are shown in Additional file 4, Table S3 and summarized in Table Table5.5. Output figures generated by MetaQTL for all 30 trait x chromosome combinations are shown in Additional file 5, Figure S2, with corresponding text comments in Additional file 6, Table S4. Despite an acceptable level of representation among the data sets, some of the combinations listed in Table Table55 and Additional file 6, Table S4, were qualified as still only indicative because there was sometimes a poor level of co-localization and consequently high numbers of putative clusters were generated by MetaQTL. We decided to be conservative with these few cases because of the possible uncertainty in the mapping data, whereby markers (and the inferred linked QTLs) could be erroneously mapped at distant locations.
The QTL meta-analysis relied strongly upon the fact that the CI of proposed co-localizing LOD peaks were overlapping. We used the one-LOD drop off method for calculation of CI. Considering the inherent limits related to the notion of CI (peak height, population size and structure) and the possible uncertainties in the locus order and distances in genetic maps, we also tested an alternative, formula-based method for CI calculation . All clustering analyses were re-run with MetaQTL and clustering results with new formula-based CI were found to differ. The much larger individual CI with the formula-based method (see Additional file 1, Table S1) most often resulted in a lower number of clusters that also had larger CI (not shown). Apart from the effect of the method of calculation of the CI, it is also expected that mapping errors and uncertainties in mapping will have a strong effect on LOD peak and CI positioning. For these reasons, we believe that in some cases the identification of separate clusters by MetaQTL software could be questioned. In several instances, we have proposed that multiple clusters mapping at fairly close distances and containing individual QTL for which CI essentially overlapped and with the condition that their individual additivity agreed, be coalesced and collectively referred-to as candidate "meta-clusters" (Table (Table55 and see comments in Additional file 6, Table S4). Most prominent examples of such meta-clusters were found for fiber color on c8 and c25, length on c24, fineness on c2, c12, c15, and c17 or for strength on c23. Applying a prioritization among clusters and meta-clusters we identified a total of 25 robust cases (6 corresponded to a single cluster, all others to a meta-cluster grouping 2 to 4 clusters), listed in Table Table55 and Additional file 6, Table S4, and mapped to 15 different chromosomes, shared between 11 for fiber fineness, 5 for length, 2 for strength, 5 for color and 2 for elongation. The higher representation of fiber fineness as compared to other categories may reflect the inherent complex nature of fiber fineness as it includes two fairly independent interacting components (fiber diameter and fiber cell wall thickness or maturity) that are not separated by the measurement devices used. Regarding fiber strength, only 12 significant QTLs (Table (Table4)4) and 2 clusters (Table (Table5),5), on c3 and c21, were detected. An issue here may be in the reliability of the HVI-based measurement of fiber strength within the range of values that occurred in the RILs. Considering that micronaire reading is used by the HVI to estimate fiber plug mass and convert breakage force to g/tex (strength unit), it might be important to bear in mind that the parent VH8-4602 and a number of RILs had an extremely low micronaire reading (<2.0) and this could cause a biased estimation of bundle strength.
In the majority of cases (22 out of the 25 robust clusters and meta-clusters) the directionality of the additive effect followed the expectation in relation to the parental phenotypes (Table (Table5):5): improved trait values were contributed by the Gh parent for the 5 fiber color and 2 fiber elongation meta-clusters and by the Gb parent for 2 fiber strength meta-clusters, for 10 of the 11 fiber fineness clusters and for 3 of the 5 length meta-clusters. A positive contribution by the inferior parent was also reported in the case of fiber length in  for 61% (17 out of 28) of their non over-lapping fiber length QTLs. This effect may explain the marked occurrence of transgressive segregation.
From these 25 robust cases, 16 may be considered of higher significance based on additional consistency evidence (Table (Table5)5) both from our data and data from the literature:
- 2 clusters or meta-clusters for fiber length on c3 and c4
- 9 clusters or meta-clusters for fiber fineness on c9 (2×) and c21 (2×) and on c12, c15, c17, c18 and c25,
- 5 clusters or meta-clusters for fiber color on c8 (2×) and of c25 (2×) and c6.
The two Gossypium species used in this study represented highly divergent genotypes and were purposely selected as parents for hybridization to both maximize the chances of detecting significant QTLs for all fiber traits and to produce material that might have value as a source of variation for breeding. Because of their interspecific origin, some of the RILs showed reduced fertility and productivity resulting in the poor representation of data from some lines and sites (Table (Table1).1). Another contributing factor for the poor performance of the RILs could be that most of the experimental sites were outside of South-Central America, the area of adaptation of the two parents.
The averaged fiber characteristics of the RILs were always intermediate between the 2 parents, but closer to the Gh parent than to the Gb parent (Table (Table2).2). This is in accordance with the observed distorted allelic constitution of most of the RILs, that contained on average 71% and 29% of alleles of Gh and Gb, respectively . Some individual RILs displayed fiber characteristics outside of the parental range (transgressive phenotypes) and may thereby provide useful material to include in breeding crosses. Interestingly, the parental allelic composition of the RILs, varying between 95/5 to 32/68 parental (Gh%/Gb%) allelic content, showed no correlation with fiber quality: the best performing RILs were not those having the highest allelic content in VH8-4602 alleles.
The quantitative variation in the fiber characteristics observed over 11 (year × site) RIL experiments did convert into a high number of QTLs. Including the 167 significant fiber QTLs from the RILs, ca 600 fiber QTLs have been collectively reported so far from 9 different inter-specific Gh × Gb populations. Although we have reported some stability in QTL detection among this high number of fiber QTLs, the overall complex picture behind fiber QTL mapping may be put into perspective with the fact that fiber quality is assessed by a series of fairly independent physical measurements, essentially related to dimensional features like diameter, thickness and length. The complex genetic network is also consistent with biological evidence that cotton fiber development involves many genes (probably over 90% of all cotton genes are expressed in cotton fiber, ) and gene interactions.
Although earlier published comparative QTL mapping studies concluded that there was a poor level of transferability of QTLs between populations [7,16], it was expected that at least the strongest fiber QTLs would be confirmed, thus hopefully reducing the number of relevant QTLs  to a more manageable level from a breeder's perspective. The four populations studied here (3 backcross-derived and 1 RIL) only revealed a moderate level of convergence across data sets. Regarding the RIL experiments, the differences in the QTLs detected among the 11 sites probably resulted from the low number of individual RILs examined in some RIL experiments. The most important limiting factor causing reduced accuracy of detection of QTLs probably relates to the range of different population sizes used in the current study [22,44] rather than to the effect of environment, because heritabilities of fiber traits were shown to be medium to high. Another factor negatively affecting QTL detection power was the distorted genomic composition of the RIL population.
For these reasons, a fairly low detection threshold (LOD2) was used to declare LOD peaks as putative QTLs for use in across-experiment comparisons. Previous experience in QTL interval mapping has indicated that it is common for "background" noise in LOD score profiles to occur in the range of LOD values between 0 and 1. Figure Figure33 provides an example of the usefulness of lowering the LOD detection threshold for QTL detection in the particular context of such across-experiment comparisons. In the case of the 11 independent series of fiber length measurements and their LOD profiles along chromosome 4, only 3 significant QTLs having a LOD superior to the permutation-based threshold (3.5 in this case) would have been declared in a bottom region of c4. However, when a lower LOD threshold is used (LOD2), 3 additional "putative" QTLs (6 instead of 3) would be integrated in the meta-analysis. In this particular case, the reduction of the LOD detection threshold to LOD1 would even have allowed 2 additional locations to be included (i.e. 9 out of the total 11). Lowering detection thresholds also had some deleterious effects and probably resulted in significant inflation of QTL numbers, particularly by splitting broader peaks into multiple peaks in close proximity, as well as the probable inclusion of some false positives.
The BC and RIL populations included in meta-QTL analysis had the same Gh and Gb parents (Guazuncho-2 and VH8-4602). When possible, we compared our results with fiber QTLs reported from different crosses in the literature, for which the map locations on our BC-RIL consensus map could be extrapolated. Interestingly, for at least 22 of the 25 prioritized clusters and meta-clusters, additional support was provided by the reported localization on the same chromosome of at least one QTL from the literature, of which 17 also agreed in their directionality (Table (Table5).5). The cases of contradiction in directionality (see examples of fiber length on c9 or fineness on c18) may relate to real differential allelic effects as the parents differed. In a few cases, QTLs reported in the literature were not detected in this study. The strong fiber strength QTL reported from cross TM1 x 7235 by Zhang et al.  localized on c24/D8, near locus BNL2961, and presumably originating from the G. anomalum and G. barbadense lineages of one of the 2 parents, was only corroborated by one putative QTL, Lu8_STR with peak LOD = 2.54 and higher strength by the Gb alleles. A recent intraspecific RIL population  also reported 13 fiber QTLs, including a region of co-localization for 5 QTLs for different fiber traits on c7, but none was corroborated by any of our clusters. Similarly, the central region of c6 containing the gene t1 governing leaf hairiness did not contain any confirmed fiber QTL from our data, although leaf hairiness has been correlated with fiber quality in 2 segregating intra-G. hirsutum populations . The only putative case of co-localization (distant by only 10 cM) with t1 was for fiber fineness QTLs detected in 2 RIL data sets, Br7 and Cs8 (not considered for clustering) co-localized with a fineness QTL from the literature, FF06.1 from Paterson et al.  (see Additional file 5, Figure S2). It should also be noted that several fiber QTL reports from the literature could not be considered because of lack of bridge markers and uncertain chromosome assignation; this was the case for most QTLs from 2 interspecific Gh × Gb populations [10,15], and more importantly for the majority of fiber QTLs reported from intraspecific G. hirsutum crosses [48,49].
The coincidence of QTLs for different fiber traits in the same genomic region was reported earlier [7,16,46]. Our data based on a larger set of QTLs confirmed this observation. In several instances a close co-localization of individual QTLs for different types of fiber traits was observed. Moreover, in the majority of cases the respective directionality of the groups of QTLs agreed with the known phenotypic association in the parents. This may be interpreted in terms of pleiotropy because of the known correlations between fiber parameters, their physical definition or their particular measurement method. Although pleiotropic genetic effects may not be overlooked, we would also consider linkage as an explanation for these cases of co-localizations.
The overall distribution of QTLs and QTL clusters was not homogenous between chromosomes (Table (Table44 for the RILs, additional file 2, Table S2, for the BCs, and Table Table55 for QTL clusters). Although, simply counting numbers of QTLs may be misleading, because of correlations between traits under consideration, and possible mis-identification of multiple LOD peaks in close vicinity (5-10 cM) as separate QTLs, it is noteworthy that chromosome 19 consistently harbored a higher number of QTLs than any other chromosome (this study, BC data  and ). The enrichment in fiber QTLs on c19 was more frequently correlated with a Gh additive contribution to fiber traits: higher elongation, length, and strength and lower fineness.
Subgenome distribution was equivalent between chromosomes c1-c13 of the A t sub-genome and c14-c26 of the D t sub-genome, with 105 and 129 significant QTLs (among 234 from all RIL and BC), and 13 and 12 robust meta-clusters respectively (Table (Table5).5). The extant wild D-genome diploids, including the modern species G. raimondii, recognized as the closest to the D-genome ancestor of polyploid cotton, only have very short and coarse non-spinnable trichomes on their seed. Thus, the finding that the D t-genome of tetraploid A tD t cotton contributed at least half of the QTLs (eg, alleles and genes) involved in fiber quality raises the possibility that homoeologous fiber-related genes have differentially evolved following polyploid formation [16,50]. Recent reports tend to confirm an unequal homoeologous gene expression pattern in allopolyploid cottons [51,52] leading to a differential expression of duplicated genes during fiber development  (1500 A t/D t couples of genes studied).
Meta-clusters generally mapped at non-homoeologous A t/D t locations. The only possible homoeologies for QTL-enriched regions with same directionality included two regions, though only indicative, for fineness along the upper half of pair c5/c19 already identified by Rong et al. , and two new potentially homoeologous regions (close to the duplicated locus BNL1440) along central regions of pair c6/c25 both mapping fiber color QTLs (Table (Table5,5, and additional files 3 and 5, Figures S1 and S2 respectively).
The chromosomal regions enriched in fiber QTLs were compared with the chromosome regions statistically enriched in fiber-expressed genes reported by Xu et al. . These authors used available mapping data for cotton EST Unigenes from diverse libraries to assess their distribution among chromosomes. They identified a limited number of gene-rich regions: two regions at the top and the center of c5, a central region in each of c10 and c14, and a top region on c15. Cross comparison of these regions with our clusters for fiber QTLs indicated only 2 cases of putative convergence. A region at the top of c5 (0-30 cM interval on our consensus map) was over-represented in fiber "initiation" genes  and enriched in fiber length and fiber fineness QTLs (Table (Table5).5). Both fiber traits were consistent for a Gh positive effect (lower fineness and higher length), but the clustering was unreliable. Secondly, a central region of c10 (50-80 cM interval) was enriched in "elongation" genes  and possibly corresponded with a meta-cluster region for fiber fineness, FIN_10 (Table (Table5)5) consistent for a Gb positive effect (lower fineness). It should be borne in mind that genes classified as "expressed in cotton fibers", as in , may not necessarily match with the genes that underlie the fiber QTLs reported here. The subset of genes with known differential expression patterns between Gh and Gb may be more likely candidates for co-localization with fiber QTL. In this context, two recently published studies specifically focused on differences between Gh and Gb at the transcriptional level using microarrays. Alabady et al.  showed that on average 14.5% of the 12000 fiber genes profiled were specifically and differentially regulated between Pima (Gb) and TM1 (Gh) developing fibers. In another study, Al-Ghazi et al.  showed only few Gh/Gb differences in gene expression except at early stage of fiber development (20% of 24000 genes differentially expressed at 7 dpa) as compared to later stages (< 4%).
A challenging issue for bridging the gap between QTL mapping and identification of the underlying causative DNA polymorphisms is the low resolution associated with QTL mapping. Over the past years QTL mapping has resulted in the identification of thousands of chromosomal regions predicted to be involved in many complex traits. However, only in a few examples has it been possible to clone the genes that underlie the traits . Although our results on cotton fiber can hardly support the optimistic assumption that "QTL are accurate" , we have shown that the reliability of QTL-calls and the estimated trait impact can be improved by integrating more replicates into the analysis. It should, however, be emphasized that segregating populations of larger sizes than the ones that have so far been reported (majority in the range of 80-150), including our own RILs, will be needed in order to improve the detection power of QTLs [22,23], and this is particularly true in the context of phenotypic traits of complex inheritance like cotton fiber quality. In cotton, particularly with the inter-specific crosses that are needed to access the higher levels of DNA polymorphism that occurs between Gossypium species, this is sometimes difficult to achieve. The 140 RILs analyzed here were the end result of SSD from over 600 original F 2 plants with large numbers dropping out at each generation due to low fertility or extremely late flowering. Meta-analysis of QTLs was shown to be useful for identifying robust QTLs, but it is not a substitute for large population sizes. It will be important in the future, that new fiber QTLs plotted on new maps sharing common markers with our consensus map be verified for their agreement with the regions of convergence identified here.
In term of practical applications, the relatively small number of candidate regions hosting fiber QTL meta-clusters that were identified here will facilitate molecular breeding strategies like marker-assisted backcrossing.
On the other hand, different avenues may also be followed to identify the genes underlying strongest QTLs. Positional cloning of QTLs has been the major route in plant QTL dissection , and examples in the literature are increasing, although practical interest for really complex traits remains to be shown. In addition to classical phenotypic QTL information from segregating populations, QTLs may also be resolved through association mapping of targeted candidate genes, based on an assumption that the allelic polymorphism of the gene is associated with the variation of the trait and provided that linkage disequilibrium is sufficient . Alternatively, the use of segregating populations for expression QTL (eQTL) mapping and the correlative observation of the variation of transcript abundance with the variation of the phenotypic traits  can also provide a complementary way to resolve complexity of phenotypic QTLs. This approach using a segregating population and combining genetics (QTL mapping) and genomics (gene expression profiling), also referred as genetical genomics , is underway using the same RIL population . Finally, the dissection of some of these specific regions, could proceed by determining the physical position of associated markers and by linking it with synteny-based information for example from Arabidopsis as was attempted by Rong et al. , at least until the cotton genome is sequenced.
A recombinant inbred line (RIL) population was created by CIRAD in Montpellier glasshouses from a cross between the 2 parents, Guazuncho-2 and VH8-4602 . Guazuncho-2 and VH8-4602 are typical representatives of G. hirsutum and G. barbadense (frequently referred to as Upland and Extra Long Staple (ELS) cottons, respectively). The Gb parent, VH8-4602, belongs to the group of Sea Island cottons among which it represents the best accession present in the extensive germplasm collection of CIRAD. VH8-4602 combines high values in both fiber length and strength compared to other Gb accessions. Conversely, the choice of Guazuncho-2, as Gh parent, was based upon its overall good agronomic behavior for the target regions of CIRAD's cotton improvement program and is associated with medium quality fibers. The same Gh and Gb parents also served to develop a backcross BC 1 mapping population  and later BC 1, BC 2 and BC 2S 1 generations that were used for QTL mapping of fiber quality parameters  and leaf pubescence . The RIL population comprised 140 lines in an F 6 to F 9 (6 F 6, 19 F 7, 89 F 8 and 26 F 9) stage of selfing through single seed descent (SSD), and was used to build an SSR-AFLP genetic map . A BC-RIL, Gh × Gb, consensus map was constructed after integration of RIL (140 individuals) with BC 1 (75 individuals) marker data . The consensus map contained 1745 loci and spanned 3637 cM. This consensus map contained a high proportion of markers in common with other published genetic maps and was used as a reference for the overall QTL projections.
Small amounts of seeds of the original RIL population were distributed in 2006 to CSIRO, Bayer CS, and the Brazilian Agricultural Research Corporation (EMBRAPA) for seed increase under glasshouse conditions in Canberra (Australia), Ghent (Belgium), and Campina Grande (Brazil), respectively. Two of these populations of glasshouse-grown plant were used for fiber property measurements and constituted data sets Ge6 (Ghent) and Cs7 (Canberra). Two other glasshouse experiments consisted of pooled fiber samples in Montpellier, one from an environment-controlled glasshouse (set Mp7) and a second from the combination of two consecutive summer harvest in an un-controlled glasshouse (set Mp8). Mp7 fiber samples were pooled from the harvest of the last 3 generations of repeated selfing of RILs during SSD. Mp8 fiber samples originated from a randomized experiment comparing 130 RILs in 2 replications (1 pot as replicate and 2 plants per pot). Field experiments were conducted during growing seasons 2007/2008 and 2008/2009 by CSIRO in Australia (location Narrabri, NSW: data sets Cs8 and Cs9), by Bayer in the USA (2 locations in Texas, Lubbock and Idalou: data sets Lu7 and Lu8), by EMBRAPA in Brazil (location Itatuba, Paraiba: data sets Br7 and Br8) and only in 2007 by the Institute of Agricultural Research for Development (IRAD) in Cameroon (location Garoua: data set Ga7). RIL experimental details by location are given in Table Table1.1. Except for the RIL experiments in the earliest seasons (Ge6, Cs7, and Ga7), the seeds for other RIL experiments originated from the bulked harvest from the previous years experiment, i.e. the generations of selfing may therefore differ between experiments. However the original RIL material, in an F 8 generation of selfing, was 95% homozygous as a mean over all loci in the 140 RILs .
Only subsets of the 140 RILs were tested and evaluated for fiber characteristics in the various experiments (Table (Table1).1). This was a consequence, for some of the RILs, of a poor germination ability, a long duration of the life cycle (for example some late flowering RILs matured too late to be harvested in field experiments in some locations) and more importantly from low fertility resulting in insufficient quantities of fiber for testing in some fiber analysis instruments. The most comprehensive data sets were obtained from the testing in Brazil under drip-irrigation conditions during 2 growing seasons which generated data from 123 and 128 RILs respectively. In both years, 2 repetitions of the RILs were grown, and fiber quality data (Br7 and Br8) are the average of the 2 replicates. Some other sites also had replications, for which either a single pooled fiber sample per RIL was analyzed (Mp8, Lu7, Lu8), or replicates were averaged after analysis (Cs8 and Cs9).
Ginning of the seed-cotton was conducted locally at each site using in most cases a roller gin (or saw gin for Cs8 and Cs9 or hand ginned for Cs7) and fiber measurements were determined by each partner; by CIRAD in Montpellier (data sets Mp7, Mp8, Br7, Br8 and Ga7), by CSIRO in Australia (Cs7, Cs8 and Cs9) and by Bayer in Belgium (Ge6) and in the USA (Lu7 and Lu8). Instruments used were either High Volume Instruments, HVI Classing (Uster Technologies, Charlotte, NC), for measuring length, short fiber index, strength, elongation, micronaire, and color, or AFIS Pro systems (Uster Technologies, Charlotte, NC), for measuring length and fineness components. In addition, some specialized instruments were used to measure micronaire only (Fibronaire, Motion Control, Inc., Dallas, TX) or micronaire, maturity and fineness (Shirley maturimeter FMT, Shirley Developments, Stockport, England) (Table (Table1).1). All instruments were calibrated using international calibration cotton standards. Among the more than 20 measurements provided by the different instruments used, 13 truly non-synonymous parameters were finally taken into consideration in the analysis and were grouped into 6 broad categories, fineness/maturity further referred as fineness, length, length uniformity, strength, elongation and color.
Four individual parameters refer to "fineness" (symbolized FIN) as a category: fineness (linear density, noted H, mass per 1000 meters of fibers expressed in mtex, a tex being the weight in grams of 1000 meters of fiber), maturity ratio (noted MR, a measurement of the relative amount of the cellulose in the fiber cross-section, dimensionless; also measured as % mature fibers, PM), standard fineness (noted Hs, ratio of H to MR or mass per 1000 meters of fibers having a MR of 1, expressed in mtex), and micronaire reading (a commercial index, here noted IM, varying from 2 to 5, and based on the measurement of an air flow that passes through a porous plug of cotton fibers). Unlike fiber maturity (thickness of the fiber wall) and fiber fineness (linear density of the fiber), both of which are straightforward in interpretation (high maturity and low fineness being favorable), the interpretation of the micronaire is more complex. A given micronaire value may be reached by very different combinations of fineness (H) and maturity (MR). Two cottons, for example, one with immature (MR = 0.67) and coarse (H = 220 mtex) fibers and the other with mature (MR = 1.04) and fine (H = 150 mtex) fibers can have the same micronaire reading of 4.1. Despite its limitations, micronaire reading is still a widely used measure of a fibers' fitness for spinning applications and remains a key parameter in both the breeding and marketing of cotton.
Five variables relate to the fiber "length" category (LEN). Three parameters are direct estimates of the length, in mm or inches, of the fibers: mean length (noted ML, mean length by number of the fibers), upper half mean length (noted UHML, average length by number of the longer half of the fibers), both measured on HVI, and upper quartile length per weight (noted UQLw, length of the 25% by weight of longest fibers) measured on an AFIS instrument. In addition to these 3 length measures, 2 additional parameters represent the homogeneity of the length, or "length uniformity" distribution (UNI): uniformity index (noted UI, calculated from HVI data as the ratio between ML and UHML) and short fiber content or short fiber index (noted SFI, calculated from AFIS or HVI data, as the percentage of short fibers of less than 0.5 inches). Fiber strength (STR), the force required to break a bundle of fibers (in g/tex), and fiber elongation (ELO) measured from the same bundle of fibers as for strength and representing the degree of elasticity before breakage (dimensionless), are both measured on HVI. Finally, 2 negatively correlated parameters measured on HVI represent, in combination, the color grade of the fiber (COL): reflectance (noted Rd, relative whiteness of reflected light, in %) and yellowness index (noted + b, degree of yellowness of reflected light using a yellow filter, dimensionless). CSIRO breeders do not use color evaluations in their selection program, so this parameter was not measured in data sets Cs7, Cs8 and Cs9.
The analyses of variance were conducted using the GLM procedure of the SAS software package (SAS Institute Inc., Cary, NC) for 2 series of fiber phenotypic data: - for the 2 Brazilian data sets (2 trials in 2007 and 2008 both under 2 completely randomized replicates design) using the fiber data from the 109 RIL in common between the 2 years, and - for the 11 (year × site combinations) RIL data sets. The effect of genotypes (RILs) was tested either against global residual or against genotype x set interaction when significant. Variance components were calculated using the VarComp procedure of SAS, declaring the variables Year, and replicates in the case of Brazil data, or Sites (year × site) in the global analysis, as fixed. Broad sense individual heritabilities (h²) were calculated as the ratio between genotypic (RIL) and phenotypic (error) variances.
The frequency distribution for most fiber traits fitted a normal distribution (not shown) and no data transformation was made before QTL analysis.
A genetic map of the RIL population was published earlier . The map was based on the segregation in over 140 RILs of 800 AFLP and SSR markers, and spanned 2044 cM. A subset of 656 loci separated by more than 1 cM was used for QTL analysis of RIL fiber data. We have used the classical nomenclature system for numbering the chromosomes, c1-c13 and c14-c26 for the chromosomes of the A t and D t sub-genomes respectively, as this facilitated across-experiments comparisons. The assignation of the 13 homoeologous A/D pairs is as follows: - c1/c15 (A1/D1), c2/c14 (A2/D2), c3/c17 (A3/D3), c4/c22 (A4/D4), c5/c19 (A5/D5), c6/c25 (A6/D6), c7/c16 (A7/D7), c8/c24 (A8/D8), c9/c23 (A9/D9), c10/c20 (A10/D10), c11/c21 (A11/D11), c12/c26 (A12/D12) and c13/c18 (A13/D13).
QTL were analyzed separately for each individual data set (site × year) and 5 to 12 fiber traits per data set, using WinQTL Cartographer 2.5 . For each variable, interval mapping over the whole genome using multiple regression of phenotypic data on marker genotypic data was run with 1000 permutations to identify the minimum significant LOD (global risk of 5%) threshold score to be considered. Composite interval mapping (CIM) was performed with Model 6 using the markers pre-selected by stepwise regression as cofactors. Permutation LOD thresholds varied between 3.2 for UHML in the Lu8 data set and 4.6 for SFI in Cs8 data set (not reported). However, to facilitate comparisons across data sets, we also considered a relaxed LOD value (2.0) to locate additional QTLs and declare the presence of putative QTLs used in the meta-analysis (see Discussion). All QTLs (LOD > 2) were automatically localized using WinQTL Cartographer with the following parameters: - minimal space between peaks = 5 cM, and minimum LOD from top to valley = 1. The position of peaks and their one-LOD CI, corresponding to a 1 LOD drop off from the peak , were recovered as outputs from WinQTL Cartographer.
Individual RIL QTLs were named in the form "Loc_Trait_Chr_X_Peak_[Add]", for the combination of a given data set location ("Loc") and trait ("Trait"), on a chromosome ("Chr") with a rank (from the top to the bottom) on it ("X" from 1 to N if N peaks surpassed LOD2 on the given chromosome), with a LOD peak value ("Peak") and an additive effect ("[Add]", either [+] or [-]) as conferred by parent Gh). For example "Ga7_UHML_3_2_3.45_[+]" designated the presence of a QTL detected in the Ga7 data set for trait UHML, on chromosome 3, in 2nd position, with a LOD value of 3.45 and a positive additive effect on trait (higher fiber length) conferred by alleles of the Gh parent, Guazuncho-2.
The fiber data of the 3 backcross generations derived from the same cross, Guazuncho-2 x VH8-4602 , were reanalyzed with WinQTL Cartographer for QTL mapping using the same procedure as for the RILs. The 3 generations consisted of 75 BC 1, 200 BC 2 and 200 of their selfed BC 2 S 1 progenies. BC 1 and BC 2 were grown in Montpellier glasshouses for fiber production, while the per-row value of BC 2 S 1 was assessed from a field experiment in Brazil as previously described . Interval mapping procedures (IM and CIM) were based upon the most recent mapping data using the BC-RIL consensus map . This map was considered as more reliable than the BC 1 map initially used . QTLs of BC data sets were named in a similar way as the RIL QTLs (see above), except that "Loc" designated either of the 3 generations analyzed, BC 1, BC 2 or BC 2 S 1.
Different numbers of RILs were tested in the 11 experiments (Table (Table1).1). Because of the existence of many missing phenotypic data in our RIL-by-environment data matrix, it was not possible to assess GxE interactions, except for the Brazil data, nor apply QTL-by-environment interaction (QEI) analysis models . In contrast to genotypic data, existence of missing phenotypic data, as in our case, would have resulted in discarding many individuals lacking data, sacrificing all other phenotypic and genotypic information available for those individuals. However fiber quality traits being considered as heritable and GxE usually moderate , we have chosen to place emphasis on a comparative mapping approach, i.e. on the coincident detection on some chromosomal regions of QTLs in several year-site combinations. Fiber QTL data from the BC and RIL data sets were integrated using the BC-RIL consensus map  on which all QTLs were located (directly for the BC QTLs and by interpolated projection for the RIL QTLs).
Rong et al.  positioned a total of 212 fiber QTLs originating from 5 different inter-specific Gh × Gb populations, 4 F 2 and 1 BC [6,8,11-13,16], all projected onto the Gh race palmeri × K101-F 2 reference map . The basic information for these QTLs including their names and CI were downloaded from http://www.plantgenome.uga.edu/cottonmap.htm. The projection of these QTLs onto the Guazuncho-2 x VH8-4602-consensus map was possible thanks to the existence of 203 markers in common between the 26 chromosomes of the 2 maps. However, inconsistencies in marker order led to difficulties in integrating around one third of these QTLs. Apart from QTLs from Rong et al.  we also considered other literature sources of fiber QTL mapping [9,10,14,15,18], but they were not included in the meta-analysis.
The meta-analysis of QTLs from the RIL and BC data sets were run using the MetaQTL package . The purpose of MetaQTL is to evaluate, for a given trait, the degree of congruence of the CI around LOD peaks detected in different mapping experiments. MetaQTL offers a statistical process to establish a consensus model for marker and QTL positions and to develop a clustering approach. The first 2 modules, ConsMap and QTLProj, of MetaQTL apply a weighted least square strategy for positioning markers on a single consensus map and for positioning QTLs on this map,, respectively. Then a clustering approach implemented in the QTLClust module and based on a Gaussian mixture model, determines the optimal number, K, of clusters by means of 4 different information-based criteria, including the Akaike Information Criterion (AIC) that is the only criteria presented here. Parameter estimates of MetaQTL include the most likely location on the chromosomes of the K clusters, their 95% CI, and the probability of individual QTLs belonging to a particular cluster.
Because clustering results are strongly influenced by the CI that frames the QTL, we ran the clustering with MetaQTL using 2 series of input data as CI of QTLs: - one-LOD drop off method as derived from composite interval mapping by WinQTL Cartographer, and - calculation-based method as described in Darvasi and Soller . In this method, a weighted CI is calculated as the product of a constant value (function of the population type) divided by N*R² (N, population size and R², percentage of variance explained).
For a given chromosome, QTLs within the same broad trait category (thus possibly including different, but correlated traits, like ML, UHML and UQLw in case of fiber length category, or Rd and + b for fiber color category, etc..) and for different data sets were selected for clustering when QTLs with additivity of similar directionality were detected in at least 4 different populations/data sets (RIL or BC). The regions showing some degree of consistency across populations corresponded to "confirmed" QTLs, according to the terminology of Lander and Kruglyak . The following nomenclature system was adopted to designate clusters: QTLClust_Trait_Chr_Rank (for example QTLClust_FIN_3_2, for the second cluster of fiber fineness QTLs along chromosome 3).
In several instances, it was proposed that clusters mapped at close proximity be coalesced into meta-clusters; in such cases the meta-cluster was referred-to by trait acronym followed by the chromosome number and by a letter suffix when necessary, such as FIN_21B for a second fineness meta-cluster on chromosome 21.
JJe, CV, MC, JML participated in the RIL experiment conducted in Montpellier. PO and SG conducted the RIL experiment in Cameroon. MG, PAVB and HdA conducted RIL experiments in Brazil. GG and MV conducted fiber analyses for experiments conducted in France, Brazil and Cameroon, and provided global interpretation of fiber parameters. JJa co-ordinated all Bayer CS contributions and conducted the experiment in Belgium. DB, SC and TA conducted the RIL experiments in the USA. DL co-ordinated all CSIRO contributions and drafted the manuscript. SL and YAG conducted the RIL experiments in Australia. JML conceived and co-ordinated the project, conducted meta-analyses and drafted the manuscript. All authors contributed to the interpretation of the results, read and approved the final manuscript.
Table S1: Details of significant QTLs in RIL and BC data sets.
Tables S2: Distribution of QTLs in BC experiments. Distribution of the 67 significant QTLs and of the total 255 (including putative QTLs shown in parentheses) from the analysis of the 3 BC generations and shared among the 6 fiber trait categories, 3 generations and 26 chromosomes.
Table S3: Characteristics of the clusters detected by MetaQTL and list of meta-clusters.
Table S4: Detailed chromosome by chromosome descriptions of the QTL clusters identified by MetaQTL.
Phenotypic data on the various locations were collected by many colleagues that are all thanked by the authors. Experiments in Brazil (glasshouse and field) and Cameroon (field) were conducted thanks to the collaborative links of CIRAD with EMBRAPA and with IRAD, respectively. The authors would like to thank Tuong-Vi Cao and Jean-François Rami for assistance in data analysis, as well as Brigitte Courtois for critical review of the manuscript, as well as the 2 anonymous reviewers. The French National Research Agency, ANR, through its program specialized in plant genomics, Génoplante, has sponsored this research (project nr ANR-06-GPLA-018).