The adult plant stem rust resistance gene Sr2 was introgressed into hexaploid wheat cultivar (cv) Marquis from tetraploid emmer wheat cv Yaroslav, to generate stem rust resistant cv Hope in the 1920s. Subsequently, Sr2 has been widely deployed and has provided durable partial resistance to all known races of Puccinia graminis f. sp. tritici. This report describes the physical map of the Sr2-carrying region on the short arm of chromosome 3B of cv Hope and compares the Hope haplotype with non-Sr2 wheat cv Chinese Spring.
Sr2 was located to a region of 867 kb on chromosome 3B in Hope, which corresponded to a region of 567 kb in Chinese Spring. The Hope Sr2 region carried 34 putative genes but only 17 were annotated in the comparable region of Chinese Spring. The two haplotypes differed by extensive DNA sequence polymorphisms between flanking markers as well as by a major insertion/deletion event including ten Germin-Like Protein (GLP) genes in Hope that were absent in Chinese Spring. Haplotype analysis of a limited number of wheat genotypes of interest showed that all genotypes carrying Sr2 possessed the GLP cluster, whereas among those lacking Sr2 some, including Marquis, possessed the cluster and others lacked it. Thus, this region represents a common presence-absence polymorphism in wheat, and presence of the cluster is not correlated with presence of Sr2. Comparison of Hope and Marquis GLP genes on 3BS found no polymorphisms in the coding regions of the ten genes but several SNPs in the shared promoter of one divergently transcribed GLP gene pair and a single SNP downstream of the transcribed region of a second GLP.
Physical mapping and sequence comparison showed major haplotype divergence at the Sr2 locus between Hope and Chinese Spring. Candidate genes within the Sr2 region of Hope are being evaluated for the ability to confer stem rust resistance. Based on the detailed mapping and sequencing of the locus, we predict that Sr2 does not belong to the NB-LRR gene family and is not related to previously cloned, race non-specific rust resistance genes Lr34 and Yr36.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-014-0379-z) contains supplementary material, which is available to authorized users.
Adult plant resistance (APR); Map-based cloning; Sr2; Germin-like proteins (GLPs); Wheat stem rust; Puccinia graminis; Physical mapping; Gene expression
The ~17 Gb hexaploid bread wheat genome is a high priority and a major technical challenge for genomic studies. In particular, the D sub-genome is relatively lacking in genetic diversity, making it both difficult to map genetically, and a target for introgression of agriculturally useful traits. Elucidating its sequence and structure will therefore facilitate wheat breeding and crop improvement.
We generated shotgun sequences from each arm of flow-sorted Triticum aestivum chromosome 5D using 454 FLX Titanium technology, giving 1.34× and 1.61× coverage of the short (5DS) and long (5DL) arms of the chromosome respectively. By a combination of sequence similarity and assembly-based methods, ~74% of the sequence reads were classified as repetitive elements, and coding sequence models of 1314 (5DS) and 2975 (5DL) genes were generated. The order of conserved genes in syntenic regions of previously sequenced grass genomes was integrated with physical and genetic map positions of 518 wheat markers to establish a virtual gene order for chromosome 5D.
The virtual gene order revealed a large-scale chromosomal rearrangement in the peri-centromeric region of 5DL, and a concentration of non-syntenic genes in the telomeric region of 5DS. Although our data support the large-scale conservation of Triticeae chromosome structure, they also suggest that some regions are evolving rapidly through frequent gene duplications and translocations.
EBI European Nucleotide Archive, Study no. ERP002330
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1080) contains supplementary material, which is available to authorized users.
Wheat genome; Chromosome sorting; Triticum aestivum; Genome zipper; Triticeae genome; Chromosome arm shotgun; Comparative grass genomics
Bread wheat (Triticum aestivum L.) is the most important staple food crop for 35% of the world's population. International efforts are underway to facilitate an increase in wheat production, of which the International Wheat Genome Sequencing Consortium (IWGSC) plays an important role. As part of this effort, we have developed a sequence-based physical map of wheat chromosome 6A using whole-genome profiling (WGP™). The bacterial artificial chromosome (BAC) contig assembly tools fingerprinted contig (fpc) and linear topological contig (ltc) were used and their contig assemblies were compared. A detailed investigation of the contig structure revealed that ltc created a highly robust assembly compared with those formed by fpc. The ltc assemblies contained 1217 contigs for the short arm and 1113 contigs for the long arm, with an L50 of 1 Mb. To facilitate in silico anchoring, WGP™ tags underlying BAC contigs were extended by wheat and wheat progenitor genome sequence information. Sequence data were used for in silico anchoring against genetic markers with known sequences, by which almost 79% of the physical map could be anchored. Moreover, the assigned sequence information led to the ‘decoration’ of the respective physical map with 3359 anchored genes. Thus, this robust and genetically anchored physical map will serve as a framework for the sequencing of wheat chromosome 6A, and is of immediate use for map-based isolation of agronomically important genes/quantitative trait loci located on this chromosome.
bread wheat chromosome 6A; whole-genome profiling; linear topological contigs; anchored physical map; bacterial artificial chromosome contigs; technical advance
The bread wheat (Triticum aestivum L.) genotype “Chinese Spring” (“CS”) is the reference base in wheat genetics and genomics. Pericentric rearrangements in this genotype were systematically assessed by analyzing homoeoloci for a set of nonredundant genes from Brachypodium distachyon, Triticum urartu, and Aegilops tauschii in the CS chromosome shotgun sequence obtained from individual chromosome arms flow-sorted from CS aneuploid lines. Based on patterns of their homoeologous arm locations, 551 genes indicated the presence of pericentric inversions in at least 10 of the 21 chromosomes. Available data from deletion bin-mapped expressed sequence tags and genetic mapping in wheat indicated that all inversions had breakpoints in the low-recombinant gene-poor pericentromeric regions. The large number of putative intrachromosomal rearrangements suggests the presence of extensive structural differences among the three subgenomes, at least some of which likely occurred during the production of the aneuploid lines of this hexaploid wheat genotype. These differences could have significant implications in wheat genome research where comparative approaches are used such as in ordering and orientating sequence contigs and in gene cloning.
chromosomal rearrangement; comparative genomics; pericentric inversion; pericentromeric regions; translocation; Chinese Spring
Powdery mildew, caused by Blumeria graminis f. sp. tritici, is one of the most important wheat diseases in the world. In this study, a single dominant powdery mildew resistance gene MlIW172 was identified in the IW172 wild emmer accession and mapped to the distal region of chromosome arm 7AL (bin7AL-16-0.86-0.90) via molecular marker analysis. MlIW172 was closely linked with the RFLP probe Xpsr680-derived STS marker Xmag2185 and the EST markers BE405531 and BE637476. This suggested that MlIW172 might be allelic to the Pm1 locus or a new locus closely linked to Pm1. By screening a genomic BAC library of durum wheat cv. Langdon and a 7AL-specific BAC library of hexaploid wheat cv. Chinese Spring, and after analyzing genome scaffolds of Triticum urartu containing the marker sequences, additional markers were developed to construct a fine genetic linkage map of the MlIW172 locus region and to delineate the resistance gene within a 0.48 cM interval. Comparative genetics analyses using ESTs and RFLP probe sequences flanking the MlIW172 region against other grass species revealed a general co-linearity in this region with the orthologous genomic regions of rice chromosome 6, Brachypodium chromosome 1, and sorghum chromosome 10. However, orthologous resistance gene-like RGA sequences were only present in wheat and Brachypodium. The BAC contigs and sequence scaffolds that we have developed provide a framework for the physical mapping and map-based cloning of MlIW172.
The banana family (Musaceae) includes a genetically diverse group of species and their diploid and polyploid hybrids that are widely cultivated in the tropics. In spite of their socio-economic importance, the knowledge of Musaceae genomes is basically limited to draft genome assemblies of two species, Musa acuminata and M. balbisiana. Here we aimed to complement this information by analyzing repetitive genome fractions of six species selected to represent various phylogenetic groups within the family.
Low-pass sequencing of M. acuminata, M. ornata, M. textilis, M. beccarii, M. balbisiana, and Ensete gilletii genomes was performed using a 454/Roche platform. Sequence reads were subjected to analysis of their overall intra- and inter-specific similarities, and all major repeat families were quantified using graph-based clustering. Maximus/SIRE and Angela lineages of Ty1/copia long terminal repeat (LTR) retrotransposons and the chromovirus lineage of Ty3/gypsy elements were found to make up most of the highly repetitive DNA in all species (14–34.5% of the genome). However, there were quantitative differences and sequence variations detected for classified repeat families as well as for the bulk of total repetitive DNA. These differences were most pronounced between species from different taxonomic sections of the Musaceae family, whereas pairs of closely related species (M. acuminata/M. ornata and M. beccarii/M. textilis) shared similar populations of repetitive elements.
This study provided the first insight into the composition and sequence variation of repetitive parts of Musaceae genomes. It allowed identification of repetitive sequences specific for a single species or a group of species that can be utilized as molecular markers in breeding programs and generated computational resources that will be instrumental in repeat masking and annotation in future genome assembly projects.
Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L.).
The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes.
This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and monitoring of alien segments in crop breeding programs and further enable mapping and cloning novel genes from the wild relatives of crop plants.
The wheat genome sequence is an essential tool for advanced genomic research and improvements. The generation of a high-quality wheat genome sequence is challenging due to its complex 17 Gb polyploid genome. To overcome these difficulties, sequencing through the construction of BAC-based physical maps of individual chromosomes is employed by the wheat genomics community. Here, we present the construction of the first comprehensive physical map of chromosome 1BS, and illustrate its unique gene space organization and evolution.
Fingerprinted BAC clones were assembled into 57 long scaffolds, anchored and ordered with 2,438 markers, covering 83% of chromosome 1BS. The BAC-based chromosome 1BS physical map and gene order of the orthologous regions of model grass species were consistent, providing strong support for the reliability of the chromosome 1BS assembly. The gene space for chromosome 1BS spans the entire length of the chromosome arm, with 76% of the genes organized in small gene islands, accompanied by a two-fold increase in gene density from the centromere to the telomere.
This study provides new evidence on common and chromosome-specific features in the organization and evolution of the wheat genome, including a non-uniform distribution of gene density along the centromere-telomere axis, abundance of non-syntenic genes, the degree of colinearity with other grass genomes and a non-uniform size expansion along the centromere-telomere axis compared with other model cereal genomes. The high-quality physical map constructed in this study provides a solid basis for the assembly of a reference sequence of chromosome 1BS and for breeding applications.
Bread wheat (Triticum aestivum) has a large and highly repetitive genome which poses major technical challenges for its study. To aid map-based cloning and future genome sequencing projects, we constructed a BAC-based physical map of the short arm of wheat chromosome 1A (1AS). From the assembly of 25,918 high information content (HICF) fingerprints from a 1AS-specific BAC library, 715 physical contigs were produced that cover almost 99% of the estimated size of the chromosome arm. The 3,414 BAC clones constituting the minimum tiling path were end-sequenced. Using a gene microarray containing ∼40 K NCBI UniGene EST clusters, PCR marker screening and BAC end sequences, we arranged 160 physical contigs (97 Mb or 35.3% of the chromosome arm) in a virtual order based on synteny with Brachypodium, rice and sorghum. BAC end sequences and information from microarray hybridisation were used to anchor 3.8 Mbp of Illumina sequences from flow-sorted chromosome 1AS to BAC contigs. Comparison of genetic and synteny-based physical maps indicated that ∼50% of all genetic recombination is confined to 14% of the physical length of the chromosome arm in the distal region. The 1AS physical map provides a framework for future genetic mapping projects as well as the basis for complete sequencing of chromosome arm 1AS.
Common wheat (Triticum aestivum L.) is one of the most important cereals in the world. To improve wheat quality and productivity, the genomic sequence of wheat must be determined. The large genome size (∼17 Gb/1 C) and the hexaploid status of wheat have hampered the genome sequencing of wheat. However, flow sorting of individual chromosomes has allowed us to purify and separately shotgun-sequence a pair of telocentric chromosomes. Here, we describe a result from the survey sequencing of wheat chromosome 6B (914 Mb/1 C) using massively parallel 454 pyrosequencing. From the 4.94 and 5.51 Gb shotgun sequence data from the two chromosome arms of 6BS and 6BL, 235 and 273 Mb sequences were assembled to cover ∼55.6 and 54.9% of the total genomic regions, respectively. Repetitive sequences composed 77 and 86% of the assembled sequences on 6BS and 6BL, respectively. Within the assembled sequences, we predicted a total of 4798 non-repetitive gene loci with the evidence of expression from the wheat transcriptome data. The numbers and chromosomal distribution patterns of the genes for tRNAs and microRNAs in wheat 6B were investigated, and the results suggested a significant involvement of DNA transposon diffusion in the evolution of these non-protein-coding RNA genes. A comparative analysis of the genomic sequences of wheat 6B and monocot plants clearly indicated the evolutionary conservation of gene contents.
wheat; chromosome 6B; genome sequencing; next-generation sequencing
Diploid Aegilops umbellulata and Ae. comosa and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata are important wild gene sources for wheat. With the aim of assisting in alien gene transfer, this study provides gene-based conserved orthologous set (COS) markers for the U and M genome chromosomes. Out of the 140 markers tested on a series of wheat-Aegilops chromosome introgression lines and flow-sorted subgenomic chromosome fractions, 100 were assigned to Aegilops chromosomes and six and seven duplications were identified in the U and M genomes, respectively. The marker-specific EST sequences were BLAST-ed to Brachypodium and rice genomic sequences to investigate macrosyntenic relationships between the U and M genomes of Aegilops, wheat and the model species. Five syntenic regions of Brachypodium identified genome rearrangements differentiating the U genome from the M genome and from the D genome of wheat. All of them seem to have evolved at the diploid level and to have been modified differentially in the polyploid species Ae. biuncialis and Ae. geniculata. A certain level of wheat–Aegilops homology was detected for group 1, 2, 3 and 5 chromosomes, while a clearly rearranged structure was shown for the group 4, 6 and 7 Aegilops chromosomes relative to wheat. The conserved orthologous set markers assigned to Aegilops chromosomes promise to accelerate gene introgression by facilitating the identification of alien chromatin. The syntenic relationships between the Aegilops species, wheat and model species will facilitate the targeted development of new markers specific for U and M genomic regions and will contribute to the understanding of molecular processes related to allopolyploidization.
As for other major crops, achieving a complete wheat genome sequence is essential for the application of genomics to breeding new and improved varieties. To overcome the complexities of the large, highly repetitive and hexaploid wheat genome, the International Wheat Genome Sequencing Consortium established a chromosome-based strategy that was validated by the construction of the physical map of chromosome 3B. Here, we present improved strategies for the construction of highly integrated and ordered wheat physical maps, using chromosome 1BL as a template, and illustrate their potential for evolutionary studies and map-based cloning.
Using a combination of novel high throughput marker assays and an assembly program, we developed a high quality physical map representing 93% of wheat chromosome 1BL, anchored and ordered with 5,489 markers including 1,161 genes. Analysis of the gene space organization and evolution revealed that gene distribution and conservation along the chromosome results from the superimposition of the ancestral grass and recent wheat evolutionary patterns, leading to a peak of synteny in the central part of the chromosome arm and an increased density of non-collinear genes towards the telomere. With a density of about 11 markers per Mb, the 1BL physical map provides 916 markers, including 193 genes, for fine mapping the 40 QTLs mapped on this chromosome.
Here, we demonstrate that high marker density physical maps can be developed in complex genomes such as wheat to accelerate map-based cloning, gain new insights into genome evolution, and provide a foundation for reference sequencing.
chromosome 1BL; evolution; gene space; grasses; hexaploid wheat; map-based cloning; physical mapping; sequencing; synteny
Bread wheat (Triticum aestivum L.) is one of the most important crops worldwide and its production faces pressing challenges, the solution of which demands genome information. However, the large, highly repetitive hexaploid wheat genome has been considered intractable to standard sequencing approaches. Therefore the International Wheat Genome Sequencing Consortium (IWGSC) proposes to map and sequence the genome on a chromosome-by-chromosome basis.
We have constructed a physical map of the long arm of bread wheat chromosome 1A using chromosome-specific BAC libraries by High Information Content Fingerprinting (HICF). Two alternative methods (FPC and LTC) were used to assemble the fingerprints into a high-resolution physical map of the chromosome arm. A total of 365 molecular markers were added to the map, in addition to 1122 putative unique transcripts that were identified by microarray hybridization. The final map consists of 1180 FPC-based or 583 LTC-based contigs.
The physical map presented here marks an important step forward in mapping of hexaploid bread wheat. The map is orders of magnitude more detailed than previously available maps of this chromosome, and the assignment of over a thousand putative expressed gene sequences to specific map locations will greatly assist future functional studies. This map will be an essential tool for future sequencing of and positional cloning within chromosome 1A.
The assembly of the bread wheat genome sequence is challenging due to allohexaploidy and extreme repeat content (>80%). Isolation of single chromosome arms by flow sorting can be used to overcome the polyploidy problem, but the repeat content causes extreme assembly fragmentation even at a single chromosome level. Long jump paired sequencing data (mate pairs) can help reduce assembly fragmentation by joining multiple contigs into single scaffolds. The aim of this work was to assess how mate pair data generated from multiple displacement amplified DNA of flow-sorted chromosomes affect assembly fragmentation of shotgun assemblies of the wheat chromosomes.
Three mate pair (MP) libraries (2 Kb, 3 Kb, and 5 Kb) were sequenced to a total coverage of 89x and 64x for the short and long arm of chromosome 7B, respectively. Scaffolding using SSPACE improved the 7B assembly contiguity and decreased gene space fragmentation, but the degree of improvement was greatly affected by scaffolding stringency applied. At the lowest stringency the assembly N50 increased by ~7 fold, while at the highest stringency N50 was only increased by ~1.5 fold. Furthermore, a strong positive correlation between estimated scaffold reliability and scaffold assembly stringency was observed. A 7BS scaffold assembly with reduced MP coverage proved that assembly contiguity was affected only to a small degree down to ~50% of the original coverage.
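The N50 fold changes reported above can be illustrated with a minimal sketch of how N50 is commonly computed and compared before and after scaffolding. The contig and scaffold lengths below are hypothetical values chosen for illustration only, not data from the 7B assembly, and the function name `n50` is likewise an assumption of this sketch.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp) before scaffolding; illustration only.
contigs = [500, 400, 300, 200, 100, 100]

# Hypothetical scaffolds obtained by joining contigs pairwise
# (500+400, 300+200, 100+100); gap sizes are ignored for simplicity.
scaffolds = [900, 500, 200]

fold_change = n50(scaffolds) / n50(contigs)
print(f"contig N50 = {n50(contigs)}, scaffold N50 = {n50(scaffolds)}, "
      f"fold change = {fold_change:.2f}")
```

In this toy case the total assembled length is unchanged by scaffolding, so the fold change reflects only how contigs were joined, which is the quantity that varies with scaffolding stringency.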
The effect of integrating MP data into paired-end shotgun assemblies of wheat chromosomes was moderate, possibly due to poor contig assembly contiguity, the extreme repeat content of wheat, and the use of amplified chromosomal DNA for MP library construction.
Wheat; Assembly; Scaffold; Mate-pair; MDA; Improvement
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands of nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extrachromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity were analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between, Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes that can be identified in Musa.
Polyploidization is considered one of the main mechanisms of plant genome evolution. The presence of multiple copies of the same gene reduces selection pressure and permits sub-functionalization and neo-functionalization leading to plant diversification, adaptation and speciation. In bread wheat, polyploidization and the prevalence of transposable elements resulted in massive gene duplication and movement. As a result, the number of genes which are non-collinear to genomes of related species seems markedly increased in wheat.
We used new-generation sequencing (NGS) to generate sequence of a Mb-sized region from wheat chromosome arm 3DS. Sequence assembly of 24 BAC clones resulted in two scaffolds of 1,264,820 and 333,768 bases. The sequence was annotated and compared to the homoeologous region on wheat chromosome 3B and orthologous loci of Brachypodium distachyon and rice. Among 39 coding sequences in the 3DS scaffolds, 32 have a homoeolog on chromosome 3B. In contrast, only fifteen and fourteen orthologs were identified in the corresponding regions in rice and Brachypodium, respectively. Interestingly, five pseudogenes were identified among the non-collinear coding sequences at the 3B locus, while none was found at the 3DS locus.
Direct comparison of two Mb-sized regions of the B and D genomes of bread wheat revealed similar rates of non-collinear gene insertion in both genomes with a majority of gene duplications occurring before their divergence. A relatively low proportion of pseudogenes was identified among non-collinear coding sequences. Our data suggest that the pseudogenes did not originate from insertion of non-functional copies, but were formed later during the evolution of hexaploid wheat. Some evidence was found for gene erosion along the B genome locus.
Wheat; BAC sequencing; Homoeologous genomes; Gene duplication; Non-collinear genes; Allopolyploidy
Nuclear genomes of humans, animals, and plants are organized into subunits called chromosomes. When isolated into aqueous suspension, mitotic chromosomes can be classified using flow cytometry according to light scatter and fluorescence parameters. Chromosomes of interest can be purified by flow sorting if they can be resolved from other chromosomes in a karyotype. The analysis and sorting are carried out at rates of 10²–10⁴ chromosomes per second, and for complex genomes such as wheat the flow sorting technology has been ground-breaking in reducing genome complexity for genome sequencing. The high sample rate provides an attractive approach for karyotype analysis (flow karyotyping) and the purification of chromosomes in large numbers. In characterizing the chromosome complement of an organism, the high number that can be studied using flow cytometry allows for a statistically accurate analysis. Chromosome sorting plays a particularly important role in the analysis of nuclear genome structure and the analysis of particular and aberrant chromosomes. Other attractive but not well-explored features include the analysis of chromosomal proteins, chromosome ultrastructure, and high-resolution mapping using FISH. Recent results demonstrate that chromosome flow sorting can be coupled seamlessly with DNA array and next-generation sequencing technologies for high-throughput analyses. The main advantages are targeting the analysis to a genome region of interest and a significant reduction in sample complexity. As flow sorters can also sort single copies of chromosomes, shotgun sequencing DNA amplified from them enables the production of haplotype-resolved genome sequences. This review explains the principles of flow cytometric chromosome analysis and sorting (flow cytogenetics), discusses the major uses of this technology in genome analysis, and outlines future directions.
Chromosome sorting; Chromosome-specific BAC libraries; Chromosome sequencing; Chromosome genomics; Genome complexity reduction; Flow cytometry; Physical mapping
Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat.
The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared with 17 (18%) of the 96 SSR primer pairs.
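The summary statistics above follow from simple ratios of the reported counts. As a minimal sketch, the arithmetic can be reproduced as follows; all input values are taken from the text, while the variable names are assumptions introduced here for illustration.

```python
# Counts reported in the text.
total_bp = 11_014_359        # high-quality BAC-end sequence (bp)
bac_ends = 17_591            # number of BAC-end reads
ssr_count = 1_057            # simple sequence repeats found
te_junctions = 7_928         # TE junctions found
primers_tested = 96          # primer pairs tested per marker type
isbp_specific = 28           # 3A-specific ISBP primer pairs
ssr_specific = 17            # 3A-specific SSR primer pairs

# Derived values (variable names are illustrative, not from the source).
mean_read_bp = total_bp / bac_ends
ssr_spacing_kb = total_bp / ssr_count / 1000
junction_spacing_kb = total_bp / te_junctions / 1000
isbp_rate_pct = 100 * isbp_specific / primers_tested
ssr_rate_pct = 100 * ssr_specific / primers_tested

print(f"mean BAC-end read length: {mean_read_bp:.0f} bp")
print(f"one SSR per {ssr_spacing_kb:.1f} kb")
print(f"one TE junction per {junction_spacing_kb:.2f} kb")
print(f"3A-specific: ISBP {isbp_rate_pct:.0f}%, SSR {ssr_rate_pct:.0f}%")
```

The rounded results agree with the figures quoted in the text (626 bp, 10.4 kb, 1.39 kb, 29% and 18%).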
This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequence was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping.
Genome size evolution is a complex process influenced by polyploidization, satellite DNA accumulation, and expansion of retroelements. How this process could be affected by different reproductive strategies is still poorly understood.
We analyzed differences in the number and distribution of major repetitive DNA elements in two closely related species, Silene latifolia and S. vulgaris. Both species are diploid and possess the same chromosome number (2n = 24), but differ in their genome size and mode of reproduction. The dioecious S. latifolia (1C = 2.70 pg DNA) possesses sex chromosomes and its genome is 2.5× larger than that of the gynodioecious S. vulgaris (1C = 1.13 pg DNA), which does not possess sex chromosomes. We discovered that the genome of S. latifolia is larger mainly due to the expansion of Ogre retrotransposons. Surprisingly, the centromeric STAR-C and TR1 tandem repeats were found to be more abundant in S. vulgaris, the species with the smaller genome. We further examined the distribution of major repetitive sequences in related species in the Caryophyllaceae family. The results of FISH (fluorescence in situ hybridization) on mitotic chromosomes with the Retand element indicate that large rearrangements occurred during the evolution of the Caryophyllaceae family.
Our data demonstrate that the evolution of genome size in the genus Silene is accompanied by the expansion of different repetitive elements with specific patterns in the dioecious species possessing the sex chromosomes.
The purpose of the study is to elucidate the sequence composition of the short arm of rye (Secale cereale) chromosome 1 (1RS), with special focus on its gene content, because this portion of the rye genome is an integral part of several hundred bread wheat varieties worldwide.
Multiple Displacement Amplification of 1RS DNA, obtained from flow-sorted 1RS chromosomes of a 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp of sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with an estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, with Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites, mostly located in gene-related sequence reads, were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS against the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS with that of the barley 1HS chromosome arm revealed high conservation of genes related to chromosome 5 of rice.
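The relationship between sequencing depth and the chance of detecting a gene can be illustrated with a simple Poisson (Lander-Waterman style) model. The Python sketch below is an assumption-laden illustration, not the method used in the study: it treats reads as uniformly and independently placed, and the read length and gene length passed to it are hypothetical values that the text does not report.

```python
import math

SEQUENCED_BP = 195_313_589   # reported in the text
COVERAGE = 0.43              # reported in the text


def implied_arm_size_bp(sequenced_bp, coverage):
    """Arm size implied by total sequence divided by fold coverage."""
    return sequenced_bp / coverage


def fraction_of_bases_covered(coverage):
    """Expected share of bases hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-coverage)


def prob_gene_hit(coverage, gene_len_bp, read_len_bp):
    """Probability that at least one read overlaps a gene of the given length.

    Assumes uniformly placed reads; the expected number of reads overlapping
    the gene is coverage * (gene_len + read_len) / read_len.
    """
    expected_reads = coverage * (gene_len_bp + read_len_bp) / read_len_bp
    return 1.0 - math.exp(-expected_reads)


if __name__ == "__main__":
    arm_mb = implied_arm_size_bp(SEQUENCED_BP, COVERAGE) / 1e6
    print(f"implied arm size: {arm_mb:.0f} Mb")
    print(f"bases covered: {fraction_of_bases_covered(COVERAGE):.2f}")
    # Hypothetical inputs: 400 bp reads and a 2.4 kb gene.
    print(f"gene hit probability: {prob_gene_hit(COVERAGE, 2400, 400):.2f}")
```

Under these hypothetical inputs the model shows how a coverage well below 1× can still tag most genes with at least one read, because a gene is much longer than a single read. The specific probability depends entirely on the assumed lengths.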
The present study revealed the gene content and potential gene functions of this chromosome arm and identified numerous sequence elements, such as SSRs and gene-related sequences, which can be utilised for future research as well as in the breeding of wheat and rye.
This study evaluates the potential of flow cytometry for chromosome sorting in two wild diploid wheats, Aegilops umbellulata and Ae. comosa, and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata. Flow karyotypes obtained after the analysis of DAPI-stained chromosomes were characterized and the chromosome content of the peaks was determined. Peaks of chromosome 1U could be discriminated in flow karyotypes of Ae. umbellulata and Ae. biuncialis and the chromosome could be sorted with purities exceeding 95%. The remaining chromosomes formed composite peaks and could be sorted in groups of two to four. Twenty-four wheat SSR markers were tested for their position on chromosomes of Ae. umbellulata and Ae. comosa using PCR on DNA amplified from flow-sorted chromosomes and genomic DNA of wheat-Ae. geniculata addition lines, respectively. Six SSR markers were located on particular Aegilops chromosomes using sorted chromosomes, thus confirming the usefulness of this approach for physical mapping. The SSR markers are suitable for marker-assisted selection of wheat-Aegilops introgression lines. The results obtained in this work provide new opportunities for dissecting the genomes of wild relatives of wheat with the aim of assisting in alien gene transfer and the discovery of novel genes for wheat improvement.
Wheat is one of the world's most important crops and is characterized by a large polyploid genome. One way to reduce genome complexity is to isolate single chromosomes using flow cytometry. Low-coverage DNA sequencing can provide a snapshot of individual chromosomes, allowing a fast characterization of their main features and comparison with other genomes. We used massively parallel 454 pyrosequencing to obtain 2× coverage of wheat chromosome 5A. The resulting sequence assembly was used to identify TEs, genes and miRNAs, as well as to infer a virtual gene order based on synteny with other grass genomes. Repetitive elements account for more than 75% of the genome. Gene content was estimated considering non-redundant reads showing at least one match to ESTs or proteins. The results indicate that the coding fraction represents 1.08% and 1.3% of the short and long arm respectively, projecting the number of genes of the whole chromosome to approximately 5,000. A total of 195 candidate miRNA precursors belonging to 16 miRNA families were identified. The 5A genes were used to search for syntenic relationships between grass genomes. The short arm is closely related to Brachypodium chromosome 4, sorghum chromosome 8 and rice chromosome 12; the long arm to regions of Brachypodium chromosomes 4 and 1, sorghum chromosomes 1 and 2 and rice chromosomes 9 and 3. From these similarities it was possible to infer the virtual gene order of 392 (5AS) and 1,480 (5AL) genes of chromosome 5A, which was compared to, and found to be largely congruent with, the available physical map of this chromosome.
The classification of the Musaceae (banana) family species and their phylogenetic inter-relationships remain controversial, in part due to limited nucleotide information to complement the morphological and physiological characters. In this work the evolutionary relationships within the Musaceae family were studied using 13 species and DNA sequences obtained from a set of 19 unlinked nuclear genes.
The 19 gene sequences represented a sample of ~16 kb of genome sequence (~73% intronic). The sequence data were also used to obtain estimates for the divergence times of the Musaceae genera and Musa sections. Nucleotide variation within the sample confirmed the close relationship of the Australimusa and Callimusa sections and showed that the Eumusa and Rhodochlamys sections are not reciprocally monophyletic, which supports previous proposals to merge the latter two sections. Divergence time analysis supported the previous dating of the Musaceae crown age to the Cretaceous/Tertiary boundary (~69 Mya), and the evolution of Musa to ~50 Mya. The first estimates for the divergence times of the four Musa sections were also obtained.
The gene sequence-based phylogeny presented here provides a substantial insight into the course of speciation within the Musaceae. An understanding of the main phylogenetic relationships between banana species will help to fine-tune the taxonomy of Musaceae.
Genes coding for 45S ribosomal RNA are organized in tandem arrays of up to several thousand copies and contain 18S, 5.8S and 26S rRNA units separated by internal transcribed spacers ITS1 and ITS2. While the rRNA units are evolutionarily conserved, the ITS show a high level of interspecific divergence and have been used frequently in genetic diversity and phylogenetic studies. In this work we report on the structure and diversity of the ITS region in 87 representatives of the family Musaceae. We provide the first detailed information on ITS sequence diversity in the genus Musa and describe the presence of more than one type of ITS sequence within individual species. Both Sanger sequencing of amplified ITS regions and whole genome 454 sequencing led to similar phylogenetic inferences. We show that it is necessary to identify putative pseudogenic ITS sequences, which may have a negative effect on phylogenetic reconstruction at lower taxonomic levels. Phylogenetic reconstruction based on ITS sequence showed that the genus Musa is divided into two distinct clades: Callimusa and Australimusa, and Eumusa and Rhodochlamys. Most of the intraspecific banana hybrids analyzed contain conserved parental ITS sequences, indicating incomplete concerted evolution of rDNA loci. Independent evolution of parental rDNA in hybrids enables determination of the genomic constitution of hybrids using ITS. The observation of only one type of ITS sequence in some of the presumed interspecific hybrid clones warrants further study to confirm their hybrid origin and to unravel the processes leading to the evolution of their genomes.