Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat.
We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high-density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs.
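The recombination rates quoted above are ratios of genetic distance (cM) to physical distance (Mb). The following Python sketch illustrates that calculation and one plausible way a local rate could be used to estimate the size of a gap between contigs. The function names and the interval values are hypothetical and are not data or code from the study; the gap estimate assumes the gap recombines at the same rate as the neighbouring contigs.

```python
def recombination_rate(genetic_cm: float, physical_mb: float) -> float:
    """Return the recombination rate in cM/Mb for one interval."""
    if physical_mb <= 0:
        raise ValueError("physical distance must be positive")
    return genetic_cm / physical_mb


def estimate_gap_mb(gap_genetic_cm: float, local_rate_cm_per_mb: float) -> float:
    """Estimate a physical gap size in Mb.

    Assumption (not from the source): the gap recombines at the
    local rate measured on the adjacent contigs.
    """
    if local_rate_cm_per_mb <= 0:
        raise ValueError("local rate must be positive")
    return gap_genetic_cm / local_rate_cm_per_mb


# Hypothetical interval: 4.38 cM spanning 2.0 Mb of contig sequence.
local_rate = recombination_rate(4.38, 2.0)

# Hypothetical gap: two contigs separated by 0.5 cM on the genetic map.
gap_mb = estimate_gap_mb(0.5, local_rate)
print(f"local rate = {local_rate:.2f} cM/Mb, estimated gap = {gap_mb:.2f} Mb")
```

In a low-recombination proximal region the same genetic distance implies a much larger physical gap, which is why the local rate, rather than a chromosome-wide average, matters for this kind of estimate.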
The physical map reported here is the first physical map using fingerprinting of a complete Triticeae genome. This study demonstrates that global fingerprinting of the large plant genomes is a viable strategy for generating physical maps. Physical maps allow the description of the co-linearity between wheat and grass genomes and provide a powerful tool for positional cloning of new genes.
Whole genome duplication is a common evolutionary event in plants. Bread wheat (Triticum aestivum L.) is a good model to investigate the impact of paleo- and neoduplications on the organization and function of modern plant genomes.
We performed an RNA sequencing-based inference of the grain filling gene network in bread wheat and identified a set of 37,695 non-redundant sequence clusters, which is an unprecedented resolution corresponding to an estimated half of the wheat genome unigene repertoire. Using the Brachypodium distachyon genome as a reference for the Triticeae, we classified gene clusters into orthologous, paralogous, and homoeologous relationships. Based on this wheat gene evolutionary classification, older duplicated copies (dating back 50 to 70 million years) exhibit more than 80% gene loss and expression divergence while recent duplicates (dating back 1.5 to 3 million years) show only 54% gene loss and 36 to 49% expression divergence.
We suggest that structural shuffling due to duplicated gene loss is a rapid process, whereas functional shuffling due to neo- and/or subfunctionalization of duplicates is a longer process, and that both shuffling mechanisms drive functional redundancy erosion. We conclude that, as a result of these mechanisms, half the gene duplicates in plants are structurally and functionally altered within 10 million years of evolution, and the diploidization process is completed after 45 to 50 million years following polyploidization.
The Yr26 gene, conferring resistance to all currently important races of Puccinia striiformis f. sp. tritici (Pst) in China, was previously mapped to wheat chromosome deletion bin C-1BL-6-0.32 with low-density markers. In this study, collinearity of wheat to Brachypodium distachyon and rice was used to develop markers to saturate the chromosomal region containing the Yr26 locus, and a total of 2,341 F2 plants and 551 F2∶3 progenies derived from Avocet S×92R137 were used to develop a fine map of Yr26. Wheat expressed sequence tags (ESTs) located in deletion bin C-1BL-6-0.32 were used to develop sequence tagged site (STS) markers. The EST-STS markers flanking Yr26 were used to identify collinear regions of the rice and B. distachyon genomes. Wheat ESTs with significant similarities in the two collinear regions were selected to develop conserved markers for fine mapping of Yr26. Thirty-one markers were mapped to the Yr26 region, and six of them cosegregated with the resistance gene. Marker orders were highly conserved between rice and B. distachyon, but some rearrangements were observed between rice and wheat. Two flanking markers (CON-4 and CON-12) further narrowed the genomic region containing Yr26 to a 1.92 Mb region in B. distachyon chromosome 3 and a 1.17 Mb region in rice chromosome 10, and two putative resistance gene analogs were identified in the collinear region of B. distachyon. The markers developed in this study provide a potential target site for further map-based cloning of Yr26 and should be useful in marker assisted selection for pyramiding the gene with other resistance genes.
Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat.
The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% of coding regions (estimated 2,850 genes) and another 19% was of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSR primer pairs.
This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequence was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping.
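The summary statistics in this paragraph follow from simple ratios of the reported totals. The short Python sketch below recomputes them from the figures given in the text; the function and constant names are illustrative only, and the implied arm size is a back-calculation from the stated 3.2% rather than a value reported in the abstract.

```python
# Figures taken from the paragraph above.
TOTAL_BP = 11_014_359      # high-quality BAC-end sequence
BAC_ENDS = 17_591          # number of BAC-end reads
SSR_COUNT = 1_057          # simple sequence repeats found
TE_JUNCTIONS = 7_928       # transposable-element junctions found
FRACTION_OF_ARM = 0.032    # sequence represents 3.2% of t3AS


def mean_read_length(total_bp: int, n_reads: int) -> float:
    """Average read length in bp."""
    return total_bp / n_reads


def feature_spacing_kb(total_bp: int, n_features: int) -> float:
    """Average sequence length (kb) per feature, i.e. 'one per X kb'."""
    return total_bp / n_features / 1000


def implied_target_size_mb(total_bp: int, fraction: float) -> float:
    """Size (Mb) of the target implied by the sampled fraction."""
    return total_bp / fraction / 1e6


print(f"mean read length : {mean_read_length(TOTAL_BP, BAC_ENDS):.0f} bp")
print(f"SSR density      : one per {feature_spacing_kb(TOTAL_BP, SSR_COUNT):.1f} kb")
print(f"TE junctions     : one per {feature_spacing_kb(TOTAL_BP, TE_JUNCTIONS):.2f} kb")
print(f"implied arm size : {implied_target_size_mb(TOTAL_BP, FRACTION_OF_ARM):.0f} Mb")
```

Dividing the implied arm size by the number of BAC-ends gives a read spacing close to the reported figure of one read every 19 kb.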
The wax (glaucousness) on wheat leaves and stems is mainly controlled by two sets of genes: glaucousness loci (W1 and W2) and non-glaucousness loci (Iw1 and Iw2). The non-glaucousness (Iw) loci act as inhibitors of the glaucousness loci (W). High-resolution comparative genetic linkage maps of the wax inhibitors Iw1 originating from Triticum dicoccoides, and Iw2 from Aegilops tauschii were developed by comparative genomics analyses of Brachypodium, sorghum and rice genomic sequences corresponding to the syntenic regions of the Iw loci in wheat. Eleven Iw1 and eight Iw2 linked EST markers were developed and mapped to linkage maps on the distal regions of chromosomes 2BS and 2DS, respectively. The Iw1 locus mapped within a 0.96 cM interval flanked by the BE498358 and CA499581 EST markers that are collinear with 122 kb, 202 kb, and 466 kb genomic regions in the Brachypodium 5S chromosome, the sorghum 6S chromosome and the rice 4S chromosome, respectively. The Iw2 locus was located in a 4.1 to 5.4-cM interval in chromosome 2DS that is flanked by the CJ886319 and CJ519831 EST markers, and this region is collinear with a 2.3 cM region spanning the Iw1 locus on chromosome 2BS. Both Iw1 and Iw2 co-segregated with the BF474014 and CJ876545 EST markers, indicating they are most likely orthologs on 2BS and 2DS. These high-resolution maps can serve as a framework for chromosome landing, physical mapping and map-based cloning of the wax inhibitors in wheat.
Ghd7 is an important rice gene that has a major effect on several agronomic traits, including yield. To reveal the origin of Ghd7 and sequence evolution of this locus, we performed a comparative sequence analysis of the Ghd7 orthologous regions from ten diploid Oryza species, Brachypodium distachyon, sorghum and maize. Sequence analysis demonstrated high gene collinearity across the genus Oryza and a disruption of collinearity among non-Oryza species. In particular, Ghd7 was not present in orthologous positions except in Oryza species. The Ghd7 regions were found to have low gene densities and high contents of repetitive elements, and the sizes of orthologous regions varied tremendously. The large transposable element contents resulted in a high frequency of pseudogenization and gene movement events surrounding the Ghd7 loci. Annotation information and cytological experiments have indicated that Ghd7 is a heterochromatic gene. Ghd7 orthologs were identified in B. distachyon, sorghum and maize by phylogenetic analysis; however, the positions of orthologous genes differed dramatically as a consequence of gene movements in grasses. Moreover, we identified sequence remnants of gene movement of Ghd7 mediated by illegitimate recombination in the B. distachyon genome.
Sucrose phosphate synthase (SPS) is an important component of the plant sucrose biosynthesis pathway. In the monocotyledonous Poaceae, five SPS genes have been identified. Here we present a detailed analysis of the SPSII family in wheat. A set of homoeologue-specific primers was developed in order to permit both the detection of sequence variation, and the dissection of the individual contribution of each homoeologue to the global expression of SPSII.
Expression of various sucrose biosynthesis genes in bread wheat over the course of development, monitored on an Affymetrix array, showed that the SPS genes were regulated over time and space. SPSII homoeologue-specific assays were used to show that the three homoeologues contributed differentially to the global expression of SPSII. Genetic mapping placed the set of homoeoloci on the short arms of the homoeologous group 3 chromosomes. A resequencing of the A and B genome copies allowed the detection of four haplotypes at each locus. The 3B copy includes an unspliced intron. A comparison of the sequences of the wheat SPSII orthologues present in the diploid progenitors einkorn, goatgrass and Triticum speltoides, as well as in the more distantly related species barley, rice, sorghum and purple false brome demonstrated that intronic sequence was less well conserved than exonic. Comparative sequence and phylogenetic analysis of the SPSII gene showed that purple false brome was more similar to Triticeae than to rice. Wheat-rice synteny was found to be perturbed at the SPS region.
The homoeologue-specific assays will be suitable to derive associations between SPS functionality and key phenotypic traits. The amplicon sequences derived from the homoeologue-specific primers are informative regarding the evolution of SPSII in a polyploid context.
Caffeic acid O-methyltransferase (COMT) is one of the important enzymes controlling lignin monomer production in plant cell wall synthesis. Analysis of the genome sequence of the new grass model Brachypodium distachyon identified four COMT gene homologs, designated as BdCOMT1, BdCOMT2, BdCOMT3, and BdCOMT4. Phylogenetic analysis suggested that they belong to the COMT gene family, whereas syntenic analysis through comparisons with rice and sorghum revealed that BdCOMT4 on Chromosome 3 is the orthologous copy of the COMT genes well characterized in other grass species. The other three COMT genes are unique to Brachypodium since orthologous copies are not found in the collinear regions of rice and sorghum genomes. Expression studies indicated that all four Brachypodium COMT genes are transcribed but with distinct patterns of tissue specificity. Full-length cDNAs were cloned in frame into the pQE-T7 expression vector for the purification of recombinant Brachypodium COMT proteins. Biochemical characterization of enzyme activity and substrate specificity showed that BdCOMT4 has a significant effect on a broad range of substrates with the highest preference for caffeic acid. The other three COMTs had low or no effect on these substrates, suggesting that a diversified evolution occurred on these duplicate genes that not only impacted their pattern of expression, but also altered their biochemical properties.
Phosphomannomutase (PMM) is an essential enzyme in eukaryotes. However, little is known about PMM genes and their function in crop plants. Here, we report molecular evolutionary and biochemical analysis of PMM genes in bread wheat and related Triticeae species.
Two sets of homoeologous PMM genes (TaPMM-1 and 2) were found in bread wheat, and two corresponding PMM genes were identified in the diploid progenitors of bread wheat and many other diploid Triticeae species. The duplication event yielding PMM-1 and 2 occurred before the radiation of diploid Triticeae genomes. The PMM gene family in wheat and relatives may evolve largely under purifying selection. Among the six TaPMM genes, the transcript levels of PMM-1 members were comparatively high and their recombinant proteins were all enzymatically active. However, PMM-2 homoeologs exhibited lower transcript levels, two of which were also inactive. TaPMM-A1, B1 and D1 were probably the main active isozymes in bread wheat tissues. The three isozymes differed from their counterparts in barley and Brachypodium distachyon in being more tolerant to elevated test temperatures.
Our work identified the genes encoding PMM isozymes in bread wheat and relatives, uncovered a unique PMM duplication event in diverse Triticeae species, and revealed the main active PMM isozymes in bread wheat tissues. The knowledge obtained here improves the understanding of PMM evolution in eukaryotic organisms, and may facilitate further investigations of PMM function in the temperature adaptability of bread wheat.
The genomic sequences of many important Triticeae crop species are hard to assemble and analyse due to their large genome sizes, (in part) polyploid genomes and high repeat content. Recently, the draft genomes of barley and bread wheat were reported thanks to cost-efficient and fast NGS technologies. The genome of barley is estimated to be 5 Gb in size, whereas the allo-hexaploid genome of bread wheat amounts to 17 Gb. Direct assembly of the sequence reads and access to the gene content is hampered by the repeat content. As a consequence, novel strategies and data analysis concepts had to be developed to provide much-needed whole genome sequence surveys and access to the gene repertoires. Here we describe some analytical strategies that now enable structuring of the massive NGS data generated and pave the way towards structured and ordered sequence data and gene order. Specifically we report on the GenomeZipper, a synteny-driven approach to order and structure NGS survey sequences of grass genomes that lack a physical map. In addition, to access and analyse the gene repertoire of allo-hexaploid bread wheat from the raw sequence reads, a reference-guided approach was developed utilizing representative genes from rice, Brachypodium distachyon, sorghum and barley. Stringent sub-assembly on the reference genes prevented collapsing of homeologous wheat genes and allowed estimation of the gene retention rate and determination of gene family sizes. Genomic sequences from the wheat sub-genome progenitors made it possible to assign a large number of sub-assemblies to the wheat A, B or D sub-genome using machine learning algorithms. Many of the concepts outlined here can readily be applied to other complex plant and non-plant genomes.
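The general idea of a synteny-driven ordering can be illustrated with a toy Python sketch: survey contigs that match genes of a sequenced reference genome are placed in the order of those genes along the reference. This is only an illustration of the principle under simplifying assumptions (one reference, one hit per contig, perfect collinearity). It is not the GenomeZipper implementation, and all identifiers below are hypothetical.

```python
from typing import Dict, List


def order_by_synteny(contig_hits: Dict[str, str],
                     reference_order: List[str]) -> List[str]:
    """Order contigs by the position of their matching reference gene.

    contig_hits     maps a contig id to the reference gene it matches.
    reference_order lists reference genes in chromosomal order.
    Contigs whose hit is absent from the reference order are left out,
    since this toy model has no other information to place them.
    """
    rank = {gene: i for i, gene in enumerate(reference_order)}
    anchored = [(rank[gene], contig)
                for contig, gene in contig_hits.items()
                if gene in rank]
    return [contig for _, contig in sorted(anchored)]


# Hypothetical reference gene order and contig-to-gene matches.
reference = ["geneA", "geneB", "geneC"]
hits = {
    "contig3": "geneC",
    "contig1": "geneA",
    "contig2": "geneB",
    "contig9": "geneX",   # no position in the reference, so unplaced
}
print(order_by_synteny(hits, reference))
```

A real pipeline would also have to reconcile several reference genomes, genetic-map anchors and breaks in collinearity, none of which this sketch attempts.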
Triticeae genomes; Grass genomes; Wheat genome; Barley genome; GenomeZipper; Genome analysis
Diploid Aegilops umbellulata and Ae. comosa and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata are important wild gene sources for wheat. With the aim of assisting in alien gene transfer, this study provides gene-based conserved orthologous set (COS) markers for the U and M genome chromosomes. Out of the 140 markers tested on a series of wheat-Aegilops chromosome introgression lines and flow-sorted subgenomic chromosome fractions, 100 were assigned to Aegilops chromosomes and six and seven duplications were identified in the U and M genomes, respectively. The marker-specific EST sequences were BLAST-ed to Brachypodium and rice genomic sequences to investigate macrosyntenic relationships between the U and M genomes of Aegilops, wheat and the model species. Five syntenic regions of Brachypodium identified genome rearrangements differentiating the U genome from the M genome and from the D genome of wheat. All of them seem to have evolved at the diploid level and to have been modified differentially in the polyploid species Ae. biuncialis and Ae. geniculata. A certain level of wheat–Aegilops homology was detected for group 1, 2, 3 and 5 chromosomes, while a clearly rearranged structure was shown for the group 4, 6 and 7 Aegilops chromosomes relative to wheat. The conserved orthologous set markers assigned to Aegilops chromosomes promise to accelerate gene introgression by facilitating the identification of alien chromatin. The syntenic relationships between the Aegilops species, wheat and model species will facilitate the targeted development of new markers specific for U and M genomic regions and will contribute to the understanding of molecular processes related to allopolyploidization.
The Protein Disulfide Isomerase (PDI) gene family encodes several PDI and PDI-like proteins containing thioredoxin domains and controlling diversified metabolic functions, including disulfide bond formation and isomerisation during protein folding. Genomic, cDNA and promoter sequences of the three homoeologous wheat genes encoding the "typical" PDI had been cloned and characterized in a previous work. The purpose of the present research was the cloning and characterization of the complete set of genes encoding PDI and PDI-like proteins in bread wheat (Triticum aestivum cv Chinese Spring) and the comparison of their sequence, structure and expression with homologous genes from other plant species.
Eight new non-homoeologous wheat genes were cloned and characterized. The nine PDI and PDI-like sequences of wheat were located in chromosome regions syntenic to those in rice and assigned to eight plant phylogenetic groups. The nine wheat genes differed in their sequences, genomic organization as well as in the domain composition and architecture of their deduced proteins; conversely each of them showed high structural conservation with genes from other plant species in the same phylogenetic group. The extensive quantitative RT-PCR analysis of the nine genes in a set of 23 wheat samples, including tissues and developmental stages, showed their constitutive, even though highly variable expression.
The nine wheat genes showed high diversity, while the members of each phylogenetic group were highly conserved even between taxonomically distant plant species like the moss Physcomitrella patens. Although constitutively expressed, the nine wheat genes were characterized by different expression profiles reflecting their different genomic organization, protein domain architecture and probably promoter sequences; the high conservation among species indicated the ancient origin and diversification of the still evolving gene family. The comprehensive structural and expression characterization of the complete set of PDI and PDI-like wheat genes represents a basis for the functional characterization of this gene family in the hexaploid context of bread wheat.
The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide.
Multiple Displacement Amplification of 1RS DNA, obtained from flow-sorted 1RS chromosomes of a 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp of sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with an estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites, mostly located in gene-related sequence reads, were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice.
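The coverage figure above relates the amount of sequence to the size of the chromosome arm. As a rough Python sketch, the arm size implied by the reported numbers can be back-calculated, and the expected fraction of bases touched by at least one read can be approximated with a Poisson (Lander-Waterman style) model. That model is an assumption introduced here for illustration, not something stated in the abstract. The 95% gene-identification probability quoted in the text depends on gene lengths and detection criteria that the abstract does not give, so the sketch does not attempt to reproduce it.

```python
import math

# Figures taken from the paragraph above.
SEQUENCED_BP = 195_313_589
COVERAGE = 0.43


def implied_target_size_mb(sequenced_bp: int, coverage: float) -> float:
    """Target size (Mb) implied by total sequence and fold coverage."""
    return sequenced_bp / coverage / 1e6


def fraction_of_bases_covered(coverage: float) -> float:
    """Expected fraction of bases hit by >= 1 read.

    Assumes reads fall uniformly at random (Poisson model), which
    amplified, repeat-rich DNA will only approximate.
    """
    return 1.0 - math.exp(-coverage)


print(f"implied 1RS size  : {implied_target_size_mb(SEQUENCED_BP, COVERAGE):.0f} Mb")
print(f"bases covered >=1x: {fraction_of_bases_covered(COVERAGE):.1%}")
```

Because a gene spans many bases, the chance that at least one read tags it is considerably higher than the per-base figure, which is the intuition behind gene discovery at low coverage.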
The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye.
Carotenoids are isoprenoid pigments, essential for photosynthesis and photoprotection in plants. The enzyme phytoene synthase (PSY) plays an essential role in mediating condensation of two geranylgeranyl diphosphate molecules, the first committed step in carotenogenesis. PSY are nuclear enzymes encoded by a small gene family consisting of three paralogous genes (PSY1-3) that have been widely characterized in rice, maize and sorghum.
In wheat, for which yellow pigment content is extremely important for flour colour, only PSY1 has been extensively studied because of its association with QTLs reported for yellow pigment whereas PSY2 has been partially characterized. Here, we report the isolation of bread wheat PSY3 genes from a Renan BAC library using Brachypodium as a model genome for the Triticeae to develop Conserved Orthologous Set markers prior to gene cloning and sequencing. Wheat PSY3 homoeologous genes were sequenced and annotated, unravelling their novel structure associated with intron-loss events and consequent exonic fusions. A wheat PSY3 promoter region was also investigated for the presence of cis-acting elements involved in the response to abscisic acid (ABA), since carotenoids also play an important role as precursors of signalling molecules devoted to plant development and biotic/abiotic stress responses. Expression of wheat PSYs in leaves and roots was investigated during ABA treatment to confirm the up-regulation of PSY3 during abiotic stress.
We investigated the structural and functional determinisms of PSY genes in wheat. More generally, among eudicots and monocots, the PSY gene family was found to be associated with differences in gene copy numbers, allowing us to propose an evolutionary model for the entire PSY gene family in grasses.
Carotenoids; Phytoene synthase; Wheat; Intron loss; Abiotic stress; Evolution
Plant and animal methyltransferases are key enzymes involved in DNA methylation at cytosine residues, required for gene expression control and genome stability. Taking advantage of the new sequence surveys of the wheat genome recently released by the International Wheat Genome Sequencing Consortium, we identified and characterized MET1 genes in the hexaploid wheat Triticum aestivum (TaMET1).
Nine TaMET1 genes were identified and mapped on homoeologous chromosome groups 2A/2B/2D, 5A/5B/5D and 7A/7B/7D. Synteny analysis and evolution rates suggest that the genome organization of TaMET1 genes results from a whole genome duplication shared within the grass family, and a second gene duplication, which occurred specifically in the Triticeae tribe prior to the speciation of diploid wheat. Higher expression levels were observed for TaMET1 homoeologous group 2 genes compared to group 5 and 7, indicating that group 2 homoeologous genes are predominant at the transcriptional level, while group 5 evolved into pseudogenes. We show the connection between low expression levels, elevated evolution rates and unexpected enrichment in CG-dinucleotides (CG-rich isochores) at putative promoter regions of homoeologous group 5 and 7, but not of group 2 TaMET1 genes. Bisulfite sequencing reveals that these CG-rich isochores are highly methylated in a CG context, which is the expected target of TaMET1.
We retraced the evolutionary history of MET1 genes in wheat, explaining the predominance of group 2 homoeologous genes and suggest CG-DNA methylation as one of the mechanisms involved in wheat genome dynamics.
DNA methylation; Evolution; Genome dynamics; CG-rich isochores
MicroRNAs are a class of short, non-coding, single-stranded RNAs that act as post-transcriptional regulators in gene expression. miRNA analysis of Triticum aestivum chromosome 5D was performed on 454 GS FLX Titanium sequences of flow-sorted chromosome 5D with a total of 3,208,630 good quality reads representing 1.34x and 1.61x coverage of the short (5DS) and long (5DL) arms of the chromosome, respectively. In silico and structural analyses revealed a total of 55 miRNAs; 48 and 42 miRNAs were found to be present on 5DL and 5DS, respectively, of which 35 were common to both chromosome arms, while 13 miRNAs were specific to 5DL and 7 miRNAs were specific to 5DS. In total, 14 of the predicted miRNAs were identified in wheat for the first time. Representation (the copy number of each miRNA) was also found to be higher in 5DL (1,949) compared to 5DS (1,191). Targets were predicted for each miRNA, while expression analysis gave evidence of expression for 6 out of 55 miRNAs. Occurrences of the same miRNAs were also searched for in Brachypodium distachyon and Oryza sativa genome sequences to identify syntenic miRNA coding sequences. Based on this analysis, two other miRNAs, miR1133 and miR167, were detected in the B. distachyon syntenic region of wheat 5DS. Five of the predicted miRNA coding regions (miR6220, miR5070, miR169, miR5085, miR2118) were experimentally verified to be located on the 5D chromosome and three of them (miR2118, miR169 and miR5085) were shown to be 5D specific. Furthermore, miR2118 was shown to be expressed in Chinese Spring adult leaves. miRNA genes identified in this study will expand our understanding of gene regulation in bread wheat.
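The per-arm miRNA counts above are related by simple set arithmetic: each arm total is the shared set plus the arm-specific set, and the overall total is the union. A brief Python sketch, using an illustrative function name, checks that the reported figures are mutually consistent.

```python
from typing import Dict


def arm_counts(common: int, only_long: int, only_short: int) -> Dict[str, int]:
    """Derive per-arm and overall miRNA counts from shared and
    arm-specific counts."""
    return {
        "5DL": common + only_long,               # long arm total
        "5DS": common + only_short,              # short arm total
        "total": common + only_long + only_short,  # union of both arms
    }


# Figures from the paragraph above: 35 shared, 13 5DL-specific, 7 5DS-specific.
counts = arm_counts(common=35, only_long=13, only_short=7)
print(counts)
```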
The ability of grass species to adapt to various habitats is attributed to the dynamic nature of their genomes, which have been shaped by multiple rounds of ancient and recent polyploidization. To gain a better understanding of the nature and extent of variation in functionally relevant regions of a polyploid genome, we developed a sequence capture assay to compare exonic sequences of allotetraploid wheat accessions.
A sequence capture assay was designed for the targeted re-sequencing of 3.5 Mb exon regions that surveyed a total of 3,497 genes from allotetraploid wheat. These data were used to describe SNPs, copy number variation and homoeologous sequence divergence in coding regions. A procedure for variant discovery in the polyploid genome was developed and experimentally validated. About 1% and 24% of discovered SNPs were loss-of-function and non-synonymous mutations, respectively. Under-representation of replacement mutations was identified in several groups of genes involved in translation and metabolism. Gene duplications were predominant in a cultivated wheat accession, while more gene deletions than duplications were identified in wild wheat.
We demonstrate that, even though the level of sequence similarity between targeted polyploid genomes and capture baits can bias enrichment efficiency, exon capture is a powerful approach for variant discovery in polyploids. Our results suggest that allopolyploid wheat can accumulate new variation in coding regions at a high rate. This process has the potential to broaden functional diversity and generate new phenotypic variation that eventually can play a critical role in the origin of new adaptations and important agronomic traits.
In allopolyploid crops, homoeologous genes in different genomes exhibit a very high sequence similarity, especially in the coding regions of genes. This makes it difficult to design genome-specific primers to amplify individual genes from different genomes. Development of genome-specific primers for agronomically important genes in allopolyploid crops is very important and useful not only for the study of sequence diversity and association mapping of genes in natural populations, but also for the development of gene-based functional markers for marker-assisted breeding. Here we report on a useful approach for the development of genome-specific primers in allohexaploid wheat.
In the present study, three genome-specific primer sets for the waxy (Wx) genes and four genome-specific primer sets for the starch synthase II (SSII) genes were developed mainly from single nucleotide polymorphisms (SNPs) and/or insertions or deletions (Indels) in introns and intron-exon junctions. The size of a single PCR product ranged from 750 bp to 1657 bp. The total length of amplified PCR products by these genome-specific primer sets accounted for 72.6%-87.0% of the Wx genes and 59.5%-61.6% of the SSII genes. Five genome-specific primer sets for the Wx genes (one for Wx-7A, three for Wx-4A and one for Wx-7D) could distinguish the wild type wheat and partial waxy wheat lines. These genome-specific primer sets for the Wx and SSII genes produced amplifications in hexaploid wheat, cultivated durum wheat, and Aegilops tauschii accessions, but failed to generate amplification in the majority of wild diploid and tetraploid accessions.
For the first time, we report on the development of genome-specific primers from three homoeologous Wx and SSII genes covering the majority of the genes in allohexaploid wheat. These genome-specific primers are being used for the study of sequence diversity and association mapping of the three homoeologous Wx and SSII genes in natural populations of both hexaploid wheat and cultivated tetraploid wheat. The strategies used in this paper can be used to develop genome-specific primers for homoeologous genes in any allopolyploid species. They may also be suitable for (i) the development of gene-specific primers for duplicated paralogous genes in any diploid species, and (ii) the development of allele-specific primers at the same gene locus.
Bread wheat is one of the world’s most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an on-going project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome.
After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome.
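The two-stage pattern described above (partition first, then assemble each partition independently and in parallel) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: reads are grouped by a shared k-mer using union-find, which is not the clustering method of the study, and `assemble_cluster` is a placeholder that returns the longest read instead of running a real assembler.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def partition_reads(reads, k=8):
    """Stage 1: group reads that share at least one k-mer (single linkage)."""
    parent = list(range(len(reads)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first read that contained it
    for idx, read in enumerate(reads):
        for km in kmers(read, k):
            if km in first_seen:
                parent[find(idx)] = find(first_seen[km])
            else:
                first_seen[km] = idx

    clusters = defaultdict(list)
    for idx, read in enumerate(reads):
        clusters[find(idx)].append(read)
    return list(clusters.values())


def assemble_cluster(cluster):
    """Stage 2 placeholder: a real pipeline would call an assembler here."""
    return max(cluster, key=len)


def two_stage_assembly(reads, k=8, workers=4):
    """Partition reads, then process every cluster independently in parallel."""
    clusters = partition_reads(reads, k)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(assemble_cluster, clusters))


# Hypothetical reads: the first two overlap, the third is unrelated.
toy_reads = ["ACGTACGTAA", "CGTACGTAAT", "TTTTGGGGCCCC"]
print(two_stage_assembly(toy_reads))
```

Because clusters share no reads, each can be handed to a separate worker without coordination, which is what makes the second stage straightforward to spread across many processors.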
This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms.
Wheat transcriptome; Wheat genes; Sequence assembly; Cloud computing
The bread wheat (Triticum aestivum L.) genotype “Chinese Spring” (“CS”) is the reference base in wheat genetics and genomics. Pericentric rearrangements in this genotype were systematically assessed by analyzing homoeoloci for a set of nonredundant genes from Brachypodium distachyon, Triticum urartu, and Aegilops tauschii in the CS chromosome shotgun sequence obtained from individual chromosome arms flow-sorted from CS aneuploid lines. Based on patterns of their homoeologous arm locations, 551 genes indicated the presence of pericentric inversions in at least 10 of the 21 chromosomes. Available data from deletion bin-mapped expressed sequence tags and genetic mapping in wheat indicated that all inversions had breakpoints in the low-recombinant gene-poor pericentromeric regions. The large number of putative intrachromosomal rearrangements suggests the presence of extensive structural differences among the three subgenomes, at least some of which likely occurred during the production of the aneuploid lines of this hexaploid wheat genotype. These differences could have significant implications in wheat genome research where comparative approaches are used such as in ordering and orientating sequence contigs and in gene cloning.
chromosomal rearrangement; comparative genomics; pericentric inversion; pericentromeric regions; translocation; Chinese Spring
Powdery mildew (PM) is a very destructive disease of wheat (Triticum aestivum L.). Wheat-Thinopyrum ponticum introgression line CH7086 was shown to possess powdery mildew resistance possibly originating from Th. ponticum. Genomic in situ hybridization and molecular characterization of the alien introgression failed to identify alien chromatin. To study the genetics of resistance, CH7086 was crossed with susceptible genotypes. Segregation in F2 populations and F2:3 lines tested with Chinese Bgt race E09 under controlled conditions indicated that CH7086 carries a single dominant gene for powdery mildew resistance. Fourteen SSR and EST-PCR markers linked with the locus were identified. The genetic distances between the locus and the two flanking markers were 1.5 and 3.2 cM, respectively. Based on the locations of the markers by nullisomic-tetrasomic and deletion lines of ‘Chinese Spring’, the resistance gene was located in deletion bin 2BL-0.89-1.00. Conserved orthologous marker analysis indicated that the genomic region flanking the resistance gene has a high level of collinearity to that of rice chromosome 4 and Brachypodium chromosome 5. Both resistance specificities and tests of allelism suggested the resistance gene in CH7086 was different from previously reported powdery mildew resistance genes on 2BL, and the gene was provisionally designated PmCH86. Molecular analysis of PmCH86 compared with other genes for resistance to Bgt in the 2BL-0.89-1.00 region suggested that PmCH86 may be a new PM resistance gene, and it was therefore designated as Pm51. The closely linked flanking markers could be useful in exploiting this putative wheat-Thinopyrum translocation line for rapid transfer of Pm51 to wheat breeding programs.
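Inferring a single dominant gene from F2 data usually comes down to testing whether resistant and susceptible plants fit a 3:1 ratio. A minimal sketch of that chi-square goodness-of-fit test follows; the function name and the plant counts are hypothetical, since the abstract does not report the segregation data.

```python
import math


def chi_square_3_to_1(resistant, susceptible):
    """Test observed F2 counts against a 3:1 resistant:susceptible ratio.

    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    total = resistant + susceptible
    if total <= 0:
        raise ValueError("at least one plant must be scored")

    expected_resistant = 0.75 * total
    expected_susceptible = 0.25 * total
    chi2 = ((resistant - expected_resistant) ** 2 / expected_resistant
            + (susceptible - expected_susceptible) ** 2 / expected_susceptible)

    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not data from the study.
chi2, p = chi_square_3_to_1(resistant=152, susceptible=48)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the data do not reject the 3:1 hypothesis, which is consistent with, but does not prove, control by a single dominant gene.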
Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence.
A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics.
The construction of the Brachypodium physical map and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of the Brachypodium genome sequence and for grass comparative genomics. A draft of the physical map and its comparisons with rice and wheat are available at .
A cytogenetic map of wheat was constructed using FISH with cDNA probes. FISH markers detected homoeology and chromosomal rearrangements of wild relatives, an important source of genes for wheat improvement.
To transfer agronomically important genes from wild relatives to bread wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) by induced homoeologous recombination, it is important to know the chromosomal relationships of the species involved. Fluorescence in situ hybridization (FISH) can be used to study chromosome structure. The genomes of allohexaploid bread wheat and other species from the Triticeae tribe are colinear to some extent, i.e., composed of homoeoloci at similar positions along the chromosomes, and with genic regions being highly conserved. To develop cytogenetic markers specific for genic regions of wheat homoeologs, we selected more than 60 full-length wheat cDNAs using BLAST against mapped expressed sequence tags and used them as FISH probes. Most probes produced signals on all three homoeologous chromosomes at the expected positions. We developed a wheat physical map with several cDNA markers located on each of the 14 homoeologous chromosome arms. The FISH markers confirmed chromosome rearrangements within wheat genomes and were successfully used to study chromosome structure and homoeology in wild Triticeae species. FISH analysis detected a 1U-6U chromosome translocation in the genome of Aegilops umbellulata, and showed colinearity between chromosome A of Ae. caudata and group-1 wheat chromosomes, and between chromosome arm 7S#3L of Thinopyrum intermedium and the long arm of the group-7 wheat chromosomes.
Electronic supplementary material
The online version of this article (doi:10.1007/s00122-013-2253-z) contains supplementary material, which is available to authorized users.
Powdery mildew, caused by Blumeria graminis f. sp. tritici, is one of the most important wheat diseases in the world. In this study, a single dominant powdery mildew resistance gene, MlIW172, was identified in the IW172 wild emmer accession and mapped to the distal region of chromosome arm 7AL (bin7AL-16-0.86-0.90) via molecular marker analysis. MlIW172 was closely linked with the RFLP probe Xpsr680-derived STS marker Xmag2185 and the EST markers BE405531 and BE637476. This suggested that MlIW172 might be allelic to the Pm1 locus or a new locus closely linked to Pm1. By screening a genomic BAC library of durum wheat cv. Langdon and a 7AL-specific BAC library of hexaploid wheat cv. Chinese Spring, and after analyzing genome scaffolds of Triticum urartu containing the marker sequences, additional markers were developed to construct a fine genetic linkage map of the MlIW172 locus region and to delineate the resistance gene within a 0.48 cM interval. Comparative genetic analyses using ESTs and RFLP probe sequences flanking the MlIW172 region against other grass species revealed a general co-linearity in this region with the orthologous genomic regions of rice chromosome 6, Brachypodium chromosome 1, and sorghum chromosome 10. However, orthologous resistance gene-like RGA sequences were only present in wheat and Brachypodium. The BAC contigs and sequence scaffolds that we have developed provide a framework for the physical mapping and map-based cloning of MlIW172.
Mutational inactivation of plant genes is an essential tool in gene function studies. Plants with inactivated or deleted genes may also be exploited for crop improvement if such mutations/deletions produce a desirable agronomical and/or quality phenotype. However, the use of mutational gene inactivation/deletion has been impeded in polyploid plant species by genetic redundancy, as polyploids contain multiple copies of the same genes (homoeologous genes) encoded by each of the ancestral genomes. Like many other crop plants, bread wheat (Triticum aestivum L.) is polyploid; specifically, it is an allohexaploid possessing three progenitor genomes designated 'A', 'B', and 'D'. Recently, modified TILLING protocols have been developed specifically for mutation detection in wheat. Whilst extremely powerful in detecting single nucleotide changes and small deletions, these methods are not suitable for detecting whole gene deletions. Therefore, high-throughput methods for screening of candidate homoeologous gene deletions are needed for application to wheat populations generated by the use of certain mutagenic agents (e.g. heavy ion irradiation) that frequently generate whole-gene deletions.
To facilitate screening for specific homoeologous gene deletions in hexaploid wheat, we have developed a TaqMan qPCR-based method that allows high-throughput detection of deletions in homoeologous copies of any gene of interest, provided that sufficient polymorphism (as little as a single nucleotide difference) exists amongst homoeologues for specific probe design. We used this method to identify deletions of individual homoeologues of TaPFT1, a wheat orthologue of the disease susceptibility and flowering regulatory gene PFT1 in Arabidopsis. The method was applied to wheat nullisomic-tetrasomic lines as well as other chromosomal deletion lines to locate the TaPFT1 gene on the long arm of chromosome 5. By screening individual DNA samples from 4500 M2 mutant wheat lines generated by heavy ion irradiation, we detected multiple mutants with deletions of each TaPFT1 homoeologue, and confirmed these deletions using a CAPS method. We have subsequently designed, optimized, and applied this method for the screening of homoeologous deletions of three additional wheat genes putatively involved in plant disease resistance.
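One way a probe-based deletion screen of this kind could be scored is sketched below. The decision rule, the function name, and the threshold are assumptions made for illustration and are not the authors' scoring procedure: each sample is assumed to have a cycle threshold (Ct) for a homoeologue-specific probe and for a reference probe, with `None` meaning no amplification was detected.

```python
from typing import Optional


def call_homoeologue(ct_target: Optional[float],
                     ct_reference: Optional[float],
                     wild_type_delta_ct: float,
                     shift_threshold: float = 3.0) -> str:
    """Classify one sample for one homoeologue-specific probe.

    wild_type_delta_ct: typical (target Ct - reference Ct) in non-mutant plants.
    shift_threshold: hypothetical cut-off, in cycles, for a suspicious delay.
    """
    if ct_reference is None:
        # Without reference amplification the DNA sample cannot be judged.
        return "failed"
    if ct_target is None:
        # Reference amplified but the homoeologue-specific probe gave no signal.
        return "candidate_deletion"

    delta_ct = ct_target - ct_reference
    if delta_ct - wild_type_delta_ct >= shift_threshold:
        # Target signal appears much later than in wild type.
        return "candidate_deletion"
    return "present"


# Hypothetical Ct values for three samples.
print(call_homoeologue(24.8, 24.5, wild_type_delta_ct=0.2))
print(call_homoeologue(None, 24.9, wild_type_delta_ct=0.2))
print(call_homoeologue(25.0, None, wild_type_delta_ct=0.2))
```

Calls are labelled as candidates because, as in the study, a flagged deletion would still need confirmation by an independent assay such as CAPS.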
We have developed a method for automated, high-throughput screening to identify deletions of individual homoeologues of a wheat gene. This method is also potentially applicable to other polyploid plants.