Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat.
We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and to a high-density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs.
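The recombination-rate comparison above amounts to dividing genetic distance by physical distance. The following is a minimal Python sketch of that arithmetic, not the authors' pipeline: the function names and the example interval values are hypothetical, and the gap estimate assumes that the local rate measured within contigs also holds across the neighbouring gap.

```python
def recombination_rate(genetic_cm: float, physical_mb: float) -> float:
    """Recombination rate in cM/Mb for one interval."""
    if physical_mb <= 0:
        raise ValueError("physical size must be positive")
    return genetic_cm / physical_mb


def estimate_gap_mb(gap_genetic_cm: float, local_rate_cm_per_mb: float) -> float:
    """Estimate a gap size (Mb) from the genetic distance across the gap.

    Assumption (not stated in the source): the gap recombines at the
    same local rate as the flanking contigs.
    """
    if local_rate_cm_per_mb <= 0:
        raise ValueError("local rate must be positive")
    return gap_genetic_cm / local_rate_cm_per_mb


# Hypothetical intervals chosen to reproduce the two reported rates.
distal_rate = recombination_rate(genetic_cm=4.38, physical_mb=2.0)
proximal_rate = recombination_rate(genetic_cm=0.45, physical_mb=5.0)

# Hypothetical gap: 0.5 cM between two contigs whose local rate is 2.0 cM/Mb.
gap_mb = estimate_gap_mb(gap_genetic_cm=0.5, local_rate_cm_per_mb=2.0)
```

Under these illustrative inputs the two rates match the reported 2.19 and 0.09 cM/Mb, roughly a 24-fold difference between the distal and proximal regions.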
The physical map reported here is the first physical map using fingerprinting of a complete Triticeae genome. This study demonstrates that global fingerprinting of large plant genomes is a viable strategy for generating physical maps. Physical maps allow the description of the co-linearity between wheat and grass genomes and provide a powerful tool for positional cloning of new genes.
The Yr26 gene, conferring resistance to all currently important races of Puccinia striiformis f. sp. tritici (Pst) in China, was previously mapped to wheat chromosome deletion bin C-1BL-6-0.32 with low-density markers. In this study, collinearity of wheat to Brachypodium distachyon and rice was used to develop markers to saturate the chromosomal region containing the Yr26 locus, and a total of 2,341 F2 plants and 551 F2∶3 progenies derived from Avocet S×92R137 were used to develop a fine map of Yr26. Wheat expressed sequence tags (ESTs) located in deletion bin C-1BL-6-0.32 were used to develop sequence tagged site (STS) markers. The EST-STS markers flanking Yr26 were used to identify collinear regions of the rice and B. distachyon genomes. Wheat ESTs with significant similarities in the two collinear regions were selected to develop conserved markers for fine mapping of Yr26. Thirty-one markers were mapped to the Yr26 region, and six of them cosegregated with the resistance gene. Marker orders were highly conserved between rice and B. distachyon, but some rearrangements were observed between rice and wheat. Two flanking markers (CON-4 and CON-12) further narrowed the genomic region containing Yr26 to a 1.92 Mb region in B. distachyon chromosome 3 and a 1.17 Mb region in rice chromosome 10, and two putative resistance gene analogs were identified in the collinear region of B. distachyon. The markers developed in this study provide a potential target site for further map-based cloning of Yr26 and should be useful in marker assisted selection for pyramiding the gene with other resistance genes.
Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat.
The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% of coding regions (an estimated 2,850 genes), and another 19% was of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSR primer pairs.
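The summary statistics above follow directly from the reported counts. As a quick consistency check, the short Python sketch below recomputes them; the variable and function names are illustrative only, and the implied arm size is approximate because it is derived from the rounded 3.2% figure.

```python
# Figures quoted in the abstract above.
total_bp = 11_014_359       # high-quality BAC-end sequence (bp)
n_bes = 17_591              # number of BAC-end sequences
n_ssr = 1_057               # SSRs identified
n_te_junctions = 7_928      # TE junctions identified
fraction_of_arm = 0.032     # sequence represents 3.2% of t3AS


def kb_per_feature(sequence_bp: int, n_features: int) -> float:
    """Average spacing (kb) between features in the sampled sequence."""
    return sequence_bp / n_features / 1000


mean_read_len_bp = total_bp / n_bes
ssr_spacing_kb = kb_per_feature(total_bp, n_ssr)
te_junction_spacing_kb = kb_per_feature(total_bp, n_te_junctions)

# Arm size implied by the rounded 3.2% figure, and the resulting read spacing.
implied_arm_mb = total_bp / fraction_of_arm / 1e6
read_spacing_kb = implied_arm_mb * 1000 / n_bes

# Share of tested primer pairs that were 3A-specific.
isbp_specific_pct = 28 / 96 * 100
ssr_specific_pct = 17 / 96 * 100
```

The recomputed values agree with the text: about 626 bp per read, one SSR per 10.4 kb, one TE junction per 1.39 kb, and 29% versus 18% 3A-specific primer pairs. The implied read spacing comes out between 19 and 20 kb, consistent with the reported figure given the rounding of the 3.2% coverage.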
This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequence was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping.
Whole genome duplication is a common evolutionary event in plants. Bread wheat (Triticum aestivum L.) is a good model to investigate the impact of paleo- and neoduplications on the organization and function of modern plant genomes.
We performed an RNA sequencing-based inference of the grain filling gene network in bread wheat and identified a set of 37,695 non-redundant sequence clusters, which is an unprecedented resolution corresponding to an estimated half of the wheat genome unigene repertoire. Using the Brachypodium distachyon genome as a reference for the Triticeae, we classified gene clusters into orthologous, paralogous, and homoeologous relationships. Based on this wheat gene evolutionary classification, older duplicated copies (dating back 50 to 70 million years) exhibit more than 80% gene loss and expression divergence while recent duplicates (dating back 1.5 to 3 million years) show only 54% gene loss and 36 to 49% expression divergence.
We suggest that structural shuffling due to duplicated gene loss is a rapid process, whereas functional shuffling due to neo- and/or subfunctionalization of duplicates is a longer process, and that both shuffling mechanisms drive functional redundancy erosion. We conclude that, as a result of these mechanisms, half the gene duplicates in plants are structurally and functionally altered within 10 million years of evolution, and the diploidization process is completed after 45 to 50 million years following polyploidization.
The wax (glaucousness) on wheat leaves and stems is mainly controlled by two sets of genes: glaucousness loci (W1 and W2) and non-glaucousness loci (Iw1 and Iw2). The non-glaucousness (Iw) loci act as inhibitors of the glaucousness loci (W). High-resolution comparative genetic linkage maps of the wax inhibitors Iw1 originating from Triticum dicoccoides, and Iw2 from Aegilops tauschii were developed by comparative genomics analyses of Brachypodium, sorghum and rice genomic sequences corresponding to the syntenic regions of the Iw loci in wheat. Eleven Iw1 and eight Iw2 linked EST markers were developed and mapped to linkage maps on the distal regions of chromosomes 2BS and 2DS, respectively. The Iw1 locus mapped within a 0.96 cM interval flanked by the BE498358 and CA499581 EST markers that are collinear with 122 kb, 202 kb, and 466 kb genomic regions in the Brachypodium 5S chromosome, the sorghum 6S chromosome and the rice 4S chromosome, respectively. The Iw2 locus was located in a 4.1 to 5.4-cM interval in chromosome 2DS that is flanked by the CJ886319 and CJ519831 EST markers, and this region is collinear with a 2.3 cM region spanning the Iw1 locus on chromosome 2BS. Both Iw1 and Iw2 co-segregated with the BF474014 and CJ876545 EST markers, indicating they are most likely orthologs on 2BS and 2DS. These high-resolution maps can serve as a framework for chromosome landing, physical mapping and map-based cloning of the wax inhibitors in wheat.
The genomic sequences of many important Triticeae crop species are hard to assemble and analyse due to their large genome sizes, (in part) polyploid genomes and high repeat content. Recently, the draft genomes of barley and bread wheat were reported thanks to cost-efficient and fast NGS technologies. The genome of barley is estimated to be 5 Gb in size, whereas the allo-hexaploid genome of bread wheat is about 17 Gb. Direct assembly of the sequence reads and access to the gene content is hampered by the repeat content. As a consequence, novel strategies and data analysis concepts had to be developed to provide much-needed whole genome sequence surveys and access to the gene repertoires. Here we describe some analytical strategies that now enable structuring of the massive NGS data generated and pave the way towards structured and ordered sequence data and gene order. Specifically, we report on the GenomeZipper, a synteny-driven approach to order and structure NGS survey sequences of grass genomes that lack a physical map. In addition, to access and analyse the gene repertoire of allo-hexaploid bread wheat from the raw sequence reads, a reference-guided approach was developed utilizing representative genes from rice, Brachypodium distachyon, sorghum and barley. Stringent sub-assembly on the reference genes prevented collapsing of homeologous wheat genes and allowed the estimation of gene retention rates and the determination of gene family sizes. Genomic sequences from the wheat sub-genome progenitors made it possible to assign a large number of sub-assemblies to the wheat A, B or D sub-genome using machine learning algorithms. Many of the concepts outlined here can readily be applied to other complex plant and non-plant genomes.
Triticeae genomes; Grass genomes; Wheat genome; Barley genome; GenomeZipper; Genome analysis
The Protein Disulfide Isomerase (PDI) gene family encodes several PDI and PDI-like proteins containing thioredoxin domains and controlling diversified metabolic functions, including disulfide bond formation and isomerisation during protein folding. Genomic, cDNA and promoter sequences of the three homoeologous wheat genes encoding the "typical" PDI had been cloned and characterized in a previous work. The purpose of the present research was the cloning and characterization of the complete set of genes encoding PDI and PDI-like proteins in bread wheat (Triticum aestivum cv Chinese Spring) and the comparison of their sequence, structure and expression with homologous genes from other plant species.
Eight new non-homoeologous wheat genes were cloned and characterized. The nine PDI and PDI-like sequences of wheat were located in chromosome regions syntenic to those in rice and assigned to eight plant phylogenetic groups. The nine wheat genes differed in their sequences, genomic organization as well as in the domain composition and architecture of their deduced proteins; conversely each of them showed high structural conservation with genes from other plant species in the same phylogenetic group. The extensive quantitative RT-PCR analysis of the nine genes in a set of 23 wheat samples, including tissues and developmental stages, showed their constitutive, even though highly variable expression.
The nine wheat genes showed high diversity, while the members of each phylogenetic group were highly conserved even between taxonomically distant plant species like the moss Physcomitrella patens. Although constitutively expressed, the nine wheat genes were characterized by different expression profiles reflecting their different genomic organization, protein domain architecture and probably promoter sequences; the high conservation among species indicated the ancient origin and diversification of the still evolving gene family. The comprehensive structural and expression characterization of the complete set of PDI and PDI-like wheat genes represents a basis for the functional characterization of this gene family in the hexaploid context of bread wheat.
Ghd7 is an important rice gene that has a major effect on several agronomic traits, including yield. To reveal the origin of Ghd7 and sequence evolution of this locus, we performed a comparative sequence analysis of the Ghd7 orthologous regions from ten diploid Oryza species, Brachypodium distachyon, sorghum and maize. Sequence analysis demonstrated high gene collinearity across the genus Oryza and a disruption of collinearity among non-Oryza species. In particular, Ghd7 was not present in orthologous positions except in Oryza species. The Ghd7 regions were found to have low gene densities and high contents of repetitive elements, and the sizes of orthologous regions varied tremendously. The large transposable element contents resulted in a high frequency of pseudogenization and gene movement events surrounding the Ghd7 loci. Annotation information and cytological experiments have indicated that Ghd7 is a heterochromatic gene. Ghd7 orthologs were identified in B. distachyon, sorghum and maize by phylogenetic analysis; however, the positions of orthologous genes differed dramatically as a consequence of gene movements in grasses. Rather, we identified sequence remnants of gene movement of Ghd7 mediated by illegitimate recombination in the B. distachyon genome.
In allopolyploid crops, homoeologous genes in different genomes exhibit a very high sequence similarity, especially in the coding regions of genes. This makes it difficult to design genome-specific primers to amplify individual genes from different genomes. Development of genome-specific primers for agronomically important genes in allopolyploid crops is very important and useful not only for the study of sequence diversity and association mapping of genes in natural populations, but also for the development of gene-based functional markers for marker-assisted breeding. Here we report on a useful approach for the development of genome-specific primers in allohexaploid wheat.
In the present study, three genome-specific primer sets for the waxy (Wx) genes and four genome-specific primer sets for the starch synthase II (SSII) genes were developed mainly from single nucleotide polymorphisms (SNPs) and/or insertions or deletions (Indels) in introns and intron-exon junctions. The size of a single PCR product ranged from 750 bp to 1657 bp. The total length of amplified PCR products by these genome-specific primer sets accounted for 72.6%-87.0% of the Wx genes and 59.5%-61.6% of the SSII genes. Five genome-specific primer sets for the Wx genes (one for Wx-7A, three for Wx-4A and one for Wx-7D) could distinguish the wild type wheat and partial waxy wheat lines. These genome-specific primer sets for the Wx and SSII genes produced amplifications in hexaploid wheat, cultivated durum wheat, and Aegilops tauschii accessions, but failed to generate amplification in the majority of wild diploid and tetraploid accessions.
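The primer design described above relies on positions where one homoeologue differs from the other two. The Python sketch below illustrates that idea on a toy alignment; it is an assumption-laden illustration rather than the authors' procedure. The function name `genome_specific_sites` and the three short sequences are hypothetical, and the sequences are assumed to be pre-aligned to equal length with `-` marking gaps.

```python
def genome_specific_sites(alignment: dict, target: str) -> list:
    """Return 0-based alignment columns where `target` differs from all others.

    Such columns (SNPs, or indels where another copy carries a gap) are
    candidate positions for anchoring the 3' end of a genome-specific primer.
    """
    target_seq = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    if any(len(seq) != len(target_seq) for seq in others):
        raise ValueError("sequences must be aligned to the same length")

    sites = []
    for i, base in enumerate(target_seq):
        if base == "-":
            continue  # a primer cannot end on a gap in the target copy
        if all(other[i] != base for other in others):
            sites.append(i)
    return sites


# Hypothetical toy alignment of three homoeologous copies.
toy_alignment = {
    "A": "ACGTTGCA",
    "B": "ACGATGCA",
    "D": "ACGATGGA",
}

a_sites = genome_specific_sites(toy_alignment, "A")
b_sites = genome_specific_sites(toy_alignment, "B")
d_sites = genome_specific_sites(toy_alignment, "D")
```

In this toy case the A and D copies each have one distinguishing column, while the B copy has none, which mirrors why some genomes need primers placed in more variable regions such as introns.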
For the first time, we report on the development of genome-specific primers from three homoeologous Wx and SSII genes covering the majority of the genes in allohexaploid wheat. These genome-specific primers are being used for the study of sequence diversity and association mapping of the three homoeologous Wx and SSII genes in natural populations of both hexaploid wheat and cultivated tetraploid wheat. The strategies used in this paper can be used to develop genome-specific primers for homoeologous genes in any allopolyploid species. They may also be suitable for (i) the development of gene-specific primers for duplicated paralogous genes in any diploid species, and (ii) the development of allele-specific primers at the same gene locus.
Bread wheat is one of the world’s most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an on-going project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome.
After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome.
This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms.
Wheat transcriptome; Wheat genes; Sequence assembly; Cloud computing
The patterns of expression of homoeologous genes in hexaploid bread wheat have been intensively studied in recent years, but the interaction between structural genes and their homoeologous regulatory genes remained unclear. The question was whether, in an allopolyploid, this interaction is genome-specific, or whether regulation cuts across genomes. The aim of the present study was the cloning, sequence analysis, mapping and expression analysis of F3H (flavanone 3-hydroxylase, one of the key enzymes in the plant flavonoid biosynthesis pathway) homoeologues in bread wheat, and the study of the interaction between F3H and the homoeologues of their regulatory genes, Rc (red coleoptiles).
PCR-based cloning of F3H sequences from hexaploid bread wheat (Triticum aestivum L.), a wild tetraploid wheat (T. timopheevii) and their putative diploid progenitors was employed to localize, physically map and analyse the expression of four distinct bread wheat F3H copies. Three of these form a homoeologous set, mapping to the chromosomes of homoeologous group 2; they are highly similar to one another at the structural and functional levels. However, the fourth copy is less homologous, and was not expressed in anthocyanin pigmented coleoptiles. The presence of dominant alleles at the Rc-1 homoeologous loci, which are responsible for anthocyanin pigmentation in the coleoptile, was correlated with F3H expression in pigmented coleoptiles. Each dominant Rc-1 allele affected the expression of the three F3H homoeologues equally, but the level of F3H expression was dependent on the identity of the dominant Rc-1 allele present. Thus, the homoeologous Rc-1 genes contribute more to functional divergence than do the structural F3H genes.
The lack of any genome-specific relationship between F3H-1 and Rc-1 implies an integrative evolutionary process among the three diploid genomes, following the formation of hexaploid wheat. Regulatory genes probably contribute more to the functional divergence between the wheat genomes than do the structural genes themselves. This is in line with the growing consensus which suggests that although heritable morphological traits are determined by the expression of structural genes, it is the regulatory genes which are the prime determinants of allelic identity.
Sucrose phosphate synthase (SPS) is an important component of the plant sucrose biosynthesis pathway. In the monocotyledonous Poaceae, five SPS genes have been identified. Here we present a detailed analysis of the SPSII family in wheat. A set of homoeologue-specific primers was developed in order to permit both the detection of sequence variation, and the dissection of the individual contribution of each homoeologue to the global expression of SPSII.
The expression in bread wheat over the course of development of various sucrose biosynthesis genes monitored on an Affymetrix array showed that the SPS genes were regulated over time and space. SPSII homoeologue-specific assays were used to show that the three homoeologues contributed differentially to the global expression of SPSII. Genetic mapping placed the set of homoeoloci on the short arms of the homoeologous group 3 chromosomes. A resequencing of the A and B genome copies allowed the detection of four haplotypes at each locus. The 3B copy includes an unspliced intron. A comparison of the sequences of the wheat SPSII orthologues present in the diploid progenitors einkorn, goatgrass and Triticum speltoides, as well as in the more distantly related species barley, rice, sorghum and purple false brome demonstrated that intronic sequence was less well conserved than exonic. Comparative sequence and phylogenetic analysis of the SPSII gene showed that purple false brome was more similar to Triticeae than to rice. Wheat-rice synteny was found to be perturbed at the SPS region.
The homoeologue-specific assays will be suitable to derive associations between SPS functionality and key phenotypic traits. The amplicon sequences derived from the homoeologue-specific primers are informative regarding the evolution of SPSII in a polyploid context.
Mutational inactivation of plant genes is an essential tool in gene function studies. Plants with inactivated or deleted genes may also be exploited for crop improvement if such mutations/deletions produce a desirable agronomical and/or quality phenotype. However, the use of mutational gene inactivation/deletion has been impeded in polyploid plant species by genetic redundancy, as polyploids contain multiple copies of the same genes (homoeologous genes) encoded by each of the ancestral genomes. Similar to many other crop plants, bread wheat (Triticum aestivum L.) is polyploid; specifically allohexaploid possessing three progenitor genomes designated as 'A', 'B', and 'D'. Recently modified TILLING protocols have been developed specifically for mutation detection in wheat. Whilst extremely powerful in detecting single nucleotide changes and small deletions, these methods are not suitable for detecting whole gene deletions. Therefore, high-throughput methods for screening of candidate homoeologous gene deletions are needed for application to wheat populations generated by the use of certain mutagenic agents (e.g. heavy ion irradiation) that frequently generate whole-gene deletions.
To facilitate the screening for specific homoeologous gene deletions in hexaploid wheat, we have developed a TaqMan qPCR-based method that allows high-throughput detection of deletions in homoeologous copies of any gene of interest, provided that sufficient polymorphism (as little as a single nucleotide difference) amongst homoeologues exists for specific probe design. We used this method to identify deletions of individual TaPFT1 homoeologues, a wheat orthologue of the disease susceptibility and flowering regulatory gene PFT1 in Arabidopsis. This method was applied to wheat nullisomic-tetrasomic lines as well as other chromosomal deletion lines to locate the TaPFT1 gene to the long arm of chromosome 5. By screening of individual DNA samples from 4500 M2 mutant wheat lines generated by heavy ion irradiation, we detected multiple mutants with deletions of each TaPFT1 homoeologue, and confirmed these deletions using a CAPS method. We have subsequently designed, optimized, and applied this method for the screening of homoeologous deletions of three additional wheat genes putatively involved in plant disease resistance.
We have developed a method for automated, high-throughput screening to identify deletions of individual homoeologues of a wheat gene. This method is also potentially applicable to other polyploid plants.
Powdery mildew, caused by Blumeria graminis f. sp. tritici, is one of the most important wheat diseases in the world. In this study, a single dominant powdery mildew resistance gene MlIW172 was identified in the IW172 wild emmer accession and mapped to the distal region of chromosome arm 7AL (bin7AL-16-0.86-0.90) via molecular marker analysis. MlIW172 was closely linked with the RFLP probe Xpsr680-derived STS marker Xmag2185 and the EST markers BE405531 and BE637476. This suggested that MlIW172 might be allelic to the Pm1 locus or a new locus closely linked to Pm1. By screening a genomic BAC library of durum wheat cv. Langdon and a 7AL-specific BAC library of hexaploid wheat cv. Chinese Spring, and after analyzing genome scaffolds of Triticum urartu containing the marker sequences, additional markers were developed to construct a fine genetic linkage map of the MlIW172 locus region and to delineate the resistance gene within a 0.48 cM interval. Comparative genetics analyses using ESTs and RFLP probe sequences flanking the MlIW172 region against other grass species revealed a general co-linearity in this region with the orthologous genomic regions of rice chromosome 6, Brachypodium chromosome 1, and sorghum chromosome 10. However, orthologous resistance gene-like RGA sequences were only present in wheat and Brachypodium. The BAC contigs and sequence scaffolds that we have developed provide a framework for the physical mapping and map-based cloning of MlIW172.
Diploid Aegilops umbellulata and Ae. comosa and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata are important wild gene sources for wheat. With the aim of assisting in alien gene transfer, this study provides gene-based conserved orthologous set (COS) markers for the U and M genome chromosomes. Out of the 140 markers tested on a series of wheat-Aegilops chromosome introgression lines and flow-sorted subgenomic chromosome fractions, 100 were assigned to Aegilops chromosomes and six and seven duplications were identified in the U and M genomes, respectively. The marker-specific EST sequences were BLAST-ed to Brachypodium and rice genomic sequences to investigate macrosyntenic relationships between the U and M genomes of Aegilops, wheat and the model species. Five syntenic regions of Brachypodium identified genome rearrangements differentiating the U genome from the M genome and from the D genome of wheat. All of them seem to have evolved at the diploid level and to have been modified differentially in the polyploid species Ae. biuncialis and Ae. geniculata. A certain level of wheat–Aegilops homology was detected for group 1, 2, 3 and 5 chromosomes, while a clearly rearranged structure was shown for the group 4, 6 and 7 Aegilops chromosomes relative to wheat. The conserved orthologous set markers assigned to Aegilops chromosomes promise to accelerate gene introgression by facilitating the identification of alien chromatin. The syntenic relationships between the Aegilops species, wheat and model species will facilitate the targeted development of new markers specific for U and M genomic regions and will contribute to the understanding of molecular processes related to allopolyploidization.
The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundred bread wheat varieties worldwide.
Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes of a 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp of sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with an estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites, mostly located in gene-related sequence reads, were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to the 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice.
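The coverage figure above can be related to the sequence yield with simple arithmetic. The Python sketch below is illustrative only: the variable names are hypothetical, the arm size is merely what the two reported numbers imply, and the per-base covered fraction uses the standard Poisson (Lander-Waterman) approximation for randomly placed reads, which is an assumption not stated in the source. It does not attempt to reproduce the reported 95% gene-identification probability, which depends on gene length and detection criteria not given here.

```python
import math

# Figures quoted in the abstract above.
sequence_yield_bp = 195_313_589   # total 454 sequence obtained
coverage = 0.43                   # reported fold-coverage of 1RS

# Arm size implied by yield / coverage (approximate: coverage is rounded).
implied_arm_mb = sequence_yield_bp / coverage / 1e6

# Assumption: reads land uniformly at random, so the chance that a given
# base is covered by at least one read is 1 - exp(-coverage).
fraction_bases_covered = 1 - math.exp(-coverage)
```

Under these assumptions the two reported numbers imply an arm of roughly 450 to 460 Mb, with about a third of individual bases sampled at least once; a gene spanning many read lengths is far more likely to be touched by at least one read than any single base is.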
The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye.
Carotenoids are isoprenoid pigments, essential for photosynthesis and photoprotection in plants. The enzyme phytoene synthase (PSY) plays an essential role in mediating condensation of two geranylgeranyl diphosphate molecules, the first committed step in carotenogenesis. PSY are nuclear enzymes encoded by a small gene family consisting of three paralogous genes (PSY1-3) that have been widely characterized in rice, maize and sorghum.
In wheat, for which yellow pigment content is extremely important for flour colour, only PSY1 has been extensively studied because of its association with QTLs reported for yellow pigment whereas PSY2 has been partially characterized. Here, we report the isolation of bread wheat PSY3 genes from a Renan BAC library using Brachypodium as a model genome for the Triticeae to develop Conserved Orthologous Set markers prior to gene cloning and sequencing. Wheat PSY3 homoeologous genes were sequenced and annotated, unravelling their novel structure associated with intron-loss events and consequent exonic fusions. A wheat PSY3 promoter region was also investigated for the presence of cis-acting elements involved in the response to abscisic acid (ABA), since carotenoids also play an important role as precursors of signalling molecules devoted to plant development and biotic/abiotic stress responses. Expression of wheat PSYs in leaves and roots was investigated during ABA treatment to confirm the up-regulation of PSY3 during abiotic stress.
We investigated the structural and functional determinisms of PSY genes in wheat. More generally, among eudicots and monocots, the PSY gene family was found to be associated with differences in gene copy numbers, allowing us to propose an evolutionary model for the entire PSY gene family in Grasses.
Carotenoids; Phytoene synthase; Wheat; Intron loss; Abiotic stress; Evolution
The phase transition from vegetative to reproductive growth is a critical event in the life cycle of flowering plants. FLOWERING LOCUS T (FT) plays a central role in the regulation of this transition by integrating signals from multiple flowering pathways in the leaves and transmitting them to the shoot apical meristem. In this study, we characterized FT homologs in the temperate grasses Brachypodium distachyon and polyploid wheat using transgenic and mutant approaches. Downregulation of FT1 by RNAi was associated with a significant downregulation of the FT-like genes FT2 and FT4 in Brachypodium and FT2 and FT5 in wheat. In a transgenic wheat line carrying a highly-expressed FT1 allele, FT2 and FT3 were upregulated under both long and short days. Overexpression of FT1 caused extremely early flowering during shoot regeneration in both Brachypodium and hexaploid wheat, and resulted in insufficient vegetative tissue to support the production of viable seeds. Downregulation of FT1 transcripts by RNA interference (RNAi) resulted in non-flowering Brachypodium plants and late flowering plants (2–4 weeks delay) in wheat. A similar delay in heading time was observed in tetraploid wheat plants carrying mutations for both FT-A1 and FT-B1. Plants homozygous only for mutations in FT-B1 flowered later than plants homozygous only for mutations in FT-A1, which corresponded with higher transcript levels of FT-B1 relative to FT-A1 in the early stages of development. Taken together, our data indicate that FT1 plays a critical role in the regulation of flowering in Brachypodium and wheat, and that this role is associated with the simultaneous regulation of other FT-like genes. The differential effects of mutations in FT-A1 and FT-B1 on wheat heading time suggest that different allelic combinations of FT1 homoeologs could be used to adjust wheat heading time to improve adaptation to changing environments.
Caffeic acid O-methyltransferase (COMT) is one of the important enzymes controlling lignin monomer production in plant cell wall synthesis. Analysis of the genome sequence of the new grass model Brachypodium distachyon identified four COMT gene homologs, designated as BdCOMT1, BdCOMT2, BdCOMT3, and BdCOMT4. Phylogenetic analysis suggested that they belong to the COMT gene family, whereas syntenic analysis through comparisons with rice and sorghum revealed that BdCOMT4 on Chromosome 3 is the orthologous copy of the COMT genes well characterized in other grass species. The other three COMT genes are unique to Brachypodium since orthologous copies are not found in the collinear regions of rice and sorghum genomes. Expression studies indicated that all four Brachypodium COMT genes are transcribed but with distinct patterns of tissue specificity. Full-length cDNAs were cloned in frame into the pQE-T7 expression vector for the purification of recombinant Brachypodium COMT proteins. Biochemical characterization of enzyme activity and substrate specificity showed that BdCOMT4 has significant activity on a broad range of substrates, with the highest preference for caffeic acid. The other three COMTs had low or no activity on these substrates, suggesting that these duplicate genes have diverged not only in their patterns of expression but also in their biochemical properties.
A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat.
Nucleotide diversity was estimated in 2,114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1,791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed.
In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.
Phosphomannomutase (PMM) is an essential enzyme in eukaryotes. However, little is known about PMM genes and their function in crop plants. Here, we report molecular evolutionary and biochemical analysis of PMM genes in bread wheat and related Triticeae species.
Two sets of homoeologous PMM genes (TaPMM-1 and 2) were found in bread wheat, and two corresponding PMM genes were identified in the diploid progenitors of bread wheat and many other diploid Triticeae species. The duplication event yielding PMM-1 and 2 occurred before the radiation of diploid Triticeae genomes. The PMM gene family in wheat and relatives may evolve largely under purifying selection. Among the six TaPMM genes, the transcript levels of PMM-1 members were comparatively high and their recombinant proteins were all enzymatically active. However, PMM-2 homoeologs exhibited lower transcript levels, two of which were also inactive. TaPMM-A1, B1 and D1 were probably the main active isozymes in bread wheat tissues. The three isozymes differed from their counterparts in barley and Brachypodium distachyon in being more tolerant to elevated test temperatures.
Our work identified the genes encoding PMM isozymes in bread wheat and relatives, uncovered a unique PMM duplication event in diverse Triticeae species, and revealed the main active PMM isozymes in bread wheat tissues. The knowledge obtained here improves the understanding of PMM evolution in eukaryotic organisms, and may facilitate further investigations of PMM function in the temperature adaptability of bread wheat.
The ability of grass species to adapt to various habitats is attributed to the dynamic nature of their genomes, which have been shaped by multiple rounds of ancient and recent polyploidization. To gain a better understanding of the nature and extent of variation in functionally relevant regions of a polyploid genome, we developed a sequence capture assay to compare exonic sequences of allotetraploid wheat accessions.
A sequence capture assay was designed for the targeted re-sequencing of 3.5 Mb exon regions that surveyed a total of 3,497 genes from allotetraploid wheat. These data were used to describe SNPs, copy number variation and homoeologous sequence divergence in coding regions. A procedure for variant discovery in the polyploid genome was developed and experimentally validated. About 1% and 24% of discovered SNPs were loss-of-function and non-synonymous mutations, respectively. Under-representation of replacement mutations was identified in several groups of genes involved in translation and metabolism. Gene duplications were predominant in a cultivated wheat accession, while more gene deletions than duplications were identified in wild wheat.
We demonstrate that, even though the level of sequence similarity between targeted polyploid genomes and capture baits can bias enrichment efficiency, exon capture is a powerful approach for variant discovery in polyploids. Our results suggest that allopolyploid wheat can accumulate new variation in coding regions at a high rate. This process has the potential to broaden functional diversity and generate new phenotypic variation that eventually can play a critical role in the origin of new adaptations and important agronomic traits.
Wheat (Triticum spp.) is an important food source for humans in many regions around the world. However, the ability to understand and modify gene function for crop improvement is hindered by the lack of available genomic resources. TILLING is a powerful reverse genetics approach that combines chemical mutagenesis with a high-throughput screen for mutations. Wheat is especially well suited for TILLING due to the high mutation densities tolerated by polyploids, which allow for very efficient screens. Despite this, few TILLING populations are currently available. In addition, current TILLING screening protocols require high-throughput genotyping platforms, limiting their use.
We developed mutant populations of pasta and common wheat and organized them for TILLING. To simplify the screen and decrease its cost, we developed a non-denaturing polyacrylamide gel set-up that uses ethidium bromide to detect fragments generated by crude celery juice extract digestion of heteroduplexes. This detection method had sensitivity similar to that of traditional LI-COR screens, suggesting that it represents a valid alternative. We developed genome-specific primers to circumvent the presence of multiple homoeologous copies of our target genes. Each mutant library was characterized by TILLING multiple genes, revealing high mutation densities in both the hexaploid (~1/38 kb) and tetraploid (~1/51 kb) populations for 50% GC targets. These mutation frequencies predict that screening 1,536 lines for an effective target region of 1.3 kb with 50% GC content will result in ~52 hexaploid and ~39 tetraploid mutant alleles. This implies a high probability of obtaining knock-out alleles (P = 0.91 for hexaploid, P = 0.84 for tetraploid), in addition to multiple missense mutations. In total, we identified over 275 novel alleles in eleven targeted gene/genome combinations in hexaploid and tetraploid wheat and have validated the presence of a subset of them in our seed stock.
We have generated reverse genetics TILLING resources for pasta and bread wheat and achieved a high mutation density in both populations. We also developed a modified screening method that will lower barriers to adopt this promising technology. We hope that the use of this reverse genetics resource will enable more researchers to pursue wheat functional genomics and provide novel allelic diversity for wheat improvement.
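The predicted allele counts quoted above follow directly from the reported mutation densities: expected mutations = lines screened × target length / kb per mutation. The sketch below reproduces that arithmetic. It also shows one way a knock-out probability could be derived, assuming each mutation independently has a fixed chance of being a truncation. The value `TRUNCATION_FRACTION = 0.045` is a hypothetical placeholder, not a figure from the source; it was chosen here only because it gives probabilities close to those reported, and the authors' actual calculation may differ.

```python
def expected_mutations(n_lines: int, target_kb: float, kb_per_mutation: float) -> float:
    """Expected number of mutant alleles found when screening `n_lines`
    lines over a target of `target_kb`, given one mutation per
    `kb_per_mutation` kb in each line."""
    return n_lines * target_kb / kb_per_mutation


def prob_at_least_one(n_mutations: float, fraction: float) -> float:
    """Probability that at least one of `n_mutations` mutations belongs to
    a class occurring with probability `fraction` (independence assumed)."""
    return 1.0 - (1.0 - fraction) ** n_mutations


# Values taken from the abstract.
N_LINES = 1536
TARGET_KB = 1.3
HEXAPLOID_KB_PER_MUTATION = 38.0
TETRAPLOID_KB_PER_MUTATION = 51.0

# Hypothetical assumption, not stated in the source.
TRUNCATION_FRACTION = 0.045

hexaploid = expected_mutations(N_LINES, TARGET_KB, HEXAPLOID_KB_PER_MUTATION)
tetraploid = expected_mutations(N_LINES, TARGET_KB, TETRAPLOID_KB_PER_MUTATION)

p_hexaploid = prob_at_least_one(hexaploid, TRUNCATION_FRACTION)
p_tetraploid = prob_at_least_one(tetraploid, TRUNCATION_FRACTION)

print(f"Expected alleles: hexaploid {hexaploid:.1f}, tetraploid {tetraploid:.1f}")
print(f"P(knock-out): hexaploid {p_hexaploid:.2f}, tetraploid {p_tetraploid:.2f}")
```

Changing `N_LINES` or `TARGET_KB` shows how population size and amplicon length trade off against each other when planning a screen.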
MicroRNAs are a class of short, non-coding, single-stranded RNAs that act as post-transcriptional regulators in gene expression. miRNA analysis of Triticum aestivum chromosome 5D was performed on 454 GS FLX Titanium sequences of flow-sorted chromosome 5D with a total of 3,208,630 good quality reads representing 1.34× and 1.61× coverage of the short (5DS) and long (5DL) arms of the chromosome, respectively. In silico and structural analyses revealed a total of 55 miRNAs; 48 and 42 miRNAs were found to be present on 5DL and 5DS, respectively, of which 35 were common to both chromosome arms, while 13 miRNAs were specific to 5DL and 7 miRNAs were specific to 5DS. In total, 14 of the predicted miRNAs were identified in wheat for the first time. Representation (the copy number of each miRNA) was also found to be higher in 5DL (1,949) compared to 5DS (1,191). Targets were predicted for each miRNA, while expression analysis gave evidence of expression for 6 out of 55 miRNAs. Occurrences of the same miRNAs were also found in Brachypodium distachyon and Oryza sativa genome sequences to identify syntenic miRNA coding sequences. Based on this analysis, two other miRNAs, miR1133 and miR167, were detected in the B. distachyon region syntenic to wheat 5DS. Five of the predicted miRNA coding regions (miR6220, miR5070, miR169, miR5085, miR2118) were experimentally verified to be located on chromosome 5D, and three of them, miR2118, miR169 and miR5085, were shown to be 5D specific. Furthermore, miR2118 was shown to be expressed in Chinese Spring adult leaves. The miRNA genes identified in this study will expand our understanding of gene regulation in bread wheat.
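The arm-level miRNA counts above are internally consistent, which can be verified with inclusion-exclusion: the arm-specific counts are each arm's total minus the shared set, and the overall total is the sum of the three disjoint groups. The following is a minimal sketch of that check; the function name is illustrative and only the counts come from the abstract.

```python
def partition_counts(n_a: int, n_b: int, n_shared: int) -> tuple[int, int, int]:
    """Split two overlapping sets into (only A, only B, union size),
    given the size of each set and of their intersection."""
    only_a = n_a - n_shared
    only_b = n_b - n_shared
    union = only_a + only_b + n_shared
    return only_a, only_b, union


# Counts reported in the abstract: 48 miRNAs on 5DL, 42 on 5DS, 35 on both.
only_5dl, only_5ds, total = partition_counts(48, 42, 35)

print(f"5DL-specific: {only_5dl}, 5DS-specific: {only_5ds}, total: {total}")
```

The same helper applies to any pair of overlapping gene or marker sets where only the set sizes are reported.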
The caleosin genes encode proteins with a single conserved EF hand calcium-binding domain and comprise small gene families found in a wide range of plant species. Some members of the gene family have been shown to be upregulated by environmental stresses including low water availability and high salinity. Caleosin 3 from wheat has been shown to interact with the α-subunit of the heterotrimeric G proteins, and to act as a GTPase activating protein (GAP). This study characterizes the size and diversity of the gene family in wheat and related species and characterizes the differential tissue-specific expression of members of the gene family.
A total of 34 gene family members that belong to eleven paralogous groups of caleosins were identified in the hexaploid bread wheat, T. aestivum. Each group was represented by three homeologous copies of the gene located on corresponding homeologous chromosomes, except caleosin 10, which has four gene copies. Ten gene family members were identified in diploid barley, Hordeum vulgare, and in rye, Secale cereale, seven in Brachypodium distachyon, and six in rice, Oryza sativa. Gene expression was assayed by RNA-Seq analysis of 454 sequence sets in rye and in triticale, the hybrid hexaploid species derived from wheat and rye, and members of the gene family were found to have diverse patterns of expression across the tissues sampled. Expression of the gene family in wheat and barley was also previously determined by microarray analysis, and changes in expression during development and in response to environmental stresses are presented.
The caleosin gene family had a greater degree of expansion in the Triticeae than in the other monocot species, Brachypodium and rice. The prior implication of one member of the gene family in the stress response and heterotrimeric G protein signaling points to the potential importance of the caleosin gene family. The complexity of the family and its differential expression in various tissues and under conditions of abiotic stress suggest that caleosin family members may play diverse roles in signaling and development, which warrants further investigation.
Caleosin gene family; Calcium-binding protein; Phylogenetic analysis; Tissue-specific expression; GAP; Gα; Heterotrimeric G protein signaling; RNA-seq