Whole genome duplication is a common evolutionary event in plants. Bread wheat (Triticum aestivum L.) is a good model to investigate the impact of paleo- and neoduplications on the organization and function of modern plant genomes.
We performed an RNA sequencing-based inference of the grain filling gene network in bread wheat and identified a set of 37,695 non-redundant sequence clusters, which is an unprecedented resolution corresponding to an estimated half of the wheat genome unigene repertoire. Using the Brachypodium distachyon genome as a reference for the Triticeae, we classified gene clusters into orthologous, paralogous, and homoeologous relationships. Based on this wheat gene evolutionary classification, older duplicated copies (dating back 50 to 70 million years) exhibit more than 80% gene loss and expression divergence while recent duplicates (dating back 1.5 to 3 million years) show only 54% gene loss and 36 to 49% expression divergence.
We suggest that structural shuffling due to duplicated gene loss is a rapid process, whereas functional shuffling due to neo- and/or subfunctionalization of duplicates is a longer process, and that both shuffling mechanisms drive functional redundancy erosion. We conclude that, as a result of these mechanisms, half the gene duplicates in plants are structurally and functionally altered within 10 million years of evolution, and the diploidization process is completed after 45 to 50 million years following polyploidization.
Physical maps employing libraries of bacterial artificial chromosome (BAC) clones are essential for comparative genomics and sequencing of large and repetitive genomes such as those of the hexaploid bread wheat. The diploid ancestor of the D-genome of hexaploid wheat (Triticum aestivum), Aegilops tauschii, is used as a resource for wheat genomics. The barley diploid genome also provides a good model for the Triticeae and T. aestivum since it is only slightly larger than the ancestor wheat D genome. Gene co-linearity between the grasses can be exploited by extrapolating from rice and Brachypodium distachyon to Ae. tauschii or barley, and then to wheat.
We report the use of Ae. tauschii for the construction of the physical map of a large distal region of chromosome arm 3DS. A physical map of 25.4 Mb was constructed by anchoring BAC clones of Ae. tauschii with 85 ESTs on the Ae. tauschii and barley genetic maps. The 24 contigs were aligned to the rice and B. distachyon genomic sequences and a high density SNP genetic map of barley. As expected, the mapped region is highly collinear to the orthologous chromosome 1 in rice, chromosome 2 in B. distachyon and chromosome 3H in barley. However, the chromosome scale of the comparative maps presented provides new insights into grass genome organization. The disruptions of the Ae. tauschii-rice and Ae. tauschii-Brachypodium syntenies were identical. We observed chromosomal rearrangements between Ae. tauschii and barley. The comparison of Ae. tauschii physical and genetic maps showed that the recombination rate across the region dropped from 2.19 cM/Mb in the distal region to 0.09 cM/Mb in the proximal region. The size of the gaps between contigs was evaluated by comparing the recombination rate along the map with the local recombination rates calculated on single contigs.
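The recombination rates quoted above are ratios of genetic distance (cM) to physical distance (Mb). The following is a minimal Python sketch of that ratio and of one plausible way to size a gap from it. The interval values are hypothetical and chosen only so that the output matches the reported 2.19 and 0.09 cM/Mb. The function names and the gap rule (genetic length of the gap divided by the local rate) are illustrative assumptions, not the authors' documented procedure.

```python
def recombination_rate(genetic_cm: float, physical_mb: float) -> float:
    """Recombination rate in cM/Mb for a single map interval."""
    if physical_mb <= 0:
        raise ValueError("physical distance must be positive")
    return genetic_cm / physical_mb


def estimate_gap_mb(gap_genetic_cm: float, local_rate_cm_per_mb: float) -> float:
    """Assumed rule: a gap of G cM in a region recombining at r cM/Mb spans about G / r Mb."""
    if local_rate_cm_per_mb <= 0:
        raise ValueError("local recombination rate must be positive")
    return gap_genetic_cm / local_rate_cm_per_mb


# Hypothetical intervals (not taken from the study's data).
distal = recombination_rate(genetic_cm=4.38, physical_mb=2.0)
proximal = recombination_rate(genetic_cm=0.18, physical_mb=2.0)
# Hypothetical 1 cM gap between two distal contigs.
gap = estimate_gap_mb(gap_genetic_cm=1.0, local_rate_cm_per_mb=distal)

print(f"distal {distal:.2f} cM/Mb, proximal {proximal:.2f} cM/Mb, gap {gap:.2f} Mb")
```

Under this assumed rule, the same genetic gap corresponds to a much larger physical gap in the proximal region, because the local rate there is roughly 24 times lower.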
The physical map reported here is the first physical map using fingerprinting of a complete Triticeae genome. This study demonstrates that global fingerprinting of large plant genomes is a viable strategy for generating physical maps. Physical maps allow the description of the co-linearity between wheat and grass genomes and provide a powerful tool for positional cloning of new genes.
Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat.
The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% of coding regions (estimated 2,850 genes) and another 19% was of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSR primer pairs.
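The summary statistics in this paragraph follow directly from the reported counts. As a quick consistency check, the short Python sketch below recomputes the mean read length, the marker spacings and the proportions of 3A-specific primer pairs. The variable names are illustrative, and only figures stated in the text are used.

```python
# Counts reported in the text.
TOTAL_BP = 11_014_359      # high-quality BAC-end sequence
BAC_ENDS = 17_591          # number of BAC-end reads
SSRS = 1_057               # simple sequence repeats found
TE_JUNCTIONS = 7_928       # transposable element junctions found
PRIMER_PAIRS_TESTED = 96   # tested per marker type
ISBP_SPECIFIC = 28         # 3A-specific ISBP primer pairs
SSR_SPECIFIC = 17          # 3A-specific SSR primer pairs

mean_read_bp = TOTAL_BP / BAC_ENDS
ssr_spacing_kb = TOTAL_BP / SSRS / 1000
junction_spacing_kb = TOTAL_BP / TE_JUNCTIONS / 1000
isbp_specific_pct = 100 * ISBP_SPECIFIC / PRIMER_PAIRS_TESTED
ssr_specific_pct = 100 * SSR_SPECIFIC / PRIMER_PAIRS_TESTED

print(f"mean read length: {mean_read_bp:.0f} bp")
print(f"one SSR per {ssr_spacing_kb:.1f} kb")
print(f"one TE junction per {junction_spacing_kb:.2f} kb")
print(f"3A-specific: ISBP {isbp_specific_pct:.0f}%, SSR {ssr_specific_pct:.0f}%")
```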
This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequence was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping.
Structural changes of chromosomes are a primary mechanism of genome rearrangement over the course of evolution and detailed knowledge of such changes in a given species and its close relatives should increase the efficiency and precision of chromosome engineering in crop improvement. We have identified sequences bordering each of the main translocation and inversion breakpoints on chromosomes 4A, 5A and 7B of the modern bread wheat genome. The locations of these breakpoints allow, for the first time, a detailed description of the evolutionary origins of these chromosomes at the gene level. Results from this study also demonstrate that, although the strategy of exploiting sorted chromosome arms has dramatically simplified the efforts of wheat genome sequencing, simultaneous analysis of sequences from homoeologous and non-homoeologous chromosomes is essential in understanding the origins of DNA sequences in polyploid species.
A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat.
Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed.
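Nucleotide diversity is conventionally measured as the average number of pairwise differences per site among aligned sequences. The sketch below is a minimal illustration of that standard definition on toy haplotypes. It assumes the haplotypes have already been assigned to a single genome and aligned without gaps, and it is not the software used in the study. The function name and example sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among aligned, equal-length sequences."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    pairs = list(combinations(haplotypes, 2))
    # Count mismatching sites for every pair of haplotypes.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment of four haplotypes from one genome (hypothetical data).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi = nucleotide_diversity(toy)
print(pi)
```

Computing the value separately for the A, B and D copies of a gene is what makes the genome assignment step described above essential: mixing homoeologous sequences in one alignment would inflate the estimate.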
In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.
The Protein Disulfide Isomerase (PDI) gene family encodes several PDI and PDI-like proteins containing thioredoxin domains and controlling diversified metabolic functions, including disulfide bond formation and isomerisation during protein folding. Genomic, cDNA and promoter sequences of the three homoeologous wheat genes encoding the "typical" PDI had been cloned and characterized in a previous work. The purpose of the present research was the cloning and characterization of the complete set of genes encoding PDI and PDI-like proteins in bread wheat (Triticum aestivum cv Chinese Spring) and the comparison of their sequence, structure and expression with homologous genes from other plant species.
Eight new non-homoeologous wheat genes were cloned and characterized. The nine PDI and PDI-like sequences of wheat were located in chromosome regions syntenic to those in rice and assigned to eight plant phylogenetic groups. The nine wheat genes differed in their sequences, genomic organization as well as in the domain composition and architecture of their deduced proteins; conversely each of them showed high structural conservation with genes from other plant species in the same phylogenetic group. The extensive quantitative RT-PCR analysis of the nine genes in a set of 23 wheat samples, including tissues and developmental stages, showed their constitutive, even though highly variable expression.
The nine wheat genes showed high diversity, while the members of each phylogenetic group were highly conserved even between taxonomically distant plant species like the moss Physcomitrella patens. Although constitutively expressed the nine wheat genes were characterized by different expression profiles reflecting their different genomic organization, protein domain architecture and probably promoter sequences; the high conservation among species indicated the ancient origin and diversification of the still evolving gene family. The comprehensive structural and expression characterization of the complete set of PDI and PDI-like wheat genes represents a basis for the functional characterization of this gene family in the hexaploid context of bread wheat.
The patterns of expression of homoeologous genes in hexaploid bread wheat have been intensively studied in recent years, but the interaction between structural genes and their homoeologous regulatory genes has remained unclear. The question was whether, in an allopolyploid, this interaction is genome-specific, or whether regulation cuts across genomes. The aim of the present study was the cloning, sequence analysis, mapping and expression analysis of F3H (flavanone 3-hydroxylase, one of the key enzymes in the plant flavonoid biosynthesis pathway) homoeologues in bread wheat, and the study of the interaction between the F3H homoeologues and the homoeologues of their regulatory gene, Rc (red coleoptile).
PCR-based cloning of F3H sequences from hexaploid bread wheat (Triticum aestivum L.), a wild tetraploid wheat (T. timopheevii) and their putative diploid progenitors was employed to localize, physically map and analyse the expression of four distinct bread wheat F3H copies. Three of these form a homoeologous set, mapping to the chromosomes of homoeologous group 2; they are highly similar to one another at the structural and functional levels. However, the fourth copy is less homologous, and was not expressed in anthocyanin pigmented coleoptiles. The presence of dominant alleles at the Rc-1 homoeologous loci, which are responsible for anthocyanin pigmentation in the coleoptile, was correlated with F3H expression in pigmented coleoptiles. Each dominant Rc-1 allele affected the expression of the three F3H homoeologues equally, but the level of F3H expression was dependent on the identity of the dominant Rc-1 allele present. Thus, the homoeologous Rc-1 genes contribute more to functional divergence than do the structural F3H genes.
The lack of any genome-specific relationship between F3H-1 and Rc-1 implies an integrative evolutionary process among the three diploid genomes, following the formation of hexaploid wheat. Regulatory genes probably contribute more to the functional divergence between the wheat genomes than do the structural genes themselves. This is in line with the growing consensus which suggests that although heritable morphological traits are determined by the expression of structural genes, it is the regulatory genes which are the prime determinants of allelic identity.
Our understanding of the mechanisms that govern the cellular process of meiosis is limited in higher plants with polyploid genomes. Bread wheat is an allohexaploid that behaves as a diploid during meiosis. Chromosome pairing is restricted to homologous chromosomes despite the presence of homoeologues in the nucleus. The importance of wheat as a crop and the extensive use of wild wheat relatives in breeding programs has prompted many years of cytogenetic and genetic research to develop an understanding of the control of chromosome pairing and recombination. The rapid advance of biochemical and molecular information on meiosis in model organisms such as yeast provides new opportunities to investigate the molecular basis of chromosome pairing control in wheat. However, building the link between the model and wheat requires points of data contact.
We report here a large-scale transcriptomics study using the Affymetrix wheat GeneChip® aimed at providing this link between wheat and model systems and at identifying early meiotic genes. Analysis of the microarray data identified 1,350 transcripts temporally-regulated during the early stages of meiosis. Expression profiles with annotated transcript functions including chromatin condensation, synaptonemal complex formation, recombination and fertility were identified. From the 1,350 transcripts, 30 displayed at least an eight-fold expression change between and including pre-meiosis and telophase II, with more than 50% of these having no similarities to known sequences in NCBI and TIGR databases.
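The eight-fold criterion mentioned above can be read as a filter on the ratio between the highest and lowest expression value of a transcript across the sampled stages. The sketch below illustrates that reading on made-up values. It assumes linear-scale (not log-transformed) intensities. The stage labels other than pre-meiosis and telophase II, the probe names and the function names are hypothetical, and the study's actual normalisation and statistics are not reproduced here.

```python
def max_fold_change(expression_by_stage):
    """Ratio of the highest to the lowest linear-scale expression value across stages."""
    values = list(expression_by_stage.values())
    if not values or min(values) <= 0:
        raise ValueError("expression values must be positive")
    return max(values) / min(values)


def passes_fold_filter(expression_by_stage, threshold=8.0):
    """True if expression changes at least `threshold`-fold across the stages."""
    return max_fold_change(expression_by_stage) >= threshold


# Hypothetical transcripts with made-up intensities.
transcripts = {
    "probe_A": {"pre-meiosis": 50.0, "mid-meiosis": 120.0, "telophase II": 400.0},
    "probe_B": {"pre-meiosis": 100.0, "mid-meiosis": 150.0, "telophase II": 300.0},
}
selected = [name for name, profile in transcripts.items() if passes_fold_filter(profile)]
print(selected)
```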
This resource is now available to support research into the molecular basis of pairing and recombination control in the complex polyploid, wheat.
In contrast to diploids, most polyploid plant species, which include the hexaploid bread wheat, possess an additional layer of epigenetic complexity. Several studies have demonstrated that polyploids are affected by homoeologous gene silencing, a process in which subgenomic copies are selectively transcriptionally inactivated. This form of silencing can be tissue specific and may be linked to developmental or stress responses.
Evidence was sought as to whether the frequency of homoeologous silencing in in vitro cultured wheat callus differs from that in differentiated organs, given that disorganized cells are associated with a globally lower level of DNA methylation. Using a reverse transcription PCR (RT-PCR) single strand conformation polymorphism (SSCP) platform to detect the pattern of expression of 20 homoeologous sets of single-copy genes known to be affected by this form of silencing in the root and/or leaf, we observed no silencing in any of the wheat callus tissue tested.
Our results suggest that much of the homoeologous silencing observed in differentiated tissues is probably under epigenetic control, rather than being linked to genomic instability arising from allopolyploidization. This study reinforces the notion of plasticity in the wheat epi-genome.
Mutational inactivation of plant genes is an essential tool in gene function studies. Plants with inactivated or deleted genes may also be exploited for crop improvement if such mutations/deletions produce a desirable agronomical and/or quality phenotype. However, the use of mutational gene inactivation/deletion has been impeded in polyploid plant species by genetic redundancy, as polyploids contain multiple copies of the same genes (homoeologous genes) encoded by each of the ancestral genomes. Similar to many other crop plants, bread wheat (Triticum aestivum L.) is polyploid; specifically allohexaploid possessing three progenitor genomes designated as 'A', 'B', and 'D'. Recently modified TILLING protocols have been developed specifically for mutation detection in wheat. Whilst extremely powerful in detecting single nucleotide changes and small deletions, these methods are not suitable for detecting whole gene deletions. Therefore, high-throughput methods for screening of candidate homoeologous gene deletions are needed for application to wheat populations generated by the use of certain mutagenic agents (e.g. heavy ion irradiation) that frequently generate whole-gene deletions.
To facilitate the screening for specific homoeologous gene deletions in hexaploid wheat, we have developed a TaqMan qPCR-based method that allows high-throughput detection of deletions in homoeologous copies of any gene of interest, provided that sufficient polymorphism (as little as a single nucleotide difference) amongst homoeologues exists for specific probe design. We used this method to identify deletions of individual homoeologues of TaPFT1, a wheat orthologue of the disease susceptibility and flowering regulatory gene PFT1 in Arabidopsis. This method was applied to wheat nullisomic-tetrasomic lines as well as other chromosomal deletion lines to locate the TaPFT1 gene to the long arm of chromosome 5. By screening individual DNA samples from 4500 M2 mutant wheat lines generated by heavy ion irradiation, we detected multiple mutants with deletions of each TaPFT1 homoeologue, and confirmed these deletions using a CAPS method. We have subsequently designed, optimized, and applied this method for the screening of homoeologous deletions of three additional wheat genes putatively involved in plant disease resistance.
We have developed a method for automated, high-throughput screening to identify deletions of individual homoeologues of a wheat gene. This method is also potentially applicable to other polyploid plants.
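The logic of probe-based deletion screening can be sketched as follows: if a homoeologue-specific probe gives no signal, or a much later signal than an internal control in the same sample, that homoeologue is a candidate deletion. The Python sketch below illustrates this idea only. The thresholds, the use of a control Ct, and all names and values are hypothetical assumptions for illustration, not the scoring rules of the published method.

```python
from typing import Optional


def call_deletions(
    ct_values: dict[str, Optional[float]],
    control_ct: Optional[float],
    max_control_ct: float = 30.0,   # hypothetical quality cutoff for the control
    min_delta_ct: float = 5.0,      # hypothetical delay that flags a deletion
) -> list[str]:
    """Return homoeologues whose probe signal is absent or far later than the control."""
    if control_ct is None or control_ct > max_control_ct:
        raise ValueError("control reaction failed; sample cannot be scored")
    deleted = []
    for homoeologue, ct in ct_values.items():
        # None stands for "no amplification detected" for that probe.
        if ct is None or ct - control_ct >= min_delta_ct:
            deleted.append(homoeologue)
    return deleted


# Hypothetical Ct values for one M2 plant: the B-genome probe gives no signal.
sample = {"A": 24.1, "B": None, "D": 24.6}
candidates = call_deletions(sample, control_ct=23.8)
print(candidates)
```

Requiring a valid control in every sample guards against scoring a failed reaction as a deletion, which matters when thousands of lines are screened automatically.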
The Yr26 gene, conferring resistance to all currently important races of Puccinia striiformis f. sp. tritici (Pst) in China, was previously mapped to wheat chromosome deletion bin C-1BL-6-0.32 with low-density markers. In this study, collinearity of wheat to Brachypodium distachyon and rice was used to develop markers to saturate the chromosomal region containing the Yr26 locus, and a total of 2,341 F2 plants and 551 F2∶3 progenies derived from Avocet S×92R137 were used to develop a fine map of Yr26. Wheat expressed sequence tags (ESTs) located in deletion bin C-1BL-6-0.32 were used to develop sequence tagged site (STS) markers. The EST-STS markers flanking Yr26 were used to identify collinear regions of the rice and B. distachyon genomes. Wheat ESTs with significant similarities in the two collinear regions were selected to develop conserved markers for fine mapping of Yr26. Thirty-one markers were mapped to the Yr26 region, and six of them cosegregated with the resistance gene. Marker orders were highly conserved between rice and B. distachyon, but some rearrangements were observed between rice and wheat. Two flanking markers (CON-4 and CON-12) further narrowed the genomic region containing Yr26 to a 1.92 Mb region in B. distachyon chromosome 3 and a 1.17 Mb region in rice chromosome 10, and two putative resistance gene analogs were identified in the collinear region of B. distachyon. The markers developed in this study provide a potential target site for further map-based cloning of Yr26 and should be useful in marker assisted selection for pyramiding the gene with other resistance genes.
A complete assembled genome sequence of wheat is not yet available. Therefore, model plant systems for wheat are very valuable. Brachypodium distachyon (Brachypodium) is such a system. The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators with members regulating important agronomic traits. Studies of WRKY transcription factors in Brachypodium and wheat therefore promise to lead to new strategies for wheat improvement.
We have identified and manually curated the WRKY transcription factor family from Brachypodium using a pipeline designed to identify all potential WRKY genes. We found 86 WRKY transcription factors, a total higher than that in any other current database. We therefore propose that our numbering system (BdWRKY1-BdWRKY86) becomes the standard nomenclature. In the JGI v1.0 assembly of Brachypodium with the MIPS/JGI v1.0 annotation, nine of the transcription factors have no gene model and eleven gene models are probably incorrectly predicted. In total, twenty WRKY transcription factors (23.3%) do not appear to have accurate gene models. To facilitate use of our data, we have produced The Database of Brachypodium distachyon WRKY Transcription Factors. Each WRKY transcription factor has a gene page that includes predicted protein domains from MEME analyses. These conserved protein domains reflect possible input and output domains in signaling. The database also contains a BLAST search function where a large dataset of WRKY transcription factors, published genes, and an extensive set of wheat ESTs can be searched. We also produced a phylogram containing the WRKY transcription factor families from Brachypodium, rice, Arabidopsis, soybean, and Physcomitrella patens, together with published WRKY transcription factors from wheat. This phylogenetic tree provides evidence for orthologues, co-orthologues, and paralogues of Brachypodium WRKY transcription factors.
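The identification pipeline itself is not described in this abstract. As a rough illustration of one common first-pass step, the Python sketch below scans protein sequences for the conserved WRKYGQK heptapeptide that characterises the WRKY domain, and then rechecks the gene-model percentage quoted above. The toy sequences and function name are hypothetical. An exact-match scan like this would miss proteins with variant motifs, so it is only a starting point for manual curation.

```python
import re

# Canonical WRKY-domain heptapeptide; variant motifs are deliberately not matched here.
WRKY_CORE = re.compile(r"WRKYGQK")


def find_wrky_candidates(proteins: dict[str, str]) -> dict[str, list[int]]:
    """Map each protein containing the core motif to the 0-based motif start positions."""
    hits = {}
    for name, seq in proteins.items():
        positions = [m.start() for m in WRKY_CORE.finditer(seq.upper())]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical toy sequences, not real Brachypodium proteins.
toy_proteins = {
    "toy_protein_1": "MSSDDWRKYGQKPIKGSPYPRGYYKC",
    "toy_protein_2": "MAAKLLPPGGSSTTEE",
    "toy_protein_3": "mkwrkygqkavkwrkygqkhh",
}
hits = find_wrky_candidates(toy_proteins)
print(hits)

# Gene-model figures reported in the text: 9 missing + 11 mispredicted out of 86.
inaccurate = 9 + 11
inaccurate_pct = 100 * inaccurate / 86
print(f"{inaccurate} of 86 ({inaccurate_pct:.1f}%)")
```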
The description of the WRKY transcription factor family in Brachypodium that we report here provides a framework for functional genomics studies in an important model system. Our database is a resource for both Brachypodium and wheat studies and ultimately projects aimed at improving wheat through manipulation of WRKY transcription factors.
WRKY transcription factor; Brachypodium distachyon; Wheat; Comparative genomics; Database
Sucrose phosphate synthase (SPS) is an important component of the plant sucrose biosynthesis pathway. In the monocotyledonous Poaceae, five SPS genes have been identified. Here we present a detailed analysis of the SPSII family in wheat. A set of homoeologue-specific primers was developed in order to permit both the detection of sequence variation, and the dissection of the individual contribution of each homoeologue to the global expression of SPSII.
The expression in bread wheat over the course of development of various sucrose biosynthesis genes monitored on an Affymetrix array showed that the SPS genes were regulated over time and space. SPSII homoeologue-specific assays were used to show that the three homoeologues contributed differentially to the global expression of SPSII. Genetic mapping placed the set of homoeoloci on the short arms of the homoeologous group 3 chromosomes. A resequencing of the A and B genome copies allowed the detection of four haplotypes at each locus. The 3B copy includes an unspliced intron. A comparison of the sequences of the wheat SPSII orthologues present in the diploid progenitors einkorn, goatgrass and Triticum speltoides, as well as in the more distantly related species barley, rice, sorghum and purple false brome demonstrated that intronic sequence was less well conserved than exonic. Comparative sequence and phylogenetic analysis of the SPSII gene showed that purple false brome was more similar to Triticeae than to rice. Wheat-rice synteny was found to be perturbed at the SPS region.
The homoeologue-specific assays will be suitable to derive associations between SPS functionality and key phenotypic traits. The amplicon sequences derived from the homoeologue-specific primers are informative regarding the evolution of SPSII in a polyploid context.
The bread wheat genome harbors three homoeologs of the barley gene HvAP2, which determines cleistogamous/non-cleistogamous flowering. The three homoeologs, TaAP2-A, TaAP2-B and TaAP2-D, are derived from the A, B and D genomes. The importance of lodicule swelling in assuring non-cleistogamous flowering in a range of wild and domesticated wheat accessions of varying ploidy level was established. Re-sequencing of wheat AP2 homoeologous genes was carried out to identify natural variation at both the nucleotide and polypeptide level. The sequences of wheat AP2 homoeologs are highly conserved even across different ploidy levels and no functional variants at the key miR172 targeting site were detected. These results indicate that engineering of cleistogamous wheat will require the presence of a functional TaAP2 modification at each of the three homoeologs.
Triticum aestivum L.; cleistogamy; lodicule; microRNA172
Phosphomannomutase (PMM) is an essential enzyme in eukaryotes. However, little is known about PMM genes and their function in crop plants. Here, we report molecular evolutionary and biochemical analysis of PMM genes in bread wheat and related Triticeae species.
Two sets of homoeologous PMM genes (TaPMM-1 and 2) were found in bread wheat, and two corresponding PMM genes were identified in the diploid progenitors of bread wheat and many other diploid Triticeae species. The duplication event yielding PMM-1 and 2 occurred before the radiation of diploid Triticeae genomes. The PMM gene family in wheat and relatives may evolve largely under purifying selection. Among the six TaPMM genes, the transcript levels of PMM-1 members were comparatively high and their recombinant proteins were all enzymatically active. However, PMM-2 homoeologs exhibited lower transcript levels, two of which were also inactive. TaPMM-A1, B1 and D1 were probably the main active isozymes in bread wheat tissues. The three isozymes differed from their counterparts in barley and Brachypodium distachyon in being more tolerant to elevated test temperatures.
Our work identified the genes encoding PMM isozymes in bread wheat and relatives, uncovered a unique PMM duplication event in diverse Triticeae species, and revealed the main active PMM isozymes in bread wheat tissues. The knowledge obtained here improves the understanding of PMM evolution in eukaryotic organisms, and may facilitate further investigations of PMM function in the temperature adaptability of bread wheat.
Wheat, barley, and rye, of tribe Triticeae in the Poaceae, are among the most important crops worldwide but they present many challenges to genomics-aided crop improvement. Brachypodium distachyon, a close relative of those cereals has recently emerged as a model for grass functional genomics. Sequencing of the nuclear and organelle genomes of Brachypodium is one of the first steps towards making this species available as a tool for researchers interested in cereals biology.
The chloroplast genome of Brachypodium distachyon was sequenced by a combinational approach using BAC end and shotgun sequences derived from a selected BAC containing the entire chloroplast genome. Comparative analysis indicated that the chloroplast genome is conserved in gene number and organization with respect to those of other cereals. However, several Brachypodium genes evolve at a faster rate than those in other grasses. Sequence analysis reveals that rice and wheat have a ~2.1 kb deletion in their plastid genomes and this deletion must have occurred independently in both species.
We demonstrate that BAC libraries can be used to sequence plastid, and likely other organellar, genomes. As expected, the Brachypodium chloroplast genome is very similar to those of other sequenced grasses. The phylogenetic analyses and the pattern of insertions and deletions in the chloroplast genome confirmed that Brachypodium is a close relative of the tribe Triticeae. Nevertheless, we show that some large indels can arise multiple times and may confound phylogenetic reconstruction.
Bread wheat is one of the world’s most important food crops and considerable efforts have been made to develop genomic resources for this species. This includes an on-going project by the International Wheat Genome Sequencing Consortium to assemble its large and complex genome, which is hexaploid and contains three closely related ‘homoeologous’ copies for each chromosome. This multi-national effort avoids the complications polyploidy entails for correct assembly of the genome by sequencing flow-sorted chromosome arms one at a time. Here we report on an alternate approach, a direct homoeolog-specific assembly of the expressed portion of the genome, the transcriptome.
After assessment of the ability of various assemblers to generate homoeolog-specific assemblies, we employed a two-stage assembly process to produce a high-quality assembly of the transcriptome of hexaploid wheat from Roche-454 and Illumina GAIIx paired-end sequence reads. The assembly process made use of a rapid partitioning of expressed sequences into homoeologous clusters, followed by a parallel high-fidelity assembly of each cluster on a 1150-processor compute cloud. We assessed assembly quality through comparison to known wheat gene sequences and found that in ca. 98.5% of cases the assembly was sufficiently accurate for homoeologous triplets to be cleanly separated into either two or three separate contigs. Comparison to publicly available transcript collections suggests that the assembly covers ~75-80% of the complete transcriptome.
This work therefore describes the first homoeolog-specific sequence assembly of the wheat transcriptome and provides a reference transcriptome for future wheat research. Furthermore, our assembly methodology is transferable to other polyploid organisms.
Wheat transcriptome; Wheat genes; Sequence assembly; Cloud computing
The pooid subfamily of grasses includes some of the most important crop, forage and turf species, such as wheat, barley and Lolium. Developing genomic resources, such as whole-genome physical maps, for analysing the large and complex genomes of these crops and for facilitating biological research in grasses is an important goal in plant biology. We describe a bacterial artificial chromosome (BAC)-based physical map of the wild pooid grass Brachypodium distachyon and integrate this with whole genome shotgun sequence (WGS) assemblies using BAC end sequences (BES). The resulting physical map contains 26 contigs spanning the 272 Mb genome. BES from the physical map were also used to integrate a genetic map. This provides an independent vaildation and confirmation of the published WGS assembly. Mapped BACs were used in Fluorescence In Situ Hybridisation (FISH) experiments to align the integrated physical map and sequence assemblies to chromosomes with high resolution. The physical, genetic and cytogenetic maps, integrated with whole genome shotgun sequence assemblies, enhance the accuracy and durability of this important genome sequence and will directly facilitate gene isolation.
The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied next-generation sequencing to a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatics pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome, we showed that mapped markers are located within 40−100 kb of genes, which makes high-resolution mapping at the level of a single gene possible. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers, including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and linking physical and genetic maps of the wheat genome.
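The notion of markers cosegregating with a trait in a doubled-haploid population can be made concrete with a short sketch. It assumes each line is scored with one of two parental alleles ("A" or "B"), that "-" marks a missing call, and that the trait is scored in the same parental coding. The function names and the toy data are hypothetical and do not come from the study.

```python
def recombination_fraction(calls_a, calls_b, missing="-"):
    """Fraction of informative lines in which two loci carry different parental alleles."""
    informative = [(a, b) for a, b in zip(calls_a, calls_b)
                   if a != missing and b != missing]
    if not informative:
        return None  # no line was scored at both loci
    recombinants = sum(a != b for a, b in informative)
    return recombinants / len(informative)


def cosegregating_markers(markers, trait, max_fraction=0.0):
    """Return names of markers showing no (or few) recombinants with the trait."""
    hits = []
    for name, calls in markers.items():
        r = recombination_fraction(calls, trait)
        if r is not None and r <= max_fraction:
            hits.append(name)
    return hits


# Toy data: six doubled-haploid lines, three markers, one trait locus.
markers = {
    "m1": "AABBAB",
    "m2": "AABBBB",
    "m3": "AAB-AB",
}
trait = "AABBAB"
print(cosegregating_markers(markers, trait))
```

Markers with a recombination fraction of zero cannot be ordered relative to the trait locus in that population, which is why a large set of cosegregating markers reflects the limits of population size rather than of marker density.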
sequence-based genotyping; contig anchoring; gene mapping; reference map
Chromosome pairing, recombination and DNA repair are essential processes during meiosis in sexually reproducing organisms. Investigating the bread wheat (Triticum aestivum L.) Ph2 (Pairing homoeologous) locus has identified numerous candidate genes that may have a role in controlling such processes, including TaMSH7, a plant specific member of the DNA mismatch repair family.
Sequencing of the three MSH7 genes, located on the short arms of wheat chromosomes 3A, 3B and 3D, has revealed no significant sequence divergence at the amino acid level suggesting conservation of function across the homoeogroups. Functional analysis of MSH7 through the use of RNAi loss-of-function transgenics was undertaken in diploid barley (Hordeum vulgare L.). Quantitative real-time PCR revealed several T0 lines with reduced MSH7 expression. Positive segregants from two T1 lines studied in detail showed reduced MSH7 expression when compared to transformed controls and null segregants. Expression of MSH6, another member of the mismatch repair family which is most closely related to the MSH7 gene, was not significantly reduced in these lines. In both T1 lines, reduced seed set in positive segregants was observed.
Results presented here indicate, for the first time, a distinct functional role for MSH7 in vivo and show that expression of this gene is necessary for wild-type levels of fertility. These observations suggest that MSH7 has an important function during meiosis and as such remains a candidate for Ph2.
Carotenoids are isoprenoid pigments, essential for photosynthesis and photoprotection in plants. The enzyme phytoene synthase (PSY) plays an essential role in mediating condensation of two geranylgeranyl diphosphate molecules, the first committed step in carotenogenesis. PSYs are nuclear-encoded enzymes from a small gene family consisting of three paralogous genes (PSY1-3) that have been widely characterized in rice, maize and sorghum.
In wheat, for which yellow pigment content is extremely important for flour colour, only PSY1 has been extensively studied because of its association with QTLs reported for yellow pigment whereas PSY2 has been partially characterized. Here, we report the isolation of bread wheat PSY3 genes from a Renan BAC library using Brachypodium as a model genome for the Triticeae to develop Conserved Orthologous Set markers prior to gene cloning and sequencing. Wheat PSY3 homoeologous genes were sequenced and annotated, unravelling their novel structure associated with intron-loss events and consequent exonic fusions. A wheat PSY3 promoter region was also investigated for the presence of cis-acting elements involved in the response to abscisic acid (ABA), since carotenoids also play an important role as precursors of signalling molecules devoted to plant development and biotic/abiotic stress responses. Expression of wheat PSYs in leaves and roots was investigated during ABA treatment to confirm the up-regulation of PSY3 during abiotic stress.
We investigated the structural and functional determinants of PSY genes in wheat. More generally, among eudicots and monocots, the PSY gene family was found to be associated with differences in gene copy numbers, allowing us to propose an evolutionary model for the entire PSY gene family in grasses.
Carotenoids; Phytoene synthase; Wheat; Intron loss; Abiotic stress; Evolution
The ability of grass species to adapt to various habitats is attributed to the dynamic nature of their genomes, which have been shaped by multiple rounds of ancient and recent polyploidization. To gain a better understanding of the nature and extent of variation in functionally relevant regions of a polyploid genome, we developed a sequence capture assay to compare exonic sequences of allotetraploid wheat accessions.
A sequence capture assay was designed for the targeted re-sequencing of 3.5 Mb exon regions that surveyed a total of 3,497 genes from allotetraploid wheat. These data were used to describe SNPs, copy number variation and homoeologous sequence divergence in coding regions. A procedure for variant discovery in the polyploid genome was developed and experimentally validated. About 1% and 24% of discovered SNPs were loss-of-function and non-synonymous mutations, respectively. Under-representation of replacement mutations was identified in several groups of genes involved in translation and metabolism. Gene duplications were predominant in a cultivated wheat accession, while more gene deletions than duplications were identified in wild wheat.
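The distinction between synonymous, non-synonymous and loss-of-function SNPs can be sketched as follows. This is a simplified illustration, not the variant-calling procedure of the study: it uses the standard genetic code, takes a 0-based position within an in-frame coding sequence, and treats only a newly created stop codon as loss-of-function (splice-site or start-codon changes are ignored). The function name `classify_snp` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding sequence starting at the first base of a codon
    pos: 0-based position of the substituted base
    alt: alternative base
    """
    start = pos - pos % 3  # first base of the affected codon
    ref_codon = cds[start:start + 3]
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "loss_of_function"  # assumption: premature stop only
    return "non_synonymous"


cds = "ATGTGGAAATAA"  # toy sequence encoding Met-Trp-Lys-stop
for pos, alt in [(5, "A"), (8, "G"), (6, "G")]:
    print(pos, alt, classify_snp(cds, pos, alt))
```

Counting the labels over all discovered coding SNPs would give proportions comparable in kind to the percentages reported above, although the exact categories used in the study may be broader.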
We demonstrate that, even though the level of sequence similarity between targeted polyploid genomes and capture baits can bias enrichment efficiency, exon capture is a powerful approach for variant discovery in polyploids. Our results suggest that allopolyploid wheat can accumulate new variation in coding regions at a high rate. This process has the potential to broaden functional diversity and generate new phenotypic variation that eventually can play a critical role in the origin of new adaptations and important agronomic traits.
Brachypodium distachyon (L.) Beauv. is a temperate wild grass species whose morphological and genomic characteristics make it a more tractable model system than many other grass species. It has a small genome, a short growth cycle, self-fertility, many diploid accessions, and simple growth requirements. In addition, it is phylogenetically close to economically important crops, such as wheat and barley, and to several potential biofuel grasses. It exhibits agricultural traits similar to those of these target crops. For cereal genomes, it is a better model than Arabidopsis thaliana and Oryza sativa (rice), the former used as a model for all flowering plants and the latter hitherto used as a model for the genomes of all temperate grass species, including major cereals such as barley and wheat. Increasing interest in this species has resulted in the development of a series of genomics resources, including nuclear sequences and BAC/EST libraries, together with the collection and characterization of other genetic resources. It is expected that the use of this model will allow rapid advances in generating genomic information for the improvement of all temperate crops, particularly the cereals.
Caffeic acid O-methyltransferase (COMT) is one of the important enzymes controlling lignin monomer production in plant cell wall synthesis. Analysis of the genome sequence of the new grass model Brachypodium distachyon identified four COMT gene homologs, designated BdCOMT1, BdCOMT2, BdCOMT3, and BdCOMT4. Phylogenetic analysis suggested that they belong to the COMT gene family, whereas syntenic analysis through comparisons with rice and sorghum revealed that BdCOMT4 on chromosome 3 is the orthologous copy of the COMT genes well characterized in other grass species. The other three COMT genes are unique to Brachypodium, since orthologous copies are not found in the collinear regions of the rice and sorghum genomes. Expression studies indicated that all four Brachypodium COMT genes are transcribed but with distinct patterns of tissue specificity. Full-length cDNAs were cloned in frame into the pQE-T7 expression vector for the purification of recombinant Brachypodium COMT proteins. Biochemical characterization of enzyme activity and substrate specificity showed that BdCOMT4 has significant activity on a broad range of substrates, with the highest preference for caffeic acid. The other three COMTs had low or no activity on these substrates, suggesting that these duplicate genes have diverged in evolution in a way that not only affected their pattern of expression but also altered their biochemical properties.
In the past, the rice genome served as a good model for comparative genomics studies of grass species. More recently, however, the Brachypodium distachyon genome has emerged as a better model system for the genomes of temperate cereals, including wheat. In the present study, Brachypodium EST contigs were used to resolve orthologous relationships among the genomes of Brachypodium, wheat and rice.
Comparative sequence analysis of 3,818 Brachypodium EST (bEST) contigs and 3,792 physically mapped wheat EST (wEST) contigs revealed that 449 bEST contigs were orthologous to 1,154 wEST loci that were bin-mapped on all 21 wheat chromosomes. Similarly, 743 bEST contigs were orthologous to specific rice genome sequences distributed on all 12 rice chromosomes. A total of 183 bEST contigs were orthologous to both wheat and rice genome sequences, which harbored 17 SSRs conserved across the three species. Primers developed for 12 of these 17 conserved SSRs were used in a wet-lab experiment, which revealed a relatively high level of conservation among the genomes of Brachypodium, wheat and rice.
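The counts reported above can be turned into proportions with a few lines of arithmetic. The sketch below uses only the numbers given in the text. The derived quantities (percentages of bEST contigs and the mean number of wheat loci per orthologous contig) are computed here for illustration and are not values stated by the authors, and the variable names are hypothetical.

```python
# Counts taken from the text.
total_best_contigs = 3818
best_with_wheat_ortholog = 449
wheat_loci_matched = 1154
best_with_rice_ortholog = 743
best_with_both = 183


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


pct_wheat = percent(best_with_wheat_ortholog, total_best_contigs)
pct_rice = percent(best_with_rice_ortholog, total_best_contigs)
pct_both = percent(best_with_both, total_best_contigs)

# Mean number of bin-mapped wheat loci per orthologous bEST contig.
loci_per_contig = wheat_loci_matched / best_with_wheat_ortholog

print(f"bEST contigs with a wheat ortholog: {pct_wheat:.1f}%")
print(f"bEST contigs with a rice ortholog: {pct_rice:.1f}%")
print(f"bEST contigs with both: {pct_both:.1f}%")
print(f"wheat loci per orthologous bEST contig: {loci_per_contig:.2f}")
```

A ratio of wheat loci to contigs above one is what would be expected if a single Brachypodium contig matches homoeologous copies on more than one wheat subgenome, although the text does not state this interpretation.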
The present study confirmed that Brachypodium is a better model than rice for analysis of the genomes of temperate cereals such as wheat and barley. The whole genome sequence of Brachypodium, which should become available in the near future, will greatly facilitate comparative genomics studies of cereals.