Bread wheat (Triticum aestivum) has a large and highly repetitive genome which poses major technical challenges for its study. To aid map-based cloning and future genome sequencing projects, we constructed a BAC-based physical map of the short arm of wheat chromosome 1A (1AS). From the assembly of 25,918 high information content (HICF) fingerprints from a 1AS-specific BAC library, 715 physical contigs were produced that cover almost 99% of the estimated size of the chromosome arm. The 3,414 BAC clones constituting the minimum tiling path were end-sequenced. Using a gene microarray containing ∼40 K NCBI UniGene EST clusters, PCR marker screening and BAC end sequences, we arranged 160 physical contigs (97 Mb or 35.3% of the chromosome arm) in a virtual order based on synteny with Brachypodium, rice and sorghum. BAC end sequences and information from microarray hybridisation were used to anchor 3.8 Mbp of Illumina sequences from flow-sorted chromosome 1AS to BAC contigs. Comparison of genetic and synteny-based physical maps indicated that ∼50% of all genetic recombination is confined to 14% of the physical length of the chromosome arm in the distal region. The 1AS physical map provides a framework for future genetic mapping projects as well as the basis for complete sequencing of chromosome arm 1AS.
Diploid Aegilops umbellulata and Ae. comosa and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata are important wild gene sources for wheat. With the aim of assisting in alien gene transfer, this study provides gene-based conserved orthologous set (COS) markers for the U and M genome chromosomes. Out of the 140 markers tested on a series of wheat-Aegilops chromosome introgression lines and flow-sorted subgenomic chromosome fractions, 100 were assigned to Aegilops chromosomes and six and seven duplications were identified in the U and M genomes, respectively. The marker-specific EST sequences were BLAST-ed to Brachypodium and rice genomic sequences to investigate macrosyntenic relationships between the U and M genomes of Aegilops, wheat and the model species. Five syntenic regions of Brachypodium identified genome rearrangements differentiating the U genome from the M genome and from the D genome of wheat. All of them seem to have evolved at the diploid level and to have been modified differentially in the polyploid species Ae. biuncialis and Ae. geniculata. A certain level of wheat–Aegilops homology was detected for group 1, 2, 3 and 5 chromosomes, while a clearly rearranged structure was shown for the group 4, 6 and 7 Aegilops chromosomes relative to wheat. The conserved orthologous set markers assigned to Aegilops chromosomes promise to accelerate gene introgression by facilitating the identification of alien chromatin. The syntenic relationships between the Aegilops species, wheat and model species will facilitate the targeted development of new markers specific for U and M genomic regions and will contribute to the understanding of molecular processes related to allopolyploidization.
Bread wheat (Triticum aestivum L.) is one of the most important crops worldwide and its production faces pressing challenges, the solution of which demands genome information. However, the large, highly repetitive hexaploid wheat genome has been considered intractable to standard sequencing approaches. Therefore the International Wheat Genome Sequencing Consortium (IWGSC) proposes to map and sequence the genome on a chromosome-by-chromosome basis.
We have constructed a physical map of the long arm of bread wheat chromosome 1A using chromosome-specific BAC libraries by High Information Content Fingerprinting (HICF). Two alternative methods (FPC and LTC) were used to assemble the fingerprints into a high-resolution physical map of the chromosome arm. A total of 365 molecular markers were added to the map, in addition to 1122 putative unique transcripts that were identified by microarray hybridization. The final map consists of 1180 FPC-based or 583 LTC-based contigs.
The physical map presented here marks an important step forward in mapping of hexaploid bread wheat. The map is orders of magnitude more detailed than previously available maps of this chromosome, and the assignment of over a thousand putative expressed gene sequences to specific map locations will greatly assist future functional studies. This map will be an essential tool for future sequencing of and positional cloning within chromosome 1A.
The assembly of the bread wheat genome sequence is challenging due to allohexaploidy and extreme repeat content (>80%). Isolation of single chromosome arms by flow sorting can be used to overcome the polyploidy problem, but the repeat content causes extreme assembly fragmentation even at the single-chromosome level. Long jump paired sequencing data (mate pairs) can help reduce assembly fragmentation by joining multiple contigs into single scaffolds. The aim of this work was to assess how mate pair data generated from multiple displacement amplified DNA of flow-sorted chromosomes affect assembly fragmentation of shotgun assemblies of the wheat chromosomes.
Three mate pair (MP) libraries (2 Kb, 3 Kb, and 5 Kb) were sequenced to a total coverage of 89x and 64x for the short and long arm of chromosome 7B, respectively. Scaffolding using SSPACE improved the 7B assembly contiguity and decreased gene space fragmentation, but the degree of improvement was greatly affected by scaffolding stringency applied. At the lowest stringency the assembly N50 increased by ~7 fold, while at the highest stringency N50 was only increased by ~1.5 fold. Furthermore, a strong positive correlation between estimated scaffold reliability and scaffold assembly stringency was observed. A 7BS scaffold assembly with reduced MP coverage proved that assembly contiguity was affected only to a small degree down to ~50% of the original coverage.
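The fold changes above are expressed in terms of N50, the length of the contig or scaffold at which the cumulative length, counted from the longest sequence down, first reaches half of the total assembly length. The following is a minimal sketch of that calculation. The function name and the toy lengths are hypothetical and are not data from the 7B assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths in kb, chosen only to illustrate the idea.
contigs = [2, 2, 3, 3, 4, 4, 6]
# The same 24 kb after some contigs have been joined into scaffolds
# (gap sizes are ignored in this toy case).
scaffolds = [5, 7, 12]

fold_change = n50(scaffolds) / n50(contigs)
print(n50(contigs), n50(scaffolds), fold_change)
```

Joining contigs leaves the total length unchanged in this toy case but concentrates it in fewer, longer sequences, which is why scaffolding raises N50.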
The effect of MP data integration into paired-end shotgun assemblies of wheat chromosomes was moderate, possibly due to poor contig assembly contiguity, the extreme repeat content of wheat, and the use of amplified chromosomal DNA for MP library construction.
Wheat; Assembly; Scaffold; Mate-pair; MDA; Improvement
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands of nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extrachromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity were analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes which can be identified in Musa.
Polyploidization is considered one of the main mechanisms of plant genome evolution. The presence of multiple copies of the same gene reduces selection pressure and permits sub-functionalization and neo-functionalization leading to plant diversification, adaptation and speciation. In bread wheat, polyploidization and the prevalence of transposable elements resulted in massive gene duplication and movement. As a result, the number of genes which are non-collinear to genomes of related species seems markedly increased in wheat.
We used new-generation sequencing (NGS) to generate sequence of a Mb-sized region from wheat chromosome arm 3DS. Sequence assembly of 24 BAC clones resulted in two scaffolds of 1,264,820 and 333,768 bases. The sequence was annotated and compared to the homoeologous region on wheat chromosome 3B and orthologous loci of Brachypodium distachyon and rice. Among 39 coding sequences in the 3DS scaffolds, 32 have a homoeolog on chromosome 3B. In contrast, only fifteen and fourteen orthologs were identified in the corresponding regions in rice and Brachypodium, respectively. Interestingly, five pseudogenes were identified among the non-collinear coding sequences at the 3B locus, while none was found at the 3DS locus.
Direct comparison of two Mb-sized regions of the B and D genomes of bread wheat revealed similar rates of non-collinear gene insertion in both genomes with a majority of gene duplications occurring before their divergence. A relatively low proportion of pseudogenes was identified among non-collinear coding sequences. Our data suggest that the pseudogenes did not originate from insertion of non-functional copies, but were formed later during the evolution of hexaploid wheat. Some evidence was found for gene erosion along the B genome locus.
Wheat; BAC sequencing; Homoeologous genomes; Gene duplication; Non-collinear genes; Allopolyploidy
Nuclear genomes of humans, animals, and plants are organized into subunits called chromosomes. When isolated into aqueous suspension, mitotic chromosomes can be classified using flow cytometry according to light scatter and fluorescence parameters. Chromosomes of interest can be purified by flow sorting if they can be resolved from other chromosomes in a karyotype. The analysis and sorting are carried out at rates of 10²–10⁴ chromosomes per second, and for complex genomes such as wheat the flow sorting technology has been ground-breaking in reducing genome complexity for genome sequencing. The high sample rate provides an attractive approach for karyotype analysis (flow karyotyping) and the purification of chromosomes in large numbers. In characterizing the chromosome complement of an organism, the high number that can be studied using flow cytometry allows for a statistically accurate analysis. Chromosome sorting plays a particularly important role in the analysis of nuclear genome structure and the analysis of particular and aberrant chromosomes. Other attractive but not well-explored features include the analysis of chromosomal proteins, chromosome ultrastructure, and high-resolution mapping using FISH. Recent results demonstrate that chromosome flow sorting can be coupled seamlessly with DNA array and next-generation sequencing technologies for high-throughput analyses. The main advantages are targeting the analysis to a genome region of interest and a significant reduction in sample complexity. As flow sorters can also sort single copies of chromosomes, shotgun sequencing DNA amplified from them enables the production of haplotype-resolved genome sequences. This review explains the principles of flow cytometric chromosome analysis and sorting (flow cytogenetics), discusses the major uses of this technology in genome analysis, and outlines future directions.
Chromosome sorting; Chromosome-specific BAC libraries; Chromosome sequencing; Chromosome genomics; Genome complexity reduction; Flow cytometry; Physical mapping
Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat.
The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSR primer pairs.
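The summary statistics quoted above follow from simple ratios of the reported counts. The sketch below reproduces that arithmetic using only figures given in the abstract. The variable names are illustrative, and the rounding is an assumption about how the published values were presented.

```python
# Counts reported in the abstract.
total_bp = 11_014_359      # high-quality BAC-end sequence
bac_ends = 17_591          # number of BAC-end reads
ssr_count = 1_057          # simple sequence repeats found
te_junctions = 7_928       # TE junctions found
primer_pairs_tested = 96   # per marker type
isbp_specific = 28         # 3A-specific ISBP primer pairs
ssr_specific = 17          # 3A-specific SSR primer pairs

# Average BAC-end read length in bp.
mean_read_bp = total_bp / bac_ends

# Average spacing between features, in kb of surveyed sequence.
ssr_spacing_kb = total_bp / ssr_count / 1000
junction_spacing_kb = total_bp / te_junctions / 1000

# Share of tested primer pairs that were chromosome 3A-specific.
isbp_pct = 100 * isbp_specific / primer_pairs_tested
ssr_pct = 100 * ssr_specific / primer_pairs_tested

print(round(mean_read_bp), round(ssr_spacing_kb, 1),
      round(junction_spacing_kb, 2), round(isbp_pct), round(ssr_pct))
```

Note that the densities are expressed per kb of sequenced BAC ends, not per kb of the chromosome arm.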
This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequence was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping.
Genome size evolution is a complex process influenced by polyploidization, satellite DNA accumulation, and expansion of retroelements. How this process could be affected by different reproductive strategies is still poorly understood.
We analyzed differences in the number and distribution of major repetitive DNA elements in two closely related species, Silene latifolia and S. vulgaris. Both species are diploid and possess the same chromosome number (2n = 24), but differ in their genome size and mode of reproduction. The dioecious S. latifolia (1C = 2.70 pg DNA) possesses sex chromosomes and its genome is 2.5× larger than that of the gynodioecious S. vulgaris (1C = 1.13 pg DNA), which does not possess sex chromosomes. We discovered that the genome of S. latifolia is larger mainly due to the expansion of Ogre retrotransposons. Surprisingly, the centromeric STAR-C and TR1 tandem repeats were found to be more abundant in S. vulgaris, the species with the smaller genome. We further examined the distribution of major repetitive sequences in related species in the Caryophyllaceae family. The results of FISH (fluorescence in situ hybridization) on mitotic chromosomes with the Retand element indicate that large rearrangements occurred during the evolution of the Caryophyllaceae family.
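The 1C values above are DNA masses. A rough sketch of converting them to base pairs is shown here, assuming the commonly used factor of 1 pg of DNA ≈ 978 Mbp. That factor is not stated in the abstract, and the function name is hypothetical.

```python
# Assumption: 1 pg of DNA corresponds to about 978 Mbp. This is a widely used
# conversion factor that is not given in the abstract itself.
MBP_PER_PG = 978


def pg_to_mbp(picograms):
    """Convert a DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


latifolia_mbp = pg_to_mbp(2.70)  # S. latifolia, 1C = 2.70 pg
vulgaris_mbp = pg_to_mbp(1.13)   # S. vulgaris, 1C = 1.13 pg

# Extra DNA in the larger genome, the part attributed mainly to Ogre elements.
extra_mbp = latifolia_mbp - vulgaris_mbp

print(round(latifolia_mbp), round(vulgaris_mbp), round(extra_mbp))
```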
Our data demonstrate that the evolution of genome size in the genus Silene is accompanied by the expansion of different repetitive elements with specific patterns in the dioecious species possessing the sex chromosomes.
The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide.
Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes of a 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp of sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with an estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites, mostly located in gene-related sequence reads, were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to that of the barley chromosome arm 1HS revealed high conservation of genes related to chromosome 5 of rice.
The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye.
This study evaluates the potential of flow cytometry for chromosome sorting in two wild diploid wheats Aegilops umbellulata and Ae. comosa and their natural allotetraploid hybrids Ae. biuncialis and Ae. geniculata. Flow karyotypes obtained after the analysis of DAPI-stained chromosomes were characterized and content of chromosome peaks was determined. Peaks of chromosome 1U could be discriminated in flow karyotypes of Ae. umbellulata and Ae. biuncialis and the chromosome could be sorted with purities exceeding 95%. The remaining chromosomes formed composite peaks and could be sorted in groups of two to four. Twenty four wheat SSR markers were tested for their position on chromosomes of Ae. umbellulata and Ae. comosa using PCR on DNA amplified from flow-sorted chromosomes and genomic DNA of wheat-Ae. geniculata addition lines, respectively. Six SSR markers were located on particular Aegilops chromosomes using sorted chromosomes, thus confirming the usefulness of this approach for physical mapping. The SSR markers are suitable for marker assisted selection of wheat-Aegilops introgression lines. The results obtained in this work provide new opportunities for dissecting genomes of wild relatives of wheat with the aim to assist in alien gene transfer and discovery of novel genes for wheat improvement.
Wheat is one of the world's most important crops and is characterized by a large polyploid genome. One way to reduce genome complexity is to isolate single chromosomes using flow cytometry. Low coverage DNA sequencing can provide a snapshot of individual chromosomes, allowing a fast characterization of their main features and comparison with other genomes. We used massively parallel 454 pyrosequencing to obtain a 2x coverage of wheat chromosome 5A. The resulting sequence assembly was used to identify TEs, genes and miRNAs, as well as to infer a virtual gene order based on the synteny with other grass genomes. Repetitive elements account for more than 75% of the genome. Gene content was estimated considering non-redundant reads showing at least one match to ESTs or proteins. The results indicate that the coding fraction represents 1.08% and 1.3% of the short and long arm respectively, projecting the number of genes of the whole chromosome to approximately 5,000. 195 candidate miRNA precursors belonging to 16 miRNA families were identified. The 5A genes were used to search for syntenic relationships between grass genomes. The short arm is closely related to Brachypodium chromosome 4, sorghum chromosome 8 and rice chromosome 12; the long arm to regions of Brachypodium chromosomes 4 and 1, sorghum chromosomes 1 and 2 and rice chromosomes 9 and 3. From these similarities it was possible to infer the virtual gene order of 392 (5AS) and 1,480 (5AL) genes of chromosome 5A, which was compared to, and found to be largely congruent with the available physical map of this chromosome.
The classification of the Musaceae (banana) family species and their phylogenetic inter-relationships remain controversial, in part due to limited nucleotide information to complement the morphological and physiological characters. In this work the evolutionary relationships within the Musaceae family were studied using 13 species and DNA sequences obtained from a set of 19 unlinked nuclear genes.
The 19 gene sequences represented a sample of ~16 kb of genome sequence (~73% intronic). The sequence data were also used to obtain estimates for the divergence times of the Musaceae genera and Musa sections. Nucleotide variation within the sample confirmed the close relationship of Australimusa and Callimusa sections and showed that Eumusa and Rhodochlamys sections are not reciprocally monophyletic, which supports the previous claims for the merger between the two latter sections. Divergence time analysis supported the previous dating of the Musaceae crown age to the Cretaceous/Tertiary boundary (~ 69 Mya), and the evolution of Musa to ~50 Mya. The first estimates for the divergence times of the four Musa sections were also obtained.
The gene sequence-based phylogeny presented here provides a substantial insight into the course of speciation within the Musaceae. An understanding of the main phylogenetic relationships between banana species will help to fine-tune the taxonomy of Musaceae.
Genes coding for 45S ribosomal RNA are organized in tandem arrays of up to several thousand copies and contain 18S, 5.8S and 26S rRNA units separated by internal transcribed spacers ITS1 and ITS2. While the rRNA units are evolutionarily conserved, the ITS show a high level of interspecific divergence and have been used frequently in genetic diversity and phylogenetic studies. In this work we report on the structure and diversity of the ITS region in 87 representatives of the family Musaceae. We provide the first detailed information on ITS sequence diversity in the genus Musa and describe the presence of more than one type of ITS sequence within individual species. Both Sanger sequencing of amplified ITS regions and whole genome 454 sequencing led to similar phylogenetic inferences. We show that it is necessary to identify putative pseudogenic ITS sequences, which may have a negative effect on phylogenetic reconstruction at lower taxonomic levels. Phylogenetic reconstruction based on ITS sequence showed that the genus Musa is divided into two distinct clades: Callimusa and Australimusa, and Eumusa and Rhodochlamys. Most of the intraspecific banana hybrids analyzed contain conserved parental ITS sequences, indicating incomplete concerted evolution of rDNA loci. Independent evolution of parental rDNA in hybrids enables determination of genomic constitution of hybrids using ITS. The observation of only one type of ITS sequence in some of the presumed interspecific hybrid clones warrants further study to confirm their hybrid origin and to unravel processes leading to evolution of their genomes.
The centromeric and pericentromeric regions of plant chromosomes are colonized by Ty3/gypsy retrotransposons, which, on the basis of their reverse transcriptase sequences, form the chromovirus CRM clade. Despite their potential importance for centromere evolution and function, they have remained poorly characterized. In this work, we aimed to carry out a comprehensive survey of CRM clade elements with an emphasis on their diversity, structure, chromosomal distribution and transcriptional activity.
We have surveyed a set of 190 CRM elements belonging to 81 different retrotransposon families, derived from 33 host species and falling into 12 plant families. The sequences at the C-terminus of their integrases were unexpectedly heterogeneous, despite the understanding that they are responsible for targeting to the centromere. This variation allowed the division of the CRM clade into the three groups A, B and C, and the members of each differed considerably with respect to their chromosomal distribution. The differences in chromosomal distribution coincided with variation in the integrase C-terminus sequences possessing a putative targeting domain (PTD). A majority of the group A elements possess the CR motif and are concentrated in the centromeric region, while members of group C have the type II chromodomain and are dispersed throughout the genome. Although representatives of the group B lack a PTD of any type, they appeared to be localized preferentially in the centromeres of tested species. All tested elements were found to be transcriptionally active.
Comprehensive analysis of the CRM clade elements showed that genuinely centromeric retrotransposons represent only a fraction of the CRM clade (group A). These centromeric retrotransposons represent an active component of centromeres of a wide range of angiosperm species, implying that they play an important role in plant centromere evolution. In addition, their transcriptional activity is consistent with the notion that the transcription of centromeric retrotransposons has a role in normal centromere function.
Establishment of a standardized platform for genotyping banana (Musa spp.) using a set of previously published SSR markers is described. The platform will serve a broad Musa research and breeding community and support the conservation and use of genetic diversity.
Background and aims
Bananas and plantains (Musa spp.) are one of the major fruit crops worldwide with acknowledged importance as a staple food for millions of people. The rich genetic diversity of this crop is, however, endangered by diseases, adverse environmental conditions and changed farming practices, and the need for its characterization and preservation is urgent. With the aim of providing a simple and robust approach for molecular characterization of Musa species, we developed an optimized genotyping platform using 19 published simple sequence repeat markers.
The genotyping system is based on 19 microsatellite loci, which are scored using fluorescently labelled primers and high-throughput capillary electrophoresis separation with high resolution. This genotyping platform was tested and optimized on a set of 70 diploid and 38 triploid banana accessions.
The marker set used in this study provided enough polymorphism to discriminate between individual species, subspecies and subgroups of all accessions of Musa. Likewise, the capability of identifying duplicate samples was confirmed. Based on the results of a blind test, the genotyping system was confirmed to be suitable for characterization of unknown accessions.
Here we report on the first complex and standardized platform for molecular characterization of Musa germplasm that is ready to use for the wider Musa research and breeding community. We believe that this genotyping system offers a versatile tool that can accommodate all possible requirements for characterizing Musa diversity, and is economical for samples ranging from one to many accessions.
Positional cloning in bread wheat is a tedious task due to its huge genome size and hexaploid character. BAC libraries represent an essential tool for positional cloning. However, wheat BAC libraries comprise more than a million clones, which makes their screening very laborious. Here, we present a targeted approach based on chromosome-specific BAC libraries. Such libraries were constructed from flow-sorted arms of wheat chromosome 7D. A library from the short arm (7DS) consisting of 49,152 clones with 113 kb insert size represented 12.1 arm equivalents, whereas a library from the long arm (7DL) comprised 50,304 clones of 116 kb providing 14.9x arm coverage. The 7DS library was PCR screened with markers linked to Russian wheat aphid resistance gene DnCI2401; the 7DL library was screened by hybridization with a probe linked to greenbug resistance gene Gb3. The small number of clones combined with high coverage made the screening highly efficient and cost effective.
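To illustrate why such coverage makes screening efficient, the following sketch estimates the chance that a given locus is present in at least one clone, using the classical Clarke-Carbon relation P = 1 - (1 - f)^N. It takes the reported coverage at face value and assumes idealized random cloning with no bias or contamination, which is a simplification not claimed in the abstract. The function name is hypothetical.

```python
import math


def hit_probability(n_clones, coverage):
    """Probability that a locus is in at least one clone (Clarke-Carbon).

    Assumes clones sample the arm uniformly at random; `coverage` is the
    number of arm equivalents represented by the whole library.
    """
    # Fraction of the arm carried by a single average clone.
    f = coverage / n_clones
    return 1 - (1 - f) ** n_clones


p_7ds = hit_probability(49_152, 12.1)  # 7DS library, 12.1 arm equivalents
p_7dl = hit_probability(50_304, 14.9)  # 7DL library, 14.9x coverage

# For large libraries this is close to the Poisson approximation 1 - exp(-c).
approx_7ds = 1 - math.exp(-12.1)

print(p_7ds, p_7dl, approx_7ds)
```

Under these assumptions both libraries should contain almost any locus on the respective arm, while requiring only about 50,000 clones to be screened instead of more than a million.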
Bananas and plantains (Musa spp.) provide a staple food for many millions of people living in the humid tropics. The cultivated varieties (cultivars) are seedless parthenocarpic clones of which the origin remains unclear. Many are believed to be diploid and polyploid hybrids involving the A genome diploid M. acuminata and the B genome M. balbisiana, with the hybrid genomes consisting of a simple combination of the parental ones. Thus the genomic constitution of the diploids has been classified as AB, and that of the triploids as AAB or ABB. However, the morphology of many accessions is biased towards either the A or B phenotype and does not conform to predictions based on these genomic formulae.
On the basis of published cytotypes (mitochondrial and chloroplast genomes), we speculate here that the hybrid banana genomes are unbalanced with respect to the parental ones, and/or that inter-genome translocation chromosomes are relatively common. We hypothesize that the evolution under domestication of cultivated banana hybrids is more likely to have passed through an intermediate hybrid, which was then involved in a variety of backcrossing events. We present experimental data supporting our hypothesis and we propose a set of experimental approaches to test it, thereby indicating other possibilities for explaining some of the unbalanced genome expressions. Progress in this area would not only throw more light on the origin of one of the most important crops, but provide data of general relevance for the evolution under domestication of many other important clonal crops. At the same time, a complex origin of the cultivated banana hybrids would imply a reconsideration of current breeding strategies.
Backcrossing; banana; breeding; genotype; hybrids; Musa
Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally, which makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers the breeding of improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of the nuclear genome and the development of molecular tools to facilitate banana improvement.
In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones.
Low-depth 454 sequencing of the banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.
The evolution of sex chromosomes is often accompanied by gene or chromosome rearrangements. Recently, the gene AP3 was characterized in the dioecious plant species Silene latifolia. It was suggested that this gene had been transferred from an autosome to the Y chromosome.
In the present study, we provide evidence for the existence of an X-linked copy of the AP3 gene. We further show that the Y copy is probably located in a chromosomal region where recombination restriction occurred during the first steps of sex chromosome evolution. A comparison of X and Y copies did not reveal any clear signs of degenerative processes in exon regions. Instead, both X and Y copies show evidence for relaxed selection compared to the autosomal orthologues in S. vulgaris and S. conica. We further found that promoter sequences differ significantly. Comparison of the genic region of AP3 between the X and Y alleles and the corresponding autosomal copies in the gynodioecious species S. vulgaris revealed a massive accumulation of retrotransposons within one intron of the Y copy of AP3. Analysis of the genomic distribution of these repetitive elements does not indicate that they played an important role in the size increase characteristic of the Y chromosome. However, in silico expression analysis shows biased expression of individual domains of the identified retroelements in male plants.
We characterized the structure and evolution of AP3, a sex-linked gene with copies on the X and Y chromosomes in the dioecious plant S. latifolia. These copies showed complementary expression patterns and relaxed evolution at the protein level compared to autosomal orthologues, which suggests subfunctionalization. One intron of the Y-linked allele was invaded by retrotransposons whose sex-specific expression patterns are similar to the expression pattern of the corresponding allele, which suggests that these transposable elements may have influenced the evolution of expression patterns of the Y copy. These data could help researchers decipher the role of transposable elements in degenerative processes during sex chromosome evolution.
The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is unknown whether this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and would therefore be included in common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy.
The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average, 97.78% or more of the clones, depending on the library, were from a single chromosome arm. A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries.
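One way to picture the fidelity estimate is to label every clone in a contig with its library of origin and ask what fraction belongs to the dominant arm. The sketch below is only an illustration of that idea: the contig names and clone labels are invented toy data, the per-contig averaging is an assumption, and none of this is presented as the procedure or the data used in the study.

```python
from collections import Counter


def contig_purity(arm_labels):
    """Return (dominant arm, fraction of clones from that arm) for one contig."""
    if not arm_labels:
        raise ValueError("contig has no clones")
    counts = Counter(arm_labels)
    dominant_arm, dominant_count = counts.most_common(1)[0]
    return dominant_arm, dominant_count / len(arm_labels)


def mean_purity(contigs):
    """Unweighted mean of per-contig purity (an assumed way of averaging)."""
    purities = [contig_purity(labels)[1] for labels in contigs.values()]
    return sum(purities) / len(purities)


# Hypothetical toy contigs: each list holds the library of origin of every clone.
toy_contigs = {
    "ctg1": ["3AS"] * 9 + ["3DS"],   # one clone from the homoeologous arm
    "ctg2": ["3B"] * 20,             # pure contig
    "ctg3": ["3DS"] * 4 + ["3B"],    # one foreign clone
}

for name, labels in toy_contigs.items():
    arm, purity = contig_purity(labels)
    print(f"{name}: dominant arm {arm}, purity {purity:.0%}")
print(f"mean purity: {mean_purity(toy_contigs):.0%}")
```

A clone-weighted average would be an equally reasonable choice; the abstract does not say which was used.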
The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown, but possible, that equally good BAC contigs can also be constructed for polyploid species containing smaller, more gene-rich genomes.
Grasses are among the most important and widely cultivated plants on Earth. They provide high-quality fodder for livestock, are used for turf and amenity purposes, and play a fundamental role in environmental protection. Among cultivated grasses, species within the Festuca-Lolium complex predominate, especially in temperate regions. To facilitate high-throughput genome profiling and genetic mapping within the complex, we have developed a Diversity Arrays Technology (DArT) array for five grass species: F. pratensis, F. arundinacea, F. glaucescens, L. perenne and L. multiflorum.
The DArTFest array contains 7680 probes derived from methyl-filtered genomic representations. In a first marker discovery experiment performed on 40 genotypes from each species (with the exception of F. glaucescens, for which only 7 genotypes were used), we identified 3884 polymorphic markers. The number of DArT markers identified in a single genotype varied from 821 to 1852. To test the usefulness of the DArTFest array for physical mapping, DArT markers were assigned to each of the seven chromosomes of F. pratensis using single chromosome substitution lines, while recombinants of F. pratensis chromosome 3 were used to allocate the markers to seven chromosome bins.
The resources developed in this project will facilitate the development of genetic maps in Festuca and Lolium, the analysis of genetic diversity, and the monitoring of the genomic constitution of Festuca × Lolium hybrids. They will also enable marker-assisted selection for multiple traits or for specific genome regions.
Background and Aims
After the initial boom in the application of flow cytometry in plant sciences in the late 1980s and early 1990s, which was accompanied by development of many nuclear isolation buffers, only a few efforts were made to develop new buffer formulas. In this work, recent data on the performance of nuclear isolation buffers are utilized in order to develop new buffers, general purpose buffer (GPB) and woody plant buffer (WPB), for plant DNA flow cytometry.
GPB and WPB were used to prepare samples for flow cytometric analysis of nuclear DNA content in a set of 37 plant species that included herbaceous and woody taxa with leaf tissues differing in structure and chemical composition. The following parameters of isolated nuclei were assessed: forward and side light scatter, propidium iodide fluorescence, coefficient of variation of DNA peaks, quantity of debris background, and the number of particles released from sample tissue. The nuclear genome size of 30 selected species was also estimated using the buffer that performed better for a given species.
In unproblematic species, the use of both buffers resulted in high-quality samples. The analysis of samples obtained with GPB usually resulted in histograms of DNA content with resolution higher than or similar to that of histograms prepared with WPB. In more recalcitrant tissues, such as those from woody plants, WPB performed better and GPB failed to provide acceptable results in some cases. Improved resolution of DNA content histograms in comparison with previously published buffers was achieved in most of the species analysed.
WPB is a reliable buffer which is also suitable for the analysis of problematic tissues/species. Although GPB failed with some plant species, it provided high-quality DNA histograms in species from which nuclear suspensions are easy to prepare. The results indicate that even with a broad range of species, either GPB or WPB is suitable for preparing high-quality suspensions of intact nuclei for DNA flow cytometry.
Cytosolic compounds; flow cytometry; general purpose buffer; genome size; lysis buffers; nuclear DNA content; nuclear DNA staining; propidium iodide; woody plant buffer
Rye (Secale cereale L.) belongs to the tribe Triticeae and is an important temperate cereal. It is one of the parents of the man-made species Triticale and has been used as a source of agronomically important genes for wheat improvement. The short arm of rye chromosome 1 (1RS), in particular, is rich in useful genes, and as it may increase yield, protein content and resistance to biotic and abiotic stress, it has been introgressed into wheat as the 1BL.1RS translocation. A better knowledge of the rye genome could facilitate rye improvement and increase the efficiency of utilizing rye genes in wheat breeding.
Here, we report on BAC end sequencing of 1,536 clones from two 1RS-specific BAC libraries. We obtained 2,778 (90.4%) useful sequences with a cumulative length of 2,032,538 bp and an average read length of 732 bp. These sequences represent 0.5% of the 1RS arm. The GC content of the sequenced fraction of 1RS is 45.9%, and at least 84% of the 1RS arm consists of repetitive DNA. We identified transposable element junctions in BAC end sequences (BESs) and developed insertion site-based polymorphism (ISBP) markers. Out of the 64 primer pairs tested, 17 (26.6%) were specific for 1RS. We also identified BESs carrying microsatellites suitable for development of 1RS-specific SSR markers.
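The summary percentages above can be reproduced from the raw counts. This is a minimal sketch, assuming that each clone was sequenced once from each end (so 1,536 clones give 3,072 attempted reads); that assumption is not stated in the abstract but is consistent with the quoted 90.4%. The function name and dictionary keys are illustrative.

```python
# Counts quoted in the abstract.
CLONES = 1536
USEFUL_READS = 2778
TOTAL_BP = 2_032_538
PRIMER_PAIRS_TESTED = 64
PRIMER_PAIRS_SPECIFIC = 17


def bes_summary(clones, useful_reads, total_bp):
    """Summarise a BAC end sequencing run from its raw counts."""
    attempted = 2 * clones  # assumption: one read from each end of every clone
    return {
        "attempted_reads": attempted,
        "success_pct": round(100 * useful_reads / attempted, 1),
        "mean_read_bp": round(total_bp / useful_reads),
    }


summary = bes_summary(CLONES, USEFUL_READS, TOTAL_BP)
specific_pct = round(100 * PRIMER_PAIRS_SPECIFIC / PRIMER_PAIRS_TESTED, 1)

print(summary)
print(f"1RS-specific primer pairs: {specific_pct}%")
```

Under that assumption, the computed success rate, mean read length and marker specificity agree with the 90.4%, 732 bp and 26.6% reported in the text.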
This work demonstrates the utility of chromosome arm-specific BAC libraries for targeted analysis of large Triticeae genomes and provides new sequence data from the rye genome and molecular markers for the short arm of rye chromosome 1.