Results 1-7 (7)
 

1.  Genome-Wide Analysis of Syntenic Gene Deletion in the Grasses 
Genome Biology and Evolution  2012;4(3):265-277.
The grasses, Poaceae, are one of the largest and most successful angiosperm families. Like many radiations of flowering plants, the divergence of the major grass lineages was preceded by a whole-genome duplication (WGD), although these events are not rare for flowering plants. By combining identification of syntenic gene blocks with measures of gene pair divergence and different frequencies of ancient gene loss, we have separated the two subgenomes present in modern grasses. Reciprocal loss of duplicated genes or genomic regions has been hypothesized to reproductively isolate populations and, thus, to drive speciation. However, in contrast to previous studies in yeast and teleost fishes, we found very little evidence of reciprocal loss of homeologous genes between the grasses, suggesting that post-WGD gene loss may not be the cause of the grass radiation. The sets of homeologous and orthologous genes and predicted locations of deleted genes identified in this study, as well as links to the CoGe comparative genomics web platform for analyzing pan-grass syntenic regions, are provided along with this paper as a resource for the grass genetics community.
doi:10.1093/gbe/evs009
PMCID: PMC3318446  PMID: 22275519
polyploidy; gene loss; synteny; Poaceae; speciation
2.  Escape from Preferential Retention Following Repeated Whole Genome Duplications in Plants 
The well supported gene dosage hypothesis predicts that genes encoding proteins engaged in dose–sensitive interactions cannot be reduced back to single copies once all interacting partners are simultaneously duplicated in a whole genome duplication. The genomes of extant flowering plants are the result of many sequential rounds of whole genome duplication, yet the fraction of genomes devoted to encoding complex molecular machines does not increase as fast as expected through multiple rounds of whole genome duplications. Using parallel interspecies genomic comparisons in the grasses and crucifers, we demonstrate that genes retained as duplicates following a whole genome duplication have only a 50% chance of being retained as duplicates in a second whole genome duplication. Genes which fractionated to a single copy following a second whole genome duplication tend to be the member of a gene pair with less complex promoters and lower levels of expression, and tend to be under lower levels of purifying selection. We suggest the copy with lower levels of expression and less purifying selection contributes less to effective gene-product dosage and therefore is under less dosage constraint in future whole genome duplications, providing an explanation for why flowering plant genomes are not overrun with subunits of large dose–sensitive protein complexes.
doi:10.3389/fpls.2012.00094
PMCID: PMC3355610  PMID: 22639677
polyploidy; gene dosage; gene loss; genome evolution; comparative genomics; crucifers; grasses
3.  Heritable Epigenetic Variation among Maize Inbreds 
PLoS Genetics  2011;7(11):e1002372.
Epigenetic variation describes heritable differences that are not attributable to changes in DNA sequence. There is the potential for pure epigenetic variation that occurs in the absence of any genetic change or for more complex situations that involve both genetic and epigenetic differences. Methylation of cytosine residues provides one mechanism for the inheritance of epigenetic information. A genome-wide profiling of DNA methylation in two different genotypes of Zea mays (ssp. mays), an organism with a complex genome of interspersed genes and repetitive elements, allowed the identification and characterization of examples of natural epigenetic variation. The distribution of DNA methylation was profiled using immunoprecipitation of methylated DNA followed by hybridization to a high-density tiling microarray. The comparison of the DNA methylation levels in the two genotypes, B73 and Mo17, allowed for the identification of approximately 700 differentially methylated regions (DMRs). Several of these DMRs occur in genomic regions that are apparently identical by descent in B73 and Mo17 suggesting that they may be examples of pure epigenetic variation. The methylation levels of the DMRs were further studied in a panel of near-isogenic lines to evaluate the stable inheritance of the methylation levels and to assess the contribution of cis- and trans- acting information to natural epigenetic variation. The majority of DMRs that occur in genomic regions without genetic variation are controlled by cis-acting differences and exhibit relatively stable inheritance. This study provides evidence for naturally occurring epigenetic variation in maize, including examples of pure epigenetic variation that is not conditioned by genetic differences. The epigenetic differences are variable within maize populations and exhibit relatively stable trans-generational inheritance. 
The detected examples of epigenetic variation, including some without tightly linked genetic variation, may contribute to complex trait variation.
Author Summary
Heritable variation within a species provides the basis for natural and artificial selection. A substantial portion of heritable variation is based on alterations in DNA sequence among individuals and is termed genetic variation. There is also evidence for epigenetic variation, which refers to heritable differences that are not caused by DNA sequence changes. Methylation of cytosine residues provides one molecular mechanism for epigenetic variation in many eukaryotic species. The genome-wide distribution of DNA methylation was assessed in two different inbred genotypes of maize to identify differentially methylated regions that may contribute to epigenetic variation. There are hundreds of genomic regions that have differences in DNA methylation levels in these two different genotypes, including methylation differences in regions without genetic variation. By studying the inheritance of the differential methylation in near-isogenic progeny of the two inbred lines, it is possible to demonstrate relatively stable inheritance of epigenetic variation, even in the absence of DNA sequence changes. The epigenetic variation among individuals of the same species may provide important contributions to phenotypic variation within a species even in the absence of genetic differences.
doi:10.1371/journal.pgen.1002372
PMCID: PMC3219600  PMID: 22125494
4.  Screening synteny blocks in pairwise genome comparisons through integer programming 
BMC Bioinformatics  2011;12:102.
Background
It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events.
Results
We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons).
Conclusions
The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
doi:10.1186/1471-2105-12-102
PMCID: PMC3088904  PMID: 21501495
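The quota-based screening described in this abstract can be illustrated with a minimal sketch. This is not the QUOTA-ALIGN implementation, which the abstract says relies on existing linear programming solvers; the sketch below simply enumerates every 0/1 assignment, which is only practical for a handful of blocks. The block names, scores, coordinates, and the function names `max_depth` and `screen_blocks` are all hypothetical. Each block gets a binary decision variable, the objective is the total score of the chosen blocks, and the constraints cap how many chosen blocks may cover any position on each genome.

```python
from itertools import product

# Hypothetical toy blocks: (name, score, interval on genome A, interval on genome B)
BLOCKS = [
    ("b1", 90, (0, 10), (0, 10)),
    ("b2", 70, (0, 10), (20, 30)),
    ("b3", 40, (0, 10), (40, 50)),
    ("b4", 30, (12, 20), (60, 70)),
]


def max_depth(intervals):
    """Return the largest number of closed intervals sharing any one position."""
    events = []
    for start, end in intervals:
        events.append((start, 0))  # 0 = open; sorts before a close at the same position
        events.append((end, 1))    # 1 = close
    events.sort()
    depth = best = 0
    for _, kind in events:
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best


def screen_blocks(blocks, max_depth_a, max_depth_b):
    """Pick the highest-scoring subset of blocks that respects both depth limits.

    Exhaustive search over all 0/1 assignments (assumption: few blocks).
    """
    best_score, best_names = 0, []
    for choice in product((0, 1), repeat=len(blocks)):
        chosen = [b for b, x in zip(blocks, choice) if x]
        if max_depth([b[2] for b in chosen]) > max_depth_a:
            continue  # too many chosen blocks stacked on genome A
        if max_depth([b[3] for b in chosen]) > max_depth_b:
            continue  # too many chosen blocks stacked on genome B
        score = sum(b[1] for b in chosen)
        if score > best_score:
            best_score, best_names = score, [b[0] for b in chosen]
    return best_score, best_names


# Genome B had one extra WGD, so a region of A may match up to two regions of B.
print(screen_blocks(BLOCKS, max_depth_a=2, max_depth_b=1))
```

In this toy input three blocks stack on the same region of genome A, so a depth limit of two on A forces the weakest of the three (`b3`) to be screened out, which mirrors the 1:2 ploidy example in the abstract.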
5.  Genes Identified by Visible Mutant Phenotypes Show Increased Bias toward One of Two Subgenomes of Maize 
PLoS ONE  2011;6(3):e17855.
Not all genes are created equal. Despite being supported by sequence conservation and expression data, knockout homozygotes of many genes show no visible effects, at least under laboratory conditions. We have identified a set of maize (Zea mays L.) genes which have been the subject of a disproportionate share of publications recorded at MaizeGDB. We manually anchored these “classical” maize genes to gene models in the B73 reference genome, and identified syntenic orthologs in other grass genomes. In addition to proofing the most recent version 2 maize gene models, we show that a subset of these genes, those that were identified by morphological phenotype prior to cloning, are retained at syntenic locations throughout the grasses at much higher levels than the average expressed maize gene, and are preferentially found on the maize1 subgenome even when a duplicate copy is still retained on the opposite subgenome. Maize1 is the subgenome that experienced less gene loss following the whole genome duplication in the maize lineage 5–12 million years ago and genes located on this subgenome tend to be expressed at higher levels in modern maize. Links to the web based software that supported our syntenic analyses in the grasses should empower further research and support teaching involving the history of maize genetic research. Our findings exemplify the concept of “grasses as a single genetic system,” where what is learned in one grass may be applied to another.
doi:10.1371/journal.pone.0017855
PMCID: PMC3053395  PMID: 21423772
6.  Dose–Sensitivity, Conserved Non-Coding Sequences, and Duplicate Gene Retention Through Multiple Tetraploidies in the Grasses 
Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose sensitive protein–protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein–protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose–sensitive protein–DNA interactions between the regulatory regions of CNS-rich genes – nicknamed bigfoot genes – and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy.
doi:10.3389/fpls.2011.00002
PMCID: PMC3355796  PMID: 22645525
conserved non-coding sequence; polyploidy; fractionation; gene dosage; gene regulation
7.  Following Tetraploidy in Maize, a Short Deletion Mechanism Removed Genes Preferentially from One of the Two Homeologs 
PLoS Biology  2010;8(6):e1000409.
Following genome duplication and selfish DNA expansion, maize used a heretofore unknown mechanism to shed redundant genes and functionless DNA with bias toward one of the parental genomes.
Previous work in Arabidopsis showed that after an ancient tetraploidy event, genes were preferentially removed from one of the two homeologs, a process known as fractionation. The mechanism of fractionation is unknown. We sought to determine whether such preferential, or biased, fractionation exists in maize and, if so, whether a specific mechanism could be implicated in this process. We studied the process of fractionation using two recently sequenced grass species: sorghum and maize. The maize lineage has experienced a tetraploidy since its divergence from sorghum approximately 12 million years ago, and fragments of many knocked-out genes retain enough sequence similarity to be easily identifiable. Using sorghum exons as the query sequence, we studied the fate of both orthologous genes in maize following the maize tetraploidy. We show that genes are predominantly lost, not relocated, and that single-gene loss by deletion is the rule. Based on comparisons with orthologous sorghum and rice genes, we also infer that the sequences present before the deletion events were flanked by short direct repeats, a signature of intra-chromosomal recombination. Evidence of this deletion mechanism is found 2.3 times more frequently on one of the maize homeologs, consistent with earlier observations of biased fractionation. The over-fractionated homeolog is also a greater than 3-fold better target for transposon removal, but does not have an observably higher synonymous base substitution rate, nor could we find differentially placed methylation domains. We conclude that fractionation is indeed biased in maize and that intra-chromosomal or possibly a similar illegitimate recombination is the primary mechanism by which fractionation occurs. The mechanism of intra-chromosomal recombination explains the observed bias in both gene and transposon loss in the maize lineage. The existence of fractionation bias demonstrates that the frequency of deletion is modulated. 
Among the evolutionary benefits of this deletion/fractionation mechanism is bulk DNA removal and the generation of novel combinations of regulatory sequences and coding regions.
Author Summary
All genomes can accumulate dispensable DNA in the form of duplications of individual genes or even partial or whole genome duplications. Genomes also can accumulate selfish DNA elements. Duplication events specifically are often followed by extensive gene loss. The maize genome is particularly extreme, having become tetraploid 10 million years ago and played host to massive transposon amplifications. We compared the genome of sorghum (which is homologous to the pre-tetraploid maize genome) with the two identifiable parental genomes retained in maize. The two maize genomes differ greatly: one of the parental genomes has lost 2.3 times more genes than the other, and the selfish DNA regions between genes were even more frequently lost, suggesting maize can distinguish between the parental genomes present in the original tetraploid. We show that genes are actually lost, not simply relocated. Deletions were rarely longer than a single gene, and occurred between repeated DNA sequences, suggesting mis-recombination as a mechanism of gene removal. We hypothesize an epigenetic mechanism of genome distinction to account for the selective loss. To the extent that the rate of base substitutions tracks time, we neither support nor refute claims of maize allotetraploidy. Finally, we explain why it makes sense that purifying selection in mammals does not operate at all like the gene and genome deletion program we describe here.
doi:10.1371/journal.pbio.1000409
PMCID: PMC2893956  PMID: 20613864
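The deletion signature mentioned in this abstract, a lost segment flanked by short direct repeats, can be shown with a small sketch. This is an illustration only and not the authors' pipeline: the function name `flanking_direct_repeat`, the `max_len` cutoff, and the example sequence are hypothetical. It assumes the pre-deletion sequence is known (the abstract infers it from orthologous sorghum and rice genes) and that a deletion between direct repeats removes one repeat copy plus the intervening sequence, so the bases at the start of the deleted span match the bases immediately after it.

```python
def flanking_direct_repeat(seq, del_start, del_end, max_len=20):
    """Return the direct repeat shared by the start of the deleted span
    seq[del_start:del_end] and the sequence immediately after it.

    An empty string means no direct repeat flanks the deletion.
    """
    k = 0
    while (
        k < max_len
        and del_start + k < del_end      # stay inside the deleted span
        and del_end + k < len(seq)       # stay inside the sequence
        and seq[del_start + k] == seq[del_end + k]
    ):
        k += 1
    return seq[del_start:del_start + k]


# Hypothetical ancestral sequence: prefix, repeat, spacer, repeat, suffix.
ancestral = "AAA" + "CGTGA" + "TTTTTTTT" + "CGTGA" + "CCC"
del_start, del_end = 3, 16  # removes the first repeat copy and the spacer

print(flanking_direct_repeat(ancestral, del_start, del_end))
print(ancestral[:del_start] + ancestral[del_end:])  # sequence left after the deletion
```

After the deletion only one copy of the repeat remains, which is why the signature has to be inferred by comparison with a sequence that still carries both copies.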
