Search tips
Search criteria

Results 1-4 (4)

Clipboard (0)
more »
Year of Publication
Document Types
1.  RNA-Seq Atlas of Glycine max: A guide to the soybean transcriptome 
BMC Plant Biology  2010;10:160.
Next generation sequencing is transforming our understanding of transcriptomes. It can determine the expression level of transcripts with a dynamic range of over six orders of magnitude from multiple tissues, developmental stages or conditions. Patterns of gene expression provide insight into functions of genes with unknown annotation.
The RNA Seq-Atlas presented here provides a record of high-resolution gene expression in a set of fourteen diverse tissues. Hierarchical clustering of transcriptional profiles for these tissues suggests three clades with similar profiles: aerial, underground and seed tissues. We also investigate the relationship between gene structure and gene expression and find a correlation between gene length and expression. Additionally, we find dramatic tissue-specific gene expression of both the most highly-expressed genes and the genes specific to legumes in seed development and nodule tissues. Analysis of the gene expression profiles of over 2,000 genes with preferential gene expression in seed suggests there are more than 177 genes with functional roles that are involved in the economically important seed filling process. Finally, the Seq-atlas also provides a means of evaluating existing gene model annotations for the Glycine max genome.
This RNA-Seq atlas extends the analyses of previous gene expression atlases performed using Affymetrix GeneChip technology and provides an example of new methods to accommodate the increase in transcriptome data obtained from next generation sequencing. Data contained within this RNA-Seq atlas of Glycine max can be explored at
PMCID: PMC3017786  PMID: 20687943
2.  Complementary genetic and genomic approaches help characterize the linkage group I seed protein QTL in soybean 
BMC Plant Biology  2010;10:41.
The nutritional and economic value of many crops is effectively a function of seed protein and oil content. Insight into the genetic and molecular control mechanisms involved in the deposition of these constituents in the developing seed is needed to guide crop improvement. A quantitative trait locus (QTL) on Linkage Group I (LG I) of soybean (Glycine max (L.) Merrill) has a striking effect on seed protein content.
A soybean near-isogenic line (NIL) pair contrasting in seed protein and differing in an introgressed genomic segment containing the LG I protein QTL was used as a resource to demarcate the QTL region and to study variation in transcript abundance in developing seed. The LG I QTL region was delineated to less than 8.4 Mbp of genomic sequence on chromosome 20. Using Affymetrix® Soy GeneChip and high-throughput Illumina® whole transcriptome sequencing platforms, 13 genes displaying significant seed transcript accumulation differences between NILs were identified that mapped to the 8.4 Mbp LG I protein QTL region.
This study identifies gene candidates at the LG I protein QTL for potential involvement in the regulation of protein content in the soybean seed. The results demonstrate the power of complementary approaches to characterize contrasting NILs and provide genome-wide transcriptome insight towards understanding seed biology and the soybean genome.
PMCID: PMC2848761  PMID: 20199683
3.  Highly syntenic regions in the genomes of soybean, Medicago truncatula, and Arabidopsis thaliana 
BMC Plant Biology  2005;5:15.
Recent genome sequencing enables mega-base scale comparisons between related genomes. Comparisons between animals, plants, fungi, and bacteria demonstrate extensive synteny tempered by rearrangements. Within the legume plant family, glimpses of synteny have also been observed. Characterizing syntenic relationships in legumes is important in transferring knowledge from model legumes to crops that are important sources of protein, fixed nitrogen, and health-promoting compounds.
We have uncovered two large soybean regions exhibiting synteny with M. truncatula and with a network of segmentally duplicated regions in Arabidopsis. In all, syntenic regions comprise over 500 predicted genes spanning 3 Mb. Up to 75% of soybean genes are colinear with M. truncatula, including one region in which 33 of 35 soybean predicted genes with database support are colinear to M. truncatula. In some regions, 60% of soybean genes share colinearity with a network of A. thaliana duplications. One region is especially interesting because this 500 kbp segment of soybean is syntenic to two paralogous regions in M. truncatula on different chromosomes. Phylogenetic analysis of individual genes within these regions demonstrates that one is orthologous to the soybean region, with which it also shows substantially denser synteny and significantly lower levels of synonymous nucleotide substitutions. The other M. truncatula region is inferred to be paralogous, presumably resulting from a duplication event preceding speciation.
The presence of well-defined M. truncatula segments showing orthologous and paralogous relationships with soybean allows us to explore the evolution of contiguous genomic regions in the context of ancient genome duplication and speciation events.
PMCID: PMC1201151  PMID: 16102170
4.  The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana 
BMC Plant Biology  2004;4:10.
Most genes in Arabidopsis thaliana are members of gene families. How do the members of gene families arise, and how are gene family copy numbers maintained? Some gene families may evolve primarily through tandem duplication and high rates of birth and death in clusters, and others through infrequent polyploidy or large-scale segmental duplications and subsequent losses.
Our approach to understanding the mechanisms of gene family evolution was to construct phylogenies for 50 large gene families in Arabidopsis thaliana, identify large internal segmental duplications in Arabidopsis, map gene duplications onto the segmental duplications, and use this information to identify which nodes in each phylogeny arose due to segmental or tandem duplication. Examples of six gene families exemplifying characteristic modes are described. Distributions of gene family sizes and patterns of duplication by genomic distance are also described in order to characterize patterns of local duplication and copy number for large gene families. Both gene family size and duplication by distance closely follow power-law distributions.
Combining information about genomic segmental duplications, gene family phylogenies, and gene positions provides a method to evaluate contributions of tandem duplication and segmental genome duplication in the generation and maintenance of gene families. These differences appear to correspond meaningfully to differences in functional roles of the members of the gene families.
PMCID: PMC446195  PMID: 15171794

Results 1-4 (4)