Since our previous NAR report (8), Gramene has tripled its number of complete reference genomes to 27. As shown in Supplementary Table S1, the species list broadens taxonomic representation and increases resolution with the inclusion of 14 monocots, 9 core eudicots and 4 primitive non-flowering plants, while serving both crop and model organism research communities. Notable additions to the monocot list include maize (Zea mays) and foxtail millet (Setaria italica), which along with Sorghum bicolor contribute to biofeedstock research owing to their C4 photosynthetic metabolism. Supporting wheat research, we added two diploid progenitor species, Triticum urartu and Aegilops tauschii, representing the AA and DD genome types, respectively. Until recently the monocot collection included only grasses (Poaceae). This changed with the addition of banana (Musa acuminata), among the first non-grass monocots to be sequenced.
We have more than doubled the number of core eudicots. The addition of two members of the Solanaceae, tomato (Solanum lycopersicum) and potato (S. tuberosum), brings the first asterids into this resource, thus broadening eudicot coverage beyond the rosid subclass. Soybean (Glycine max) and Medicago truncatula represent two ends of the spectrum within legumes and provide complementary resources for crop breeding and research. To broaden the base of the species tree, we now include aquatic algae (Cyanidioschyzon merolae and Chlamydomonas reinhardtii), an early land plant, the moss Physcomitrella patens, and an early vascular non-seed plant, the spikemoss Selaginella moellendorffii.
Although inclusion of basal species aids the investigation of early events in plant evolution, the study of rapidly evolving characteristics requires dense species representation within a shallower clade. In recent years, Gramene has addressed this need by building a rice-genus-level resource that now includes 13 of the estimated 24 species within the genus Oryza (Supplementary Table S1). In addition to the two subspecies of Asian cultivated rice, this resource includes complete reference assemblies for cultivated African rice Oryza glaberrima, its wild progenitor Oryza barthii, and the distantly related wild species Oryza punctata and Oryza brachyantha. An additional eight Oryza species, including one polyploid, plus the outgroup species Leersia perrieri, are available as chromosome 3 short-arm assemblies and were contributed through collaboration with the NSF-funded Oryza Map Alignment Project (OMAP) and Oryza Genome Evolution (OGE) projects (http://www.genome.arizona.edu/modules/publisher/item.php?itemid=7). In the coming year, many of these will be replaced with complete reference assemblies provided through various international consortia.
Gramene performs baseline annotation comprising repeat-sequence identification, EST/mRNA alignment and ab initio gene prediction (8). The community-recognized gene annotations are characterized for InterPro domains and cross-referenced to entries in third-party databases. Functional information is assigned using ontologies (Supplementary Table S2) through a variety of methods (10), which now include projection from one species to another using Compara gene ortholog assignments.
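
The idea of projecting functional annotation across orthologs can be illustrated with a minimal sketch. This is not the Gramene or Ensembl Compara pipeline itself: the function name `project_annotations`, the gene identifiers and the ontology term IDs are all hypothetical, and the restriction to one-to-one orthologs is an assumption made here to keep the example conservative. The homology-type labels are modeled on Compara-style categories.

```python
from collections import defaultdict


def project_annotations(source_terms, orthologs, allowed_types=("ortholog_one2one",)):
    """Copy ontology terms from annotated source genes to their orthologs.

    source_terms: dict mapping a source-species gene ID to a list of term IDs.
    orthologs: iterable of (source_gene, target_gene, homology_type) tuples.
    allowed_types: homology types accepted for projection (an assumption of
        this sketch; a real pipeline may apply different filters).
    """
    projected = defaultdict(set)
    for source_gene, target_gene, homology_type in orthologs:
        if homology_type not in allowed_types:
            continue  # skip ambiguous relationships such as one-to-many
        for term in source_terms.get(source_gene, ()):
            projected[target_gene].add(term)
    # Sort terms so the output is deterministic
    return {gene: sorted(terms) for gene, terms in projected.items()}


# Hypothetical toy data: placeholder gene and term identifiers
source_terms = {
    "geneA_src": ["GO:0000001", "GO:0000002"],
    "geneB_src": ["GO:0000003"],
}
orthologs = [
    ("geneA_src", "geneA_tgt", "ortholog_one2one"),
    ("geneB_src", "geneB_tgt", "ortholog_one2many"),
]

result = project_annotations(source_terms, orthologs)
print(result)
```

In this toy example only `geneA_tgt` receives projected terms, because the `geneB` pair is not a one-to-one ortholog. In practice, projected terms would normally be distinguished from experimentally supported ones, for example by a separate evidence label.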