The eastern North American monarch butterfly (Danaus plexippus) undergoes a spectacular long-distance migration in the fall. The monarch has emerged as an excellent model for investigating the general molecular and neural basis of long-distance migration (1). The remarkable navigational capabilities of monarchs are part of a genetic program that is initiated in migrants; the butterflies that travel south to Mexico are at least two generations away from the previous generation of fall migrants (3). Fundamental to decoding the genetic basis of the long-distance migration has been the construction of the draft sequence of the monarch genome (4).
The monarch genome and its transcriptome were sequenced de novo using next-generation sequencing technologies (4). The difficulty of assembling the genome from wild-caught butterflies with potentially high heterozygosity was overcome, allowing the construction of the initial version of the monarch genome assembly (v1), which consisted of 273 Mb with 16 866 protein-coding genes (4).
Although the original assembly was quite complete in terms of gene coverage, its quality was limited by small scaffold size (N50 of 53 kb) and high redundancy (~10%). By implementing new assembly strategies and new libraries, these difficulties have been largely overcome, resulting in a substantial improvement of the monarch butterfly assembly (named v3): 90% of the 249 Mb of assembled sequence is now represented by 366 major scaffolds whose minimum length is 160 kb. The improved organization of the monarch genome should allow more precise annotation work. Furthermore, it provides a high-quality reference that will facilitate future population genetic studies. For example, researchers can now re-sequence other monarch populations or non-migratory Danaus species to help identify migratory genes.
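The scaffold statistics quoted above follow the usual Nx convention for describing assembly contiguity: Nx is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover x% of the assembled sequence. Read this way, the v3 figures correspond to roughly 366 scaffolds covering 90% of the assembly, the shortest of them 160 kb long. The following is a minimal Python sketch of that calculation, assuming this common definition; the function name `nx` and the toy scaffold lengths are illustrative only and are not taken from the monarch assembly pipeline.

```python
def nx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a collection of scaffold lengths.

    Nx: length of the shortest scaffold in the smallest set of longest
        scaffolds whose combined length reaches `fraction` of the total.
    Lx: number of scaffolds in that set.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    target = fraction * sum(ordered)         # sequence that must be covered

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count


# Hypothetical toy assembly (arbitrary units), not real monarch data.
toy_scaffolds = [500, 400, 300, 200, 100]

n50, l50 = nx(toy_scaffolds, 0.5)
n90, l90 = nx(toy_scaffolds, 0.9)
print(f"N50 = {n50} (L50 = {l50}); N90 = {n90} (L90 = {l90})")
```

For a real assembly, the lengths would be read from the scaffold FASTA file; a larger Nx together with a smaller Lx indicates a more contiguous assembly.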
MonarchBase was developed as a public database providing ready access to the monarch genome, its proteome and related biological processes. The growing amount of genomic data and its continuous qualitative improvement necessitated a centralized database to coordinate the inflow of monarch genomic resources. Compared with public data repositories, organism-specific databases provide the community with specialized data sets, powerful retrieval interfaces, a platform for extensive biological interpretation and a site for integrating a variety of previously dispersed data types. MonarchBase serves not only researchers interested in monarch butterfly biology and the biology of migration but also the wider lepidopteran community. We report here the development of MonarchBase, its components, and the latest version of the monarch genome assembly and its corresponding gene set.