Relatively little is known about the genomic basis and evolution of wood-feeding in beetles. We undertook genome sequencing and annotation, gene expression assays, studies of plant cell wall degrading enzymes, and other functional and comparative studies of the Asian longhorned beetle, Anoplophora glabripennis, a globally significant invasive species capable of inflicting severe feeding damage on many important tree species. Complementary studies of genes encoding enzymes involved in digestion of woody plant tissues or detoxification of plant allelochemicals were undertaken with the genomes of 14 additional insects, including the newly sequenced emerald ash borer and bull-headed dung beetle.
The Asian longhorned beetle genome encodes a uniquely diverse arsenal of enzymes that can degrade the main polysaccharide networks in plant cell walls, detoxify plant allelochemicals, and otherwise facilitate feeding on woody plants. It has the metabolic plasticity needed to feed on diverse plant species, contributing to its highly invasive nature. Large expansions of chemosensory genes involved in the reception of pheromones and plant kairomones are consistent with the complexity of chemical cues it uses to find host plants and mates.
Amplification and functional divergence of genes associated with specialized feeding on plants, including genes originally obtained via horizontal gene transfer from fungi and bacteria, contributed to the addition, expansion, and enhancement of the metabolic repertoire of the Asian longhorned beetle, certain other phytophagous beetles, and to a lesser degree, other phytophagous insects. Our results thus begin to establish a genomic basis for the evolutionary success of beetles on plants.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-016-1088-8) contains supplementary material, which is available to authorized users.
Chemoperception; Detoxification; Glycoside hydrolase; Horizontal gene transfer; Phytophagy; Xylophagy
The Mediterranean fruit fly (medfly), Ceratitis capitata, is a major destructive insect pest due to its broad host range, which includes hundreds of fruits and vegetables. It exhibits a unique ability to invade and adapt to ecological niches throughout tropical and subtropical regions of the world, though medfly infestations have been prevented and controlled by the sterile insect technique (SIT) as part of integrated pest management programs (IPMs). The genetic analysis and manipulation of medfly has been subject to intensive study in an effort to improve SIT efficacy and other aspects of IPM control.
The 479 Mb medfly genome is sequenced from adult flies from lines inbred for 20 generations. A high-quality assembly is achieved having a contig N50 of 45.7 kb and scaffold N50 of 4.06 Mb. In-depth curation of more than 1800 messenger RNAs shows specific gene expansions that can be related to invasiveness and host adaptation, including gene families for chemoreception, toxin and insecticide metabolism, cuticle proteins, opsins, and aquaporins. We identify genes relevant to IPM control, including those required to improve SIT.
The medfly genome sequence provides critical insights into the biology of one of the most serious and widespread agricultural pests. This knowledge should significantly advance the means of controlling the size and invasive potential of medfly populations. Its close relationship to Drosophila, and other insect species important to agriculture and human health, will further comparative functional and structural studies of insect genomes that should broaden our understanding of gene family evolution.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-016-1049-2) contains supplementary material, which is available to authorized users.
Medfly genome; Tephritid genomics; Insect orthology; Gene family evolution; Chromosomal synteny; Insect invasiveness; Insect adaptation; Medfly integrated pest management (IPM)
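The contig and scaffold N50 values quoted for the medfly assembly summarize how contiguous an assembly is. As a minimal sketch of how such a statistic is usually computed (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the medfly data or pipeline), N50 is the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


# Toy contig lengths in bp (hypothetical, not the medfly assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

With these toy values the total is 300 bp, and the two longest contigs (80 + 70) already reach the 150 bp halfway point, so the N50 is 70 bp. Applied to real contig or scaffold lengths, the same calculation yields figures of the kind reported above.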
Echinoderm genome sequences are a corpus of useful information about a clade of animals that serve as research models in fields ranging from marine ecology to cell and developmental biology. Genomic information from echinoids has contributed to insights into the gene interactions that drive the developmental process at the molecular level. Such insights often rely heavily on genomic information and the kinds of questions that can be asked thus depend on the quality of the sequence information. Here we describe the history of echinoderm genomic sequence assembly and present details about the quality of the data obtained. All of the sequence information discussed here is posted on the echinoderm information web system, Echinobase.org.
A novel group of mammalian DNA viruses called elephant endotheliotropic herpesviruses (EEHVs) belonging to the Proboscivirus genus has been associated with nearly 100 cases of highly lethal acute hemorrhagic disease in young Asian elephants worldwide. The complete 180-kb genomes of prototype strains from three AT-rich branch viruses, EEHV1A, EEHV1B, and EEHV5, have been published. However, less than 6 kb of DNA sequence each from EEHV3, EEHV4, and EEHV7 showed them to be a hugely diverged second major branch with GC-rich characteristics. Here, we determined the complete 206-kb genome of EEHV4(Baylor) directly from trunk wash DNA by next-generation sequencing and de novo assembly procedures. Among a total of 119 genes with an overall colinear organization similar to those of the AT-rich EEHVs, major features of EEHV4 include a family of 26 paralogous 7xTM and vGPCR-like genes plus 25 novel or missing genes. The genome also contains an unusual distribution of tracts of 5 to 11 successive A or T nucleotides in intergenic domains between the mostly much higher GC content protein coding regions. Furthermore, an extremely high GC-rich bias in the third wobble position of codons clearly delineates the coding regions for many but not all proteins. There are also two novel captured cellular genes, including a C-type lectin (vECTL) and an O-linked acetylglucosamine transferase (vOGT), as well as an unusually large and complex Ori-Lyt dyad symmetry domain. Finally, 30 kb from a second strain proved to include three small chimeric domains, indicating the existence of distinct EEHV4A and EEHV4B subtypes.
IMPORTANCE Multiple species of herpesviruses from three different lineages of the Proboscivirus genus (EEHV1/6, EEHV2/5, and EEHV3/4/7) infect both Asian and African elephants, but lethal hemorrhagic disease is largely confined to Asian elephant calves and is predominantly associated with EEHV1. Milder disease caused by EEHV5 or EEHV4 is being increasingly recognized as well, but little is known about the latter, which is estimated to have diverged at least 35 million years ago from the others within a distinctive GC-rich branch of the Proboscivirus genus. Here, we have determined the complete genomic DNA sequence of a strain of EEHV4 obtained from a trunk wash sample collected from a surviving Asian elephant calf undergoing asymptomatic shedding during convalescence after an acute hemorrhagic disease episode. This represents the first example from among the three known GC-rich branch Proboscivirus species to be assembled and fully annotated. Several distinctive features of EEHV4 compared to AT-rich branch genomes are described.
elephant endotheliotropic herpesviruses; Elephas maximus calf; G-plus-C nucleotide content bias; acute hemorrhagic disease; evolutionary divergence; trunk wash shedding
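The EEHV4 abstract notes that a strong GC bias at the third (wobble) codon position helps delineate coding regions. A minimal sketch of how that bias can be measured for a single coding sequence follows; the function names `gc3` and `gc_overall` and the toy sequence are illustrative assumptions, not the authors' analysis code or EEHV4 data:

```python
def gc_overall(seq):
    """Fraction of G or C bases across the whole sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3(cds):
    """Fraction of G or C at the third position of each complete codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third_positions = seq[2:usable:3]
    if not third_positions:
        raise ValueError("need at least one complete codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)


# Toy in-frame sequence (hypothetical): codons ATG GCC AAG TTC.
toy_cds = "ATGGCCAAGTTC"
print(gc_overall(toy_cds), gc3(toy_cds))
```

In this toy sequence every codon ends in G or C, so GC3 is 1.0 even though the overall GC fraction is only 0.5. A gap of that kind between GC3 and background composition is the sort of signal the abstract describes as marking out protein-coding regions.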
Cooperative systems are susceptible to invasion by selfish individuals that profit from receiving the social benefits but fail to contribute. These so-called “cheaters” can have a fitness advantage in the laboratory, but it is unclear whether cheating provides an important selective advantage in nature. We used a population genomic approach to examine the history of genes involved in cheating behaviors in the social amoeba Dictyostelium discoideum, testing whether these genes experience rapid evolutionary change as a result of conflict over spore-stalk fate. Candidate genes and surrounding regions showed elevated polymorphism, unusual patterns of linkage disequilibrium, and lower levels of population differentiation, but they did not show greater between-species divergence. The signatures were most consistent with frequency-dependent selection acting to maintain multiple alleles, suggesting that conflict may lead to stalemate rather than an escalating arms race. Our results reveal the evolutionary dynamics of cooperation and cheating and underscore how sequence-based approaches can be used to elucidate the history of conflicts that are difficult to observe directly.
Analysing population genomic data from killer whale ecotypes, which we estimate have globally radiated within less than 250,000 years, we show that genetic structuring including the segregation of potentially functional alleles is associated with socially inherited ecological niche. Reconstruction of ancestral demographic history revealed bottlenecks during founder events, likely promoting ecological divergence and genetic drift resulting in a wide range of genome-wide differentiation between pairs of allopatric and sympatric ecotypes. Functional enrichment analyses provided evidence for regional genomic divergence associated with habitat, dietary preferences and post-zygotic reproductive isolation. Our findings are consistent with expansion of small founder groups into novel niches by an initial plastic behavioural response, perpetuated by social learning imposing an altered natural selection regime. The study constitutes an important step towards an understanding of the complex interaction between demographic history, culture, ecological adaptation and evolution at the genomic level.
Killer whales have evolved into specialized ecotypes based on hunting strategies and ecological niches. Here, Andrew Foote and colleagues sequence the whole genomes of individual killer whales representing five different ecotypes from the North Pacific and Antarctic, and show that small founder groups expanded and adapted to specific ecological niches.
The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.
The bed bug, Cimex lectularius, is a ubiquitous human ectoparasite with global distribution. Here, the authors sequence the genome of the bed bug and identify reductions in chemosensory genes, expansion of genes associated with blood digestion and genes linked to pesticide resistance.
Acorn worms, also known as enteropneust (literally, ‘gut-breathing’) hemichordates, are marine invertebrates that share features with echinoderms and chordates. Together, these three phyla comprise the deuterostomes. Here we report the draft genome sequences of two acorn worms, Saccoglossus kowalevskii and Ptychodera flava. By comparing them with diverse bilaterian genomes, we identify shared traits that were probably inherited from the last common deuterostome ancestor, and then explore evolutionary trajectories leading from this ancestor to hemichordates, echinoderms and chordates. The hemichordate genomes exhibit extensive conserved synteny with amphioxus and other bilaterians, and deeply conserved non-coding sequences that are candidates for conserved gene-regulatory elements. Notably, hemichordates possess a deuterostome-specific genomic cluster of four ordered transcription factor genes, the expression of which is associated with the development of pharyngeal ‘gill’ slits, the foremost morphological innovation of early deuterostomes, and is probably central to their filter-feeding lifestyle. Comparative analysis reveals numerous deuterostome-specific gene novelties, including genes found in deuterostomes and marine microbes, but not other animals. The putative functions of these genes can be linked to physiological, metabolic and developmental specializations of the filter-feeding ancestor.
Marine mammals from different mammalian orders share several phenotypic traits adapted to the aquatic environment and are therefore a classic example of convergent evolution. To investigate convergent evolution at the genomic level, we sequenced and de novo assembled the genomes of three species of marine mammals (the killer whale, walrus and manatee) from three mammalian orders that share independently evolved phenotypic adaptations to a marine existence. Our comparative genomic analyses found that convergent amino acid substitutions were widespread throughout the genome, and that a subset were in genes evolving under positive selection and putatively associated with a marine phenotype. However, we found higher levels of convergent amino acid substitutions in a control set of terrestrial sister taxa to the marine mammals. Our results suggest that while convergent molecular evolution is relatively common, adaptive molecular convergence linked to phenotypic convergence is comparatively rare.
A first analysis of the genome sequence of the common marmoset (Callithrix jacchus), assembled using traditional Sanger methods and Ensembl annotation, has permitted genomic comparison with apes and Old World monkeys and the identification of specific molecular features that may contribute to the unique biology of this diminutive primate. The common marmoset has a rapid reproductive capacity partly due to the prevalence of dizygotic twins. Remarkably, these twins share placental circulation and exchange hematopoietic stem cells in utero, resulting in adults that are hematopoietic chimeras.
We observed positive selection or non-synonymous substitutions for genes encoding growth hormone / insulin-like growth factor (growth pathways), respiratory complex I (metabolic pathways), immunobiology, and proteases (reproductive and immunity pathways). In addition, both protein-coding and microRNA genes related to reproduction exhibit rapid sequence evolution. This New World monkey genome sequence enables significantly increased power for comparative analyses among available primate genomes and facilitates biomedical research application.
Lucilia cuprina is a parasitic fly of major economic importance worldwide. Larvae of this fly invade their animal host, feed on tissues and excretions and progressively cause severe skin disease (myiasis). Here we report the sequence and annotation of the 458-megabase draft genome of Lucilia cuprina. Analyses of this genome and the 14,544 predicted protein-encoding genes provide unique insights into the fly's molecular biology, interactions with the host animal and insecticide resistance. These insights have broad implications for designing new methods for the prevention and control of myiasis.
Lucilia cuprina is a parasitic blowfly of major economic importance worldwide that feeds on the tissues of animals such as sheep. Here, the authors sequence the genome of L. cuprina and provide insights into the fly's molecular biology, interactions with the host animal and insecticide resistance.
The shift from solitary to social behavior is one of the major evolutionary transitions. Primitively eusocial bumblebees are uniquely placed to illuminate the evolution of highly eusocial insect societies. Bumblebees are also invaluable natural and agricultural pollinators, and there is widespread concern over recent population declines in some species. High-quality genomic data will inform key aspects of bumblebee biology, including susceptibility to implicated population viability threats.
We report the high quality draft genome sequences of Bombus terrestris and Bombus impatiens, two ecologically dominant bumblebees and widely utilized study species. Comparing these new genomes to those of the highly eusocial honeybee Apis mellifera and other Hymenoptera, we identify deeply conserved similarities, as well as novelties key to the biology of these organisms. Some honeybee genome features thought to underpin advanced eusociality are also present in bumblebees, indicating an earlier evolution in the bee lineage. Xenobiotic detoxification and immune genes are similarly depauperate in bumblebees and honeybees, and multiple categories of genes linked to social organization, including development and behavior, show high conservation. Key differences identified include a bias in bumblebee chemoreception towards gustation from olfaction, and striking differences in microRNAs, potentially responsible for gene regulation underlying social and other traits.
These two bumblebee genomes provide a foundation for post-genomic research on these key pollinators and insect societies. Overall, gene repertoires suggest that the route to advanced eusociality in bees was mediated by many small changes in many genes and processes, and not by notable expansion or depauperation.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0623-3) contains supplementary material, which is available to authorized users.
Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods.
We demonstrate Parliament’s efficacy via integrated analyses of whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific Biosciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus.
HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1479-3) contains supplementary material, which is available to authorized users.
Structural variation; Long-read sequencing; SV software
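The Parliament abstract describes merging SV calls from several data types and keeping loci supported by more than one line of evidence. The abstract does not spell out the merging rule, so the following is only a sketch of one common heuristic, reciprocal overlap between intervals; the `Call` type, the function names, the 0.5 threshold, and the toy calls are all hypothetical and are not Parliament's actual implementation:

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One SV call as a half-open interval [start, end) from a named source."""
    chrom: str
    start: int
    end: int
    source: str


def reciprocal_overlap(a, b):
    """Smaller of the two fractions of each interval covered by the other."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def supporting_sources(candidate, calls, min_overlap=0.5):
    """Names of sources with a call that reciprocally overlaps the candidate."""
    return {
        call.source
        for call in calls
        if reciprocal_overlap(candidate, call) >= min_overlap
    }


# Toy data (hypothetical coordinates and source labels).
candidate = Call("chr1", 1000, 2000, "candidate")
toy_calls = [
    Call("chr1", 1100, 2100, "short_read"),
    Call("chr1", 900, 1900, "long_read"),
    Call("chr1", 1500, 5000, "array"),       # covers too little of itself
    Call("chr2", 1000, 2000, "short_read"),  # different chromosome
]
print(sorted(supporting_sources(candidate, toy_calls)))
```

Here the candidate is supported by the short-read and long-read calls but not by the much larger array call. A consensus caller could then keep candidates whose supporting set meets some chosen evidence rule, which is the general idea behind reporting only a supported subset of discordant loci.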
Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in Southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
Asian elephant (Elephas maximus) immunity is poorly characterized and understood. This gap in knowledge is particularly concerning as Asian elephants are an endangered species threatened by a newly discovered herpesvirus known as elephant endotheliotropic herpesvirus (EEHV), which is the leading cause of death for captive Asian elephants born after 1980 in North America. While reliable diagnostic assays have been developed to detect EEHV DNA, serological assays to evaluate elephant anti-EEHV antibody responses are lacking and will be needed for surveillance and epidemiological studies and also for evaluating potential treatments or vaccines against lethal EEHV infection. Previous studies have shown that Asian elephants produce IgG in serum, but they failed to detect IgM and IgA, further hampering development of informative serological assays for this species. To begin to address this issue, we determined the constant region genomic sequence of Asian elephant IgM and obtained some limited protein sequence information for putative serum IgA. The information was used to generate or identify specific commercial antisera reactive against IgM and IgA isotypes. In addition, we generated a monoclonal antibody against Asian elephant IgG. These three reagents were used to demonstrate that all three immunoglobulin isotypes are found in Asian elephant serum and milk and to detect antibody responses following tetanus toxoid booster vaccination or antibodies against a putative EEHV structural protein. The results indicate that these new reagents will be useful for developing sensitive and specific assays to detect and characterize elephant antibody responses for any pathogen or vaccine, including EEHV.
Sheep (Ovis aries) are a major source of meat, milk and fiber in the form of wool, and represent a distinct class of animals that have a specialized digestive organ, the rumen, which carries out the initial digestion of plant material. We have developed and analyzed a high quality reference sheep genome and transcriptomes from 40 different tissues. We identified highly expressed genes encoding keratin cross-linking proteins associated with rumen evolution. We also identified genes involved in lipid metabolism that had been amplified and/or had altered tissue expression patterns. This may be in response to changes in the barrier lipids of the skin, an interaction between lipid metabolism and wool synthesis, and an increased role of volatile fatty acids in ruminants, compared to non-ruminant animals.
Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific life history.
Arthropods are the most abundant animals on earth. Among them, insects clearly dominate on land, whereas crustaceans hold the title for the most diverse invertebrates in the oceans. Much is known about the biology of these groups, not least because of genomic studies of the fruit fly Drosophila, the water flea Daphnia, and other species used in research. Here we report the first genome sequence from a species belonging to a lineage that has previously received very little attention—the myriapods. Myriapods were among the first arthropods to invade the land over 400 million years ago, and survive today as the herbivorous millipedes and venomous centipedes, one of which—Strigamia maritima—we have sequenced here. We find that the genome of this centipede retains more characteristics of the presumed arthropod ancestor than other sequenced insect genomes. The genome provides access to many aspects of myriapod biology that have not been studied before, suggesting, for example, that they have diversified receptors for smell that are quite different from those used by insects. In addition, it shows specific consequences of the largely subterranean life of this particular species, which seems to have lost the genes for all known light-sensing molecules, even though it still avoids light.
To understand the biology and evolution of ruminants, the cattle genome was sequenced to ∼7× coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1,217 are absent or undetected in non-eutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides an enabling resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
The evolutionary importance of hybridization and introgression has long been debated [1]. We used genomic tools to investigate introgression in Heliconius, a rapidly radiating genus of neotropical butterflies widely used in studies of ecology, behaviour, mimicry and speciation [2-5]. We sequenced the genome of Heliconius melpomene and compared it with other taxa to investigate chromosomal evolution in Lepidoptera and gene flow among multiple Heliconius species and races. Among 12,657 predicted genes for Heliconius, biologically important expansions of families of chemosensory and Hox genes are particularly noteworthy. Chromosomal organisation has remained broadly conserved since the Cretaceous, when butterflies split from the silkmoth lineage. Using genomic resequencing, we show hybrid exchange of genes between three co-mimics, H. melpomene, H. timareta, and H. elevatus, especially at two genomic regions that control mimicry pattern. Closely related Heliconius species clearly exchange protective colour pattern genes promiscuously, implying a major role for hybridization in adaptive radiation.
Ecosystem function and resilience are determined by the interactions and independent contributions of individual species. Apex predators play a disproportionately determinant role through their influence and dependence on the dynamics of prey species. Their demographic fluctuations are thus likely to reflect changes in their respective ecological communities and habitat. Here, we investigate the historical population dynamics of the killer whale based on draft nuclear genome data for the Northern Hemisphere and mtDNA data worldwide. We infer a relatively stable population size throughout most of the Pleistocene, followed by an order of magnitude decline and bottleneck during the Weichselian glacial period. Global mtDNA data indicate that while most populations declined, at least one population retained diversity in a stable, productive ecosystem off southern Africa. We conclude that environmental changes during the last glacial period promoted the decline of a top ocean predator, that these events contributed to the pattern of diversity among extant populations, and that the relatively high diversity of a population currently in productive, stable habitat off South Africa suggests a role for ocean productivity in the widespread decline.
genomics; demographics; Cetacea; population bottleneck
The first generation of genome sequence assemblies and annotations has had a significant impact on our understanding of the biology of the sequenced species, the phylogenetic relationships among species, and the study of populations within and across species, and has informed the biology of humans. As only a few Metazoan genomes are approaching finished quality (human, mouse, fly and worm), there is room for improvement of most genome assemblies. The honey bee (Apis mellifera) genome, published in 2006, was noted for its bimodal GC content distribution that affected the quality of the assembly in some regions and for fewer genes in the initial gene set (OGSv1.0) compared to what would be expected based on other sequenced insect genomes.
Here, we report an improved honey bee genome assembly (Amel_4.5) with a new gene annotation set (OGSv3.2), and show that the honey bee genome contains a number of genes similar to that of other insect genomes, contrary to what was suggested in OGSv1.0. The new genome assembly is more contiguous and complete and the new gene set includes ~5000 more protein-coding genes, 50% more than previously reported. About 1/6 of the additional genes were due to improvements to the assembly, and the remainder were inferred based on new RNAseq and protein data.
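As a rough reading of the figures above, and only an approximation because both numbers are stated approximately: if about 5000 additional genes represent 50% more than the earlier count, the earlier gene set held about 10,000 genes and the new set about 15,000, with roughly one sixth of the additions (about 830 genes) attributable to assembly improvements. The symbols N_old and N_new are labels introduced here for illustration and do not appear in the original text.

```latex
\[
N_{\text{old}} \approx \frac{5000}{0.5} = 10\,000, \qquad
N_{\text{new}} \approx N_{\text{old}} + 5000 = 15\,000, \qquad
\tfrac{1}{6} \times 5000 \approx 833 .
\]
```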
Lessons learned from this genome upgrade have important implications for future genome sequencing projects. Furthermore, the improvements significantly enhance genomic resources for the honey bee, a key model for social behavior and essential to global ecology through pollination.
Apis mellifera; GC content; Gene annotation; Gene prediction; Genome assembly; Genome improvement; Genome sequencing; Repetitive DNA; Transcriptome
Genetic mapping on fully sequenced individuals is transforming our understanding of the relationship between molecular variation and variation in complex traits. Here we report a combined sequence and genetic mapping analysis in outbred rats that maps 355 quantitative trait loci for 122 phenotypes. We identify 35 causal genes involved in 31 phenotypes, implicating novel genes in models of anxiety, heart disease and multiple sclerosis. The relation between sequence and genetic variation is unexpectedly complex: at approximately 40% of quantitative trait loci a single sequence variant cannot account for the phenotypic effect. Using comparable sequence and mapping data from mice, we show the extent and spatial pattern of variation in inbred rats differ significantly from those of inbred mice, and that the genetic variants in orthologous genes rarely contribute to the same phenotype in both species.
The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly.
In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies.
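N50, listed among the keywords for this study, is one widely used contiguity statistic: the length L such that sequences of length L or longer contain at least half of all assembled bases. The text here does not say whether N50 was among the ten key measures, so the following is only a minimal Python sketch of the standard definition, not the evaluation code used in the competition. The function name `n50` and the scaffold lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not real assembly data.
scaffold_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffold_lengths))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Because N50 rewards long sequences regardless of whether they are correctly joined, it is usually read alongside accuracy-oriented measures rather than on its own.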
Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.
Genome assembly; N50; Scaffolds; Assessment; Heterozygosity; COMPASS
A major challenge of biology is understanding the relationship between molecular genetic variation and variation in quantitative traits, including fitness. This relationship determines our ability to predict phenotypes from genotypes and to understand how evolutionary forces shape variation within and between species. Previous efforts to dissect the genotype-phenotype map were based on incomplete genotypic information. Here, we describe the Drosophila melanogaster Genetic Reference Panel (DGRP), a community resource for analysis of population genomics and quantitative traits. The DGRP consists of fully sequenced inbred lines derived from a natural population. Population genomic analyses reveal reduced polymorphism in centromeric autosomal regions and the X chromosome, evidence for positive and negative selection, and rapid evolution of the X chromosome. Many variants in novel genes, most at low frequency, are associated with quantitative traits and explain a large fraction of the phenotypic variance. The DGRP facilitates genotype-phenotype mapping using the power of Drosophila genetics.