Miniature Inverted-repeat Transposable Elements (MITEs), a particular group of class-II transposable elements (TEs), play an important role in genome evolution because they have very high copy numbers and display recurrent bursts of transposition. The 5' and 3' subterminal regions of a given MITE family often show high sequence similarity with the corresponding regions of an autonomous class-II TE family. However, the sustained presence over a prolonged evolutionary time of MITEs and TE master copies able to promote their mobility has rarely been reported within the same genome, and this raises fascinating evolutionary questions.
We report here the presence of P transposable elements with related MITE families in the Anopheles gambiae genome. Using a TE annotation pipeline, we have identified and analyzed all the P sequences in the sequenced A. gambiae PEST strain genome. More than 0.49% of the genome consists of P elements and derivatives. P elements can be divided into 9 different subfamilies, separated by more than 30% nucleotide divergence. Seven of them present full-length copies. Ten MITE families are associated with 6 of the 9 P subfamilies. Comparing their intra-element nucleotide diversities and their structures allows us to propose the putative dynamics of their emergence. In particular, one MITE family has a hybrid structure, with each end related to a different P subfamily, which suggests a new mechanism for their emergence and their mobility.
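The 30% nucleotide divergence criterion mentioned above can be illustrated with a minimal sketch. It assumes divergence is measured as an uncorrected p-distance over pre-aligned sequences, which the abstract does not specify; the function names and the toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    Assumption: the sequences are already aligned to equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return mismatches / compared


def same_subfamily(seq_a: str, seq_b: str, threshold: float = 0.30) -> bool:
    """True if the two copies diverge by no more than the threshold.

    The 0.30 default mirrors the 30% figure in the text; treating it as a
    hard cutoff is an illustrative simplification.
    """
    return p_distance(seq_a, seq_b) <= threshold


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not real P element sequence).
    copy_1 = "AAAAAAAAAA"
    copy_2 = "AAAAAACCCC"
    print(p_distance(copy_1, copy_2), same_subfamily(copy_1, copy_2))
```

In practice a corrected distance (for example one that accounts for multiple substitutions) might be preferred for sequences this divergent; the uncorrected version is used here only because it is the simplest to read.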
This work contributes to a greater understanding of the relationship between full-length class-II TEs and MITEs, in this case P elements and their derivatives in the genome of A. gambiae. Moreover, it provides the most comprehensive catalogue to date of P-like transposons in this genome and provides convincing yet indirect evidence that some of the subfamilies have been recently active.
An initial comparative genomic study of the malaria vector Anopheles gambiae and the yellow fever mosquito Aedes aegypti revealed striking differences in the genome assembly size and in the abundance of transposable elements between the two species. However, the chromosome arm homology between An. gambiae and Ae. aegypti, as well as the distribution of genes and repetitive elements in the chromosomes of Ae. aegypti, remained largely unexplored because of the lack of a detailed physical genome map for the yellow fever mosquito.
Using a molecular landmark-guided fluorescent in situ hybridization approach, we mapped 624 Mb of the Ae. aegypti genome to mitotic chromosomes. We used this map to analyze the distribution of genes, tandem repeats and transposable elements along the chromosomes and to explore the patterns of chromosome homology and rearrangements between Ae. aegypti and An. gambiae. The study demonstrated that the q arm of the sex-determining chromosome 1 had the lowest gene content and the highest density of minisatellites. A comparative genomic analysis with An. gambiae determined that the previously proposed whole-arm synteny is not fully preserved; a number of pericentric inversions have occurred between the two species. The sex-determining chromosome 1 had a higher rate of genome rearrangements than observed in autosomes 2 and 3 of Ae. aegypti.
The study developed a physical map of 45% of the Ae. aegypti genome and provided new insights into genomic composition and evolution of Ae. aegypti chromosomes. Our data suggest that minisatellites rather than transposable elements played a major role in rapid evolution of chromosome 1 in the Aedes lineage. The research tools and information generated by this study contribute to a more complete understanding of the genome organization and evolution in mosquitoes.
Physical mapping; Mosquito; Genome; Chromosome
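As a quick consistency check on the two figures reported above (624 Mb mapped, described as 45% of the genome), the following sketch computes the total genome size they imply. The implied total is derived here and is not stated in the abstract; the function name is hypothetical.

```python
def implied_total_mb(mapped_mb: float, mapped_fraction: float) -> float:
    """Total size implied by a mapped amount and the fraction it represents."""
    if not 0 < mapped_fraction <= 1:
        raise ValueError("mapped_fraction must be in (0, 1]")
    return mapped_mb / mapped_fraction


if __name__ == "__main__":
    mapped = 624.0    # Mb mapped to mitotic chromosomes (from the text)
    fraction = 0.45   # share of the genome covered by the map (from the text)
    total = implied_total_mb(mapped, fraction)
    print(f"implied total: {total:.0f} Mb, unmapped: {total - mapped:.0f} Mb")
```

Because the 45% figure is rounded, the result (roughly 1.39 Gb) should be read as an approximation rather than an exact assembly size.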
In the model system Drosophila melanogaster, doublesex (dsx) is the double-switch gene at the bottom of the somatic sex determination cascade that determines the differentiation of sexually dimorphic traits. Homologues of dsx are functionally conserved in various dipteran species, including the malaria vector Anopheles gambiae. They show a striking conservation of sex-specific regulation, based on alternative splicing, and of the encoded sex-specific proteins, which are transcriptional regulators of downstream terminal genes that influence sexual differentiation of cells, tissues and organs.
In this work, we report on the molecular characterization of the dsx homologue in the dengue and yellow fever vector Aedes aegypti (Aeadsx). Aeadsx produces sex-specific transcripts by alternative splicing, which encode isoforms with a high degree of identity to the Anopheles gambiae and Drosophila melanogaster homologues. Interestingly, Aeadsx produces an additional novel female-specific splicing variant. Comparative genomic analyses of the Aedes and Anopheles dsx genes revealed a partial conservation of the exon organization and extensive divergence in the intron lengths. An expression analysis showed that Aeadsx transcripts are present from early stages of development and that sex-specific regulation starts at least from late larval stages. The analysis of the female-specific untranslated region (UTR) led to the identification of putative regulatory cis-elements potentially involved in the sex-specific splicing regulation. The Aedes dsx sex-specific splicing regulation seems to be more complex than that of other dipteran species, suggesting slightly novel evolutionary trajectories for its regulation and hence for the recruitment of upstream splicing regulators.
This study uncovered the molecular evolution of Aedes aegypti dsx splicing regulation with respect to the orthologue in the more closely related culicid Anopheles gambiae. In Aedes aegypti, the dsx gene is sex-specifically regulated and encodes two female-specific isoforms and one male-specific isoform, all sharing a doublesex/mab-3 (DM) domain-containing N-terminus and differing in their C-termini. The sex-specific regulation is based on a combination of exon skipping, 5' alternative splice site choice and, most likely, alternative polyadenylation. Interestingly, when the Aeadsx gene is compared to the Anopheles dsx ortholog, there are differences in the in silico predicted default and regulated sex-specific splicing events, which suggests that the upstream regulators either are different or act in a slightly different manner. Furthermore, this study lays the groundwork for the future development of transgenic sexing strains in mosquitoes, useful for sterile insect technique (SIT) programs.
Nonrandom distribution of rearrangements is a common feature of eukaryotic chromosomes that is not well understood in terms of genome organization and evolution. In the major African malaria vector Anopheles gambiae, polymorphic inversions are highly nonuniformly distributed among five chromosomal arms and are associated with epidemiologically important adaptations. However, it is not clear whether the genomic content of the chromosomal arms is associated with inversion polymorphism and fixation rates.
To better understand the evolutionary dynamics of chromosomal inversions, we created a physical map for an Asian malaria mosquito, Anopheles stephensi, and compared it with the genome of An. gambiae. We also developed and deployed novel Bayesian statistical models to analyze genome landscapes in individual chromosomal arms of An. gambiae. Here, we demonstrate that, despite the paucity of inversion polymorphisms on the X chromosome, this chromosome has the fastest rate of inversion fixation and the highest density of transposable elements, simple DNA repeats, and GC content. The highly polymorphic and rapidly evolving autosomal 2R arm had an overrepresentation of genes involved in cellular response to stress, supporting the role of natural selection in maintaining adaptive polymorphic inversions. In addition, the 2R arm had the highest density of regions involved in segmental duplications, which clustered in the breakpoint-rich zone of the arm. In contrast, the slower evolving 2L, 3R, and 3L arms were enriched with matrix-attachment regions that potentially contribute to chromosome stability in the cell nucleus.
These results highlight fundamental differences in the evolutionary dynamics of the sex chromosome and autosomes and reveal a strong association between characteristics of the genome landscape and rates of chromosomal evolution. We conclude that a unique combination of various classes of genes and repetitive DNA in each arm, rather than a single type of repetitive element, is likely responsible for arm-specific rates of rearrangements.
Malaria has a devastating impact on worldwide public health in many tropical areas. Studies on vector immunity are important for the overall understanding of the parasite-vector interaction and for the design of novel strategies to control malaria. A member of the fibrinogen-related protein family, fbn9, has been well studied in Anopheles gambiae and has been shown to be an important component of the mosquito immune system. However, little is known about this gene in neotropical anopheline species.
This article describes the identification and characterization of the fbn9 gene partial sequences from four species of neotropical anopheline primary and secondary vectors: Anopheles darlingi, Anopheles nuneztovari, Anopheles aquasalis, and Anopheles albitarsis (namely Anopheles marajoara). Degenerate primers were designed based on comparative analysis of publicly available Aedes aegypti and An. gambiae gene sequences and used to clone putative homologs in the neotropical species. Sequence comparisons and Bayesian phylogenetic analyses were then performed to better understand the molecular diversity of this gene in evolutionarily distant anopheline species belonging to different subgenera.
Comparisons of the fbn9 gene sequences of the neotropical anophelines and their homologs in the An. gambiae complex (Gambiae complex) showed high conservation at the nucleotide and amino acid levels, although some sites show significant differentiation (non-synonymous substitutions). Furthermore, phylogenetic analysis of fbn9 nucleotide sequences showed that neotropical anophelines and African mosquitoes form two well-supported clades, mirroring their separation into two different subgenera.
The present work adds new insights into the conserved role of fbn9 in insect immunity in a broader range of anopheline species and reinforces the possibility of manipulating mosquito immunity to design novel pathogen control strategies.
Anopheles darlingi is the principal neotropical malaria vector, responsible for more than a million cases of malaria per year on the American continent. Anopheles darlingi diverged from the African and Asian malaria vectors ∼100 million years ago (mya) and successfully adapted to the New World environment. Here we present an annotated reference A. darlingi genome, sequenced from a wild population of males and females collected in the Brazilian Amazon. A total of 10 481 predicted protein-coding genes were annotated, 72% of which have their closest counterpart in Anopheles gambiae and 21% have highest similarity with other mosquito species. In spite of a long period of divergent evolution, conserved gene synteny was observed between A. darlingi and A. gambiae. More than 10 million single nucleotide polymorphisms and short indels with potential use as genetic markers were identified. Transposable elements correspond to 2.3% of the A. darlingi genome. Genes associated with hematophagy, immunity and insecticide resistance, directly involved in vector–human and vector–parasite interactions, were identified and discussed. This study represents the first effort to sequence the genome of a neotropical malaria vector, and opens a new window through which we can contemplate the evolutionary history of anopheline mosquitoes. It also provides valuable information that may lead to novel strategies to reduce malaria transmission on the South American continent. The A. darlingi genome is accessible at www.labinfo.lncc.br/index.php/anopheles-darlingi.
Understanding phylogenetic relationships within species complexes of disease vectors is crucial for identifying genomic changes associated with the evolution of epidemiologically important traits. However, the high degree of genetic similarity among sibling species confounds the ability to determine phylogenetic relationships using molecular markers. The goal of this study was to infer the ancestral–descendant relationships among malaria vectors and nonvectors of the Anopheles gambiae species complex by analyzing breakpoints of fixed chromosomal inversions in ingroup and several outgroup species. We identified genes at breakpoints of fixed overlapping chromosomal inversions 2Ro and 2Rp of An. merus using fluorescence in situ hybridization, a whole-genome mate-paired sequencing, and clone sequencing. We also mapped breakpoints of a chromosomal inversion 2La (common to An. merus, An. gambiae, and An. arabiensis) in outgroup species using a bioinformatics approach. We demonstrated that the “standard” 2R+p arrangement and “inverted” 2Ro and 2La arrangements are present in outgroup species Anopheles stephensi, Aedes aegypti, and Culex quinquefasciatus. The data indicate that the ancestral species of the An. gambiae complex had the 2Ro, 2R+p, and 2La chromosomal arrangements. The “inverted” 2Ro arrangement uniquely characterizes a malaria vector An. merus as the basal species in the complex. The rooted chromosomal phylogeny implies that An. merus acquired the 2Rp inversion and that its sister species An. gambiae acquired the 2R+o inversion from the ancestral species. The karyotype of nonvectors An. quadriannulatus A and B was derived from the karyotype of the major malaria vector An. gambiae. We conclude that the ability to effectively transmit human malaria had originated repeatedly in the complex. Our findings also suggest that saltwater tolerance originated first in An. merus and then independently in An. melas. 
The new chromosomal phylogeny will facilitate identifying the association of evolutionary genomic changes with epidemiologically important phenotypes.
Malaria causes more than one million deaths every year, mostly among children in Sub-Saharan Africa. Anopheles mosquitoes are exclusive vectors of human malaria. Many malaria vectors belong to species complexes, and members within these complexes can vary significantly in their ecological adaptations and ability to transmit the parasite. To better understand evolution of epidemiologically important traits, we studied relationships among nonvector and vector species of the African Anopheles gambiae complex. We analyzed gene orders at genomic regions where evolutionary breaks of chromosomal inversions occurred in members of the complex and compared them with gene orders in species outside the complex. This approach allowed us to identify ancient and recent gene orders for three chromosomal inversions. Surprisingly, the more ancestral chromosomal arrangements were found in mosquito species that are vectors of human malaria, while the more derived arrangements were found in both nonvectors and vectors. Our finding strongly suggests that the increased ability to transmit human malaria originated repeatedly during the recent evolution of these African mosquitoes. This knowledge can be used to identify specific genetic changes associated with the human blood choice and ecological adaptations.
Mosquito-borne viral diseases cause significant burden in much of the developing world. Although host-virus interactions have been studied extensively in the vertebrate host, little is known about mosquito responses to viral infection. In contrast to mosquitoes of the Aedes and Culex genera, Anopheles gambiae, the principal vector of human malaria, naturally transmits very few arboviruses, the most important of which is O'nyong-nyong virus (ONNV). Here we have investigated the A. gambiae immune response to systemic ONNV infection using forward and reverse genetic approaches.
We have used DNA microarrays to profile the transcriptional response of A. gambiae inoculated with ONNV and investigate the antiviral function of candidate genes through RNAi gene silencing assays. Our results demonstrate that A. gambiae responses to systemic viral infection involve genes covering all aspects of innate immunity including pathogen recognition, modulation of immune signalling, complement-mediated lysis/opsonisation and other immune effector mechanisms. Patterns of transcriptional regulation and co-infections of A. gambiae with ONNV and the rodent malaria parasite Plasmodium berghei suggest that hemolymph immune responses to viral infection are diverted away from melanisation. We show that four viral responsive genes encoding two putative recognition receptors, a galectin and an MD2-like receptor, and two effector lysozymes, function in limiting viral load.
This study is the first step in elucidating the antiviral mechanisms of A. gambiae mosquitoes, and has revealed interesting differences between A. gambiae and other invertebrates. Our data suggest that the mechanisms employed by A. gambiae are distinct from the invertebrate antiviral immunity described to date, and involve the complement-like branch of the humoral immune response, suppressing the melanisation response that is prominent in anti-parasitic immunity. The antiviral immune response in A. gambiae is thus composed of some key conserved mechanisms to target viral infection, such as RNAi, but includes other diverse and possibly species-specific mechanisms.
Mosquito-borne viral diseases are found across the globe and are responsible for numerous severe human infections. In order to develop novel methods for prevention and treatment of these diseases, a detailed understanding of the biology of viral infection and transmission is required. Little is known about invertebrate responses to infection in mosquito hosts. In this study we used a model system of Anopheles gambiae mosquitoes and O'nyong-nyong virus to study mosquito immune responses to infection. We examined the global transcriptional responses of A. gambiae to viral infection of the mosquito blood equivalent (the hemolymph), identifying a number of genes with immune functions that are switched on or off in response to infection, including complement-like proteins that circulate in the mosquito hemolymph. The switching on of these genes, combined with co-infection experiments with malaria parasites, suggests that viral infection inhibits the melanisation pathway. By silencing a selection of viral responsive genes, we identified four genes that have roles in A. gambiae antiviral immunity: two putative recognition receptors (a galectin and an MD2-like receptor) and two effector lysozymes. These molecules have previously undescribed roles in antiviral immunity, and suggest uncharacterised mechanisms for targeting viral infection in A. gambiae mosquitoes.
About 1 million people in the world die each year from diseases spread by mosquitoes, so understanding how mosquitoes identify their hosts through olfaction is of great importance. The role of odorant binding proteins (OBPs) in the primary molecular events of olfaction in mosquitoes is becoming an important focus of biological research in this area. Here, we present a comprehensive comparative genomics study of OBPs in the three disease-transmitting mosquito species Anopheles gambiae, Aedes aegypti, and Culex quinquefasciatus, starting with the identification of 110 new OBPs in these three genomes. We have characterized their genomic distribution and orthologous and phylogenetic relationships. The diversity and expansion observed with respect to the Aedes and Culex genomes suggest that the OBP gene family acquired functional diversity concurrently with functional constraints posed on these two species. Sequences with unique features have been characterized, such as the “two-domain OBPs” (previously known as Atypical OBPs) and “MinusC OBPs” in mosquito genomes. The extensive comparative genomics featured in this work hence provides useful primary insights into the role of OBPs in the molecular adaptations of the mosquito olfactory system and could provide more clues for the identification of potential targets for insect repellents and attractants.
odorant binding proteins; OBP; mosquito; Culex quinquefasciatus; Aedes aegypti; Anopheles gambiae; olfaction; phylogeny
Transposable elements represent a large proportion of the eukaryotic genomes. Long Terminal Repeat (LTR) retrotransposons are very abundant and constitute the predominant family of transposable elements in plants. Recent studies have identified chromoviruses to be a widely distributed lineage of Gypsy elements. These elements contain chromodomains in their integrases, which suggests a preference for insertion into heterochromatin. In turn, this preference might have contributed to the patterning of heterochromatin observed in host genomes. Despite their potential importance for our understanding of plant genome dynamics and evolution, the regulatory mechanisms governing the behavior of chromoviruses and their activities remain largely uncharacterized. Here, we report a detailed analysis of the spatio-temporal activity of a plant chromovirus in the endogenous host. We examined LORE1a, a member of the endogenous chromovirus LORE1 family from the model legume Lotus japonicus. We found that this chromovirus is stochastically de-repressed in plant populations regenerated from de-differentiated cells and that LORE1a transposes in the male germline. Bisulfite sequencing of the 5′ LTR and its surrounding region suggests that tissue culture induces a loss of epigenetic silencing of LORE1a. Since LTR promoter activity is pollen specific, as shown by the analysis of transgenic plants containing an LTR::GUS fusion, we conclude that male germline-specific LORE1a transposition in pollen grains is controlled transcriptionally by its own cis-elements. New insertion sites of LORE1a copies were frequently found in genic regions and show no strong insertional preferences. These distinctive novel features of LORE1 indicate that this chromovirus has considerable potential for generating genetic and epigenetic diversity in the host plant population. Our results also define conditions for the use of LORE1a as a genetic tool.
In contrast to animals, where germline differentiation initiates early in embryogenesis, germline differentiation in plants starts in the adult phase during reproductive development. Transpositions of transposable elements in both somatic and gametic cells can be transmitted to the next generation. As a result, plant genomes may contain transposable elements exhibiting a variety of tissue-specific activities. Thus far, the spatio-temporal activity of LTR retrotransposons, the most abundant class of transposable elements in plants, has not been well characterized. Here, we report a detailed analysis of the spatio-temporal transposition pattern of a plant LTR retrotransposon in the endogenous system. Using the model legume Lotus japonicus, we found that LORE1a, a member of the chromovirus LORE1 family that belongs to the Gypsy superfamily, was epigenetically de-repressed via tissue culture. Activation was stochastic and derepression was maintained in regenerated plants. This feature made it possible to trace the original spatio-temporal activity of the retrotransposon in the intact plants. We determined that the plant chromovirus retrotransposes mainly in the male germline, without obvious insertional preferences for chromosomal regions. This finding suggests that the tissue specificity of transposable elements should be taken into account when considering their impact on the host genome dynamics and evolution.
Anopheles gambiae is the primary mosquito vector of human malaria parasites in sub-Saharan Africa. To date, three innate immune signaling pathways, including the nuclear factor (NF)-kappaB-dependent Toll and immune deficient (IMD) pathways and the Janus kinase/signal transducers and activators of transcription (Jak-STAT) pathway, have been extensively characterized in An. gambiae. However, in addition to NF-kappaB-dependent signaling, three mitogen-activated protein kinase (MAPK) pathways regulated by JNK, ERK and p38 MAPK are critical mediators of innate immunity in other invertebrates and in mammals. Our understanding of the roles of the MAPK signaling cascades in anopheline innate immunity is limited, so identification of the encoded complement of these proteins, their upstream activators, and phosphorylation profiles in response to relevant immune signals was warranted.
In this study, we present the orthologs and phylogeny of 17 An. gambiae MAPKs, two of which were previously unknown and two others that were incompletely annotated. We also provide detailed temporal activation profiles for ERK, JNK, and p38 MAPK in An. gambiae cells in vitro to immune signals that are relevant to malaria parasite infection (human insulin, human transforming growth factor-beta1, hydrogen peroxide) and to bacterial lipopolysaccharide. These activation profiles and possible upstream regulatory pathways are interpreted in light of known MAPK signaling cascades.
The establishment of a MAPK "road map" based on the most advanced mosquito genome annotation can accelerate our understanding of host-pathogen interactions and broader physiology of An. gambiae and other mosquito species. Further, future efforts to develop predictive models of anopheline cell signaling responses, based on iterative construction and refinement of data-based and literature-based knowledge of the MAP kinase cascades and other networked pathways will facilitate identification of the "master signaling regulators" in biomedically important mosquito species.
Transposable elements (TEs), both DNA transposons and retrotransposons, are genetic elements with the main characteristic of being able to mobilize and amplify their own representation within genomes, utilizing different mechanisms of transposition. An almost universal feature of TEs in eukaryotic genomes is their inability to transpose by themselves, mainly as the result of sequence degeneration (by either mutations or deletions). Most of the elements are thus either inactive or non-autonomous. Considering that the bulk of some eukaryotic genomes derive from TEs, they have been conceived as “TE graveyards.” It has been shown that once an element has been inactivated, it progressively accumulates mutations and deletions at neutral rates until completely losing its identity or being lost from the host genome; however, it has also been shown that these “neutral sequences” might serve as raw material for domestication by host genomes.
We have analyzed the sequence structural variations, nucleotide divergence, and pattern of insertions and deletions of several superfamilies of TEs belonging to both class I (long terminal repeats [LTRs] and non-LTRs [NLTRs]) and class II in the genome of Anopheles gambiae, aiming to describe the landscape of deterioration of these elements in this particular genome. Our results describe a great diversity in patterns of deterioration, indicating lineage-specific differences including the presence of Solo-LTRs in the LTR lineage, 5′-deleted NLTRs, and several non-autonomous elements and MITEs in the class II families. Interestingly, we found fragments of NLTRs corresponding to the RT domain that preserve high identity among themselves, suggesting a possible remaining genomic role for these domains.
We show here that the TEs in the An. gambiae genome deteriorate in different ways according to the class to which they belong. This diversity certainly has implications not only at the host genomic level but also for the amplification dynamics and evolution of the TE families themselves.
Transposable elements; LTR; Non-LTR; Class II; Deterioration; Anopheles gambiae
The dynamics of reductive genome evolution for eukaryotes living inside other eukaryotic cells are poorly understood compared to well-studied model systems involving obligate intracellular bacteria. Here we present 8.5 Mb of sequence from the genome of the microsporidian Trachipleistophora hominis, isolated from an HIV/AIDS patient, which is an outgroup to the smaller compacted-genome species that primarily inform ideas of evolutionary mode for these enormously successful obligate intracellular parasites. Our data provide detailed information on the gene content, genome architecture and intergenic regions of a larger microsporidian genome, while comparative analyses allowed us to infer genomic features and metabolism of the common ancestor of the species investigated. Gene length reduction and massive loss of metabolic capacity in the common ancestor was accompanied by the evolution of novel microsporidian-specific protein families, whose conservation among microsporidians, against a background of reductive evolution, suggests they may have important functions in their parasitic lifestyle. The ancestor had already lost many metabolic pathways but retained glycolysis and the pentose phosphate pathway to provide cytosolic ATP and reduced coenzymes, and it had a minimal mitochondrion (mitosome) making Fe-S clusters but not ATP. It possessed bacterial-like nucleotide transport proteins as a key innovation for stealing host-generated ATP, the machinery for RNAi, key elements of the early secretory pathway, canonical eukaryotic as well as microsporidian-specific regulatory elements, a diversity of repetitive and transposable elements, and relatively low average gene density. 
Microsporidian genome evolution thus appears to have proceeded in at least two major steps: an ancestral remodelling of the proteome upon transition to intracellular parasitism that involved reduction but also selective expansion, followed by a secondary compaction of genome architecture in some, but not all, lineages.
Microsporidians are enormously successful obligate intracellular parasites of animals, including humans. Despite their economic and medical importance, there are major gaps in our understanding of how microsporidians have made the transition from a free-living organism to one that can only complete its life cycle by living inside another cell. We present the larger genome of Trachipleistophora hominis isolated from a human patient with HIV/AIDS. Our analyses provide insights into the gene content, genome architecture and intergenic regions of a known opportunistic pathogen, and will facilitate the development of T. hominis as a much-needed model species that can also be grown in co-culture. The genome of T. hominis has more genes than other microsporidians, it has diverse regulatory motifs, and it contains a variety of transposable elements coupled with the machinery for RNA interference, which may eventually allow experimental down-regulation of T. hominis genes. Comparison of the genome of T. hominis with other microsporidians allowed us to infer properties of their common ancestor. Our analyses predict an ancestral microsporidian that was already an intracellular parasite with a reduced core proteome but one with a relatively large genome populated with diverse repetitive elements and a complex transcriptional regulatory network.
Human malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree transmit malaria, suggesting that their vectorial capacity has evolved independently. Anopheles albimanus (subgenus Nyssorhynchus) is an important malaria vector in the Americas. The divergence time between Anopheles gambiae, the main malaria vector in Africa, and the Neotropical vectors has been estimated to be 100 My. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to explore mosquito biology beyond the An. gambiae complex.
We sequenced the transcriptome of the An. albimanus adult female. By combining Sanger, 454 and Illumina sequences from cDNA libraries derived from the midgut, cuticular fat body, dorsal vessel, salivary gland and whole body, we generated a single, high-quality assembly containing 16,669 transcripts, 92% of which mapped to the An. darlingi genome and covered 90% of the core eukaryotic genome. Bidirectional comparisons between the An. gambiae, An. darlingi and An. albimanus predicted proteomes allowed the identification of 3,772 putative orthologs. More than half of the transcripts had a match to proteins in other insect vectors and had an InterPro annotation. We identified several protein families that may be relevant to the study of Plasmodium-mosquito interaction. An open source transcript annotation browser called GDAV (Genome-Delinked Annotation Viewer) was developed to facilitate public access to the data generated by this and future transcriptome projects.
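The bidirectional proteome comparison mentioned above is commonly implemented as a reciprocal-best-hit search. The following is a minimal sketch under that assumption; the abstract does not state the exact procedure used. The protein identifiers, bit scores and function names are hypothetical, and only one pair of proteomes is shown, whereas the study compared three.

```python
# Minimal reciprocal-best-hit (RBH) sketch. Assumption: "bidirectional
# comparison" is treated here as RBH; all identifiers and scores are invented.

def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> similarity score (e.g. bit score).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical search results: species A proteins against species B, and back.
a_vs_b = {
    ("albA1", "gamB1"): 250,
    ("albA1", "gamB2"): 90,
    ("albA2", "gamB2"): 180,
    ("albA3", "gamB2"): 200,
}
b_vs_a = {
    ("gamB1", "albA1"): 245,
    ("gamB2", "albA3"): 198,
    ("gamB2", "albA2"): 175,
}

putative_orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(putative_orthologs)
```

In this toy data, albA2 has gamB2 as its best hit, but gamB2 prefers albA3, so albA2 is not reported as a putative ortholog. Extending the sketch to three proteomes would require a hit to be reciprocal in every pairwise comparison.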
We have explored the adult female transcriptome of one important New World malaria vector, An. albimanus. We identified protein-coding transcripts involved in biological processes that may be relevant to the Plasmodium lifecycle and can serve as the starting point for searching targets for novel control strategies. Our data increase the available genomic information regarding An. albimanus several hundred-fold, and will facilitate molecular research in medical entomology, evolutionary biology, genomics and proteomics of anopheline mosquito vectors. The data reported in this manuscript is accessible to the community via the VectorBase website (http://www.vectorbase.org/Other/AdditionalOrganisms/).
Anopheles albimanus; Transcriptome; Malaria; RNA-Seq
Transposable elements are the most abundant components of all characterized genomes of higher eukaryotes. It has been documented that these elements not only contribute to the shaping and reshaping of their host genomes, but also play significant roles in regulating gene expression, altering gene function, and creating new genes. Thus, complete identification of transposable elements in sequenced genomes and construction of comprehensive transposable element databases are essential for accurate annotation of genes and other genomic components, for investigation of potential functional interaction between transposable elements and genes, and for study of genome evolution. The recent availability of the soybean genome sequence has provided an unprecedented opportunity for discovery, and structural and functional characterization of transposable elements in this economically important legume crop.
Using a combination of structure-based and homology-based approaches, a total of 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear boundaries and insertion sites were structurally annotated and clearly categorized, and a soybean transposable element database, SoyTEdb, was established. These transposable elements have been anchored in and integrated with the soybean physical map and genetic map, and are browsable and visualizable at any scale along the 20 soybean chromosomes, along with predicted genes and other sequence annotations. BLAST search and other infrastructure tools were implemented to facilitate annotation of transposable elements or fragments from soybean and other related legume species. The majority (> 95%) of these elements (particularly a few hundred low-copy-number families) are first described in this study.
SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually curated transposable element database for any individual plant genome completely sequenced to date. Transposable elements previously identified in legumes, the third largest family of flowering plants, are relatively scarce. Thus this database will facilitate structural, evolutionary, functional, and epigenetic analyses of transposable elements in soybean and other legume species.
Heterochromatin plays an important role in chromosome function and gene regulation. Despite the availability of polytene chromosomes and genome sequence, the heterochromatin of the major malaria vector Anopheles gambiae has not been mapped and characterized.
To determine the extent of heterochromatin within the An. gambiae genome, genes were physically mapped to the euchromatin-heterochromatin transition zone of polytene chromosomes. The study found that a minimum of 232 genes reside in 16.6 Mb of mapped heterochromatin. Gene ontology analysis revealed that heterochromatin is enriched in genes with DNA-binding and regulatory activities. Immunostaining of the An. gambiae chromosomes with antibodies against Drosophila melanogaster heterochromatin protein 1 (HP1) and the nuclear envelope protein lamin Dm0 identified the major invariable sites of the proteins' localization in all regions of pericentric heterochromatin, diffuse intercalary heterochromatin, and euchromatic region 9C of the 2R arm, but not in the compact intercalary heterochromatin. To better understand the molecular differences among chromatin types, novel Bayesian statistical models were developed to analyze genome features. The study found that heterochromatin and euchromatin differ in gene density and the coverage of retroelements and segmental duplications. The pericentric heterochromatin had the highest coverage of retroelements and tandem repeats, while intercalary heterochromatin was enriched with segmental duplications. We also provide evidence that the diffuse intercalary heterochromatin has a higher coverage of DNA transposable elements, minisatellites, and satellites than does the compact intercalary heterochromatin. The investigation of 42-Mb assembly of unmapped genomic scaffolds showed that it has molecular characteristics similar to cytologically mapped heterochromatin.
Our results demonstrate that Anopheles polytene chromosomes and whole-genome shotgun assembly render the mapping and characterization of a significant part of heterochromatic scaffolds a possibility. These results reveal the strong association between characteristics of the genome features and morphological types of chromatin. Initial analysis of the An. gambiae heterochromatin provides a framework for its functional characterization and comparative genomic analyses with other organisms.
IS630/Tc1/mariner elements are diverse and widespread within insects. The African malaria mosquito, Anopheles gambiae, contains over 30 families of IS630/Tc1/mariner elements, although few have been studied in any detail. To examine the history of Topi elements in Anopheles gambiae populations, Topi elements (n = 73) were sampled from five distinct populations of Anopheles gambiae from eastern and western Africa and evaluated with respect to copy number, nucleotide diversity and insertion site-occupancy frequency. Topi 1 and 2 elements were abundant (10–34 per diploid genome) and highly diverse (π = 0.051). Elements from mosquitoes collected in Nigeria were Topi 2 elements and those from mosquitoes collected in Mozambique were Topi 1 elements. Of the 49 Topi transposase open reading frames sequenced, none were found to be identical. Intact elements with complete transposase open reading frames were common, although based on insertion site-occupancy frequency data it appeared that genetic drift was the major force acting on these IS630/Tc1/mariner-type elements. Topi 3 elements were not recovered from any of the populations sampled in this study and appear to be rare elements in Anopheles gambiae, possibly due to a recent introduction.
Topi; transposable elements; Tc1; mariner; Anopheles gambiae; malaria
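Nucleotide diversity (π), reported above for the Topi elements, is conventionally the mean proportion of differing sites over all pairs of aligned sequences. The sketch below illustrates that standard definition only; the toy sequences and function names are hypothetical, and the abstract does not say how gaps or ambiguous bases were handled (here they are simply skipped).

```python
# Minimal sketch of nucleotide diversity (pi) on pre-aligned sequences.
# Assumptions: equal-length alignment; sites with gaps or ambiguous bases
# in either sequence are excluded from that pairwise comparison.
from itertools import combinations


def pairwise_difference(seq_a, seq_b):
    """Proportion of comparable sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def nucleotide_diversity(sequences):
    """Mean pairwise difference per site over all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Hypothetical 10-bp alignment of four element copies.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 3))
```

For the toy alignment the six pairwise proportions sum to 0.6, giving π = 0.1. A value such as the reported π = 0.051 would correspond to roughly five differences per 100 sites between two randomly chosen copies.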
Malaria is a tropical disease caused by the protozoan parasite Plasmodium, which is transmitted to humans by various species of female anopheline mosquitoes. Anopheles stephensi is one such major malaria vector in urban parts of the Indian subcontinent. Unlike that of Anopheles gambiae, an African malaria vector, the transcriptome of A. stephensi midgut tissue is less explored. We have therefore carried out generation, annotation, and analysis of expressed sequence tags from sugar-fed and Plasmodium yoelii infected blood-fed (post 24 h) adult female A. stephensi midgut tissue.
We obtained 7061 and 8306 ESTs from the sugar-fed and P. yoelii infected mosquito midgut tissue libraries, respectively. ESTs from the combined dataset formed 1319 contigs and 2627 singlets, totaling to 3946 unique transcripts. Putative functions were assigned to 1615 (40.9%) transcripts using BLASTX against UniProtKB database. Amongst unannotated transcripts, we identified 1513 putative novel transcripts and 818 potential untranslated regions (UTRs). Statistical comparison of annotated and unannotated ESTs from the two libraries identified 119 differentially regulated genes. Out of 3946 unique transcripts, only 1387 transcripts were mapped on the A. gambiae genome. These also included 189 novel transcripts, which were mapped to the unannotated regions of the genome. The EST data is available as ESTDB at .
3946 unique transcripts were successfully identified from the adult female A. stephensi midgut tissue. These data can be used for microarray development for better understanding of vector-parasite relationship and to study differences or similarities with other malaria vectors. Mapping of putative novel transcripts from A. stephensi on the A. gambiae genome proved fruitful in identification and annotation of several genes. Failure of some novel transcripts to map on the A. gambiae genome indicates existence of substantial genomic dissimilarities between these two potent malaria vectors.
Determining the mechanisms by which transposable elements move within a genome increases our understanding of how they can shape genome evolution. Class 2 transposable elements transpose via a 'cut-and-paste' mechanism mediated by a transposase that binds to sites at or near the ends of the transposon. Herves is a member of the hAT superfamily of class 2 transposons and was isolated from Anopheles gambiae, a medically important mosquito species that is the major vector of malaria in sub-Saharan Africa. Herves is transpositionally active and intact copies of it are found in field populations of A. gambiae. In this study we report the binding activities of the Herves transposase to the sequences at the ends of the Herves transposon and compare these to other sequences recognized by hAT transposases isolated from other organisms.
We identified the specific DNA-binding sites of the Herves transposase. Active Herves transposase was purified using an Escherichia coli expression system and bound in a site-specific manner to the subterminal and terminal sequences of the left and right ends of the element, respectively, and also interacted with the right but not the left terminal inverted repeat. We identified a common subterminal DNA-binding motif (CG/AATTCAT) that is critical and sufficient for Herves transposase binding.
The Herves transposase binds specifically to a short motif located at both ends of the transposon but shows differential binding with respect to the left and right terminal inverted repeats. Despite similarities in the overall structures of hAT transposases, the regions to which they bind in their respective transposons differ in sequence ensuring the specificity of these enzymes to their respective transposon. The asymmetry with which the Herves terminal inverted repeats are bound by the transposase may indicate that these differ in their interactions with the enzyme.
Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression.
Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions as well as regions with similar gene and TE densities. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia elements showed more diversity, recent insertions and conserved domains than Gypsy elements, demonstrating their importance in genome evolution.
The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated.
Transposable elements; Flax; Genome evolution; LTR elements; Gene expression
microRNAs (miRNAs) are a highly abundant class of small noncoding regulatory RNAs that post-transcriptionally regulate gene expression in multicellular organisms. miRNAs are involved in a wide range of biological and physiological processes, including the regulation of host immune responses to microbial infections. Small-scale studies of miRNA expression in the malaria mosquito Anopheles gambiae have been reported, however no comprehensive analysis of miRNAs has been performed so far.
Using small RNA sequencing, we characterized de novo the A. gambiae miRNA repertoire expressed in adult sugar- and blood-fed females. We provided transcriptional evidence for 123 miRNAs, including 58 newly identified miRNAs. Out of the newly described miRNAs, 19 miRNAs are homologs of known miRNAs in other insect species and 17 miRNAs share sequence similarity restricted to the seed sequence. The remaining 21 novel miRNAs displayed no obvious sequence homology with known miRNAs. Detailed bioinformatics analysis of the mature miRNAs revealed a sequence variation occurring at their 5’-end and leading to functional seed shifting in more than 5% of miRNAs. We also detected significant sequence heterogeneity at the 3’-ends of the mature miRNAs, mostly due to imprecise processing and post-transcriptional modifications. Comparative analysis of arm-switching events revealed the existence of species-specific production of dominant mature miRNAs induced by blood feeding in mosquitoes. We also identified new conserved and fragmented miRNA clusters and A. gambiae-specific miRNA gene duplication. Using miRNA expression profiling, we identified the differentially expressed miRNAs at an early time point after regular blood feeding and after infection with the rodent malaria parasite Plasmodium berghei. Significant changes were detected in the expression levels of 4 miRNAs in blood-fed mosquitoes, whereas 6 miRNAs were significantly upregulated after P. berghei infection.
In the current study, we performed the first systematic analysis of miRNAs in A. gambiae. We provided new insights on mature miRNA sequence diversity and functional shifts in the mosquito miRNA evolution. We identified a set of the differentially expressed miRNAs that respond to normal and infectious blood meals. The extended set of Anopheles miRNAs and their isoforms provides a basis for further experimental studies of miRNA expression patterns and biological functions in A. gambiae.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-557) contains supplementary material, which is available to authorized users.
MicroRNAs (miRNAs) are small non-coding RNAs that post-transcriptionally regulate gene expression in a variety of organisms, including insects, vertebrates, and plants. miRNAs play important roles in cell development and differentiation as well as in the cellular response to stress and infection. To date, there are limited reports of miRNA identification in mosquitoes, insects that act as essential vectors for the transmission of many human pathogens, including flaviviruses. West Nile virus (WNV) and dengue virus, members of the Flaviviridae family, are primarily transmitted by Aedes and Culex mosquitoes. Using high-throughput deep sequencing, we examined the miRNA repertoire in Ae. albopictus cells and Cx. quinquefasciatus mosquitoes.
We identified a total of 65 miRNAs in the Ae. albopictus C7/10 cell line and 77 miRNAs in Cx. quinquefasciatus mosquitoes, the majority of which are conserved in other insects such as Drosophila melanogaster and Anopheles gambiae. The most highly expressed miRNA in both mosquito species was miR-184, a miRNA conserved from insects to vertebrates. Several previously reported Anopheles miRNAs, including miR-1890 and miR-1891, were also found in Culex and Aedes, and appear to be restricted to mosquitoes. We identified seven novel miRNAs, arising from nine different precursors, in C7/10 cells and Cx. quinquefasciatus mosquitoes, two of which have predicted orthologs in An. gambiae. Several of these novel miRNAs reside within a ~350 nt long cluster present in both Aedes and Culex. miRNA expression was confirmed by primer extension analysis. To determine whether flavivirus infection affects miRNA expression, we infected female Culex mosquitoes with WNV. Two miRNAs, miR-92 and miR-989, showed significant changes in expression levels following WNV infection.
Aedes and Culex mosquitoes are important flavivirus vectors. Recent advances in both mosquito genomics and high-throughput sequencing technologies enabled us to interrogate the miRNA profile in these two species. Here, we provide evidence for over 60 conserved and seven novel mosquito miRNAs, expanding upon our current understanding of insect miRNAs. Undoubtedly, some of the miRNAs identified will have roles not only in mosquito development, but also in mediating viral infection in the mosquito host.
Many genes involved in the immune response of Anopheles gambiae, the main malaria vector in Africa, have been identified, but whether naturally occurring polymorphisms in these genes underlie variation in resistance to the human malaria parasite, Plasmodium falciparum, is currently unknown. Here we carried out a candidate gene association study to identify single nucleotide polymorphisms (SNPs) associated with natural resistance to P. falciparum. A. gambiae M form mosquitoes from Cameroon were experimentally challenged with three local wild P. falciparum isolates. Statistical associations were assessed between 157 SNPs selected from a set of 67 A. gambiae immune-related genes and the level of infection. Isolate-specific associations were accounted for by including the effect of the isolate in the analysis. Five SNPs were significantly associated with the infection phenotype, located within or upstream of AgMDL1, CEC1, Sp PPO activate, Sp SNAKElike, and TOLL6. Low overall and local linkage disequilibrium indicated high specificity in the loci found. Association between infection phenotype and two SNPs was isolate-specific, providing the first evidence of vector genotype by parasite isolate interactions at the molecular level. Four SNPs were associated with either oocyst presence or load, indicating that the genetic basis of infection prevalence and intensity may differ. The validity of the approach was verified by confirming the functional role of Sp SNAKElike in gene silencing assays. These results strongly support the role of genetic variation within or near these five A. gambiae immune genes, in concert with other genes, in natural resistance to P. falciparum. They emphasize the need to distinguish between infection prevalence and intensity and to account for the genetic specificity of vector-parasite interactions in dissecting the genetic basis of Anopheles resistance to human malaria.
Anopheles gambiae is the main malaria vector in Africa, transmitting the parasite when it blood feeds on human hosts. The parasite undergoes several developmental stages in the mosquito to complete its life cycle, during which time it is confronted by the mosquito's immune system. The resistance of mosquitoes to malaria infection is highly variable in wild populations and is known to be under strong genetic control, but to date the specific genes responsible for this variation remain to be identified. The present study uncovers variations in A. gambiae immune genes that are associated with natural resistance to Plasmodium falciparum, the deadliest human malaria parasite. The association of some mosquito genetic loci with the level of infection depended on the P. falciparum isolate, suggesting that resistance is determined by interactions between the genome of the mosquito and that of the parasite. This finding highlights the need to account for the natural genetic diversity of malaria parasites in future research on vector-parasite interactions. The loci uncovered in this study are potential targets for developing novel malaria control strategies based on natural mosquito resistance mechanisms.
Anopheles mosquitoes are important vectors of malaria and lymphatic filariasis (LF), which are major public health diseases in Nigeria. Malaria is caused by infection with a protozoan parasite of the genus Plasmodium and LF by the parasitic worm Wuchereria bancrofti. Updating our knowledge of the Anopheles species is vital in planning and implementing evidence-based vector control programs. To present a comprehensive report on the spatial distribution and composition of these vectors, all published data available were collated into a database. Details recorded for each source were the locality, latitude/longitude, time/period of study, species, abundance, sampling/collection methods, morphological and molecular species identification methods, insecticide resistance status, including evidence of the kdr allele, and P. falciparum sporozoite rate and W. bancrofti microfilaria prevalence. This collation resulted in a total of 110 publications, encompassing 484,747 Anopheles mosquitoes in 632 spatially unique descriptions at 142 georeferenced locations being identified across Nigeria from 1900 to 2010. Overall, the highest number of vector species reported included An. gambiae complex (65.2%), An. funestus complex (17.3%), An. gambiae s.s. (6.5%), An. arabiensis (5.0%) and An. funestus s.s. (2.5%), with the molecular forms An. gambiae M and S identified at 120 locations. A variety of sampling/collection and species identification methods were used with an increase in molecular techniques in recent decades. Insecticide resistance to pyrethroids and organochlorines was found in the main Anopheles species across 45 locations. Presence of P. falciparum and W. bancrofti varied between species with the highest sporozoite rates found in An. gambiae s.s., An. funestus s.s. and An. moucheti, and the highest microfilaria prevalence in An. gambiae s.l., An. arabiensis, and An. gambiae s.s.
This comprehensive geo-referenced database provides an essential baseline on Anopheles vectors and will be an important resource for malaria and LF vector control programmes in Nigeria.
Y chromosomes are responsible for the initiation of male development, male fertility, and other male-related functions in diverse species. However, Y genes are rarely characterized outside a few model species due to the arduous nature of studying the repeat-rich Y.
The chromosome quotient (CQ) is a novel approach to systematically discover Y chromosome genes. In the CQ method, genomic DNA from males and females is sequenced independently and aligned to candidate reference sequences. The female to male ratio of the number of alignments to a reference sequence, a parameter called the chromosome quotient (CQ), is used to determine whether the sequence is Y-linked. Using the CQ method, we successfully identified known Y sequences from Homo sapiens and Drosophila melanogaster. The CQ method facilitated the discovery of Y chromosome sequences from the malaria mosquitoes Anopheles stephensi and An. gambiae. Comparisons to transcriptome sequence data with blastn led to the discovery of six Anopheles Y genes, three from each species. All six genes are expressed in the early embryo. Two of the three An. stephensi Y genes were recently acquired from the autosomes or the X. Although An. stephensi and An. gambiae belong to the same subgenus, we found no evidence of Y genes shared between the species.
The CQ method can reliably identify Y chromosome sequences using the ratio of alignments from male and female sequence data. The CQ method is widely applicable to species with fragmented genome assemblies produced from next-generation sequencing data. Analysis of the six Y genes characterized in this study indicates rapid Y chromosome evolution between An. stephensi and An. gambiae. The Anopheles Y genes discovered by the CQ method provide unique markers for population and phylogenetic analysis, and opportunities for novel mosquito control measures through the manipulation of sexual dimorphism and fertility.
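The chromosome quotient described above is a simple ratio, so a short sketch may help make it concrete. This is an illustration under stated assumptions, not the authors' implementation: the cutoff `max_cq`, the minimum male alignment count `min_male`, the sequence names and the counts are all hypothetical, and male and female libraries are assumed to have comparable sequencing depth.

```python
# Minimal sketch of the chromosome quotient (CQ): the female-to-male ratio of
# alignment counts for each candidate reference sequence.
# Assumptions: thresholds, names and counts are invented for illustration;
# male and female datasets are assumed to have similar sequencing depth.
from dataclasses import dataclass


@dataclass
class AlignmentCounts:
    female: int  # number of female reads aligned to the sequence
    male: int    # number of male reads aligned to the sequence


def chromosome_quotient(counts, min_male=1):
    """Return female/male alignment ratio, or None if male support is too low."""
    if counts.male < min_male:
        return None
    return counts.female / counts.male


def candidate_y_sequences(table, max_cq=0.2, min_male=20):
    """Select sequences whose CQ is at or below a (hypothetical) cutoff."""
    hits = {}
    for name, counts in table.items():
        cq = chromosome_quotient(counts, min_male)
        if cq is not None and cq <= max_cq:
            hits[name] = cq
    return hits


example = {
    "seq_A": AlignmentCounts(female=2, male=100),    # CQ near 0: Y-like
    "seq_B": AlignmentCounts(female=95, male=100),   # CQ near 1: autosome-like
    "seq_C": AlignmentCounts(female=210, male=100),  # CQ near 2: X-like
    "seq_D": AlignmentCounts(female=0, male=3),      # too few male alignments
}
y_candidates = candidate_y_sequences(example)
print(y_candidates)
```

Only seq_A passes in this toy table. Requiring a minimum number of male alignments is a design choice of the sketch: it keeps poorly covered sequences from being called Y-linked merely because no female reads happened to align.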