Repeatedly during evolution, mosquitoes and other insects have adopted hematophagy to sustain abundant progeny production. In turn, blood feeding provided a new point of entry for pathogens. To counter assaults, innate immunity has evolved to recognize and respond to numerous pathogens, in a dynamic playoff where either host or pathogen may win. Although fundamental concepts mostly derive from Drosophila melanogaster (Dm), Anopheles gambiae (Ag) is now an important model for studies of innate immunity. A previous comparative analysis of Ag immune-related gene families (1) highlighted their diversification and pointed toward an expanded conceptual framework of insect innate immunity. The sequencing of the Aedes aegypti (Aa) genome
) permitted deeper understanding of insect immune systems, as displayed by two quite different mosquito species that diverged ~150 million years ago (Ma) and Dm, which separated from them ~250 Ma. This three-way comparison is considerably more powerful than the previous Dm-Ag study, because it allows measuring true genetic distances rather than unrooted sequence similarities. Taking advantage of the added value from multiple species comparisons, we explore the evolutionary dynamics of innate immunity in insects and how they can address both common and species-specific immune challenges.
Multiple large-scale bioinformatic methods, manual curation, and phylogenetic analyses (3) identified 285 Dm, 338 Ag, and 353 Aa genes from 31 gene families and functional groups implicated in classical innate immunity or defense functions such as apoptosis and response to oxidative stress (table S1). Additional limited analysis of nine sequenced genomes from four holometabolous insect orders, spanning 350 million years of evolution, further defined conserved family features and assisted manual gene model curation by gene family experts. The detailed core analysis (Aa
) is presented in the supporting online material (SOM) text and in figs. S1 to S22, and the total data set is organized into a web-accessible resource (http://cegg.unige.ch/Insecta/immunodb/), offering a comparative perspective across higher insects. All but 24 previously named Aa genes, as well as 79 previously unnamed Ag genes, were named in accordance with the nomenclature scheme devised for the Ag
) with the use of additional guidelines as described in the SOM; this information will be incorporated in the forthcoming manual annotations of the VectorBase resource (www.vectorbase.org).
Our conservative bioinformatic analysis of the complete genomes identified 4951 orthologous trios (1:1:1 orthologs in the three species) and 886 mosquito-specific orthologous pairs (absent from both Dm and the honeybee, Apis mellifera). Combined bioinformatic analysis and manual curation of the immune repertoire identified 91 trios and 57 pairs, plus a combined total of 589 paralogous genes in the three species. Paralogs derive from family expansions and gene losses, or cases of exceptionally high sequence divergence obscuring phylogenetic relationships. Orthologs most likely serve corresponding functions in respective organisms, whereas paralogs may have acquired different functions.
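The tallies reported above are mutually consistent, and the classification rule behind them can be sketched briefly. The following Python sketch is illustrative only: the input format (a mapping from species code to gene identifiers), the function name `classify_group`, and the gene identifiers are assumptions and do not come from the study's pipeline. It also simplifies the definition of a mosquito-specific pair by checking absence from Dm only, whereas the text additionally requires absence from the honeybee.

```python
from collections import Counter

SPECIES = ("Dm", "Ag", "Aa")


def classify_group(group):
    """Classify one orthologous group by its per-species gene counts.

    `group` maps a species code to a list of gene identifiers
    (hypothetical input format, not the study's own).
    """
    counts = {sp: len(group.get(sp, [])) for sp in SPECIES}
    if all(counts[sp] == 1 for sp in SPECIES):
        return "trio"  # 1:1:1 ortholog in all three species
    if counts["Ag"] == 1 and counts["Aa"] == 1 and counts["Dm"] == 0:
        return "mosquito_pair"  # simplified: honeybee absence is not checked
    return "other"  # expansions, losses, or unresolved relationships


# Made-up groups, for illustration only.
groups = [
    {"Dm": ["dm_gene1"], "Ag": ["ag_gene1"], "Aa": ["aa_gene1"]},
    {"Ag": ["ag_gene2"], "Aa": ["aa_gene2"]},
    {"Dm": ["dm_gene3"], "Ag": ["ag_gene3a", "ag_gene3b"], "Aa": ["aa_gene3"]},
]
tally = Counter(classify_group(g) for g in groups)

# Consistency check on the counts reported in the text.
total_immune_genes = 285 + 338 + 353   # Dm + Ag + Aa genes
genes_in_trios = 91 * 3                # each trio contributes three genes
genes_in_pairs = 57 * 2                # each pair contributes two genes
remaining_genes = total_immune_genes - genes_in_trios - genes_in_pairs
```

Under these assumptions, `remaining_genes` equals the 589 paralogous genes stated in the text, so the 91 trios, 57 pairs, and 589 paralogs together account for all immune-related genes identified in the three species.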
By definition, orthologous trios represent a numerically conserved subset of genes. Nevertheless, a plot of Dm-Aa and Dm-Ag phylogenetic distances, measured in terms of amino acid substitutions, revealed that, on average, immunity trio orthologs are significantly more divergent (~20%) than the totality of trios in the genomes (). Indeed, the immune repertoire is one of the most divergent functional groups as defined by Gene Ontology classifications (fig. S1A). Furthermore, with Dm as reference, several Ag immunity genes are considerably more divergent than their Aa orthologs. A similar trend among all 1:1:1 orthologs was detected, implying greater accumulation of amino acid substitutions in Anopheles. One hypothesis that merits detailed testing is whether this reflects a higher speciation rate and diverse habitat colonization by Anopheles as opposed to the more cosmopolitan Aedes.
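The two comparisons described in this paragraph can be expressed as simple summary statistics. The sketch below is a minimal illustration with made-up distance values; the function names and the numbers are assumptions, and it does not reproduce the study's phylogenetic distance estimation or its significance testing, neither of which is specified here.

```python
from statistics import mean


def relative_excess(subset, background):
    """Fractional excess of the subset mean over the background mean."""
    background_mean = mean(background)
    return (mean(subset) - background_mean) / background_mean


def fraction_more_divergent(dist_ag, dist_aa):
    """Share of trios whose Ag protein is farther from Dm than its Aa ortholog."""
    if len(dist_ag) != len(dist_aa):
        raise ValueError("distance lists must describe the same trios")
    return sum(a > b for a, b in zip(dist_ag, dist_aa)) / len(dist_ag)


# Hypothetical distances to the Dm ortholog (amino acid substitutions per site).
all_trios_to_dm = [0.8, 1.0, 1.2, 1.0]
immune_trios_to_dm = [1.1, 1.3, 1.2]
excess = relative_excess(immune_trios_to_dm, all_trios_to_dm)

# Hypothetical paired distances for the same trios in each mosquito.
ag_to_dm = [1.2, 0.9, 1.5]
aa_to_dm = [1.0, 1.0, 1.1]
share_ag_higher = fraction_more_divergent(ag_to_dm, aa_to_dm)
```

With these invented values the immunity subset is about 20% more divergent than the background, which mirrors the magnitude reported in the text but is not derived from the study's data.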
Fig. 1 (A) Divergence of orthologous trios. Immunity single-copy trios are compared with all single-copy trios in terms of genetic distances of each mosquito species (Ag or Aa) protein to the corresponding Dm ortholog (3) (fig. S1B). Signal transducers are highlighted.
Large variation exists in different immune families in their proportions of orthologous trios, mosquito pairs, and species-specific genes (). Some families display exclusively species-specific genes, some mostly trios, and others intermediate variation. At one extreme are apoptosis inhibitors (IAPs), oxidative defense enzymes [superoxide dismutases (SODs), glutathione peroxidases (GPXs), thioredoxin peroxidases (TPXs), and heme-containing peroxidases (HPXs)], and class A and B scavenger receptors (SCRs), all of which show predominantly trio orthologs. At the opposite extreme are highly diverse immune effector gene families, including three shared antimicrobial peptide (AMP) families that collectively exhibit no orthologous trio and only one confident mosquito orthologous pair. The C-type lectins (CTLs), which have been implicated in immunity as opsonins and modulators of melanization (see below), are intermediate, exhibiting large expansions while retaining nine trios and one pair. The present study reaffirms the family diversity observed in our previous Dm-Ag comparison and further reveals substantial diversity between the two mosquito species, at just over half the evolutionary distance.
A fascinating picture emerged when we disarticulated the immune responses into sequential phases (Figs. and ). Immune responses begin with molecular recognition of microbial patterns, producing immune signals. Some signals are modulated and/or transduced before activating effector mechanisms. We observed that each of the phases is characterized by different evolutionary dynamics, which may collectively account for the flexibility of the innate immune system that enables adaptation to new challenges.
Fig. 2 Evolution of immune signaling phases in insects. (A) Genes and gene families implicated in two immune signaling pathways, Toll and Imd (green and purple, respectively). The well-recognized phases of signaling, from recognition to effector production, (more ...)
Fig. 3 The melanization immune response evolves by convergence and is based on pathogen-related, species-specific regulatory modules. Components are highlighted and shown in relation to their closest phylogenetic relatives in Dm (blue), Ag (red), and Aa (yellow).
The immune recognition phase seems to achieve flexibility through divergent evolution: Gene duplications result in species- or lineage-specific expansions and generation of novel genes, whereas domain duplications lead to new gene architectures. Consequently, fruit fly and mosquito recognition proteins mostly form distinct clades within each family (see SOM). Nevertheless, sequence divergence between reduplicated recognition genes or domains remains limited, possibly reflecting the relatively limited diversity of microbial molecular patterns that are known to trigger immune responses. The peptidoglycan recognition proteins (PGRPs) and the Gram-negative binding proteins (GNBPs) are recognition receptor families that trigger signaling through Toll or Imd pathways as indicated in (4). The Gram-negative recognition protein Dm PGRP-LC, which functions in the Imd pathway, and its Anopheles ortholog each have three functional PGRP domains; however, these are more similar within species than between species, indicating phylogenetically separate domain reduplications. A sequence gap obscures the full structure of the Aedes PGRP ortholog, which apparently derives from the same domain reduplication events that created Ag PGRP. Separate reduplication of two adjacent PGRP-LC domains in Drosophila generated a novel gene, PGRP-LF, which is absent from mosquitoes.
The function of PGRP-LC in Dm is antagonized by catalytic PGRPs that cleave and inactivate peptidoglycan (5). Mosquitoes also possess catalytic PGRPs, but most have emerged as species-specific paralogs (Ag PGRPS2/3 and Aa PGRPS4/5). The fruit fly recognizes Gram-positive bacteria, which activate Toll, using the species-specific Dm PGRP-SD, as well as Dm PGRP-SA, which belongs to a trio and functions in conjunction with GNBP1, a recognition protein that processes polymeric peptidoglycan (7). The two additional Dm GNBPs are also fruit fly-specific; one of them, GNBP3, recognizes fungi, possibly through binding β1,3-glucans (8). A large expansion has generated five mosquito-specific B-type GNBPs, distinct from the two A-type orthologous pairs that resemble fruit fly GNBPs.
Recent studies in Ag identified two types of putative malaria parasite recognition receptors belonging to distinct structural classes: thioester-containing proteins (TEPs) and leucine-rich repeat (LRR) proteins. Members of each class have been associated with the killing and disposal of parasites by lysis or melanization. The TEP family is related to the vertebrate complement factors C3/C4/C5 and pan-protease inhibitors α2-macroglobulins. Ag TEP1 binds to the surface of Plasmodium berghei and mediates parasite killing (9); it also binds to bacteria and promotes phagocytosis (10). TEPs exhibit only one orthologous trio and otherwise form two groups: one with both Dm and mosquito TEPs and another with only mosquito species-specific clades (the latter group includes Ag TEP1) (). The second class of putative receptors includes LRR immune gene 1, the pioneer P. berghei LRR antagonist (12); others of similar function are Anopheles Plasmodium-responsive LRR 1 and LRR domain 7, which have been additionally implicated in resistance to P. falciparum, the human malaria parasite (13). Like TEP1, none of the three has identifiable orthologs in Aa
Immune modulation is an important process that regulates both the immediate aftermath of recognition and subsequent effector functions and evolves in a “mix and match” mode. Examples are modulation of Toll pathway activation and the melanization reaction, respectively. In both contexts, modulation uses a vast reservoir of serine proteases and their inhibitors [serpins or serine protease inhibitors (SRPNs)] or other regulators, from which particular components are picked to constitute species-specific regulatory modules.
Successful triggering of the Dm Toll pathway after fungal and Gram-positive recognition engages a dedicated proteolytic activation cascade of serine proteases and SRPNs, of which several have been identified recently (15). None of these proteins exhibit mosquito orthologs, and only Spirit and Grass have recognizable paralogs (). The cascade culminates in cleavage of Spaetzle by the Spaetzle-processing enzyme (SPE), releasing a cytokine that binds to Toll. Mosquitoes have several genes encoding Spaetzle-like proteins (SPZs), but their SPE has not been recognized. Suggestively, the short and very specific SPE cleavage site (16) recurs in Ag CLIP-domain serine protease B5 (Ag CLIPB5) and Aa CLIPB38, which are otherwise phylogenetically unrelated.
Similarly, activation of prophenoloxidases (PPOs) to phenoloxidases (POs), the executors of melanization, is induced by a protease cascade (mostly CLIPBs). The cascade is positively and negatively regulated by a network of inactive protease homologs (CLIPAs), CTLs, and SRPNs (). This melanization module is tightly controlled, because it generates toxic byproducts including reactive oxygen species. Reverse genetic analyses have identified a large set of Ag regulators for melanization of P. berghei
) or Sephadex beads (20): one SRPN, two CTLs, eight CLIPBs, and three CLIPAs (). Notably, all are members of mosquito-specific expansions, none has a definitive 1:1:1 ortholog, and only SRPN2 has a clear Aa ortholog. The reservoir of Aa proteases shows an underrepresentation of CLIPAs and massive expansions of CLIPBs as compared with both Ag and Dm. Finally, the melanization module may encompass additional regulators, because the genetic background determines which components are important in specific Ag
The observed diversity of modulation components suggests that related but distinct regulatory modules may evolve in different species and even in subspecific taxa. Recruitment of individual members from very large multigene families may be followed by modulatory fine-tuning through selection imposed by particular microbes. For example, several of the genes that negatively control P. berghei melanization in Ag [CTL4, CTL mannose-binding 2
), and SRPN2
] do not affect P. falciparum
). Because Ag is a natural vector of P. falciparum but not of P. berghei, it is appealing to speculate that the sets of regulators of the melanization module evolve with and are manipulated by parasites. This modular mix and match evolution hinders detailed knowledge transfer between vector species but reinforces its importance in shaping the immune response. Future experimental studies of the melanization module in Aa, which can melanize bacteria and filarial worms, as well as sporozoites of the avian parasite P. gallinaceum
), will be fruitful in further exploring this fascinating mode of immune evolution.
Although Toll-like receptors (TLRs) are found throughout the animal kingdom, phylogenetic and functional studies have suggested that insect Tolls and mammalian TLRs evolved independently (26). Most Dm Tolls serve developmental functions, and the recruitment of the Toll (Toll-1) receptor to immune signaling has been ascribed to convergent evolution. Even within insects, our analysis detects diversity: species-specific Toll expansions and only three trios. Dm Toll-1 has no clear orthologs; reduplications have created a clade of four Ag and four Aa genes, all related to both Dm Toll-1 and Dm Toll-5 (). In addition to its role in antifungal and antibacterial responses, Dm Toll-1 has been implicated in cellular antiviral responses (27). Thus, the possibility that the expanded Toll-1/Toll-5 clade in mosquitoes is related to their interactions with viruses merits detailed functional investigation. An unexpected evolutionary pattern was also observed for Spaetzle, the cytokine partner of Dm Toll-1, which shows three Aa paralogs and no identifiable Ag
SPZ1C acts together with Aa TOLL5A to activate antifungal responses (28); however, the absence of an Ag Spaetzle ortholog raises questions about the evolution of this pair of molecules as an immune module, especially because the cytokine-Toll interaction is not required for mammalian TLR signaling. The only insect Tolls that cluster with TLRs are Dm
TOLL9, and Aa TOLL9A/9B. Because Dm Toll-9 is the only other Toll linked to Drosophila
), it is possible that this clade represents the most ancient immune-related insect Tolls. Whether these receptors can directly recognize microbial or viral immune inducers remains to be seen; it is worth noting that they are more similar to lipid-binding TLRs than to nucleic acid-binding TLRs.
Signal transduction components exhibit an unexpected mode of evolution. Rather than duplicating to create novel cascades responding to distinct challenges, or picking up members of multiprotein families to promote adaptive interactions, these components show robustness, maintaining their distinctive identity and functionality in the face of sequence evolution. The cytoplasmic signal transduction of the Toll pathway includes a chain of interacting partners, almost invariably encoded by orthologous trios: myeloid differentiation factor 88 (MYD88), TUBE, PELLE, tumor necrosis factor receptor-associated factor 6 (TRAF6), and CACT (). The same is true for the components of the IMD pathway: IMD, Fas-associated death domain protein (FADD), Dredd (CASPL1), IAP2, transforming growth factor β-activated kinase (TAK1), and inhibitor of nuclear factor κB kinase subunits γ and β (IKKγ and IKKβ). Despite persistent orthology, these components show marked divergence in sequence (). A similar pattern is observed in the signal transducers Dome and Hop of the immune signaling Janus kinase-signal transducers and activators of transcription (JAK-STAT) pathway, which is activated in Dm by virus infections (30). We hypothesize that the requirement for these factors to interact productively with others in the same chain causes escalating sequence divergence: A mutation in one may enhance the acceptability of certain mutations in its interacting partner, maintaining pathway function through coherent evolution rather than stasis. Consistent with this interpretation, evidence has been reported for an association between natural sequence variation of core signaling pathway components and immune competence in Drosophila
). Similar evolutionary patterns are detected among members of the RNA interference antiviral pathway, Dicer-2 and Ago-2 (32), which also form highly divergent trios.
Signal transduction culminates in the next phase: nuclear translocation of transcription factors. The cytoplasmic nuclear factor κB (NF-κB) transcription factors remain inactive until a processed immune signal frees them from inhibitors, permitting their entry into the nucleus and transcription of effector genes. The evolutionary pattern in this phase combines aspects observed in other phases. The NF-κBs of the Imd pathway [Relish in Dm and Rel-like NF-κB protein 2 (REL2) in mosquitoes] form an orthologous trio that displays high sequence divergence, as in signal transducer trios (Figs. and ). A recent duplication in Aa has resulted in an orthologous quartet (Ag
REL1A, and Aa REL1B). In contrast, Dif is absent from both mosquito species, although the intronless Aa REL1B gene may have originated by retrotransposition. Transgenic analysis has shown that REL1A controls Aedes antifungal responses, as does Dif in Dm
); this represents an interesting case of functional transfer between paralogs. STAT, the transcription factor of the JAK-STAT pathway, shows high sequence divergence like REL2 and has been duplicated in Ag
Immune effectors are required to target and neutralize the microbial source of the immune signal. We observed varied evolutionary dynamics for different categories of effectors, reflecting their modes of action. Those acting directly on microbes diversify rapidly or are species-specific, whereas effector enzymes that produce chemical cues to attack invaders remain conserved but independently expand in each species.
The production of AMPs, which act on bacterial membranes causing lysis, is a classic immune-inducible effector response (). Seven AMP families exist in Dm, but only three of them were detected in mosquitoes: Defensins (DEFs), cecropins (CECs), and attacins (ATTs) are highly diverse, together displaying no orthologous trio and only one confident 1:1 orthologous pair. Conversely, gambicins are only encountered in mosquitoes. The apparent paucity of mosquito AMPs in contrast to Dm may be attributable to different prevalence of bacteria in their respective environments.
As diverse as AMPs, the large family of antibacterial peptidoglycan-hydrolyzing lysozymes (LYSs) shows only one identifiable trio and one mosquito pair among 28 members (). A marked expansion in Dm is ascribable to the use of LYSs for digestion of bacteria as a food resource: These peptides are atypically acidic and are expressed in the midgut but not in other immune tissues (34). Apart from these digestive Dm LYSs, the family forms two groups: one with both Dm and mosquito LYSs and the other with only species-specific clades of mosquito LYSs, a pattern very similar to that observed for TEPs, which are also thought to function both as recognition receptors and as complement effectors.
The family of PPO melanization effectors has expanded greatly in mosquitoes as compared with Dm and larger model insects. Ag
PPO6 is the only orthologous pair that clusters with Dm PPOs; the remaining 17 mosquito PPOs form a distinct clade, created by reduplication events both before and since Ag
diverged (). The invariable catalytic activity of POs (conversion of tyrosine to melanin) is likely to restrict their functional diversification, suggesting that observed expansions may reflect diversification to accommodate differential developmental, topological, or temporal activation. Indeed, several Aa PPOs show developmental or physiological specificity (35
, increased systemic levels of hydrogen peroxide (H2O2) have been associated with Plasmodium
is used as an electron acceptor by HPXs that catalyze various oxidative reactions. This effector family shows a small expansion in Aa and a large one in Ag, while retaining a set of eight orthologous trios including DUOX (dual HPX and NADPH-oxidase, where NADPH is the reduced form of nicotinamide adenine dinucleotide phosphate). The latter is associated with peroxidase-mediated nitration during the apoptotic response of midgut cells to Plasmodium
). Numerous trio orthologs of HPXs and other enzyme families implicated in oxidative defense show low sequence divergence, suggestive of constraints to preserve ubiquitous catalytic activities.
The availability of the genome sequences of distantly related insects has allowed us to apply comparative genomic methods to analyze the evolutionary dynamics of the insect innate immune repertoires. Notably, we identified distinct and seemingly contrasting evolutionary modes characterizing different immune modules, which together serve to provide a flexible system capable of adapting to new challenges. The repertoire of recognition receptors of microbial groups such as bacteria and fungi, which are encountered by all species, is achieved through expansion and fine-tuning of model genes. New functions (e.g., recognition of malaria parasites) are acquired from genes bearing powerful and ancient recognition domains such as LRRs. Protein networks modulating immune signals are assembled independently in each species, in the mix and match mode of evolution described as “bricolage” by François Jacob; they therefore coevolve with pathogens and may be subject to evasion. Pathways of signal transduction, on the other hand, remain highly conserved, and their constituent genes always seem to evolve in concert. Finally, effector mechanisms follow evolutionary patterns that depend on their mode of action; most are highly divergent or even species-specific, in contrast to the ancient, conserved oxidative defense mechanisms.
Recognition of the role of Toll in Drosophila immunity led directly to the identification of TLRs as a fundamental aspect of mammalian innate immunity. Similarly, the diverse evolutionary modes of insect immunity that we detected in the present study can guide future studies on the evolution of innate immune mechanisms in vertebrates and other animals. They can also facilitate targeted studies of immunity in the two mosquito species, which together transmit some of the most devastating infectious diseases of humankind.