New insight into the regulation of essential developmental, inflammatory and apoptotic pathways was achieved with the discovery and characterization of the NLR gene family of putative cytoplasmic pattern recognition molecules. While an increasing amount of information exists for these molecules in mammals, this gene family is poorly studied in other vertebrates with little to no information available even at the gene level for birds, amphibians and fish. This study resolves this issue by identifying and characterizing many NLR-like genes from these three classes of animals and uncovering a unique subfamily of NLRs in teleost fish.
Our evidence shows early evolution and high conservation of the NOD (NLR-A) subfamily of NLRs. All species of teleost fish that were analyzed had five distinct members of this subfamily, designated NLR-A1, NLR-A2, NLR-A3, NLR-A4 and NLR-A5, that were clear orthologs of human NOD1 to NOD5 [1]. A NOD1 ortholog was also described during an earlier screen of zebrafish ESTs for molecules similar to apoptosis regulators [17]. In addition to encoded NACHT domains, the effector domains and LRR regions were highly conserved in the fish NLR-A genes relative to their human equivalents, suggesting retained function. NLR-A1 and NLR-A2, the fish orthologs of human NOD1 and NOD2 respectively, both possessed clear CARD domains (one in NLR-A1 and two in NLR-A2) with high amino acid identity to the equivalent regions of the human molecules. In mammalian NOD1 (and presumably NOD2), the CARD domains are necessary for the interaction with RICK kinase, an enzyme that participates in NFκB activation and, ultimately, the generation of pro-inflammatory molecules [18]. Since RICK is also present in fish genomes (see zebrafish RIPK2, Q4V958), it would appear that this inflammatory cascade was established prior to the divergence of teleost fish from the tetrapod lineage, assuming that the same interaction occurs between these molecules in fish. The highly conserved sequences in the LRR domains imply that zebrafish NLR-A1 and NLR-A2 may also be able to recognize meso-DAP and muramyl dipeptide, as mammalian NOD1 and NOD2 do respectively [13], although this requires formal confirmation. NLR-A1 transcript was detected equally in intestine, liver and spleen, reflecting the widespread distribution observed for murine NOD1 [18], while NLR-A2 was strongly expressed in intestine, with some expression in spleen and barely detectable levels in liver. Consistent with the predominant expression of NLR-A2 in zebrafish intestine, human NOD2 has a more restricted expression pattern, with predominant expression in cells of myeloid origin, including monocytes [9], and in Paneth cells [19] that are associated with the gut, although expression of NOD2 can also be induced in epithelial cells [20]. Zebrafish NLR-A3 is clearly an ortholog of mammalian NOD3, with similarity in the effector and NACHT domains and an equal number of LRR domains. At the genomic level, NOD3 is flanked by RHOT2, SBK1 and PDPK1, GNPTG respectively in zebrafish and fugu, further supporting the orthologous relationship for NOD3 between fish species. Expression of NLR-A3 was strong in zebrafish intestine, with some expression also in liver and little to no expression observed in the spleen. Interestingly, the kidney (bone marrow equivalent) also did not express NLR-A3, suggesting that it is not expressed by lymphocytes (data not shown). In mammals, NOD3 expression occurs primarily in lymphocytes and has been associated with inhibition of T-cell activity [21]. Two other NLR-A subfamily members, designated NLR-A4 and NLR-A5, were also identified in zebrafish, with NLR-A4 resembling human NOD4 and NLR-A5 being highly conserved relative to human NOD5. NLR-A4/NOD4 genes represent the most divergent members of this subfamily based on amino acid conservation within the N-terminal and LRR regions between different vertebrate orthologs. Both NLR-A4 and NLR-A5 genes were constitutively expressed in intestine, spleen and liver of naïve zebrafish, although there is clearly some fish-to-fish variation that prevented their detection in some individuals under the conditions used for RT-PCR. Currently, there is no information concerning the expression patterns or functions of these latter two NLRs in mammals.
Whereas NOD1, NOD3, NOD4 and NOD5 appear to be conserved in bird and amphibian genomes, the gene for NOD2 was identified in neither the chicken nor the frog genome. This would suggest that NOD2 has been deleted from the genomes of these species, although the genome of Xenopus tropicalis is, at present, incomplete. This is surprising since NOD2, in mammals, appears to be a highly important sensor for intracellular microbial molecules. However, chickens do possess a NALP3 ortholog (see below), which represents another potential PRR for muramyl dipeptide [22] and may functionally replace NOD2 in this species.
Members of the NALP subfamily are also evident in lower vertebrates. Six genes were identified in zebrafish (Zv6) for NALP-like molecules (NLR-B1), and ten predicted NALP-like genes (nicknamed NALPa) were found for Xenopus. These genes clustered separately for each species, suggesting that recent duplication events formed the NALP subfamilies independently in fish, amphibians and mammals. The closest human ortholog of the amphibian and fish NALPs appears to be NALP6. A single NALP-like sequence predicted in the chicken genome (ENSGALG00000005155) and in the Uniprot database (Q5F3J4) clusters closest to the group of human NALPs 1, 3, 10, and 12 when analyzed phylogenetically (although not with strong bootstrap support) and has recently been given the name NLRP3 (previously designated CIAS1/NALP3). Chicken NALP was identified on chromosome 5, separate from the chicken NOD5 gene (chromosome 24). Although sequence variation makes accurate comparisons difficult, it is likely that this chicken gene arose from a different NALP than did the fish and amphibian NALPs, with the ancestral NALP(s) possibly lost from the genome. The discovery of multiple NALP-like proteins in lower vertebrates contradicts a recent hypothesis by Hughes suggesting that the NALP subfamily evolved only in mammals [15]; there is clear evidence that a gene encoding the NACHT domain of at least one NALP (possibly a NALP6-like gene) was present prior to the fish-tetrapod split. Zebrafish NALPs are situated at two distinct chromosomal locations: four of these genes (NLR-B1) are located on chromosome 2, and the other two (NLR-B5) can be found near NLR-A5 on chromosome 15; the new assembly of the zebrafish genome (Zv7) suggests these two sequences may represent the same gene. It should be pointed out that although chicken NLRP3 has an N-terminal PYRIN domain, the N-terminal domains for the Xenopus and zebrafish NALPs were not identified. One exception was NLR-B2, which appears to have a domain that resembles a CARD and not a PYRIN domain as would be expected from its similarity to the mammalian NALPs. No PYRIN domains are observed for the Xenopus NALP-like sequences and, other than the PY-CARD protein (prediction ENSXETT00000004042), no PYRIN domains were predicted in the Xenopus genome. These observations may indicate that early ancestors of NALPs lacked these effector domains and that the PYRIN domain (or CARD domain in the case of NLR-B1) was acquired later. Whether these NALP-like genes encode functional PRRs in poikilotherms remains uncertain; however, NLR-B2 transcript was detectable in zebrafish intestine, spleen and liver, suggesting this may represent a functional gene.
In addition to the NOD- and NALP-like subfamilies, a unique subfamily of NLRs was identified in teleost fish, and designated NLR subfamily C (NLR-C). This subfamily is interesting for several reasons. Firstly, all teleost genome (and EST) databases show numerous NLR-C genes, amounting to several hundred of these genes in a single species. Secondly, these genes all possess a central NACHT domain that is highly similar to the NACHT domain of NOD3 (NLR-A3), suggesting they evolved from an NLR-A3-like molecule, yet many of these genes possess a PYRIN domain at their N-terminus, making them more structurally similar to mammalian NALP molecules. Finally, following the LRR domain, many of these molecules (representatives found in all bony fish) possess a B30.2 (PRY-SPRY) domain, which may allow them to interact with molecules distinct from those bound by standard NLRs and thus perform some novel function. B30.2 domains are also found on some tripartite motif-containing (TRIM) proteins [23] and on the PYRIN molecule [24] (Fig ) and have several roles related to immunity. TRIM5α has been shown to inhibit retroviral activity by directly binding the capsid of the HIV retrovirus [25], and PYRIN has been shown to inhibit the activity of caspase-1 by directly binding to the active site of this enzyme [24], both using their B30.2 domains for these interactions. Each of these functions would fit with the role of NLRs as intracellular PRRs: the ability to bind viruses could be an extension of the pattern detection system attributed to the neighboring LRR domain, while the potential to inhibit caspase-1 activity may make NLR-C molecules important negative regulators of the inflammasome in teleost fish. The latter function would mirror gene families of cell surface receptors, such as killer immunoglobulin-like receptors (KIRs) [26] or novel immune-type receptors (NITRs) [27], that possess many inhibitory receptors and a small number of stimulatory receptors for controlling cellular activation. It is also interesting that these molecules all contain a NACHT domain similar to NOD3, since mammalian NOD3 has an inhibitory role in T cells [21]. Importantly, since the predicted N- and C-termini of some NLR-Cs are structurally similar to the two domains of the PYRIN molecule, this would also fit with a potential function of mimicking PYRIN. However, additional studies are required to determine what, if any, role in the immune system NLR-C molecules may play.
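The domain arrangements discussed above can be laid out side by side in a short sketch. The Python below is illustrative only and is not the annotation procedure used in this study: the dictionary keys, the helper names `has_pyrin_like_termini` and `shared_domains`, and the reduction of each molecule to an ordered list of domain labels are all assumptions made for this example. The NLR-C entry shows only the form that carries both a PYRIN and a B30.2 domain, which many but not all NLR-C genes possess.

```python
# Illustrative domain orders (N- to C-terminus), simplified from the text.
# Keys and helper names are hypothetical labels chosen for this sketch.
ARCHITECTURES = {
    "NLR-A1": ["CARD", "NACHT", "LRR"],
    "NLR-A2": ["CARD", "CARD", "NACHT", "LRR"],
    "NLR-C (PYRIN/B30.2 form)": ["PYRIN", "NACHT", "LRR", "B30.2"],
    "PYRIN": ["PYRIN", "B30.2"],  # reduced to the two domains discussed here
}


def has_pyrin_like_termini(domains):
    """Return True if the first and last domains match PYRIN's two domains."""
    return len(domains) >= 2 and domains[0] == "PYRIN" and domains[-1] == "B30.2"


def shared_domains(name_a, name_b):
    """Return the set of domain types present in both named architectures."""
    return set(ARCHITECTURES[name_a]) & set(ARCHITECTURES[name_b])


if __name__ == "__main__":
    # Print each architecture and whether its termini resemble PYRIN's.
    for name, domains in ARCHITECTURES.items():
        print(name, "-".join(domains), has_pyrin_like_termini(domains))
```

In this simplified view, the NLR-C form shares its NACHT and LRR domains with the NLR-A members and its terminal domains with PYRIN, which is the structural basis for the PYRIN-mimicry suggestion; the sketch says nothing about sequence similarity or function.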
The evolutionary processes generating the vast subfamily of NLR-C genes are not clear and appear very complex. The relationships are further confused by apparent errors in the assembly of the zebrafish genome (Zv6 versus Zv7), as evidenced by clear differences in the mapping of some NLR-C genes to their predicted chromosomes between assembly versions [see Additional file 2]. However, the evidence for tandem duplications of individual genes within a chromosomal locus is consistent between Zv6 and Zv7; these duplications result in NLR-C genes adopting new exons encoding distinct N-terminal domains and/or C-terminal domains via exon-shuffling. The clusters of tandem NLR-C genes appear to have undergone en bloc duplication, to generate further clusters in the same locus (cis duplication), or within distinct loci or chromosomes (trans duplication) through translocation. Single genes may also have duplicated independently multiple times, both within established loci and to create new loci, before and after the formation of new gene structures. A large-scale duplication of this gene family may be explained by the teleost-specific genome duplication event (3R) occurring early in the evolution of teleost fish, which followed two rounds of complete genome duplication (2R) observed early in the evolution of the vertebrate lineage [28]. Should this be the case, mutations and deletions of many of the duplicated genes would be expected, to remove redundancy from the genome [29]. Therefore, many NLR-C genes may be non-functional genes or pseudogenes, although a small number have likely established new functions. Clearly, it is too early to assign functionality to these genes, except to note that many are transcribed and are presumably translated into protein products. Transcripts for NLR-C were detected, in this study, in three distinct tissues in naïve zebrafish, and many more can be identified in the EST databases for this fish species.
A CIITA-like gene was identified in the zebrafish genome (Zv7). In mammals, CIITA is an important molecule for controlling the expression of both major histocompatibility complex class I and class II molecules and is therefore significant for antigen presentation to T lymphocytes. Defects in human CIITA gene expression have been linked to several immune disorders [5]. However, alternative molecules have been implicated in the induction of antigen presentation pathways [30], including other members of the NLR family, such as NALP12 [31]. NAIP/IPAF homologs have been identified in the sea urchin [16], implying that the ancestral NLR resembled one of these molecules. However, neither NAIP nor IPAF was identified in the fish genomes at this time, although IPAF was evident in the frog genome, suggesting that the genes for these molecules may have been lost from the fish genomes during the teleost-specific genome duplication event.