The zebrafish L chain isotypes were classified by Haire et al. (17) by C region identity (~30%); this degree of shared sequence is comparable to that which distinguishes mammalian κ and λ C regions (35–37%). Contig sequences from the database were selected for analysis based on the presence of matches with both L chain V and C regions. The organizations of loci encoding L chain types 1–3 are shown in Fig. 1. The diversity of the V and C genes has already been discussed (17); some considerations as to their derivation are in the legend. The organization and rearrangement potentials of the three L chain isotypes will be presented in the order: type 2, followed by type 3 and type 1. The type 2 locus is the most complex, containing multiple V, J, and C genes. The type 3 locus has a different organization, similar to one of the type 1 loci; the remaining type 1 loci are in the well-known cluster configuration.
FIGURE 1 Organization of representative genes encoding zebrafish L chains. Five contigs from the zebrafish database were analyzed for the arrangement of L chain V (gray boxes) and J gene segments and C exons (black boxes). The transcriptional polarity is indicated ...
Type 2 L chain organization
Type 2 genes are located over an area of 36 kb on contig NW_644395 (1759 kb). There are 12 V gene segments, 4 J gene segments, and 2 C exons (). There are no C sequences in the 4-kb interval between J2b, or after J2d. The C2a is identical with the published type 2 C region (17). The C2b sequence is 59% identical with C2a at the nucleotide level and 45% in the derived amino acid sequence. That is, it is well diverged from C2a but nonetheless distinguishable from type 1/type 3 C sequences (22–31% identity in all inter-isotype pairwise comparisons).
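The identity values quoted above are pairwise comparisons of aligned sequences. The text does not state which alignment program or denominator convention was used, so the following is only a minimal sketch of one common convention (matches divided by ungapped aligned columns). The function name `percent_identity` and the toy fragments are hypothetical and are not zebrafish sequences.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumes the two sequences are already aligned to equal length;
    other denominator conventions (e.g. full alignment length) also exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
toy_a = "ACGTACGT-A"
toy_b = "ACGAACGTTA"
print(round(percent_identity(toy_a, toy_b), 1))
```

The same function applies to aligned amino acid sequences, which is why nucleotide and amino acid identities for one pair of C exons can differ, as with C2a and C2b.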
All of the genes are in the same transcriptional orientation and are so closely packed that they might operate as one locus. The distance between C2a and V2g is <2.5 kb; if they are not differentially activated, conventional recombination between V and J would occur by deletion of intervening DNA to form a VJ () or deletion of C (). In any case, genes V2a through V2f can rearrange to J2c and be spliced to C2b. It is not clear whether any rearrangement to J2a or J2b would be efficiently spliced to C2b.
FIGURE 2 Recombination at the type 2 and type 3 loci. A, Deletion rearrangement at type 2 locus. B, Possible deletion rearrangement excising C2a exon. The excised region is indicated by brackets. C, Inversion rearrangement at type 3 (or type 1) locus. D, Inversion (more ...)
The multiplicity of J segments scattered throughout the locus provides the potential for recombination events that could inactivate the locus. Recombination between J2a and V2g-V2k would delete the intervening DNA carrying C2a ( and ), and assuming J2d is part of the activated locus, any rearrangement with J2d excises the intervening DNA carrying one or both C exons. The biological significance of these possibilities will be considered in Discussion.
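The reasoning about C exon loss can be made concrete with a small sketch. It assumes a locus in which every element has the same transcriptional orientation, so that joining any two segments deletes whatever lies between them. The element names and their order below are a hypothetical toy locus, not the mapped order of NW_644395, and the function names are likewise illustrative.

```python
from typing import List


def excised_by_deletion(locus: List[str], seg_1: str, seg_2: str) -> List[str]:
    """Return the elements lying strictly between two recombining segments.

    Assumes all elements share one transcriptional orientation, so the
    rearrangement removes the intervening DNA.
    """
    first, second = sorted((locus.index(seg_1), locus.index(seg_2)))
    return locus[first + 1:second]


def lost_c_exons(locus: List[str], seg_1: str, seg_2: str) -> List[str]:
    """C exons removed by a deletional rearrangement between two segments."""
    return [e for e in excised_by_deletion(locus, seg_1, seg_2) if e.startswith("C")]


# Hypothetical toy locus: V and J segments interspersed with two C exons.
toy_locus = ["Va", "Vb", "Ja", "Ca", "Vc", "Jb", "Cb"]

print(lost_c_exons(toy_locus, "Vb", "Ja"))  # nothing lies between them
print(lost_c_exons(toy_locus, "Ja", "Vc"))  # the C exon between J and V is lost
print(lost_c_exons(toy_locus, "Va", "Jb"))  # the upstream C exon is lost
```

In the toy locus, a join between an upstream J and a downstream V removes the C exon between them, which is the kind of event described above for J2a and V2g-V2k.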
Type 3 L chain organization
A type 3 locus is located on chromosome 5. There are eight V gene sequences, one J gene segment, and one C exon over 18 kb (, bottom); no other Ig sequences were detected in over 1600 kb on contig NW_634729. Six V gene segments are located upstream of the J and C; two more are 3′ of the C. Of the eight V genes, seven are in opposite transcriptional polarity to V3b, J3a, and C3a.
V genes on both sides of the J/C can rearrange functionally to the J. The V genes downstream can recombine by inversion of the J and C (), whereas those upstream will themselves invert to join J (). The finding that both downstream V gene segments, V3g and V3h, appear to be functional argues for their active usage. Thus, this type 3 locus consists of eight V gene segments, all potentially able to recombine with the J.
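The rule used in this section and the preceding one (segments in the same transcriptional orientation join by deletion, segments in opposite orientation join by inversion) can be written as a short sketch. The `Segment` class, the coordinates, and the segment names are hypothetical placeholders chosen for illustration; they are not the mapped positions of the type 3 locus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    position: int  # hypothetical coordinate, used only for ordering
    strand: str    # "+" or "-" transcriptional polarity


def rearrangement_mode(v: Segment, j: Segment) -> str:
    """Deletion if V and J share polarity, inversion if they are opposed."""
    for seg in (v, j):
        if seg.strand not in ("+", "-"):
            raise ValueError(f"unknown strand for {seg.name}: {seg.strand!r}")
    return "deletion" if v.strand == j.strand else "inversion"


def describe(v: Segment, j: Segment) -> str:
    """Describe the event, following the upstream/downstream cases in the text."""
    if rearrangement_mode(v, j) == "deletion":
        return "deletion of intervening DNA"
    # "Upstream" is defined relative to the transcriptional direction of J.
    if j.strand == "+":
        v_upstream = v.position < j.position
    else:
        v_upstream = v.position > j.position
    return "inversion of the V segment" if v_upstream else "inversion of the J and C"


# Hypothetical toy segments.
j = Segment("J", 500, "+")
print(describe(Segment("V_same", 300, "+"), j))
print(describe(Segment("V_up", 100, "-"), j))
print(describe(Segment("V_down", 900, "-"), j))
```

Under these assumptions, a V in the same polarity as the J is classified as deletional, while V segments in opposite polarity are classified as inversional on either side of the J/C.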
Type 1 L chain organization
A search with the zebrafish type 1 C sequence showed good matches from contigs NW_646408 and NW_633979 (, top); the former carries two C regions that are 99 and 100% identical with the LC1–8 cDNA sequence and the latter, one C region sharing 87% identity. Both contigs contain multiple V region sequences that were distinguished after searches with the type 1 V sequence and with the type 1 RSS. Thus, type 1 L chains are encoded by several loci, as originally deduced from the finding of different C sequences in a cDNA library (17).
NW_646408 carries four V gene segments interspersed among two J and C genes, as shown in , top left; its 5′ region is unknown, being at the very end of the contig. NW_633979 is assigned to chromosome 1 and contains seven V genes, one J gene segment, and one C exon over 18 kb (, top right). The organization and rearrangement potential are like those of the type 3 locus; it too apparently exists as an isolated Ig locus.
Additional matches were found using the tblastn program with the C1a amino acid sequence. On NW_644842 (34 kb long), an isolated V-V-J-C cluster was detected; this C sequence (C1k; see supplemental data) shared 58–62% identity with C1a-c at the amino acid level. On NW_633913, assigned to chromosome 19, seven partial and complete C sequences were found accompanied by J gene segments; the functional ones share 43–62% identity with C1a,b,c,k and 90–100% with each other, showing that they duplicated among themselves and diverged long ago from the others. Ten V sequences were detected, placed in the opposite transcriptional orientation to the nearest J and C, about one or two per cluster. Over a span of 516 kb, these genes form four clusters, separated by intervals of 3.7–418 kb. Thus, in contrast to the IgL on chromosome 1, these type 1 loci resemble the kind of small IgL clusters reported in other bony fishes.
The type 1 L chains in zebrafish are encoded both by small V-J-C clusters and by an “expanded” cluster like those in types 2 and 3. This is a range of variant gene organizations not previously observed in other fishes, perhaps due to the limitations of cloning with bacteriophage vectors.
Genomic Southern blotting results from our laboratory (not shown) for C sequences were identical with those obtained by Haire et al. (17), who also screened a zebrafish BAC library. The many (7–9) C1-hybridizing bands found in EcoRI-, HindIII-, or PstI-cut genomic DNA corroborate the maps in , showing that the type 1 L chain is encoded by multiple clusters, whereas the few (2) C-hybridizing bands for types 2 and 3 suggest about two loci each.
Type 1 and 3 share a common ancestor
Together with representative zebrafish sequences, the V and C regions of various teleost L chains were examined for their phylogenetic relationship (). These particular species were selected based on the existence of organizational information for one or all of their L chain types (). The deduced organization of salmon type 2 is based on the finding of a germline transcript containing an unrearranged V gene, the V-J intergenic sequence, and a J spliced to the C sequence. Such transcripts are only possible when the V, J, and C are in close proximity and in the same transcriptional orientation, like V2f-J2a-C2a (). The two reported in zebrafish and in salmon are both type 2 L chains (17).
FIGURE 3 Phylogenetic analyses of L chain V and C from various teleost fishes. Top, V domain. Bottom, C domain. All type 1/L1 sequences are indicated in blue, type 2/L2 in pink, and all type 3/L3 in yellow except salmon, which was named type 3, but which we designate ...
Teleost L chains grouped according to isotype and organization
In , top, it can be seen that the type 2 V regions grouped strongly together (>99% of bootstrap resamplings) and are distinct from the type 1/3, which appear to be intermixed. The close relationship of type 1 to type 3 is further supported by the similar configuration of the germline genes, in contrast to all the type 2, where the V are in the same transcriptional orientation as J and C (). The statistically robust phylogenetic groupings of sequences, together with the shared organizational characteristics, are the strongest evidence for the common derivation of type 1 and type 3 L chains.
The classification of a teleost fish L chain is most reliably established through its C region. The phylogenetic tree of the C regions (, bottom) shows strongly supported branches where sequences of the same isotype are clustered (>98% of bootstrap resamplings), suggesting common descent of all type 2 and of all type 1 C region genes among these fishes. The one exception among type 3 C regions is the salmon sequence named “type 3,” which is found amid the type 1 genes. Catfish, zebrafish, and carp, whose type 3 sequences group together and independently of the other C regions, all belong to Ostariophysi; in contrast, salmon belongs to Protacanthopterygii (see ). The type 3 L chain genes in Ostariophysi fishes thus descend from a common ancestral duplication from type 1 L chain genes. We suggest that, in contrast, the third salmon L chain characterized by Solem and Jørgensen (23) derives from an event specific to Protacanthopterygii or Salmoniformes (hence the asterisked “L3” is distinguished in ). A more recent divergence could explain why the salmon “type 3” and type 1 C regions share 61.5–63.3% amino acid identity, a relationship that is more comparable to zebrafish C1a-c, which share 58–62% identity. In contrast, between isotypes, the zebrafish type 1 and 3 share 23–27% and the catfish G (type 1) and F (type 3) share <35% (15); this level of identity distinguishes mammalian κ and λ C regions.
FIGURE 4 Taxonomic relationships among several teleost species. The taxonomic classifications of the teleost fishes listed in are displayed according to superorder, order, and genus (48). The L chains that have been reported here and in the references ...
Because type 2 is present in Ostariophysi, Protacanthopterygii, and Acanthopterygii (see below), its apparent absence in catfish and others might be attributed to species-specific deletion or to an L chain subpopulation that has yet to be identified. In fact, the catfish pronephros EST library contains an L chain sequence (accession no. CK403931) that may be a candidate for the catfish type 2 L chain. A preliminary analysis shows greater similarity to bony fish type 2 sequences than to the catfish F or G L chains; however, the establishment of its identity requires further investigation (M. Criscitiello, unpublished observation).
We suggest that type 1 and type 2 L chains have been present since the emergence of Euteleostei. Type 3 could have evolved early after Ostariophysi divergence because no type 3 homolog has so far been reported in other fish orders; if indeed that is the case, type 3 is by definition not a teleost isotype but a subtype.
Type 2 L chain sequence in puffer fish
We used the zebrafish type 2 V sequence to search the fugu database (http://fugu.biology.qmul.ac.uk/). Two sequences (clones M007644, accession no. DQ471453; M006921, DQ471454) are >2 kb in length and carry one V and one J gene segment and one C exon. All are in the same transcriptional orientation; the analysis of M007644 is shown in supplemental Fig. 1a. Searches using the fugu C sequence resulted in five more matches at 99–100% identity. Because M006921 has a V with a stop codon (supplemental Fig. 1a), M007644 a functional V, and another clone contains two J segments and one C (M007311), the heterogeneity suggests an organization of multiple clusters (). The fugu V and C sequences group with the other type 2 () and provide additional proof that type 1 and type 2 loci existed early in teleost radiation.
Searches were also performed with zebrafish C3 and C1 sequences; among fugu sequences matching to both, all showed greater similarity to C1. One sequence with similarity to only C3 encodes an Ig-like domain; however, there were no V, J, or RSS sequences within a 10-kb vicinity, nor were there any matches to transcribed fugu or tetraodon sequences.