Adaptins are subunits of adaptor protein (AP) complexes involved in the formation of intracellular transport vesicles and in the selection of cargo for incorporation into the vesicles. In this article, we report the results of a survey for adaptins from sequenced genomes including those of man, mouse, the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, the plant Arabidopsis thaliana, and the yeasts, Saccharomyces cerevisiae and Schizosaccharomyces pombe. We find that humans, mice, and Arabidopsis thaliana have four AP complexes (AP-1, AP-2, AP-3, and AP-4), whereas D. melanogaster, C. elegans, S. cerevisiae, and S. pombe have only three (AP-1, AP-2, and AP-3). Additional diversification of AP complexes arises from the existence of adaptin isoforms encoded by distinct genes or resulting from alternative splicing of mRNAs. We complete the assignment of adaptins to AP complexes and provide information on the chromosomal localization, exon-intron structure, and pseudogenes for the different adaptins. In addition, we discuss the structural and evolutionary relationships of the adaptins and the genetic analyses of their function. Finally, we extend our survey to adaptin-related proteins such as the GGAs and stonins, which contain domains homologous to the adaptins.
The term “adaptin” was coined by Barbara Pearse (1975) to designate a group of ~100-kDa proteins that copurified with clathrin upon isolation of clathrin-coated vesicles. The ~100-kDa proteins were later found to be subunits of heterotetrameric adaptor protein (AP) complexes, and the term “adaptin” was extended to all subunits of these complexes. Four basic AP complexes have been described: AP-1, AP-2, AP-3, and AP-4. Each of these complexes is composed of two large adaptins (one each of γ/α/δ/ε and β1–4, respectively, 90–130 kDa), one medium adaptin (μ1–4, ~50 kDa), and one small adaptin (ς1–4, ~20 kDa) (Figure 1A) (reviewed by Kirchhausen, 1999; Lewin and Mellman, 1998; Robinson and Bonifacino, 2001). The analogous adaptins of the four AP complexes are homologous to one another (21–83% identity at the amino acid level). In general, the subunits of different AP complexes are not interchangeable, with the exception of some nonmammalian β1/2 hybrid proteins (see below), and possibly mammalian β1 and β2, which can be components of both AP-1 and AP-2. Some of the adaptins occur as two or more closely related isoforms encoded by different genes. Additional diversity arises from alternative splicing of adaptin mRNAs. Thus, cells that express several of these adaptin variants have the potential to assemble a diverse array of AP complexes. AP-1, AP-2, and AP-3 are expressed in all eukaryotic cells examined to date. AP-4, on the other hand, is ubiquitously expressed in man (Homo sapiens), mouse (Mus musculus), chicken (Gallus gallus), and the plant Arabidopsis thaliana, but not in the fruit fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe.
AP complexes are components of protein coats that associate with the cytoplasmic face of organelles of the secretory and endocytic pathways. The complexes participate in the formation of coated vesicular carriers, as well as in the selection of cargo molecules for incorporation into the carriers. AP-2 mediates rapid endocytosis from the plasma membrane, while AP-1, AP-3, and AP-4 mediate sorting events at the trans-Golgi network (TGN) and/or endosomes (Figure 1B). AP-1 and AP-2 function in conjunction with clathrin, whereas AP-4 is most likely part of a nonclathrin coat. Mammalian (but not yeast) AP-3 has been shown to interact with clathrin, but the functional significance of this interaction is still unclear. The AP complexes have the overall shape of a “head” with two protruding “ears” connected to the head by flexible “hinge” domains (Figure 1A).
Recent studies have identified two additional families of proteins, the GGAs (Golgi-localizing, γ-adaptin ear homology, ARF-binding proteins) (Boman et al., 2000; Dell'Angelica et al., 2000b; Hirst et al., 2000; Poussu et al., 2000; Takatsu et al., 2000), and the stonins (Andrews et al., 1996; Martina et al., 2001; Walther et al., 2001), which share partial homology with the adaptins but are not components of AP complexes (Figure 1A). The GGAs contain a carboxy-terminal domain homologous (28–30% identity at the amino acid level) to the ear domain of the γ-adaptin subunit of AP-1. They function as monomeric adaptors for ARF (ADP-ribosylation factor)-dependent recruitment of clathrin to the TGN, and they mediate sorting of mannose 6-phosphate receptors and sortilin from the TGN to endosomes (Nielsen et al., 2001; Puertollano et al., 2001a; Puertollano et al., 2001b; Takatsu et al., 2000; Zhdankina et al., 2001; Zhu et al., 2001). The stonins are related to the D. melanogaster stoned B protein and exhibit homology (22–25% identity at the amino acid level) to the carboxy-terminal domain of the μ adaptins (Andrews et al., 1996; Martina et al., 2001; Walther et al., 2001). The available evidence points to a role for at least some of the stonins in endocytosis (Fergestad and Broadie, 2001; Fergestad et al., 1999; Martina et al., 2001; Stimson et al., 2001).
The adaptins are also distantly related (16–21% identity at the amino acid level) to subunits of the heteroheptameric COPI (coat protein I) or coatomer complex, a protein coat that functions in ER-Golgi and endosomal transport pathways. The large AP subunits are related to the β-COP and γ-COP subunits of COPI, while the medium and small AP subunits are related to the δ-COP and ζ-COP subunits of COPI, respectively. Together, β-, γ-, δ- and ζ-COP constitute the heterotetrameric F-COPI subcomplex (Fiedler et al., 1996). COPI comprises three additional subunits named α-COP, β'-COP, and ε-COP that are not related to the adaptins. These subunits constitute the B-COPI subcomplex (Fiedler et al., 1996), which is thought to subserve a function similar to that of clathrin.
Because of the critical roles of adaptins and related proteins in intracellular protein trafficking, it is of utmost importance to identify the complete repertoire of these proteins in eukaryotes. This goal is now achievable thanks to the recent completion of the sequencing of the genomes of humans and model organisms such as M. musculus, D. melanogaster, C. elegans, A. thaliana, S. cerevisiae, and S. pombe (Adams et al., 2000; The C. elegans Sequencing Consortium, 1998; Goffeau et al., 1996; The Arabidopsis Genome Initiative, 2000; Lander et al., 2001; Venter et al., 2001). The following sections describe the findings of a genome-wide survey for adaptins, GGAs, and stonins in these organisms. COPI subunits are beyond the scope of this essay and are only discussed in relation to the adaptins.
An inventory of adaptins was compiled from information published in the literature or obtained from the following internet resources: the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov), The Genome Database (http://www.gdb.org), the Mouse Genome Informatics at The Jackson Laboratory (http://www.informatics.jax.org/), the D. melanogaster genome at NCBI (http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/7227.html), the Sanger Center C. elegans Genome Project (http://www.sanger.ac.uk/Projects/C_elegans/blast_server.shtml), and The Stanford University Saccharomyces Genome Database search page (http://genome-www.stanford.edu/Saccharomyces/). To search for novel human adaptins, we used the TBLASTN algorithm (http://www.ncbi.nlm.nih.gov/genome/seq/page.cgi?F=HsBlast.html&&ORG=Hs) at the NCBI human genome BLAST web page. Adaptins in organisms other than human were found using the BLASTP and TBLASTN algorithms at the NCBI web page. Homologous hit sequences were reanalyzed using the same algorithm to check for the closest human relative and assigned a name accordingly. Consensus secondary structure predictions and sequence alignments were performed using the Multialign and ClustalW programs available at the Pôle Bio-informatique Lyonnais (http://npsa-pbil.ibcp.fr/). Protein family (Pfam) domains are listed in the Pfam Homepage (http://pfam.wustl.edu/) at Washington University, St. Louis, MO.
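Percent identity values are quoted throughout this survey. As a minimal sketch of what such a figure measures, the Python snippet below computes percent identity and the fraction of gapped columns from two pre-aligned sequences. It is not the procedure used here (the values reported in this article come from BLAST and ClustalW output, and programs differ in how they treat gaps in the denominator). The function names and the toy sequences are hypothetical and serve only as illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment,
    with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def percent_gaps(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns that contain a gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    gapped = sum(1 for a, b in zip(aligned_a, aligned_b) if a == "-" or b == "-")
    return 100.0 * gapped / len(aligned_a)


# Toy alignment (made up, not a real adaptin sequence).
seq_a = "MKL-IVDE"
seq_b = "MRLAIV-E"
print(round(percent_identity(seq_a, seq_b), 1))  # 83.3 (5 of 6 ungapped columns)
print(percent_gaps(seq_a, seq_b))                # 25.0 (2 of 8 columns)
```

Because the denominator here excludes gapped columns, a short, heavily gapped alignment can still score a high identity, which is why gap content is worth reporting alongside identity and similarity.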
Table 1 lists all the adaptins and related proteins found in mammals and other eukaryotes with sequenced genomes. Table 2 summarizes the phenotypes resulting from disruption, RNA interference, or naturally occurring mutations of genes encoding adaptins and adaptin-related proteins in organisms from yeast to humans. Supplemental Tables 1 and 2 (all supplemental Tables are found online) contain an inventory of the names, chromosomal location, number of exons and size of the genes, and the accession codes for all human and mouse adaptins, respectively. Supplemental Table 3 lists potential human pseudogenes. Supplemental Tables 4–9 summarize information on adaptins in A. thaliana, D. melanogaster, C. elegans, S. cerevisiae, S. pombe, and other organisms, in that order.
Both humans and mice express two γ (γ1 and γ2), one β (β1), two μ (μ1A and μ1B) and three ς (ς1A, ς1B, and ς1C) adaptin(s) (Table 1, Supplemental Tables 1 and 2). Although there is only one gene encoding β1, two isoforms can be generated by alternative splicing of exon 15 (Peyrard et al., 1994). ς1C is a novel isoform of ς1 encoded on chromosome 2 that was identified in our analyses. The predicted amino acid sequence for ς1C adaptin is equally homologous to the ς1A and ς1B adaptins (Supplemental Figure 1A). All of these proteins are known or predicted to assemble into various forms of the heterotetrameric AP-1 complex.
The AP-1 subunits are expressed in all mammalian tissues and cells examined except for μ1B, which is exclusively expressed in polarized epithelial cells (Ohno et al., 1999). For ς1C, human ESTs can be found from a variety of sources, such as kidney (accession number BG166479), colon (BG386072), brain (BF697657), and B-cells (BG340480), suggesting that it may be ubiquitously expressed. Homozygous disruptions of the genes encoding γ1 or μ1A cause embryonic lethality in mice, indicating that the AP-1 complex is essential for viability (Zizioli et al., 1999; Meyer et al., 2000) (Table 2). An embryonal fibroblast cell line deficient in μ1A adaptin exhibited accumulation of mannose 6-phosphate receptors in endosomes, suggesting a role for the AP-1 complex containing μ1A (i.e., AP-1A) in sorting from endosomes to the TGN (Meyer et al., 2000). The absence of μ1B expression in the polarized epithelial cell line LLC-PK1 (Ohno et al., 1999), on the other hand, was linked to impaired sorting of LDL receptor and other transmembrane proteins to the basolateral plasma membrane domain (Fölsch et al., 1999) (Table 2). Thus, the form of AP-1 containing μ1B (i.e., AP-1B) appears to be involved in basolateral targeting. The functional importance of other mammalian AP-1 subunit isoforms (e.g., γ2, ς1A, ς1B, ς1C) is unknown.
BLASTP searches of GenBank revealed an additional human cDNA termed FLJ10813 encoding a protein that is distantly related to the medium adaptins, with human and mouse μ1B being the most homologous (24% identity and 37% similarity at the amino acid level, but with 17% sequence gaps). The FLJ10813 protein is predicted to be truncated at the carboxy terminus relative to the μ chains, suggesting that it may not function as a μ adaptin. Interestingly, while no homologues can be found in D. melanogaster, C. elegans, or S. cerevisiae, a homologous protein (accession number AC006234) exists in A. thaliana.
Genes encoding two α (α1 and α2), one β (β2), one μ (μ2) and one ς (ς2) adaptin(s) have been found in both humans and mice (Table 1, Supplemental Tables 1 and 2). The human and mouse α2 adaptin sequences have been known for some time (Robinson, 1989; Faber et al., 1998), whereas the sequence of human α1 adaptin is annotated as a hypothetical protein in GenBank (accession number CAB66859). Human α1 and α2 adaptin share 81% identity and 88% similarity at the amino acid level. Although there are fewer isoforms of AP-2 subunits as compared with AP-1 subunits, additional diversity arises from alternative splicing of some mRNAs. In mammals such as mouse and pig, the α1 adaptin mRNA is alternatively spliced in brain and skeletal muscle to generate a protein with 21 additional amino acids in the hinge region (Ball et al., 1995). Alternative splicing of exon 5 of the mouse μ2 gene leads to the presence or absence of His142 and Gln143 in the protein. Although the longer isoform is more abundant, both forms are fully capable of interacting with tyrosine-based sorting signals (Ohno et al., 1998). Finally, a splice variant of ς2 adaptin termed ς2Δ has been identified in human leukocytes (Holzmann et al., 1998). All AP-2 subunits are ubiquitously expressed in mammals. No genetic analyses of AP-2 function in mammals have been reported to date.
Both humans and mice contain genes encoding one δ, two β3 (β3A and β3B), two μ3 (μ3A and μ3B), and two ς3 (ς3A and ς3B) adaptin(s) (Table 1, Supplemental Tables 1 and 2). δ, β3A, μ3A, ς3A, and ς3B are expressed ubiquitously, while β3B and μ3B are specifically expressed in neurons and neuroendocrine cells (Pevsner et al., 1994; Newman et al., 1995). Several putative splice variants of δ adaptin lacking codons 170–260, 117–285, and 746–877 have been identified through searches of EST databases and PCR amplification (Ooi et al., 1997). In humans, mutations in the gene encoding β3A adaptin cause Hermansky-Pudlak syndrome type 2 (HPS-2), a genetic disorder characterized by defective melanosomes and platelet dense granules (Dell'Angelica et al., 1999b) (Table 2). A similar disorder has been described in mice bearing mutations in the genes encoding δ adaptin (mocha, Kantheti et al., 1998) and β3A adaptin (pearl, Feng et al., 2000; Feng et al., 1999; Yang et al., 2000) (Table 2). In addition, δ adaptin-deficient mice exhibit neurological defects that are not observed in β3A adaptin-deficient mice or HPS-2 patients (Kantheti et al., 1998). Fibroblasts from AP-3-deficient humans and mice exhibit increased trafficking of lysosomal membrane proteins such as CD63, lamp-1, and lamp-2 via the plasma membrane (Dell'Angelica et al., 2000a; Dell'Angelica et al., 1999b; Yang et al., 2000). A similar missorting of melanosomal and platelet dense granule proteins could underlie the organellar defects in the AP-3 mutants.
Both humans and mice have only one gene encoding each of the ε, β4, μ4, and ς4 adaptin subunits of AP-4 (Table 1, Supplemental Tables 1 and 2) (Dell'Angelica et al., 1999a; Hirst et al., 1999). While no isoforms or splice variants of the AP-4 subunits ε, β4, or μ4 have been reported to date, the human ς4 mRNA appears to be subject to alternative splicing (accession numbers NP_009008 and AAH01259). Both splice isoforms share the first 102 residues, but they contain unrelated carboxy-terminal sequences of 42 (one exon) and 57 amino acids (two exons), respectively. Genomic sequences can be found for both splice variants, but no data are available on their tissue expression or differential incorporation of the proteins into AP-4. There are also no data on the disruption of AP-4 subunit gene expression in mammals, although indirect evidence has suggested a possible involvement of this complex in protein sorting to lysosomes (Aguilar et al., 2001).
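The gene-encoded isoforms listed above for each mammalian AP complex set an upper bound on the number of distinct heterotetramers that could in principle be assembled. The Python sketch below enumerates these combinations. It rests on simplifying assumptions that are not claims of this survey: every isoform is taken to combine freely with every other within its complex, splice variants are ignored, the possible sharing of β1 and β2 between AP-1 and AP-2 is ignored, and tissue-restricted expression (e.g., μ1B, β3B, μ3B) is not considered. The names `ISOFORMS` and `possible_complexes` are illustrative only.

```python
from itertools import product

# Gene-encoded isoforms per subunit class (large, β, μ, ς), as listed
# in the text for human and mouse. Splice variants are deliberately omitted.
ISOFORMS = {
    "AP-1": (["γ1", "γ2"], ["β1"], ["μ1A", "μ1B"], ["ς1A", "ς1B", "ς1C"]),
    "AP-2": (["α1", "α2"], ["β2"], ["μ2"], ["ς2"]),
    "AP-3": (["δ"], ["β3A", "β3B"], ["μ3A", "μ3B"], ["ς3A", "ς3B"]),
    "AP-4": (["ε"], ["β4"], ["μ4"], ["ς4"]),
}


def possible_complexes(name):
    """Return every heterotetramer obtainable by free combination of isoforms."""
    return list(product(*ISOFORMS[name]))


for name in ISOFORMS:
    print(name, len(possible_complexes(name)))
# AP-1 12
# AP-2 2
# AP-3 8
# AP-4 1
```

Which of these theoretical combinations actually form in cells is an experimental question; the counts are combinatorial ceilings, not observations.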
Unlike the conventional adaptins, GGAs and stonins are monomeric proteins. Genes encoding three GGAs (GGA1, GGA2 and GGA3) and two stonins (stonin 1 and stonin 2) have been described in both humans and mice (Table 1, Supplemental Tables 1 and 2) (Boman et al., 2000; Dell'Angelica et al., 2000b; Hirst et al., 2000; Martina et al., 2001; Poussu et al., 2000; Takatsu et al., 2000). Alternative splicing has been reported for the GGA2 and GGA3 mRNAs. For GGA2, this results in a truncated form of the protein (accession number AAK38634) comprising residues 1–194 of the full-length GGA2, plus an additional ~30 residues at the carboxy terminus (Nielsen et al., 2001). This short GGA2 form, therefore, has a complete VHS domain but no GAT, hinge or GAE domains. There is also a short form of GGA3 (accession number AAF42849) that lacks 33 residues from the VHS domain as compared with the full-length GGA3 (Dell'Angelica et al., 2000b), and that appears not to interact with the acidic di-leucine motif in the cytoplasmic tails of the mannose 6-phosphate receptors (Takatsu et al., 2001). No genetic disruptions of the expression of GGAs and stonins in mammals have been reported.
The availability of a complete catalogue of mammalian adaptins should now allow detailed analyses of the structural features that account for both the conservation and specialization of their functions. A schematic representation of all human adaptins and adaptin-related proteins is shown in Figure 2.
A region at the amino terminus of the large adaptins, comprising ~600 amino acid residues in γ1, γ2, α1, α2, ε, β3A, and β3B, ~530 residues in β1, β2, and β4, and ~490 residues in δ, corresponds to the so-called “trunk” or “Adaptin_N” homology domain (pfam 01602) (Figure 2). This domain is predicted to be rich in α-helical secondary structure. In the β adaptins, it has been proposed to comprise 13–14 armadillo (arm) repeats, ~40-residue sequences that fold into two short α-helices linked by joining loops (Groves and Barford, 1999). Arm repeats have not been recognized in the amino-terminal region of the γ/α/δ/ε adaptins. However, their homology to the β adaptins suggests that they could be composed of more divergent double helix-loop repeats. An “Adaptin_N” homology domain is also present within the amino-terminal ~500 residues of the β-COP, γ1-COP, and γ2-COP subunits of COPI.
The “Adaptin_N” homology domain is involved in intersubunit interactions between the large adaptins and with the medium and small adaptins. It is also responsible for targeting of AP-1 and AP-2 to the TGN and plasma membrane, respectively. Page and Robinson (1995) have demonstrated that the trunk domains of γ1 and α2 interact with ς1A and ς2, respectively, while the β adaptins are more promiscuous, as both β1 and β2 can interact with either μ1A or μ2 in yeast two-hybrid assays. The ability of the γ/α large chains to bind specifically to the small chains has been shown to correlate with their in vivo targeting to the TGN or the plasma membrane. The trunk domains of β1 and β2 are also the site of interaction with dileucine-based signals (Rapoport et al., 1998; Greenberg et al., 1998).
The “hinge” domains of the large chains are of variable length, ranging from 46 residues in β4 adaptin to 410 residues in δ adaptin, and exhibit little if any sequence homology (Figure 2). With the exception of γ2 and δ adaptin, the hinge regions of the adaptins are enriched in serine residues, many of which are potential targets for phosphorylation (Newman et al., 1995; Wilde and Brodsky, 1996; Dell'Angelica et al., 1997; Faundez and Kelly, 2000). The hinge domains of β1, β2, β3A, and β3B adaptins contain clathrin-binding motifs conforming to the consensus L(L,I)(D,E,N)(L,F)(D,E) (Dell'Angelica et al., 1998; Kirchhausen, 2000).
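The clathrin-binding consensus L(L,I)(D,E,N)(L,F)(D,E) translates directly into a regular expression, which offers one simple way to screen a hinge sequence for candidate motifs. The Python sketch below illustrates the idea; it is not the method used to define the consensus. The function name and the toy sequence are hypothetical, and a pattern match only flags a candidate, since binding depends on structural context that a regular expression cannot capture.

```python
import re

# L(L,I)(D,E,N)(L,F)(D,E) written as a regular expression.
# The lookahead makes the search report overlapping matches as well.
CLATHRIN_BOX = re.compile(r"(?=(L[LI][DEN][LF][DE]))")


def find_clathrin_boxes(sequence: str):
    """Return (1-based start position, matched pentapeptide) for each candidate."""
    return [(m.start() + 1, m.group(1)) for m in CLATHRIN_BOX.finditer(sequence.upper())]


# Toy hinge fragment (made up for illustration).
print(find_clathrin_boxes("SGSSLLNLDSSA"))  # [(5, 'LLNLD')]
```

Positions are reported 1-based to match the residue-numbering convention used in the text.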
Homology in the carboxy-terminal “ear” domains of the large adaptins is lower than that in the trunk domains, especially among the γ/α/δ/ε adaptins (Figure 2). This probably reflects the functional diversity of the various AP complexes. The carboxy-terminal ~300 amino acids of α1 and α2 adaptin contain what is known as the “Alpha_adaptin_C” domain (pfam 02296) (Figure 2). The ear domain of the α2 adaptin has been shown to interact with many regulators of coat assembly and/or vesicle formation that contain DPF/W motifs, such as epsin, eps15, and amphiphysin (reviewed by Slepnev and De Camilli, 2000). The three-dimensional structure of the ear domain of mouse α2 adaptin (residues 695–938) was solved with the use of x-ray crystallography (Owen et al., 1999; Traub et al., 1999). This domain consists of two structurally unrelated subdomains. The amino-terminal subdomain forms a nine-stranded β-sandwich that has similarity to the immunoglobulin fold and possibly acts as a structural anchor or spacer for the carboxy-terminal subdomain. The carboxy-terminal subdomain harbors the binding site for the DPF/W-containing proteins, which is centered around residue W840. It contains a five-stranded β-sheet that is flanked by three α-helices and bears no resemblance to other known domain structures.
The carboxy-terminal ~120 residues of the γ1 and γ2 adaptins share the so-called “G_Adapt_CT” domain (pfam 02139) with the carboxy-terminal ~120 residues of the three GGAs (Figure 2). Functionally, this homology is mirrored in the ability of the γ1 ear and the GGA GAE domains to interact with γ-synergin and rabaptin-5 (Hirst et al., 2000; Page et al., 1999; Takatsu et al., 2000). Although the alignment of δ and ε adaptin with the γ and α adaptin sequences shows little conservation outside the trunk domain, a comparison of the predicted secondary structures allows the tentative assignment of ear domains for δ and ε. In δ adaptin, the ear domain starts around residue 900 while in ε adaptin, it starts around residue 840. For both δ and ε adaptins, this domain is predicted to comprise a part rich in β-sheet followed by an α-helical segment of ~100 residues, similarly to the α adaptins. The carboxy-terminal domains of γ1-COP, γ2-COP, and β-COP are largely dissimilar from those of the AP large subunits. However, there is a small stretch of homology (22% amino acid identity, 41% similarity) between residues 946 and 1094 of ε adaptin and residues 745 and 895 of β-COP. This homology is exclusive for β-COP and ε adaptin since it is not observed in γ1-COP or γ2-COP, or in the γ, α, and δ adaptins.
Although the ear domains of the γ/α/δ/ε adaptins do not share sequence homology with the β ear domains, the tertiary structures of the ear domains of α2 adaptin (Owen et al., 1999; Traub et al., 1999) and β2 adaptin (Owen et al., 2000) are remarkably similar. The ear domains of the other β adaptins exhibit significant sequence homology to β2, suggesting that they may also display a similar three-dimensional structure.
The amino-terminal domain of the μ adaptins, consisting of 120–140 residues, interacts with the β adaptins of the corresponding complexes, and is hence referred to as the “β-binding domain” (BBD, Aguilar et al., 1997) or “Clat_adaptor_s” region (pfam 01217) (Figure 2). This region exhibits homology to a 90-residue sequence from the amino-terminal domain of δ-COP (20–27% amino acid identity, 44–51% similarity), as well as to the entire or almost entire length of the small adaptins (~20% amino acid identity, ~40% similarity). The small adaptins are in turn homologous to ζ1-COP and ζ2-COP (17–23% amino acid identity, 43–49% similarity). In addition, ς3A, ς3B, and ζ1-COP have short extensions at their carboxy-termini and ζ2-COP at both its amino- and carboxy-termini. By analogy with the amino-terminal domain of the μ adaptins, it is tempting to speculate that the entire length of the ς adaptins may be engaged in interactions with the γ/α/δ/ε adaptins, perhaps having the sole purpose of stabilizing the AP complex. The crystal structure of this domain has not been solved, although theoretical analyses predict a high content of α-helices.
The 290–320-residue carboxy-terminal domain of the μ adaptins (Figure 2) is known as the “YXXØ-signal-binding domain” (Aguilar et al., 1997) or “Adap_comp_sub” domain (pfam 00928) and binds YXXØ motifs (Y is tyrosine, X is any amino acid and Ø is leucine, isoleucine, phenylalanine, methionine or valine) present in the cytosolic domains of some transmembrane proteins. This domain has an all-β-sheet, “banana-shaped” tertiary structure including two hydrophobic pockets that accommodate the tyrosine and bulky hydrophobic residues of YXXØ-type signals (Owen and Evans, 1998). This domain of μ2 has also been shown to interact with synaptotagmins (Haucke et al., 2000). Interestingly, this domain is shared with δ-COP (Serafini et al., 1991; Waters et al., 1991; Radice et al., 1995; Tunnacliffe et al., 1996) and the stonins (Andrews et al., 1996; Martina et al., 2001). Although neither of these proteins binds YXXØ signals, the ability to interact with synaptotagmins appears to be conserved in the stonins (Martina et al., 2001; Walther et al., 2001).
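The YXXØ definition given above (Y, any two residues, then L, I, F, M or V) can likewise be written as a pattern and used to scan the cytosolic tail of a transmembrane protein for candidate signals. The Python sketch below is an illustration under stated assumptions: the function name and the toy tail sequences are hypothetical, and a sequence match does not establish that a motif is recognized by a μ adaptin, since recognition also depends on position and flanking residues.

```python
import re

# Y, two arbitrary residues, then a bulky hydrophobic residue (L, I, F, M or V).
# The lookahead reports overlapping candidates, e.g. in tails with adjacent tyrosines.
YXXPHI = re.compile(r"(?=(Y[A-Z]{2}[LIFMV]))")


def find_yxxphi(tail: str):
    """Return (1-based position of the tyrosine, matched tetrapeptide) pairs."""
    return [(m.start() + 1, m.group(1)) for m in YXXPHI.finditer(tail.upper())]


# Toy cytosolic tails (made up for illustration).
print(find_yxxphi("KRSDYQRLNLK"))  # [(5, 'YQRL')]
print(find_yxxphi("AYYAAL"))       # [(3, 'YAAL')]
```

In the second example the first tyrosine is skipped because the residue three positions downstream (A) is not in the Ø set, whereas the second tyrosine is followed by a leucine at that position.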
In addition to genes encoding the AP subunits described above, TBLASTN searches at the NCBI human genome BLAST web page using the known adaptin protein sequences as queries revealed additional DNA sequences homologous to adaptin genes (Supplemental Table 3). However, these sequences encoded only fragments of the predicted adaptins, had deletions, insertions or frame shifts that resulted in premature termination codons, or had no introns, all characteristics of pseudogenes (Supplemental Table 3). Of these, only an intronless gene on chromosome 17 could potentially give rise to a protein identical to ς1B residues 1-142 (amino acid identity). Although the absence of introns is a salient feature of pseudogenes, transcribed intronless paralogs have nonetheless been described (Venter et al., 2001 ), making it possible that this sequence encodes an additional ς1 adaptin.
Adaptins in other organisms (Table 1, Supplemental Tables 4–9) have been tentatively identified based on the homology to their mammalian counterparts, with the exception of the S. cerevisiae adaptins that have also been assigned by Yeung et al. (1999) based on biochemical and genetic evidence. Like mammals, A. thaliana contains genes encoding subunits of the four AP complexes. In contrast, D. melanogaster, C. elegans, S. cerevisiae, and S. pombe possess genes encoding subunits of AP-1, AP-2, and AP-3, but not AP-4. The domain organization and other structural features of adaptins in these organisms are predicted to be very similar to those of mammalian adaptins.
As in mammals, AP-1 is the complex that exhibits the greatest subunit diversification in other organisms (Table 1, Supplemental Tables 4–9). The A. thaliana genome contains genes encoding three γ (γ-I, γ-II and γ-III), three β1/2 (β1/2-I, β1/2-II and β1/2-III), one μ1, and two ς1 (ς1-I and ς1-II) adaptin(s) (Supplemental Table 4). γ-I had been previously given the names γ1 and γ2 by Schledzewski et al. who isolated two sequences with 98% identity at the amino acid level [accession number AAC28338 (γ1) and CAB39730 (γ2)]. In a TBLASTN search, however, gene T23E23.7 was most closely related to both γ1 and γ2 (94% identity and similarity at the amino acid level), making it likely that both proteins originate from the same gene. The difference in amino acid composition between the γ1 and γ2 forms of γ-I could be the result of alternative splicing. γ-II shares 70% amino acid sequence identity with γ-I. γ-III has several stretches of homology to both γ-I and γ-II. However, it shows several features that would make it an unusual γ adaptin. The γ-III “Adaptin_N” region of homology (pfam 01602) is truncated and shifted carboxy-terminally relative to the other A. thaliana γ adaptins. Moreover, γ-III lacks the γ adaptin hinge and ear domains. Since there is no corroborative evidence for the existence of a transcript encoding γ-III, this DNA could be an artifact or a pseudogene. Full sequences are available for genes encoding A. thaliana β1/2-I and β1/2-II, whereas that encoding β1/2-III has been only partially sequenced. The β1/2 adaptins share 93–98% identity at the amino acid level and are thus more closely related to each other than to either human β1 or β2 adaptin. This suggests that they are products of relatively recent gene duplications. As is the case for mammalian β1 and β2 adaptins, the A. thaliana β1/2 adaptins could be subunits of both AP-1 and AP-2 complexes. Two ς1 adaptins that are homologous to mammalian ς1 adaptins have been identified in A. thaliana. These are constitutively expressed in all tissues of the plant, with higher levels being found in reproductive tissues (Maldonado-Mendoza and Nessler, 1997). Only one gene encoding a μ1 homolog could be identified in A. thaliana.
D. melanogaster contains only one gene encoding each of the subunits of AP-1 (γ, β1/2, μ1 and ς1 adaptins) (Supplemental Table 5), while C. elegans contains genes encoding one γ, one β1/2, two μ1 (μ1-I and μ1-II) and one ς1 adaptin(s) (Supplemental Table 6). Both D. melanogaster and C. elegans have a single β1/2 adaptin, which is probably a subunit of both AP-1 and AP-2. While D. melanogaster has a single μ1 adaptin, C. elegans has two, μ1-I (Unc-101) and μ1-II (Apm-1). The 56% identity and 70% similarity at the amino acid level of these two proteins point to a gene duplication within the nematode lineage. Both genes are transcribed in all cells and at all stages of development. Null mutations of unc-101 are lethal in 50% of the animals (Lee et al., 1994), and dsRNAi against apm-1 results in larval lethality (Shim et al., 2000). Simultaneous interference with both μ1 adaptins by dsRNAi is embryonic lethal in 100% of the animals, as is dsRNAi against γ, β1 or ς1 (Shim et al., 2000) (Table 2).
S. cerevisiae has one γ (Apl4p), one β1 (Apl2p), two μ1 (Apm1p and Apm2p) and one ς1 (Aps1p) adaptin (Supplemental Table 7). This allows for the assembly of two AP-1 complexes that differ in their μ1 subunit (Yeung et al., 1999). While Apm1p is a classical μ chain, Apm2p is unusually large and cannot substitute for Apm1p in apm1 null mutants (Stepp et al., 1995). Disruption of genes encoding AP-1 subunits is not detrimental to S. cerevisiae cells (Nakai et al., 1993; Phan et al., 1994; Rad et al., 1995; Stepp et al., 1995) (Table 2). However, these disruptions exhibit synthetic lethality with the temperature-sensitive clathrin heavy chain allele chc1ts at the nonpermissive temperature (Phan et al., 1994). Even at the permissive temperature, there is a defect in α-factor secretion in the double mutants, indicating a role for yeast AP-1 in sorting at the TGN (Phan et al., 1994; Stepp et al., 1995).
A. thaliana, D. melanogaster, C. elegans, S. cerevisiae, and S. pombe have single genes encoding the α, μ2 and ς2 subunits of AP-2 (Table 1, Supplemental Tables 4–8). As discussed above, A. thaliana possesses three genes encoding hybrid β1/2 adaptins (β1/2-I, -II, and -III) that could be components of AP-2 as well as AP-1. Similarly, D. melanogaster and C. elegans have single genes encoding a hybrid β1/2 adaptin that is also probably shared by AP-1 and AP-2. In D. melanogaster, α and μ2 are most highly expressed in the central nervous system. Disruption of D. melanogaster α adaptin expression resulted in alleles with various degrees of severity ranging from embryonic lethality to adult flies that could neither walk nor fly (González-Gaitán and Jäckle, 1997) (Table 2). dsRNAi of α or β1/2 adaptin in C. elegans resulted in the inhibition of yolk endocytosis and embryos proved to be inviable (Grant and Hirsh, 1999). Embryos obtained from μ2(RNAi) or ς2(RNAi) mothers exhibited developmental phenotypes (Levy et al., 1993; Shim and Lee, 2000) (Table 2). S. cerevisiae has homologues of the mammalian AP-2 adaptor complex subunits [Apl3p (α), Apl1p (β2), Apm4p (μ2), and Aps2p (ς2)], but their deletion has no apparent effect on endocytosis or any other protein sorting step yet analyzed (Munn, 2001) (Table 2).
AP-3 subunits are encoded by single genes in A. thaliana, D. melanogaster, C. elegans, S. cerevisiae, and S. pombe (Table 1, Supplemental Tables 4–7). We found the C. elegans ς3 adaptin sequence by using the TBLASTN algorithm at the Sanger Center on a search for homologues of D. melanogaster ς3 (Supplemental Figure 1B). In agreement with studies of AP-3 function in mammals described above, D. melanogaster AP-3 appears to be involved in the biogenesis of pigment granules (Kretzschmar et al., 2000; Lloyd et al., 1999; Mullins et al., 2000; Mullins et al., 1999; Ooi et al., 1997; Simpson et al., 1997) (Table 2). Mutations of the genes encoding the S. cerevisiae δ (Apl5p), β3 (Apl6p), μ3 (Apm3p) or ς3 (Aps3p) impaired the sorting of alkaline phosphatase (ALP) and the t-SNARE Vam3p to the vacuole (Cowles et al., 1997), an organelle that is the yeast counterpart of mammalian lysosomes (Table 2).
Outside mammals, a complete set of AP-4 subunits has been found only in A. thaliana (Table 1, Supplemental Table 3). Homologues of μ4 have been identified in G. gallus (Wang and Kilimann, 1997) and D. discoideum (de Chassey et al., 2001), the genomes of which have not yet been completely sequenced. However, homologues of the ε, β4, and ς4 subunits have yet to be identified in these organisms. Therefore, the existence of an AP-4 complex in these organisms remains to be formally established.
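The reasoning in this paragraph amounts to a completeness check: an AP-4 complex can be inferred from sequence data only when homologues of all four of its subunits have been found. A minimal sketch of that check follows. The function name and data layout are illustrative assumptions, and the inventory simply restates the findings described above.

```python
# The four subunits of an AP-4 complex (epsilon, beta4, mu4, sigma4).
AP4_SUBUNITS = ("epsilon", "beta4", "mu4", "sigma4")


def missing_subunits(found):
    """Return the AP-4 subunits not present in `found`, in canonical order."""
    return [subunit for subunit in AP4_SUBUNITS if subunit not in found]


# Inventory restating the paragraph above (non-mammalian organisms only).
identified = {
    "A. thaliana": {"epsilon", "beta4", "mu4", "sigma4"},
    "G. gallus": {"mu4"},
    "D. discoideum": {"mu4"},
}

for organism, found in identified.items():
    missing = missing_subunits(found)
    status = "complete set" if not missing else "missing " + ", ".join(missing)
    print(f"{organism}: {status}")
```

Under this check, an organism with only a μ4 homologue is reported as lacking three subunits, which is why the presence of AP-4 in G. gallus and D. discoideum cannot yet be concluded from sequence data alone.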
Sequences homologous to the GGA proteins could be identified in C. elegans and D. melanogaster (Table 1, Supplemental Tables 4–6), but no data are available on their expression pattern, localization, or function. S. cerevisiae and S. pombe each contain two GGA proteins that are more closely related to each other than to the mammalian proteins. Single deletions of genes encoding these proteins in S. cerevisiae resulted in no obvious phenotype, but the gga1Δ gga2Δ double deletion strain displays defects in CPY and proteinase A sorting to the vacuole, Pep12p sorting from the Golgi to a prevacuolar compartment, α-factor maturation, and vacuolar morphology (Black and Pelham, 2000; Costaguta et al., 2001; Dell'Angelica et al., 2000b; Hirst et al., 2000; Mullins and Bonifacino, 2001; Zhdankina et al., 2001) (Table 2).
A BLASTP search for homologues of the mammalian stonins in other organisms also identified homologues in D. melanogaster and C. elegans, but not in A. thaliana, S. cerevisiae, or S. pombe, suggesting that these proteins may be specific to the animal lineage (Table 1, Supplemental Tables 4–9). The only member of this family for which a genetic analysis of function has been performed is the D. melanogaster stoned B protein. This protein is one of two polypeptides produced from a dicistronic message that is transcribed from the stoned gene (Andrews et al., 1996). The other polypeptide translated from this message, stoned A, is structurally unrelated to the adaptins. Temperature-sensitive stoned mutants display uncoordinated leg and wing movements characteristic of neurological dysfunction at the nonpermissive temperature (Phillips et al., 2000) (Table 2). The mutants also exhibit decreased uptake of FM1–43 in nerve terminals, suggesting that the neurological defects are due to impaired synaptic vesicle recycling (Phillips et al., 2000).
All eukaryotes for which sequence information is available have one COPI and at least three of the four AP complexes (AP-1, AP-2, and AP-3), suggesting that the diversification of heterotetrameric coat complexes occurred before the branching of the major eukaryotic kingdoms [see Doolittle (1999) for a review of recent phylogenetic classifications]. The existence of AP-4 in A. thaliana and some vertebrates (G. gallus, M. musculus, and H. sapiens) suggests that this complex evolved before the separation of the plant and animal ancestors. The homologies between the two sets of large subunits of the AP complexes (γ/α/δ/ε and β1–4) and the F-COPI subcomplex (γ- and β-COP) indicate that they all derived from a single ancestral large chain (denoted as L in Figure 3). Similarly, the μ and ς subunits of AP complexes as well as the δ and ζ subunits of the F-COPI subcomplex likely derived from a common ancestral small chain (denoted as S in Figure 3). These two proteins must have come together to form a proto-F-COPI-AP hemicomplex (Schledzewski et al., 1999) (L, S in Figure 3). This complex could have been a heterodimer or a heterotetramer composed of two identical heterodimers (L, L, S, S in Figure 3). Genes encoding the subunits of this ancestral complex must have undergone successive rounds of coordinated gene duplication (indicated by asterisks in Figure 3) to give rise to the F-COPI and AP complexes. A first round of gene duplication and accumulation of mutations resulted in the emergence of two distinct pairs of large and small subunits (L1, L2, S1, S2). Later, one of the small subunits acquired a precursor of the μ subunit signal-binding domain (MHD in Figure 3), leading to the emergence of an ancestral μ subunit (M) (Schledzewski et al., 1999). The two large subunits, together with the medium and small subunits, constituted the proto-F-COPI-AP heterotetrameric complex (L1, L2, M, S in Figure 3).
Another round of gene duplication involving all four subunits of this complex, followed by evolutionary divergence, led to the appearance of distinct F-COPI and proto-AP (AP-1/2/3/4) complexes (Figure 3). From this point on, the F-COPI and proto-AP complexes followed different evolutionary paths. F-COPI acquired the ability to bind to the three proteins that make up the B-COPI subcomplex to form a heteroheptameric COPI complex (Fiedler et al., 1996). Isoforms of the γ-COP and ζ-COP subunits arose by separate gene duplications (marked in Figure 3) in plants (ζ-COP) and vertebrates (γ-COP and ζ-COP), but no new sets of four subunits were derived from the F-COPI subunits. In contrast, the genes encoding the four subunits of the proto-AP complex underwent at least three more rounds of gene duplication (Figure 3).
The probable sequence in which the four AP complexes evolved was inferred from phylogenetic analyses using the EMBL European Bioinformatics Institute ClustalW algorithm (http://www2.ebi.ac.uk/clustalw) and the TreeviewPPC program (Figures 4 and 5). Although differences in the evolutionary rates of different proteins can lead to erroneous phylogenetic reconstructions (Brocchieri, 2001), the heterotetrameric structure of the four AP complexes allows combined measures of evolutionary distance to be derived. These analyses indicate that the precursor of modern-day AP-3 branched out first from the proto-AP complex. The AP-4 complex appears to have evolved next. Some animals such as C. elegans and D. melanogaster do not have an AP-4, while A. thaliana and mammals do. This indicates that, although AP-4 is ancient, some organisms may have lost the genes for this complex. Distinct AP-1 and AP-2 complexes were the last ones to evolve, initially sharing a single β subunit. The sharing of a β subunit has persisted to date in organisms such as C. elegans and D. melanogaster, which have only one β1/2 subunit common to both AP-1 and AP-2. Duplications of genes encoding β1/2 precursors may have occurred relatively recently in some lineages, giving rise to two or three genes encoding closely related paralogs that may be subunits of both AP-1 and AP-2.
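The idea of a combined distance measure can be illustrated with a minimal sketch. It assumes, purely for illustration, that the distance for each subunit class is the uncorrected proportion of differing residues (p-distance) over gap-free aligned columns, and that the combined measure is the unweighted mean over the four subunit classes. The text does not specify how the distances were actually combined, so this is one possible reading and not the procedure used with ClustalW and TreeviewPPC. The sequences are made-up placeholders, not real adaptin sequences.

```python
from statistics import mean


def p_distance(a: str, b: str) -> float:
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def combined_distance(complex_a: dict, complex_b: dict) -> float:
    """Unweighted mean of per-subunit p-distances (an assumed way of combining them)."""
    shared = sorted(set(complex_a) & set(complex_b))
    if not shared:
        raise ValueError("the two complexes share no subunit classes")
    return mean(p_distance(complex_a[k], complex_b[k]) for k in shared)


# Hypothetical toy alignments keyed by subunit class; not real adaptin sequences.
toy_x = {"large": "MKTAYIAK", "beta": "MSDSKYFT", "medium": "MIGGLFIY", "small": "MIRFILIQ"}
toy_y = {"large": "MKTAYLAK", "beta": "MSDAKYFS", "medium": "MIGGLFIY", "small": "MI-FILVQ"}

print(round(combined_distance(toy_x, toy_y), 4))
```

Averaging over four subunits dampens the effect of any single protein that evolves unusually fast or slowly, which is the motivation given above for combining measures across the heterotetramer.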
AP-1 and AP-2 subunits in the yeasts S. cerevisiae and S. pombe evolved somewhat differently, since the duplication of the β1/2 subunit gene appears to have occurred very early in evolution after the separation from the common yeast/nematode ancestor. Thus, the S. cerevisiae AP-1 β adaptin, Apl2p, is most homologous to the mammalian β1 and β2 proteins, while the S. cerevisiae AP-2 β adaptin, Apl1p, is quite different from the β2 subunit of mammalian AP-2. Similarly, the S. cerevisiae μ2 adaptin, Apm4p, is equally homologous to the mammalian μ1 and μ2 proteins, which suggests an early duplication of genes encoding the β1/2-μ1/2 hemicomplex in S. cerevisiae. An early duplication of the gene encoding μ1 may also be responsible for the appearance of a second, distinct μ1 in S. cerevisiae, Apm2p, which remains a relative outlier in the phylogenetic tree.
The transition between flies and vertebrates is characterized by the duplication of genes encoding individual subunits of the AP complexes (marked in Figure 3). These duplications resulted in the emergence of two or more isoforms of certain subunits, including γ (γ1 and γ2), μ1 (μ1A and μ1B), ς1 (ς1A, ς1B, and ς1C), α (α1 and α2), β3 (β3A and β3B), μ3 (μ3A and μ3B), and ς3 (ς3A and ς3B) in humans. Some of the isoforms continued to be expressed in all cells, while others became specific to some cells or tissues. For example, β3A and μ3A are ubiquitous, while β3B and μ3B are found exclusively in brain or other neural tissues. Similarly, μ1A is ubiquitous, while μ1B is expressed only in epithelial cells. It thus appears that more complex organisms built specificity on top of their already established AP complexes by selectively replacing some of their preexisting subunits with new isoforms.
All three mammalian GGA proteins contain a domain that is homologous to the ear domain of the γ adaptins, but are otherwise structurally distinct from the γ adaptins. A duplication and transfer of the γ ear domain probably took place after the emergence of AP-1 but before the separation of the major eukaryotic kingdoms (Figure 3). The complete GGA gene was duplicated in S. cerevisiae and S. pombe, giving rise to two GGA isoforms. C. elegans and D. melanogaster have only one GGA gene, but mice and humans have three. GGA2 was likely the first one to branch out, while GGA1 and GGA3 separated later (Figure 3).
The stonins have a carboxy-terminal domain homologous to the signal-binding domain of the μ adaptins, more specifically to that of mammalian μ2. This suggests that genomic sequences encoding the μ2 signal-binding domain were duplicated and translocated next to sequences encoding the proline-rich domain (PRD in Figure 3) and the stonin homology domain (SHD in Figure 3), thus creating the stonin ancestor (stonin 1/2 in Figure 3). Since stonin orthologues have been identified in C. elegans, D. melanogaster, mice, and humans, but not in yeast or A. thaliana, it appears that they evolved in the animal lineage. As with the adaptins and GGAs, however, the stonin gene was duplicated in vertebrates, leading to the human and mouse stonin 1 and stonin 2 genes (Figure 3).
In this essay we have assembled a comprehensive inventory of adaptins and related proteins encoded in the genomes of various eukaryotic organisms. We have tentatively assigned them to complexes according to the mammalian AP nomenclature and described the structural and evolutionary relationships between these proteins. Our analyses resulted in the identification of several new gene products that had not yet been annotated as adaptins or related proteins. With a systematic classification of the adaptins in hand, we can now attempt to explain the full range of protein sorting events mediated by the adaptins. Many aspects of adaptin function remain to be elucidated, including: (1) the exact intracellular localization and distribution of AP complexes and related proteins; (2) the signal-binding specificity of all the adaptins and related proteins; (3) the nature of the cellular processes in which AP complexes and related proteins are involved; (4) the function of ubiquitous and tissue-specific adaptin isoforms; (5) the regulation of expression of adaptins during development. We expect rapid advances in the elucidation of these issues through refinement of established morphological, biochemical, and genetic methodologies, as well as the adoption of recent advances in genetic manipulation such as RNA interference in mammalian cells.
We thank Esteban Dell'Angelica, François Letourneur, Chris Mullins, and Hiroshi Ohno for critical reading of the manuscript. M.B. was supported by a fellowship from the German Academic Exchange Service (DAAD).
Online version of this essay contains supplemental tabular material. Online version is available at www.molcellbiol.org.