BLOC-1 is one of four multi-subunit complexes implicated in sorting cargo to lysosome-related organelles, as loss of function of any of these complexes causes Hermansky-Pudlak syndrome. Eight subunits of BLOC-1 interact with each other, and with many other proteins. Identifying new interactors of BLOC-1 will increase understanding of its mechanism of action, and studies in model organisms are useful for finding such interactors. Psi-BLAST searches identify homologues in diverse model organisms, but there are significant gaps for BLOC-1, with none of its eight subunits found in Saccharomyces cerevisiae. Here we use more sensitive searches to identify distant homologues for three BLOC-1 subunits in S. cerevisiae: Blos1, snapin and cappuccino (cno). Published data on protein interactions show that in yeast these are likely to form a complex with three other proteins. One of these is the yeast homologue of the previously uncharacterized KxDL protein, which also interacts with Blos1 and cappuccino in higher eukaryotes, suggesting that KxDL proteins are key interactors with BLOC-1.
Traffic through the secretory pathway involves the action of many multi-molecular complexes. Adaptor protein (AP-3) complexes define a traffic step from early endosomes, or in yeast from the TGN (1), to later endocytic compartments (2,3). AP-3 dysfunction in humans leads to Hermansky-Pudlak Syndrome (HPS), characterized by oculocutaneous albinism and platelet dysfunction caused by defects in melanosomes and platelet dense granules respectively (1). These two endocytic compartments are lysosome-related organelles (LROs), a term that applies to specialized acidic organelles related to late endosomes or lysosomes (4,5).
Biogenesis of LROs relies not only on AP-3, but also on three other multimeric complexes called BLOC-1/2/3 (for biogenesis of LRO complex) (6). Of these, BLOC-1 is the most complicated, with eight subunits: Blos1, Blos2, Blos3 (also called BLOC1S1-3), cappuccino, dysbindin, muted, pallidin and snapin (6). HPS in humans results from mutations in two of the eight BLOC-1 subunits: HPS7 and 8 arise from mutations of dysbindin and Blos3 respectively. Mutations in three other BLOC-1 subunits (pallidin, muted and cappuccino) have been identified in mouse models of HPS (7,8). BLOC-1 has been shown to reside on endosomes (9,10), and is required for different aspects of cargo sorting in early endosomes (11,12). A similar function is conserved across evolution, as in flies and plants mutations in BLOC-1 subunits affect lysosome and LRO function (13,14). A wide variety of other proteins implicated in membrane trafficking have been identified as interactors of the eight BLOC-1 subunits, mainly by yeast two-hybrid analysis (15). Among the few interactors described to bind the intact octameric complex are AP-3 and BLOC-2 (9), as well as the SNAREs syntaxin-13 and SNAP-25 (16). Despite the large number of interactions known for BLOC-1, its mechanism of action is not known.
All of AP-3 and BLOC-1/2/3 are ubiquitously expressed in mammals (4,17), suggesting that they may affect the secretory pathway in non-specialized cells. In agreement with this, HPS affects not only LROs but also the lysosome itself, such that lysosomal proteins such as CD63 are misrouted to the cell surface (9), and lipofuscin accumulates in different cell types (18). This suggests that the mutations causing HPS might have effects in all cell types, even those lacking LROs. In the budding yeast S. cerevisiae, AP-3 mediates traffic of a subset of cargo to the vacuole, the terminal degradative organelle equivalent to the lysosome (1). Even though BLOC components are conserved widely throughout eukaryotic evolution (for example, the slime mold Dictyostelium has 11 of the possible 13), only two direct homologues to BLOC components have been found in any fungus: Blos1 and Blos2 in the oleaginous yeast Yarrowia lipolytica, but none have been identified in S. cerevisiae (19).
We have extended previous iterative psi-BLAST searches, using a more sensitive technique that combines structural predictions with the detection of sequence homology. This has identified putative homologues for three BLOC-1 subunits in S. cerevisiae: Blos1, snapin and cno. These form a complex with three other proteins, one of which is the yeast homologue of the previously uncharacterized KxDL protein. Database mining shows that KxDL proteins have conserved interactions with BLOC-1 in higher eukaryotes.
A psi-BLAST search seeded with human Blos1 (125 aa) identified a yeast homologue in Y. lipolytica (113 aa) on the second iteration, but no homologue in S. cerevisiae was found (19). We reasoned that there might be a homologue of Blos1 in S. cerevisiae whose sequence has diverged extensively, and that it was missed because the set of sequences used by the iterative psi-BLAST model was too dominated by sequences closely related to human Blos1. Among the non-significant hits for Blos1 there are no S. cerevisiae proteins, but there are proteins from other yeasts of the typical size of Blos1. For example, an ORF of 104 aa from Kluyveromyces lactis (a yeast that is more closely related to Saccharomyces than to Yarrowia) has an E-value = 0.09 (threshold for inclusion in psi-BLAST = 0.005, see Methods and Table S1). To test if this is a Blos1 homologue in K. lactis, we seeded psi-BLAST with the K. lactis sequence, which identified ORFs from six other yeasts, including the S. cerevisiae ORF of unknown function YLR408Cp (122 aa). Although this psi-BLAST failed to expand, known Blos1 homologues appeared among the non-significant hits (e.g. Rana, E-value = 0.5; Yarrowia, E-value = 4, Table S1), which left open the possibility that the sequence in K. lactis is a Blos1 homologue, in which case YLR408Cp could also be a Blos1 homologue. To test this, we seeded a psi-BLAST search with YLR408Cp. At first this identified the same 7 yeast sequences; then the alignment expanded to include first the known fungal Blos1 homologues, then all other Blos1 sequences (Table S1). Thus, although human Blos1 could not detect YLR408Cp by psi-BLAST, the reverse search did establish the link, which strongly suggests that YLR408Cp is the S. cerevisiae homologue of Blos1.
We next enhanced homology detection for Blos1 by using HHpred, which supplements sequence alignment with structure prediction (20), to achieve far greater sensitivity (21). This takes advantage of the fact that proteins diverge in terms of structure much more slowly than in specific sequence, conservation of which may be undetectable. HHpred compares a structural model of the query sequence against a database of ~135,000 records that contains not only all solved structures, but also structural models of every ORF in human, fly, worm, plant and budding yeast (see Methods). HHpred searches include information not only on amino acid frequencies, but also on the position-specific probability for opening and closing gaps (20). HHpred seeded with human Blos1 identified YLR408Cp as the sole yeast homologue (E-value 0.005, Table S2), and in the reverse search, HHpred seeded with YLR408Cp identified Blos1 in human, fly, worm and plant (E-values 0.0001 – 0.01, Table S2). Thus, a search that includes structural information supports the psi-BLAST results that YLR408Cp is a Blos1 homologue.
We next extended sensitivity using HHsenser, which uses an alignment originating from HHpred to find distant homologues with high sensitivity and virtual absence of false positives (22). HHsenser combines the iterative approach of psi-BLAST with an “intermediate profile search”, whereby information obtained from a fixed number of iterations is then used to seed new (“intermediate”) searches (22). In addition, HHsenser compares profiles with profiles (not with sequences), which improves sensitivity (23). Submitting the HHpred alignment for Blos1 to HHsenser increased the significance of the alignment to YLR408Cp ten-fold (E-value 0.0005, Table S2). For the reverse search, HHsenser seeded with YLR408Cp produced highly significant alignments with Blos1 in higher eukaryotes (E-values < 10⁻²⁶, Table S2). Variation in the significance of alignments depending on the initial seed is a known feature of HHsenser (22). These highly significant alignments strongly suggest that YLR408C can be assigned as the Blos1 homologue in S. cerevisiae, so we provisionally suggest the gene name BLS1, for Blos1-homologue, although this needs to be confirmed by functional testing for protein sorting functions.
What are the conserved features that underlie the alignment of Blos1 with Bls1p? Blos1 homologues are predicted to be helical, as is Bls1p. Fig 1A shows that while human Blos1 shows great sequence homology with its plant homologue, there is only marginal sequence conservation with Bls1p. For all BLOC-1 subunits, short linear motifs have been identified from multiple sequence alignments, which can help identify divergent homologues (19). In Blos1 a conserved motif was identified at residues 87-93: “ALKEIGD” (Fig. 1B) (19), but Bls1p does not contain this, and its conservation with Blos1 is maximal in a region of 20 residues between the motif and the extreme carboxy-terminus (Fig. 1A). To analyze this conservation in Bls1p, previously known Blos1 homologues were divided into two groups: animal/plant and fungal, with consensus sequences being constructed for each group (Fig. 1B). The animal/plant Blos1 consensus shows maximal conservation at the motif and ~20 aa on either side (44 aa in total, Fig. 1B, top row). For Blos1 in fungi, conservation is confined to the motif and ~20aa downstream (26 residues in total, Fig. 1B, middle row). As expected from our psi-BLAST results, the sequence of Bls1p is closer to the fungal group than to the animal/plant group, and 13 out of 20 residues between 94-113 of Bls1p are shared with the fungal consensus (Fig. 1B, bottom row).
In addition to specific amino acids, key properties of the helix are conserved from human to yeast (Fig. 1C and D). Viewed as a helical wheel, both human and yeast sequences form amphipathic helices with negative charges flanking a hydrophobic face, and with positive charges and other polar residues on the opposite face (Fig. 1D). This amphipathicity correlates with conservation of hydrophobic residues at positions “a” and “d” of a heptad repeating pattern (Fig. 1C). These regions are typically not predicted to form coiled coils (8), likely because most Blos1 homologues have only one leucine in this region (24). By comparison, fungal Blos1 homologues identified previously, as well as Bls1p, have 3-4 leucines here and so are predicted to form coiled-coils. The significance of this predicted difference between Blos1 in fungi and in higher eukaryotes is not known. Thus, identification of Bls1p as a Blos1 homologue is based on overall predicted structure, and on sequence conservation with fungal Blos1 homologues that does not include the motif identified previously.
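The heptad and helical-wheel view described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used for Fig. 1C and D: it assumes an ideal alpha-helix (3.6 residues per turn, so 100° per residue), the function name `helical_wheel` is hypothetical, and the register chosen for the example peptide (the “ALKEIGD” motif, with its first residue placed at position “g”) is an assumption made only so that the example has hydrophobic residues at “a” and “d”.

```python
HEPTAD = "abcdefg"
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues per turn


def helical_wheel(sequence, first_register="a"):
    """Return (residue, heptad position, wheel angle in degrees) for each residue.

    first_register is the heptad position assigned to the first residue;
    it is an input assumption, not something inferred from the sequence.
    """
    offset = HEPTAD.index(first_register)
    rows = []
    for i, residue in enumerate(sequence):
        register = HEPTAD[(offset + i) % 7]
        angle = (i * DEGREES_PER_RESIDUE) % 360
        rows.append((residue, register, angle))
    return rows


# Illustrative only: the register "g" for the first residue is assumed here.
wheel = helical_wheel("ALKEIGD", first_register="g")
core = [residue for residue, register, _ in wheel if register in "ad"]
```

With this assumed register, the residues falling at “a” and “d” are the leucine and isoleucine of the motif, which is the kind of hydrophobic pattern the paragraph refers to.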
Continuing the search for different BLOC-1 subunits, we considered snapin (136 aa), where psi-BLAST has found homologues in metazoa, plants and protists, but not fungi (19). HHpred seeded with snapin weakly identified the yeast ORF of unknown function YNL086Wp (102 aa, E-value 0.3, Table S2), and HHsenser improved the alignment (E-value 0.003, Table S2). In reversed searches, psi-BLAST with YNL086Wp only identified homologues in closely related yeast (Table S1). HHpred with YNL086Wp showed weak homology to snapin from human, fly and worm (E-values 0.6 – 2, Table S2), and these alignments were improved by HHsenser (E-values 0.003 – 0.09, Table S2). Even though the snapin/YNL086Wp alignment is not strong, our findings suggest that YNL086Wp could be the snapin homologue in S. cerevisiae, and we propose that YNL086W be named SNN1 (from snapin), again pending confirmation by functional testing.
Snapin and its known homologues have two predicted helical regions, both with predicted coiled coils, and Snn1p is highly similar, aligning without gaps across a region of 77 residues (Fig. 2A and B). Two sequence motifs were identified previously in the amino-terminal helix of snapin: “SQxEL” and “DxLAxEL” at residues 50-54 and 59-66 respectively (Fig 2A and B) (19), the first of which is well conserved in Snn1p (Fig 2B). Other homology between snapin and Snn1p is distributed along their whole length (Fig. 2A). Within the predicted coiled-coils, key leucines in heptad repeats are conserved (Fig. 2B). Snapin and Snn1p also share the motif “RESQ” (Arg-Glu-Ser-Gln, Fig. 2B), the serine of which is phosphorylated by PKA (25). The finding that Snn1p shares functionally important sequence elements with snapin supports the notion that it may be a snapin homologue.
Psi-BLAST searches with cno (217 aa) have found homologues in mammals, fish, flies and a protist (slime mold), but not in nematode worms, plants or any fungi (19). We hypothesized that the species lacking cno may contain homologues that have diverged beyond the ability of psi-BLAST to recognize them, but which are detectable when structural information is included in alignment searches. HHpred seeded with the protist cno (the most divergent) identified human and fly homologues with E-values < 10⁻⁵⁰. The next hits were distant matches to two proteins of unknown function (Fig. 3A): T24H7.4 in the nematode C. elegans (106 aa, E-value 0.2) and YDR357Cp in S. cerevisiae (122 aa, E-value 1), and similar results were obtained seeding HHpred with human cno (Table S2). Unlike for Blos1/Bls1p and snapin/Snn1p, cno alignments did not improve with HHsenser. However, searches seeded with these two new sequences did yield interesting information.
For T24H7.4, psi-BLAST only found a homologue in the closely related nematode C. briggsae. HHpred produced weak hits to YDR357Cp, fly cno and human cno (E-values 0.6,1 and 2 respectively, Table S2). While T24H7.4 is shorter than most cno homologues, cno in some insects is almost as short (e.g. beetle 125 aa). Therefore, we considered it possible that T24H7.4 is the missing nematode cno homologue.
Looking at YDR357Cp, psi-BLAST found a family of homologous sequences in fungi (one in each of 51 species, Table S1), but not outside fungi. HHpred with YDR357Cp (Fig. 3B) revealed weak homology both to a nematode ORF named systematically DUF2365, and to its relatives in human and fly (E-values 0.01, 0.2 and 0.4 respectively, Table S2). All that is known about the DUF2365 family (also systematically named c17orf59 from the location of the human gene) is that divergent eukaryotes have a single protein, typically of 125-325 residues, with a domain of unknown function (hence DUF) of approximately 95 residues identified by automated BLAST searches. A reciprocal HHpred search with nematode DUF2365 identified human, fly and plant DUF2365 proteins (E-value < 10⁻¹⁰); the next hit was YDR357Cp (E-value 0.006, Table S2), which strongly indicates that YDR357Cp is the yeast member of the DUF2365 family. Since YDR357C is cno-like, we propose the name CNL1. While this role has not yet been tested, there is some functional evidence that this gene functions in vacuolar protein sorting (see Conclusion).
The similarities that are shared by cno, T24H7.4, Cnl1p, and DUF2365 proteins are distributed throughout the conserved domain of approximately 95 amino acids, which is predicted to be largely helical (Fig. 3A and B). Cno is predicted to form a coiled-coil at the carboxy-terminus of its region of shared homology (Fig. 3A) (26), and the same is true for DUF2365 proteins (Fig. 3B). At the level of primary sequence, two motifs were identified in cno (Fig. 3C) (19), but only one of these is conserved in T24H7.4 and Cnl1p (Fig. 3A and C). To visualize how elements of primary sequence are shared between cno and fungal cno-like proteins, we made a large alignment of cno homologues and fungal cno-like sequences (Fig. S1A). This showed that there is little conservation of the first motif at residues 89-99 (“Ø−+ØØx+Ø−−Ø”, where Ø/−/+ are hydrophobic, acidic and basic residues respectively). In contrast, the second motif at residues 134-139 (“+Ø−+Ø±”, where ± indicates charge of either type) is well conserved in fungal cno-like sequences. The alignment also revealed a third short motif (“Ø−−Ø±”, residues 125-129) in most fungal cno-like proteins (and all DUF2365, not shown) that is conserved to some extent in cno (Figs S1A and 3C). A tree of cno and fungal cno-like sequences shows that while they mainly divide into their two groups, there are exceptions: the cno homologue we have identified in worms (T24H7.4) and the cno homologue in the choanoflagellate Monosiga brevicollis are intermediate between the two groups (Fig. S1B). These results support the idea that a common ancestor has diverged into the two groups of cno and cno-like proteins, the link being undetectable by conventional sequence alignment.
All BLOC-1 subunits are relatively small (between 125 and 351 residues), and are predicted to contain alpha helices (6), which raises the possibility that the yeast ORFs found by HHpred are false positives, identified solely because their simple helical structure is so common. However, additional evidence that Bls1p, Snn1p and Cnl1p are BLOC-1 components comes from published data on their physical interactions. All three have been found in a single multi-subunit complex among 546 such complexes identified in a genome-wide study of protein-protein interactions in S. cerevisiae (27). Among the 2,400 tagged proteins purified in this study, one was Vab2p, a 31 kDa cytoplasmic protein of unknown function originally identified as a binding partner for Vac8p (28,29), which is a peripheral vacuolar protein that co-ordinates multiple vacuolar functions (29,30). Affinity-purified Vab2p co-precipitated five other proteins: Bls1p, Cnl1p, Snn1p, and two proteins of unknown function: YGL079Wp and YKL061Wp (Table S4) (27,31). Precipitates of Bls1p, Cnl1p and YKL061Wp revealed five more interactions among Bls1p, Cnl1p, Snn1p, YGL079Wp and YKL061Wp (Table S4). Still more physical interactions within this group of proteins have been mapped: one affinity purification identified by its combination with similar levels of expression (32), and two interactions identified by two-hybrid studies (33-35) (Table S4). In total, 13 of the possible 15 pair-wise interactions have been detected between the six proteins Vab2p, Bls1p, Cnl1p, Snn1p, YGL079Wp and YKL061Wp (Fig. 4A). The high density of connections suggests that they form a single complex (27). Current data do not indicate that Vab2p or any other component protein is the “node” for the complex, which will have to be determined by examining pairwise interactions in strains missing other components of the complex.
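The figure of 15 possible pair-wise interactions follows from counting unordered pairs among six proteins. A minimal Python sketch of that arithmetic is given here; the count of 13 detected interactions is taken from the text, while the variable names and the use of a simple detected/possible ratio as a "density" are illustrative assumptions rather than the scoring used in the cited interaction study.

```python
from itertools import combinations

# The six proteins named in the text.
proteins = ["Vab2p", "Bls1p", "Cnl1p", "Snn1p", "YGL079Wp", "YKL061Wp"]

# Every unordered pair is one possible pair-wise interaction: C(6, 2).
possible_pairs = list(combinations(proteins, 2))
n_possible = len(possible_pairs)

# Number of pair-wise interactions reported as detected (from the text).
n_detected = 13

# Fraction of possible pairs that were observed (an illustrative measure only).
density = n_detected / n_possible
```

Here `n_possible` is 15 and `density` is about 0.87, which is the sense in which the connections among the six proteins are described as dense.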
The three extra components that interact with BLOC-1 in yeast are similar to known BLOC-1 subunits in size and in predicted content of alpha-helices (with some coiled coil) but no beta-sheet (Fig. 4B). Only Vab2p has been studied previously, but apart from its interaction with Vac8p, no function is known (28,29). We next scanned the database for homologues of Vab2p, YGL079Wp and YKL061Wp using psi-BLAST (Table S1); no extra alignments arose from HHpred. Homologues for Vab2p were found in fungi only. For YKL061Wp, homologues were restricted to close relatives of S. cerevisiae (Table S1). Since the majority of physical interactions documented for YKL061W are with BLOC-1 subunits (27,31,33-35), we propose the name BLI1, for BLOC-1 interactor.
Unlike Vab2p and Bli1p, YGL079Wp is in a conserved protein family, with a single identifiable member in most eukaryotes, from mammals to plants (Table S1). However, none of the homologues have been characterized. Automated database curation has designated them as IPR019371, Pfam10241, or DUF-KxDL. They are all short proteins, defined by a conserved helical region of 90 residues that includes a “KxDL” (Lys-x-Asp-Leu) motif near the carboxy-terminus (Fig. 4B). Since YGL079W is the yeast KxDL homologue, we propose the name KXD1.
The multiple interactions between all of Bls1p, Snn1p, Cnl1p, Vab2p, Bli1p and Kxd1p indicate that the latter three might also be subunits of BLOC-1 in yeast. In support of this, high-throughput analysis of localization ascribes all six ORFs to a punctate distribution, in some cases reported to co-localize with endosomes (36).
Including structural information in homology searches identified potential yeast homologues for Blos1 and snapin, and a cno-like protein. The same approach found no homologues in yeast for the five remaining BLOC-1 subunits (Blos2, Blos3, muted, dysbindin, and pallidin), but did identify a new homologue of muted in C. elegans (C34D4.13, not shown). This indicates that whatever the function may be of the yeast complex we have highlighted, it may be only distantly related to BLOC-1 in mammals. Nevertheless, there is some evidence that the yeast proteins are performing a function that parallels BLOC-1 in mammals, since a genome-wide screen that identified ~200 genes functioning in vacuolar protein sorting included both CNL1 and KXD1, as well as all four sub-units of AP-3, but none of AP-1,-2 and clathrin (37). The inclusion of Kxd1p in complexes containing yeast BLOC-1 homologues is particularly significant because, although KxDL proteins are completely uncharacterized, in higher eukaryotes they recapitulate some of the interactions with BLOC-1 subunits (Fig. 4): in D. melanogaster and C. elegans, the KxDL protein interacts with Blos1 (38); in C. elegans, the KxDL protein interacts both with the newly discovered cno (T24H7.4), and with the DUF2365 homologue (39). Conservation of interactions in yeast, worms and flies suggests that KxDL proteins may be key interactors with BLOC-1 in mammals.
Although it has not been shown that any of the interactions of the yeast complex occur simultaneously, the density of interactions among the six subunits is predicted to generate a multimeric complex (27). Would this be similar to BLOC-1 in higher eukaryotes, which, by a combination of size exclusion chromatography and velocity sedimentation, has been estimated to be an asymmetric complex of 200 kDa (40)? For most of the eight subunits, a large proportion co-migrates and co-precipitates in these large complexes (8,26,41). However, purification of the complex has not yet been achieved, so its complete composition is not known. Our results predict that BLOC-1 might include KxDL or DUF2365 proteins. Knowing conserved components and interactions in yeast, it will now be possible to study ancestral functions of BLOC-1 in a genetically tractable model organism.
Psi-BLAST: The psi-BLAST tool was used to search the non-redundant protein database at NCBI. E-values returned here (and by other tools, below) give the average number of false positives expected to produce, by chance, an alignment at least as good as the one observed. An E-value of 10 means that 10 wrong hits are expected to occur with the extent of alignment observed, while numbers very much smaller than one indicate likely significance. We used a threshold E-value of 0.005 for inclusion of alignments from one iteration of psi-BLAST in the next iteration. Searches were also limited to eukaryotes, and sequences were masked for the lookup table only. Psi-BLAST was iterated until new sequences added to the list were larger proteins including other domains of known function. Sequences from transcripts with frame shift errors were excluded.
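The inclusion rule can be illustrated with a short Python sketch that separates hits at or below the threshold from those above it. This is an assumption-laden illustration and not psi-BLAST itself: the function name `split_hits` is hypothetical, treating a hit exactly at the threshold as included is an assumption, and the entry marked as hypothetical is not from the paper and is present only to show a hit that passes. The two other E-values are those quoted in the Results for the search seeded with the K. lactis sequence.

```python
INCLUSION_THRESHOLD = 0.005  # E-value threshold stated in the Methods


def split_hits(hits, threshold=INCLUSION_THRESHOLD):
    """Split a {hit name: E-value} dict into (included, excluded) dicts.

    Included hits would contribute to the next iteration's profile;
    excluded hits remain visible only as non-significant matches.
    """
    included = {name: e for name, e in hits.items() if e <= threshold}
    excluded = {name: e for name, e in hits.items() if e > threshold}
    return included, excluded


hits = {
    "Rana Blos1": 0.5,        # quoted in the text (K. lactis seed)
    "Yarrowia Blos1": 4,      # quoted in the text (K. lactis seed)
    "hypothetical close homologue": 1e-20,  # NOT from the paper; illustration only
}

included, excluded = split_hits(hits)
```

In this sketch both known Blos1 homologues fall above the threshold, which corresponds to the situation described in the Results where the K. lactis-seeded search "failed to expand" even though true homologues were present among the non-significant hits.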
HHpred: The HHpred tool (20), which is part of the MPI bioinformatics toolkit (http://toolkit.tuebingen.mpg.de), was used to search against the protein databank of structures clustered at 70% sequence identity (PDB70), and to examine the genomes of five phylogenetically diverse organisms: H. sapiens, D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae, with a total of ~135,000 records. Alignments were carried out with default settings in the local mode. E-values returned by HHpred and reported here are based on sequence alone, excluding secondary structural similarity, so hits can be significant even when the E-value is ~ 1 (20). Some HHpred alignments were submitted to HHsenser, using default settings (22). Coiled-coils were analyzed at www.ch.embnet.org/ using default settings; positions where any of the three windows scored ≥ 0.5 were considered positive. Structural predictions were made by Psi-Pred 3.0. Consensus sequences for Blos1 and snapin were made by MUltiple Sequence Comparison by Log-Expectation (MUSCLE) at EBI. For each position in the consensus, conservation was scored by comparison to maximum consensus strength: <50% weak, 50-75% moderate, and ≥75% high. The alignment for Fig. S1 was made by Kalign at EBI, and coloured with the Clustalx scheme.
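The three conservation classes can be expressed as a small Python function. This is a sketch under stated assumptions: the function name `conservation_class` and its arguments are hypothetical, the way a per-position consensus strength is computed by the alignment tool is not reproduced here, and the boundary value of exactly 75% is assigned to "high" because the text gives "≥75% high".

```python
def conservation_class(score, max_score):
    """Classify one consensus position relative to the maximum consensus strength.

    score and max_score are assumed to be on the same scale;
    the returned label follows the <50%, 50-75%, >=75% bins from the Methods.
    """
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    percent = 100.0 * score / max_score
    if percent < 50:
        return "weak"
    if percent < 75:
        return "moderate"
    return "high"


# Example labels for a few illustrative (not measured) scores out of 10.
labels = [conservation_class(s, 10) for s in (4, 5, 7.5, 10)]
```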
Funding: Wellcome Trust (grant #082119).