Many putative transcription factors in the pathogenic fungus Candida albicans show sequence similarity to well-defined transcriptional regulators in the budding yeast Saccharomyces cerevisiae, but this sequence similarity is often limited to the DNA binding domains of the molecules. The Gcn4p and Gal4p proteins of Saccharomyces cerevisiae are highly studied and well-understood eukaryotic transcription factors of the basic leucine zipper (Gcn4p) and C6 zinc cluster (Gal4p) families; C. albicans has homologs of both, C. albicans Gcn4p (CaGcn4p) and CaGal4p, with DNA binding domains highly similar to those of their S. cerevisiae counterparts. Deletion analysis of the CaGcn4p protein shows that the N′ terminus is needed for transcriptional activation; an 81-amino-acid region is critical for this function, and this domain can be coupled to a lexA DNA binding module to provide transcription-activating function in a heterologous reporter system. Deletion analysis of the C. albicans Gal4p identifies a C-terminal 73-amino-acid-long transcription-activating domain that also can be transferred to a heterologous reporter construct to direct transcriptional activation. These two transcriptional activation regions show no sequence similarity to the respective domains in their S. cerevisiae homologs, and the two C. albicans transcription-activating domains themselves show little similarity.
Transcriptional regulators control the expression of genes to coordinate the availability of cellular function with the physiological needs of the cell. Gene-specific transcriptional activation is often regulated by the binding of positively acting proteins to upstream activating sequences (UAS) in the DNA where they recruit and control the activities of chromatin-modifying and remodeling complexes and the transcription apparatus (34). A typical transcriptional activator then interacts with the RNA polymerase II complex through binding to an adaptor complex termed Mediator; this Mediator complex consists of about 20 proteins and is conserved from yeasts to humans (5). Eukaryotic transcriptional activator proteins are generally bipartite in nature, with separate domains for DNA binding and transcriptional activation (40, 53). The transcriptional activation domains are classified according to their amino acid composition: rich in acidic residues (e.g., Saccharomyces cerevisiae Gal4p and Gcn4p) or basic residues (tobacco BBC1) or rich in glutamine (S. cerevisiae Mcm1p), threonine/serine (human OCT2), or isoleucine (NTF1) (2, 7, 10, 15, 26, 30, 37, 38). The DNA binding modules also fall into many classes, such as zinc finger, leucine zipper, and helix-loop-helix motifs (14, 16, 57). Although the activation domain is critical for function and can provide a level of regulation, the functional targets of such transcriptional activators are determined by the DNA binding address of the protein.
In both S. cerevisiae and Candida albicans, GCN4 encodes a transcriptional activator of amino acid biosynthetic genes that responds to amino acid starvation (25, 54). S. cerevisiae Gcn4p (ScGcn4p) is tightly regulated at both the transcriptional and translational levels. The 5′ leader region of GCN4 contains four small upstream open reading frames (uORF1 to -4). These uORFs act as negative regulators of translation: the ribosome initiates translation at uORF1 and becomes reactivated for translation at subsequent uORFs. Under environmental stresses, such as amino acid starvation, the translation of S. cerevisiae GCN4 (ScGCN4) is induced: ScGcn2 kinase phosphorylates eukaryotic initiation factor 2, and the scanning ribosome is not reactivated until it has bypassed the remaining uORFs, so that it initiates translation at the GCN4 open reading frame (24). The unusually long 5′ leader sequence on the GCN4 mRNA, which carries four upstream open reading frames, is conserved in C. albicans (54). It was recently shown that the protein kinase Gcn2, which is involved in transcriptional and translational regulation of Gcn4p in S. cerevisiae, is not involved in the regulation of C. albicans Gcn4p (51).
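As a rough illustration of what "small upstream open reading frames" means at the sequence level, the following Python sketch lists every ATG-initiated reading frame that terminates inside a 5′ leader. The function name, the `min_codons` threshold, and the example leader are hypothetical and are not taken from the GCN4 sequence; the sketch only shows how such uORFs could be enumerated, not how they were annotated in the cited studies.

```python
# Minimal sketch: enumerate upstream ORFs (uORFs) in a 5' leader sequence.
# Assumption: a uORF is an ATG followed by the first in-frame stop codon
# found inside the leader. The example sequence below is invented.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader, min_codons=2):
    """Return (start, end) half-open coordinates of uORFs in the leader.

    min_codons counts coding codons (the ATG included, the stop excluded).
    """
    leader = leader.upper()
    uorfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3))
                break  # the first in-frame stop ends this uORF
    return uorfs


# Invented leader carrying two short uORFs.
example_leader = "CCATGGCTTAACCCATGAAATGACC"
example_uorfs = find_uorfs(example_leader)
```

An ATG with no in-frame stop inside the leader is ignored here, since it would read through into downstream sequence that the sketch does not see.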
Gcn4p binds as a homodimer with its basic leucine zipper found in its carboxy terminus to a TGACTC sequence located upstream of many genes induced during amino acid starvation (1, 27). In S. cerevisiae Gcn4p, there are two transcription activation domains: one resides in an acidic segment in the center of the protein between residues 107 and 144 (13, 26, 28), and the second Gcn4p activation domain is located in the N-terminal 100 amino acids. The two activation domains are functionally redundant and can work independently to produce high-level activation (13, 28).
In S. cerevisiae, Gal4p is a second well-studied transcription factor and functions as the transcriptional activator of galactose catabolism (36, 45, 52). ScGal4p contains a DNA binding domain (amino acids 1 to 65) and two transcriptional activation domains, domain I (amino acids 149 to 196), and domain II (amino acids 768 to 881) (12, 38). The transcriptional activation domain II of Gal4p interacts with Gal80p in the absence of galactose, and through this contact, Gal80p inhibits Gal4p (43), preventing it from activating the expression of galactose-dependent genes. During growth on galactose, Gal3p binds Gal80p and removes it from Gal4p at the GAL gene promoter and prevents Gal80p from inhibiting Gal4p function (45). Therefore, in the presence of galactose, Gal4p is freed from Gal80p inhibition and subsequently activates expression of the galactose regulon (36, 45, 52). The Gal4p DNA binding domain interacts with a specific upstream activating sequence UASG (CGGN11CCG), located in the promoter regions of the genes whose products participate in the galactose metabolism circuit, such as GAL1, -2, -3, -7, -10, and -80 (31). Nuclear magnetic resonance analyses of various amino-terminal Gal4p fragments and X-ray crystal structure determination of Gal4p-UASG complex (39) show that the C6 zinc cluster is the DNA binding module of Gal4p.
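The UASG consensus CGGN11CCG can be written as a regular expression, which gives a simple way to scan a promoter for candidate Gal4p sites. The sketch below is illustrative only: the function name, the `spacer` parameter, and the test sequence are assumptions, and a match to the consensus does not by itself demonstrate binding in vivo.

```python
import re


def find_uasg_sites(dna, spacer=11):
    """Return (position, site) for each CGG-N{spacer}-CCG match.

    A lookahead is used so that overlapping candidate sites are all reported.
    The consensus is its own reverse complement (CGG pairs with CCG), so
    scanning one strand covers both.
    """
    pattern = re.compile(rf"(?=(CGG[ACGT]{{{spacer}}}CCG))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Invented test sequence with one site that has an 11-nucleotide spacer.
example_promoter = "TTCGG" + "A" * 11 + "CCGTT"
example_sites = find_uasg_sites(example_promoter)
```

Keeping the spacer length as a parameter makes it easy to ask whether a candidate site matches only the 11-nucleotide spacing or also a different spacing.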
The primary role of a transcription activator is to recruit the RNA polymerase II machinery to the promoter to which it is bound. To achieve this, the transcriptional activation domains of both Gcn4p and Gal4p interact with Gal11p, which is a component of the mediator complex that binds the RNA polymerase II machinery (3, 22, 29, 44, 47).
Three general models have been proposed to characterize the structure of the transcriptional activation domain. In the first hypothesis, it was proposed that activation domains are unstructured “acidic blobs” that interact with their targets via ionic interactions. This model is supported by the observation that the removal of residues from activation domains decreases the activity gradually, rather than abruptly (50). A second model proposes that acidic activation domains form amphipathic α helices, in which acidic residues are aligned on one face of the helix. This model is supported by the observation that an artificial 15-residue peptide, designed to fold into an amphipathic α helix, shows transcription-activating abilities when fused to the GAL4 DNA binding domain (38). The authors of the third model argue that the most likely secondary structure is the antiparallel β sheet (55).
We have recently established a detailed annotation of the C. albicans genes (6). Intriguingly, although many of the C. albicans transcription factors have sequence similarity to transcription factors in S. cerevisiae, the similarities occur primarily in the DNA binding motifs of those proteins. In addition, it has been previously shown that although C. albicans Rfg1p, Rap1p, Gat1p, Msn2p, and Msn4p have S. cerevisiae DNA binding domain homologs, these transcription factors control the regulation of different processes in the two organisms (4, 32, 35, 41). Here we have investigated the transcription activation domains of C. albicans homologs of the Gcn4p and Gal4p transcription factors; these domains are serine-threonine rich and lack sequence similarity to the S. cerevisiae homologs.
The C. albicans strains are listed in Table 1. Strain CAI8 (18) was used to generate strains CRC103 and CRC106 (49). Strain CRC106 carries the Staphylococcus aureus lexA operator; CRC103 does not and serves as a negative control for lexA binding. Strains CRC106 and BWP17 (56) were used to define the transcription activation domains of GCN4 and GAL4. The Escherichia coli strain MC1061 was used for all plasmid constructions.
Plasmids and oligonucleotides are shown in Tables 2 and 3, respectively. To create plexA-HIS1, we PCR amplified the C. albicans HIS1 (CaHIS1) open reading frame and its termination sequence from pFA-HIS1 with oligonucleotides OMM46 and OMM47, which contain the lexA binding site and the CaADH2 TATA box. The PCR product, which contains the lexA binding site, TATA box, HIS1 open reading frame and termination sequence, was cloned into pFA-ARG4 using the SalI and SunI restriction sites. Two more lexA binding sites were added by annealing oligonucleotides OMM50 and OMM51 and cloning them into the SunI site close to the third lexA binding site to yield plexA-HIS1. To create strain CMM25, the plexA-HIS1 construct was integrated into the ARG4 locus of strain BWP17 by treating plexA-HIS1 with AgeI, which cuts once in the ARG4 sequence. CIp-lexA-GCN4 deletion constructs were created by PCR using divergent primers: oligonucleotide OMM56 annealed to a region upstream of GCN4, while oligonucleotides OMM60 and OMM62 annealed inside the GCN4 open reading frame in the CIp-lexA-GCN4 template. Oligonucleotides OMM66-67 were used for CIp-lexA-GAL4. To fuse GAL4 in frame with lexA, CIp-lexA-GAL4 was cut with MluI. The resulting 5′ overhangs were filled with T4 polymerase, and the constructs were self-ligated with T4 DNA ligase. Plasmids CIp-lexA-GCN4 and CIp-lexA-GCN4Δ1-81 were cut with BstBI and self-ligated to yield CIp-lexA-GCN4Δ247-323 and CIp-lexA-GCN4Δ1-81Δ247-323, respectively. Plasmid CIp-lexA-GAL4 was cut with ZraI and PstI and self-ligated to create CIp-lexA-GAL4Δ188-261. Oligonucleotides OMM125 and OMM57 were used to PCR amplify GCN4AD, which was ligated into AatII- and PstI-cut CIp-lexA-GAL4 to create CIp-lexA-GAL4-GCN4AD. The CaHIS1 open reading frame, its promoter, and termination sequence were cut out from pFA-HIS1 using NotI and ligated into NotI-cut pOPlacZ to create pOPlacZ-HIS1. 
The CMM85 strain (gcn4 with pOPlacZ-HIS1) was created by first converting gcn4 (42) into an URA3 auxotroph (gcn4-ura3) using 5-fluoroorotic acid-containing media, followed by an integration of XcmI-cut pOPlacZ-HIS1 at the HIS1 locus. Strains CMM86, CMM87, CMM88, and CMM89 were created by transforming CMM85 (gcn4 with pOPlacZ-HIS1) with plasmids CIp-lexA, CIp-lexA-GCN4, CIp-lexA-GCN4Δ1-81, and CIp-lexA-GCN4Δ1-81Δ247-323, respectively. All of the constructs created in this study were integrated into the genome of C. albicans: all of the constructs, except for pOPlacZ and plexA-HIS1, were digested with StuI to integrate them at the RPS1 locus. pOPlacZ was digested with BamHI and plexA-HIS1 was digested with AgeI to integrate them at ADE2 and ARG4 loci, respectively. All of the DNA constructs were transformed into C. albicans by treatment with lithium acetate (8).
The expression level of the lacZ gene was assayed in two ways: by β-galactosidase overlay assay using independently isolated transformants grown on solid yeast extract-peptone-dextrose (YPD) medium, yeast extract-peptone-galactose (YPGal) medium, or synthetic complete medium with amino acids (SC-aa), or by β-galactosidase assays performed on mid-exponential shaking flask YPD, YPGal, or SC-aa-grown liquid cultures (48). The β-galactosidase activity was expressed in Miller units; the values are shown as means and standard deviations from three independent transformants.
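For readers unfamiliar with Miller units, the sketch below computes them with the classical Miller formula and then summarizes three replicate values as a mean and sample standard deviation. This is an assumption about the calculation: the exact protocol of reference 48 is not reproduced here, and the function names and absorbance readings are invented for illustration.

```python
from statistics import mean, stdev


def miller_units(od420, od600, minutes, volume_ml, od550=0.0):
    """Classical Miller formula for beta-galactosidase activity.

    units = 1000 * (OD420 - 1.75 * OD550) / (t * V * OD600)
    where t is the reaction time in minutes and V the culture volume in mL.
    The OD550 term corrects for light scattering and defaults to zero.
    """
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def summarize(values):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return mean(values), stdev(values)


# Invented readings for three independent transformants:
# (OD420, OD600, reaction time in minutes, culture volume in mL)
readings = [(0.50, 0.50, 10, 0.1), (0.55, 0.50, 10, 0.1), (0.45, 0.50, 10, 0.1)]
units = [miller_units(a420, a600, t, v) for a420, a600, t, v in readings]
average, spread = summarize(units)
```

The sample standard deviation (n − 1 in the denominator) is used because three transformants are a small sample of all possible transformants.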
We have recently performed a detailed annotation of the C. albicans genome (6), which showed that the transcription factors of this organism frequently share homology with transcription factors of other organisms only within the DNA binding domain. We have defined 198 S. cerevisiae genes whose products contain a DNA binding domain and are classified as transcription factors by combining the list of transcriptional regulators of Harbison et al. (23) with the list from http://www.yeastract.com/tflist.php. Of these, 32 were experimentally shown to be transcriptional repressors in S. cerevisiae. Ninety-nine of the remaining 166 S. cerevisiae transcriptional activators were found to have C. albicans homologs, half of which share homology only within a DNA binding domain (Fig. 1). A detailed assessment of global and transcriptional activation domain similarities is provided (see Table S1 in the supplemental material). Since there is no primary sequence that defines the activation domain as a module, the nature of the activation domain is based on the experimentally defined part of the transcription factor. A majority of S. cerevisiae transcription factors, such as Gal4p, Gcn4p, Upc2p, Leu3p, and Arg81p (11, 13, 26, 28, 38, 46, 58), were experimentally shown to have acidic activation domains. When we compared the transcription factors of S. cerevisiae with the transcription factors of C. albicans, we observed that in some cases the sequence of the experimentally defined activation domains of S. cerevisiae is very well conserved in C. albicans transcription factor homologs, such as Upc2, Leu3, and Arg81 (11, 46, 58). In other cases, the sequence of the experimentally defined activation domain of S. cerevisiae is not detectable in C. albicans transcription factor homologs, such as Gal4p and Gcn4p. The presence of homology in the DNA binding domain of the C. albicans transcription factors like Gcn4p and Gal4p tells us that these might be transcriptional regulators, but the absence of homology in the activation domain makes it difficult for us to predict whether these could work as activators or repressors. We therefore directly investigated the functions of the regions outside the DNA binding motif.
We have investigated the roles of the nonhomologous regions of candidate transcription factors to establish if they play a role in transcriptional activation. Within the DNA binding domain, CaGcn4p shares strong sequence similarity (88%) with S. cerevisiae Gcn4p. To establish whether this protein possesses a functional activation domain, we made use of the S. aureus lexA one-hybrid system (49). This system contains the C. albicans actin promoter (pACT1) placed upstream of the S. aureus lexA open reading frame to create plasmid CIp-lexA, and the S. aureus lexA operator upstream of both an ADH1 basal promoter and a lacZ open reading frame, creating pOPlacZ (49); in addition, we placed the S. aureus lexA operator upstream of the HIS1 open reading frame to create plexA-HIS1. We integrated pOPlacZ into strain CAI8 to yield reporter strain CRC106 (49) and integrated plexA-HIS1 into strain BWP17 to yield reporter strain CMM25. Fusions were constructed in CIp-lexA and introduced into these two reporter strains. In the absence of any transcriptional activator fused to lexA, the reporter CRC106 derivative yielded basal levels of β-galactosidase, and the reporter CMM25 derivative produced no growth in the absence of histidine; when a transactivator is fused to lexA, the system yielded higher levels of β-galactosidase and permitted growth in the absence of histidine. A full-length CaGCN4 cloned downstream of the lexA open reading frame, creating CIp-lexA-GCN4, was fully capable of transactivation in the C. albicans assays. This construct generated fivefold-higher β-galactosidase activity when transformed into CRC106 to generate CMM14 and permitted growth in 1 day in the absence of histidine when transformed into CMM25 to create CMM30, compared to the appropriate controls CMM10 and CMM26, which contain CIp-lexA (Fig. 2). These results suggest that C. albicans GCN4 contains a transcription activation domain (49).
The C-terminal systematic deletions of lexA-GCN4 identified the N-terminal 81-amino-acid region serving as an activation domain; CIp-lexA-GCN4Δ247-323, CIp-lexA-GCN4Δ161-323, CIp-lexA-GCN4Δ123-323, and CIp-lexA-GCN4Δ82-323 were as active as full-length CIp-lexA-GCN4 in both the lacZ background and HIS1 background. Further C-terminal deletions gradually reduced both lacZ and HIS1 activities, suggesting that the Gcn4p activation domain is at least 81 amino acids long (see CIp-lexA-GCN4Δ69-323, CIp-lexA-GCN4Δ56-323, and CIp-lexA-GCN4Δ42-323 in Fig. 3). The deletion of the proposed N-terminal activation domain in the context of the full-length GCN4 (CIp-lexA-GCN4Δ1-81) showed the same β-galactosidase activity in strain CMM21 as that of full-length GCN4 in CMM14, while CIp-lexA-GCN4Δ1-81 in CMM37 showed a slightly reduced HIS1 activity compared to that of CMM30. This observation could be explained either by a second transcription activation domain (as in S. cerevisiae Gcn4p) located between the N-terminal activation domain and C-terminal DNA binding domain or by the CIp-lexA-GCN4Δ1-81 interaction with the endogenous wild-type Gcn4p through a dimerization domain located at the C′ terminus of the protein (27), generating an activating heterodimer. We directly tested the capacity of the region between the DNA binding domain and the N-terminal activating domain to allow transcriptional activation by creating CIp-lexA-GCN4Δ1-81Δ247-323, in which both the activation and DNA binding domains were deleted; this construct generated background β-galactosidase activity and no HIS1 activity (strains CMM64 and CMM65), suggesting that either the activation domain at the N′ terminus is the only CaGcn4p activation domain, that CIp-lexA-GCN4Δ1-81Δ247-323 yields an unstable protein, or that the Gcn4p DNA binding domain directs transcriptional activation. 
To distinguish between these hypotheses, we tested lexA-GCN4Δ1-81, which lacks the N-terminal activation domain, for its ability to activate the expression of lacZ in the absence of endogenous Gcn4p (gcn4 strain). Although lexA-GCN4Δ1-81 resulted in high expression of lacZ in the wild-type strain (CMM21), lacZ expression dropped to background levels in the absence of endogenous GCN4 (CMM88). At the same time, lexA-GCN4 showed high lacZ expression in both wild-type and gcn4 backgrounds (CMM14 and CMM87), while lexA and lexA-GCN4Δ1-81Δ247-323 showed low levels of lacZ expression in both GCN4 and gcn4 backgrounds (CMM10, CMM86, CMM64, and CMM89). These results suggest that the activation domain at the N′ terminus of C. albicans Gcn4p is the only CaGcn4p activation domain. At the amino acid level, this activation domain is nucleophilic (has a composition of 20% serine-threonine) and shares no similarity with the activation domain of ScGcn4p.
We examined the activation domain of a candidate C. albicans version of the S. cerevisiae Gal4p protein. In S. cerevisiae, Gal4p is a highly studied transcription factor, and the structures of its DNA binding and transcriptional activation modules as well as the target promoters have been extensively investigated (36, 45, 52). Within the DNA binding domain, the putative Gal4p protein, encoded by C. albicans ORF19.5338, shares strong sequence similarity (86%) with S. cerevisiae Gal4p; the DNA binding domain has the six cysteine residues, the linker region, and the dimerization region all well conserved. A BLAST search of the ScGal4p sequence in the C. albicans genome yields Orf19.5338 as its closest homolog; at the same time, searching the Orf19.5338 sequence in the S. cerevisiae genome yields ScGal4p as its closest homolog. Since ScGal4p and Orf19.5338 form a “reciprocal best hit” relationship, we named Orf19.5338 CaGal4p. Although this C. albicans Gal4p homolog binds 5′-CGGN11CCG-3′, the upstream activating sequence (UASG) to which Gal4p binds in S. cerevisiae (data not shown), the promoters of C. albicans GAL genes lack UASG. Rather, UASG are found upstream of C. albicans subtelomeric and glycolysis genes (data not shown). In addition, this C. albicans gene encodes a much smaller 261-amino-acid-long protein compared to the S. cerevisiae Gal4p of 881 amino acids. The regions outside the DNA binding domain of those two proteins share no similarity, and the negatively charged region that serves as the interaction domain for ScGal80p is missing in CaGal4p. Interestingly enough, although C. albicans can grow on galactose (19), its genome also lacks a Gal80p homolog.
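The "reciprocal best hit" criterion used to name CaGal4p can be stated compactly in code. The sketch below assumes the best hit in each direction has already been extracted from BLAST output into two dictionaries; the function name and every identifier other than ScGal4p and Orf19.5338 are hypothetical placeholders.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


# Best hits of S. cerevisiae queries in the C. albicans genome
# (only the first entry comes from the text; the second is a placeholder).
sc_to_ca = {"ScGal4p": "Orf19.5338", "ScPlaceholderA": "CaPlaceholderX"}

# Best hits of C. albicans queries in the S. cerevisiae genome.
ca_to_sc = {"Orf19.5338": "ScGal4p", "CaPlaceholderX": "ScPlaceholderB"}

pairs = reciprocal_best_hits(sc_to_ca, ca_to_sc)
```

The placeholder pair is dropped because the relationship holds in only one direction, which is exactly the situation the reciprocal test is meant to exclude.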
To establish whether CaGal4p contains a transcriptional activation domain, full-length GAL4 was cloned downstream of the lexA open reading frame to create plasmid CIp-lexA-GAL4. We transformed this construct into strain CRC106 to create CMM11; the transactivating ability of CIp-lexA-GAL4 was five times that of the control vector CIp-lexA, which was transformed into CRC106 to create strain CMM10 (Fig. 2). Similarly, strain CMM25, which contains the lexA operator in front of HIS1, was transformed with CIp-lexA and CIp-lexA-GAL4 to create CMM26 and CMM27, respectively. As was found for the lacZ system, lexA-GAL4 showed high transcription-activating ability compared to the activity of the vector alone, since only CMM27 grew in the absence of histidine. These results suggest that C. albicans Gal4p can act as a transcriptional activator.
We examined which part of CaGal4p was essential for the transactivating capacity. Deletion of the C-terminal 71 amino acids, creating CIp-lexA-GAL4Δ188-261, abolished the transactivating ability of lexA-GAL4 when introduced into both strain CRC106 to create CMM13 and into strain CMM25 to create CMM29. In addition, fusion of lexA to the nucleotides encoding the C-terminal 71 amino acids of CaGal4p showed transcription-activating ability similar to that of the full-length GAL4 (see CMM60 and CMM61) (Fig. 4). Similarly to the Gcn4p activation domain, the CaGal4p activation domain showed a nucleophilic nature; it has a 30% serine-threonine composition but shares no other similarity with the activation domain of ScGcn4p.
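The serine-threonine percentages quoted for the two activation domains are simple residue fractions. The sketch below shows one way to compute such a fraction, together with the acidic (aspartate plus glutamate) fraction for comparison; the function name and the peptide are invented for illustration and are not the CaGal4p or CaGcn4p sequences.

```python
def residue_fraction(protein, residues):
    """Return the fraction of positions in protein occupied by any of residues.

    Both arguments use one-letter amino acid codes; case is ignored.
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("protein sequence must not be empty")
    wanted = set(residues.upper())
    return sum(1 for aa in protein if aa in wanted) / len(protein)


# Invented 10-residue peptide: 4 Ser/Thr residues and 2 acidic residues.
example_peptide = "SSTTDEAAAA"
ser_thr = residue_fraction(example_peptide, "ST")  # nucleophilic residues
acidic = residue_fraction(example_peptide, "DE")   # acidic residues
```

Reporting both fractions side by side makes the contrast between a nucleophilic domain and an acidic one explicit.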
It is currently believed that the gene specificity of the transcription factor comes from its DNA binding domain: this domain binds to a nucleotide motif on the promoters and recruits the RNA polymerase II machinery (34). To see whether the transcriptional activation domain plays a role in the transcriptional selectivity of CaGal4p, we replaced the 71-amino-acid-long CaGal4p activation domain (Gal4AD) with the 81-amino-acid-long Gcn4p activation domain (Gcn4AD). We observed that in vivo the lexA-GAL4-GCN4AD construct in strains CMM12 and CMM28 showed the same transcription-activating ability as the lexA-GAL4-GAL4AD in strains CMM11 and CMM27, respectively (Fig. 4).
Eukaryotic transcription factors are typically bipartite in nature, with a region (the DNA binding or DB domain) specifically designed to interact with a defined DNA sequence and a region (the transcriptional activation or TA domain) required to interface the factor with the transcriptional machinery. There are several classes of each of these modules, and they are connected together in a variety of ways. Within the transcriptional activation modules, there are domains rich in acidic or basic residues or rich in glutamine, threonine/serine, or isoleucine residues (2, 7, 10, 15, 26, 30, 37, 38). In this study we defined the transcription activation domains in a pair of C. albicans transcription factors that share sequence similarity with their S. cerevisiae homologs only within their DNA binding domains. The Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA binding domain is one of the largest classes of fungal DNA binding proteins, the best characterized of which are Gal4p, Ppr1p, Leu3p, Hap1p, and Put3p. Although the DNA binding sequence of ScPut3p (CGGN10CCG) is very similar to that of ScGal4p (CGGN11CCG), the distinction in recognition sequences is conserved; C. albicans possesses homologs of all of these S. cerevisiae Gal4-like Zn(II)2Cys6 proteins, including Put3p.
In S. cerevisiae it was shown that Gal4AD and Gcn4AD have an acidic amino acid-rich nature and are located in the C′ and N′ termini, respectively (13, 26, 28, 37, 38). We analyzed the transcription activation domains of the C. albicans Gcn4p (CaGCN4) and Gal4p (CaGAL4) homologs and found that just as in S. cerevisiae, they are positioned at the N′ and C′ termini of the respective proteins. However, the C. albicans Gcn4p and Gal4p activation domains do not share sequence similarity either to each other or to the activation domains of their S. cerevisiae homologs, and C. albicans Gal4p and Gcn4p have nucleophilic activation domains. Nucleophilic transcriptional activation regions have been previously seen almost exclusively in higher eukaryotic transcription factors (9, 10, 21). A screen for C. albicans transcriptional activation domains using a genomic library fused downstream of lexA yielded an active fragment containing a normally noncoding region that expressed 33% serines and threonines in the fusion construct (data not shown), which also suggests that nucleophilicity can be an important feature of C. albicans activation domains. The serine and threonine amino acids could potentially be converted into an acidic form by phosphorylation.
The S. cerevisiae Gal4p and Gcn4p proteins each contain two transcriptional activation domains (13, 27, 28, 37, 38). In contrast, the C. albicans Gcn4p and Gal4p homologs appear to each contain only one transcriptional activation domain (Fig. 3 and 4). Each of the two ScGcn4p activation domains seems to be composed of two or more small subdomains that have additive effects on transcription and that can cooperate in different combinations to promote high-level expression of the Gcn4p-dependent genes (13, 28). These results are consistent with our observation that the C-to-N-terminal deletions within the CaGcn4p activation domain lead to a gradual, rather than to an abrupt, reduction of the transcription-activating abilities of the fusion protein (Fig. 3).
To determine when the changes in the activation domains of Gcn4p and Gal4p occurred during the evolution of the yeast species, we used available genomic data of the ascomycota (Schizosaccharomyces pombe, Neurospora crassa, Aspergillus niger, S. cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces kudriavzevii, Saccharomyces bayanus, Candida glabrata, Saccharomyces castellii, Kluyveromyces lactis, Ashbya gossypii, Debaryomyces hansenii, Candida tropicalis, Candida dubliniensis, and C. albicans) (Fig. 5A and B). Archiascomycetes were observed to lack the activation domains of either ScGcn4p or CaGcn4p. Euascomycetes possessed the ScGcn4p activation domain II (ADII). We noted that the common ancestor of D. hansenii, C. tropicalis, C. dubliniensis, and C. albicans lost activation domain I (ADI) of ScGcn4p and acquired the activation domain of CaGcn4p. We also observed that the ancestor of C. tropicalis, C. dubliniensis, and C. albicans lacked ScGcn4p ADII (Fig. 5A). In addition, the ancestor of C. tropicalis, C. dubliniensis, and C. albicans lacked ScGal4p ADI, while the ancestor of D. hansenii, C. tropicalis, C. dubliniensis, and C. albicans lacked ScGal4p ADII (Fig. 5B). In both cases, D. hansenii represents an intermediate with both S. cerevisiae and C. albicans activation domains. These observations show that the changes in the activation domains of Gal4p and Gcn4p of C. albicans occurred relatively recently on the evolutionary scale.
The lack of homology in the activation domains of transcriptional activators between S. cerevisiae and C. albicans might suggest a concomitant reduced structural similarity in the activation domain-interacting complexes between the two species. A pairwise sequence comparison of the transcriptional machinery between C. albicans and S. cerevisiae shows a high level of conservation in the RNA polymerase II complex. The exceptions to this are transcription factor IIA and the Mediator complex: S. cerevisiae and C. albicans show low levels of homology with respect to the proteins of those two complexes, and these are the complexes that interact with transcriptional regulators.
The characterization of the bipartite structure of eukaryotic transcription factors like S. cerevisiae Gal4p was a fundamental conceptual advance (33) and has led to important technical developments like the yeast two-hybrid system (17). In general, C. albicans transcription factors follow the pattern of distinct DNA binding and transcriptional activation domains, and many show strong sequence similarity, extending to both domains, to specific S. cerevisiae transcription factors. However, a large number of C. albicans proteins have strong sequence similarity that is limited only to the DNA binding module of an S. cerevisiae transcription regulator. We have shown that although the well-studied Gal4p and Gcn4p proteins of S. cerevisiae share similarity only to the DNA binding regions of the Gcn4p and Gal4p proteins of C. albicans, the Candida proteins still contain transcriptional activation capacity. Further work will be necessary to establish the molecular logic of linking common DNA binding modules to distinct activation domains in these two fungi, in particular in cases such as Gcn4p where similar cellular processes are regulated by the two proteins.
We thank A. J. Brown and Clair Russel for CRC103 and CRC106 strains and for the CIp-lexA and CIp-lexA-GCN4 DNA constructs and A. Mitchell for the C. albicans mutant libraries.
This work was supported by Canadian Institutes of Health Research grant MOP-42516 (to M.W.). M.M. gratefully acknowledges a FRSQ-FCAR-Sante Scholarship and National Research Council Graduate Student Scholarship Supplement.
This is National Research Council publication 47514.
Published ahead of print on 8 December 2006.
†Supplemental material for this article may be found at http://ec.asm.org/.