This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The peptidyl-prolyl cis/trans isomerase (PPIase) class of proteins is present in all known eukaryotes, prokaryotes, and archaea, and it comprises three member families that share the ability to catalyze the cis/trans isomerisation of a prolyl bond. Some fungi have been used as model systems to investigate the role of PPIases within the cell; however, how representative these repertoires are of other fungi or humans has not been fully investigated.
PPIase numbers within these fungal repertoires appear to be associated with genome size, and orthology between repertoires was found to be low. Phylogenetic analysis showed the single-domain FKBPs to evolve prior to the multi-domain FKBPs, whereas the multi-domain cyclophilins appear to evolve throughout cyclophilin evolution. A comparison of their known functions has identified, besides a common role within protein folding, multiple roles for the cyclophilins within pre-mRNA splicing and cellular signalling, and within transcription and cell cycle regulation for the parvulins. However, no such commonality was found with the FKBPs. Twelve of the 17 human cyclophilins and both human parvulins, but only one of the 13 human FKBPs, had orthologues within these fungi. hPar14 orthologues were restricted to the Pezizomycotina fungi, and R. oryzae is unique in the known fungi in possessing an hCyp33 orthologue and a TPR-containing FKBP. The repertoires of Cryptococcus neoformans, Aspergillus fumigatus, and Aspergillus nidulans were found to exhibit the highest orthology to the human repertoire, and Saccharomyces cerevisiae one of the lowest.
Given these data, we would hypothesize that: (i) the evolution of the fungal PPIases is driven, at least in part, by the size of the proteome, (ii) evolutionary pressures differ both between the different PPIase families and the different fungi, and (iii) whilst the cyclophilins and parvulins have evolved to perform conserved functions, the FKBPs have evolved to perform more variable roles. Also, the repertoire of Cryptococcus neoformans may represent a better model fungal system within which to study the functions of the PPIases, as its genome size and genetic tractability are equal to those of Saccharomyces cerevisiae, whilst its repertoire exhibits greater orthology to that of humans. However, further experimental investigations are required to confirm this.
The peptidyl-prolyl cis/trans isomerase (PPIase) class of proteins traditionally comprises three distinct protein families, the cyclophilins (cyclosporin A binding proteins), FKBPs (FK506 binding proteins) and parvulins, that are linked by their shared ability to catalyse the isomerisation of the bond preceding a proline residue between its cis and trans forms. However, the recent identification of a cyclophilin-FKBP hybrid protein (FCBP; FK506- and cyclosporin-binding protein) in the protozoan parasite Toxoplasma gondii and the identification of three further members of this novel family in two distinct bacteria, Flavobacterium johnsonii and Treponema denticola, as well as in at least ten other bacteria in the sequence databases (TJP; unpublished data), may indicate a shared early evolutionary history for the cyclophilin and FKBP families. All families, with the exception of the FCBPs that appear to be confined to the bacterial and protist lineages (TJP; unpublished data), are found widely distributed in eukaryotes, prokaryotes and archaea [2-8], implying that their function is required in cellular processes from bacteria to man, and in all the major compartments of the cell [8-11].
Despite their shared conservation throughout nature, the three traditional PPIase families do not share a conserved role in cell viability. In bacteria, the four known periplasmic PPIases in Escherichia coli (FkpA, PpiA, PpiD, and SurA) have been reported not to be essential for growth under laboratory conditions. Two cytosolic PPIases in Bacillus subtilis, PpiB and trigger factor, have likewise been shown not to possess an essential function under normal growth conditions, but they become essential for cell viability under starvation conditions. All of the cyclophilins and FKBPs in the budding yeast Saccharomyces cerevisiae have been individually and collectively knocked out with no effect on cell viability [14,15]. Only Ess1, the S. cerevisiae orthologue of the human parvulin Pin1, has been reported to be essential within S. cerevisiae. Even so, only a very low level is required for cell growth under normal conditions, whereas a higher level is required in the presence of environmental challenges. This essential cellular role is shared with its orthologue in the pathogenic yeast Candida albicans, but not with its orthologues in the fellow fungi Schizosaccharomyces pombe and Cryptococcus neoformans, or in the fruit fly Drosophila melanogaster. It therefore appears likely that the essential function of some Pin1 orthologues is limited to certain organisms or that there is a degree of redundancy present in these other organisms that compensates for its absence.
Recently, a mutation in the D. melanogaster cyclophilin CG3511 that severely truncates the protein has been reported to confer a synthetic lethal phenotype on cells that lack the retinoblastoma (Rbf) protein. Mice lacking Pin1, Cyclophilin A, FKBP12, FKBP12.6, and FKBP52 have all been found to be viable and to develop normally, although the latter did result in partial embryonic lethality. However, Pin1-deficient mice were found to be at a higher risk of developing Alzheimer's disease, and also to have cell-proliferative abnormalities that included decreased body weight, testicular and retinal atrophies, and the failure of the breast epithelial compartment to undergo the normal changes associated with pregnancy [23,29]. FKBP12-deficient mice were found to suffer from severe dilated cardiomyopathy and noncompaction of left ventricular myocardium, which mimics a human congenital heart disorder. Cardiac hypertrophy was found in FKBP12.6-deficient male mice, but not in females, unless the protective effect of oestrogen was abrogated. Finally, FKBP52-deficient male mice were found to have several defects in reproductive tissues, which included ambiguous external genitalia and a dysgenic prostate.
It therefore appears that despite the high conservation of the PPIases throughout the eukaryotes and prokaryotes, they do not possess an essential function within many cells under normal growth conditions, but may become essential in the absence of other cellular factors or in response to environmental challenges. Also, whilst PPIases appear not to be essential for the viability of the mouse, their haploinsufficiency may result in abnormalities that impact the fitness of the animal and which could be models for human disease.
Whilst some fungi have found commercial applications, such as S. cerevisiae, Sz. pombe and R. oryzae in fermenting, others have been identified as both human [30-35] and plant [36-38] pathogens. PPIases within some bacterial pathogens have been found to have a key role in their pathogenicity [39-45], and a cyclophilin within the pathogenic fungus Cryptococcus neoformans has been reported as important for its virulence. However, in most cases it is unknown what role, if any, fungal PPIases play in the pathogenicity of their host cell. Until we identify and understand the PPIase repertoires of these pathogens, we cannot begin to unravel their potential roles within the cell and their use as putative therapeutic targets.
Reported here is the identification and comparative analysis of the PPIase repertoires present in sixteen fungi that represent four different fungal taxa: Ascomycota (Candida albicans, Candida glabrata, Debaryomyces hansenii, Eremothecium gossypii, Kluyveromyces lactis, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Yarrowia lipolytica, Aspergillus fumigatus, Aspergillus nidulans, Gibberella zeae, Neurospora crassa), Basidiomycota (Cryptococcus neoformans, Ustilago maydis), Microsporidia (Encephalitozoon cuniculi) and Zygomycota (Rhizopus oryzae). By comparing these fungal repertoires, we hope to identify key conserved PPIases that are found within all their repertoires as well as those that are specific to each fungal taxon or fungus. By compiling the known functions of these PPIases, we hope to better understand both those that function within a broad range of fungi and those that are specific to a particular fungal lineage and that may be linked to its specific characteristics. This comparison will also aid our interpretation of the use of fungal model systems in furthering our understanding of the roles of PPIases within vastly different cell types.
The PPIase repertoires of the different fungi were identified by BLASTP and TBLASTN searches of their proteome and genome, respectively, maintained by the National Center for Biotechnology Information (NCBI) using the protein sequences of human cyclophilin A, FKBP12, and Pin1 as probes to identify the cyclophilin, FKBP, and parvulin families, respectively. The cyclophilins were all identified with an E-value between 10⁻¹⁰ and 10⁻⁶⁹ and a sequence identity between 28–73%, with the exception of six proteins. These six proteins are members of two previously identified cyclophilin groups that possess a divergent PPIase domain and they gave E-values <10⁻¹⁰ and a sequence identity <30% when compared against hCypA. These cyclophilins were identified using the Sz. pombe and S. cerevisiae members of these groups (see Table 16 in Additional File 1; Group K – DhCyp6, AfCyp8, & UmCyp6; Group L – YlCwc27, DhCwc27, & EgCwc27). The Group K members gave E-values between 10⁻⁵⁶ and 10⁻¹⁰⁹ and a sequence identity between 40–60%, and the Group L members gave E-values between 10⁻²⁰ and 10⁻³⁷ and a sequence identity between 27–31%. The FKBPs were all identified with E-values between 10⁻¹³ and 10⁻³⁸ and a sequence identity between 33–62%. The parvulins were all identified with E-values between 10⁻¹⁹ and 10⁻⁴⁹ and a sequence identity between 34–55%, with the exception of four that gave E-values between 10⁻⁶ and 10⁻⁸ but retained a sequence identity between 33–37% (AfPar1, AnPar1, GzPar1, & NcPar1). Upon inspection, these proteins possessed a clearly identifiable parvulin-like rotamase domain but lacked the characteristic WW domain of the metazoan Pin1 proteins. Searching of the fungal sequence databases using the human parvulin Par14 as a probe identified these four parvulins with E-values between 10⁻²⁸ and 10⁻³² and a sequence identity between 54–59%, confirming them as parvulins.
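The screening logic described above, in which strong hits to a probe are accepted and weaker hits are set aside for inspection with a second probe, can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes the searches were exported in the standard 12-column BLAST tabular layout (percent identity in column 3, E-value in column 11), the 10⁻¹⁰ cut-off is an illustrative choice, and the subject identifiers and scores in the example are invented placeholders.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity reported by BLAST
    evalue: float


def parse_tabular(text: str) -> list[Hit]:
    """Parse BLAST tabular output (12 tab-separated columns per hit)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.split("\t")
        hits.append(Hit(cols[0], cols[1], float(cols[2]), float(cols[10])))
    return hits


def split_hits(hits: list[Hit], max_evalue: float = 1e-10):
    """Separate confident hits from borderline ones needing a second probe."""
    confident = [h for h in hits if h.evalue <= max_evalue]
    borderline = [h for h in hits if h.evalue > max_evalue]
    return confident, borderline


# Invented example rows: query, subject, %identity, then the remaining
# standard columns, with the E-value in the eleventh position.
EXAMPLE = "\n".join(
    "\t".join(row)
    for row in [
        ["hCypA", "subject_A", "65.0", "160", "56", "0", "1", "160", "1", "160", "1e-40", "150"],
        ["hCypA", "subject_B", "29.0", "150", "100", "3", "5", "150", "8", "155", "2e-07", "48"],
    ]
)

confident, borderline = split_hits(parse_tabular(EXAMPLE))
print([h.subject for h in confident])   # hits accepted directly
print([h.subject for h in borderline])  # hits to re-check with another probe
```

Borderline hits would then be re-queried with a more closely related probe, in the way the four Pezizomycotina parvulins were confirmed using Par14.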
The identified PPIase repertoires within the fungal genomes investigated can be found in Tables 1–14. Table 15 shows a comparison of the number of members of each PPIase family found within the different fungi. The repertoire orthology of the cyclophilins, FKBPs and parvulins as identified by BLAST analysis can be found in Table 16 [See Additional File 1]. Figure 1 shows the dendrograms generated for the cyclophilins (A), FKBPs (B), and parvulins (C). Pairwise E-values, bit scores, and percentage sequence similarity and identity between all members of each cyclophilin, FKBP, and parvulin group, as well as a global comparison for each family, can be found in Additional Files 2 & 3 (cyclophilins), 4 & 5 (FKBPs) and 6 & 7 (parvulins). The multiple sequence alignments for each cyclophilin, FKBP, and parvulin group discussed herein can be found in Additional File 8.
The number of PPIases present within these fungi (Table 15) varies from three in E. cuniculi to 22 in R. oryzae. To compare the sizes of the different repertoires, the number of PPIases has been weighted by the number of genes in their genome. The average number of PPIases was 1.9 per 1000 genes in the genome (Table 15), with the most PPIase-rich fungus being Sz. pombe, which has just over 2.5 per 1000 genes in its genome, and the most PPIase-poor fungus being G. zeae, which has just over 1 per 1000 genes in its genome. These differences are also reflected in the taxa, with the Pezizomycotina fungi having an average of 1.54 PPIases per 1000 genes, compared to the Saccharomycotina fungi, which have 2.02 per 1000 genes. Looking at the weighted number of each PPIase family in these two taxa, this difference appears due to a reduced number of cyclophilins and FKBPs within the Pezizomycotina fungi. It also appears that Sz. pombe is PPIase rich due to an above-average number of cyclophilins and FKBPs, whilst G. zeae is PPIase poor due to a below-average number of all three PPIase families.
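The weighting used here is simply the PPIase count divided by the gene count and scaled to 1000 genes. A minimal Python sketch of that calculation follows. The fungus names, PPIase counts and gene counts in it are invented placeholders, not values from Table 15.

```python
def ppiases_per_1000_genes(n_ppiases: int, n_genes: int) -> float:
    """Weight a PPIase count by genome gene count (per 1000 genes)."""
    if n_genes <= 0:
        raise ValueError("gene count must be positive")
    return 1000.0 * n_ppiases / n_genes


# Placeholder repertoires: name -> (number of PPIases, number of genes).
repertoires = {
    "fungus_A": (10, 5000),
    "fungus_B": (12, 10000),
}

densities = {
    name: ppiases_per_1000_genes(n_ppiases, n_genes)
    for name, (n_ppiases, n_genes) in repertoires.items()
}
mean_density = sum(densities.values()) / len(densities)

# Rank from the most PPIase-rich to the most PPIase-poor repertoire.
ranked = sorted(densities, key=densities.get, reverse=True)
print(densities, round(mean_density, 2), ranked)
```

The same function can be applied per family (cyclophilins, FKBPs, parvulins) to see which family accounts for a repertoire being above or below the average.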
Cyclophilin numbers in the different fungi vary predominantly between 6 and 11; however, at the extremes E. cuniculi has only 2, and R. oryzae has 16 (Table 15). The most cyclophilin-rich fungus is C. neoformans with 1.98 per 1000 genes, closely followed by Sz. pombe with 1.87 per 1000 genes, and the most cyclophilin-poor fungus is G. zeae with 0.71 per 1000 genes, followed by N. crassa with 0.89 per 1000 genes. The number of FKBPs in these fungi is typically 3 or 4, with the extremes being none in E. cuniculi and 5 in R. oryzae, which is in keeping with their respective genome sizes (Table 15). The most FKBP-rich fungus is C. glabrata with 0.76 per 1000 genes, closely followed by S. cerevisiae with 0.68 per 1000 genes and Sz. pombe with 0.62 per 1000 genes, and the most FKBP-poor fungus is E. cuniculi with none, followed by G. zeae with 0.21 per 1000 genes and R. oryzae with 0.29 per 1000 genes, with the latter being surprising given that it has the largest PPIase repertoire of these fungi (Table 15). There is on average just a single parvulin in these fungi (Table 15), with the exception being the presence of a second in members of the Pezizomycotina taxon, but this number is in keeping with their genome size. The most parvulin-poor fungus is R. oryzae, which has just the sole parvulin despite having a genome size larger than that of the Pezizomycotina fungi. The most parvulin-rich fungus is E. cuniculi, with its sole parvulin but very small genome size compared with the other fungi.
Overall, G. zeae and R. oryzae have the sparsest PPIase repertoires of these fungi, with both lacking the number of members in all three PPIase families that would be predicted based on their genome sizes. Sz. pombe and C. neoformans have the densest PPIase repertoires of these fungi, with the former having an increased number of cyclophilins and FKBPs over the expected value, whilst the latter solely has a larger than expected cyclophilin repertoire.
Table 16A [See Additional File 1] shows that there are only two cyclophilin groups that are conserved throughout all the fungi. One group is the human cyclophilin A orthologues (Group B), a ubiquitous group of cyclophilins that have been reported to be cytoplasmic, in agreement with the PSORT predicted localization for all members except RoCyp1 (Table 16A [See Additional File 1]), with an appreciable nuclear component [10,47-49], which, given the absence of nuclear localisation sequences (NLS) within their sequences, is likely due to interactions with target proteins that shuttle them into the nuclear compartment. However, SpCyp2 has been reported to be present in discrete vesicles within the cytoplasm, and RoCyp1 was predicted by PSORT to be nuclear, which is in partial agreement with the observed localization for members of this group, implying that the localization of this group may be variable between different organisms. This is supported by the observation that the cyclophilin A orthologue in N. crassa has two isoforms that localize to the cytoplasm or the mitochondrion, dependent upon the cleavage of an N-terminal signal peptide. Similar dual functions may also exist with other members of this group, but they have not yet been identified. Members of this group have been reported to function in protein activity regulation, transcriptional regulation [51-53], a vesicular import pathway, the control of both the meiotic [46,49] and mitotic [46,55] cell cycles, and in the mediation of the virulence of C. neoformans, indicating that they have wide-ranging functions within the cell.
The second group that is present in all the fungi is that of the human cyclophilin B orthologues (Table 16A [See Additional File 1]; Group G). This group is identified by their targeting to the endoplasmic reticulum (ER) by an N-terminal signal peptide [9-11,56,57], in agreement with their PSORT predicted localization (Table 16A [See Additional File 1]). AnCyp6 has been reported to be upregulated during heat shock and it is capable of inhibiting calcineurin in the presence of CsA, but it is not essential for cell growth. A second cyclophilin in both R. oryzae (RoCyp9) and S. cerevisiae (ScCpr2) appears to be a paralogue of their respective Group G member, suggesting that these fungi may be evolving divergent functions within their ER that require a second ER cyclophilin related to Group G. ScCpr2 has been reported to be present in the fungus's secretory pathway [10,14,58] and induced by heat stress and tunicamycin, confirming its localization. Their presence in the ER and up-regulation during heat shock would strongly suggest a role within the vesicular protein folding pathway, presumably as a folding catalyst and chaperone.
Only one group is present in all but one of the fungi, E. cuniculi. The human cyclophilin 40 orthologues (Table 16A [See Additional File 1]; Group I) are a group of heat shock inducible [11,60-62], predominantly nuclear [10,11] cyclophilins. They are predicted by PSORT to be cytoplasmic and possess no NLS with the exception of UmCyp5, suggesting that their presence in the nucleus is through interactions with target factors that shuttle them into the nucleus rather than their direct targeting. Members of this group have been reported to function within the Hsp90 complex [62-64], potentially regulating its ATPase activity, during its functions in cellular signalling pathways that regulate transcription [62,66], the cellular heat shock response and also in maintaining the cell cycle protein kinases Mik1, Wee1 and Swe1. This would suggest that this group functions as a control element within a wide range of cellular signalling pathways that include the regulation of the cell cycle.
The Cwc27 orthologues are present in 12 of the 16 fungi (Table 16A [See Additional File 1]; Group L). Not originally identified as members of the cyclophilin family, recent research has identified the presence of a degenerate PPIase domain in the N-terminus of these proteins followed by a C-terminal region rich in S/K-R/E residues that is similar to those observed in hnRNP-binding proteins [11,68]. All members were predicted by PSORT to be nuclear, which has been confirmed experimentally for ScCwc27; however, SpCyp7 has been reported to be found within the perinuclear space. Both SpCyp7 and ScCwc27 have been reported to be a component of their respective Cdc5 complex, but their function, and those of their orthologues, within this complex remain unknown. However, it does identify a role for this group within pre-mRNA processing. This is also in contradiction with the observed localization of SpCyp7, as the Cdc5 protein complex has been reported to be predominantly nuclear [10,70], indicating that SpCyp7 may have multiple functions within Sz. pombe.
Three groups are present in 10 out of the 16 fungi, with no members identified in E. cuniculi and all but two of the Saccharomycotina fungi (Table 16A [See Additional File 1]; Groups D, K & N). The two Saccharomycotina fungi that do have a member are D. hansenii and Y. lipolytica for Groups D & N, and D. hansenii and E. gossypii for Group K. Very little is known about the functions of Group D besides the predominantly nuclear localization of SpCyp3, with a suggested role in pre-mRNA splicing, and its apparent presence in the spindle-pole bodies and/or microtubule organizing centres, both of which are contrary to the group's PSORT predicted cytoplasmic localization for all members except DhCyp2, which was predicted to be nuclear (Table 16A [See Additional File 1]), suggesting that the presence of this group in the nucleus is due primarily to their association with other factors. Again, very little is known about the functions of Group N, whose members share the presence of WD40 motifs in their N-terminal region and are all predicted to be cytoplasmic by PSORT, with the exception of DhCyp9 and UmCyp8, which were predicted to be nuclear. This motif is found in all eukaryotes, but not in prokaryotes, in a large variety of proteins that share no obvious commonality in their functions. Finally, Group K members share the presence of a C-terminal RNA Recognition Motif (RRM), which is found in Metazoan protein factors involved in constitutive pre-mRNA splicing and alternative splicing regulation. Again, very little is known about this group in fungi beyond the highly specific nuclear localization of SpCyp6 and a putative role in cell morphogenesis, cortical organization and nuclear reorganization. The localization of SpCyp6 is in agreement with the PSORT predicted localization of all members of this group except for EgCyp5 and DhCyp6, which were predicted to be cytoplasmic, indicating that their targeting to the nucleus may also be due to an association with other targeting factors, presumably during their functions within the pre-mRNA processing complexes.
One group is found in 9 of the 16 fungi, with no members identified in E. cuniculi or in any of the Saccharomycotina fungi with the exception of Y. lipolytica (Table 16A [See Additional File 1]; Group M). Four of its members are predicted by PSORT to be nuclear (RoCyp14, SpCyp8, YlCyp8, & UmCyp7), whilst the other five are predicted to be cytoplasmic (AnCyp9, AfCyp9, NcCyp6, GzCyp7, & CnCyp11). This difference is grouped by taxa, suggesting either that there may be a difference in function between them despite the high sequence homology they all share [See Additional File 8], or that their function is both within the cytoplasm and nucleus and that this targeting difference is a response to changes in their target factors that in the latter group can transport them into the nucleus, but not in the former. They all possess an N-terminal U-Box motif, which is reported to be a modified RING-finger motif involved in protein:protein interactions that has been primarily identified in proteins involved in the ubiquitin/proteasome system. As with the previous groups, very little is known about the functions of this group within these fungi beyond the predominantly nuclear localization reported for SpCyp8, which is in agreement with its PSORT predicted localization.
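The "found in 9 of the 16 fungi" figure can be reproduced from the member names alone, as the short Python sketch below illustrates. It assumes that the first two letters of each protein name identify the source fungus, a convention inferred from the naming used in this article rather than stated in it. The G. zeae member is written as GzCyp7 on the assumption that it follows the same convention.

```python
# Two-letter prefixes mapped to the fungi they appear to denote
# (mapping inferred from the names used in the text, not from a table).
PREFIX_TO_FUNGUS = {
    "Ro": "R. oryzae",
    "Sp": "Sz. pombe",
    "Yl": "Y. lipolytica",
    "Um": "U. maydis",
    "An": "A. nidulans",
    "Af": "A. fumigatus",
    "Nc": "N. crassa",
    "Gz": "G. zeae",
    "Cn": "C. neoformans",
}

TOTAL_FUNGI = 16  # number of fungi compared in the study

# Group M members as listed in the text, by PSORT-predicted localization.
GROUP_M_NUCLEAR = ["RoCyp14", "SpCyp8", "YlCyp8", "UmCyp7"]
GROUP_M_CYTOPLASMIC = ["AnCyp9", "AfCyp9", "NcCyp6", "GzCyp7", "CnCyp11"]


def fungi_with_member(protein_names: list[str]) -> set[str]:
    """Return the distinct fungi represented in a list of protein names."""
    return {PREFIX_TO_FUNGUS[name[:2]] for name in protein_names}


group_m_fungi = fungi_with_member(GROUP_M_NUCLEAR + GROUP_M_CYTOPLASMIC)
print(f"Group M is present in {len(group_m_fungi)} of {TOTAL_FUNGI} fungi")
```

Applying the same function to each localization list separately shows which fungi fall on either side of the nuclear/cytoplasmic split.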
All but four of the fungi have a PSORT predicted mitochondrial cyclophilin, the exceptions being R. oryzae, E. cuniculi, Sz. pombe and U. maydis. These cyclophilins are found spread between two orthology groups (Table 16A [See Additional File 1]; Groups E & F) that are distinguished based upon sequence characteristics [See Additional File 8]. Group E is present in all the Saccharomycotina fungi, whilst Group F is present in all Pezizomycotina fungi and C. neoformans. In Group E, ScCpr3 has been reported as a mitochondrial cyclophilin required for mitochondrial function under heat-stress and as a protein folding chaperone within the mitochondria [59,77,78]. In Group F, NcCyp4 has both a mitochondrial and cytoplasmic isoform. Mitochondrial NcCyp4 has been reported to cooperate with Hsp70 and Hsp60 within the mitochondrial matrix, and whilst mitochondria lacking functional NcCyp4 efficiently imported preproteins into the matrix, the folding of the imported preprotein was significantly delayed. It has also been reported to suppress the gating of the putative fungal mitochondrial permeability transition pore in a CsA-sensitive manner. No functions are yet known for its cytoplasmic isoform, but based on this information the mitochondrial isoform appears important for the maintenance of mitochondrial function. Interestingly, although AnCyp3 and AfCyp4 both show a high degree of sequence homology to the other members of Group F [See Additional File 8], both were predicted by PSORT to be cytoplasmic, unlike the other members of this group, due to the absence of a signal peptide [See Additional File 8]. R. oryzae has two cyclophilins (RoCyp5 & RoCyp6) that appear to be paralogues based on their sequence (data not shown) and also show a high degree of orthology to the members of mitochondrial Group E (data not shown); however, they are predicted by PSORT to be cytoplasmic, as no mitochondrial localization sequences were identified within their sequence. They could possess unknown mitochondrial targeting signals, but further investigation is required to confirm this.
Two groups are present only in R. oryzae, the Pezizomycotina fungi, and the Basidiomycota fungi with the exception of U. maydis (Table 16A [See Additional File 1]; Group A) or N. crassa (Group C). All Group A members are predicted to be cytoplasmic by PSORT; however, SpCyp1 has been reported to be predominantly nuclear and to function within a broad range of SNW/SKIP signal transduction pathways involved in cell proliferation and differentiation, suggesting that it is its interaction with factors within these pathways that causes its targeting to the nucleus. Group C members are predicted by PSORT to be cytoplasmic, but nothing is currently known about the functions of this group.
Group J is found solely within the Saccharomycotina fungi (Table 16A [See Additional File 1]). This is a second TPR-containing cyclophilin group whose members exhibit a very high degree of sequence homology with their respective member of the other TPR-containing group (Group I; data not shown) and, in the case of ScCpr7, also interact with Hsp90 [61,82,83], suggesting that they share a similar role within cellular signalling pathways. Like Group I, members of this group are predicted by PSORT to be cytoplasmic, in agreement with the reported localization for ScCpr7, with the exception of CaCyp6, which was predicted to be nuclear, supporting possible conserved functions.
Finally, Group H is also found solely within the Saccharomycotina fungi with the exception of Y. lipolytica (Table 16A [See Additional File 1]). ScCpr4 has been reported to localise to the endoplasmic reticulum [10,14], to function within the secretory pathway, to possess a putative transmembrane domain, and to be induced by heat shock and tunicamycin. This would suggest that this group has a role within the vesicular protein folding pathways as a folding catalyst or chaperone.
The dendrogram showing the putative evolutionary relationship of the fungal cyclophilins (Figure 1A) shows good agreement with the groups identified by BLAST analysis (Table 16A [See Additional File 1]), and it may also allow us to better understand the evolution of the eleven individual cyclophilins identified in four of the fungi. Interestingly, we would have expected the uni-domain cyclophilins to have evolved first, followed by the larger multi-domain cyclophilins as the cells became more complex. However, the pattern in the dendrogram suggests that the initial divergence separated the ancestor of the TPR-containing Groups I & J from the ancestor of the other groups. The other multi-domain groups are found to evolve amongst the other uni-domain cyclophilins, of which many have evolved from the ER Group G despite their predominant PSORT predicted cytoplasmic and nuclear localizations. Only uni-domain Groups A, E & F are observed to evolve on a separate branch, which they share with four individual cyclophilins (RoCyp5, RoCyp6, RoCyp11, & CnCyp10). This would suggest that the functions of Group I became important early in the evolution of the fungi, with the other multi-domain groups evolving as the cells became more complex, and/or as the uni-domain cyclophilins evolved functions that required a second domain.
It is of note that in many cases the evolution of the members of each group does not follow that of their respective fungi. This suggests that there may be variable factors within these fungi that are driving their evolution. Some fungi share these factors in common despite not sharing a close evolutionary history, whilst others that share a close evolutionary relationship do not. This would result in variable evolutionary pressures on individual members of the group, leading to clustering in the dendrogram away from that of the evolutionary history of the host fungus but which may be representative of shared evolutionary factors.
Despite the observed clustering of most groups identified by BLAST analysis within the dendrogram, two groups do show fragmentation: Group B, which is found in four distinct parts of the dendrogram, and Group G, which is found in two. This would suggest a complex evolutionary history for these two groups, but may also indicate that some members that have been identified by BLAST analysis may not be true orthologues of these groups. In most cases, the parent fungus of these outliers does not have a member in the group to which these outliers appear associated, and these groups also appear related to the group identified by BLAST analysis. This could indicate that there are selective pressures driving these outliers to perform some functions of the group to which they appear associated, slightly increasing their homology to this group and away from their BLAST identified group, but not enough to be distinguished by BLAST analysis, resulting in their observed clustering in the dendrogram.
The initial divergence separated the discrete branch that contains the ubiquitous TPR-containing human cyclophilin 40 orthologues (Group I) and their Saccharomycotina-specific paralogues (Group J), and the human USA-CyP orthologues (Group D), implying a shared evolutionary history, from the other groups (Figure 1A). All are predicted by PSORT to be cytoplasmic, which is supported by experimental studies that also showed an appreciable nuclear component for members of each group [10,11]. It is interesting that the uni-domain members of Group D, whose function appears vastly different from those of Groups I & J, appear to have evolved from a common ancestor with Group J.
The evolution of the cytoplasmic human cyclophilin A orthologues (Group B; Figure Figure1A)1A) and mitochondrial cyclophilins (Groups E & F; Figure Figure1A)1A) appears on a discrete second branch, which on a wider aspect is shared with four individual cyclophilins, three within R. oryzae (RoCyp5, RoCp6, & RoCyp11) and one from C. neoformans (CnCyp10). RoCyp5 & RoCyp6 appear to possibly be part of mitochondrial Group E based on sequence homology (data not shown) although they are predicted by PSORT to be cytoplasmic due to the absence of any signal sequences, and RoCyp11 & CnCyp10 appear to be related based upon sequence homology (data not shown), but not sufficiently to be called orthologues. The presence of Groups B, E, & F on this shared branch would suggest a close evolutionary history, which is supported by the observed dual cytoplasmic and mitochondrial nature of the NcCyp4 gene  and the appreciable sequence homology between Groups B, E, & F (data not shown). NcCyp4 shares a sub-branch with GzCyp5, AnCyp3 & AfCyp4, suggesting that these other cyclophilins may, or may have, exhibited this dual nature, which is supported in the case of GzCyp5 by the presence of a putative N-terminal signal peptide [See Additional File 8]. There is no clear separation of these groups into discrete sub-branches, with YlCyp1, a member of Group B, found within Group E on a sub-branch it shares with YlCyp3, its Group E member, indicating that their evolution may have been more restricted or more recent than the other members leading to their greater sequence homology. Interestingly, AfCyp2, a member of Group B, is found on its own sub-branch that shares a major branch with Groups D, I, and J, suggesting a different evolutionary path for this cyclophilin from the other Group B members. 
EcCyp1, another Group B member, is also found on a separate sub-branch, which it shares with CnCyp6, an individual cyclophilin that appears to be a paralogue of CnCyp3, its Group B member, suggesting that EcCyp1 may have adapted to perform functions for which C. neoformans has evolved a second cyclophilin. Also interesting is that RoCyp1 is found within the Group J members on a separate branch, whilst RoCyp2, an apparent paralogue of RoCyp1, is found on the same branch as Groups B, E, & F. Based on sequence homology, RoCyp1 would be called the R. oryzae orthologue of Group B [See Additional File 8]; however, its PSORT-predicted nuclear localization is in contrast to the cytoplasmic localization predicted for the other members of Group B. Its observed clustering in a different branch of the dendrogram could indicate that it has a divergent function from this group, and that RoCyp2 may be the true Group B member despite exhibiting a lower sequence homology towards its members. RoCyp2 also shares its branch with RoCyp10, a fungal cyclophilin that is unique in that it possesses an N-terminal RRM domain. The presence of an RRM is only observed in one other group (Group K), where it is found in the C-terminal domain. Nuclear Group K is found on a separate branch from cytoplasmic RoCyp10 (Figure 1A), suggesting that these two RRM-containing cyclophilin groups have evolved separately to perform their specific functions, which may be within different compartments of the cell.
The final major branch sees all other groups evolve from a common ancestor that appears to begin with the precursor to ER Group G (Figure 1A), which itself appears to have evolved in three phases that are not constrained by the evolutionary history of the fungi. The initial branch contains all the members from the Pezizomycotina fungi, as well as CnCyp7, SpCyp4, RoCyp8, and, interestingly, YlCyp4, a member of the Saccharomycotina. The other members from the Saccharomycotina appear to evolve in the remaining two branches, with CaCyp3 evolving with DhCyp4 and, interestingly, UmCyp3, a member of the Basidiomycota, whilst ScCpr5, EgCyp3, KlCyp3, and CgCyp3 all evolve on the final branch. This pattern of evolution is hard to explain. It may involve the convergent evolution of some cyclophilins to perform a common function present in some, but not all, of the fungi, or it could indicate the presence of shared evolutionary pressures within subsets of the fungi. An individual cyclophilin within R. oryzae, RoCyp9, is also seen to evolve on the first branch along with its closely related Group G member, RoCyp8, which may indicate that a recent gene duplication has occurred. S. cerevisiae also has an individual cyclophilin, ScCpr2, that is present on the same major branch as its Group G member, but it appears more distantly related (data not shown), which would indicate that if gene duplication did occur, it happened earlier than the R. oryzae duplication.
It is fascinating that the remaining functionally divergent groups that are predominantly present in the cytoplasm and nucleus all appear to have evolved from the ER Group G. The initial divergence from Group G appears to have separated two cytoplasmic groups, uni-domain Group A and the WD40-containing Group N, from the other remaining groups, with the former appearing to evolve on a discrete branch that shares a common ancestor with the latter. Both appear vastly different in putative function, with Group A being small uni-domain predominantly nuclear cyclophilins whilst Group N are large multi-domain putatively cytoplasmic cyclophilins, suggesting that Group N gained the WD40 domain after their divergence from their common ancestor.
The next divergence in the companion branch sees the evolution of the ER Group H on two discrete branches, which are shared with ScCpr8, an individual cyclophilin in the budding yeast that, whilst not found in the ER, has been reported to be a membrane-bound protein. This is not far removed from an ER protein, and it may have come about through the loss of the ability to lose its signal peptide. Interestingly, EcCyp2, a member of Group G, is found within these branches rather than on the branches with its fellow group members. This could indicate a degree of divergent evolution of this cyclophilin to fulfil the needs of both Group G and Group H within its parent fungus. The final divergence led to the evolution of the nuclear RRM-containing Group K and the related Cwc27 orthologues (Group L), both of which appear to be involved in mRNA processing, on a discrete branch from the cytoplasmic uni-domain cyclophilins of Group C and the U-box-containing predominantly nuclear Group M, which themselves are also found to evolve on a shared branch.
Table 16B [See Additional File 1] shows that the only FKBP group to have members in all these fungi, with the exception of E. cuniculi, which has no FKBPs in its genome, is the human FKBP12 orthologue group (Group A). ScFpr1 has been reported to be cytoplasmic, in agreement with the PSORT-predicted localization of the group's members, with the exception of UmFKBP2, which was predicted to be nuclear. SpFKBP12 has been reported to have an important role in the early steps of the fission yeast sexual development pathway but is not essential for normal growth, and mutant cells that lack CaFKBP1 and CnFKBP1 have also been reported to be viable under normal conditions. Finally, ScFpr1 has a reported regulatory role within the homoserine synthetic pathway, where disruption of its function perturbs the aspartokinase feedback inhibition by threonine, resulting in the toxic accumulation of aspartate β-semialdehyde, the substrate of homoserine dehydrogenase. Group B (Table 16B [See Additional File 1]) and the individual FKBP AnFKBP2 appear to be closely related to Group A based upon sequence homology (data not shown), but nothing is presently known about the functions of these FKBPs.
The only other group to show conservation in many of these fungi is Group C (Table 16B [See Additional File 1]). Members are present in all but E. cuniculi, Sz. pombe, E. gossypii, G. zeae, and C. albicans, despite the closely related fungus D. hansenii (Figure 2) having a member. R. oryzae has two closely related FKBPs, RoFKBP2 & RoFKBP3, that appear to be paralogues and show a high degree of homology with the other members of Group C (data not shown). ScFpr2 has been reported to be resident within the ER [10,88], and NcFKBP3 has been reported to be synthesized as a precursor protein with a cleavable signal sequence and a C-terminal HNEL endoplasmic reticulum (ER) retention signal, strongly suggesting that NcFKBP2 is an ER resident protein. This is in agreement with the PSORT-predicted ER localization for all members of this group with the exception of CgFKBP2 and NcFKBP2, which were predicted to be cytoplasmic due to the absence of an N-terminal signal peptide. In addition, NcFKBP3 was reported to have a carboxy-terminal domain with an amino acid composition biased towards charged residues that is predicted to form an amphipathic α-helix, which could indicate that NcFKBP3 functions at the surface of plasma membranes.
The remaining groups identified within the FKBP repertoires appear specific to a subset of the fungi compared here. Group I has been identified solely within the Pezizomycotina fungi (Table 16B [See Additional File 1]); however, nothing is known about the members of this group besides their PSORT-predicted cytoplasmic localization (Table 16B [See Additional File 1]). Group D was identified only within N. crassa and G. zeae, and both members were predicted by PSORT to localize to the ER, but their functions remain unknown. Group F (Table 16B [See Additional File 1]), whose members are all predicted by PSORT to be nuclear, is found only within some of the Saccharomycotina fungi, the exceptions being Y. lipolytica, D. hansenii, and C. albicans. Both S. cerevisiae and C. glabrata have a second FKBP that shares a high degree of homology with their member of Group F (Table 16B [See Additional File 1]; Group G), and ScFpr4 has been reported to share similar properties with ScFpr3. Both ScFpr3 [10,91-93] and ScFpr4 [10,76,94] have been identified as nuclear, and have been reported to suppress defects seen in the absence of the E3 ubiquitin ligase TOM1. ScFpr3 has also been reported to maintain recombination checkpoint activity through the control of protein phosphatase 1, a critical function during meiosis, and the phosphorylation state of ScFpr3 is important for correct growth but has no effect on its localization, suggesting that the phosphorylation state of ScFpr3 is important for its cellular function. Members of Group H are present only within C. albicans and D. hansenii, with the former also appearing to have a paralogue of its member, CaFKBP2, and there is a closely related FKBP within Y. lipolytica, YlFKBP3.
Finally, Group E is found in four of the 16 fungi (Table 16B [See Additional File 1]): R. oryzae, Sz. pombe, and both of the Basidiomycota fungi. Three members are predicted by PSORT to be nuclear (RoFKBP4, SpFKBP39, & UmFKBP3); however, CnFKBP3 is predicted to be in the ER due to the presence of an N-terminal signal peptide, suggesting a possible difference in function despite its high sequence homology towards the other members of this group [See Additional File 8]. SpFKBP39 has been reported to be nuclear and to have a role in chromatin remodelling involved in ribosomal DNA silencing through a potential role as a histone chaperone.
The dendrogram for the evolution of the FKBPs (Figure 1B) shows good agreement with the groups identified by BLAST analysis (Table 16B [See Additional File 1]). It is interesting to note that the dendrogram suggests that the evolution of the uni-domain FKBPs (Groups A, B, C, & D) occurred first, with the FKBPs that possess a second charged domain (Groups E, F, G, & H) evolving later in their history. The exception to this is the evolution of RoFKBP5, the only fungal FKBP with a TPR domain in this comparison, which appears before the evolution of the other multi-domain FKBPs. Orthologues of RoFKBP5 are not present in any of the other fungal genomes that are currently available (data not shown), suggesting that it may be restricted to a specific fungal taxon. Without further orthologues the evolutionary history of this group cannot be fully explored, as RoFKBP5 could have evolved either recently in R. oryzae or at an earlier point in a recent ancestor.
The FKBPs all appear to have evolved from a single ancestor that was most closely related to Group A, the cytoplasmic FKBP12 orthologues. These are seen to evolve first, but in four different phases that appear as the other groups begin to evolve. UmFKBP2 and YlFKBP1 are found on the earliest branch, suggesting that these are the earliest precursors of this group, which does not agree with the evolution of the fungi as they are members of different taxa (Figure 2). The next three divergences separate the evolution of the remaining Group A members from the other FKBPs. First to evolve were the Saccharomycotina Group A members on their own discrete branch, followed by the remaining Group A members SpFKBP12, RoFKBP1, CnFKBP1, AnFKBP1 & AfFKBP1, and finally NcFKBP1 and GzFKBP1. This pair shares a branch with the Group B members in A. nidulans and A. fumigatus, which both appear to be closely related to their Group A members at the sequence level (data not shown), and, interestingly, with RoFKBP5, an individual FKBP that is unique in possessing a TPR domain.
The next divergence separated AnFKBP2, an individual cytoplasmic FKBP that appears closely related to its Group A member, from the other groups. The evolution of ER Group C came next, and its members have all evolved on a discrete branch with two sub-branches. One sub-branch contains the Saccharomycotina members and R. oryzae, which is normally found with the other fungal taxa. The other contains the remaining fungal taxa, together with RoFKBP3, an individual ER FKBP that appears to be a paralogue of its Group C member (data not shown), and the two members of the final ER group, Group D.
Next came the evolution of the Group E members, which occurred in two phases. All but the Sz. pombe member evolve on a discrete branch, with SpFKBP39 found on a different branch that it shares with SpFKBP39a, an apparent paralogue of SpFKBP39 (data not shown) that has been reported to be nuclear. This suggests that Sz. pombe has evolved a second FKBP to perform the functions of Group E. All members of Group E were predicted by PSORT to be nuclear, with the exception of CnFKBP3, which was predicted to be in the ER due to the presence of an N-terminal signal peptide. Their clustering in the dendrogram would indicate that any divergence in function was relatively recent, as their sequences have not yet significantly diverged.
Cytoplasmic Group I, whose members are restricted to the Pezizomycotina fungi, is found to evolve on the companion branch to the groups that include the SpFKBP39 branch. Finally, the nuclear Groups F, G & H are all found on the same branch, indicating a close evolutionary relationship. The evolution of Group F is observed on a discrete branch that shows a close evolutionary relationship with both members of Group G, which is supported by their high sequence homology (data not shown). Group H shares its discrete branch with two individual FKBPs, cytoplasmic YlFKBP3 and nuclear CaFKBP2, both of which appear to be related to Group H at the sequence level, but not to each other (data not shown). It also appears that the divergence of CaFKBP2 and CaFKBP3 is relatively recent, as they appear together on the same discrete branch.
Compared with the cyclophilin and FKBP repertoires, the parvulin repertoires are relatively small. All the compared fungi share a single parvulin in common, with a second parvulin found only within the Pezizomycotina fungi (Table 16C [See Additional File 1]). Members of both groups are predicted by PSORT to be nuclear, with the exception of one member of Group A (GzPar1) and two members of Group B (EgPin1 & AnPin1). This suggests either that the direct targeting of these groups to the nucleus is not essential and/or that their interaction with target factors recruits them into the nucleus, or that we do not fully understand all of the fungal nuclear localization signals (NLS).
The sole parvulin they all share in common is that of the human Pin1 orthology group (Table 16C [See Additional File 1]; Group B). SpPin1 is reported to be a positive regulator of the cell cycle control proteins Wee1 and Cdc25. ScEss1 is reported to be nuclear and involved in transcription through an interaction with the C-terminal domain of RNA polymerase II [10,100-105] downstream of its phosphorylation response to DNA damage, and in cell cycle regulation. Interestingly, ScEss1 and CaEss1 have been reported as essential genes, whereas SpPin1 and CnEss1 have been reported to be non-essential, although the latter has been reported to be required for virulence. Cross-talk between ScEss1 and ScCpr1 [20,55], a member of cyclophilin Group B, has been reported to modulate the activity of the Sin3-Rpd3 complex, with excess histone deacetylation causing mitotic arrest in ScEss1 mutants, and CnEss1-null mutants have been reported to be hypersensitive to CsA, suggesting a cyclophilin-mediated redundancy mechanism. Disruption of ScEss1 can be complemented by DmDodo and by Par13 from the plant Digitalis lanata, which lacks the WW domain conserved in the other proteins, indicating that a conserved functionality may exist between all Pin1 orthologues that is essential in some but not all organisms under normal growth conditions and only requires the rotamase domain. CaEss1 is reported as essential for growth, and a reduction in its dosage or activity blocks morphogenetic switching between the hyphal and pseudohyphal forms under certain conditions. Structural studies have shown that whilst CaEss1 has the same overall structure as its human orthologue, hPin1, its altered α-helical linker results in a rigid juxtaposition of the WW and rotamase domains that gives the two domains of CaEss1 a distinct orientation from that of hPin1, eliminating the hydrophobic pocket between the domains that was identified as the main substrate recognition site.
NcPin1 is unique among the known eukaryotic parvulins in containing a polyglutamine stretch between the N-terminal WW domain and the C-terminal rotamase domain, indicating that it may have a specialized function within this fungus. So far it would appear that some, but not all, of the functions of this group are essential, with varying degrees of redundancy present in some fungi. Structural studies would suggest that whilst these functions require one or both of the domains present in its members, it is unlikely that the domains act in synergy, given that their orientation appears to vary between the different members.
The final parvulin orthology group was identified only in the Pezizomycotina fungi (Table 16C [See Additional File 1]; Group A). Whilst nothing is presently known about this group in fungi, it is interesting to note that its members appear to all be orthologues of the second parvulin group previously only identified within the Metazoa that includes human Par14 [68,110]. This indicates that the evolution of Group A must have occurred prior to the evolutionary split of the unicellular fungi from the ancestral cell that went on to form the Metazoan eukaryotes.
As we would expect, the dendrogram for the parvulins (Figure 1C) shows that the evolution of Group A occurs on a discrete branch from those of Group B, but interestingly it shares a branch with EcPin1, a member of Group B. The dendrogram shows that the members of Group B have not evolved within their specific taxa, but rather have a shared evolution, as inferred from the lack of clustering of the members from each taxon onto the same branches. There are four sub-branches that form Group B, with CaEss1, DhPin1 and RoPin1 appearing the most removed of all the members, the remaining groups forming on two branches, and UmPin1 and CnEss1 evolving from the same branch as Group A and appearing related to EcPin1. Interestingly, the three members for which structural studies have been reported are found on separate branches, indicating that the clustering of the Group B members in the dendrogram may reflect the three different tertiary structural conformations that they represent.
The number of PPIases within these fungal repertoires varies, with E. cuniculi having the smallest repertoire with 3 and R. oryzae the largest with 22, which is close to that of D. melanogaster and C. elegans. The number of PPIases in a given repertoire appears associated with the number of genes in the genome of the fungus, averaging 1.9 PPIases per 1000 genes. Some fungi were found to be more PPIase rich than others, such as Sz. pombe and C. neoformans with around 2.6 PPIases per 1000 genes, whilst others were found to be relatively PPIase poor, such as G. zeae and R. oryzae, with 1.06 and 1.26 PPIases per 1000 genes, respectively. The greatest variation was observed in the number of cyclophilins, where there were typically between 6 and 13 present, whilst the number of FKBPs was fairly constant, with between 3 and 5 present, and on average just a single parvulin was present. The exception was the presence of a second parvulin within the Pezizomycotina fungi, whose parvulin repertoires resemble those of the Metazoa. E. cuniculi was unique in possessing no FKBPs, whilst retaining a member for both of the highly conserved cyclophilin groups (Table 16A [See Additional File 1]; Groups B & G) and the sole highly conserved parvulin group (Table 16C [See Additional File 1]; Group B).
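The density figures quoted above follow from a simple ratio, which can be sketched as below. This is a minimal illustration rather than the analysis actually used for this comparison: the function names are hypothetical, and the only inputs are the R. oryzae values stated in the text (22 PPIases at 1.26 PPIases per 1000 genes), from which an approximate gene count can be back-calculated.

```python
def ppiases_per_1000_genes(n_ppiases: int, n_genes: int) -> float:
    """Return the PPIase density of a repertoire, in PPIases per 1000 genes."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return 1000.0 * n_ppiases / n_genes


def implied_gene_count(n_ppiases: int, density_per_1000: float) -> float:
    """Invert the density to recover the approximate gene count it implies."""
    if density_per_1000 <= 0:
        raise ValueError("density_per_1000 must be positive")
    return 1000.0 * n_ppiases / density_per_1000


# R. oryzae values taken from the text: 22 PPIases, 1.26 PPIases per 1000 genes.
# The gene count printed here is derived from those two numbers only; it is an
# approximation limited by the rounding of the quoted density.
ro_genes = implied_gene_count(22, 1.26)
print(f"Implied R. oryzae gene count: {ro_genes:.0f}")
```

Because the quoted densities are rounded to two decimal places, gene counts recovered in this way are only approximate and should be checked against the genome annotation itself.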
We have shown that only two cyclophilin groups, cytoplasmic Group B and endoplasmic reticular Group G, are present in all of the fungi. Another predominantly cytoplasmic group (Group I) is found in all the fungi with the exception of E. cuniculi, which has no further cyclophilins in its repertoire. Five groups were found to be restricted to C. neoformans, Sz. pombe, R. oryzae, and the Pezizomycotina fungi, and were also present in some of the Saccharomycotina fungi. Nuclear Group L was present in four of the Saccharomycotina fungi: S. cerevisiae, D. hansenii, E. gossypii, and Y. lipolytica. Three groups were present in two Saccharomycotina fungi: cytoplasmic Groups D & N were present in Y. lipolytica and D. hansenii, and nuclear Group K was present in D. hansenii and E. gossypii. Finally, predominantly nuclear Group M was only present in one of the Saccharomycotina fungi, Y. lipolytica. Four groups were found to be absent in the Saccharomycotina fungi. As with the above groups, cytoplasmic Group A is restricted to the Pezizomycotina fungi, C. neoformans, Sz. pombe and R. oryzae, but in this case it was absent in the Saccharomycotina fungi. Similar to Group A, cytoplasmic Group C was restricted to U. maydis, R. oryzae, Sz. pombe, and the Pezizomycotina fungi with the exception of N. crassa. Finally, mitochondrial Group F was restricted to the Pezizomycotina fungi and C. neoformans. Three groups were also found to be restricted to the Saccharomycotina: mitochondrial Group E, endoplasmic reticular Group H, and cytoplasmic Group J.
Eleven individual cyclophilins were identified, six of which are present in R. oryzae; RoCyp2, RoCyp5 and RoCyp6 which appear to be paralogues, RoCyp9, RoCyp10, which is the only known fungal cyclophilin with an N-terminal RRM, and RoCyp11. All were predicted to be cytoplasmic with the exception of RoCyp9, which was predicted to be endoplasmic reticular, and RoCyp11, which was predicted to be nuclear. Of the remaining five, two were present in both S. cerevisiae (ScCpr2 & ScCpr8) and C. neoformans (CnCyp6 & CnCyp10), and one was present in Y. lipolytica (YlCyp5), with the remainder of the fungi having no unique cyclophilins in their repertoire. Of these last five, ScCpr2 has been found to function in the ER, ScCpr8 has been reported to associate with membranes, YlCyp5 is predicted to be nuclear, and CnCyp6 is predicted to be cytoplasmic.
We have found that the cyclophilins appear to have a wide range of functions within the cell, but there are several conserved themes within these functions. We have the expected roles in protein folding and chaperoning both within the ER (Groups G & H) and the mitochondrion (Groups E & F). For the former, only one group was found to be present within all fungi (Group G), whereas Group H was found solely within all but one of the Saccharomycotina fungi, Y. lipolytica, suggesting that these fungi require an increased amount of PPIase-mediated protein folding within their ER. Given their small genome size (Table 15), this is presumably an adaptation to environmental challenges predominantly affecting this taxon. The mitochondrial groups are separated between two taxa: Group E is present solely within the Saccharomycotina fungi whilst Group F is found solely within the Pezizomycotina fungi and C. neoformans, suggesting a divergence in their functions within the mitochondria. Group F also has two different types of members: the dedicated mitochondrial cyclophilins in C. neoformans, and presumably A. nidulans and A. fumigatus despite their PSORT-predicted cytoplasmic localization, and those in N. crassa and G. zeae that have two isoforms that target them to the cytoplasm (Group B) or mitochondrion (Group F), which may shed light on the evolutionary history of these groups. Interestingly, no mitochondrial cyclophilins were identified in Sz. pombe, E. cuniculi and U. maydis, which, given that they are spread throughout the fungal evolutionary tree, is most easily explained by gene deletion. R. oryzae also had no identified mitochondrial cyclophilin, but it does have two cyclophilins that show a high degree of sequence homology to mitochondrial Group E but which are predicted by PSORT to be cytoplasmic due to the absence of any signal sequences.
We have also found three groups involved in cellular signalling pathways (Groups A, I, & J). Group A appears to function within the SNW/SKIP pathways involved in cell proliferation and differentiation within all the fungi except E. cuniculi, U. maydis, and the Saccharomycotina fungi. Groups I & J appear to function as part of the Hsp90 complex in hormone signalling and cell cycle pathways, Group I in all fungi except E. cuniculi and Group J only within the Saccharomycotina fungi, suggesting that this taxon has additional divergent pathways that require an extra cyclophilin-Hsp90 chaperone. Overall, these cyclophilin groups appear to function within cellular signalling pathways involved in regulating the cellular growth response to external signals.
Finally, we have three groups that appear to have a role within pre-mRNA processing (Groups D, K, & L), that are largely absent from the Saccharomycotina fungi and E. cuniculi (Table 16A [See Additional File 1]). The presence of an RRM in members of Group K suggests that they function in the recruitment of the RNA related to cell morphogenesis, cortical organization and nuclear reorganization into the pre-mRNA processing complexes, whereas members of Group L have been found present within the Cdc5 complex and along with Group D, are believed to function within pre-mRNA splicing.
There are also some groups with unique or less defined functions within these fungal repertoires. Group B is present in all the fungi and its members appear to function in a wide range of pathways, from the control of protein activity, transcription, translation and the cell cycle, to the vesicular transport of proteins and the control of fungal virulence. Group M is found in all the fungi except E. cuniculi and the Saccharomycotina fungi, with the exception of Y. lipolytica, and its members are believed to function in the proteasome degradation pathway, potentially recruiting target proteins for degradation, based upon the presence of a U-box domain in their sequence. Group N is present in the same fungi as Group M, but also in D. hansenii, and its members appear to function within protein complexes, given the presence of a WD40 motif in their sequence. However, both groups require further investigation to elucidate their functions within the cell. Also, there is one group (Group C) that lacks any secondary domains from which to infer a putative function, and at present nothing is known about its members besides its presence in all fungi except the Saccharomycotina fungi, E. cuniculi, Sz. pombe, and N. crassa.
Only a single FKBP group was present in all fungi with the exception of E. cuniculi, which possesses no FKBPs, that being cytoplasmic Group A, which shows orthology to human FKBP12 (Table 16B [See Additional File 1]; Group A). ER Group C was found to be in all remaining fungi with the exception of Sz. pombe, E. gossypii, G. zeae, and C. albicans. Three groups are restricted to the Saccharomycotina fungi. Nuclear Group F is present in all members except for Y. lipolytica, D. hansenii, and C. albicans. Nuclear Group G is only found in S. cerevisiae and C. glabrata and its members exhibit high sequence homology to members of Group F, and nuclear Group H is found only in C. albicans and D. hansenii. The remaining four groups are not found in the Saccharomycotina fungi, of which three are restricted to the Pezizomycotina fungi: cytoplasmic Group I; ER Group D, which is only present in N. crassa and G. zeae but in neither of the Aspergillus fungi; and Group B, which is found only within the Aspergillus fungi and appears closely related to Group A. Finally, nuclear Group E was restricted to R. oryzae, Sz. pombe, and both of the Basidiomycota fungi, U. maydis and C. neoformans.
There were six individual FKBPs identified within five of the fungi (Table 16B [See Additional File 1]). Two were present within R. oryzae: RoFKBP2 appears to be a paralogue of RoFKBP3, a member of ER Group C, and it also shares a PSORT-predicted ER localization, whilst RoFKBP5 is predicted by PSORT to be cytoplasmic and is the only known fungal FKBP to contain a TPR domain. The remaining four fungi each have just a single individual FKBP. A. nidulans has an extra cytoplasmic FKBP, AnFKBP2, that appears to be closely associated with Groups A & B, and Sz. pombe contains nuclear SpFKBP39a, which appears to be a paralogue of its Group E member (SpFKBP39). Both C. albicans (CaFKBP2) and Y. lipolytica (YlFKBP3) have an FKBP that shows appreciable sequence homology to nuclear Group H, with CaFKBP2 sharing this nuclear localization, whereas YlFKBP3 is predicted by PSORT to be cytoplasmic.
We do not find the same clustering of functions with the FKBPs as was found with the cyclophilins. There are two ER groups (Groups C & D), but whilst it appears that they will be involved in the vesicular protein folding pathways, this has not been confirmed experimentally, and they are present only within a subset of the fungi (Table 16B [See Additional File 1]), suggesting that their function is not universally conserved. Group A was found to be in all fungi that possess FKBPs, and its members appear to have a wide range of roles within the sexual development of the fungi and also in other pathways, where it appears likely that their role is in maintaining the correct folding of the protein components of these pathways. Group B appears closely associated with Group A based on sequence homology (data not shown), although no functions are yet known for its members. Nuclear Groups F & G are both found only within a subset of the Saccharomycotina fungi (Table 16B [See Additional File 1]) and they appear functionally linked. Members of Group F appear to have a role within recombination checkpoint maintenance. A member of nuclear Group E has been reported to function within chromatin remodelling associated with DNA silencing, and its presence in only Sz. pombe, R. oryzae and the Basidiomycota fungi implies a specific rather than generalized function. At present, nothing is known about cytoplasmic Group I, which is found solely within the Pezizomycotina fungi, or nuclear Group H, which is found only within C. albicans and D. hansenii.
Both the cyclophilins and FKBPs are found similarly distributed throughout the cell's organelles, but the cyclophilins are the only family found within the mitochondria. However, the apparent evolutionary pattern of the cyclophilins and FKBPs was found to be different. Whilst the single-domain FKBP groups appear to have evolved prior to the larger multi-domain FKBP groups (Figure 1B), suggesting that the latter became important as cellular complexity increased, the same is not true for the cyclophilins (Figure 1A). The TPR-containing multi-domain cyclophilin groups appear to have evolved very early on, suggesting that they have a very important and primal role within the cell, whereas the other multi-domain cyclophilin groups have evolved throughout the dendrogram at varying stages in the evolution of the fungal cyclophilin groups. This would suggest that the factors that drove the evolution of the multi-domain cyclophilin groups are, at least in part, different from those of the multi-domain FKBPs. Interestingly, in some instances we find single-domain cyclophilin groups evolving from the branches of multi-domain groups. This would suggest that as the multi-domain cyclophilins evolved along with the cell, in some instances they were able to adapt to perform some functions without the secondary domain, and in such instances were able to spawn a new group to perform these functions as they continued to evolve new ones. Alternatively, there may have been an evolutionary event in one of their targets, such as the duplication of a target protein that then evolved to require a subtle adaptation in the cyclophilin that could not be accommodated. In such instances, a gene duplication of the cyclophilin and the incorporation of this adaptation into one copy would allow the protein and its host fungus to continue their evolution. If this change also ended the requirement for the cyclophilin's secondary domain, that domain would cease to be selected for, which would allow for its loss at some point during the evolutionary history of this new group.
This observation would suggest that the evolution of the FKBP family may have been in response to the increasing complexity of the proteome of these fungi as they have evolved, which after a certain point required more than just the PPIase catalytic domain to perform their functions in some cases. However, the cyclophilins appear to have evolved both in response to the increasing complexity of the proteome of these fungi and to the increase in the complexity of the underlying biological pathways that allow the fungi to survive, implied by the sporadic nature of the evolution of the more complex cyclophilins.
The final and smallest of the PPIase families, the parvulins, had just a single representative in most of the fungi, that of the nuclear human Pin1 orthology group (Table 16C [See Additional File 1]; Group B), whose members appear to have a wide range of functions within transcription and cell cycle regulation. Interestingly, a second nuclear member was present in the Pezizomycotina fungi that showed orthology to the human parvulin Par14 (Group A). Orthologues of Group A have previously been identified only in the Metazoa, and were thought to have evolved within the Metazoa; however, their presence in the Pezizomycotina fungi brings a new perspective on their evolution. The position of the Pezizomycotina taxa in the evolutionary tree and the absence of any Group A orthologue in the other fungi do, however, create a mystery. There is no common ancestor shared between the Pezizomycotina taxa and the Metazoa that the former do not also share with other fungal taxa. So, if this group evolved prior to the divergence of the Pezizomycotina and Metazoa from their common ancestor, why did the other fungi not retain this parvulin? Or did these parvulins evolve separately within the Pezizomycotina and Metazoa by convergent evolution? This group is not present in any of the other fungal genomes currently available in the NCBI database (data not shown), making the further analysis of this group within the genomes of additional fungi, as they become available, imperative to understanding their evolution and role within the cell.
Taking into account both the evolutionary paths of the three different PPIase families, their conservation within the different fungi, and their known functions within the fungi, we would hypothesize that the parvulins and cyclophilins have evolved to perform conserved functions within the fungi. The parvulins appear to have primal functions, presumably retained from the bacterial lineages within which this PPIase family is the predominant group [6,111], to which they are now largely restrained, whilst the cyclophilins appear to have gained functions at varying stages during fungal evolution. However, the FKBPs appear to have evolved to fill the ever-changing niches within these constantly evolving fungi, presumably in response primarily to the evolutionary pressure associated with the increase in the number of proteins that required chaperoning as the fungi increased in complexity.
The duplication and differentiation of genes, giving rise to multigene families, and genomic rearrangements, which allowed for the acquisition of new domains and regulatory sequences, has been a characteristic feature in the evolution of eukaryotic genomes and it is likely to have been a major factor in the evolution of the fungi [112-116]. There is some evidence that gene duplication has occurred during the evolution of the fungal PPIase repertoires. Some of the PPIases in this comparison are apparent paralogues of another PPIase in their respective repertoire (Table 16 [See Additional File 1]): R. oryzae (RoCyp1-RoCyp2, RoCyp5-RoCyp6, & RoFKBP2-RoFKBP3), C. neoformans (CnCyp3-CnCyp6), Sz. pombe (SpFKBP39-SpFKBP39a), C. albicans (CaFKBP2-CaFKBP3), A. fumigatus (AfFKBP1-AfFKBP2), and in all the Saccharomycota fungi (Group I-Group J). Interestingly, A. nidulans has a group of three FKBPs that all appear to be paralogues (AnFKBP1-AnFKBP2-AnFKBP3), suggesting that their progenitor gene may have undergone multiple duplications during the evolution of A. nidulans. A. fumigatus has a pair of FKBPs that are apparent orthologues of two of these genes (Table 16 [See Additional File 1]; AnFKBP1-AfFKBP1 & AnFKBP3-AfFKBP2), indicating that the evolution of the third paralogue (AnFKBP2) occurred after the divergence of A. nidulans from A. fumigatus.
There is strong evidence that the genome of the common ancestor of S. cerevisiae, C. glabrata, and the Saccharomyces sensu stricto yeasts underwent whole genome duplication (WGD) after its divergence from the ancestors of the other fungal lineages [117-121]. Whilst the vast majority of the duplicated genome was lost, some duplicate gene copies were retained and subsequently underwent differentiation to form distinct members of the evolving multigene families. This gene retention varied from cell to cell, and this is likely to have been a major contributory factor in the divergent evolution of the different yeasts that evolved from this ancestral cell. The WGD event appears to have had a role in the evolution of the cyclophilin and FKBP repertoires of S. cerevisiae and C. glabrata. There are three pairs of PPIase genes present within different duplicated blocks identified in the genome of S. cerevisiae [117,122]. Two pairs are cyclophilins (ScCpr2-ScCpr5, block 14; ScCpr4-ScCpr8, block 11) and one pair are FKBPs (ScFpr3-ScFpr4, block 44). Neither of the cyclophilin duplications is present within the genome of C. glabrata or any other fungus in this comparison (Table 16A [See Additional File 1]); however, there are orthologues of both ScFpr3 and ScFpr4 in the C. glabrata repertoire (Table 16B [See Additional File 1]). This would suggest that the functions of these duplicate FKBP genes are important in both S. cerevisiae and C. glabrata, but that the functions of the duplicate gene in each of the S. cerevisiae cyclophilin pairs are not required in C. glabrata. Of the fungi in this comparison, K. lactis shares the most recent common ancestor with S. cerevisiae and C. glabrata (Figure 2), and its repertoire contains only one of each pair of duplicate PPIases found in S. cerevisiae and C. glabrata, supporting the origin of these duplicates during the post-WGD evolution of the latter fungi.
Gene deletion is also likely to have played a role in the evolution of discrete fungi, as exemplified by the differences between the PPIase repertoires of C. glabrata and S. cerevisiae after their divergent post-WGD evolution (Table 16 [See Additional File 1]). The differences between the fungal PPIase repertoires in Table 16 [See Additional File 1] include both additional and absent PPIases, between the different lineages as well as between individual fungi. In many cases, a PPIase orthology group is present in many of the fungi, but it is absent in a particular lineage or, in some cases, in individual members of a particular lineage. Whilst the presence of some individual PPIases and the evolution of some groups, such as cyclophilin Groups H & J, and FKBP Groups F & I, can be most easily explained by gene duplication, in many others gene deletion appears more likely, such as cyclophilin Groups A, C, D, K, L, M, & N, and FKBP Groups C & E.
Domain duplication and shuffling by recombination are probably the most important forces driving protein evolution, with the majority of multi-domain proteins likely to have evolved by the stepwise insertion of single domains. The dendrograms for the fungal cyclophilins (Figure 1A) and FKBPs (Figure 1B) show evidence that this has probably occurred during the evolution of the fungal PPIase repertoires. There are instances in the cyclophilin dendrogram (Figure 1A) where a multi-domain cyclophilin group appears to have evolved from a uni-domain group, such as: Group N – Group A, Groups K, L & M – Group C, and Groups I & J – Group D. In the FKBP dendrogram (Figure 1B), RoFKBP5 is found on a branch with the members of Group B, a group closely related to Group A with which it shares the larger branch. It appears likely that the Group B members have evolved by gene duplication given their high sequence homology to their respective Group A member (data not shown). It is therefore possible that RoFKBP5 evolved from a gene duplication event of its Group A member and the subsequent acquisition of its TPR domain, which resulted in its divergence from the other Group B members. Interestingly, it has been reported that in most cases domain order is conserved because recombination between the domains usually occurs only once during the course of evolution, and in many protein family combinations the preservation of the N- to C-terminal orientation of the combined domain pair is almost absolute, with only a few examples of domain pairs appearing in both orientations. In the fungal cyclophilin repertoires identified here, we have one group with a C-terminal RNA recognition motif (RRM; Group K), and an individual cyclophilin with an N-terminal RRM (RoCyp10). It therefore appears that multiple recombination events have occurred between the RRM and cyclophilin domains, suggesting that this domain combination may have an important role to play in the complex eukaryotic cell.
The evolution of the fungal PPIase repertoires therefore appears to have been complex, involving both gene and whole genome duplication, gene deletion, and multiple domain-acquisition events. Based on the pattern of PPIase conservation and their appearance/disappearance (Table 16 [See Additional File 1]) in relation to the evolutionary history of these fungi (Figure 2), we would hypothesize that the ancestral fungus from which these fungi evolved would have had nine cyclophilins (members of Groups A, B, D, G, I, K, L, M, & N), three FKBPs (members of Groups A, C, & E), and just one parvulin (member of Group B). However, the presence of a member of the second parvulin group (Group A) cannot be discounted due to its presence within both the Pezizomycotina and the higher eukaryotes. This repertoire size is comparable to that of the current fungal repertoires (Table 15), with much of the variation in size likely due to gene duplication and deletion, as discussed above. The future availability of the complete sequences of additional fungal genomes within which to study the size and makeup of their PPIase repertoires will greatly improve our ability to hypothesize as to the repertoire makeup of the ancestral fungal cell.
The level of orthology between the fungal PPIase repertoires and that of H. sapiens has been found to vary. Of the 12 human cyclophilins (out of 17) with orthologues present in these fungi (Table 16A [See Additional File 1]), S. cerevisiae only has representatives for four out of its repertoire of nine, which is one less than E. gossypii (out of 8), and only marginally better than C. albicans, C. glabrata and K. lactis, which all have three (out of 6). Both Y. lipolytica and D. hansenii have seven (out of 10) in common with H. sapiens, the highest of the Saccharomycotina. Both lack representatives of hCGI-124, hPPIL3, hCyp33, and hCypF, although they do have an unrelated mitochondrial cyclophilin; the former also lacks an hCyp57 orthologue and the latter an hCyp60 orthologue. Sz. pombe, A. nidulans, A. fumigatus, U. maydis, N. crassa, G. zeae, and E. cuniculi all share their complete cyclophilin repertoires in common with H. sapiens; however, they do fall short of having orthologues of all the human cyclophilins. The two Pezizomycotina fungi, G. zeae and N. crassa, are interesting in that they both have a cyclophilin with both cytoplasmic and mitochondrial isoforms (NcCyp4 & GzCyp5) that show orthology with hCypA and hCypF respectively, suggesting a shared evolutionary origin of the cytoplasmic and mitochondrial cyclophilins as shown in the dendrogram (Figure 1A). The other two Pezizomycotina fungi, A. nidulans and A. fumigatus, both have 11 human orthologues out of the possible 12, with both lacking an hCyp33 orthologue, and both possess separate cyclophilins that show orthology to hCypA and hCypF. This number of human orthologues is shared with R. oryzae, which also has 11 out of the possible 12 present in its repertoire of 16. R. oryzae lacks a mitochondrial hCypF orthologue, and it appears not to possess a mitochondrial cyclophilin, although it does have two that show a high degree of sequence homology with the Saccharomycotina mitochondrial cyclophilins (RoCyp5 & RoCyp6; data not shown).
H. sapiens has only a single FKBP (out of 13) in common with these fungi: human FKBP12, orthologues of which (Group A; Table 16B [See Additional File 1]) are present in all the fungi with the exception of E. cuniculi, which has no FKBPs. There are, however, two FKBPs within the fruit fly, Drosophila melanogaster, that are also present within some of the fungi (DmCG14715 & DmFKBP39; Table 16B [See Additional File 1]). These are both present within U. maydis, C. neoformans, and R. oryzae, and they are both absent within G. zeae, E. gossypii, and C. albicans. DmCG14715 is present in all of the other fungi with the exception of Sz. pombe, which is the only other fungus to have an orthologue of DmFKBP39. As with the FKBPs, only one human parvulin is present in all of the fungi, the orthologues of human Pin1 (Group B; Table 16C [See Additional File 1]). The second parvulin group, which shows orthology to human Par14 (Group A; Table 16C [See Additional File 1]), is present only within the Pezizomycotina fungi.
Based on this orthology of the fungal PPIase repertoires to that of H. sapiens, we have shown that whilst the fungi are good models for the study of the cyclophilins and parvulins, they are a poor model for the study of the FKBPs. This supports the hypothesis that the cyclophilins and parvulins have evolved to perform conserved functions within the underlying biological processes of the cell, whilst the FKBPs have a role largely within protein folding that varies between organisms. The repertoires of A. nidulans, A. fumigatus, and C. neoformans exhibit the closest resemblance to the human repertoire, missing only an hCyp33 orthologue. A similar level of resemblance is shared by R. oryzae, which instead lacks an hCypF orthologue, and by G. zeae; however, the presence of a single gene encoding both the hCypA and hCypF orthologues within the latter fungus reduces its resemblance.
Saccharomyces cerevisiae has proven to be a very powerful model organism that has contributed, and continues to contribute, greatly to our deciphering of the molecular functioning of our cells [reviewed in 127–129]. The study of its PPIase repertoire has proven invaluable in understanding the elusive functions of these proteins and it continues to provide new insights into their diverse cellular roles [reviewed in 130]. However, as we have shown here and previously, its PPIase repertoire poorly represents those of the higher eukaryotes in comparison with the repertoires of A. nidulans, A. fumigatus, and C. neoformans. Whilst research on the S. cerevisiae PPIase repertoire will continue to aid us in the search to understand the functions of the PPIases in the eukaryotic cell, research within fungi whose repertoires better represent those of H. sapiens and the other Metazoa may offer greater insights into the biological processes within which these PPIases function in the more complex Metazoan cells. Such processes cannot be investigated in S. cerevisiae, whose repertoire lacks orthologues of these PPIases. In this respect, C. neoformans may represent a better model fungus, as its PPIase repertoire exhibits the highest orthology to the human repertoire of the fungi reported here whilst retaining a genome size close to that of S. cerevisiae (Table 15), making it more genetically tractable than A. nidulans, A. fumigatus and R. oryzae, whose genomes are up to three times the size of those of S. cerevisiae and C. neoformans. It is already an established model organism with the functional genomics and proteomics techniques necessary to fully dissect PPIase function. Future experimental investigations are required to elucidate whether or not this will prove to be the case.
Reported here is the identification of 16 fungal PPIase repertoires representing multiple fungal taxa. This comparison has shown that the fungal PPIase repertoires do share some orthology; however, this orthology is reduced between the different taxa and is also found to be variable within these taxa. This analysis has also shown that the evolutionary patterns of the cyclophilins and FKBPs appear to be different, with the multi-domain cyclophilins appearing throughout the evolution of the single-domain cyclophilins, whereas the multi-domain FKBPs have evolved more recently, after the evolution of the single-domain FKBPs. Based on the data presented here, we would hypothesize that: (i) the evolution of the fungal PPIases is driven, at least in part, by the size of the proteome, (ii) evolutionary pressures differ both between the different PPIase families and the different fungi, and (iii) whilst the cyclophilins and parvulins have evolved to perform conserved functions, the FKBPs have evolved to perform variable roles that appear predominantly within protein folding. However, further experimental investigations are required to confirm these hypotheses. Interestingly, orthologues of the human parvulin hPar14 and of hCyp33 have been identified within the genomes of the Pezizomycotina fungi and R. oryzae, respectively. These PPIases were thought to be restricted to the Metazoa. Their presence in these fungi suggests an older evolutionary history than was originally thought. A TPR-containing FKBP was also identified within R. oryzae, making it the only known fungal FKBP to possess one. Finally, whilst the PPIase repertoire of S. cerevisiae has been the subject of much research, it is a poor representative of the repertoire of H. sapiens and, despite its highly successful use as a model organism for the study of PPIases over the past two decades, very little is still known about the functions of some of its PPIases. Whilst research into its repertoire will continue to give us insights into the function of the PPIases within the eukaryotic cell, the repertoire of C. neoformans may offer a better model system with which to further our knowledge of the role of the PPIases within the human cell, given the greater orthology it exhibits towards the human repertoire. This research has also highlighted the need to fully investigate how representative the repertoire of a chosen model organism is of that of the primary organism when investigating the function of a family of proteins, so as to improve the portability of the results.
Fungi for this study were selected primarily by the presence of a complete annotated genome sequence within the database maintained by the National Centre for Biotechnology Information (NCBI). The complete annotated genome sequences of Sz. pombe, E. cuniculi, S. cerevisiae [134,135], D. hansenii, E. gossypii, K. lactis, C. albicans, C. glabrata, Y. lipolytica, A. nidulans, A. fumigatus, N. crassa, and C. neoformans were selected for this reason. G. zeae [142,143] and U. maydis were included as their genomes were 90% and 98% complete, respectively, and they are present in an annotated state within the NCBI database. The unannotated genome of R. oryzae was included as it is the only example of the Zygomycota in the NCBI database. In all three cases, there are no plans to complete the sequencing and annotation of their genomes, making their current sequence repository as complete a picture as we are likely to achieve of their repertoires.
The repertoires of Sz. pombe and S. cerevisiae were as previously described. The identification of putative PPIases in the remaining fungi was performed using both BLASTP (protein vs. protein) and TBLASTN (protein against DNA sequence) searches on all genomes except for that of R. oryzae, where only TBLASTN searches were possible. The TBLASTN step is important for detecting additional homologues because genes may remain unannotated in the sequence databases. The members of the three PPIase families were identified using the protein sequences of human cyclophilin A (hCypA; UniProt accession # P05092), human FKBP12 (hFKBP12; P20071) and the human parvulins Pin1 (hPin1; Q13526) and Par14 (hPar14; Q9Y237) as probes in BLASTP and TBLASTN searches of their sequences. Proteins were selected based upon the level of homology that their PPIase catalytic domain exhibited towards that of the probe sequence, with regard to sequence homology and/or the presence of characteristic motifs. As the catalytic domains of the different PPIase families have been found to exhibit good conservation between related organisms [4,5,147], a conservative expected (E) value cut-off of 1 × 10^-10 and/or an alignment sequence identity of at least 30% was used to identify homologues of the three PPIase families. Two fungal cyclophilin groups that possess a divergent PPIase domain have been previously identified. To confirm that all members of these groups were identified (Table 16 [See Additional File 1]; Groups K & L), their Sz. pombe and S. cerevisiae members were used as probes in BLASTP and TBLASTN searches of the fungal proteomes and genomes, respectively.
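The homologue cut-off described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in this study: the `Hit` record, the function names and the example values are hypothetical, and "and/or" is read here as "a hit is kept if it passes either threshold", with the E-value boundary treated as inclusive.

```python
from dataclasses import dataclass

# Thresholds taken from the text; all names below are hypothetical.
E_VALUE_CUTOFF = 1e-10      # expected (E) value cut-off
MIN_IDENTITY_PCT = 30.0     # minimum alignment sequence identity (%)


@dataclass(frozen=True)
class Hit:
    """One BLASTP/TBLASTN hit of a probe against a fungal sequence."""
    subject_id: str
    e_value: float
    identity_pct: float


def is_candidate_homologue(hit: Hit) -> bool:
    """Assumption: 'and/or' means passing either threshold is sufficient."""
    return hit.e_value <= E_VALUE_CUTOFF or hit.identity_pct >= MIN_IDENTITY_PCT


def select_candidates(hits):
    """Return the subject IDs of hits that pass the cut-off, in input order."""
    return [h.subject_id for h in hits if is_candidate_homologue(h)]


# Made-up hits, for illustration only.
example_hits = [
    Hit("protA", 3e-45, 62.0),  # passes both thresholds
    Hit("protB", 2e-6, 34.5),   # passes on identity only
    Hit("protC", 5e-3, 21.0),   # passes neither
]
candidates = select_candidates(example_hits)
print(candidates)
```

Candidates selected this way would still need the manual inspection of catalytic-domain motifs described in the text.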
In some cases, it was apparent that the sequence present in the NCBI database was incorrect or incomplete. In these instances, sequences were investigated and where possible corrected prior to further analysis. Sequence corrections obtained by this manual curation were submitted to the Universal Protein Resource (UniProt) consortium for verification. Once verified, the appropriate UniProt reports were amended to reflect this revised annotation. All protein sequences are available in the UniProt database and their accession numbers are given in Tables 1-14; those that were manually re-annotated are marked.
The identification of putative domains within the identified PPIases was performed using the NCBI Conserved Domain Database (CDD) [150,151]. The predicted localisation of the PPIases, and the sequence motifs that support it, were identified using PSORT [152,153] located on the National Institute for Basic Biology (NIBB) server. The theoretical molecular masses of the predicted proteins were calculated using the calculation tool on the ExPASy server. Pairwise percentage sequence identity and similarity were calculated using the Matrix Global Alignment Tool (MatGAT) version 2.02 with a BLOSUM50 scoring matrix.
Multiple sequence alignments (MSA) were produced using default settings in version 1.83 of the ClustalX program. This program performs a pairwise alignment of the sequences prior to the construction of a dendrogram, which describes the approximate groupings of the sequences by similarity, with the final alignment carried out using this dendrogram as a guide. The dendrograms presented in Figure 1 were constructed from a global MSA of each family of PPIases by the Neighbour Joining (NJ) method with a Poisson correction for distance estimation and 500 bootstrap replicates. The dendrograms were visualised from the files generated by the ClustalX alignment using MEGA version 3.1. The scales of the different dendrograms are not cross-comparable.
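As a brief illustration of the distance estimate named above, the Poisson correction converts the proportion of differing amino acid sites, p, between two aligned sequences into an estimated number of substitutions per site, d = -ln(1 - p). The Python sketch below shows this calculation only, not the NJ tree building or bootstrapping. The function names and the toy sequences are hypothetical, and ignoring any column that contains a gap in either sequence is an assumption of this sketch rather than a documented setting of the analysis.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical): 5 ungapped columns, 2 differences.
p = p_distance("MKV-LA", "MRVGLS")
d = poisson_distance("MKV-LA", "MRVGLS")
print(round(p, 3), round(d, 4))
```

The correction matters most for divergent pairs: as p grows, d rises faster than p, reflecting multiple substitutions at the same site.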
PPIases were considered to be orthologues if they fulfilled three criteria. Firstly, they should be of approximately the same size and possess the same domain architecture. Secondly, in BLAST searches they should identify each other ahead of all other PPIases within their respective genomes. This is because they should, in theory, share a more recent common ancestor with each other than they do with the other PPIases. Sequence variation, resulting from the distinct divergent evolution of each protein, should therefore be less between two orthologues than with other PPIases. Thirdly, they should have the same intracellular location and function. This latter criterion is, however, reliant upon prior research, which is not available for all PPIases. In these cases, as long as the first two criteria were met, the proteins were deemed to be putative orthologues.
Four methods were employed to identify the orthology between the repertoires using the above criteria. Firstly, the sequences of the identified PPIases were formatted by family as a database file using the BLAST tools package obtained from the ftp-site of NCBI [131,146] and an "all against all" BLASTP method was used to identify homology between proteins. Orthologues were defined in the following manner: (i) they must be reciprocal best hits with an expectation (E)-value less than 1 × 10^-10, (ii) they must share at least 40% similarity in amino acid sequence, (iii) there must be <20% difference in protein length, and (iv) they must have a bit score above 60, which in almost all cases indicates a biologically relevant relationship when motifs are conserved [161,162]. However, in cases where a protein failed to meet one of these criteria but where membership of a particular orthology group was well supported by the remaining BLAST criteria as well as by the remaining identification methods detailed below, the protein was included as an orthologue of that particular group. Secondly, each orthology group identified by BLAST analysis was subjected to sequence alignment using the ClustalX program version 1.83 to check for conserved domain architecture and conserved sequence motifs [See Additional File 8]. Thirdly, the sequences for all the member proteins of each of the three different PPIase families (cyclophilins, FKBPs & parvulins) from all of the compared fungi were subjected to global sequence comparison by family using the ClustalX program version 1.83 for the purpose of creating a dendrogram. This analysis creates a putative model for how the individual sub-groups of each PPIase family may have diverged from one another based on relationships between their individual sequences, which allows us to infer a putative model for their evolution in the fungi compared here. As the members of each orthology group should share a more recent common ancestor with each other than with the other PPIases, they should cluster together in the MSA and therefore within the dendrogram, ideally as an individual branch with a distinct common ancestor. Fourthly, literature analysis looking for prior publications on the individual PPIases was performed, which in some cases has allowed putative function(s) to be assigned to orthology groups.
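The four BLAST-based criteria listed above (reciprocal best hit, E-value, similarity, length difference and bit score) lend themselves to a short worked sketch. The Python below is a minimal illustration for two repertoires, not the pipeline used here. The `Hit` record, the function names and the example proteins are hypothetical. Two choices are assumptions of this sketch: the best hit is taken as the one with the highest bit score, and the length difference is measured relative to the longer protein. The manual override described in the text, for proteins that fail a single criterion, is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLASTP hit between PPIases of two repertoires (hypothetical record)."""
    query: str
    subject: str
    e_value: float
    bit_score: float
    similarity_pct: float


def best_hit_per_query(hits):
    """Keep the highest bit-score hit for each query (assumed 'best hit')."""
    best = {}
    for h in hits:
        current = best.get(h.query)
        if current is None or h.bit_score > current.bit_score:
            best[h.query] = h
    return best


def passes_thresholds(hit, lengths):
    """Criteria (i)-(iv) from the text, apart from reciprocity."""
    len_q, len_s = lengths[hit.query], lengths[hit.subject]
    # Assumption: difference is expressed relative to the longer protein.
    length_diff = abs(len_q - len_s) / max(len_q, len_s)
    return (
        hit.e_value < 1e-10
        and hit.similarity_pct >= 40.0
        and length_diff < 0.20
        and hit.bit_score > 60.0
    )


def reciprocal_orthologues(a_vs_b, b_vs_a, lengths):
    """Return sorted (a, b) pairs that are reciprocal best hits and pass all thresholds."""
    forward = best_hit_per_query(a_vs_b)
    reverse = best_hit_per_query(b_vs_a)
    pairs = []
    for query, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != query:
            continue  # not reciprocal
        if passes_thresholds(hit, lengths) and passes_thresholds(back, lengths):
            pairs.append((query, hit.subject))
    return sorted(pairs)


# Made-up proteins and scores, for illustration only.
lengths = {"A1": 162, "A2": 370, "B1": 165, "B2": 220}
a_vs_b = [
    Hit("A1", "B1", 1e-80, 290.0, 85.0),
    Hit("A1", "B2", 1e-20, 95.0, 48.0),
    Hit("A2", "B2", 1e-30, 120.0, 55.0),
]
b_vs_a = [
    Hit("B1", "A1", 2e-80, 288.0, 85.0),
    Hit("B2", "A1", 1e-20, 95.0, 48.0),
    Hit("B2", "A2", 1e-30, 120.0, 55.0),
]
pairs = reciprocal_orthologues(a_vs_b, b_vs_a, lengths)
print(pairs)
```

In this toy data, A2 and B2 are reciprocal best hits but differ in length by more than 20%, so only the A1-B1 pair is reported; under the approach described in the text, such a pair could still be admitted if the alignment, dendrogram and literature evidence supported it.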
Peptidyl-prolyl cis/trans isomerase (PPIase); cyclophilin (Cyp); FK506 binding-protein (FKBP); parvulin (Par, Pin or Ess); cyclosporin A (CsA); R. oryzae (Ro); Sz. pombe (Sp); E. cuniculi (Ec); S. cerevisiae (Sc); D. hansenii (Dh); E. gossypii (Eg); K. lactis (Kl); C. albicans (Ca); C. glabrata (Cg); Y. lipolytica (Yl); A. nidulans (An); A. fumigatus (Af); N. crassa (Nc); G. zeae (Gz); U. maydis (Um); C. neoformans (Cn); H. sapiens (h); endoplasmic reticulum (ER); nuclear localisation sequence (NLS); RNA Recognition Motif (RRM); tetratrico peptide repeat (TPR).
The author(s) declare that they have no competing interests.
Table 16. The orthology between the three PPIase families that make up the fungal repertoires. Table depicting the orthology between the repertoires of these sixteen fungi (Tables 1-14 & Ref. 68) as determined by BLASTP analysis.
Pairwise Expected (E)-values and bit scores of the cyclophilins. Pairwise expected (E)-values and bit scores for all members of each of the individual cyclophilin groups, and a global comparison of all the cyclophilins identified here.
Pairwise percentage sequence identity and similarity of the cyclophilins. Pairwise percentage sequence identity and similarity for all members of each of the individual cyclophilin groups, and a global comparison of all the cyclophilins identified in this report.
Pairwise expected (E)-values and bit scores of the FKBPs. Pairwise expected (E)-values and bit scores for all members of each of the individual FKBP groups, and a global comparison of all the FKBPs identified in this report.
Pairwise percentage sequence identity and similarity of the FKBPs. Pairwise percentage sequence identity and similarity for all members of each of the individual FKBP groups and a global comparison of all the FKBPs identified in this report.
Pairwise expected (E)-values and bit scores of the parvulins. Pairwise expected (E)-values and bit scores for all members of each of the individual parvulin groups, and a global comparison of all the parvulins identified in this report.
Pairwise percentage sequence identity and similarity of the parvulins. Pairwise percentage sequence identity and similarity for all members of each of the individual parvulin groups, and a global comparison of all the parvulins identified in this report.
Multiple sequence alignments of the different cyclophilin, FKBP and parvulin groups. Multiple sequence alignments of the different cyclophilin, FKBP and parvulin groups that were identified by BLAST analysis of their sequences. All alignments were constructed with ClustalX.
The author would like to thank Dr. John Kay (Brighton & Sussex Medical School, Brighton, UK) for his helpful discussions during the identification of these repertoires. The author would also like to thank Dr. Ivo Pedruzzi (Swiss Institute of Bioinformatics; Geneva, Switzerland) for his invaluable assistance in the verification and correction of the UniProt protein reports. This investigation was partly conducted in a facility constructed with support from Research Facilities Improvement Program Grant Number C06 (RR10600-01, CA62528-01, RR14514-01) from the National Center for Research Resources, National Institutes of Health.