Fungus-growing ants engage in complex symbiotic relationships with their fungal crop, specialized fungal pathogens, and bacteria that provide chemical defenses. In an effort to understand the evolutionary origins of this multilateral system, we investigated bacteria isolated from fungi. One bacterial strain (Streptomyces sp. CLI2509) from the bracket fungus Hymenochaete rubiginosa produced an unusual peptide, tryptorubin A, which contains heteroaromatic links between side chains that give it a rigid polycyclic globular structure. The three-dimensional structure was determined by NMR and MS, including a 13C-13C COSY of isotopically enriched material, together with degradation, derivatization, and computer modeling. Whole genome sequencing identified a likely pair of biosynthetic genes responsible for tryptorubin A’s linear hexapeptide backbone. The genome also revealed the close relationship between CLI2509 and Streptomyces sp. SPB78, which was previously implicated in an insect–bacterium symbiosis.
Natural product studies largely focus on their therapeutic potential, but studies motivated by their ecological roles, which place them in their evolutionary context, have become increasingly rewarding for chemistry and biology.1−4 For the last several years, our laboratories have collaborated to understand the molecular basis underlying the complex relations between fungus-growing ants, their fungal crop, their crop’s specialized fungal pathogen, and bacterial symbionts that provide chemical defenses.4−7 Examples such as dentigerumycin,4 9-methoxyrebeccamycin,5 and selvamicin6 illustrate some of the molecular and biological diversity discovered in this system.
The complex multilateral symbiotic system seen today undoubtedly had its evolutionary origin in a simpler system. Two types of fungi, a basidiomycete that would eventually emerge as the fungal crop and an ascomycete that would eventually emerge as the specialized pathogen, and their antagonistic interactions are a critical component of the host microbiome. A plausible origin begins with a fungus associating with a bacterium (actinomycete) to provide chemical defenses, followed by the recruitment of the fungus–bacterium pair by an ant. The ascomycete pathogen followed, and the ant–fungus–bacterium mutualism continued to evolve and diversify for ~50 million years.
To investigate the plausibility of a fungus–bacterium origin, actinobacterial isolates (e.g., Streptomyces spp.) were obtained from a collection of fungi, and extracts of these strains were analyzed by LCMS. One strain, Streptomyces sp. CLI2509 (isolated from the bracket fungus Hymenochaete rubiginosa), produced a compound with m/z 897.3928 [M + H]+, a mass that did not match any compound in the natural products database AntiBase. CLI2509 was selected for large-scale fermentation, and the molecule with m/z 897 was purified and structurally characterized. HRMS data provided a molecular formula of C49H52N8O9 for the compound, which we have named tryptorubin A (1; Figure 1). Initial analysis of 1D and 2D NMR data indicated that the compound was a hexapeptide with characteristic 1H and 13C NMR shifts for alanine, isoleucine, and tyrosine, but the remaining three amino acids could not be easily identified due to unusual NMR shifts: C-29/H-29 (δC 131.8, δH 5.76) and C-12/H-12 (δC 92.3, δH 6.78). Additionally, there appeared to be aromatic bridging between these remaining three amino acids, as indicated by an HMBC correlation (Figure 2) between H-40 (δH 7.17) and C-11 (δC 60.9). There were also several heteroatoms present, as indicated by atypical downfield 13C NMR shifts. Tryptorubin A (1) has ten exchangeable protons, as shown by dissolution in CD3OD:D2O (1:1) and analysis by HRMS.
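The agreement between the reported formula and the observed ion can be checked with a short calculation. The sketch below is illustrative only and is not the authors' workflow: it assumes standard monoisotopic atomic masses and a singly protonated ion, and the function names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes; standard reference
# values rounded to 8 decimals (an assumption of this sketch, not from the text).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}
PROTON = 1.00727647  # mass of H+ (a hydrogen atom minus one electron)


def mz_protonated(formula):
    """Theoretical m/z of the singly charged [M + H]+ ion for an elemental formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_error(observed, theoretical):
    """Mass error in parts per million (positive when observed is heavier)."""
    return (observed - theoretical) / theoretical * 1e6


# Formula and observed m/z as reported in the text.
tryptorubin_a = {"C": 49, "H": 52, "N": 8, "O": 9}
observed_mz = 897.3928

theoretical_mz = mz_protonated(tryptorubin_a)
error = ppm_error(observed_mz, theoretical_mz)
print(f"theoretical [M + H]+ = {theoretical_mz:.4f}, error = {error:+.2f} ppm")
```

A sub-ppm error is consistent with C49H52N8O9, although an accurate mass alone does not exclude other formulas; the NMR and labeling data described in the text are still needed.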
Because of the unusual NMR shifts and aromatic bridging between amino acids, several steps were needed to characterize the three remaining amino acids. First, small-scale (25 mL) cultures of strain CLI2509 were grown with 15N-labeled amino acids, one amino acid per culture, and the extract was analyzed by LCMS to determine which amino acids were incorporated and in what quantity. Alanine, isoleucine, and tyrosine were used as positive controls, and tryptophan, phenylglycine, and phenylalanine were also tested. LCMS analysis of these labeled cultures determined that tryptorubin A (1) contained alanine, isoleucine, two tryptophans, and two tyrosines (see Supporting Information). To clarify the location of the tryptophans, CLI2509 was grown with deuterium-labeled tryptophan (indole-d5), and 1H NMR and HRMS analysis of the purified product determined that nine deuterium atoms had replaced hydrogen atoms (H-12, H-14–H-17, H-38, H-40, H-42, H-43; see Supporting Information). Finally, to confirm the incorporation of two tyrosines, fermentation of strain CLI2509 (500 mL) in ISP2 containing 300 mg of labeled L-tyrosine (ring-3,5-d2) led to the production of tryptorubin A (1) containing three deuterium atoms, as evidenced by LCMS and NMR, confirming the branching at C-30. These three experiments helped determine the location of the two tryptophans, and an HMBC correlation from H-40 (δH 7.17) to C-11 (δC 60.9) suggested that the tryptophans were connected to each other by a carbon–carbon bond between C-11 and C-41. While this linkage is unusual in natural (or, for that matter, unnatural) molecules, a similar structural motif is present in natural products such as naseseazine A,8 asperazine,9 and pestalazine A.10
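The deuterium counts follow from simple bookkeeping: each labeled precursor carries a fixed number of ring deuterons, and each cross-link formed at a labeled ring carbon displaces one of them. The sketch below restates that arithmetic. The helper name is hypothetical, and it assumes complete incorporation with no back-exchange.

```python
def retained_deuterium(copies, labels_per_copy, labeled_positions_substituted):
    """Deuterium atoms expected in the product.

    copies: residues of this amino acid in the peptide
    labels_per_copy: deuterons on each labeled precursor molecule
    labeled_positions_substituted: labeled ring carbons that lose their
        deuteron because a new bond is formed there
    """
    return copies * labels_per_copy - labeled_positions_substituted


# Two tryptophans from indole-d5; the C-11 to C-41 bond replaces the label at C-41.
trp_d = retained_deuterium(copies=2, labels_per_copy=5, labeled_positions_substituted=1)

# Two tyrosines from ring-3,5-d2; the bond from C-30 to the indole nitrogen
# replaces the label at C-30.
tyr_d = retained_deuterium(copies=2, labels_per_copy=2, labeled_positions_substituted=1)

print(f"Trp-derived deuterium: {trp_d}, Tyr-derived deuterium: {tyr_d}")
```

Under these assumptions the expected counts are nine and three, matching the observations reported above. An unmodified pair of residues would instead retain ten and four.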
After identifying the location of the two tryptophans, the location of the last amino acid with aromatic bridging, tyrosine, was determined by isotopic labeling and NMR analysis. C-30 and C-31 were not bonded to hydrogen atoms, but the connectivity of these atoms was unclear due to the ambiguous NMR shifts in the aromatic ring. Consequently, isotopic labeling (15N and 13C) of tryptorubin A (1) was used to rapidly complete the structure. Fermentation of strain CLI2509 in ISP2 medium containing 15N-labeled NH4Cl produced 15N-labeled tryptorubin A (1). 15N HMBC analysis of 15N-labeled tryptorubin A (1) provided evidence of the linkage between tryptophan and the tyrosine-derived amino acid, with HMBC correlations from H-29 (δH 5.76), H-38 (δH 6.89), H-40 (δH 7.17), and H-43 (δH 7.06) to the indole nitrogen, suggesting that the indole nitrogen was connected to C-30. The 13C NMR shift at C-31 (δC 149.6) suggested that it was connected to an oxygen. In order to confirm the carbon connectivity in these two amino acids, fermentation of strain CLI2509 in ISP2 medium containing 13C-labeled glucose produced 13C-labeled tryptorubin A (1). A 13C-13C COSY of 13C-labeled tryptorubin A (1) provided the carbon–carbon connectivity for most of the structure, including the tryptophan and tyrosine.11
To confirm the location of the hydroxyl groups, tryptorubin A (1) was acetylated (see Supporting Information). Acetylation of tryptorubin A (1) resulted in a product with three acetyl groups, but NMR analysis did not allow the unequivocal location of the acetyl groups to be determined. In an alternative approach, tryptorubin A (1) was methylated, and the major product contained the O-methyl at C-1, as evidenced by 1D and 2D NMR analysis. The hydroxyl at C-31 was confirmed by TOCSY correlations from 31-OH (δH 7.28) to H-33. The hydroxyl at C-7 was confirmed by a comparison of 13C and 1H NMR shifts to literature values. After the amino acid identities and aromatic bridging were determined, analysis of ROESY, COSY, and HMBC NMR data allowed for determination of the sequence of tryptorubin A (1) (Figure 2).
A combination of advanced Marfey’s method, genome sequencing, NOE correlations, and molecular modeling assigned the absolute stereostructure of tryptorubin A (1). Acid hydrolysis of tryptorubin A (1), subsequent derivatization with L-FDLA and DL-FDLA, and LCMS12,13 analysis provided the configuration of L-Ala, L-Ile, and L-Tyr (see Supporting Information). Whole genome sequencing of strain CLI2509 and identification of the gene cluster responsible for producing tryptorubin A (1) indicated that all six amino acids had an L-configuration. The configurations at C-11 and C-12 remained unassigned, leaving four possible stereoisomers. The unusual branching between aromatic regions of the remaining amino acids prevented determination of the configuration of these two stereocenters using Marfey’s method. Instead, several key ROESY correlations (Figure 2B) in tryptorubin A (1) helped determine the stereochemistry: between H-9 (δH 4.52) and H-42 (δH 7.44), as well as between H-12 (δH 6.78) and H-40 (δH 7.17), suggesting that H-9, H-12, and H-42 were on the same side of the tryptophan. The four possible stereoisomers were modeled using Schrödinger and Gaussian09 software, and only one of the four stereoisomers (11S, 12R) fit the experimental ROESY correlations.14 Consequently, the absolute stereochemistry was determined to be (2S, 9S, 11S, 12R, 20S, 21S, 26S, 35S).
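The four candidates arise because two stereocenters, each R or S, remained open after the other six were fixed. The sketch below simply enumerates them. It takes the fixed descriptors from the final assignment quoted above, uses hypothetical variable names, and does not reproduce the modeling step that selected among the candidates.

```python
from itertools import product

# Centers taken as fixed, using the descriptors in the final assignment above.
FIXED = {2: "S", 9: "S", 20: "S", 21: "S", 26: "S", 35: "S"}
# Centers that Marfey's method could not assign.
UNASSIGNED = (11, 12)


def descriptor(centers):
    """Format a {position: configuration} mapping as '(2S, 9S, ...)'."""
    return "(" + ", ".join(f"{pos}{cfg}" for pos, cfg in sorted(centers.items())) + ")"


# Every R/S combination at the unassigned centers gives one candidate stereoisomer.
candidates = [
    descriptor({**FIXED, **dict(zip(UNASSIGNED, combo))})
    for combo in product("RS", repeat=len(UNASSIGNED))
]

for candidate in candidates:
    print(candidate)
```

Each printed descriptor corresponds to one structure that would be modeled and compared against the ROESY correlations; only the (11S, 12R) candidate was reported to fit.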
Tryptorubin A (1) has several unusual features, some previously reported and some unreported. The linkage between two tryptophans, C-11 to C-41 in tryptorubin A (1), is rare, but similar links are found in naseseazine A, asperazine, and pestalazine A. However, in tryptorubin A (1) the linkage is para to the indole nitrogen; in naseseazine A and pestalazine A, the linkage is meta to the nitrogen; and in asperazine, the linkage is ortho to the nitrogen. Another unusual structural feature in tryptorubin A (1) is the linkage from tyrosine’s aromatic ring (C-30) to the indole nitrogen of tryptophan, a feature not reported in any other natural product. An alkaloid produced by Penicillium citreo-viride15 has a related linkage (see Supporting Information) that is likely produced in a different fashion. A linkage from the indole nitrogen in tryptophan to other amino acids is not unprecedented; pestalazine B, aspergilazine A,16 and kapakahine B,17 for example, contain a bond from the indole nitrogen to another tryptophan.
Tryptorubin A’s unusual structural features and the lack of annotated biosynthetic pathways for any structural relatives prompted an investigation of its genetic basis. We began our genetic analysis of tryptorubin A (1) by sequencing and analyzing the Streptomyces sp. CLI2509 genome. Assembly of PacBio sequence reads resulted in two linear replicons: a 7.09 Mb chromosome and a 147 kb plasmid (Genbank accession no. CP021118 and CP021119). The chromosome encodes 18 antiSMASH-predicted biosynthetic gene clusters (BGCs),18 many of which we noticed were also predicted in two other bacteria: Streptomyces sp. strain Tü6071 (Genbank accession no. CM001165.1), isolated from the soil along the Cape Coast in Ghana,19 and Streptomyces sp. SPB78,20,21 an antifungal producer isolated from the Southern Pine Beetle Dendroctonus frontalis.
A comparison of the three bacterial chromosomes using in silico genome-to-genome distance calculations22 revealed that CLI2509 was remarkably similar to both Tü6071 and SPB78, having predicted DNA–DNA hybridization values of 91.3% and 77–82.6%, respectively. In light of the genome sequence similarity, and our previous work on Streptomyces sp. SPB78 while studying a fungus-growing beetle system, we investigated this bacterium’s ability to produce tryptorubin A (1). The cultivation of SPB78 on ISP2 confirmed its ability to produce tryptorubin A (1) as determined by HRMS.
Among the predicted BGCs encoded on the CLI2509 chromosome, none can be confidently predicted to encode the biosynthesis of tryptorubin A (1). Only a single locus encodes two NRPSs that fulfill the criteria for hexapeptide formation (Figure 3), and we speculate that it is involved in tryptorubin A production. The rest of this BGC, however, presents several curiosities, including the sequences for enzymes that are clearly not involved in the biosynthesis of tryptorubin A (1). For example, tryptorubin A (1) is built from only proteinogenic amino acids, yet the biosynthetic locus encodes a set of enzymes similar to those required for dihydroxyphenylglycine (DHPG) biosynthesis. In addition, two SAM-dependent enzymes are present: the first is a predicted N-methyltransferase NRPS module and the second is a stand-alone methyltransferase. Finally, among the remaining genes in the neighborhood, one encodes a flavodoxin, a common radical SAM redox partner, but its partner is unknown.
The missing genes, encoded elsewhere on the chromosome, cannot be identified using a bioinformatics approach. These genes encode the enzymes responsible for tryptorubin A’s unusual side chain cyclization. Two C–N bonds and one C–C bond must be installed to reach the final product. There are presumably at least two enzymes that accomplish these reactions: one that forms a C–N bond between the indole Nε of Trp1 and Cε2 (C-30) of Tyr1, and a second that catalyzes C–C bond formation between the indole rings at Trp1 Cζ3 (C-41) and Trp2 Cγ (C-11). The latter is likely concomitant with C–N bond and ring formation between Cδ1 (C-12) and the Trp2 amide. These two putative enzymes most likely carry out one-electron oxidations that would allow all three bonds to be formed in a largely precedented fashion,23 but the genes responsible for these reactions are not obvious and merit further investigation.
In summary, a study on the origin of a complex symbiosis led to a streptomycete, isolated from the fungus Hymenochaete rubiginosa, that produced a new peptide, tryptorubin A (1). Its fascinating structural features provide a chemical rationale for further studies on this and related fungus-hosted bacteria, and have generated an interesting biosynthetic puzzle: the identity of the enzymes responsible for the remarkable cyclization reactions. The results also reinforce the need for continued molecular analyses (even in well-studied taxa like Streptomyces), as our bioinformatic predictive abilities continue to lag behind the diversity of chemistry that is genetically encoded by bacteria. Finally, the similarity of CLI2509 to SPB78 from the Southern Pine Beetle system20 hints at the recruitment of another fungus–bacterium pair as the origin of another fungus-growing insect system. The identification of similar genomes and the same rare molecule in two different fungus-growing systems could, of course, be a coincidence, but it provides motivation for further studies.
This work was supported by funding from the NIH R01 GM086258 (J.C.) and NIH U19 AI09673 (J.C., C.R.C.). A.C.R. was supported by a Harvard Medical School–Merck Fellowship and a CIHR Fellowship. We thank the ICCB-Longwood Analytical Chemistry Core Facility at Harvard Medical School for the facilities to acquire LCMS data, as well as the East Quad NMR Facility at Harvard Medical School for the facilities to acquire NMR data. We also thank Arvin Moser and Maria Varlan (ACD/Labs) for assistance with NMR software. We thank the Duke GCB Genome Sequencing Shared Resource, which provided PacBio sequencing service, the Harvard Medical School Information Technology Department for access to the Orchestra High Performance Computing Cluster, and the Harvard University Information Technology Department for access to the Odyssey High Performance Computing Cluster. We thank Emily Mevers for measurement of optical rotation.
The authors declare no competing financial interest.