This study reports the release of draft genome sequences of two isolates of Lichtheimia corymbifera and two isolates of L. ramosa. Phylogenetic analyses indicate that the two L. corymbifera strains (CDC-B2541 and 008-049) are closely related to the previously sequenced L. corymbifera isolate (FSU 9682) while our two L. ramosa strains CDC-B5399 and CDC-B5792 cluster apart from them. These genome sequences will further the understanding of intraspecies and interspecies genetic variation within the Mucoraceae family of pathogenic fungi.
Lichtheimia (formerly Absidia) is a genus of saprotrophic zygomycetous fungi known to cause mucormycosis in human hosts. Although Lichtheimia infections are less prevalent than those caused by Aspergillus or Candida, reports of Lichtheimia corymbifera infections among immunocompromised patients have increased (Schwartze and Jacobsen 2014). Lichtheimia species are the second and third most frequently isolated organisms from patients with mucormycosis in Europe and worldwide, respectively (Roden et al. 2005; Alvarez et al. 2009; Skiada et al. 2011; Lanternier et al. 2012). Various studies have examined the underlying reasons behind the differences in clinical representation among Lichtheimia strains. For example, Schwartze et al. (2012) evaluated the virulence potential of 46 Lichtheimia isolates, representing all five species, in a chicken embryo model of infection. Lichtheimia ramosa has also been shown to be a primary infective agent in a burn victim, although treatment with amphotericin B was effective (Kaur et al. 2014). While Lichtheimia species tend to be morphologically and genetically distinct, they often share very similar antifungal drug susceptibilities. Recent studies continue to implicate Lichtheimia species in cutaneous (Cateau et al. 2013; Poirier et al. 2013) and other infections (Bellanger et al. 2010; Irtan et al. 2013; Kutlu et al. 2014). Two Lichtheimia genomes have recently been published (Linde et al. 2014; Schwartze et al. 2014). As more is learned about the physiological and molecular mechanisms of pathogenesis in Lichtheimia, a deeper understanding of the underlying genetics and genomics of this group of opportunistic pathogens becomes increasingly important.
In this study, we sequenced two L. corymbifera isolates (CDC-B2541 and 008-049) and two L. ramosa isolates (CDC-B5399 and CDC-B5792). Lichtheimia corymbifera CDC-B2541 was isolated as a plate contaminant in 1977 in Wisconsin, USA, while isolate 008-049 was isolated from a human in a 2008 Deferasirox-AmBisome Therapy for Mucormycosis (DEFEAT) study (Spellberg et al. 2012). Lichtheimia ramosa CDC-B5792 was isolated from human sputum in 1997 in New Mexico, USA, whereas isolate CDC-B5399 was isolated from a gluteal abscess in India in 1993. DNA was extracted from fungi grown on Sabouraud's Dextrose agar using the GeneRite Kit (Carlsbad, CA) or the OmniPrep Kit (GBiosciences). The genome sequence of each isolate was generated at the Institute for Genome Sciences (IGS) Genomics Resource Center (Baltimore, MD) using a combination of paired-end libraries (average insert size of 459 bp) and mate-pair (3 kb) libraries on the Illumina HiSeq 2000. We generated an average of 33.4 million sequence reads from each of the paired-end libraries and 29.5 million sequence reads from each of the mate-pair libraries (Table 1). The draft genome data were assembled using the MaSuRCA v.1.9.2 genome assembler (Zimin et al. 2013). The relevant statistics from the genome assemblies and annotations are summarized in Table 1. The resulting L. corymbifera genome assemblies contained an average of 1401 contigs per genome, and the L. ramosa genome assemblies contained an average of 3831 contigs. The average estimated coverage was 91.6× for L. corymbifera and 44.4× for L. ramosa.
Structural and functional annotations were performed with the IGS Eukaryotic Annotation Pipeline protocol 1.0 at the IGS Informatics Resource Center (Baltimore, MD). We generated 439 million RNA-seq reads from isolate 008-049 grown in the presence of an epithelial cell line (A549 adenocarcinomic human alveolar basal cells), human umbilical vein endothelial cells or in mammalian tissue culture media alone. RNA-seq reads were pooled and RNA-seq assemblies, both de novo and genome-guided against 008-049 genomic scaffolds, were generated with Trinity (Grabherr et al. 2011). Both types of assemblies were mapped to the 008-049 genome using PASA (Haas et al. 2003), and only the de novo assemblies were mapped to the other Lichtheimia genomes with the Genomic Mapping and Alignment Program (GMAP) (Wu and Watanabe 2005). Genomic repeat regions were annotated and masked using RepeatModeler (Smit and Hubley 2008–2010) and RepeatMasker (Smit et al. 1996–2010). Protein-coding genes were predicted ab initio with CEGMA (Parra et al. 2007), GeneMark-ES (Ter-Hovhannisyan et al. 2008), Augustus (Stanke et al. 2006), SNAP (Korf 2004), GlimmerHMM (Majoros, Pertea and Salzberg 2004) and GeneID (Blanco et al. 2007). Augustus, SNAP and GlimmerHMM used CEGMA predictions for parameter training, and GeneID used a parameter file generated by CEGMA. Raw RNA-seq reads were used to augment Augustus training for L. corymbifera 008-049. Spliced alignments of SwissProt proteins against each genome were generated with AAT (Huang et al. 1997) using cutoffs of 80% similarity and a maximum intron length of 1500 bp. To generate a consensus gene model set, all intrinsic and extrinsic predictions were combined with EVidenceModeler (Haas et al. 2008) using the following evidence weights: CEGMA 4, Augustus 4, GeneMark-ES 2, GlimmerHMM 2, SNAP 2, GeneID 2 and AAT alignments 2. Assembled RNA-seq transcript alignments were weighted 10 when aligned to the genome of the same isolate (e.g. L. corymbifera 008-049 transcripts aligned with PASA to the L. corymbifera 008-049 genome), but weighted 1 when aligned to the genome of another isolate (e.g. L. corymbifera 008-049 transcripts aligned with GMAP to L. corymbifera CDC-B2541). Non-coding RNAs were predicted with tRNAscan-SE and RNAmmer. Predicted proteins were compared to UniProt with BLAST and to TIGRFAMs/PFAMs with HMM searches to generate functional assignments, including Gene Ontology terms and Enzyme Commission numbers. A summary of our structural annotation of each of the four genomes can be found in Table 1. Genome completeness, as assessed by detecting complete conserved eukaryotic genes with CEGMA (Parra et al. 2007), was estimated to range from 96% to 98% across the four genomes (Table 1).
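The evidence weighting described above can be sketched as a small script that builds a three-column weights table (evidence class, source label, weight) of the kind EVidenceModeler reads. This is only an illustration: the numeric weights come from the text, but the source labels in the second column and the helper names are assumptions, and the labels would have to match the source column of whatever GFF3 evidence files were actually supplied.

```python
# Sketch of the evidence weights stated in the text, laid out as a
# tab-separated table (evidence class, source label, weight).
# The source labels are assumptions for illustration only.
WEIGHTS = [
    ("ABINITIO_PREDICTION", "CEGMA", 4),
    ("ABINITIO_PREDICTION", "Augustus", 4),
    ("ABINITIO_PREDICTION", "GeneMark-ES", 2),
    ("ABINITIO_PREDICTION", "GlimmerHMM", 2),
    ("ABINITIO_PREDICTION", "SNAP", 2),
    ("ABINITIO_PREDICTION", "GeneID", 2),
    ("PROTEIN", "AAT", 2),
]

# RNA-seq data were generated from isolate 008-049 only.
TRANSCRIPT_STRAIN = "008-049"


def transcript_weight(genome_strain, transcript_strain=TRANSCRIPT_STRAIN):
    """Return 10 for transcripts aligned to their own genome, otherwise 1."""
    return 10 if genome_strain == transcript_strain else 1


def weights_table(genome_strain):
    """Build the weights table for one target genome as a string."""
    rows = list(WEIGHTS)
    # PASA was used for self alignments, GMAP for the other genomes.
    aligner = "PASA" if genome_strain == TRANSCRIPT_STRAIN else "GMAP"
    rows.append(("TRANSCRIPT", aligner, transcript_weight(genome_strain)))
    return "\n".join("\t".join(str(field) for field in row) for row in rows)


if __name__ == "__main__":
    for strain in ("008-049", "CDC-B2541"):
        print(f"# weights for {strain}")
        print(weights_table(strain))
```

Keeping the transcript weight as a function of the target genome makes the asymmetry explicit: only the 008-049 genome receives the high-confidence weight of 10, because it is the only isolate with RNA-seq data of its own.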
We probed the phylogenetic relationship between our isolates and two Mucorales isolates whose genomes have been sequenced and annotated (Rhizopus delemar 99-880 and L. corymbifera FSU 9682). To accomplish this, ortholog pairs were detected among Mucorales genomes using InParanoid 4.1 (Remm, Storm and Sonnhammer 2001) with Umbelopsis isabellina (CDC-B7317) as an outgroup, using the two-pass BLAST strategy, bootstrapping and all other algorithm parameters set to default. MultiParanoid (Alexeyenko et al. 2006) was run on InParanoid output files to detect ortholog groups common to all isolates. Protein sequences from each ortholog group were aligned using Muscle v.3.7 (Edgar 2004) and gapped regions were removed with Gblocks_0.91b with default settings (Talavera and Castresana 2007). Conserved block alignments were concatenated, and phylogenetic analysis was performed with Phyml 3.0 (Guindon et al. 2010) with 100 bootstrap replicates, a BioNJ starting tree, nearest neighbor interchange (NNI) tree topology search and the LG amino acid substitution model. The resulting tree was visualized in FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/, 9 July 2014, date last accessed). For the genome sequence of L. corymbifera FSU 9682, the previously published annotation was used (Schwartze et al. 2014). All Lichtheimia isolates, including L. ramosa, were very closely related based on a phylogenetic tree generated using over 2000 highly conserved orthologous genes (Fig. 1). For perspective, R. delemar 99-880, a better-characterized Mucorales genome, was used as an outgroup for the tree. Our phylogenetic analysis indicates that the two L. corymbifera isolates (CDC-B2541 and 008-049) are closely related to the previously sequenced L. corymbifera isolate (FSU 9682), while the two L. ramosa isolates (CDC-B5399 and CDC-B5792) form a separate clade.
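The concatenation step in this workflow, joining the trimmed per-ortholog-group alignments into a single supermatrix for tree building, can be sketched as follows. This is a minimal illustration and not the scripts used in the study: the function names, the FASTA input format and the rule of skipping any group that lacks a taxon are all assumptions made for the example.

```python
# Sketch: concatenate trimmed per-ortholog alignments into one supermatrix.
# Function names and input conventions are hypothetical.


def parse_fasta(text):
    """Parse FASTA text into {sequence_name: aligned_sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def concatenate(alignments, taxa):
    """Join alignments column-wise, keeping only groups present in all taxa."""
    pieces = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        # Skip ortholog groups that are missing any of the taxa.
        if not set(taxa).issubset(alignment):
            continue
        lengths = {len(alignment[taxon]) for taxon in taxa}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        for taxon in taxa:
            pieces[taxon].append(alignment[taxon])
    return {taxon: "".join(parts) for taxon, parts in pieces.items()}


if __name__ == "__main__":
    taxa = ["CDC-B2541", "CDC-B5399"]
    group1 = parse_fasta(">CDC-B2541\nMKAL\n>CDC-B5399\nMKVL\n")
    group2 = parse_fasta(">CDC-B2541\nGG\n")  # missing one taxon, skipped
    for taxon, sequence in concatenate([group1, group2], taxa).items():
        print(f">{taxon}\n{sequence}")
```

Restricting the supermatrix to groups found in every taxon mirrors the use of ortholog groups common to all isolates, so each row of the concatenated alignment has the same length and no taxon is represented by missing data.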
The genome sequence data from these Lichtheimia species provide a valuable resource for comparative genome analyses to determine interspecies and intraspecies genomic variation, which will, in turn, further our understanding of the genetic elements that govern virulence, tropism and antifungal resistance in this genus.
Nucleotide sequence accession numbers: these Whole Genome Shotgun projects have been deposited at DDBJ/EMBL/GenBank under the accessions JNEU00000000, JNEP00000000, JNEE00000000 and JNDO00000000, corresponding to strains CDC-B2541, CDC-B5792, 008-049 and CDC-B5399, respectively. The versions described in this paper are the first versions: JNEU00000000.1, JNEP00000000.1, JNEE00000000.1 and JNDO00000000.1.
We thank Anastasia Litvintseva for project guidance and for critical review of the manuscript. We thank Kerstin Voigt for making the data for L. corymbifera strain FSU 9682 available before publication.
This project has been funded in part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under contract number HHSN272200900009C. VMB and ASI were supported by U19AI110820. ASI was also supported by R01 AI063503. The findings and conclusions of this article are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.
Conflict of interest. None declared.