Legionella longbeachae causes most cases of legionellosis in Australia and may be underreported worldwide due to the lack of L. longbeachae-specific diagnostic tests. L. longbeachae displays distinctive differences in intracellular trafficking, caspase 1 activation, and infection in mouse models compared to Legionella pneumophila, yet these two species have indistinguishable clinical presentations in humans. Unlike other legionellae, which inhabit freshwater systems, L. longbeachae is found predominantly in moist soil. In this study, we sequenced and annotated the genome of an L. longbeachae clinical isolate from Oregon, isolate D-4968, and compared it to the previously published genomes of L. pneumophila. The results revealed that the D-4968 genome is larger than the L. pneumophila genome and has a gene order that is different from that of the L. pneumophila genome. Genes encoding structural components of type II, type IV Lvh, and type IV Icm/Dot secretion systems are conserved. In contrast, only 42/140 homologs of genes encoding L. pneumophila Icm/Dot substrates have been found in the D-4968 genome. L. longbeachae encodes numerous proteins with eukaryotic motifs and eukaryote-like proteins unique to this species, including 16 ankyrin repeat-containing proteins and a novel U-box protein. We predict that these proteins are secreted by the L. longbeachae Icm/Dot secretion system. In contrast to the L. pneumophila genome, the L. longbeachae D-4968 genome does not contain flagellar biosynthesis genes, yet it contains a chemotaxis operon. The lack of a flagellum explains the failure of L. longbeachae to activate caspase 1 and trigger pyroptosis in murine macrophages. These unique features of L. longbeachae may reflect adaptation of this species to life in soil.
Legionella longbeachae was first reported in 1981 after it was isolated from patients with pneumonia in the United States (11, 59). Although L. longbeachae is not a common respiratory pathogen in either North America or Europe, where Legionella pneumophila infections are predominant, it accounts for more than 50% of legionellosis cases in Australia and is also prevalent in New Zealand and Thailand (10, 12, 60, 66, 68, 77, 93, 94). Legionnaires' disease induced by L. longbeachae infection is clinically indistinguishable from the disease caused by L. pneumophila (65). However, L. longbeachae infections have been associated with gardening and the use of potting soil, whereas the disease caused by other species is linked to freshwater sources (4, 65). L. longbeachae can survive for up to 9 months in moist potting soil at room temperature, in contrast to other Legionella species, which inhabit natural and manmade freshwater systems worldwide (34, 83, 84).
In addition to the differences in habitat, L. longbeachae differs from L. pneumophila in its virulence in murine models of infection. L. longbeachae replicates in the lungs of A/J, C57BL/6, and BALB/c mice (6), whereas most inbred mice, including C57BL/6 and BALB strains, are resistant to L. pneumophila (61). These differences in murine host susceptibility are likely due to different abilities to activate caspase 1-mediated pyroptosis in macrophages. While L. pneumophila rapidly triggers pyroptosis in C57BL/6 mouse macrophages, L. longbeachae does not do this (6).
Intracellular trafficking of L. longbeachae in mammalian macrophages also follows a route distinct from that of L. pneumophila. After phagocytosis, the L. pneumophila-containing vacuole (LCV) excludes early and late endosomal markers, such as early endosomal antigen 1 (EEA1), Rab5, LAMP-1, LAMP-2, and the mannose 6-phosphate receptor (M6PR) (5, 89). In L. pneumophila the Dot/Icm type IV secretion system is required for prevention of phagosome-lysosome fusion and for intracellular replication (47). Conversely, the L. longbeachae-containing vacuole acquires the early endosomal marker EEA1 and the late endosomal markers LAMP-2 and M6PR (5). It has been suggested that L. longbeachae intracellular trafficking resembles that of the facultative intracellular pathogen Brucella abortus, since a Brucella-containing vacuole also acquires early and late endosomal markers soon after infection (5). Despite the difference in intracellular trafficking between L. longbeachae and L. pneumophila, L. longbeachae rescues Dot/Icm-deficient L. pneumophila when these two organisms coinhabit LCV (5).
Results of the studies cited above indicate that L. longbeachae differs from other legionellae in terms of habitat, host specificity, and intracellular trafficking. In this paper, we describe an analysis of the sequenced and annotated genome of L. longbeachae clinical isolate D-4968 compared with published genomes of L. pneumophila strains Corby, Lens, Paris, and Philadelphia-1 (16, 17, 38). Specifically, we compared genes involved in gene regulation, protein secretion systems, and motility in order to identify genes responsible for making L. longbeachae unique among the legionellae.
L. longbeachae serogroup 1 clinical isolate D-4968 was obtained from a 77-year-old woman diagnosed with legionellosis in Portland, OR, in May 2000 (4). It was selected from the CDC Legionella reference diagnostic library (Atlanta, GA) because it has been characterized with respect to intracellular trafficking (5), the ability to cause murine infections, and the ability to activate caspase 1 and caspase 3 in human and mouse macrophages (6, 39).
D-4968 was grown on buffered charcoal yeast extract (BCYE) agar plates for 72 h before DNA extraction. Genomic DNA was extracted using a QIAamp DNA mini kit (Qiagen Inc., Valencia, CA) according to the manufacturer's guidelines.
Legionella genome sequencing was conducted using Roche 454 GS-FLX pyrosequencing at the CDC Biotechnology Core Facility. A shotgun library was produced by the method of Margulies et al. (57) and was sequenced with an LR70 sequencing kit. After trimming and refiltering to recover short high-quality reads, the sequencing run provided 251,332 reads, and a draft de novo assembly was produced using the Newbler assembler (v 2.0). This assembly resulted in a draft genome consisting of 4.05 Mb in 89 contigs.
A methanol-treated culture of D-4968 was submitted to OpGen Technologies, Madison, WI, for construction of optical maps. The immobilized genomic DNA of D-4968 was digested with either the XbaI or PvuII restriction enzyme, and the sizes and order of digested fragments were determined. The sequence contigs were converted to in silico restriction maps using the MapIt software (OpGen) and were aligned with the optical maps based on their restriction fragment patterns.
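The conversion of sequence contigs to in silico restriction maps can be illustrated with a minimal sketch. It is not the MapIt algorithm, whose internals are not described here; the function name `in_silico_digest` and the toy sequence are hypothetical. The recognition sites used are the standard ones for XbaI (T^CTAGA) and PvuII (CAG^CTG), and the contig is treated as a linear molecule.

```python
# Illustrative only: a simple in silico digest, not the MapIt implementation.
# Each entry maps an enzyme to (recognition site, cut offset within the site).
RECOGNITION_SITES = {
    "XbaI": ("TCTAGA", 1),   # T^CTAGA
    "PvuII": ("CAGCTG", 3),  # CAG^CTG (blunt cutter)
}


def in_silico_digest(sequence, enzyme):
    """Return the ordered fragment lengths (bp) of a linear sequence."""
    site, cut_offset = RECOGNITION_SITES[enzyme]
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment lengths are the distances between consecutive cut positions.
    boundaries = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical toy contig with one XbaI site and one PvuII site.
toy_contig = "AAAA" + "TCTAGA" + "CCCCC" + "CAGCTG" + "GG"
xbai_fragments = in_silico_digest(toy_contig, "XbaI")
pvuii_fragments = in_silico_digest(toy_contig, "PvuII")
```

Both recognition sites are palindromic, so scanning a single strand finds every site. The ordered fragment sizes are the pattern that would then be compared with an optical map.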
Contig arrangements predicted by the optical map were verified by PCR amplification of DNA fragments connecting adjacent contigs using Platinum Taq DNA polymerase (Invitrogen, Carlsbad, CA). Primers that annealed to the 5′ and 3′ regions of each contig were designed using the PrimerSelect program of DNASTAR Lasergene 7 (DNASTAR, Inc., Madison, WI). PCR products were purified using ExoSAP-IT (USB Corporation, Cleveland, OH). DNA sequencing was performed with an ABI BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA), and the products were analyzed with a model 3130xl ABI PRISM genetic analyzer (Applied Biosystems). DNASTAR SeqMan was used to assemble amplified DNA fragments with the contigs. Where gaps between adjacent contigs were predicted to be larger than 3 kb, either the Elongase enzyme mix (Invitrogen) or a Qiagen LongRange PCR kit (Qiagen) was used for PCR amplification. Sequencing directly from genomic DNA was also conducted for large gaps by modifying the reaction conditions using the method of Heiner et al. (41) and increasing the number of cycles to 100.
Plasmid DNA was purified from L. longbeachae D-4968, L. pneumophila strain Lens, and L. pneumophila strain Paris grown on BCYE agar plates for 96 h using a HiSpeed plasmid midi kit (Qiagen) according to the manufacturer's guidelines.
Primers were designed to amplify internal DNA fragments of the flaA, fliI, flgH, and flgG flagellar biosynthesis genes of L. pneumophila strain Philadelphia-1 and of genes annotated as cheR, cheA, and cheB in the genome of L. longbeachae D-4968 (see Table S1 in the supplemental material). DNA from L. longbeachae serogroup 1 and serogroup 2 isolates from the CDC Legionella reference diagnostic library (Atlanta, GA) was extracted using an InviMag bacterial DNA kit (Invitek, Berlin, Germany) and a KingFisher workstation. The extracted DNA was used for PCR amplification of flagellar and chemotaxis gene fragments, followed by sequencing as described above.
The D-4968 DNA sequence was submitted to the J. Craig Venter Institute (JCVI) Annotation Service, where it was subjected to JCVI's prokaryotic annotation pipeline, including gene finding with Glimmer, BLAST-extend-repraze (BER) searches, HMM searches, TMHMM searches, SignalP predictions, and automatic annotations from AutoAnnotate. The annotation tool Manatee was used to manually review the output (downloaded from SourceForge [manatee.sourceforge.net] in November 2007).
The Rapid Annotation using Subsystem Technology (RAST) service was used for fully automated annotation of the D-4968 genome and comparison with genomes of L. pneumophila strains (7). The integrated microbial genomes (IMG) system was used for searches of currently available microbial genomes for the presence of protein domains and motifs (58). Alignment of L. longbeachae and L. pneumophila genomes was conducted using the Mauve genome alignment software (23). The sequences used for comparative genomic analysis were the sequences of L. pneumophila strains Corby, Lens, Paris, and Philadelphia-1 (GenBank accession numbers NC_009494.1, NC_006369.1, NC_006368.1, and NC_002942.5, respectively).
The Whole-Genome Shotgun Project of D-4968 has been deposited in the DDBJ/EMBL/GenBank database under accession ACZG00000000. The version described in this paper is the first version (accession no. ACZG01000000).
Pyrosequencing of the L. longbeachae D-4968 genome revealed that it consists of 4.05 × 10⁶ bp, contains 3,821 protein-encoding genes, and has an average G+C content of 37% (Table 1) based on the data for all contigs. A comparison of the L. longbeachae genome with four previously described L. pneumophila genomes indicated that L. longbeachae is substantially larger and has a higher coding capacity (Table 1). Eighty-nine contigs produced by Newbler assembly were arranged into two large scaffolds consisting of 2.5 × 10⁶ bp and 1.4 × 10⁶ bp using two optical maps and primer walking (see Fig S1A in the supplemental material; data not shown). Both optical maps suggested that the D-4968 chromosome has a circular topology. Thus, the two scaffolds are connected.
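Summary values such as total assembly length and average G+C content follow directly from the contig sequences. The sketch below shows one way to compute them; the function name `assembly_stats` and the toy contigs are hypothetical, and excluding ambiguous bases (such as N) from the G+C denominator is an assumption rather than a convention stated in this paper.

```python
def assembly_stats(contigs):
    """Summarize a list of contig sequences (strings of nucleotides).

    G+C content is computed over unambiguous bases (A, C, G, T) only,
    which is an assumption made for this sketch.
    """
    total_bp = 0
    gc = 0
    acgt = 0
    for contig in contigs:
        seq = contig.upper()
        total_bp += len(seq)
        g_plus_c = seq.count("G") + seq.count("C")
        gc += g_plus_c
        acgt += g_plus_c + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("no unambiguous bases to compute G+C content from")
    return {
        "n_contigs": len(contigs),
        "total_bp": total_bp,
        "gc_percent": 100.0 * gc / acgt,
    }


# Hypothetical toy contigs, not data from D-4968.
toy_stats = assembly_stats(["GGCC", "ATAT", "ATGC"])
```

Applied to the 89 Newbler contigs, the same calculation would be expected to reproduce the genome-wide values reported in Table 1.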
Eleven small contigs (size range, 0.8 to 49 kb) either fill two gaps between the scaffolds (see Fig S1A in the supplemental material) or constitute a plasmid (see Fig S1B in the supplemental material). Contig 3, which was not placed in the scaffolds, contains a number of genes that are highly homologous to genes located on the plasmid of L. pneumophila strain Paris (see Table S2 in the supplemental material). This indicates that contig 3 most likely constitutes a plasmid, which may be exchanged between different Legionella species.
Figure 1 shows multiple alignments of four L. pneumophila genomes with the D-4968 scaffolds. The results indicate that L. longbeachae has a very limited synteny with L. pneumophila, suggesting that there is a substantial evolutionary distance between these species. On the other hand, the L. longbeachae and L. pneumophila genomes share a number of small-scale clusters of homology where the gene order is highly conserved. Most of these clusters contain genes involved in essential metabolic processes and protein secretion apparatus biosynthesis.
A search of the D-4968 genome for regulatory genes revealed that despite the differences in habitat, host range, and genome size and order, the regulatory networks of L. longbeachae and L. pneumophila appear to be very similar. This could be because both species are intracellular parasites with a rather small complement of regulatory genes (16). Both L. pneumophila and D-4968 encode six sigma factors, RpoD, RpoH, RpoS, RpoN, RpoE, and FliA (Fig. 2; see Table S3 in the supplemental material) (16). There are also comparable numbers of two-component systems, including 11 histidine kinases, 16 response regulators, and three putative hybrid proteins with both sensory box histidine kinase and response regulator domains (see Table S3 in the supplemental material). Similar to L. pneumophila, the most abundant type of regulator encoded by D-4968 is the family of proteins with GGDEF and EAL domains (16) (see Table S3 in the supplemental material). These domains are involved in synthesis (GGDEF) and hydrolysis (EAL) of bis-(3′-5′)-cyclic dimeric GMP, a novel global second messenger used for regulation of motility, production of extracellular matrix components, and cell-to-cell communication in other bacteria (72). Interestingly, there are only 13 genes encoding GGDEF-EAL family proteins in D-4968, whereas the L. pneumophila genomes contain from 21 (Lens) to 23 (Paris and Corby) such genes. In contrast, the obligate intracellular pathogen Coxiella burnetii has only two GGDEF domain proteins. If the number of these regulatory proteins correlates with the diversity of environmental stimuli encountered by bacteria (35), the smaller number of GGDEF and EAL proteins in L. longbeachae than in L. pneumophila may suggest that the former species inhabits a less diverse environment.
The biphasic L. pneumophila life cycle was first described by Rowbotham (76). Bacteria cycle between an intracellular, acid-resistant, nonmotile, replicative form and a free-living, highly motile, transmissive form, during which they express numerous virulence factors, such as the Icm/Dot secretion system (for a review see reference 64). The L. pneumophila life cycle can be modeled in broth cultures, where bacteria in the exponential growth phase have many attributes of the replicative form, while stationary-phase cells have traits of the transmissive form. Cellular differentiation is coordinated by a regulatory circuit that includes components of the stringent response mechanism, two-component regulatory systems, and alternative sigma factors (64). The L. longbeachae genome encodes all the major parts of this proposed regulatory circuit, including the recently described small noncoding RNAs RsmY and RsmZ (44, 69, 78) (Fig. 2; see Table S3 in the supplemental material). However, there are differences between phase regulation in L. longbeachae and phase regulation in L. pneumophila. In contrast to intracellular replication of L. pneumophila, the ability of L. longbeachae to replicate intracellularly is independent of the growth phase of the infecting bacteria (5). L. longbeachae may have additional components that alter the differentiation-controlling network, or the interactions between conserved elements of the network may differ in L. longbeachae and L. pneumophila. Several putative regulatory genes specific to L. longbeachae are described below.
L. pneumophila genomes encode multiple protein secretion systems, including the putative type I Lss secretion machinery, type II PilD-dependent Lsp, type IVA Lvh, and type IVB Icm/Dot, as well as a putative type V secretion pathway (25). Gram-negative bacterial pathogens and endosymbionts use these specialized secretion systems to transport toxins and effectors to the extracellular environment or directly into the host cell to modify host physiology and promote interactions (91). It has been established that type II and type IVB secretion systems are essential for L. pneumophila virulence (25). Genes encoding type I, II, IVA, and IVB secretion system homologs were also found in the D-4968 genome, but there were several noteworthy differences.
Type I secretion systems allow secretion of substrates into the extracellular space in a one-step process, without a stable periplasmic intermediate (37). The L. longbeachae genome includes lssX, lssY, lssZ, and lssA genes that share high levels of identity with genes of the L. pneumophila lssXYZABD locus encoding a putative type I secretion system (48). The loci in L. longbeachae and L. pneumophila have similar gene orders and compositions upstream of the lssXYZA operon, but they differ downstream of lssA (see Fig S2 in the supplemental material). L. longbeachae does not contain lssB and lssD, which appear to be essential for the secretion process as they encode a protein of the ATP-binding cassette transporter and the membrane fusion protein, respectively (1). Detection of lssA and lssY but not lssD in non-L. pneumophila Legionella species has been reported previously (48). D-4968 contains 19 genes encoding ATP-binding cassette transporters (LssB orthologs) and 13 genes encoding membrane fusion proteins with a “HlyD” Pfam domain (LssD orthologs) dispersed throughout the genome (see Table S4 in the supplemental material). However, most of these genes have homologs in the L. pneumophila genome and/or are associated with putative multidrug efflux systems. Thus, L. longbeachae may not encode a functional type I secretion system. The presence of the lssXYZA operon in L. longbeachae in the absence of contiguous type I machinery suggests that lssXYZA-encoded proteins may function independently from lssBD and are either not secreted or utilize different secretion machinery.
In many Gram-negative bacteria type II secretion is used to translocate proteins across the outer membrane via multisubunit complexes inserted into both the inner and outer membranes (37). To date, L. pneumophila is the only intracellular pathogen known to possess a functional type II secretion system (T2SS) (19). The L. pneumophila T2SS has been extensively studied and has been shown to be required for optimal growth on bacteriologic media at 12 to 25°C, survival in tap water at 4 to 17°C, establishment of biofilms, sliding motility, intracellular infection of protozoans and mammalian macrophages, and persistence in the lungs of A/J mice (24, 54, 55, 73, 75, 82, 87). Rossier et al. identified T2SS genes in several non-L. pneumophila Legionella species and hypothesized that the T2SS occurs throughout the genus Legionella (75). Indeed, the L. longbeachae genome includes all 12 lsp (Legionella secretion pathway) genes, which are organized in five loci that encode the core components of the T2SS apparatus (Table 2) (19). The L. pneumophila T2SS translocates multiple enzymes, including the major zinc metalloprotease ProA, the type II RNase SrnA, the lipases LipA and LipB, the aminopeptidases LapA and LapB, the lysophospholipase A PlaA, and the chitinase ChiA (24, 25, 73, 74). To date, a total of 25 T2SS substrates have been identified in L. pneumophila, but it has been predicted that the L. pneumophila T2SS may be involved in the secretion of up to 60 proteins (18, 24). The L. longbeachae genome encodes 13 proteins homologous to L. pneumophila enzymes demonstrated to be secreted in a T2SS-dependent manner (Table 2) (18). However, a number of T2SS substrates identified in L. pneumophila are not encoded by the D-4968 genome, such as a chiA-encoded chitinase, which was shown to promote L. pneumophila persistence in the lungs of A/J mice (24). Based on these findings and those of Rossier et al. (75), we hypothesize that a conserved T2SS apparatus is encoded by all legionellae, but the number and type of T2SS substrates vary depending on the ecological niche occupied by the species.
Type IV secretion systems (T4SSs) are membrane-associated transporter complexes believed to have evolved from bacterial conjugation machinery and to be capable of transporting both proteins and nucleic acids into plant, animal, and bacterial cells (37, 91). T4SSs are used by many Gram-negative bacteria for exchange of genetic material, spread of conjugative plasmids, and injection of virulence factors into eukaryotic host cells (8). T4SSs have been grouped into subclasses, including type IVA, which is similar to the Agrobacterium tumefaciens Vir system, and type IVB, which is similar to the Tra/Trb bacterial conjugation systems (25).
L. pneumophila strains Paris, Lens, and Philadelphia-1 express a type IVA secretion system (T4ASS), Lvh (Legionella vir homologues) (16, 17, 80). L. pneumophila strain Corby lacks Lvh, but it contains two novel T4ASSs, Trb-1 and Trb-2 (38), whose roles are not known. The Lvh system of L. pneumophila is required for plasmid conjugation, as well as for low-temperature host cell entry and replication (9, 71, 80). L. longbeachae D-4968 contains all 11 lvh genes (lvhB2 to lvhB11 and lvhD4) arranged in the same order (see Table S5 in the supplemental material), implying that the function is similar. Homologs of the Corby Trb-1 or Trb-2 T4ASSs were not found in L. longbeachae, which is consistent with the observation that the lvh and trb-tra gene clusters do not coexist in the same genome (38).
In L. pneumophila, the T4ASS-encoding gene clusters are located on plasmid-like elements which also contain genes encoding mobility factors and enzymes, such as phage integrases and transposases (27, 38). These plasmid-like elements are either integrated into the chromosome at sites in the tmRNA, tRNAPro, and tRNAArg genes or excised as multicopy plasmids (Table 1). The D-4968 genome contains two 45-bp sequences, which are identical to the 3′ end of the tRNAPro gene, in the intergenic regions flanking the lvh gene cluster (Fig. 3; see Fig S3 in the supplemental material). There are also several genes that are phage related, encode mobility factors, or are components of a restriction system located in the region between the tRNAPro gene and the second sequence repeat (see Table S5 in the supplemental material). The 3′ end of the tRNAPro gene and the two direct sequence repeats flanking the lvh locus may represent the chromosome-plasmid junction sites attL, attR1, and attR2 (Fig. 3). We hypothesize that this region could be a plasmid-like element that exists in either integrated or excised form and that mobility of the lvh region may be widespread in the genus Legionella and contribute to interspecies exchange of genetic information.
In addition to type IVA secretion systems, L. pneumophila contains several type IVB secretion systems (T4BSSs), including the Icm (intracellular multiplication)/Dot (defective organelle trafficking) system present in all strains and Tra-like systems identified in selected strains of this species (25). In L. pneumophila, the Icm/Dot system is encoded in two genomic regions; region I contains seven genes (icmV, icmW, icmX, dotA, dotB, dotC, and dotD), and region II contains 18 genes (icmT, icmS, icmR, icmQ, icmP/dotM, icmO/dotL, icmN/dotK, icmM/dotJ, icmL/dotI, icmK/dotH, icmE/dotG, icmG/dotF, icmC/dotE, icmD/dotP, icmJ/dotN, icmB/dotO, icmF, and icmH/dotU) (for a review, see reference 1). The gene organization of both regions is highly conserved in all four L. pneumophila strains. It has been proposed that the Icm/Dot proteins form a multicomponent translocation apparatus that spans both bacterial membranes (92). The inner membrane proteins IcmG/DotF and IcmE/DotG and the outer membrane proteins DotC, DotD, and IcmK/DotH interact with each other, forming the core transmembrane complex of the Icm/Dot system (92). Icm/Dot components are required for intracellular replication in protozoans or macrophages and to cause disease in animals (29, 81). The Icm/Dot system contributes to avoidance of association with markers of endosome-lysosome fusion, remodeling of LCV by endoplasmic reticulum (ER)-derived vesicles, and modulation of host signal transduction processes (30).
The L. longbeachae D-4968 genome contains 24 out of 25 icm/dot genes identified in L. pneumophila (see Table S6 in the supplemental material). These genes are located in three genomic regions and are generally homologous, and the orders of the genes are similar, with three exceptions (Fig. 4). In D-4968, the icmF and icmH/dotU genes of L. pneumophila region II are separated from the rest of the icmT-icmB/dotO gene cluster. The icmE/dotG gene of L. longbeachae is 1,431 bp longer than that of L. pneumophila, mostly due to insertion of bases encoding 390 amino acids that form several pentapeptide repeats (Pfam accession number PF00805) (data not shown). Since IcmE/DotG is an inner membrane protein comprising the core of the Icm/Dot transmembrane complex (92), the considerably larger L. longbeachae IcmE/DotG protein suggests that the size and/or structure of the L. longbeachae Icm/Dot apparatus may be different from the size and/or structure of the L. pneumophila Icm/Dot apparatus. L. longbeachae contains ligB, a functional homologue of icmR (fir gene family) that corresponds in position but does not share sequence homology with L. pneumophila icmR and functions in a similar manner with the corresponding icmQ gene during infection (32, 33). The L. longbeachae icmS, ligB, and icmQ genes are required for intracellular growth in human macrophages (32). The regulatory region of ligB in D-4968 contains the CpxR consensus binding site GTAAANNNNNNGAAAG but no PmrA-binding site. This finding corroborates a previous hypothesis that proposed that L. longbeachae lost the ancestral PmrA regulatory element in the fir promoter as a response to specific environmental conditions (31).
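A degenerate consensus such as the CpxR site GTAAANNNNNNGAAAG can be located in a regulatory region with a simple pattern scan. The sketch below is only an illustration of that idea and is not necessarily how the site was identified in this study; the function names and the toy sequence are hypothetical, N is assumed to match any of A, C, G, or T, and scanning both strands is an assumption.

```python
import re


def motif_to_regex(motif):
    """Translate a motif containing N wildcards into a regular expression."""
    return "".join("[ACGT]" if base == "N" else base for base in motif.upper())


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_motif(sequence, motif="GTAAANNNNNNGAAAG"):
    """Return sorted (start, strand) hits; start is 0-based on the input strand."""
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    seq = sequence.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    # Convert reverse-strand match starts back to forward-strand coordinates.
    hits += [(len(seq) - m.start() - len(motif), "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical toy promoter fragment carrying one forward-strand site.
toy_region = "CC" + "GTAAA" + "ACGTAC" + "GAAAG" + "TT"
forward_hits = find_motif(toy_region)
reverse_hits = find_motif(reverse_complement(toy_region))
```

The same scan with a different consensus string could be used to check for the absence of another regulator's binding site, provided its consensus is known.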
Interestingly, the icmB gene of dot/icm genomic region II in L. longbeachae is adjacent to a pair of genes encoding a two-component sensor kinase and response regulator (Fig. 4). The putative sensor kinase, encoded by LLB_2545, shares 33% sequence identity with the C-terminal region of the L. longbeachae two-component sensor kinase CpxA, while the putative response regulator, encoded by LLB_2546, shares 42% identity with the Escherichia coli response regulator OmpR and 35% identity with the L. longbeachae response regulator CpxR. The LLB_2545-LLB_2546-encoded two-component regulatory system appears to be unique to L. longbeachae. Sequence similarity to CpxAR and the close proximity of the LLB_2545 and LLB_2546 genes to the icm/dot genomic region suggest involvement in regulation of icm/dot genes. However, the heterogeneity of the N-terminal signal input domain sequences of the LLB_2545 and CpxA proteins suggests that they may respond to different environmental signals. Further studies of the icm/dot region in other Legionella species are needed to determine if there is a correlation between loss of the PmrA regulatory element in fir genes and acquisition of additional two-component regulators, such as the LLB_2545 and LLB_2546 regulators. The high level of conservation of Icm/Dot system components in the D-4968 genome suggests that this T4BSS plays an essential role in the life cycle of the bacterium, but its regulation might differ from that of its counterpart in L. pneumophila.
The L. longbeachae and L. pneumophila Icm/Dot secretion systems also differ in the set of effectors secreted via these T4SSs. To date, 140 L. pneumophila proteins have been determined to be Icm/Dot effectors, and it is predicted that the total number of Icm/Dot substrates approaches 300 (14, 47). The function of the majority of these effectors remains unknown since most of them are individually dispensable during intracellular growth or for replication vacuole morphology (30). We used a combination of the lists of L. pneumophila Icm/Dot effectors from the Icm/Dot effector database (http://microbiology.columbia.edu/shuman/effectors.html) and two recent publications (14, 47) to search for homologs in L. longbeachae using a local BLAST search. L. pneumophila and L. longbeachae share such well-characterized effectors as RalF, LepA, LepB, and SdhA (for a review, see reference 47). Yet only 42 of 140 homologs were identified in D-4968, although multiple paralogs were found for SidE and LidA (see Table S7 in the supplemental material). DrrA/SidM, which exhibits both guanine nucleotide exchange factor (GEF) and GDP dissociation inhibitor displacement factor (GDF) activities with the small GTPase Rab1 promoting its recruitment to the LCV (46, 56), was conspicuously absent from L. longbeachae. However, DrrA/SidM also is not present in L. pneumophila strain Paris (1), supporting the general trend of individual dispensability of Icm/Dot effectors. Several of the L. pneumophila Icm/Dot effectors that are not present in L. longbeachae are those with ankyrin repeats and Legionella effectors identified by machine learning (Lem) (14), and only 4/13 and 6/23 of these effectors have been found in D-4968, respectively (see Table S7 in the supplemental material).
Many effectors identified in L. pneumophila contain domains that are present mostly in eukaryotes (47). Therefore, we searched the D-4968 genome for genes predicted to encode proteins with eukaryote-like motifs, including ankyrin repeats, leucine-rich repeats, F-boxes, U-boxes, Sel-1 domains, and serine-threonine protein kinase domains (1, 16). As a result, we identified 29 genes that encode proteins with domains preferentially found in eukaryotic proteins (Table 3). Ankyrin repeats mediate protein-protein interactions in eukaryotes and are involved in many cellular processes, including cell motility, cell signaling, and transcriptional regulation (36). L. longbeachae contains a large family of ankyrin repeat-containing proteins, as do intracellular pathogens such as L. pneumophila, C. burnetii, Wolbachia pipientis, and Rickettsia felis. In addition to four homologs of Icm/Dot effectors, AnkH, AnkK, AnkC, and AnkD (see Table S7 in the supplemental material), we identified 18 other ankyrin repeat-containing proteins (Table 3), 2 of which are homologous to the L. pneumophila Corby Lpc_1606 protein and the rest of which are unique to L. longbeachae. Ankyrin repeat-containing proteins in L. longbeachae are likely to be Icm/Dot effectors, as they are in L. pneumophila.
F-box- and U-box-domain containing proteins of L. pneumophila were predicted to interact with the ubiquitin machinery of eukaryotic cells (1). We identified three L. longbeachae genes predicted to encode F-box-containing proteins, all of which have homologs in L. pneumophila (Table 3). For a long time the L. pneumophila LegU2/LubX Icm/Dot effector has been described as the only protein known in prokaryotes to contain U-box domains (16). However, a second gene with a putative U-box domain was recently found in “Candidatus Protochlamydia amoebophila” (3). A U-box domain is present in eukaryotic E3 ubiquitin ligases, and recently it has been demonstrated that LegU2/LubX has ubiquitin ligase activity since it can catalyze polyubiquitination of eukaryotic Cdc2-like kinase 1 (52). L. longbeachae does not have a LegU2/LubX homolog, but it encodes a novel U-box domain protein, the LLB_1403 protein, which does not share identity with any known protein in the database. The U-box of the LLB_1403 protein appears to contain all conserved hydrophobic residues critical for ubiquitin conjugating enzyme binding (52) (data not shown). This suggests that the LLB_1403 protein may be an Icm/Dot effector with ubiquitin ligase activity. Additional studies are needed to determine which of the 29 proteins listed in Table 3 are actual substrates of the Icm/Dot system.
It is hypothesized that L. pneumophila strain Paris possesses a type V secretion system because it encodes a putative autotransporter protein, Lpp0779 (16). L. longbeachae does not encode a homolog of Lpp0779 or any other putative autotransporter. Therefore, it is unlikely that L. longbeachae employs this type of protein secretion.
The L. pneumophila genome encodes a large number and wide variety of eukaryote-like proteins (1, 26, 40). The acquisition of eukaryote-like genes can be explained by eukaryote-to-prokaryote horizontal gene transfer occurring during intracellular growth of naturally competent L. pneumophila in aquatic protozoans (26, 40). The eukaryote-like proteins likely interfere with host cellular processes by mimicking functions of eukaryotic enzymes (1). A list of 30 L. pneumophila proteins with the highest similarity scores with eukaryotic proteins (1) was used to search the genome of L. longbeachae D-4968, which resulted in identification of 14 homologs (Table 4). For example, LLB_0444 encodes ectonucleoside triphosphate diphosphohydrolase (apyrase), which shares 58% identity with Lpg1905, a secreted apyrase recently shown to be important for L. pneumophila replication in eukaryotic cells and for infection in A/J mice (79). It is conceivable that LLB_0444 may also be important for pathogenesis of L. longbeachae. On the other hand, a gene encoding a second L. pneumophila apyrase homolog, Lpg0971, was not found in the L. longbeachae genome. Nor does D-4968 encode a homolog of L. pneumophila sphingosine-1-phosphate lyase, which is predicted to influence host cell survival and apoptosis (40).
While L. longbeachae does not share all eukaryote-like homologs with L. pneumophila, it may encode additional eukaryote-like proteins unique to it. Therefore, we adapted the criteria used by Albert-Weissenberger et al. in the search for eukaryote-like proteins in L. pneumophila genomes (1) to search the D-4968 genome for genes that have no homologs in L. pneumophila. Over 70 such genes were identified in L. longbeachae. Table 5 lists proteins with 35% or greater sequence identity to eukaryotic proteins and no counterparts in L. pneumophila. Some of the proteins listed have homologs in other bacteria, including many intracellular pathogens such as Coxiella and Brucella, while others have homologs that are found exclusively in eukaryotic organisms. One example is LLB_2177, which encodes a protein phosphatase 2C (PP2C) domain protein that shares sequence identity only with eukaryotic protein phosphatases. The PP2C family proteins are Mg2+/Mn2+-dependent serine/threonine phosphatases (50, 86). In eukaryotes, PP2Cs are involved in several cell regulation and cell signaling processes, including p53 activation (86), implying that L. longbeachae may use the serine/threonine phosphatase activity of the LLB_2177 protein to modulate similar processes in its host cells. Three L. longbeachae genes, LLB_2053, LLB_3045, and LLB_3686 (Table 5 and data not shown), encode proteins containing the Ras family motif (Pfam00071), a small GTP-binding domain rarely found in bacteria. A search of 1,284 bacterial genomes using the IMG website (58) yielded only 22 other bacterial species encoding proteins with a Ras family motif. L. pneumophila employs a number of Icm/Dot effectors to subvert the functions of the host small GTPases Arf1 and Rab1 in order to facilitate remodeling of the LCV by ER-derived vesicles (30). The unique L. longbeachae small GTPase homologs may mimic host small GTPase activity and participate directly in the recruitment of ER-derived vesicles to the LCV, competing with Arf1 and Rab1.
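The selection rule behind the list of species-specific eukaryote-like proteins (at least 35% identity to a eukaryotic protein and no counterpart in L. pneumophila) can be sketched as a simple filter over similarity-search hits. The following Python sketch is illustrative only and is not the pipeline used in this study: the `Hit` record, the `subject_group` labels, the function name, and the toy gene identifiers are all hypothetical, and the full criteria adapted from Albert-Weissenberger et al. (1) are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit (hypothetical record layout)."""
    query: str          # query protein ID (toy IDs below, not real loci)
    subject_group: str  # "eukaryote", "L_pneumophila", or "other_bacteria"
    identity: float     # percent sequence identity of the alignment


def eukaryote_like_unique(hits, min_identity=35.0):
    """Return query IDs that have a eukaryotic hit at or above
    min_identity and no hit at all in L. pneumophila."""
    with_eukaryotic_hit = {
        h.query
        for h in hits
        if h.subject_group == "eukaryote" and h.identity >= min_identity
    }
    with_lp_hit = {h.query for h in hits if h.subject_group == "L_pneumophila"}
    return sorted(with_eukaryotic_hit - with_lp_hit)


# Toy data, invented for illustration only.
toy_hits = [
    Hit("geneA", "eukaryote", 41.0),       # passes: eukaryotic hit, no Lp hit
    Hit("geneA", "other_bacteria", 38.0),  # bacterial homologs are allowed
    Hit("geneB", "eukaryote", 52.0),
    Hit("geneB", "L_pneumophila", 60.0),   # excluded: has an Lp counterpart
    Hit("geneC", "eukaryote", 28.0),       # excluded: below the 35% cutoff
]

selected = eukaryote_like_unique(toy_hits)
print(selected)
```

In this sketch a protein with homologs in other bacteria is still retained, which mirrors the observation that some of the listed proteins have homologs in organisms such as Coxiella and Brucella.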
In summary, L. longbeachae contains a large number of eukaryote-like proteins, some of which have homologs in L. pneumophila, but most of these proteins are unique to L. longbeachae. These proteins are likely involved in modulation of host cellular functions during the intracellular phase of L. longbeachae growth. The difference between eukaryote-like proteins in L. pneumophila and eukaryote-like proteins in L. longbeachae may be due to the differences in the protozoan hosts that are specific for each species, since these hosts are the most likely sources for acquisition of these proteins. It would be interesting to see how many eukaryote-like proteins are shared by different Legionella species and how many of them are species specific.
In L. pneumophila, flagellar structural and regulatory genes are organized in seven operons: (i) flaAG fliDS, (ii) fleQ, (iii) fleSR fliEFGHLI, (iv) fliMNOPQR flhBAF fleN fliA motAB, (v) flgAMN, (vi) flgBCDEFGHIJKL, and (vii) motB2A2 (43). Even though these operons are very similar to those of Pseudomonas and Salmonella, the L. pneumophila motility system lacks an important component of most bacterial flagellar systems: a chemotaxis system. The regulatory network that governs L. pneumophila flagellum-mediated motility includes RpoN (σ54) and its regulators FleQ, FleN, FleS, and FleR, as well as FliA (σ28) and its anti-sigma factor FlgM (Fig. 2) (1). In contrast, the L. longbeachae genome contains a complete set of chemotaxis genes but only a few flagellar genes, most of which are predicted to perform regulatory functions (Fig. 5; see Table S8 in the supplemental material). Comparison of the L. pneumophila regions containing flagellar operons with the corresponding L. longbeachae regions showed that the content and the order of genes adjacent to the flagellar loci are well conserved between the two species (Fig. 5). However, L. longbeachae lacks the majority of flagellar structural genes. The absence of transposon elements and pseudogenes in these chromosomal regions of L. longbeachae suggests that the loss of flagellar genes was not a recent event.
Previous papers contain conflicting reports regarding expression of flagella by L. longbeachae. Heuner et al. did not detect the flagellin-encoding flaA sequence in L. longbeachae serogroup 1 using Southern hybridization, nor did they visualize flagella on the surface by electron microscopy (42). On the other hand, Asare et al. reported that D-4968 and several other L. longbeachae serogroup 1 strains express flagella in the postexponential phase (6). However, extensive microscopic examination did not confirm motility for D-4968 or six other L. longbeachae isolates (both serogroup 1 and serogroup 2) from the CDC culture collection when they were tested in our laboratory (data not shown). Nor did PCR-based screening of 50 L. longbeachae isolates belonging to both serogroups detect flaA in any isolate (data not shown). PCR amplification of fragments of the less polymorphic fliI, flgG, and flgH genes from eight L. longbeachae isolates (six serogroup 1 isolates and two serogroup 2 isolates) was also unsuccessful. Amplification of the mip gene as a positive control for PCR was possible with all isolates (data not shown). The failure to detect flagellar genes in other L. longbeachae isolates could be due to decreased specificity of PCR primers designed based on L. pneumophila sequences. However, the complete absence of flagellar biogenesis genes in D-4968, in the recently sequenced L. longbeachae D-5243 isolate from the CDC culture collection (N. A. Kozak et al., unpublished data), and in four other L. longbeachae isolates sequenced in France (15) supports the hypothesis that the majority of L. longbeachae isolates do not possess flagellar genes.
It has been shown previously that L. pneumophila flagellin is required for caspase 1 activation in murine macrophages (53, 62, 70). The lack of a flagellum in D-4968 and other L. longbeachae isolates provides an explanation for the failure of this species to activate caspase 1 and trigger rapid pyroptosis in C57BL/6 and BALB/c mouse macrophages, allowing establishment of infection in mice resistant to L. pneumophila (6). The ability to replicate in various mouse strains and the fact that D-4968 was isolated from a Legionnaires' disease patient (4) indicate that the flagellum is not required for L. longbeachae virulence.
The detection of flagellar regulatory genes without structural genes in the D-4968 genome raises several questions. The first question is whether ancestors of this L. longbeachae isolate carried the full set of flagellar structural genes and, if so, what selective pressure caused their loss and by what mechanism they were lost. Second, the presence of an intact fliA gene implies functionality, but the nature of the genes that FliA regulates has yet to be determined. Studies on L. pneumophila demonstrated that FliA, in addition to flagellar genes, regulates factors that contribute to virulence-associated traits, including the inhibition of phagosome maturation (63). A recent transcriptome study of L. pneumophila strain Paris identified several genes of the FliA regulon that do not play a role in flagellum biogenesis or regulation, including lpp0972 and lpp1290, which encode enhanced entry proteins (EnhA) (13). In D-4968, LLB_0870 and LLB_3030 are highly homologous to lpp0972 and lpp1290, respectively. Both LLB_0870 and LLB_3030 are located in chromosomal regions that correspond to loci adjacent to the flagellar operons in L. pneumophila (Fig. 5). This suggests that LLB_0870 and LLB_3030 may belong to the FliA regulon of D-4968. Further studies should help identify other FliA-regulated genes in L. longbeachae, which may be more numerous than FliA-regulated genes in L. pneumophila.
Identification of a homolog of the anti-sigma factor FlgM gene in the genome of D-4968 presents an interesting paradox. Typically, FlgM inhibits activity of FliA by directly binding to this sigma factor, and FliA is released only after complete assembly of a flagellar intermediate that functions as a secretion channel for FlgM (45). However, D-4968 does not have the genes required for complete synthesis of the flagellar intermediate, so FlgM cannot be secreted in this way. If the LLB_0872 protein, which shares 46% identity with L. pneumophila FlgM, does function as an FliA-specific anti-sigma factor, L. longbeachae must have a novel mechanism for release of inhibition.
L. longbeachae D-4968 contains a chemotaxis operon with six genes encoding homologs of a methyl-accepting chemotaxis protein, the histidine kinase CheA, the methylesterase CheB, the methyltransferase CheR, and two CheW scaffold proteins (Fig. 6). A gene encoding the response regulator CheY is usually included in a chemotaxis operon (2, 90), but a cheY homolog was not found in the operon of D-4968. Two genes located outside the chemotaxis operon, LLB_2022 and LLB_2292, encode CheY-like proteins, which may function together with the gene products of the chemotaxis operon. There are no homologs of either LLB_2022 or LLB_2292 in L. pneumophila. PCR-based screening of eight L. longbeachae isolates from the CDC culture collection (six serogroup 1 isolates and two serogroup 2 isolates) indicated that all of them contained cheR, cheA, and cheB genes (data not shown). Therefore, it is likely that many L. longbeachae isolates contain chemotaxis genes.
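The operon composition described above can be restated as a small bookkeeping exercise. The following Python sketch is illustrative only: the reference set contains just the components named in the text plus CheY, the short label `mcp` for the methyl-accepting chemotaxis protein gene is an assumption, and the order of the list does not represent the actual gene order in the operon.

```python
from collections import Counter

# Components reported for the D-4968 chemotaxis operon (six genes).
# List order is arbitrary and does not reflect gene order on the chromosome.
d4968_operon = ["mcp", "cheA", "cheB", "cheR", "cheW", "cheW"]

# Reference set: the components named in the text plus cheY, which is
# usually part of a chemotaxis operon. This is not an exhaustive catalog.
reference_components = {"mcp", "cheA", "cheB", "cheR", "cheW", "cheY"}


def missing_components(operon_genes, reference):
    """Return reference components that are absent from the operon."""
    return sorted(reference - set(operon_genes))


copies = Counter(d4968_operon)          # copy number of each component
missing = missing_components(d4968_operon, reference_components)

print("gene count:", len(d4968_operon))
print("cheW copies:", copies["cheW"])
print("missing from operon:", missing)
```

Under these assumptions the only missing component is cheY, consistent with the CheY-like proteins being encoded elsewhere in the genome (LLB_2022 and LLB_2292).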
It is paradoxical that the genome of L. pneumophila has a complete set of flagellum genes and no chemotaxis genes, whereas the genome of L. longbeachae lacks flagellar genes but contains a chemotaxis operon. L. pneumophila flagellum-mediated motility is not regulated by a typical chemotaxis pathway, and the L. longbeachae chemotaxis genes are likely involved in regulation of a different process. Recent studies of the chemotaxis-like systems in different bacteria showed that many systems differ from the E. coli chemotaxis system and are not involved in the control of flagellum-mediated motility. These so-called chemosensory systems were shown to regulate type IV pilus-based motility, flagellar and type IV pilus expression, cell differentiation, and biofilm formation (51). The chemotaxis-like operon in D-4968 may be involved in the regulation of type IV pilus-based motility (see below), yet L. pneumophila also possesses type IV pili and exhibits twitching motility (22). It is more likely that twitching motility is regulated in similar ways in the two species and thus is che independent. According to RAST analysis, the composition of the chemotaxis-like operon in D-4968 is similar to the composition of one of the che operons in Geobacter uraniireducens and Magnetococcus. In G. uraniireducens the che cluster that is similar to the D-4968 che operon is predicted to regulate processes involving cell-cell interactions or social motility (90). Future studies of G. uraniireducens, Magnetococcus, and L. longbeachae should indicate whether the che operons in these organisms control similar functions.
L. pneumophila type IV pili (Tfp) promote attachment to mammalian and protozoan cells, are important for biofilm formation, and are required for natural competence (55, 79, 88). A recent study reported that Tfp are also required for twitching motility at 37°C but not at 27°C (22). Similar to other bacteria producing type IVa pili, L. pneumophila contains multiple Tfp biogenesis genes that are scattered throughout the genome (67). The core components necessary for Tfp biogenesis are considered to be the major pilin PilE, the prepilin peptidase PilD, the assembly ATPase PilB, the retraction ATPase PilT, the inner membrane protein PilC, and the secretin PilQ (see Table S8 in the supplemental material) (67). L. longbeachae D-4968 contains all of the Tfp biogenesis genes present in L. pneumophila, except for one: the gene encoding the prepilin, PilX, which has a stop codon in the middle of the reading frame (see Table S8 in the supplemental material). Regardless of PilX reading frame disruption, this protein is not a core component of Tfp, and its absence may not affect Tfp expression in L. longbeachae. L. longbeachae most likely produces Tfp, which, in the absence of flagella, may be the sole motility organ. Additional studies are needed to determine whether L. longbeachae exhibits Tfp-mediated twitching motility and whether the chemotaxis-like operon described above is involved in its regulation. Perhaps the loss of flagellum and retention of Tfp by L. longbeachae were adaptations for survival in soil, in contrast to other Legionella species that inhabit freshwater environments.
L. longbeachae D-4968 contains a number of genes which encode homologs of L. pneumophila virulence factors or putative virulence factors that are not described above (see Table S9 in the supplemental material). For example, LLB_3347 encodes a macrophage infectivity potentiator (Mip). L. longbeachae mip mutants have a reduced capacity to infect and multiply within amoeba and do not cause death in guinea pigs, unlike the wild-type bacteria (28). D-4968 also encodes a weak homolog of major outer membrane protein MOMP, which has been shown to bind to the complement component CR3 of human monocytes in L. pneumophila and thus is important for phagocytosis (for a review, see reference 1). D-4968 contains a number of enh homologs, including genes encoding seven paralogs of EnhA, two paralogs of EnhB, and one paralog of EnhC, involved in L. pneumophila entry into host cells (21) (see Table S9 in the supplemental material). However, D-4968 lacks an RtxA-encoding gene that was identified together with the enh genes in L. pneumophila in a screen for the enhanced-entry phenotype and has been shown to be involved in cell entry, adherence, cytotoxicity, and pore formation (20, 21). The D-4968 genome encodes a homolog of the integral membrane protein MviN, which is important for Salmonella enterica serovar Typhimurium virulence and is predicted to be a virulence determinant in L. pneumophila (1). Finally, D-4968 homologs of the regulators BipA and Fis have been predicted to be involved in virulence gene regulation in L. pneumophila (1).
In conclusion, a comparison of the L. longbeachae D-4968 genome with four sequenced and annotated genomes of L. pneumophila strains showed that the majority of virulence factors are shared by these species. The major difference between the species is the absence of flagellar biosynthesis genes in L. longbeachae. Even though the type II secretion apparatus and the type IV secretion apparatus are conserved in L. longbeachae and L. pneumophila, we predict that the sets of effectors secreted via these systems differ and reflect the adaptation of these species to different host environments and extracellular ecological niches. Further studies should help determine whether all L. longbeachae strains have L. longbeachae D-4968-specific properties.
We thank JCVI for providing the JCVI Annotation Service, including the automatic annotation data and the manual annotation tool Manatee. We also thank Awdhesh Kalia and Yousef Abu Kwaik for helpful suggestions at the project-planning stage.
Published ahead of print on 11 December 2009.
†Supplemental material for this article may be found at http://jb.asm.org/.