The airways of individuals with cystic fibrosis (CF) often become chronically infected with unique strains of the opportunistic pathogen Pseudomonas aeruginosa. Several lines of evidence suggest that the infecting P. aeruginosa lineage diversifies in the CF lung niche, yet so far this contemporary diversity has not been investigated at a genomic level. In this work, we sequenced the genomes of pairs of randomly selected contemporary isolates sampled from the expectorated sputum of three chronically infected adult CF patients. Each patient was infected by a distinct strain of P. aeruginosa. Single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) were identified in the DNA common to the paired isolates from different patients. The paired isolates from one patient differed due to just 1 SNP and 8 indels. The paired isolates from a second patient differed due to 54 SNPs and 38 indels. The pair of isolates from the third patient both contained a mutS mutation, which conferred a hypermutator phenotype; these isolates cumulatively differed due to 344 SNPs and 93 indels. In two of the pairs of isolates, a different accessory genome composition, specifically integrated prophage, was identified in one but not the other isolate of each pair. We conclude that contemporary isolates from a single sputum sample can differ at the SNP, indel, and accessory genome levels and that the cross-sectional genomic variation among coeval pairs of P. aeruginosa CF isolates can be comparable to the variation previously reported to differentiate between paired longitudinally sampled isolates.
The airways of individuals with cystic fibrosis (CF) show an exquisite predilection toward infection with the opportunistic Gram-negative bacterial pathogen Pseudomonas aeruginosa (9, 11). Initial airway infections by P. aeruginosa are usually eradicated through aggressive antibiotic intervention, though over time these infections typically become chronic in nature (9). The expectorated sputum from many chronically infected CF patients can carry very high loads of P. aeruginosa (typically >10⁵ CFU/ml). Strategies such as variable number tandem repeat (VNTR) and multilocus sequence typing (MLST) assays have indicated that most chronically infected CF adults become colonized by a unique strain of P. aeruginosa (27, 28). However, although useful for strain typing, at this level of sequence resolution, little can be said about the degree of intraclonal genomic variation in each patient; for example, just 7 open reading frames (ORFs) are analyzed in conventional MLST analyses. A higher-throughput, complementary approach was introduced in 2007 by Wiehlmann et al., who developed a microarray capable of discriminating 58 binary single nucleotide polymorphisms (SNPs) in 7 conserved ORFs, as well as identifying the presence of strain-specific horizontally acquired genomic islands and islets (40). The greater resolution afforded by this method enables finer mapping of the P. aeruginosa population structure in each patient but still throws no light on the nature of the genomic changes outside this small number of markers. This is important because several studies have shown that the P. aeruginosa population within the CF lung displays heritable phenotypic diversity (for examples, see references 10 and 22). However, the genomic basis for this phenotypic diversity has not been investigated systematically.
Since the first P. aeruginosa reference genome was completed in 2000 (36), a number of other isolates have also been sequenced (19, 35, 42). In 2006, Smith et al. reported the genome sequence of an isolate obtained from a CF individual aged 6 months and of a second isolate of the same strain from the same patient aged 96 months (35). This analysis revealed that the genome of the latter isolate had accrued mutations (base substitutions leading to nonsynonymous amino acid changes, as well as insertions and deletions) in 68 genes during this period. These genes included ORFs encoding virulence factors, multidrug efflux pumps, and quorum-sensing (QS) functions (34, 35, 42). Crucially, Smith et al. also provided evidence that the P. aeruginosa strain in this patient displayed population heterogeneity, with multiple contemporary sublineages being present. However, this was based upon the targeted sequence analysis of selected genes rather than a genome-wide analysis of contemporary isolates. A more extensive longitudinal study was reported recently by Cramer et al. (3). These workers sequenced the genome of a single early isolate, of an intermediate isolate, and of a late-stage isolate, each harvested from two patients. For one of the patients, just 15 SNPs and a large (156-kb) deletion arose during the 15-year observation period (no smaller indels were reported). For the second patient, no SNPs differentiated between the initial and intermediate isolate, although the late-stage isolate had acquired 959 SNPs. This high rate of mutation arose because the lineage acquired (and subsequently lost) a mutation in mutL (encoding a mismatch repair protein) between the last two sampling points. These workers concluded that in the absence of hypermutation, the intraclonal sequence diversity is remarkably low. Finally, Yang et al. examined genomic changes in a transmissible P. aeruginosa lineage, DK2, isolated from several patients attending a Danish clinic over the last 30 years (44). These workers proposed that genetic adaptation to the CF lung niche likely proceeded through rapid accumulation of a relatively small number of pleiotropic mutations shortly after initial colonization, followed by gradual genetic drift thereafter, and concluded that the DK2 lineage was highly homogenous. Critically, though, in none of these earlier studies was the genome of more than one contemporary (i.e., coeval) isolate sequenced and compared.
In this work, we picked two coeval isolates each from the sputum of three patients and compared the genomic differences within each pair. The genomic differences identified, the nature of the changes in each genome, and the phenotypic impact of these changes are discussed in relation to the chronic infection of the CF airways by this pathogen.
Freshly expectorated sputum was collected from consenting adult CF patients at Papworth Hospital (Papworth, Cambridge, United Kingdom) and treated as described by Foweraker et al. (10) to isolate single colonies. Each of the patients involved gave consent for their samples to be used for research via the Papworth tissue bank, approved by the local ethics committee. A selection of P. aeruginosa colonies were picked from each plate, and their genotype was confirmed by VNTR analysis. Two colonies from each patient's sputum were selected for detailed genomic analysis. Genomic DNA (gDNA) was extracted from overnight LB-grown cultures using a Qiagen kit according to the manufacturer's instructions. The gDNA from each sample was quantified using the Quant-iT double-stranded DNA (dsDNA) high-sensitivity assay kit with the Qubit Fluorometer (Invitrogen) and prepared for sequencing on the GAIIx using standard protocols. Briefly, 500 to 1,000 ng of each sample was sheared to around 400 bp using a Covaris S2. Following end repair and A-tailing, Illumina DNA adapters were ligated to the ends of the fragments. Samples were run on a 2% (wt/vol) agarose gel, and fragments of around 600 bp (equivalent to an ~400-bp insert) were excised. Ten cycles of PCR were carried out, and products were resolved on a 2% agarose gel. Final libraries were excised from the gel and quantified using an Agilent Bioanalyzer. Libraries were sequenced in two Illumina GAIIx runs following standard procedures. Briefly, libraries were denatured using sodium hydroxide and diluted to 8 to 10 pM in buffer HT1, and runs were initiated for 2 × 100 bases of sequencing by synthesis (SBS).
We aligned the reads to the Pseudomonas LESB58 reference genome using Illumina's aligner, Elandv2e. Based on these alignments, we called SNPs and indels using Illumina's Consensus Assessment of Sequence and Variation (CASAVA-1.8) software. Single nucleotide variants (SNVs) displaying low read depth (<100) or sequence ambiguity (>5% alternative base calls) were eliminated from further analysis. The reliability of base calls made using this threshold was confirmed by conventional (Sanger-based) sequencing of selected genes. Although this analysis also confirmed some sequence variants with lower read depths, we empirically found that the use of lower thresholds resulted in the inclusion of a larger proportion of false-positive variant calls (Table 1). Indels were scored only if >80 reads supported the proposed indel, and the number of reads not supporting the indel was negligible. Again, these parameters were confirmed as being robust by Sanger sequencing of selected indel-containing genes. All of the SNPs and indels that passed this quality control were confirmed by manual inspection of the data sets. De novo assemblies of the indicated isolates were done using the Velvet assembler v. 1.1.04 (46). For each assembly, we iterated over a small range of k-mer sizes and picked the size that yielded the largest N50 scaffold size. We used NCBI BLAST v. 2.2.24 and in-house perl scripts to search for structural variants and phage insertions.
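The quality thresholds and the k-mer selection rule described above can be illustrated with a minimal Python sketch. It is not the CASAVA or Velvet workflow itself: the `SnvCall` record, its field names, and the helper functions are hypothetical, and the reading of "sequence ambiguity" as the fraction of reads that disagree with the called base is an assumption made for illustration.

```python
from dataclasses import dataclass

MIN_DEPTH = 100           # SNVs covered by fewer reads are discarded
MAX_ALT_FRACTION = 0.05   # >5% discordant base calls counts as ambiguous


@dataclass(frozen=True)
class SnvCall:
    """Hypothetical record for one candidate SNV."""
    position: int     # 1-based coordinate on the reference
    depth: int        # total reads covering the site
    discordant: int   # reads that disagree with the called base


def passes_filter(call: SnvCall) -> bool:
    """Apply the read-depth and ambiguity thresholds to one call."""
    if call.depth < MIN_DEPTH:
        return False
    return call.discordant / call.depth <= MAX_ALT_FRACTION


def n50(lengths):
    """Scaffold length at which the running total (largest first)
    first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def best_kmer(assemblies):
    """assemblies maps k-mer size -> scaffold lengths of that assembly;
    return the k-mer size whose assembly has the largest N50."""
    return max(assemblies, key=lambda k: n50(assemblies[k]))
```

In this sketch a call with 150 reads and 3 discordant bases is kept, whereas a call with 80 reads is rejected regardless of how clean it is, mirroring the conservative choice of threshold described in the text.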
Unless otherwise stated, all assays were performed at least three times on independent biological replicates.
Supernatants from planktonic exponential cultures (grown in LB at 37°C and 300 rpm for 3 h) were collected and filter sterilized to remove bacterial cells. For BHL (N-butyryl-l-homoserine lactone) production, supernatants (100 μl) were incubated with JM109 (pSB536) (37) statically for 3 h at 30°C. For OdDHL (3-oxo-dodecanoyl-l-homoserine lactone) production, supernatants (100 μl) were incubated with JM109 (pSB1075) (41) on a rocking platform for 4 h at 30°C. Following incubation, bioluminescence was measured using Lucy 1 (Anthos Labtec Instruments, Austria). Synthetic BHL and OdDHL (both at 1 μM) were used as respective controls.
Cells seeded in static 96-well polystyrene microtiter plates were grown in LB at 37°C for 24 h. Following incubation, culture supernatant and nonadherent cells were removed and the wells were washed once with 300 μl water. Attached cells were stained for 30 min with 0.1% (wt/vol) crystal violet. The stained plates were rinsed three times with water, and the adsorbed dye was released by adding 50% (vol/vol) ethanol. Attachment was quantified by measuring the absorbance at 595 nm. All experiments were performed at least eight times on independent biological replicates.
Swimming motility was assessed at 30°C using semisolid (0.35%, wt/vol) LB agar plates, as previously described (48). Since the Liverpool Epidemic Strain (LES) displays very poor swimming motility, the plates were left for longer than usual (36 h) before measuring the diameter of the swim halo. Swarming motility was measured on 0.5% (wt/vol) Eiken agar-LB plates at 30°C, prepared as previously described (38). Twitching motility was measured on thin (1-mm) 1% (wt/vol) LB agar plates, as previously described (26).
Antibiotic susceptibility was measured using Etest strips at 30°C on LB agar plates.
Siderophore production was measured at 30°C using chrome azurol S (CAS) indicator plates, as described previously (31).
Pyocyanin production was scored visually by looking for a blue halo around the colonies when they were grown on Pseudomonas isolation agar.
Rhamnolipid production was measured on Siegmund-Wagner agar plates as previously described (33). Blue halos around the spotted colonies indicated rhamnolipid production. Where indicated, the plates were supplemented with chemically synthesized quorum-sensing molecules (kindly supplied by David Spring and James Hodgkinson, Department of Chemistry, University of Cambridge, United Kingdom).
Caseinase activity was measured by spotting aliquots (5-μl volume) of overnight culture onto skim milk plates containing 50 g/liter tryptic soy agar and 2% (wt/vol) skimmed milk. Plates were incubated at 37°C for 2 days before scoring for protease production (by measuring halo sizes). Gelatinase production was assayed similarly, except that the cultures were spotted onto plates containing 1.6% (wt/vol) agar, 13 g/liter nutrient broth, and 30 g/liter gelatin. Proteolytic halos were visualized after 48 h of incubation by flooding the plates with saturated ammonium sulfate solution.
The hypermutator assay was done as previously described by Cramer et al. (3).
PcrV is a structural protein that forms part of the type III secretion (T3S) apparatus. Planktonic cultures were grown in LB at 37°C at 300 rpm to exponential phase (3 h), and intracellular proteins were extracted as described previously (21) and quantified using the Bio-Rad DC protein assay kit, according to the manufacturer's instructions. Proteins (20 μg) were separated on 12% SDS-polyacrylamide gels, transferred to polyvinylidene fluoride (PVDF) membrane and analyzed by Western blotting with anti-PcrV antibodies. Enhanced chemiluminescence (ECL) peroxidase-labeled anti-rabbit antibodies (Sigma) were used as secondary antibodies. Blots were developed using Immobilon Western Chemiluminescent HRP substrate (Millipore). All experiments were performed at least three times on independent biological replicates.
Cells from planktonic exponential cultures of patient 7 isolates were grown in LB at 37°C and 300 rpm for 3 h. Cells were collected in RNAlater solution (Ambion) and incubated on ice for 1 h before sedimentation by centrifugation at 3,200 × g for 15 min at 4°C. Cell pellets were incubated in 1 mg/ml lysozyme for 15 min at room temperature, and total RNA was extracted using the RNeasy Minikit (Qiagen) according to the manufacturer's instructions. The resulting RNA (200 ng) was utilized as a template for reverse transcription and conversion into cDNA. PCRs were prepared with 5 μl of diluted cDNA (1:5 in water) as a template. Primers designed for the amplification of the rhlI, rhlR, or 16S rRNA-encoding genes were based on regions of the sequence that were identical in all isolates. For rhlI the primers were 5′-CTCTCTGAATCGCTGGAA-3′ and 5′-GACGATGTAGCGGGTTT-3′ (product size, 180 bp). For rhlR the primers were 5′-TGTGGTGGGACGGTTT-3′ and 5′-GCCAGGCCTTGGGATA-3′ (product size, 165 bp). PCR products were resolved on a 1% (wt/vol) agarose gel in the presence of ethidium bromide.
Supernatants from planktonic stationary cultures (grown in LB at 37°C and 300 rpm for 9 h) were collected and filter sterilized to remove bacterial cells. Pseudomonas quinolone signal (PQS) extraction and detection were carried out based on the methods described in reference 8. Briefly, an equal volume of acidified ethyl acetate (0.02% [vol/vol] glacial acetic acid) was added to the cell-free supernatants and vortexed vigorously. The organic phase was separated by centrifugation at 3,220 × g for 5 min and 2-alkyl 4-quinolones (AHQs) were precipitated by drying under vacuum. The solute was resuspended in 100 μl acidified ethyl acetate. PQS was detected by thin-layer chromatography (TLC) analysis. AHQs (5 μl) were spotted onto a TLC plate that had been soaked in 5% (wt/vol) KH2PO4 for 30 min and activated at 100°C for 1 h and were separated using 17:2:1 methylene chloride-acetonitrile-dioxane as a solvent. The plate was visualized under a UV transilluminator and photographed. Synthetic PQS was used as a reference (2 μl of a 10 mM stock concentration).
We wanted to know whether two randomly picked P. aeruginosa colonies from the expectorated sputum of CF patients would display genomic differences, and if so, to define the nature of these differences (SNP, indel, or horizontally acquired DNA). To take into account possible patient- or strain-specific idiosyncrasies, we sequenced the genomic DNA from pairs of isolates obtained from the sputa of three chronically infected adult CF patients (here referred to as patients 5, 6, and 7). Expectorated sputum samples were collected from the patients during their routine checkup visits to the Papworth Hospital CF clinic (Cambridge, United Kingdom). The patients (i) were all homozygous for the ΔF508 CFTR allele, (ii) were clinically stable at the time of sampling, and (iii) yielded expectorated sputum containing >10⁵ CFU P. aeruginosa per ml. All three patients were chronically infected with P. aeruginosa before they were referred to the Papworth adult CF clinic. Patient 5 (male, aged 34 years) had been chronically infected for at least 6 years prior to sampling and was on long-term oral azithromycin prophylaxis. Twenty-one days before sample harvesting, the patient underwent 6 days of treatment with meropenem, ciprofloxacin, and ceftazidime, followed by 13 days of treatment with ceftazidime and aztreonam. Patient 6 (male, 25 years) had been chronically infected for at least 18 months prior to sampling and was being treated with long-term inhaled colomycin or tobramycin (alternate months) and orally administered azithromycin. Patient 7 (female, 19 years) had been chronically infected for at least 2 years prior to sampling and was being treated with inhaled meropenem or tobramycin (alternate months) and oral flucloxacillin. Twenty-eight days before sample harvesting, the patient underwent 14 days of treatment with intravenous meropenem and tobramycin.
VNTR analysis on a selection of colonies from each sputum sample indicated that each of the three patients was infected with a different P. aeruginosa strain. None of these strains were known as “epidemic” pathovars. MLST analysis of the isolates from patients 5 and 7 yielded only partial hits with the MLST database. However, the sequences from the patient 6 isolates yielded a perfect match with strain 2724 (clonal complex ST245). This strain was originally isolated in Poland in 2005. Further investigation revealed that patient 6 was a visitor to the United Kingdom from Western Russia.
The sputum from patient 5 yielded roughly equal numbers of smooth nonmucoid colonies and mucoid colonies. Mucoidy is a common phenotypic conversion seen in CF isolates (2). One colony of each morphotype was selected for genome sequencing, here referred to as isolates 5S (smooth) and 5M (mucoid). The sputum from patient 6 yielded only mucoid colonies, while that of patient 7 contained only nonmucoid colonies. Two colonies each from patients 6 and 7 were selected for genome sequencing (here referred to as isolates 612b and 69a for patient 6 and 714b and 729b for patient 7). Genomic DNA was extracted from overnight cultures of each isolate and was sequenced as described in Materials and Methods. The mean depth of sequencing of all of the isolates was high (Table 1). As a control to establish the background rate of error associated with our sequencing methodology, we also extracted gDNA from a culture of minimally passaged LESB58 (kindly provided by Craig Winstanley, University of Liverpool, United Kingdom), which was the original isolate of the Liverpool Epidemic Strain (LES) used for genome sequencing (43). The sequence reads from this LES gDNA contained no SNPs relative to the published reference. This indicated that the sequencing technology we employed had a negligible error rate and that plating and overnight growth of the colonies were unlikely to introduce mutations. The sequence reads of one isolate from each patient were de novo assembled to yield genomes of 6,306,203 bp (isolate 5M, 102 scaffolds), 6,605,755 bp (isolate 612b, 187 scaffolds), and 6,161,118 bp (isolate 714b, 112 scaffolds). The only scaffolds that did not align with the LES reference were from the isolate 612b assembly. However, these scaffolds were found to be essentially identical to sequences within the exoU-containing P. aeruginosa pathogenicity island 2 (PAPI-2) and the PAGI-7 genomic island from strain PSE9 (1).
We used the Liverpool Epidemic Strain LESB58 as the reference genome for comparison. The LES is a CF-associated strain and was therefore considered to be a more appropriate comparator than strain PAO1, which was originally isolated from a wound. The sequence reads for each of our isolates were screened against the LESB58 reference genome to identify single base substitutions. Table 1 summarizes the data. Each isolate contained ca. 25,000 ± 1,000 single nucleotide variations relative to the LES reference, representing the extent of strain-to-strain variation (ca. 0.4%) within the conserved “core” genome of P. aeruginosa (19). Most (ca. 86%) of these base changes were located within predicted ORFs. This is broadly consistent with the data of Stover et al., who reported that protein-coding regions comprise a comparable fraction of the P. aeruginosa genome (36). The majority (~77%) of the SNVs were derived from transition events. Of the SNVs located within the coding DNA, ~67% gave rise to synonymous codon changes. Almost all of the nonsynonymous single nucleotide changes (~5,000 per isolate) gave rise to missense mutations; very few led to nonsense mutations (Table 1). A complete list of the nonsense mutations present in each isolate (relative to the LES reference) is provided in Table S1 in the supplemental material. Only one of these (Glu→Stop in LES_57801) was present in all of the isolates, although another gene, LES_49081 (encoding a hypothetical protein in the pilA-pilD operon) contained independent nonsense mutations in the patient 5 isolates (Ser→Stop) and the patient 7 isolates (Gln→Stop), perhaps indicative of parallel selective pressures in each lineage. In some instances, the nonsense mutations may well have had clinical or physiological significance. For example, mexB, which encodes a component of the principal broad-spectrum multidrug efflux pump in P. aeruginosa, carried a nonsense mutation that truncated the encoded protein in both isolates from patient 5 at residue Y891, leading to loss of the last four transmembrane helices, one of which is known to be essential for pump function (32). In both isolates from patient 6, the protein encoded by the quorum-sensing master regulator, lasR (25, 34, 42), was truncated at residue E124 within the OdDHL-binding domain. Finally, mutS, which encodes a mismatch repair protein (and which is responsible for the so-called “hypermutator” phenotype in some strains [20, 23]), contained a nonsense mutation that truncated the encoded protein in both isolates from patient 7 at residue S813, leading to loss of the last 42 amino acids. To assess whether this truncation had functional consequences, we measured the rate of acquisition of spontaneous rifampin resistance (mainly due to rpoB mutations) for each isolate, as described by Cramer et al. (3). The isolates from patients 5 and 6 all displayed negligible spontaneous Rif resistance. However, isolates 714b and 729b both displayed hypermutator phenotypes (mutation rates of 6 × 10⁻⁶ and 2.2 × 10⁻⁵, respectively).
Individual missense mutations also had clear functional consequences. For example, the LES strain is intrinsically resistant to the fluoroquinolone antibiotic ciprofloxacin due to a T83I mutation in GyrA (45). However, this mutation is absent in the gyrA carried by the isolates from patients 5 and 6, which encode a ciprofloxacin-sensitive GyrA protein. These isolates display correspondingly increased sensitivity to ciprofloxacin (compared to the LES) when grown in vitro (Table 2). A different fluoroquinolone resistance-conferring mutation was identified in isolates 714b and 729b. Here, a G→T transversion at nucleotide 259 in gyrA (the first base of codon 87) led to a D87Y substitution in the protein. The D87Y mutation has been previously shown to confer resistance to fluoroquinolones (14).
For any given isolate, most of the ca. 25,000 SNVs were also present in the corresponding paired isolate from the same patient's sputum. However, when comparing individual isolates from different patients, fewer than half of the SNVs associated with one isolate were also conserved in the other isolate. For example, of the ca. 25,000 SNVs common to both 5M and 5S, only around 11,500 were shared with the isolates from either patient 6 or patient 7. These data reinforce the notion that each patient is infected with a distinctly different strain of P. aeruginosa.
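The comparison described here amounts to set operations on variant calls. The following Python sketch shows the idea, assuming each variant is represented as a hypothetical `(position, reference base, variant base)` tuple relative to the LESB58 reference; the function names and the toy data are illustrative only and do not come from the study.

```python
def compare_pair(snvs_a, snvs_b):
    """Split the SNVs of two isolates into those shared by both and
    those unique to each. Inputs are sets of (position, ref, alt)."""
    shared = snvs_a & snvs_b
    return shared, snvs_a - snvs_b, snvs_b - snvs_a


def fraction_shared(query, other):
    """Fraction of the SNVs in `query` that are also present in `other`."""
    if not query:
        return 0.0
    return len(query & other) / len(query)


# Toy example with made-up coordinates
isolate_a = {(100, "A", "G"), (200, "C", "T"), (300, "G", "T")}
isolate_b = {(100, "A", "G"), (200, "C", "T"), (400, "T", "C")}
shared, only_a, only_b = compare_pair(isolate_a, isolate_b)
```

Variants in `shared` reflect differences between the infecting strain and the reference, whereas `only_a` and `only_b` correspond to the within-patient polymorphisms that distinguish the paired isolates.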
When comparing isolates from the same patient, SNPs were apparent, i.e., base substitutions unique to that isolate. For example, the 5M and 5S isolates differed due to a total of 54 unique SNPs (within 31 ORFs) in the conserved core genome (Table 1). The isolates from patient 7 showed even greater individual variation (344 SNPs, of which 320 were located within ORFs), while those from patient 6 were differentiated by only 1 SNP. Presumably, isolates 714b and 729b contained far more SNPs than those from patient 5 or patient 6 because of their hypermutator (mutS) phenotype. Notably, >95% of the SNPs in isolates 714b and 729b were transitions. This is consistent with the known preference of the mismatch repair system for correcting this type of mutation (17). A full list of the SNPs present in each isolate is given in Table S2 in the supplemental material.
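For readers who wish to reproduce the transition/transversion tally, a minimal Python sketch follows. Transitions are purine-to-purine (A↔G) or pyrimidine-to-pyrimidine (C↔T) substitutions; all other substitutions are transversions. The function names and the input format (a list of reference/variant base pairs) are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and variant base are identical")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(changes):
    """Fraction of (ref, alt) substitutions that are transitions."""
    if not changes:
        return 0.0
    hits = sum(1 for ref, alt in changes if is_transition(ref, alt))
    return hits / len(changes)
```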
Around 68% of the SNPs present in ORFs in isolates 714b and 729b were nonsynonymous (NS-SNPs). The single SNP in the patient 6 isolates also led to a nonsynonymous codon change (T→P in rpoB). In contrast, of the 51 SNPs that were located within ORFs in isolates 5M and 5S, a slight majority (53%) yielded synonymous codons. However, these data are not as disparate as it may at first appear; the calculated value of dN/dS (the ratio of actual/expected nonsynonymous [dN] and synonymous [dS] base changes) for the combined patient 5 isolates was 0.34, and dN/dS values for the patient 7 isolates were 0.79 (isolate 729b) and 0.47 (isolate 714b). The value of dN/dS could not be calculated for the single SNP associated with the patient 6 isolates. Therefore, in all measurable cases, the dN/dS value was <1. This is a strong signature of ongoing negative (purifying) selection.
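To make the dN/dS ratio concrete, the sketch below computes it as (observed nonsynonymous changes per nonsynonymous site) divided by (observed synonymous changes per synonymous site), with site counts estimated per codon in the manner of Nei and Gojobori. This is an illustrative simplification and not necessarily the procedure used in this study: it applies no multiple-hit correction, treats changes to stop codons as nonsynonymous, and all function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (0 to 3)."""
    sites = 0.0
    for i in range(3):
        for base in BASES:
            if base == codon[i]:
                continue
            mutant = codon[:i] + base + codon[i + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                sites += 1 / 3
    return sites


def site_counts(cds):
    """Return (nonsynonymous sites, synonymous sites) for an in-frame
    coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    syn = sum(synonymous_sites(c) for c in codons)
    return 3 * len(codons) - syn, syn


def dn_ds(nonsyn_observed, syn_observed, cds):
    """dN/dS, or None when the ratio is undefined (for example when no
    synonymous change was observed)."""
    n_sites, s_sites = site_counts(cds)
    if syn_observed == 0 or s_sites == 0 or n_sites == 0:
        return None
    return (nonsyn_observed / n_sites) / (syn_observed / s_sites)
```

With a single nonsynonymous SNP and no synonymous SNPs, `dn_ds` returns `None`, which corresponds to the situation described for the patient 6 isolates.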
Indels, defined here as insertions or deletions of <300 bp, were present in all of the isolates that we examined. Overall, each isolate contained >200 genes disrupted by indels (with many genes containing more than one indel) relative to the LES reference. These indels sometimes led to frameshifting of the downstream codons and were therefore likely to be of functional significance. For example, all of the mucoid isolates contained indels in the anti-sigma factor mucA. Isolate 5M contained a 7-bp insertion (a tandem duplication of the sequence C298TGGCCG304) that frameshifted the encoded protein from residue 101 onwards. Isolates 612b and 69a both contained a single nucleotide deletion of base G430 in mucA, leading to a frameshift after residue 143 in the encoded protein. Both types of indel were confirmed by Sanger sequencing.
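The relationship between an indel's nucleotide coordinate, its length, and its effect on the reading frame can be checked with simple arithmetic, sketched here in Python. Coordinates are assumed to be 1-based from the first base of the start codon, and the function names are hypothetical.

```python
def codon_number(nt_position):
    """1-based codon (residue) number containing a 1-based nucleotide."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


def causes_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a
    multiple of three."""
    return indel_length % 3 != 0


# Deletion of base G430 in mucA: the first affected codon is 144,
# so residues up to and including 143 are unchanged.
first_affected = codon_number(430)
last_intact = first_affected - 1
```

The same arithmetic shows why the 7-bp duplication in isolate 5M is frameshifting, whereas an indel whose length is a multiple of three would add or remove whole codons without disturbing the downstream frame.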
A comparison of the indels present in one but not the other of the paired isolates from each patient revealed indel-associated polymorphisms (IPs). For example, the isolates from patient 5 differed due to indels at 38 locations (27 of these within annotated ORFs) in the conserved core genome. The patient 6 isolates differed due to 8 IPs (although only 2 of these were located within annotated ORFs). The patient 7 isolates differed due to 93 IPs that disrupted 65 core ORFs (Table 1). A complete list of the IPs present in each isolate is given in Table S3 in the supplemental material. We conclude that IPs and SNPs are potentially comparable drivers of phenotypic variation.
One key gene affected by indels was the regulator of quorum sensing lasR. In both isolates from patient 5, lasR contained a single nucleotide deletion (of base T230) leading to a frameshift after residue serine 77 in the encoded protein. Consistent with this, neither isolate produced OdDHL (Fig. 1). A more extreme deletion affected lasR in isolates 714b and 729b. Here, a 160-bp deletion spanning nucleotides −156 (relative to the ATG start codon) to +4 essentially abolished expression of the gene. Recalling that both isolates from patient 6 also encoded a truncated LasR (due to a nonsense mutation), we conclude that all of the isolates that we examined contained clear loss-of-function lasR mutations. In each case, these mutations were confirmed by Sanger sequencing. Mutations in lasR are known to arise frequently in CF (4, 12, 30) and are associated with a poorer clinical prognosis (13).
Under laboratory growth conditions, production of quorum-sensing signal molecule BHL is under the control of the las subcomponent of the QS system (25). Interestingly, although it carries a defective lasR gene, isolate 714b (and to a lesser extent, also its paired partner, isolate 729b) produced quantities of BHL comparable to those of the LES strain (Fig. 1). RT-PCR analyses confirmed that the LES (wild type for lasR) and isolates 714b/729b (both lasR mutants) each expressed rhlI and rhlR at comparable levels (see Fig. S1 in the supplemental material). Moreover, isolates 714b and 729b both produced copious quantities of rhamnolipid, a virulence factor that is primarily BHL regulated (24) (Table 2). These results are consistent with the recent data of Dekimpe and Déziel (5) showing that under some situations, the rhl signaling system can be active in the absence of a functional LasR protein.
In addition to lasR, another important QS regulator, pqsR (also known as mvfR; a LysR-type transcriptional regulator [LTTR]) contained a 12-bp deletion (G100CGGTCAGCTCG111 in the LES orthologue) in isolate 612b. MvfR is the only known receptor for the Pseudomonas quinolone signal (PQS) and is postulated to play an important role in virulence by linking the las and rhl components of the AHL-dependent QS system (8). The IP in 612b is slightly unusual in that it seems to restore what appears to be a simple ancestral motif (“-TAVS-,” which is present in most LTTRs) that has apparently been duplicated in PqsR to yield T33AVSSAVS40. The deletion in 612b leads to excision of an amino acid segment, AVSS, from this sequence to yield T33AVS36. This deletion is predicted to directly affect the helix-turn-helix motif of the protein and so may have functional consequences. Another indel affecting PQS was seen in isolate 729b. Here, the gene encoding PqsE (a protein required for the response to PQS) contained a single nucleotide deletion (base G565) leading to a frameshift after position 188 in the protein.
Sequences partially similar to the F10-like regions of LES prophage 2 and LES prophage 3 (43) were present in isolates 5S and 729b, but not in their cognate paired partners or in the patient 6 isolates. This was graphically apparent when we compared the read depths for these prophage signatures in the paired isolates; Fig. S2 in the supplemental material shows the sequence reads from the patient 7 isolates mapped onto the prophage 3-containing region of the LES reference (similar data were obtained for the patient 5 isolates; see Fig. S3 in the supplemental material). Further de novo assembly of the sequence reads revealed that unlike the LES prophages 2 and 3, which are physically distinct units, the F10-like prophage signatures in isolate 729b comprise a single unit that localizes to just one 43-kb contig. PCR-based analyses confirmed the presence of this prophage sequence in the purified genomic DNA of isolate 729b, yet when we screened the gDNA from a selection of additional P. aeruginosa colonies coisolated with the 714b/729b samples, no evidence of the prophage was found (see Fig. S4 in the supplemental material). This suggests that this prophage is probably present only at low frequency in the population. Presumably, the added fitness cost of replicating the prophage might account for this. However, it may also be advantageous for the population to maintain a small but latent prophage reservoir: as long as all members of the population are resistant to infection by the encoded prophages, such phages may protect the niche from superinfection by susceptible P. aeruginosa strains (i.e., a mechanism of kin selection). 
The differential presence of specific prophage signatures in paired isolates derived from the same patient suggests that either (i) these prophages were acquired after the initial infecting clone established itself or (ii) these sequences were present in the initial infecting clone but have become differentially lost during the development of chronic infection.
The SNPs and IPs contained within the isolates from patient 5 (i.e., those encompassing the median number of genetic differences) are detailed in Tables 3 and 4, respectively. Notably, isolate 5S produced less siderophore than isolate 5M (Table 2). Consistent with this, isolate 5S contained an NS-SNP in pchD (a key pyochelin biosynthetic enzyme). Pyochelin is one of the major siderophores of P. aeruginosa. Isolate 5S also contained a 13-bp deletion in metF (required for methionine biosynthesis). Consequently, unlike isolate 5M, isolate 5S was auxotrophic and was unable to grow on M9 minimal medium. However, isolate 5S was able to grow when the M9 medium was supplemented with methionine (data not shown). Both of the patient 5 isolates displayed very low levels of PcrV expression compared with the LES control (see Fig. S5 in the supplemental material). PcrV is a protein involved in type III secretion. PcrV expression was almost undetectable in isolate 5S, and the protein was present at only very low levels in isolate 5M. This was surprising, since the type III secretion system antiactivator exsD carried a deletion of 7 bp (G99CCGGGT105) in both isolates, which would normally lead to increased PcrV expression. However, isolate 5S contained a point mutation (S188→F) located directly within the predicted helix-turn-helix motif of the principal transcriptional activator of type III secretion, ExsA. Given its location and nonconservative nature, this mutation is likely to have had a detrimental effect on ExsA function, thereby accounting for the absence of PcrV expression in isolate 5S. The low level of PcrV expression in the mucoid isolate 5M may have a different cause. In P. aeruginosa, the T3S system is activated by a cyclic AMP receptor protein (CRP) homolog, Vfr (7). For reasons that are not yet clear, mucA mutants display impaired Vfr-dependent virulence factor production and, therefore, lower T3S system expression (15).
This impaired Vfr signaling may also explain why 5M produced less secreted caseinase activity than the mucA+ isolate, 5S (Table 2). This explanation presumably also accounts for the low level of T3S and diminished secreted protease activities in the patient 6 isolates, both of which carry a mucA mutation. We conclude that in many cases, there is a clear link between the genotypic differences we observed in each isolate and the corresponding phenotype. However, there were exceptions to this. For example, mucoidy has been strongly linked to biofilm formation (reviewed in reference 2), and indeed, both of the (mucoid) patient 6 isolates formed robust biofilms in vitro (Table 2). In contrast, neither of the patient 5 isolates formed biofilms in vitro, in spite of the mucoid phenotype of isolate 5M. Clearly, either the acquisition of mucoidy is insufficient to promote biofilm formation per se, or these isolates have acquired other mutations that suppress biofilm formation at a more profound level.
Twenty-one of the 30 SNPs present in isolate 5M (Table 3) were located in just six contiguous ORFs, corresponding to genes LES_07061 → LES_07101 (fptA-phzS). Isolate 5M also carried a 9-bp insertion in the same region (Table 4). This may indicate that this stretch of DNA is a hot spot for mutation. However, the corresponding region of DNA in isolate 5S was devoid of SNPs/indels. A likely explanation for this is that this segment of DNA in isolate 5M (or equally, 5S) is the product of a recombination event with DNA from another lineage. In support of this, many of the SNPs that differentiate isolate 5M from isolate 5S are identical to SNVs present in the isolates from patients 6 and/or 7 (presumably, these SNVs are widespread in the global P. aeruginosa gene pool). Interestingly, this region carries a gene (LES_07091; ampP) that is required for the induction of endogenous β-lactamase activity (16) and so may be important for survival in the CF lung niche, especially during antibiotic challenge.
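To give a rough sense of how unlikely such clustering would be by chance, the sketch below computes a binomial tail probability. It is an illustration rather than part of the original analysis: it assumes that SNPs fall independently and uniformly across ORFs (ignoring differences in ORF length), and the total ORF count `N_ORFS` is a hypothetical placeholder, not a value taken from the LES annotation.

```python
from math import comb

N_ORFS = 5900          # hypothetical placeholder for the number of ORFs in the genome
REGION_ORFS = 6        # contiguous ORFs in the fptA-phzS region (from the text)
TOTAL_SNPS = 30        # SNPs in isolate 5M (from the text)
REGION_SNPS = 21       # SNPs falling in the region (from the text)


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


fraction = REGION_SNPS / TOTAL_SNPS
p_region = REGION_ORFS / N_ORFS
p_value = binomial_tail(TOTAL_SNPS, REGION_SNPS, p_region)

print(f"fraction of SNPs in region: {fraction:.2f}")
print(f"P(>= {REGION_SNPS} of {TOTAL_SNPS} SNPs in region by chance): {p_value:.3g}")
```

A vanishingly small probability under this null model indicates only that the clustering is nonrandom. It cannot by itself distinguish a mutational hot spot from a recombination event, which is why the comparison with isolate 5S and with the isolates from the other patients is informative.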
Another example of potential “hot spotting” can be seen in isolate 5S. Here, 8 of the 24 SNPs are located within a single gene (LES_34491, annotated as a mucin-17 like precursor). These SNPs were independently confirmed by Sanger sequencing. Although most of these SNPs are synonymous, three are NS-SNPs, two of which (M→R and S→P) are nonconservative and therefore may affect function. All of the SNPs fall within a short region (residues 639 to 968) of the 2,715-residue protein encoded by LES_34491. Notably, in the paired isolate 5M this gene also has a mutation (an 84-bp insertion, giving rise to insertion of 28 amino acids following residue 1141 in the encoded protein) that is likely to affect function. LES_34491 may therefore be the subject of differential degradation in both patient 5 isolates. Zhang and Mah (47) have recently shown that LES_34491 encodes a LapA-like outer membrane protein. Moreover, the gene cluster encompassing LES_34491 is expressed only during the biofilm mode of growth and confers resistance to a range of clinically relevant antibiotics. However, as noted above, isolates 5M and 5S were unable to form robust biofilms when grown in vitro (Table 2). This impaired ability to form biofilms may explain why LES_34491 has become redundant and is degrading.
One undoubted mutational hot spot was gene LES_56471 (algP), which encodes a histone H1-like regulator of mucoidy. AlgP contains around 45 tandemly repeated KPAA units, and the ORF has been previously shown to undergo rearrangement at high frequency in CF isolates, especially with respect to the number of repeated units it contains (6). Consistent with this, the algP in isolate 69a (but not that in isolate 612b) contained a 12-bp insertion that duplicated the amino acid sequence K173PAA176. In contrast, both isolates from patient 7 contained a 12-bp deletion at the same site (this time, removing a KPAA motif). The isolates from patient 5 contained yet a different mutation; here, a 24-bp deletion removed residues V217AKPAAKP224 from the encoded AlgP protein.
We have already alluded to the fact that the paired isolates from patients 5 and 7 contained different nonsense mutations in LES_49081 (a hypothetical gene in the pilA-D cluster) and that lasR was inactivated in all of the isolates examined. That such mutations arise independently within the same genes provides support for the notion that parallel selection pressures are in operation in the different patients. Another example of this is seen in isolates 5M and 5S, which each contained different NS-SNPs in the gene encoding elongation factor G (fusA1) (Table 3). Similarly, isolates 714b and 729b also carried nonidentical NS-SNPs in fusA1 (see Table S2 in the supplemental material). The fact that this gene has been the target of four independent nonconservative NS-SNPs strongly suggests that these mutations may confer some advantage in the CF lung environment. In Salmonella enterica serovar Typhimurium, mutations in fusA1 can confer resistance to fusidic acid but also lead to pleiotropic phenotypes, including reduced virulence and altered intracellular (p)ppGpp levels (18). The ppGpp-dependent stringent response has been previously shown to modulate virulence in P. aeruginosa (39). Since downregulation of virulence phenotypes often accompanies adaptation to the CF lung environment, it may be that mutation of fusA1 is one mechanism that contributes toward this.
The CF lung presents an intricately structured, chemically heterogeneous environment for microbial colonization. In such a highly differentiated habitat, nonuniform resource availability and spatial heterogeneity virtually guarantee that adaptive radiation (and therefore niche differentiation) will occur. We would therefore expect microvariation among coeval CF isolates from chronically infected patients, yet until now this possibility had not been investigated at a genome-wide level. To begin to address this issue, we used next-generation sequencing technology to assess the full vista of genomic differences that discriminate between randomly selected pairs of coeval P. aeruginosa isolates obtained from three CF patients. Our data show that the isolates comprising each pair differ from one another at the SNP, indel, and accessory genome level. Very few of the changes that we observed would have been detected in sparse sampling approaches such as MLST or microarray-based profiling. Moreover, since we used a very stringent cutoff threshold for SNP/indel identification (see Materials and Methods), the data reported here are likely to represent the lower bounds of the true diversity present.
Because only two isolates were analyzed from each patient, we cannot comment objectively on the extent of intraclonal diversity at a population-wide level. Furthermore, since this study is cross-sectional rather than longitudinal in nature, we do not know the position of each isolate on its specific evolutionary trajectory. For example, the patient 6 isolates (which displayed the fewest differences among the pairs) might have been recently diverged siblings positioned on the same clade. Alternatively, these isolates might have been sampled from a particularly stable population. These caveats notwithstanding, our data clearly indicate that there is genomic heterogeneity among coeval P. aeruginosa CF isolates and that in some cases the number of genomic differences between paired isolates can equal (or even exceed) the differences reported between longitudinally harvested CF isolates. For example, Cramer et al. reported that patient “RN” accrued 15 SNPs over a period of ca. 15 years (no indels were reported). Similarly, in the two longitudinal isolates sequenced by Smith et al. (35), 41 SNPs and 27 indels appeared over a 7.5-year period. By comparison, in the current work, we show that the two contemporary patient 5 isolates (which displayed the median number of genomic changes) differed due to 54 SNPs and 38 indels. The cross-sectional diversity among paired contemporary P. aeruginosa CF isolates can therefore be comparable to the longitudinally sampled diversity.
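The comparison above can be laid out numerically. The sketch below uses only the counts quoted in this paragraph; the data structure and function names are illustrative, and the per-year figures are simple averages that assume a constant rate of accumulation, which is unlikely to hold in practice.

```python
# Counts quoted in the text; indels=None means none were reported,
# years=None means the isolates were sampled at the same time.
comparisons = {
    "Cramer et al., patient RN (longitudinal)": {"snps": 15, "indels": None, "years": 15.0},
    "Smith et al. (longitudinal)": {"snps": 41, "indels": 27, "years": 7.5},
    "This study, patient 5 (coeval pair)": {"snps": 54, "indels": 38, "years": None},
}


def total_differences(entry: dict) -> int:
    """SNPs plus indels, treating unreported indels as zero."""
    return entry["snps"] + (entry["indels"] or 0)


def snps_per_year(entry: dict):
    """Average SNPs per year, or None for a cross-sectional (coeval) comparison."""
    if entry["years"] is None:
        return None
    return entry["snps"] / entry["years"]


for name, entry in comparisons.items():
    rate = snps_per_year(entry)
    rate_text = "n/a" if rate is None else f"{rate:.2f} SNPs/year"
    print(f"{name}: {total_differences(entry)} differences, {rate_text}")
```

No rate is computed for the coeval pair because the two isolates were sampled together; the point of the comparison is the total count, 92 differences versus 68 and 15 for the two longitudinal studies.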
The current study is the first to highlight the potential impact that acquisition of a hypermutator phenotype can have on contemporary genome-wide variability among CF isolates. Hypermutation is common in CF isolates and often leads to accelerated antibiotic resistance and diminished patient longevity. The two patient 7 isolates studied here differed due to nearly 350 SNPs, the majority of which were NS-SNPs. However, the calculated value of dN/dS for these isolates was <1, indicating that the accrued SNPs are already the subject of ongoing purifying selection. Thus, acquisition of a “runaway” mutation rate does not allow the lineage to outpace selective pressures in the CF lung.
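It may seem counterintuitive that most of the SNPs are nonsynonymous while dN/dS is below 1. The reason is that dN and dS are normalized by the numbers of nonsynonymous and synonymous sites, respectively, and nonsynonymous sites are considerably more numerous in coding sequence. The sketch below illustrates this with uncorrected proportions; every number in it (the 200/144 split of SNPs and both site counts) is a hypothetical placeholder chosen for illustration, not data from this study, and no multiple-hit correction is applied.

```python
def dn_ds(ns_snps: int, s_snps: int, ns_sites: float, s_sites: float) -> float:
    """Uncorrected dN/dS: per-site nonsynonymous over per-site synonymous differences."""
    dn = ns_snps / ns_sites
    ds = s_snps / s_sites
    return dn / ds


# Hypothetical inputs: a majority of NS-SNPs, and roughly 3 nonsynonymous
# sites for every synonymous site.
ratio = dn_ds(ns_snps=200, s_snps=144, ns_sites=3_000_000, s_sites=1_000_000)
print(f"dN/dS = {ratio:.2f}")
```

With these placeholder values the ratio is about 0.46, showing how a set of SNPs dominated by NS-SNPs can still yield dN/dS < 1.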
In summary, our data show that randomly sampled pairs of coeval CF isolates can display genomic heterogeneity at every level of resolution and that most of the observed variation would have escaped detection by any method other than whole-genome sequencing. Due to minimal sampling, the work described here provides only a glimpse of the likely contemporary variation present in the P. aeruginosa population of the CF lung. Nevertheless, given the genomic variation observed between just two randomly sampled isolates from each patient, an inescapable extrapolation is that the P. aeruginosa population in these patients is likely to be globally genetically heterogeneous. Maintenance of a reservoir of genomic microvariants (which may well be centered around a local fitness peak) could provide a selective advantage to the population by conferring a degree of preexisting adaptive potential (the Red Queen effect). Indeed, it has been noted by previous workers that coeval CF isolates display substantial variation in clinically important phenotypes such as antibiotic susceptibility (10, 22), and our own phenotypic analyses (Table 2) confirm this. Current efforts are aimed at obtaining a more detailed picture of how this natural genetic microvariation impacts cell physiology and gene expression in a wider range of coeval isolates.
This work was supported in part by a studentship awarded to Jade C. S. Chung from the BBSRC and by the Cambridge Newton Trust. We thank Illumina for funding the genome sequencing and carrying out the preliminary analysis of the data at their Great Chesterford facility.
Dervla Kenna (Health Protection Agency, Colindale, United Kingdom) is thanked for carrying out the VNTR analyses. We thank Craig Winstanley and Aras Kadiolglu (University of Liverpool), George Salmond (University of Cambridge), Julian Parkhill (The Sanger Centre, Hinxton, United Kingdom), and George Weinstock (University of Washington) for helpful discussions. David Spring and James Hodgkinson supplied the synthetic PQS and AHLs.
Published ahead of print 29 June 2012
Supplemental material for this article may be found at http://jb.asm.org/.