We wanted to know whether two randomly picked P. aeruginosa colonies from the expectorated sputum of CF patients would display genomic differences, and if so, to define the nature of these differences (SNP, indel, or horizontally acquired DNA). To take into account possible patient- or strain-specific idiosyncrasies, we sequenced the genomic DNA from pairs of isolates obtained from the sputa of three chronically infected adult CF patients (here referred to as patients 5, 6, and 7). Expectorated sputum samples were collected from the patients during their routine checkup visits to the Papworth Hospital CF clinic (Cambridge, United Kingdom). The patients (i) were all homozygous for the ΔF508 CFTR allele, (ii) were clinically stable at the time of sampling, and (iii) yielded expectorated sputum containing >10⁵ CFU P. aeruginosa per ml. All three patients were chronically infected with P. aeruginosa before they were referred to the Papworth adult CF clinic. Patient 5 (male, aged 34 years) had been chronically infected for at least 6 years prior to sampling and was on long-term oral azithromycin prophylaxis. Twenty-one days before sample harvesting, the patient underwent 6 days of treatment with meropenem, ciprofloxacin, and ceftazidime, followed by 13 days of treatment with ceftazidime and aztreonam. Patient 6 (male, 25 years) had been chronically infected for at least 18 months prior to sampling and was being treated with long-term inhaled colomycin or tobramycin (alternate months) and orally administered azithromycin. Patient 7 (female, 19 years) had been chronically infected for at least 2 years prior to sampling and was being treated with inhaled meropenem or tobramycin (alternate months) and oral flucloxacillin. Twenty-eight days before sample harvesting, the patient underwent 14 days of treatment with intravenous meropenem and tobramycin.
VNTR analysis on a selection of colonies from each sputum sample indicated that each of the three patients was infected with a different P. aeruginosa strain. None of these strains were known as “epidemic” pathovars. MLST analysis of the isolates from patients 5 and 7 yielded only partial hits with the MLST database. However, the sequences from the patient 6 isolates yielded a perfect match with strain 2724 (clonal complex ST245). This strain was originally isolated in Poland in 2005. Further investigation revealed that patient 6 was a visitor to the United Kingdom from Western Russia.
The sputum from patient 5 yielded roughly equal numbers of smooth nonmucoid colonies and mucoid colonies. Mucoidy is a common phenotypic conversion seen in CF isolates (2). One colony of each morphotype was selected for genome sequencing, here referred to as isolates 5S (smooth) and 5M (mucoid). The sputum from patient 6 yielded only mucoid colonies, while that of patient 7 contained only nonmucoid colonies. Two colonies each from patients 6 and 7 were selected for genome sequencing (here referred to as isolates 612b and 69a for patient 6 and 714b and 729b for patient 7). Genomic DNA was extracted from overnight cultures of each isolate and was sequenced as described in Materials and Methods. The mean depth of sequencing of all of the isolates was high. As a control to establish the background rate of error associated with our sequencing methodology, we also extracted gDNA from a culture of minimally passaged LESB58 (kindly provided by Craig Winstanley, University of Liverpool, United Kingdom), which was the original isolate of the Liverpool Epidemic Strain (LES) used for genome sequencing (43). The sequence reads from this LES gDNA contained no SNPs relative to the published reference. This indicated that the sequencing technology we employed had a negligible error rate and that plating and overnight growth of the colonies were unlikely to introduce mutations. The sequence reads of one isolate from each patient were de novo assembled to yield genomes of 6,306,203 bp (isolate 5M, 102 scaffolds), 6,605,755 bp (isolate 612b, 187 scaffolds), and 6,161,118 bp (isolate 714b, 112 scaffolds). The only scaffolds that did not align with the LES reference were from the isolate 612b assembly. However, these scaffolds were found to be essentially identical to sequences within the exoU-containing P. aeruginosa pathogenicity island 2 (PAPI-2) and the PAGI-7 genomic island from strain PSE9 (1).
Single base substitutions. (i) Global trends.
We used the Liverpool Epidemic Strain LESB58 as the reference genome for comparison. The LES is a CF-associated strain and was therefore considered to be a more appropriate comparator than strain PAO1, which was originally isolated from a wound. The sequence reads for each of our isolates were screened against the LESB58 reference genome to identify single base substitutions. summarizes the data. Each isolate contained ca. 23,000 ± 1,000 single nucleotide variations (SNVs) relative to the LES reference, representing the extent of strain-to-strain variation (ca. 0.4%) within the conserved “core” genome of P. aeruginosa. Most (ca. 86%) of these base changes were located within predicted ORFs. This is broadly consistent with the data of Stover et al., who reported that protein-coding regions comprise a comparable fraction of the P. aeruginosa genome. The majority (~77%) of the SNVs were derived from transition events. Of the SNVs located within the coding DNA, ~67% gave rise to synonymous codon changes. Almost all of the nonsynonymous single nucleotide changes (~5,000 per isolate) gave rise to missense mutations; very few led to nonsense mutations. A complete list of the nonsense mutations present in each isolate (relative to the LES reference) is provided in Table S1 in the supplemental material. Only one of these (Glu→Stop in LES_57801) was present in all of the isolates, although another gene, LES_49081 (encoding a hypothetical protein in the pilA-pilD operon), contained independent nonsense mutations in the patient 5 isolates (Ser→Stop) and the patient 7 isolates (Gln→Stop), perhaps indicative of parallel selective pressures in each lineage. In some instances, the nonsense mutations may well have had clinical or physiological significance. For example, mexB, which encodes a component of the principal broad-spectrum multidrug efflux pump in P. aeruginosa, carried a nonsense mutation that truncated the encoded protein in both isolates from patient 5 at residue Y891, leading to loss of the last four transmembrane helices, one of which is known to be essential for pump function (32). In both isolates from patient 6, the protein encoded by the quorum-sensing master regulator gene, lasR, was truncated at residue E124 within the OdDHL-binding domain. Finally, mutS, which encodes a mismatch repair protein (and which is responsible for the so-called “hypermutator” phenotype in some strains [20]), contained a nonsense mutation that truncated the encoded protein in both isolates from patient 7 at residue S813, leading to loss of the last 42 amino acids. To assess whether this truncation had functional consequences, we measured the rate of acquisition of spontaneous rifampin resistance (mainly due to rpoB mutations) for each isolate, as described by Cramer et al. (3). The isolates from patients 5 and 6 all displayed negligible spontaneous Rif resistance. However, isolates 714b and 729b both displayed hypermutator phenotypes (mutation rates of 6 × 10⁻⁶ and 2.2 × 10⁻⁵, respectively).
Individual missense mutations also had clear functional consequences. For example, the LES is intrinsically resistant to the fluoroquinolone antibiotic ciprofloxacin due to a T83I mutation in GyrA (45). However, this mutation is absent in the gyrA genes carried by the isolates from patients 5 and 6, which encode a ciprofloxacin-sensitive GyrA protein. These isolates display correspondingly increased sensitivity to ciprofloxacin (compared to the LES) when grown in vitro. A different fluoroquinolone resistance-conferring mutation was identified in isolates 714b and 729b. Here, a G→T transversion at nucleotide 258 in gyrA led to a D87Y substitution in the protein. The D87Y mutation has been previously shown to confer resistance to fluoroquinolones (14).
Phenotypes associated with each isolate
For any given isolate, most of the ca. 25,000 SNVs were also present in the corresponding paired isolate from the same patient's sputum. However, when comparing individual isolates from different patients, fewer than half of the SNVs associated with one isolate were also conserved in the other isolate. For example, of the ca. 25,000 SNVs common to both 5M and 5S, only around 11,500 were shared with the isolates from either patient 6 or patient 7. These data reinforce the notion that each patient is infected with a distinctly different strain of P. aeruginosa.
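The shared-versus-unique comparison described above reduces to set operations on variant calls. The following is a minimal sketch, assuming each isolate's SNVs have been reduced to (position, alternate base) pairs relative to the LESB58 reference. The function name, the keying scheme, and the toy coordinates are hypothetical and are not taken from the study's pipeline or data.

```python
from typing import Dict, Set, Tuple

# An SNV is keyed by (reference position, alternate base); this keying is an assumption.
Snv = Tuple[int, str]


def compare_snvs(a: Set[Snv], b: Set[Snv]) -> Dict[str, Set[Snv]]:
    """Partition two isolates' SNV sets into shared and isolate-specific calls."""
    return {
        "shared": a & b,   # present in both isolates
        "only_a": a - b,   # unique to the first isolate
        "only_b": b - a,   # unique to the second isolate
    }


# Toy, made-up calls for two hypothetical isolates (not real data).
isolate_a = {(101, "T"), (2050, "G"), (3999, "A")}
isolate_b = {(101, "T"), (2050, "C"), (7000, "A")}

result = compare_snvs(isolate_a, isolate_b)
print(len(result["shared"]), len(result["only_a"]), len(result["only_b"]))
```

Keying on both position and alternate base means that two isolates carrying different substitutions at the same site are counted as distinct, which matches the idea of a substitution being unique to one isolate.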
(ii) Single nucleotide polymorphisms.
Comparison of isolates from the same patient revealed SNPs, i.e., base substitutions unique to one isolate of the pair. For example, the 5M and 5S isolates differed due to a total of 54 unique SNPs (within 31 ORFs) in the conserved core genome. The isolates from patient 7 showed even greater individual variation (344 SNPs, of which 320 were located within ORFs), while those from patient 6 were differentiated by only 1 SNP. Presumably, isolates 714b and 729b contained far more SNPs than those from patient 5 or patient 6 because of their hypermutator (mutS) phenotype. Notably, >95% of the SNPs in isolates 714b and 729b were transitions. This is consistent with the known preference of the mismatch repair system for correcting this type of mutation (17). A full list of the SNPs present in each isolate is given in Table S2 in the supplemental material.
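The transition/transversion distinction used above can be made concrete with a short sketch. It assumes substitutions are supplied as (reference base, alternate base) pairs drawn from A, C, G, and T; the function names and the toy calls are hypothetical. A transition exchanges one purine for another (A↔G) or one pyrimidine for another (C↔T); every other substitution is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(substitutions) -> float:
    """Fraction of (ref, alt) pairs that are transitions."""
    subs = list(substitutions)
    if not subs:
        raise ValueError("no substitutions supplied")
    return sum(is_transition(r, a) for r, a in subs) / len(subs)


# Toy, made-up substitutions (not real data): three transitions, one transversion.
calls = [("G", "A"), ("C", "T"), ("A", "G"), ("G", "T")]
print(transition_fraction(calls))
```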
Around 68% of the SNPs present in ORFs in isolates 714b and 729b were nonsynonymous (NS-SNPs). The single SNP in the patient 6 isolates also led to a nonsynonymous codon change (T→P in rpoB). In contrast, of the 51 SNPs that were located within ORFs in isolates 5M and 5S, a slight majority (53%) yielded synonymous codons. However, these data are not as disparate as it may at first appear; the calculated value of dN/dS (the ratio of actual/expected nonsynonymous [dN] and synonymous [dS] base changes) for the combined patient 5 isolates was 0.34, and dN/dS values for the patient 7 isolates were 0.79 (isolate 729b) and 0.47 (isolate 714b). The value of dN/dS could not be calculated for the single SNP associated with the patient 6 isolates. Therefore, in all measurable cases, the dN/dS value was <1. This is a strong signature of ongoing negative (purifying) selection.
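To illustrate why a raw majority of nonsynonymous SNPs can still give dN/dS < 1, the sketch below normalizes each class of change by the number of sites at which it could occur. This is a minimal sketch under stated assumptions: it uses the simple uncorrected ratio with no multiple-hit correction, the source does not describe the exact estimator it used, and the function name and example counts are hypothetical.

```python
def dn_ds(n_nonsyn: int, n_syn: int, nonsyn_sites: float, syn_sites: float) -> float:
    """Uncorrected dN/dS: changes per available site, nonsynonymous over synonymous."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_syn == 0:
        # With no synonymous changes the ratio has a zero denominator.
        raise ValueError("dN/dS is undefined when no synonymous changes are observed")
    dn = n_nonsyn / nonsyn_sites  # nonsynonymous changes per nonsynonymous site
    ds = n_syn / syn_sites        # synonymous changes per synonymous site
    return dn / ds


# Hypothetical counts (not from the study): more nonsynonymous than synonymous
# changes in absolute terms, yet a ratio below 1 because nonsynonymous sites
# are assumed to be three times as numerous.
print(dn_ds(n_nonsyn=12, n_syn=10, nonsyn_sites=3000.0, syn_sites=1000.0))
```

The guard on `n_syn == 0` mirrors the situation described for the patient 6 isolates, where a single nonsynonymous SNP leaves the ratio undefined.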
Insertions and deletions. (i) Global trends.
Indels, defined here as insertions or deletions of <300 bp, were present in all of the isolates that we examined. Overall, each isolate contained >200 genes disrupted by indels (with many genes containing more than one indel) relative to the LES reference. These indels sometimes led to frameshifting of the downstream codons and were therefore likely to be of functional significance. For example, all of the mucoid isolates contained indels in the anti-sigma factor mucA. Isolate 5M contained a 7-bp insertion (a tandem duplication of the sequence C298TGGCCG304) that frameshifted the encoded protein from residue 101 onwards. Isolates 612b and 69a both contained a single nucleotide deletion of base G430 in mucA, leading to a frameshift after residue 143 in the encoded protein. Both types of indel were confirmed by Sanger sequencing.
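Whether an indel frameshifts the downstream codons depends only on whether its length is a multiple of three. The sketch below is a minimal illustration, assuming the indel lies entirely within the ORF; the function name is hypothetical. The 7-bp and 1-bp lengths correspond to the mucA indels described above, and 12 bp is included as an in-frame contrast.

```python
def is_frameshift(indel_length_bp: int) -> bool:
    """An indel shifts the reading frame when its length is not a multiple of 3."""
    if indel_length_bp <= 0:
        raise ValueError("indel length must be a positive number of base pairs")
    return indel_length_bp % 3 != 0


# 7 bp and 1 bp: the mucA indels in the text; 12 bp: an in-frame contrast.
for length in (7, 1, 12):
    label = "frameshift" if is_frameshift(length) else "in-frame"
    print(f"{length} bp: {label}")
```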
A comparison of the indels present in one but not the other of the paired isolates from each patient revealed indel-associated polymorphisms (IPs). For example, the isolates from patient 5 differed due to indels at 38 locations (27 of these within annotated ORFs) in the conserved core genome. The patient 6 isolates differed due to 8 IPs (although only 2 of these were located within annotated ORFs). The patient 7 isolates differed due to 93 IPs that disrupted 65 core ORFs. A complete list of the IPs present in each isolate is given in Table S3 in the supplemental material. We conclude that IPs and SNPs are potentially comparable drivers of phenotypic variation.
Indels affecting quorum sensing.
One key gene affected by indels was the regulator of quorum sensing, lasR. In both isolates from patient 5, lasR contained a single nucleotide deletion (of base T230) leading to a frameshift after residue serine 77 in the encoded protein. Consistent with this, neither isolate produced OdDHL. A more extreme deletion affected lasR in isolates 714b and 729b. Here, a 160-bp deletion spanning nucleotides −156 (relative to the ATG start codon) to +4 essentially abolished expression of the gene. Recalling that both isolates from patient 6 also encoded a truncated LasR (due to a nonsense mutation), we conclude that all of the isolates that we examined contained clear loss-of-function lasR mutations. In each case, these mutations were confirmed by Sanger sequencing. Mutations in lasR are known to arise frequently in CF (4) and are associated with a poorer clinical prognosis (13).
AHL production in planktonic cultures of the different isolates. OdDHL (A) and BHL (B) levels in late-exponential-phase culture supernatants of the indicated isolates.
Under laboratory growth conditions, production of the quorum-sensing signal molecule BHL is under the control of the las subcomponent of the QS system (25). Interestingly, although it carries a defective lasR gene, isolate 714b (and to a lesser extent, also its paired partner, isolate 729b) produced quantities of BHL comparable to those of the LES. RT-PCR analyses confirmed that the LES (wild type for lasR) and isolates 714b and 729b (lasR mutants) each expressed rhlI at comparable levels (see Fig. S1 in the supplemental material). Moreover, isolates 714b and 729b both produced copious quantities of rhamnolipid, a virulence factor that is primarily BHL regulated (24). These results are consistent with the recent data of Dekimpe and Déziel (5) showing that under some situations, the rhl signaling system can be active in the absence of a functional LasR protein.
In addition to lasR, another important QS regulator, pqsR (also known as mvfR; a LysR-type transcriptional regulator [LTTR]), contained a 12-bp deletion (G100 in the LES orthologue) in isolate 612b. MvfR is the only known receptor for the Pseudomonas quinolone signal (PQS) and is postulated to play an important role in virulence by linking the las and rhl components of the AHL-dependent QS system (8). The IP in 612b is slightly unusual in that it seems to restore what appears to be a simple ancestral motif (“-TAVS-,” which is present in most LTTRs) that has apparently been duplicated in PqsR to yield T33. The deletion in 612b leads to excision of an amino acid segment, AVSS, from this sequence to yield T33. This deletion is predicted to directly affect the helix-turn-helix motif of the protein and so may have functional consequences. Another indel affecting PQS was seen in isolate 729b. Here, the gene encoding PqsE (a protein required for the response to PQS) contained a single nucleotide deletion (base G565) leading to a frameshift after position 188 in the protein.
Variations in the mobile “accessory genome” composition within paired isolates.
Sequences partially similar to the F10-like regions of LES prophage 2 and LES prophage 3 (43) were present in isolates 5S and 729b, but not in their cognate paired partners or in the patient 6 isolates. This was graphically apparent when we compared the read depths for these prophage signatures in the paired isolates; Fig. S2 in the supplemental material shows the sequence reads from the patient 7 isolates mapped onto the prophage 3-containing region of the LES reference (similar data were obtained for the patient 5 isolates; see Fig. S3 in the supplemental material). Further de novo assembly of the sequence reads revealed that unlike the LES prophages 2 and 3, which are physically distinct units, the F10-like prophage signatures in isolate 729b comprise a single unit that localizes to just one 43-kb contig. PCR-based analyses confirmed the presence of this prophage sequence in the purified genomic DNA of isolate 729b, yet when we screened the gDNA from a selection of additional P. aeruginosa colonies coisolated with the 714b and 729b samples, no evidence of the prophage was found (see Fig. S4 in the supplemental material). This suggests that this prophage is probably present only at low frequency in the population. Presumably, the added fitness cost of replicating the prophage might account for this. However, it may also be advantageous for the population to maintain a small but latent prophage reservoir: as long as all members of the population are resistant to infection by the encoded prophages, such phages may protect the niche from superinfection by susceptible P. aeruginosa strains (i.e., a mechanism of kin selection). The differential presence of specific prophage signatures in paired isolates derived from the same patient suggests that either (i) these prophages were acquired after the initial infecting clone established itself or (ii) these sequences were present in the initial infecting clone but have become differentially lost during the development of chronic infection.
Genotype-phenotype linkage (SNPs and IPs).
The SNPs and IPs contained within the isolates from patient 5 (i.e., those encompassing the median number of genetic differences) are detailed in the accompanying tables of SNPs and IPs, respectively. Notably, isolate 5S produced less siderophore than isolate 5M. Consistent with this, isolate 5S contained an NS-SNP in pchD (a key pyochelin biosynthetic enzyme). Pyochelin is one of the major siderophores of P. aeruginosa. Isolate 5S also contained a 13-bp deletion in metF (required for methionine biosynthesis). Consequently, unlike isolate 5M, isolate 5S was auxotrophic and was unable to grow on M9 minimal medium. However, isolate 5S was able to grow when the M9 medium was supplemented with methionine (data not shown). Both of the patient 5 isolates displayed very low levels of PcrV expression compared with the LES control (see Fig. S5 in the supplemental material). PcrV is a protein involved in type III secretion. PcrV expression was almost undetectable in isolate 5S, and the protein was present at only very low levels in isolate 5M. This was surprising, since the type III secretion system antiactivator exsD carried a deletion of 7 bp (G99) in both isolates, which would normally lead to increased PcrV expression. However, isolate 5S contained a point mutation (S188→F) located directly within the predicted helix-turn-helix motif of the principal transcriptional activator of type III secretion, ExsA. Given its location and nonconservative nature, this mutation is likely to have had a detrimental effect on ExsA function, thereby accounting for the absence of PcrV expression in isolate 5S. The low level of PcrV expression in the mucoid isolate 5M may have a different cause. In P. aeruginosa, the T3S system is activated by a cyclic AMP receptor protein (CRP) homolog, Vfr (7). For reasons that are not yet clear, mucA mutants display impaired Vfr-dependent virulence factor production and, therefore, lower T3S system expression (15). This impaired Vfr signaling may also explain why 5M produced less secreted caseinase activity than the mucA+ isolate 5S. This explanation presumably also accounts for the low level of T3S and diminished secreted protease activities in the patient 6 isolates, both of which carry a mucA mutation. We conclude that in many cases, there is a clear link between the genotypic differences we observed in each isolate and the corresponding phenotype. However, there were exceptions to this. For example, mucoidy has been strongly linked to biofilm formation (reviewed in reference 2), and indeed, both of the (mucoid) patient 6 isolates formed robust biofilms in vitro. In contrast, neither of the patient 5 isolates formed biofilms in vitro, in spite of the mucoid phenotype of isolate 5M. Clearly, either the acquisition of mucoidy is insufficient to promote biofilm formation per se, or these isolates have acquired other mutations that suppress biofilm formation at a more profound level.
SNPs present in the isolates from patient 5
IPs in the isolates from patient 5
Mutational hot spots.
Twenty-one of the 30 SNPs present in isolate 5M were located in just six contiguous ORFs, corresponding to genes LES_07061 → LES_07101 (fptA). Isolate 5M also carried a 9-bp insertion in the same region. This may indicate that this stretch of DNA is a hot spot for mutation. However, the corresponding region of DNA in isolate 5S was devoid of SNPs/indels. A likely explanation for this is that this segment of DNA in isolate 5M (or equally, 5S) is the product of a recombination event with DNA from another lineage. In support of this, many of the SNPs that differentiate isolate 5M from isolate 5S are identical to SNVs present in the isolates from patients 6 and/or 7 (presumably, these SNVs are widespread in the global P. aeruginosa gene pool). Interestingly, this region carries a gene (LES_07091; ampP) that is required for the induction of endogenous β-lactamase activity (16) and so may be important for survival in the CF lung niche, especially during antibiotic challenge.
Another example of potential “hot spotting” can be seen in isolate 5S. Here, 8 of the 24 SNPs are located within a single gene (LES_34491, annotated as a mucin-17 like precursor). These SNPs were independently confirmed by Sanger sequencing. Although most of these SNPs are synonymous, three are NS-SNPs, two of which (M→R and S→P) are nonconservative and therefore may affect function. All of the SNPs fall within a short region (residues 639 to 968) of the 2,715-residue protein encoded by LES_34491. Notably, in the paired isolate 5M, this gene also has a mutation (an 84-bp insertion, giving rise to insertion of 28 amino acids following residue 1141 in the encoded protein) that is likely to affect function. LES_34491 may therefore be the subject of differential degradation in both patient 5 isolates. Zhang and Mah (47) have recently shown that LES_34491 encodes a LapA-like outer membrane protein. Moreover, the gene cluster encompassing LES_34491 is expressed only during the biofilm mode of growth and confers resistance to a range of clinically relevant antibiotics. However, as noted above, isolates 5M and 5S were unable to form robust biofilms when grown in vitro. This impaired ability to form biofilms may explain why LES_34491 has become redundant and is degrading.
One undoubted mutational hot spot was gene LES_56471 (algP), which encodes a histone H1-like regulator of mucoidy. AlgP contains around 45 tandemly repeated KPAA units, and the ORF has been previously shown to undergo rearrangement at high frequency in CF isolates, especially with respect to the number of repeated units it contains (6). Consistent with this, the algP in isolate 69a (but not that in isolate 612b) contained a 12-bp insertion that duplicated the amino acid sequence K173. In contrast, both isolates from patient 7 contained a 12-bp deletion at the same site (this time, removing a KPAA motif). The isolates from patient 5 contained yet a different mutation; here, a 24-bp deletion removed residues V217 from the encoded AlgP protein.
We have already alluded to the fact that the paired isolates from patients 5 and 7 contained different nonsense mutations in LES_49081 (a hypothetical gene in the pilA cluster) and that lasR was inactivated in all of the isolates examined. That such mutations arise independently within the same genes provides support for the notion that parallel selection pressures are in operation in the different patients. Another example of this is seen in isolates 5M and 5S, which each contained different NS-SNPs in the gene encoding elongation factor G (fusA1). Similarly, isolates 714b and 729b also carried nonidentical NS-SNPs in fusA1 (see Table S2 in the supplemental material). The fact that this gene has been the target of four independent nonconservative NS-SNPs strongly suggests that these mutations may confer some advantage in the CF lung environment. In Salmonella enterica serovar Typhimurium, mutations in fusA1 can confer resistance to fusidic acid but also lead to pleiotropic phenotypes, including reduced virulence and altered intracellular (p)ppGpp levels (18). The ppGpp-dependent stringent response has been previously shown to modulate virulence in P. aeruginosa. Since downregulation of virulence phenotypes often accompanies adaptation to the CF lung environment, it may be that mutation of fusA1 is one mechanism that contributes toward this.