The aim of this study was to determine whether geographical differences impact the composition of bacterial communities present in the airways of cystic fibrosis (CF) patients attending CF centers in the United States or United Kingdom. Thirty-eight patients were matched on the basis of clinical parameters into 19 pairs comprised of one U.S. and one United Kingdom patient. Analysis was performed to determine what, if any, bacterial correlates could be identified. Two culture-independent strategies were used: terminal restriction fragment length polymorphism (T-RFLP) profiling and 16S rRNA clone sequencing. Overall, 73 different terminal restriction fragment lengths were detected, ranging from 2 to 10 for U.S. and 2 to 15 for United Kingdom patients. The statistical analysis of T-RFLP data indicated that patient pairing was successful and revealed substantial transatlantic similarities in the bacterial communities. A small number of bands was present in the vast majority of patients in both locations, indicating that these are species common to the CF lung. Clone sequence analysis also revealed that a number of species not traditionally associated with the CF lung were present in both sample groups. The species number per sample was similar, but differences in species presence were observed between sample groups. Cluster analysis revealed geographical differences in bacterial presence and relative species abundance. Overall, the U.S. samples showed tighter clustering with each other compared to that of United Kingdom samples, which may reflect the lower diversity detected in the U.S. sample group. The impact of cross-infection and biogeography is considered, and the implications for treating CF lung infections also are discussed.
Cystic fibrosis (CF) is one of the most common genetic diseases in Europe and North America, with an incidence of around 1 in 2,500 (7). Due to the altered lung physiology, CF patients suffer chronic bacterial lung infections (25). The host immune response to these infections leads to an irreversible loss of lung function, with 85 to 95% of CF patients ultimately succumbing to respiratory failure (11, 20, 25, 34). Therefore, gaining a better understanding of the characteristics and dynamics of these infections is of the greatest importance.
A limited number of bacterial species have been traditionally considered key pathogens in CF lung disease progression, including Pseudomonas aeruginosa, Staphylococcus aureus, Haemophilus influenzae, Burkholderia cepacia complex, and Stenotrophomonas maltophilia (18, 20, 22, 25). Conventional diagnostic microbiology focuses on the detection of a limited group of species, including those listed above (16). However, more recent studies have indicated that the range of bacterial species harbored in sputa from adult CF patients is much wider (28, 29, 31). These findings emerged through the application of culture-independent analytical strategies that detect bacterial species signatures in nucleic acids extracted directly from clinical samples. In particular, the combination of 16S ribosomal RNA (rRNA) gene terminal restriction fragment length polymorphism (T-RFLP) profiling and 16S rRNA clone sequence analysis has revealed the widespread presence of anaerobic bacterial species within the lower CF airways (15, 29).
Patient characteristics, such as cystic fibrosis transmembrane conductance regulator (CFTR) genotype, the presence of disease manifestations (e.g., CF-related diabetes), and patient demographics, vary according to the country of treatment (5, 11, 17). Although median survival for the United Kingdom and the United States is similar, with approximately 35 years for both (2, 9), patient and disease characteristics have been shown to vary, e.g., 40% of patients over 40 years in a United Kingdom CF center were pancreatic sufficient, compared to only 16% of patients in an equivalent U.S. center (17). There are many possible reasons why this may be the case, including socioeconomic factors, neonatal screening practices, and differences in treatment practices (14). Recommendations for the latter differ to some extent between the United Kingdom and United States (6, 10, 39). In addition, how these recommendations are implemented depends on individual treatment centers and can vary considerably.
However, as CF lung disease involves infection by bacteria of an apparently wide range of origins, it may be important to consider the environment as a source of these species and thus as another disease-modifying factor. The concept of early microbiological studies that “everything is everywhere, but the environment selects” implies that despite the purging effect of selection in different habitats, microbial dispersal is widespread, and as such it offers little opportunity for the spatial differentiation of bacterial species (8, 26).
Evidence is emerging that the composition of bacterial communities in different habitats can vary from region to region (4, 12, 13, 19). The degree to which endemism (localized populations) is important in terms of its impact on species presence in the CF lung is not clear. This could, however, be of clinical importance, given that many of the species regarded as pathogens are present in a range of natural environments, e.g., P. aeruginosa, B. cepacia complex, and S. maltophilia. For example, P. aeruginosa has been shown to occur worldwide, but clones with increased virulence and enhanced antibiotic resistance occasionally arise, leading to localized epidemic spread in the environment and CF centers, such as the Liverpool epidemic strain (27, 33).
In this study, bacterial community profiles, generated from adult CF patients in two distinct geographical regions (the United States and the United Kingdom), were compared to investigate geographical differences in the bacterial CF airway community. Culture-independent approaches of terminal restriction fragment length polymorphism (T-RFLP) profiling and clone sequence analysis were used to analyze the bacterial content of sputum samples. Patients were grouped in pairs, with one United Kingdom and one U.S. patient matched on parameters considered clinically relevant. The analysis of these data showed that, both in terms of species presence/absence as well as the relative prevalence of species in the lung, no clear transatlantic divide could be defined between bacterial communities. More-subtle differences in community composition and structure were, however, observed.
Spontaneously expectorated sputum samples were collected from 19 adult CF patients attending the Adult Cystic Fibrosis Clinic at Southampton General Hospital, United Kingdom, and 19 patients attending the University of North Carolina Cystic Fibrosis/Pulmonary Research and Treatment Center, Chapel Hill, NC, under full ethical approval. After collection, samples were stored at −80°C prior to nucleic acid extraction. All nucleic acid extractions, as well as T-RFLP analysis, were carried out by one person in the laboratory in the United Kingdom. Clone sequence analysis was performed on these same nucleic acid extracts by one person in the United States. Patient age, sex, lung disease severity, CFTR genotype, body mass index (BMI), and antibiotic therapy details relating to these samples are shown in Table 1. Patients recruited to this study represented a random cross-section of CF patients. All were between 18 and 47 years old, with 12 females and 7 males per group. Enrolled patients from the United Kingdom and United States were paired according to age, gender, and lung obstruction category (forced expiratory volume in 1 second [FEV1%] predicted), as assessed clinically (Table 1). An age difference of less than 10 years together with a difference of 15% or less in FEV1% predicted but in the same lung obstruction category (i.e., normal, mild, moderate, severe) was considered acceptable. Mean age difference was 5.4 years (standard deviation [SD], 4.1) and mean difference in FEV1% predicted was 10.1% (SD, 7.7%). Exceptions where only one parameter could be matched acceptably were UK/US1, UK/US3, UK/US4, and UK/US9, with an age difference and difference in FEV1% predicted of 8 years and 30%, 7 years and 18%, 17 years and 2%, and 9 years and 20%, respectively. Additional criteria, such as distance to hospital, were recorded.
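The pairing criteria stated above can be expressed as a simple check. The following is a minimal sketch only: the `Patient` record, its field names, and the example values are hypothetical, and the lung obstruction category is taken as a clinically assigned label because the category cutoffs are not given here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record holding only the matching parameters."""
    age: int           # years
    sex: str           # "F" or "M"
    fev1_pct: float    # FEV1% predicted
    obstruction: str   # clinically assigned category, e.g. "moderate"


def is_acceptable_pair(uk: Patient, us: Patient) -> bool:
    """Return True if the pair meets the stated criteria.

    Criteria: same sex, age difference under 10 years, difference in
    FEV1% predicted of 15 or less, and the same obstruction category.
    """
    return (
        uk.sex == us.sex
        and abs(uk.age - us.age) < 10
        and abs(uk.fev1_pct - us.fev1_pct) <= 15
        and uk.obstruction == us.obstruction
    )


# Illustrative values only, not taken from Table 1.
pair_ok = is_acceptable_pair(
    Patient(25, "F", 60.0, "moderate"), Patient(30, "F", 70.0, "moderate")
)
```

A pair with an age gap of 17 years, such as the exception noted for UK/US4, would fail this check on age alone even with closely matched FEV1% predicted.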
Patients were clinically stable at the time of sampling (at least 21 days had passed before or after a treatment for pulmonary exacerbation), with eight exceptions (U.S. patients 4 and 19 and United Kingdom patient 19, who were receiving intravenous [i.v.] antibiotics for pulmonary exacerbation; U.S. patient 2, United Kingdom patient 2, and United Kingdom patient 4, who had finished i.v. antibiotics 12, 14, and 13 days previously, respectively; and U.S. patients 7 and 17, who were clinically stable but were receiving antibiotic therapy for reasons other than pulmonary exacerbation). In all cases, clinical status was given priority over antibiotic therapy for patient matching.
Prior to DNA extraction, sputum samples were washed three times in phosphate-buffered saline (PBS; Fisher Scientific, Loughborough, United Kingdom) to remove adherent saliva. DNA extraction from clinical samples was adapted from a previously described procedure (29). Briefly, 200 μl of each sputum sample was resuspended in 800 μl of 200 mM PBS (pH 8.0) and 300 μl guanidinium thiocyanate-EDTA-Sarkosyl. After the addition of 0.2 g of 0.18-mm-diameter glass beads (B. Braun Biotech International GmbH, Melsungen, Germany), samples were homogenized for 60 s at 30 Hz in a Mixer Mill 300 (Qiagen, Crawley, United Kingdom). Samples were heated at 70°C for 20 min and placed on ice for 20 min, and beads and cell debris were removed by centrifugation (13,000 × g for 5 min) at room temperature. Supernatants were transferred to fresh microcentrifuge tubes, followed by the addition of NaCl (to a final concentration of 0.5 mM) and polyethylene glycol (to a final concentration of 15%). Nucleic acids were precipitated at 4°C for 1 h.
DNA was pelleted at room temperature by centrifugation at 13,000 × g for 10 min and resuspended in 300 μl nuclease-free water. Supernatants were transferred to fresh microcentrifuge tubes, and 300 μl Tris-acetate-EDTA (TAE)-saturated phenol (pH 8.0) (Sigma-Aldrich, Gillingham, United Kingdom) was added. After the mixture was vortexed vigorously, the phases were separated by centrifugation at 13,000 × g for 5 min. The upper phase was transferred to a fresh microcentrifuge tube, 300 μl phenol-chloroform-isoamylalcohol (25:24:1 ratio) (Sigma-Aldrich) was added, and samples were vortexed vigorously and centrifuged at 13,000 × g for 10 min. Supernatants were transferred to fresh microcentrifuge tubes, and DNA was precipitated at −20°C for 1 h after the addition of an equal volume of isopropanol (Sigma-Aldrich) and 0.1 volume of 10 M ammonium acetate. The DNA was pelleted by centrifugation at 13,000 × g for 10 min and washed three times in 200 μl 75% ethanol. Pellets were briefly air dried, resuspended in 50 μl of nuclease-free water, and stored at −20°C.
Extracted genomic DNA and PCR products were verified by TAE-agarose gel electrophoresis stained with GelRed (Biotium, Hayward, CA) and visualized on a UV transilluminator (Herolab, Wiesloch, Germany). Images were captured by using a Herolab image analyzer with E.A.S.Y. Stop win 32 software (Herolab). DNA and PCR products were quantified by spectrophotometry by applying 1.5 μl directly to a NanoDrop ND-1000 spectrophotometer (LabTech International, Ringmer, United Kingdom).
The universal oligonucleotide primers for the amplification of a region of the 16S rRNA gene specific for the domain bacteria 926r (5′-CCG TCA ATT CAT TTG AGT TT-3′) and 8f700IR (5′-AGA GTT TGA TCC TGG CTC AG-3′) were used as described previously (24). Primer 926r was unlabeled, and primer 8f700 was labeled with the dye IRD700 at the 5′ end. Both primers were synthesized by TAGN, Newcastle, United Kingdom. The constituents of the PCR mixture per reaction were the following: 25 μl of the Sigma readymix REDTaq (Sigma-Aldrich), each primer at a final concentration of 0.2 mM, 50 ng of DNA template, made up to a final volume of 50 μl with nuclease-free water.
Cycling conditions comprised an initial denaturation at 94°C for 2 min, followed by 32 cycles of denaturation at 94°C for 1 min, annealing at 56°C for 1 min, and extension at 72°C for 2 min, with a final extension step at 72°C for 10 min. Amplifications were carried out in a Gene Amp PCR system 9700 (Applied Biosystems, United Kingdom). PCR products were verified on TAE-agarose gels as described above and stored at −20°C for T-RFLP analysis.
Approximately 20 ng of each PCR product was digested to completion with 1 U of the restriction endonuclease CfoI (Sigma-Aldrich) for 5 h at 37°C, in accordance with the manufacturer's instructions. The restriction enzyme was inactivated by heating to 90°C for 20 min. Approximately 10 ng of digested PCR product was denatured at 95°C for 1 min and separated by length using a 25-cm SequagelXR denaturing polyacrylamide gel (National Diagnostics, Hessle, United Kingdom), with the addition of 8.3 M urea and formamide (to a final concentration of 10%). Electrophoresis was performed at 55°C and 1,200 V on an IR2 automated DNA sequencer (LI-COR Biosciences, Lincoln, NE).
T-RFLP gel images were analyzed using Phoretix one-dimensional advanced software (version 5.10; Nonlinear Dynamics, Newcastle upon Tyne, United Kingdom). The lengths of the bands detected on the gel were determined by comparison to the positions of the size marker microSTEP 15a (700 nm) (Microzone, Lewes, United Kingdom). Additionally, Phoretix software was used to determine the band volume (the product of the area over which the band was detected and the intensity of signal recorded over that area). The band volume was expressed as a percentage of the total volume of bands resolved in one particular electrophoretic profile. The resolution of T-RFLP bands was over the region of 50 to 950 bases. T-RF bands shorter than 50 bases were not included in the analysis, as this region is susceptible to high levels of background signal. The threshold of band detection used in this study was 0.1% of total signal in one profile in the specified region.
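The band-volume normalization and filtering described above can be sketched as follows. This is an illustration under stated assumptions, not the Phoretix procedure: the function name and input layout are hypothetical, and the sketch does not renormalize after applying the 0.1% threshold because the text does not say whether that was done.

```python
def relative_band_volumes(bands, min_len=50, max_len=950, threshold_pct=0.1):
    """Convert raw band volumes to percentages of the profile total.

    bands: dict mapping T-RF length (bases) to raw band volume.
    Bands outside the 50 to 950 base region are ignored, and bands
    below the detection threshold (percent of total signal in the
    region) are dropped.
    """
    in_range = {length: vol for length, vol in bands.items()
                if min_len <= length <= max_len}
    total = sum(in_range.values())
    if total == 0:
        return {}
    percentages = {length: 100.0 * vol / total
                   for length, vol in in_range.items()}
    return {length: pct for length, pct in percentages.items()
            if pct >= threshold_pct}


# Illustrative raw volumes: the 40-base band is out of range and the
# 580-base band (0.05% of the in-range signal) falls below threshold.
profile = relative_band_volumes({40: 500.0, 155: 900.0, 565: 99.5, 580: 0.5})
```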
Band identification was performed as described previously by Rogers and coworkers (28, 29). Published bacterial 16S rRNA gene sequences were retrieved from GenBank (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db_Nucleotide). The software MapSort (Wisconsin Package, version 10.3; Accelrys, Cambridge, United Kingdom) was used to predict T-RFLP band lengths (in bases) from the 5′ end of primer 8f700IR to the first site of cleavage for CfoI in each recovered 16S rRNA gene sequence.
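The in silico prediction step can be illustrated with a short function. It is a sketch rather than a reproduction of MapSort: it assumes the input sequence starts at the 5′ end of the forward primer, and it uses the known CfoI recognition site GCGC with cleavage after the third base (GCG^C). The function name and the toy sequence are hypothetical.

```python
def predict_trf_length(sequence, site="GCGC", cut_offset=3):
    """Predict the terminal fragment length in bases.

    sequence: 16S rRNA gene sequence written 5' to 3', beginning at the
    first base of the labeled forward primer (an assumption of this sketch).
    Returns the number of bases from the 5' end to the first cut, or
    None if the site does not occur.
    """
    seq = sequence.upper().replace("U", "T")
    index = seq.find(site)          # 0-based position of the first site
    if index == -1:
        return None
    return index + cut_offset       # bases retained on the labeled fragment


# Toy sequence: the 20-base forward primer, 10 filler bases, then a site.
toy = "AGAGTTTGATCCTGGCTCAG" + "A" * 10 + "GCGC" + "TTTT"
predicted = predict_trf_length(toy)
```

Observed band lengths on a gel can differ slightly from predicted lengths, so a tolerance window would normally be applied when comparing the two.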
16S rRNA gene fragments were generated by PCR using primers 8f700 and 926r and the high-fidelity proofreading enzyme PFU-Ultra (Stratagene, La Jolla, CA) as described above. The pool of 16S rRNA gene fragments was cloned into the pCR-Blunt II-TOPO plasmid vector (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. Plasmids then were used to transform competent Escherichia coli DH5α by heat shock at 42°C and plated on selective LB agar plates with 50 μg/ml kanamycin. Individual colonies were used to inoculate 2 ml of LB broth containing 50 μg/ml kanamycin in 96-deep-well blocks and grown for 20 h with vigorous shaking at 37°C. One hundred microliters of culture was saved as a glycerol stock, and the remaining 1.9 ml of culture was centrifuged at 500 × g for 10 min. Supernatants were removed and plasmid DNA isolated from the bacterial pellet with the Wizard SV 96 plasmid purification system (Promega Corporation, Madison, Wisconsin). The quality of the resulting plasmid DNA containing cloned 16S rRNA gene fragments was assessed by TAE-agarose gel electrophoresis. Aliquots of purified plasmid were sequenced with M13 vector primer (MWG Biotech, High Point, NC). A total of 871 clones derived from samples taken from five patients from each geographical location were sequenced.
For each patient, sequences were vector trimmed with Sequencher v4.8 (Gene Codes Corporation, Ann Arbor, Michigan) and aligned at 100% identity and clustered. The longest sequence of each cluster was conserved and shorter redundant sequences removed. Sequences were aligned with the Ribosomal Database Project (RDP; Michigan State University, Michigan) Pyrosequencing Alignment and Clustering Program (http://pyro.cme.msu.edu/) and clustered into bins of 98% similarity with the “farthest nearest neighbor” complete linkage clustering algorithm. To reduce bin complexity, bins containing three or more sequences were aligned using the Greengenes alignment tool (http://greengenes.lbl.gov/cgi-bin/nph-NAST_align.cgi) using a batch size of 100, a minimum length of 100, and a minimum percent identity of 75%. The bins of three or more aligned sequences were transferred to ClustalX v2.0.12 (Conway Institute, UCD, Dublin, Ireland), and a multiple alignment was carried out with an output in the form of a phylogenetic tree. For each bin, the longest most-similar branch was chosen as the representative sequence, labeled to represent the number of sequences it contained (at 100 and 98% sequence identity), and the remaining sequences removed. Overall, this reduced sequence complexity from 871 to 136. To construct the phylogenetic tree for all patients, these 136 binned sequences were put into the Greengenes alignment tool and ClustalX as described above, together with the sequences of type strains that all sequences were most similar to, including CF pathogens, and the phylogenetic tree was constructed with the same parameters as those described above. Clusters were named according to the type strains that the sequences clustered most closely with.
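The binning step at 98% similarity can be illustrated with a small complete-linkage routine. This is a simplified sketch, not the RDP implementation: it assumes the sequences are already aligned to equal length, scores identity position by position, and uses hypothetical function names.

```python
def identity(a, b):
    """Fraction of identical positions in two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def complete_linkage_bins(seqs, min_identity=0.98):
    """Group sequence indices so every pair within a bin meets the cutoff.

    Complete linkage scores two bins by their least similar pair, so a
    merge happens only if all cross-bin pairs reach min_identity.
    """
    bins = [[i] for i in range(len(seqs))]

    def linkage(b1, b2):
        return min(identity(seqs[i], seqs[j]) for i in b1 for j in b2)

    while True:
        best = None
        for x in range(len(bins)):
            for y in range(x + 1, len(bins)):
                score = linkage(bins[x], bins[y])
                if score >= min_identity and (best is None or score > best[0]):
                    best = (score, x, y)
        if best is None:
            return bins
        _, x, y = best
        bins[x] = bins[x] + bins[y]   # merge the most similar pair of bins
        del bins[y]


# Toy aligned sequences of 100 bases.
a = "A" * 100
b = "A" * 99 + "C"     # 99% identical to a
e = "A" * 97 + "CCC"   # 98% identical to b but only 97% identical to a
bins = complete_linkage_bins([a, b, e])
```

In the toy example the third sequence stays in its own bin even though it is within 98% of one member of the first bin, which is the behavior that distinguishes complete linkage from single linkage.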
Correlations, t tests, and one-way analysis of variance with tests of normality and homogeneity of variance, as well as Bonferroni and Tukey post hoc tests, were performed using SPSS software version 15.0.1 (Chicago, IL).
Raup and Crick as well as Morisita similarity coefficients were used to evaluate similarities in community composition. Similarity coefficients were calculated and dendrograms plotted using PAST software version 1.74 (http://folk.uio.no/ohammer/past/; University of Oslo, Norway). Raup and Crick, a probability-based similarity index (SRC), was calculated from binary matrices of OTU presences and absences. The probabilities were calculated using 2,000 Monte Carlo simulations to compare the number of bands shared by two samples and the number of bands predicted if two samples were randomly selected from an amalgamation of all the samples studied.
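The Monte Carlo logic behind the Raup and Crick index can be sketched as below. This is not the PAST implementation. It assumes that random samples are drawn without replacement from the pooled band list with probability proportional to how often each band occurs, and that the index is the fraction of simulations in which the simulated shared count does not exceed the observed one; both choices, and all names, are assumptions of this sketch.

```python
import random


def _weighted_sample(taxa, weights, k, rng):
    """Draw k distinct taxa, each draw weighted by occurrence frequency."""
    pool, w, chosen = list(taxa), list(weights), set()
    for _ in range(k):
        i = rng.choices(range(len(pool)), weights=w, k=1)[0]
        chosen.add(pool.pop(i))
        w.pop(i)
    return chosen


def raup_crick(a, b, all_samples, n_sim=2000, seed=0):
    """Estimate a Raup-Crick style similarity for two presence sets.

    a, b: sets of band lengths present in the two samples.
    all_samples: list of presence sets for every sample studied.
    """
    rng = random.Random(seed)
    freq = {}
    for sample in all_samples:
        for taxon in sample:
            freq[taxon] = freq.get(taxon, 0) + 1
    taxa = sorted(freq)
    weights = [freq[t] for t in taxa]
    observed = len(a & b)
    hits = 0
    for _ in range(n_sim):
        ra = _weighted_sample(taxa, weights, len(a), rng)
        rb = _weighted_sample(taxa, weights, len(b), rng)
        if len(ra & rb) <= observed:
            hits += 1
    return hits / n_sim


# Toy presence/absence data (band lengths in bases), for illustration only.
samples = [{155, 565}, {155, 103}, {580, 222}, {155, 565, 103}]
src = raup_crick(samples[0], samples[3], samples, n_sim=200)
```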
Sequences determined in the course of this work are available through the EMBL nucleotide sequence database (http://www.ebi.ac.uk/embl/) under accession numbers FN825913 to FN826783.
In the 38 sputum samples analyzed from both United Kingdom and U.S. patients by T-RFLP, a total of 173 bands representing 73 different T-RF lengths were detected. The number of individual bands resolved in each sample ranged from 2 to 10 for the U.S. patients and 2 to 15 for the United Kingdom patients. The mean number of T-RF bands per patient of 4.6 (standard deviation [SD], 3.6) in the United Kingdom sample set was similar to that of the U.S. sample set at 4.5 (SD, 2.3) and not significantly different [P = 0.87; t(df 31) = 0.159]. A modest significant correlation was observed between the number of bands detected for the paired U.S. and United Kingdom patient sets [P = 0.03; r(df 17) = 0.496].
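As a consistency check on the degrees of freedom reported above, the Welch-Satterthwaite approximation can be computed from the summary statistics. This is a sketch under an assumption: the text does not state which t test variant was used, but a pooled-variance test on two groups of 19 would have 36 degrees of freedom, whereas the value of 31 is what an unequal-variance correction would give. The function name is hypothetical.

```python
def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite degrees of freedom from group SDs and sizes."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# SDs of T-RF bands per patient as reported: 3.6 (United Kingdom), 2.3 (U.S.).
df_bands = welch_df(3.6, 19, 2.3, 19)
```

With these inputs the approximation gives a value between 30 and 31, which rounds to the reported 31.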
Of the 173 T-RF bands detected, 32.4% were “singletons,” defined here as bands that occurred in only one sample. Singletons were detected in nine United Kingdom and eight U.S. samples. The mean number of singletons in these patients was 1.7 (SD, 3.0) and 1.2 (SD, 1.8) for United Kingdom and U.S. patients, respectively. This difference was not significant [P = 0.51; t(df 30) = 0.66].
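The singleton definition used here (a band length occurring in only one sample) can be applied to a set of profiles as in the sketch below. The profile data and names are hypothetical and serve only to show the counting logic; for scale, 32.4% of the 173 detected bands corresponds to about 56 bands.

```python
from collections import Counter


def find_singletons(profiles):
    """Return, per sample, the band lengths found in no other sample.

    profiles: dict mapping sample name to a set of T-RF lengths (bases).
    """
    counts = Counter(length
                     for bands in profiles.values()
                     for length in bands)
    return {sample: {length for length in bands if counts[length] == 1}
            for sample, bands in profiles.items()}


# Toy profiles, not the study data.
toy_profiles = {
    "UK1": {155, 103, 222},
    "US1": {155, 565},
    "US2": {155, 565, 580},
}
singletons = find_singletons(toy_profiles)
```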
T-RF band lengths that occurred more than once were analyzed in detail. All band lengths that occurred more than four times in the overall sample set were present in at least one United Kingdom and U.S. patient. Certain band lengths were commonly found in both sample sets. Ten band lengths accounted for 57% of all bands and were found in at least one United Kingdom and U.S. patient. Species consistent with these band lengths included Pseudomonas aeruginosa, Staphylococcus aureus, and Stenotrophomonas maltophilia, certain anaerobic species of the Prevotella genus, as well as two band lengths for which no species prediction could be made.
Clone sequence analysis was carried out for five patients from the United Kingdom (UK1, UK10, UK12, UK14, and UK15) and the United States (US1, US7, US11, US17, and US18). Samples for this analysis were chosen on the basis of T-RFLP results to include two patients from each group with high diversity (at least 10 different T-RF band lengths), one patient with low diversity (5 or fewer bands), and two patients for which the band at 155 bases, consistent with P. aeruginosa, was not detected. Clone sequence analysis showed that clones clustered with 30 different type strains. These type strains were used to name the clusters of the phylogenetic tree constructed (Fig. 1, Table 2) and here will be referred to as species. A wide range of species was identified from 22 different genera. These included one species, Phenylobacterium koreense, which had not previously been associated with the CF lung, as well as six species (20%) of obligate anaerobes. Overall, 9 genera of 22 were detected in both patient groups (Table 2, Fig. 1). Individual species within eight bacterial genera detected, namely, Pseudomonas, Streptococcus, Actinomyces, Staphylococcus, Parvimonas, and Neisseria, as well as two genera of the obligate anaerobes Prevotella and Veillonella, were present in at least one individual from both the United Kingdom and United States as analyzed by this approach (Table 2, Fig. 1). In the United Kingdom patient group, the most abundantly identified clones were P. aeruginosa and S. maltophilia (31% each), and in the U.S. patient group they were P. aeruginosa (30%) and S. aureus (33%). Overall, 12 (40%) species were found for the United Kingdom patient group alone, 7 (23%) species were found for the U.S. patient group alone, and 11 (37%) species occurred in both patient groups (Fig. 1, Table 2).
Type strain clusters that comprised four or more unique sequences all contained sequences from both United Kingdom and U.S. patients, with the exceptions of Phenylobacterium koreense and S. maltophilia, which both were detected in the United Kingdom patient group only (Table 2, Fig. 1).
The similarity of the bacterial content of samples was statistically assessed on the basis of the T-RFLP data. The patterns in shared species composition between the two sample sets were assessed through the generation of a Venn diagram (Fig. 2). From the data set as a whole, the four most commonly occurring band lengths, consistent with bands produced by P. aeruginosa (155 bases), Prevotella spp. (103 bases), Achromobacter/Bordetella/Pseudomonas spp. (565 bases), and an unassigned band at 580 bases, were examined in detail here. The two largest sets (band lengths of 155 and 565 bases) contained high numbers of both U.S. and United Kingdom patients. In total, this represented 79% of both patient groups. Overall, all four of the most commonly detected band lengths were found in at least one patient from each geographical group. In two of the samples studied (UK14 and US16), none of the four most commonly occurring band lengths were detected, so they are presented as outliers in Fig. 2.
To assess the significance of similarities and differences in species composition between United Kingdom and U.S. samples, Raup and Crick similarity indices (SRC) were calculated for the 38 T-RFLP profiles from United Kingdom and U.S. patients, and a cluster diagram was constructed using unweighted pair-group averages (UPGMA) (Fig. 3). Values above 0.95 indicate pairs of samples that were more similar than expected by chance, and thus indicated significant clustering. Using this 0.95 threshold, Fig. 3 comprises three clusters: cluster 1 of four U.S. patients (US5, US10, US11, and US13), cluster 2 of five United Kingdom patients (UK7, UK8, UK17, UK18, and UK19), and a substantial mixed cluster (3) of 10 U.S. patients and six United Kingdom patients (US1, US2, US4, US7, US8, US12, US14, US15, US17, US19, UK1, UK5, UK9, UK11, UK13, and UK16).
The degree of similarity of the U.S. and United Kingdom samples was further assessed using the Raup and Crick probability-based similarity index calculated on species presences and absences. This statistically resolved whether the samples were significantly similar (SRC ≥ 0.95), significantly dissimilar (0.05 ≥ SRC), or neither (0.05 < SRC < 0.95). Table 3 shows the numbers and proportions of significant and nonsignificant SRC values for a series of pairwise comparisons of SRC values for United Kingdom with United Kingdom patients, U.S. with U.S. patients, and the matched United Kingdom and U.S. patients. Here, 23% of United Kingdom-U.S. pairs showed a significant level of similarity. United Kingdom-U.S. pairs were more similar than United Kingdom-United Kingdom pairs (15%) but less similar than U.S.-U.S. pairs (41%). A significant level of dissimilarity was detected for 4% of United Kingdom-U.S. pairs. For United Kingdom patients, 4% were more dissimilar from each other than expected by chance, and only 1% of U.S. patients were significantly dissimilar from each other. Further, this meant that many United Kingdom samples were more similar to U.S. samples than to each other.
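The three-way classification of SRC values, and the tallying of proportions of the kind reported above, can be sketched as follows. The function names and the example values are hypothetical; only the two thresholds come from the text.

```python
from collections import Counter


def classify_src(src):
    """Label one SRC value using the 0.95 and 0.05 thresholds."""
    if src >= 0.95:
        return "similar"
    if src <= 0.05:
        return "dissimilar"
    return "neither"


def summarize_src(values):
    """Return the proportion of pairwise SRC values in each class."""
    if not values:
        raise ValueError("at least one SRC value is required")
    counts = Counter(classify_src(v) for v in values)
    return {label: counts.get(label, 0) / len(values)
            for label in ("similar", "dissimilar", "neither")}


# Illustrative SRC values, not the study data.
summary = summarize_src([0.99, 0.50, 0.01, 0.95])
```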
In addition to species presence, the similarity of the abundance of these species in samples was investigated using T-RFLP band volume data. Morisita similarity coefficients (IM) were calculated in a pairwise manner for the relative volumes of bands of all 38 T-RFLP profiles, and a cluster dendrogram was plotted using UPGMA (Fig. 4).
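For orientation, the classical Morisita similarity index can be computed as in the sketch below. It follows the standard textbook formula, which was defined for counts; how PAST treats relative band volumes is not described here, so applying the formula to percentage values is an assumption of this sketch, as are the function name and the example profiles.

```python
def morisita(x, y):
    """Classical Morisita similarity between two abundance profiles.

    x, y: dicts mapping band length to abundance. With the count-based
    formula, identical profiles give a value close to, but not exactly, 1.
    """
    nx, ny = sum(x.values()), sum(y.values())
    if nx <= 1 or ny <= 1:
        raise ValueError("each profile needs a total abundance above 1")
    # Simpson-type dominance term for each profile.
    lam_x = sum(v * (v - 1) for v in x.values()) / (nx * (nx - 1))
    lam_y = sum(v * (v - 1) for v in y.values()) / (ny * (ny - 1))
    cross = sum(x.get(t, 0) * y.get(t, 0) for t in set(x) | set(y))
    denominator = (lam_x + lam_y) * nx * ny
    if denominator == 0:
        raise ValueError("index undefined for these profiles")
    return 2 * cross / denominator


# Illustrative relative volumes (percent), not the study data.
p1 = {155: 70, 565: 20, 103: 10}
p2 = {580: 100}
same = morisita(p1, p1)       # close to 1
disjoint = morisita(p1, p2)   # no shared bands
```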
Two distinct clusters of exclusively United Kingdom patients were formed with four and six patients, respectively: cluster 1 (UK7, UK8, UK17, and UK19) and cluster 2 (UK4, UK5, UK11, UK13, UK15, and UK16). A third cluster of 11 U.S. and one United Kingdom patient also was observed (US1, US4, US7, US8, US11, US12, US13, US14, US15, US17, US19, and UK9). The average IM value for United Kingdom-United Kingdom patient comparisons was 0.27 (SD, 0.39; n = 171), for U.S.-U.S. comparisons was 0.59 (SD, 0.38; n = 171), and for United Kingdom-U.S. comparisons was 0.35 (SD, 0.38; n = 361). These three average Morisita index values were all significantly different from one another by one-way analysis of variance and post hoc tests [P < 0.0001; F(df 2, 700) = 33.21]. Again the U.S. samples showed a stronger association with each other than the United Kingdom samples showed with each other, and the United Kingdom samples were significantly more associated with the U.S. samples than with one another.
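The numbers of comparisons and the denominator degrees of freedom quoted above follow directly from the group sizes, as this short arithmetic sketch shows. The function name is hypothetical.

```python
from math import comb


def comparison_counts(n_uk=19, n_us=19):
    """Count pairwise comparisons within and between two sample groups."""
    within_uk = comb(n_uk, 2)      # unordered pairs among UK samples
    within_us = comb(n_us, 2)      # unordered pairs among U.S. samples
    between = n_uk * n_us          # every UK sample against every U.S. sample
    total = within_uk + within_us + between
    # One-way ANOVA with three groups: df = (groups - 1, total - groups).
    return {
        "within_uk": within_uk,
        "within_us": within_us,
        "between": between,
        "total": total,
        "anova_df": (3 - 1, total - 3),
    }


counts = comparison_counts()
```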
In summary, the comparison of the data presented in Fig. 3 and 4 showed that some degree of overlap was identified in terms of cluster membership in relation to the two different strategies of data analysis used. Figure 3 identified three clusters, one cluster of 4 patients (US5, US10, US11, and US13), one of 5 patients (UK7, UK8, UK17, UK18, and UK19), and another of 16 patients (US1, US2, US4, US7, US8, US12, US14, US15, US17, US19, UK1, UK5, UK9, UK11, UK13, and UK16). Figure 4 identified another three clusters, one of 4 patients (UK7, UK8, UK17, and UK19), one of 6 patients (UK4, UK5, UK11, UK13, UK15, and UK16), and another of 12 patients (US1, US4, US7, US8, US11, US12, US13, US14, US15, US17, US19, and UK9).
For 90% of species identified by clone sequence analysis, a T-RF band length for the corresponding cut site was detected. For Brevundimonas diminuta and Parvimonas micra, no corresponding T-RF band length could be detected, and for Oribacterium sinus, the first CfoI cut site lay outside the T-RFLP detection range. For 26% of T-RF bands a species with a corresponding cut site was detected by clone sequence analysis. The detection of species by clone sequencing that corresponded to T-RF band lengths in individual patients' profiles ranged from 13% (United Kingdom patient 1) to 67% (United Kingdom patients 10 and 12).
The aim of this study was to compare the bacterial communities present in sputum collected from adult CF patients attending CF centers in the United States and United Kingdom. Patients in the two countries were paired prior to microbiological assessment according to clinical parameters. Two culture-independent, molecular strategies were used to analyze sample pairs, T-RFLP profiling and 16S rRNA clone sequence analysis. Both relied on the PCR amplification of the 16S rRNA gene from DNA extracted directly from bacteria in the sputum samples collected. These approaches have been used previously to characterize the communities of bacteria in the adult CF lung (3, 15, 28-32, 35, 36). To the best of our knowledge, this is the first application of such techniques to assess the impact of geographical location on bacterial community composition in CF respiratory samples.
To normalize for variation other than geographical grouping, patients were matched into pairs to align major clinical parameters considered important by the treating clinicians (age, sex, lung obstruction, BMI, and CFTR genotype). An attempt to match for other parameters, such as distance to clinic and treatment, also was made. However, due to differences in the size of the catchment area and some aspects of the treatment regimes, lower priority was given to these parameters. While we recognize that no system of matching patients is ideal, this approach allowed 19 pairings to be defined based on these criteria.
The T-RFLP profiling of the generated 16S rRNA PCR products was used to assess the bacterial community. A total of 73 distinct T-RF band lengths were generated for the 38 patients studied, each of which was regarded as an individual bacterial species. The paired U.S. and United Kingdom patients showed similar numbers of bacterial species within their samples, suggesting that one or more of the clinical parameters used to assign patients to their pair were relevant. No marked difference was identified in the mean number of species per sample for the United Kingdom or U.S. patient group. While no comparable studies could be identified in the literature, the mean number of species obtained here was lower than that in earlier studies of stable adult CF patients (29, 30, 36), with this difference accounted for by the selection of a higher detection threshold. Statistical assessment showed a modest significant correlation in species number in relation to the patient pairs. As such, the number of species per sample did not provide evidence for a geographical difference in samples from the United Kingdom or U.S.
No difference was identified in the number of singletons from patients in the United Kingdom or U.S. Overall, approximately one third of all species detected occurred as singletons. Taken together, these findings imply that the lung community has a strong component that is patient specific regardless of patient geographical location. Other evidence reinforcing the highly different nature of bacterial communities between patients was found in particular for two patients, one from the United Kingdom and one from the United States, whose sputa contained no species that were common with those found in the other 36 patient samples studied here.
An examination of the species most frequently detected in the sample set (detected in four or more patients and at least once in each geographical cohort) suggested that certain species not traditionally associated with CF respiratory infections are common inhabitants of the CF lung. This is particularly noteworthy given the high degree of microbial diversity reported in this and earlier studies (29, 30, 36). The presence of species not recognized as key CF pathogens in the airways of a substantial number of patients may have important implications for treatment, and this is an area that warrants further investigations (37).
Indeed, a wide range of different bacterial species other than the key CF pathogens has been identified in the past decade (15, 28, 29, 36). Clone sequence analysis from 10 patients, with five patients representing each geographical group, revealed 30 different species from 22 different genera. Species traditionally associated with the CF lung, e.g., Pseudomonas aeruginosa, Staphylococcus aureus, and Stenotrophomonas maltophilia (1), as well as a number of species previously detected in CF samples primarily by culture-independent means, were identified here (15, 28, 29, 36), including one species that had not previously been reported in the context of CF, Phenylobacterium koreense. Eleven (37%) of the species, identified within the genera Pseudomonas, Streptococcus, Actinomyces, Staphylococcus, Parvimonas, Neisseria, Prevotella, and Veillonella, were common to both the United Kingdom and U.S. patient groups. Interestingly, species from within the latter two genera require anaerobic conditions for growth. Anaerobes previously have been detected in CF sputum (3, 15, 29, 37); nonetheless, it was striking that even in geographically distinct locations, similar lung physiological conditions were reflected in the species present. However, the majority of species were uniquely found in one geographical location, suggesting a strong local dominance in their biogeography.
To determine whether there is an uneven geographical distribution of the four most commonly detected T-RF bands, patients were grouped according to whether each band was present in their sputum sample. These groupings are illustrated in the form of a Venn diagram (Fig. 2). Here, two large clusters formed that contained both U.S. and United Kingdom patients and together accounted for approximately 80% of all patients studied. This suggests that certain species are core to the CF lung by adulthood. Here, it was found that species common by this definition were present within at least one patient from both sample pools. As such, this provides evidence that these species are not endemic to either geographical region.
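The grouping that underlies such a Venn diagram can be sketched as follows. This is an illustrative reconstruction only, assuming that each patient profile is a set of band labels and that the bands of interest have already been chosen; the function and labels are hypothetical.

```python
from collections import defaultdict

def group_by_common_bands(profiles, bands_of_interest):
    """Group patients by which of the bands of interest they carry.

    profiles: dict mapping a patient label to the set of detected bands.
    bands_of_interest: set of bands (for example, the most common ones).
    Returns a dict mapping each combination of bands (a frozenset, that is,
    one region of the Venn diagram) to the list of patients in that region.
    """
    groups = defaultdict(list)
    for patient, bands in profiles.items():
        region = frozenset(bands_of_interest & set(bands))
        groups[region].append(patient)
    return dict(groups)

# Hypothetical example data, for illustration only.
example = {"UK1": {"A", "B"}, "US1": {"A", "B", "C"}, "US2": {"D"}}
regions = group_by_common_bands(example, {"A", "B"})
```

A region that contains patients from both cohorts is consistent with a band that is not endemic to either location.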
Similarity indices were employed to identify the degree of similarity between United Kingdom and U.S. samples. The first of these used here was the Raup and Crick similarity index. This analysis generated a set of three clusters representing similar bacterial communities. None of these clusters was found to relate to patient pairs. In contrast, these three clusters comprised one containing only U.S. patients, another containing only United Kingdom patients, and the last containing 10 patients from both the United States and United Kingdom. As such, this provides evidence both for and against the transatlantic distribution patterns of members of the communities detected.
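The logic of a Raup and Crick type index can be sketched with a small Monte Carlo simulation. The sketch below is a simplified illustration and not the procedure used in the study: the null model (weighted draws without replacement from a species pool), the convention of counting ties as one half, the number of replicates, and all names are assumptions made for this example.

```python
import random

def raup_crick(sample_a, sample_b, pool_weights, replicates=1000, seed=0):
    """Monte Carlo estimate of a Raup and Crick style similarity.

    sample_a, sample_b: sets of species (band) labels present in each sample.
    pool_weights: dict mapping every species in the pool to a positive
    sampling weight (for example, the number of samples it occurs in).
    """
    rng = random.Random(seed)
    species = list(pool_weights)

    def draw(k):
        # Weighted draw of k distinct species from the pool.
        remaining = species[:]
        chosen = set()
        for _ in range(k):
            weights = [pool_weights[s] for s in remaining]
            pick = rng.choices(remaining, weights=weights, k=1)[0]
            chosen.add(pick)
            remaining.remove(pick)
        return chosen

    observed = len(sample_a & sample_b)
    below = ties = 0
    for _ in range(replicates):
        # Null communities keep the observed richness of each sample.
        shared = len(draw(len(sample_a)) & draw(len(sample_b)))
        if shared < observed:
            below += 1
        elif shared == observed:
            ties += 1
    # Fraction of null replicates sharing fewer species than observed,
    # with ties counted as one half (an assumption of this sketch).
    return (below + 0.5 * ties) / replicates

# Hypothetical pool of 20 equally weighted bands, for illustration only.
pool = {f"band{i}": 1 for i in range(20)}
first_five = {f"band{i}" for i in range(5)}
next_five = {f"band{i}" for i in range(5, 10)}
similar = raup_crick(first_five, first_five, pool)
dissimilar = raup_crick(first_five, next_five, pool)
```

Under this convention, values near 1 indicate that two samples share more species than expected by chance, and values near 0 indicate that they share fewer.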
Raup and Crick probability-based similarity indices (SRC) were analyzed to determine the degree of significance of the similarities identified between bacterial community memberships (38). Here, a large number of sample pairs with nonsignificant values (0.05 < SRC < 0.95) would indicate stochastic dispersal, while a large number of significant SRC values would indicate a structuring influence, such as niche selection. The paired patients from the United States and United Kingdom showed little tendency to cluster on the basis of community composition, with evidence of this in only ca. 25% of patient pairs. In patient groups from either of the two geographic regions, where no attempt was made to pair on clinical grounds, U.S. patients had communities that clustered to a greater degree (ca. 40%) than U.S.-United Kingdom pairs, whereas United Kingdom patients showed a lower degree of significant clustering (ca. 15%). This suggests that the pairing had only a limited impact on the community findings of this aspect of the study. The reason that the United Kingdom samples as well as the United Kingdom-U.S. matches differed so much in terms of this analysis is not clear, although possible explanations may rest in the differential treatments, exposure to different microbes, or factors that were not assessed. It also is possible that the lower diversity in U.S. samples, as confirmed within the detection limit of this study, allows for the tighter grouping of these samples.
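The significance thresholds described above can be applied to a set of pairwise SRC values as in the following minimal sketch. The function name and the example values are hypothetical; only the cutoffs of 0.05 and 0.95 come from the text, and treating values exactly at a cutoff as significant is an assumption of this example.

```python
def fraction_significant(src_values, low=0.05, high=0.95):
    """Return the fraction of pairwise SRC values that are significant.

    A value is treated as nonsignificant when low < SRC < high, and as
    significant otherwise (significantly dissimilar or similar).
    """
    significant = [v for v in src_values if v <= low or v >= high]
    return len(significant) / len(src_values)

# Hypothetical SRC values for four sample pairs, for illustration only.
share = fraction_significant([0.01, 0.50, 0.97, 0.30])
```

A high fraction would point toward a structuring influence, whereas a low fraction would be consistent with stochastic dispersal.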
Even though samples may contain similar numbers or types of species, the abundance of those species in the samples could vary substantially, presenting completely different types of ecosystems. Morisita similarity coefficients were used to determine whether relative bacterial species abundance within samples differed between patient groups in the United Kingdom and U.S. As for the Raup and Crick analysis based on species presence or absence, three clusters were identified. Two distinct clusters of exclusively United Kingdom patients (comprising 4 and 6 patients, respectively) were formed, with a third cluster of 11 U.S. patients and 1 United Kingdom patient. Further, when average Morisita index values for United Kingdom, U.S., and paired United Kingdom-U.S. patient groups were examined, U.S. samples again showed a greater level of similarity. Thus, relative species abundance was more affected by biogeography than was community membership.
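To illustrate how an abundance-based index can separate samples that share the same species, the following sketch computes the classical count-based Morisita overlap index. The exact variant and data scaling used in the study are not stated here, so the formula choice, the treatment of band abundances as integer-like counts, and the example values are assumptions.

```python
def morisita_similarity(counts_a, counts_b):
    """Morisita overlap between two samples given per-species counts.

    counts_a and counts_b are equal-length lists; position i refers to the
    same species (T-RF band) in both samples. Each sample needs a total
    count above 1 and at least one species with a count above 1.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    # Simpson-type dominance term for each sample.
    d_a = sum(n * (n - 1) for n in counts_a) / (total_a * (total_a - 1))
    d_b = sum(n * (n - 1) for n in counts_b) / (total_b * (total_b - 1))
    cross = sum(a * b for a, b in zip(counts_a, counts_b))
    return 2 * cross / ((d_a + d_b) * total_a * total_b)

# Hypothetical counts: the same three species, with reversed dominance.
same_species_different_abundance = morisita_similarity([50, 5, 5], [5, 5, 50])
identical = morisita_similarity([10, 20, 30], [10, 20, 30])
no_overlap = morisita_similarity([10, 20, 0, 0], [0, 0, 5, 25])
```

In this example, the two samples with identical membership but reversed dominance score far lower than the identical pair, which is the distinction between community membership and relative abundance drawn in the text. With finite counts the index for identical samples can slightly exceed 1.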
Comparative studies of bacterial presence in the lungs of patients attending different CF centers previously have only focused on one or more species regarded as pathogens. Johansen et al. (21) found that P. aeruginosa and B. cepacia complex were more frequently detected in patients attending a CF center in Toronto than in those attending an equivalent center in Copenhagen. This focus on single species made it difficult to carry out a full comparative analysis with the findings here. In our study, bacterial species found to be common to both geographical patient groups included both known CF pathogens, such as P. aeruginosa, and those species not traditionally associated with CF lung infections, such as members of the genus Prevotella. However, the analysis of larger patient groups will be required to better characterize the geographic distribution of these species that were commonly detected here but as yet are not recognized as CF pathogens.
Certain bacterial species, typically regarded as being of environmental origin, can act as opportunistic pathogens. A number of such species were detected in the samples analyzed here; however, whether these are derived from the wider environment, the immediate treatment environment, or directly from other CF patients is not known. While some such species were detected in both United Kingdom and U.S. patient groups, others showed an uneven geographic distribution. This distribution may be related to differences in the treatment regimes between the two centers or environmental and biogeographical factors. Again, a wider study of this topic is warranted to determine the likely origin of the differences under the condition of patients attending different CF centers (23).
In conclusion, this study has considered the bacterial species present in the lungs of CF patients in two centers in the United States and United Kingdom. Species numbers per sample were similar in the two sites, but differences in species identities were detected within the detection limit applied here. Despite this, the four most frequently detected T-RF bands were present in 80% of all patients, suggesting that certain species are common to the CF lung on both continents. Overall, the U.S. samples showed a stronger association with each other than the United Kingdom samples, which may reflect the lower diversity detected in U.S. samples. Geographical differences were identified in species composition and more strongly in species prevalence in samples taken from patients attending the two CF centers. Overall, it is clear that the bacteriology of the CF lung is complex. Given the importance of lung function to CF patient health, it is by extension important to be able to understand this complexity as the first step in advancing therapy for these patients.
This work was supported by the Anna Trust.
Work conducted at UNC-Chapel Hill was supported by a grant from the National Institutes of Health (HL092964).
Published ahead of print on 10 November 2010.