Immunosuppression is associated with a variety of idiopathic clinical syndromes that may have infectious causes. It has been hypothesized that the cord colitis syndrome, a complication of umbilical-cord hematopoietic stem-cell transplantation, is infectious in origin.
We performed shotgun DNA sequencing on four archived, paraffin-embedded endoscopic colon-biopsy specimens obtained from two patients with cord colitis. Computational subtraction of human and known microbial sequences and assembly of residual sequences into a bacterial draft genome were performed. We used polymerase-chain-reaction (PCR) assays and fluorescence in situ hybridization to determine whether the corresponding bacterium was present in additional patients and controls.
DNA sequencing of the biopsy specimens revealed more than 2.5 million sequencing reads that did not match known organisms. These sequences were computationally assembled into a 7.65-Mb draft genome showing a high degree of homology with genomes of bacteria in the bradyrhizobium genus. The corresponding newly discovered bacterium was provisionally named Bradyrhizobium enterica. PCR identified B. enterica nucleotide sequences in biopsy specimens from all three additional patients with cord colitis whose samples were tested, whereas B. enterica sequences were absent in samples obtained from healthy controls and patients with colon cancer or graft-versus-host disease.
We assembled a novel bacterial draft genome from the direct sequencing of tissue specimens from patients with cord colitis. Association of these sequences with cord colitis suggests that B. enterica may be an opportunistic human pathogen. (Funded by the National Cancer Institute and others.)
Allogeneic hematopoietic stem-cell transplantation (HSCT) is a cornerstone of therapy for patients with certain hematologic diseases and is associated with a risk of serious complications.1,2 Conditioning and antimicrobial therapy can have direct toxic effects and alter the gut microbiome.3 Immunosuppression and the limited efficacy of immunologically naive stem cells in umbilical-cord HSCT can result in life-threatening infections, especially in the first year after transplantation.4 Gastrointestinal toxicity is common after HSCT and can be manifested clinically as colitis.5–8 Several types of colitis affect patients undergoing transplantation; these include bacterial, viral, and parasitic types as well as colitis associated with graft-versus-host disease (GVHD).7,8
Recently, a syndrome of colitis that appears to be unique to patients undergoing umbilical-cord HSCT has been described.9 The syndrome termed “cord colitis” is clinically and histopathologically distinct from other known causes of colitis in patients undergoing HSCT. This syndrome of non-bloody, frequent stools developed 3 to 11 months after umbilical-cord HSCT in 11 of 104 patients (11%) at a single center. Histopathological evaluation of colon-biopsy samples revealed chronic active colitis and epithelioid granulomas, without evidence of known microbial pathogens, viral cytopathic changes, or signs of GVHD. A traditional infectious-disease evaluation did not reveal a cause for this syndrome. The patients were eventually treated with metronidazole (in 11 patients) and a fluoroquinolone (in 9 patients); all 11 patients had a response to antibacterial therapy. A subset of patients had a relapse after the cessation of antibiotics, and all of these patients had a subsequent response to the reinitiation of antibacterial therapy.
It has been hypothesized that cord colitis is a manifestation of GVHD (rather than a distinct clinical syndrome),10 a transfusion-mediated colitis,11 colonic infection with Tropheryma whipplei,12 or an inflammatory disorder due to interactions between cord-blood stem cells in patients with double-umbilical-cord transplants.13 We pursued the hypothesis that this unusual form of colitis may be caused by an infectious organism. This hypothesis was based on several factors, including the anticipated alterations in the gut microbiome, the immunologic deficits associated with umbilical-cord stem cells, and the unique epidemiologic, clinical, and histologic characteristics of the syndrome.2–4
Investigators have previously used nucleic acid–based approaches for the identification of pathogens, resulting, for example, in the discovery of T. whipplei (the causative agent in Whipple's disease) and the Merkel-cell polyomavirus (the proposed causative agent in Merkel-cell carcinoma of the skin).14–16 Such approaches have also shown other disease–microbe associations, such as that between Fusobacterium nucleatum and colorectal cancer.17,18
We performed shotgun DNA sequencing of biopsy specimens obtained from patients with cord colitis, followed by computational subtraction of human sequences and known microbial sequences. The resultant microbial and viral analysis of the biopsy-tissue components allowed for clear characterization of the metagenome in this idiopathic, antibiotic-responsive syndrome.
We chose all 11 affected patients in the original cohort at Brigham and Women's Hospital9 for nucleic acid–based investigation. We initially limited the study to samples from a single institution, since there is substantial variation among centers in clinical approaches that are likely to affect the gut microbiome (e.g., the use of antithymocyte globulin, prophylactic and empirical antibiotic use, and GVHD prophylaxis). During review of the gastrointestinal-biopsy specimens from the original cohort, we noted that 5 patients had undergone lower gastrointestinal endoscopy with biopsy both before and after the initiation of antibiotic treatment for cord colitis; 16 of these 23 colon-biopsy specimens were selected for further investigation (Table 1). As controls, formalin-fixed, paraffin-embedded colon samples from 5 healthy persons who had undergone screening colonoscopy and 3 patients who had undergone umbilical-cord HSCT and had pathologically confirmed intestinal GVHD were used, as well as DNA from colon specimens obtained from 5 patients undergoing resection for colon cancer, as described previously.18 In addition, we obtained upper gastrointestinal–biopsy specimens from the duodenum and stomach from 3 patients from the initial cohort with cord colitis.
We also obtained gastrointestinal-biopsy specimens from a patient who was treated at Massachusetts General Hospital for colitis after undergoing HSCT and who had some features that were consistent with cord colitis, in order to investigate the microbiome of a patient who had undergone HSCT at another institution. (Details are provided in the Supplementary Appendix, available with the full text of this article at NEJM.org.) The institutional review board at each institution approved the study. A waiver of the requirement for informed consent was granted by the human research committee at Partners HealthCare, which also approved the study.
After removal of the first 20 μm of each formalin-fixed, paraffin-embedded block, we extracted DNA from two 20-μm sections with the use of a RecoverAll Total Nucleic Acid Isolation kit (Ambion). In cases in which extraction yielded more than 25 ng of DNA (in Patients 5 and 11),9 we pursued DNA sequencing. In cases in which extraction yielded 25 ng or less of DNA, we reserved the samples for validation studies (Fig. 1). Bar-coded libraries were prepared from pairs of samples obtained before and after the initiation of antibiotic therapy in Patients 5 and 11, as described previously.18 Paired-end 76-bp or 101-bp massively parallel sequencing was performed at separate sequencing centers for each patient in order to control for possible contamination (see the Supplementary Appendix for a detailed description of the contamination analysis).
We used PathSeq software, version 1.2 (www.broadinstitute.org/software/pathseq), to perform iterative computational subtraction of human reads, known microbial reads, and viral reads, as described previously.19 We used the Velvet software package for de novo short-read assembly (assembly without the use of a reference genome) to generate contiguous overlapping sequences (contigs) from reads that were not mapped in the PathSeq analysis.20 We used the Basic Local Alignment Search Tool (BLAST) for nucleotide sequences (BLASTN) and proteins (BLASTX) to compare all contigs with the National Center for Biotechnology Information (NCBI) databases of nucleotide and protein sequences, which include all human, nonhuman eukaryotic, prokaryotic, and viral nucleic acid and amino acid sequences from nondraft genomes.
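The idea of iterative computational subtraction can be illustrated with a toy sketch. This is not the PathSeq implementation, which relies on alignment tools; here exact k-mer matching stands in for alignment, reverse-complement matches are ignored, and the function names (`build_kmer_index`, `subtract_reads`, `iterative_subtraction`) and the default k-mer size are hypothetical choices made for illustration only.

```python
from typing import Iterable, List, Set


def build_kmer_index(references: Iterable[str], k: int) -> Set[str]:
    """Collect every k-mer that occurs in any reference sequence."""
    index: Set[str] = set()
    for ref in references:
        ref = ref.upper()
        for i in range(len(ref) - k + 1):
            index.add(ref[i:i + k])
    return index


def subtract_reads(reads: Iterable[str], index: Set[str], k: int) -> List[str]:
    """Keep only the reads that share no k-mer with the reference index."""
    kept = []
    for read in reads:
        seq = read.upper()
        kmers = (seq[i:i + k] for i in range(len(seq) - k + 1))
        if not any(kmer in index for kmer in kmers):
            kept.append(read)
    return kept


def iterative_subtraction(reads: Iterable[str],
                          reference_sets: Iterable[Iterable[str]],
                          k: int = 8) -> List[str]:
    """Subtract against each reference set in turn (e.g., human, then microbial)."""
    remaining = list(reads)
    for references in reference_sets:
        index = build_kmer_index(references, k)
        remaining = subtract_reads(remaining, index, k)
    return remaining
```

In this sketch, the reads that survive every round correspond to the unmapped reads that would be passed on to de novo assembly.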
Nonhuman reads from samples 11b and 11d (from Patient 11) were pooled and subjected to de novo assembly with the use of two different software packages: Velvet and ALLPATHS.20,21 Contigs that formed the novel genome were aligned to the NCBI nucleotide-sequence database with the use of BLASTN.22 Contigs with a high degree of homology with bradyrhizobium species were represented to a similar depth of coverage, suggesting a common origin. These contigs also had percentages of GC (guanine–cytosine) content similar to one another (Fig. S2 in the Supplementary Appendix). We linked contigs to one another using paired reads to generate supercontigs (see the Supplementary Appendix for a detailed description of assembly methods).
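The GC-content comparison described above can be sketched as follows. This is a minimal illustration rather than the analysis pipeline that was used; the helper names and the dictionary layout are hypothetical, and the default window of 60 to 66% is taken from the GC range reported for most supercontigs in the Results.

```python
from typing import Dict, List, Tuple


def gc_content(sequence: str) -> float:
    """Fraction of G and C among the unambiguous A, C, G, and T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def contigs_in_gc_range(contigs: Dict[str, str],
                        gc_range: Tuple[float, float] = (0.60, 0.66)) -> List[str]:
    """Return the names of contigs whose GC fraction lies inside gc_range."""
    low, high = gc_range
    return [name for name, seq in contigs.items()
            if low <= gc_content(seq) <= high]
```

Contigs that fall outside the window would be candidates for manual review, since a markedly different GC content may point to a different origin.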
The supercontigs that were generated by the de novo assembly formed the draft genome of a novel organism, called Bradyrhizobium enterica (deposited as NCBI Bioproject PRJNA174084; accession number, AMFB00000000; strain name, B. enterica DFCI-1; www.broadinstitute.org/annotation/genome/Bradyrhizobium_enterica.1/MultiHome.html). We used the Prodigal annotation tool23 to annotate the B. enterica genome. We used the PhyloPhlAn tool (http://huttenhower.sph.harvard.edu/phylophlan) to perform rooted phylogenetic analysis on a subset of 400 core genes, followed by bootstrap analysis to determine the strength of the predicted phylogenetic associations.
Comparative genomic analysis was performed (for details, see the Supplementary Appendix). Global alignment of amino acid sequences was performed with the use of the Needleman–Wunsch algorithm,24 and percentage identity between each B. enterica gene and its closest homologue in B. japonicum was determined.
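A minimal sketch of global alignment with the Needleman–Wunsch algorithm, followed by a percentage-identity calculation, is shown below. The scoring scheme (match +1, mismatch −1, linear gap −1) is an assumption made for illustration; the scoring parameters used in the study are not stated here, and protein alignments are usually scored with a substitution matrix. Percentage identity is computed over the full alignment length, which is one of several conventions.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[str, str]:
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover the alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a) if aligned_a else 0.0
```

Applied to each predicted protein and its closest homologue, a function of this kind yields the per-gene identity values that a figure such as Figure 2B summarizes.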
Primers for polymerase-chain-reaction (PCR) assays were designed against a poorly conserved region of the provisional B. enterica genome with the use of PrimerQuest (Integrated DNA Technologies) and were then synthesized. These primers (forward primer, 5′-TCGAGGGCTACGGCTTGAAGATTT-3′; reverse primer, 5′-ACAACGTGTTGCCGCCAATATGAG-3′) amplify a 367-bp target, which spans an intergenic region (supercontig 17 at base-pair position 152,156–152,522). Primers that target the human actin gene (forward primer, 5′-GCGAGAAGATGACCCAGATC-3′; reverse primer, 5′-CCAGTGGTACGGCCAGAGG-3′) amplify a 102-bp target. A detailed description of PCR conditions is provided in the Supplementary Appendix.
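The stated amplicon size can be checked against the stated coordinates with simple arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that the target includes both primer-binding sites; the helper names are hypothetical. The reverse complement of the reverse primer gives the sequence expected on the forward strand at the 3′ end of the target.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def amplicon_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate interval."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


FORWARD_PRIMER = "TCGAGGGCTACGGCTTGAAGATTT"
REVERSE_PRIMER = "ACAACGTGTTGCCGCCAATATGAG"

# Coordinates on supercontig 17 as given in the text.
target_length = amplicon_length(152_156, 152_522)  # 367

# Sequence expected on the forward strand at the 3' end of the amplicon.
forward_strand_site = reverse_complement(REVERSE_PRIMER)
```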
We used fluorescence in situ hybridization (FISH) to perform experiments on formalin-fixed, paraffin-embedded colon-biopsy specimens obtained from patients with cord colitis and from controls. The experiments were carried out according to the method of Swidsinski, with both a bradyrhizobium-specific probe and a eubacterial (universal bacterial) probe.25,26 (See the Supplementary Appendix for detailed descriptions of FISH methods and competition experiments performed to show the specificity of the bradyrhizobium probe.)
DNA was extracted from the temporally distinct colon-biopsy specimens from Patients 5 and 11 (samples 5b and 5c and samples 11b and 11d) (Table 1) and was used for massively parallel sequencing (Fig. 1). Bar-coded libraries were prepared and subjected to sequencing on the Illumina V3 Platform.19 Sequential computational subtraction of human reads and known microbial reads (including bacteria, archaea, viruses, and fungi) was performed with the use of PathSeq (Table S1 in the Supplementary Appendix).19 All data regarding nonhuman reads are deposited in the NCBI Sequence Read Archive (submission number, SRS386798). More than 2.5 million reads remained unmapped, suggesting the presence of abundant sequences that were absent from the reference databases used.
A pooled set of nonhuman reads from samples 11b and 11d was subjected to de novo assembly.20,21 The ALLPATHS software generated the largest number of total contigs that were over 2.5 kb in length. Ninety-nine contigs that were generated by this method were assembled into 89 supercontigs and manually reviewed; 1 supercontig (3621 bp) was removed, since it showed high sequence similarity to a SEN virus.27 A 126-kb circular supercontig (contig 32, supercontig 25) had a high degree of homology with a plasmid element from bradyrhizobium species BTAi1 (pBBta01; accession number, CP000495.1); this plasmid is absent in B. japonicum. The 88 remaining supercontigs all contained regions with a high degree of homology with B. japonicum, the genome of which consists of a single circular chromosome of 9,105,828 bp; 86 of the 88 supercontigs had a GC content of 60 to 66%. The resulting draft genome size (including the plasmid) was 7,645,871 bp, with a 64.4% GC content. (The genome sizes of most bradyrhizobium species range from approximately 7.5 to 10 Mb.) Given the known limitations of massively parallel sequencing in genome assembly,28 small areas of the genome probably remain unassembled. The high coverage of the genome suggests that most of it has been discovered. With the use of the Prodigal genome annotation tool, 7112 protein-encoding genes were predicted within the provisional genome23 (see the Supplementary Appendix for details regarding genome assembly and annotation).
Phylogenetic analysis with the use of PhyloPhlAn generated a rooted phylogenetic tree (Fig. 2A). Bootstrap analysis revealed more than 99% consensus at all branch points except for one (circled in Fig. 2A), where the bootstrap value was 0.181. The organism was provisionally named B. enterica, given the close phylogenetic relationship with B. japonicum and the human anatomical location where the organism was discovered. The amino acid sequence identity between homologous proteins in B. enterica and B. japonicum is shown in Figure 2B.29 A list of genes that are present in B. enterica and absent in B. japonicum is provided in the Supplementary Appendix.
Assembly of long contigs corresponding to the draft genome of a novel organism was technically possible because of the inferred high abundance of the novel organism and the oligoclonality of the microbiome in the samples obtained from patients with cord colitis. To determine the ratio of B. enterica to total bacterial reads in the four index samples, we once more performed PathSeq analysis, with the addition of the draft B. enterica genome to the reference database. The most abundant bacterial reads are presented in Table 2.
In Patient 5, the relative abundance of B. enterica reads in the posttreatment sample, as compared with the pretreatment sample, decreased by 84.4%. The posttreatment sample was obtained 28 days after the initiation of antibiotic therapy for the cord colitis syndrome (see the Supplementary Appendix for details). Similarly, in Patient 11, the relative abundance of B. enterica reads in the posttreatment sample, as compared with the pretreatment sample, decreased by 60.5%. The posttreatment sample was obtained 44 days after the initiation of antibiotic therapy for relapsed cord colitis syndrome. B. enterica was the predominant bacterium in all four samples (Table 2). The most abundant human viruses that were identified on whole-genome sequencing are presented in Table S2 in the Supplementary Appendix.
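A percentage decrease in relative abundance of the kind reported above can be computed as sketched below. The denominator used for relative abundance is an assumption here (a generic total read count), the function names are hypothetical, and the read counts in the usage note are invented for illustration; they are not values from Table 2.

```python
def relative_abundance(target_reads: int, total_reads: int) -> float:
    """Target reads as a fraction of the chosen total read count."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return target_reads / total_reads


def percent_decrease(pre: float, post: float) -> float:
    """Percentage decrease from a pretreatment to a posttreatment value."""
    if pre <= 0:
        raise ValueError("pretreatment value must be positive")
    return (1.0 - post / pre) * 100.0
```

For example, with hypothetical counts of 500 target reads per 1000 total reads before treatment and 78 per 1000 afterward, `percent_decrease(relative_abundance(500, 1000), relative_abundance(78, 1000))` evaluates to approximately 84.4.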
In contrast to the microbiome of colon samples from healthy controls30 and samples of normal colonic tissue obtained adjacent to colorectal tumors,18 known intestinal commensals (e.g., Propionibacterium acnes) were present at a much lower level than B. enterica, with the number of P. acnes reads ranging from 0.08 to 0.3% of the total number of B. enterica reads. Cytomegalovirus colitis had previously been diagnosed and treated in Patient 11. There was no histopathological evidence of cytopathic changes associated with cytomegalovirus at the time that cord colitis was diagnosed.9 Of note, the number of cytomegalovirus reads was low and decreased further in the analysis of the second biopsy sample (Table S4 in the Supplementary Appendix).
PCR analysis was performed to investigate the abundance of B. enterica, relative to total bacteria and total human cells, in patients with cord colitis as compared with healthy controls, patients with colon cancer, and patients who had undergone umbilical-cord HSCT and had pathologically confirmed GVHD. In addition to samples from these controls, colon-biopsy samples were obtained within 120 days before treatment and 200 days after treatment in three other patients with cord colitis. Given the very limited amount of DNA available in these samples, quantitative PCR studies were not possible. B. enterica was undetectable in all specimens from the three types of control tissue (Fig. 3A, 3B, and 3C). The abundance of B. enterica in colon-biopsy samples from three additional patients with the cord colitis syndrome was inferred from the intensity of the band corresponding to the B. enterica PCR product, as compared with the actin PCR product (Fig. 3D, 3E, and 3F). According to this qualitative measurement, B. enterica was less abundant in samples obtained before the onset of cord colitis, was present in all samples obtained close to the time of diagnosis of cord colitis, and, in some cases, decreased in abundance after antimicrobial treatment.
B. enterica was also detected in samples obtained on stomach and duodenal biopsy from three patients with cord colitis (Patients 5, 6, and 11) who had upper gastrointestinal tract involvement at the time of diagnosis (see the Supplementary Appendix). Sequencing reads from B. enterica were also identified in samples obtained on duodenal and gastric biopsy from a patient with colitis after HSCT at Massachusetts General Hospital (see the Supplementary Appendix). This patient had some clinical and histopathological features that were consistent with a diagnosis of cord colitis.
FISH was used to visualize B. enterica in formalin-fixed, paraffin-embedded colon samples from patients with cord colitis. Hybridization with both universal bacterial and bradyrhizobium probes showed the presence of a specific fluorescence signal representing B. enterica within affected tissue (Fig. 3G through 3K). This signal was absent in normal colon samples (Fig. 3L and 3M). All samples that were examined with the use of FISH were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) to visualize host-cell nuclei.
Conventional microbiologic tools can be used successfully to detect many clinically significant infectious organisms. However, many potentially infectious syndromes remain idiopathic. Determining a candidate causal agent in these diseases can be challenging and is often unsuccessful. Several observers have predicted that new genomic methods that are sensitive and unbiased may illuminate candidate agents in a subset of diseases, as they have in selected circumstances previously.14–16,31
We used genomic tools to identify a new bacterial species, provisionally named B. enterica, in samples obtained from a cohort of patients with an idiopathic, antibiotic-responsive colitis syndrome with distinct histopathological features. Without a priori knowledge of the organism, we assembled a novel bacterial genome and candidate human pathogen from a specimen of disease tissue. The unusual lack of diversity in the colonic microbiome after HSCT that we noted in these samples has been described previously.3 The abundance of B. enterica in the samples suggests that the syndrome is distinct from other known transplantation-associated colitis syndromes. The organism appeared to be specific to patients with cord colitis; it was not present in various controls, including patients with intestinal GVHD.
The phylogenetic analysis showed that B. enterica was taxonomically related to plant endosymbionts such as B. japonicum, a nitrogen-fixing bacterium that has been used extensively, along with related organisms, in commercial agriculture.32 To date, bradyrhizobium species have not been associated with human disease. The draft genome of B. enterica appears to lack several genes that are critical for nitrogen fixation but does code for at least four filamentous hemagglutinin genes; these genes encode proteins that have been implicated in the binding of pathogenic bacteria, such as Bordetella pertussis, to human airway epithelium.33 Although B. enterica has yet to be cultured and its natural habitat has not yet been defined, it may not be an institution-specific organism. Our findings, though compelling, are insufficient to confirm an association between B. enterica and cord colitis. To assess the generalizability of such an association, future studies will need to focus on the identification of patients with cord colitis at other institutions and investigate the microbiome in these patients and in controls. We hypothesize that variations in institutional HSCT practices probably affect the composition of the gut microbiome and thus the incidence of cord colitis. We anticipate that the identification of additional cases of this rare disease will require a multicenter effort.
Organisms related to B. enterica have shown direct or inferred sensitivity to fluoroquinolones34 and metronidazole,35,36 the therapy that was effective in the treatment of patients with cord colitis in the original cohort.9 Genes encoding pyruvate–ferredoxin oxidoreductase, which are predicted to have a critical role in the reduction of metronidazole and thus its activity,37,38 are present in the genome of B. enterica, a finding that supports the hypothesis that B. enterica is the target of metronidazole therapy in patients with cord colitis. However, on the basis of the available data, it remains possible that bacterial clearance over time and resolution of symptoms may be the result of maturation of the transplanted immune system in affected patients and not a direct effect of antibiotic therapy.
Although we have not shown that B. enterica is the cause of cord colitis,39 we have demonstrated the usefulness of sequencing-based technologies for the unbiased identification of previously undiscovered candidate human pathogens. We anticipate that this reverse microbiology approach, in which the discovery of a novel microorganism is possible without a priori knowledge of the organism or its genome, will allow for the identification of additional potential pathogens.
Our initial work supports the presence of this novel bacterium at the genomic level, but additional biologic characterization is clearly needed. Isolation and culture of B. enterica and the generation and purification of native or recombinant target antigens for antibody development, for example, will greatly aid in the determination of the medical significance and epidemiologic features of this organism. Investigation of the association between this organism and known colitis syndromes (e.g., idiopathic human immunodeficiency virus–associated diarrheal illnesses, inflammatory bowel disease, and the irritable bowel syndrome) could be clinically informative. A description of the epidemiologic features of B. enterica in normal and diseased tissues and the identification of its natural habitat may lead to a better biologic and biomedical understanding of this organism. Finally, we must caution that although we have suggested a possible association between B. enterica and cord colitis, we have not shown that the association is the cause or the consequence of the clinical syndrome, nor have we shown the generalizability of the association in patients other than those in the original, single-institution cohort.
Supported by grants from the American Association for Cancer Research, the American Society for Blood and Marrow Transplantation, and the Bladder Cancer Advocacy Network (all to Dr. Bhatt); and by grants from the Starr Cancer Consortium and National Cancer Institute (RC2CA148317) (both to Dr. Meyerson).
We thank Daniel Auclair, James Gomez, Rhonda O'Keefe, Deborah Hung, Roby Bhattacharyya, Ann Hirsch, Nikhil Wagle, Frederick Wilson, Doyle Ward, Joshua Francis, Nancy Berliner, and Robert Mayer for helpful discussions.
Disclosure forms provided by the authors are available with the full text of this article at NEJM.org.