Figure S1: B. distasonis ATCC 8503 and B. vulgatus ATCC 8482 Chromosomes
The B. distasonis ATCC 8503 chromosome is shown in (A), and the B. vulgatus ATCC 8482 chromosome is shown in (B). The coding potential of the leading and lagging strands is relatively unbiased. Circles shown in the figure represent, from inside out, GC skew, GC content variation, rRNA operons, tRNA genes, conjugative transposons (CTns), CPS loci, extracytoplasmic function (ECF)-σ factors, SusC paralogs, and all predicted genes with assigned functions on reverse and forward strands, respectively. Color codes for genes are based on their COG functional classification.
(2.3 MB PDF)
Figure S2: COG-Based Characterization of All Proteins with Annotated Functions in the Proteomes of Sequenced Bacteroidetes
The term “Bacteroides orthologs” refers to the 1,416 orthologs shared by the sequenced gut Bacteroidetes (B. vulgatus, B. distasonis, B. thetaiotaomicron, plus the two B. fragilis strains). Color codes are the same as in Figure S1.
(334 KB PDF)
Figure S3: Pairwise Alignments of the Human Gut Bacteroidetes Genomes Reveal Rapid Deterioration of Global Synteny with Increasing Phylogenetic Distance
Each data point on the dot plot represents one pair of mutual best hits (BLASTP) between the two genomes, plotted by its location in each genome. Diagonal lines indicate synteny.
(822 KB PDF)
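As an illustration of the mutual-best-hit criterion used for the dot plot, the following is a minimal sketch and not the authors' pipeline. It assumes the BLASTP results for each search direction have already been parsed into (query, subject, score) tuples; the `Hit` record, the function names, and the use of bit score to rank hits are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical parsed BLASTP record: (query_id, subject_id, bit_score).
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(a_vs_b: List[Hit], b_vs_a: List[Hit]) -> List[Tuple[str, str]]:
    """Return (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    # Keep a pair only if the best hit of b (searched against genome A) is a.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data (invented identifiers and scores, for illustration only).
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 50.0), ("a2", "b2", 90.0), ("a3", "b1", 80.0)]
b_vs_a = [("b1", "a1", 195.0), ("b2", "a2", 88.0), ("b2", "a1", 40.0)]
pairs = mutual_best_hits(a_vs_b, b_vs_a)
```

Plotting the chromosomal coordinate of each gene in a pair against that of its partner would give one point per pair; runs of points along a diagonal correspond to syntenic regions.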
Figure S4: CPS Loci Are the Most Polymorphic Regions in the Gut Bacteroidetes Genomes
High-resolution synteny map of CPS loci and flanking regions in the two sequenced B. fragilis strains. There are nine CPS loci in each genome. Each data point represents a pair of orthologs (mutual best hits; e-value cutoff: 10^−6). Brackets define the coordinates for component genes within a given locus (some pairs are missing due to gene loss or gain): x-axis, coordinate of the middle point of the gene on the NCTC 9343 chromosome; y-axis, coordinate of the middle point of the gene on the YCH 46 chromosome. With the exception of CPS locus 5, which is strictly conserved, the CPS loci are affected by nonhomologous gene replacement and rearrangement.
(1.0 MB PDF)
Table S1: Comparison of Genome Parameters for B. distasonis ATCC 8503, B. vulgatus ATCC 8482, B. thetaiotaomicron ATCC 29148, B. fragilis NCTC 9343, and B. fragilis YCH 46
An asterisk (*) indicates that the numbers of SusC/SusD homologs provided are based on a BLASTP e-value of 10^−20 or lower; the numbers shown in parentheses are based on criteria described in SusC/SusD alignments in Materials and Methods. See http://rd.plos.org/pbio.0050156.a for complete lists of SusC/SusD homologs. A hybrid two-component system protein contains all of the domains present in classical two-component systems, but in one polypeptide [50].
(92 KB PDF)
Table S2: Shared Orthologs in B. distasonis ATCC 8503, B. vulgatus ATCC 8482, B. thetaiotaomicron ATCC 29148, and B. fragilis Strains NCTC 9343 and YCH 46
For an explanation of COG-based functional codes, see Figure S1.
(277 KB PDF)
Table S3: Glycoside Hydrolases Found in B. distasonis ATCC 8503, B. vulgatus ATCC 8482, B. thetaiotaomicron ATCC 29148, and B. fragilis Strains NCTC 9343 and YCH 46
The classification scheme used is described in the Carbohydrate-Active enZYme (CAZy) database.
(70 KB PDF)
Table S4: List of Putative Xenologs in B. distasonis ATCC 8503, B. vulgatus ATCC 8482, B. thetaiotaomicron ATCC 29148, B. fragilis NCTC 9343, and B. fragilis YCH 46
The putative xenologs are listed for B. distasonis ATCC 8503 (A), B. vulgatus ATCC 8482 (B), B. thetaiotaomicron ATCC 29148 (C), B. fragilis NCTC 9343 (D), and B. fragilis YCH 46 (E). For an explanation of COG-based functional codes, see Figure S1. The lateral gene transfer (LGT) column defines the predicted evolutionary history of the coding sequence: LGT-in, laterally transferred into the genome; LGT-out, laterally transferred out of the genome; and LGT-unresolved, laterally transferred but direction unknown. See Materials and Methods for detailed explanations.
(447 KB PDF)
Table S5: CPS Loci of B. distasonis ATCC 8503, B. vulgatus ATCC 8482, B. thetaiotaomicron ATCC 29148, B. fragilis NCTC 9343, and B. fragilis YCH 46
Shown are Gene ID, annotated function, GC content (%), and the predicted evolutionary history of the coding sequence for B. distasonis ATCC 8503 (A), B. vulgatus ATCC 8482 (B), B. thetaiotaomicron ATCC 29148 (C), B. fragilis NCTC 9343 (D), and B. fragilis YCH 46 (E).
LGT-in, laterally transferred into the genome; LGT-out, laterally transferred out of the genome; LGT-unresolved, laterally transferred but direction unknown; NO, not laterally transferred; NOVEL, no homologs found in any other genomes in public databases; UNRESOLVED, whether laterally transferred or not is not resolved. See Materials and Methods for detailed explanations. Color codes are the same as in B.
(509 KB PDF)
Table S6: CPS Loci Are among the Most Polymorphic Regions in the Two B. fragilis Genomes
The p-value is based on the tail probability of a binomial distribution. Gene loss/gain events (3,531 in total) are counted as the difference between the total number of genes and the total number of genes shared between the two genomes.
(61 KB PDF)
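To make the phrase "tail probability of a binomial distribution" concrete, here is a minimal sketch of how such a p-value can be computed. It is not the authors' calculation: the choice of the upper tail, the function name `binomial_upper_tail`, and the meaning assigned to `k`, `n`, and `p` are assumptions for illustration. One plausible reading is that `n` is the number of genes in a region, `k` is the number of loss/gain events observed there, and `p` is the genome-wide per-gene event rate, but the table legend does not specify this.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Toy check with a fair coin: probability of at least 1 head in 2 tosses.
at_least_one_head = binomial_upper_tail(1, 2, 0.5)
```

A small tail probability under this reading would indicate that a region carries more loss/gain events than expected from the background rate.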
Table S7: ECF-σ Factor–Containing Polysaccharide Utilization Gene Clusters in B. distasonis ATCC 8503 and B. vulgatus ATCC 8482
B. distasonis ATCC 8503 is shown in (A), and B. vulgatus ATCC 8482 is shown in (B). The three columns represent Gene ID, functional annotation, and predicted evolutionary history of the gene (labeled as in Table S5).
(184 KB PDF)
Text S1: Overview of Strategy Used to Identify Lateral Gene Transfer
(148 KB DOC)
Accession Numbers
The genome sequences of B. vulgatus ATCC 8482 and B. distasonis ATCC 8503 have been deposited in GenBank (http://www.ncbi.nlm.nih.gov/Genbank) under accession numbers CP000139 and CP000140, respectively.