Numidum massiliense gen. nov., sp. nov., strain mt3T is the type strain of Numidum gen. nov., a new genus within the family Bacillaceae. This strain was isolated from the faecal flora of a Tuareg boy from Algeria. We describe this Gram-positive facultative anaerobic rod and provide its complete annotated genome sequence according to the taxonogenomics concept. Its genome is 3 755 739 bp long and contains 3453 protein-coding genes and 64 RNA genes, including eight rRNA genes.
Several microbial ecosystems are harboured by the human body, among which is the human gut microbiota. This particular ecosystem is so vast that its cell count (10^14 cells) is estimated at ten times the number of human cells in the human body, and its collective bacterial genome size is 150 times larger than the human genome. Over the years, with the evolution of exploratory techniques of microbial ecosystems from culture to metagenomics, the gut microbiota has been shown to be involved in many conditions such as obesity, inflammatory bowel disease and irritable bowel disease. It has also been shown to play key roles in digestion as well as metabolic and immunologic functions. A better knowledge of the gut microbiota's composition is thus required for an improved understanding of its functions.
To extend the gut microbiota repertoire and bypass the noncultivable bacteria issue, the culturomics concept was developed to cultivate as exhaustively as possible the viable population of a bacterial ecosystem; it consists of multiplying culture conditions and varying media, temperature and atmosphere. Using this technique, strain mt3T was isolated and identified as a previously unknown member of the Bacillaceae family. Currently there are 53 validated genera in the Bacillaceae family. This family was created by Fischer in 1895 (http://www.bacterio.net/Bacillaceae.html). The genus Bacillus was described as its type genus. The genera that belong to this family are rod shaped, mostly aerobic and facultative anaerobic bacteria. They are found in various ecosystems like the human body, soil, water, air and other environmental ecosystems.
Bacterial classification is currently based on a polyphasic approach with phenotypic and genotypic characteristics such as DNA-DNA hybridization, G+C content and 16S rRNA sequence similarity. Nevertheless, this classification system has its limits, among which is the high cost of the DNA-DNA hybridization technique and its low reproducibility. With the recent development of genome sequencing technology, a new concept of bacterial description was developed in our laboratory. This taxonogenomics concept combines a proteomic description with the matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) profile associated with a phenotypic description and the sequencing, annotation and comparison of the complete genome of the new bacterial species.
Using the concept of taxonogenomics, we describe Numidum massiliense gen. nov., sp. nov., strain mt3T (= CSUR P1305 = DSM 29571), a new member of the Bacillaceae family.
A stool sample was collected from a healthy Tuareg boy living in Algeria. Verbal consent was obtained from the patient, and the study was approved by the Institut Fédératif de Recherche 48, Faculty of Medicine, Marseille, France, under agreement 09-022.
The sample was cultured using the 18 culture conditions of culturomics. The colonies were obtained by seeding on solid medium, purified by subculture and identified using MALDI-TOF MS. Colonies were deposited in duplicate on an MTP 96 MALDI-TOF MS target plate (Bruker Daltonics, Leipzig, Germany), which was analysed with a Microflex spectrometer (Bruker). The 12 spectra obtained were matched against the references of the 7567 bacteria contained in the database by standard pattern matching (with default parameter settings), with MALDI BioTyper database software 2.0 (Bruker). An identification score over 1.9 with a validated species allows identification at the species level, and a score under 1.7 does not enable any identification. When identification by MALDI-TOF MS failed, the 16S rRNA was sequenced. Stackebrandt and Ebers suggest 16S rRNA sequence similarity levels of 98.7% and 95% as thresholds to define, respectively, a new species and a new genus without performing DNA-DNA hybridization.
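The decision rules quoted in this paragraph can be summarised in a short sketch. The function names and return labels are hypothetical, chosen only for illustration. The handling of scores between 1.7 and 1.9 is an assumption, since the text states only what happens above 1.9 and below 1.7.

```python
def interpret_maldi_score(score: float) -> str:
    """Interpret a MALDI-TOF MS identification score with the cutoffs quoted in the text."""
    if score > 1.9:
        return "species-level identification"
    if score < 1.7:
        return "no identification"
    # Assumption: the text does not say how scores from 1.7 to 1.9 are treated.
    return "inconclusive"


def classify_by_16s(similarity_percent: float) -> str:
    """Apply the 95% (genus) and 98.7% (species) 16S rRNA similarity thresholds."""
    if similarity_percent < 95.0:
        return "putative new genus"
    if similarity_percent < 98.7:
        return "putative new species"
    # At or above 98.7%, 16S similarity alone does not separate the isolate
    # from its closest relative.
    return "not distinguishable by 16S alone"
```

With these thresholds, an isolate whose closest validly named relative shows less than 95% 16S rRNA similarity falls into the "putative new genus" branch.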
To determine our strain's ideal growth conditions, different temperatures (25, 28, 37, 45 and 56°C) and atmospheres (aerobic, microaerophilic and anaerobic) were tested. GENbag anaer and GENbag microaer systems (bioMérieux, Marcy l'Étoile, France) were used to test anaerobic and microaerophilic growth, respectively. Aerobic growth was achieved with and without 5% CO2.
Gram staining, motility, catalase, oxidase and sporulation were tested as previously described. Biochemical description was performed using API 20 NE, ZYM and 50CH (bioMérieux) according to the manufacturer's instructions. Cellular fatty acid methyl ester (FAME) analysis was performed by gas chromatography/mass spectrometry (GC/MS). Two samples were prepared with approximately 70 mg of bacterial biomass per tube collected from several culture plates. FAMEs were prepared as previously described (http://www.midi-inc.com/pdf/MIS_Technote_101.pdf). GC/MS analyses were carried out as described before. Briefly, fatty acid methyl esters were separated using an Elite 5-MS column and monitored by mass spectrometry (Clarus 500-SQ 8 S; Perkin Elmer, Courtaboeuf, France). Spectral database search was performed using MS Search 2.0 operated with the Standard Reference Database 1A (NIST, Gaithersburg, MD, USA) and the FAMEs mass spectral database (Wiley, Chichester, UK).
Antibiotic susceptibility testing was performed using the disk diffusion method according to European Committee on Antimicrobial Susceptibility Testing (EUCAST) 2015 recommendations (http://www.eucast.org/). To perform the negative staining of strain mt3T, Formvar-coated grids were deposited on a 40 μL bacterial suspension drop, incubated at 37°C for 30 minutes and then placed on 1% ammonium molybdate for 10 seconds. The grids, dried on blotting paper, were observed with a Tecnai G20 transmission electron microscope (FEI Company, Limeil-Brevannes, France).
N. massiliense strain mt3T (= CSUR P1305 = DSM 29571) was grown on 5% sheep's blood–enriched Columbia agar (bioMérieux) at 37°C in aerobic atmosphere. Bacteria grown on three petri dishes were collected and resuspended in 4 × 100 μL of Tris-EDTA (TE) buffer. Then 200 μL of this suspension was diluted in 1 mL TE buffer for lysis treatment that included a 30-minute incubation with 2.5 μg/μL lysozyme at 37°C, followed by an overnight incubation with 20 μg/μL proteinase K at 37°C. Extracted DNA was then purified using three successive phenol–chloroform extractions and ethanol precipitations at −20°C overnight. After centrifugation, the DNA was resuspended in 160 μL TE buffer.
Genomic DNA of N. massiliense was sequenced on the MiSeq Technology (Illumina, San Diego, CA, USA) with the mate pair strategy. The gDNA was barcoded in order to be mixed with 11 other projects with the Nextera Mate Pair sample prep kit (Illumina). gDNA was quantified by a Qubit assay with a high sensitivity kit (Life Technologies, Carlsbad, CA, USA) at 66.2 ng/μL. The mate pair library was prepared with 1 μg of genomic DNA using the Nextera mate pair Illumina guide. The genomic DNA sample was simultaneously fragmented and tagged with a mate pair junction adapter. The pattern of the fragmentation was validated on an Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA) with a DNA 7500 labchip. The DNA fragments ranged in size from 1 to 11 kb, with an optimal size at 3.927 kb. No size selection was performed, and 505 ng of tagmented fragments were circularized. The circularized DNA was mechanically sheared to small fragments with an optimal size of 597 bp on a Covaris device S2 in microtubes (Covaris, Woburn, MA, USA). The library profile was visualized on a High Sensitivity Bioanalyzer LabChip (Agilent Technologies), and the final library concentration was measured at 59.2 nmol/L.
The libraries were normalized at 2 nM and pooled. After a denaturation step and dilution at 15 pM, the pool of libraries was loaded onto the reagent cartridge and then onto the instrument along with the flow cell. Automated cluster generation and sequencing were performed in a single 39-hour run with a 2 × 251 bp read length.
Open reading frames (ORFs) were predicted using Prodigal with default parameters, but the predicted ORFs were excluded if they spanned a sequencing gap region (containing N). The predicted bacterial protein sequences were searched against the Clusters of Orthologous Groups (COGs) using BLASTP (E value 1e-03, coverage 70%, identity percent 30%). If no hit was found, the sequence was searched against the NR database using BLASTP with an E value of 1e-03, coverage of 70% and identity percent of 30%. If sequence lengths were smaller than 80 amino acids, we used an E value of 1e-05. The tRNAScanSE tool was used to find tRNA genes, whereas rRNAs were found by using RNAmmer. Lipoprotein signal peptides and the number of transmembrane helices were predicted using Phobius. ORFans were identified if all the performed BLASTP procedures did not give positive results (E value smaller than 1e-03 for ORFs with sequence length greater than 80 aa or E value smaller than 1e-05 for ORFs with sequence length smaller than 80 aa). Such parameter thresholds have already been used in previous works to define ORFans.
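The ORFan rule described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: `BlastHit`, `evalue_cutoff`, `is_positive_hit` and `is_orfan` are hypothetical names, the coverage and identity thresholds are treated as inclusive, and a protein of exactly 80 aa (a case the text leaves open) is given the 1e-03 cutoff.

```python
from typing import Iterable, NamedTuple


class BlastHit(NamedTuple):
    """Minimal summary of one BLASTP hit (hypothetical structure)."""
    evalue: float
    coverage: float  # percent of the query covered by the alignment
    identity: float  # percent identity of the alignment


def evalue_cutoff(protein_length: int) -> float:
    """Return 1e-05 for proteins shorter than 80 aa, 1e-03 otherwise."""
    # Assumption: a length of exactly 80 aa uses the 1e-03 cutoff.
    return 1e-05 if protein_length < 80 else 1e-03


def is_positive_hit(hit: BlastHit, protein_length: int) -> bool:
    """A hit counts when it passes the E-value, coverage and identity thresholds."""
    return (
        hit.evalue < evalue_cutoff(protein_length)
        and hit.coverage >= 70.0
        and hit.identity >= 30.0
    )


def is_orfan(
    protein_length: int,
    cog_hits: Iterable[BlastHit],
    nr_hits: Iterable[BlastHit],
) -> bool:
    """A protein is an ORFan when neither the COG nor the NR search gives a positive hit."""
    all_hits = list(cog_hits) + list(nr_hits)
    return not any(is_positive_hit(hit, protein_length) for hit in all_hits)
```

Note that a hit with an E value of 1e-04 would count for a 120 aa protein but not for a 60 aa protein, because the shorter sequence is held to the stricter 1e-05 cutoff.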
Genomes were automatically retrieved from the 16S rRNA tree using Xegen software (Phylopattern). For each selected genome, complete genome sequence, proteome and ORFeome genome sequence were retrieved from the National Center for Biotechnology Information FTP site. All proteomes were analysed with proteinOrtho. Then for each pair of genomes, a similarity score was computed. This score is the mean value of nucleotide similarity between all pairs of orthologues between the two genomes studied (AGIOS). An annotation of the entire proteome was performed to define the distribution of functional classes of predicted genes according to the clusters of orthologous groups of proteins (using the same method as for the genome annotation). To evaluate the genomic similarity among the compared strains, we determined two parameters: digital DNA-DNA hybridization (dDDH), which exhibits a high correlation with DNA-DNA hybridization (DDH), and AGIOS, which was designed to be independent from DDH.
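As defined above, AGIOS is simply the mean nucleotide similarity over all orthologue pairs shared by two genomes. The minimal sketch below assumes the per-pair identity percentages have already been computed (for example from pairwise alignments of the orthologues reported by proteinOrtho). The function name `agios` and its input format are hypothetical.

```python
from statistics import mean
from typing import Sequence


def agios(orthologue_identities: Sequence[float]) -> float:
    """Mean nucleotide identity (%) over all orthologue pairs shared by two genomes."""
    if not orthologue_identities:
        # Without shared orthologues the mean is undefined.
        raise ValueError("no orthologous pairs shared by the two genomes")
    return mean(orthologue_identities)
```

Each entry in the list corresponds to one orthologous gene pair, so the length of the list is the number of shared orthologues and the returned value is the AGIOS score for that pair of genomes.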
Strain mt3T (Table 1) was first isolated in April 2014 after a preincubation of 21 days in brain–heart infusion supplemented with 5% sheep's blood and cultivated on 5% sheep's blood–enriched Columbia agar (bioMérieux) in an aerobic atmosphere at 37°C.
No significant score was obtained for strain mt3T using MALDI-TOF MS, thus suggesting that our isolate's spectrum did not match any spectra in our database. The nucleotide sequence of the 16S rRNA of strain mt3T (GenBank accession no. LK985385) showed a 90.5% similarity level with Bacillus firmus, the phylogenetically closest species with a validly published name (Fig. 1), therefore defining it as a new genus within the Bacillaceae family named Numidum massiliense (= CSUR P1305 = DSM 29571). N. massiliense spectra (Fig. 2) were added as reference spectra to our database. The reference spectrum for N. massiliense was then compared to the spectra of phylogenetically close species, and the differences were exhibited in a gel view (Fig. 3).
Growth was observed from 25 to 56°C on blood-enriched Columbia agar (bioMérieux), with optimal growth being obtained aerobically at 37°C after 48 hours of incubation. Weak cell growth was observed under microaerophilic and anaerobic conditions. The cells were nonmotile and sporulating. Cells were Gram-positive rods (Fig. 4) and formed greyish colonies with a mean diameter of 10 mm on blood-enriched Columbia agar. Under electron microscopy, the bacteria had a mean diameter of 0.5 μm and length of 2.7 μm (Fig. 5).
The major fatty acid by far was the branched 13-methyl-tetradecanoic acid (88%). Other fatty acids were detected at low abundances (below 6%). The majority of them were branched fatty acids (Table 2).
Strain mt3T was positive for catalase and negative for oxidase. Alkaline phosphatase, esterase (C4), esterase lipase (C8), leucine arylamidase, trypsin, α-chymotrypsin, acid phosphatase, β-galactosidase, β-glucuronidase, α-glucosidase, β-glucosidase, protease and N-acetyl-β-glucosaminidase activities were exhibited. Nitrates were reduced to nitrites. d-Ribose, d-xylose, d-mannose, d-galactose, d-fructose, d-glucose, d-mannitol, N-acetylglucosamine, amygdalin, esculin ferric citrate, d-maltose, d-lactose, d-trehalose, d-tagatose and adipic acid were metabolized.
Cells were susceptible to doxycycline, ceftriaxone, gentamicin 500 μg, ticarcillin/clavulanic acid, rifampicin, teicoplanin, metronidazole and imipenem. Resistance was exhibited against erythromycin, colistin/polymyxin B, ciprofloxacin, penicillin, trimethoprim/sulfamethoxazole, nitrofurantoin and gentamicin 15 μg.
The biochemical and phenotypic features of strain mt3T were compared to the corresponding features of other close representatives of the Bacillaceae family (Table 3).
The genome of N. massiliense strain mt3T is 3 757 266 bp long with a 52.05% G+C content (Table 4, Fig. 6). Of the 3513 predicted genes, 3448 were protein-coding genes and 65 were RNAs (three 5S rRNA genes, four 16S rRNA genes, two 23S rRNA genes and 56 tRNA genes). A total of 2570 genes (73.15%) were assigned a putative function (by COGs or by NR BLAST). Four hundred twelve genes were identified as ORFans (11.93%). The remaining 503 genes were annotated as hypothetical proteins (14.57%). The National Center for Biotechnology Information project ID is PRJEB8811, and the genome is deposited under accession number CTDZ01000000. The distribution of genes into COGs functional categories is presented in Table 5.
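The gene tally in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are hypothetical.

```python
# RNA gene counts as listed in the text.
rna_genes = {"5S rRNA": 3, "16S rRNA": 4, "23S rRNA": 2, "tRNA": 56}

# Protein-coding gene count as listed in the text.
protein_coding = 3448

# Totals derived from the counts above.
total_rna = sum(rna_genes.values())
total_genes = protein_coding + total_rna

print(f"RNA genes: {total_rna}, all predicted genes: {total_genes}")
```

The RNA categories sum to 65 and, together with the 3448 protein-coding genes, give the 3513 predicted genes reported in the text.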
N. massiliense genomic characteristics were compared to other close species (Table 6).
The draft genome sequence of N. massiliense strain mt3T (3.76 Mb) is smaller than the draft genome sequences of Bacillus vireti LMG 21834, Bacillus mannanilyticus JCM 10596, Paucisalibacillus globulus DSM 18846 and Bacillus subterraneus DSM 13966T (5.29, 4.53, 4.24 and 3.9 Mb respectively) and larger than those of Bacillus selenitireducens MLS10 and Laceyella sacchari 1-1 (3.59 and 3.32 Mb respectively). The G+C content of N. massiliense (52.05%) is higher than the G+C contents of L. sacchari 1-1, B. selenitireducens MLS10, B. subterraneus DSM 13966T, B. vireti LMG 21834, B. mannanilyticus JCM 10596 and P. globulus DSM 18846 (48.9, 48.7, 42.1, 39.7, 39.6 and 35.8% respectively).
The gene content of N. massiliense (3513) is smaller than the gene contents of B. vireti LMG 21834, B. mannanilyticus JCM 10596, P. globulus DSM 18846 and B. subterraneus DSM 13966T (5050, 4369, 4127 and 3772 respectively) but larger than those of B. selenitireducens MLS10 and L. sacchari 1-1 (3368 and 3256 respectively).
However, the distribution of genes into COGs categories was similar in all compared genomes except for those corresponding to the cytoskeleton category, which were only present in B. vireti, B. selenitireducens and B. mannanilyticus (Fig. 7). N. massiliense strain mt3T shared 1162, 1028, 1191, 1294, 1121 and 456 orthologous genes with B. mannanilyticus, B. selenitireducens, B. subterraneus, B. vireti, L. sacchari 1-1 and P. globulus respectively (Table 7). Among species with standing in nomenclature, AGIOS values ranged from 52.26% between N. massiliense and P. globulus to 66.1% between B. vireti and B. subterraneus. When N. massiliense was compared to the other species, AGIOS values ranged from 52.26% with P. globulus to 57.96% with L. sacchari. To evaluate the genomic similarity among the compared strains, dDDH was also determined (Table 8).
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Numidum massiliense gen. nov., sp. nov., which contains the type strain mt3T. This bacterial strain was isolated from the faecal flora of a Tuareg boy living in Algeria.
Numidum (nu.mi'dum, from Numidum, which relates to a nomad people from Africa) comprises Gram-positive, sporulating, facultative anaerobic bacilli. Optimal growth is obtained in aerobic conditions at 37°C. Cells are catalase positive and oxidase negative. Nitrates are reduced to nitrites. Cells are urease negative. The type strain is Numidum massiliense strain mt3T.
Numidum massiliense (mas.il'ien'se. L. gen. masc., massiliense, of Massilia, the Latin name of Marseille, where strain mt3T was isolated) cells have a mean diameter of 0.5 μm. Colonies are greyish and 10 mm in diameter on 5% sheep's blood–enriched Columbia agar (bioMérieux). Positive reactions are observed for alkaline phosphatase, esterase (C4), esterase lipase (C8), leucine arylamidase, trypsin, α-chymotrypsin, acid phosphatase, β-galactosidase, β-glucuronidase, α-glucosidase and N-acetyl-β-glucosaminidase. d-Ribose, d-xylose, d-mannose, d-galactose, d-fructose, d-glucose, d-mannitol, N-acetylglucosamine, amygdalin, esculin ferric citrate, d-maltose, d-lactose, d-trehalose, d-tagatose and adipic acid are metabolized.
Cells were susceptible to doxycycline, ceftriaxone, gentamicin 500 μg, ticarcillin/clavulanic acid, rifampicin, teicoplanin, metronidazole and imipenem.
The G+C content of the genome is 52.05%. The 16S rRNA gene sequence and whole-genome shotgun sequence of N. massiliense strain mt3T are deposited in GenBank under accession numbers LK985385 and CTDZ01000000, respectively. The type strain mt3T (= CSUR P1305 = DSM 29571) was isolated from the stool of a Tuareg boy living in Algeria.
The authors thank the Xegen Company (www.xegen.fr) for automating the genomic annotation process. This study was funded by the Fondation Méditerranée Infection. We thank K. Griffiths for English-language review.