J Biomol Tech. 2010 September; 21(3 Suppl): S20.
PMCID: PMC2918072

Identification of Pathogenic Bacteria Associated with Plants and Animals using NextGENe Software

J. McGuigan, M. Manion, E. Bouton, N. Shouyong, and C.S.J. Liu
SoftGenetics, State College, PA, United States



Identifying pathogenic bacteria associated with plants and animals conventionally involves culturing and isolating bacteria over a period of days, and many strains are missed because they are difficult to culture. Roche/454, Illumina, and SOLiD next-generation sequencing systems can be used for direct sequencing to bypass this time-consuming and inconclusive process. NextGENe software provides great depth of coverage by linking paired-end reads to form 10 to 100 million reads about 200 bp long. NextGENe can rapidly remove the host-genome contamination that occurs with direct sequencing: reads that do not map to the host genome are aligned to a library of bacterial genomes. Infected and healthy specimens can then be compared to identify which bacterial strains play an important role in pathogenesis. Novel sequences from unknown bacteria can be assembled from the remaining reads with the condensation tools. Condensation groups similar reads using anchor sequences and flanking shoulder sequences; groups with more than a given number of sequence differences are then separated further into subgroups, and assembly combines the grouped reads into longer contigs. Another method for identifying and quantifying known bacteria uses a library of 16S rRNA references. NextGENe software can align thousands of reads to a database of over 5000 16S reference sequences in less than 3 minutes. Thus, accurate and rapid identification of even a small amount of bacteria is possible, overcoming the limitation of current tests, which can assay for only a single species or strain at a time. The relative amounts of different strains can be determined for studies of plant pathology or of drug resistance in animal models. Analysis of data from environmental and pathological samples will be presented.
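
The workflow described above (discard reads that match the host, assign the remaining reads to bacterial references, and report the relative amount of each strain) can be illustrated with a minimal Python sketch. This is not NextGENe's algorithm: exact k-mer matching stands in for read alignment, and every name and parameter here (`K`, `min_shared`, `classify_reads`, `relative_abundance`) is a hypothetical choice made only for illustration.

```python
from collections import Counter

K = 8  # hypothetical k-mer length; real aligners use their own seeding schemes


def kmers(seq, k=K):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k=K):
    """Map each k-mer to the names of the references that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def best_hit(read, index, k=K, min_shared=3):
    """Return the reference sharing the most k-mers with the read, or None.

    min_shared is an assumed threshold standing in for an alignment score cutoff.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None
    name, shared = votes.most_common(1)[0]
    return name if shared >= min_shared else None


def classify_reads(reads, host_refs, bacterial_refs):
    """Drop host reads, count hits per bacterial reference, keep the rest."""
    host_index = build_index(host_refs)
    bacterial_index = build_index(bacterial_refs)
    counts = Counter()
    unassigned = []  # candidates for de novo assembly of unknown bacteria
    for read in reads:
        if best_hit(read, host_index) is not None:
            continue  # treated as host contamination and discarded
        hit = best_hit(read, bacterial_index)
        if hit is None:
            unassigned.append(read)
        else:
            counts[hit] += 1
    return counts, unassigned


def relative_abundance(counts):
    """Convert per-reference read counts into fractions of assigned reads."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}
```

Running `classify_reads` once on an infected sample and once on a healthy one, then comparing the two `relative_abundance` results, mirrors the infected-versus-healthy comparison in the abstract. Reads left in `unassigned` correspond to the remaining reads that would go on to assembly.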

