BMC Genomics. 2009; 10: 385.
Published online 2009 August 19. doi:  10.1186/1471-2164-10-385
PMCID: PMC2907702
Microbial comparative pan-genomics using binomial mixture models
Lars Snipen (1, corresponding author), Trygve Almøy (1), and David W Ussery (2)
(1) Biostatistics, Department of Chemistry, Biotechnology and Food Sciences, Norwegian University of Life Sciences, Ås, Norway
(2) Centre for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
Lars Snipen: lars.snipen/at/umb.no; Trygve Almøy: trygve.almoy/at/umb.no; David W Ussery: dave/at/cbs.dtu.dk
Received April 24, 2009; Accepted August 19, 2009.
Abstract
Background
The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology.
Results
We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families) in Buchnera aphidicola to large (around 43000 gene families) in Escherichia coli. Results for Escherichia coli show that the estimated diversity grows as more data become available, indicating an extensive pool of rarely occurring genes in the population.
Conclusion
Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find are always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.
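The abstract does not spell out the model, so the following is only a minimal sketch of how a binomial mixture can turn observed gene family counts into core- and pan-genome size estimates, in the spirit of capture-recapture. It assumes that each gene family belongs to one of a few components, each with its own detection probability, and that the number of genomes containing the family is binomial given the component. Families found in no genome are never observed, so the observed count is scaled up by the probability of being seen at least once. The function names, the treatment of the core as the component with detection probability 1, and all parameter values are illustrative assumptions, not the authors' implementation or fitted results.

```python
from math import comb


def binomial_mixture_pmf(y, n_genomes, weights, detect_probs):
    """Probability that a gene family is found in exactly y of n_genomes genomes.

    weights      -- mixing proportions of the components (should sum to 1)
    detect_probs -- detection probability of each component
    """
    return sum(
        w * comb(n_genomes, y) * p**y * (1 - p) ** (n_genomes - y)
        for w, p in zip(weights, detect_probs)
    )


def estimate_sizes(n_observed, n_genomes, weights, detect_probs):
    """Return (pan_size, core_size) under the assumed, already fitted mixture."""
    # Probability that a gene family is missed in every sequenced genome.
    p_unseen = binomial_mixture_pmf(0, n_genomes, weights, detect_probs)
    # Scale the observed count up to account for the unseen families.
    pan_size = n_observed / (1.0 - p_unseen)
    # Assumption: core gene families are those with detection probability 1.
    core_weight = sum(w for w, p in zip(weights, detect_probs) if p == 1.0)
    return pan_size, core_weight * pan_size


# Hypothetical toy numbers: 700 observed families, 2 genomes, 2 components.
pan, core = estimate_sizes(700, 2, weights=[0.5, 0.5], detect_probs=[1.0, 0.5])
print(pan, core)
```

In practice the mixing proportions and detection probabilities are themselves estimated from the data, for example by maximizing a zero-truncated likelihood; the sketch takes them as given to keep the size calculation visible.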
Articles from BMC Genomics are provided here courtesy of BioMed Central