Genome Biology (BioMed Central)
Published online 2009 October 1. doi: 10.1186/gb-2009-10-10-r103

Table 4

ALLPATHS, Velvet and EULER assemblies of five microbial genomes: correctness (of chunks approximately 10 kb or less)

                              ALLPATHS   Velvet    EULER
Class I: error rate 0
  S. aureus                     99.3%     70.7%    51.7%
  E. coli                       99.8%     68.7%    42.1%
  R. sphaeroides                99.7%     71.8%    18.9%
  S. pombe                      79.7%     66.2%    31.4%
  N. crassa                     78.6%     49.9%    19.1%
Class II: error rate <0.1%
  S. aureus                      0.7%     13.7%    26.4%
  E. coli                        0.2%     18.0%    24.1%
  R. sphaeroides                 0.0%     19.3%    39.2%
  S. pombe                      18.6%     26.6%    32.6%
  N. crassa                     15.3%     24.3%    24.1%
Class III: error rate <1%
  S. aureus                      0.0%      9.1%    13.7%
  E. coli                        0.0%      6.4%    26.8%
  R. sphaeroides                 0.0%      5.9%    37.0%
  S. pombe                       1.3%      3.7%    28.7%
  N. crassa                      3.2%     11.4%    32.3%
Class IV: error rate <10%
  S. aureus                      0.0%      6.2%     5.9%
  E. coli                        0.0%      5.9%     5.3%
  R. sphaeroides                 0.2%      2.0%     3.2%
  S. pombe                       0.3%      2.6%     5.1%
  N. crassa                      1.3%      7.9%    12.8%
Class V: error rate ≥10%
  S. aureus                      0.0%      0.4%     2.3%
  E. coli                        0.0%      1.0%     1.7%
  R. sphaeroides                 0.1%      1.0%     1.5%
  S. pombe                       0.2%      0.8%     2.1%
  N. crassa                      0.9%      5.7%    10.4%
Class VI: error rate, no match
  S. aureus                      0.0%      0.0%     0.0%
  E. coli                        0.0%      0.0%     0.0%
  R. sphaeroides                 0.0%      0.1%     0.2%
  S. pombe                       0.0%      0.0%     0.0%
  N. crassa                      0.7%*     0.9%*    1.4%*

For five genomes, statistics from ALLPATHS, Velvet, and EULER assemblies are shown. Correctness: contigs were divided into approximately 10 kb chunks, leaving smaller contigs intact. Subject to the caveat that the reference sequences might have some errors (likely greater for the fungi), we assayed the absolute accuracy of each chunk by finding the minimum number of errors (substitution or indel bases) among all alignments of it to the reference sequence for the genome. The table shows the distribution of the bases in the chunks according to their accuracy. Chunks having no 100-mer match are separately classified (class VI). *For N. crassa, some AT-rich regions that are missing from the reference [23] appear as novel sequence in the assemblies.
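
The classification described in this note can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the alignment step is assumed to have been done already (each chunk arrives with its minimum error count and a flag for whether it has a 100-mer match), and the rule for cutting long contigs into near-equal pieces of about 10 kb is an assumption, since the note only says "approximately 10 kb".

```python
from collections import defaultdict

CHUNK_SIZE = 10_000  # approximate chunk length in bases, as stated in the caption


def split_into_chunks(contig_length, chunk_size=CHUNK_SIZE):
    """Return the chunk lengths for one contig.

    Assumption: contigs are cut into near-equal pieces of about chunk_size;
    contigs shorter than chunk_size are left intact as a single chunk.
    """
    n_chunks = max(1, round(contig_length / chunk_size))
    base, extra = divmod(contig_length, n_chunks)
    return [base + 1 if i < extra else base for i in range(n_chunks)]


def classify_chunk(length, min_errors, has_match=True):
    """Assign a chunk to class I-VI from its minimum error count.

    min_errors is the smallest number of substitution or indel bases over
    all alignments of the chunk to the reference (computed elsewhere).
    """
    if not has_match:
        return "VI"  # no 100-mer match: classified separately
    if min_errors == 0:
        return "I"
    rate = min_errors / length
    if rate < 0.001:
        return "II"   # error rate < 0.1%
    if rate < 0.01:
        return "III"  # error rate < 1%
    if rate < 0.1:
        return "IV"   # error rate < 10%
    return "V"        # error rate >= 10%


def base_distribution(chunks):
    """Percentage of bases per class for (length, min_errors, has_match) tuples."""
    bases = defaultdict(int)
    for length, min_errors, has_match in chunks:
        bases[classify_chunk(length, min_errors, has_match)] += length
    total = sum(bases.values())
    return {cls: 100.0 * n / total for cls, n in bases.items()}
```

Because the distribution is weighted by bases rather than by chunk count, a single long erroneous chunk moves the percentages more than several short ones, which matches the note's statement that the table shows "the distribution of the bases in the chunks".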