This study reveals that both arrays and barcodes are useful tools for the species-level identification of mammals. The main limitation of the array-based approach is that it requires advance knowledge of sequences in target species. Without an exact match, undiscovered haplotypes or geographic variants could fail to anneal properly to the probes on the array. While we tried to avoid this problem by providing a set of three different probes per species, this factor can substantially limit the use of microarrays for large-scale biodiversity monitoring. Additionally, to explore unknown species in a given taxonomic group, it might be possible to design probe sets that specifically bind to members of a higher taxonomic level such as genus or family. However, in a situation such as S. lilium, whose geographical variants differ by up to 8% in their COI/cytb gene sequences, a probe set that is designed for the species in one locality might not bind to members of the species in other localities (results not shown). This hit-or-miss situation could make array technology less desirable in biodiversity monitoring across a wide geographic region. In fact, the current applications of microarrays are usually focused on a limited number of taxa [4]. Because of this, assembling a 'Mammalia Chip' might not be a feasible approach for biodiversity monitoring of all mammalian species.
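To give a rough sense of why 8% sequence variation matters for probe binding, the arithmetic can be sketched as follows. This is only an illustration under strong assumptions that are not taken from this study: substitutions are treated as independent and evenly spread along the gene, the probe length of 25 bases is hypothetical, and only perfect matches are counted, whereas real hybridization can tolerate some mismatches depending on their position and on stringency.

```python
def expected_mismatches(divergence: float, probe_length: int) -> float:
    """Expected number of mismatched bases within one probe site."""
    return divergence * probe_length


def perfect_match_probability(divergence: float, probe_length: int) -> float:
    """Chance that a probe site carries no substitution at all.

    Assumes each base differs independently with probability `divergence`.
    """
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must lie between 0 and 1")
    return (1.0 - divergence) ** probe_length


def any_probe_matches(divergence: float, probe_length: int, n_probes: int) -> float:
    """Chance that at least one of `n_probes` independent probes matches perfectly."""
    miss = 1.0 - perfect_match_probability(divergence, probe_length)
    return 1.0 - miss ** n_probes


if __name__ == "__main__":
    d = 0.08          # up to 8% variation between geographic variants (from the text)
    length = 25       # hypothetical probe length, not a value from this study
    print(expected_mismatches(d, length))
    print(perfect_match_probability(d, length))
    print(any_probe_matches(d, length, n_probes=3))
```

Under these assumptions a 25-base probe would carry about two mismatches on average against a variant that differs by 8%, and even three probes per species would often all be affected, which is consistent with the hit-or-miss behaviour described above.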
Because barcoding is a sequencing-based technology, it avoids the problem of unknown haplotypes. New haplotypes can be compared to existing databases of barcodes, and they can be assigned to a particular species using probabilistic algorithms [15]. The final assignment of a new haplotype to a described species, or its assignment to a new species, will be achieved through comprehensive taxonomic analysis, which requires different types of data [17]. Our analysis supports this argument in all three datasets. While smaller fragments were less powerful in resolving some closely related species, obtaining more sequence information in these cases (i.e. the full-length barcode versus the mini-barcode, or the whole gene versus the barcode-size fragment) can increase the resolution [10]. However, while standard barcode-size fragments (650 bp) can be readily obtained in a single PCR amplification and sequencing run from freshly collected or frozen tissue specimens, it is difficult to obtain 650-bp barcodes from specimens whose DNA is degraded (e.g. dried museum samples) [10]. The high effectiveness of mini-barcodes means that biomonitoring through barcodes can target different types of specimens, including museum samples or traces of tissues with degraded DNA [10]. The mini-barcode strategy also enables exploration of the use of massively parallel sequencing platforms, such as pyrosequencing-based [18] 454 Life Sciences sequencers, for barcoding applications. Interestingly, this technology uses an emulsion PCR approach for simultaneous amplification of several thousand 100–200 base DNA molecules in one reaction. This approach will therefore allow the use of mini-barcodes on environmental samples, which have traditionally been targets for array-based technology.
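The comparison of a new haplotype against a barcode reference library, mentioned earlier in this paragraph, can be illustrated with a minimal sketch. It deliberately uses a simple nearest-neighbour rule on uncorrected p-distance rather than the probabilistic algorithms cited in the text, and the function names, the toy sequences, the species labels and the distance threshold are all hypothetical. Sequences are assumed to be pre-aligned and of equal length.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def assign_species(query, references, threshold=0.02):
    """Return (species, distance) for the closest reference barcode.

    `references` maps a species label to a list of aligned barcodes.
    The species is None when the closest match exceeds `threshold`,
    flagging the haplotype for further taxonomic analysis.
    The default threshold is a hypothetical value, not one from this study.
    """
    best_species, best_distance = None, None
    for species, sequences in references.items():
        for sequence in sequences:
            d = p_distance(query, sequence)
            if best_distance is None or d < best_distance:
                best_species, best_distance = species, d
    if best_distance is not None and best_distance <= threshold:
        return best_species, best_distance
    return None, best_distance


if __name__ == "__main__":
    # Toy 20-base fragments, far shorter than a real barcode or mini-barcode.
    library = {
        "species_A": ["ACGTACGTACGTACGTACGT"],
        "species_B": ["ACGTTCGAACGTTCGAACGT"],
    }
    print(assign_species("ACGTACGTACGTACGTACGA", library, threshold=0.1))
```

Because the query is sequenced rather than hybridized, an unseen haplotype still yields a measurable distance to every reference, which is the property that sets barcoding apart from a fixed probe set.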
This study also provides evidence that both COI and cytb are useful species-level molecular markers for mammalian species. This finding is in agreement with earlier work [4]. However, when it comes to selecting a molecular marker, it is also important to consider operational issues such as the availability of robust PCR primers, standardization across a wide range of taxa, the robustness of amplifying shorter fragments from degraded DNA, and the prevalence of nuclear pseudogenes of mitochondrial origin. Our study further confirms that the standard COI barcode can be applied to mammalian species with species-level resolution as high as that observed in other animal taxa tested.