BMC Genomics. 2009; 10: 485.
Published online Oct 20, 2009. doi:  10.1186/1471-2164-10-485
PMCID: PMC2778663
The theory of discovering rare variants via DNA sequencing
Michael C Wendl1,* and Richard K Wilson1
1 The Genome Center and Department of Genetics, Washington University, St. Louis MO 63108, USA
* Corresponding author.
Michael C Wendl: mwendl/at/wustl.edu; Richard K Wilson: rwilson/at/wustl.edu
Received March 20, 2009; Accepted October 20, 2009.
Abstract
Background
Rare population variants are known to have important biomedical implications, but their systematic discovery has only recently been enabled by advances in DNA sequencing. The design process of a discovery project remains formidable, being limited to ad hoc mixtures of extensive computer simulation and pilot sequencing. Here, the task is examined from a general mathematical perspective.
Results
We pose and solve the population sequencing design problem and subsequently apply standard optimization techniques that maximize the discovery probability. Emphasis is placed on cases whose discovery thresholds place them within reach of current technologies. We find that parameter values characteristic of rare-variant projects lead to a general, yet remarkably simple set of optimization rules. Specifically, optimal processing occurs at constant values of the per-sample redundancy, refuting current notions that sample size should be selected outright. Optimal project-wide redundancy and sample size are then shown to be inversely proportional to the desired variant frequency. A second family of constants governs these relationships, permitting one to immediately establish the most efficient settings for a given set of discovery conditions. Our results largely concur with the empirical design of the Thousand Genomes Project, though they furnish some additional refinement.
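The inverse relationship between sample size and variant frequency can be illustrated with a deliberately simplified sketch. It is not the model developed in the paper: it ignores sequencing redundancy and detection thresholds entirely and assumes only that each sampled chromosome independently carries the variant with probability equal to its population frequency. The function names, the diploid default, and the 95% target are illustrative assumptions.

```python
import math


def prob_variant_sampled(freq, n_samples, ploidy=2):
    """Probability that at least one sampled chromosome carries the variant.

    Assumes independent chromosomes, each carrying the variant with
    probability `freq` (a simplification, not the paper's model).
    """
    return 1.0 - (1.0 - freq) ** (ploidy * n_samples)


def min_samples(freq, target_prob, ploidy=2):
    """Smallest sample count whose sampling probability reaches target_prob."""
    chromosomes = math.log(1.0 - target_prob) / math.log(1.0 - freq)
    return math.ceil(chromosomes / ploidy)


if __name__ == "__main__":
    # Under these assumptions the product n * freq stays nearly constant,
    # i.e. the required sample size scales roughly as 1 / freq.
    for freq in (0.01, 0.005, 0.001):
        n = min_samples(freq, 0.95)
        print(f"freq={freq}: n={n}, n*freq={n * freq:.3f}")
```

Even in this reduced setting, the required sample size grows roughly in proportion to the reciprocal of the variant frequency. The paper's analysis additionally accounts for per-sample redundancy, which this sketch leaves out.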
Conclusion
The optimization principles reported here dramatically simplify the design process and should be broadly useful as rare-variant projects become both more important and routine in the future.
Articles from BMC Genomics are provided here courtesy of BioMed Central