The importance of this study lies in showing that low-coverage WGS of multiple mice sharing a phenotype can identify causative ENU mutations without out-crossing and without prior knowledge of whether inheritance is dominant or recessive. This strategy simultaneously generates linkage maps and identifies the shared mutations with a high degree of confidence. The advantage of our linkage-based approach is that ENU-induced mutations from multiple affected mice can be used to track the IBD regions and then isolate the causative mutation.
Linkage analysis using exome sequencing has been effectively harnessed for the study of human pedigrees. Variations on the Lander-Green method, developed for array data, incorporate knowledge of the population allele frequencies of HapMap SNPs. Alternative approaches use other hidden Markov models (HMMs) to identify regions that are IBD and common to autosomal recessive phenotypes. Reducing the search space from the whole genome to the exome significantly reduces the number of informative variants; however, the number that remains is still typically about two orders of magnitude larger than the number of ENU-induced exonic variants, e.g. 6000 to 8000 exonic HapMap variants per individual compared to 74 exonic variants in our ENU pedigree. The low frequency of ENU coding variants does not permit fine-scale linkage analysis based only on exonic variants. Exome sequencing also precludes the detection of regulatory mutations, and the inefficiencies of capture have resulted in a failure to find the causative mutation in one in five ENU pedigrees, even for recessive traits. The ability to detect IBD regions using low coverage and the falling costs of next-generation sequencing make our WGS method increasingly cost-effective.
Knowledge of the ENU mutation frequency allows us to model an efficient sequencing strategy. Our data show that sequencing 3 affected G3 mice with a recessive trait or 6 mice with a dominant trait would yield on average 1 or 2 candidate IBD mutations. The Lander-Green algorithm, on which our IBD analysis is based, scales exponentially with the number of individuals in the pedigree, but remains computationally feasible with a pedigree of . The algorithm would accommodate further refinement to take into account the known characteristics of ENU mutations. By generating haplotype data for many ENU pedigrees, our approach will also eventually lead to a fine-scale map of active recombination sites in the mouse, which, unlike existing maps based on recombinations that arose historically between outbred strains of mice or more recently between intercrossed inbred strains, is unbiased by selection or strain differences. Such a map could then be used to optimize the performance of our Lander-Green based algorithm.
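
The exponential scaling of the Lander-Green algorithm can be made concrete by counting the hidden states of its HMM. The sketch below is an illustration only, not the implementation used in this study. It assumes the standard formulation in which each non-founder contributes two meiosis bits to the inheritance vector, and that founder symmetry removes one bit per founder. The function names and the example pedigree sizes are hypothetical.

```python
def inheritance_vector_bits(n_nonfounders: int, n_founders: int = 0) -> int:
    """Number of binary meiosis indicators tracked at each marker.

    Assumption: two bits per non-founder (one per parental meiosis),
    minus one bit per founder when founder phase symmetry is exploited.
    """
    if n_nonfounders < 0 or n_founders < 0:
        raise ValueError("pedigree counts must be non-negative")
    return max(2 * n_nonfounders - n_founders, 0)


def hmm_state_count(n_nonfounders: int, n_founders: int = 0) -> int:
    """Number of inheritance vectors, i.e. HMM states per marker."""
    return 2 ** inheritance_vector_bits(n_nonfounders, n_founders)


if __name__ == "__main__":
    # Hypothetical pedigrees with two founders and a growing number of
    # non-founders; the state count doubles twice per added individual.
    for n in (3, 6, 9, 12):
        print(f"{n:2d} non-founders -> {hmm_state_count(n, 2):,} states")
```

Because the cost per marker grows with the number of states while the cost along the genome grows only linearly with the number of markers, pedigree size rather than marker density is the limiting factor under these assumptions.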
We believe that the adoption of our approach by large-scale ENU programs will lead to a substantial increase in their productivity, advancing our understanding of gene function and the mechanisms of genetic disease. Our approach will reduce animal costs and allow post-mortem screens, with increased sophistication and accuracy in a broader range of tissues. With the rapidly falling costs of WGS we can envisage a future in which all G3 ENU mice are sequenced to a depth sufficient to identify and segregate all their mutations, creating a rich dataset of allelic variation and corresponding phenotypic information, including linkage data for non-coding mutations with measurable effects. This could be achieved accurately with 4–5× sequencing due to the increased power to impute genotypes. This database could be mined for associations across pedigrees, including the detection of subtle phenotypes.
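
To give a rough sense of what 4–5× coverage means for a single site in a single mouse, the following sketch computes the probability of observing a variant allele under a simple Poisson model of read depth. This is a simplifying assumption of ours, not the model used in the study: it ignores sequencing error, mapping bias and imputation from relatives. The function names and the `min_alt_reads` threshold are hypothetical.

```python
import math


def poisson_tail(k_min: int, lam: float) -> float:
    """P(X >= k_min) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_min))
    return 1.0 - cdf


def prob_alt_allele_seen(mean_coverage: float,
                         min_alt_reads: int = 1,
                         heterozygous: bool = True) -> float:
    """Probability that at least `min_alt_reads` reads carry the variant.

    Assumption: reads covering a site are Poisson distributed, and at a
    heterozygous site half of them are expected to carry the variant.
    """
    alt_fraction = 0.5 if heterozygous else 1.0
    return poisson_tail(min_alt_reads, mean_coverage * alt_fraction)


if __name__ == "__main__":
    for cov in (2.0, 4.0, 5.0, 10.0):
        p_het = prob_alt_allele_seen(cov, min_alt_reads=2, heterozygous=True)
        p_hom = prob_alt_allele_seen(cov, min_alt_reads=2, heterozygous=False)
        print(f"{cov:4.1f}x  het: {p_het:.3f}  hom: {p_hom:.3f}")
```

Under this toy model a single low-coverage genome misses a noticeable fraction of heterozygous sites, which is why combining evidence across related mice through linkage and imputation matters at these depths.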