The main reason that targeted enrichment has been developed as an adjunct to NGS in recent years is that it was needed to make extensive sequencing of subregions of complex genomes affordable. The alternative of fully sequencing many complete genomes to high average coverage (~30x or higher), to enable analyses such as the study of genetic variation, was simply not affordable. Another reason for assaying, for example, the exome rather than the whole genome is that the resulting data are simpler to interpret. This is a crucial consideration, as it is generally much more challenging to determine the functional impact of variants in noncoding regions. A comparison of today's costs for whole-genome sequencing and targeted enrichment is shown in Figure 3.
Figure 3: Sequencing costs for short-read next-generation sequencing. Since introduction in early 2008, costs dropped radically and are represented as a straight line on a logarithmic scale. The cost differential between sequencing, e.g. a full human genome or ...
Current targeted enrichment methods are not yet optimal, and must be improved if they are to remain relevant in the long term. One fundamental problem is the lack of evenness of coverage [48], which is especially troublesome if the results are intended for diagnostic purposes. Poor evenness across regions with differing percentages of GC bases is a general problem for NGS itself [2], which directly translates into lower coverage of promoter regions and the first exons of genes, as these are often GC rich. Such problems are exacerbated by the GC-content and other biases that enrichment technologies themselves suffer from. High coverage is therefore invaluable for reliable results, but current methods for targeting several megabase pairs might only return 60–80% of the ROI at a read depth of over 40x, and 80–90% at around 20x coverage.
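The figures quoted above (the share of the ROI reaching a given read depth) can be computed from per-base depths. The following is a minimal sketch, assuming the depths are already available as a plain list of integers, one per ROI position; the function name `breadth_at_depth` and the example values are hypothetical and not taken from any of the studies cited here.

```python
from typing import Dict, Iterable, Sequence


def breadth_at_depth(depths: Sequence[int], thresholds: Iterable[int]) -> Dict[int, float]:
    """Return, for each threshold, the fraction of ROI bases with depth >= threshold."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    total = len(depths)
    return {t: sum(1 for d in depths if d >= t) / total for t in thresholds}


# Hypothetical per-base depths for a 10-base region (illustrative values only).
example_depths = [55, 48, 41, 35, 30, 22, 21, 18, 9, 0]

breadth = breadth_at_depth(example_depths, thresholds=(20, 40))
print(breadth)  # {20: 0.7, 40: 0.3}
```

In this toy region, 70% of the bases reach 20x but only 30% reach 40x, which is the same kind of drop-off with rising depth threshold that the paragraph describes for real enrichment experiments.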
The comparison of different genome partitioning methods in Figure 4 gives a real-world indication of how widely the results of the available methods can diverge. Even for the same genetic locus, processed by the same people in the same laboratory, the different enrichment methods produce very different average coverage, evenness and specificity. All four hybrid capture methods, comprising three solution-phase methods (homemade, Flexelect, SureSelect) and one solid-phase method (NimbleGen), show considerable fluctuation in coverage over the targeted region of interest. Depending on the fragment length of the library, off-target sequences extend more or less far into the genomic regions adjacent to the target region. In comparison, the SelectorProbe enrichment shows more even coverage of the targeted region, and its fluctuations in coverage are due to the number of hybridization probes designed. The PCR-based enrichment (RainDance) results in the most even coverage across the targeted region, but this is flanked by the typical high-coverage reads from the primer pairs used for enrichment.
Figure 4: Comparison of different enrichment techniques for regions of interest. The graph shows enrichments for the genes CTNNB1 (1), TP53 (2) and ADAM10 (3) converted to wiggle format on the UCSC genome browser. For every targeted enrichment experiment, the upper ...
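The three metrics used in this comparison (average coverage, evenness and specificity) can be made concrete with a small sketch. The definitions below are assumptions chosen for illustration: evenness is taken as the fraction of target bases covered at no less than 0.2 times the mean depth, and specificity as the fraction of mapped reads that fall on target. Other definitions are in use, and the function names and example values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def mean_coverage(depths: Sequence[int]) -> float:
    """Average read depth over all target bases."""
    return mean(depths)


def evenness(depths: Sequence[int], fraction_of_mean: float = 0.2) -> float:
    """Fraction of target bases with depth >= fraction_of_mean * mean depth.

    This is one possible evenness definition (an assumption of this sketch).
    """
    avg = mean(depths)
    if avg == 0:
        return 0.0
    cutoff = fraction_of_mean * avg
    return sum(1 for d in depths if d >= cutoff) / len(depths)


def specificity(on_target_reads: int, total_mapped_reads: int) -> float:
    """Fraction of mapped reads that overlap the targeted region."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return on_target_reads / total_mapped_reads


# Hypothetical values for one enrichment experiment (illustrative only).
depths = [40, 38, 42, 5, 45, 30]
print(mean_coverage(depths))    # mean of the six depths
print(evenness(depths))         # 5 of the 6 bases pass the 0.2 x mean cutoff
print(specificity(750, 1000))   # 0.75
```

Reporting all three values side by side matters because a method can score well on one and poorly on another, as the comparison of capture and PCR-based methods illustrates.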
For an improved understanding of many single-gene disorders, targeted enrichment can help produce a catalog of rare causative mutations by deeply sequencing genomic loci of a large number of patients. The analysis of genetic variation in complex disease is not necessarily limited to human DNA but can also be applied in other health-relevant fields such as microbiology [49]. In principle, targeted enrichment in conjunction with NGS provides emerging possibilities in many areas relying on molecular-based technologies, ranging from microbial testing to diagnostics [50].
Still, clinical diagnostic applications of sequencing, where specific clinical questions need to be answered, might favor analysis of only the relevant loci at high coverage. This has a number of advantages. First, a highly accurate answer is provided, which is required when clinicians make decisions about supplying or withholding expensive targeted biological drugs to, for instance, cancer patients. Second, a targeted sequencing approach focuses directly on the region of interest and therefore omits genomic information that is not directly relevant. Third, an important point to consider is regulatory approval of further sequencing-based diagnostic tests. Given that regulatory approval is granted for a dedicated test that addresses a specific question, a targeted sequencing approach might be more acceptable to regulatory agencies. Hence, the adoption of enrichment methods may ultimately evolve differently in the research and diagnostics fields. Indeed, the future use of sequencing for diagnostics may naturally move toward a ‘single cartridge per patient’ approach, as is the current practice for other types of molecular diagnostics.
Looking to the future, whole-genome sequencing will continue to become cheaper, simpler, and faster. This will steadily erode the rationale for using targeted enrichment rather than directly sequencing the complete genome and bioinformatically extracting the sequences of interest. The long-term utility of targeted enrichment will depend increasingly on improvements in evenness and enrichment power (to increase the value of the data), and also on new and better strategies for sample multiplexing and pooling (to bring down the per-sample cost).
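The alternative mentioned above, sequencing the complete genome and extracting the sequences of interest bioinformatically, amounts to an interval-overlap filter on aligned reads. The sketch below is a simplified illustration under stated assumptions: reads and target regions are represented as hypothetical `Interval` tuples with 0-based, half-open coordinates, and a linear scan is used, whereas real pipelines would operate on indexed alignment files with dedicated tools.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def reads_in_roi(reads: List[Interval], roi: List[Interval]) -> List[Interval]:
    """Keep only the aligned reads that overlap any region of interest."""
    return [r for r in reads if any(overlaps(r, target) for target in roi)]


# Hypothetical coordinates, for illustration only.
roi = [Interval("chr17", 1000, 2000)]
reads = [
    Interval("chr17", 900, 1050),   # overlaps the start of the target
    Interval("chr17", 2000, 2100),  # begins exactly where the target ends: no overlap
    Interval("chr3", 1200, 1300),   # different chromosome
    Interval("chr17", 1500, 1600),  # fully inside the target
]

kept = reads_in_roi(reads, roi)
print(len(kept))  # 2
```

The filtering step itself is cheap; the cost of this strategy lies in generating and storing the whole-genome data from which most reads are then discarded.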
In conclusion, even with cheap third-generation sequencing on the horizon, and with improvements in targeted enrichment still occurring, the field of targeted enrichment has not yet lost its raison d'être. Current international large-scale sequencing projects such as the 1000 Genomes Project [37] also rely on targeted enrichment for NGS alongside whole-genome sequencing, because the upfront expenses in sample preparation are more than offset by a significantly reduced total sequencing demand and by reduced downstream data analysis and storage when generating high-coverage sequence data.
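The trade-off described here, higher upfront preparation cost against lower sequencing demand, can be illustrated with simple arithmetic. Every number in the sketch below is a hypothetical placeholder (prices, target size, depth and on-target fraction are assumptions, not values from this review or from any vendor), and the function names are invented for illustration. Only the approximate human genome size of about 3.1 Gb is a commonly cited figure.

```python
def required_gigabases(target_size_bp: int, mean_depth: float,
                       on_target_fraction: float = 1.0) -> float:
    """Raw sequence (in Gb) needed to reach mean_depth over the target.

    Off-target reads are wasted, so the demand is divided by the on-target fraction.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    return target_size_bp * mean_depth / on_target_fraction / 1e9


def per_sample_cost(prep_cost: float, gigabases: float, cost_per_gigabase: float) -> float:
    """Sample preparation cost plus sequencing cost for one sample."""
    return prep_cost + gigabases * cost_per_gigabase


# All values below are hypothetical and serve only to show the calculation.
COST_PER_GB = 10.0        # assumed sequencing price per gigabase
LIBRARY_PREP = 100.0      # assumed library preparation cost
ENRICHMENT_EXTRA = 300.0  # assumed additional cost of the enrichment step

wgs_gb = required_gigabases(3_100_000_000, mean_depth=30)
targeted_gb = required_gigabases(50_000_000, mean_depth=100, on_target_fraction=0.5)

wgs_cost = per_sample_cost(LIBRARY_PREP, wgs_gb, COST_PER_GB)
targeted_cost = per_sample_cost(LIBRARY_PREP + ENRICHMENT_EXTRA, targeted_gb, COST_PER_GB)

print(wgs_gb, targeted_gb)
print(wgs_cost, targeted_cost)
```

Under these assumed inputs the targeted approach needs roughly a tenth of the raw sequence despite its higher depth and imperfect specificity. The balance shifts toward whole-genome sequencing as the price per gigabase falls, which is the erosion of rationale discussed above.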
- Discussion of current targeted enrichment methods.
- Use of targeted enrichment in the context of analyzing complex genomes.
- Detecting genetic variation by targeted enrichment.
- Considerations in terms of methodology, applicability and descriptive metrics.
- Challenges and future perspectives of targeted enrichment.