In summary, we have developed a novel MIP-based alternative splicing assay that is sensitive and specific and can be highly multiplexed. We compared our asMIP and qPCR methodologies and found that the correlation between asMIP and qPCR is good; for sequenced asMIP M-scores, the Pearson correlation coefficient was 0.59 or 0.82 depending on the data set (Table ) and the AUC was 0.76 (Figure ). Furthermore, when we looked at a set of splicing controls from the literature, sequenced asMIPs provided an AUC value of 0.96 (Figure ) and successfully identified 100% of the known alternatively spliced exons along with 93% of their related junctions (for examples see Figure and ). Additionally, several previously uncharacterized, tissue-specific splice events were revealed (Figures ). We have also shown that asMIP probes can be diluted to sub-femtomolar quantities (Additional File 1B). We conclude that the asMIP assay is capable of accurate, multiplexed quantitation of alternative splicing in human tissues; our data suggest that 20,000plex reactions will be feasible in the near future.
The advantage of using a sequence capture strategy to analyze alternative splicing in samples that could, in theory, be quantified directly by high-throughput sequencing is that capture dramatically increases the dynamic range of a sequencing run by increasing the amount of usable data. For example, if one looked at 20,000 exon junctions captured by asMIPs with 20 million short sequence reads, each junction would receive 1,000 reads on average, a dynamic range of three orders of magnitude (10³). If one instead sequenced the sample directly with 20 million short reads mapped to the genome, only about 4% of those reads (800,000 reads) would likely overlap any exon junction [13]. If there are approximately 22,000 genes in the human genome with an average of 9-10 junctions each (totaling ~200,000 junctions) [26], the dynamic range for direct sequencing drops to 4 (800,000/200,000 = 4), less than one order of magnitude.
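The comparison above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the text: the helper name `reads_per_target` is hypothetical, "dynamic range" is taken to mean the average number of usable reads per junction, and reads are assumed to be spread evenly across junctions, which real data will not satisfy.

```python
import math


def reads_per_target(total_reads, n_targets, usable_fraction=1.0):
    """Average usable reads per target (hypothetical helper, even coverage assumed)."""
    return total_reads * usable_fraction / n_targets


TOTAL_READS = 20_000_000  # 20 million short reads, as in the text

# asMIP capture: every read is assumed to map to one of 20,000 captured junctions.
capture = reads_per_target(TOTAL_READS, n_targets=20_000)

# Direct sequencing: ~4% of reads overlap a junction, spread over ~200,000 junctions.
direct = reads_per_target(TOTAL_READS, n_targets=200_000, usable_fraction=0.04)

for label, value in (("asMIP capture", capture), ("direct sequencing", direct)):
    print(f"{label}: {value:.0f} reads per junction, "
          f"{math.log10(value):.1f} orders of magnitude")
```

Changing `n_targets` or `usable_fraction` shows how quickly the per-junction read budget shrinks as the number of targets grows or the fraction of informative reads falls.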
One major advantage of asMIPs over other parallel sequence capture technologies such as DASL [18] is the high level of multiplexing that is possible with unimolecular probes. Highly multiplexed reactions require small amounts of each probe, and we showed that only minute amounts of each asMIP are necessary for quantitative alternative splicing measurements; asMIP reactions carried out with 100 attomoles (amol) of each probe were tightly correlated (R² = 0.94) (Additional File 1B). SNP MIPs have already been used successfully in ~40,000plex reactions [19], and 100,000plex reactions have been proposed [15]. The actual limit of multiplexing for MIPs has not been determined; it could, in practice, be significantly greater than 100,000plex.
Exon arrays also provide a high level of multiplexing, but the large variation in probe hybridization [8] and the smaller dynamic range [7] often confound data analysis and impede the identification of individual tissue-specific splicing changes. The asMIP assay does not appear to suffer from the same limitations and, consequently, can accurately identify independent tissue-specific splicing changes, like those seen for TPD52 (Figures &), whereas exon arrays struggle with the same task [9].
The asMIPs are well suited for quantifying known alternative splice events accurately in a single tube using a minimum of sample, but there are projects for which the technology is not ideal: large-scale analyses on a small number of samples might not be cost-effective, and novel splice sites cannot be identified using this method. Certainly, economies of scale would offset the initial financial outlay for large libraries of oligonucleotide probes, and asMIP collections would thus be a valuable resource, easily shared among laboratories. However, researchers conducting global splicing studies on only a few samples will likely find high-throughput sequencing (HTS) more attractive than asMIPs, particularly when a large dynamic range is not essential. Similarly, researchers wishing to map isoforms or identify a collection of potential cancer biomarkers de novo would likely use HTS. Once those biomarkers have been identified, however, asMIPs are well positioned to characterize them accurately and cost-effectively in patient samples. Indeed, by barcoding each asMIP reaction prior to quantitation, one could assay 1000 splice junctions in 20 samples using a single HTS reaction, producing a >1000-fold dynamic range. In this case, employing asMIPs would be less expensive and provide a larger dynamic range than carrying out 20 separate HTS reactions.
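The pooling arithmetic behind this last example can be sketched as follows. The text does not state the size of the sequencing run, so a run of 20 million reads is assumed here purely for illustration; the function name and the even split of reads across barcodes and probes are likewise assumptions rather than properties of the assay.

```python
def pooled_reads_per_junction(total_reads, n_junctions, n_samples):
    """Average reads per junction per sample when barcoded samples share one run."""
    return total_reads / (n_junctions * n_samples)


ASSUMED_READS_PER_RUN = 20_000_000  # assumption: not stated in the text

# 1000 splice junctions assayed in 20 barcoded samples pooled into one HTS reaction.
pooled = pooled_reads_per_junction(
    ASSUMED_READS_PER_RUN, n_junctions=1_000, n_samples=20
)
print(f"{pooled:.0f} reads per junction per sample")
```

Under these assumptions the pooled run leaves on the order of a thousand reads per junction in each sample, which is the scale of dynamic range quoted above; a larger run would raise it proportionally.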