The present studies used a gene-level summarization approach, which has been shown to provide more reliable values than single base positions or smaller intervals because of the non-uniformity between sequence positions that may be attributed to random hexamer priming. In agreement with previous reports, we find that adjusting for differences in total read count alone appears to be inadequate, but an additional correction by the upper quartile of each lane is sufficient to normalize the data. We found little evidence of gene length bias using the edgeR package, though a relatively large effect was found using a Poisson model. We show that there is some potential for false positives and false negatives in comparisons of inbred mouse strains due solely to the presence of SNPs, though the effect on this comparison is not severe. We note that any bias that may exist will be a function of SNP density. For more complex and SNP-dense mouse models, such as heterogeneous stock, SNPs may have a more serious impact on DE of some genes. We further note that insertions and deletions between the two strains, which we did not take into account, could potentially cause similar effects. The impact of genome structure on differential expression analysis will likely cease to be an issue for several inbred mouse strains once the genome sequences of the common inbred strains, currently being generated by Sanger, are released, allowing each strain to be analyzed relative to its own reference sequence and transcript models.
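As an illustration of the upper-quartile correction discussed above, the following is a minimal sketch in Python with NumPy. It is not the code used in this study: the function names `upper_quartile_factors` and `normalize_counts` are hypothetical, the input is assumed to be a genes-by-lanes matrix of raw counts, and rescaling the factors to a geometric mean of 1 is a convenience chosen here. It also assumes every lane has a non-zero upper quartile.

```python
import numpy as np

def upper_quartile_factors(counts):
    """Per-lane scaling factors from the 75th percentile of counts.

    counts: genes x lanes array of raw read counts (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    # Ignore genes with zero reads in every lane; they carry no information
    # about lane depth and would drag the quartile towards zero.
    expressed = counts[counts.sum(axis=1) > 0]
    uq = np.percentile(expressed, 75, axis=0)
    # Rescale so the factors have geometric mean 1 (an illustrative choice).
    return uq / np.exp(np.mean(np.log(uq)))

def normalize_counts(counts):
    """Divide each lane by its upper-quartile factor."""
    counts = np.asarray(counts, dtype=float)
    return counts / upper_quartile_factors(counts)
```

In this sketch, a lane sequenced to twice the depth of another receives a proportionally larger factor, so the normalized counts of genes with unchanged expression line up across lanes. Unlike the total read count, the upper quartile is less sensitive to a handful of very highly expressed genes.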
RNA-Seq provided more genes that were “detected,” that is, had reliable signal and were not impacted by SNP or annotation issues, than either the Illumina or Affymetrix microarray platforms. We recognize the difficulty in analytically describing a background level for RNA-Seq and, for the future, recommend additional experiments that accurately measure the abundance of mRNA from genes with a relatively wide range of counts in the model system of choice, similar to what was done in earlier studies. Despite platform differences, we found that, in general, microarrays and RNA-Seq agreed relatively well based upon fold-change direction and significance values, with 58% of the RNA-Seq DE genes that were interrogated on all platforms determined to be differentially expressed in the same direction at a false discovery rate of 0.01. The largest proportion of those genes that were found to be DE by RNA-Seq and at least one microarray platform were found to be DE on all three platforms. Interestingly, the Affymetrix MOE430 2.0 tended to agree with the RNA-Seq data better than Illumina. Higher correlations between Affymetrix and sequencing data have been observed before in the context of SAGE-like digital gene expression (DGE).
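The kind of cross-platform agreement described above (same fold-change direction, significant on both platforms at a given false discovery rate) can be sketched as follows. This is an illustrative example only: the Benjamini-Hochberg adjustment is an assumption, since the multiple-testing procedure is not named in this section, and the function names and the restriction to a single microarray platform are hypothetical simplifications.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # p_(i) * n / i for the i-th smallest p-value
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out

def same_direction_fraction(seq_lfc, seq_p, array_lfc, array_p, fdr=0.01):
    """Fraction of RNA-Seq DE genes that are also DE on the array
    with the same sign of log fold change.

    All inputs are per-gene arrays in the same gene order, restricted to
    genes interrogated on both platforms (assumed preprocessing).
    """
    seq_lfc = np.asarray(seq_lfc, dtype=float)
    array_lfc = np.asarray(array_lfc, dtype=float)
    seq_de = benjamini_hochberg(seq_p) <= fdr
    array_de = benjamini_hochberg(array_p) <= fdr
    agree = seq_de & array_de & (np.sign(seq_lfc) == np.sign(array_lfc))
    n_seq_de = seq_de.sum()
    return agree.sum() / n_seq_de if n_seq_de else float("nan")
```

Using the RNA-Seq DE genes as the denominator mirrors the way the agreement is reported above; a symmetric measure would use the union of DE genes instead.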
The largest proportion of RNA-Seq DE genes interrogated by all three platforms was represented by those that were seen to be differentially expressed in RNA-Seq only. We found that these genes tended to have a greater average read count relative to those that agreed with at least one microarray platform, further indicating the utility of RNA-Seq over these two microarray platforms. The 144 genes that are differentially expressed in common with the two microarray technologies may represent instances of relatively homogeneous expression across the annotated gene, as the probes from these two microarray platforms should tend to be biased towards the 3′ UTR end of the gene. In this respect, genes that agreed with only one platform in terms of differential expression, or were seen only in RNA-Seq, may represent differences in transcript isoform abundances. The variation in each of these categories illustrates an advantage of RNA-Seq compared to microarrays in that, in this case, RNA-Seq calculates DE across the entire gene rather than just at an individual probe(set) location within the gene. By some estimates there are, on average, approximately 2.5 alternative transcripts for each mouse gene. With alternative splicing in the picture, probe location clearly impacts interpretation of microarray data.
Another potentially informative biological assessment of sensitivity could be made by examining the Y-chromosome genes across the platforms. However, review of the Ensembl annotation used for the analysis in this study revealed only 20 genes annotated to be on the Y chromosome. Of these, after applying the filters used in our analyses (SNPs, detection above background, and unique Ensembl mappings), only 7 probes on the Illumina array remain, interrogating 5 genes on the Y chromosome, and no probesets on the Affymetrix array remain. We do note that of the 5 genes detected by the Illumina array, 4 were also detected by RNA-Seq (the single exception was at the lowest acceptable signal level on the Illumina array). However, with so few annotated genes and such poor coverage on the arrays, assessments using this type of assay would not be informative at this time.
Additionally, many choices of experimental protocols currently exist for RNA-Seq, each with its own benefits and consequences for downstream analysis. For example, to remove highly abundant rRNA molecules, either enrichment of poly-A containing sequences or depletion of the rRNAs has been used. For our RNA-Seq experiments we used a poly-A enrichment procedure, as the current rRNA depletion strategies have been observed to be less efficient. However, it is possible that use of different selection procedures may impact the agreement with microarrays, which is an interesting topic for future research.
Although many methods for analyzing RNA-Seq data currently exist and more continue to be produced, the most basic and fundamental question that can be asked is whether a gene/transcript is differentially expressed between two groups. Gene-level differential expression then forms the basis for further experiments directed at identifying and quantifying transcript isoforms between samples and de novo identification of unannotated expressed regions. Methods exist to infer alternative splicing and relative quantification of transcript isoforms, and the generation of paired-end sequences should allow greater confidence in any novel alternative splicing events observed. The use of this technology will be pursued in the future to assess details of alternative splicing across mouse strains.