Gene expression levels in hindlimb muscle tissue from five different inbred mouse strains (CBA, BALB, BL6, DBA, and BL10) were determined. Total RNA from two individuals per strain was isolated, reverse transcribed, and subsequently labelled according to a recently developed protocol (adapted from Xiang et al., 2002), which requires an input of only 1 μg total RNA. Labelled cDNA was hybridised to murine microarrays containing 7,776 65-mer oligonucleotides spotted in duplicate.
Significance levels (p-values) between the five mouse inbred strains were calculated using analysis of variance[10]. Significance levels between the two individual mice within each strain were determined using a hierarchical t-test, which provides higher statistical power than conservative methods for low (2–4) replicate numbers[11]. The higher power is obtained by borrowing information across genes to produce a better estimator of the expression variance. The gain in power is reflected in an increase in the degrees of freedom associated with the t-test. For both computations, differentially expressed genes were selected by controlling the false discovery rate (FDR), as suggested by Benjamini and Hochberg (1995), rather than by using pre-defined cut-offs for p-values or corrections for multiple testing. The FDR represents the expected proportion of false positives among the selected differentially expressed genes, a proportion that increases dramatically under the multiple testing inherent in microarray experiments[12].
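The idea of borrowing information across genes can be illustrated with a minimal sketch. It assumes one common form of such an estimator, a weighted average of the gene-wise variance and a prior variance estimated from all genes; the exact estimator used in [11] is not given in this section, and the names `shrink_variance`, `prior_s2`, and `prior_df` are hypothetical.

```python
def shrink_variance(s2, df, prior_s2, prior_df):
    """Combine a gene-wise variance with a prior variance shared across genes.

    s2, df             : sample variance of one gene and its degrees of freedom
    prior_s2, prior_df : prior variance and prior degrees of freedom
                         (assumed here to be already estimated from all genes)
    Returns the shrunken variance and the increased degrees of freedom.
    """
    post_df = df + prior_df
    post_s2 = (df * s2 + prior_df * prior_s2) / post_df
    return post_s2, post_df


# Toy numbers, not taken from the study: two replicates give df = 1.
post_s2, post_df = shrink_variance(s2=0.04, df=1, prior_s2=0.10, prior_df=4)
print(post_s2, post_df)
```

With only one degree of freedom per gene, the gene-wise variance is unstable; in this sketch the t-test would instead use the shrunken variance with `df + prior_df` degrees of freedom, which is where the gain in power comes from.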
Using an FDR of 10%, we selected 88 out of 6144 (1.4%) expressed genes as differentially expressed between strains (Fig. ). Fewer differentially expressed genes were found in the analysis of variation within strains at the same FDR of 10% (Table ). Results for other FDR levels are available online as an additional file. The correlation between gene expression levels of the two samples from each strain was high (Pearson correlation coefficient ranging from 0.87 to 0.95), which also indicates low within-strain variation (Table ). A considerable number of differentially expressed genes (718) was selected when a pre-defined cut-off value (p < 0.05) was used to determine differential gene expression between strains. However, the corresponding adjusted FDR level indicated a proportion of false positives equal to 42%. On the other hand, adjusting for multiple testing using the Bonferroni correction proved too stringent, leaving few or no differentially expressed genes. Controlling the FDR therefore appears to be an optimal method for selecting differentially expressed genes while simultaneously assessing the validity of the experimental outcome.
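The three cut-off strategies compared above can be sketched as follows. This is an illustration of the Benjamini and Hochberg step-up procedure next to a raw p-value cut-off and a Bonferroni correction, not the analysis code used in this study; the function names and the toy p-values are assumptions made for the example.

```python
def raw_cutoff(pvalues, alpha=0.05):
    """Indices of genes with an unadjusted p-value below alpha."""
    return [i for i, p in enumerate(pvalues) if p < alpha]


def bonferroni(pvalues, alpha=0.05):
    """Indices of genes passing the Bonferroni-corrected threshold alpha / m."""
    m = len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= alpha / m]


def benjamini_hochberg(pvalues, fdr=0.10):
    """Indices of genes selected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            k_max = rank
    # All genes up to that rank are selected, even if some of them
    # individually exceed their own threshold.
    return sorted(order[:k_max])


# Toy p-values for ten hypothetical genes (not data from this study).
pvals = [0.001, 0.008, 0.039, 0.041, 0.065, 0.074, 0.205, 0.212, 0.216, 0.5]
print(len(raw_cutoff(pvals)), len(benjamini_hochberg(pvals)), len(bonferroni(pvals)))
```

On these toy values the raw cut-off selects the most genes, the Bonferroni correction the fewest, and the FDR-controlled selection falls in between, which mirrors the ordering reported above.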
Figure 1. Differentially expressed genes between mouse inbred strains. Relative expression levels of differentially expressed genes between mouse inbred strains are depicted in colour as relative intensity levels. Shown for each gene are GenBank accession number, ...
Number of differentially expressed genes using several cut-off strategies
To put the influence of genetic background on differential gene expression in perspective, we compared gene expression between affected hindlimb muscle tissue from mdx mice and healthy hindlimb muscle tissue from control mice with an identical genetic background. Selection with an FDR of 10% resulted in 1298 differentially expressed genes. Differential gene expression between the two most divergent mouse inbred strains (BL6 and CBA, data not shown) was determined with identical statistical methods to allow a direct comparison. Selection with an FDR of 10% yielded an approximately ten-fold lower number of differentially expressed genes (126). Absolute fold changes were calculated and their distributions were subsequently compared (Fig. ). Median gene expression levels are equal between affected/healthy and inbred/inbred. However, the number of large fold changes (>3) is much higher between affected/healthy (221) than between inbred/inbred (7), consistent with a low contribution of genetic background to differential expression.
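The fold-change comparison can be sketched as follows. The sketch assumes that expression differences are available as log2 ratios, which is not stated in this section; the function names and the example values are hypothetical.

```python
from statistics import median


def absolute_fold_change(log2_ratio):
    """Convert a log2 ratio to an absolute fold change (always >= 1)."""
    return 2 ** abs(log2_ratio)


def summarize_fold_changes(log2_ratios, threshold=3.0):
    """Return the median absolute fold change and the number above threshold."""
    folds = [absolute_fold_change(r) for r in log2_ratios]
    n_large = sum(f > threshold for f in folds)
    return median(folds), n_large


# Toy log2 ratios for five hypothetical genes (not data from this study).
med, n_large = summarize_fold_changes([1.0, -2.0, 0.5, -0.5, 2.0])
print(med, n_large)
```

Taking the absolute value before exponentiating treats a three-fold decrease and a three-fold increase alike, so the two distributions can be compared on a single scale.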
Figure 2. Effect of different genetic background on differential gene expression. The distribution of absolute fold changes of differentially expressed genes (n = 1298) between affected (mdx) and healthy (WT) muscle was compared to the distribution of absolute ...
Although overall expression levels are similar between strains, a relatively high number of differentially expressed genes was due to deviating gene expression levels in BL6. We performed quantitative real-time RT-PCR (qPCR) on five genes to verify our microarray data. Two genes, myomesin 1 and tropomodulin 1, whose expression on our microarrays was 2.2-fold and 1.8-fold lower in BL6 than in the other strains, also showed lower expression in our qPCR assay (2.0-fold and 2.2-fold, respectively) (Fig. ). Three other genes (dysferlin, cystatin B, and thrombospondin 4) showed no differential expression between any of the strains.
Figure 3. Validation of BL6-dependent gene expression with qPCR. Relative gene expression levels between mouse inbred strains of tropomodulin 1 (Tmod1) and myomesin 1 (Myom1) as determined by quantitative RT-PCR. Significantly lower expression (p < 0.01, ...