This study illustrates why statistical investigation is essential to ensure that experiments deliver meaningful results. We find that the three-level nested ANOVA is a statistically appropriate method for NIBP data when multiple readings are collected per mouse, as its assumptions are met. Appropriate statistical tools are needed to ensure that correct leads are identified for future studies, with the false-positive rate controlled at the level selected by the researcher. This study also demonstrates that optimising the experimental design is essential not only to achieve the research objectives but also to reduce work (and therefore cost) and to enhance welfare (refinement). Because these are significant issues in animal research, it is critical to complete these analyses before embarking on experiments, particularly in a high-throughput scenario.
To optimise the design of the experiment, the sources of variation in the data were investigated and used in a power analysis. For heart rate measurements in NIBP, the variation between mice dominated: on average, approximately 83% of the variation arose between mice, 13% between days, and 3% between readings for a mouse on a given day. For blood pressure measurements, 69% of the variation arose between mice, 30% between days, and only 1% between readings for a mouse on a given day. These results came from data sets where no user review occurred, which suggests that the user review process adds little value; omitting it saves a considerable amount of time. With so little variation arising between readings, a power analysis confirms that once one reading is obtained there is little benefit from additional readings. The number of days influenced the sensitivity obtained, but the most influential factor was the number of mice used in the analysis. With this information, the optimal design, which balances the cost against the available resources and the experimental objective, can be chosen. Specifically, the current design in our facility achieves the target power of 0.8 to detect large changes (0.8 SD units), which we feel is an appropriate goal for primary-screening, hypothesis-generating research. Therefore, we do not need to alter the number of mice or days in the current experimental design. However, the number of readings per day (up to 15) is excessive, yielding little added value, and can confidently be reduced without loss of power. We settled on five readings per day to allow for missing values that can arise from mouse movement during the procedure. This is a refinement from a welfare perspective, as it reduces the number of measurement cycles and hence the experimental duration.
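The effect of each design choice on power can be illustrated with a short calculation. The Python sketch below is not the power analysis performed in this study; it is a simplified illustration that assumes a balanced design, a normal approximation to the test statistic, and a two-group (mutant versus control) comparison. The variance components are the heart rate proportions quoted above, while the group sizes and number of days are hypothetical values chosen only for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def se_of_difference(var_mouse, var_day, var_reading,
                     n_mice_a, n_mice_b, n_days, n_readings):
    """Standard error of the difference between two group means in a
    balanced nested design (readings within days within mice)."""
    def var_group_mean(n_mice):
        # Each variance component is averaged over the number of
        # independent units that contribute to it.
        return (var_mouse / n_mice
                + var_day / (n_mice * n_days)
                + var_reading / (n_mice * n_days * n_readings))
    return sqrt(var_group_mean(n_mice_a) + var_group_mean(n_mice_b))


def approx_power(effect, se, alpha=0.05):
    """Two-sided power under a normal approximation (an assumption;
    a t-based calculation would give somewhat lower power for small n)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Heart rate variance proportions reported in the text (mouse, day, reading).
VAR_MOUSE, VAR_DAY, VAR_READING = 0.83, 0.13, 0.03
TOTAL_SD = sqrt(VAR_MOUSE + VAR_DAY + VAR_READING)
EFFECT = 0.8 * TOTAL_SD  # a "large" change of 0.8 SD units

# Hypothetical design values, not taken from the study.
N_MICE, N_DAYS = 10, 3

if __name__ == "__main__":
    for n_readings in (1, 5, 15):
        se = se_of_difference(VAR_MOUSE, VAR_DAY, VAR_READING,
                              N_MICE, N_MICE, N_DAYS, n_readings)
        print(f"readings/day={n_readings:2d}  power={approx_power(EFFECT, se):.3f}")
    for n_mice in (5, 10, 20):
        se = se_of_difference(VAR_MOUSE, VAR_DAY, VAR_READING,
                              n_mice, n_mice, N_DAYS, 5)
        print(f"mice/group={n_mice:2d}    power={approx_power(EFFECT, se):.3f}")
```

Because the reading-level component is divided by the product of mice, days, and readings, adding readings changes the standard error very little, whereas adding mice reduces every term. The absolute power values depend entirely on the hypothetical group sizes and should not be read as results for the design described here.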
For both practical and ethical reasons there is a drive to reduce the number of animals used in a study, as reiterated by the mantra of the three Rs (Burch and Russell 1959). If an experiment is underpowered, the findings are inconclusive and hence a power analysis, along with the three Rs, can be used to justify an increase in the number of mice. However, with an overpowered study, the additional readings are not necessary and the number of mice should be reduced.
Across the 46 mutant–control comparisons, the number of statistically significant findings identified depended on the significance threshold (p value) used. The lower the p value threshold, the lower the risk of a false positive, which is a particular concern in a multiple-testing scenario. However, protecting against false positives in this manner increases the risk of false negatives, where biologically significant differences are missed. To address the multiple-testing problem while maintaining sensitivity, the FDR was estimated for various thresholds of significance. This data set demonstrates that allowing some false calls increases sensitivity whilst giving a measure of the associated risk.
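The trade-off between threshold and FDR can be made concrete with a simple plug-in estimate. The estimator used in this study is not specified here, so the Python sketch below is an assumption: it divides the number of false positives expected if every null hypothesis were true (the number of tests multiplied by the threshold) by the number of comparisons called significant at that threshold. The p values in the example are invented for illustration and are not data from this study.

```python
def estimated_fdr(p_values, threshold):
    """Conservative plug-in FDR estimate at a given p value threshold.

    Assumes all null hypotheses are true when counting expected false
    positives, so the estimate tends to overstate the true FDR.
    """
    n_tests = len(p_values)
    n_hits = sum(p <= threshold for p in p_values)
    if n_hits == 0:
        # No discoveries, so no false discoveries by convention.
        return 0.0
    expected_false_positives = n_tests * threshold
    return min(1.0, expected_false_positives / n_hits)


if __name__ == "__main__":
    # Hypothetical p values for illustration only.
    example_p = [0.001, 0.004, 0.03, 0.2, 0.5, 0.9, 0.6, 0.35, 0.7, 0.8]
    for threshold in (0.005, 0.01, 0.05):
        hits = sum(p <= threshold for p in example_p)
        fdr = estimated_fdr(example_p, threshold)
        print(f"threshold={threshold}: hits={hits}, estimated FDR={fdr:.3f}")
```

Relaxing the threshold typically admits more hits at the price of a higher estimated FDR, which lets the researcher choose a threshold with an explicit view of the associated risk.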
The most robust hit was found for metastasis associated 1 (Mta1). Homozygous null mice of both genders displayed an increase in heart rate of approximately 50–60 bpm (p < 0.05). This increase was detected to a lesser degree (20–40 bpm) in heterozygous mice of both genders (p < 0.05), indicating a gene-dosage effect. Mta1 is a broadly expressed gene [(Simpson et al. 2001); in-house observation from lacZ reporter gene study] known to be a component of the Mi-2/nucleosome remodeling and histone deacetylase (NuRD) complex and therefore plays a key role in regulation of gene expression. There are no prior publications linking Mta1 with cardiac function, although an alternative transcript was detected in the heart (Simpson et al. 2001).
This case study demonstrates the value of using statistical analysis to direct experimental design, allowing an informed decision that ensures the three Rs are being met. Additional statistical analysis with effect size and false discovery measures can ensure that the findings are robust and that future downstream work is efficient. This is essential for minimising the number of experiments whilst maximising the potential benefit to scientific knowledge.