Table 3 summarizes the minimum increases above the spontaneous frequency that can be detected in groups of five animals (the minimum currently recommended in OECD, FDA, and EPA guidelines [17–19]) as a function of the number of target cells scored (in this case RETs) and the observed spontaneous frequency (in this case %MN-RETs among RETs). Minimum detectable increases in MN-RET frequencies at p ≤ 0.05 or p ≤ 0.01, with 90% or 95% power, were determined using Monte Carlo simulations. Specifically, to reflect inter-animal variability, five binomial probabilities were randomly selected from a normal distribution with one of the following combinations of mean, μ0, and standard deviation, σ: (μ0, σ) = (0.05%, 0.02%), (0.10%, 0.045%), (0.20%, 0.070%), or (0.30%, 0.092%). For a given fold-increase, f, a second set of five binomial probabilities was randomly selected from a normal distribution with mean μ1 = μ0 × f and the same σ given above. Using the five binomial probabilities from the spontaneous mean group, five MN-RET frequencies were randomly generated from binomial distributions, with n = number of RETs scored (2000, 4000, or 20,000). Such selection from a binomial distribution introduces the binomial counting error. Five MN-RET frequencies were similarly generated using the five binomial probabilities from the increased mean group. A one-tailed Mann-Whitney test was then performed on these 10 counts, comparing the spontaneous group to the increased group, and it was noted whether the p-value was 0.05 or less and/or 0.01 or less. This was repeated 3000 times, and the percentages of the 3000 ‘samples’ for which the p-value was 0.05 or less and 0.01 or less were calculated. The process was repeated over a series of increases, f, at increments of 0.1, to determine the first point at which the power exceeded 90% or 95%. We obtained very similar results (not shown) by generating the five binomial probabilities from beta distributions having the above combinations of μ0, μ1 and σ.
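The simulation procedure described above can be sketched in Python as follows. This is a minimal illustration, not the original analysis code: the function names (`estimate_power`, `min_detectable_fold`), the use of NumPy and SciPy, the fixed random seed, and the clipping of sampled probabilities to the interval [0, 1] (needed because a normal distribution can return negative values) are all assumptions introduced here.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def estimate_power(mu0, sigma, fold, n_cells, n_animals=5, n_sim=3000, seed=0):
    """Estimate power at p <= 0.05 and p <= 0.01 for one fold-increase.

    mu0 and sigma are fractions (0.10% is written as 0.001).
    """
    rng = np.random.default_rng(seed)
    hits_05 = 0
    hits_01 = 0
    for _ in range(n_sim):
        # Inter-animal variability: one binomial probability per animal.
        # Clipping to [0, 1] is an assumption made for this sketch.
        p_ctrl = np.clip(rng.normal(mu0, sigma, n_animals), 0.0, 1.0)
        p_trt = np.clip(rng.normal(mu0 * fold, sigma, n_animals), 0.0, 1.0)
        # Counting error: MN-RET counts among n_cells RETs per animal.
        ctrl = rng.binomial(n_cells, p_ctrl)
        trt = rng.binomial(n_cells, p_trt)
        # One-tailed test: is the spontaneous group lower than the increased group?
        p_value = mannwhitneyu(ctrl, trt, alternative="less").pvalue
        hits_05 += int(p_value <= 0.05)
        hits_01 += int(p_value <= 0.01)
    return hits_05 / n_sim, hits_01 / n_sim


def min_detectable_fold(mu0, sigma, n_cells, target_power=0.90,
                        use_p01=False, max_fold=20.0, **kwargs):
    """Step the fold-increase in increments of 0.1 until power exceeds the target."""
    k = 1
    while 1.0 + 0.1 * k <= max_fold:
        fold = round(1.0 + 0.1 * k, 1)
        power_05, power_01 = estimate_power(mu0, sigma, fold, n_cells, **kwargs)
        if (power_01 if use_p01 else power_05) > target_power:
            return fold
        k += 1
    return None  # target power not reached within max_fold
```

For example, `min_detectable_fold(0.001, 0.00045, 20000)` searches for the smallest fold-increase detected with more than 90% power at p ≤ 0.05 when the spontaneous frequency is 0.10% and 20,000 RETs are scored. Results will vary slightly with the seed and the number of simulated samples.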
Table 3. Minimum Detectable Increases in MN-RET Frequency in Groups of Five Animals as a Function of Spontaneous Frequency and Number of RETs Scored (a)
For the line labeled “∞” in Table 3, there is no counting error; rather, the variability in frequencies is due to inter-animal variation alone. If we assume that inter-animal variation is normally distributed, the minimum difference between μ1 and μ0, δ = μ1 − μ0, detectable using 5 animals per group with significance level α and power 1 − β is

$$\delta = (t_\alpha + t_\beta)\,\sigma\,\sqrt{2/5}$$

[20]. Here, tα and tβ are the critical values from the t-distribution with 5 + 5 − 2 = 8 degrees of freedom having upper tail probabilities α and β, respectively. The minimum detectable fold-increase over the spontaneous group is then

$$f = \frac{\mu_1}{\mu_0} = 1 + \frac{\delta}{\mu_0}$$
While spontaneous MN-RET frequencies determined from counting 2000 RETs from different animals are not often normally distributed, it has been our experience that spontaneous frequencies determined from counting 20,000 RETs from different animals are approximately normally distributed. Therefore, the assumptions of normality that we made above are most likely reasonable.
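As a rough numerical check of the case with no counting error, the sketch below evaluates the minimum detectable fold-increase. It assumes that the standard two-sample expression δ = (tα + tβ) × σ × √(2/n) applies, with n = 5 animals per group and 2n − 2 = 8 degrees of freedom. The function name and the use of SciPy are illustrative choices and are not taken from the original analysis.

```python
from math import sqrt

from scipy.stats import t


def fold_increase_no_counting_error(mu0, sigma, alpha, power, n_animals=5):
    """Minimum detectable fold-increase when only inter-animal variation is present.

    Assumes delta = (t_alpha + t_beta) * sigma * sqrt(2 / n), a standard
    two-sample formula; mu0 and sigma must be in the same units.
    """
    df = 2 * n_animals - 2
    t_alpha = t.ppf(1.0 - alpha, df)  # upper tail probability alpha
    t_beta = t.ppf(power, df)         # upper tail probability beta = 1 - power
    delta = (t_alpha + t_beta) * sigma * sqrt(2.0 / n_animals)
    return 1.0 + delta / mu0
```

Under these assumptions, `fold_increase_no_counting_error(0.10, 0.045, alpha=0.05, power=0.90)` gives a fold-increase of about 1.9 for a spontaneous frequency of 0.10% with σ = 0.045%.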
It should be noted that even if the counting error of the MN-RET frequency in each individual animal could be eliminated, the sensitivity of detection of changes in the observed mean group frequency would still be limited by the inter-animal variability (represented in Table 3 by the line in which an infinite number of cells is scored). The regulatory assay as currently conducted is relatively insensitive to changes in the spontaneous frequency, especially when the spontaneous frequency is low. For example, when the spontaneous frequency is 0.05% and only 2000 RETs are scored, even a 6.8-fold increase would fail to be detected at a significance level of p ≤ 0.01 in 10% of experiments conducted. Even at the more commonly reported spontaneous frequency of 0.1%, a 4.8-fold increase would fail to be detected 10% of the time at this same significance level. The use of flow cytometric scoring to achieve a cell count sufficient to determine individual animal frequencies with adequate certainty (i.e., certainty of the individual value relative to the inter-animal variation) would increase the sensitivity such that a doubling of a spontaneous frequency of 0.1% among 20,000 RETs scored would be detected nearly 90% of the time at a significance level of p ≤ 0.05. It should also be noted that, regardless of the spontaneous frequency, the sensitivity achieved by scoring 20,000 RETs is close to the optimal sensitivity that could be achieved if no counting error were present.
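The relative size of the two error sources can be illustrated with a short calculation. The sketch below compares the binomial counting error of a single animal's frequency with the inter-animal standard deviation, using the approximation that the variance of an observed frequency is σ² + μ(1 − μ)/n. The helper names are hypothetical, and the approximation is an assumption made here for illustration rather than part of the reported analysis.

```python
from math import sqrt


def counting_sd(mu, n_cells):
    """Binomial standard deviation of a frequency estimated from n_cells cells."""
    return sqrt(mu * (1.0 - mu) / n_cells)


def total_sd(mu, sigma, n_cells):
    """Approximate SD of an observed frequency: inter-animal plus counting error."""
    return sqrt(sigma ** 2 + counting_sd(mu, n_cells) ** 2)


# Spontaneous frequency 0.10% with inter-animal SD 0.045%, written as fractions.
mu0, sigma = 0.001, 0.00045
for n_cells in (2000, 4000, 20000):
    ratio = total_sd(mu0, sigma, n_cells) / sigma
    print(f"{n_cells:>6} RETs: total SD is {ratio:.2f} x the inter-animal SD")
```

With these inputs, the counting error at 2000 RETs (about 0.07%) is larger than the inter-animal standard deviation itself, whereas at 20,000 RETs (about 0.02%) it adds only around 12% to the total, which is consistent with the observation that scoring 20,000 RETs approaches the sensitivity of the no-counting-error case.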