J Gerontol A Biol Sci Med Sci. 2011 March; 66A(3): 279–286.
Published online 2010 November 4. doi:  10.1093/gerona/glq190
PMCID: PMC3041472

Can Rodent Longevity Studies be Both Short and Powerful?


Many rodent experiments have assessed effects of diets, drugs, genes, and other factors on life span. A challenge with such experiments is their long duration, typically over 3.5 years given rodent life spans, thus requiring significant time costs until answers are obtained. We collected longevity data from 15 rodent studies and artificially truncated them at 2 years to assess the extent to which one will obtain the same answer regarding mortality effects. When truncated, the point estimates were not significantly different in any study, implying that in most cases, truncated studies yield similar estimates. The median ratio of variances of coefficients for truncated to full-length studies was 3.4, implying that truncated studies with roughly 3.4 times as many rodents will often have equivalent or greater power. Cost calculations suggest that shorter studies will be more expensive but perhaps not so much to not be worth the reduced time.

Keywords: Longevity, Rodent studies, Proportional hazards, Survival analysis, Sample size

RODENT longevity studies remain a staple of experimental aging research, having been used to evaluate the effects of diets, drugs, genetic factors, toxins, or other factors on life span. One of the greatest obstacles with such experiments is their long duration, typically requiring 3.5 years or more to observe the complete life span of all study rodents (1). In addition to the significant financial burden of such studies, the time invested until answers can be obtained can represent a significant fraction of an investigator's research career, limiting the number of experiments an investigator can perform in their lifetime.

Interventional effects on longevity are commonly estimated through Cox proportional hazards (PH) regression models (2) or such parametric formulations as the Gompertz model (3). Under the Cox PH model, the relative hazard rate (the instantaneous probability of death occurring at a given moment, conditional on death not happening prior to that moment) between two groups is assumed to have a fixed ratio over time. It is important to note that the Cox PH model does not restrict the hazard for a particular group to be constant as a function of time; it is flexible enough to be congruent with the biological expectation of increasing hazard during the course of normal aging. Similarly, the Gompertz model assumes that the relative acceleration in mortality is constant as a function of age. Because these models assume constant hazard ratios (HRs)/relative increases in mortality rate over time, their use implicitly assumes that any relative effects on mean or median life span are also present for so-called “maximum life span,” a quantity of great interest to gerontologists (4).
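In symbols, the two models can be sketched as follows. This is the standard textbook form, not notation taken from the studies analyzed here; the baseline hazard $h_0(t)$, the coefficient vector $\beta$, and the Gompertz parameters $a$ and $b$ are symbols introduced only for illustration.

```latex
% Cox PH model: hazard at age t for a rodent with covariates x
\[ h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right) \]

% The hazard ratio between two groups x_1 and x_0 does not depend on t,
% even though the baseline hazard h_0(t) may rise with age
\[ \mathrm{HR} = \frac{h(t \mid x_1)}{h(t \mid x_0)}
             = \exp\!\left\{\beta^{\top}(x_1 - x_0)\right\} \]

% Gompertz model: the log hazard increases linearly with age at rate b
\[ h(t) = a\,e^{bt}, \qquad \frac{d}{dt}\log h(t) = b \]
```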

Recognizing the significant time investment required for full longevity studies along with the application of models that assume constant effects across time invites the question as to whether the rodent mortality experience in these settings is consistent with the assumption of PH. Put another way, is it possible to reduce the length of longevity experiments by running studies for only τ years, where τ is less than full follow-up and still achieve comparable effect estimates to what one would obtain if one continued the study until all rodents died? Truncating longevity experiments prior to observing death for all rodents reduces statistical power due to the inherent loss of information. Thus, a related question is whether this loss of power can be overcome by increasing sample size, permitting shorter studies in terms of calendar time that still achieve the same expected inference concerning effects on mortality rate.

The goal of the present study is to empirically evaluate the assumption of PH using a collection of 15 rodent longevity experiments. Motivated by the design of long-term rodent toxicology (typically carcinogenicity) experiments, we also consider the situation of truncating longevity studies at 2 years of follow-up. We evaluate whether these shorter term studies could be expected to yield, on average, the same result with respect to effects on mortality rate that would be obtained from a full lifetime study. For situations in which the observed effects on mortality rate would be roughly the same in a shorter versus life-long follow-up study (ie, PH is a valid assumption), we estimate the required increase in sample size needed to achieve the same statistical power and precision as would a full lifetime study. Finally, we also compare the total study costs for a truncated follow-up experiment versus a full-length longevity study.


Data Sets

We utilized mortality data from a convenience sample of 15 rodent life-span studies available to us from our own work or provided by close collaborators (Table 1). These life-span studies evaluated a broad range of factors, including sex, genotype, and drug, diet, exercise, and surgical interventions, though we do not claim that they represent an exhaustive sampling of rodent longevity studies. Available studies in which the intervention changed during the course of the study were excluded, such as the methionine study of Miller and colleagues (4). The 15 included studies had an average length of 3.6 years (SD = 0.8) and an average sample size of 363 rodents (SD = 50).

Table 1.
List of Rodent Longevity Experiments Included in Current Study*

Statistical Methods

For each available study, we estimated the mortality effects of the various treatments, interventions, and other factors considered in the original experiment via Cox PH regression. To assess evidence for deviations from the assumption of PH, we performed statistical tests of the PH assumption by including time-based effects in the regression model for each study (5). The inclusion of time-varying effects for each predictor variable is a standard statistical approach to assess the evidence for HRs changing over time, such as could occur if a given treatment leads to increased early mortality yet offers later protective effects (6). For each study, we performed an overall F test for all time-based effects in the full study data to examine the PH assumption. We utilized a jackknife procedure to conduct more focused tests for differences between the HRs obtained from analyzing the full longevity data versus truncating each study at 2 years (Appendix 1). We report uncorrected p values, so that readers may choose to apply the multiple testing or false discovery rate correction of their choice (7) or simply apply a nominal α = .05 significance level (8).
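One common way to write such a test, shown here for a single predictor $x$, is to add an interaction between the predictor and a function of time. The function $g(t)$ (for example, $\log t$) and the coefficient $\gamma$ are assumptions made for illustration; the text does not state which time function was used.

```latex
% Cox model extended with a time-varying effect for predictor x
\[ h(t \mid x) = h_0(t)\,\exp\!\left\{\beta x + \gamma\, x\, g(t)\right\} \]

% The implied hazard ratio per unit of x now depends on t
\[ \mathrm{HR}(t) = \exp\!\left\{\beta + \gamma\, g(t)\right\} \]

% Proportional hazards corresponds to the null hypothesis
\[ H_0\colon \gamma = 0 \]
```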

To estimate the required increase in sample size for truncated life-span studies, we assumed that, asymptotically, the variance of estimated effects is proportional to 1/N. That is, if we double the sample size, then the standard errors of estimated effects will shrink by a factor of $\sqrt{2}$. If the data from the full study yielded an estimate with variance $V_F$ and the truncated data yielded an estimate with variance $V_T$, then a truncated study will need a larger sample size by a factor of ~$V_T/V_F$ to achieve equivalent power.
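The arithmetic behind this scaling can be sketched in a few lines of Python. This is a minimal illustration under the stated 1/N assumption, taking the inflation factor as the truncated-to-full variance ratio (the square of the ratio of standard errors); the function names and the example standard errors are hypothetical, not values from the studies.

```python
import math


def variance_inflation(se_full: float, se_truncated: float) -> float:
    """Truncated-to-full variance ratio, (SE_truncated / SE_full) ** 2."""
    if se_full <= 0 or se_truncated <= 0:
        raise ValueError("standard errors must be positive")
    return (se_truncated / se_full) ** 2


def truncated_sample_size(n_full: int, se_full: float, se_truncated: float) -> int:
    """Rodents needed in a truncated study to match the full study's variance.

    Assumes the variance of the estimate is proportional to 1/N.
    """
    return math.ceil(n_full * variance_inflation(se_full, se_truncated))


# Hypothetical example: truncation doubles the standard error,
# so the variance quadruples and four times as many rodents are needed.
print(truncated_sample_size(n_full=90, se_full=0.5, se_truncated=1.0))
```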

Based on the estimated sample size increases for truncated longevity studies, we also estimated the costs associated with performing a hypothetical larger truncated experiment for each of the 15 studies. Based on the National Center for Research Resources rate-setting manual (9) and consultation with experts on rodent life-span experiments, we assumed the following in cost calculations. Both mice and rats incur an initial cost of $20.00 per animal to purchase, with loaded cage maintenance costs of $1.27 and $2.14 per day for mice (up to five per cage) and rats (up to three per cage), respectively. We also assume that a lab technician (salary of $40,000 per year) can manage up to 1,000 rodents. Because, in practice, surviving rodents are not reassigned to new cages once cage mates have died, the costs of full survival studies are likely underestimated to some degree. All analyses were performed using SAS v9.1.3 (SAS, Cary, NC).
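The cost assumptions can be combined as in the sketch below. It assumes the daily maintenance rate is applied per rodent-day, which is what reproduces the maintenance figures listed in Appendix 2; the function name, argument names, and dictionary keys are illustrative and not part of the original analysis.

```python
import math

PURCHASE_COST = 20.00                        # dollars per animal
DAILY_RATE = {"mouse": 1.27, "rat": 2.14}    # dollars per day
TECH_SALARY = 40_000                         # dollars per technician per year
RODENTS_PER_TECH = 1_000                     # rodents one technician can manage


def study_cost(n_rodents: int, total_rodent_days: float,
               species: str, study_years: float) -> dict:
    """Direct study cost under the assumptions above.

    Assumption: the daily rate is charged per rodent-day.
    """
    technicians = math.ceil(n_rodents / RODENTS_PER_TECH)
    technician_cost = technicians * TECH_SALARY * study_years
    purchase_cost = n_rodents * PURCHASE_COST
    maintenance_cost = total_rodent_days * DAILY_RATE[species]
    return {
        "technicians": technicians,
        "technician_cost": technician_cost,
        "purchase_cost": purchase_cost,
        "maintenance_cost": maintenance_cost,
        "total_cost": technician_cost + purchase_cost + maintenance_cost,
    }


# Inputs taken from the Study 3 row of Appendix 2 (185 rodents,
# 116,275 rodent-days, 2-year design, $1.27 daily rate).
cost = study_cost(185, 116_275, "mouse", 2)
print(round(cost["total_cost"]))  # 231369, the total listed for Study 3
```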


In Table 2, we report the uncorrected results of testing for departures from PH. After considering either the Bonferroni or Holm's (10) multiple-testing corrections, none of the studies gave statistically significant evidence of deviation from a PH model (Table 2). Two studies, 5 and 8, did exhibit nominal departure from PH. For Study 5, which tested selectively bred high-runner mice at different exercise levels against control mice, the control mice initially died off at a slower rate than the other two treatment groups but then died off rapidly toward the end (11). For Study 8, the mice that were administered thyroxine started the study with similar survival rates but then experienced accelerating death rates later in the study (12).
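For readers who wish to apply these corrections to the uncorrected p values themselves, the two procedures can be sketched as follows. The p values in the example are hypothetical and are not taken from Table 2; the function names are likewise illustrative.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's sequentially rejective procedure.

    Sort p values in ascending order, compare the k-th smallest
    (k = 0, 1, ...) with alpha / (m - k), and stop at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p values are also retained
    return reject


# Hypothetical p values for four tests
p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni_reject(p))  # [False, False, False, True] would need p <= 0.0125
print(holm_reject(p))
```

Holm's procedure never rejects fewer hypotheses than Bonferroni at the same α, which is why a study that fails both corrections gives no corrected evidence against PH.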

Table 2.
Statistical Tests of the Proportional Hazards Assumption by Experiment

Table 3 displays the results of truncating each of the studies at 2 years, censoring at 730 days each rodent whose death occurred after that point. The jackknife tests for differences in the estimated HRs between the truncated and full-length studies suggest that in 43 of 49 comparisons (88%), the value of the estimated coefficients did not differ significantly between the truncated and full-length studies at the α = .05 level; at the Bonferroni-corrected α level of .05/49, none of the differences were significant. Of the six treatment groups whose coefficients differed at the nominal .05 level, four were for dietary interventions.
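The truncation step amounts to administrative censoring at a fixed age. A minimal sketch, assuming each rodent's age at death in days is known from the full-length study (the function name and example ages are illustrative):

```python
def truncate_follow_up(death_days, cutoff=730):
    """Censor survival times at `cutoff` days.

    Returns (observed_time, death_observed) pairs: deaths after the
    cutoff are recorded as censored at the cutoff.
    """
    return [(min(day, cutoff), day <= cutoff) for day in death_days]


# Three hypothetical rodents dying at 500, 730, and 900 days
print(truncate_follow_up([500, 730, 900]))
# [(500, True), (730, True), (730, False)]
```

The resulting pairs are the time and event-indicator inputs that a Cox regression routine expects.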

Table 3.
Comparative Results of the Full Versus Truncated Analyses From Each Data Set and Their Implications for the Comparative Direct Costs of 2-Y Versus Full Life-Span Study*

Figure 1 displays variance ratios for the estimated HRs from truncated and full-length studies, defined as $(SE_{2yr}/SE_{full})^2$. Based on truncating studies at 2 years, the variances of the estimates became inflated by factors ranging from 1.2 to 34.0. As expected, the proportion of rodents that died before 2 years was strongly correlated with the level of variance inflation (Figure 1). On average, the variance of the estimates increased by a factor of 5.1 (SD = 5.6, median = 3.4). The largest increases in the variance ratios were in studies where less than 20% of rodents died before 2 years due to long life or small variation in life span. Under the assumption that truncated studies with five times as many rodents allocated to each arm would have equivalent power to a full-length study with one fifth of the total sample size, Appendix 2 displays the estimated total study cost for each type of design for each of the 15 data sets considered. The costs of running shorter experiments ranged from 1.8 to 4.0 times the cost of the full-length study (Table 4). Full details of the cost computations are shown in Appendix 2.

Figure 1.
Ratios of variance in effect estimates in truncated versus full life-span studies as a function of proportion of animals dying in truncated study. *The curved line represents a least squares second order polynomial fitted to the data. ** The numbers on ...


Our analysis indicates that, in most cases, interventions or other factors that influence rodent longevity induce effects consistent with PH models. Truncating the study length to 2 years did not significantly affect the estimated effect; only the variance of this estimate and hence its statistical significance were influenced, a result of reduced sample size and power. This implies that the Cox PH regression model (13) is sufficient for detecting differences in longevity, even when studies are cut short. We propose that increasing the number of rodents in the study by a factor derived from Figure 1, by a factor near 5 on average (mean) but in half the cases by a factor of no more than 3.4 (median), can offer power equivalent to a full-length study. In situations where time is critical and the impact of discovery is large, the additional cost burden of a shorter experiment may be justified. It may also be worth commenting on whether diet, strain, environment, cohort, etc. might have the most impact on achieving similarity of results in truncated versus full longevity studies, particularly given that each of these factors can affect both the overall life-span trends and specific mortality trajectories at earlier timepoints. Of course, the truncated approach will not provide full information on diseases of aging compared with a full longevity/disease experiment.

The current results suggest empirical adherence of mouse and rat longevity studies to PH, implying similar effects on mortality rate across the full life span. Thus, although the risk of death (hazard rate) may certainly accelerate with advanced age, it appears that differences in acceleration between groups commonly occur by a proportional factor. This further suggests that cases where early, mean, and median life span are extended, extensions in so-called maximal life span would also be the expected norm. However, exceptions to this general rule may be possible, in principle and observation. It is commonly observed and cited that in addition to mean and median life span, maximal life span is extended in response to calorie restriction, setting it apart from interventions that may increase mean or median life span independent of maximum life span (14,15). It is theoretically possible that an intervention that extends both mean and median life span has no benefit on maximal life span or that maximal life span may be increased independent of mean or median benefits. What is less commonly reported is the case where higher early- to mid-life mortality is followed by a subsequent extension of maximal life span. Although such an instance has been reported (4) and others may exist, it should be noted that the constant effects on longevity would only be expected in response to a single intervention that was maintained for the duration of the longevity study (16). Therefore, the alteration of a single intervention, as occurred with a methionine restriction protocol that showed increased early-life mortality (using the initial dietary formulation) followed by subsequent extension (as the diet was reformulated at two interim study points), does not contradict the expected proportionality of life span (4). 
Whether other studies showing potential differences in early-life or late-life mortality effects actually contradict the predicted proportionality has not been formally tested. For example, the life span of calorie-restricted wild-derived mice was not extended at early- and mid-life but was extended at the 90th percentile (17); resveratrol was reported to increase the early- and mid-life span of high fat–fed mice, with no benefit on maximal life span (18).

Part of the question regarding maximal life-span effects may be related to the lack of proper statistical analyses of reported maximal longevity results. Clearly, truncating studies at any point prior to the maximal observed life span prevents assessment of any late-life specific effects that might occur. Yet, the limited sample size defining the maximum life span (90th percentile: 4 animals of a cohort of 40) results in low-powered comparisons between groups, and it is often unclear whether observed increases in maximum life span are statistically significantly greater rather than simply numerically increased. The application of a standardized method, such as those described in (19) and (20) for maximal life-span assessment between studies, would be useful to determine whether the early- and mid-life extension observed in longevity studies consistently occurs independent of late-life (maximal) extension. Another point of interest related to the proportionality of survival/mortality is whether the cause of mortality may differ at various points along the survival curve. This is often likely to be the case, at least for comparisons of some groups (eg, sex differences), and if so, then an intervention that affects a specific cause of death that is isolated to late life in rodents would be unobserved in a truncated study. Of course, our approach assumes that the observed mortality was due to natural causes in shorter lived mice; in practice, the investigator will need to check this assumption by carefully examining the causes of death.

The current results lend support to several practical applications. The first is the potential for statistical analyses of existing data sets in order to hypothesize about the expected survival impact that may occur in response to a particular diet, compound, or intervention. In particular, we envision the use of toxicology studies of at least 2-year duration where greater than 20% mortality has occurred, potentially advancing aging research without additional study-related costs. Second, although we have specifically evaluated truncated 2-year life-span experiments in order to parallel existing toxicology resources, our results in no way imply that 2-year studies or designs with a fixed stopping point are optimal. Rather than specifying a deterministic study length, an alternative implication from PH would be the potential for applying group sequential or similar clinical trial designs that permit interim analyses of a life-span experiment (21). For example, one could envision a life-span study employing planned interim analyses at 3- to 6-month intervals beginning at 2 years using such approaches as conditional power methodology (22) or continual monitoring using Bayesian decision-theoretic techniques to evaluate the need to continue the experiment (23). This methodology may be particularly beneficial, given that recent publications indicate that exploration of factors affecting life span in rodents remains a staple of experimental aging research (24–28). Although similar analyses have been applied to toxicology experiments in clinical drug testing, our analyses illustrate the potential to apply similar methodologies to a multifactorial endpoint of longevity. However, meticulous adherence to the planned study design and proper analytic techniques is certainly advised to prevent improper interpretations, such as could occur by simply ending a study when a statistically significant result is obtained.


This work was supported in part by National Institutes of Health grants: P30DK056336, R01DK076771, T32HL072757, T32DK062710, and T32HL079888. T.G. was supported by NSF grant IOB-0543429.


D.B.A. has received grants, honoraria, donations, book royalties, and consulting fees from numerous food, beverage, dietary supplement, pharmaceutical companies, litigators and other commercial, government and nonprofit entities with interests in obesity, and related topics.


We would like to thank the following individuals for their generosity in sharing rodent data: George Roth, Chief Executive Officer, GeroScience Inc.; Richard Miller, Professor of Pathology, University of Michigan at Ann Arbor; Donald K. Ingram, Professor of Nutritional Neuroscience and Aging Laboratory, Pennington Biomedical Research Center; Julie Mattison, Laboratory of Experimental Gerontology, National Institute on Aging, National Institutes of Health (NIH); Arlan Richardson, Senior Research Career Scientist (STVHCS) Director, University of Texas Health Science Center; and Kyle Grimes, Professor of English at the University of Alabama at Birmingham. We would also like to thank Junior Bazile, Nigel Rozario, and Huichien Kuo for assistance with data compilation and standardization. The opinions expressed are those of the authors and not necessarily those of the NIH or any other organization with which the authors are affiliated.


Appendix 1.

Let $\theta = \log(\mathrm{HR}_{full}) - \log(\mathrm{HR}_{2yr})$, with corresponding estimator $\hat{\theta}$ based on the full data set (i = 1, 2, … , N observations). The jackknife procedure then consists of the following steps (29).

For i = 1 to N

Remove the ith subject from the data set.

Estimate θ in the reduced sample, denote this estimate as $\hat{\theta}_{(i)}$.

The jackknife estimate of θ is then

$$\hat{\theta}_{J} = N\hat{\theta} - (N-1)\,\hat{\theta}_{(\cdot)}$$

where $\hat{\theta}_{(\cdot)} = \frac{1}{N}\sum_{i=1}^{N}\hat{\theta}_{(i)}$. The jackknife estimate of the standard error is

$$\widehat{SE}_{J} = \sqrt{\frac{N-1}{N}\sum_{i=1}^{N}\left(\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)}\right)^{2}}$$

Thus, a test statistic under the null hypothesis $\theta = 0$ can be constructed as

$$T = \frac{\hat{\theta}_{J}}{\widehat{SE}_{J}}$$

where T follows an approximate t distribution with N – 1 degrees of freedom.
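The leave-one-out procedure can be sketched generically in Python. This is an illustration of the standard jackknife only: in the analysis described here the estimator would refit the full and truncated Cox models on each reduced sample, whereas the example below substitutes a simple mean so that it runs with the standard library alone. The function name and data are hypothetical.

```python
import math
from statistics import fmean


def jackknife_test(data, estimator):
    """Jackknife estimate, standard error, and t statistic for H0: theta = 0.

    `estimator` maps a list of observations to a number. Raises
    ZeroDivisionError if all leave-one-out estimates are identical.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    theta_hat = estimator(data)
    # Leave-one-out estimates: drop the i-th observation and re-estimate
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    theta_dot = fmean(loo)
    theta_jack = n * theta_hat - (n - 1) * theta_dot
    se_jack = math.sqrt((n - 1) / n * sum((t - theta_dot) ** 2 for t in loo))
    t_stat = theta_jack / se_jack
    return theta_jack, se_jack, t_stat, n - 1  # last value: degrees of freedom


# Stand-in estimator (a mean) on hypothetical data
theta, se, t_stat, df = jackknife_test([1.0, 2.0, 3.0, 4.0, 5.0], fmean)
print(theta, se, t_stat, df)
```

For the mean, the jackknife standard error coincides with the usual s/√N, which gives a quick check that the formulas are implemented consistently.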

Appendix 2.


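| Study | Length | N | Mean Longevity (d) | Maximum Longevity (d) | Total Rodent Days | # Technicians Required | Technician Cost | Rodent Overhead Cost | Rodent Maintenance Cost | Total Cost | Cost Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 y | 1,800 | 669.7 | 730 | 1,205,430 | 2 | $160,000 | $36,000 | $2,579,620 | $2,775,620 | 3.9 |
| 2 | 2 y | 1,130 | 623.8 | 730 | 704,870 | 2 | $160,000 | $22,600 | $1,508,422 | $1,691,022 | 4.0 |
| 3 | 2 y | 185 | 628.5 | 730 | 116,275 | 1 | $80,000 | $3,700 | $147,669 | $231,369 | 1.8 |
| 4 | 2 y | 2,990 | 673.7 | 730 | 2,014,350 | 3 | $240,000 | $59,800 | $2,558,225 | $2,858,025 | 3.7 |
| 5 | 2 y | 900 | 643.8 | 730 | 579,435 | 1 | $80,000 | $18,000 | $735,882 | $833,882 | 2.9 |
| 6 | 2 y | 9,215 | 679.8 | 730 | 6,264,250 | 10 | $800,000 | $184,300 | $7,955,598 | $8,939,898 | 4.0 |
| 7 | 2 y | 2,735 | 699.4 | 730 | 1,912,990 | 3 | $240,000 | $54,700 | $2,429,497 | $2,724,197 | 3.6 |
| 8 | 2 y | 1,095 | 681.1 | 730 | 745,815 | 2 | $160,000 | $21,900 | $947,185 | $1,129,085 | 2.6 |
| 9 | 2 y | 2,495 | 681.5 | 730 | 1,700,290 | 3 | $240,000 | $49,900 | $2,159,368 | $2,449,268 | 3.9 |
| 10 | 2 y | 665 | 710.1 | 730 | 472,225 | 1 | $80,000 | $13,300 | $599,726 | $693,026 | 1.8 |
| 11 | 2 y | 495 | 728.7 | 730 | 360,700 | 1 | $80,000 | $9,900 | $458,089 | $547,989 | 2.1 |
| 12 | 2 y | 1,185 | 718.2 | 730 | 851,125 | 2 | $160,000 | $23,700 | $1,080,929 | $1,264,629 | 2.4 |
| 13 | 2 y | 755 | 714.7 | 730 | 539,597 | 1 | $80,000 | $15,100 | $685,288 | $780,388 | 2.3 |
| 14 | 2 y | 685 | 727.2 | 730 | 498,112 | 1 | $80,000 | $13,700 | $632,602 | $726,302 | 2.1 |
| 15 | 2 y | 805 | 692.3 | 730 | 557,330 | 1 | $80,000 | $16,100 | $707,809 | $803,909 | 2.8 |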


1. Swindell WR, Harper JM, Miller RA. How long will my mouse live? Machine learning approaches for prediction of mouse life span. J Gerontol A Biol Sci Med Sci. 2008;63:895–906.
2. Conover CA, Bale LK, Mader JR, Mason MA, Keenan KP, Marler RJ. Longevity and age-related pathology of mice deficient in pregnancy-associated plasma protein-A. J Gerontol A Biol Sci Med Sci. 2010;65(6):590–599.
3. Mason JB, Cargill SL, Anderson GB, Carey JR. Transplantation of young ovaries to old mice increased life span in transplant recipients. J Gerontol A Biol Sci Med Sci. 2009;64(12):1207–1211.
4. Miller RA, Buehner G, Chang Y, Harper JM, Sigler R, Smith-Wheelock M. Methionine-deficient diet extends mouse lifespan, slows immune and lens aging, alters glucose, T4, IGF-I and insulin levels, and increases hepatocyte MIF levels and stress resistance. Aging Cell. 2005;4:119–125.
5. Klein J, Moeschberger M. Survival Analysis: Techniques for Censored and Truncated Data. 2nd ed. New York: Springer-Verlag; 2003.
6. Bellera CA, MacGrogan G, Debled M, de Lara CT, Brouste V, Mathoulin-Pelissier S. Variables with time-varying effects and the Cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol. 2010;10:20.
7. Gadbury GL, Xiang Q, Yang L, Barnes S, Page GP, Allison DB. Evaluating statistical methods using plasmode data sets in the age of massive public databases: an illustration using false discovery rates. PLoS Genet. 2008;4:e1000098.
8. Bailar JC, III, Mosteller F. Guidelines for statistical reporting in articles for medical journals. Amplifications and explanations. Ann Intern Med. 1988;108:266–273.
9. McPherson P. Cost Analysis and Rate Setting Manual for Animal Research Facilities. Bethesda, MD: NCRR Office of Science Policy and Public Liaison [serial online] 2010;
10. Holm S. A simple sequentially rejective multiple test procedure. Scand J Stat. 1979;6(2):65–70.
11. Vaanholt LM, Daan S, Garland T, Visser GH. Exercising for life? Energy metabolism, body composition, and longevity in mice exercising at different intensities. Physiol Biochem Zool. 2010;83:239–251.
12. Vergara M, Smith-Wheelock M, Harper JM, Sigler R, Miller RA. Hormone-treated snell dwarf mice regain fertility but remain long lived and disease resistant. J Gerontol A Biol Sci Med Sci. 2004;59:1244–1250.
13. Cox D. Regression models and life-tables. J R Stat Soc Ser B. 1972;34:187–220.
14. Holloszy JO, Schechtman KB. Interaction between exercise and food restriction: effects on longevity of male rats. J Appl Physiol. 1991;70:1529–1535.
15. Weindruch R, Walford RL. The Retardation of Aging and Disease by Dietary Restriction. Springfield, IL: C.C. Thomas Publisher; 1988.
16. Rauser CL, Laurence DM, Travisano M, Rose MR. Evolution of Aging and Late Life. Experimental Evolution: Concepts, Methods, and Applications of Selection Experiments. Berkeley, CA: University of California Press; 2009. pp. 551–584.
17. Harper JM, Leathers CW, Austad SN. Does caloric restriction extend life in wild mice? Aging Cell. 2006;5:441–449.
18. Baur JA, Pearson KJ, Price NL, et al. Resveratrol improves health and survival of mice on a high-calorie diet. Nature. 2006;444:337–342.
19. Wang CX, Li Q, Redden DT, Weindruch R, Allison DB. Statistical methods for testing effects on “maximum lifespan.” Mech Ageing Dev. 2004;125:629–632.
20. Gao GM, Wan W, Zhang SJ, Redden DT, Allison DB. Testing for differences in distribution tails to test for differences in ‘maximum’ lifespan. BMC Med Res Methodol. 2008;8:49.
21. Piantadosi S. Clinical Trials: A Methodologic Perspective. New York: Wiley-Interscience; 2005.
22. Andersen PK. Conditional power calculations as an aid in the decision whether to continue a clinical trial. Control Clin Trials. 1987;8:67–74.
23. Lewis RJ, Lipsky AM, Berry DA. Bayesian decision-theoretic group sequential clinical trial design based on a quadratic loss function: a frequentist evaluation. Clin Trials. 2007;4:5–14.
24. Ricklefs RE, Scheuerlein A. Biological implications of the Weibull and Gompertz models of aging. J Gerontol A Biol Sci Med Sci. 2002;57:B69–B76.
25. Singh SP, Niemczyk M, Saini D, Sadovov V, Zimniak L, Zimniak P. Disruption of the mGsta4 gene increases life span of C57BL mice. J Gerontol A Biol Sci Med Sci. 2010;65(1):14–23.
26. Smith DL, Jr., Elam CF, Jr., Mattison JA, et al. Metformin supplementation and life span in Fischer-344 rats. J Gerontol A Biol Sci Med Sci. 2010;65(5):468–474.
27. Sun L, Sadighi Akha AA, Miller RA, Harper JM. Life-span extension in mice by preweaning food restriction and by methionine restriction in middle age. J Gerontol A Biol Sci Med Sci. 2009;64(7):711–722.
28. Zhang Y, Ikeno Y, Qi W, et al. Mice deficient in both Mn superoxide dismutase and glutathione peroxidase-1 have increased oxidative damage and a greater incidence of pathology but no reduction in longevity. J Gerontol A Biol Sci Med Sci. 2009;64(12):1212–1220.
29. Efron B. The Jackknife, the Bootstrap and Other Resampling Plans. Philadelphia, PA: Society for Industrial and Applied Mathematics; 1982. CBMS-NSF Regional Conference Series in Applied Mathematics.

Articles from The Journals of Gerontology Series A: Biological Sciences and Medical Sciences are provided here courtesy of Oxford University Press