To develop an efficient clinical prediction model that includes postnatal weight gain to identify infants at risk of developing severe retinopathy of prematurity (ROP). Under current birth weight (BW) and gestational age (GA) screening criteria, <5% of infants examined in countries with advanced neonatal care require treatment.
This study was a secondary analysis of prospective data from the Premature Infants in Need of Transfusion Study, which enrolled 451 infants with a BW < 1000 g at 10 centers. There were 367 infants who remained after excluding deaths (82) and missing weights (2). Multivariate logistic regression was used to predict severe ROP (stage 3 or treatment).
Median BW was 800 g (445–995). There were 67 (18.3%) infants who had severe ROP. The model included GA, BW, and daily weight gain rate. Run weekly, an alarm that indicated need for eye examinations occurred when the predicted probability of severe ROP was >0.085. This identified 66 of 67 severe ROP infants (sensitivity of 99% [95% confidence interval: 94%–100%]), and all 33 infants requiring treatment. Median alarm-to-outcome time was 10.8 weeks (range: 1.9–17.6). There were 110 (30%) infants who had no alarm. Nomograms were developed to determine risk of severe ROP by BW, GA, and postnatal weight gain.
In a high-risk cohort, a BW-GA-weight-gain model could have reduced the need for examinations by 30%, while still identifying all infants requiring laser surgery. Additional studies are required to determine whether including larger-BW, lower-risk infants would reduce examinations further and to validate the prediction model and nomograms before clinical use.
Most infants screened for retinopathy of prematurity (ROP) under current criteria do not require treatment. Slow postnatal growth is a surrogate for low serum insulin-like growth factor 1 and a marker for severe ROP. A complex ROP surveillance model using weights has been developed.
A simple predictive model that uses postnatal weight gain to accurately predict risk of severe ROP reduced the number of infants requiring exams among a high-risk population. Nomograms potentially make it easy to apply the model clinically.
Retinopathy of prematurity (ROP) is a leading cause of preventable blindness in children.1–3 Timely laser surgery can prevent retinal detachment, but detection involves subjecting infants to stressful, resource-intensive serial diagnostic eye examinations.4–6 In countries with highly developed NICU systems, screening guidelines determine which children need examinations, by using birth weight (BW) and gestational age (GA) cut-point criteria, below which infants are designated to receive examinations. To avoid missing a case of severe ROP, criteria are set high enough to capture all cases, at the cost of repeatedly examining many children who develop no or mild retinopathy. Less than 5% of infants examined in the United States, United Kingdom, and Canada require laser surgery.6–12
Groundbreaking work by Smith, Hellstrom, and co-workers13–19 has elucidated the role of insulin-like growth factor 1 (IGF-1) in the pathogenesis of ROP, which led to development of an algorithm to predict ROP using postnatal growth, a surrogate for IGF-1. Serum IGF-1 falls with premature birth, loss of maternal sources, and poor endogenous production. IGF-1 plays a permissive role in vascular endothelial growth factor (VEGF)-induced retinal vascular growth, and low serum IGF-1 hinders retinal vessel development, with localized hypoxia and VEGF accumulation.20 When IGF-1 production eventually rises into the equivalent of the third trimester of pregnancy for the infant, VEGF activation occurs and ROP develops. Extensive laboratory work supports this model.20–27 Clinically, a prolonged IGF-1 deficit is associated with a higher risk of severe ROP.13,28–33 Because the hypoxic phase precedes the clinically apparent phase, there is an opportunity to predict ROP risk weeks in advance. Lofqvist et al13 developed a computer-based surveillance algorithm (WINROP) to predict risk of severe ROP on the basis of postnatal weight gain, BW, and GA. Published reports using this complex algorithm have been limited to 2 low-risk, retrospective cohorts, but they suggest considerable potential to reduce examinations.13,19
We sought to develop a risk model that could lead to a simple, transparent and clinically easy-to-use prediction tool using postnatal weight gain. Our goal was to identify infants at risk of developing severe ROP among a diverse and high-risk population of infants enrolled in a prospective multicenter clinical trial.
We performed a secondary analysis of prospectively collected data from the Premature Infants in Need of Transfusion (PINT) Study, a multicenter randomized controlled trial designed to compare restrictive to liberal hemoglobin thresholds for red blood cell transfusions in extremely low birth weight infants (BW < 1000 g).34,35 The primary outcome was a composite of death before discharge or impaired survival with severe ROP, bronchopulmonary dysplasia, or brain injury on cranial ultrasound. There were 451 enrolled infants randomly assigned across 10 NICUs in Canada, the United States, and Australia. Primary eligibility criteria included BW < 1000 g and younger than 48 hours old at time of enrollment. Exclusion criteria included nonviability, cyanotic heart disease, congenital anemia, acute shock, family history of hemolytic disease, or intent to use erythropoietin. In this current study, an a priori plan was to exclude infants if death occurred before known ROP outcome.
ROP diagnosis and treatment decisions in the PINT study were based on locally interpreted retinal assessments by an ophthalmologist with expertise in ROP, practicing in a tertiary care referral center and unaware of the treatment group. Staging followed the international classification of ROP.36,37 Enrollment occurred from January 2000 to February 2003. Therefore, ROP treatment decisions were based on “Threshold ROP” criteria, because the revised Early Treatment for ROP criteria were not published until 2003.6 Data collected included ROP stage (1–5) and zone (I, II, or III) for each eye at time of highest stage or treatment. Severe ROP was defined as stage 3, 4, or 5, or treatment with laser or cryotherapy ablation. Data on plus disease were not available.
Demographic and medical data collected included maternal ethnicity, multiple perinatal and postnatal comorbidities, and medical and surgical interventions. Nutritional data were not available.
Weights were measured at least every other day by clinical staff using the standard equipment in use at each site, reflecting the common range of clinical practice. Infants were weighed whether or not they were on a ventilator, although, rarely, an infant was deemed too sick to move and no measurement was made. The weekly data points were the weights on days of life 8, 15, 22, and so on. If no weight was measured on one of those days, the next closest day's weight was used.
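The weekly sampling rule described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `weekly_weights`, the `max_gap_days` limit, and the tie-break toward the earlier day are assumptions, because the text states only that the next closest day's weight was used.

```python
from typing import Dict, List, Optional


def weekly_weights(
    weights_by_day: Dict[int, float],
    n_weeks: int,
    max_gap_days: int = 3,  # assumption: the source gives no limit on how far to look
) -> List[Optional[float]]:
    """Pick one weight (g) per week at days of life 8, 15, 22, ...

    If the target day has no measurement, the closest measured day is used.
    Ties go to the earlier day (an assumption). None marks a week with no
    measurement within max_gap_days of the target day.
    """
    selected: List[Optional[float]] = []
    for week in range(1, n_weeks + 1):
        target = 1 + 7 * week  # day of life 8, 15, 22, ...
        candidates = [d for d in weights_by_day if abs(d - target) <= max_gap_days]
        if not candidates:
            selected.append(None)
            continue
        nearest = min(candidates, key=lambda d: (abs(d - target), d))
        selected.append(weights_by_day[nearest])
    return selected


# Day 15 was not measured; days 14 and 16 are equally close, so day 14 is used.
example = weekly_weights({8: 700.0, 14: 730.0, 16: 735.0, 23: 790.0}, n_weeks=3)
```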
Recommended principles of prognostic model development were followed.38 Analyses were performed by using SAS 9.1 (SAS Inc, Cary, NC). In the PINT study, no significant difference in severe ROP between treatment groups was found, so data from the groups were combined for analysis. When using randomized controlled trial data to develop a predictive model, if the treatment and control groups show no statistically significant difference, the data may be combined to study prognosis.39 Nevertheless, treatment group and cumulative transfusion volume were considered in the model. The predicted outcome was the development of severe ROP in either eye (defined above). Candidate predictor variables included BW, GA, weekly postnatal weight gain, maternal race, gender, use of medications (systemic corticosteroids, erythropoietin, methylxanthines, doxapram, indomethacin), PINT study treatment group, cumulative transfusion volume, patent ductus arteriosus, patent ductus arteriosus surgery, necrotizing enterocolitis (NEC), NEC surgery, other surgical procedure (hernia repair, etc), intraventricular hemorrhage, ventriculomegaly, abnormal head ultrasound, perinatal infection, postnatal sepsis, and cerebrospinal fluid infection. GA in weeks was considered both as a continuous variable and a categorical variable. Very low birth weight infants have an initial fall in weight and reach a nadir around 1 week of age,40 so weight gain was calculated by using weekly values beginning the second week of life. Weight measurements after severe ROP developed were excluded. Multiple variations of weight gain were evaluated, including cumulative weight gain in grams from birth; cumulative weight gain as a proportion of birth weight; weekly weight gain as a proportion of previous week's weight; weight gain rate per day in grams, calculated as current week's weight minus previous week's weight, divided by 7; and weight gain rate per day since birth.
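As an illustration of the weight gain variations listed above, the following sketch computes four of them from a birth weight and two consecutive weekly weights. The function and key names are hypothetical. The fifth variation (rate per day since birth) is omitted because the text does not specify its denominator.

```python
from typing import Dict


def weight_gain_measures(
    birth_weight_g: float, previous_week_g: float, current_week_g: float
) -> Dict[str, float]:
    """Return four of the weight gain variations described in the text."""
    return {
        # cumulative weight gain in grams from birth
        "cumulative_gain_g": current_week_g - birth_weight_g,
        # cumulative weight gain as a proportion of birth weight
        "cumulative_gain_prop_bw": (current_week_g - birth_weight_g) / birth_weight_g,
        # weekly weight gain as a proportion of the previous week's weight
        "weekly_gain_prop": (current_week_g - previous_week_g) / previous_week_g,
        # daily rate: current week's weight minus previous week's weight, divided by 7
        "daily_rate_g": (current_week_g - previous_week_g) / 7.0,
    }


# An infant born at 800 g who weighed 840 g last week and 910 g this week
# gained (910 - 840) / 7 = 10 g per day over the week.
measures = weight_gain_measures(800.0, 840.0, 910.0)
```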
The association of each candidate predictor with severe ROP was examined through univariate logistic regression and Fisher's exact test. Predictor variables associated with P < .10 were included in a multivariate logistic regression. The model was further reduced through backward selection.41 The final model included predictors with P ≤ .05 and well established predictors (BW, GA).38 To determine a need for eye examinations, an alarm cut-point value of predicted probability of severe ROP was set to minimize missed cases of severe ROP. The probability or risk of severe ROP was recalculated with each weekly weight. If the predicted risk was greater than the cut-point, an alarm signaled that ROP examinations would be needed. Examination timing would still be based on the postmenstrual age (PMA) that ROP would be expected to develop (eg, 31 weeks' PMA or 4 weeks of age, whichever is later). Model performance was assessed by sensitivity for detecting severe ROP, percentage reduction in the number of infants requiring eye examinations, and time interval between first alarm and severe ROP diagnosis.
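The weekly alarm procedure can be sketched as below, assuming a logistic model in GA, BW, and daily weight gain rate. The coefficient dictionary is a placeholder supplied by the caller; the values in the usage example are invented for illustration and are not the fitted coefficients. The default cut-point of 0.085 is the value reported in the abstract, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Optional


def predicted_risk(
    ga_weeks: float, bw_g: float, gain_g_per_day: float, coef: Dict[str, float]
) -> float:
    """Logistic-model probability of severe ROP for one weekly evaluation."""
    logit = (
        coef["intercept"]
        + coef["ga"] * ga_weeks
        + coef["bw"] * bw_g
        + coef["gain"] * gain_g_per_day
    )
    return 1.0 / (1.0 + math.exp(-logit))


def first_alarm_day(
    ga_weeks: float,
    bw_g: float,
    weekly_weights_g: List[float],  # weights at days of life 8, 15, 22, ...
    coef: Dict[str, float],
    cut_point: float = 0.085,
) -> Optional[int]:
    """Return the day of life of the first alarm, or None if no alarm occurs."""
    for i in range(1, len(weekly_weights_g)):
        # daily rate from the current and previous weeks' weights
        rate = (weekly_weights_g[i] - weekly_weights_g[i - 1]) / 7.0
        if predicted_risk(ga_weeks, bw_g, rate, coef) > cut_point:
            return 8 + 7 * i
    return None


# Invented coefficients, for illustration only (not the study's estimates).
toy_coef = {"intercept": 0.0, "ga": 0.0, "bw": 0.0, "gain": -0.1}
alarm_day = first_alarm_day(26, 800, [700.0, 910.0, 980.0], toy_coef)
```

In this toy example the gain is 30 g per day in the first interval (no alarm) and 10 g per day in the second, so the alarm fires at the day-22 evaluation. Once an alarm occurs, no further evaluations are needed, which is why the function returns at the first crossing.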
To evaluate optimistic bias in reported model performance, an internal validation of the model was performed using the bootstrap methods of Harrell et al.42 This validation is based on 1000 bootstrap replicates, each consisting of 367 subjects sampled with replacement. Using each bootstrap replicate, a prediction model was developed, and its performance was evaluated on both the replicate and original data by calculating sensitivity and specificity for predicting severe ROP using the same cut-point of predicted probability as used in the original data set. The “optimism” in sensitivity or specificity was the difference between that from the bootstrap replicate and the original data set by using the replicate prediction model. The average “optimisms” and 95% confidence intervals (CIs) were calculated on the basis of 1000 replicates.
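The optimism calculation described above can be sketched generically. This is an outline under stated assumptions rather than the analysis code: the model-fitting step is passed in as a caller-supplied function `fit` (hypothetical), records are `(predictors, outcome)` pairs, and replicates lacking either outcome class are skipped because sensitivity or specificity would be undefined.

```python
import random
from typing import Callable, List, Sequence, Tuple

Record = Tuple[Sequence[float], int]  # (predictors, outcome); 1 = severe ROP
Model = Callable[[Sequence[float]], float]  # returns a predicted probability


def sens_spec(model: Model, data: List[Record], cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'alarm if probability > cut'."""
    tp = fn = tn = fp = 0
    for x, y in data:
        alarm = model(x) > cut
        if y == 1:
            tp += int(alarm)
            fn += int(not alarm)
        else:
            fp += int(alarm)
            tn += int(not alarm)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_optimism(
    data: List[Record],
    fit: Callable[[List[Record]], Model],
    cut: float,
    n_reps: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Average optimism in (sensitivity, specificity) over bootstrap replicates."""
    rng = random.Random(seed)
    d_sens, d_spec = [], []
    for _ in range(n_reps):
        rep = rng.choices(data, k=len(data))  # sample with replacement
        if len({y for _, y in rep}) < 2:
            continue  # replicate lacks one outcome class
        model = fit(rep)  # model developed on the replicate
        s_rep, c_rep = sens_spec(model, rep, cut)
        s_orig, c_orig = sens_spec(model, data, cut)
        d_sens.append(s_rep - s_orig)
        d_spec.append(c_rep - c_orig)
    return sum(d_sens) / len(d_sens), sum(d_spec) / len(d_spec)
```

Subtracting the average optimism from the apparent sensitivity and specificity gives the bias-corrected estimates.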
To create a pilot clinical tool to determine ROP risk, the final logistic model was converted into graphical form as nomograms with Mathematica 7 (Wolfram Research, Champaign, IL) by using methods described previously.43
Of 451 infants enrolled in the PINT study, 82 infants were excluded because death occurred before known ROP outcome. Of these, 70 died before any ROP examinations, and 12 died before 42 weeks' PMA without having developed stage 3 or treatment-requiring ROP or having reached retinal vascular maturity. Two infants were excluded because of multiple missing weight measurements. After these exclusions, 367 infants remained for analysis (Table 1). Median GA was 26 weeks (range: 22–34); median BW was 800 g (range: 445–995). There were 67 (18.3%) infants who had severe ROP, of which 33 (9%) required treatment. The infants with and without severe ROP did not differ with regard to gender or race, but infants with severe ROP had lower BW (P < .0001), lower GA (P < .0001) (Table 1), and gained weight at a slower rate than did infants without severe ROP, until postnatal age weeks 9 to 10 (Fig 1).
The final predictive model included BW, GA, and daily weight gain rate calculated from the current and previous weeks' weights (Table 2). The relative risk of severe ROP for each 10 g per day lowering of weight gain rate was 1.15 (95% CI: 1.06–1.24). Although numerous other candidate predictors were associated with severe ROP in univariate analyses, none were significant in the multivariate model (data not shown). The equation for calculating risk of severe ROP appears with Table 2.
The model was run at weekly intervals to detect an alarm that indicates a need for eye examinations. Daily weight gain rate was recalculated each time from the current and previous weeks' weight, beginning with the second week of life. An alarm was triggered when the risk of severe ROP was >0.085. Application of the model in this fashion correctly identified in advance 66 of 67 infants with severe ROP and all infants who required treatment (Table 3). There were 110 infants (30%) who did not trigger an alarm and would not have been flagged to receive eye examinations if the model had been applied in practice. Median time between alarm and severe ROP diagnosis was 10.8 weeks (range: 1.9–17.6). Sensitivity for prediction of severe ROP was 99% (95% CI: 94%–100%), specificity was 36% (32%–41%), positive predictive value was 26% (22%–30%), and negative predictive value was 99% (96%–100%). The 1 infant with severe ROP without an alarm had a GA of 30 weeks, BW of 980 g, largest predicted probability of 0.023, stage 3 ROP in 1 eye, stage 2 in the fellow eye, and did not require treatment. From 1000 bootstrap replicates, the average “optimism” for sensitivity was 1.45% (95% CI: −2.19% to 6.68%) and for specificity 0.22% (95% CI: −3.74% to 4.37%), providing, on the basis of this internal validation technique, estimates of sensitivity of 97.0% (95% CI: 91.8%–100%) and specificity of 36.1% (31.9%–40.0%).
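The reported performance figures can be checked by rebuilding the 2 × 2 table from counts given in the text: 367 infants, 67 with severe ROP, 110 without an alarm, and 1 severe ROP infant among those without an alarm. The cell counts below are derived from those numbers and are not copied from Table 3; the variable names are illustrative.

```python
# Counts stated in the text
n_total = 367
n_severe = 67
n_no_alarm = 110
n_missed = 1  # severe ROP without an alarm

# Derived 2 x 2 cells
tp = n_severe - n_missed               # 66 severe ROP infants with an alarm
fn = n_missed                          # 1
tn = n_no_alarm - n_missed             # 109 non-severe infants without an alarm
fp = (n_total - n_severe) - tn         # 191 non-severe infants with an alarm

sensitivity = tp / (tp + fn)           # 66 / 67
specificity = tn / (tn + fp)           # 109 / 300
ppv = tp / (tp + fp)                   # 66 / 257
npv = tn / (tn + fn)                   # 109 / 110

for name, value in [
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("PPV", ppv),
    ("NPV", npv),
]:
    print(f"{name}: {100 * value:.0f}%")
# prints 99%, 36%, 26%, and 99%, matching the reported values
```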
The regression equation was represented graphically through nomograms (Figs 2 and 3). The sample nomogram in Fig 2 is for children with a GA of 27 weeks. A straight line is drawn between the values for BW and daily weight gain rate. The intersection of this line with the probability line provides the predicted risk of severe ROP. By assuming a specific risk cutoff level (0.085), the need for eye examinations can be determined with each new week's weight. Once a need for examinations is signaled, future assessments are not required. In this fashion, such a nomogram may be used as a simple clinical tool for identifying infants who require eye examinations, although additional development and validation are essential before general use. The nomogram in Fig 3 adds some complexity but permits the determination of risk for various gestational ages. The values for BW and weight gain rate are again connected with a straight line. The intersection point of this line with the gray auxiliary axis is then connected to the value for GA, and the predicted risk of severe ROP can be read as with Fig 2. As logistic regression calculates a nonlinear function of the covariates, the output scales are not linear, and the tick marks representing probability deciles are not equidistant.
To identify potential causes of weight gain that cause false-negative signals and might exclude an infant from application of the model, the candidate predictors detailed in “Patients and Methods” (eg, sepsis, NEC, surgery) were assessed for the presence of a relationship with increased postnatal weight gain but not with severe ROP. No such factors were identified.
We developed a predictive model (PINT-ROP model) and preliminary nomograms that use BW, GA, and postnatal weight gain measurements to predict the risk of severe ROP and determine the need for eye examinations. In a high-risk cohort, this preliminary model reduced the number of infants with BW < 1000 g who required eye examinations by 30%, missed 1 infant with severe ROP, and identified all infants who required laser surgery.
Although current guidelines treat BW and GA as dichotomous variables, the PINT-ROP model accounts for the known inverse relationship between risk of severe ROP and BW or GA. Infants with very low BW or GA continue to receive examinations on the basis of their degree of prematurity at birth. In contrast, larger BW and older GA infants at increased risk for severe ROP are identified by another factor, slow postnatal weight gain. Numerous additional factors were associated with severe ROP in univariate analysis but were no longer significant in multivariate analysis. We hypothesize that many previously described risk factors for ROP act via a common pathway, lowering IGF-1, and therefore are “captured” through weight measurement as a surrogate for IGF-1 levels. For example, sepsis has been associated with both low IGF-1 and the subsequent development of severe ROP. If low serum IGF-1 lies within the causal pathway tying sepsis with ROP, then measuring its surrogate, poor postnatal weight gain, might be expected to supplant sepsis as a predictor in a risk model for ROP. This hypothesis requires additional study. As recommended by principles of prognostic model development, BW was retained in our model as an established strong risk factor for ROP.38 Its marginal statistical significance (P = .1) may result from the relatively small range of birth weights (all < 1000 g) represented in the study.
The PINT cohort was at high risk for severe ROP (median BW: 800 g; maximum BW: 995 g). Yet application of the model would have resulted in a 30% reduction in the number of infants requiring eye examinations. Because the greatest reduction in examinations occurs in larger BW infants, we anticipate even greater reductions when the model is applied to a broader cohort, inclusive of infants who meet current screening guidelines (BW < 1501 g). For example, in a much lower-risk Swedish cohort (median BW: 1290 g), Lofqvist et al13 reported a reduction of 76% for the WINROP model.
Among the ways we explored to treat weight gain, we found that rate of daily weight gain calculated from the current and previous weeks' measurements produced the most robust model. This logistic-regression approach is methodologically distinct from the statistical methods used in WINROP,13 which is based on cumulative deviations. In WINROP, a reference model of expected weights is created with linear regression and data from infants who develop no or mild ROP. Weights from new infants are compared with the reference model on a weekly basis, and the differences are summed over time. When the cumulative differences surpass a threshold level, an alarm is sounded, and, in a final stratification, BW and GA cutoffs are applied to determine a need for examinations. These methods involve calculations that require the use of a computer-based algorithm. One advantage of a logistic-regression based model is that it permits direct calculation of risk and may be represented as a nomogram (Figs 2 and 3).
Nomograms provide a graphical representation of mathematical relationships or formulas, such as a multivariate logistic regression-based clinical predictive model, and can be used to calculate risk of disease without use of a calculator or computer.43 Nomograms have been developed to predict treatment response in breast cancer,44 prostate cancer lymph-node metastasis,45 bladder cancer recurrence,46 and self-assessed melanoma risk.43 For the PINT-ROP model, a need for eye examinations would be determined by setting a risk cutoff level, above which an infant requires examinations, and the user could directly see how the risk of severe ROP is altered by changes in BW, GA, or weight gain rate. However, we stress that the nomograms in Figs 2 and 3 should be considered preliminary and are not intended for clinical use at this time. In particular, the restriction to infants with BW < 1000 g limits the ability to reliably model risk for higher BW and GA infants. It is possible that with additional development in a broader BW and GA cohort, and with subsequent validation, such nomograms may eventually be used as a simple, paper-based clinical tool in lieu of a computer-based algorithm. Future studies should also assess whether an even simpler clinical tool is equally predictive, such as a single cutoff for weight gain per day in target BW or GA populations.
Strengths of this study include use of prospectively collected clinical data and a diverse, multicenter cohort of infants. In addition, the bootstrap internal validation results suggest that there is minimal optimistic bias in the estimates of sensitivity and specificity for detecting severe ROP. However, important limitations need to be addressed before clinical implementation of this or any weight-based ROP risk model, including sample size, outcome criteria, false-negative signals, and generalizability. The development of a predictive model preferably employs a data set with hundreds of outcome events.47 Much larger development studies must be pursued before subsequent validation studies can be undertaken. With regard to outcome, stage 3 ROP inadequately defines severe ROP; zone I disease and plus disease should be included. Adding “treated ROP” to the outcome criteria helped somewhat to address this issue, but treatment decisions during the study period were made using “threshold ROP” criteria, not type 1 ROP according to current Early Treatment of ROP criteria.6 A third limitation is that clinical factors that cause weight gain but are not associated with increased IGF-1 may generate false-negative signals and have yet to be clearly identified. With regard to generalizability, no predictive model should be used clinically until validated in new patients. Although our multicenter population was diverse, the cohort was at high risk for severe ROP. The low number of infants with a GA > 28 weeks, and low number of severe ROP outcomes among them, limited our ability to model risk for those infants. The model will likely require recalibration when applied to a broader case-mix sample. Finally, it is important to recognize that in countries with developing neonatal care systems, severe ROP occurs in infants with much higher BW and GA.2 Indeed, accurate assessment of GA may not even be possible in some regions. Application of ROP risk models will require the completion of separate, additional development and validation studies in those populations.
Growth-based ROP prediction modeling is early in its development, but preliminary results are promising. Models such as PINT-ROP and WINROP have the potential to reduce the ROP examination burden, enable better health care resource allocation, and identify at an early stage those infants who may benefit from preventive interventions, such as IGF-1 supplementation and intensive nutritional management. However, before these goals can be realized, much larger studies must be undertaken to ensure that any proposed changes to screening practices continue to identify with very high sensitivity those infants who may require treatment to prevent lifelong blindness.
This work was supported by National Eye Institute grant K12 EY-01539 and Canadian Institutes of Health Research grant FR 41549.
All authors made substantive intellectual contributions to the conception and design, acquisition of data, or analysis and interpretation of data; participated in the drafting or critical revision of the article for important intellectual content; and approved the final version submitted for consideration for publication.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
Funded by the National Institutes of Health (NIH).