Because cystic fibrosis can be difficult to diagnose and treat early, newborn screening programs have rapidly developed nationwide, but methods vary widely. We therefore investigated the costs and consequences, or specific outcomes, of the 2 most commonly used methods.
With available data on screening and follow-up, we used a simulation approach with decision trees to compare immunoreactive trypsinogen (IRT) screening followed by a second IRT test against an IRT/DNA analysis. By using a Monte Carlo simulation program, we incorporated variation in the model parameters for counts at the various nodes of the decision trees, as well as for costs, and applied it to fictional cohorts of 100000 newborns. The outcome measures included the numbers of newborns given a diagnosis of cystic fibrosis, the costs of each screening strategy at each branch, and the cost per newborn.
Simulations revealed a substantial number of potential missed diagnoses for the IRT/IRT system versus IRT/DNA. Although the IRT/IRT strategy with commonly used cutoff values offers an average overall cost savings of $2.30 per newborn, a breakdown of costs by societal segments demonstrated higher out-of-pocket costs for families. Two potential system failures causing delayed diagnoses were identified relating to the screening protocols and the follow-up system.
The IRT/IRT screening algorithm reduces the costs to laboratories and insurance companies but has more system failures. IRT/DNA offers other advantages, including fewer delayed diagnoses and lower out-of-pocket costs to families.
Although it has been shown that cystic fibrosis newborn screening is beneficial, the strategies vary widely, and there has been uncertainty about the costs and consequences of different algorithms and whether screening methods/decisions should be based on assumed cost differences.
This study contributes by offering a comparison of both costs, assessed comprehensively, and the consequences associated with the 2 most popular screening methodologies, immunoreactive trypsinogen/immunoreactive trypsinogen and immunoreactive trypsinogen/DNA, by using a decision-tree framework allowing variation in the model parameters.
Cystic fibrosis (CF) is a relatively common, life-threatening, autosomal recessive disease that can now be diagnosed routinely through newborn screening (NBS) by using immunoreactive trypsinogen (IRT)1 as the primary analyte or first tier. After both the Centers for Disease Control and Prevention2 and CF Foundation3 endorsed universal adoption of CF screening, it was included in the list of 29 core disorders recommended by the Health Resources and Services Administration for all state NBS programs.4 Thus, by 2010 all states plus DC were screening for CF. With nationwide CF NBS underway, the research climate shifted from one focused on identifying the benefits of screening to one focused on identifying the best screening methodology.5–7 Although wide variations in screening methods and cutoff values currently exist throughout the United States and across the world,6,8 almost all algorithms begin by evaluating IRT1 levels by using a dried blood specimen collected between 1 and 5 days of age with the second tier being either a second IRT or DNA analysis for CF transmembrane conductance regulator (CFTR) mutations.9 Further testing and referrals are then made based on the particular screening method in use by the state. More specifically, the US screening protocols being used in 2010 were as follows: IRT/IRT in 15 states plus DC and IRT/DNA in 33 states; in addition, there is a recent variation of IRT/DNA, namely IRT/IRT/DNA (Colorado, Texas, and Utah). The IRT/IRT and IRT/IRT/DNA algorithms require collection of a second blood specimen to confirm hypertrypsinogenemia, whereas the IRT/DNA algorithms allow completion of the screening on a single/initial blood NBS specimen.
NBS programs have chosen their screening methods based on limited information regarding the impact their choice of method can have on costs and consequences.10–12 In the IRT/IRT algorithm, there have been wide variations in definitions of the level at which an IRT screen is considered “abnormal” and in the processes for follow-up evaluation. For those states using IRT/DNA analyses, the IRT cutoff values are much lower and more consistent, but choices must be made as to the number and types of CFTR mutations identified in the second tier of the screening test. Further research of these protocols and their many variations is urgently needed to aid in identification of the best screening strategy. An important issue relates to whether decisions to use IRT/IRT or IRT/DNA screening should be based predominately on assumed cost differences. This study contributes to the current research by offering a comparison of the costs and consequences, or outcomes, associated with the 2 most popular screening methodologies in the United States by using a decision-tree framework that allows for variation in the model parameters. A decision tree visually shows the process of the path a newborn may follow under a screening program. Average assumptions, in terms of rates, are shown for moving along a path, and costs at each point of the process are determined. These results can then be summarized to evaluate the effectiveness of a screening program.
Baseline costs for this study consisted of the (1) expenses of screening laboratories, (2) costs borne by families, and (3) costs usually covered by insurance (Table 1). The costs to the screening laboratory obtained from the Wisconsin State Laboratory of Hygiene included building costs, laboratory equipment, reagents, personnel, administration, shipping/postage, maintenance, and overhead. Sweat test costs were based on those reported previously.10 Genetic counseling costs were based on publicly reported Wisconsin salary data, in addition to CF center expert opinion of the time required to counsel parents of infants who are identified as carriers of CF. The data obtained from the Wisconsin State Laboratory of Hygiene were used to calculate screening test costs of IRT/DNA and IRT/IRT. Family costs from the Kentucky CF center survey of 26 families with abnormal initial IRT screens included the costs of the second dried blood specimen collection (out-of-pocket laboratory fee for blood collection, travel, and work missed) and travel and work missed for the sweat test. The costs for missed diagnoses were based on the UK study of Sims et al13,14 by using their conversion method ($1.69/£) to US dollars. For our analyses and decision trees, we updated all costs to 2010 US dollars.
Decision trees were constructed15–17 for the IRT/IRT and IRT/DNA methods to compute the total cost for a variety of outcomes, either dependent on or independent of the sensitivity/specificity of the particular screening test, and were based on data obtained from screening laboratories, the University of Wisconsin Hospitals, and published data.18–36 The decision trees contain nodes, or decision points, where the process is diverted along different paths. For our analyses, 100000 newborns started the process and at each node or branch these newborns were followed with the aid of the decision trees. We assumed CF incidence in the US population was 1 in 4000 newborns5,6,18 and expected that 25 newborns of the 100000 would have CF, with 5 of these predicted to have meconium ileus37 and to be given a diagnosis immediately without NBS; thus, they were removed from the decision trees before initiation of the screening processes. Although some recent surveys suggest the possibility that prenatal screening may reduce the incidence of CF,19–21 other observations have not confirmed this effect, and many reports provide a rationale for using 1:400022–24; in fact, regions with a higher incidence such as the Republic of Ireland and the UK (with ~1:1500–2500)24,25 can still use the decision trees provided herein by simply adjusting the second branches.
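The cohort arithmetic at the root of both decision trees can be written out as a short sketch. The function and parameter names below are illustrative only and are not taken from the simulation program used in the study; the meconium ileus count is passed in explicitly because the text gives it only for the 1 in 4000 case.

```python
# Cohort setup described in the text; names are hypothetical, values are from the text.
COHORT_SIZE = 100_000
INCIDENCE_DENOMINATOR = 4000     # CF incidence assumed to be 1 in 4000 newborns
MECONIUM_ILEUS_CASES = 5         # diagnosed clinically, removed before screening


def cf_cases_entering_screening(cohort_size, incidence_denominator, meconium_ileus_cases):
    """Return (expected CF cases, CF cases that enter the screening tree)."""
    expected_cf = cohort_size / incidence_denominator
    return expected_cf, expected_cf - meconium_ileus_cases


expected_cf, screened_cf = cf_cases_entering_screening(
    COHORT_SIZE, INCIDENCE_DENOMINATOR, MECONIUM_ILEUS_CASES
)
print(expected_cf, screened_cf)
```

A region with a different incidence would change only the denominator (and its own meconium ileus estimate), which is the sense in which the second branches can be adjusted.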
The decision tree used for the IRT/IRT method (Fig 1) was based on the currently most common screening process27 in which the threshold levels for identifying abnormal screens were set at 105 ng/mL and 70 ng/mL for the first and second IRT levels, respectively. Based on a large database analysis in a state with ongoing CF surveillance,37 a survey, and reviewing the literature, Kloosterboer et al (2009)6 reported an average sensitivity of 80% for an initial IRT cutoff level of 105 ng/mL. For the current study, to assess sensitivity further, we reviewed all the CF NBS articles available through PubMed, focusing on those with CF false negative screening test results,28–35 and determined from worldwide reports28,32 that a 78% to 85% range of sensitivity has been observed; then we took into account data presented at a recent conference (Evaluation of IRT as a Biomarker for Cystic Fibrosis: Methodological Issues and Practical Challenges for Newborn Screening; May 23–24, 2011) and concluded that 80% is a reasonable value to apply in the decision tree, despite a claim of higher sensitivity.26
In addition, reports from Colorado describe significant loss to follow-up at the second stage of the IRT/IRT method related to collection of the second dried blood specimen depending on the presence (2% loss) or absence (20% loss) of a state-mandated second dried blood specimen collection.26,27 Surveys of current IRT/IRT screening programs by 1 of the authors (Dr Farrell) have revealed that none of them claim 100% return of the second dried blood specimen and that the range observed is 2% to 16% failure to complete IRT follow-up procedures (because of a variety of causes such as the inability to contact the family due to relocation, name changes, lack of cooperation, or even death of the infant). Consequently, for the purposes of our simulation, we used an estimated loss to follow-up average of 10%. Although the percentage of infants with a second IRT above 70 ng/mL has not specifically been reported in the literature, based on expert opinion and mathematical deduction,6 we estimated that 12.25% of infants with an initially high IRT will have a second high IRT. Infants with a second high IRT are referred for a sweat test for diagnostic evaluation, but this step generally requires a referral, which provides another area for loss to follow-up and delayed diagnosis. From Wisconsin6 and Massachusetts23 data, we assumed that 5% of infants would be lost at this step.
For the IRT/DNA method, we based the process on the method used in Wisconsin, which has been applied in 32 states, by using both published5,6,10,11,37 and some unpublished data available from that program. In the decision tree we created, the newborns were selected for 23 CFTR mutation DNA analyses22 by a cutoff for the initial IRT based on the daily 96th percentile. Kloosterboer et al (2009)6 reported a sensitivity of 96% with the use of a daily top 4% IRT floating cutoff. Other programs22,33 have a similar sensitivity and also report30,32 a substantially higher sensitivity for IRT/DNA compared with IRT/IRT. In addition, according to data reported from Wisconsin from 2002 to 2003, 93.9% of infants receiving a mutation analysis have no mutations, whereas 4.6% show 1 CFTR mutation, and 0.3% 2 mutations.38 Predicted percentages of children with 1 or 2 mutations who experienced a true negative diagnosis, delayed diagnosis (due to a loss to follow-up), or early diagnosis of CF were also obtained from numbers published from Rock et al (2005)38 and again a 5% loss was assumed for sweat tests.
We conducted a 2-part computer-based simulation of 100000 newborns screened with the IRT/IRT and IRT/DNA protocols shown in Figs 1 and 2, respectively. The first part of the model simulated the flow of the newborns through the decision tree, whereas the second part attributed costs for each newborn at each branch of the tree. Part 1 of the decision tree model provided a complete analysis of the system design of each screening method and allowed for identification of specific system errors that create opportunities for delayed or missed diagnoses. Part 2 of the simulation applied a cost simulation framework to the decision tree simulation, resulting in projected societal costs of the IRT/IRT and IRT/DNA systems.
From the literature18–36 or from our data,6 we identified the percentages of newborns who move from 1 node to another for part 1, and the costs at each node for part 2. To recognize that these values may vary, we simulated a value for each percentage and each cost. For each run of the simulation, percentages and costs per newborn were generated so that, on average, they equaled the values that we gleaned from the literature or our own data. The outcomes from all runs were then combined and reported as results. Thus, the method has a built-in sensitivity analysis because it takes into account that the original estimates are variable.
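A minimal sketch of this kind of simulation is given here for the CF-affected newborns in the IRT/IRT tree. It is not the program used in the study: the triangular distribution, its half-width of 0.05, the number of runs, the seed, and all names are assumptions made for illustration, and the chain deliberately omits the second IRT cutoff among infants with CF, so it is not expected to reproduce the counts in Table 2. Only the three mean rates (80% first-IRT sensitivity, 10% loss at the second specimen, 5% loss at sweat test referral) come from the text.

```python
import random
import statistics

# Mean branch probabilities quoted in the text for the IRT/IRT tree.
MEAN_RATES = {
    "first_irt_sensitivity": 0.80,
    "second_specimen_returned": 0.90,  # 10% loss to follow-up
    "sweat_test_completed": 0.95,      # 5% loss at referral
}


def draw_rate(rng, mean, spread=0.05):
    """Draw a rate from a triangular distribution centered on `mean`.

    The distribution and spread are assumptions, not taken from the study.
    """
    low = max(0.0, mean - spread)
    high = min(1.0, mean + spread)
    return rng.triangular(low, high, mean)


def simulate_early_diagnoses(n_runs=5000, cf_entering=20, seed=0):
    """Return the mean and interquartile range of expected early diagnoses."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        expected = cf_entering
        for mean in MEAN_RATES.values():
            expected *= draw_rate(rng, mean)  # one simulated value per node per run
        results.append(expected)
    q1, _, q3 = statistics.quantiles(results, n=4)
    return statistics.mean(results), (q1, q3)


mean_diagnosed, iqr = simulate_early_diagnoses()
print(round(mean_diagnosed, 2), tuple(round(q, 2) for q in iqr))
```

Because each rate is drawn independently around its published mean, the run-to-run average stays close to the product of the means while the interquartile range conveys how much the outcome moves when the inputs vary, which is the built-in sensitivity analysis the text describes.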
The results from the IRT/IRT and IRT/DNA decision tree simulations showing the average counts are presented in Figs 1 and 2. Table 2 shows a summary of the numbers of newborns given a diagnosis by protocol, number of newborns not given a diagnosis by the screening protocol, and the 2 categories of errors. In addition, we provide the interquartile range (25th to 75th percentiles) to indicate the variation of the counts.
Analysis of the “potential missed diagnosis” category in both protocols was conducted to identify particular failures of the system design that might result in a delayed or missed diagnosis, and 2 significant issues were identified: (1) failure of the protocol definition and (2) errors in the follow-up system (particularly losses to follow-up, as shown in the decision trees). In the case of IRT/IRT screening, this is exemplified by the relatively low sensitivity resulting from high cutoff levels used for identifying abnormal IRT screens. In the case of IRT/DNA, this is exemplified both through the reduced sensitivity of the IRT screen and the decrease in sensitivity caused by not testing for all possible CFTR mutations. Protocol definition failures accounted for the largest percentage of potential misses with 6.6 (66% of undiagnosed) potential missed diagnoses in the IRT/IRT system and 2.0 (69% of undiagnosed) potential misses in the IRT/DNA system.
An overview of the results for the cost simulations is presented in Table 3. With a starting cohort of 100000, the total cost of the IRT/IRT program is projected as $4.57 per newborn (interquartile prediction interval of $4.43–4.71), equating to a total cost of ~$457000 for a cohort of 100000 newborns. The largest contributors to the cost of the IRT/IRT program, accounting for nearly half of the total costs, are the IRT reagents (35%) and the collection of the second dried blood specimen (12%). The projected cost of the IRT/DNA program, on the other hand, is greater than the cost of the IRT/IRT program by $2.31 per infant, with a total cost of $6.78 per infant (interquartile prediction interval of $6.60–6.96). The largest proportion of these costs is attributable to the IRT reagents, DNA reagents, and laboratory personnel.
Costs were also assessed according to payer source, defined as laboratory costs, family/individual costs, and insurance, to allow for further analysis of each screening system. Overall, the IRT/IRT program offered substantial cost savings for state laboratories and insurance as compared with IRT/DNA, while creating higher costs for individuals and families. Average costs for running the laboratory portion of the IRT/IRT and IRT/DNA program were equal to $3.63 per newborn and $5.85, respectively. In terms of costs covered by insurance, IRT/IRT offered the higher cost savings, with a cost of $0.57 per newborn, compared with $0.73 per newborn. Finally, the cost to families of infants with an abnormal screen was lowest with the IRT/DNA strategy, totaling $20000 per year. The total family costs of the IRT/IRT program were nearly double that of IRT/DNA, with an average expense of ~$36000 per year. The IRT/IRT screening cost per diagnosis is equal to $45400, compared with $39700 per diagnosis in the IRT/DNA system.
This study of screening and diagnostic costs is 1 aspect of the global question on the cost effectiveness of CF NBS, whereas the care costs are still being investigated with utilization data that should allow us to reach conclusions about overall cost effectiveness. This comprehensive assessment of costs for diagnosis adds to the literature through novel inclusion of all procedures such as (1) the costs of procuring the second blood specimen in IRT/IRT; (2) costs for genetic counseling with IRT/DNA; (3) the outcomes associated with “loss to follow-up” and other flaws in the designs of both screening systems; (4) costs associated with missed diagnoses; (5) the overall costs of the screening program to various segments, such as the costs to families, insurers, and screening laboratories; and (6) creating more robust results by allowing the components of the model to have variability and showing a range of outcome values. Overall, results from the simulations demonstrated that the IRT/IRT system offers substantial cost savings as compared with IRT/DNA but has the potential to miss or delay diagnosis for up to 50% of all infants given a diagnosis of CF through screening annually as compared with 15% for the IRT/DNA system. As such, the IRT/DNA system cost per diagnosis is 14% lower than the IRT/IRT cost per diagnosis. Thus, the decision trees provide a visual process to examine costs at each stage of the screening program and could be used to help in a systematic evaluation of the effectiveness of laboratory costs.
Because early diagnosis of CF, before 1 to 2 months of age,14,39 is the key reason for screening, it is crucially important for NBS programs to consider both expected outcomes40–46 and costs. Three publications from the United States have assessed the costs of the IRT/DNA method and the individual cost of IRT,10,11,16 but none of these studies included the cost of health effects or offered a comparison with other screening methods. The authors of another publication from the United Kingdom assessed the cost-effectiveness of IRT/DNA compared with no screening but did not offer a comparison with other screening methodologies.14 Finally, the authors of 1 comprehensive study from the Netherlands conducted a thorough assessment of the cost-effectiveness of 4 screening methodologies but did not account for problems with loss to follow-up in either system, nor did they offer a critique of the overall system designs with a discussion of possible solutions to correct flaws in each system.12 In addition, the Netherlands study did not take into account the possible cost consequences of lowering the initial IRT cutoff levels in the IRT/IRT system to avoid false negatives.6,26
There are some limitations to our study and also some other “hidden” costs in both screening methods that are currently difficult to define with precision. It should be pointed out that some sweat tests need to be repeated because of insufficient quantities of sweat; in fact, up to 10% can be assumed,47 which adds $4029 (at $237 per sweat test) plus $1700 (at $103 per sweat test) for family travel and work costs assuming a cohort of 100000 newborns. In addition, many states that employ IRT/IRT screening are now using expanded genetic analysis either as part of the screening process for evaluating premature infants or in conjunction with the diagnostic sweat test. A survey of 4 commercial laboratories revealed that charges for CFTR multimutation analyses range from $550 to $800, and gene sequencing is $2485 to $3546 per test.
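The repeat sweat test figures can be checked with simple arithmetic. The sketch below uses only the dollar amounts and the 10% rate quoted above; the variable names are illustrative, and the implied counts of repeat and initial sweat tests are inferred from those amounts rather than stated in the text.

```python
# Values quoted in the text (dollars); names are hypothetical.
SWEAT_TEST_COST = 237          # cost per sweat test
FAMILY_COST_PER_TEST = 103     # family travel and missed work per sweat test
ADDED_TEST_COST = 4029         # added cost attributed to repeat sweat tests
REPEAT_PERCENT = 10            # upper bound on the share of tests repeated

# Number of repeat tests implied by the added cost (an inference, not a reported count).
repeat_tests, remainder = divmod(ADDED_TEST_COST, SWEAT_TEST_COST)

# Initial sweat tests implied if the repeats are 10% of them.
implied_initial_tests = repeat_tests * 100 // REPEAT_PERCENT

# Family cost for the repeats before any rounding.
family_cost = repeat_tests * FAMILY_COST_PER_TEST

print(repeat_tests, remainder, implied_initial_tests, family_cost)
```

Under these assumptions the added laboratory cost corresponds to a whole number of repeat tests, and the unrounded family cost is close to the approximate $1700 given in the text.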
Another limitation is that we relied heavily on Wisconsin data to have completely accessible costs and outcomes for the models, but their general applicability in the United States has been described48 recently. On the other hand, certain expenses such as salaries for personnel will vary in other regions and thus the simulations described herein may require revisions (eg, in the United Kingdom where the newborn blood specimens are obtained at home, which causes greater health care personnel costs but lower family expenses).
Two major system failures were identified as contributors to missed and delayed diagnoses: (1) protocol definition failures and (2) follow-up system failures. Protocol definition failures, the first type of system failure accounting for the largest proportions of potential missed diagnoses, resulted most commonly from the definitions of an abnormal IRT screen in each protocol. In the design of our IRT/IRT assessment, standard cutoffs of 105 ng/mL and 70 ng/mL were set as the abnormal values for the first and second screen, respectively, but data suggest that this can lead to a potential missed diagnosis rate estimated as high as ~30% of the total number of newborns with CF given a diagnosis annually through screening.6 Thus IRT cutoff values have been lowered continuously during the past decade and have varied among the states from 58 to 130 ng/mL for the first IRT and 50 to 90 ng/mL for the second IRT. The IRT/DNA system uses a much lower cutoff level for identifying abnormal IRT screens, thereby lowering the percentage of infants with CF with a missed diagnosis. Therefore, lowering the cutoff level of the abnormal IRT screens in the IRT/IRT system may offer 1 solution to this prominent system failure but will also lead to an unintended consequence of increasing costs. If the IRT/IRT system were to adopt the same cutoff levels as the IRT/DNA system, 4000 infants out of the 100000 fictional screened newborns would require a second blood specimen.
The follow-up component is the second type of system where failures can occur. Among the 2 screening methods compared in this study, IRT/IRT clearly experiences the greatest number of potential missed diagnoses in the follow-up system and the largest proportion of these were lost at the point of obtaining the second dried blood specimen. In the IRT/DNA method, very few newborns with CF are categorized as a loss to follow-up because replacing the second dried blood specimen with a mutation analysis conducted on the initial dried blood specimen does not allow for loss to follow-up at this stage.
Therefore, the results from this study have demonstrated that the system designs of the IRT/IRT and IRT/DNA screening programs offer different sets of advantages and disadvantages. The design of the IRT/DNA screening protocol minimizes system failures and out-of-pocket costs to families. The IRT/IRT screening protocol minimizes costs to state laboratories and insurance but is associated with more potential system failures. Redesigning the IRT/IRT system by increasing the sensitivity of the initial IRT test (ie, lowering the initial cutoff for abnormal) and decreasing loss to follow-up will improve early detection and treatment of infants with CF but will substantially increase the societal cost of the screening program.
This research was supported by National Institutes of Health grant R01DK34108. Professor Rosenberg was partially supported by grant 1UL1RR025011 from the Clinical and Translational Science Award (CTSA) program of the National Center for Research Resources, National Institutes of Health.
We thank the other investigators who have participated in the Wisconsin Cystic Fibrosis Neonatal Screening Project, especially the Madison CF Center Director, Dr Michael Rock, and its longtime coordinator, Anita Laxova. We also thank W.H. Hannon, PhD (former director of the Newborn Screening Branch at the Centers for Disease Control and Prevention).
All authors meet the following criteria for authorship: (1) they have all substantially contributed to conception and design, acquisition of data, or analysis and interpretation of data; (2) they all participated in drafting the article and/or revising it critically for important intellectual content; and (3) they all gave the corresponding author approval of the final version submitted for publication. Thus, Dr Wells, Dr Rosenberg, Mr Hoffman, Dr Anstead, and Dr Farrell have all made substantive intellectual contributions to the study. In addition, they will all approve the version to be published.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
Funded by the National Institutes of Health (NIH).