The Centers for Disease Control and Prevention conducted a systematic screening and assessment process to identify promising practices implemented by grantees of the National Breast and Cervical Cancer Early Detection Program and its partners that were appropriate for rigorous evaluation.
The systematic screening and assessment (SSA) process was conducted from September 2010 through March 2012 and included five steps: (1) nominations of promising practices; (2) a first rating by subject matter experts; (3) field-based evaluability assessments; (4) a second rating by experts; and (5) use of results. Nominations were sought in three program areas including health education and promotion, quality assurance and quality improvement, and case management/patient navigation.
A total of 98 practices were nominated of which 54 % were eligible for the first review by the experts. Fifteen practices were selected for evaluability assessment with ten forwarded for the second review by the experts. Three practices were ultimately recommended for rigorous evaluation, and one evaluation was conducted. Most nominated practices were based on evidence-based strategies rather than representing new, innovative activities. Issues were identified through the process including inconsistent implementation and lack of implementation fidelity.
While the SSA was successful in identifying several programs for evaluation, the process also revealed important shortcomings in program implementation. Training and technical assistance could help address these issues and support improved programming.
The Breast and Cervical Cancer Mortality Prevention Act of 1990 established the National Breast and Cervical Cancer Early Detection Program (NBCCEDP) to be administered by the Centers for Disease Control and Prevention (CDC). Through the NBCCEDP, low-income, uninsured, and underinsured women receive important cancer screening services. To date, the NBCCEDP has served over 4.5 million women and diagnosed 62,121 breast cancers, 3,458 invasive cervical cancers, and 163,548 premalignant cervical lesions. Between 2013 and 2014, the NBCCEDP provided nearly $154 million to all 50 states, the District of Columbia, five US territories, and 12 American Indian/Alaska Native tribal organizations. As the only nationally organized screening program in the USA, the NBCCEDP pays for screening and diagnostic services as well as partnership development, public education and recruitment, professional education, quality assurance and quality improvement, patient navigation, data management, and evaluation. The law currently mandates that at least 60 % of NBCCEDP funds be used to support direct clinical services, with up to 40 % allocated for the activities noted above.
The Affordable Care Act  is increasing the number of persons who have access to health insurance and, subsequently, to breast and cervical cancer screening. For most private insurance plans and newly eligible Medicaid expansion beneficiaries, the law requires coverage, with no cost-sharing, for clinical preventive services recommended by the United States Preventive Services Task Force (USPSTF), including screening for breast and cervical cancer . By removing cost as a barrier, the Affordable Care Act offers new opportunities for the NBCCEDP to increase access to screening for a broader population than has traditionally been served by the program. In particular, the NBCCEDP could leverage its non-screening components to help integrate public health and primary care, reducing an existing community clinical services gap and facilitating health care access for a broader population than traditionally reached . A possible model for the NBCCEDP is CDC’s Colorectal Cancer Control Program (CRCCP), a program encouraging population-level impact through use of evidence-based strategies focused on health systems and organizational policy change .
In anticipation of healthcare reform and full implementation of the Affordable Care Act, CDC funded several projects to explore future directions for the NBCCEDP. Recognizing the possibility for scaling up non-screening activities, one project sought to identify promising practices implemented by NBCCEDP grantees or partners that have potential to achieve population-level effects and that are well positioned for rigorous evaluation. The project aligns with an important public health goal to identify and promote evidence-based practice.
Evidence-based approaches are often identified through efficacy studies involving experimental or quasi-experimental designs whereby key variables are tightly controlled in order to isolate intervention effects. Currently, CDC largely relies on the Guide to Community Preventive Services  to identify evidence-based strategies. However, sole reliance on interventions identified through efficacy studies is limiting. In particular, evidence-based interventions are expensive to implement and often overly complex for real-world replication by community-based public health practitioners facing many constraints [10-12]. Other methods are needed to gather evidence from public health programs, including the NBCCEDP. With over 20 years of implementation experience, NBCCEDP grantees and their partners offered an opportunity to identify innovative public health practices (e.g., interventions, strategies, policies, procedures, processes, and/or activities) ripe for evaluation.
The systematic screening and assessment method (SSA), developed by the Robert Wood Johnson Foundation in collaboration with CDC, sequences expert peer review with evaluability assessment as a means to identify promising public health practices worthy of rigorous evaluation. The SSA includes criteria such as reach to the target population, generalizability, data availability, and potential impact . Both expert review  and evaluability assessment  are recognized as systematic evaluation approaches. By examining program design, implementation, and data availability, evaluability assessments—which can be considered pre-evaluations—help ascertain whether further evaluation is justified, feasible, and likely to provide useful information. Developers initially applied the SSA to obesity prevention , and it has since been used in food policy, active transportation, food service guidelines, and joint use agreements. Given limited evaluation resources, the SSA is a cost-effective approach to identify the most promising public health practices best positioned for evaluation . At the same time, valuable feedback and technical assistance can be delivered even to those programs that would not benefit from further evaluation.
To identify promising public health practices implemented by NBCCEDP grantees or partners, CDC initiated an SSA in 2010. In doing so, CDC hoped to identify non-screening practices with potential for population impact, improve its understanding of current public health practices among NBCCEDP grantees, and bridge the divide between promising and evidence-based practice. The purpose of this paper is to present results from the SSA. An evaluation team led by staff from ICF International with CDC participation carried out the SSA.
The SSA was conducted from September 2010 through March 2012. Figure 1 represents the five-step SSA process that reduces a large number of candidate public health practices to a priority listing of those most promising and appropriate for rigorous evaluation. Each step in the process is described below.
Step 1—Nomination of practices: The evaluation team selected three priority areas for assessment, defined in Table 1: (1) health education and promotion (HEP), (2) quality assurance and quality improvement (QA/QI), and (3) case management or patient navigation (CM/PN). These areas were selected for their potential to achieve greater population-level effects in the new healthcare environment. On December 1, 2010, CDC held a webinar with grantees of the NBCCEDP and the National Comprehensive Cancer Control Program (NCCCP) as well as other partners to launch the SSA project. An electronic nomination form was circulated broadly to grantees and partners through various listservs (e.g., NBCCEDP, NCCCP, American Cancer Society, Susan G. Komen Foundation). The nomination form collected voluntary information on the practice name, type (e.g., HEP), and description (priority population, activities, outcomes of interest, data collection efforts). For purposes of the project, “practice” was defined as interventions, strategies, policies, procedures, processes, and/or activities implemented in the field with the intention to effect a change or achieve a particular impact. Examples of impact may be to decrease morbidity, mortality, or disability, or to increase quality of life or effectiveness and efficiency of public health. Nominations were collected through January 21, 2011.
The evaluation team reviewed all nominated practices and contacted programs for additional information. Inclusion and exclusion criteria were applied in order to select practices for review by experts. For inclusion, nominated practices had to address one of the three priority areas, have a plausible goal related to improving screening prevalence and adherence to diagnostic follow-up, be currently implemented and have been in place for at least 6 months, have a measurable outcome, and be suitable for implementation in other contexts. Practices that were undergoing a rigorous evaluation, had already been evaluated, or were a strict replication of an existing evidence-based model (without major adaptations) were excluded. Using a standard template, the evaluation team prepared a summary of each practice including a project description and information addressing several areas (e.g., intended outcomes, reach, implementation, sustainability).
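The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch of that rule, not the evaluation team's actual tool: the `Nomination` record, its field names, and the wording of the exclusion reasons are all hypothetical, and in practice each judgment (for example, whether a goal is "plausible") was made by reviewers rather than recorded as a flag.

```python
from dataclasses import dataclass

# The three priority areas named in Step 1.
PRIORITY_AREAS = frozenset({"HEP", "QA/QI", "CM/PN"})
MIN_MONTHS_IMPLEMENTED = 6


@dataclass(frozen=True)
class Nomination:
    """Hypothetical record of one nominated practice (field names are assumptions)."""
    name: str
    areas: frozenset
    plausible_screening_goal: bool
    currently_implemented: bool
    months_implemented: int
    measurable_outcome: bool
    transferable: bool
    under_rigorous_evaluation: bool = False
    already_evaluated: bool = False
    strict_replication: bool = False


def exclusion_reasons(n: Nomination) -> list:
    """Return every reason a nomination fails the screen (empty list if eligible)."""
    reasons = []
    if not (n.areas & PRIORITY_AREAS):
        reasons.append("does not address a priority area")
    if not n.plausible_screening_goal:
        reasons.append("no plausible screening or follow-up goal")
    if not n.currently_implemented:
        reasons.append("not currently implemented")
    elif n.months_implemented < MIN_MONTHS_IMPLEMENTED:
        reasons.append("implemented for fewer than 6 months")
    if not n.measurable_outcome:
        reasons.append("no measurable outcome")
    if not n.transferable:
        reasons.append("not suitable for other contexts")
    if n.under_rigorous_evaluation:
        reasons.append("already undergoing rigorous evaluation")
    if n.already_evaluated:
        reasons.append("already evaluated")
    if n.strict_replication:
        reasons.append("strict replication of an evidence-based model")
    return reasons


def is_eligible(n: Nomination) -> bool:
    """A nomination is eligible only if no exclusion reason applies."""
    return not exclusion_reasons(n)
```

Returning the list of reasons, rather than a bare yes or no, mirrors the way exclusions were later grouped by type (design, implementation, evaluation, capacity).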
Step 2—First informal meeting of experts: Twenty-three researchers and practitioners representing academia, federal and state agencies, and nonprofit organizations with expertise in at least one of the three priority areas were recruited for the project. Experts were sent eight to nine practice summaries consistent with their area of expertise (e.g., HEP) to review using a rating assessment tool accessed online. To reduce potential bias and ensure candid discussion, pseudonyms were used for all nominated practices. Table 1 summarizes the nine criteria assessed for each practice. The assessment tool included up to four Likert scale questions for each criterion. A value of one was associated with low scores, such as “low proportion of population reached,” and a value of four represented favorable ratings, such as “highly acceptable to stakeholders.” Experts also had an option of “unable to assess” if they believed inadequate information was available to rate the criterion. A final summative item using the four-point Likert scale was included, “How strongly do you recommend this practice for evaluability assessment?” with four reflecting “strongly recommend.”
Three to four experts reviewed and rated each practice. Using SPSS version 16.0, a mean score for each of the nine criteria was calculated on individual practices. In addition, all criterion scores were averaged into an overall composite mean score for the practice. The overall composite mean score refers to the mean of the mean scores. Criteria rated as “unable to assess” were treated as missing data and not included in means calculations. Criteria were treated equally with no weighting used. Graphs were produced for each criterion to provide a visual aid for determining which practices might have the highest potential impact, the highest feasibility for adoption, and so on.
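The scoring described above can be illustrated with a short sketch. This is not the SPSS procedure the team used; it is a plain-Python approximation under stated assumptions: `None` stands in for an "unable to assess" response, the criterion names and ratings are invented for illustration, and a criterion that no expert could assess is simply left out of the composite (the source does not say how that case was handled).

```python
from statistics import mean

# Assumption: None represents an "unable to assess" response (treated as missing).
UNABLE_TO_ASSESS = None


def criterion_means(ratings):
    """Mean expert rating per criterion, ignoring missing responses.

    ratings maps a criterion name to the list of expert scores (1-4 or None).
    Criteria with no valid scores are omitted from the result.
    """
    means = {}
    for criterion, scores in ratings.items():
        valid = [s for s in scores if s is not UNABLE_TO_ASSESS]
        if valid:
            means[criterion] = mean(valid)
    return means


def composite_score(ratings):
    """Unweighted mean of the criterion means (the 'mean of the mean scores')."""
    means = list(criterion_means(ratings).values())
    return mean(means) if means else None


# Illustrative ratings from three or four experts for one practice (made-up values).
example = {
    "reach": [3, 4, UNABLE_TO_ASSESS],
    "sustainability": [2, 3, 2, UNABLE_TO_ASSESS],
    "stakeholder acceptance": [UNABLE_TO_ASSESS, UNABLE_TO_ASSESS, UNABLE_TO_ASSESS],
}
print(criterion_means(example))
print(round(composite_score(example), 2))
```

Because the composite is a mean of criterion means, each criterion carries equal weight regardless of how many experts were able to rate it, which matches the unweighted approach described in the text.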
In April 2011, seventeen experts convened in Atlanta, Georgia and six experts joined by telephone to discuss the ratings and make recommendations for evaluability assessments. Experts were placed in groups representing the three priority areas to review the rating results and discuss the candidate practices. The full group then reconvened, and each priority group recommended practices for the evaluability assessments along with suggested issues for site visitors to explore during the assessment process. The evaluation team reviewed recommendations from the experts and selected practices for evaluability assessments.
Step 3—Evaluability assessments: To gather detailed information on each practice, evaluation team members reviewed background documents provided by staff. Teams of two to three then conducted two-and-a-half-day site visits between July and October 2011. The evaluation team spoke with nine to 12 stakeholders at each site for approximately 1–2 hours each to better understand the practice and its implementation, including its history, goals, and data sources. During the site visit, the evaluation team facilitated a logic model-building exercise with staff to explicate the relationships between practice inputs, activities, outputs, and outcomes.
Following each site visit, the evaluation team drafted reports describing each practice and including programmatic recommendations. Summary reports were provided to all participating practices. Staff members were invited to participate in a follow-up meeting by telephone to review the report, refine the logic model, and discuss recommendations. Based on results of the evaluability assessments that revealed significant implementation problems for several practices, the evaluation team selected 10 of the 15 practices for final review by the experts.
Step 4—Second informal meeting of experts: Each expert was provided detailed reports and logic models developed from the evaluability assessment and asked to review and rate the practices using the nine established criteria (Table 1), again through the web-based rating assessment tool. Ten to thirteen experts reviewed and rated each practice. Consistent with the first informal meeting, the evaluation team calculated a mean score for each of the nine criteria on each practice as well as a mean, composite score. As before, no weighting was used and criteria rated as “unable to assess” were treated as missing data. The experts were convened by telephone in January 2012 to discuss each practice and review assessment ratings. Individual experts presented a summary of each practice to the group. At the end of the discussion, experts were asked to vote for the top three practices they would recommend for rigorous evaluation.
Step 5—Use of SSA results: Following the second informal meeting, the evaluation team met to discuss results and to recommend a priority practice for evaluation to CDC leadership. Considerations influencing selection included the potential costs in terms of both time and money to carry out the evaluation. The evaluation team synthesized results from the SSA and reflected upon the experience in order to extract lessons learned across similar practices.
Nomination of practices: The nominations process yielded a total of 98 practices including 32 practices in HEP, 39 in QA/QI, and 27 in CM/PN. Sixteen of the nominated practices addressed more than one category. Most often, CM/PN practices included a health education and promotion component. Nominations were evenly distributed across the US: 24 nominations were received from the Northeast, 25 from the Southeast, 22 from the Midwest, 24 from the West, and three from US territories (Fig. 2).
Based on inclusion and exclusion criteria applied by the evaluation team, 45 practices were excluded. The remaining 53 included 20 HEP, 21 QA/QI, and 12 CM/PN. Practices were excluded for issues related to design (e.g., no plausible logic to increase screening rates), implementation (e.g., no longer implemented, partial implementation of a practice), evaluation (e.g., data collection not possible, already evaluated), and capacity (e.g., no response to project inquiries).
First informal meeting of experts: Based on the experts’ assessment ratings, composite mean scores ranged from 2.3 to 3.6 for the 53 practices with a median of 3.19. The lowest mean scores recorded were for feasibility of adoption (2.78) and intervention sustainability (2.64); stakeholder acceptance (3.35) and feasibility of implementation (3.34) received the highest mean scores.
Following extensive discussion, the experts recommended 12 practices for evaluability assessment and identified several others that might warrant additional consideration. Experts also shared the following observations:
Following the informal meeting, the evaluation team met to discuss the recommendations from the experts. Based on the experts’ advice and CDC’s interest in practices with potential for population-based effects or those addressing the screening needs of vulnerable populations, the team considered additional practices for evaluability assessment. Aside from examining ratings from the experts, the team engaged in the same assessment process using the web-based tool (results not presented). Based on that process, three other practices were added for a total of 15 identified for evaluability assessment. The practices represented the following categories: six HEP, three QA/QI, and six practices spanning multiple categories (four HEP–CM/PN, one HEP–QA/QI, one reflecting all three categories).
Evaluability assessments: Fifteen evaluability assessments were conducted, and reports summarizing assessment results for each site visit were drafted. Five practices were removed from further review given significant implementation problems that included the lack of implementation fidelity with the program design and inconsistent, non-systematic implementation. Consequently, ten practices were forwarded for final review and rating by the experts including three HEP, three QA/QI, three HEP–CM/PN, and one HEP–QA/QI.
Reflecting on the evaluability assessment process, evaluation team members identified several commonly observed issues:
Second informal meeting of experts: Table 2 summarizes the experts’ ratings by criteria for each of the ten practices, numbered 1–10. Consistent with the first informal meeting, mean scores for individual criteria were lowest for feasibility of adoption (2.35) and intervention sustainability (2.59) and highest for feasibility of implementation (3.55) and stakeholder acceptance (3.53).
Figure 3 provides the composite and overall recommendation scores for each practice. Composite scores ranged from 2.86 (practice #4) to 3.42 (practice #1); overall recommendation scores ranged from 2.50 (practice #4) to 3.70 (practice #5). Results of voting conducted during the informal meeting of experts for the top three practices were nearly consistent with composite scores; the three practices receiving the most votes were included in the four practices receiving the highest composite scores. Of interest, three practices received no votes from experts in this final exercise.
Step 5—Use of SSA results: The evaluation team recommended the top three practices identified by the experts to decision makers in the CDC’s Division of Cancer Prevention and Control and solicited their input to identify the first practice for evaluation. At the time, resources were available to support one evaluation. The Colorado NBCCEDP’s Women’s Wellness Connection bundled payment system, a QA/QI practice, was selected given several factors, including that the practice was innovative and appeared to be highly replicable for other NBCCEDP and cancer programs, staff were willing to provide clinical and financial data for analysis, and the evaluation could be completed within existing time and cost constraints. The team designed and conducted an evaluation study of the Colorado NBCCEDP’s Women’s Wellness Connection bundled payment system in 2012–2013 .
The team also reflected on the SSA results identifying the following lessons learned:
We found the SSA an effective process for identifying promising cancer screening-related practices implemented in real-world settings ready for rigorous evaluation. By applying criteria such as acceptability to stakeholders and feasibility of adoption, areas typically ignored in highly controlled intervention research, the SSA allows evaluators to assess field-based practices and contribute toward building practice-based evidence from the “bottom up,” a current gap. Recently, CDC authors, drawing on the SSA as well as the reach, effectiveness, adoption, implementation, maintenance (RE-AIM) model  and the integrative validity model , proposed a new framework that recognizes a continuum of evidence-based practice, from emerging to best . The framework is meant to complement traditional, systematic review approaches (e.g., Community Guide) and, like the SSA, contribute to building practice-based evidence. The framework integrates two distinct concepts, public health impact and quality of evidence, in order to assess practices. Five components comprise public health impact: effectiveness, reach, feasibility, sustainability, and transferability. Strength of evidence is assessed via a range: weak, moderate, strong, and rigorous.
While the SSA proved an effective method, the process yielded far fewer practices ready for evaluation than anticipated. Despite a seemingly successful nomination effort generating 98 practices for review, only a few were ultimately found appropriate for further evaluation. Other SSAs have had similar findings. Our results may reflect the NBCCEDP’s traditional emphasis on screening provision; as noted, grantees are required by law to expend at least 60 % of funds on clinical service delivery. Grantees may simply not have adequate resources to implement strong practices in the three foci for this SSA.
However, results may also reflect limited capacity in program planning, implementation, and evaluation for the non-clinical intervention activities we assessed. A 2013 survey of NBCCEDP grantees identified the need for training in implementation of specific strategies such as systems change, quality improvement, and provider assessment and feedback, as well as in program monitoring and evaluation . However, through the SSA, we identified other types of shortcomings including partial and inconsistent implementation, lack of implementation fidelity, and inadequate “dose” of the practice because implementers were either not doing enough of a given activity or trying to do too many different things. This is significant given that strong empirical evidence has demonstrated that the level of implementation affects outcomes . Further, one of the lowest scored criteria for both informal meetings was feasibility of adoption, suggesting the practice would be difficult to adopt in its current form. We also found that documentation of implementation efforts was limited, and programs faced other challenges to program monitoring and evaluation.
Based on these findings, CDC program consultants and staff assigned to provide regular technical assistance to grantees, as well as the grantees, received extensive training in program logic modeling in 2013 to support improved planning, implementation, and evaluation. By developing a logic model, program planners and implementers can help ensure a plausible relationship between the proposed activities and the intended outcomes and also identify outputs and outcomes for program monitoring and evaluation. In addition, logic models benefit program planning by improving clarity about the program among program staff and stakeholders, identifying needed resources, specifying the sequencing of activities that should be implemented, and serving as a basis for program evaluation. Under the most recent NBCCEDP funding announcement, grantees were required to develop a program logic model to specify relationships between program activities and intended outcomes. To help grantees meet this requirement, CDC developed related guidance, and program consultants and evaluators are providing technical assistance. Other training to address evaluation planning, an area for which SSA participants explicitly requested assistance, is forthcoming. As a screening program, the NBCCEDP is data-driven and supports a strong culture of data use [23, 24]; therefore, this capacity should be leveraged as programs potentially expand non-screening efforts.
Through the SSA, we also found that grantees often implement multiple strategies to achieve individual screening objectives, as demonstrated by practices falling into more than one category (e.g., HEP and CM/PN). Such multiple component approaches, if implemented appropriately, may increase impact as synergies across activities are realized. Innovation may not rest in the practice alone, but also in how evidence-based strategies are combined to enhance impact. Effective organized screening programs involve integrating strategies such as patient outreach and recruitment, client and provider reminders, patient navigation, and provider assessment and feedback.
While the SSA offers a systematic means to identify promising practices and select those most appropriate for rigorous evaluation, it does have limitations. Results suggest the SSA process may be slightly improved by specifying whether replications of evidence-based strategies or truly innovative practices are preferred as nominations. The validity and reliability of the assessment tool used by the experts have not been examined. Additionally, the number of persons reviewing each practice was relatively small. The experts comprised highly respected researchers and practitioners in the field, and the discussions that followed the objective rating may prove equally important in determining which practices the experts select to move forward for evaluability assessment or for evaluation. In addition, experts may not have had as complete and comprehensive information about each practice as would be ideal. However, the evaluation team compiled extensive materials and conducted thorough evaluability assessments in order to describe the practices as fully as possible.
Although the SSA proved an effective method, fewer practices ready for rigorous evaluation were identified than anticipated. An unexpected and important consequence of the SSA process was the identification of needed improvements in program planning, implementation, and evaluation. Training and technical assistance could address these deficiencies, supporting improved programming and better preparing programs for evaluation. In particular, program logic model training could support improvements in both program planning and process and outcome evaluation. Similarly, training addressing factors including fidelity, dose, and reach could improve program implementation and, subsequently, the likelihood of achieving related outcomes.
Disclaimer: The findings and conclusions of this paper are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Amy DeGroff, Division of Cancer Prevention and Control, Centers for Disease Control and Prevention, 4770 Buford Hwy, MS K-57, Atlanta, GA 30314, USA.
Karen Cheung, Public Health and Survey Research, ICF International, Atlanta, GA, USA.
Nicola Dawkins-Lyn, Public Health and Survey Research, ICF International, Atlanta, GA, USA.
Mary Ann Hall, Public Health and Survey Research, ICF International, Atlanta, GA, USA.
Stephanie Melillo, Division of Cancer Prevention and Control, Centers for Disease Control and Prevention, 4770 Buford Hwy, MS K-57, Atlanta, GA 30314, USA.
Rebecca Glover-Kudon, Division of Cancer Prevention and Control, Centers for Disease Control and Prevention, 4770 Buford Hwy, MS K-57, Atlanta, GA 30314, USA.