Public Health Rep. 2012 Jul-Aug; 127(4): 375–382.
PMCID: PMC3366295

The Predictive Value of Self-Report Questions in a Clinical Decision Rule for Pediatric Lead Poisoning Screening



We derived a clinical decision rule for determining which young children need testing for lead poisoning. We developed an equation that combines lead exposure self-report questions with the child's census-block housing and socioeconomic characteristics, personal demographic characteristics, and Medicaid status. This equation better predicts elevated blood lead level (EBLL) than one using ZIP code and Medicaid status.


A survey regarding potential lead exposure was administered from October 2001 to January 2003 to Michigan parents at pediatric clinics (n=3,396). These self-report survey data were linked to a statewide clinical registry of blood lead level (BLL) tests. Sensitivity and specificity were calculated and then used to estimate the cost-effectiveness of the equation.


The census-block group prediction equation explained 18.1% of the variance in BLLs. Replacing block group characteristics with the self-report questions and dichotomized ZIP code risk explained only 12.6% of the variance. Adding three self-report questions to the census-block group model increased the variance explained to 19.9% and increased specificity with no loss in sensitivity in detecting EBLLs of ≥10 micrograms per deciliter.


Relying solely on self-reports of lead exposure predicted BLL less effectively than the block group model. However, adding three of 13 self-report questions to our clinical decision rule significantly improved prediction of which children require a BLL test. Using the equation as the clinical decision rule would annually eliminate more than 7,200 unnecessary tests in Michigan and save more than $220,000.

The purpose of testing for elevated blood lead levels (EBLLs) in children is to identify cases that would require appropriate treatment and/or environmental intervention. The 1989 Centers for Medicare & Medicaid Services requirement to test all children enrolled in Medicaid is costly and requires extensive outreach.1 It also created a need for clinical practice guidelines and decision rules to determine which young children are at high enough risk to require testing for lead poisoning.

In 1997, the Centers for Disease Control and Prevention (CDC)2,3 developed such clinical practice guidelines and defined high risk of symptoms of lead poisoning as having a blood lead level (BLL) of ≥10 micrograms per deciliter (μg/dL).4 CDC encouraged state health departments to develop plans to identify all children at high risk of lead poisoning based on local data concerning BLLs and selected risk factors.

The CDC guidelines were based on several criteria, including age of housing and percentage of population with incomes below the federal poverty level (FPL) in the ZIP code in which the child resides. As part of its targeting, CDC also recommended using information obtained from five self-report questions on lead exposure (Figure).3 The Michigan Department of Community Health (MDCH) replaced question #5 with the following question: “Does the child's family use any home remedies that may contain lead?” and provided a list of such remedies. At the time, MDCH followed Medicaid rules requiring that all children enrolled in Medicaid be tested, although not all such children were actually tested. In addition, CDC and MDCH recommended that a child be tested if the child's caregiver answered “yes” or “I don't know” to any of the self-report questions, or if the child was Medicaid-enrolled or lived in a high-BLL-risk ZIP code.

Figure. CDC and MDCH recommended screening questions for pediatric lead poisoning

While these self-report questions have been widely used, studies have not found them to be good predictors of EBLL.5–8 Moreover, some of these questions are poorly worded and may not be understood by respondents. Therefore, there is a need to both improve the wording of these questions and assess the predictive validity of the improved questions.

Previous research in Michigan9 and elsewhere10,11 has led to a refined geographic approach to predicting the BLL of children. Unlike previous studies that use ZIP codes and census tracts, this research relied heavily on the characteristics of the census-block group (the smallest geographic unit for which detailed data were available). In Michigan, census-block groups explained substantially more variance in BLL than census tracts or ZIP codes. This method yielded a prediction equation with better sensitivity and specificity than ZIP code and Medicaid status, which would have identified more high-risk children, while saving more than $150,000 during the four-year period 2002–2005.9

This equation, derived from our previous work,9 predicted BLL from a weighted linear combination of the following characteristics of the census-block group in which the child lived: percentage of housing built before 1940 (HSNG_PRE1940), percentage of housing built during 1940–1949 (HSNG_1940-49), percentage of population with income <185% FPL (INC_185%_POV), percentage who did not graduate from high school (HS_DROPOUT), percentage black, and percentage Latino; as well as the following characteristics of the child: Medicaid status, race/ethnicity (BLACK_CHILD or not), age, and year tested.

The optimal prediction equation for 2002–2005 was Ln(BLL − 0.5) = −0.487 + 0.694 × HSNG_PRE1940 + 0.0119 × HSNG_1940-49 + 0.212 × INC_185%_POV + 0.400 × %_BLACK + 0.556 × %_LATINO + 0.206 × BLACK_CHILD + 0.172 × MEDICAID + 0.109 × MEDICAID × HSNG_PRE1940 + 0.171 × MEDICAID × HSNG_1940-49.

This prediction equation also includes coefficients of dummy variables that adjust for the child's age and year of the BLL test. Additionally, it includes empirical Bayesian residuals generated by Hierarchical Linear Modeling.12 These residuals estimate the degree to which the prediction equation over- or underestimates the BLL in each census-block group.
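The fixed part of the published equation can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function and key names are hypothetical, the block-group variables are assumed to be proportions between 0 and 1, and the age and year dummy coefficients and the empirical Bayesian residuals are omitted because their values are not given in the text.

```python
import math

# Coefficients copied from the published 2002-2005 equation.
# Age/year dummies and empirical Bayesian residuals are NOT included
# (their values are not reported in the text).
COEF = {
    "intercept": -0.487,
    "hsng_pre1940": 0.694,
    "hsng_1940_49": 0.0119,
    "inc_185_pov": 0.212,
    "pct_black": 0.400,
    "pct_latino": 0.556,
    "black_child": 0.206,
    "medicaid": 0.172,
    "medicaid_x_pre1940": 0.109,
    "medicaid_x_1940_49": 0.171,
}


def predict_ln_bll(hsng_pre1940, hsng_1940_49, inc_185_pov,
                   pct_black, pct_latino, black_child, medicaid):
    """Return the predicted Ln(BLL - 0.5) from the fixed terms only.

    Block-group inputs are assumed to be proportions (0 to 1);
    black_child and medicaid are 0/1 indicators.
    """
    return (
        COEF["intercept"]
        + COEF["hsng_pre1940"] * hsng_pre1940
        + COEF["hsng_1940_49"] * hsng_1940_49
        + COEF["inc_185_pov"] * inc_185_pov
        + COEF["pct_black"] * pct_black
        + COEF["pct_latino"] * pct_latino
        + COEF["black_child"] * black_child
        + COEF["medicaid"] * medicaid
        + COEF["medicaid_x_pre1940"] * medicaid * hsng_pre1940
        + COEF["medicaid_x_1940_49"] * medicaid * hsng_1940_49
    )


def ln_score_to_bll(score):
    """Invert the transformation: BLL = exp(score) + 0.5 (in ug/dL)."""
    return math.exp(score) + 0.5
```

Because the omitted terms shift the score, values from this sketch should not be read as the predictions the authors obtained for any particular child.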

While prior literature suggests that these self-report exposure questions are not adequate predictors of BLL,5–8 it is possible that some of them are worth adding to the census-block group prediction equation. The present study expands our previous work by including modified MDCH-CDC self-report questions and other self-report questions suggested in the literature. The purpose is to create the most cost-effective method of targeting BLL testing, which can then be disseminated by public health professionals to clinicians and parents.


Two datasets were linked for this study. One was our survey on sources of lead exposure, which was administered from October 2001 to January 2003 to more than 4,000 parents and caregivers of young children at 36 pediatric clinics in Michigan, including 18 county health departments, two urban health systems, and 16 small clinics serving migrant workers. The survey contained questions that modified the five MDCH-CDC items, plus a question about lead smelters and questions about additional sources of lead exposure (e.g., pacifier use5 and water that came through lead pipes).13

Because respondents were disproportionately of low education, the current survey modified the CDC questions by dividing sentences with several clauses or phrases into simpler sentences. It also divided some of the CDC questions into several questions. CDC question #1 in the Figure was divided into three separate questions asking (1) whether the residence was built before 1950, (2) whether it had peeling paint, and (3) whether the child had visited a house with peeling paint. Our questionnaire had a Flesch-Kincaid reading grade level of 4.1, whereas the original CDC questionnaire has a reading grade level of 7.8. We subsequently created variables whose meaning closely corresponded to the original CDC questions (Table 1).

Table 1.
Descriptive statistics for key questions in a survey of BLLs and items in the MDCH clinical BLL database: Michigan, October 2001–January 2003

We pilot-tested the survey with 15 parents to ascertain whether parents were willing to complete it and readily understood the questions. All parents completed the survey. Only one parent had any difficulty with reading the questionnaire, and clinic staff described her as barely able to read.

After piloting, the survey was administered by trained staff at participating clinics. Of those invited to complete the survey, the participation rate exceeded 90%. As an incentive, respondents were given a long-distance phone card. The survey took less than 10 minutes to complete, and respondents filled it out while waiting at the clinic.

The second dataset consisted of a clinical registry of all BLL tests in Michigan maintained by MDCH, with identifiers removed. If a child had more than one BLL test, we analyzed the highest venous result if one was available, and otherwise the highest capillary result. Only 26% of the tests were venous, although 65% of the EBLLs were venous.

Of the 4,194 survey responses, 92.4% (n=3,876) could be linked to the child's BLL test. The further reduction to the final number of cases (3,396) was primarily due to BLL records that lacked a valid address. Other cases were excluded because the child was older than 5 years of age or the age was missing.

Methods for missing data

CDC recommended that missing data on any question regarding risks of lead exposure be treated as having answered “yes” to that question.2 We followed this recommendation as one of our methods, with several qualifications. For example, because we asked questions about housing built before 1950 and the presence of peeling paint separately, we could have missing data on one of those questions. If the respondent answered “no” to either question, then even if there were missing data on the other question, we considered the answers tantamount to a “no” on the original CDC question #1. When using this method, we also recoded missing values of race to black, Medicaid eligibility to yes, and pacifier use to never, as these values are associated with higher BLL.
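The "missing data implies risk" recoding described above can be sketched as follows. This is a minimal illustration with hypothetical names. It assumes that missing answers are represented as `None` and that "don't know" is treated like a missing answer, which the text does not state explicitly for the regression coding.

```python
from typing import Optional

YES, NO, DONT_KNOW = "yes", "no", "don't know"


def is_risk(answer: Optional[str]) -> bool:
    """Treat anything other than an explicit 'no' as risk suggestive.

    Assumption: None (missing) and "don't know" are both coded as risk.
    """
    return answer != NO


def cdc_question_1(built_before_1950: Optional[str],
                   peeling_paint: Optional[str]) -> bool:
    """Rebuild original CDC question #1 from the two split questions.

    An explicit 'no' on either part makes the combined answer 'no',
    even if the other part is missing; otherwise it is coded as risk.
    """
    if built_before_1950 == NO or peeling_paint == NO:
        return False
    return True


def fill_child_defaults(race: Optional[str],
                        medicaid: Optional[str],
                        pacifier_use: Optional[str]) -> tuple:
    """Recode missing values to the categories associated with higher BLL."""
    return (
        race if race is not None else "black",
        medicaid if medicaid is not None else YES,
        pacifier_use if pacifier_use is not None else "never",
    )
```

For example, `cdc_question_1("yes", None)` is coded as risk, while `cdc_question_1("no", None)` is not.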

The aforementioned CDC method tends to produce biased coefficient estimates. To obtain unbiased estimates without list-wise deletion, we used 10 random imputations of missing data as an alternative method.14 The values of the dependent variable were used to impute missing values of the independent variables.15 The imputations were performed using SPSS® version 18.0, using the automatic imputation method.16


The range of BLLs was 1 to 164 μg/dL. Consistent with previous findings,9,17 BLL was normally distributed after logarithmic transformation. The transformation that minimizes both skew and heteroscedasticity of variances is Ln(BLL − 0.5), where Ln is the logarithm to base e (≈2.718). This function was the dependent variable in all regressions. The minimum BLL recorded in the MDCH database was 1.0. Therefore, many results recorded as 1.0 might actually have been lower.

Comparing BLLs for survey cases with the full MDCH database

We compared BLL data for the survey cases with those from the full MDCH database of tests in 2002 (the year in which 83% of the BLL tests of children of survey respondents were conducted). As shown in Table 2, the prediction equation explained a much smaller proportion of the variance in Ln(BLL − 0.5) in the survey cases than in the full MDCH database. Table 2 also compares the conditional standard deviations (SDs) of the same dependent variable for the two datasets. These SDs were very similar in the two datasets, indicating that the dependent variable was predicted almost equally well18 in both.

Table 2.
Comparison of BLL results for survey data cases from October 2001–January 2003 with full MDCH BLL database for 2002

The survey data cases have a smaller mean for the dependent variable than does the full MDCH data. This smaller mean occurred because the survey data underrepresented some high-risk groups, such as low-income families in Detroit. Almost all of the self-report questions provided three response alternatives: “yes”, “no”, and “don't know.” Table 1 shows response distributions.

Regression analyses

The census-block group prediction score by itself explained 18.1% of the variance in Ln(BLL − 0.5), while the self-report questions plus dichotomized ZIP code risk explained only 12.6% of the variance. Thus, the self-report questions should not replace the census-block group information.

The major research aim was to estimate how much additional predictive value the self-report variables contributed to the estimation of BLL, above that provided by the previous census-block group prediction equation. We entered the linear combination of variables from the optimal census-block group prediction equation9 followed by the self-report variables.

For both the CDC “missing data implies risk”2 and the multiple imputation methods, we added 13 self-report variables as predictors to the regression containing the census-block group equation. Twelve of these variables are in Table 1. The 13th variable was the product of the self-report response regarding whether the child lived in a house with interior peeling paint and the proportion of housing built before 1940 in the respondent's census-block group.

For the CDC method, the adjusted R2 from adding all 13 predictors was 0.192, while for random imputation it was 0.200 (data not shown). For both methods, the same three added predictors had statistically significant coefficients, while the other predictors (not shown) were nonsignificant.

The next step was to conduct a regression analysis using only the census-block group equation and the three significant predictors. As shown in Table 3, the adjusted R2 values, with only these three self-report predictors, were within 0.001 of the aforementioned adjusted R2 values from all 13 predictors.

Table 3.
Unstandardized regression coefficients predicting Ln(BLL – 0.5) from census-block group prediction equation and self-report variables, assuming that missing responses are tantamount to “yes” (i.e., risk suggestive) and by multiple ...

To aid interpretation, we present the anti-logarithms of the regression coefficients in Table 3. For both data analysis methods, all other things being equal, daily pacifier use resulted in a predicted BLL that was 81% of the BLL of a child who did not use one. For both methods, a child living in a house with peeling paint in a census-block group in which 100% of the housing was built before 1940 had a predicted BLL that was approximately 40% higher than that of an otherwise identical child in a house without peeling paint or a census-block group with no pre-1940 housing. Using the CDC method, children who had a sibling with an EBLL had a predicted BLL that was 13% higher than that of an otherwise identical child without such a sibling. Using random imputation, children with an EBLL sibling had a predicted BLL that was 46% higher.
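The anti-logarithm interpretation follows from the log-scale model: a coefficient b multiplies the predicted value of BLL − 0.5 by exp(b), which is close to a ratio of BLLs when BLL is well above 0.5. The sketch below uses hypothetical helper names and backs a coefficient out of a ratio quoted in the text, since the Table 3 coefficients themselves are not reproduced here.

```python
import math


def ratio_from_coefficient(b: float) -> float:
    """Multiplicative effect on (BLL - 0.5) implied by log-scale coefficient b."""
    return math.exp(b)


def coefficient_from_ratio(ratio: float) -> float:
    """Log-scale coefficient implied by a stated multiplicative effect."""
    return math.log(ratio)


# The text reports that daily pacifier use gives 81% of the comparison BLL,
# which implies a negative coefficient on the log scale.
pacifier_b = coefficient_from_ratio(0.81)

# "Approximately 40% higher" for peeling paint in a block group with
# 100% pre-1940 housing implies a positive coefficient.
peeling_paint_b = coefficient_from_ratio(1.40)
```

A ratio below 1 always corresponds to a negative coefficient and a ratio above 1 to a positive one, which is why the pacifier effect appears as a protective association.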

Identifying the most cost-effective decision rule

The most cost-effective decision rule would correctly identify all children with EBLLs (100% sensitivity) without misidentifying any without EBLLs (100% specificity). We compared cost-effectiveness by computing the sensitivity and specificity for three BLL prediction equations: first by using the census-block group prediction equation by itself, and then by including the three significant self-report lead questions. This comparison was made for both the CDC's missing data implies risk strategy and the multiple imputation method. While no BLL is known to be safe,19 consistent with CDC, we defined EBLL as ≥10 μg/dL. A good clinical decision rule requires a well-constructed risk score,20 and we used the predicted value of Ln(BLL−0.5). The Centers for Medicare & Medicaid Services requirement that all children on Medicaid be tested, and the CDC guidelines calling on state health departments to develop screening plans for all children at high risk, indicate that high sensitivity, i.e., correctly identifying all children with EBLLs, is more important than high specificity. We chose 0.3 as our risk score cutoff point because this number achieves high sensitivity.
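How sensitivity and specificity follow from a risk score cutoff can be sketched as below. The function name and the toy data are hypothetical. The sketch assumes that a child is flagged for testing when the predicted Ln(BLL − 0.5) is at or above the cutoff, since the text does not say whether the comparison is strict.

```python
def sensitivity_specificity(scores, blls, cutoff=0.3, ebll=10.0):
    """Return (sensitivity, specificity) of the rule 'test if score >= cutoff'.

    scores: predicted Ln(BLL - 0.5) for each child
    blls:   measured BLL (ug/dL) for the same children
    """
    tp = fn = tn = fp = 0
    for score, bll in zip(scores, blls):
        flagged = score >= cutoff   # rule recommends a test
        elevated = bll >= ebll      # child truly has an EBLL
        if elevated and flagged:
            tp += 1
        elif elevated and not flagged:
            fn += 1
        elif not elevated and flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data for illustration only; not taken from the study.
toy_scores = [0.5, 0.1, 0.4, 0.2]
toy_blls = [12.0, 3.0, 4.0, 11.0]
toy_result = sensitivity_specificity(toy_scores, toy_blls)
```

Lowering the cutoff flags more children, which raises sensitivity at the cost of specificity; this is the trade-off behind choosing a low cutoff such as 0.3.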

Table 4 compares the prediction methods. The sensitivity of the three methods was identical (0.992). However, adding the three self-report questions to the census block group equation gave higher specificity. Adding these questions via CDC's method would eliminate 153 (673 – 520) unnecessary tests. Adding these questions via random multiple imputation would eliminate 186 (706 – 520) unnecessary tests.

Table 4.
Sensitivity and specificity of BLL scores in predicting BLLs of ≥10 micrograms per deciliter from census-block group prediction equation and adding three best self-report exposure predictors, by two different methods: Michigan, October 2001–January ...

Brown and Chattopadhyay found that the average cost of a private-sector BLL venous sample test was $31, including the cost of the blood draw.21 Adding the self-report questions would have resulted in a total savings of $31 × 153 = $4,743 for the CDC method and $31 × 186= $5,766 for the random multiple imputation method. This analysis was based on 3,396 cases, approximately 2% of the more than 160,000 BLL tests performed each year in Michigan from 2007 to 2009. Extrapolating suggests that using the three self-report questions statewide would annually eliminate more than 7,200 unnecessary tests and save more than $220,000 using the CDC method and almost $270,000 using random multiple imputation. This savings would occur with no loss in sensitivity.
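The savings arithmetic can be checked with a short calculation. The names below are hypothetical, and the statewide figure uses 160,000 tests per year as a lower bound because the text says only "more than 160,000", so the extrapolated values are approximate.

```python
COST_PER_TEST = 31            # dollars per venous BLL test, including blood draw
SAMPLE_CASES = 3_396          # survey cases analyzed
STATE_TESTS_PER_YEAR = 160_000  # lower bound; the text says "more than 160,000"

# Unnecessary tests avoided in the sample, from the counts given in the text.
avoided_cdc = 673 - 520
avoided_imputation = 706 - 520


def sample_savings(avoided_tests: int) -> int:
    """Dollars saved within the survey sample."""
    return avoided_tests * COST_PER_TEST


def statewide_avoided(avoided_tests: int) -> float:
    """Scale avoided tests from the sample to the statewide annual volume."""
    return avoided_tests * STATE_TESTS_PER_YEAR / SAMPLE_CASES


def statewide_savings(avoided_tests: int) -> float:
    """Approximate annual statewide dollars saved."""
    return statewide_avoided(avoided_tests) * COST_PER_TEST
```

With these inputs, the CDC-method extrapolation lands just above 7,200 tests and just above $220,000 per year, in line with the figures quoted in the text.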


In an effort to improve prior predictions of children's BLLs, we used a combination of survey and clinical data. Three of 13 self-report questions made statistically significant contributions to predicting BLL. These three were having a sibling with an EBLL, using a pacifier (more frequent use was associated with lower BLL), and having peeling paint inside a child's residence (when multiplied by the percentage of pre-1940 housing in the census-block group).

By comparison, the CDC-based question, “Has the child lived in a house built before 1950 with peeling paint inside?”, when not multiplied by the census information, had no significant predictive value, even when the interaction term involving this question was excluded from the model. This question may have had little value because parents who rent their housing often do not know when the house was built, and more than half of the respondents answering this question indicated that they rented. Similarly, the nonsignificant predictive value of the question on lead pipes may have been a result of parents not knowing whether or not their residence had such pipes.

The observed negative association between pacifier use and BLL is surprising, given that Dalton et al.5 found a significant positive association. Their results are consistent with the assumption that if pacifiers fall on the floor and pick up lead dust, children may then inhale or ingest lead upon insertion. However, our results suggest that a pacifier may act as a barrier that prevents lead dust from entering the mouth and/or reduces the need to suck on paint chips or lead-painted toys. We also note that our sample size (n=3,396) was much larger than Dalton's (n=463) and that we estimated the effect of pacifier use on BLL while controlling for many other variables, whereas Dalton's analyses were strictly bivariate.


This study was subject to several limitations. For one, these data were from one state and represented a limited time period (late 2001 through early 2003). Also, the survey was not representative geographically, as the sample consisted of patients at the participating pediatric clinics in the state. Furthermore, some of the elevated capillary tests were not followed up by more accurate venous tests. Therefore, it is possible that some of the EBLL tests in the clinical records represented false-positives. Hence, we reran the analysis, eliminating elevated capillary BLL results. We found that all estimated regression coefficients were within 5% of the values presented in Table 3 and that none of our conclusions would be brought into question.


Prior clinical decision rules for targeting lead testing have used the criteria of Medicaid eligibility, residing in a high-risk ZIP code, or answering “yes” or “don't know” to one of the recommended self-report questions. These rules have a much lower predictive validity than an equation that is a linear combination of census-block group characteristics and several individual characteristics.9 As shown previously, adding the three self-report questions increased the predictive value of the census-block group equation.

CDC recently noted that since its 1997 recommendations were developed, all 42 CDC-funded childhood lead poisoning prevention programs in 37 states have developed data-driven targeted screening criteria. CDC went on to recommend that because “state and local officials are more familiar than federal agencies with local risk for elevated BLLs … that these officials have the flexibility to develop blood lead screening strategies that reflect local risk for elevated BLLs.”1

Our clinical decision rule adheres to this guidance and has the potential to better target BLL testing in children, thereby reducing testing costs while simultaneously identifying more cases of EBLL. In fact, our previous research9 shows that using a census-block group equation in Michigan, rather than ZIP codes, would have identified more cases of BLL >10 μg/dL, while saving more than $150,000 in four years.

This study shows that adding the three self-report questions to the equations increases specificity with no loss in sensitivity and should increase the monetary savings with no loss in the number of elevated cases identified.

The purpose of the prediction equation is to create a clinical decision rule that provides medical providers and public health departments with a better approach to determine which children should undergo BLL testing. Such decision rules are derived from rigorous quantitative research22–25 and are developed for clinically important conditions that are prevalent and for which current diagnostic testing practices vary widely and are inefficient.9,25

While the census-block group equation has much greater predictive validity for EBLL than ZIP codes, it is fairly easy to know the ZIP code and, at first glance, using the prediction equation appears more difficult and time-consuming. However, a Web tool has been developed to perform the computations in this equation and is accessible online. Clinic staff or parents can enter the child's Michigan address, Medicaid status, race/ethnicity, age, and the answers to the three self-report questions that are useful predictors. The Web tool processes these responses using the prediction equation and returns a recommendation as to whether or not the child needs a BLL test.

Clinical decision rules similar to the one presented in this article can be developed for other states. However, it will require developing a prediction algorithm using existing statewide BLL test data and U.S. Census Bureau information. We have described the type of statistical analyses required in this article and our previous article.9 A prediction equation would be the core of an online program that could be easily used by providers and parents alike. It would also facilitate cost-effective testing.21–24


The authors thank Yasmina Bouraoui and the Michigan Department of Community Health (MDCH) for partnering with Michigan State University to secure the Centers for Disease Control and Prevention (CDC) funding and for engaging Michigan clinics to provide survey data; Sharon Hudson for her assistance with clinical contacts; Robert L. Scott for providing access to the blood lead level database; Dr. Warren A. Brown for small-area census analysis and geocoding; Sean Frost for assistance with data management; Brian Biroscak for assistance in preparing the manuscript; Vivek Joshi for initially programming the website that recommends which children need BLL tests; Fuad Abujarad, who currently maintains and updates this website; and the staffs of cooperating clinics for their assistance in collecting survey data from patients.


This study was supported by CDC grant #PA00053 Supplemental Studies Part C. The views expressed in this article are those of the authors and do not necessarily represent the views of CDC or MDCH. This study was approved by the Institutional Review Boards at the Michigan Department of Community Health, Michigan State University, Henry Ford Health System, Spectrum Health System, and the Detroit Department of Health.


1. Wengrovitz AM, Brown MJ. Recommendations for blood lead screening of Medicaid-eligible children aged 1-5 years: an updated approach to targeting a group at high risk. MMWR Recomm Rep. 2009;58(RR-9):1–11. [PubMed]
2. Centers for Disease Control and Prevention (US) Screening young children for lead poisoning: guidance for state and local public health officials. Atlanta: CDC; 1997.
3. Laraque D, Trasande L. Lead poisoning: successes and 21st century challenges. Pediatr Rev. 2005;26:435–43. [PubMed]
4. Blood lead levels—United States, 1999–2002. MMWR Morb Mortal Wkly Rep. 2005;54(20):513–6. [PubMed]
5. Dalton MA, Sargent JD, Stukel TA. Utility of a risk assessment questionnaire in identifying children with lead exposure. Arch Pediatr Adolesc Med. 1996;150:197–202. [PubMed]
6. Haan MN, Gerson M, Zishka BA. Identification of children at risk for lead poisoning: an evaluation of routine pediatric blood lead screening in an HMO-insured population. Pediatrics. 1996;97:79–83. [PubMed]
7. Kazal LA., Jr The failure of CDC screening questionnaire to efficiently detect elevated lead levels in a rural population of children. J Fam Pract. 1997;45:515–8. [PubMed]
8. Binns HJ, LeBailly SA, Fingar AR, Saunders S. Evaluation of risk assessment questions used to target blood lead screening in Illinois. Pediatrics. 1999;103:100–6. [PubMed]
9. Kaplowitz SA, Perlstadt H, Post LA. Comparing lead poisoning risk assessment methods: census block group characteristics vs. zip codes as predictors. Public Health Rep. 2010;125:234–45. [PMC free article] [PubMed]
10. Trepka MJ. Using surveillance data to develop and disseminate local childhood lead poisoning screening recommendations: Miami-Dade County's experience. Am J Public Health. 2005;95:556–8. [PubMed]
11. Krieger N, Chen JT, Waterman PD, Soobader MJ, Subramanian SV, Carson R. Choosing area based socioeconomic measures to monitor social inequalities in low birth weight and childhood lead poisoning: the Public Health Disparities Geocoding Project (US) J Epidemiol Community Health. 2003;57:186–99. [PMC free article] [PubMed]
12. Raudenbush SW, Bryk AS. Hierarchical linear models: applications and data analysis methods. 2nd ed. Thousand Oaks (CA): Sage Publications; 2002.
13. Payne M. Lead in drinking water. CMAJ. 2008;179:253–4. [PMC free article] [PubMed]
14. Allison PD. Missing data. Thousand Oaks (CA): Sage Publications; 2002.
15. von Hippel PT. Regression with missing Ys: an improved strategy for analyzing multiply imputed data. Sociol Methodol. 2007;37:83–117.
16. SPSS, Inc. SPSS®: Version 18.0 for Windows. Chicago: SPSS, Inc.; 2010.
17. Brody DJ, Pirkle JL, Kramer RA, Flegal KM, Matte TD, Gunter EW, et al. Blood lead levels in the US population. Phase 1 of the Third National Health and Nutrition Examination Survey (NHANES III, 1988 to 1991) JAMA. 1994;272:277–83. [PubMed]
18. Achen CH. Interpreting and using regression. Newbury Park (CA): Sage Publications; 1982.
19. Lanphear BP, Hornung R, Khoury J, Yolton K, Baghurst P, Bellinger DC, et al. Low-level environmental lead exposure and children's intellectual function: an international pooled analysis. Environ Health Perspect. 2005;113:894–9. [PMC free article] [PubMed]
20. Cook CE. Potential pitfalls of clinical prediction rules. J Man Manip Ther. 2008;16:69–71. [PMC free article] [PubMed]
21. Brown MJ, Chattopadhyay S. Lead, elevated blood lead level evidence statement: screening. In: Campbell KP, Lanza A, Dixon R, Chattopadhyay S, Molinari N, Finch RA, editors. A purchaser's guide to clinical preventive services: moving science into coverage. Washington: National Business Group on Health; 2006. [cited 2012 Jan 16]. pp. 164–9. Also available from: URL:
22. Wasson JH, Sox HC, Neff RK, Goldman L. Clinical prediction rules. Applications and methodological standards. N Engl J Med. 1985;313:793–9. [PubMed]
23. Stiell IG, Wells GA. Methodologic standards for the development of clinical decision rules in emergency medicine. Ann Emerg Med. 1999;33:437–47. [PubMed]
24. McGinn TG, Guyatt GH, Wyer PC, Naylor CD, Stiell IG, Richardson WS. Users' guides to the medical literature: XXII: how to use articles about clinical decision rules. Evidence-Based Medicine Working Group. JAMA. 2000;284:79–84. [PubMed]
25. Stiell IG, Bennett C. Implementation of clinical decision rules in the emergency department. Acad Emerg Med. 2007;14:955–9. [PubMed]

Articles from Public Health Reports are provided here courtesy of Association of Schools of Public Health