To describe item selection and item reduction for the Lung Function Questionnaire (LFQ), which is being developed to help clinicians identify patients appropriate for diagnostic evaluation with spirometry for chronic obstructive pulmonary disease (COPD).
Item selection and reduction were based on information from 387 ≥40-year-old respondents to the third National Health and Nutrition Examination Survey who had self-reported chronic bronchitis. Item reduction involved stepwise logistic regression. The accuracy of the final subset of items for identifying individuals with airflow obstruction (forced expiratory volume in one second/forced vital capacity <0.70) versus those without it was assessed with receiver operating characteristic analysis. Content and face validity were assessed using focus groups of primary care physicians (n = 16) and interviews with COPD patients (n = 16).
The model with all five items (age; smoking history; the presence of wheeze, dyspnea, and phlegm) compared with models with combinations of fewer items had the highest classification accuracy (area under the curve [AUC] = 0.720) with sensitivity and specificity of 73.2% and 58.2%, respectively. The presence of three or more factors yielded the highest AUC, a result suggesting that three or more affirmative answers is the most appropriate criterion indicating presence of airflow obstruction.
In identifying individuals with COPD, the five-item LFQ retained accuracy, sensitivity, and specificity sufficient to warrant further validation testing.
Chronic obstructive pulmonary disease (COPD) affects approximately 12 million adults in the United States, where it causes approximately 1.5 million emergency department (ED) visits, 726,000 hospitalizations, and 119,000 deaths annually.1 COPD is manifested by cough, sputum production, and breathlessness associated with airflow obstruction.2 Deterioration in lung function impairs patients’ general health and quality of life and eventually leads to respiratory failure and premature death. Until recently, the progression of COPD was viewed as being inexorable and the disease as being refractory to therapeutic intervention. Contradicting this view, a convergence of evidence suggests that, although lung tissue damage in COPD appears to be permanent, the course of the disease can be altered through measures such as smoking cessation, pulmonary rehabilitation, and use of pharmacotherapy.3–6 Data showing that symptoms and frequency of exacerbations can be reduced and exercise capacity and health status can be improved with intervention have shifted the paradigm in COPD management such that the disease is now viewed as being preventable and treatable.4,5
In this new paradigm, early identification of COPD and aggressive approaches to treatment are regarded as being integral to optimizing outcomes.6 Primary care physicians, who are thought to provide care for the majority of patients with early or mild COPD, are crucial in efforts to prevent COPD and to diagnose it early.3 However, data suggest that COPD is underdiagnosed in primary care as it is in other health care settings. For example, in a recent study conducted in the primary care setting, 182 of 1960 patients (9.3%) were found to meet diagnostic criteria for COPD, but only 19% of those meeting the diagnostic criteria had been diagnosed and treated.7 Diagnosis of COPD is complicated by the fact that, during its initial, often prolonged stage, COPD symptoms can be confused with aging, de-conditioning, or symptoms of other chronic conditions and therefore not recognized as a respiratory issue by patients or their health care professionals.3
Diagnosis of COPD is based on objective evidence of airflow limitation, usually defined as a postbronchodilator forced expiratory volume in one second/forced vital capacity ratio (FEV1/FVC) < 0.70 associated with risk factors such as smoking and/or symptoms of chronic sputum production, wheezing, and dyspnea.2 If detection of COPD is to be improved in primary care, screening tools for detection of early symptomatic COPD prior to the onset of disabling symptoms are needed. Although necessary for diagnosing COPD, spirometry is not recommended as a screening tool as its benefits do not outweigh potential harms according to a recent evidence-based review conducted for the US Preventive Services Task Force.8 Since that review, one study has suggested that giving patients their lung age rather than just the FEV1 from spirometry testing doubled smoking quit rates.9 This result suggests a possible benefit of spirometry screening beyond the diagnosis of COPD. Further evaluation is necessary before the possible smoking-cessation benefit can justify widespread spirometry screening.
Until then, a screening tool for detection of people appropriate for spirometry evaluation should be brief, self-completed, and easy to administer and score and must have high sensitivity and reasonable specificity for spirometry-confirmed airflow obstruction. This paper describes the item-selection and item-reduction phases of the development of the Lung Function Questionnaire (LFQ), designed as a patient-completed screening tool that can be used efficiently in primary care settings to detect those appropriate for spirometry testing for airflow obstruction. Future studies will be required to validate the use of the LFQ in primary care practice.
The initial development of the LFQ occurred in two phases: 1) an empirical item-selection and item-reduction phase during which candidate questionnaire items were identified and their accuracy evaluated and 2) a qualitative phase to assess for content validity and face validity.
The study sample was a subset of the third National Health and Nutrition Examination Survey (NHANES III), a US population-based survey conducted from 1988 to 1994.10 The survey involved 33,994 respondents who were interviewed in their homes and then invited to a mobile examination center for a medical examination that included a physical examination, completion of several questionnaires or interviews, and tests and procedures including spirometry. To be included in the current study, respondents had to be at least 40 years old and to have a self-reported diagnosis of chronic bronchitis (CB), defined as an affirmative answer to the question, “Has a doctor ever told you that you had chronic bronchitis?” No questions on self-reported COPD or emphysema were included in NHANES III.
Patients with airflow obstruction, defined as prebronchodilator FEV1/FVC < 0.70, were compared with patients without airflow obstruction with respect to age; gender; smoking history; and presence of phlegm, dyspnea, wheeze, and cough. The groups were compared using the chi-square test for categorical variables and the t-test for continuous variables.
The first phase of the study involved evaluating eight candidate items for potential inclusion in the LFQ. The eight candidate items, which were based on known risk factors for airflow obstruction, were assessed for accuracy in correctly identifying individuals with airflow obstruction in the NHANES III sample. Selection of these items was based on literature reviews and clinical input. Stepwise selection procedures were conducted for eight base models based on varying cutoffs for the candidate items (Table 1). In each base model, the dependent variable was the presence of airflow obstruction; and the independent variables were age, body mass index, cough, phlegm, dyspnea, wheeze1, wheeze2, and smoke (Table 1 for variable definitions as captured in NHANES). Cough, phlegm, dyspnea, wheeze1, and wheeze2 were coded as binary variables (1 = yes; 0 = no). Smoke was coded as 1 if the respondent indicated smoking for at least 20 years; otherwise a value of zero was used. For each of the remaining two independent variables, two different cutpoints were used (Table 1). The base models were evaluated for classification accuracy in terms of sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (using probability cutoff of 0.500).
Next, stepwise logistic regression procedures were used to reduce the number of items and to identify the items most predictive of airflow obstruction (item reduction). In general, item choice was based on the results obtained with the base models, clinical relevance, and ease of administration of the questionnaire. The classification accuracy of eight reduced models obtained from the stepwise procedure was compared with that of the base models in terms of sensitivity, specificity, and area under the ROC curve (probability cutoff, 0.500).
The accuracy of the final (reduced) subset of items identified for inclusion in the LFQ was assessed using ROC analysis. The area under the ROC curve (AUC), which was obtained for all combinations of the candidate items, was used as the measure of accuracy. AUC values ≥ 0.7 were considered to reflect acceptable accuracy for detecting airflow obstruction.
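The AUC used here as the accuracy measure can be read as the probability that a randomly chosen respondent with airflow obstruction receives a higher model score than a randomly chosen respondent without it. The following is a minimal sketch of that rank-based calculation in Python. The function names and the tie-handling rule (ties count as one half) are assumptions for illustration, not the authors' statistical software.

```python
from typing import Sequence

ACCEPTABLE_AUC = 0.7  # acceptability threshold stated in the text


def auc_rank(scores: Sequence[float], obstructed: Sequence[int]) -> float:
    """Empirical AUC: share of (obstructed, unobstructed) pairs in which the
    obstructed respondent has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, obstructed) if y == 1]
    neg = [s for s, y in zip(scores, obstructed) if y == 0]
    if not pos or not neg:
        raise ValueError("need respondents with and without obstruction")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def is_acceptable(auc: float) -> bool:
    """Apply the AUC >= 0.7 criterion for acceptable accuracy."""
    return auc >= ACCEPTABLE_AUC
```

Under this criterion a model with AUC = 0.720 would pass, whereas one with AUC = 0.69 would not.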
As a sensitivity analysis, we redefined the dependent variable to reflect GOLD stage II disease (ie, fixed-ratio FEV1/FVC < 0.7 and FEV1 < 80% predicted, both prebronchodilator) for those aged 65 years and older. Subjects aged less than 65 years were classified as obstructed if FEV1/FVC < 0.7. Logistic regression models were used to identify the questionnaire items most predictive of obstruction; screening accuracy was also tested using ROC curve analyses. Additionally, a general population comprising individuals aged 40 years and older was used in a second sensitivity analysis to explore the performance of the candidate pool of items and whether the items selected would change (results not shown). This additional analysis was done to ensure that the properties of the LFQ items were not specific to a sample of patients with symptoms consistent with chronic bronchitis.
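The age-dependent definition of obstruction used in this sensitivity analysis can be written as a short rule. The sketch below is illustrative only: the function and parameter names are hypothetical, and FEV1 percent predicted is assumed to be supplied on a 0 to 100 scale.

```python
def is_obstructed_sa(age_years: float, fev1_fvc: float, fev1_pct_predicted: float) -> bool:
    """Sensitivity-analysis definition of airflow obstruction.

    Aged 65 and older: FEV1/FVC < 0.7 AND FEV1 < 80% predicted.
    Aged under 65:     FEV1/FVC < 0.7 alone.
    All values are prebronchodilator, as in NHANES III.
    """
    below_fixed_ratio = fev1_fvc < 0.7
    if age_years >= 65:
        return below_fixed_ratio and fev1_pct_predicted < 80.0
    return below_fixed_ratio
```

Under this rule, a 70-year-old with FEV1/FVC of 0.65 but FEV1 of 85% predicted is not counted as obstructed, whereas a 55-year-old with the same values is.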
To examine the minimum number of items associated with best accuracy, the accuracy of the model was examined according to different cutpoints of a summed scale derived from the subset of variables included in the best performing model. The predictor model with the highest AUC was considered to reflect the minimum score of the LFQ that most accurately predicts airflow obstruction. (Scoring will be tested further in the subsequent validation study, which will include testing of five-point response options for LFQ questions).
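The cutpoint search described above can be sketched as follows. Each respondent's summed score is the number of affirmative answers across the five items, and each candidate cutpoint defines a yes/no predictor ("score at or above the cutpoint"). For a single yes/no predictor the empirical ROC curve has one interior point, so its trapezoidal AUC equals (sensitivity + specificity) / 2. Whether the authors computed their AUCs in exactly this way is an assumption, and the item names are hypothetical labels.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical item names; each answer is coded 1 (yes) or 0 (no).
ITEMS = ("age", "smoke", "wheeze", "dyspnea", "phlegm")


def summed_score(answers: Dict[str, int]) -> int:
    """Number of affirmative answers across the five items (0 to 5)."""
    return sum(answers[name] for name in ITEMS)


def sens_spec(scores: Sequence[int], obstructed: Sequence[int],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'score >= cutoff'."""
    pos = [s for s, y in zip(scores, obstructed) if y == 1]
    neg = [s for s, y in zip(scores, obstructed) if y == 0]
    if not pos or not neg:
        raise ValueError("need respondents with and without obstruction")
    sensitivity = sum(s >= cutoff for s in pos) / len(pos)
    specificity = sum(s < cutoff for s in neg) / len(neg)
    return sensitivity, specificity


def auc_by_cutoff(scores: Sequence[int],
                  obstructed: Sequence[int]) -> List[Tuple[int, float]]:
    """AUC of each dichotomized predictor 'score >= cutoff', cutoffs 1 to 5."""
    table = []
    for cutoff in range(1, len(ITEMS) + 1):
        sensitivity, specificity = sens_spec(scores, obstructed, cutoff)
        table.append((cutoff, (sensitivity + specificity) / 2))
    return table
```

The cutpoint with the largest AUC in the returned table would then be taken as the minimum score suggesting airflow obstruction.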
The qualitative phase involved physician focus groups and one-on-one patient interviews to test further the face and content validity (clinical relevance) of the questions and response options. The physician focus groups comprised 16 primary care physicians (who were eligible to participate if they had treated at least 10 patients with CB per month and had a minimum of three years postresidency experience in practice) who were asked to review the screener and to provide feedback in focus groups conducted during February and March 2007 in Philadelphia, Pennsylvania. The focus groups were conducted by experienced moderators who used structured discussion guides.
After feedback from the focus groups was incorporated, the face validity of the resulting items was evaluated during two rounds of cognitive one-on-one interviews held during March 2007 in Collegeville, Pennsylvania, with 16 patients (eight per round). Patients had to be at least 40 years old and to have either a confirmed physician diagnosis of COPD or emphysema (round 1 of interviews) or a diagnosis of CB or self-reported productive, chronic cough on most days for at least three months of the year within the last two years (round 2 of interviews). One-on-one interviews were conducted using a standard “think-aloud” procedure, in which patients described their thought processes aloud, together with directed probes about the draft screener title, directions, and items. Interviews continued until saturation of information was reached.
Table 2 summarizes characteristics of the study sample for item evaluation and reduction. The sample included 387 patients, 51% of whom had airflow obstruction. Although all in the study sample had a self-reported diagnosis of CB, only 32% reported cough symptoms. Patients with airflow obstruction by spirometry compared with those without airflow obstruction were older; more likely to be male; to have smoked at least 20 years; and to have symptoms of phlegm, dyspnea, wheeze, and cough.
In stepwise selection procedures on the eight base models, AUC values ranged from 0.775 to 0.811 across models. Ranges for sensitivity and specificity were 57.8% to 64.9% and 68.5% to 83.8%, respectively. The following variables were statistically significant (P < 0.05) across models: age, dyspnea, wheeze1, and smoke. Although body mass index (BMI) (specifically, BMI < 25) was also statistically significant in stepwise analyses, it was only weakly related to prebronchodilator FEV1/FVC in linear regression analyses. This variable was eliminated from consideration for the reduced subset because of its low discriminatory power and because it is difficult to assess easily and reliably in a patient-reported questionnaire (BMI must be calculated from height and weight). Wheeze2 was chosen in favor of wheeze1 for inclusion in the reduced subset of potential LFQ items based on clinician advice suggesting that wheeze2 was the more clinically useful and specific item. In addition, AUC values of models including wheeze2 did not differ appreciably from those of models including wheeze1. Phlegm was not selected in any of the regression models. However, because of its clinical importance, phlegm was included in the questionnaire. Final items included in the questionnaire were age, dyspnea, wheeze2, smoke, and phlegm.
Table 3 shows the odds ratios for presence of airflow obstruction from multivariate logistic regression of the reduced subset of five potential LFQ items. The five items differentiated patients with airflow obstruction from those without airflow obstruction in the NHANES III database. After controlling for wheezing, dyspnea, phlegm, and smoking, the odds of airflow obstruction for those aged 50 years and older were more than three times the odds of airflow obstruction for those aged less than 50 years. The odds of airflow obstruction for those who had smoked for at least 20 years were 1.8 times the odds for those who had smoked for 0 to 19 years. The odds of airflow obstruction were 1.5, 2.0, and 1.5 times as high in those with wheeze, dyspnea, and phlegm, respectively, as in patients without these symptoms.
The model with all five items (variables) had the highest AUC (0.720) with sensitivity and specificity of 73.2% and 58.2%, respectively. Figure 1 shows the ROC curve for the best model (ie, that with five items). The ROC curve describes the accuracy of a test regardless of the decision threshold. Each point in the ROC plot represents the combination of sensitivity and specificity values generated by a different decision threshold.
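To make the relationship between decision thresholds and ROC points concrete, the sketch below sweeps every distinct predicted probability as a threshold, records the resulting (1 - specificity, sensitivity) pair, and integrates the curve with the trapezoidal rule. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and the input is any list of predicted probabilities with matching 0/1 obstruction labels.

```python
from typing import List, Sequence, Tuple


def roc_points(probs: Sequence[float], obstructed: Sequence[int]) -> List[Tuple[float, float]]:
    """One (false-positive rate, sensitivity) point per decision threshold,
    starting from (0, 0); a respondent is test-positive if prob >= threshold."""
    n_pos = sum(1 for y in obstructed if y == 1)
    n_neg = sum(1 for y in obstructed if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need respondents with and without obstruction")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, obstructed) if y == 1 and p >= threshold)
        fp = sum(1 for p, y in zip(probs, obstructed) if y == 0 and p >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear curve through the ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Choosing a single operating threshold, such as the probability cutoff of 0.500 used for classification in this study, corresponds to picking one point on this curve.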
Redefining the criteria for obstruction per the sensitivity analysis (SA) resulted in negligible changes in the overall results. Under the new definition, the accuracy of the LFQ (AUC = 0.709) decreased slightly, a result that shows that the items are not appreciably affected by the “aging lung” phenomenon. Also, in another SA using a base population of individuals aged 40 years and older, the performance of LFQ questions was very similar to performance in the CB population (data not reported). This finding suggests that the LFQ captures concepts related to airflow obstruction.
Table 4 shows the AUCs for logistic regressions using dichotomized predictors derived from the LFQ summed score. A score ≥3 (regardless of the combination of questions) on the LFQ scale yielded the highest AUC and suggests a risk of airflow obstruction. (Scoring will be tested further in a subsequent validation study for the LFQ).
Table 5 shows the LFQ items identified during the empirical phase of development and the changes made to these items in response to qualitative input from physicians and patients.
The 16 physicians who participated in focus groups to assess content validity of the final set of five LFQ items had been in practice a mean 13.9 years (range 3.5 to 30 years) and treated an average of 59.4 patients (range 20 to 150) with CB per month. All physicians practiced in the primary care setting (10 family practice, five internal medicine, one general practice).
Physicians’ review of the draft questionnaire resulted in modification of the directions for completion of the LFQ to enhance clarity as well as revision of the items on shortness of breath and phlegm (which physicians suggested be instead termed mucus) to enhance understanding (Table 5).
The majority of the 16 patients who participated in one-on-one cognitive interviews to assess face validity of the LFQ were female (61%). Patients indicated that items and concepts in LFQ were relevant to their disease and symptoms. Based on patient feedback, changes were made to the order of the items until patients brought up no new information. When the questions regarding smoking were presented first, respondents indicated they felt it was a “smoking questionnaire” and felt threatened/attacked by it. Re-ordering of the same items addressed this concern (Table 5). Other significant changes based on patient input included simplification of the instructions and development of more precise wording for the items and response options.
Previously regarded as an inexorably progressive disease that is refractory to treatment, COPD is now understood to be treatable through measures such as smoking cessation, pulmonary rehabilitation, and use of pharmacotherapy.3–6 Early identification of COPD is crucial to treatment efforts. Because spirometry is not practical as a screening tool in many healthcare settings,8 alternatives to spirometry are needed to screen patients for COPD. In this study, a set of items that accurately identifies patients with spirometry-based airflow obstruction, the primary manifestation of COPD, was identified for potential inclusion in the LFQ. Items related to age; occurrence of wheezing, phlegm, and dyspnea; and smoking history were identified for inclusion in the LFQ based on the statistical analysis, expert advice regarding the clinical relevance of the candidate items, and ease of administration of items. The final set of items in this initial development of the LFQ achieved a classification accuracy (AUC) of 0.72, a value that reflects fair accuracy. The balance of sensitivity (73.2%) and specificity (58.2%) was good, as a relatively greater focus on sensitivity is desirable in noninvasive screening tools such as the LFQ. The modest specificity nonetheless has practical implications for the LFQ in its current form: it implies a higher number of false positives and therefore a burden of time and cost for physicians.8 It should be recognized that these results constitute only a preliminary exploration of the characteristics of these questions as posed in the NHANES III survey (yes/no format). A subsequent validation study will examine these properties in an independent sample of respondents in a primary care practice and will further refine the LFQ so that this burden of false positives is minimized.
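The false-positive burden mentioned above depends on how common airflow obstruction is among the people screened. As an illustration only, the sketch below applies Bayes' rule to the reported sensitivity (73.2%) and specificity (58.2%) to obtain positive and negative predictive values at a given prevalence. The prevalence values are assumptions supplied by the reader; the function names are hypothetical, and the predictive values it yields are derived arithmetic, not results reported by the study.

```python
SENSITIVITY = 0.732  # reported for the five-item model
SPECIFICITY = 0.582  # reported for the five-item model


def positive_predictive_value(prevalence: float,
                              sensitivity: float = SENSITIVITY,
                              specificity: float = SPECIFICITY) -> float:
    """Share of screen-positive respondents who truly have obstruction."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(prevalence: float,
                              sensitivity: float = SENSITIVITY,
                              specificity: float = SPECIFICITY) -> float:
    """Share of screen-negative respondents who truly lack obstruction."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)
```

At the 51% prevalence seen in the study sample, the positive predictive value works out to roughly 0.65. If prevalence were closer to the 9.3% found in the primary care study cited earlier (an assumption about the screened population, not a result of this study), it would fall to roughly 0.15, which illustrates why specificity matters in lower-prevalence settings.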
In previous research, an initially promising COPD diagnostic questionnaire based on airflow obstruction was found not to be externally valid in a high-risk population comprising middle-aged current smokers,11 a finding that highlights the importance of thorough external validation studies.
The sample for the empirical phase of the study involved individuals at least 40 years old with a self-reported diagnosis of CB who had participated in the US population-based NHANES III survey.10 This sample has been previously used to describe the epidemiology of airflow obstruction in the general population12 and to assess the usefulness of screener questions to identify those having airflow obstruction.13 The representativeness of the sample suggests that the findings of the current study are widely generalizable. Individuals aged 40 years or older were chosen for study because they are the target population for COPD screening; COPD is rare in those younger than 40 years. This study included patients reporting chronic bronchitis to ensure item selection in an “at-risk” sample. An initial pool of items selected from a general population might not have captured characteristics that a sample more “at risk” for COPD would reveal. To explore the impact of this choice on the items, the same models were tested and the same regression procedures were performed on a general population without the restriction to chronic bronchitis. No appreciable impact on results or on items selected was found. Lung capacity is known to diminish with age.14 Therefore, a classification scheme relying on FEV1/FVC < 0.7 across all age groups is likely to result in large false-positive rates among elderly respondents (aged 65 and older). To address this “aging lung” phenomenon, a separate analysis was conducted in which the dependent variable of airflow obstruction was redefined as fixed-ratio FEV1/FVC < 0.7 and FEV1 < 80% predicted for individuals 65 years and older. The dependent variable for those under 65 years remained the same, that is, fixed-ratio FEV1/FVC < 0.7. Results (not shown) remained fairly consistent, and the final LFQ items remained unchanged.
As discussed previously, the authors also examined whether these variables would perform in a similar fashion within a multivariate setting in a general NHANES population aged 40 years and older regardless of chronic bronchitis diagnosis. Results were fairly similar and further strengthened the choice of the items from the primary analysis.
Evidence suggests that continuum-based scales have better psychometric properties than dichotomous Yes/No scales.15 In order to investigate this further, in addition to the Yes/No answer format used in the present study, questions with five-point Likert-type response scales were included in a subsequent validation study (the subject of a separate manuscript). The objective of the validation study is to further ascertain the screener’s psychometric properties, including screening accuracy, and to determine the performance of the Yes/No response options compared with a five-point scale.
The current research extends previous findings establishing the feasibility of using screening questionnaires to identify those at risk of airflow obstruction or COPD.13,16–22 The LFQ is being developed to improve upon many COPD screening tools19,21,22 by being easy to self-administer, by not requiring interviewer administration or information from medical records, and by being broadly useful across patient types and settings rather than being targeted only to a particular population (eg, smokers). The LFQ demonstrated good content validity and face validity among physicians and patients during the qualitative phase of assessment. With these characteristics, the LFQ should be particularly appropriate for use in the primary care setting. The LFQ should also be useful as an initial screener in epidemiological studies, disease management programs, and clinical research. In ongoing research, the LFQ is being further validated among primary care providers and patients.
Several other screening tools for COPD have been explored.13,16–25 The LFQ is unique among existing tools in having demonstrated both content validity and face validity, which are critical to an instrument’s utility in clinical practice. Both patient input and physician input were integral in the establishment of face validity and content validity of the LFQ. The content of items was both driven and confirmed by patients and physicians. Furthermore, question and response options were refined based on patient and physician feedback in order to maximize their relevance to patients and to the disease of interest. Input from primary care clinicians was particularly useful in shaping the instrument to be practical for use in the target setting. Through a sequential process, items that were selected from NHANES using a statistical model were further refined by qualitative patient and physician input. Within the realms of instrument development, this follows accepted methodology. The qualitative step in no way precludes the screening properties of the LFQ obtained from the first step. The LFQ also differs from existing screening tools for COPD in the extent to which its psychometric properties and clinical utility are being refined in a sequential process. The study described herein is one of a program of studies designed to refine and validate the LFQ.
Many questionnaires have been developed using information from specialty populations in the United States. The NHANES survey is a very large representative sample of US patients who are arguably comparable to primary care populations in the United States. The NHANES data were used to select questions for further validation studies. As lung function measurements were available in these patients, they seemed appropriate to consider as a group that had reported physician diagnosis of chronic bronchitis. Furthermore, patients with self-reported chronic bronchitis were selected in order to be able to discern most relevant items predicting airflow obstruction. Also, because the majority of patients with early, undiagnosed COPD (the targets of this questionnaire) are passed off as having smoker’s cough or chronic bronchitis, the initial pool of questions was developed using this population. This sample was felt to provide more disease-specific inputs for further testing. While COPD is underdiagnosed in primary care, it is also likely incorrectly diagnosed without the use of spirometry in many practices. Therefore, this group is appropriate to include in the question selection. The entire process of item selection was also repeated using a general population aged 40 years and older to examine any changes in item selection as a sensitivity analysis. This analysis did not change the pool of items selected.
The LFQ is being developed to help health care professionals screen for obstructive lung disease manifested by prebronchodilator FEV1/FVC < 0.70, a likely marker for COPD. As a screening tool, the LFQ can help health care providers identify patients in need of further evaluation for possible COPD but is not intended as a diagnostic tool. Patients whose LFQ score suggests the presence of airflow obstruction require clinical evaluation and spirometric assessment to assess for COPD.
This study should be interpreted in the context of the limitation that it was conducted in individuals who self-reported a diagnosis of CB. While inclusion of only these patients was useful in profiling the performance of LFQ items in the target population, the performance of LFQ items in nonselected samples is also of interest. Additional validation studies are needed to assess the performance of LFQ items in community-based samples that include individuals without self-reported CB. Another limitation of this study is the use of prebronchodilator spirometry as a criterion measure. It is clinically accepted that postbronchodilator spirometry, after accounting for reversibility, may be a better measure of lung function than prebronchodilator spirometry. The use of prebronchodilator spirometry was dictated by the source of data for this study – the NHANES III survey. The NHANES III survey captured prebronchodilator spirometry, but not postbronchodilator spirometry. In subsequent validation studies of the LFQ, postbronchodilator spirometry will be used as a criterion measure. This change is not expected to result in major changes in the characteristics or performance of the LFQ. Information around the recall period was not available for these questions (primarily because these questions within NHANES were captured in yes/no format). However, information around recall period was addressed in patient cognitive interviews as well as physician focus groups. Feedback did not suggest that absence of recall was necessarily a handicap owing to simplicity of questions and concepts being explored.
In summary, the five-item LFQ can be used in the primary care setting as a patient-completed screening tool to identify patients with a high risk of airflow obstruction. The LFQ had adequate accuracy, sensitivity, and specificity in a sample comprising individuals with self-reported CB and had good content and face validity according to primary care physicians and patients. The LFQ is a good candidate tool to facilitate earlier recognition of COPD. Further validation efforts to improve upon scoring and confirm screening accuracy are needed to establish this tool as an aid in primary practice.
GlaxoSmithKline (GSK) funded this study. Dr Dalal is an employee of GSK. Dr Mapel has received research grants from and served as a consultant for GSK, Pfizer Pharmaceuticals, and Boehringer-Ingelheim, Inc. The authors acknowledge Jane Saiers, PhD, for assistance with writing the manuscript. GSK funded Dr Saiers’ work. Dr Mintz has been a speaker and advisor for GSK, AstraZeneca, and Sepracor. He has also been a speaker for Takeda and Pfizer. Dr Hanania has received research grant support and served as a consultant or speaker for GSK, Dey, Sepracor, Novartis, and Boehringer Ingelheim. Dr Mannino serves on advisory boards for Boehringer Ingelheim, Pfizer, GSK, and Ortho Biotech; is on the speakers bureau for Boehringer Ingelheim, Pfizer, GSK, and Dey; and has received research grants from GSK, Novartis, and Pfizer. Dr Donohue has been associated with GSK in the following capacities: advisory board, consultant, speakers bureau and research contracts. Dr Samuels has served as a consultant for GSK. Dr Yawn has served on advisory committees for COPD for AstraZeneca and Pfizer and has been a consultant to GSK for research study development. Dr Yawn has received research funding from GSK and Novartis for investigator initiated research and clinical trials from GSK and BI/Pfizer in the area of COPD. Dr Martinez is a consultant for Altana and has also served on advisory boards for Genzyme, GSK, Novartis, Schering Plough, AstraZeneca, and Forrest/Almirall. Dr Martinez is a member of the speakers bureau for GSK, AstraZeneca, Schering Plough, and Boehringer Ingelheim.
Some of the data described in this manuscript were presented at CHEST 2008, the annual meeting of the American College of Chest Physicians, held October 25 to 30 in Philadelphia, PA, USA.