

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
J Occup Environ Hyg. Author manuscript; available in PMC 2010 June 2.
Published in final edited form as:
J Occup Environ Hyg. 2009 June; 6(6): 324–331.
doi:  10.1080/15459620902836856
PMCID: PMC2879629

Validity and Reliability of an Occupational Exposure Questionnaire for Parkinsonism in Welders


This study assessed the validity and test-retest reliability of a medical and occupational history questionnaire for workers performing welding in the shipyard industry. This self-report questionnaire was developed for an epidemiologic study of the risk of parkinsonism in welders. Validity participants recruited from three similar shipyards were asked to give consent for access to personnel files and complete the questionnaire. Responses on the questionnaire were compared with information extracted from personnel records. Reliability participants were recruited from the same shipyards and were asked to complete the questionnaire at two different times approximately 4 weeks apart. Percent agreement, kappa, intraclass correlation coefficient (ICC), and sensitivity and specificity were used as measures of validity and/or reliability. Personnel files were obtained for 101 of 143 participants (70%) in the validity study, and 56 of the 95 (58.9%) participants in the reliability study completed the retest of the questionnaire. Validity scores for items extracted from personnel files were high. Percent agreement for employment dates and job titles ranged from 83–100%, while ICC for start and stop dates ranged from 0.93–0.99. Sensitivity and specificity for current job title ranged from 0.5–1.0. Reliability scores for demographic, medical and health behavior items were mainly moderate or high, but ranged from 0.19 to 1.0. Most recent job/title items such as title, types of welding performed, and material used showed substantial to perfect agreement. Certain determinants of exposure such as days and hours per week exposed to welding fumes demonstrated mainly moderate agreement (κ = 0.42–0.47, percent agreement 63–77%); however, mean days and hours reported did not differ between test and retest. The results of this study suggest that participants’ self-report for job title and dates employed are valid compared with employer records. 
While kappa scores were low for some medical conditions and for caffeine consumption, high kappa scores for job title, dates worked, types of welding, and materials welded suggest participants generated reproducible answers important for occupational exposure assessment.

Keywords: parkinsonism, self-report questionnaire, test-retest reliability, validity, welding exposures


Parkinsonism is the presence of clinical features of Parkinson’s disease (PD) and includes bradykinesia, rigidity, tremor (typically at rest), and postural instability. Parkinson’s disease is one of the most common causes of parkinsonism, affecting approximately one million people in North America, although several other diseases may manifest similar symptoms.

Some material safety data sheets (MSDS) for welding consumables list parkinsonism as a potential hazard of welding, but no definitive epidemiologic evidence has been published to support this risk. Several case-control and population-based studies that have reported evidence against a relationship between parkinsonism and welding have methodological limitations, including small sample sizes and limited attempts at dose reconstruction.(1–3)

A preliminary study by Racette et al.(4) found a substantially higher prevalence odds ratio for parkinsonism in welders in three standard occupational codes when compared with a reference population. Nevertheless, there remains substantial controversy regarding the relationship between welding and parkinsonism or PD,(5–7) justifying the need for further epidemiologic study.

To facilitate efficient screening of the large number of welders needed to conduct this type of study, a self-report questionnaire was developed to assess medical and work histories, including questions on other risk factors for parkinsonism. Since retrospective occupational exposure reconstruction will be determined largely from these self-reports in the absence of detailed company-based quantitative exposure measurements, it is essential that the answers generated from the questionnaire are both valid and reliable to avoid serious exposure misclassification.

Although other questionnaire modules have been developed to assess welding exposures,(8) and other studies have successfully validated self-reports of workers in the shipyard industry,(9, 10) no other occupational epidemiologic studies have specifically assessed validity and reliability of questions that pertain to parkinsonism and welding exposures. The purpose of this study was to assess the validity and test-retest reliability of self-reported questionnaire items related to parkinsonism and lifetime occupational exposures to welding fumes in the shipyard industry.


All work performed as part of this study was approved by the Institutional Review Boards of Saint Louis University and Washington University School of Medicine. A welding exposure questionnaire was adapted from one used in previous published studies of welders(4) with additional items generated from the National Cancer Institute (NCI) exposure survey,(8) input from International Brotherhood of Boilermakers (IBB) partners, and with questions on known PD confounders.(11) The NCI welding questionnaire is a module from a larger questionnaire intended for use in case-control studies of chronic occupational exposures and has been used successfully for the evaluation of exposures in working populations.(8)

The questionnaire developed for use in this study consists of questions regarding demographic information and medical history/behaviors, including specific questions about previous diagnosis or family history of parkinsonism, and PD-specific questions derived from a validated questionnaire.(12) The exposure assessment component of the questionnaire includes a detailed work history with a listing of employer, start and end dates, and job title for each job held. Questions focus on determinants of exposure to welding fumes (such as days and hours exposed to specific types of welding); metals welded; types of rods used; and the work environment (ventilated, outdoors, confined space) during the participants’ work history. Participants indicated the amount of time usually spent around welding fumes and which types of welding, materials, and conditions were typical during their work in each job.

Assessment of the validity and reliability of the questionnaire was completed in three phases: (1) face validity and usability, (2) validity, and (3) test-retest reliability. In Phase 1, industry partners in the IBB reviewed the questionnaire to ensure that the questions reflected the actual practices of the facilities to be studied. Changes were made to the questions, particularly with regard to wording about specific exposures encountered in the shipyards and other welding sites, changes in welding exposures over time, and the format of the questions.

In addition, the IBB supplied MSDSs and names of welding processes and materials used specifically by shipyard workers. In Phase 2, with assistance from the IBB, participants employed in three similar shipyards were recruited for validity testing of the questionnaire. Participants signed a release permitting review of their personnel records.

Questionnaire responses were then compared with information contained within an individual’s employment record, where the employment record was considered the gold standard of accuracy. After completing the questionnaire, participants were asked to further participate in a focus group-style interview concerning ease of use and appropriateness and clarity of questions and nomenclature used. In Phase 3, participants were recruited in a similar manner from the same facilities to complete the test-retest reliability assessment.

Those who completed the initial questionnaire were then requested to complete the same questionnaire approximately 1 month later. Participants consented to follow-up at the time of enrollment, but to minimize the impact on recall, they were not told they would be asked to complete the same questionnaire until the time of the follow-up study visit. Responses from the two completed questionnaires were then compared for reliability purposes.

Data Analysis

Both validity and test-retest reliability were measured for questionnaire items by percent agreement and either Cohen’s kappa statistic or intraclass correlation coefficient (ICC). Kappa (κ) is a function of the ratio of agreements to disagreements in relation to expected frequencies for nominal data.(13) Questionnaire items with continuous data responses were measured with ICC to determine the correlation between test and retest responses. Sensitivity and specificity were also calculated for job titles, using the personnel record as the gold standard.
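The two agreement measures for nominal items can be illustrated with a minimal sketch. It assumes the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the marginal frequencies. The function names and the sample responses are hypothetical and are not taken from the study, which ran its analyses in SPSS.

```python
from collections import Counter


def percent_agreement(test, retest):
    """Share of paired responses that are identical, as a percentage."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(a == b for a, b in zip(test, retest))
    return 100.0 * matches / len(test)


def cohens_kappa(test, retest):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e), for two sets of nominal responses."""
    n = len(test)
    p_o = percent_agreement(test, retest) / 100.0
    test_counts = Counter(test)
    retest_counts = Counter(retest)
    # Chance agreement: product of the marginal proportions, summed over categories.
    p_e = sum(test_counts[c] * retest_counts[c] for c in test_counts) / (n * n)
    if p_e == 1.0:
        # Every response falls in one category; kappa is undefined.
        # Returning 1.0 here is an assumption made for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical job-title responses at test and retest (not study data).
test = ["welder", "welder", "fitter", "painter", "welder", "fitter"]
retest = ["welder", "fitter", "fitter", "painter", "welder", "welder"]

agreement = percent_agreement(test, retest)
kappa = cohens_kappa(test, retest)
```

In this made-up example four of six pairs match, yet kappa is noticeably lower than the raw agreement because part of that agreement would be expected by chance.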

It is generally accepted that a κ of 0.0–0.40 indicates poor to slight agreement, 0.41–0.59 indicates moderate agreement, 0.60–0.79 indicates substantial agreement, ≥ 0.80 indicates outstanding agreement, and 1.0 indicates perfect agreement.(14) ICC was interpreted similarly: the closer the ICC was to 1.0, the higher the correlation (and the smaller the variance) between test and retest responses. Disagreements on chronic exposures on the order of days or months are less likely to influence cumulative dose reconstruction; therefore, similar to other studies comparing dates of employment,(10, 15, 16) two dates being compared for either validity or reliability were considered in agreement when the dates were within 1 year of each other. All analyses were performed in SPSS v.14.0.
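The interpretation bands and the 1-year tolerance for dates can be expressed as a short sketch. Two assumptions are made here that the text does not state: values falling in the small gaps between the quoted ranges (for example 0.405) are assigned to the next higher band, and negative kappa values are labeled "poor to slight". Comparing dates at the resolution of calendar years is also an assumption; the function names are hypothetical.

```python
def agreement_band(kappa):
    """Label a kappa value using the cut-points quoted in the text."""
    if kappa >= 1.0:
        return "perfect"
    if kappa >= 0.80:
        return "outstanding"
    if kappa >= 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    # Assumption: negative values are grouped with the lowest band.
    return "poor to slight"


def years_agree(self_report_year, record_year, tolerance=1):
    """Treat two years as in agreement when they differ by at most `tolerance`."""
    return abs(self_report_year - record_year) <= tolerance


# Kappa values reported in the Results, labeled with the bands above.
labels = {k: agreement_band(k) for k in (0.35, 0.47, 0.64, 0.86, 1.0)}
```

Under this rule, a self-reported start year of 1994 agrees with a recorded year of 1995 but not with 1996.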



Of the 143 participants recruited for questionnaire validation, personnel files were obtained for 101 (70.6%). Table I contains demographic information of the validity participants. Percent agreement between the self-reported start date of the current job and the employer record was high (91.0%), and the start dates were highly correlated (ICC = 0.93).
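For continuous items such as start year, an intraclass correlation coefficient can be computed from paired values. The text does not say which ICC form was selected in SPSS, so the sketch below assumes the one-way random-effects, single-measure form, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) MSW) with k = 2 measurements per participant. The function name and the sample years are hypothetical.

```python
def icc_oneway(pairs):
    """ICC(1,1) from a one-way random-effects ANOVA on paired measurements.

    `pairs` is a list of (first, second) values, one tuple per participant.
    """
    n = len(pairs)
    k = 2  # two measurements per participant
    if n < 2:
        raise ValueError("need at least two participants")
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    # Between-participant mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square.
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all values are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical (self-reported, recorded) start years, not study data.
start_years = [(1990, 1990), (1985, 1986), (2001, 2001), (1978, 1977)]
icc = icc_oneway(start_years)
```

When the two sources differ by at most a year while participants differ from one another by decades, almost all variance lies between participants and the coefficient approaches 1.0.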

Table I. Demographic Characteristics of Shipyard Workers Participating in Validity and Reliability Studies

Furthermore, the mean elapsed time between start date and questionnaire completion by self-report (mean 12.6 years) did not differ significantly from this elapsed time based on employer records (mean 12.1 years, p = 0.16). Sensitivity was high for job titles welder (1.0), steelworker (0.75), pipe/ship fitter (1.0), electrician (0.89), and machinist (0.71) but was low for painter (0.5). Specificity was high for all job titles (range = 0.96–1.0) (Table II).
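Sensitivity and specificity for a given job title can be sketched as follows, treating the personnel record as the gold standard. The function name and the six paired titles are hypothetical; they are constructed only so that "painter" yields a sensitivity of 0.5 and do not reproduce the study's individual records.

```python
def sensitivity_specificity(self_reported, recorded, title):
    """Sensitivity and specificity of self-report for one job title.

    `recorded` (the personnel file) is treated as the gold standard.
    Returns None for a measure whose denominator is zero.
    """
    tp = fn = fp = tn = 0
    for reported, record in zip(self_reported, recorded):
        in_record = record == title
        in_report = reported == title
        if in_record and in_report:
            tp += 1  # title in record and self-reported
        elif in_record:
            fn += 1  # title in record but not self-reported
        elif in_report:
            fp += 1  # self-reported but not in record
        else:
            tn += 1  # absent from both
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical paired titles (not study data).
recorded = ["painter", "painter", "painter", "painter", "welder", "steelworker"]
reported = ["painter", "painter", "steelworker", "welder", "welder", "steelworker"]

painter = sensitivity_specificity(reported, recorded, "painter")
steelworker = sensitivity_specificity(reported, recorded, "steelworker")
```

With only four recorded painters, each misreported title moves sensitivity by 0.25, which is one reason estimates for small job categories are unstable.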

Table II. Validity Measures Between Self-Reported Questionnaire Items and Employer Records for Shipyard Workers

There were 18 cases where a participant reported a previous job held within the same company as his current job. Start and end dates for the past job also showed high agreement (% agreement = 88.9–91.6). Because the number of participants in each past job title was small, sensitivity and specificity were calculated for “welder” and “other title.” Sensitivity and specificity for “welder” were both 1.0, while sensitivity and specificity for “other title” were 0.75 and 0.80, respectively (Table II).


Of the 95 participants recruited to complete the questionnaire for reliability analysis, 56 (58.9%) completed the second questionnaire. There was no difference in age, education level, race, or welder status between subjects who completed only the initial test and those who completed the test and retest (data not shown). Demographic information for those who completed both the test and retest can be found in Table I.

Kappa scores for demographic characteristics demonstrated outstanding to perfect agreement (κ 0.91–1.0, % agreement = 95–100). Most medical condition items showed substantial to perfect agreement, except for the presence of amyotrophic lateral sclerosis (ALS), depression, and head injury, all of which had moderate agreement. Presence of thyroid disease showed less than moderate agreement; however, the response percent agreement between questionnaires was very high. Questions specific for parkinsonism demonstrated high percent agreement (87.5–100%) with kappa scores mostly in the moderate range, with some substantial and one perfect score (Table III).
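A combination of very high percent agreement and low kappa, as seen for thyroid disease, often arises when a condition is rare, because most of the observed agreement is then expected by chance. The text does not itself give this explanation, so the sketch below is only an illustration of that general property of kappa. The 2×2 counts are hypothetical (chosen to total 56, the number of retest participants) and are not the study's thyroid data.

```python
def kappa_from_2x2(both_yes, yes_then_no, no_then_yes, both_no):
    """Observed agreement and Cohen's kappa from a 2x2 test-retest table."""
    n = both_yes + yes_then_no + no_then_yes + both_no
    if n == 0:
        raise ValueError("table is empty")
    p_o = (both_yes + both_no) / n
    test_yes = both_yes + yes_then_no
    retest_yes = both_yes + no_then_yes
    # Chance agreement from the marginal yes/no totals.
    p_e = (test_yes * retest_yes + (n - test_yes) * (n - retest_yes)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all responses are identical")
    return p_o, (p_o - p_e) / (1.0 - p_e)


# Hypothetical counts for a rare condition among 56 participants.
p_o, kappa = kappa_from_2x2(both_yes=1, yes_then_no=2, no_then_yes=1, both_no=52)
percent = 100.0 * p_o
```

Here 53 of 56 responses agree, yet kappa falls below the moderate band because nearly all of that agreement comes from the many participants who answered "no" both times.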

Table III. Medical History Test-Retest Reliability for Shipyard Workers

Reliability of questions related to potential PD risk factors, alcohol, caffeine, and nicotine, varied from κ = 0.43–0.72. Frequency of alcohol consumption showed substantial agreement (κ = 0.70), while amount of consumption showed moderate agreement (κ = 0.56). Nicotine (cigarette, pipe, cigar, and chewing tobacco use) demonstrated mainly substantial to outstanding agreement. However, caffeine (coffee, tea, cola, and chocolate) demonstrated poor agreement, except for questions about amount of coffee (κ = 0.72) and decaffeinated coffee (κ = 0.43) consumed per day (Table IV).

Table IV. Health Behaviors and Exposures Test-Retest Reliability for Shipyard Workers

Reliability analysis for most recent employment (Table V) and previous jobs was performed separately with the expectation that agreement for the most recent job would be higher than for past jobs. Most recent job kappa and ICC scores for the name of company, start year, final year, and job title, were outstanding to perfect (κ or ICC = 0.86–1.0, % agreement = 91.1–100). Subjects were asked to categorize their position as a welder, welder helper, around welding activities or not around welding activities, along with frequency of exposure and use of a respirator while around fumes.

Table V. Current Job Characteristics Test-Retest Reliability for Shipyard Workers

Responses to specific job titles obtained outstanding agreement (κ = 0.86, % agreement = 96.4). Participants were asked to further classify their title as “welder,” “welder helper,” or “around welding.” Classifications of welder and around welding showed substantial and outstanding agreement, but responses to welder helper status were not as consistent (κ = 0.35, % agreement = 89.2). Days per week around welding fumes and hours per day around welding fumes showed moderate agreement (κ = 0.47 and 0.42, respectively).

However, disagreements did not affect overall mean days (4.37 days vs. 4.29 days, p = 0.70) or hours (4.61 hr vs. 4.63 hr, p = 0.90) reported between test and retest. Percent of time around welding in a ventilated space or in a confined space had substantial and outstanding agreement (κ = 0.64 and 0.83, respectively); however, percent of time performing welding tasks outside demonstrated lower agreement (κ = 0.35). Respirator use during welding exposure had substantial agreement, yet responses to frequency of use and specific type of respirator used showed less reproducibility (Table V). Agreement for type of electrode used in the welding process was moderate to substantial, and agreement for metals welded and type of welding was substantial to outstanding.

Reliability scores were also determined for responses identifying the previous two jobs the participants held. Of the 56 participants, 40 reported at least one additional job (Job 2), 18 (45%) of which were welding related. In addition, 18 participants reported having a third job (Job 3), six (33.3%) of which were welding related. Reliability analysis was not performed beyond Job 3 due to lack of data. In most cases, all responses for Job 2 and Job 3 had similar agreement patterns to most recent job in regard to company name, start year, final year, job title, welding status, days per week, and hours per day. Agreement for use of specific materials, metals welded, and types of welding was lower in Job 2 (poor to moderate) than in the most recent job (moderate to outstanding). However, agreement for the use of specific materials, metals welded, and types of welding were outstanding to perfect in Job 3 (data not shown).


This study determined the validity and reliability of a self-report questionnaire developed for an epidemiologic study of parkinsonism in shipyard welders. Reproducibility of parkinsonian signs and symptoms as well as parkinsonism confounders was generally moderate to high, with the exception of caffeine consumption questions. Coffee showed the highest reproducibility and was the type of caffeine most commonly reported as consumed in this population.

Other studies have shown that self-reports of foods and beverages less regularly consumed tend to have lower agreement and correlations between test and retest.(17, 18) This may explain the lower percent agreement and kappa scores for the other caffeine (tea, cola, hot chocolate) questions in this study population. While these lower scores may be of concern when controlling for parkinsonism confounders, they will have no effect on occupational exposure assessment.

The strong agreement found between participant responses and employer records for start and stop dates and job title indicates that participants’ self-report will not lead to major misclassification regarding duration of employment in specific job titles in the larger epidemiologic study. Two other studies have attempted to validate shipyard workers’ self-reported work information with employer records. Stewart et al.(10) found somewhat lower agreement for start year (±1 year) (κ = 0.85) than in the present study (κ = 0.93). However, they state that poorer agreement was found with participants who left the shipyard more than 30 years prior to interview, and that the participants in the 65–69 age range had the poorest recall.

In the present study, only one participant who completed the validity questionnaire was in this age range, and all participants were still employed in the shipyard at time of completing the self-report questionnaire. The younger and currently employed study participants in the present study may account for the more accurate recall. The validity results within this present study are also somewhat higher for job start year (90.1% agreement) than another validity study (76% agreement) that compared self-reported work history information from 100 randomly selected, currently employed shipyard workers to information in their personnel files.(9)

Sensitivity (range = 0.50–1.0) and specificity (range = 0.96–1.0) for specific job titles in this study closely resemble those by Stewart et al.(10) (sensitivity range = 0.33–1.0 and specificity range = 0.95–1.0). In Stewart et al. and in the present study, job titles reported among false-negative responses were often titles in related jobs. For example, in the present study, when there were false-negatives found for “welder” and “fitter,” the correct classification was “steelworker.” The job description found within company records for the title of “steelworker” includes a wide range of duties, including simple welding, fitting, grinding, burning, bolting, etc., indicating that while there may be disagreement in the title, duties may overlap and exposures may be similar within these titles.

“Painter” had the lowest sensitivity in this study, but there were only four “painters” as classified by personnel files. The reason for disagreement is not clear; however, one company-classified “painter” self-reported the title of “steelworker,” a title that, according to the personnel file, this participant had held for 2 years, 30 years earlier.

The relatively strong agreement found in the test-retest reliability results indicates that estimates of exposure in the larger ongoing study will be reliable from self-reports. Medical history reliability tended to have high percent agreement (85.7 to 100%). This is similar to a study by Booth-Jones et al.(19) in which the medical history section had a combined percent agreement of 95.7. Responses about the presence of typical parkinsonian symptoms demonstrated high percent agreement (91–100%) and moderate to perfect kappa scores. Responses to Parkinson’s disease confounders such as tobacco and alcohol consumption patterns were moderately to highly reproducible, while reproducibility of responses to consumption of caffeine was lower. These results indicate that, overall, our welding questionnaire elicits reliable responses to confounders and medical history including symptoms specific to parkinsonian disorders.

Other studies have also evaluated the reliability of using occupational histories to accurately assign exposure, and have reported 78–88% agreement for measures such as job assignment and duties, occupational code, and job tenure.(19–22) In the present study, percent agreement for the determinants of exposure classification, such as length of employment (start and end dates), job title, welding status, days and hours per week around welding fumes, use of respirator, and types of metals and materials welded, ranged from 57.8–96.4 for the most recent job. Participants were less likely to consistently report their job classification as a “welder helper,” which is less definitive than the other classifications. Specific job titles, however, had outstanding agreement.

Participants were less likely to reproduce the same responses to the number of days welding per week (κ = 0.47) or welding hours per day (κ = 0.42). While the time between test and retest was relatively short (4 weeks), temporal variability in job tasks or duties from one test period to the next could have contributed to the disagreement between responses. However, the majority of disagreements for days per week around welding were within 1 day per week, and the mean days reported per week did not differ statistically between the first round of the questionnaire and the retest. Similarly, there was no statistical difference in mean hours around welding reported between questionnaires. Although the kappa scores were not high for these items, the fact that the differences in means were not significant indicates that the degree of the disagreements was not influential.

While still acceptable, kappa scores and percent agreement were generally lower for these variables for participants’ immediate past job or title (Job 2). For those who reported a previous job, the average time between ending the previous job and completing the first questionnaire in the reliability study was 19 years. Thus, this long period for recall of the previous job (Job 2) may account for lower agreement between responses on the two questionnaires. The average recall period may also explain the higher percent agreement and kappa scores for the six participants who reported holding a third job or title in the welding industry. Their responses for Job 3 showed higher agreement and kappa scores than the most immediate past job (Job 2); however, the mean time between ending the third job for these participants and completing the first test-retest questionnaire was 13 years, indicating they have worked more jobs but in shorter periods on average than the workers in Job 2.

There are several notable limitations of this study. An important limitation was that employment records could be obtained only for the company in which the participants were currently employed. Therefore, validity of information provided by participants pertaining to jobs or titles in other previous companies could not be determined. Company personnel records were considered the gold standard for comparison, but there could be instances where personnel records are incorrect. This is most likely rare but could lead to lower validity measures when the participant correctly reports the information.

Furthermore, company records did not contain important determinants of exposure, such as days or hours per week around welding fumes or type of welding performed/exposed, which are important measures of exposure assessment. As a result, validity of self-reports for these items could not be determined. In addition, the validity study included only a currently employed population; therefore, it could not be determined how well retirees from these shipyards could recollect and report employment dates and titles.

A major strength of the present study is the collaboration with and input from our industry partners for questionnaire development. This ensured that questions included terms that were commonly used in the work environments studied and were familiar to the study population. This type of collaboration likely improves the accuracy of workers’ responses(23) and may have led to the fairly high validity and reliability results generated from this study population.


The questionnaire developed for the larger epidemiologic study has produced both valid and reliable responses from the study population. The questionnaire has undergone a more thorough development process than any previous questionnaire reported in studies of parkinsonism and welding and will thus prove useful in the reconstruction of retrospective exposures.


The authors would like to thank Angela Birke and Susan Criswell for their support and involvement with this study. This work was supported by the Michael J. Fox Foundation, NIH grants K23NS43351 and ES013743, and the Greater St. Louis Chapter of the American Parkinson Disease Association.


1. Fored M, Fryzek C, Brandt L, et al. Parkinson’s disease and other basal ganglia or movement disorders in a large nationwide cohort of Swedish welders. Occup Environ Med. 2006;63:135–140.
2. Fryzek P, Hansen J, Cohen S, et al. A cohort study of Parkinson’s disease and other neurodegenerative disorders in Danish welders. J Occup Environ Med. 2005;47:466–472.
3. Kirkey L, Johnson K, Rybicki B, Peterson E, Kortsha G, Gorell J. Occupational categories at risk for Parkinson’s disease. Am J Ind Med. 2001;39:564–571.
4. Racette A, Tabbal B, Jennings D, Good L, Perlmutter J, Evanoff B. Prevalence of parkinsonism and relationship to exposure in a large sample of Alabama welders. Neurology. 2005;64:230–235.
5. Jankovic J. Searching for a relationship between manganese and welding and Parkinson’s disease. Neurology. 2005;64:2021–2028.
6. Santamaria B, Cushing A, Antonini J, Finley B, Mowat F. State-of-the-science review: Does manganese exposure during welding pose a neurological risk? J Toxicol Environ Health B Crit Rev. 2007;10:417–465.
7. Tanner MC. PD or not PD? That is the question. Neurology. 2003;61:5–6.
8. Stewart A, Stewart P, Heineman E, Dosemeci MO, Linet M, Inskip D. A novel approach to data collection in a case-control study of cancer and occupational exposures. Int J Epidemiol. 1996;25:744–752.
9. Bourbonnais R, Meyer F, Theriault G. Validity of self reported work history. Br J Ind Med. 1988;45:29–32.
10. Stewart FW, Tonascia AJ, Matanoski MG. The validity of questionnaire-reported work history in live respondents. J Occup Med. 1987;29:795–800.
11. Checkoway H, Powers K, Smith-Weller T, Franklin G, Longstreth W, Swanson P. Parkinson’s disease risks associated with cigarette smoking, alcohol consumption, and caffeine intake. Am J Epidemiol. 2002;155:732–738.
12. Duarte J, Claveria LE, de Pedro-Cuesta J, Sempere AP, Coria F, Calne DB. Screening Parkinson’s disease: A validated questionnaire of high specificity and sensitivity. Mov Disord. 1995;10:643–649.
13. Bartko JJ, Carpenter TW Jr. On the methods and theory of reliability. J Nerv Ment Dis. 1976;163:307–317.
14. Landis RJ, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
15. Baumgarten M, Siemiatycki J, Gibbs WG. Validity of work histories obtained by interview for epidemiologic purposes. Am J Epidemiol. 1983;118(4):583–591.
16. Brisson C, Vezina M, Bernard PM, Gingras S. Validity of occupational histories obtained by interview with female workers. Am J Ind Med. 1991;19(4):523–530.
17. Feskanich D, Rimm EB, Giovannucci EL, et al. Reproducibility and validity of food intake measurements from a semiquantitative food frequency questionnaire. J Am Diet Assoc. 1993;93(7):790–796.
18. Rimm EB, Giovannucci EL, Stampfer MJ, Colditz GA, Litin LB, Willett WC. Reproducibility and validity of an expanded self-administered semiquantitative food frequency questionnaire among male health professionals. Am J Epidemiol. 1992;135(10):1114–1126.
19. Booth-Jones AD, Lemasters GK, Succop P, Atterbury MR, Bhattachayra A. Reliability of questionnaire information measuring musculoskeletal symptoms and work histories. Am Ind Hyg Assoc J. 1998;59:20–24.
20. Brower SP, Attfield DM. Reliability of reported occupational history information for US coal miners, 1969–1977. Am J Epidemiol. 1998;148:920–926.
21. Rona JR, Mosbech J. Validity and repeatability of self-reported occupational and industrial history from patients in EEC countries. Int J Epidemiol. 1989;18:674–679.
22. Warneryd B, Thorslund M, Ostlin P. The quality of retrospective questions about occupational history: A comparison between survey and census data. Scand J Soc Med. 1991;19:7–13.
23. Teschke K, Olshan A, Daniels J. Occupational exposure assessment in case-control studies: opportunities for improvement. Occup Environ Med. 2002;59:575–593; discussion 594.