To evaluate the validity and reliability of the structured Composite International Diagnostic Interview (CIDI) in diagnosing current major depressive disorder (MDD) among East African adults.
A sample of 926 patients attending a major referral hospital participated in this diagnostic assessment study. We used a two-stage study design in which participants were first interviewed using an Amharic version of the CIDI, and a stratified random sample then underwent a follow-up semi-structured clinical interview conducted by a psychiatrist, blinded to the screening results, using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) instrument. We tested construct validity by examining the association between the CIDI and the World Health Organization Quality of Life (WHO-QOL) questionnaire. We calculated the psychometric properties of the CIDI using the SCAN diagnostic interview as a gold standard.
We found that the Amharic version of the CIDI diagnostic interview has good internal reliability (Cronbach’s alpha = 0.97) among Ethiopian adults. Compared to the SCAN reference standard, the CIDI had fair specificity (72.2%) but low sensitivity (51.0%). Our study provided evidence for unidimensionality of core depression screening questions on the CIDI interview with good factor loadings on a major core depressive factor.
The Amharic language version of the CIDI had fair specificity and low sensitivity in detecting MDD compared with psychiatrist-administered SCAN diagnosis. Our findings are generally consistent with prior studies. Use of fully structured interviews such as the CIDI for MDD diagnosis in clinical settings might lead to underdetection of DSM-IV MDD.
CIDI; Validation; Africa; Ethiopia; Depression; MDD
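The sensitivity and specificity reported above for the CIDI against the SCAN reference standard come from a 2x2 cross-classification. A minimal sketch of that calculation follows; the function name and the counts are hypothetical and are not the study's data, and the sketch ignores the sampling weights that a two-stage stratified design would normally require.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 counts against a reference standard."""
    # Sensitivity: share of reference-positive cases that the index test also flags.
    sensitivity = tp / (tp + fn)
    # Specificity: share of reference-negative cases that the index test also clears.
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=50, fn=50, tn=150, fp=50)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A low sensitivity with moderate specificity, as in the abstract, means that false negatives outnumber what a clinician would usually accept for case finding.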
Self-compassion is a key psychological construct for assessing clinical outcomes in mindfulness-based interventions. The aim of this study was to validate the Spanish versions of the long (26 item) and short (12 item) forms of the Self-Compassion Scale (SCS).
The translated Spanish versions of both subscales were administered to two independent samples: Sample 1 was comprised of university students (n = 268) who were recruited to validate the long form, and Sample 2 was comprised of Aragon Health Service workers (n = 271) who were recruited to validate the short form. In addition to SCS, the Mindful Attention Awareness Scale (MAAS), the State-Trait Anxiety Inventory–Trait (STAI-T), the Beck Depression Inventory (BDI) and the Perceived Stress Questionnaire (PSQ) were administered. Construct validity, internal consistency, test-retest reliability and convergent validity were tested.
The Confirmatory Factor Analysis (CFA) of the long and short forms of the SCS confirmed the original six-factor model in both scales, showing goodness of fit. Cronbach’s α for the 26-item SCS was 0.87 (95% CI = 0.85-0.90) and ranged between 0.72 and 0.79 for the 6 subscales. Cronbach’s α for the 12-item SCS was 0.85 (95% CI = 0.81-0.88) and ranged between 0.71 and 0.77 for the 6 subscales. The long (26-item) form of the SCS showed a test-retest coefficient of 0.92 (95% CI = 0.89–0.94). The Intraclass Correlation (ICC) for the 6 subscales ranged from 0.84 to 0.93. The short (12-item) form of the SCS showed a test-retest coefficient of 0.89 (95% CI = 0.87-0.93). The ICC for the 6 subscales ranged from 0.79 to 0.91. The long and short forms of the SCS exhibited a significant negative correlation with the BDI, the STAI and the PSQ, and a significant positive correlation with the MAAS. The correlation between the total score of the long and short SCS form was r = 0.92.
The Spanish versions of the long (26-item) and short (12-item) forms of the SCS are valid and reliable instruments for the evaluation of self-compassion among the general population. These results substantiate the use of this scale in research and clinical practice.
Self-compassion; Validation; Spanish; Mindfulness
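Cronbach's α, reported above for both SCS forms, can be computed from the item variances and the variance of the total score. The sketch below uses the standard formula with sample variances; the function name and the response matrix are hypothetical and chosen only so the result is easy to check by hand.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 perfectly consistent items.
data = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

Because the three hypothetical items move together exactly, α reaches its upper bound here; real questionnaire data would give a lower value.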
The Child Perceptions Questionnaire for children aged 11 to 14 years (CPQ11–14) is a 37-item measure of oral-health-related quality of life (OHRQoL) encompassing four domains: oral symptoms, functional limitations, emotional and social well-being. To facilitate its use in clinical settings and population-based health surveys, it was shortened to 16 and 8 items. Item impact and stepwise regression methods were used to produce each version. This paper describes the developmental process, compares the discriminative properties of the resulting four short-forms and evaluates their precision relative to the original CPQ11–14.
The item impact method used data from the CPQ11–14 item reduction study to select the questions with the highest impact scores in each domain. The regression method, where the dependent variable was the overall CPQ11–14 score and the independent variables its individual questions, was applied to the data collected in the validity study for the CPQ11–14. The measurement properties (i.e. criterion validity, construct validity, internal consistency reliability and test-retest reliability) of all 4 short-forms were evaluated using the data from the validity and reliability studies for the CPQ11–14.
All short forms detected substantial variability in children's OHRQoL. The mean scores on the two 16-item questionnaires were almost identical, while on the two 8-item questionnaires they differed by only one score point. The mean scores standardized to 0–100 were higher on the short forms than the original CPQ11–14 (p < 0.001). There were strong significant correlations between all short-form scores and CPQ11–14 scores (0.87–0.98; p < 0.001). Hypotheses concerning construct validity were confirmed: the short-forms' scores were highest in the oro-facial, lower in the orthodontic and lowest in the paediatric dentistry group; all short-form questionnaires were positively correlated with the ratings of oral health and overall well-being, with the correlation coefficient being higher for the latter. The relative validity coefficients were 0.85 to 1.18. Cronbach's alpha and intraclass correlation coefficients ranged 0.71–0.83 and 0.71–0.77, respectively.
All short forms demonstrated excellent criterion validity and good construct validity. The reliability coefficients exceeded standards for group-level comparisons. However, these are preliminary findings based on convenience sampling, and further testing in replicated studies involving clinical and general samples of children in various settings is necessary to establish the measurement sensitivity and discriminative properties of these questionnaires.
This paper evaluates the internal consistency reliability and concurrent validity of the assessment of Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) attention deficit hyperactivity disorder (ADHD) in the adolescent version of the World Health Organization (WHO) Composite International Diagnostic Interview Version 3.0 (CIDI). The CIDI is a lay-administered diagnostic interview that was carried out in conjunction with the US National Comorbidity Survey Adolescent Supplement, a US nationally representative survey of 10,148 adolescents and their parents. Internal consistency reliability was evaluated using factor and item response theory analyses. Concurrent validity was evaluated against diagnoses based on blinded clinician-administered interviews. Inattention and hyperactivity-impulsivity items loaded on separate but correlated factors, with hyperactivity and impulsivity items forming a single factor in parent reports but separate factors in youth reports. We were able to differentiate hyperactivity and impulsivity factors for parents as well by eliminating a subset who endorsed zero ADHD items from the factor analysis. Although concurrent validity was relatively weak, decomposition showed that this was due to low validity of adolescent reports. A modified CIDI diagnosis based exclusively on parent reports generated a diagnosis that had good concordance with clinical diagnoses [area under the curve (AUC) = 0.78]. Implications for assessing ADHD using the CIDI and the effect of different informants on measurement are discussed.
attention deficit hyperactivity disorder; WHO Composite International Diagnostic Interview (CIDI); validity; National Comorbidity Survey Replication Adolescent Supplement (NCS-A)
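The area under the curve (AUC) quoted above for the parent-report diagnosis can be read as the probability that a randomly chosen clinical case scores higher than a randomly chosen non-case. A minimal sketch of that rank-based calculation follows; the function name and the scores are hypothetical, and the source does not state how its AUC was estimated.

```python
def auc(scores_pos, scores_neg):
    """Probability that a positive case outscores a negative one (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical symptom counts for clinician-diagnosed cases and non-cases.
cases = [3, 5, 6, 8]
non_cases = [1, 2, 5, 4]
print(auc(cases, non_cases))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which places the reported 0.78 in between.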
A questionnaire to assess physical activity related environmental factors in the European population (a 49-item and an 11-item version) was created as part of the framework of the EU-funded project "Instruments for Assessing Levels of PHysical Activity and fitness (ALPHA)". This paper reports on the development and assessment of the questionnaire's test-retest stability, predictive validity, and applicability to European adults.
The first pilot test was conducted in Belgium, France and the UK. In total 190 adults completed both forms of the ALPHA questionnaire twice with a one-week interval. Physical activity was concurrently measured (i) by administration of the long version of the International Physical Activity Questionnaire (IPAQ) by interview and (ii) by accelerometry (Actigraph™ device). After adaptations, the second field test took place in Belgium, the UK and Austria; 166 adults completed the adapted questionnaire at two time points, with a minimum one-week interval. In both field studies intraclass correlation coefficients (ICC) and proportion of agreement were computed to assess the stability of the two test scores. Predictive validity was examined in the first field test by correlating the results of the questionnaires with physical activity data from accelerometry and long IPAQ-last 7 days.
The reliability scores of the ALPHA questionnaire were moderate to good in the first field testing (ICC range 0.66 - 0.86) and good in the second field testing (ICC range 0.71 - 0.87). The proportion of agreement for the ALPHA short increased significantly from the first (range 50 - 83%) to the second field testing (range 85 - 95%). Environmental scales from both versions of the ALPHA questionnaire were significantly associated with self-reported minutes of transport-related walking, and objectively measured low intensity physical activity levels, particularly in women. Both versions were easily administered with an average completion time of six minutes for the 49-item version and less than two minutes for the short version.
The ALPHA questionnaire is an instrument to measure environmental perceptions in relation to physical activity. It appears to have good reliability and predictive validity. The questionnaire is now available to other researchers to investigate its usefulness and applicability across Europe.
No validated disease-specific measures are available to assess health-related quality of life (HRQoL) in adult subjects with immune thrombocytopenic purpura (ITP). Therefore, we sought to develop and validate the ITP-Patient Assessment Questionnaire (ITP-PAQ) for adult subjects with ITP.
Information from literature reviews, focus groups with subjects, and clinicians were used to develop 50 ITP-PAQ items. Factor analyses were conducted to develop the scale structure and reduce the number of items. The final 44-item ITP-PAQ, which includes ten scales [Symptoms (S), Bother-Physical Health (B), Fatigue/Sleep (FT), Activity (A), Fear (FR), Psychological Health (PH), Work (W), Social Activity (SA), Women's Reproductive Health (RH), and Overall (QoL)], was self-administered to adult ITP subjects at baseline and 7–10 days later. Test-retest reliability, internal consistency reliability, construct and known groups validity of the final ITP-PAQ were evaluated.
Seventy-three subjects with ITP completed the questionnaire twice. Test-retest reliability, as measured by the intra-class correlation, ranged from 0.52–0.90. Internal consistency reliability was demonstrated with Cronbach's alpha for all scales above the acceptable level of 0.70 (range: 0.71–0.92), except for RH (0.66). Construct validity, assessed by correlating ITP-PAQ scales with established measures (Short Form-36 v.1, SF-36 and Center for Epidemiologic Studies Depression Scale, CES-D), was demonstrated through moderate correlations between the ITP-PAQ SA and SF-36 Social Function scales (r = 0.67), and between ITP-PAQ PH and SF-36 Mental Health Scales (r = 0.63). Moderate to strong inter-scale correlations were reported between ITP-PAQ scales and the CES-D, except for the RH scale. Known groups validity was evaluated by comparing mean scores for groups that differed clinically. Statistically significant differences (p < 0.01) were observed when subjects were categorized by treatment status [S, FT, B, A, PH, and QoL], perceived effectiveness of ITP treatment [S], and time elapsed since ITP diagnosis [PH].
Results provide preliminary evidence of the reliability and validity of the ITP-PAQ in adult subjects with ITP. Further work should be conducted to assess the responsiveness and to estimate the minimal clinical important difference of the ITP-PAQ to more fully understand the impact of ITP and its treatments on HRQoL.
To test the reliability, validity and responsiveness of the 13-item Shortness of Breath with Daily Activities (SOBDA) questionnaire, and determine the threshold for response and minimal important difference (MID).
6-week, randomised, double-blind, placebo-controlled study.
40 centres in the USA between 29 October 2009 and 1 July 2010.
Primary and secondary outcome measures
547 patients with chronic obstructive pulmonary disease (COPD) were enrolled and 418 entered the 2-week run-in period. Data from the run-in period were collected to test internal consistency, test–retest reliability, convergent validity and known-groups validity of the SOBDA. Three hundred and sixty-six patients were randomised 2:2:1 to fluticasone propionate/salmeterol 250/50 µg, salmeterol 50 µg or placebo, twice daily. Results from the SOBDA questionnaire, Patient Global Assessment of Change Question, modified Medical Research Council Dyspnoea Scale (mMRC), Clinician Global Impression of Dyspnoea Severity (CGI-S), Clinician Global Impression of Change Question and Chronic Respiratory Disease Questionnaire self-administered standardised version (CRQ-SAS) were evaluated; spirometry and safety parameters were measured. Study endpoints were selected to investigate the cross-sectional and longitudinal validity of the SOBDA questionnaire in relation to the clinical criteria.
Internal consistency of the SOBDA questionnaire (Cronbach α) was 0.89. Test–retest reliability (intraclass correlation) was 0.94. The SOBDA weekly scores correlated with the patient-reported and clinician-reported mMRC, CGI-S and CRQ-SAS dyspnoea domain scores (0.29, 0.24, 0.24 and –0.68, respectively). The SOBDA weekly scores differentiated between the responders and the non-responders as rated by the patients and the clinicians. Anchor-based and supportive distribution-based analyses produced a range of the potential values for the threshold for the responders and MID.
The 13-item SOBDA questionnaire is reliable, valid and responsive to change in patients with COPD. On using anchor-based methods, the proposed responder threshold shows a −0.1 to −0.2 score change. A specific threshold value will be identified as more data are generated from future clinical trials.
NCT00984659; GlaxoSmithKline study number: ASQ112989.
THORACIC MEDICINE; RESPIRATORY MEDICINE (see Thoracic Medicine)
The sibling relationship and its potential impact on neurodevelopment and mental health are important areas of neuroscientific research. Validation of the tools assessing the quality of the sibling relationship would be the first essential step for conducting neurobiological and psychosocial studies related to the sibling relationship. However, to the best of our knowledge, no sibling relationship assessment tools have been empirically validated in Korean. We aimed to evaluate the psychometric properties of the Korean version of the Lifespan Sibling Relationship Scale (LSRS), which is one of the most commonly used self-report questionnaires to assess the quality of the sibling relationship. A total of 109 adults completed a series of self-report questionnaires including the LSRS, the mental health subscale of the Medical Outcomes Study-Short Form 36 version 2 (SF36v2), the Satisfaction with Life Scale (SLS), and the Marlowe-Crowne Social Desirability Scale (MC-SDS). The internal consistency, subscale intercorrelations, one-week test-retest reliability, convergent validity, divergent validity, and the construct validity were assessed. All six subscale scores and the total score of the LSRS demonstrated good internal consistency (Cronbach's α=0.85-0.94) and good test-retest reliability (intraclass correlation coefficient=0.77-0.92). Correlations of the LSRS with the SF36v2 mental health score (r=0.32, p=0.01) and with the SLS (r=0.27, p=0.04) supported the good convergent validity. The divergent validity was shown by the non-significant correlation of the LSRS with the MC-SDS (r=0.15, p=0.26). Two factors were extracted through factor analysis, which explained 78.63% of the total variance. The three Adult subscales loaded on the first factor and the three Child subscales loaded on the second factor. Results suggest that the Korean version of the LSRS is a reliable and valid tool for examining the sibling relationship.
sibling relationships; validity; reliability; lifespan sibling relationship scale; psychometrics
A self-report questionnaire is frequently used to measure symptoms reliably and to distinguish patients with functional gastrointestinal disorders (FGIDs) from those with other conditions. We produced and validated a cross-cultural adaptation of the Rome III questionnaire for diagnosis of FGIDs in Korea.
The Korean version of the Rome III (Rome III-K) questionnaire was developed through structural translational processes. Subsequently, reliability was measured by a test-retest procedure. Convergent validity was evaluated by comparing self-reported questionnaire data with the subsequent completion of the questionnaire by the physician based on an interview and with the clinical diagnosis. Concurrent validation using the validated Korean version of the Short Form-36 Health Survey (SF-36) was adopted to demonstrate discriminant validity.
A total of 306 subjects were studied. Test-retest reliability was good, with a median Cronbach's α value of 0.83 (range, 0.71-0.97). The degree of agreement between patient-administered and physician-administered questionnaires to diagnose FGIDs was excellent; the κ index was 0.949 for irritable bowel syndrome, 0.883 for functional dyspepsia and 0.927 for functional heartburn. The physician's clinical diagnosis of functional dyspepsia showed the most marked discrepancy with that based on the self-administered questionnaire. Almost all SF-36 domains were impaired in participants diagnosed with one of these FGIDs according to the Rome III-K.
We developed the Rome III-K questionnaire through structural translational processes, and it showed good test-retest reliability and satisfactory construct validity. These results suggest that this instrument will be useful for clinical and research assessments in the Korean population.
Dyspepsia; Functional gastrointestinal disorders; Irritable bowel syndrome; Questionnaires; Validation studies
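The κ index used above to compare patient-administered and physician-administered questionnaires corrects raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's κ for one diagnosis follows; the function name and the yes/no ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    n = len(rater_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses of one disorder (1 = present, 0 = absent) for 10 subjects.
patient_form = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
physician_form = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
kappa = cohen_kappa(patient_form, physician_form)
print(round(kappa, 3))
```

In this illustration the two forms agree on 9 of 10 subjects, yet κ is noticeably below 0.9 because part of that agreement would be expected by chance.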
In Iranian Traditional Medicine, mizaj (temperament) plays a key role in preventive, therapeutic and lifestyle recommendations. A reliable self-reported scale for mizaj identification is critically needed to introduce ITM into the official medical and health care system especially in the case of designing national preventive protocols.
The present study aimed to design a preliminary self-administered mizaj questionnaire and to assess its reliability and validity in Iran.
Patients and Methods
In this cross-sectional study, a 52-item questionnaire was designed based on mizaj-related indices. After content and face validity assessment using qualitative and quantitative methods, 47 items remained. Using non-random sampling, the test-retest reliability of each question and the internal consistency of the questionnaire were examined in 35 volunteers. The reliable version of the questionnaire was then completed by 52 volunteers, who were divided into warm/cold and wet/dry groups based on their mizaj as predetermined by a team of expert practitioners. For validation, logistic regression analysis was performed between the experts’ assessment of mizaj and each questionnaire item, which resulted in the final ten-item questionnaire divided into two subscales. Using ANOVA with Dunnett post hoc tests, the optimum cut-off points were defined and their sensitivity and specificity were assessed.
The weighted kappa coefficients of the 39 items were between 0.40 and 0.82, showing acceptable reliability, and the Cronbach’s α coefficient was 0.71, showing internal consistency. The sensitivity and specificity of the final questionnaire cut-off points were 65% and 93% for the warm group, 52% and 97% for the cold group, 53% and 67% for the dry group, and 53% and 76% for the wet group.
Our results suggested that many of the questions designed according to the mizaj identification indices in the literature had satisfactory reliability and that the final ten-item questionnaire could discriminate between the different mizaj groups; it can therefore be used as the first version of a brief self-report mizaj estimation scale.
Medicine; Traditional; Unani; Temperament; Questionnaires; Reproducibility of Results
Assessment of health-related quality of life (HRQL) is important in patients with chronic obstructive pulmonary disease (COPD). Despite the high prevalence of COPD in Germany, Switzerland and Austria there is no validated disease-specific instrument available. The objective of this study was to translate the Chronic Respiratory Questionnaire (CRQ), one of the most widely used respiratory HRQL questionnaires, into German, develop an interviewer- and self-administered version including both standardised and individualised dyspnoea questions, and validate these versions in two randomised studies.
We recruited three groups of patients with COPD in Switzerland, Germany and Austria. The 44 patients of the first group completed the CRQ during pilot testing to adapt the CRQ to German-speaking patients. We then recruited 80 patients participating in pulmonary rehabilitation programs to assess internal consistency reliability and cross-sectional validity of the CRQ. The third group consisted of 38 patients with stable COPD without an intervention to assess test-retest reliability. To compare the interviewer- and self-administered versions, we randomised patients in groups 2 and 3 to the interviewer- or self-administered CRQ. Patients completed both the standardised and individualised dyspnoea questions.
For both administration formats and all domains, we found good internal consistency reliability (Cronbach's alpha between 0.73 and 0.89). Cross-sectional validity tended to be better for the standardised compared to the individualised dyspnoea questions and cross-sectional validity was slightly better for the self-administered format. Test-retest reliability was good for both the interviewer-administered CRQ (intraclass correlation coefficients for different domains between 0.81 and 0.95) and the self-administered format (intraclass correlation coefficients between 0.78 and 0.86). Lower within-person variability was responsible for the higher test-retest reliability of the interviewer-administered format, while between-person variability was similar for both formats.
Investigators in German-speaking countries can choose between valid and reliable self-and interviewer-administered CRQ formats.
COPD; Health Related Quality of Life; Chronic Respiratory Questionnaire; Standardisation; Self-Administration
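The remark above that lower within-person variability explains the higher test-retest reliability follows from how an intraclass correlation coefficient is built: it is the share of total variance that lies between persons. The sketch below computes the one-way random-effects form, ICC(1,1), from a one-way ANOVA decomposition. Which ICC form the study used is not stated, so this choice, the function name, and the data are assumptions for illustration.

```python
def icc_oneway(scores):
    """ICC(1,1) for a list of subjects, each a list of k repeated measurements."""
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(s) for s in scores) / (n * k)
    subject_means = [sum(s) / k for s in scores]
    # Between-person mean square: spread of subject means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-person mean square: spread of repeated scores around each subject's mean.
    ms_within = sum(
        (x - m) ** 2 for s, m in zip(scores, subject_means) for x in s
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest scores for 4 patients.
retest_data = [[4, 5], [6, 6], [2, 3], [7, 6]]
icc = icc_oneway(retest_data)
print(round(icc, 3))
```

Holding the between-person mean square fixed, shrinking the within-person mean square pushes the ratio toward 1, which matches the pattern described for the interviewer-administered format.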
The Eye Allergy Patient Impact Questionnaire (EAPIQ) was developed based on a pilot study conducted in the US and focus groups with eye allergy sufferers in Europe. The purpose of this study was to present the results of the psychometric validation of the EAPIQ.
One hundred forty-six patients from two allergy clinics completed the EAPIQ twice over a two-week period during the fall and winter allergy seasons, along with concurrent measures of health status, work productivity, and utility. Construct validity, reliability (internal consistency and test-retest), concurrent, known-group, and clinical validities, and responsiveness of the EAPIQ were assessed. Known-group validity was assessed by comparing EAPIQ scale scores between patients grouped according to their self-rating of ocular allergy severity (no symptoms, very mild, mild, moderate, severe, very severe). Clinical validity was assessed by examining differences in EAPIQ scores between groups of patients rated by their clinician as non-symptomatic, mild, moderate, and severe.
Results and Discussion
Results from the validation study suggested the deletion of 14 of 43 items (including embedded questions) that required patients to complete the percentage of time they were troubled by something (daily activity limitations/emotional troubles). These items yielded a significant amount of missing or inconsistent data (50%). The resulting factor analysis suggested four domains: symptoms, daily life impact, psychosocial impact, and treatment satisfaction. When included as separate scales, the symptom-bother and symptom-frequency scales were highly correlated (> 0.9). As a consequence, and due to superior discriminative validity, the symptom bother and frequency items were summed. All items met the tests for item convergent validity (item-scale correlation = 0.4). The success rate for item discriminant validity testing was 97% (item-scale correlation greater with own scale than with any other). The criterion for internal consistency reliability (alpha coefficient ≥ 0.70) was met for all EAPIQ scales (range 0.89–0.93), as was the criterion for test-retest reliability (intraclass correlation [ICC] ≥ 0.70). Largely moderate correlations between the scales of the EAPIQ and the mini Rhinoconjunctivitis Quality of Life Questionnaire (miniRQLQ) and low correlations with the Health Utilities Index 2/3 (HUI2/3) were indicative of satisfactory concurrent validity. The EAPIQ symptoms, Daily Life Impact, and Psychosocial Impact scales were able to distinguish between patients differing in eye allergy symptom severity, as rated by patients and clinicians, providing evidence of satisfactory known-group and clinical validities, respectively. Preliminary analyses indicated the EAPIQ Symptoms, Daily Life Impact, and Psychosocial Impact scales to be responsive to changes in eye allergies.
Following item reduction, construct validity, reliability, concurrent validity, known-group validity, and preliminary responsiveness were satisfactory for the EAPIQ in this population of ocular allergy patients.
Patient functioning; ocular allergy; psychometric validation; EAPIQ; patient reported outcomes
The changes in the organization of mental health care services have made the role of the family even more important in caring for patients with mental disorders. Caring may have serious consequences for family caregivers, with a great impact on the quality of family life. This study reports on the translation, cultural adaptation, and validation of the Involvement Evaluation Questionnaire-European Union (IEQ-EU) into the Greek language.
Caregivers of patients with major mental disorders were interviewed to test a modified version of the IEQ-EU questionnaire. Psychometric measurements included reliability coefficients, exploratory factor analysis and confirmatory analysis by linear structural relations. To measure the concurrent validity we used the Nottingham Health Profile (NHP).
Most caregivers were female (83%), mainly mothers living with the patient (80%), with quite a high level of burden. The Greek version of the IEQ-EU (G-IEQ-EU) demonstrated a good reliability with high internal consistency (α = 0.88), Guttman split-half correlation of 0.71, high test-retest reliability (ICC = 0.82) and good concurrent validity with the NHP. A four-factor structure was confirmed for the G-IEQ-EU, slightly different from the original IEQ. The confirmatory factor analysis demonstrated that the four-factor model offered modest fit to our data.
The G-IEQ-EU is a reasonably valid and reliable tool for use in both clinical and research contexts in order to assess the burden of caregivers of patients with mental disorders.
Caregivers; Mental disorders; Validation
Background: If an institute seeks to improve its learning environment, it needs a reliable and valid tool for measuring the educational environment. The Dundee Ready Educational Environment Measure (DREEM) has been used in various studies to evaluate the educational environment. However, psychometric evaluations of the instrument seem necessary for all known versions of the instrument.
The aim of this study was to investigate the reliability and validity of Persian version of the DREEM in the major clinical wards in teaching hospitals affiliated to Iran University of Medical Sciences.
Methods: This descriptive-analytical study involved medical students (clinical stagers and interns) in 4 major clinical wards. The DREEM questionnaire was assessed for content validity, face validity and construct validity, the last through confirmatory factor analysis. Reliability was calculated using the test-retest method, and internal consistency was measured using Cronbach's alpha coefficient.
Results: A total of 267 questionnaires were completed by medical stagers (60%) and interns (40%), including 181 females and 82 males. The mean ages of stagers and interns were 23.60 ± 1.27 and 25.45 ± 1.22 years, respectively. The overall mean questionnaire score was 96.15 out of 176 (95% confidence interval: 93.5375 to 98.7547). The face validity of the questionnaire was confirmed. The mean content validity ratio (CVR) was 0.35, and 6 questions were omitted in this step. The content validity index (CVI) was 0.39. The mean reliability coefficient was 0.71. In confirmatory factor analysis five factors were confirmed that changed the orientation of some questions. The Cronbach's alpha coefficient of the whole questionnaire was 0.914.
Conclusion: The modified and validated DREEM questionnaire in the Persian language, with 44 items and appropriate psychometric attributes, can be used to assess clinical education environments in Iran.
Validity; Reliability; DREEM; Educational environment
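The content validity ratio (CVR) mentioned above is commonly computed per item with Lawshe's formula, which compares the number of experts rating an item "essential" with half the panel size. The abstract does not say which formula was used, so the sketch below assumes Lawshe's definition; the function name, panel size, and ratings are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 (no expert says essential) to +1 (all do)."""
    half_panel = n_experts / 2
    return (n_essential - half_panel) / half_panel


# Hypothetical panel of 10 experts rating three items.
essential_counts = [10, 8, 5]
cvr_values = [content_validity_ratio(c, 10) for c in essential_counts]
mean_cvr = sum(cvr_values) / len(cvr_values)
print(cvr_values, round(mean_cvr, 3))
```

Under this definition an item endorsed by exactly half the panel scores 0, so items with low or negative CVR are the usual candidates for removal.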
The Australian Whiplash Disability Questionnaire (WDQ) was cross-culturally translated, adapted, and tested for validity to be used in German-speaking patients. The self-administered questionnaire evaluates actual pain intensity, problems in personal care, role performance, sleep disturbances, tiredness, social and leisure activities, emotional and concentration impairments with 13 questions rated on an 11-point rating scale from zero to ten.
In the first part, the Australian-based WDQ was forward and backward translated. In a consensus conference with all translators and health care professionals, who were experts in the treatment of patients with a whiplash associated disorder (WAD), formulations were refined. The original authors were contacted for clarification and approval of the forward-backward translated version. The German version (WDQ-G) was evaluated for comprehensiveness and clarity in a pre-study patient survey by a random sample of German-speaking patients after WAD and four healthy twelve- to thirteen-year-old teenagers.
In a second part, the WDQ-G was evaluated in a patient validation study including patients affected by a WAD. Inpatients had to complete the WDQ-G, the North American Spine Society questionnaire (NASS cervical pain), and the Medical Outcomes Study 36-Item Short Form Health Survey (SF-36) at entry in the rehabilitation centre.
In the pre-study patient survey (response rate 31%) patients rated clarity for title 9.6 ± 0.9, instruction 9.3 ± 1.4 and questions 9.6 ± 0.7, and comprehensiveness for title 9.6 ± 0.7, instruction 9.3 ± 1.4 and questions 9.8 ± 0.4. Time needed to fill in was 13.7 ± 9.0 minutes.
In total, 70 patients (47 females, age = 43.4 ± 12.5 years, time since injury: 1.5 ± 2.6 years) were included in the validation study. WDQ-G total score was 74.0 ± 21.3 points (range between 15 and 117 points). Time needed to fill in was 6.7 ± 3.4 minutes (data from 22 patients). Internal consistency was confirmed with Cronbach’s α = 0.89. Concurrent validity showed a highly significant correlation with subscale pain and disability (NASS) at r = 0.74 and subscale pain (SF-36) at r = 0.71.
The officially translated and adapted WDQ-G can be used in German-speaking patients affected by a WAD to evaluate patients’ impairments in different domains. The WDQ-G is a self-administered outcome measure showing a high internal consistency and good concurrent validity.
Whiplash injury; Questionnaire; Impairment; Pain; Activities of daily living; Kraniozervikales beschleunigungstrauma; Fragebogen; Schmerz; Einschränkungen; Aktivitäten des täglichen Lebens
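Internal consistency figures such as the Cronbach's α above can be computed from item-level responses. The sketch below uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The three-item response matrix is hypothetical and much smaller than the 13-item WDQ-G; it is not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses on a 0-10 scale: 5 respondents x 3 items.
responses = [
    [2, 3, 2],
    [5, 5, 6],
    [8, 7, 9],
    [4, 4, 3],
    [9, 10, 9],
]
print(round(cronbach_alpha(responses), 3))
```

Sample variances (n - 1 denominator) are used throughout; using population variances instead gives the same α as long as the choice is consistent for items and totals.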
The eight-item Brief Illness Perception Questionnaire is used as a screening instrument in physical therapy to assess mental defeat in patients with acute low back pain; in addition, patient perception might determine the course of, and risk for, chronic low back pain. However, the psychometric properties of the Brief Illness Perception Questionnaire in common musculoskeletal disorders like acute low back pain have not been adequately studied. Patients’ perceptions vary across different populations and affect coping styles. Thus, our aim was to determine the internal consistency, test-retest reliability and validity of the Dutch language version of the Brief Illness Perception Questionnaire in acute non-specific low back pain patients in primary care physical therapy.
A non-experimental cross-sectional study with two measurements was performed. Eighty-four acute low back pain patients in a multidisciplinary health care center in Dutch primary care, with a sample mean (SD) age of 42 (12) years, participated in the study. Internal consistency (Cronbach’s α) and test-retest procedures (Intraclass Correlation Coefficients and limits of agreement) were evaluated at a one-week interval. The concurrent validity of the Brief Illness Perception Questionnaire was examined by using the Mental Health Component of the Short Form 36 Health Survey.
The Cronbach’s α for internal consistency was 0.73 (95% CI, 0.67 – 0.83), and the test-retest Intraclass Correlation Coefficient was acceptable at 0.72 (95% CI, 0.53 – 0.82); however, the limits of agreement were large. The Intraclass Correlation Coefficient measuring concurrent validity was 0.65 (95% CI, 0.46 – 0.80).
The Dutch version of the Brief Illness Perception Questionnaire is an appropriate instrument for measuring patients’ perceptions in acute low back pain patients, showing acceptable internal consistency and reliability. Concurrent validity is adequate; however, the instrument may be unsuitable for detecting changes in low back pain perception over time.
Illness perceptions; Reliability; Validity; Acute nonspecific low back pain; Brief IPQ
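The "limits of agreement" mentioned in this abstract are commonly computed with the Bland-Altman approach: the mean test-retest difference (bias) plus or minus 1.96 standard deviations of the differences. The sketch below assumes that approach, which the abstract does not confirm, and the scores are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as second minus first; the limits are
    bias +/- 1.96 * SD of the differences (Bland-Altman 95% limits).
    """
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical total scores for four patients, one week apart.
week0 = [10, 12, 14, 16]
week1 = [11, 11, 15, 15]
bias, lower, upper = limits_of_agreement(week0, week1)
print(bias, round(lower, 3), round(upper, 3))
```

Wide limits relative to the score range mean that an individual patient's score can shift considerably between administrations without any true change, which is why an instrument with an acceptable ICC can still be poorly suited to tracking change over time.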
The patient-rated elbow evaluation (PREE) is a joint-specific, self-administered questionnaire consisting of a pain scale (PREE-P) and a functional scale (PREE-F), the latter consisting of specific function (PREE-SF) and usual function (PREE-UF). The purpose of this study was to cross-culturally adapt the PREE into Japanese (PREE-J) and to test its reliability, validity, and responsiveness.
A consecutive series of 74 patients with elbow disorder completed the PREE-J, the Japanese version of the disabilities of the arm, shoulder, and hand (DASH–JSSH) questionnaire, and the official Japanese version of the 36-Item Short-Form Health Survey (SF-36). Of the 74 patients, 53 were reassessed for test–retest reliability 1 or 2 weeks later. Reliability was investigated in terms of reproducibility and internal consistency. The validity of the PREE-J was examined by factor analysis, and correlation coefficients were obtained using the PREE-J, DASH-JSSH, and SF-36. Responsiveness was examined by calculating the standardized response mean (SRM) and effect size after elbow surgery in 53 patients.
Cronbach’s α coefficients for PREE-P, PREE-F, and PREE were 0.92, 0.97, and 0.97, respectively, and the corresponding intraclass correlation coefficients were 0.92, 0.93, and 0.94, respectively. Unidimensionality of PREE-P and PREE-F was confirmed by factor analysis. The coefficients of correlation between PREE-P and PREE-F or DASH–JSSH were 0.81 and 0.74, respectively; that between PREE-F and DASH–JSSH was 0.86, and those between DASH–JSSH and PREE-SF or PREE-UF were 0.85 and 0.82, respectively. Moderate correlation was observed in “physical functioning” for SF-36 and PREE-F (r = −0.69) or PREE (r = −0.68). In terms of SRM/effect size, PREE-P (1.31/1.32) and PREE (1.28/1.12) were more responsive than the DASH–JSSH (0.99/0.85), “bodily pain” (−1.15/−1.43), and “physical functioning” (−0.70/−0.44) in SF-36.
The PREE-J represents a reliable, valid, and responsive instrument and has evaluation capacities equivalent to those of the original PREE.
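The two responsiveness indices reported above differ only in their denominator: the standardized response mean (SRM) divides the mean change by the standard deviation of the change scores, while the effect size divides it by the standard deviation of the baseline scores. The sketch below assumes change is defined as baseline minus follow-up, so that improvement on a scale where higher means worse gives a positive value; that sign convention and the scores are assumptions, not details taken from the study.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (SRM, effect size) for paired pre/post scores.

    change = baseline - follow_up (assumed sign convention)
    SRM = mean(change) / SD(change)
    effect size = mean(change) / SD(baseline)
    """
    change = [pre - post for pre, post in zip(baseline, follow_up)]
    mean_change = mean(change)
    return mean_change / stdev(change), mean_change / stdev(baseline)


# Hypothetical scores before and after surgery (higher = worse).
pre_op = [50, 60, 70, 80]
post_op = [30, 45, 50, 55]
srm, effect_size = responsiveness(pre_op, post_op)
print(round(srm, 3), round(effect_size, 3))
```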
The aim of this study was to investigate the dimensionality, reliability, and validity of an alternate version of the chewing function questionnaire in partially dentate patients in Japan.
Subjects were partially dentate patients who attended the prosthodontic clinic at Tokyo Medical and Dental University (N = 491, 71% women, mean age (± SD): 63.0 ± 11.5 years). The questionnaire asked each subject to rate his or her ability to chew 20 common Japanese foods. For each individual, responses were combined to yield a chewing function summary score, with higher scores indicating better self-reported chewing ability. We used exploratory factor analysis to investigate the scores' dimensionality. For validity assessment, we computed the correlations between the chewing function score and oral health-related quality of life (OHRQoL, as measured by the Japanese 14-item Oral Health Impact Profile (OHIP-14)). Internal consistency of scores and test-retest reliability were investigated by asking a subset of subjects (N = 62) to complete the questionnaire twice, 2 weeks apart.
Exploratory factor analysis provided some evidence that self-reported chewing ability can be characterized by a summary score as the original authors suggest. Support for the validity of chewing function scores using the alternate version of the questionnaire was derived from correlations with OHIP-14 scores (r = -0.46, 95% confidence interval (CI): -0.53 to -0.39); thus, better chewing ability was associated with less impaired OHRQoL. Internal consistency was 'satisfactory,' with a Cronbach's alpha of 0.90 (lower limit of 95% CI: 0.89). The test-retest reliability was 'good,' with an intraclass correlation coefficient of 0.69 (95% CI: 0.56 to 0.82).
The alternate version of the chewing function questionnaire can be used as a stand-alone instrument because of the demonstrated reliability and validity of scores obtained using the questionnaire in partially dentate patients.
Although growing interest exists in the bipolar spectrum, fully structured diagnostic interviews might not accurately assess bipolar spectrum disorders. A validity study was carried out for diagnoses of threshold and sub-threshold bipolar disorders (BPD) based on the WHO Composite International Diagnostic Interview (CIDI) in the National Comorbidity Survey Replication (NCS-R). CIDI BPD screening scales were also evaluated.
The NCS-R is a nationally representative US household population survey (n = 9282) that used the CIDI to assess DSM-IV disorders. CIDI diagnoses were evaluated in blinded clinical reappraisal interviews using the non-patient version of the Structured Clinical Interview for DSM-IV (SCID).
Excellent CIDI-SCID concordance was found for lifetime BP-I (AUC = .99, κ = .88, PPV = .79, NPV = 1.0), either BP-II or sub-threshold BPD (AUC = .96, κ = .88, PPV = .85, NPV = .99), and overall bipolar spectrum disorders (i.e., BP-I/II or sub-threshold BPD; AUC = .99, κ = .94, PPV = .88, NPV = 1.0). Concordance was lower for BP-II (AUC = .83, κ = .50, PPV = .41, NPV = .99) and sub-threshold BPD (AUC = .73, κ = .51, PPV = .58, NPV = .99). The CIDI was unbiased compared to the SCID, yielding a lifetime bipolar spectrum disorders prevalence estimate of 4.4%. Brief CIDI-based screening scales detected 67–96% of true cases with positive predictive value of 31–52%.
CIDI prevalence estimates are probably still conservative, but they might be improved with future CIDI revisions based on new methodological studies that use a clinical assessment more sensitive than the SCID to sub-threshold BPD.
Bipolar spectrum disorders are much more prevalent than previously realized. The CIDI is capable of generating conservative diagnoses of both threshold and sub-threshold BPD. Short CIDI-based scales are useful screens for BPD.
Bipolar Disorders; Bipolar Spectrum; Mania; Hypomania; Composite International Diagnostic Interview (CIDI); Validity; National Comorbidity Survey Replication (NCS-R)
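Concordance statistics such as κ, PPV, and NPV can all be derived from a 2x2 table that cross-classifies the structured interview against the clinical reappraisal. The sketch below shows the unweighted calculation on hypothetical cell counts. It ignores the sampling weights a survey such as the NCS-R would normally require, so it illustrates the definitions rather than the study's analysis.

```python
def concordance(tp: int, fp: int, fn: int, tn: int):
    """Return (kappa, PPV, NPV) from a 2x2 table.

    tp: positive on both instruments    fp: test positive, reference negative
    fn: test negative, reference positive    tn: negative on both
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return kappa, ppv, npv


# Hypothetical counts for 100 respondents.
kappa, ppv, npv = concordance(tp=40, fp=10, fn=5, tn=45)
print(round(kappa, 2), ppv, npv)
```

PPV and NPV depend on prevalence, which is one reason an NPV close to 1.0 is unsurprising for a rare disorder even when the PPV is modest.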
Health-related quality of life (HRQoL) measurement is important in determining
the impact of disease on daily functioning and subsequently informing
interventions. In cystic fibrosis (CF) generic HRQoL measures have been
employed but these may not be sufficiently specific. The aim of the
current work was to develop and validate a disease specific HRQoL
measure for adults and adolescents with cystic fibrosis.
Areas of concern to adults and adolescents with CF were identified by
unstructured interviews, self-administered questionnaires, consultation
with multidisciplinary specialist staff, a review of the relevant
literature, and examination of other HRQoL measures. Items for the
questionnaire were generated on the basis of this process. Continued
evaluation and development of the Cystic Fibrosis Quality of Life
(CFQoL) questionnaire was undertaken by a process of statistical
analysis and continued feedback from patients. The full testing and
validation of the CFQoL questionnaire took place over four phases: (1)
initial item generation and testing of a preliminary questionnaire, (2)
testing and validation of the second version of the questionnaire, (3)
test-retest reliability of a third and final version of the
questionnaire, and (4) sensitivity testing of the final version of the questionnaire.
Nine domains of functioning were identified using principal components analysis with
varimax rotation. Internal reliability of the identified domains was
demonstrated using Cronbach alpha coefficients (range 0.72-0.92) and
item to total domain score correlations. Concurrent validity (range
r = 0.64-0.74), discriminatory ability
between different levels of disease severity, sensitivity across
transient changes in health (effect size range, moderate d = 0.56 to
large d = 1.95), and test-retest reliability
(r =0.74-0.96) were also found to be robust.
The CFQoL questionnaire is a fully validated disease specific measure consisting
of 52 items across nine domains of functioning which have been
identified by, and are of importance to, adolescents and adults with
cystic fibrosis. This measure will be useful in clinical trials and
To develop and validate the Adult Hypopituitarism Questionnaire (AHQ) as a disease-specific, self-administered questionnaire for evaluation of quality of life (QOL) in adult patients with hypopituitarism.
We developed and validated this new questionnaire, using a standardized procedure which included item development, pilot-testing and psychometric validation. Of the patients who participated in psychometric validation, those whose clinical conditions were judged to be stable were asked to answer the survey questionnaire twice, in order to assess test-retest reliability.
Content validity of the initial questionnaire was evaluated via two pilot tests. After these tests, we made minor revisions and finalized the initial version of the questionnaire. The questionnaire was constructed with two domains, one psycho-social and the other physical. For psychometric assessment, analyses were performed on the responses of 192 adult patients with various types of hypopituitarism. The intraclass correlations of the respective domains were 0.91 and 0.95, and the Cronbach’s alpha coefficients were 0.96 and 0.95, indicating adequate test-retest reliability and internal consistency for each domain. For known-group validity, patients with hypopituitarism due to hypothalamic disorder showed significantly lower scores in 11 out of 13 sub-domains compared to those who had hypopituitarism due to pituitary disorder. Regarding construct validity, the domain structure was found to be almost the same as that initially hypothesized. Exploratory factor analysis (n = 228) demonstrated that each domain consisted of six and seven sub-domains.
The AHQ showed good reliability and validity for evaluating QOL in adult patients with hypopituitarism.
Most patients are anxious before surgery. The level of preoperative anxiety depends on several factors and merits an objective evaluation. The Amsterdam Preoperative Anxiety and Information Scale (APAIS) is a self-report questionnaire comprising six questions that have been developed and validated to evaluate the preoperative anxiety of patients. This global index assesses three separate areas: anxiety about anaesthesia, anxiety about surgery, and the desire for information. The purpose of this study was to translate the APAIS into French and to evaluate the psychometric properties of the French version of the APAIS.
The process consisted of two steps. The first step involved the production of a French version of the APAIS that was semantically equivalent to the original version. In the second step, we evaluated the psychometric properties of the French version, including the internal consistency and reliability, the differential item functioning, and the external validity. Participants older than 18, undergoing elective surgery (except obstetric), able to understand and read French, and able to complete a self-report questionnaire were eligible for inclusion in the study. A forward-backward translation was performed. The psychometric evaluation covered three domains: internal validity, external validity, and acceptability. Within 4–48 h after surgery, the patients were asked to complete the “Evaluation du Vécu de l’ANesthésie” (EVAN) questionnaire, which is a validated, multi-dimensional questionnaire that assesses the patient’s experiences in the perioperative period.
A database with 175 patients was created. The principal component factor analysis revealed the same three-dimensional structure as the original scale. The confirmatory factor analysis showed a strong fit with a root mean square error of approximation of 0.069 and a comparative fit index of 1.00. The amount of differential item functioning (DIF) between the subgroups of patients (i.e., based on age, gender, type of anaesthesia or surgery, premedication, ASA physical status, and ambulatory course) was low. The APAIS was strongly correlated with the dimensions of the EVAN. Each dimension had a low proportion of missing values (ranging from 0.6 to 2.9%), which indicates good acceptability of the questionnaire.
The French version of the APAIS is valid and reliable. The availability of this tool enables the evaluation of anxiety in French patients undergoing anaesthesia.
Temporomandibular disorders (TMD) screeners assume significant item overlap with the
screening questionnaire proposed by the American Academy of Orofacial Pain (AAOP).
To test the reliability and validity of the Portuguese version of AAOP questions
for TMD screening among adolescents.
Material and Methods
Diagnoses from Research Diagnostic Criteria for Temporomandibular Disorders
(RDC/TMD) Axis I were used as reference standard. Reliability was evaluated by
internal consistency (KR-20) and inter-item correlation. Validity was tested by
sensitivity, specificity, predictive values, accuracy and receiver operating
characteristic (ROC) curves, which plot the true-positive rate (sensitivity)
against the false-positive rate (1 − specificity). Test-retest reliability
of AAOP questions and intra-examiner reproducibility of RDC/TMD Axis I were tested
with kappa statistics.
The sample consisted of 1307 Brazilian adolescents (56.8% girls; n=742), with mean
age of 12.72 years (12.69 F/12.75 M). According to RDC/TMD, 397 [30.4% (32.7%
F/27.3% M)] of adolescents presented TMD, of which 330 [25.2% (27.6% F/22.2% M)]
were painful TMD. Because of low consistency, items #8 and #10 of the AAOP
questionnaire were excluded. Remaining items (of the long questionnaire version)
showed good consistency and validity for three positive responses or more. After
logistic regression, items #4, #6, #7 and #9 also showed satisfactory consistency
and validity for two or more positive responses (short questionnaire version).
Both versions demonstrated excellent specificity (about 90%), but higher
sensitivity for detecting painful TMD (78.2%). Better reproducibility was obtained
for the short version (κ = 0.840).
The Portuguese version of AAOP questions showed both good reliability and validity
for the screening of TMD among adolescents, especially painful TMD, according to
Temporomandibular joint disorders; Questionnaires; Validation studies; Adolescent
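The cut-off logic described in this abstract (screen-positive at "three positive responses or more") can be sketched as follows: for each candidate cut-off, subjects are classified by their count of positive answers, compared with the reference diagnosis, and the resulting (1 − specificity, sensitivity) pairs trace the ROC curve. The function names and the eight-subject data set are hypothetical and are not taken from the study.

```python
def sens_spec(positive_counts, has_condition, cutoff):
    """Sensitivity and specificity when 'count >= cutoff' means screen-positive."""
    tp = fp = fn = tn = 0
    for count, diseased in zip(positive_counts, has_condition):
        screen_positive = count >= cutoff
        if screen_positive and diseased:
            tp += 1
        elif screen_positive and not diseased:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(positive_counts, has_condition):
    """(false-positive rate, true-positive rate) for every possible cut-off."""
    points = []
    for cutoff in range(0, max(positive_counts) + 2):
        sensitivity, specificity = sens_spec(positive_counts, has_condition, cutoff)
        points.append((1 - specificity, sensitivity))
    return points


# Hypothetical data: positive answers per subject and reference diagnosis.
counts = [0, 1, 2, 3, 4, 5, 1, 3]
reference = [False, False, False, True, True, True, True, False]
print(sens_spec(counts, reference, cutoff=3))
print(roc_points(counts, reference))
```

Raising the cut-off trades sensitivity for specificity, which is why a shorter item set may need a lower threshold (two positive responses rather than three) to reach comparable performance.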
There are very few reliable and valid measures in Japan assessing health-related quality of life (HRQOL) in children with cancer. The present study aimed to develop a Japanese version of the Minneapolis-Manchester Quality of Life Survey of Health Adolescent Form (MMQL-AF), which is a measure for assessing the HRQOL of childhood cancer survivors, and investigate its reliability and validity.
Participants were 141 children with cancer who had been off therapy for more than one year and 183 healthy controls. Internal consistency of the measure was assessed using Cronbach’s coefficient alpha, and test-retest reliability was assessed using intra-class correlation coefficients (ICCs). For validation of the measure, factorial validity, concurrent validity using the Japanese version of PedsQL 4.0 Generic Core Scales (PedsQL-J), and discriminant validity using comparisons between children with cancer and healthy controls were investigated.
Of the 46 items in the original version, 44 items were determined to comprise the Japanese version of the MMQL-AF. Cronbach’s coefficient alphas for each subscale were high, ranging from 0.83 to 0.89. ICCs for test-retest reliability ranged from 0.79 to 0.96. Investigation of concurrent validity using the PedsQL-J demonstrated strong correlations in physical functions and moderate correlations for other factors. A significant difference was observed between children with cancer and healthy controls.
Thus, the Japanese version of the MMQL-AF served as a self-evaluation questionnaire that allowed for practical, comprehensive, and multidimensional measurement of HRQOL specific to childhood cancer survivors.
Childhood cancer; Survivor; Health-related quality of life
How to protect patients from harm is a question of universal interest. Measuring and improving safety culture in care giving units is an important strategy for promoting a safe environment for patients. The Safety Attitudes Questionnaire (SAQ) is the only instrument that measures safety culture in a way which correlates with patient outcome. We have translated the SAQ to Norwegian and validated the translated version. The psychometric properties of the translated questionnaire are presented in this article.
The questionnaire was translated with the back translation technique and tested in 47 clinical units in a Norwegian university hospital. SAQs (the Generic version, Short Form 2006, in the version with two sets of questions on perceptions of management: one on unit management and one on hospital management) were distributed to 1911 frontline staff: 762 during unit meetings and 1149 through the postal system. Cronbach alphas, item-to-own correlations, and test-retest correlations were calculated, and response distribution analysis and confirmatory factor analysis were performed, as well as early validity tests.
A total of 1306 staff members completed and returned the questionnaire, a response rate of 68%. Questionnaire acceptability was good. The reliability measures were acceptable. The factor structure of the responses was tested by confirmatory factor analysis. Thirty-six items were ascribed to seven underlying factors: Teamwork Climate, Safety Climate, Stress Recognition, Perceptions of Hospital Management, Perceptions of Unit Management, Working Conditions, and Job Satisfaction. Goodness-of-Fit Indices showed reasonable, but not indisputable, model fit. External validity indicators (recognizability of results, and correlations with "trigger tool"-identified adverse events, patient satisfaction with hospitalization, patient reports of possible maltreatment, and patient evaluation of organization of hospital work) provided preliminary validation.
Based on the data from Akershus University Hospital, we conclude that the Norwegian translation of the SAQ showed satisfactory internal psychometric properties. With data from one hospital only, we cannot draw strong conclusions on its external validity. Further validation studies linking the SAQ-scores to patient outcome data should be performed.