Related Articles

1.  Screening for depressive disorders using the MASQ anhedonic depression scale: A receiver-operator characteristic analysis 
Psychological assessment  2010;22(3):702-710.
The present study examined the utility of the anhedonic depression scale from the Mood and Anxiety Symptoms Questionnaire (MASQ-AD) as a way to screen for depressive disorders. Using receiver-operator characteristic analysis, the sensitivity and specificity of the full 22-item MASQ-AD scale, as well as the 8- and 14-item subscales, were examined in relation to both current and lifetime DSM-IV depressive disorder diagnoses in two nonpatient samples. As a means of comparison, the sensitivity and specificity of a measure of a relevant personality dimension, neuroticism, were also examined. Results from both samples support the clinical utility of the MASQ-AD scale as a means of screening for depressive disorders. Findings were strongest for the MASQ-AD 8-item subscale and when predicting current depression status. Furthermore, the MASQ-AD 8-item subscale outperformed the neuroticism measure under certain conditions. The overall usefulness of the MASQ-AD scale as a screening device is discussed, as well as possible cutoff scores for use in research.
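As an illustration of the kind of cutoff analysis described above, the following minimal sketch computes sensitivity and specificity for a screening score at a given cutoff and scans candidate cutoffs. The function names, the "score >= cutoff means screen-positive" convention, and the use of Youden's J to pick a cutoff are assumptions made for illustration; the abstract does not state how the authors chose their cutoffs.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], diagnosed: Sequence[bool], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= cutoff counts as screen-positive."""
    pairs = list(zip(scores, diagnosed))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)      # diagnosed, flagged
    fn = sum(1 for s, d in pairs if d and s < cutoff)       # diagnosed, missed
    tn = sum(1 for s, d in pairs if not d and s < cutoff)   # healthy, not flagged
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)  # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_by_youden(scores: Sequence[float], diagnosed: Sequence[bool]) -> float:
    """Scan observed scores and return the cutoff maximizing J = sens + spec - 1.

    Youden's J is one common criterion (an assumption here, not the paper's rule).
    Ties are resolved in favor of the lowest cutoff.
    """
    best_cutoff, best_j = None, float("-inf")
    for c in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, diagnosed, c)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff
```

Both diagnostic groups must be present in the data, otherwise one of the ratios is undefined. In practice the choice between a more sensitive and a more specific cutoff depends on the screening purpose.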
doi:10.1037/a0019915
PMCID: PMC2992834  PMID: 20822283
depressive disorders; anhedonic depression; Mood and Anxiety Symptoms Questionnaire; receiver-operator characteristic analysis; screening
2.  Reference values for generic instruments used in routine outcome monitoring: the Leiden Routine Outcome Monitoring Study 
BMC Psychiatry  2012;12:203.
Introduction
The Brief Symptom Inventory (BSI), Mood & Anxiety Symptom Questionnaire-30 (MASQ-D30), Short Form Health Survey 36 (SF-36), and Dimensional Assessment of Personality Pathology-Short Form (DAPP-SF) are generic instruments that can be used in Routine Outcome Monitoring (ROM) of patients with common mental disorders. We aimed to generate reference values usually encountered in 'healthy' and 'psychiatrically ill' populations to facilitate correct interpretation of ROM results.
Methods
We included the following specific reference populations: 1294 subjects from the general population (ROM reference group) recruited through general practitioners, and 5269 psychiatric outpatients diagnosed with mood, anxiety, or somatoform (MAS) disorders (ROM patient group). The outermost 5% of observations were used to define limits for one-sided reference intervals (95th percentiles for BSI, MASQ-D30 and DAPP-SF, and 5th percentiles for SF-36 subscales). Internal consistency and Receiver Operating Characteristics (ROC) analyses were performed.
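The one-sided reference limits described above can be sketched as simple percentile computations. This is a minimal illustration only: the function names and the "inclusive" interpolation method are assumptions, since the abstract does not say how the percentiles were estimated.

```python
import statistics
from typing import Sequence


def upper_reference_limit(values: Sequence[float]) -> float:
    """95th percentile: for scales where HIGH scores indicate more symptoms
    (the BSI, MASQ-D30 and DAPP-SF in the study above)."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%; the last one is the 95th percentile.
    return statistics.quantiles(values, n=20, method="inclusive")[-1]


def lower_reference_limit(values: Sequence[float]) -> float:
    """5th percentile: for scales where LOW scores indicate worse health
    (the SF-36 subscales in the study above)."""
    return statistics.quantiles(values, n=20, method="inclusive")[0]
```

For example, for the values 0, 1, ..., 100 the upper limit is 95.0 and the lower limit is 5.0. A score in the reference population's outermost 5% would then be flagged as outside the reference interval.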
Results
Mean age was 40.3 years (SD=12.6) for the ROM reference group and 37.7 years (SD=12.0) for the ROM patient group. The proportion of females was 62.8% and 64.6%, respectively. The mean for cut-off values of healthy individuals was 0.82 for the BSI subscales, 23 for the three MASQ-D30 subscales, 45 for the SF-36 subscales, and 3.1 for the DAPP-SF subscales. Discriminative power of the BSI, MASQ-D30 and SF-36 was good, but it was poor for the DAPP-SF. For all instruments, the internal consistency of the subscales ranged from adequate to excellent.
Discussion and conclusion
Reference values for the clinical interpretation were provided for the BSI, MASQ-D30, SF-36, and DAPP-SF. Clinical information aided by ROM data may represent the best means to appraise the clinical state of psychiatric outpatients.
doi:10.1186/1471-244X-12-203
PMCID: PMC3551660  PMID: 23171272
Reference values; Routine outcome monitoring; Questionnaires; Mood disorders; Anxiety disorders; Somatoform disorders
3.  Using Computerized Adaptive Testing to Reduce the Burden of Mental Health Assessment 
Objective
This study investigated the combination of item response theory and computerized adaptive testing (CAT) for psychiatric measurement as a means of reducing the burden of research and clinical assessments.
Methods
Data were from 800 participants in outpatient treatment for a mood or anxiety disorder; they completed 616 items of the 626-item Mood and Anxiety Spectrum Scales (MASS) at two times. The first administration was used to design and evaluate a CAT version of the MASS by using post hoc simulation. The second confirmed the functioning of CAT in live testing.
Results
Tests of competing models based on item response theory supported the scale’s bifactor structure, consisting of a primary dimension and four group factors (mood, panic-agoraphobia, obsessive-compulsive, and social phobia). Both simulated and live CAT showed a 95% average reduction (585 items) in items administered (24 and 30 items, respectively) compared with administration of the full MASS. The correlation between scores on the full MASS and the CAT version was .93. For the mood disorder subscale, differences in scores between two groups of depressed patients—one with bipolar disorder and one without—on the full scale and on the CAT showed effect sizes of .63 (p<.003) and 1.19 (p<.001) standard deviation units, respectively, indicating better discriminant validity for CAT.
Conclusions
Instead of using small fixed-length tests, clinicians can create item banks with a large item pool, and a small set of the items most relevant for a given individual can be administered with no loss of information, yielding a dramatic reduction in administration time and patient and clinician burden.
doi:10.1176/appi.ps.59.4.361
PMCID: PMC2916927  PMID: 18378832
4.  Clinical utility of the Mood and Anxiety Symptom Questionnaire (MASQ) in a sample of young help-seekers 
BMC Psychiatry  2007;7:50.
Background
The overlap between Depression and Anxiety has led some researchers to conclude that they are manifestations of a broad, non-specific neurotic disorder. However, others believe that they can be distinguished despite sharing symptoms of general distress. The Tripartite Model of Affect proposes an anxiety-specific, a depression-specific and a shared symptoms factor. Watson and Clark developed the Mood and Anxiety Symptom Questionnaire (MASQ) to specifically measure these Tripartite constructs. Early research showed that the MASQ distinguished between dimensions of Depression and Anxiety in non-clinical samples. However, two recent studies have cautioned that the MASQ may show limited validity in clinical populations. The present study investigated the clinical utility of the MASQ in a clinical sample of adolescents and young adults.
Methods
A total of 204 young people consecutively referred to a specialist public mental health service in Melbourne, Australia were approached and 150 consented to participate. Of these, 136 participants completed both a diagnostic interview and the MASQ.
Results
The majority of the sample rated for an Axis-I disorder, with Mood and Anxiety disorders most prevalent. The disorder-specific scales of the MASQ significantly discriminated Anxiety (61.0%) and Mood Disorders (72.8%); however, the predictive accuracy for presence of Anxiety Disorders was very low (29.8%). From ROC analyses, a cut-off of 76 was proposed for the depression scale to indicate 'caseness' for Mood Disorders. The resulting sensitivity/specificity was superior to that of the CES-D.
Conclusion
It was concluded that the depression-specific scale of the MASQ showed good clinical utility, but that the anxiety-specific scale showed poor discriminant validity.
doi:10.1186/1471-244X-7-50
PMCID: PMC2151061  PMID: 17868477
5.  Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms 
Purpose
Short-form patient-reported outcome measures are popular because they minimize patient burden. We assessed the efficiency of static short forms and computer adaptive testing (CAT) using data from the Patient-Reported Outcomes Measurement Information System (PROMIS) project.
Methods
We evaluated the 28-item PROMIS depressive symptoms bank. We used post hoc simulations based on the PROMIS calibration sample to compare several short-form selection strategies and the PROMIS CAT to the total item bank score.
Results
Compared with full-bank scores, all short forms and CAT produced highly correlated scores, but CAT outperformed each static short form in almost all criteria. However, short-form selection strategies performed only marginally worse than CAT. The performance gap observed in static forms was reduced by using a two-stage branching test format.
Conclusions
Using several polytomous items in a calibrated unidimensional bank to measure depressive symptoms yielded a CAT that provided marginally superior efficiency compared to static short forms. The efficiency of a two-stage semi-adaptive testing strategy was so close to CAT that it warrants further consideration and study.
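The two-stage branching format mentioned above can be sketched as a router: every respondent answers a short fixed routing block, and the sum score on that block decides which second-stage block is given. Everything concrete here (function name, item identifiers, thresholds, three severity levels) is hypothetical and chosen only for illustration; the abstract does not describe the actual routing rules.

```python
from typing import List, Sequence


def two_stage_form(
    routing_responses: Sequence[int],
    blocks: Sequence[List[str]],
    thresholds: Sequence[int],
) -> List[str]:
    """Pick the second-stage item block from the routing-block sum score.

    routing_responses: polytomous item scores from the fixed first stage.
    blocks: second-stage item lists ordered from lowest to highest severity.
    thresholds: ascending cut points on the routing sum score; there must be
                exactly len(blocks) - 1 of them (hypothetical values).
    """
    if len(thresholds) != len(blocks) - 1:
        raise ValueError("need exactly one threshold between adjacent blocks")
    routing_score = sum(routing_responses)
    # Count how many cut points the score reaches; that count is the block index.
    level = sum(1 for t in thresholds if routing_score >= t)
    return blocks[level]
```

Unlike a full CAT, the branching decision is made only once, so the format can be administered on paper or with very simple software.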
doi:10.1007/s11136-009-9560-5
PMCID: PMC2832176  PMID: 19941077
Computer adaptive testing; PROMIS; Item response theory; Short form; Two-stage testing
6.  Development of a Computerized Adaptive Test for Depression 
Archives of general psychiatry  2012;69(11):1104-1112.
Context
Unlike other areas of medicine, psychiatry is almost entirely dependent on patient report to assess the presence and severity of disease; therefore, it is particularly crucial that we find both more accurate and efficient means of obtaining that report.
Objective
To develop a computerized adaptive test (CAT) for depression, called the Computerized Adaptive Test–Depression Inventory (CAT-DI), that decreases patient and clinician burden and increases measurement precision.
Design
Case-control study.
Setting
A psychiatric clinic and community mental health center.
Participants
A total of 1614 individuals with and without minor and major depression were recruited for study.
Main Outcome Measures
The focus of this study was the development of the CAT-DI. The 24-item Hamilton Rating Scale for Depression, Patient Health Questionnaire 9, and the Center for Epidemiologic Studies Depression Scale were used to study the convergent validity of the new measure, and the Structured Clinical Interview for DSM-IV was used to obtain diagnostic classifications of minor and major depressive disorder.
Results
A mean of 12 items per study participant was required to achieve a 0.3 SE in the depression severity estimate and maintain a correlation of r=0.95 with the total 389-item test score. Using empirically derived thresholds based on a mixture of normal distributions, we found a sensitivity of 0.92 and a specificity of 0.88 for the classification of major depressive disorder in a sample consisting of depressed patients and healthy controls. Correlations on the order of r=0.8 were found with the other clinician and self-rating scale scores. The CAT-DI provided excellent discrimination throughout the entire depressive severity continuum (minor and major depression), whereas the traditional scales did so primarily at the extremes (eg, major depression).
Conclusions
Traditional measurement fixes the number of items administered and allows measurement uncertainty to vary. In contrast, a CAT fixes measurement uncertainty and allows the number of items to vary. The result is a significant reduction in the number of items needed to measure depression and increased precision of measurement.
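The contrast drawn above (fix the uncertainty, let the number of items vary) can be made concrete with a minimal CAT loop. This sketch is not the CAT-DI algorithm: it assumes a unidimensional two-parameter logistic (2PL) model with dichotomous items, maximum-information item selection, an expected a posteriori (EAP) estimate under a standard normal prior, and a stopping rule on the posterior standard deviation. All names and defaults are illustrative assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Item = Tuple[float, float]  # (discrimination a, severity b); assumed 2PL parameters
GRID = [-4.0 + 0.1 * k for k in range(81)]  # quadrature grid for theta


def p_endorse(theta: float, a: float, b: float) -> float:
    """2PL probability of endorsing an item at severity level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap(responses: Sequence[Tuple[Item, bool]]) -> Tuple[float, float]:
    """Posterior mean and SD of theta on GRID under a N(0, 1) prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for (a, b), endorsed in responses:
            p = p_endorse(t, a, b)
            w *= p if endorsed else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(
    bank: Sequence[Item],
    answer: Callable[[int], bool],
    se_target: float = 0.3,
) -> Tuple[float, float, List[int]]:
    """Administer items until the posterior SD drops to se_target or the bank is exhausted.

    answer(i) returns the respondent's answer to bank item i.
    """
    remaining = list(range(len(bank)))
    responses: List[Tuple[Item, bool]] = []
    administered: List[int] = []
    theta, se = eap(responses)  # starting rule: begin at the prior mean
    while remaining and se > se_target:  # stopping rule: fixed uncertainty
        # Select the unused item that is most informative at the current estimate.
        best = max(remaining, key=lambda i: item_information(theta, *bank[i]))
        remaining.remove(best)
        administered.append(best)
        responses.append((bank[best], answer(best)))
        theta, se = eap(responses)
    return theta, se, administered
```

The length of `administered` differs between respondents, while the final `se` is held at or below the target whenever the bank is large enough. A fixed-length test reverses this: the item count is constant and the precision varies.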
doi:10.1001/archgenpsychiatry.2012.14
PMCID: PMC3551289  PMID: 23117634
7.  A factor analytic investigation of the Tripartite model of affect in a clinical sample of young Australians 
BMC Psychiatry  2008;8:79.
Background
The Mood and Anxiety Symptom Questionnaire (MASQ) was designed to specifically measure the Tripartite model of affect and is proposed to offer a delineation between the core components of anxiety and depression. Factor analytic data from adult clinical samples have shown mixed results; however, no studies employing confirmatory factor analysis (CFA) have supported the predicted structure of distinct Depression, Anxiety and General Distress factors. The Tripartite model has not been validated in a clinical sample of older adolescents and young adults. The aim of the present study was to examine the validity of the Tripartite model using scale-level data from the MASQ and correlational and confirmatory factor analysis techniques.
Methods
137 young people (M = 17.78, SD = 2.63) referred to a specialist mental health service for adolescents and young adults completed the MASQ and diagnostic interview.
Results
All MASQ scales were highly inter-correlated, with the lowest correlation between the depression- and anxiety-specific scales (r = .59). This pattern of correlations was observed for all participants rating for an Axis-I disorder but not for participants without a current disorder (r = .18). Confirmatory factor analyses were conducted to evaluate the model fit of a number of solutions. The predicted Tripartite structure was not supported. A 2-factor model demonstrated superior model fit and parsimony compared to 1- or 3-factor models. These broad factors represented Depression and Anxiety and were highly correlated (r = .88).
Conclusion
The present data lend support to the notion that the Tripartite model does not adequately explain the relationship between anxiety and depression in all clinical populations. Indeed, in the present study this model was found to be inappropriate for a help-seeking community sample of older adolescents and young adults.
doi:10.1186/1471-244X-8-79
PMCID: PMC2561028  PMID: 18799017
8.  An accurate and efficient identification of children with psychosocial problems by means of computerized adaptive testing 
Background
Questionnaires used by health services to identify children with psychosocial problems are often rather short. The psychometric properties of such short questionnaires are mostly less than needed for an accurate distinction between children with and without problems. We aimed to assess whether a short Computerized Adaptive Test (CAT) can overcome the weaknesses of short written questionnaires when identifying children with psychosocial problems.
Method
We used a Dutch national data set obtained from parents of children invited for a routine health examination by Preventive Child Healthcare with 205 items on behavioral and emotional problems (n = 2,041, response 84%). In a random subsample we determined which items met the requirements of an Item Response Theory (IRT) model to a sufficient degree. Using those items, item parameters necessary for a CAT were calculated and a cut-off point was defined. In the remaining subsample we determined the validity and efficiency of a Computerized Adaptive Test using simulation techniques, with current treatment status and a clinical score on the Total Problem Scale (TPS) of the Child Behavior Checklist as criteria.
Results
Out of 205 items available 190 sufficiently met the criteria of the underlying IRT model. For 90% of the children a score above or below cut-off point could be determined with 95% accuracy. The mean number of items needed to achieve this was 12. Sensitivity and specificity with the TPS as a criterion were 0.89 and 0.91, respectively.
Conclusion
An IRT-based CAT is a very promising option for the identification of psychosocial problems in children, as it can lead to an efficient, yet high-quality identification. The results of our simulation study need to be replicated in a real-life administration of this CAT.
doi:10.1186/1471-2288-11-111
PMCID: PMC3199909  PMID: 21816055
9.  Migrating from a legacy fixed-format measure to CAT administration: calibrating the PHQ-9 to the PROMIS depression measures 
Purpose
We provide detailed instructions for analyzing patient-reported outcome (PRO) data collected with an existing (legacy) instrument so that scores can be calibrated to the PRO Measurement Information System (PROMIS) metric. This calibration facilitates migration to computerized adaptive test (CAT) PROMIS data collection, while facilitating research using historical legacy data alongside new PROMIS data.
Methods
A cross-sectional convenience sample (n = 2,178) from the Universities of Washington and Alabama at Birmingham HIV clinics completed the PROMIS short form and Patient Health Questionnaire (PHQ-9) depression symptom measures between August 2008 and December 2009. We calibrated the tests using item response theory. We compared measurement precision of the PHQ-9, the PROMIS short form, and simulated PROMIS CAT.
Results
Dimensionality analyses confirmed the PHQ-9 could be calibrated to the PROMIS metric. We provide code used to score the PHQ-9 on the PROMIS metric. The mean standard errors of measurement were 0.49 for the PHQ-9, 0.35 for the PROMIS short form, and 0.37, 0.28, and 0.27 for 3-, 8-, and 9-item-simulated CATs.
Conclusions
The strategy described here facilitated migration from a fixed-format legacy scale to PROMIS CAT administration and may be useful in other settings.
doi:10.1007/s11136-011-9882-y
PMCID: PMC3175024  PMID: 21409516
Calibration; Computerized adaptive testing; Depression; Item banks; Item response theory; PROMIS
10.  Measuring Physical Functioning in Children with Spinal Impairments with Computerized Adaptive Testing 
Journal of pediatric orthopedics  2008;28(3):330-335.
Background
The purpose of this study was to assess the utility of measuring current physical functioning status of children with complex spinal impairments by applying computerized adaptive testing (CAT) methods. CAT uses a computer interface to administer the optimal items based on previous responses, reducing the number of items needed to obtain a scoring estimate.
Methods
This was a prospective study of 77 subjects (0.6-19.8 years) with spinal impairments who were seen during a routine clinic visit. Using a multidimensional version of the Pediatric Evaluation of Disability Inventory CAT program (PEDI-MCAT), we evaluated content range, accuracy and efficiency, known-groups validity, concurrent validity with the Pediatric Outcomes Data Collection Instrument (PODCI), and test-retest reliability in a sub-sample (n=16) within a two-week interval.
Results
We found the PEDI-MCAT to have sufficient item coverage in both self-care and mobility content for this sample, although a majority of the patients tended to score at the higher ends of both scales. Both the accuracy of PEDI-MCAT scores as compared to a fixed-format of the PEDI (r = 0.98 for both mobility and self-care) and test-retest reliability were very high (self-care: ICC (3,1)=0.98, mobility: ICC(3,1)=0.99). The PEDI-MCAT took an average of 2.9 minutes for the parents to complete. The PEDI-MCAT detected expected differences between patient groups, and scores on the PEDI-MCAT correlated in expected directions with scores from the PODCI domains.
Conclusion
Use of the PEDI-MCAT to assess the physical functioning status, as perceived by parents of children with complex spinal impairments, appears to be feasible and achieves accurate and efficient estimates of self-care and mobility function. Additional item development will be needed at the higher functioning end of the scale to avoid ceiling effects for older children.
Level of Evidence
This is a level II prospective study designed to establish the utility of computer adaptive testing as an evaluation method in a busy pediatric spine practice.
doi:10.1097/BPO.0b013e318168c792
PMCID: PMC2696932  PMID: 18362799
computerized adaptive testing; assessment; outcomes; spine impairments
11.  A Web-Based Computerized Adaptive Testing (CAT) to Assess Patient Perception in Hospitalization 
Background
Many hospitals have adopted mobile nursing carts that can be easily rolled up to a patient’s bedside to access charts and help nurses perform their rounds. However, few papers have reported data regarding the use of wireless computers on wheels (COW) at patients’ bedsides to collect questionnaire-based information of their perception of hospitalization on discharge from the hospital.
Objective
The purpose of this study was to evaluate the relative efficiency of computerized adaptive testing (CAT) and the precision of CAT-based measures of perceptions of hospitalized patients, as compared with those of nonadaptive testing (NAT). An Excel module of our CAT multicategory assessment is provided as an example.
Method
A total of 200 patients who were discharged from the hospital responded to the CAT-based 18-item inpatient perception questionnaire on COW. The number of questions administered was recorded and the responses were calibrated using the Rasch model. They were compared with those from NAT to show the advantage of CAT over NAT.
Results
Patient measures derived from CAT and NAT were highly correlated (r = 0.98) and their measurement precisions were not statistically different (P = .14). CAT required fewer questions than NAT (an efficiency gain of 42%), suggesting a reduced burden for patients. There were no significant differences between groups in terms of gender and other demographic characteristics.
Conclusions
CAT-based administration of surveys of patient perception substantially reduced patient burden without compromising the precision of measuring patients’ perceptions of hospitalization. The Excel module of animation-CAT on the wireless COW that we developed is recommended for use in hospitals.
doi:10.2196/jmir.1785
PMCID: PMC3222179  PMID: 21844001
Computerized adaptive testing; computer on wheels; classic test theory; IRT; item response theory; nonadaptive testing
12.  Dynamic Assessment of Health Outcomes: Time to Let the CAT Out of the Bag? 
Health Services Research  2005;40(5 Pt 2):1694-1711.
Background
The use of item response theory (IRT) to measure self-reported outcomes has burgeoned in recent years. Perhaps the most important application of IRT is computer-adaptive testing (CAT), a measurement approach in which the selection of items is tailored for each respondent.
Objective
To provide an introduction to the use of CAT in the measurement of health outcomes, describe several IRT models that can be used as the basis of CAT, and discuss practical issues associated with the use of adaptive scaling in research settings.
Principal Points
The development of a CAT requires several steps that are not required in the development of a traditional measure including identification of “starting” and “stopping” rules. CAT's most attractive advantage is its efficiency. Greater measurement precision can be achieved with fewer items. Disadvantages of CAT include the high cost and level of technical expertise required to develop a CAT.
Conclusions
Researchers, clinicians, and patients benefit from the availability of psychometrically rigorous measures that are not burdensome. CAT outcome measures hold substantial promise in this regard, but their development is not without challenges.
doi:10.1111/j.1475-6773.2005.00446.x
PMCID: PMC1361218  PMID: 16179003
Measurement; quality of life; psychometrics; reliability
13.  Development of a Computerized Adaptive Test to Assess Health-related Quality of Life in Adults with Asthma 
The Journal of Asthma  2011;49(2):190-200.
Objective
The purpose of this research was to calibrate an item bank for a computerized adaptive test (CAT) of asthma impact on health-related quality of life (HRQOL), test CAT versions of varying lengths, conduct preliminary validity testing, and evaluate item bank readability.
Methods
Asthma Impact Survey (AIS) bank items that passed focus group, cognitive testing, and clinical and psychometric reviews were administered to adults with varied levels of asthma control. Adults self-reporting asthma (N=1106) completed an Internet survey including 88 AIS items, the Asthma Control Test (ACT), and other HRQOL outcome measures. Data were analyzed using classical and modern psychometric methods, real-data CAT simulations, and known groups validity testing.
Results
A bi-factor model with a general factor (asthma impact) and several group factors (cognitive function, fatigue, mental health, physical function, role function, sexual function, self-consciousness/stigma, sleep, and social function) was tested. Loadings on the general factor were above 0.5 and were substantially larger than group factor loadings, and fit statistics were acceptable. Item functioning for most items and fit to the model was acceptable. CAT simulations demonstrated several options for administration and stopping rules. AIS distinguished between respondents with differing levels of asthma control.
Conclusions
The new 50-item AIS item bank demonstrated favorable psychometric characteristics, preliminary evidence of validity, and accessibility at moderate reading levels. Developing item banks for CAT can improve the precise, efficient, and comprehensive monitoring of asthma outcomes, and may facilitate patient-centered care.
doi:10.3109/02770903.2011.633674
PMCID: PMC3320653  PMID: 22115275
asthma control; Asthma Impact Survey; item response theory; patient-reported outcome; health-related quality of life
14.  Development and validation of a computer adaptive test for measuring dyspnea in heart failure 
Journal of cardiac failure  2010;16(8):659-668.
Background
Dyspnea is a common symptom among patients with heart failure. Currently, there is no standardized, rapid, precise method to assess dyspnea.
Methods and Results
From a review of the literature, we pooled questions from various questionnaires assessing dyspnea. A total of 201 patients with heart failure completed all questions in the preliminary item bank. Each item asks how much shortness of breath the patient had when doing an activity. Medical charts were reviewed for hospitalization within 1 or 3 months of completing the questions. We created a dyspnea item bank of 44 items. Computer Adaptive Tests (CAT) generated from this item bank can assess dyspnea by administering on average 10 questions. Simulation CAT scores were generated to compare with the item bank scores. The CAT scores had a correlation of 0.98 with item bank scores. Logistic regression models predicting the probability of being hospitalized from the dyspnea score were statistically significant (p<0.05). A 5-point score increase was associated with a 32% increased odds of hospitalization in 1 month and a 20% increased odds of hospitalization in 3 months.
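The reported odds increases can be related to per-point logistic regression coefficients with a short calculation. The sketch below assumes the dyspnea score enters the model as a single linear term on the log-odds scale, which the abstract does not confirm; the function names are illustrative.

```python
import math


def per_point_log_odds(odds_ratio: float, points: float) -> float:
    """Logistic coefficient per score point implied by an odds ratio over `points`."""
    return math.log(odds_ratio) / points


def odds_ratio_for_increase(beta: float, points: float) -> float:
    """Odds ratio implied by a score increase of `points` for coefficient beta."""
    return math.exp(beta * points)


# A 32% increase in odds per 5 points corresponds to an odds ratio of 1.32.
beta_1_month = per_point_log_odds(1.32, 5)
# Under the linearity assumption, odds ratios multiply: a 10-point increase
# gives 1.32 squared, about 1.74, not 1 + 2 * 0.32.
or_10_points = odds_ratio_for_increase(beta_1_month, 10)
```

Note that these are changes in odds, not in probability; the change in the probability of hospitalization depends on the baseline risk.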
Conclusions
This computer based tool for dyspnea assessment obtains similar precision to that of answering the entire dyspnea item bank with less patient burden.
doi:10.1016/j.cardfail.2010.03.002
PMCID: PMC2913136  PMID: 20670845
15.  Development and Preliminary Testing of a Computerized Adaptive Assessment of Chronic Pain 
The aim of this article is to report the development and preliminary testing of a prototype computerized adaptive test of chronic pain (CHRONIC PAIN-CAT) conducted in two stages: 1) evaluation of various item selection and stopping rules through real data simulated administrations of CHRONIC PAIN-CAT; 2) a feasibility study of the actual prototype CHRONIC PAIN-CAT assessment system conducted in a pilot sample. Item calibrations developed from a US general population sample (N=782) were used to program a pain severity and impact item bank (k=45) and real data simulations were conducted to determine a CAT stopping rule. The CHRONIC PAIN-CAT was programmed on a tablet PC using QualityMetric's Dynamic Health Assessment (DYHNA®) software and administered to a clinical sample of pain sufferers (n=100). The CAT was completed in significantly less time than the static (full item bank) assessment (p<.001). On average, 5.6 items were dynamically administered by CAT to achieve a precise score. Scores estimated from the two assessments were highly correlated (r=.89) and both assessments discriminated across pain severity levels (p<.001, RV=.95). Patients’ evaluations of the CHRONIC PAIN-CAT were favourable.
Perspective
This report demonstrates that the CHRONIC PAIN-CAT is feasible for administration in a clinic. The application has the potential to improve pain assessment and help clinicians manage chronic pain.
doi:10.1016/j.jpain.2009.03.007
PMCID: PMC2763618  PMID: 19595636
Chronic Pain; Item Response Theory; Computer Adaptive Testing; Pain Assessment
16.  Creating scenarios of the impact of COPD and their relationship to COPD Assessment Test (CAT™) scores 
Background
The COPD Assessment Test (CAT™) is a new short health status measure for routine use. New questionnaires require reference points so that users can understand the scores; descriptive scenarios are one way of doing this. A novel method of creating scenarios is described.
Methods
A Bland and Altman plot showed a consistent relationship between CAT scores and scores obtained with the St George's Respiratory Questionnaire for COPD (SGRQ-C), permitting a direct mapping process between CAT and SGRQ items. The severity associated with each CAT item was calculated using a probabilistic model and expressed in logits (the point on the log-odds severity scale at which a patient has a 50% probability of affirming that item). Severity estimates for SGRQ-C items in logits were also available, allowing direct comparisons with CAT items. CAT scores were categorised into Low, Medium, High and Very High Impact. SGRQ items of corresponding severity were used to create scenarios associated with each category.
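The logit scale used above can be illustrated with a small sketch. It assumes a Rasch-type (one-parameter logistic) model for a dichotomous item; the abstract only says "a probabilistic model", so the exact form and the function name are assumptions.

```python
import math


def p_affirm(patient_severity: float, item_severity: float) -> float:
    """Probability that a patient affirms an item, both located in logits.

    Assumed Rasch form: the log-odds of affirming equal the difference
    between the patient's severity and the item's severity.
    """
    return 1.0 / (1.0 + math.exp(-(patient_severity - item_severity)))


# When patient and item sit at the same point, the probability is exactly 0.5.
same_location = p_affirm(1.2, 1.2)
# A patient one logit more severe than the item affirms it about 73% of the time.
one_logit_above = p_affirm(2.2, 1.2)
```

Because items from both questionnaires are located on the same logit scale, an SGRQ-C item and a CAT item with similar severity values describe a similar level of health impact, which is what allows the scenarios to be assembled.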
Results
Each CAT category was associated with a scenario comprising 12 to 16 SGRQ-C items. A severity 'ladder' associating CAT scores with exemplar health status effects was also created. Items associated with 'Low' and 'Medium' Impact appeared to be subjectively quite severe in terms of their effect on daily life.
Conclusions
These scenarios provide users of the CAT with a good sense of the health impact associated with different scores. More generally they provide a surprising insight into the severity of the effects of COPD, even in patients with apparently mild-moderate health status impact.
doi:10.1186/1471-2466-11-42
PMCID: PMC3199910  PMID: 21835018
17.  SIMPOLYCAT: An SAS Program for Conducting CAT Simulation Based on Polytomous IRT Models 
Behavior research methods  2009;41(2):499-506.
A real-data simulation of computerized adaptive testing (CAT) is an important step in real life CAT applications. Such a simulation allows CAT developers to evaluate important features of the CAT system such as item selection and stopping rules before live testing. SIMPOLYCAT, an SAS macro program, was created by the authors to conduct real-data CAT simulation based on polytomous item response theory (IRT) models. In SIMPOLYCAT, item responses can be input from an external file or generated internally based on item parameters provided by users. The program allows users to choose among methods of setting initial θ, approaches to item selection, trait estimators, CAT stopping criteria, polytomous IRT models, and other CAT parameters. In addition, CAT simulation results can be saved easily and used for further study. The purpose of this article is to introduce SIMPOLYCAT, briefly describe the program algorithm and parameters, and provide examples of CAT simulations using generated and real data. Visual comparisons of the results obtained from the CAT simulations are presented.
doi:10.3758/BRM.41.2.499
PMCID: PMC2778068  PMID: 19363190
18.  A Computer-Adaptive Disability Instrument for Lower Extremity Osteoarthritis Research Demonstrated Promising Breadth, Precision and Reliability 
Journal of Clinical Epidemiology  2009;62(8):807-815.
Objective
To develop and evaluate a prototype measure (OA-DISABILITY-CAT) for osteoarthritis research using Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies.
Study Design and Setting
We constructed an item bank consisting of 33 activities commonly affected by lower extremity (LE) osteoarthritis. A sample of 323 adults with LE osteoarthritis reported their degree of limitation in performing everyday activities and completed the Health Assessment Questionnaire-II (HAQ-II). We used confirmatory factor analyses to assess scale unidimensionality and IRT methods to calibrate the items and examine the fit of the data. Using CAT simulation analyses, we examined the performance of OA-DISABILITY-CATs of different lengths compared to the full item bank and the HAQ-II.
Results
One distinct disability domain was identified. The 10-item OA-DISABILITY-CAT demonstrated a high degree of accuracy compared with the full item bank (r=0.99). The item bank and the HAQ-II scales covered a similar estimated scoring range. In terms of reliability, 95% of OA-DISABILITY reliability estimates were over 0.83 versus 0.60 for the HAQ-II. Except at the highest scores, the 10-item OA-DISABILITY-CAT demonstrated superior precision to the HAQ-II.
Conclusion
The prototype OA-DISABILITY-CAT demonstrated promising measurement properties compared to the HAQ-II, and is recommended for use in LE osteoarthritis research.
doi:10.1016/j.jclinepi.2008.10.004
PMCID: PMC3328293  PMID: 19216052
outcome assessment (Health Care); osteoarthritis; clinical trials; disability; item response theory; computer adaptive testing
19.  Development Of A Computer-Adaptive Version Of The Forgotten Joint Score 
The Journal of Arthroplasty  2013;28(3):418-422.
Patient-reported outcomes (PROs) are an important endpoint in orthopedics providing comprehensive information about patients' perspectives on treatment outcome. Computer-adaptive test (CAT) measures are an advanced method for assessing PROs using item sets that are tailored to the individual patient. This provides increased measurement precision and reduces the number of items. We developed a CAT version of the Forgotten Joint Score (FJS), a measure of joint awareness in everyday life. CAT development was based on FJS data from 580 patients after THA or TKA (808 assessments). The CAT version reduced the number of items by half at comparable measurement precision. In a feasibility study we administered the newly developed CAT measure on tablet PCs and found that patients actually preferred electronic questionnaires over paper–pencil questionnaires.
doi:10.1016/j.arth.2012.08.026
PMCID: PMC3587796  PMID: 23219089
patient-reported outcomes; forgotten joint score; electronic data capture; computer-adaptive testing; total knee arthroplasty; total hip arthroplasty
20.  Comparison of two Bayesian methods to detect mode effects between paper-based and computerized adaptive assessments: a preliminary Monte Carlo study 
Background
Computerized adaptive testing (CAT) is being applied to health outcome measures developed as paper-and-pencil (P&P) instruments. Differences in how respondents answer items administered by CAT vs. P&P can increase error in CAT-estimated measures if not identified and corrected.
Method
Two methods for detecting item-level mode effects are proposed using Bayesian estimation of posterior distributions of item parameters: (1) a modified robust Z (RZ) test, and (2) 95% credible intervals (CrI) for the CAT-P&P difference in item difficulty. A simulation study was conducted under the following conditions: (1) data-generating model (one- vs. two-parameter IRT model); (2) moderate vs. large DIF sizes; (3) percentage of DIF items (10% vs. 30%), and (4) mean difference in θ estimates across modes of 0 vs. 1 logits. This resulted in a total of 16 conditions with 10 generated datasets per condition.
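The count of 16 conditions follows from fully crossing the four two-level factors described above. A short sketch of that arithmetic, assuming a full factorial design; the factor labels (for example "1PL" and "2PL" for the one- and two-parameter models) are shorthand introduced here and are not taken from the abstract:

```python
from itertools import product

# Four two-level factors from the Method description (labels are illustrative).
factors = {
    "model": ["1PL", "2PL"],
    "dif_size": ["moderate", "large"],
    "dif_items_pct": [10, 30],
    "mean_theta_diff_logits": [0, 1],
}

# Fully crossed design: every combination of factor levels is one condition.
conditions = [dict(zip(factors, combo)) for combo in product(*factors.values())]

DATASETS_PER_CONDITION = 10
n_conditions = len(conditions)
n_datasets = n_conditions * DATASETS_PER_CONDITION

print(n_conditions, n_datasets)  # prints "16 160"
```

Under this reading, the study analyzed 2 × 2 × 2 × 2 = 16 conditions and 160 generated datasets in total.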
Results
Both methods evidenced good to excellent false positive control, with RZ providing better control of false positives and CrI showing slightly higher power, irrespective of measurement model. False positives increased when items were very easy to endorse and when there were mode differences in mean trait level. True positives were predicted by CAT item usage, absolute item difficulty and item discrimination. RZ outperformed CrI, due to better control of false positive DIF.
Conclusions
Whereas false positives were well controlled, particularly for RZ, power to detect DIF was suboptimal. Research is needed to examine the robustness of these methods under varying prior assumptions concerning the distribution of item and person parameters and when data fail to conform to prior assumptions. False identification of DIF when items were very easy to endorse is a problem warranting additional investigation.
doi:10.1186/1471-2288-12-124
PMCID: PMC3552735  PMID: 22900979
21.  A Multidimensional Computer Adaptive Test Approach to Dyspnea Assessment 
Objective
To develop and test a prototype dyspnea computer adaptive test.
Design
Prospective study.
Setting
Two outpatient medical facilities.
Participants
A convenience sample of 292 adults with COPD.
Interventions
Not applicable
Main Outcome Measure
We developed a modified and expanded item bank and computer adaptive test (CAT) for the Dyspnea Management Questionnaire (DMQ), an outcome measure consisting of four dyspnea dimensions: dyspnea intensity, dyspnea anxiety, activity avoidance, and activity self-efficacy.
Results
Factor analyses supported a four-dimensional model underlying the 71 DMQ items. The DMQ item bank achieved acceptable Rasch model fit statistics, good measurement breadth with minimal floor and ceiling effects, and evidence of high internal consistency reliability (α = 0.92 to 0.98). Using CAT simulation analyses, the DMQ-CAT showed high measurement accuracy compared to the total item pool (r = .83 to .97, p < .0001) and evidence of good to excellent concurrent (r = −.61 to −.80, p < .0001) validity. All DMQ-CAT domains showed evidence for known-groups validity (p ≤ 0.001).
Conclusions
The DMQ-CAT reliably and validly captured four distinct dyspnea domains. Multidimensional dyspnea assessment in COPD is needed to better measure the effectiveness of pharmacologic, pulmonary rehabilitation, and psychosocial interventions in not only alleviating the somatic sensation of dyspnea but also reducing dysfunctional emotions, cognitions, and behaviors associated with dyspnea, especially for anxious patients.
doi:10.1016/j.apmr.2011.05.020
PMCID: PMC3526016  PMID: 21963123
Dyspnea; COPD; Outcomes assessment; Reliability; Validity
22.  Creating a Computer Adaptive Test Version of the Late-Life Function & Disability Instrument 
Background
This study applied Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies to develop a prototype function and disability assessment instrument for use in aging research. Herein, we report on the development of the CAT version of the Late-Life Function & Disability instrument (Late-Life FDI) and evaluate its psychometric properties.
Methods
We employed confirmatory factor analysis, IRT methods, validation, and computer simulation analyses of data collected from 671 older adults residing in residential care facilities. We compared accuracy, precision, and sensitivity to change of scores from CAT versions of two Late-Life FDI scales with scores from the fixed-form instrument. Score estimates from the prototype CAT versus the original instrument were compared in a sample of 40 older adults.
Results
Distinct function and disability domains were identified within the Late-Life FDI item bank and used to construct two prototype CAT scales. Using retrospective data, scores from computer simulations of the prototype CAT scales were highly correlated with scores from the original instrument. The results of computer simulation, accuracy, precision, and sensitivity to change of the CATs closely approximated those of the fixed-form scales, especially for the 10- or 15-item CAT versions. In the prospective study each CAT was administered in less than 3 minutes and CAT scores were highly correlated with scores generated from the original instrument.
Conclusions
CAT scores of the Late-Life FDI were highly comparable to those obtained from the full-length instrument with a small loss in accuracy, precision, and sensitivity to change.
PMCID: PMC2718692  PMID: 19038841
outcome assessment (Health Care); geriatrics; rehabilitation
23.  Validation of a computer-adaptive test to evaluate generic health-related quality of life 
Background
Health Related Quality of Life (HRQoL) is a relevant variable in the evaluation of health outcomes. Questionnaires based on Classical Test Theory typically require a large number of items to evaluate HRQoL. Computer Adaptive Testing (CAT) can be used to reduce test length while maintaining and, in some cases, improving accuracy. This study aimed at validating a CAT based on Item Response Theory (IRT) for evaluation of generic HRQoL: the CAT-Health instrument.
Methods
Cross-sectional study of subjects aged over 18 attending Primary Care Centres for any reason. CAT-Health was administered along with the SF-12 Health Survey. Age, gender and a checklist of chronic conditions were also collected. CAT-Health was evaluated considering: 1) feasibility: completion time and test length; 2) content range coverage, Item Exposure Rate (IER) and test precision; and 3) construct validity: differences in the CAT-Health scores according to clinical variables and correlations between both questionnaires.
Results
396 subjects answered CAT-Health and SF-12, 67.2% females, mean age (SD) 48.6 (17.7) years. 36.9% did not report any chronic condition. Median completion time for CAT-Health was 81 seconds (interquartile range = 59-118) and it increased with age (p < 0.001). The median number of items administered was 8 (interquartile range = 6-10). Neither ceiling nor floor effects were found for the score. No item in the pool had an IER of 100%, and the IER exceeded 5% for 27.1% of the items. Test Information Function (TIF) peaked between levels -1 and 0 of HRQoL. Statistically significant differences were observed in the CAT-Health scores according to the number and type of conditions.
Conclusions
Although domain-specific CATs exist for various areas of HRQoL, CAT-Health is one of the first IRT-based CATs designed to evaluate generic HRQoL and it has proven feasible, valid and efficient, when administered to a broad sample of individuals attending primary care settings.
doi:10.1186/1477-7525-8-147
PMCID: PMC3022567  PMID: 21129169
24.  Development of a Parent-Report Cognitive Function Item Bank Using Item Response Theory and Exploration of its Clinical Utility in Computerized Adaptive Testing 
Journal of Pediatric Psychology  2011;36(7):766-779.
Objective
The purpose of this study is to report the reliability, validity, and clinical utility of a parent-report perceived cognitive function (pedsPCF) item bank.
Methods
From the U.S. general population, 1,409 parents of children aged 7–17 years completed 45 pedsPCF items. Their psychometric properties were evaluated using Item Response Theory (IRT) approaches. Receiver operating characteristic (ROC) curves and discriminant function analysis were used to predict clinical problems on child behavior checklist (CBCL) scales. A computerized adaptive testing (CAT) simulation was used to evaluate clinical utility.
Results
The final 43-item pedsPCF item bank demonstrates no item bias, has acceptable IRT parameters, and provides good prediction of related clinical problems. CAT simulation resulted in correlations of 0.98 between CAT and the full-length pedsPCF.
Conclusions
The pedsPCF has sound psychometric properties, U.S. general population norms, and a brief-yet-precise CAT version is available. Future work will evaluate pedsPCF in other clinical populations in which cognitive function is important.
doi:10.1093/jpepsy/jsr005
PMCID: PMC3146757  PMID: 21378106
assessment; cancer and oncology; cognitive assessment; computer applications/eHealth; neuropsychology; quality of life
25.  Usability Evaluation of a Web-Based Support System for People With a Schizophrenia Diagnosis 
Background
Routine Outcome Monitoring (ROM) is a systematic way of assessing service users’ health conditions for the purpose of better aiding their care. ROM consists of various measures used to assess a service user’s physical, psychological, and social condition. While ROM is becoming increasingly important in the mental health care sector, one of its weaknesses is that ROM is not always sufficiently service user-oriented. First, clinicians tend to concentrate on those ROM results that provide information about clinical symptoms and functioning, whereas it has been suggested that a service user-oriented approach needs to focus on personal recovery. Second, service users have limited access to ROM results and they are often not equipped to interpret them. These problems need to be addressed, as access to resources and the opportunity to share decision making has been indicated as a prerequisite for service users to become a more equal partner in communication with their clinicians. Furthermore, shared decision making has been shown to improve the therapeutic alliance and to lead to better care.
Objective
Our aim is to build a web-based support system which makes ROM results more accessible to service users and to provide them with more concrete and personalized information about their functioning (ie, symptoms, housing, social contacts) that they can use to discuss treatment options with their clinician. In this study, we will report on the usability of the web-based support system for service users with schizophrenia.
Methods
First, we developed a prototype of a web-based support system in a multidisciplinary project team, including end-users. We then conducted a usability study of the support system consisting of (1) a heuristic evaluation, (2) a qualitative evaluation and (3) a quantitative evaluation.
Results
Fifteen service users with a schizophrenia diagnosis and four information and communication technology (ICT) experts participated in the study. The results show that people with a schizophrenia diagnosis were able to use the support system easily. Furthermore, the content of the advice generated by the support system was considered meaningful and supportive.
Conclusions
This study shows that the support system prototype has valuable potential to improve the ROM practice and it is worthwhile to further develop it into a more mature system. Furthermore, the results add to prior research into web applications for people with psychotic disorders, in that it shows that this group of end users can work with web-based and computer-based systems, despite the cognitive problems they experience.
doi:10.2196/jmir.1921
PMCID: PMC3374538  PMID: 22311883
Schizophrenia; Web-Based systems; Recommendation systems; usability testing; self-management
