The Panic Disorder Severity Scale (PDSS) is a promising candidate for a standard global rating scale for panic disorder. For a clinical scale to be useful, we need guidelines for interpreting its scores and their changes, and for defining clinical change points such as response and remission.
We used individual patient data from two large randomized controlled trials of panic disorder (total n=568). Study participants were administered the PDSS and the Clinical Global Impression (CGI)-Severity and -Improvement. We applied the equipercentile linking technique to draw correspondences between PDSS and CGI-Severity, numeric changes in PDSS and CGI-Improvement, and percent changes in PDSS and CGI-Improvement.
The interpretation of the PDSS total score differed according to the presence or absence of agoraphobia. When the patients were not agoraphobic, score ranges 0–1 corresponded with “Normal,” 2–5 with “Borderline,” 6–9 with “Slightly ill,” 10–13 with “Moderately ill,” and 14 and above with “Markedly ill.” When the patients were agoraphobic, score ranges 3–7 meant “Borderline ill,” 8–10 “Slightly ill,” 11–15 “Moderately ill,” and 16 and above “Markedly ill.” The relationship between PDSS change and CGI-Improvement was more linear when measured as percent change than as numeric change, and was indistinguishable for those with or without agoraphobia. A decrease of 75–100% was considered “Very much improved,” one of 40–74% “Much improved,” and one of 10–39% “Minimally improved.”
We propose that “remission” of panic disorder be defined by PDSS scores of 5 or less and its “response” by 40% or greater reduction.
A workgroup on assessment standardization in panic disorder was convened by the NIMH in 1993 and agreed on the desirability of a global, composite rating scale of severity for panic disorder. A subsequent review of the literature revealed that only a small and variable percentage of study reports recorded any form of global rating of the disorder. Four of the authors (Shear, Woods, Gorman and Barlow) therefore developed the Panic Disorder Severity Scale (PDSS) [3, 4] as a brief clinician-rated scale to assess overall panic disorder severity.
The PDSS has performed well and meets the need for a reliable global assessment. The instrument, modeled after the Yale-Brown Obsessive Compulsive Scale, consists of seven items, each rated on a 0 to 4 scale (0 denoting none, and higher ratings reflecting greater degrees of symptom severity). The items assess frequency of panic attacks, distress caused by panic attacks, anticipatory anxiety, agoraphobic fear/avoidance, panic-related sensation fear/avoidance, and work and social impairment. Each item is clearly defined and is equipped with semi-structured interview guides as well as with explicit anchor points. In the first study of its performance, this scale demonstrated adequate internal consistency and reliability, excellent inter-rater reliability, good discriminant validity and sensitivity to change. A replication study with a new set of patients confirmed its reliability and convergent and discriminant validity, and added information about a cut-score to discriminate patients with and without current panic disorder. The scale has gained wide acceptance and has now been translated into at least nine languages (Spanish, Portuguese, Italian, Hungarian, Finnish, Serbo-Croatian, Japanese, Korean, Turkish), with satisfactory reliability and validity, comparable to the original English version [7–9].
In order for a rating scale to be optimally useful, clinicians need to know how to interpret the obtained scores; for example, they need to know how severely ill a patient is when the score is, say, 20, and also how much better a patient has become when the score has decreased, say, from 20 to 15.
Moreover, there is not yet consensus regarding definitions of clinically important judgments of response and remission. The availability of clear guidelines for estimating response and remission would greatly facilitate communication among researchers, clinicians and patients. Consistent use of a reliable rating scale, anchored with such clinically meaningful guidelines, would also substantially improve the usefulness of meta-analytic reviews. For example, a comprehensive Cochrane search identified 23 relevant randomized comparisons of medication and psychotherapy in panic disorder. However, the primary outcome of “response” could be determined in only 17 of these, thus rendering the pooled results susceptible to selective outcome reporting bias [11, 12].
One way to increase clinical interpretability of a psychiatric rating scale is the so-called anchor-based approach, by which a new instrument is compared with a standard or anchor that is itself interpretable. Leucht and his colleagues adopted this approach for the Positive and Negative Syndrome Scale (PANSS) by comparing it with the Clinical Global Impression (CGI) scales, which are by themselves informative and interpretable. Similar evidence-based guidelines for interpretation are now available for the Brief Psychiatric Rating Scale, the Hamilton Rating Scale for Depression, the Montgomery-Asberg Depression Rating Scale, the Hamilton Rating Scale for Anxiety, the Panic and Agoraphobia Scale, and the Liebowitz Social Anxiety Scale.
The present paper aims to utilize such a method in order to provide further guidelines for the interpretation of the PDSS.
We used individual patient data from two large randomized controlled trials of panic disorder with or without agoraphobia conducted by four of the authors (Barlow, Gorman, Shear and Woods). The first was the Multi-Center Collaborative Treatment Study of Panic Disorder (MCCTSPD), designed to compare medication, cognitive behavioral treatment (CBT) and their combination. A total of 312 patients with panic disorder (DSM-III-R), recruited across 4 sites, were randomly assigned to receive imipramine only, CBT only, placebo only, CBT plus imipramine or CBT plus placebo. The patients were treated weekly for 3 months (acute phase treatment); responders were then seen monthly for 6 months (maintenance treatment), and those who maintained their response were followed up for 6 months after treatment discontinuation. Study participants were administered the PDSS and the CGI-Severity at baseline, and then the PDSS, the CGI-Severity and the CGI-Improvement at end of acute phase treatment, at end of maintenance treatment and at follow-up. The outcome assessors were social workers, doctoral level psychologists or advanced doctoral psychology students who had been trained to reliability prior to the beginning of the study, and participated in bimonthly conference calls to ensure continued reliability. In addition, all assessment sessions were audiotaped and inter-rater reliability was determined for a randomly selected 10% of these interviews. Reliability on main measures remained above 90%.
The second study, Treatment of Panic Disorder Long-Term Study (TOPDLTS), was designed to determine long-term outcome following open treatment with CBT. The design of this study entailed an acute phase open treatment with CBT during which all participants diagnosed with panic disorder with or without agoraphobia (DSM-IV) received 11 weeks of CBT (completer n=256). The current paper includes data from the acute phase open trial. Participants were administered the PDSS at baseline, and the PDSS plus the CGI-Improvement at post treatment. As in our first study, evaluators had been trained to reliability, participated in monthly supervision conference calls but were kept blind to the allocated treatment. Ten percent of all assessments were randomly selected for monitoring throughout the course of the study. The intraclass correlation coefficient for the PDSS was 0.99.
We utilized a modified CGI-Severity Scale, anchored for assessment of overall severity of panic disorder as follows:
The CGI-Improvement Scale consisted of a 7-point scale rating overall improvement in panic disorder distress and impairment, in comparison with an earlier time point as specified in the study protocol. The following panic disorder-specific anchor points were provided:
In the psychometric literature the search for corresponding points on different, but correlated, measures is referred to as ‘linking’. For this study we used equipercentile linking, a technique that identifies those scores on both measures that have the same percentile rank, by using the SAS program EQUIPERCENTILE, an implementation of the algorithms described in Chapter 2 of Kolen and Brennan. In the first step, percentile rank functions are calculated for both variables. Using the percentile rank function of one variable and the inverse percentile rank function of the other, one then finds for every score of one variable a score on the other variable that has the same percentile rank. Note that, although often used, a regression analysis would not have been appropriate because linear regression treats one scale as the independent variable measured without error and the other as the dependent variable measured with error. This is conceptually wrong because both variables are measured with random error [14, 15].
Linking the PDSS and CGI-Severity was possible with the MCCTSPD dataset only, as the second study did not utilize the CGI-Severity rating. In the MCCTSPD study we utilized the baseline, post-acute, post-maintenance and follow-up assessments, at which, where available, patients were administered the PDSS and the CGI-Severity simultaneously. We first computed the Pearson correlation coefficients for each time point, in order to confirm that these two scales do indeed correlate with each other. We then graphically examined the linking between these two measures at the four time points. The PDSS scores corresponding with each of the CGI-Severity anchoring points were read off from the graphs.
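The two steps described above (percentile rank functions, then inversion) can be sketched in a few lines of Python. This is a simplified illustration and not the SAS program EQUIPERCENTILE: the function names are hypothetical, the percentile rank is the common "percent below plus half the percent at the score" definition, and the inverse is obtained by plain linear interpolation without any smoothing.

```python
import numpy as np

def percentile_ranks(scores, grid):
    """Mid-percentile rank (0-100) of each value in `grid` within `scores`:
    percent of scores strictly below plus half the percent exactly at it."""
    scores = np.asarray(scores, dtype=float)
    grid = np.asarray(grid, dtype=float)
    below = (scores[None, :] < grid[:, None]).mean(axis=1)
    at = (scores[None, :] == grid[:, None]).mean(axis=1)
    return 100.0 * (below + 0.5 * at)

def equipercentile_link(x_scores, y_scores, x_grid):
    """For each value in `x_grid`, return the score on the y scale that
    has the same percentile rank (assumed, simplified procedure)."""
    y_grid = np.unique(y_scores)            # sorted distinct y scores
    pr_x = percentile_ranks(x_scores, x_grid)
    pr_y = percentile_ranks(y_scores, y_grid)
    # pr_y is strictly increasing over distinct observed scores, so the
    # inverse percentile rank function can be read off by interpolation.
    # Ranks outside the observed range are clamped to the end points.
    return np.interp(pr_x, pr_y, y_grid)
```

Called with the observed CGI-Severity ratings as `x_scores`, the PDSS totals as `y_scores`, and `x_grid` set to the CGI anchor points, the function returns one PDSS value per anchor point. Because both scales are treated symmetrically through their rank distributions, neither is assumed to be measured without error.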
Both datasets permitted us to link changes in the PDSS and the CGI-Improvement. We examined these correspondences both in terms of the numeric change and the percentage change in the PDSS total scores. In the first study, CGI-Improvement was measured at post-acute, post-maintenance and follow-up, each time asking the rater to compare the patient’s current state with the baseline. The CGI-Improvement was anchored to measure the change from baseline during the acute phase of study in TOPDLTS.
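As a small worked illustration of the two change measures, the sketch below computes the numeric and the percent reduction from baseline. The function name and the sign convention (positive values mean improvement) are assumptions made for this example.

```python
def pdss_change(baseline_total, current_total):
    """Return (numeric reduction, percent reduction) of the PDSS total
    score relative to baseline. Positive values mean improvement."""
    if baseline_total <= 0:
        # Percent change is undefined for a baseline of zero.
        raise ValueError("baseline PDSS total must be greater than 0")
    numeric = baseline_total - current_total
    percent = 100.0 * numeric / baseline_total
    return numeric, percent
```

For a patient whose score falls from 20 to 15, this gives a numeric reduction of 5 points and a percent reduction of 25%. The same 5-point drop from a baseline of 10 would be a 50% reduction, which is why the two measures can relate differently to CGI-Improvement.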
It is known that the source of disability among patients with panic disorder may differ depending on the degree of agoraphobia. We therefore ran sensitivity analyses by repeating all the analyses separately for those currently with no to mild agoraphobia (PDSS item 4 “agoraphobic fear/avoidance” = 0 or 1) and those currently with moderate to extremely severe agoraphobia (PDSS item 4 = 2, 3 or 4).
Table 1 tabulates the baseline demographic and clinical characteristics of the included patients from the MCCTSPD and TOPDLTS.
The Pearson correlation coefficients between the PDSS and the CGI-Severity were 0.63, 0.84, 0.88 and 0.89 respectively for baseline (n=278), end of acute phase treatment (n=247), end of maintenance treatment (n=167) and at follow-up (n=168) in the MCCTSPD.
Changes in PDSS scores and CGI-Improvement showed correlations in a similar range as the severity scores both in the MCCTSPD and the TOPDLTS studies. In the former, the Pearson correlation coefficients for the absolute change scores of the PDSS were −0.75 (n=231) post acute treatment, −0.65 (n=157) post maintenance and −0.77 (n=151) at follow-up. When we transformed scores to percentages, correlations were −0.83 (n=231) post acute treatment, −0.81 (n=157) post maintenance and −0.83 (n=151) at follow-up, all statistically significantly greater than for absolute scores (t=4.43, t=5.49, and t=2.22 respectively, all p<0.05). In the second study, the correlation between CGI-Improvement and absolute change in PDSS was −0.75 and that between CGI-Improvement and percent change in PDSS was −0.83 at end of the open-label acute phase treatment (n=256). The correlation was again significantly stronger using the percent transformation (t=4.5, p<0.001).
Figure 1 shows the result of the linking between the PDSS and the CGI-Severity from the MCCTSPD at baseline, at end of acute phase treatment, at end of maintenance treatment and at end of follow-up. As seen in the Figure, linking between PDSS and CGI-Severity was linear and virtually identical for all four time points.
The graphs suggest that being considered “Normal” on the CGI-Severity corresponds approximately to a PDSS total score of 0, being considered “Borderline” to 3, being considered “Slightly ill” to 8, being considered “Moderately ill” to 12, being considered “Markedly ill” to 16–17, and being considered “Among the most severely ill” to 21–22.
Figure 2 shows the linking between the absolute change in PDSS scores and the CGI-Improvement from the MCCTSPD and the TOPDLTS. The three graphs from the MCCTSPD and the one from the TOPDLTS depicting the open-label acute phase treatment, all of which evaluated the changes from baseline, were virtually overlapping. Here the linking was possible only within the “improved” range because very few patients deteriorated and there were not enough data in the upper left area of the graphs.
Readings of these graphs suggest that “Minimally improved” on the CGI-Improvement corresponds approximately to an absolute decrease of 3 points in PDSS, “Much improved” to that of about 6–7 points, and “Very much improved” to that of 12 points.
Figure 3 shows the linking functions between the percent change in the PDSS scores and the CGI-Improvement from the MCCTSPD and the TOPDLTS. Comparing Figure 3 with Figure 2 shows that percent transformation helps to linearize the relation between PDSS change and CGI-Improvement. Again the two sets of graphs from the MCCTSPD and the TOPDLTS are largely consistent with each other and suggest the following. In comparison with the beginning of the treatment, very few patients become “much or very much worse,” so we cannot give precise corresponding values for these anchor points.
Being rated “Minimally worse” corresponds with a 20–30% increase in PDSS score, “Minimally improved” with a 20% decrease, “Much improved” with a 55% decrease and “Very much improved” with about a 90% decrease.
The linking of PDSS total score with CGI-Severity shows clear-cut differences between the subsamples: for all points in time and every given CGI-Severity score the patients with agoraphobia scored 1 to 3 points higher than those without agoraphobia on the PDSS (Figure available from the first author upon request). On the other hand, with regard to the linking between percent improvement of the PDSS and the CGI-Improvement, there were no interpretable differences between these two subsamples, at least in those areas where we have enough data points (Figure available from the first author upon request).
Because the EQUIPERCENTILE method supposes a continuous distribution of the CGI scores, the range of scores that corresponds with, for example, “Moderately ill” (CGI-Severity score of 4) can be read off from the graphs as the PDSS values that correspond with CGI scores of 3.5 and 4.5. Based on the present findings, we suggest the guidelines depicted in Tables 2 and 3 for the interpretation of the PDSS. Table 2 provides the interpretation of the PDSS total scores for patients currently with agoraphobia and currently without, separately. Table 3 provides the interpretation of the PDSS percent changes.
These guidelines can give us a hint as to how to define “response” and “remission” if we are to base their judgment on the global severity of the disorder. “Response” is most often interpreted as “Much or very much improved” in terms of the CGI-Improvement, whereas “remission” is thought to be equivalent to “Borderline ill or normal” on the CGI-Severity. Using these definitions, in conjunction with the analyses presented in this paper, we propose that a PDSS percent reduction of 40% or greater be used to identify “response” and that PDSS scores of 5 or less be considered “remission” of panic disorder (we would not consider patients currently with agoraphobia to be in remission).
These evidence-based interpretations are remarkably in line with existing suggestions in the literature. We had formerly defined PDSS response as 40% or greater reduction from baseline in the first report from the MCCTSPD because this score represented the optimum cutoff in an ROC analysis to detect responders defined as “Much or very much improved” on CGI-Improvement and “Slightly ill” or better on CGI-Severity. Our re-analysis of the MCCTSPD dataset using the EQUIPERCENTILE method confirmed this finding, and a new analysis from our second study (TOPDLTS) replicated this cut score. Yamamoto et al proposed the following rules of thumb for interpreting absolute PDSS scores: scores up to 10 correspond with “mild,” those between 11 and 15 with “moderate,” and those at or above 16 with “severe” panic disorder. This interpretative guide, although based on only 24 Japanese patients with panic disorder, is roughly consistent with our results based on over 200 American patients. These findings can also be usefully integrated with those of a previous study by the second author (Shear et al 2001) in which we found that a PDSS score of 8 accurately distinguished individuals with current panic disorder from those without, because 8 represents the most representative point estimate for the “Slightly ill” range among those currently without agoraphobia.
There are several limitations, however. First, we must keep in mind the conceptual difficulty of defining severity, or its change, of a clinical condition by the numeric sum of a rating scale. The psychometric assumption behind rating scales, that individual item scores can be summed to produce a total that has a linear relationship with the clinical phenomenon it was designed to measure, has been called into question [23, 24]. How to overcome this apparent shortcoming, however, is yet to be specified. In this article we have followed the standard psychometric conventions to inform clinical practice, and we believe that our emphasis on interpretability of total scores as well as their changes is in accordance with Feinstein’s call for clinical sensibility of rating scales. Secondly, the PDSS and CGI ratings were not independent, as the same rater completed both scales based on the same assessment interview. All the raters, however, had been trained to reliability and were under continuous supervision throughout the trials. Moreover, whether the same or different raters rate the scale in question and the CGI may not always affect the anchoring. For example, in a similar attempt to provide an interpretative guideline for the Hamilton Rating Scale for Depression, data from studies where the ratings of the two scales were done by one or two raters were quite convergent. Lastly, the first study (MCCTSPD) included only individuals with mild to moderate agoraphobia, and participants currently taking psychotropic medications were excluded. In the second study, however, exclusion criteria were much less restrictive: participants could have any degree of agoraphobia, and current medications were permitted if the patient was willing to consider discontinuation during the open treatment phase.
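The interpretation bands can be expressed as a simple lookup. The sketch below uses the score ranges reported in the results summary of this paper; the function and constant names are hypothetical, and the handling of values outside the reported ranges (returning `None`) and of non-integer percentages (treating each lower bound as a threshold) are assumptions of this example.

```python
from typing import Optional

# (lowest score, highest score, label); 28 is the maximum PDSS total
# (seven items, each rated 0 to 4).
BANDS_WITHOUT_AGORAPHOBIA = [
    (0, 1, "Normal"), (2, 5, "Borderline"), (6, 9, "Slightly ill"),
    (10, 13, "Moderately ill"), (14, 28, "Markedly ill"),
]
BANDS_WITH_AGORAPHOBIA = [
    (3, 7, "Borderline ill"), (8, 10, "Slightly ill"),
    (11, 15, "Moderately ill"), (16, 28, "Markedly ill"),
]

def interpret_severity(total: int, agoraphobia: bool) -> Optional[str]:
    """Map a PDSS total score to its severity label, or None when the
    score falls outside the reported ranges."""
    if not 0 <= total <= 28:
        raise ValueError("PDSS total must be between 0 and 28")
    bands = BANDS_WITH_AGORAPHOBIA if agoraphobia else BANDS_WITHOUT_AGORAPHOBIA
    for low, high, label in bands:
        if low <= total <= high:
            return label
    return None

def interpret_improvement(percent_reduction: float) -> Optional[str]:
    """Map a percent reduction in PDSS total to an improvement label,
    or None below the lowest reported band."""
    if percent_reduction >= 75:
        return "Very much improved"
    if percent_reduction >= 40:
        return "Much improved"
    if percent_reduction >= 10:
        return "Minimally improved"
    return None
```

Note that the same total score can receive different labels in the two groups: a score of 10 reads as "Moderately ill" without agoraphobia but as "Slightly ill" with it.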
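The proposed definitions translate into two short checks, sketched below. The function names are hypothetical. Identifying "currently with agoraphobia" with a PDSS item 4 rating of 2 or higher follows the grouping used in the sensitivity analyses; applying that threshold to the remission rule is an assumption of this example.

```python
def is_response(baseline_total: int, current_total: int) -> bool:
    """Response: PDSS total reduced by 40% or more from baseline."""
    if baseline_total <= 0:
        raise ValueError("baseline PDSS total must be greater than 0")
    percent_reduction = 100.0 * (baseline_total - current_total) / baseline_total
    return percent_reduction >= 40.0

def is_remission(current_total: int, item4_agoraphobia: int) -> bool:
    """Remission: PDSS total of 5 or less in a patient who is not
    currently agoraphobic (item 4 rated 0 or 1, assumed threshold)."""
    currently_agoraphobic = item4_agoraphobia >= 2
    return current_total <= 5 and not currently_agoraphobic
```

A patient moving from 20 to 12 meets the response criterion (a 40% reduction) without being in remission, which illustrates that the two definitions capture different aspects of outcome.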
On the other hand, strengths of our study include the large sample size, rigorously maintained reliability of the PDSS and the CGI ratings by trained raters, psychometrically sound equipercentile linking between the test and the anchor, and replication of the findings across two independent samples.
In summary, we believe that our findings represent a significant pragmatic contribution to the treatment of panic disorder and enhance the ability to link research and practice. The guideline we propose will assist clinical investigators in translating findings to be interpretable by practitioners and patients, and will also support practitioners in their use of the PDSS in management of panic disorder.
MCCTSPD and TOPDLTS were supported by National Institute of Mental Health grants MH045963 to Dr Gorman at Columbia University Hillside-Hospital; MH045964 to Dr Shear at Department of Psychiatry, University of Pittsburgh School of Medicine; MH045965 to Dr Barlow at Boston University; and MH045966 to Dr Woods at Department of Psychiatry, Yale University.
Financial disclosure: Dr Furukawa has received research funds and speaking fees from Asahi Kasei, Astellas, Dai-Nippon Sumitomo, Eisai, Eli Lilly, GlaxoSmithKline, Janssen, Kyowa Hakko, Meiji, Nikken Kagaku, Organon, Otsuka, Pfizer, and Yoshitomi. He was on research advisory board for Pfizer, Janssen, Mochida and Meiji, and is currently on research advisory board for Sekisui Chemicals. Dr Shear has served as a consultant to Pfizer and has received grant/research support from Forest. Dr Barlow receives royalties from Oxford University Press and Guilford Press. Stefan Leucht received speaker/consultancy honoraria from SanofiAventis, BMS, EliLilly, Janssen/Johnson and Johnson, Lundbeck and Pfizer. Dr Woods, Dr Gorman, Mr Money, Ms Etschel and Dr Engel report no additional financial affiliations or other relationships relevant to the subject of this article.