Intimate partner (domestic) violence (IPV) is a common problem in medical practice that is associated with adverse health outcomes. There are widespread calls to improve IPV education for physicians, but there are few valid, reliable, easily available, and comprehensive measures of physician IPV knowledge, attitudes, and practices that can be used to assess training effectiveness.
In 2002, expert consensus and prior surveys were used to develop a new survey-based IPV self-assessment tool that included more information on current IPV knowledge and practices than previous tools. The draft tool was evaluated using standard psychometric techniques in a group of 166 physicians in 2002, revised, and then retested in a second group of 67 physicians on three occasions in 2003 and 2004. Analyses were conducted in 2005.
The draft IPV survey tool demonstrated good internal consistency reliability, with Cronbach’s alpha ≥0.65 for the 10 final scales. The developed scales were closely correlated with theoretical constructs and predictive of self-reported behaviors. On repeat testing, a revised version of the tool was found to have good stability of psychometric properties in a different physician population (alpha ≥0.65 and internal correlations as predicted), good correlation with measured office IPV practices, and stable results in this population over 12 months.
The final version of this instrument, named the PREMIS (Physician Readiness to Manage Intimate Partner Violence Survey) tool, is a 15-minute survey that is a comprehensive and reliable measure of physician preparedness to manage IPV patients. This tool is publicly available and can be used to measure the effectiveness of IPV educational programs.
Intimate partner (domestic) violence (IPV) is a common problem in medical practice that is associated with a number of adverse health outcomes.1–5 Many believe that these poor outcomes could be improved with better physician education,6–9 but, despite ongoing educational efforts, field studies continue to show that physicians rarely screen for IPV, are not aware of community resources, and are not confident in their abilities to manage IPV patients.10–12 Importantly, recent medical school graduates report more emphasis on IPV in medical school, but no improvement in their sense of IPV-related competence.13
One factor that may be limiting the success of IPV education programs, as well as a clearer understanding of physician educational needs, is a lack of convenient and well-defined educational outcome measures. Some researchers have used locally developed survey tools combined with measures of actual change in clinical practices to assess the effectiveness of their IPV education programs.14–17 This approach provides depth and credibility, but, since the survey tools are usually not further defined or psychometrically tested, and since the clinical data are derived from costly chart reviews or patient interviews, these outcome measures are not well suited for widespread use.
Others have used standardized self-administered survey tools to measure changes in knowledge, attitudes, beliefs, and self-reported behaviors (KABB) following IPV education. This approach is more adaptable to large-scale implementation, but requires that such tools be readily available and tested in multiple settings (i.e., generalizable). Two survey instruments, in particular, have been used to assess physician IPV KABB in multiple settings. One, developed by Short at the Centers for Disease Control and Prevention (CDC), is a 13-scale tool that was used to measure the effectiveness of an IPV education program at UCLA18 and two community settings.19,20 Although this tool has been described on several occasions, the development and underlying psychometric properties of the tool have not been reported in the literature. A separate six-scale IPV KABB tool was developed and tested by Maiuro.21 This tool was used by Thompson et al.22 to evaluate a live IPV training workshop and modified by Harris and Maiuro to evaluate an online IPV continuing medical education (CME) program.23
These standardized IPV survey tools have features that could be improved. Both tools gather information on provider self-assessed knowledge, attitudes and beliefs, but neither assesses actual knowledge using current literature as a standard. Likewise, neither tool collects much information on self-reported behaviors (practice issues). For example, the Maiuro tool asks only about the frequency of IPV inquiry, while the Short tool asks only whether the user asks all new patients about abuse. Finally, neither tool has been shown to provide reliable results in the same population of health professionals over time. The aim of this project was to develop an easily administered, survey-based tool that would provide comprehensive and reliable measures of physician readiness to manage IPV.
Existing IPV physician survey tools were reviewed and initial survey items were adapted from previous work, particularly that of the authors with the CDC (LS) and the Massachusetts Medical Society (EA). To establish content validity, the proposed survey items were reviewed by an outside group of IPV educators (see Acknowledgments). These reviewers were charged with selecting existing items or developing new ones that reflected key theoretical constructs and measured important IPV educational outcomes, as described in the literature.24,25
The characteristics of the survey tool were evaluated in two different physician populations. Initial psychometric studies were undertaken in a group of 166 practicing physician subscribers to a commercial CME Website (the development group). All physician subscribers to the CME Website (1100 total physician subscribers in November, 2002) were offered a modest honorarium of $30 of “store credit” at the Website, which could only be used toward the purchase of CME programs, for completing the survey online. No CME credit was offered for completing the survey. The Website subscriber base has been used previously to evaluate the effectiveness of online CME programs and been found to be representative of U.S. physicians in general.26
A revised, paper-based version of the survey tool was also tested on three separate occasions in a group of 67 practicing primary care physicians in Phoenix and Kansas City (the evaluation group). These physicians practiced in smaller (<8 physicians) community-based practices and were participants in a study of online IPV education27 who did not receive specific IPV education during this period. Study physicians were recruited by various approaches from the population of all primary care physicians in Phoenix and Kansas City (approximately 6000 total). Physicians in this study were offered $25 cash for completing the survey. Although the primary focus of this work was to develop a reliable and comprehensive measure of physicians' self-reported KABB, all offices that participated in the CME study were also visited to compare overall office IPV practices with individual physician self-reported practices. Office practices were assessed using standardized techniques as described elsewhere.27
Final data analyses were completed in 2005. Maximum likelihood factor analysis with an oblique rotation was used to extract key survey factors and verify how well survey items fit the constructs they were designed to measure.28,29* These methods were selected to obtain a Chi Square estimate of fit and to allow the maximum amount of discrimination between factor loadings to be displayed. Cronbach’s alpha coefficient was used to determine internal consistency reliability within identified scales.
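The internal consistency statistic used throughout this report can be illustrated concretely. The following is a minimal sketch (not the authors' analysis code, which would have used standard statistical packages) of Cronbach's alpha computed from a respondents-by-items score matrix; the sample responses are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) matrix of item scores.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of summed scale scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical Likert-type responses (3 respondents x 2 items) that agree perfectly:
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # -> 1.0
```

Values approaching 1 indicate that the items in a scale vary together; the 0.65 threshold used in this study is a conventional minimum for exploratory scale development.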
The construct validity of identified scales was tested by examining the relationship between empirically derived scales and objective values assigned to items, based on the expert panel’s original theoretical constructs, using the Rand coefficient.30 The Rand coefficient compares items grouped according to two different clustering solutions and ranges between 0 and 1, with higher values indicating higher levels of agreement between the two solutions. Additionally, correlations among survey items that should be related, such as opinions about the adequacy of prior training and opinions about perceived levels of preparation, were determined. Multiple regression analysis was used to test the internal predictive validity of key survey items.
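The Rand coefficient described above can be computed by checking, for every pair of items, whether the two clustering solutions agree on placing that pair together or apart. A minimal illustrative sketch (the cluster labels below are hypothetical, not the study's scale assignments):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Rand coefficient between two clusterings of the same items (range 0 to 1).

    A pair of items counts as an agreement when both solutions place the
    pair in the same cluster, or both place it in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Two groupings with identical structure (only the cluster names differ):
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # -> 1.0
```

Because only pairwise co-membership matters, the coefficient is insensitive to how the clusters are labeled, which makes it suitable for comparing empirically derived scales against an expert panel's theoretical groupings.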
Multiple Analysis of Variance (MANOVA) was used to compare survey item results between two groups of physicians and within the same group of physicians over time. Correlation coefficients were used to measure the relationship between physician self-reported behaviors on the survey and actual office practices as measured by site visit.
The draft instrument developed in conjunction with the expert reviewers contained an 11-question respondent profile (used for tracking and group demographics) and 90 proposed survey questions, some of which had multiple components. Survey questions were grouped into four major sections: (1) Background (four items/scales dealing with type of prior IPV training, amount (in hours) of prior IPV training, Perceived (felt) IPV Knowledge, and Perceived (felt) IPV Preparation); (2) Actual Knowledge (a scale containing 19 multiple choice, matching, and true/false questions); (3) IPV Opinions (54 individual questions concerning attitudes and beliefs, scored on a seven-point Likert-type scale from Strongly Disagree to Strongly Agree; some opinion items were intentionally worded negatively and were reverse scored); and (4) Practice Issues (a 13-item scale dealing with self-reported behaviors, such as individual and office IPV practices and policies).
The prototype survey instrument was completed online in November 2002 by 166 physicians. Most (76%) of the sample were male and their mean age was 50.4 years (SD=11.8). Respondents were entirely from the U.S. The most common medical specialties were Family Medicine (22.8%), Internal Medicine (19.8%), and Obstetrics and Gynecology (7.8%), although 38 specialties were represented. Respondents had practiced a mean of 20.03 (SD=12.53) years. Approximately 17% did not see patients, but over half saw more than 60 patients per week. Of the 80% of physicians who provided information on their previous IPV training, 13.5% (n=18) had received no previous training, 35.3% had received 2 or fewer hours, 71% had received 6 or fewer hours of training, and 4.8% had received 25 or more hours of training.
Respondents in this psychometric group were asked about type, amount, and perceived effectiveness of prior IPV training in the Background section of the instrument. The type of prior training was a descriptive item and was not included in the psychometric analyses nor is it part of the final tool. The Amount of prior IPV training was an estimate of total number of hours. The Perceived Preparation scale included 11 items asking respondents how prepared they felt they were to work with IPV victims. Scores and responses ranged from 1=not prepared to 7=well prepared, with a mean score of 4.14 (SD=1.49) across all 11 items. The internal consistency of this scale was high (alpha = 0.959). The Perceived Knowledge scale contained 16 items asking how much respondents felt they knew about IPV. Scores and responses on these questions ranged from 1=nothing to 5=very much, with a mean score of 3.00 (SD=0.82), and high internal consistency among the items (alpha =0.963).
The Actual Knowledge scale was based on findings from the IPV literature and included eight multiple-choice items and 11 true/false items. It was not appropriate to measure internal consistency of this criterion-referenced section of the instrument.31 A total score of correct items was used to represent actual IPV knowledge.
Data from the 54 items in the Opinions section of the tool showed that 13 items from earlier survey tools, which primarily represented the physician’s role in IPV (e.g., Healthcare providers should not be responsible for identifying cases of IPV; Nothing I do would help prevent future incidents of violence to a victim of IPV), were very skewed, demonstrating a ceiling effect, where almost all respondents scored highly in the appropriate direction. These items were, therefore, eliminated from the tool.
Initial factor analysis of the remaining 41 Opinion items revealed a well-fitting 10-factor solution (Chi Square=629.323; df=585; p=0.100). Only factor loadings over 0.2 were displayed; even with this low threshold, over one quarter of the items loaded on a single factor. Where items loaded on two or more factors, the highest loading was used for scale assignment. In most cases, this loading was considerably higher than loadings on other factors (e.g., 0.656 vs. 0.258). One factor contained only one item that did not load higher on another factor. Therefore, the reliability of the Opinion scales identified in the remaining nine factors was tested.
Using the alpha coefficient to determine the internal consistency reliability within each identified Opinion scale, several items were found that could be dropped without losing critical information, and two scales demonstrated insufficient cohesion. After dropping the items that either did not contribute or were redundant, a 36-item Opinions section remained. Eighteen of these 36 items were from the CDC instrument, 9 were reworded from the CDC instrument, and 9 were new. Six well-fitting scales, with 31 items, were identified in this section (alpha = 0.65 or higher), with an additional 2 scales that were kept for future testing. These eight Opinion scales, a sample of contributing items, and their alpha coefficients are displayed in Table 1.
One measure of construct validity is the extent to which survey items and scales are consistent with their original theoretical constructs. The Rand coefficient for the relationship between the eight empirically derived Opinion scales and the objective values assigned to items according to the original theoretical constructs developed by the expert panel was 0.89, indicating a high degree of association between the original theoretical constructs and the empirical scales derived with the factor analysis.
Another measure of construct validity is the correlation between instrument scales, which, while measuring different aspects of a physician’s preparedness to manage IPV, should move in the same direction. As expected, the Perceived Knowledge score was significantly correlated with the amount of Previous Training (R=0.337; p=0.000) and Perceived Preparation (R=0.789; p=0.000). Actual Knowledge was also correlated with Perceived Knowledge (R=0.201; p=0.012), but, interestingly, not with the amount of Prior Training or with Perceived Preparation. Correlations between the three Background item/scales, the Actual Knowledge scale, and all eight Opinion scales are shown in Table 2. As may be seen in Table 2, five of the six reliable Opinion scales and one of the test Opinion scales were significantly correlated with Perceived Preparation and Perceived Knowledge. All Opinion scales except those dealing with Legal Requirements, the relationship between IPV and Alcohol/drug use, and Victim Autonomy were significantly correlated with amount of Prior Training. All scales except those dealing with Preparation, Legal Requirements, and Constraints were significantly correlated with Actual Knowledge. Several of the reliable Opinion scales were also correlated with each other, as shown in Table 3.
A third measure of construct validity is the extent to which self-assessed knowledge, attitudes and beliefs predict self-reported behaviors. Unlike other standardized IPV surveys, the Practice Issues section of this survey tool included a diverse list of 13 items related to the physician’s actual practice, such as: situations in which the physician screens for IPV, actions taken when IPV is identified, the presence and use of IPV resource materials in the practice, and familiarity with workplace policies and community resources. The score for the practice issues scale was based on the sum of appropriate responses to questions in this section. The analysis showed significant correlations between scores on Practice Issues, all Background scales, Actual Knowledge, and scores on six of eight Opinion scales. The only predictor scales not significantly correlated with Practice Issues were Alcohol/Drugs and Victim Autonomy from the Opinion section.
Multiple regression analysis of scores on the Practice Issues scale with all other scale scores entered as independent variables showed a significant relationship between these variables taken together and Practice Issues (F=5.76; p=0.000), which explained the variation in Practice Issues scores fairly well (R = .621; R2 Adj. = 0.319). A subsequent stepwise multiple regression analysis demonstrated that three of these variables best predicted variation in Practice Issues in this group: amount of Prior Training, and two Opinion scales, Workplace Issues, and Self Efficacy (R2 Adj. = 0.345).
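The adjusted R² figures quoted here penalize the raw R² for the number of predictors entered into the model. A worked sketch of the standard adjustment (the observation and predictor counts below are hypothetical, not the study's):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors.

    R2_adj = 1 - (1 - R2) * (n - 1) / (n - p - 1)
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# With a raw R2 of 0.50 from 11 observations and 1 predictor (hypothetical numbers):
print(round(adjusted_r2(0.50, 11, 1), 3))  # -> 0.444
```

The adjustment explains why a stepwise model with only three predictors can report a higher adjusted R² than the full model: dropping uninformative predictors reduces the penalty more than it reduces the raw fit.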
Following initial psychometric evaluation, the revised survey tool contained 72 items, which included two test Opinion scales. A paper version of this tool was then evaluated in a group of 67 community-based physicians on three separate occasions, approximately 6 months apart, from September 2003 to October 2004. This evaluation group differed from the original development group of physicians in several ways. A smaller percentage of this group was male (55% vs 76%). The mean age was lower at 45.1 years (SD = 10.17) vs 50.4. All physicians were in community-based practice in either Phoenix or Kansas City. Most (82.1%) practiced family medicine, pediatrics, or obstetrics/gynecology. Respondents had practiced a mean of 16.64 (SD = 9.917) years vs 20.03. Over 85% saw more than 60 patients per week. Almost one-third (31.3%) had had no prior IPV training.
MANOVA analysis demonstrated several significant differences between the scores obtained on the survey tool from this second group of physicians and scores obtained from the original development group. The evaluation (second) group scored significantly lower in Perceived Preparation than the original group (F=14.9; p=0.000) and significantly lower on one of the eight Opinion scales (Legal Issues: (F=39.0; p=0.000). On the other hand, the evaluation group scored significantly higher in Actual Knowledge (F=81.5; p=0.000) and on IPV Practice Issues (F=10.1; p=0.000). There were no significant differences in Prior Training, Perceived Knowledge, or the other seven Opinion scales.
Despite the differences in survey results, the psychometric properties of the tool were consistent (reliable) between the two groups of physicians. The internal consistency reliability, as measured by the alpha statistics, was comparable on all scales (complete data available on request). For example, the Perceived Preparation scale was found to have an alpha of 0.902 in the evaluation group vs 0.959 in the original development group. These results also held for all eight Opinion scales, although three items were removed from the Self-Efficacy scale to preserve an alpha of at least 0.65. When the original psychometric data were retested with these three items removed, the internal consistency reliability for the Self-Efficacy scale held fairly well at 0.63. The two Opinion scales that were less internally consistent in the original group, Constraints and Victim Autonomy, were also less consistent in this group, with alpha statistics of 0.57 and 0.46, respectively.
The between-scale correlations that were identified in the original group were generally present in the evaluation group (complete data available on request). As with the original group, the Perceived Knowledge score was significantly correlated with hours of Previous Training (R=0.85; p=0.019) and the Perceived Preparation score (R=0.815; p=0.000). As in the earlier group, Actual Knowledge was not correlated with the amount of Previous Training or with Perceived Preparation. In contrast to the earlier group, Actual Knowledge was not correlated with Perceived Knowledge either.
The evaluation group scored significantly higher than the development group on the Practice Issues section of the survey (see above), but the internal correlations with other predictor scales were similar. At the 0.05 level of significance, seven of the twelve predictor variables were significantly correlated with Practice Issues and, at the 0.1 level, the results were identical to those found in the original psychometric group. As with the original data, multiple regression analysis with all predictor variables entered into the model demonstrated a significant relationship of these variables taken together with Practice Issues (F=5.281; p=0.000), which explained the variation in Practice Issues fairly well (R = 0.741; R2 Adj. = 0.445).
MANOVA analysis was used to determine the consistency of the survey scores over time in the same physicians. These data, shown in Table 4, indicate that survey scores were quite consistent over the 12-month period of the study, in the absence of outside IPV education or other interventions.
Five items from the Practice Issues section of the instrument regarding physical evidence were similar to items obtained from site visits to evaluation physicians’ offices. These items were subjected to a correlational analysis, and significant relationships were found for each element pair. These findings suggest that overall office practices, in a stable environment, reflected individual physician self-reported data, thus providing an extra level of external validation to this section of the survey instrument. Question wording and results are displayed in Table 5.
This IPV survey has been named the PREMIS (Physician Readiness to Manage Intimate Partner Violence Survey) tool. In its final form (minus the Constraints and Victim Autonomy scales) it has 67 individual items and requires approximately 15 minutes to complete. The PREMIS tool has a high level of consistency with constructs that theoretically contribute to effective healthcare provider response to victims of IPV and a high level of consistency with earlier instruments. PREMIS is more current and more comprehensive than previous standardized IPV assessment tools. The tool has been tested in multiple settings and been shown to be reliable and valid. Other work demonstrates that PREMIS is sensitive to change and capable of discriminating trained from non-trained physicians.27 The tool, codebook, and scoring methodology are freely available for use by IPV educators and program developers.
This instrument has the potential to be useful in a number of different ways: (1) as a pretest and needs assessment to measure physician knowledge, attitudes, beliefs, behaviors, and skills that may need to be addressed during training or other on-site intervention; (2) as a training adjunct to orient physicians to the topic and expose them to the complexity of IPV issues; (3) as a posttest to determine changes in physician KABB over time or as the result of training; and (4) as a comparative instrument to assess differences in KABB between physicians who have received training and those who have not.
Current limitations of this tool include a lack of psychometric data from non-physician healthcare providers and a lack of correlation with individual IPV practices. While the tool could be used to assess the readiness of medical students or nurses (for instance) to manage IPV, it would be reasonable to also evaluate the tool’s psychometric properties in these populations. Also, as with any self-report instrument, this survey tool does not assess actual behaviors, although the data show a good correlation between certain elements of the PREMIS tool and office practices measured in stable settings. Future studies should investigate not only the relationships between the KABB items on the tool and actual physician behaviors, but also the relationship between KABB items, physician behaviors and patient outcomes.
The authors thank the members of the expert consultant team who reviewed the survey tool: Denise M. Dowd, MD, MPH (Children’s Mercy Hospital, Kansas City, MO); Christine Fiore, PhD (University of Montana); Jennifer Gunter, MD (University of Colorado); Randa M. Kutob, MD (University of Arizona); Penny Randall, MD; Patricia Salber, MD; and Ellen Taliaferro, MD. The authors also thank Stephen Buck, PhD, for his assistance with data preparation and analysis.
*Factor analysis is a data-reduction technique that transforms data into linear combinations of items. Maximum likelihood analysis is an iterative process to derive parameter estimates that best fit the proposed model. Oblique rotation allows the factors to be correlated, and creates the most simple structure possible among the items.
The development of the PREMIS tool and the research study were supported by a Small Business Innovation Research (SBIR) Grant, R44-MH62233, from the National Institute of Mental Health to Medical Directions, Inc. Dr. Harris was the Principal Investigator on this grant.
No financial conflict of interest was reported by the authors of this paper.