To use multivariate regression methods to analyze simultaneously data obtained from multiple respondents or data sources (informants) at health centers.
Surveys of executive directors, medical directors, and providers from 65 community health centers (176 informants) who participated in an evaluation of the Health Disparities Collaboratives.
Cross-sectional survey of staff at the health centers during 2003–2004.
In order to illustrate this method, we analyze the association between informants' assessments of the culture of the center and participation in the collaborative, and the association between computer availability and the effort made by management to improve the quality of the care and services at their center. Multivariate regression models are used to pool information across informants while accounting for informant-specific effects and retaining informants in the analysis even if the data from some of them are missing. The results are compared with those obtained by traditional methods that use data from a single informant or average over informants' ratings.
In both the Collaborative participation and quality improvement efforts analyses, the multivariate regression multiple informants' analysis found significant effects and differences between informants that traditional methods failed to find. Participating centers emphasized developmental (entrepreneurship, innovation, risk-taking) and rational culture. The effect of hierarchical culture (stability and bureaucracy) on participation depended on the informant; executive directors and medical providers were the most discrepant. In centers that participated in the Collaborative, the availability of computers was positively associated with the effort that management made toward improving quality.
The multiple informants model provided the most precise estimates and alerted users to differential effects across informants. Because different informants may have different insights or experiences, it is important that differences among informants be measured and ultimately understood by health services researchers.
Health services researchers often collect similar or even identical data from multiple sources (e.g., physicians, nurses, patients, teachers, parents) in order to increase the reliability of the measurements or to gain insights from several different perspectives and contexts. When multiple informants measure an independent variable, researchers are typically interested in how the variable or construct measured by each informant affects the outcome. When multiple informants assess an outcome variable, the focal point is often the overall effect of the independent variables on the outcome. In either case, more precise estimates are possible if effects can be pooled across informants.
Traditional approaches for analyzing data from multiple informants include conducting independent analyses for each informant, selecting one informant as the informant of interest and ignoring others (i.e., only using the data obtained from the chosen informant for the analysis), and reducing the data from the informants into an overall measure (e.g., the mean score across informants). However, these methods have several disadvantages. The effects of individual informants are often ignored, differences between or among informants are not testable, and the degree of correlation between informants is not estimated (Horton and Fitzmaurice 2004). In addition, summary measures (e.g., means) are not consistently defined when one or more informants' scores are missing, and will result in a loss of sample size and hence suboptimal analysis if only those units with a full set of responses are included.
The exclusion of data from some informants, or the reduction of data from multiple informants to a summary statistic, is one of the ways in which data are edited in practice. In general, investigators edit data in ad hoc, idiosyncratic ways and often do not disclose their unique strategies (Leahey, Entwisle, and Einaudi 2003; Leahey 2004). Because data-editing decisions can profoundly affect conclusions (Gould 1981; Dewald, Thursby, and Andersen 1986), it is important that these decisions be documented or, better still, avoided (Leahey 2004). For example, if the relationship between two variables varies in direction across informants, the relationship between the means of the informants' ratings could be misleading. It is better to utilize multiple (even multiple discordant) views than to reduce the data to a convenient form for analysis.
Individual responses to surveys can be influenced by several sources of variation, including characteristics of individuals such as their training, experience, or position in the organization, which can affect an individual's interpretation of the construct being measured. For example, in sociology the social status of an informant has been shown to have a bearing on the response obtained (Leahey, Entwisle, and Einaudi 2003; Leahey 2004). These characteristics of the informant and context need to be accounted for in order to pool information sensibly across informants, to determine the best way of measuring a construct, and ultimately to make general inferences about the relationship of the variable being measured with other variables.
Multivariate regression models have recently been used to analyze data from multiple informants (Horton, Laird, and Zahner 1999; Goldwasser and Fitzmaurice 2001; Horton et al. 2001; Lash et al. 2003; Horton and Fitzmaurice 2004). The key characteristic of this approach is that individual data from all informants are modeled, enabling the evaluation of comparative inferences about informants (e.g., testing whether the informants have the same or different effects), and avoiding reducing or omitting the data for a unit of analysis because data from some informants were unavailable or missing.
Another model that might be considered for multiple informant data is the “method-factors” model (Bollen and Paxton 1998). In the context of a method-factors model, the informant is the “method” of measurement. The analysis of a method-factors model is typically based on a structural equations model that treats the informants' ratings as subjective error-prone measurements of an unobserved trait. In contrast, the multivariate regression models referred to in the preceding paragraph allow for multiple underlying traits, and directly model informant-specific effects as opposed to estimating the effect of a common underlying trait (although this could be accommodated within the multivariate regression framework). Because we want to model informant-specific effects on the outcome, rather than the effect of an underlying trait, herein we use the multivariate regression approach.
In this paper, we analyze data collected from surveys of personnel at community health centers (CHCs) that are part of an ongoing evaluation of a Bureau of Primary Health Care-sponsored quality improvement (QI) program (HRSA 2004). We compare standard approaches for analyzing multiple informant data, including picking one informant as representative of the whole and averaging over informants' ratings, with multivariate regression methods. We address the cases in which information on an independent or dependent variable of interest is available from multiple informants. Although multivariate regression models for analyses of multiple informants have already been established, the advantages of using this approach over standard approaches have not been demonstrated previously using real data. The results will be useful to health services researchers as multiple informant data become increasingly common.
As part of a controlled evaluation of a QI program at CHCs, we surveyed three types of health professionals at participating and nonparticipating health centers about personal and health center characteristics including culture, QI efforts, and availability of computers. Both culture and QI efforts are assessed by multiple informants while the availability of computers is assumed to be known. To illustrate the case where an independent variable is available from multiple informants, we examine whether staff members' assessments of the culture at a health center are associated with the center's participation in the QI program and whether associations vary across informants. To address the situation where a dependent variable is available from multiple informants, we examine whether characteristics of a health center, such as computer availability, size of center, and urban versus rural location, and informant characteristics such as gender and age are related to assessments of QI efforts at the center. For both situations, we compare results under the multiple informant (MIF) multivariate regression model with the results from standard regression models for a single informant (SIF) and the mean of the informants' ratings (MR) to assess the extent to which these often-used basic methods may lack precision or mask important effects. The analyses embody six distinct models corresponding to whether the independent or dependent variable is measured by multiple informants and the corresponding three modeling approaches (i.e., the SIF, MR, and MIF models) applied to each case. In addition, we use the MIF model to determine whether different effects are observed across different types of informants.
In 1998, the Health Resources and Services Administration's (HRSA) Bureau of Primary Health Care initiated the Health Disparities Collaboratives in response to mounting evidence of disparities in the health and quality of care of the underserved, uninsured, and underinsured served by CHCs in the United States. The goal of these disease-specific collaboratives was to generate and document improved health outcomes in underserved populations with chronic medical conditions (HRSA 2004). The evaluation included a national sample of 44 health centers that participated in a collaborative and 21 control centers matched by region, setting (urban versus rural), number of sites, and the total number of patients. QI teams established at each center are encouraged to design interventions guided by the Chronic Care Model (Wagner, Austin, and Von Korff 1996).
At each center, we surveyed the executive director, deputy director, medical director, and a randomly selected provider (at up to three different sites for centers with multiple sites). We mailed 117 surveys to executive and deputy directors and 235 surveys to clinical personnel, for a total of three to nine surveys per center. Nonrespondents were subject to mail and telephone follow-up. A total of 298 surveys were returned (an overall response rate of approximately 85 percent).
For the analyses described herein, we use data from the executive director (or deputy if the director is not available), medical director, and one clinician (a physician or other clinician if a physician is not available) at each center. In centers with multiple clinician respondents, we selected a provider from the primary site (this was the site that the collaborative team leader was from for participating centers and the site with the greatest number of informants for control centers), resulting in at most one respondent of each type from each center (the standard scenario in multiple informants' analyses).
The binary variable indicating participation in the collaborative is the dependent variable for the case where the independent variable is measured by multiple informants (case 1). The five-item leadership scale (see Table 1 for description) of the measure of QI in the organization (Shortell et al. 1995) is the dependent variable for the case where the dependent variable is measured by multiple informants (case 2).
The independent variable of main interest for the analysis of collaborative participation is culture. The culture of a center is assessed by each informant according to four ideal types of culture (group, developmental, hierarchical, and rational) based on the competing values model described by Zammuto and Krakower (1991). Scores indicate the extent to which that aspect of culture is emphasized at the expense of the others (see Table 1 for interpretation). The scores are positive quantities that sum to 100; hence, the types of culture compete for their share of the 100-point total.
The key independent variables in the QI (leadership) analysis are participation in the collaborative and computer availability. Other independent variables used in both models are fixed characteristics of the centers (computer availability, number of sites, number of patients treated, urban versus rural setting), and characteristics of the individual respondents (informant type [executive director, medical director, provider], gender [male versus female], ethnicity [Hispanic versus non-Hispanic], race [black versus nonblack], age, and length of employment).
Although the models can all be described for general outcome variables (continuous, ordinal, categorical), we present them for the case of a binary outcome for the analysis of assessments of the independent variable by multiple informants and the case of a continuous outcome when multiple informants assess the dependent variable. We denote the individual informants within a center by $j = 1, \dots, J$, where $J$ is the total number of informants, the representative informant in a single informant analysis by $j^{*}$, and the centers by $i = 1, \dots, n$. In practice, $j^{*}$ often corresponds to the informant deemed to have the greatest knowledge, insight, or authority. However, $j^{*}$ could be selected at random. The data from all other informants are not used in a single informant analysis.
In this case, the dependent variable (participation in the collaborative) is measured at the unit of analysis (the CHC) and thus does not vary across informants, whereas the key independent variable (informant-assessed culture of the CHC) is informant specific.
The SIF model regresses the unit-level dependent variable ($\mathrm{Participation}_i$) on the representative informant $j^{*}$'s rating of culture ($\mathrm{Culture}_{ij^{*}}$) and any other covariates either specific to the informant ($\mathrm{CovI}_{ij^{*}}$) or measured at the unit level and thus common to all informants ($\mathrm{CovU}_i$). The MR model takes the same form but uses the mean rating of culture across informants at the center in place of the single informant's rating. The SIF and MR models are standard logistic regression models.
In contrast to the SIF and MR models, the MIF model simultaneously fits separate logistic regression equations for each informant (Horton, Laird, and Zahner 1999; Pepe, Whitaker, and Seidel 1999). These are incorporated into a single model by defining the dependent variable separately for each informant even though it has the same value for all informants. In the case of a binary outcome such as participation in the collaborative, the model may be expressed as
$$\mathrm{Logit}\{\Pr(\mathrm{Participation}_{ij} = 1)\} = a_j + b_j\,\mathrm{Culture}_{ij} + c_j\,\mathrm{CovI}_{ij} + d\,\mathrm{CovU}_i \qquad (1)$$
where Logit denotes the logit link function that characterizes a logistic regression model, Pr denotes probability, $(a_j, b_j, c_j)$ are regression parameters specific to informant $j$, and $d$ is a regression parameter common to all informants.
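The informant-specific logistic model described above is fitted to data arranged with one row per responding informant. The following is a minimal sketch of that arrangement, assuming three informant types and made-up ratings; the record layout, the column names (`int_*`, `culture_*`), and the function `stack_for_mif` are hypothetical and are not taken from the study's data files.

```python
# Hypothetical center-level records: participation is measured once per center,
# culture ratings once per informant (None marks a nonresponding informant).
centers = [
    {"id": 1, "participation": 1,
     "culture": {"exec": 30.0, "med": 25.0, "prov": None}},
    {"id": 2, "participation": 0,
     "culture": {"exec": 10.0, "med": 15.0, "prov": 20.0}},
]

INFORMANTS = ("exec", "med", "prov")


def stack_for_mif(records):
    """Return one row per responding informant, repeating the center outcome."""
    rows = []
    for rec in records:
        for j in INFORMANTS:
            rating = rec["culture"][j]
            if rating is None:
                continue  # drop only the missing informant, not the whole center
            row = {"center": rec["id"], "informant": j,
                   "participation": rec["participation"]}
            # Informant-specific intercept (a_j) and slope (b_j) columns:
            # nonzero only in the rows that belong to informant j.
            for k in INFORMANTS:
                row["int_" + k] = 1.0 if k == j else 0.0
                row["culture_" + k] = rating if k == j else 0.0
            rows.append(row)
    return rows


long_rows = stack_for_mif(centers)
for r in long_rows:
    print(r)
```

Center 1 contributes two rows even though its provider did not respond, which is how the stacked layout retains centers with partially missing informants. The `center` column identifies the cluster needed later for robust standard errors.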
The distinguishing feature between the MIF model in Equation (1) and fitting separate SIF models to each informant is that the former is a single model. Only the MIF is able to test whether the construct measured by each informant's rating of culture has the same or differential associations with the likelihood of participation in the collaborative; it does so by comparing the fit of the model in which $b_1 = b_2 = \dots = b_J$ against the model that has unrestricted coefficients (i.e., allowing the effect to vary across informants). When the data indicate that the regression coefficient for a predictor variable is the same for all informants, the coefficient may be fixed across informants in the MIF model to obtain inferences with smaller standard errors. In addition, by modeling the data at the individual level, the MIF model avoids having to delete or impute data for centers with missing observations for some informants (as often is the case when using the MR model).
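One way to carry out a test of equal informant effects is a Wald test on the fitted coefficients, which is a common alternative to comparing the fit of nested models and is not necessarily the procedure used in this study. The sketch below assumes three informants, so the test has 2 degrees of freedom and the chi-square tail probability has a closed form. The coefficient values and covariance matrix are invented for illustration.

```python
import math


def wald_equal_three(b, V):
    """Wald test of H0: b[0] == b[1] == b[2].

    b: three informant-specific coefficient estimates.
    V: their 3x3 covariance matrix (ideally a cluster-robust estimate).
    Returns the Wald statistic and its p-value on 2 degrees of freedom.
    """
    # Contrasts d1 = b0 - b1 and d2 = b0 - b2.
    d1, d2 = b[0] - b[1], b[0] - b[2]
    # Covariance of (d1, d2), i.e. C V C' for C = [[1, -1, 0], [1, 0, -1]].
    s11 = V[0][0] + V[1][1] - 2.0 * V[0][1]
    s22 = V[0][0] + V[2][2] - 2.0 * V[0][2]
    s12 = V[0][0] - V[0][1] - V[0][2] + V[1][2]
    det = s11 * s22 - s12 ** 2
    # Quadratic form d' S^{-1} d using the closed-form 2x2 inverse.
    w = (s22 * d1 ** 2 - 2.0 * s12 * d1 * d2 + s11 * d2 ** 2) / det
    # For 2 degrees of freedom the chi-square upper tail is exactly exp(-w/2).
    return w, math.exp(-w / 2.0)


# Hypothetical estimates: two negative effects and one positive effect,
# each with standard error 0.02 and no covariance between them.
b_hat = [-0.04, -0.02, 0.05]
V_hat = [[0.0004, 0.0, 0.0],
         [0.0, 0.0004, 0.0],
         [0.0, 0.0, 0.0004]]
w_stat, p_value = wald_equal_three(b_hat, V_hat)
print(round(w_stat, 3), round(p_value, 4))
```

A small p-value argues for keeping informant-specific coefficients; a large one supports refitting with a single pooled coefficient.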
In the MIF model, the robust “sandwich estimator” (White 1982; Liang and Zeger 1986) is used to correct the covariance matrix of the parameter estimates for the clustering of observations within centers.
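The idea behind the sandwich correction can be seen in the simplest possible case, an intercept-only linear model in which the estimate is the overall mean of all ratings. The sketch below is an assumption-laden illustration (made-up ratings, no covariates, no small-sample adjustment) rather than the logistic model fitted in the paper; the function name `clustered_mean_se` is hypothetical.

```python
import math


def clustered_mean_se(clusters):
    """Overall mean with naive and cluster-robust standard errors.

    clusters: list of lists, one inner list of ratings per center.
    """
    obs = [y for c in clusters for y in c]
    n = len(obs)
    mean = sum(obs) / n
    # Naive SE treats all n observations as independent.
    s2 = sum((y - mean) ** 2 for y in obs) / (n - 1)
    naive_se = math.sqrt(s2 / n)
    # Sandwich: bread = 1/n on each side, meat = sum over centers of the
    # squared within-center sum of residuals.
    meat = sum(sum(y - mean for y in c) ** 2 for c in clusters)
    robust_se = math.sqrt(meat) / n
    return mean, naive_se, robust_se


# Hypothetical ratings from four centers with two or three informants each.
ratings = [[4.0, 4.2, 3.9], [2.0, 2.3], [3.1, 3.0, 2.8], [4.8, 5.0, 4.9]]
mean, naive_se, robust_se = clustered_mean_se(ratings)
print(round(mean, 3), round(naive_se, 3), round(robust_se, 3))
```

Because ratings within a center resemble each other in this toy data set, the robust standard error is larger than the naive one, which is the direction of correction expected when informants at the same center are positively correlated.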
In this case, the dependent variable (the “leadership” component of the QI effort scale) varies between informants whereas the key independent variables of interest are often at the unit level (e.g., computer availability). A linear regression model for the representative informant $j^{*}$ regresses the rating for the dependent variable ($\mathrm{Leadership}_{ij^{*}}$) on the informant-specific and unit-level covariates, again denoted by $\mathrm{CovI}_{ij^{*}}$ and $\mathrm{CovU}_i$, respectively. The MR model has the same form as the SIF model but with the center mean rating of leadership as the dependent variable; this is often referred to as an ecological regression model because both the dependent and the informant-level independent variables are aggregated to the unit level. A problem with the MR model is that the relationship between aggregated variables can be opposite to the relationship at the individual level (Simpson's paradox). This can lead to misleading conclusions when models of aggregated data are treated as if they yield individual-level results (the ecological fallacy).
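The reversal between individual-level and aggregated relationships can be reproduced with a few numbers. The following sketch uses an invented data set of three centers with three informants each; `x` stands for any informant-level covariate and `y` for the rating, and neither corresponds to actual study data.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical (x, y) values for the informants at each center.
data = {
    "A": ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),
    "B": ([4.0, 5.0, 6.0], [6.0, 5.0, 4.0]),
    "C": ([7.0, 8.0, 9.0], [9.0, 8.0, 7.0]),
}

# Individual level: slope of y on x within each center.
within = {name: ols_slope(x, y) for name, (x, y) in data.items()}

# Aggregated level: slope of the center mean of y on the center mean of x.
mean_x = [sum(x) / len(x) for x, _ in data.values()]
mean_y = [sum(y) / len(y) for _, y in data.values()]
between = ols_slope(mean_x, mean_y)

print(within)
print(between)
```

Within every center the slope is negative, yet the regression on center means has a positive slope, so a conclusion drawn from the aggregated model would have the wrong sign for individuals.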
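The MIF model is the generalization of the SIF model with the following general form:
$$\mathrm{Leadership}_{ij} = a_j + b_j\,\mathrm{CovI}_{ij} + c\,\mathrm{CovU}_i + e_{ij} \qquad (2)$$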
In this model, $a_j$ accounts for unexplained variation in $\mathrm{Leadership}_{ij}$ specific to the $j$th informant, and the dependence of $b_j$ on the informant enables a test of differences in the effects of $\mathrm{CovI}_{ij}$ across informants to be conducted. If the values of $b_j$ do not vary significantly they are assigned a common coefficient and the model simplifies. By using the data from all informants, more information is available to estimate the effects of the independent variables than for the SIF model.
To account for the correlation between the observations made on the same center, one can either treat the center as a random variable and use a hierarchical linear model (Raudenbush and Bryk 2003) with random effects for the centers, or fit a marginal regression model using generalized estimating equations (Liang and Zeger 1986; Zeger and Liang 1986). The hierarchical linear model (also known as a mixed effects model) accommodates a wider range of correlation structures but requires more assumptions, such as assuming a particular distribution for the random effects.
When the dependent variable is measured by multiple informants, the MIF model reduces to the MR model in the special case where there are no informant-specific effects, equal variances among informants, and equal correlations between informants (Goldwasser and Fitzmaurice 2001).
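To see why this reduction holds in the simplest setting, consider a sketch that assumes an intercept-only model, complete data on all $J$ informants at each of $n$ centers, and a compound-symmetric covariance. The symbols $\mu$, $\sigma^2$, $\rho$, and the vector notation are assumptions introduced here for illustration and do not appear in the original analysis.

```latex
\begin{align*}
\mathbf{y}_i &= (y_{i1}, \dots, y_{iJ})^{\top}, \qquad
E(\mathbf{y}_i) = \mu \mathbf{1}, \qquad
\Sigma = \sigma^2\{(1-\rho) I + \rho\, \mathbf{1}\mathbf{1}^{\top}\} \\
\Sigma \mathbf{1} &= \sigma^2\{1 + (J-1)\rho\}\,\mathbf{1}
\;\Longrightarrow\;
\Sigma^{-1}\mathbf{1} = \frac{1}{\sigma^2\{1 + (J-1)\rho\}}\,\mathbf{1} \\
\hat{\mu}_{\mathrm{GLS}}
&= \frac{\sum_{i=1}^{n} \mathbf{1}^{\top}\Sigma^{-1}\mathbf{y}_i}
        {n\,\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}}
= \frac{1}{n}\sum_{i=1}^{n} \frac{1}{J}\sum_{j=1}^{J} y_{ij}
= \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i
\end{align*}
```

Under these assumptions every informant receives weight $1/J$, so the pooled estimate coincides with the average of the center mean ratings. Unequal variances, unequal correlations, or informant-specific effects change the weights, and the two estimates then diverge.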
As a precursor to performing a multiple informant analysis, we assess the level of correlation between the informants' ratings for each measure of culture and for leadership. Very high correlations indicate that a multiple informants' analysis might be redundant because the informants' ratings are strongly related. On the other hand, low correlations might indicate that averaging the informants' ratings may obscure important differences.
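A correlation check of this kind has to cope with centers where one of the two informants did not respond. The sketch below computes a Pearson correlation over pairwise-complete centers only; the ratings are invented, `None` marks a nonresponse, and the use of pairwise deletion is an assumption rather than a documented detail of the study.

```python
import math


def pairwise_corr(x, y):
    """Pearson correlation using only centers where both informants responded."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings of one culture component, aligned by center.
exec_director = [30.0, 20.0, None, 40.0, 25.0]
med_director = [28.0, 25.0, 30.0, None, 20.0]
r = pairwise_corr(exec_director, med_director)
print(round(r, 3))
```

Only three of the five centers contribute to this estimate, so correlations computed this way can rest on quite different subsets of centers for different informant pairs.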
For the SIF and MR models, we report estimated effects and p-values for the key independent variables in the model (the informants' ratings of culture in the collaborative participation model and the characteristics of the centers and informants in the leadership model). In order to include the same centers in the analysis as for the MIF analysis, the mean of the informants' ratings was used in the MR analysis even if only a single informant responded. An alternative approach often used in practice is to omit centers that do not have ratings from each informant. Although unit deletion ensures that the mean is evaluated over the same set of informants at each center, there can be a substantial loss of sample size. There are a number of other methods for handling the missing observations in the MR model (e.g., mean imputation, regression imputation) but these are plagued by their own difficulties and are beyond the scope of this paper.
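The two ways of forming the mean rating discussed above can be compared directly. This is a minimal sketch with invented ratings for four centers; the function names and data are hypothetical.

```python
def mean_available(ratings):
    """Mean over the informants who responded; None if nobody responded."""
    observed = [r for r in ratings if r is not None]
    return sum(observed) / len(observed) if observed else None


def mean_complete_case(ratings):
    """Mean only when every informant responded (unit deletion otherwise)."""
    if any(r is None for r in ratings):
        return None
    return sum(ratings) / len(ratings)


# Hypothetical ratings ordered as (executive director, medical director, provider).
centers = {
    "c1": [4.0, 3.0, 5.0],
    "c2": [2.0, None, 3.0],
    "c3": [None, None, 4.0],
    "c4": [5.0, 4.0, 3.0],
}

available = {k: mean_available(v) for k, v in centers.items()}
complete = {k: mean_complete_case(v) for k, v in centers.items()}
n_available = sum(m is not None for m in available.values())
n_complete = sum(m is not None for m in complete.values())
print(available, n_available)
print(complete, n_complete)
```

Using the available informants keeps all four centers, but the mean for `c3` reflects a single provider, whereas unit deletion keeps comparable means at the cost of half the sample.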
For the MIF model of participation, we report estimated effects and p-values of the ratings of culture for each informant and measure of culture (individual contrast test). We also test whether the effects of the predictor variables vary across informants or whether the same coefficient can be used for all informants (interaction test). If there is insufficient evidence to conclude that the effects of culture vary across informants, we refit the MIF model without the corresponding informant type–culture interaction to estimate a common (i.e., “pooled”) effect. We include only the developmental, hierarchical, and rational components of culture as predictor variables because group culture is determined by these. The associated parameter estimates are interpreted as the effect of a one-unit increase in that component of culture and a one-unit decrease in group culture. The effect of a one-unit increase in group culture at the expense of another component of culture is the negative of the coefficient corresponding to that component of culture (e.g., if b1 is the regression coefficient for developmental culture, then −b1 is the effect of a one-unit increase in group culture and a one-unit decrease in developmental culture).
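Because the four culture scores sum to 100, every coefficient describes a trade between two components. The sketch below works through that interpretation on the linear predictor scale, using invented coefficients and a made-up culture profile; the values in `COEF` are hypothetical and are not estimates from the study.

```python
# Hypothetical coefficients for the three included components.
# Group culture is the omitted reference category.
COEF = {"developmental": 0.05, "hierarchical": -0.03, "rational": 0.02}


def linear_predictor(scores, intercept=0.0):
    """Logit-scale predictor from a culture profile that sums to 100."""
    assert abs(sum(scores.values()) - 100.0) < 1e-9
    return intercept + sum(COEF[k] * scores[k] for k in COEF)


def shift(scores, src, dst, points=1.0):
    """Move points from one culture component to another."""
    out = dict(scores)
    out[src] -= points
    out[dst] += points
    return out


base = {"group": 40.0, "developmental": 20.0,
        "hierarchical": 25.0, "rational": 15.0}
lp0 = linear_predictor(base)

# One point from group to developmental: change equals the developmental coefficient.
d_group_to_dev = linear_predictor(shift(base, "group", "developmental")) - lp0
# One point from developmental to group: change equals its negative.
d_dev_to_group = linear_predictor(shift(base, "developmental", "group")) - lp0
# One point from developmental to hierarchical: difference of the two coefficients.
d_dev_to_hier = linear_predictor(shift(base, "developmental", "hierarchical")) - lp0

print(d_group_to_dev, d_dev_to_group, d_dev_to_hier)
```

The last case shows that a trade between two included components is the difference of their coefficients, which follows from the same sum-to-100 constraint.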
In the MIF leadership analysis, we test for interaction effects between the center and informant characteristics, and the type of informant. If these are not significant, we report parameter estimates and p-values for the main effects only. We controlled for culture but do not report any parameter estimates related to it in the leadership analyses.
To account for the hierarchical structure of the data, we use SAS Proc Genmod and SAS Proc Mixed to fit the MIF model for the participation and leadership analyses, respectively. Procedures for the corresponding nonhierarchical models may be used to fit the SIF and MR models.
Because there was one control center for which the executive director (the informant of choice) did not respond, there were 64 observations in total (44 collaborative and 20 control observations) available for the SIF analyses (Table 2A), whereas there were 65 observations for the MR analysis (one for each center) and a total of 176 observations for the MIF analyses. The distributions of informants were similar for the collaborative and control centers (not shown).
Table 2B contains the mean and standard deviation of the center-level variables for collaborative and control centers. Collaborative centers had more sites on average (6.3 versus 4.5) while control centers were more likely to have computer availability (76 versus 68 percent), had higher annual patient loads (about 17,000 versus 15,000), and were slightly more likely to be located in an urban setting.
The mean and standard deviation of the ratings made by each type of informant and the characteristics of the informants are displayed in Table 2C. Executive directors gave the highest ratings of any informant for the QI leadership rating at both collaborative and control centers. The component of culture that received the largest average rating was group while developmental had the lowest, especially in control centers. Executive directors were older, and had been employed longer than the other informants. Informants at control centers were more likely to be male and Hispanic than at collaborative centers. However, the mean age, employment at primary site, and length of employment were similar for collaborative and control centers.
The correlations (Table 3) between executive and medical directors of the ratings of culture were modest (around 0.3), those for executive directors and medical providers were small (less than 0.12), and those for medical directors and medical providers were variable (range 0.13–0.47). The correlations between the ratings for leadership were smaller than for culture for all three pairings of informants. The general lack of correlation (discordance) of the informants' ratings suggests that an MIF approach has the potential to be more informative than the SIF or MR approaches.
Table 4 compares the results obtained from the SIF and MR analyses with those for the MIF analysis. In general, the analyses led to different conclusions about the association between the culture at a health center and participation in the collaborative.
Only the MIF model identifies a significant interaction for hierarchical culture between the informants (p=.04), suggesting that relying on the SIF or MR models would have led to misleading conclusions; executive and medical directors have negative effects and providers have a positive effect. The SIF and MR models find nonsignificant results for hierarchical culture, the latter likely because the effects for different informants offset each other. The SIF model and MR models also fail to find significant results for rational culture, whereas the MIF model obtains a borderline significant result for executive directors (p=.05). However, the three methods obtain similar results for developmental culture, where, because the effects for developmental culture in the MIF model do not vary significantly across informants, they are pooled. The pooled effect under the MIF model finds the strongest evidence of an overall effect of developmental culture (p=.011), followed by the MR model (p=.017), and the SIF model (p=.050).
In the MIF analysis, we found no significant interactions between any of the center characteristics (the variables of primary interest) and informant type. Therefore, all the effects for the MIF model in Table 5 are pooled effects. The MIF model finds that of centers participating in the collaborative, the leaders of those with computers make more effort to improve quality than the leaders of those that do not have computers (p=.03). The reverse is true of centers that do not participate in the collaborative. This effect is also detected by the MR model but not by the SIF model. The MIF model also finds that male informants tended to give lower ratings of leadership, an effect that was also detected by the SIF model but not by the MR model.
In this paper, we compared three approaches for analyzing data from multiple informants. Two of the approaches make restrictive assumptions. The SIF model assumes that the chosen informant (in our case, the executive director) provides all relevant information. The MR model assumes that the informants' ratings are exchangeable (i.e., randomly distributed about some common value) and can thus be summarized by averaging. Methods that focus on a single informant or the mean of the informants' ratings provide no insight into whether combining information from all informants enables better understanding of the constructs being studied, or whether analyzing certain informants alone will lead to results that are substantially different from those for other informants. In contrast, the MIF models provide a general pathway for how to use information from multiple informants in analyses. MIF models allow the researcher to make a more informed decision on whether a single informant dominates (e.g., is by far the strongest predictor in the model) to the extent that results should be based on that informant alone, and also allow for averaging data among informants when it is appropriate to do so (e.g., when the interaction effects with the informant are not significant). When the informants' effects are significantly different, there may be no “correct” response because the informants may be attuned to different characteristics of the center. However, recognizing this alerts the analyst to the possibility that a multidimensional trait may exist and provides the opportunity to seek informants that can provide information about each dimension. The SIF and MR models are reduced cases of the MIF model that are only appropriate in special circumstances.
Our multiple informants' analysis of CHC participation found that the effect of hierarchical culture differed significantly across informants. The effects for executive directors (negative) and providers (positive) differed significantly, suggesting that these informants in particular may draw on different experiences when assessing culture. The medical provider had the largest effect of hierarchical culture on participation, implying that they might be the single most useful informant in terms of discriminating between centers (arguing against our use of the executive director as the informant of choice). The lack of significance for the SIF model is attributed to a lack of sample size when only the executive directors are analyzed, while the lack of significance for the MR model is attributed to the effects of the individual informants canceling each other. This example illustrates the danger of relying on a single informant or using MR models when data from multiple informants can be obtained or are available. For rational culture, the MIF model reveals that although there is a significant result for executive directors, some caution should be exercised because results for the other informants are not significant. The SIF and MR models do not give any indication of the potential differences in effects across the informants.
In addition to obtaining results that are qualitatively more informative by revealing differences between informants, when it is appropriate to pool effects across informants the MIF model typically yields results that are more precise than traditional methods. This was illustrated by developmental culture in the collaborative participation analysis, where the pooled MIF estimate provided the strongest evidence of an effect (p=.011, compared with p=.017 for the MR model and p=.050 for the SIF model). Another example is the effect of computer availability on leadership for participating centers. Although all three models obtained similarly sized estimates, only the estimate based on the MIF model was significant. There are two reasons why the MIF model yields a more precise inference: (1) it is not restricted to a single informant and does not exclude centers for which some informants did not respond or were not available (approximately one-third of centers); and (2) it does not weight each informant equally as in the MR model but rather accounts for differences in the informants' ratings, yielding a more precise estimate.
A limitation of the analyses discussed herein is that we ignored the problem of missing data. Although the MIF model takes into account all available observations, it implicitly assumes that missing data are missing completely at random. If this assumption is violated, the results have the potential to be biased or less precise. However, the MIF model can easily be extended to account for missing data (e.g., using multiple imputation [Rubin 1987]) under fairly general conditions. Although methods for missing data can also be applied to the SIF and MR models, the MIF model provides a more complete and informative framework.
A disadvantage of the MIF model relative to the SIF and MR models is that in order to incorporate the results from all informants, it makes additional assumptions about the distribution of the observations (e.g., their correlation structure, the distribution of any random effects) and thus is more reliant on the validity of the model than the SIF and MR models, which only involve one observation per center. However, the greater generality of the MIF analysis might outweigh the increased sensitivity to model assumptions in many situations.
Because the MIF model uses all the information in the data, it has the potential to lead to increased precision. From a design perspective, the MIF model encourages sampling more informants at each center. However, it is important to note that sampling more informants does not automatically justify sampling fewer centers. The power of the MIF design depends on the poolability of the informants; a substantial increase in power relative to the single informant design occurs only when the effect of interest is the same or similar for all informants. Realistic and preferably conservative assumptions about the poolability of the effects across informants should be used when determining the appropriate sample sizes for a study involving multiple informants.
Finally, we note that an alternative way of modeling an independent variable for which there are data from multiple informants is to include covariates for each informant in a single regression equation. Although this is a fine approach for obtaining a predictive model of the outcome, including multiple measures of the same independent variable in one regression equation may obscure the association between the construct being measured and the dependent variable (Horton, Laird, and Zahner 1999) and may be limited by degrees of freedom/power issues.
The main objective of this paper is to increase awareness of the most appropriate statistical methods for analyzing data from multiple informants. We have shown that the MIF models produce a richer set of results than the standard (SIF and MR) models. First, we identified that the relationship of hierarchical culture with participation in a collaborative was respondent dependent. When the ratings of the informants are discordant, the SIF and MR models are not reliable. In cases where effects did not vary across informants (e.g., computer availability), the MIF model found significant effects not detected by the standard methods. It is important that health services researchers are aware of the pitfalls of standard methods for analyzing multiple informant data and that these can be overcome by using an MIF model.
This research was supported by grant number 1 U01 HS13653 from the Agency for Healthcare Research and Quality and The Commonwealth Fund grant number 20030185. The authors thank Barbara J. McNeil, M.D., Ph.D., and Alan Zaslavsky, Ph.D., for helpful reviews of the manuscript and suggestions, and Thomas Keegan, Ph.D., Yang Xu, and Mary Ly for help with data collection.