Approximately 38,231 subjects participated in the studies included in the meta-analysis (k=206 comparisons, s=…). Independent two-group post-test effect sizes included data from 24,520 subjects (k=71); two-group pre–post effect sizes, from 14,630 subjects (k=80, s=59); and pre–post treatment group comparisons, from 22,413 subjects (k=125). Sample sizes varied dramatically, from 12 to 5,038 subjects [78,155]. Multiple treatment groups were common: 34, 10, 3, and 1 paper(s) reported on two, three, four, and six treatment groups, respectively. Twelve unpublished dissertations and one unpublished presentation paper were included. Many studies reported funding (s=59). One report was disseminated before 1970, 5 in the 1970s, 35 in the 1980s, 49 in the 1990s, and 48 after 2000. The earliest study was reported in 1969 and the most recent study in 2007. Analyses were completed in 2008.
Among the studies that reported details about worksites, 55 were for-profit and 50 were not-for-profit companies. Most papers did not report company size (s=80). Among the papers reporting this information, the vast majority were large companies (at least 750 employees), with only five described as small (fewer than 100 employees). Most studies were conducted in single companies at one location (s=87), 17 used multiple locations of one company, and 23 conducted studies at multiple companies. The most common types of companies were education/health services (s=37), government (s=32), and manufacturing (s=17). Few studies reported whether study data were collected at the worksite; among those providing this information, 51 collected data at the workplace and 14 did not. Interventions were more often delivered at the workplace (s=51) than in other locations (s=21). Nearly all of the studies recruited subjects at the worksite (s=121). Only 32 papers reported that interventions were delivered during employees' paid time. Most studies used interventionists employed by the research project (s=101) instead of workplace employees. Only six studies reported including an organizational-level policy change, such as providing free or reduced memberships to fitness centers not located at the worksite. Twenty-six studies involved workplace employees in designing interventions. Thirty-eight papers reported on interventions that included fitness facilities at the worksite. Supervised exercise was used in 27% of the studies while 80% employed motivational/educational sessions. Further details about interventions are found in .
Visual and statistical assessment of funnel-plot asymmetry, an indicator of possible publication bias, suggested substantial evidence of asymmetry for physical activity, fitness, lipids, and diabetes risk, especially for single-group comparisons. Evidence of asymmetry was weaker but still notable for anthropometric measures and mood. Because relatively few effect sizes were available for quality of life, health services utilization, work attendance, job stress, and job satisfaction, evidence for or against funnel-plot asymmetry was inconclusive for these variables.
Effects of Interventions on Physical Activity Behavior, Health, and Well-Being
The table of random-effects health and well-being outcome estimates and tests presents the overall effects of interventions on physical activity, health, and well-being outcomes. The findings should be interpreted with caution given the small number of studies or subjects for some outcomes. For physical activity behavior, the mean overall effect size for the two-group post-test comparison was 0.21. The two-group pre–post effect and treatment group pre–post comparisons were of comparable magnitude. The Common Language Effect Size (CLES) of 0.56 for the two-group post-test effect size indicates that 56% of the time a random treatment subject would have a higher physical activity score than a random control subject (all CLES values reported are based on a random-effects mean effect size for two-group post-test comparisons). To enhance interpretability, mean physical activity effect sizes were transformed to steps/day using means and standard deviations from appropriate reference groups. For two-group post-test comparisons, the raw mean difference was 612 steps/day, which corresponds to a final mean of 8,869 steps/day for treatment subjects versus 8,257 steps/day for control subjects. The homogeneity test (Q) and the estimated between-studies standard deviation demonstrated significant heterogeneity for all physical activity behavior comparison types. The I² value, the percentage of total variation among studies' observed effect sizes that is due to heterogeneity rather than sampling of participants, also documents significant heterogeneity.
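The CLES values in this section can be approximately reproduced from the standardized mean differences. The sketch below assumes the usual normal-theory conversion, CLES = Φ(d/√2), which treats both groups as normally distributed with equal variance; the source does not state which conversion was used, and the function name `cles_from_d` is illustrative only.

```python
import math


def cles_from_d(d: float) -> float:
    """Convert a standardized mean difference d to a Common Language Effect Size.

    Assumes two normal populations with equal variance, so that
    CLES = Phi(d / sqrt(2)), where Phi is the standard normal CDF.
    """
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), hence Phi(d / sqrt(2)) = 0.5 * (1 + erf(d / 2)).
    return 0.5 * (1.0 + math.erf(d / 2.0))


# Two-group post-test physical activity effect size reported in the text.
print(round(cles_from_d(0.21), 2))  # 0.56
```

Under the same assumption, an effect size of 0.33 (job stress) gives 0.59 and an effect size of −0.17 (healthcare utilization) gives 0.45, matching the values reported later in the section.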
Random effects health and well-being outcome estimates and tests
Fitness outcomes also were significantly better among treatment than control subjects, and better at post-test when treatment subjects' pre- and post-intervention scores were compared. Mean effect sizes ranged from 0.47 to 0.57 (CLES=0.66). As with steps/day for physical activity, the mean effect size on fitness was transformed to maximal oxygen consumption (VO2max). For two-group comparisons, the raw mean difference was 3.5 mL/kg/min, which corresponds to, for example, a final VO2max mean of 37.7 mL/kg/min for treatment subjects versus 34.2 mL/kg/min for control subjects. Fitness effect sizes were significantly heterogeneous, which indicates that some studies found significantly better fitness outcomes than other studies.
Diabetes risk was significantly reduced by interventions. Mean effect sizes for the two-group comparisons were 0.90 to 0.98 (CLES=0.76). For two-group studies, the calculated raw mean difference was −12.6 mg/dL, which corresponds to a post-intervention fasting glucose mean of 81.0 mg/dL for treatment subjects versus 93.6 mg/dL for control subjects. Both mean values are within the range considered normal for fasting glucose. Diabetes risk effect sizes exhibited significant and substantial heterogeneity. Diabetes risk findings should be considered tentative given the small number of studies that reported this variable (k=6).
Lipid and anthropometric effect sizes were more modest but positive, indicating better scores following interventions among treatment subjects. Lipid mean effect sizes ranged from 0.12 to 0.17 (CLES=0.54). In terms of the ratio of total cholesterol to HDL, the raw mean difference was −0.2, which would occur, for example, with a mean post-intervention ratio of 4.6 for treatment versus 4.8 for control. All of the lipid effect sizes were significantly heterogeneous. Anthropometric mean effect sizes for treatment subjects varied from 0.07 to 0.13 (CLES=0.52). For the two-group comparison in terms of BMI, the raw mean difference was −0.3, which would occur if the post-intervention BMI mean were 25.0 for treatment versus 25.3 for control. Anthropometric effect sizes were significantly heterogeneous, except for the two-group pre–post comparisons.
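The raw-unit figures quoted for steps/day, VO2max, fasting glucose, the cholesterol ratio, and BMI all follow the same arithmetic: a raw mean difference is added to a control-group mean to give an illustrative treatment-group mean. The sketch below checks that arithmetic against the numbers in the text. It also back-calculates the reference standard deviation implied by raw difference = d × SD; that SD is not reported in the source, so the result is an inference under this assumption, and both function names are hypothetical.

```python
import math

# (outcome, control mean, raw mean difference, treatment mean), all taken from the text.
reported = [
    ("steps/day", 8257, 612, 8869),
    ("VO2max (mL/kg/min)", 34.2, 3.5, 37.7),
    ("fasting glucose (mg/dL)", 93.6, -12.6, 81.0),
    ("total cholesterol:HDL ratio", 4.8, -0.2, 4.6),
    ("BMI", 25.3, -0.3, 25.0),
]


def treatment_mean(control_mean: float, raw_difference: float) -> float:
    """Illustrative treatment mean implied by a control mean and a raw difference."""
    return control_mean + raw_difference


def implied_reference_sd(raw_difference: float, d: float) -> float:
    """Reference SD implied by raw_difference = d * SD (an assumption, not a reported value)."""
    return abs(raw_difference) / abs(d)


for outcome, control, difference, treatment in reported:
    consistent = math.isclose(treatment_mean(control, difference), treatment, abs_tol=1e-9)
    print(f"{outcome}: control {control} + difference {difference} -> {treatment} (consistent: {consistent})")

# Physical activity: d = 0.21 and a raw difference of 612 steps/day imply
# a reference SD of roughly 2,900 steps/day under the assumption above.
print(round(implied_reference_sd(612, 0.21)))
```

Because the means are illustrative reference values rather than pooled observations, only the differences carry the meta-analytic result.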
Mean effect sizes for both quality of life (0.23) and mood (0.13) two-group comparisons were positive, indicating better outcomes among treatment subjects, but these did not reach statistical significance. Effect sizes for the two-group pre–post and treatment group pre–post comparisons were significant, with improved quality of life and mood scores following interventions. Most of the quality of life and mood effect sizes exhibited significant heterogeneity.
Effects of Physical Activity Interventions on Work-Related Variables
Estimates and tests for work-related outcomes are reported in the work-related outcomes table. The two-group post-test comparison of work attendance documented that, on average, treatment subjects had lower mean absenteeism than control subjects (effect size=0.19, CLES=0.55). Although the direction of the effect was similar, mean effect sizes were smaller for both two-group pre–post effects and treatment group pre–post comparisons. Job stress was significantly lower at follow-up among treatment subjects than control subjects (effect size=0.33, CLES=0.59). Job stress effect sizes were positive for other comparisons but were not significant. Job satisfaction was significantly greater following interventions among treatment subjects than controls in the two-group pre–post effect analysis (effect size=0.20, CLES=0.54), but similar findings did not achieve statistical significance for the two-group post-test analysis. Effect sizes for most comparison types on most outcomes were significantly heterogeneous, as documented by Q, estimated between-studies standard deviations, and I² values.
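For readers unfamiliar with the heterogeneity statistics cited here, I² is conventionally derived from the homogeneity statistic Q and its degrees of freedom (k − 1 for k effect sizes). The sketch below shows that conventional formula. The Q value in the example is hypothetical and does not come from the study's tables.

```python
def i_squared(q: float, k: int) -> float:
    """Percentage of total variation attributable to heterogeneity.

    Uses the conventional definition I^2 = max(0, (Q - df) / Q) * 100,
    with df = k - 1, where k is the number of effect sizes.
    """
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: Q = 40 across k = 11 effect sizes (df = 10).
print(i_squared(40.0, 11))  # 75.0
```

When Q is no larger than its degrees of freedom, the formula is truncated at zero, which is why fairly homogeneous outcomes can show an I² of 0%.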
Random effects work-related outcome estimates and tests
Healthcare utilization two-group post-test analyses revealed significantly higher healthcare utilization among treatment subjects than among control subjects (effect size=−0.17, CLES=0.45). The two-group pre–post effect estimate was of similar magnitude (−0.18) but not significant. The pre–post comparison for treatment subjects revealed no utilization differences. Healthcare effect sizes were more homogeneous than most other variables in the project. Findings regarding job stress, job satisfaction, and healthcare utilization should be viewed as tentative given the small numbers of studies that reported these variables (see k in the work-related outcomes table).
Analyses of potential workplace moderators were conducted for variables with sufficient cases: physical activity behavior, fitness, lipids, and anthropometric variables. Dichotomous moderator results are presented in the table of independent-groups mixed-effects analyses. Profit versus nonprofit company status was not significantly linked with mean effect size for any variable (Q_B in that table). Neither company size nor the inclusion of multiple companies in the study was a significant moderator of mean effect sizes on physical activity behavior, fitness, lipids, or anthropometric outcomes. Three-level moderator analyses were conducted for numbers of companies and locations (results available from first author): the only significant effect was for anthropometric effect size, with a significantly higher mean effect size for interventions conducted in one multi-location company (0.22) than in other combinations of numbers of companies and locations (both 0.04).
Independent-groups comparisons mixed-effects analysis on four major variables
Intervention delivery location (worksite versus elsewhere) was a significant moderator only for anthropometric effect sizes, such that interventions delivered at workplaces yielded a larger mean effect size (0.17) than those delivered elsewhere (0.05). Whether employees received interventions on company paid time was significant for two of the four outcomes: studies in which employees were paid during the intervention reported larger mean effect sizes than those in which employees received interventions outside company paid time on both fitness (0.92 vs 0.49) and anthropometric measures (0.22 vs 0.02). Interventions with employee interventionists were more effective than those with others as interventionists for fitness (1.03 vs 0.50), lipids (0.59 vs 0.09), and anthropometric measures (0.32 vs 0.05). Workplace participation in designing the interventions, as compared to interventions designed by people not employed by the worksite, was significant for fitness (1.18 vs 0.49) and anthropometric outcomes (0.22 vs 0.06) but not for lipids or physical activity behavior. Neither recruitment location nor data collection location (workplace versus elsewhere) was related to effect sizes for the variables with adequate data for moderator analyses.
The presence of a fitness facility onsite in the workplace did not affect mean effect sizes on fitness or physical activity behavior. Studies with onsite fitness facilities reported larger mean effect sizes on lipids (0.32) than studies without such facilities (0.07). Anthropometric outcomes also yielded larger mean effect sizes among studies with onsite facilities (0.24) than those without facilities (0.05). Organizational policy change could be analyzed for lipids and anthropometric outcomes only. Lipid effect sizes were unrelated to policy changes, while anthropometric outcomes yielded significantly larger mean effect sizes in studies with policy changes (0.24) than in those without policy changes (0.03). For physical activity behavior, fitness, and lipids, nearly all moderators left significant residual heterogeneity (Q_W in the moderator table), whereas all but two moderators left nonsignificant residual heterogeneity for anthropometric outcomes. Results of exploratory multiple moderator analyses are available from the corresponding author.
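The between-groups (Q_B) and within-groups (Q_W) statistics referred to in the moderator results come from partitioning total heterogeneity by moderator level. The sketch below illustrates that partition with simple inverse-variance (fixed-effect) weights. The source reports mixed-effects analyses, which would additionally add an estimated between-studies variance to each study's variance before weighting, so this is a simplified analog rather than the authors' procedure. All effect sizes, variances, and group labels in the example are hypothetical.

```python
def weighted_mean(effects, weights):
    """Inverse-variance weighted mean effect size."""
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def partition_q(effects, variances, groups):
    """Return (Q_between, Q_within, Q_total) for a categorical moderator."""
    weights = [1.0 / v for v in variances]
    grand_mean = weighted_mean(effects, weights)
    q_total = sum(w * (e - grand_mean) ** 2 for w, e in zip(weights, effects))

    q_between = 0.0
    q_within = 0.0
    for label in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == label]
        g_effects = [effects[i] for i in idx]
        g_weights = [weights[i] for i in idx]
        g_mean = weighted_mean(g_effects, g_weights)
        # Dispersion of the subgroup mean around the grand mean.
        q_between += sum(g_weights) * (g_mean - grand_mean) ** 2
        # Dispersion of studies around their own subgroup mean.
        q_within += sum(w * (e - g_mean) ** 2 for w, e in zip(g_weights, g_effects))
    return q_between, q_within, q_total


# Hypothetical studies: two with employee interventionists, three with others.
effects = [0.9, 1.1, 0.4, 0.5, 0.6]
variances = [0.04, 0.05, 0.02, 0.03, 0.04]
groups = ["employee", "employee", "other", "other", "other"]

q_b, q_w, q_t = partition_q(effects, variances, groups)
print(f"Q_B={q_b:.2f}, Q_W={q_w:.2f}, Q_total={q_t:.2f}")
```

Q_B is compared against a chi-square distribution with (number of groups − 1) degrees of freedom to test the moderator, while a significant Q_W indicates heterogeneity that the moderator leaves unexplained.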