In the present study, different sampling strategies were evaluated with respect to accuracy and precision of variance components for three posture variables for car mechanics. The results showed the consequences of violating theoretical assumptions behind the random effects model; inaccurate results were caused by individual samples being time-dependent within working days (autocorrelation). The present study used a bootstrap method for investigating the performance of sampling strategies that is also applicable in other occupational settings and for other exposure variables.
The present study showed that sampling data in large time blocks may lead to inaccuracy and imprecision in estimates of variance components. This was particularly prominent for strategies where small fractions of working days were sampled. Bias in the variance component estimates was largest for strategies with small sample sizes and large block sizes. For percentage time above 90°, negative biases of up to 26% of the size of the “true” within-day variance component and positive biases of up to 1110% of the size of the “true” between-days variance component were observed. However, in many cases the total error in the variance component estimates was dominated by imprecision rather than inaccuracy. For the within-day variance component, which was estimated at 164.7 in the original data set, the 90% prediction interval ranged from 137 to 191 for the sampling strategy giving the best precision ($n_s = 20$, $t_{tot} = 480$, $n_d = 4$, $t_b = 1$). For the within-day variance component, the median width of the 90% prediction interval across all investigated sampling strategies was 89, while the median bias of the variance component was 1.4. Thus, our results suggest that imprecision will often be a more serious problem than inaccuracy for studies of the sizes simulated here. The results further indicate that the sample sizes we investigated might not be sufficient to retrieve variance components with satisfactory precision.
In occupational epidemiology, variance components are required for designing efficient exposure measurement strategies, and when combined with information on costs associated with data collection, they give a basis for deciding on efficient budget allocation [31-34]. Variance components can guide the selection of targets for interventions to reduce suspected hazardous exposures [2]; they are used in assessments of clinical reliability [35], and they are necessary inputs in conventional power analysis of, for instance, studies addressing exposure differences between groups or effects of an intervention [4, 6]. The present paper clearly illustrates that the results of these applications of estimated variance components can be very uncertain, in particular if the estimates have been based on short and continuous exposure samples. This caveat is rarely addressed in the literature. Estimated variances are known to follow a positively skewed distribution; this was apparent in the present study as well. Hence, a sample estimate of a variance is more likely to be smaller than the true value than larger. Using an estimated variance in a power analysis of a planned intervention study will therefore more often lead to too “optimistic” (small) predictions of the necessary study size than to too “pessimistic” ones. The error may be considerable, as illustrated by the wide prediction intervals on variances in the present study. To account for variance estimation uncertainty in power analyses, some authors have suggested using the 80th percentile of the expected distribution of variance estimates as an input rather than the actual variance estimate [36]. A particular challenge arises if variance components per se are the exposure variables of interest, for instance in studies of exposure variation [6, 7]. Variances are not normally distributed, and a conventional power analysis, which requires data to have this property, is not applicable. Developing power analysis procedures for studies addressing variance components is an interesting issue for further research.
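The skewness argument can be illustrated with a small simulation. The sketch below is not the analysis used in this study: it assumes independent, normally distributed observations and a simple sample variance rather than REML variance components, and the study size `n` and number of replicates are hypothetical. It shows how often a variance estimate falls below the true value and where the 80th percentile of the simulated estimates lies.

```python
import numpy as np

rng = np.random.default_rng(0)

true_var = 164.7      # within-day variance component reported above, treated as "true"
n = 20                # hypothetical number of independent observations per study
n_studies = 100_000   # hypothetical number of simulated studies

# Draw many independent normal samples and compute the sample variance of each.
samples = rng.normal(0.0, np.sqrt(true_var), size=(n_studies, n))
var_hat = samples.var(axis=1, ddof=1)

# Share of simulated studies whose variance estimate is below the true variance.
frac_too_small = np.mean(var_hat < true_var)

# 80th percentile of the simulated distribution of variance estimates.
p80 = np.percentile(var_hat, 80)

print(f"share of estimates below the true variance: {frac_too_small:.2f}")
print(f"median estimate / true variance: {np.median(var_hat) / true_var:.2f}")
print(f"80th percentile / true variance: {p80 / true_var:.2f}")
```

Because the required size of a two-group comparison is proportional to the variance entered into a conventional power calculation, the printed ratios can be read as the relative study size that each choice of input would imply under these assumptions.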
The epidemiologic study for which the data was originally sampled attempted a random collection of subjects and working weeks [21, 26, 27]. The present data is therefore likely a representative sample of car mechanics’ exposure to elevated upper arms. The original study also included measurements on house painters and machinists [21, 26, 27]. The car mechanics, spending on average 4.7% time with the right arm elevated above 90°, worked more with elevated arms than machinists (1.6% time >90°), but less than house painters (8.8% time >90°). The size of exposure variability in the three groups differed in a similar fashion; the car mechanics showed more variability than the machinists, but less than the house painters. The autocorrelation function for percentage time above 90° at lag 1 was 0.31 for machinists and 0.46 for house painters, compared to 0.52 for the car mechanics [14]. This implies that the assumption of independence in the error term is violated for machinists and house painters as well. We therefore believe that the principal effects of sampling strategy on variance component estimators shown in the present study are also relevant to data collections of other exposure variables and in other occupational groups. The magnitude of these effects, however, probably varies between variables and occupational groups, and our numerical results should therefore be applied outside the group of car mechanics only with great caution.
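For readers who want to check the independence assumption in their own data, the following sketch shows one common way of computing a lag-1 autocorrelation coefficient from a minute-by-minute exposure series. The function names and the simulated AR(1) series are hypothetical illustrations; the values quoted above come from [14], not from this code.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1 of a one-dimensional series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.sum(d[:-1] * d[1:]) / np.sum(d * d)

def simulate_ar1(n, phi, rng):
    """Stationary AR(1) series with unit variance (an assumed toy model)."""
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + np.sqrt(1 - phi**2) * rng.normal()
    return x

rng = np.random.default_rng(0)
day = simulate_ar1(480, 0.52, rng)   # one simulated 480-minute working day
r1 = lag1_autocorrelation(day)
print(f"estimated lag-1 autocorrelation: {r1:.2f}")
```

A coefficient clearly above zero indicates that neighbouring minutes carry overlapping information, which is the situation in which contiguous sampling blocks become problematic.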
Variability of upper arm elevation has been reported in the literature for other occupations [6, 22, 37], but the posture variables differed from the ones used in the present study, and the accuracy and precision of the reported variance components were not explored. Consistent with our findings, estimates of variance components were shown to be associated with considerable imprecision in previous studies on muscle activity during assembly work [3, 4] and on posture and electromyography data from short-cycle manual handling [6].
The present study determined exposure variability between and within subjects using a random effects model, which assumes that effects are uncorrelated. Results showed that this assumption was violated since the car mechanics exhibited considerable autocorrelation between measurements within a working day. As demonstrated by David [38], the ordinary sample variance estimator, $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, underestimates the population variance if observations are not independent and if the sample size is not large. An equivalent effect can be expected on the estimator for within-day variance. This is a likely explanation of why within-day variance estimates were inaccurate for sampling strategies with larger block sizes, while they were not for strategies with block size 1, where observations will be (close to) unaffected by autocorrelation. Since variance components are partitions of the total (constant) variance present in the data, a negative bias in the within-day variance estimate propagates to the other variance components, in particular showing up as a positive bias in the between-days variance. When block size increases, the time span between the observations in the sample decreases. Hence, the sample will be more autocorrelated, which leads to a larger bias. We believe that increased autocorrelation explains the occasional larger bias of variance components estimated by strategies where a particular sampling time was distributed across four days rather than two. This will lead to a smaller sampling time per day and, if the block size is large, to a more dominant effect of autocorrelation.
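The mechanism described above can be sketched with a toy simulation. It assumes that minute-by-minute exposure within a day follows a stationary AR(1) process with unit variance and lag-1 autocorrelation 0.52; this model, the number of sampled minutes and the number of simulated days are illustrative assumptions and were not fitted to the study data. The sketch compares the ordinary sample variance from one contiguous block with that from the same number of single minutes spread at random over the day.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = 0.52        # lag-1 autocorrelation reported for the car mechanics
day_len = 480     # minutes per simulated working day
n_obs = 20        # hypothetical number of minutes sampled per day
n_days = 5000     # hypothetical number of simulated days

# Simulate stationary AR(1) days with true variance 1 (assumed model).
days = np.empty((n_days, day_len))
days[:, 0] = rng.normal(size=n_days)
for t in range(1, day_len):
    days[:, t] = phi * days[:, t - 1] + np.sqrt(1 - phi**2) * rng.normal(size=n_days)

# Strategy A: one block of n_obs consecutive minutes per day.
starts = rng.integers(0, day_len - n_obs + 1, size=n_days)
block_idx = starts[:, None] + np.arange(n_obs)
block_var = np.take_along_axis(days, block_idx, axis=1).var(axis=1, ddof=1)

# Strategy B: n_obs single minutes per day, drawn without replacement (block size 1).
spread_idx = np.argsort(rng.random((n_days, day_len)), axis=1)[:, :n_obs]
spread_var = np.take_along_axis(days, spread_idx, axis=1).var(axis=1, ddof=1)

# Expected sample variance of n_obs consecutive AR(1) observations with unit variance.
k = np.arange(1, n_obs)
expected_block = 1 - 2 / (n_obs * (n_obs - 1)) * np.sum((n_obs - k) * phi**k)

print(f"mean sample variance, contiguous block: {block_var.mean():.3f}")
print(f"expected value for contiguous block:    {expected_block:.3f}")
print(f"mean sample variance, spread minutes:   {spread_var.mean():.3f}")
```

Under these assumptions the contiguous block gives a mean sample variance below the true value of 1, while the spread-out sample stays close to it, in line with the contrast between large block sizes and block size 1.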
Assumptions of the random effects model (equation 1) other than independence were also violated. Visual inspections of plots of predicted values of the random effects and their residuals suggested that the assumption of constant variance across subjects and days was violated in some cases. The residuals also had positively skewed distributions. Although the model in equation (1) makes no assumptions about the distributional form of the random effects, the REML estimators that we used to estimate the variance components assume normal distributions. However, REML estimators are identical to ANOVA estimators when data is balanced [39], and as ANOVA estimators are not based on distributional assumptions, we do not consider this to be a problem. We did not transform the exposure data because we could not identify a transformation that improved the fit of the random effects model to any noticeable extent.
More complex random effects models are available that can model time dependence in the error term by incorporating correlation structures according to different time series models, such as autoregressive (AR) or moving average (MA) models [40]. A successful fit of such a model might result in unbiased measures of variability between and within subjects. However, for data sets as large as that used in the present study (23 subjects × 4 days × 480 minutes = 44,160 observations), computations with very large variance and covariance matrices will be involved. Thus, it may not be possible to fit these models. Moreover, identifying a reasonable model of the time dependence of our posture variables is not a trivial matter.
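To give a sense of the matrix sizes involved, the sketch below counts the observations, computes the storage a dense covariance matrix over all of them would need in double precision, and builds an AR(1) correlation matrix for a single working day. The AR(1) structure and the value of `phi` are assumptions chosen for illustration; whether a particular mixed-model routine forms such matrices explicitly depends on its implementation.

```python
import numpy as np

n_subjects, n_days, n_minutes = 23, 4, 480
n_obs = n_subjects * n_days * n_minutes

# Storage for one dense n_obs x n_obs matrix of 8-byte floats.
dense_bytes = n_obs**2 * 8
print(f"observations: {n_obs}")
print(f"dense covariance matrix: {dense_bytes / 1e9:.1f} GB")

# AR(1) correlation matrix for one day: corr(e_s, e_t) = phi ** |s - t|.
phi = 0.52   # assumed; taken from the lag-1 autocorrelation quoted for car mechanics
idx = np.arange(n_minutes)
R_day = phi ** np.abs(idx[:, None] - idx[None, :])
print(f"one-day AR(1) correlation matrix: {R_day.shape}, {R_day.nbytes / 1e6:.1f} MB")
```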
A parametric model for the structure of the arm elevation data was not available, so parametric bootstrapping [41] was not an option. While procedures for non-parametric bootstrapping for hierarchical data have mainly been discussed in the context of two-level data sets [29, 42-44], a recent paper by Ren et al. [45] addressed non-parametric bootstrapping for data sets with three levels or more. The paper concluded that units at the first level (here subjects) should be selected with replacement while units at the two lower levels (here days and quanta within days) should be selected without replacement; this was the procedure used in the present paper.
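The resampling scheme described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name, the array layout (subjects × days × minutes), the reading of `t_tot` as total sampled minutes per subject divided evenly over `n_d` days, and the division of each day into non-overlapping quanta of `t_b` minutes are all assumptions, and the data are random numbers.

```python
import numpy as np

def draw_bootstrap_sample(data, n_s, n_d, t_tot, t_b, rng):
    """Draw one bootstrap sample from data of shape (subjects, days, minutes).

    Subjects are drawn with replacement; days within a subject and
    quanta (blocks of t_b minutes) within a day are drawn without replacement.
    """
    n_subjects, n_days, day_len = data.shape
    blocks_per_day = (t_tot // n_d) // t_b   # assumed even split over days
    n_quanta = day_len // t_b                # non-overlapping quanta per day

    subjects = rng.choice(n_subjects, size=n_s, replace=True)
    sample = np.empty((n_s, n_d, blocks_per_day * t_b))
    for i, s in enumerate(subjects):
        chosen_days = rng.choice(n_days, size=n_d, replace=False)
        for j, d in enumerate(chosen_days):
            quanta = rng.choice(n_quanta, size=blocks_per_day, replace=False)
            # Expand each chosen quantum into its t_b minute indices.
            minutes = (quanta[:, None] * t_b + np.arange(t_b)).ravel()
            sample[i, j] = data[s, d, minutes]
    return sample, subjects

rng = np.random.default_rng(0)
data = rng.normal(size=(23, 4, 480))   # placeholder data, not the study data
sample, subjects = draw_bootstrap_sample(data, n_s=20, n_d=2, t_tot=240, t_b=60, rng=rng)
print(sample.shape)
```

Repeating the draw many times and estimating the variance components on each sample would give the empirical distribution from which bias and prediction intervals can be read.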