Information on exposure variability, expressed as exposure variance components, is vital in occupational epidemiology, for instance for informed risk control and efficient study design. While accurate and precise estimates of the variance components are desirable in such cases, very little research has been devoted to understanding the performance of data sampling strategies designed specifically to determine the size and structure of exposure variability. The aim of this study was to investigate the accuracy and precision of estimators of between-subjects, between-days and within-day variance components obtained by sampling strategies differing with respect to the number of subjects, the total sampling time per subject, the number of days per subject and the size of individual sampling periods.
Minute-by-minute values of average elevation, percentage time above 90° and percentage time below 15° were calculated from a data set consisting of measurements of right upper arm elevation during four full shifts from each of 23 car mechanics. Based on this parent data set, bootstrapping was used to simulate sampling with 80 different combinations of the number of subjects (10, 20), total sampling time per subject (60, 120, 240, 480 minutes), number of days per subject (2, 4), and size of sampling periods (blocks) within days (1, 15, 60, 240 minutes). Accuracy (absence of bias) and precision (prediction intervals) of the variance component estimators were assessed for each simulated sampling strategy.
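As an illustration of the simulation design described above, the following minimal sketch draws one sample under a given strategy and estimates the three variance components with a balanced nested ANOVA (method of moments). It is not the study's implementation. The function names, the synthetic parent data with independent minute values, the resampling rules (subjects with replacement, days and aligned blocks without replacement) and the choice of estimator are all assumptions made for illustration.

```python
import random
from statistics import fmean


def nested_variance_components(data):
    """Balanced nested ANOVA estimates (between-subjects, between-days, within-day).

    data[subject][day] is a list of within-day values; every subject must have
    the same number of days and every day the same number of values.
    Estimates are not truncated at zero, so negative values can occur.
    """
    n_s, n_d, n_w = len(data), len(data[0]), len(data[0][0])
    if min(n_s, n_d, n_w) < 2:
        raise ValueError("need at least 2 subjects, 2 days and 2 values per day")
    day_means = [[fmean(day) for day in subj] for subj in data]
    subj_means = [fmean(dms) for dms in day_means]
    grand = fmean(subj_means)
    ss_bs = sum((m - grand) ** 2 for m in subj_means)
    ss_bd = sum((dm - sm) ** 2
                for dms, sm in zip(day_means, subj_means) for dm in dms)
    ss_wd = sum((x - dm) ** 2
                for subj, dms in zip(data, day_means)
                for day, dm in zip(subj, dms) for x in day)
    ms_bs = n_d * n_w * ss_bs / (n_s - 1)
    ms_bd = n_w * ss_bd / (n_s * (n_d - 1))
    ms_wd = ss_wd / (n_s * n_d * (n_w - 1))
    # Solve the expected-mean-square equations for the three components.
    return (ms_bs - ms_bd) / (n_d * n_w), (ms_bd - ms_wd) / n_w, ms_wd


def sample_strategy(parent, n_subjects, n_days, total_minutes, block, rng):
    """Draw one sample: total_minutes per subject, split evenly over n_days,
    collected in randomly placed, non-overlapping blocks of `block` minutes."""
    day_len = len(parent[0][0])
    per_day, rem = divmod(total_minutes, n_days)
    if rem or per_day % block or per_day > day_len:
        raise ValueError("infeasible combination of time, days and block size")
    starts_all = range(0, day_len - block + 1, block)  # aligned block starts
    sample = []
    for _ in range(n_subjects):
        subj = rng.choice(parent)            # subjects: with replacement (assumed)
        rows = []
        for day in rng.sample(subj, n_days):  # days: without replacement (assumed)
            starts = rng.sample(starts_all, per_day // block)
            rows.append([x for s in starts for x in day[s:s + block]])
        sample.append(rows)
    return sample


def synthetic_parent(n_subjects, n_days, day_len, sd_bs, sd_bd, sd_wd, rng):
    """Hypothetical stand-in for the parent data: independent normal effects."""
    parent = []
    for _ in range(n_subjects):
        subj_effect = rng.gauss(0, sd_bs)
        subj = []
        for _ in range(n_days):
            day_effect = rng.gauss(0, sd_bd)
            subj.append([40 + subj_effect + day_effect + rng.gauss(0, sd_wd)
                         for _ in range(day_len)])
        parent.append(subj)
    return parent


rng = random.Random(1)
parent = synthetic_parent(23, 4, 480, sd_bs=10, sd_bd=5, sd_wd=20, rng=rng)
sample = sample_strategy(parent, n_subjects=10, n_days=2,
                         total_minutes=120, block=15, rng=rng)
print(nested_variance_components(sample))
```

Repeating the last two calls many times for each strategy would give the distribution of estimates from which bias and prediction intervals can be derived. Because the synthetic minute values here are independent, this sketch should not be expected to reproduce block-size effects seen in real recordings.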
Sampling in small blocks within days resulted in essentially unbiased variance components. For a specific total sampling time per subject, and in particular if this time was small, increasing the block size resulted in an increasing bias, primarily of the between-days and the within-day variance components. Prediction intervals were in general wide, and even more so at larger block sizes. Distributing sampling time across more days generally gave more precise variance component estimates, but also reduced accuracy in some cases.
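Bias and prediction intervals of the kind reported above can be summarized from a set of replicate estimates. The sketch below is one possible way to do so, assuming that bias is expressed relative to a reference ("true") value and that the prediction interval is taken as the 2.5th to 97.5th percentile of the replicates. Both definitions, and the function name, are illustrative assumptions rather than the study's stated procedure.

```python
from statistics import fmean, quantiles


def summarize_estimator(estimates, true_value):
    """Relative bias and a 95% percentile interval of replicate estimates."""
    # quantiles(..., n=40) returns 39 cut points in steps of 2.5%, so the
    # first and last cut points are the 2.5th and 97.5th percentiles.
    cuts = quantiles(estimates, n=40)
    return {
        "relative_bias": (fmean(estimates) - true_value) / true_value,
        "interval": (cuts[0], cuts[-1]),
    }


# Hypothetical replicate estimates of one variance component.
replicates = [float(v) for v in range(1, 101)]
summary = summarize_estimator(replicates, true_value=50.0)
print(summary)
```

A relative bias close to zero indicates an accurate estimator, while a narrow interval indicates a precise one.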
Variance components estimated from small samples of exposure data within working days may be both inaccurate and imprecise, in particular if sampling is laid out in large consecutive time blocks. In order to estimate variance components with satisfactory accuracy and precision, for instance to arrive at trustworthy power calculations in a planned intervention study, larger samples of data will be required than for estimating an exposure mean value with a corresponding certainty.