A few observers guessed the nature of the experiment and never detected targets during experimental trials. However, during postexperiment questioning, the vast majority of observers indicated that they were convinced that targets appeared on some proportion of experimental trials. Across all observers, the proportions of trials for which participants indicated illusory detections were 32%, 36%, and 36% for Experiments 1, 2, and 3, respectively (s.e.m. = .013, .012, and .012). Observers whose hit rate for the first training block with easy targets was less than 40% or who had fewer than five illusory detections during the experimental trials were eliminated from further analysis. These criteria eliminated 24.5% of participants from Experiment 1 (173 remained, 79 with the original image orientation and 94 with the rotated), 11.0% from Experiment 2 (188 remained, 86 with the original image orientation and 102 with the rotated), and 22.4% from Experiment 3 (201 remained, 53 with mirrored images and the original response keys, 52 with mirrored images and the swapped response keys, 43 with mirrored and rotated images and the original response keys, and 53 with mirrored and rotated images and the swapped response keys). Response statistics for the remaining observers are shown in .
Mean response measures for the observers who met the criteria for inclusion.
Because our experimental method is new, we used two methods to analyze the results. Both reached the same conclusions regarding the spatial distribution of correlation values across the classification images. In the first analysis, we created a separate CI for each observer in the traditional manner and compared the median correlations between different image regions using a repeated measures ANOVA. Because each observer rated only 480 images, these median correlation values based on individual CIs were small. However, because these values were analyzed across individuals, regional differences were reliable. This method of analysis allowed us to make claims regarding the population of observers, and it also allowed us to evaluate the role of between-observer control factors such as image orientation. However, because this method was based on median correlations within a spatial region, it did not allow analyses of the distribution of correlation values across pixels within a spatial region. To analyze these distributions, a second analysis method evaluated a single between-observer CI for each experiment by correlating the proportion of detection responses for each image with the luminance of each pixel. The significance of this CI and of measurements derived from it was then evaluated using null sampling distributions that were generated by repeatedly shuffling observer responses (i.e., Monte Carlo sampling).
3.1. Repeated measures analyses of individual CIs
In the first analysis method, a CI was created for each observer by correlating their detection/nondetection response with the luminance (i.e., gray scale value) of each pixel across the 480 noise images viewed during the experimental portion of the study. The spatial distribution of CIs for each experiment was analyzed by finding the median pixel correlation within different spatial regions of interest for each observer's CI. These median correlation values were then analyzed using an ANOVA with observer as a random factor. Control factors such as image rotation and response key were included in the analyses. Correlation values were largest for the central regions, and so all analyses were performed on the center ninth of the CIs, based on an evenly spaced three-by-three grid over the entire image.
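The per-observer computation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the nested-list image representation, the 1/0 response coding, and the choice to return 0.0 when a series is constant are all assumptions made for the example.

```python
from statistics import median

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Returns 0.0 if either sequence is constant (an assumption for this sketch)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def classification_image(images, responses):
    """Correlate each pixel's luminance with the observer's responses.
    images: list of 2-D luminance grids (one per trial).
    responses: 1 = detection, 0 = no detection (one per trial)."""
    h, w = len(images[0]), len(images[0][0])
    return [[pearson([img[r][c] for img in images], responses)
             for c in range(w)] for r in range(h)]

def center_ninth(ci):
    """Middle cell of an evenly spaced three-by-three grid over the CI."""
    h, w = len(ci), len(ci[0])
    return [row[w // 3: 2 * w // 3] for row in ci[h // 3: 2 * h // 3]]

def half_medians(region):
    """Median correlation in the left and right halves of a region."""
    w = len(region[0])
    left = [v for row in region for v in row[: w // 2]]
    right = [v for row in region for v in row[w // 2:]]
    return median(left), median(right)
```

Under these assumptions, `half_medians(center_ninth(ci))` yields one pair of values per observer, which is the kind of quantity that would then enter the repeated measures ANOVA. With this coding, a pixel whose darkness predicts detection receives a negative correlation.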
First, we compared the left and right halves of the center region (see insets at the top of ). Dark areas on the left of images were more correlated with detections than the right for both Experiment 1, F(1,171) = 4.19, p = .042, η2 = .024, and Experiment 3, F(4,197) = 6.88, p = .009, η2 = .034, but not for Experiment 2, F(1,186) < 1. Next, the top and bottom halves of the center region were compared, revealing no differences for Experiment 1, F(1,171) < 1, or Experiment 2, F(1,186) < 1, but a slight bias toward the top for Experiment 3, F(4,197) = 3.91, p = .049, η2 = .019.
Figure 2. Repeated measures analyses of individual classification images (CIs) based on the median correlations within different regions of each observer's CI. The left column compares the left and right halves of the center region of the image as indicated by …
To provide a more fine-grained spatial analysis, the center region, which resulted from a three-by-three grid over the whole image, was itself further divided into three rows by three columns (see inset of , upper right). For Experiment 1 the only significant effect was that of column, F(2,170) = 4.21, p = .016, η2 = .047. Ordered from most to least correlated, the columns were middle, left, and right, and post hoc tests, αadj = .017, showed that the middle column was more correlated than the right, t(172) = 2.98, p = .003. For Experiment 2 there was a significant interaction between row and column, F(4,183) = 5.67, p < .001, η2 = .110, as well as main effects of row, F(2,185) = 14.77, p < .001, η2 = .13, and column, F(2,185) = 17.69, p < .001, η2 = .161. Post hoc corrected t-tests assessing the interaction, αadj = .0014, found these effects were largely driven by the center region, which had much larger correlations than all other regions, t(187) ≥ 4.64, all p-values < .001. The upper-middle, middle-left, middle-right, lower-left, and lower-middle regions were additionally more correlated than the lower-right region, t(187) ≥ 3.39, all p-values ≤ .001. For Experiment 3 there was again a row-by-column interaction, F(4,194) = 3.93, p = .004, η2 = .075, as well as main effects of row, F(2,196) = 4.32, p = .015, η2 = .042, and column, F(2,196) = 9.53, p < .001, η2 = .089. In contrast to Experiment 2, corrected comparisons showed that pixels in the upper-left region had stronger correlations than the middle-left, middle-right, and lower-right regions, t ≥ 3.39, p-values ≤ .001, and that the center and lower-middle regions had higher correlations than the middle-right and lower-right regions, t(200) ≥ 3.26, p-values ≤ .001.
Next, we report the results for the control variables. These control variables were counterbalanced across observers, and the focus of these analyses was on interactions between the control variables and the spatial characteristics of the CIs. There were no interactions with image orientation for Experiment 1, F-ratios < 1.45, p-values > .238. In Experiment 2, however, image orientation interacted with left/right half, F(1,186) = 17.94, p < .001, η2 = .088, and with top/bottom half, F(1,186) = 7.55, p < .001, η2 = .039. Additionally, there was an interaction with column for the three-by-three analysis, F(2,185) = 9.32, p < .001, η2 = .092. The interactions of image orientation with halves are shown along the left and middle columns of . For the images shown in their original orientation, the right was significantly more correlated than the left, t(85) = 2.26, p = .026, and there was a marginal difference such that the bottom was more correlated than the top, t(85) = 1.93, p = .056. Both of these effects reversed when the images were rotated: left more correlated than right, t(101) = 3.82, p ≤ .001; top more correlated than bottom, t(101) = 1.99, p = .049. Examination of the column-by-orientation interaction, αadj = .017, showed that for the original orientation images there were differences between the center column and each side: both t(85) ≥ 3.22, both p-values ≤ .002. For the rotated images, however, both the center and left regions had stronger correlations than the right: both t(101) ≥ 4.158, both p-values < .001. In Experiment 3 there were no significant interactions with image rotation or response key for any of the analyses, F-ratios ≤ 1.34, p-values ≥ .256.
3.2. Monte Carlo analyses of between-observer CIs
A second set of analyses was based on between-observer CIs rather than on the individual CIs. A between-observer CI was calculated based on the correlation between pixel luminance (i.e., gray scale value) and the proportion of observers who gave detection responses to an image containing that pixel. The detection proportions for each image were computed as the number of detection responses divided by the sum of the number of detection and no-detection responses (i.e., excluding no-response trials). Including no-response trials did not change the results. It is worth noting that these between-observer CIs were nearly identical in appearance to the average of the individual CIs. Null hypothesis distributions were determined separately for each pixel location. Thus, if there were a systematic bias in the noise images (e.g., a tendency for dark areas to be on the left), it would be reflected in the null sampling distributions. These analyses allowed a fine-grained spatial evaluation by examining clusters of pixels within the CIs.
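As a small illustration of the detection proportion defined above, the following sketch computes one proportion per image while excluding no-response trials. The function name and the coding of responses (1 = detection, 0 = no detection, `None` = no response) are hypothetical choices made for this example.

```python
def detection_proportions(responses_by_image):
    """Detections / (detections + no-detections) for each image.
    responses_by_image: one list per image holding every observer's response,
    coded 1 = detection, 0 = no detection, None = no response (assumed coding)."""
    proportions = []
    for responses in responses_by_image:
        # Drop no-response trials before forming the proportion.
        answered = [r for r in responses if r is not None]
        if not answered:
            raise ValueError("image has no detection or no-detection responses")
        proportions.append(sum(answered) / len(answered))
    return proportions
```

The resulting list, with one value per image, is what would be correlated with the luminance of each pixel to form the between-observer CI.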
The distribution of correlation values expected at each pixel location under random responding was determined by running 5,000 Monte Carlo simulations for each experiment. For each Monte Carlo simulation, the mapping between the 480 detection probabilities and the 480 images was randomly reshuffled. The resulting means and standard deviations at each pixel location were used to calculate z-scores for each observed pixel correlation, which are displayed in the right column of . To verify that the correlation values were normally distributed, we performed a Lilliefors test (i.e., a Kolmogorov-Smirnov test with unknown parameter values) separately for each of the 230,400 pixels of each experiment. With α = .05, 5% of the pixels produced correlation distributions that differed significantly from a normal distribution (i.e., the proportion of rejections was exactly as expected based on the chosen type-I error rate).
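The shuffling procedure can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function names, the fixed seed, the use of the population standard deviation for the null distribution, and the representation of each pixel as a list of luminance values (one per image) are all choices made for the example.

```python
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def monte_carlo_z(pixels, proportions, n_sims=5000, seed=0):
    """z-score of each observed pixel correlation against a shuffled null.
    pixels: list of per-pixel luminance series (one value per image).
    proportions: detection proportion for each image."""
    rng = random.Random(seed)
    observed = [pearson(p, proportions) for p in pixels]
    sums = [0.0] * len(pixels)
    sq_sums = [0.0] * len(pixels)
    shuffled = list(proportions)
    for _ in range(n_sims):
        # Break the image-to-proportion mapping; the same shuffle is used for all pixels.
        rng.shuffle(shuffled)
        for i, p in enumerate(pixels):
            r = pearson(p, shuffled)
            sums[i] += r
            sq_sums[i] += r * r
    z_scores = []
    for i, r_obs in enumerate(observed):
        null_mean = sums[i] / n_sims
        null_var = sq_sums[i] / n_sims - null_mean ** 2
        z_scores.append((r_obs - null_mean) / null_var ** 0.5)
    return z_scores
```

Because the null mean and standard deviation are accumulated separately for every pixel, any systematic bias in the noise images at a given location is absorbed into that pixel's own null distribution.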
Figure 3. Left: the average of the 40 true images viewed during training. Right: the between-observer classification images (CIs) for Experiment 1 (top, N = 173), Experiment 2 (middle, N = 188), and Experiment 3 (bottom, N = 201). Each pixel's color is defined …
Next, the distribution of correlation values was assessed by counting the number of pixels in the top versus bottom and in the left versus right that exceeded a chosen correlation threshold. Correlation thresholds were defined by the proportion of the largest absolute correlations; a rank-ordered list of all the absolute correlations for the entire image was formed from largest to smallest, and different proportions of this list were included in the analyses. As shown in , the pixel counts for one region were subtracted from the other region to yield a measure of relative spatial bias. The black lines in the figures show the observed biases, and the dashed red lines show 95% confidence intervals for this measure based on the Monte Carlo simulations. Similar to the repeated measures analyses, these analyses were carried out only for the middle region of the display, as shown by the insets of the figure. As seen in the left column of , face detection revealed a left bias (both Experiment 1 and Experiment 3), replicating the ANOVA analyses. As seen in the right column of , only bounded face detection (Experiment 3) showed any reliable vertical bias, with the most reliable correlations found in the top half.
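A minimal sketch of the left-minus-right bias measure follows. It assumes the CI is a nested list of correlations, that the threshold is the smallest absolute correlation within the chosen top proportion of the whole image, and that pixels tied with the threshold are counted; the function name and these details are illustrative rather than taken from the study.

```python
def left_right_bias(ci, proportion):
    """Left-minus-right count of center-region pixels whose absolute correlation
    falls within the top `proportion` of absolute correlations over the whole image."""
    h, w = len(ci), len(ci[0])
    # Rank all absolute correlations in the entire image, largest first.
    ranked = sorted((abs(v) for row in ci for v in row), reverse=True)
    k = max(1, round(proportion * len(ranked)))
    threshold = ranked[k - 1]
    # Center ninth of the image, split into left and right halves.
    r0, r1 = h // 3, 2 * h // 3
    c0, c1 = w // 3, 2 * w // 3
    mid = (c0 + c1) // 2
    left = sum(abs(ci[r][c]) >= threshold
               for r in range(r0, r1) for c in range(c0, mid))
    right = sum(abs(ci[r][c]) >= threshold
                for r in range(r0, r1) for c in range(mid, c1))
    return left - right
```

Applying the same function to each CI produced by a shuffled simulation would give a null distribution of the bias, from which a 95% interval could be read off at each proportion. A top-minus-bottom version would split the rows instead of the columns.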
Figure 4. The distribution of correlation values within different regions of the between-observer classification images (CIs). These graphs show the difference in the number of pixels that fall into one region of the CI versus another region (e.g., left minus right) …
Beyond assessing different spatial regions, the between-observer CIs were used to assess the nature of pixel clusters (i.e., correlations between pixels). To some degree, regions of correlated pixels were expected because the noise images contained Gaussian blobs rather than independent pixel noise. However, because the noise images contained Gaussian blobs of different sizes, we could evaluate biases for pixels of different cluster sizes across the three experiments. Regions were defined as groups of contiguous pixels with the absolute value of correlations above a threshold proportion. The number of contiguous regions, mean size of regions, and standard deviation of region sizes were computed for varying threshold proportions for the observed between-observer CIs and the Monte Carlo CIs. The results are plotted in . For small threshold proportions, the observed data of all three experiments revealed fewer clusters than expected by chance, and those clusters were larger and more variable in size than would be expected. However, these effects were greatly magnified for letter detection, which had the smallest number of regions, and those regions were of the largest size.
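The region measures can be illustrated with a simple connected-components pass. The text does not state the contiguity rule or the form of the standard deviation, so this sketch assumes 4-connectivity (pixels sharing an edge) and the population standard deviation; it also takes a correlation threshold directly, which in practice would be derived from the threshold proportion. All names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

def region_sizes(ci, threshold):
    """Sizes of 4-connected regions of pixels with |correlation| >= threshold."""
    h, w = len(ci), len(ci[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or abs(ci[r][c]) < threshold:
                continue
            # Breadth-first flood fill from this unvisited above-threshold pixel.
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and abs(ci[ny][nx]) >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes

def region_stats(ci, threshold):
    """Number of regions, mean region size, and SD of region sizes."""
    sizes = region_sizes(ci, threshold)
    if not sizes:
        return 0, 0.0, 0.0
    return len(sizes), mean(sizes), pstdev(sizes)
```

Running `region_stats` on the observed CI and on each Monte Carlo CI at the same threshold proportion would allow the observed number, mean size, and variability of regions to be compared with their chance distributions.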
Figure 5. Monte Carlo analyses of between-observer classification images to evaluate clusters of correlated pixels. For pixel correlations greater in magnitude than the proportion threshold the number (left column), size (middle column), and standard deviation …