To get back to Wertheimer's original goal of understanding object recognition, especially letter identification, we created letter-shaped contours and displayed them on a background of visual noise. Our alphabet, based on Sloan's, has 10 letters (see Method). This task allows us to directly measure efficiency of letter identification as a function of deviation from collinearity. Each standard (Sloan) letter is defined by the path a pen's stroke would follow in drawing it. In the standard condition, gratings are placed at regular intervals along the letter's (invisible) path, aligned with the path. We perturbed collinearity by rotating, offsetting, or phase shifting successive gratings, right or left alternately, relative to the path (Figure 4).
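The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to generate our stimuli: the function names `sample_path` and `perturb`, the polyline representation of the pen path, the choice to apply offsets perpendicular to the local path direction, and the numeric `amount` parameter for phase are all assumptions made for the example.

```python
import math

def sample_path(points, spacing):
    """Return (x, y, tangent_angle) at regular arc-length intervals
    along a polyline (a hypothetical stand-in for a letter's pen path)."""
    samples = []
    next_at = 0.0  # arc length at which the next grating is due
    walked = 0.0   # arc length at the start of the current segment
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        angle = math.atan2(y1 - y0, x1 - x0)  # local path direction
        while next_at <= walked + seg + 1e-9:
            t = (next_at - walked) / seg
            samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0), angle))
            next_at += spacing
        walked += seg
    return samples

def perturb(samples, kind, amount):
    """Perturb successive gratings with alternating sign (left, right, ...).

    kind is "none", "orientation" (radians), "offset" (path units,
    assumed perpendicular to the path), or "phase" (radians)."""
    gratings = []
    for i, (x, y, angle) in enumerate(samples):
        sign = 1 if i % 2 == 0 else -1
        g = {"x": x, "y": y, "orientation": angle, "phase": 0.0}
        if kind == "orientation":
            g["orientation"] = angle + sign * amount
        elif kind == "offset":
            # Shift along the normal to the local path direction.
            g["x"] = x - sign * amount * math.sin(angle)
            g["y"] = y + sign * amount * math.cos(angle)
        elif kind == "phase":
            g["phase"] = sign * amount
        elif kind != "none":
            raise ValueError("unknown perturbation: " + kind)
        gratings.append(g)
    return gratings

# Hypothetical horizontal stroke, gratings one unit apart.
stroke = sample_path([(0.0, 0.0), (4.0, 0.0)], spacing=1.0)
offset_gratings = perturb(stroke, "offset", 0.5)
```

With `kind="none"` every grating stays on the path and aligned with it, which corresponds to the standard (collinear) condition.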
Figure 4 Measuring wiggle. The first row shows three unperturbed letters: The gratings are collinear with the path of the letter. The second row shows a sample letter for each of our three perturbations: orientation (Z), offset (R), and phase (S). In the third …
Figure 4 shows some perturbed letters. Note that the perturbation seems to bend the stroke, making it seem serpentine or wiggly. Inspired by this impression, we fitted a sinusoid, tangent to the white–black (not black–white) crossing nearest to the centre of the gratings. We define wiggle as the angle the sinusoid makes with its axis.
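As a worked illustration of this definition, consider a sinusoid y = A sin(2πx/λ). Its slope where it crosses its axis is 2πA/λ, so the angle it makes with the axis there is arctan(2πA/λ). The sketch below assumes that the angle is measured at the axis crossing, where the sinusoid is steepest; the function name `wiggle_degrees` and the example amplitude and wavelength are hypothetical and are not values from our stimuli.

```python
import math

def wiggle_degrees(amplitude, wavelength):
    """Angle (in degrees) between the sinusoid
    y = amplitude * sin(2*pi*x / wavelength) and its axis,
    measured at an axis crossing (an assumption of this sketch)."""
    slope_at_crossing = 2 * math.pi * amplitude / wavelength
    return math.degrees(math.atan(slope_at_crossing))

# A flat stroke has no wiggle; a steeper sinusoid has more.
flat = wiggle_degrees(0.0, 1.0)
steep = wiggle_degrees(1.0, 2 * math.pi)  # slope of 1 at the crossing
```

Because wiggle depends only on the ratio of amplitude to wavelength, the same measure can be applied to any perturbation once a sinusoid has been fitted to it.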
Each wiggled alphabet was created once and was then used unchanged through all training and testing.
The noise background was fresh (independent, identically distributed) on each presentation. The visual noise background swamps any additive intrinsic noise in the observer and makes the task an explicit computational problem, for which the optimal algorithm (maximum likelihood choice among the possible letters) can be derived mathematically and implemented as a computer program that represents the ideal observer (Appendix A of Pelli et al., 2006). We measured threshold contrast for 82% correct letter identification for both human and ideal observers. At threshold, we computed the contrast energy, the integrated square of the contrast function over the signal area. The ratio of threshold energies, ideal over human, is called efficiency (Pelli & Farell, 1999). Efficiency strips away the intrinsic difficulty of the task to reveal a pure measure of human ability. See Pelli and Farell (1999) for a tutorial explaining how to measure efficiency.
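The quantities defined above can be written out as a short Python sketch. It is illustrative only and is not the program described in Appendix A of Pelli et al. (2006). It assumes images stored as nested lists of contrast values, white Gaussian noise, equally probable letters, and templates already scaled to the presented contrast; under those assumptions the maximum likelihood choice reduces to picking the template with the smallest squared distance from the stimulus. The names `contrast_energy`, `efficiency`, and `ideal_choice`, and all numbers in the example, are hypothetical.

```python
def contrast_energy(contrast, pixel_area=1.0):
    """Discrete version of the integrated squared contrast:
    the sum of squared pixel contrasts times the area of one pixel."""
    return sum(c * c for row in contrast for c in row) * pixel_area

def efficiency(ideal_threshold_energy, human_threshold_energy):
    """Ratio of threshold contrast energies, ideal over human."""
    return ideal_threshold_energy / human_threshold_energy

def ideal_choice(stimulus, templates):
    """Return the name of the template closest to the stimulus.

    With white Gaussian noise and equal priors (assumptions of this
    sketch), minimum squared distance is the maximum likelihood choice."""
    def squared_distance(a, b):
        return sum((p - q) ** 2
                   for row_a, row_b in zip(a, b)
                   for p, q in zip(row_a, row_b))
    return min(templates, key=lambda name: squared_distance(stimulus, templates[name]))

# Toy 1x2 "letters" and a noisy stimulus (made-up values).
templates = {"A": [[1.0, 0.0]], "B": [[0.0, 1.0]]}
choice = ideal_choice([[0.9, 0.1]], templates)

energy = contrast_energy([[0.1, -0.1], [0.0, 0.2]])
example_efficiency = efficiency(1.0, 10.0)  # human needs 10x the ideal's energy
```

Since contrast energy is proportional to squared contrast for a fixed signal, a human threshold contrast about three times the ideal's corresponds to an efficiency of roughly one ninth.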
Our paradigm is similar in some ways to the tumbling E test introduced by Levi, Sharma, and Klein (1997). Like our snake letters, their letters consisted of gabors. However, instead of adding a white noise background they perturbed the position of each gabor randomly on each presentation. Like us, they compare human and ideal thresholds to compute efficiency, but their manipulations did not assess the role of grouping, so their results are not relevant here.
More relevant is the tumbling C test of Saarinen and Levi (2001). They made a Landolt C (a perfect circle with a gap) out of gabors and presented the C at one of four orientations (90° apart), asking the observer to say which. They compared the threshold contrast for Cs made up of gabors that were all collinear with the C's path, or all orthogonal with the path, or each randomly collinear or orthogonal. Like us, they found that the orthogonal case resulted in higher thresholds than the collinear case. (The random case elevated threshold slightly more than the orthogonal case, but this difference was statistically significant for only one of the three observers, and in the group average.) They note in their abstract that their use of four-way identification was an advance on the prior work, which was all binary discrimination: “A number of previous studies have reported that integration of local information can aid ‘pop-out’ or enhance discrimination of figures embedded in distractors. Our study differs from the previous studies in that, rather than a figure–ground discrimination, our experiments measured contrast thresholds for shape identification” (Saarinen & Levi, 2001). While four-way identification is indeed an advance, it is our impression (confirmed by Levi, personal communication) that, at least subjectively, this particular task quickly reduces to detecting the gap and reporting its location, especially when near threshold. Thus, even though they were asking observers to identify four versions of a letter, it does not seem that the observers were doing ordinary object recognition.
Figure 5 presents letter-chart versions of two of our three experiments, perturbing orientation (left panel) and offset (right panel). (We could not make a similar three-column chart for phase wiggle, because, as we defined it, there are only two strengths: on and off.) The perturbation increases from left to right. Letter contrast diminishes from bottom to top. For each column (perturbation) your efficiency is given by the highest (faintest) letter you can identify. Note the drop in efficiency as wiggle increases from left to right.
Figure 5 Letters in noise, demonstrating that good continuation is important for letter identification. For each letter chart, starting from the bottom, read up each column as far as you can. The height of the faintest identifiable letter is your contrast sensitivity …
A “wiggle” is “a wavy line drawn by a pen, pencil, etc.” (Oxford English Dictionary, noun, 3). Others have measured sinusoidal curvature in order to study shape perception (e.g., Prins, Kingdom, & Hayes, 2007; Siddiqi, Kimia, Tannenbaum, & Zucker, 1999; Tyler, 1973; Wilson & Richards, 1989). Our treatment of wiggle is novel in applying one metric to three different kinds of perturbation, to test whether the effect of the various perturbations is mediated by this one parameter.