An alternative to the hierarchical feature detector model of vision arose at the end of the 1960s. This alternative, known as the spatial-frequency model (F. W. Campbell and Robson, 1968; R. L. De Valois and De Valois, 1988), is not at all intuitive. The spatial-frequency model proposes that the visual system analyzes the number of cycles of light-dark (or color) patches in any stimulus. Some cycles are narrow, others broad. Some cycles of light-dark are oriented vertically, others horizontally, and others somewhere in between. If this description is accurate, cortical neurons should respond to repeating bars of light, like the grating in Figure 1a, even better than to a single bar of light. That is precisely what researchers found. By the spatial frequency of a visual stimulus, we mean the number of light-dark (or color) cycles that the stimulus shows per degree of visual space. For example, Figures 1a and b differ in the spacing of the bars: Figure 1a has twice as many bars in the same horizontal space and is therefore said to have double the spatial frequency of Figure 1b.
According to a mathematical technique called Fourier analysis, any complex sound can be produced by adding together simple sine waves. Conversely, we can use Fourier analysis to determine which combination of sine waves would be needed to make any particular complex sound. The same principle can be applied to visual patterns. If the brightness varies from dark to light across space according to a sine wave function, visual patterns like the ones in Figures 1c and d result. Any series of sharp-edged dark and light stripes, like those in Figures 1a and b, can be analyzed into the sum of a visual sine wave and its odd harmonics (odd-numbered multiples of the basic frequency).
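The decomposition of a striped pattern into a sine wave plus its odd harmonics can be sketched numerically. The following is a minimal illustration, not taken from the cited studies: it assumes equal-width dark and light bars (a square wave), and the function name `square_wave_partial_sum`, the 4 cycles-per-degree frequency, and the sampling grid are all choices made for this example.

```python
import numpy as np

def square_wave_partial_sum(x, freq, n_harmonics):
    """Add up the first n_harmonics odd harmonics of a square wave.

    The terms are sin(2*pi*n*freq*x) / n for n = 1, 3, 5, ...,
    scaled by 4/pi so that the sum approaches a wave of amplitude 1.
    """
    total = np.zeros_like(x, dtype=float)
    for k in range(n_harmonics):
        n = 2 * k + 1  # odd harmonic number: 1, 3, 5, ...
        total += (4 / np.pi) * np.sin(2 * np.pi * n * freq * x) / n
    return total

# One degree of visual space sampled at 1000 points (assumed resolution).
x = np.linspace(0.0, 1.0, 1000, endpoint=False)
freq = 4.0  # cycles per degree (assumed)

# Target: an idealized sharp-edged grating, +1 for light and -1 for dark.
target = np.sign(np.sin(2 * np.pi * freq * x))

approx_1 = square_wave_partial_sum(x, freq, 1)    # fundamental only
approx_25 = square_wave_partial_sum(x, freq, 25)  # harmonics 1 through 49

# Mean squared error shrinks as more odd harmonics are included.
err_1 = np.mean((target - approx_1) ** 2)
err_25 = np.mean((target - approx_25) ** 2)
print(f"error with 1 term: {err_1:.4f}, with 25 terms: {err_25:.4f}")
```

With only the fundamental, the result is a smooth sinusoidal grating like those described for Figures 1c and d; adding higher odd harmonics progressively sharpens the edges toward a bar pattern.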
A complex visual pattern or scene can also be analyzed by the Fourier technique; in this case, frequency components at different angles of orientation are also used. A given spatial frequency can exist at any level of contrast; Figures 1c and d show examples of high and low contrast, respectively. To reproduce or perceive a complex pattern or scene accurately, the visual system has to handle all the spatial frequencies present. If the high frequencies are filtered out, the small details and sharp contrasts are lost; if the low frequencies are filtered out, the large uniform areas and gradual transitions are lost. Figures 1e–g show how the filtering of spatial frequencies affects a photograph. The photograph is still recognizable after either the high spatial frequencies (Figure 1f) or the low spatial frequencies (Figure 1g) are filtered out, but information is lost in either case. (Similarly, speech is still recognizable, although it sounds distorted, after either the high audio frequencies or the low frequencies are filtered out.)
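The effect of filtering out high or low spatial frequencies can be sketched with a two-dimensional Fourier transform. This is only an illustration under stated assumptions: it uses a synthetic image (a coarse grating plus a fine grating) rather than the photograph in Figure 1, a sharp circular cutoff rather than whatever filter produced the figure, and a hypothetical helper named `filter_spatial_frequencies`.

```python
import numpy as np

def filter_spatial_frequencies(image, cutoff, keep="low"):
    """Keep frequencies at or below the cutoff (keep="low") or above it
    (keep="high"). The cutoff is in cycles per image width."""
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows) * rows  # vertical frequency, cycles per image
    fx = np.fft.fftfreq(cols) * cols  # horizontal frequency, cycles per image
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = (radius <= cutoff) if keep == "low" else (radius > cutoff)
    spectrum = np.fft.fft2(image)
    # The mask is symmetric, so the inverse transform is real up to rounding.
    return np.real(np.fft.ifft2(spectrum * mask))

# Synthetic test image (assumed): broad stripes plus fine stripes.
size = 128
x = np.arange(size) / size
coarse = np.sin(2 * np.pi * 2 * x)        # 2 cycles per image
fine = 0.5 * np.sin(2 * np.pi * 32 * x)   # 32 cycles per image
image = np.tile(coarse + fine, (size, 1))

low_passed = filter_spatial_frequencies(image, cutoff=8, keep="low")
high_passed = filter_spatial_frequencies(image, cutoff=8, keep="high")

# The low-pass result keeps only the broad stripes; the high-pass result
# keeps only the fine detail. Together they restore the original image.
print(np.allclose(low_passed + high_passed, image))
```

In this toy case each filtered image retains exactly one of the two gratings, which mirrors the point in the text: removing high frequencies loses fine detail, and removing low frequencies loses the broad, gradual variation.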
Thus, the simple cortical receptive fields that Hubel and Wiesel found are not just detecting lines of a particular orientation. Rather, they are detecting particular spatial frequencies—stripes of particular orientation that are either close together (high spatial frequency) or far apart (low spatial frequency). In other words, the receptive field for a simple cortical cell is not just a line of light that excites the neuron; it also consists of lines on either side that inhibit the cell. Therefore, the best stimulus for exciting a simple cortical cell is a series of bars, like those shown in Figures 1a–d, of a particular spatial frequency, in a particular orientation (horizontal, vertical, or somewhere in between), in a particular part of the visual field.
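A receptive field with an excitatory center and inhibitory flanks can be sketched as a one-dimensional weighting function, and its response to different stimuli compared. The model below is an assumption made for illustration, not the description given by Hubel and Wiesel or the cited authors: it uses a cosine multiplied by a Gaussian envelope (often called a Gabor function), a preferred frequency of 2 cycles per degree, and a response computed as a rectified weighted sum. All names are hypothetical.

```python
import numpy as np

def receptive_field(x, preferred_freq, sigma):
    """Excitatory center with alternating inhibitory and excitatory flanks:
    a cosine at the preferred frequency under a Gaussian envelope."""
    envelope = np.exp(-x ** 2 / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * preferred_freq * x)

def response(rf, stimulus, dx):
    """Weighted sum of the stimulus over the field, floored at zero
    because a firing rate cannot be negative."""
    return max(0.0, float(np.sum(rf * stimulus) * dx))

x = np.linspace(-2.0, 2.0, 4001)  # position in degrees (assumed range)
dx = x[1] - x[0]
preferred_freq = 2.0              # cycles per degree (assumed)
rf = receptive_field(x, preferred_freq, sigma=0.5)

# Tuning: responses to sinusoidal gratings of different spatial frequencies.
freqs = np.linspace(0.5, 4.0, 15)
tuning = [response(rf, np.cos(2 * np.pi * f * x), dx) for f in freqs]
best_freq = float(freqs[int(np.argmax(tuning))])

# A single light bar covering only the excitatory center...
single_bar = (np.abs(x) < 1 / (4 * preferred_freq)).astype(float)
# ...versus repeating light bars at the preferred spacing.
bar_grating = (np.cos(2 * np.pi * preferred_freq * x) > 0).astype(float)

bar_response = response(rf, single_bar, dx)
grating_response = response(rf, bar_grating, dx)
print(best_freq, bar_response, grating_response)
```

In this sketch the modeled cell responds most strongly to the grating whose spacing matches its own alternation of excitatory and inhibitory regions, and more strongly to repeating bars than to a single bar, because the additional light bars fall on excitatory flanks while the dark gaps fall on inhibitory ones.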
The suggestion that the visual system detects various spatial frequencies was soon supported by experiments investigating selective adaptation to spatial patterns (Blakemore and Campbell, 1969; Pantle and Sekuler, 1968). In these experiments, a person spent a minute or more inspecting a visual grating with a given spacing (or spatial frequency), such as those in Figures 1a and b. Looking at the grating made the cells tuned to that specific frequency adapt (become less sensitive). Afterward, the person’s sensitivity to gratings was briefly reduced at the particular frequency to which that person had adapted.
Leonardo da Vinci’s Mona Lisa is famous in part because the model sometimes seems to be smiling and at other times does not (see Figure 2). That ambiguity may be due to differences in spatial frequency (Livingstone, 2000). The low-spatial-frequency components of the picture (left panel of Figure 2) make it look as if she’s smiling, but the high-spatial-frequency components (right panel) give her a rueful, almost sad expression. As we run our eyes over the original, views from the fovea report the sad, high-frequency components; but views from peripheral vision, with large receptive fields that can detect only low-frequency components, emphasize the smile.
Blakemore, C., and Campbell, F. W. (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. Journal of Physiology (London) 203: 237–260.
Campbell, F. W., and Robson, J. G. (1968). Application of Fourier analysis to the visibility of gratings. Journal of Physiology (London) 197: 551–566.
De Valois, R. L., and De Valois, K. K. (1988). Spatial vision. New York, NY: Oxford University Press.
Livingstone, M. S. (2000). Is it warm? Is it real? Or just low spatial frequency? Science 290: 1299.
Pantle, A., and Sekuler, R. (1968). Size-detecting mechanisms in human vision. Science 162: 1146–1148.