Perceptual Centers (P-centers)

John Morton, Steve Marcus and Clive Frankish

Medical Research Council, Applied Psychology Unit, Cambridge, England


Words presented with regular acoustic onsets are not perceptually regular. The requirements for perceived regularity were investigated, and the perceptual center (P-center) of a word was defined as its psychological moment of occurrence. Some properties of these perceptual centers have been empirically determined, and the range of their applicability is sketched. In particular, it is already clear that temporal alignment of P-centers is a relevant variable in dichotic presentation of speech.


We wish to introduce a new term into the language of perception, in particular, speech perception. The term, itself the title of this note, refers to a phenomenon that is obvious once pointed out. To start with, let us attempt a definition: The P-center of a word corresponds to its psychological moment of occurrence. This definition is not completely satisfactory for polysyllabic words but will serve as a first approximation.

The need for the concept arose when we started recording stimulus tapes for memory experiments using a computer. It was apparent that producing items at regular intervals was not simply a question of having the onsets at regular intervals. Thus we were forced to ask ourselves what it was that was regular in a rhythmic list. To simplify our discussions we defined this as the P-center of each item. This act of reification completed, we began to ask questions about the concept itself.

First, we had to produce phenomenally regular stimulus lists. Trial and error variation of the relative onset times for the spoken digits one through nine led to the selection of particular relative onset times required to achieve a phenomenally regular ("P-center adjusted") list. Subsequent work has led to a paradigm for the determination of these relative alignment positions. The paradigm consists of presenting a pair of sounds in alternation under computer control. One of the sounds occurs at fixed intervals; the other occurs at a moment determined by the observer, who adjusts a knob until the pair of sounds is perceived to occur at equal intervals…

The relative alignments for particular exemplars of the spoken digits one through nine are shown in Figure 1. The vertical lines are spaced at 100-msec intervals and are included to aid in comparison. The figure shows the amplitude waveforms of the sounds as represented by the computer, relative horizontal alignment indicating the temporal offset necessary to produce a perceptually regular sequence of these stimuli. For example, the relative onset asynchrony between the words seven and eight is 80 msec in the figure; thus if the eight were to follow the seven in a list of items presented at a rate of two per second, then its onset would have to occur 580 msec after the onset of the seven. Conversely, if the eight preceded the seven, then their onset-to-onset time would need to be 420 msec.
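The arithmetic in the seven/eight example can be written out as a small sketch. The function and variable names below are hypothetical, and the alignment values are illustrative: only the 80-msec difference between seven and eight and the rate of two items per second come from the text. The sketch assumes that a larger alignment value means the onset is drawn later in a P-center-aligned display such as Figure 1.

```python
def onset_to_onset_ms(period_ms, alignment_ms, first, second):
    """Onset-to-onset time that keeps two successive items perceptually regular.

    alignment_ms maps each item to its relative onset position (in msec) in a
    P-center-aligned display; only differences between entries matter.
    """
    return period_ms + alignment_ms[second] - alignment_ms[first]


def schedule_onsets(items, period_ms, alignment_ms):
    """Onset time of every item in a list whose P-centers fall one period apart."""
    return [i * period_ms + alignment_ms[item] for i, item in enumerate(items)]


# Illustrative values: seven is taken as the reference (0 msec) and the onset
# of eight sits 80 msec later, as in the example read off Figure 1.
alignment_ms = {"seven": 0.0, "eight": 80.0}
period_ms = 1000.0 / 2  # two items per second

print(onset_to_onset_ms(period_ms, alignment_ms, "seven", "eight"))  # 580.0
print(onset_to_onset_ms(period_ms, alignment_ms, "eight", "seven"))  # 420.0
print(schedule_onsets(["seven", "eight", "seven"], period_ms, alignment_ms))
# [0.0, 580.0, 1000.0]
```

Because only differences between alignment values enter the calculation, the choice of reference item is arbitrary.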

Our initial assumption is that P-centers are a property simply of the acoustic makeup of each stimulus independent of the context. This is the null hypothesis waiting to be falsified. Present data in support of the hypothesis indicate that in digit lists, P-centers are independent of surrounding acoustic context; that is, the P-center of each digit is not affected by adjacent digits. It remains to be seen whether they are subject to phonological, semantic, or syntactic influences in situations more closely approximating continuous speech. A little can be said about the nature of the acoustic cues for P-center allocation, though only in the form of negative statements. Thus it is clear from Figure 1 that P-centers can correspond neither to word onset, nor to stressed vowel onset, nor to the position of peak vowel intensity. (Although it will be remembered that vowel onsets are aligned for the words three, six, and seven, which begin with fricatives, this has not proven a general feature of other sounds or even other sets of digits.) It would seem, then, that we are dealing with some complex function, though the precise nature of the computation remains to be discovered.


Although P-centers have been defined as a property of speech sounds, the paradigm described above allows speech and nonspeech stimuli to be freely mixed. When subjects attempt to adjust an alternating sequence of a spoken word and a click to perceptual regularity, they perform the task with considerably higher variance than when adjusting pairs of speech sounds. With repeated presentation, even speech stimuli eventually lose coherence (for example, the initial fricatives can become disconnected), and when this happens the judgments of interval become more difficult.

Rapp (1971) and Allen (1972) have explored related techniques for examining what Allen has termed "syllable beats." Because they have attempted to determine absolute positions immediately, they have been hampered by individual criterion differences, especially in Allen's perceptual tasks. Our techniques establish relative timing of speech sounds to one another and show little individual variability; thus we suspect that individual differences arise in a part of the perceptual system remote from that in which P-centers are processed, perhaps even in the motor-response system.



P-centers exert an influence on the production of words (and P can here stand for "production" or "performance"). Humanly produced lists of words never have as a systematic property the regularity of the sound onset. If someone is asked to repeat two sounds alternately at regular intervals, it is simple to discover the relative P-centers. If subjects are asked to produce /ba/ and /ma/ in alternation, we find that the acoustic onset of /ma/ is relatively advanced, even though there are no perceptual irregularities. Onset-to-onset times are smaller for /ba-ma/ than for /ma-ba/. Estimates for two different speakers gave values of 75 and 68 msec. One might note that this is of the same order of magnitude as differences in reaction times between voiced stops and nasals in phoneme monitoring experiments (Foss & Lynch, 1969; Morton & Long, 1976).
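One way such a production estimate might be computed is sketched below. This is an assumption rather than the procedure reported here: the text does not say how the 75- and 68-msec values were derived. The sketch supposes that the speaker's P-centers are evenly spaced, so that the /ma-ba/ interval exceeds the /ba-ma/ interval by twice the relative P-center offset. The function name and the onset times are hypothetical.

```python
from statistics import mean


def relative_pcenter_offset_ms(onsets_ms, labels, a="ba", b="ma"):
    """Estimate how far the onset of `b` is advanced relative to `a` (msec).

    Assumes the speaker's P-centers are evenly spaced, so the b-to-a interval
    exceeds the a-to-b interval by twice the relative offset.
    """
    a_to_b, b_to_a = [], []
    for i in range(len(onsets_ms) - 1):
        interval = onsets_ms[i + 1] - onsets_ms[i]
        if (labels[i], labels[i + 1]) == (a, b):
            a_to_b.append(interval)
        elif (labels[i], labels[i + 1]) == (b, a):
            b_to_a.append(interval)
    return (mean(b_to_a) - mean(a_to_b)) / 2.0


# Hypothetical measured acoustic onsets (msec) for an alternating sequence.
labels = ["ba", "ma", "ba", "ma", "ba"]
onsets_ms = [0, 470, 1000, 1470, 2000]

print(relative_pcenter_offset_ms(onsets_ms, labels))  # 30.0
```

A positive result means the onset of the second syllable type is advanced, which corresponds to the shorter /ba-ma/ intervals described above.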



Finally, it has not escaped our notice that the same concept will serve as a basis for the description of a wide variety of situations. For example, in trying to understand what happens when a ballerina performs a movement "in time to the beat," it might be useful to consider that it is the P-centers of the production units of the movements that are adjusted to successive P-centers of the input music. Equally, if a double bass player, a flautist, and a tympanist play in unison, we can ask what it is that they do together and what it is about the sounds that is isochronous.



Figure 1. Typical exemplars of the spoken digits one through nine, illustrating relative P-center alignment.


In summary then, the concept of the P-center allows and encourages questions to be asked about the times of occurrence of events. In addition, it points to the complexity of the mechanisms necessary to make judgments concerning co-occurrence of events or necessary to produce responses isochronous with each other and with external events. The concept itself has no explanatory power; its virtue is that it has the right kind of neutrality to prevent people like ourselves from assuming, incorrectly, that the onset of a speech sound determines its moment of occurrence.