from 'Language Myths', eds. L. Bauer and P.
Trudgill, Penguin, 1998, pp. 150-8
“Some Languages are Spoken More Quickly Than Others”
Peter Roach
We all make judgements
about how quickly someone is speaking, but it is not at all easy to work out
what we base these judgements on. Speakers of some languages seem to rattle
away at high speed like machine-guns, while other languages sound rather slow
and plodding. We find the same when we listen to dialects of our own native
language - within English, for example, it is a familiar cliché that cowboys in
Westerns (usually set in Texas or neighbouring states) speak slowly, with a
drawl. English rural accents of East
Anglia and the South-West are also thought of as slow-speaking, while urban
accents such as those of London or New York are more often thought of as
fast-speaking. However, impressionistic
judgements about such things are often unreliable. Ilse Lehiste, who has
studied very many languages, wrote “Whether there are differences in the rates
of speech of speakers with different linguistic backgrounds is not well known”
(Lehiste, 1970, p.52). More recently, Laver (1995) has written “The analysis of
phenomena such as rate is dangerously open to subjective bias ...listeners’
judgements rapidly begin to lose objectivity when the utterance concerned comes
either from an unfamiliar accent or (even worse) from an unfamiliar language”
(p. 542). Can we establish scientifically
that there really are characteristic differences in speaking speed? There are,
it seems to me, three possibilities:
(1) some languages really are spoken more rapidly, and some more slowly, than
others as a natural result of the way their sounds are produced.
(2) we get the impression that some languages are spoken more quickly than
others because of some sort of illusion.
(3) in some societies it is socially acceptable or approved to speak rapidly,
and in others slow speaking is preferred.
1. Measures of speaking rate in different languages
We need to look
for appropriate ways to measure how quickly someone is talking. We are used to
measuring the speed at which someone can type, write or take shorthand
dictation in terms of how many words per minute are taken down. Some adjustment
usually has to be made to penalise someone for going so quickly that they make
a lot of mistakes. In measuring speech, we can do the same thing - we can give
someone a passage to read, or a speaking task such as describing what they did
on their last holiday, and count how many words they speak in a given time.
However, in speech it makes a big difference whether or not we include pauses:
if I want to work out how long it took me to cycle somewhere, I might make a note
of my times both including and excluding rest stops that I made on the way. In
a similar way, most studies of speaking have found it necessary to make two
different measurements of the rate at which we produce units of speech: the
rate including pauses and hesitations, and the rate excluding such things. The
terms usually used are, respectively, speaking rate and articulation rate
(Laver, 1995). Both are highly correlated with
perceived speech tempo, according to van Bezooyen (1984). Tauroza and Allison
(1990) measured words per minute, syllables per minute and syllables per word
in different styles of spoken English and found substantial differences. It is
quite possible that some languages make more use of pauses and hesitations than
others, and our perception of speed of speaking could be influenced by this
(Ofuka, 1996). In comparing different languages, however, there is a more
serious problem: some languages (e.g. German, Hungarian) have some very long
words while others (e.g. Chinese) have very few words of more than one or two
syllables. Finnish has been found to be faster than English when syllables
per second are measured, but slower when words are counted, since Finnish words
tend to be longer than English words. Much depends, of course, on how we
define what a word is (Palmer, 1984, pp.41-8). This inter-language difference
could have a serious impact on the accuracy of our measurements, and for this
reason many investigators have chosen instead to measure the number of syllables spoken in a given amount of time.
This usually results in a syllables-per-second measurement, and at this more
detailed level of measurement it is usual to exclude pauses. This is not the
end to our problems, however: although counting syllables is likely to be a
much more reliable way of comparing different languages for speaking rate than
counting words, we should bear in mind that different languages have very
different syllable structures. Many of the world’s languages do not use
syllables with more than three or four sounds, while others allow syllables of
many more sounds. In English, for example, the word ‘strengths’ /streŋθs/
contains seven sounds; the six-syllable English sentence “Smith’s strength
crunched twelve strong trucks” (containing 32 sounds) would take much longer to
say than the six-syllable Japanese phrase “kakashi to risu”, which contains 12
sounds. So if a language with a relatively simple syllable structure like
Japanese is able to fit more syllables into a second than a language with a
complex syllable structure such as English or Polish, it will probably sound
faster as a result. Dauer (personal communication) has found that Greek and
Italian are spoken more rapidly than English in terms of syllables per second,
but this difference disappears when sounds per second are counted. It seems,
then, that we should compare languages’ speaking rate by measuring the number
of sounds produced per second,
rather than the number of syllables. Within a particular language, it is clear
that speech rate as measured in sounds per second does vary quite widely:
Fonagy and Magdics (1960) measured different speaking styles and found rates
varying from 9.4 sounds (average) per second for poetry reading to 13.83 per
second for sports commentary. But this still leaves us with a problem. The
faster we speak, the more sounds we leave out. Speaking slowly, I might
pronounce the sentence “She looked particularly interesting” as
/ʃi lʊkt pətɪkjələli ɪntərəstɪŋ/, which contains 27 sounds, but speaking
rapidly I might say /ʃi lʊk pətɪkli ɪntrstɪŋ/,
which only contains 20 sounds. In theory, then, it could happen that in
speaking quickly I might produce no more sounds per second than when speaking
slowly. In order to get a meaningful measure, it would be necessary to count
not the sounds actually observable in the physical signal, but the “underlying
phonemes” that I would have produced
in careful speech.
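The measures discussed above can be made concrete with a little arithmetic. The following is only a minimal sketch: the Sample record, the function names and every duration in it are invented for illustration, and the only figures taken from the text are the 27 and 20 sounds of “She looked particularly interesting”.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One recorded utterance; all figures are hypothetical."""
    syllables: int
    sounds: int            # segments counted in the transcription
    total_seconds: float   # whole recording, pauses included
    pause_seconds: float   # silent pauses and hesitations


def speaking_rate(units: int, s: Sample) -> float:
    """Units per second with pauses included."""
    return units / s.total_seconds


def articulation_rate(units: int, s: Sample) -> float:
    """Units per second with pauses excluded."""
    return units / (s.total_seconds - s.pause_seconds)


# An invented 12-second recording containing 2 seconds of pausing.
sample = Sample(syllables=45, sounds=110, total_seconds=12.0, pause_seconds=2.0)
syllable_speaking = speaking_rate(sample.syllables, sample)          # 45 / 12
syllable_articulation = articulation_rate(sample.syllables, sample)  # 45 / 10
sound_articulation = articulation_rate(sample.sounds, sample)        # 110 / 10

# The elision problem: 27 sounds in careful speech, 20 in rapid speech.
careful_sounds, rapid_sounds = 27, 20
careful_seconds, rapid_seconds = 2.7, 2.0   # hypothetical durations

observed_careful = careful_sounds / careful_seconds   # about 10 sounds per second
observed_rapid = rapid_sounds / rapid_seconds         # also about 10 sounds per second
# Counting the "underlying" sounds of careful speech against the rapid duration
# is what reveals the faster tempo.
underlying_rapid = careful_sounds / rapid_seconds     # 13.5 sounds per second

print(syllable_speaking, syllable_articulation, sound_articulation)
print(observed_careful, observed_rapid, underlying_rapid)
```

With these made-up durations the observed sounds-per-second figure is the same for the careful and the rapid version, while the count based on underlying sounds rises, which is the situation the paragraph above warns about.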
Osser and Peng
(1964) measured sounds per second for speakers of Japanese and of American
English, and found no significant difference between them. den Os (1988)
compared Dutch and Italian and found no significant difference in terms of
syllables per second, though Italian was somewhat slower in terms of sounds per
second. In a review of measurements of a number of different languages,
Dankovicova (1994) quotes average figures from various studies: for German,
5.55 and 5.7 syllables per second, for French 5.29, 5.2 and 5.7 syl/sec, for
Dutch 6.1, and for Italian 6.4. These are all for “normal” speaking rate - in
different circumstances, of course, rates can vary. I have a recording of a
friend who left a message on my telephone answering machine and kept up an
average speed of over 8 syl/sec over a
period of about 20 seconds. Arnfield et al. (1995) showed rates in
English varying between 3.3 and 5.9 syl/sec. But overall, it seems that, on
the evidence available at present,
there is no real difference between different languages in terms of
sounds per second in normal speaking styles.
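One rough way to see why the quoted figures do not point to a real difference between languages is to set the spread between the language averages beside the spread reported within English alone. The sketch below uses only the syllables-per-second figures quoted above; pooling averages from separate studies in this way is an assumption made purely for illustration, since the studies differ in method, and it says nothing directly about sounds per second.

```python
from statistics import mean

# Average syllables-per-second figures quoted above from Dankovicova (1994).
reported = {
    "German": [5.55, 5.7],
    "French": [5.29, 5.2, 5.7],
    "Dutch": [6.1],
    "Italian": [6.4],
}

# One mean per language, then the gap between the fastest and slowest language.
language_means = {lang: mean(values) for lang, values in reported.items()}
between_language_spread = max(language_means.values()) - min(language_means.values())

# Range quoted above for English alone (3.3 to 5.9 syl/sec).
within_english_spread = 5.9 - 3.3

for lang, value in language_means.items():
    print(f"{lang}: {value:.2f} syl/sec")
print(f"spread between language means: {between_language_spread:.2f} syl/sec")
print(f"spread within English: {within_english_spread:.2f} syl/sec")
```

On these figures the variation found within a single language is larger than the gap between the fastest and slowest language averages.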
How might we
pursue this question further? One possibility would be to make use of some of
the carefully assembled speech databases stored on computer which have been
phonetically labelled. Databases such as EUROM-1 (Chan et al., 1995), which comprises
speech of six Western European languages, and BABEL (Roach et al., 1996),
containing five languages of Eastern Europe, will, when complete and available
to researchers, give us valuable new material. But the expectation is that
these collections of normal, unemotional monologues will give us the same
answers as the other surveys - we will find no difference between languages in
terms of sounds per second or syllables per second.
2. Speaking rate as an illusion
Our impression of a language being spoken faster or slower may depend to some
extent on its characteristic rhythm. More precisely, it is said that we are
influenced by whether a language is perceived to be stress-timed or syllable-timed.
The distinction was given a detailed exposition by Abercrombie (see for example
Abercrombie, 1967), though the idea had been proposed long before by Pike
(1945). Pike refers (p.37) to the “pattering” effect of Spanish speakers and
their “sharp-cut syllable-by-syllable pronunciation”. Most people feel
intuitively that there is a genuine rhythmical difference between languages
such as English (classed as stress-timed) and French or Spanish (classed as
syllable-timed), and it usually seems that syllable-timed speech sounds faster
than stress-timed to speakers of stress-timed languages. So Spanish, French and
Italian sound fast to English speakers, but Russian and Arabic don’t. The
theory suggests that in syllable-timed languages all syllables tend to be given
equal amounts of time, while in stress-timed languages more time is given to
stressed syllables and less to unstressed. In addition, it is said that
stressed syllables occur at regular intervals of time in stress-timed
languages. Unfortunately, many studies based on detailed measurement of
time-intervals in different languages (e.g. Roach, 1982; Dauer, 1983) have been
unable to confirm these claims, with the result that we are forced to retreat
to a weaker claim: that some languages sound
stress-timed and others sound
syllable-timed. We may be forced to accept something similar in answer to our
present question - perhaps languages and dialects just sound faster or slower, without any physically measurable
difference. The apparent speed of some languages might simply be an illusion.
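The kind of time-interval measurement referred to above can be sketched as follows. This is not the procedure of any of the studies cited: the onset times, the choice of stressed syllables and the use of the coefficient of variation as a regularity measure are all assumptions made for illustration. Strict stress-timing would predict near-zero variation in the intervals between stressed syllables, and strict syllable-timing would predict near-zero variation in the intervals between all syllables.

```python
from statistics import mean, pstdev


def intervals(onsets):
    """Durations between successive onset times, in seconds."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; 0 means perfectly equal intervals."""
    return pstdev(values) / mean(values)


# Hypothetical onset times (seconds) of each syllable in one utterance.
syllable_onsets = [0.00, 0.18, 0.30, 0.55, 0.70, 0.82, 1.10, 1.25, 1.50]
# Hypothetical positions of the stressed syllables within that list.
stressed = [0, 3, 6, 8]
stress_onsets = [syllable_onsets[i] for i in stressed]

syllable_cv = coefficient_of_variation(intervals(syllable_onsets))
stress_cv = coefficient_of_variation(intervals(stress_onsets))

print(f"variation between syllable onsets: {syllable_cv:.2f}")
print(f"variation between stressed-syllable onsets: {stress_cv:.2f}")
```

Neither figure is zero for these invented timings, and the studies cited above report that real measurements likewise fail to show the strict regularity the theory predicts.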
One of the
questions raised by this possibility is the degree to which listeners can
detect differences of speaking rate in their own language and in other
languages. If it turns out that we are no good at detecting speed differences
in different languages, we will have to conclude that our judgements of
speaking rate are unreliable. Vaane (1982) carried out a study using recordings
of Dutch (the subjects’ native language), English, French, Spanish and Moroccan
Arabic; these were spoken at three different rates. Two groups of listeners,
one phonetically trained and the other untrained, had to try to judge the speed
of utterance. Vaane tested the hypothesis that we will be less adept at judging
the speed of a language we do not know, and that an unknown language is likely to
sound faster than our own language (presumably because it “sounds harder to
do”). Her results suggest that in fact both trained and untrained listeners are
quite accurate in judging the rate of speaking for their own language and also
for languages with which they are unfamiliar, a finding which compares interestingly
with the view quoted from Laver (1995) above. From this we can conclude that
the judgements are not based on linguistic knowledge (such as we use in
identifying words). We must be using one or more phonetic characteristics of
the speech that we are able to detect whether or not we know the language being
spoken.
Useful though the above findings
are, they do not yet bring us an answer to the question of whether some
languages are spoken more rapidly than others (when situational and personal
factors have been taken into account). Vaane does quote mean
syllables-per-second rates for the test passages in her experiment, but does
not tell us if the inter-language differences are statistically significant.
Interestingly, Dutch comes out with the highest speaking rate in all three
conditions, though this is not a language that most English people would
immediately think of as being rapidly spoken.
3. Social and personal factors and speaking rate
Social factors
influence the speakers of a language in different ways: a number of anecdotal
sources suggest that in some societies it is regarded as acceptable or approved
to speak rapidly, while in others slow speech is preferred. There is almost
certainly an interaction with gender here, with slow speech usually being
preferred for males. This would mean that, while at normal speaking speed the
sounds-per-second rate for all languages may be effectively the same, speakers
of some languages characteristically use higher or lower speaking rates than
speakers of other languages in particular social
situations. In a carefully controlled study, Kowal et al (1983) looked at two
very different types of speech (storytelling and taking part in interviews) in
English, Finnish, French, German and Spanish. They found significant
differences between the two styles of speech (both in terms of the amount of
pausing and of the speaking rate) but no significant difference between the
languages. They concluded that the influence of the language is negligible
compared with the influence of the style of speech. Similarly, Barik (1977)
showed that differences in tempo between English and French were due to the
style of speech, not to the language. Certainly we are all capable of speaking
faster and slower when we want to. There are variations in speed associated
with the situation in which the speech is being produced - we speak more
rapidly if we are in a hurry, or saying something urgent, or trying not to be
interrupted in a conversation. We tend to speak more slowly when we are tired
or bored. The emotional state of the speaker at the time of speaking is clearly
influential. There seems also to be a personal factor - some people are
naturally fast talkers, while others habitually speak slowly, within the same
language and dialect and in the same situation. Research has shown that our
opinion of speakers is influenced by their speaking rate: Giles (1992) reports
that “a positive linear relationship has repeatedly been found between speech
rate and perceived competence”, and Stephen Cowley (personal communication)
says that in Zulu society, slow speech tempo is a sign of respect and
sincerity. Yet another social factor is the amount of temporal variability,
where the alternation between speaking rapidly and speaking slowly may itself
have considerable communicative value - this has been pointed out by Cowley
(1994), who has found very wide tempo variation from phrase to phrase among
Italian speakers in conversational data.
While this idea of social determination of
speed seems the most plausible explanation, the only way we are going to be
able to test it is by much more research across a wide variety of languages and
social situations. Let us hope that this research will be carried out.
My thanks to Bill Barry, Stephen Cowley, Jana Dankovicova and Marianne
Jessen for their advice and discussion.
References:
Abercrombie, D. (1967) Elements of General Phonetics, Edinburgh University Press.
Arnfield, S., Roach, P., Setter, J., Greasley, P. and Horton, D. (1995) ‘Emotional stress and speech tempo variation’, Proceedings of the ESCA/NATO Workshop on Speech Under Stress, Lisbon, pp. 13-15.
Barik, H.C. (1977) ‘Cross-linguistic study of temporal characteristics of different types of speech materials’, Language and Speech, vol. 20, pp. 116-126.
Bezooyen, R. van (1984) Characteristics and Recognizability of Vocal Expressions of Emotion, Dordrecht: Foris.
Chan, D. and others (1995) ‘EUROM: a spoken language resource for the EU’, in Proceedings of Eurospeech 95, Madrid, pp. 867-870.
Cowley, S. (1994) ‘Conversational functions of rhythmical patterning’, Language and Communication, vol. 14.4, pp. 353-376.
Dankovicova, J. (1994) ‘Variability in articulation rate in spontaneous Czech speech’, unpublished M.Phil thesis, University of Oxford.
Dauer (1983) ‘Stress-timing and syllable-timing re-analysed’, Journal of Phonetics, vol. 11, pp. 51-62.
den Os, E.A. (1988) Rhythm and Tempo of Dutch and Italian, Utrecht: Drukkerij Elinkwijk.
Fonagy, I. and Magdics, K. (1960) ‘Speed of utterance in phrases of different lengths’, Language and Speech, vol. 4, pp. 179-92.
Giles, H. (1992) ‘Speech tempo’, in W. Bright (ed.) The Oxford International Encyclopedia of Linguistics, Oxford University Press.
Kowal, S., Wiese, R. and O’Connell, D. (1983) ‘The use of time in storytelling’, Language and Speech, vol. 26.4, pp. 377-392.
Laver, J. (1995) Principles of Phonetics, Cambridge University Press.
Lehiste, I. (1970) Suprasegmentals, MIT.
Ofuka, E. (1996) Acoustic and Perceptual Analyses of Politeness in Japanese Speech, unpublished PhD thesis, University of Leeds.
Osser, H. and Peng, F. (1964) ‘A cross-cultural study of speech rate’, Language and Speech, vol. 7, pp. 120-5.
Palmer, F.R. (1984) Grammar (Second Edition), Penguin.
Pike, K.L. (1945) The Intonation of American English, University of Michigan Press.
Roach, P. (1982) ‘On the distinction between “stress-timed” and “syllable-timed” languages’, in Crystal, D. (ed.) Linguistic Controversies, Edward Arnold.
Roach, P., Arnfield, S. and Hallum, E. (1996) ‘BABEL: A multi-language database’, Proceedings of the Australian International Conference on Speech Science and Technology (SST-96), pp. 351-4.
Tauroza, S. and Allison, D. (1990) ‘Speech rates in British English’, Applied Linguistics, vol. 11, pp. 90-105.
Vaane, E. (1982) ‘Subjective Estimation of Speech Rate’, Phonetica, vol. 39, pp. 136-149.