Science Brief

More than words: Disfluencies, emphasis and gesture aid in communication

Recent experiments show long-term effects on listeners’ understanding and memory.

By Duane G. Watson

Duane G. Watson is an associate professor in the Department of Psychology with appointments in the Department of Linguistics and the Beckman Institute at the University of Illinois Urbana-Champaign. Watson received his doctorate in cognitive science from the Massachusetts Institute of Technology and completed postdoctoral training in cognitive science at the University of Rochester. His research focuses on the cognitive processes that underlie the use of prosody, gesture and disfluency. His work also explores the cognitive mechanisms that underlie individual differences in language processing.


How a person says a sentence is often as important to communication as what they say. For example, an employer who tells an employee "Nice work" might be giving sincere praise for a job well done, or might be using the same phrase to rebuke an employee who lost a big account. The exact intentions of the employer can be made clear through tone of voice, body language and gesture. These nonverbal modes of communication are often critical in conveying the nuances of language and, for the most part, listeners are exquisitely sensitive to these cues.

My colleagues and I are interested in the cognitive processes that underlie the use of these types of cues in both language production and language comprehension. Previous work suggests that these cues not only convey information about speakers’ intentions, but also convey information about other levels of linguistic structure such as semantic and syntactic structure (e.g., Wagner & Watson, 2010). Furthermore, these cues can facilitate language processing. Our goal as language scientists is to understand the factors that facilitate the processes that underlie language use in order to gain insights into how these processes are organized.

One particular goal is to understand how factors like gesture and emphasis influence long-term understanding in communication. In general, the work in my laboratory suggests that how something is said can have a long-term impact on what a listener later remembers. Below, I discuss this work in detail.

Disfluency and memory

In everyday speech, we often make errors in what we say. These can include slips of the tongue, hesitations, saying "uh" or "um" and repeating parts of what was just said. These are called disfluencies. Traditionally, language researchers assumed that listeners interpreted these disfluencies as noise to be filtered out of the linguistic signal, so that the language processing system could analyze an idealized, error-free linguistic form. More recent work suggests that listeners use disfluencies to make inferences about speakers' intentions. For example, researchers have found that in communication games, when listeners hear their partner produce a disfluency, they assume that the speaker is about to refer to something that is new to the conversation or otherwise difficult to describe (e.g., Arnold, Tanenhaus, Altmann, & Fagnano, 2004; Arnold, Hudson Kam, & Tanenhaus, 2007). This work suggests that listeners have a working knowledge of where disfluencies are likely to occur, helping them to better understand the incoming linguistic signal.

Although it is clear that disfluencies can facilitate processing language as it is heard, we were interested in whether disfluencies could have an impact on long-term understanding. In our experiment (Fraundorf & Watson, 2011), participants listened to a chapter from the Lewis Carroll short novel "Alice in Wonderland." We manipulated the story so that listeners either heard stories containing "uhs" and "ums" or stories that were completely fluent. Their task was to retell the chapters after hearing them. We found that participants' memories for the story were better when the story contained "uhs" and "ums" than when the story was produced with no disfluencies. To make sure that this effect was not simply due to the fact that "uhs" and "ums" provided listeners with more time to process what they were hearing, we compared performance in this condition to a condition in which participants listened to stories that contained coughs at the exact same location as the disfluencies. These coughs were digitally manipulated to match the duration of the "uhs" and "ums." If the disfluencies were simply providing listeners with more processing time, performance in the cough condition and disfluency condition should be similar since the coughs and disfluencies were the same length. In this comparison, participants performed more poorly in the cough condition than in the fluent condition or the disfluency condition. This rules out the hypothesis that disfluencies improve memory for the story simply by giving listeners more time for processing. These data suggest that disfluencies are actually playing some role in the interpretation process that is helping the listener better remember what they have heard.

So how are disfluencies helping listeners? One possibility is that they provide information about the structure of the story. Disfluencies like "uh" and "um" are more likely to occur before new information in a story (Arnold et al., 2004). They also tend to occur at major plot and discourse points in narratives (Fraundorf & Watson, 2008). Thus, for the listener, disfluencies might serve as guideposts to the underlying structure of the narrative. Another possibility is that disfluencies have a more general effect. Previous work suggests that disfluencies may serve to focus listeners’ attention (e.g., Fox-Tree, 2001). To test these possibilities we compared stories in which disfluencies occurred before major plot points in the story, a typical location for disfluencies, to stories in which disfluencies occurred within the major plot points, an atypical location (Fraundorf & Watson, 2011). If disfluencies are serving as guideposts to the story, memory should only improve in the condition in which disfluencies have been appropriately placed. If disfluencies generally increase attention, then stories with disfluencies in both typical and atypical locations should be better remembered than fluent stories. We found that the location of the disfluency did not matter: disfluencies improved recall compared to fluent stories, no matter where they occurred. This suggests that disfluencies might be playing a more general role in communication by increasing listener attention.

We are currently investigating why disfluencies might improve attention. One possibility is that if listeners believe the speaker is producing disfluencies because they are having difficulty organizing their thoughts or planning their speech, listeners will assume that the process of understanding will be more difficult. This might encourage the listener to allocate more resources to language processing. Although future work is needed to understand the exact mechanisms at work, at this point, the data are clear: disfluencies improve later remembering. Thus, public speakers, policy makers and teachers might be better off producing more natural speech that includes the occasional disfluency than giving perfectly scripted, error-free presentations.

Emphasis and memory

Emphasis, or accenting, is the foregrounding of information in a sentence. Accenting is accomplished in several different ways: producing a word more loudly than the words surrounding it, increasing word length, and raising or lowering pitch across the word. Accents can be used to contrast what the speaker is saying with another possibility. For example, you might imagine a person in a coffee shop receiving an espresso and saying: "I wanted an espresso with MILK!" By putting an accent on "milk" they are saying not only that they wanted milk in their espresso, but also that they did not want something else, like cream, which they presumably received. In contrast, a person who says: "I wanted an ESPRESSO with milk," presumably received a different drink, like a coffee, but that drink correctly contained milk. Thus, accenting can convey important information about what happened, as well as what did not happen, even when the words in the sentence are exactly the same.

There is a great deal of work demonstrating that listeners are sensitive to the subtle differences in meaning conveyed by accenting (see Wagner & Watson, 2010, for a review). However, how accenting affects our longer-term understanding of speech is poorly understood. In one of our studies, participants listened to stories in which two pieces of information were contrasted (Fraundorf, Watson, & Benjamin, 2010). For example, in one story, a Scottish knight and an English knight competed in a tournament. The Scottish knight ended up winning the jousting competition. We manipulated whether listeners heard a contrastive accent on the word "Scottish" in the sentence about the jousting contest. The next day, participants were asked to make true/false judgments about the stories. When “Scottish” had been accented, participants were more likely to answer “true” to the statement “The Scottish knight won the jousting contest.” Critically, participants were also more likely to answer “false” to the statement “The English knight won the jousting contest,” but no more likely to say “false” to a distractor sentence with a nonmentioned knight (e.g., “The Welsh knight won the jousting contest.”). Thus, the contrastive accent appears to induce listeners to encode specific information about what happened and what did not happen. Significantly, this can impact long-term understanding. We have also replicated these findings in written texts with words that are emphasized either through capitalization or italics. Just as in speech, we find that highlighting information in text improves memory for what was written and what was not written. This potentially has consequences for how information should be structured in learning contexts such as in textbooks and research articles.

Because these accents provide information about discourse and story structure, we have been interested in how use of this information differs across the population and how it might change over the lifespan (Fraundorf, Watson, & Benjamin, 2012). We have found that elderly adults use accents in the same way as young adults: accents improve their memory for what happened and what did not happen. However, unlike younger adults, elderly adults were less likely to remember other information in the sentence when an accent was present. We have found the same overall pattern with younger adults who have lower working memory scores. These data suggest that although using accent information to structure the discourse can facilitate processing, it is also resource intensive and may come at the expense of understanding the entire story.

Gesture and memory

So far, we have discussed how prosodic aspects of language can have an influence on listeners’ later memory. However, nonverbal aspects of language can also benefit speakers. It has long been known that gestures that accompany speech can facilitate language production processes, but there is debate as to how this is accomplished. Some researchers have argued that gesturing lightens the speaker’s working memory load (e.g., Wagner, Nusbaum, & Goldin-Meadow, 2004). Others have argued that gestures facilitate speaking by helping speakers access words during sentence planning (e.g., Krauss, 1998).

In work in progress, we tested these theories by examining whether individual differences in working memory capacity and individual differences in verbal fluency predicted a person's likelihood of gesturing. If gestures support poor working memory, we would expect speakers with lower working memory capacities to be more likely to gesture. If gestures aid in word retrieval, we would expect speakers with lower verbal fluency to gesture more. In our task, participants completed a battery of individual difference tasks to assess their verbal fluency and their working memory capacity. To elicit gestures, participants watched several short cartoons and then had to describe the events in the cartoons to the experimenters. We videotaped their retellings and measured how often each speaker gestured while retelling their story. We found a correlation between working memory and gesture: speakers with lower working memory capacity were more likely to gesture during the retelling of the story. There was no link between verbal fluency and gesture. Thus, these findings suggest that gesture facilitates language production by supporting the working memory processes that are engaged in speaking. 


In communication, we often focus on the effects of specific words in a given conversation, speech or text. We try to evaluate how word choice and phrasing might influence our audience. However, the work described above suggests that other aspects of communication can be important not only in helping speakers more effectively convey their main points but also for listeners' long-term understanding. Aspects of speech like accenting, disfluencies and gesture can all convey subtle information that can have a lasting impact on the process of communication.


This work was supported by NIH grant R01 DC008774 and a grant from the James S. McDonnell Foundation. The work described here was conducted in collaboration with Scott Fraundorf, Maureen Gillespie, Aaron Benjamin and Kara Federmeier.


Arnold, J.E., Hudson Kam, C.L., & Tanenhaus, M.K. (2007). If you say thee uh- you’re describing something hard: The on-line attribution of disfluency during reference comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 914–930.

Arnold, J.E., Tanenhaus, M.K., Altmann, R.J., & Fagnano, M. (2004). The old and thee, uh, new: Disfluency and reference resolution. Psychological Science, 15, 578–582.

Fox-Tree, J.E. (2001). Listeners’ uses of um and uh in speech comprehension. Memory & Cognition, 29, 320–326.

Fraundorf, S.H., & Watson, D.G. (2008). Dimensions of variation in disfluency production in discourse. In J. Ginzburg, P. Healey, & Y. Sato (Eds.), Proceedings of LONDIAL 2008, the 12th workshop on the semantics and pragmatics of dialogue (pp. 131–138). London: King’s College London.

Fraundorf, S.H., & Watson, D.G. (2011). The disfluent discourse: Effects of filled pauses on recall. Journal of Memory and Language, 65, 161–175.

Fraundorf, S.H., Watson, D.G., & Benjamin, A.S. (2010). Recognition memory reveals just how CONTRASTIVE contrastive accenting really is. Journal of Memory and Language, 63, 367–386.

Fraundorf, S.H., Watson, D.G., & Benjamin, A.S. (2012). The effects of age on the strategic use of pitch accents in memory for discourse: A process-resource account. Psychology and Aging, 27, 88–98.

Krauss, R.M. (1998). Why do we gesture when we speak? Current Directions in Psychological Science, 7, 54–60.

Wagner, M., & Watson, D.G. (2010). Experimental and theoretical advances in prosody: A review. Language and Cognitive Processes, 25, 905–945.

Wagner, S.M., Nusbaum, H., & Goldin-Meadow, S. (2004). Probing the mental representation of gesture: Is handwaving spatial? Journal of Memory and Language, 50, 395–407.

The views expressed in Science Briefs are those of the authors and do not reflect the opinions or policies of APA.