Building biases: Word learning as a developmental cascade
By Larissa Samuelson
Larissa Samuelson is an associate professor of psychology and a director of the CHILDS Facility (CHild Imaging Laboratory in Developmental Science) at the University of Iowa. Dr. Samuelson received a BS with honors from Indiana University in 1993 and a joint PhD in Psychology and Cognitive Science from Indiana University in 2000. She is the recipient of the J.R. Kantor Graduate Award, and in 2010, she received the American Psychological Association (APA) Distinguished Scientific Award for Early Career Contribution to Psychology in the area of developmental psychology. Her research examines early word and category learning and incorporates neural network and dynamical systems models. She has had continuous funding from the National Institutes of Health since 2004.
The fundamental question for the field of cognitive development is how does knowledge change? For example, how does an infant progress from having no recognizable words in her productive vocabulary at 10 months of age to saying her first word around 12 months, to having a vocabulary of well over 200 words just a year later? Equally central to the field is the issue of how children use their knowledge in real time in specific tasks. That is, why is a 12-month-old equally likely to pick a cup as a bottle when given both and asked to “find the bottle” even though she can clearly say “bot-el” when mom presents just that object?
I approach these questions from a dynamic systems perspective and the belief that three things are critical to any explanation of development. First, because our means of assessing internal mental states always relies on external behavior, a rich understanding of behavior is essential if we are to understand underlying cognitive processes. Second, appreciation for empirical details is critical because behavior is created from the interaction between the child and the task used to probe development. From this perspective, formal models are useful because they require specifying the nature of the task, the stimuli, the knowledge brought by the child, the underlying assumptions, and how all these components interact. Third, an understanding of how individual behaviors accumulate to create longer-term patterns is necessary because developmental change is created over multiple timescales.
These basic tenets structure my approach to a specific area of interest, early word learning. Between 18 and 30 months of age, the typical child’s productive vocabulary increases tenfold and comes to be dominated by object names (Fenson et al., 1994). Young children are so good at learning nouns that a two-year-old who hears a new name one time is able to systematically generalize that novel name to new instances of the category (Samuelson, Horst, Schutte, & Dobbertin, 2008). How do they do it? Theoretical approaches to this question suggest that the task of learning nouns is made easier by biases or constraints that reduce the problem of finding the correct word-referent mapping to a solvable size. Despite considerable empirical support for these word learning biases, however, little is known about their origins or the mechanisms by which they are implemented. The main line of research in my laboratory is geared toward addressing this gap in our understanding by testing and augmenting a process-based account of the development of one particular word learning bias: the “shape bias.”
The shape bias refers to the robust finding that young children shown a novel solid exemplar object and told a novel name (e.g., “See this? This is a zup!”) are biased to generalize the new name to test objects of the same shape, rather than test objects that match in color or material (Landau, Smith & Jones, 1988). My previous work suggests that the shape bias is learned in the course of early noun learning (Samuelson & Smith, 1999; Samuelson & Smith, 2000). My colleagues and I have proposed a four-step process to explain this acquisition (Samuelson, 2002; Smith, Jones, Landau, Gershkoff-Stowe, & Samuelson, 2002; Smith & Samuelson, 2006). According to this proposal, children begin by learning specific names for specific members of noun categories. So, in step one, a child learns that “ball” applies to the round red thing that he bounces, while “cup” applies to the thing he drinks milk from. In the second step, the child learns the similarity structure of individual categories and thus learns to make first order generalizations (e.g., that all “balls” are round). At this point, the child can apply the name of the object to the entire category of similar things. The second order generalization is made in step three, when the child learns similarities in organization across categories; for example, that solid things are in categories well organized by similarity in shape. This is the point at which we see a “shape bias,” as the child can apply this second order generalization to a novel solid object. Finally, in step four, the child is able to use this new word learning bias to acquire new object names quickly.
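The step from first order to second order generalization can be made concrete with a small counting sketch. It is only an illustration, not one of the published connectionist models: the toy vocabulary, the labels "solid"/"nonsolid" and "shape"/"material", and the function names are all assumptions introduced here. The idea is that a learner who tallies, across already-known nouns, which dimension organizes categories of solid things can then guess how a brand-new name for a novel solid object should generalize.

```python
from collections import Counter, defaultdict

# Hypothetical toy vocabulary (not real child data):
# (noun, solidity of its referents, dimension that best organizes the category)
TOY_VOCABULARY = [
    ("ball", "solid", "shape"),
    ("cup", "solid", "shape"),
    ("spoon", "solid", "shape"),
    ("chair", "solid", "shape"),
    ("chalk", "solid", "material"),
    ("milk", "nonsolid", "material"),
    ("sand", "nonsolid", "material"),
    ("juice", "nonsolid", "material"),
]


def second_order_regularities(vocabulary):
    """For each solidity class, return the proportion of known nouns
    whose category is organized by each dimension."""
    counts = defaultdict(Counter)
    for _noun, solidity, dimension in vocabulary:
        counts[solidity][dimension] += 1
    return {
        solidity: {dim: n / sum(tally.values()) for dim, n in tally.items()}
        for solidity, tally in counts.items()
    }


def predict_generalization(vocabulary, solidity):
    """Guess the dimension along which a novel name should generalize,
    given only the solidity of the novel exemplar."""
    proportions = second_order_regularities(vocabulary)[solidity]
    return max(proportions, key=proportions.get)


if __name__ == "__main__":
    print(second_order_regularities(TOY_VOCABULARY))
    print(predict_generalization(TOY_VOCABULARY, "solid"))
```

In this sketch the "bias" is nothing more than a summary statistic over the vocabulary, so changing which nouns are in the list changes the predicted generalization.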
The four-step process receives support from studies showing that the statistical regularities in the early noun vocabulary are sufficient to support the development of the shape bias in connectionist networks and very young children, and that early training with nouns that name categories of solid things organized by similarity in shape can lead to an acceleration in vocabulary development (Perry, Samuelson, Malloy, & Schiffer, 2010; Samuelson, 2002; Smith et al., 2002; Smith & Samuelson, 2006). Further, the development of a shape bias influences how children apply even known nouns to novel objects (Samuelson & Smith, 2005).
Thus, there is good empirical support for the idea that the shape bias is the developmental product of the early noun vocabulary and of the four-step process in particular. An important next question, then, is what are the developmental and learning processes that move children from step to step in the proposed process? Put differently, how do children begin to make first and second order generalizations? This is fundamentally a question of how individual naming experiences accumulate to create development. Answering this question requires understanding how children use what they know about nominal categories in real time in a word learning task, and how those experiences influence performance at the next time step.
How Children Use Their Represented Knowledge in Real Time
In one series of studies, my students and I varied the kind of stimuli, the specifics of the task, and the age of children studied to probe the interaction between vocabulary knowledge and noun generalization behavior. Samuelson and Horst (2007) demonstrated that 24-month-olds’ tendency to generalize by similarity in shape or material was influenced by the solidity of the stimuli (hard clay vs. hair gel), the specifics of the warm-up trials they received (all solid or mixed), and whether material-matching test objects also matched the named exemplar in color. Moreover, Samuelson et al. (2008) demonstrated that 36-month-old children generalize novel names for deformable things (made of sponge) by shape similarity, but 24- and 48-month-old children do not. Because 36-month-olds know more names for solid things in categories organized by shape than 24-month-olds, but have a less diverse vocabulary than 48-month-olds, this suggests that children’s generalizations emerge from the interaction between the particular set of statistical regularities in their vocabulary and the specifics of the stimuli. Finally, Samuelson, Schutte & Horst (2009) recently demonstrated that children matched for age and vocabulary knowledge perform differently when the same stimuli are presented in different task formats (yes/no versus forced choice). Together these studies demonstrate that the specifics of children’s novel noun generalization behaviors emerge from the interaction of the stimuli, the task, and the child’s vocabulary knowledge.
From Real Time to the Next Time Step
We have also examined what children take away from noun learning experiences and how this influences future learning by looking at children’s retention of names presented in a “fast mapping” context. The term fast mapping was coined by Susan Carey to describe children’s ability to quickly link a novel word to a novel referent after just one exposure (Carey & Bartlett, 1978) and is often referred to as an example of how good children are at word learning. Horst and Samuelson (2008) demonstrated, however, that children do not actually retain the word-object mappings formed in a fast mapping task over a five-minute delay.
To more closely examine the processes that determine whether a child learns a word in a fast mapping context, Bob McMurray, Jessica Horst, and I developed a computational model of the fast mapping task that captures the data from Horst and Samuelson (2008) as well as a number of other word learning studies (McMurray, Horst, & Samuelson, 2010; McMurray, Horst, Toscano, & Samuelson, 2009). The model works by finding the most likely referent of a novel word based on the current constraints—its vocabulary of known words and the available referent objects. Thus the model is able to map a novel word to a novel referent quickly in the moment of naming, just as children do. However, at that point the model has not learned the word. Rather, it takes many iterations of mapping before the model is able to produce the word reliably. This highlights the fact that word learning unfolds over two timescales: the in-the-moment selection of a referent and the slower building of a learned name-referent link.
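The two timescales can be illustrated with a deliberately simple associative sketch. This is not the published McMurray, Horst, and Samuelson model; the weight values, the learning rate, the retention threshold, and all function names below are hypothetical choices made for illustration. Referent selection is computed from the current word-object weights (objects already "claimed" by a known word are poor candidates for a novel word), while each naming event changes the weights only slightly.

```python
# All constants are hypothetical illustration values, not fitted parameters.
KNOWN_STRENGTH = 1.0        # weight of a well-learned word-object link
BASELINE = 0.1              # weight of an unlearned link
LEARNING_RATE = 0.05        # small step per naming event (slow learning)
RETENTION_THRESHOLD = 0.5   # weight needed to count as "retained"


def make_lexicon(known_pairs, novel_words, objects):
    """Build a word -> object -> weight table."""
    words = [word for word, _ in known_pairs] + list(novel_words)
    weights = {word: {obj: BASELINE for obj in objects} for word in words}
    for word, obj in known_pairs:
        weights[word][obj] = KNOWN_STRENGTH
    return weights


def select_referent(weights, word, available):
    """In-the-moment selection: prefer the object this word supports most
    relative to the strongest competing word for that object."""
    def score(obj):
        rival = max(weights[other][obj] for other in weights if other != word)
        return weights[word][obj] - rival
    return max(available, key=score)


def learn(weights, word, obj):
    """Slow learning: nudge the selected link toward its maximum of 1.0."""
    weights[word][obj] += LEARNING_RATE * (1.0 - weights[word][obj])


def is_retained(weights, word, obj):
    return weights[word][obj] >= RETENTION_THRESHOLD


def exposures_until_retained(weights, word, available):
    """Repeat select-then-learn until the link passes the threshold."""
    exposures = 0
    while True:
        chosen = select_referent(weights, word, available)
        learn(weights, word, chosen)
        exposures += 1
        if is_retained(weights, word, chosen):
            return exposures


if __name__ == "__main__":
    objects = ["ball", "cup", "novel_toy"]
    lexicon = make_lexicon([("ball", "ball"), ("cup", "cup")], ["zup"], objects)
    print(select_referent(lexicon, "zup", objects))
    print(exposures_until_retained(lexicon, "zup", objects))
```

With these assumed values the sketch picks the novel object on the very first trial, yet the link falls short of the retention threshold after that single exposure, which mirrors the dissociation between fast referent selection and slow learning described above.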
A follow-up study, suggested by the mechanisms in the model, demonstrates that general processes of visual familiarity also play a critical role in children’s fast-mapping abilities (McMurray et al., 2010; McMurray et al., 2009). In particular, if children are familiarized with a set of novel objects that are subsequently used in a fast mapping task that includes other novel, never-before-seen objects, children will map the novel word to the “supernovel” never-before-seen objects even though the children lack names for any of the stimuli. Likewise, a second line of follow-up work comparing prior familiarization with the words versus the objects to be mapped has demonstrated modality-specific differences in how children bring prior knowledge to bear in a task (Kucker & Samuelson, 2010). Children are able to fast-map following both kinds of familiarization, but only retain the mappings following familiarization with the objects. Thus, both the empirical data and the model argue for the importance of general perceptual processes in early word learning.
In the context of the shape bias, my recent work on 24-month-olds’ generalizations of names for nonsolid substances provides further evidence that word learning biases change over the slower timescale of several trials. We have found that young children become less likely to generalize a novel name for a novel solid rigid object by shape if, over the course of the experimental session, they have previously seen many nonsolid substances and solid things broken into pieces (Samuelson & Horst, 2007).
This work confirms that in-the-moment noun generalization and referent selection depend on how specific tasks and stimuli engage children’s represented knowledge, and that the process of word learning is not fast, but rather one of slow building from multiple repeated experiences. In an effort to understand how individual experiences accumulate to create developmental changes, I have used a Dynamic Neural Field model to instantiate the real time decision processes in yes/no and forced-choice noun generalization tasks and to capture behavioral differences in these tasks (Samuelson et al., 2008). Dynamic Neural Fields are in a class of bi-stable neural network models that capture neural activation patterns at the level of neural populations (see Spencer, Perone & Johnson, 2009). Concretely, such models specify how activation patterns change over time within and between different cortical populations. Because activation in these models evolves dynamically moment-by-moment via the interactions of individual neurons, they enable us to capture the moment-by-moment processes by which children make decisions about noun generalization and thus to understand how neural activation patterns are linked to real time behaviors in context. This modeling work suggests that whether or not children can directly compare stimuli influences how knowledge is accessed and integrated in forced choice and yes/no tasks, and hence how children perform. As a first step towards the longer timescales of development, we have quantitatively fit our model to developmental changes in children’s noun generalization behavior from 24 to 36 months of age via changes in two parameters: one that determines the stability/noisiness of children’s representations of the stimuli, and another that changes the amount of attention children devote to shape similarity (Samuelson et al., 2008).
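For readers unfamiliar with this class of models, the following is a minimal sketch of a generic one-dimensional field of the kind introduced by Amari, with local excitation and global inhibition. It is not the noun generalization model from the papers cited above, and every parameter value (time constant, resting level, kernel strengths, stimulus width) is an assumption chosen only to make the bistability visible: a strong transient input leaves behind a self-sustained peak of activation, whereas a weak input lets the field fall back to its resting level.

```python
import numpy as np


def simulate_field(stim_amplitude, n_sites=101, steps_on=300, steps_off=300):
    """Euler-integrate  tau * du/dt = -u + h + S(x, t) + sum_x' w(x - x') f(u(x'))
    and return the activation profile after the stimulus has been removed.
    All parameter values are hypothetical illustration choices."""
    tau, h, beta = 10.0, -5.0, 4.0           # time constant, resting level, sigmoid slope
    c_exc, sigma_exc, c_inh = 3.0, 3.0, 0.5  # local excitation and global inhibition

    x = np.arange(n_sites)
    distance = x[:, None] - x[None, :]
    # Interaction kernel: Gaussian local excitation minus constant inhibition.
    kernel = c_exc * np.exp(-distance**2 / (2 * sigma_exc**2)) - c_inh
    # Gaussian stimulus centered on the middle of the field.
    stimulus = stim_amplitude * np.exp(-(x - n_sites // 2) ** 2 / (2 * 3.0**2))

    u = np.full(n_sites, h)
    for step in range(steps_on + steps_off):
        s = stimulus if step < steps_on else 0.0
        f = 1.0 / (1.0 + np.exp(-beta * u))   # sigmoidal output of each site
        u = u + (-u + h + s + kernel @ f) / tau  # Euler step with dt = 1
    return u


if __name__ == "__main__":
    strong = simulate_field(8.0)  # input drives the field above threshold
    weak = simulate_field(3.0)    # input stays below threshold
    print("peak after strong input:", strong.max())
    print("peak after weak input:", weak.max())
```

The design point this sketch is meant to convey is that the same field can sit in either of two stable states, so whether a decision "sticks" depends on the moment-to-moment balance of input and interaction rather than on a stored symbol.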
We have also used longitudinal noun training studies to examine how current noun learning influences subsequent word learning behaviors. In these studies, children are taught sets of words as they play with objects in a naturalistic setting. In some studies we have manipulated the structure of the word sets relative to the child’s existing noun vocabulary (Samuelson & Schiffer, in prep). In other studies we manipulated the structure of the exemplars used to represent the nominal categories (Perry et al., 2010). In all cases, we have seen that the trajectory of word learning following the experimental manipulation is a function of how prior knowledge, the specifics of the training vocabulary, and the specifics of the instances used to teach each category come together in individual children. For example, children with smaller vocabularies taught a set of nouns that all referred to categories well organized by similarity in shape showed a larger vocabulary acceleration following training compared to those who received the same training but started the study with a larger vocabulary. Likewise, children who saw more variable exemplars of a category (for example, a 15 cm white paper bucket, a 12 cm orange cloth pumpkin bucket, and a 20 cm clear plastic bucket to teach the word “bucket”) learned twice as many words following training as children who learned the same words but saw more similar exemplars. These longitudinal studies show how children’s learning of individual words teaches them something more about how words, in general, link to categories. In this way then, noun learning teaches children how to learn nouns.
A Cascading Process
Together then, this program of research points to a cascading set of processes that create word learning. In particular, the course of word learning builds from effects at the level of individual noun decisions that accumulate on a moment-to-moment timescale and structure subsequent word learning behaviors. We have recently developed a new model of word learning that captures this developmental cascade by integrating across the timescales of individual word learning decisions in multiple tasks to both build a lexicon and learn how to learn words over development. The new model builds on our prior Dynamic Neural Field (DNF) model of individual noun generalization decisions over development (Samuelson et al., 2009). DNF models are ideal for integrating across the multiple timescales of development because they focus on the second-to-second timescale of the processes that generate individual behaviors such as generalizing a novel noun to a new instance, and they also have a mechanism for learning across repeated responses (Spencer, Dineva & Schöner, 2009). Moreover, DNF models have recently been shown to capture fast, flexible real time learning behaviors. Finally, they provide a neural grounding that links brain and behavior (Spencer, Perone & Johnson, 2009; Simmering, Schutte & Spencer, 2007) in a way that can generate precise behavioral predictions and allow for the integration of multiple systems from object encoding to selection of objects in the task space to word learning in context and category formation (Spencer et al., 2009).
Thus, the goal of my research program is a developmental explanation of the emergence of one word learning bias. Because this explanation is grounded in a rich understanding of noun generalization behavior and an appreciation for empirical details, and because it aims to explain how individual behaviors accumulate to create longer-term patterns, it illuminates more general principles of how children’s represented knowledge is accessed and used in real time and in real tasks. It also illuminates the general developmental problem of how knowledge changes over multiple timescales. I contend that developmental explanations of this kind are necessary for a complete understanding of any behavior.
Carey, S., & Bartlett, E. (1978). Acquiring a single new word. Proceedings of the Stanford Child Language Conference, 15, 17–29.
Fenson, L., Dale, P. S., Reznick, J. S., Bates, E., Thal, D., & Pethick, S. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development.
Horst, J. S., & Samuelson, L. K. (2008). Fast mapping but poor retention by 24-month-old infants. Infancy, 13(2), 128-157.
Kucker, S. C., & Samuelson, L. K. (2010). Object and word familiarization differentially boost retention in fast-mapping. Proceedings of the 32nd Annual Meeting of the Cognitive Science Society, Portland, Oregon. 2621.
Landau, B., Smith, L. B., & Jones, S. S. (1988). The importance of shape in early lexical learning. Cognitive Development, 3, 299-321.
McMurray, B., Horst, J. S., & Samuelson, L. K. (2010). Using your lexicon at two timescales: Investigating the interplay of word learning and recognition. Manuscript under review.
McMurray, B., Horst, J. S., Toscano, J., & Samuelson, L. K. (2009). Connectionist learning and dynamic processing: Symbiotic developmental mechanisms. In J. P. Spencer, M. S. Thomas & J. L. McClelland (Eds.), Toward a unified theory of development: Connectionism and dynamic systems theory re-considered (pp. 218-249). New York: Oxford University Press.
Perry, L. K., Samuelson, L. K., Malloy, L. M., & Schiffer, R. N. (2010). Learn locally, think globally: Exemplar variability supports higher-order generalization and word learning. Psychological Science, 21(12), 1894-1902.
Samuelson, L. K. (2002). Statistical regularities in vocabulary guide language acquisition in connectionist models and 15-20-month-olds. Developmental Psychology, 38(6), 1016-1037.
Samuelson, L. K., & Horst, J. S. (2007). Dynamic noun generalization: Moment-to-moment interactions shape children's naming biases. Infancy, 11(1).
Samuelson, L. K., Horst, J. S., Schutte, A. R., & Dobbertin, B. N. (2008). Rigid thinking about deformables: Do children sometimes overgeneralize the shape bias? Journal of Child Language, 35(3), 559-589.
Samuelson, L. K., & Schiffer, R. N. (in prep). Statistics and the shape bias: It matters what statistics you get and when you get them.
Samuelson, L. K., Schutte, A. R., & Horst, J. S. (2009). The dynamic nature of knowledge: Insights from a dynamic field model of children’s novel noun generalizations. Cognition, 110, 322-345.
Samuelson, L. K., & Smith, L. B. (1999). Early noun vocabularies: Do ontology, category organization and syntax correspond? Cognition, 73(1), 1-33.
Samuelson, L. K., & Smith, L. B. (2000). Children's attention to rigid and deformable shape in naming and non-naming tasks. Child Development, 71(6), 1555-1570.
Samuelson, L. K., & Smith, L. B. (2005). They call it like they see it: Spontaneous naming and attention to shape. Developmental Science, 8(2), 182-198.
Simmering, V., Schutte, A. R., & Spencer, J. P. (2007). What does theoretical neuroscience have to offer the study of behavioral development? Insights from a dynamic field theory of spatial cognition. In J. M. Plumert & J. P. Spencer (Eds.), The emerging spatial mind (pp. 320-361). New York: Oxford University Press.
Smith, L. B., Jones, S. S., Landau, B., Gershkoff-Stowe, L., & Samuelson, L. K. (2002). Object name learning provides on-the-job training for attention. Psychological Science, 13(1), 13-19.
Smith, L. B., & Samuelson, L. (2006). An attentional learning account of the shape bias: Reply to Cimpian and Markman (2005) and Booth, Waxman, and Huang (2005). Developmental Psychology, 42(6), 1339-1343.
Spencer, J. P., Dineva, E. & Schöner, G. (2009). Moving toward a grand theory while valuing the importance of the initial conditions. In J. P. Spencer, M. S. Thomas & J. L. McClelland (Eds.), Toward a unified theory of development: Connectionism and dynamic systems theory re-considered (pp. 354-372). New York: Oxford University Press.
Spencer, J. P., Perone, S., & Johnson, J. S. (2009). The dynamic field theory and embodied cognitive dynamics. In J. P. Spencer, M. S. Thomas & J. L. McClelland (Eds.), Toward a unified theory of development: Connectionism and dynamic systems theory re-considered (pp. 86-118). New York: Oxford University Press.
The views expressed in Science Briefs are those of the authors and do not reflect the opinions or policies of APA.