The Emergence of a New Paradigm in Ape Language Research

Stuart Shanker and Barbara King


§1. The Spreading Appeal of the Dance Metaphor

      In recent years the same metaphor has cropped up time and again in very different areas of communication studies. In Ape Language Research (ALR), Sue Savage-Rumbaugh observes how the origins of language comprehension lie in “interindividual routines” which are like “a delicate dance with many different scores, the selection of which is being constantly negotiated while the dance is in progress, rather than in advance” (Savage-Rumbaugh et al. 1993: 27). In nonverbal communication research, Michael Argyle describes how: “a speaker starts gesticulating and looks away as he starts to speak, and reverses this when he stops. There is an intricate co-ordination of pausing and looking within turns, followed by head-nods, smiles, and gazes. Interactional synchrony has been called a ‘gestural dance’, and likened to a waltz” (Argyle 1988: 118). In infant development research, Daniel Stern recounts how, at the age of four months, an infant “passes into the Immediate Social World. In this world of the ‘here and now, between us’, he reports on the rich choreography between himself and his mother, on the subtle moves by which they regulate their flow of feelings. Thus, Joey introduces us to the basic dance we all play out with other people throughout our lives” (Stern 1990: 7). What is it about this dance metaphor that so appeals to scientists who are interested in the dynamics of communication and language development?

      The answer to this question lies in the fact that the terms used to describe a dance are radically different from those used in the information-transmission metaphor that has hitherto dominated the study of ape communication (King & Shanker submitted). The information-transmission metaphor prompts one to conceptualize communicative exchanges in terms of such constructs as signal and response, sending and receiving, or encoding and decoding. But the dance metaphor leads one to conceptualize communicative encounters in terms such as engagement and disengagement, synchrony and discord, or breakdown and repair. Whereas the transmission metaphor places the emphasis on the goal of communication, which is to transmit pre-determined ‘messages’, the dance metaphor focuses on the co-regulated activity of communicating and the emergence of communicative intentions within that context.

      The chief appeal of the dance metaphor is that it draws attention to how communicating partners continuously establish and sustain a feeling of shared rhythm and movement. Such a process of mutual attunement is established through a number of different modalities. Communicating partners not only mirror each other’s specific behaviors but may also attune to one another cross-modally. For example, an infant suddenly jerks her arms and her mother “responds with a sharp ‘Oh!’ that has the same temporal and intensity contour as the infant’s arm movement” (Fogel: in press 7), or one partner’s tone of voice prompts the other to move closer or farther away.

      This process of mutual attunement “reflects the role of emotion in communication. It can be used to share feelings with another, to empathize, to mock, to respond contingently, to change the other’s arousal level or emotion, to change the other’s goal, to teach, or to play” (Ibid.). Virtually from birth, an infant and her caregiver(s) are involved in an “interactive system of reciprocal stimulation” (Schore 1994: 71). The primary modality for this process is shared gaze. In “synchronized, mutual gaze, a state of ‘mutually entrained central nervous system propensities’ (Horner 1985) involved in maternal regulatory systems of arousal” occurs (Schore 1994: 80). These gaze exchanges induce an “affect-amplifying,” “symbiotic” state that is shared between caregiver and child (Ibid: 78ff). Thus, the members of a dyad are engaged from the start in an ‘affect-regulating dance’ that forms the template for the intentional communicational behaviours that start to emerge in the infant between the ages of 5 and 9 months.[1]

      The dance metaphor brings to mind several different images: it summons up a picture of a novice being guided by a more experienced partner; a picture of awkwardness and friction when two partners are not in harmony with one another; and a picture of fluid movement when two partners are communicating fluently. The dance metaphor also suggests how difficult it can be to identify one partner as the ‘initiator’ of an exchange: is it the one who asks the other to dance? But then, how much nonverbal communication may have preceded the actual issuing of this ‘invitation’? Similarly, Schore describes how the young infant smiles in order to evoke her mother's gaze, and conversely, averts her gaze when she finds too much arousal unpleasant (Schore 1994: 82ff). But such behaviors only occur within the context of being gently held and cooed to, or recently fed, not to mention all the previous gaze exchanges.

      Like two dancers who are aware of themselves and each other as a single entity, the members of a dyad are said to be in interactional synchrony when they are in a similar affective state and fully attuned to one another’s communicative behaviors. In normal dyadic interactions this is thought to occur as much as 30% of the time; the other 70% of the time the caregiver and infant are in various degrees of being ‘out of synch’ with one another (Tronick 1989). Attentive caregivers are sensitive to these periodic breakdowns and good at restoring interactional synchrony. When a caregiver is poor at repairing these breakdowns there is a marked decline in interactional synchrony.

      Communication breakdowns are even more common with infants who have problem temperaments. There has been a great deal of research confirming the findings of Thomas and Chess’s ‘goodness-of-fit’ model (Chess & Thomas 1984): viz., when caregivers respond to the difficult child harshly or inconsistently, periods of ‘dyadic dissonance’ increase and the child is more likely to behave aggressively and egocentrically with its peers later in life. But when caregivers are effective at adjusting their parenting skills to match the child’s temperament, we see higher levels of interactional synchrony. What is more, there is evidence to suggest that the more secure the attachment, the more positive the child’s social interactions with her peers and the more developed her prosocial attitudes (Ainsworth et al. 1978). In other words, it appears that the greater the amount of interactional synchrony between an infant and her caregiver, the better able is the child to adjust her style and responsiveness to the rhythms of different dance partners as she grows older.

      The dance metaphor is used, therefore, to convey the idea that communication is a free-flowing and co-regulated activity. The shift from the transmission metaphor to a dance metaphor thus represents a fundamental shift in communications theory from an information-processing to a dynamic systems paradigm. In a dynamic system, all of the elements are continuously interacting with and changing in respect to one another, and an aggregate pattern emerges from this process of mutual co-action. Hence communication is seen, not as a linear, binary sequence but rather, as a “continuous unfolding of individual action that is susceptible to being continuously modified by the continuously changing actions of the partner” (Fogel 1993: 29). Fogel terms this process co-regulation, a term we prefer to interactional synchrony because it highlights the nonlinear nature of this process of continuous mutual adjustment.

      The shift from an information-processing to a dynamic systems paradigm further represents an important transformation in our understanding of the nature of communication. On the information-processing paradigm, what is communicated is always information. The information that is communicated is said to be an internal state or an internal representation of an environmental feature, and genuine communication is said to occur when B decodes the message that A intended to encode. Hence A and B must possess the same code in order for genuine communication to occur. But on the dynamic systems paradigm, mutual understanding is something that emerges as both partners converge on some shared feeling, thought, action, intention, etc., and develop or deploy various behaviors that signify this convergence.

      On the dynamic systems paradigm, what is communicated is not simply information – although, to be sure, an important aspect of communication is what kinds of information a subject can communicate. Indeed, it might be argued that one of the major criteria for describing a species as ‘more’ or ‘less’ advanced is what kinds of information it can communicate. Certainly, one of the most important aspects of studying ape communication has been learning about the surprisingly complex kinds of information that apes communicate to one another and the complex social situations in which this occurs. But in addition to communicating various kinds of information, they also communicate their desires and intentions, fears, warnings and invitations, and, of course, attitudes and emotions. To reduce all these communicative acts to a single metric is to impose an abstract formalized model on an activity that is fundamentally variegated and dynamic. Moreover, to communicate, e.g., an attitude or an emotion, is not to ‘transmit information’ about one’s ‘internal state’; when attitudes and emotions are communicated they are shared and co-regulated. Even a stereotypical behavior, such as a juvenile ape behaving submissively before an alpha male, involves a complex dance that is carried out in several modalities and is highly context-sensitive.

      The dynamic systems paradigm draws attention to the following points:

·        partners are continuously active in communication

·        their actions are not coordinated by fixed or innate ‘codes’

·        their actions are fundamentally relational; partners mutually adjust their behaviors to each other in subtle ways

·        it can be difficult to identify the ‘initiator’ of a communication

·        communication cannot be reduced to a single modality nor to the summation of multiple modalities

·        the communicative significance of a particular gesture, vocalization, facial expression, etc., is a function of its role within the communicative process (e.g. a hand movement only counts as a gesture in the context of a communicative exchange and not on the basis of what is going on inside a speaker’s head); i.e. the significance of a particular gesture, vocalization, movement etc. cannot be decontextualized

·        the constraints on communication cannot be quantified; they are a function of biological, psychological and social factors

      Clearly, this new paradigm has considerable methodological significance for the study of communication. Above all else, it requires that the researcher adopt a hermeneutic stance. There are several reasons why one must become thoroughly acquainted with one’s subjects if one hopes to interpret the meaning of their communicative behaviours and assess the ‘quality’ of their communications (in terms of periods of synchrony and discord, the factors that cause these to fluctuate, and how successful they are at repairing breakdowns). For one thing, one must expect to encounter marked fluctuations in the communicative exchanges between the members of a dyad (which can be caused by any number of endogenous and exogenous factors). And one must always be prepared for the possibility that a subject’s communicative acts are highly idiosyncratic; in such cases the significance of the act can only be appreciated if one has obtained a fairly intimate knowledge of the subject. To develop a profile of an individual’s or a species’ communicative skills therefore demands multiple perspectives – e.g. observing the subject(s) interacting with different partners in multiple settings over extended periods of time.

      The shift to a dynamic systems paradigm has equally profound implications for our understanding of the character of communicative behaviors. On the information-processing paradigm we “consider the nature of animal signals as if they have been ‘designed’ for a specific purpose. This is a shorthand way of saying that we assume that the signals we observe are the product of natural selection, which has favored those properties of signals that make them most effective at conveying information” (Halliday 1983: 43). Thus, we seek to isolate specific behaviors (‘signals’) and establish (through repeated observations) the conditions that ‘trigger’ those behaviors and the ‘responses’ that they evoke. When we look for the reasons why an ape did such-and-such, therefore, we are looking for the reasons why that behavior was selected, and not for the reasons why that particular ape did such-and-such under specific conditions.

      On the dynamic systems paradigm, natural selection is thought to apply to the whole developmental manifold and not just the genes (see Gottlieb 1997). Highly predictable developmental outcomes are seen as the result of canalizing influences – i.e. highly predictable environmental circumstances (van der Weele 1999) – and not as canalized traits (i.e. traits that are strongly buffered from environmental perturbations by information that is encoded in an organism’s genes). And the ‘canalizing influences’ that most stand out in the case of ape, as in the case of human, communicative development are the close dyadic relationships in which an infant is nurtured over the first few years of its life, and the influence of social factors.

      The communication/dance that occurs between infants and caregivers is not itself something that is genetically determined: i.e. a complex behavioral sequence that can be broken down into a nested hierarchy of micro-signals and responses. To be sure, in ‘normal’ circumstances certain conditions are present (in both infant and caregiver) that enable the dyad to engage in communication dances (which then become a vehicle for their own, as well as other aspects of the infant’s development). But there are multiple factors that can impinge on dyadic interactions: both endogenous (e.g. biological deficits in the infant, psychological impairments in the caregiver, the effects of attentive or inattentive caregiving) and exogenous (e.g. abnormal environmental conditions, social conflicts, external threats). As is the case in child development research, we see ample evidence in the primatological literature of the effects of good versus poor caregiving on an infant’s development. In particular, Savage-Rumbaugh’s work with the chimpanzees Sherman and Austin, and with the bonobo Kanzi, has dramatically demonstrated just how significant the effects of the environment can be on an ape’s communicative development (infra). And primatological research is amassing ever more extensive evidence of social modifiability in several of the modalities listed above (see Seyfarth and Cheney 1997, Snowdon 1999).

      In general, therefore, the dynamic systems paradigm establishes the study of communication on the following precepts:

1.      In place of the continuum picture, which postulates a linear hierarchy of information-processing systems, dynamic systems theory envisages multiply branching growth (Darwin’s tree metaphor) of ever more variable and pliable communicative behaviours.

2.      The more ‘plastic’ an organism’s development, the more dependent are species-typical behaviours on species-typical experiences.

3.      The development of communicative behaviours cannot be treated as the summation of hereditary and environmental factors (i.e. H + E) but rather, must be seen as the result of a complex interplay between maturational and experiential factors.

4.      An individual’s communicative development is manifested in terms of such factors as rigidity versus adaptability and novelty, and imaginative or creative behaviours.

5.      The significance of communicative behaviours can only be interpreted in terms of the particular contexts within which they occur.

One of the most dramatic illustrations of the importance of this framework can be found in the debate over the significance of recent advances in ALR.


§2. Sherman and Austin

      Ape language research (ALR) has hitherto been overshadowed by the debate over whether the mental capacity to acquire syntax and semantics is uniquely human. Discussions have thus primarily centered on the question of whether enculturated apes’ communicative behaviors can be compared with early norms in child language acquisition studies. The most striking results obtained so far have been with Kanzi, the male bonobo who was born and raised at the Language Research Center at Georgia State University. Hence there has been a natural tendency to focus on the linguistic feats of this extraordinary ape, to the exclusion of other important areas of ALR. This tendency is unfortunate, however, for two different reasons. First is the obvious reason that we risk overlooking important findings that have been made with other great apes. But perhaps the more important reason is that this – understandable – preoccupation with Kanzi’s achievements may skew the significance of ALR, by casting the research as strictly a matter of establishing whether apes can be brought to cross the ‘Language Rubicon’.

      The danger here is that ALR might be seen as solely a matter of ascertaining whether apes can perform ‘high enough’ on language tests – as measured, e.g., by how many words they can learn and what sorts of syntactical constructions they can master – so as to refute any lingering doubts about whether their behavior can be legitimately described in linguistic terms. The problem with such a viewpoint is that it accepts from the outset the presupposition that there is a categorial distinction between language and natural communication, and thus, that the fundamental challenge faced by ALR is to see whether apes’ productive behaviors are genuinely ‘language-like’ or merely ‘natural’. For example, if one assumes that there is a categorial distinction between instrumental and symbolic gestures, and that all natural ape behaviors belong to the former category, one can then argue that the use of signs or lexigrams belongs equally to the former and not the latter category. Thus, this framework has engendered a polemical atmosphere in which researchers from opposite ends of the ‘continuity/discontinuity’ spectrum reach opposite conclusions about how the data should be properly interpreted.

      By no means should the research community neglect the interpretation of the rich corpus of data that has been accumulated with Kanzi or with the apes trained to use a version of American Sign Language (e.g., the chimpanzee Washoe [Gardner & Gardner 1969] and her ape companions [Cianelli & Fouts 1998] or the orangutan Chantek [Miles 1990]; for a recent review see Hixson 1998). For there are many important questions in the existing data that remain to be answered: e.g. what kinds of concepts have they mastered; what kind of linguistic constructions; the order in which they acquired various linguistic skills; whether they can master still more abstract concepts and grammatical constructions, etc. But we must not lose sight of the fact that the driving impetus behind ALR is to discover what kinds of communicative skills apes can acquire and, equally important, what sorts of skills seem to be irrevocably beyond their grasp; how environmental contingencies affect an ape’s cognitive and communicative development; and perhaps as a result of this research, the significance of various aspects of the social environment and caregiving practices on a child’s cognitive and language development. Thus, as exciting as the achievements of the ape ‘linguistic savants’ may be, some of the most important findings have been in those ‘grey’ areas where the distinction between ‘natural’ and ‘linguistic’ behavior is hardest to draw.

      For this reason, we shall begin this section on ALR with a review of the Animal Model Project, which was conducted using chimpanzees at the Language Research Center in the late 1970s and early 1980s. The stated purpose of this research was not to establish that an ape could reach such-and-such a level of (age-matched) linguistic performance in order to silence the discontinuity critic. Rather, as Savage-Rumbaugh explained in Ape Language:

As Sherman and Austin moved from the simplest discrimination tasks to complex spontaneous communications, it became increasingly apparent that they were continually learning to do far more than they were being taught. The issue of whether or not they had achieved “true” human language was never the goal. The goal was to improve their communicative competence and in doing so to more clearly define the skills involved, both at the behavioral and at the cognitive levels (Savage-Rumbaugh 1986: 404).

The reason why the research with Sherman and Austin provides such an important starting-point for this section is that, in their case, we have a fully documented account of the steps that were taken to enable an ape to undergo what Deacon describes as “the shift from conditioned associations to symbolic associations” (Deacon 1997: 84). Moreover, we have the added benefit that Deacon presents a sophisticated attempt to explain this ‘shift’ on the linear model of communication. As we shall see, his attempt stands in sharp contrast to Savage-Rumbaugh’s own explanation of this phenomenon, which is highly resonant of the dynamic systems paradigm. Thus, the contrast between their two accounts brings into sharp relief the contrast between the two models of communication that we seek to elucidate in this paper.

       Significantly, Sherman and Austin were both highly communicative with Savage-Rumbaugh from the start of the research. She reports that they were especially sensitive to affects conveyed by her tone of voice, to her facial expressions, and that they frequently gestured to communicate their desires to her (Savage-Rumbaugh 1986: 38, 56). What they could not do very well, however, was pair lexigrams with objects.[2] An experimenter would train them on the association between a food item and its lexigram, and then later hold up the food item and ask them with which lexigram it was associated. Even though other apes had performed well on this task, Sherman and Austin both experienced considerable difficulty when they had to choose between two keys; and despite extensive training, they were unable to perform above chance if they had to select from three keys. But a dramatic breakthrough occurred when Savage-Rumbaugh shifted to a ‘request task’ paradigm in which the experimenter would hold up a food item that the chimp would immediately receive if he pressed the right key. Now Sherman and Austin started to make rapid gains in lexigram-object pairings. But if the expectancy of receiving a food item by pressing its lexigram key was removed their response behaviour quickly became fragmented again.

      Savage-Rumbaugh proceeded to teach Sherman and Austin the difference between ‘requesting’ and ‘naming’ by fading out their food rewards. In her view, the breakthrough occurred when the apes began to use lexigrams with no expectation of receiving a food reward. Now

When a training task was begun, instead of waiting for the teacher to ask that certain items be given or labelled, the chimpanzees began naming items spontaneously and then showing the named item to the teacher. As the chimpanzees decided which objects were to be named and shown, they also incorporated many aspects of the teacher's role into their own behavior. They initiated trials, singled out objects, and actively engaged in behaviors designed to draw the teacher's attention to what they were saying. Moreover, these indicative behaviors, once they appeared, were not limited to training contexts (Savage-Rumbaugh 1986: 326).

In other words, Sherman and Austin began to demonstrate communicative behaviors which are normally seen in the 1-year-old child: they imitated Savage-Rumbaugh’s actions, and they used lexigrams spontaneously, in novel situations, to refer to objects, direct her attention, and express their intentions. Whether or not such actions are present in the wild, these were certainly new behaviors for them, and thus, as Deacon argues, we need to explain this radical ‘shift’ in their behavior.

      We saw in §1 how, on the information-processing paradigm, communication is defined in terms of, and is thus limited to, the number of channels available to the sender and receiver; the quality of the medium; and the nature of the ‘internal states’ experienced by sender and receiver. Since neither of the first two factors was significantly altered by the training paradigm, the explanation for the ‘shift’ in Sherman and Austin’s communicative behaviour must, on the information-processing paradigm, lie in the third factor. And this is precisely the route that Deacon takes: according to Deacon, the qualitative shift observed in Sherman and Austin’s communicative behaviour was the result of a “radical transformation in the[ir] mode of representation” (Deacon 1997: 87).

      On Deacon’s account, this ‘mental transformation’ was induced by first training and then systematically extinguishing illicit symbol combinations in a combinatorial system consisting of two ‘verbs’ and four ‘nouns’. Even such a simple combinatorial system allows for 720 possible sequences of the six lexigrams (6! = 720), most of which are nonsensical. Over thousands of trials, these illicit combinations were gradually extinguished. As a result, the apes learned, not simply ‘symbol-object’ pairings, but “a set of logical relationships between the lexigrams” (Ibid: 86). That is, they discovered “that the relationship that a lexigram has to an object is a function of the relationship it has to other lexigrams, not just a function of the correlated appearance of both lexigram and object” (Ibid). And this, Deacon concludes, “is the essence of a symbolic relationship” which, once grasped, enabled Sherman and Austin to assimilate new symbols into their lexicon, quickly and effortlessly (Ibid). Thus, Deacon’s explanation of this phenomenon is presented in the same terms as one would use to describe a Pattern Recognition program: viz., “the shift from associative predictions to symbolic predictions is initially a change in mnemonic strategy, a recoding. It is a way of offloading redundant details from working memory, by recognizing a higher-order regularity in the mess of associations, a trick that can accomplish the same task without having to hold all the details in mind” (Ibid: 89).

      Deacon’s argument stands in sharp contrast, however, to what Savage-Rumbaugh herself tells us about her research with Sherman and Austin. Deacon primarily bases his account on the early guidelines of the Animal Model Project (see Rumbaugh 1977; Savage-Rumbaugh & Rumbaugh 1978). But Savage-Rumbaugh quickly abandoned this approach and decided that, “unlike all previous ape-language projects, this one would not have as its goal the production of word combinations or sentences. I wasn’t in search of the linguists’ holy grail. I was going to focus on words: What does a word mean to a chimpanzee, and how can we find out?” (Savage-Rumbaugh & Lewin 1994: 49). That is, she deliberately abandoned the earlier approach of training the chimps on which cues were irrelevant and which combinations were illicit, and began instead to encourage the chimps to use lexigrams in their day-to-day activities. Not surprisingly, therefore, we see striking differences in the manner in which Deacon and Savage-Rumbaugh conceptualize the ‘shift’ that occurred in Sherman and Austin’s communicative behaviours, and in their explanations of how this ‘shift’ occurred.

      According to Deacon, Sherman and Austin experienced a sort of ‘gestalt-like’ mental reorganization: what he calls a ‘recoding’ or a ‘re-representation’ of the lexigrams they had originally learned as indexical pairings. But according to Savage-Rumbaugh, the ‘qualitative shift’ observed in Sherman and Austin’s communicative behaviors was the result of her efforts to establish “a much closer physical proximity with the apes, interacting with them in a social, preschool-like setting [that] would emphasize communicative needs rather than promoting teaching efficiency” (Ibid). For Deacon “the food lexigrams are in a real sense ‘nouns’ [because they] are defined by their potential combinatorial roles” (Deacon 1997: 88). For Savage-Rumbaugh “it seemed that [Sherman and Austin] really ‘had words’ [when] they understood that words could be used to express future intentions and thereby coordinate actions, rather than simply as a mechanism to get others to do something for them” (Savage-Rumbaugh & Lewin 1994: 127). Thus, for Deacon, the ‘shift’ in Sherman and Austin’s behavior occurred as the result of a mental transition from learning isolated symbol-object pairings to learning how new symbols fit into a combinatorial system; for Savage-Rumbaugh the shift occurred as the result of switching from a behavior modification paradigm to using lexigrams in ordinary social circumstances. Whereas Deacon’s explanation focuses on what went on ‘inside their heads’, Savage-Rumbaugh’s explanation focuses on the dyadic and triadic interactions in which the apes engaged.

      For example, Savage-Rumbaugh recounts how, to teach them comprehension skills, she hid a food item in a container and used the keyboard to tell them what was hidden in the container:

The first time I did this Sherman rushed to smell the container, but was unable to detect what was in it. He gestured for me to open the container, but I refused. Instead, I went to my keyboard, located just outside Sherman and Austin’s room, and stated this chow. When I used my keyboard, it made the symbol ‘chow’ appear on projectors located just above Sherman and Austin’s keyboard. Sherman saw this information and apparently believed me because he immediately used his own keyboard to say open chow. On the next twenty trials of this novel situation, Sherman made just two errors, even though I used many different words (Savage-Rumbaugh & Lewin 1994: 71).

In this episode, Sherman’s novel communicative behavior emerged in the context of a complex routine that encompassed not only the present circumstances but also, all of the food-sharing interactions that Sherman had previously experienced. Furthermore, what Savage-Rumbaugh was doing was a crucial factor in what Sherman was doing, just as what he was doing was a crucial factor in what she was doing. The manner in which each of them was acting was part of their shared history together; the very fact that Savage-Rumbaugh describes Sherman as “believing her” attests to her perception of the importance of the strong affective relationship that enabled Sherman to make this communicative advance.

      Savage-Rumbaugh reports that from this moment on she witnessed a qualitative shift in her interactions with Sherman and Austin: viz., they began to express their intentions before acting; to pay far more attention to the consequences of their own communicative actions; and to attend more closely to her actions in order to ascertain what she intended (Savage-Rumbaugh & Lewin 1994: 72). Deacon’s argument is that, given the dramatic shift between their behavior prior to and immediately following this ‘qualitative shift’, we can infer that they must have experienced some such ‘mental re-organization’ as that described above. Prior to this moment their use of the lexigram board was merely conditioned, but after the ‘qualitative shift’ their uses of the lexigram board became symbolic. But Savage-Rumbaugh rejects both aspects of this description; she argues that, prior to the ‘watershed’, Sherman and Austin were intent on discovering how they could use the lexigram board to control the behavior of their experimenters (Savage-Rumbaugh 1986: 65), and afterwards, that they became intent on discerning her intentions and expressing their own.

      Thus, rather than viewing Sherman and Austin’s transformed behavior as evidence of a ‘mental transformation’, Savage-Rumbaugh treats their altered behavior as a criterion for describing them as having experienced a ‘moment of insight’ which resulted in a sharp change in their attention to context and another speaker’s intentions. The point she is making here is reminiscent of what Tomasello says about language acquisition in children. Tomasello argues that: “At around nine to twelve months of age human infants begin to engage in a host of new behaviors that would seem to indicate something of a revolution in the way they understand their worlds, especially their social worlds” (Tomasello 1999: 61). What Tomasello has in mind is the fact that, although infants are gesturing intentionally by 8 months of age, it is only “At nine months of age [that they] begin engaging in a number of so-called joint attentional behaviors that seem to indicate an emerging understanding of other persons as intentional agents” (Ibid). So too, Sherman and Austin were clearly communicating intentionally with the board prior to this ‘watershed’ moment, but according to Savage-Rumbaugh, after this ‘aha experience’ they not only began to attend closely to her actions but also “to pay close attention to each other’s communications; they engaged each other before delivering their message; they gestured to emphasize or clarify messages; they took turns” (Ibid: 84). This cognitive and communicative advance observed in Sherman and Austin lends further credence to the point made by leading child developmentalists (Greenspan 1997) that the capacity to ‘mindread’ is a product of early socialization processes and not, as some have speculated, a maturational phenomenon.

      As a result of this qualitative shift in their communicative behaviors, the chimps began to engage in highly atypical activities, such as freely sharing each other’s plant food and using the lexigram board to cooperate in complex food-sharing tasks. For example, Savage-Rumbaugh placed them in adjoining rooms that were separated by a clear Plexiglas window. In Sherman’s room there were a number of boxes, each baited with a different kind of food, and each needing a specific tool in order to be opened. Austin was then placed in another room with all of the tools. He could see all the different foods through a window, and would signal to Sherman which food he wanted. Sherman responded by using the lexigram board to tell Austin which tool he needed to open that box. Austin would select the appropriate tool (e.g., a key or a wrench) and pass this through a small hole to Sherman. Sherman would then open the right box and pass the food through to Austin (eating a portion of it along the way).

      Clearly far more was involved here than ‘mutual instrumental behavior’. Savage-Rumbaugh reports how "Joint regard, amplification of symbols with gestures, and spontaneous correction of errors were behaviors that emerged out of the interindividual interactions between Sherman and Austin" (Savage-Rumbaugh 1986: 203). For the apes weren’t just monitoring the other’s behavior in order to ensure that their communicative intention had been correctly ‘decoded’, and when this was not the case, repeating or reinforcing the behavior in question. Rather, as is the case with children who are learning their first words, they were using lexigrams to “coordinate [their] management of a complex interactional task”  (Taylor 1992: 245). Indeed, they were even correcting themselves, as well as each other. For example, “on one trial Sherman mistakenly requested a key when a wrench was appropriate for the task, and he watched as Austin began to look over the toolkit in response to the request. Austin picked up the key, and Sherman looked surprised, turned to look at the keyboard, which still showed the key request he'd made, and realized his mistake. He rushed to the keyboard and corrected himself by tapping on the wrench symbol to draw Austin's attention to the changed request. Austin looked up, saw what Sherman was doing, dropped the key, and took the wrench to the window to give to Sherman” (Savage-Rumbaugh & Lewin 1994: 82). 

      In other words, Sherman and Austin had reached a point where one wants to say: not only were they able to understand the meaning of the lexigrams they were using, but they were even able to understand each other using lexigrams. The problem that Deacon and Savage-Rumbaugh are both addressing, therefore, is: what justifies such an assertion? Deacon’s answer is that, given the objective structure of the combinatorial system that they were using, one can infer that they had both experienced the same mental transformation. Savage-Rumbaugh’s argument is that one is warranted in describing Sherman and Austin as understanding the meaning of lexigram symbols and each other on the grounds that they could do such things as use a lexigram correctly and respond appropriately to its use by others; initiate spontaneous exchanges with lexigrams; use lexigrams to express their intentions; jointly attend to lexigrams, each other, and another person or object; use lexigrams to direct each other’s or another person’s attention; extend the use of lexigrams to novel (but suitable) circumstances; spontaneously assign unlabelled keys to new foods; closely attend to their own, and to someone else’s use of lexigrams; and correct their own or each other's mistaken uses of lexigrams.

      It is important to be clear here that, as far as the Dynamic Systems paradigm is concerned, the question is whether Sherman and Austin’s behavior was sufficiently complex to satisfy the criteria for describing them as understanding the meaning of symbols, and what implications this might have for our views about the relationship between language and communication. If – as would appear to be the case from the above catalogue of behaviors – Sherman and Austin can legitimately be described as having acquired primitive linguistic skills, it is because of the sorts of things that they began to do with lexigram symbols and their knowledge of what sorts of things one was supposed to do with those symbols (see Shanker & Taylor in press). And herein lies the reason why Savage-Rumbaugh concluded that, contrary to what is postulated by the linear model of communication, "meaning and intent are not to be found by looking ‘inside’ a speaker” (Ibid: 382). For the explanation of the fact that Sherman and Austin had begun to understand the meaning of lexigram symbols, and that they could understand each other using lexigrams, does not revolve around what (if anything) went on in their minds. Whether or not a subject understands the meaning of ‘p’, or another speaker, is established by what she says or does in the context of dynamic interactions.

      If anything, the research with Sherman and Austin attests to just how problematic it is to draw a categorial distinction between nonverbal communicative behavior and primitive linguistic behavior (see Savage-Rumbaugh, Shanker & Taylor 1998). For what the research with Sherman and Austin demonstrates is how verbal behavior – i.e. what we describe as ‘verbal behavior’ – “emerges from and with nonverbal behavior, and as it does, it provides for a new means of coordinating interindividual object-oriented behaviors” (Savage-Rumbaugh 1986: 31). On this line of thinking, the ontogeny of language skills lies, not in a genetic blueprint for ‘encoding’ and ‘decoding’ epistemically private ‘mental states’, nor in a sudden mental shift from ‘indexical’ to ‘symbolic’ comprehension, but rather, in “interindividual interactions [that] come to be coordinated through the use of words” (Ibid).

      The psychological problem we are then left with is: what made the ‘qualitative shift’ in Sherman and Austin’s communicative behavior possible? How did Sherman and Austin acquire such species-atypical skills as using lexigrams to communicate with one another and humans? Clearly, what was most atypical about Sherman and Austin was the environment in which they were raised and the kinds of tasks that they were compelled to master. For example, by being physically separated but still able to see and interact with each other, they were forced to employ alternative means for engaging in food-sharing activities. To be sure, the technology that they employed literally forced them to engage in ‘turn-taking’ exchanges; yet that does not mean that the communication between them during these tasks was limited to these exchanges. Rather, their use of the board was incorporated into their nonverbal ‘dances’, and like Sultan and Chica, they mastered the use of a tool to overcome obstacles and achieve their desired goals. Unlike Sultan and Chica, however, what Sherman and Austin mastered was a communication tool. And to master that tool, and the increasingly complex demands that Savage-Rumbaugh imposed upon them, required sustained attention and interaction.

      But it was not just the task and the tool that made this cognitive and communicative development possible; the presence of an unusually responsive and emotionally attuned caregiver was equally crucial. That is, we must not lose sight of the fact that Savage-Rumbaugh herself was an essential element in Sherman and Austin’s cognitive and communicative development. For what is clear from her account is that Savage-Rumbaugh was learning as much from these social interactions as were the apes. Her own development as a primatologist – her growing understanding of Sherman and Austin’s temperaments, their attitudes, thoughts, needs, and of course, intentions – played an integral role in their cognitive and communicative development. In other words, the socialization of attention that was observed in Sherman and Austin was a dyadic, and in some cases a triadic, phenomenon rather than an endogenous one.


§3. Kanzi

      It is not difficult to understand why the story of the Kanzi research has so captured the general public’s imagination. All of the ingredients that one looks for in a gripping scientific narrative are here: the infant of a little-known species of great ape suddenly, and unexpectedly, succeeds where his adoptive mother, despite extensive training, had failed. A research program on the brink of losing its funding is suddenly reinvigorated. Paleoanthropologists begin to speculate that here at last has been found, if not the ‘missing link’, then at any rate a plausible model of the ‘common ancestor’ (before chimpanzees and hominids moved off on their separate evolutionary paths). Psychologists are forced to reconsider their preconceptions about the cognitive and communicative bifurcation between animals and humans. And society as a whole is forced to reassess the morality, and perhaps even the legality of its attitudes towards apes (see Wise 2000).

      Given the larger ethical as well as scientific implications that hinge on this research, it is not surprising that it has become so important to catalogue the exact nature of Kanzi’s linguistic achievements. This record is now widely known (see Savage-Rumbaugh et al. 1993; Savage-Rumbaugh & Lewin 1994; Savage-Rumbaugh, Shanker & Taylor 1998): when he was 2½ years old Kanzi could use 8 symbols on a lexigram board to request various food items. By the time he was 3 he was using 20 symbols, and when he was 8 he had mastered the productive use of over 250 symbols. He uses these signs purposefully, without cuing or imitation, to do such things as refer to objects and locations in the immediate, present surroundings, as well as to others that are ‘absent’, and even to comment on events that occurred in the past, ask questions, play games, or simply provide information (both requested and unsolicited).

      Even more significant than his use of signs on the lexigram keyboard is Kanzi’s ability to understand spoken English sentences. When he was 8 years old Kanzi was extensively tested on the same data-set as Alia, a 2-year-old child (see Savage-Rumbaugh et al. 1993). The sentences on which they were tested involved such requests as to put something on or in something; to give or show something to someone; to do something to someone; to take something to a distal location; to fetch an object or objects from a distal location; or to engage in some make-believe sequence. Almost all of the sentences were new to Kanzi, and many involved somewhat bizarre requests in order to ensure that he was not able to derive their meaning solely on the basis of semantic predictability.

      The results of this comparative study are fascinating: Kanzi was correct on 72% of the 650 sentences on which they were tested while Alia was correct on 66%. Even in the cases that were classified as errors Kanzi was usually partially correct. For example, if asked to fetch two objects from some location he might return with only one. Or he might give the right item to the wrong person, or the wrong item to the right person. In those cases where both Kanzi and Alia were completely mistaken in their responses, it was generally either because of inattention or because they responded to some atypical request (e.g. ‘Put the knife in the hat’) with a customary action (Kanzi attempted to cut a bar of soap with the knife while Alia attempted to cut an apple).

      Perhaps the most striking difference between Kanzi and Alia was in respect to their memory abilities: Alia could tolerate fairly long delays before she executed the task that had been requested, whereas Kanzi needed to act fairly promptly if he was to execute the requested task. Yet Kanzi actually performed better than Alia when asked to go to a distal location to fetch an object. Interestingly, both experienced difficulty when confronted with conversational implicatures. For example, they were asked to ‘Go outdoors and get an orange’ while seated in front of an array of objects that included an orange. Half of the time Kanzi would pick up the object immediately in front of him and then go to the location named, while Alia responded in a similar fashion 25% of the time. But when the ambiguity was removed and he was asked, e.g., to ‘Go get the orange that’s outdoors’, Kanzi responded appropriately 91% of the time.

      These controlled studies confirmed Kanzi's ability to understand English sentences displaying a variety of syntactic patterns. Some of these sentences exhibited a degree of syntactic complexity, including the use of embedded constructions. And many of the sentences were paired with their semantic inversion so as to ensure that Kanzi was responding to the syntactic structure and not simply to semantic cues. This proven ability to understand spoken sentences in English represents a whole new dimension in ape language research. But what are the implications of this advance for our understanding of nonhuman primate communicative capacities, and perhaps, for our understanding of language acquisition? Once again, the contrast between Deacon’s discussion of this issue and Savage-Rumbaugh’s is highly illuminating.

      Deacon places Kanzi’s current linguistic proficiency at around the level of a 3-year-old child, which is more than high enough to pose a formidable challenge to nativist theories of language acquisition (Deacon 1997: 125). But then, this leaves us with a paradox. Assuming that Kanzi acquired the rudiments of language whereas Matata did not, simply because he was exposed to a language-enriched environment from birth whereas she was wild-born and was 10 years old when Savage-Rumbaugh started working with her, why should Kanzi’s brain have evinced a “language-specific critical-period adaptation” when apes in the wild do not possess language (Ibid: 126)? Furthermore, Deacon argues that the Kanzi research highlights the same ‘basic paradox’ that we see in language acquisition: viz., why should Kanzi have easily mastered a task which, prima facie, is far more complex than simpler tasks requiring conscious memorization of novel associations?

      Generativists have, of course, long capitalized on the latter ‘paradox’ in order to bolster their claim that a child must possess innate knowledge of the ‘principles and parameters’ of language (Pinker 1994). But Deacon eschews this nativist strategy and, in its stead, pursues the same sort of ‘less-is-more’ solution as Elissa Newport: viz., perhaps the child’s general cognitive deficit vis-à-vis problem-solving is actually an advantage when it comes to language-learning (Newport 1991)? Perhaps this is the reason why Kanzi was able to make the categorial shift from indexical to symbolic association spontaneously whereas Sherman and Austin required extensive training in order to experience this ‘mental reorganization’. Perhaps Kanzi’s mind worked differently, not because there is something unique about Kanzi, but because there is something distinctive about the brain of an infant primate? Indeed, perhaps there is something unique about the structure of language that renders it ideally suited for an ‘immature’ mind to acquire?

      By no means is Deacon challenging the premise that a child – or an ape – must acquire grammar when he or she acquires language; for grammar, he insists, is “essential to successful [symbolic] communication” (Ibid: 128), insofar as it enables a subject to predict which symbol combinations are licit and which nonsensical. But Deacon’s argument side-steps the whole controversy over whether Kanzi’s achievements constitute ‘true’ language. For on Deacon’s terms what is clear is that Kanzi has acquired a symbolic system with a simple grammar equivalent (if not identical) to that of a 3-year-old child, and that in itself is all that really matters here: i.e. if we can explain the psychological mechanism that enabled Kanzi to acquire symbols and a grammar, perhaps we shall therein discover the key to the ‘basic paradox’ in the study of language acquisition.

      The crux of Deacon’s thesis lies in his earlier discussion of Sherman and Austin. To review his argument: in the combinatorial system on which Sherman and Austin were trained there were 720 pair sequences, most of which are nonsensical. Granted this is still a large number, but with enough time and patience it was possible to extinguish the illicit combinations and overcome Sherman and Austin’s “natural learning predispositions that worked against their discovery of the symbolic reference associations of the lexigrams they were taught” (Deacon 1997: 125). But with Kanzi we are talking about a productive system of 250 symbols and a comprehension system of at least 650 words. Clearly any thought of systematically extinguishing illicit combinations is untenable; for the system is large enough that a version of Gold’s Theorem[3] applies here as much as to natural languages.

      It was undoubtedly because he was exposed to language from birth that Kanzi “crossed the same cognitive threshold [as Sherman and Austin] supported mostly by his own spontaneous structuring of the learning process” (Ibid: 125). That is, Kanzi had to undergo the same cognitive transition as Sherman and Austin – from indexical to symbolic associations – but he did so spontaneously because his brain was exposed to this task at a time when his prefrontal cortex was still relatively under-developed. Thus our task is to explain what it was about “Kanzi’s immaturity [that] made it easier [for him] to make the shift from indexical to symbolic reference and to learn at least the global grammatical logic hidden behind the surface structure of spoken English” (Ibid: 137). Kanzi’s cognitive deficits must somehow have resulted in a match between his “spontaneous structuring of the learning process” and “the structure of the patterns to be learned” (Ibid: 128).

      As noted above, one of the more intriguing contrasts that emerged in the Kanzi-Alia comparative study was the difference between their short-term memory capacities. But this disparity is not surprising, given the rapid pace of a child’s cortical development in the first two years of life. Kanzi’s short-term memory capacity would likely be closer to that of, say, a 1-year-old child. This point turns out to be important for Deacon’s thesis, for he wants to argue that, in general, the immature primate brain renders it difficult for infants to attend to surface details. That is, because of his short-term memory deficits, Kanzi found it difficult to store specific symbol-object associations. But this turned out to be an advantage when it came to language-learning, precisely because grammar and syntax are “surface expressions of the deep web of symbolic relationships” (Ibid: 128).

      Thus the reason why Kanzi acquired so much more language than Sherman and Austin, and acquired it so much more easily, has nothing to do with that extra 1% of ‘human DNA’ supposedly possessed by bonobos, nor with the fact that “Bonobos manifest a more intricate socio-communicative repertoire, including the use of more gestures and more vocalization, than common chimps do” (Savage-Rumbaugh & Lewin 1994: 125). Rather, Deacon argues that the basic difference between Sherman and Austin, and Kanzi, concerns the stage of cortical development which each had reached when they were first exposed to language. Given their greater attentional and short-term memory capacities the chimps’ natural learning predisposition was to focus on the details of word-object relationships, whereas Kanzi’s natural learning predisposition was to inhibit these ‘surface details’. It was this initial ‘learning bias’ which enabled Kanzi “to notice the existence of superordinate patterns of combinatorial relationships between symbols” (Ibid: 136).

      The reason why Kanzi’s achievements are so relevant to language acquisition, therefore, is “because if his prodigious abilities are not the result of engaging some special time-limited language acquisition module in his nonhuman brain, then such a critical period mechanism is unlikely to provide the explanation for the language prescience of human children either” (Ibid: 137). That is, “Precisely because of children’s [general] learning constraints, the relevant large-scale logic of language ‘pops out’ of a background of other details too variable for them to follow” (Ibid: 135). Deacon is comparing language acquisition here to the well-known perceptual phenomenon in which one distinct image in an array of identical images suddenly materializes. The reason why adults find it so difficult to see these ‘hidden images’ is because they tend to focus too closely on the picture and it requires a conscious effort to stop oneself from attending to surface details in order for the hidden image to ‘pop out’. So too, Deacon wants to argue, with language-acquisition: here the ‘surface’ details are the indexical associations between signs and objects and the ‘hidden’ details are the symbolic associations that are a function of the distributed relationships among the symbols in a system. The infant’s mind must be equipped with some sort of ‘bias’ that inhibits surface details and this is what enables her to perceive ‘hidden’ symbolic relationships and grammar.

      Rather than supposing that the human brain must somehow have been ‘hardwired’ during the Pleistocene to acquire language, therefore, we can see how it must be languages that have evolved in such a way as to capitalize on the biases of the immature (primate) mind. That is, “Language structures may have preferentially adapted to children’s learning biases and limitations because languages that are more easily acquired at an early age will tend to replicate more rapidly and with greater fidelity from generation to generation than those that take more time or neurological maturity to be mastered” (Ibid: 137). Thus if nonhuman primates lack languages in the wild, this must presumably be due to cultural and not genetic factors.

      The idea of the ‘biases’ that lies at the heart of Deacon’s thesis harkens back to AI’s notion of cognitive heuristics. It is no coincidence that a computational metaphor should play such an integral role in his theory, for his argument is fundamentally mechanist: his notion of ‘mental reorganization’ is that of a spontaneous processing phenomenon which is brought about by the exposure to appropriate inputs during a sensitive period. Again, this is a classic form of ‘linear’ explanation: Kanzi’s advanced communicational abilities are the result of a ‘mental re-coding’ – literally the acquisition of a symbolic code – that occurs “spontaneously, without conscious effort or formal instruction [and] is deployed without awareness of its underlying logic” (Pinker 1994: 18). It turns out that the obstacles to this process – which we all encounter as adults if we should try to acquire a second language – are conscious effort and attention.

      A crucial aspect of Deacon’s argument, therefore, is the idea that Kanzi acquired his language skills spontaneously, around the age of 2½. Savage-Rumbaugh, on the other hand, places great emphasis on the events leading up to that momentous day when she discovered that Kanzi had acquired 8 lexigram symbols without any direct instruction. For example, she tells us that, at around the age of 6 months, Kanzi “became mesmerized by the keyboard, staring at the symbols as they flashed onto the projectors at the top of the keyboard” (Savage-Rumbaugh & Lewin 1994: 129). When he was 14 months old Kanzi began “to press keys on the keyboard and then run to the vending machine as though he had grasped the idea that hitting keys produced food” (Ibid: 130). At this stage his behavior was similar to Matata’s, who had also grasped the communicative function of the lexigram board, but had great difficulty with individual lexigram-object associations (the ‘surface details’?). When he was 18 months old Kanzi started “inventing simple iconic gestures, the first of which indicated the direction of travel in which he wished to be carried. He did this not with a finger point, but with an outstretched arm” (Ibid: 134). He even “added emphasis to his gesture by forcefully turning [Savage-Rumbaugh’s] head in the direction he wished to go. … At other times, as he sat on [her] shoulders, he would lean his whole body in the desired direction of travel so that there was no mistaking his intent” (Ibid). And he often “vocalized while gesturing, which served to catch [her] attention and to convey the emotional affect that accompanied each request” (Ibid).

      Around the age of 2 Kanzi started to incorporate lexigrams into his communicative repertoire. For example, he “started deliberately to select the ‘chase’ symbol. He would look over the board, touch this symbol, then glance about to see if [Savage-Rumbaugh] had noticed and whether [she] would agree to chase him” (Ibid). Interestingly, on that first day after Matata’s departure, when he was left alone at the lab with Savage-Rumbaugh, the first thing Kanzi did with the board “was to activate ‘apple,’ then ‘chase’. He then picked up an apple, looked at me, and ran away with a play grin on his face” (Ibid: 135). Throughout that day he repeatedly “hit food keys, and when [Savage-Rumbaugh] took him to the refrigerator, he selected those foods he’d indicated on the keyboard. Kanzi was using specific lexigrams to request and name items, and to announce his intention” (Ibid).

      Savage-Rumbaugh’s explanation of Kanzi’s language development proceeds from essentially the same starting-point as Deacon’s; for she too would argue that “If early exposure to language is even part of the explanation for Kanzi’s comparatively exceptional language acquisition, then it must be attributable to something about infancy in general, irrespective of language” (Ibid: 126-7). But, as is clear from the foregoing account, the focus of her argument is on the importance of Kanzi’s precocious communicational development for the development of his language skills. In many ways, Kanzi’s acquisition of lexigram symbols is reminiscent of the effects of preliterate experiences on a child’s acquisition of reading skills; for, typically, the more a caregiver reads to an infant, the more the child understands about the function of the printed word, and thus, the faster the child learns how to read (see Adams 1998). Moreover, one cannot ignore the dominant role of affect in communicative development; for the most effective speech-language therapies for children with severe language delays are those that mobilize the child’s affects (see Greenspan 1997). Significantly, Kanzi’s first ‘language act’ with the board the day after Matata was taken away was not to obtain a food that he wanted to eat but to engage Savage-Rumbaugh in one of his favorite pastimes.

      In Savage-Rumbaugh’s mind, the most important decision that they made was to “abandon any and all plans of [formally] teaching Kanzi and simply to offer him an environment that maximized the opportunity for him to learn as much as possible” (Ibid: 137). This decision demanded that they create new lexigrams for the most important aspects of Kanzi’s day-to-day activities: e.g. the names of foods, caregivers, other apes, locations in the forest, toys and games. No symbols were inserted solely for the purpose of ascertaining whether Kanzi could grasp some abstract concept. If anything, we should look at the board in the same way that we look at motherese; for the board was not designed to test or to instruct: it was designed to facilitate interactions by providing Kanzi with an artificial communication tool (and a fairly cumbersome one at that). As a result, Kanzi’s “communications soon began to revolve around his daily activities, such as where we were going to travel in the forest, what we would eat, the games we wanted to play, the toys Kanzi liked, the items we carried in our backpacks, television shows Kanzi liked to watch, and visits to Sherman and Austin” (Ibid: 139). The conclusion that Savage-Rumbaugh reaches is thus the exact opposite of Deacon’s: far from being the result of a spontaneous ‘mental reorganization’, she argues that Kanzi’s language development was a prolonged process that occurred because he “was aware that we employed the keyboard as a means of communication and apparently felt keenly motivated to do so as well” (Ibid).

      It might be tempting to conclude that one could simply combine these two arguments: i.e. treat Deacon’s linear account as providing the ‘psychological explanation’ of the communicative development described on Savage-Rumbaugh’s dynamic systems approach. But it is essential that we recognize how the two paradigms not only differ in their explanation of how Kanzi acquired his ‘prodigious abilities’ but also differ fundamentally in their understanding of the nature of those abilities. On Deacon’s argument, the ‘structure’ of the system that Kanzi acquired is something that pre-existed his encounter with it. Like the ‘pop-out’ visual array, the grammar of the system was hidden somewhere in the lexigram array. So the problem Deacon sets out to answer is: how was Kanzi able to see and thus acquire this structure? But the problem that Savage-Rumbaugh addresses is: how did Kanzi’s verbal skills emerge in the context of, and as a way of augmenting and co-regulating, his nonverbal interactions? How were Kanzi’s natural communicative abilities shaped by the reflexive characteristics of the environment in which he was raised into languacultural phenomena: i.e. into acts of reference, utterances, truths and falsities, apologies, explanations, corrections, etc.?

      In place of an information-processing explanation, Savage-Rumbaugh pursues the same kind of interactional explanation as Tomasello explores in The Cultural Origins of Human Cognition. According to Tomasello, “sounds become language for young children when and only when they understand that the adult is making that sound with the intention that they attend to something. This understanding is not a foregone conclusion, but a developmental achievement” (Tomasello 1999: 101; our italics). That is, in order for sounds to function linguistically, “the two interactants [must] share an understanding of each other’s interactive goals” in whatever context they are engaged (Ibid: 99). To see how this argument relates to Kanzi’s linguistic development we might consider a few scenes from the NHK video, Kanzi: An Ape Genius:


The Cookout

Kanzi and Savage-Rumbaugh are engaged in a clearly defined joint activity. When asked by Savage-Rumbaugh, Kanzi collects and breaks sticks for the fire. He chooses long sticks to break and picks up short sticks that he does not break. When Savage-Rumbaugh tells him that he can get the lighter from her pocket he immediately responds by reaching one hand into her pocket. Savage-Rumbaugh stands up in order to make the pocket more accessible. The moment she says he can use the lighter to start the fire he is already starting to do so. He tries several times to light the fire and only stops, and drops the lighter in the fire, when he sees that the flame has taken. Interestingly, he chooses to light the paper and not, say, one of the larger branches. He then stares very intently at the growing flames.


Match-to-Sample Test

      This is one of the most familiar tasks that have been done with Kanzi. The scene begins with Kanzi seated in front of a large lexigram board while Rose is standing behind him and Sue and a male worker are standing at a small window in an adjacent room around 5 feet away, giving him instructions through a microphone. The test begins with Kanzi half-turning in his seat to look at Rose who says ‘Let’s listen some more’. He turns back to the board, then orients to the right to the sound of his name, then immediately turns back to the board when Sue says his name, recognizing that this is his cue to look at the board. When he hears the spoken word he quickly points to the correct symbol and Rose says ‘Yes’. Kanzi vocalizes and it sounds like he too may be trying to make the same sound [yz]. Sue says ‘ice’ and as he points to the key he vocalizes [ice?]. He and Rose then interact in a play tickle (initiated by Rose). Then the male worker says ‘balloon’: Kanzi can be seen to be leaning towards him at this precise moment, and it is not clear whether Kanzi is leaning in response to the vocalization (perhaps to hear better), or his leaning is cuing the vocalization. He scans the board left-to-right and doesn’t move until he has spotted the correct key and then extends his left index finger. (Throughout this sequence he always points at the selected key with his left index finger. He always withdraws the point as soon as the synthesizer articulates the sound.)

      The sequence is repeated, with Kanzi seated before the board but looking over his right shoulder at the male worker until he says a word. Rose says ‘Good Kanzi’ when he gets it right and he can be seen to dip his head when she says this. The worker then says ‘chicken’ and Kanzi methodically scans the board, left-to-right, right-to-left, then left-to-right again (6 seconds in total) before he sees the key. He does not begin to point until he has found the key. The male worker says ‘hot dog’ and again Kanzi scans left-to-right and points with his left index finger. Now Rose says (for the first time): ‘Perfecto’ and raises her arms and stands with her hands open. She says this with a different tone of voice which seems to indicate that the task is over, as does the gesture (inviting hug?). Kanzi vocalizes himself, and it sounds like he’s trying to imitate her vocalization. Then he turns and starts to get off the chair, suggesting that he too thinks that Rose’s vocalization indicated that he was finished. But Sue interrupts, saying ‘one more’. He immediately sits back down and turns back to the board. The worker says ‘grapes’ and Kanzi quickly points to the key. Rose has been standing with her arms raised all this time. When Kanzi points to grapes she says, in an even more emphatic tone: ‘Success’. As she says this she lowers her arms and starts to move towards him. He interprets this as indicating that the task really is over and stands up, but he moves to the window and not to Rose (who is advancing towards him). He looks out the window (to see if Sue is getting him food?) while Rose hugs him from behind. Hugging him Rose says ‘Good job’ with a rising intonation, then repeats ‘Good job’ with a falling intonation. There can be no doubt now that the session is over. He begins to look towards Rose and she fixates directly on his eyes and, with her own eyes very wide open says: ‘And then we’ll get some more grapes.’ He vocalizes ([yz]?) and Rose, turning and moving away, says ‘How does that sound’. It is Kanzi’s turn to respond excitedly, vocalizing, standing upright, and swinging his arms, which draws Rose back to him. The two of them start to move and vocalize together excitedly, their arm movements and vocalizations mimicking each other.


Kanzi and Tamuli

In this scene, Sue is seated on one side of a chain-link fence and, immediately opposite her, are seated Kanzi and Tamuli, his adoptive half-sister at the LRC who has received little exposure to language. The contrast between Kanzi’s and Tamuli’s behaviour in this scene is fascinating. It starts with Sue repeatedly saying Tamuli’s name and pointing at her to get her attention. Then Sue says: “Tamuli, could you slap Kanzi? Tamuli, you (pointing), slap Kanzi.” Tamuli does nothing, but Kanzi himself starts to gently slap Tamuli on her back. Then he shakes her arm (which is resting on his leg) and vocalizes quietly. Tamuli is looking all around and seems to regard all this as a game, for there is a large play-grin on her face, and she is uttering play-grunts. Sue tries something else: “Tamuli, could you give Kanzi a hug?” As Sue says this Kanzi leans forward towards Tamuli, lowers his head, and hugs her. Sue laughs and starts to say “Kanzi is…” But Tamuli continues to look all around with a play-grin on her face. Sue tries a third request: “Tamuli, could you groom Kanzi?” The moment she says this Kanzi picks up Tamuli’s left hand and raises it to his chin while making a facial gesture (rounded lips). He looks intently at Tamuli, but when she fails to respond he shrugs off her arm. But Sue immediately repeats “He’s asking you to groom him,” and as she says this Kanzi engages in the exact same action, holding Tamuli’s hand up to his chin and leaning forward. As he does this Sue is saying “look, he put your hand up there… Isn’t that nice?” But again Tamuli doesn’t respond, and Kanzi drops her hand. As he does so Sue says “look, he’s showing you.” But at this point Tamuli has lost all interest in this game and moves off to play with someone else, while Kanzi looks after her, then turns back to Sue and receives a treat.


      In all of these scenes (and several other examples in Kanzi: An Ape Genius) we see Kanzi totally engaged in a joint activity for a sustained period. There is a seamless web of communication between him and Savage-Rumbaugh, in multiple modalities and not just through spoken English. Much of the communication between them occurs without Kanzi looking directly at Savage-Rumbaugh as she speaks. In the first of the three episodes, Kanzi doesn’t simply do what he is asked: he understands why he is being asked to do this. He knows what size of stick is needed for the task and roughly how many sticks are needed. Perhaps he even knows that he is providing an essential service, insofar as Savage-Rumbaugh finds it difficult to break large branches. He understands that Savage-Rumbaugh isn’t just informing him that there is a lighter in her pocket, but that she is giving him permission to fetch the lighter from her pocket in order to start the fire.

      The second scene demonstrates how Kanzi can attend to multiple speakers, and even grasps whom he should respond to at any given moment. The task itself is clearly something he can easily perform, despite the complexity of the array. In a couple of cases it is clear that he doesn’t know where the symbol he needs is situated, but rather than searching randomly he systematically scans the board until he’s found the right symbol. The manner in which this session ends is also interesting: rather than any one of the participants formally signalling an end to the task we can see them negotiating with each other before they are satisfied that they have reached a closing.

      In the final scene, Kanzi recognizes immediately that Tamuli has not understood what she is being asked to do. It is significant, not only that he wants to help her but also, that he repeatedly tries to show her what she is being asked to do. The difference between the manner in which this session and the previous scene ends is also interesting. Her attention no longer engaged, Tamuli simply wanders off; but Kanzi remains seated with Sue and turns back to her, at which point they jointly negotiate an end to the activity.

      There are other scenes that we could have discussed that demonstrate equally striking behaviors: scenes in which Kanzi can be seen to correct himself, or explain something, or apologize for an action; scenes in which he engages in pretend-play, or more formal types of games (like Pacman); scenes in which he engages in imitative and creative problem-solving and tool-making; or in which he seemingly understands what someone is thinking or feeling. In other words, the behavior that we observe on this video is very much, as Deacon states, like that of a 3-year-old child. But for this very reason, Kanzi has been widely perceived as an anomaly in ALR. His ‘prodigious abilities’ have been viewed – and in some cases dismissed – as a misleading indicator of nonhuman primate communicative capacities, precisely because of the unusual circumstances in which he was raised: i.e. in which he was literally ‘raised’ to an ‘unnatural’ cognitive and communicative level, because he “receive[d] a kind of ‘socialization of attention’” (Tomasello 1999: 35). That is, “responding to a culture and creating a culture de novo” are seen as “two different things” (Ibid: 36). As far as we know, “apes in their natural habitats do not have anyone who points for them, shows them things, teaches them, or in general expresses intentions toward their attention” (Ibid: 35).

      Elsewhere (King & Shanker submitted) we look in some detail at the issue of whether, or to what extent, apes in their natural habitats have displayed elements of such behaviors. The point we would like to focus on here is Tomasello’s suggestion that when apes are raised in a human-like cultural environment, in which “they are constantly interacting with humans who show them things, point to things, encourage (even reinforce) imitation, and teach them special skills,” they experience a “socialization into the referential triangle – of a type that most human children receive – that accounts for the special cognitive achievements of these special apes” (Ibid: 35). Indeed, one of the principal effects of being deprived of these socializing experiences – whether for endogenous or exogenous reasons – is that a child or an ape develops the sorts of social and communicative deficits that are labelled ‘autistic’ (Greenspan 1997; Harlow & Zimmerman 1959).

      As we saw in the preceding section, Tomasello talks about the ‘nine-month revolution’ that occurs when an infant starts to “‘tune in’ to the attention and behavior of adults toward outside entities” (Ibid: 62). From this point on, the infant’s communicative behavior is marked by gaze following, extended bouts of social interaction, joint engagement, social referencing, imitative learning, and deictic pointing (directed gaze, imperatives, declaratives). It seems likely that the regularities observed in the appearance of these behaviors are related to ‘critical periods’ in the child’s neurobiological development (see Johnson 1997). But then, one must not overlook the importance of a caregiver’s behavior (e.g. smiling, facial animation, body posture, gaze, etc.) during these ‘critical periods’ for the child’s cortico-cortical and neurohormonal development (see Schore 1994). That is, here too, the significance of the developmental manifold cannot be divorced from what might appear to be strictly maturational events (see Gottlieb 1997).

      Moreover, every one of the above communicational behaviors has been observed in Kanzi. This suggests, not only that the development of these abilities is not confined to humans, but more importantly, that the development of these abilities is not genetically predetermined. Rather, the child’s growing ability to engage with her caregivers in complex communicational activities, express her intentions and desires, and describe and express her ideas and feelings, all develop in the context of close dyadic relationships with her primary caregivers. And herein lies the crux of the dynamic systems alternative to Deacon’s information-processing view about what it is “about infancy in general, irrespective of language” that enables a primate to develop language skills.

      We are not concerned here with what might or might not have gone on ‘inside Kanzi’s head’ that enabled him to develop language skills; nor is language viewed as a pre-existing combinatorial system whose ‘structure’ he had to ‘grasp’. Rather, we are concerned with how Kanzi’s attentional capacities, his use of lexigrams, and his comprehension of spoken English developed as a result of being nurtured in language-enriched interactions with his caregivers. We saw in the opening section how, on the dynamic systems paradigm, communication is viewed as a “continuous unfolding of individual action that is susceptible to being continuously modified by the continuously changing actions of the partner” (Fogel 1993: 29). Hence the ability to attend to another subject’s actions – and all of the other executive functions (see Russell 1997) – are both vital to and continue to develop as a result of the ‘communicational dances’ in which the infant engages with her caregivers.

      On this line of thinking, what the research with Kanzi ultimately shows us is how an infant’s cognitive and communicative development involves an ongoing and complex interplay between biological, social, and cultural factors, rendering it exceedingly difficult to draw any hard-and-fast distinction between a child’s communicative and her linguistic development. Language does not suddenly appear at some pre-determined age but rather, emerges as a means of co-regulating and augmenting such primal activities as sharing, requesting, imitating, and playing. The child or ape is increasingly motivated to use and develop these potential communicational tools so that she may achieve context-dependent, interactional goals: goals which themselves develop as a function of the child or ape’s developing communicational environment and her growing abilities and increasingly differentiated affects.


§4. The Implications of the New Paradigm

      The basic premise of the continuum picture that has hitherto been so influential in the study of ALR is that, given that apes are ‘penultimate’ to humans (at least, in the existing natural order), we can expect them to possess sophisticated, but not quite fully human communicational abilities: e.g., conscious and intentional, but not normative or linguistic capacities. If one adopts a Piagetian model of cognitive development, one would expect to find that apes consistently perform at an early stage of human cognition: e.g., the sixth sub-stage of sensori-motor development (see Parker & McKinney 1999). That is, one would expect to find them confined to a point that is just prior to when human infants burst forth into the world of creative thinking, human social cognition, and language. The ape mind is thought to be at the stage where it is just starting to become purposive and intentional but is still characterized by poor abstract thinking (e.g. perceptually-bound) and is highly egocentric. Hence the challenge rendered by the continuum picture is to establish the parameters of ape communicative behaviour: i.e. the basic manner in which they communicate (both send and receive/respond); the kinds of messages that they communicate; the stimuli that prompt them to communicate; and the functions of their communications. It is thought that, in answering these questions, we shall deepen our understanding of apes’ cognitive and communicative capacities and thereby illuminate the more ‘primitive’ elements of human communication that underpin or accompany linguistic communication: i.e. those aspects of human communication that are ‘paralinguistic’.

      The problem with the continuum picture, however, is that it not only sets an upper limit on the communicational complexity of the species being studied but also shapes how that complexity is conceptualized. In the early days of information-processing studies, when apes were still being viewed as stimulus-bound creatures, it was natural enough to think of ape communicative behaviours in linear terms. But it is difficult to see how apes could engage in the complex and sustained interactions that have been observed over the past decades if their communicative behaviours were entirely automatic and/or instrumental. For on this account, B's response to whatever A does must also be automatic or instrumental. Accordingly, ‘successful communication’ can only occur when A's and B's behaviour is mutually satisfying. (For example, A grunts in order to get B to move so that he can pick up x, while at the same time, B moves in order to get A to pick up x.) The idea of genuine communication (in a non-‘information-theoretic’ sense) or information transmission emerging from such discrete interactions is rendered virtually impossible.

      As useful as the information-processing paradigm might be for goal-directed mechanical systems and for some simple organisms, it is difficult to account for the kinds of communicational complexity that we observe in nonhuman primates in terms of discrete linear sequences. For example, apes engage in more gesturing and vocalizing than such an outlook can accommodate. The quantity, type, and manner of gesturing vary with age, social conditions, activity, and the subject’s wishes. Species-typical gestures, such as slapping, clapping, pounding, and chestbeating, have been seen to vary from one group to another, from one individual to another, and from one age to another (Tanner & Byrne 1999). Infants have demonstrated a growing awareness of the communicative significance of their own gestures (King 2000; Parker & McKinney 1999). Individuals have been observed to create new gestures, and possibly, to generalize these idiosyncratic gestures to other communicational situations (Tanner & Byrne 1999). Individuals have been observed to restrain their supposedly innate and automatic gestures (Tanner & Byrne 1999). And extremely subtle variations have been observed in ‘species-typical’ behaviours (e.g. whether an ape touching another moves its hand vertically or horizontally; quickly or slowly; lightly or more heavily; with a pushing or a pulling motion).

      Furthermore, we have to absorb the implications of the startling advances that have been made in ALR. In addition to the surprising number of symbols and syntactical patterns that apes have mastered, it is also important to note that they have demonstrated the ability to play games that are based on complex rules; engage in sophisticated make-believe and role-playing; solve complex tasks imitatively and creatively; perform remarkably well on match-to-sample tasks (even when the instructions are delivered through earphones or by different speakers); deal easily with simple Theory of Mind tasks; and even engage in normative behaviours such as justifying or explaining their own actions, or trying to teach or correct another ape’s actions (see Shanker & Taylor in press). Critics have objected that such studies tell us little about natural primate abilities, insofar as it is only human intervention that has enabled apes to rise to these cognitive and communicative levels. But then, that is surely the point of such studies; for by demonstrating the plasticity of nonhuman primate capacities, we are learning about the significance of environmental and experiential factors in nonhuman primate development.

      Taking all these factors together we can see how, far from being fixed and invariant, ape communicative behaviours in the wild as well as in research facilities are carefully nurtured and culturally variable. In place of the information-processing model that has hitherto dominated the study of ape communication, therefore, we believe that it is imperative that we now shift to the dynamic systems paradigm, which places the emphasis on the dyad rather than the isolated individual; which sees ape communication as a co-regulated process, rather than a linear and discrete sequence; which focuses on the variability of ape communicative behaviours, rather than treating them as phenotypic traits; and which is thus better able to account for both the social complexity and the developmental character of nonhuman primate communicative abilities.



We are deeply grateful to the following people, who have profoundly influenced our views about ape language research and about dynamic systems theory: Sue Savage-Rumbaugh, Talbot Taylor, Alan Fogel, Gilbert Gottlieb, and Stanley Greenspan. SGS would also like to express his debt to the Canada Council, which supported this research with a Standard Research Grant.



Adams, M.J. (1998) Beginning to Read, The MIT Press.

Ainsworth, M.D.S., Blehar, M., Waters, E. & Wall, S. (1978) Patterns of attachment, Erlbaum.

Argyle, M. (1988) Bodily Communication, Routledge.

Chess, S. & Thomas, A. (1984) Origins and evolution of behavior disorders, Brunner/Mazel.

Cianelli, S.N. & Fouts, R.S.  (1998) Chimpanzee to chimpanzee American sign language. Human Evolution 13 (3-4): 157-159.

Deacon T. (1997) The Symbolic Species, W.W. Norton & Company.

Fivaz-Depeursinge, E. & Corboz-Warnery, A. (1999) The Primary Triangle: A Developmental Systems View of Mothers, Fathers, and Infants, Basic Books.

Fogel, A. (1993) Developing Through Relationships, The University of Chicago Press.

Fogel, A. (in press) Beyond Individuals: A relational-historical approach to theory and research on communication. To appear in Il rapporto madre-bambino (The mother-child bond), M. L. Genta, ed. Rome, Carocci Editore.

Gardner, R.A. & Gardner, B.T. (1969) Teaching sign language to a chimpanzee. Science 165: 664-672.  

Goodall, J. (1990) Through a Window, Houghton Mifflin Company.

Gottlieb, G. (1997) Synthesizing nature-nurture: Prenatal roots of instinctive behavior, Erlbaum.

Greenspan, S. (1997) The Growth of the mind: and the endangered origins of intelligence, Addison-Wesley Publishing Company, Inc.

Halliday, T.R. & Slater, P.J.B. (1983) Animal behaviour, vol.2, Communication, Freeman.

Harlow, H.F.  & Zimmerman, R. (1959) Affectional responses in the infant monkey. Science, 130: 421-32.

Hixson, M.D. (1998) Ape language research: a review and behavioral perspective. The Analysis of Verbal Behavior 15: 17-39.

Horner, T. (1985) Subjectivity, intentionality, and the emergence of reality testing in early infancy. Psychoanalytic Psychology, 2: 341-63.

Johnson, M.H. (1997) Developmental Cognitive Neuroscience, Blackwell.

King, B.J. (2000) On gesture and culture in great apes. Paper circulated in advance of Wenner Gren International Symposium.

King, B.J. & Shanker, S.G. (Submitted) How can we know the dancer from the dance: the co-regulated nature of ape communication. In A. Russon & D. Begun, eds. Modern Great Ape Intelligence, Cambridge University Press.

Miles, L.W. (1990) The cognitive foundations for reference in a signing orangutan. In S.T. Parker & K.R. Gibson, eds. “Language” and Intelligence in Monkeys and Apes, 511-539, Cambridge University Press.

Newport, E. (1991) Contrasting conceptions of the critical period for language. In Carey, S. & Gelman, R., eds. The epigenesis of mind: Essays on biology and cognition, Erlbaum.

Parker, S.T. & McKinney, M.L. (1999) Origins of Intelligence, The Johns Hopkins University Press.

Pinker, S. (1994) The Language Instinct, William Morrow and Company, Inc.

Rumbaugh, D., ed. (1977) Language learning by a chimpanzee, Academic.

Russell, J., ed. (1997) Autism as an executive disorder, Oxford University Press.

Savage-Rumbaugh, E.S. (1986) Ape Language, Columbia University Press.

Savage-Rumbaugh, S. & Lewin, R. (1994) Kanzi: The Ape at the Brink of the Human Mind, John Wiley & Sons, Inc.

Savage-Rumbaugh, E.S. & Rumbaugh, D. (1979) Symbolization, language and chimpanzees: A theoretical reevaluation based on initial language acquisition processes in four young Pan troglodytes. Brain and Language, 6: 265-300.

Savage-Rumbaugh, S., Murphy, J., Sevcik, R., Brakke, K., Williams, S. & Rumbaugh, D. (1993) Language Comprehension in Ape and Child, Monographs of the Society for Research in Child Development, Serial No. 233, Vol. 58, Nos. 3-4.

Savage-Rumbaugh, S., Shanker, S. & Taylor, T. (1998) Apes, Language and the human mind, Oxford University Press.

Schaffer, H. (1984) The Child’s entry into a social world, Academic Press.

Schore, A. (1994) Affect Regulation and the origin of the self, Lawrence Erlbaum Associates, Publishers.

Seyfarth, R.M. & Cheney, D.L. (1997) Some general features of vocal development in nonhuman primates. In C.T. Snowdon & M. Hausberger, eds. Social Influences on Vocal Development, 249-273, Cambridge University Press.

Snowdon, C.T. (1999) An empiricist view of language evolution and development. In B.J. King, ed. The Origins of Language: What Nonhuman Primates Can Tell Us, 79-114, School of American Research Press.

Stern, D.N. (1990) Diary of a Baby, Basic Books.

Tanner, J. & Byrne, R.W. (1999) The development of spontaneous gestural communication in a group of zoo-living lowland gorillas. In S.T. Parker, R.W. Mitchell & H.L. Miles, eds. The Mentalities of Gorillas and Orangutans, 211-239, Cambridge University Press.

Taylor, T.J. (1992) Mutual misunderstanding, Routledge.

Tomasello, M. (1999) The Cultural Origins of Human Cognition, Harvard University Press.

Tronick, E.Z. (1989) Emotions and emotional communication in infants. American Psychologist, 44: 115-23.

van der Weele, C. (1999) Images of Development, SUNY Press.


[1] We follow conventional practice in referring to the caregiver-infant dyad as the primary context for infant development, but endorse Fivaz-Depeursinge & Corboz-Warnery’s (1999) approach to studying triadic interactions in early infancy.

[2] The use of lexigram symbols as a communication tool was introduced in the Lana Project (Rumbaugh 1977). Lexigrams are colourful, iconic symbols arranged on a computer keyboard. By pressing them in the proper sequence Lana could turn on music, watch slides, open a window, cause food or drinks to be dispensed and invite people into her room to visit and play.

[3] Viz., “Without explicit error correction, an astronomical number of alternative mappings of word relationships to potential rules cannot be excluded” (Deacon 1997: 127).