Re: Harnad: The Symbol Grounding Problem

From: Watfa Nadine
Date: Fri Mar 02 2001 - 13:43:55 GMT

Harnad's paper covers the symbol grounding problem: what
it is, and how to overcome it. The paper explores symbol
systems and connectionism (neural nets), their problems,
and the proposed solution: a hybrid system that combines
the two to produce a system that could qualify as AI.

Harnad starts his paper by exploring the change in psychology
from behaviourism to cognitivism - from observables
explaining behaviour, to the possibility of unobservable
processes underlying behaviour. Cognitive theory had the
following view:

>In fact, semantic interpretability (meaningfulness), as we
>shall see, was one of the defining features of the most
>prominent contender vying to become the theoretical
>vocabulary of cognitivism, the "language of thought"
>(Fodor 1975), which became the prevailing view in
>cognitive theory for several decades in the form of the
>"symbolic" model of the mind: The mind is a symbol system
>and cognition is symbol manipulation.

I disagree with this view in cognitive theory. The mind
cannot be solely a symbol system, nor cognition solely
symbol manipulation. If the mind were just a symbol system,
we would only be able to do things such as calculation,
reasoning and problem solving; we would be unable to carry
out sensorimotor activities, learn, or even make mistakes.
To humans, symbols are not arbitrary objects: they have
meanings, and humans operate on those meanings.

Harnad then discusses symbol systems to form the basis of
the symbol grounding problem.

>Semantic interpretability must be coupled with explicit
>representation (2), syntactic manipulability (4), and
>systematicity (8) in order to be symbolic.
> Hence it is only this formal sense of "symbolic" and
>"symbol system" that will be considered in this
>discussion of the grounding of symbol systems.

Symbol systems are one way of modelling the mind. Another
approach discussed here is connectionism: neural nets,
which are parallel distributed systems that change their
interconnections with experience. Such systems have turned
out to have powerful learning capacities.
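
To make "changing interconnections with experience" concrete, here
is a minimal sketch of delta-rule learning for a single linear unit.
It is only an illustration under stated assumptions: the function
names, the learning rate, and the toy task are invented here and do
not come from Harnad's paper.

```python
# Minimal sketch of delta-rule learning for one linear unit.
# All names (predict, train_delta, rate, epochs) are illustrative
# assumptions, not taken from the paper.

def predict(weights, bias, inputs):
    """Weighted sum of the inputs: the unit's activation."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def train_delta(samples, rate=0.1, epochs=500):
    """Adjust connection strengths in proportion to the output error."""
    n_inputs = len(samples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Delta rule: each weight moves by rate * error * its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Toy task: learn the mapping y = 2*x1 - x2 from four examples.
samples = [((0, 0), 0), ((1, 0), 2), ((0, 1), -1), ((1, 1), 1)]
weights, bias = train_delta(samples)
```

Nothing in the trained unit is a symbol or a rule: what it has
"learnt" is held entirely in the numerical connection strengths.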

>According to connectionism, cognition is not symbol
>manipulation but dynamic patterns of activity in a
>multilayered network of nodes or units with weighted
>positive and negative interconnections. The patterns
>change according to internal network constraints governing
>how the activations and connection strengths are adjusted
>on the basis of new inputs (e.g., the generalized "delta
>rule," or "backpropagation," McClelland, Rumelhart et al.
>1986). The result is a system that learns, recognizes
>patterns, solves problems, and can even exhibit motor

While discussing cognitive theory accounting for behaviour
in a brainlike way, Harnad states:

>To "constrain" a cognitive theory to account for behavior
>in a brainlike way is hence premature in two respects: (1)
>It is far from clear yet what "brainlike" means, and (2)
>we are far from having accounted for a lifesize chunk of
>behavior yet, even without added constraints. Moreover,
>the formal principles underlying connectionism seem to be
>based on the associative and statistical structure of the
>causal interactions in certain dynamical systems; a neural
>network is merely one possible implementation of such a
>dynamical system.[4]

What is a lifesize chunk of behaviour? What are the other
possible implementations of such a dynamical system?

Harnad continues by assessing the differences between the
two methods mentioned above.

>It is far from clear what the actual capabilities and
>limitations of either symbolic AI or connectionism are.
>The former seems better at formal and language-like tasks,
>the latter at sensory, motor and learning tasks, but there
>is considerable overlap and neither has gone much beyond
>the stage of "toy" tasks toward lifesize behavioral

What are these "toy" tasks? And what is a lifesize
behavioural capacity? Harnad then discusses why
connectionism is not symbolic, whereas many of our
behavioural capacities appear to be.

>Moreover, there has been some disagreement as to whether
>or not connectionism itself is symbolic. We will adopt the
>position here that it is not, because connectionist
>networks fail to meet several of the criteria for being
>symbol systems, as Fodor & Pylyshyn (1988) have argued.
>Nets seem to do what they do non symbolically. According
>to Fodor & Pylyshyn, this is a severe limitation, because
>many of our behavioral capacities appear to be symbolic,
>and hence the most natural hypothesis about the underlying
>cognitive processes that generate them would be that they
>too must be symbolic.

This is what leads us on to the symbol grounding problem.
If our linguistic capacities and other skills are all
symbolic, then why use connectionism to model cognitive
capacities? Why not just use symbol systems? The answer
is that symbol systems are not enough to model our
human-scale capacities. A good example here is the Turing
Test. The first version was the "pen-pal" version, but it
had a few problems. One is that it could in principle be
passed by computation alone (a symbol system), and if so,
it is open to Searle's argument: since a computer programme
is implementation-independent, Searle himself could execute
all the code for passing the Turing Test without
understanding a word of what his pen-pal was talking about
(e.g. if it was in Chinese).

So, the second version was the "robotic" Turing Test. If
any system can do what we can do, indistinguishably from
the way we do it, it has a mind too. This version of the
Turing Test is immune to Searle's argument (since it is not
implementation independent, Searle cannot "become" the
system by "reading its mind"). More importantly, the
robotic Turing Test can no longer be passed by just a
symbol system (computation alone). It needs a hybrid
symbolic/sensorimotor system: combining both symbol systems
and connectionism. As you will see later, this is also what
Harnad proposes.

Harnad has two examples to illustrate the symbol grounding
problem. The first is Searle's "Chinese Room Argument":

>Searle's simple demonstration that this cannot be so
>consists of imagining himself doing everything the
>computer does -- receiving the Chinese input symbols,
>manipulating them purely on the basis of their shape, and
>finally returning the Chinese output symbols. It is
>evident that Searle (who knows no Chinese) would not be
>understanding Chinese under those conditions -- hence
>neither could the computer.
>Hence, if the meanings of symbols in a symbol system are
>extrinsic, rather than intrinsic like the meanings in our
>heads, then they are not a viable model for the meanings
>in our heads: Cognition cannot be just symbol

Harnad's second example is the Chinese/Chinese Dictionary-Go-Round:

>2.2 The Chinese/Chinese Dictionary-Go-Round
>My own example of the symbol grounding problem has two
>versions, one difficult, and one, I think, impossible. The
>difficult version is: Suppose you had to learn Chinese as
>a second language and the only source of information you
>had was a Chinese/Chinese dictionary. The trip through the
>dictionary would amount to a merry-go-round, passing
>endlessly from one meaningless symbol or symbol-string
>(the definientes) to another (the definienda), never
>coming to a halt on what anything meant.[6]

There are infinitely many possible languages, but only a
finite number of ways to define them (i.e. symbol shapes
meaning different things to different people). For example,
suppose English is your first language, you know everything
there is to know about it, and you are trying to learn
Chinese. "Word loops" can be formed by starting from one
symbol and following the definitions through the dictionary
until you arrive back at the original symbol. An
appropriate property (i.e. a meaning) can then be assigned
to a specific "word loop" using prior first-language
knowledge. Given one symbol from the word loop, it can then
be identified from the assigned meaning. In this way, one
has redefined an established language (as cryptologists
working on ancient languages have managed to do).
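
The "word loop" idea can be sketched in a few lines. The dictionary
below is a hypothetical toy: its entries are invented, deliberately
meaningless tokens, and the function name is my own. The sketch only
shows that following definitions inside a closed dictionary ends in a
loop of symbols, never in a meaning.

```python
# Hypothetical toy dictionary: every symbol is defined only by other
# symbols from the same dictionary (entries invented for illustration).
TOY_DICTIONARY = {
    "sym1": ["sym2"],
    "sym2": ["sym3"],
    "sym3": ["sym2", "sym4"],
    "sym4": ["sym3"],
}

def find_word_loop(dictionary, start):
    """Follow the first symbol of each definition until one repeats.

    Returns the loop as a list of symbols, or None if the walk reaches
    a symbol that has no entry in the dictionary.
    """
    path = []
    word = start
    while word in dictionary:
        if word in path:
            # We are back at a symbol already visited: a word loop.
            return path[path.index(word):]
        path.append(word)
        word = dictionary[word][0]
    return None

loop = find_word_loop(TOY_DICTIONARY, "sym1")
print(loop)
```

Assigning a meaning to the loop still requires something from
outside the dictionary, which in the second-language case is the
learner's prior first-language knowledge.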

>The second variant of the Dictionary-Go-Round, however,
>goes far beyond the conceivable resources of cryptology:
>Suppose you had to learn Chinese as a first language and
>the only source of information you had was a
>Chinese/Chinese dictionary![8] This is more like the
>actual task faced by a purely symbolic model of the mind:
>How can you ever get off the symbol/symbol merry-go-round?
>How is symbol meaning to be grounded in something other
>than just more meaningless symbols?[9] This is the symbol
>grounding problem.[10]

There is no solution to this second variant. How is it
different in our heads? How come the symbols in our mind
mean something? It has to be because some of those symbols
are connected to the things they stand for by the
sensorimotor mechanisms that detect and recognise those
things. Then a dictionary is built in our minds from the
grounded basic vocabulary, by combining and re-combining
the symbols into higher-order categories.

>It is one possible candidate for a solution to this
>problem, confronted directly, that will now be sketched:
>What will be proposed is a hybrid nonsymbolic/symbolic
>system, a "dedicated" one, in which the elementary symbols
>are grounded in two kinds of nonsymbolic representations
>that pick out, from their proximal sensory projections,
>the distal object categories to which the elementary
>symbols refer.
> Connectionism and symbolism's respective strengths will be
>put to cooperative rather than competing use in our hybrid
>model, thereby also remedying some of their respective

A hybrid system will take the strengths of the two
methods, and in theory will be a system that can do more
and more of what we can do, until it is no longer
distinguishable from us and hence passes the Turing Test.

Harnad continues by looking at the behavioural capacities
that such a cognitive model must generate.

>We already know what human beings are able to do. They can
>(1) discriminate, (2) manipulate,[12] (3) identify and (4)
>describe the objects, events and states of affairs in the
>world they live in, and they can also (5) "produce
>descriptions" and (6) "respond to descriptions" of those
>objects, events and states of affairs. Cognitive theory's
>burden is now to explain how human beings (or any other
>devices) do all this.[13]

I agree with Harnad's view on discrimination and identification:

>To be able to discriminate is to be able to judge whether two
>inputs are the same or different, and, if different, how
>different they are. Discrimination is a relative judgment,
>based on our capacity to tell things apart and discern
>their degree of similarity. To be able to identify is to
>be able to assign a unique (usually arbitrary) response --
>a "name" -- to a class of inputs, treating them all as
>equivalent or invariant in some respect. Identification is
>an absolute judgment, based on our capacity to tell
>whether or not a given input is a member of a particular

Harnad then poses the question:

>What sort of internal representation would be needed in
>order to generate these two kinds of performance?

Harnad answers this with iconic and categorical representations:

>According to the model being proposed here, our ability
>to discriminate inputs depends on our forming "iconic
>representations" of them (Harnad 1987b). These are
>internal analog transforms of the projections of distal
>objects on our sensory surfaces (Shepard & Cooper 1982).
>Discrimination is independent of identification. I could
>be discriminating things without knowing what they were.
> For identification, icons must be selectively reduced to
>those "invariant features" of the sensory projection that
>will reliably distinguish a member of a category from any
>nonmembers with which it could be confused. Let us call
>the output of this category-specific feature detector the
>"categorical representation".

Again I agree with Harnad here. To discriminate between
two things you only have to look at them and say what's the
same or different. You don't need to identify them (know
what they are). To identify an object you must be able to
distinguish certain features about it (from learnt
experience) that make it that object.
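
The difference between the two judgments can be sketched as follows.
This is a rough illustration under heavy assumptions: icons are stood
in for by tuples of numbers, the feature layout is hypothetical, and
the "invariant feature" test is hand-written rather than learnt from
experience.

```python
# Icons are stood in for by plain tuples of numbers (an assumption;
# Harnad's icons are analog transforms of sensory projections).

def discriminate(icon_a, icon_b):
    """Relative judgment: how different are two inputs? 0.0 means same."""
    return sum((a - b) ** 2 for a, b in zip(icon_a, icon_b)) ** 0.5

def make_identifier(name, invariant_test):
    """Absolute judgment: return the name if the invariant features hold."""
    def identify(icon):
        return name if invariant_test(icon) else None
    return identify

# Hypothetical feature layout: (height_in_metres, leg_count, stripe_contrast).
identify_horse = make_identifier(
    "horse", lambda icon: icon[1] == 4 and icon[0] > 1.0
)

pony = (1.2, 4, 0.0)
stool = (0.5, 4, 0.0)

difference = discriminate(pony, stool)  # needs no names at all
label = identify_horse(pony)            # needs the category's features
```

Note that `discriminate` never consults a name or a category, whereas
`identify_horse` ignores everything about the icon except the
features that its category test picks out.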

>Note that both iconic and categorical representations
>are nonsymbolic. The former are analog copies of the
>sensory projection, preserving its "shape" faithfully; the
>latter are icons that have been selectively filtered to
>preserve only some of the features of the shape of the
>sensory projection: those that reliably distinguish
>members from nonmembers of a category. But both
>representations are still sensory and nonsymbolic.

Symbols have not been used up to this point. A good model
for this is connectionism (neural nets), as Harnad states
later in his paper. But what about the symbols and rules we
have in our minds? How do we create that dictionary in our
heads? Harnad has this response:

>What would be required to generate these other systematic
>properties? Merely that the grounded names in the category
>taxonomy be strung together into propositions about
>further category membership relations. For example:
>(1) Suppose the name "horse" is grounded by iconic and
>categorical representations, learned from experience, that
>reliably discriminate and identify horses on the basis of
>their sensory projections.
>(2) Suppose "stripes" is similarly grounded.
>Now consider that the following category can be
>constituted out of these elementary categories by a
>symbolic description of category membership alone:
>(3) "Zebra" = "horse" & "stripes"[17]
>What is the representation of a zebra? It is just the
>symbol string "horse & stripes."

Zebra may well be just the symbol string "horse & stripes"
but "zebra" = "horse" & "stripes" is a manipulation
rule/algorithm that is applied, just like "3" = "2" + "1".

>But because "horse" and "stripes" are grounded in their
>respective iconic and categorical representations, "zebra"
>inherits the grounding, through its grounded symbolic
>representation. In principle, someone who had never seen a
>zebra (but had seen and learned to identify horses and
>stripes) could identify a zebra on first acquaintance
>armed with this symbolic representation alone (plus the
>nonsymbolic -- iconic and categorical -- representations
>of horses and stripes that ground it).
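
The composition in Harnad's example can be sketched as below. The
detectors are stand-ins: in his proposal they would be learnt
categorical representations, whereas here they are hand-written
predicates over a hypothetical feature dictionary, and all names are
my own.

```python
# Stand-ins for grounded categorical representations. In Harnad's
# proposal these would be learnt feature detectors; here they are
# hand-written predicates over a hypothetical percept dictionary.
def is_horse(percept):
    return percept.get("shape") == "horse"

def has_stripes(percept):
    return bool(percept.get("stripes", False))

GROUNDED = {"horse": is_horse, "stripes": has_stripes}

# Symbolic level: a new name defined purely as a conjunction of names.
DEFINITIONS = {"zebra": ("horse", "stripes")}

def identify(name, percept):
    """Check category membership, expanding composed names recursively."""
    if name in GROUNDED:
        return GROUNDED[name](percept)
    return all(identify(part, percept) for part in DEFINITIONS[name])

striped_horse_shape = {"shape": "horse", "stripes": True}
plain_horse_shape = {"shape": "horse"}
```

In this sketch "zebra" has no detector of its own; it is recognised
only through the detectors of its parts. It also follows that any
input satisfying both detectors is labelled a zebra, whatever the
position or colour of its stripes.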

This example that Harnad has given does not necessarily
hold. The grounding of "horse & stripes" alone will not
automatically give us a zebra. We have not been told
anything else (i.e. where the stripes are, whether they are
on the horse, where on the horse, what colour they are…).
Say I don't know what a zebra looks like, and armed with
the grounding inherited by the zebra, I see a Cheshire
horse. I have no prior knowledge of what it is, but I know
it looks like a horse, and it's got a stripy (blond and
brown) tail, and stripy legs (again blond just above its
hooves, and the rest brown). Well, to me I see "horse &
stripes" so it must be a "zebra"! There needs to be a bit
more information grounded in here (possibly with

This leads us on to the Credit/Blame assignment problem.
I'm told I'm wrong but I don't know why I'm wrong - it's a
horse with stripes isn't it? I've followed the right rules
and features, which do I need to change? This is the
"blame assignment problem". With only a few features this
is not so hard a problem, but where there exists a huge
number of rules and features this problem will be very
difficult. This is the problem faced by any system that
hopes to scale up to the human-scale learning capacity.
And this is the major problem of trying to do AI with
symbols only. Learning is mostly sensorimotor and
nonsymbolic, and that's why it's necessary to have

If the credit/blame assignment problem is a variant of
the frame problem, and the frame problem a symptom of the
symbol grounding problem, could the credit/blame assignment
problem then be related to the symbol grounding problem as
I have just illustrated?

>Once one has the grounded set of elementary symbols
>provided by a taxonomy of names (and the iconic and
>categorical representations that give content to the names
>and allow them to pick out the objects they identify), the
>rest of the symbol strings of a natural language can be
>generated by symbol composition alone,[18] and they will
>all inherit the intrinsic grounding of the elementary
>set.[19] Hence, the ability to discriminate and categorize
>(and its underlying nonsymbolic representations) has led
>naturally to the ability to describe and to produce and
>respond to descriptions through symbolic representations.

This is where the symbol system comes into play. With the
right input symbols and manipulation rules, the resulting
output symbols will all be meaningfully interpretable.
Combining both connectionism and symbol systems results in
a hybrid system that should be able to pass the Turing
Test, and answer AI's "How?" question: "what is it that
makes a system able to do the kinds of things normal people
can do?"

>This circumscribed complementary role for connectionism in
>a hybrid system seems to remedy the weaknesses of the two
>current competitors in their attempts to model the mind
>independently. In a pure symbolic model the crucial
>connection between the symbols and their referents is
>missing; an autonomous symbol system, though amenable to a
>systematic semantic interpretation, is ungrounded. In a
>pure connectionist model, names are connected to objects
>through invariant patterns in their sensory projections,
>learned through exposure and feedback, but the crucial
>compositional property is missing; a network of names,
>though grounded, is not yet amenable to a full systematic
>semantic interpretation. In the hybrid system proposed
>here, there is no longer any autonomous symbolic level at
>all; instead, there is an intrinsically dedicated symbol
>system, its elementary symbols (names) connected to
>nonsymbolic representations that can pick out the objects
>to which they refer, via connectionist networks that
>extract the invariant features of their analog sensory

In conclusion, I agree with Harnad's theory of the hybrid
system. It is clear that the answer to AI is not solely
symbol systems, as we have seen with the "pen-pal" Turing
Test. Nor can the answer be solely connectionism, as it
is not symbolic whereas our behavioural capacities are. So
the solution could well be the system that Harnad proposes.

>This is still no guarantee that our model has captured
>subjective meaning, of course. But if the system's
>behavioral capacities are lifesize, it's as close as we
>can ever hope to get.

Nadine Watfa

