Re: Harnad: The Symbol Grounding Problem

From: Shynn Chris (
Date: Thu Mar 01 2001 - 16:06:14 GMT

This paper, written by Stevan Harnad, sets out the symbol grounding
problem and the difficulties it creates for formal symbol systems if
they are to be taken as a complete cognitive theory. The paper starts by
sketching a candidate solution to the problem, in which symbol systems
are grounded in sensorimotor capacity through what he calls 'iconic
representations' of objects and events. The sketched solution builds
from these representations up to higher-order representations which
describe concepts such as category membership.

> Connectionism is one natural candidate for the mechanism that
> learns the invariant features underlying categorical representations,
> thereby connecting names to the proximal projections of the distal
> objects they stand for. In this way connectionism can be seen as a
> complementary component in a hybrid nonsymbolic/symbolic model of
> the mind, rather than a rival to purely symbolic modeling.

Firstly Harnad describes connectionism as a natural candidate
mechanism for learning which features of objects are constant, and thus
for grounding a symbol system in those constant, categorising features.
He proposes that a purely symbolic system cannot ground these features
because it has no sensorimotor capacity, and so states that this
capacity should be supplied within a hybrid system.

> The mind is a symbol system and cognition is symbol manipulation.
> The possibility of generating complex behavior through symbol
> manipulation was empirically demonstrated by successes in the field
> of artificial intelligence (AI).

Next Harnad goes on to describe cognitivism, which was able to
incorporate higher-order actions and concepts into its model. This
is shown to be much preferable to behaviorism, which declared that
the subject matter of psychology was to be limited to observables,
which were seen as self-explanatory.

> A symbol system is:
> 1.a set of arbitrary "physical tokens" (scratches on paper, holes on
> a tape, events in a digital computer, etc.) that are
> 2.manipulated on the basis of "explicit rules" that are
> 3.likewise physical tokens and strings of tokens. The rule-governed
> symbol-token manipulation is based
> 4.purely on the shape of the symbol tokens (not their "meaning"),
> i.e., it is purely syntactic, and consists of
> 5."rulefully combining" and recombining symbol tokens. There are
> 6.primitive atomic symbol tokens and
> 7.composite symbol-token strings. The entire system and all its
> parts -- the atomic tokens, the composite tokens, the syntactic
> manipulations both actual and possible and the rules -- are all
> 8."semantically interpretable:" The syntax can be systematically
> assigned a meaning (e.g., as standing for objects, as describing
> states of affairs).

Harnad then goes on to describe a purely symbol-based system and the
set of criteria which define the system and its behaviour in an
explicit and technical sense. He describes the system as purely a set
of tokens which are manipulated in accordance with explicit rules, on
the basis of their shapes alone and not their meanings.
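
As a rough illustration of the criteria quoted above (my own toy sketch,
not anything taken from the paper), here is a tiny system whose rules
mention only the shapes of tokens. The token names and the two rules are
invented for the example; nothing in the code depends on what "A" or "D"
might be interpreted as meaning.

```python
# Hypothetical toy symbol system: tokens are arbitrary strings, and the
# rules refer only to token shapes, never to any interpretation of them.
RULES = [
    (("A", "B"), ("C",)),   # wherever the tokens A B occur, write C
    (("C", "C"), ("D",)),   # wherever the tokens C C occur, write D
]

def rewrite_once(tokens):
    """Apply the first rule (in RULES order) that matches anywhere,
    at its leftmost match. Return None if no rule applies."""
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == lhs:
                return tokens[:i] + list(rhs) + tokens[i + n:]
    return None

def run(tokens):
    """Rewrite until no rule matches. Both rules shorten the string,
    so the loop always terminates."""
    tokens = list(tokens)
    while True:
        nxt = rewrite_once(tokens)
        if nxt is None:
            return tokens
        tokens = nxt

result = run(["A", "B", "A", "B"])   # A B A B -> C A B -> C C -> D
```

Any interpretation of the tokens (say, "D" standing for some state of
affairs) would have to be supplied by whoever reads the output, which is
the point Harnad goes on to make.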

> Symbolists emphasize that the symbolic level (for them, the mental
> level) is a natural functional level of its own, with ruleful
> regularities that are independent of their specific physical
> realizations. For symbolists, this implementation-independence is
> the critical difference between cognitive phenomena and ordinary
> physical phenomena and their respective explanations.

Here Harnad describes one of the main features of a purely symbolic
system, that of implementation independence. Since the symbols are
manipulated solely on the basis of their shape and not their meaning,
the system may be run on any implementation and should operate in
exactly the same way.

> A thermostat may be interpreted as following the rule: Turn on the
> furnace if the temperature goes below 70 degrees and turn it off if
> it goes above 70 degrees, yet nowhere in the thermostat is that
> rule explicitly represented. Wittgenstein (1953) emphasized the
> difference between explicit and implicit rules: It is not the same
> thing to "follow" a rule (explicitly) and merely to behave "in
> accordance with" a rule (implicitly).

Here Harnad describes the difference between following rules explicitly
and implicitly. He states that it is not enough for a system like the
thermostat to act implicitly in accordance with a rule for the system
to be considered symbolic in the technical sense described above. Yet
to me this poses the problem of whether we as human beings follow rules
or merely act in accordance with them. Surely some of our human
characteristics follow rules explicitly, as they are built into
our genetic profile, e.g. the hairs on our arms standing on end when we
are cold. Yet can the human mind be seen as explicitly following
a set of preset rules defined by genetic knowledge? Or are we merely
acting in accordance with rules built up subconsciously throughout our
lifetime?
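
To make the explicit/implicit distinction concrete, here is a small
sketch of my own (the function names and the rule format are invented,
and a real thermostat is of course not a Python function). The first
version merely behaves in accordance with the rule; the second stores
the rule as a string of tokens and interprets it.

```python
# Implicit: the device behaves in accordance with the rule, but the rule
# is not stored anywhere as tokens; it is just how the device is built.
def implicit_thermostat(temp_f):
    return temp_f < 70          # True means "furnace on"

# Explicit: the rule is itself a string of tokens that an interpreter
# reads. (The token format here is invented for this sketch.)
RULE = ("IF", "TEMP", "<", 70, "THEN", "ON", "ELSE", "OFF")

def explicit_thermostat(temp_f, rule=RULE):
    _if, _var, op, threshold, _then, on_true, _else, on_false = rule
    if op != "<":
        raise ValueError("this sketch only interprets '<' rules")
    action = on_true if temp_f < threshold else on_false
    return action == "ON"

# From the outside the two cannot be told apart by behaviour alone.
same = all(implicit_thermostat(t) == explicit_thermostat(t)
           for t in (50, 69, 70, 71, 90))
```

Only the second version has a rule that could be read off, or swapped
for a different one, without rebuilding the device, which seems to be
what following a rule explicitly requires.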

> Now, much can be said for and against studying behavioral and brain
> function independently, but in this paper it will be assumed that,
> first and foremost, a cognitive theory must stand on its own merits,
> which depend on how well it explains our observable behavioral
> capacity. Whether or not it does so in a sufficiently brainlike way
> is another matter, and a downstream one, in the course of theory
> development.

In this section of the paper Harnad is describing connectionist systems
and how connectionist systems, which are also described as neural nets,
attempt to provide a theory of brain function. However, Harnad states
that because so little is known about brain function, creating a system
to mimic it is needless. I agree with this, as it could be seen from
the point of view of Turing: if a system is indistinguishable in its
behaviour from an intelligent being then it is intelligent, and the
processes within the brain of the being or Turing candidate are
inconsequential to the

> It is far from clear what the actual capabilities and limitations of
> either symbolic AI or connectionism are. The former seems better at
> formal and language-like tasks, the latter at sensory, motor and
> learning tasks, but there is considerable overlap and neither has
> gone much beyond the stage of "toy" tasks toward lifesize behavioral
> capacity.

In this passage Harnad states that the current level of ability is at
the 'toy' stage, yet does not define what this level is or indeed what
is the level of the 'lifesize' tasks he mentions, leaving this section

> Moreover, there has been some disagreement as to whether or not
> connectionism itself is symbolic. We will adopt the position here
> that it is not, because connectionist networks fail to meet several
> of the criteria for being symbol systems, as Fodor & Pylyshyn (1988)
> have argued recently. In particular, although, like everything else,
> their behavior and internal states can be given isolated semantic
> interpretations, nets fail to meet the compositeness (7) and
> systematicity (8) criteria listed earlier

I agree with Harnad on this point. Although connectionist systems
perform symbol manipulation, they cannot be fully represented by symbol
systems: even though many neural nets are implemented by symbol
systems, they have physical properties as well. This is explained well
in the footnote provided by Harnad.

> Searle challenges the core assumption of symbolic AI that a symbol
> system able to generate behavior indistinguishable from that of a
> person must have a mind.

Here Harnad is introducing the symbol grounding problem by showing it
in action. His first example of the problem is Searle's Chinese room
argument, which shows that meaning must be intrinsic to the symbol
system itself and not just to the human beings watching the system at
> But the interpretation will not be intrinsic to the symbol system
> itself: It will be parasitic on the fact that the symbols have
> meaning for us, in exactly the same way that the meanings of the
> symbols in a book are not intrinsic, but derive from the meanings in
> our heads. Hence, if the meanings of symbols in a symbol system are
> extrinsic, rather than intrinsic like the meanings in our heads, then
> they are not a viable model for the meanings in our heads: Cognition
> cannot be just symbol manipulation.

Here Harnad formalises the argument against cognition being modelled
solely by formal symbol manipulation: in formal symbol manipulation the
symbols are manipulated according to their shapes, and not their
meanings. I agree with Harnad that a system modelling the mind must
have intrinsic knowledge of the meaning of its symbols, yet meaning
itself must also be defined for this argument to be complete.
Meaning could plainly be seen as relating symbols to other symbols
in the system, but this enters the infinite loop of the symbol
grounding problem: to find the meaning of a single symbol you
would have to look up the meanings of the symbols which give its
meaning, and then their meanings, and so on. This part
of the problem is then described in the next section of the paper.

> The second variant of the Dictionary-Go-Round, however, goes far
> beyond the conceivable resources of cryptology: Suppose you had to
> learn Chinese as a first language and the only source of information
> you had was a Chinese/Chinese dictionary![8] This is more like the
> actual task faced by a purely symbolic model of the mind: How can
> you ever get off the symbol/symbol merry-go-round? How is symbol
> meaning to be grounded in something other than just more meaningless
> symbols?[9] This is the symbol grounding problem.

In this section of the paper Harnad describes his examples of the
symbol grounding problem and their manifestations. I think that this
is the perfect example of the symbol grounding problem as it occurs
in purely symbol-based systems which have no intrinsic knowledge of
the symbols they are manipulating. Without some sort of grounding
to give at least some of the symbols a meaning, the lookup
loops infinitely.
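
The merry-go-round can be sketched in a few lines (again my own toy
example: the dictionary entries and the function name are invented). A
symbol counts as groundable only if everything in its definition is
already groundable, starting from whatever set of symbols is assumed to
be grounded by some means outside the dictionary.

```python
# Hypothetical symbol-only dictionary: every entry is defined purely in
# terms of other entries, as in a Chinese/Chinese dictionary.
DICTIONARY = {
    "horse": ["animal", "mane"],
    "animal": ["living", "thing"],
    "living": ["animal"],
    "mane": ["hair"],
    "hair": ["thing"],
    "thing": ["object"],
    "object": ["thing"],
}

def groundable(dictionary, grounded):
    """Return every symbol whose definition bottoms out in `grounded`.

    Repeatedly adds a symbol once all symbols in its definition are
    known, and stops when a full pass adds nothing new."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for symbol, definition in dictionary.items():
            if symbol not in known and all(d in known for d in definition):
                known.add(symbol)
                changed = True
    return known

nothing = groundable(DICTIONARY, set())                 # no way in
partial = groundable(DICTIONARY, {"thing"})             # some symbols
everything = groundable(DICTIONARY, {"thing", "living"})
```

With nothing grounded from outside, no symbol ever acquires a meaning,
however long the lookups go on. Grounding "thing" alone is still not
enough for "horse", because "animal" and "living" only define each
other.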

> Many symbolists believe that cognition, being symbol-manipulation,
> is an autonomous functional module that need only be hooked up to
> peripheral devices in order to "see" the world of objects to which
> its symbols refer (or, rather, to which they can be systematically
> interpreted as referring).[11] Unfortunately, this radically
> underestimates the difficulty of picking out the objects, events and
> states of affairs in the world that symbols refer to, i.e., it
> trivializes the symbol grounding problem.

In this part of the paper Harnad shows that many symbolists believe
the symbol grounding problem to be easily solved by adding sensorimotor
input. Although this would be, in my opinion, a step in the right
direction, I agree with Harnad that the symbol grounding problem is
being underestimated. For example, human beings, with all our
complex brain functions, take many months to learn how to talk
and to learn the meanings of specific sounds. Computers, which have no
understanding of the meanings of their symbols at all, are like
newborns who need to learn the nuances of language. However, with
human beings it is not understood whether we begin life with genetic
knowledge which provides the grounding for us, or whether we
learn how to ground the symbols through experience.

> To be able to discriminate is to able to judge whether two inputs are
> the same or different, and, if different, how different they are.
> Discrimination is a relative judgment, based on our capacity to tell
> things apart and discern their degree of similarity. To be able to
> identify is to be able to assign a unique (usually arbitrary)
> response -- a "name" -- to a class of inputs, treating them all as
> equivalent or invariant in some respect. Identification is an
> absolute judgment, based on our capacity to tell whether or not a
> given input is a member of a particular category.

Here Harnad details both discrimination and identification, which are
both human capacities. The hybrid system described earlier would be
capable of each of them to some degree, but nowhere near as
detailed or precise as the human mind. Neural nets work well for
discrimination as they are good classifiers and pattern recognisers,
but even these pale in comparison to the capacity of humans for
recognising simple objects. For both discrimination and
identification computers must employ very complex algorithms for
image analysis, whereas we humans do all this subconsciously, with the
brain merging the two pictures from our eyes to give us depth
perception. Because this is done subconsciously by humans and is
contained in the genetic knowledge we carry in our DNA, I feel that
this should be explored before we will be able to correctly give
computers the power of sight.
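
The difference between the two judgments can be shown with a very
simple sketch. This is only my own illustration: the two-number
"inputs", the prototypes and the nearest-prototype rule are all
assumptions standing in for whatever mechanism a real system would use,
and they are not Harnad's proposal.

```python
import math

def discriminate(a, b):
    """Relative judgment: how different are two inputs?
    Returns a distance, where 0.0 means they are the same."""
    return math.dist(a, b)

# Hypothetical category prototypes in a made-up two-feature space.
PROTOTYPES = {"horse": (1.0, 0.0), "zebra": (1.0, 1.0)}

def identify(x):
    """Absolute judgment: assign a single name to one input, here by
    picking the nearest prototype (an assumption of this sketch)."""
    return min(PROTOTYPES, key=lambda name: math.dist(x, PROTOTYPES[name]))

difference = discriminate((0.0, 0.0), (3.0, 4.0))
name = identify((0.9, 0.1))
```

Note that discriminate() needs two inputs and returns a degree of
difference, while identify() needs only one input and returns a name.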

> According to the model being proposed here, our ability to
> discriminate inputs depends on our forming "iconic representations"
> of them (Harnad 1987b). These are internal analog transforms of the
> projections of distal objects on our sensory surfaces (Shepard &
> Cooper 1982). In the case of horses (and vision), they would be
> analogs of the many shapes that horses cast on our retinas.

According to Harnad, discrimination depends upon being able to form
iconic representations of objects. I believe that an iconic
representation overall is an amalgam of the information all five
of our senses give us about the object, because even when we lose the
power of one sense, such as sight, we are still able to recognise
objects using our other four senses. This amounts to a huge amount
of sensory data which our brains must hold and be able to access at
a moment's notice, all of which builds up part of what Harnad calls a
'categorical representation' of the object.

> In a world where there were bold, easily detected natural
> discontinuities between all the categories we would ever have to
> (or choose to) sort and identify -- a world in which the members of
> one category couldn't be confused with the members of any another
> category -- icons might be sufficient for identification. But in our
> underdetermined world, with its infinity of confusable potential
> categories, icons are useless for identification because there are
> too many of them and because they blend continuously[15] into one
> another

Here Harnad argues that the icons he describes cannot be used for
identification alone but should only be part of the overall
representation of the object. He puts forward that since our
environment is a world of infinite variety, iconic representations
are not enough, as there are too many icons and therefore too many
categories. He also says that the iconic representations should be
reduced to a set of invariant features which are unique and would
provide the information for the categorical representation, which is
the definitive categoriser for that object. As well as this he states
that since evolution is not fail-safe many of these representations
must come from our life experiences, and I agree with this statement.

> Note that both iconic and categorical representations are
> nonsymbolic. The former are analog copies of the sensory projection,
> preserving its "shape" faithfully; the latter are icons that have
> been selectively filtered to preserve only some of the features of
> the shape of the sensory projection: those that reliably distinguish
> members from nonmembers of a category.

Here Harnad states that the representations described above are both
non-symbolic. Both of the representations could be modelled well using
a neural net, which uses weight patterns to categorise inputs.
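
The idea of filtering an icon down to its invariant features can be
sketched without a neural net at all. The following is a deliberately
crude stand-in of my own (the feature tuples and function names are
invented, and exact matching replaces anything a net would learn): it
keeps only those features which are constant across members of a
category and never shared by a nonmember.

```python
def invariant_features(members, nonmembers):
    """Indices whose value is the same for every member and is not
    found at that index in any nonmember."""
    keep = []
    for i in range(len(members[0])):
        values = {icon[i] for icon in members}
        if len(values) == 1 and all(icon[i] not in values
                                    for icon in nonmembers):
            keep.append(i)
    return keep

def categorical(icon, keep):
    """Filter an icon down to the features that pick out the category."""
    return tuple(icon[i] for i in keep)

# Made-up icons of the form (legs, striped, colour).
horses = [(4, False, "brown"), (4, False, "grey")]
zebras = [(4, True, "white"), (4, True, "black")]

keep = invariant_features(horses, zebras)      # only "striped" survives
filtered = categorical((4, False, "black"), keep)
```

Here the number of legs is dropped because zebras share it, and colour
is dropped because it varies among horses, so the filtered
representation is much smaller than the icon it came from.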

> Nor is there any problem of semantic interpretation, or whether the
> semantic interpretation is justified. Iconic representations no more
> "mean" the objects of which they are the projections than the image
> in a camera does. Both icons and camera-images can of course be
> interpreted as meaning or standing for something, but the
> interpretation would clearly be derivative rather than intrinsic.

Here it is shown that even though the above representations can be used
to identify or classify objects, neither represents the meaning of
the object.

> What would be required to generate these other systematic
> properties? Merely that the grounded names in the category taxonomy
> be strung together into propositions about further category
> membership relations. For example: "Zebra" = "horse" & "stripes"
> What is the representation of a zebra? It is just the symbol string
> "horse & stripes." But because "horse" and "stripes" are grounded
> in their respective iconic and categorical representations, "zebra"
> inherits the grounding, through its grounded symbolic representation.

Here Harnad puts forward the notion that given grounded categories
such as 'horse' and 'stripes', it is relatively easy to define
a new category called 'zebra' which is a conjunction of the two
grounded categories. This new category would then inherit the grounding
of the previous two categories, making it a completely valid category.
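
The way 'zebra' inherits its grounding can be sketched as follows. The
detectors and the dictionary-style "icons" are invented for this
example and merely stand in for the iconic and categorical
representations in the paper; only the composition step follows the
quoted passage.

```python
# Hypothetical grounded detectors: each maps a (made-up) sensory icon
# directly to a verdict, with no further symbols involved.
GROUNDED = {
    "horse": lambda icon: icon["legs"] == 4 and icon["mane"],
    "stripes": lambda icon: icon["striped"],
}

# Purely symbolic definitions: new names composed from existing names.
DEFINITIONS = {"zebra": ("horse", "stripes")}

def applies(name, icon):
    """Decide whether `name` applies to `icon`. Grounded names consult
    their detector; composed names hold when all their parts hold."""
    if name in GROUNDED:
        return bool(GROUNDED[name](icon))
    return all(applies(part, icon) for part in DEFINITIONS[name])

striped_animal = {"legs": 4, "mane": True, "striped": True}
plain_animal = {"legs": 4, "mane": True, "striped": False}

is_zebra = applies("zebra", striped_animal)
is_not_zebra = applies("zebra", plain_animal)
```

No detector was ever written for "zebra", yet the name can still be
applied to inputs, because every symbol in its definition ends in a
detector.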

> Once one has the grounded set of elementary symbols provided by a
> taxonomy of names (and the iconic and categorical representations
> that give content to the names and allow them to pick out the objects
> they identify), the rest of the symbol strings of a natural language
> can be generated by symbol composition alone,[18] and they will all
> inherit the intrinsic grounding of the elementary set.

What Harnad puts forward here is that once a basic set of symbols has
been grounded all other symbols may be inferred from these. In the same
way that all words are made up from the basic set of twenty-six letters
of the alphabet, a basic set of symbols could be used to describe any
object. I agree with this, as this sort of optimisation can be seen in
multiple instances, from computer programs to human learning.

> (does it meet the eight criteria for being a symbol system?) and one
> behavioral test (can it discriminate, identify and describe all the
> objects and states of affairs to which its symbols refer?). If both
> tests are passed, then the semantic interpretation of its symbols
> is "fixed" by the behavioral capacity of the dedicated symbol system,
> as exercised on the objects and states of affairs in the world to
> which its symbols refer; the symbol meanings are accordingly not
> just parasitic on the meanings in the head of the interpreter, but
> intrinsic to the dedicated symbol system itself. This is still no
> guarantee that our model has captured subjective meaning, of course.
> But if the system's behavioral capacities are lifesize, it's as
> close as we can ever hope to get.

Harnad still hasn't defined exactly what he means by lifesize. However,
I like the approach Harnad has taken to this problem and I agree with
him that a hybrid system is probably the best route for AI to take now.
I believe that human beings are a combination of sensorimotor input and
a symbol system, even if it is only a symbol system because that is the
way we have decided to learn and process information. I also agree that
these tests may not capture whether a system has grasped the intrinsic
meaning of objects as yet, but do we fully understand the programs we
create? I do not think we do, and until we do fully understand the
programs we create I do not think we will fully achieve AI's aims.
At present humans believe plants to be unintelligent because we can
see no characteristics of intelligence within them, yet what scale are
we measuring them on? Intelligence could be everywhere we look and
all we lack is the means to communicate with it in a meaningful way,
so I do not believe we will achieve full AI until we understand
exactly what we are, and what we are doing.
