Hi All,
I hope that with the rest of this term's readings everything will
begin to fall in place (with Penny it clearly seems to have done).
You will find lecture notes from last year's course at:
http://www.cogsci.soton.ac.uk/~harnad/CM302/
I'm appending them to this list directly, in case you want to Q/C
them.
(The readings this year are different, so don't feel obliged to read the
unfamiliar ones mentioned below unless you are especially interested.
Next term we will be reading recent papers on hybrid
systems.)
Chrs, S
WHAT IS COMPUTATION?
First, what was there before computation? Differential Equations and
Dynamical Systems.
The Turing Machine is a hypothetical mechanical device that reads and writes
symbols on a tape according to internal states that causally dictate what it
does. An example would be:
The machine is in state I, which is such that it reads its
input tape at the current position.
(Read input.)
(Input is 0.)
State I is such that if the input is 1, the machine goes into
state J, which is Halt; if the input is 0, the machine goes into
state K, which is: advance the tape, write 1, and Halt.
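Here, for concreteness, is a little sketch of that machine in Python (the
state names and the tape encoding are just illustrative choices of mine,
not part of any formal definition):

    # A sketch of the little machine above. State names and tape encoding
    # are illustrative assumptions; formal definitions are at the links below.
    def tiny_turing_machine(tape):
        """tape: a list of '0'/'1' symbols; returns the tape after halting."""
        pos, state = 0, "I"          # state I: read the current tape square
        if tape[pos] == "1":
            state = "J"              # input 1: go to state J, which is Halt
        else:
            pos += 1                 # input 0: go to state K: advance the tape,
            tape[pos] = "1"          # write 1, and Halt
            state = "K"
        return tape                  # J and K are both halting states here

    print(tiny_turing_machine(["0", "0"]))   # -> ['0', '1']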
For a more formal definition of a Turing Machine, see:
http://i12www.ira.uka.de/~pape/papers/puzzle/node4.html
or
http://obiwan.uvi.edu/computing/turing/ture.htm
or
http://aleph0.clarku.edu/~anil/math105/machine.html
These simple mechanical operations are what computation consists of. They
are things any mindless device can do. They are based on making all
operations explicit and automatic. No "thinking" is required. (That is the
point!)
The Turing Machine is only a hypothetical device. (A digital computer is a
finite physical approximation to it, differing in that its tape is not
infinitely long.)
Implementation-Independence: The physical device that actually implements the
Turing Machine is irrelevant (except of course that it has to be implemented
by a physical device). This fact is critical. It is simple, but often
misunderstood or forgotten, yet, as you will see, it is essential to the
definition of computation. It is also the basis of the hardware/software
distinction.
Let's call the marks on the Turing machine's tape "symbols" (actually
"symbol tokens," because "symbol" really refers to a symbol-type, a kind of
generic pattern, like "A," whereas a symbol token is an actual instance of
A; but we will use "symbol" for both symbol types and symbol tokens except
where the difference matters).
The implementation-independence or hardware-independence of computation is
related to the notation-independence of a formal system: Arithmetic is
arithmetic regardless of what symbol or notational system I use, as long as
the system has the right formal properties (something corresponding
systematically to "0," "+" "=" etc.). This is exactly the same as the
implementation independence of computation: A Turing Machine is performing a
particular computation if it implements the right formal properties. The
physical details matter no more than the details of the shapes of the
symbols in a notational system.
The counterpart of the hardware-independence of computation is the
shape-independence and arbitrariness of symbols and symbol systems: It does
not matter whether I designate "add" by "add," "+," "&" or "PLUS" -- as long
as I use it consistently and systematically to designate adding.
A symbol cannot really be defined in isolation. Or rather, a single symbol,
unrelated to any symbol system, is trivial. (There is a joke about a
wonder-rabbi at his death-bed, with all his disciples gathered together to
hear his last words. The wonder-rabbi murmurs "Life.... is like.... a
bagel." All his disciples are abuzz with the message: "Pass it on: The
wonder-rabbi says life is like a bagel!" The word is passed on till it
reaches the synagogue-sweeper, the lowliest of the flock. He asks:
"Life is like a bagel? How is life like a bagel?" The buzz starts again
as the question propagates back to the deathbed of the wonder-rabbi:
"Wonder-Rabbi, how is life like a bagel?"
The wonder-rabbi pauses for a moment and then says "Okay, so life's not
like a bagel.")
The point is that you can read anything and everything into a point-symbol:
It only becomes nontrivial if the symbol is part of a symbol system, with
formal relations between the symbols. And, most important, the symbol system
must be semantically interpretable. That is, it must mean something; it must
make sense -- systematic sense.
For example, it doesn't matter what symbol you use for "addition" in
arithmetic, but then whenever you refer to addition, you must use that
(arbitrary) symbol, and the strings of symbols in which it occurs must be
systematically interpretable as denoting addition. In particular, "1 + 1"
must equal "2" no matter what notation you use for "1" "+" "=" and "2".
Now we are ready to define computation (the Turing Machine was more an
example than a definition): Computation is symbol manipulation. Nontrivial
computations are systems of symbols with formal rules for manipulating them.
The shape of the symbols is arbitrary. That is just part of a notational
system. But the symbols and the manipulations must be semantically
interpretable: It must be possible to interpret them (systematically, not
point-wise, like the bagel) as meaning something.
For example, arithmetic is a formal symbol system, consisting of its
primitive symbols (0, +, =, etc.) and strings of symbols (axioms) and
manipulation rules (rules for forming well-formed formulas, for making
logical inferences, and for making arithmetic calculations).
See:
http://www.csc.liv.ac.uk/~frans/dGKBIS/peano.html
http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Peano.html
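As a sketch of what "formal" manipulation means here, a minimal rendering
of the Peano rules for addition (the tokens "Z" for zero and "S" for
successor are my own illustrative choices):

    # Peano-style addition as pure symbol manipulation (illustrative
    # tokens: ("Z",) for zero, ("S", n) for the successor of n).
    ZERO = ("Z",)
    def succ(n):
        return ("S", n)

    def add(m, n):
        """The two Peano rules: m + Z = m;  m + S(n) = S(m + n)."""
        if n == ZERO:
            return m
        return succ(add(m, n[1]))

    ONE = succ(ZERO)
    TWO = succ(ONE)
    assert add(ONE, ONE) == TWO    # "1 + 1 = 2", derived by shape alone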
Now there is one and only one way to interpret all the axioms, theorems and
calculations of arithmetic: as referring to numbers and their properties.
(There are also so-called "nonstandard" interpretations, but these only
apply to things very much like numbers and isomorphic to them in critical
respects.) Unlike "Life is like a bagel," Peano's arithmetic symbol
system has, to all intents and purposes, only one coherent interpretation.
It doesn't make sense if interpreted as a military manual, a planetary map,
or a Shakespeare play (and vice versa). This systematic mappability into
meanings and vice versa is the central property of symbol systems.
Symbol systems are also compositional: They consist of elementary symbols
that are combined and recombined according to the symbol manipulation rules.
Yet all the (well-formed) combinations are semantically interpretable too,
and all the interpretations cohere. This is not a trivial property. It is
easy to invent an arbitrary code, consisting of symbols and symbol
manipulation rules. It is much harder to invent one in which the symbols and
symbol combinations all make sense.
And, conversely, it is hard to take an undeciphered symbol system that does
have a unique, nontrivial interpretation, and decipher it so as to find that
interpretation.
All formal or artificial symbol systems (including all of mathematics and
logic, computer programmes, and artificial "languages") are subsets of
natural language: We don't change languages when we begin to talk
"geometry," "boolean algebra," or "C," we simply use a specialized subset of
the vocabulary of English.
So all formulas are really sentences in English (or any other natural
language). This means that natural language is the "mother of all symbol
systems."
THE CHURCH/TURING THESIS
The Turing machine is one attempt to formalise the mathematician's working
notion of "computation." The mathematician has an idea of what it is that
he is doing when he is doing mathematics. There are differences of opinion
among mathematicians about what mathematics is about. Four schools of
thought (at least) about the foundations of mathematics exist:
(1) The Realists (also called "Platonists") hold that mathematics is about
the eternal truths of the universe: that numbers, for example, are
properties of the world too, except that they are even more real than
objects, because what is true of objects only happens to be true in the
actual world, whereas what is true of numbers is true in every possible
world, on pain of contradiction.
(2) The Formalists do not think numbers (or other mathematical entities)
really exist, like objects. They think that their properties are just the
formal consequences of the formal rules we choose to adopt (i.e., which
symbol strings we take as axioms, which symbol-manipulation rules we adopt
for making deductions from the axioms, etc.). ("Formal" means based on
arbitrary shape conventions or notations, as we discussed earlier in
connection with symbol systems.)
(3) The Logicists are really a special kind of formalist, in that they too
think that mathematical objects are just formal, rather than real, but they
think all of mathematics can be reduced to and derived from logic, which is
to say a particular formal system, whereas the formalists just say it's all
formal, with no commitment to reducibility to logic. One thing that all
formalists, including logicists, have in common is that they emphasise
"apagogic" proof: proving that things are true or that things "exist" on
pain of contradiction (i.e., because otherwise it would lead to a
contradiction). You may remember the method of proof called "reductio ad
absurdum." For the logicists, and for all formalists, this is the model for
what a proof really is.
(4) The Intuitionists (also called "Constructivists") think that
mathematical objects such as numbers are neither real, as the realists
hold, nor merely formal, as the formalists/logicists hold. They think they
are ideas constructed by human minds. They differ from the formalists (and
perhaps the realists) in that they don't believe that a theorem is true, or
that a mathematical object exists, merely because it can be shown that if it
did not exist, that would lead to a contradiction. They believe only in things
that have been proved by a construction, that is, an algorithm that will
actually find or generate the object in question. [The famous example of the
function that exists for everyone else but not for intuitionists is the one
that takes the value "1" if, somewhere in the decimal expansion of pi there
occurs a string of 7 consecutive 7's (i.e., pi =
3.1416.........7777777...), and the value "0" if it does not. No one knows
whether there is such a string of 7's in pi so no one knows whether or not
the search for it in the decimal expansion would ever come to a halt. For
the intuitionists, this is not a well-defined mathematical function, even
though we know, on pain of contradiction, that it would have to have either
value 0 or 1.]
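Here is a sketch of that function in Python, for the curious (the digit
generator is Gibbons' well-known unbounded "spigot" algorithm). Notice that
a negative answer from any finite search proves nothing, which is exactly
the intuitionists' complaint:

    # Digits of pi via Gibbons' unbounded spigot algorithm.
    def pi_digits():
        q, r, t, j = 1, 180, 60, 2
        while True:
            u, y = 3*(3*j + 1)*(3*j + 2), (q*(27*j - 12) + 5*r) // (5*t)
            yield y
            q, r, t, j = (10*q*j*(2*j - 1), 10*u*(q*(5*j - 2) + r - y*t),
                          t*u, j + 1)

    def seven_sevens_within(n_digits):
        """True if '7777777' occurs in the first n_digits of pi. False
        proves nothing: the string might still occur further on. No
        construction guarantees an answer, which is why intuitionists
        reject the function as well-defined."""
        window, digits = "", pi_digits()
        for _ in range(n_digits):
            window = (window + str(next(digits)))[-7:]
            if window == "7777777":
                return True
        return False

    print(seven_sevens_within(10000))   # a finite, inconclusive search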
Now I mention these 4 foundational views because one of them, formalism,
looks so much like what we have said about symbol systems, and that is no
coincidence, because symbol systems are formal systems (and symbol shapes
are just the shapes used in a formal notational system).
All mathematicians are formalists at least to this extent:
(a) They all agree that a statement is false if it leads to a contradiction.
(b) They all agree that mathematics involves "computation,"
but until the 20th century that was just an intuitive idea.
Computation was thought of as an "effective procedure," a way to get a
result mechanically and unambiguously.
The formalists (who, by the way, are the vast majority) had the further
belief, expressed most prominently by David Hilbert, that it will eventually
be possible to compute all the theorems that follow from axioms: That
mathematical proof is computational, and that all mathematical truths are
computable.
Goedel, as you probably know, went on to show famously that Hilbert's
formalist programme for mathematics was doomed to fail because even in
arithmetic it was provable that there were always truths that were
unprovable within any particular formalisation (axiomatisation) of
arithmetic. We will not be talking about that, although some thinkers
(including Lucas
<http://cogprints.soton.ac.uk/abs/phil/199807022> and Penrose) have thought
Goedel's proof showed that all of intelligence could not be captured by
computation.
But another thing Goedel did was to produce another formalisation of
mathematicians' intuitive notion of "computation," and his formalisation
turned out to be equivalent to Turing's. The logician Alonzo Church (and
another, Emil Post) also had a go, and although their formalisations all
looked different, they too turned out to be just notational variants of the
Turing Machine.
Given that all these independent attempts to capture what mathematicians
have in mind by "computation" all turned out to be equivalent, it was
natural to propose the thesis (and note that it is a "thesis" or conjecture,
rather than a theorem that has been or can be proved) that the Turing
Machine and its various equivalent variants all capture what is meant by
computation, and hence that anything a mathematician can "compute" can be
computed by a Turing Machine. This has come to be known as the
Church/Turing Thesis.
TURING EQUIVALENCE
So according to the C/T Thesis, computation is "universal." Whatever can be
computed at all, can be computed by a Turing Machine (i.e., a symbol system
with the right symbols and rules). This also led to the idea of the
"Universal Turing Machine" and "Turing Equivalence," for if everything that
is computable is Turing-computable, then for every computation, in whatever
symbol system, there should be an equivalent Turing Machine version of the
computation (and the Universal Turing Machine was simply one that could
perform any computation).
Computation, however, is just formal, and a Turing Machine is just an
abstraction. To make computation happen in the real world, you need a
physical device to do it, something very like a Turing Machine, but with the
physical details needed to make it work. The modern digital computer, with
the stored-programme architecture due to von Neumann, is a physical
approximation to such a machine. It is a programmable symbol manipulator, a
Universal Turing Machine that can be transformed, via software, into any
particular Turing Machine (and particular symbol system).
So far this is standard stuff, known (if only vaguely) to every computer
scientist. Here is something new: What is the relation between Universal
Turing Machines and other things, both other machines (such as bridges,
cars and planes), and other physical systems in the world (such as galaxies,
gases, molecules, and avalanches)? If, according to the C/T Thesis,
computation captures mathematics, does it capture physics and engineering
too? Are machines and other physical systems also Turing-Equivalent to some
symbol system?
The answer is yes: Apart from (1) true continuity (which calls for the
differential equations of Newton rather than discrete difference equations),
plus (2) turbulence and (3) quantum effects, digital computers can simulate
anything. This means symbol systems and algorithms can formally encode
anything and everything (and even 1-3, to an approximation). You might call
this the Physical version of the C/T Thesis: A computer can not only do any
computation, hence "imitate" any other computer, it can also imitate any
other physical system.
But to be able to imitate anything and everything is not to be able to be
anything and everything. It is good to remind yourself now and then that a
computer can simulate flight, but it cannot fly; it can simulate evaporation
and liquefaction, but it cannot evaporate or liquefy; it can simulate a
fire, but not by burning.
"Turing Equivalence" means computational or formal equivalence. That means
equivalence up to a systematic interpretation. A symbol system, if it is the
right symbol system (i.e., if it really is a set of symbols and rules
systematically interpretable, squiggle for squiggle, as corresponding to
some other system in the physical world), can be designed to be equivalent,
for example, to the solar system. So it will encode the positions,
movements, and other properties of the sun and planets to as close an
approximation as we like (and as we have the algorithms for). We can then
make it work as a calculating oracle, cranking out the positions of the
planets at any time of the year or the millennium. Or, if we want to
translate the symbols not just into a verbal interpretation in terms of
planetary position, but into something that looks to our senses like the
planets (viewed through a powerful telescope, for example), the semantic
interpretability of the symbols can be cashed in as a Virtual Reality
simulation. That is all part of Turing Equivalence -- the fact that,
property for property, a symbol system will formally match what it is
equivalent to.
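As a toy sketch of such an oracle (Python; the circular orbit and all the
parameters are simplifying assumptions of mine, not a real ephemeris):

    # Toy "planetary oracle": a discrete symbol system cranking out orbital
    # positions for any time. Circular orbit and parameters are assumptions.
    import math

    def planet_position(t_days, period_days=365.25, radius_au=1.0):
        """The output floats are systematically interpretable as a
        position in space -- but they are symbols, not a planet."""
        angle = 2 * math.pi * (t_days % period_days) / period_days
        return (radius_au * math.cos(angle), radius_au * math.sin(angle))

    print(planet_position(100))       # "where is the planet" at day 100
    print(planet_position(365250))    # ... or a millennium from now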
Turing Equivalence leads naturally to the Turing Test.
THE TURING TEST
Each of you should read Turing's Paper on this topic. Even though he did
not put it in words that were immune to misinterpretation, the paper has
become a classic.
Turing was interested in whether a machine could be "intelligent." Normally,
we only use this word to describe people and some animals, and normally we
think of intelligence as almost synonymous with "having a [conscious] mind."
One way to settle the matter of whether machines can be intelligent is
simply to drop these two features we normally ascribe to intelligence (that
only people/animals with minds have it) and simply say that intelligence is
merely what it takes to be able to do what has hitherto required a person or
an animal with a mind to do (until today, when we can build very capable
machines).
This solution -- to simply define "having intelligence" as "being able to do
the kinds of things that could until now only be done by organisms with
minds" -- is appealing to some people, but it does trivialise the matter.
For, according to this definition, a thermostat is intelligent (it can turn
on a furnace when it's cold, and off when it's warm, something that only
people could do until now), and so is a desk calculator, etc.
Is there another approach, one that does not trivialise intelligence? We can
always pick a more difficult task, and say "a machine is only intelligent if
it can do this," but that sounds arbitrary: What do we know, in advance,
about "intelligence," whatever that is, that allows us to pick and choose
tasks in deciding whether or not the executor is intelligent?
Turing did not propose to define intelligence. (A good idea not to do so, in
advance of having any real idea of what it is.) In fact, one interpretation
of Turing's paper is that he suggested forgetting about what intelligence
was altogether, and simply pushing on with getting machines to do all the
things we can do. That's not bad advice, and in the end it's probably the
methodological moral of his paper, but I think he can be interpreted as
saying something more substantial than that.
He tries to influence our intuitions about intelligence by having us imagine a
party game in which a man and woman leave the room and interact with us only
through messages on paper. The game is to figure out which is the woman and
which is the man. (They try to fool us.) The intuition comes here: He
suggested that if we continued to exchange messages with both players --
indeed, if we played many games, with many pairs of candidates like this --
we would sometimes be right about which was the man and which the woman, and
sometimes wrong. But suppose, unbeknownst to us, sometimes one of the
candidates was neither a man nor a woman, but a machine -- but one capable
of interacting with us in the same way.
Turing suggested that if the machine never did anything to make us suspect
it was a machine -- if we kept guessing that it was a man, or a woman, as
the case may be, but never that it was neither, but rather a machine -- then,
when the game was over, if we were told that it had been a machine, we would
really have no non-arbitrary reason for revising our judgments about it. Our
judgments might be wrong about whether it was a man or a woman, just as they
might be wrong about a real man or woman, but what basis do we have for
saying that they are wrong about the fact that it was a person, with a mind,
with intelligence? In discovering it was a machine, what have we really
learnt?
Turing might have added a bit more to clarify this intuition, and strengthen
the case for his conclusion that intelligence is as intelligence does, and
that this is no less true of us than it is of machines. For a "machine" is
merely a man-made physical system, obeying the cause/effect laws of the
universe like everything else in the universe. Surely the property of being
"man-made" or not has nothing to do with being intelligent. If I
had been cobbled together in a lab, would that make me any the more or less
intelligent?
So Turing might have been more explicit about the fact that he was pumping
our intuitions about our ignorance about what machines really are, hence
what is or is not one; also about our ignorance about what does and does not
have a mind. Each of us knows in his own private case that he has a mind,
but how do we know about anyone else, other than on the basis of what they
do? This has been called the "other-minds" problem by philosophers, and it
turns out that the way we solve it from day to day is by Turing Testing.
Does anyone know another way? Does anyone have a periscope for peering into
a candidate to see whether he has a mind? (Do brain scans do that? But how
do we know that?)
Turing might also have been more careful to point out that although he
introduced it as a party game, he was not talking about games or trickery,
and that far from being a one-time test, Turing Testing is the game of life:
It is not enough to fool a few people briefly at a party. The candidate
would have to be indistinguishable ("Turing-Indistinguishable") from any of
the rest of us for a lifetime.
And even the out-of-sight, message-passing version of the Turing Test in
this paper is (arbitrarily) restricted, for if the candidate is to be
Turing-Indistinguishable from us in what it can do, then there is a great
deal that we can do besides just sending and receiving messages: Messages
are, after all, just strings of symbols. That is what we get from a
life-long pen-pal, but people can also see and hear one another, and, more
important than what people look like (for other-minds testing is based more
on what we can do -- including what we can say and write -- than on what we
look like) people can also interact with the world of objects that their
strings of symbols are systematically interpretable as being about.
In other words, for full generality, the Turing Test should be thought of in
its robotic version, not merely its disembodied pen-pal version, for
something more substantial than intuitions rides on that difference: For the
pen-pal version of the Turing Test (let's call it T2) could in principle be
passed by just a computer implementing a symbol system, symbols in and
symbols out, whereas the robotic version (T3) necessarily also calls for the
capacity to interact causally with the world that those symbols are
interpretable as being about, and it could not be passed by just a computer
receiving, manipulating and sending symbols. The difference is critical, and
Searle's Chinese Room Argument is based on this critical difference (though
Searle does not quite seem to realise or admit it).
SYMBOLISM VS DYNAMISM
Last term you had Bob's course on neural nets and hybrid modeling, so you
should have a practical idea of what the difference between the
logical/symbolic and neural-net approaches to AI and cognitive modeling
amounts to in practice. We will now turn to a closer analysis of what it
amounts to in theory. First, have a look at the readings under "SYMBOLIC AI
CRITIQUES OF CONNECTIONISM" in
<http://www.cogsci.soton.ac.uk/~harnad/topics.html> There you will find
theorists disagreeing about the scope and limits of the two approaches. What
underlies this disagreement?
Let's look at the distinction in terms of what we have learned already:
First we have to ask what is and is not computation, and then we have to ask
what neural nets are (we already know that logical/symbolic computation is
computation!)
Computation is implementation-independent, systematically interpretable,
symbol manipulation. Symbols are just arbitrary objects. They are
manipulated according to rules or "algorithms" that are applied on the basis
of the shape of the symbols, not on the basis of what the symbols mean
(i.e., not on the basis of their interpretation, yet they are nevertheless
systematically interpretable as meaning something). The shapes of the
symbols are arbitrary: they neither resemble nor are physically connected in
any way with what it is that they mean; they are merely a notational system.
And all symbol systems, no matter what their shape, are equivalent if they
follow the same symbol-manipulation rules. This is the sense in which symbol
systems are independent of their physical implementations: Their shapes are
arbitrary, and systems with completely different shapes may still be exactly
the same symbol system, if the rules are the same.
The intuitive example you should keep in your mind whenever you think of a
symbol system is binary digits, 0's and 1's (because all other symbol
systems can be translated into 0's and 1's, among other possibilities) and
the simplest kind of manipulation rule might be: replace all 0's by 1's.
That is a (trivial) symbol system. Any physical implementation of it -- that
is, any system that will actually perform according to that rule -- would be
able to take, as input, any string of 0's and 1's, and produce, as output, an
equally long string consisting of all 1's.
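A sketch of that trivial system (Python; the function name is mine):

    # The trivial symbol system just described: replace all 0's by 1's.
    # The rule operates on shape alone; no "understanding" is involved.
    def replace_zeros(tape):
        return "".join("1" if symbol == "0" else symbol for symbol in tape)

    assert replace_zeros("0110") == "1111"   # equally long output, all 1's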
The idea is that the symbol manipulation rule is mechanical; that means that
any mindless machine could do it, without needing to "understand" what it
was doing, or why. The remarkable property of such mindless, mechanical,
symbol-manipulation systems is that, nevertheless, the symbols and
manipulations (if they are not trivial or arbitrary ones) can be
systematically interpreted. The system does not understand, yet what it is
doing is understandable (and useful) to someone who is capable of
understanding, and has some use for what it is doing.
Now it is obvious that logical/symbolic programmes are examples of symbolic
computation. What is not an example of symbolic computation? Well one
general set of examples is the very things that computations are
interpretable as being about: in general, unless the computation is itself
meant to be about symbols (i.e., about arbitrarily shaped objects that are
interpretable as standing for still further objects), objects themselves are
not computational.
Apples are not symbols or computations. Symbol systems can be systematically
interpretable as being about apples; they may be systematically
interpretable as being like apples in every respect. (If there is any apple
property missing, you can fix the algorithm so it includes that too.) But,
unlike apples, which are not just systematically interpretable as apples
but really are apples, symbol systems are just systematically
interpretable as apples. They are not apples. And apples are not symbol
systems. Nor is just about any other object under the sun that you might
think of (except, say, a computer, that happens to be implementing a symbol
system that is systematically interpretable as, say, an apple!).
So most objects (e.g., apples) are not symbol systems. Any object can be
used as a symbol, of course, so, for example, an apple could be part of a
symbol system. It could stand for, for example, a banana, just as the symbol
"0" could stand for a banana. But it should be obvious that then its shape
would be irrelevant to what it was being used for. So even an apple that is
being used as one of the arbitrary objects in a symbol system is not itself
a computational object, as an apple, because the physical details of its
shape are irrelevant. Any other shape would have done the same job, just as
well.
Now, assuming that it is clear that not every object is a symbol system
(though symbol systems can probably be devised that are systematically
interpretable as just about any object), it follows that if we are using an
object as a model for something or other, if it is the object's physical
properties that are doing the work (i.e., the object is not just being used
as an arbitrary symbol in a symbol system), then the model is not a
computational model. In this sense (and this is important), so-called
"analog" computation is not really computation (or at least it is not
symbolic computation, and symbolic computation is what we mean by
computation in this course; it is also what Goedel, Church, Turing, von
Neumann, etc. meant by "computation" too).
So a sun-dial, for example, is not a computer, even though it "computes" all
the times of day for us. Analog computers manage to deliver the results they
deliver because of their physical "shapes." This means that they are not
shape-independent, or implementation-independent, and hence do not "compute"
in the formal, classical, Turing-Machine sense. In the same sense, a system
whose performance is explained by a set of differential equations is not a
computer either. In fact, a fairly good rule of thumb is that if a
system's performance is best explained as conforming to a set of
differential equations rather than as implementing a computer programme,
then the system is not a computer.
In general, physics is the science of dynamical systems -- physical systems
that change in time. Newtonian mechanics explains the dynamics of things
like billiard ball interactions and celestial mechanics. Quantum mechanics
explains the interactions on the subatomic scale. None of the systems in
question -- billiard balls, stars, electrons -- is a symbol system, and the
explanations are not computational. (It is almost always true that one can
do a discrete computational simulation of any dynamical system, but that is
only an approximation, and, like the simulation of an apple, it is not the
dynamical system.)
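For instance (a sketch with assumed numbers), the differential equation for
a falling ball can be approximated by a discrete difference equation; the
simulation ends with a number, not a fallen ball:

    # Discrete difference-equation approximation of a continuous dynamical
    # system: a ball falling from 100 m. Step size and values are assumptions.
    g, dt = 9.81, 0.001            # gravity (m/s^2), time step (s)
    x, v, t = 100.0, 0.0, 0.0      # height (m), velocity (m/s), time (s)
    while x > 0:                   # Euler steps for x' = v, v' = -g
        x, v, t = x + v*dt, v - g*dt, t + dt
    print(round(t, 2), "s to fall")    # ~4.5 s: an approximation, not a fall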
Computers are of course dynamical systems too: every physical system is a
dynamical system. But their dynamics is irrelevant to what they are
computing, because computation is implementation-independent [ =
dynamics-independent]. A computation must be physically implemented somehow,
to be sure -- even if it is only by a person doing the symbol manipulations
on paper -- but the dynamical details of the implementation can vary wildly:
the difference can be as big as the difference between a nuts and bolts PC
pushing flip-flops and a flesh and blood person pushing a pencil on paper.
So there is a true difference between dynamical and symbolic models. Now we
at last come to the question of neural nets: If something is explained using
a neural net, is that a symbolic/computational or a dynamic/noncomputational
model?
First let's set aside two tricky trivial cases:
(1) It is known that a neural net architecture can be used to implement a
symbol system. This is irrelevant, because computation is
implementation-independent. If the only use to which you are putting your
neural net in your "hybrid" symbolic/nonsymbolic model is that you are using
it as the hardware to perform the computations of the symbolic component,
then you don't have a hybrid model at all, but a symbolic one (and you have
added to it, for some unknown reasons, some details of your particular
implementation of the symbol system, details that we know are irrelevant,
because we could implement the same symbol system in wildly different ways).
(2) Most neural nets are not "real" neural nets: They are not really
parallel, distributed, systems of nodes, with interconnections of
continuously variable and mutable strength. In reality, they are
computational simulations of parallel, distributed nets. In other words,
they are symbol systems that are interpretable as neural nets. That's fine,
but then the question arises: For what the models are meant to be able to do
-- for the functional capacity that they are meant to deliver -- is there
anything about a real parallel, continuous, distributed, modifiably
interconnected hardware that is essential? Or could a discrete, serial
system (like the net simulation itself) have done the same job? For if so,
then the "neural net" is really just a symbolic algorithm, and the "hybrid"
model is really just a combination of different symbolic algorithms.
How could parallelness or distributedness be essential to a system, rather
than merely one arbitrary way of implementing something that was serial,
symbolic, and very fast? One case is suggested by the actual nervous system.
There is some evidence that global oscillations (EEG) might play a role in
regulating brain activity. This means that brain waves going on
simultaneously in different regions interact to produce a global pattern.
This would not be possible if the areas were only active serially.
But don't strain your brains too hard to find cases where some dynamical
property is essential to a system, because it almost never is. That's one of
the implications of the Church/Turing Thesis: Computation can approximate any
other system as closely as you like. The most famous case in which people
tried to show that a system had to be analog was mental rotation: Roger
Shepard used a computer to generate unfamiliar new 3-D geometric shapes. He
showed pairs of them to human subjects at various spatial orientations and
asked them to judge whether they were the same or different. When he showed
two that were the same, but one was rotated into a different orientation
from the other, the amount of time it took subjects to report that they were
the same was directly proportional to the degree that they were rotated. It
was natural to conclude that the way they solved the problem was to
"mentally" rotate one to see whether it would match the other.
Now a mental image rotation is an analog process, and that was what Shepard
concluded that the brain must be doing (and he was right, as was later
confirmed by more complicated studies, including brain scans during the
task). But that did not stop the computationalists from saying "Not
necessarily: It could be a fast, serial, digital approximation, operating
on, say, the numerical coordinates of the vertices with an algorithm, rather
than by any analog shape rotation."
And the computationalists were right. It could have all been done by a
symbolic algorithm (although then the correlation between the degree of
rotation and the reaction time would have required a more strained, ad hoc
explanation, because there is no reason the algorithm would be more complex
or time-consuming, the greater the degree of rotation!). But the fact is
(almost certainly) that it is not done by a serial, symbolic algorithm in
the brain, for the simple reason that the analog implementation would be so
much more economical and efficient than the serial one in this case.
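Here is a sketch of the computationalists' alternative (Python; the shape
and the supplied rotation angle are simplifying assumptions of mine). The
matrix computation costs the same whatever the angle, which is why the
observed time/angle correlation would have needed the strained explanation:

    # Sketch of the computationalists' alternative: match two shapes by
    # rotating vertex coordinates with a matrix. Toy 2-D case in which the
    # candidate angle is supplied; the coordinates are invented.
    import math

    def rotate(points, angle):
        c, s = math.cos(angle), math.sin(angle)
        return [(c*x - s*y, s*x + c*y) for (x, y) in points]

    def same_shape(a, b, angle):
        """Do a's vertices, rotated by angle, coincide with b's?"""
        return all(math.isclose(px, qx, abs_tol=1e-9) and
                   math.isclose(py, qy, abs_tol=1e-9)
                   for (px, py), (qx, qy) in zip(rotate(a, angle), b))

    square = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    print(same_shape(square, rotate(square, 0.7), 0.7))   # True -- and it
    # takes the same time at 0.07 or 0.7 radians, unlike human subjects.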
And so it might be with neural nets: For some problems, real parallelism and
distributedness might be more economical and efficient than any digital
approximation. And in such cases you would be better off with a hybrid model
than a purely symbolic one.
But can we do better than that? Are there cases where it is essential that a
system be analog (e.g., parallel/distributed) rather than simply more
economical and efficient?
One big domain in which such cases can be found is in sensorimotor
systems: systems whose function is to transduce optical or acoustic or
mechanical energy, or to generate movement output. A retina, for example,
transduces photons, not unlike the way a synthetic photoreceptor does. There
are no computational options for this: If you simulate the light and you
simulate the retina, then you can simulate the transduction too. But if the
system has to deal with real photons, then it has to have real transducers.
Exactly the same is true for motor output systems: In a virtual world,
movement can be simulated symbolically. But if the system has to move around
in the real world, there is no computational substitute for real motor
effectors.
Remember the symbol grounding problem? the problem that squiggles and
squoggles don't have any meaning? that their meanings are just projected
onto them by the mind of their interpreter/users? The proposed solution to
the symbol grounding problem (which had the added advantage of being immune
to Searle's Chinese Room Argument) turned out to be sensorimotor
grounding: The meanings of symbols need to be grounded in robotic
capacities: in robotic interactions with the objects, events and states that
the symbols are interpretable as standing for.
So perhaps it is not coincidental that sensorimotor transduction/actuation
is the paradigmatic case of nonsymbolic function. What might neural nets
have to do with this?
Let's consider the most elementary relation between a symbol and an
object: The symbol stands for the object: "Apples" stands for apples.
(Symbols are objects too, but their physical details as objects are
irrelevant to this; the objects are just used for an arbitrary notational
system.) How is the meaning of "apples" grounded in the objects it stands
for, namely, apples?
Let us consider a system (we won't say yet whether it is symbolic, dynamic,
or hybrid): We want the symbols in that system to be "grounded." We don't
want it to be subject to the criticism: "Its symbols only seem to mean what
they mean because an outside interpreter interprets them as such: It is
nice, and nontrivial, that they can indeed be systematically interpreted as
such, but that is not enough, because without the mediation of the
interpreter, there is no connection whatsoever between the symbols and the
objects they refer to." So how could we design a system that had symbols
like "apples," that were systematically interpretable as referring to
apples, but a system in which the connection between "apples" and apples was
autonomous and direct, rather than being dependent on the mediation of an
outside interpreter?
Who knows? But here is one candidate: Suppose the system was a robot, one
that could go about in the world, and it could learn (just as we do), what
objects are called "apples," and what sensorimotor interactions we can
have with them (pick them, eat them, call them "apples"). You could step
aside from such a system and say: "So you think its symbol "apples" just
means apples because I can interpret it that way? Well then let me step
aside, and get out of the loop, so to speak, and you just watch it interact
with the real world of apples, and see whether the connection between its
symbol "apples" and apples is just in my head, or in its head too!"
First, notice that we now have something much more than just a bunch of
symbols that can be given a systematic interpretation. We have that too, but
not just that. We also have the objects the symbols stand for, and we have
interactions between the system and those objects. And those interactions
are direct and autonomous: It doesn't pick, eat, and name apples only
because I'm interpreting it that way: It really picks, eats and names
apples.
Now a robot that could do that (and only that) would only be an impressive
toy. Even on the purely symbolic side of its capacities, "apples" would not
really be systematically interpretable as meaning apples, because all it
could do was pick, eat and name apples. That is trivial, and could be done
by countless different systems. And if you re-labeled everything, it could
be interpreted as referring to bananas, and if you relabeled it yet again, it
could be referring to prime numbers. Which is to say that it really wasn't
even systematically interpretable as referring to anything at all.
This is where the Turing hierarchy comes in: For whereas that criticism
(that it's just a trivial toy) is valid enough for the apple-talk model, it
loses more and more of its force as we scale up toward T2 (on the inside)
and T3 (on the outside, in its robotic capacities). For the symbols inside a
T3 robot would be as grounded as the symbols in the head of its interpreter
(or you, or me).
Where do neural nets come in? They are a natural candidate (though not the
only one) for making the "connection" between (1) the shadow that the
outside object casts on the robot's sensorimotor transducer surfaces and (2)
its internal symbols. If there is one thing that neural nets do well, it is
pattern learning: They can learn the mapping between a pattern and its name
(or, more complexly, a sensorimotor interaction with it, such as taking it,
eating it, beating it up, fleeing from it, mating with it, etc.). For the
connection between object and symbol is not the connection between one
unique input and one unique output. It is the connection between a kind of
thing, and its name.
The pattern falling on a robot's sensorimotor surfaces -- say, its optical
transducer surface -- is usually the 2-dimensional projection of a 3-D
object of some kind. A neural net would have to learn to reliably get from
that kind of optical pattern (for typically, the same kind of object will
cast many potential shadow patterns) to the arbitrary symbol that is the
object's name, and to the many nonarbitrary motor patterns corresponding to
all the kinds of things the robot might need to learn to do with the object
from which the sensorimotor projection originates. It is not for nothing
that neural net modeling is also called "connectionism," and the connections
are not just between internal units, but also between analog input
configurations and analog output configurations -- with the possibility of
further internal configurations in between, both analog and symbolic.
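As a minimal sketch of that kind of pattern-to-name learning (Python; the
two-feature "shadows," the labels and all the parameters are invented for
illustration):

    # Minimal pattern-to-name learning: a perceptron connecting invented
    # sensor "shadows" (two features) to the arbitrary name "apple".
    def train(samples, epochs=50, lr=0.1):
        w = [0.0, 0.0, 0.0]                        # two weights plus a bias
        for _ in range(epochs):
            for (x1, x2), label in samples:
                out = 1 if w[0]*x1 + w[1]*x2 + w[2] > 0 else 0
                err = label - out                  # perceptron learning rule
                w = [w[0] + lr*err*x1, w[1] + lr*err*x2, w[2] + lr*err]
        return w

    # Round-ish red things (label 1, "apple") vs. others (label 0):
    data = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.3), 0), ((0.2, 0.1), 0)]
    w = train(data)
    name = lambda x1, x2: "apple" if w[0]*x1 + w[1]*x2 + w[2] > 0 else "not-apple"
    print(name(0.85, 0.85), name(0.15, 0.2))       # -> apple not-apple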
Now, in principle, because of the Church/Turing Thesis, the very next stage
after the sensory input projection could be digital and symbolic, all the
way through to the very last stage before the motor output projection, but
the need for even as simple a capability as mental rotation [and T3 entails
a lot more than that] already calls for internal analog projections and
transformations too. So it begins to become clear why a grounded T3 system
would need to be hybrid through and through.