Harnad, S. (1993) Grounding Symbols in the Analog World with Neural Nets. Think 2: 12-78 (Special Issue on "Connectionism versus Symbolism", D.M.W. Powers & P.A. Flach, eds.). Pp. 57-62.
It is ironic that the Symbol Grounding, Connectionist/Symbolist and Chinese Room / Artificial Intelligence debates hang so much on definitions -- Symbol Grounding points out that definitional systems (the sense in which symbolic systems are used in that context) are severely limited: they are inherently circular. Similarly, Cognitive Linguistics points out that objectivity is prejudiced by the metaphor-like mechanisms which underlie both syntax and semantics, these being the mechanisms which are responsible for grounding [Lako87, John92, Powe92b].
In Cognitive Science, people with different backgrounds have definitional systems grounded via different metaphors. This is perhaps most obvious in relation to discussion of analog systems and signals, but also in terms of the definition of computation or mind. There is also the problem that some bandy about terms like "Turing equivalence" and "(Universal) Turing Machine" with only a very vague (and often incorrect) notion of what lies behind them (and often without realizing that they have nothing to do with the Turing Test). Even Harnad's Section 5.5 is unfortunately expressed so as to gloss over the distinction between the programmed behaviours being identical and the capabilities of the underlying machine (substrate) being the same, irrespective of program.
So I would like to address some of these definitional problems here, picking up on Harnad's discussion in Sections 1 and 2.
Harnad first addresses the `hypothesis' that cognition is computation, which forms the one side of the "computationalism vs. connectionism" chasm. A major problem is that people from the computationalist camp don't see it as a hypothesis, but as self-evident and axiomatic. The question for them is not `if' or `whether', but `how' and `when', computation will succeed in achieving the various forms of cognitive behaviour: given the Turing equivalence of computational and connectionist models (Harnad Section 2), the debate is simply ridiculous and reflects a lack of understanding of the nature of computation on the part of those who see the chasm. What real alternative is there to computation as a model of cognition?
Harnad clarifies what is meant by computation as follows:
"manipulation of physical `symbol tokens' on the basis of syntactic rules that operate only on the `shapes' of the symbols (which are arbitrary ...)" (1.2);
"implementation independent: whatever properties or capabilities a system has purely in virtue of its computational properties will be shared by every implementation ..." (1.3);
"the symbols and symbol manipulations in a symbol system are SYSTEMATICALLY INTERPRETABLE ... they can be assigned a semantics ..." (1.4).
This is an important foundation for the discussion, although expressed in non-computationalist terms. Moreover, there are riders: "operate only on the shapes", "purely in virtue of its computational properties", "can be assigned a semantics". These provide possible loopholes which lead us into controversy.
First, does the substance or the nature of a neural net add capability beyond its computational properties? Two possible answers emerge in Harnad's article: parallelism and analog processing (2.1.3, 5.6). Second, can nets operate on anything other than the shapes which the representation permits? Harnad's suggestion is that the mechanisms of interaction with the outside world, the transduction, may somehow be qualitatively different for the nets (see e.g. 6.3, 7.1, 8.1.5). Finally, can a systematic interpretation, a semantics, be totally internal, in terms of relationships between one representation or language and another, all represented, in the last analysis, with the same symbols (e.g. bits)? Symbol Grounding says no (by definition interpretation implies representing relationships across different systems), so we're back to the transduction question!
These questions take us into the area of implementation. Harnad's answers to them are PROBABLY NOT, PROBABLY and NO. Mine would be NO, NO and NO! Note that the final agreement on NO still leaves room for some differences: can a system be indirectly grounded, through a user or programmer, or by copying a brain state, or by isolating the brain from its sensory-motor environment? Once there is no sensory-motor connection, the question becomes academic -- it then aptly fits Harnad's comparison with a stone: there is no discernible difference between mind and stone when there is no communication.
I wonder whether the differences on these questions relate to what we mean by `capability' or `can'. These hide theoretical and pragmatic questions which I would like to elucidate by distinguishing between efficacy and efficiency. Parallel and analog systems may be faster, and thus able to achieve something digital computers may not, simply because a million neurons working in parallel may be able to achieve more than a single CPU working a million times faster with operations a thousand times less complex or less relevant. Such a computer can simulate the neural net, at a cost of a thousand operations times a million neurons per step. A couple of billion microseconds is of the order of an hour, whereas a couple of thousand milliseconds is just a couple of seconds. But in terms of achieving the required result, they are equivalent, and if we could build a serial computer a billion times faster, we could achieve the same result in real time.
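To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python, using the illustrative figures above (round numbers for the sake of the argument, not measurements):

    # Back-of-the-envelope version of the serial-simulation argument above.
    NEURONS = 1_000_000        # neurons updating in parallel, roughly once per millisecond
    OPS_PER_NEURON = 1_000     # serial operations needed to simulate one neuron update
    CPU_OP_TIME = 1e-6         # seconds per operation on a microsecond-scale serial CPU
    NEURON_STEP_TIME = 1e-3    # seconds per update step of the real (parallel) net

    serial_time = NEURONS * OPS_PER_NEURON * CPU_OP_TIME   # time to simulate one step
    slowdown = serial_time / NEURON_STEP_TIME              # factor slower than real time

    print(f"one parallel step, simulated serially: {serial_time:.0f} s (~{serial_time/60:.0f} min)")
    print(f"slowdown relative to the net itself:   {slowdown:,.0f}x")
    # Same result either way; only the efficiency differs.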
Considering transducers, neural networks provide a model which takes us right through to retina or cochlea, effector or sensor, all implemented using the same cellular stuff. But silicon technology also extends right through to the sensors - even the retina makes use of special pigments, and the cochlea special hairs. No sensors are just ordinary neurons, though, as there is no such thing as an ordinary neuron. There are many different sorts, finitely many, but many more than are reflected in any connectionist implementation I know of. Given the right symbolic computer program, Harnad's TTT robot could be built entirely from mechanical/chemical/electronic components within our current technology, apart from questions of efficiency (in which I include all the tradeoffs of speed, size, resolution and complexity). Anything human sensors measure, from smell to temperature, can be measured even more precisely with current technology - but digital or analog thermometers, not to mention electron microscopes, gas chromatographs and the like, don't yet fit in the space occupied by the average neuron.
In both the neural and the electronic case, precision is lost at each level of transduction or processing. The intercorrelations in the physical vision processes may occur more remotely from the transduction in robot vision, or in more limited fashion, but there is no reason why they cannot be emulated adequately to achieve comparable behaviour, setting aside the practicalities of size and speed (viz. efficiency).
Harnad also raises (1.5) the question of whether the researcher's goals, in the direction of either Artificial Intelligence or Cognitive Modelling, make a difference to our perspective on these issues. They certainly do, but they don't change the issues. If, for the purposes of Cognitive Modelling, the focus is on developing systems which are accurate at some deeper level than the surface behaviour, then certain possible AI systems will automatically be excluded. Some phenomenon of this sort certainly does appear to be present in this type of debate, when systems exhibiting identical behaviour (e.g. Searle himself, versus a Chinese speaker in a room emulating him) are judged to have different computational properties. The `charitable assumption' by which we judge others to be like ourselves, innocent until proven guilty, would let us attribute a mind to a TTT robot, given that we couldn't, and wouldn't, have any basis to think otherwise - irrespective of what Searle in the driver's seat thinks about the matter - irrespective of whether it was designed by a connectionist or a computationalist - irrespective of whether it was designed as an exercise in Machine Intelligence or Cognitive Science.
Parallel Distributed Processing has come to be synonymous with Connectionism, but Parallel Processing and Distributed Processing go back well beyond the coining of either term. All vision systems, whether symbolically motivated or connectionist, must examine relationships between the pixels in a region in order to extract features. They perform systolic algorithms to extract outlines. They look for relationships between different regions to track movement or make comparisons. They are increasingly being implemented on parallel systems in which one processor per pixel is the ideal being approached. Weather simulation is another such area, as are wind-tunnel and aerodynamic simulation.
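As a concrete illustration of the kind of pixel-neighbourhood relationships involved (a minimal sketch of my own, not drawn from any particular vision system), consider extracting an outline from a tiny binary image:

    # Mark a pixel as part of the outline if it is set but at least one of its
    # 4-neighbours is not (or lies off the image).
    image = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]

    def outline(img):
        h, w = len(img), len(img[0])
        edges = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                if not img[y][x]:
                    continue
                neighbours = [img[ny][nx]
                              for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                              if 0 <= ny < h and 0 <= nx < w]
                if len(neighbours) < 4 or any(n == 0 for n in neighbours):
                    edges[y][x] = 1   # some neighbour lies outside the region: boundary pixel
        return edges

    for row in outline(image):
        print("".join("#" if p else "." for p in row))
    # Each pixel's test is independent of the others, so one processor per pixel
    # could in principle perform them all at once.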
Statistical techniques are also proving surprisingly effective in relation to Natural Language, Speech Recognition and Machine Translation (see [Powe92c]). These techniques seek out correlations in a way which is very similar to the correlating effect in neural nets. A lattice of possible choices for individual words may admit a multitude of possible parses, which are evaluated in parallel.
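For example (a hypothetical miniature of my own, not the technique of [Powe92c]), even a two-slot word lattice admits several readings, which can be scored side by side rather than one at a time:

    # A word lattice: each position offers scored alternatives, and every path
    # through the lattice is a candidate reading.
    from itertools import product

    lattice = [                                       # hypothetical recognizer output
        [("recognize", 0.6), ("wreck a nice", 0.4)],
        [("speech", 0.7), ("beach", 0.3)],
    ]

    readings = []
    for path in product(*lattice):                    # all ways through the lattice
        words = " ".join(word for word, _ in path)
        score = 1.0
        for _, p in path:
            score *= p                                # combine the word scores
        readings.append((score, words))

    for score, words in sorted(readings, reverse=True):
        print(f"{score:.2f}  {words}")
    # 0.42  recognize speech
    # 0.28  wreck a nice speech
    # 0.18  recognize beach
    # 0.12  wreck a nice beach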
Harnad mentions parallelism, real parallelism, as one aspect of connectionism which has been proposed as a candidate to explain the expectation that connectionism will succeed in cognition whereas symbolism must fail.
I would rather point the finger at the distributional aspects, and at the logical parallelism which I will call concurrency, since it is of no account whether it is implemented with real or simulated parallelism. The real neural networks which do our computation for us implement distributed concurrent processing with real parallelism. But concurrency and distributed processing are being investigated in the context of symbolic processing too. I have been working with Concurrent Logic Programming Systems (one based on a connection graph theorem prover) and have implemented both connectionist and conventional programs for machine learning of natural language in this context [Powe88, Powe89].
If there are multiple interactive tasks to perform which are naturally concurrent, that is, which necessarily overlap, then parallelism, real or simulated, is absolutely necessary to meet the specifications. Of course such parallel simulation capability is built into every timesharing operating system or environment (like UNIX, or Windows, or X-Windows). Moreover it is becoming a standard part of conventional languages (e.g. it was designed into the languages SIMULA and ADA, and is possible in most modern PROLOGs). Furthermore, there is virtually no computing environment today that doesn't allow or require peripheral processing to occur in parallel with central processing: that is, the peripherals interrupt the current process with a priority dictated by their speed and the urgency with which they need to be serviced (e.g. even PCs and Macintoshes have this sort of concurrency as standard).
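Here is a minimal sketch of what I mean by simulated parallelism (in Python, for convenience, rather than SIMULA, ADA or a concurrent PROLOG): several naturally concurrent tasks interleaved on a single processor, with the same behaviour as real parallelism and only the timing (efficiency) differing.

    # One 'central' task and two 'peripheral' tasks, interleaved by the scheduler.
    import threading
    import time

    def central_processing():
        for step in range(3):
            print(f"central: step {step}")
            time.sleep(0.05)              # other tasks get serviced in the meantime

    def peripheral(name, delay):
        time.sleep(delay)
        print(f"{name}: interrupt serviced")

    tasks = [threading.Thread(target=central_processing),
             threading.Thread(target=peripheral, args=("keyboard", 0.02)),
             threading.Thread(target=peripheral, args=("disk", 0.08))]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()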
It is true to say that processing of distributed relationships is necessary for cognition, as for vision. Lots of separate pieces of information need to be combined or related. Symbolic systems tend to put the emphasis on combination, and connectionist systems on correlation, but this is not a hard and fast rule.
In practice, my experience is that neural correlations do admit a symbolic analysis, in terms of which particular neurons or synapses can be identified as labeling particular patterns or implementing particular rules [Powe89]. Moreover, an information theoretic analysis, and a consideration of the cognitive mechanisms available, does suggest that this should be expected [Powe92b,c].
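To illustrate the kind of analysis I mean (a toy example of my own, far simpler than the analyses reported in [Powe89]): a single unit trained on logical AND ends up with weights and a threshold that can be read off directly as a symbolic rule.

    # A perceptron learns AND; its weights then admit a symbolic reading.
    examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w, bias, rate = [0.0, 0.0], 0.0, 0.1

    for _ in range(20):                                   # simple perceptron learning
        for (x1, x2), target in examples:
            out = int(w[0] * x1 + w[1] * x2 + bias > 0)
            err = target - out
            w[0] += rate * err * x1
            w[1] += rate * err * x2
            bias += rate * err

    print("weights:", w, "bias:", bias)
    # At convergence, neither weight alone outweighs the negative bias, but the two
    # together do: the unit fires only when both inputs are on, i.e. it implements
    # the rule x1 AND x2.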
Any neural net implementation of an arbitrary cognitive capacity can be matched AT LEAST AS EFFICIENTLY by a non-connectionist implementation on the same SEQUENTIAL hardware. My experience here is that the connectionist systems are easier and faster to program, and are indeed shorter `programs', but that the symbolic versions are easier to tune, faster to execute, and indeed require less memory.
I believe the neural implementation can't beat the conventional because (as they say in chess circles, "assuming best play from black"):
1. Neural nets are based on a fixed set of higher level `subroutines';
2. Given the generic neural network runs on the hardware, we can optimize our application specifically in ways which will hide the neural network origin;
3. We can analyze the behaviour of the net and obtain the correlations more directly, with efficient indexing, etc.;
4. The neural network would provide distributional compositional properties and expose symbolic substructure;
5. I can't lose the bet unless neural network X is an optimal solution to the problem, and even then I could reframe it in terms of another model.
In other words, once we are down to efficiency as the only grounds for distinguishing computational and connectionist models, neural networks are just one particular computational model, and while this model is being simulated on conventional computers, the race just isn't on!
Finally I want to come back to the definition implicit in this debate. Turing asked the question `Can a computer think?' Searle and Harnad have changed it to `Can a computer have a mind?' Note the different nuances. After 50 years of computers we are far more inclined to apply the word "think" to them, whether by virtue of metaphor, concession or charitable assumption. Mind, soul and spirit have been nebulous, even nefarious, words for centuries. The phrase `electronic brain' for computers has been and gone.
But there is more to the change of wording than this. Mind focusses on consciousness in a far more direct way than thought. Turing addressed thought in terms of whether the computer could hold its own with people in a particular sort of problem solving, and was indeed far ahead of his time in recognizing that it was the `simple' aspects of intelligence, like language, that were going to be more difficult than the `advanced' intelligence reflected in, for example, chess playing. Searle changes the question to whether the computer is conscious of itself, and maps this down to an assumed homunculus.
The shift towards making a digital-analog distinction also hides some misconceptions and changes in definition. In Turing's day, we had analog computers which `reasoned' in ways which contrasted with digital computers in two respects.
1. Digital systems counted in some number system, and represented things with a finite number of symbols (the base of that number system, two as a rule today). Analog systems manipulated `continuous' functions (but still faced limits on resolution).
2. Analog systems got their name because they worked by analogy, that is to say lengths or levels implemented in one way (electrical, hydraulic, etc.) were used to represent variables of a totally different sort (e.g. cannonshot weights and ranges). Digital systems represented values numerically, and couldn't compete in speed against analog computers well matched to the problem (but the discussion has also lost sight of the fact that analog systems were used as simulations of other systems).
So what role does Harnad assign the term `analog' in 2.1.3? He is referring to the CONTINUOUS nature of the input and output of neurons. But there is no such thing as continuity at the quantum level. Neural processing is mediated by the exchange of ions and neurotransmitter molecules, and deviates from the continuous ideal at an even higher level. Continuous signals tend to drift; references or states formed by complex interaction functions are necessary for stability. Even chaotic behaviour, as experienced in unstable systems, can be simulated digitally (e.g. mathematically).
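For instance (a minimal sketch using the standard logistic map rather than any neural model), a few lines of ordinary digital arithmetic are enough to exhibit chaotic divergence:

    # The logistic map at r = 4 is chaotic: two trajectories starting a billionth
    # apart separate rapidly, yet every step is plain digital arithmetic.
    def logistic(x, r=4.0):
        return r * x * (1.0 - x)

    a, b = 0.2, 0.2 + 1e-9
    for step in range(1, 41):
        a, b = logistic(a), logistic(b)
        if step % 10 == 0:
            print(f"step {step:2d}: {a:.6f} vs {b:.6f}  (difference {abs(a - b):.2e})")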
My conclusion is that this is a total red herring. In any case, present neural net implementations are overwhelmingly digital.
However, I do believe that analogy plays an important role in the significance of connectionist networks for cognitive modeling. The particular range of grammatical and semantic structures we have reflects many structural similarities which result from the commonality of mechanisms. The fact that we can use a word in many different contexts, ranging from the most concrete to the most abstract, but still mean the same thing, is a reflection of the similarity of the relevant representations. For example, consider `in' in `in the room', `in an hour', `in my mind' (see [Lako87]).
Note that the TTT can directly test comprehension of a word like `in', but the TT can only test it indirectly. This advantage disappears rapidly, however, as we move from the concrete to the abstract domains of application. Of course the TT is a subset of the TTT (which was proposed by Harnad as a generalization of the TT), and as a special case can be passed by any system capable of passing the TTT. From the point of view of symbol grounding, the point is really that TTT capabilities are necessary to pass the TT. Of course, such a system, when disconnected from all its sensory-motor periphery and allowed just teletype communication, is no longer a TTT-capable system. Similarly, grounding in a virtual reality system (like MAGRATHEA [Powe89]) can theoretically produce a TTT-capable system. Provided the virtual reality is accurate enough, it should be possible to unplug the system from the virtual reality and connect it up to real reality and have it pass the TTT, or disconnect it entirely and have it pass the TT, or have it pass some sort of Virtual TT (VTT) in which it is pitted against a human in the same virtual reality.
Again, technically, a system can be grounded by including in the system a Searle who in this case simulates not the CPU but the PPU, the Peripheral Processing Unit. This Searle translates between his sensory-motor experience and some representation language understood by the program. This is the mode in which Natural Language researchers have traditionally worked. While it is theoretically possible, it is practically impossible, not only because of the sheer information load on the Searle (or the team of programmers/Searles), but because of the dynamic nature of our environment. The traditional programming approach is not adaptive, and hence incapable of producing systems capable of passing either the TTT or the TT, which allow for, for example, my teaching Chinese to an English-speaking TT candidate!
The virtual reality approach is also, in practice, only a bootstrapping convenience, because it is easier to program certain laws of reality than it is to provide by hand scenario after scenario. Thus while a VTT-passing robot should also be capable of passing the TTT and TT (given the appropriate replugging), it will in practice eventually come unstuck somewhere along the line simply because the virtual reality simulation isn't accurate enough (though theoretically there is no reason why it couldn't be - the trivial observation that it has to be implemented on a computer of finite size located in real reality is irrelevant, because the experience of the human opponents is also gained in a subset of real reality which they succeed in modeling adequately).
For these reasons, I do expect that successful TT-passing systems will have TTT or VTT capabilities (and moreover their learning capabilities will not be limited to just language). Furthermore, they will have representations which correspond closely to those at work in the neural circuitry of our brains. But whether the first such systems are labeled connectionist or not is quite another question. What is certain is that they will be adaptive and capable of learning both language and ontology.
[John92] Mark Johnson, Philosophical implications of cognitive semantics, Cognitive Linguistics Vol 3 No. 4, Mouton 1992.
[Lako87] George Lakoff, Women, Fire and Dangerous Things: What categories reveal about the mind, University of Chicago Press 1987.
[Powe89] David M. W. Powers and Christopher C. R. Turk, Machine Learning of Natural Language, Springer 1989.
[Powe90] David M. W. Powers, Parallel and Efficient Implementation of the Compartmentalized Connection Graph Proof Procedure: Resolution to Unification, in B. Fronhöfer and G. Wrightson, Parallelization in Inference Systems, Proc. Workshop on Massively Parallel Inference Systems 1990, LNAI 590, Springer 1992.
[Powe91] David M. W. Powers, How far can self-organization go? Results in unsupervised language learning, in D. Powers and L. Reeker, Machine Learning of Natural Language and Ontology, Proc. AAAI Spring Symposium, DFKI Document D91-09, DFKI Kaiserslautern FRG 1991.
[Powe92a] David M. W. Powers, "On the significance of closed classes and boundary conditions: Experiments in lexical and syntactic learning" in W. Daelemans and D. Powers, Background and Experiments in Machine Learning of Natural Language, ITK Proceedings 92/1, University of Tilburg NL 1992.
[Powe92b] David M. W. Powers, Multi-Modal Modelling with Multi-Module Mechanisms: Autonomy in a Computational Model of Language Learning. ITK Research Report 33, University of Tilburg NL 1992.
[Powe92c] David M. W. Powers, A Basis for Compact Distributional Extraction, THINK Vol 1 No. 2, ITK University of Tilburg 1992.
Powers's VTTT (Virtual TTT) is no TTT at all: just another TT, a symbols-only oracle. But if I assume that Powers does not mean to invoke Dyer's and McDermott's Cheshire cat with this example, but only the possibility that, in principle, a simulated robot in a simulated world, simulated by a sufficiently ingenious (very near omniscient) modeller, who had successfully anticipated and encoded everything that was relevant in both, could be used to come up with the complete blueprint from which to BUILD a real TTT-passing robot through virtual-world testing alone, then I don't disagree, though this seems just about as likely as Bringsjord's pongid poetry (and Powers seems to agree). But even then it would certainly not be just a matter of "unplugging" the virtual robot from its virtual world and "replugging" it into the real world (as Powers seems to suggest): if in doubt, try this first with virtual planets in a virtual cosmos.
And it would still only be the real TTT robot, successfully built from the principles learned from the VTTT, that was grounded. Virtual grounding is not grounding any more than virtual transduction is transduction. The causal connections between symbols and what they are interpretable as being about must be real. There is no way to break out of the symbolic circle through mere symbol/symbol connections.
Powers also seems to think Searle can simulate transduction the same way he can simulate symbol manipulation. I'd like to hear this spelled out in a concrete case, say, transducing photons. Short of using his own eyes as add-on peripherals (which would of course be begging the question -- see my reply to McDermott), the only way I can see to "reconfigure" Searle so as to be able to do this would seem to call for more radical forms of engineering than mere software!
-- S.H.