Harnad, S. (1993) Exorcizing the Ghost of Mental Imagery. Commentary on: JI Glasgow: "The Imagery Debate Revisited." Computational Intelligence (in press)
Janice Glasgow has presented some very interesting ideas and findings concerning array representations and the processing of data referring to visible objects, but the connection with imagery seems to be largely interpretive rather than substantive.
The problem seems apparent even in Glasgow's term ``depict'', which is used by way of contrast with ``describe''. Now ``describe'' refers relatively unproblematically to strings of symbols, such as those in this written sentence, that are systematically interpretable as propositions describing objects, events, or states of affairs. But what does ``depict'' mean? In the case of a picture -- whether a photo or a diagram -- it is clear what depict means. A picture is an object (I will argue below that it is an analog object, relative to what it is a picture of) and it DEPICTS yet another object: the object it is a picture OF. But in the case of an array, whether described formally, with numerical coordinates, or stored in a machine, or ``depicted'' diagrammatically by way of a secondary illustration, it is not at all clear whether the entity in question is indeed a picture, or merely yet another set of symbols that is INTERPRETABLE as referring to a picture, which picture in turn depicts an object! It is clear that we are dealing with many layers of interpretation here already, and so far we are still talking only about external objects (such as pictures, symbols and objects simpliciter). We still have not gotten to MENTAL objects, such as mental ``images''.
But perhaps we never need to get to ``mental images'' (even though this is the topic announced in the opening sentence and indeed the title of Glasgow's paper), for in paragraph four an informal operational definition is given for ``imagery'' that is so narrow and particular as probably to leave the question of mental imagery entirely untouched: ``We consider imagery as the ability to manipulate image representations for the purpose of retrieving visual and spatial information that was not explicitly encoded in long-term memory.'' Let's not quibble about the note of circularity here. We know what the author means. She is referring to a capacity of certain systems (people and machines) to DO certain things (identify objects, answer questions, operate on the world) by using certain kinds of objects (``images''), and the paper is about those objects (symbolic and numerical arrays, for the most part, as used either by people directly, or by computers that are in turn used by people).
So, as a first pass, the hypothesis here might be that what is actually going on in the heads of people when they perform certain visual tasks similar to those performed by the computer is very much like what is going on in the computer, which uses either array inputs or internally stored or generated arrays. This hypothesis is tenable, though it will require a lot of evidence to show that computational array representations can bear the weight of (shall we call it) the full array of visual tasks of which people are capable. But to show this we of course do not require a discussion at this time of whether or not computational arrays are mental images; rather, we need much more work on the performance scope and limits of computational arrays. If they do turn out to be able to do everything people can do, THAT will be the time to puzzle over whether or not they constitute mental images (and even then, being -- as I will argue below -- purely symbolic rather than something else, arrays continue to be vulnerable to existing objections to purely symbolic approaches to mental modeling (Searle 1980; Harnad 1987, 1990a, 1992)).
For now, though, what might arrays be? The simplest ``array'' is of course a scalar number, and what really seems to be at issue here is not symbolic descriptions versus mental images but the relative virtues of two forms of symbolic representation: logical versus numerical. In both cases it seems clear that all we have are symbolic representations, but in one case these are interpretable (by US) as, say, natural language propositions and inferences from them, whereas in the other they are interpretable as numerical expressions (which are likewise, I hasten to remind the reader, a proper subset of natural language). But then this is not a matter of ``description'' versus ``depiction'' at all, but of one form of description versus another. And the interpretations are, as usual, user-relative rather than intrinsic to the representations.
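To make the point concrete, here is a minimal, purely illustrative sketch (the toy scene, the names, and the code are mine, not Glasgow's): the same spatial information encoded once as logical propositions and once as a symbolic/numerical array. In both cases the machine merely stores and manipulates symbol tokens from which the relation can be extracted; whether either encoding ``depicts'' anything is an interpretation that we, the users, project onto it.

    # Hypothetical toy scene (not from Glasgow's paper): a cup to the
    # left of a plate, both on a table.

    # "Logical" encoding: a list of propositions.
    propositional = [
        "on(cup, table)",
        "on(plate, table)",
        "left_of(cup, plate)",
    ]

    # "Numerical"/array encoding: the index stands for left-to-right
    # position on the tabletop.
    array_rep = ["cup", "plate"]   # index 0 is leftmost

    def left_of(a, b, arr=array_rep):
        # Extract the relation that is only "implicit" in the array.
        return arr.index(a) < arr.index(b)

    print("left_of(cup, plate)" in propositional)   # True: stated outright
    print(left_of("cup", "plate"))                  # True: computed from the array

Both encodings are just strings of symbols; nothing in either one is intrinsically a picture of anything.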
We have to be very careful to avoid getting wrapped up in our own interpretations here (what I have elsewhere called ``getting lost in the hermeneutic hall of mirrors'' that is created whenever we interpret symbols and forget that the interpretation originates from us rather than being inherent in the symbols (Harnad 1990b; Harnad 1990c)). It is fine to speak about ``depicted'' information as being ``explicit'' in some forms of representation but only ``implicit'' in others (by way of analogy with the color that is explicit in a picture or a mental image of an apple and only implicit in a set of sentences about apples, from which their redness may perhaps be inferred), but let us not forget that there is something very homuncular and mentalistic in the notion of implicitness and explicitness being envisioned here, and it is not at all clear that the (implicit) mentalistic analogy has any justification.
``Implicit/explicit FOR WHOM?'' one is, after all, entitled to ask. Here the case of the external picture and that of the mental image already separate from one another, for whatever information might be ``explicit'' in a real apple, it is clear that it only becomes MENTAL -- i.e., explicit to a person -- in the head of a person. And it is not the apple that goes into the head! So what is explicit in the apple and what is explicit in what goes on in the head must be two different kinds of things. One could say the same of a picture of an apple: what is ``explicit'' in IT is not what is explicit in someone's mental image of an apple.
We can drop the homuncularity of implicit/explicit talk, however, in favor of information, simpliciter. Whatever information is there, is there. What is not, is not. Mental states have nothing to do with it. There is the usual example of the trajectory of the ball thrown to me, which I accurately compute in virtue of catching it at the right location; yet I have no (``explicit'') imagery for that calculation. Although it is a strain to do so, we should remind ourselves that, for all we know or can even imagine from what we know so far, that might have been how it turned out for ALL internal states and processes. There seems no reason why any of them should have been MENTAL at all. In other words, the information ``implicit'' in them need not have become ``explicit'' to anyone: there need not have been anybody home, no ``inspector'' of the information that is explicitly available by ``inspection''. Surely the only relevant COMPUTATIONAL property is the AVAILABILITY of the information (and the means for extracting it -- whether numerical or logical, immediate or requiring further computations), not its implicit/explicitness. And the ``inspector'' is the system itself, not some internal ghost (Harnad 1982, 1991).
The healthy de-interpretation that this sort of homuncular exorcism engenders can also serve to prevent us from being drawn into other hermeneutic vortices, such as the one underlying Glasgow's apparently incoherent distinction between ``visual thinking'' (which is ``concerned with what an object looks like'') and ``spatial reasoning'' (which ``depends more on where an object is located relative to other objects in a scene (complex image)''). We all know what she means, of course, but that's exactly the problem with mentalistic discourse: for in reality all we have is a world of objects, and systems (like machines and us) operating on them. The objects' ``shapes'' clearly consist of their local spatial properties, whereas what Glasgow would like to reserve ``spatial'' for is their more global spatial properties, especially those between rather than within objects. But clearly this distinction is arbitrary in any but a pragmatic sense -- which would then of course have to be justified pragmatically.
And not even the ``visual'' in the visual/spatial contrast seems to be coherent, for even an object's most local spatial properties might be detected from a variety of transducer modalities, including the optical, acoustic, and mechanical. And that's without making any commitment to the internal representations of objects, which may have been engendered by any or all of these transducer modalities. Besides, once it's inside, it's safer to speak of such information merely as internal and available, rather than continuing to wrap it in mentalistic interpretations that only serve to confuse if not prejudice us.
To summarize, if we speak only about the information available in an object or a data structure -- and forget for now that we have mental lives at all, concerning ourselves only with our performance capacities -- it seems clear that array representations are merely another form of symbolic information. Are they likely to be the only form of internal representation, or the main one, that explains our visual and spatial capacities? I think not; I think tasks like Shepard & Cooper's (1982) ``mental rotation'' may be better accounted for by internal representations that do not turn transducer projections into numbers at all, but preserve them in analog form, one that is physically invertible by an analog transformation that is one-to-one with the transducer projection (to some subsensory and subcognitive level of neural granularity). In other words, I agree with Glasgow that it is a matter of preserving information in the internal representation, but I am not persuaded that arrays are the form the preserved information takes (see Chamberlain & Barlow 1982; Jeannerod 1994).
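To gesture at the contrast in code (bearing in mind that any program is itself just more symbols, which is part of the point), here is a crude sketch of my own; the numbers and names are merely illustrative. Rotating a cell-by-cell array through an arbitrary angle forces the shape to be resampled onto the grid at every step, whereas an idealized analog transformation of the projection, approximated below by a continuous rotation of points in the plane, is one-to-one and invertible.

    import math

    # Idealized stand-in for an analog transformation (hypothetical
    # illustration): treat the transducer projection as points in the
    # plane and rotate them continuously.  The transformation is
    # one-to-one, and applying the inverse rotation recovers the
    # original projection (up to rounding error).
    def rotate(points, theta):
        c, s = math.cos(theta), math.sin(theta)
        return [(c * x - s * y, s * x + c * y) for (x, y) in points]

    projection = [(2.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
    theta = math.radians(37)

    rotated = rotate(projection, theta)
    restored = rotate(rotated, -theta)
    print(restored)   # approximately equal to the original projection

The array version of the same operation would have to turn the projection into grid cells and resample it at each step, which is exactly the symbolization of the projection that, on the view sketched above, need not happen at all.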
CHAMBERLAIN, S.C. & BARLOW, R.B. 1982. Retinotopic organization of lateral eye input to Limulus brain. Journal of Neurophysiology, 48: 505-520.
HARNAD, S. 1982. Consciousness: An afterthought. Cognition and Brain Theory, 5: 29-47.
HARNAD, S. 1987. The induction and representation of categories. In: Harnad, S. (ed.) Categorical Perception: The Groundwork of Cognition. New York: Cambridge University Press.
HARNAD, S. 1990a. The symbol grounding problem. Physica D, 42: 335-346.
HARNAD, S. 1990b. Against computational hermeneutics. Invited commentary on Eric Dietrich's computationalism. Social Epistemology, 4: 167-172.
HARNAD, S. 1990c. Lost in the hermeneutic hall of mirrors. Invited commentary on Michael Dyer's ``Minds, machines, Searle and Harnad''. Journal of Experimental and Theoretical Artificial Intelligence, 2: 321-327.
HARNAD, S. 1991. Other bodies, other minds: A machine incarnation of an old philosophical problem. Minds and Machines, 1: 43-54.
HARNAD, S. 1992. Connecting object to symbol in modeling cognition. In: A. Clarke and R. Lutz (Eds.) Connectionism in Context. Springer Verlag.
JEANNEROD, M. 1994. The representing brain: Neural correlates of motor intention and imagery. Behavioral and Brain Sciences, 17(2), in press.
SEARLE, J.R. 1980. Minds, brains and programs. Behavioral and Brain Sciences, 3: 417-457.
SHEPARD, R.N. & COOPER, L.A. 1982. Mental images and their transformations. Cambridge, MA: MIT Press.