Shanahan: Robotics and Common Sense

From: Sloss Finn
Date: Sat Mar 03 2001 - 14:38:34 GMT

Shanahan - Robotics and the Common Sense Informatic Situation

This paper presents a set of logical formulae that allows a robot to build
a model of the world it is placed in. The robot uses an abductive
process, in which sensor data is explained by hypothesising the existence,
locations and shapes of objects in the world. Symbols in the explanation
are given meaning through the theory and are grounded by the robot's
interaction with the world.

>Without ignoring the lessons of the past, the nascent area of Cognitive
>Robotics seeks to reinstate the ideals of the Shakey project, namely
>the construction of robots whose architecture is based on the idea of
>representing the world by sentences of formal logic and reasoning about
>it by manipulating those sentences. The chief benefits of this approach
>are: A) that it facilitates the endowment of a robot with the capacity
>to perform high-level reasoning tasks, such as planning, and B) that it
>makes it possible to formally account for the success (or otherwise) of
>a robot by appealing to the notions of correct reasoning and correct

The project states its goal as producing a formal logic that defines
the actions of a robot, allowing it to interact with and learn about
its environment. The idea is that if the logic statements can be used in
"forwards" mode, to predict when the robot should bump into objects
stored in a logically defined map, then the same logic can be used in
"reverse" mode, where the map is built from the sensor data.

>The key idea of this paper is to consider the process of assimilating
>a stream of sensor data as abduction. Given such a stream, the
>abductive task is to hypothesise the existence, shapes, and locations
>of objects which, given the output the robot has supplied to its
>motors, would explain that sensor data. This is, in essence, the map
>building task for a mobile robot.
>More precisely, if a stream of sensor data is represented as the
>conjunction Y of a set of observation sentences, the task is to find
>an explanation of Y in the form of a logical description Delta(M) of
>the initial locations and shapes of a number of objects, such that,
> Sum(B) ^ Sum(E) ^ Delta(N) ^ Delta(M) |= Y
>Sum(B) is a background theory, comprising axioms for change
>(including continuous change), action, space, and shape,
>Sum(E) is a theory relating the shapes and movements of objects
>(including the robot itself) to the robot's sensor data, and
>Delta(N) is a logical description of the movements of objects,
>including the robot itself.

The "reverse" mode builds the map from the sensor data obtained from
the environment: instead of predicting a bump (the robot hitting an
object) from map elements, the map elements are built up from actual
bumps. In the prediction task the unknown is Y, the sensed data. If Y
is instead supplied by a physical robot placed in an unknown
environment, the unknown element in the formula becomes Delta(M), the
map. The theories Sum(B) and Sum(E) relate the sensed data to logic
sentences that define the points where the robot should collide.
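The "forwards"/"reverse" duality can be sketched in a few lines of Python. This is a minimal illustration under assumed names (a 1-D world with one obstacle and constant velocity), not Shanahan's formalism: the same motion model is used once deductively (map in, bump time out) and once abductively (bump time in, map element out).

```python
# Illustrative sketch: one motion model used in both directions.
# All names (predict_bump, abduce_obstacle) are assumptions, not the paper's.

def predict_bump(robot_start, velocity, obstacle_pos):
    """Forwards mode: given a map element (obstacle_pos), deduce the bump time."""
    if velocity <= 0 or obstacle_pos <= robot_start:
        return None  # robot never reaches the obstacle
    return (obstacle_pos - robot_start) / velocity

def abduce_obstacle(robot_start, velocity, bump_time):
    """Reverse mode: given the sensed bump time, hypothesise the obstacle
    position that would explain it, using the same motion model."""
    return robot_start + velocity * bump_time

t = predict_bump(0.0, 2.0, 10.0)    # deduction: bump predicted at t = 5.0
pos = abduce_obstacle(0.0, 2.0, t)  # abduction: obstacle hypothesised at 10.0
assert pos == 10.0
```

The point of the sketch is that abduction needs no new machinery: the explanation is whatever map element would, fed through the forwards model, reproduce the observed sensor data.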

>In the event calculus, we have sorts for fluents, actions (or events),
>and time points.
>The formula HoldsAt(f, t) says that fluent f is true at time point t.
>The formulae Initiates(a, f, t) and Terminates(a, f, t) say
>respectively that action a makes fluent f true from time point t, and
>that a makes f false from t. The effects of actions are described by a
>collection of formulae involving Initiates and Terminates.
>Once a fluent has been initiated or terminated by an action or event,
>it is subject to the common sense law of inertia. This means that it
>retains its value (true or false) until another action or event occurs
>which affects that fluent.

The paper introduces the 'event calculus' as a means of representing
actions and their effects on the state of the robot's world.

>A narrative of actions and events is described via the predicates
>Happens and Initially. The formula Happens(a,t) says that an action or
>event of type a occurred at time point t. Events are instantaneous. The
>formula Initially(f) says that the fluent f is true from time point 0.
>A theory will also include a pair of uniqueness-of-names axioms, one
>for actions and one for fluents.
>HoldsAt(f,t) <- Initially(f) ^ ~Clipped(0,f,t) (EC1)
>HoldsAt(f,t2) <- (EC2)
> Happens(a,t1) ^ Initiates(a,f,t1) ^ t1 < t2 ^ ~Clipped(t1,f,t2)
>~HoldsAt(f,t2) <- (EC3)
> Happens(a,t1) ^ Terminates(a,f,t1) ^ t1 < t2 ^ ~Declipped(t1,f,t2)
>Clipped(t1,f,t2) <-> (EC4)
> #a,t [Happens(a,t) ^
> [Terminates(a,f,t) v Releases(a,f,t)] ^ t1 < t ^ t < t2]

>Declipped(t1,f,t2) <-> (EC5)
> #a,t [Happens(a,t) ^
> [Initiates(a,f,t) v Releases(a,f,t)] ^ t1 < t ^ t < t2]
>* NOTE - # is used to mean the existential quantifier, usually written as a backwards E *
>Let the conjunction of (EC1) to (EC5) be denoted by EC. The
>circumscription policy to overcome the frame problem is the following.
>Given a conjunction of Happens and Initially formulae N, a conjunction
>of Initiates, Terminates and Releases formulae E, and a conjunction of
>uniqueness-of-names axioms U, we are interested in,
>CIRC[N ; Happens] ^
> CIRC[E ; Initiates, Terminates, Releases] ^ U ^ EC
>This formula embodies a form of the common sense law of inertia, and
>thereby solves the frame problem.

The benefit of having the robot interact with its world using logic
sentences generated from its inputs is that it addresses the frame
problem directly. In a "standard" robot situation, where the robot is
told in advance where the objects are positioned, any unexpected
collision between the robot and its surroundings will confuse the robot
and prevent it from functioning correctly; if objects are added or
removed, the robot will be at a loss as to what it should do. Using the
formal logic described in this paper, any new objects that are
encountered are incorporated into the robot's description of the world,
so it can cope with new objects or with being immersed in a strange new
environment. Initially the robot will think it can move anywhere,
because the initial map is empty, but as it bumps into objects it will
correct its choices about where to move next, eventually becoming able
to avoid collisions. In terms of AI this feature is of great importance
for robots that are required to interact with our everyday world.
Obviously the scale of this project is much smaller, but the same ideals
still hold. Given the necessary sensors and actuators, a robot could be
made to interact with its environment; for example, a kitchen robot
could be told to make a coffee. A standard robot could fail at the task
if the coffee cup is not in its predefined location, whereas using the
logic framework described in Shanahan's paper, the robot would be able
to acquire the new location of the cup and interact successfully.
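The common sense law of inertia behind this robustness can be sketched directly from the quoted axioms. The following is a simplified, illustrative Python reading of (EC1), (EC2) and (EC4) only: fluents and actions are plain strings, Releases and the circumscription policy are omitted, and the narrative is an assumed toy example, not from the paper.

```python
# Toy narrative: the robot starts "at_home", leaves at t=1, arrives at t=4.
initially = {"at_home"}                     # Initially(f)
happens = [("leave", 1), ("arrive", 4)]     # Happens(a, t)
initiates = {("arrive", "at_work")}         # Initiates(a, f, t), time-independent here
terminates = {("leave", "at_home")}         # Terminates(a, f, t)

def clipped(t1, f, t2):
    # (EC4), without Releases: some action strictly between t1 and t2 terminates f
    return any(t1 < t < t2 and (a, f) in terminates for a, t in happens)

def holds_at(f, t):
    # (EC1): f held initially and has not been clipped since time 0
    if f in initially and not clipped(0, f, t):
        return True
    # (EC2): some earlier action initiated f and it has not been clipped since
    return any(t1 < t and (a, f) in initiates and not clipped(t1, f, t)
               for a, t1 in happens)

assert holds_at("at_home", 0.5)       # inertia: still true before "leave"
assert not holds_at("at_home", 2)     # clipped by "leave" at t=1
assert holds_at("at_work", 5)         # initiated by "arrive" at t=4
```

The inertia is visible in `clipped`: a fluent keeps its value for free unless some recorded event explicitly changes it, which is exactly what spares the robot from re-deriving the whole world state after every action.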

>The central idea of this paper is the assimilation of sensor data
>through abduction. This is in accordance with the principle,
>"prediction is deduction but explanation is abduction" .
>To begin with, we'll be looking at the predictive capabilities of the
>framework described.
>The conjunction of our general theory of action, change, space, and
>shape with the theory Sum(E) , along with a description of the initial
>locations and shapes of objects in the world and a description of the
>robot's actions, should yield a description of the robot's expected
>sensory input. If prediction works properly using deduction in this
>way, the reverse operation of explaining a given stream of sensor data
>by hypothesising the locations and shapes of objects in the world is
>already defined. It is simply abduction using the same logical

As mentioned earlier, with a purely predictive system the frame problem
prevents a robot from showing full intelligence, even in the limited
area of environment interaction, because as soon as something
unpredictable happens, the pseudo-intelligence about the environment
falls over. It is interesting to see that the physical interaction
problem can be solved by first building a forwards (predictive) method
and then applying the same method in reverse to allow abduction; this
suggests that the technique may be applicable to other areas of AI. In
the full-scale AI problem of making a conscious robot that could pass
the TT, if this style of prediction/abduction could be applied, the
theory is that the robot could appear intelligent. In this particular
situation I believe that the intelligence shown is not true
intelligence: while the robot can navigate its world, it still cannot
decide why it wants to travel to a particular location. This theory will
nevertheless be very useful in fulfilling part of the soft AI goal,
namely useful devices that appear to reason about their tasks.

>The robot's task is to do its best to explain its sensor data in terms
>of a model of the physics governing that world. In any such model,
>incoming sensor data is the end of the line, causally speaking. In the
>physical world, it's not a sensor event that stops the robot but a
>collision with a solid object.

This is an illustration of how the robot evades the symbol grounding
problem. In this situation the symbols inherently have meaning related
to the robot's physical environment. Instead of having to choose a
symbolic representation for every situation the robot can be in, and
then having to attach meaning to those symbols, the meaning comes from
what happened to the robot in the real world, and the symbols come from
the actions it carried out.

>In the specification of an abductive task like this, the set of
>explanations of the required form will be referred to as the hypothesis
>space. It's clear, in the present case, that some constraints must be
>imposed on the hypothesis space to eliminate bizarre explanations.
>Furthermore, the set of all explanations of the suggested form for a
>given stream of sensor data is hard to reason about, and computing a
>useful representation of such a set is infeasible.

>A great deal of further work has already been completed, including a
>treatment of noise via non-determinism and a consistency-based form of
>abduction. This has led to the design of a provably correct algorithm
>for sensor data assimilation, which forms the basis of a C
>implementation which has been used in a number of experiments with the

In practice, the task of mapping out the world the robot is in is very
difficult. Specifically, the task of obtaining the shapes of the objects
that the robot bumps into is not a trivial one; a set of contact points
can often be interpreted in several different ways. Further work has
been carried out on ways of resolving shapes from a limited set of
points.
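The ambiguity is easy to demonstrate concretely. The following sketch (an illustrative example of my own, not from the paper) shows four bump points that are explained equally well by two different shape hypotheses, a unit circle and a diamond, so abduction from these points alone cannot choose between them.

```python
import math

# Four contact points sensed by bumping: (+-1, 0) and (0, +-1).
bumps = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

def on_unit_circle(p, eps=1e-9):
    # Hypothesis 1: object boundary is the circle x^2 + y^2 = 1
    return abs(math.hypot(p[0], p[1]) - 1.0) < eps

def on_diamond(p, eps=1e-9):
    # Hypothesis 2: object boundary is the square with vertices
    # (+-1, 0), (0, +-1), i.e. |x| + |y| = 1
    return abs(abs(p[0]) + abs(p[1]) - 1.0) < eps

# Both hypotheses explain every bump; more bumps (or richer sensors)
# would be needed to discriminate between them.
assert all(on_unit_circle(p) for p in bumps)
assert all(on_diamond(p) for p in bumps)
```

This is why the hypothesis space has to be constrained: without extra assumptions, many mutually incompatible shapes are legitimate explanations of the same sensor stream.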

Physical interaction with the real world is certainly an ability that
cognitive robots need. Shanahan's robot has the ability to learn the
shapes of objects in its environment and to navigate around obstacles.
Its limited sensory capabilities mean that it will take some time, and
quite a few collisions, before it can traverse its world more
successfully. The addition of a range sensor would allow the robot to
avoid actually crashing into objects before any damage is done, and a
camera could allow the robot to build up the shapes of objects more
quickly. The problem of noise is only briefly mentioned in Shanahan's
paper, and it will cause some difficulties in the actual building of the
robot, since the theory assumes noise-free sensors. Because the map is
defined by the logic sentences derived from the sensor inputs and the
relationship logic, noise will cause the map to be slightly misdefined;
the problem of noise has been addressed elsewhere but is beyond the
scope of Shanahan's paper.
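How noise misdefines the map can be seen in the same toy "reverse" model used above (an illustrative assumption of mine, not the paper's treatment): because abduction trusts the sensor reading, a timing error in the bump propagates straight into the hypothesised obstacle position, scaled by the robot's velocity.

```python
import math

def abduce_obstacle(start, velocity, bump_time):
    # Reverse mode: hypothesise the obstacle position explaining the bump time.
    return start + velocity * bump_time

true_pos = 10.0
clean = abduce_obstacle(0.0, 2.0, 5.0)        # noise-free reading: exact
noisy = abduce_obstacle(0.0, 2.0, 5.0 + 0.1)  # 0.1 s of timing noise

assert clean == true_pos
assert math.isclose(noisy - true_pos, 0.2)    # map error = 2.0 m/s * 0.1 s
```

Even this crude model shows why a noise-free assumption matters: every noisy bump plants a map element slightly off its true position, and those errors accumulate as the map grows.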

This archive was generated by hypermail 2.1.4 : Tue Sep 24 2002 - 18:37:19 BST