Verbal Concept "Mediators" as Simple Operants

Verplanck, W.S. (1992) Verbal concept "mediators" as simple operants. The Analysis of Verbal Behavior, 10, 45-68.

For some years now, problems of "learning without awareness" have arisen in a number of contexts; they have created a theoretical, and sometimes an experimental fuss. Willy-nilly, those who investigate human operant behavior sooner or later are among those involved, whether they have leaped, slipped, or been dragged into the fray. These seem to be the avenues by which participants enter into scientific controversies, as well as into barroom brawls.

The courses of development of these two kinds of controversy are rather similar. They show a certain orderliness. In both, as the dispute rises in heat, and the blows--or experiments--get exchanged at higher rates, the original issue tends to get lost, if there was one to begin with. In the present case, the issue summarizes itself like this:
"You can't," "I can," in progressively stronger inflections. Just what can or cannot be done either has been omitted, or repeatedly redefined, as the controversy has extended itself. It is not surprising that seemingly contradictory results turn up. To this writer, the present dispute, which seems to have something to do with the subject's ability to state experimental contingencies, is a regrettable one. As it has developed it seems to have led to the performance of experiments on inappropriate forms of behavior, and to a proliferation of speculative theory.

By inappropriate forms of behavior, I mean this: The experiments that have been--by now--repeated over and over with only minor modifications are those that have confounded at least two questions, the identification of response classes, and the stability (habituability) of reinforcers. Saying plural nouns, constructing sentences in the first person, "mm-hmm," and "good" may serve to demonstrate the occurrence of operant conditioning, but they are not necessarily the best choice for experiments on other problems. Statements about whether or not operant conditioning occurs must depend upon the changes in behavior that occur with reinforcement and its withdrawal, and not upon anything the subject may have to say about it. (It should also be superfluous to point out that the terms "voluntary" and "operant" refer, by and large, to the same behaviors.) Many psychologists, in pursuing thought along these lines, seem to have tended to adopt ever more subtle (but not stringent) definitions of "awareness" and to have introduced theory in inverse proportion to the clarity of their experimental findings. Some seem to believe that if they can somehow demonstrate something that can be tagged with the label "awareness," they have in some sense found an "explanation" for the orderliness of human conditioning.

One would not express discomfort with this state of affairs, if it were not for the fact that this seems, at least to the writer, the wrong time to attempt to use "awareness" as explanatory, or descriptive, of much of anything. The fact is, very little is suggested as to how "awareness," however it may have been defined, can or does control or affect behavior in the first place. Statements about "awareness" as prerequisite to learning have shown little, if any, experimental unity, and the word seems to have become a label indicating an explanatory dead end. However the issue(s) as they have thus far been stated were resolved, little new information would be added.

The word seems to be associated with a rather special kind of phenomenological approach to behavior. While this may seem somewhat heretical to those phenomenologically oriented, to the writer it has always seemed that when the experimental facts get established, their phenomenological aspects seem to take care of themselves.

Some years ago, E.J. Green (1955) remarked that each of his subjects in a discrimination experiment could figure out its correct basis only once. The writer had made much the same observation during human conditioning (Verplanck, 1956). In the latter experiments, many subjects have a good deal to say while being conditioned; some of what they say is to the point. That is, some of it corresponds to the experimenter's rules in conditioning the subject. Both these observations related rather directly to subjects' behavior in a number of exploratory experiments on discrimination and "concept-formation" that the writer had been doing. In these, while seeming to behave in conformity with continuity theory, the subjects always did a lot of "hypothesizing" (again, some of it to the point) a la the Tolman-Krechevsky school. Even the writer, resent it though he may (as a Spencian incrementalist at the time), found that he "hypothesized" when serving as a subject. Self-observation, however, yielded few clues as to what was going on.

The common link seemed to be this: In all cases, the subject could come across the correct rule, the "solution," only once in any experiment. Only once could Green's subjects catch on to the critical dots that were correlated with reinforcement. Only once could the conditioning subjects "catch on" that "touching the nose with the right forefinger" produced a point. Only once could subjects figure out that pictures of "objects that can be used in transport" were to be put in the pile on the right. The correct rule, once said, hung on, the problem was solved (ten successive correct choices), and the experiment terminated.

The "aha" that came is this: In operant conditioning of rats and pigeons, too, the subject is observed to "solve the problem" only once. Thereafter he "applies the solution." In shaping bar-pressing, or key-pressing, the skilled experimenter finds very quickly that he is dealing with a one-trial event. The first bar-press that yields the click of food dropping into the magazine, and then the rat's quick dive toward it (a Guthrian affair), is followed in most cases by another bar-press, after an interresponse time that is no greater than those that are later recorded after 10 or 100 reinforcements. Where this does not occur, it seems that the experimenter, not the rat, made the mistake. We may look back at Estes' paper on conditioning (Estes, 1950). To attain a clear-cut incremental process in bar-pressing, he found it necessary to introduce a second bar; gradual changes in pressing bar 1 occurred while extinction to bar 2 was going on. One might put it this way: incremental processes in conditioning seem always to involve extinction, either of the response itself to stimuli other than the one the experimenter has chosen, or of a competing response. With proper experimental control, operant conditioning is a "once" affair; subsequent reinforcements serve primarily to maintain it at strength, and to develop resistance to extinction, which might be characterized as "reluctance to give up the solution." At the time these considerations were asserting themselves, the writer was busy defining "response" for a glossary, and was struck both by the restrictions that this empirical definition placed on the kind of behavioral events to which the term applied, and by the extraordinary range of new behaviors which could experimentally prove to be responses, behaving, under discriminative and reinforcing stimuli, in a simple manner.

All this suggested an approach to some of the problems raised by human behavior, and especially by verbal concepts. Let experimental work seek to establish directly how the verbal behavior occurring in an experiment is related to the other behaviors that occur. Verbal behaviors, if overt, meet the behaviorist's demands for experimental data, and while they can hardly be expected to bear a one-to-one relationship with concepts of "awareness," "hypotheses," "mediators" and the like used by others, there can be no dispute that they have something to do with at least part of what may be meant by "awareness." So, we sought to make a direct experimental attack upon the problem of how verbal behavior acts under the effect of various environmental conditions, and how it in turn is related to the motor behaviors to which it is, at least linguistically, associated. Just how closely such verbal behaviors may relate to "awareness," must be left to those who are surer than the writer of what is referred to by the word.

Specifically, we undertook to investigate the "rules" [Since this paper was given, a monograph (Shepard, Hovland, and Jenkins, 1961) has appeared in which the results of experiments on much more complex problems of the same class are reported. It is encouraging to note that data were gathered on the rules--the notants--that subjects eventually came up with. But no effort was made to determine experimentally their origin, and their history through differential reinforcement. It is the behavior of such "rules" that this paper deals with.] that subjects say to themselves, and "try out" in various experimental problems. So long as these are allowed to remain covert, the experimenter forfeits the opportunity to exert direct experimental control over them. If they are made overt, the experimenter can directly subject them to environmental contingencies, as he can other behaviors. The ways in which they are controlled by antecedent or consequent stimuli can be determined by straightforward and simple experimental methods. We should be able to determine how they occur in response to environmental events, how they serve as discriminative stimuli for other behaviors, and how they alter in strength with reinforcement.

Our first guess was that overt verbal statements of "rules" would prove to be simple operant behaviors, conditionable as are other operants. Preliminary experimentation based on this proposition led to methods that have since been further developed. The first method is a simple one: it requires the subject in a "concept-formation" card-sorting experiment to state aloud, on each presentation of a stimulus object, the "rule" that he is following in trying to get as many cards as possible correctly placed to right or left. In this situation, where many different possible rules may apply, the experimenter is able to make social reinforcement ("right," or "wrong") contingent either upon the particular statement made by the subject, or upon the behavior that the statement instructed the subject to perform. In either case, he may deliver it after the placement.

Preliminary experiments determined the selection of the stimulus material, and the problem. Stimulus materials which permit the experimenter to choose any one of an almost unlimited number of possible "solutions" proved indispensable. The experimenter must be free to change the "solution" of any problem in midstream--he must be able to make wrong what was previously right, and right what was wrong. He must have far more latitude than provided by, say, the Weigl cards. Second, the material must not require the acquisition of names (the acquisition of a single new response to an arbitrary class of events; stated conversely, the acquisition of a new stimulus class. See Shepard, Hovland, and Jenkins, 1961). Third, the behavior required should not press the subject's immediate memory span.

The results of these experiments led us to choose as the first formal experiment one that seemed to place maximal demands on the proposition that subjects' "hypotheses" are simple operants. We (that is, Stuart Oskamp (1956) and the writer) chose to show that these would occur at a high relative frequency even under partial reinforcement, under conditions where we could also keep track of the behavior presumed to be controlled by them.

Stimulus materials consisted of a set of 110 children's "trading cards" [The tremendous variety in trading cards, on which pictures and designs vary in innumerable dimensions, and which may be further varied, independent of their individuality, by presenting them to the subject upside down, sidewise, or the like, makes such procedures possible. There are an effectively infinite number of possible rules that the experimenter can follow in giving reinforcement, and among which he can shift, whether he is reinforcing monents or placements. Similarly, their variety permits the experimenter to select stimulus materials with considerable freedom and control, although never with the degree of control provided by "artificial" materials, such as the Weigl cards. This flexibility seems indispensable for finding the orderly behavior of our subjects.] backs of playing cards, each different from all the others. Fifty-five had representations of single objects or figures, and 55 had two or more objects pictured. The subjects' task was, given the cards one at a time, to place each either to the right or to the left. The instructions also told the subject that he could get all of them correctly placed. Three groups of college students were run. Members of all three groups, P, PH, and PH, received the instructions to place each card to either right or left. Two of the groups, PH and PH, received the further instruction to state on each trial the rule followed in attempting to get the card right, before placing it. Members of the first group, P, and one of the latter, were told "right" or "wrong" according to whether they had placed the card correctly by the experimenter's rule without regard to the rule they stated in attempting to get all placements correct.
Members of group PH were told "right" or "wrong" on each trial after they had placed the card, according to whether they had stated a specific version of the rule followed by the experimenter in reinforcing, regardless of where they placed the card. (In group designations, the bold indicates whether placement (P) or "hypothesis" (H) was reinforced.) For all groups, reinforcement with "right" or "wrong" was given only after the card was placed.
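
The three feedback contingencies described above, as they apply during continuous reinforcement, can be sketched as a small function. This is only an illustrative sketch and not part of the original procedure: the function name and the labels P, PH_placement, and PH_rule are hypothetical, introduced here to tell apart the two rule-stating groups that the original distinguishes typographically.

```python
def feedback(group, placement_correct, rule_correct=None):
    """Return the experimenter's "right"/"wrong" for one trial.

    group (hypothetical labels for this sketch):
      "P"            - places cards only; placement is reinforced
      "PH_placement" - states a rule, then places; placement is reinforced
      "PH_rule"      - states a rule, then places; the rule statement is reinforced
    """
    if group in ("P", "PH_placement"):
        # Feedback ignores whatever rule the subject stated.
        correct = placement_correct
    elif group == "PH_rule":
        # Feedback ignores where the card was placed.
        correct = bool(rule_correct)
    else:
        raise ValueError(f"unknown group: {group}")
    return "right" if correct else "wrong"


# A rule-reinforced subject who states the required rule but misplaces the card
print(feedback("PH_rule", placement_correct=False, rule_correct=True))
```

In every case the feedback would be delivered only after the card has been placed, so the function is meant to be called once per completed trial.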

In order to assure that any experimental results obtained could not be accounted for in terms of "partially correct hypotheses," only a limited subset of the rules that could produce consistently correct placements was positively reinforced in members of group PH. That is, we shaped a particular set. The rules differentially reinforced for group PH were all of the form:

"Single (one) principal object (figure, design) to the right, two (more than one, several, two, three) principal objects (figures, designs) to the left." If the subject, in stating the rule, named the object or objects pictured, he was told "wrong." He had to use an abstract term. Records were kept, trial by trial, both of placements, and, for groups PH and PH, of rules stated.

The procedure was this: Acquisition trials were carried out as usual in this type of concept formation experiment (continuous reinforcement of correct responses) until the subjects met the criterion of ten successive correct responses. Thereafter, with no change in procedure, all subjects were placed on a partial reinforcement schedule, in which they were told "wrong" following each incorrect response, and following four out of each successive ten correct responses (placements for P and PH; rule statements for PH). On the remaining 60 percent of correct responses, they were told "right." These positive reinforcements were given according to a predetermined randomized schedule.
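
One way such a schedule could be prepared in advance is sketched below. The text specifies only that four of each successive ten correct responses were called "wrong" and that the order was predetermined and randomized; shuffling within blocks of ten, the seed, and the function names are assumptions made for illustration.

```python
import random


def partial_schedule(n_blocks, seed=0):
    """Predetermined feedback for successive CORRECT responses.

    Assumption: each block of ten correct responses holds six "right"
    and four "wrong" in shuffled order.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = ["right"] * 6 + ["wrong"] * 4
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


def apply_schedule(responses_correct, schedule):
    """Feedback per trial: incorrect responses are always "wrong";
    each correct response consumes the next schedule entry."""
    out = []
    used = 0
    for ok in responses_correct:
        if ok:
            out.append(schedule[used])
            used += 1
        else:
            out.append("wrong")
    return out


schedule = partial_schedule(n_blocks=10)
print(apply_schedule([True, False, True, True], schedule))
```

Because the schedule is indexed by correct responses rather than by trials, errors do not shift which correct responses are reinforced, which is one reading of "four out of each successive ten correct responses."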

The schedule places the correct rule-statement on partial positive reinforcement, and at the same time punishes incorrect rule-statement 100 percent of the time. The strength of correct rule-statement will depend, then, on reinforcement by avoidance, on partial positive reinforcement, or on both. Any of these provides accrual of strength by conditioning processes.

Many statements that subjects in group PH could make would lead them to place the cards consistently in the correct pile (e.g., "one dog, belongs to the right," "two dancers go to the left"), but these were not reinforced, since they did not correspond with the rule-statement required by the experimenter. For members of group PH, if such "wrong" statements were followed by placements consistent with them, they would be followed by reinforcement contingent on the correct placement.

The results of this experiment were clear. First, although the mean number of trials to criterion was smallest for group PH, such differences among groups were not reliable. Several subjects in this group first stated a correct rule following three or four consecutive correct placements. But our primary interest is in the behavior under partial reinforcement. Of the placements made by subjects in groups P and PH on reinforcement trials 51 through 100 [Through the first 50 trials, the percentage correct drops from 100 percent to an asymptotic value. The rate at which this occurs varies from subject to subject, evidently as a function of differences in the aversiveness of the socially presented "wrong."] following the ten trials in the criterion run, 60 percent were reinforced, and for PH 58.9 percent of correct placements followed instances of the correct rule that were reinforced. The percentages of correct placements under partial reinforcement were, respectively, 71.2, 71.8, and 76.8, which differ significantly from chance (50 percent), but not from one another. On the 23.2 percent of the trials on which members of PH made incorrect placements, these subjects were reinforced 43.9 percent of the time; that is, with 4 of every 10 incorrect placements, they stated the correct rule, the one for whose statement they were being reinforced. More striking are the percentages of trials on which (a) the correct rule, (b) rules that were incorrect, but yielded correct placements, (c) rules that related to the objects pictured, rather than to other features of the stimulus material (borders, colors, realism, and the like), were stated by members of PH and PH, the two groups giving the rules on each trial. These are summarized in Table 1.

Table 1
Percentages of Trials 51 - 100 on which members of groups
PH and PH stated each of four categories of rules
Category of Rule Stated	Group PH	Group PH
(1) Correct rule	30.2	92.2
(2) Other version of rule that would yield correct placement consistently	18.2	2.0
(3) Incorrect rules that named object depicted	17.2	0.2
(4) All others	34.4	5.6

The data of the table indicate clearly that the rule that has been, and continues to be, differentially reinforced occurs at high relative frequency. Its relative frequency is higher than that of the behavior it is presumed to control. Although PH subjects state the correct rule on 92.2 percent (and one or another version of it on 94.2 percent) of the trials, they place the cards correctly on only 76.8 percent of the trials. In other words, they do not place the card where they say they are going to on 17.4 percent of the trials. Group PH, however, states the correct rule, or a version of it, on 48.4 percent of the trials, but places the cards correctly on 71.8 percent--a discrepancy of 23.4 percent in the other direction. The rule-statement, and the behavior for which it is presumably a discriminative stimulus, have been dissociated by manipulating their contingencies of reinforcement.
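
The two discrepancies follow directly from Table 1 and the placement percentages given earlier; the short sketch below reproduces the arithmetic. The dictionary keys and the function name are hypothetical labels for this illustration, with PH_placement standing for the rule-stating group reinforced for placement and PH_rule for the group reinforced for stating the rule.

```python
# Percentages of trials 51-100, copied from Table 1 and the text.
groups = {
    "PH_placement": {"correct_rule": 30.2, "other_version": 18.2,
                     "correct_placement": 71.8},
    "PH_rule": {"correct_rule": 92.2, "other_version": 2.0,
                "correct_placement": 76.8},
}


def rule_placement_gap(g):
    """Return (percent of trials with any correct version of the rule,
    that percent minus the percent of correct placements)."""
    any_version = g["correct_rule"] + g["other_version"]
    return round(any_version, 1), round(any_version - g["correct_placement"], 1)


for name, g in groups.items():
    any_version, gap = rule_placement_gap(g)
    print(name, any_version, gap)
```

A positive gap means correct rules were stated more often than cards were correctly placed (94.2 against 76.8, a gap of 17.4); a negative gap means the reverse (48.4 against 71.8, a gap of 23.4 in the other direction).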

In a later experiment by Rilling (1962) on the reinforcing properties of "right" and "wrong," one group underwent an experimental procedure which replicated that of group PH. He obtained results almost identical with those of Oskamp (1956): on 72.8 percent of the trials, the placement was correct; on only 57.1 percent of the trials was any version of the experimentally correct rule given.

The results may be summarized as follows: under partial reinforcement, the statement of a specific rule retains considerable strength, as do simple operants. The strength is, in fact, greater than that of the behavior that the rule is presumed to control--here, the placement of a card. Where reinforcement is contingent on placement, a higher percentage of correct placements occurs than can be accounted for by correct rules. Experimentally, the subject's rules, his "hypotheses," can be dissociated to a degree from the behaviors that they are presumed to direct. He does not carry out his intentions.

In fairness both to theorists, and to the conceptual system within which this experiment was done, it is now necessary to introduce a term for these statements-of-a-rule by our subjects. They must be distinguished from the "hypotheses," referred to in many theories and from the rules followed by the experimenter in conducting the experiments. The term chosen is "monent," derived from a Latin verb meaning "advising, guiding, or directing," and it is "monents" that now become subject to a number of experiments aimed at determining further how a subject's verbal behavior acts in controlling others of his behaviors. The outcome of this experiment leads, also, to further methods of investigating such verbal behaviors, and hence to data that have shown their status as operants, their discriminative stimuli, and the kinds of events that reinforce them. For clarity of exposition, we will reserve the words "rule" and "principle," for the rules followed by the experimenter. Let me summarize very briefly a variety of experiments, in the approximate order in which they were done, with a brief account of the immediate context in which they were performed. All of them are based upon the experimental method of shifting the basis of reinforcement from monent to monent, from monent to placement according to one or another rule, from placement to placement, and back again.

A. Extinction and recovery. In order to determine how monents behave under extinction, we performed a number of experiments using the same stimulus materials, the same set of instructions as those given to groups PH and PH, and the same general method. [Many of these effects can be obscured by averaging the data of subjects. It is the individual subject whose behavior is orderly. Combining the data of many subjects serves not only to force discontinuous data into a guise of continuity, but it also yields a degree of variability that leads one to seek "significance" by placing more and more subjects in each group, rendering it still less likely that he will either observe carefully the behavior of any one individual, or sharpen up the experimental design. Subjects do differ from one another, and in ways that make group data treacherous.]

A simple demonstration comes when one gives the subject instructions to state the rule he is trying before each placement, and then tells him "wrong" on every trial. Latencies of monents increase progressively, more and more improbable monents occur when they are finally given ("can be used to carry opium" is the writer's favorite--they sound like something useful for a projective test!), and finally the subject gives up--"I can't think of anything else;" "my mind's a blank," and so on. Only very rarely does a subject come up with the one paradoxically reinforceable monent: "Anything I say is going to be wrong!"

Extinction with spontaneous recovery occurs when the experimenter delivers reinforcement according to the following rules: reinforce five consecutive times the second monent stated by the subject (i.e., the monent first stated by the subject on the second trial); extinguish this monent thereafter, but give five consecutive reinforcements to the second new monent given after the last instance of the first reinforced monent. Repeat this shift in reinforcement two more times until each of four different monents has received five consecutive reinforcements, then shift to reinforcement of placement according to a rule that does not correspond with any of the subject's monents. Under these conditions, subjects will eventually reach the criterion of 100 percent correct placement, but the monents they state typically resemble closely the initially reinforced four. These recur, to be re-extinguished and again to recover spontaneously. The subject often is never able to state the rule followed by the experimenter in reinforcing placements, even though he reaches 100 percent correct. Under these conditions, subjects may take several hundred trials to reach solution.

B. The monent as a chain of responses. The protocols of this and of similar experiments show that the monent is a chain composed of two responses, made up of a word or phrase descriptive of the card, the "notate," linked to an instruction, the "predocent," such as "put to the right," or "goes to the right." A notate may not recur after a single reinforced occurrence. If the subject says "people go to the right," and gets no reinforcement, he is not likely to try the logically expected "people go to the left;" he is more likely to say something such as "cards with blue go to the right." The two parts of the monent thus may be separated; their initial strengths differ greatly, as does their resistance to extinction.

A notate (Latin--roughly translatable as "what has been observed") is defined as any word or phrase given in response to a stimulus or to an object incorporating stimuli. Notates can be further characterized as "descriptions," "associations," "discriminated responses," "descriptive characteristics," "categories" or even, "verbal percepts." Notates are stimulus-controlled and are symbolic of one or another feature of the stimulus. They are synonymous, then, with Skinner's (1957) tact. The second part, "put to the right," "goes to left," termed the "predocent" (roughly "instructing beforehand"), is defined as a verbal response that is an S^D for motor behavior. (One might expect that there would be a third member of the chain, "is correct." Such occur very rarely.)

C. Some response equivalences, and lack of them. In some experiments, subjects have been permitted to say "same." If, after a series of "sames," the subject is asked what "same" means, he gives the monent last stated. That is, the subject's "same" can be believed, and reinforcing "same" gives results identical, insofar as can be determined, with those obtained by reinforcing the last previously stated monent itself. Another effect should be noted: reinforcing "borders go to the left" is ordinarily equivalent to reinforcing "nonborders go to the right." Under some circumstances in placement reinforcement, which we would hesitate to try to characterize as yet, the two may be dissociated, and the subject may systematically say, "borders to the left," and "animals to the right," depending on the stimulus card presented. That is, monents may adventitiously become differentially reinforced with respect to stimuli. The effects of the adventitious reinforcement of "borders" when presented with cards having borders are not incompatible with those of the adventitious reinforcement of "animals" to cards with animals, and to cards with both borders and animals.

Again under circumstances that have not yet been determined, subjects may show a perfect discrimination for placements to the right, and show no discrimination of placements to the left, without respect to the strength of any monent. In these cases, some cards that belong on the right are being put to the left, and the S^D is a subclass of the stimulus the experimenter has chosen.

D. The discrimination process: extinction of placements to S_. Further analyses were made on the data obtained on individual subjects in groups P and PH of the initial experiment, and on subjects in other experiments following similar procedures. In these, cumulative frequencies of placements to the right are plotted as a function of cumulative instances of (a) S^D (i.e., the class of cards that belong on the right according to the experimenter's rule) and of (b) S_ for this response. A similar pair of curves is plotted for placements to the left. Two of these curves (the two S_ - R curves) fall off as extinction curves. Under PH instructions, the correct monent tends to occur for many subjects only after considerable extinction has taken place. When this occurs, the extinction process is "short-circuited" out, and the extinction curve takes a slope of zero at once. But considerable (and recoverable) resistance to extinction for either R in the presence of their S_'s remains, to reveal itself in "careless errors."

These results emphasize the fact that monents are not discriminated, but once they occur correctly, may be reinforced on every trial thereafter, whereas placements to the right, or to the left, can be reinforced only when their discriminative stimuli (cards that go to right, and to left respectively) are presented. Placements seem governed by Spencian laws, based on differential reinforcement with respect to two sets of stimuli, that is, with reinforcement of correct, and nonreinforcement of incorrect responses with respect to their stimuli. The correct monent, by contrast, as in simple operant conditioning, is reinforced on every trial, irrespective of the particular stimulus presented, and single reinforcements yield immediate repetitions. Both continuity and noncontinuity theories are substantially correct--but for different behaviors. But unless reinforcement of monents is experimentally distinguished from that of placements, the correct monent will "take over" as soon as it occurs, and will obscure the gradual development of a discrimination.

E. Differential reinforcement of monents. It should be possible to place monents under discriminative control by making reinforcement of a particular monent contingent upon the presence of a particular discriminative stimulus. Thus, under S^D (as, experimenter leaning forward, or the card presented sidewise) "people to the right, nonpeople to the left" can be reinforced, and under S_ (experimenter sitting up straight, or the card presented straight up and down) "cards with borders go to the right, nonborders to the left." (This is evidently the "conditional hypothesis.") Experiments of this sort were done, and the expected discrimination curves for the monents were found.

F. Manipulability of availability of monents. When subjects are used in a series of experiments, with the reinforced monent varied from time to time, there are large transfer effects. An initially improbable monent may appear first in a new experiment if it has been reinforced in an earlier one. A subject's repertory of monents, and their relative probabilities, may be manipulated over a wide range ("salience").

G. Covert monents. It should be emphasized that no assertion has been made that the spoken monent is the only verbal behavior involved. Subjects show many signs of covert verbal behavior, and much of this becomes overt when the experimenter asks questions. The subjects' answers often yield additional notates and monents different from what was given aloud, or give elaborated versions of the overt one ("I was wondering if it had something to do with alternate piles, too," or "It may be a particular kind of people"). These previously unstated monents (where the subject is not following the experimenter's instruction to him) may pick up a few reinforcements adventitiously.

H. Conditioning of covert monents, as superstitions. An experiment was designed to determine whether a covert event that corresponds in its behavior with the monent occurs. Subjects, run together in sets of five before an audience, have been given the following instructions: "You will be shown a series of pictures. Following a simple rule, some of them are plusses, and some minuses. Your job is to find the rule that makes each card a plus or a minus. On each trial, write in your data book whether you think the picture is a plus or a minus, and you will be told whether you are right or wrong each time. When you think you know what the rule is, put a check next to your answer on that trial. When you are certain what the rule is, put a double check." The subjects were then individually reinforced according to an arbitrary prearranged schedule, independent of their overt response, although the individuals delivering the reinforcements went through the motions of looking at them, before saying "right" or "wrong."

On trials 1, 3, 4, and 7, all subjects were told "wrong." On all other trials through trial 30, all subjects were told "right." Over the next 12 trials (31-42), one of each set of five subjects was told "wrong" once, another 3 times, another 6 times, another 9 times. On the other trials, all were told "right." In each set, one control subject remained on continuous reinforcement, that is, he was told "right" on every trial past 7. All subjects were then continuously reinforced for a further 28 trials (to a total of 70). At the end, the subjects were asked to write down the rule that was correct, and how sure they were of it, and, if the rule changed, to write down the second rule, and how sure they were of it. This procedure has been replicated a number of times.
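
The prearranged schedule just described can be written out explicitly. The following Python sketch is an illustration only: the text fixes how many "wrongs" each subject in a set received over trials 31-42, but not the trials on which they fell, so the even spacing used here is an assumption, as are the names `build_schedule` and `WRONGS_BY_GROUP`. The group letters follow Table 2.

```python
# Sketch of the prearranged "right"/"wrong" schedule over 70 trials.
# Assumption: the text does not say where the "wrongs" fell within
# trials 31-42; they are spread evenly here purely for illustration.

INITIAL_WRONG_TRIALS = {1, 3, 4, 7}                          # stated in the text
WRONGS_BY_GROUP = {"A": 0, "B": 1, "C": 3, "D": 6, "E": 9}   # per Table 2
TOTAL_TRIALS = 70
TEST_BLOCK = range(31, 43)                                   # trials 31-42 inclusive


def build_schedule(n_wrong):
    """Return a dict mapping trial number -> "right" or "wrong"."""
    block = list(TEST_BLOCK)
    # Hypothetical placement: n_wrong evenly spaced trials within the block.
    wrong_in_block = {block[(i * len(block)) // n_wrong] for i in range(n_wrong)}
    schedule = {}
    for trial in range(1, TOTAL_TRIALS + 1):
        is_wrong = trial in INITIAL_WRONG_TRIALS or trial in wrong_in_block
        schedule[trial] = "wrong" if is_wrong else "right"
    return schedule


schedules = {group: build_schedule(n) for group, n in WRONGS_BY_GROUP.items()}
```

Because the schedule is built before the session and never consults the subject's written answer, any covert monent present on a "right" trial is reinforced regardless of its content, which is the sense in which it can be conditioned as a "superstition."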

In this procedure, then, reinforcements occur--are "shot in"--at times when a monent should have occurred covertly, but the reinforcement was independent of what the monent might be. Monents could be conditioned, then, as "superstitions." The results indicated that covert monents occur, and that they behave under reinforcement as do overt ones.

We may state with confidence that monents occur covertly, and that they are then subject to the same laws of reinforcement as when they are overt.

In all these experiments, the behavior of individual subjects was orderly to a high degree; the subject's "thinking" came under the experimenter's control in very much the way the behavior of a rat does when a response is being shaped. On the other hand, questioning a subject at the end of these experiments on what he was doing, or what he thought was going on, or how he solved the problem, yields a good deal of verbal behavior that usually corresponds poorly with what the subject had in fact been doing, or how frequently he had been reinforced. It reflects very seldom the environmental variables whose control led this subject to behave as other subjects do under the same procedure. What the subject answers to such questions seems to be most closely related to his behavior over the few trials immediately prior to the questioning, and suggests a short-range "immediate memory." Rationalizing, not reasoning, seems to be the appropriate term. The statements recall the flavor of the introspective protocols given by subjects in the functionalists' experiments at the beginning of the century. One can hear and see what led Watson to behaviorism.

In the preceding experiments, the experimenter was limited by the fact that he had to keep track of, and record, two kinds of behavior--the monent, and either card-placement, or writing + or -. Moreover, in delivering reinforcement, there was inevitably the ambiguity that both placement and monent could be reinforced on any one trial (the ambiguity is evident to remarkably few subjects). A new procedure was therefore developed that eliminated one of the two behaviors, and hence the ambiguity. It enabled us to study the verbal behavior alone.

The subject is presented with two side-by-side piles of cards, picture side down. These have previously been sorted by the experimenter according to some rule or sequence of rules. The instructions are: "All the cards on the right differ in a systematic way, that is, in the same way, from all the cards on the left. Your job is to turn the cards over, a pair at a time, and for each pair tell me the rule that you think distinguishes all the ones on the right from all the ones on the left. I'll tell you whether you are right or wrong." By stacking the cards, the experimenter can arrange for several rules to apply successively for fixed numbers of trials, thus providing the experimental conditions for extinction, counterconditioning, and the like.

Table 2
Number of Changes in Monents Reported as a Function of
Number of Nonreinforcements through Trials 31-42 (N = 25)
Kind of Change	Group A	Group B	Group C	Group D	Group E
No. of "Wrongs," Trials 31-42	0	1	3	6	9
None	5	2	1	1 (a)	0
Minor change (b)	0	3	3	2	2
Complete change (c)	0	0	1	2	3

A. The notant: a chain of notates. As with the monent, the verbal behaviors, such as "cards with blue showing are on the right," constitute a chain. As with the monents, a single nonreinforced occurrence usually eliminates the notate that is the first member of the chain (and the subject does not say "cards with blue are on the left"). From this fact, and from the fact that these statements do not direct the subject to do anything further, it becomes necessary to distinguish between these chains and monents. The first member of both, the discriminated verbal response to a feature of the card ("blues," "girls," "single object"), is a notate. The second member for monents is the "predocent," which "tells the subject what to do." The verbal chains that state an order in the environment are termed notants. Their second member is a "predicant," roughly translatable as "predicating something about the environment," which is defined: a verbal response to a notate, incorporating one or more other notates. The notants in the present series of experiments are all of the sort "cards with borders are on the right," or "the right pile includes all the bordered cards." Border and right are notate and predicant respectively. The distinction between predicants and other notates is an operational one; in these experiments, the stimuli for the predicants are presented on every trial. Those for other notates need not be. The order in these chains is a matter determined largely by grammatical constraints and is often of no great importance.

B. Reinforcement by confirmation. Initially, in these experiments the experimenter told the subject "right" or "wrong" following each notant. It soon became obvious that he need say nothing, and that the instructions could be changed. A notant shows the effects of reinforcement (one-trial change in response probability, and progressive-with-trials increments in resistance to extinction) as a function of the pair of stimuli presented to the subject on the following trial. If these stimuli elicit the notant given on the previous trial, they reinforce it. Such confirmation does not differ in its control over behavior from social reinforcement "right" and "wrong," except quantitatively (vide infra, D). A confirmation is a reinforcing stimulus.

C. Social vs. confirming reinforcement. In some experiments on notants, the experimenter's "rights" and "wrongs" were given in contradiction to the reinforcement (by confirmation) given by the prearranged stacking of cards. These results are of importance in their own right, since striking individual differences in behavior are observed under these conditions. Some subjects under these conditions are controlled primarily by the social reinforcers, and others ignore these, and behave in conformity with the nonsocial confirmations.

D. Relative availability of notants. It was found possible to arrange the cards so that the availability of a given notate can be varied through a considerable range. This is done by arranging the cards in each of the two stacks in the order of ascending, or descending, probability that each will elicit the experimentally correct notate and no others. (E.g., border vs. no-border is ordinarily a very difficult notate. However, it may be produced on trial number 1 by presenting the subject with a pair of cards about which there is nothing to say but "border," that is, two blank cards, one with a border.) The availability of a particular notate (which it will now be evident is almost identical with "concept") proves to be a simple function of the sequence of environmental events, and of the subject's previous experimental history. It is readily manipulable by the experimenter.

E. Extinction. In these experiments, nonreinforcement of a notant can be carried out by one or another of a number of different operations. Let us say the notant is "flowers on the right, nonflowers on the left." Nonreinforcement of this notant can be associated with (a) systematically presenting a flower on the left, and no flower on the right, (b) systematically presenting no flowers at all, on either side, (c) systematically presenting flowers on both sides, and (d) having the two decks randomized with respect to flowers. All four procedures yield extinction curves, but it has not yet been determined whether the last three produce results different from one another. The first of the four counterconditions a new notant--"flowers on left" (cf. B, under Monent). The notant continues to be reinforced; this corresponds with the "reversal shift," which seems to puzzle some theorists. With b, c, and d, the cards may be stacked so that a notant which incorporates a new notate can be conditioned.
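
The four operations can be made concrete by representing each trial as a pair of cards coded only for the presence of a flower. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and operation (d) is modeled as independent 50/50 draws for each side, which the text does not specify.

```python
import random

# Sketch of the four nonreinforcement operations for the notant
# "flowers on the right, nonflowers on the left".
# Each trial is a pair (left_has_flower, right_has_flower).
# Assumption: operation (d) is modeled as independent 50/50 draws.


def extinction_pairs(operation, n_trials, seed=0):
    """Return n_trials (left, right) pairs for operation 'a', 'b', 'c' or 'd'."""
    if operation == "a":      # flower on the left, none on the right
        return [(True, False)] * n_trials
    if operation == "b":      # no flowers on either side
        return [(False, False)] * n_trials
    if operation == "c":      # flowers on both sides
        return [(True, True)] * n_trials
    if operation == "d":      # decks randomized with respect to flowers
        rng = random.Random(seed)
        return [(rng.random() < 0.5, rng.random() < 0.5) for _ in range(n_trials)]
    raise ValueError(f"unknown operation: {operation!r}")


def confirms_flowers_right(pair):
    """True when the pair would confirm 'flowers on the right, nonflowers on the left'."""
    left_has_flower, right_has_flower = pair
    return right_has_flower and not left_has_flower


def confirms_flowers_left(pair):
    """True when the pair would confirm the reversed notant, 'flowers on the left'."""
    left_has_flower, right_has_flower = pair
    return left_has_flower and not right_has_flower
```

Under this coding, operation (a) never confirms the original notant and always confirms its reversal, operations (b) and (c) confirm neither, and operation (d) confirms each only by chance.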

F. Counterconditioning. In experiments where a new notant is subject to reinforcement as the previous one undergoes nonreinforcement, the distinguishing notate drops out for a time after only one or two nonreinforcements. The full characteristic extinction curve of the first is obtained only over a long series of trials during which the second notant occurs on each trial and is continuously reinforced. In this case, after a number of trials, subjects often "tack on" the extinguishing notate, as follows: if "cards with borders on the right" was reinforced, then extinguished and "cards with blue showing on the right" then conditioned, subjects will, for example, say, when a card with both blue and a border appears on the right, "blues on the right, and there's a border."

When the second notant undergoes extinction, still more instances of the first notant recur.

G. Functions of the number of reinforcements. Resistance to extinction, the number of unreinforced responses that occur after the termination of reinforcement, is a function of regular reinforcements, here as in other conditioning. The subject's "certainty" is also a function of this number. After three or four consecutive reinforcements the subject is "pretty sure." After three or four more, he is "very sure," or "certain." Quantitative data of a sort may be obtained by asking the subject after each consecutive pair, or after a given number of regular reinforcements, how much he would be willing to bet that the next pair will conform with his notant.

H. "Refining" the notant. When the experimenter has applied two principles in stacking the decks (cards with both borders and people to right, cards with neither borders nor people to left), many subjects, when one of the two notants has been conditioned and is under continuous reinforcement, will stick with the first one, unmodified. A few subjects will, after a few more trials, emit the second notate as well, while the first is still under regular reinforcement. Some of them speak of this as "refining my hypotheses." Further experimental work is needed before we can determine under what conditions, and with what kinds of subjects, the latter highly adaptive behavior may be expected to occur.

I. Notants and monents. In general, subjects arrive at an experimentally correct notant far more quickly than they do the experimentally correct monent. This is true even when the difference in the number of cards presented per trial is taken into account. This finding is consistent with the observation that bystanders watching a subject perform in a concept-formation experiment of the card-sorting type often get the concept more quickly than the subject himself. The bystander is more effectively reinforced through observation of the cards that the subject has placed to right or left than the subject is by his own placement of them, and the differential social reinforcement he receives.

Concerned that the orderliness of the data obtained in these experiments might depend upon the particular stimulus-material used, and on the instructions given by the experimenter, we sought a very different kind of material that could be used in similar experimental manipulations. More particularly, we wished to deal with simple notates, unchained with other responses. Such material has been used by Underwood (1957), who compiled lists of words illustrating concepts, and has done experimental work utilizing them. As a result, we found ourselves in the area of word-association. With the new material, a still further simplification of the experimental procedure proved not only possible, but desirable.

The experiments that follow are all based on the use of stimulus material that is made up of sets of words, ranging in number from 20 to 50. Each set lists words that are the names of objects that have a single common property (objects that are round; rectangular; made of wood; made of paper, and so on).

On the basis of the work of Bousfield and others (e.g., 1953), all the words of each list should have some measurable probability of eliciting the same word (the "concept") in a word-association experiment. "Orange," "wheel," and "clockface" are all likely to yield "round." Initially, on a systematic basis, and now on an experimental one, these verbal responses have been identified as notates, and a concept is recognized as that class of stimuli all of which control the same notate. The name of the concept is given by the notate controlled by it.

The first experiment was the simple and obvious one, essentially replicating experiments that had already been done, but in a context, and using methodological details, that were new. The subjects were (individually) instructed as follows: "I will read you a list of words, all of which have something in common. Your job is to figure out what they all have in common. After each word, tell me what you think the common element or feature is, and I will tell you whether you are right or wrong." In these experiments, the subject's behavior shows nothing that was not already familiar from the previous sets of experiments on notants.

As before, social reinforcement proved unnecessary; reinforcement by confirmation, given by the occurrence of a second word eliciting the same notate, was similarly effective in (a) altering the probability of response after its first occurrence, (b) building resistance to extinction, (c) progressively building the subject's certainty that he is "right," and (d) increasing his tendency to give the same notate to an initially ineffective or weak stimulus for it.

By arranging words in order of notate probabilities, the number of trials required by the subject to reach the correct notate can be varied up and down. Lists can be "stacked" as were the cards in the previous experiments. (See Appendix.)
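
"Stacking" a list in this sense amounts to sorting its words by their probability of eliciting the target notate. The short Python sketch below illustrates the idea; the probabilities are hypothetical placeholders chosen for the example, not values reported here, and `stack_list` is an illustrative name.

```python
# Sketch: ordering a word list by each word's probability of eliciting
# the target notate. The probabilities are hypothetical placeholders,
# not values reported in the text.


def stack_list(notate_probability, easiest_first=True):
    """Return the words sorted by their probability of eliciting the notate."""
    return sorted(notate_probability, key=notate_probability.get, reverse=easiest_first)


# Hypothetical probabilities of eliciting "round" for four words from list A.
hypothetical_round = {"wheel": 0.6, "orange": 0.3, "clockface": 0.4, "cigarette": 0.05}

quick_list = stack_list(hypothetical_round, easiest_first=True)   # strongest stimuli first
slow_list = stack_list(hypothetical_round, easiest_first=False)   # weakest stimuli first
```

Presenting the strongest stimuli first should bring the correct notate out in few trials, and presenting the weakest first should delay it.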

Two classes of notates occasionally occur that are almost impossible to extinguish. The first is one so general that it is available as a response to almost any noun, e.g., "useful to humans." The other class of undisconfirmable notates consists of words that are inexact in their level of abstraction. One subject (a psychologist) given list A of the Appendix, and immediately thereafter list C in reverse order, gave "container" to the second stimulus word, "barrel." After the seven ensuing reinforcements of "container," "cigarette" yielded: "Container--contains air." The identical response was given to "wheel." Clock face "contains time." Objects thereafter contained food value, atoms, merit, and so on. A fascinating performance.

The effects produced when social and environmental reinforcement are given in contradiction to one another replicate those of the previous experiments on notants.

Altogether, these experiments confirmed the generalizations that had been arrived at, and rendered it most improbable that they were artifacts of the specific stimulus materials that had been used.

A. Notates and word-associations. When a subject is presented with a list of words, all members of one concept, but is instructed that this is a word-association test and that he is to say the first word he thinks of as soon as the word is pronounced, there seems to be a tendency for the correct notate to occur more often toward the end of the list. If, at the end of the list, the subject is told--"All the words I gave you were of the same sort: they were examples of the same kind of thing. Did you notice? What were they?"--most subjects are immediately able to state the concept. (Subjects who cannot state it immediately do so after one or two words of the list when the list is now reread.) With no instructions to do so, they have "solved the problem"--which had not been stated. The mere presentation of a series of stimuli, all of which control the same response, alters the probability that the response will occur.

In an elaboration of this experiment, a group of 36 high school students was given a "word-association test," in which four stimulus lists of 25 words each were given ("red," "footwear," "food," and "furniture"). Each word was spoken 6 times consecutively, at 4-second intervals: thus, up to six responses could be written to each (most subjects were able to give six consistently). After all the responses had been made, subjects were told that all the words on each of the four lists illustrated different concepts, and were asked what they were. Table 3 gives the results.

These results show that subjects do indeed find concepts, even when not instructed to do so.

Examination of the data sheets reveals the word associations that compelled such correlated concepts. They show that the concept acquired by each subject is typically determined by his most frequent response, and that occurrence of a response increases its probability of occurring again. The "erroneous" concepts given by these subjects were produced by their most frequent responses. This is best seen by the concept "accident, injury, violence, death" of the third list. The first word of this list was "blood," to which the great majority of college students give, as their first response, the word "red." The second word was "stop-light," the second most effective, for college students, in producing "red." When presented in this order to the 36 high school students in November 1960 their first responses to "red" were given as in Table 4. When the subjects went on to "stop-light," they frequently produced "police car," "arrest," and related words. Having responded with words associated with crime, they tended to continue to do so. (Many "misheard" the word "radish" as "ravish," and responded accordingly.)

It is not surprising that 13 of the 36 identified the concept, in retrospect, as Table 3 shows.

Quite clearly, the concept they "get" is the response they have just made most frequently. With "concept" instruction, this same list is gotten 100 percent correctly in a matter of four or five trials.

Incidentally, the variety, not to say candor, of these students' responses makes one wonder as to the generality of association data gathered on standard college sophomore beginning psychology students.

We started with an explicit attempt to determine how the rules, the "hypothesis," which the subject "tries out" in operant conditioning and concept formation experiments, operate in controlling his behavior. We wound up, far afield, in word-association experiments. We started with a frank attempt to find out, irrespective of whether it is necessary for conditioning, how verbal behavior operates. We wound up with a new area where "incidental learning" takes place. The results of these experiments justify some tentative generalizations that may prove of use not only in bringing order into some of those areas of human learning where problems of "awareness" have arisen, but also in rendering problem solving and similar complex behaviors amenable to experimental elucidation rather than theoretical elaboration.

I. When a discriminative stimulus is presented to a human subject, it produces, at different probabilities, a very broad variety of verbal responses. Each of these responses is termed a notate. Both the number and specific identity of those which are given overtly will be functions of the specific instructions that are given to the subject. Whether overt or covert, these responses are operants ("voluntary," if you will), and are subject to alteration in both probability of occurrence, and resistance to extinction.

II. The probability of occurrence of a given notate to any one of its stimuli is a function of the numbers of preceding presentations of others of its stimuli. That is, the greater the number of a notate's stimuli that precede a specific one, the greater the probability that the notate will be given to that specific instance. This statement in itself may be no more than a rephrasing of a general law of stimulus summation; with continued presentation, a stimulus that is initially inadequate for a given response may elicit, or release the response.

It follows, then, that the repetition of stimuli that initially do not produce a specific notate overtly, or (as revealed by questioning) covertly, will progressively tend to do so as they are presented following more and more stimuli which also have some low probability of yielding it.

(From this, it also follows that the introduction of a human subject into a given experimental situation will eventually lead him to respond systematically to initially "unnoticed" features of the environment. For example, if he gets "conditioned," he will almost necessarily notice it. Similarly, subjects will sooner or later start "making hypotheses" about features of the experimental setting and procedure which have been eliminated as controls over behavior by being held at constant values (or so the experimenter thinks).)

III. If a notate is stated on one trial, and if a stimulus for the same notate is given on the following trial, the notate is reinforced by confirmation, in the absence of any social reinforcement. A single reinforcement is sufficient to produce some resistance to extinction. If the notate is correct, with this one confirmation it reaches its maximal relative frequency with respect to instances of its stimulus class. It is "stuck in," and continues to be given so long as its stimuli occur.

IV. The effectiveness of reinforcement by confirmation is amplified many times by the experimenter's instructions to the subject, and by the subject's instructions to himself. What was initially a very weak reinforcer becomes, by instruction, an extremely strong one. The subject's certainty, his willingness to bet that he is right, is a simple function of the number of continuous reinforcements.

V. The statements about the environment made by a subject to himself are found to be of two sorts: those which simply describe the environment, but suggest no further behavior (notants), and those which direct further behavior (monents). The latter are self-instructions, instructions of the subject to himself. They tell him what to do. Most of the time, he does it. Such monents may also be introduced to guide the subject's behavior by statement in the instructions.

The way to determine how a subject's behavior is guided by self-instructions is by the systematic experimental manipulation of instructions either to himself or from another. It is not wise to assume, as is usually done, that a subject will do what he is told to do, whether by himself or by another. Nor does it make sense to assume that, if we but knew the self-instruction, we would know "what the subject is really doing," or "what is controlling his behavior." Such relationships need to be experimentally established. It is encouraging that some aspects of this problem are now being explicitly investigated by Grant (1962), who has found not only some expected results, but some unexpected ones: apparently innocuous or inconsequential alterations in instructions can yield some large, unpredicted, and as yet cryptic quantitative changes in subjects' behavior.

VI. In most experiments on conditioning, problem-solving, and the like, the experimenter follows one rule throughout the experiment. From the foregoing it follows that the subject will almost always "find the rule," even when he has not necessarily been instructed to do so. It will hence be all but impossible, in a highly ordered laboratory situation, when the subject is "in an experiment," to preclude him from finding and stating the rules followed by the experimenter. He need hit the "right" rule only on one occasion for it to become subject to regular reinforcement. Only by devious means, as by distraction, can one expect to prevent a subject from verbally responding to the significant variables of the experiment.

VII. The subject's "certainty" that a rule is correct is a function of the number of continuous reinforcements it has had. Other schedules of reinforcement also increase resistance to extinction, but with another effect on "certainty." (As a subject on 60 percent reinforcement in group PH said in explanation of his behavior, "Well, I knew it wasn't exactly right, but it was right most of the time, so I stuck with it.")

VIII. Reinforcement by confirmation is imprecise, not well-suited for shaping. The probability that the subject will get the exactly correct rule or principle will be determined by the sequence of stimuli given him, and only with precise control of these stimuli can such successful "solutions" be assured. Those experimenters who wish to shape up the correct notate, notant, or monent can do so, but when these verbal operants are allowed to occur covertly, picking up essentially uncontrolled reinforcements, some odd superstitions may occur.

IX. It would appear that whenever a monent is on continuous reinforcement, so that reinforcement is delivered alike to monent and the behavior it "directs," it will exert maximal control over the behavior for which it is the predocent.

X. Only by dissociating, in one way or another, the reinforcement of the monent from the reinforcement of the behavior controlled by the monent is it possible to show the nature of their relationship. Under partial reinforcement of the behavior, the strength of the correct monent becomes weaker than that of the behavior, and under partial reinforcement of the monent, its strength exceeds that of the motor behavior. The remaining resistance to extinction of the incorrect responses reveals itself in the form of occasional "errors."

Closing Remarks

Where does this all leave us with respect to "awareness?"

"Awareness," as it has been described, seems to have been assigned no particular properties as a consequence of which differential behavior might be expected. It is used rather as a verbal magic that allows one to say that operant conditioning is not operant conditioning, because the subject was "aware." There are alternatives, however.

The burden of the experiments here reported seems to be this: Watson's "verbal reports," and Hunter's "SP-LR's" can be dealt with as can any other behavior. They do not need to be ignored, as they are by some. They do not need to be treated purely as reflecting some other process, some solely inferrable state, whether "mediating process," "consciousness," or "awareness." As relevant behaviors, they can be experimented upon directly. When this is done, these verbal behaviors not only reveal orderliness with respect to both discriminative and reinforcing stimuli like that of nonverbal behaviors, but also they show their function as discriminative stimuli in directing and controlling other behaviors. In this, they show properties that they do not share with simpler motor activities, or with nonsense-syllables. A further, fuller empirical investigation of their quantitative characteristics should, we can state with some confidence, make questions of "awareness" of limited empirical significance. When these relationships are more fully elucidated, the word "awareness" may prove as dispensable as, say, phlogiston.

As an experimental strategy, then, let us remain unaware of awareness, but let us diligently ask the subject what he is, or "thinks" he is, doing, and let us, using the methodology that has proven fruitful in showing the order in explicitly nonverbal behaviors, determine how such verbal statements behave, and, in turn, how they are related to--sometimes control--other ongoing activities.

Appendix

The reader is invited to try out the following on himself, and on other subjects.

Give individual subjects one or another of the instructions in the text, and then present the words in one of the three lists either as ordered, or in reverse direction, or scrambled, with an interpresentation interval sufficient to allow the subject to state the notate for each stimulus word. Most revealing, give the subject list A following instructions to find the concept, and then, immediately, with no indication to the subject of a change, present list C in reverse order. This last procedure will often yield, to a single stimulus item following a critical nonreinforcement, several notates in succession, chained into an elaborate "hypothesis"--a notant.

Word Lists
A	B	C
coffee can	envelope	danger-sign
barrel	post card	blood
bottle	stamp	stoplight
flower-pot	ticket	flare
bowl	magazine	rash
wastebasket	newspaper	fire-engine
cup	dollar-bill	exit-sign
tin can	blotter	brick
oil drum	kleenex	nailpolish
cigarette	shoebox	lipstick
wheel	ruler	ruby
clockface	bulletin board	fir
porthole	cigar box	cardinal
telephone dial	plank	poinsettia
dime	drawer	rose
discus	desktop	blush
sun	mattress	lips
cake	brick	raw meat
doughnut	trunk	tongue
meatball	U.N. Building	radish
pie	refrigerator	raspberry
orange	chocolate bar	apple
sausage	lump of sugar	cherry
grapefruit	butter patty	beet
macaroni	sandwich	strawberry

References

Bousfield, W. A., & Cohen, B.H. (1953). The effect of reinforcement on the occurrence of clustering in the recall of randomly arranged words of different frequencies of usage. Journal of General Psychology, 52, 83-95.
Estes, W.K. (1950). Effects of competing reactions on the conditioning curve for bar pressing. Journal of Experimental Psychology, 40, 200-205.
Gibson, J.J. (1960). The concept of the stimulus in psychology. American Psychologist, 15, 694-703.
Green, E.J. (1955). Concept formation: a problem in human operant conditioning. Journal of Experimental Psychology, 49, 175-180.
Oskamp, S. (1956). Partial reinforcement in concept formation: "hypothesis" in human learning. Unpublished Master's thesis, Stanford University.
Rilling, M. (1962). Acquisition and partial reinforcement of a concept under different verbal reinforcement conditions. Unpublished Master's thesis, University of Maryland.
Shepard, R.N., Hovland, C.I., & Jenkins, H.M. (1961). Learning and memorization of classifications. Psychological Monographs, 75, No. 13 (Whole No. 517).
Skinner, B.F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.
Underwood, B.J. (1957). Studies of distributed practice: XV. Verbal concept learning as a function of intralist interference. Journal of Experimental Psychology, 54, 33-40.
Verplanck, W.S. (1954). Burrhus F. Skinner. In W. K. Estes, et al., Modern learning theory. (pp. 267-316). New York: Appleton-Century-Crofts.
Verplanck, W.S. (1956). The operant conditioning of human motor behavior. Psychological Bulletin, 53, 70-83.
Verplanck, W.S. (1957). A glossary of some terms used in the objective science of behavior. Psychological Review, 64, (Suppl.), 1-42.
