THE UNIVERSITY OF MICHIGAN
COLLEGE OF LITERATURE, SCIENCE, AND THE ARTS
Department of Psychology

Final Report

THE ROLE OF EXTINCTION IN REVERSAL LEARNING WITH RATS

David Birch

ORA: Project 03.455

under contract with:
NATIONAL SCIENCE FOUNDATION
NSF-G9779
WASHINGTON, D.C.

administered through:
OFFICE OF RESEARCH ADMINISTRATION, ANN ARBOR

January 1962


ACKNOWLEDGMENTS

The author wishes to acknowledge with pleasure the contributions of James Allison, L. Thomas Clifford, Robert House, James R. Ison, Sally E. Sperling, and Theron Stimmel, both in assistance in the laboratory and in discussions in the office.


TABLE OF CONTENTS

LIST OF TABLES
ABSTRACT
BACKGROUND OF THE RESEARCH
PARTIAL REINFORCEMENT OF IRRELEVANT CUES
EXTENDED TRAINING IN REVERSAL LEARNING
DISCRIMINATION LEARNING EFFECTS ON RESISTANCE TO EXTINCTION
REFERENCES

LIST OF TABLES

I. Design and Data from Lawrence (1949)
II. Absolute and Patterned Component Analysis of Stimulus Arrays Used by Lawrence (1949)
III. Lawrence (1949) Data Rearranged to Show Transfer Errors as a Function of Training and Transfer Conditions
IV. Design from Sperling (1960)
V. Mean Log Reciprocal Latency for W Trials on the Last Acquisition Day and over 15 Extinction Trials with Single Contrasts Among Group Differences from Acquisition to Extinction with 95% Confidence Intervals
VI. Response Speed to the Goal Platform as Measured by Mean Log Reciprocal Latency × 1000 for Successive Blocks of Six Trials During Extinction
VII. Analysis of Variance Summaries for Extinction
VIII. Mean Log Starting Speed (Times 100) for the Extended Training and Control Groups During First Fifteen and Last Three Trials of Acquisition and for Six Trials of Extinction
IX. Analysis of Variance Summaries of Acquisition and Extinction Data for 18- and 63-Trial Groups
X. Mean Log Time Spent Values (Log TS × 100) and Analyses for the 18- and 63-Trial Groups in the Goal Box and the Alternative Chamber for the First Two Entries into Each During the Pre- and Post-Tests
XI. Mean Log (1000/Latency) × 100 for the Five Groups of Experiment I at the End of Acquisition and During Extinction
XII. Means and Standard Deviations of Measures Appropriate to Evaluation of Extinction Performance for Approach Speed
XIII. Means and Standard Deviations of Measures Appropriate to Evaluation of Extinction Performance for Approach and Goal Platform Speeds

ABSTRACT

The experiments of the present project revolve around the major axis formed by the connection of a theoretical analysis of certain complex discrimination transfer studies to previous experimental work pointing to the possibly important role of resistance to extinction in such studies. Three investigations were tied most directly to this axis and were supported in full by the project. One of these, a Ph.D. dissertation, obtained evidence that an irrelevant stimulus (i.e., a stimulus with presentation uncorrelated with rewards and nonrewards) gains resistance to extinction appropriate to a partially reinforced stimulus. A second extended an earlier finding that the reversal learning effect (i.e., easier discrimination reversal following overtraining on the original problem) is at least partially due to resistance to extinction factors and can be obtained in situations which minimize the possible facilitating action of special perceptual factors. The third investigated whether the positive stimulus in a common laboratory discrimination problem demonstrates resistance to extinction more similar to that of a 100% reinforced stimulus or to that of a partially reinforced stimulus, with the results from two experiments favoring the first alternative.

Two other studies supported in full by the project pursued more specialized aspects of the extinction question. In a Ph.D. dissertation some basic aspects of the relationship between incentive motivation and performance were examined, a problem of considerable importance to the present investigations because of the proposed close relation between incentive motivation and resistance to extinction. The second study compared the resistance to extinction following 100% and 50% acquisition schedules under conditions of generalization, and indicated in a preliminary way that a partially reinforced stimulus may be altered considerably and still maintain greater resistance to extinction than an unaltered 100% reinforced stimulus. The project also partially supported an experiment which successfully attempted a new and relatively direct measurement of proposed avoidance characteristics of the effects of nonreward during extinction by comparing the time spent in the goal chamber with that spent in an alternative chamber following initial extinction trials.

BACKGROUND OF THE RESEARCH

The point of departure for the research reported here was arrived at as a result of the reinterpretation of certain data from experiments on complex discrimination transfer. These experiments, as introduced by Lawrence (1949, 1950), have three major aspects: (1) choice of original and transfer discrimination problems so as to rule out the direct transfer of either facilitating or interfering instrumental responses; (2) differing conditions of original discrimination learning for groups which (3) are subsequently compared in their learning of a transfer discrimination problem. These characteristics of the procedure, particularly the restraint imposed by the first, are important because together they force interpretation of differences among groups on the third task in terms other than those of the transfer of simple, learned instrumental responses. Lawrence (1949), for example, used his studies to introduce the notion of acquired distinctiveness of cues, a learning phenomenon related to the perception of stimuli and presumed to occur in addition to the learning of instrumental responses. Other investigators (Reid, 1953; Pubols, 1956) have since used the concept as an explanatory device in situations closely resembling those of Lawrence. The justification for the introduction and use of this additional explanatory machinery is that the data call for it. However, a most important component of the theoretical and procedural framework required to demonstrate this need is the definition of stimuli and responses in discrimination learning experiments. Given the definitions provided by Lawrence, the logic of the experiments is sound and the need for additional theorizing is clear. An alternative description, which draws heavily from those offered by Spence (1952) and Nissen (1950), provides the basis for a reinterpretation of the Lawrence results and for the research to be reported.
The focus of the reinterpretation is on the role of extinction in discrimination transfer with special attention given to resistance to extinction arising from (1) the partial reinforcement of irrelevant cues, (2) the extent and degree of original training, and (3) possible partial reinforcement effects resulting from the similarity of the positive and negative discriminanda. In each of these three cases the theoretical and experimental intent is not to show that the theoretical conception of acquired distinctiveness of cues is inappropriate or unnecessary in problems of complex discrimination learning and transfer, but rather to examine with care the theoretical and procedural framework used to justify the introduction of the mechanism into the theory. The potential importance of the conception is not under question. In fact, it is because the notion carries strong intuitive appeal that adequate experimental situations for demonstrating and studying it are so important.

PARTIAL REINFORCEMENT OF IRRELEVANT CUES

The basic hypothesis investigated by Lawrence is that, while S is learning instrumental responses in a discrimination problem, the initial order of distinctiveness of the various stimulus aspects is being simultaneously modified. Under investigation is the assumption that relevant cues (i.e., stimuli to which S must respond to learn the discrimination) will acquire distinctiveness while irrelevant cues (i.e., stimuli presented in an uncorrelated fashion with rewards and punishments) will tend to lose distinctiveness. By assuming that ease of discrimination learning is positively related to distinctiveness, the Lawrence hypothesis leads to predictions of differential ease in learning a transfer discrimination problem as a function of prior discrimination training with the same or similar stimuli. Table I summarizes the design, predictions, and data for Lawrence's first experiment (Lawrence, 1949, pp. 775, 777). These data show that the positive transfer groups, as designated by Lawrence, make fewer errors than both the negative transfer and the control groups in all comparisons. The direction of differences in errors between the negative transfer and control groups was predicted for only three of the six comparisons, however. The transfer task must be selected carefully to demonstrate the need for the acquired distinctiveness of cues hypothesis. "It must be possible to determine the influence that previous experience with a set of cues in one situation has on learning in a second situation, involving the same cues, when the transfer of overt instrumental responses from the first to the second task is ruled out. The instrumental behavior learned in the first situation must neither facilitate nor hinder the learning of the instrumental responses in the second situation" (Lawrence, 1949, p. 770).
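The two tallies reported above for Table I can be checked with a short sketch. The Python below is illustrative only and is not part of the original report: the names `TABLE_I`, `predicted_transfer`, and `tally_comparisons` are hypothetical, the means are transcribed from Table I, and the 6C entry is assumed to be 28.33, the value listed for that subgroup in Table III.

```python
# Mean transfer errors transcribed from Table I (Lawrence, 1949).
# Fields: dimension trained in the simultaneous problem, relevant and
# irrelevant dimensions in the T-maze transfer, mean errors in transfer.
# Assumption: the 6C mean is 28.33, as listed for that subgroup in Table III.
TABLE_I = {
    "1A": ("B-W", "B-W", "R-S", 9.67),
    "1B": ("R-S", "B-W", "R-S", 22.00),
    "1C": ("Wd-N", "B-W", "R-S", 26.67),
    "2A": ("R-S", "R-S", "B-W", 25.33),
    "2B": ("B-W", "R-S", "B-W", 30.00),
    "2C": ("Wd-N", "R-S", "B-W", 37.00),
    "3A": ("B-W", "B-W", "Wd-N", 13.33),
    "3B": ("Wd-N", "B-W", "Wd-N", 24.33),
    "3C": ("R-S", "B-W", "Wd-N", 14.33),
    "4A": ("Wd-N", "Wd-N", "B-W", 19.67),
    "4B": ("B-W", "Wd-N", "B-W", 29.33),
    "4C": ("R-S", "Wd-N", "B-W", 22.00),
    "5A": ("R-S", "R-S", "Wd-N", 27.33),
    "5B": ("Wd-N", "R-S", "Wd-N", 41.00),
    "5C": ("B-W", "R-S", "Wd-N", 39.00),
    "6A": ("Wd-N", "Wd-N", "R-S", 24.33),
    "6B": ("R-S", "Wd-N", "R-S", 25.00),
    "6C": ("B-W", "Wd-N", "R-S", 28.33),
}


def predicted_transfer(trained, relevant, irrelevant):
    """Lawrence's rule: '+' if the trained cue is relevant in transfer,
    '-' if it is irrelevant in transfer, 'o' if it is absent."""
    if trained == relevant:
        return "+"
    if trained == irrelevant:
        return "-"
    return "o"


def tally_comparisons(table):
    """Count the comparisons (1-6) in which the positive group has the fewest
    errors, and those in which the negative group exceeds its control."""
    errors = {}
    for group, (trained, relevant, irrelevant, mean_errors) in table.items():
        sign = predicted_transfer(trained, relevant, irrelevant)
        errors.setdefault(group[0], {})[sign] = mean_errors
    positive_lowest = sum(e["+"] < min(e["-"], e["o"]) for e in errors.values())
    negative_above_control = sum(e["-"] > e["o"] for e in errors.values())
    return positive_lowest, negative_above_control


print(tally_comparisons(TABLE_I))
```

Under these assumptions the first count covers all six comparisons and the second covers three of them (comparisons 3, 4, and 5), which agrees with the statements in the text.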
In the first experiment the form of the transfer task chosen was that of shifting from simultaneous to successive presentation of stimuli. Whether or not the transfer task chosen by Lawrence was adequate for his purposes depends to a great extent on the fashion in which the stimuli and responses are specified. In a Black-White simultaneous discrimination, Lawrence defines the stimuli as "Black on the left, White on the right" and "White on the left, Black on the right," and the responses as approaching black or turning left for the first set of stimuli and approaching black or turning right for the second. In the successive discrimination he defines the stimuli as "All Black (both right and left)" and "All White (both right and left)," and the responses as a right turn for the first and a left turn for the second. S, having learned in the simultaneous discrimination to approach black (or to turn left to the B-W complex and to turn right to the W-B complex), is neither at an advantage nor a disadvantage in his requirement to turn left when the stimuli are all white and to turn right when the stimuli are all black in the successive transfer problem. Thus the definitions of stimuli and responses chosen by Lawrence make his transfer task appropriate.

TABLE I
DESIGN AND DATA FROM LAWRENCE (1949)

         Simultaneous      T-Maze Discrimination Transfer       Mean Errors
Group    Discrimination    Relevant    Irrelevant    Predicted   in Transfer
1A       B-W               B-W         R-S           +            9.67
1B       R-S               B-W         R-S           -           22.00
1C       Wd-N              B-W         R-S           o           26.67
2A       R-S               R-S         B-W           +           25.33
2B       B-W               R-S         B-W           -           30.00
2C       Wd-N              R-S         B-W           o           37.00
3A       B-W               B-W         Wd-N          +           13.33
3B       Wd-N              B-W         Wd-N          -           24.33
3C       R-S               B-W         Wd-N          o           14.33
4A       Wd-N              Wd-N        B-W           +           19.67
4B       B-W               Wd-N        B-W           -           29.33
4C       R-S               Wd-N        B-W           o           22.00
5A       R-S               R-S         Wd-N          +           27.33
5B       Wd-N              R-S         Wd-N          -           41.00
5C       B-W               R-S         Wd-N          o           39.00
6A       Wd-N              Wd-N        R-S           +           24.33
6B       R-S               Wd-N        R-S           -           25.00
6C       B-W               Wd-N        R-S           o           28.33

In this training schedule, B-W stands for black-white cues, R-S for rough-smooth, and Wd-N for wide-narrow. Relevant cues are ones that are correlated with the rewards and punishments; irrelevant cues are ones that have no correlation with the reward. Positive transfer (+) is predicted if the group has had previous experience with the relevant cue, negative transfer (-) if the group has had previous experience with the irrelevant cue, and zero transfer (o) if the group has not had previous experience with either of them. Permission to adapt this table as shown has been granted by the editor of the Journal of Experimental Psychology.

Other definitions of the stimuli and responses in simultaneous and successive discriminations are possible, however. An example is that offered by Spence (1952). In his analysis Spence specified the stimuli in terms of absolute and patterned components and the response as that of approaching or not approaching these components. The present discussion adopts this alternative analysis with one modification relating to the conditions under which learning with respect to patterns may be expected. Spence proposed that discrimination will involve patterning only when absolute components are not differentially reinforced. Birch and Vandenberg (1955) showed that learning to patterns can occur even when absolute components are differentially reinforced. This finding will be used in analyzing the Lawrence studies. In reinterpreting the relevant-irrelevant stimulus results without appeal to changes in the distinctiveness of cues, two basic assumptions are made: (1) Both simultaneous and successive discriminations involve concurrent learning with respect to absolute and patterned stimulus components. (2) Irrelevant cues, defined by the experimenter as those aspects of the stimulus which are uncorrelated with reward, are under a partial reinforcement schedule and thus gain greater resistance to extinction than those cues always reinforced. These assumptions are brought to bear both on the differences consistently found in comparing the positive transfer groups with the negative transfer and control groups, and on the irregular (from the standpoint of the acquired distinctiveness of cues hypothesis) array of differences found between the latter two groups. In his 1949 study, Lawrence begins with three basic groups, each of which learns a simultaneous discrimination problem with one of three possible dimensions as relevant and with midvalue points on the other two, irrelevant dimensions. For notational purposes, let A, B, and C stand for the dimensions of brightness, texture, and size of the goal chamber and let the subscripts 1, 2, and 3 stand for low, middle, and high values on a dimension.
The stimulus arrays for the three groups on the simultaneous problem are shown in Table II along with an analysis in terms of absolute and patterned components. Counterbalancing of positive and negative stimuli is ignored, as are higher-order patterned components [e.g., (A1B2L)+ and (A3B2R)-], since their inclusion would not change the conclusions drawn on the basis of the simpler patterns. As shown in Table II, Group 1 is subjected to a schedule of reinforcement which yields (A1L)+, (A1R)+, and (A3R)-, (A3L)-, while Groups 2 and 3 receive (A2L)± and (A2R)±. This is of special importance because analysis will show these components to be crucial in the learning of the successive discrimination transfer problem.

TABLE II
ABSOLUTE AND PATTERNED COMPONENT ANALYSIS OF STIMULUS ARRAYS USED BY LAWRENCE (1949)

Simultaneous Stimulus Arrays

Group   Left          Right         Analysis
1       A1B2C2 (+)    A3B2C2 (-)    (A1)+ (A3)- (B2)± (C2)± (L)± (R)±
        A3B2C2 (-)    A1B2C2 (+)    (A1B2)+ (A1C2)+ (A1L)+ (A1R)+
                                    (A3B2)- (A3C2)- (A3L)- (A3R)-
                                    (B2C2)± (B2L)± (B2R)± (C2L)± (C2R)±

2       A2B1C2 (+)    A2B3C2 (-)    (A2)± (B1)+ (B3)- (C2)± (L)± (R)±
        A2B3C2 (-)    A2B1C2 (+)    (A2B1)+ (B1C2)+ (B1L)+ (B1R)+
                                    (A2B3)- (B3C2)- (B3L)- (B3R)-
                                    (A2C2)± (A2L)± (A2R)± (C2L)± (C2R)±

3       A2B2C1 (+)    A2B2C3 (-)    (A2)± (B2)± (C1)+ (C3)- (L)± (R)±
        A2B2C3 (-)    A2B2C1 (+)    (A2C1)+ (B2C1)+ (C1L)+ (C1R)+
                                    (A2C3)- (B2C3)- (C3L)- (C3R)-
                                    (A2B2)± (A2L)± (A2R)± (B2L)± (B2R)±

Successive Stimulus Array Example

        Left          Right         Analysis
        A1B1C2 (+)    A1B1C2 (-)    (A1)± (A3)± (B1)± (B3)± (C2)± (L)± (R)±
        A1B3C2 (+)    A1B3C2 (-)    (A1B1)± (A1B3)± (A3B1)± (A3B3)±
        A3B1C2 (-)    A3B1C2 (+)    (A1C2)± (A3C2)± (B1C2)± (B3C2)±
        A3B3C2 (-)    A3B3C2 (+)    (B1L)± (B1R)± (B3L)± (B3R)± (C2L)± (C2R)±
                                    (A1L)+ (A1R)- (A3L)- (A3R)+
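The component analysis of Table II can be sketched mechanically. The Python below is a minimal illustration, not part of the original report: it assumes that every absolute element and every pairwise pattern on a given side shares that side's outcome, and that a component met with both outcomes across arrays counts as partially reinforced. The function names, the tuple encoding of the arrays, and the use of the plus-minus sign for partial reinforcement are all assumptions made for this sketch.

```python
from itertools import combinations

PARTIAL = "\u00b1"  # plus-minus sign: reinforced on some trials, not on others


def components(compound, side):
    """Return the absolute elements and pairwise patterns for one side of an array."""
    elements = list(compound) + [side]  # the position cue (L or R) is treated as an element
    absolute = set(elements)
    patterned = {first + second for first, second in combinations(elements, 2)}
    return absolute | patterned


def reinforcement_schedule(arrays):
    """Map each component to '+', '-', or the plus-minus sign across all arrays."""
    outcomes = {}
    for array in arrays:
        for side, compound, rewarded in array:
            for component in components(compound, side):
                outcomes.setdefault(component, set()).add(rewarded)
    schedule = {}
    for component, seen in outcomes.items():
        if seen == {True}:
            schedule[component] = "+"
        elif seen == {False}:
            schedule[component] = "-"
        else:
            schedule[component] = PARTIAL
    return schedule


# Group 1, simultaneous problem: A1 is positive wherever it appears.
GROUP_1 = [
    [("L", ("A1", "B2", "C2"), True), ("R", ("A3", "B2", "C2"), False)],
    [("L", ("A3", "B2", "C2"), False), ("R", ("A1", "B2", "C2"), True)],
]

# Successive example: both sides are identical and only the correct turn differs.
SUCCESSIVE = [
    [("L", ("A1", "B1", "C2"), True), ("R", ("A1", "B1", "C2"), False)],
    [("L", ("A1", "B3", "C2"), True), ("R", ("A1", "B3", "C2"), False)],
    [("L", ("A3", "B1", "C2"), False), ("R", ("A3", "B1", "C2"), True)],
    [("L", ("A3", "B3", "C2"), False), ("R", ("A3", "B3", "C2"), True)],
]

group_1 = reinforcement_schedule(GROUP_1)
successive = reinforcement_schedule(SUCCESSIVE)
differential = sorted(c for c, sign in successive.items() if sign != PARTIAL)
print(differential)
```

Under these assumptions the only differentially reinforced components in the successive example are the four brightness-by-position patterns (A1L, A1R, A3L, A3R), which is the class of cues the surrounding analysis treats as the sole basis for discrimination in the transfer problem.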

An example of a successive transfer task common to all three groups is also presented in Table II, along with an analysis in terms of absolute and patterned stimuli, again ignoring counterbalancing and higher-order patterns. Examination of Table II shows that the transfer task is arranged so that the initial transfer of preference relationships to all components is controlled. One class of components is controlled because identical elements appear on both the left and the right [e.g., (A1)± = (A1)±]. A second class is controlled approximately, or on the average, because they are treated nondifferentially with respect to reinforcement and nonreinforcement during the successive discrimination training [e.g., (B1L)± and (B1R)±]. As may be seen in Table II, the appropriate pairs of these cues within each stimulus array for each group are treated equivalently during the simultaneous training also. The third class of cues, (A1L)+ against (A1R)- and (A3L)- against (A3R)+, is controlled for initial preference and provides the only basis for discrimination in the successive problem. The present analysis stresses the differences among groups in their reinforcement history with respect to the third class of cues. Group 1 enters the transfer problem with (A1L)+, (A1R)+, (A3L)-, and (A3R)-; Groups 2 and 3 had experimenter-imposed partial reinforcement schedules on (A2L) and (A2R). If it is assumed that Group 1 can extinguish the approach response to the previously 100% reinforced (A1R) and acquire the approach response to the previously 0% reinforced (A3R) with fewer trials than Groups 2 and 3 can extinguish the generalized partially reinforced tendencies to (A1R) and (A3L), a superiority of Group 1 over the other two groups may be expected. This is the basic finding of Lawrence's experiment. In the present argument this assumption is presumed to hold although Lawrence changed from a discrimination box for the simultaneous problem to a T maze for the successive problem.
Thus the generalization for Groups 2 and 3 must occur over variations both in the A dimension and in the specific characteristics of L and R. Unfortunately, there is no direct evidence about the extent to which a partially reinforced stimulus may be altered and still retain greater resistance to extinction than a 100% reinforced stimulus which is unaltered during extinction. One experiment (Sperling and Birch, 1960) was oriented in the direction of this problem and will be reported subsequently. It appears from the present analysis, however, that the differences between Group 1 and Groups 2 and 3 should be greater if training were given with end points (rather than midvalues) during simultaneous training. The differences in analysis of the two approaches to the Lawrence data may be examined in more detail also. Lawrence predicts a facilitating effect when the same set of cues is relevant in the two problems, an interfering effect when initially relevant cues are made irrelevant in the transfer problem, and no effect when initially relevant cues are removed by presenting a midvalue of the dimension during transfer. The basis for these predictions is the perceptual factor of differential utility of previously acquired distinctiveness of cues. In contrast, the present analysis focuses on the resistance to extinction properties of the discriminanda in the transfer problem. This analysis suggests that the major sources of variance in the transfer data should be related to the interaction between the differentially reinforced stimuli during simultaneous and successive discrimination training. Table III presents the Lawrence data rearranged so that row effects correspond to simultaneous training conditions and column effects to successive testing conditions. The two entries in each cell show subgroups which are treated alike with respect to the row and column effects, but differently with respect to the irrelevant cues of the test problem. The hypothesis is that the variance between subgroups within cells is not appreciable. A summary of this analysis of variance is also shown in Table III. The hypothesis that the irrelevant cues of the test problem are ineffectual in determining the performance is supported by the nonsignificant value (F < 1) of the subgroups within cells source of variance. The analysis also indicates significant sources of variance attributable to the two main effects and the interaction. The importance of this set of results is that it emphasizes the relevant cues in transfer (over-all evaluation of easy-to-difficult is B-W, Wd-N, and R-S) and the past reinforcement history of these cues (note significant main effect and interaction involving simultaneous training conditions). At the same time the results show rather clearly that the effects of the irrelevant cues during the transfer task are not very great, contrary to the expectations of the Lawrence position. The Lawrence data can be used to pursue the implications of the present analysis one step further. Of special importance to this position is the outcome when cues irrelevant (midvalues) in the initial task are made relevant in transfer.
These patterned cues have a past history of partial reinforcement (generalized from the midvalues) and the effects of such a schedule might be expected to show in the transfer task by a greater difficulty of attaining discrimination between the two cues. The extent of difficulty in the transfer discrimination task should be a function of the form and extent of the partial reinforcement schedule during the initial task. In his initial simultaneous discrimination problems, Lawrence had one group learn a wide-narrow (Wd-N) problem, a second group learn a rough-smooth (R-S) problem, and a third group learn a black-white (B-W) problem. Examination of the acquisition curves shows Wd-N to be appreciably more difficult than R-S and B-W. R-S is only slightly more difficult than B-W. Since all animals were run the same number of trials no matter what problem they were operating on, those in the B-W and R-S groups were not receiving nonreinforcement during the later stages of their acquisition training. As a matter of fact, the number of errors from approximately trial 20 through trial 40, the end of the acquisition series, is negligible for these two groups. Thus the irrelevant cues were under a partial reinforcement schedule only for the first 20 of the 40 trials and then were, in effect, shifted to 100% reinforcement for the last 20 trials. It would not be surprising if these last 20 trials under 100% reinforcement were sufficient to dilute to a great extent the partial reinforcement effects during the first 20. In contrast, Wd-N reaches 100% performance only after 40 acquisition trials. Thus, during the complete training period errors were being made, and the treatment afforded the irrelevant cues in this group should approach more nearly the conditions of a partial reinforcement schedule. If this analysis is correct, Ss in the Wd-N training group should encounter somewhat more difficulty on the B-W and R-S transfer problems than the corresponding Ss from the B-W and R-S training groups. Specifically, the present analysis calls for more errors on the B-W transfer for Wd-N than for R-S, for more errors for Wd-N than for B-W on the R-S transfer, and for little difference between the B-W and R-S training groups on the Wd-N transfer. Table III may be used to evaluate these implications. The first two fare well since on the B-W transfer Wd-N averages 25.50 errors and R-S only 18.17, and on the R-S transfer a mean of 39.00 errors was made by Wd-N to 34.50 for B-W. At the same time, however, on the Wd-N transfer B-W averaged 28.83 errors and R-S, 23.50, a difference of the same order of magnitude as the comparisons involving Wd-N. The reanalysis, while encouraging, is not sufficient, and certainly requires additional empirical support.

TABLE III
LAWRENCE (1949) DATA REARRANGED TO SHOW TRANSFER ERRORS AS A FUNCTION OF TRAINING AND TRANSFER CONDITIONS

                                  Relevant in Transfer
Relevant in     B-W                 R-S                 Wd-N
Training        Irrel.   Errors     Irrel.   Errors     Irrel.   Errors     Total
B-W             R-S       9.67      B-W      30.00      B-W      29.33
                Wd-N     13.33      Wd-N     39.00      R-S      28.33
                Total    11.50      Total    34.50      Total    28.83      24.94
R-S             R-S      22.00      B-W      25.33      B-W      22.00
                Wd-N     14.33      Wd-N     27.33      R-S      25.00
                Total    18.17      Total    26.33      Total    23.50      22.67
Wd-N            R-S      26.67      B-W      37.00      B-W      19.67
                Wd-N     24.33      Wd-N     41.00      R-S      24.33
                Total    25.50      Total    39.00      Total    22.00      28.83
Total                    18.39               33.28               24.78

Source                     df      SS        MS        F        P
Training (Tn)               2     350.1     175.0     3.418    .05
Transfer (Tr)               2    2008.5    1004.2    19.613    .001
Tn x Tr                     4     887.9     222.0     4.336    .01
Subgroups within Cells      9     315.7      35.1    <1
Error                      36    1843.2      51.2
Total                      53    5405.4

Two major assumptions yielded the present analysis: (1) learning to patterned stimulus components will occur concurrently with learning to absolute components, and (2) partial reinforcement of stimulus components, whether absolute or patterned, leads to increased resistance to extinction of these components. A dissertation by Sperling (1960) was aimed directly at this latter question. By using the procedures of single-stimulus presentation rather than those of the choice situation, Sperling was able to evaluate the resistance to extinction of partially reinforced component stimuli (irrelevant by Lawrence's definition) more simply and straightforwardly.
One of the major advantages of using single-stimulus presentation is that, during acquisition, those stimuli deemed irrelevant and thus under a 50% partial reinforcement schedule can be maintained on such a schedule since the animal is required to respond to each discriminandum on each trial. Sperling's (1960, pp. 11-13) design and procedure were as follows:

The Ss were randomly assigned to one of six treatment groups. Five of these received 8 acquisition trials per day for 10 days; the sixth received 8 acquisition trials per day for 5 days. The five 80-trial groups were given the following treatments during acquisition: For Group (wnLD) (N = 9), the main experimental group, "light-bulb on" (L) was the relevant positive stimulus, "light-bulb off" (D) the relevant negative stimulus; each of these appeared 40 times. On 20 of the positive trials and 20 of the negative trials Wide stripes (W) were present and Narrow stripes (N) were present on the other one-half of the trials. Group (LD) (N = 8), the control group for assessing resistance to extinction in Group (wnLD), received 40 L positive trials and 40 D negative trials with no stripes present. Group (wn) (N = 8) was included to provide a comparison of the effects of 50 per cent partial reinforcement with and without the presence of a relevant stimulus set. These Ss were given 40 W trials and 40 N trials. Twenty of the W and 20 of the N trials were reinforced; the other one-half of the trials were not reinforced. This group differs from the usual partial reinforcement group used to examine resistance to extinction following partial as opposed to 100 per cent reinforcement in that two partially reinforced stimuli were presented during acquisition. Group (WN) (N = 8) was, therefore, included as a 100 per cent reinforcement reference group. These Ss received 40 trials in the presence of W, all reinforced, and 40 nonreinforced trials in the presence of N. Group (WNld,a) (N = 10) provided an analogous 100 per cent reinforcement reference group for Group (wnLD). These Ss received 40 reinforced trials in the presence of W and 40 nonreinforced trials in the presence of N. On 20 of the W trials and 20 of the N trials L was present; the D condition prevailed on the remaining trials. The 40-trial group (WNld,b) (N = 8) received the same treatment as (WNld,a) but with one-half the number of trials. This group was included as a control against possible overlearning in Group (WNld,a) which might affect the comparisons of this group with Group (wnLD). On the day following their last acquisition day, all Ss (except 4 in the (LD) group) received 16 extinction trials in the presence of W. In order to provide a partial check on the comparability of the stripe stimuli these 4 Ss were presented with a N stimulus during extinction. The experimental design is summarized in Table IV.

TABLE IV
DESIGN FROM SPERLING (1960)

                                              Group
                        (wnLD)   (WNld,a)   (WNld,b)   (wn)    (LD)    (WN)
                        N=9      N=10       N=8        N=8     N=8     N=8
Acquisition Stimuli and Reinforcement Schedules
  Light On              100%     50%        50%        -       100%    -
  Light Off             0%       50%        50%        -       0%      -
  Wide Stripe           50%      100%       100%       50%     -       100%
  Narrow Stripe         50%      0%         0%         50%     -       0%
Number of Acquisition
  Trials                80       80         40         80      80      80
Extinction Stimulus     W        W          W          W       W, N    W
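The reinforcement percentages in Table IV follow from the trial counts in Sperling's description, and the arithmetic can be sketched as below. This Python is illustrative only and not part of the original report: `DESIGN` and `percent_reinforced` are hypothetical names, the trial blocks are transcribed from the quoted procedure, and splitting the 40-trial group into blocks of 10 is an assumption based on the phrase "one-half the number of trials."

```python
from collections import defaultdict

# Acquisition trial blocks per group: (stimuli present, reinforced?, number of trials).
# L/D = light-bulb on/off, W/N = wide/narrow stripes.
# Assumption: group (WNld,b) uses blocks of 10, i.e. half of the (WNld,a) blocks.
DESIGN = {
    "(wnLD)": [(("L", "W"), True, 20), (("L", "N"), True, 20),
               (("D", "W"), False, 20), (("D", "N"), False, 20)],
    "(WNld,a)": [(("W", "L"), True, 20), (("W", "D"), True, 20),
                 (("N", "L"), False, 20), (("N", "D"), False, 20)],
    "(WNld,b)": [(("W", "L"), True, 10), (("W", "D"), True, 10),
                 (("N", "L"), False, 10), (("N", "D"), False, 10)],
    "(wn)": [(("W",), True, 20), (("W",), False, 20),
             (("N",), True, 20), (("N",), False, 20)],
    "(LD)": [(("L",), True, 40), (("D",), False, 40)],
    "(WN)": [(("W",), True, 40), (("N",), False, 40)],
}


def percent_reinforced(blocks):
    """Percentage of each stimulus's presentations that ended in reinforcement."""
    reinforced = defaultdict(int)
    presented = defaultdict(int)
    for stimuli, rewarded, n_trials in blocks:
        for stimulus in stimuli:
            presented[stimulus] += n_trials
            if rewarded:
                reinforced[stimulus] += n_trials
    return {s: 100 * reinforced[s] // presented[s] for s in presented}


for group, blocks in DESIGN.items():
    total_trials = sum(n_trials for _, _, n_trials in blocks)
    print(group, total_trials, percent_reinforced(blocks))
```

On this encoding the stripes in Group (wnLD) and the light conditions in Groups (WNld,a) and (WNld,b) each come out at 50 percent, which is the sense in which an experimenter-defined irrelevant stimulus is treated here as partially reinforced.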

Sperling (1960, pp. 39-42) discusses her results (see Table V) as follows. The purposes of this study were to determine whether response tendency had accrued to a nondifferentially reinforced irrelevant stimulus during initial training in a complex stimulus situation and to attempt to assess the strength of this response tendency. There were no differences among the groups in speed of responding on the last day of acquisition in the presence of W, the stimulus to be presented during extinction. This stimulus had been one of two nondifferentially reinforced stimuli during acquisition training for Group (wnLD) and had not appeared during acquisition training for Group (LD). Group (wnLD) ran faster than Group (LD) during extinction, indicating that response tendency may be considered to have accrued to this stimulus during acquisition training. Since the presence of response tendency had been demonstrated for extinction trials in the presence of W, an assessment of the strength of this tendency was made by comparing the mean decrease in log RS from acquisition level to the extinction day for Group (wnLD) with that of groups (wn) and (WN). In order for these comparisons to provide information on the strength of the response tendency in Group (wnLD) it was necessary to show first that the partial reinforcement effect could be observed in the presence of a stimulus that had been one of the two 50 percent partially reinforced stimuli for Group (wn) and had been the positive discriminanda during discrimination training for Group (WN). Group (wn) ran faster during extinction and showed less decrease in log RS than Group (WN), indicating that the partial reinforcement effect is demonstrable following two-stimulus training under a constant 15-minute intertrial interval. Groups (wnLD) and (wn) did not differ in amount of log RS decrease, and Group (wnLD) showed less decrease than Group (WN).
These two comparisons, considered together, indicate that the strength of the response tendency established under conditions of nondifferential reinforcement of an irrelevant stimulus is greater than that established under 100 percent reinforcement in a simple discrimination situation and may not differ from that established under 50 percent partial reinforcement. It thus seems that the reinforcement schedule associated with a nondifferentially reinforced irrelevant stimulus may be considered to be a 50 percent reinforcement schedule. From the four comparisons above, the primary determinant of extinction performance would seem to be the reinforcement conditions associated with W during initial training. If this were the case, then Group (WNld,a), which received 100 percent reinforcement in the presence of W and 0 percent in the presence of N, with L and D present as nondifferentially reinforced irrelevant stimuli, would be less resistant to extinction than Group (wnLD) and would not differ from Group (WN). However, Group (WNld,a) is more resistant to extinction than would be predicted on the basis of the analysis of the reinforcement history of absolute stimulus components during initial training. Groups (WNld,a) and (wnLD) do not differ either in terms of log RS or mean log RS decrease. Groups (WNld,a) and (WN) do not differ under the first analysis, but Group (WNld,a) apparently shows less decrease in log RS. These results, considered together, indicate that Group (WNld,a)'s extinction performance is more like that of Groups (wnLD) and (wn), the 50 percent partial reinforcement groups, than it is like that of Group (WN), the comparable 100 percent reinforcement group. Groups (WNld,a) and (wnLD) both had L and D conditions present during initial training; Groups (wn) and (WN) received initial training under conditions analogous to the D situation for the first two groups, i.e., the light bulb was unscrewed and in place. During extinction the door holding the light bulb was removed and the W stripes were extended to the end of the stimulus platform. The extinction situation, therefore, constituted a change for all groups, in which the level of illumination on the stimulus platform, corresponding to the "light-bulb off" condition, was one of the constants over the two situations. If the extinction situation is considered to be most closely analogous to the WD condition present during acquisition, it is possible to analyze the different reinforcement histories of Groups (wn), (wnLD), (WNld,a), and (WN) with respect to the (WD) pattern. Such an analysis shows that (WD) has been 50 percent reinforced for Group (wn), never reinforced for Group (wnLD), and always reinforced for Groups (WNld,a) and (WN).
This analysis would predict that Groups (WNld,a) and (WN) would not differ, but the data are contrary to this prediction. Similarly, an analysis in terms of a combination of stimulus absolutes and the (WD) pattern predicts that Groups (WNld,a) and (WN) will not differ in resistance to extinction and therefore cannot account for the obtained data.

The initial training conditions for Groups (wnLD), (wn), and (WNld,a) are alike in one respect not shared by Group (WN). All three groups have a 50 percent partial reinforcement schedule present during acquisition training, although this schedule is associated with W and N for Groups (wnLD) and (wn) and with L and D for Group (WNld,a). An explanation of the increased resistance to extinction for Group (WNld,a) over that expected may lie in some form of generalization or intrusion of the partial reinforcement history into
their extinction situation. It has been shown a number of times that resistance to extinction is greater following 50 percent partial reinforcement training than following 100 percent reinforcement training under conditions where separate groups have been used for this comparison. This experiment is, to the writer's knowledge, the first in which resistance to extinction has been examined following the concurrent administration of 50 percent and 100 percent reinforcement schedules. The results of this study indicate that resistance to extinction following nondifferential reinforcement of an irrelevant stimulus (a 50 percent schedule) may not differ from that observed under straight 50 percent partial reinforcement training, but resistance to extinction following 100 percent reinforcement of a relevant stimulus in the complex stimulus situation may be greater than that observed following 100 percent reinforcement under straight discrimination training.

TABLE V

MEAN LOG RECIPROCAL LATENCY FOR W TRIALS ON THE LAST ACQUISITION DAY AND OVER 15 EXTINCTION TRIALS WITH SINGLE CONTRASTS AMONG GROUP DIFFERENCES FROM ACQUISITION TO EXTINCTION WITH 95% CONFIDENCE INTERVALS
(Adapted from Sperling, 1960)

                      Mean                       Group Contrasts
Group         Acq.     Ext.    Diff.    (wnLD)  (WNld,a)  (WNld,b)   (WN)
(wn)          2.03     1.83     .20      .18      .27       .38      .72*
(wnLD)        1.74     1.36     .38               .09       .20      .54*
(WNld,a)      2.03     1.56     .47                         .11      .45*(t)
(WNld,b)      1.66     1.08     .58                                  .34
(WN)          2.04     1.12     .92
(LD)         (1.70)     .88

Range of Contrast       5       4       3       2
95% CI               +/-.55  +/-.52  +/-.47  +/-.39

*Difference significantly greater than zero at 95% level of confidence.

The results of this study are not compatible with Lawrence's assumption that nondifferentially reinforced irrelevant cues come to be ignored by the animal, since Sperling demonstrated that a tendency to respond to these stimuli increased during initial training on another set of relevant cues. This
response tendency has further been shown to be approximately equal in strength to that established under straight 50% partial reinforcement. The Sperling data thus give the first direct indication that, at least under procedures of single-stimulus presentation, stimuli defined as irrelevant, i.e., uncorrelated with reward, do in fact produce increased resistance to extinction as compared to stimuli which have undergone a 100% reinforcement schedule. This finding is of potentially far-reaching importance in the interpretation of complex discrimination learning experiments of the type Lawrence originally conducted. It is as yet undemonstrated that the same kinds of effects occur when the situation is one of choice during acquisition.

The second important aspect of the interpretation of the Lawrence results in terms of partial reinforcement effects of irrelevant cues involves the comparison of the resistance to extinction of generalized partial reinforcement to that of nongeneralized 100% reinforcement. This problem arose in the interpretation of Lawrence's results when it was necessary to assume that the hypothesized partial reinforcement effects acquired in the simultaneous discrimination box training would generalize to the T-maze used in the successive discrimination problem.

Sperling and Birch (1961) directed an experiment to this problem. The same apparatus was used as in Sperling (1960, 1962). Five groups of rats, called 100% SL (N = 8), 100% S (N = 9), 50% SL (N = 9), 50% S (N = 9), and 50% G (N = 9), were provided single-stimulus presentation training to the same goal platform, one used by Sperling previously, followed by extinction to systematically varied goal platform conditions. The training platform, 12 in. long and 4 in. wide, was covered with longitudinally laid alternating black and white stripes of mystic tape, 1 in. wide.
In addition, a swinging aluminum door near the terminal end of the goal platform was covered with black tape that concealed the mounting and held a flashlight bulb which was turned on to provide a light (L) as well as a striped (S) stimulus. S, under 23-hr food privation, pushed the door to obtain a small pellet of laboratory food from the foodcup.

Following the usual pretraining, the 100% SL and 100% S groups were given 24 acquisition trials under 100% reinforcement on the first day and 6 additional trials on the second day, followed by 20 extinction trials to either the SL or the S platform. The latter platform was the SL platform with the door and light arrangement removed. The 50% SL, 50% S, and 50% G groups were given the same acquisition conditions as the 100% groups except that they were under a 50% schedule of reinforcement. Extinction occurred either to SL or to S or to G, a mid-grey platform without door and light. Training and extinction occurred under relatively massed conditions of a 1-minute intertrial interval.

The design of the experiment permits the comparison of the extinction performance of groups trained under 100% or 50% reinforcement over several
extinction conditions. Table VI summarizes the response speed during successive thirds of extinction (trials 2-19) in terms of log reciprocal latency times 1000. The most pertinent aspects of the data for present purposes are: (1) the considerable differences in over-all running speed among the five groups are practically wholly attributable to the reinforcement schedules; (2) the suggestion of slower over-all running as a function of the nature of the extinction platform is not supported by statistical analysis either within the 50% or the 100% groups or for the two pooled; and (3) the appearance of a nonmonotone extinction function for both the 100% S and 50% S groups. See Table VII for a summary of the statistical analysis supporting these points.

TABLE VI

RESPONSE SPEED TO THE GOAL PLATFORM AS MEASURED BY MEAN LOG RECIPROCAL LATENCY X 1000 FOR SUCCESSIVE BLOCKS OF SIX TRIALS DURING EXTINCTION

                   Trial Block
Group           1       2       3
50% SL        2928    2853    2637
50% S         2786    2661    2698
50% G         2559    2532    2471
100% SL       2549    2096    2051
100% S        2410    2164    2432

Although the array of extinction performances by the five groups is consistent with informal expectations, apparently this experiment did not provide sufficiently great changes in the character of the goal platform from acquisition training to extinction testing (at least in terms of the size of the error variance) to obtain satisfactory levels of statistical significance. The present data tend to indicate, however, that a response acquired under partial reinforcement conditions may retain its high degree of resistance to extinction under marked changes in the stimulus situation. In fact, the difference between the 50% G and 100% SL groups favors the 50% G group at the 10% level of significance, as is also shown in Table VII.
More important in this analysis is the significant interaction observable in Table VI; the two groups show very comparable speeds on the first trial block of extinction, but the 50% G group reduces its speed much more slowly than does the 100% SL group. This latter occurrence is in accord with the assumption made in the previously proposed alternative account of Lawrence's data (p. 6).

TABLE VII

ANALYSIS OF VARIANCE SUMMARIES FOR EXTINCTION

Source                df          SS          MS         F        P

Over-all Groups Analysis
  Subjects            43    25227053
    Groups (G)         4     6138494     1534624     3.939     .01
    Error (b)         39    19088559      389562
  Within              88     8799581
    Trials (T)         2      954622      477311     5.586     .01
    T x G              8     1179916      147490     1.726
    Error (w)         78     6665043       85449
  Total              131    34026634

Reinforcement Schedule by Extinction Stimulus Analysis
  Subjects            34     8350073
    Schedule (Sc)      1     6140422     6140422    97.413     .001
    Stimulus (St)      1         936         936     <1
    Sc x St            1      254625      254625     4.039
    Error (b)         31     1954090       63035
  Within              70     7926771
    Trials (T)         2     1154973      577486     6.306     .005
    T x Sc             2      316009      158004     1.725
    T x St             2      650393      325196     3.551     .05
    T x St x Sc        2      127941       63970     <1
    Error (w)         62     5677455       91572
  Total              104    16276844

Between Stimulus Within 50% Reinforcement Condition Analysis
  Subjects            26    10370302
    Groups (G)         2     1149940      574970     1.497
    Error (b)         24     9220362      384182
  Within              54     4844520
    Trials (T)         2      327535      163768     1.819
    T x G              4      194490       48622     <1
    Error (w)         48     4322495       90052
  Total               80    15214822

50% G and 100% SL Comparison Analysis
  Subjects            16     6295474
    Groups (G)         1     1055944     1055944     3.023
    Error (b)         15     5239530      349302
  Within              34     3289632
    Trials (T)         2      759064      379532     5.589     .01
    T x G              2      493507      246754     3.634     .05
    Error (w)         30     2037061       67902
  Total               50     9585106

Briefly summarized, our examination of the resistance to extinction of partially reinforced, irrelevant cues as a possible explanatory factor in complex discrimination learning transfer of the Lawrence type suggests the following major points: (1) Lawrence's basic data, the data he used as the basis for introducing the theoretical concept of "acquired distinctiveness of cues," can be accounted for alternatively by defining the stimuli and responses of the situation differently and recognizing the possibly important role of resistance to extinction to stimulus components; (2) the alternative analysis leads to an examination of additional aspects of the Lawrence data and appears to deal with these aspects fairly adequately; (3) a dissertation by Sperling resulted in evidence supporting a fundamental assumption of the new analysis, namely, that partially reinforced irrelevant stimuli gain appreciable resistance to extinction rather than become ignored; and (4) a study by Sperling and Birch, while leaving something to be desired in its precision, suggests that another assumption of the new analysis, that partially reinforced stimuli may retain their property of strong resistance to extinction under extensive alteration in the stimulus situation, may be warranted.
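
As a reading aid, the arithmetic linking the columns of an analysis of variance summary such as Table VII can be sketched in a few lines of Python. The sketch assumes only the standard definitions (mean square = sum of squares / degrees of freedom; F = effect mean square / error mean square). The class and function names are illustrative assumptions and are not part of the original report. The figures are the Trials, T x G, and Error (w) rows of the over-all groups analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnovaRow:
    """One row of an analysis of variance summary (hypothetical helper)."""
    source: str
    df: int      # degrees of freedom
    ss: float    # sum of squares

    @property
    def ms(self) -> float:
        """Mean square = sum of squares / degrees of freedom."""
        return self.ss / self.df


def f_ratio(effect: AnovaRow, error: AnovaRow) -> float:
    """F = mean square of the effect / mean square of its error term."""
    return effect.ms / error.ms


# Within-subjects rows of the over-all groups analysis, as printed in Table VII.
trials = AnovaRow("Trials (T)", df=2, ss=954622)
t_by_g = AnovaRow("T x G", df=8, ss=1179916)
error_w = AnovaRow("Error (w)", df=78, ss=6665043)

# Both within-subjects effects are tested against the within-subjects error.
for row in (trials, t_by_g):
    print(f"{row.source}: MS = {row.ms:.0f}, F = {f_ratio(row, error_w):.3f}")
```

Run as written, the sketch should reproduce, to rounding, the mean squares and F ratios printed for these two rows. The same check can be applied to any other row of the table by substituting its degrees of freedom and sum of squares and pairing it with the appropriate error term, Error (b) for between-subjects effects and Error (w) for within-subjects effects.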

EXTENDED TRAINING IN REVERSAL LEARNING

The discrimination reversal problem, like the complex discrimination learning problem with relevant and irrelevant cues, has provided data interpreted in terms of an hypothesis of perceptual learning. The major impetus for the two experiments to be reported in this section comes from a study by Birch, Ison, and Sperling (1960). In introducing their experiment they review reversal learning experiments as follows (1960, pp. 36-37):

In simple form the procedure employed in the reversal problem involves two stages. In the initial stage, approaching responses to one of two discriminanda are reinforced while approaching responses to the other are not reinforced. In a subsequent stage, the relation of reinforcement to stimuli is reversed so that approaching the previously reinforced cue is now nonreinforced and vice versa. Reid (1953), Pubols (1956), and Capaldi and Stevenson (1957) report data which indicate that rats which have been provided extended overtraining on an initial Black-White simultaneous discrimination learning problem require fewer trials to criterion on the subsequent reversal problem than Ss which are reversed at criterion performance on the original problem.

An analysis of the reversal problem at a molar level points up two factors, either or both of which could operate to produce the obtained results. The difference between the criterion and overlearning Ss during reversal training could arise from differential rates of extinction to the previously positive cue and/or from differential rates of acquisition to the newly reinforced cue for the two groups. In order to investigate the importance of these two factors it is necessary that separate measures of response tendency strength be obtained for the two discriminanda.
Simultaneous discrimination learning is not well adapted for this purpose since a given choice response is determined by the combination and interaction of the approaching tendencies to a complex of stimuli including both the two discriminanda and the directional cues of the situation. In the present experiment single stimulus presentation in a straight runway apparatus with response speed as the measure of response tendency strength is employed to investigate reversal learning. Grice (1948), Logan (1952), and Birch (1955) have presented evidence on the relationship between the single stimulus response speed measure and the choice measure.

For the purposes of the present study the single stimulus presentation procedure has three clear advantages over a simultaneous choice situation as a method of investigation: (a) separate measures of response strength to the two discriminanda are obtained, (b) the possibility of differentially strong position habits is eliminated, and (c) the importance of complex receptor-orienting acts is minimized. The third advantage is of special interest since both Reid (1953) and Pubols (1956) suggest that their reversal results may be accounted for in terms of a response of discriminating, which is learned concurrently with the choice response. By assuming that the response of discriminating transfers to the reversal problem with a facilitating effect and that this response exhibits greater resistance to extinction for the overlearning Ss, the obtained results are derived. The response of discriminating is described by Reid as a response "to a set of stimuli within the total stimulus complex--in this case the black and white stimulus cards" (1953, p. 110), and is identified by Pubols as "orientation of the eyes toward differential cues" (1956, p. 248) in the case of visual discriminations. The importance of the response of discriminating as an explanatory device appears to be related to the insuring of advantageous reception of the two discriminanda presented simultaneously on each trial. The utility of such orienting acts should be minimized, if not eliminated, by the single stimulus procedure which presents S with only one of the two cues on each trial.

Birch et al. found that overlearning also facilitates reversal under the single stimulus methodology. Under a procedure of eleven trials a day, six to the reinforced white platform and five to the nonreinforced black, the criterion of solution for an individual rat was chosen to be that of no overlap in the two response-speed distributions for the last ten trials on a given day.
The reinforcement schedule was reversed for the criterion animals on the day following criterion performance; the overlearning animals were given twelve additional days of training on the original problem before reversal. Comparison of trials to criterion on the reversal problem for the two groups shows superiority for the overlearning group. Examination of the response speed curves suggests that the effect is due primarily to the differential rates of extinction of the response to the formerly rewarded cue for the two groups rather than to speed of acquisition of the response to the newly rewarded cue.

While the Birch, Ison, and Sperling experiment pointed clearly to an account of the reversal learning results independent of an appeal to perceptual learning factors, initial training exposed the two groups differentially to the choice-point stimuli. Differential learning of facilitating choice-point responses of an instrumental or perceptual nature was thus possible for the two groups. Ison and Birch (1961, p. 200) directed their attention to this possibility as follows.

The present experiment is a further investigation of the overlearning reversal phenomenon in which the learning of possibly facilitating choice-point responses during initial training was eliminated. To accomplish this both criterion and overlearning groups were given the appropriate rewarded and nonrewarded experiences by direct placement into detached black and white endboxes. In this initial training there was no exposure to the choice-point of the T maze. The endboxes were subsequently attached to the arms of a single unit, mid-grey, T maze and the two groups were given T maze training. In this training the reward contingency of the endboxes was reversed, the endbox in which S received food in placements was now empty and the endbox which was empty in placements was baited.

They summarize their findings as follows (1961, p. 202).

This experiment tested one explanation of the finding that a group given extended training on a discrimination problem learns its reverse in fewer trials than a group reversed at criterion. Two groups of rats received rewarded placements into a black endbox, one group receiving 50 placements and the other group 200. The endboxes were then attached to the arms of a grey T maze and Ss received T maze training. In the T maze the white endbox was empty and black endbox contained food. The groups given the 200 placements learned the T maze with fewer errors. Neither group had any experience with the choice-point stimuli prior to the test problem and could not have learned any differential responses to the choice-point during the placements. The explanation that extended training groups learn responses to the choice-point stimuli which subsequently facilitate the learning of the reversal problem is inadequate to account for these results.

Both the Ison and Birch and the Birch, Ison, and
Sperling experiments concentrated their controls on factors presumed to be important for perceptual learning effects and focused attention on the possibly important role of resistance to extinction in explaining reversal learning results. Such an emphasis leads readily to an interest in the phenomena and theories of extinction in their own right, particularly since the data suggest that resistance to extinction is not a simple increasing function of the number of acquisition trials but may actually decrease under conditions of extended training. A theoretical interpretation of extinction presented at the Nebraska Symposium on Motivation (Birch, 1961), and experiments by Ison (1961) and Stimmel and Birch (1962), manifested this interest.

Drawing primarily on the findings from reversal learning studies and paying particular attention to the nonmonotone nature of the function relating degree of training to resistance to extinction, Birch (1961, pp. 186-190) accounted for extinction in terms of the theoretical notions of frustration theory as follows.

The nonmonotone extinction relation I have referred to suggests a compatibility with the theory of frustration elaborated by Amsel (1958). Within this theoretical framework one would expect that acquisition training under reward conditions leads to the conditioning of a fractional anticipatory goal response (rg - sg) to the runway stimuli and that the (rg - sg) mechanism will exhibit motivational or nonassociative properties in relation to the instrumental responses, also elicited by the runway. The elicitation of (rg - sg) in conjunction with the experimental operation of withholding the reward is defined as the condition for the elicitation of a second motivational mechanism termed frustration and symbolized as (rf - sf). Amsel suggests that some level of (rg - sg) appreciably greater than zero must be reached before the nonreward operation is effective in producing the frustration. A further assumption that frustration may provide the basis for the elicitation of responses which compete with the approach response completes the groundwork for a two factor theory of extinction in the runway.

Instrumental approach behavior may be diminished during nonreward first because the incentive motivation support for the approach is reduced by experimental extinction of (rg - sg) and second because of the introduction of competing instrumental responses resulting from frustration. Thus, for relatively few acquisition trials, extinction of the approach response is primarily attributable to experimental extinction of the (rg - sg). As the number of acquisition trials increases, however, although a greater magnitude or strength of (rg - sg) is available for experimental extinction, (rf - sf) with its associated competing responses enters to diminish the responses in a second way.
In this view extinction of the instrumental approach response occurs on a motivational basis directly due to the experimental extinction of (rg - sg) and indirectly due to the introduction of competing responses through (rf - sf). The combination of these two extinction factors could result in a nonmonotone relation between the resistance to extinction of the instrumental responses and the number of acquisition trials. For example, one might assume that the strength of (rg - sg) increases with the number of rewarded trials and that the greater the strength of (rg - sg), the greater its resistance to experimental extinction. So long as (rf - sf) is absent or weak, as would be expected when the number of acquisition trials has been relatively small, the extinction of the instrumental response would be determined by the experimental extinction of (rg - sg) and would be an increasing function of the number of acquisition trials. However, for larger numbers of acquisition trials a strength of (rg - sg) would be reached such that nonreward would function not only to experimentally extinguish the (rg - sg) but also to elicit (rf - sf)
and its accompanying competing responses. The addition of these competing responses to the instrumental responses hierarchy could reverse the positive relation between number of acquisition trials and persistence of the running response consistent with the findings of Birch, Ison, and Sperling and of North and Stimmel (1960).

The results of the Ison and Birch study reported previously are also consistent with the incentive motivation-frustration interpretation. The 100 food-rewarded placements were apparently sufficient to produce a level of incentive motivation to the white endbox such that frustration with its accompanying competing responses occurred during the nonrewarded trials to that endbox during T maze training. These competing responses, generalized to the choice-point, would work to the advantage of the 200 placement group over the 50 placement group on the reversal problem.

It will be noted that the conditions for the introduction of (rf - sf) into the theoretical account of extinction relate to the strength or magnitude of the (rg - sg) present at the time of nonreward. This implies that variables other than number of rewarded trials which are also considered determiners of the level of (rg - sg) should show relationships with resistance to extinction. One such variable is the magnitude of the goal object. Spence (1956) and Logan, Beier and Ellis (1955), for example, have interpreted the results of magnitude of reward studies in terms of a monotone relation between the magnitude of reward and the strength of (rg - sg). A finding by Armus (1959) suggests that the theoretical emphasis on the importance of the strength or magnitude of (rg - sg) in determining resistance to extinction may not be misplaced. In his study Armus gave two groups of rats 75 rewarded trials in an enclosed runway followed by 150 extinction trials. One group was provided with ten 45 mg. food pellets on each acquisition trial and the second group with only one such pellet.
The ten pellet group ran faster in the middle section of the alley during acquisition and extinguished more quickly on this measure than the one pellet group. That is, the running time during extinction was greater for the ten pellet group.

If it were also to be assumed that the strength of (rg - sg) is influenced by the imposed time of deprivation appropriate to the goal object, decreased resistance to extinction under higher deprivation levels might be anticipated. As pointed out in detail by Estes (1958), studies involving shifts in time of deprivation conditions are extremely complicated in interpretation. The possibility of time of deprivation effects on (rg - sg), however, suggests additional complexities in using extinction data as evidence relevant to questions of the energizing and drive stimulus properties of time of deprivation manipulations.

Other recent studies by Miller (1960) and Miles (1956) suggest that the relation of extent of reward training to persistence of the response is more complex than would appear from the work I have discussed so far. Miller, using a straight alley, found that the resistance of the running habit to disruption by shocks at the goal was diminished by extended acquisition training. This finding parallels that for the extinction studies and points to an effect of theoretical concern more general than frustrative nonreward. Miles provided rats with either 0, 10, 20, 40, 80, or 160 food rewarded bar pressing responses prior to extinction. He found monotonically increasing resistance to extinction as measured by the median number of bar presses to a criterion of 4 minutes without a press. Similar monotonicity was obtained as a function of increased time of food deprivation during the extinction period. Whether or not this difference in results may be attributable to different parameter values arising from the runway and Skinner Box situations cannot be determined with our present information.

The experiments by Ison (1961) and by Stimmel and Birch (1962) followed from the previous studies and the theorizing. Ison, located at the State University of Iowa but supported by a Rackham Post-doctoral Fellowship from The University of Michigan, systematically investigated the relation between the number of reinforcements and experimental extinction. Ison reports:

Six groups of rats received either 10, 20, 40, 60, 80, or 100 rewarded trials followed by 80 nonrewarded extinction trials with a minimum intertrial interval of 18 minutes. Both running speed in extinction and the mean number of trials to extinction criteria of 40 and 120 seconds were negatively related to Ng (number of reinforcements).
In addition, the mean total number of trials on which an avoidance response occurred was positively related to Ng and the mean number of trials to the first avoidance response was negatively related to Ng.

Clearly, Ison was able to replicate and extend the basic finding of decreased resistance to extinction with increased numbers of training trials.

Stimmel and Birch (1962) attempted to assess the hypothesized negative or avoidance properties of frustration. Two groups of 13 rats each were made thirsty and provided differential numbers of water-reinforced trials in a straight runway. An extended training group was given 63 trials and a control group 18 trials prior to 6 trials of experimental extinction. Special procedures and apparatus were employed to measure possible avoidance effects.

The grey runway was composed of a start box, a mid-segment, and a two-chambered endbox. One chamber of the endbox, the goal chamber, contained the water cup and was attached directly to the runway. The right wall of the goal chamber was removable and was either solid so as to present the animal with the usual runway situation or contained a swinging rubber door which allowed entry into an empty chamber. All rats were given preliminary experience with both chambers of the endbox but only the goal box was available during acquisition and extinction trials. The basic question was whether the two groups, given differential reinforced training and thus presumably susceptible to different amounts of frustration when reward was removed, would manifest this difference by time spent in the goal chamber during the test period.

More specifically, the procedure was as follows. After gentling, all 26 animals were given a 40-minute test in the two-chambered endbox, in which they were placed directly into the goal chamber and permitted free access back and forth between the two chambers. A clock reading was made each time S left one chamber and entered the other, thus providing the time spent in each chamber on each occasion that S was in that chamber. On the day following this initial assessment of preference for the two chambers, 13 Ss (extended training) began 15 water-reinforced runway trials a day, in which the intertrial interval was approximately 5 minutes. The other 13 Ss (controls) were treated with a combination of placements into the empty goal chamber and water-reinforced runway trials.

One of the uncontrolled factors in extended training experiments in the past has been that the control groups, that is, the groups with fewer trials, have received not only fewer reinforcements and fewer excursions down the runway, but also have experienced the goal box fewer times.
It was decided to match this last factor for the two groups in the present experiment by placing the 18-trial group 15 times per day for three days into the empty goal chamber at the same time that the 63-trial group was receiving its rewarded runs. The time that the animal remained in the goal chamber was matched in the two groups. At the end of four acquisition days the extended training group had received 60 water-reinforced trials while the control group had received 45 placed trials plus 15 water-reinforced runway trials.

The procedures on the extinction and test day were common for the two groups. Each S was provided with 3 reinforced trials followed by 6 extinction trials in which the water cup was present but empty. On the sixth trial the wall containing the swinging rubber door was in place and S was permitted to spend 40 minutes in the two chambers. The time spent in each chamber on each occasion was recorded in this post-test in the same fashion as in the initial test.

Both start time and running time were recorded during acquisition and the six extinction trials, but only the start time seemed to give a sensitive measure. In summary these results were as follows.

During the first 15 runway acquisition trials for the two groups, the control group appeared to decrease its starting time more quickly than the extended training group. This difference, however, fails to reach acceptable levels of statistical significance when evaluated either by mean log starting speed or by the interaction of groups with trials. In addition, the two groups respond with comparable start latencies on the last four acquisition trials. Thus, under the conditions of the present experiment, the 45 placed trials provided the control group do not appear as a significant factor in acquisition. No differences in start speed over the five extinction trials (eliminating the first) were found for the two groups either. Tables VIII and IX summarize the data and analyses for acquisition and extinction.

TABLE VIII

MEAN LOG STARTING SPEED (TIMES 100) FOR THE EXTENDED TRAINING AND CONTROL GROUPS DURING FIRST FIFTEEN AND LAST THREE TRIALS OF ACQUISITION AND FOR SIX TRIALS OF EXTINCTION

                                  Group
               Trials      18-Trial   63-Trial
Acquisition      1            219        221
                 2            222        208
                 3            215        219
                 4            232        232
                 5            241        219
                 6            236        210
                 7            238        228
                 8            228        217
                 9            247        232
                10            237        227
                11            256        238
                12            254        234
                13            267        227
                14            252        237
                15            258        252
                16 (61)       229        232
                17 (62)       266        261
                18 (63)       273        276
Extinction       1            290        274
                 2            283        254
                 3            282        254
                 4            260        247
                 5            233        212
                 6            228        232

TABLE IX

ANALYSIS OF VARIANCE SUMMARIES OF ACQUISITION AND EXTINCTION DATA FOR 18- AND 63-TRIAL GROUPS

Source                df       SS       MS       F       P

First Fifteen Acquisition Trials
  Subjects            25    16.95
    Groups (G)         1     1.33     1.33     2.05
    Error (b)         24    15.62     0.65
  Within             364    47.04
    Trials (T)        14     6.96     0.50     4.55    .001
    T x G             14     1.73     0.12     1.09
    Error (w)        336    38.35     0.11

Last Four Acquisition Trials
  Subjects            25    11.36
    Groups (G)         1     0.02     0.02     <1
    Error (b)         24    11.34     0.47
  Within              78    26.48
    Trials (T)         3     4.00     1.33     4.29    .01
    T x G              3     0.17     0.06     <1
    Error (w)         72    22.31     0.31

Five Extinction Trials
  Subjects            25    15.05
    Groups (G)         1     0.96     0.96     1.62
    Error (b)         24    14.09     0.59
  Within             104    22.53
    Trials (T)         4     4.78     1.20     6.66    .001
    T x G              4     0.47     0.12     <1
    Error (w)         96    17.28     0.18

The emphasis of the experiment, however, is on the amount of time spent in the goal chamber and in the alternative chamber by the two groups following six extinction trials. To be sensitive to the possibly transient nature of the frustration effect, it was decided to examine the time spent in the goal box (GB) and the "not goal box" (not-GB) on each successive entry into that chamber. The time values were converted to logs to counteract the skewness of the distributions. Table X summarizes the mean log time spent for GB(1), the only measure for which all subjects are available, and for GB(1), not-GB(1), GB(2), and not-GB(2)
with the reduced number of cases. Three subjects in the 18-trial group failed to make two entries into GB. For GB(1) with all subjects the major finding is that the 18-trial group spends more time in GB than the 63-trial group under both the Pre-test and the Post-test, but the drop in time from the Pre-test to the Post-test is significantly greater for the 63-trial group (see Table X). Analysis of Post-test performance, as shown also in Table X, for the two groups in terms of GB - $\overline{GB}$ differences indicates small differences in the 18-trial group favoring GB and larger differences in the 63-trial group favoring $\overline{GB}$. In addition, the groups-by-entry interaction is nonsignificant, although the shift toward spending more time in GB is more noticeable for the 63-trial group. This latter shift is toward no difference and may signal the passing of the frustration effect.

TABLE X
MEAN LOG TIME SPENT VALUES (LOG TS X 100) AND ANALYSES FOR THE 18- AND 63-TRIAL GROUPS IN THE GOAL BOX AND THE ALTERNATIVE CHAMBER FOR THE FIRST TWO ENTRIES INTO EACH DURING THE PRE- AND POST-TESTS

             Group    N    GB(1)      N    GB(1)   $\overline{GB}$(1)   GB(2)   $\overline{GB}$(2)
Pre-test      18     13     204      10     178          184            181          194
              63     13     182      13     182          153            184          164
Post-test     18     13     195      10     167          162            157          148
              63     13     137      13     137          178            127          145

                            Source          df       SS       MS       F       P
GB(1)                       Subjects        25    127286
                              Groups (G)     1     20162    20162    4.52    .05
                              Error (b)     24    107124     4464
                            Within          26     58854
                              Tests (T)      1      1428     1428    <1
                              T x G          1     12246    12246    6.51    .05
                              Error (w)     24     45180     1882

GB - $\overline{GB}$        Subjects        22     45522
Differences                   Groups (G)     1     14971    14971   10.29    .005
                              Error (b)     21     30551     1455
                            Within          23     43373
                              Trials (T)     1      2440     2440    1.28
                              T x G          1      1026     1026    <1
                              Error (w)     21     39907     1900
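The F ratios in Table X follow from the tabled sums of squares and degrees of freedom by ordinary mean-square arithmetic, and recomputing them is a quick check on a transcribed summary table. The following is a minimal Python sketch of that check and is not part of the original analysis. The function names are illustrative only, and the Error (b) sum of squares for the difference scores is taken as 30551 on the assumption that Groups and Error (b) must add up to the Subjects total of 45522.

```python
def mean_square(ss, df):
    """Mean square: a sum of squares divided by its degrees of freedom."""
    return ss / df


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = MS(effect) / MS(error), each mean square computed from SS and df."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)


# GB(1), all subjects: Groups tested against the between-subjects Error (b).
f_groups_gb1 = f_ratio(20162, 1, 107124, 24)

# GB(1), all subjects: Tests x Groups interaction tested against Error (w).
f_interaction_gb1 = f_ratio(12246, 1, 45180, 24)

# GB minus not-GB differences: Groups tested against Error (b).
# ASSUMPTION: Error (b) SS = 30551, so that 14971 + 30551 = 45522 (Subjects SS).
f_groups_diff = f_ratio(14971, 1, 30551, 21)

print(round(f_groups_gb1, 2), round(f_interaction_gb1, 2), round(f_groups_diff, 2))
```

Rounded to two decimals, the three ratios agree with the tabled values of 4.52, 6.51, and 10.29.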

The results of this experiment support the hypothesis that a group given extended reinforced training will be more frustrated during extinction than a group given lesser acquisition training and that the results of this differential frustration can be demonstrated in terms of an avoidance of the goal chamber in which the frustration took place. In using the frustration concept to deal theoretically with the nonmonotone relation between the extent of training and resistance to extinction, it was hypothesized that the extended-training groups develop more frustration during extinction. This higher level of frustration was assumed to result in more or stronger competing or avoidance responses in the extended-training groups. The present experiment gives support to these hypotheses and to the underlying theoretical structure.

The last experiment to be reported in this section is one which originates from basic considerations of incentive motivation and its relation to performance. Incentive motivation and the effects of nonreward in conjunction with various magnitudes of incentive motivation have provided the guiding theoretical lines for our investigations of extinction in the context of extended training. In this connection it became important to examine some of the conditions crucial to the development and operation of incentive motivation. A part of this problem was studied by Ison (1960) in a thesis. Ison was concerned with a specification of the necessary and sufficient conditions for the development of incentive motivation and for its interaction with other response tendencies. Using the analysis of instrumental approach behavior proposed by Spence (1956), it can be predicted that rewarded endbox placement will increase the speed of the instrumental approach response if the stimuli present in the runway are similar to the stimuli present in the endbox.
An experiment by Stein (1957) performed in the context of this analysis failed to demonstrate any effect of endbox feedings on running speed. In the present experiment a number of conditions were explored in which endbox feedings might influence the speed of the instrumental response. Six groups of food-deprived rats were given nine nonfood-terminated runs in a straight-alley apparatus. For all six groups the starting box and runway were black; for three groups the endbox was black, and for three it was white. The Ss subsequently were given nine placements into the appropriate baited endbox. Two groups were placed over the foodcup at the end of the endbox, two groups were placed over the foodcup at the center of the endbox, and two groups were placed at the beginning of the endbox, from where they ran to the foodcup at the end. All Ss then received a test run through the apparatus. Four measures were recorded: starting-box speed, runway speed (recorded before the endbox cues were visible), endbox-entry speed, and endbox running speed. In starting speed there were no differences between trials 9 and 10. In the runway the group fed in the center of the black endbox demonstrated a significant increase on trial 10. In the endbox measures the groups fed in the center of the endbox and the groups which ran to the end of the endbox

increased in speed. The increase in running speed in the runway demonstrated by the group fed in the center of the black endbox, and the increases of the four groups in the endbox measures, were interpreted as consistent with the analysis proposed by Spence. This analysis states that the effect of the feedings was to condition the fractional goal response, rg, to the endbox stimuli, and that rg generalized from the black endbox to the black runway. The simultaneous occurrence of rg and the instrumental response, Ra, resulted in the increased speed of Ra. This experiment suggests that the previous failure to demonstrate the increase in running speed can be attributed to a failure to ensure that S received the appropriate endbox stimuli at the time of the consummatory response.

In summary, our investigations related to extended training in reversal learning have yielded the following: (1) an extension of the reversal learning finding to acquisition conditions of rewarded endbox placements, a procedure which controls further for possible contributions of perceptual learning factors to the result; (2) application of frustration theory to deal with the finding of nonmonotonicity between degree of rewarded training and resistance to extinction, with supporting experimental evidence for the phenomenon itself and for the hypothesized relation between degree of frustration and magnitude of competing response effects; and (3) a careful experimental investigation of some aspects of the development and interaction effects of incentive motivation which, while tending to support the Spencian theory, indicate clearly that most of the problem is yet to be dealt with.

DISCRIMINATION LEARNING EFFECTS ON RESISTANCE TO EXTINCTION

The present experiments continue an investigation of the role of extinction in discrimination reversal learning in rats using methods of single-stimulus presentation in the runway. Birch, Ison, and Sperling (1960) and Ison and Birch (1961) have focused on the relationship between extent of original training and resistance to extinction; Sperling (1960) has reported evidence favorable to the hypothesis that so-called irrelevant stimuli (Lawrence, 1949) are more appropriately considered partially reinforced stimuli which acquire increased resistance to extinction rather than stimuli which become ignored and thus ineffective.

With attention directed to the importance of the resistance to extinction of the response to the formerly positive cue in accounting for reversal learning results, it becomes of interest to examine the acquisition conditions for such cues during original training. The positive cue in simple discrimination learning is under an experimenter-presented 100% schedule of reinforcement, and as such might be expected to show the resistance to extinction appropriate to 100% reinforced stimuli. On the other hand, it can also be argued that the positive cue is under conditions theoretically similar to those prevailing during the administration of partial reinforcement schedules and should demonstrate the increased resistance to extinction associated with such schedules. Reasons for the latter possibility hinge basically on the assumption that certain aspects of the positive stimulus are similar or identical to aspects of the negative stimulus and therefore receive a sequence of both reinforced and nonreinforced trials, in effect a partial reinforcement schedule, during discrimination learning. A theoretically more detailed analysis has been presented by Amsel (1958).
Also, recently Jenkins (1961), using the key-peck response of pigeons, has reported a study directed to this same point and has summarized previous suggestions and work by Skinner (1938) and Wickens and Snide (1955).

The emphasis of this study is on the question of resistance to extinction attained by the positive stimulus under commonly employed laboratory conditions of discrimination learning. No attempt was made to maximize the chances of obtaining support for either possibility or even to sample procedures which might be influential in this respect.

Five groups of rats were used under procedures of single-stimulus presentation on an elevated runway in the first experiment. Two groups (W+ and W18), differing in total number of trials, received only reinforced trials to a white platform. Three additional groups received one-half reinforced and one-half nonreinforced trials. The white platform was present on reinforced trials for all groups, but one group (W±) received its nonreinforcement on white, a second group (WG) on grey, and a third group (WB) on black. Comparisons of the extinction performance of the WG and WB discrimination

groups to the W+ and W18 continuously reinforced and the W± partially reinforced reference groups are of primary interest.

Twelve rats in each of the five experimental groups were run on the same two-platform, elevated runway used throughout the project. In the experiment proper, after 14 days of handling and pre-training, during which the animal was maintained on a 22-hr food maintenance schedule, all five groups were provided reinforced training followed by 19 nonreinforced extinction trials to the white platform. During acquisition groups W+ and W18 were presented with the white platform only and were 100% reinforced for the approach response. The remaining groups received one-half reinforced and one-half nonreinforced trials, with reinforcement always to the white but nonreinforcement either to white (W±) or to grey (WG) or to black (WB). Four groups, W18 excepted, received 36 trials, 14 on each of days 1 and 2 and 8 on day 3 prior to the extinction series, also on day 3. For groups W±, WG, and WB the pattern of reinforced and nonreinforced trials was -+-+--+++-++-- on day 1, the opposite on day 2, and +-+--++- on day 3. Group W18 was included to control for the total number of reinforced trials to white in the noncontinuously reinforced groups, and was provided 10 trials on day 1 and 8 on day 2, the extinction day. All animals were run under 22-hr food deprivation with a small food pellet serving as reward. The intertrial interval was approximately 5 minutes and was maintained throughout the experiment.

Acquisition and extinction data for the five groups are presented in terms of log (1000/Latency) in Table XI. Inspection of the extinction data suggests that the difference in response speed for the two reference groups (W+ and W±) is appreciable and that the three experimental groups (W18, WB, and WG) are more similar to the W+ group than to the W± group.
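The trial counts in this procedure can be checked directly against the printed reinforcement patterns. The short Python sketch below is offered only as an illustration under stated assumptions: the helper names are hypothetical, "the opposite" pattern on day 2 is assumed to mean that reinforced and nonreinforced trials are interchanged, and the day 3 string is an assumed reading of an imperfectly printed pattern.

```python
def summarize_schedule(pattern):
    """Return (trials, reinforced, nonreinforced) for a string of '+' and '-'."""
    unexpected = set(pattern) - {"+", "-"}
    if unexpected:
        raise ValueError(f"unexpected symbols in schedule: {sorted(unexpected)}")
    return len(pattern), pattern.count("+"), pattern.count("-")


def opposite(pattern):
    """Interchange '+' and '-' (ASSUMED meaning of 'the opposite' pattern)."""
    return pattern.translate(str.maketrans("+-", "-+"))


DAY_1 = "-+-+--+++-++--"      # 14 trials, as printed for day 1
DAY_2 = opposite(DAY_1)       # ASSUMPTION: day 2 swaps each + and -
DAY_3 = "+-+--++-"            # ASSUMPTION: reading of the 8-trial day 3 pattern

totals = [summarize_schedule(day) for day in (DAY_1, DAY_2, DAY_3)]
total_trials = sum(t[0] for t in totals)
total_reinforced = sum(t[1] for t in totals)

print(total_trials, total_reinforced)
```

Under these readings the three days give 36 trials of which 18 are reinforced, which is the number of reinforced trials to white that Group W18 was designed to match.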
Table XII presents the mean response speed during extinction for each group, and simple analysis of variance yields an F = 7.156, which for df = 4/55 is significant beyond the .001 level. Tukey tests (Scheffé, 1959) using the 95% confidence interval on all possible pairs of comparisons among the five means show Group W± significantly different from each of the other four groups, but no comparison among the W+, W18, WB, and WG groups reaches significance.

Discrimination performance during acquisition for the WG and WB groups was also evaluated. Analyses of over-all differences in log response speed to the positive and negative stimuli show that both group WB and group WG respond significantly faster to the positive stimulus (t = 5.516, df = 11, P < .001 for WB and t = 2.648, df = 11, P < .05 for WG). The mean difference is greater for the WB group than for the WG group (F = 4.119, df = 1/22, P < .10), tending to indicate greater difficulty for the WG problem.

Table XI also suggests that the five groups did not enter extinction with a common terminal acquisition speed, making interpretation of the comparisons among the groups during extinction difficult. Table XII contains the mean response speed for the last four corresponding presentations of the W stimulus for each group. Simple analysis of variance of these data shows significant

TABLE XI
MEAN LOG (1000/LATENCY) X 100 FOR THE FIVE GROUPS OF EXPERIMENT I AT THE END OF ACQUISITION AND DURING EXTINCTION

                          Group
        Trial       W±     WG     WB    W18     W+
Acq.    (-) 31     303    158    170    242    316
        (-) 32     324    119    163    252    314
            33     310    208    262    244    318
            34     306    231    278    262    322
        (-) 35     322    171    175    250    322
            36     312    226    288    265    326
Ext.         1     312    243    296    260    325
             2     320    235    293    221    323
             3     289    201    277    236    286
             4     320    197    235    220    256
             5     307    176    221    214    228
             6     300    186    214    200    190
             7     298    174    189    194    183
             8     283    160    192    169    169
             9     280    177    222    185    170
            10     290    188    188    180    198
            11     281    172    208    141    156
            12     287    142    206    151    165
            13     256    150    191    147    160
            14     221    166    170    152    206
            15     263    167    163    165    164
            16     241    160    186    155    156
            17     242    153    188    138    146
            18     223    168    183    153    180
            19     235    155    179    171    156

TABLE XII
MEANS AND STANDARD DEVIATIONS OF MEASURES APPROPRIATE TO EVALUATION OF EXTINCTION PERFORMANCE FOR APPROACH SPEED

                                       Group
                              W±       WG       WB      W18       W+
Ext. Speed          M        274      174      206      177      194
                    S.D.      63       50       47       44       45
Terminal            M        310      227      281      258      323
Acq. Speed          S.D.      47       78       46       69       44
C                   M      -0.11     0.37     0.57     0.48   [illegible]
                    S.D.    0.61     0.72     0.78     0.73   [illegible]
B                   M      -5.32    -3.12    -5.14    -4.84    -6.98
                    S.D.    4.15     3.74     2.75     4.94     3.30
A                   M        277      164      190      164   [illegible]
                    S.D.      71       57       60       48   [illegible]
(2.432)A - B²       M                 375      429      352      345
                    S.D.              132      141       89      108
(1.320)A - B²       M        320      192      217      169
                    S.D.     114       72       79       50

differences among the groups (F = 4.755, df = 4/55, P < .005), and Tukey tests using the 95% confidence interval show significant differences between WG and W± and between WG and W+. In addition, the correlation between terminal acquisition response speed and mean extinction response speed is r = .67 for all groups combined and r = .92, .78, .65, .55, and .74 for groups W±, W+, W18, WB, and WG, respectively.

To facilitate a decision concerning the similarity of the extinction performance of the experimental groups to the two reference groups, unconfounded by differences in terminal acquisition level, the following procedures were carried out. For purposes of analysis it was assumed that the course of extinction for all 60 Ss, and thus for the five groups, was the same (i.e., that the same curve would fit the data for all five groups) and that differences in terminal acquisition level reflected simply differences in the point at which the common extinction curve was entered. Specifically, it was assumed that a curve of the form $Y = e^{a + bX + cX^2}$, or $\log Y = a + bX + cX^2$, where Y = response speed and X = number of extinction trials, is adequate to represent the extinction function. This curve was fitted to the data for each of the 60 Ss by the method of orthogonal polynomials (Anderson and Houseman, 1942).

Under the assumption that different initial response speeds may reflect different initial locations along the X axis, it would be desirable to fit the curve $\log Y_i = a_i + b_i X_i + c_i X_i^2$, where $X_i$ is chosen appropriate to each individual. Because a uniform set of X values is used for all individuals, however, the function $\log Y_i = A_i + B_i X + C_i X^2$ is actually fit. Letting $X = X_i + K_i$, where $K_i$ is the unknown amount by which $X_i$ has been shifted to carry out the curve fitting with the uniform set of X values,

$$\log Y_i = A_i + B_i (X_i + K_i) + C_i (X_i + K_i)^2 = (A_i + B_i K_i + C_i K_i^2) + (B_i + 2 C_i K_i) X_i + C_i X_i^2 .$$

Thus

$$a_i = A_i + B_i K_i + C_i K_i^2, \qquad b_i = B_i + 2 C_i K_i, \qquad c_i = C_i,$$

where $A_i$, $B_i$, and $C_i$ have numerical values from the curve-fitting process and $a_i$, $b_i$, $c_i$, and $K_i$ are unknown. It will be noted that $c_i = C_i$, so questions concerning the C parameter may be evaluated directly from the obtained C values. In addition, by eliminating $K_i$ in the two remaining equations, the relationship

$$4 C_i a_i - b_i^2 = 4 C_i A_i - B_i^2$$

is obtained, permitting tests of certain hypotheses concerning the a and b parameters. In the present experiment it is of interest to compare the five groups with respect to mean C values and mean $(4CA - B^2)$ values, since the null hypothesis proposes no differences among groups for these quantities.

Summarized, the computational procedure was as follows. The 18 trials of extinction data for each individual, using log response speeds as the measure, were fit by the function $Y = A_0 + A_1 \xi'_1 + A_2 \xi'_2$ using the Fisher and Yates (1938) notation, which yields numerical values for $A_0$, $A_1$, and $A_2$ appropriate to values of X = 1, 2, ..., 18. Rewriting the function for each individual in terms of $(x - \bar{x}) = X$, a function of the form $Y = A + BX + CX^2$ was obtained with numerical coefficients. Table XII summarizes the means and standard deviations of these parameter values for the five groups.
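The claim that the quantity 4CA - B² does not depend on where the origin of the trial axis is placed can be illustrated numerically. The Python sketch below is not the orthogonal-polynomial procedure used in the report: it substitutes an ordinary least-squares quadratic fit solved from the normal equations, the function names are illustrative, and the 18 "log speed" values and the shift K = 4 are invented solely for the demonstration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = A + B*x + C*x**2; returns (A, B, C)."""
    # Power sums for the 3 x 3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    rows = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [u - factor * v for u, v in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


def shifted_coefficients(A, B, C, K):
    """Coefficients (a, b, c) of the same curve written in X_i = X - K."""
    return A + B * K + C * K * K, B + 2 * C * K, C


def invariant(a, b, c):
    """The shift-invariant combination 4*c*a - b**2."""
    return 4 * c * a - b * b


# HYPOTHETICAL data: 18 extinction trials with made-up log response speeds.
trials = list(range(1, 19))
log_speed = [2.9 - 0.04 * x + 0.0008 * x * x + 0.03 * (-1) ** x for x in trials]

K = 4.0  # hypothetical shift of the trial axis
A, B, C = fit_quadratic(trials, log_speed)                   # uniform X values
a, b, c = fit_quadratic([x - K for x in trials], log_speed)  # shifted axis X_i

print(invariant(A, B, C), invariant(a, b, c))
```

The two printed values should agree to within rounding error, and the coefficients from the shifted fit should match `shifted_coefficients(A, B, C, K)`, which is the relation between (a, b, c) and (A, B, C) given above.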

An indication of the adequacy with which the function $\log Y_i = A_i + B_i X + C_i X^2$ fits the data for the 60 Ss may be obtained from the ratio of the SS for the linear and quadratic components to the total SS. These values range from a low of .003 to a high of .787 with a median of .364. Medians of .422, .365, .348, .326, and .414 were found for groups W±, WG, WB, W18, and W+, respectively.

Simple analysis of variance comparing the five groups with respect to mean C value yields an F = 3.752, which for df = 4/55 is significant beyond the .01 level. Tukey tests using the 95% confidence interval on all possible pairs of comparisons among groups show only the difference between the W± and W+ groups to be significant. Thus, the analysis of the C parameter measure bears out the difference between the two reference groups, but does not provide the basis for a decision concerning the three experimental groups.

The results of the C analysis suggest that the experimental groups be evaluated with respect to the W+ and W± groups by two further analyses using the 4CA - B² measure. This measure, however, is sensitive to the value of C used. Table XII therefore gives two sets of means and standard deviations for (4CA - B²) using, first, mean C = 0.608 computed from groups W+, W18, WB, and WG and, second, mean C = 0.330 computed from groups W±, W18, WB, and WG. Simple analysis of variance of the first measure indicates nonsignificance among groups (F = 1.102, df = 3/44, P > .20). For the second measure, significance among groups was obtained (F = 7.326, df = 3/44, P < .001), and subsequent Tukey tests using the 95% confidence interval showed all three experimental groups to differ significantly from W±, but not to differ among themselves. Thus the further analysis designed to eliminate the confounding due to different terminal response speeds among groups produced the same result as the
original simple analysis; namely, group W± differs from group W+, and groups W18, WB, and WG are more similar to group W+ than to W±.

Experiment II, employing only WB and WG groups, was carried out to obtain further data on the terminal acquisition response speed and its relation to extinction for these two conditions. A difficulty with the results of Experiment I was that the WG group, and to a lesser extent the WB group, responded more slowly during acquisition than the control groups. The second experiment was run to gain additional evidence on the stability of this finding and to compare the WB and WG groups in resistance to extinction under slightly altered conditions.

One suggestion* to account for the slower terminal acquisition speeds in the discrimination groups (particularly WG) notes that Ss in these groups are exposed to two stimuli over the course of training, whereas Ss in the other groups are exposed to only one. Since the latency of response is measured from the time the goal platform stimulus is presented until S leaves the approach platform, it might be that Ss respond more slowly in the discrimination groups because time is taken for "stimulus processing" (e.g., in making observing responses). If this were so, it might also be expected that more time would be taken by the WG than the WB group, compatible with the findings of Experiment I.

*The author appreciates the suggestions and discussion of G. Blum and W. McKeachie in this connection.

In Experiment II a longer goal platform was used, and the running speed of S while on the goal platform as well as S's response speed in leaving the approach platform was recorded. The latter measure was used as a replication condition for Experiment I, while the new measure was assumed to be free of the "stimulus processing" time referred to. The acquisition and extinction performance of two groups, WG and WB, were compared using both measures.

The Ss were 27 experimentally naive, male pigmented rats from the colony maintained by the University's Department of Psychology. Ss were approximately 90 days old at the beginning of training. Two Ss were discarded, one from the WB group for refusal to run and one from the WG group because of apparatus failure.

The elevated runway in Experiment I was adapted for Experiment II by the addition of two photoelectric cell units which were mounted externally to the apparatus housing and which controlled stopping of two additional Standard Electric clocks. This device permitted the measurement of S's latency of response to the goal platform from the approach platform and also a measure of S's speed of running once on the goal platform. To obtain the goal platform measure it was necessary to replace the goal platform of Experiment I with a longer (19-in.) goal platform in Experiment II.
The procedures in Experiment I were carried out in Experiment II with one major exception: examination of the acquisition data of the discrimination groups in Experiment I suggested that the day 2 results might not be a simple continuation of day 1. Therefore, in Experiment II both groups were provided with 26 trials on day 1 (+-++--+---++-+-+-+++--++--) and 8 trials (+-++--+-) on day 2 prior to the 19 extinction trials.

Table XIII summarizes the results for both time measures in terms of log (1000/Latency) x 100 on the two groups. The same analyses were carried out on these data as on those of Experiment I. Briefly, the results of the analyses are as follows: (1) Over-all differences between the speed to the positive and negative stimuli favor the positive stimulus. (2) The mean difference, while greater for WB than for WG, is not statistically significant. (3) Differences in terminal acquisition speed to the positive stimulus for the two groups, as suggested in Experiment I, were not found for the approach measure but reach the 5% level for the goal platform measure. (4) No differences between the two groups during extinction approached significance with either measure upon

analysis of log extinction speed, the C parameter, or 4CA - B². Out of the six analyses the difference approaching significance most closely was on the 4CA - B² comparison using the approach platform measure, but this reached only approximately the 20% level.

TABLE XIII
MEANS AND STANDARD DEVIATIONS OF MEASURES APPROPRIATE TO EVALUATION OF EXTINCTION PERFORMANCE FOR APPROACH AND GOAL PLATFORM SPEEDS

                                  Approach          Goal Platform
                       Group     WG      WB         WG      WB
Extinction Speed       M        206     188        303     310
                       S.D.    38.0    60.3       26.5    17.7
Terminal Acq. Speed    M        281     270        322     333
                       S.D.    44.8    68.6       13.8    10.4
C                      M       0.58    0.71       0.02    0.04
                       S.D.    0.64    0.42       0.50    0.32
B                      M      -4.08   -6.71      -1.51   -0.54
                       S.D.    5.00    4.07       2.04    1.76
A                      M        190     168        302     308
                       S.D.    44.8    64.1       35.1    22.7
(2.764)A - B²          M        484     404
                       S.D.      99     184
(0.126)A - B²          M                            32      36
                       S.D.                         11       6

The results of Experiment I suggested that the WB and WG groups approach the white platform at different speeds. No support for this suggestion was found in Experiment II. A comparison of the performance of the WG and WB groups in the two experiments indicates that the WG group in the first experiment is the anomalous group. These differences in results may be attributable to the fact that the source of subjects was different in the two experiments (Bloomington Supply House in Experiment I and the departmental colony in Experiment II), to the change in acquisition procedure so that only one day of training was given to the animals in Experiment II, to the change in the length of the goal platform, or to factors simply not under E's control.

Neither Experiment I nor II gave evidence that, for the stimuli used, a discrimination problem yields extinction results intermediate between the 100% and the 50% schedules. It is important to note that this does not mean that the phenomenon cannot be found. Perhaps most important in obtaining the effect is the degree of similarity between the positive and negative stimuli. It appears possible to choose a negative stimulus so highly similar to the positive that an extremely large number of acquisition trials would be required to reach a criterion of discrimination. Under these conditions, it would not be surprising if appreciable resistance to extinction were obtained, particularly if testing were carried out before the criterion of discrimination was reached.

The importance of the finding of the present experiments lies in their purpose. The black, white, and grey stimulus platforms were chosen because these are stimuli commonly used in discrimination learning problems and particularly in reversal learning problems. The failure to find increased resistance to extinction to the positive stimulus for the stimuli used indicates that the proposed factor is not important in a theoretical account of the differences between extended training and criterion groups on reversal learning problems. It seems that most of the important conditions permitting comparison between the present experiments and the usual reversal learning experiments were met. That is, for the training given, the WG problem tended to be more difficult than the WB in both experiments, and yet both groups reached acceptable criterion levels of discrimination performance before entering into the extinction series. From the results of Experiment I it seems clear that the W+, W18, WB, and WG groups belong together according to their extinction performance, and that these four groups are different from the W± group, both in over-all speed during extinction and also in the more elaborate analyses carried out, which control for individual differences in the initial speed of response during extinction.

The failure to get the hypothesized results, which are derivable from frustration theory, calls for further comment. Jenkins (1961) obtained results anticipated by the theory using a key-peck response with pigeons.
An examination of Jenkins' study points up four procedural differences from the present experiment, any one of which might be sufficient to produce the differences in results. First, Jenkins used pigeons and the key-peck response, whereas we used rats and an elevated runway. Second, Jenkins used stimuli which were horizontal or vertical dark lines on a back-lighted key. This is apparently a very difficult discrimination problem for the pigeon, since Jenkins reports: "The Ss in group DISC were trained to the following criterion: Beginning with a session in which the probability of response to S+ was equal to or greater than .9, while the probability of response to S- was equal to or less than .1, 10 sessions were required for which the average probabilities of response to the stimuli were within these limits. Two Ss failed to reach this criterion after extended training and were replaced." In another place Jenkins notes that 17 to 25 training sessions were required. In addition, although the Ss in Jenkins' different groups received different numbers of trials within a training session, a minimum of 40 trials per session was given. This procedure contrasts with ours, in which rats were able to discriminate between the positive and negative stimuli in a statistically satisfactory fashion after only 36 trials. The third aspect of Jenkins' experiment in

contrast to ours is that he provided five to seven days of pre-training during which the pigeon was 100% reinforced for key-pecking the positive stimulus. That is, his subjects were not placed in discrimination training at the beginning of the experiment but only after a reliable response, with a short latency, was occurring to the positive stimulus. It might be expected that this procedure would provide a higher level of generalized anticipatory goal reactions to the negative stimulus early in training, so that nonreinforcement would produce a larger frustration reaction which could then generalize more strongly to the positive stimulus.

A fourth difference in the experiments lies in the extinction procedures. The only change Jenkins made in going from acquisition to extinction sessions was to remove the reinforcement. His discrimination group, therefore, continued to receive the negative stimulus as during acquisition. This means that the extinction conditions were different for his three groups: his intermittent reinforcement group, his 100% reinforcement group, and his discrimination group. In contrast, we presented only the positive stimulus without reinforcement for all groups. It is possible that the presentation of a stimulus with an extended history of nonreinforcement along with a newly nonreinforced stimulus could extend the response to the stimulus. This might be because the change from the acquisition to the extinction condition is less, or perhaps because of some more complicated motivational factor involving a contrast between reward and nonreward conditions.

Future systematic variation of the last three factors would be of considerable interest. It might well be expected to result in knowledge of the conditions under which the effect can and cannot be obtained, and the extent to which it is obtained.
However, from our own orientation toward factors affecting the ease of reversal learning, it appears that the a priori attractive factor of a partial reinforcement effect arising within the context of discrimination learning is not important.

REFERENCES

Amsel, A. The role of frustrative nonreward in noncontinuous reward situations. Psychol. Bull., 1958, 55, 102-119.

Anderson, R. L., and Houseman, E. E. Tables of orthogonal polynomial values extended to N = 104. Iowa Agr. Exp. Sta., Res. Bull. 297, 1942, Ames, Iowa.

Armus, H. L. Effect of magnitude of reinforcement on acquisition and extinction of a running response. J. exp. Psychol., 1959, 58, 61-63.

Birch, D. Discrimination learning as a function of the ratio of nonreinforced to reinforced trials. J. comp. Physiol. Psychol., 1955, 48, 371-374.

Birch, D. A motivational interpretation of extinction. In M. R. Jones (Ed.), Nebraska Symposium on motivation, 1961. Lincoln: Univ. of Nebraska Press, 1961. Pp. 179-197.

Birch, D., Ison, J. R., and Sperling, S. E. Reversal learning under single stimulus presentation. J. exp. Psychol., 1960, 60, 36-40.

Birch, D., and Vandenberg, V. The necessary conditions for cue-position patterning. J. exp. Psychol., 1955, 50, 391-396.

Capaldi, E. J., and Stevenson, H. W. Response reversal following different amounts of training. J. comp. Physiol. Psychol., 1957, 50, 195-198.

Estes, W. K. Stimulus-response theory of drive. In M. R. Jones (Ed.), Nebraska Symposium on motivation, 1958. Lincoln: Univ. of Nebraska Press, 1958. Pp. 35-69.

Fisher, R. A., and Yates, F. Statistical tables for biological, agricultural and medical research. Edinburgh: Oliver and Boyd, 1938.

Grice, G. R. The acquisition of a visual discrimination habit following response to a single stimulus. J. exp. Psychol., 1948, 38, 633-642.

Ison, J. R. Changes in instrumental response speed following rewarded endbox placement. Unpublished doctor's dissertation, Univ. of Mich., 1960.

Ison, J. R. Experimental extinction as a function of number of reinforcements. Submitted to J. exp. Psychol., 1961.

Ison, J. R., and Birch, D. T-maze reversal following differential endbox placement. J. exp. Psychol., 1961, 62, 200-202.

Jenkins, H. M. The effect of discrimination training on extinction. J. exp. Psychol., 1961, 61, 111-121.

Lawrence, D. H. Acquired distinctiveness of cues: I. Transfer between discriminations on the basis of familiarity with the stimulus. J. exp. Psychol., 1949, 39, 770-784.

Lawrence, D. H. Acquired distinctiveness of cues: II. Selective association in a constant stimulus situation. J. exp. Psychol., 1950, 40, 175-189.

Logan, F. A. Three estimates of differential excitatory tendency. Psychol. Rev., 1952, 59, 300-307.

Logan, F. A., Beier, E. M., and Ellis, R. A. Effect of varied reinforcement on speed of locomotion. J. exp. Psychol., 1955, 49, 260-266.

Miles, R. C. The relative effectiveness of secondary reinforcers throughout deprivation and habit strength parameters. J. comp. Physiol. Psychol., 1956, 49, 126-130.

Miller, N. E. Learning resistance to pain and fear: Effects of overlearning, exposure, and rewarded exposure in context. J. exp. Psychol., 1960, 60, 137-145.

Nissen, H. W. Description of the learned response in discrimination behavior. Psychol. Rev., 1950, 57, 121-131.

North, A. S., and Stimmel, D. T. Extinction of an instrumental response following a large number of reinforcements. Psychol. Rep., 1960, 6, 227-234.

Pubols, B. H. The facilitation of visual and spatial discrimination by overlearning. J. comp. Physiol. Psychol., 1956, 49, 243-248.

Reid, L. S. The development of noncontinuity behavior through continuity learning. J. exp. Psychol., 1953, 46, 107-112.

Scheffé, H. The analysis of variance. New York: John Wiley and Sons, Inc., 1959.

Skinner, B. F. The behavior of organisms. New York: Appleton-Century-Crofts, 1938.

Spence, K. W. The nature of response in discrimination learning. Psychol. Rev., 1952, 59, 89-93.

Spence, K. W. Behavior theory and conditioning. New Haven: Yale Univ. Press, 1956.

Sperling, S. E. Resistance to extinction following nondifferential reinforcement of irrelevant stimuli. Doctor's dissertation, Univ. of Mich., 1960.

Sperling, S. E. Extinction effects following nondifferential reinforcement of an irrelevant stimulus. J. exp. Psychol., 1962, 63, in press.

Sperling, S. E., and Birch, D. Extinction effects under changed stimulus conditions following 50% and 100% reinforcement. Paper read at MPA, 1960.

Stein, L. The classical conditioning of the consummatory response as a determinant of instrumental performance. J. comp. Physiol. Psychol., 1957, 50, 269-278.

Stimmel, D. T., and Birch, D. Frustration effects in early extinction following different numbers of acquisition trials. In preparation.

Wickens, D. D., and Snide, J. D. The influence of nonreinforcement of a component of a complex stimulus on resistance to extinction of the complex itself. J. exp. Psychol., 1955, 49, 257-259.
