PHILOSOPHER-SCIENTIST: Getting back to the example then. If the rat was first conditioned to press the lever by continuous reinforcement, what would be the next appropriate step in the shaping process if the final goal was a fixed-ratio (1000) schedule, where lever pressing was only reinforced once for each one thousand lever presses?
TINNY: I think you could safely move from continuous reinforcement to a fixed-ratio (5) schedule. If every fifth response was now reinforced, I doubt that extinction would take place during those first four responses after the schedule change took effect. Then after the rat pressed the lever the fifth time, reinforcement would once again occur. Any holistic analysis that might be coming to the conclusion, during those first four non-reinforced responses, that lever pressing was no longer followed by reinforcement would lose emphasis as reinforcement once again followed a lever press. Soon the fixed-ratio (5) schedule would be well established and accepted just as well as the continuous reinforcement schedule had been originally accepted.
PHILOSOPHER-SCIENTIST: Wouldn't the rat mind having to press the lever five times instead of one time to receive each food pellet?
TINNY: I doubt the rat would think that way. It would seem natural to an animal to sometimes get food for little effort and at other times to expend a great deal of energy getting food.
PHILOSOPHER-SCIENTIST: I think many people would resent having the amount of work increased while the reward stayed the same.
TINNY: I'm sure you're right.
PHILOSOPHER-SCIENTIST: Which attitude is best, the animal or the human?
TINNY: Harmony is a quality of animal existence; discord is seldom felt. Disharmony is presently a quality of human existence; discord is often felt.
PHILOSOPHER-SCIENTIST: Once the rat had been effectively conditioned to press the lever five times for each reinforcement, what is the next step as the fixed-ratio (1000) goal is sought?
TINNY: The step could be bigger this time. The schedule could probably be extended to a fixed-ratio (20). This would mean only every twentieth response would be followed by reinforcement.
PHILOSOPHER-SCIENTIST: Why could the step in the shaping process be larger this time?
TINNY: The lever pressing response had initially been conditioned by continuous reinforcement. After the shift from continuous reinforcement to a fixed-ratio (5) schedule, it would soon become obvious that some responses were not being reinforced. Before extinction took place the holistic analysis would have come to accept that not every response is being reinforced; in fact, most responses don't bring food. That any particular response is not followed by a reinforcing stimulus is no longer considered by the holistic analysis as sufficient information to determine whether or not lever pressing still brings food. Extinction would not take place nearly as quickly after conditioning on a fixed-ratio (5) schedule, since many more responses would be necessary to determine whether or not lever pressing was still an effective way to obtain food.
PHILOSOPHER-SCIENTIST: After lever pressing had been firmly established on the fixed-ratio (20) schedule could the next step be even larger?
TINNY: The higher the fixed-ratio schedule the larger the next step in the shaping process can be.
PHILOSOPHER-SCIENTIST: Is there no limit to that rule?
TINNY: That was actually a general statement. I'm sure there would be limits and exceptions.
PHILOSOPHER-SCIENTIST: How large could this next step be?
TINNY: All these numbers I'm giving are only estimates. I'm pretty sure they are plausible though. I'd say the next step in this example could require fifty or perhaps sixty responses for each reinforcement.
PHILOSOPHER-SCIENTIST: How soon after raising the number of responses required to receive reinforcement can the schedule be raised again?
TINNY: There is a name given to the process of requiring more responses for less reward. It's called thinning the schedule.
PHILOSOPHER-SCIENTIST: That's an appropriate name. So how soon after thinning the schedule can the schedule be thinned again?
TINNY: There is no fixed rule. It depends on the circumstances. It is best to let the conditioning at each thinned fixed-ratio schedule become stable before moving on to the next step in the shaping process.
PHILOSOPHER-SCIENTIST: What does it mean to let the conditioning become stable?
TINNY: Stability of conditioning means that the subject has become adjusted to the contingencies of the schedule and the schedule can maintain the behaviour. The worry always during the shaping process is that the conditioning will lose its influence and the behaviour being conditioned will undergo extinction.
PHILOSOPHER-SCIENTIST: You say that reinforcement is not only necessary to condition behaviour, but also to maintain behaviour.
TINNY: Behaviour which has been raised above the baseline level by reinforcement will always tend to drop back toward that baseline level unless some further reinforcement can maintain the conditioned level.
PHILOSOPHER-SCIENTIST: What might a next reasonable step in this shaping process be?
TINNY: Perhaps a fixed-ratio (150) schedule.
PHILOSOPHER-SCIENTIST: What after that?
TINNY: Maybe a fixed-ratio (300), then a fixed-ratio (500), a fixed-ratio (800) and finally a fixed-ratio (1000) schedule, which is the final goal of the shaping process.
PHILOSOPHER-SCIENTIST: A fixed-ratio (1000) schedule; one thousand lever presses for each food pellet. That's a lot of lever presses for just one food pellet.
TINNY: That would be called an extremely thin schedule of reinforcement. When a rat is pressing a lever one thousand times for just one food pellet, the rat would be slowly killing itself.
PHILOSOPHER-SCIENTIST: Why do you say that?
TINNY: It would require more energy to press the lever one thousand times than the amount of energy contained in one food pellet. At a fixed-ratio (1000) schedule of reinforcement the rat would be slowly working itself to death.
PHILOSOPHER-SCIENTIST: Conditioning must be very powerful.
TINNY: The application of the laws of learning is far too powerful to be used indiscriminately. That example, where the rat was conditioned to press the lever a thousand times for one food pellet, shows the danger of the misuse of these learning principles.
PHILOSOPHER-SCIENTIST: Do you mean because whoever conditioned the rat to work itself to death is doing a wrong thing?
TINNY: That was very wrong, of course; but, I meant this example is indicative of the general danger of so powerful an influence.
PHILOSOPHER-SCIENTIST: You said this conditioning is everywhere, so we can't eliminate the danger by not using these laws of learning. How can we protect the world against the misuse of these learning principles?
TINNY: The laws of learning themselves are amoral. They have no morality, good or bad. These laws can be used for good or they can be used for evil. We must forsake the use of these powerful influences for wrong ends. We must choose to use these powerful influences only for that which is good. We must use the laws of learning only to further our quest for perfection.
PHILOSOPHER-SCIENTIST: That sounds like an excellent idea, but how can it be brought into reality? The knowledge of these learning principles belongs to every member of human society; yet if these laws of learning are freely distributed to all humanity, there will be those who will use these powerful influences to further their wrong desires.
TINNY: To ensure the right use of these laws of learning, these principles must be presented as an integral part of a positive philosophy of life. The two, the technology and the philosophy, should be forever inseparable.
PHILOSOPHER-SCIENTIST: That is an unusual plan.
TINNY: I'm sure few would have expected this to be the answer. I know I didn't.
PHILOSOPHER-SCIENTIST: As you tell me of the way you came to understand these laws of learning, I am seeing the development of the new worldview, the unified theory of existence.
TINNY: The two have become one. The laws of learning and the development of a new, more positive worldview are forever inseparable.
PHILOSOPHER-SCIENTIST: Tell me more about these laws of learning so I may know more about all things.
TINNY: Since I just explained a few things about fixed-ratio schedules of reinforcement, this would be a good time to describe the other ratio schedule, the variable-ratio. Variable-ratio schedules, instead of providing reinforcement after the same number of responses each time, reinforce differing numbers of responses. These different numbers of responses vary around some average number. A fixed-ratio (5) schedule would result in every fifth response being followed by a reinforcing stimulus. A variable-ratio (5) schedule means that while reinforcement may occur after a different number of responses each time, the average of those numbers would be five responses. Reinforcement might occur after two responses, eight responses, six responses, four responses, one response, and nine responses. The total number of responses in that sequence is thirty. During those thirty responses reinforcement occurred six times. The average number of responses until reinforcement is five. That would be a variable-ratio (5) schedule.
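The difference between the two ratio schedules, and the arithmetic of the example above, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the only data used is the sequence of response counts given in the example (two, eight, six, four, one, and nine).

```python
from itertools import accumulate


def fixed_ratio_schedule(ratio, total_responses):
    """Return the 1-based response numbers reinforced on a fixed-ratio schedule."""
    return list(range(ratio, total_responses + 1, ratio))


def variable_ratio_schedule(requirements):
    """Given the responses required for each reinforcer, return the
    1-based response numbers at which reinforcement occurs."""
    return list(accumulate(requirements))


def mean_ratio(requirements):
    """Average number of responses per reinforcer."""
    return sum(requirements) / len(requirements)


# Sequence taken from the example: 2, 8, 6, 4, 1 and 9 responses.
vr5 = [2, 8, 6, 4, 1, 9]

print(fixed_ratio_schedule(5, 30))   # [5, 10, 15, 20, 25, 30]
print(variable_ratio_schedule(vr5))  # [2, 10, 16, 20, 21, 30]
print(mean_ratio(vr5))               # 5.0
```

Both schedules deliver six reinforcers in thirty responses; only the spacing differs.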
PHILOSOPHER-SCIENTIST: If the rat received the same number of reinforcers for the same overall number of responses, would there be any difference between the effect on conditioning of fixed-ratio or variable-ratio schedules? It seems like there wouldn't be a difference in the effect, because both schedules require the same amount of work for the same reward.
TINNY: There are many similarities between the two different types of ratio schedules. There is one main difference in the effect of the two schedules though.
PHILOSOPHER-SCIENTIST: What is that main difference between the effect of a fixed-ratio schedule and a variable-ratio schedule?
TINNY: Just as the fixed-ratio schedule is more resistant to extinction than continuous reinforcement, the variable-ratio schedule is even more resistant to extinction than the fixed-ratio schedule.
PHILOSOPHER-SCIENTIST: What is it about variable-ratio schedules which makes them more resistant to extinction than fixed-ratio schedules?
TINNY: Resistance to extinction is related to the amount of information available about the relationship between the response and reinforcing stimuli. The more information available to holistic analysis from each response about the relationship, the more quickly extinction will occur. The less information each response provides about that relationship the greater the resistance to extinction. During extinction of a response that has been conditioned on a schedule of continuous reinforcement, every response which is not reinforced provides information that the schedule is no longer in effect. Each response which is not followed by a reinforcing stimulus verifies that responding no longer brings reward. During extinction of a response which has been conditioned on a fixed-ratio schedule most responses provide no information as to whether or not the schedule is still in effect. They provide no information because they would not be followed by a reinforcer whether the fixed-ratio schedule was still in effect or if extinction had begun. Only the response which completes the ratio is reinforced; as for example in a fixed-ratio (5) schedule, only every fifth response would be reinforced. It is only this one response out of each five which could provide information that extinction had begun and no longer would any response be followed by a reinforcing stimulus. The continuous reinforcement schedule provided information during extinction with every non-reinforced response. The fixed-ratio (5) schedule provided information during extinction only on every fifth response; much less information than on a continuous reinforcement schedule, but fixed-ratio schedules do still provide some information as to whether or not that reinforcement is still being provided for responding. With the variable-ratio schedule, though, no response can ever provide information which definitely indicates extinction has begun.
PHILOSOPHER-SCIENTIST: Why is that?
TINNY: Because during a variable-ratio schedule there is no certain response which can ever be counted on to be followed by a reinforcing stimulus. For example a variable-ratio (5) schedule doesn't mean every fifth response can be expected to be reinforced. There could theoretically be a hundred or even a thousand responses without reinforcement during conditioning with a variable-ratio (5) schedule.
PHILOSOPHER-SCIENTIST: How could it be a variable-ratio (5) schedule if there were a hundred responses from one reinforcement to the next?
TINNY: I'm not saying a variable-ratio (5) schedule using these particular numbers would be effective. I'm trying to make the point that since the schedule specifies an average number of responses per reinforcement, not a certain number, it can never be known beforehand which response will be reinforced. I'll give you an example of a sequence of numbers which make up a variable-ratio (5) schedule with over one hundred responses between one reinforcement and the next. Numbers which would fulfill the requirements are: reinforcement follows the first response, then after one hundred and one responses, and then after every one of the next twenty-three responses. Reinforcement would have been given twenty-five times during one hundred and twenty-five responses, for an average of five responses for every reinforcer. This would still be a variable-ratio (5) schedule, although that would be an extremely unlikely spacing of reinforcement to ever occur in real life situations.
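The totals in this extreme sequence can be checked with a short Python sketch. The helper name is hypothetical; the sequence itself (one response, then one hundred and one, then twenty-three single responses) is the one just described.

```python
def summarize_ratio_sequence(requirements):
    """Return (total responses, reinforcers, mean ratio, longest unreinforced run)."""
    total = sum(requirements)
    reinforcers = len(requirements)
    # The longest requirement ends with a reinforced response, so the run of
    # unreinforced responses before it is one shorter.
    longest_dry_run = max(requirements) - 1
    return total, reinforcers, total / reinforcers, longest_dry_run


# 1 response, then 101 responses, then each of the next 23 responses.
extreme_vr5 = [1, 101] + [1] * 23

print(summarize_ratio_sequence(extreme_vr5))  # (125, 25, 5.0, 100)
```

The mean is still five responses per reinforcer even though one hundred responses in a row go unreinforced.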
PHILOSOPHER-SCIENTIST: I see your point, though. Behaviour conditioned by a variable-ratio schedule is more resistant to extinction because each response gives less information about whether reinforcement is still possible than either continuous reinforcement or fixed-ratio schedules.
TINNY: I'm afraid my explanation got a bit complicated.
PHILOSOPHER-SCIENTIST: Sometimes that which is simple in practice is not easily put into words. Don't worry about the occasional awkwardness of your explanations. You have been doing a remarkable job explaining some extremely complex ideas, concepts, and scientific knowledge in a reasonably simple and understandable manner.
TINNY: Thank you. I've really been trying hard to do a good job.
PHILOSOPHER-SCIENTIST: Besides being more resistant to extinction, what other differences result from variable-ratio conditioning rather than from fixed-ratio schedules?
TINNY: There is a minor difference which has more significance in the understanding of how learning takes place than in the difference itself.
PHILOSOPHER-SCIENTIST: What is the difference?
TINNY: The variable-ratio schedules generate high constant rates of responding. Fixed-ratio schedules generate similarly high rates of responding, but they are not so constant. During fixed-ratio conditioning there is often a pause in responding right after reinforcement occurs.
PHILOSOPHER-SCIENTIST: Does this pause have a name?
TINNY: Rather unimaginatively, it's called the post-reinforcement pause.
PHILOSOPHER-SCIENTIST: Might this pause just be caused by the time it takes to eat the reinforcer?
TINNY: Reinforcers aren't always food, but even in the example of the rat receiving food pellets, the pause cannot be accounted for by time taken to eat the reinforcing stimulus. If the reinforcer is food it takes the same amount of time to eat after reinforcement on a variable-ratio schedule as it does on a fixed-ratio schedule; and there is no post-reinforcement pause in the responding during variable-ratio conditioning.
PHILOSOPHER-SCIENTIST: What could account for this post-reinforcement pause?
TINNY: To answer that question it would be good to consider what might take place in the holistic analysis when a response is reinforced on fixed-ratio and on variable-ratio schedules. Remembering that every response provides information which is included in the holistic analysis, what prediction can be made about the next response after reinforcement on a fixed-ratio schedule?
PHILOSOPHER-SCIENTIST: On a fixed-ratio schedule the presentation of the reinforcing stimulus signals that the next response will not be reinforced. It indicates that there is once again some certain number of responses which must be emitted before reinforcement can once again follow the response.
TINNY: Would that same information be available on a variable-ratio schedule? Could holistic analysis assume that the response after reinforcement will not be reinforced?
PHILOSOPHER-SCIENTIST: On a variable-ratio schedule it cannot be known whether or not the next response after the presentation of the reinforcing stimulus will also be reinforced. That two or more responses in a row may be reinforced on a variable-ratio schedule remains always a possibility.
TINNY: In general, responding is more likely to occur if the response can result in reinforcement than if it can't. On a fixed-ratio schedule the response after reinforcement cannot be reinforced. It is for this reason pauses occur just after the presentation of the reinforcing stimulus during fixed-ratio conditioning, but not during variable-ratio schedules. From a human point of view it might be considered as having just finished a meal break from work and the person might think, "Well OK, I guess I have to get back to work now" with some hesitation before actually beginning work.
PHILOSOPHER-SCIENTIST: Before we move on to discuss the interval schedules, I think it would help to increase understanding of the ratio schedules if you would give some examples from human life.
TINNY: Many people have jobs which involve fixed-ratio schedules. Jobs of this sort are often called piece work. A certain amount of money is given for each fixed number of units completed. The units of work could be shirts made, boxes of fruit picked, number of papers delivered, or many other things.
PHILOSOPHER-SCIENTIST: Is a fixed-ratio schedule of that type effective in getting the job done?
TINNY: Doing piece work usually results in the employees working fast and hard.
PHILOSOPHER-SCIENTIST: I suppose the employers would like that.
TINNY: I'm sure they would; but, I think piece work is of much more advantage to the employer than to the worker. Jobs that are paid on a piecework basis are often not preferred jobs. The workers who take these jobs are often desperate and have no other alternative.
PHILOSOPHER-SCIENTIST: Don't the workers have the opportunity to make a good living by doing a lot of work?
TINNY: Not usually. It is more often the case that the reward for long hours and hard work will be barely enough to live on.
PHILOSOPHER-SCIENTIST: Why is that?
TINNY: Employers often misuse the power of learning principles. In technical terms, they thin the schedule until the workers are almost like the rat that was working itself to death. I wouldn't say that piecework could never provide a decent income, but it would be rare, if ever, that piecework was not of greater advantage to the employer than to the worker. Employers who have their workers on a fixed-ratio schedule tend to get more work for less pay than any other arrangement. There is no time for leisure, regardless of the post-reinforcement pause we discussed, on a fixed-ratio schedule. Any leisure time, indeed any break from the work routine, is paid for by the worker and costs nothing to the employer.
PHILOSOPHER-SCIENTIST: Well, that was a very disheartening example of fixed-ratio scheduling. How about an example of variable-ratio conditioning from human life?
TINNY: Variable-ratio schedules are very common and very old in human society. Gambling is one of the most prominent examples. Gambling behaviour is developed and maintained by conditioning on a variable-ratio schedule.
PHILOSOPHER-SCIENTIST: Would you explain how gambling involves a variable-ratio schedule?
TINNY: I'll have to give a very simple example because I don't know much about gambling. Suppose people were to bet on the throw of one dice.
PHILOSOPHER-SCIENTIST: Dice come in pairs. One of the pair of dice is called a die.
TINNY: It sounds funny to call one a die, but I guess I'll get used to it. Anyway, suppose people were to bet on the throw of one die. A die, being a cube, has six sides. The chance of any particular number coming up is one in six. When betting on the number that will come up there is one chance in six of winning. This doesn't mean the bettor will win every sixth throw of the die; it means that on average there would be one win for every six throws. This is clearly an example of a variable-ratio (6) schedule. On the average one throw in six will win. The throw, or the bet on each throw, is the response. The winnings are the reinforcing stimulus. It is the intermittent pairing of the response and the reinforcing stimulus which conditions gambling behaviour. Sometimes on variable-ratio schedules the reinforcing stimulus will follow several responses in succession, and sometimes there will be a large number of responses in a row which go unrewarded. When a person gambles for the first time they may encounter a number of reinforced responses in quick succession or they may bet quite a few times and never win. The person who won a number of times in quick succession would be more likely to gamble again in the future. Once the gambling behaviour becomes stabilised on the variable-ratio schedule it would be very resistant to extinction.
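The die example lends itself to a small worked calculation. The Python sketch below assumes a fair six-sided die and independent throws, which the dialogue implies but does not state; the function names are illustrative only.

```python
from fractions import Fraction

# Assumption: a fair die, so a bet on one number wins with probability 1/6.
P_WIN = Fraction(1, 6)


def expected_bets_per_win(p_win=P_WIN):
    """Average number of bets per win (the '6' in variable-ratio (6))."""
    return 1 / p_win


def prob_no_win_in(n_bets, p_win=P_WIN):
    """Probability that n_bets in a row all lose."""
    return (1 - p_win) ** n_bets


def prob_wins_in_a_row(n_bets, p_win=P_WIN):
    """Probability that n_bets in a row all win."""
    return p_win ** n_bets


print(expected_bets_per_win())                  # 6
print(prob_wins_in_a_row(2))                    # 1/36
print(round(float(prob_no_win_in(6)), 3))       # 0.335
```

Under these assumptions roughly one run of six bets in three contains no win at all, while back-to-back wins also remain possible, which is the irregular spacing described above.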
PHILOSOPHER-SCIENTIST: Sometimes gambling can become a very severe problem in people's lives.
TINNY: It is the resistance to extinction of variable-ratio schedules which can cause those terrible problems. In gambling there is never a response which doesn't have a chance of being reinforced. Since every bet has some chance of winning, there comes an expectation, or more truthfully, a hope that the next bet will win. Even after losing everything, when gambling has become a sickness, a gambler will try to gather money for just one more bet. Some people's lives have been destroyed by the power of variable-ratio conditioning to resist extinction.
PHILOSOPHER-SCIENTIST: Would you give me another example of human behaviour that compares extinction after continuous reinforcement with extinction after a ratio schedule?
TINNY: I'll use an example that doesn't involve the darker side of human behaviour. The devices we use sometimes condition our behaviour. In this example there are two people, each with a flashlight. One of the flashlights has always been in perfect working order. When the batteries are charged the flashlight works as soon as it is switched on. The other flashlight has never been in perfect working order. It seems to have a loose wire and sometimes, even when the batteries are charged, must be switched on and off a number of times before it will light. The owner of the good working flashlight is on a continuous reinforcement schedule. Every response of switching the flashlight on is immediately reinforced by the light coming on. The owner of the flashlight with the loose wire is on a variable-ratio schedule. Sometimes the first response of switching on the flashlight is reinforced, sometimes the switch must be operated ten or more times before the light comes on. Now the extinction process begins. The batteries in both the flashlights have lost their charge.
There is no longer any possibility the response of switching on either flashlight will be reinforced by the light coming on. The person with the flashlight that is in perfect working order switches the flashlight on for the first time with the uncharged batteries. The response of switching on the flashlight is not reinforced. This first response during the extinction process provides a lot of information. The person immediately thinks the batteries have lost their charge. The person with this flashlight may try the switch once, perhaps even twice more, but would almost immediately stop attempting to turn the light on. Because conditioning took place during continuous reinforcement the response would extinguish almost immediately when the response was no longer followed by the reinforcing stimulus. Now contrast this with the behaviour of the person who has the flashlight with a loose wire. Extinction is in effect here also, since regardless of the loose wire the batteries have no charge. The light cannot be turned on. When the person switches on the flashlight, for the first time during extinction, that response is not reinforced. The light does not come on.
This lack of reinforcement provides little information as to the fact that the batteries have no charge. The flashlight often fails to work even when the batteries are charged. Rather than thinking the lack of reinforcement means the batteries have lost their charge, the more likely belief is that further responding will make the light come on. Instead of stopping after only one or two attempts at switching on the light, there might be ten, twenty, or more attempts before giving in. Even then the person would not be sure it was the batteries' lack of charge which was the reason the light would not come on. On a variable-ratio schedule it is never possible to be sure that further responses won't be reinforced. Even after attempting to switch the light on twenty or thirty times, the flashlight might be put aside only to be tried again later. Responses conditioned by variable-ratio schedules can last a very long time after reinforcement for the response is no longer available.
PHILOSOPHER-SCIENTIST: It is interesting that even after giving up on attempts to get the flashlight to work the person would come back and try again.
TINNY: That is one of the general characteristics of behaviour during extinction. Even after a response appears to have been totally extinguished there will often be another sudden response or burst of responding.
PHILOSOPHER-SCIENTIST: Does that characteristic of extinction have a name?
TINNY: It is called spontaneous recovery.
PHILOSOPHER-SCIENTIST: Is spontaneous recovery more likely to occur during the extinction of behaviour conditioned on some schedules than on others?
TINNY: Those schedules that condition behaviours which are most resistant to extinction will show the greatest amount of spontaneous recovery. Those schedules which are least resistant to extinction will show the least amount of spontaneous recovery.
PHILOSOPHER-SCIENTIST: So extinction of a response conditioned by continuous reinforcement would show less spontaneous recovery than extinction of a response conditioned by fixed-ratio or variable-ratio reinforcement would show.
TINNY: That's correct. The more certain holistic analysis can be that the response will no longer be reinforced, the less likely spontaneous recovery is to occur. The less certain holistic analysis can be that the response will no longer be reinforced, the more likely spontaneous recovery is to occur.
PHILOSOPHER-SCIENTIST: That sounds quite logical. If it's virtually certain some particular behaviour will not have the desired outcome, it is not very reasonable to behave again in that manner; but, if it appears there is some chance a behaviour will have the desired outcome it is reasonable to behave again in that manner.
TINNY: Once again we see how logically holistic analysis functions.
PHILOSOPHER-SCIENTIST: Maybe now we could discuss the interval schedules of reinforcement.
TINNY: Fine. There are two types of interval schedules, just as there were two types of ratio schedules. They are fixed-interval and variable-interval. The fixed-interval schedule is one in which only the first response after some fixed interval of time is reinforced. For example, on a fixed-interval (five minute) schedule only the first response after each five minute period has elapsed will be reinforced. No response during the five minute period is ever reinforced.
PHILOSOPHER-SCIENTIST: That must mean there would be five minutes between reinforcements.
TINNY: It could mean that, but it doesn't necessarily mean that.
PHILOSOPHER-SCIENTIST: If it is a fixed-interval (five minute) schedule, wouldn't reinforcements always come at fixed five minute intervals?
TINNY: That is a common misunderstanding of what fixed-interval scheduling means. The reinforcement doesn't necessarily come at the end of the fixed-interval, only the opportunity to be reinforced for a response comes at the end of the fixed-interval. The reinforced response may occur immediately after the end of the five minute time period in which case the reinforcement would be given at the end of the fixed-interval period, or the response could come any time after the end of the five minute fixed-interval, for example two minutes later. The rat would in that case not be reinforced at the end of the five minute fixed time period, but would be reinforced two minutes later.
PHILOSOPHER-SCIENTIST: So the end of each fixed-interval does not mean reinforcement will occur, but only that a response can now be reinforced.
TINNY: That's right.
PHILOSOPHER-SCIENTIST: How many responses can be reinforced at the end of each fixed-interval?
TINNY: Just one. As soon as a response occurs and is reinforced, after the end of each fixed-interval, the same interval begins again. No response during the fixed-interval can ever be followed by a reinforcing stimulus; reinforcement can only occur after the fixed-interval has elapsed.
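The rule for a fixed-interval schedule can be written out as a short Python sketch. It rests on two assumptions that go slightly beyond the dialogue: the first interval starts at time zero, and a response made exactly at the moment the interval ends counts as reinforced. The function name and the response times are made up for illustration.

```python
def fixed_interval_reinforced(response_times, interval):
    """Return the times of the reinforced responses on a fixed-interval schedule.

    response_times: times (in minutes) at which responses occur.
    interval: length of the fixed interval in minutes.
    """
    reinforced = []
    interval_start = 0  # assumption: the first interval begins at time 0
    for t in sorted(response_times):
        # Only the first response after the interval has elapsed is reinforced.
        if t - interval_start >= interval:
            reinforced.append(t)
            interval_start = t  # the same interval begins again at reinforcement
    return reinforced


# Hypothetical responses on a fixed-interval (five minute) schedule.
responses = [2, 4, 7, 8, 11, 12, 13]
print(fixed_interval_reinforced(responses, 5))  # [7, 12]
```

The first reinforcer arrives at seven minutes, two minutes after the interval ended, because no response happened to occur sooner; the responses at two, four, eight, eleven, and thirteen minutes earn nothing.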
PHILOSOPHER-SCIENTIST: There isn't much reinforcement available on a fixed-interval schedule.
TINNY: You're right, there isn't much reinforcement available, but unlike the fixed-ratio schedule which could require much work for little reward, fixed-interval schedules don't require much work. There is probably less work done on a fixed-interval schedule than on any other schedule.
PHILOSOPHER-SCIENTIST: Is it generally true that on interval schedules the response rate is lower than on ratio schedules?
TINNY: That is almost always true. Ratio schedules tend to generate very high rates of responding. Interval schedules tend to generate low to medium rates of responding.
PHILOSOPHER-SCIENTIST: Why do ratio schedules generate such high rates of responding?
TINNY: On all ratio schedules whether fixed or variable, whether low ratio or high ratio, one fact always remains true. The greater the number of responses the greater the amount of reinforcement. The more you work, the more you get.
PHILOSOPHER-SCIENTIST: Wouldn't more responses always bring more reward, no matter what schedule of reinforcement is in effect?
TINNY: The response rate during interval schedules has virtually nothing to do with how much reinforcement is received. During interval schedules all responses which occur before the interval has elapsed are wasted. There is never any chance they could have been rewarded. The only response which is of importance is the first one after the particular interval has ended.
PHILOSOPHER-SCIENTIST: So it is really only necessary to respond one time after each interval has ended to receive the maximum reinforcement that is possible.
TINNY: That's all it would take to get the most reward possible, but it's not so easy to do.
PHILOSOPHER-SCIENTIST: Why not?
TINNY: To respond most efficiently on an interval schedule it would be necessary to know exactly when each interval has ended. Animals can't judge time very well; neither can human beings without a clock. So even if the length of the interval were known, which is often not the case, it would still be difficult to judge when the interval had elapsed. If the judgment as to when the interval has ended is too early the responses will not be reinforced.
PHILOSOPHER-SCIENTIST: But if the judgment as to when the interval has ended is too late, the first response will still be reinforced. Wouldn't it be a good tactic to just wait an overly long time before responding, so that the first response would always be reinforced? Wouldn't that make the most efficient use of each response?
TINNY: It would be an efficient use of each response, but it would be an inefficient use of time.
PHILOSOPHER-SCIENTIST: Would you explain that with an example?
TINNY: If the rat in the experimental chamber, which was being conditioned on a fixed-interval (five minute) schedule, were to adopt the tactic you describe, it might wait ten minutes after each reinforced response before responding again. It would do this, according to the tactic you propose, so that it would never waste a response by responding before the five minute interval has elapsed. If the rat could judge time perfectly, which it can't, it would respond immediately after the end of each five minute interval and be reinforced twelve times each hour. By waiting ten minutes after each reinforcement, the rat does get reinforced for each response, but only gets reinforced six times each hour. This is why I said the tactic of purposefully waiting extra time without responding is an inefficient use of time.
PHILOSOPHER-SCIENTIST: I would have to agree the tactic I suggested was not an efficient one. Tell me, how does the rat solve the problem of how to use both time and energy most efficiently when being conditioned on an interval schedule?
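The arithmetic in this example can be checked with a short Python sketch. It assumes a simplified rat that responds exactly once every `wait_minutes`, a one-hour session, and an interval timer that restarts at each reinforced response; the function name and the one-minute comparison case are hypothetical additions for illustration.

```python
def hour_on_fixed_interval(wait_minutes, interval_minutes=5, session_minutes=60):
    """Return (reinforced, wasted) responses for one session.

    The rat is assumed to respond once every `wait_minutes`. A response is
    reinforced only if at least `interval_minutes` have passed since the
    last reinforced response; otherwise it counts as wasted.
    """
    reinforced = 0
    wasted = 0
    last_reinforced = 0
    for t in range(wait_minutes, session_minutes + 1, wait_minutes):
        if t - last_reinforced >= interval_minutes:
            reinforced += 1
            last_reinforced = t
        else:
            wasted += 1
    return reinforced, wasted


print(hour_on_fixed_interval(5))   # perfect timing: (12, 0)
print(hour_on_fixed_interval(10))  # overly long wait: (6, 0)
print(hour_on_fixed_interval(1))   # pressing every minute: (12, 48)
```

Waiting ten minutes wastes no responses but halves the reinforcement per hour, while pressing every minute keeps all twelve reinforcers at the cost of forty-eight wasted presses.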
TINNY: There are two types of interval schedules, and each requires a different tactic to receive the most reinforcement for the least work. The difference between fixed-interval and variable-interval schedules is greater than the difference between fixed-ratio and variable-ratio schedules.
PHILOSOPHER-SCIENTIST: I want you to explain those differences, but first would you define a variable-interval schedule? I think I know what it would entail from our discussion of the other schedules, but I'd like to be sure before we go into the differences.
TINNY: As an example I'll describe a variable-interval (five minute) schedule. This would mean that while the length of the intervals will differ, they will average five minutes. Just as with fixed-interval schedules, no response is ever reinforced during an interval, regardless of the interval's length. The only response reinforced is the first one that occurs after each interval ends. A variable-interval (five minute) schedule might contain the following sequence of intervals: two minutes, nine minutes, thirty seconds, seven and a half minutes, six minutes, six minutes, and four minutes. The total amount of time in this sequence is thirty-five minutes and there are seven opportunities for a response to be reinforced. On the average one response would be reinforced every five minutes, although none of the intervals was actually five minutes in length.
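The averaging in this example is easy to verify. The short Python check below uses only the seven interval lengths given in the example, expressed in minutes; the variable names are illustrative.

```python
# Interval lengths from the example, in minutes.
intervals = [2, 9, 0.5, 7.5, 6, 6, 4]

total_minutes = sum(intervals)
opportunities = len(intervals)
average_minutes = total_minutes / opportunities

print(total_minutes)    # 35.0
print(opportunities)    # 7
print(average_minutes)  # 5.0

# None of the individual intervals is exactly five minutes long.
print(any(length == 5 for length in intervals))  # False
```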
PHILOSOPHER-SCIENTIST: From that explanation of fixed-interval and variable-interval schedules it isn't easy to see why the differences would be greater between fixed-interval and variable-interval schedules than between fixed-ratio and variable-ratio schedules.
TINNY: I think the reasons will become clear if we consider how the holistic analysis may interpret the information available from those fixed-interval and variable-interval schedules. Remember that, as always, the function of holistic analysis is to ensure logical and efficient behaviour in relation to whatever circumstances arise. When obtaining food the object is to get as much food for as little work as possible. It is that general plan which determines the behaviours on the various schedules of reinforcement. First we'll consider how the rat in an experimental chamber might respond on a fixed-interval (five minute) schedule of reinforcement. Assuming that the rat has been conditioned on this schedule long enough for the behaviour to become stabilised, what information will be provided to holistic analysis when a lever pressing response is reinforced with a food pellet?
PHILOSOPHER-SCIENTIST: A response being reinforced would provide information that there will be some certain period of time until another lever pressing response will be reinforced.
TINNY: We know that period of time when the response will not be reinforced is five minutes long, but does the rat know that?
PHILOSOPHER-SCIENTIST: Of course the rat would have no idea what five minutes means and wouldn't be able to judge that length of time accurately.
TINNY: That's true, but a rat still has a fairly developed time sense. At least it knows there is some period of time in which lever pressing is never followed by the reinforcing stimulus. The holistic analysis would not advise that the lever be pressed for some period of time after each reinforced response. Any lever pressing early in the five minute time period would be wasted energy, so holistic analysis wants to wait as long as possible without responding. It is also desirable to get as much reinforcement as possible, so waiting too long is not acceptable either. These conflicting desires present a dilemma. It is desirable to respond neither too early nor too late, but without an accurate way to tell time it is not possible to be certain when the best time to respond is. Holistic analysis deals with this dilemma in the following way: during the beginning of the five minute period after the reinforced response, holistic analysis does not advise any lever pressing.
PHILOSOPHER-SCIENTIST: How long might that period of non-responding be?
TINNY: The rat might not press the lever at all for the first two or three minutes of the five minute interval. Not being sure how long it will be until reinforcement is once more possible, and not wanting to miss out, holistic analysis begins advising an occasional lever press after the first few minutes. As time passes, the likelihood that the non-reinforcement period has ended increases, so lever pressing becomes more frequent. By the time the five minute interval has ended, the rat is responding at a fairly high rate. This high rate of responding at the end of the fixed time interval ensures that the reinforcer will be received as soon as it becomes available. Then, as soon as the first response after the five minute interval is reinforced, the rat stops pushing the lever for several minutes again. During the first several minutes there were no responses; after that the rate of lever pressing increased slowly, only becoming fast very near the end of the fixed interval. If the rate of responding was averaged over the whole five minute interval, that rate would be quite low.
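The pattern described here, a pause followed by slowly accelerating responding, can be given a rough numerical shape. The Python sketch below is a toy model and not a measured curve: the 2.5 minute pause, the peak rate of 60 presses per minute, the quadratic rise, and both function names are assumptions chosen only for illustration.

```python
def scallop_rate(t, interval=5.0, pause=2.5, peak_rate=60.0):
    """Hypothetical presses per minute, t minutes after the last reinforcement.

    No responding during the initial pause, then a rate that rises slowly at
    first and reaches `peak_rate` only at the end of the interval.
    """
    if t < pause:
        return 0.0
    progress = min((t - pause) / (interval - pause), 1.0)
    return peak_rate * progress ** 2


def average_rate(interval=5.0, steps=500):
    """Average presses per minute over the whole interval (midpoint sum)."""
    dt = interval / steps
    total_presses = sum(
        scallop_rate((i + 0.5) * dt, interval) * dt for i in range(steps)
    )
    return total_presses / interval


print(scallop_rate(1.0))  # during the pause
print(scallop_rate(5.0))  # at the end of the interval
print(round(average_rate(), 1))
```

With these assumed numbers the rate at the end of the interval is 60 presses per minute, yet the average over the whole five minutes comes out at roughly 10, which is the sense in which the overall rate is quite low.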
PHILOSOPHER-SCIENTIST: That is a very clever way to deal with a fixed-interval schedule of reinforcement.
TINNY: Holistic analysis is very clever.
PHILOSOPHER-SCIENTIST: Is there any common human behaviour that is governed by fixed-interval reinforcement?
TINNY: Many schools condition students to study on a fixed-interval schedule. When exams are scheduled to be given on some certain date in the future, this acts as a fixed-interval schedule. For instance an exam may be planned for a month in the future. The response under consideration is the student's study behaviour. The schedule is a fixed-interval (one month). Many students after hearing the exam is a month away will do no studying at all for two or three weeks. Then they may do some small amount of study until a day or two before the exam when a high rate of study behaviour begins, and many will stay up most of the night before the exam studying at a very high rate, called cramming.
PHILOSOPHER-SCIENTIST: What is the reinforcement in this schedule?
TINNY: Hopefully a good or at least passing grade in the exam.
PHILOSOPHER-SCIENTIST: The study behaviour pattern of students is almost identical to the lever pressing behaviour of a rat on a fixed-interval schedule.
TINNY: Fixed-interval schedules condition behaviour into a very distinctive pattern.
PHILOSOPHER-SCIENTIST: Since fixed-interval schedules of reinforcement result in such low overall rates of behaviour, do you think that is a good schedule to develop the study behaviour of students with?
TINNY: I think the fact that study behaviour is taught by means of fixed-interval scheduling is just one of many examples of how ignorance of even the simplest learning principles limits personal development and creates various human social problems.
PHILOSOPHER-SCIENTIST: What kind of study behaviour would develop if exams were given on a variable-interval schedule?
TINNY: If exams were given on a variable-interval schedule, the students would never know when to expect an exam. Every once in a while the student would be presented with an exam when they showed up in class. Many students in this type of situation tend to do some studying almost every day. They are afraid not to study at all because each day may be the day of the exam. They don't study intensely each day because there may be no exam that day, and they would receive no reward for studying. A steady, but low, rate of studying would be developed by the variable-interval exam schedule.
PHILOSOPHER-SCIENTIST: Is a slow and steady rate the general pattern of responding on variable-interval schedules of reinforcement?
TINNY: That's right. If the rat was conditioned to press the lever on a variable-interval (five minute) schedule its response pattern would be very similar to the student's pattern of study behaviour on a variable-interval schedule. It would press the lever at a continuous slow and steady rate throughout the interval.
PHILOSOPHER-SCIENTIST: Would the rat's reasons for pressing the lever at a slow and steady rate be the same as the student's reasons for doing only a small amount of study regularly?
TINNY: The student might or might not consciously decide a slow but steady rate of study is the most sensible way to respond to exams on a variable-interval schedule. The rat would not be able to make that same level of conscious decision, but it will still act logically and efficiently. On a variable-interval schedule, if the rat pressed the lever at a high rate there would be a great number of unreinforced responses. This would be a waste of energy. If a rat waited a long while between lever presses it might not get as much reward as the schedule would allow. Reinforcement on a variable-interval schedule might be available anytime. It could be available almost immediately after the last reinforced response. The slow steady rate of responding is a way of constantly checking whether reinforcement is available without expending too much energy.
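The trade-off between wasted energy and lost time can be sketched with the variable-interval (five minute) sequence used earlier. This Python sketch assumes a rat that presses at a perfectly steady pace and an interval timer that restarts at each reinforced response; the function name and the three pacing values are hypothetical.

```python
def run_variable_interval(gap_seconds, intervals_seconds):
    """Return (total_seconds, responses, wasted) for collecting every reinforcer.

    The rat is assumed to press once every `gap_seconds`. Each interval starts
    at the previous reinforced response, and the first press made after the
    interval has elapsed is the one that is reinforced.
    """
    t = 0
    last_reinforced = 0
    responses = 0
    for interval in intervals_seconds:
        while True:
            t += gap_seconds
            responses += 1
            if t - last_reinforced >= interval:
                last_reinforced = t
                break
    wasted = responses - len(intervals_seconds)
    return t, responses, wasted


# Two, nine, one half, seven and a half, six, six, and four minutes, in seconds.
sequence = [120, 540, 30, 450, 360, 360, 240]

print(run_variable_interval(1, sequence))    # very fast: (2100, 2100, 2093)
print(run_variable_interval(60, sequence))   # slow and steady: (2160, 36, 29)
print(run_variable_interval(600, sequence))  # very slow: (4200, 7, 0)
```

Pressing every second collects the seven reinforcers in the minimum thirty-five minutes but wastes over two thousand presses; pressing every ten minutes wastes none but takes twice as long; a steady press each minute loses only one minute while making just thirty-six presses.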
PHILOSOPHER-SCIENTIST: It seems giving exams at variable-intervals might be preferable to giving them at fixed-intervals.
TINNY: There would probably be some gain in the total amount of study done, that is true; but, it also might cause some unhappiness to the students.
PHILOSOPHER-SCIENTIST: Why is that?
TINNY: The students may feel it an unfair situation to be required to study constantly, but often receive no reward. Additionally, they would still be working at a level far below their potential. They would be learning far less than they could if they fulfilled their potential.
PHILOSOPHER-SCIENTIST: Why should the goal be to do as much work, or as in this example as much learning, as possible if the strategy used by holistic analysis is to do as little work for as much reward as possible?
TINNY: The overriding desire, as long as there is deprivation of any need, is to obtain as much reinforcement, reward, or success as possible. So receiving maximum reward is primary; but while obtaining that maximum reinforcement, the secondary desire is to achieve this goal while expending as little energy as possible. When the amount of reward is unlimited, as it is with knowledge, study behaviour, while still striving for maximum efficiency, achieves maximum reward by a maximum expenditure of energy. Ideally students will not be studying only to do well on an exam, but will study in order to acquire knowledge. The results on exams should be secondary to the reward of acquired knowledge.
PHILOSOPHER-SCIENTIST: If students really studied for the knowledge rather than passing the exam with a high score wouldn't exams be unnecessary?
TINNY: Exams are used to provide a means to assign grades. Exams can also be used to let the student know how well their studies are progressing; and, specifically, to let the student know the weaknesses or gaps in their knowledge. This information could be used to plan for future study.
PHILOSOPHER-SCIENTIST: Neither of the interval schedules of reinforcement seems appropriate to study behaviour. Would ratio schedules be any better?
TINNY: Either fixed-ratio or variable-ratio schedules could be used effectively to encourage study behaviour. Ratio schedules would be appropriate because they offer unlimited opportunity for reward for hard work. The more responding, the more reinforcement. The more you study, the more you learn. Variable-ratio would probably be best as it would allow greatest flexibility.
PHILOSOPHER-SCIENTIST: How would fixed and variable-ratio schedules be organised to reinforce study behaviour?
TINNY: When study is done from textbooks such schedules are easy to set up. On a fixed-ratio schedule the student would be allowed to take an exam each time they had studied a particular number of pages, for example on a fixed-ratio (twenty-five pages) schedule. On a variable-ratio schedule the student would be allowed to be examined on what they had learned after any number of pages they chose. A student may find one page particularly important but difficult, and may want to be examined on how effectively they had learned that one page before moving on. At other times a student may feel very confident of the progress of their knowledge and may not want to interfere with that progress. In this case the student may desire to verify their success and confidence by an examination after studying a hundred pages.
PHILOSOPHER-SCIENTIST: In these examples is taking an exam considered to be a reinforcer?
TINNY: Yes it is.
PHILOSOPHER-SCIENTIST: Many students don't find the opportunity to take an exam very rewarding.
TINNY: Any stimulus can be positive or negative depending on the circumstances. Ideally an exam should be a positive experience for a student. Both correct and incorrect answers on an exam can be reinforcing. If an answer is correct the reward comes from having successfully acquired new knowledge. If an answer is incorrect the reward comes from the opportunity to correct a flaw in knowledge.
PHILOSOPHER-SCIENTIST: Often students are punished in some way for incorrect answers in an exam. This punishment can be a poor grade, parental or teacher disapproval, or even the loss of opportunity to continue studies.
TINNY: None of those situations should ever occur. All forms of punishment hold the student back from the true purpose of study, which is the gaining of knowledge. There are always problems associated with the use of punishment. Punishment is seldom an effective way to stop negative behaviour; and it is never an effective way to develop positive behaviour.
PHILOSOPHER-SCIENTIST: I think I understand both ratio and interval schedules of reinforcement fairly well now.
TINNY: I'll ask you a few questions about the different schedules to make sure.
PHILOSOPHER-SCIENTIST: Then we will both know if I have correctly understood and if we are ready to go on to some of the more complex aspects of learning principles.
TINNY: Which schedule of reinforcement would usually be the quickest way to teach any behaviour?
PHILOSOPHER-SCIENTIST: Learning usually takes place faster on continuous reinforcement than any other schedule of reinforcement.
TINNY: That's right, but why is it true?
PHILOSOPHER-SCIENTIST: Because continuous reinforcement gives the most information about the relationship between the response and the reinforcement. It is always clear exactly which behaviour will bring reward, since every correct response is reinforced.
TINNY: Right again. In the following example tell me which is the response being conditioned, what the reinforcing stimulus is, and what schedule of reinforcement is in effect. There is a young boy four years old. Whenever he doesn't get his way he throws a tantrum. Sometimes his parents give in to him as soon as tantrums begin, but other times, since they are becoming accustomed to his tantrums, they let him go on with his tantrums for a long time until the noise and screaming rises so high it becomes unbearable and they can't stand it anymore.
PHILOSOPHER-SCIENTIST: That was a more complicated example than we have discussed so far.
TINNY: I don't want to know if you can just repeat answers to questions about learning we have already discussed. If you have truly understood what we have discussed so far you will be able to answer questions which push the boundaries of that knowledge.
PHILOSOPHER-SCIENTIST: Well then, in your example the little boy's tantrum behaviour is the response being conditioned. Tantrum behaviour is both being maintained at a high rate and shaped to a higher intensity by the intermittent reinforcement offered by the parents. The child is rewarded for his tantrums by getting his own way. The parents are rewarding their son's tantrum behaviour on a variable-ratio schedule. This ensures that the tantrums will occur more often and that they will be very resistant to extinction.
TINNY: That was a very good answer. You saw that given those conditions the tantrum behaviour will not only increase in number but will become louder and more violent. It has happened in such pathological situations that children have come to harm themselves or others severely during uncontrolled tantrums. That critical situation doesn't arise suddenly, but is conditioned through a series of small steps over a long period of time.
PHILOSOPHER-SCIENTIST: Why don't the parents quit reinforcing such tantrum behaviour before it becomes so harmful?
TINNY: Because they often don't understand even the simplest workings of the laws of learning. The parents might attribute the child's tantrums to all sorts of influences, but not to the obvious conditioning process which actually controls the tantrum behaviour.
PHILOSOPHER-SCIENTIST: What kind of things might the parents attribute the child's tantrums to?
TINNY: The child's tantrums would often be attributed to some vague inner state, such as "that is his nature", "he was born bad", or, "that runs in the family". All those statements imply the child's tantrum behaviour is actually an inherited trait, somehow carried in the genes, and therefore not susceptible to parental influence. This allows the parents to disown responsibility for their part in the development of the child's tantrum behaviour.
PHILOSOPHER-SCIENTIST: Are the parents solely to blame for their children's actions?
TINNY: Particularly when children are very young, the parents must accept a great deal of the responsibility for the behavioural patterns their children develop. I purposefully used the word 'responsibility' instead of the word 'blame'. Responsibility only indicates the degree of significance of the natural influences which occur in the early parent-child relationships. Blame has an associated meaning beyond responsibility, as if the person has done something bad and perhaps should be punished. Blame often results in guilt. Guilt is not a beneficial state of mind. Responsibility should be accepted objectively, with no feelings of guilt. Responsibility for something which has gone wrong should lead to different behaviour in the future. Guilt need not be felt for most wrong behaviour. Wrong behaviour should be recognised for what it is and serve as a cue to behave differently in the future. In such a case the wrong behaviour functions as a discriminative stimulus which indicates a different, more positive behaviour will be reinforced.
PHILOSOPHER-SCIENTIST: Guilt is a form of punishment, isn't it?
TINNY: Guilt is one of the most common forms of punishment at the human level of existence.
PHILOSOPHER-SCIENTIST: Before we get away from the example of the learned tantrum behaviour, I wanted you to define a word you used in that example. It was the word 'pathological'.
TINNY: The word pathological refers to disease or illness. I used the word to indicate the sickness of human society which results in certain negative behaviour and the social interactions which condition those wrong behaviours.
PHILOSOPHER-SCIENTIST: So you are calling that conditioning of wrong behaviours 'sick', in the sense of mental illness.
TINNY: Both the tantrum behaviour and the interactions between the parents and child which conditioned that tantrum behaviour are symptoms of an insane society.
PHILOSOPHER-SCIENTIST: Many, perhaps most, children throw tantrums sometime in their lives. What does that say about their families?
TINNY: It says more about society than about the individual families. Sick behaviour is the norm in an insane society.
PHILOSOPHER-SCIENTIST: In a truly sane society wouldn't there still be some tantrums?
TINNY: In a truly sane society there would be no tantrums.
PHILOSOPHER-SCIENTIST: Do you have more questions for me?
TINNY: I still want to ask more about the example of tantrum behaviour. Are the parents being conditioned on any schedule of reinforcement? If they are, what type of schedule is in effect; what is the response being conditioned; and, what is the reinforcer?
PHILOSOPHER-SCIENTIST: Yes, the parents' behaviour is also being conditioned. The parents are on a continuous reinforcement schedule. The response being reinforced is giving in to their child's demands. The reinforcing stimulus in this case is the termination of the tantrum behaviour. Every time they give in to their child's demands the tantrum stops.
TINNY: It's not surprising the parents find it rewarding to be able to put an end to a tantrum. They are being conditioned to give in to their child's demands by negative reinforcement. Do you remember the definition of negative reinforcement?
PHILOSOPHER-SCIENTIST: Negative reinforcement increases the rate of a response which puts an end to an ongoing aversive stimulus. In this example it seems like the parents are doing their child a great disservice by giving in to him while he is throwing a tantrum.
TINNY: They are teaching him something that will bring him much harm and unhappiness throughout his life.
PHILOSOPHER-SCIENTIST: Why would they ever begin giving in to tantrums in the first place?
TINNY: It is possible that the parents never began giving in to tantrums.