Operant conditioning (also called "instrumental conditioning") is a learning process in which behavior is sensitive to, or controlled by, its consequences. For example, a child may learn to open a box to get the candy inside, or learn to avoid touching a hot stove. In contrast, classical conditioning causes a stimulus to signal a positive or negative consequence; the resulting behavior does not produce the consequence. For example, the sight of a colorful wrapper comes to signal "candy", causing a child to salivate, or the sound of a door slam comes to signal an angry parent, causing a child to tremble. The study of animal learning in the 20th century was dominated by the analysis of these two sorts of learning, and they are still at the core of behavior analysis.
- 1 Historical note
  - 1.1 Thorndike's law of effect
  - 1.2 Skinner
- 2 Concepts and procedures
  - 2.1 Origins of operant behavior: operant variability
  - 2.2 Modifying operant behavior: reinforcement and shaping
    - 2.2.1 Factors that alter the effectiveness of reinforcement and punishment
    - 2.2.2 Shaping
  - 2.3 Stimulus control of operant behavior
  - 2.4 Behavioral sequences: conditioned reinforcement and chaining
  - 2.5 Escape and Avoidance
    - 2.5.1 Discriminated avoidance learning
    - 2.5.2 Free-operant avoidance learning
    - 2.5.3 Two-process theory of avoidance
    - 2.5.4 Operant or "one-factor" theory
  - 2.6 Some other terms and procedures
    - 2.6.1 Noncontingent reinforcement
    - 2.6.2 Schedules of reinforcement
    - 2.6.3 Discrimination, generalization & context
    - 2.6.4 Operant hoarding
- 3 Operant conditioning to change human behavior
- 4 Biological correlates of operant conditioning
- 5 Operant conditioning in economics
- 6 Questions about the law of effect
- 7 See also
- 8 References
- 9 External links
Thorndike's law of effect
Operant conditioning, sometimes called instrumental learning, was first extensively studied by Edward L. Thorndike (1874–1949), who observed the behavior of cats trying to escape from home-made puzzle boxes. A cat could escape from the box by a simple response such as pulling a cord or pushing a pole, but when first constrained the cats took a long time to get out. With repeated trials ineffective responses occurred less frequently and successful responses occurred more frequently, so the cats escaped more and more quickly. Thorndike generalized this finding in his law of effect, which states that behaviors followed by satisfying consequences tend to be repeated and those that produce unpleasant consequences are less likely to be repeated. In short, some consequences strengthen behavior and some consequences weaken behavior. By plotting escape time against trial number Thorndike produced the first known animal learning curves through this procedure.
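The selection process that the law of effect describes can be illustrated with a toy simulation. The sketch below is not Thorndike's procedure or data: the response names, the starting strengths, and the `gain` and `loss` multipliers are hypothetical values chosen only for illustration. On each attempt a response is picked with probability proportional to its current strength; the one effective response is strengthened when it opens the box, and each ineffective response is weakened.

```python
import random


def simulate_puzzle_box(trials=20, seed=0, gain=1.5, loss=0.8):
    """Toy law-of-effect model; returns (attempts per trial, final strengths)."""
    rng = random.Random(seed)
    # Hypothetical response repertoire, all equally strong at the start.
    strengths = {
        "pull cord": 1.0,
        "scratch bars": 1.0,
        "push wall": 1.0,
        "meow": 1.0,
    }
    effective = "pull cord"  # assumption: only this response opens the box
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            names = list(strengths)
            weights = [strengths[name] for name in names]
            choice = rng.choices(names, weights=weights)[0]
            if choice == effective:
                # A "satisfying" consequence strengthens the response.
                strengths[choice] *= gain
                break
            # An ineffective response is weakened.
            strengths[choice] *= loss
        attempts_per_trial.append(attempts)
    return attempts_per_trial, strengths


attempts, final_strengths = simulate_puzzle_box()
```

Plotting `attempts` against trial number tends to give a falling curve of the same general kind as Thorndike's escape-time plots, though its exact shape depends entirely on the assumed parameters.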
Humans appear to learn many simple behaviors through the sort of process studied by Thorndike, now called operant conditioning. That is, responses are retained when they lead to a successful outcome and discarded when they do not, or when they produce aversive effects. This usually happens without being planned by any "teacher", but operant conditioning has been used by parents in teaching their children for thousands of years.
Skinner
B. F. Skinner's book "The Behavior of Organisms", published in 1938, initiated his lifelong study of operant conditioning and its application to human and animal behavior. Following the ideas of Ernst Mach, Skinner rejected Thorndike's reference to unobservable mental states such as satisfaction, building his analysis on observable behavior and its equally observable consequences.
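To implement his empirical approach, Skinner invented the operant conditioning chamber, or "Skinner box", in which he studied how schedules of reinforcement affect the rate of responding. The effects of schedules became, in turn, the basic findings from which Skinner developed his account of operant conditioning. He also drew on many less formal observations of human and animal behavior.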
Many of Skinner's writings are devoted to the application of operant conditioning to human behavior. In 1948 he published the novel "Walden Two".
See also
- Animal testing
- Applied behavior analysis (ABA; application of operant and classical conditioning)
- Behavioral contrast
- Behaviorism (philosophy behind behavior analysis)
- Behavior modification (old expression for ABA; modifies behavior with consequences not antecedents)
- Child grooming
- Cognitivism (psychology) (theory of internal mechanisms without reference to behavior)
- Consumer demand tests (animals)
- Educational psychology
- Educational technology
- Experimental analysis of behavior (research in operant and classical conditioning)
- Exposure therapy
- Jerzy Konorski
- Learned industriousness
- Matching law
- Negative (positive) contrast effect
- Radical behaviorism (B.F. Skinner's philosophy)
- Reinforcement learning
- Reward system
- Power and control in abusive relationships
- Preference tests (animals)
- Premack principle
- Psychological manipulation
- Social conditioning
- Spontaneous recovery
- Traumatic bonding
Questions about the law of effect
A number of observations seem to show that operant behavior can be established without reinforcement as it is usually defined. The most cited is the phenomenon of autoshaping (sometimes called "sign tracking"), in which a stimulus is repeatedly followed by reinforcement, and in consequence the animal begins to respond to the stimulus. For example, a response key is lighted and then food is presented. When this is repeated a few times, a pigeon begins to peck the key even though food comes whether the bird pecks or not. Similarly, rats begin to handle small objects, such as a lever, when food is presented nearby. Strikingly, pigeons and rats persist in this behavior even when pecking the key or pressing the lever leads to less food (omission training).
These observations and others appear to contradict the law of effect, and they have prompted some researchers to propose new conceptualizations of operant reinforcement. A more general view is that autoshaping is an instance of classical conditioning; the autoshaping procedure has, in fact, become one of the most common ways to measure classical conditioning. In this view, many behaviors can be influenced by both classical contingencies (stimulus-response) and operant contingencies (response-reinforcement), and the experimenter's task is to work out how these interact.
Operant conditioning in economics
Both psychologists and economists have become interested in applying operant concepts and findings to the behavior of humans in the marketplace. An example is the analysis of consumer demand, as indexed by the amount of a commodity that is purchased. In economics, the degree to which price influences consumption is called "the elasticity of demand." Certain commodities are more elastic than others; for example, a change in price of certain foods may have a large effect on the amount bought, while gasoline and other essentials may be less affected by price changes. In terms of operant analysis, such effects may be interpreted in terms of motivations of consumers and the relative value of the commodities as reinforcers.
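As a rough illustration of the elasticity idea, the sketch below computes the simple percentage-change form of price elasticity of demand: the proportional change in the quantity purchased divided by the proportional change in price. The function name and the example prices and quantities are made up for illustration and are not taken from any study.

```python
def price_elasticity(price_before, price_after, quantity_before, quantity_after):
    """Return the simple (percentage-change) price elasticity of demand."""
    if price_before == price_after:
        raise ValueError("price must change to compute an elasticity")
    quantity_change = (quantity_after - quantity_before) / quantity_before
    price_change = (price_after - price_before) / price_before
    return quantity_change / price_change


# Hypothetical numbers: a 10% price rise cuts snack purchases by 20%
# but cuts gasoline purchases by only 2%.
snack = price_elasticity(10, 11, 100, 80)
gasoline = price_elasticity(10, 11, 100, 98)

# A larger magnitude means demand is more elastic (more sensitive to price).
snack_is_more_elastic = abs(snack) > abs(gasoline)
```

In operant terms, a commodity with low elasticity behaves like a reinforcer whose consumption is defended even as the "cost" of obtaining it rises.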
Biological correlates of operant conditioning
The first scientific studies identifying neurons that responded in ways that suggested they encode for conditioned stimuli came from work by Mahlon DeLong and by R.T. Richardson. They showed that nucleus basalis neurons, which release acetylcholine broadly throughout the cerebral cortex, are activated shortly after a conditioned stimulus, or after a primary reward if no conditioned stimulus exists. These neurons are equally active for positive and negative reinforcers, and have been shown to be related to neuroplasticity in many cortical regions. Evidence also exists that dopamine is activated at similar times. There is considerable evidence that dopamine participates in both reinforcement and aversive learning. Dopamine pathways project much more densely onto frontal cortex regions. Cholinergic projections, in contrast, are dense even in posterior cortical regions such as the primary visual cortex. A study of patients with Parkinson's disease, a condition attributed to the insufficient action of dopamine, further illustrates the role of dopamine in positive reinforcement. It showed that while off their medication, patients learned more readily with aversive consequences than with positive reinforcement. Patients who were on their medication showed the opposite pattern: positive reinforcement proved to be the more effective form of learning when dopamine activity was high.
A neurochemical process involving dopamine has been suggested to underlie reinforcement. When an organism experiences a reinforcing stimulus, dopamine pathways in the brain are activated. This network of pathways "releases a short pulse of dopamine onto many dendrites, thus broadcasting a rather global reinforcement signal to postsynaptic neurons." This allows recently activated synapses to increase their sensitivity to efferent (conducting outward) signals, thus increasing the probability of occurrence for the recent responses that preceded the reinforcement. These responses are, statistically, the most likely to have been the behavior responsible for successfully achieving reinforcement. But when the application of reinforcement is either less immediate or less contingent (less consistent), the ability of dopamine to act upon the appropriate synapses is reduced.
Operant conditioning to change human behavior
Applied behavior analysis, the discipline directly descended from Skinner's work, uses four terms: conditioned stimulus (SC), discriminative stimulus (Sd), response (R), and reinforcing stimulus (Srein or Sr for reinforcers, sometimes Save for aversive stimuli). The conditioned stimulus controls behaviors developed through respondent (classical) conditioning, such as emotional reactions. The other three terms combine to form Skinner's "three-term contingency": the discriminative stimulus sets the occasion for responses that lead to reinforcement. Researchers have found the following protocol to be effective when they use the tools of operant conditioning to modify human behavior:
- State goal: Clarify exactly what changes are to be brought about. For example, "reduce weight by 30 pounds."
- Monitor behavior: Keep track of behavior so that one can see whether the desired effects are occurring. For example, keep a chart of daily weights.
- Reinforce desired behavior: For example, congratulate the individual on weight losses. With humans, a record of behavior may itself serve as a reinforcer. For example, when a participant sees a pattern of weight loss, this may reinforce continuance in a behavioral weight-loss program. A more general plan is the token economy, an exchange system in which tokens are given as rewards for desired behaviors. Tokens may later be exchanged for a desired prize or rewards such as power, prestige, goods or services.
- Reduce incentives to perform undesirable behavior: For example, remove candy and fatty snacks from kitchen shelves.
Operant hoarding
Operant hoarding refers to the observation that rats reinforced in a certain way may allow food pellets to accumulate in a food tray instead of retrieving those pellets. In this procedure, retrieval of the pellets always instituted a one-minute period of extinction during which no additional food pellets were available but those that had been accumulated earlier could be consumed. This finding appears to contradict the usual finding that rats behave impulsively in situations in which there is a choice between a smaller food object right away and a larger food object after some delay. See schedules of reinforcement.
Discrimination, generalization & context
Most behavior is under stimulus control. Several aspects of this may be distinguished:
- "Discrimination" typically occurs when a response is reinforced only in the presence of a specific stimulus. For example, a pigeon might be fed for pecking at a red light and not at a green light; in consequence, it pecks at red and stops pecking at green. Many complex combinations of stimuli and other conditions have been studied; for example an organism might be reinforced on an interval schedule in the presence of one stimulus and on a ratio schedule in the presence of another.
- "Generalization" is the tendency to respond to stimuli that are similar to a previously trained discriminative stimulus. For example, having been trained to peck at "red" a pigeon might also peck at "pink", though usually less strongly.
- "Context" refers to stimuli that are continuously present in a situation, like the walls, tables, chairs, etc. in a room, or the interior of an operant conditioning chamber. Context stimuli may come to control behavior as do discriminative stimuli, though usually more weakly. Behaviors learned in one context may be absent, or altered, in another. This may cause difficulties for behavioral therapy, because behaviors learned in the therapeutic setting may fail to occur elsewhere.
Schedules of reinforcement
Schedules of reinforcement are rules that control the delivery of reinforcement. The rules specify either the time that reinforcement is to be made available, or the number of responses to be made, or both.
- Fixed interval schedule: Reinforcement occurs following the first response after a fixed time has elapsed after the previous reinforcement.
- Variable interval schedule: Reinforcement occurs following the first response after a variable time has elapsed from the previous reinforcement.
- Fixed ratio schedule: Reinforcement occurs after a fixed number of responses have been emitted since the previous reinforcement.
- Variable ratio schedule: Reinforcement occurs after a variable number of responses have been emitted since the previous reinforcement.
- Continuous reinforcement: Reinforcement occurs after each response.
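Because the schedules listed above are rules, they can be written down directly as small programs. The following is a minimal sketch, not laboratory software: the class names, the `respond` method, and the uniform distributions used for the variable schedules are assumptions made for illustration (real experiments use a variety of distributions). Each call to `respond(time)` registers one response and returns whether that response is reinforced.

```python
import random


class RatioSchedule:
    """Reinforce after a required number of responses since the last reinforcer."""

    def __init__(self, mean_responses, variable=False, seed=0):
        self.mean = mean_responses
        self.variable = variable
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean
        # Assumption: requirements are uniform on 1 .. 2*mean - 1 (average = mean).
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, time):
        """Register a response; return True if it is reinforced."""
        self.count += 1
        if self.count < self.required:
            return False
        self.count = 0
        self.required = self._next_requirement()
        return True


class IntervalSchedule:
    """Reinforce the first response after a required time since the last reinforcer."""

    def __init__(self, mean_interval, variable=False, seed=0):
        self.mean = mean_interval
        self.variable = variable
        self.rng = random.Random(seed)
        self.last_reinforcer = 0.0  # assumption: the clock starts at session start
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean
        # Assumption: intervals are uniform on 0 .. 2*mean (average = mean).
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, time):
        """Register a response at `time`; return True if it is reinforced."""
        if time - self.last_reinforcer < self.required:
            return False
        self.last_reinforcer = time
        self.required = self._next_requirement()
        return True


fixed_ratio = RatioSchedule(3)           # every third response is reinforced
continuous = RatioSchedule(1)            # continuous reinforcement
fixed_interval = IntervalSchedule(10.0)  # first response after 10 time units

fr_outcomes = [fixed_ratio.respond(t) for t in range(1, 7)]
crf_outcomes = [continuous.respond(t) for t in range(1, 4)]
fi_outcomes = [fixed_interval.respond(t) for t in (4, 10, 12, 19, 21)]
```

In this sketch continuous reinforcement is simply a fixed ratio of one, and passing `variable=True` turns either class into its variable counterpart.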
Noncontingent reinforcement
Noncontingent reinforcement is the delivery of reinforcing stimuli regardless of the organism's behavior. Noncontingent reinforcement may be used in an attempt to reduce an undesired target behavior by reinforcing multiple alternative responses while extinguishing the target response. As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".
Some other terms and procedures
Operant or "one-factor" theory
Some theorists suggest that avoidance behavior may simply be a special case of operant behavior maintained by its consequences. In this view the idea of "consequences" is expanded to include sensitivity to a pattern of events. Thus, in avoidance, the consequence of a response is a reduction in the rate of aversive stimulation. Indeed, experimental evidence suggests that a "missed shock" is detected as a stimulus, and can act as a reinforcer. Cognitive theories of avoidance take this idea a step further. For example, a rat comes to "expect" shock if it fails to press a lever and to "expect no shock" if it presses it, and avoidance behavior is strengthened if these expectancies are confirmed.
Two-process theory of avoidance
This theory was originally proposed in order to explain discriminated avoidance learning, in which an organism learns to avoid an aversive stimulus by escaping from a signal for that stimulus. Two processes are involved: classical conditioning of the signal followed by operant conditioning of the escape response:
- a) Classical conditioning of fear. Initially the organism experiences the pairing of a CS with an aversive US. The theory assumes that this pairing creates an association between the CS and the US through classical conditioning and, because of the aversive nature of the US, the CS comes to elicit a conditioned emotional reaction (CER), "fear."
- b) Reinforcement of the operant response by fear-reduction. As a result of the first process, the CS now signals fear; this unpleasant emotional reaction serves to motivate operant responses, and responses that terminate the CS are reinforced by fear termination.
Note that the theory does not say that the organism "avoids" the US in the sense of anticipating it, but rather that the organism "escapes" an aversive internal state that is caused by the CS. Several experimental findings seem to run counter to two-factor theory. For example, avoidance behavior often extinguishes very slowly even when the initial CS-US pairing never occurs again, so the fear response might be expected to extinguish (see Classical conditioning). Further, animals that have learned to avoid often show little evidence of fear, suggesting that escape from fear is not necessary to maintain avoidance behavior.
Free-operant avoidance learning
In free-operant avoidance a subject periodically receives an aversive stimulus (often an electric shock) unless an operant response is made; the response delays the onset of the shock. In this situation, unlike discriminated avoidance, no prior stimulus signals the shock. Two crucial time intervals determine the rate of avoidance learning. The first is the S-S (shock-shock) interval. This is the time between successive shocks in the absence of a response. The second is the R-S (response-shock) interval. This specifies the time by which an operant response delays the onset of the next shock. Note that each time the subject performs the operant response, the R-S interval without shock begins anew.
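The interplay of the S-S and R-S intervals is easy to lose track of in prose, so here is a minimal sketch that computes when shocks would occur for a given list of response times. The function name, the time units, and the choice to start the first S-S interval at the beginning of the session are assumptions made for illustration; shocks are treated as instantaneous events.

```python
def shock_times(response_times, ss_interval, rs_interval, session_length):
    """Return the times at which shocks occur in a free-operant avoidance session."""
    shocks = []
    # Assumption: the first shock is due one S-S interval after the session starts.
    next_shock = ss_interval
    for response in sorted(response_times):
        if response >= session_length:
            break
        # Deliver every shock that came due at or before this response.
        while next_shock <= response:
            shocks.append(next_shock)
            next_shock += ss_interval
        # Each response restarts the R-S interval.
        next_shock = response + rs_interval
    # After the last response, shocks recur at the S-S interval until the session ends.
    while next_shock < session_length:
        shocks.append(next_shock)
        next_shock += ss_interval
    return shocks


# S-S = 5, R-S = 20, session of 60 time units (hypothetical values).
no_responding = shock_times([], 5, 20, 60)
occasional = shock_times([3, 20, 38], 5, 20, 60)
steady = shock_times([0, 15, 30, 45], 5, 20, 60)
```

With these values, a subject that never responds is shocked every 5 time units, while one that responds at least once every 20 time units receives no shocks at all, which is the contingency that maintains avoidance responding.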
Discriminated avoidance learning
A discriminated avoidance experiment involves a series of trials in which a neutral stimulus such as a light is followed by an aversive stimulus such as a shock. After the neutral stimulus appears, an operant response such as a lever press prevents or terminates the aversive stimulus. In early trials the subject does not make the response until the aversive stimulus has come on, so these early trials are called "escape" trials. As learning progresses, the subject begins to respond during the neutral stimulus and thus prevents the aversive stimulus from occurring. Such trials are called "avoidance trials." This experiment is said to involve classical conditioning, because a neutral CS is paired with an aversive US; this idea underlies the two-factor theory of avoidance learning.
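The distinction between escape and avoidance trials comes down to whether the response occurs before or after the aversive stimulus is due. The sketch below scores trials on that basis; the function name, the latency values, and the 10-unit delay between the neutral stimulus and the aversive stimulus are hypothetical.

```python
def classify_trial(response_latency, cs_us_interval):
    """Classify one discriminated-avoidance trial.

    response_latency: time from onset of the neutral stimulus (CS) to the
        response, or None if no response was made.
    cs_us_interval: time from CS onset to onset of the aversive stimulus (US).
    """
    if response_latency is None:
        return "no response"
    if response_latency < cs_us_interval:
        # The response came during the CS, so the US never occurs.
        return "avoidance"
    # The response came after the US had started and terminates it.
    return "escape"


# Hypothetical latencies that shorten over trials as learning progresses.
latencies = [14.0, 12.5, 11.0, 9.0, 6.5, 4.0]
labels = [classify_trial(latency, cs_us_interval=10.0) for latency in latencies]
avoidance_rate = labels.count("avoidance") / len(labels)
```

Tracking the proportion of avoidance trials across blocks of trials is one simple way to summarize the progress of learning in such an experiment.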
Escape and Avoidance
In escape learning, a behavior terminates an (aversive) stimulus. For example, shielding one's eyes from sunlight terminates the (aversive) stimulation of bright light in one's eyes. (This is an example of negative reinforcement.) Behavior that is maintained by preventing a stimulus is called "avoidance," as, for example, putting on sunglasses before going outdoors. Avoidance behavior raises the so-called "avoidance paradox", for, it may be asked, how can the non-occurrence of a stimulus serve as a reinforcer? This question is addressed by several theories of avoidance, including the two-process theory and the operant or "one-factor" theory.
Two kinds of experimental settings are commonly used: discriminated and free-operant avoidance learning.
Behavioral sequences: conditioned reinforcement and chaining
Most behavior cannot easily be described in terms of individual responses reinforced one by one. The scope of operant analysis is expanded through the idea of behavioral chains, which are sequences of responses bound together by three-term contingencies (see Stimulus control of operant behavior). Chaining is based on the fact, experimentally demonstrated, that a discriminative stimulus not only sets the occasion for subsequent behavior, but can also reinforce a behavior that precedes it. That is, a discriminative stimulus is also a "conditioned reinforcer". For example, the light that sets the occasion for lever pressing may be used to reinforce "turning around" in the presence of a noise. This results in the sequence "noise - turn-around - light - press lever - food". Much longer chains can be built by adding more stimuli and responses.
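One way to make the structure of a chain explicit is to store it as a list of three-term links and check that each link's outcome is the discriminative stimulus of the next. This is only an illustrative data structure; the `Link` type and the helper names are hypothetical, and the example chain is the noise-light-food sequence from the text.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """One three-term contingency: stimulus -> response -> outcome."""
    stimulus: str
    response: str
    outcome: str


def is_valid_chain(links: List[Link]) -> bool:
    """Each outcome must serve as the discriminative stimulus of the next link."""
    return all(a.outcome == b.stimulus for a, b in zip(links, links[1:]))


def conditioned_reinforcers(links: List[Link]) -> List[str]:
    """Outcomes that reinforce one response and set the occasion for the next."""
    return [link.outcome for link in links[:-1]]


def describe(links: List[Link]) -> str:
    """Write the chain out as a single sequence of stimuli and responses."""
    parts = []
    for link in links:
        parts += [link.stimulus, link.response]
    parts.append(links[-1].outcome)
    return " - ".join(parts)


chain = [
    Link("noise", "turn around", "light"),
    Link("light", "press lever", "food"),
]
sequence = describe(chain)
```

Longer chains are built by adding further links, provided each new link's stimulus matches the outcome of the link before it.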
Stimulus control of operant behavior
Though initially operant behavior is emitted without reference to a particular stimulus, during operant conditioning operants come under the control of stimuli that are present when behavior is reinforced. Such stimuli are called "discriminative stimuli." A so-called "three-term contingency" is the result. That is, discriminative stimuli set the occasion for responses that produce reward or punishment. Thus, a rat may be trained to press a lever only when a light comes on; a dog rushes to the kitchen when it hears the rattle of its food bag; a child reaches for candy when she sees it on a table.
Factors that alter the effectiveness of reinforcement and punishment
The effectiveness of reinforcement and punishment can be changed in various ways.
- Satiation/Deprivation: The effectiveness of a positive or "appetitive" stimulus will be reduced if the individual has received enough of that stimulus to satisfy its appetite. The opposite effect will occur if the individual becomes deprived of that stimulus: the effectiveness of a consequence will then increase. If someone is not hungry, food will not be an effective reinforcer for behavior.
- Immediacy: An immediate consequence is more effective than a delayed consequence. If one gives a dog a treat right after it sits, the dog will learn faster than if the treat is given later.
- Contingency: To be most effective, reinforcement should occur consistently after responses and not at other times. Learning may be slower if reinforcement is intermittent, that is, following only some instances of the same response, but responses reinforced intermittently are usually much slower to extinguish than are responses that have always been reinforced.
- Size: The size, or amount, of a stimulus often affects its potency as a reinforcer. Humans and animals engage in a sort of "cost-benefit" analysis. A tiny amount of food may not "be worth" an effortful lever press for a rat. A pile of quarters from a slot machine may keep a gambler pulling the lever longer than a single quarter.
Most of these factors serve biological functions. For example, the process of satiation helps the organism maintain a stable internal environment (homeostasis).
Shaping
Shaping is a conditioning method much used in animal training and in teaching non-verbal humans. It depends on operant variability and reinforcement. The trainer starts by identifying the desired final (or "target") behavior. Next, the trainer chooses a behavior that the animal or person already emits with some probability. The form of this behavior is then gradually changed across successive trials by reinforcing behaviors that approximate the target behavior more and more closely. When the target behavior is finally emitted, it may be strengthened and maintained by the use of a schedule of reinforcement (see Schedules of reinforcement).
Modifying operant behavior: reinforcement and shaping
Reinforcement and punishment are the core tools through which operant behavior is modified. These terms are defined by their effect on behavior. Either may be positive or negative, as described below.
- Positive reinforcement and negative reinforcement increase the probability of the behavior they follow, while positive punishment and negative punishment reduce the probability of the behavior they follow.
There is an additional procedure:
- Extinction occurs when a previously reinforced behavior is no longer reinforced with either positive or negative reinforcement. During extinction the behavior becomes less probable.
Thus there are a total of five basic consequences:
- Positive reinforcement (reinforcement): This occurs when a behavior (response) is followed by a stimulus that is appetitive or rewarding, increasing the frequency of that behavior. For example, if a rat in a Skinner box gets food when it presses a lever, its rate of pressing will go up. This procedure is usually called simply reinforcement.
- Negative reinforcement (escape): This occurs when a behavior (response) is followed by the removal of an aversive stimulus, thereby increasing that behavior's frequency. In the Skinner box experiment, the aversive stimulus might be a loud noise continuously sounding inside the box; negative reinforcement would happen when the rat presses a lever, turning off the noise.
- Positive punishment: This occurs when a behavior (response) is followed by a stimulus, such as a shock or loud noise, which results in a decrease in that behavior. Positive punishment is a rather confusing term, and usually the procedure is simply called "punishment."
- Negative punishment (penalty) (also called "punishment by contingent withdrawal"): This occurs when a behavior (response) is followed by the removal of a stimulus, such as taking away a child's toy following an undesired behavior, resulting in a decrease in that behavior.
- Extinction: This occurs when a behavior (response) that had previously been reinforced is no longer effective. For example, a rat is first given food many times for lever presses. Then, in "extinction", no food is given. Typically the rat continues to press more and more slowly and eventually stops, at which time lever pressing is said to be "extinguished."
It is important to note that actors (e.g. a rat) are not spoken of as being reinforced, punished, or extinguished; it is the actions (e.g. a lever press) that are reinforced, punished, or extinguished. Also, reinforcement, punishment, and extinction are not terms whose use is restricted to the laboratory. Naturally occurring consequences can also reinforce, punish, or extinguish behavior and are not always planned or delivered by people.
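The four reinforcement and punishment procedures differ along just two dimensions: whether a stimulus is added or removed after the response, and whether the behavior then becomes more or less frequent. The lookup below is a small sketch of that two-by-two scheme; the function name and the string labels for its arguments are chosen here for illustration.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the procedure from what happens to the stimulus and to the behavior.

    stimulus_change: "added" or "removed" (what follows the response)
    behavior_change: "increases" or "decreases" (effect on response frequency)
    """
    table = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    return table[(stimulus_change, behavior_change)]


# Examples from the text: food follows a lever press and pressing goes up;
# a toy is taken away and the undesired behavior goes down.
food_for_pressing = classify_consequence("added", "increases")
toy_taken_away = classify_consequence("removed", "decreases")
```

Note that the classification depends on the observed effect on behavior, not on whether the stimulus seems pleasant or unpleasant to an observer, which matches the point that these terms are defined by their effect.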
Origins of operant behavior: operant variability
Operant behavior is said to be "emitted"; that is, initially it is not elicited by any particular stimulus. Thus one may ask why it happens in the first place. The answer to this question is like Darwin's answer to the question of the origin of a "new" bodily structure, namely, variation and selection. Similarly, the behavior of an individual varies from moment to moment, in such aspects as the specific motions involved, the amount of force applied, or the timing of the response. Variations that lead to reinforcement are strengthened, and if reinforcement is consistent, the behavior tends to remain stable. However, behavioral variability can itself be altered through the manipulation of certain variables.
Concepts and procedures
In 1957, Skinner published Verbal Behavior, which extended the principles of operant conditioning to language, a form of human behavior that had previously been analyzed quite differently by linguists and others. Skinner defined new functional relationships such as "mands" and "tacts" to capture some essentials of language, but he introduced no new principles, treating verbal behavior like any other behavior controlled by its consequences, which included the reactions of the speaker's audience.