Trial-and-Error Learning: Taking a Rocky Road
It is instructive to note that one of the most popular books on writing ever published
is called Trial and Error by the novelist Jack Woodford. It sold many copies
over a number of years, and communicated to would-be authors that the only
way to learn to write was by taking the rocky road of making one's own mistakes.
The first kind of learning to be studied experimentally in the United States was
trial-and-error learning. Edward L. Thorndike (1874–1949) first studied maze
learning in baby chickens (with the assistance and approval of William James). Later
he studied the escape behavior of cats from puzzle boxes. The cats had to learn to
pull a string that released a latch connected to a door. The cats learned to pull the
string, but only very gradually. They showed no sudden burst of insight or comprehension.
Thorndike concluded that the learning was a robotlike process controlled
primarily by its outcomes. If a specific behavior helped a cat to escape, that
behavior was retained by the cat. Thorndike called this process stamping in,
meaning that an action that is useful is impressed upon the nervous system.
What stamps in a response, according to Thorndike, is satisfaction. The cat
that escapes from a puzzle box is rewarded with food. Thorndike called the tendency
to retain what is learned because satisfactory results are obtained the law of
effect. Thorndike's law of effect is the forerunner of what today is usually known
as the process of reinforcement (see the next section).
(a) If a specific behavior helps a cat to escape from a puzzle box, this behavior is retained by
the cat. Thorndike called this process __________.
(b) Thorndike's law of effect is the forerunner of what today is usually known as the
process of __________.
Answers: (a) stamping in; (b) reinforcement.
Operant Conditioning: How Behavior Is Shaped by Its
Own Consequences
Operant behavior is characterized by actions that have consequences. Flick a
light switch and the consequence is illumination. Saw on a piece of wood and the
consequence is two shorter pieces of wood. Tell a joke and the consequence is
(sometimes) the laughter of others. Work hard at a job all week and the consequence
is a paycheck. In each of these cases the specified action "operates" on the
environment, changing it in some way.
It was B. F. Skinner (1904–1990) who applied the term operant to the kind
of behaviors described above. He saw that operant behavior is both acquired and
shaped by experience. Consequently, he identified it as a kind of learning. He
also categorized it as a form of conditioning because he believed that
such concepts as consciousness and thinking are not necessary to explain much
(perhaps most) operant behavior.
Skinner, long associated with Harvard, invented a device called the operant
conditioning apparatus; its informal name is the Skinner box. Think of the
apparatus as something like a candy machine for animals such as rats and
pigeons. A rat, for example, learns that it can obtain a pellet of food when it
presses a lever. If the pellet appears each time the lever is pressed, the rate of
lever pressing will increase. Lever pressing is operant behavior (or simply an operant).
The pellet is a reinforcer. A reinforcer is a stimulus that has the effect of
increasing the frequency of a given category of behavior (in this case, lever
pressing).
(a) Operant behavior is characterized by actions that have __________.
(b) The formal term for a Skinner box is the __________.
Answers: (a) consequences; (b) operant conditioning apparatus.
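The defining property of a reinforcer, that it raises the future frequency of the behavior it follows, can be pictured with a small simulation. The sketch below is only a toy illustration of that definition, written in Python; the starting probability, the size of the increase, and the update rule are invented for the example and are not drawn from Skinner's own work.

    import random

    press_probability = 0.05   # the rat starts out pressing the lever only rarely
    LEARNING_STEP = 0.03       # invented amount by which the tendency grows after a reinforced press

    for minute in range(1, 61):                    # one simulated hour in the Skinner box
        if random.random() < press_probability:   # the operant: a lever press occurs
            # continuous reinforcement: every press is followed by a food pellet, and the
            # reinforcer's defining effect is to make lever pressing more frequent
            press_probability = min(1.0, press_probability + LEARNING_STEP)
        if minute % 15 == 0:
            print(f"minute {minute}: press probability = {press_probability:.2f}")

Because every press pays off, the tendency to press climbs steadily; that climb is what "increasing the frequency of a given category of behavior" means in practice.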
The concept of reinforcement plays a big part in Skinner's way of looking at
behavior. Consequently, it is important to expand on the concept. Note in the
above definition that a reinforcer is understood in terms of its actual effects. It is to
be distinguished from a reward. A reward is perceived as valuable to the individual
giving the reward, but it may not be valued by the receiving organism. In the
case of a reinforcer, it is a reinforcer only if it has some sort of payoff value to the
receiving organism. By definition, a reinforcer has an impact on operant behavior.
Its function is always to increase the frequency of a class of operant behaviors.
One important way to categorize reinforcers is to refer to them as positive and
negative. A positive reinforcer has value for the organism. Food when you are
hungry, water when you are thirsty, and money when you're strapped for cash all
provide examples of positive reinforcers.
(a) The function of a reinforcer is always to __________ the frequency of a class of operant
behaviors.
(b) A __________ has value for the organism.
Answers: (a) increase; (b) positive reinforcer.
A negative reinforcer has no value for the organism. It does injury or is noxious
in some way. A hot room, an offensive person, and a dangerous situation all
provide examples of negative reinforcers. The organism tends to either escape
from or avoid such reinforcers. The operant behavior takes the subject away from
the reinforcer. Turning on the air conditioner when a room is hot provides an
example of operant behavior designed to escape from a negative reinforcer. Note
that the effect of the negative reinforcer on behavior is still to increase the frequency
of a class of operants. You are more likely to turn on an air conditioner
tomorrow if you have obtained relief by doing so today.
It is also important to note that a negative reinforcer is not punishment. In the
case of punishment, an operant is followed by an adverse stimulus. For example, a
child sasses a parent and then gets slapped. Getting slapped comes after the child's
behavior. In the case of a negative reinforcer, the adverse stimulus is first in time.
Then the operant behavior of escape or avoidance follows.
(a) Operant behavior takes a subject __________ from a negative reinforcer.
(b) In the case of punishment, an operant is __________ by an adverse stimulus.
Answers: (a) away; (b) followed.
Another important way to classify reinforcers is to designate them as having
either a primary or a secondary quality. A primary reinforcer has intrinsic value
for the organism. No learning is required for the worth of the reinforcer to exist.
Food when you are hungry and water when you are thirsty are not only positive
reinforcers, as indicated above, they are also primary reinforcers.
A secondary reinforcer has acquired value for the organism. Learning is
required. Money when you're strapped for cash is a positive reinforcer, as indicated
above, but it is a secondary one. You have to learn that cash has value. An
infant does not value cash, but does value milk. A medal, a diploma, and a trophy
all provide examples of secondary reinforcers.
(a) A __________ has intrinsic value for an organism.
(b) A __________ has acquired value for an organism.
Answers: (a) primary reinforcer; (b) secondary reinforcer.
One of the important phenomena associated with operant conditioning is
extinction. Earlier, we discussed how extinction takes place when the conditioned
stimulus is presented a number of times without the unconditioned stimulus.
Extinction also takes place when the frequency of a category of operant responses
declines. If, using the operant conditioning apparatus, reinforcement is withheld
from a rat, then lever pressing for food will decline, eventually dropping to
nearly zero. The organism has learned to give up a given operant because it no
longer brings the reinforcer.
Research on extinction with both animals and humans suggests that it is a better way
to "break" bad habits than is punishment. If a way can be found to eliminate the
reinforcer (or reinforcers) linked to a behavior pattern, the behavior is likely to
be given up. Punishment tends to temporarily suppress the appearance of an
operant, but extinction has not necessarily taken place. Consequently, the
unwanted operant has "gone underground," and may in time surface as an
unpleasant surprise. Also, punishment is frustrating to organisms and tends to
make them more aggressive.
(a) Extinction takes place when the frequency of a category of operant responses
__________.
(b) Punishment is frustrating to organisms and tends to make them more __________.
Answers: (a) declines; (b) aggressive.
Another important phenomenon associated with operant conditioning is the
partial reinforcement effect, the tendency of operant behavior acquired under
conditions of partial reinforcement to possess greater resistance to extinction than
behavior acquired under conditions of continuous reinforcement. Let's say that rat
1 is reinforced every time it presses a lever; this rat is receiving continuous reinforcement.
Rat 2 is reinforced every other time it presses a lever; this rat is receiving
partial reinforcement. Both rats will eventually acquire the lever-pressing
response. Now assume that reinforcement is withheld for both rats. The rat that
will, in most cases, display greater resistance to extinction is rat 2. Skinner was surprised
by this result. If reinforcement is a kind of strengthening of a habit, then rat
1, receiving more reinforcement, should have the better-established habit.
And it should demonstrate greater resistance to extinction than rat 2.
Nonetheless, the partial reinforcement effect is a reality, and Skinner became
interested in it. He and his coworkers used many schedules of reinforcement to
study the partial reinforcement effect. In general, the effect holds for both animals
and human beings. Random reinforcement is determined by chance and is, consequently,
unpredictable. Behavior acquired with random reinforcement shows an exaggerated
partial reinforcement effect. Skinner was fond of pointing out that random payoffs are associated
with gambling. This explains to some extent why a well-established
gambling habit is hard to break.
(a) Operant behavior acquired under conditions of partial reinforcement tends to possess
greater resistance to __________ than behavior acquired under conditions of continuous
reinforcement.
(b) What kind of reinforcement is determined by chance?
Answers: (a) extinction; (b) Random reinforcement.
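A common way of making the partial reinforcement effect concrete is the discrimination idea: a rat trained on continuous reinforcement can detect almost immediately that the pellets have stopped, while a rat accustomed to only occasional payoffs needs a much longer run of unrewarded presses before extinction sets in. The Python sketch below is a toy version of that idea only; the giving-up rule and all the numbers are invented for the example.

    def presses_before_giving_up(reinforced_every_nth_press, patience_factor=5):
        """Toy rule: the rat quits after a dry run that is patience_factor times
        longer than the longest gap between pellets it experienced in training."""
        longest_training_gap = reinforced_every_nth_press   # 1 = continuous, 2 = every other press
        tolerated_dry_run = patience_factor * longest_training_gap

        presses = 0
        unreinforced_in_a_row = 0
        while unreinforced_in_a_row < tolerated_dry_run:     # extinction phase: no pellets at all
            presses += 1
            unreinforced_in_a_row += 1
        return presses

    print("Rat 1 (continuous reinforcement):", presses_before_giving_up(1))  # quits sooner
    print("Rat 2 (partial reinforcement):", presses_before_giving_up(2))     # keeps pressing longer

Under these invented numbers, rat 2 presses twice as long as rat 1 before giving up, which is the "greater resistance to extinction" described above.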
Assume that an operant conditioning apparatus contains a light bulb.
When the light is on, pressing the lever pays off. When the light is off, pressing
the lever fails to bring forth a reinforcer. Under these conditions, a trained experimental
animal will tend to display a high rate of lever pressing when the light is
on and ignore the lever when the light is off. The light is called a discriminative
stimulus, meaning a stimulus that allows the organism to tell the difference
between a situation that is potentially reinforcing and one that is not. Cues used
to train animals, such as whistles and hand signals, are discriminative stimuli.
Skinner notes that discriminative stimuli control human behavior, too. A factory
whistle communicating to workers that it's time for lunch, a bell's ring for a
prizefighter, a school bell's ring for a child, and a traffic light for a driver are all
discriminative stimuli. Stimuli can be more subtle than these examples. A lover's
facial expression or tone of voice may communicate a readiness or lack of readiness
to respond to amorous advances.
Skinner asserts that in real life both discriminative stimuli and reinforcers automatically
control much of our behavior.
A stimulus that allows the organism to tell the difference between a situation that is potentially
reinforcing and one that is not is called a __________.
Answer: discriminative stimulus.
Consciousness and Learning: What It Means to Have
an Insight
Although classical and operant conditioning play a large part in both animal and
human learning, it is generally recognized by behavioral scientists that these two
related processes give an insufficient account of the learning process, particularly
in human beings. Consequently, it is important to identify at least four additional
aspects of learning. These are (1) observational learning, (2) latent learning, (3)
insight learning, and (4) learning to learn.
Observational learning takes place when an individual acquires behavior
by watching the behavior of a second individual. Albert Bandura, a principal
researcher associated with observational learning, identified important features
of this particular process. The second individual is a model, and either intentionally
or unintentionally demonstrates behavior. If the observer identifies with
the model and gains imaginary satisfaction from the model's behavior, then
this is vicarious reinforcement. Vicarious reinforcement is characterized by
imagined gratification. Psychologically, it acts as a substitute for the real thing.
Let's say that Jonathan admires a particular tennis star. When the star wins an
important tournament, Jonathan is ecstatic. This emotional state is a vicarious
reinforcer.
It should be noted that the concept of watching a model is very general. Reading
a mystery novel and identifying with the detective is a kind of observational
behavior. The thrills associated with the hero's adventures are vicarious thrills.
(a) What kind of learning takes place when an individual acquires behavior by watching the
behavior of a second individual?
(b) A __________ either intentionally or unintentionally demonstrates behavior.
(c) __________ is characterized by imagined gratification.
Answers: (a) Observational learning; (b) model; (c) Vicarious reinforcement.
Social learning theory, associated with Bandura's research, states that much
of our behavior in reference to other people is acquired through observational
learning. Let's say that Carol is a fifteen-year-old high school student. She is on
the fringe of a group of adolescent females who admire a charismatic eighteen-year-old
named Dominique. Dominique smokes, uses obscenities, and brags
about her sexual exploits. Carol observes Dominique and obtains a lot of vicarious
reinforcement from Dominique's behavior. If Carol begins to imitate
Dominique's behavior, then social learning has taken place.
Both prosocial behavior and antisocial behavior can be acquired through
observational learning. Prosocial behavior is behavior that contributes to the
long-run goals of a traditional reference group such as the family or the population
of the nation (see chapter 16). If an individual admires one or both parents,
then the parents may be taken as role models. Many adolescents and young adults
acquire attitudes and personal habits that resemble those of their parents. A person who is
patriotic and ready to defend the nation in time of war is quite likely taking important
historical figures such as presidents and generals as role models.
Antisocial behavior is behavior that has an adverse impact on the long-run
goals of a traditional reference group. From the point of view of Carol's parents,
if Carol begins to act like Dominique, then Carol's behavior is antisocial.
(a) What theory states that much of our behavior in reference to other people is acquired
through observational learning?
(b) __________ is behavior that contributes to the long-run goals of a traditional
reference group.
(c) __________ is behavior that has an adverse impact on the long-run goals of a
traditional reference group.
Answers: (a) Social learning theory; (b) Prosocial behavior; (c) Antisocial behavior.
Latent learning is a second kind of learning in which consciousness
appears to play a large role. Pioneer research on latent learning is associated with
experiments conducted by the University of California psychologist Edward C.
Tolman and his associates. Let's say that a rat is allowed to explore a maze without
reinforcement. It seems to wander through the maze without any particular
pattern of behavior. It is probably responding to its own curiosity drive, but no
particular learning appears to be taking place. Let's say that after ten such opportunities,
reinforcement in the form of food in a goal box is introduced. The rat,
if it is typical, will quickly learn to run the maze with very few errors. Its learning
curve is highly accelerated compared to that of a rat that has not had an earlier
opportunity to explore the maze. This is because the first rat was actually
learning while it was exploring. The function of reinforcement in this case is to
act as an incentive, a stimulus that brings forth whatever learning the
organism has acquired.
Note that the learning was actually acquired when the rat was exploring.
Therefore learning was taking place without reinforcement. Such learning is called
latent learning, meaning learning that is dormant and waiting to be activated.
Let's say that Keith is an adolescent male. For years his mother has forced him,
with no particular reinforcement, to make his bed and hang up his clothes neatly.
But Keith has, from his mother's point of view, been a slow learner. He does both
tasks poorly. He enlists in the army shortly after his eighteenth birthday. In basic
training he makes his bed and hangs up his clothes neatly. He has been told that
he will obtain his first weekend pass only if he performs various tasks properly.
The fact that Keith shows a very rapid learning curve under these conditions provides
an example of latent learning. He was learning under his mother's influence,
but he wasn't motivated to bring the learning forth.
The process of latent learning calls attention to the learning-performance
distinction. Learning is an underlying process. In the case of latent learning it is
temporarily hidden. Performance is the way in which learning is displayed in
action. Only performance can actually be observed and directly measured.
(a) __________ is learning that is dormant and waiting to be activated.
(b) __________ is the way in which learning is displayed in action.
Answers: (a) Latent learning; (b) Performance.
Insight learning is a third kind of learning in which consciousness appears to
play a major role. Groundbreaking research on insight learning was conducted by
Wolfgang Köhler, one of the principal Gestalt psychologists. One of Köhler's
principal subjects was an ape named Sultan. Sultan was presented with two short
handles that could be assembled to make one long tool, a kind of rake. An orange
was placed outside of Sultan's cage and it was beyond the reach of either handle.
Sultan spent quite a bit of time using the handles in useless ways. He seemed to be
making no progress on the problem.
Then one day Sultan seemed to have a burst of understanding. He clicked
together the handles and raked in the orange. Köhler called this burst of understanding
an insight, and defined it as a sudden reorganization of a perceptual
field. Originally, Sultan's perceptual field contained two useless handles. With
insight, Sultan's perceptual field contained a long rake. The conscious mental
process that brings a subject to an insight is called insight learning.
A burst of understanding associated with the sudden reorganization of a perceptual field is
called an __________.
Answer: insight.
Insight learning is also important for human beings. Let's say that a child in
grammar school is told that pi is the ratio of the circumference of a circle to the
diameter, and that a rounded value for pi is 3.14. The child memorizes the definition,
but the definition has little meaning. If, on the other hand, the child is
encouraged to measure the diameters and the circumferences of cans, pie tins, and
wheels using a string and a ruler, the child may acquire the insight that round
items are always about three times bigger around than they are across. Acquiring
an insight is more satisfying than just memorizing material. Also, insights tend to
resist the process of forgetting.
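To make the measurement example concrete: a can lid that is about 10 centimeters across will measure roughly 31 centimeters around, since 3.14 × 10 ≈ 31.4. The string-and-ruler exercise lets the child discover this relationship rather than merely memorize the number.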
Harry Harlow, a former president of the American Psychological Association,
using rhesus monkeys as subjects, discovered a phenomenon called learning
sets. Assume that a monkey is given a discrimination problem. It is required to
learn that a grape, used as a reinforcer, is always to be found under a small circular
container instead of a square one. The learning curve is gradual, and a
number of trials are required before learning is complete. A second similar
problem is given. The discrimination required is between containers with two
patterns, a crescent moon and a triangle. The learning curve for the second
problem is more accelerated than the learning curve for the first problem. By
the time a fourth or a fifth similar problem is given, the monkey is able to solve
the problem in a very few trials. The monkey has acquired a learning set, an
ability to quickly solve a given type of problem. The underlying process is called
learning to learn.
Human beings also acquire learning sets. A person who often solves crossword
puzzles tends to get better and better at working them. A mechanic who has
worked in the automotive field for a number of years discovers that it is easier and
easier to troubleshoot repair problems. A college student often finds that advanced
courses seem to be easier than basic courses. All of these individuals have learned
to learn.
An acquired ability to quickly solve a given type of problem is called a __________.
Answer: learning set.
Memory: Storing What Has Been Learned
What would life be like without memory? You would have no personal history.
You would have no sense of the past—what you had done and what your childhood
was like. Learning would be a meaningless concept, because learning implies
retention. You will recall that the definition of learning includes the idea that
learning is more or less permanent.
Memory is a process that involves the encoding, storage, and retrieval of cognitive
information. Let's explore these three related processes one by one. Encoding
is a process characterized by giving an informational input a more useful
form. Let's say that you are presented with the letters TCA. They seem meaningless.
You are told that the letters represent an animal that meows. You think, "The
animal is a cat." You have just transformed the informational input TCA into
CAT, and it has become more useful to you. The use of symbols, associations, and
insights provides examples of human encoding.
The use of a mnemonic device, a cognitive structure that improves both
retention and recall, is a special case of encoding. Let's say that in a physics class
you are asked to memorize the colors of the rainbow in their correct order—red,
orange, yellow, green, blue, indigo, and violet. You can use the name Roy G. Biv
as a mnemonic device, using the first letter of each color.
(a) __________ is a process characterized by giving an informational input a more
useful form.
(b) The use of the name Roy G. Biv to remember the colors of the rainbow is an example
of a __________.
Answers: (a) Encoding; (b) mnemonic device.
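As a small illustration of encoding, the toy Python sketch below (an invented example, assuming nothing beyond the color list itself) recodes the seven color names into the single compact chunk ROYGBIV:

    colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

    # Encoding: give the input a more useful form by collapsing seven items
    # into one pronounceable chunk built from the first letter of each color
    initials = "".join(color[0].upper() for color in colors)
    print(initials)   # ROYGBIV, remembered as the name "Roy G. Biv"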
Storage refers to the fact that memories are retained for a period of time. A
distinction is made between short-term memory and long-term memory. Short-term
memory, also known as working memory, is characterized by a temporary
storage of information. If you look up a telephone number, hold it at the
conscious level of your mind for a few moments, use it, and then promptly forget
it, you are employing the short-term memory process. Long-term memory is
characterized by a relatively stable, enduring storage of information. The capacity
to recall much of your own personal history and what you learned in school provides
examples of the long-term memory process.
If short-term memory is impaired, as it is in some organic mental disorders
(see chapter 14), then this interferes with the capacity to form new long-term
memories.
(a) __________ refers to the fact that memories are retained for a period of time.
(b) Short-term memory is also known as __________.
(c) __________ is characterized by a relatively stable, enduring storage of information.
Answers: (a) Storage; (b) working memory; (c) Long-term memory.
Retrieval of cognitive information takes place when a memory is taken out of
storage and returned to consciousness. Three phenomena are of particular
interest in connection with the retrieval process: recall, recognition, and repression.
Recall takes place when a memory can be retrieved easily by an act of will. You
see a friend and think, "There's Paula." You have recalled the name of your friend.
Recognition takes place when the retrieval of a memory is facilitated by the
presence of a helpful stimulus. A multiple-choice test that provides four names,
one of them being the correct answer, is an example of an instructional instrument
that eases the path of memory. The item to be remembered is right there in
front of you.
Repression takes place when the ego, as a form of defense against a psychological
threat, forces a memory into the unconscious domain. This is a psychoanalytical
concept, and it was proposed by Freud. He suggested that memories
associated with emotionally painful childhood experiences are likely to be
repressed (see chapter 13).
(a) __________ takes place when a memory can be retrieved easily by an act of will.
(b) __________ takes place when the retrieval of a memory is facilitated by the presence
of a helpful stimulus.
(c) __________ takes place when the ego, as a form of defense against psychological
threat, forces a memory into the unconscious domain.
Answers: (a) Recall; (b) Recognition; (c) Repression.
SELF-TEST
1. The unconditioned reflex is
a. a kind of behavior acquired by experience
b. always associated with voluntary behavior
c. a learned response pattern
d. an inborn response pattern
2. What takes place when the conditioned stimulus is presented a number of
times without the unconditioned stimulus?
a. Forgetting
b. Extinction
c. Discrimination
d. Stimulus generalization
3. Thorndike said that when satisfactory results are obtained there is a tendency
to retain what has been learned. He called this tendency the
a. law of effect
b. principle of reinforcement
c. principle of reward
d. law of positive feedback
4. Operant behavior is characterized by
a. actions that have no meaning
b. its inability to be affected by reinforcement
c. its conscious nature
d. actions that have consequences
5. What principle is associated with the phrase greater resistance to extinction?
a. The law of effect
b. The total reinforcement effect
c. The partial reinforcement effect
d. The pleasure-pain effect
6. Vicarious reinforcement is characterized by
a. primary gratification
b. imagined gratification
c. extinction
d. the discriminative stimulus
7. What did Köhler define as the sudden reorganization of a perceptual field?
a. Operant conditioning
b. Classical conditioning
c. Insight
d. Extinction
8. The concept of a learning set is associated with what underlying process?
a. Spontaneous inhibition
b. The law of effect
c. Learned optimism
d. Learning to learn
9. The use of a mnemonic device is a special case of
a. encoding
b. short-term memory
c. antagonistic stimuli
d. involuntary conditioning
10. Which one of the following is not associated with the memory process of
retrieval?
a. Recall
b. Recognition
c. Cognitive inhibition
d. Repression
ANSWERS TO THE SELF-TEST
1-d 2-b 3-a 4-d 5-c 6-b 7-c 8-d 9-a 10-c
ANSWERS TO THE TRUE-OR-FALSE PREVIEW QUIZ
1. True.
2. False. A conditioned reflex is a learned response pattern.
3. False. Operant behavior is characterized by actions that have consequences for an
organism.
4. True.
5. False. Short-term memory is an important aspect of the memory process.
KEY TERMS
antisocial behavior
behavioral tendency
classical conditioning
conditioned reflex
conditioned stimulus
conditioning
discrimination
discriminative stimulus
encoding
experience
extinction
incentive
insight
insight learning
involuntary
latent learning
law of effect
learning
learning set
learning to learn
learning-performance distinction
long-term memory
memory
mnemonic device
model
negative reinforcer
observational learning
operant
operant behavior
operant conditioning apparatus (Skinner box)
partial reinforcement effect
positive reinforcer
primary reinforcer
prosocial behavior
random reinforcement
recall
recognition
reinforcer
repression
response
retrieval
reward
secondary reinforcer
short-term memory
social learning theory
stamping in
stimulus generalization
storage
trial-and-error learning
unconditioned reflex
unconditioned stimulus
vicarious reinforcement
working memory