Trial-and-Error Learning: Taking a Rocky Road
It is instructive to note that one of the most popular books on writing ever published
is called Trial and Error by the novelist Jack Woodford. It sold many copies
over a number of years, and communicated to would-be authors that the only
way to learn to write was by taking the rocky road of making one's own mistakes.
The first kind of learning to be studied experimentally in the United States was
trial-and-error learning. Edward L. Thorndike (1874–1949) first studied maze
learning in baby chickens (with the assistance and approval of William James). Later
he studied the escape behavior of cats from puzzle boxes. The cats had to learn to
pull a string that released a latch connected to a door. The cats learned to pull the
string, but only very gradually. They showed no sudden burst of insight or comprehension.
Thorndike concluded that the learning was a robotlike process controlled
primarily by its outcomes. If a specific behavior helped a cat to escape, that
behavior was retained by the cat. Thorndike called this process stamping in,
meaning that an action that is useful is impressed upon the nervous system.
What stamps in a response, according to Thorndike, is satisfaction. The cat
that escapes from a puzzle box is rewarded with food. Thorndike called the tendency
to retain what is learned because satisfactory results are obtained the law of
effect. Thorndike's law of effect is the forerunner of what today is usually known
as the process of reinforcement (see the next section).
(a) If a specific behavior helps a cat to escape from a puzzle box, this behavior is retained by
the cat. Thorndike called this process __________.
(b) Thorndike's law of effect is the forerunner of what today is usually known as the
process of __________.
Answers: (a) stamping in; (b) reinforcement.
Operant Conditioning: How Behavior Is Shaped by Its
Own Consequences
Operant behavior is characterized by actions that have consequences. Flick a
light switch and the consequence is illumination. Saw on a piece of wood and the
consequence is two shorter pieces of wood. Tell a joke and the consequence is
(sometimes) the laughter of others. Work hard at a job all week and the consequence
is a paycheck. In each of these cases the specified action "operates" on the
environment, changing it in some way.
It was B. F. Skinner (1904–1990) who applied the term operant to the kind
of behaviors described above. He saw that operant behavior is both acquired and
shaped by experience. Consequently, he identified it as a kind of learning. He
also categorized it as a form of conditioning because he believed that
such concepts as consciousness and thinking are not necessary to explain much
(perhaps most) operant behavior.
Skinner, long associated with Harvard, invented a device called the operant
conditioning apparatus; its informal name is the Skinner box. Think of the
apparatus as something like a candy machine for animals such as rats and
pigeons. A rat, for example, learns that it can obtain a pellet of food when it
presses a lever. If the pellet appears each time the lever is pressed, the rate of
lever pressing will increase. Lever pressing is operant behavior (or simply an operant).
The pellet is a reinforcer. A reinforcer is a stimulus that has the effect of
increasing the frequency of a given category of behavior (in this case, lever
pressing).
(a) Operant behavior is characterized by actions that have __________.
(b) The formal term for a Skinner box is the __________.
Answers: (a) consequences; (b) operant conditioning apparatus.
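The defining property of a reinforcer, that it raises the future frequency of the behavior it follows, can be pictured with a small simulation. The sketch below is only a toy illustration of that definition, written in Python; the starting probability, the size of the increase, and the update rule are invented for the example and are not drawn from Skinner's own work.

    import random

    press_probability = 0.05   # the rat starts out pressing the lever only rarely
    LEARNING_STEP = 0.03       # invented amount by which the tendency grows after a reinforced press

    for minute in range(1, 61):                    # one simulated hour in the Skinner box
        if random.random() < press_probability:   # the operant: a lever press occurs
            # continuous reinforcement: every press is followed by a food pellet, and the
            # reinforcer's defining effect is to make lever pressing more frequent
            press_probability = min(1.0, press_probability + LEARNING_STEP)
        if minute % 15 == 0:
            print(f"minute {minute}: press probability = {press_probability:.2f}")

Because every press pays off, the tendency to press climbs steadily; that climb is what "increasing the frequency of a given category of behavior" means in practice.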
The concept of reinforcement plays a big part in Skinner's way of looking at
behavior. Consequently, it is important to expand on the concept. Note in the
above definition that a reinforcer is understood in terms of its actual effects. It is to
be distinguished from a reward. A reward is perceived as valuable to the individual
giving the reward, but it may not be valued by the receiving organism. In the
case of a reinforcer, it is a reinforcer only if it has some sort of payoff value to the
receiving organism. By definition, a reinforcer has an impact on operant behavior.
Its function is always to increase the frequency of a class of operant behaviors.
One important way to categorize reinforcers is to refer to them as positive and
negative. A positive reinforcer has value for the organism. Food when you are
hungry, water when you are thirsty, and money when you're strapped for cash all
provide examples of positive reinforcers.
(a) The function of a reinforcer is always to __________ the frequency of a class of operant
behaviors.
(b) A __________ has value for the organism.
Answers: (a) increase; (b) positive reinforcer.
A negative reinforcer has no value for the organism. It does injury or is noxious
in some way. A hot room, an offensive person, and a dangerous situation all
provide examples of negative reinforcers. The organism tends to either escape
from or avoid such reinforcers. The operant behavior takes the subject away from
the reinforcer. Turning on the air conditioner when a room is hot provides an
example of operant behavior designed to escape from a negative reinforcer. Note
that the effect of the negative reinforcer on behavior is still to increase the frequency
of a class of operants. You are more likely to turn on an air conditioner
tomorrow if you have obtained relief by doing so today.
It is also important to note that a negative reinforcer is not punishment. In the
case of punishment, an operant is followed by an adverse stimulus. For example, a
child sasses a parent and then gets slapped. Getting slapped comes after the child's
behavior. In the case of a negative reinforcer, the adverse stimulus is first in time.
Then the operant behavior of escape or avoidance follows.
(a) Operant behavior takes a subject __________ from a negative reinforcer.
(b) In the case of punishment, an operant is __________ by an adverse stimulus.
Answers: (a) away; (b) followed.
Another important way to classify reinforcers is to designate them as having
either a primary or a secondary quality. A primary reinforcer has intrinsic value
for the organism. No learning is required for the worth of the reinforcer to exist.
Food when you are hungry and water when you are thirsty are not only positive
reinforcers, as indicated above, they are also primary reinforcers.
A secondary reinforcer has acquired value for the organism. Learning is
required. Money when you're strapped for cash is a positive reinforcer, as indicated
above, but it is a secondary one. You have to learn that cash has value. An
infant does not value cash, but does value milk. A medal, a diploma, and a trophy
all provide examples of secondary reinforcers.
(a) A __________ has intrinsic value for an organism.
(b) A __________ has acquired value for an organism.
Answers: (a) primary reinforcer; (b) secondary reinforcer.
One of the important phenomena associated with operant conditioning is
extinction. Earlier, we discussed how extinction takes place when the conditioned
stimulus is presented a number of times without the unconditioned stimulus.
Extinction also takes place when the frequency of a category of operant responses
declines. If, using the operant conditioning apparatus, reinforcement is withheld
from a rat, then lever pressing for food will decline, eventually dropping to
nearly zero. The organism has learned to give up a given operant because it no
longer brings the reinforcer.
Research on extinction with both animals and humans suggests that it is a better way
to "break" bad habits than is punishment. If a way can be found to eliminate the
reinforcer (or reinforcers) linked to a behavior pattern, the behavior is likely to
be given up. Punishment tends to temporarily suppress the appearance of an
operant, but extinction has not necessarily taken place. Consequently, the
unwanted operant has "gone underground," and may in time surface as an
unpleasant surprise. Also, punishment is frustrating to organisms and tends to
make them more aggressive.
(a) Extinction takes place when the frequency of a category of operant responses
__________.
(b) Punishment is frustrating to organisms and tends to make them more __________.
Answers: (a) declines; (b) aggressive.
Another important phenomenon associated with operant conditioning is the
partial reinforcement effect, the tendency of operant behavior acquired under
conditions of partial reinforcement to possess greater resistance to extinction than
behavior acquired under conditions of continuous reinforcement. Let's say that rat
1 is reinforced every time it presses a lever; this rat is receiving continuous reinforcement.
Rat 2 is reinforced every other time it presses a lever; this rat is receiving
partial reinforcement. Both rats will eventually acquire the lever-pressing
response. Now assume that reinforcement is withheld for both rats. The rat that
will, in most cases, display greater resistance to extinction is rat 2. Skinner was surprised
by this result. If reinforcement is a kind of strengthening of a habit, then rat
1, receiving more reinforcement, should have the better-established habit.
And it should demonstrate greater resistance to extinction than rat 2.
Nonetheless, the partial reinforcement effect is a reality, and Skinner became
interested in it. He and his coworkers used many schedules of reinforcement to
study the partial reinforcement effect. In general, the effect holds for both animals
and human beings. Random reinforcement is determined by chance and is, consequently,
unpredictable. Behavior acquired with random reinforcement shows an exaggerated
partial reinforcement effect. Skinner was fond of pointing out that random payoffs are associated
with gambling. This explains to some extent why a well-established
gambling habit is hard to break.
(a) Operant behavior acquired under conditions of partial reinforcement tends to possess
greater resistance to __________ than behavior acquired under conditions of continuous
reinforcement.
(b) What kind of reinforcement is determined by chance?
Answers: (a) extinction; (b) Random reinforcement.
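A common way of making the partial reinforcement effect concrete is the discrimination idea: a rat trained on continuous reinforcement can detect almost immediately that the pellets have stopped, while a rat accustomed to only occasional payoffs needs a much longer run of unrewarded presses before extinction sets in. The Python sketch below is a toy version of that idea only; the giving-up rule and all the numbers are invented for the example.

    def presses_before_giving_up(reinforced_every_nth_press, patience_factor=5):
        """Toy rule: the rat quits after a dry run that is patience_factor times
        longer than the longest gap between pellets it experienced in training."""
        longest_training_gap = reinforced_every_nth_press   # 1 = continuous, 2 = every other press
        tolerated_dry_run = patience_factor * longest_training_gap

        presses = 0
        unreinforced_in_a_row = 0
        while unreinforced_in_a_row < tolerated_dry_run:     # extinction phase: no pellets at all
            presses += 1
            unreinforced_in_a_row += 1
        return presses

    print("Rat 1 (continuous reinforcement):", presses_before_giving_up(1))  # quits sooner
    print("Rat 2 (partial reinforcement):", presses_before_giving_up(2))     # keeps pressing longer

Under these invented numbers, rat 2 presses twice as long as rat 1 before giving up, which is the "greater resistance to extinction" described above.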
Assume that an operant conditioning apparatus contains a light bulb.
When the light is on, pressing the lever pays off. When the light is off, pressing
the lever fails to bring forth a reinforcer. Under these conditions, a trained experimental
animal will tend to display a high rate of lever pressing when the light is
on and ignore the lever when the light is off. The light is called a discriminative
stimulus, meaning a stimulus that allows the organism to tell the difference
between a situation that is potentially reinforcing and one that is not. Cues used
to train animals, such as whistles and hand signals, are discriminative stimuli.
Skinner notes that discriminative stimuli control human behavior, too. A factory
whistle communicating to workers that it's time for lunch, a bell's ring for a
prizefighter, a school bell's ring for a child, and a traffic light for a driver are all
discriminative stimuli. Stimuli can be more subtle than these examples. A lover's
facial expression or tone of voice may communicate a readiness or lack of readiness
to respond to amorous advances.
Skinner asserts that in real life both discriminative stimuli and reinforcers automatically
control much of our behavior.
A stimulus that allows the organism to tell the difference between a situation that is potentially
reinforcing and one that is not is called a __________.
Answer: discriminative stimulus.
Consciousness and Learning: What It Means to Have
an Insight
Although classical and operant conditioning play a large part in both animal and
human learning, it is generally recognized by behavioral scientists that these two
related processes give an insufficient account of the learning process, particularly
in human beings. Consequently, it is important to identify at least four additional
aspects of learning. These are (1) observational learning, (2) latent learning, (3)
insight learning, and (4) learning to learn.
Observational learning takes place when an individual acquires behavior
by watching the behavior of a second individual. Albert Bandura, a principal
researcher associated with observational learning, identified important features
of this particular process. The second individual is a model, and either intentionally
or unintentionally demonstrates behavior. If the observer identifies with
the model and gains imaginary satisfaction from the model's behavior, then
this is vicarious reinforcement. Vicarious reinforcement is characterized by
imagined gratification. Psychologically, it acts as a substitute for the real thing.
Let's say that Jonathan admires a particular tennis star. When the star wins an
important tournament, Jonathan is ecstatic. This emotional state is a vicarious
reinforcer.
It should be noted that the concept of watching a model is very general. Reading
a mystery novel and identifying with the detective is a kind of observational
behavior. The thrills associated with the hero's adventures are vicarious thrills.
(a) What kind of learning takes place when an individual acquires behavior by watching the
behavior of a second individual?
(b) A __________ either intentionally or unintentionally demonstrates behavior.
(c) __________ is characterized by imagined gratification.
Answers: (a) Observational learning; (b) model; (c) Vicarious reinforcement.
Social learning theory, associated with Bandura's research, states that much
of our behavior in reference to other people is acquired through observational
learning. Let's say that Carol is a fifteen-year-old high school student. She is on
the fringe of a group of adolescent females who admire a charismatic eighteen-year-old
named Dominique. Dominique smokes, uses obscenities, and brags
about her sexual exploits. Carol observes Dominique and obtains a lot of vicarious
reinforcement from Dominique's behavior. If Carol begins to imitate
Dominique's behavior, then social learning has taken place.
Both prosocial behavior and antisocial behavior can be acquired through
observational learning. Prosocial behavior is behavior that contributes to the
long-run goals of a traditional reference group such as the family or the population
of the nation (see chapter 16). If an individual admires one or both parents,
then the parents may be taken as role models. Many adolescents and young adults
acquire attitudes and personal habits that resemble those of their parents. A person who is
patriotic and ready to defend the nation in time of war is quite likely taking important
historical figures such as presidents and generals as role models.
Antisocial behavior is behavior that has an adverse impact on the long-run
goals of a traditional reference group. From the point of view of Carol's parents,
if Carol begins to act like Dominique, then Carol's behavior is antisocial.
(a) What theory states that much of our behavior in reference to other people is acquired
through observational learning?
(b) __________ is behavior that contributes to the long-run goals of a traditional
reference group.
(c) __________ is behavior that has an adverse impact on the long-run goals of a
traditional reference group.
Answers: (a) Social learning theory; (b) Prosocial behavior; (c) Antisocial behavior.
Latent learning is a second kind of learning in which consciousness
appears to play a large role. Pioneer research on latent learning is associated with
experiments conducted by the University of California psychologist Edward C.
Tolman and his associates. Let's say that a rat is allowed to explore a maze without
reinforcement. It seems to wander through the maze without any particular
pattern of behavior. It is probably responding to its own curiosity drive, but no
particular learning appears to be taking place. Let's say that after ten such opportunities,
reinforcement in the form of food in a goal box is introduced. The rat,
if it is typical, will quickly learn to run the maze with very few errors. Its learning
curve is highly accelerated compared to that of a rat that has not had an earlier
opportunity to explore the maze. This is because the first rat was actually
learning while it was exploring. The function of reinforcement in this case is to
act as an incentive, a stimulus that brings forth whatever learning the
organism has acquired.
Note that the learning was actually acquired when the rat was exploring.
Therefore learning was taking place without reinforcement. Such learning is called
latent learning, meaning learning that is dormant and waiting to be activated.
Let's say that Keith is an adolescent male. For years his mother has forced him,
with no particular reinforcement, to make his bed and hang up his clothes neatly.
But Keith has, from his mother's point of view, been a slow learner. He does both
tasks poorly. He enlists in the army shortly after his eighteenth birthday. In basic
training he makes his bed and hangs up his clothes neatly. He has been told that
he will obtain his first weekend pass only if he performs various tasks properly.
The fact that Keith shows a very rapid learning curve under these conditions provides
an example of latent learning. He was learning under his mother's influence,
but he wasn't motivated to bring the learning forth.
The process of latent learning calls attention to the learning-performance
distinction. Learning is an underlying process. In the case of latent learning it is
temporarily hidden. Performance is the way in which learning is displayed in
action. Only performance can actually be observed and directly measured.
(a) __________ is learning that is dormant and waiting to be activated.
(b) __________ is the way in which learning is displayed in action.
Answers: (a) Latent learning; (b) Performance.
Insight learning is a third kind of learning in which consciousness appears to
play a major role. Groundbreaking research on insight learning was conducted by
Wolfgang Köhler, one of the principal Gestalt psychologists. One of Köhler's
principal subjects was an ape named Sultan. Sultan was presented with two short
handles that could be assembled to make one long tool, a kind of rake. An orange
was placed outside of Sultan's cage and it was beyond the reach of either handle.
Sultan spent quite a bit of time using the handles in useless ways. He seemed to be
making no progress on the problem.
Then one day Sultan seemed to have a burst of understanding. He clicked
together the handles and raked in the orange. Köhler called this burst of understanding
an insight, and defined it as a sudden reorganization of a perceptual
field. Originally, Sultan's perceptual field contained two useless handles. With
insight, Sultan's perceptual field contained a long rake. The conscious mental
process that brings a subject to an insight is called insight learning.
A burst of understanding associated with the sudden reorganization of a perceptual field is
called an __________.
Answer: insight.
Insight learning is also important for human beings. Let's say that a child in
grammar school is told that pi is the ratio of the circumference of a circle to the
diameter, and that a rounded value for pi is 3.14. The child memorizes the definition,
but the definition has little meaning. If, on the other hand, the child is
encouraged to measure the diameters and the circumferences of cans, pie tins, and
wheels using a string and a ruler, the child may acquire the insight that round
items are always about three times bigger around than they are across. Acquiring
an insight is more satisfying than just memorizing material. Also, insights tend to
resist the process of forgetting.
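To make the measurement example concrete: a can lid that is about 10 centimeters across will measure roughly 31 centimeters around, since 3.14 × 10 ≈ 31.4. The string-and-ruler exercise lets the child discover this relationship rather than merely memorize the number.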
Harry Harlow, a former president of the American Psychological Association,
using rhesus monkeys as subjects, discovered a phenomenon called learning
sets. Assume that a monkey is given a discrimination problem. It is required to
learn that a grape, used as a reinforcer, is always to be found under a small circular
container instead of a square one. The learning curve is gradual, and a
number of trials are required before learning is complete. A second similar
problem is given. The discrimination required is between containers with two
patterns, a crescent moon and a triangle. The learning curve for the second
problem is more accelerated than the learning curve for the first problem. By
the time a fourth or a fifth similar problem is given, the monkey is able to solve
the problem in a very few trials. The monkey has acquired a learning set, an
ability to quickly solve a given type of problem. The underlying process is called
learning to learn.
Human beings also acquire learning sets. A person who often solves crossword
puzzles tends to get better and better at working them. A mechanic who has
worked in the automotive field for a number of years discovers that it is easier and
easier to troubleshoot repair problems. A college student often finds that advanced
courses seem to be easier than basic courses. All of these individuals have learned
to learn.
An acquired ability to quickly solve a given type of problem is called a __________.
Answer: learning set.
Memory: Storing What Has Been Learned
What would life be like without memory? You would have no personal history.
You would have no sense of the past—what you had done and what your childhood
was like. Learning would be a meaningless concept, because learning implies
retention. You will recall that the definition of learning includes the idea that
learning is more or less permanent.
Memory is a process that involves the encoding, storage, and retrieval of cognitive
information. Let's explore these three related processes one by one. Encoding
is a process characterized by giving an informational input a more useful
form. Let's say that you are presented with the letters TCA. They seem meaningless.
You are told that the letters represent an animal that meows. You think, "The
animal is a cat." You have just transformed the informational input TCA into
CAT, and it has become more useful to you. The use of symbols, associations, and
insights provides examples of human encoding.
The use of a mnemonic device, a cognitive structure that improves both
retention and recall, is a special case of encoding. Let's say that in a physics class
you are asked to memorize the colors of the rainbow in their correct order—red,
orange, yellow, green, blue, indigo, and violet. You can use the name Roy G. Biv
as a mnemonic device, using the first letter of each color.
(a) __________ is a process characterized by giving an informational input a more
useful form.
(b) The use of the name Roy G. Biv to remember the colors of the rainbow is an example
of a __________.
Answers: (a) Encoding; (b) mnemonic device.
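As a small illustration of encoding, the toy Python sketch below (an invented example, assuming nothing beyond the color list itself) recodes the seven color names into the single compact chunk ROYGBIV:

    colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

    # Encoding: give the input a more useful form by collapsing seven items
    # into one pronounceable chunk built from the first letter of each color
    initials = "".join(color[0].upper() for color in colors)
    print(initials)   # ROYGBIV, remembered as the name "Roy G. Biv"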
Storage refers to the fact that memories are retained for a period of time. A
distinction is made between short-term memory and long-term memory. Short-term
memory, also known as working memory, is characterized by a temporary
storage of information. If you look up a telephone number, hold it at the
conscious level of your mind for a few moments, use it, and then promptly forget
it, you are employing the short-term memory process. Long-term memory is
characterized by a relatively stable, enduring storage of information. The capacity
to recall much of your own personal history and what you learned in school provides
examples of the long-term memory process.
If short-term memory is impaired, as it is in some organic mental disorders
(see chapter 14), then this interferes with the capacity to form new long-term
memories.
(a) __________ refers to the fact that memories are retained for a period of time.
(b) Short-term memory is also known as __________.
(c) __________ is characterized by a relatively stable, enduring storage of information.
Answers: (a) Storage; (b) working memory; (c) Long-term memory.
Retrieval of cognitive information takes place when a memory is taken out of
storage and returned to consciousness. Three phenomena are of particular
interest in connection with the retrieval process: recall, recognition, and repression.
Recall takes place when a memory can be retrieved easily by an act of will. You
see a friend and think, "There's Paula." You have recalled the name of your friend.
Recognition takes place when the retrieval of a memory is facilitated by the
presence of a helpful stimulus. A multiple-choice test that provides four names,
one of them being the correct answer, is an example of an instructional instrument
that eases the path of memory. The item to be remembered is right there in
front of you.
Repression takes place when the ego, as a form of defense against a psychological
threat, forces a memory into the unconscious domain. This is a psychoanalytical
concept, and it was proposed by Freud. He suggested that memories
associated with emotionally painful childhood experiences are likely to be
repressed (see chapter 13).
(a) __________ takes place when a memory can be retrieved easily by an act of will.
(b) __________ takes place when the retrieval of a memory is facilitated by the presence
of a helpful stimulus.
(c) __________ takes place when the ego, as a form of defense against psychological
threat, forces a memory into the unconscious domain.
Answers: (a) Recall; (b) Recognition; (c) Repression.
SELF-TEST
1. The unconditioned reflex is
a. a kind of behavior acquired by experience
b. always associated with voluntary behavior
c. a learned response pattern
d. an inborn response pattern
2. What takes place when the conditioned stimulus is presented a number of
times without the unconditioned stimulus?
a. Forgetting
b. Extinction
c. Discrimination
d. Stimulus generalization
3. Thorndike said that when satisfactory results are obtained there is a tendency
to retain what has been learned. He called this tendency the
a. law of effect
b. principle of reinforcement
c. principle of reward
d. law of positive feedback
4. Operant behavior is characterized by
a. actions that have no meaning
b. its inability to be affected by reinforcement
c. its conscious nature
d. actions that have consequences
5. What principle is associated with the phrase greater resistance to extinction?
a. The law of effect
b. The total reinforcement effect
c. The partial reinforcement effect
d. The pleasure-pain effect
6. Vicarious reinforcement is characterized by
a. primary gratification
b. imagined gratification
c. extinction
d. the discriminative stimulus
7. What did Köhler define as the sudden reorganization of a perceptual field?
a. Operant conditioning
b. Classical conditioning
c. Insight
d. Extinction
8. The concept of a learning set is associated with what underlying process?
a. Spontaneous inhibition
b. The law of effect
c. Learned optimism
d. Learning to learn
9. The use of a mnemonic device is a special case of
a. encoding
b. short-term memory
c. antagonistic stimuli
d. involuntary conditioning
10. Which one of the following is not associated with the memory process of
retrieval?
a. Recall
b. Recognition
c. Cognitive inhibition
d. Repression
ANSWERS TO THE SELF-TEST
1-d 2-b 3-a 4-d 5-c 6-b 7-c 8-d 9-a 10-c
ANSWERS TO THE TRUE-OR-FALSE PREVIEW QUIZ
1. True.
2. False. A conditioned reflex is a learned response pattern.
3. False. Operant behavior is characterized by actions that have consequences for an
organism.
4. True.
5. False. Short-term memory is an important aspect of the memory process.
KEY TERMS
antisocial behavior
behavioral tendency
classical conditioning
conditioned reflex
conditioned stimulus
conditioning
discrimination
discriminative stimulus
encoding
experience
extinction
incentive
insight
insight learning
involuntary
latent learning
law of effect
learning
learning set
learning to learn
learning-performance distinction
long-term memory
memory
mnemonic device
model
negative reinforcer
observational learning
operant
operant behavior
operant conditioning apparatus (Skinner box)
partial reinforcement effect
positive reinforcer
primary reinforcer
prosocial behavior
random reinforcement
recall
recognition
reinforcer
repression
response
retrieval
reward
secondary reinforcer
short-term memory
social learning theory
stamping in
stimulus generalization
storage
trial-and-error learning
unconditioned reflex
unconditioned stimulus
vicarious reinforcement
working memory