Chapter 5: Direct Learning and Human Potential
Psychology studies how heredity (nature) and experience (nurture) interact to influence behavior. In the previous chapter, we related Maslow’s hierarchy of human needs to very different human conditions. Whether we are growing up in the rain forest or in a technologically-enhanced urban setting, the bottom of the pyramid remains the same. We need to eat and drink and require protection from the elements. Deprivation of food or water will result in our becoming more active as we search for the needed substance. Unpleasant weather conditions will result in our becoming more active to remove the source of unpleasantness. Our senses enable us to detect appetitive and aversive stimuli in our internal and external environments. Our physical structure enables us to move, grasp, and manipulate objects. Our nervous system connects our sensory and motor systems.
Human beings inherit some sensory-motor connections enhancing the likelihood of our survival. Infants inherit two reflexes that increase the likelihood of successful nursing. A reflex is a simple inherited behavior characteristic of the members of a species. Human infants inherit rooting and sucking reflexes. If a nipple is placed in the corner of an infant’s mouth it will center the nipple (i.e., root). The infant will then suck on a nipple in the center of its mouth. Birth mothers’ breasts fill with milk and swell resulting in discomfort that is relieved by an infant’s nursing. This increases the likelihood that the mother will attempt to nurse the infant. This happy combination of inherited characteristics has enabled human infants to survive throughout the millennia.
Human mothers eventually stop producing milk and human infants eventually require additional nutrients in order to survive. This creates the need to identify and locate sources of nutrients. Humans started out in Africa and have migrated to practically every location on Earth’s land. Given the variability in types of food and their locations, it would be impossible for humans to depend upon the very slow biological evolution process to identify and locate nutrients. We cannot inherit reflexes to address all the possibilities. Another more rapid and flexible type of adaptive sensory-motor mechanism must be involved.
We described foraging trips conducted by members of the Nukak tribe. Foods consisted of fruits and honey and small wild animals including fish and birds. The Nukak changed locations every few days in order to locate new food supplies. Where they settled and looked changed with the seasons. Hunting and gathering included the use of tools assembled with natural elements. Clearly, prior experience (i.e., nurture) affected their behavior. This is what we mean by learning.
Operational Definition of Learning
All sciences rely upon operational definitions in order to establish a degree of consistency in the use of terminology. Operational definitions describe the procedures used to measure the particular term. One does not directly observe learning. It has to be inferred from observations of behavior. The operational definition describes how one objectively determines whether a behavioral observation is an example of the process.
The most common operational definitions of learning are variations on the one provided in Kimble’s revision of Hilgard and Marquis’ Conditioning and Learning (1961). According to Kimble, “Learning is a relatively permanent change in behavior potentiality which occurs as a result of practice.” Let us parse this definition. First, it should be noted that learning is inferred only when we see a change in behavior resulting from appropriate experience. Excluded are other possible causes of behavior change including maturation, which is non-experiential. Fatigue and drugs do not produce “relatively permanent” changes. Kimble includes the word “potentiality” after behavior to emphasize the fact that even if learning has occurred, this does not guarantee a corresponding behavior change.
The conclusion that prior learning may not be reflected in performance comes from a classic experiment conducted by Tolman and Honzik in 1930. They studied laboratory rats under conditions resembling the hunting and gathering of the Nukak. Three different groups were placed in a complex maze and the number of errors (i.e., wrong turns) was recorded (see Figure 5.1).
Figure 5.1. Maze used in Tolman and Honzik’s 1930 study with rats (Jensen, 2006).
A Hungry No Reward (HNR) group was simply placed in the start box and removed from the maze after reaching the end. A Hungry Reward (HR) group received food at the end and was permitted to eat prior to being removed. The third group, the No Reward to Reward (HNR-R) group, began the same as the HNR group and was switched to being treated the same as the HR group after ten days.
Before considering the third group, let us see how the results for the first two enable us to conclude that learning occurred in the HR group (see Figure 5.2). The HNR and HR groups were treated the same with one exception: the HR group received food at the end. Therefore, if the results differ, we can conclude that it must be this experience that made the difference. The average number of errors did not drop significantly below chance performance over the course of the experiment for the HNR group. In comparison, the HR group demonstrated a steady and substantial decline in errors, exactly the pattern one would expect if learning were occurring. This decline in errors as the result of experience fulfills the operational definition of learning. It would not be possible to conclude that the experience made a difference in the HR group without the HNR control condition. One could argue that something else was responsible for the decline taking place (e.g., a change in the lab conditions, maturation, etc.).
Figure 5.2. Results from Tolman & Honzik’s study (Jensen, 2006).
It is possible to conclude that the rewarded (HR) group learned the maze based upon a comparison of its results with the no reward (HNR) group. A related, seemingly logical conclusion would be that the group not receiving food failed to learn the maze. Tolman and Honzik’s third group (HNR-R) was like the no reward group for the first 10 days and like the rewarded group for the remaining days. This group enabled the test of whether or not the absence of food resulted in the absence of learning. It is important to understand the rationale for this condition. If the HNR-R group had not learned anything about the maze on the first 10 days, the number of errors would be expected to gradually decline from there on, the pattern demonstrated from the start by the HR condition. However, if the HNR-R group had been learning the maze, a more dramatic decline in errors would be expected once the food was introduced. This dramatic decline in errors is indeed what occurred, leading to the conclusion that the rats had learned the maze despite the fact that it was not evident in their behavior. This result has been described as “latent learning” (i.e., learning that is not reflected in performance). Learning is but one of several factors affecting how an individual behaves. Tolman and Honzik’s results imply that incentive motivation (food in this instance) was necessary in order for the animals to display what they had learned. Thus, we see the need to include the word “potentiality” in the operational definition of learning. During the first 10 days, the rats clearly acquired the potential to negotiate the maze. These results may remind you of those cited in Chapter 1 with young children. You may recall that some scored higher on IQ tests when they received extrinsic rewards for correct answers. Just like Tolman and Honzik’s rats, they had the potential to perform better but needed an incentive.
- State the operational definition of learning, describing why each of the terms is included.
- Describe the procedures, rationale, results, and implications of Tolman and Honzik’s study demonstrating that one cannot conclude the absence of learning from the absence of performance.
Learning as an Adaptive Process
The operational definition tells us how to measure learning but does not tell us what is learned or why it is important. I attempted to achieve this by defining learning as an adaptive process whereby individuals acquire the ability to predict and control the environment (Levy, 2013). There is nothing the Nukak can do to cause or stop it from raining. Over time, however, they may be able to use environmental cues such as dark skies or perhaps even cues related to the passage of time to predict the occurrence of rain. The Nukak can control the likelihood of discovering food by exploring their environment. They can obtain fruit from trees by reaching for and grasping it. The abilities to predict rain and obtain food certainly increase the likelihood of survival for the Nukak. That is, these abilities are adaptive.
The adaptive learning definition enables us to appreciate why it is necessary to turn our attention to two famous researchers whose contributions have enormously influenced the study of learning for decades, Ivan Pavlov and B. F. Skinner. Pavlov’s procedures, called classical conditioning, investigated learning under circumstances where it was possible to predict events but not control them. Skinner investigated learning under circumstances where control was possible. These two researchers created apparatuses and experimental procedures to study the details of adaptive learning. They identified many important learning phenomena and introduced technical vocabularies which have stood the test of time. We will describe Pavlov’s contributions to the study of predictive learning in this chapter and Skinner’s contributions to the study of control learning in the next chapter.
Figure 5.3 Ivan Pavlov.
One cannot overstate the significance of the contributions Ivan Pavlov made to the study of predictive learning. Pavlov introduced a level of rigor and precision of measurement of both the independent and dependent variables in animal learning that did not exist at the time. In 1904, Pavlov, a physiologist, was awarded the Nobel Prize in Medicine for his research investigating the digestive process in dogs. He became fascinated by an observation he and his laboratory assistants made while conducting this research. One of the digestive processes they studied was salivation. Saliva contains enzymes that initiate the process of breaking down what one eats into basic nutrients required to fuel and repair the body. The subjects frequently started salivating before being placed in the experimental apparatus. Pavlov described this salivation as a “psychic secretion” since it was not being directly elicited by food. He considered the phenomenon so important that within a few years he abandoned his research program in digestion and dedicated the rest of his professional career to systematically studying the details of this basic learning process.
This is a wonderful example of what has been described as serendipity, or accidental discovery in science. Dogs have been domesticated for thousands of years. A countless number of people probably observed dogs appearing to predict (i.e., anticipate or expect) food. Pavlov, however, recognized the significance of the observation as an example of a fundamental learning process. We often think of science as requiring new observations. Pavlov’s “discovery” of the classical conditioning process is an example of how this is not necessarily the case. One of the characteristics of an exceptional scientist is to recognize the significance of commonly occurring observations.
We will now review the apparatus, methods, and terminology Pavlov developed for studying predictive learning. He adapted an experimental apparatus designed for one scientific field of inquiry (the physiology of digestion) to an entirely different field (adaptive learning). Pavlov made a small surgical incision in the dog’s cheek and implanted a tube permitting saliva to be directly collected in a graduated test tube. The amount of saliva could then be accurately measured and graphed as depicted in figure 5.3. Predictive learning was inferred when salivation occurred to a previously neutral stimulus as the result of appropriate experience.
Animals inherit the tendency to make simple responses (i.e., reflexes) to specific types of stimulation. Pavlov’s salivation research was based on the reflexive eliciting of salivation by food (e.g., meat powder). This research was adapted to the study of predictive learning by including a neutral stimulus. By neutral, we simply mean that this stimulus did not initially elicit any behavior related to food. Pavlov demonstrated that if a neutral stimulus preceded a biologically significant stimulus on several occasions, one would see a new response occurring to the previously neutral stimulus. Figure 5.4 uses the most popular translation of Pavlov’s (who wrote in Russian) terminology. The reflexive behavior was referred to as the unconditioned response (UR). The stimulus that reflexively elicited this response was referred to as the unconditioned stimulus (US). A novel stimulus, by virtue of being paired in a predictive relationship with the food (US), acquires the capacity to elicit a food-related, conditioned response (CR). Once acquiring this capacity, the novel stimulus is considered a conditioned stimulus (CS).
Figure 5.4 Pavlov’s Classical Conditioning Procedures and Terminology.
Basic Predictive Learning Phenomena
In Chapter 1, we discussed the assumption of determinism as it applied to the discipline of psychology. If predictive learning is a lawful process, controlled empirical investigation has the potential to establish reliable cause-effect relationships. We will see this is the case as we review several basic classical conditioning phenomena. Many of these phenomena were discovered and named by Pavlov himself, starting with the acquisition process described above.
The term acquisition refers to a procedure or process whereby one stimulus is presented in a predictive relationship with another stimulus. Predictive learning (classical conditioning) is inferred from the occurrence of a new response to the first stimulus. Keeping in mind that mentalistic terms are inferences based upon behavioral observations, it is as though the individual learns to predict if this happens, then that happens.
The term extinction refers to a procedure or process whereby a previously established predictive stimulus is no longer followed by the second stimulus. This typically results in a weakening in the strength of the prior learned response. It is as though the individual learns what used to happen, doesn’t happen anymore. Extinction is commonly misused as a term describing only the result of the procedure or process. That is, it is often used like the term schizophrenia, which is defined exclusively on the dependent variable (symptom) side. Extinction is actually more like influenza, in that it is a true explanation standing for the relationship between a specific independent variable (the procedure) and dependent variable (the change in behavior).
The term spontaneous recovery refers to an increase in the strength of the prior learned response after an extended time period elapses between extinction trials. The individual acts as though, perhaps what used to happen, still does.
Is Extinction Unlearning or Inhibitory Learning?
Pavlov was an excellent example of someone who today would be considered a behavioral neuroscientist. In fact, the full title of his classic book (1927) is Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex. Behavioral neuroscientists study behavior in order to infer underlying brain mechanisms. Thus, Pavlov did not perceive himself as converting from a physiologist into a psychologist when he abandoned his study of digestion to explore the intricacies of classical conditioning. As implied by his “psychic secretion” metaphor, he believed he was continuing to study physiology, turning his attention from studying the digestive system to studying the brain.
One question of interest to Pavlov was the nature of the extinction process. Pavlov assumed that acquisition produced a connection between a sensory neuron representing the conditioned stimulus and a motor neuron eliciting salivation. The reduction in responding resulting from the extinction procedure could result from either breaking this bond (i.e., unlearning) or counteracting it with a competing response. The fact that spontaneous recovery occurs indicates that the bond is not broken during the extinction process. Extinction must involve learning an inhibitory response counteracting the conditioned response. The individual appears to learn that one stimulus no longer predicts another. The conclusion that extinction does not permanently eliminate a previously learned association has important practical and clinical implications. It means that someone who has received treatment for a problem and improved is not the same as a person never requiring treatment in the first place (cf. Bouton, 2000; Bouton and Nelson, 1998). For example, even if someone has quit smoking, there is a greater likelihood of that person’s relapsing than a non-smoker’s acquiring the habit.
- Describe the basis for concluding that extinction is an inhibitory as opposed to an unlearning process.
Stimulus Generalization and Discrimination
Just imagine if you had to learn to make the same response over and over again to each new situation. Fortunately, this is often not necessary. Stimulus generalization refers to the fact that a previously acquired response will occur in the presence of stimuli other than the original one, the likelihood being a function of the degree of similarity. In Figure 5.5, we see that a response learned to a 500 Hz frequency tone occurs to other stimuli, the percentage of times depending upon how close the frequency is to 500. It is as though the individual predicts what happens after one event will happen after similar events.
Figure 5.5 Stimulus generalization gradient.
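The idea that responding falls off with distance from the trained stimulus can be sketched numerically. The example below is only an illustration and does not reproduce data from any study. It assumes a Gaussian-shaped gradient centered on the 500 Hz training tone; the function name `response_probability`, the peak value, and the width parameter are hypothetical choices made for this sketch.

```python
import math


def response_probability(test_hz, trained_hz=500.0, peak=0.9, width_hz=100.0):
    """Hypothetical generalization gradient.

    Returns the assumed probability of a conditioned response to a test
    tone. The Gaussian shape, peak, and width_hz are illustrative
    assumptions, not measured values.
    """
    distance = test_hz - trained_hz
    return peak * math.exp(-(distance ** 2) / (2 * width_hz ** 2))


if __name__ == "__main__":
    # Responding is highest at the trained tone and declines symmetrically.
    for hz in (300, 400, 500, 600, 700):
        print(f"{hz} Hz: {response_probability(hz):.2f}")
```

Under these assumptions, tones equally far above and below 500 Hz produce the same response probability, and tones farther from 500 Hz produce less responding, which is the qualitative pattern a generalization gradient describes.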
The fact that generalization occurs significantly increases the efficiency of individual learning experiences. However, there are usually limits on the appropriateness of making the same response in different situations. For example, new fathers often beam the first time they hear their infant say “dada.” They are less thrilled when they hear their child call the mailman “dada!” Usually it is necessary to conduct additional teaching so that the child only says “dada” in the presence of the father. Stimulus discrimination occurs when one stimulus (the S+, e.g., a tone or the father) is predictive of a second stimulus (e.g., food or the word “dada”) but a different stimulus (the S-, e.g., a light or the mailman) is never followed by that second stimulus. Eventually the individual responds to the S+ (tone or father) and not to the S- (light or mailman) as though learning if this happens then that happens, but if this other thing happens that does not happen.
- Define and give examples of the following classical conditioning phenomena: acquisition, extinction, spontaneous recovery, stimulus generalization, and discrimination.
- Explain how the fact that spontaneous recovery occurs indicates that the connection between a conditioned stimulus and conditioned response is not broken during the extinction process.
Pavlov’s Stimulus Substitution Model of Classical Conditioning
For most of the 20th century, Pavlov’s originally proposed stimulus substitution model of classical conditioning was widely accepted. Pavlov viewed conditioning as a mechanistic (automatic) result of pairing neutral and biologically significant events in time. He believed that the established conditioned stimulus became a substitute for the original unconditioned stimulus. There were four assumptions underlying this stimulus substitution model:
- Classical conditioning requires a biologically significant stimulus (i.e., US)
- Temporal contiguity between a neutral stimulus and unconditioned stimulus is necessary for the neutral stimulus to become a conditioned stimulus
- Temporal contiguity between a neutral stimulus and unconditioned stimulus is sufficient for the neutral stimulus to become a conditioned stimulus
- The conditioned response will always resemble if not be identical to the unconditioned response.
Does Classical Conditioning Require a Biologically Significant Stimulus?
Higher-order conditioning is a procedure or process whereby a previously neutral stimulus is presented in a predictive relationship with a second, previously established, predictive stimulus. Learning is inferred from the occurrence of a new response in the presence of this previously neutral stimulus. For example, after pairing the tone with food, it is possible to place the tone in the position of the US by presenting a light immediately before it occurs. Research indicates that a conditioned response (salivation in this case) will occur to the light even though it was not paired with a biologically significant stimulus (food).
Is Temporal Contiguity Necessary for Conditioning?
Human beings have speculated about the learning process since at least the time of the early Greek philosophers. Aristotle, in the fourth century B.C., proposed three laws of association that he believed applied to human thought and memory. The law of contiguity stated that objects or events occurring close in time (temporal contiguity) or space (spatial contiguity) became associated. The law of similarity stated that we tended to associate objects or events having features in common such that observing one event will prompt recall of similar events. The law of frequency stated that the more often we experienced objects or events, the more likely we would be to remember them. In a sense, Pavlov created a methodology permitting empirical testing of Aristotle’s laws. The law that applies in this section is the law of temporal contiguity. Timing effects, like many variables studied scientifically, lend themselves to parametric studies in which the independent variable consists of different values on a dimension. It has been demonstrated that human eyelid conditioning, in which a light is followed by a puff of air to the eye, is strongest when the puff occurs approximately 500 milliseconds (½ second) after the light. The strength of conditioning at shorter or longer intervals drops off within tenths of a second. Thus, temporal contiguity appears critical in human eyelid conditioning, consistent with Pavlov’s second assumption.
An Exception – Acquired Taste Aversion
Acquired taste aversion is the only apparent exception to the necessity of temporal contiguity in predictive learning (classical conditioning). This exception can be understood as an evolutionary adaptation to protect animals from food poisoning. Just imagine if members of the Nukak got sick after eating a particular food and continued to eat the same substance. There is a good chance the tribe members (and tribe!) would not survive for long. It would be advantageous to avoid foods one ate prior to becoming ill, even if the symptoms did not appear for several minutes or even hours. The phenomenon of acquired taste aversion has been studied extensively. The time intervals used sometimes differ by hours rather than seconds or tenths of seconds. For example, rats were made sick by being exposed to X-rays after drinking sweet water (Smith & Roll, 1967). Rats have a strong preference for sweet water, drinking it approximately 80 per cent of the time when being given a choice with ordinary tap water. If the rat became sick within ½ hour, sweet-water drinking was totally eliminated. With intervals of 1 to 6 hours, it was reduced from 80 to 10 per cent. There was even evidence of an effect after a 24-hour delay! Pavlov’s dogs would not associate a tone with presentation of food an hour later, let alone 24 hours. The acquired aversion to sweet water can be interpreted either as an exception to the law of temporal contiguity or as evidence that contiguity must be considered on a time scale of a different order of magnitude (hours rather than seconds).
Is Temporal Contiguity Sufficient for Conditioning?
Pavlov believed not only that temporal contiguity between CS and US was necessary for conditioning to occur; he also believed that it was all that was necessary (i.e., that it was sufficient). Rescorla (1966, 1968, 1988) demonstrated that the correlation between CS and US (i.e., the extent to which the CS predicted the US) was more important than temporal contiguity. For example, if the only time one gets shocked is in the presence of the tone, then the tone correlates with shock (i.e., is predictive of the shock). If one is shocked the same amount whether the tone is present or not, the tone does not correlate with shock (i.e., provides no predictive information). Rescorla demonstrated that despite temporal contiguity between tone and shock in both instances, classical conditioning would be strong in the first case and not occur in the second.
Another example of the lack of predictive learning despite temporal contiguity between two events is provided in a study by Leon Kamin (1969). In the first phase, a blocking group received a tone (CS 1) followed by shock (US), and a control group was simply placed in the chamber (see figure 5.13). The groups were treated identically from then on. During the second phase, a compound stimulus consisting of the tone (CS 1) and a light (CS 2) was followed by shock. During a test phase, each component was presented by itself to determine the extent of conditioning.
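The difference between the two tone-shock arrangements can be made concrete with a little arithmetic. The sketch below uses one simple index of predictiveness, the probability of shock when the tone is present minus the probability of shock when it is absent. This index and the function name `contingency` are assumptions chosen for illustration, not necessarily the statistic Rescorla reported, and the trial counts are invented.

```python
def contingency(trials):
    """Return P(US | CS) - P(US | no CS).

    trials is a list of (cs_present, us_present) boolean pairs.
    Illustrative index only; assumes at least one trial with the CS
    and at least one without it.
    """
    us_with_cs = [us for cs, us in trials if cs]
    us_without_cs = [us for cs, us in trials if not cs]
    p_us_given_cs = sum(us_with_cs) / len(us_with_cs)
    p_us_given_no_cs = sum(us_without_cs) / len(us_without_cs)
    return p_us_given_cs - p_us_given_no_cs


# Hypothetical case 1: shock occurs only in the presence of the tone.
predictive = [(True, True)] * 10 + [(False, False)] * 10

# Hypothetical case 2: shock is equally likely with or without the tone.
uninformative = (
    [(True, True)] * 5 + [(True, False)] * 5
    + [(False, True)] * 5 + [(False, False)] * 5
)

if __name__ == "__main__":
    print("predictive tone:", contingency(predictive))
    print("uninformative tone:", contingency(uninformative))
```

In both hypothetical cases the tone is sometimes immediately followed by shock, so temporal contiguity is present in each. Only in the first case does the tone carry information about when shock will occur.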
In the blocking group, conditioning occurred to the tone and not the light. Conditioning occurred to both elements of the compound in the control group. It is as though the prior experience with the tone resulted in the blocking group subjects not paying attention to the light in the second phase. The light was redundant. It did not provide additional information.
A novel and fun demonstration of blocking in college students involved a computerized video game (Arcediano, Matute, and Miller, 1997). Subjects tried to protect the earth from invasion by Martians with a laser gun (the space bar). Unfortunately, the enterprising Martians had developed an anti-laser shield. If the subject fired when the shield was in place, their laser-gun would be ineffective permitting a bunch of Martians to land and do their mischief. A flashing light preceded implementation of the laser-shield for subjects in the blocking group. A control group did not experience a predictive stimulus for the laser-shield. Subsequently, both groups experienced a compound stimulus consisting of the flashing light and a complex tone. The control group associated the tone with activation of the laser-shield whereas, due to their prior history with the light, the blocking group did not. For them, the tone was redundant.
The blocking procedure demonstrates that temporal contiguity between events, even in a predictive relationship, is not sufficient for learning to occur. In the second phase of the blocking procedure, the compound stimulus precedes the US. According to Pavlov, since both components are contiguous with the US, both should become associated with it and eventually elicit CRs. The combination of Rescorla’s (1966) and Kamin’s (1969) findings leads to the conclusion that learning occurs when individuals obtain new information enabling them to predict events they were unable to previously predict. Kamin suggested that this occurs only when we are surprised. That is, as long as events are proceeding as expected, we do not learn. Once something unexpected occurs, individuals search for relevant information. Many of our activities may be described as “habitual” (Kirsch, Lynn, Vigorito, and Miller, 2004) or “automatic” (Aarts and Dijksterhuis, 2000). We have all had the experience of riding a bike or driving as though we are on “auto pilot.” We are not consciously engaged in steering as long as events are proceeding normally. Once something unexpected occurs we snap to attention and focus on the immediate environmental circumstances. This provides the opportunity to acquire new information. This is a much more active and adaptive understanding of predictive learning than that provided by Pavlov’s stimulus substitution model (see Rescorla, 1988).
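Kamin’s suggestion that learning depends on surprise is often formalized with the Rescorla-Wagner model (Rescorla and Wagner, 1972), which is not otherwise covered in this chapter. The sketch below is a minimal version under stated assumptions: the learning rate, the number of trials, and the function names `rw_update` and `run_group` are arbitrary choices for illustration. On each trial, every stimulus that is present changes in proportion to a shared prediction error, the difference between what happened and what the stimuli together predicted.

```python
def rw_update(strengths, present, us_present, learning_rate=0.3, max_strength=1.0):
    """Apply one Rescorla-Wagner trial to the cues that are present.

    All present cues share a single prediction error ("surprise").
    learning_rate and max_strength are illustrative assumptions.
    """
    prediction = sum(strengths[cue] for cue in present)
    outcome = max_strength if us_present else 0.0
    error = outcome - prediction
    for cue in present:
        strengths[cue] += learning_rate * error


def run_group(pretrain_tone, phase1_trials=20, phase2_trials=20):
    """Simulate a blocking group (pretrain_tone=True) or a control group."""
    strengths = {"tone": 0.0, "light": 0.0}
    if pretrain_tone:
        # Phase 1 (blocking group only): tone alone is followed by shock.
        for _ in range(phase1_trials):
            rw_update(strengths, ["tone"], us_present=True)
    # Phase 2 (both groups): tone and light together are followed by shock.
    for _ in range(phase2_trials):
        rw_update(strengths, ["tone", "light"], us_present=True)
    return strengths


if __name__ == "__main__":
    print("blocking group:", run_group(pretrain_tone=True))
    print("control group:", run_group(pretrain_tone=False))
```

With these assumed values, the light ends close to zero in the simulated blocking group because the tone already predicts the shock and little surprise remains. In the simulated control group, the tone and the light each end at about half of the maximum strength, matching the pattern that conditioning occurs to both elements of the compound.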
Must the Conditioned Response Resemble the Unconditioned Response?
We will now examine the fourth assumption of that model, that the conditioned response always resembles the unconditioned response. Meat powder reflexively elicits salivation and Pavlov observed the same reaction to a conditioned stimulus predictive of meat powder. Puffs of air reflexively elicit eye blinks and taps on the knee elicit knee jerks. The conditioned responses are similar to the unconditioned responses in research involving puffs of air and knee taps as unconditioned stimuli. It is understandable that Pavlov and others believed for so long that the conditioned response must resemble if not be identical to the unconditioned response. However, Zener (1937) took movies of dogs undergoing salivary conditioning and disagreed with this conclusion. He observed, “Despite Pavlov’s assertions, the dog does not appear to be eating an imaginary food. It is a different response, anthropomorphically describable as looking for, expecting, the fall of food with a readiness to perform the eating behavior which will occur when the food falls.”
Kimble (1961, p. 54) offered the possible interpretation that “the function of the conditioned response is to prepare the organism for the occurrence of the unconditioned stimulus.” Research by Shepard Siegel (1975, 1977, 1984, 2005) has swung the pendulum toward widespread acceptance of this interpretation of the nature of the conditioned response. Siegel’s research involved administration of a drug as the unconditioned stimulus. For example, rats were injected with insulin in the presence of a novel stimulus (Siegel, 1975). Insulin is a drug that lowers blood sugar level and is often used to treat diabetics. Eventually, a conditioned response developed to the novel stimulus (now a CS). However, rather than decreasing, the blood sugar level increased in response to the CS. Siegel described this increase as a compensatory response in preparation for the effect of insulin. He argued that it was similar to other homeostatic mechanisms designed to maintain optimal levels of biological processes (e.g., temperature, white blood cell count, fluid levels, etc.). Similar compensatory responses have been demonstrated with morphine, a drug having analgesic properties (Siegel, 1977) and with caffeine (Siegel, 2005). Siegel (2008) has gone so far as to suggest that “the learning researcher is a homeostasis researcher.”
Siegel has developed a fascinating and influential model of drug tolerance and overdose effects based upon his findings concerning the acquisition of compensatory responses (Siegel, 1983). He suggested that many so-called heroin overdoses are actually the result of the same dosage being consumed differently or in a different environment. Such an effect has actually been demonstrated experimentally with rats. Whereas 34 percent of rats administered a higher than usual dosage of heroin in the same cage died, 64 percent administered the same dosage in a different cage died (Siegel, Hinson, Krank, & McCully, 1982). As an experiment, this study has high internal validity but obviously could not be replicated with human subjects. In a study with high external validity, Siegel interviewed survivors of suspected heroin overdoses. Most insisted they had taken the usual quantity but indicated that they had used a different technique or consumed the drug in a different environment (Siegel, 1984). This combination of high external validity and high internal validity results makes a compelling case for Siegel’s learning model of drug tolerance and overdose effects.
Drug-induced compensatory responses are consistent with the interpretation that the conditioned response constitutes preparation for the unconditioned stimulus. Combining this interpretation with the conclusions reached regarding the necessity of predictiveness for classical conditioning to occur leads to the following alternative to Pavlov’s stimulus substitution model: Classical conditioning is an adaptive process whereby individuals acquire the ability to predict future events and prepare for their occurrence.