What leads an animal to perform a behavior?

General; 04 August 2020

As we already know, a behavior is shaped by its consequences: depending on what those consequences are, the behavior will be more or less likely to be repeated.

For example, if the consequence is undesirable, the probability of the behavior recurring will decrease; conversely, if it is desirable, that probability will increase.

Operant conditioning covers several techniques, but on this occasion we will focus on positive reinforcement, in which we can distinguish two types of reinforcement:

• Primary reinforcement: reinforcers that have intrinsic value, that are reinforcing by nature, and that do not have to be learned through previous experience. Examples: food, temperature, water, sex…

• Secondary reinforcement: also called conditioned reinforcement. Its value comes from a history of association with a primary reinforcer or with another secondary reinforcer. Examples: rubs, toys, different types of tactile contact, saying "very well", and even the bridge itself.

We consider it important to balance both types of reinforcement in a training program: if an animal does not want to eat or is sick, a session that relies only on primary reinforcement may not be very successful. If, on the other hand, we have conditioned other types of reinforcement, we have more variety and more opportunities to interact with our animals.

What is a schedule of reinforcement?

A schedule of reinforcement is a precise set of rules that determines how the presentation of the reinforcer relates to the animal's responses. The delivery of the reinforcer can depend on several factors, such as the number of responses, the time elapsed, etc.

The most common schedules of reinforcement in animal training are:

• Continuous reinforcement: each correct behavior is reinforced with primary reinforcement. It is usually the schedule trainers use most to train a new behavior, as it generates more motivation and consistency.

• Intermittent reinforcement: only some responses are reinforced. It is the schedule that generates the greatest resistance to extinction and, on ratio schedules, high rates of responding. Within it we find:

- Fixed ratio: the reinforcement always appears after a given number of responses. For example, we bridge and reinforce after every second behavior, and always after every second one.

- Variable ratio: the animal performs the behavior and we bridge after a variable number of responses. For example, a dolphin jumps and sometimes we bridge after the first jump, sometimes after the second, sometimes after the fourth, and so on.

- Fixed interval: the reinforcement appears after the same period of time on every occasion. For example, our animal performs a behavior and we always bridge after 60 seconds. The response tends to improve as the moment of reinforcement approaches, but the criterion tends to slip during the first seconds of each interval.

- Variable interval: the response is reinforced after a time interval that changes. For example, an animal performs a behavior and the bridge comes after 5 seconds, or 10, or 20...
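The schedules above can be sketched as small decision rules that answer one question: "should this response be reinforced?". The Python below is only an illustrative sketch. The class names, the `reinforce(t)` method and the numeric ranges are hypothetical, and the two interval schedules are modeled on the assumption that the first response made once the interval has elapsed is the one that gets reinforced. Continuous reinforcement is treated here as a fixed ratio of 1.

```python
import random


class FixedRatio:
    """Reinforce every n-th correct response (n=1 is continuous reinforcement)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def reinforce(self, t):
        # t (time in seconds) is unused here; kept so all schedules share one interface.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a number of responses that changes each time."""

    def __init__(self, low, high, rng=None):
        self.low, self.high = low, high
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.target = self.rng.randint(low, high)

    def reinforce(self, t):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self.rng.randint(self.low, self.high)
            return True
        return False


class FixedInterval:
    """Assumption: reinforce the first response once `seconds` have elapsed."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.start = 0.0

    def reinforce(self, t):
        if t - self.start >= self.seconds:
            self.start = t  # the next interval starts at this reinforcement
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the waiting time is redrawn after each reinforcement."""

    def __init__(self, waits, rng=None):
        self.waits = list(waits)
        self.rng = rng if rng is not None else random.Random()
        self.start = 0.0
        self.wait = self.rng.choice(self.waits)

    def reinforce(self, t):
        if t - self.start >= self.wait:
            self.start = t
            self.wait = self.rng.choice(self.waits)
            return True
        return False


# Fixed ratio of 2: every second behavior is bridged and reinforced.
fr2 = FixedRatio(2)
print([fr2.reinforce(t) for t in range(6)])
# [False, True, False, True, False, True]
```

The shared `reinforce(t)` interface is a design choice for the sketch only: it lets a session loop swap one schedule for another without changing anything else.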

Positive reinforcement vs. negative reinforcement

Despite what many people believe, 'positive' and 'negative' reinforcement do not mean 'good' and 'bad'. Positive reinforcement is when we add a stimulus that the animal wants, such as food, and this increases the probability of a behavior.

Negative reinforcement, by contrast, is when we remove an unwanted stimulus to increase the probability of a behavior, as happens with herding dogs and sheep. The dog pursues the sheep and barks at them, adding an undesired stimulus. (This particular action is considered positive punishment, which is when we add something that decreases the probability of a behavior.) When the sheep move toward their enclosure and enter it, the dog withdraws; the sheep have therefore entered through negative reinforcement.

Another example from day-to-day life is the uncomfortable beeping a car emits when we are not wearing the seat belt. Once we put the belt on, the unwanted stimulus stops; we have therefore performed the behavior through negative reinforcement.

This type of reinforcement is one of many techniques available, but we recommend using it only at very specific moments and only in expert hands. Basing your training on positive reinforcement builds a much more reliable, positive and safe history for the animal.
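The terms used in this section depend on only two questions: is a stimulus added or removed, and does the behavior become more or less likely? As a rough illustration, that logic can be written as a small Python function. The names `Change`, `Effect` and `classify` are hypothetical, and the fourth combination (negative punishment, removing something so that a behavior decreases) is not discussed in this article; it is included only to complete the table.

```python
from enum import Enum


class Change(Enum):
    ADDED = "added"      # a stimulus is presented
    REMOVED = "removed"  # a stimulus is taken away


class Effect(Enum):
    INCREASES = "increases"  # the behavior becomes more likely
    DECREASES = "decreases"  # the behavior becomes less likely


def classify(change, effect):
    """Name the operant procedure from what happens to the stimulus and the behavior."""
    sign = "positive" if change is Change.ADDED else "negative"
    kind = "reinforcement" if effect is Effect.INCREASES else "punishment"
    return f"{sign} {kind}"


# The seat-belt beep stops and buckling up becomes more likely.
print(classify(Change.REMOVED, Effect.INCREASES))  # negative reinforcement
# The dog's barking is added and the sheep's lingering becomes less likely.
print(classify(Change.ADDED, Effect.DECREASES))    # positive punishment
```

Note that "positive" and "negative" map only to adding or removing a stimulus, never to whether the procedure is good or bad for the animal.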

Differential reinforcement

Differential reinforcement is the process of reinforcing the desired responses and not any others, whether the response follows a cue we have given the animal or is a behavior the animal offers while we observe it.

It is a very useful technique in which the amount of reinforcement has a direct relationship with the quality of the response that an animal will offer us.

In other words, the more reinforcement an animal gets for a response, the better and more accurate its performance of that behavior will be in the future.

This variation in the amount of reinforcement is called magnitude, or a "jackpot", and is used to highlight a great response from the animal.

However, some trainers, consciously or unconsciously, practice their own kind of differential reinforcement through a habit we do not recommend: varying the intonation of the bridge. If the behavior has been very good, their whistle sounds louder or longer than if the behavior is simply correct.

At WeZooit we respect any technique that does not compromise animal welfare, but we consider this practice of varying the bridge confusing, unnecessary and impractical when establishing a training program.

Finally, we recommend varying the reinforcement and reinforcing each correct response with primary reinforcers, with secondary reinforcers, and even with other behaviors the animal already knows.

In this way we will help to maintain the motivation of the animal and its expectation of reinforcement.

And remember, if it’s possible... WeZooit!