Just what does the clicker mean to your dog?
To my dogs, it (the clicker/marker word) means just three things:
- Yes, that was the correct behaviour
- Yes, there is a reward coming
- Yes, you can stop doing that behaviour
This is the way I have used clicker training for the last 20+ years. I like the clarity it gives to my dogs: there is no guessing on their part; they know they have got the behaviour right and that a reward is coming. There is no doubt in their minds as to whether they should continue with that behaviour or whether they are free to move about and ‘re-set’ themselves, or even to have a break.
I know a lot of people use the clicker as a ‘keep going signal’, meaning that the dog keeps doing the particular behaviour that has been clicked. Don’t get me wrong, I do use a keep going signal (KGS) at times; it just tends to be a verbal one and one that isn’t used as a marker. It is more a verbal encourager than a marker, such as ‘Super dog, aren’t you clever?’
Why don’t I use the clicker as a KGS? I want the dog to be crystal clear on what the clicker means, rather than having it mean ‘yes, that behaviour is over’ one minute and ‘keep doing that behaviour’ the next. I don’t want the dog getting confused, as confusion can lead to frustration, and frustration can become an emotional part of the training process; really not something that we want.
Yes, I know lots of people successfully use the clicker as a KGS and precision marker, and add in a separate release cue such as ‘Break’ or similar. However, my preference is to keep my training ‘clean’ and to only use the clicker (marker word) as an event marker rather than a KGS.
As a side note, I also don’t use the clicker for Two-fers and Three-fers. I always reward after I have marked a behaviour; I don’t get the dog performing multiple repetitions of that behaviour for multiple clicks and only one reward. The click/marker is a promise of a reward being delivered and I have no intention of breaking that promise.
Of course, if the dog decides to hold that position after the click, then that is their choice and, depending on what I am training, I may reward in that position or I may want them to move so that I can ‘re-set’ them. When I’m using 300 peck for teaching stays, I usually find that, after a few repetitions, the dogs naturally choose to stay in position after the click, and that is fine. If they do move between the click and the reward, or between the reward and the reset cue, then I’m not bothered: they have had ‘permission’ to break the position because I had marked the behaviour, and the click had ended it.
I am now experimenting with different markers that each tell the dog which reward is coming and how/where it will be delivered. So ‘Yes’ means the food is delivered to the dog and that they should stay put until the food arrives; ‘Get It’ means the food is going to be thrown for them to chase; ‘Catch’ is fairly obvious; and ‘Find It’ means search for food dropped on the floor. I’m also adding a marker that means drive to a dish/container of food (or the Memory trainer). ‘Go’ as a marker means yes, you’ve done that correctly, now drive to that dead toy. We are working on ‘Fetch’ meaning chase a moving toy and ‘Take it’ meaning take the toy from my hands, and I need to clarify a tug marker.
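For anyone who likes to see this written down, the marker list above can be sketched as a simple lookup table. This is only an illustration: the names `Marker`, `MARKERS` and `reward_plan` are made up for the example, the ‘Catch’ entry assumes the food is tossed for the dog to catch, and the dish and tug markers are left out because they don’t have settled cue words yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    reward: str      # "food" or "toy"
    delivery: str    # how/where the reward arrives
    dog_moves: bool  # False means the dog stays put and waits


# Hypothetical table built from the marker words described in the post.
MARKERS = {
    "Yes": Marker("food", "brought to the dog", False),
    "Get It": Marker("food", "thrown for the dog to chase", True),
    "Catch": Marker("food", "tossed for the dog to catch", False),  # assumed meaning
    "Find It": Marker("food", "dropped on the floor to search for", True),
    "Go": Marker("toy", "dead toy the dog drives to", True),
    "Fetch": Marker("toy", "moving toy the dog chases", True),
    "Take it": Marker("toy", "taken from the handler's hands", True),
}


def reward_plan(marker_word: str) -> str:
    """Describe what the dog can expect after hearing this marker."""
    marker = MARKERS[marker_word]
    return f"{marker.reward}: {marker.delivery}"


print(reward_plan("Yes"))
print(reward_plan("Go"))
```

The point of the table is that each word maps to exactly one reward and one delivery, which is where the clarity comes from.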
We are having great fun and I’m loving the clarity that this approach is bringing to my training.
I will just add that, apart from using 300 peck to build duration, I don’t tend to switch to variable reinforcement schedules. My dogs are always rewarded for the correct behaviour; they are always on a continuous schedule of reinforcement. What I do move on to is using variable reinforcers, so sometimes it is a piece of kibble that they get, and sometimes it could be roast beef. They may be rewarded by being allowed to go and sniff or to go for a swim (or a wallow in the mud if your name is Asia). We might have a game with a toy, or we might play chase games. They may get a scratch or verbal praise; their reward could be anything. The correct performance of a behaviour will always be rewarded.
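The difference between a variable schedule and variable reinforcers can be shown in a few lines. This is a rough sketch under my own assumptions: the function name `reinforce` and the random pick are purely illustrative (in real life the reward is chosen to suit the dog and the moment, not by a dice roll).

```python
import random

# Reinforcers mentioned in the post; the list itself is illustrative.
REINFORCERS = [
    "kibble", "roast beef", "go and sniff", "swim",
    "toy game", "chase game", "scratch", "verbal praise",
]


def reinforce(correct: bool, rng=random):
    """Continuous schedule: every correct behaviour earns a reward.

    Only the type of reward varies. `rng` is anything with a
    `choice` method, e.g. the `random` module or `random.Random(seed)`.
    """
    if not correct:
        return None
    return rng.choice(REINFORCERS)


# Ten correct repetitions always produce ten rewards.
session = [reinforce(True) for _ in range(10)]
print(len([r for r in session if r is not None]))
```

A variable schedule would instead skip the reward on some correct repetitions, which is exactly what this sketch never does.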
So just what does your click or marker word mean to your dog?
The saying ‘Reward the good and ignore the bad’ has a lot to answer for in how people view reward based training (and trainers). Some people seem to think that reward based trainers will ignore all sorts of bad behaviour (such as barking, aggression, chewing inappropriate things, jumping up at people etc.) and will just wait patiently for the dog to stop doing that behaviour and do something that can be rewarded. This just isn’t the case and the saying is grossly oversimplified (have a look also at my blogs Proactive not Passive and Changing Challenging Behaviour).
All too often, trainers just focus on using the 4 quadrants of Operant conditioning and forget about all the other ways that organisms learn, and that can be a real handicap for a trainer. Yes, we know that rewarding appropriate behaviour with something the dog wants will lead to an increase in that behaviour (R+) and that withholding something the dog wants will reduce undesired behaviour (P-). We also know that positively punishing the dog (P+), by applying something that the dog will actively work to avoid, will reduce behaviour. We also know that removing something that the dog will work to avoid will increase a desired behaviour (R-). What else do we need to know?
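If the four labels are hard to keep straight, they boil down to two questions: was something added or removed, and did the behaviour increase or decrease? Here is a minimal sketch of that lookup; the function name `quadrant` and the string arguments are just assumptions for the example.

```python
def quadrant(stimulus: str, behaviour: str) -> str:
    """Name the operant conditioning quadrant.

    stimulus:  "added" or "removed" (what happened to the stimulus)
    behaviour: "increases" or "decreases" (what happened to the behaviour)
    """
    sign = {"added": "+", "removed": "-"}[stimulus]
    letter = {"increases": "R", "decreases": "P"}[behaviour]
    return letter + sign


print(quadrant("added", "increases"))    # treat given, behaviour goes up
print(quadrant("removed", "decreases"))  # wanted thing withheld, behaviour goes down
```

Note that ‘+’ and ‘-’ only describe adding or removing something; they say nothing about whether the experience is pleasant for the dog.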
All too often, Classical Conditioning gets forgotten about (it does tend to go hand in hand with Operant conditioning; it is difficult to split them completely). Classical conditioning deals with reflexes and conditioned emotional responses. A dog that is fearful of something needs to be classically conditioned to learn that the scary thing isn’t that scary, so the most fabulous rewards appear in the presence of the scary thing. I’m sure if someone offered me enough chocolate, I could eventually learn not to be frightened of earwigs (shudder).
The opposite side of the coin is sensitisation, where a dog becomes more and more worried by something. This is a natural trait and helps keep animals safe from predators. It also occurs in pets that are overwhelmed in their socialisation experiences and become worried by something, or it can be a by-product of using positive punishment or flooding.
Another understated method is Premack or Grandma’s Law (eat your greens and you’ll get dessert). Basically, Premack makes a less probable behaviour become more probable. Let’s use Beau as an example. She is a ball obsessed spaniel that really finds it difficult to ignore a tennis ball, even if she already has one in her mouth. The behaviour that I want to make more likely is her letting go of/leaving the ball (she would much rather hang onto the ball, so this is a low probability behaviour). I reward her leaving the ball by letting her go and get the ball (this is a highly desirable behaviour as far as she is concerned). By using a highly desirable activity to reward a much less desirable behaviour (as far as the dog is concerned), we are gradually building a more reliable leave.
The same principle can be used to increase the reliability of the recall. If your dog chases critters, then you can use that to help the recall away from critters. The chasing of critters is a highly desirable (to the dog) behaviour (high probability) and recalling away from the critter is less desirable to the dog (low probability), but allowing the dog to return to chasing after it has recalled will make that recall from critters much stronger. Just bear in mind that by the time your dog has been called away and you send it back, the critter will be long gone. The dog still gets the fun of sniffing where it was.
How many other examples of Premack can you think of?
Then there is good old Habituation. Basically, this just means being exposed to something and getting used to it. It should be a non-event (neutral) really, with no positive or negative emotional responses. I used to live in a house next to a church with a chiming clock. When we first moved in, I heard that darn clock chime every quarter of an hour. It didn’t take long for that sound to become just background noise and I had to really listen for that clock chiming if I wanted to check the time. The same happens with people that live next to busy roads or next to a railway line. Allowing a dog to explore an environment before asking them to work is a form of habituation (or acclimatisation). The more environments they are used to being in, the faster they will habituate to new ones.
Socialising a puppy is basically habituation, as we want the puppy to be used to everyday things. It should be a neutral process or mildly positive (see Keep those experiences positive).
We could also talk about Flooding (sink or swim approach), but I really hope that no one uses this approach with dogs any more, as it is not the most humane approach and just results in a dog shutting down through excessive stress and learned helplessness (if you can’t escape something, you just give in to the inevitable).
Extinction is another way for an organism to learn. A previously reinforced behaviour is no longer reinforced (rewarded) and gradually disappears. This often happens by accident, when the pet owner forgets to reward a desired behaviour and, over time, the dog stops doing that behaviour and does something else that does earn them reinforcement. This often happens with behaviours such as recall and loose lead walking. Extinction can result in a large amount of frustration. Just try not feeding a dog titbits from the table when it has had a long history of being fed titbits that way. You will see the frustration build and, if you persist (many owners will give up), you will see an extinction burst and then the behaviour goes away. Take note though, you only have to reinforce that behaviour again and it will be back to full strength very quickly and, this time, it will be harder to extinguish.
A better way to extinguish behaviour is to couple extinction with Differential reinforcement, where a different behaviour is reinforced and the undesired one extinguished. There are several approaches to using differential reinforcement: DRI, DRO, DRA and DRL.
DRI – differential reinforcement of an incompatible behaviour. Your dog can’t jump up on someone if he is taught to sit, because sitting is incompatible with jumping up. Training your dog to go to its bed or to a mat when the doorbell rings is another form of DRI. I’m using DRI to teach Lara to leave me be whilst I am training another dog. In the video clip, she is being rewarded for staying on a platform while the other dog is working.
You could also use DRI to teach a puppy not to nip, by reinforcing for them carrying a toy, for example.
DRO – differential reinforcement of another behaviour, provided that the undesirable one doesn’t occur within a defined, fixed period of time. So if our puppy doesn’t mouth us within 5 seconds of being stroked or played with, then they are rewarded, no matter what behaviour they are exhibiting. You do need to know how frequently the mouthing occurs, so that you can choose a time period the puppy is able to succeed in.
DRA – differential reinforcement of alternative behaviour. This is useful when it is difficult to find a behaviour that is incompatible with the undesired one, so another behaviour is chosen that can be reinforced.
DRL – differential reinforcement of lower frequency. The aim is to decrease the frequency of the undesired behaviour, but not necessarily to remove it altogether. It doesn’t tend to get used a lot in dog training. Some trainers have defined DRL as differential reinforcement of lower intensity.
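Of the four, DRO is the one with a timing rule, so it is the easiest to sketch out. The example below is a simplified illustration under my own assumptions: the session is cut into back-to-back 5 second windows, the function name `dro_rewards` and the sample timings are made up, and other versions of DRO restart the clock each time the unwanted behaviour happens.

```python
def dro_rewards(mouthing_times, session_length, interval=5.0):
    """Return the times (in seconds) at which a reward is earned.

    A reward is given at the end of every interval in which no
    mouthing occurred, whatever else the puppy was doing.
    """
    rewards = []
    start = 0.0
    while start + interval <= session_length:
        end = start + interval
        mouthed = any(start <= t < end for t in mouthing_times)
        if not mouthed:
            rewards.append(end)
        start = end
    return rewards


# Hypothetical 20 second session with mouthing at 3 s and 12.5 s.
print(dro_rewards([3.0, 12.5], 20.0))  # → [10.0, 20.0]
```

If the puppy is mouthing in almost every window, the window is too long and needs shortening until rewards are being earned most of the time.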
We also have Insight learning, Latent learning, Social learning, Counter conditioning, Systematic desensitisation and Observational learning to consider.
Learning theory and positive dog training is so much more than just rewarding the good and ignoring the bad.