Positive Reinforcement and Clicker Training
Created | Updated Mar 25, 2008
The art of positive reinforcement lies in the observational and timing skills of the trainer. At the exact moment a desired behaviour occurs, the event must be marked and then rewarded, enough times for the trainee to be able to work out exactly what the desired behaviour is.
Clicker training is fast becoming very popular in dog training, and this will form the focus of this Entry. Event marking and reward have been used for decades to train animals where it is impossible to physically manoeuvre them into a particular position (which was the traditional way of training dogs). For example, it's impossible to force a dolphin to do a double-backwards-somersault through a hoop whilst whistling the 'Star Spangled Banner', so positive reinforcement is used instead.
Cats (great at operating light switches if they can reach), sheep, whales (if you've got one), rabbits and horses will all respond to positive reinforcement, with a clicker or a more appropriate event marker. It also works for humans, although possibly better as a party game or team-bonding exercise than as an office management technique.
If you have a different sort of pet, do try this with them - adapt the actions taught, but keep the basic method the same.
The Event Marker
Training dogs with positive reinforcement is usually referred to as 'clicker training' because the usual event marker is a clicker: a small box with a metal tongue that 'clicks' when pressed. This is ideal as the sound is unique and something the dog doesn't hear very often, but it is not essential. Some dogs don't like the clicking sound, or a clicker may be too restrictive - if the dog does something you want when you don't have the clicker to hand, for instance. Any sound (or gesture, if training a deaf animal) can be used, as long as it isn't one the dog hears at any other time.
Some alternatives could be:
- Clicking a pen
- Making a clicking sound with the mouth
- An exaggerated word - 'gooood' or 'yeeeesss' for instance
The Reward
This is usually food. If the dog is more play-oriented, a quick game can be used instead, but there is a danger of exciting the dog too much and distracting it from what it just learned.
It's unlikely the dog will work for its dinner, which it gets for free anyway. Ideally the rewards will be very small pieces of whatever treat the dog likes, mixed with some of its normal food, or biscuits if dry. Keep some 'bonus treats' aside, so that if the dog does something really well it can be given a higher reward. Dogs are likely to work harder for longer if they don't know what treat they are getting each time. They'll want that bit of liver cake instead of their normal dinner, and will work until they get it.
Tuning the Click
To teach your dog that click equals reward, click and treat, and repeat a few times. Once your dog starts looking at the clicker, waiting for it to click again, you know they've got the idea. If you click, you must treat, even if you clicked by mistake, or got the timing wrong.
Capturing
To start teaching your dog something, decide what you want them to do. Start with something simple, even if they already know it: the 'sit', for example. If they're already sitting, throw a treat to one side of them, so they have to get up to get it. When they sit again, click, then reward. If they don't sit, just hang around and wait. If they're wandering around, sniffing, just let them do it and be patient. They will eventually sit, even if you have to watch them for a while. Once their bottom touches the floor, click and reward - ideally throwing the treat aside again so they have to get up. There's nothing more likely to get in the way of good training than having to move the dog out of the position you're training - the dog gets confused about what you want, and if you eventually move on to training from a distance it's just a nuisance.
At this stage, commands are not given. The dog learns the action; when the action is reliable and as you want it, the command is then taught.
Initially, start with short training sessions, and if you've got other animals or children, keep them out of the way. If you're keen, you can train in five- or ten-minute sessions every hour or so - that's better than keeping on for an hour once a day. Always start with a clear idea of the behaviour you want to train, remembering that this is a living animal that is likely to have its own ideas about what it's doing. Be flexible enough to change your plans. Always end on a plus, never a failure, and at the next training session be ready to start with some basic work to remind the dog of what it's learned.
Training sessions can be made longer as the dog progresses, allowing more than one behaviour to be practised at each session. However, when teaching a new behaviour, stick to that one behaviour for the session to avoid confusing the dog.
Moving On
Once the 'sit' has been mastered, move on to the 'down'. This may require more patience, as the dog knows he's supposed to sit when he sees the clicker. Suddenly, he's not getting his click, and he wants to know why. In the initial stages of training he may get frustrated, so take your time. Try to avoid luring or forcing him into position. Pet dogs spend most of their lives asleep, so eventually the dog will lie down (follow him about if you need to, or wait until a time when he's normally lying down anyway). Click, and treat (in a way that makes him get up again). He'll most likely start sitting again, but once more patience will pay off. Eventually the dog will learn that this clicking business is more complicated and will start 'offering' different behaviours and positions if at first he doesn't get his click.
Shaping
Shaping is what we call it when we want a particular behaviour, but have to break it down into little steps first. This is the reason we don't teach a command at the start; the initial behaviour that is clicked may not be the final position or action that we want.
For instance, you may want your dog to bark on command. This is useful for those everyday 'Lassie' situations. You've fallen down a well, and you want your dog to bark to get the attention of passers-by. He's busy sniffing, eating who knows what and wondering when you're coming out to get on with the walk; in fact, doing everything but barking. So here's how you teach him.
Get your clicker out. By now he will sit, lie down or dance a little jig to get you to click. Eventually, he might make a noise, even if he's just annoyed that you've not clicked - it could be a whine, a growl, possibly a bark straight away. No matter how small or unbarklike it is, click. 'That's great,' he thinks, and makes the noise again. Or maybe he spends another five minutes trying to work out what made you click. But he will make a noise again - click and reward. When he's got the idea that the noise is being rewarded (it could take three clicks, it could take more, but the association will come), and will happily make the noise quickly, withhold the click. Wait until he makes a different noise, then click and reward that until he gets it, and so on until he barks regularly. In this way, we can shape the dog's behaviour from his initial offering to the desired result.
This can be used for many different things. Suppose you want to teach the dog to go to bed on command when you've got visitors who don't want to be covered in dog kisses (strange to think there are people out there who don't want dogs jumping all over them). There are lots of ways you can do it, but one is to click him in there: click for one foot in the bed, click for two feet, click for standing in it, click for sitting in it, click for lying down, rewarding every click and not going on to the next stage until the previous one is totally understood. Taught this way, he is less likely to associate bed with punishment, and more likely to stay in there once he's in. If he's just put into his bed, or shouted at until he gets in, he may well just get out again when not restrained.
Behaviour Chains
A behaviour chain is where the dog does more than one action to get the reward. For instance, you might want him to come to you when you call, and then sit. It's not a very big chain, but it consists of a recall and a sit. Or you might want him to go and get you a beer out of the fridge. He has to be taught to walk away from you (something that goes against most dogs' instincts), go into a specific room, open the fridge door, select the correct item, shut the door, come back to you, and give you the item. All without having a sneaky bit himself, and all these steps have to be taught one at a time.
Teaching the chain is much like shaping, but generally it's best to shape 'backwards'. That is, teach the last thing first. That way, when the dog has learned each part of the chain and moves on to the earlier step, he's reinforced twice - once with the action he knows is correct, and once with the treat.
This is a useful way of breaking behaviours you don't want. For instance, if your dog likes to play fetch but is reluctant to give you the ball when he gets back to you, or teases you by running off again, give him the ball directly, then click him for letting it go (ideally in your hand, or possibly on the floor if you're not keen on mud or dog slobber). As long as he's not overly ball-obsessed he'll let go if offered a treat - click as his mouth opens and the treat becomes the reward. If he's so ball-obsessed he won't let go, carry another ball to offer him.
Once he's learned to let go on command, give him the ball and walk away. He then learns to come towards you - another part of the chain - to give you the ball, which he knows is correct and will be rewarded. This way he always knows what's coming next, and he doesn't have the excitement of running for the ball to distract him until he's really learnt what you want. You'll soon be able to throw the ball for him, knowing he'll come back with it for another go.
Hints and Tips
When shaping, if the dog won't stay in the position you want to mark for long enough to get the timing right, it's better to click them moving into position, rather than click them moving out of it.
Imagine you're taking a photograph. You want the picture of the perfect behaviour - so you click as if to freeze it in time.
The reward doesn't have to be instantaneous. The idea of the click is that they know what they did right, and that a treat is on its way. So if you catch your dog doing something great, you can mark the event, then go find the treat (as long as we're talking a few minutes, not half an hour later).
Clicker training can also be used with 'damaged' animals - so if an animal has been traumatised by something, it's possible they can be gradually reintroduced to it with this method. However, that is an advanced technique and not recommended for the beginner.
Be flexible. If you make up your mind you want to teach 'rollover' and the dog keeps jumping up on his hind legs, then click his 'Irish Dancing' instead. Don't switch between the two, but be prepared to work with your dog, rather than against him.
To start with, go with your dog's instincts. Most breeds and first crosses will be genetically programmed to perform certain tasks (retrieving, guarding, herding, etc), so try not to train activities that go against these instincts. Once you've been successful training with all these techniques over a period of time, then you can start on the harder behaviours.