Let me start by asking: Which does your dog like better, the anticipation of something good, or the good thing itself? Counter-intuitively, the research of Jaak Panksepp suggests it’s the anticipation of a reinforcement that is most enjoyable, not the reinforcement itself. Panksepp calls the emotion of anticipation the “seeking or wanting system,” versus the “liking system” that kicks in once an individual gets what it wants. In other words, which gets your dog more excited, hearing the clicker, or getting the treat? Hearing “Let’s go on a walk!” or actually going on the walk? (See here for a speech on this system by Panksepp.)
I thought of this a while ago when walking into the Overture Center in Madison, Wisconsin to see a show by the famous comedian, Jerry Seinfeld. I floated in on a sea of dopamine, barely believing I was going to see in person the man who had left me helpless with laughter in my own living room for so many years. After we got inside we found our seats, the lights finally dimmed and the man himself walked out onto the stage. Wheee, I was so jazzed! And he was good. We laughed a lot. But the giddy, roller coaster-like high I had felt as I walked into the auditorium was gone, and midway through the show I wanted it back.
It was Simon Gadbois’ thoughtful talk at SPARCS a few months ago that got me thinking about this topic again. One of Gadbois’ primary points was that learning is driven by motivation, and motivation is fueled by anticipation. He has done a lot of work with scent detection dogs (fascinating projects related to teaching dogs to find endangered species) and argues that good training capitalizes on the “seeking system” (the click, if you are a clicker trainer), not the “liking system” (the treat that comes after the click). For those of you interested in neurobiology, “liking,” or the feeling a dog has while eating a food reward, follows the same neurological pathway as morphine, while “seeking” stimulates the same route as cocaine. (See an interesting paper on neurological distinctions between “liking” and “wanting” by Berridge et al. 2009.)
Experimental research suggests that it is “seeking” rather than “liking” that best motivates an individual to learn. For example, Gadbois mentioned one of Panksepp’s studies in which cats were always given a reward when they touched one object, but only occasionally when they touched a second object. Guess which object the cats touched most? You got it, the second one. That is why Gadbois argues that clicker trainers should not give a treat every time they click. This all makes great sense to me until I think of chocolate, which I would much rather eat than anticipate eating, thank you very much.
You can imagine that Gadbois’ argument led to a spirited discussion. I was fascinated, because for many years I have wondered why standard clicker training always follows a click with a treat. Karen Pryor strongly advocates that we reinforce every click [secondary reinforcer] with a treat [primary reinforcer]. Ken Ramirez, one of the best animal trainers in the world in my opinion, always follows a click with a treat. But Gadbois went further: given the link between motivation and anticipation, he suggested that it was important to balance the “seeking” and “liking” systems, with more emphasis on the former than the latter during training. He strongly advocated not following every click [which creates anticipation] with a treat, far from it, for the reasons described above.
I’ve always argued that family dogs should be reinforced both intermittently (once the behavior is established) and with different types of reinforcement early on. Teaching a dog to sit when asked? Decrease the frequency of reinforcement as the dog becomes more reliable (depending, of course, on the level of difficulty) but early on vary the reinforcement, mixing food with play and belly rubs or whatever else the dog loves. I suspect my inclination has been in part based on what I’ve read in the learning literature and my own experiences over the years. I hereby admit I have never quite understood the importance of reinforcing every click with a treat (or other reinforcement), and so you can imagine I listened to Gadbois’ talk with special interest.
[Full disclosure: I’ve added the following based on some early excellent comments and a chance to reflect on the issue at more length.]
Keep in mind that, based on Pavlov’s research, the click is effective because the trainee has been classically conditioned to respond to the click as a secondary reinforcer. What surprised Pavlov was that the conditioned stimulus (like the click) had the same power, and elicited the same response, as the unconditioned stimulus (the food). Thus, if the click is as powerful as the food, Gadbois asked, why use the food? The answer might be in a talk given by Susan Friedman (thank you, Erin, for sending in this link!) that advocates always pairing a treat with the click, but fading out the click as a means of creating intermittent reinforcement.
Anyone want to pick up the debate? What about you? How do you balance “Seeking” and “Liking?” If you are a clicker trainer, do you always reinforce a click? Always click a correct response? If so, why?
[On a parallel note, Panksepp has always been one of my heroes. Besides being a stellar neurobiologist, he’s the guy who discovered that rats giggle. You can see a video of him initiating and recording their laughter here. Here, also, is his TED talk for those of you who haven’t seen him. And I was fascinated by Gadbois’ work using scent detection dogs to facilitate research on endangered species.]
MEANWHILE, back on the farm. Jim, the Border Collies and I are just back from a heavenly weekend at a Kathy Knox sheepdog clinic in Hinckley, Minnesota. Heavenly is the right word, because it was a perfect weekend. Kathy is an articulate and excellent trainer, the setting was heart-swellingly beautiful (and thank you “other Kathy” and Steve for hosting!), and I learned enough to make my head spin. Willie had a wonderful time, and seemed to love every minute of it, which made me happier than I can say. He did beautifully too, which was gratifying after his strange performance at the last trial.
Maggie struggled with sheep much flightier than she is used to, which provided a great opportunity for me to learn how to help her gain confidence and get control over the flock. I illustrated what I call “infinite trial learning,” and will go to sleep every night repeating: “Watch the sheep, Trisha, watch the sheep.” (I am a split second late when handling Willie because I focus too much on him instead of what the sheep are doing. I know perfectly well to “watch the sheep” because they will tell you what your dog is doing. I just seem to falter on converting that knowledge to practice.) Thus the phrase “infinite trial learning” and the following assigned grades: Willie–A. Maggie–B. Trisha–C, but extra credit for continued effort.
As usual, Jim made it twice as fun by being his usual perfect travel partner, not to mention schlepping Maggie’s crate in and out of motel rooms with nary a complaint. What lucky dogs Willie, Maggie and I are.
Here’s Odi, the good dog of Janet McN, who helped keep the sheep under control for the rest of us. What a guy.
This is Beck and Skye, watching another dog-and-handler team work the sheep. (For the love of God, Trisha, watch the sheep. Just like the dogs.)
Terry Golson says
I believe that you always reward after a click. Always. But the reward is not necessarily food. For a brilliant dog who thrives on seeking, it can be the opportunity to do something new; for a horse, the chance to do something faster. For my dog of little brain (I have one) food is what he lives for, even more than butt scratches, which are a close second. I just had a conversation with a horse clicker trainer who claimed that she didn’t treat after each click. I observed her. She was – it just wasn’t food. Also, isn’t the goal to train for duration and stimulus control? If you’ve got that stay up to 5 minutes, or you have that BC staring at a sheep and not moving, don’t they deserve a reward? If you’re progressing, increasing criteria, and increasing the challenge, then the reward after the click is certainly earned and necessary. I think that the biggest mistake that trainers make, especially pet owners, is to be stingy on rewards – both what they treat with and how frequently they do it, so suggesting that they don’t have to reward after the click is just confirming the attitude of “my dog should do it because I tell him to.” The click is for clarity. It loses its strength when there’s not a reward afterward. However, you certainly don’t have to click after every behavior. Say “good dog”, expect manners, toss a ball. But save the click for training, and reward after you use it.
Robin Jackson says
This gets into some very technical areas (much of Skinner’s work was on different reinforcement schedules), but first I’d argue that in dog training most intermittent reinforcement is NOT based on “not following a click with a treat” but rather in neither marking nor rewarding a performance. So no click and no treat. Not a click without a treat. The “anticipation” in a slot machine paradigm comes from the reward opportunity, not a marker for the player’s behaviour.
If I have a slot machine that rings bells and flashes lights but DOESN’T pay out, my expectation is that very soon gamblers will stop using it altogether. It will feel broken. And that’s what a click without a treat is.
Instead, the question of how often you should give a cue without a treat, or a performance opportunity without a reward, is a very interesting one. There’s no broken slot machine in that paradigm–there’s just a machine that doesn’t pay every time. I agree that may in fact be more motivating in some circumstances for some learners.
Again from a technical standpoint, I would be very careful about confusing a marker (which recognises a specific performance) with a “reward may be available” indicator.
If a cat hunts birds, bird song is a “reward may be available” indicator. There’s a bird around here somewhere. Whether you can catch it or not is a whole other question. Bird song triggers seeking behaviour. That’s way different than a click as marker.
A lot of Skinner’s work involved a “reward may be available” indicator, like a light going on to indicate pressing a lever MIGHT pay off with food. That light was NOT a behaviour marker like a typical clicker, because the light came on whether the rat went near the lever or not.
So I would argue that “seeking” ties in more to birdsong than clickers. “Reward opportunity” vs “marker for the seeker’s behaviour.”
Just giving the dog a cue (for a behaviour the dog is fluent in) should build anticipation in the seeking sense, because that’s what represents a reward opportunity.
Respectfully,
Robin J.
Margaret McLaughlin says
As a born-again clicker trainer, I always c/t, or, when I want to up the ante, c/throw the ball. I have observed, both in my dogs & in myself (checking the mailbox for the arrival of a long wanted book comes to mind) that the excitement comes with the click (or sight of the mail truck), & that the food is much less important. The ball is more important. So I guess it’s another “it depends”, surprise, surprise.
From a practical training standpoint, tho’, I mostly use food reinforcers (& always reinforce the click). A few reasons–by keeping the excitement under threshold, actual speed of learning increases. If 10 reps per minute is optimal, I’ve GOT to use food, since chasing, catching, & returning the ball to my hand eats up 30-45 secs for 1 rep. It’s also much more difficult to use the placement of the reward to reinforce position with a ball than with food for some things–heeling especially. Nina’s dancing out in front of heel position, hoping I’ll throw the ball, & then getting frustrated when I don’t, because she’s never IN heel position, so I can’t mark & reward. The flip side is that I can use the ball to develop love for an inherently non-rewarding activity like weaving, having the ball magically appear in front of her as she exits the last pole.
I do suspect, tho’, that the cocaine-like anticipation would fade if the reinforcement were more random, perhaps even more quickly with the low-value treats–I use kibble as a way to do lots & lots of reps without over-filling the dog. Even if the morphine fix is not as exciting, she’s getting it Every Time, with the ball for Best Effort. Maybe that’s the perfect drug cocktail?
Another–but related–question is when to stop routinely c/t because the behavior itself is reinforcing. Nina ran her first agility trial 2 weeks ago, & she was so pumped that she ignored the ball I scooped up on our way out of the ring. I was really pleased with her run, but even more pleased that agility has become its own reward.
Erin says
I could try to explain this, but I think going to the experts is probably better. Here is Dr Susan Friedman’s article “Blazing Clickers.” She covered this both in a talk at the last clicker expo and in this article.
http://behaviorworks.org/files/journals/Blazing%20Clickers.pdf
The main idea seems to be that this practice reduces the power of your training tool (the secondary reinforcer, in this case a click). If you want or need a variable reinforcement schedule, you should withhold the click, not the treat after the click.
chloe says
It is not my experience that anticipation is better than ‘the thing’. I’m thinking back on my agility years and obedience and now on training puppies, going to the park, etc. Nope, they seem as excited when the leash comes out and when the shoes get laced as they are on the ride to the park and when running like maniacs.
I always treat after clicking because I feel that that is the contract and I try to not break promises. However, once the clicker is put away, after the dog has learned a trick etc., I don’t always give a treat or play.
parallel says
More general question on clicker training. Is it possible for an animal to be TOO food motivated for clicker training to work? My cat is a very smart guy (even the behaviorists at Penn agreed), but for the life of me I can’t clicker train him. As soon as he hears a bag crinkle or smells treats, he becomes so obsessed with getting to them that he doesn’t really offer any behaviors beyond sniff/bite mom repeatedly until she hands it over. The best way I can describe it is that he almost becomes *panicked* when he realizes treats are coming. Have I just not been trying hard enough? Or can hyper food motivation be an actual problem?
Frances says
I don’t clicker train because I am a klutz with lousy timing and I ended up with confused dogs and muddled me, rapidly followed by happy dogs when I dropped the treat bag on the floor! But looking at my two, I begin to suspect that there may be as many different learning styles and motivational forces at play amongst dogs as amongst humans. Sophy is highly motivated to work things out for herself – she will try every possible route to get at an out of reach treat, and has even been known to paw the book it rests on towards her. If she does not get rewarded pretty frequently while playing training games (which is what we consider them), she will take herself off to find something more interesting. Poppy rather relies on me to be the Great Provider, of food, fun and friendliness. She will turn herself inside out to get the reward – which adds its own problems, as if one thing doesn’t work immediately, she will throw in half a dozen other behaviours in quick succession. Both of them would do extremely well with a really good trainer with good timing: unfortunately they have me, so we muddle through…!
So if you start with dogs with different motivations, add in early experiences of training methods often not very well applied and all the other variables, I fear we are heading back to our favourite response to most of these questions – it depends. It would be fascinating to see a proper trial done, though – using dogs with no history of clicker training, perhaps? Shelter dogs could only benefit from the attention and training, and should provide enough candidates to match cohorts.
Robin Jackson says
One more thought…
Clickers are used in different ways by different trainers. Most pet dog trainers use it as a precision marker, as in, “There–that’s what I’m paying off for now.” But a clicker is just a noise. Some trainers, particularly horse trainers, use it as a Keep Going Signal, in which case you’d almost always have multiple clicks per treat.
Bob Bailey has said that in his opinion clickers are way overused in pet dog training. Most obedience behaviours aren’t moves that require that degree of precision marking. Sit, Down, Target, Spin, Heel, Come can all be very effectively taught to most dogs with rewards but without a clicker (as Dr. Ian Dunbar has shown for years).
And nosework, specifically, is often taught with rewards but without a clicker because it is difficult for the human to know precisely what they’re marking, especially early on. (Much discussion of this in nosework circles for anyone who’s interested.)
So again, technical stuff. Before getting too deep into the discussion, it may be helpful to first verify exactly how and when the clicker is being used in the training protocol. If the handler does not intend the click to “end the behaviour” then the idea of a click without a reward has a very different meaning.
There are some trainers who use a clicker as a cue to start a behaviour, or as a Keep Going Signal mid behaviour, or as a No Reward Marker to restart a behaviour. For them, the relationship of click to treat will be understandably different than for a trainer who uses a clicker to mark the behaviour that is being rewarded.
So step 1 is to verify what the clicker is intended to communicate to the dog. Otherwise we could be in an apples to oranges discussion and not realise it.
Robin J.
Jessica Hekman says
A few years ago, I shadowed some sea lion trainers for two weeks. Instead of clicking, they use a verbal marker (“good”) which they refer to as a bridge. I noticed that they only intermittently followed the bridge with fish. When I asked why, I received a lecture on the power of intermittent reinforcement. When I explained that in the dog training world, we also believe in the power of intermittent reinforcement, but we accomplish it by only intermittently bridging/clicking, rather than only intermittently following the bridge/click with food, I got a “hmmmm.”
Hmmm! Do the two approaches have the same effect? I would love to see some research.
Jessica
Trisha says
Thanks for the excellent comments so far. I’ve added a note to the original post, because I hit “publish” too fast after writing the post in the car on the way home from the sheepdog clinic. (An inappropriate “click”?) Gadbois’ point was that if Pavlov was correct, that a conditioned stimulus (CS) eventually carries the power of the unconditioned stimulus (UCS), then how often does one really need to use the UCS? (A primer: “Unconditioned” means something inherently rewarding, like food. Some would call this a primary reinforcer. “Conditioned” means a stimulus not inherently reinforcing, like a click, but classically conditioned to be associated with something an animal inherently wants, like food.) Robin’s question about flashing lights from a slot machine is a good one, but here, perhaps, is another perspective: The flashing lights, one could argue, are not really the cue that something great is about to happen; they are there to tell everyone else in the casino that someone else just won money so they should keep putting their own money in the slot machine. But I get your point. Perhaps a better analogy is the use of praise. If you condition a dog to associate praise (a CS) with food (a UCS) then eventually it will have a similar power. However, I’ll be the first to say that when training a dog to do something difficult I’d pull out the chicken, even if the dog has been well conditioned to praise. More food (ha) for thought.
Thanks to Erin for the link to Susan Friedman’s talk on clickers and reinforcement schedules. Great link, thanks for adding it to the conversation. Dr. Friedman is speaking at APDT and I already have my calendar marked to attend her talk. I had lunch with her a while back and enjoyed it immensely, and have heard nothing but raves about her work. I hope to see others there; come up and say hi if you’re a blog reader and will be attending too.
Robin Jackson says
@Trisha,
However, as Pavlov also pointed out, the finding was not that “the CS eventually carries the power of the UCS and retains that power forever.” Rather it was that the “CS eventually carries the power of the UCS and retains that power for some time.”
That’s exactly Dr. Friedman’s point–the CS gains power with every pairing and loses power as the association is broken.
How often you have to refresh a secondary reinforcer has been a topic of much study, so lots of literature available on that.
As to whether the slot machine’s flashing lights are only there to alert others that there’s been a payoff, again there’s already literature on that, and the answer is no. Which is why toy slot machines also have lights and bells. This is exactly on point to the anticipation discussion, since players in isolation choose the machines with the extra bells and whistles, and overvalue those payoffs.
Consider in animal training the “Party!” Jackpot, where the trainer provides a handful of treats plus praise plus “Woo hoo!” plus excited body language. Bells and whistles to increase the perceived value of the payoff.
Robin Jackson says
@parallel,
Absolutely, some learners are so overwhelmed at the availability of a reward opportunity that they can’t focus on the training!
In her book CONTROL UNLEASHED: THE PUPPY PROGRAM, Leslie McDevitt discusses exactly this issue and has a chapter on protocols for training puppies who go over the top when they realise treats may be available. It might give you some ideas for working with your cat, so I’d see if you can find a dog sports friend to borrow the book from.
em says
What an intriguing topic!
I’ve never used a clicker myself, though my instinct inclines me to agree with Robin J- I think it is more confusing than motivating to click without a treat if the clicker is used as a marker.
Two cents aside, however, I do have an anecdote about the relationship between anticipation/reward vs the inherent value of the reward.
Back in his poor appetite days, Otis would very frequently refuse all but the most appealing food rewards (and sometimes those). He was likely to walk away and refuse food that was simply offered to him unprompted, or take it politely and spit it out (yes, seriously. Even chicken or liver sometimes) BUT, even in his worst bouts of non-eating, he was more likely to eat a treat if the person offering asked him for a behavior first, and MUCH more likely to take and eat it if they asked him to wait for it. Anticipation of an earned reward was certainly much more compelling to him than the reward by itself in that instant, which he wouldn’t have bothered with otherwise.
I suspect that with a lot of training, the anticipation is built in, or can easily be built in by raising the stakes, asking for more complex or difficult behaviors before marking and rewarding. The advantage of intermittent vs. consistent reinforcement may be complicated by many factors, including the power of habit, the difficulty that the dog experiences in interpreting or carrying out the cues, the degree to which the behavior itself is self-reinforcing.
Most of the behaviors described as more appealing when intermittently reinforced are simple ones like pressing a lever, or playing a video game, physically easy to master and do repeatedly. I’m not sure what the literature says about ‘high stakes’ behaviors. I might be willing to sit and plunk nickels into a slot machine for a slim chance of a pay off, but if I drive across town to the bakery anticipating fresh bread for dinner and they’re sold out, I will be ticked. And less likely to repeat the behavior.
Trisha says
Thanks Robin for adding so intriguingly to the conversation. I love hearing that there is a literature on the bells and whistles in casinos; (should have guessed, given the money involved…) and more to our topic, that there is an extensive literature on how often to pair the UCS with the CS. More on this later, related to pairing experience and research, but gotta get to working on my next book. For now, thanks for the thought provoking and informed addition to our discussion. Love it.
Robin Jackson says
@Frances,
I agree with you about different learning styles among individuals, and would go further and say that all motivators rise and fall in value, even for one individual, depending on context and reward history.
Hunger, thirst, desire for social interaction, desire for play, and sleepiness are all primary motivators, but which takes precedence in any one moment depends in part on their current level of satiation. My dog normally prefers peanut butter treats to water as a training reward, but he will occasionally leave a training session to go get a drink. It’s not that water is more motivating than PB, but that in that moment his thirst was his primary motivator.
For example, the cats choosing the random payoff lever makes perfect sense unless these were cats rescued from a starvation situation in a hoarder’s house, in which case I’d be astonished if they didn’t prefer the pays-off-every-time lever for quite some time. We see this with human foster kids coming from neglect situations–they hide food to ensure a continuous supply and it takes a long time before they can tolerate temporary scarcity that better cared for kids never even notice.
So “it depends,” indeed! :).
But still, there are useful generalisations to be made. My dog usually prefers PB to water, and provided water is readily available for those moments when his thirst jumps to the top of the list, our training sessions will be more effective if I do bring the treats. But there’s no doubt that individuals vary in many ways, and as a practical trainer, it’s important to be alert to those variations.
Evelynn says
I do not always reward after a click; I do at first, but not later on. Take one of my students as an example: we had issues teaching him to heel because he would chomp on the hand (a rescue with poor bite inhibition when excited). So we switched to a clicker, and we mark each correct step with a click, but he doesn’t get the treat until the exercise is over. In the beginning he got a click/treat for every step, and we did one step, two steps, and built it up. However, when he is not rewarded every time, even though he still gets the click, you can see him work even harder, with full eagerness. I have also noticed that the frequency needed varies a lot depending on the dog; a dog that bores easily definitely needs more frequent rewards.
Khris Erickson says
I always follow the click with a treat, however how I treat has changed since I’ve been following the amazing trainer Denise Fenzi. Denise does a lot of play as reinforcement and one of the ways she plays is food play — by playing a bit of keep-away with the treat, hiding the treat, and tossing the treat.
What I’ve found is that if I use food this way it is much more reinforcing to my dogs than if I just hand it to them. When I used to call my dogs in from the yard and handed them a high value treat I didn’t get a high rate of compliance. If the neighbor’s dog was out or there was a squirrel they didn’t always want to come in right away.
Now I play “kibble party”: they come in and I grab 10 pieces of kibble and scatter them on the floor. I get 99.99% compliance. Lower value treat, but because it’s delivered in a way that makes them use seeking behavior I’m getting better results.
I also toss treats for agility, and my dogs obviously love chasing after the treat.
Trisha says
Khris, I LOVE this idea of Denise Fenzi to make the food more exciting. Sounds like it creates even more anticipation, yes? Thanks so much for sharing, this is a brilliant use of the knowledge that “seeking” is a powerful motivator!
Bree Mize says
This makes me think of why there are so many casinos now all over the country. Gambling has an intermittent reward and look how much people seek it out. Wow. Great info.
Andre Delicata says
I would like to challenge the traditional concept of treats as a food treat or a click that the dog associates with a dog treat. Too often we pay too much attention to our dogs, petting them and fussing over them. This is not what dogs need but what we need. A dog needs exercise and mental stimulation, coupled with discipline. If we reserve our petting and other signs of affection for more deserving moments, these could be the treats instead of food treats.
Joe says
For a Labrador Retriever, it seems that the reward for a successful fetch is to throw the dummy again for another fetch. Cocoa anticipates fetching when I bring out the dummy, and make her heel-sit, but I am reasonably certain that the resulting fetch is more rewarding. When I bring out the dummy, she sometimes play-bows and barks at me (she almost NEVER barks otherwise!) to “Please get on with this!!” After a successful fetch, she will take a butt-pat, but what she really wants is to fetch again. How this works in to your comments above re: sheep herding dogs, I am not quite certain.
Robin Jackson says
@Evelynn,
Yours is a perfect example of one of those situations where a click does not end the behaviour, and so multiple clicks per treat makes sense provided there is a treat eventually when the behaviour does end.
What you’re doing is precisely what some of the horse trainers do, and in this case the clicker is being used as a Keep Going Signal. It doesn’t mark the completed behaviour for which a reward will be given. Instead it notes each step of a multi step behaviour (in your case literally). Or each phase of a duration behaviour.
Clickers can indeed be very effective for this use, as they’re a fairly unobtrusive sound registered quickly but also easily moved past, so they don’t cause the learner to lose momentum.
Secondary reinforcers are all about expectations, or as Chloe said, the contract the teacher had with the learner. If the expectation is frustrated, the power of the secondary reinforcer eventually declines.
In the case of a Keep Going Signal, expectations are increased because the communication is given that this is the correct path to the eventual reward.
So again, the details matter.
If I use the clicker to end the behaviour, I click, and then I walk away giving no treat, the power of the clicker is slightly diminished each time I fail to pay off.
If I use the clicker mid behaviour multiple times as a Keep Going Signal, and treat when the behaviour is complete, then I’ve strengthened the power of the clicker.
The clicker itself starts out neutral. It’s the associations built through reward history that give it meaning to the learner.
Dieta says
The local trainer here insists that every click is followed by a treat, though the type of treat varies from food to a quick game or a chance to explore. And yes, sometimes anticipation is as good as or better than the actual treat, but the treat has to come. Taking Trisha’s example – if the anticipation you experienced on your way to the show had been followed by a sign at the door, saying that Mr Seinfeld had fallen ill and the show was off, your disappointment may well have matched your earlier anticipation in intensity. So I may not necessarily give the treat straight away, may throw it or hide it for the dog to find, never letting anticipation become frustration (if I mess around for too long), but the treat definitely follows the click.
Lynn Whinery says
I remember watching a DVD of a Bob Bailey lecture. He mentioned training a large type of lizard to push a button. As a reinforcer he would drop in a tiny breed of lizard. The large lizard wasn’t very enthusiastic about the process, but he did it. Then, for some reason, (Mr Bailey isn’t sure why), he got the idea to put an empty shotgun cartridge in the tank with the lizard. Now, when he dropped in the tiny lizard, it ran straight into the cartridge. NOW the big lizard was super excited! He waited in anticipation for the chance to gobble up the little one. Then he ran back and pushed the button again. The ‘thrill of the hunt’ was far more reinforcing than the actual eating of the little lizard!
However, I still firmly believe that when initially training the behavior it should be rewarded every time. When we’re first learning something we *need* to know we’re on the right track. I think that this new research merely explains why an intermittent schedule of reinforcement is so effective – because of the ‘high’ of anticipation.
Ann Bemrose says
For me, every click deserves a treat. It’s my contract with the dog. The click ends the behaviour and says that it’s time to get paid. I use “yes” as a no-reward marker that means “keep going” when chaining behaviours together. I think that many trainers (including me) often use the clicker for a lot longer than they need to use it. Once the dog has learned the behaviour, we don’t need the clicker. Put a cue on the behaviour and, when performed reliably, put the clicker away or move onto something else.
I notice that my dogs become alert and excited when they know I have a clicker in my hand or pocket. They start looking for things to do. They’ve got loads of clicker experience and they want to get clicked. Games like 101 things to do with a box (or chair or whatever) are ideal demonstrations of Panksepp’s SEEKING system, I think. Yes, the dog gets a treat for every click, but what really seems to keep the dog going is the active engagement with the object in the dog’s quest for another click. The treat is part of the “reset,” then, that brings that particular sequence to an end so that another can begin.
The question of rewards is really important! I find that high-value rewards are great for getting attention for new learning or for absolutely crucial things like recalls. But as part of regular training, high-value rewards are distractions, so I switch to something the dog likes but doesn’t get so excited about that he can’t pay attention to what we’re doing. The trick is that pieces of a hot dog may be extremely valuable to some dogs and only okay to others, so it’s important to figure out what the range of rewards for any particular dog might be. While my dogs really enjoy play, in general I don’t find it to be a great reward throughout the training session. It takes a lot longer to play tug or fetch than to deliver a bit of cheese. But play is a wonderful way to end training a particular behaviour, so we might play tug after working on retrieving over the high jump and before working on free heeling. At the end of the training session, a good game of fetch is a fantastic way to wind up our time.
Betsy Calkins BS CPDT-KA says
It seems that whenever I read these interesting discussions, it comes down to “what definition are you using for whatever,” etc. Setting the definition first allows for a more precise discussion. But I enjoy these rambling conversations also, since I learn good things from them too.
Ken Ramirez once very politely corrected me in a seminar for commenting that I disapproved of using the clicker as a “keep going” signal at a local exotic animal park program. He explained that many people have different definitions for how they use the clicker and there was no way that I could tell for sure what they were attempting to mark. (but I still disagree with the way they were using it! haha)
I attended a three day seminar with Dr. Friedman and, honestly, it was life altering for me. I really would like to do her online course, but haven’t been able to devote the time to it yet. Maybe next year, fingers crossed. I can’t recommend it enough! Like you, Trisha, she is a beacon of light in a sometimes soul crushing profession.
Aliesha says
I always pair a word such as “good” or “good job” when I click. Then when I am later working with a dog if I don’t click I will still use the words “good,” or “good job” and that for me acts as my anticipation string. They got the word, but not the click. For me I always try to treat after I click, but I try to phase the clicker out once I feel the dog understands what he is to do. That or I will put much more space in between each click and use my words only as long as I feel the dog is still in that happy anticipation mode. If I can see him getting distracted more I change what I am doing and bring the clicker back out with the treats.
Sheri Cassens says
I am a graduate of the Karen Pryor Academy for Dog Trainers. Yes, a treat or other reinforcement always follows the click. Yes, one click always ends the behavior. A clever clicker trainer can build complex behavior chains where each cue reinforces the previous behavior ending the chain with one click then reinforcement. A clicker savvy dog learns to keep working in anticipation of some fabulous reinforcement.
Ken Ramirez made an interesting point in one of his fabulous presentations that what type of reinforcement the animal expects has an impact on behavior as well. My summary will surely fall short of the original information! If the click is followed by a food treat for hundreds of repetitions, the dog would expect a food reward. A dog may love to play ball but may not find the same value in playing ball as a reinforcer after the click if he expected a food reinforcement.
Great topic!
Nan Arthur says
For another take on this, check out Robert Sapolsky’s talk on this. He agrees to a point, but points to it being more unique to humans as we will delay gratification.
He implies that there will come a point that other species will stop “lever-pressing,” the longer the delay in reinforcement.
You can go to #8 on the video list for this: http://fora.tv/2011/02/15/Robert_Sapolsky_Are_Humans_Just_Another_Primate
Nan Arthur
Lisa says
Dieta – I was reading the comments and planning a response when I got to yours. That is exactly what I was going to write including the scenario where Seinfeld does not show up. Even if he rescheduled, would the members of the audience be as excited before the next show? Maybe they would because they had waited even longer or maybe they would be a little wary.
I am from the click=reinforcer (whether that be a treat, toy, praise or the chance to chase a squirrel) camp but try to keep an open mind. My question is do you need to click for a certain period of time so the animal can build up the anticipation or do they have enough anticipation from the beginning to be reinforcing? I would think that would be highly variable. I also think there are a lot of things that can be reinforcing for dogs so it would be difficult to say that it was anticipation alone that was acting as the reinforcer (I haven’t read all the research so this may have been answered already).
Interesting discussion. I have been fortunate enough to hear both Simon Gadbois and Susan Friedman talk on the subject. I would love to see the two of them discussing it (cage match optional).
Kat says
Robin Jackson has done an excellent job of laying out my thoughts on the subject. As I use the clicker with my dogs, the click marks the behavior I’m trying to capture, and when they do that behavior they get paid. If I want them to continue a behavior in anticipation of an eventual payoff I use a litany of good. For example, teaching Finna to heel off leash would be ‘good, good, good, good, good, good, stop, sit/click, treat.’ At this point stopping and her sitting ends the exercise because that’s about as far as she can go right now. Behaviors the dogs have mastered get reinforced/paid as the spirit moves me. For both of them the anticipation comes when they see the bait pouch and clicker come out; that’s their signal that the opportunity to earn rewards now exists. They don’t know what might earn the reward and will work really hard to figure it out; their seeking centers are definitely active from the minute they know the possibility of earning rewards is present.
And to em, thank you for the very timely reminder that food that’s earned is valued more highly than free food. Ranger has been a very sick puppy of late. After several days in the doggy hospital and a small fortune in tests we know he’s got a tick-borne infection that’s pretty uncommon in our area. He’s on massive doses of doxycycline, which tends to depress appetite, but since he’s already down six pounds and anemic, eating is kind of important. Thanks to the timely reminder I made him practice his recall repeatedly to earn his breakfast. He’s still really weak, so I’d stand two or three feet away, call him to come, reward with a few mouthfuls of breakfast, move and repeat. He ate almost all of his breakfast before needing to rest some more. It was a very welcome change from hand feeding him and begging him to eat it. Thank you!
Scott Thomas says
I also think that there are some other things in the dog that should be considered in the scheme of Random Intermittent Reinforcement. The aspects of Instinctual Drift (as in genetically selected behavior) and Social Cognition also need to be considered. For some dogs not getting rewarded might be a breakdown in understanding what the handler/trainer is expecting. Additionally, a dog is not a dog is not a dog, with breed selected behavioral traits we will see something akin to Instinctual Drift with dogs selected for specific behavioral patterns. Clicking may be so much white noise to a Labrador waiting for a ball to be thrown. I did see in my cross breeding with Vizslas and Labradors that performing the same task of detection was vastly different in terms of reward. The Labs definitely were not quitting until they found the ball, but the Viz/Lab crosses were much more content to continue searching after the find.
Beth says
I’ve always been taught that the intermittent reward (once a behavior is learned) is the more powerful reinforcer.
Or put it this way: Which draws the bigger crowd in Vegas, the slot machine or the vending machine?
Robin Jackson says
@Ann Bemrose,
Great examples! I think 101 Things to do with a Box definitely has a large seeking component. Also a really important point that a reward that is too high value may distract attention from the training itself. So many different variables!
p.s. One tiny technical point, but which can be important when reviewing the literature…most behavioural psychologists use two different terms. A “Keep Going Signal” tells the learner that the reward is still available and the learner is doing the right things to get it. It’s like saying “Warmer” in a room search.
A “No Reward Marker” (NRM) is used to tell the learner that they have failed to earn the reward. Typically in pet training it’s “Oops!” or “Uh oh!” or “Nope.” The equivalent of saying “Colder” in a room search.
Generally an NRM is emotionally satisfying to the trainer, but annoying as heck to the learner. Kathy Sdao tells a great story of when they were studying dolphins and introduced NRMs. They were using an underwater speaker on the side of the tank. When they got to about the 4th day of trials, a dolphin was doing a search, they played the NRM signal–and the dolphin went over and knocked the speaker off the wall!
So as you describe it, it sounds like your “yes” is probably a KGS, not an NRM as a lab scientist would use the terms. That may not make any difference in everyday life, but if you want to do a literature search, it helps to know.
Scott Thomas says
http://isites.harvard.edu/fs/docs/icb.topic188365.files/Class-10-07.pdf
A better way of understanding the combined ideas of Panksepp, the Brelands, and even the Coppingers. I think the newer models of dog cognition make us seriously reconsider traditional notions.
Scott Thomas says
http://youtu.be/Jl5L0MG_FM0
Dopamine jackpot!
Robin Jackson says
@Beth,
You asked “Which draws the bigger crowd in Vegas, the slot machine or the vending machine?”
Here’s where the answer gets interesting. If you mean, which machine do people stand around and watch others use, unquestionably slot machines.
But if you mean which is used by the highest number of hotel guests, it’s the vending machines. Vegas casino hotels have a lot of vending machines, from snacks and personal hygiene items to magazines, music, sunscreen, and jewelry. There’s one that dispenses cupcakes and another that dispenses gold nuggets. And of course there’s an ice machine on every floor.
Other people don’t usually stand around and watch you use these, but in terms of number of uses by different casino guests, vending machines traditionally beat the slots.
And if you include ATMs, they beat slots by a lot.
Obviously this isn’t an apples to oranges comparison–it’s quarters to candy bars.
Some people play the old style all or nothing slots a lot. But most people don’t.
Here’s what’s really interesting. In the last 10 years, the mechanics of slot machine play have changed precisely because it wasn’t attracting many people.
Up until about 2005, almost all slot machines were all or nothing games. You spun. You either won or lost (mostly lost). You watched each column stop, usually in sequence. 7…7…5. Awwwww.
But with digital machines came a whole new style of play. Now you can bet on multiple rows, up, down, sideways, and percentages.
Now you bet 70 cents across many different configurations and almost inevitably win 20 or 30 of them.
So you bet 70 cents, get back 30 cents–and your brain interprets that as a “win.” Now even a nonaddictive personality will keep playing, sometimes for hours. More people play, and they play for longer.
http://www.vox.com/2014/8/7/5976927/slot-machines-casinos-addiction-by-design
Intermittent big rewards turn out to be less motivating than variable but nearly constant small rewards if the participant has free choice in what activities to do. Which is really interesting from a trainer’s standpoint. 😉
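The arithmetic behind that 70-cent example can be sketched in a few lines of Python. This is only an illustration of the bookkeeping from the player's side. The function name `classify_spin` and the label strings are my own assumptions, not anything taken from the linked article or from real casino software.

```python
def classify_spin(bet_cents, payout_cents):
    """Return (label, net_cents) for a single spin, from the player's side."""
    net = payout_cents - bet_cents
    if payout_cents == 0:
        label = "loss"
    elif net < 0:
        # Something came back, so it feels like winning, but the player is down.
        label = "net loss that registers as a win"
    elif net == 0:
        label = "break even"
    else:
        label = "win"
    return label, net


# The example from the comment: bet 70 cents, get back 30 cents.
label, net = classify_spin(70, 30)
print(label, net)
```

The point of separating the label from the net amount is that the two can disagree: the machine pays something on most spins, while the balance still drifts downward.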
Robin Jackson says
@Scott Thomas,
Great link, thanks! Yours and Erin’s link on Dr. Friedman really help frame the issues.
My extended family has always had border collies, many as working dogs on working farms. Freefeeding, where the kibble is left out in buckets for the dogs to help themselves, is a pretty common practice. These dogs have low food motivation, a useful trait in working farm dogs.
There’s no doubt in my mind that most of them are internally rewarded for task mastery. They are “in the zone” dogs, and offering a treat in the middle of a work run is as annoying to them as applause in the middle of a Beethoven sonata is to a concert pianist at Carnegie Hall. They appreciate the appreciation, but they don’t want it to interrupt the work itself!
I agree that we’re just starting to understand the biochemistry involved. Fascinating stuff.
Robin Jackson says
@Nan and @Scott Thomas (who shared the same Sapolsky talk)
I’m sorry the Sapolsky clip doesn’t show the graphic for the 25% and 75% payoff scenarios, because I think many people will miss the significance.
The point was that the dopamine levels in the monkeys were highest at a 50% payoff rate. They went down, relatively, for BOTH the 25% and the 75% payoff scenarios.
Make a reward too frequent and the task becomes uninteresting.
Make the reward too rare, and the task ALSO becomes uninteresting.
The other aspect that Sapolsky doesn’t mention in his talk is that that particular study involved a really simple task, lever pressing. Other monkey dopamine studies using much more complex tasks did not get the same results.
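One hedged way to see why 50% might be special: if the payoff is treated as a coin flip with probability p, the uncertainty about the outcome (measured here as variance, p × (1 − p)) peaks at exactly 50% and is identical at 25% and 75%. The Python sketch below is only that arithmetic. It assumes nothing about how dopamine is actually produced, and `outcome_variance` is a hypothetical helper name.

```python
def outcome_variance(p):
    """Variance of a yes/no reward that pays off with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return p * (1.0 - p)


# Compare the three payoff rates mentioned in the talk.
for rate in (0.25, 0.50, 0.75):
    print(f"{rate:.0%} payoff -> variance {outcome_variance(rate):.4f}")
```

Whether the monkeys' dopamine response really tracks this kind of uncertainty is a question for the research itself. The sketch just shows that "too frequent" and "too rare" are symmetric in this one respect.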
CHOOSING WHICH RESOURCE TO PURSUE
I think one factor we tend to talk around is that of agency. The monkey/dog/person starts the day with a limited number of hours and a limited amount of energy to complete all necessary life tasks. The organism has the choice (agency) how to spend that energy. What built in reward structures will tend to influence those choices in a way that promotes generational survival?
I think the guaranteed payoff tasks become uninteresting because the organism can always go back to that food source later if something better wasn’t found. So we are wired to move on to the less reliable sources because the lower frequency implies that one may run out. Get it while you can. As soon as the reward frequency drops too low, it becomes uninteresting again. And we start looking for a new source.
In a competitive world, the always available source offers no competitive advantage vis a vis other seekers. But once the always reliable is found, we can go on to seek sources of a higher risk.
On the other hand, context again. If the organism starts out in deficit, the always reliable source suddenly becomes much more interesting, because the first motivation, like my thirsty dog, is to refuel. Once an appropriate level is achieved, then other rewards can be sought.
So variety is itself motivating because of agency. It’s my chance to do something better than a competitor. At the same time, better is useless to me if I haven’t met my minimum needs. So both motivations exist, but which one is preeminent depends on whether my basic needs are currently in deficit or not.
ANTICIPATION BEGINS BEFORE THE SEEKING BEHAVIOUR
To turn this all around and come back to Trisha’s initial post, I would argue that as most pet trainers use it, the question of breaking the association between clicker and treat is irrelevant to the power of anticipation.
Instead, what we should be thinking about is birdsong. Or as Sapolsky’s slide showed, a Signal. Not a behaviour marker for when the behaviour occurs, but rather something before the behaviour starts to tell the dog, “Get ready! A reward opportunity is about to be presented.”
Ann Bemrose talked about the clicker in her pocket raising her dog’s level of excitement. I would argue that THAT’S “anticipation” in Panksepp’s term. Anticipation PRECEDES behaviour, generates dopamine, and dopamine encourages behaviour. So way before the typical click.
Most of us have signals for training sessions even if we don’t realise it. We get out the treat bag and the clicker, put out the mat, move to a particular room, stand or sit with a particular posture. That’s when we should see dopamine levels go up in our dogs. That’s the anticipation phase.
So can we formalise that? Use it? Shape it?
Think about all the sports dogs that do great in practice and flounder in trials. Or the pet dogs who won’t Sit at the park.
Sure, there are many contributing factors: distractions, nongeneralisation, muddy cues, etc. But what if some of it is simply that the dog’s dopamine levels are lower because we haven’t given the signal that generates Anticipation? We haven’t turned on the Seek circuitry? So the dog hesitates–a classic indicator of lower dopamine levels.
Although it’s true that dopamine is high during seeking behaviour, the monkey study, among others, indicates that the dopamine level actually jumps well before the behaviour itself starts, and is probably what helps initiate that behaviour. Anticipation happens when the cat hears the birdsong, and that’s when the dopamine first jumps. Then it stays high during the hunt.
So what can we do before our dog performs the desired behaviour to signal that a reward opportunity (not the reward itself) is now available? Do we need a “pre cue”? Or in a well trained dog will the cue itself be enough “birdsong” to get things going?
LisaW says
I do not use a clicker (too uncoordinated to click precisely, and in the beginning, uber-noise-sensitive Olive would be scared by the click noise.) But, I did learn to mark when she did something right with a soft “yes,” always followed by a treat. I have learned that for her, the seeking is almost as much of a reward as the finding. We started with kibble hidden under yogurt containers, and I would move them around like the old shell game, and she would nose the one hiding the treat. When she chose correctly, she’d look at me with a true sparkle in her eye, and I’d have to tell her ok, get it. Engaging her nose has been a true gift for both of us. Games of “find it” in the yard and hiding little treats around the house (on door hinges, chair legs, in shoes, etc) are two of her favorite things to do and have become our default way of redirecting her when she is too anxiously focused on the car pulling into the driveway or the cat across the street. Like Khris’s dogs, her response rate has risen dramatically with the seeking games as motivation and/or reward. She hears “find it” and she immediately goes into seek mode. I wish we had nose work classes in our area.
I heard Simon Gadbois talk about the anticipation of a reward being more of a motivator than the actual reward, and if a behavior is rewarded every time, the response rate may decrease. I tried this with Olive and her recall, which, at least in our fairly large yard, is 95% reliable: she runs to me at full tilt even if she can’t see me and she’s barking at who-knows-what when I call her. I varied the type of treats and the number of times she was rewarded with food, and her recall became less reliably quick and focused. I went back to how I had been doing it, not wanting to mess with something that seems to be working well.
Kat, I hope Ranger is on the mend. I’m sorry to hear he has been so sick. I haven’t met him of course, but I have such a fondness for him from all your stories of his amazingness. Paws crossed for a speedy recovery.
Lulu Brooks Tartt says
Love this!! I started my career in marine mammals. I have always been taught that since the whistle/click/bridge is a conditioned secondary reinforcer- there is no need to follow it up with primary reinforcement every single time. I have witnessed so many animals who have appeared to react more to the bridge than to the primary reinforcement delivered after. And those sessions were always the most successful ones.
When I came to the world of dog training, the practice of treating after every click baffled me every day. The science just didn’t make sense in my head. But I went with it because the experts said to do so. I’m very glad to know that I’m not crazy over here for thinking this for so many years.
Trisha says
Lots more later tomorrow, but Kat, I’m sorry that Ranger is sick. Those tick-borne diseases can be nasty. All paws crossed for you.
Beth says
Robin, having been to Vegas, it seems unlikely (though I could be wrong) that there are more vending machines in use than slots. There are hundreds of slots on each game floor and vending machines are not so easy to find, nor do people (given the choice) prefer to stand in front of a vending machine plunking in quarters, as opposed to buying their groceries in the traditional way. And I’ve never heard of someone running up thousands in debt at the Snapple machine. ;-0
A slot machine exemplifies intermittent rewards. Sometimes you do indeed get nothing, sometimes for several spins in a row. Sometimes you get a small “win” that is really a loss, as you pointed out. And sometimes you get a jackpot, getting much more out than what you put in. Intermittent rewards with dogs mimic that; sometimes nothing, sometimes a small treat, sometimes a huge praise party with a big payoff. Constant reward is more like a savings account. And most of us, even those who are not big gamblers, do not get excited about making a bank deposit the way we get excited about dropping a few quarters in the slots (or the stock market, for those who play that “game”), even though the savings account is a sure thing and the gambling play is most definitely not.
*******
When I train, I find that training away from the treat jar and giving “yes” or “good” for partial successes, then a huge “GOOOOD dog, Goood boy, good boy, who’s a good boy” in an excited voice while clapping and saying “yay” and then running together to the treat jar is MUCH more motivating to my dogs than playing Pez dispenser. I call it a “praise party.” Neither of mine respond well to the Pez dispenser method. First, they seem to find it very distracting. Second, it interrupts their concentration. And third, that excitement of running for the treats seems to make the treats so much more rewarding to the dogs. If we are out and about, making a huge show of fishing them out of my pocket has a similar effect, though not as strong since the dogs have nothing to do but ants about waiting (the running seems to increase the reward value).
Then giving a bigger treat that they can turn around and run off with seems to increase the value even more.
I worked with an agility instructor who used the clicker method and my dog did not respond nearly as well as to my own “run and get the treat after you’ve completed the whole task” method. I was gently reminded to have my treats ready and not in my pocket. But getting them out of the pocket seemed to be half the fun for the dog….
Kat says
Thank you for the good wishes for Ranger. He’s been a total pain this afternoon getting into mischief, holding things for ransom, and generally being a brat. I’d say he’s feeling quite a bit better. He even tried to engage Finna in some wrestling! Fortunately she, amazingly, had better sense than he did and restricted the play to some gentle pushing. He is responding to the medication (at least as of yesterday’s check) and his prognosis is very good just not very quick.
Robin Jackson says
@Lulu Brooks Tartt,
As the slides Scott Thomas linked to mentioned, different species do have some intrinsic differences, even as individuals also vary. Working with a prey animal, whether it’s a rabbit or a horse, presents some very real differences to working with a predator.
Captive marine mammals, at least the ones who do well, seem to have a particularly high play drive, a desire for stimulation and novelty. That may well be linked to that satisfaction in mastery we were talking about earlier.
Just as one example, several trainers who’ve worked with both marine mammals and dogs have remarked on the higher need to vary rewards with cetaceans. Many dogs have a favourite high value treat (Trisha mentioned chicken, my Tulip prefers salmon). With dolphins in captivity, at least, novelty appears to be more highly valued than any one treat.
If we go back to Skinner’s adage that “the rat is always right,” it’s the learner that determines what’s rewarding, not the trainer.
Assuming basic needs are met, for some people there’s no question that work may be more rewarding than eating, most of the time. I believe that’s generally true of many working border collies also.
In the article on slot machine research linked to above, the author mentioned that with the newest slot machines that offer partial payouts, many players value being in the flow of the game so that they actually resent a big payout because it interrupts the game! They prefer the little payouts which register as a win but just credit the account and let them go on playing uninterrupted.
All of which is to say I haven’t seen any studies, but it wouldn’t surprise me at all if the general rule for marine mammals in captivity turns out to be a bit different from the general rule for dogs as household pets. And if the general rule for dogs as household pets varies a bit from the general rule for working livestock guardians, etc.
Typical marine mammals in captivity may find novelty, mastery, and being “in the zone” more rewarding than any given food reward, up until the point when they feel physically hungry again. Which would make a bridge and immediately moving onto something new more motivating for them than stopping for a snack.
Typical labs may find human approval, and food (!) more rewarding than most secondary reinforcers. Even if they just finished dinner. 🙂
So I don’t find it surprising that good trainers come up with somewhat different guidelines for different species. As Trisha says so rightly, “It depends.” 🙂
em says
@Kat,
I am so sorry to hear about Ranger’s illness, and so glad if my comment helped. I hope he feels better and is back to his amazing self very soon.
Beth says
I have finally made it through all the excellent comments, and wow do I now have a lot to think about. And a few thoughts to share.
First, I am very sorry to hear about Ranger, Kat. I hope he starts feeling better soon. I hate ticks.
As to click/reward, I think that there are lots of things beneath the surface of the conversation. Someone linked to an article about clicker training for zoo animals. Much of what zoo animals are trained to do, though, is at least mildly aversive. If you no longer associate the click with the treat, the behavior will quickly extinguish because standing for handling, or staying in the viewing area instead of the nice tree, are behaviors that the animal does not really want to do. If the behavior itself is inherently positive punishment, then the reward has to be consistent enough and strong enough to counteract that. So you’d better believe that my vet-scared Jack gets a treat every single time I click him for standing quietly in the vet’s office. Otherwise that behavior would extinguish because he hates the experience.
On the other hand, coming when called is not always inherently aversive to the dog. Hanging out with mom should be a mostly positive experience for the dog. Because we sometimes interrupt something he likes MORE when we call him, it is a behavior that needs to be trained and reinforced. But being near mom is also good, even if at the time nosing that bush is something better. So the behavior, once trained, should not be so quick to extinguish if you don’t reward it every time.
As to pairing the click with a treat: I don’t know that anyone is suggesting that the click no longer means a treat is coming. But the question is, how long til it comes? So when I used a clicker to sharpen up recall with Jack, I would click once when he turned toward me on the recall command, and once again when he arrived at my feet. That is two clicks for one reward. I saw no signs that it soured him on the clicker, or that he no longer associated the click with the reward. In fact, I used it that way quite often when he was getting less reliable on recall and he would start whining excitedly whenever he’d see me get out the clicker. And his recall improved dramatically. I would never pull out a clicker and click the dog and then just stop and never give a treat at all. But I will click more than once on the activity and then reward at the end, rather than do the pez dispenser thing of click-pop treat in mouth.
When it comes to shaping behavior with a clicker or other reward marker, I think it is valuable to think about what the dog finds rewarding in that sequence. To some degree, asking a dog to figure out a new behavior is stressful. For most dogs, it is a “good” stressor, but it is still a stressor. If you click or say “yes” or “good” and immediately pause and allow the dog to stop attempting to figure out the behavior, what you are doing is creating pressure and then using the let up of the pressure as your reward (negative reinforcement). So the click/good/yes IS paired with a reinforcer, but it’s a negative reinforcer rather than a positive one in that instance. Asking the dog to offer random behaviors is the stress, allowing the dog to stop by signaling that the behavior offered was the correct one is the reward. Then when the session is done, they get the positive reinforcement of the food or tug game or whatever you use to remind them of just how great training sessions are. Training sessions of this sort need to be very short though; you can’t expect a dog to wait a half hour for a reward and remember what it was for, but most of them can handle 3 to 5 minutes. I can’t be sure of what my dogs think, but to me it seems they think “Training is this fun thing where I’m asked to spend some time being attentive and performing, and when we’re all done I get a praise party and a great snack.” They don’t seem to think “every time she acknowledges I did good I better get some food.”
Robin gave the example of playing “hot or cold”, which is how I use “good” or “yes” in training. IF you condition the dog from the beginning that the click means the same thing as “good” or “yes”, then of course you don’t need to treat with every click. And anyone who has ever played hot and cold remembers how exciting it is to get that positive response.
But if you train the dog from the beginning that click means “we’re done with this micro-lesson and here’s your treat” I suppose some dogs would sour if you changed the rules in mid-game. If you played hot and cold at a party and the reward was the excitement and social feedback of the game, people will gladly play. But if the first and second and third times you play everyone gets a dollar for every “hotter” response and then you stop handing out money on the fourth game, people would probably then hate the game and feel cheated.
HFR says
Fantastic discussion! Thanks, Trisha!
I don’t use a clicker and probably over-reward my dogs because I feel too bad for them when they seem disappointed (in me?) when they don’t get what they expected. But I will say I used to compete in agility and my food-motivated herding dog mix was decidedly less motivated when in the ring at a trial. Eventually it became clear that she figured out there were no treats in the ring. The jackpot she got when we left the ring was not enough since it was still less fun for her without the treats in the ring. She excelled in class, but was slightly less happy (and therefore slower) at a trial.
HFR says
Sorry. That posted too quickly for some reason.
Now we compete in nosework where treating during competition is allowed and she is driven the whole time. Not a scientific study, but if you could ask her she’d say, “yeah, forget that whole intermittent-is-better thing.”
Also the gambling analogies are interesting because gambling has no appeal to me at all. I like a sure thing. I would quickly stop participating in anything if I felt I was “failing” most of the time. The feeling of disappointment to me is much stronger than the rush I get from anticipation. Perhaps it’s because I can think about the potential disappointment ahead of time. Not sure dogs can do that.
Scott Thomas says
Robin, I was thinking the same thing about vending versus slots, but I could not come up with the eloquent description you made. Spot on!!
And here we come to the funny rub: that poor damned Animal Enterprises raccoon putting a quarter in the player piano and dancing a jig for a treat. Well, up until you give him two quarters. We learn to adapt to variations in wild animals, and then we take dogs whose entire neurobiological system was designed at the whims of men selecting behavior patterns without ever considering training consequences. The Field Trial Labrador whose pain thresholds are through the roof because of the use of electric collars as a training tradition. The racing husky who aspirates vomit on the trail because he wants to do nothing but run. The malinois that must have object gratification and who in the absence of the desired object starts flank sucking to seek that satisfaction.
This is why the paradigm of traditional training must shift. I spent five years in the company of Marine Mammals, two years in the company of exotic birds, and another two years with wild cat species and monkeys. All it did was make me train smarter and spend more time observing than trying to show my mighty dominion over all creatures great and small. The emotionality was much easier to see in my wild coworkers than in the domestic dogs.
I yell at my dog club people every week, “what is your dog telling you”, “is your dog happy or stressed right now?” We must move to understand the complexities of animal learning/emotions and avoid so many of our comfortable misconceptions.
Suzanne Rhebergen says
I’ve been reading this blog and the comments with great interest. During SPARCS some sparks flew concerning this topic. What I am interested in is what the reward schedules, the clicks and the treats look like. Simon Gadbois mentioned that they train their dogs using the seeking system, and he was quite adamant that this is the science and we should use it to our advantage. It would be very insightful if he could publish some of these schedules and the results he and his team have accomplished using it.
Kay says
As an Aussie owner, not a trainer, I want to briefly note that I find a reward after a click or verbal equivalent to be the most effective. The reward does not always need to be food. An exception for me is recall off leash. My dog gets a very high-value treat for this and he gets a moderately-high value treat for most check-ins when off leash. This practice has observably led to the habit of staying closer to me. We live in the Maine woods, so it keeps him safer. I think it would be a good idea to speak more about the role of adrenalin in all this, as well as dopamine. Mine is not the only dog for whom a low-arousal threshold is a major issue. Therefore, when I am giving food rewards, I give lower-value treats, like kibble and Cheerios, when I need calm (especially “look at the kitty without chasing”) because hard experience has taught me that high-value treats in some situations can backfire on me. So, Trish, the question you have posed likely has no simple answer! We have to do what works best with each dog, dang it!
Suzanne Ramos says
I LOVE this discussion. Although I do not click (choosing to use “yes” as my memory marker; notice I said “memory marker” rather than “reward marker”), I have long used the mark as a continuation cue. Once the mark is installed and paired with a reinforcer, I have used it successfully to continue or intensify behaviors without rewarding every mark. It takes thoughtful application and a lot of judgement on the part of the trainer to decide when to mark and reward and when only to mark. For example, I have a dog that wants to sniff the floor as we enter and leave stores. At first I marked and rewarded several voluntary instances of non-sniffing behavior. Then I simply marked a short string of them. Yes. Yes. Yes. Before ending in a Yes and treat. The dog tried longer and harder than she ever did when I rewarded each mark. And she seemed to take more pride in her successes. It appears that the dog begins to understand the concept of, “Hey! I did something right!” Rather than “I did something … now show me the cookie!” As I said, you have to use judgement here. You don’t want to start this before the dog fully understands the expected behavior. But if you never go there, it’s all too easy to get a dog that will only perform when constantly rewarded. I also believe it is beneficial to use a variety of reinforcers in response to a desired behavior (once it is established): food, fun, toys, enjoyable physical contact, praise.
Trisha says
I just finished re-reading this entire thread, and going to all the links provided. First off, thanks to all of you who have greatly enriched this conversation; I’ve learned a lot and love it. Just love it. Susan Friedman’s article was, as expected, logical and compelling. She did indeed convince me that, in general, the ‘click’ (or secondary reinforcer) needs to be followed by a primary reinforcer like a ‘treat,’ at least as often as possible. (I’d argue that in real life, it’s not always possible.) But clearly the research shows that the response will eventually extinguish IF you are just using a learned reinforcement (like praise or a clicker).
I was also interested to learn that Bob Bailey, clicker trainer extraordinaire, thinks that clickers are used too frequently in dog training. He argued that they don’t need to be used to teach a dog something that a dog already does, like ‘sit’ or ‘lie down.’ The value of the clicker is in its precise timing, which is critical when shaping a new and possibly unfamiliar behavior, but not when asking a dog to do something as easily learned as sitting or lying down.
I love the examples that have come in regarding using anticipation to increase an animal’s motivation (say for a picky eater). I’ve found this technique to be invaluable. I’ve worked with hundreds of dogs who ‘won’t take treats’ until I put the food to their nose and then snatched it away. Repeat this three times, and the majority of dogs decide they actually really, really, really want the treat, thank you very much. I always said that ‘hard to get’ works just as well on dogs as people.
I think that one of the most applicable aspects of our discussion for most families with companion dogs is understanding the importance of ‘variety and variation.’ Variety of reinforcers for any one dog, variation in what is reinforcing to each dog, and variation in what a dog wants most during any training session (like Robin’s example of a dog that generally wants food more but would rather have water when thirsty).
All this is especially interesting to me after spending the weekend at a sheepdog training clinic. Our trainer’s method, which is similar to that of many of the other top trainers I’ve seen, is to correct the dog for making a mistake, and let it figure out what is right without interference from you. Thus, during a training session you’ll hear “Hey!” said in a gruff voice if the dog busts in or starts to chase, but very little else. I imagine this will appall some “positive” trainers, but it can be effective for several reasons. First off, you can NOT ignore a dog chasing or busting in on the sheep in hopes it will extinguish. Either action is much too much fun for the dogs to let it continue. You simply have to stop it in its tracks. Because Border Collies are so sensitive, the correction is usually little more than a gruff voice that the dogs respond to beautifully. Second, this works, and only works, because the primary reinforcement for the dogs is to get and keep control of the sheep. Your job as a trainer is to basically get out of their way, and let the sheep train the dog what works and what doesn’t. Thus, when you stay out of its way and let the dog learn how to control the sheep in a quiet, confident manner, it gets all the reinforcement it can possibly need. Yes, we give an occasional verbal praise like “there you go” or “good work,” but you can get in a lot of trouble by over praising your dog and interrupting them. I’m not sure how to categorize “taking control of the sheep” as a primary reinforcer, but I guarantee you that there is little more powerful to a herding dog.
Of course, the downside of using even just a gruff voice is that corrections become overused. Perhaps the dog has no idea what “right” is and needs your help to figure it out. A “gruff voice” can lead to a louder voice, etc etc, and too soon the handler actually is scaring their dog. In my opinion there is still too much of this in sheepdog training. I love the trainers best who are very, very quiet while working their dogs, and most often simply tell their dogs to lie down if the dog does something wrong. Quietly, with no gruffness in their voice, just “lie down.” The fact is that you simply can not use 100% positive reinforcement when working a sheepdog (chasing is waaaaaay too much fun, and chasing is not herding), and learning how to mix your responses in a way that gives a dog what it needs to manage a variety of sheep in a variety of circumstances is a fascinating exercise in understanding dog behavior, sheep behavior and training. No wonder I love it!
So many interesting avenues of conversation that this discussion has opened; thanks to all of you for adding so greatly to its value!
Robin Jackson says
For those who want to read Bailey’s comments on reserving clickers for precision tasks:
“I am not a fan of the ‘ever-clicking’ approach to training. The proper application of the clicker is that akin to using a scalpel to make fine cuts. However, the increasing use of reinforcement to get behavior is good, so I guess the prevalence of sloppy ‘clicking’ is a price paid for trainers thinking more about reinforcement rather than punishment. Most pet owners seldom have need for a clicker, in my opinion; a clicker can easily get in the way of getting good behavior. After a pet owner learns the skill of delivering food, or petting, or a toy, and that owner really wants to do more, then add the clicker. I do think that sometime, down the road, most trainers will learn that the clicker is the most powerful single tool they have, and they will quit beating it to death and learn to exploit it to its highest potential.”
http://www.clickersolutions.com/interviews/bailey.htm
Robin Jackson says
I’m going to go way out on a limb here, because I have no formal data for this at all, but I think when it comes to border collies working sheep most of what even the top trainers call “corrections” are in fact Natural Social Indicators (NSIs), the same thing as saying “Owwwwww!” when a puppy nips.
That is, the dog values cooperative interaction with a human. An NSI makes the dog unhappy not because it is unpleasant in itself but because it’s an indicator that the cooperative interaction isn’t working. This is what people mean when they say bc’s are “sensitive.”
Most bc’s want the person to be there, and to be happy. This is a totally different temperament type than beagles on a scent or livestock guardians with their flock.
Watch what happens when there are sheep in a pen and the person turns and starts walking back to the house. Without giving any cues at all, the typical border collie will react completely differently than the typical great pyr.
So with most bc’s, working the sheep is very rewarding. But they want the person to be there while they do. They want to be part of the team. And that, too, is a reward.
So again, out on a limb here, but I think often what a bc trainer calls a verbal “correction” (which would be P+) is actually R- based on an NSI. It’s saying, “Dog, I’m not happy, the teamwork isn’t working.” And because the dog cares about the teamwork, she self modifies.
Try the same method to call a beagle off a rabbit and I think you’ll see the difference. 😉
Just my own opinion, as I said, I have no science for this, just a lifetime of country living.
Jan Rinker says
The timing of this is perfect, as a friend and I were in the process of talking about motivation and anticipation when I told her that I had saved what looked to be an interesting article by Patricia McConnell. Needless to say this has totally piqued my interest. As a neuroscience student at WSU, and NWPR employee, my son became acquainted with Jaak Panksepp through engineering his interview on laughing rats for NPR. My son found his research very interesting and also said he is a very nice person. I love it when I hear that someone is a nice person. I plan to look at all of your links and see what I find. I was already motivated to somehow figure out how to use the “seeking” behaviors of my dogs and integrate that into a happier, more consistent, higher energy agility performance. Fun times ahead:)
Patricia McConnell says
To Robin re Border Collies, briefly (from my iPad on way out of town): Yes and No… Stay tuned on Friday for an explanation.
Scott Thomas says
I would say one of the long standing problems with discussions like these is in understanding the difference between reality and theory. Most of modern dog cognition is theory (though we are getting better at establishing actual facts). Learning theory changes as humankind’s understanding of the world around them increases. As technology advances, it changes the way we interpret the inner workings of the mind. The mind has been compared to a clock, a steam engine, a switch board, and a computer. We know now it is something more, but theory is to make our understanding easier and as theory it is constantly changing. It does not mean it is a perfect representation of what goes on in the animal’s mind. The Penn Vet Working Dog Conference two years ago had a panel discussion with Bob Bailey, Parvene Farhoody, and Kayce Cover. They discussed the terminal versus intermediate bridge. Kayce marks behavior at 8th of a second intervals, but Bob seemed to think that was unnecessary and difficult for many to do, but both agreed that the 8th of a second interval was the correct length of time between “bridges” or “markers”.
This is all further confused by our goals. Training for pleasure, training for professional applications, and training for learning theory academia are different goals. Medical alert dogs are always a hot topic of discussion. There is still much debate on how to train response and what the dog is cuing off of to respond. The early dogs seemed to perform these tasks as an altruistic behavior and now we want to train it as a learned/rewarded behavior. The sport dog versus the police dog, the pet versus the working dog, a therapy dog or a service dog. Each one changes the dynamic. I saw a post from someone that was so happy that their Nosework dog identified Anise in a single trial. She was completely unaware of the pitfalls of rewarding novel odor amongst blanks. Imagine a law enforcement officer with an explosives detection dog training the same way.
We have to understand the theories, but we also have to relate to our animal partners. Learning to deal with an emotional yellow naped amazon who has been wronged by his human perch, a dominant male dolphin who knows you are nervously swimming in his pool, a condor that has learned to spy mini pizzas during his free flight, a lion that does not like being scratched between the ears, the sport dog that could care less that he won first place, these are the teachers and there is little time to pull out a manual on learning theory.
Your empathy and compassion are your first tools and then comes your understanding of learning theory.
Scott Thomas says
My last input for the day on contingent, interval, random reinforcers. I was riding in a bus from Brussels to Ypres, Belgium for a conference. I was tired and just wanted to sleep, and started to listen to the conversation in the seat next to mine. The discussion was about food reward or ball reward for explosives detection. Both of the gentlemen had vast experience in working with such dogs, but one was a dog trainer and one was a veterinarian who had read more training theory than he had ever practiced. The trainer was becoming exasperated in the argument because they were not on the same sheet of music. I hopped in by saying ball reward is easier to train and maintain with a novice handler than is food reward. Complete control of diet takes far more technique, observation and experience than does the ball in a dog with drive for the ball. The common denominator was not the best training methods, but what is practical in day to day application. The trainer is very skilled with food use in multiple species and continues with dogs trained in multiple reward types and multiple reinforcement schedules, but he also must account for whether he is training the dog or whether someone else without his skill set must be able to train and maintain the behavior. In my humble opinion, toy is easier than food with dogs and contingent is easier than random, but if I want the highest reliability then I had better train with the most powerful reinforcement schedule, which at this time seems to be random reinforcement.
Your goals must dictate your training: an IPO dog will have no ball or food for obedience on the day of the trial, so you must vary the reinforcer and the reinforcement schedule. They will get all the reward they need during the protection phase, as the motivator is on the field with you the whole time!
Daniel Hunt says
On the topic of randomized reward schedules and gambling machines, I’d just like to clear up an apparent misconception raised earlier.
Gambling machines do make noises and have sparkly graphics after every push of the Spin Now button.
Industry research trialed silent machines that only made noises when significant jackpots were won, and found test subjects lost interest very quickly if they weren’t psychologically reinforced with “bells and whistles” very regularly.
The chimes and graphics serve as “low value reinforcement”, while the award of free-spins, smaller monetary rewards and significant jackpots complete the arsenal of addictive rewards.
To use the terminology raised earlier, the obvious cash payouts are unconditioned rewards, while the subtler chimes and graphics are conditioned rewards.
I personally feel there is no such thing as an unrewarded click, unless delivered by C-3PO or a professional mime; canines are so adept at observing our body language, our pleasure at their performance is clear to them and serves as a low value reward.
Clicks, or any other marker signal, are a tool for inter-species clarity of communication and an overabundance of unrewarded marker signals, in my experience, simply leads to confused canines.
I use a clicker initially and hand-signals thereafter for most behaviours.
However, I use an ultrasonic whistle as a signal my dogs are released from a sit to eat their main evening meal, and this is my emergency recall signal.
Because that unique signal is ALWAYS rewarded with a “jackpot” reward, the response of our dogs is reliable and instant.
It has saved my girlfriend’s dog’s life – a black lab on a dark night chasing a cat toward a busy intersection, he ignored her as she screamed his name, but instantly abandoned the chase and reversed direction when I blew the whistle.
I would never dream of using the whistle and not following it with a significant food reward.
I shot this quick vid a few weeks back, unscripted and untested, to determine the effectiveness of the recall whistle on our latest rescue adoption, a half wild hybrid.
I threw a large double-handful of high value oily chicken treats onto his blanket where he normally receives his evening meal, snuck down to the other end of the house, and blew the whistle.
With zero hesitation, he immediately abandons the high value food and races toward what he knows is going to be boring dry kibble.
There may be a place for unrewarded clicks, but I feel very strongly that marker signals for especially important commands such as recall and Cease Bite should always be rewarded.
Randomize the reward itself, but ensure there is always a reward.
Dr. M says
I haven’t read all of the comments, but the following post was published the same day as yours, and summarizes a study in people showing that the happiness derived from anticipation of the thing is in fact greater than the happiness derived from the thing itself, but that experiences provide more happiness than things. Interesting. http://pediatricinsider.wordpress.com/2014/09/22/get-more-happiness-from-doing-things-rather-than-having-things/#comment-8511
And how much did you pay for those Seinfeld tickets? 😉 Was it worth the price?
Melissa says
When Simon brought this up on a dog group some months before SPARCS, my head kind of imploded a little bit, but then I thought, well, I can only try it. So, I did. With both my dogs, I gradually wound back marker to reward ratio. Interesting things happened. My persistent dog who loves to shape more than pretty much anything showed no change I could detect. I would guess he noticed, but it wasn’t important to him at all. Now, my other dog has had a rocky relationship with shaping. He liked it, then he didn’t like it, then he liked it again. I think a lot of the problem was that he is not very persistent at all, and he has a low frustration threshold. He wants the instant gratification, and even so much as a 3-second pause where he thought he should have got a treat is kind of upsetting to him. So, I didn’t have huge hopes for this intermittent reward ratio thing with him. As it turned out, his enthusiasm for training went through the roof. The change wasn’t immediate, so I can’t be sure it was just the change in reward ratio that was responsible, but I don’t think I changed anything else in that time. I persisted with it, his enthusiasm continued to climb, and suddenly I had all the impulse control and yelling at me problems that I used to have in my persistent shaping maniac. I had long sorted them out in him, but all the yelling from his brother was making it hard for him. He’s kind of sensitive to basically stimuli in general. I started working on impulse control with my dog that now had the problems, but he did not pick this up the way my other dog had. His frustration threshold is lower, and he seemed to be just getting cranky. And my eardrum was getting buzzy after he literally yelled in my ear every time he felt I was being particularly unfair. He seemed to love to train more than ever, and was always up for it, but there was a desperate edge to it all. 
Sometimes if I was ignoring the yelling, and he couldn’t seem to stop anyway, I would release him and he would look relieved and run off to sniff madly. So, I switched back to 1:1 ratio for marker to reward. Lo and behold, his frustration was under control again. He was quiet, he could concentrate, and he seemed a little relieved. I added some uncertainty into the ratio. He started whining and getting agitated. Back to 1:1, and back to relative calm.
So, where does this leave me after my sample size of 2 vastly different dogs? I don’t really know. Do I try again, but more gently? Do I work on impulse control and then try again? Do I decide he’s just the kind of dog that is not good with uncertainty? Certainly his motivation for training seemed to increase dramatically, which I suppose is a good thing, at least, it would be if I wanted a dog addicted to training. I’m not sure I do, though. I just trick train as a way to keep us all sharp and busy. Sometimes I would like to walk in the park with them without my dog bouncing around like “TRAIN ME!” Having said that, his motivation was not really more than my other dog’s has always been. The other dog just has a lifetime of training behind him to help him manage his.
Victoria says
Not sure there is a single right answer to this. At the moment I have a saluki X. Getting her to eat anything is a total nightmare, I will stand there with a hand full of cheese, sausage and chopped liver and she turns her nose up at all of them. She’ll eat steak if she’s in the mood. That kind of thing. Yet, she will work for the joy of being offered something, even if she doesn’t actually want to eat it. She seems to like ‘winning’ a treat.
Whereas a few years back, I had a greyhound. She ate everything. Eating was her number 1 favorite thing. Incredibly obedient if rewarded every single time, but if you didn’t, you could literally see her thinking ‘hmm, she has run out of treats, no point doing what she says now!’
Two very different training experiences.
Simon Gadbois says
I think some people here are confusing issues of “schedules of reinforcement” and issues of conditioned or secondary reinforcers. They both can be used to activate the wanting system (SEEKING), but they are not the same. This was the basis of Karen Pryor’s comment after my presentation at SPARCS. I am not sure where the confusion is coming from.
Amanda says
I do believe clickers are a brilliant way of training very specific movement and behaviour very quickly – but I would agree perhaps it’s possible to click and not treat every time BUT only when a behaviour is well established. If you tried to click and not treat with an inexperienced learner and with a behaviour that is not established I would imagine you could rapidly end up with the clicker losing its meaning to the dog and the dog becoming unmotivated.
I suppose it depends on whether you subscribe to the idea that the click says “yes – a reward is coming”. If you rigidly subscribe to that line of thinking then you are potentially going to demotivate the dog if you click and don’t reward. Karen Pryor and others teach that a variable reinforcement schedule should definitely be followed as soon as the behaviour is established, BUT you don’t click and not treat; you just don’t click at all.
I really like this point about SEEKING systems and personally can’t see why you can’t click and not treat (with an experienced dog and well established behaviour) – in a way you’re saying ‘yes’ with the click which is no different to saying ‘good dog’ and then not giving a treat. You could click and give the dog a rub or just click and only treat every other click to start with perhaps.
One of my dogs, whom I’ve had from a pup and who is now 19 months, is very operant, and I don’t doubt that with many of her well-established behaviours, my clicking and not treating every time would not harm her learning or confidence. However, with my other dog, who’s 3, only came to live with me 3 months ago and is not an experienced clicker dog, I have to be very ‘clean’ with my training with the clicker, making sure the treats are not in view and that I don’t lure with food, as she is so food motivated she won’t concentrate on anything but the food. I also know that if I started clicking and not treating every time, at this moment in time, she would rapidly lose confidence and probably walk away.
I guess it requires experience and understanding of each individual dog and each individual circumstance to judge what you can and cannot do.
It’s really great to see someone start this debate. I often think it’s so very easy to get caught up in one pattern of thinking and never challenge it, when in fact there could be multiple ways of approaching the system without harm or lack of efficacy in training.
Frances says
Am I the only human who has not put my own money into a slot machine since I learnt the meaning of “odds” by losing a whole shilling in my distant youth? The thruppence I won along the way was small recompense…
I strongly suspect that many of the experimental subjects are gambling with someone else’s money. Now that opens up a whole new discussion!
My experience is in some ways similar to Melissa’s – Sophy gets stressed if the rules change, and stops enjoying the game. Perhaps if the concept of “once you know what to do, sometimes there will be an ordinary reward, sometimes none, and sometimes something wonderful” had been introduced when she was young it would be different, but I think it is more that she likes to feel in control. Unpredictable outcomes – even when they are good – are worrying as she tries to work out how to predict them. Poppy never really feels in control of anything, doesn’t care, and is happy to grab the good stuff whenever offered!
Mona Lindau says
Great topic!!!
I think we have to realize that different types of behaviors need different types of motivation.
Self-reinforcing behaviors, like herding for herding dogs, bite-work for guard breeds, running for coursing breeds, nose-work for all dogs, do not need rewards from humans, and this is also where anticipation seems to be the major reward. When we get to the training grounds for bite-work or nose-work, the dog pulls, jumps up and down, happy grin on her face, everything in the dog’s body language and facial expression is “Wow! let’s do it. Now”. The anticipation is emotionally rewarding, and executing the innate behavior sequences that are programmed for different breeds, seems to be as satisfying for the dogs as sex is for humans. Satisfying a drive, afterwards Whew, that felt good.
On the other hand, when we deal with operant behaviors, like obedience training and teaching a straight sit, fast swing to the side, etc., the dog does a lot better with rewards, and in my experience clicker training for these behaviors speeds up the process and keeps the dog relatively happy. And this is where we can optimize clicker training with what research tells us is the best schedule. I loved the article with 50% rewards, makes sense to me.
I also notice that when we get to the training grounds for obedience training, the dogs are happy, but nowhere near as ecstatic as when they land on the grounds for nose-work.
And finding the right motivation for each individual, or even breed, for operant behaviors is important. For example, in training Sit and Down or operating the Yuppy-Puppy slot machine for any kind of bull-breed, the Model/Rival method works a LOT better than straightforward reward-based training. Watching another dog suddenly get my attention and treats makes the bullmastiff/bullterrier/staffie sit up and take notice and focus very quickly.
Erin Willson says
Hey, does anyone have the paper or more details on the Panksepp cat study? Would love to read it, I couldn’t find it with a quick search. I recently did a pilot study in cats who I trained to target touch using three different positive reinforcement methods, primary reinforcement only, a beep used as a bridge stimulus, and a beep used as a secondary reinforcer (with intermittent bridging to food). Results soon! 🙂
Melissa Alexander says
Using your own example though… you were high with anticipation… what if the event had been cancelled? THAT’s what happens when you click and don’t treat. Yes, the high is powerful, but the treat is still a necessary part of the chain or their anticipation will diminish. Honestly, the only thing this study has done is underscore the importance of treating after every click in order to keep the reinforcement factor as high as possible.
Trisha says
To Simon and all still reading this thread: Simon Gadbois is correct, as are several commenters, that the issue of Primary versus Secondary Reinforcers is a separate issue from intermittent reinforcement. I suspect that the way I wrote the blog post appears to confound the two. Sigh… I do well know the difference but wrote the post too fast and it came out a bit sloppy. That’s what I get for trying to multi-task. (Note to self: Do not write blog post when sitting in car coming back from a sheepdog clinic. Don’t worry… I wasn’t driving, but my head was still on the sheep, fall colors and working dogs. Lesson learned.)
Trisha says
To Robin, about your comment that our sheepdog trainers don’t use corrections when training, they use “Natural Social Indicators.” (Like yelping when a puppy mouths too hard.) I answered quickly yesterday, while at a sheepdog trial, “yes and no.” Here’s my explanation: Yes, Border Collies have indeed been bred to work as a team, and also to be especially responsive to sound (Thus, we have a dog who will lie down to a whistle when chasing prey animals when 500 yards away from you. And also a dog more likely to be afraid of thunder than individuals of other breeds–at least it appears that way.) And yes, of course, no one would expect a beagle to stop chasing a rabbit to a quiet “That’ll Do.”
However, Border Collies are not so socially responsive that pleasing their human is more important than anything else and is the primary reinforcer. The primary reinforcer for a well bred working Border Collie is taking control of the sheep. Period. Trainers have to work hard to train young dogs that they best accomplish that as a team. For example, just about every good young dog doesn’t want to stop working the sheep at the end of a training session. The way you train them to “That’ll Do” is to say the phrase, then, if the dog doesn’t stop or at least respond in some small way, correct it with voice and/or a body block. This often takes a lot of running to get in between the dog and the sheep. Once the dog responds, even if only to pause or turn its head toward you, you then use positive reinforcement and instantly “Shush” them to go back to working the sheep. Lots of dogs work very hard to work the sheep on their own, and it can take a long time to develop the kind of team work necessary, especially for a competitive trial dog. The best partnerships are due to the dog learning that s/he and the handler can manage the sheep better together than apart. Dogs have to learn to trust you, but the trust isn’t just that you will be kind and fair. It’s that you understand sheep too, won’t set the dog up to fail too often and that your training leads to the dog getting more and more skilled at taking and keeping control of the sheep.
Last point about NSI’s. Why is a yelp to a mouthy puppy not a correction? It is positive punishment, right? You add something that decreases the frequency of the behavior. Or do you define “correction” differently? [Well, this could take this thread on an entirely different route! Maybe it will lead to another blog sometime…?]
Beth says
Computer problem. Trying from my phone.
Just to play devil’s advocate to the comments that not giving a reward after a click is like not getting to see Jerry Seinfeld at all: In my mind, it’s more like pulling out the ticket and looking at it 5 times in the weeks leading up to the show; again I don’t think anyone was suggesting you click and not reward at all. I think the question was, is it ok during a session to sometimes click three or four times before following up with a primary reward? Or must the reward follow every single click?
To me, that is (in human terms) like looking at the ticket on Monday when the show is on Friday, as opposed to the show being cancelled. To use the same analogy others are using.
Lulu Brooks Tartt says
@Robin
I am greatly enjoying reading all of your posts 🙂
And I completely agree with your response to my post- so many differences within the species we are working with and the individual animal.
And when I think back on the dolphin who responded the most to just the sound of the whistle- great value was placed on novelty with her. Keeping things fresh for her was always a challenge. But it was a fun challenge! 🙂
Miriam says
I haven’t read any comment but I would reply to your first paragraph:
How would you feel, after such anticipation of the Seinfeld show, to know it’s suddenly been canceled?
How stressed would you feel when you get your next ticket for the following show? Will it happen? Will it not happen? The ticket you get (click), would it have the same value or would it be the biggest poisoned cue in existence?
Pat Miller says
Study done by one of Dr. Jesus Rosales Ruiz’s graduate students on this very question…
Effects of click + continuous food vs. click + intermittent food on the maintenance of dog behavior.
http://digital.library.unt.edu/ark:/67531/metadc3598/
Robin Jackson says
@Pat Miller,
Thanks for sharing the paper. I thought the initial description of the issue to be studied was very thorough, with good references and a clear description.
However, I admit to being somewhat baffled by the experimental design itself. I don’t read it as demonstrating the conclusion that the experimenter draws.
First, she notes that “Most of the time the dog walked with the experimenter over to the sofa chair….If the dog did not…in some cases [the Experimenter would] pull the dog (only in click, click, food delivery conditions…)”
That in itself would invalidate some of her conclusions for me. For example, one of the dogs began running away to avoid the C/C/T sessions. But since those were also the only situations where the experimenter “pulled the dog” over, you’ve introduced a whole set of additional aversives just for that protocol.
Second, there’s the whole issue of the feeding schedules. These were pet dogs in their own home. The Experimenter was a stranger to them. The owner originally fed the dogs in the morning and the training took place in the afternoon. However, as the experimenter notes, midway through the study the owner started feeding the dogs in the early afternoon. The Experimenter then changed criteria and required a two hour break between meal and training session but again, to me, this invalidates many of the conclusions. When you’re using food as a primary reinforcer, the learner’s level of satiety is hugely important. You just can’t change feeding schedules midway and expect to be able to compare results.
I question the method of cue attachment, in which she is saying the new cue at the same time as giving the old hand signal cue, rather than the more common New Old format, but if it worked, it worked. At least the methodology appears to have been consistent.
But my biggest problem with the experiment lies in this statement:
“Once both dogs were able to spin and bow at a rate of 90% or better, then the first C+F condition began.”
That’s right–FIRST she trained the dogs to fluency in the two behaviours. Then she started testing click/treat vs click/click/treat.
Let me repeat that: the dogs already knew both behaviours fluently before she began recording the effect of the clicker.
This is exactly the point where most adept dog trainers begin fading the clicker altogether. As well as the food rewards.
I don’t see this particular experiment as contributing in any way to the question of whether click/treat or click/click/treat is more effective during the acquisition of a new cued behavior. Which is the question that most people are interested in.
She does show that, at least with these two dogs, there was some annoyance displayed by the dogs when they no longer received a click and treat for every successful performance of a behavior after the behavior was already learned to fluency. But the small sample size, the issues I already mentioned above with feeding schedule and physical manipulation may negate that as well.
So–great definition of a question to be studied. But the conclusions drawn, to my eye at least, don’t match the actual experimental results.
Others may read it differently, of course. But I do think for most of us the practical question is how to most effectively train a behavior to fluency.
BTW, the experimenter could have done this and kept most of her same design by first training the behavior to fluency on one cue (the hand signal), and then using her C/C/T and C/T protocols to test the acquisition of the verbal cue for the behavior. Or training to fluency with English cues and then beginning the experiment by training, say, whistle cues or cues in Spanish or flashcard cues. But without something new for the dog to learn during the trial, it’s just not really testing the issue raised in the paper’s introduction.
Respectfully,
Robin J.
Beth says
Robin, one of my dogs will quit and/or regress if I keep asking him to repeat a pattern he has mastered. It confuses him. I learned early on that it was best to stop completely after his first 100% successful completion of an exercise, and then wait a few days, for the behavior to move into long-term memory, and then in practice sessions mix up the new behavior with older behaviors to strengthen it. If I ask him to keep doing a behavior he mastered, he seems to get confused or flustered. My best guess of what he is thinking is “I thought I had it figured out, but you are asking me to try again so maybe I was wrong so a pox on both your houses.” Or something. This happens whether I treat on every successful repetition, or not.
So being asked to sit in a room and repeat a mastered behavior multiple times would definitely create the sorts of behaviors in him mentioned in the study— wanting to leave the testing area, offering other behaviors or mixing frustration behaviors in.
For my own dog, then, measuring willingness to comply with requests to repeat the same behavior once it had been successfully performed would not be measuring the effectiveness of the marker/reward in learning. It MIGHT be measuring the ability of the marker/reward ratio to overcome his existing frustration with the exercise.
Thank you for explaining your observations on the study. I had some vague uneasiness when I read it but I wasn’t able to put a finger on why.
Honestly clicker training is not my favorite but I have long used marker words before I knew what marker words were. Since most of my training consists of using “yes” or “good” to mark behavior and then giving very good treats after a random number of “yes” or “good” markers, and the dogs I have trained have always loved training sessions more than just about any other activity on the planet, I suppose I’m still not convinced that every secondary reinforcer must be followed by a primary reinforcer in order for it to continue to work. I know I’m using my own very small sample sizes.
Brian says
Really interesting insight into the psychology that drives dogs. Better understanding how their brains tick can even lead to much easier and relaxed training, so win-win.
andrea wiesner says
I use both ways actually depending on the dog I work with. I find there are dogs being better with click and treat and there are some doing fine with click and not getting treats every time. Actually I found out as my dog OFFERED it to me, obviously being quite content with the click. It was the start of using both ways for me.
Robin Jackson says
@Trisha,
One note on Natural Social Indicators, particularly crying and yelps…
Definitely not automatically defined as positive punishment. There are two reasons why.
DECREASE NIPPING? OR INCREASE POLITE PLAY?
First, the question is always are you trying to reduce the nipping or increase the “play nice” behavior? If the human would be perfectly satisfied with a puppy who just sat on the far side of the room and didn’t interact, then, sure, the purpose of the NSI is to reduce nipping. But most humans seek instead to increase the time when the dog is “being polite.”
In this case, the NSI is actually information to explain the R-, which is the withdrawal of the play opportunity. “I am not going to play with you now because it hurts when you bite me.” The interesting thing is that most humans who use an NSI with a nippy puppy reintroduce the play opportunity very soon after the Ow!, demonstrating that the play is the reward for polite behavior, withdrawal of the play is the R- to increase polite behavior, and the Ow! is informational.
I know, I know, very technical stuff. But interesting if you’re interested in that kind of thing.
THE SOCIAL CONTEXT
The second thing is also important. NSIs are dependent on the social context. They’re not aversive in and of themselves, they’re only aversive IF the puppy feels compassion for that particular person.
A squeaky toy’s “yelp” doesn’t reduce the incidence of biting–in fact, it tends to stimulate it.
A bully may feel pleasure at a victim’s crying.
A skilled interrogator, à la Perry Mason in the courtroom, may feel professional satisfaction when a witness breaks down and starts crying, because it indicates they’re getting close to the truth.
Someone traveling from New York to Los Angeles likely feels compassion when they see a young child crying and alone in an airport–and annoyance when they hear the same child doing the same crying on the plane 3 rows behind them.
One fascinating study found that cats living with only one person often use an attention cry which is similar to a baby’s cry, and appears to stimulate the same part of the human brain. It’s very difficult to ignore. But cats living in multiperson households didn’t develop this.
http://www.theguardian.com/science/2009/jul/13/cats-purr-food-research
The desirability of the activity depends in part on the existing relationship, and so does the effect of the NSI itself.
BABY CRIES DRAW THE MOTHER CLOSER
There are quite a few studies on mothers and babies that indicate the baby’s crying is likely designed to get the mother to increase desirable behaviour. And most importantly in terms of the definition of punishment–the mother will move towards the crying baby, not away.
The baby’s cry is aversive in the sense that the mother doesn’t like it–but it’s not positive punishment for a woman with a healthy maternal relationship with her child. In almost all cases, when the baby starts crying, the mother turns towards it, and begins looking for things to do (feed it, change it, comfort it). It’s R-. The crying stops when the mother does the “right” things.
There are a few “compassion” studies on the behavior of dogs when a person is crying. Very few dogs run out of the room when their owner is crying. One study of well socialised pet dogs found that the dogs moved TOWARDS a crying person (even a stranger) and nuzzled them.
http://news.discovery.com/animals/zoo-animals/dogs-empathy-humans-120831.htm
If that data holds up, it’s hard to argue that simply yelping Ow! when a puppy nips is going to be P+ because of the pre-existing impact of the crying as an NSI. In standard P+, the learner moves physically away from the aversive if possible.
If the message of the Ow! is that the puppy can stop the person from crying by playing nice, and the puppy moves towards the person who says Ow!, then the really interesting thing is that the puppy is likely to naturally do exactly the behavior the person wants to increase–they will nuzzle or lick the person, but NOT bite. This is why NSIs can be so powerful, they already come paired with desired behaviours, at least in well socialised learners.
All of that is probably more than you wanted to hear about the yelping issue in terms of quadrant classification, but if you think about it in the context of a baby crying, I think it gets clearer.
Sarah Owings says
Hi Patricia,
I’m definitely with Karen, Ken, and Susan on this one, but that doesn’t mean that all the cool research being done on the seeking system is wrong. Far from it. I just think that in the laboratory setting, scientists sometimes take too narrow a view of what the “click” actually conveys to the animal. There are many ways to build anticipation in learning. In my opinion, the entire process of marker-based training–whether you use a clicker or not–activates the seeking system plenty already. There is no need to mess about with intermittent primary reinforcement after the click in the early phases of training just to induce more junkie-like gambling endorphins. For me the click is much more than just a secondary reinforcer, much more even than a promise of reinforcement. The click / treat pairing equals *information*, and as such, I always want that pairing to mean the exact same thing to learners each and every time. When shaping new behavior, I feel it is extremely important not to muddy the clarity of that all important stream of information. Fuzziness creates frustration and frustration causes stress, and stress is not conducive to fluency. When my dog is trying to figure something out, I want her to know with pinpoint accuracy when she is correct. I don’t want her thinking too much about whether or not this time the primary reinforcer is going to come, and doubting the correctness of her offered behavior because this time the food did not come. I want her to feel Oh boy! –eat–and then get right back to working for that next Oh Boy! feeling again. By not feeding after the click sometimes, I might actually inadvertently create a pattern where my dog starts thinking / worrying too much about whether the food is coming, and loses track of the task at hand. This actually draws a lot more attention to the food if you think about it. The dog will stop listening to the click and start focusing on other “tells” of when the food will appear.
Once the behavior is fluent and on cue, of course, I can drop the use of the clicker and then build in all types of anticipation and patterns of delayed gratification via chaining (agility, freestyle, etc), intermittent jackpots, variable types of reinforcement (tug, play, chase, release cues, premack, etc), as well as longer duration behaviors that require much more work before the final pay off (jobs such as search and rescue, heel-work, scent-work, etc), etc. Many thanks for your wonderful, thoughtful blog as always.
Eileen Fletcher says
Sarah Owings, you have described exactly what I aim to do and the reasons behind it – are you psychic?? Thanks for putting it into words 🙂
Sue Alexander says
I have saved this blog for a night when I have time to give it that attention that I wanted to give to it and it seems like a lot of like minded people have also taken the time to think this whole debate through very thoroughly.
After watching Simon Gadbois’ incredible presentation at SPARCS I was left with some questions. Simon mentioned that you can find this outlined in any decent first year psych book; I have been picking up first year psych books ever since in the hopes of finding a reference and so far…no dice! Few offer anything beyond the bare bones of operant conditioning, and those that discuss it don’t discuss markers in any great length at all.
Nevertheless, as a very experienced trainer and teacher, I am always open to trying new things. I introduced this first with my advanced students in my school. We have about fifteen students that I was able to use this technique with, and with 100% of the dogs, it improved performance of known behaviours in novel environments. What this means is that if the dog understands come when called, and you do it in a novel environment, then clicking him as he approaches you increases the speed and helps the dog to cope with distractions such as birds, other dogs or even children throwing things across their line of sight. I have found through experiment with my students and their dogs that this works best for sequences of behaviours, such as heeling, coming when called and chains where the dog is working towards an end goal.
Initially, I just put the click itself on a variable schedule of reinforcement and had my students reinforce when I coached them to do so, and we had some very mixed results. Then I took something that I saw Attila Szkuklek do at a Clicker Expo when working with the dogs doing freestyle. He actually used my own dog to demo this and I tucked it away as interesting…but not something I felt I needed at the time. He clicks as a keep going signal, and then he double clicks as a terminal signal. I have fiddled around from time to time with this, but when I introduced it to my advanced students, suddenly I felt like we had all the pieces to the puzzle. Now the dogs understand fully that a click means that they are on the right track and a double click means that they will get the reinforcer. We vary our reinforcers a LOT in our classes using pretty much anything you can think of (rubber dog toy stuffed with sheep wool to sniff worked very well for a sight hound who wasn’t interested in food; she could smell the sheep…but where was it? A puzzle that engaged her seeking system and delighted her even though she could not find the sheep!), including but not limited to food.
So I will say that for dogs who have a solid understanding of click means the end of the behaviour, and that a reinforcer of some sort will ensue, they switch incredibly quickly to click, double click and across the board our advanced dogs can also be shaped for novel behaviours using click/double click.
So on to our intermediate students. These dogs have some level of training but not nearly the fluency that the advanced dogs do. There are perhaps 20-30 intermediate dogs at any given time, and the problem I ran into with them was not so much that the dogs didn’t understand the system but rather that the trainers were less experienced and thus they would sometimes click randomly, sometimes click the wrong things and sometimes just not click at all. The difference in the intermediate and advanced students really appeared to me to be an issue of handler competency not dog competency.
There were some exceptions that were very interesting though. I work with one client who is an intermediate student whose dog developed some very funky dietary challenges including not being able to eat much in quantity or in variety, and it is not always practical to use tug or toys in a group class. The handler is a fairly accurate trainer so I introduced him to click/double click. With a detailed explanation of what I wanted him to do, some demoing and some coaching, he made it work for his dog. It was very exciting to see this dog go from frustrated to calm with the single/double variation.
I have not worked with this with my beginners. I don’t lure train; I see too many dogs who can follow food but who cannot do cued behaviours. Yes, you get early success with your clients, but no you do not turn out vast numbers of actually trained dogs. I use capturing and shaping from the start with all of my clients, so I have enough to teach them in the early stages of their work without adding in click/double click.
I had some thoughts on the earlier posts. First; the cat issue. Consider that your crinkle bag IS your click. So whatever the cat was doing when you crinkled is being marked by crinkling. What I would suggest you need to work on is teaching the cat that once she receives the food, she has to do more to get you to crinkle again. I see this issue with dozens of my students who have inadvertently taught their dogs that putting a hand in the pocket IS the click. Thus, the dog repeats whatever behaviour caused the student to mark the behaviour by putting their hand in their pocket to obtain a treat.
My thoughts on the thesis presented for a masters degree at UNT go far beyond the experimental design; how can you call anything a study on an N of 2 dogs? Really? I am not sure how you make 29 pages out of that! I would also suggest that the reversal design has a number of significant issues with it. There are so many holes in this research that I do not even begin to consider it relevant. The validity of the experimental design, the small sample size, the lack of consistency in methodology and the lack of repeatability are all reasons that I would not consider that study to be at all useful. Top this off with the fact that this is a master’s thesis, not a published study, and you have more reasons than you can believe to not take it seriously.
Considering the dog mentioned above who got frustrated with the intermittent reinforcement of the click, I would suggest that you may wish to try again, but set your criteria differently; instead of only working on behaviours, set your criteria for calm behaviours. This can make a world of difference.
For a few final notes; I have played with this with my somewhat fractious horse with good success. She worked harder, and developed a greater degree of patience with the click/double click system. With my husband’s horse who has much less experience with the clicker, she just lost interest in what we were doing!
And very finally, I think the MOST important part of training an animal is not what method you use, what tools you have in your toolbox, but rather a consistent agreement between you and the individual animal about what you are talking about. If intermittent primaries work for Simon and his lab…go for it; we need all the data we can find on endangered species and I would be volunteering for you in a heartbeat if you didn’t insist on having your lab so far away from my house! If click treat and building chains works for you a la Karen Pryor, go to! If playing with your food according to Denise Fenzi has polished up your agreement with your obedience partner, have at it. And at my school, click/double click, in the spirit of Attila and the incredible Fly is taking my students to new heights. At the end of the day if you and the animal you are working with have an agreement that works and produces results, what could possibly be better than that?
Cindy Martin says
When we talk about the clicker (or any other bridge signal) as a secondary reinforcer, it’s important to keep in mind we have endowed that particular secondary with additional meaning, unlike most other secondaries. It has properties as a secondary, but also as a cue. When clicks (or any other marker signal) are used both as a keep going signal and as a signal that reinforcement is on the way, the meaning becomes ambiguous. The learner will often look for other information in the environment to clarify what the click means, and if he can expect reinforcement. Then we have to ask if our click is really doing what we think it’s doing.
Dolores Palmer says
When the dog hears the click he is not in seeking mode; though he might be feeling anticipation. I do scent work with my dog (searching for truffles) – my dog is in seeking mode when searching for the scent. Important not to get things confused. I’ve seen the results of dogs not given rewards – they get frustrated & give up!
Carlo says
Hello, I would like to bring your attention to the following. According to Professor Sapolsky, dopamine release drops just when the work is done (behavior is clicked), as you may see in this video https://www.youtube.com/watch?v=axrywDP9Ii0.
So the clicker does not provoke pleasure, it ends it.
Best wishes
Carlo Colafranceschi
Scott Stauffer says
It’s soooo nice to hear Affective Neuroscience used!!!!
Then reading through many comments, applying Social Neuroscience.
Even while still discussing CC and DS.
Affective Dog Behavior is using it all.
You can’t have behaviors, counter conditioning, desensitization without discussing AN or SN and how important feeling safe is to the dog. Their way.
Neuroscience is really paving the way to better explain why behaviors happen. Why techniques work.
When Jaak showed 2 rats playing together, at first, they didn’t want to play together. Just with Jaak. Until he showed them that he was safe and then they both played with him and each other.
Social Learning at its finest.
Jaak didn’t even discuss that part. 😉
Jaak’s work has influenced the mental health field, commercials, this pandemic, addiction and how and why we form relationships in ways we don’t even understand.
That emotional connection is innate and needed for any technique to work.
I’m totally interested in discussing this with anyone interested in learning.
Thank You for posting this.
I feel validated with my work and creating Affective Dog Behavior.
Scott Thomas says
Was discussing intermediate bridge with an expert last week. Her point being IB at 1/8 of a second intervals. Dove in and found this discussion that I had previously entered into 8 years ago. I’m wrestling with the IB marker as the pertinent signal when we are giving constant signal (movement, position, breathing, eye contact, pattern of verbalization, etc) to shape behavior. Perhaps the animal is responsive to training with constant signal with or without 1/8 of a second IBs?