03/11/2025
The Quadrants of Learning (and How They Show Up in Herding)
You can read more in the brand new second edition of Urban Sheepdog. Order here: https://amzn.to/3Kx3yK0
Few things in dog training get mixed up as much as the four quadrants of learning, but they aren’t opinions or methods. They’re just a way to describe what happens after a behaviour, and whether that behaviour becomes more or less likely next time.
The other day, a post in a herding group drew a long thread of comments trying to unpack how the quadrants apply in herding. Some suggested it was all positive reinforcement; some said it wasn’t. Some thought herding was devoid of the quadrants altogether.
Let’s unpack what is actually taking place!
• Positive reinforcement: You add something the dog wants, and the behaviour increases. Example: You give your dog a treat for sitting, and they sit more often.
• Negative reinforcement: You remove something the dog doesn’t want, and the behaviour increases. Example: You loosen leash pressure when your dog stops pulling, so they learn that staying close makes the discomfort go away.
• Positive punishment: You add something the dog doesn’t want, and the behaviour decreases. Example: You say “hey!” sharply when they jump up, and they stop jumping as much.
• Negative punishment: You remove something the dog wants, and the behaviour decreases. Example: You stop the game when they bite too hard, so they learn that rough play makes the fun end.
That’s all it is.
“Positive” and “negative” mean add or remove, like math. It’s not “positive is good” and “negative is bad.” “Reinforcement” means the behaviour goes up. “Punishment” means it goes down.
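If it helps to see the two-by-two laid out, here is a minimal sketch in Python. The function name `quadrant` and its labels ("added", "removed", "increases", "decreases") are made up for illustration; they just restate the definitions above as a lookup and are not any standard tool.

```python
def quadrant(stimulus: str, behaviour: str) -> str:
    """Name the quadrant from two observations (hypothetical helper).

    stimulus:  "added" or "removed"       (what happened right after the behaviour)
    behaviour: "increases" or "decreases" (what the behaviour does next time)
    """
    # "Positive"/"negative" only describe whether something was added or removed.
    signs = {"added": "positive", "removed": "negative"}
    # "Reinforcement"/"punishment" only describe whether the behaviour goes up or down.
    effects = {"increases": "reinforcement", "decreases": "punishment"}

    if stimulus not in signs or behaviour not in effects:
        raise ValueError("expected added/removed and increases/decreases")
    return f"{signs[stimulus]} {effects[behaviour]}"


# Treat added after a sit, and sitting goes up:
print(quadrant("added", "increases"))    # positive reinforcement
# Game removed after a hard bite, and hard biting goes down:
print(quadrant("removed", "decreases"))  # negative punishment
```

Notice the function never asks whether the thing was "nice" or "harsh". The label comes only from what was added or removed and what the behaviour did afterwards.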
Now, picture a herding dog on stock. The learning theory is happening constantly:
When a handler steps in toward the dog, swings a stick, or uses a sharp tone, that’s positive punishment: something unpleasant is added to make the current behaviour (like diving in too close or gripping) less likely.
When the dog changes their behaviour (backs off, gives space, finds balance) and the “pressure” or correction stops, that’s negative reinforcement: the removal of something the dog finds aversive makes that better behaviour more likely next time.
When a dog works well and the handler praises quietly or lets them keep working, that’s positive reinforcement. The praise is added, and continued access to the sheep acts as the reinforcer: the dog gets something they want, and the work continues because the dog’s choices keep paying off.
When a dog loses the chance to work because they ignored cues or got too wound up, that’s negative punishment: the thing they wanted most (the sheep) disappears, so that behaviour is less likely.
The Sheep Are Learning Too!
Learning theory doesn’t just apply to the dog. The sheep are also responding to consequences in real time. Every movement from the dog or handler changes what they feel, want, or avoid, and that shapes their behaviour too.
Negative reinforcement: When they move away from the dog, and the dog eases up on its intensity, the "pressure" from the dog decreases. The removal of that discomfort (the dog’s eye, movement, or proximity) makes them more likely to respond in the same way next time.
Positive punishment: If they challenge the dog or refuse to move, and the dog rushes in, grips, or blocks hard, something unpleasant is added. That makes the bold behaviour less likely.
Negative punishment: If a sheep drifts too far from the group and loses the safety of the flock, the loss itself is punishing, and they’re more likely to stay closer next time.
So while the dog is learning how to influence the sheep, the sheep are learning how to respond to the dog.
The whole system is built on feedback loops of what they call “pressure and release”, which is really just the quadrants!
It's happening to the humans, too. We buy a bunch of sheep, realize how expensive hay is, lose money, and stop buying sheep. That's negative punishment: the loss of something good ($) decreases our behaviour in the future!