# Statistical Perils in Hitchhiking; Part One Regression to the Mean

Created | Updated Jun 5, 2014

The Nababa consider themselves to be a forward looking people, thus they keep detailed records of local rainfall as well as human sacrifices. These records are intertwined because all Nababa know that human sacrifices are the best way to end droughts. Being a forward looking people the Nababa are rather sheepish about their tradition of human sacrifice, and will quickly point out that they sacrifice only sacrfice people when the rain has been extremely poor. They also staunchly defend the scientific backing of their sacrifices, truthfully pointing out that rainfall has almost always increased after years with many sacrifices, and often delclines after years when they make no sacrifices. If the Nababa understood regression to the mean they would quickly see the flaw in their reasoning. As it is the guide strongly recommends avoiding the Nababa unless there has been a heavy rain recently.

Finally, we get to Regression to the Mean.

Regression to the mean is a statistical concept that says if you get a generate a random value that is above average

^{1}you should expect the next value to be lower and vice-versa. For example if you roll a fair d10

^{2}and get an eight there is an 70% chance that you’ll get a lower number the next time you roll it. Conversely, if your first roll is a 1 you can’t get any lower on your next roll and will very likely roll higher.

Many real life situations will have additional complications that can outweigh regression to the mean, however effect will be present anytime values have even slight random variance. Whenever, you observe a particularly extreme case you should expect the next case to less extreme, otherwise what are you comparing extreme to?

So what do this have to do with the Nababa?

Because the Nababa sacrifice people in response to unusually poor weather the rain merely returning to normal makes it seem like their sacrifice has been effective when really it is just regression to the mean. Unfortunately, the Nababa aren't the only ones to make this kind of mistake.

More cases of regression to the Mean.

Many sports fans believe in some variety of the rookie of the year curse. This superstition holds that being awarded rookie of the year or similar titles harms an athletes performance. While there may be some real effect present, most of this can be explained in terms of regression to the mean. Becoming the rookie of the year obviously requires a lot of talent, however it also requires a lot of luck. Players who become rookie of the year almost always had an unusually lucky year and its unsurprising that that luck often doesn’t last for two years in a row.

More seriously some psychologists believe it causes people to overestimate the effectiveness of punishments and underestimate the usefulness of rewards. The theory goes that rewarding people for behaving better than normal will often be followed by them behaving more or less normally i.e. worse than they were just behaving, thus discouraging whoever rewarded them. Of course this theory also predicts than the opposite will happen when people are punished for behaving poorly. This theory isn’t just speculation, studies routinely show that people overrate punishment and devalue rewards relative to how those incentives perform in controlled studies. It was also directly tested by Paul Schaffner, in 1985 he conducted an experiment where the participants were allowed to give feedback to a virtual student. The virtual student was supposed to arrive at school by 8:30, and participants could give him virtual punishments for being late. When asked afterwards the participants almost all said that there punishments were effective. The catch was that the virtual students behavior was randomly generated.

While we're on the subject of science, regression to the mean is a partial cause of the placebo effect. People receive more medical treatment when they're unusually ill, thus they tend to be healthier after treatment.

^{1}Strictly speaking the important average here is the median. Regression to the mean was first examined in the context of normal distributions which have the property mean=median=mode, but are normally talked about in terms of means. Hence the slightly misleading name.

^{2}A ten sided dice that has an equal chance of landing on each of its sides. The concept holds with normal dice,but the numbers are nice and round with d10s.