Cherry Ping-Pong
October 13, 2009
There’s a lot of tit-for-tat going on in the blogosphere over the alleged cherry-picking of data (also here and original, criticized post here). I’ll remain agnostic on the empirical question, as all usual caveats apply. But what exactly is cherry picking? Is it ever okay to select data?
Let’s be clear on one thing: deliberate and judicious selection of data is not equivalent to cherry picking. A cherry picking charge is considerably more severe. Scientists, like technicians, select data using criteria that “seem to fit” their view of what is happening. There’s nothing suspicious about this. It’s what all specialists do: scientists, academics, politicians, lawyers, policy makers, businessmen, and so on. We select relevant data and discard the irrelevant stuff by using our professional judgment.
Selection becomes a problem, however, when we discard and/or discriminate against relevant data that either does not support our position or that contradicts our position. Cherry picking is an informal fallacy of relevance. (There are other, related, fallacies of induction; but cherry picking, as I understand it, is a fallacy of relevance.)
A clear implication of this fallacy is that the charge of cherry picking cuts both ways. The charge applies to anyone who chooses to select data that fallaciously demonstrate her position. The nature of the dispute over whether someone has cherry picked, in other words, must be over the relevance of the data and not the mere existence of contravening data.
Relevance is key.
In Chip Knappenberger’s Guide to Cherry Picking, Chip artfully tries to show that, depending on your starting and stopping points, you can end up with one conclusion or the other about the warming or cooling of the earth.
…the answers [about whether the climate is warming or cooling] depend on several things, among them the dataset you want to use and the time period over which you examine—i.e., which cherries you wish to pick.
But this is a distortion of what we mean by “answer”…
What the answer actually depends on is the climate, and whether the climate is in fact warming or cooling. The job of the scientist is to try to get at this answer by using strong scientific and inductive reasoning. Depending on how she approaches the problem, depending on the strength of her reasoning, she may give you a slightly different answer. If she uses poor selection procedures in choosing her data, for instance, she’s liable to get the wrong answer. If she actively disregards contravening data, she is then manipulating the answer.
Chip Knappenberger, who originally wrote the cherry-picker’s guide, doesn’t give us this discussion. He just gives us the data, which he pairs with several answers spinning around the blogosphere. We are left to draw our own conclusions. Again, I’m not a scientist, so I’m not suited to judge the relevance of the data. But I can definitely judge the argument; and the argument in Knappenberger’s case doesn’t even aim at demonstrating relevance. It simply shows that there are other data. One would expect more on relevance conditions from a cherry-picker’s guide.
Here’s another set of data to add to his chart.
The temperature in my downstairs guest room stays roughly 65 degrees, year-round, give or take ten degrees. Thanks to the insulating nature of Colorado’s soil, our indoor air temperature on our bottom floor doesn’t fluctuate much. If I include such data in my calculations of the climate, I’m likely to see that, over the past decade, temperatures have not changed. They’re flat at 65 degrees.
Fortunately, it’s easy for most people to see how irrelevant my basement temperature is to a broader discussion of climate. It’s irrelevant because it is not a good measurement of climate. It’s just as irrelevant as the temperature in my refrigerator or the CPU temperature of my computer. What is less easy to see, however, is that this judgment of relevance is not a given. A scientist has to first identify this information as irrelevant and then filter it out of her calculations.
Scientists don’t waste much time discussing the climate of their basements. Most everyone knows that basement climates are irrelevant. Scientists do spend time, however, discussing other possibly relevant noise that may or may not have something to do with the correct answer about whether the climate is moving up or down.
As I said, Chip Knappenberger does not show, or even discuss, whether the data chosen to demonstrate a given point of view were either relevant or irrelevant. He does not offer up the reasoning. He simply shows us that different graphs yield different answers, paying little attention to the correctness of the answer.
Finally, a note of caution that should cast doubt on this guide. Knappenberger has the following to say for himself, in his second paragraph, typed in italics:
What I can say for certain, is that the recent behavior of global temperatures demonstrates that global warming is occurring at a much slower rate than that projected by the ensemble of climate models, and that global warming is most definitely not accelerating.
This is an exceptionally strong claim. It is not just moderately strong. It is exceptionally strong. Even knowing nothing about climate science, as is my case, I am far less likely to put stock in a claim posed in the positive, about what one knows “for certain,” than a claim posed in the negative, about what we “cannot know for certain.” This is not because some things cannot be known for certain, but rather because when there are reliable and expert sources that say otherwise, said certainty can readily be called into question. If someone says that they can say for certain that X did not cause Y, and yet there are many reliably expert people who beg to differ, then you should question not those who beg, but those who proclaim certainty.
It appears that some scientists, particularly over at RealClimate, beg to differ. They write:
Even the highly “cherry-picked” 11-year period starting with the warm 1998 and ending with the cold 2008 still shows a warming trend of 0.11 ºC per decade (which may surprise some lay people who tend to connect the end points, rather than include all ten data points into a proper trend calculation).
It is clear even to a non-scientist like me that looking at a short timespan and drawing a certain conclusion about the state of the earth’s climate is a form of cherry picking the data. Stefan is correct to point this out. There is natural variability over any set of years, and if this natural variability is to be taken seriously at all, as Stefan gives a plausible argument that it should, then we need to look across a longer time horizon. Unlike what we’ve been exposed to in the cherry-picker’s guide, we do get here, in short form, an appeal to relevance.
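For readers who want to see the difference between connecting the end points and fitting a trend through every point, here is a minimal sketch in Python. The numbers are invented purely for illustration and are not real temperature measurements: I assume a steady rise of 0.02 per year, then make the first year unusually warm and the last year unusually cold. The function names `endpoint_slope` and `ols_slope` are my own, and the second is an ordinary least-squares fit, which may or may not match the exact procedure RealClimate used.

```python
def endpoint_slope(xs, ys):
    """Slope of the straight line drawn from the first point to the last."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])


def ols_slope(xs, ys):
    """Ordinary least-squares slope, which uses every point in the series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Made-up series (NOT real data): eleven years rising by 0.02 per year,
# with a warm outlier at the start and a cold outlier at the end.
years = list(range(11))
anomalies = [0.02 * t for t in years]
anomalies[0] += 0.15   # unusually warm first year
anomalies[-1] -= 0.15  # unusually cold last year

print("endpoint slope per year:", endpoint_slope(years, anomalies))
print("least-squares slope per year:", ols_slope(years, anomalies))
```

With these invented numbers, the line through the two end points slopes downward while the fit through all eleven points still slopes upward. That says nothing about the actual climate; it only shows why the choice between the two calculations is itself a question of relevance.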