There’s a lot of tit-for-tat going on in the blogosphere over the alleged cherry picking of data (also here; the original, criticized post is here). I’ll remain agnostic on the empirical question, since all the usual caveats apply. But what exactly is cherry picking? Is it ever okay to select data?
Let’s be clear on one thing: deliberate and judicious selection of data is not equivalent to cherry picking. A charge of cherry picking is considerably more severe. Scientists, like technicians, select data using criteria that “seem to fit” their view of what is happening. There’s nothing suspicious about this. It’s what all specialists do: scientists, academics, politicians, lawyers, policy makers, businessmen, and so on. We select the relevant data and discard the irrelevant by using our professional judgment.
Selection becomes a problem, however, when we discard or discriminate against relevant data that either fail to support our position or contradict it. Cherry picking is an informal fallacy of relevance. (There are other, related fallacies of induction; but cherry picking, as I understand it, is a fallacy of relevance.)
A clear implication of this fallacy is that the charge of cherry picking cuts both ways. The charge applies to anyone who selects data that fallaciously demonstrate her position. A dispute over whether someone has cherry picked, in other words, must turn on the relevance of the data and not on the mere existence of contravening data.
Relevance is key.
In Chip Knappenberger’s Guide to Cherry Picking, Chip artfully tries to show that, depending on your starting and stopping points, you can end up with one conclusion or the other about the warming or cooling of the earth.
…the answers [about whether the climate is warming or cooling] depend on several things, among them the dataset you want to use and the time period over which you examine—i.e., which cherries you wish to pick.
But this is a distortion of what we mean by “answer”…
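For readers who want to see the mechanics behind the quoted claim, here is a minimal sketch of how the start and stop points of a window can flip the sign of a fitted trend. Everything in it is an assumption made for illustration: the series is synthetic (a steady rise plus a ten-step oscillation), it is not real temperature data, and the names `ols_slope` and `window_slope` are hypothetical helpers rather than anything taken from Knappenberger's post.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic series, NOT real climate data: a rise of 0.02 per step
# plus an oscillation with period 10 and amplitude 0.3 (assumed values).
steps = list(range(40))
values = [0.02 * t + 0.3 * math.sin(2 * math.pi * t / 10) for t in steps]


def window_slope(start, end):
    """Fitted slope over the half-open window [start, end)."""
    return ols_slope(steps[start:end], values[start:end])


full = window_slope(0, 40)        # the whole record
short_down = window_slope(3, 8)   # a window on the falling part of one cycle
short_up = window_slope(8, 13)    # a window on the rising part of the next

print("full record:", full)
print("window 3 to 7:", short_down)
print("window 8 to 12:", short_up)
```

The full record slopes upward, the first short window slopes downward, and the second slopes upward more steeply than the full record. The arithmetic is equally valid in all three cases; which window is the relevant one for the question being asked is something the arithmetic cannot settle.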