Tuesday, July 12, 2011

A really excellent post by Peter Dorman on how we talk about empirical evidence

I strongly concur with this. Here is some of it:

"A researcher has a theory, call it X1, that can be expressed as a model of how some portion of the world works. Among other things, this theory predicts an outcome Y1 under a specified set of circumstances. There is a dataset that enables you to ascertain that these circumstances apply and to identify whether or not Y1 has arisen. How should this test be interpreted?

My proposal is simply this: the researcher should be expected to consider how many other plausible theories, X2, X3 and so on, also predict Y1. This should take the form of a section in the writeup: “How Unique Is this Prediction?” or something like that. If X1 is the only plausible theory that predicts, or better permits, Y1—if Y1 is inconsistent with all X except X1—the empirical test is critical: it decisively scrutinizes whether X1 is correct. If, however, there are other X’s that also yield Y1, the test is much weaker. It will accurately determine if X1 is false only if all the other X’s are false as well."
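As a rough illustration of Dorman's point (the theory names and predicted outcomes below are entirely hypothetical), an observation only discriminates among theories to the extent that some plausible rivals are inconsistent with it, and the test is only critical when exactly one plausible theory permits the observed outcome:

```python
# Toy sketch of Dorman's proposal. Theories are represented by the
# sets of outcomes they permit; an observation "falsifies" exactly
# the theories that do not permit it.

def surviving_theories(theories, observation):
    """Return the names of the theories whose predictions permit the observation."""
    return {name for name, predictions in theories.items()
            if observation in predictions}

# Hypothetical rival theories and the outcomes each permits.
theories = {
    "X1": {"Y1"},        # X1 predicts only Y1
    "X2": {"Y1", "Y2"},  # X2 also permits Y1
    "X3": {"Y2"},        # X3 is inconsistent with Y1
}

survivors = surviving_theories(theories, "Y1")
print(sorted(survivors))    # ['X1', 'X2']
# Observing Y1 eliminates X3, but it cannot decide between X1 and X2,
# so the test is not critical:
print(len(survivors) == 1)  # False
```

Observing Y1 knocks out X3, but both X1 and X2 survive, which is exactly the weak case Dorman describes: the test tells for X1 only if every other surviving X is false as well.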

We like to talk about Popper, but falsification is so elusive that I think it's dangerous to talk about empiricism as falsification at all. Falsification is better thought of as something to always keep in the back of our minds than as something we "do".

It has occurred to me before that a lot of what I've worked on revolves around pointing out precisely what Dorman is saying here. I have:

- Explained in the RAE how Woods, Murphy, and Powell overlook other equally plausible explanations of the effect of fiscal policy in 1920-1921,
- Explained in Econ Journal Watch how Buturovic and Klein ignore other more plausible explanations of the observed relation between ideology and "enlightenment" (they pushed the data a little harder and ended up more or less agreeing with me),
- Explained in my QJAE article under review how observed aggregate wage cyclicality obscures alternative explanations of the impact that Hoover had on wages.

Considering these things is important, particularly in a field as hard to identify statistically as macroeconomics. Indeed, I think these issues bother me so much precisely because I came up through labor economics and econometrics, where people obsess over model specification and identification.


  1. What Dorman describes is how to use falsification to discriminate among competing hypotheses (since none of them can be verified by the evidence).

  2. I don't think this is right, Lee.

    You're not doing any falsification here. You are doing what you can do without pissing off an advocate of falsification (or simply an advocate of clear thinking). But there is no falsification, and this is my point.

    A lot of people think that empirical science actually does falsification on a regular basis. I imagine in some cases it does - but even in those cases we are ultimately ignorant of alternative explanations that may keep it from being a true case of "falsification".

    This mistake of assuming that we're falsifying something leads people to false confidence in their results. Moreover, the view that we HAVE to falsify things to do good empirical work, which comes out of this same illusion, leads people to reject good empirical research.

  3. If by "using falsification to discriminate" all you mean is "being up front about when you're not falsifying something", then I agree with you.

  4. There is no such thing as good empirical work that doesn't involve falsification (unless it's just a collection of observations with no theory).

    There is a very simple reason for this. Our empirical observations are like existential statements in logic, e.g. "there exists an x such that x is y." And our theoretical explanations are like universal statements in logic, e.g. "all x are y."

    What can our existential statements entail about our universal statements? They cannot entail truth, but they can entail falsity. That is, every statement about an observation, if true, implies that infinitely many possible theories are false, but it does not imply that a single one is true.

    In other words, to "support" a theory, evidence must falsify an alternative theory -- this is basically what Dorman is saying.

  5. By the way, I do not mean to suggest that a falsification somehow puts the matter to bed once and for all. Sometimes the evidence turns out to have been defective (e.g., the measuring instruments were improperly calibrated, or some other auxiliary assumption was incorrect). A falsification is always as provisional as the theory it falsifies, because we are all human and prone to error.

    However, the logical relation between evidence and theory can only be one of falsifiability.

  6. "Plausible" means the BELIEF of what could or could not be, and has nothing to do with what IS and IS NOT.
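The asymmetry described in comment 4 can be written out in first-order logic. A confirming instance does not entail the universal theory, but a single counterinstance entails its negation:

```latex
% Universal theory vs. existential evidence:
% confirmation does not follow, but refutation does (modus tollens).
\begin{align*}
\text{Theory (universal):}\quad
  & \forall x\,\big(P(x) \rightarrow Q(x)\big) \\
\text{Confirming observation:}\quad
  & \exists x\,\big(P(x) \wedge Q(x)\big)
    \;\not\vdash\; \forall x\,\big(P(x) \rightarrow Q(x)\big) \\
\text{Refuting observation:}\quad
  & \exists x\,\big(P(x) \wedge \neg Q(x)\big)
    \;\vdash\; \neg\,\forall x\,\big(P(x) \rightarrow Q(x)\big)
\end{align*}
```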

