Ryan Murphy has taken over a project on fiscal multipliers (here) which has a special emphasis on out of sample tests. He's looking for reactions to the summarization work so far, any examples of out of sample tests you're aware of, and also any other multiplier papers. I have a few thoughts on this:
Out of sample testing
First, the whole idea of an out of sample test is tough for the same reason that evaluating the Romer-Bernstein analysis was tough. When you go out of sample you are testing at least two things: (1) the impact estimate of fiscal policy and (2) the baseline behavior of the economy. If we are talking about multipliers just above one, often on relatively small fiscal packages, it's very plausible that normal (or even abnormal) error in #2 will swamp #1. Certainly it's not easy to pull apart whether #1 or #2 is wrong.
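A back-of-the-envelope sketch of that signal-to-noise problem, with numbers made up purely for illustration (the package size, the two competing multipliers, and the baseline forecast error are all assumptions, not figures from the project or from any paper):

```python
# All numbers below are illustrative assumptions, not estimates from any study.
package = 1.0            # fiscal package, in % of GDP
multiplier = 1.2         # hypothetical multiplier "just above one"
rival_multiplier = 0.0   # hypothetical competing claim (full offset)
baseline_sd = 1.5        # assumed std. dev. of baseline forecast error, % of GDP

# Gap in predicted output between the two multiplier claims (#1 in the text).
signal = (multiplier - rival_multiplier) * package

# That gap measured in units of ordinary baseline error (#2 in the text).
signal_to_noise = signal / baseline_sd

print(f"gap between the two claims: {signal:.2f}% of GDP")
print(f"gap / baseline error sd:    {signal_to_noise:.2f}")
```

Under these assumed numbers the two claims sit less than one standard deviation of baseline error apart, so a single out of sample miss can't really tell you which one failed. A bigger package or a tighter baseline forecast changes the picture, which is the point: the test is only as good as #2.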
Now let's say you can pull the two apart. Let's say, in your test out of sample, you can hold all the other things in the economy constant and look at the impact of fiscal policy in new data. What is that? Well of course it's just another multiplier estimate!
In that sense, we have lots of "out of sample tests" of multiplier estimates - it's just a new multiplier estimate. When we think about testing out of sample we're usually thinking about testing a whole system of equations that describes the economy. We don't usually think about testing a single parameter out of sample, which is sort of what this project is requesting.
These intuitions of mine seem to be confirmed by a quick scan of the articles they have that do test (or sort of test) out of sample. They're all DSGE models - that is, forecasting efforts using systems of equations that describe the whole economy. That you can test out of sample. But a single impact estimate? That's tough. Granted, although I have an interest in macro, I'm not an empirical macro guy, so I may be misunderstanding something.
What's interesting about multiplier studies
We've talked about a handful of multiplier studies on here before, although my knowledge of the literature is nowhere near as comprehensive as what they have on this website. What I find interesting about multiplier studies - more interesting than this out-of-sample issue that they're tracking - are questions around identification strategies and what is held constant. The two points are often related. If exogenous variation across states or localities is used to identify the model, then the principal factor you're holding constant is monetary policy. That's a different sort of multiplier than the multiplier that Sumner refers to when he says that the multiplier is zero. Of course, that introduces biases elsewhere associated with demand leakages across state lines. Papers that solve that problem often have the opposite problem - are they properly identifying stimulus at the national level, or is fiscal policy still endogenous? Without disparaging the work they've done, I would be interested in columns with headings like "identification strategy", "level of analysis", etc.
The other thing that is worth keeping track of with multiplier studies is the macroeconomic environment. Nobody thinks that multipliers are stable over time. Obviously structural parameters governing consumption behavior do not stay the same. But on top of that, the degree of crowding out is going to be a function of capacity utilization. It is not a coincidence that multipliers from things like war spending that are estimated in high-demand periods are lower than multipliers from narrative histories of exogenous taxes (which may come in high- or low-demand periods), which are themselves lower than multipliers estimated during the Great Depression. The good papers know this and interact the multiplier estimate with some variable indicative of the macroeconomic environment. That, of course, again raises the question of identification. Your fiscal variable had better be well identified anyway, but it especially had better be well identified if you are introducing those sorts of policy interactions.
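For what "interacting the multiplier with the macroeconomic environment" looks like mechanically, here is a minimal sketch on simulated data. Everything in it is hypothetical: the slack index, the "true" coefficients, and above all the assumption that the fiscal shock is exogenous, which is exactly the part real papers have to fight for. It is not any particular paper's specification.

```python
import random

random.seed(0)

# Assumed "true" values, chosen only for illustration.
BASE_MULT = 0.6    # multiplier when the economy is at full capacity
SLACK_BONUS = 1.0  # extra multiplier per unit of slack (slack runs from 0 to 1)

n = 5000
g, g_slack, y = [], [], []
for _ in range(n):
    shock = random.gauss(0.0, 1.0)   # exogenous fiscal shock (assumed well identified)
    slack = random.random()          # hypothetical slack index, 0 = full capacity
    noise = random.gauss(0.0, 0.5)   # everything else moving output
    g.append(shock)
    g_slack.append(shock * slack)
    y.append(BASE_MULT * shock + SLACK_BONUS * shock * slack + noise)

# OLS of y on [g, g*slack] with no intercept (all series are mean zero here),
# solving the 2x2 normal equations by Cramer's rule.
a = sum(v * v for v in g)
b = sum(u * v for u, v in zip(g, g_slack))
d = sum(v * v for v in g_slack)
p = sum(u * v for u, v in zip(g, y))
q = sum(u * v for u, v in zip(g_slack, y))
det = a * d - b * b
base_hat = (p * d - b * q) / det
bonus_hat = (a * q - b * p) / det

print(f"estimated multiplier at full capacity: {base_hat:.2f}")
print(f"estimated multiplier at maximum slack: {base_hat + bonus_hat:.2f}")
```

The regression recovers both numbers here only because the shock is exogenous by construction. If the shock were correlated with the slack variable or with the noise term, both the level and the interaction coefficient would be contaminated, which is why identification matters even more once interactions are in the model.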