Friday, January 24, 2014

Two different critiques of Dube, Lester, and Reich: Part 1 - Meer and West

Note: This post has changed a bit since I started it and I still feel like it's provisional (I'm maybe 76% sure of it now) in the sense that I'm still trying to get a grasp of how state-specific time trends have actually been operationalized across these studies. So, just a word of warning (and if you can clarify anything for me please do!).

*****

The other day I was writing about Bob Murphy's concern about including county-specific time trends in DID estimates of the minimum wage (this post assumes familiarity with that older post). I hope I made a persuasive case that, generally speaking, it makes good sense to include time trends in DIDs. County pair matches are great, but they aren't perfect, and what they don't handle is county-specific time-varying heterogeneity. How do we deal with that? County-specific time trends.

However, at the time (as you'll see if you review my post linked above) I was thinking in terms of time trends projected from the pre-treatment period. That seemed like the only sensible approach, so I figured that's what was being done. If you include information from the post-period on universal time trends (or time fixed effects), that's fine of course, because the variation explained by those sorts of time trends would have to be common to treatment and comparison cases alike. Not so for county-specific (or, for the purposes of the Meer and West paper I'll be discussing here, state-specific) time trends. Now, maybe DLR did do time trends the way I assumed they would, in which case a lot of the Meer and West critiques are not substantial. But at this point I think Meer and West are right on this econometric point (whether it matters in the data is a different question, and a question for another post [or for your own googling... lots of people have talked about the results]).

OK, let's back up a little.

There are two potential critiques of including state- or county-specific time trends, which I'll call the Meer and West Critique and the Murphy Critique (although clearly I'm putting them each in my own words). The Meer and West Critique comes from a working paper (Meer and West 2013, specifically Appendix A and B) that looks at the impact of minimum wages on employment growth. The Murphy Critique comes from Bob's post from about a year ago, with my own modification from the other day about the cases where his concerns will matter if county-specific time trends are based on pre-period trends (it's only under specific conditions that it matters - see my old post). Both are important, but they are fundamentally different and pose fundamentally different burdens on the DLR specifications and results. The Murphy Critique poses many more problems, econometrically and practically, than the Meer and West Critique, in the sense that it still has to be dealt with even if you use pre-period time trends. In my own words:

The Meer and West Critique: If the impact of the treatment is on growth rates rather than levels, then a DID with time trend controls and levels as the outcome "could lead to dramatically incorrect inference about the treatment effect" (Meer and West, 2013, pg. 38).

The Murphy Critique: If county-specific time-varying heterogeneity is correlated with treatment, then county-specific time trends will not identify the treatment effect, and the estimates of the treatment effect will be biased.

*****

To think about the Meer and West Critique, let's revisit those DID diagrams that I had before. Meer and West consider a case of "staggered treatment", where all cases ultimately get a treatment, but it occurs at different times. This is very common in actual empirical studies because policy changes don't all happen at once and usually many of the cases that are comparison cases at one point in time eventually get some form of the treatment. Consider the case of three periods, where group A is treated at the end of period 1 and group B is treated at the end of period 2. This would look something like this:

[Figure: staggered treatment example - group A is treated at the end of period 1, group B at the end of period 2]
This makes the distinction between Treatment and Comparison "groups" a little more complicated because in one period group A will be the treatment group and in another period it will be the comparison group. But notice that this actually allows you to do two classic DIDs. You can do:

[(Period 2, A) - (Period 1, A)] - [(Period 2, B) - (Period 1, B)]

and you can do:

[(Period 3, B) - (Period 2, B)] - [(Period 3, A) - (Period 2, A)]

If you have thousands of counties all changing their minimum wage status multiple times, the number of DIDs you can do increases as well. This is the fixed effects model operating as a DID model that I referred to in my earlier post (you just need to be able to match treatment and comparison groups, like DLR do with contiguous county pairs, and the minimum wage variable and the county and time fixed effects do the rest).
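
Just to make the arithmetic concrete, here's a quick sketch of those two DIDs computed from a made-up panel (my own toy numbers, nothing from the actual data):

```python
import pandas as pd

# Toy panel: two groups (A, B) and three periods. The employment numbers
# are made up purely for illustration. A is treated at the end of period 1,
# B at the end of period 2 (staggered treatment).
panel = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B"],
    "period": [1, 2, 3, 1, 2, 3],
    "emp":    [100, 98, 97, 100, 102, 100],
}).set_index(["group", "period"])["emp"]

# First classic DID: A is the treatment group, B the comparison, periods 1 and 2.
did_1 = (panel.loc[("A", 2)] - panel.loc[("A", 1)]) - (panel.loc[("B", 2)] - panel.loc[("B", 1)])

# Second classic DID: B is the treatment group, A the comparison, periods 2 and 3.
did_2 = (panel.loc[("B", 3)] - panel.loc[("B", 2)]) - (panel.loc[("A", 3)] - panel.loc[("A", 2)])

print(did_1, did_2)  # -4 and -1 with these made-up numbers
```

A fixed effects model with county and time dummies and a minimum wage variable is, in effect, averaging over a great many of these little comparisons at once.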

This is the basic framework that Meer and West (2013) use, and which they illustrate in their Appendix A using diagrams very similar to mine from the other day. In their scenario (a), below, there is no treatment effect (no effect of the minimum wage) on either employment growth or levels. There is a time trend for each state, but you can see that it's common across the states, so the time fixed effects should account for it. A DID estimator like the one I outlined above would find no effect of the minimum wage, regardless of whether the dependent variable was employment levels or growth in employment. (All three of the following figures come from Meer and West's paper.)

[Figure: Meer and West scenario (a) - no treatment effect on levels or growth]
In scenario (b), Meer and West assume a negative treatment effect on employment levels, but not on employment growth. Again, this is no problem in a standard DID. If the dependent variable is employment levels, the standard DID will find it because the time trends are common and controlled for. If the dependent variable is employment growth, the standard DID will find the (correct) zero employment growth effect because the rate of change of employment doesn't change. Again, all is well with the standard DID.

[Figure: Meer and West scenario (b) - negative effect on employment levels, no effect on growth]
Meer and West worry more about their scenario (c), where the minimum wage reduces employment growth but has no effect on levels. In this case, a standard DID will identify a zero levels effect if employment levels are the dependent variable and a negative growth effect if growth rates are the dependent variable:

[Figure: Meer and West scenario (c) - negative effect on employment growth, no effect on levels]
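
To make the three scenarios concrete, here's how I'd write them down numerically - my own toy numbers, not Meer and West's:

```python
import numpy as np

# Toy versions of Meer and West's three scenarios for the treated state
# (my own numbers, just to make the pictures concrete). Treatment hits at
# t = 10; a comparison state would keep growing by 2 per period throughout.
t = np.arange(20)
no_treatment_path = 100 + 2 * t

# (a) no effect on levels or growth: treated path identical to the no-treatment path
scenario_a = no_treatment_path.copy()

# (b) a one-time drop in the LEVEL at treatment, with growth unchanged afterward
scenario_b = no_treatment_path - np.where(t >= 10, 5, 0)

# (c) growth falls to zero at treatment, with no jump in the level
scenario_c = np.where(t < 10, no_treatment_path, no_treatment_path[9])

print(scenario_c[9], scenario_c[19])  # 118 and 118: the level stops growing after treatment
```
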
They call the zero levels effect result "counter-intuitive" - and it is at first. If you look at the first and second periods (so the DID for state A), you clearly have employment levels getting closer in the post-period (albeit as a function of the decline in the growth rate). So that should give you a negative levels effect in the DID. Same goes for the second DID if you control for state-specific trends, right? State B would have been closer than it is if the trend had continued, so if those state-specific trends were controlled for you'd think the DID would pick up a negative levels effect. But it doesn't.

This was what was difficult for me to grasp at first, because my assumption was that state-specific time trends were based only on the pre-period. At this point I don't think they are in DLR, because they specify the state-specific time trend as a term of the form I_s*t (with its own coefficient for each state), with I_s being an indicator variable for state s and t a linear time index. This makes no distinction between pre- and post-, so it seems to be a state-specific time trend over the full study period. What seems to be going on here (although I find their explanation in Appendix A to be cryptic) is that using a time trend that covers the entire period makes the post-period employment performance look better than it should, because the contributions of the post-period to the time trend are absorbing the effect of the minimum wage.
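
For concreteness, here's a minimal sketch (my own illustration, not DLR's actual code) of what those full-period state-specific trend terms look like as regressors - just state dummies interacted with a time index that runs over the whole sample:

```python
import pandas as pd

# Minimal sketch (my own illustration, not DLR's code): full-period
# state-specific linear trend regressors are just state dummies
# interacted with a time index t that runs over the whole sample.
df = pd.DataFrame({
    "state": ["CA", "CA", "CA", "CA", "TX", "TX", "TX", "TX"],
    "t":     [1, 2, 3, 4, 1, 2, 3, 4],
})

state_dummies = pd.get_dummies(df["state"], prefix="I", dtype=int)  # I_CA, I_TX
trend_terms = state_dummies.mul(df["t"], axis=0)                    # I_s * t
trend_terms.columns = [c + "_x_t" for c in trend_terms.columns]
print(trend_terms)

# Because t covers both pre- and post-treatment periods, the coefficient on
# each I_s * t column gets estimated off the full sample -- which is exactly
# the feature Meer and West are worried about.
```
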

This explanation of what's going on is confirmed in Appendix B, where Meer and West run simulations to demonstrate their point. They also present a figure where they draw the state specific time trends:

[Figure: Meer and West's plot of the estimated state-specific time trends]
The state-specific time trend for the treated state (the dashed line) is an under-estimate of the pre-period time trend, so when you adjust for time-invariant factors by including a state fixed effect (i.e., when you shift the dashed line to have the same intercept or starting point as the solid line), it doesn't look like the post-period has declined from the trend the state was on! This is a problem because we know (from the way the data were constructed) that it declined from trend a lot in the post-period. This is illustrated by plotting the results of the DID with and without this dashed-line trend. The results on the right are near zero because the state-specific trend absorbed much of the decline in the post-period, so that it was not attributed to the minimum wage:

[Figure: Meer and West's simulated DID estimates with and without the state-specific trend]
Note that this should not be a problem if the time trend (the dashed line) were based only on the pre-period time trend, as I had originally assumed was the case. Clearly the coding to get pre-period time trends when minimum wages are being implemented in many places at many different times would be difficult, but that's really the specification you want, IMO.
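
To convince myself of the mechanics, here's a little simulation along these lines (my own toy construction, not Meer and West's simulation code). One "state" grows steadily in the pre-period and then its growth rate falls to zero. Fitting a trend over the full sample absorbs almost the whole post-period shortfall; fitting it on the pre-period only and projecting it forward preserves the shortfall:

```python
import numpy as np

# My own toy illustration (not Meer and West's code). One "state":
# employment grows by 2 per quarter for 10 pre-treatment periods, then
# growth falls to zero for 10 post-treatment periods (no jump in levels).
t = np.arange(20)
pre = t < 10
emp = np.where(pre, 100 + 2 * t, 100 + 2 * 9)   # flat at 118 post-treatment

# (a) Trend fit over the FULL sample (what a full-period state-specific
#     trend does): the post-period flattening drags the fitted slope down.
slope_full, intercept_full = np.polyfit(t, emp, 1)
gap_full = np.mean(emp[~pre] - (intercept_full + slope_full * t[~pre]))

# (b) Trend fit on the PRE-period only, projected forward.
slope_pre, intercept_pre = np.polyfit(t[pre], emp[pre], 1)
gap_pre = np.mean(emp[~pre] - (intercept_pre + slope_pre * t[~pre]))

print(round(gap_full, 1))  # roughly 0: the shortfall is mostly absorbed by the trend
print(round(gap_pre, 1))   # about -11: the post-period shortfall survives
```

There's no comparison state here at all - the point is just that the full-period trend soaks up the post-period decline before the minimum wage variable ever gets a chance to explain it.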

Meer and West have a different view. They think you're better off just omitting the state-specific trends, which, as they demonstrate, bias the results if all the action is in growth rates rather than levels. This is a little dangerous, I think. It's all well and good for Meer and West to show us a simulation where there's no big spatial heterogeneity in time trends, so that the only thing including time trends does is bias the results. But there is no particularly good reason to think this is true in practice. If spatial heterogeneity is substantial, then you need those time trends (although it would be preferable to use pre-period time trends).

And actually, Meer and West themselves give us a good reason NOT to drop time trends. They show that the trends do not bias estimates when the growth rate itself is the dependent variable. So it seems to me the ultimate moral of the story is to look at a number of outcome variables rather than to pull the time trends out.
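
The same toy example makes the point: if you difference the series and look at growth directly, the slowdown shows up without any trend terms at all (again, my own made-up numbers):

```python
import numpy as np

# Same toy series as above: growth of 2 per quarter pre-treatment,
# zero growth post-treatment (my own illustration).
t = np.arange(20)
emp = np.where(t < 10, 100 + 2 * t, 118)

growth = np.diff(emp)               # first differences of employment
pre_growth = growth[:9].mean()      # changes entirely within the pre-period
post_growth = growth[10:].mean()    # changes entirely within the post-period

print(pre_growth, post_growth)  # 2.0 and 0.0: the slowdown is picked up directly
```
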

One last point I want to make here is that time trends aren't actually the huge difference between Meer and West on the one hand and DLR on the other. The big difference is that Meer and West do not use the contiguous county matches that DLR do. That means that all this talk of DIDs by Meer and West is a little bit of a stretch in the first place. What they have is a fixed effects model with no real matching of treatment and comparison counties at all (except, of course, in their simulation exercise).

It would be nice, then, to have a paper that does what Meer and West want (look at both employment growth and employment levels), and does it in the context of contiguous counties. There's a short Dube paper that does just that. He picks up the negative growth effects in the state-level model that Meer and West prefer, but they go away when spatial heterogeneity is accounted for by the contiguous pairs.

So the bias that Meer and West identify seems to be more interesting in theory (and in simulation) than in practice. In practice, spatial heterogeneity seems to be the much, much bigger bias, and DLR still rule the roost on that count. And really, this makes sense, I think. Why would the minimum wage have a greater impact on growth than on levels in the first place? The textbook theory everyone loves to allude to implies an impact on levels. If you think in terms of a search and matching model, you'd expect a minimum wage to introduce greater scrutiny of worker productivity on the front end and, as a result, lower separation rates on the back end, so that there are lower turnover rates overall (this is what Dube, Lester, and Reich find in another paper on the subject). But it's not immediately obvious why employment growth rates would change after the level of employment has been reduced. I suppose if adjustments are made through attrition you might have some immediate declines in growth rates, but you'd think that would be temporary. I don't want to suggest it can't have an effect on growth rates - I'm just saying that it's very hard to imagine the situation that Meer and West are coming up with here in their simulation - where there is no effect on levels but an effect on the growth rate.

So I chalk up the Meer and West Critique as an important reminder about time trends, and maybe a reason in the future to think about grounding time trends in the pre-period only. But practically speaking it doesn't seem like it matters, because when Dube does what Meer and West suggest (look at growth rates), there's no action.

The Murphy Critique is somewhat different. Hopefully I'll get to that in the next few days, but if not I think he should have something coming out on it at some point.

I'm very interested in your thoughts on this - let me know if I've missed something.

2 comments:

  1. If there were such an effect, I think it more likely that faster employment growth beforehand would induce a rise in the minimum wage, tending to equalize growth rates afterward, as political competition between neighboring areas.

  2. Thanks for delving into this. Wanted to note a few things. Meer and West do not really offer a critique of the methodology of Dube Lester Reich (2010, 2012/3). That is because DLR's main specification does NOT rely on county- or state-specific trends - the main methodology is to compare only across counties within a pair (operationalized by using county-pair-specific time-period dummies that wash out variation between pairs). So the primary border-discontinuity specification is not subject to any criticism from Meer and West. Indeed, Meer and West merely cite the Neumark, Salas, and Wascher study (I'd say more for support than for illumination) as a critique of the border discontinuity methodology.

    Now, it is true that besides our preferred discontinuity specification, we *also* show results using "intermediate" specifications with state linear trends - as a way of showing that even less granular types of controls for spatial heterogeneity kill the disemployment estimate from the "canonical" place and time fixed effects specification. (And in Appendix Table A1 we additionally subject our discontinuity specification to state linear trends as a robustness check and show the results are the same.) So in principle these specifications (but NOT our primary discontinuity specification) are subject to the issues with linear trends that Meer and West raise.

    Even there, however, there are two important pieces of evidence that suggest Meer and West's critique is off base. First, we show dynamic specifications with 16 quarters of lags. This means that in models with state trends, those state trends are estimated WITHOUT using the first 16 quarters of post-intervention data - so they are largely PRE-existing trends; see the original Wolfers 2006 paper on this. (As Meer and West point out, most minimum wage differences last around 4 years).

    Yet there is no indication that the "long run elasticity" [associated with the t+16 value in Figure 4] in the linear trend specification - which is largely estimated using PRE-existing state trends - is any more negative, as would be the case if Meer and West's explanation were correct. Moreover, during these 16 quarters, there is no indication of falling employment as suggested by Meer and West's simulation.

    In general, if you estimate a set of flexible distributed lags (as we always do in DLR as well as Allegretto Dube and Reich), you should catch falling employment levels if there is a growth reduction. But we don't see that anywhere, which raises serious doubts about the explanation Meer and West offer for the discrepancy between levels and growth as outcomes.

    Finally, see this post by John Schmitt about the Meer and West paper's own findings regarding industries: http://www.cepr.net/index.php/blogs/cepr-blog/more-on-meer-and-wests-minimum-wage-study

    Since this comment is already too long, I'll stop here.

