Saturday, January 18, 2014

Thinking about the specifications of the Dube, Lester, and Reich minimum wage study: Part 2 - controlling for county level trends

Q: When is a fixed effects model not a fixed effects model?
A: When it's a difference-in-differences model.

The most important question in any impact analysis is "how do they identify their model?" Sometimes it's buried in the math, but there are a few canonical forms of identification (often very closely related) that, in my opinion at least, help in thinking about model specification and exactly what kind of assumptions and variation the authors are relying on.

I'm sure a lot of you know that a fixed effects model is just a model you run on panel data with dummy variables for each cross-sectional unit to soak up all the time-invariant unobserved characteristics, and dummy variables for each time period to soak up all the common time trends. The big thing you don't get automatically in a fixed effects model is control for time-varying factors that are specific to each cross-sectional unit.

Turning a fixed effects model into what is essentially a difference-in-differences (DID) model is pretty straightforward. In fact, we discussed it in the last post on Dube, Lester, and Reich (DLR): in their case, you just include county-pair fixed effects. These fixed effects capture any variation across pairs, so the only variation left to estimate the minimum wage coefficient on is variation within a pair, between the counties in that pair over time. DLR have intermediate versions of this, restricting the estimation of the effect to pairs within regions, states, and metropolitan areas. But what they're narrowing in on is essentially the DID. The logic of the DID is straightforward, and I want to walk through it before getting to more of Bob Murphy's thoughts.
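To make the mechanics concrete, here is a minimal sketch in Python on a made-up county-by-quarter panel. All of the variable names (log_emp, log_mw, pair) are hypothetical, and this is a toy illustration rather than DLR's actual data or exact specification. The first regression is the generic two-way fixed effects model; the second adds a dummy for every pair-period cell, which is one way of forcing the minimum wage coefficient to be identified only from within-pair comparisons, in the spirit of what DLR do.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Made-up panel: 30 cross-border county pairs, two counties each, 16 quarters.
    # One county in each pair raises its minimum wage by 10 log points at quarter 8.
    # The "true" employment elasticity is -0.15.
    rng = np.random.default_rng(0)
    rows = []
    for pair in range(30):
        for county in (2 * pair, 2 * pair + 1):
            treated = (county % 2 == 0)
            for quarter in range(16):
                log_mw = np.log(7.25) + (0.10 if treated and quarter >= 8 else 0.0)
                log_emp = (0.3 * county            # time-invariant county differences
                           + 0.01 * quarter        # common time trend
                           - 0.15 * log_mw         # true minimum wage effect
                           + rng.normal(0, 0.02))
                rows.append((county, pair, quarter, log_mw, log_emp))
    df = pd.DataFrame(rows, columns=["county", "pair", "quarter", "log_mw", "log_emp"])

    # Generic two-way fixed effects: county dummies plus period dummies.
    fe = smf.ols("log_emp ~ log_mw + C(county) + C(quarter)", data=df).fit()

    # DID-style version: add a dummy for every pair-period cell, so the minimum
    # wage coefficient comes only from comparisons within a county pair.
    df["pair_q"] = df["pair"].astype(str) + "-" + df["quarter"].astype(str)
    did = smf.ols("log_emp ~ log_mw + C(county) + C(pair_q)", data=df).fit()

    print(fe.params["log_mw"], did.params["log_mw"])   # both close to -0.15 here

In this clean toy the two specifications give the same answer, because nothing confounds the estimate; the pair-level dummies only start to matter once counties differ in ways the common period dummies cannot absorb, which is exactly the situation discussed below.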

You have panel data, so you've got data before and after a treatment. You also have two cases: a treatment case (on the left, below) and the comparison case (on the right). The treatment case may be changing over time anyway, without the treatment, so to isolate the treatment effect any change in your comparison case (the paired county in DLR) is subtracted out of the treatment effect. Why? The changes in the comparison cannot (or should not... there's a different literature on that issue) be affected by the treatment because it didn't get the treatment. So that small gap on the right is the counterfactual of what would have changed in the absence of treatment, and therefore cannot be attributed to the treatment effect on the left. Notice also that it doesn't matter if the comparison case is a little different from the treatment case (see how I've drawn it a little lower?). What matters is the differential response to treatment, because the DID estimator is:

[(Post-Treatment) - (Pre-Treatment)] - [(Post-Comparison) - (Pre-Comparison)]

[UPDATE: I had the terms switched above before - this version is correct. You take the raw change in the treatment case, but then you want to subtract out the change in the comparison case from that]
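With some made-up numbers, purely to show the arithmetic (these are not numbers from DLR):

    # Hypothetical 2x2 example of the DID arithmetic (made-up numbers).
    pre_treat, post_treat = 100, 96   # treated county, before and after the hike
    pre_comp, post_comp = 90, 94      # paired comparison county, same two periods

    did = (post_treat - pre_treat) - (post_comp - pre_comp)
    print(did)   # (96 - 100) - (94 - 90) = -8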

So if there's something about the comparison group that's time-invariant that makes it a little different from the treatment group, that's OK. That's why we have county dummy variables. What's more problematic is differences in the counties over time (which I'll discuss below).


The profile over time in the diagram above is flat, but we could easily imagine a common time trend (imagine the slopes in the figure below are the same!). This doesn't matter for the simple DID case at all, for two reasons. First, the portion of these time trends common to all counties is already absorbed by the time period dummy variables I mentioned above. Second, any remaining time trend that is common between the paired counties will be subtracted out of the treatment effect by the exact same logic as the case without the time trend: we are removing the change in the comparison group from the change in the treatment group.
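For instance, add a common post-period drift of +5 to both counties in the little numerical example above and nothing changes:

    # Same made-up 2x2 numbers as before, plus a common drift of +5 in the post period.
    pre_treat, post_treat = 100, 96 + 5
    pre_comp, post_comp = 90, 94 + 5
    print((post_treat - pre_treat) - (post_comp - pre_comp))   # still -8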


As I alluded to above, the big trouble comes in when you have time trends in the treatment and comparison group that are different. That would look something like this:


If you implement the DID estimator here it will make the treatment effect look a lot smaller, because there was a big change in the comparison group over time relative to the treatment group (in other words, the match might have been good, but it wasn't perfect). Looking at what's actually going on, though, you can tell that the true treatment effect should be exactly the same - we're just conflating a rate of change that has nothing to do with the treatment with the treatment effect itself.

What you want to do in this case is control for the trend rate of change by county, so that any increase in the comparison group in the post period that simply follows that rate of change is not used to penalize the treatment effect. You could just as easily imagine a scenario where you'd want to do this because the differential trend would otherwise lead you to over-estimate the treatment effect. I draw it this way because this is what DLR came across. Once you control for that time trend, you're back to the situation of the first picture (common time trends will be swept up in the county-specific time trends, which is just fine - we don't care about common trends), and you've got an unbiased DID again.
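Here is a small simulated version of that picture - a sketch with made-up numbers, not DLR's data or their exact estimating equation. The treated counties have a genuine effect of -2, but they also happen to sit on a flatter secular trend than their comparisons. The plain two-way fixed effects DID gets the effect badly wrong; adding county-specific linear trends recovers it.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Made-up panel: 50 treated and 50 comparison counties over 20 quarters. The
    # true effect of the "minimum wage" (treated x post) is -2.0, but treated
    # counties also grow more slowly for reasons unrelated to the policy.
    rng = np.random.default_rng(1)
    rows = []
    for county in range(100):
        treated = int(county < 50)
        level = rng.normal(50, 5)              # time-invariant county level
        trend = 0.2 if treated else 0.5        # county-specific secular trend
        for quarter in range(20):
            post = int(quarter >= 10)
            y = (level + trend * quarter
                 - 2.0 * treated * post        # true treatment effect
                 + rng.normal(0, 0.5))
            rows.append((county, quarter, treated, post, y))
    df = pd.DataFrame(rows, columns=["county", "quarter", "treated", "post", "y"])
    df["mw"] = df["treated"] * df["post"]

    # County and period dummies only: the comparison group's faster secular growth
    # gets charged against the treatment, so the estimate is far too negative.
    naive = smf.ols("y ~ mw + C(county) + C(quarter)", data=df).fit()

    # Add county-specific linear trends (county dummy x linear time).
    trends = smf.ols("y ~ mw + C(county) + C(quarter) + C(county):quarter",
                     data=df).fit()

    print(naive.params["mw"], trends.params["mw"])   # roughly -5 vs. roughly -2

The common component of the trends is harmless either way; it is only the differential piece that the county-specific trend terms are cleaning out.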

As best I can tell, Bob Murphy has two related concerns. First, he's concerned that we're including other controls when we were supposed to be dealing with all that by matching counties. The answer to that, I hope, is clear from both this post and the last one: even good comparison groups can be improved upon. You never have a perfect comparison group until you have random assignment.

But there's another issue he has with this. About a year ago, Bob wrote:
"What Dube, Lester, and Reich are really saying here, is that maybe for some reason minimum wage hikes happen to be concentrated in regions that have lower than average employment growth. Hence, just because we find that teenage employment grows more slowly in regions with higher minimum wages, doesn’t mean we can blame it on the relatively higher minimum wage. But hang on a second. Minimum wage hikes aren’t randomly distributed around the country, such that we might happen to get an outcome where they tend to be concentrated in slow-growth regions. On the contrary, minimum wage hikes are implemented by “progressive” legislatures, who also (given my economic worldview) implement other laws that retard adult employment growth.

For example, suppose that if a state legislature jacks up the minimum wage, then it is also likely to pass “pro-labor” stuff like laws giving unions more organizing power, laws allowing unfairly terminated employees to receive years of back pay, and laws granting extra perks for maternity leave. Now, these last three items I listed: Would they reduce the employers’ incentives to hire teenagers or adults, more? On the margin, they would make it costlier to hire adults, because if penalties are expressed in years of back pay, or have to do with paid leave, or strengthen unions who traditionally are going to organize adults…You get the picture. Adults make more than teenagers, and so these rules will penalize adult employment more than teenage employment.

Thus, if my model here is correct, it would produce the pattern we actually see: Looking narrowly at minimum wage laws, they seem to retard teenage employment. But then when you ask if states with high minimum wage laws have a bigger slowdown in teen employment versus adult employment, the signal becomes much weaker. It looks like, by dumb luck, for some reason all the minimum wage hikes happen in states that also have slower-than-average employment growth among adults."
So Bob's issue is bigger than the easily dispatched with concern that we matched on counties and then decided that wasn't good enough (I didn't quote that part). The concern is that somehow we are absorbing the effect of the minimum wage.

This may happen under very special circumstances, but generally it's not a problem. Bob is - I think - forgetting the panel element of the data. We are subtracting out the pre-period from the post-period for both the treatment and the comparison, and then comparing those two differences. We know the post-period-minus-pre-period change for the comparison group should contain no effect at all of the minimum wage, so that is the appropriate counterfactual. When we control for a county-specific trend we are saying "those secular trends that were going on before anyone adopted a minimum wage would have gone on if the minimum wage hadn't been adopted, so we want to clean that out of the treatment effect." If those trends are common between counties, my second diagram shows why that's not a concern. If they're different between counties (maybe because one has a progressive legislature), it needs to be accounted for. You are not weakening the signal; you are making the signal more accurate, because the only impact attributable to the minimum wage is what changes after its implementation. A time trend that continues on the same after as it did before does not change after the implementation of the minimum wage.

What special circumstances might justify Bob's fear? Time trends that are not the same before and after the minimum wage and that are not related to the minimum wage. That might look something like the following:


Let's say the true impact of the minimum wage does not increase the rate of growth of Y in the post period. In other words, let's say the slopes in both these cases would have happened without the minimum wage. If we control for county time trends using pre-period data in this case, we would find that the minimum wage had the effect of:

#1. A one time, persistent, positive shock to Y, and
#2. An increase in the rate of growth of Y

Why? Because we're differencing out the county trend in the comparison case, but we're only differencing out part of the county trend in the treatment case. The rest of the county trend in the treatment case is going to be attributed to the treatment. The #2 effect is false and introduces bias into the estimate of the treatment.
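Here is a toy version of that special case - again with made-up numbers, and with a deliberately simple detrending procedure rather than anything DLR actually do. The true effect of the minimum wage is zero, but the treated counties' underlying growth rate steepens at the implementation date for unrelated reasons. If you fit each county's trend on pre-period data and difference it out, the leftover post-period deviation gets read as a treatment effect:

    import numpy as np
    import pandas as pd

    # Made-up panel where the TRUE minimum wage effect is zero, but treated
    # counties' growth rate jumps at the implementation date (quarter 10) for
    # reasons that have nothing to do with the policy.
    rng = np.random.default_rng(2)
    rows = []
    for county in range(100):
        treated = int(county < 50)
        level = rng.normal(50.0, 5.0)
        for quarter in range(20):
            extra = 0.3 * treated * max(0, quarter - 10)   # unrelated slope change
            y = level + 0.2 * quarter + extra + rng.normal(0, 0.5)
            rows.append((county, quarter, treated, y))
    df = pd.DataFrame(rows, columns=["county", "quarter", "treated", "y"])

    # Fit each county's linear trend on the pre period only, then subtract it
    # from the whole series.
    detrended = []
    for _, g in df.groupby("county"):
        pre = g[g["quarter"] < 10]
        slope, intercept = np.polyfit(pre["quarter"], pre["y"], 1)
        g = g.copy()
        g["resid"] = g["y"] - (intercept + slope * g["quarter"])
        detrended.append(g)
    df = pd.concat(detrended)

    # DID on the detrended residuals.
    post = df["quarter"] >= 10
    treat = df["treated"] == 1
    did = ((df[post & treat]["resid"].mean() - df[~post & treat]["resid"].mean())
           - (df[post & ~treat]["resid"].mean() - df[~post & ~treat]["resid"].mean()))
    print(round(did, 2))   # about +1.35, even though the true effect is zero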

So, in order for Bob's fear to be a problem, you can't just have county- or state-specific differences in trend between treatment and control cases. You'd need trends that change at the implementation of the minimum wage, and in a way that biases the estimate rather than just adding noise to it.

It's not a completely crazy fear. You could have, for instance, a very liberal state legislature that implements a bunch of stuff at once, including a minimum wage law. That's possible, but the effect isn't clear to me. If Bob thinks the liberal legislature will tend to make employment for teens worse, and all these reforms were clustered, that would understate the effect of the minimum wage. Of course, the opposite could also be true if a bevy of liberal reforms were to help. If those reforms hurt adults more than they hurt youth, that only seems like it would impact the minimum wage estimate if the effect of the minimum wage were estimated relative to the impact on all adults, and I don't think it is. I'm not really sure where those concerns about adult employment relative to youth employment come from.

So I will concede that because lots of different changes may happen together in a state legislature, it would be nice to account for that (of course, if the increase is coming from a federal increase, whatever is going on in the state legislature shouldn't matter). Such controls could certainly improve the estimate further. But I don't see any clear evidence that controlling for county time trends doesn't improve the estimate. Remember, the trends as calculated vary across counties; they do not capture a difference between the pre- and post-period trends. And as my last figure illustrates, you'd need a difference between the pre- and post-period trends for this to mess up the DID estimator.

There may be a Part 3 to this series. DLR also include placebo tests as a sensitivity analysis on these time trends, but I don't have a good sense right now of how all that works. If I have time to figure that out and put it together, I will.

14 comments:

  1. "So Bob's issue is bigger than the easily dispatched with concern that we matched on counties and then decided that wasn't good enough (I didn't quote that part). The concern is that somehow we are absorbing the effect of the minimum wage."

    This is my concern, reflected in my comment in your previous post. I don't think you've managed to convince me here. The problem I have with your above explanation is that it reflects over-confidence in the model. You write, "But I don't see any clear evidence that controlling for county time trends doesn't improve the estimate."

    What would change your mind? Again, you write: "So, in order for Bob's fear to be a problem you can't just have county or state specific differences in trend between treatment and control cases. You'd need to have trends that change at the implementation of the minimum wage, and in a way that biases (rather than just adds noise) to the estimate."

    At no point do you question the theoretical validity of the model. A model that reaches a conclusion that stands in stark contrast to expectations driven by economic theory is a model that provides at least *SOME* "evidence that controlling for county time trends doesn't improve the estimate." That's because data is not supposed to diverge from economic "law" without good reason.

    That "good reason" might be a revised theory, or an exogenous shock, or a data anomaly, or something else. It has to be explained. That's the whole rub. It has to be explained. And, moreover, the explanation has to be better, clearer, and more conclusive than the theory that served us so well for so long.

    1. Nothing would change my mind on county time trends. Trends that occur both pre and post cannot be attributed to the minimum wage as far as I know, and they HAVE to be eliminated for an unbiased estimator. So I guess another way of saying that is that to convince me you'd have to show that something very funny is going on with causality here, or that I don't understand the properties of the DID estimator. I don't think you'll do the latter, and anything you could come up with on the former won't make anywhere near as much sense as the spatial heterogeneity in time trends.

      What model am I supposed to be questioning? The DLR empirical model? I've commented at length now on why I think their approach is right and I'm probably not done yet. Give me a convincing reason to question it. Every query I make about their modeling choices comes out with them making the right modeling choices so far. I've asked several times "why would you do this?" and "what is the correct way to deal with this or that in the data?" and every single time the DLR way seems the best from what I know about this estimator. So I think you're simply wrong when you say I don't question the model.

      On the second half of that paragraph I've talked at length about what I think the empirics say for theory too. Search "minimum wage" in my search function if you're curious.

      I don't understand the last paragraph at all. It has to be explained? What do you think we spend so much time thinking about identifying the model for?

      If you want me to throw anything out you have to give me a good reason to throw it out. I've described in great detail here why I think they are right.

    2. Bob has an actual issue with the specification and I respond with a discussion of the appropriateness of the specification. I don't know what your concerns are except that you don't think it fits the theory you think it should.

    3. I'm arguing about what you, personally, "should" do. My point is a relatively simple one.

      The simpler model - the one that finds that increases to the minimum wage increase unemployment - is consistent with a simple, intuitive, and long-standing theory.

      To that model, we add variables that control for variation by county. Without knowing anything else about anything, I can already predict with certainty that the significance of the minimum wage variable will decrease and the value of its parameter will change. Without knowing anything else about anything, I can already predict with certainty that the more counties you add to your model, the more pronounced this effect will be.

      Considering that this is little more than a mathematical fact about statistical models, we should all be very comfortable with the idea that the standard of proof for this new model - the one that includes counties - should be much more stringent. We should anticipate the effects I've just described, and run through a full range of tests to isolate them and minimize them.

      There are many such statistical tests and variable inclusion algorithms and criteria for doing so. Now, I'm just "some dude." I don't do this for a living, and I don't read a lot of economic literature. But I do know a thing or two about statistics and a teeny, tiny bit of economic theory. When I see a model that presents statistical results I fully expect to observe based purely on the design of the model - and when that model is subsequently used as evidence against a well-established, long-standing, fully intuitive theory, the homunculus in my head starts asking me to rule out all mathematical problems prior to considering the model's conclusions.

      This is not a demand on you, personally. It is not a description of something I want you do to, for yourself or for my benefit. This is a comment I am making on your blog post. It may not be the comment you expected. It may not be a comment you know how to respond to, or whether to respond to it at all. It may not be the kind of comment you enjoy, and you may or may not have anything to say by way of reply. It could be that it is just what entered into my head upon reading your blog post. That is what most of my comments are, anyway.

    4. re: "To that model, we add variables that control for variation by county. Without knowing anything else about anything, I can already predict with certainty that the significance of the minimum wage variable will decrease and the value of its parameter will change. "

      Well I don't think you know the first one at all, but the latter seems pretty likely.

      re: "Without knowing anything else about anything, I can already predict with certainty that the more counties you add to your model, the more pronounced this effect will be."

      What do you mean by "pronounced"? Are we talking about standard errors or changes of the point estimates? The standard errors should get smaller (i.e. - more significant) and there's no reason at all the point estimates themselves should change (they will, just as a consequence of an alternative sample, but the expected value of the effect of increasing the sample size should be zero).

      You should be more comfortable with DLR's results, not less. You seem less comfortable, and if these are the reasons, I don't think that's justified. Precision should improve, not be reduced. The added variables and controls should reduce bias, not increase it.

      You seem to want to say that because they depart from the result you were comfortable with from theory before, the departure is something that needs to be justified. I strongly disagree. That's why we do empirical testing - to confirm theory. So what I've done in these posts is walk through all the controls and changes - contiguous county sample, geographic fixed effects, time fixed effects, and time trends - and shown how each of them reduces bias in the estimate. Do you disagree with one of these arguments? Because if none of these arguments sound wrong to you, you should feel more comfortable with DLR's results than with the naive regression. If one of the arguments seems wrong to you, then that's something to talk about.

    5. ehhh... not exactly, but I don't want to explain it anymore. You can't just add variables and claim a reduction in bias. That's not statistically valid. And yes, models that depart from expected theory should not inspire confidence. They should be checked and checked and checked again, over and over, until there is sufficient evidence to overrule theory.

      But anyway.

    6. Nobody said just adding variables reduces bias.

      It matters what is and isn't controlled for depending on the identification strategy of the model. Adding a county-pair dummy does a lot to remove the bias. Adding some random thing might not (although it's difficult to understand what would INCREASE the bias, but you could imagine adding variables that significantly change the interpretation such that it's not something you care about).

      I'm not sure you understand what is going on in the models - I'm happy to explain more if you want.

  2. Correction, that last comment of mine should begin with "I'm **NOT** arguing about what you personally..." My apologies.

  3. Replies
    1. Hadn't seen this - I'll take a look at it and try to get to it in the next couple days.

      I have not looked at the Meer and West paper yet, although I've come across it in reading up on DLR. So this actually may take more of an investment before I can respond.

      The Wolfers paper Ozimek cites is very good. He is responding to Liz Peters, who I know from the Urban Institute. She became center director of my center maybe six months before I left. So I didn't know her well, but a little. Both (Wolfers and Peters) are smart cookies.

  4. I'm a bit new to the experimental econometrics literature so this is just a question for clarification. I'm confused by your wording in the beginning where you seem to say that a FE model is different from a DID model. My understanding is that a FE model with two periods is the same as the corresponding DID model. And that in general a FE model is like a generalization of DIDs. Is this wrong? Or am I just confusing words?


    1. The way I think about it, at least, is that all DIDs are FEs but not all FEs are DIDs, insofar as in a general FE model without something like DLR's contiguous county pairs you are comparing a treatment case to all counties that didn't get treatment. The mechanics, I suppose, are still quite similar, but the whole identification strategy behind the DID is that you have a matched comparison group. The geographic fixed effects they added showed that that difference was a big one.

      A lot of the concerns carry over. I have an article out in JEBO now that relies on more naive variation in policies across states and we don't rely on this matching - but we still had to make corrections for things like serial correlation that come up in the panel DID models (pointed out by Bertrand, Duflo, and Mullainathan). So a lot is indeed the same, but you can see how the major factor identifying a DID model is missing in the FE model.

  5. Thanks.

    This is somewhat off topic but do you know of any studies that do a good job of controlling for non-wage benefits? As a former restaurant employee, one of the biggest perks was 50% off meals. At one restaurant, I got free meals! Anyways, the point is that this type of benefit pervades restaurants (and many low wage businesses) so I'd be curious to know whether those benefits - or those kinds of benefits in general - take a hit with higher minimum wages.

    From an identification standpoint it's often hard to measure non-wage benefits. And it appears to me that not controlling for these benefits would bias the results as the lower benefits would be a unit specific simultaneous treatment effect.

  6. This is GREAT stuff Daniel, thanks. If you have time, I'd love to hear you weigh in on the Ozimek article that "Lord" posted. He is claiming that if min. wage led to a permanently lower rate of growth, then the corrections introduced by Dube would mask it. If you could spell that out, it would be sweet.

