Tuesday, April 7, 2015

Does anybody have experience with a distributed lag model for panel data?

Does anybody have experience with a distributed lag model for a panel dataset? I'm getting an odd result: I'm trying a bunch of different lag lengths, and no matter which one I use, the two longest lags have much bigger coefficients than the rest. So when I run with six lags, lags five and six have big coefficients, but when I run with sixteen lags, fifteen and sixteen do. I feel like this has to indicate something about the data structure and the model. The pattern can't be real if it shows up no matter what the lag length is. I'm just not sure what it indicates. If it matters, I'm looking at the size of apprenticeship programs in an unbalanced panel with lags of the unemployment rate. No lags of the dependent variable.


  1. Why are you using a lagged exogenous variable but not a lagged dependent variable (other than that the estimation issues are simpler w/o a lagged dep. var.)? Think about this for a moment. Consider the following situation:

    1) Your data have been at a long term equilibrium.
    2) The unemployment rate then increases for one period, thereafter returning to its long term equilibrium value.

    What happens?

    - Assuming that there is an immediate (i.e., within period) effect on the dependent variable (Y), it responds right away.
    - In the second period, when U has returned to normal, is the previous period's increase still having a direct effect on the dependent variable? Or is its effect actually due to Y's no longer being at its long term equilibrium value? In the first case you want a lagged term of U. In the second, what you really need is a lagged term of Y.

    Look into panel Granger causality tests to select an appropriate specification.
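To make the thought experiment above concrete, here is a minimal Python sketch of the two cases. Everything in it is an illustrative assumption rather than anything estimated from the apprenticeship data: the function names, the coefficient values (0.5 and 0.25), and the persistence parameter `rho` are made up, and all series are measured as deviations from the long-term equilibrium.

```python
def distributed_lag_path(u, betas):
    """Y_t = sum_k betas[k] * U_{t-k}: past U matters directly, past Y does not."""
    return [
        sum(b * u[t - k] for k, b in enumerate(betas) if t - k >= 0)
        for t in range(len(u))
    ]


def lagged_dependent_path(u, beta, rho):
    """Y_t = rho * Y_{t-1} + beta * U_t: only current U matters, Y adjusts slowly."""
    path = []
    previous_y = 0.0  # start at equilibrium (zero deviation)
    for u_t in u:
        previous_y = rho * previous_y + beta * u_t
        path.append(previous_y)
    return path


# U deviates from equilibrium for exactly one period (t = 1), then returns.
u_shock = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical coefficients chosen only for illustration.
dl_response = distributed_lag_path(u_shock, betas=[0.5, 0.25])
ldv_response = lagged_dependent_path(u_shock, beta=0.5, rho=0.5)

print("distributed lag of U:", dl_response)
print("lagged dependent var:", ldv_response)
```

With these made-up numbers the two responses agree in the shock period and the period after it. They differ afterwards: the distributed-lag response drops to zero once the included lags run out, while the lagged-dependent-variable response keeps decaying gradually.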

  2. Followup to yesterday's comment:

    1) Notice that an AR(1) is equivalent to an infinite order MA process; however, the coefficients on the MA terms should be declining exponentially if the AR process is stable (i.e., no unit root).
    2) Are your last 2 lag coefficients about the same magnitude and opposite signs? I suspect that the lags of your unemployment rate are highly collinear (i.e., the series has strong positive serial correlation, so adjacent lags move together). Although the textbooks say that multicollinearity typically leads to statistically insignificant point estimates, in my experience it often leads to statistically significant estimates of about the same magnitude and opposite signs.
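Both points can be illustrated with a small Python sketch. It assumes a hypothetical model Y_t = rho * Y_{t-1} + beta * U_t, which implies a weight of beta * rho^k on U lagged k periods, and it simulates a persistent AR(1) series as a stand-in for the unemployment rate. The parameter values and function names are invented for illustration and are not taken from the data discussed in the post.

```python
import random


def implied_lag_weights(beta, rho, n_lags):
    """Weights on U_t, U_{t-1}, ..., implied by Y_t = rho * Y_{t-1} + beta * U_t."""
    return [beta * rho ** k for k in range(n_lags + 1)]


def simulate_ar1(rho, n_obs, seed=0):
    """Simulate x_t = rho * x_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n_obs - 1):
        series.append(rho * series[-1] + rng.gauss(0.0, 1.0))
    return series


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


# 1) With a stable process (|rho| < 1) the implied lag weights shrink geometrically.
weights = implied_lag_weights(beta=1.0, rho=0.5, n_lags=4)
print("implied lag weights:", weights)

# 2) A persistent series is strongly correlated with its own first lag,
#    so adjacent lag regressors carry nearly the same information.
u_series = simulate_ar1(rho=0.95, n_obs=5000)
lag1_corr = correlation(u_series[1:], u_series[:-1])
print("correlation between U_t and U_{t-1}:", round(lag1_corr, 3))
```

If the estimated coefficients on the longest lags are large instead of shrinking like the weights in the first part, that conflicts with a stable dynamic process. The high correlation in the second part shows why the individual lag coefficients are hard to pin down separately.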

  3. Hi Paul - thanks so much for these thoughts. I really don't have a lot of time series experience, which has been a barrier to thinking about these things. For the specification I followed the lead of a handful of studies on the same topic, which is why I started out with lagged independent but not lagged dependent variables. I've been reading up more since posting this, and your first post is largely what I've come across - I think it hits the nail on the head as far as the specification problem goes. I'm still using the lagged independent variable specifications to compare to the earlier studies (which will also provide a nice little discussion of the problems with those), but now I'm also using an Arellano-Bond GMM approach with lagged dependent variables. Does that seem reasonable? One of the papers on my topic takes this approach, and after thinking about this more I agree it's the right one.

    I think the multicollinearity issue is right too. That's actually where I started down this road. I came across the Kocyx transformation to deal with the multicollinearity problem, but of course you no sooner find that than you come across the lagged dependent variable bias issues, which led me to Arellano-Bond.

    I'll check out the Granger tests for panel data, but does the Arellano-Bond approach seem reasonable?

    Thanks again so much for these thoughts.

    1. I haven't done a lot of dynamic panel models, just played around a bit. As with nearly everything else, Stata has a nice suite of estimators for them, including Arellano-Bond. xtdpd seems to be a bit more general since it allows for serially correlated errors, but start with xtabond since it is simpler. That is pretty much the limit of my experience. I guess the thing to do is to try the different ones out (and hope the results are similar). I've never heard of the Kocyx transformation before, and googling is not much help: do you mean Koyck?

