I've got a good handle on my identification strategy. There are lots of good instruments here (good instruments? does that sound like an oxymoron to anyone else?). Now I'm focusing more on the specification of the model.
First, I'm working with 10 years of the ACS, which comes out to something like 30 million observations. I expect only about 3,000 of them are petroleum engineers (I'm getting about 300 each year). So what I'd like to do is just restrict the sample to engineers.
1. - My first question is - is this OK? Ryoo and Rosen (2004) look at demand elasticities for all engineers, and they restrict their sample to all college grads - and they seem fine calling that "the demand for engineers". It seems strange to me to exclude all other workers and still call it the "labor supply elasticity" if I'm restricting my sample like that. But perhaps it's fine? Perhaps the idea is that it's a local elasticity estimate, and while non-engineers may be drawn into the petroleum engineering labor market, they are so far out on the labor supply function that it's fine to ignore them? Anyway - if I've got good instruments to identify demand shifts specific to petroleum engineers, is it OK to just estimate on a sample of engineers?
2. - My second question is - I've been reading a lot of papers on the "extended Roy model" - particularly Heckman (most of which is over my head), and some operationalization of it by Blau. The whole idea is that you need to add a sector selection correction to get an unbiased elasticity, which basically means satisfying another exclusion restriction that I have no clue how I'm going to satisfy. So the question is - if I'm restricting my sample to engineers anyway, is a selection correction strictly necessary? Ryoo and Rosen don't use one. Blau does. But Blau is looking at child care workers - that seems like it could have a lot more selection problems associated with it. Will it look bad if I don't include this step?
3. - Finally, while I don't know generalized method of moments (GMM) too well myself, I really think I need to do that in addition to 2SLS when I'm presenting my results. It looks like somebody has written a Stata command, ivreg2, that includes GMM as an option. Is this a good way of doing GMM in Stata? Has anyone done GMM, and do you have any other suggestions?
1. I think you can do this as long as you're clear that you're estimating the *intensive margin* (which in my understanding is the elasticity in question conditional on past participation in the market: http://en.wikipedia.org/wiki/Margin_(economics) )
2. No clue.
3. http://www.economics.harvard.edu/faculty/stock/files/jbes02m084r2.pdf // I have used LIML to double-check 2SLS estimates, but as long as things were roughly in line, I never felt the need to report them, nor have I seen any alternative estimates reported frequently in papers that use 2SLS.
Thanks
1. My hope is to estimate both the intensive and extensive elasticities (I have hours worked and employment status). I personally find the extensive elasticities more enlightening in occupations like this where (I'm guessing) you're not going to find 20 hour a week engineers. My concern is that if I restrict it to engineers, I'd be doing this by looking at how changes in the relative wages of petro engineers move the likelihood of being a petro engineer out of a sample of all engineers. If I changed that comparison to everyone in the survey, would that change things?
2. That makes two of us :)
3. Thanks for the link - this is helpful. I'll have to see what the standard is for reporting them. Certainly I'll want to have done it all. I'll check out LIML too.
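For intuition on the 2SLS / LIML / GMM comparison discussed above, here is a minimal simulated sketch in plain Python (not Stata, and not the actual ACS data). Everything in it is a made-up assumption for illustration: the variable names, the true coefficient of 0.5, and the data-generating process. It uses one endogenous regressor and one instrument. In that just-identified case 2SLS, LIML, and GMM all collapse to the same simple IV estimate, so the alternative estimators can only differ once there are more instruments than endogenous regressors.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists (divides by n)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def simulate(n=20000, beta=0.5, seed=0):
    """Hypothetical data: u is an unobserved confounder, z is a valid instrument."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]   # instrument (e.g. a demand shifter)
    u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved, drives both x and y
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]         # endogenous regressor
    y = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]  # outcome
    return z, x, y


def ols_slope(x, y):
    """OLS slope of y on x; biased here because x is correlated with u."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV slope; equals 2SLS, LIML, and GMM with one instrument."""
    return cov(z, y) / cov(z, x)


z, x, y = simulate()
b_ols = ols_slope(x, y)
b_iv = iv_slope(z, x, y)
print(f"true beta = 0.5, OLS = {b_ols:.3f}, IV = {b_iv:.3f}")
```

In this setup OLS is pushed well above the true value by the confounder, while the IV estimate lands close to it. With several instruments the estimators stop being numerically identical, which is when a LIML or GMM cross-check on 2SLS becomes informative.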
DK, super quick as I'm running off to a dinner, but...
1) Yes, I see no problem with that (although labour economics is not my field of expertise...)
3) I've used ivreg2 many times for my own work. Indeed, I've developed a strong preference for it above normal 2SLS procedures. One bonus is that you can incorporate a Newey-West HAC estimator if you're facing problems of autocorrelation and heteroskedasticity. In fact, this particular routine uses a Bartlett kernel and you might run that as follows:
ivreg2 y (x1 = z1) x2, robust bw(_LAGS)
where _LAGS is the number of lags, set by rule of thumb to (the number of time observations in your dataset) ^ 0.25.
(The Greene textbook gives this as the best way of determining the optimal number of lags IIRC)
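The lag rule of thumb above is easy to compute by hand; a tiny Python sketch of it follows. The function name is made up, and truncating to a whole number is my assumption, since the rule as stated does not say how to round.

```python
import math


def rule_of_thumb_lags(n_periods: int) -> int:
    """Lags = T^(1/4), truncated to an integer (the truncation is an assumption)."""
    if n_periods < 1:
        raise ValueError("need at least one time observation")
    return math.floor(n_periods ** 0.25)


for t in (10, 40, 100):
    print(t, rule_of_thumb_lags(t))
# 10 -> 1   (10^0.25  is about 1.78)
# 40 -> 2   (40^0.25  is about 2.51)
# 100 -> 3  (100^0.25 is about 3.16)
```

So with only ten time periods the rule suggests a single lag. It is worth checking the ivreg2 help file for how bw() relates to the number of lags before plugging the value in.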
The ivreg2 command also automatically reports instrument diagnostics, which is cool (e.g. the Anderson canonical correlations test for underidentification).
Sorry if this is confusing, drop me an email if you need some more info. Gotta run!
Thanks - that's very helpful. I may get back to you if I have questions.
Petro engineers do not have to be licensed, and anyone can perform the work. Accordingly, how is the number of college grads the measure of anything?