I've got a good handle on my identification strategy. There are lots of good instruments here (good instruments? does that sound like an oxymoron to anyone else?). Now I'm focusing more on the specification of the model.
First, I'm working with 10 years of the ACS, which comes out to something like 30 million observations. I expect only about 3,000 of them are petroleum engineers (I'm getting about 300 each year). So what I'd like to do is just restrict the sample to engineers.
1. - My first question is - is this OK? Ryoo and Rosen (2004) look at demand elasticities for all engineers, and they restrict their sample to all college grads - and they seem fine calling that "the demand for engineers". It seems strange to me to exclude all other workers and still call it the "labor supply elasticity" if I'm restricting my sample like that. But perhaps it's fine? Perhaps the idea is that it's a local elasticity estimate, and while non-engineers may be drawn into the petroleum engineering labor market, they are so far out on the labor supply function that it's fine to ignore them? Anyway - if I've got good instruments to identify demand shifts specific to petroleum engineers, is it OK to just estimate on a sample of engineers?
2. - My second question is - I've been reading a lot of papers on the "extended Roy model" - particularly Heckman (most of which is over my head), and some operationalization of it by Blau. The whole idea is that you need to add a sector selection correction to get an unbiased elasticity, which basically means satisfying another exclusion restriction that I have no clue how I'm going to satisfy. So the question is - if I'm restricting my sample to engineers anyway, is a selection correction strictly necessary? Ryoo and Rosen don't use one. Blau does. But Blau is looking at child care workers - that seems like it could have a lot more selection problems associated with it. Will it look bad if I don't include this step?
3. - Finally, while I don't know generalized method of moments (GMM) too well myself, I really think I need to do that in addition to 2SLS when I'm presenting my results. It looks like somebody has written a user-contributed Stata command, ivreg2, that includes GMM as an option. Is this a good way of doing GMM in Stata? Has anyone done GMM and have any other suggestions?
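To make the 2SLS versus GMM comparison concrete, here is a minimal sketch in Python with NumPy (not Stata, and not a reproduction of what ivreg2 does internally). It computes 2SLS and a two-step efficient GMM estimate with a heteroskedasticity-robust weighting matrix, plus the Hansen J statistic. The data are simulated, and every name in it (`tsls`, `two_step_gmm`, `z1`, `z2`, the coefficient values) is a made-up illustration rather than anything from the ACS or from my actual specification.

```python
import numpy as np


def tsls(y, X, Z):
    """2SLS: beta = (X'Pz X)^{-1} X'Pz y, where Pz projects onto the columns of Z."""
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    XtZ = X.T @ Z
    A = XtZ @ ZtZ_inv @ XtZ.T       # X'Pz X
    b = XtZ @ ZtZ_inv @ (Z.T @ y)   # X'Pz y
    return np.linalg.solve(A, b)


def two_step_gmm(y, X, Z):
    """Two-step efficient GMM for a linear IV model.

    Step 1: 2SLS residuals. Step 2: re-estimate using the inverse of the
    heteroskedasticity-robust moment covariance as the weighting matrix.
    Returns the coefficient vector and the Hansen J statistic.
    """
    n = len(y)
    u1 = y - X @ tsls(y, X, Z)
    S = (Z * (u1 ** 2)[:, None]).T @ Z / n   # (1/n) * sum of u_i^2 z_i z_i'
    W = np.linalg.inv(S)
    XtZ = X.T @ Z
    beta = np.linalg.solve(XtZ @ W @ XtZ.T, XtZ @ W @ (Z.T @ y))
    g = Z.T @ (y - X @ beta) / n             # sample moment conditions
    J = n * (g @ W @ g)                      # overidentification statistic
    return beta, J


# Simulated data (assumed setup): one endogenous regressor, two instruments.
rng = np.random.default_rng(0)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
v = rng.normal(size=n)                        # source of endogeneity
x = 0.8 * z1 + 0.5 * z2 + v
u = 0.6 * v + rng.normal(size=n) * (1 + 0.5 * np.abs(z1))  # heteroskedastic error
y = 1.0 + 2.0 * x + u                         # true slope is 2.0

X = np.column_stack([np.ones(n), x])          # regressors (constant + endogenous x)
Z = np.column_stack([np.ones(n), z1, z2])     # instruments (constant + z1 + z2)

beta_2sls = tsls(y, X, Z)
beta_gmm, J = two_step_gmm(y, X, Z)
print("2SLS slope:", beta_2sls[1])
print("GMM slope: ", beta_gmm[1], " Hansen J:", J)
```

Two things this makes visible: with exactly as many instruments as regressors, the two estimators coincide, so GMM only adds something when the model is overidentified; and the J statistic is only available in that overidentified case.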