Facts & other stubborn things: Data adjustments - not a conspiracy, just a part of empirical work in economics

Tuesday, April 7, 2015

Data adjustments - not a conspiracy, just a part of empirical work in economics

Posted by dkuehn at 1:07 PM

I got an email today announcing an Urban seminar, and the abstract reminded me of some of the Piketty debates around Bob Murphy and Phil Magness's paper and subsequent discussions. Here it is:

"ABSTRACT: The 2014 Current Population Survey, Annual Social and Economic Supplement (CPS-ASEC) introduced major changes to the income questions. The questions were introduced in a split-sample design—with 3/8 of the sample asked the new questions and 5/8 asked the traditional questions. Census Bureau analysis of the 3/8 and 5/8 samples finds large increases in retirement, disability, and asset income and modest increases in Social Security and public assistance benefits under the new questions. However, despite the additional income, poverty rates are higher for children and the elderly in the sample asked the new questions. In this brownbag, we discuss the changes to the survey, the effects of the changes on retirement and other income, and describe how compositional differences among families with children in the 3/8 and 5/8 samples may explain the unexpectedly higher poverty rates in the 3/8 sample. The discussion has practical as well as theoretical importance, as researchers will have a choice of datasets to choose from when analyzing the 2014 CPS-ASEC data—the 3/8 sample weighted to national totals, the 5/8 sample weighted to national totals, a combined sample, and possibly also an additional file prepared by the Census Bureau that imputes certain income data to the 5/8 sample based on responses in the 3/8 sample."

The CPS is typically not used to address inequality, for all sorts of reasons, including the nature of the questions, coverage, and top-coding. But it still has income questions, and note that the recent redesign changes asset income reports. Of course, if we were to use the CPS to think about some of Piketty's research questions, this change would be important. Moreover, if you wanted a consistent series from the CPS, you would have to adjust the data, either by moving down the newer portion of the series or (probably preferably, if this redesign represents an improvement) by moving up the older portion. The split-sample design discussed in the abstract is there so that you can understand the sort of adjustment that might be appropriate.
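To make the split-sample idea concrete, here is a minimal sketch of one way such an adjustment could work. It is not the Census Bureau's procedure: the function names, the simple ratio-of-weighted-means factor, and every number in the usage note are made up for illustration.

```python
def weighted_mean(values, weights):
    """Survey-weighted mean of a list of reported incomes."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def split_sample_factor(new_values, new_weights, old_values, old_weights):
    """Ratio of the mean under the new questions to the mean under the old ones.

    Assumes both subsamples are drawn from the same population in the same
    year, so the gap in means is attributed to the questionnaire change.
    """
    return weighted_mean(new_values, new_weights) / weighted_mean(old_values, old_weights)


def adjust_older_series(series, factor, redesign_year):
    """Scale pre-redesign years up (or down) so they line up with the new questions."""
    return {
        year: value * factor if year < redesign_year else value
        for year, value in series.items()
    }
```

With invented inputs such as `split_sample_factor([120, 80], [1, 1], [100, 60], [1, 1])`, the factor is 1.25, and `adjust_older_series({2012: 80.0, 2013: 84.0, 2014: 110.0}, 1.25, 2014)` raises the two older years while leaving 2014 alone. A real adjustment would likely vary by income type and would need to deal with the compositional differences the abstract mentions.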

This is what Piketty is doing too when he harmonizes several of the wealth inequality series, and he uses years when the data series overlap to develop the adjustment factors. The figure Murphy and Magness like to call the "Frankenstein graph" suggests that certain blocks of the series come from different datasets, but in reality Piketty is typically taking data from several datasets to provide a harmonized estimate (for example, combining the Kopczuk and Saez data and the SCF data). This is how you'd want to merge several datasets, and it's generally not "pivoting" between datasets or "overstating" them as Murphy and Magness put it.
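The overlap-year approach can be sketched in a few lines. This is only an illustration of ratio splicing in general, not a reconstruction of Piketty's spreadsheets: the function names are hypothetical, the choice of an average ratio over the overlap years is an assumption, and the numbers in the usage note are invented.

```python
def overlap_factor(reference, other):
    """Average ratio of the reference series to the other series over shared years."""
    overlap = sorted(set(reference) & set(other))
    if not overlap:
        raise ValueError("series do not overlap, so no adjustment factor can be computed")
    ratios = [reference[year] / other[year] for year in overlap]
    return sum(ratios) / len(ratios)


def harmonize(reference, other):
    """Rescale `other` to the level of `reference` and merge the two series.

    Where both series report a year, the reference value is kept.
    """
    factor = overlap_factor(reference, other)
    merged = {year: value * factor for year, value in other.items()}
    merged.update(reference)  # reference values take precedence in overlap years
    return dict(sorted(merged.items()))
```

For example, with the made-up series `reference = {2000: 30.0, 2010: 33.0}` and `other = {1980: 20.0, 1990: 22.0, 2000: 24.0}`, the single overlap year gives a factor of 1.25, so the earlier years are scaled to 25.0 and 27.5 before being joined to the reference series. The point is that the merged series draws on both sources throughout, rather than switching from one to the other.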

Anyone can criticize these sorts of data decisions, but it's a normal part of empirical work. If your criticism is just that the data decisions result in the conclusion that Piketty draws, that's not a very reasonable criticism. It's entirely circular: Piketty's conclusions are bad because his data decisions are bad. How do you know his data decisions are bad? Because they correspond to his conclusions!

13 comments:

Spektator, April 14, 2015 at 1:27 AM
You do yourself a real disservice, Daniel, when you make arguments such as these.

M&M are fully aware that he's harmonizing different series. YOU just seem unwilling to admit that they've dug deeper into this one than you're comfortable discussing yourself.

What M&M point out, and what Piketty's defenders ignore, is that he does more than just harmonize data sets through "standard" stats practices. When the Kopczuk and Saez data start to break from Piketty's story, he just deletes them and goes to the SCF. And when the SCF doesn't support him either, he starts picking and choosing *which* SCF years he wants to include. That's not standard social science. That's data malpractice, which he doesn't explain in his footnotes either.

Daniel Kuehn, April 14, 2015 at 10:12 AM
I've been trying to figure out why the Wolff and Kennickell pictures of the 90s differ, but I can't find anything obvious, and they don't cite each other, so it's hard to compare (not surprising, since they came out around the same time). It might be vehicles (I think Wolff excludes them and Kennickell includes them), but I'm not 100% sure.

In any case, none of this really changes the fact that the SCF shows inequality increasing.

Spektator, April 14, 2015 at 10:57 AM
Kennickell shows only one clear increase, from 30.2% to 34.6%, between 1992 and 1995 (the Clinton years).

But then he has 34.6% in 1995 and 34.5% in 2010. Where's the increase? That's actually a slight decrease.

Neither of these supports: "In any case none of this really changes the fact that the SCF shows inequality increasing."

Daniel Kuehn, April 14, 2015 at 1:50 PM
Piketty's 1980 to 2010 change comes from Kennickell, 1989 to 2007. He says in the note that he grabs 1983 from Wolff but he doesn't appear to, but I'm just glancing over all of this. What is not understood about where Piketty gets it from? What do you mean they don't match either source? They look right to me.

It's not malpractice.

Stop saying that.

If you have a disagreement with Piketty to offer, make it. If all you have is that you don't know what Piketty's source is, you shouldn't accuse him of malpractice until you do know. I have Wolff, Kennickell, and Piketty on my screen right now, and I see where his numbers are coming from. I don't know what your confusion on this is.

What I don't know is the source of the difference between Kennickell and Wolff, but the fact that two researchers came up with somewhat different answers to potentially different questions (are they measuring net worth the same way? I don't know) does not mean Piketty is guilty of malpractice.