I got an email today announcing an Urban seminar, and the abstract reminded me of some of the Piketty debates around Bob Murphy and Phil Magness's paper and subsequent discussions. Here it is:
"ABSTRACT: The 2014 Current Population Survey, Annual Social and Economic Supplement (CPS-ASEC) introduced major changes to the income questions. The questions were introduced in a split-sample design—with 3/8 of the sample asked the new questions and 5/8 asked the traditional questions. Census Bureau analysis of the 3/8 and 5/8 samples finds large increases in retirement, disability, and asset income and modest increases in Social Security and public assistance benefits under the new questions. However, despite the additional income, poverty rates are higher for children and the elderly in the sample asked the new questions. In this brownbag, we discuss the changes to the survey, the effects of the changes on retirement and other income, and describe how compositional differences among families with children in the 3/8 and 5/8 samples may explain the unexpectedly higher poverty rates in the 3/8 sample. The discussion has practical as well as theoretical importance, as researchers will have a choice of datasets to choose from when analyzing the 2014 CPS-ASEC data—the 3/8 sample weighted to national totals, the 5/8 sample weighted to national totals, a combined sample, and possibly also an additional file prepared by the Census Bureau that imputes certain income data to the 5/8 sample based on responses in the 3/8 sample."
The CPS is typically not used to address inequality for all sorts of reasons, including the nature of the questions, coverage, and top-coding. But it still has income questions, and note that the recent redesign changes asset income reports. Of course, if we were to use the CPS to think about some of Piketty's research questions, this change would be important. Moreover, if you wanted to use a consistent series from the CPS, you would have to adjust the data, either by moving down the newer half of the series or (probably preferably, if this redesign represents an improvement) by moving up the older half. The Census Bureau uses the split-sample design discussed in the abstract so that you can work out what sort of adjustment might be appropriate.
This is what Piketty is doing too when he harmonizes several of the wealth inequality series, and he uses years when the data series overlap to develop the adjustment factors. The figure Murphy and Magness like to call the "Frankenstein graph" suggests that certain blocks of the series come from different datasets, but in reality Piketty is typically taking data from several datasets to provide a harmonized estimate (for example, combining the Kopczuk and Saez data and the SCF data). This is how you'd want to merge several datasets, and it's generally not "pivoting" between datasets or "overstating" them as Murphy and Magness put it.
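To make the idea of an overlap-based adjustment concrete, here is a minimal sketch of one common way to splice two series: compute the average ratio between them in the years they share, then rescale the older series by that factor. This is only an illustration of the general technique, not Piketty's actual procedure; the function names and the numbers are made up for the example.

```python
def overlap_ratio(old, new):
    """Average ratio of new to old over the years both series cover."""
    years = sorted(set(old) & set(new))
    if not years:
        raise ValueError("the two series have no overlapping years")
    return sum(new[y] / old[y] for y in years) / len(years)


def splice(old, new):
    """Rescale the older series to the newer one's level, then combine."""
    factor = overlap_ratio(old, new)
    harmonized = {year: value * factor for year, value in old.items()}
    harmonized.update(new)  # where both exist, keep the newer series
    return harmonized


# Hypothetical numbers, purely for illustration (not real data).
old_series = {1980: 20.0, 1985: 22.0, 1990: 24.0}
new_series = {1985: 27.5, 1990: 30.0, 1995: 32.0}

harmonized = splice(old_series, new_series)
for year in sorted(harmonized):
    print(year, harmonized[year])
```

In this toy case the newer series runs 25 percent above the older one in both overlap years, so the 1980 value is moved up by that factor. Moving the newer series down instead would just mean dividing it by the same factor, which is the choice discussed above for the CPS.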
Anyone can criticize these sorts of data decisions, but making them is a normal part of empirical work. If your criticism is just that the data decisions result in the conclusion that Piketty draws, that's not a very reasonable criticism. It's entirely circular: Piketty's conclusions are bad because his data decisions are bad. How do you know his data decisions are bad? Because they correspond to his conclusions!