Thursday, May 9, 2013

I was just awarded a 2013 Infometrics Institute Summer Fellowship!

This was very good news to get today. I didn't get the Mathematica fellowship I had applied for and hadn't heard back from a lot of other stuff, so I was worried I'd be working retail this summer. The fellowship will help me keep writing while building up my diaper fund (no, I am not incontinent - I am expecting).

The Infometrics Institute website is here.

More on information theoretic approaches to econometrics here and here (the latter authored by my many-times-over professor and likely dissertation committee member).

My co-fellow is in my cohort - Woubet Kassa. I just talked to him and he tells me he'll be continuing work he did this spring on the importance of land titling institutions for economic development. He's a great guy and I'm very excited to be working more with him this summer.

This is the proposal I sent in. It is going to fit in very well with my Sloan Foundation work using the same data:
The economic literature on grade inflation has primarily been concerned with the biased representation of student ability and the impact on student major choices. However, the studies to date are not completely satisfactory if we expect agents operating in the labor market to behave rationally. The problem of grade inflation is common knowledge, so why wouldn’t employers, students, and teachers simply adjust their expectations by an appropriate expected grade inflation factor before making their decisions? An information theoretic approach to grade inflation is potentially more fruitful. From the perspective of information theory, the critical point is not that grades are being inflated over time (agents can adjust for inflation), but that they are also being compressed due to truncation in the grading scale. This implies that grade inflation not only changes the central tendency of the grade distribution; it also reduces the entropy of grades. When employers and graduate school admissions committees review college transcripts they are interested in extracting compressed information about the student’s abilities. Additional compression as a result of grade inflation degrades this information further.
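
To make the compression point concrete, here is a minimal sketch with made-up numbers. It is only an illustration, not part of the proposal and not based on the Baccalaureate and Beyond data. The function names `shannon_entropy` and `inflate` and the grade shares are hypothetical. It assumes inflation moves every grade up one letter, with anything that would exceed an A piling up at A.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def inflate(dist, steps=1):
    """Shift every grade up by `steps` letter grades, capping at the top grade.

    `dist` lists probabilities ordered from the lowest grade to the highest.
    Mass that would move past the top grade is added to the top grade,
    which is the truncation the proposal describes.
    """
    top = len(dist) - 1
    out = [0.0] * len(dist)
    for i, p in enumerate(dist):
        out[min(i + steps, top)] += p
    return out

# Hypothetical shares of F, D, C, B, A grades (illustrative only).
before = [0.05, 0.15, 0.40, 0.30, 0.10]
after = inflate(before, steps=1)

print("after inflation:", after)
print("entropy before (bits):", shannon_entropy(before))
print("entropy after (bits): ", shannon_entropy(after))
```

In this toy example the entropy falls only because the old B and A students now share a single grade. A pure upward shift on an unbounded scale would relabel the grades and leave the entropy unchanged, which is why agents could adjust for it.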

I propose an investigation of the relative entropy of grades by college major, its impact on student and employer choices, and the determination of wages inside and outside one’s field. I will use the 2008/2009 Baccalaureate and Beyond survey (I already have the data), which includes college transcript data for approximately 15,000 American students who graduated in the 2008/2009 academic year. Fields of study with high entropy in grade distributions should be associated with wider dispersion in starting wages, lower labor market churning, and slower wage growth over time (because there is less to be learned about student ability due to the higher information content of grades). Fields of study with lower entropy grade distributions should have lower starting wages and less wage dispersion, but potentially more churning between jobs and faster wage growth over time as student abilities are learned by employers on the job. My preliminary review of the literature on grade inflation suggests that an information theoretic approach to the problem would be an original contribution. The only comparable analysis I have found is an unpublished memo by a computer science professor discussing the possibility of an information theoretic approach to understanding grade inflation.
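
A rough sketch of how relative entropy by major might be computed follows. The proposal does not say which reference distribution is used, so measuring each major against a uniform distribution over the grade scale is my assumption here. The majors, the transcripts, and the helper names `distribution` and `relative_entropy` are likewise hypothetical.

```python
import math
from collections import Counter

def distribution(grades, scale):
    """Empirical share of each grade in `scale`, in scale order."""
    counts = Counter(grades)
    n = len(grades)
    return [counts[g] / n for g in scale]

def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes q is positive wherever p is positive.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

SCALE = ["F", "D", "C", "B", "A"]

# Made-up transcripts by major (illustrative only, not survey data).
grades_by_major = {
    "engineering": list("FDCCCBBBAA"),
    "education": list("BBAAAAAAAA"),
}

# Assumed reference: every grade on the scale equally likely.
uniform = [1 / len(SCALE)] * len(SCALE)

divergence = {
    major: relative_entropy(distribution(grades, SCALE), uniform)
    for major, grades in grades_by_major.items()
}

for major, bits in divergence.items():
    print(major, round(bits, 3))
```

Against a uniform reference, the divergence equals the maximum possible entropy minus the major's own entropy, so a larger value means a more compressed, less informative grade distribution.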


  1. Interesting topic.

    I've wondered why employers (and graduate schools) don't use class rank instead of grade. Do colleges make that information available? Your explanation of the problem might still work, if a compressed grade scale meant that class rank had a larger random error than before.

    Also, might the problem be uneven grade inflation across colleges, or even across majors, making it harder for the employer to judge what a grade means? The obvious solution would be for the employer to have information on average grades, or better on average and standard deviation, by college and perhaps by major--I don't know how available that would be.

    1. I know class rank is a delicate issue in some cases. Not in college, but in high school I remember it was not reported because it was so sensitive for people. But obviously if colleges have centralized transcript information the rank would be trivial. It's never appeared on any of my transcripts, though.

      I think the problems across colleges and majors are important too. I have data on all grades, so I can associate them with classes inside and outside the major. I should take a look at that and at potential differential effects.

      Again, for a college all of this ought to be available if they wanted to report it. Of course there would be more trouble reporting across colleges.

      For graduate school applicants this is, of course, why we have GREs, MCATs, and LSATs. You get both a score and a rank from that and it's a national score and rank. What's interesting is that there's no standardized test of the same sort for people employed directly out of college.

    2. Infometrics is an approach to statistical inference that relies on the information content of a variable. It's kind of like an inversion of the maximum likelihood approach. Instead of asking "what very specific claim do I have the highest confidence in making about this data?" you are effectively asking "how broad of a claim can I make and still be confident that it describes this data?" (or that's my best shot at explaining it - I'm sure I'll be able to do better in the future).

      So the real interest of this proposal to the Infometrics Institute is that it's approaching grades in the same way: for their information content.

  2. Very interesting. Where did you learn about information theory if I may ask? I picked it up through cryptography, but I never heard of it anywhere in my (admittedly undergraduate) economics program.

    (Fun fact: information theory is a really cool tool to solve riddles because if you can calculate the entropy remaining in the system after you've acquired information, you can know unambiguously whether you are likely to solve the riddle or not)

    1. Among econometricians that use information theory, one of the world experts is the econometrics professor in our department (Amos Golan). He runs the institute, and he often brings the material into classes.

      In the fall he'll actually have a seminar dedicated to it, which I'll be taking. That will be really cool (most AU students get a dose of it from him in econometrics, but a relatively small one).

    2. I am very much a newbie to it.

      The fellowship pretty much sponsors empirically-oriented students for the summer and they pick up information theory in the process.

  3. Congratulations on attaining this fellowship, Daniel.

    Out of curiosity Daniel Kuehn, were you aware that information theory takes concepts and techniques from statistical mechanics?

    Also, have you asked Professor Amos Golan if he is aware of the econophysics literature, and if he is, has he formed an opinion on that literature?

    Furthermore, were you aware that Claude E. Shannon, the mathematician who fathered information theory, based some of his work on the contributions of the British autodidactic prodigy George Boole?

    J. M. Keynes himself also dealt extensively with George Boole in A Treatise on Probability. I wonder what philosophers and mathematicians would have to say when they compare the roles of Boole, Keynes, and Shannon, in intellectual history.

    Also, if anyone else is interested in exploring the connection between Boole and Keynes, please see the following article, which was recently published in History of Economic Ideas, Volume 20, 2012, Issue 3.

    1. I feel like we're always playing a six degrees of separation game back to Brady's Boole and Keynes article.

      Yes, I knew about the Shannon/Boole connection. No I don't know what Golan's views or knowledge of econophysics are.

    2. Regarding six degrees of separation...well, it's a small world. Networks tend to be filled with people who tend to know of other people in some way or form due to coinciding interests of varying degree.

      Regarding Professor Golan, do you plan on finding out whether he is aware of the econophysics literature and, if he is, what his opinion of it is?

      Also, regarding information entropy...Daniel, could you please ask Professor Golan whether he can explain the differences between information theory and decision theory? Because I know that there is a certain degree of overlap, as both do deal with knowledge, and that if one traces the scholarly literature of each, there are articles that cite older papers with applications to both.

      For example, this 2011 article written by Marcello Basili and Alain Chateauneuf and published in the International Journal of Approximate Reasoning cited two old articles by the American physicist E. T. Jaynes on information theory and statistical mechanics.

      (Also, is Professor Golan aware of that journal? Besides papers on decision theory under ambiguity, the International Journal of Approximate Reasoning also publishes articles on fuzzy logic, non-additive probability theory, artificial intelligence, and philosophical approaches to uncertainty.)

    3. I have absolutely no intention of raising the issue with Prof. Golan.

      Information theory deals with specific information theoretic metrics like entropy - the name refers to the amount of information in a system and its adequacy for determining a question of interest. So it's often relevant for data compression: how much can you compress a signal and still successfully extract the signal at the other end? How much information is lost, in other words, by a particular data compression?

      Decision theory is much bigger than that and pretty much refers to what the name implies - making inferences to make decisions.

