The only real difference between machine learning and good old econometrics is the area of application. Econometricians try to predict the past, e.g. to explain why a salary or GDP ended up at this or that level. That’s why they worry about standard errors and the limiting distributions of estimators. Machine learners are mostly concerned with the point estimate rather than a confidence interval. The funny thing is that machine learning and econometrics are the same thing: good old mathematical statistics under a different name. Machine learning is reduced-form estimation taken to the limit. A century ago there were no computers and most machine learning techniques were not available; the best statisticians could do was ordinary least squares, because the estimator has a closed-form solution and tractable limiting distributional behaviour. The solutions required ingenuity and a fundamental understanding of mathematics.
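To make the closed-form remark concrete, here is a minimal sketch in Python with NumPy. The simulated data, the sample size and the coefficient values are assumptions made up purely for illustration; nothing here comes from a real dataset. It computes the least-squares point estimate (the part a machine learner would stop at) and then the classical standard errors and an approximate 95% interval (the part an econometrician cares about), assuming homoskedastic errors.

```python
import numpy as np

# Simulated data: the true coefficients and noise level are illustrative assumptions.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

# Closed-form OLS point estimate: beta_hat = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Classical (homoskedastic) standard errors: sqrt(diag(sigma^2 (X'X)^{-1}))
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - X.shape[1])
std_errors = np.sqrt(np.diag(sigma2_hat * XtX_inv))

# Approximate 95% confidence intervals from the normal limiting distribution
ci_low = beta_hat - 1.96 * std_errors
ci_high = beta_hat + 1.96 * std_errors

print("point estimate:", beta_hat)
print("standard errors:", std_errors)
print("95% CI:", list(zip(ci_low, ci_high)))
```

Inverting X'X explicitly is fine for a two-column toy example; for anything larger, `np.linalg.lstsq` is the numerically safer way to get the same point estimate.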

Hm… I have gone astray a little bit. The point I was trying to make in this post is that, just as Star Trek brought some geeky theoretical physics into popular culture, social networking brought machine learning, or econometrics, into popular culture. The average Joe became aware of what can be done once you know some mathematical statistics. However, what the average Joe doesn’t know is that there is a competing approach that allows doing the same thing, and it is called structural estimation. McFadden got a Nobel Prize for pioneering this field. The idea here is that you think explicitly about how the data were generated. It is closely related to the idea of sufficiency in statistics. One does not need data on the whole population to draw conclusions; a random sample from this population is enough. In that loose sense a random sample plays the role of a sufficient statistic, i.e. it already contains enough information for the analysis. Structural estimation does the same thing: we have the data, but maybe it is not random, or maybe something that we need to control for is missing, yet it is good enough if we bring some outside knowledge about this data.

The paper that I presented the other day is about how demand can be predicted from data, potentially a very limited amount of it. It could be a great alternative to machine learning for dating services, for example.