Likelihood ratio test

The likelihood ratio test is used to compare the fit of two models, one of which is nested within the other.

In the context of machine learning, and the Mahout project in particular, the term LLR (log-likelihood ratio) usually refers to a test of significance for two binomial distributions, also known as the G squared statistic. This is a special case of the multinomial test and is closely related to mutual information. In this context the value of the statistic is not normally used as a true frequentist test of significance, since that would raise obvious and severe problems with multiple comparisons. Instead it serves as a heuristic score for ordering pairs of items, with the most interestingly connected pairs receiving the highest scores.

In this usage, the LLR has proven very useful for discriminating between pairs of features that have interesting degrees of cooccurrence and those that do not, with usefully small false positive and false negative rates. When counts are small, the LLR is typically far more suitable than many other measures such as Pearson’s correlation, Pearson’s chi squared statistic or z statistics. The LLR as stated does not, however, make any use of rating data, which can limit its applicability in problems such as the Netflix competition.
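As an illustration, the G squared statistic for a pair of items can be computed from a 2x2 table of cooccurrence counts. The sketch below is a minimal version assuming the standard definition G² = 2 Σ k_ij ln(k_ij N / (row_i col_j)); the function name `llr_2x2` and the argument names `k11`..`k22` are chosen here for illustration and are not taken from Mahout's code.

```python
import math


def llr_2x2(k11, k12, k21, k22):
    """G squared (log-likelihood ratio) for a 2x2 table of counts.

    Assumed layout (illustrative naming):
      k11: both items occur together   k12: first item without second
      k21: second item without first   k22: neither item occurs
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, k12), (k21, k22))

    total = 0.0
    for i in range(2):
        for j in range(2):
            k = cells[i][j]
            # Empty cells contribute nothing (the limit of k * ln k is 0).
            if k > 0:
                total += k * math.log(k * n / (rows[i] * cols[j]))
    return 2.0 * total


# Counts that match independence exactly give a score of zero.
print(llr_2x2(10, 10, 10, 10))  # 0.0
# Perfectly associated items give a large score: 40 * ln(2), about 27.73.
print(llr_2x2(10, 0, 0, 10))
```

Dividing the result by 2N gives the mutual information (in nats) of the empirical joint distribution, which is the sense in which the two quantities are closely related.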

The actual value of the LLR is not usually very helpful other than as a way of ordering pairs of items. As such, it is often used to select a sparse set of coefficients whose values are then estimated by other means such as TF-IDF. These coefficients can be estimated in a way that is independent of the training data, for example from general corpus statistics, and the ordering imposed by the LLR is relatively robust to counting fluctuation. As a result, this technique can provide very strong results in very sparse problems where the potential number of features vastly outnumbers the number of training examples and where features are highly interdependent.
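The selection step described above can be sketched as follows: score every candidate pair, sort by score, and keep only the highest-ranked pairs for later weighting by some other method. This is only a sketch under stated assumptions; the helper names `llr` and `top_pairs`, the `keep` cutoff, and the example counts are hypothetical, and nothing here reflects how Mahout itself organizes the computation.

```python
import math


def llr(k11, k12, k21, k22):
    """G squared for a 2x2 count table (same definition as G^2 = 2 * sum k ln(kN / (row * col)))."""
    n = k11 + k12 + k21 + k22
    r1, r2 = k11 + k12, k21 + k22
    c1, c2 = k11 + k21, k12 + k22
    total = 0.0
    for k, r, c in ((k11, r1, c1), (k12, r1, c2), (k21, r2, c1), (k22, r2, c2)):
        if k > 0:  # empty cells contribute nothing
            total += k * math.log(k * n / (r * c))
    return 2.0 * total


def top_pairs(tables, keep):
    """Return the `keep` pairs with the highest LLR score.

    `tables` maps a pair of item names to its (k11, k12, k21, k22) counts.
    Only the ordering is used; the score values themselves are discarded.
    """
    ranked = sorted(tables, key=lambda pair: llr(*tables[pair]), reverse=True)
    return ranked[:keep]


# Hypothetical cooccurrence counts for three candidate pairs.
tables = {
    ("a", "b"): (10, 0, 0, 10),  # strongly associated
    ("a", "c"): (5, 5, 5, 5),    # consistent with independence
    ("b", "c"): (8, 2, 2, 8),    # moderately associated
}
print(top_pairs(tables, keep=2))  # [('a', 'b'), ('b', 'c')]
```

The surviving pairs define the sparse set of nonzero coefficients; their weights would then come from a separate estimate such as TF-IDF rather than from the LLR score.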