Sunday, March 22, 2009

Infinitely Imbalanced Logistic Regression

Art Owen has an interesting, if not surprising, paper titled Infinitely Imbalanced Logistic Regression in JMLR 8 (2007). In it, Owen investigates logistic regression for binary classification tasks where one class has a finite number of samples and the number of samples in the other class approaches infinity, creating an infinite imbalance between the two. He shows that, under mild conditions, although the intercept of the learned logistic regression approaches -infinity, the rest of the coefficient vector approaches a non-trivial and useful limit. Furthermore, the minority class affects the coefficient vector ONLY through its empirical mean, so it suffices (within sampling uncertainty, that is) to replace all the samples of the minority class with a single sample at their mean.
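
To make the "only through its empirical mean" point concrete, here is a minimal sketch in Python. It assumes, from my reading of the paper, that the limiting coefficient vector beta solves the tilting equation: the integral of exp(x'beta) (x - xbar) over the majority distribution F0 is zero, where xbar is the minority mean. The function name limiting_beta, the Monte Carlo sample standing in for F0, and the three minority points are all made up for illustration, and Newton's method is just one convenient way to solve the equation, not necessarily what the paper does.

```python
import numpy as np

def limiting_beta(majority_x, minority_mean, iters=50, tol=1e-10):
    """Solve sum_i exp(x_i' b) (x_i - xbar) = 0 for b with Newton's method.

    majority_x is a sample standing in for F0 (an assumption of this sketch);
    the minority class enters only through minority_mean.
    """
    d = majority_x - minority_mean            # majority points centered at xbar
    beta = np.zeros(d.shape[1])
    for _ in range(iters):
        z = d @ beta
        w = np.exp(z - z.max())               # tilt weights, shifted for stability
        w /= w.sum()
        grad = w @ d                          # tilted mean of (x - xbar)
        if np.linalg.norm(grad) < tol:
            break
        # tilted covariance of (x - xbar), the Hessian of the log-sum-exp objective
        hess = (d * w[:, None]).T @ d - np.outer(grad, grad)
        beta -= np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(0)
majority = rng.standard_normal((50_000, 2))   # stand-in for F0 = N(0, I)
minority = np.array([[1.5, 0.0], [0.5, 1.0], [1.0, 0.5]])  # hypothetical rare class
xbar = minority.mean(axis=0)

beta = limiting_beta(majority, xbar)
print("minority mean:", xbar)
print("limiting beta:", beta)
```

When F0 is Gaussian with mean mu and covariance Sigma, tilting by exp(x'beta) shifts the mean to mu + Sigma beta, so the solution is Sigma^{-1}(xbar - mu). With the standard normal stand-in above, the printed beta should therefore land close to xbar itself, up to Monte Carlo error. Any other set of minority points with the same mean would give the same answer, since the solver never sees anything but xbar.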
