Sunday, April 20, 2008

Ng & Jordan, 2001 in simple words

I have read Ng & Jordan's paper On discriminative and generative classifiers: A comparison of logistic regression and naive Bayes many times, and at every attempt ended up wasting some time struggling with the notation. So finally, I am penning down my interpretations of all of the results mentioned in this paper.

Notation: m = number of samples, n = dimension of the feature space.

Conclusion: The asymptotic error of the discriminative model is lower than the asymptotic error of the corresponding generative model (from the generative - discriminative pair), but the generative model require number of examples logarithmic in the dimensionality of the feature space to catch up with this asymptotic error (as opposed to linear in the case of discriminative models).

Prop. 1: Asymptotic error for discriminative (logistic regression) classifier is at most the asymptotic error for the generative (naive Bayes) classifier.

Prop. 2: For n-dimensional logistic regression, the error lies within O(sqrt(n/m log m/n)) of the asymptotic error.

Lemma 3: The parameters of the (naive Bayes) generative model converge to those of the asymptotic classifier with logarithmic number of samples, with high probability.

Theorem 4: The error also converges with a margin of error G(t), which in turn is upper-bounded by Pr(l_gen(x) \in [-tn, tn]).

Prop. 5: As long as most of the features are relevant to the class label, the expexted value of |l_gen,\infty(x)| will be \Omega(n)

Corollary 6: Under specific assumptions, the error of h_gen converges to h_gen, \infty with high probability when m = \Omega(log n).

0 comments:

 
Learning in Vision: Ng & Jordan, 2001 in simple words