Abstract:
To model text effectively, text vectors can be represented as points on a statistical manifold, and kernels can be used to integrate discriminative and generative models. We present diffusion kernels based on the Dirichlet compound multinomial (DCM) manifold. More specifically, we propose a kernel nearest-neighbor classifier that uses the kernel distance metric of the DCM manifold to perform text classification. Experimental results on various real-world text datasets show that our text classifier provides much better accuracy than some current state-of-the-art methods.
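
As a rough illustration of the kernel nearest-neighbor idea, the sketch below classifies a document by the kernel-induced distance d(x, y)^2 = K(x, x) - 2K(x, y) + K(y, y). It is not the method of this paper: the DCM diffusion kernel is not specified here, so the code substitutes the leading-order diffusion kernel on the multinomial simplex as a stand-in. The function names, the toy data, and the parameter t are all assumptions made for illustration.

```python
import math
from collections import Counter


def normalize(counts):
    """Map a term-count vector to a point on the probability simplex."""
    total = sum(counts)
    if total == 0:
        raise ValueError("document has no terms")
    return [c / total for c in counts]


def diffusion_kernel(p, q, t=0.5):
    """Stand-in kernel (assumption, NOT the DCM kernel of the paper).

    Leading-order multinomial diffusion kernel exp(-d^2 / (4t)), where
    d = 2 * arccos(sum_i sqrt(p_i * q_i)) is the Fisher geodesic distance.
    The normalizing constant is dropped.
    """
    affinity = sum(math.sqrt(a * b) for a, b in zip(p, q))
    affinity = min(1.0, max(0.0, affinity))  # guard against rounding error
    d = 2.0 * math.acos(affinity)
    return math.exp(-d * d / (4.0 * t))


def kernel_distance(x, y, kernel):
    """Distance induced by a kernel in its feature space."""
    squared = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(squared, 0.0))


def knn_predict(query, train_points, train_labels, kernel, k=1):
    """Majority vote among the k training points nearest in kernel distance."""
    ranked = sorted(
        range(len(train_points)),
        key=lambda i: kernel_distance(query, train_points[i], kernel),
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: term counts over a three-word vocabulary.
train_points = [normalize(c) for c in ([5, 1, 0], [4, 2, 0], [0, 1, 5], [0, 2, 4])]
train_labels = ["a", "a", "b", "b"]
prediction = knn_predict(normalize([3, 1, 0]), train_points, train_labels, diffusion_kernel, k=3)
```

Because the distance is computed only through kernel evaluations, replacing `diffusion_kernel` with another kernel function on the same inputs leaves `knn_predict` unchanged.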