YANG Jun, CHEN Xian-fu. Text Categorization Based on KPCA and RBF Neural Network[J]. Microelectronics & Computer, 2010, 27(3): 122-125.

Text Categorization Based on KPCA and RBF Neural Network

  • Abstract: Classification methods based on the word space have difficulty handling the high dimensionality and complex correlations of text vectors. To address this, the paper proposes a text categorization algorithm based on kernel principal component analysis (KPCA) and a radial basis function (RBF) neural network. The algorithm first applies KPCA with a suitable kernel function to extract the principal components of the text vectors in the high-dimensional feature space, which reduces the dimensionality of the input space and yields a semantic feature space. An RBF neural network classifier is then trained in the semantic feature space and used to classify texts. Experimental results show that KPCA not only reduces the dimensionality but also notably improves the classification precision of the RBF network while substantially reducing its training time.
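The two-stage pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the kernel, the number of components, how the RBF centers are chosen, or how the output layer is trained. The sketch assumes a Gaussian kernel, two components, one center per class (the class mean in the reduced space), and output weights fitted by linear least squares. The names `gaussian_kernel`, `KernelPCA`, and `RBFNetwork`, the parameter values, and the toy term-count matrix are all hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Pairwise Gaussian kernel: k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


class KernelPCA:
    """Kernel PCA with a Gaussian kernel (the kernel choice is an assumption)."""

    def __init__(self, n_components, gamma):
        self.n_components = n_components
        self.gamma = gamma

    def _center(self, K):
        # Center kernel rows using the training-set kernel statistics.
        return K - self.col_mean_[None, :] - K.mean(axis=1)[:, None] + self.total_mean_

    def fit(self, X):
        self.X_ = X
        K = gaussian_kernel(X, X, self.gamma)
        self.col_mean_ = K.mean(axis=0)
        self.total_mean_ = K.mean()
        eigvals, eigvecs = np.linalg.eigh(self._center(K))  # ascending eigenvalues
        top = np.argsort(eigvals)[::-1][: self.n_components]
        # Rescale so each principal axis has unit norm in feature space.
        self.alphas_ = eigvecs[:, top] / np.sqrt(eigvals[top])
        return self

    def transform(self, Z):
        return self._center(gaussian_kernel(Z, self.X_, self.gamma)) @ self.alphas_


class RBFNetwork:
    """Gaussian hidden layer with fixed centers plus a linear output layer."""

    def __init__(self, centers, width):
        self.centers = centers
        self.width = width

    def _hidden(self, X):
        H = gaussian_kernel(X, self.centers, 1.0 / (2.0 * self.width ** 2))
        return np.hstack([H, np.ones((len(X), 1))])  # append a bias column

    def fit(self, X, y, n_classes):
        targets = np.eye(n_classes)[y]  # one-hot class targets
        self.W_, *_ = np.linalg.lstsq(self._hidden(X), targets, rcond=None)
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.W_).argmax(axis=1)


# Toy term-count vectors (hypothetical): columns 0-2 are topic-A words,
# columns 3-5 are topic-B words.
X = np.array([
    [3, 2, 1, 0, 0, 0], [2, 3, 0, 0, 1, 0], [4, 1, 2, 0, 0, 0], [3, 3, 1, 1, 0, 0],
    [0, 0, 0, 3, 2, 1], [0, 1, 0, 2, 3, 0], [0, 0, 0, 4, 1, 2], [1, 0, 0, 3, 3, 1],
], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Stage 1: project documents into the low-dimensional KPCA space.
kpca = KernelPCA(n_components=2, gamma=0.05).fit(X)
features = kpca.transform(X)

# Stage 2: train the RBF network in the reduced space.
centers = np.vstack([features[y == c].mean(axis=0) for c in (0, 1)])
net = RBFNetwork(centers, width=0.5).fit(features, y, n_classes=2)
train_pred = net.predict(features)

# Classify unseen documents by projecting them with the fitted KPCA first.
new_docs = np.array([[3, 2, 0, 0, 0, 1], [0, 0, 1, 2, 3, 0]], dtype=float)
test_pred = net.predict(kpca.transform(new_docs))
print(train_pred, test_pred)
```

Because the network is trained on the projected features rather than on the raw term vectors, its hidden layer works in two dimensions here instead of six. This is the mechanism the abstract credits for the shorter training time, although a toy example of this size cannot reproduce the reported gains.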

     
