XI Feng. WebPages Association Analysis Based on Named Entities and Relations[J]. Microelectronics & Computer, 2011, 28(8): 51-55.

WebPages Association Analysis Based on Named Entities and Relations

  • Abstract: To address the problems that arise when traditional association analysis techniques are applied to web page text, this paper proposes an association analysis method based on named entities and entity relations. The method uses named entities and entity relations as text features in place of the traditional high-frequency words. It first extracts the named entities in web page text using a correction strategy based on vector similarity comparison. It then analyzes the Maxfpminer algorithm, improves it, and applies the improved algorithm to association analysis of web page text. Experimental results show that the knowledge patterns obtained by this method are more valid and more readable than those obtained by traditional methods.
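
As a rough illustration of association analysis over entity features, the sketch below treats each web page as a set of named entities and finds the maximal frequent entity sets, that is, frequent sets with no frequent proper superset. It is not the paper's Maxfpminer algorithm or its improved variant, neither of which is specified in the abstract; it swaps in a plain level-wise (Apriori-style) search followed by a maximality filter. The entity labels, the `pages` data, and the function names are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset that occurs in at
    least min_support transactions, using a level-wise search."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        # Join step: merge pairs of frequent k-itemsets that differ in
        # exactly one item to form (k+1)-item candidates.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) >= min_support]
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets with no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return {s: n for s, n in frequent.items()
            if not any(s < other for other in frequent)}


# Hypothetical input: one set of extracted named entities per web page.
pages = [
    {"PER:Alice", "ORG:Acme", "LOC:Paris"},
    {"PER:Alice", "ORG:Acme"},
    {"PER:Alice", "ORG:Acme", "PER:Bob"},
    {"PER:Bob", "LOC:Paris"},
]

result = maximal_frequent_itemsets(pages, min_support=2)
for itemset, count in result.items():
    print(sorted(itemset), count)
```

With a minimum support of 2, the only multi-entity pattern that survives is the pair of the person and the organization that co-occur on three pages. Using entities rather than high-frequency words as items is what would make such a pattern readable as a statement about who is associated with what.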

     
