Citation: TIAN Wei-dong, HUANG Yong. Study on the Application of Frequent Sub-tree Patterns in Focus Words Recognition[J]. Microelectronics & Computer, 2015, 32(11): 27-32.

Study on the Application of Frequent Sub-tree Patterns in Focus Words Recognition

    Abstract: In focus-word recognition for Chinese questions, existing methods fail to make effective use of the deep statistical relationships in dependency syntax. To address this problem and to explore the statistical relationships between focus words and the multi-dimensional attributes of words, this paper introduces the new concept of a Multi-Dimensional Tree (MDT), presents a scheme for mining frequent MDT patterns, and applies it to focus-word recognition in Chinese questions. For this application, methods for condensing the frequent sub-tree patterns and for resolving rule conflicts are given, and a Chinese focus-word recognition model is trained. The method is a typical objective method. Experiments show that it has good stability, adaptability, and robustness, and that it achieves higher recognition accuracy than a conditional random field (CRF) model.
