
Research on Recognizing Noun Phrases with Verbs Based on a Hybrid Method (基于混合方法的含动词名词短语识别研究)
Abstract With the rapid development of computer technology, knowledge has become an important resource in artificial intelligence. Noun phrase recognition, especially for noun phrases that contain verbs, is one of the fundamental tasks in automatic knowledge extraction from free text. Existing work on noun phrase recognition focuses mostly on named entities, which is a narrow scope and does not cover other noun phrases that contain verbs. Recognizing such phrases is further complicated by word segmentation errors, boundary determination, special structures, and scarce labeled data, so it remains a major challenge. To address this, the paper proposes a method that combines neural networks with rules and statistics. First, sentences are preprocessed, which includes correcting and merging part-of-speech tags, auxiliary words, time expressions, numeral-classifier expressions, and similar items. Second, named entities that contain verbs are recognized with a bidirectional LSTM (BiLSTM) combined with a conditional random field (CRF). Third, noun phrases that contain verbs are recognized using Baidu encyclopedia entries, fixed collocations, and grammars written in the Framework of Semantic Taxonomy and Description (FSTD). Finally, experiments and analysis are carried out on randomly sampled texts containing multiple verbs; the results show that the method reaches an accuracy of 89%.
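The abstract names a BiLSTM combined with a CRF for the named-entity step but gives no implementation details. As a minimal sketch of the decoding stage such a model typically uses, the following shows Viterbi decoding over per-token tag scores and tag-to-tag transition scores. The function name `viterbi_decode`, the B/I/O tag set, and all score values are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][t] is a (hypothetical) BiLSTM score for tag t at token i.
    transitions[(a, b)] is a CRF score for moving from tag a to tag b;
    pairs that are not listed default to 0.0.
    """
    if not emissions:
        return []

    # best[t] = best score of any path that ends in tag t at the current token
    best = {t: emissions[0][t] for t in tags}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score into `cur`.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    tags = ["B", "I", "O"]
    # Illustrative scores for a two-token sentence (assumed values).
    emissions = [
        {"B": 1.0, "I": 0.0, "O": 2.0},
        {"B": 0.0, "I": 3.0, "O": 0.5},
    ]
    # Penalize the invalid BIO transition O -> I.
    transitions = {("O", "I"): -10.0}
    print(viterbi_decode(emissions, transitions, tags))
```

The transition penalty is what lets the CRF layer reject tag sequences that the per-token scores alone would prefer, which is relevant to the boundary-determination difficulty the abstract mentions.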
Authors FANG Fang (方芳); WANG Shi (王石); WANG Ya (王亚); FU Jianhui (符建辉); CAO Cungen (曹存根) (Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source Journal of Shanxi University (Natural Science Edition) (《山西大学学报:自然科学版》), CAS, PKU Core, 2019, No. 1, pp. 31-40 (10 pages)
Funding National Key R&D Program of China (2017YFC1700300, 2017YFB1002300).
Keywords recognition of noun phrases with verbs; named entity recognition; Framework of Semantic Taxonomy and Description
About the authors FANG Fang (b. 1990), PhD candidate; research interest: artificial intelligence. E-mail: fangfang2016@ict.ac.cn. Corresponding author: CAO Cungen, E-mail: cgcao@ict.ac.cn.