
Construction Techniques of Terminology Semantic Knowledge Base Based on HowNet
Wang Yangyang (王羊羊), Chen Gang (陈刚), Cai Dongfeng (蔡东风), Wang Peiyan (王裴岩)
Domain knowledge bases can meet the knowledge requirements of natural language processing systems in specific domains. However, most domain knowledge bases are built by hand, which is inefficient. To address this problem, this paper analyzes more than 2,300 aviation term descriptions that were previously constructed by hand, together with the rules summarized during their construction. On this basis, it derives more than 200 core-word frameworks. Words other than the core word are filled into the frameworks automatically by a proposed method that combines rules and statistics, which raises the degree of automation in building the terminology semantic knowledge base. Finally, similarity is computed over term descriptions constructed with this method, with good results.