
Journal of Software (Ruan Jian Xue Bao) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software, 2018, 29(10): 2931-2947 [doi: 10.13328/j.cnki.jos.005552] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
An Accurate and Efficient Method for Constructing Domain Knowledge Graphs
YANG Yu-Ji, XU Bin, HU Jia-Wei, TONG Mei-Han, ZHANG Peng, ZHENG Li
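(Knowledge Engineering Laboratory, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Corresponding author: YANG Yu-Ji, E-mail: yangyujiyyj@gmail.com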
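Abstract: As the data foundation of the semantic Web, knowledge graphs play a vital role in areas such as knowledge-based question answering and semantic search, and have long been a hot topic in both research and engineering. However, building a large-scale, high-quality knowledge graph usually costs a great deal of manual effort and time. How to balance accuracy against efficiency and quickly build a high-quality domain knowledge graph is an important challenge in knowledge engineering. This paper presents a systematic study of domain knowledge graph construction and proposes an accurate and efficient construction method, the “four-step method”. The method was applied to building knowledge graphs for nine subjects in Chinese basic (K-12) education and produced subject knowledge graphs of high accuracy within a relatively short time, which demonstrates its effectiveness for constructing domain knowledge graphs. Taking the geography knowledge graph as an example, the “four-step method” yielded 670 thousand instances and 14.21 million triples in total, and both the subject knowledge coverage and the knowledge accuracy of the annotated data are above 99%.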
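Keywords: semantic Web; knowledge graph; ontology; semantic annotation; entity set expansion; relation extraction
Chinese Library Classification (CLC) number: TP18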
Citation format (Chinese): 杨玉基,许斌,胡家威,仝美涵,张鹏,郑莉.一种准确而高效的领域知识图谱构建方法.软件学报,2018,29(10):2931-2947. http://www.jos.org.cn/1000-9825/5552.htm
Citation format (English): Yang YJ, Xu B, Hu JW, Tong MH, Zhang P, Zheng L. Accurate and efficient method for constructing domain knowledge graph. Ruan Jian Xue Bao/Journal of Software, 2018, 29(10): 2931-2947 (in Chinese). http://www.jos.org.cn/1000-9825/5552.htm
Accurate and Efficient Method for Constructing Domain Knowledge Graph
YANG Yu-Ji, XU Bin, HU Jia-Wei, TONG Mei-Han, ZHANG Peng, ZHENG Li
(Knowledge Engineering Group, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Abstract: As the data support of the semantic Web, knowledge graphs play a vital role in many areas such as knowledge QA and semantic
search. They have therefore become a hot topic in both research and engineering. However, it is often costly to build a large-scale
knowledge graph with high accuracy. How to balance accuracy and efficiency, and quickly build a high-quality domain knowledge
graph, is a major challenge in the field of knowledge engineering. This paper presents a systematic study of the construction of domain
knowledge graphs, and puts forward an accurate and efficient construction method called the “four-step” method. This
method has been applied to the construction of knowledge graphs for nine subjects in the K-12 education of China, and the nine subject
knowledge graphs were developed with high accuracy in a relatively short time, which demonstrates that the method is effective. For example, the
geography knowledge graph, which was constructed using the “four-step” method, has 670 thousand instances and 14.21 million triples.
Within it, the knowledge coverage and the knowledge accuracy of the annotated data are both above 99%.
Key words: semantic Web; knowledge graph; ontology; semantic annotation; entity set expansion; relation extraction
In 1998, Berners-Lee, the inventor of the World Wide Web, first proposed the concept of the “semantic Web” [1], whose core idea is that, in Web pages
Foundation item: National High Technology Research and Development Program of China (863 Program) (2015AA015401)
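This paper was recommended by Professor QI Guilin, guest editor of the special issue on “Ontology Engineering and Knowledge Graphs”.
Received: 2017-07-22; Revised: 2017-11-08; Accepted: 2018-01-24; Published online in JOS: 2018-02-08
CNKI online-first publication: 2018-02-08 11:55:49, http://kns.cnki.net/kcms/detail/11.2560.TP.20180208.1155.008.html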